Re: A Search Engine for Searching Across Distributed Eprint Archives

From: Stevan Harnad <>
Date: Tue, 10 Oct 2000 09:42:48 +0100

---------- Forwarded message ----------
Date: Tue, 10 Oct 2000 09:37:23 +0200
From: "Dovey J, Mr" <>
Cc: "'harnad_at_COGLIT.ECS.SOTON.AC.UK'" <>,
Subject: A Search Engine for Searching Across Distributed Eprint Archives


I have been following the list and plowing through the archive. What I
thought was that there are other experts in MetaData out there who have given
a considerable amount of thought to the whole issue and who in fact do this
as a profession. The Librarians of this world consider this, the issue of
metadata, to be their forte and the initiative which is referred to in the
message I have forwarded here is one driven by the library fraternity to
enable the sharing of documents (academic papers). I would be very
interested in seeing a cross-pollination of ideas between the two camps.

From the side of the Tropus/Freenet people, there are a lot of very good
ideas, and in particular a useful metadata framework, that could be taken
from the librarians' camp and adapted and maybe built on. There are a lot of
standards that are already in existence and some of the problems that this
list has been discussing are issues that have been considered and "solved".

From the Library/Document archive side, there is the issue of distribution
and storage which I feel could take advantage of the Freenet model.

The Tropus project feels to me as if it is a duplication in some respects of
the "Open Archive" model in that it is attempting to impose a search engine
across what is in effect a _very_ distributed database, with each archive
having to provide a standard mechanism for "added services" to harvest the
metadata so that services can be applied across the top of the system.
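
To make the harvesting idea concrete, here is a rough sketch in Python. It is
only an illustration of the general pattern, not the actual Open Archive or
Tropus interface: the Archive class, its list_records method, and the
"title" field are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Archive:
    """Hypothetical archive that exposes its metadata through one standard call."""
    name: str
    records: list = field(default_factory=list)

    def list_records(self):
        # The "standard mechanism": every archive answers the same request
        # and tags each record with the archive it came from.
        return [dict(record, archive=self.name) for record in self.records]


def harvest(archives):
    """Collect the metadata of every archive into one combined index."""
    index = []
    for archive in archives:
        index.extend(archive.list_records())
    return index


def search(index, term):
    """An "added service": case-insensitive title search over harvested metadata."""
    term = term.lower()
    return [record for record in index if term in record["title"].lower()]
```

The point of the sketch is that the search service never touches the
archives directly; it only works on metadata that was harvested through the
one call every archive agrees to provide.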

The Tropus project (what I have seen so far on the list) seems to want to
actually distribute the metadata to each user. I do not think that this is
as elegant a solution. I would rather see a model implemented where
people can choose to add services such as the "virtual file system" that
someone proposed, as a "value added" service. It would be better to work on
making available the mechanisms that can act as building blocks for the
distributed sharing of files.

The other issue of course is whether it is wise to restrict the scope of the
project to MP3 files alone. Surely it would be better to introduce
something which would indicate the "category" of information which is being
added to the system, i.e. if I wish to add an MP3 file, then I specify that it
is such, but if I wish to add my favourite poem, which is stored in a plain
old text file, then I should be able to specify that that is the format.
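
As a minimal sketch of what such a "category" field might look like, assuming
the category is expressed as a MIME type (my assumption; the list has not
settled on anything), something along these lines would do. The names Item,
add_item, by_format and KNOWN_FORMATS are hypothetical.

```python
from dataclasses import dataclass

# Assumed category labels, written as MIME types: audio/mpeg for MP3 files,
# text/plain for plain text files. A real system would allow many more.
KNOWN_FORMATS = {"audio/mpeg", "text/plain"}


@dataclass(frozen=True)
class Item:
    """One stored file together with the format its contributor declared."""
    title: str
    format: str
    data: bytes


def add_item(store, title, fmt, data):
    """Add a file to the store, insisting that its format is declared and known."""
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    item = Item(title, fmt, data)
    store.append(item)
    return item


def by_format(store, fmt):
    """Return only the items whose declared format matches fmt."""
    return [item for item in store if item.format == fmt]
```

With this, an MP3 file would be added with "audio/mpeg" and the poem with
"text/plain", and a service that cares about only one kind of content can
filter on the declared format.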

I would prefer to see a more generic system that would allow for some
flexibility in application.

There is also another list to which I am cc'ing this message, the "Open
Source for Libraries" list (OSS4LIB), which has bandied about the idea of
what they have called "Docster": a discussion about how an application
could be created to allow libraries to share with each other the documents
which they currently send by various means during the "inter-library loan"
process. If the Tropus project could also address some of those needs, then
there could be some real synergy between oss4lib and Tropus as each could
make use of the expertise of the other.

I look forward to any comments.

John Dovey
Assistant Director (IT)
Library Services, University of Stellenbosch, South Africa
Phone : +27-21-8084100, Fax: +27-21-8084336
