Re: Kepler: Author-Based Archivelets

From: Jim Till <till_at_UHNRES.UTORONTO.CA>
Date: Fri, 29 Jun 2001 07:52:37 -0400

Earlier this month, I downloaded Kepler via the Kepler home page
(http://kepler.cs.odu.edu), and then set up a personal OAI-compliant
"archivelet" (till_at_home) on my PC at home. (Some email correspondence
with X. Liu and M. Zubair was required before I was able to set up and
register the archivelet successfully - my thanks to them for their help!).

So far, only one document of my own has been posted on my archivelet, and
it's been successfully harvested and cached by the Kepler harvester. It
can be accessed via the Kepler Search Service, at:
http://kepler.cs.odu.edu:8080/searcharc/search.html

A second document, initially cached by the harvester, is a test document
for the Kepler project; it isn't one that I created.

My own document is a version, in HTML, of my article, "Predecessors of
preprint servers", published in Learned Publishing 2001; 14(1): 7-13.
(I've retained copyright).

The original version of the same article, in PDF format, is also freely
available, via: http://www.catchword.com/09531513/v14n1/contp1.htm

The version, in HTML, cached by the Kepler harvester is also archived at:
http://xxx.lanl.gov/html/physics/0102004

I'd initially planned to set up the personal archivelet on the PC at my
office, which has a fast connection to the Internet. However, it's behind
a firewall, and Kepler (at present) doesn't work behind such a firewall.
So, I used the fast connection at my office to download the zipped Windows
version of Kepler onto a CD-RW, and unzipped it onto the same CD.

I then took that CD to my home and ran Kepler directly from the CD, on the
PC at my home. I was able to register it successfully (and was able to
avoid the long download time that would have been required at home, where,
at present, I have only a slow dial-up connection to the Internet). I was
very pleased to learn that, within a day after successful registration,
the contents of my archivelet had been harvested and cached.

Because I have only a dial-up connection to the Internet at home, my
archivelet will be off-line most of the time. This means that the cached
version is the only one that can be accessed. I don't plan to modify the
content of the archivelet very often, so the cached version of the
archivelet should usually be quite up-to-date.

Thus, the results of my little experiment with setting up a personal
archivelet have been, so far, very positive. My only real handicap has
been a lack of much familiarity with proper use of the "Dublin core" of
metadata, and especially, uncertainty about appropriate use of such
metadata for an OAI-compliant archivelet.
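
For readers equally unfamiliar with Dublin Core, the following is a minimal
sketch (in Python) of how the article described above might be expressed as
unqualified Dublin Core elements. The choice of elements and the plain
"record" wrapper element are assumptions made for illustration only; this is
not the form that Kepler itself generates, and the exact wrapper required by
the OAI protocol is not shown.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(fields):
    """Build an XML element holding one Dublin Core child per (name, value) pair.

    The bare "record" wrapper is a placeholder chosen for this sketch.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(root, "{%s}%s" % (DC_NS, name))
        child.text = value
    return root


# Values taken from the message above; the mapping to elements is a guess.
fields = [
    ("title", "Predecessors of preprint servers"),
    ("creator", "Till, Jim"),
    ("date", "2001"),
    ("format", "text/html"),
    ("source", "Learned Publishing 2001; 14(1): 7-13"),
    ("identifier", "http://xxx.lanl.gov/html/physics/0102004"),
    ("rights", "Copyright retained by the author"),
]

record = dublin_core_record(fields)
print(ET.tostring(record, encoding="unicode"))
```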

One point that may merit some discussion: I believe that the harvester
will check all of the registered archivelets periodically, and, if
anything is changed, the old version in the cache will be replaced by the
new version. However, the archivelet can't notify the harvester that
something has changed.

My understanding, on the basis of a brief exchange of personal emails with
Xiaoming Liu, is that, for the limited number of archivelets that exist
now, frequent harvesting poses no big problem. However, it would become
a problem with a large number of archivelets. This problem could be solved
if the personal archivelet (data provider) could "push" any changes to the
harvester (service provider). But the OAI protocol is "pull" based, so
would a "push" approach involve a different paradigm?
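
To make the "pull" model concrete, here is a minimal sketch (in Python) of
the requests a harvester might issue in one polling round. It assumes the
OAI ListRecords verb with its "from" date argument; the base URLs, the
registry dictionary, and the function names are hypothetical and are not
taken from Kepler. No network access is performed; the sketch only builds
the request URLs.

```python
from datetime import date
from urllib.parse import urlencode


def list_records_url(base_url, last_harvest, metadata_prefix="oai_dc"):
    """Build a ListRecords request for records changed since last_harvest."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": last_harvest.isoformat(),
    }
    return base_url + "?" + urlencode(params)


def polling_round(registry):
    """Return one request URL per registered archivelet.

    The harvester must ask every archivelet, whether or not it has changed,
    so the number of requests per round grows with the size of the registry.
    """
    return [list_records_url(url, since) for url, since in sorted(registry.items())]


# Hypothetical registry: archivelet base URL -> date of last successful harvest.
registry = {
    "http://example.org/archivelet-a/oai": date(2001, 6, 1),
    "http://example.org/archivelet-b/oai": date(2001, 6, 15),
}

for request in polling_round(registry):
    print(request)
```

In a "push" design the archivelet would contact the harvester only when
something had changed, so an unchanged archivelet would generate no traffic
at all.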

I believe that Xiaoming Liu is now a subscriber to this Forum. If so, I
hope that he'll correct any misstatements that I may have made about
Kepler, or about the model upon which it, and its use, is based.

I'm sorry about the length of this message, but it does seem to me to
provide a good "case study" of one preliminary experience with a very
interesting way (although one that's still in an early experimental phase)
to set up a personal OAI-compliant archivelet.

Jim Till
University of Toronto
Received on Wed Jan 03 2001 - 19:17:43 GMT