Re: EPrints, DSpace or ESpace?

From: Robert Spindler
Date: Wed, 12 Feb 2003 16:24:43 -0700

Greetings from Arizona! This exchange between Derek and Stevan illustrates
the difficult tension archivists feel these days between preservation
and access. The scholarly research community has profound opportunities
to improve the speed and availability of very current research results
through electronic publishing and self-archiving. These things benefit
the scientific community in very direct and immediate ways...

On the other hand, there's the back end of research: replication,
criticism and revision, citation tracking/analysis and the history of
science. Archivists think about this stuff quite a bit, perhaps more than
the scientific community does at this time, since scientists may not have
experienced significant or relevant loss so far. We used to think about
catastrophic loss events; we know now that loss is likely to be
more subtle - gentle corruption over time from software
incompatibilities, character set incompatibilities, loss of formatting,
addressing failures (it's out there but in a place you can't find),
linkage failures (between digital images and their metadata, for example),
and hidden viruses. Clifford Lynch has noted that our tools for detecting
corruption are very blunt. The subtlety of loss makes preservation
advocacy very difficult because loss is not catastrophic until it reaches
a critical mass.
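
To make the point about blunt tools concrete, here is a minimal sketch
of the most common one, a checksum (fixity) audit. It is only an
illustration under assumptions: the function names and the manifest
layout are hypothetical, not taken from any particular repository
system, and it assumes Python 3.9 or later.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root with a previously stored manifest."""
    current = build_manifest(root)
    return {
        # Recorded earlier but no longer present.
        "missing": sorted(set(manifest) - set(current)),
        # Still present, but the bytes differ from what was recorded.
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        # Present now but never recorded.
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

An audit like this reports that a file's bytes changed or that a file
disappeared, and nothing more. It cannot tell you that an intact file
no longer renders correctly, that its character set is misread, or that
an image has lost the link to its metadata, which is why the tool is
blunt for the kinds of loss listed above.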

One of the things I've been trying to pay attention to in this environment
is: "What advice should we be giving to document creators to help them
minimize the potential for loss?" Can we influence the process of document
creation to maximize the potential for *real* archiving without slowing
the dissemination of research? The OAIS reference model is very helpful
in thinking about these things, specifically in terms of submission of
one or two or several forms and formats of the content (the archival
information package, the dissemination information package, etc.). Simply
uploading files will not suffice if long-term preservation is desired.

The thread about "toll-access" content vs. self-archived content
is an important piece. Stevan places a great deal of trust in the
commercial publishing industry for long-term preservation of the
"toll-access" content, and yet publishers seem unlikely to make
the timely and continuing preservation actions necessary to retain
electronic content unless there is sufficient market revenue to support
the preservation costs. I can still hear Kevin Guthrie of JSTOR asking
the group at CNI in December, "Which of your institutions is willing
to help fund the public good?" (a paraphrase; his question was who will
pay the cost of preservation). Another perspective on this thread is that
there may be significant differences between the self-archived version
and the commercially published version that demonstrate the influence
of reviewers, new research by others, etc. Both versions may indeed
be archival!

In the end it's pretty clear that unless the scientific community values
preservation of their work at some level close to the value they place
on fast dissemination, the archival perspective will be very difficult
to sell. We'll need to make preservation as seamless as possible if we're
going to expect the scientific community to participate in saving their
own memory. Retrospective repair is a fool's game no one can afford.

Rob Spindler
University Archivist
Arizona State University Libraries
