Re: Prospects for institutional e-print repositories study

From: Stevan Harnad
Date: Tue, 15 Jul 2003 15:17:41 +0100

On Tue, 15 Jul 2003, Philip Hunter wrote:

>sh> The pressing worry today (and it is a worry of the research community)
>sh> is *access* (and *impact*), not preservation. The solution is
>sh> to self-archive the supplementary versions, not to re-duplicate the
>sh> preservation problem of the primary corpus, for a still near-non-existent
>sh> secondary incarnation!
> I think perhaps the JISC community and the digital library community
> might consider preservation of the primary corpus (as you put it) as
> a quite separate issue. I hadn't quite taken on board that you accord
> self-archived items a secondary and purely short-term functionality.

Yes, self-archived versions of one's primary publications are secondary,
but, no, their functionality is not purely short-term. The self-archived
papers in, for example,, have been there, and in continuous
use, since 1991; those in CogPrints since 1997. (Several new journal
start-ups have since come and gone in that interval!) Ditto for the
OAI-compliant institutional archives.

So they are not, de facto, "purely short-term." That in itself should
help those who feel the urge to worry about preservation to resist the
urge and worry about increasing the available quantity of self-archived
content instead!

(Having said that, yes, it is usually the first three years of
accessibility that are the most critical for ongoing research progress,
as many publishers are happy to make their (primary) contents open-access
after an interval as long as that, their market sales-prospects being
near-zero by then. [But that does not mean that the self-archived
OAI-compliant versions will not persist beyond three years anyway! It
just means that their persistence is the wrong thing to be worrying about.])

>sh> Start worrying about the preservation of the supplementary corpus
>sh> only if and when it looks as if it might become the primary corpus.
>sh> It can't even dream of doing that until it at least exists!
> It could be that it scarcely exists because the real issues for the
> research community are being ignored, which is that eprints need to have
> as much of the value and properties associated with paper documents
> as possible, if researchers are to feel the effort of 'deposit' is
> worthwhile. It seems that your solution is to persuade the researchers
> that 'deposit' equals 'access', and that 'access' is enough.

It is indeed the case that the self-archived corpus is still far too small
(relative to what has been in reach for several years now) because the
real issues for the research community are being ignored (or rather, the
research community has not been made aware of them). But the real issues
for the research community are definitely *not* that "eprints need to
have as much of the value and properties associated with paper documents
as possible"! The published paper version already provides those values
and properties (for those researchers whose institutions can afford the
access-tolls), and all the values and properties associated with online
documents too. That is not what's missing!

What's missing is the *access* to the (vanilla, no-frills) peer-reviewed,
published paper -- for those (many!) researchers whose institutions
*cannot* afford the access-tolls to the publisher's version. For all that
access-denial to would-be users of an author's research means impact-loss
for the author, for the author's institutions, and for research progress
itself. *That* is the critical PostGutenberg point that the research
community has not yet quite grasped:

    Harnad, S. (2003) Self-Archive Unto Others as Ye Would Have Them
    Self-Archive Unto You. The Australian Higher Education Supplement.

    Harnad, S., Carr, L., Brody, T. & Oppenheim, C. (2003)
    Mandated online RAE CVs Linked to University Eprint
    Archives: Improving the UK Research Assessment
    Exercise whilst making it cheaper and easier. Ariadne.

    Harnad, S. (2003) Measuring and Maximising UK Research
    Impact. Times Higher Education Supplement. Friday, June 6 2003.

    Harnad, S. (2003) Maximising UK Research Impact Through

> Where does long term citation come into this? A number of items
> you reference in your mails live in a 'temp' directory, which means
> researchers would be unwise to reference papers in that directory. This
> has to reduce the impact of your own work, and it is puzzling to see you
> make papers available in this way. Preservation is a legitimate concern
> of researchers considering the self-archiving route.

As soon as my papers are published, the preprint reference is
updated to include the safe-to-cite journal publication information
(see above). As soon as I get around to it (usually at 2-month
intervals), the published papers are archived in my institutional
eprint archive (as all but the last [not yet published] of the papers
above are).

And as soon as I get around to updating my publication
lists -- and -- the eprint-archive
URL takes the place of the Temp URL.

(But the Temp URL continues to exist for years, and points to
the eprint-archive URL. Moreover, if anyone searches for the
paper by author/title/journal, either in an OAI search engine
like OAIster or using or even Google, they will get *all* the
versions, with the eprints-archive version clearly marking both the journal
milestone-version and the latest update, if any.)

I am not saying my own self-archiving routine and timetable are optimal,
but they are certainly a far closer approximation to the optimal than not
self-archiving at all, or waiting for some optimal solution to some
hypothetical preservation problem.

Stevan Harnad

