Re: Prospects for institutional e-print repositories study

From: Stevan Harnad
Date: Tue, 15 Jul 2003 12:46:46 +0100

On Tue, 15 Jul 2003, Barry Mahon wrote:

> one item is missing from the
> argument, presently, or at least until the widespread self-archiving
> foreseen by Stevan is achieved: the repositories, primarily National
> Libraries <viz. Elsevier Royal/NL Library agreement>, keep a record
> of archived items, so that they can be found. One of the worries
> of the information community is that self-archiving, even if it is
> "ready, willing and able to take over" will not be able to replicate
> this identification. Inherently a widely distributed archive will be
> difficult to track, this has been identified as one of the 'non-trivial'
> issues to be dealt with for the implementation of the Semantic Web.

The Elsevier/NL kind of agreement -- as noted in the many passages quoted
in --
is between the *publisher* and *deposit library* and concerns the
publication itself. That is quite normal, and can and should be extended
to the entire peer-reviewed journal literature of 20,000 journals and
2,000,000 annual articles -- entirely independently of any open-access
or self-archiving considerations, with which it has nothing at all to do!

Self-archiving concerns a *supplementary* version -- a "back-up",
if you like -- of that primary corpus of 20,000/2,000,000. What that
supplementary corpus needs is not *preservation* (the primary version
needs preservation), but *creation*, for the sake of immediate access
and impact (currently being lost, daily) for all would-be users web-wide
whose institutions cannot afford the toll access to the primary
(publisher's) version.

Right now, that supplementary access is barely existent:
If "one of the worries of the information community" is about the
identification and tracking of those supplementary self-archived
items, then that worry is misplaced: it should be redirected to the
identification and tracking of the primary versions of those articles
(the publisher/deposit-library versions).

The pressing worry today (and it is a worry of the research community)
is *access* (and *impact*), not preservation. The solution is
to self-archive the supplementary versions, not to duplicate the
preservation problem of the primary corpus for a still nearly nonexistent
secondary incarnation!

Start worrying about the preservation of the supplementary corpus only if
and when it looks as if it might become the primary corpus. It can't
even dream of doing that until it at least exists!

Stevan Harnad

NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98 & 99 & 00 & 01 & 02 & 03):

Discussion can be posted to:

