Re: Validation of posted archives

From: Stevan Harnad
Date: Sat, 24 Mar 2001 05:52:42 +0000

(I posted this Wednesday but it does not seem to have appeared.
Here it is belatedly. -- SH)

Date: Wed, 21 Mar 2001 18:19:14 +0000 (GMT)
From: Stevan Harnad
To: September 1998 American Scientist Forum
Subject: Re: Validation of posted archives

On Wed, 21 Mar 2001, Guillermo Julio Padron Gonzalez wrote:

> The "name" of a journal is part of the validation of a published paper.
> We all use the rigorousness of the peer review and the editorial
> criteria of the journals to judge about the validity of a published
> paper. I agree that there can be exceptions, but they are just that:
> exceptions.
> It is clear that nobody has the time or the willingness to dive into
> each paper to find out whether it is the final version of a validated
> paper or it is just electronic garbage. The fact is that a
> non-administered archiving system may cause a proliferation of
> non-validated, duplicated, misleading and even fraudulent information in
> the web and there will be no way to identify the valid information, so
> the readers will go to "validating sites", v. g. the publisher site.
> Unless OAI included some kind of validation...

You are COMPLETELY on the wrong track. I am in Gatwick Airport,
headed for a meeting in Florence, so can only give the briefest
of replies:

There are currently 20K+ refereed journals, publishing at least
2,000,000 refereed articles annually (this estimate could be as
much as an order of magnitude too low!).

Most of those 2,000,000+ refereed articles are currently inaccessible
to most researchers, across disciplines, all over the planet,
including the authors of those 2,000,000 articles. (Read my words
carefully, I am weighing them as I write them: I said "most of those
articles are inaccessible to most of their potential readers.")

I am not speaking for the Open Archives Initiative (OAI), which
is much broader than what we are discussing (it is providing a
convention for metadata tagging that will make all OAI-compliant
Archives interoperable, whether or not their texts are refereed,
whether or not their texts are free, whether or not their texts are
journal material; and so far OAI is for metadata, not the full
texts themselves).

I am speaking only for a SUBSET of the Open Archives Initiative,
namely, the Self-Archiving Initiative. It is for this initiative that
the free software for creating OAI-compliant Eprint
Archives was created. The objective of this initiative is to free a
SUBSET of the world's on-paper and on-line literature, a TINY subset,
but precisely the one I mentioned above: Those 2,000,000+ annual
articles in the world's 20K+ refereed journals.

Now follow the logic: To MOST of the planet's would-be users of that
literature, MOST of those articles are currently inaccessible, because
they or their institutions cannot afford to pay the
Subscription/Site-License/Pay-Per-View (S/L/P) fees that would give
them access to it all.

It is for these would-be users that the author/institution
self-archiving is being done, and also for the authors of all that
literature, who lose a vast quantity of potential impact for their
research findings because it is inaccessible to so many of its would-be
users (readers, citers, replicators, extenders).

For this vast population, a free, author-self-archived corpus NOW
would be an incalculable benefit.

The authentication/validation/protection you speak about can come
later, once the 2,000,000+ papers are up there, online and free, for
all these currently disenfranchised potential users. This is NOT the
time to worry about such things. Kindly see the paper I linked in my
prior reply. It gives short answers to precisely the kind of questions
you raise, questions that have been repeatedly raised and answered in
this Forum. They are prima facie questions, and it is natural to raise
them. They are answered in that paper, by number.

And let me close with the "Los Alamos Lemma": about 50,000 Physicists
have already had the good sense to self-archive 150,000 of their
papers to date in the Physics Archive without worrying for one
microsecond about the concern you raised. Look at the colossal usage
figures for that Archive, and ask yourself whether all those users
(for 10 years now) would have been better off without access until the
day when the problem you raise has been solved in advance.

The Los Alamos Lemma is that any worry or objection that did not hold
back the Physicists from self-archiving should not be holding back any
of the rest of the disciplines either.

Amen. I must fly to Florence now.

Stevan Harnad
Professor of Cognitive Science
Department of Electronics and Computer Science
University of Southampton
Highfield, Southampton
phone: +44 23-80 592-582
fax: +44 23-80 592-865
Received on Wed Jan 03 2001 - 19:17:43 GMT

