As usual, Andrew Odlyzko, the first to quantify such questions, has been
there before. Many thanks to Andrew for his reply, which says, in essence,
that if the entire refereed literature were available free online, it
would be accessed incomparably more than it is now, or ever could be,
as long as there remains any price tag on access (whether S, L, or P).
I highly recommend Andrew's writing on this topic to everyone. He cites
only a few of his papers below, but his web page contains them all.

Let me add that there is some misunderstanding in some of the early
returns I have begun receiving in response to the three questions. The
misunderstanding is about P (Pay-Per-View): Andrew has of course not
fallen into this misunderstanding, grasping clearly that the P option
in the current system does not give even a remotely accurate indication
of how much material would be accessed by users if there were no P to
be paid at all:

Librarians have been replying that what they can't afford with S & L
(Subscription and License) they can always get with P, whenever a user
requests it. But this is an illusion based on the current awkward
P-system! A user must first ascertain (from some indexing source) that a
paper exists at all, and then must request to look at the full text via
P (whether through photocopy, ILL, or an online meta-License -- the
last being the "click-through oligopoly" that some publisher consortia
have been envisioning as a metered solution to the problem of providing
access to the entire corpus online).

The P process is both time-consuming and costly (indeed, potentially
ruinously so). What should really be held in mind is the barrier-free
access Andrew is speaking of, where all you have to do to access a
full-text on-screen is to click on it!

Now, dear librarians, please recalculate your confident Pay-Per-View
budget in terms of the potential cost of all those untrammeled clicks!
Your current P-system does not offer or reflect anything like that!

And let me also remind everyone that impact is the flip-side of access,
and impact is the all-important factor for research and the researcher.
All the potential clicks that are lost to a researcher, because of the
P-barrier, and the absence of Andrew's untrammeled access, are losses to
research itself, and hence to the potential beneficiaries of scholarly and
scientific research, namely, all of us.

So the right way to estimate the proportions I asked about is to reckon
how far along the pathway to untrammeled clicks to the entire relevant
refereed paper corpus your budget could conceivably take your
institution's researchers. I think that you will find that the distance
is very small indeed. -- S.H.

On Mon, 15 Jan 2001, Andrew Odlyzko wrote:

> Stevan,
> The general thrust of your argument is certainly correct. The
> overwhelming majority of scholars have access to only a tiny
> fraction of the existing corpus of relevant literature. In
> my paper (A) I give some figures for library spending by various
> institutions. During the academic year 1996-7, Harvard spent
> $71 million on its library system, while even as large and
> rich an institution as Brown University spent only $15 million.
> Clearly Brown scholars had access to only a fraction of the
> resources of Harvard scholars. You can quantify this by looking
> at the source of these statistics, namely the compilation by the
> Association of Research Libraries of statistics on holdings,
> circulation, etc. available at <>.
> However, looking at serial holdings does not by itself say too much.
> First of all, there are measurement problems, with different libraries
> using different classification schemes. A more serious problem is that
> of relevance. Something on the order of 2 million articles appear each
> year in the STM literature. Clearly no more than a fraction can be
> looked at by any single individual, or even the entire faculty of a
> small institution, say. Furthermore, there are issues of quality and
> substitutability. If your library gets the top journals in a field,
> that may be enough for you and your colleagues, especially since for
> most questions, there are many sources of information that will be
> satisfactory (a factor that I discuss in paper (B)). Thus the harm
> from not having access to the entire literature may not be as great
> as raw statistics might make it seem.
> However, that having access to more information is better can be
> seen from the observed behavior of scholars. That is what my paper
> (B) is largely devoted to, showing that people will use even esoteric
> materials if those are easily available. More evidence is available
> in sources such as J. Luther's report, "White paper on electronic
> journal usage statistics," Council on Library and Information Resources,
> Oct. 2000, available at <>. Quoting from page 7:
> Recent data from OhioLINK show that more than half of the articles
> selected by users come from journals not currently held by the library.
> ... There is increasing evidence from both libraries and publishers
> that current holdings are too limited to meet user demand ...
> Although there are and have always been complaints about too much
> information, people do like to have access to everything, and do
> derive benefits from it, even though it is often not easy to quantify
> such benefits.
> Best regards,
> Andrew
> References (both available at
> <>):
> (A) "Competition and cooperation: Libraries and publishers in the
> transition to electronic scholarly journals," Journal of Electronic
> Publishing 4(4) (June 1999), <>
> and in J. Scholarly Publishing 30(4) (July 1999), pp. 163-185.
> (B) "The rapid evolution of scholarly communication," to appear in
> the proceedings of the 1999 PEAK conference,
> <>.
