Re: Archivangelism

From: Stevan Harnad
Date: Sat, 10 Jan 2004 22:12:58 +0000

Every one of the points raised by Iain Stevenson below has been answered
(many times over) across the years -- so much so that they have even
become FAQs!

It is hence a real head-shaker that they are still being innocently
raised in 2004. (And historians of the already far-overdue advent of
the open-access era will no doubt find a good deal to learn about human
nature and human progress from such data!)

On Fri, 9 Jan 2004, Iain Stevenson wrote:

> I don't have a problem with self-archiving per se, only how it is managed

How it is managed? Self-archiving is so far not "managed" at all! It is
anarchic, based on individual self-initiative, i.e., whether the author
has the good sense to make the connection between the impact of his own
research and whether or not all would-be users can access it (and what
he can do about it, i.e., how to provide open access, now).

It is precisely "management" that has so far been missing, and I hope
universities and research funders will have the good sense to provide it
soon! What is needed from the "managers" of researchers is an official
extension of their existing institutional "Publish or Perish" policy to
"Provide Open Access (to your peer-reviewed research articles)".

The managers can then also monitor and reward compliance as well as
monitor, measure and reward the resulting research impact.

The researchers' institutional libraries can manage the archives.

All this management costs a pittance, administratively, yet its returns
(in terms of research visibility and impact) and its rewards (in terms
of research productivity, funding, and prestige) will be immense.

> and the argument that it is somehow "toll free"

That it is toll-free is not an *argument* but a *fact.* One need only
understand what "toll" and hence "toll-free" mean here:

Institutions currently pay subscriptions, licenses, or pay-to-view (i.e.,
*tolls*) in order to purchase access for their researchers to whatever
fraction of the peer-reviewed journal literature they can afford (24,000
journals in all, publishing 2,500,000 articles per year).

Toll-free access means access to that literature, without the
user-institution having to pay those tolls.

I assume that Iain is here conflating (1) the toll-costs of
user-institution subscriptions per incoming journal with two other
things: (2) author-institution publication charges per outgoing article
on the open-access journal cost-recovery model (the "golden" road to
open-access provision) and (3) what he imagines to be the high costs of
self-archiving in institutional open-access eprint archives (the "green"
road to open-access provision). The equation of these three things is
incoherent, and Iain is simply dead-wrong about the costs of
self-archiving (and would do well to look at the actual figures,
per article!).

A journal's current total toll-access revenue, per article, averages about
$1500. That is the total amount per article paid annually, jointly,
by those institutions who can afford to subscribe to that journal
(divided by the annual number of articles in the journal). (Figure out
your own institution's serials budget, and divide it by the total number
of articles in the journals it subscribes to, and that will be how much
your own institution is paying per incoming article.)

(Now be careful to keep it straight in your mind what is a journal's
total income per article, from all subscribing institutions, and
what is an institution's total expenditure, for all its subscribed
serials. Let us call the total annual institutional expenditure for
its full current quota of serials 100%. We will shortly be dividing
that amount, counter-intuitively, but correctly, not by the number of
*incoming* articles subscribed to, but by the institution's annual number
of *outgoing* articles published.)

The cost of implementing peer review is $500 per article (i.e., 33% of
its total revenue). *If and when* institutions are no longer paying any
access-tolls at all (100%) for *incoming* articles (because all journals
have converted to the "gold" standard for cost-recovery: one-time payment
per outgoing article), they will be paying, on average, *at most* the
same total amount they paid the old way (100%) (co-varying with the
institution's own ratio of formerly incoming to now outgoing articles).
But it is far more likely that by then journals will have been forced to
cut costs by cutting out the inessentials (offloading all text-generation
and processing onto the author and all access-provision and archiving
onto the institutional archives). So the institutions would more likely
be saving something closer to 66% annually (*if and when*).
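
The arithmetic behind these percentages can be sketched as follows. This is
only an illustration using the round per-article figures quoted above ($1500
total toll revenue, $500 peer-review cost); the constant and function names
are hypothetical, and the saving is the hypothetical "if and when" case, not
a prediction.

```python
# Illustrative only: round per-article figures quoted in this message.
TOLL_REVENUE_PER_ARTICLE = 1500      # average total toll-access revenue (USD)
PEER_REVIEW_COST_PER_ARTICLE = 500   # cost of implementing peer review (USD)


def peer_review_share(revenue: float, peer_review_cost: float) -> float:
    """Fraction of per-article toll revenue that pays for peer review."""
    return peer_review_cost / revenue


def hypothetical_saving(revenue: float, peer_review_cost: float) -> float:
    """Fraction saved if peer review were the only cost left to pay."""
    return 1 - peer_review_share(revenue, peer_review_cost)


share = peer_review_share(TOLL_REVENUE_PER_ARTICLE, PEER_REVIEW_COST_PER_ARTICLE)
saving = hypothetical_saving(TOLL_REVENUE_PER_ARTICLE, PEER_REVIEW_COST_PER_ARTICLE)
print(f"peer-review share of toll revenue: {share:.1%}")
print(f"hypothetical saving: {saving:.1%}")
```

With these inputs the share is one third and the saving two thirds, which is
where the rounded "33%" and "66%" come from.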

That is all hypothetical, however, as there has been no wholesale
conversion to open-access (gold) journal-publishing and its cost-recovery
model. The fewer than 5% of journals that conform to this model are
hence not representative, nor do they provide much open access today (<5%).

The green road to open-access provision, in contrast, provides a
good deal more open access today (at least three times as much), but,
more important, it *could* provide 100% open access, *immediately.*
Self-archiving can provide toll-free access to all 2,500,000 annual
articles in all 24,000 journals, virtually overnight.

At what cost, you ask? The archiving cost per article is a few dollars;
not even worth mentioning! The problem of the institutional eprint
archives at the moment is not that they are too *costly,* but that they
are too *empty* (because there is not yet an official institutional
open-access provision policy mandating and monitoring open-access
provision).

> There have to be costs in the system: archives need to be maintained,
> indices provided, databases managed, telecoms links paid for. These are
> all real costs that have to be met somewhere.

To quote in extenso the FAQ on this old chestnut, "8. Paying the piper":

        "I worry about self-archiving because someone surely has to pay
        for all this: you can't get something for nothing!"

    "There are many fallacies embedded in this worry, among
    them misunderstandings about the nature of global networked
    communication. Internet connectivity is now a standard part of the
    infrastructure of most of the world's universities and research
    institutions. If you are not equally worried about who pays for
    your emails, websites, and web-browsing, you should not be worrying
    about your self-archiving either. Moreover, paying access-tolls is
    not paying the pertinent piper here anyway! (I.e., it is not
    publishers who pay for universities' network infrastructure!)

    "The refereed research literature is minuscule compared to the rest
    of the traffic on the Web. It is the flea on the tail of the dog.
    Worry about the storage and band-width for the growing daily creation
    and use of audio, video, and multimedia (most of it non-research
    use!) by researchers at universities and research institutions before
    even beginning to fret about the refereed-research flea.

    "As usual, there is also some of the archiving/publishing conflation
    here -- thinking that we must find some sort of counterpart for the
    printing/distribution costs, somewhere. But there isn't any. The
    cost per-paper of permanent online archiving is virtually zero,
    yet everyone, everywhere, has access to it all, forever. This is
    a Gutenberg expense that has simply vanished in the PostGutenberg
    Galaxy, leaving only the Cheshire Cat's Grin.

    "There is indeed one essential publishing cost that still needs
    to be paid, but it has nothing to do with Internet use: It
    is the cost of implementing peer review. That cost, however,
    is only 10-30% of the access-tolls currently being paid, and
    hence could easily be paid out of the annual toll savings.

    "The last of the "who-pays-the-piper" worries is,
    I think, a variant of the Capitalism (14) worry.
    The best way to dispel it is to note that refereed publishing
    in the PostGutenberg Galaxy, once the literature has been freed
    through self-archiving, is likely (apart from whatever optional
    add-on products and services there may still be a market for)
    to downsize into a *service* (peer review), provided to the
    author-institution, instead of the toll-based *product* (the text)
    that was provided to the reader-institution in the Gutenberg era.

    "Nothing hinges on this, however, for as long as the world wants to
    keep paying for the toll-based product, even after the refereed
    literature has been self-archived, the piper will be fully paid,
    yet the literature will be free of all its access/impact barriers."

> The problems that I see are as follows:
> (a). Implicitly, the publication model of open-access and
> self-archiving reflects the publishing culture of Anglo-American
> STM research, well-funded with grants that include publication costs
> and I suspect also salaried research assistants and post-docs to do
> the leg-work in archiving.

This is completely irrelevant. *Both* roads to open-access provision
-- both (1) the golden road of conversion to open-access publishing
recovering costs at the author-institution end, per outgoing article,
instead of at the user-institution end, per journal toll-accessed,
and (2) the green road of supplementing toll-access publishing with
author-institution self-archiving of their own toll-access journal
articles in their own institutional open-access eprint archives --
simply presuppose that the current costs of journal publication are
whatever they are now. Grants and assistants and legwork have no more
or less to do with it than they do now. (Self-archiving is a few extra
keystrokes per article, over and above the ones that went into generating
it in the first place: no legs needed.)

> In the tradition of social science and humanities research, typified by
> sole researchers with smallish (or no) grants, self-archiving probably
> isn't easily achieved, unless the institution where the worker is based
> provides, staffs and pays for a self-archiving system.

Neither discipline-differences nor grants have anything whatsoever to
do with self-archiving. (You are here conflating the golden and green
roads to open access provision.) Self-archiving costs per article are
negligible. (The extra keystrokes are truly not worth discussing.) Please
adduce evidence to the contrary if you disagree. The 100+ institutions
that have set up one $1000 linux box and installed the free GNU Eprints
software on it are not fretting about the cost per paper. (They are --
or should be -- fretting about the low number of papers, for lack of an
institutional open-access provision policy!)

> And where does that leave the self-funded independent scholar who is
> still a feature of many of the soft-sciences?

There are plenty of nowhere-near-capacity central archives to host
the articles of scholars not affiliated with an institution that has
provided open-access eprint archives.

(And institutions have a long history of hosting unaffiliated scholars
and their work!)

> The other issue is what about the exclusion of researchers from those
> countries where lack of computers, poor telecoms and general lack of
> funding mean they can't access the self-archiving stores?

Repeatedly raised in this Forum, this worry has been repeatedly answered
by those who are actually activists for alleviating the access problems
of poor countries (Leslie Chan, Subbiah Arunachalam, Barbara Kirsop). I
think I can faithfully express their position as follows:

    "If you want to provide us with computers, telecoms and funding,
    please do! But if not, please don't use that as an excuse for
    continuing to deny us access to the research literature with whatever
    connectivity we do have: We cannot afford toll-access! Open-access
    would be a godsend for us!"

See also "29. Sitting Pretty":

            "I don't worry about self-archiving because there is really
            no problem: My institution gives me all the access and impact
            I want or need already. I'm satisfied!"

    "If a researcher -- especially a researcher at a well-off
    institution -- does not exercise some critical reflection, the
    natural feeling is: "Where's the problem? I and others at my
    institution were already well-off in paper days. Now, in the
    online era we are even better off, with desktop online access
    to everything, instead of having to walk to the library, and
    with licensing 'big deals' that get us even more journals than
    we used to have!"

    "This is related in part to the "Harvards vs. Have-Nots"
    worry. It is also a symptom of not having understood the causal
    connection between access and impact.

    "Yes, the better-off institutions enjoy better access to the
    peer-reviewed journal literature than the less-well-off
    institutions (and better access than they had in paper-days).
    But no institution can afford toll-access to all or even most
    of the 24,000 peer-reviewed journals that exist. And most
    institutions can afford toll-access to only a small and
    shrinking portion of them. And even the Harvards (not only the
    Have-Nots) are groaning under their growing serials-budget
    expenditures. So no researcher, at any institution, has access
    to more than a fraction of what there is. And usage patterns
    in those lucky fields where open online access is most advanced
    show that when everything is accessible and a keystroke is the
    only barrier, users make vastly more use of the literature.

    "So much for access. The other side of the coin is even more
    important: Researchers at prestigious institutions will also
    say that they only write for one another. But they don't really
    mean it. All researchers are interested in their research impact
    (citation counts), not only because that is one of the things
    that advances their careers and funds their further research,
    but also because it is a measure of the size and importance of their
    contribution to knowledge. Few researchers are aware -- because
    the data on the strong causal connection between access and
    impact are new and still being gathered -- of the size of their
    own and their institution's cumulative daily, weekly, monthly,
    and yearly impact-loss owing to access-denial to those would-be
    users world-wide whose institutions cannot afford the toll-access
    to their work.

    "In this equation, the Harvards are losing almost as much as the
    Have-Nots, because they are losing the potential impact from
    the users at the Have-Not institutions, which vastly outnumber
    the Harvards! Yes, the Harvards may be somewhat better off in
    their own access to the research output of others; but the
    following is just as true of them as it is of the Have-Nots:
    For every one of the 2,500,000 articles published annually in
    the 24,000 research journals it is a fact that it is not
    accessible to most of its potential users because of unaffordable
    toll-barriers. And (this too is critical): this would remain
    true even if all 24,000 journals were sold at cost."

> In the UK, we are still living with the damage that the first Research
> Assessment Exercise [RAE] caused by assuming that STM-type article publishing
> was just about the only acceptable form of measuring research output.

In many fields the peer-reviewed journal article is indeed the main
measure of research output. (That is just peer review and "publish or
perish" and not the fault of the RAE!)

Open-access provision is targeting only the author give-away literature,
so book-based disciplines (if there are such) are only touched to the extent
that they also publish journal articles.

But it encompasses all of the peer-reviewed journal literature:
all 2,500,000 annual articles in all 24,000 journals. The RAE has
nothing to do with that fact. (But it can benefit from it.)

> Research can be expressed as a music or dance performance, a sculpture
> or (as in my own department) a broadcast documentary, a magazine
> design, or a round-table debate between industry practitioners and
> academic researchers.

So self-archive those too (if they are digitized, and give-aways)!
Surely you don't want to invoke the impossibility of providing
open access to nondigital or non-giveaway research as a reason for
holding giveaway digital research and its impact hostage to access-tolls!

> It is not inconceivable that such research outputs could be self-archived
> but it is inconceivable that they could be done without considerable
> costs in software, hardware, personnel and know-how. Who pays?

The open-access movement's first, second and third objective is to provide open
access to the annual 2,500,000 articles published in the planet's 24,000
peer-reviewed journals, in order to maximise the impact of that research.

What on earth does that rational and reachable objective have to do
with the question of who will pay to digitize or archive a dance or a
piece of sculpture? And what does it have to do with reforming research
evaluative criteria? These are red herrings!

> (b) One form of self-archiving that derives from the traditional
> format of paper journal publishing was of course the off-print.
> The system whereby anyone could write to an author requesting
> an off-print did disseminate research quickly and freely and was
> inclusive of colleagues everywhere. I still get the occasional
> off-print request from eastern europe, india and china but never from
> western Europe or the USA. It was a simple, easy system that worked
> and although publishers found it increasingly expensive, I do not
> know of any publisher, commercial or non-profit, who ever seriously
> considered abandoning what was felt to be a form of royalties in lieu.
> It seems a great pity that the distribution of requesting and sending
> off-prints seems to be dying. Why re-invent the wheel? Even if you
> have easy electronic access, reading a paper on paper can often be
> much more convenient, or indeed pleasurable, and offprints, although
> they come by snail-mail, are probably more convenient than downloads.

Iain's wistfulness is touching, but so out of touch that it takes
one's breath away! Go back to postal request-a-print for access instead
of clicking and reading onscreen or printing off a hard copy of any of
the 2,500,000 annual articles in the 24,000 journals? Re-inventing the
wheel? More like replacing modern rapid transit by walking, or reverting
from online communication to the oral tradition! Self-archiving is the
equivalent of providing limitless reprints to all potential users without
anyone having to post a request, post a reprint, or be affiliated with
an institution that can afford the access-tolls! Convenient? Access-denial:
convenient? To whom? (Please see "Sitting Pretty" again, above.)

> (c) I accept that you may not argue for "publisher-free" journals
> although Watts in his article implied that you feel "with the
> exception of peer review, the various editorial services that
> publishers arrange are forms of help that he feels he can do
> without"

Watts quoted me correctly. What you have not quite grasped, Iain, is that
a peer-reviewed journal publisher in the PostGutenberg, open-access age
may well turn out to be a peer-review/certification service-provider,
instead of being that *and* an access-provider for a text-product (as in
the Gutenberg toll-access age). (The jury is out on whether a copy-editing
service will need to be bundled in with the peer-review service.)

    "Online Self-Archiving: Distinguishing the Optimal from the Optional"

    "Separating Quality-Control Service-Providing from Document-Providing"

    "Distinguishing the Essentials from the Optional Add-Ons"

    "The True Cost of the Essentials (Implementing Peer Review)"

    "The True Cost of the Essentials"

    "Re: The True Cost of the Essentials (Implementing Peer Review - NOT!)"

    "Journal expenses and publication costs"

    "Re: Scientific publishing is not just about administering peer-review"

    "Author Publication Charge Debate"

Except in book-based disciplines, "publish" in academic "publish or
perish" disciplines always meant "publish in a peer-reviewed journal
[or refereed conference proceeding]".

> but I do feel that a lot of the rhetoric about open-access
> does caricature publishers as greedy parasites whose time has come.

I agree with you that this demonization of publishers is unnecessary,
counterproductive and probably unfair -- in the context of open-access
provision.

On the subject of journal pricing I take no position because pricing
is a different problem, even though it did help draw our attention to
the access/impact problem.

The key to understanding the difference between the serials budget crisis
and the access/impact problem is to realize that even if the 24,000
journals were all pricing their tolls *at cost*, most institutions
still could not afford toll-access to most of them, and probably no
institution could afford them all. So research impact would continue to
be needlessly lost.

This is not the fault of the publishing community but of the research
community, for their failure to realize that in the online age they can
and should supplement toll-access by providing open-access to their own
articles for all potential users, anywhere, any time, whose institutions
may not be able to afford the toll-access (and not by waiting for them
to mail a reprint request!).

Researchers have been sluggish in realizing this PostGutenberg fact and
capability, but they will realize it, under the influence of empirical
evidence for the size of their cumulative impact loss; and either they
or their managers will twig on the urgent need for a systematic policy
of open-access provision for all refereed research output.

> The whole modern system of peer reviewed quality controlled journals
> would not have existed without the planning and envisaging skills
> of publishers, even if we have to accept Robert Maxwell as one of
> the innovators. Half a century ago, the vehicles for scientific and
> other research publication in the english language world were few
> indeed, and it was publishers who created the modern infrastructure.

Fine. But we are not talking about publishers now. We are talking about
open access provision, now. This is possible now; it was not possible
then. Otherwise the whole peer-reviewed research enterprise would have
taken a different form from the very beginning. Blame Gutenberg. Thank Tim
Berners-Lee. And then get the research community to get its act together!

> Universities and research institutions could have created journals
> and it is a question for research why by and large they didn't. I
> would argue that publishers (of all kinds) and the research community
> have had a creative partnership that has benefited both, and that
> the partnership should continue to be creative in the future.

Yes, but apart from these pious platitudes to which we can all assent,
what about open-access provision, now? And stanching needless impact-loss,
at long last, now? (The real question for historical research will be
why it took the research community so long to twig on the optimal and
inevitable outcome, and then take the simple, obvious steps to bring
it about!)

> (d) My final point is that open access self-archiving creates the
> illusion of free material without acknowledging the very real costs
> involved in the process. The costs are shifted to another locus
> in the system but they are there.

Iain, I'm afraid the illusion might be yours (if the costs you have in
mind are the negligible costs of self-archiving)! More likely, though,
you are conflating (1) open-access provision via open-access journal
publication (the golden road, in which publishing costs are recovered
by charging the author-institution per outgoing article instead of the
user-institutions per incoming journal) with (2) open-access provision via
author-institution self-archiving of a supplementary version of their
toll-access article output in their own institutional eprint archives,
in order to provide access to all those would-be users whose institutions
cannot afford the toll access (the green road to open access).

On the golden road, there is a shift of costs, because one cost-recovery
system is being substituted for another. On the green road, there is
no shift of cost, because the open-access version is a supplement to,
not a substitute for, the toll-access version.

For the rest, see "Paying the Piper" again.

> I think we agree (and disagree with Watts) that open access and
> subscription models can co-exist, and that it would be undesirable if
> the first totally supplanted the latter

Geoff Watts was just reporting the various opinions expressed by the
research and publishing community. (He did this rather well, I might add,
compared to many other articles I have seen on this topic. And it's not
clear that he took a position of his own.)

You and I agree that toll-access publishing and supplementary open-access
self-archiving can co-exist. I have no opinion on whether or not it would
be desirable for open-access publishing to supplant toll-access publishing.
I'm concerned only with the long-overdue and urgent provision of open access,
now. (Though I can speculate as well as the next guy about where that
may or may not eventually lead.)

> but I am concerned that driving publishers, commercial and non-commercial,
> out of business would only serve to impoverish research rather than
> enhance it.

No one is driving publishers out of business!

I would say that the 1000 open-access (gold) journals out of 24,000 today
(<5%) constitute competition rather than an attempt to drive toll-access
publishers out of business (and not very threatening competition,
so far). Moreover, conversion (to the golden cost-recovery model) is an
available option, rather than just extinction (though neither seems

Nor is (green) self-archiving intended or likely to drive publishers
out of business. On the contrary, for a publisher to declare itself
officially green (i.e., author self-archiving-friendly) is a way to
show that they are not opposed to open-access provision, not in favor of
impact-denial, even if not ready to take the much more radical (and risky)
step of converting to gold at a time when the gold cost-recovery model
is still an experimental one.

In fields where self-archiving is the most advanced -- e.g., in physics,
where some subfields have been 100% self-archived, hence open-access,
since the early 1990's -- there has been no sign whatsoever that
this is driving publishers out of business! On the contrary, as Geoff
Watts reported, there has even been a case of an open-access journal,
JHEP, which was "born gold," and achieved a huge impact factor within
a few years of launching, that has now successfully reverted to the
toll-access (green) cost-recovery model -- yet 100% of its contents have
been and remain open access (through self-archiving) throughout. If we
want to make empirical predictions rather than merely spin off doomsday
fantasies, we need to take the actual evidence into account.

    "JHEP will convert from toll-free-access to toll-based access"

Stevan Harnad

NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online (1998-2004)
is available at the American Scientist Open Access Forum.

Unified Dual Open-Access-Provision Policy:
    BOAI-2 ("gold"): Publish your article in a suitable open-access
            journal whenever one exists.
    BOAI-1 ("green"): Otherwise, publish your article in a suitable
            toll-access journal and also self-archive it.
Received on Sat Jan 10 2004 - 22:12:58 GMT
