Freeing the refereed research literature

 

Self-archiving of authors’ papers is the inevitable future of science communication

 

Stevan Harnad

 

Unlike the authors of books and magazine articles, who write for royalties or fees, the authors of refereed journal articles write only for ‘research impact’. To be read and to have an effect on others’ research, these refereed articles have to be accessible. The idea of access being ‘toll-gated’ by publishers makes as much sense as toll-gated access to commercial advertisements. Now it is possible to liberate the scientific literature from this unnecessary impediment by authors depositing their papers in a global, virtual, searchable archive. (Author, the first paragraph of your article should be a brief summary of your main message. Please amend accordingly if you don’t like my version, but please keep to the style of stating the "news" of your article up-front. Thanks.)

Unlike the royalty/fee-based literature, which constitutes the vast majority of the printed word, the special, tiny literature comprising refereed journal articles is given away by its authors, who do not benefit from the fact that access-tolls have to be paid to read their papers (as subscriptions and, for the online version, site-licenses or pay-per-view). On the contrary, these access-barriers represent impact-barriers for authors, whose careers and standing depend largely on the research impact of their work.

Impact barriers

There are at least 20,000 refereed journals in all fields of scholarship, in which more than 2,000,000 refereed articles appear each year. The amount that all the world’s institutions that can afford the tolls collectively pay for each of these refereed papers is hence more than $2,000 on average. In exchange for this fee, that particular paper is accessible to readers at those, and only those, institutions.

The research libraries of the world can be divided into the (minority) Harvards and the (majority) Have-nots -- the latter by no means limited to the developing world. It is obvious how the Have-nots would benefit from free access to the refereed literature, for without it their meagre budgets can afford only a pitifully small portion of it. But not even Harvard can afford access to anywhere near all of it (see http://fisher.lib.virginia.edu/newarl/index.html). Hence most refereed articles are inaccessible to most researchers. For the authors, this means that much of their potential impact is lost. It is this curtailed access and research impact that is being bought by the $2,000 per article mentioned above.

This is the way things were in the past, when publishing as print-on-paper was the only medium, and sizeable costs of printing and distribution had to be recovered. The new era of the Internet is threatening the majority, paid-for literature (books, magazine articles) in the form of digital piracy. But for the ‘give-away’ scientific literature, it is at last possible to eliminate all access/impact-barriers to refereed research.

Not all costs have vanished, of course. Although the costs of printing and distribution (and their on-line successors, such as publishers’ PDF page-images) are no longer necessary ones, the cost of the quality-control and certification that differentiates the refereed literature from an unfiltered, anarchic, pot-luck vanity press still needs to be paid. Paper and PDF have become mere options, purchasable by those who want and can afford them; refereeing, however, is essential.

Refereeing, and what it costs

Refereeing (peer review) is the system of evaluation and feedback by which expert researchers assure the quality of each other’s research findings. Referees’ services are given free to virtually all scientific journals, but the implementation of the refereeing procedures necessarily entails some costs, for example: archiving submitted papers on a website; selecting appropriate referees; cycling a paper through rounds of review and author revision; making editorial judgments; editing and formatting papers into journal style; administrative costs like databases, subject allocation of referees and manuscript-tracking systems; handling high-quality figures, movies and massive amounts of data in some disciplines; presubmission enquiries; rejecting manuscripts and dealing with appeals; and so on. (Author, note changes in this paragraph)

The lowest possible (author, note change: the AIP website you cite says that it "probably can’t get its costs below $500") cost of implementing refereeing has been estimated as $500 per accepted article by the American Institute of Physics (see http://documents.cern.ch/archive/electronic/other/agenda/a01193/a01193s4t8/transparencies/Doyle.ppt), but even that figure almost certainly has needless costs wrapped into it (for example, the creation of the publisher’s PDF). Author: I think you need a sentence here like: These costs will be higher for papers in areas like genomics and cell biology. Despite these higher costs for some types of paper, I think that the true figure for peer-review implementation alone for all journals in the world is probably much closer to $200 per article or even lower. (author, you need to be clear that this is your own estimate, based on a particular type of journal, hence wording change.) Hence, quality-control costs account for only 10% of the actual cost being paid per article.
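The proportions quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text (2,000,000 articles a year, $2,000 in tolls per article, and the $500 and $200 refereeing estimates); the variable and function names are illustrative, and because the text gives the article count and the toll as lower bounds, the implied total is a rough floor rather than a measured figure.

```python
# Figures quoted in the text (all in US dollars, all approximate).
ARTICLES_PER_YEAR = 2_000_000    # refereed articles appearing each year
TOLL_PER_ARTICLE = 2_000         # collectively paid by toll-paying institutions
AIP_REFEREEING_COST = 500        # AIP estimate per accepted article
AUTHOR_REFEREEING_COST = 200     # the author's own estimate


def share_of_toll(cost: float, toll: float = TOLL_PER_ARTICLE) -> float:
    """Return a per-article cost as a fraction of the per-article toll."""
    return cost / toll


# Implied floor on what institutions worldwide pay in tolls each year.
total_tolls = ARTICLES_PER_YEAR * TOLL_PER_ARTICLE

print(f"Implied annual tolls: ${total_tolls:,}")
print(f"AIP estimate as share of toll: {share_of_toll(AIP_REFEREEING_COST):.0%}")
print(f"Author's estimate as share of toll: {share_of_toll(AUTHOR_REFEREEING_COST):.0%}")
```

On these figures the $200 estimate corresponds to the 10% quoted in the text, whereas the $500 AIP estimate would correspond to 25%.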

Can this situation, where the authors’ and referees’ ‘giveaways’ are being needlessly held hostage to obsolete printing costs and cost-recovery methods, be remedied? It is not simply a matter of lowering the financial access barriers: even if these were slashed by 90%, most researchers would still be unable to access most research papers. There is one inevitable solution: the refereed research literature must be freed for everyone, everywhere, forever, online. The irreducible 10% (or so) quality-control costs must no longer be paid for by readers’ institutions, but must be paid for as quality-control service costs by authors’ institutions, per paper published, funded out of 10% of the institution’s annual windfall savings on subscription costs. (Author: what about author collaborations from more than one institution? Can you sketch in a sentence how they would work?)

Achieving liberation

Journal publishers certainly will not scale down to becoming only quality-control providers of their own accord. Nor can libraries afford to do so. Authors cannot and should not be expected to stop submitting their research to established high-quality, high-impact journals in favour of new, alternative journals, with no track records, authorships, or niches, just because those journals happen to be prepared to provide quality-control alone right now. Journal niches are largely saturated already, and researchers’ immediate careers and standing are far more important to them than the potential long-term benefits of risky sacrifices.

But researchers can have their cake and eat it. The entire refereed journal literature can be freed, virtually overnight, without authors having to give up their established refereed journals, by a method already shown to work by a portion of the physics community. These physicists have been publicly self-archiving their research papers online, both before and after refereeing (preprints and postprints), since 1991 in the physics ‘eprint archive’ at http://www.arxiv.org.

The eprint archive currently holds 150,000 papers. The annual number of new papers self-archived therein is now about 30,000, increasing by about 3,500 papers per year. The archive, with its 14 mirror-sites world-wide, gets about 175,000 user ‘hits’ per weekday at its US site alone. So there is no doubt that self-archiving can be done, and that when papers are thus made freely accessible online, they are heavily accessed.

Although these physicists have pioneered the way to free the refereed research literature, those in other disciplines have been slow to realize that the system can work for them too. They have assumed that there must be something unique about physics that makes self-archiving work. This misapprehension has been encouraged by the incorrect impression that the archive contains only unrefereed literature, which compromises the quality control of journals. Yet absolutely nothing has changed in peer review in physics. The same authors who self-archive continue to submit all their papers to their journals of choice, just as they always did, and virtually all the papers in the archive appear in refereed journals about 12 months after journal submission. Nothing has changed – except that a growing portion of the refereed literature in physics is at last accessible, free for all, online. Yet even in physics, self-archiving is growing far too slowly: at the present linear growth rate it will be another decade before the entire physics literature is online and free.
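To make the ‘linear growth’ concrete, the archive figures quoted above (150,000 papers held, about 30,000 new papers a year, rising by about 3,500 a year) can be projected forward. This is only an illustrative sketch: the function names are invented here, and it assumes the 3,500-paper annual increment stays constant and applies from the first projected year. It does not estimate the size of the whole physics literature, which the text does not give.

```python
def annual_deposits(year: int, base: int = 30_000, increment: int = 3_500) -> int:
    """Papers self-archived in a given future year (year 0 is the present).

    Assumes the yearly intake keeps rising by a fixed increment.
    """
    return base + increment * year


def projected_holdings(years: int, current: int = 150_000) -> int:
    """Total papers held after the given number of years."""
    return current + sum(annual_deposits(y) for y in range(1, years + 1))


for horizon in (1, 5, 10):
    print(
        f"After {horizon:2d} years: "
        f"{annual_deposits(horizon):,} new papers that year, "
        f"{projected_holdings(horizon):,} held in total"
    )
```

Under these assumptions the yearly intake roughly doubles over a decade, which illustrates why steady linear growth is slow compared with self-archiving the whole literature at once.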

It is now possible for the rate of self-archiving in physics to increase quickly enough, and for the practice to extend to other disciplines. My original "subversive proposal" (1 or correct number) to free the refereed literature through author self-archiving fell largely on deaf ears because self-archiving in an anonymous FTP archive or a Web home page would be unsearchable, unnavigable, irretrievable, and hence unusable. Centralized archiving, even when made available to other disciplines, has not been catching on fast enough either (it took 3 years for http://cogprints.soton.ac.uk to receive 1,000 articles).

The key was to agree on and to introduce metadata-tagging standards to make the contents of all the distributed archives interoperable, hence harvestable into one global ‘virtual’ archive, with all papers searchable and retrievable by everyone for free. Now, the Open Archives Initiative (OAI) at http://www.openarchives.org has provided the metadata-tagging standards and a registry for all OAI-compliant eprint archives; and the self-archiving initiative at http://www.eprints.org has provided free software for creating OAI-compliant archives, interoperable with all other open archives, ready to be registered and for their contents to be harvested into searchable global archives (see http://cite-base.ecs.soton.ac.uk/cgi-bin/search).
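As a rough illustration of what ‘harvestable’ means in practice, the sketch below builds a harvesting request and reads the metadata out of a reply. It assumes the later OAI-PMH 2.0 version of the protocol with simple Dublin Core (`oai_dc`) metadata, which may differ from the version of the OAI standards current when this article was written. The archive address `eprints.example.org` and the sample record are hypothetical, and no network request is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical archive endpoint, used for illustration only.
BASE_URL = "https://eprints.example.org/oai"

# XML namespaces used by OAI-PMH 2.0 and simple Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A hand-written reply of the kind an archive might return (hypothetical data).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:eprints.example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An illustrative eprint</dc:title>
          <dc:creator>Author, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the URL that asks an archive to list all of its records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text: str) -> list[dict[str, str]]:
    """Extract identifier, title and creator from a ListRecords reply."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "title": record.findtext(".//dc:title", default="", namespaces=NS),
            "creator": record.findtext(".//dc:creator", default="", namespaces=NS),
        })
    return records


print(list_records_url(BASE_URL))
for entry in parse_records(SAMPLE_RESPONSE):
    print(entry)
```

A global ‘virtual’ archive would, in this picture, send the same request to every registered archive and merge the parsed records into one searchable index.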

Distributed, institution-based self-archiving benefits not only authors but also researchers’ institutions, through reduced overheads and the greater prestige of having their researchers’ work universally available. Further, the likelihood of eventually reducing the library’s annual serials budget to 10% of its present level is an added incentive for the institution itself to hasten the transition to self-archiving. The institutional library can both help researchers to learn self-archiving and maintain the institution’s own refereed eprint archives as an outgoing collection for external use, in place of the old incoming collection, acquired through journal subscriptions, for internal use. Institutional library consortial power should also be used to provide leveraged support for journal publishers who commit themselves to a timetable of downsizing to become pure quality-control service providers.

How the transition will happen

First, once all refereed journal articles are self-archived by their authors in their institution’s eprint archive, the literature is freed from all access- and impact-barriers. Self-archiving could be done virtually overnight; the day after, all research papers become accessible online to most of the researchers in the world.

Second, one possible outcome is that this will be the end of it. The refereed literature will be free online for those who want it and cannot get it any other way, but those who can afford to get it the old way, by paying for journals, will continue to do so. In this event, the access/impact problem will be solved but the library’s budget crisis will not, although it will have become less urgent.

Third, an alternative outcome is that when the refereed literature is accessible online for free, users will prefer the free version (as so many physicists already do). Journal revenues will shrink and institutional savings grow. Journals will have to scale down to providing only the essentials (the quality-control service), with the rest (paper version, on-line PDF version, other ‘added values’) sold as an option.

In none of these outcomes is peer review itself compromised, sacrificed, or put at risk; nor do authors have to give up, even temporarily, submitting to their established journal of choice. All they have to do is self-archive their preprints and postprints in their institutional eprint archives.

Copyright restrictions are no obstacle to self-archiving, because preprints can be self-archived without any restriction when the paper is submitted to a journal. When the paper is accepted by a journal, the author can ask the journal to let him or her retain the right to give away the article online by self-archiving it. In practice, many publishers will agree to this if the author asks, although most do not publicly state it as policy. For these papers, the author can self-archive the refereed postprint alongside the pre-refereeing preprint(s). For those publishers who refuse to publish the paper unless all rights are transferred, I advise authors to sign the restrictive agreement and to self-archive a linked "corrigenda" file listing for the user what changes have to be made in the preprint to make it equivalent to the postprint (for details, see http://www.cogsci.soton.ac.uk/~harnad/Tp/resolution.htm#Harnad/Oppenheim).
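One possible way to draw up such a "corrigenda" list is to compare the two versions automatically. The sketch below is an assumption about how this might be done, not a procedure described in the article: it uses the standard `difflib` module from Python to list the lines that differ between a preprint and a postprint, and the two sample texts are invented for illustration.

```python
import difflib


def corrigenda(preprint: str, postprint: str) -> str:
    """List the line-level changes that turn the preprint into the postprint.

    Lines starting with '-' are to be removed from the preprint;
    lines starting with '+' are to be added in their place.
    """
    diff = difflib.unified_diff(
        preprint.splitlines(),
        postprint.splitlines(),
        fromfile="preprint",
        tofile="postprint",
        lineterm="",
    )
    return "\n".join(diff)


# Invented sample texts, for illustration only.
preprint_text = "We measured the effect.\nThe effect was large."
postprint_text = "We measured the effect.\nThe effect was moderate."

print(corrigenda(preprint_text, postprint_text))
```

A list of this kind tells the reader what to change in the freely available preprint without reproducing the publisher’s version in full.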

 

 

Stevan Harnad is in the Intelligence/Agents/Multimedia Group, Department of Electronics and Computer Science, University of Southampton, Highfield, Southampton SO17 1BJ, UK (http://www.cogsci.soton.ac.uk/~harnad/).

 

Author: you have too many references, and they are all to your own work (including more in the text), which we do not encourage. Please reduce to around five essential ones, ideally not all to your own writing in this area. Interested readers can use these to find the others, or you can add a sentence to say that more bibliographic details are available direct from you. In the reference list, please stick to one online reference per citation.

 

1. Harnad, S. (1995) Universal FTP Archives for Esoteric Science and Scholarship: A Subversive Proposal. In: Ann Okerson & James O’Donnell (Eds.) Scholarly Journals at the Crossroads; A Subversive Proposal for Electronic Publishing. Washington, DC., Association of Research Libraries, June 1995.

 

Harnad, S. (1998/2000) The invisible hand of peer review. Nature [online] (5 Nov. 1998) http://helix.nature.com/webmatters/invisible/invisible.html Longer version in Exploit Interactive 5 (2000): http://www.exploit-lib.org/issue5/peer-review/ http://www.cogsci.soton.ac.uk/~harnad/nature2.html

 

Harnad, S. (2000) E-Knowledge: Freeing the Refereed Journal Corpus Online. Computer Law & Security Report 16(2) 78-87. [Rebuttal to Bloom Editorial in Science and Relman Editorial in New England Journal of Medicine] http://www.cogsci.soton.ac.uk/~harnad/Papers/Harnad/harnad00.scinejm.htm http://www.sciencemag.org/cgi/eletters/285/5425/197#EL12

 

Harnad, S. (2000) Ingelfinger Over-Ruled: The Role of the Web in the Future of Refereed Medical Journal Publishing. The Lancet Perspectives 256 (December Supplement): s16.

 

Harnad, S. (2001) For Whom the Gate Tolls? How and Why to Free the Refereed Research Literature Online Through Author/Institution Self-Archiving, Now.

Harnad, S., Carr, L. & Brody, T. (2001) How and Why To Free All Refereed Research From Access- and Impact-Barriers Online, Now. publication reference? One of these two can be cut.

 

Harnad, S., Varian, H. & Parks, R. (2000) Academic publishing in the online era: What Will Be For-Fee And What Will Be For-Free? Culture Machine 2 (Online Journal) http://www.cogsci.soton.ac.uk/~harnad/Temp/Varian/new1.htm http://culturemachine.tees.ac.uk/frm_f1.htm