Comments on the SPARC position paper

From: Stevan Harnad (harnad@ecs.soton.ac.uk)
Date: Sun Aug 04 2002 - 01:02:22 BST


Self-Archiving, Self-Vetting, "Overlay Journals" and "Disaggregated Models":
Comments on the SPARC Position Paper on Institutional Repositories
                http://www.arl.org/sparc/IR/ir.html

The SPARC position paper, "The Case for Institutional Repositories," http://www.arl.org/sparc/IR/ir.html is excellent and will serve a very important and useful purpose in mapping out for universities exactly why it is in their best interests to self-archive their research output, and how they should go about doing so.

I will only comment on a few passages, having mostly to do with the topic of "certification" (peer review), on which SPARC's message may have become a little garbled, much as the messages of like-minded precursor initiatives (notably E-biomed and Scholars' Forum) were garbled before it:

    E-biomed:
    A Proposal for Electronic Publications in the Biomedical Sciences
    http://www.nih.gov/about/director/ebiomed/com0509.htm

    Scholars' Forum:
    A New Model For Scholarly Communication
    http://library.caltech.edu/publications/ScholarsForum/042399sharnad.htm

To overview the point in question very briefly:

To provide open access (i.e., free, online, full-text access) to the research output of universities and research institutions worldwide -- output that is currently accessible only by paying access-tolls to the 24,000 peer reviewed journals in which their 2.5 million annual research papers are published -- does not call for or depend upon any changes at all in the peer review system. On the contrary, it would be a profound strategic (and factual) mistake to give the research community the incorrect impression that there is or ought to be any sort of link at all between providing open access to their own research literature by self-archiving it and any modification whatsoever in the peer review system that currently controls and certifies the quality of that research.

The question of peer-review modification has absolutely nothing to do with the institutional repositories and self-archiving that the SPARC paper is advocating. The only thing that authors and institutions need to be clearly and explicitly reassured about (because it is true) is that self-archiving in institutional Eprints Archives will preserve intact that very same peer-reviewed literature (2.5 million peer-reviewed papers annually, in 24,000 peer-reviewed journals) to which it is designed to provide open access.

Hence, apart from providing these reassurances, it is best to leave the certification/peer-review issue alone! http://www.eprints.org/self-faq/#Peer-review-reform

Here is where this potentially misleading and counterproductive topic is first introduced in the SPARC paper's section on "certification":

"Certification. Most of the institutional repository initiatives currently being developed rely on user (including author) communities to control the input of content. These can include academic departments, research centers and labs, administrative groups, and other sub-groups. Faculty and others determine what content merits inclusion and act as arbiters for their own research communities. Any certification at the initial repository submission stage thus comes from the sponsoring community within the institution, and the rigor of qualitative review and certification will vary."

There is a deep potential ambiguity here. The SPARC paper might merely be referring here to how much, and how, institutions might decide to self-vet their own research output when it is still in the form of pre-peer-review preprints, and that would be fine:

    "1.5. Distinguish unrefereed preprints from refereed postprints"
    http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#1.5

But this institutional self-vetting of whatever of its own pre-refereeing research output a university decides to make public online should on no account be described as "qualitative review and certification"! That would instead be peer review, and peer review is the province of the qualified expert referees (most of them affiliated with other institutions, not the author's institution) who are called upon formally by the editors of independent peer-reviewed journals to referee the submissions to those journals; this quality-review is not the province of the institution that is submitting the research. Self-archiving is not self-publishing, and peer-review cannot be self-administered:

    "1.4. Distinguish self-publishing (vanity press) from self-archiving
    (of published, refereed research)"
    http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#1.4

It merely invites confusion to equate whatever preliminary self-vetting an institution may elect to do on the contents of the unrefereed preprint sector of its Eprint Archives with what it is that journals do when they implement peer review.

Worse, it might invite the conflation of self-archiving with self-publishing, if what the SPARC paper has in mind here is not just the unrefereed preprint sector of the institutional repository, but what would be its refereed postprint sector, consisting of those papers that are certified as having met a specific journal's established quality standards after classical peer review has taken its standard course:

    "What is an Eprint Archive?"
    http://www.eprints.org/self-faq/#Eprint-Archive

    "What is an Eprint?"
    http://www.eprints.org/self-faq/#What-is-Eprint

    "What should be self-archived?"
    http://www.eprints.org/self-faq/#What-self-archive

    "What is the purpose of self-archiving?"
    http://www.eprints.org/self-faq/#purpose-self-archiving

    "Is self-archiving publication?"
    http://www.eprints.org/self-faq/#self-archiving-vs-publication

It is extremely important to clearly differentiate an institution's self-vetting of the unrefereed sector of its archive from the external quality control and certification provided by refereed journals that subsequently yields the refereed sector of its archive. Nothing is gained by conflating the two:

    "Peer-review reform: Why bother with peer review?"
    http://www.eprints.org/self-faq/#Peer-review-reform

"In some instances, the certification will be implicit and associative, deriving from the reputation of the author's host department. In others, it might involve more active review and vetting of the research by the author's departmental peers. While more formal than an associative certification, this certification would typically be less compelling than rigorous external peer review. Still, in addition to the primary level certification, this process helps ensure the relevance of the repository's content for the institution's authors and provides a peer-driven process that encourages faculty participation."

These are all reasonable possibilities for the preliminary self-selection and self-vetting of an institution's unrefereed preprints. But implying that they amount to anything more than that -- by using the term "peer" for both this internal self-vetting and external peer review, and suggesting that there is some sort of continuum of "compellingness" between the two -- is not helpful or clarifying but instead leads to (quite understandable) confusion and resistance on the part of researchers and their institutions:

For, having read the above, the potential user who previously knew the refereed journal literature -- consisting of 24,000 peer-reviewed journals, 2.5 million refereed articles per year, each clearly certified with each journal's quality-control label, and backed by its established reputation and impact -- now no longer has a clear idea what literature we might be talking about here! Are we talking about providing open access to that same refereed literature, or are we talking about substituting some home-grown home-brew in its place?

Yet there is no need at all for this confusion: As correctly noted in the SPARC paper, University Eprint Archives ("Institutional Repositories") can have a variety of contents, but prominent among them will be the university's own research output (self-archived for the sake of the visibility, usage, impact, and their resulting individual and institutional rewards, as well described elsewhere in the SPARC paper). That institutional research output has, roughly, two embryonic stages: pre-peer-review (unrefereed) preprints and post-peer-review (refereed) postprints:

    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0661.html

Now the pre-peer-review preprint sector of the archive may well require some internal self-vetting (this is up to the institution), but the post-peer-review postprint sector certainly does not, for the "vetting" there has been done -- as it always has been -- by the external referees and editors of the journals to which those papers were submitted as preprints, and by which they were accepted for publication (possibly only after several rounds of substantive revision and re-refereeing) once the refereeing process had transformed them into the postprints.

Nor is the internal self-vetting of the preprint sector any sort of substitute for the external peer review that dynamically transforms the preprints into refereed, journal-certified postprints.

In the above-quoted passage, the functions of the internal preprint self-vetting and the external postprint refereeing/certification are completely conflated -- and conflated, unfortunately, under what appears like an institutional vanity-press penumbra, a taint that the self-archiving initiative certainly does not need, if it is to encourage the opening of access to its existing quality-controlled, certified research literature, such as it is, rather than to some untested substitute for it.

"It should be noted that to serve the primary registration and certification functions, a repository must have some official or formal standing within the institution. Informal, grassroots projects - however well-intentioned - would not serve this function until they receive official sanction."

Universities should certainly establish whatever internal standards they see fit for pre-filtering their pre-refereeing research before making it public. But the real filtration continues to be what it always was, namely, classical peer review, implemented and certified as it always was. This needs to be made crystal clear!

"Overlay Journals: Third-party online journals that point to articles and research hosted by one or more repositories provide another mechanism for peer review certification in a disaggregated model."

Unfortunately, the current user of the existing, toll-access refereed-journal literature is becoming more and more confused about just what is actually being contemplated here! Does institutional self-archiving mean that papers lose the quality-control and certification of peer-reviewed journals and have it replaced by something else? By what? And what is the evidence that we would then still have the same literature we are talking about here? Does institutional self-archiving mean giving up the established forms of quality control and certification and replacing them by untested alternatives?

There also seems to be some confusion between the more neutral concept of

(1) "overlay journals" (OJs) (e.g., Arthur Smith: http://ridge.aps.org/APSMITH/ALPSP/talk.html), which merely use Eprint Archives for input (the online submission/refereeing of author self-archived preprints) and output (the official certification of author self-archived postprints as having been peer-reviewed, accepted and "published" by the OJ in question), but leave the classical peer review system intact;

and the vaguer and more controversial notion of

(2) "deconstructed journals" (DJs) on the "disaggregated model" (e.g., John W.T. Smith:
http://library.ukc.ac.uk/library/papers/jwts/d-journal.htm), in which (as far as I can ascertain) what is being contemplated is the self-archiving of preprints and their subsequent "submission" to one or many evaluating/certifying entities (some of which may be OJs, others some other unspecified kind of certifier) who give the papers their respective "stamps of approval."

"Re: Alternative publishing models - was: Scholar's Forum: A New Model..." http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0216.html

JWT Smith has made some testable empirical conjectures, which could eventually be tested in a future programme of empirical research on alternative research quality review and certification systems. But they certainly do not represent an already tested and already validated ("certified"?) alternative system, ready for implementation in place of the 2.5 million annual research articles that currently appear in the 24,000 established refereed journals!

As such, untested speculations of this kind are perhaps a little out of place in the context of a position paper that is recommending concrete (and already tested) practical steps to be taken by universities in order to maximize the visibility, accessibility and impact of their research output (and perhaps eventually to relieve their library serials budgetary burden too).

Author/institution self-archiving of research output -- both preprints and postprints -- is a tested and proven supplement to the classical journal peer review and publication system, but by no means a substitute for it. Self-archiving in Open Access Eprint Archives has now been going on for over a decade, and both its viability and its capacity to increase research visibility and impact have been empirically demonstrated:
http://www.nature.com/nature/debates/e-access/Articles/lawrence.html

Substitutes for the existing journal peer review and publication system, in contrast, require serious and systematic prior testing in their own right; there is nothing anywhere near ready there for practical recommendations other than the feasibility of Overlay Journals (OJs) as a means of increasing the efficiency and speed and lowering the cost of classical peer review.
http://www.ecs.soton.ac.uk/~harnad/Temp/peerev.ppt
Almost no testing of any other model has been done yet; there are no generalizable findings available, and there are many prima facie problems with some of the proposed models (including JWT Smith's "disaggregated" model, [DJs]) that have not even been addressed:

See the discussion (and some of the prima facie problems) of JWT Smith's
model under:

    "Alternative publishing models - was: Scholar's Forum: A New Model..."
    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0216.html

    "Journals are Quality Certification Brand-Names"
    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0270.html

    "Central vs. Distributed Archives"
    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0215.html

    "The True Cost of the Essentials (Implementing Peer Review)"
    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0308.html

    "Workshop on Open Archives Initiative in Europe"
    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0943.html

In contrast, there has been a recent announcement that the Journal of Nonlinear Mathematical Physics http://www.sm.luth.se/~norbert/home_journal/ will become openly accessible as an "overlay journal" (OJ) on the Physics Archive http://arxiv.org/archive/nlin

This is certainly a welcome development -- but note that JNMP is a classically peer-reviewed journal, and hence the "overlay" is not a substitute for classical peer review: It merely increases the visibility, accessibility and impact of the certified, peer-reviewed postprints while at the same time providing a faster, more efficient and economical way of processing submissions and implementing [classical] peer review online.

Indeed, Overlay Journals (OJs) are very much like the Open-Access Journals that are the target of Budapest Open Access Strategy 2: http://www.soros.org/openaccess/read.shtml

Deconstructed/Disaggregated Journals (DJs), in contrast, are a much vaguer, more ambiguous, and more problematic concept, nowhere near ready for recommendation in a SPARC position paper.

"While some of the content for overlay journals might have been previously published in refereed journals, other research may have only existed as a pre-print or work-in-progress."

This is unfortunately beginning to conflate the notion of the "overlay" journal (OJ) with some of the more speculative hypothetical features of the "deconstructed" or "disaggregated" journal (DJ):

The (informal) notion of an overlay journal is quite simple: If researchers are self-archiving their preprints and postprints in Eprint Archives anyway, there is, apart from any remaining demand for paper editions, no reason for a journal to put out its own separate edition at all: Instead, the preprint can first be deposited in the preprint sector of an Eprint Archive. The journal can be notified by the author that the deposit is intended as a formal submission. The referees can review the archived preprint. The author can revise it according to the editor's disposition letter and the referee reports. The revised draft can again be deposited and re-refereed as a revised preprint. Once a final draft is accepted, that then becomes tagged as the journal-certified (refereed) postprint.

End of story. That is an "overlay" journal (OJ), with the postprint permanently "certified" by the journal-name as having met that journal's established quality standards. The peer review is classical, as always; the only thing that has changed is the medium of implementation of the peer review and the medium of publication (both changes being in the direction of greater efficiency, functionality, speed, and economy).
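
As an illustration only (it is not part of the SPARC paper or of any existing archive software), the overlay workflow just described can be sketched as a small state machine. The class, method, and status names below are hypothetical; the one point the sketch is meant to show is that the journal-name is attached only at acceptance, after classical peer review has run its course.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    PREPRINT = "unrefereed preprint"
    SUBMITTED = "submitted for refereeing"
    UNDER_REVISION = "revision requested"
    POSTPRINT = "refereed postprint"


@dataclass
class Eprint:
    """Hypothetical record for one paper held in an Eprint Archive."""
    title: str
    status: Status = Status.PREPRINT
    version: int = 1
    certified_by: Optional[str] = None  # journal-name, set only on acceptance
    history: List[Status] = field(default_factory=list)

    def _move(self, allowed, new_status):
        # Refuse any transition the workflow does not permit.
        if self.status not in allowed:
            raise ValueError(
                f"cannot go from {self.status.name} to {new_status.name}"
            )
        self.history.append(self.status)
        self.status = new_status

    def submit(self):
        # The author notifies the journal that the deposit is a formal submission.
        self._move({Status.PREPRINT}, Status.SUBMITTED)

    def request_revision(self):
        # The editor's disposition letter and referee reports call for changes.
        self._move({Status.SUBMITTED}, Status.UNDER_REVISION)

    def deposit_revision(self):
        # The revised draft is deposited and goes back to the referees.
        self._move({Status.UNDER_REVISION}, Status.SUBMITTED)
        self.version += 1

    def accept(self, journal_name):
        # The accepted final draft is tagged as the journal-certified postprint.
        self._move({Status.SUBMITTED}, Status.POSTPRINT)
        self.certified_by = journal_name


paper = Eprint("An example paper")
paper.submit()
paper.request_revision()
paper.deposit_revision()
paper.accept("Journal of Examples")  # hypothetical journal-name
```

Note that there is no path from PREPRINT straight to POSTPRINT: in this sketch a raw preprint cannot acquire the journal's tag without passing through submission and refereeing.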

A deconstructed/disaggregated journal (DJ) is an entirely different matter. As far as I can ascertain, what is being contemplated there is something like an approval system plus the possibility that the same paper is approved by a number of different "journals." The underlying assumptions are questionable:

(1) Peer review is neither a static red-light/green-light process nor a grading system, singular or multiple: The preprint does not receive one or a series of "tags." Peer review is a dynamic process of mediated interactions between an author and expert referees, answerable to an expert editor who selects the referees for their expertise and who determines what has to be done to meet the journal's quality standards -- a process during which the content of the preprint undergoes substantive revision, sometimes several rounds of it. The "grading" function comes only after the preprint has been transformed by peer review into the postprint, and consists of the journal's own ranking in the established (and known) hierarchy of journal quality levels (often also associated with the journal's citation impact factor).

It is not at all clear whether and how having raw preprints certified as approved -- singly or many times over -- by a variety of "deconstructed journals" (DJs) can yield a navigable, sign-posted literature of the known quality and quality-standards that we have currently. (And to instead interactively transform them into postprints is simply to reinvent peer review.)

(2) Even more important: Referees are a scarce resource. Referees sacrifice their precious research time to perform this peer-reviewing duty for free, normally at the specific request of the known editor of a journal of known quality, and with the knowledge that the author will be answerable to the editor. The result of this process is the navigable, quality-controlled refereed research literature we have now, with the quality-grade certified by the journal label and its established reputation.

It is not at all clear (and there are many prima facie reasons to doubt) that referees would give of their time and expertise to a "disaggregated" system to provide grades and comments on raw preprints that might or might not be graded and commented upon by other (self-selected? appointed?) referees as well, and might or might not be responsive to their recommendations. Nor is it clear that a disaggregated system would continue to yield a literature that was of any use to other users either.

Classical peer review already exists, and works, and it is the fruits of that classical peer review that we are talking about making openly accessible through self-archiving, nothing more (or less)! Journals (more specifically, their editorial boards and referees) are the current implementers of peer review. They have the experience, and their quality-control "labels" (the journal-names) have the established reputations (and citation impact factors) on which such "metadata" tags depend for their informational value in guiding users. There is no need either to abandon journals or to re-invent them under another name ("DJ").

A peer-reviewed journal, medium-independently, is merely a peer-review service provider and certifier. That is what they are, and that is what they will continue to be. Titles, editorial boards and their referees may migrate, to be sure. They have done so in the past, between different toll-access publishers; they could do so now too, if/when necessary, from toll-access to open-access publishers. But none of this involves any change in the peer review system; hence there should be no implication that it does.

(JWT Smith also contemplates paying referees for their services, another significant and untested departure from classical peer review, with the potential for bias and abuse -- if only there were enough money available to make it worth referees' while, which there is not! At realistic rates, offering to pay a referee for stealing his research time to review a paper would risk adding insult to injury.) http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0303.html

So there is every reason to encourage institutions to self-archive their research output, such as it is, before and after peer review. But there is no reason at all to link this with speculative scenarios about new publication and/or peer review systems, which could well put the very literature we are trying to make more usable and used at risk of ceasing to be useful or usable to anyone.

The message to researchers and their institutions should be very clear:

The self-archiving of your research output, before (preprints) and after (postprints) peer-reviewed publication will maximize its visibility, usage, and impact, with all the resulting benefits to you and your institution. Self-archiving is merely a supplement to the existing system, an extra thing that you and your institution can do, in order to enjoy these benefits. You need give up nothing, and nothing else need change.

In addition, one possible consequence, if enough researchers and their institutions self-archive enough research long enough, is that your institutional libraries might begin to enjoy some savings on their serials expenditures, because of subscription cancellations. This outcome is not guaranteed, but it is a possible further benefit, and might in turn lead to further restructuring of the journal publication system under the cancellation pressure -- probably in the direction of cutting costs and downsizing to the essentials, which will probably reduce to just providing peer review alone. The true cost of that added value, per paper, will in turn be much lower than the total cost now, and it will make most sense to pay for it out of the university's annual windfall subscriptions savings as a service, per outgoing paper, rather than as a product, per incoming paper, as in toll-access days. This outcome too would be very much in line with the practice of institutional self-archiving of outgoing research that is being advocated by the SPARC position paper. http://www.ecs.soton.ac.uk/~harnad/Tp/nature4.htm#B1

The foregoing paragraph, however, only describes a hypothetical possibility, and need not and should not be counted as among the sure benefits of author/institution self-archiving -- which are, to repeat: maximized visibility, usage, and impact for institutional research output, resulting from maximized accessibility.

"As a paper could appear in more than one journal and be evaluated by more than one refereeing body, these overlays would allow the aggregation and combination of research articles by multiple logical approaches - for example, on a particular theme or topic (becoming the functional equivalent of anthology volumes in the humanities and social sciences); across disciplines; or by affiliation (faculty departmental bulletins that aggregate the research of their members)."

Here the speculative notion of substituting "disaggregated journals" (DJs) for classical peer review is being conflated with the completely orthogonal matter of collections and alerting: An open-access online research literature can certainly be linked and bundled and recombined in a variety of very useful ways, but this has nothing whatsoever to do with the way its quality is arrived at and certified as such. Until an alternative has been found, tested and proven to yield at least comparable sign-posted quality, the classical peer review system is the only game in town. Let us not delay the liberation of its fruits from access-barriers still longer by raising the spectre of freeing them not only from the access-tolls but also from the self-same peer review system that (until further notice) generated and certified their quality!

    "Rethinking "Collections" and Selection in the PostGutenberg Age"
    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1796.html

"Such journals exist today -- for example, the Annals of Mathematics overlay to arXiv and Perspectives in Electronic Publishing, to name just two -- and they will proliferate as the volume of distributed open access content increases."

The Annals of Mathematics http://www.math.princeton.edu/~annals/ is an "overlay" journal (OJ) of the kind I described above, using classical peer review. It is not an example of the "disaggregated" quality control system (DJ).

Perspectives in Electronic Publishing, in contrast, is merely a collection of links to already published work:
http://aims.ecs.soton.ac.uk/pep.nsf
It does not represent any sort of alternative to classical peer review and journal publication.

"Besides overlay journals pointing to distributed content, high-value information portals - centered around large, sophisticated data sets specific to a particular research community - will spawn new types of digital overlay publications based on the shared data."

Journals that are overlays to institutional research repositories are merely certifying that papers bearing their tag have undergone their peer-review and have met their established quality standards. This has nothing to do with alternative forms of quality control, disaggregated or otherwise.

Post hoc collections (link-portals) have nothing to do with quality control either, although they will certainly be valuable for other purposes. http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1796.html

"Regardless of journal type, the basis for assessing the quality of the certification that overlay journals provide differs little from the current journal system: eminent editors, qualified reviewers, rigorous standards, and demonstrated quality."

Not only does it not differ: Overlay Journals (OJs) will provide identical quality and standards -- as long as "overlay" simply means having the implementation of peer review (and the certification of its outcome) piggy-back on the institutional archives, as it should.

Alternative forms of quality control (e.g., DJs), on the other hand, will first have to demonstrate that they work.

And neither of these is to be confused with the post-hoc function of aggregating online content, peer-reviewed or otherwise.

This should all be made crystal clear in the SPARC paper, partly by stating it in a clear straightforward way, and partly by omitting the speculative options that only cloud the picture needlessly (and have nothing to do with institutional self-archiving and its rationale [open access], but simply risk confusing and discouraging would-be self-archivers and their institutions).

"In addition to these analogues to the current journal certification system, a disaggregated model also enables new types of certification models. Roosendaal and Geurts have noted the implications of internal and external certification systems."

Please, let us distinguish the two by calling "internal certification" pre-certification (or "self-certification") so as not to confuse it with peer review, which is by definition external (except in that happy but rare case where an institution happens to house enough of the world's qualified experts on a given piece of research not to have to consult any outside experts).

A good deal of useful pre-filtering can be done by institutions on their own research output, especially if the institution is large enough. (CERN http://preprints.cern.ch/OAi/ has a very rigorous internal review system that all outgoing research must undergo before it is submitted to a journal for peer review.)

But, on balance, "internal certification" rightly raises the spectre of vanity press publication. Nor is it a coincidence that when universities assess their own researchers for promotion and tenure, they tend to rely on the external certification provided by peer reviewed journals (weighted sometimes by their impact factors) rather than just internal review. The same is true of the external assessors of university research output: http://www.hero.ac.uk/rae/

So, please, let us not link the very desirable and face-valid goal of maximizing universities' research visibility and research impact through open access provided by institutional self-archiving with the much more dubious matter of institutional self-certification.

"Certification may pertain at the level of internal, methodological considerations, pertinent to the research itself - the standard basis for most scholarly peer review. Alternatively, the work may be gauged or certified by criteria external to the research itself - for example, by its economic implications or practical applicability. Such internal and external certification systems would typically operate in different contexts and apply different criteria. In a disaggregated model, these multiple certification levels can co-exist."

This is all rather vague, and somewhat amateurish, and would (in my opinion) have been better left out of this otherwise clear and focussed call for institutional self-archiving of research output.

And the idea of expecting referees to spend their precious time refereeing already-refereed and already-certified (i.e., already-published) papers yet again is unrealistic in the extreme, especially considering the growing number of papers, the scarcity of qualified expert referees (who are otherwise busy doing the research itself), and the existing backlogs and delays in refereeing and publication.

Besides, as indicated already, refereeing is not passive tagging or grading: It is a dynamic, interactive, and answerable process in which the preprint is transformed into the accepted postprint, and certified as such. Are we to imagine each of these papers being re-written every time it is submitted to yet another DJ?

There is a lot to be said for postpublication revision and updating of the postprints ("post-postprints") in response to postpublication commentary (or to correct substantive errors that come to light later), but it only invites confusion to call that "disaggregated journal publication." The refereed, journal-certified postprint should remain the critical, canonical, scholarly and archival milestone that it is, perpetually marking the fact that that draft successfully met that journal's established quality standards. Further iterations of this refereeing/certification process make no sense (apart from being profligate with scarce resources) and should in any case be tested for feasibility and outcome before being recommended!

"To support both new and existing certification mechanisms, quality certification metadata could be standardized to allow OAI-compliant harvesting of that information. This would allow a reader to determine whether there is any certification information about an article, regardless of where the article originated or where it is discovered."

Might I venture to put this much more simply (and restrict it to the refereed research literature, which is my only focus)? By far the most relevant and informative "metadatum" certifying the information in a research paper is the JOURNAL-NAME of the journal in which it was published (signalling, as it does, the journal's established reputation, quality level, and impact factor)! (Yes, the AUTHOR-NAME, and the AUTHOR-INSTITUTION metadata-tags may be useful sometimes too, but those cases do not, as they say, "scale" -- otherwise self-certification would have replaced peer review long ago. COMMENT-tags would be welcome too, but caveat emptor.)

    "Peer Review, Peer Commentary, and Eprint Archive Policy"
    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1926.html

Please let us not lose sight of the fact that the main purpose of author/institution self-archiving in institutional Eprint Archives is to maximize the visibility, uptake and impact of research output by maximizing its accessibility (by providing open access). It is not intended as an experimental implementation of speculations about untested new forms of quality control! That would be to put this all-important literature needlessly at risk (and would simply discourage researchers and their institutions from self-archiving it at all). http://www.eprints.org/self-faq/#7.Peer

There is a huge amount of further guiding information that can be derived from the literature to help inform navigation, search and usage. A lot of it will be digitometric analysis based on usage measures such as citation, hits, and commentary http://cogprints.ecs.soton.ac.uk/archive/00001697/index.html

But none of these digitometrics should be mistaken for certification, which, until further notice, is a systematic form of expert human interaction and judgement called peer review: http://www.nature.com/nature/webmatters/invisible/invisible.html

    Harnad, S. & Carr, L. (2000) Integrating, Navigating and Analyzing
    Eprint Archives Through Open Citation Linking (the OpCit Project).
    Current Science 79(5): 629-638.
    http://cogprints.soton.ac.uk/documents/disk0/00/00/16/97/index.html

"Depending on the goals established by each institution, an institutional repository could contain any work product generated by the institution's students, faculty, non-faculty researchers, and staff. This material might include student electronic portfolios, classroom teaching materials, the institution's annual reports, video recordings, computer programs, data sets, photographs, and art works-virtually any digital material that the institution wishes to preserve. However, given SPARC's focus on scholarly communication and on changing the structure of the scholarly publishing model, we will define institutional repositories here-whatever else they might contain-as collecting, preserving, and disseminating scholarly content. This content may include pre-prints and other works-in-progress, peer-reviewed articles, monographs, enduring teaching materials, data sets and other ancillary research material, conference papers, electronic theses and dissertations, and gray literature."

This passage is fine, and refocusses on the items of real value in the SPARC position paper.

"To control and manage the accession of this content requires appropriate policies and mechanisms, including content management and document version control systems. The repository policy framework and technical infrastructure must provide institutional managers the flexibility to control who can contribute, approve, access, and update the digital content coming from a variety of institutional communities and interest groups (including academic departments, libraries, research centers and labs, and individual authors). Several of the institutional repository infrastructure systems currently being developed have the technical capacity to embargo or sequester access to submissions until the content has been approved by a designated reviewer. The nature and extent of this review will reflect the policies and needs of each individual institution, possibly of each participating institutional community. As noted above, sometimes this review will simply validate the author's institutional affiliation and/or authorization to post materials in the repository; in other instances, the review will be more qualitative and extensive, serving as a primary certification."

This is all fine, as long as it is specified that what is at issue is institutional pre-certification or self-certification of its unrefereed research (preprints).

For peer-reviewed research the only institutional authentication required is at most that the AUTHOR-NAME and JOURNAL-NAME are indeed as advertised! (The integrity of the full text could be vetted too, but I'm inclined to suggest that that would be a waste of time and resources at this point. What is needed right now is that institutions should create and fill their own Eprint Archives with their research output, pre- and post-refereeing, immediately. The "definitive" text, until journals really all become "overlay" journals, is currently in the hands of the publishers and subscribing libraries. For the time being, let authors "self-certify" their refereed, published texts as being what they say they are; let's leave worrying about more rigorous authentication for later. For now, the goal should be to self-archive as much research output as possible, as soon as possible, with minimal fuss. The future will take care of itself. http://www.eprints.org/self-faq/#2.Authentication )

"Institutional repository policies, practices, and expectations must also accommodate the differences in publishing practices between academic disciplines. The early adopter disciplines that developed discipline-specific digital servers were those with an established pre-publication tradition. Obviously, a discipline's existing peer-to-peer communication patterns and research practices need to be considered when developing institutional repository content policies and faculty outreach programs. Scholars in disciplines with no prepublication tradition will have to be persuaded to provide a prepublication version; they might fear plagiarism or anticipate copyright or other acceptance problems in the event they were to submit the work for formal publication. They might also fear the potential for criticism of work not yet benefiting from peer review and editing. For these non-preprint disciplines, a focus on capturing faculty post-publication contributions may prove a more practical initial strategy."

Agreed. And here are some prima facie FAQs for allaying each of these by now familiar prima facie fears:

    http://www.eprints.org/self-faq/#2.Authentication
    http://www.eprints.org/self-faq/#3.Corruption
    http://www.eprints.org/self-faq/#5.Certification
    http://www.eprints.org/self-faq/#6.Evaluation
    http://www.eprints.org/self-faq/#7.Peer
    http://www.eprints.org/self-faq/#10.Copyright
    http://www.eprints.org/self-faq/#11.Plagiarism
    http://www.eprints.org/self-faq/#12.Priority
    http://www.eprints.org/self-faq/#22.Tenure/Promotion
    http://www.eprints.org/self-faq/#self-archiving-legal
    http://www.eprints.org/self-faq/#publisher-forbids

"Including published material in the repository will also help overcome concerns, especially from scholars in non-preprint disciplines, that repository working papers might give a partial view of an author's research."

Indeed. And that is the most important message of all -- and the primary function of institutional eprint archives: to provide open access to all peer-reviewed research output!

"Therefore, including published material, while raising copyright issues that need to be addressed, should lower the barrier to gaining non-preprint traditions to participate. Where authors meet traditional publisher resistance to the self-archiving rights necessary for repository posting, institutions can negotiate with those publishers to allow embargoed access to published research."

Fine.

"While gaining the participation of faculty authors is essential to effecting an evolutionary change in the structure of scholarly publishing, early experience suggests better success when positioning the repository as a complement to, rather than as a replacement for, traditional print journals."

Not only "positioning" it as a complement: Clearly proclaiming that a complement, not a replacement, is exactly what it is! Not just with respect to the relatively trivial issue of on-paper vs. on-line, but also with respect to the much more fundamental one, about journal peer review (vide supra). Institutional self-archiving is certainly no substitute for external peer review. (This is stated clearly in some parts of the SPARC paper, but unfortunately contradicted, or rendered ambiguous, in other parts.)

"This course partially obviates the most problematic objection to open access digital publishing: that it lacks the quality and prestige of established journals."

This is a non-sequitur and a misunderstanding: The quality and prestige come from being certified as having met the quality standards of an established peer-reviewed journal. This has nothing whatsoever to do with the medium (on-paper or on-line), nor with the access system (toll-access or open-access); and it certainly cannot be attained by self-archiving unrefereed preprints only. The papers must of course continue to be submitted to peer-reviewed journals for refereeing, revision, and subsequent certification.

"This also allows repository proponents to build a case for faculty participation based on the primary benefits that repositories deliver directly to participants, rather than relying on secondary benefits and on altruistic faculty commitment to reforming a scholarly communications model that has served them well on an individual level."

I could not follow this. The primary benefits of self-archiving are the maximization of the visibility, uptake and impact of research output by maximizing its accessibility (through open-access). Researchers certainly will not, and should not, self-archive in order to support untested new "certification" conjectures, nor even to ease their institutions' serials budgets. The appeal must be straight to researchers' self-interest in promoting their own research.

"Additionally, value-added services such as enhanced citation indexing and name authority control will allow a more robust qualitative analysis of faculty performance where impact on one's field is a measurement. The aggregating mechanisms that enable the overall assessment of the qualitative impact of a scholar's body of work will make it easier for academic institutions to emphasize the quality, and de-emphasize the quantity, of an author's work.53 This will weaken the quantity-driven rationale for the superfluous splintering of research into multiple publication submissions. The ability to gauge a faculty member's publishing performance on qualitative rather than quantitative terms should benefit both faculty and their host institutions."

All true, but strategically, it is best to stress maximization of existing performance indicators, rather than hypothetical new ones:

    Harnad, S. (2001) "Research Access, Impact and Assessment
    are linked." Times Higher Education Supplement 1487: p. 16.
    http://www.ecs.soton.ac.uk/~harnad/Tp/thes1.html

"Learned society publishers are for the most part far less aggressive in exploiting their monopolies than their for-profit counterparts. Even so, most society publishing programs, even in a not-for-profit context, often contribute significantly to covering an organization's operating expenses and member services. It is not surprising, then, that proposals advocating institutional repositories and other open access dissemination of scholarly research generate anxiety, if not outright resistance, amongst society publishers. While one hopes that societies adopt the broadest perspective possible in serving the needs of their members-including the broadest possible access to the scholarly research in the field-it is unlikely that societies will trade their organizations' solvency for the greater good of scholarship. It is important, therefore, to review how society publishers can continue to operate in an environment of institutional repositories and other open access systems."

Once the causal connection between access and impact is clearly demonstrated to the research community, it is highly unlikely that they will knowingly choose to continue to subsidise their Learned Societies' "good works" with the lost impact of their own work, by continuing to hold it hostage to impact-blocking access-tolls: Societies will need to find better ways to support their good works. See: http://www.eprints.org/self-faq/#19.Learned

"Some suggest that institutional repositories, pre-print servers, and electronic aggregations of individual articles will undermine the importance of the journal as a packager of articles. However, institutional repositories and other open access mechanisms will only threaten the survival of scholarly journals if they defeat the brand positions of the established society journals and if individual article impact metrics replace journal impact factors in academic advancement decisions."

Most of the above is not true, and hence better left unsaid.

It is quite possible (and hence should not be denied) that author/institution self-archiving of refereed research may eventually necessitate downsizing by publishers (to become peer-review/certification service-providers):

    "4.2 Hypothetical Sequel"
    http://www.eprints.org/self-faq/#17.Publishers
    http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#4.2

But none of this has anything to do with journal- vs author- impact metrics! The ISI's Web of Science http://wos.mimas.ac.uk/ has already made it possible (and very useful) for institutions and funding agencies to use either journal or author citation impact metrics for assessment, whichever is more useful and informative, and it is very likely that weighting publications only by their journal-impact will prove a much blunter instrument than weighting them by the paper's and/or author's impact: http://www.hero.ac.uk/rae/

   Harnad, S., Carr, L., Brody, T. & Oppenheim, C. (2003)
   Mandated online RAE CVs Linked to University Eprint Archives:
   Improving the UK Research Assessment Exercise
   whilst making it cheaper and easier.
   Ariadne 35 (April 2003).
   http://www.ariadne.ac.uk/issue35/harnad

But once the institutional Eprint Archives are up and filled, far richer and more sensitive digitometric measures of impact and usage are waiting to be devised and tested on this vast corpus. A taste is already available from citebase:
http://citebase.eprints.org/
http://citebase.eprints.org/analysis/correlation.php

For ongoing research on these new digitometric performance indicators, see: http://opcit.eprints.org/opcitresearch.shtml

"On the first point, journal brand reputation will, for the foreseeable future, continue to be integral to the assessment of article and author quality."

For the reader/user/navigator of the literature, certainly. But more sensitive measures are developing too, for the evaluator, funder and employer. The all-important JOURNAL-NAME tag, and the established quality level and impact to which it attests will continue to be indispensable sign-posts, but a great deal more will be built on top of them, once the entire refereed journal literature (24K journals) is online and open-access.

"Market-aware journals with prominent editorial boards and well-established publishing histories should be able to maintain their prestige, even with a proliferation of article-based aggregations. As to the second point, while new metrics will evolve that demonstrate the quantitative impact of individual articles, rigorous peer review will continue to provide value. Even after individual article impact analysis becomes widespread and accepted by academic tenure committees, stringent refereeing standards will continue to play a central role in indicating quality."

Correct, and mainly because peer review is the cornerstone of it all.

"Learned societies have long-standing relationships with their members and they should be able to act as focal points for the research communities they represent. While society dues typically include a journal subscription, society members also enjoy other benefits of membership-and, presumably, additional value-beyond the journal subscription itself. Societies, therefore, provide community-supporting services to justify their members' dues besides the value allocated to the journal subscription. While a commercial publisher would find it difficult to charge a subscription fee for a journal freely available online, society publishers-by repositioning the benefits of membership-might well prove able to allow journal article availability via open access repositories without experiencing substantial membership cancellations or revenue attrition."

In other words, members of learned societies may still be willing to pay membership dues to support their societies' "good works." But there is no need to call these dues "subscriptions"! http://www.eprints.org/self-faq/#19.Learned

And the cost of peer review itself can be covered very easily out of institutional subscription savings, if and when it becomes necessary: http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#viii

"Given the extent of government and private philanthropic foundation funding for academic research, especially in the sciences, such funding agencies have a vested interest in broadening the dissemination of scientific research. There are several mechanisms by which government and private funding agencies could help to achieve this broadened dissemination. It has been suggested that government and foundation research grants could be written to include subsidies for author page charges and other input-side fees to support open access business models. Such stipulations would help effect change in those disciplines, primarily in the sciences, where author page charges are the norm. Obviously, such subsidies would be less effective in disciplines where input-side models bear the stigma of vanity publishing; still, over time, this resistance could be overcome."

If/when open-access prevails enough to reduce publisher income, it will at the same time increase institutional savings (from cancelled subscriptions). As peer review costs much less than the whole of what journal publishers used to do, it can easily be paid for, at the author/institution end, as a service cost for outgoing research instead of as a product cost for incoming research as it is now, out of just a portion of institutions' annual windfall savings, as indicated below:

"Economically: The burden of scholarly journal costs on academic libraries has been well documented. While the variety of institutional contexts and potential implementations make it difficult to project institutional repository development and operational costs with any precision, the evidence so far suggests that the resources required would represent but a fraction of the journal costs that libraries now incur and over which they have little control."

And that is mainly because peer review alone -- which will be journal publishers' only remaining essential service if and when all journal publication becomes all open-access publication -- costs far less than what journal subscription/license tolls used to cost. The per-paper archiving cost, distributed over the research institutions that generate the outgoing papers, is negligible, compared to what it cost for incoming papers in the toll-based system.

    "The True Cost of the Essentials (Implementing Peer Review)"
    http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0303.html

"Several institutions have applied the e-prints self-archiving software to implement institutional repositories. Developed at the University of Southampton, the free eprints.org self-archiving software now comes configured to run an institutional pre-prints archive. The generic version of e-prints is fully interoperable with all the OAI Metadata Harvesting Protocol."

Not an institutional pre-prints archive: An institutional Eprints Archive. (Eprints = preprints + postprints)

"Universities that have implemented e-prints solutions include Cal Tech, the University of Nottingham, University of Glasgow, and the Australian National University. The participants in all these programs have described their experiences, providing practical insights that should benefit others contemplating an OAI-compliant e-prints implementation."

CalTech reviewed their experience with eprints for SPARC at: http://www.arl.org/sparc/core/index.asp?page=g20#6

Stevan Harnad