Re: OA advantage = EA + AA + QB + OA + UA

From: Stevan Harnad
Date: Wed, 20 Oct 2004 18:28:55 +0100

Prior AmSci Topic Thread:

    "OA advantage = EA + AA + QB + OA + UA"

A forthcoming article by Michael Kurtz (Harvard-Smithsonian Center for
Astrophysics) and co-workers reports that in astrophysics -- which (with
its small, closed circle of journals and with all active astrophysicists
worldwide being at institutions that can afford toll-access to all of
them) has had de-facto 100% OA for several years now -- the total number
of citations (hence the average number per article) has not risen; in
fact it may even have diminished a little. There is instead a threefold
increase in usage (readership, downloads).

    Kurtz et al. (2004) "The Effect of Use and Access on Citations"
    Information Processing and Management (submitted)

I think the interpretation of this is fairly clear: Once there is 100% OA,
research is used far more, and although the overall number of references
per article may not increase, their *selectivity* does, because authors can
cite what is most important and relevant, rather than just what their
institutions happen to be able to afford to access (as is the case before
there is 100% OA, which is the prevailing condition in all fields other
than astro currently).

One certainly cannot take the absence of an overall increase of citations
in a field that already has 100% OA, as evidence against the need for 100%
OA in other fields, where OA is far less than 100%!

Michael's interesting finding is probably unique to astro, which was even 100%
OA before the online era (i.e., 100% of astrophysicists were at institutions that
could afford 100% of the astro journals in paper), but his pattern of findings
has suggested that there are several components contributing to the OA Advantage:

(1) What Michael calls the "EA" or "Early Access" advantage: Papers that are
self-archived as preprints, even in astro, get more citations than those that
are not.

If I understand Michael's data correctly, however, the EA is in fact
a permanent increment in a paper's total cumulative citation count and
not just a phase shift that reaches its peak earlier, without increasing
the cumulative total of citations. This is probably because of a paper's
autocatalytic usage/citation/usage/citation cycle, which Tim Brody has
also detected, and is illustrated in Tim's forthcoming usage/citation
correlation paper:

    Brody, T. and Harnad, S. (2004) Using Web Statistics as a Predictor
    of Citation Impact.
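
As a rough illustration of what a usage/citation correlation measures,
here is a minimal sketch that computes a Pearson correlation between
early download counts and later citation counts. The numbers are invented
toy values, not data from the paper cited above, and pearson_r is a
hypothetical helper name, not anything from that study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Invented toy data (one entry per paper): early downloads, later citations.
downloads = [12, 40, 7, 95, 33, 60]
citations = [1, 5, 0, 14, 3, 9]

r = pearson_r(downloads, citations)
print(f"usage/citation correlation r = {r:.2f}")
```

A high r on real data would only say that usage and citations move
together; it would not by itself show which one drives the other.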

(2) The "AA" or "Arxiv advantage," which applies to both preprints and
postprints: Even though they are all already 100% OA through institutional
subscriptions/licenses, papers that are also self-archived in ArXiv get
more citations. (In fields with distributed institutional self-archiving,
AA would of course not be an ArXiv effect but an OAIster effect.) This
advantage would no doubt vanish if toll-access and open-access were fully
integrated, but it is interesting that it is present, even in a 100%
OA field.

(3) The Quality Bias, "QB": higher-quality, higher-impact authors tend
to self-archive more overall, and authors tend to self-archive their
higher-quality (hence higher-impact) papers selectively. This
self-selection bias
is definitely one of the factors underlying the positive correlation
between OA and citation counts, but it is certainly not the only
factor. It will be interesting to estimate the size of QB, relative to
the other 3 factors, especially as OA grows from 0% to 100%. (The QB
component obviously has to shrink as the proportion of self-archiving
authors grows, since QB is based on self-selective differential
self-archiving of only the higher-quality work.)
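
The shrinkage of QB can be sketched with a toy model. The assumptions
are mine, for illustration only: paper quality is spread evenly over
(0, 1), exactly the top fraction of papers by quality is self-archived,
and QB is measured as the excess of the archived papers' mean quality
over the field-wide mean. Comparing against the non-archived remainder
instead would be a different measure, which this sketch does not cover.
The name selection_excess is hypothetical.

```python
def selection_excess(n_papers, archived_fraction):
    """Mean quality of the self-archived papers minus the field-wide mean.

    Toy assumptions: quality is evenly spread over (0, 1), and exactly the
    top `archived_fraction` of papers by quality are self-archived.
    """
    if not 0 < archived_fraction <= 1:
        raise ValueError("archived_fraction must be in (0, 1]")
    # Evenly spaced qualities in ascending order.
    qualities = [(i + 0.5) / n_papers for i in range(n_papers)]
    n_archived = max(1, round(n_papers * archived_fraction))
    archived = qualities[-n_archived:]  # the highest-quality papers
    field_mean = sum(qualities) / n_papers
    return sum(archived) / n_archived - field_mean


fractions = (0.1, 0.25, 0.5, 0.75, 1.0)
excesses = [selection_excess(1000, f) for f in fractions]
for f, e in zip(fractions, excesses):
    print(f"{f:.0%} self-archived: selection excess = {e:.3f}")
```

Under these assumptions the excess falls steadily as the archived
fraction grows and is zero once everything is self-archived, because
there is then nothing left to select.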

(4) The true OA Advantage, OAA, which is probably by far the strongest in
fields that are nearer to 0% OA than to 100% OA because OAA is a *relative*
advantage (and a *competitive* one): In a non-OA field (unlike astro,
which is 100% OA), *all* factors give the advantage to the self-archived
article over the non-self-archived one (e.g., even postprints have the
"Early Advantage"). So even if the pure OAA is destined to shrink to
zero once 100% OA is reached, it is a *huge* advantage today, when OA
is far from 100%. It means that authors have a great deal of competitive
incentive to make their own articles OA now, before their competitors do.
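
The relative, competitive character of OAA can be sketched as follows.
This is a toy model under stated assumptions, not an estimate: the
field's total citation pool is fixed (as in the astro finding above),
and each OA article attracts oa_weight times the share of a non-OA
article. The function name, the parameter names and the value 2.0 are
all hypothetical.

```python
def citations_per_article(oa_fraction, oa_weight, field_mean=1.0):
    """Return (oa_cites, non_oa_cites) per article for a fixed citation pool.

    Toy assumption: the pool averages `field_mean` citations per article and
    is shared out so that an OA article gets `oa_weight` times the share of
    a non-OA article. At oa_fraction 0 or 1 the value for the empty group is
    what a single hypothetical article of that kind would receive.
    """
    if not 0 <= oa_fraction <= 1:
        raise ValueError("oa_fraction must be in [0, 1]")
    if oa_weight <= 0:
        raise ValueError("oa_weight must be positive")
    # Average weight per article; dividing by it keeps the pool fixed.
    normaliser = oa_fraction * oa_weight + (1 - oa_fraction)
    return field_mean * oa_weight / normaliser, field_mean / normaliser


for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    oa, non_oa = citations_per_article(fraction, oa_weight=2.0)
    print(f"OA share {fraction:.0%}: OA article {oa:.2f}, non-OA article {non_oa:.2f}")
```

In this sketch an OA article's lead over the field mean is largest when
few articles are OA and disappears at 100% OA, while an article left
non-OA loses ground as its competitors go OA.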

In other words, it's really a Prisoner's Dilemma, hence a horse race,
once the odds and the causality are clearly understood! That is why we are
so busily generating the OA advantage data across all disciplines in
our collaborative ISI study in Southampton, Quebec and Oldenburg.

(5) And then, of course, there is also the threefold increase in the
Usage Advantage (UA) (downloads) with OA, which is not to be sneezed at
either! Usage impact too can be counted, quantified and credited! After
all, even when it is citationally silent, an increase in reading
surely has *some* impact on what the researcher/user thinks and does,
if they are not merely Zombies going through empty motions of

    "OA advantage = EA + AA + QB + OA + UA"

(6) The subtlest factor of all (and the hardest to measure) would be an
asymptotic selectivity advantage for higher-quality papers that have
been freed from their prior handicap of inaccessibility once 100% OA
has prevailed: Even if the total number of citations in a field remains
unchanged or even diminishes when it reaches 100% OA (unlike the total
amount of reading, which triples), it becomes primarily the relative
quality and merit of each article that decides whether or not it will
be cited, rather than the arbitrary factors (such as affordability and
accessibility) that had influenced it in the non-OA era. (But this will
require a very subtle retrospective analysis after 100% OA has prevailed
in order to estimate quantitatively!)

Stevan Harnad

