On Wed, 26 Nov 2003, David Spurrett & Subbiah Arunachalam wrote:

>ds> I look forward to the results of the empirical study you describe.
>ds> I would be curious to know... whether
>ds> there was a further pattern that related (a) the extent to which
>ds> publications by authors at particular institutions cited research
>ds> materials available through open access, with (b) their local
>ds> institutional budget for expenditure on journals.
>sa> Stevan Harnad talked about a study on the relative
>sa> citation rates of open-access and toll-access articles
>sa> he is conducting in collaboration with UQaM,
>sa> Southampton, Oldenburg and Loughborough. When will the
>sa> results become available? Will there be any interim
>sa> reports? I am curious to know.

The study is ongoing and we will report the results (as a pre-refereeing
preprint!) as soon as they are available. But meanwhile, much information
inheres in -- and many telling estimates can be made from -- the data
that are already available.

David Spurrett's & Subbiah Arunachalam's queries suggest the following
preliminary analysis, which can already be done by anyone on the basis
of the data already available. (I will ask our super-talented team at
Southampton if they can squeeze it in, along with all the other ongoing

We know from the Lawrence study (below) that the citation enhancement factor
for open- vs. toll-access is about 4.5 in computer science (4.5 times as many
citations for open- vs. toll-access articles in the same venue).

We know from the Smith and Eysenck RAE outcome study in Psychology
(and from the Oppenheim studies in other disciplines) that the correlation
between RAE outcome and citation impact is about .90 (in Psychology).

We also know the 2001 RAE outcome, rank-ordering every department in every
university in the UK, and we also know the size of the funding and the
funding difference associated with each rank.

Hence it is very easy to take those rank orders, for each discipline,
and calculate -- based on that discipline's correlation between its
RAE rank and its citation impact -- the estimated income increase that
would arise from the rank increase induced by the impact increase
caused by open access!

In particular, it would be possible to illustrate how the rank order
would change if, for example, the research output of the lowest-ranked
department in each discipline became open-access, and gained a 2-fold,
3-fold, 4-fold, or 4.5-fold increase in impact (depending on how
close it came to the Lawrence 4.5 estimate -- which might itself be
an underestimate in some disciplines!). The RAE/impact correlation
would predict what rank that department would get, and the RAE/funding
correlation would predict how much more money that would translate into.
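
A minimal sketch of that calculation, in Python, might look like the
following. Everything in it is hypothetical: the department names, the
citation counts and the funding-per-rank figures are invented for
illustration and are not RAE data. It also makes the simplifying assumption
that rank follows citation order exactly, whereas the real correlation is
about .90, so a proper analysis would need each discipline's actual fitted
relationship.

```python
# Illustrative only: these citation counts and funding levels are invented,
# not taken from the RAE or from any citation database.
citations = {"Dept A": 1200, "Dept B": 900, "Dept C": 650,
             "Dept D": 400, "Dept E": 150}
funding_by_rank = {1: 5.0, 2: 4.0, 3: 3.0, 4: 2.0, 5: 1.0}  # arbitrary units


def rank_of(dept, counts):
    """Return the 1-based rank of dept when sorted by citations, highest first."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return ordered.index(dept) + 1


def rank_after_boost(dept, factor, counts):
    """Rank of dept if only its own citation count is multiplied by factor."""
    boosted = dict(counts)
    boosted[dept] = counts[dept] * factor
    return rank_of(dept, boosted)


def funding_gain(dept, factor, counts, funding):
    """Funding at the boosted rank minus funding at the current rank."""
    old_rank = rank_of(dept, counts)
    new_rank = rank_after_boost(dept, factor, counts)
    return funding[new_rank] - funding[old_rank]


lowest = min(citations, key=citations.get)  # the lowest-ranked department
for factor in (2, 3, 4, 4.5):
    new_rank = rank_after_boost(lowest, factor, citations)
    gain = funding_gain(lowest, factor, citations, funding_by_rank)
    print(f"x{factor}: rank {rank_of(lowest, citations)} -> {new_rank}, "
          f"funding gain {gain}")
```

Substituting a discipline's real RAE rank order, citation counts and funding
bands for the invented tables would turn this toy into the estimate
described above.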

Obviously if *all* the articles in all disciplines suddenly became
open-access overnight, there would not be such a dramatic change in
rankings (though it would give some research a better fighting chance),
because all impact would simply be scaled up. (*Simply scaled up*! But
that in itself would represent a huge benefit to research progress and

But never mind that. We must appeal to our lower instincts, in trying
to persuade individual researchers and their institutions that open
access is in their interests. So the above data should be taken in
a first-come, first-served competitive spirit: Right now, it is
definitely not the case that *all* articles are open access. Almost all
are not. Nor is the transition happening overnight (as it could have done,
already a decade ago).
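
The contrast between the two scenarios (everyone open-access at once versus
one early adopter) can be sketched in a few lines of Python. Again, the
department names and citation counts are invented, and the 4.5 multiplier is
simply the Lawrence estimate applied uniformly, which is an assumption and
not a finding about any particular discipline.

```python
# Hypothetical citation counts; not real data.
baseline = {"Dept A": 1200, "Dept B": 900, "Dept C": 650,
            "Dept D": 400, "Dept E": 150}
FACTOR = 4.5  # assumed open-access citation multiplier


def ranking(counts):
    """Department names ordered from most-cited to least-cited."""
    return sorted(counts, key=counts.get, reverse=True)


# Scenario 1: every department's output becomes open-access overnight.
everyone_open = {dept: n * FACTOR for dept, n in baseline.items()}

# Scenario 2: only the lowest-ranked department self-archives.
early_adopter = dict(baseline)
early_adopter["Dept E"] = baseline["Dept E"] * FACTOR

print("baseline:     ", ranking(baseline))
print("everyone open:", ranking(everyone_open))   # same order as baseline
print("early adopter:", ranking(early_adopter))   # Dept E moves up
```

Multiplying every count by the same factor leaves the order untouched; only
a department that moves ahead of the others changes its relative position.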

So the incentive to self-archive comes from the fact that those
who do it *now* stand the best chance of changing the relative
research impact-ranking (and hence the research funding) in their
favor: and the study I've sketched would estimate by just how much.
A dimensionless picture of the size of the increment is already
visible in:

The RAE data are open-access, so anyone can do this study. But I will try
to persuade the Southampton team to do it, in order to provide ammunition
for those who are hard at work trying to inform university
administrators and research funders about the benefits to be expected
from mandating open-access provision for all their research output.

[A slight correction to David Spurrett's query about the correlation between
> "(a) the extent to which publications by authors at particular
> institutions cited research materials available through open access,
> with (b) their local institutional budget for expenditure on journals."
First, that's the wrong correlation. We've agreed it's not journal
budget expenditures that will persuade researchers to self-archive,
but research income. Second, we can already answer the question: That
correlation is zero, because the small volume of open access that exists
so far has not led to any toll-cancellations, in any discipline
(including Physics, where self-archiving and open-access are most advanced).
The correlation *might* change eventually, but that will not be a *cause* of
universal open access, but an *effect*: ]

    Lawrence, S. (2001) Free online availability substantially
    increases a paper's impact. Nature Web Debates.

    Kurtz, Michael J.; Eichhorn, Guenther; Accomazzi, Alberto;
    Grant, Carolyn S.; Demleitner, Markus; Murray, Stephen S.;
    Martimbeau, Nathalie; Elwell, Barbara. (submitted) The NASA
    Astrophysics Data System: Sociology, Bibliometrics, and Impact.

    the forthcoming Schwartz et al. study

    the work of Andrew Odlyzko:

    and Tim Brody's remarkable citebase usage and citation impact
    calculator, as well as his usage/citation impact correlator,
    which can predict later citation impact from earlier usage (download)
    impact using variable time-windows and ranges for the Physics ArXiv
    (you need the latest Java to be able to use it) at:

    Smith, Andrew, & Eysenck, Michael (2002) "The correlation
    between RAE ratings and citation counts in psychology," June 2002

    Oppenheim, Charles (1995) The correlation between citation counts and
    the 1992 Research Assessment Exercises ratings for British library and
    information science departments, Journal of Documentation, 51:18-27.

    Oppenheim, Charles (1997) The correlation between
    citation counts and the 1992 research assessment exercise
    ratings for British research in genetics, anatomy
    and archaeology, Journal of Documentation, 53:477-87.

    Holmes, Alison & Oppenheim, Charles (2001) Use of citation analysis
    to predict the outcome of the 2001 Research Assessment Exercise for
    Unit of Assessment (UoA) 61: Library and Information Management.

    Harnad, S., Carr, L., Brody, T. & Oppenheim, C. (2003) Mandated online
    RAE CVs Linked to University Eprint Archives: Improving the UK Research
    Assessment Exercise whilst making it cheaper and easier.

Stevan Harnad

