Re: Does the arXiv lead to higher citations and reduced publisher downloads?

From: Stevan Harnad <>
Date: Wed, 22 Mar 2006 20:14:48 EST

On Tue, 21 Mar 2006, Peter Banks wrote:

> [Re: Kristin Antelman's findings] I... suspect that there is a
> small OA citation advantage, I am not convinced by these
> data... I doubt that most of the results reach statistical
> significance...

Based on past postings from Peter, I think there may be an
element of wishful thinking here (ex officio)! Peter, if you are
not convinced by KA's data alone, look at all the other data that
show the same thing. For example, see Figure 4 in:

     Hajjem, C., Harnad, S. and Gingras, Y. (2005) Ten-Year
     Cross-Disciplinary Comparison of the Growth of Open Access and How
     it Increases Research Citation Impact. IEEE Data Engineering Bulletin
     28(4) pp. 39-47.

You will see that the ratio of the proportion of OA articles to
non-OA articles peaks in the 4-7 citation range, and falls off
for higher and lower citation (quality) ranges. But it is always
greater than one (i.e., an OA Advantage) except for articles with
zero citations (where the ratio reverses); that of course is also
the largest number of articles.
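
To make the ratio concrete, here is a minimal sketch of one way it
could be computed per citation range. It assumes the ratio means
(share of all OA articles falling in a range) divided by (share of
all non-OA articles falling in that range); that reading, the
function name oa_ratio_by_bin, the bin labels and the counts are
all illustrative assumptions, not the figures or the method of
Hajjem et al.

```python
def oa_ratio_by_bin(oa_counts, non_oa_counts):
    """For each citation bin, return
    (share of OA articles in the bin) / (share of non-OA articles in the bin).

    Both arguments map a bin label to an article count. Assumes every bin
    holds at least one non-OA article, so the division is defined.
    """
    oa_total = sum(oa_counts.values())
    non_oa_total = sum(non_oa_counts.values())
    ratios = {}
    for bin_label, oa_n in oa_counts.items():
        oa_share = oa_n / oa_total
        non_oa_share = non_oa_counts[bin_label] / non_oa_total
        ratios[bin_label] = oa_share / non_oa_share
    return ratios


# Invented counts, for illustration only; NOT taken from Hajjem et al.
oa_counts = {"0": 100, "1-3": 150, "4-7": 150, "8+": 100}
non_oa_counts = {"0": 500, "1-3": 250, "4-7": 150, "8+": 100}

for bin_label, ratio in oa_ratio_by_bin(oa_counts, non_oa_counts).items():
    # A ratio above 1 means OA articles are over-represented in that bin.
    print(f"{bin_label:>4}: {ratio:.2f}")
```

A ratio above one in a range corresponds to what is called an OA
Advantage above; a ratio below one corresponds to the reversal
described for the zero-citation range.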

But this effect is again just a correlation, and is just as
compatible with a Quality self-selection Bias (QB) as with a
Quality Advantage (QA). That said, it is hard to see why
self-selection QB should peak at the 4-7 range, whereas it is
perhaps less difficult to see how a QA could have an inverted
U-shape, absent for the duds and trivial for the gems. But this
awaits more confirmatory data and ways of testing causality more
directly.

> I also don't understand how these data exclude Phil's
> hypothesis. Since Kristin seems to define quality in terms of
> citations, then the logic seems self-referential: how would one
> detect a difference in citation due to intrinsic quality when
> one has defined quality as number of citations?

You're quite right, except that that argument cuts in both
directions: No data to date can decide directly between QA and

Received on Thu Mar 23 2006 - 02:01:29 GMT

