Re: Critique of EPS/RIN/RCUK/DTI "Evidence-Based Analysis of Data Concerning Scholarly Journal Publishing" from FrederickFriend on 2006-10-16 (American-Scientist-Open-Access-Forum)

From: FrederickFriend <ucylfjf_at_ucl.ac.uk>
Date: Mon, 16 Oct 2006 11:43:48 +0100

I agree with Stevan that if "Nuclear Physics B" is vulnerable to
cancellation it is not because of OA but because librarians think carefully
about high-cost journals when it comes to priorities within a restricted
budget. I also agree that Astronomy has features which make it untypical. It
is therefore disappointing that the authors of the EPS/RIN/RCUK/DTI study
use Michael Kurtz's valuable work to reach the following general conclusion:
"The little existing evidence suggests that a possible reason for increased
citation counts is..... that authors put their best work into OA format". My
personal experience is that if authors go OA they make as much of their work
OA as possible, not only the best work. The study authors were closer to the
mark when they wrote about citation levels: "This is an area in which much
research has been carried out, but most of it has been on specific subject
areas or titles, making it difficult to generalise". When it comes to
citations and OA, time will tell.

Fred Friend
JISC Scholarly Communication Consultant
Honorary Director Scholarly Communication UCL
E-mail ucylfjf_at_ucl.ac.uk

----- Original Message -----
From: "Stevan Harnad" <harnad_at_ECS.SOTON.AC.UK>
To: <AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM_at_LISTSERVER.SIGMAXI.ORG>
Sent: Saturday, October 14, 2006 2:03 PM
Subject: Re: Critique of EPS/RIN/RCUK/DTI "Evidence-Based Analysis of Data
Concerning Scholarly Journal Publishing"

> Dear Michael,
>
> Thanks for your (as always) very interesting and informative data! They
> show that:
>
> (1) In astronomy, where all active, publishing researchers already have
> online access to all relevant journal articles (a very special case!),
> researchers all use the versions "eprinted" (self-archived) in Arxiv
> first, because those are available first; and they all switch to using
> the journal version, instead of the self-archived one, as soon as the
> journal version is available.
>
> That is interesting, but hardly surprising, in view of the very special
> conditions of astronomy: If I only had access to a self-archived
> preprint or postprint first, I'd use that, faute de mieux. And as soon
> as the official journal version was accessible -- assuming that it's
> equally accessible -- I'd use that.
>
> But these conditions -- (i) open accessibility of the eprint before
> publication, (ii) in one longstanding central repository (Arxiv),
> for many and in some cases most papers, and (iii) open accessibility
> of the journal version of all papers upon publication -- are simply not
> representative of most other fields! In most other fields, (i') only
> about 15% of papers are available early as preprints or postprints,
> (ii') they are self-archived in distributed IRs and websites, not one
> central one (Arxiv), and (iii') the journal versions of many papers are
> not accessible at all to many of the researchers after publication.
>
> That's a very different ball game.
>
> (2) Your data showing that astronomy journals are not cancelled despite
> 100% OA are very interesting, but they too follow almost tautologically
> from (1): If virtually all researchers have access to the journal version,
> and virtually all of them prefer to use that rather than the eprint,
> it stands to reason that it is not being cancelled! (What is cause and
> what is effect there is another question -- i.e., whether preference is
> driving subscriptions or subscriptions are driving preference.)
>
> (3) In astronomy, there is a small, closed circle of core journals,
> and all active researchers worldwide already have access. In many
> fields there is not a closed circle of core journals, and/or not all
> researchers have access. Hence access to a small set of core journals
> is not a precondition for being an active researcher in many fields --
> which does not mean that lacking that access does not weaken the research
> (and that is the point!).
>
> (4) I agree completely that there is a component of self-selection
> Quality Bias (QB) in the correlation between self-archiving and
> citations. The question is (4a) how much of the higher citation count
> for self-archived articles is due to QB (as opposed to Early Advantage,
> Competitive Advantage, Quality Advantage, Usage Advantage, and Arxiv
> (Central) Bias)? And (4b) does self-selection QB itself have any
> causal consequences (or are authors doing it purely superstitiously,
> since it has no causal effects at all)? The effects of course need
> not be felt in citations; they could be felt in downloads (usage) or in
> other measures of impact (co-citations, influence on research
> direction, funding, fame, etc.).
>
> The most important thing to bear in mind is that it would be absurd to
> imagine that somehow OA guarantees a quality-blind linear increment to
> the usage of any article, regardless of its quality. It is virtually
> certain that OA will benefit the better articles more, because they are
> more worth using and trying to build upon, hence more handicapped by
> access-barriers (which *do* exist in fields other than astro). That's QA,
> not QB. No amount of accessibility will help unciteable papers get used
> and cited. And most papers are uncited, hence probably unciteable!
>
> (5) I think we agree that the basic challenge in assessing causality
> here is that we have a positive correlation (between proportion of papers
> self-archived and citation-counts) but we need to analyze the direction
> of the causation. The fact that higher citation-count papers tend to be
> self-archived more, and lower citation-count papers less is merely a
> restatement of the correlation, not a causal analysis of it: Their
> citation counts come *after* the self-archiving, not before!
>
> The only methodologically irreproachable way to test causality would be
> to randomly choose a (sufficiently large, diverse, and representative)
> sample of N papers at the time of acceptance for publication
> (postprints -- no previous preprint self-archiving) and randomly
> *impose* self-archiving on N/2 of them, and not on the other N/2. That
> way we have random selection and not self-selection. Then we count
> citations for about 2-3 years, for all the papers, and compare them.
>
> No one will do that study, but an approximation to it can be done
> (and we are doing it) by comparing (a) citation counts for papers that
> are self-archived in IRs that have a self-archiving mandate with (b)
> citation counts for papers in IRs without mandates and with (c) papers
> (in the same journal and year) that are not self-archived.
>
> Not a perfect method, problems with small Ns, short available
> time-windows, and admixtures of self-selection and imposed self-archiving
> even with mandates -- but an approximation nonetheless. And other
> metrics -- downloads, co-citations, hub/authority scores, endogamy
> scores, growth-rates, funding, etc. -- can be used to triangulate and
> disambiguate.
>
> Stay tuned.
>
> Now some comments:
>
> On Tue, 10 Oct 2006, Michael Kurtz wrote:
>
>> Dear Stevan and list,
>>
>> Recently Stevan has copied me on two sets of correspondence concerning
>> the OA citation advantage; I thought I would just briefly respond to
>> both.
>>
>> Besides our IPM article:
>> http://adsabs.harvard.edu/abs/2005IPM....41.1395K we have recently
>> published two short papers, both with graphs you might find interesting.
>>
>> The preprint will appear in Learned Publishing
>> http://adsabs.harvard.edu/abs/2006cs........9126H E-prints and Journal
>> Articles in Astronomy: a Productive Co-existence
>>
>> and this is in the J. Electronic Publishing
>> http://adsabs.harvard.edu/abs/2006JEPub...9....2H Effect of E-printing
>> on Citation Rates in Astronomy and Physics
>>
>> There is a point I would like to emphasize from these papers. Figure 2
>> of the Learned Publishing paper shows that the number of ADS users who
>> read the preprint version once the paper has been released drops to near
>> zero. This shows that essentially every astronomer has subscriptions to
>> the main journals, as ADS treats both the arXiv links and the links to
>> the journals equally; also it shows that astronomers prefer the journals.
>
> And it also shows how anomalous Astronomy is, compared to other fields,
> where it is certainly not true that every researcher has subscriptions
> to the main journals...
>
>> Figure 5 of the J Electronic Publishing paper also shows that there is
>> no effect of cost on the OA reads (and thus by extension citation)
>> differential. Note in the plot that there is no change in slope for the
>> obsolescence function of the reads (either of preprinted or
>> non-preprinted) at 36 months. At 36 months the 3-year moving wall
>> allows the papers to be accessed by everyone; this shows clearly that
>> there is no cost effect portion of the OA differential in astronomy.
>> This confirms the conclusion of my IPM article.
>
> And it underscores again, how unrepresentative astronomy is of research as
> a whole.
>
>> Now three comments:
>>
>> Citations are probably the least sensitive measure to see the effects of
>> OA. This is because one must be able to read the core journals in order
>> to write a paper which will be published by them. It is really not
>> possible for a person who has not been regularly reading journal
>> articles on, say, nuclear physics, to suddenly be able to write one, and
>> cite the OA articles which enabled that writing. It takes some time for
>> a body of authors who did not previously have access to form and write
>> acceptable papers.
>
> In astronomy -- where the core journals are few and a closed circle, and
> all active researchers have access to them. But this is not true of
> research as a whole, across disciplines (or around the world).
> Researchers in most fields are no doubt handicapped for having less than
> full access, but that does not prevent them from doing and publishing
> research altogether.
>
>> Any statistical analysis of the causal/bias distinction must take into
>> account the actual distribution of citations among articles. This is
>> why I made the Monte Carlo analysis in the IPM paper. As a quick
>> example for papers published in the Astrophysical Journal in 2003: The
>> most cited 10% have 39% of all citations, and are 96% in the arXiv; the
>> lowest cited 10% have 0.7% of all citations and are 29% in the arXiv.
>> Showing the causal hypothesis is true will be very difficult under these
>> conditions.
>
> (1) Since all of the published postprints in all these journals
> are accessible to all research-active astronomers as of their date of
> publication, we are of necessity speaking here mostly about an Early
> Access effect (preprints). Most of the other components of the Open Access
> Advantage (Competitive Advantage, Usage Advantage, Quality Advantage)
> are minimized here by the fact that everything in astronomy is OA from
> the date of publication onward. The remaining components are either
> Arxiv-specific (the Arxiv Bias -- the tradition of archiving and hence
> searching in one central repository) or self-selection [Quality Bias]
> influencing who does and does not self-archive *early*, with their
> prepublication preprint.
>
> Since most fields don't post preprints at all, this comparison is mostly
> moot. For most fields, the question about citation advantage concerns
> the postprint only, and as of the date of acceptance for publication,
> not before.
>
> (2) In other fields too, there is the same correlation between citation
> counts and percentage self-archived, but it is based on postprints,
> self-archived at publication, not pre-refereeing preprints self-archived
> much earlier. And, most important, it is not true in these fields that
> the postprint is accessible to all researchers via subscription: Many
> potential users cannot access the article at all if it is not
> self-archived -- and that is the main basis for the OA advantage.
>
>> Perhaps the journal which is most sensitive to cancellations due to OA
>> archiving is Nuclear Physics B; it is 100% in arXiv, and is very
>> expensive. I have several times seen librarians say that they would
>> like to cancel it. One effect of OA on Nuclear Physics B is that its
>> impact factor (as we measure it, I assume ISI gets the same thing) has
>> gone up, just as we show in the J E Pub paper for Physical Review D.
>> Whether Nuclear Physics B has been cancelled more than Nuclear Physics A
>> or Physics Letters B must be well known at Elsevier.
>
> It is an interesting question whether NPB is being cancelled, but if
> it is, it clearly is not because of self-archiving, nor because of
> astronomy's special "universal paid OA" access to the published version: if
> NPB is being cancelled, it is for the usual reason, which is that it is
> not good enough to justify its share of the institution's journal budget.
>
> Chrs, Stevan
>
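
The randomised design in point (5) of the quoted message can be illustrated
with a small simulation. This is only a sketch under invented assumptions:
the function name simulate, the uniform "quality" variable, the factor of 10
and the oa_boost parameter are all hypothetical and are not taken from any of
the studies discussed. It shows that self-selection alone (better papers
archived more often) can yield a citation gap even when archiving has no
causal effect, whereas imposing archiving by coin flip isolates whatever
causal effect exists.

```python
import random
from statistics import mean


def simulate(n_papers=20000, oa_boost=0.0, self_select=True, seed=0):
    """Return (mean citations of archived papers, mean of non-archived papers).

    All modelling choices here are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)
    archived, not_archived = [], []
    for _ in range(n_papers):
        quality = rng.random()  # hypothetical latent quality in [0, 1)
        if self_select:
            # Self-selection: the better the paper, the likelier it is archived.
            is_archived = rng.random() < quality
        else:
            # Imposed archiving: a coin flip, independent of quality.
            is_archived = rng.random() < 0.5
        # Multiplicative benefit: better papers gain more in absolute terms.
        boost = oa_boost if is_archived else 0.0
        citations = 10.0 * quality * (1.0 + boost)
        (archived if is_archived else not_archived).append(citations)
    return mean(archived), mean(not_archived)


def advantage_ratio(**kwargs):
    """Ratio of mean citations, archived over non-archived."""
    oa, non_oa = simulate(**kwargs)
    return oa / non_oa


if __name__ == "__main__":
    print("self-selected, no causal effect:", advantage_ratio(self_select=True))
    print("randomised, no causal effect:   ", advantage_ratio(self_select=False))
    print("randomised, 50% causal boost:   ",
          advantage_ratio(self_select=False, oa_boost=0.5))
```

In this toy model the self-selected comparison shows a large ratio with no
causal effect at all, while the randomised comparison stays near 1 unless
oa_boost is non-zero. The mandate-based comparison described in the message
is an approximation to the randomised case, not an instance of it.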
Received on Mon Oct 16 2006 - 12:09:46 BST