From: David Goodman <dgoodman_at_PHOENIX.PRINCETON.EDU>
Date: Fri, 25 May 2001 17:06:28 -0400

The NUMBER of readings an article gets does not appear to me to be
information in need of protection. One publishes an article to get it
read. The number of CITATIONS to an article is public knowledge; it just
has to be gathered.

WHO READS an article is the information that needs to be protected. Within
a university we are concerned with protecting knowledge of which
individual reads it, and this university has decided by policy not to
collect or attempt to collect such data, or permit publishers to do so.
We do not collect data on access by IP addresses or even ranges
at either an article or journal level. Such data might provide us useful
insights into collection development needs, but we have decided to forgo
this knowledge as unsafe to collect.

In commercial enterprises, such as pharmaceutical companies, I understand
there is also interest in preventing the collection of information on what
the company as a whole is accessing, because of the possibility of
industrial espionage, and they therefore appropriately use elaborate
precautions to prevent this.

I personally am almost fanatically concerned about intellectual
privacy, but I still see no harm in knowing what articles are accessed by
the world in general. As for the analogy with a physical library, if you
go into a library you will have no trouble determining by physical
inspection what books and journals are most used, and even what journal
articles are read particularly heavily.

 David Goodman, Princeton University Biology Library 609-258-3235

On Fri, 25 May 2001, Jim Till wrote:

> On Thu, 24 May 2001, Tim Brody wrote (about my proposed 2nd criterion for
> evaluation of an eprint archive, which was: 2) its suitability for
> yielding citation data [an 'impact-ranking' criterion?]):
> [tb]> One might also add the facility to export "hit" data, as an
> [tb]> alternative criterion (or any other raw statistical data?).
> What kind of raw statistical data might be most useful, in the future, for
> 'impact-ranking'?
> At the arXiv archive, one section of the FAQ section (under Miscellaneous)
> addresses the question: "Why don't you release statistics about paper
> retrieval?". (See:
> The short answer provided is: "Such 'statistics' are difficult to assess
> for a variety of reasons". The longer answer also includes the comments
> that:
> "It could be argued perhaps correctly that statistics may provide some
> useful information at least on the relative popularity of submissions,
> since the distributed access and other factors may be subsumable into some
> overall scale factor. But even this information is ambiguous in many
> cases, and publicizing, even when accurate, could merely accentuate
> faddishness in fields already excessively faddish".
> And,
> "Most significantly, however, there is a strong philosophic reason for not
> publicizing (or even saving) these statistics. When one browses in a
> library it is very important (in fact legislated) that big brother is not
> watching through a camera mounted on the wall; for the benefit of readers
> it is very important to maintain in every way possible this sense of
> freedom from monitoring in the electronic realm".
> Thought-provoking comments?
> Jim Till
> University of Toronto
