Re: Journal Article Royalties: Reanimating the "Faustian Bargain"

From: Jim Muckerheide
Date: Mon, 20 Sep 1999 20:39:04 +0100

Thank you, Andrew. This addresses the "privacy" issue to some extent (along with
dynamic IPs?). As a footnote, I have understood that some web page links have a
more 'active' gleaning of linked data than the 'archived data' in this description
of the present records of LANL and similar servers (e.g., I've had a web page give
me back my email address in 'real time' despite my dynamic IP). In that case, such
a 'solution' could provide more tools, along with strong policies, to enable both
privacy and security.

But the central question to me was: Is there potential value-added to authors from
enhanced "services"? E.g., an option for downloaders to be identified to
the authors, and/or to be included in subsequent discussion or follow-up messages
from the authors? (This could even be 'hidden', as a participant may select in a
mailing list, subject to posting to the broadcast list.) Or an enhanced
capability for a paper to be linked in "cites to" and "cited by" that would
stimulate self-archiving?

If these and other value-added characteristics are "enhanced" in self-archive
lists, they may provide additional impetus to authors to prefer to self-archive,
as advocated by Harnad and others.

Regards, Jim Muckerheide

Andrew Odlyzko wrote:

> A quick response to the messages from Jim Muckerheide and Fytton Rowland:
> All servers that I am aware of do maintain a record of download addresses.
> This does present serious privacy issues, and as a result there are very
> few servers that make their logs widely available.
> To answer another part of the question, server logs would be of very limited
> use in producing "discussion lists" and the like. The reason is that these
> logs are not as informative as one would like for such purposes (which is
> a relief to many privacy advocates and a hindrance to direct marketers and
> the like). What server logs do is record the IP address of the machine
> that requested a page, and this address looks like One
> can then use "reverse DNS lookup" to try to find out what machine that is.
> Here is where the serious problems start. Quite a few such lookups fail,
> and no information is generated about the IP address. (One can then try
> to do other things, such as examine registries of autonomous systems, etc.,
> but even that is of limited use, and let's skip it.) When the lookup
> succeeds, you get information that varies in its utility. Some of the
> addresses will be of the form
> which suggests the request came from John Smith's PC in the Harvard Physics
> Dept. (But even that is not certain, since this PC may have been passed on
> to a student of John Smith.) Others, such as
> will tell you the request came from a dial-in customer of the AT&T WorldNet
> ISP business, and that the modem bank is located in Cambridge, Mass.
> It won't tell you who was using that PC, though. (For that you would need
> to access the WorldNet logs, which are carefully guarded for privacy reasons.)
> The next time you see that address, a different person might be using it.
> Next, many requests come from addresses that look like
> which are proxies that hide any number of users behind them. None of these
> entries produce valid email addresses.
> One of the complications in studying server logs is that you can never be
> certain you have seen all accesses to a page. For example, if many people
> are going through to access your pages, this proxy
> will almost certainly cache (store a local copy) at least some of those
> pages, and then deliver them to requesters without leaving any trace
> on your server.
> All these technical difficulties make it hard to evaluate usage in a
> meaningful way.
> Andrew Odlyzko
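
For readers who want to see concretely what the reverse DNS step described
above looks like, here is a minimal Python sketch. It is only an illustration
under stated assumptions: it assumes the server writes Common Log Format lines
(client address as the first field), the function names are made up for this
example, and the addresses and hostnames in the usage note are hypothetical
placeholders, not entries from any real log. Even when a lookup succeeds, the
result names a machine, dial-in port, or proxy, not a person or an email
address.

```python
import socket
from collections import Counter


def client_ip(log_line):
    """Return the first whitespace-separated field of a log line.

    Assumption: the log is in Common Log Format, where that field is the
    address of the requesting machine.
    """
    return log_line.split(maxsplit=1)[0]


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the hostname registered for ip, or None if the lookup fails.

    Failed lookups are expected and common, so they are not treated as errors.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None


def summarize(log_lines, resolver=socket.gethostbyaddr):
    """Count requests per resolved hostname; failed lookups are counted under None."""
    counts = Counter()
    for line in log_lines:
        if line.strip():  # skip blank lines
            counts[reverse_lookup(client_ip(line), resolver)] += 1
    return counts
```

Calling summarize() on the lines of an access log returns a Counter keyed by
hostname, with unresolved addresses grouped under None. The resolver argument
exists only so the sketch can be tried offline with a stand-in lookup function.
Note that the counts cover only requests that reached the server: pages served
from a proxy's cache never appear in the log at all.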
