Date: Mon, 22 Nov 2004 08:23:52 +0000

From: Jan Velterop
Date: Sat, 20 Nov 2004 21:22:35 +0000

On 19 Nov 2004, at 21:26, Leslie Chan wrote:

> I imagine it would be relatively simple for Google to add a filter or
> option to display only OA articles in the search results. That would
> really tip the scale.

Perhaps clearly labelling what's OA would be better? The contrast would
be clearer.

From: "David Goodman"
Date: Sat, 20 Nov 2004 18:51:44 -0500

I suspect they will have a hard time distinguishing between the sites
offering a full article and the ones containing only abstracts.
They may also have a difficult time distinguishing refereed preprints
(generally considered OA) from yet-unrefereed manuscripts, and from
material not intended to be refereed, such as reports.

They will similarly have a hard time keeping up with the material at
the publishers' sites that isn't OA now but will be in a few months,
or with the material in journals with partially OA content.

Not that I think any of these is impossible, but I doubt there are
reliable algorithms yet. I suspect that the approach using the true OA
indexes, or at least collaborating with them, will prove the best. It
will be easier to match material known to be OA.

It is not my intention to discourage anyone, but just to mention that
the problem of distinguishing the actual content of a site from its
metadata alone is non-trivial, and so is the problem of analyzing the
actual content and deciphering what it really is.

There are many other such specialized search engines besides Google
Academic; some, like Scirus, are well beyond the beta stage. I am not
aware of any that do distinguish OA items, but the same possibilities
apply to them all.

From: Imre Simon
Date: Sun, 21 Nov 2004 19:17:51 -0200

Hi everybody,

I am just wondering if someone could tell me when a paper resident in
an OAI-compliant repository will be included in google scholar, with an
explicit pointer to the paper in the compliant repository?

As far as I could discover empirically, scholar does not harvest from
citeseer. Is this correct?

I also found that two papers in a (now defunct) eprints server met
different fates in scholar: one of them is included while the
other one is not. What does it take for a self-archived paper to be
included in google scholar?

Further, I also found that scholar includes some papers it finds on the
net while it does not include other papers found on the same site.
Again, what does it take for a paper found on the net (and present in
the main google archives) to be included in google scholar?

Since previous discussions on this list seemed to assume silently that
every self-archived paper is included in scholar, I thought it would be
interesting to discover, with some precision, to what extent this
hypothesis is correct.

I did not find precise statements on Google clarifying any of the above
questions. I am copying <> on this message.

Cheers, Imre Simon
