> I think that a number of service providers (for instance libraries at
> research centers or universities) would like to harvest exactly the
> content of interest for their researchers and students -- for instance
> pre/postprints and dissertations in economics but not thesis in economics
> etc... Thus the OpenDOAR is not primarily a service for end users but
> more so for service providers.

Two questions arise:

(1) Will it not make more sense for the repositories themselves to expose the
metadata that allow this selective harvesting? (This is not to say that a secondary
service, like OpenDOAR, that hand-classifies or hand-checks classification, is not
an invaluable help at this stage!)

(2) This is all predicated on the analogy of today's local library
collections: Is that likely to continue for much longer, in the DYO online
age? Will searchers not want to search OA sources directly, webwide?

For that, one needs two things: (i) the primary content-providers, the
institutional OA repositories, with their contents well tagged by document
type (preprint, postprint, thesis, book chapter, etc.), subject matter
(physics, maths, botany, etc.), and journal name (in the case of
published journal articles).

In addition one needs: (ii) the secondary service-providers (like the OAI
harvesters and even the non-OAI harvesters such as CiteSeer and Google
Scholar) that harvest ("collect") the primary OA material and re-present
it in various re-classified and value-added ways for individual users to
search on and use directly. The critical point is that global harvesting
services are not local libraries, creating collections for their local
users. That notion, I believe, is obsolete. (At best, one has temporary
local harvests, for a particular course, perhaps: a reading list consisting
of a set of links.)
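
To make the division of labour in (i) and (ii) concrete, here is a minimal
sketch of how a harvester might ask a repository for only one tagged slice
of its contents. It assumes the repository speaks OAI-PMH and exposes its
subject/type tagging as OAI sets. The base URL and the set name
"economics:postprint" are hypothetical, invented for illustration; real
repositories define their own set names.

```python
from urllib.parse import urlencode


def selective_harvest_url(base_url, set_spec, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request restricted to one set.

    `verb`, `metadataPrefix`, `set` and `from` are standard OAI-PMH
    request arguments; the values passed in below are assumptions.
    """
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    }
    if from_date:
        # Optional: only records added or changed since this date.
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint and set name.
url = selective_harvest_url("https://repository.example.org/oai", "economics:postprint")
print(url)
```

The selection happens at the repository, on the basis of its own tagging;
the harvester only has to name the slice it wants.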

> [W]e could include only a proportion of the sites in ROAR -- a number of
> the sites in ROAR are not original material, there are journals, dead
> URLs etc... OAIster as well is a valuable resource, but even OAIster
> contains sources of various kinds.

Dead URLs are of course useless and should be culled. But again the question
is who the users of such a service are, and what they want and need.

I think harvesters need the primary contents of the IRs, searchable by
content-type, etc., and that is what the OAI service-providers provide. To
a first approximation, the primary content of OA space consists of (a)
institutions' own preprint, postprint and thesis output (green) plus
(b) OA journal output (gold). That is what the OAI service-providers
would be interested in harvesting and making searchable for individual
users (and perhaps other harvesters) along subject and content-type
lines. (Leave the fine-tuning to full-text boolean search!)
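
As a rough illustration of searching "along subject and content-type
lines", the sketch below filters already-harvested records on two coarse
tags. The Record fields and the sample entries are made up for the example
and do not reflect any particular harvester's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical minimal metadata for one harvested item.
    title: str
    doc_type: str   # e.g. "preprint", "postprint", "thesis"
    subject: str    # e.g. "economics", "physics"


def select(records, doc_types, subjects):
    """Keep only records whose type and subject are both wanted."""
    return [r for r in records if r.doc_type in doc_types and r.subject in subjects]


# Made-up sample harvest.
harvest = [
    Record("A", "postprint", "economics"),
    Record("B", "thesis", "economics"),
    Record("C", "preprint", "physics"),
    Record("D", "preprint", "economics"),
]

wanted = select(harvest, {"preprint", "postprint"}, {"economics"})
print([r.title for r in wanted])
```

This only works if the repositories have tagged their contents in the
first place; anything finer-grained is left to full-text search.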

> [OpenDOAR does] not include for instance journals as such, neither
> do we include data from the Directory of Open Access Journals
> ( DOAJ [too] is not primarily an end user
> service... [but] a service... for service providers.

More's the pity! It seems obvious that for individual searchers (and even for
harvesters) the contents of ROAR, DOAR, DOAJ, OAIster (and even the Romeo
Directory of "green" journals) should be combined and integrated:

> So: Green and Gold go hand in hand!

Indeed, and byte by byte, in the same OAI service.

Stevan Harnad
