> Archive programmes gain momentum
> Research Information, October/November 2005
> Nadya Anscombe

> Sally Morris (ALPSP)
> 'But imagine if all these individual articles (albeit
> not necessarily final versions) were linked up through networked
> institutional repositories. It could happen that the majority of
> papers from a particular journal become available for any researcher
> to find. This could lead cash-strapped libraries to stop buying that
> journal, which would make it no longer viable.'

What I imagine when I think of cash-strapped librarians is the journals
they cannot afford (i.e. most of them) for their institution's users,
hence the research results their researchers cannot access and use, hence
all the research impact and progress thereby lost. Sally thinks only
that cash-strapped librarians might cancel some journals. (They always
do: they must; that's what "cash-strapped" means; it means not having
anywhere near enough cash to buy access to all, most, many, or even
enough journals to meet all of their users' potential needs.) Researchers
(both the producers and the users of the research) can perhaps be forgiven
for thinking instead about all the research access and impact that Sally
seems quite happy to see continue to be "strapped down", because, after
all, the thing that *really* matters is the cash-flow to publishers,
not the research access/impact-flow to researchers. Let access-strapped
researchers eat cake, or go hungry. (Yet is that really what research
is about, and for? Is the public funding research, and are researchers
conducting it, in order to ensure the cash flows from cash-strapped
librarians to publishers' bottom-lines?)

> Sally Morris: 'If readers access journals through
> repositories this usage will not show up in the data. The librarian
> might therefore decide to cancel a subscription even though the same
> numbers of people still access the journal.'

Cancel one journal in order to be able to afford another one. That is
what cash-strapped librarians -- with finite journal acquisitions budgets
nowhere approaching the capacity to subscribe to them all -- are always
doing. But a self-archiving mandate has *no differential effect on that*,
one way or the other! The mandate applies randomly, to the funded articles
in *all* journals. And all it does is provide supplementary access to
the author's draft for the access-denied would-be users, allowing the
cash-strapped librarians to do their selecting just as before, but without
the pressure of elemental necessity: The supplementary author's versions
are a safety net: "Even if I have to cancel this journal in order to be
able to afford that journal, at least I know that the users that really
need access will have access to the author's self-archived draft."

Unless Sally really believes that journals add *no* value worth paying
for, this scenario does not spell the end of journals; just the end of
needless access-denial and impact-loss for researchers, and the end of
desperation and inordinate stress on librarians. Their finite acquisitions
budget will still be spent, selectively, as it always was; but
researchers will no longer be deprived of the bare necessities. And
research will no longer lose 50%-250% of its potential impact.

> And it is not just the publishers who are affected. Institutional
> repositories could also be seen as a threat to traditional libraries.

A threat to libraries? Is the peer-to-peer provision of peer-reviewed
research by its authors, in order to be used and built upon, as intended,
now to be accorded a lower priority not only than the protection of
publishers' cash flows, but also than the protection of librarians'
traditional practices? Why is research, done by and for researchers,
being weighed as a means toward these other ends, rather than as the end
in itself that it was meant to be -- an end that publishers and librarians
were meant to help serve, not obstruct?

> Author apathy and reluctance to change seem to be the biggest
> challenges, even for the most successful of repositories. Leo
> Waaijers... told Research Information:
> 'Our biggest problem is convincing academics that they have something
> worth preserving and that an institutional repository is the place
> to do it.'

It might have been more useful to try to convince academics that they
have something worth maximising access to, so as to maximise its research
impact, rather than for the sake of "preserving" it. Preservation is not
the pressing problem facing research today. Needless access-denial and
resultant impact-loss is. Preservation is a librarian's problem, and it
should be addressed to the journals they are using their strapped cash
to subscribe to, not to the institutional researchers they are trying
to persuade to provide supplementary authors' drafts, for those users
at *other* institutions, who cannot afford access to the journal they
happen to appear in. (Librarians are clearly a blessing in the open
access and institutional repository movement, but they are a decidedly
mixed blessing, often regressing, like the dipsomaniac and the lamp-post,
to what they find familiar, even when it is obsolescent or irrelevant.)

> Peter Morgan, project director for DSpace@Cambridge... says 'Many
> people underestimate the cost of an institutional repository. It can
> be set up very cheaply with open-source software, so no-one should
> be able to say they can't get started, but as soon as you want to
> develop the system, provide support and store different kinds of data,
> hardware and personnel costs start to rise. Here, we need at least
> two people on the technical side and one librarian. Any institution
> that wants to start a repository should seriously consider the
> long-term costs.'

It all depends what you want the repository for. If it is for digital
preservation, digital asset management, course-ware, e-publishing, etc.,
it will get pricey. But if it is for providing access to the
self-archived authors' drafts of institutional research output (the
pressing and primary problem, remember?) then creating and maintaining
the archive is cheap, and all that's missing is a policy that mandates
its filling. (Please inquire about the actual costs of the University of
Southampton's ECS archive.)

> 'Most academics are uneasy about copyright issues and believe that
> in order to succeed in the annual Research Assessment Exercise they
> must publish in the best journals,' says Morgan.

Ninety percent of journals have already given their blessing to author
self-archiving. And it is enough to deposit the metadata (author, title,
journal) and the full-texts, making the metadata open-access and the
full-texts merely harvestable and invertible for searching by Google and
the like. Eprint requests can be emailed to the author, who can then email
the eprint to the would-be user -- until he tires of all the keystrokes
and makes the full-text open access too.

The RAE is relevant inasmuch as maximising impact by self-archiving will
give those with a head-start a competitive advantage, but it has nothing
to do with journal choice: *All* articles in *all* journals need to be
self-archived (and that is what Research Councils UK look poised to
mandate at last, for the sake of the UK's competitive advantage, but also
for the sake of the absolute advantage of research itself).

> [Morgan] hopes that
> the recent RCUK position statement will help to change this but
> believes institutional repositories have a larger role to play
> than just publishing academic work.

Repositories don't *publish* academic work! Researchers publish their
work in peer-reviewed *journals*. They self-archive their
authors' drafts in their institutional repositories in order to
supplement the access, usage and impact -- that currently comes only from
those users whose institutions can afford the access to the official
journal version -- with the access, usage and impact of those who
cannot: 50%-250% more usage and impact, according to our data.

> 'Some academics believe that if
> their work is stored on the department computer, there is no need
> to worry about it. There is widespread ignorance about data loss,'
> says Morgan. 'Data has to be managed properly. We believe strongly
> in long-term data preservation and we migrate files into new file
> formats regularly, preserving them for future use.'

Preservation again. Fine as (yet another) potential use for institutional
repositories, but it has *nothing* to do with the access/impact problem
that the open-access/self-archiving movement is all about (remember?).

> It is in this area of data preservation that institutional
> repositories really show their worth. By storing files in a managed
> repository researchers will be ensuring that their work can be read
> by future generations, for free. Just like in a library.

The access/impact problem, however, concerns not future generations but
the current one, which continues to be access-denied as we fret
irrelevantly about data-migration and future generations. Besides, those
worries should first be directed at the official journal version that the
cash-strapped library paid for, not at the home-brew that librarians are
trying -- unsuccessfully, with these irrelevant preservation incentives --
to persuade their own authors to provide.

> The right tools for the job
>
> There is now a range of adequate,
> easily-available software for creating and maintaining institutional
> repositories. Some are commercial packages but many others are
> available free under open-source licences. The two leading software
> packages are DSpace (MIT, US) and EPrints (Southampton, UK) but
> there are plenty of others to choose from, including the following:

The problem is not choosing software, it is filling archives!

    "EPrints, DSpace or ESpace?"

> DSpace: developed by MIT, US and suitable for data preservation
> applications as well as storing academic papers and experimental
> data
>
> EPrints: developed by Southampton University,
> UK for the managing of academic research papers

Both packages are equally suitable for all of these applications. But
the pressing (and neglected) problem is none of these: it is the problem
of adopting a policy to ensure that the archives will be filled with
their target content (the institution's own research article output).
That is what we need policies like the RCUK's proposed one for.

Stevan Harnad
Moderator, American Scientist Open Access Forum

Chaire de recherche du Canada
Centre de neuroscience de la cognition (CNC)
Université du Québec à Montréal
Montréal, Québec, Canada H3C 3P8

Professor of Cognitive Science
Department of Electronics and Computer Science
University of Southampton
Highfield, Southampton
