Minh Ha Duong provides some valuable risk-assessment supplements
to Arthur Sale's excellent list:

> - Choosing a repository to manage one's life
> production is a decision with high emotional stake
> - There are irreversibilities with putting anything on the net
> - The probabilities are unknown

These are all relevant considerations for large-P Preservation
Archiving, but irrelevant to OA self-archiving, which is merely
a supplement to the original article (published and Preserved by
publishers and libraries), self-archived in order to maximize access and
impact, not in order to large-P Preserve one's life production. For OA
self-archiving, small-p preservation -- available with all the major OA
software and serious institutional OA repositories -- is enough.
Supplements don't need Preservation, just preservation. The originals
are the ones that need the Preservation.

> Here are several risks associated with Institutional Repositories that
> are well worth mitigating :
> - Technical and disaster risk can arise if IR are physically
> centralized at the institution.
> This need to be mitigated by mirroring the archive using
> a different software, in a datacenter far away from the primary.

Yes, LOCKSS is excellent practice.

> - One political risk is acceptability : big brother aversion could
> lead a significant fraction of researchers to reject an IR as a whole.

If this is speculation about whether or not researchers would comply
with an institutional self-archiving mandate, two international JISC
surveys found that 95% of authors would comply (81% willingly, 14%
reluctantly) and the four institutions with a mandate (QUT, Minho,
Southampton ECS and CERN) all have good compliance rates, healthily climbing
toward 100%.

So evidence is preferable to a-priori speculation here (especially since
self-archiving is done in the researcher's self-interest, with palpable
impact effects on funding and salary).

> Network effects imply that this minority behaviour is a risk for
> the community as a whole,
> as it deprecates the integrity, hence the value of the IR .

100% coverage is always desirable, and a mandate is the best way to
achieve it (along with library activism and support), but within an
institution, the competitive advantage of OA, explicit usage/impact
stats like Arthur Sale's, and explicit showcasing, as in DARE's
Cream of Science, will soon raise compliance toward 100% on the
strength of self-interest (and unflattering impact comparisons) alone.

> The measure to alleviate this would be to harvest metadata in
> whatever repositories the researchers do archive in.

The problem is not *where* researchers self-archive, but that only 15%
self-archive spontaneously. (But you are quite right: CERN, QUT and
others have successfully added the systematic harvesting of existing,
distributed self-archived items into the institution's IR to supplement
direct deposits.)
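
For anyone who wants to see what such harvesting amounts to in practice,
here is a minimal Python sketch of an OAI-PMH harvest of Dublin Core
(oai_dc) metadata. It is only an illustration, not how CERN or QUT
actually do it: the base URL repository.example.org, the function names
and the sample record are all made up for the example. The only things
taken from the protocol itself are the ListRecords verb, the
metadataPrefix argument and the two XML namespaces.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL (hypothetical helper name)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifier, title and creators from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter("{%s}record" % OAI_NS):
        ident = rec.findtext("{%s}header/{%s}identifier" % (OAI_NS, OAI_NS))
        titles = [e.text for e in rec.iter("{%s}title" % DC_NS)]
        creators = [e.text for e in rec.iter("{%s}creator" % DC_NS)]
        records.append({
            "identifier": ident,
            "title": titles[0] if titles else None,
            "creators": creators,
        })
    return records


# Made-up response, standing in for what a remote archive would return.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:42</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Article</dc:title>
          <dc:creator>Author, A.</dc:creator>
          <dc:creator>Author, B.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    print(build_list_records_url("http://repository.example.org/oai"))
    for record in parse_records(SAMPLE):
        print(record)
```

In a real harvest the response would be fetched over HTTP from the URL
built above, and a resumptionToken would have to be followed for long
lists; both are left out to keep the sketch short.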

> The immediate difficulty is that these repositories are not
> likely to contain all the metadata the IR tracks.

That problem is trivial. Once you have the full-text, the requisite
OAI metadata are just a few more keystrokes.

> However, there are example of systems that harvest first,
> and allow to add metadata at a later time,
> usually in a decentralized contributive way.

Indeed! A good strategy.

> See e.g. CiteSEER in Computing Science, or RepEc in Economics.

Those are central archives -- very useful -- but we are of course
talking about an institution's own IR for its own research output.

> The latter even manages the relation between authors and labs
> and institutions.

Because RePEc harvests from them. But an institution of course
needs to manage its own metadata and its own contents.

> - Another political risk is monetization. What guarantee do the
> researchers have that the IR remains OA ?

What guarantee do they need? And is there any point in continuing to lose
daily, weekly, yearly impact, needlessly waiting for guarantees, when
what is needed is immediate self-archiving? (Should we worry about the
perennity of our institutions too?)

> Given the financial pressures on the research institutions as a whole,
> if the repository is asked to recoup its costs, it could switch to a
> pay per article distribution mode.

This risk borders on the self-contradictory:

If research institutions want to ease their financial pressures they
need to increase the impact of their research, not join in with
publishers (who do need to recover their costs) by blocking access (and
impact) to potential users/citers of their own research with gate-tolls.

The costs of an OA IR itself are so small as to make it border on the
absurd to suggest denying access to would-be users unless they pay an
access toll.

> A mirror of the archive which is not under government control
> would mitigate that.

Twinning and mirroring (as well as LOCKSS and harvesting) will take care
of all this quite naturally -- once there is enough content to be worth
twinning, mirroring, etc. For the current spontaneous 15% level it is
hardly worth losing time on...

> - The country-specific dimension of risks could be mitigated
> by mirroring the archive in a different country,
> or even best in a different cultural zone (continent).

Good practice. (Obvious, once we have the precious content; merely a
medicine for preemptive Zeno's Paralysis right now...)

> I guess that my conclusion is that commingling Institutional
> Repositories with the Internet Archive and siblings
> would increase their attractiveness for researchers,
> and hence the rate of deposits we seek to maximize on this list.

And that already comes with the web territory, in this OAI/Google age!

Stevan Harnad

