Re: preservation vs. Preservation (fwd)

From: Joanna Barwick <J.P.L.Barwick_at_LBORO.AC.UK>
Date: Tue, 7 Mar 2006 14:20:22 -0000

As Institutional Repository Manager at Charles' institution, could I point
to the correct URL for our repository: !

and also confirm that the choice of PDF files was a conscious one with a
view towards long-term preservation (or should that be Preservation) and

As a repository still in its infancy (we are still in our pilot stage having
only come to life in July 2005), these discussions are invaluable to us and
I appreciate Les's comments.

Joanna Barwick
Support Services Librarian (IR Manager)
Loughborough University

From: "Stevan Harnad" <harnad_at_ECS.SOTON.AC.UK>
Sent: Tuesday, March 07, 2006 1:55 PM
Subject: Re: preservation vs. Preservation (fwd)

>I am forwarding Les Carr's wise and witty reply to Charles Oppenheim.
> It was posted to JISC-REPOSITORIES but the thread is also playing out
> in part on the AmSci Forum. Les is the head of the JISC PRESERV
> Project -- SH
> [Ceterum censeo, in the special case of OA self-archiving we are talking
> about *supplementary author drafts*, not the original, published articles
> themselves (whose Preservation Problem is in the hands of their Publishers
> and Subscribers, not their authors and their institutions), and, more
> important, we are talking about *why* authors should/would bother to
> self-archive such OA supplements at all.]
> Date: Tue, 7 Mar 2006 09:18:03 +0000
> From: Leslie Carr <>
> Subject: Re: preservation vs. Preservation
On 6 Mar 2006, at 20:02, Charles Oppenheim wrote:
>> Stevan, I self-archive to ensure my articles are widely read now
>> AND WILL BE FOR THE FORESEEABLE FUTURE. The former without the latter is
>> just nonsensical.
> [[ Executive Summary: Charles' requirements seem already to be
> satisfied by his self-archiving and by some common repository
> management practice. But that only applies to the articles that he
> has self-archived. ]]
> The truly foreseeable future is a very short timeframe indeed, but it
> is interesting that you chose that phrase against "in perpetuity" or
> "for the long term" which seem to be used in this context as a vague
> indicator of some unspecified future time more distant than the
> collapse of Western civilisation (by analogy with the Library of
> Alexandria). It would be very interesting to see some thoughtful
> descriptions of reasonable timeframes for accessibility and hence the
> preservation processes that are likely to come into play.
> A contributor indicated below that he was worried that material in
> a repository might become inaccessible after 24 hours or 3 years. I
> would guess that the former timescale would be most likely due
> to service instability (our data centre just burned down) and the
> latter to institutional instability (we've changed our minds about a
> repository). It is also possible that 3-year inaccessibility could be
> the result of 'format inaccessibility' if the format was made up by
> the researcher (or project) responsible for a piece of data (I can't
> remember how to interpret the contents of this file OR the person who
> created the file has now left the institution).
> On reasonable timescales (ie greater than 3 years) I would have to
> cite in evidence. The earliest papers there (now 16 years
> old) are still accessible (see
> 9204 ).
> Other timeframes for consideration might be as follows:
> a) the download lifespan of your article which elapses once your
> document hasn't been read for a whole year (or decade).
> b) the citation lifespan of your article which elapses once your
> document hasn't been cited for a whole year (or decade).
> c) the relevance lifespan of your article (perhaps as defined by you,
> the author)
> d) your career lifespan which elapses once you retire
> e) your lifespan (I needn't elaborate)
> f) your institution's lifespan (but surely institutions are created,
> not destroyed?)
> g) an arbitrary long-sounding period e.g. 100 years
> h) a statistically-defined period based on some observable feature
> of the literature or of its use
> i) the economic lifespan of your article, which elapses when it
> becomes too expensive to provide access to it
> j) some combination of the above.
> The JISC PRESERV project has been undertaking
> some work on defining preservation criteria in terms of citation
> lifespans: it is interesting to note that the earliest arxiv paper in the
> above list (Gamma-Ray Bursts as the Death Throes of Massive Binary Stars,
> Astrophys.J. 395 (1992) L83-L86) is still receiving approx 5 citations
> a year and around 10 downloads a year from the UK arxiv mirror.
> By self-archiving your papers I'm sure that we would all judge that you
> have made the first step towards satisfying your goal of "ensur[ing] my
> articles are widely read now AND WILL BE FOR THE FORESEEABLE FUTURE". By
> self-archiving them in a reasonable format (you seem to have chosen
> PDF for all your deposits) you have made the decision to use a
> format which is widely accessible, well supported, publicly documented
> and has many different renderer implementations, both commercial and
> open source. This seems to be an excellent basis for accessibility and
> near-to-mid-term preservation, and is reflected in the support for PDF chosen
> by many preservation-oriented repositories including MIT. Combined with
> some simple and fairly low-impact technical support from the repository
> or its service providers then we might reasonably expect access to your
> articles to be preserved into the longer term, whatever that turns out
> to be.
> So I don't see any problem. You seem to have made all the required
> steps, and as long as Loughborough's repository is managed as
> responsibly as we would expect then your objectives would already
> seem to be satisfied. In other words, best OA practice (as seen in
> many OA repositories) naturally facilitates preservation (with any
> capitalisation that you choose).
> Actually, there is a bit of a problem, and I hope you'll forgive me
> for pointing this out. The repository only contains one of your
> papers from 2005, so the remainder from last year don't yet have the
> same guarantee of preservation that OA practice provides :-)
> Les Carr
> PS Your institutional repository (
> dspace/ ) doesn't yet make any preservation commitments for specific
> formats, including PDF. See
> formats.jsp . But I'm sure that these will appear in time.
>> Quoting Stevan Harnad <harnad_at_ECS.SOTON.AC.UK>:
>>>> it... seems to make little sense to go to the effort of making
>>>> information accessible NOW when it could theoretically be inaccessible
>>>> 24 hours from NOW or even 3 years from NOW...
>>> Please refer to Steve Hitchcock's posting about PRESERV.
>> scientist-open-access-forum&D=1&O=D&F=l&P=14808
>>> As I said from the outset, Eprints and OA are of course (quite naturally
>>> and without fanfare) attending to small-p preservation (as has Arxiv,
>>> since its inception in 1991, and CogPrints since its inception in 1997
>>> -- note that all their contents are still here, with us, in 2006, in
>>> continuous use, again without any fanfare about large-P Preservation).
>>> But Preservation is not why they were self-archived!
>>> The point is simple: Preservation is *not* the reason researchers
>>> self-archive their postprints, which are final, refereed drafts of
>>> their published articles. Maximising their accessibility and their
>>> impact is the reason researchers self-archive their postprints. It
>>> is not those self-archived supplements that require the large-P
>>> Preservation, it is the published originals.
>>> If researchers self-archive at all, they do not do it in order to Preserve
>>> their articles; they do it in order to increase their articles' usage
>>> and impact. And only 15% of researchers as yet self-archive. The goal
>>> of OA is to raise that 15% to 100%. Neither the silly suggestion that
>>> authors should self-archive in order to Preserve their articles -- nor any
>>> extra work or complications anyone foolishly adds to the self-archiving
>>> procedure (such as it is, for example, in Eprints IRs today) in the
>>> interests of Preservation -- will do anything to help raise that 15%
>>> to 100%: On the contrary, a bad reason for self-archiving and needless
>>> extra work in self-archiving will only deter self-archiving. And neglect
>>> of OA for other archiving priorities (e.g., Digital Preservation) is
>>> the worst.
>>> At the same time, articles in OA IRs *are* being small-p preserved, as
>>> noted. So that's not a substantive issue either.
>>> The only substantive issue is how to fill OA IRs with 100% of
>>> institutional OA article output, as soon as possible. (It's already
>>> vastly overdue and substantial research impact and progress continue
>>> to be needlessly lost till it happens.)
>>> I have listed many heroic librarians who understand this fully, and
>>> have been at the forefront of OA efforts and success (e.g., Paula
>>> Callan, Helene Bosc, Eloy Rodrigues, Derek Law, Susanna Mornati,
>>> and many, many others). But there are also many in the library community
>>> who are ignorant of or indifferent to OA, and have other ideas about
>>> what to do with IRs. Several are discussed in Richard Poynder's
>>> insightful analysis. And it is a parting of ways with them that
>>> Richard was proposing to the OA movement (and he may well be right).
>> little.html
>>> Stevan Harnad
>>>> Hi John,
>>>>> All this has nothing to do with making
>>>>> information accessible NOW. You have failed to distinguish between present
>>>>> and future accessibility.
>>>> The point I was making is that the differentiation between 'present' and
>>>> 'future' accessibility is bogus - there no longer is any real difference.
>>>> And if there is no longer a difference, then the proponents of present
>>>> accessibility should probably be considering future accessibility as a
>>>> matter of course.
>>>> I'm sure most will continue to treat such matters as a 'horses for courses'
>>>> situation, like you say. However, it just seems to make little sense to go
>>>> to the effort of making information accessible NOW when it could
>>>> theoretically be inaccessible 24 hours from NOW or even 3 years from NOW -
>>>> and when some simple technical and administrative measures could have been
>>>> taken to prevent any consequent inaccessibility. It also appears to be
>>>> inconsistent with Stevan Harnad's definition of 'immediate access', which
>>>> suggests that information be accessible "today, tomorrow and into the
>>>> future".
>>>> Regards,
>>>>> From: J.W.T.Smith []
>>>>> Sent: 03 March 2006 17:29
>>>>> Subject: Re: preservation vs. Preservation
>>>>> Comments below.
>>>>>> John,
>>>>>>> Preservation and access are two different things.
>>>>>> I have to disagree. Preservation is inextricably linked with access.
>>>>>> To state that 'preservation and access are two totally different things'
>>>>>> is - I find - a common misconception.
>>>>> I don't suffer from common misconceptions, but I am sometimes
>>>>> misunderstood.
>>>>>> Preservation (with a capital P) is not
>>>>>> merely about preserving digital objects for posterity as an end in itself
>>>>>> (which is, of course, important); it is about preserving the digital
>>>>>> integrity of the object(s) so as to ensure it remains *accessible* ad
>>>>>> infinitum.
>>>>>> Robust Preservation strategies always ensure sufficient administrative
>>>>>> metadata (technical metadata, rights metadata, etc.) is recorded because
>>>>>> without it, user access can theoretically be jeopardised at *any* point in
>>>>>> the future. The rate of technical and software obsolescence is such that
>>>>>> deposits made to IRs today could - theoretically - be inaccessible in five
>>>>>> years. Preservation is no longer some triviality that can be addressed far,
>>>>>> far in the future by 'someone else'. IR administrators / libraries have to
>>>>>> be in a position to regularly migrate or refresh materials to preserve
>>>>>> continued user access. Their ability to do so is predicated on preparing
>>>>>> suitable Preservation strategies.
>>>>>> Thus, to suggest that Preservation entails 'limiting' or 'screening' access
>>>>>> is - in my opinion - to entirely misinterpret the purpose of digital
>>>>>> preservation. If efforts at attaining '100% OA via 100% self-archiving' are
>>>>>> not to be in vain, the need for Preservation (with a capital P!) should not
>>>>>> be pooh-poohed.
>>>>> I did not "pooh-pooh" anything. What you say is true but it is not
>>>>> relevant to what I wrote. All this has nothing to do with making
>>>>> information accessible NOW. You have failed to distinguish between present
>>>>> and future accessibility.
>>>>> To clarify, for the here and now, I believe Preservation is not the same
>>>>> thing as making accessible and those whose main interest is accessibility
>>>>> NOW should not spend too much time on worrying about Preservation. Now [at
>>>>> this time, currently] PDF is an excellent way of making information
>>>>> available, but I would not suggest it as a preservation format. Since
>>>>> there has been a prevalence for poor quality metaphors/analogies in this
>>>>> discussion I could say this is a horses for courses situation.
>>>>> Regards,
>>>>> John Smith.
