Open Access and Open Data

From: Peter Murray-Rust <pm286_at_CAM.AC.UK>
Date: Fri, 20 May 2005 09:51:03 +0100

I have recently found it useful in science publishing to distinguish
between Open Access and Open Data. I now feel that the use of these terms
might help to clarify some of the discussions on this list. Open Data is
concerned with the publication of the scientific facts (often in or
associated with peer-reviewed publication) and goes beyond free text and
single PDF documents. On the technical side we promote XML as the only
useful approach. Open Access, as discussed on this list, emphasises the
cultural and political approaches over the technical (licenses, etc.). I
have raised these issues on the list before, received enlightenment, and
will not repeat them here, but refer to reposited material.

I have put three digital objects on OpenAccess/OpenData in our Repository.
I thank the University for being one of the relatively few to have an IR
and allowing me to reposit - this is not yet an automatic right.
- A short invited presentation to JISC2005 (JISC is the UK's information
  infrastructure organisation), in which I emphasize that machines, as
  well as humans, read publications
- An invited overview for BioMedCentral Bioinformatics on Open chemical data
  in biosciences and how it can be accelerated by forward-looking publishers
  and funders
- An accompanying technical article for BioMedCentral Bioinformatics on the
  extraction of such data in XML

BMC is an Open Access publisher (political: Gold; technical: Creative
Commons) so I am allowed to reposit these articles.

For the record, I am an experienced repositor in our DSpace and together
the two repositions (a single document each) took 30 minutes. The repositor
has to provide metadata, descriptions, etc. and I also linked the two
publications. Multicomponent documents can take much longer. The documents
themselves are, unfortunately, only PDF because the BMC's new Publicon
authoring system works by filling in templates rather than submitting a
manuscript - there is no "authors' manuscript".

Note that the last two are pre-review, and like many authors I am slightly
nervous about exposing them. I would expect that these are among very few
"Gold" documents published by chemists. I intend to add appropriate
documents at all stages of the process if the publication(s) are accepted
(preprint and postprint).

We have already shown that documents in these and other repositories are
automatically indexed by Google and MSN, *including the chemical structures*.
This work includes OpenData on the Southampton eprints repository for
crystal structures and our own DSpace. Our full publication on this work
is toll-access and is not reposited as the publisher (RSC) is not Open
Access. (For the record, it was announced last year on this list that the
Royal Society of Chemistry was a "Green" publisher. This is not, and has
never been, true, and when we took it in good faith it caused some
embarrassment. That is why I feel that clear statements on licensing are
important, but again will not pursue that here.) However, the
"supplemental data" are OpenData and give a good idea of the
thoroughness of current search engines.

I have accepted invitations in Open Access meetings to present on Open Data
and would be grateful for feedback offlist on whether this is a useful


Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069 Fax: +44 1223 763076

