> Some time ago there was an article in Science on some gene or protein
> sequence, where the data were kept at the commercial (US) company and not
> published in Science, though Science accepted the article. It is all about
> patents on bio-stuff versus the freedom of publication. Can the reader
> believe the article if the data are proprietary and not public?
> It was quite a stir. I lost the trail. Do you remember this case?

See piece below
Declan Butler

14 March 2002

Nature 416, 111 - 112 (2002)
Geneticists get steamed up over public access to rice genome
Twenty top genome researchers have written to the editorial advisers of
Science protesting at the way the journal occasionally publishes genome maps
without requiring the authors to place the supporting sequence data in
public databases.
The letter is signed by such luminaries as Bob Waterston, head of genetics
at Washington University in St Louis, Nobel laureate Aaron Klug of the MRC
Laboratory of Molecular Biology in Cambridge, UK, and Michael Ashburner,
former head of the European Bioinformatics Institute at Hinxton near
Cambridge. In it they argue that new genome sequences should be made
available in public-domain databases in line with what they term "accepted
norms of the field".
"There are strong rumours in the field that Science is considering allowing
the publication of papers from commercial companies on the rice and mouse
genomes, without demanding the submission of the data in GenBank as a
condition," their letter says.
[Photo caption: Boiling point: disputes about gene data have spilt over to
the planned publication of a rice genome.]
Several sources confirm that Science intends to publish a paper by the
Swiss-based agricultural biotechnology company Syngenta on its draft of the
rice genome. The supporting sequence data will not be deposited in GenBank,
the sources say, but will be available free to academic researchers from
Syngenta's website, subject to certain restrictions.
Science drew criticism last year when it agreed to publish the draft human
genome assembled by Celera Genomics of Rockville, Maryland, despite the
company's restrictions on access to the sequence data.
Donald Kennedy, Science's editor-in-chief, declines to comment on the
pending paper. "Science is committed to full public access," he says. "But
we will consider rare exceptions if the public benefits of removing valuable
data and results from trade-secret status clearly exceed the costs to the
scientific community of the precedent the exception might create. This was
true for the human genome sequence, and for the most important agricultural
commodity in the Third World, the case is surely even stronger."
According to several researchers, Science also plans to publish a draft
sequence of Oryza sativa L. ssp. indica, the major crop rice cultivar in
China, alongside the Syngenta genome. This second rice genome was completed
recently by a team led by Huanming Yang, director of the Beijing Genomics
Institute, and the supporting sequence data have been deposited in GenBank.
A draft sequence of the rice genome by the agricultural biotechnology
company Monsanto, based in St Louis, Missouri, and one by Celera of the
mouse genome, are also under preparation, but have not yet been scheduled
for publication in any journal.
Syngenta currently makes its data available to a handful of academic groups
through special agreements. The publication of Syngenta's rice genome in
Science might result in changes to the company's policy, giving more
researchers access to the sequence data. But, as the letter demonstrates,
researchers remain deeply divided over the terms of such access. "This goes
to the heart of what science is all about, the free exchange of ideas, data
and reagents," says Bruce Stillman, director of the Cold Spring Harbor
Laboratory in New York state. Science should not compromise on making the
data freely available, he says.
But Ron Cantrell, director of the International Rice Research Institute in
the Philippines, is more supportive of Science's decision to publish. "You
have to ask the question 'is it better not to have any access at all?'," he
says, adding that, in his experience, Syngenta and Monsanto have "been very
forthcoming" in collaborations with the public sector.
Chris Novak, a spokesman for Syngenta, says that the company hopes to work
with the publicly funded International Rice Genome Sequencing Project
(IRGSP). The project intends to produce a 'finished' high-quality sequence,
as opposed to the drafts, containing many gaps, that are about to be
published.
Researchers point out that Science's agreement with Syngenta is not entirely
analogous to the one it reached last year with Celera on the human genome.
Celera contributed no data to the public Human Genome Project, instead
relying on data from the public project to complete its own sequence. In
contrast, Syngenta has already contributed significant mapping data to the
IRGSP, through a collaboration with Clemson University in South Carolina.
But Syngenta has so far refused to share its raw sequence data with all of
the public group - unlike its rival Monsanto, whose contributions of
sequence data are credited with strongly accelerating the public project.
In January, however, Syngenta began talks with the IRGSP and, according to
one IRGSP official, has agreed in principle to match the Monsanto agreement.
If it does, "all the Syngenta and Monsanto data will be in the public domain
by the end of the year", says the official. The likelihood of this happening
might be a factor in persuading Science to accept restrictions on the rice
data for the time being, observers suggest.

I don't know of it. (Perhaps other AmSci Forum readers will.)

But to put it in context: Peer-reviewed scientific research
publications usually do not include the raw data on which they
are based, only the results and analyses that they are reporting.
The referees must decide whether the results themselves are sound and
worth knowing, hence publishing. Sometimes they need access to the
data to ascertain this. But in the past it was not the practice -- indeed
it was not possible, because of the expense of print-on-paper -- to
go on to publish the data themselves.

But the online era now makes this possible too. There is still an
element of voluntarism in it, for even researchers who are not
interested in concealing it as proprietary or commercially exploitable
information may want to keep their data under wraps so that they
themselves can continue to mine it, and eventually to publish it in
further peer-reviewed articles of their own, rather than being
"scooped" by others.

In part, this is nonsense, of course, if it serves to retard rather
than enhance scientific progress. But in part it is a practical
byproduct of the fact that researchers often put a great deal of effort
and skill into the gathering of their data, effort they may be more
reluctant to expend if, as of their first published reports, all their
data were to become fair game for other researchers who had not put in
the time or effort.

But let us not exaggerate this proprietariness either: The division of
labor in data-gathering and analysis (and its funding and rewarding)
will be able to accommodate and optimize all this, now that data can be
archived online along with data-analyses and results. Indeed, perhaps a
new category of peer reviewed output, namely, peer-reviewed data, will
earn a place and credit in researchers' CVs and their productivity/impact
assessment. So open online access to archived data will go hand in hand
with open online access to peer reviewed research findings.

As to proprietary data: Nolo contendere. It is the referees who must
decide whether the results are worth reporting in a peer-reviewed
journal even when the data on which they are based are inaccessible.

Stevan Harnad
