Re: Plagiarism

From: Lee Giles <giles_at_IST.PSU.EDU>
Date: Wed, 21 Dec 2005 15:37:30 -0500

Hi:

CiteSeer has a feature that shows the similarity of papers at the sentence
level. Other digital libraries (DLs) or archives could do the same using
hash matches. Where available, the feature appears on a document page. We
also show co-citation similarity and an active bibliography. See, for
example, this document page:
      http://citeseer.ist.psu.edu/17863.html
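
To illustrate the general idea of sentence-level hash matching mentioned
above, here is a minimal sketch in Python. It is not CiteSeer's actual
implementation; the function names (sentence_hashes, shared_sentence_ratio),
the naive sentence splitter, the four-word minimum, and the choice of SHA-1
are all assumptions made for illustration only.

```python
import hashlib
import re


def sentence_hashes(text: str) -> set:
    """Return the set of hashes of the normalized sentences in text."""
    # Naive split after ., ! or ? followed by whitespace (an assumption;
    # a real system would use a proper sentence tokenizer).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hashes = set()
    for sentence in sentences:
        # Normalize: lowercase, keep only letters and digits, single spaces.
        normalized = " ".join(re.findall(r"[a-z0-9]+", sentence.lower()))
        if len(normalized.split()) < 4:
            continue  # ignore very short sentences, which match too easily
        hashes.add(hashlib.sha1(normalized.encode("utf-8")).hexdigest())
    return hashes


def shared_sentence_ratio(doc_a: str, doc_b: str) -> float:
    """Fraction of the smaller document's sentences also found in the other."""
    a = sentence_hashes(doc_a)
    b = sentence_hashes(doc_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))
```

Because only hashes are compared, an archive could exchange or index the
hash sets without exchanging full text. Exact hashing only catches sentences
that are identical after normalization; paraphrased text would need a
fuzzier technique.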

We will soon offer a new search feature for document similarity, to make
searching for similar text and features more readily accessible, as part
of our Next Generation CiteSeer project.

Best

Lee Giles

--
Arthur Sale wrote:
      Since the Open Access debate in the UK and the USA is hotting
      up again, perhaps it is a good idea to examine an area of
      research dissemination that is unsavory, but which open
      access can help to minimize, but toll publishers cannot or do
      not. BTW, I use open access in the well-defined technical
      sense of access which is free, unconstrained in time, and
      available to all with Internet access.
       
      My topic is plagiarism: the copying of someone else's
      work without acknowledgment. Very dishonest, but we all know
      that it occurs. One of the researchers at the University of
      Tasmania decided to use one of his own papers to test a
      commercial software tool intended for detecting plagiarism in
      student assignments. He was interested to see that it turned
      up a substantial direct quotation from his paper by an author
      in another country, but less pleased to find that the
      quotation was unattributed. He took legal advice, and the
      offending author was contacted for redress.
       
      This detection worked because the offending document was open
      to access on the Internet. Clearly it would be possible for
      any author to do the same with their own papers were a free
      plagiarism detector available, or for a more central resource
      to scan OA repositories at random. The key difficulty is
      determining if the quotation is acknowledged or not
      (currently an eyeball check or cross-checking with citation
      services). Simple tools could be developed to test that
      against reference lists. Bodies like the NIH and the RCUK
      (and their Australian equivalents) should be very interested
      in the development and dissemination of such tools, as it
      would directly address the question of research quality and
      improve it. An effective scheme by itself would act as a
      deterrent, as the chance of detection would increase
      dramatically.
       
      Of course, such detection is not possible at present with
      restricted access as provided by traditional toll-access
      publishers. Their content is not accessible for searching and
      indexing, and they rely entirely on referees to recognize
      plagiarism.
       
      This is then another clear technical advantage to providing
      open access to all research publications. Modern technology
      can help detect and stamp out scholarly fraud.
       
      Arthur
       
Received on Wed Dec 21 2005 - 20:46:01 GMT
