Comparative evaluation of text- and citation-based plagiarism detection approaches using GuttenPlag
Author(s) -
Béla Gipp,
Norman Meuschke,
Joeran Beel
Publication year - 2011
Publication title -
KOPS (University of Konstanz)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/1998076.1998124
Subject(s) - plagiarism detection, citation, computer science, information retrieval, matching (statistics), string (physics), citation analysis, string searching algorithm, natural language processing, artificial intelligence, pattern matching, world wide web, mathematics, statistics, mathematical physics
Various approaches for plagiarism detection exist. All are based on more or less sophisticated text analysis methods such as string matching, fingerprinting, or style comparison. In this paper, a new approach called Citation-based Plagiarism Detection is evaluated using a doctoral thesis in which a volunteer crowd-sourcing project called GuttenPlag identified substantial amounts of plagiarism through careful manual inspection. This new approach is able to identify similar and plagiarized documents based on the citations used in the text. It is shown that citation-based plagiarism detection performs significantly better than text-based procedures in identifying strong paraphrasing, translation, and some idea plagiarism. Detection rates can be improved by combining citation-based with text-based plagiarism detection.
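The idea of comparing documents by the citations they use, rather than by their wording, can be illustrated with a minimal sketch. The abstract does not say which similarity measure the authors use, so the sketch assumes one plausible choice: the longest common subsequence of the two documents' in-text citation sequences, normalized by the shorter sequence. The function names and the citation keys are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def longest_common_citation_sequence(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two citation sequences.

    Assumption: each document is reduced to the ordered list of sources it
    cites. Shared citations in the same relative order survive paraphrasing
    and translation, which change the wording but not the cited sources.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def citation_order_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Normalize the shared citation sequence by the shorter document (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return longest_common_citation_sequence(a, b) / min(len(a), len(b))


# Hypothetical citation keys in order of appearance in each document.
suspicious = ["Smith2001", "Lee1999", "Khan2005", "Ortiz2003"]
source = ["Smith2001", "Khan2005", "Doe1990", "Ortiz2003"]

print(citation_order_similarity(suspicious, source))
```

A score near 1.0 would flag a pair for closer manual inspection. Because the measure ignores the text itself, it could be combined with a string-matching score, in line with the abstract's remark that combining both kinds of detection improves detection rates.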