Integrating syntax‐semantic‐based text analysis with structural and citation information for scientific plagiarism detection
Author(s) - Vani K, Gupta Deepa
Publication year - 2018
Publication title - Journal of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.903
H-Index - 145
eISSN - 2330-1643
pISSN - 2330-1635
DOI - 10.1002/asi.24027
Subject(s) - plagiarism detection, computer science, syntax, parsing, citation, information retrieval, set (abstract data type), baseline (sea), semantic analysis (machine learning), natural language processing, citation analysis, world wide web, programming language, oceanography, geology
The objective of this work is to explore the potency of integrating structural and citation information with effective syntax‐semantic text‐based analysis for scientific plagiarism detection. A major limitation of today's plagiarism checkers is their sole dependence on text‐based detection, which ignores citation and structural information. Further, the text‐based detection approaches that they employ usually fail to detect intelligent manipulations. The proposed plagiarism detection system couples several modules: logical structure classification and citation parsing, two‐stage candidate document selection, and syntax‐semantic‐based exhaustive passage‐level analysis with plagiarism analysis using structural and citation information. Further, a new plagiarism score, the weighted overall similarity index, is proposed as an alternative to general plagiarism scores. The proposed approach is evaluated on the data set created by Alzahrani et al. ( ), which contains scientific publications with plagiarism of varying complexity. The final system results are compared against a baseline approach. The proposed approach exhibits considerable improvement over the baseline, and hence reflects the potency of combining syntax‐semantic text‐based analysis with structural and citation information.
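The abstract names a weighted overall similarity index but does not define it. As a rough illustration only, the sketch below assumes such an index could be a weighted mean of per-section similarity scores, so that overlap in some logical sections counts more than in others. The function name, the section names, and the weight values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional

# Hypothetical section weights: the abstract does not state the real values.
SECTION_WEIGHTS: Dict[str, float] = {
    "introduction": 0.5,
    "methods": 1.0,
    "results": 1.0,
    "conclusion": 0.75,
}


def weighted_similarity(
    section_scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Return the weighted mean of per-section similarity scores.

    Each score must lie in [0, 1]. Sections that have no entry in
    `weights` fall back to a weight of 1.0. An empty input yields 0.0.
    This is an assumed formulation, not the authors' definition.
    """
    if weights is None:
        weights = SECTION_WEIGHTS

    weighted_sum = 0.0
    total_weight = 0.0
    for section, score in section_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {section!r} is outside [0, 1]: {score}")
        weight = weights.get(section, 1.0)
        weighted_sum += weight * score
        total_weight += weight

    return weighted_sum / total_weight if total_weight else 0.0
```

Under these assumed weights, a document with high overlap in the introduction and low overlap in the methods section scores lower than a plain average would suggest, which is the kind of effect a structure-aware score is meant to capture.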
