Open Access
Plagiarism Detection in Programming Assignments using Machine Learning
Author(s) -
Nishesh Awale,
Mitesh Pandey,
Anish Dulal,
Bibek Timsina
Publication year - 2020
Publication title -
Journal of Artificial Intelligence and Capsule Networks
Language(s) - English
Resource type - Journals
ISSN - 2582-2012
DOI - 10.36548/jaicn.2020.3.005
Subject(s) - plagiarism detection, computer science, support vector machine, similarity (geometry), source code, code (set theory), set (abstract data type), artificial intelligence, machine learning, test set, programming language, image (mathematics)
Plagiarism in programming assignments is increasing, and it undermines the fair evaluation of students. This paper proposes a machine learning approach to plagiarism detection in programming assignments. For each pair of source files, features are computed from the similarity score of n-grams, code style similarity, and dead code. An XGBoost model is then trained to predict whether a pair of source files is plagiarised or not. Many plagiarism detection techniques ignore dead code, such as unused variables and functions, in their prediction tasks; this paper instead uses the number of unused variables and functions in the source code as features. Using our features, the model achieved an accuracy of 94% and an average F1-score of 0.905 on the test set. We also compared the results of the XGBoost model with a support vector machine (SVM) and report that the XGBoost model performed better on our dataset.
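
The abstract names two of the feature families (n-gram similarity and dead code counts) without giving the details of how they are computed. The following is a minimal sketch of what such pairwise features might look like, not the authors' implementation. It assumes the submissions are Python files, uses a crude regex tokenizer, measures n-gram overlap with Jaccard similarity, and treats any name that is assigned or defined but never read as dead code. All function names (`token_ngrams`, `ngram_similarity`, `count_unused_names`, `pair_features`) are hypothetical.

```python
import ast
import re


def token_ngrams(source: str, n: int = 3) -> set:
    """Return the set of token n-grams for a source string."""
    # Assumed tokenizer: identifiers, digit runs, or single non-space symbols.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity of the two n-gram sets, in [0, 1]."""
    grams_a, grams_b = token_ngrams(a, n), token_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two inputs with no n-grams are treated as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def count_unused_names(source: str) -> int:
    """Count variables and functions that are defined but never read.

    Assumption: the input is valid Python; scoping is ignored, so this
    is only a rough proxy for dead code.
    """
    stored, loaded = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stored.add(node.id)
            else:
                loaded.add(node.id)
        elif isinstance(node, ast.FunctionDef):
            stored.add(node.name)
    return len(stored - loaded)


def pair_features(a: str, b: str) -> list:
    """Illustrative feature vector for one pair of submissions."""
    return [
        ngram_similarity(a, b, n=3),
        abs(count_unused_names(a) - count_unused_names(b)),
    ]
```

Feature vectors of this kind, together with plagiarised or not-plagiarised labels, could then be passed to a gradient-boosted classifier such as `xgboost.XGBClassifier`. The sketch stops short of that step so that it depends only on the standard library.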
