
Abstract Classification Using Support Vector Machine Algorithm (Case Study: Abstract in a Computer Science Journal)
Author(s) -
Favorisen Rosyking Lumbanraja,
Eliza Fitri,
Ardiansyah Ardiansyah,
Apri Junaidi,
Rizky Prabowo
Publication year - 2021
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1751/1/012042
Subject(s) - confusion matrix , support vector machine , confusion , computer science , artificial intelligence , machine learning , kernel (algebra) , matrix (chemical analysis) , algorithm , homogeneous , data mining , pattern recognition (psychology) , mathematics , psychology , materials science , combinatorics , psychoanalysis , composite material
Jurnal Komputasi is an online journal written by researchers and published by the Department of Computer Science, University of Lampung. Specific scientific information in the journal is difficult to find because its articles have not been structured or classified into more specialized categories of computer science. Text mining can convert the journal text into a structured, homogeneous data form. A total of 144 journal abstracts were collected into a single corpus in CSV format and used as the research dataset. The abstracts are classified with a supervised machine learning method, the Support Vector Machine (SVM), so that classification is faster than doing it manually. The TF-IDF technique transforms the sentences of each abstract into a vector so that they can be modelled with SVM. The classification model is validated with 10-fold cross-validation. Performance is computed from the confusion matrix for each of 3 SVM kernels. The study concludes that two factors affect classification accuracy: the imbalance in the number of members across scientific classes, and the number of features generated by text mining. The highest test accuracy, 58.3%, was obtained with 205 features and the linear SVM kernel.
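The pipeline described in the abstract (TF-IDF vectors, an SVM classifier, 10-fold cross-validation, and a confusion matrix per kernel) can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' implementation. The toy corpus, the class names, and the `evaluate` helper are hypothetical. The abstract names only the linear kernel, so `rbf` and `poly` are assumed here as the other two. Mapping the 205 features to `max_features=205` is also an assumption.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus standing in for the 144 abstracts stored in the CSV file.
network_terms = ["router", "packet", "protocol", "bandwidth", "latency"]
learning_terms = ["classifier", "training", "feature", "kernel", "accuracy"]
texts, labels = [], []
for i in range(10):
    texts.append(f"network {network_terms[i % 5]} {network_terms[(i + 1) % 5]} study")
    labels.append("networking")
    texts.append(f"learning {learning_terms[i % 5]} {learning_terms[(i + 1) % 5]} study")
    labels.append("machine_learning")


def evaluate(kernel, max_features=205, n_splits=10):
    """Return (confusion matrix, accuracy) from k-fold cross-validated predictions."""
    # Keeping TF-IDF inside the pipeline means it is refit on each training fold,
    # so no vocabulary or IDF statistics leak from the held-out fold.
    model = make_pipeline(
        TfidfVectorizer(max_features=max_features),
        SVC(kernel=kernel),
    )
    # Stratified folds keep the class proportions similar in every fold.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    predicted = cross_val_predict(model, texts, labels, cv=cv)
    return confusion_matrix(labels, predicted), accuracy_score(labels, predicted)


# Assumed kernel set; only "linear" is named in the abstract.
for kernel in ("linear", "rbf", "poly"):
    matrix, accuracy = evaluate(kernel)
    print(kernel, f"accuracy={accuracy:.3f}")
    print(matrix)
```

Accuracy here is the sum of the confusion matrix diagonal divided by the total number of abstracts. With unbalanced classes, as in the study, stratified folds and per-class measures read from the confusion matrix may be more informative than accuracy alone.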