The Performance Comparison of the Classifiers According to Binary Bow, Count Bow and Tf-Idf Feature Vectors for Malware Detection
Author(s) - Young Hoo Kwon, So Hee Jun, Won Mo Gal, Myung Jae Lim
Publication year - 2018
Publication title - International Journal of Engineering and Technology
Language(s) - English
Resource type - Journals
ISSN - 2227-524X
DOI - 10.14419/ijet.v7i3.33.18515
Subject(s) - support vector machine, pattern recognition (psychology), opcode, computer science, artificial intelligence, feature (linguistics), feature vector, philosophy, linguistics, computer hardware
In this paper, we compare the performance of classifiers for malware detection using three feature vectors: Binary BOW, Count BOW and TF-IDF. The features are opcodes extracted from PE files. For the comparison, we measured the AUC score of five classifiers: DT, KNN, MLP, MNB and SVM. Based on the results, we recommend the neural network (MLP) and the instance-based model (KNN), because they show high AUC scores and accuracy regardless of the unbalanced dataset and the feature vector. Among classical classifiers, we recommend DT, because it also gives a high AUC score and accuracy under the same conditions. With SVM, Robust scaling is needed to deal with outliers and the unbalanced dataset. With MNB, the N-gram technique is needed to improve the AUC score.
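
To make the three feature vectors concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The opcode sequences are invented for illustration, the helper names (`ngrams`, `build_vocab`, `count_bow`, `binary_bow`, `tfidf`) are hypothetical, and the TF-IDF variant (raw count times ln(N / df)) is an assumption, since the abstract does not say which weighting was used.

```python
import math
from collections import Counter

# Hypothetical opcode sequences, one per PE file (not data from the paper).
samples = [
    ["push", "mov", "call", "mov", "ret"],
    ["mov", "xor", "xor", "jmp"],
    ["push", "call", "ret"],
]

def ngrams(opcodes, n=1):
    """Join each run of n consecutive opcodes into a single token."""
    return [" ".join(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]

def build_vocab(docs):
    """Sorted list of every distinct token seen across all documents."""
    return sorted({tok for doc in docs for tok in doc})

def count_bow(docs, vocab):
    """Count BOW: number of occurrences of each token in each document."""
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts[tok] for tok in vocab])
    return rows

def binary_bow(docs, vocab):
    """Binary BOW: 1 if the token occurs in the document, else 0."""
    return [[1 if c > 0 else 0 for c in row] for row in count_bow(docs, vocab)]

def tfidf(docs, vocab):
    """TF-IDF with raw counts as TF and idf = ln(N / df) (assumed variant)."""
    counts = count_bow(docs, vocab)
    n_docs = len(docs)
    # df is never zero because the vocabulary is built from these documents.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return [[c * w for c, w in zip(row, idf)] for row in counts]

vocab = build_vocab(samples)
binary_vectors = binary_bow(samples, vocab)
count_vectors = count_bow(samples, vocab)
tfidf_vectors = tfidf(samples, vocab)

# Opcode bigrams, as one way to apply the N-gram technique the abstract mentions.
bigram_docs = [ngrams(doc, 2) for doc in samples]
```

Each resulting row is one feature vector that could be passed to a classifier such as DT, KNN, MLP, MNB or SVM. Under the assumed weighting, a token that occurs in every file gets an IDF of zero, which is one reason smoothed variants are often preferred in practice.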
