The Performance Comparison of the Classifiers According to Binary Bow, Count Bow and Tf-Idf Feature Vectors for Malware Detection | Zendy

Young Hoo Kwon | Zendy; So Hee Jun | Zendy; Won Mo Gal | Zendy; Myung Jae Lim | Zendy

Open Access

The Performance Comparison of the Classifiers According to Binary Bow, Count Bow and Tf-Idf Feature Vectors for Malware Detection

Author(s) -

Young Hoo Kwon,

So Hee Jun,

Won Mo Gal,

Myung Jae Lim

Publication year - 2018

Publication title -

international journal of engineering and technology

Language(s) - English

Resource type - Journals

ISSN - 2227-524X

DOI - 10.14419/ijet.v7i3.33.18515

Subject(s) - support vector machine , pattern recognition (psychology) , opcode , computer science , artificial intelligence , feature (linguistics) , feature vector , philosophy , linguistics , computer hardware

In this paper, we compared the performance of the classifiers according to feature vectors with Binary BOW, Count BOW and TF-IDF for malware detection. We used the feature of Opcode that extracted from PE file. For performance comparison, we measured the AUC score for the classifiers those are DT, KNN, MLP, MNB and SVM. As a result, we recommend neural network (MLP) and instance-based model (KNN) because they show the high AUC score and accuracy regardless of the unbalanced dataset and the feature vector. If you use classical classifiers, we recommend DT because it guarantees high AUC score and accuracy regardless of the same condition as the above. If you use SVM, you have to do Robust scaling to resolved outlier and unbalanced dataset. If you use MNB, you need to use N-gram technique to improve AUC score.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.

Having issues? You can contact us here

Empowering knowledge with every search

About

About Careers Publisher Partners Contact Us

Learn

FAQs Blog Terms of Use Privacy Policy

About

Learn

Discover

Explore