Open Access
A Study on the Performance of Feature Extraction Methods According to the Size of N-Gram
Author(s) -
Young Hoo Kwon,
Moongu Son,
Dong Keun Chung,
Myung Jae Lim
Publication year - 2018
Publication title -
International Journal of Engineering and Technology
Language(s) - English
Resource type - Journals
ISSN - 2227-524X
DOI - 10.14419/ijet.v7i3.33.18516
Subject(s) - opcode , support vector machine , pattern recognition (psychology) , computer science , artificial intelligence , classifier (uml) , gram , feature extraction , n gram , biology , bacteria , language model , computer hardware , genetics
In this paper, we study how the performance of feature extraction methods for malware detection varies with the size of the N-gram. Features are extracted from PE files by three methods: Opcode Only, both Opcode and API, and API Only. We measure their performance indirectly, through the AUC score and accuracy of the classifiers trained on them. We ran experiments with different values of N using several classifiers: DT, SVM, KNN, and BNB. Based on these experiments, we recommend the Opcode Only method when the N-gram technique is used. In addition, the instance-based classifier KNN and, among the model-based classifiers, DT perform better than SVM and BNB.
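
To make the N-gram idea concrete, the following is a minimal sketch of how opcode N-gram count features might be built for a given N. It is not the authors' pipeline: the function names (`opcode_ngrams`, `build_vocabulary`, `vectorize`) and the sample opcode sequences are hypothetical, and it assumes each PE file has already been disassembled into a list of opcode mnemonics by some other tool.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

NGram = Tuple[str, ...]


def opcode_ngrams(opcodes: Sequence[str], n: int) -> Counter:
    """Count every contiguous run of n opcodes in one sample."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n yields an empty range, hence no N-grams.
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


def build_vocabulary(samples: Iterable[Sequence[str]], n: int) -> List[NGram]:
    """Collect the distinct N-grams seen across all samples, in sorted order."""
    vocab = set()
    for opcodes in samples:
        vocab.update(opcode_ngrams(opcodes, n))
    return sorted(vocab)


def vectorize(opcodes: Sequence[str], vocab: Sequence[NGram], n: int) -> List[int]:
    """Turn one sample into a count vector aligned with the vocabulary."""
    counts = opcode_ngrams(opcodes, n)
    return [counts[gram] for gram in vocab]


# Hypothetical opcode sequences standing in for two disassembled PE files.
sample_a = ["push", "mov", "call", "push", "mov"]
sample_b = ["mov", "call", "ret"]

vocab_2 = build_vocabulary([sample_a, sample_b], n=2)
features_a = vectorize(sample_a, vocab_2, n=2)
features_b = vectorize(sample_b, vocab_2, n=2)
```

The resulting count vectors could then be passed to any of the classifiers named in the abstract. Larger values of N capture longer instruction patterns but enlarge the vocabulary, which is the trade-off the study examines.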
