
Massive Malware Variants Detection Based on Bag-of-words Perceptual Hashing
Author(s) - Jian Yu
Publication year - 2020
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1682/1/012046
Subject(s) - malware , computer science , executable , hash function , artificial intelligence , signature (topology) , pattern recognition (psychology) , code (set theory) , data mining , machine learning , computer security , mathematics , set (abstract data type) , operating system , programming language , geometry
At present, the most widely used malware detection methods rely on signatures obtained through reverse engineering to recognize malware variants. This approach is problematic because signatures are easily altered by packers, which compress and/or encrypt the executable code to evade detection. In this paper, we present a novel Bag-of-words perceptual hashing method for detecting variants. The proposed method visualizes the malware binary code as a grey-scale image, extracts a Grey-level Co-occurrence Matrix (GLCM) feature vector, and uses a Bag-of-words model to generate a perceptual hash for malware variant detection. Experimental results show that the proposed method achieves high accuracy and fast detection speed and is resilient to popular packers, which makes it suitable for detecting malware variants at massive scale.
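
The pipeline described in the abstract (binary to grey-scale image, GLCM features, Bag-of-words, perceptual hash) can be illustrated with a minimal Python sketch. The abstract gives no implementation details, so everything concrete here is an assumption made for illustration and is not the author's method: the image width, the block size, the eight grey levels, the horizontal-neighbour GLCM, the four texture features, the hand-picked `TOY_CODEBOOK` (a real codebook would normally be learned, for example by clustering), the mean-threshold binarization, and all function names.

```python
import math

def bytes_to_image(data, width=16):
    """Reshape raw bytes into rows of `width` grey levels (0-255); the ragged tail is dropped."""
    rows = len(data) // width
    return [list(data[r * width:(r + 1) * width]) for r in range(rows)]

def glcm_features(block, levels=8):
    """Build a normalised GLCM over horizontal neighbours and return
    (contrast, energy, homogeneity, entropy). The feature choice is an assumption."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in block:
        q = [v * levels // 256 for v in row]  # quantise 0-255 into `levels` bins
        for a, b in zip(q, q[1:]):
            counts[a][b] += 1
            total += 1
    if total == 0:
        return (0.0, 0.0, 0.0, 0.0)
    contrast = energy = homogeneity = entropy = 0.0
    for i in range(levels):
        for j in range(levels):
            p = counts[i][j] / total
            if p > 0:
                contrast += (i - j) ** 2 * p
                energy += p * p
                homogeneity += p / (1 + abs(i - j))
                entropy -= p * math.log2(p)
    return (contrast, energy, homogeneity, entropy)

def block_features(image, block_rows=4):
    """One GLCM feature vector per non-overlapping horizontal strip of the image."""
    return [glcm_features(image[r:r + block_rows])
            for r in range(0, len(image) - block_rows + 1, block_rows)]

def bow_histogram(features, codebook):
    """Assign each feature vector to its nearest codeword and count occurrences."""
    hist = [0] * len(codebook)
    for f in features:
        nearest = min(range(len(codebook)),
                      key=lambda k: sum((x - y) ** 2 for x, y in zip(f, codebook[k])))
        hist[nearest] += 1
    return hist

def perceptual_hash(hist):
    """Binarise the histogram: 1 where a bin exceeds the mean count, else 0."""
    mean = sum(hist) / len(hist)
    return [1 if h > mean else 0 for h in hist]

def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical, hand-picked codewords (contrast, energy, homogeneity, entropy).
TOY_CODEBOOK = [
    (0.0, 1.0, 1.0, 0.0),   # flat region, e.g. padding
    (0.5, 0.3, 0.8, 2.0),   # smooth, low-contrast texture
    (5.0, 0.05, 0.4, 4.5),  # moderately noisy texture
    (10.0, 0.02, 0.3, 6.0), # high-entropy region, e.g. packed data
]

def malware_hash(data, codebook=TOY_CODEBOOK, width=16, block_rows=4):
    """End-to-end sketch: bytes -> image -> GLCM features -> BoW histogram -> bit hash."""
    image = bytes_to_image(data, width)
    return perceptual_hash(bow_histogram(block_features(image, block_rows), codebook))

sample = bytes(range(256)) * 2
variant = sample[:100] + b"\x00\x00" + sample[102:]  # small local modification
print(hamming(malware_hash(sample), malware_hash(variant)))
```

Under these assumptions, two samples would be treated as variants of one family when the Hamming distance between their hashes falls below a chosen threshold; that threshold, like the other parameters, would have to be tuned on real data.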