
A Classification System for Visualized Malware Based on Multiple Autoencoder Models
Author(s) -
Jongkwan Lee,
Jongdeog Lee
Publication year - 2021
Publication title -
IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2021.3122083
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
In this paper, we propose a classification system that uses multiple autoencoder models to identify malware images. Malware must be classified accurately before appropriate countermeasures can be deployed to prevent it from spreading, so rapid classification is the first step in preparing an effective response. Typical approaches to this problem fall into static and dynamic methods, and neither is suitable for efficient malware classification: static methods require fixed malware patterns, and dynamic methods require a great deal of investigation time. With enough time and resources, malware analysts can analyze any sample thoroughly. In practice, resources are finite, and analysts are perpetually short of time because the volume of malware awaiting analysis is growing dramatically; new malware and variants of existing malware emerge constantly. To address this issue, many researchers have developed approaches based on machine learning. To date, however, these systems have had difficulty responding to the rapidly changing malware environment, and they also suffer from class imbalance in the training data. The system proposed in this paper consists of multiple autoencoder models that classify malware that has been converted to an image. Each autoencoder model classifies only one type of malware and is trained using only samples from the corresponding family. This design allows the system to be updated quickly and mitigates the data imbalance problem. Experiments on the Malimg dataset show that our method outperforms other state-of-the-art techniques.
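The one-model-per-family idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the decision rule, the network architecture, or the image conversion, so everything below is an assumption made for illustration: the sample is assigned to the family whose model reconstructs it with the lowest mean squared error, a PCA projection (the optimal linear autoencoder) stands in for a neural autoencoder, and the names `bytes_to_image`, `LinearAutoencoder`, and `MultiAutoencoderClassifier` are hypothetical.

```python
import numpy as np


def bytes_to_image(data: bytes, width: int = 16, height: int = 16) -> np.ndarray:
    """Map raw bytes to a flattened grayscale image with values in [0, 1].

    Assumption: input is truncated or zero-padded to width * height bytes.
    """
    size = width * height
    buf = np.frombuffer(data, dtype=np.uint8)[:size]
    buf = np.pad(buf, (0, size - buf.size))
    return buf.astype(np.float64) / 255.0


class LinearAutoencoder:
    """Stand-in for a neural autoencoder: encode and decode via PCA."""

    def __init__(self, n_components: int = 4):
        self.n_components = n_components

    def fit(self, samples: np.ndarray) -> "LinearAutoencoder":
        # Learn the mean and the top principal directions of ONE family.
        self.mean_ = samples.mean(axis=0)
        _, _, vt = np.linalg.svd(samples - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_components]
        return self

    def reconstruction_error(self, x: np.ndarray) -> float:
        centered = x - self.mean_
        code = centered @ self.components_.T   # encode
        recon = code @ self.components_        # decode
        return float(np.mean((centered - recon) ** 2))


class MultiAutoencoderClassifier:
    """One model per malware family, each trained on that family only."""

    def __init__(self):
        self.models = {}

    def add_family(self, name: str, samples: np.ndarray, n_components: int = 4):
        # Adding or refreshing a family fits a single model; the models
        # of the other families are left untouched.
        self.models[name] = LinearAutoencoder(n_components).fit(samples)

    def predict(self, x: np.ndarray) -> str:
        # Assumed decision rule: lowest reconstruction error wins.
        errors = {n: m.reconstruction_error(x) for n, m in self.models.items()}
        return min(errors, key=errors.get)
```

Under these assumptions, supporting a new family means calling `add_family` once with that family's samples, which is one way to read the abstract's claim about quick updates. Because each model sees only its own family, a family with few samples does not compete with larger families inside a shared loss function.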