Shallow Deep Learning using Space-filling Curves for Malware Classification


Open Access


Author(s) - David S. Long, Stephen O’Shaughnessy

Publication year - 2022

Publication title - Proceedings of the ... International Conference on Information Warfare and Security

Language(s) - English

Resource type - Journals

eISSN - 2048-9889

pISSN - 2048-9870

DOI - 10.34190/iccws.17.1.13

Subject(s) - malware, artificial intelligence, computer science, machine learning, convolutional neural network, deep learning, f1 score, support vector machine, benchmark (surveying), computer security, geodesy, geography

Malware attacks are increasing rapidly, driven by the lucrative potential of schemes such as ransomware, credential-stealing Trojans and cryptominers. This growth is compounded by the ease with which variants can be created from original strains. As a result, anti-virus organisations are struggling to keep up, with some reporting upwards of 14 million samples processed per month. These volumes have caused a shift towards machine learning and artificial intelligence to reduce the manual burden of analysis and classification. This research presents a novel framework for classifying malware into distinct families through computer vision and deep learning. In the proposed framework, malware binaries are represented in an abstract form as images mapped through mathematical constructs known as space-filling curves. Convolutional neural networks were constructed and applied to the malware images to build predictive models for classification. The models were optimised with an auto-tuning function for the hyperparameters, which used Bayesian Optimisation, Random Search and HyperBand to search the hyperparameter space extensively. On a training dataset of 13k malware samples from 23 distinct families, the models yielded an average score of 95% for precision, recall and F1-score. The final deep learning model was validated for robustness against a dataset of more recent variants, comprising 12,816 samples from 16 malware families, returning scores of 95%, 86% and 90% for precision, recall and F1-score respectively. The final model considerably outperformed a similar benchmark model. The results show the potential of the deep learning framework as a viable solution to malware classification, without the need for manually intensive feature generation or invasive processing techniques.
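
To make the image-mapping step concrete, the following is a minimal sketch of how the bytes of a binary could be laid out along a space-filling curve. The abstract does not say which curve the authors used, so the Hilbert curve here is an assumption, as are the function names (hilbert_d2xy, bytes_to_hilbert_image), the zero padding of short inputs and the truncation of long ones. It is an illustration of the general idea, not the authors' implementation.

```python
def hilbert_d2xy(side, d):
    """Map index d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two. This is the standard iterative
    index-to-coordinate conversion for the Hilbert curve.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the sub-square before rotating it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x  # Rotate by swapping the axes.
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def bytes_to_hilbert_image(data, order):
    """Place each byte of data on a 2**order square grid in Hilbert order.

    Assumptions for this sketch: inputs shorter than the grid are padded
    with zeros, and inputs longer than the grid are truncated.
    Returns a list of rows, each a list of greyscale values in 0..255.
    """
    side = 2 ** order
    image = [[0] * side for _ in range(side)]
    for d, value in enumerate(data[: side * side]):
        x, y = hilbert_d2xy(side, d)
        image[y][x] = value
    return image


if __name__ == "__main__":
    # Hypothetical input: the first 64 byte values stand in for a binary.
    sample = bytes(range(64))
    for row in bytes_to_hilbert_image(sample, order=3):
        print(" ".join(f"{v:3d}" for v in row))
```

Bytes that are close together in the file land close together in the image, which is the locality property that makes such layouts attractive as input to a convolutional network. In practice the resulting grid would be converted to an array or image file before training.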
