Open Access

Techniques for Detecting Malware Traffic: A Comprehensive Approach to Feature Selection and Classification

Author(s) - A Harsha

Publication year - 2021

Publication title - International Journal for Research in Applied Science and Engineering Technology

Language(s) - English

Resource type - Journals

ISSN - 2321-9653

DOI - 10.22214/ijraset.2021.39088

Subject(s) - computer science, random forest, malware, machine learning, feature selection, artificial intelligence, extreme learning machine, boosting (machine learning), network packet, encryption, dimensionality reduction, support vector machine, data mining, gradient boosting, curse of dimensionality, artificial neural network, computer security

Since the advent of encryption, the volume of malware transmitted over encrypted networks has increased steadily. Traditional detection approaches such as packet content analysis are ineffective against encrypted data. In the absence of actual packet contents, other features such as packet size, arrival time, and source and destination addresses can be used to detect malware. This metadata can be used to train machine learning classifiers to distinguish malicious packets from benign ones. In this paper, we present an efficient malware detection approach using machine learning classification algorithms: support vector machine, random forest, and extreme gradient boosting. We apply an extensive feature selection process to reduce the dimensionality of the chosen dataset. The dataset is then split into training and testing sets; the models are trained on the training set and evaluated on the testing set to assess their respective performance. We further tune the hyperparameters of the algorithms to achieve better results. Random forest and extreme gradient boosting performed exceptionally well in our experiments, achieving area under the curve (AUC) values of 0.9928 and 0.9998, respectively. Our work demonstrates that malware traffic can be classified effectively using conventional machine learning algorithms and shows the importance of dimensionality reduction in such classification problems.

Keywords: Malware Detection, Extreme Gradient Boosting, Random Forest, Feature Selection.
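The abstract does not specify the dataset, the feature selection procedure, or the classifier settings, so the following is only an illustrative sketch of two ideas it names: ranking metadata features by how well they separate malicious from benign traffic, and scoring that separation with the area under the ROC curve. Everything here is an assumption made for illustration: the synthetic flows, the feature names (`packet_size`, `inter_arrival`, `noise`), the helper functions, and the choice of a simple univariate filter in place of the paper's own selection process. It uses only the Python standard library.

```python
import random


def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 0 = benign, 1 = malicious. Higher scores should mean "more
    suspicious". Tied scores receive their average rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def rank_features(rows, labels, names):
    """Univariate filter: score each feature by its stand-alone AUC."""
    ranked = []
    for col, name in enumerate(names):
        auc = roc_auc(labels, [row[col] for row in rows])
        # A feature that separates the classes in either direction is
        # useful, so fold the AUC around 0.5 (0.0 = no signal, 0.5 = perfect).
        ranked.append((abs(auc - 0.5), name))
    return sorted(ranked, reverse=True)


def synthetic_flows(n, seed=0):
    """Hypothetical flow metadata; not the dataset used in the paper."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        malicious = rng.random() < 0.5
        packet_size = rng.gauss(900 if malicious else 500, 150)      # informative
        inter_arrival = rng.gauss(0.02 if malicious else 0.05, 0.02)  # informative
        noise = rng.gauss(0, 1)                                       # uninformative
        rows.append([packet_size, inter_arrival, noise])
        labels.append(int(malicious))
    return rows, labels


NAMES = ["packet_size", "inter_arrival", "noise"]
rows, labels = synthetic_flows(2000)
ranking = rank_features(rows, labels, NAMES)
selected = [name for _, name in ranking[:2]]  # keep the two strongest features
print(ranking)
print("selected:", selected)
```

A filter like this discards features that carry no class signal before any model is trained, which is one simple way to reduce dimensionality. The reduced feature set would then be passed to a classifier such as a random forest or a gradient boosting model, whose predicted scores can be evaluated with the same `roc_auc` function on a held-out testing set.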
