Open Access

Applied machine learning in recognition of DGA domain names

Author(s) - Miroslav Stampar, Krešimir Fertalj

Publication year - 2021

Publication title - Computer Science and Information Systems

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.244

H-Index - 24

eISSN - 2406-1018

pISSN - 1820-0214

DOI - 10.2298/csis210104046s

Subject(s) - computer science, artificial intelligence, machine learning, heuristics, domain (mathematical analysis), set (abstract data type), independence (probability theory), feature (linguistics), malware, field (mathematics), binary classification, pattern recognition (psychology), support vector machine, mathematics, computer security, mathematical analysis, linguistics, statistics, philosophy, pure mathematics, programming language, operating system

Recognition of domain names generated by domain generation algorithms (DGAs) is an essential part of malware detection by inspection of network traffic. Besides basic heuristics (HE) and limited detection based on blacklists, the most promising approach seems to be machine learning (ML). There is a lack of studies that extensively compare different ML models in the field of DGA binary classification, including both conventional and deep learning (DL) representatives. The few that exist either focus on a small set of models, use a poor set of features in ML models, or fail to secure unbiased independence between training and evaluation samples. To overcome these limitations, we engineered a robust feature set, and accordingly trained and evaluated 14 ML, 9 DL, and 2 comparative models on two independent datasets. Results show that if ML features are properly engineered, there is only a marginal difference in overall score between top ML and DL representatives. This paper represents the first attempt to neutrally compare the performance of many different models for the recognition of DGA domain names, where the best models perform as well as the top representatives from the literature.
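
To make the idea of engineered features concrete, the following is a minimal sketch of a few lexical features that could be computed for a domain name before it is passed to a conventional ML classifier. The specific features (length, character entropy, digit ratio, vowel ratio) and the function names `shannon_entropy` and `lexical_features` are illustrative assumptions and are not taken from the paper, whose actual feature set is not described in the abstract.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character distribution."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(domain: str) -> dict:
    """Compute a few illustrative lexical features for a domain name.

    Simplification (assumption): only the leftmost label is inspected,
    so "example.com" is reduced to "example". Real systems usually
    handle subdomains and public suffixes more carefully.
    """
    label = domain.lower().rstrip(".").split(".")[0]
    length = len(label)
    letters = sum(ch.isalpha() for ch in label)
    digits = sum(ch.isdigit() for ch in label)
    vowels = sum(ch in VOWELS for ch in label)
    return {
        "length": length,
        "entropy": shannon_entropy(label),
        "digit_ratio": digits / length if length else 0.0,
        "vowel_ratio": vowels / letters if letters else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical inputs: one dictionary-like name, one random-looking name.
    for name in ("example.com", "x7k2qz9v1bt4.net"):
        print(name, lexical_features(name))
```

Feature dictionaries like these could be converted to numeric vectors and fed to any binary classifier. Random-looking names tend to show higher entropy and digit ratio and lower vowel ratio than dictionary-like names, although no single feature is decisive on its own.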
