Detection of Malicious PDF Files Using a Two‐Stage Machine Learning Algorithm

He Kang, Zhu Yuefei, He Yubo, Liu Long, Lu Bin, Lin Wei

Open Access

Publication year - 2020

Publication title - Chinese Journal of Electronics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.267

H-Index - 25

eISSN - 2075-5597

pISSN - 1022-4653

DOI - 10.1049/cje.2020.10.002

Subject(s) - computer science, robustness, artificial intelligence, classifier, machine learning, convolutional neural network, data mining, algorithm, pattern recognition

Portable Document Format (PDF) files are increasingly used to launch cyberattacks because of their popularity and their growing number of vulnerabilities. Many solutions have been developed to detect malicious files, but their accuracy decreases rapidly in the face of new evasion techniques. We explore how to improve the robustness of classifiers that detect adversarial attacks in PDF files. Content replacement and n‐grams are used to extract robust features, following a set of proposed guiding principles. In the two‐stage machine learning model, the objects in a PDF file are divided by type, and an anomaly detection model is first trained for each type individually. The detection results from this first stage are then organized into a tree‐like information structure and used as input to a convolutional neural network. Experimental results show that the accuracy of our classifier is nearly 100% and that it is highly robust against evasive samples. The object features also make it possible to identify the different vulnerabilities exploited in malicious PDF files.
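
As a rough illustration of the feature-extraction idea mentioned in the abstract, the following Python sketch replaces attacker-controllable content in a PDF object with fixed placeholders and then counts token n‐grams. The replacement rules, the placeholder tokens, the tokenizer, and the names `replace_content` and `token_ngrams` are all assumptions made for this example; the abstract does not state the authors' actual rules or n‐gram unit.

```python
import re
from collections import Counter

# Hypothetical replacement rules (not taken from the paper): each rule maps
# a kind of variable content to a fixed placeholder so that trivial edits
# by an attacker do not change the extracted features.
REPLACEMENTS = [
    (re.compile(rb"<[0-9A-Fa-f][0-9A-Fa-f\s]*>"), b"<HEX>"),  # hex strings
    (re.compile(rb"\([^()]*\)"), b"(STR)"),  # literal strings, no nesting
    (re.compile(rb"\d+"), b"N"),  # numeric values
]

# Assumed tokenizer: dictionary delimiters, names/keywords, single delimiters.
TOKEN = re.compile(rb"<<|>>|/?[A-Za-z]+|[<>\[\]()]")


def replace_content(obj: bytes) -> bytes:
    """Apply every replacement rule in order to the raw object bytes."""
    for pattern, placeholder in REPLACEMENTS:
        obj = pattern.sub(placeholder, obj)
    return obj


def token_ngrams(obj: bytes, n: int = 2) -> Counter:
    """Count n-grams over the tokens of the content-replaced object."""
    tokens = TOKEN.findall(replace_content(obj))
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


if __name__ == "__main__":
    a = b"<< /Length 42 /Filter /FlateDecode >>"
    b = b"<< /Length 9001 /Filter /FlateDecode >>"
    # Both objects differ only in a numeric value, so after content
    # replacement they yield identical n-gram counts.
    print(token_ngrams(a) == token_ngrams(b))
```

Under these assumptions, two objects that differ only in numbers or string contents map to the same feature vector, which is one simple way features could be made less sensitive to superficial changes.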
