z-logo
open-access-imgOpen Access
Identification of transparent, compressed and encrypted data in network traffic
Author(s) -
Aleksandr Igorevich Getman,
Мария Кирилловна Иконникова
Publication year - 2021
Publication title -
trudy instituta sistemnogo programmirovaniâ ran/trudy instituta sistemnogo programmirovaniâ
Language(s) - English
Resource type - Journals
eISSN - 2220-6426
pISSN - 2079-8156
DOI - 10.15514/ispras-2021-33(4)-3
Subject(s) - computer science , encryption , data mining , classifier (uml) , identification (biology) , network packet , proposition , class (philosophy) , deep packet inspection , flow network , artificial intelligence , computer network , mathematics , mathematical optimization , philosophy , botany , epistemology , biology
The article is dedicated to the problem of classifying network traffic into three categories: transparent, compressed and opaque, preferably in real-time. It begins with the description of the areas where this problem needs to be solved, then proceeds to the existing solutions with their methods, advantages and limitations. As most of the current research is done either in the area of separating traffic into transparent and opaque or into compressed and encrypted, the need arises to combine a subset of existing methods to unite these two problems into one. As later the main mathematical ideas and suggestions that lie behind the ideas used in the research done by other scientists are described, the list of the best performing of them is composed to be combined together and used as the features for the random forest classificator, which will divide the provided traffic into three classes. The best performing of these features are used, the optimal tree parameters are chosen and, what’s more, the initial three class classifier is divided into two sequential ones to save time needed for classifying in case of transparent packets. Then comes the proposition of the new method to classify the whole network flow as one into one of those three classes, the validity of which is confirmed on several examples of the protocols most specific in this area (SSH, SSL). The article concludes with the directions in which this research is to be continued, mostly optimizing it for real-time classification and obtaining more samples of traffic suitable for experiments and demonstrations.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here