
Bit‐oriented format extraction approach for automatic binary protocol reverse engineering
Author(s) -
Tao Siyu,
Yu Hongyi,
Li Qing
Publication year - 2016
Publication title -
IET Communications
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.355
H-Index - 62
eISSN - 1751-8636
pISSN - 1751-8628
DOI - 10.1049/iet-com.2015.0797
Subject(s) - computer science , reverse engineering , binary number , field (mathematics) , protocol (science) , sorting , cluster analysis , binary data , data mining , algorithm , theoretical computer science , artificial intelligence , mathematics , programming language , arithmetic , pure mathematics
Protocol message format extraction is a principal step in automatic network protocol reverse engineering when the target protocol specification is not available. Binary protocols, however, remain a challenge for existing approaches, which have traditionally targeted text‐based protocols. In this study, the authors propose a novel approach called PRE‐Bin that automatically extracts the fields of binary protocols at the granularity of individual bits. First, a silhouette coefficient is introduced into hierarchical clustering to determine the optimal number of clusters of binary frames. Second, a modified multiple sequence alignment algorithm, in which the matching process and back‐tracing rules are redesigned, is proposed to analyse binary field features. Finally, a Bayes decision model is used to describe field features and determine bit‐oriented field boundaries, with the maximum a posteriori criterion yielding the optimal estimate of those boundaries and hence of the protocol format. The authors implemented a prototype of PRE‐Bin to infer the specification of binary protocols from actual traffic traces. Experimental results indicate that PRE‐Bin effectively extracts binary fields and outperforms existing algorithms.
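The first step of the abstract, choosing the number of frame clusters by silhouette coefficient, can be illustrated with a minimal sketch. This is not the authors' implementation: the normalised Hamming distance, the average-linkage merge rule, the equal-length bit-string frames, and all function names below are assumptions made for illustration only, since the abstract does not specify them.

```python
from itertools import combinations


def hamming(a, b):
    """Fraction of differing bit positions (assumed distance; equal-length frames)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def agglomerate(frames, k):
    """Average-linkage agglomerative clustering down to k clusters of frame indices."""
    clusters = [[i] for i in range(len(frames))]
    while len(clusters) > k:
        def linkage(pair):
            a, b = pair
            total = sum(hamming(frames[i], frames[j])
                        for i in clusters[a] for j in clusters[b])
            return total / (len(clusters[a]) * len(clusters[b]))
        # combinations() yields a < b, so deleting b leaves index a valid.
        a, b = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters


def silhouette(frames, clusters):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for i in cluster:
            if len(cluster) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to the frame's own cluster
            a = sum(hamming(frames[i], frames[j])
                    for j in cluster if j != i) / (len(cluster) - 1)
            # b: mean distance to the nearest other cluster
            b = min(sum(hamming(frames[i], frames[j]) for j in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_cluster_count(frames, k_max):
    """Return the k in [2, k_max] with the highest mean silhouette coefficient."""
    scored = {k: silhouette(frames, agglomerate(frames, k))
              for k in range(2, k_max + 1)}
    return max(scored, key=scored.get)


# Toy frames (hypothetical data): two visibly different message types.
frames = ["00001111", "00001110", "00001101",
          "11110000", "11110001", "11100000"]
print(best_cluster_count(frames, 4))
```

On these toy frames the two-cluster partition separates the two message types, and splitting either group further lowers the mean silhouette, which is the behaviour the selection criterion relies on. Real traces would need a distance that tolerates variable-length frames, which this sketch does not attempt.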