Identifying functions in binary code with reverse extended control flow graphs
Author(s) - Qiu Jing, Su Xiaohong, Ma Peijun
Publication year - 2015
Publication title - Journal of Software: Evolution and Process
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.371
H-Index - 29
eISSN - 2047-7481
pISSN - 2047-7473
DOI - 10.1002/smr.1733
Subject(s) - computer science, control flow graph, control flow, code (set theory), binary number, graph, function (biology), binary code, theoretical computer science, algorithm, programming language, arithmetic, mathematics, set (abstract data type), evolutionary biology, biology
In binary code analysis, current function identification approaches are challenged by functions without explicit call sites and by handcrafted assembly without standard prologues/epilogues. We propose a new function representation called a reverse extended control flow graph (RECFG) and a RECFG‐based method for identifying functions in stripped binary code. Every function has at least one return instruction (an instruction that makes the control flow leave the function), so return instructions are a more reliable anchor than the function prologues and epilogues used by traditional methods. We first build RECFGs from any values in a code range that can be interpreted as return instructions. Then, for each independent RECFG, the multiple‐decision method chooses a subgraph as the control flow graph of a function. A prototype tool is developed and evaluated on seven open source applications, 138 binaries in MASM32 code examples, and 292 binaries in Windows XP SP3. Experimental results show that the proposed method achieves high precision and stable recall, and that it identifies functions that current methods cannot. Copyright © 2015 John Wiley & Sons, Ltd.
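
To make the first step concrete, the sketch below scans a byte range for every offset that could be interpreted as an x86 return instruction. It is only an illustration of collecting return candidates and is not the authors' implementation: the function name `find_return_candidates`, the `base` parameter, and the sample bytes are assumptions introduced here, and building the RECFGs and applying the multiple‐decision method are not shown.

```python
# Illustrative sketch only; names and structure are assumptions, not from the paper.
# x86 return opcodes and their encoded lengths in bytes.
RET_OPCODES = {
    0xC3: 1,  # ret        (near return)
    0xC2: 3,  # ret imm16  (near return, pops imm16 bytes of arguments)
    0xCB: 1,  # retf       (far return)
    0xCA: 3,  # retf imm16 (far return, pops imm16 bytes of arguments)
}


def find_return_candidates(code: bytes, base: int = 0) -> list[tuple[int, int]]:
    """Return (address, length) for each offset that could be read as a return.

    Every byte offset is tested, so values that sit inside data or in the
    middle of another instruction are reported too; a later stage would
    have to decide which candidates are real.
    """
    candidates = []
    for offset, byte in enumerate(code):
        length = RET_OPCODES.get(byte)
        # Skip candidates whose immediate operand would run past the range.
        if length is not None and offset + length <= len(code):
            candidates.append((base + offset, length))
    return candidates


# push ebp; mov ebp, esp; pop ebp; ret; nop; ret 8
sample = bytes([0x55, 0x8B, 0xEC, 0x5D, 0xC3, 0x90, 0xC2, 0x08, 0x00])
for address, length in find_return_candidates(sample, base=0x401000):
    print(f"candidate return at {address:#x}, {length} byte(s)")
```

Testing every offset rather than only decoded instruction boundaries matches the abstract's wording ("any values that can be interpreted as return instructions"), at the cost of producing false candidates that must be filtered afterwards.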
