Open Access
Java Bytecode Control Flow Classification: Framework for Guiding Java Decompilation
Author(s) - Siwadol Sateanpattanakul, Duangpen Jetpipattanapong, Seksan Mathulaprangsan
Publication year - 2021
Publication title - Journal of Mobile Multimedia
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.229
H-Index - 12
eISSN - 1550-4654
pISSN - 1550-4646
DOI - 10.13052/jmm1550-4646.1822
Subject(s) - bytecode , java bytecode , computer science , java , feature (linguistics) , programming language , java modeling language , java annotation , strictfp , frame (networking) , real time java , telecommunications , linguistics , philosophy
Decompilation is an important process in software development, particularly when the source code of a program has been lost and must be recovered. Although Java bytecode is relatively easy to decompile, many Java decompilers cannot fully recover the original source, especially selection statements, i.e., if statements. This deficiency directly affects decompilation performance. In this paper, we propose a framework for guiding a Java decompiler in dealing with this problem. In the framework, Java bytecode is transformed into two kinds of features, called the frame feature and the latent semantic feature. The former is extracted directly from the bytecode. The latter is obtained in two steps: the Java bytecode is first transformed into bigrams and then into term frequency-inverse document frequency (TF-IDF) values. Both the frame feature and the TF-IDF feature are then fed to a genetic algorithm to reduce their dimensions. The proposed feature is obtained by converting the selected TF-IDF feature into a latent semantic feature and concatenating it with the selected frame feature. Finally, a k-nearest neighbors (KNN) classifier is used to classify the proposed feature. The experimental results show a decompilation accuracy of 93.68 percent, which is clearly better than that of Java Decompiler.
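
As a rough illustration of the bigram, TF-IDF, and KNN stages described in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the smoothed IDF variant, the Euclidean distance, the value of k, and the toy opcode sequences and labels are all assumptions made for this example, and the frame feature, genetic-algorithm selection, and latent semantic steps are omitted.

```python
import math
from collections import Counter


def opcode_bigrams(opcodes):
    """Return the adjacent opcode pairs (bigrams) of one opcode sequence."""
    return list(zip(opcodes, opcodes[1:]))


def tfidf_vectors(docs):
    """Map opcode sequences to TF-IDF vectors over their shared bigram vocabulary."""
    bigram_counts = [Counter(opcode_bigrams(d)) for d in docs]
    vocab = sorted({bg for counts in bigram_counts for bg in counts})
    n_docs = len(docs)
    # Document frequency: how many sequences contain each bigram.
    df = {bg: sum(1 for counts in bigram_counts if bg in counts) for bg in vocab}
    vectors = []
    for counts in bigram_counts:
        total = sum(counts.values()) or 1
        # Smoothed IDF; this variant is an assumption, not taken from the paper.
        vectors.append([
            (counts[bg] / total) * (math.log((1 + n_docs) / (1 + df[bg])) + 1)
            for bg in vocab
        ])
    return vocab, vectors


def knn_predict(train_vecs, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest Euclidean neighbours."""
    nearest = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vecs, train_labels)
    )
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy opcode sequences and labels, invented for illustration only.
train_docs = [
    ["iload_1", "ifle", "iconst_1", "istore_2"],
    ["iload_1", "ifge", "iconst_1", "istore_2"],
    ["iload_1", "bipush", "if_icmpge", "iinc", "goto"],
    ["iload_2", "bipush", "if_icmpge", "iinc", "goto"],
]
train_labels = ["if", "if", "loop", "loop"]
query_doc = ["iload_3", "bipush", "if_icmpge", "iinc", "goto"]

# For simplicity the query is vectorized together with the training sequences.
_, vectors = tfidf_vectors(train_docs + [query_doc])
prediction = knn_predict(vectors[:-1], train_labels, vectors[-1], k=3)
print(prediction)
```

In this sketch the query shares three bigrams with the two sequences labelled "loop" and none with those labelled "if", so the nearest-neighbour vote is driven by the shared control flow pattern rather than by the individual opcodes.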
