Using discriminative feature in software entities for relevance identification of code changes


Author(s) - Huang Yuan, Chen Xiangping, Liu Zhiyong, Luo Xiaonan, Zheng Zibin

Publication year - 2017

Publication title - Journal of Software: Evolution and Process

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.371

H-Index - 29

eISSN - 2047-7481

pISSN - 2047-7473

DOI - 10.1002/smr.1859

Subject(s) - commit, discriminative model, computer science, artificial intelligence, relevance (law), identification (biology), feature (linguistics), machine learning, code (set theory), source code, data mining, pattern recognition (psychology), programming language, database, linguistics, philosophy, botany, set (abstract data type), political science, law, biology

Developers often bundle unrelated changes (e.g., a bug fix and a feature addition) in a single commit and then submit a “poorly cohesive” commit to the version control system. Such a commit consists of multiple independent code changes and makes code review harder. If code changes can be identified as related or unrelated before they are committed, the “cohesiveness” of a commit can be guaranteed. Inspired by the effectiveness of machine learning techniques in classification, we model the relevance identification of code changes as a binary classification problem (i.e., related and unrelated changes) and propose discriminative features in software entities to characterize the relevance of code changes. In particular, to quantify the discriminative features, 21 coupling rules and 4 co-changed type relationships are extracted from software entities to construct a related changes vector (RCV). The 21 coupling rules, at the granularities of class, attribute, and method, capture the relevance of code changes along the structural coupling dimension, and the 4 co-changed type relationships capture the change type combinations of software entities that may cause related changes. Based on the RCV, machine learning algorithms are applied to identify the relevance of code changes. The experimental results show that the probabilistic neural network and the general regression neural network provide statistically significant improvements in the accuracy of relevance identification over the other 4 machine learning algorithms. The related changes vector with 72 dimensions (RCV72) outperforms the other 2 RCVs with fewer dimensions.
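
The abstract does not list the individual coupling rules, the co-changed type relationships, or the network configuration, so the following is only a minimal sketch of the general idea: a pair of code changes is encoded as a feature vector, and a probabilistic neural network (implemented here as a textbook Parzen-window classifier with Gaussian kernels) labels the pair as related or unrelated. The four binary features, the training vectors, the `sigma` value, and all names in the code are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence

RELATED, UNRELATED = 1, 0


def gaussian_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """Gaussian kernel on the squared Euclidean distance between two vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


class ProbabilisticNeuralNetwork:
    """Textbook PNN: one pattern unit per training vector, one summation unit per class."""

    def __init__(self, sigma: float = 0.5) -> None:
        self.sigma = sigma  # kernel width; an assumed value, not from the paper
        self.patterns: Dict[int, List[List[float]]] = {}

    def fit(self, vectors: Sequence[Sequence[float]], labels: Sequence[int]):
        # "Training" a PNN only stores the vectors, grouped by class label.
        self.patterns = {}
        for vector, label in zip(vectors, labels):
            self.patterns.setdefault(label, []).append(list(vector))
        return self

    def class_scores(self, x: Sequence[float]) -> Dict[int, float]:
        # Each class score is the mean kernel response over that class's patterns.
        return {
            label: sum(gaussian_kernel(x, p, self.sigma) for p in stored) / len(stored)
            for label, stored in self.patterns.items()
        }

    def predict(self, x: Sequence[float]) -> int:
        # Decision unit: pick the class with the largest score.
        scores = self.class_scores(x)
        return max(scores, key=scores.get)


# Hypothetical toy vectors: each position stands for one made-up binary feature
# of a pair of changes (for example, "one changed method calls the other").
training_vectors = [
    [1, 1, 0, 1], [1, 0, 1, 1], [1, 1, 1, 0],  # pairs labeled as related
    [0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0],  # pairs labeled as unrelated
]
training_labels = [RELATED, RELATED, RELATED, UNRELATED, UNRELATED, UNRELATED]

model = ProbabilisticNeuralNetwork(sigma=0.5).fit(training_vectors, training_labels)

print(model.predict([1, 1, 1, 1]))  # a pair with many coupling features set
print(model.predict([0, 0, 0, 1]))  # a pair with almost no coupling features set
```

In this toy setup the first query is classified as related and the second as unrelated, because each lies closer to the stored vectors of that class. A real RCV would have far more dimensions, and `sigma` would need tuning on held-out data.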
