
Manifold embedded distribution adaptation for cross‐project defect prediction
Author(s) - Sun Ying, Jing XiaoYuan, Wu Fei, Sun Yanfei
Publication year - 2020
Publication title - IET Software
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.305
H-Index - 43
eISSN - 1751-8814
pISSN - 1751-8806
DOI - 10.1049/iet-sen.2019.0389
Subject(s) - subspace topology , manifold alignment , conditional probability distribution , joint probability distribution , feature (linguistics) , domain (mathematical analysis) , domain adaptation , manifold (fluid mechanics) , distribution (mathematics) , computer science , adaptation (eye) , marginal distribution , data mining , artificial intelligence , transfer of learning , feature vector , machine learning , pattern recognition (psychology) , nonlinear dimensionality reduction , mathematics , dimensionality reduction , statistics , engineering , mechanical engineering , mathematical analysis , linguistics , philosophy , physics , optics , random variable , classifier (uml)
Cross‐project defect prediction (CPDP) refers to constructing a prediction model that predicts the instance labels of a target project by utilising labelled data from an external project. The main challenge for CPDP methods is the distribution difference between data from different projects. Transfer learning can transfer knowledge from the source domain to the target domain with the aim of minimising the difference between the domains. However, most existing methods reduce the distribution discrepancy in the original feature space, where the features are high‐dimensional and non‐linear, which makes it hard to reduce the distribution distance between projects. Moreover, previous works mainly consider either the marginal or the conditional distribution difference, not both. In this study, the authors propose a manifold embedded distribution adaptation (MDA) approach to narrow the distribution gap in a manifold feature subspace. MDA maps source and target project data to a manifold subspace and then performs joint distribution adaptation of the conditional and marginal distributions in that subspace. To evaluate the effectiveness of MDA, the authors perform extensive experiments on 20 public projects with three indicators. The experimental results show that MDA improves the average performance, but the improvement is not statistically significant in comparison to HYDRA (one of the baselines).
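
The abstract describes measuring both the marginal and the conditional distribution gap between a source and a target project. The sketch below is only an illustration of that idea and is not the authors' MDA implementation: it omits the manifold mapping entirely and assumes the feature vectors are already in some shared subspace. The discrepancy measure (squared distance between sample means, i.e. a linear-kernel maximum mean discrepancy), the function names, the use of pseudo-labels for the target project, and the trade-off weight `mu` are all assumptions made for this example.

```python
from statistics import fmean


def mean_vector(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def linear_mmd2(xs, xt):
    """Squared distance between the sample means of xs and xt.

    This equals the squared maximum mean discrepancy under a linear kernel.
    """
    mu_s, mu_t = mean_vector(xs), mean_vector(xt)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def joint_discrepancy(xs, ys, xt, yt_pseudo, mu=0.5):
    """Weighted sum of marginal and class-conditional discrepancy.

    xs, ys     -- source features and their true labels
    xt         -- target features
    yt_pseudo  -- pseudo-labels for the target (assumed to come from some
                  classifier trained on the source; hypothetical here)
    mu         -- hypothetical trade-off weight in [0, 1]
    """
    marginal = linear_mmd2(xs, xt)
    conditional = 0.0
    # Only classes present in both domains contribute a conditional term.
    for c in sorted(set(ys) & set(yt_pseudo)):
        xs_c = [x for x, y in zip(xs, ys) if y == c]
        xt_c = [x for x, y in zip(xt, yt_pseudo) if y == c]
        conditional += linear_mmd2(xs_c, xt_c)
    return (1 - mu) * marginal + mu * conditional


# Toy data: two metrics per module, label 1 = defective, 0 = clean.
source_x = [[0.0, 0.0], [2.0, 2.0]]
source_y = [0, 1]
target_x = [[1.0, 1.0], [3.0, 3.0]]
target_pseudo_y = [0, 1]

print(joint_discrepancy(source_x, source_y, target_x, target_pseudo_y))
```

A smaller value suggests the two projects are better aligned under this measure; an adaptation method would search for a feature transformation that reduces it.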