
iSMNN: batch effect correction for single-cell RNA-seq data via iterative supervised mutual nearest neighbor refinement
Author(s) -
Yuchen Yang,
Gang Li,
Yifang Xie,
Li Wang,
Taylor M. Lagler,
Yingxi Yang,
Jiandong Liu,
Qian Li,
Yun Li
Publication year - 2021
Publication title -
Briefings in Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.204
H-Index - 113
eISSN - 1477-4054
pISSN - 1467-5463
DOI - 10.1093/bib/bbab122
Subject(s) - computer science, benchmarking, identification (biology), data mining, artificial intelligence, machine learning, pattern recognition (psychology), biology, botany, marketing, business
Batch effect correction is an essential step in the integrative analysis of multiple single-cell RNA-sequencing (scRNA-seq) datasets. One state-of-the-art strategy for batch effect correction relies on unsupervised or supervised detection of mutual nearest neighbors (MNNs). However, both types of methods detect MNNs only across batches of uncorrected data, where large batch effects may distort the MNN search. To address this issue, we present iSMNN, a batch effect correction approach that iteratively refines supervised MNNs on data that have already been corrected. Our benchmarking on both simulated and real datasets shows that iteratively refining MNNs improves correction performance. Compared to popular alternative methods, iSMNN better mixes cells of the same cell type across batches. In addition, iSMNN facilitates the identification of differentially expressed genes (DEGs) that are relevant to the biological function of certain cell types. These results indicate that iSMNN is a valuable method for integrating multiple scRNA-seq datasets and can facilitate biological and medical studies at the single-cell level.
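
The idea of iteratively refining supervised MNNs can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the parameters `k` and `n_iter`, and the use of one global mean shift as the correction step are all assumptions made for illustration only. The sketch assumes two batches given as cell-by-feature NumPy arrays with a cell-type label for each cell.

```python
import numpy as np


def mutual_nearest_neighbors(a, b, k=3):
    """Return index pairs (i, j) where a[i] and b[j] are among each
    other's k nearest neighbors (Euclidean distance)."""
    # Pairwise squared distances, shape (len(a), len(b)).
    d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    k_ab = min(k, b.shape[0])
    k_ba = min(k, a.shape[0])
    # For each cell in a, the indices of its nearest cells in b.
    nn_ab = np.argsort(d, axis=1)[:, :k_ab]
    # For each cell in b, the indices of its nearest cells in a.
    nn_ba = np.argsort(d, axis=0)[:k_ba, :].T
    a_to_b = np.zeros(d.shape, dtype=bool)
    a_to_b[np.arange(a.shape[0])[:, None], nn_ab] = True
    b_to_a = np.zeros(d.shape, dtype=bool)
    b_to_a[nn_ba, np.arange(b.shape[0])[:, None]] = True
    # A pair is mutual when both directions agree.
    return np.argwhere(a_to_b & b_to_a)


def supervised_mnn(a, b, labels_a, labels_b, k=3):
    """Search for MNNs only among cells that share a cell-type label."""
    pairs = []
    for lab in np.intersect1d(labels_a, labels_b):
        ia = np.flatnonzero(labels_a == lab)
        ib = np.flatnonzero(labels_b == lab)
        local = mutual_nearest_neighbors(a[ia], b[ib], k)
        # Map within-label indices back to indices in the full batches.
        pairs.extend((ia[i], ib[j]) for i, j in local)
    return np.array(pairs, dtype=int).reshape(-1, 2)


def iterative_correction(a, b, labels_a, labels_b, k=3, n_iter=3):
    """Alternate between finding supervised MNNs on the current corrected
    data and shifting batch b toward batch a.

    Assumption: the correction is a single global mean shift, a deliberate
    simplification standing in for a real correction step.
    """
    b_corr = b.copy()
    for _ in range(n_iter):
        # MNNs are re-detected on the corrected data, not the raw data.
        pairs = supervised_mnn(a, b_corr, labels_a, labels_b, k)
        if len(pairs) == 0:
            break
        shift = (a[pairs[:, 0]] - b_corr[pairs[:, 1]]).mean(axis=0)
        b_corr = b_corr + shift
    return b_corr
```

In this sketch, the first pass finds MNN pairs on uncorrected data, where the pairs tend to sit on the facing edges of the two batches, so the estimated shift is typically too small. Later passes search again on the partially corrected data, which is the refinement step the abstract describes.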