Open Access
Intrinsic entropy model for feature selection of scRNA-seq data
Author(s) - Lin Li, Hui Tang, Rui Xia, Hao Dai, Rui Liu, Luonan Chen
Publication year - 2022
Publication title - Journal of Molecular Cell Biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.825
H-Index - 62
eISSN - 1674-2788
pISSN - 1759-4685
DOI - 10.1093/jmcb/mjac008
Subject(s) - cluster analysis, entropy (arrow of time), feature selection, computer science, gene, artificial intelligence, pattern recognition (psychology), data mining, computational biology, biology, genetics, physics, quantum mechanics
Recent advances in single-cell RNA sequencing (scRNA-seq) technologies have led to extensive study of cellular heterogeneity and cell-to-cell variation. However, the high frequency of dropout events and noise in scRNA-seq data compromises the accuracy of downstream analysis, in particular clustering analysis, whose accuracy depends heavily on the selected feature genes. Here, by deriving an entropy decomposition formula, we proposed a feature selection method, the intrinsic entropy (IE) model, to identify informative genes for accurate clustering analysis. Specifically, by eliminating the ‘noisy’ fluctuation, or extrinsic entropy (EE), we extracted the IE of each gene from its total entropy (TE), i.e. TE = IE + EE. We showed that the IE of each gene reflects the regulatory fluctuation of that gene in a cellular process, and thus high-IE genes provide rich information for cell type or state analysis. To validate the performance of the high-IE genes, we conducted computational analyses on both simulated datasets and real single-cell datasets and compared the results with those of other representative methods. The results show that our IE model is not only broadly applicable and robust across different clustering and classification methods, but also sensitive to novel cell types. Our results also demonstrate that the intrinsic entropy/fluctuation of a gene serves as information rather than noise, in contrast to its total entropy/fluctuation.
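
The abstract does not state the decomposition formula itself, so the following is only a rough illustration of how a gene's total entropy can be split additively into an informative part and a noise-like part. It is a minimal sketch that swaps in the standard information-theoretic identity H(X) = I(X; G) + H(X | G), where X is a discretised expression level and G is a known cell grouping. The grouping labels, the discretisation, and the names `entropy`, `decompose`, `marker` and `noisy` are all assumptions made for this example; this is not the authors' IE model, which does not necessarily rely on known labels.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def decompose(expr, groups):
    """Split H(expr) into a between-group and a within-group part.

    Uses the standard identity H(X) = I(X; G) + H(X | G).
    This is an illustrative analogue of TE = IE + EE only,
    NOT the decomposition derived in the paper.
    """
    n = len(expr)
    total = entropy(expr)
    # H(X | G): entropy inside each group, weighted by group size.
    within = 0.0
    for g in set(groups):
        members = [x for x, label in zip(expr, groups) if label == g]
        within += (len(members) / n) * entropy(members)
    # I(X; G): the part of the total entropy explained by the grouping.
    between = total - within
    return total, between, within


# Hypothetical toy data: 8 cells in two groups, expression already discretised.
groups = ["A"] * 4 + ["B"] * 4
marker = [0, 0, 0, 0, 3, 3, 3, 3]  # varies between groups only
noisy = [0, 3, 0, 3, 0, 3, 0, 3]   # varies within groups only

for name, expr in [("marker", marker), ("noisy", noisy)]:
    total, between, within = decompose(expr, groups)
    print(f"{name}: total={total:.2f} between={between:.2f} within={within:.2f}")
```

Both toy genes have the same total entropy (1 bit), yet for `marker` all of it lies between groups while for `noisy` all of it lies within groups. This mirrors the abstract's point that total entropy alone cannot separate informative fluctuation from noise, whereas one component of an additive decomposition can.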
