
MRHCA: a nonparametric statistics based method for hub and co‐expression module identification in large gene co‐expression network
Author(s) - Zhang Yu, Cao Sha, Zhao Jing, Alsaihati Burair, Ma Qin, Zhang Chi
Publication year - 2018
Publication title - Quantitative Biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.707
H-Index - 15
eISSN - 2095-4697
pISSN - 2095-4689
DOI - 10.1007/s40484-018-0131-z
Subject(s) - nonparametric statistics , gene co expression network , gene , computational biology , identification (biology) , expression (computer science) , transcriptome , data mining , gene expression , biology , gene regulatory network , computer science , genetics , statistics , mathematics , gene ontology , botany , programming language
Background: Gene co‐expression and differential co‐expression analyses are increasingly used to study co‐functional and co‐regulatory biological mechanisms in large‐scale transcriptomics data sets.

Methods: In this study, we develop MRHCA, a nonparametric approach that identifies hub genes and modules in a large co‐expression network with low computational and memory cost.

Results: We applied the method to simulated transcriptomics data sets and demonstrated that MRHCA can accurately identify hub genes and estimate the size of co‐expression modules. By applying MRHCA and differential co‐expression analysis to E. coli and TCGA cancer data, we identified significant condition‐specific activated genes in E. coli and distinct gene expression regulatory mechanisms between cancer types with high copy number variation and those with small somatic mutations.

Conclusion: Our analysis demonstrated that, compared with existing methods, MRHCA can (i) deal with large association networks, (ii) rigorously assess statistical significance for hubs and module sizes, (iii) identify co‐expression modules with low associations, (iv) detect small and significant modules, and (v) allow genes to be present in more than one module.
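To make the idea of nonparametric, rank‐based hub detection in a co‐expression network more concrete, the following is a minimal Python sketch. It is not the MRHCA procedure, which the abstract does not specify. It is a generic illustration that assumes Spearman correlation as the association measure and the mutual rank between gene pairs as the rank statistic. The function names (`rank_rows`, `spearman_matrix`, `mutual_rank`, `hub_scores`) and the `mr_cutoff` parameter are hypothetical, and the sketch performs no significance testing.

```python
import numpy as np

def rank_rows(x):
    """Rank the values within each row (1 = smallest); ties are broken by position."""
    order = np.argsort(x, axis=1)
    ranks = np.empty_like(order)
    rows = np.arange(x.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, x.shape[1] + 1)
    return ranks

def spearman_matrix(expr):
    """Gene-by-gene Spearman correlation for a (genes x samples) matrix."""
    # Pearson correlation of the per-gene sample ranks (assumes few or no ties).
    return np.corrcoef(rank_rows(expr).astype(float))

def mutual_rank(corr):
    """Geometric mean of the rank of b among a's partners and of a among b's."""
    c = corr.copy()
    np.fill_diagonal(c, -np.inf)   # a gene is never its own best partner
    r = rank_rows(-c)              # rank 1 = most strongly correlated partner
    return np.sqrt(r * r.T)

def hub_scores(expr, mr_cutoff=5.0):
    """Illustrative hub score (an assumption, not the paper's statistic):
    the number of partners whose mutual rank is at or below mr_cutoff."""
    mr = mutual_rank(spearman_matrix(expr))
    np.fill_diagonal(mr, np.inf)   # ignore self-pairs when counting
    return (mr <= mr_cutoff).sum(axis=1)

# Toy data: 5 genes driven by one shared signal, plus 20 unrelated genes.
rng = np.random.default_rng(0)
latent = rng.normal(size=60)
module = latent + 0.2 * rng.normal(size=(5, 60))
noise = rng.normal(size=(20, 60))
expr = np.vstack([module, noise])

print(hub_scores(expr, mr_cutoff=4.0))
```

Because the sketch uses only ranks, it makes no distributional assumption about expression values. A method such as MRHCA additionally assesses the statistical significance of hubs and module sizes, which this toy score does not attempt.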