
Multiple‐cumulative probabilities used to cluster and visualize transcriptomes
Author(s) - Jia Xingang, Liu Yisu, Han Qiuhong, Lu Zuhong
Publication year - 2017
Publication title - FEBS Open Bio
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.718
H-Index - 31
ISSN - 2211-5463
DOI - 10.1002/2211-5463.12327
Subject(s) - cluster (spacecraft), transcriptome, computational biology, biology, computer science, genetics, gene expression, gene, programming language
Clustering and visualizing gene expression data plays a central role in obtaining biological knowledge. Here, we used Pearson's correlation coefficient of multiple‐cumulative probabilities (PCC‐MCP) of genes to define the similarity of gene expression behaviors. To address the challenge of high‐dimensional MCPs, we used icc‐cluster, a clustering algorithm that obtains solutions by iterating clustering centers, with PCC‐MCP to group genes. We then used t‐distributed stochastic neighbor embedding (t‐SNE) of KC‐data to generate optimal maps for clusters of MCP (t‐SNE‐MCP‐O maps). In analyses of several transcriptome data sets, icc‐cluster with PCC‐MCP showed clear advantages over commonly used clustering methods. t‐SNE‐MCP‐O maps also gave clearly separated boundaries between PCC‐MCP clusters, which made the relationships between clusters easy to visualize and understand.
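The general idea of grouping genes by iterating cluster centers under a Pearson-correlation similarity can be sketched as follows. This is only an illustration, not the authors' icc‐cluster: the abstract does not define how MCPs are computed or how icc‐cluster updates its centers, so the sketch takes per-gene profile vectors as given (stand-ins for MCPs) and uses a k-means-style update. The function names `pearson` and `iterate_centers` and the toy data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: correlation is undefined, treat as 0
    return cov / (sx * sy)

def iterate_centers(profiles, init_centers, max_iter=100):
    """Assumed k-means-style loop (not the published icc-cluster):
    assign each profile to its most correlated center, replace each
    center by the mean of its members, stop when labels are stable."""
    centers = [list(c) for c in init_centers]  # copy so the input is not mutated
    labels = None
    for _ in range(max_iter):
        new_labels = [
            max(range(len(centers)), key=lambda k: pearson(p, centers[k]))
            for p in profiles
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy profiles (hypothetical stand-ins for MCP vectors):
# two rising genes and two falling genes.
profiles = [
    [0.1, 0.4, 0.9],
    [0.2, 0.5, 1.0],
    [0.9, 0.5, 0.1],
    [1.0, 0.6, 0.3],
]
labels, centers = iterate_centers(profiles, [profiles[0], profiles[2]])
print(labels)
```

Using correlation rather than Euclidean distance means two profiles with the same shape but different offsets are treated as similar, which matches the abstract's stated aim of comparing expression behaviors rather than absolute levels.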