Biclustering by sparse canonical correlation analysis

Open Access

Author(s) - Pimentel Harold, Hu Zhiyue, Huang Haiyan

Publication year - 2018

Publication title - Quantitative Biology

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.707

H-Index - 15

eISSN - 2095-4697

pISSN - 2095-4689

DOI - 10.1007/s40484-017-0127-0

Subject(s) - biclustering, canonical correlation, cluster analysis, inference, set (abstract data type), computer science, correlation, multivariate statistics, data mining, data set, covariance matrix, computational biology, pattern recognition (psychology), artificial intelligence, mathematics, machine learning, biology, algorithm, correlation clustering, cure data clustering algorithm, geometry, programming language

Background: Developing appropriate computational tools to distill biological insights from large-scale gene expression data has been an important part of systems biology. Considering that gene relationships may change or only exist in a subset of collected samples, biclustering that involves clustering both genes and samples has become increasingly important, especially when the samples are pooled from a wide range of experimental conditions.

Methods: In this paper, we introduce a new biclustering algorithm to find subsets of genomic expression features (EFs) (e.g., genes, isoforms, exon inclusion) that show strong “group interactions” under certain subsets of samples. Group interactions are defined by strong partial correlations, or equivalently, conditional dependencies between EFs after removing the influences of a set of other functionally related EFs. Our new biclustering method, named SCCA-BC, extends an existing method for group interaction inference, which is based on sparse canonical correlation analysis (SCCA) coupled with repeated random partitioning of the gene expression data set.

Results: SCCA-BC gives sensible results on real data sets and outperforms most existing methods in simulations. Software is available at https://github.com/pimentel/scca-bc.

Conclusions: SCCA-BC seems to work in numerous conditions and the results seem promising for future extensions. SCCA-BC has the ability to find different types of bicluster patterns, and it is especially advantageous in identifying a bicluster whose elements share the same progressive and multivariate normal distribution with a dense covariance matrix.
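
To make the Methods description more concrete, the following is a minimal sketch of the general idea of sparse CCA coupled with repeated random partitioning of the features. It is not the authors' SCCA-BC implementation (that is the software linked above), and it covers only the feature side: it does not select a subset of samples. The function names (`soft_threshold`, `sparse_cca`, `selection_frequency`), the penalty parameter `lam`, the simulated data, and the use of a soft-thresholded power iteration with a diagonal-covariance approximation are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, lam):
    """Shrink each entry of v toward zero by lam (the L1 proximal operator)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def _unit(v):
    """Scale v to unit Euclidean norm; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def sparse_cca(X, Y, lam_u=0.5, lam_v=0.5, n_iter=50):
    """Rank-1 sparse CCA by alternating soft-thresholded power iterations.

    Assumption: within-set covariances are treated as identity, so only the
    cross-correlation matrix is used. lam_u and lam_v lie in [0, 1) and are
    expressed relative to the largest absolute entry at each update.
    """
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    Y = (Y - Y.mean(axis=0)) / (Y.std(axis=0) + 1e-12)
    C = X.T @ Y / X.shape[0]  # cross-correlation, shape (p, q)
    # Initialise v with the leading right singular vector of C.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    for _ in range(n_iter):
        a = C @ v
        u = _unit(soft_threshold(a, lam_u * np.abs(a).max()))
        b = C.T @ u
        v = _unit(soft_threshold(b, lam_v * np.abs(b).max()))
    return u, v


def selection_frequency(data, n_splits=50, lam=0.5, seed=0):
    """Fraction of random feature splits in which each feature gets a nonzero weight.

    data has shape (samples, features). In each round the features are split
    at random into two halves and sparse CCA is run between the halves.
    """
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    counts = np.zeros(n_features)
    for _ in range(n_splits):
        perm = rng.permutation(n_features)
        left, right = perm[: n_features // 2], perm[n_features // 2:]
        u, v = sparse_cca(data[:, left], data[:, right], lam, lam)
        counts[left[u != 0]] += 1
        counts[right[v != 0]] += 1
    return counts / n_splits


# Simulated example (hypothetical data): the first 10 of 40 features share a
# latent factor, so they have a dense covariance; the rest are pure noise.
rng = np.random.default_rng(1)
n_samples, n_features, n_signal = 60, 40, 10
data = rng.normal(size=(n_samples, n_features))
latent = rng.normal(size=(n_samples, 1))
data[:, :n_signal] = latent + 0.5 * data[:, :n_signal]

freq = selection_frequency(data)
print(np.round(freq, 2))
```

Features that are repeatedly given nonzero weight across random partitions are candidates for a jointly interacting group. Restricting the analysis to the subset of samples in which that group interaction holds is the part that a biclustering method such as SCCA-BC adds, and it is not attempted in this sketch.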
