Covariances Simultaneous Component Analysis: a new method within a framework for modeling covariances
Author(s) - Smilde Age K., Timmerman Marieke E., Saccenti Edoardo, Jansen Jeroen J., Hoefsloot Huub C. J.
Publication year - 2015
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.2707
Subject(s) - covariance, biological data, chemometrics, computer science, principal component analysis, data mining, component (thermodynamics), set (abstract data type), machine learning, mathematics, artificial intelligence, statistics, bioinformatics, biology, programming language, thermodynamics, physics
In modern omics research, it is more rule than exception that multiple data sets are collected in a study pertaining to the same biological organism. In such cases, it is worthwhile to analyze all data tables simultaneously to arrive at global information of the biological system. This is the area of data fusion or multi‐set analysis, which is a lively research topic in chemometrics, bioinformatics, and biostatistics. Most methods of analyzing such complex data focus on group means, treatment effects, or time courses. There is also information present in the covariances among variables within a group, because this relates directly to individual differences, heterogeneity of responses, and changes of regulation in the biological system. We present a framework for analyzing covariance matrices and a new method that fits nicely in this framework. This new method is based on combining covariance prototypes using simultaneous components and is, therefore, coined Covariances Simultaneous Component Analysis (COVSCA). We present the framework and our new method in mathematical terms, thereby explaining the (dis)similarities of the methods. Systems biology models based on differential equations illustrate the type of variation generated in real‐life biological systems and how this type of variation can be modeled within the framework and with COVSCA. The method is subsequently applied to two real‐life data sets from human and plant metabolomics studies showing biologically meaningful results. Copyright © 2015 John Wiley & Sons, Ltd.
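As a rough illustration of what "combining covariance prototypes" can mean, the sketch below computes one covariance matrix per data table and then expresses each matrix as a weighted sum of a few prototype matrices. It is a minimal sketch under stated assumptions and not the COVSCA algorithm from the paper: the prototypes are assumed to be known and fixed in advance (the abstract does not say how they are estimated), the weights are found by ordinary least squares, and the names `group_covariances`, `fit_prototype_weights`, `z1`, and `z2` are hypothetical.

```python
import numpy as np


def group_covariances(groups):
    """Return one (variables x variables) covariance matrix per data table.

    Each table is assumed to hold samples in rows and variables in columns.
    """
    return [np.cov(X, rowvar=False) for X in groups]


def fit_prototype_weights(covs, prototypes):
    """Least-squares weights that express each covariance matrix as a
    weighted sum of FIXED prototype matrices (hypothetical helper; the
    prototypes are an assumption here, not estimated from the data)."""
    # Vectorise every prototype into one column of the design matrix.
    A = np.column_stack([P.ravel() for P in prototypes])
    # Vectorise every covariance matrix into one column of the target.
    B = np.column_stack([S.ravel() for S in covs])
    W, *_ = np.linalg.lstsq(A, B, rcond=None)
    return W.T  # shape: (number of groups, number of prototypes)


# Two hypothetical loading vectors define rank-1 prototypes z z^T.
z1 = np.array([[1.0], [1.0], [0.0], [0.0]])
z2 = np.array([[0.0], [0.0], [1.0], [-1.0]])
prototypes = [z1 @ z1.T, z2 @ z2.T]

# Build three noise-free covariance matrices from known weights.
true_weights = np.array([[2.0, 0.5], [0.5, 2.0], [1.0, 1.0]])
covs_exact = [w[0] * prototypes[0] + w[1] * prototypes[1] for w in true_weights]

# Recover the weights; each row describes one group in "prototype space".
weights = fit_prototype_weights(covs_exact, prototypes)

# Covariance matrices can also be computed from raw (here simulated) tables.
rng = np.random.default_rng(0)
tables = [rng.normal(size=(50, 4)) for _ in range(3)]
covs_from_data = group_covariances(tables)
```

Each row of `weights` gives one group's coordinates with respect to the prototypes, so groups with similar covariance structure end up close together. A real analysis would also have to estimate the prototypes themselves and would typically constrain the weights, which this sketch does not attempt.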