Model identification in presence of incomplete information by generalized principal component analysis: Application to the common and differential responses of Escherichia coli to multiple pulse perturbations in continuous, high‐biomass density culture
Author(s) - Guebel Daniel V., Cánovas Manuel, Torres Néstor V.
Publication year - 2009
Publication title - Biotechnology and Bioengineering
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.136
H-Index - 189
eISSN - 1097-0290
pISSN - 0006-3592
DOI - 10.1002/bit.22438
Subject(s) - principal component analysis , biological system , perturbation (astronomy) , covariance , mathematics , residual , ordination , statistics , biology , algorithm , physics , quantum mechanics
Abstract - In a previous report we described a multivariate approach to discriminate between the different response mechanisms operating in Escherichia coli when a steady, continuous culture of these bacteria was perturbed by a glycerol pulse (Guebel et al., 2009, Biotechnol Bioeng 102: 910–922). Herein, we present a procedure to extend this analysis to cases where multiple, spaced pulse perturbations (glycerol, fumarate, acetate, crotonobetaine, hypersaline plus high‐glycerol basal medium, and crotonobetaine plus hypersaline basal medium) are being assessed. The proposed method allows us not only to identify the responses common to different perturbation conditions, but also to recognize the specific response to a given stimulus even when the dynamics of the perturbation is unknown. Components common to all conditions are determined first by Generalized Principal Components Analysis (GPCA) applied to a set of covariance matrices. A metric is then built to quantify the similarity distance, based on the degree of variance extraction achieved for each variable by the common factors during the GPCA deflation process. This permits a cluster analysis, which identifies several compact sub‐sets containing only the most closely related responsive groups. The GPCA is then run again, restricted to the groups in each sub‐set. Finally, after the data have been exhaustively deflated by the common sub‐set factors, the resulting residual matrices are used to determine the specific response factors by classical principal component analysis (PCA). The proposed method was validated by comparing its predictions with those obtained when the dynamics of the perturbation had been determined. In addition, it performed better than other multivariate alternatives (e.g., orthogonal contrasts based on direct GPCA, Tucker‐3 model, PARAFAC, etc.). Biotechnol. Bioeng. 2009; 104: 785–795 © 2009 Wiley Periodicals, Inc.
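The pipeline outlined in the abstract (common components, deflation, a variance-extraction distance, then PCA on the residuals) can be illustrated with a minimal sketch. The abstract does not specify the GPCA algorithm, so this sketch assumes a simple stand-in: the common axes are taken as the leading eigenvectors of the averaged covariance matrix. The function names, the synthetic data, and the Euclidean distance between variance-extraction profiles are all hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def common_components(covs, k):
    """Stand-in for GPCA (assumption): top-k eigenvectors of the mean covariance."""
    pooled = np.mean(covs, axis=0)
    vals, vecs = np.linalg.eigh(pooled)
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order]  # orthonormal columns

def deflate(cov, W):
    """Project out the directions in W, leaving a residual covariance matrix."""
    P = np.eye(cov.shape[0]) - W @ W.T
    return P @ cov @ P

def variance_extracted(cov, W):
    """Per-variable share of variance removed by the common axes.
    Values can leave [0, 1] when W is not an exact eigenbasis of cov."""
    return 1.0 - np.diag(deflate(cov, W)) / np.diag(cov)

def specific_components(cov, W, k):
    """Classical PCA on the residual matrix left after deflation."""
    vals, vecs = np.linalg.eigh(deflate(cov, W))
    order = np.argsort(vals)[::-1][:k]
    return vals[order], vecs[:, order]

# Synthetic data (hypothetical): three conditions sharing one loading vector,
# each with its own condition-specific loading vector plus small noise.
rng = np.random.default_rng(0)
n_vars, n_obs = 5, 200
shared = rng.normal(size=n_vars)
covs = []
for _ in range(3):
    own = rng.normal(size=n_vars)
    scores = rng.normal(size=(n_obs, 2)) * [3.0, 1.5]
    X = scores @ np.vstack([shared, own]) + 0.1 * rng.normal(size=(n_obs, n_vars))
    covs.append(np.cov(X, rowvar=False))
covs = np.array(covs)

# Step 1: common axes across all conditions.
W = common_components(covs, k=1)
# Step 2: variance-extraction profile per condition, then pairwise distances
# that a clustering routine could consume.
profiles = np.array([variance_extracted(c, W) for c in covs])
dist = np.linalg.norm(profiles[:, None, :] - profiles[None, :, :], axis=-1)
# Step 3: condition-specific factor from the residual of the first condition.
spec_vals, spec_vecs = specific_components(covs[0], W, k=1)
```

In the procedure described above, the distance matrix would feed a cluster analysis, and the common-component step would be repeated within each resulting sub‐set before the residual PCA; the sketch performs a single pass only.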