New insights into the meaning and usefulness of principal component analysis of concatenated trajectories
Author(s) - Pierdominici-Sottile Gustavo, Palma Juliana
Publication year - 2015
Publication title - Journal of Computational Chemistry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.907
H-Index - 188
eISSN - 1096-987X
pISSN - 0192-8651
DOI - 10.1002/jcc.23811
Subject(s) - principal component analysis , eigenvalues and eigenvectors , correctness , trajectory , simple (philosophy) , matrix (chemical analysis) , interpretation (philosophy) , mathematics , sentence , computer science , correlation , algorithm , artificial intelligence , statistics , physics , geometry , chemistry , philosophy , epistemology , quantum mechanics , astronomy , chromatography , programming language
A comparison between different conformations of a given protein, relating both structure and dynamics, can be performed in terms of combined principal component analysis (combined‐PCA). To that end, a trajectory is obtained by concatenating molecular dynamics trajectories of the individual conformations under comparison. Then, the principal components are calculated by diagonalizing the correlation matrix of the concatenated trajectory. Since its introduction in 1995, this approach has been applied many times. However, the interpretation of the eigenvectors and eigenvalues so obtained rests on intuitive foundations, because analytical expressions relating the concatenated correlation matrix to those of the individual trajectories under consideration have not yet been provided. In this article, we present such expressions for the cases of two, three, and an arbitrary number of concatenated trajectories. The formulas are simple and show what is to be expected and what is not to be expected from a combined‐PCA. Their correctness and usefulness are demonstrated by discussing some representative examples. The results can be summarized in a simple sentence: the correlation matrix of a concatenated trajectory is given by the average of the individual correlation matrices plus the correlation matrix of the individual averages. From this it follows that the combined‐PCA of trajectories belonging to different free energy basins provides information that could also be obtained by alternative and more straightforward means. © 2014 Wiley Periodicals, Inc.
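The one-sentence summary in the abstract can be checked numerically. The sketch below is not taken from the article; it is a minimal illustration under stated assumptions: the "correlation matrix" is read as the covariance matrix of the coordinates normalized by the number of frames, all trajectories have the same number of frames, and the trajectories are synthetic Gaussian data rather than molecular dynamics output. All function names (`mean_vector`, `covariance`, `make_trajectory`, and so on) are hypothetical helpers introduced for this example.

```python
import random


def mean_vector(frames):
    """Average coordinate vector over all frames."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[k] for f in frames) / n for k in range(dim)]


def covariance(frames):
    """Covariance matrix normalized by the number of frames (assumption)."""
    n = len(frames)
    mu = mean_vector(frames)
    dim = len(mu)
    return [
        [sum((f[a] - mu[a]) * (f[b] - mu[b]) for f in frames) / n
         for b in range(dim)]
        for a in range(dim)
    ]


def mat_add(p, q):
    """Element-wise sum of two matrices stored as nested lists."""
    return [[x + y for x, y in zip(rp, rq)] for rp, rq in zip(p, q)]


def mat_scale(p, s):
    """Multiply every matrix element by the scalar s."""
    return [[x * s for x in row] for row in p]


def make_trajectory(rng, center, spread, n_frames):
    """Synthetic stand-in for an MD trajectory: Gaussian noise around a center."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n_frames)]


rng = random.Random(0)
dim = 3
# Three synthetic "conformations" with different centers and fluctuations,
# each with the same number of frames (equal weights are assumed).
trajectories = [
    make_trajectory(rng, [0.0, 0.0, 0.0], 1.0, 200),
    make_trajectory(rng, [3.0, -1.0, 0.5], 0.5, 200),
    make_trajectory(rng, [-2.0, 2.0, 1.0], 2.0, 200),
]

# Left-hand side: covariance matrix of the concatenated trajectory.
concatenated = [frame for traj in trajectories for frame in traj]
lhs = covariance(concatenated)

# Right-hand side: average of the individual covariance matrices ...
m = len(trajectories)
avg_cov = [[0.0] * dim for _ in range(dim)]
for traj in trajectories:
    avg_cov = mat_add(avg_cov, mat_scale(covariance(traj), 1.0 / m))

# ... plus the covariance matrix of the individual average structures.
cov_of_means = covariance([mean_vector(traj) for traj in trajectories])
rhs = mat_add(avg_cov, cov_of_means)

max_diff = max(
    abs(lhs[a][b] - rhs[a][b]) for a in range(dim) for b in range(dim)
)
print("largest element-wise difference:", max_diff)
```

Under these assumptions the two sides agree to floating-point precision. With trajectories of unequal length, the plain averages would presumably have to be replaced by frame-weighted averages; that case is not covered by this sketch.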