Euclidean sections of protein conformation space and their implications in dimensionality reduction
Author(s) -
Duan Mojie,
Li Minghai,
Han Li,
Huo Shuanghong
Publication year - 2014
Publication title -
Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.24622
Subject(s) - euclidean space , mathematics , euclidean distance matrix , eight dimensional space , euclidean group , dimensionality reduction , euclidean distance , seven dimensional space , metric (unit) , curse of dimensionality , quotient space (topology) , space (punctuation) , geometry , combinatorics , quotient , pure mathematics , topology (electrical circuits) , artificial intelligence , affine space , computer science , operations management , statistics , affine transformation , economics , operating system
Dimensionality reduction is widely used in searching for the intrinsic reaction coordinates of protein conformational changes. We find that dimensionality-reduction methods using the pairwise root-mean-square deviation (RMSD) as the local distance metric face a challenge, and we use Isomap as an example to illustrate the problem. We believe that dimensionality-reduction approaches that aim to preserve the geometric relations between objects carry an implied assumption: the original space and the reduced space have the same kind of geometry, such as Euclidean geometry vs. Euclidean geometry or spherical geometry vs. spherical geometry. When the protein free energy landscape is mapped onto a 2D plane or 3D space, the reduced space is Euclidean, so the original space should also be Euclidean. For a protein with N atoms, its conformation space is a subset of the 3N-dimensional Euclidean space R^{3N}. We formally define the protein conformation space as the quotient space of R^{3N} by the equivalence relation of rigid motions. Whether the quotient space is Euclidean or not depends on how it is parameterized. When the pairwise RMSD is employed as the local distance metric, implicit representations are used for the protein conformation space, leading to no direct correspondence to a Euclidean set. We have demonstrated that an explicit Euclidean-based representation of protein conformation space and the local distance metric associated with it improve the quality of dimensionality reduction in the tetra-peptide and β-hairpin systems. Proteins 2014; 82:2585–2596. © 2014 Wiley Periodicals, Inc.
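
The pairwise RMSD mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the RMSD is computed, so this example assumes the common convention of minimizing over rigid motions with the Kabsch superposition. The function names `kabsch_rmsd` and `pairwise_rmsd` are hypothetical and are not taken from the paper. The sketch shows that two conformations related by a rigid motion have (numerically) zero distance, which is what treating conformation space as a quotient of R^{3N} by rigid motions means in practice.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """Minimal RMSD between two (N, 3) coordinate arrays over all rigid motions.

    Assumption: atoms are in the same order in P and Q and are equally weighted.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove the translational degrees of freedom by centering both structures.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # SVD of the 3x3 covariance matrix gives the optimal rotation (Kabsch).
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    # Flip the last axis if needed so the result is a proper rotation,
    # not a reflection.
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0, 1.0, d])
    # Apply the optimal rotation to P (row-vector convention) and compare to Q.
    diff = Pc @ U @ D @ Vt - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))

def pairwise_rmsd(conformations):
    """Symmetric matrix of minimal RMSDs between all pairs of conformations."""
    n = len(conformations)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = kabsch_rmsd(conformations[i], conformations[j])
    return dist

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(10, 3))           # a made-up 10-atom "conformation"
    t = 0.7                                # rotation angle about the z axis
    Rz = np.array([[np.cos(t), -np.sin(t), 0.0],
                   [np.sin(t),  np.cos(t), 0.0],
                   [0.0,        0.0,       1.0]])
    Y = X @ Rz.T + np.array([1.0, -2.0, 3.0])  # same shape, rigidly moved
    Z = rng.normal(size=(10, 3))               # a genuinely different shape
    print(pairwise_rmsd([X, Y, Z]))
```

A matrix of this kind could serve as the input distances for a method such as Isomap. Note that the distance is defined only implicitly, through a superposition that is recomputed for every pair, rather than through fixed coordinates in a single Euclidean space.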