A Machine Learning Approach for the Identification of Protein Secondary Structure Elements from Electron Cryo‐Microscopy Density Maps
Author(s) - Si Dong, Ji Shuiwang, Nasr Kamal Al, He Jing
Publication year - 2012
Publication title - Biopolymers
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.556
H-Index - 125
eISSN - 1097-0282
pISSN - 0006-3525
DOI - 10.1002/bip.22063
Subject(s) - cryo electron microscopy, microscopy, chemistry, artificial intelligence, identification (biology), electron microscope, protein secondary structure, sensitivity (control systems), helix (gastropod), crystallography, pattern recognition (psychology), computer science, physics, optics, engineering, botany, electronic engineering, biochemistry, biology, ecology, snail
Accurate identification of secondary structure elements (SSEs) from volumetric protein density maps is critical for de novo backbone structure derivation in electron cryo‐microscopy (cryoEM). Detecting SSEs automatically and accurately from density maps at medium resolutions (∼5–10 Å) remains challenging. We present a machine learning approach, SSELearner, that automatically identifies helices and β‐sheets by using knowledge from existing volumetric maps in the Electron Microscopy Data Bank. We tested our approach using 10 simulated density maps. The average specificity and sensitivity are 94.9% and 95.8%, respectively, for helix detection and 86.7% and 96.4%, respectively, for β‐sheet detection. We also developed a secondary structure annotator, SSID, to predict helices and β‐strands from the backbone Cα trace. With the help of SSID, we tested SSELearner using 13 experimentally derived cryoEM density maps from the Electron Microscopy Data Bank. On these maps, the approach shows specificity and sensitivity of 91.8% and 74.5%, respectively, for helix detection and 85.2% and 86.5%, respectively, for β‐sheet detection. The reduced detection accuracy reveals the challenges in SSE detection when cryoEM maps are used instead of simulated maps. Our results suggest that it is effective to use one cryoEM map for learning to detect SSEs in another cryoEM map of similar quality. © 2012 Wiley Periodicals, Inc. Biopolymers 97: 698–708, 2012.
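The abstract reports specificity and sensitivity but does not state the unit over which they are counted (for example, per Cα atom or per voxel). As a minimal sketch, assuming the standard definitions applied to per-residue boolean labels, the two figures could be computed as follows. The function name sensitivity_specificity and the example labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for one SSE class, e.g. helix.

    Assumption: each position is one residue, labeled True if it belongs
    to the class. The paper's actual counting unit may differ.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")

    sensitivity = tp / (tp + fn)  # fraction of true class members detected
    specificity = tn / (tn + fp)  # fraction of non-members correctly rejected
    return sensitivity, specificity


# Hypothetical labels: 4 helix residues and 6 non-helix residues.
truth = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]
sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

In this toy case one helix residue is missed and one non-helix residue is wrongly labeled, which gives a sensitivity of 3/4 and a specificity of 5/6.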