Prediction of protein secondary structure from circular dichroism using theoretically derived spectra
Author(s) -
Louis-Jeune Caroline,
Andrade-Navarro Miguel A.,
Perez-Iratxeta Carol
Publication year - 2012
Publication title -
Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.23188
Subject(s) - protein secondary structure , protein data bank , protein data bank (rcsb pdb) , circular dichroism , spectral line , set (abstract data type) , protein structure , chemistry , biological system , data set , protein tertiary structure , crystallography , reference data , algorithm , computer science , data mining , physics , artificial intelligence , biology , stereochemistry , biochemistry , astronomy , programming language
Circular dichroism (CD) is a spectroscopic technique commonly used to investigate the structure of proteins. The major secondary structure types, alpha-helices and beta-strands, produce distinctive CD spectra. Thus, by comparing the CD spectrum of a protein of interest to a reference set of CD spectra from proteins of known structure, predictive methods can estimate the secondary structure of the protein. Currently available methods, including K2D2, use such experimental CD reference sets, which are very small compared with the number of tertiary structures available in the Protein Data Bank (PDB). Conversely, given a PDB structure, it is possible to predict a theoretical CD spectrum from it. The methodological framework for this calculation was established long ago, but only recently has a convenient implementation, DichroCalc, been developed. In this study, we set out to determine whether theoretically derived spectra could be used as a reference set for accurate CD-based predictions of secondary structure. We used DichroCalc to calculate the theoretical CD spectra of a nonredundant set of structures representing most proteins in the PDB, and applied a straightforward approach for predicting protein secondary structure content using these theoretical CD spectra as the reference set. We show that this method improves the predictions, particularly for the wavelength interval between 200 and 240 nm and for beta-strand content. We have implemented this method, called K2D3, in a publicly accessible web server at http://www.ogic.ca/projects/k2d3.
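The abstract does not spell out the prediction procedure. As an illustration only, the sketch below shows one simple way to predict from a reference set: find the reference spectra nearest to the query and average their known helix and strand fractions. The nearest-neighbour choice, the function names (`euclidean`, `predict_content`), the parameter `k`, and the toy values in `REFERENCE` are all assumptions made for this example and are not taken from K2D3 or DichroCalc.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two spectra sampled on the same wavelength grid."""
    if len(a) != len(b):
        raise ValueError("spectra must share the same wavelength grid")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_content(query, reference, k=3):
    """Average the (helix, strand) fractions of the k reference spectra closest to the query.

    `reference` is a list of (spectrum, (helix_fraction, strand_fraction)) pairs.
    """
    if not 1 <= k <= len(reference):
        raise ValueError("k must be between 1 and the reference set size")
    # Rank reference entries by spectral distance to the query, closest first.
    ranked = sorted(reference, key=lambda entry: euclidean(query, entry[0]))
    nearest = ranked[:k]
    helix = sum(entry[1][0] for entry in nearest) / k
    strand = sum(entry[1][1] for entry in nearest) / k
    return helix, strand


# Made-up toy reference set (hypothetical values, three wavelengths per spectrum).
# In the setting described above, the spectra would be computed from PDB structures.
REFERENCE = [
    ([-10.0, -20.0, -18.0], (0.70, 0.05)),
    ([-9.0, -19.0, -17.0], (0.60, 0.10)),
    ([-2.0, -8.0, -3.0], (0.10, 0.50)),
    ([-1.0, -7.0, -4.0], (0.05, 0.45)),
]

helix, strand = predict_content([-9.5, -19.5, -17.5], REFERENCE, k=2)
print(f"helix={helix:.3f} strand={strand:.3f}")
```

In this kind of scheme, a larger reference set makes it more likely that some reference spectrum lies close to the query, which is consistent with the motivation given above for replacing a small experimental set with theoretical spectra.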