Towards developing a protein infrared spectra databank (PISD) for proteomics research
Author(s) - Hering Joachim A., Innocent Peter R., Haris Parvez I.
Publication year - 2004
Publication title - Proteomics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.26
H-Index - 167
eISSN - 1615-9861
pISSN - 1615-9853
DOI - 10.1002/pmic.200300808
Subject(s) - protein secondary structure, fourier transform infrared spectroscopy, fourier transform, pattern recognition (psychology), biological system, computer science, proteomics, infrared spectroscopy, spectral line, artificial intelligence, chemistry, analytical chemistry (journal), mathematics, biology, physics, chromatography, optics, mathematical analysis, biochemistry, organic chemistry, gene, astronomy
Abstract Fourier transform infrared (FTIR) spectroscopy is an attractive tool for proteomics research because it can rapidly characterize protein secondary structure in aqueous solution. Recently, a number of secondary structure prediction methods have been suggested that are based on reference sets of FTIR spectra from proteins whose structure is known from X‐ray crystallography. These prediction methods, often referred to as pattern recognition based approaches, have demonstrated good prediction accuracy as judged by an error measure such as the standard error of prediction (SEP). However, to avoid possible adverse effects from differences in recording, these analyses have mostly been based on reference sets of FTIR spectra recorded in one laboratory only. As a result, the reference sets covered a limited number of proteins. Pattern recognition based approaches, however, rely on reference sets of FTIR spectra from as many proteins as possible, so that the full range of band shape variation can be related to the diversity of protein structural classes. Hence, if we want to build reliable pattern recognition based systems to support proteomics research, capable of making good predictions from the spectral data of any unknown protein, one common goal should be to build a comprehensive protein infrared spectra databank (PISD) containing FTIR spectra of proteins of known structure. We have started developing a comprehensive PISD composed of spectra recorded in different laboratories. As part of this work, we investigate here how the prediction accuracy achieved by a neural network analysis is affected when the reference sets are composed of FTIR spectra from different laboratories. The surprisingly small differences in SEP throughout all our experiments suggest that FTIR spectra recorded in different laboratories may be safely combined into one reference set, with only minor deterioration of prediction accuracy in the worst case.