PeakSelect: preprocessing tandem mass spectra for better peptide identification
Author(s) - Zhang Jingfen, He Simin, Ling Charles X., Cao Xingjun, Zeng Rong, Gao Wen
Publication year - 2008
Publication title - Rapid Communications in Mass Spectrometry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.528
H-Index - 136
eISSN - 1097-0231
pISSN - 0951-4198
DOI - 10.1002/rcm.3488
Subject(s) - chemistry, tandem, mass spectrum, noise (video), tandem mass spectrometry, peptide, pattern recognition (psychology), identification (biology), spectral line, artificial intelligence, analytical chemistry (journal), mass spectrometry, chromatography, computer science, physics, biochemistry, materials science, image (mathematics), botany, astronomy, composite material, biology
We present PeakSelect, a new preprocessing method that improves the accuracy and efficiency of peptide (protein) identification from tandem mass spectra. The fundamental difference between noise and fragment ions in a spectrum is that fragment ions produce isotope peaks, whereas noise does not. We propose the concept of an Isotope Pattern Vector (IPV), which characterizes the isotope cluster of a fragment ion, so that noise and real peaks can be distinguished by their quantitative IPV values. PeakSelect first fits a Gaussian Mixture Model with the Expectation‐Maximization (EM) algorithm to find the base intensity level (baseline) of a spectrum. It then selects features based on the IPV and the baseline, and constructs a decision tree that automatically classifies peaks into categories such as noise, single ion peaks, and overlapping peaks. Experiments show that PeakSelect can help reduce Mascot search time and increase the reliability of peptide identifications. In particular, PeakSelect performs well on complex spectra with a large number of peaks from large peptides, and supports more sequence identifications than other well‐known systems. Copyright © 2008 John Wiley & Sons, Ltd.
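
The abstract does not spell out how the Gaussian Mixture Model is set up, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that log-intensities in a spectrum can be modelled as a mixture of two Gaussians (a low "noise" component and a high "signal" component), fits that mixture with plain EM, and takes the baseline as the noise mean plus a chosen number of standard deviations. The two-component choice, the log transform, the `n_sigma` parameter, and the function names `fit_two_gaussians` and `estimate_baseline` are all illustrative assumptions.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(xs, iterations=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances), ordered so that component 0
    has the lower mean.
    """
    xs = sorted(xs)
    n = len(xs)
    half = n // 2
    # Crude initialisation (an assumption): split the sorted data in half.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    grand_mean = sum(xs) / n
    pooled_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, 1e-6)
    variances = [pooled_var, pooled_var]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in xs:
            p0 = weights[0] * normal_pdf(x, means[0], variances[0])
            p1 = weights[1] * normal_pdf(x, means[1], variances[1])
            total = p0 + p1
            resp.append((p0 / total, p1 / total) if total > 0 else (0.5, 0.5))
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component has collapsed; keep its old parameters
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk,
                1e-6,  # variance floor to avoid division by zero
            )

    order = sorted(range(2), key=lambda k: means[k])
    return (
        [weights[k] for k in order],
        [means[k] for k in order],
        [variances[k] for k in order],
    )


def estimate_baseline(intensities, n_sigma=2.0):
    """Hypothetical baseline: noise-component mean + n_sigma std, on the log scale."""
    logs = [math.log(v) for v in intensities if v > 0]
    _, means, variances = fit_two_gaussians(logs)
    return math.exp(means[0] + n_sigma * math.sqrt(variances[0]))


# Synthetic spectrum (made-up numbers): many weak noise peaks, few strong ones.
random.seed(0)
noise = [random.lognormvariate(math.log(10.0), 0.3) for _ in range(300)]
signal = [random.lognormvariate(math.log(1000.0), 0.5) for _ in range(60)]
baseline = estimate_baseline(noise + signal)
kept = [v for v in noise + signal if v > baseline]
print(f"baseline = {baseline:.1f}, peaks kept = {len(kept)} of {len(noise) + len(signal)}")
```

In PeakSelect the baseline is only one input: according to the abstract, it is combined with IPV-based features in a decision tree, so a simple intensity threshold like the one above should be read as a stand-in for that first step alone.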
