Open Access
Constructing Human Proteoform Families Using Intact-Mass and Top-Down Proteomics with a Multi-Protease Global Post-Translational Modification Discovery Database
Author(s) - Yunxiang Dai, Katherine E. Buxton, Leah V. Schaffer, Rachel Miller, Robert J. Millikin, Mark Scalf, Brian L. Frey, Michael R. Shortreed, Lloyd M. Smith
Publication year - 2019
Publication title - Journal of Proteome Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.644
H-Index - 161
eISSN - 1535-3907
pISSN - 1535-3893
DOI - 10.1021/acs.jproteome.9b00339
Subject(s) - proteomics, top-down proteomics, computational biology, identification (biology), Jurkat cells, database search engine, computer science, mass spectrometry, tandem mass spectrometry, chemistry, biology, gene, genetics, selected reaction monitoring, search engine, information retrieval, T cell, botany, immune system, chromatography
Complex human biomolecular processes are made possible by the diversity of human proteoforms. Constructing proteoform families, groups of proteoforms derived from the same gene, is one way to represent this diversity. Comprehensive, high-confidence identification of human proteoforms remains a central challenge in mass spectrometry-based proteomics. We have previously reported a strategy for proteoform identification using intact-mass measurements, and we have since improved that strategy by mass calibration based on search results, the use of a global post-translational modification discovery database, and the integration of top-down proteomics results with intact-mass analysis. In the present study, we combine these strategies for enhanced proteoform identification in total cell lysate from the Jurkat human T lymphocyte cell line. We collected, processed, and integrated three types of proteomics data (NeuCode-labeled intact-mass, label-free top-down, and multi-protease bottom-up) to maximize the number of confident proteoform identifications. The integrated analysis revealed 5950 unique experimentally observed proteoforms, which were assembled into 848 proteoform families. Twenty percent of the observed proteoforms were confidently identified at a 3.9% false discovery rate, representing 1207 unique proteoforms derived from 484 genes.
