From the genome sequence to the proteome and back: Evaluation of E. coli genome annotation with a 2‐D gel‐based proteomics approach
Author(s) - Maillet Isabelle, Berndt Peter, Malo Céline, Rodriguez Sabrina, Brunisholz René A., Pragai Zoltan, Arnold Sabine, Langen Hanno, Wyss Markus
Publication year - 2007
Publication title - Proteomics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.26
H-Index - 167
eISSN - 1615-9861
pISSN - 1615-9853
DOI - 10.1002/pmic.200600599
Subject(s) - genome, proteome, computational biology, genome project, annotation, proteomics, biology, escherichia coli, function (biology), bacterial genome size, identification (biology), whole genome sequencing, genetics, gene, botany
The ambition of systems biology to understand complex biological systems at the molecular level implies that we need to have a concrete and correct understanding of each molecular entity and its function. However, even for the best‐studied organism, Escherichia coli, a large number of proteins have never been identified and characterised from wild‐type cells, and/or await unravelling of their biological role. Instead, the ORF models for these proteins have been predicted by suitable algorithms and/or through comparison with known, homologous proteins from other organisms, approaches which may be prone to error. In the present study, we used a combination of 2‐DE, MALDI‐TOF‐MS and PMF to identify 1151 different proteins in E. coli K12 JM109. Comparison of the experimental with the theoretical Mr and pI values (4000 experimental values each) allowed the identification of numerous proteins with incorrect or incomplete ORF annotations in the current E. coli genome databases. Several inconsistencies in genome annotation were verified experimentally, and up to 55 candidates await further investigation. Our findings demonstrate how an up‐to‐date 2‐D gel‐based proteomics approach can be used for improving the annotation of prokaryotic genomes. They also highlight the need for harmonization among the different E. coli genome databases.
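The comparison of experimental with theoretical Mr and pI values described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the `Spot` record, the function name `flag_discrepancies`, the tolerance values (20% relative deviation in Mr, 0.5 pH units in pI) and the example proteins are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Spot:
    """One identified gel spot (hypothetical record layout)."""
    protein: str
    mr_theoretical: float   # kDa, computed from the annotated ORF
    mr_experimental: float  # kDa, estimated from the gel position
    pi_theoretical: float   # computed from the annotated ORF
    pi_experimental: float  # estimated from the gel position


def flag_discrepancies(spots, mr_rel_tol=0.2, pi_abs_tol=0.5):
    """Return spots whose measured Mr or pI deviates from the annotation.

    The tolerances are illustrative assumptions, not values from the study.
    Each result is (protein, relative Mr deviation, absolute pI deviation).
    """
    flagged = []
    for s in spots:
        mr_dev = (s.mr_experimental - s.mr_theoretical) / s.mr_theoretical
        pi_dev = s.pi_experimental - s.pi_theoretical
        if abs(mr_dev) > mr_rel_tol or abs(pi_dev) > pi_abs_tol:
            flagged.append((s.protein, round(mr_dev, 3), round(pi_dev, 2)))
    return flagged


# Hypothetical example data, not taken from the paper.
example_spots = [
    Spot("protA", 50.0, 51.0, 5.5, 5.6),  # consistent with its ORF model
    Spot("protB", 40.0, 30.0, 6.0, 6.1),  # runs much smaller than predicted
]
candidates = flag_discrepancies(example_spots)
```

A spot flagged this way would only be a candidate for a wrong start codon, unannotated processing, or a similar annotation problem; as the abstract notes, such inconsistencies still require experimental verification.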