Open Access
Revisiting the protein-coding gene catalog of Drosophila melanogaster using 12 fly genomes
Author(s) -
Michael F. Lin,
Joseph W. Carlson,
Madeline A. Crosby,
Beverley B Matthews,
Charles Yu,
Soo Hyung Park,
Kenneth H. Wan,
Andrew J. Schroeder,
L. Sian Gramates,
Susan E. St. Pierre,
Margaret Roark,
Kenneth L. Wiley,
Rob J. Kulathinal,
Peili Zhang,
Kyl V. Myrick,
Jerry Antone,
S Celniker,
William M Gelbart,
Manolis Kellis
Publication year - 2007
Publication title -
Genome Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 9.556
H-Index - 297
eISSN - 1549-5469
pISSN - 1088-9051
DOI - 10.1101/gr.6679507
Subject(s) - biology, genome, gene prediction, genetics, Drosophila melanogaster, gene, comparative genomics, genome project, genomics, computational biology, ORFs, codon usage bias, melanogaster, open reading frame, peptide sequence
The availability of sequenced genomes from 12 Drosophila species has enabled the use of comparative genomics for the systematic discovery of functional elements conserved within this genus. We have developed quantitative metrics for the evolutionary signatures specific to protein-coding regions and applied them genome-wide, resulting in 1193 candidate new protein-coding exons in the D. melanogaster genome. We have reviewed these predictions by manual curation and validated a subset by directed cDNA screening and sequencing, revealing both new genes and new alternative splice forms of known genes. We also used these evolutionary signatures to evaluate existing gene annotations, resulting in the validation of 87% of genes lacking descriptive names and identifying 414 poorly conserved genes that are likely to be spurious predictions, noncoding, or species-specific genes. Furthermore, our methods suggest a variety of refinements to hundreds of existing gene models, such as modifications to translation start codons and exon splice boundaries. Finally, we performed directed genome-wide searches for unusual protein-coding structures, discovering 149 possible examples of stop codon readthrough, 125 new candidate ORFs of polycistronic mRNAs, and several candidate translational frameshifts. These results affect >10% of annotated fly genes and demonstrate the power of comparative genomics to enhance our understanding of genome organization, even in a model organism as intensively studied as Drosophila melanogaster.
