The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes

Open Access

Author(s) - Kim D. Pruitt, Jennifer Harrow, Rachel Harte, Craig Wallin, Mark Diekhans, Donna Maglott, Steve Searle, Catherine M. Farrell, Jane Loveland, Barbara J. Ruef, Elizabeth A. Hart, Marie-Marthe Suner, Melissa Landrum, Bronwen Aken, Sarah Ayling, Robert Baertsch, Julio Fernandez-Banet, Joshua L. Cherry, Val Curwen, Michael DiCuccio, Manolis Kellis, Jennifer Lee, Michael F. Lin, Michael Schuster, Andrew Shkeda, Clara Amid, Garth Brown, Oksana I. Dukhanina, Adam Frankish, Jennifer Hart, B. Maidak, Jonathan M. Mudge, Michael R. Murphy, Terence D. Murphy, Jeena Rajan, Bhanu Rajput, Lillian D. Riddick, Catherine Snow, Charles A. Steward, David Webb, Janet A. Weber, Laurens Wilming, Wenyu Wu, Ewan Birney, David Haussler, Tim Hubbard, James Ostell, Richard Durbin, David J. Lipman

Publication year - 2009

Publication title - Genome Research

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 9.556

H-Index - 297

eISSN - 1549-5469

pISSN - 1088-9051

DOI - 10.1101/gr.080531.108

Subject(s) - Ensembl, biology, genome, annotation, computational biology, gene, human genome, identifier, genome project, coding region, reference genome, genetics, human protein atlas, protein function, genomics, computer science, protein expression, programming language

Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.
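The core idea of tracking identically annotated coding regions can be illustrated with a minimal sketch. This is not the project's pipeline: the function name `consensus_cds`, the transcript labels, and the coordinates below are all hypothetical, and a coding region is assumed to be fully described by its chromosome, strand, and exon intervals. The sketch keeps only regions whose structure matches exactly between two annotation sets.

```python
from typing import Dict, Tuple

# Assumption: a coding region is (chromosome, strand, exon intervals).
Cds = Tuple[str, str, Tuple[Tuple[int, int], ...]]


def consensus_cds(set_a: Dict[str, Cds], set_b: Dict[str, Cds]) -> Dict[Cds, Tuple[str, str]]:
    """Return coding regions annotated identically in both sets.

    Maps each shared structure to the pair of labels it carries in
    set_a and set_b. Labels and inputs are illustrative only.
    """
    # Index the second set by structure so lookups are exact matches.
    by_structure_b = {cds: name for name, cds in set_b.items()}
    return {
        cds: (name_a, by_structure_b[cds])
        for name_a, cds in set_a.items()
        if cds in by_structure_b
    }


# Hypothetical annotations from two groups.
group_a = {
    "A_tx1": ("chr1", "+", ((100, 200), (300, 450))),
    "A_tx2": ("chr2", "-", ((500, 650),)),
}
group_b = {
    "B_tx1": ("chr1", "+", ((100, 200), (300, 450))),  # identical to A_tx1
    "B_tx2": ("chr2", "-", ((500, 640),)),             # differs at one boundary
}

shared = consensus_cds(group_a, group_b)
for cds, labels in shared.items():
    print(labels, cds)
```

In this toy input only the first pair matches exactly; the second pair differs at a single boundary and would be the kind of inconsistent annotation the abstract describes as going to manual review.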
