Open Access
Assemblathon 1: A competitive assessment of de novo short read assembly methods
Author(s) -
Dent Earl,
Keith Bradnam,
John St. John,
Aaron E. Darling,
Dawei Lin,
Joseph Fass,
Hung On Ken Yu,
Vince Buffalo,
Daniel R. Zerbino,
Mark Diekhans,
Ngan Nguyen,
Pramila Ariyaratne,
WingKin Sung,
Zemin Ning,
Matthias Haimel,
Jared T. Simpson,
Nuno A. Fonseca,
İnanç Birol,
Roderick Docking,
Isaac Ho,
Daniel S. Rokhsar,
Rayan Chikhi,
Dominique Lavenier,
Guillaume Chapuis,
Delphine Naquin,
Nicolas Maillet,
Michael C. Schatz,
David R. Kelley,
Adam M. Phillippy,
Sergey Koren,
ShiawPyng Yang,
Wei Wu,
WenChi Chou,
Anuj Srivastava,
Timothy I. Shaw,
J. Graham Ruby,
Peter Skewes-Cox,
Miguel Betegon,
Michelle Dimon,
Victor Solovyev,
Igor Seledtsov,
Petr Kosarev,
Denis Vorobyev,
Ricardo H. Ramírez-González,
Richard M. Leggett,
Dan MacLean,
Fangfang Xia,
Ruibang Luo,
Zhenyu Li,
Yinlong Xie,
Binghang Liu,
Sante Gnerre,
Iain MacCallum,
Dariusz Przybylski,
Filipe J. Ribeiro,
Shuangye Yin,
Ted Sharpe,
Giles Hall,
Paul Kersey,
Richard Durbin,
Shaun D. Jackman,
Jarrod Chapman,
Xiaoqiu Huang,
Joseph L. DeRisi,
Mario Cáccamo,
Yingrui Li,
David B. Jaffe,
Richard E. Green,
David Haussler,
Ian Korf,
Benedict Paten
Publication year - 2011
Publication title - Genome Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 9.556
H-Index - 297
eISSN - 1549-5469
pISSN - 1088-9051
DOI - 10.1101/gr.126599.111
Subject(s) - biology, sequence assembly, benchmark (surveying), genome, computational biology, genomics, contiguity, hybrid genome assembly, set (abstract data type), computer science, genetics, gene, transcriptome, ecology, gene expression, geodesy, programming language, geography
Low-cost short-read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype-aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark (1) it is possible to assemble the genome to a high level of coverage and accuracy, and (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies, is now public and freely available from http://www.assemblathon.org/.
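
The abstract does not say which statistics underlie the contiguity assessment. As an illustration only, the sketch below computes N50, a conventional contiguity summary for assemblies; choosing N50 here is an assumption, and the function name `n50` and the example lengths are hypothetical rather than taken from the Assemblathon evaluation code.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    Illustrative helper only; not part of the Assemblathon code.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases until
    # at least half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty positive lengths")


if __name__ == "__main__":
    # Hypothetical contig lengths, not data from the competition.
    example = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(example))
```

A larger N50 indicates that more of the assembly sits in long sequences, but it says nothing about correctness, which is why the abstract lists coverage, structure, base calling, and copy number as separate assessments.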
