Open Access
Resolving the full spectrum of human genome variation using Linked-Reads
Author(s) -
Patrick Marks,
Sarah Garcia,
Álvaro Martínez Barrio,
Kamila Belhocine,
Jorge Bernate,
Rajiv Bharadwaj,
Keith P. Bjornson,
Claudia Catalanotti,
Josh Delaney,
Adrian Fehr,
Ian T. Fiddes,
Brendan D. Galvin,
Haynes Heaton,
Jill Herschleb,
Christopher M. Hindson,
Esty Holt,
Cassandra B. Jabara,
Susanna Jett,
Nikka Keivanfar,
Sofia Kyriazopoulou-Panagiotopoulou,
Monkol Lek,
Bill K. Lin,
Adam J. Lowe,
Shazia Mahamdallie,
Shamoni Maheshwari,
Tony Makarewicz,
Jamie L. Marshall,
Francesca Meschi,
Christopher J. O'Keefe,
Heather Ordonez,
Pranav Patel,
Andrew Price,
Ariel Royall,
Elise Ruark,
Sheila Seal,
Michael Schnall-Levin,
Preyas Shah,
David Stafford,
Stephen R. Williams,
Indira Wu,
Andrew Wei Xu,
Nazneen Rahman,
Daniel G. MacArthur,
Deanna M. Church
Publication year - 2019
Publication title - Genome Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 9.556
H-Index - 297
eISSN - 1549-5469
pISSN - 1088-9051
DOI - 10.1101/gr.234443.118
Subject(s) - biology, genome, reference genome, human genome, computational biology, hybrid genome assembly, structural variation, genetics, 1000 Genomes Project, exome, sequence (biology), DNA sequencing, population, exome sequencing, gene, mutation, single nucleotide polymorphism, demography, sociology, genotype
Large-scale population analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole-genome sequencing. However, these short-read approaches fail to give a complete picture of a genome. They struggle to identify structural events, cannot access repetitive regions, and fail to resolve the human genome into haplotypes. Here, we describe an approach that retains long-range information while maintaining the advantages of short reads. Starting from ∼1 ng of high-molecular-weight DNA, we produce barcoded short-read libraries. Novel informatic approaches allow the barcoded short reads to be associated with their original long molecules, producing a novel data type known as “Linked-Reads”. This approach allows for simultaneous detection of small and large variants from a single library. In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. Linked-Reads allow mapping to 38 Mb of sequence not accessible to short reads, adding sequence in 423 difficult-to-sequence genes, including the disease-relevant genes STRC, SMN1, and SMN2. Both Linked-Read whole-genome and whole-exome sequencing identify complex structural variations, including balanced events and single-exon deletions and duplications. Further, Linked-Reads extend the region of high-confidence calls by 68.9 Mb. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.
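
To make the idea of associating barcoded short reads with their original long molecules concrete, the following is a minimal sketch, not the authors' method. It assumes each aligned read carries a barcode, a chromosome, and a position, and that reads sharing a barcode and lying close together on the reference came from the same input molecule. The `Read` type, the `infer_molecules` function, and the 50 kb `max_gap` threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """A simplified aligned read (hypothetical structure for illustration)."""
    barcode: str
    chrom: str
    pos: int


def infer_molecules(reads, max_gap=50_000):
    """Group reads by (barcode, chromosome), then split each group wherever
    two neighbouring reads are more than max_gap bases apart.

    Returns a list of (barcode, chrom, start, end, read_count) tuples.
    The max_gap default is an assumed value, not one taken from the paper.
    """
    by_key = defaultdict(list)
    for r in reads:
        by_key[(r.barcode, r.chrom)].append(r.pos)

    molecules = []
    for (barcode, chrom), positions in sorted(by_key.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for p in positions[1:]:
            if p - prev > max_gap:
                # Gap too large: close the current molecule, start a new one.
                molecules.append((barcode, chrom, start, prev, count))
                start, count = p, 0
            prev = p
            count += 1
        molecules.append((barcode, chrom, start, prev, count))
    return molecules


if __name__ == "__main__":
    demo = [
        Read("AAAC", "chr1", 1_000),
        Read("AAAC", "chr1", 21_000),
        Read("AAAC", "chr1", 900_000),
        Read("GGTT", "chr1", 5_000),
    ]
    for molecule in infer_molecules(demo):
        print(molecule)
```

In this toy input, the two nearby `AAAC` reads are merged into one inferred molecule, while the distant `AAAC` read and the `GGTT` read each form their own. A production pipeline would also have to handle barcode sequencing errors, multi-mapping reads, and barcodes shared by several molecules, none of which this sketch attempts.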
