
Long-read assembly of a Great Dane genome highlights the contribution of GC-rich sequence and mobile elements to canine genomes
Author(s) -
Julia Halo,
Amanda L. Pendleton,
Feichen Shen,
Aurélien J. Doucet,
Thomas Derrien,
Christophe Hitte,
Laura E. Kirby,
Bridget Myers,
Elżbieta Śliwerska,
Sarah B. Emery,
John V. Moran,
Adam R. Boyko,
Jeffrey M. Kidd
Publication year - 2021
Publication title -
Proceedings of the National Academy of Sciences of the United States of America
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 5.011
H-Index - 771
eISSN - 1091-6490
pISSN - 0027-8424
DOI - 10.1073/pnas.2016274118
Subject(s) - retrotransposon, genome, biology, genetics, gene, contig, reference genome, coding region, mobile genetic elements, genome project, GC content, human genome, computational biology, transposable element
Significance -
Advancements in long-read DNA sequencing technologies provide more comprehensive views of genomes. We used long-read sequences to assemble a Great Dane dog genome that provides several improvements over the existing reference derived from a Boxer. Assembly comparisons revealed that gaps in the Boxer assembly often occur at the beginning of protein-coding genes and have high GC content, which likely reflects limitations of previous technologies in resolving GC-rich sequences. Dimorphic LINE-1 and SINEC retrotransposons represent the predominant differences between the Great Dane and Boxer assemblies. Proof-of-principle experiments demonstrated that expression of a canine LINE-1 could promote its own retrotransposition and that of a SINEC_Cf consensus sequence in cultured human cells. Thus, ongoing retrotransposon activity is a major contributor to canine genetic diversity.