The first gapless, reference-quality, fully annotated genome from a Southern Han Chinese individual
Author(s) - Kuan-Hao Chao, Aleksey V. Zimin, Mihaela Pertea, Steven L. Salzberg
Publication year - 2023
Publication title - G3: Genes, Genomes, Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.468
H-Index - 66
ISSN - 2160-1836
DOI - 10.1093/g3journal/jkac321
Subject(s) - biology, gene, genome, genetics, genome project, reference genome, DNA sequencing, whole genome sequencing, annotation, coding region, human genome, computational biology
We used long-read DNA sequencing to assemble the genome of a Southern Han Chinese male. We organized the sequence into chromosomes and filled in gaps using the recently completed T2T-CHM13 genome as a guide, yielding a gap-free genome, Han1, containing 3,099,707,698 bases. Using the T2T-CHM13 annotation as a reference, we mapped all genes onto the Han1 genome and identified additional gene copies, generating a total of 60,708 putative genes, of which 20,003 are protein-coding. A comprehensive comparison between the genes revealed that 235 protein-coding genes were substantially different between the individuals, with frameshifts or truncations affecting the protein-coding sequence. Most of these were heterozygous variants in which one gene copy was unaffected. This represents the first gene-level comparison between two finished, annotated individual human genomes.