z-logo
open-access-imgOpen Access
KOREF_S1: phased, parental trio-binned Korean reference genome using long reads and Hi-C sequencing methods
Author(s) -
Hui-Su Kim,
Sungwon Jeon,
Yeonkyung Kim,
Changjae Kim,
Jihun Bhak,
Jong Bhak
Publication year - 2022
Publication title -
gigascience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.947
H-Index - 54
ISSN - 2047-217X
DOI - 10.1093/gigascience/giac022
Subject(s) - reference genome , genome , nanopore sequencing , sequence assembly , computational biology , biology , chromosome , computer science , genetics , gene , gene expression , transcriptome
Background KOREF is the Korean reference genome, which was constructed with various sequencing technologies including long reads, short reads, and optical mapping methods. It is also the first East Asian multiomic reference genome accompanied by extensive clinical information, time-series and multiomic data, and parental sequencing data. However, it was still not a chromosome-scale reference. Here, we updated the previous KOREF assembly to a new chromosome-level haploid assembly of KOREF, KOREF_S1v2.1. Oxford Nanopore Technologies (ONT) PromethION, Pacific Biosciences HiFi-CCS, and Hi-C technology were used to build the most accurate East Asian reference assembled so far. Results We produced 705 Gb ONT reads and 114 Gb Pacific Biosciences HiFi reads, and corrected ONT reads by Pacific Biosciences reads. The corrected ultra-long reads reached higher accuracy of 1.4% base errors than the previous KOREF_S1v1.0, which was mainly built with short reads. KOREF has parental genome information, and we successfully phased it using a trio-binning method, acquiring a near-complete haploid-assembly. The final assembly resulted in total length of 2.9 Gb with an N50 of 150 Mb, and the longest scaffold covered 97.3% of GRCh38’s chromosome 2. In addition, the final assembly showed high base accuracy, with <0.01% base errors. Conclusions KOREF_S1v2.1 is the first chromosome-scale haploid assembly of the Korean reference genome with high contiguity and accuracy. Our study provides useful resources of the Korean reference genome and demonstrates a new strategy of hybrid assembly that combines ONT's PromethION and PacBio's HiFi-CCS.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom