Final Report: Complete Sequencing of the 2.3 Mbp Genome of the Hyperthermophilic Archaeon Pyrobaculum aerophilum, January 1, 1998 - December 31, 1998
Author(s) - UngJin Kim, Melvin I. Simon
Publication year - 1998
Publication title - OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information)
Language(s) - English
Resource type - Reports
DOI - 10.2172/765662
Subject(s) - fosmid , shotgun sequencing , biology , genome , genetics , archaea , sulfolobus solfataricus , contig , whole genome sequencing , sequence assembly , massive parallel sequencing , dna sequencing , computational biology , illumina dye sequencing , hyperthermophile , metagenomics , hybrid genome assembly , nanopore sequencing , reference genome , deep sequencing , gene , transcriptome , gene expression
Pyrobaculum aerophilum is a hyperthermophilic archaeon, isolated from a boiling marine water hole at Maronti Beach, Italy, that is capable of growth at 110 °C. Unlike most of its thermophilic relatives, this microorganism can grow aerobically. Because it tolerates oxygen, it can be grown in the presence of air, i.e. on plates. It is therefore a good candidate as a model organism for studying archaeal biology and thermophilism. Sequencing the entire genome of this organism will provide a wealth of information on the evolutionary and phylogenetic relationships between archaea and other organisms, as well as on the biology of thermophilism. We have constructed a physical map that covers the estimated 2.3 megabase pair genome using a 10X fosmid library. The map currently consists of 96 overlapping fosmid clones. We have completed sequencing the entire genome using a random shotgun approach supplemented with oligonucleotide primer-directed sequencing. A total of 16,098 random sequences, corresponding to approximately 3.5X genomic coverage, were obtained by sequencing from both ends, with vector-specific primers, the 2-3 kbp genomic DNA fragments that were cloned into the pUC18/19 vector after shearing. These sequences have been assembled into a number of contigs using the Phrap program developed by Dr. Phil Green at the University of Washington, Seattle. Gaps and regions of low-quality base calls have been addressed with a total of 2,300 directed sequencing reactions and reassembly. Our current full-length genomic sequence still suffers from low data quality: only approximately 99% of the nucleotide sequence is accurate. This is mainly due to the low redundancy (3.5-fold) of the random sequencing. We plan to perform 2,000-3,000 more directed sequencing reactions to polish the sequence to 99.99% accuracy. Final polishing of the sequence data and annotation is currently being performed by the UCLA team and the Caltech sequencing core facility.
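The coverage and accuracy figures quoted in the abstract can be checked with simple arithmetic. The sketch below is illustrative only and is not part of the report: the function names are hypothetical, the mean read length is inferred from the stated numbers rather than reported, and the uncovered-fraction estimate assumes an idealised Poisson (Lander-Waterman style) model of random read placement that the report does not itself invoke.

```python
import math

# Figures taken from the abstract.
GENOME_BP = 2_300_000   # estimated genome size (2.3 Mbp)
READS = 16_098          # random shotgun sequences
COVERAGE = 3.5          # reported fold coverage


def implied_read_length(genome_bp, reads, coverage):
    """Mean read length implied by coverage = reads * length / genome size."""
    return coverage * genome_bp / reads


def expected_errors(genome_bp, accuracy):
    """Expected number of incorrect bases at a given per-base accuracy."""
    return round(genome_bp * (1.0 - accuracy))


def uncovered_fraction(coverage):
    """Fraction of the genome hit by no read, ASSUMING Poisson-distributed
    read placement (an idealisation, not a claim made in the report)."""
    return math.exp(-coverage)


if __name__ == "__main__":
    print(f"implied mean read length: {implied_read_length(GENOME_BP, READS, COVERAGE):.0f} bp")
    print(f"expected errors at 99% accuracy:    {expected_errors(GENOME_BP, 0.99)}")
    print(f"expected errors at 99.99% accuracy: {expected_errors(GENOME_BP, 0.9999)}")
    print(f"uncovered fraction at 3.5X (Poisson assumption): {uncovered_fraction(COVERAGE):.4f}")
```

Under these assumptions the stated numbers are mutually consistent with reads of roughly 500 bp, and moving from 99% to 99.99% accuracy corresponds to cutting the expected number of erroneous bases by a factor of 100. The Poisson estimate also suggests why random reads at 3.5X leave gaps that need directed sequencing.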