Open Access
Tips and tricks for the assembly of a Corynebacterium pseudotuberculosis genome using a semiconductor sequencer
Author(s) -
Ramos Rommel Thiago Jucá,
Carneiro Adriana Ribeiro,
Soares Siomar de Castro,
Santos Anderson Rodrigues dos,
Almeida Sintia,
Guimarães Luis,
Figueira Flávia,
Barbosa Eudes,
Tauch Andreas,
Azevedo Vasco,
Silva Artur
Publication year - 2013
Publication title -
Microbial Biotechnology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.287
H-Index - 74
ISSN - 1751-7915
DOI - 10.1111/1751-7915.12006
Subject(s) - sequence assembly , genome , computational biology , corynebacterium glutamicum , ion semiconductor sequencing , genomics , biology , corynebacterium pseudotuberculosis , comparative genomics , whole genome sequencing , reference genome , dna sequencing , computer science , genetics , gene , bacteria , gene expression , transcriptome
Summary New sequencing platforms have enabled rapid decoding of complete prokaryotic genomes at relatively low cost. The Ion Torrent platform is an example of these technologies, characterized by lower coverage, which creates challenges for genome assembly. One particular problem is the lack of genomes that enable reference-based assembly for organisms such as the one used in the present study, Corynebacterium pseudotuberculosis biovar equi, which causes high economic losses in the US equine industry. The quality treatment strategy incorporated into the assembly pipeline enabled a 16-fold greater use of the sequencing data obtained compared with traditional quality filter approaches. Data preprocessing prior to the de novo assembly enabled the use of known methodologies for assembling next-generation sequencing data. Moreover, manual curation proved essential for ensuring a quality assembly, which was validated by comparative genomics with other species of the genus Corynebacterium. The present study presents a modus operandi that enables greater and better use of data obtained from semiconductor sequencing to obtain the complete genome of a prokaryotic microorganism, C. pseudotuberculosis, which is not a traditional biological model such as Escherichia coli.