A MinION™‐based pipeline for fast and cost‐effective DNA barcoding
Author(s) -
Srivathsan Amrita,
Baloğlu Bilgenur,
Wang Wendy,
Tan Wei X.,
Bertrand Denis,
Ng Amanda H. Q.,
Boey Esther J. H.,
Koh Jayce J. Y.,
Nagarajan Niranjan,
Meier Rudolf
Publication year - 2018
Publication title -
Molecular Ecology Resources
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.96
H-Index - 136
eISSN - 1755-0998
pISSN - 1755-098X
DOI - 10.1111/1755-0998.12890
Subject(s) - minion , barcode , pipeline (software) , nanopore sequencing , biology , dna barcoding , computational biology , amplicon , indel , computer science , dna sequencing , dna , genetics , polymerase chain reaction , gene , evolutionary biology , single nucleotide polymorphism , genotype , programming language , operating system
DNA barcodes are useful for species discovery and species identification, but obtaining barcodes currently requires a well‐equipped molecular laboratory and is time‐consuming and/or expensive. Here we address these issues by developing a barcoding pipeline for the Oxford Nanopore MinION™ and demonstrating that one flow cell can generate barcodes for ~500 specimens despite the high basecall error rates of MinION™ reads. The pipeline overcomes these errors by first summarizing all reads for the same tagged amplicon as a consensus barcode. Consensus barcodes are overall mismatch‐free but retain indel errors that are concentrated in homopolymeric regions. These indel errors are addressed with an optional error correction pipeline that is based on conserved amino acid motifs from publicly available barcodes. The effectiveness of this pipeline is documented by analysing reads from three MinION™ runs that represent three different stages of MinION™ development. The runs generated data for (i) 511 specimens of a mixed Diptera sample, (ii) 575 specimens of ants and (iii) 50 specimens of Chironomidae. The run based on the latest chemistry yielded MinION™ barcodes for 490 of the 511 specimens, which were assessed against reference Sanger barcodes (N = 471). Overall, the MinION™ barcodes have an accuracy of 99.3%–100%, with the number of ambiguous bases after correction ranging from <0.01% to 1.5% depending on which correction pipeline is used. We demonstrate that it takes ~2 hr of sequencing to gather all information needed for obtaining reliable barcodes for most specimens (>90%). We estimate that up to 1,000 barcodes can be generated in one flow cell and that the cost per barcode can be <USD 2.
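The consensus step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: it assumes the reads for one tagged amplicon have already been demultiplexed and aligned to equal length (with `-` marking gaps), and it uses a plain per-column majority vote. The function name `consensus_barcode` and the `min_fraction` parameter are hypothetical and chosen for illustration only.

```python
from collections import Counter


def consensus_barcode(aligned_reads, min_fraction=0.5):
    """Collapse pre-aligned reads of one amplicon into a consensus string.

    Illustrative assumption: reads are already aligned, so all strings
    have the same length and '-' marks a gap.
    """
    if not aligned_reads:
        raise ValueError("need at least one aligned read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("aligned reads must all have the same length")

    consensus = []
    for column in zip(*aligned_reads):
        # Most frequent symbol in this alignment column and its count.
        symbol, count = Counter(column).most_common(1)[0]
        if count / len(column) <= min_fraction:
            # No symbol exceeds the threshold: record an ambiguous base.
            consensus.append("N")
        elif symbol != "-":
            # Keep the majority base; majority-gap columns are dropped.
            consensus.append(symbol)
    return "".join(consensus)


reads = [
    "ACGT-A",
    "ACGTTA",  # spurious insertion, outvoted by the gap column
    "ACCT-A",  # substitution error, outvoted by the other reads
    "ACGT-A",
    "TCGT-A",  # substitution error at the first position
]
print(consensus_barcode(reads))
```

Random substitution errors are outvoted in a scheme like this, which is consistent with the abstract's observation that consensus barcodes are largely mismatch‐free. Systematic indel errors in homopolymeric regions can recur across many reads, so a majority vote alone may not remove them, which is why a separate correction step is described.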