TranscriptClean: variant-aware correction of indels, mismatches and splice junctions in long-read transcripts
Author(s) -
Dana Wyman,
A Mortazavi
Publication year - 2018
Publication title -
Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/bty483
Subject(s) - splice , indel , python (programming language) , computational biology , computer science , exon , rna splicing , gene isoform , scripting language , genetics , alternative splicing , biology , genome , gene , rna , programming language , single nucleotide polymorphism , genotype
Long-read, single-molecule sequencing platforms hold great potential for isoform discovery and characterization of multi-exon transcripts. However, their high error rates are an obstacle to distinguishing novel transcript isoforms from sequencing artifacts. Therefore, we developed the package TranscriptClean to correct mismatches, microindels and noncanonical splice junctions in mapped transcripts using the reference genome while preserving known variants.
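The variant-aware idea in the abstract can be illustrated with a minimal sketch. It is not TranscriptClean's actual code or interface: the function name `correct_mismatches`, its parameters, and the `known_snps` dictionary layout are all hypothetical, and the example assumes an ungapped alignment (no indels or splice junctions) to keep the logic short. A read base that disagrees with the reference is reverted to the reference base unless it matches a known variant allele at that position.

```python
def correct_mismatches(read_seq, ref_seq, ref_start, known_snps):
    """Toy variant-aware mismatch correction (hypothetical helper).

    read_seq   -- aligned read bases, same length as ref_seq (no indels)
    ref_seq    -- reference bases covering the same genomic span
    ref_start  -- 0-based genomic coordinate of the first base
    known_snps -- dict mapping genomic position -> set of known alt alleles
    """
    if len(read_seq) != len(ref_seq):
        raise ValueError("this sketch only handles ungapped alignments")

    corrected = []
    for offset, (read_base, ref_base) in enumerate(zip(read_seq, ref_seq)):
        pos = ref_start + offset
        if read_base == ref_base:
            # Read agrees with the reference: nothing to correct.
            corrected.append(read_base)
        elif read_base in known_snps.get(pos, set()):
            # Mismatch matches a known variant: preserve it.
            corrected.append(read_base)
        else:
            # Unexplained mismatch: treat as sequencing error, use reference.
            corrected.append(ref_base)
    return "".join(corrected)


# Position 102 (G vs C) is reverted; position 104 (T vs A) is a known SNP.
result = correct_mismatches("ACGTT", "ACCTA", 100, {104: {"T"}})
print(result)
```

In this toy input the mismatch at position 102 is replaced by the reference base, while the mismatch at position 104 is kept because it matches an allele listed in `known_snps`.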