
ShapeShifter: a novel approach for identifying and quantifying stable lariat intronic species in RNAseq data
Author(s) - Taggart Allison J, Fairbrother William G
Publication year - 2018
Publication title - Quantitative Biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.707
H-Index - 15
eISSN - 2095-4697
pISSN - 2095-4689
DOI - 10.1007/s40484-018-0141-x
Subject(s) - ENCODE, intron, RNA splicing, computational biology, biology, computer science, artificial intelligence, genetics, gene, RNA
Background: Most intronic lariats are rapidly turned over after splicing. However, new research suggests that some introns may have additional post‐splicing functions. Current bioinformatics methods for identifying lariats require a sequencing read that traverses the lariat branchpoint. This requirement yields precise branchpoint sequence and position information, but it limits the ability to quantify the abundance of stabilized lariat species in a given RNAseq sample. Bioinformatic tools are needed to better address these emerging biological questions.
Methods: We applied an unsupervised machine learning approach to sequencing reads from publicly available ENCODE data to learn to identify and quantify lariats based on the shape of RNAseq read coverage.
Results: We developed ShapeShifter, a novel approach for identifying and quantifying stable lariat species in RNAseq datasets. We learned a characteristic “lariat” curve from ENCODE RNAseq data and were able to estimate abundances for introns based on read coverage. Using this method, we discovered new stable introns in these samples that were not detected by the older, branchpoint‐traversing read method.
Conclusions: ShapeShifter provides a robust approach to detecting and quantifying stable lariat species.