Information transduction capacity reduces the uncertainties in annotation-free isoform discovery and quantification
Author(s) - Yue Deng, Feng Bao, Yang Yang, Xiangyang Ji, Mulong Du, Zhengdong Zhang, Meilin Wang, Qionghai Dai
Publication year - 2017
Publication title - Nucleic Acids Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 9.008
H-Index - 537
eISSN - 1362-4954
pISSN - 0305-1048
DOI - 10.1093/nar/gkx585
Subject(s) - biology , computational biology , inference , rna seq , alternative splicing , rna , annotation , transduction (biophysics) , rna splicing , deep sequencing , gene isoform , transcriptome , computer science , gene , bioinformatics , genetics , gene expression , artificial intelligence , genome , biochemistry
The automated discovery and quantification of transcripts from high-throughput RNA sequencing (RNA-seq) data are important tasks in next-generation sequencing (NGS) research. However, these tasks are challenging because of the uncertainties that arise when complete splicing isoform variants are inferred from partially observed short reads. Here, we address this problem by explicitly reducing the inherent uncertainties in a biological system caused by missing information. In our approach, the RNA-seq procedure that transforms transcripts into short reads is treated as an information transmission process. Consequently, the data uncertainties are substantially reduced by exploiting the information transduction capacity, a concept drawn from information theory. The experimental results obtained from the analyses of simulated datasets and RNA-seq datasets from cell lines and tissues demonstrate the advantages of our method over state-of-the-art competitors. Our algorithm, MaxInfo, is available as an open-source implementation.
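To make the "information transmission" view concrete, the sketch below computes the mutual information of a small discrete channel whose inputs are isoforms and whose outputs are read classes. It is an illustration only and is not the MaxInfo algorithm: the function name `mutual_information`, the two-isoform toy gene, and the three read classes are all assumptions introduced here.

```python
import math


def mutual_information(prior, channel):
    """Mutual information I(X;Y) in bits.

    prior[x]      : P(X = x), e.g. relative abundance of isoform x (assumed toy values)
    channel[x][y] : P(Y = y | X = x), e.g. chance that a read from isoform x
                    falls into read class y (hypothetical read classes)
    """
    n_out = len(channel[0])
    # Marginal distribution of the observed read classes, P(Y = y).
    p_y = [sum(prior[x] * channel[x][y] for x in range(len(prior)))
           for y in range(n_out)]
    mi = 0.0
    for x, p_x in enumerate(prior):
        for y in range(n_out):
            joint = p_x * channel[x][y]
            if joint > 0:  # zero-probability terms contribute nothing
                mi += joint * math.log2(channel[x][y] / p_y[y])
    return mi


# Toy gene: two isoforms that share one exon and each carry one unique exon.
# Read classes: 0 = shared exon, 1 = unique to isoform A, 2 = unique to isoform B.
prior = [0.5, 0.5]
channel = [
    [0.5, 0.5, 0.0],  # reads from isoform A
    [0.5, 0.0, 0.5],  # reads from isoform B
]
print(mutual_information(prior, channel))  # 0.5
```

In this toy setting, reads from the shared exon say nothing about their isoform of origin, so only half a bit of the one bit of isoform identity reaches the observer. This is the kind of uncertainty from partially observed short reads that the abstract refers to.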