Estimating the Fraction of Non-Coding RNAs in Mammalian Transcriptomes
Author(s) - Yurong Xin, Giulio Quarta, Hin Hark Gan, Tamar Schlick
Publication year - 2008
Publication title - Bioinformatics and Biology Insights
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.556
H-Index - 23
ISSN - 1177-9322
DOI - 10.4137/bbi.s443
Subject(s) - non-coding RNA, computational biology, biology, RNA, genome, intergenic region, intron, transcriptome, genetics, gene, gene expression
Recent studies of mammalian transcriptomes have identified numerous RNA transcripts that do not code for proteins; their identity, however, is largely unknown. Here we explore an approach based on sequence randomness patterns to discern different RNA classes. The relative z-score we use helps distinguish the known ncRNA class from the genome, intergenic and intron classes. This leads us to a fractional ncRNA measure for putative ncRNA datasets, which we model as mixtures of genuine ncRNAs and other transcripts derived from genomic, intergenic and intronic sequences. We use this model to analyze six representative datasets identified by the FANTOM3 project and by two computational approaches based on comparative analysis (RNAz and EvoFold). Our analysis suggests fewer ncRNAs than estimated by DNA sequencing and comparative analysis, but the validity of our approach and its predictions requires more extensive experimental RNA data.
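The mixture idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formula, so the code below assumes a simple two-component linear mixture: a dataset's summary score is a weighted average of a reference score for genuine ncRNAs and a reference score for background (genomic, intergenic or intronic) sequence. The function name `mixture_fraction` and all numeric inputs are hypothetical and are not taken from the paper.

```python
def mixture_fraction(observed, ncrna_ref, background_ref):
    """Estimate the ncRNA fraction f under an ASSUMED linear mixture:

        observed = f * ncrna_ref + (1 - f) * background_ref

    This is an illustrative model, not the published method.
    """
    span = ncrna_ref - background_ref
    if span == 0:
        # The two reference classes give the same score, so f is undefined.
        raise ValueError("reference classes are indistinguishable")
    f = (observed - background_ref) / span
    # Clamp to [0, 1]: sampling noise can push the raw estimate outside.
    return min(1.0, max(0.0, f))


# Hypothetical scores: a dataset halfway between the two references.
print(mixture_fraction(observed=0.5, ncrna_ref=1.0, background_ref=0.0))
```

Under this assumption, a dataset whose score sits close to the background reference yields a small estimated fraction, which is the kind of outcome the abstract describes when it reports fewer ncRNAs than other estimates.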