Open Access
Accurate prediction of RNA 5-hydroxymethylcytosine modification by utilizing novel position-specific gapped k-mer descriptors
Author(s) - Sajid Ahmed, Hossain Md. Zahid, Mahtab Uddin, Ghazaleh Taherzadeh, Alok Sharma, Swakkhar Shatabda, Abdollah Dehzangi
Publication year - 2020
Publication title - Computational and Structural Biotechnology Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.908
H-Index - 45
ISSN - 2001-0370
DOI - 10.1016/j.csbj.2020.10.032
Subject(s) - discriminative model, benchmark (surveying), computer science, identification (biology), rna, set (abstract data type), computational biology, feature (linguistics), data mining, artificial intelligence, biology, gene, genetics, linguistics, philosophy, botany, geodesy, programming language, geography
RNA modification is an essential step toward the generation of new RNA structures. Such modifications can potentially alter RNA function or stability. Among the different modifications, 5-hydroxymethylcytosine (5hmC) modification of RNA exhibits significant potential for a series of biological processes. Understanding the distribution of 5hmC in RNA is essential to determine its biological functionality. Although conventional sequencing techniques allow broad identification of 5hmC, they are both time-consuming and resource-intensive. In this study, we propose a new computational tool called iRNA5hmC-PS to tackle this problem. To build iRNA5hmC-PS, we extract a set of novel sequence-based features called Position-Specific Gapped k-mer (PSG k-mer) to obtain maximum sequential information. Our feature analysis shows that the proposed PSG k-mer features contain vital information for the identification of 5hmC sites. We also use a group-wise feature importance calculation strategy to select a small subset of features containing maximum discriminative information. Our experimental results demonstrate that iRNA5hmC-PS dramatically enhances prediction performance. iRNA5hmC-PS achieves 78.3% prediction performance, which is 12.8% higher than the performance reported in previous studies. iRNA5hmC-PS is publicly available as an online tool at http://103.109.52.8:81/iRNA5hmC-PS. Its benchmark dataset, source code, and documentation are available at https://github.com/zahid6454/iRNA5hmC-PS.
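The abstract does not define the PSG k-mer descriptor, so the following is only a minimal illustrative sketch under an assumed definition: a binary feature is set for each pair of nucleotides separated by a fixed gap at a fixed sequence position. The function names `psg_dimer_features` and `to_vector`, the `max_gap` parameter, and the restriction to gapped dinucleotides are hypothetical choices made for illustration. They are not taken from the iRNA5hmC-PS source code.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA nucleotides; T is mapped to U below


def psg_dimer_features(seq, max_gap=2):
    """Collect position-specific gapped dinucleotides (assumed definition).

    Returns a dict whose keys are (position, gap, first, second) tuples,
    one for every nucleotide pair observed in `seq` with `gap` bases
    between the two nucleotides.
    """
    seq = seq.upper().replace("T", "U")
    features = {}
    for gap in range(max_gap + 1):
        # The second nucleotide sits gap + 1 positions after the first.
        for pos in range(len(seq) - gap - 1):
            first, second = seq[pos], seq[pos + gap + 1]
            if first in ALPHABET and second in ALPHABET:
                features[(pos, gap, first, second)] = 1
    return features


def to_vector(seq, length, max_gap=2):
    """Encode `seq` as a dense binary vector with a fixed feature order.

    `length` is the common sequence length of the dataset, so that every
    sample is mapped to a vector of the same size.
    """
    observed = psg_dimer_features(seq, max_gap)
    vector = []
    for gap in range(max_gap + 1):
        for pos in range(length - gap - 1):
            for first, second in product(ALPHABET, repeat=2):
                vector.append(observed.get((pos, gap, first, second), 0))
    return vector
```

Under this assumed encoding, every sequence of the same length maps to a vector of the same size, which could then be passed to any standard classifier or feature-selection routine. The group-wise feature importance step mentioned in the abstract is not sketched here because the abstract gives no detail on how the groups are formed.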
