PLncWX: A Machine-Learning Algorithm for Plant lncRNA Identification Based on WOA-XGBoost
Author(s) - Fei Guo, Zhixiang Yin, Kai Zhou, Jiasi Li
Publication year - 2021
Publication title - Journal of Chemistry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.436
H-Index - 50
eISSN - 2090-9063
pISSN - 2090-9071
DOI - 10.1155/2021/6256021
Subject(s) - feature selection, artificial intelligence, encode, machine learning, identification (biology), support vector machine, feature (linguistics), computer science, abiotic stress, gene, epigenetics, computational biology, algorithm, biology, botany, genetics, linguistics, philosophy
Long noncoding RNAs (lncRNAs) are a class of RNAs that are longer than 200 nt and do not encode proteins. Studies have shown that lncRNAs can regulate gene expression at the epigenetic, transcriptional, and posttranscriptional levels. They are not only closely related to the occurrence, development, and prevention of human diseases but also regulate plant flowering and participate in plant responses to abiotic stresses such as drought and salt. Accurate and efficient identification of lncRNAs therefore remains an essential task in this area of research. Many identification tools based on machine-learning and deep learning algorithms already exist, but most are trained on human and mouse gene sequences and seldom on plant sequences, and they apply only one feature selection method, or one class of methods, after feature extraction. We developed an identification model covering dicots, monocots, algae, mosses, and ferns. We compared 20 feature selection methods (seven filter and thirteen wrapper methods), each combined with seven classifiers, while considering both the correlation between features and model redundancy. The WOA-XGBoost-based model performed better, with an accuracy of 91.55%, an AUC of 96.78%, and an F1 score of 91.68%. Meanwhile, the feature subset was reduced to 23 features, which effectively improved prediction accuracy and modeling efficiency.
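The abstract names WOA, which we take to mean the whale optimization algorithm, as the wrapper feature selection step, but it gives no implementation details. As a rough illustration only, a binary WOA wrapper might look like the sketch below. Everything in it is an assumption rather than the authors' method: the function names `binary_woa` and `toy_fitness`, the sigmoid transfer function, the clamping range, and all parameter values are hypothetical. In a setting like the paper's, the fitness would presumably be a cross-validated XGBoost error, possibly with a penalty on subset size; a toy fitness stands in here so the sketch runs with the standard library alone.

```python
import math
import random

# Hypothetical toy problem: features 0, 3 and 5 are "informative".
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    """Stand-in for a classifier's validation error (lower is better)."""
    missed = sum(1 for j in INFORMATIVE if not mask[j])
    extra = sum(1 for j, keep in enumerate(mask) if keep and j not in INFORMATIVE)
    return missed + 0.1 * extra  # small penalty for each redundant feature


def binary_woa(fitness, n_features, n_whales=10, n_iter=30, seed=0):
    """Minimise fitness(mask) over boolean feature masks (illustrative sketch)."""
    rng = random.Random(seed)
    # Whales move in a continuous space; a sigmoid maps positions to 0/1 masks.
    pos = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_whales)]
    best_pos, best_mask, best_fit = None, None, float("inf")

    def to_mask(p):
        mask = [1 / (1 + math.exp(-x)) > rng.random() for x in p]
        if not any(mask):  # never return an empty feature subset
            mask[rng.randrange(n_features)] = True
        return mask

    def evaluate():
        nonlocal best_pos, best_mask, best_fit
        for p in pos:
            mask = to_mask(p)
            fit = fitness(mask)
            if fit < best_fit:
                best_pos, best_mask, best_fit = p[:], mask, fit

    evaluate()
    for t in range(n_iter):
        a = 2 - 2 * t / n_iter  # decreases linearly from 2 towards 0
        for i, p in enumerate(pos):
            A = 2 * a * rng.random() - a
            C = 2 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: encircle the best whale; otherwise explore around a random whale.
                ref = best_pos if abs(A) < 1 else pos[rng.randrange(n_whales)]
                new = [ref[j] - A * abs(C * ref[j] - p[j]) for j in range(n_features)]
            else:
                # Spiral (bubble-net) move around the best whale.
                l = rng.uniform(-1, 1)
                new = [
                    abs(best_pos[j] - p[j]) * math.exp(l) * math.cos(2 * math.pi * l)
                    + best_pos[j]
                    for j in range(n_features)
                ]
            # Clamp so the sigmoid stays numerically well behaved.
            pos[i] = [max(-6.0, min(6.0, x)) for x in new]
        evaluate()
    return best_mask, best_fit


mask, score = binary_woa(toy_fitness, n_features=8)
selected = [j for j, keep in enumerate(mask) if keep]
print("selected features:", selected, "fitness:", score)
```

To move from this toy towards the setup described in the abstract, one would replace `toy_fitness` with a function that trains and validates a classifier on the masked feature columns. The search loop itself would not need to change.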
