Open Access
Construction of precise support vector machine based models for predicting promoter strength
Author(s) - Meng Hailin, Ma Yingfei, Mai Guoqin, Wang Yong, Liu Chenli
Publication year - 2017
Publication title - Quantitative Biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.707
H-Index - 15
eISSN - 2095-4697
pISSN - 2095-4689
DOI - 10.1007/s40484-017-0096-3
Subject(s) - support vector machine, machine learning, artificial neural network, artificial intelligence, test set, set (abstract data type), computer science, correlation coefficient, sequence (biology), training set, data mining, biology, genetics, programming language
Background: Predicting the strength of a prokaryotic promoter from its sequence is of great importance both in fundamental life science research and in applied synthetic biology. Much progress has been made in building quantitative models for strength prediction; in particular, the introduction of machine learning methods such as the artificial neural network (ANN) has significantly improved prediction accuracy. The support vector machine (SVM), one of the most important machine learning methods, is particularly effective at learning from small datasets and is therefore expected to work well for this problem.

Methods: To test this, we constructed SVM-based models to quantitatively predict promoter strength. A library of 100 promoter sequences with their strength values was randomly divided into two datasets: a training set (≥10 sequences) for model training and a test set (≥10 sequences) for model testing.

Results: Prediction performance increased with the size of the training set, and the best performance was achieved with a training set of 90 sequences. After optimization of the model parameters, a high-performance model was obtained, with a high squared correlation coefficient for both the training set (R² > 0.99) and the test set (R² > 0.98), both of which are better than those of the ANN from our previous work.

Conclusions: Our results demonstrate that SVM-based models can be employed for the quantitative prediction of promoter strength.
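The evaluation protocol described in the Methods and Results can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names `split_library` and `squared_correlation`, the fixed random seed, and the placeholder library are all assumptions, and the squared correlation coefficient is assumed here to be the squared Pearson correlation between measured and predicted strengths. The abstract does not state how the sequences were encoded or which SVM kernel and parameters were used, so the model itself is left out.

```python
import random


def split_library(records, n_train, seed=0):
    """Randomly divide a promoter library into a training and a test set.

    `records` is any sequence of items (for example (sequence, strength)
    pairs). The seed is a hypothetical choice made for reproducibility.
    """
    if not 0 < n_train < len(records):
        raise ValueError("n_train must leave at least one record per set")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long series of at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov * cov / (var_o * var_p)


# Placeholder library of 100 entries; the identifiers and strength values
# are synthetic stand-ins, not data from the paper.
library = [(f"promoter_{i}", float(i)) for i in range(100)]

# 90 training sequences, the size reported to give the best performance,
# which leaves 10 sequences for testing.
train_set, test_set = split_library(library, n_train=90)
```

In practice a regression model would be fitted on `train_set`, and `squared_correlation` would then be called once with the training strengths and once with the test strengths against the model's predictions, giving the two R² values the abstract reports.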