Sequence-based prediction of protein interaction sites with an integrative method
Author(s) - Xuewen Chen, Jong Cheol Jeong
Publication year - 2009
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btp039
Subject(s) - computer science, sequence (biology), random forest, protein function, protein–protein interaction, computational biology, function (biology), protein sequencing, protein structure prediction, identification (biology), data mining, machine learning, artificial intelligence, protein structure, bioinformatics, peptide sequence, biology, genetics, biochemistry, botany, gene
Identification of protein interaction sites has a significant impact on understanding protein function, elucidating signal transduction networks and drug design studies. With protein sequence data growing exponentially, predictive methods that use only sequence information for protein interaction site prediction have drawn increasing interest. In this article, we propose a predictive model for identifying protein interaction sites. Without using any structure data, the proposed method extracts a wide range of features from protein sequences. A random forest-based integrative model is developed to utilize these features effectively and to deal with the imbalanced data classification problem commonly encountered in binding site predictions.
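The abstract does not specify which sequence features are extracted or how class imbalance is handled, so the following is only a minimal sketch of two generic ingredients such a pipeline might use: a sliding-window one-hot encoding of residues, and balanced subsampling of the majority class to build training subsets. The function names, the window width, the example sequence and the labels are all hypothetical and are not taken from the paper. A classifier such as a random forest could then be trained on each subset.

```python
import random

# The 20 standard amino acids, used as the one-hot alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def window_features(sequence, center, half_width=2):
    """One-hot encode the residues in a window centered on `center`.

    Positions that fall off either end of the sequence, or residues
    outside the 20-letter alphabet, are encoded as all zeros.
    (Assumed encoding; the paper's actual features are not given here.)
    """
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        one_hot = [0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            idx = AMINO_ACIDS.find(sequence[pos])
            if idx >= 0:
                one_hot[idx] = 1
        features.extend(one_hot)
    return features


def balanced_subsets(labels, n_subsets=3, seed=0):
    """Build index lists with equal numbers of positives and negatives.

    Each subset draws k samples from each class, where k is the size of
    the smaller class, so the rarer class is fully represented.
    (Assumed balancing scheme, shown for illustration only.)
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    k = min(len(positives), len(negatives))
    subsets = []
    for _ in range(n_subsets):
        chosen = rng.sample(positives, k) + rng.sample(negatives, k)
        subsets.append(sorted(chosen))
    return subsets


# Hypothetical toy input: 1 marks an interaction-site residue.
sequence = "MKTAYIAKQR"
labels = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]

X = [window_features(sequence, i) for i in range(len(sequence))]
subsets = balanced_subsets(labels)
```

Each entry of `subsets` indexes into `X` and `labels`, giving one balanced training set per base learner; predictions from the learners could then be combined by voting.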