QSAR model for alignment‐free prediction of human breast cancer biomarkers based on electrostatic potentials of protein pseudofolding HP‐lattice networks
Author(s) - Vilar Santiago, González‐Díaz Humberto, Santana Lourdes, Uriarte Eugenio
Publication year - 2008
Publication title - Journal of Computational Chemistry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.907
H-Index - 188
eISSN - 1096-987X
pISSN - 0192-8651
DOI - 10.1002/jcc.21016
Subject(s) - quantitative structure–activity relationship, computational biology, molecular descriptor, chemistry, human breast, computational chemistry, computer science, biological system, artificial intelligence, bioinformatics, machine learning, breast cancer, biology, cancer, genetics
Network theory allows relationships to be established between numerical parameters that describe the molecular structure of genes and proteins and their biological properties. Such models can be regarded as quantitative structure–activity relationships (QSAR) for biopolymers. This work presents the first QSAR model for the 122 proteins associated with human breast cancer (HBC) that Sjöblom et al. (Science 2006, 314, 268) identified experimentally from over 10,000 human proteins. The 122 HBC‐related proteins (HBCp) and a control group of 200 proteins not related to HBC (non‐HBCp) were forced to fold in an HP lattice network. From these networks, a series of electrostatic potential parameters (ξ_k) was calculated to describe each protein numerically. Using the ξ_k values as inputs to linear discriminant analysis yielded a QSAR model that discriminates between HBCp and non‐HBCp and that could help to predict the involvement of a given gene and/or protein in HBC. The model was validated with an external prediction series and with an additional series of 1000 non‐HBCp. In all cases, correct classification rates were above 80%. This study represents the first example of a QSAR model for the computational‐chemistry‐inspired search for potential HBC protein biomarkers. © 2008 Wiley Periodicals, Inc. J Comput Chem 2008
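The classification step described in the abstract (numerical descriptors fed to linear discriminant analysis) can be illustrated with a minimal sketch. It is not the authors' model: the two-column descriptor values below are made-up toy numbers standing in for ξ_k parameters, the function names (`fit_lda`, `predict`, `accuracy`) are hypothetical, and the decision threshold assumes equal class priors. It implements a plain two-class Fisher linear discriminant in pure Python.

```python
# Toy two-class Fisher linear discriminant for 2-descriptor inputs.
# ASSUMPTION: all descriptor values are invented for illustration and are
# NOT taken from the paper; they merely stand in for xi_k parameters.

def mean_vec(rows):
    """Column-wise mean of a list of 2-element descriptor vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, mu):
    """2x2 within-class scatter matrix around the class mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_lda(pos, neg):
    """Return (w, b) so that w.x + b > 0 predicts the positive class."""
    mu_p, mu_n = mean_vec(pos), mean_vec(neg)
    sp, sn = scatter(pos, mu_p), scatter(neg, mu_n)
    # Pooled within-class scatter S_w = S_pos + S_neg
    sw = [[sp[i][j] + sn[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_p[0] - mu_n[0], mu_p[1] - mu_n[1]]
    # Fisher direction: w = S_w^{-1} (mu_pos - mu_neg)
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Threshold at the midpoint of the class means (equal priors assumed)
    mid = [(mu_p[k] + mu_n[k]) / 2 for k in range(2)]
    b = -(w[0] * mid[0] + w[1] * mid[1])
    return w, b

def predict(model, x):
    """True if x falls on the positive (here: 'HBCp-like') side."""
    w, b = model
    return w[0] * x[0] + w[1] * x[1] + b > 0

def accuracy(model, pos, neg):
    """Fraction of labeled examples classified correctly."""
    hits = sum(predict(model, x) for x in pos)
    hits += sum(not predict(model, x) for x in neg)
    return hits / (len(pos) + len(neg))

# Hypothetical training descriptors (two made-up values per protein).
hbc = [(2.0, 1.0), (2.5, 1.5), (3.0, 1.2), (2.2, 0.8)]
non_hbc = [(0.5, 2.0), (1.0, 2.5), (0.8, 3.0), (1.2, 2.2)]

model = fit_lda(hbc, non_hbc)
train_accuracy = accuracy(model, hbc, non_hbc)
```

In practice the same `accuracy` function would be applied to a held-out set of proteins, which corresponds to the external prediction series mentioned in the abstract; training accuracy alone overstates how well a discriminant generalizes.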