A novel molecular descriptor selection method in QSAR classification model based on weighted penalized logistic regression
Author(s) - Algamal Zakariya Yahya, Lee Muhammad Hisyam
Publication year - 2017
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.2915
Subject(s) - weighting , logistic regression , pattern recognition (psychology) , artificial intelligence , selection (genetic algorithm) , basis (linear algebra) , mathematics , feature selection , computer science , data mining , statistics , medicine , geometry , radiology
Molecular descriptor selection is a pivotal tool for quantitative structure–activity relationship (QSAR) modeling. This paper proposes a novel molecular descriptor selection method that takes into account the group type to which each descriptor belongs. The method combines penalized logistic regression with the 2‐sample t test, so that it performs filtering and weighting simultaneously. Specifically, the 2‐sample t test is employed as a filter that removes descriptors that do not show a statistically significant difference. In addition, a weighted penalized logistic regression is used, in which each descriptor is assigned a weight that depends on its 2‐sample t test value within its descriptor type block. The proposed method is experimentally tested and compared with state‐of‐the‐art selection methods. The results show that the proposed method is simpler and faster while achieving efficient classification performance.
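The filter-then-weight idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weight formula, the penalty type, or the significance threshold, so everything specific here is an assumption. In particular, the function name `select_descriptors`, the Welch form of the t statistic, the cutoff `t_cut` on |t| (used in place of a p-value), the weight "largest |t| in the block divided by the descriptor's own |t|", the L1 penalty, and the proximal-gradient solver with parameters `lam`, `lr`, and `n_iter` are all hypothetical choices made for illustration.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample (Welch) t statistic for two lists of values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def select_descriptors(X, y, groups, t_cut=2.0, lam=0.05, lr=0.1, n_iter=500):
    """Hypothetical sketch: t-test filter, then weighted L1 logistic regression.

    X is a list of rows, y a list of 0/1 labels, and groups[j] names the
    descriptor type block of column j. Assumes t_cut > 0.
    Returns {column index: coefficient} for descriptors with nonzero weight.
    """
    n, p = len(X), len(X[0])

    # Step 1 (filter): keep descriptors whose |t| between classes is large.
    t_abs = []
    for j in range(p):
        pos = [X[i][j] for i in range(n) if y[i] == 1]
        neg = [X[i][j] for i in range(n) if y[i] == 0]
        t_abs.append(abs(welch_t(pos, neg)))
    kept = [j for j in range(p) if t_abs[j] >= t_cut]

    # Step 2 (weight): ASSUMED formula. Within each block, the descriptor
    # with the largest |t| gets weight 1; weaker ones are penalized more.
    block_max = {}
    for j in kept:
        block_max[groups[j]] = max(block_max.get(groups[j], 0.0), t_abs[j])
    w = {j: block_max[groups[j]] / t_abs[j] for j in kept}

    # Step 3 (fit): proximal gradient descent on the logistic loss with
    # penalty lam * sum_j w[j] * |beta[j]|; the intercept is unpenalized.
    beta = {j: 0.0 for j in kept}
    b0 = 0.0
    for _ in range(n_iter):
        grad = {j: 0.0 for j in kept}
        g0 = 0.0
        for i in range(n):
            eta = b0 + sum(beta[j] * X[i][j] for j in kept)
            eta = max(-30.0, min(30.0, eta))  # guard against overflow in exp
            r = 1.0 / (1.0 + math.exp(-eta)) - y[i]
            g0 += r / n
            for j in kept:
                grad[j] += r * X[i][j] / n
        b0 -= lr * g0
        for j in kept:
            beta[j] = soft_threshold(beta[j] - lr * grad[j], lr * lam * w[j])

    return {j: b for j, b in beta.items() if b != 0.0}
```

Calling `select_descriptors(X, y, groups)` with a descriptor matrix, binary activity labels, and one block label per column returns the surviving descriptors and their coefficients. Filtering before fitting shrinks the optimization problem, which is one plausible reading of why the abstract describes the method as simpler and faster.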