Correlation and redundancy on machine learning performance for chemical databases
Author(s) - Li Hongzhi, Li Wenze, Pan Xuefeng, Huang Jiaqi, Gao Ting, Hu LiHong, Li Hui, Lu Yinghua
Publication year - 2018
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.3023
Subject(s) - redundancy (engineering), support vector machine, correlation, computer science, machine learning, regression, linear regression, regression analysis, artificial intelligence, random forest, variable (mathematics), variables, data mining, mathematics, statistics, geometry, operating system, mathematical analysis
Abstract - Variable reduction is an essential step in establishing a robust, accurate, and generalized machine learning model. Variable correlation and redundancy/total correlation are the primary considerations in many variable reduction methods because they directly affect model performance. However, their effects vary from one class of databases to another. To clarify their effects on regression models built from small chemical databases, a series of calculations is performed. Regression models are built on features with various correlation coefficients and redundancies using 4 machine learning methods: random forest, support vector machine, extreme learning machine, and multiple linear regression. The results suggest that correlation is, as expected, closely related to prediction accuracy; i.e., features with large correlation coefficients with respect to the response variable generally yield better regression models than those with smaller ones. For redundancy, however, no trend in the performance of the regression models is observed. This may indicate that, for these chemical molecular databases, redundancy is not a primary concern.
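
The two quantities contrasted in the abstract can be illustrated with a minimal sketch. It assumes Pearson's r as the feature-to-response correlation and uses the mean absolute pairwise correlation among features as a simple stand-in for redundancy. This is not the paper's total-correlation measure, and the function names, descriptor names, and toy values are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (sx * sy)

def rank_by_relevance(features, response):
    """Sort feature names by |r| with the response, largest first."""
    scores = {name: abs(pearson(col, response)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)

def mean_pairwise_redundancy(features, names):
    """Mean |r| over all pairs of the named features (an assumed redundancy proxy)."""
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    if not pairs:
        return 0.0
    return sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)

# Made-up toy descriptors; not data from the paper.
features = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.0, 4.0, 6.0, 8.0, 10.0],  # exact multiple of d1, so fully redundant with it
    "d3": [5.0, 3.0, 4.0, 1.0, 2.0],
}
response = [1.1, 1.9, 3.2, 3.9, 5.1]

ranking = rank_by_relevance(features, response)
redundancy = mean_pairwise_redundancy(features, ["d1", "d2"])
```

In a study of the kind described, feature subsets with different relevance and redundancy scores would each be passed to a regressor and the prediction errors compared. The regression step is omitted here to keep the sketch dependency-free.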
