Tuning parameter identification for variable selection algorithm using the sum of ranking differences algorithm
Author(s) -
Nie Mingpeng,
Meng Liuwei,
Chen Xiaojing,
Hu Xinyu,
Li Limin,
Yuan Leimin,
Shi Wen
Publication year - 2019
Publication title -
Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.3113
Subject(s) - partial least squares regression, variable elimination, latent variable, ranking (information retrieval), feature selection, algorithm, variable (mathematics), mathematics, statistics, projection (relational algebra), computer science, artificial intelligence, mathematical analysis, inference
Variable selection algorithms are often adopted to select optimal variables from the full set of variables and are effective at reducing variable dimensionality and improving model accuracy. Nonetheless, the parameters of the variable selection method and the regression model, such as the number of latent variables of the partial least squares (PLS) model and the threshold value of the variable importance index, need to be identified. These parameters directly determine the final performance of the model, yet they are currently often chosen subjectively. As a result, the model results may be arbitrary, reflecting the subjective choice of parameters rather than the data. To identify these parameters objectively, the sum of ranking differences (SRD) method coupled with partial least squares‐variable importance in projection (PLS‐VIP‐SRD) and partial least squares‐uninformative variable elimination (PLS‐UVE‐SRD) algorithms were applied to determine the number of latent variables of the PLS model and the threshold value of the variable importance index. Furthermore, public near‐infrared data of corn were used as the calculation data. The final results show that the PLS‐VIP‐SRD and PLS‐UVE‐SRD models determine the optimal parameter combination more effectively and objectively than the PLS‐VIP and PLS‐UVE models. Moreover, the selected variables are easier to interpret, and the prediction accuracy is also improved to some extent.
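To illustrate the core ranking step described in the abstract, the following is a minimal Python sketch of the sum of ranking differences (SRD) comparison of candidate parameter settings. It assumes each candidate (e.g., a combination of latent-variable count and VIP threshold) produces a vector of per-sample values such as prediction errors, and that the row-wise mean is used as the consensus reference; the function name, the example error matrix, and the choice of reference are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of sum of ranking differences (SRD) for comparing
# candidate parameter settings; smaller SRD = closer to the reference ranking.
import numpy as np
from scipy.stats import rankdata

def srd(matrix, reference):
    """Sum of ranking differences.

    matrix    : (n_objects, n_candidates) values produced by each candidate,
                e.g. per-sample prediction errors for each parameter combination.
    reference : (n_objects,) reference values, e.g. the row-wise mean (consensus)
                or a known gold standard.

    Returns one SRD value per candidate column.
    """
    ref_ranks = rankdata(reference)                       # rank objects by the reference
    cand_ranks = np.apply_along_axis(rankdata, 0, matrix) # rank objects within each candidate
    return np.abs(cand_ranks - ref_ranks[:, None]).sum(axis=0)

# Hypothetical example: 3 candidate parameter settings evaluated on 5 samples.
errors = np.array([[0.10, 0.30, 0.12],
                   [0.20, 0.25, 0.18],
                   [0.05, 0.40, 0.07],
                   [0.30, 0.10, 0.28],
                   [0.15, 0.20, 0.14]])
reference = errors.mean(axis=1)           # row-wise mean as consensus reference
print(srd(errors, reference))             # candidate with the smallest SRD is preferred
```

In the paper's setting, each column of the matrix would correspond to one parameter combination (number of PLS latent variables and VIP/UVE threshold), and the combination with the smallest SRD, typically validated against a randomization test, would be selected as the objective choice.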