
Customer churn prediction based on LASSO and Random Forest models
Author(s) -
Qiannan Zhu,
Xinyi Yu,
Yaping Zhao,
Deyi Li
Publication year - 2019
Publication title -
IOP Conference Series: Materials Science and Engineering
Language(s) - English
Resource type - Journals
eISSN - 1757-899X
pISSN - 1757-8981
DOI - 10.1088/1757-899x/631/5/052008
Subject(s) - random forest, multicollinearity, lasso (statistics), computer science, generalization, construct (python library), econometrics, predictive modelling, regression analysis, regression, machine learning, artificial intelligence, data mining, statistics, mathematics, programming language, mathematical analysis, world wide web
Customer churn probability is influenced by many factors. Because real-world problems are complex, high-dimensional data often exhibit multicollinearity, so ordinary regression models are no longer applicable, while a Random Forest model applied without prior data processing requires a large amount of computation and generalizes poorly. We therefore construct a LASSO-RF model, building on existing theory, in which a Random Forest model makes predictions from the variables selected by a LASSO model. Using the member data of an airline company as an example, we carry out an empirical study. The results show that, compared with the LASSO model or the Random Forest model alone, the LASSO-RF model has a lower computational cost, higher prediction accuracy, and stronger generalization ability.
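
The two-stage idea described in the abstract (LASSO selects variables, Random Forest predicts from them) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the airline member data, and the penalty `alpha=0.02`, the number of trees, and all variable names are illustrative choices rather than values from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for churn data (assumption): 30 features, of which
# 10 are linear combinations of the 5 informative ones (multicollinearity).
X, y = make_classification(
    n_samples=1000,
    n_features=30,
    n_informative=5,
    n_redundant=10,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

lasso_rf = Pipeline(
    steps=[
        # Standardize so the L1 penalty treats all features on the same scale.
        ("scale", StandardScaler()),
        # Stage 1: fit a LASSO regression on the 0/1 churn label and keep
        # only the features whose coefficients are not shrunk to zero.
        ("select", SelectFromModel(Lasso(alpha=0.02))),
        # Stage 2: Random Forest trained on the selected features only.
        ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
    ]
)
lasso_rf.fit(X_train, y_train)

n_selected = int(lasso_rf.named_steps["select"].get_support().sum())
accuracy = lasso_rf.score(X_test, y_test)
print(f"features kept by LASSO: {n_selected} of {X.shape[1]}")
print(f"held-out accuracy of LASSO-RF: {accuracy:.3f}")
```

Wrapping both stages in one `Pipeline` means the variable selection is refit on the training split only, so the held-out score is not contaminated by the test data. In practice the penalty strength would be tuned, for example by cross-validation, rather than fixed.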