Semi-supervised random forest regression model based on co-training and grouping with information entropy for evaluation of depression symptoms severity

Shengfu Lu, Xin Shi, Mi Li, Jinan Jiao, Lei Feng, Gang Wang

Open Access

Publication year - 2021

Publication title - Mathematical Biosciences and Engineering

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.451

H-Index - 45

eISSN - 1551-0018

pISSN - 1547-1063

DOI - 10.3934/mbe.2021233

Subject(s) - random forest, regression, regression analysis, depression (economics), entropy (arrow of time), psychology, statistics, artificial intelligence, mathematics, machine learning, computer science, physics, quantum mechanics, economics, macroeconomics

Semi-supervised learning is a long-standing topic in machine learning: it uses large amounts of unlabeled data to improve model performance. This paper combines the co-training strategy with random forest to propose a novel semi-supervised regression algorithm, the semi-supervised random forest regression model based on co-training and grouping with information entropy (E-CoGRF), and applies it to the evaluation of depression symptom severity. The algorithm inherits the ensemble characteristics of random forest and combines well with co-training. To balance the accuracy and diversity of the co-trained random forests, the algorithm applies a grouping strategy to the decision trees. In addition, information entropy is used to measure confidence, which avoids unnecessary repeated training and improves the efficiency of the model. For the practical application, we collect cognitive behavioral data on emotional conflict, based on the affective disorder seen in depression, and then carry out feature construction and normalization preprocessing. Finally, the model is tested on 35 labeled and 80 unlabeled depression patients. The proposed algorithm obtains MAE (Mean Absolute Error) = 3.63 and RMSE (Root Mean Squared Error) = 4.50, which is better than the other semi-supervised regression algorithms compared. The proposed method addresses the modeling difficulties caused by insufficient labeled samples and has important reference value for the diagnosis of depression symptom severity.
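The two error measures reported above, and the idea of using entropy as a confidence signal, can be illustrated with a minimal Python sketch. The MAE and RMSE functions follow the standard definitions. The `prediction_entropy` function is only an assumption for illustration: the abstract does not say how E-CoGRF computes entropy, so this sketch bins the predictions of individual trees (with a hypothetical `bin_width`) and takes the Shannon entropy of the bin frequencies, treating low entropy as high agreement among trees.

```python
import math
from collections import Counter


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def prediction_entropy(tree_predictions, bin_width=5.0):
    """Shannon entropy (in bits) of per-tree predictions after binning.

    Assumption for illustration only: predictions are grouped into bins of
    width `bin_width`; 0.0 means every tree falls in the same bin.
    """
    n = len(tree_predictions)
    bins = Counter(math.floor(p / bin_width) for p in tree_predictions)
    return sum((c / n) * math.log2(n / c) for c in bins.values())


# Made-up severity scores, not data from the paper.
y_true = [10, 20, 30, 40]
y_pred = [12, 18, 33, 35]
print(mae(y_true, y_pred))   # 3.0
print(rmse(y_true, y_pred))  # sqrt(10.5)

# Hypothetical per-tree predictions for two unlabeled samples.
agreeing_trees = [11.0, 12.5, 13.0, 14.9]   # all in one bin
scattered_trees = [4.0, 9.0, 14.0, 19.0]    # one tree per bin
print(prediction_entropy(agreeing_trees))   # 0.0
print(prediction_entropy(scattered_trees))  # 2.0
```

In a co-training loop, a score like this could be used to pseudo-label only the unlabeled samples whose trees agree, although the selection rule actually used by the paper may differ.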
