Comparative assessment of statistical and machine learning techniques towards estimating the risk of developing type 2 diabetes and cardiovascular complications
Author(s) -
Dalakleidi Kalliopi,
Zarkogianni Konstantia,
Thanopoulou Anastasia,
Nikita Konstantina
Publication year - 2017
Publication title -
Expert Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.365
H-Index - 38
eISSN - 1468-0394
pISSN - 0266-4720
DOI - 10.1111/exsy.12214
Subject(s) - logistic regression, computer science, calibration, artificial neural network, artificial intelligence, machine learning, receiver operating characteristic, diabetes mellitus, statistics, type 2 diabetes, medicine, mathematics, endocrinology
Abstract - The aim of the present study is to comparatively assess the performance of different machine learning and statistical techniques with regard to their ability to estimate the risk of developing type 2 diabetes mellitus (Case 1) and cardiovascular disease complications (Case 2). This is the first work investigating the application of ensembles of artificial neural networks (EANN) towards producing the 5‐year risk of developing type 2 diabetes mellitus and cardiovascular disease as a long‐term diabetes complication. The performance of the proposed models has been comparatively assessed with the performance obtained by applying logistic regression, Bayesian‐based approaches, and decision trees. The models' discrimination and calibration have been evaluated using the classification accuracy (ACC), the area under the curve (AUC) criterion, and the Hosmer–Lemeshow goodness of fit test. The obtained results demonstrate the superiority of the proposed models (EANN) over the other models. In Case 1, EANN with different topologies has achieved high discrimination and good calibration performance (ACC = 80.20%, AUC = 0.849, p value = .886). In Case 2, EANN based on bagging has resulted in good discrimination and calibration performance (ACC = 92.86%, AUC = 0.739, p value = .755).
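The three evaluation criteria named in the abstract (ACC, AUC, and the Hosmer–Lemeshow test) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 0.5 decision threshold, and the default of ten equal-sized risk groups are assumptions made here for illustration, since the abstract does not state them. The chi-square helper covers only even degrees of freedom, which is enough for an even number of groups.

```python
import math


def accuracy(y_true, y_prob, threshold=0.5):
    """ACC: share of cases whose thresholded risk matches the outcome.
    The 0.5 threshold is an assumption, not taken from the paper."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def auc(y_true, y_prob):
    """AUC as the probability that a random positive case receives a
    higher predicted risk than a random negative one (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable, using the closed
    form that holds only for even degrees of freedom."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 0.0
    for i in range(df // 2):
        total += term            # adds half**i / i!
        term *= half / (i + 1)
    return math.exp(-half) * total


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow statistic and p value over risk-sorted groups.
    Assumes every group is non-empty and has a mean risk strictly
    between 0 and 1; groups must be even so that df = groups - 2 is even."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # events seen in the group
        expected = sum(p for p, _ in chunk)   # events the model predicts
        mean_p = expected / size
        stat += (observed - expected) ** 2 / (size * mean_p * (1 - mean_p))
    return stat, chi2_sf_even_df(stat, groups - 2)
```

Under this test a large p value, such as the .886 and .755 reported above, means the observed event counts do not differ significantly from the predicted ones, which is read as good calibration. AUC and ACC, by contrast, measure discrimination only.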