1364 Predicting obesity and smoking using medication data: a machine-learning approach
Author(s) - Sitwat Ali
Publication year - 2021
Publication title - International Journal of Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.406
H-Index - 208
eISSN - 1464-3685
pISSN - 0300-5771
DOI - 10.1093/ije/dyab168.030
Subject(s) - medicine , confidence interval , obesity , machine learning , receiver operating characteristic , confounding , cohort , record linkage , artificial intelligence , computer science , environmental health , population
Background: Administrative health datasets are widely used in public health research but often lack information about common confounders. We aimed to develop and validate machine learning (ML)-based models that use medication data from Australia’s Pharmaceutical Benefits Scheme (PBS) database to predict obesity and smoking.

Methods: We used data from the D-Health Trial (N = 18,000) and the QSkin Study (N = 43,794). Smoking history, height, and weight were self-reported at study entry. Linkage to the PBS dataset captured 5 years of medication data after cohort entry. We used age, sex, and medication use, classified using Anatomical Therapeutic Chemical (ATC) classification codes, as potential predictors of smoking and obesity. We trained gradient-boosted machine learning models using data for the first 80% of participants enrolled; models were validated using the remaining 20%. We assessed model performance overall and by sex and age, and compared models generated using 3 and 5 years of PBS data.

Results: In the validation dataset, using 3 years of PBS data, the area under the receiver operating characteristic curve (AUC) was 0.70 (95% confidence interval (CI) 0.68–0.71) for predicting obesity and 0.71 (95% CI 0.70–0.72) for predicting smoking. Models performed better in women than in men. Using 5 years of PBS data resulted in only marginal improvement.

Conclusions: Medication data in combination with age and sex can be used to predict obesity and smoking. These models may be of value to researchers using data collected for administrative purposes.
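The evaluation design described in the Methods (train on the first 80% of participants enrolled, validate on the remaining 20%, report AUC with a 95% CI) can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not say how the confidence intervals were obtained, so the percentile bootstrap used here is an assumption, and the function names `temporal_split`, `auc`, and `bootstrap_auc_ci` are hypothetical. The gradient-boosted model itself is not reproduced; `scores` stands for the predicted probabilities from any fitted classifier.

```python
import random
from typing import List, Sequence, Tuple


def temporal_split(rows: Sequence, train_fraction: float = 0.8) -> Tuple[list, list]:
    """Split rows that are already sorted by enrolment order.

    The first `train_fraction` of rows is used for training and the
    remainder for validation (no shuffling).
    """
    cut = int(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: probability that a random positive outranks a
    random negative, with tied scores counted as one half."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank within the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile-bootstrap CI for the AUC (an assumed method, see lead-in)."""
    n = len(labels)
    if not 0 < sum(labels) < n:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

In use, participant records would be sorted by enrolment date and passed to `temporal_split`, a classifier fitted on the training part, and its predicted probabilities for the validation part passed to `auc` and `bootstrap_auc_ci` together with the observed obesity or smoking labels.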