Applications of machine learning models in the prediction of gastric cancer risk in patients after Helicobacter pylori eradication
Author(s) -
Leung Wai K.,
Cheung Ka Shing,
Li Bofei,
Law Simon Y. K.,
Lui Thomas K. L.
Publication year - 2021
Publication title -
Alimentary Pharmacology and Therapeutics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.308
H-Index - 177
eISSN - 1365-2036
pISSN - 0269-2813
DOI - 10.1111/apt.16272
Subject(s) - medicine, Helicobacter pylori, cancer, logistic regression, gastroenterology, receiver operating characteristic
Summary

Background
The risk of gastric cancer after Helicobacter pylori (H. pylori) eradication remains unknown.

Aim
To evaluate the performances of seven different machine learning models in predicting gastric cancer risk after H. pylori eradication.

Methods
We identified H. pylori‐infected patients who had received clarithromycin‐based triple therapy between 2003 and 2014 in Hong Kong. Patients were divided into training (n = 64 238) and validation sets (n = 25 330), according to period of eradication therapy. The data were used to construct seven machine learning models to predict risk of gastric cancer development within 5 years after H. pylori eradication. A total of 26 clinical variables were input into these models. The performances were measured by the area under receiver operating characteristic curve (AUC) analysis.

Results
During a mean follow‐up of 4.7 years, 0.21% of H. pylori‐eradicated patients developed gastric cancer. Of the seven machine learning models, extreme gradient boosting (XGBoost) had the best performance in predicting cancer development (AUC 0.97, 95% CI 0.96‐0.98), and was superior to conventional logistic regression (AUC 0.90, 95% CI 0.84‐0.92). With the XGBoost model, the number of patients considered at high risk of gastric cancer was 6.6%, with miss rate of 1.9%. Patient age, presence of intestinal metaplasia, and gastric ulcer were the heavily weighted factors used by the XGBoost.

Conclusion
Based on simple baseline patient information, machine learning model can accurately predict the risk of post‐eradication gastric cancer. This model could substantially reduce the number of patients who require endoscopic surveillance.
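
The abstract reports model performance as AUC. As a reading aid, the following minimal Python sketch shows what that metric measures: the probability that a randomly chosen patient who developed cancer receives a higher risk score than a randomly chosen patient who did not. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the study's code or data, and the pairwise loop is chosen for clarity rather than speed.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    A tie between a positive and a negative counts as half a win
    (the Mann-Whitney U formulation of the AUC).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Synthetic toy data (NOT from the study): 1 = developed cancer, 0 = did not.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 0]
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.45, 0.90, 0.15]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise comparison is quadratic in the number of patients, so for a cohort of the size described in the Methods a rank-based implementation, such as the one provided by an established statistics library, would be the more practical choice.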