Development of novel deep multimodal representation learning‐based model for the differentiation of liver tumors on B‐mode ultrasound images
Author(s) -
Sato Masaya,
Kobayashi Tamaki,
Soroida Yoko,
Tanaka Takashi,
Nakatsuka Takuma,
Nakagawa Hayato,
Nakamura Ayaka,
Kurihara Makiko,
Endo Momoe,
Hikita Hiromi,
Sato Mamiko,
Gotoh Hiroaki,
Iwai Tomomi,
Tateishi Ryosuke,
Koike Kazuhiko,
Yatomi Yutaka
Publication year - 2022
Publication title -
Journal of Gastroenterology and Hepatology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.214
H-Index - 130
eISSN - 1440-1746
pISSN - 0815-9319
DOI - 10.1111/jgh.15763
Subject(s) - convolutional neural network , deep learning , medicine , artificial intelligence , liver tumor , representation learning , pattern recognition , radiology , computer science , pathology , hepatocellular carcinoma
Background and Aim Multimodal representation learning, which combines images with other information such as numerical or language data, has recently gained much attention. The aim of the current study was to analyze the diagnostic performance of a deep multimodal representation model that integrates the tumor image, patient background, and blood biomarkers for the differentiation of liver tumors observed on B-mode ultrasonography (US).
Method First, we applied supervised learning with a convolutional neural network (CNN) to 972 liver nodules in the training and development sets to develop a predictive model based on segmented B-mode tumor images. In addition, we applied a deep multimodal representation model to integrate information on patient background and blood biomarkers with the B-mode images. We then evaluated the performance of the models in an independent test set of 108 liver nodules.
Results Using only the segmented B-mode images, the diagnostic accuracy and area under the curve (AUC) were 68.52% and 0.721, respectively. As information on patient background and blood biomarkers was integrated, the diagnostic performance increased stepwise. The diagnostic accuracy and AUC of the multimodal deep learning model, which integrated the B-mode tumor image with patient age, sex, aspartate aminotransferase, alanine aminotransferase, platelet count, and albumin, reached 96.30% and 0.994, respectively.
Conclusion Integration of patient background and blood biomarkers with the US image using multimodal representation learning outperformed the CNN model that used US images alone. We expect that the deep multimodal representation model could be a feasible and acceptable tool for the definitive diagnosis of liver tumors on B-mode US.
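As an illustration of the kind of architecture described in the abstract, the following is a minimal sketch, not the authors' implementation, of a two-branch multimodal network in PyTorch: a small CNN encodes the segmented B-mode tumor image, a small MLP encodes the six clinical variables (age, sex, AST, ALT, platelet count, albumin), and the concatenated joint representation feeds a classifier. The class name MultimodalLiverNet, the layer sizes, and the binary benign-versus-malignant output are illustrative assumptions.

```python
# Hypothetical sketch of a multimodal (image + tabular) fusion network.
# Not the paper's code; layer sizes, class name, and label scheme are assumptions.
import torch
import torch.nn as nn

class MultimodalLiverNet(nn.Module):
    def __init__(self, num_clinical: int = 6, num_classes: int = 2):
        super().__init__()
        # Image branch: small CNN over a single-channel (grayscale) B-mode crop.
        self.image_branch = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (batch, 32)
        )
        # Clinical branch: small MLP over the tabular biomarkers.
        self.clinical_branch = nn.Sequential(
            nn.Linear(num_clinical, 16), nn.ReLU(),  # -> (batch, 16)
        )
        # Fusion head: classifier over the concatenated joint representation.
        self.classifier = nn.Sequential(
            nn.Linear(32 + 16, 32), nn.ReLU(),
            nn.Linear(32, num_classes),
        )

    def forward(self, image: torch.Tensor, clinical: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.image_branch(image), self.clinical_branch(clinical)], dim=1)
        return self.classifier(fused)

# Example forward pass with dummy data: a batch of 4 grayscale 64x64 crops
# plus 6 clinical values per patient.
model = MultimodalLiverNet()
logits = model(torch.randn(4, 1, 64, 64), torch.randn(4, 6))
print(logits.shape)  # torch.Size([4, 2])
```

The design choice shown here is late fusion by concatenation, which matches the stepwise gains reported in the abstract: adding each clinical variable simply widens the tabular input while the image branch stays unchanged.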