
Late fusion of deep learning and handcrafted visual features for biomedical image modality classification
Author(s) - Lee Sheng Long, Zare Mohammad Reza, Müller Henning
Publication year - 2019
Publication title - IET Image Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.401
H-Index - 45
eISSN - 1751-9667
pISSN - 1751-9659
DOI - 10.1049/iet-ipr.2018.5054
Subject(s) - artificial intelligence, modality (human–computer interaction), computer science, deep learning, fusion, pattern recognition (psychology), contextual image classification, image fusion, image (mathematics), computer vision
Much medical knowledge is stored in the biomedical literature, collected in archives such as PubMed Central that continue to grow rapidly. A significant part of this knowledge is contained in images with little available metadata, which makes the visual knowledge in the biomedical literature difficult to explore. Extracting metadata from the visual content is therefore important. One important piece of metadata is the image type, which can be one of the various medical imaging modalities, such as X‐ray, computed tomography, or magnetic resonance images, but also one of the general graph types that are frequent in the literature. This study explores a late, score‐based fusion of several deep convolutional neural networks with a traditional hand‐crafted bag‐of‐visual‐words classifier to classify images from the biomedical literature into image types or modalities. The fused approach achieved a classification accuracy of 85.51% on the ImageCLEF 2013 modality classification task, better than the best purely visual methods in the challenge the data were produced for, and comparable to mixed methods that use both visual and textual information. It achieved similarly good results on the related ImageCLEF 2016 subfigure classification task, with 84.23% and 87.04% classification accuracy before and after data augmentation, respectively.
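The abstract's core technique is late, score‐based fusion: each classifier (the deep CNNs and the bag‐of‐visual‐words model) produces a per‐class score vector for an image, and the vectors are combined before taking the final decision. Below is a minimal Python sketch of this idea, assuming softmax‐normalised scores and a simple weighted average; the function name, weights, and four‐class example are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np

def late_fusion(score_vectors, weights=None):
    """Fuse per-class score vectors from several classifiers.

    score_vectors: list of 1-D arrays, one per classifier, each of length
                   n_classes and normalised (e.g. softmax outputs).
    weights:       optional per-classifier weights; uniform if omitted.
    Returns the index of the winning class and the fused score vector.
    """
    scores = np.stack(score_vectors)  # shape: (n_classifiers, n_classes)
    if weights is None:
        weights = np.full(len(score_vectors), 1.0 / len(score_vectors))
    fused = np.average(scores, axis=0, weights=weights)
    return int(np.argmax(fused)), fused

# Hypothetical example: three classifiers scoring four modality classes.
cnn_a = np.array([0.70, 0.10, 0.10, 0.10])  # e.g. one deep CNN
cnn_b = np.array([0.55, 0.25, 0.10, 0.10])  # e.g. another CNN
bovw  = np.array([0.30, 0.40, 0.20, 0.10])  # e.g. the hand-crafted BoVW model
label, fused = late_fusion([cnn_a, cnn_b, bovw])
print(label, fused)  # class 0 wins after score averaging
```

Because the fusion operates on scores rather than on internal features, each model can be trained independently and the combination step stays cheap, which is one reason late fusion is a common way to mix deep and hand‐crafted classifiers.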