
Explainable AI (XAI) approach to image captioning
Author(s) - Han SeungHo, Kwon MinSu, Choi HoJin
Publication year - 2020
Publication title - The Journal of Engineering
Language(s) - English
Resource type - Journals
ISSN - 2051-3305
DOI - 10.1049/joe.2019.1217
Subject(s) - closed captioning , computer science , sentence , artificial intelligence , task (project management) , phrase , image (mathematics) , object (grammar) , word (group theory) , natural language processing , speech recognition , linguistics , economics , philosophy , management
This article presents an eXplainable AI (XAI) approach to image captioning. Deep learning techniques have recently been applied intensively to this task, with relatively good performance. Because of the 'black-box' nature of deep learning, however, existing approaches cannot explain why specific words were selected when generating a caption for a given image, and they occasionally generate absurd captions. To overcome this problem, this article proposes an explainable image captioning model that provides a visual link between the region of an object (or a concept) in the given image and the particular word (or phrase) in the generated sentence. The model has been evaluated on two datasets, MSCOCO and Flickr30K, and both quantitative and qualitative results are presented to show its effectiveness.