
A Survey on Image Captioning datasets and Evaluation Metrics
Author(s) - Himanshu Sharma
Publication year - 2021
Publication title - IOP Conference Series: Materials Science and Engineering
Language(s) - English
Resource type - Journals
eISSN - 1757-899X
pISSN - 1757-8981
DOI - 10.1088/1757-899x/1116/1/012184
Subject(s) - closed captioning, computer science, convolutional neural network, image (mathematics), task (project management), artificial intelligence, sentence, natural language processing, natural language, artificial neural network, pattern recognition (psychology), management, economics
In image captioning, a natural language description is generated for a given image. The task combines two subfields of artificial intelligence: computer vision and natural language generation. A Convolutional Neural Network (CNN) is generally used to extract image features, and a language model such as a Recurrent Neural Network (RNN) is used to generate the sentence. This paper discusses the datasets and evaluation metrics that are useful for the image captioning task, and summarizes the datasets and evaluation metrics used by state-of-the-art image captioning models.
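
The abstract does not name a specific evaluation metric, so the following is only an illustrative sketch of how an automatic caption metric can compare a generated caption with human reference captions. It assumes clipped n-gram precision, the building block of BLEU-style scores, and whitespace tokenization. The function names `ngrams` and `clipped_precision` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, references, n=1):
    """Clipped n-gram precision of one candidate caption.

    Assumption: captions are lowercased and split on whitespace.
    Each candidate n-gram is credited at most as many times as it
    occurs in the single reference where it is most frequent.
    """
    cand_counts = Counter(ngrams(candidate.lower().split(), n))
    if not cand_counts:
        return 0.0

    # Largest count of each n-gram found in any one reference caption.
    max_ref = Counter()
    for ref in references:
        ref_counts = Counter(ngrams(ref.lower().split(), n))
        for gram, count in ref_counts.items():
            max_ref[gram] = max(max_ref[gram], count)

    matched = sum(min(count, max_ref[gram])
                  for gram, count in cand_counts.items())
    return matched / sum(cand_counts.values())
```

For the candidate "a dog runs on the grass" and the references "a dog is running on the grass" and "the dog runs across a field", this sketch gives a unigram precision of 1.0 and a bigram precision of 0.8, because the bigram "runs on" appears in neither reference. A full metric would typically combine several n-gram orders and penalize overly short captions, which is omitted here.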