Image To Latex with DenseNet Encoder and Joint Attention
Author(s) -
Jian Wang,
Yunchuan Sun,
Shenling Wang
Publication year - 2019
Publication title -
Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2019.01.246
Subject(s) - computer science, closed captioning, encoder, joint (building), sequence (biology), feature (linguistics), field (mathematics), artificial intelligence, image (mathematics), baseline (sea), convolutional neural network, pattern recognition (psychology), architectural engineering, linguistics, philosophy, oceanography, mathematics, biology, pure mathematics, engineering, genetics, geology, operating system
Mathematical formula structural analysis usually converts mathematical formulas in images into LaTeX code. OpenAI has named this task Image2Latex. At present, many researchers apply models from the field of image captioning to Image2Latex and have achieved good results. In this paper, we propose several improvements to the baseline model, a sequence-to-sequence model used in image captioning. We improve the encoder by employing a densely connected convolutional network (DenseNet), because it strengthens feature extraction and facilitates gradient propagation. We also propose a more effective joint attention mechanism that includes both spatial attention and channel-wise attention. We conducted experiments on the im2latex-100k dataset. Experimental results show that our model improves the performance of formula analysis.
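The abstract does not give the exact formulation of the joint attention mechanism, so the following is only an illustrative sketch of how channel-wise and spatial attention could be combined over an encoder feature map. Everything in it is an assumption made for illustration: the function names, the channel-then-spatial ordering, the simple dot-product scoring, and the use of a `query` vector standing in for a projection of the decoder state. Learned projection weights, which a real model would have, are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(features, query):
    """Weight each channel by its mean activation scaled by the query.

    features: C x N list of lists (C channels, N flattened spatial locations).
    query:    length-C list (hypothetical projection of the decoder state).
    """
    scores = [query[c] * sum(row) / len(row) for c, row in enumerate(features)]
    return softmax(scores)


def spatial_attention(features, query):
    """Weight each spatial location by the dot product of its channel vector with the query."""
    num_channels = len(features)
    num_locations = len(features[0])
    scores = [
        sum(query[c] * features[c][i] for c in range(num_channels))
        for i in range(num_locations)
    ]
    return softmax(scores)


def joint_attention(features, query):
    """Apply channel-wise attention first, then spatial attention (assumed ordering)."""
    beta = channel_attention(features, query)
    reweighted = [[beta[c] * v for v in row] for c, row in enumerate(features)]
    alpha = spatial_attention(reweighted, query)
    # Context vector: attention-weighted sum over locations, one value per channel.
    context = [sum(a * v for a, v in zip(alpha, row)) for row in reweighted]
    return beta, alpha, context


if __name__ == "__main__":
    # Toy feature map: 2 channels, 3 spatial locations.
    features = [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]]
    query = [1.0, 0.2]
    beta, alpha, context = joint_attention(features, query)
    print("channel weights:", beta)
    print("spatial weights:", alpha)
    print("context vector:", context)
```

In this toy input, the first channel has the larger query-scaled mean activation and the third location has the strongest response, so they receive the largest channel and spatial weights respectively. In a decoder, the resulting context vector would be recomputed at each output step and fed to the sequence model that emits LaTeX tokens.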