Open Access
Semantic‐meshed and content‐guided transformer for image captioning
Author(s) -
Li Xuan,
Zhang Wenkai,
Sun Xian,
Gao Xin
Publication year - 2022
Publication title -
IET Computer Vision
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.38
H-Index - 37
eISSN - 1751-9640
pISSN - 1751-9632
DOI - 10.1049/cvi2.12099
Subject(s) - closed captioning , computer science , transformer , artificial intelligence , information retrieval , natural language processing , image (mathematics) , computer vision
The transformer architecture has become the dominant framework for image captioning owing to its superior performance. However, existing transformer‐based methods often fail to exploit multi‐level semantic information in an integrated way and are weak at keeping captions relevant to the image. In this paper, a semantic‐meshed and content‐guided transformer network is introduced for image captioning to address these problems. The semantic‐meshed mechanism lets the model generate words by adaptively selecting semantic information from multiple interaction levels through attention‐based reconstruction, while the content‐guided module steers word generation with attribute features that represent the image content, keeping the generated caption consistent with the main content of the image. Experiments on the MSCOCO captioning dataset validate the authors' model, which achieves superior results compared with other state‐of‐the‐art approaches.
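To make the "semantic‐meshed" idea concrete, the following is a minimal NumPy sketch of attention‐based fusion over multi‐level features: a decoder query scores each interaction level, and the levels are combined by the resulting attention weights. All names, shapes, and the projection matrix here are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def semantic_mesh(level_feats, query, proj):
    """Adaptively fuse per-level semantic features via attention.

    level_feats: (L, d) array, one feature vector per interaction level
                 (hypothetical stand-in for the multi-level outputs).
    query:       (d,) current decoder state.
    proj:        (d, d) learned projection (assumed; stands in for
                 whatever scoring the paper's reconstruction uses).
    """
    scores = level_feats @ (proj @ query)   # (L,) relevance of each level
    alpha = softmax(scores)                 # attention weights over levels
    return alpha @ level_feats              # (d,) meshed semantic feature

rng = np.random.default_rng(0)
L, d = 3, 8
fused = semantic_mesh(rng.normal(size=(L, d)),
                      rng.normal(size=d),
                      rng.normal(size=(d, d)))
print(fused.shape)
```

At each decoding step the weights `alpha` are recomputed from the current state, so different words can draw on different interaction levels, which is the adaptive selection the abstract describes.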
