Open Access
Hierarchical Aligned Multimodal Learning for NER on Tweet Posts
Author(s)
Peipei Liu,
Hong Li,
Yimo Ren,
Jie Liu,
Shuaizong Si,
Hongsong Zhu,
Limin Sun
Publication year
2024
Mining structured knowledge from tweets using named entity recognition (NER) can benefit many downstream applications such as recommendation and intention understanding. With tweet posts tending to be multimodal, multimodal named entity recognition (MNER) has attracted increasing attention. In this paper, we propose a novel approach that can dynamically align the image and text sequence and achieve multi-level cross-modal learning to augment textual word representations for MNER improvement. Specifically, our framework can be split into three main stages: the first stage focuses on intra-modality representation learning to derive the implicit global and local knowledge of each modality; the second evaluates the relevance between the text and its accompanying image and integrates different-grained visual information based on that relevance; the third enforces semantic refinement via iterative cross-modal interactions and co-attention. We conduct experiments on two open datasets, and the results and detailed analysis demonstrate the advantage of our model.
Language(s)
English
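The paper's code is not available on this page; the following PyTorch sketch is only a rough illustration of the kind of relevance-gated co-attention fusion the abstract describes. The module name, dimensions, the sigmoid relevance gate, and the single co-attention round are all assumptions for illustration, not the authors' implementation.

# Illustrative sketch only (assumptions, not the authors' code): one round of
# bidirectional co-attention between word and image-region features, with a
# sigmoid relevance gate deciding how much visual context augments each word.
import torch
import torch.nn as nn


class CoAttentionFusion(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.img2text = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.text2img = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Scalar text-image relevance gate (analogue of the abstract's
        # second, relevance-evaluation stage).
        self.gate = nn.Sequential(nn.Linear(2 * dim, 1), nn.Sigmoid())

    def forward(self, text: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        # text: (B, T, dim) word features; image: (B, R, dim) region features.
        # One co-attention round: regions attend to words, then words attend
        # to the refined regions (analogue of the third, refinement stage).
        img_refined = image + self.img2text(image, text, text)[0]
        vis_ctx = self.text2img(text, img_refined, img_refined)[0]
        # Gate the visual context by a global text-image relevance score.
        g = self.gate(torch.cat([text.mean(1), image.mean(1)], dim=-1))  # (B, 1)
        return text + g.unsqueeze(1) * vis_ctx  # augmented word representations


if __name__ == "__main__":
    words = torch.randn(2, 12, 256)    # e.g. projected BERT word embeddings
    regions = torch.randn(2, 49, 256)  # e.g. projected 7x7 CNN grid features
    print(CoAttentionFusion()(words, regions).shape)  # torch.Size([2, 12, 256])

In a full MNER pipeline, the augmented word representations returned here would feed a token classifier (typically a CRF layer) that emits the entity tags.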
