Open Access
Multimodal Metaphor Detection Based on Affective Embedding and Deep Transformer
Author(s) -
Sheng Guo,
Narinderjit Singh Sawaran,
Goh Khang Wen
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3613763
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
We propose to detect and interpret multimodal metaphors using affective embeddings and deep transformer representations across diverse linguistic and visual contexts. Designed for large-scale figurative language analysis, our system uncovers coherent, metaphorically rich expressions through hierarchical multimodal learning. The pipeline extracts diverse semantic features, such as emotional valence and conceptual mappings, enhanced via weakly supervised learning to ensure broad coverage of metaphorical phenomena. An affective-embedding-based feature selection strategy filters out less discriminative attributes, producing refined multimodal representations. These representations are projected into a transformer, where each expression is modeled as a spatial sequence of abstract conceptual blends, supporting precise metaphor identification. We construct a weighted similarity graph from these multimodal embeddings, enabling large-scale metaphor clustering through advanced graph-based detection. The resulting metaphor communities reflect shared conceptual mappings, such as anger-as-heat or life-as-journey patterns, and reveal both conventional and novel metaphorical associations. To deliver accurate metaphor interpretation, a ranking module integrates individual expression features with community-level conceptual patterns to suggest relevant metaphorical meanings. Evaluations on the Multimodal Metaphor Dataset (MMD-1.3M), comprising 1.3 million instances spanning 50 conceptual categories, show that our model achieves a BER score of 0.487 on metaphor identification, outperforming strong baselines such as ViLBERT and CLIP by more than 6 points. The system also achieves 0.792 clustering precision on novel metaphorical associations, confirming its scalability and accuracy across varied metaphor types and its effectiveness in large-scale figurative language processing.
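The weighted-similarity-graph clustering step described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the cosine-similarity threshold, the toy embeddings, and the use of connected components in place of the paper's more advanced graph-based community detector are all assumptions for demonstration.

```python
# Sketch (assumed, not from the paper): multimodal embeddings ->
# weighted cosine-similarity graph -> metaphor "communities".
import numpy as np


def build_similarity_graph(embeddings, threshold=0.8):
    """Adjacency list linking expressions whose cosine similarity
    exceeds `threshold` (threshold value is an assumption)."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = len(embeddings)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sims[i, j] > threshold:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def metaphor_communities(embeddings, threshold=0.8):
    """Group expressions into communities. Here: connected components
    of the similarity graph, standing in for the paper's detector."""
    adj = build_similarity_graph(embeddings, threshold)
    seen, communities = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node])
        seen |= comp
        communities.append(comp)
    return communities


# Toy demo: two tight clusters standing in for, e.g., anger-as-heat
# vs. life-as-journey embeddings.
rng = np.random.default_rng(0)
emb = np.vstack([
    rng.normal((1.0, 0.0), 0.05, (5, 2)),  # cluster A
    rng.normal((0.0, 1.0), 0.05, (5, 2)),  # cluster B
])
communities = metaphor_communities(emb, threshold=0.9)
```

On this toy input the two well-separated clusters are recovered as two communities; at scale, the same thresholded graph would feed a modularity-based community detector rather than plain connected components.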
