Contrastive and Attention-Based Multimodal Fusion: Detecting Negative Memes Through Diverse Fusion Strategies
Author(s) -
BS Narendiran,
Srividya,
N. Sumith
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3613694
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
With the rise of social media, accurately analyzing sentiment expressed through both textual and visual media has become a challenging problem. Not all negatively charged memes are harmful, but sentiment analysis helps distinguish general negativity from content that may contribute to online harms such as cyberbullying, misinformation, or hate speech. Conventional sentiment analysis approaches, which concentrate mainly on text, can miss nuanced emotional cues present in images, emojis, or the overall context of a conversation. Subtle negative expressions are often misclassified because models do not effectively learn fine-grained differences between sentiment classes, and existing approaches that rely on static fusion often fail to detect fine-grained or subtle negativity. In this work we propose a Gated Contrastive Multimodal Fusion Model that processes the text and image streams of a meme through independent modality-specific neural networks and combines them by means of an attention mechanism. The gated fusion block in our architecture allows the model to pay closer attention to the most sentiment-rich elements. The contrastive block provides a training mechanism that improves the classification of negative memes. Experiments on benchmark datasets show that our method provides interpretable insights into the contribution of each modality and outperforms state-of-the-art (SOTA) models, opening the door to better identification of harmful online memes.
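The abstract does not give the equations behind the gated fusion block or the contrastive block, so the following is only a minimal sketch of the two general ideas, not the authors' implementation. It assumes an element-wise sigmoid gate that blends a text feature vector with an image feature vector, and a supervised contrastive loss over cosine similarities as one common formulation of a contrastive objective. All names (`gated_fusion`, `supervised_contrastive_loss`, `w_text`, `w_image`, `temperature`) are hypothetical and chosen for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_vec, image_vec, w_text, w_image, bias):
    """Blend two same-length feature vectors with an element-wise gate.

    Assumed form (not taken from the paper):
        g = sigmoid(w_text * t + w_image * v + bias)
        fused = g * t + (1 - g) * v
    The gate values are returned too, since they indicate how much
    each modality contributes to each fused feature.
    """
    fused, gates = [], []
    for t, v, wt, wv, b in zip(text_vec, image_vec, w_text, w_image, bias):
        g = sigmoid(wt * t + wv * v + b)
        gates.append(g)
        fused.append(g * t + (1.0 - g) * v)
    return fused, gates


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def supervised_contrastive_loss(embeddings, labels, temperature=0.5):
    """Average supervised contrastive loss over anchors that have a positive.

    Samples sharing a label are pulled together; all other samples
    in the batch act as negatives.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        denom = sum(
            math.exp(cosine(embeddings[i], embeddings[k]) / temperature)
            for k in range(n)
            if k != i
        )
        log_probs = [
            math.log(math.exp(cosine(embeddings[i], embeddings[j]) / temperature) / denom)
            for j in positives
        ]
        total += -sum(log_probs) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy usage: with zero gate weights the gate is 0.5, so fusion is a plain average.
fused, gates = gated_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])

# Embeddings that cluster by label should give a lower loss than mismatched labels.
batch = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
loss_clustered = supervised_contrastive_loss(batch, [0, 0, 1, 1])
loss_mixed = supervised_contrastive_loss(batch, [0, 1, 0, 1])
```

In this sketch the gate weights are fixed inputs; in a trained model they would be learned jointly with the classifier, and the contrastive term would typically be added to the classification loss with a weighting factor.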