Multimodal Social Relationship Recognition Based on LLM
Author(s) - Haopeng Wang, Zhitian Zhang, Menglei Xia, Dejiao Huang, Ruyi Chang, Shuai Guo
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3598186
Subject(s) - aerospace; bioengineering; communication, networking and broadcast technologies; components, circuits, devices and systems; computing and processing; engineered materials, dielectrics and plasmas; engineering profession; fields, waves and electromagnetics; general topics for engineers; geoscience; nuclear engineering; photonics and electrooptics; power, energy and industry applications; robotics and control systems; signal processing and analysis; transportation
In recent years, multimodal social relation recognition has become an important task in computer vision and natural language processing. Existing research still leaves key gaps, particularly in aligning image features with linguistic features effectively enough to improve recognition accuracy. This paper proposes a social relation recognition method based on multimodal feature fusion and validates the role of cross-modal alignment mechanisms in improving recognition accuracy. Our approach uses large language models to extract event structures from textual descriptions, capturing key elements such as emotional states, scenarios, and relationships, and uses convolutional neural networks to extract deep features from images. We then introduce a cross-modal alignment mechanism that semantically aligns the textual event structures with the visual features, so that the two modalities remain semantically consistent. Extensive experiments on multiple public datasets demonstrate that our method significantly outperforms existing unimodal and basic multimodal approaches, confirming its effectiveness for social relation recognition.
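
The abstract does not specify how the cross-modal alignment mechanism is built, so the following is only an illustrative sketch of one common way to align text-side vectors with image-side vectors: cosine-similarity attention. It assumes that each event-structure element (for example an emotional state or a scenario) has already been embedded as a vector and that the CNN yields one feature vector per image region. The function names `cosine`, `softmax`, and `align`, and the `temperature` parameter, are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def align(text_vecs, image_vecs, temperature=0.1):
    """For each text vector, return (attention weights, pooled image vector).

    Assumption: text and image vectors already live in a shared space of the
    same dimension. The weights say which image regions each text element
    attends to; the pooled vector is the weighted average of the image vectors.
    """
    dim = len(image_vecs[0])
    results = []
    for t in text_vecs:
        weights = softmax([cosine(t, v) / temperature for v in image_vecs])
        pooled = [sum(w * v[d] for w, v in zip(weights, image_vecs)) for d in range(dim)]
        results.append((weights, pooled))
    return results


# Toy example: two text-element vectors and two image-region vectors.
text_vecs = [[1.0, 0.0], [0.0, 1.0]]
image_vecs = [[1.0, 0.0], [0.0, 1.0]]
for weights, pooled in align(text_vecs, image_vecs):
    print([round(w, 3) for w in weights], [round(p, 3) for p in pooled])
```

In this toy setup each text vector attends almost entirely to the image vector pointing in the same direction. A trained system would typically learn projection layers into the shared space rather than assume the two modalities are already comparable.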
