
A Character-Image-Modality and Multi-Label Auxiliary Model for Chinese Sentiment Analysis
Author(s) -
Liya Wang,
Zhe Chen,
Ming Zhang,
Vasile Palade
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/ACCESS.2025.3598452
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
In multimodal applications, corresponding image data is not always directly available for a given piece of text. For Chinese, however, characters originate from pictorial forms, so character images are closely linked to the textual information. To extract useful information from the different modalities, this paper proposes a model based on a character-image modality and multi-label auxiliary information to improve the accuracy of Chinese sentiment analysis. First, oracle bone script characters are treated as an image modality. A multi-input, multi-output network architecture is then designed to handle both the text sentiment analysis and speech recognition tasks. The overall text sentiment analysis task is divided into three unimodal sub-tasks and one multimodal sub-task, which utilize textual features, image features, text-image features, and speech features, respectively. Finally, a multi-task joint training framework guided by multiple labels is constructed, with a parameter-sharing optimization mechanism; the speech recognition labels and text sentiment analysis labels direct the model's attention to the corresponding features. Comparison and ablation experiments demonstrate competitive performance, indicating that even without a traditional image-modality dataset, the proposed model effectively advances Chinese sentiment analysis and enhances the generalization capability of multimodal models across different datasets.
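The joint training scheme described in the abstract — several sub-task heads over a shared encoder, with per-task labels contributing to one weighted loss — can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensions, the simple averaging fusion, the task weights, and the use of a single shared projection are all illustrative assumptions, shown in plain NumPy only to make the parameter-sharing idea concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax_xent(logits, labels):
    # Mean softmax cross-entropy over a batch of integer labels
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

# Shared encoder weights: the "parameter-sharing" component
# reused by every sub-task head (hypothetical sizes).
W = rng.normal(size=(16, 8)) * 0.1
encode = lambda x: np.tanh(x @ W)

# One linear head per sub-task: three unimodal + one multimodal (fusion)
heads = {name: rng.normal(size=(8, 3)) * 0.1
         for name in ("text", "image", "speech", "fusion")}

# Hypothetical pre-extracted features for a batch of 4 samples:
# text, character-image (e.g. oracle-bone glyph), and speech modalities
feats = {m: rng.normal(size=(4, 16)) for m in ("text", "image", "speech")}
feats["fusion"] = (feats["text"] + feats["image"] + feats["speech"]) / 3

labels = rng.integers(0, 3, size=4)   # 3-class sentiment labels
task_weights = {"text": 1.0, "image": 0.5, "speech": 0.5, "fusion": 1.0}

# Multi-task joint objective: weighted sum of per-sub-task losses,
# all back-propagating (conceptually) through the shared encoder W
total = sum(task_weights[t] * softmax_xent(encode(feats[t]) @ heads[t], labels)
            for t in heads)
print(round(total, 4))
```

In the paper's setup, the auxiliary speech recognition labels would contribute a further weighted term of the same form, so that gradients from both label sets shape the shared parameters.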