
A Character-Image-Modality and Multi-Label Auxiliary Model for Chinese Sentiment Analysis
Author(s) -
Liya Wang,
Zhe Chen,
Ming Zhang,
Vasile Palade
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/ACCESS.2025.3598452
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
In multimodal applications, corresponding image data is not always directly available for a given piece of text. For Chinese, however, characters originate from pictorial forms, so character images are closely linked to the textual information. To extract useful information from the different modalities, this paper proposes a model based on a character-image modality and multi-label auxiliary information to improve the accuracy of Chinese sentiment analysis. First, oracle bone script characters are treated as an image modality. A multi-input, multi-output network architecture is then designed to handle both the text sentiment analysis and speech recognition tasks. The overall text sentiment analysis task is divided into three unimodal sub-tasks and one multimodal sub-task, which utilize textual features, image features, text-image features, and speech features, respectively. Finally, a multi-task joint training framework guided by multiple labels is constructed, with a parameter-sharing optimization mechanism; the speech recognition labels and text sentiment analysis labels direct the model's attention to the corresponding features. Comparison and ablation experiments demonstrate competitive performance, indicating that even without a traditional image-modality dataset, the proposed model effectively advances Chinese sentiment analysis and enhances the generalization capability of multimodal models across different datasets.
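The joint training scheme described in the abstract — several sub-task heads over a shared encoder, with per-task labels contributing to one weighted loss — can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensions, the simple averaging fusion, the task weights, and the use of a single shared projection are all illustrative assumptions, shown in plain NumPy only to make the parameter-sharing idea concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax_xent(logits, labels):
    # Mean softmax cross-entropy over a batch of integer labels
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

# Shared encoder weights: the "parameter-sharing" component
# reused by every sub-task head (hypothetical sizes).
W = rng.normal(size=(16, 8)) * 0.1
encode = lambda x: np.tanh(x @ W)

# One linear head per sub-task: three unimodal + one multimodal (fusion)
heads = {name: rng.normal(size=(8, 3)) * 0.1
         for name in ("text", "image", "speech", "fusion")}

# Hypothetical pre-extracted features for a batch of 4 samples:
# text, character-image (e.g. oracle-bone glyph), and speech modalities
feats = {m: rng.normal(size=(4, 16)) for m in ("text", "image", "speech")}
feats["fusion"] = (feats["text"] + feats["image"] + feats["speech"]) / 3

labels = rng.integers(0, 3, size=4)   # 3-class sentiment labels
task_weights = {"text": 1.0, "image": 0.5, "speech": 0.5, "fusion": 1.0}

# Multi-task joint objective: weighted sum of per-sub-task losses,
# all back-propagating (conceptually) through the shared encoder W
total = sum(task_weights[t] * softmax_xent(encode(feats[t]) @ heads[t], labels)
            for t in heads)
print(round(total, 4))
```

In the paper's setup, the auxiliary speech recognition labels would contribute a further weighted term of the same form, so that gradients from both label sets shape the shared parameters.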