Open Access
Multi‐level Deep Correlative Networks for Multi‐modal Sentiment Analysis
Author(s) - Cai Guoyong, Lyu Guangrui, Lin Yuming, Wen Yimin
Publication year - 2020
Publication title - Chinese Journal of Electronics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.267
H-Index - 25
eISSN - 2075-5597
pISSN - 1022-4653
DOI - 10.1049/cje.2020.09.003
Subject(s) - computer science , discriminative model , sentiment analysis , artificial intelligence , correlative , modal , classifier (uml) , pattern recognition (psychology) , feature (linguistics) , natural language processing , correlation , machine learning , mathematics , linguistics , philosophy , chemistry , geometry , polymer chemistry
Multi-modal sentiment analysis (MSA) is becoming an increasingly active research topic because it extends conventional text-based sentiment analysis (SA) to multi-modal content, which can provide richer affective information. However, compared with text-based sentiment analysis, multi-modal sentiment analysis poses far greater challenges, because joint learning over multi-modal data requires both fine-grained semantic matching and effective heterogeneous feature fusion. Existing approaches generally infer the sentiment type by splicing together features extracted from different modalities, neglecting the strong semantic correlation among co-occurring data from different modalities. To address these challenges, a multi-level deep correlative network for multi-modal sentiment analysis is proposed, which reduces the semantic gap by simultaneously analyzing the middle-level semantic features of images and the hierarchical deep correlations across modalities. First, the most relevant cross-modal feature representation is generated with Multi-modal deep and discriminative correlation analysis (Multi-DDCA) while keeping the respective modality-specific feature representations discriminative. Second, the high-level semantic outputs of Multi-DDCA are encoded into an attention-correlation cross-modal feature representation by a co-attention-based multi-modal correlation submodel, and these representations are then merged by a multi-layer neural network to train a sentiment classifier that predicts sentiment categories. Extensive experimental results on five datasets demonstrate the effectiveness of the designed approach, which outperforms several state-of-the-art fusion strategies for sentiment analysis.
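The abstract outlines a two-stage architecture: correlated yet discriminative per-modality projections (Multi-DDCA), followed by co-attention fusion and a multi-layer classifier. Below is a minimal PyTorch sketch of that pipeline. The layer sizes, the cosine-similarity term standing in for the paper's deep correlation analysis, and the bilinear co-attention are illustrative assumptions, not the authors' exact formulation.

```python
# Minimal sketch of a two-stage multi-modal sentiment pipeline.
# All dimensions and the specific correlation/attention choices are
# hypothetical; they only illustrate the structure the abstract describes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiLevelCorrelativeNet(nn.Module):
    def __init__(self, text_dim=300, img_dim=2048, hid=256, n_classes=3):
        super().__init__()
        # Stage 1: project each modality into a shared correlated space.
        self.text_proj = nn.Sequential(nn.Linear(text_dim, hid), nn.ReLU(),
                                       nn.Linear(hid, hid))
        self.img_proj = nn.Sequential(nn.Linear(img_dim, hid), nn.ReLU(),
                                      nn.Linear(hid, hid))
        # Per-modality heads keep the projected features discriminative.
        self.text_head = nn.Linear(hid, n_classes)
        self.img_head = nn.Linear(hid, n_classes)
        # Stage 2: bilinear co-attention over the two representations
        # (a simple stand-in for the co-attention correlation submodel).
        self.co_attn = nn.Bilinear(hid, hid, hid)
        # Fusion MLP producing the final sentiment prediction.
        self.fusion = nn.Sequential(nn.Linear(hid * 3, hid), nn.ReLU(),
                                    nn.Linear(hid, n_classes))

    def forward(self, text_feat, img_feat):
        t = self.text_proj(text_feat)
        v = self.img_proj(img_feat)
        # Correlation term: pull co-occurring text/image pairs together
        # (a crude substitute for deep canonical correlation analysis).
        corr_loss = 1.0 - F.cosine_similarity(t, v, dim=-1).mean()
        attn = torch.sigmoid(self.co_attn(t, v))  # attention-correlation features
        logits = self.fusion(torch.cat([t, v, attn], dim=-1))
        return logits, self.text_head(t), self.img_head(v), corr_loss

# Usage: the joint loss combines the fused prediction with the
# per-modality discriminative terms and the correlation term.
model = MultiLevelCorrelativeNet()
text = torch.randn(8, 300)
img = torch.randn(8, 2048)
labels = torch.randint(0, 3, (8,))
logits, t_logits, v_logits, corr = model(text, img)
loss = (F.cross_entropy(logits, labels)
        + F.cross_entropy(t_logits, labels)
        + F.cross_entropy(v_logits, labels)
        + corr)
loss.backward()
```

Keeping separate classification heads on each projected modality mirrors the abstract's requirement that the correlated representations remain discriminative, rather than collapsing into features that only maximize cross-modal similarity.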
