
MT-EfficientNetV2: A Multi-Temporal Scale Fusion EEG Emotion Recognition Method Based on Recurrence Plots
Author(s) -
Zihan Zhang,
Zhiyong Zhou,
Jun Wang,
Hao Hu,
Jing Zhao
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3592336
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Emotion recognition based on electroencephalography (EEG) signals has garnered significant research attention in recent years due to its potential applications in affective computing and brain-computer interfaces. Although various deep learning methods have been proposed for extracting emotional features from EEG signals, most existing models struggle to capture both long-term and short-term dependencies within the signals and fail to fully integrate features across different temporal scales. To address these challenges, we propose a deep learning model with multi-temporal-scale fusion, termed MT-EfficientNetV2. The model segments one-dimensional EEG signals using combinations of varying window sizes and a fixed step length. The Recurrence Plot (RP) algorithm then transforms these segments into RGB images that intuitively represent the dynamic characteristics of the signals, facilitating the capture of complex emotional features. Additionally, a three-branch input feature fusion module is designed to integrate features across different scales within the same temporal domain. The architecture combines DEconv and the SimAM attention mechanism with EfficientNetV2, enhancing the global fusion and expression of multi-scale features while strengthening the extraction of key emotional features at the local level and suppressing redundant information. Experiments on the public SEED and SEED-IV datasets yielded accuracies of 98.67% and 96.89%, respectively, surpassing current mainstream methods and validating the feasibility and effectiveness of the proposed approach.
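The core preprocessing pipeline described above (fixed-step sliding-window segmentation followed by the recurrence-plot transform) can be sketched as below. This is a minimal illustration of the standard RP construction via time-delay embedding, not the authors' exact implementation; the window/step sizes, embedding dimension `dim`, delay, and threshold `eps` are assumed placeholder values, and the paper's mapping of RPs to RGB images is not reproduced here.

```python
import numpy as np

def sliding_windows(signal, window, step):
    """Segment a 1-D signal into fixed-step windows of a given size
    (the paper uses several window sizes with a fixed step to obtain
    multiple temporal scales)."""
    return np.stack([signal[i:i + window]
                     for i in range(0, len(signal) - window + 1, step)])

def recurrence_plot(segment, dim=3, delay=1, eps=None):
    """Binary recurrence plot of one segment via time-delay embedding.

    dim, delay, eps are illustrative defaults, not values from the paper.
    """
    n = len(segment) - (dim - 1) * delay
    # Time-delay embedding: each row is a dim-dimensional state vector.
    emb = np.stack([segment[i:i + n] for i in range(0, dim * delay, delay)],
                   axis=1)
    # Pairwise Euclidean distances between state vectors.
    d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    if eps is None:
        eps = 0.2 * d.max()  # simple relative threshold (assumption)
    return (d <= eps).astype(np.uint8)

# Example: one channel of a synthetic EEG-like signal.
signal = np.sin(np.linspace(0.0, 8.0 * np.pi, 200))
segments = sliding_windows(signal, window=64, step=32)
rp = recurrence_plot(segments[0])
```

Each binary RP is symmetric with an all-ones main diagonal (every state recurs with itself); in the paper's pipeline, RPs at different window sizes would then be rendered as image channels and fed to the three-branch fusion module.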