Open Access
Enhancement of Implicit Emotion Recognition in Arabic Text: Annotated dataset and baseline models
Author(s) -
Hanane Boutouta,
Abdelaziz Lakhfif,
Ferial Senator,
Chahrazed Mediani
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3611337
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Emotion recognition in textual data has emerged as a rapidly advancing task within the field of Natural Language Processing (NLP). Implicit Emotion Recognition (IER), which involves identifying emotions primarily through contextual cues rather than overt or explicit emotional expressions, remains in its early stages. Despite significant progress in recognizing explicit emotions, current research has largely overlooked IER, particularly for low-resource languages such as Arabic. This study aims to address this task comprehensively for the Arabic language, covering dataset construction, annotation, modeling, and evaluation. Specifically, the study (1) presents the first annotated dataset for Arabic Implicit Emotion Recognition (AIER); (2) annotates the dataset with emotion, cue, and cause using a semi-automatic annotation tool, validated by four native Arabic speakers and linguists; (3) investigates the potential of two categories of transformer-based models: masked language models, exemplified by pre-trained Bidirectional Encoder Representations from Transformers (BERT)-based architectures, through a series of fine-tuning experiments, and causal language models, such as generative Large Language Models (LLMs), via a zero-shot prompting approach; and (4) evaluates the performance of four distinct categories of models on the proposed dataset: classical Machine Learning (ML), Deep Learning (DL), BERT-based models, and generative LLMs. The experimental results demonstrate that BERT-based models outperform ML models, DL models, and generative LLMs. Notably, the fine-tuned MARBERTv2 model achieves superior performance compared to other pre-trained models, obtaining an F1-score of 79.83% on the AIER dataset.
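The zero-shot prompting approach mentioned in the abstract can be illustrated with a minimal sketch. The label set and prompt wording below are assumptions for illustration only; the paper's actual prompt template and emotion taxonomy are not given in this abstract.

```python
# Illustrative sketch of a zero-shot prompt for implicit emotion
# classification. The emotion labels and the prompt phrasing are
# assumed for this example, not taken from the paper.

EMOTIONS = ["joy", "sadness", "anger", "fear", "surprise", "disgust"]

def build_zero_shot_prompt(text: str) -> str:
    """Format an Arabic sentence into a zero-shot classification prompt
    that constrains a generative LLM to answer with a single label."""
    labels = ", ".join(EMOTIONS)
    return (
        "Classify the implicit emotion expressed in the following Arabic text.\n"
        f"Choose exactly one label from: {labels}.\n"
        f"Text: {text}\n"
        "Label:"
    )

# Example usage with a placeholder Arabic sentence; in practice the
# prompt would be sent to a generative LLM and the returned label parsed.
prompt = build_zero_shot_prompt("النص العربي هنا")
print(prompt)
```

In a zero-shot setting no task-specific training occurs: the model is expected to map the input to one of the listed labels purely from the instruction, which is why the abstract contrasts this with the fine-tuned BERT-based models that ultimately perform better.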
