
Open Access

An Efficient Approach for Code-Mixed Emotion Classification applying Machine Learning

Author(s) - Ahmad Mahmood, Miguel Torres-Ruiz, Zainab Ahmad, Humaira Farid, Iqra Ameer, Rolando Quintero

Publication year - 2025

Publication title - IEEE Access

Language(s) - English

Resource type - Magazines

SCImago Journal Rank - 0.587

H-Index - 127

eISSN - 2169-3536

DOI - 10.1109/access.2025.3598754

Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation

Emotion classification aims to identify the emotions in a piece of text that best represent the author’s state of mind. The task remains challenging for under-resourced languages, especially in the case of code-mixing, which is not a standardized language in its own right. The widespread use of social media has led to the emergence of code-mixed language, which has since attracted attention from researchers because of its extensive usage. Emotion classification is an important problem with applications ranging from healthcare and e-learning to social media. While some work has been done on code-mixed emotion classification, very few studies have focused on code-mixed English and Roman Urdu. Previous attempts at multi-label emotion classification on code-mixed English and Roman Urdu reported relatively low scores (e.g., Micro F1 = 0.67), indicating that there is still room for improvement in this area. In this study, we address two tasks: (i) code-mixed multi-label emotion classification and (ii) code-mixed multi-class emotion classification. Our contribution lies in applying classical machine learning methods with three distinct multi-label and multi-class classification approaches, (i) One-Versus-Rest (OvR), (ii) Label Powerset (LP), and (iii) Binary Relevance (BR), together with two distinct feature extraction techniques. First, we employ content-based methods using TF-IDF at the word unigram level and experiment with feature sets ranging from 500 to 3000 features. Second, we use context-based methods that leverage SBERT-based models to produce embeddings that capture semantic meaning. Finally, we apply a state-of-the-art Generative AI-based approach, using a quantized version of LLaMA that is fine-tuned for evaluation.
We conducted over 2,000 experiments. The best results were obtained using classical machine learning (Micro F1 = 0.9142 for multi-label classification and Micro F1 = 0.9238 for multi-class classification) with the Binary Relevance approach in a context-based setting for both tasks. This suggests that Binary Relevance is an effective way to break complex multi-label and multi-class tasks into simpler ones, especially when the language is already difficult on its own.
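
The abstract names Binary Relevance, Label Powerset, and Micro F1 without showing how they operate, so the following is a minimal, illustrative sketch of what each does to a multi-label target. It is not the authors' implementation: the emotion labels, the toy samples, and the function names (`binary_relevance_targets`, `label_powerset_targets`, `micro_f1`) are all assumptions made for the example, and no classifier is trained.

```python
from typing import Dict, FrozenSet, List, Sequence, Set, Tuple

# Hypothetical label inventory and toy annotations (not taken from the paper).
LABELS = ["anger", "joy", "sadness"]


def binary_relevance_targets(
    label_sets: Sequence[Set[str]], labels: Sequence[str]
) -> Dict[str, List[int]]:
    """Binary Relevance: one independent 0/1 target vector per label."""
    return {lab: [1 if lab in s else 0 for s in label_sets] for lab in labels}


def label_powerset_targets(
    label_sets: Sequence[Set[str]],
) -> Tuple[List[int], Dict[FrozenSet[str], int]]:
    """Label Powerset: each distinct label combination becomes one class id."""
    class_ids: Dict[FrozenSet[str], int] = {}
    targets: List[int] = []
    for s in label_sets:
        key = frozenset(s)
        if key not in class_ids:
            class_ids[key] = len(class_ids)
        targets.append(class_ids[key])
    return targets, class_ids


def micro_f1(true_sets: Sequence[Set[str]], pred_sets: Sequence[Set[str]]) -> float:
    """Micro F1: pool true/false positives and false negatives over all labels."""
    tp = fp = fn = 0
    for t, p in zip(true_sets, pred_sets):
        tp += len(t & p)   # labels predicted and correct
        fp += len(p - t)   # labels predicted but wrong
        fn += len(t - p)   # labels missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


y_true = [{"joy"}, {"anger", "sadness"}, {"joy", "sadness"}, {"anger", "sadness"}]
y_pred = [{"joy"}, {"anger"}, {"joy", "sadness"}, {"anger", "sadness", "joy"}]

br = binary_relevance_targets(y_true, LABELS)
lp_targets, lp_classes = label_powerset_targets(y_true)
score = micro_f1(y_true, y_pred)
```

In this sketch, Binary Relevance turns the four toy samples into three separate binary problems (one per emotion), whereas Label Powerset turns them into a single three-class problem, one class per distinct label combination. In practice each binary or multi-class target would be fed to a classifier trained on TF-IDF or sentence-embedding features.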
