On the Robustness of Transformer‐Based Models to Different Linguistic Perturbations: A Case of Study in Irony Detection
Author(s) - Ortega-Bueno Reynier, Fersini Elisabetta, Rosso Paolo
Publication year - 2025
Publication title - Expert Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.365
H-Index - 38
eISSN - 1468-0394
pISSN - 0266-4720
DOI - 10.1111/exsy.70062
ABSTRACT This study investigates the robustness of Transformer models in irony detection when subjected to various textual perturbations, revealing potential biases in training data concerning the ironic and non‐ironic classes. The perturbations involve three distinct approaches of progressively increasing complexity. The first approach is word masking, which employs wild‐card characters or BERT‐specific masking through the mask token provided by BERT models. The second approach is word substitution, replacing the bias word with a contextually appropriate alternative. Lastly, paraphrasing generates a new phrase while preserving the original semantic meaning. We leverage Large Language Models (GPT‐3.5 Turbo) together with human inspection to ensure linguistic correctness and contextual coherence for word substitutions and paraphrases. The results indicate that the models are susceptible to these perturbations, with paraphrasing and word substitution having the most significant impact on model predictions. The irony class appears particularly challenging for the models under these perturbations. The SHAP and LIME methods are used to correlate variations in attribution scores with prediction errors. A notable difference in the Total Variation of attribution scores is observed between original examples and cases involving bias‐word substitution or masking. Among the corpora used, TwSemEval2018 emerges as the most challenging. Regarding model performance, Transformer‐based models such as RoBERTa and BERTweet demonstrate superior overall performance in handling these perturbations. This research contributes to understanding the robustness and limitations of irony detection models, highlighting areas for improvement in model design and training data curation.
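The first two perturbation approaches described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the choice of `[MASK]` as the mask token (BERT‐style; RoBERTa‐family models use `<mask>`), and the caller‐supplied substitution word are all illustrative assumptions. In the study itself, substitution candidates come from GPT‐3.5 Turbo plus human inspection; paraphrasing, being a free‐form generation step, is not sketched here.

```python
import re

MASK_TOKEN = "[MASK]"  # assumed BERT-style mask token; RoBERTa uses "<mask>"

def mask_word(text: str, target: str, use_bert_token: bool = True) -> str:
    """Mask every whole-word occurrence of `target`.

    BERT-specific masking inserts the model's mask token; wild-card
    masking replaces each character with '*', keeping word length visible.
    """
    repl = MASK_TOKEN if use_bert_token else "*" * len(target)
    return re.sub(rf"\b{re.escape(target)}\b", repl, text)

def substitute_word(text: str, target: str, alternative: str) -> str:
    """Replace the bias word with a contextually appropriate alternative.

    Here the alternative is supplied by the caller; in the paper it is
    proposed by GPT-3.5 Turbo and validated by human inspection.
    """
    return re.sub(rf"\b{re.escape(target)}\b", alternative, text)
```

For example, `mask_word("I just love Mondays", "love")` yields `"I just [MASK] Mondays"`, while the wild‐card variant yields `"I just **** Mondays"`.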
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom