
Enhance Social Network Bullying Detection using Multi-Teacher Knowledge Distillation with XGBoost Classifier
Author(s) - Sathit Prasomphan
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3574679
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Cyberbullying remains a pressing issue on Thai social media, especially among teenagers. While many studies have explored deep learning approaches to sentiment analysis and toxicity detection, cyberbullying detection, particularly in the Thai language, remains underexplored. This study introduces a novel framework that enhances cyberbullying detection by integrating Multi-Teacher Knowledge Distillation (MTKD) with an XGBoost classifier, adapted specifically for Thai-language social media posts. Unlike prior work that relies solely on neural models, this research demonstrates how distilled soft labels from diverse teacher models can be transferred to a lightweight and interpretable XGBoost student model. A key contribution is the adaptation of XGBoost, traditionally used for structured/tabular data, to a natural language classification task by using semantic features extracted with pre-trained NLP models. Additionally, although the selected datasets (Wisesight, Thai Toxic Tweet, and 40 Thai Children Stories) are often used for sentiment analysis, we reframe and preprocess them for cyberbullying classification by focusing on toxic, harmful, or aggressive linguistic patterns. The framework achieved accuracies of 92.5%, 90.5%, and 91.0% across the three datasets, demonstrating its robustness and practical applicability to Thai-language cyberbullying detection.
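The abstract does not specify how the teachers' outputs are combined, so the following is only a minimal sketch of one common way to form multi-teacher soft labels: temperature-softened class probabilities from each teacher are averaged into a single target distribution. The function names, the temperature value, the uniform teacher weights, and the example logits are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one list of per-class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distill_soft_labels(teacher_logits, temperature=2.0, weights=None):
    """Average the softened class distributions of several teachers.

    teacher_logits: one list of per-class logits per teacher, for one post.
    weights: optional per-teacher weights that sum to 1 (assumed uniform
             here; the paper may weight its teachers differently).
    """
    n_teachers = len(teacher_logits)
    if weights is None:
        weights = [1.0 / n_teachers] * n_teachers
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_classes = len(probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probs))
        for c in range(n_classes)
    ]


# Three hypothetical teachers scoring one post as [not bullying, bullying].
teachers = [[0.2, 1.9], [-0.5, 2.4], [1.1, 0.8]]
soft = distill_soft_labels(teachers, temperature=2.0)
print(soft)
```

A student model, such as the XGBoost classifier named in the abstract, could then be trained on feature vectors paired with these soft targets instead of hard 0/1 labels. How the paper performs that step is not stated here.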