Open Access
Self-Supervised Learning Meets Custom Autoencoder Classifier: A Semi-Supervised Approach for Encrypted Traffic Anomaly Detection
Author(s) - A. Ramzi Bahlali, Abdelmalik Bachir, Abdeldjalil Labed
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3596179
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
The widespread adoption of encryption in computer networks has made detecting malicious traffic, especially at network perimeters, increasingly challenging. As packet contents are concealed, traditional monitoring techniques such as Deep Packet Inspection (DPI) become ineffective. Consequently, researchers have started employing data-driven methods based on Machine and Deep Learning (ML & DL) to identify malicious behavior even from encrypted traffic, typically within Anomaly-based Network Intrusion Detection Systems (A-NIDS). Existing approaches rely heavily on supervised learning, which requires large volumes of labeled benign and malicious traffic. However, generating these labels is time-consuming, error-prone, and often requires expert knowledge. In this paper, we propose a semi-supervised learning framework that leverages Self-Supervised Learning (SSL) to learn discriminative representations from unlabeled network traffic. We design a novel pretext task that predicts important masked features, enabling the model to capture meaningful structure in the data. These learned representations are fine-tuned with minimal labeled data using a Custom-Autoencoder (Custom-AE) classifier. Experimental results show that the representation learned from our proposed pretext task outperforms the best competing method by 3.41% on UNSW-NB15 (NB15) and 1.53% on CSE-CIC-IDS2018 (CSE18) when evaluated using linear probing. When fine-tuned on the Custom-AE with only 100 benign and 10 malicious samples, it achieves 83.51% (NB15) and 87.43% (CSE18) accuracy, representing gains of 4.55% and 5.08% over the initial features, respectively. This demonstrates stronger suitability for label-scarce real-world scenarios compared to existing approaches.
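The core pretext task described above, predicting masked features from the visible ones so the model learns the structure of unlabeled traffic, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the single linear layer standing in for the encoder and prediction head, and all parameter values below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy correlated "flow features": an 8-dim view of a 3-dim latent factor,
# so masked features are predictable from the visible ones (hypothetical
# stand-in for real encrypted-traffic statistics).
Z = rng.normal(size=(500, 3))
A = rng.normal(size=(3, 8))
X = Z @ A + 0.05 * rng.normal(size=(500, 8))

def mask_features(X, mask_ratio=0.25):
    """Zero out a random subset of feature positions; return masked input and mask."""
    mask = rng.random(X.shape) < mask_ratio
    return np.where(mask, 0.0, X), mask

def masked_mse(W, X):
    """Pretext loss: reconstruction error measured only at the masked positions."""
    Xm, mask = mask_features(X)
    return float(np.mean(((Xm @ W - X) * mask) ** 2))

# Single linear layer standing in for the encoder + prediction head.
W = 0.01 * rng.normal(size=(8, 8))
initial_loss = masked_mse(W, X)

# Plain gradient descent on the masked-reconstruction objective.
lr = 0.05
for _ in range(2000):
    Xm, mask = mask_features(X)
    err = (Xm @ W - X) * mask          # gradient signal only from masked targets
    W -= lr * (Xm.T @ err) / len(X)

final_loss = masked_mse(W, X)
```

After training, `final_loss` falls well below `initial_loss`, showing that predicting masked features forces the model to exploit correlations among the visible ones; in the paper's framework, the representation learned this way is then fine-tuned with a small labeled set rather than used directly.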
