Open Access
TF2ML: Threat Filtering with Two-Stage Machine Learning for Efficient Provenance-Aware Threat Detection and Response
Author(s) - Krittin Thirasak, Teerawat Chuaphanngam, Danupat Chainarong, Somchart Fugkeaw
Publication year - 2025
Publication title - IEEE Open Journal of the Computer Society
Language(s) - English
Resource type - Magazines
eISSN - 2644-1268
DOI - 10.1109/ojcs.2025.3618157
Subject(s) - computing and processing
As cyber threats grow more sophisticated, traditional detection methods struggle to identify advanced and zero-day vulnerabilities. Machine learning (ML) and federated learning approaches have been explored to improve detection accuracy and scalability; however, they often sacrifice efficiency, either by increasing computational overhead or by compromising detection precision. Federated learning reduces computational requirements but suffers from accuracy loss, while centralized models provide better detection capabilities at the expense of scalability. This paper presents a provenance-aware threat-hunting system that integrates rule-based preprocessing, a two-stage ML approach, and provenance tracking to enhance network security efficiency. We introduce Rule-Based CVE Filtering for preprocessing, leveraging Apache Spark for scalable log processing and MITRE ATT&CK for structured threat intelligence and attack mapping. Our provenance-aware approach ensures that each network log is enriched with metadata (device ID, user session, network segment, and attack source), enabling precise anomaly attribution and targeted mitigation. By filtering out known vulnerabilities and common threats, our system reduces the computational burden on the ML model, accelerating both training and inference. Experimental evaluation demonstrates an 8.13% reduction in processing time while maintaining 94% classification accuracy compared to existing ML models. Our provenance-aware TF2ML model further improves performance, achieving a 12.7% increase in processing speed and a 4.5% boost in accuracy over traditional approaches. This hybrid solution balances scalability, computational efficiency, and real-time response, while effectively detecting both known and unknown threats.
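
The filtering idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the rule list, the `LogRecord` fields, the `triage` function, and the caller-supplied `ml_classify` stand-in for the ML stages are all hypothetical names chosen for illustration, and plain Python replaces the Apache Spark pipeline the paper describes. It shows only the general pattern: logs matching a known signature are flagged by rule and never reach the model, while every alert keeps the provenance metadata of its log.

```python
import re
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LogRecord:
    # Provenance fields named after the metadata listed in the abstract.
    message: str
    device_id: str
    user_session: str
    network_segment: str
    attack_source: str


# Hypothetical signature rules (label, pattern); not taken from the paper.
RULES = [
    ("CVE-2021-44228", re.compile(r"\$\{jndi:", re.IGNORECASE)),
    ("path-traversal", re.compile(r"(\.\./){2,}")),
]


def match_known(record):
    """Return the label of the first rule that matches, or None."""
    for label, pattern in RULES:
        if pattern.search(record.message):
            return label
    return None


def triage(records, ml_classify):
    """Flag known threats by rule; forward only the remainder to the model.

    ml_classify is an assumed callable standing in for the ML stages.
    It takes a LogRecord and returns a label, with "benign" meaning no alert.
    Returns (alerts, number of records forwarded to the model).
    """
    alerts = []
    forwarded = 0
    for rec in records:
        label = match_known(rec)
        if label is not None:
            # Known threat: handled by rule, so the model never sees it.
            alerts.append({"stage": "rule", "label": label, **asdict(rec)})
            continue
        forwarded += 1
        label = ml_classify(rec)
        if label != "benign":
            alerts.append({"stage": "ml", "label": label, **asdict(rec)})
    return alerts, forwarded
```

Because each alert carries the device ID, session, segment, and source of the originating log, a responder could scope mitigation to the affected device or segment. The count of forwarded records gives a rough sense of how much work the rule stage removes from the model.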