
ContextMiner: Mining Contextual Features for Conceptualizing Knowledge in Security Texts
Author(s) -
Luis Felipe Gutierrez,
Akbar Namin
Publication year - 2022
Publication title -
ieee access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2022.3198944
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
This paper presents ContextMiner , a novel natural language processing (NLP) framework to automatically capture contextual features for the purpose of extracting meaningful context-aware phrases from cybersecurity unstructured textual data. The framework utilizes basic attributes such as part-of-speech tagging, dependency parsing, and a domain-specific grammar to extract the contextual features. The effectiveness and applications of ContextMiner are evaluated and presented from two different perspectives: qualitative and quantitative. As for the qualitative analysis, our case studies show that the proposed framework is capable of retrieving additional contents from the given texts, both in a labeled and unlabeled setting, and thus building context-aware phrases in comparison with existing approaches. From a quantitative point of view, we evaluate ContextMiner as a pre-processing step to perform named entity recognition (NER). Our results show that ContextMiner reduces the corpus up to 70% while maintaining 85% of its relevant entities, with a small drop in the classification metrics. Finally, we explored the utilization of ContextMiner in the construction and reasoning of knowledge graphs.