A GRPO-Based Approach to Hierarchical Knowledge Distillation in Military Information Extraction
Author(s) - Yi Yang, Zongyong Li, Lingshu Li, Gaoshan Wang, Yu Hu, Xia Peng, Yunxiao Li
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3621013
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Military text processing faces multifaceted challenges, including real-time response requirements, high recognition accuracy demands, and computational resource constraints, compounded by semantic ambiguity, data scalability, and complex relational characteristics. To address these issues, this paper proposes a hierarchical knowledge distillation framework for military information extraction based on the Group Relative Policy Optimization (GRPO) algorithm. The framework establishes a hierarchical knowledge transfer architecture that pairs a pre-trained large language model (Teacher Model, TM) with a lightweight model (Student Model, SM). First, systematic prompt engineering templates are designed to guide the TM in generating a high-quality annotated corpus of military text. Next, semantic features from the generated data are fused with a military domain knowledge base to construct an enhanced dataset. The GRPO algorithm is then used to drive Low-Rank Adaptation (LoRA)-based fine-tuning of the SM on this enriched military corpus. Experimental results demonstrate significant improvements in military information extraction: the GRPO-optimized SM (SM-GRPO) achieves a 48.8% absolute increase in F1-score compared to the baseline SM, while reducing model parameters by 90.2% and inference latency by 83.7%. This approach effectively balances model compression and computational efficiency, offering a practical engineering solution for resource-constrained military environments.
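As an illustration of the group-relative step that gives GRPO its name, the following is a minimal sketch and not the authors' implementation. It assumes the student samples a group of candidate extractions for one prompt, that each candidate is scored by its F1 against gold annotations (the abstract does not specify the reward, so this choice is hypothetical), and that advantages are the rewards standardized within the group. The names f1_reward and group_relative_advantages, and the example entities, are invented for illustration.

```python
from statistics import mean, pstdev


def f1_reward(predicted: set, gold: set) -> float:
    """F1 overlap between predicted and gold (entity, type) pairs.

    Assumption: F1 serves as the scalar reward; the source does not
    state which reward function the paper actually uses.
    """
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one sampled group.

    Each advantage is (reward - group mean) / (group std + eps), so a
    candidate is judged relative to its siblings rather than by a
    separately learned value function.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical gold annotation and four sampled student outputs.
gold = {("Nimitz", "SHIP"), ("Pacific", "LOCATION")}
group = [
    {("Nimitz", "SHIP"), ("Pacific", "LOCATION")},  # fully correct
    {("Nimitz", "SHIP")},                           # misses one entity
    {("Nimitz", "AIRCRAFT")},                       # wrong type
    set(),                                          # empty output
]

rewards = [f1_reward(candidate, gold) for candidate in group]
advantages = group_relative_advantages(rewards)
```

In a full training loop these advantages would weight the policy-gradient update applied to the LoRA parameters of the student; that part is omitted here because the abstract gives no details about it.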