
Efficient Management of Safety Documents Using Text-Based Analytics to Extract Safety Attributes from Construction Accident Reports
Author(s) -
Vedat Togan,
Fatemeh Mostofi,
Onur Behzat Tokdemir,
Fethi Kadioglu
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3576442
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Extracting insights from textual safety documents with conventional methods is time-intensive, causing delays and inaccuracies that hinder proactive incident prevention in construction projects. While the architectures of large language models (LLMs) have been well studied, their deployment efficiency is often overlooked. This study proposes DistilBERT as a more efficient text management method for extracting safety text from construction safety documents. To keep the extracted safety text relevant, a dataset of 5,224 construction accident cases from 73 projects across the Euro-Asia region was compiled. Incidents were analyzed through detailed questionnaires to identify safety attributes, and term frequency-inverse document frequency (TF-IDF) analysis was applied for validation. When benchmarked against conventional NLP methods and state-of-the-art LLMs such as BERT, RoBERTa, and XLNet, DistilBERT demonstrated comparable accuracy with significantly reduced computational time. Specifically, DistilBERT achieved an accuracy of 79% in severity scale classification with an F1 score of 0.72, while reducing processing time by approximately 50% compared to BERT (from 2,918.28 seconds to 1,492.08 seconds). By offering rapid inference with negligible accuracy trade-offs, DistilBERT emerges as a practical tool for automating safety text extraction, well suited to settings with limited computational capabilities and urgent decision-making requirements. This study also examines how DistilBERT can be integrated into construction safety management systems without modifying the underlying platforms. Future work should focus on API creation, secure machine learning pipelines, and optimized deployment of LLMs, particularly in complex contexts.
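To make the TF-IDF validation step mentioned in the abstract more concrete, the following is a minimal sketch of how term weights can be computed over short accident descriptions. The abstract does not state which TF-IDF variant the authors used, so the unsmoothed form tf * log(N / df) here is an assumption, as are the function name `tf_idf` and the three report snippets, which are made up for illustration and are not taken from the study's dataset.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: weight = (term count / document length) * log(N / df),
    with whitespace tokenization and no smoothing.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical report snippets, for illustration only.
reports = [
    "worker fell from scaffold",
    "worker struck by falling object",
    "worker cut hand on saw",
]

scores = tf_idf(reports)

# Print the three highest-weighted terms of each report.
for report, weights in zip(reports, scores):
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    print(report, "->", [term for term, _ in ranked[:3]])
```

In this sketch a term that occurs in every report, such as "worker", receives a weight of zero, while terms specific to a single report, such as "scaffold", are weighted up. That behavior is what makes TF-IDF a plausible check on whether extracted safety attributes are distinctive for the incidents they describe.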