Open Access
Large Language Model-Based Functional Scenario Generation for Automated Vehicle Safety Evaluation Using Vehicle and Pedestrian Traffic Accident Data
Author(s) -
Jihun Kang,
Woori Ko,
Youngtaek Lee,
Kyeongjin Lee,
Ilsoo Yun
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3612989
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Despite advancements in automated driving technology, traffic accidents remain a significant safety concern in urban areas because of complex traffic environments and the current technical limitations of automated vehicles (AVs). Structured, quantitative scenario-based evaluations are essential for assessing AV safety. This study aims to generate functional scenarios by accurately inferring vehicle and pedestrian behavior information from unstructured traffic accident descriptions. The dataset comprises 137,432 traffic accident cases collected by the Korean National Police Agency. A total of 1,774 training samples were extracted using term frequency–inverse document frequency (TF-IDF) keyword analysis. We then evaluated the prediction performance of three large language model (LLM) configurations: Claude 3, GPT-4o, and GPT-4o with fine-tuning, using 500 randomly selected traffic accident records that lacked keyword-extractable behavior information. GPT-4o with fine-tuning achieved the highest accuracy, 81.1%. Using this model, 26 functional scenarios were generated. The risk level of each scenario was evaluated by integrating the equivalent property damage only (EPDO) index with risk predictions from logistic regression and XGBoost models. A weighted average of the normalized results was used to rank the scenarios. Scenarios such as "turning left / crossing," "straight / crossing," and "turning right / crossing" were consistently identified as high-risk. This study demonstrates the effectiveness of LLM-based contextual inference in converting real-world traffic accident data into functional scenarios and contributes to the advancement of AV safety evaluation frameworks.
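
The ranking step described in the abstract (normalize each risk indicator, then take a weighted average) can be illustrated with a minimal sketch. The abstract does not state the normalization method, the weights, or any per-scenario values, so everything below is an assumption: min-max normalization, equal weights, the function names `min_max_normalize` and `rank_scenarios`, the indicator keys `epdo`, `logit`, and `xgb`, and the placeholder numbers are all hypothetical and are not taken from the study.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1]; a constant list maps to all 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_scenarios(scores, weights):
    """Rank scenarios by a weighted average of min-max normalized indicators.

    scores:  {scenario_name: {indicator_name: raw_value}}
    weights: {indicator_name: weight}; weights are rescaled to sum to 1.
    Returns a list of (scenario_name, combined_score), highest score first.
    """
    names = list(scores)
    total_weight = sum(weights.values())
    combined = {name: 0.0 for name in names}
    for indicator, weight in weights.items():
        # Normalize each indicator across scenarios so scales are comparable.
        normalized = min_max_normalize([scores[n][indicator] for n in names])
        for name, value in zip(names, normalized):
            combined[name] += (weight / total_weight) * value
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Placeholder inputs for illustration only; these are NOT values from the study.
example_scores = {
    "turning left / crossing": {"epdo": 9.0, "logit": 0.8, "xgb": 0.7},
    "straight / crossing": {"epdo": 7.0, "logit": 0.6, "xgb": 0.9},
    "turning right / crossing": {"epdo": 2.0, "logit": 0.2, "xgb": 0.1},
}
example_weights = {"epdo": 1.0, "logit": 1.0, "xgb": 1.0}  # assumed equal weights

ranking = rank_scenarios(example_scores, example_weights)
for scenario, score in ranking:
    print(f"{scenario}: {score:.3f}")
```

Normalizing before averaging keeps an indicator with a large raw scale, such as an EPDO-style index, from dominating model outputs that lie between 0 and 1; a different normalization or weighting would be a drop-in change to this sketch.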
