Minimizing Human Labeling in Training Deep Models for Pedestrian Intention Prediction

Muhammad Naveed Riaz, Maciej Wielgosz, Antonio M. Lopez

Open Access

Publication year - 2025

Publication title - IEEE Transactions on Intelligent Transportation Systems

Language(s) - English

Resource type - Magazines

SCImago Journal Rank - 1.591

H-Index - 153

eISSN - 1558-0016

pISSN - 1524-9050

DOI - 10.1109/tits.2025.3565667

Subject(s) - transportation, aerospace, communication, networking and broadcast technologies, computing and processing, robotics and control systems, signal processing and analysis

Accurately predicting whether pedestrians will cross in front of an autonomous vehicle is essential for safe and comfortable maneuvers. However, developing models for this task remains challenging because diverse datasets containing both crossing (C) and non-crossing (NC) scenarios are scarce. We therefore propose a procedure that takes synthetic videos with C/NC labels and an untrained model whose architecture is designed for C/NC prediction, and automatically produces C/NC labels for a set of real-world videos. Because this procedure performs synth-to-real unsupervised domain adaptation for C/NC prediction, we term it S2R-UDA-CP. To assess the effectiveness of S2R-UDA-CP in self-labeling, we use two state-of-the-art models, PedGNN and ST-CrossingPose, together with the publicly available PedSynth dataset, which consists of synthetic videos with C/NC labels. Notably, once the real-world videos are self-labeled, they can be used to train models different from those used in S2R-UDA-CP. Such models are designed to operate onboard a vehicle, whereas S2R-UDA-CP is an offline procedure. To evaluate the quality of the C/NC labels generated by S2R-UDA-CP, we also employ PedGraph+, another reference model from the literature, because it is not used within S2R-UDA-CP. Overall, the results show that models trained to predict C/NC on videos labeled by S2R-UDA-CP perform even better than models trained on human-labeled data. Our study also highlights several discrepancies between automatic and human labeling. To the best of our knowledge, this is the first study to evaluate synth-to-real self-labeling for C/NC prediction.
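
The abstract does not spell out the internals of S2R-UDA-CP, so the following is only a generic illustration of the self-labeling idea it describes: fit a model on labeled synthetic samples, then assign C/NC labels to unlabeled real samples when the prediction is confident. Everything here is an assumption made for illustration, not the authors' procedure: the nearest-centroid classifier stands in for PedGNN or ST-CrossingPose, the 2-D feature vectors stand in for per-video descriptors, and the names `fit_centroids`, `crossing_probability`, `self_label`, `temperature`, and `threshold` are hypothetical.

```python
import math


def fit_centroids(features, labels):
    """Train the stand-in classifier: one mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def crossing_probability(centroids, x, temperature=0.25):
    """Soft score in (0, 1); values near 1 mean x lies closer to the C centroid.

    The temperature is a hypothetical knob controlling how sharp the score is.
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    margin = sq_dist(x, centroids["NC"]) - sq_dist(x, centroids["C"])
    return 1.0 / (1.0 + math.exp(-margin / temperature))


def self_label(centroids, real_features, threshold=0.9):
    """Return (index, label) pairs for real samples with a confident prediction.

    Samples whose score falls between 1 - threshold and threshold stay unlabeled.
    """
    labeled = []
    for index, x in enumerate(real_features):
        p_cross = crossing_probability(centroids, x)
        if p_cross >= threshold:
            labeled.append((index, "C"))
        elif p_cross <= 1.0 - threshold:
            labeled.append((index, "NC"))
    return labeled


# Hand-made toy data (assumption): synthetic samples come with C/NC labels.
synthetic_features = [[1.0, 0.2], [0.9, 0.1], [0.8, 0.3],
                      [0.1, 0.9], [0.2, 1.0], [0.0, 0.8]]
synthetic_labels = ["C", "C", "C", "NC", "NC", "NC"]

# Unlabeled "real" samples, slightly shifted to mimic a domain gap.
real_features = [[1.1, 0.3], [0.1, 1.1], [0.55, 0.6]]

centroids = fit_centroids(synthetic_features, synthetic_labels)
pseudo_labels = self_label(centroids, real_features)
print(pseudo_labels)
```

In this toy run the first two real samples receive labels and the third, which sits almost midway between the two centroids, is left unlabeled. The resulting pairs could then serve as training data for a separate model, mirroring the abstract's point that the self-labeled videos can train models other than those used for labeling.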
