Curating Automatic Vehicle Location Data to Compare the Performance of Outlier Filtering Methods

Jijo K. Mathew; Christopher M. Day; Howell Li; Darcy M. Bullock

Open Access

Publication year - 2021

Language(s) - English

Resource type - Reports

DOI - 10.5703/1288284317435

Subject(s) - computer science, outlier, data set, data mining, global positioning system, identifier, set (abstract data type), identification (biology), data collection, data quality, data type, database, service (business), artificial intelligence, statistics, computer network, mathematics, telecommunications, botany, economy, economics, biology, programming language

Agencies use a variety of technologies and data providers to obtain travel time information. The highest-quality data come from second-by-second tracking of vehicles, but such data present challenges for privacy, storage, and analysis. More frequently, agencies collect or purchase segment travel times based on matching vehicles between two spatially separated points. Typical methods for that data collection involve license plate re-identification, Bluetooth, Wi-Fi, or some type of rolling DSRC identifier. A challenge common to all of these sampling techniques is filtering out outliers associated with trip chaining without removing important features in the data associated with incidents or traffic congestion. This paper describes a curated data set that was developed from high-fidelity GPS trajectory data. The curated data contain 31,621 vehicle observations spanning 42 days; 2,550 observations had travel times more than 3 minutes longer than normal. Within this baseline data set, outliers were identified by using GPS waypoints to determine whether the vehicle left the route. Two performance measures were identified for evaluating three outlier-filtering algorithms: the proportion of true samples rejected and the proportion of outliers correctly identified. The effectiveness of the three methods over 10-minute sampling windows was also evaluated. The curated data set has been archived in a digital repository and is available online for others to test outlier-filtering algorithms.
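The two performance measures named in the abstract can be illustrated with a minimal Python sketch. It assumes each observation carries a ground-truth flag (whether the vehicle left the route, per the GPS waypoints) and a flag for whether a filter rejected it. The names `FilterScore` and `score_filter` and the six-sample example are hypothetical, chosen only for illustration; they are not taken from the report or its archived data set.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FilterScore:
    """Hypothetical container for the two measures described in the abstract."""
    true_rejected: float        # proportion of true (non-outlier) samples rejected
    outliers_identified: float  # proportion of outliers correctly identified


def score_filter(is_outlier: Sequence[bool], rejected: Sequence[bool]) -> FilterScore:
    """Score one filter's decisions against ground-truth outlier labels.

    is_outlier[i] is True when observation i is a known outlier
    (assumed here to mean the vehicle left the route).
    rejected[i] is True when the filter under test discarded observation i.
    """
    if len(is_outlier) != len(rejected):
        raise ValueError("label and decision sequences must have equal length")

    n_outliers = sum(1 for o in is_outlier if o)
    n_true = len(is_outlier) - n_outliers

    # Valid samples the filter wrongly threw away.
    true_rejected = sum(1 for o, r in zip(is_outlier, rejected) if r and not o)
    # Real outliers the filter caught.
    outliers_caught = sum(1 for o, r in zip(is_outlier, rejected) if r and o)

    return FilterScore(
        true_rejected=true_rejected / n_true if n_true else float("nan"),
        outliers_identified=outliers_caught / n_outliers if n_outliers else float("nan"),
    )


# Made-up example: four true samples and two outliers.
labels = [False, False, False, False, True, True]
decisions = [False, True, False, False, True, False]
example = score_filter(labels, decisions)
print(example.true_rejected)        # 0.25 (1 of 4 true samples rejected)
print(example.outliers_identified)  # 0.5  (1 of 2 outliers identified)
```

A filter performs well under these measures when the first proportion is low and the second is high. Applying the same function separately to the observations in each 10-minute window would give a per-window comparison in the spirit of the abstract's windowed evaluation.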
