
End‐to‐end autonomous driving decision model joined by attention mechanism and spatiotemporal features
Author(s) -
Zhao Xiangmo,
Qi Mingyuan,
Liu Zhanwen,
Fan Songhua,
Li Chao,
Dong Ming
Publication year - 2021
Publication title -
IET Intelligent Transport Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.579
H-Index - 45
eISSN - 1751-9578
pISSN - 1751-956X
DOI - 10.1049/itr2.12086
Subject(s) - computer science , artificial intelligence , computer vision , pattern recognition , feature extraction , end to end principle , pruning , spatial analysis , data mining
Autonomous driving decision-making is a critical component of an automatic driving system, informing the unmanned vehicle of surrounding object movements and updating its control accordingly. However, end‐to‐end autonomous driving decision-making remains a great challenge due to the varying scales of traffic targets in wild, dynamic traffic scenes. To address this problem, this paper proposes a novel model combining an attention mechanism with spatiotemporal feature extraction. Specifically, to capture the important spatial information of traffic targets with scale differences, the spatial dimensions of height H , width W and channel C are treated independently to build a sparse spatial attention map. Moreover, different sparse masks are trained for the spatial network by pruning elements of the feature maps at the end of each backbone block, which improves the accuracy of the spatial network's two subnetworks by 2.3% and 3.9%, respectively. The extracted spatial information is then fed, together with the previous speed, into a time‐sequence network to obtain the vehicle's steering angle and speed. Experiments on public virtual datasets show that the model's prediction accuracy reaches 85.8%, an improvement of 4.8% and 2.2%, respectively, over other state‐of‐the‐art models.
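The abstract's pipeline can be illustrated with a minimal sketch: attention weights are computed independently along the H, W and C dimensions of a feature map, combined, and sparsified by pruning low-attention elements; the pooled spatial features are then merged with the previous speed to predict steering and speed. All function names, the `keep_ratio` pruning threshold, and the simple weighted temporal head below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along one axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sparse_spatial_attention(feat, keep_ratio=0.5):
    """Toy factorized attention over a (C, H, W) feature map.

    Attention is computed independently per dimension (as the abstract
    describes), combined by outer product, then pruned so that only the
    top `keep_ratio` fraction of elements survives (assumed mechanism).
    """
    c_att = softmax(feat.mean(axis=(1, 2)), axis=0)[:, None, None]
    h_att = softmax(feat.mean(axis=(0, 2)), axis=0)[None, :, None]
    w_att = softmax(feat.mean(axis=(0, 1)), axis=0)[None, None, :]
    att = c_att * h_att * w_att
    # Prune: zero out elements below the attention quantile threshold.
    thresh = np.quantile(att, 1.0 - keep_ratio)
    mask = (att >= thresh).astype(feat.dtype)
    return feat * att * mask

def temporal_head(spatial_feat, prev_speed, w_feat=0.1, w_speed=0.9):
    # Stand-in for the time-sequence network: fuses pooled spatial
    # features with the previous speed (weights are arbitrary).
    pooled = spatial_feat.mean()
    steering = np.tanh(pooled)          # bounded steering angle
    speed = w_speed * prev_speed + w_feat * pooled
    return steering, speed
```

In the real model the temporal head would be a recurrent network over a sequence of frames; this sketch only shows how the sparsified spatial features and the previous speed jointly determine the two outputs.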