Open Access
RAM-VO: A Recurrent Attentional Model for Visual Odometry
Author(s) - Iury Cleveston, Esther Luna Colombini
Publication year - 2021
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5753/wtdr_ctdr.2021.18684
Subject(s) - visual odometry, computer science, artificial intelligence, reinforcement learning, initialization, convolutional neural network, optical flow, monocular, odometry, computer vision, pattern recognition, machine learning, robot, mobile robot
Determining the agent's pose is fundamental for developing autonomous vehicles. Visual Odometry (VO) algorithms estimate egomotion using only the visual differences between consecutive input frames. Most recent VO methods rely heavily on deep-learning techniques based on convolutional neural networks (CNNs), which makes processing large images costly. Moreover, more data does not necessarily yield better predictions, and the network may have to filter out useless information. In this context, we incrementally formulate a lightweight model called RAM-VO to perform visual odometry regression from large monocular images. Our model extends the Recurrent Attention Model (RAM), a distinctive architecture that implements a hard attentional mechanism guided by reinforcement learning to select only the essential input information. Our methodology modifies the RAM to improve the visual and temporal representation of the information, producing the intermediate RAM-R and RAM-RC architectures. We also include optical flow as contextual information for initializing the RL agent and adopt the Proximal Policy Optimization (PPO) algorithm to learn a robust policy. The experimental results indicate that RAM-VO can perform regression with six degrees of freedom using approximately 3 million parameters. Additionally, experiments on the KITTI dataset confirm that RAM-VO produces competitive results while observing only 5.7% of the input image.
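To make the glimpse-based pipeline described above concrete, the sketch below shows a RAM-style loop for visual odometry: a small patch is cropped from a stacked pair of frames, encoded by a tiny CNN, aggregated by a recurrent core, and used to regress a 6-DoF relative pose while emitting the next glimpse location. This is a minimal illustration under assumed PyTorch conventions; all names and hyperparameters (RamVoSketch, extract_glimpse, a 32-pixel glimpse, four glimpses per pair) are hypothetical and not the authors' implementation, and the PPO-trained stochastic location policy is replaced by a deterministic placeholder.

```python
# Minimal sketch of a RAM-style glimpse pipeline for visual odometry.
# Module and parameter names are illustrative assumptions, not RAM-VO's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def extract_glimpse(frames, loc, size=32):
    """Crop a small square patch from a stacked frame pair.

    frames: (B, C, H, W) tensor of two consecutive frames stacked on the
    channel axis; loc: (B, 2) glimpse center in normalized [-1, 1] coords.
    Only this patch is ever processed, so most of the image is never seen.
    """
    b, _, h, w = frames.shape
    # Sampling grid spanning `size` pixels around `loc` in normalized coords.
    ys = torch.linspace(-1.0, 1.0, size, device=frames.device) * (size / h)
    xs = torch.linspace(-1.0, 1.0, size, device=frames.device) * (size / w)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")
    grid = torch.stack([gx, gy], dim=-1).unsqueeze(0) + loc.view(b, 1, 1, 2)
    return F.grid_sample(frames, grid, align_corners=False)


class RamVoSketch(nn.Module):
    """Glimpse encoder + recurrent core + 6-DoF regressor + location head."""

    def __init__(self, hidden=256, glimpse=32):
        super().__init__()
        self.hidden = hidden
        self.glimpse = glimpse
        self.encoder = nn.Sequential(            # small CNN over the patch only
            nn.Conv2d(6, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * (glimpse // 4) ** 2, hidden), nn.ReLU(),
        )
        self.core = nn.LSTMCell(hidden, hidden)  # temporal aggregation
        self.pose_head = nn.Linear(hidden, 6)    # 3 translations + 3 rotations
        self.loc_head = nn.Linear(hidden, 2)     # mean of next glimpse location

    def forward(self, frames, steps=4):
        b = frames.size(0)
        h = torch.zeros(b, self.hidden, device=frames.device)
        c = torch.zeros_like(h)
        loc = torch.zeros(b, 2, device=frames.device)  # start at image center
        for _ in range(steps):
            patch = extract_glimpse(frames, loc, self.glimpse)
            h, c = self.core(self.encoder(patch), (h, c))
            # In RAM-VO the next location is sampled from a stochastic policy
            # trained with PPO; this sketch just takes a deterministic mean.
            loc = torch.tanh(self.loc_head(h))
        return self.pose_head(h)                 # relative 6-DoF pose


# Example: one pair of KITTI-sized frames (376 x 1241) stacked along channels.
model = RamVoSketch()
pose = model(torch.randn(1, 6, 376, 1241))
print(pose.shape)  # torch.Size([1, 6])
```

The design choice this illustrates is the one the abstract emphasizes: because only a few small glimpses are processed per frame pair, the parameter count and per-frame compute stay low even though the input images themselves are large.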
