Open Access

A dual foveal-peripheral visual processing model implements efficient saccade selection

Author(s) - Emmanuel Daucé, Pierre Albiges, Laurent Perrinet

Publication year - 2020

Publication title - Journal of Vision

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.126

H-Index - 113

ISSN - 1534-7362

DOI - 10.1167/jov.20.8.22

Subject(s) - foveal, saccade, computer science, artificial intelligence, visual field, fixation (population genetics), eye movement, peripheral vision, computer vision, pattern recognition (psychology), neuroscience, psychology, retinal, population, biochemistry, chemistry, demography, sociology

We develop a visuomotor model that implements visual search as a focal accuracy-seeking policy, with the target's position and category drawn independently from a common generative process. Consistent with the anatomical separation between the ventral and dorsal pathways, the model is composed of two pathways that respectively infer what to see and where to look. The “What” network is a classical deep learning classifier that only processes a small region around the center of fixation, providing a “foveal” accuracy. In contrast, the “Where” network processes the full visual field in a biomimetic fashion, using a log-polar retinotopic encoding, which is preserved up to the action selection level. In our model, the foveal accuracy is used as a monitoring signal to train the “Where” network, much like in the “actor/critic” framework. After training, the “Where” network provides an “accuracy map” that serves to guide the eye toward peripheral objects. Finally, comparing the two networks’ accuracies amounts to either selecting a saccade or keeping the eye focused at the center to identify the target. We test this setup on a simple task of finding a digit in a large, cluttered image. Our simulation results demonstrate the effectiveness of this approach, increasing by one order of magnitude the radius of the visual field within which the agent can detect and recognize a target, either through a single saccade or with multiple ones. Importantly, our log-polar treatment of the visual information exploits the strong compression performed at the sensory level, providing ways to implement visual search in a sublinear fashion, in contrast with mainstream computer vision.
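The decision rule described in the abstract (compare the foveal accuracy with the best value in the peripheral accuracy map, then either saccade or keep fixating) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `log_polar_centers` and `select_action`, the grid sizes, and the radii are all hypothetical choices made for illustration, and the accuracy values are supplied by hand rather than produced by trained “What” and “Where” networks.

```python
import numpy as np


def log_polar_centers(n_ecc=6, n_theta=8, r_min=2.0, r_max=64.0):
    """Return an (n_ecc, n_theta, 2) array of (x, y) offsets from fixation.

    Hypothetical grid: eccentricities are spaced geometrically, so cells
    become coarser with distance from the center of fixation.
    """
    radii = np.geomspace(r_min, r_max, n_ecc)
    angles = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    x = radii[:, None] * np.cos(angles[None, :])
    y = radii[:, None] * np.sin(angles[None, :])
    return np.stack([x, y], axis=-1)


def select_action(foveal_accuracy, accuracy_map, centers):
    """Choose between fixating and making a saccade.

    foveal_accuracy: scalar estimate of accuracy at the current fixation
                     (stands in for the "What" pathway output).
    accuracy_map:    (n_ecc, n_theta) array of predicted accuracies after a
                     saccade to each cell (stands in for the "Where" output).
    Returns ("fixate", (0.0, 0.0)) or ("saccade", (dx, dy)).
    """
    # Index of the peripheral cell with the highest predicted accuracy.
    best = np.unravel_index(np.argmax(accuracy_map), accuracy_map.shape)
    if accuracy_map[best] > foveal_accuracy:
        dx, dy = centers[best]
        return "saccade", (float(dx), float(dy))
    return "fixate", (0.0, 0.0)


centers = log_polar_centers()
accuracy_map = np.zeros(centers.shape[:2])
accuracy_map[3, 2] = 0.9  # hand-set peak, for illustration only

print(select_action(0.3, accuracy_map, centers))   # peripheral peak wins
print(select_action(0.95, accuracy_map, centers))  # foveal accuracy wins
```

In this sketch the number of rings grows with the logarithm of `r_max / r_min`, so with the default ratio of 2 between successive rings, doubling the covered radius adds only one ring of cells. That is one simple way to read the abstract's point about sublinear search cost, under the assumptions stated above.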
