A Normative Account of Confirmation Bias During Reinforcement Learning

Open Access

Author(s) - Germain Lefebvre, Christopher Summerfield, Rafał Bogacz

Publication year - 2021

Publication title - Neural Computation

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.235

H-Index - 169

eISSN - 1530-888X

pISSN - 0899-7667

DOI - 10.1162/neco_a_01455

Subject(s) - reinforcement learning, converse, artificial intelligence, machine learning, computer science, normative, psychology, econometrics, cognitive psychology, mathematics, epistemology, philosophy, geometry

Reinforcement learning involves updating estimates of the value of states and actions on the basis of experience. Previous work has shown that in humans, reinforcement learning exhibits a confirmatory bias: when the value of a chosen option is being updated, estimates are revised more radically following positive than negative reward prediction errors, but the converse is observed when updating the unchosen option value estimate. Here, we simulate performance on a multi-arm bandit task to examine the consequences of a confirmatory bias for reward harvesting. We report a paradoxical finding: that confirmatory biases allow the agent to maximize reward relative to an unbiased updating rule. This principle holds over a wide range of experimental settings and is most influential when decisions are corrupted by noise. We show that this occurs because on average, confirmatory biases lead to overestimating the value of more valuable bandits and underestimating the value of less valuable bandits, rendering decisions overall more robust in the face of noise. Our results show how apparently suboptimal learning rules can in fact be reward maximizing if decisions are made with finite computational precision.
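
The update rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a bandit with full feedback (the outcome of every arm is observed on each trial), Bernoulli rewards, and a softmax choice rule standing in for noisy decisions. The function names, the learning rates alpha_conf and alpha_disc, the inverse temperature beta, and all default values are hypothetical choices made for illustration.

```python
import math
import random


def confirmatory_update(q, chosen, rewards, alpha_conf, alpha_disc):
    """Return updated value estimates for all arms (full feedback assumed).

    A prediction error "confirms" the choice when it is positive for the
    chosen arm or negative for an unchosen arm. Confirming errors use
    alpha_conf; all other errors use alpha_disc. Setting
    alpha_conf == alpha_disc gives an unbiased delta rule.
    """
    new_q = list(q)
    for arm, reward in enumerate(rewards):
        delta = reward - q[arm]  # reward prediction error for this arm
        if arm == chosen:
            confirms = delta > 0
        else:
            confirms = delta < 0
        alpha = alpha_conf if confirms else alpha_disc
        new_q[arm] = q[arm] + alpha * delta
    return new_q


def softmax_choice(q, beta, rng):
    """Sample an arm with probability proportional to exp(beta * value)."""
    weights = [math.exp(beta * v) for v in q]
    return rng.choices(range(len(q)), weights=weights)[0]


def simulate(p_reward, alpha_conf, alpha_disc, beta=5.0, n_trials=1000, seed=0):
    """Return the average reward per trial on a Bernoulli bandit."""
    rng = random.Random(seed)
    q = [0.5] * len(p_reward)  # assumed neutral starting estimates
    total = 0.0
    for _ in range(n_trials):
        choice = softmax_choice(q, beta, rng)
        rewards = [1.0 if rng.random() < p else 0.0 for p in p_reward]
        total += rewards[choice]
        q = confirmatory_update(q, choice, rewards, alpha_conf, alpha_disc)
    return total / n_trials


if __name__ == "__main__":
    arms = [0.6, 0.4]  # hypothetical reward probabilities
    print("biased  :", simulate(arms, alpha_conf=0.3, alpha_disc=0.1))
    print("unbiased:", simulate(arms, alpha_conf=0.2, alpha_disc=0.2))
```

Comparing the two printed averages across many seeds and noise levels (lower beta means noisier choices) is one way to probe the claim in the abstract. A single run with one seed is not evidence either way.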
