Habits, action sequences and reinforcement learning
Author(s) - Dezfouli Amir, Balleine Bernard W.
Publication year - 2012
Publication title - European Journal of Neuroscience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.346
H-Index - 206
eISSN - 1460-9568
pISSN - 0953-816X
DOI - 10.1111/j.1460-9568.2012.08050.x
Subject(s) - action (physics) , reinforcement learning , contingency , antecedent (behavioral psychology) , reinforcement , psychology , outcome (game theory) , reflexivity , cognitive psychology , computer science , cognitive science , artificial intelligence , social psychology , epistemology , economics , microeconomics , sociology , social science , philosophy , physics , quantum mechanics
It is now widely accepted that instrumental actions can be either goal‐directed or habitual; whereas the former are rapidly acquired and regulated by their outcome, the latter are reflexive, elicited by antecedent stimuli rather than their consequences. Model‐based reinforcement learning (RL) provides an elegant description of goal‐directed action. Through exposure to states, actions and rewards, the agent rapidly constructs a model of the world and can choose an appropriate action based on quite abstract changes in environmental and evaluative demands. This model is powerful but has a problem explaining the development of habitual actions. To account for habits, theorists have argued that another action controller is required, called model‐free RL, that does not form a model of the world but rather caches action values within states allowing a state to select an action based on its reward history rather than its consequences. Nevertheless, there are persistent problems with important predictions from the model; most notably the failure of model‐free RL correctly to predict the insensitivity of habitual actions to changes in the action–reward contingency. Here, we suggest that introducing model‐free RL in instrumental conditioning is unnecessary, and demonstrate that reconceptualizing habits as action sequences allows model‐based RL to be applied to both goal‐directed and habitual actions in a manner consistent with what real animals do. This approach has significant implications for the way habits are currently investigated and generates new experimental predictions.
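To make the contrast concrete, below is a minimal sketch (not from the paper) of the model-free controller the abstract describes: tabular Q-learning, which caches an action value per state and updates it from reward history alone, with no model of the action's consequences. All names and parameter values here are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict


class ModelFreeAgent:
    """Illustrative tabular Q-learning agent: values are cached per
    (state, action) pair and updated from observed rewards only."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)   # cached value for each (state, action)
        self.actions = actions
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.epsilon = epsilon        # exploration probability

    def choose(self, state):
        # Epsilon-greedy choice driven only by cached values,
        # not by any representation of the action's outcome.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Cache update: no world model is consulted, only the observed
        # reward and the best cached value of the next state.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

Under the paper's alternative proposal, a habit would not be driven by such cached state values at all; it would instead be a pre-compiled action sequence selected as a single unit by a model-based controller and then executed open-loop.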
