Continuous‐time mean–variance portfolio selection: A reinforcement learning framework
Author(s) - Wang Haoran, Zhou Xun Yu
Publication year - 2020
Publication title - Mathematical Finance
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.98
H-Index - 81
eISSN - 1467-9965
pISSN - 0960-1627
DOI - 10.1111/mafi.12281
Subject(s) - reinforcement learning, portfolio, variance (accounting), computer science, stochastic control, selection (genetic algorithm), gaussian, mathematical optimization, artificial neural network, entropy (arrow of time), portfolio optimization, artificial intelligence, optimal control, mathematics, economics, finance, accounting, physics, quantum mechanics
We approach the continuous‐time mean–variance portfolio selection with reinforcement learning (RL). The problem is to achieve the best trade‐off between exploration and exploitation, and is formulated as an entropy‐regularized, relaxed stochastic control problem. We prove that the optimal feedback policy for this problem must be Gaussian, with time‐decaying variance. We then prove a policy improvement theorem, based on which we devise an implementable RL algorithm. We find that our algorithm and its variant outperform both traditional and deep neural network based algorithms in our simulation and empirical studies.
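To make the abstract's central claim concrete, the following minimal Python sketch shows what a Gaussian feedback policy with time-decaying variance could look like. The abstract does not state a functional form, so everything here is an assumption made for illustration: the mean is taken proportional to the gap between current wealth `x` and a target level `w`, the variance is taken as `lam / (2 * sigma**2) * exp(rho**2 * (T - t))`, and the names `rho` (Sharpe ratio), `sigma` (volatility), `lam` (exploration weight), and `T` (horizon) are hypothetical parameters. The exact expressions should be checked against the paper itself.

```python
import math
import random


def gaussian_policy(t, x, w, rho, sigma, lam, T):
    """Return (mean, variance) of an illustrative Gaussian exploration policy.

    Assumed form (not taken from the abstract):
      mean     = -(rho / sigma) * (x - w)
      variance = lam / (2 * sigma^2) * exp(rho^2 * (T - t))
    The variance shrinks as t approaches the horizon T, so exploration
    decays over time, while the mean depends only on the wealth gap x - w.
    """
    mean = -(rho / sigma) * (x - w)
    variance = lam / (2.0 * sigma ** 2) * math.exp(rho ** 2 * (T - t))
    return mean, variance


def sample_allocation(t, x, w, rho, sigma, lam, T, rng=random):
    """Draw one risky-asset allocation from the illustrative policy."""
    mean, variance = gaussian_policy(t, x, w, rho, sigma, lam, T)
    return rng.gauss(mean, math.sqrt(variance))


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for demonstration.
    params = dict(x=1.0, w=1.4, rho=0.5, sigma=0.2, lam=2.0, T=1.0)
    for t in (0.0, 0.5, 1.0):
        mean, variance = gaussian_policy(t=t, **params)
        print(f"t={t:.1f}  mean={mean:.4f}  variance={variance:.4f}")
```

In this sketch the exploration weight `lam` scales the variance directly, so setting it to zero would collapse the policy to a deterministic feedback rule; that reflects the exploration and exploitation trade-off the abstract describes, under the stated assumptions.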