Optimal Exploration–Exploitation in a Multi-armed Bandit Problem with Non-stationary Rewards
Author(s) - Omar Besbes, Yonatan Gur, Assaf Zeevi
Publication year - 2019
Publication title - Stochastic Systems
Language(s) - English
Resource type - Journals
ISSN - 1946-5238
DOI - 10.1287/stsy.2019.0033
Subject(s) - regret, time horizon, minimax, mathematical optimization, complement (music), metric (unit), oracle, parametric statistics, variation (astronomy), mathematics, computer science, economics, statistics, operations management, biochemistry, chemistry, software engineering, physics, complementation, astrophysics, gene, phenotype