
Policy Optimization as Online Learning with Mediator Feedback
Author(s) -
Alberto Maria Metelli,
Matteo Papini,
Pierluca D’Oro,
Marcello Restelli
Publication year - 2021
Publication title -
Proceedings of the ... AAAI Conference on Artificial Intelligence
Language(s) - English
Resource type - Journals
eISSN - 2374-3468
pISSN - 2159-5399
DOI - 10.1609/aaai.v35i10.17083
Subject(s) - regret, computer science, logarithm, truncation (statistics), mathematical optimization, space (punctuation), reuse, relation (database), mathematics, machine learning, data mining, engineering, mathematical analysis, operating system, waste management