
Task-Agnostic Exploration via Policy Gradient of a Non-Parametric State Entropy Estimate
Author(s) - Mirco Mutti, Lorenzo Pratissoli, Marcello Restelli
Publication year - 2021
Publication title - Proceedings of the ... AAAI Conference on Artificial Intelligence
Language(s) - English
Resource type - Journals
eISSN - 2374-3468
pISSN - 2159-5399
DOI - 10.1609/aaai.v35i10.17091
Subject(s) - computer science, entropy (arrow of time), parametric statistics, principle of maximum entropy, mathematical optimization, task (project management), policy learning, artificial intelligence, machine learning, mathematics, engineering, statistics, physics, quantum mechanics, systems engineering