Open Access

The value of information in multi-armed bandits with exponentially distributed rewards

Author(s) - Ilya O. Ryzhov, Warren B. Powell

Publication year - 2011

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2011.04.147

Subject(s) - computer science, exponential distribution, exponential growth, exponential function, value of information, value (mathematics), prior probability, mathematical optimization, Bayesian probability, class (philosophy), artificial intelligence, machine learning, mathematics, statistics, mathematical analysis

We consider a class of multi-armed bandit problems in which the reward obtained by pulling an arm is drawn from an exponential distribution whose parameter is unknown. A Bayesian model with independent gamma priors represents our beliefs and uncertainty about the exponential parameters. We derive a precise expression for the marginal value of information in this problem, which allows us to create a new knowledge gradient (KG) policy for making decisions. The policy is practical and easy to implement, making a case for the value of information as a general approach to optimal learning problems with many different types of learning models.
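The abstract names the learning model but does not spell out how beliefs change after a pull. As a rough illustration only, the sketch below shows the standard gamma-exponential conjugate update, assuming each arm's exponential distribution is parameterized by its rate and the gamma prior by shape `a` and rate `b`. The class `GammaBelief` and its method names are hypothetical, and the code does not implement the paper's KG policy or its expression for the value of information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GammaBelief:
    """Gamma(shape a, rate b) belief about one arm's unknown exponential rate.

    Hypothetical helper for illustration; not taken from the paper.
    """
    a: float  # shape parameter of the gamma belief
    b: float  # rate parameter of the gamma belief

    def update(self, reward: float) -> "GammaBelief":
        # Conjugate update after observing one exponentially distributed
        # reward: the shape grows by 1 and the rate grows by the reward.
        return GammaBelief(self.a + 1.0, self.b + reward)

    def expected_reward(self) -> float:
        # The mean reward is 1/rate, and E[1/rate] under Gamma(a, b)
        # equals b / (a - 1), which is finite only when a > 1.
        if self.a <= 1.0:
            raise ValueError("expected reward is infinite unless a > 1")
        return self.b / (self.a - 1.0)
```

For example, `GammaBelief(2.0, 3.0).expected_reward()` gives `3.0`; after observing a reward of `1.0` the belief becomes `GammaBelief(3.0, 4.0)`, whose expected reward is `2.0`. Because the priors are independent across arms, a pull changes only the belief for the arm that was pulled.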
