Open Access
Constructing optimal policies for agents with constrained architectures
Author(s) -
Dmitri Dolgov,
Edmund H. Durfee
Publication year - 2003
Publication title -
CiteSeerX (The Pennsylvania State University)
Language(s) - English
Resource type - Conference proceedings
ISBN - 1-58113-683-8
DOI - 10.1145/860575.860752
Subject(s) - computer science , distributed computing
Optimal behavior is a very desirable property of autonomous agents [13] and, as such, has received much attention over the years. However, making optimal decisions and executing optimal actions typically requires substantial effort on the part of an agent, and in some situations the agent might lack the necessary sensory, computational, or actuating resources to carry out the optimal policy. In such cases, the agent will have to do the best it can, given its architectural constraints. We distinguish between three ways in which an agent’s architecture can affect policy optimality. An agent might have limitations that impact its ability to formulate, operationalize (convert to internal representation), or execute an optimal policy. In this paper, we focus on agents facing the latter two types of limitations. We adopt the transient [7] constrained Markov decision problem (CMDP) framework [2, 11] in our search for optimal policies and show how gradations of increasingly constrained architectures create more complex optimization problems, ranging from polynomial to NP-complete. We also present algorithms based on linear and integer programming that work across a range of such constrained optimization problems. The contribution of the full paper [5] is a characterization of a portion of the landscape of constrained agent architectures in terms of the complexity of finding optimal policies, together with algorithms for doing so. The new results of most interest include the complexity proof and the algorithm for finding deterministic policies under linear execution constraints, the analysis of operationalization constraints on action utilization costs, and an algorithm for approximating optimal policies that bounds the probability of exceeding upper limits on the total costs of the policy. Here we summarize this work.
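As background to the LP-based approach the abstract mentions, the standard way to pose a constrained MDP as a linear program is over occupancy measures: maximize expected reward subject to flow-conservation equalities and a linear bound on expected cost. The sketch below is a minimal discounted-horizon illustration of that idea (the paper itself works in the transient CMDP setting, which differs); the toy problem data (`P`, `r`, `c`, `C_MAX`, `gamma`) are invented for illustration and are not from the paper.

```python
# Occupancy-measure LP sketch for a constrained MDP (CMDP).
# Hypothetical 2-state, 2-action problem; all numbers are illustrative.
import numpy as np
from scipy.optimize import linprog

n_s, n_a = 2, 2
gamma = 0.9                      # discount factor
alpha = np.array([1.0, 0.0])     # initial state distribution

# P[s, a, s'] = transition probability from s under a to s'
P = np.zeros((n_s, n_a, n_s))
P[0, 0] = [0.8, 0.2]
P[0, 1] = [0.1, 0.9]
P[1, 0] = [0.5, 0.5]
P[1, 1] = [0.9, 0.1]

r = np.array([[1.0, 2.0], [0.0, 3.0]])   # rewards r[s, a]
c = np.array([[0.0, 1.0], [0.0, 2.0]])   # resource costs c[s, a]
C_MAX = 5.0                              # budget on expected total cost

# Variables: occupancy measure x[s, a], flattened to a length n_s*n_a vector.
# Flow conservation for each state s:
#   sum_a x[s, a] - gamma * sum_{s', a} P[s', a, s] * x[s', a] = alpha[s]
A_eq = np.zeros((n_s, n_s * n_a))
for s in range(n_s):
    for s2 in range(n_s):
        for a in range(n_a):
            j = s2 * n_a + a
            A_eq[s, j] += (1.0 if s2 == s else 0.0) - gamma * P[s2, a, s]
b_eq = alpha

# Linear cost constraint: sum_{s,a} c[s, a] * x[s, a] <= C_MAX
A_ub = c.flatten()[None, :]
b_ub = [C_MAX]

# Maximize expected discounted reward <=> minimize -r . x
res = linprog(-r.flatten(), A_ub=A_ub, b_ub=b_ub,
              A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
x = res.x.reshape(n_s, n_a)

# Recover a (generally randomized) policy: pi[s, a] = x[s, a] / sum_a x[s, a]
pi = x / x.sum(axis=1, keepdims=True)
print("occupancy measure:\n", x)
print("policy:\n", pi)
```

A randomized policy falls out of the LP directly; restricting to deterministic policies, as the abstract notes, turns the same formulation into an integer program (e.g. forcing each row of `pi` to be a 0/1 vector), which is where the NP-completeness results enter.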
