Exploiting structure in decentralized Markov decision processes

Raphen Becker, University of Massachusetts Amherst

Abstract

While formal, decision-theoretic models such as the Markov Decision Process (MDP) have greatly advanced the field of single-agent control, application of similar ideas to multi-agent domains has proven problematic. The advantages of such an approach over traditional heuristic and experimental models of multi-agent systems include a more accurate representation of the underlying problem, a more easily defined notion of optimality and the potential for significantly better solutions. The difficulty often comes from the tradeoff between the expressiveness of the model and the complexity of finding an optimal solution. Much of the research in this area has focused on the extremes of this tradeoff. At one extreme are models where each agent has a global view of the world, and solving these problems is no harder than solving single-agent problems. At the other extreme lie very general, decentralized models, which are also nearly impossible to solve optimally.

The work proposed here explores the middle-ground by starting with a general decentralized Markov decision process and introducing structure that can be exploited to reduce the complexity. I present two decision-theoretic models that structure the interactions between agents in two different ways. In the first model the agents are independent except for an extra reward signal that depends on each of the agents' histories. In the second model the agents have independent rewards but there is a structured interaction between their transition probabilities. Both of these models can be optimally and approximately solved using my Coverage Set Algorithm. I also extend the first model by allowing the agents to communicate and I introduce an algorithm that finds an optimal joint communication policy for a fixed joint domain-level policy.

Subject Area

Artificial intelligence; Computer science

Recommended Citation

Becker, Raphen, "Exploiting structure in decentralized Markov decision processes" (2006). Doctoral Dissertations Available from Proquest. AAI3242298.
http://scholarworks.umass.edu/dissertations/AAI3242298
