Interactions in Decentralized Environments
This dissertation has been moved to the following series:
The decentralized Markov decision process (Dec-POMDP) is a powerful formal model for studying multiagent problems where cooperative, coordinated action is optimal, but each agent acts based on local data alone. Unfortunately, it is known that Dec-POMDPs are fundamentally intractable: they are NEXP-complete in the worst case, and have been empirically observed to be beyond feasible optimal solution. To get around these obstacles, researchers have focused on special classes of the general Dec-POMDP problem, restricting the degree to which agent actions can interact with one another. In some cases, it has been proven that these sorts of structured forms of interaction can in fact reduce worst-case complexity. Where formal proofs have been lacking, empirical observations suggest that this may also be true for other cases, although less is known precisely. This thesis unifies a range of this existing work, extending analysis to establish novel complexity results for some popular restricted-interaction models. We also establish some new results concerning cases for which reduced complexity has been proven, showing correspondences between basic structural features and the potential for dimensionality reduction when employing mathematical programming techniques. As our new complexity results establish that worst-case intractability is more widespread than previously known, we look to new ways of analyzing the potential average-case difficulty of Dec-POMDP instances. As this would be extremely difficult using the tools of traditional complexity theory, we take a more empirical approach. In so doing, we identify new analytical measures that apply to all Dec-POMDPs, whatever their structure. These measures allow us to identify problems that are potentially easier to solve on average, and validate this claim empirically. As we show, the performance of well-known optimal dynamic programming methods correlates with our new measure of difficulty. Finally, we explore the approximate case, showing that our measure works well as a predictor of difficulty there, too, and provides a means of setting algorithm parameters to achieve far more efficient performance.