Differences

This shows you the differences between two versions of the page.

Link to this comparison view

models-and-methods:ti-dec-mdp [2014/05/06 05:07] (current)
stefan created
Line 1: Line 1:
 +===== Transition (and Observation) Independent Decentralized MDP (TOI-Dec-MDP) =====
 +
 +The TOI-Dec-MDP,​ introduced by Becker, Zilberstein,​ Lesser, and Goldman [1] was one of the first subclasses of Dec-POMDPs developed expressly with the intention of gaining traction on structured problems. ​ It introduces several key assumptions that limit the mode by which agents interact. ​ The TOI-Dec-MDP is a factored model, where the world state $s$ comprises each agent'​s local state $s_i$, which the respective agent observes directly. ​ An agent'​s actions cannot affect the transitions or observations that another agent experiences,​ but it can affect the joint reward. ​ Hence the agents remain coupled and must coordination their actions in order to maximize their joint value.  ​
 +
 +Operationally,​ the TOI-Dec-MDP representation accommodates a joint reward structure defined over agents'​ events (local transitions or combinations thereof) [1,2].
 +
 +Though seemingly restrictive,​ the authors argue that the TOI-Dec-MDP is expressive enough to capture a wide variety of practical interactions,​ including temporal constraints.
 +
 +==== Motivating Problems ====
 +
 +The TOI-Dec-MDP is motivated by problems involving agents that must act autonomously (from one another) with no communication,​ and cannot interfere with one another'​s actions, yet still should coordinate planned activities so as so maximize joint objectives. ​ The two application domains that the authors reference are planetary explorations rovers selecting which sites at which to collect data and UAVs performing reconnaissance. ​ In both domains, the objectives involve a maximization of information collected by the agents, which can only be achieved by the agents coordinating so as to avoid redundant activities.
 +
 +==== Key Assumptions ====
 +
 +  * Transition Independence:​ $P(s_i^{t+1}|s^t,​a^t)=s_i^{t+1}|s_i^t,​a_i^t)$ ​ [2]
 +  * Observation Independence:​ $P(o_i^{t+1}|s^t,​a^t,​s^{t+1})=P(o_i^{t+1}|s_i^t,​a_i^t,​s_i^{t+1})$ ​ [3]
 +  * Local Full Observability:​ $\forall o_i, \exists s_i \text{ s.t. } P(s_i|o_i) = 1$  [2]
 +
 +==== Theoretical Properties ====
 +
 +The TOI-Dec-MDP is NEXP-complete [2,3].
 +
 +==== Solution Methods ====
 +
 +  - Coverage Set Algorithm (CSA) [1,2]
 +  - Mixed Integer Linear Programming [4]
 +  - Separable Bi-linear Programming [5]
 +  - (Improved) Markovian Policy Seach ((I)MPS) and Markovian Policy Graph Search (MPGS) [6]
 +
 +==== References ====
 +
 +[1] [[http://​rbr.cs.umass.edu/​shlomo/​papers/​BZLGaamas03.html|Transition-Independent Decentralized Markov Decision Processes]]. ​
 +R. Becker, S. Zilberstein,​ V. Lesser, and C.V. Goldman. Proceedings of the Second International Conference on Autonomous Agents and Multi Agent Systems (AAMAS), 41-48, Melbourne, Australia, 2003.
 +
 +[2] [[http://​rbr.cs.umass.edu/​shlomo/​papers/​BZLGjair04.html|Solving Transition-Independent Decentralized Markov Decision Processes]]. ​
 +R. Becker, S. Zilberstein,​ V. Lesser, and C.V. Goldman. Journal of Artificial Intelligence Research, 22:423-455, 2004. 
 +
 +[3] [[http://​rbr.cs.umass.edu/​shlomo/​papers/​GZjair04.html|Decentralized Control of Cooperative Systems: Categorization and Complexity Analysis]]. ​
 +C.V. Goldman and S. Zilberstein. Journal of Artificial Intelligence Research, 22:143-174, 2004.
 +
 +[4] [[http://​citeseerx.ist.psu.edu/​viewdoc/​summary?​doi=10.1.1.105.3094|Mixed-integer linear programming for transit|ion-independent decentralized MDPs.]]
 +Jianhui Wu, Edmund H. Durfee. ​ AAMAS 2006: 1058-1060, 2006.
 +
 +[5] [[http://​rbr.cs.umass.edu/​shlomo/​papers/​PZjair09.html|A Bilinear Programming Approach for Multiagent Planning]].
 +M. Petrik and S. Zilberstein. Journal of Artificial Intelligence Research, 35:235-274, 2009. 
 +
 +[6] [[http://​lis.csail.mit.edu/​new/​bibtexbrowser.php?​key=DADC-aamas13&​bib=pubs%252Flis.bib|Producing Efficient Error-bounded Solutions for Transition Independent Decentralized MDPs]], ​
 +Jilles S. Dibangoye, Christopher Amato, Arnaud Doniec, François Charpillet. In Proceedings of the Twelfth International Conference on Autonomous Agents and Multiagent Systems, 2013.
 + 
  
Recent changes RSS feed Creative Commons License Donate Minima Template by Wikidesign Driven by DokuWiki