Optimal values for Finite-Horizon Dec-POMDPs

Here we provide the value of an optimal solution for some of above problems for several horizons, which can for instance be used to benchmark approximate solutions. The table also notes in which papers the problem had been first solved optimally for a given horizon. Note that values reported are for the undiscounted setting, i.e., the discount factor (often denoted by gamma) was set to 1.0, overriding sometimes the discount factor specified in the .dpomdp files. This explains discrepancies with some of the cited papers. In most cases the reported values have been computed using GMAA*-Cluster (OWS2009), otherwise they have been copied from the referred paper.

DecTiger
Horizon Optimal value First solved by Notes
2 -4.000000 (NTYPM2003)
3 5.190812 (SCZ2005) 1)
4 4.802755 (SCZ2005)
5 7.026451 (OWS2009)
6 10.381625 (SOA2011)
7 9.9935 (DABC2013) 5)
8 12.217 (DABC2013) 5)
9 15.572 (DABC2013) 5)
10 15.184 (DABC2013) 5)
BroadcastChannel
Horizon Optimal value First solved by Notes
2 2.000000 (HBZ2004)
3 2.990000 (HBZ2004)
4 3.890000 (HBZ2004)
5 4.790000 (SC2006)
6 5.690000 (OWS2009)
7 6.590000 (OWS2009)
8 7.490000 (OWS2009)
9 8.390000 (OWS2009)
10 9.290000 (OWS2009)
15 13.790000 (OWS2009)
20 18.313228 (OWS2009)
25 22.881523 (OWS2009)
30 27.421850 (SOA2011)
50 45.501604 (SOA2011)
100 90.760423 (SOA2011)
250 226.500545 (SOA2011)
500 452.738119 (SOA2011)
600 543.228071 (SOA2011)
700 633.724279 (SOA2011)
900 814.709393 (SOA2011)
GridSmall
Horizon Optimal value First solved by Notes
2 0.910000 (OSV2008)
3 1.550444 (OSV2008)
4 2.241577 (OWS2009)
5 2.970496 (SOA2011)
6 3.717168 (SOA2011)
7 4.4657 (DABC2013) 5)
8 5.2319 (DABC2013) 5)
9 5.9878 (DABC2013) 5)
One Door
Horizon Optimal value First solved by Notes
2 0.000000 unpublished
3 -0.000395 unpublished
4 -0.001682 unpublished
Cooperative Box Pushing
Horizon Optimal value First solved by Notes
2 17.600000 (OWS2009)
3 66.081000 (OWS2009)
4 98.593613 (ADZ2009)
5 107.72 (DABC2013) 5)
6 120.67 (DABC2013) 5)
7 156.42 (DABC2013) 5)
8 191.22 (DABC2013) 5)
9 208.19 (DABC2013) 5)
10 220.45 (DABC2013) 5)
Recycling Robots
Horizon Optimal value First solved by Notes
2 7.000000 (OWS2009) 3)
3 10.660125 (OWS2009) 3)
4 13.380000 (OWS2009) 3)
5 16.486000 (OWS2009) 3)
6 19.554200 (OWS2009) 3)
7 22.633740 (OWS2009) 3)
8 25.709878 (OWS2009) 3)
9 28.787037 (OWS2009) 3)
10 31.863889 (OWS2009) 3)
11 34.940833 (OWS2009) 3)
12 38.017750 (OWS2009) 3)
13 41.094675 (OWS2009) 3)
14 44.171598 (OWS2009) 3)
15 47.248521 (OWS2009) 3)
20 62.633136 (SOA2011)
30 93.402367 (SOA2011)
40 124.171598 (SOA2011)
50 154.940828 (SOA2011)
60 185.710059 (SOA2011)
70 216.479290 (SOA2011)
100 308.78 (DABC2013) 5)
Fire Fighting (3 houses, 3 fire levels)
Horizon Optimal value First solved by Notes
2 -4.383496 (OSV2008) 4)
3 -5.736969 (OSV2008)
4 -6.578834 (OWS2009)
5 -7.069874 (SOA2011)
6 -7.175591 (SOA2011)
Mars Rovers
Horizon Optimal value First solved by Notes
2 5.800000 (ADZ2009) (OWS2009) 2)
3 9.380000 (ADZ2009) (OWS2009) 2)
4 10.180800 (OWS2009) 2)
5 13.26 (DABC2013) 5)
6 18.62 (DABC2013) 5)
7 20.90 (DABC2013) 5)
8 22.47 (DABC2013) 5)
9 24.31 (DABC2013) 5)
10 26.31 (DABC2013) 5)
Meeting in a 3x3 grid
Horizon Optimal value First solved by Notes
2 0.000000 (ADZ2009) (OWS2009) 2)
3 0.133200 (ADZ2009) (OWS2009) 2)
4 0.433 (ADZ2009)
5 0.896 (ADZ2009)
Hotel 1
Horizon Optimal value First solved by Notes
2 10.000000 (OWS2009) 3)
3 16.875000 (OWS2009) 3)
4 22.187500 (OWS2009) 3)
5 27.187500 (SOA2011)
6 32.187500 (SOA2011)
7 37.187500 (SOA2011)
8 42.187500 (SOA2011)
9 47.187500 (SOA2011)

To report errors, mistakes, additions, omissions, “prior art” with respect to the “first solved by” column, please send an email to Matthijs Spaan.

Notes

1) In (NTYPM2003) an incorrect optimal value for DecTiger h=3 was presented.
2) The (ADZ2009) paper included new results obtained using the algorithm presented in (OWS2009).
3) In these cases the results reported in the referred paper concern a different discount factor (usually as stated in the problem definition).
4) In (OWS2009) a typo in the third decimal was published for Fire Fighting h=2 (-4.3825 instead of the correct -4.3835).
5) In (DABC2013) epsilon-optimal solutions were found with an epsilon of 0.01.

References

NTYPM2003 (Nair, Tambe, Yokoo, Pynadath & Marsella, IJCAI 2003)
HBZ2004 (Hansen, Bernstein & Zilberstein, AAAI 2004)
SCZ2005 (Szer, Charpillet & Zilberstein, UAI 2005)
SC2006 (Szer & Charpillet, AAAI 2006)
OSV2008 (Oliehoek, Spaan & Vlassis, JAIR 2008)
OWS2009 (Oliehoek, Whiteson & Spaan, AAMAS 2009)
ADZ2009 (Amato, Dibangoye & Zilberstein, ICAPS 2009)
SOA2011 (Spaan, Oliehoek & Amato, IJCAI 2011)
DABC2013 (Dibangoye, Amato, Buffet & Charpillet, IJCAI 2013)

Highest known values for Infinite-Horizon Dec-POMDPs

The table below provides the highest known values for a range of benchmark problems with the common discount factor of 0.9 (except Wireless network which used 0.99). We also list the paper that each result first appeared. Note that the results are often an average over a number of runs, so single runs may have higher values than those listed here.

DecTiger
Highest known value First solved by Notes
13.45 (PP2011)
BroadcastChannel
Highest known value First solved by Notes
9.1 (ABZ2007)
GridSmall
Highest known value First solved by Notes
6.89 (PP2011)
Cooperative Box Pushing
Highest known value First solved by Notes
149.854 (AZ2009)
Recycling Robots
Highest known value First solved by Notes
31.92865 (ABZ2007)
Mars Rovers
Highest known value First solved by Notes
24.13 (PP2011)
Wireless network
Highest known value First solved by Notes
-175.40 (KZ2010) 4)

To report errors, mistakes, additions, or omissions, please send an email to Chris Amato.

Notes

4) In (PP2011), the value produced from (KZ2010) is reported.

References

ABZ2007 (Amato, Bernstein & Zilberstein, UAI 2007)
AZ2009 (Amato & Zilberstein, AAMAS 2009)
KZ2010 (Kumar & Zilberstein, UAI 2010)
PP2011 (Pajarinen & Peltonen, NIPS 2011)

optimal_values.txt · Last modified: 2014/11/17 04:26 by chris
Recent changes RSS feed Creative Commons License Donate Minima Template by Wikidesign Driven by DokuWiki