## Optimal values for Finite-Horizon Dec-POMDPs

Here we provide the value of an optimal solution for some of above problems for several horizons, which can for instance be used to benchmark approximate solutions. The table also notes in which papers the problem had been first solved optimally for a given horizon. Note that values reported are for the undiscounted setting, i.e., the discount factor (often denoted by gamma) was set to 1.0, overriding sometimes the discount factor specified in the .dpomdp files. This explains discrepancies with some of the cited papers. In most cases the reported values have been computed using GMAA*-Cluster (OWS2009), otherwise they have been copied from the referred paper.

DecTiger | |||

Horizon | Optimal value | First solved by | Notes |
---|---|---|---|

2 | -4.000000 | (NTYPM2003) | |

3 | 5.190812 | (SCZ2005) | 1) |

4 | 4.802755 | (SCZ2005) | |

5 | 7.026451 | (OWS2009) | |

6 | 10.381625 | (SOA2011) | |

7 | 9.9935 | (DABC2013) | 5) |

8 | 12.217 | (DABC2013) | 5) |

9 | 15.572 | (DABC2013) | 5) |

10 | 15.184 | (DABC2013) | 5) |

BroadcastChannel | |||

Horizon | Optimal value | First solved by | Notes |

2 | 2.000000 | (HBZ2004) | |

3 | 2.990000 | (HBZ2004) | |

4 | 3.890000 | (HBZ2004) | |

5 | 4.790000 | (SC2006) | |

6 | 5.690000 | (OWS2009) | |

7 | 6.590000 | (OWS2009) | |

8 | 7.490000 | (OWS2009) | |

9 | 8.390000 | (OWS2009) | |

10 | 9.290000 | (OWS2009) | |

15 | 13.790000 | (OWS2009) | |

20 | 18.313228 | (OWS2009) | |

25 | 22.881523 | (OWS2009) | |

30 | 27.421850 | (SOA2011) | |

50 | 45.501604 | (SOA2011) | |

100 | 90.760423 | (SOA2011) | |

250 | 226.500545 | (SOA2011) | |

500 | 452.738119 | (SOA2011) | |

600 | 543.228071 | (SOA2011) | |

700 | 633.724279 | (SOA2011) | |

900 | 814.709393 | (SOA2011) | |

GridSmall | |||

Horizon | Optimal value | First solved by | Notes |

2 | 0.910000 | (OSV2008) | |

3 | 1.550444 | (OSV2008) | |

4 | 2.241577 | (OWS2009) | |

5 | 2.970496 | (SOA2011) | |

6 | 3.717168 | (SOA2011) | |

7 | 4.4657 | (DABC2013) | 5) |

8 | 5.2319 | (DABC2013) | 5) |

9 | 5.9878 | (DABC2013) | 5) |

One Door | |||

Horizon | Optimal value | First solved by | Notes |

2 | 0.000000 | unpublished | |

3 | -0.000395 | unpublished | |

4 | -0.001682 | unpublished | |

Cooperative Box Pushing | |||

Horizon | Optimal value | First solved by | Notes |

2 | 17.600000 | (OWS2009) | |

3 | 66.081000 | (OWS2009) | |

4 | 98.593613 | (ADZ2009) | |

5 | 107.72 | (DABC2013) | 5) |

6 | 120.67 | (DABC2013) | 5) |

7 | 156.42 | (DABC2013) | 5) |

8 | 191.22 | (DABC2013) | 5) |

9 | 208.19 | (DABC2013) | 5) |

10 | 220.45 | (DABC2013) | 5) |

Recycling Robots | |||

Horizon | Optimal value | First solved by | Notes |

2 | 7.000000 | (OWS2009) | 3) |

3 | 10.660125 | (OWS2009) | 3) |

4 | 13.380000 | (OWS2009) | 3) |

5 | 16.486000 | (OWS2009) | 3) |

6 | 19.554200 | (OWS2009) | 3) |

7 | 22.633740 | (OWS2009) | 3) |

8 | 25.709878 | (OWS2009) | 3) |

9 | 28.787037 | (OWS2009) | 3) |

10 | 31.863889 | (OWS2009) | 3) |

11 | 34.940833 | (OWS2009) | 3) |

12 | 38.017750 | (OWS2009) | 3) |

13 | 41.094675 | (OWS2009) | 3) |

14 | 44.171598 | (OWS2009) | 3) |

15 | 47.248521 | (OWS2009) | 3) |

20 | 62.633136 | (SOA2011) | |

30 | 93.402367 | (SOA2011) | |

40 | 124.171598 | (SOA2011) | |

50 | 154.940828 | (SOA2011) | |

60 | 185.710059 | (SOA2011) | |

70 | 216.479290 | (SOA2011) | |

100 | 308.78 | (DABC2013) | 5) |

Fire Fighting (3 houses, 3 fire levels) | |||

Horizon | Optimal value | First solved by | Notes |

2 | -4.383496 | (OSV2008) | 4) |

3 | -5.736969 | (OSV2008) | |

4 | -6.578834 | (OWS2009) | |

5 | -7.069874 | (SOA2011) | |

6 | -7.175591 | (SOA2011) | |

Mars Rovers | |||

Horizon | Optimal value | First solved by | Notes |

2 | 5.800000 | (ADZ2009) (OWS2009) | 2) |

3 | 9.380000 | (ADZ2009) (OWS2009) | 2) |

4 | 10.180800 | (OWS2009) | 2) |

5 | 13.26 | (DABC2013) | 5) |

6 | 18.62 | (DABC2013) | 5) |

7 | 20.90 | (DABC2013) | 5) |

8 | 22.47 | (DABC2013) | 5) |

9 | 24.31 | (DABC2013) | 5) |

10 | 26.31 | (DABC2013) | 5) |

Meeting in a 3x3 grid | |||

Horizon | Optimal value | First solved by | Notes |

2 | 0.000000 | (ADZ2009) (OWS2009) | 2) |

3 | 0.133200 | (ADZ2009) (OWS2009) | 2) |

4 | 0.433 | (ADZ2009) | |

5 | 0.896 | (ADZ2009) | |

Hotel 1 | |||

Horizon | Optimal value | First solved by | Notes |

2 | 10.000000 | (OWS2009) | 3) |

3 | 16.875000 | (OWS2009) | 3) |

4 | 22.187500 | (OWS2009) | 3) |

5 | 27.187500 | (SOA2011) | |

6 | 32.187500 | (SOA2011) | |

7 | 37.187500 | (SOA2011) | |

8 | 42.187500 | (SOA2011) | |

9 | 47.187500 | (SOA2011) |

To report errors, mistakes, additions, omissions, “prior art” with respect to the “first solved by” column, please send an email to Matthijs Spaan.

##### Notes

1) In (NTYPM2003) an incorrect optimal value for DecTiger h=3 was presented.

2) The (ADZ2009) paper included new results obtained using the algorithm presented in (OWS2009).

3) In these cases the results reported in the referred paper concern a different discount factor (usually as stated in the problem definition).

4) In (OWS2009) a typo in the third decimal was published for Fire Fighting h=2 (-4.3825 instead of the correct -4.3835).

5) In (DABC2013) epsilon-optimal solutions were found with an epsilon of 0.01.

##### References

NTYPM2003 (Nair, Tambe, Yokoo, Pynadath & Marsella, IJCAI 2003)

HBZ2004 (Hansen, Bernstein & Zilberstein, AAAI 2004)

SCZ2005 (Szer, Charpillet & Zilberstein, UAI 2005)

SC2006 (Szer & Charpillet, AAAI 2006)

OSV2008 (Oliehoek, Spaan & Vlassis, JAIR 2008)

OWS2009 (Oliehoek, Whiteson & Spaan, AAMAS 2009)

ADZ2009 (Amato, Dibangoye & Zilberstein, ICAPS 2009)

SOA2011 (Spaan, Oliehoek & Amato, IJCAI 2011)

DABC2013 (Dibangoye, Amato, Buffet & Charpillet, IJCAI 2013)

## Highest known values for Infinite-Horizon Dec-POMDPs

The table below provides the highest known values for a range of benchmark problems with the common discount factor of 0.9 (except Wireless network which used 0.99). We also list the paper that each result first appeared. Note that the results are often an average over a number of runs, so single runs may have higher values than those listed here.

DecTiger | ||

Highest known value | First solved by | Notes |
---|---|---|

13.45 | (PP2011) | |

BroadcastChannel | ||

Highest known value | First solved by | Notes |

9.1 | (ABZ2007) | |

GridSmall | ||

Highest known value | First solved by | Notes |

6.89 | (PP2011) | |

Cooperative Box Pushing | ||

Highest known value | First solved by | Notes |

149.854 | (AZ2009) | |

Recycling Robots | ||

Highest known value | First solved by | Notes |

31.92865 | (ABZ2007) | |

Mars Rovers | ||

Highest known value | First solved by | Notes |

24.13 | (PP2011) | |

Wireless network | ||

Highest known value | First solved by | Notes |

-175.40 | (KZ2010) | 4) |

To report errors, mistakes, additions, or omissions, please send an email to Chris Amato.