Differences

This shows you the differences between two versions of the page.

--- sdmia_invited_speakers [2015/10/18 09:32]
matthijs
+++ sdmia_invited_speakers [2015/11/10 07:27]
matthijs
@@ Line 39: / Line 39: @@
 [[http://web.engr.oregonstate.edu/~afern/|Oregon State]]
-Title: TBD\\
+Title: **Learning to Speedup Planning: Filling the Gap Between Reaction and Thinking**\\
 Abstract:\\
-TBD
+The product of most learning algorithms for sequential decision
+making is a policy, which supports very fast, or reactive, decision
+making. Another way to compute decisions, given a domain model or
+simulator, is to use a deliberative planning algorithm, potentially
+at a high computational cost. One perspective is that algorithms
+for learning reactive policies are attempting to compile away the
+deliberative "thinking" process of planners into fast circuits.
+Intuition suggests, however, that such compilation will not support
+quality decision making in the most difficult domains (e.g., chess,
+logistics). In other words, some domains will always require
+some amount of deliberative planning. Is there a role for learning
+in such cases?
+In this talk, I will revisit the old idea of speedup learning for
+planning, where the goal of learning is to speed up deliberative
+planning in a domain, given experience in that domain. This speedup
+learning framework offers a bridge between learning for purely
+reactive behavior and pure deliberative planning. I will review
+some prior work and speculate about why it produced only limited
+successes. I will then review some of our own recent work in the
+area of speedup learning for MDP tree search and discuss potential
+future directions.
 === Mykel Kochenderfer ===