Differences
This shows you the differences between two versions of the page.
Both sides previous revision Previous revision | Last revision Both sides next revision | ||
sdmia_invited_speakers [2015/10/18 09:32] matthijs |
sdmia_invited_speakers [2015/11/10 07:27] matthijs |
||
---|---|---|---|
Line 39: | Line 39: | ||
[[http://web.engr.oregonstate.edu/~afern/|Oregon State]] | [[http://web.engr.oregonstate.edu/~afern/|Oregon State]] | ||
- | Title: TBD\\ | + | Title: **Learning to Speedup Planning: Filling the Gap Between Reaction and Thinking**\\ |
Abstract:\\ | Abstract:\\ | ||
- | TBD | + | The product of most learning algorithms for sequential decision |
+ | making is a policy, which supports very fast, or reactive, decision | ||
+ | making. Another way to compute decisions, given a domain model or | ||
+ | simulator, is to use a deliberative planning algorithm, potentially | ||
+ | at a high computational cost. One perspective is that algorithms | ||
+ | for learning reactive policies are attempting to compile away the | ||
+ | deliberative "thinking" process of planners into fast circuits. | ||
+ | Intuition suggests, however, that such compilation will not support | ||
+ | quality decision making in the most difficult domains (e.g., chess, | ||
+ | logistics). In other words, some domains will always require | ||
+ | some amount of deliberative planning. Is there a role for learning | ||
+ | in such cases? | ||
+ | |||
+ | In this talk, I will revisit the old idea of speedup learning for | ||
+ | planning, where the goal of learning is to speed up deliberative | ||
+ | planning in a domain, given experience in that domain. This speedup | ||
+ | learning framework offers a bridge between learning for purely | ||
+ | reactive behavior and pure deliberative planning. I will review | ||
+ | some prior work and speculate about why it produced only limited | ||
+ | successes. I will then review some of our own recent work in the | ||
+ | area of speedup learning for MDP tree search and discuss potential | ||
+ | future directions. | ||
=== Mykel Kochenderfer === | === Mykel Kochenderfer === |