Differences

This shows you the differences between two versions of the page.

sdmia_invited_speakers [2015/10/18 09:32]
matthijs
sdmia_invited_speakers [2015/11/10 07:27]
matthijs
Line 39: Line 39:
 [[http://web.engr.oregonstate.edu/~afern/|Oregon State]] [[http://web.engr.oregonstate.edu/~afern/|Oregon State]]
  
-Title: TBD\\+Title: **Learning to Speedup Planning: Filling the Gap Between Reaction and Thinking**\\
 Abstract:\\ Abstract:\\
-TBD+The product of most learning algorithms for sequential decision 
 +making is a policy, which supports very fast, or reactive, decision 
 +making. Another way to compute decisions, given a domain model or 
 +simulator, is to use a deliberative planning algorithm, potentially 
 +at a high computational cost. One perspective is that algorithms 
 +for learning reactive policies are attempting to compile away the 
 +deliberative "thinking" process of planners into fast circuits. 
 +Intuition suggests, however, that such compilation will not support 
 +quality decision making in the most difficult domains (e.g. chess, 
 +logistics, etc.). In other words, some domains will always require 
 +some amount of deliberative planning. Is there a role for learning 
 +in such cases? 
 + 
 +In this talk, I will revisit the old idea of speedup learning for 
 +planning, where the goal of learning is to speed up a deliberative 
 +planner in a domain, given experience in that domain. This speedup 
 +learning framework offers a bridge between learning for purely 
 +reactive behavior and pure deliberative planning. I will review 
 +some prior work and speculate about why it produced only limited 
 +successes. I will then review some of our own recent work in the 
 +area of speedup learning for MDP tree search and discuss potential 
 +future directions. 
  
 === Mykel Kochenderfer === === Mykel Kochenderfer ===