[[sdmia|SDMIA Main Page]]

=== Craig Boutilier ===
[[http://www.cs.toronto.edu/~cebly/|Google]]

Title: **Large-scale MDPs in Practice: Opportunities and Challenges**\\
Abstract:\\
Markov decision processes (MDPs) have been very well studied in AI over
the past 20 years and offer great promise as a model for sophisticated
decision making. However, the practical application of MDPs and
reinforcement learning (RL)---in particular, AI-based approaches---has
been somewhat limited. Indeed, the use of MDPs and RL in AI applications
pales in comparison to the wide-ranging applications of machine learning
across a variety of industrial sectors.

In this talk, I'll discuss:
  * a sample of areas of direct industrial relevance where MDPs and RL hold great promise;
  * some speculation as to why ML methods in these areas have succeeded while the application of sequential decision-making techniques has faltered;
  * how we can bridge that gap, including techniques for leveraging existing large-scale ML methods for modeling MDPs; the tension between model-based and model-free methods; and, time permitting, some thoughts on solution methods for such models at industrial scale.
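
For readers less familiar with the formalism, the MDP model the abstract refers to can be illustrated with a minimal value-iteration sketch. The two-state toy problem, its numbers, and the function names below are invented for illustration; they are not from the talk.

```python
# Illustrative value iteration on a tiny two-state, two-action MDP.
# All states, transitions, and rewards here are invented toy values.
GAMMA = 0.9  # discount factor

STATES = [0, 1]
ACTIONS = [0, 1]
# P[s][a] = list of (next_state, probability); R[s][a] = immediate reward
P = {0: {0: [(0, 0.7), (1, 0.3)], 1: [(1, 1.0)]},
     1: {0: [(0, 1.0)],           1: [(1, 0.6), (0, 0.4)]}}
R = {0: {0: 0.0, 1: 1.0},
     1: {0: 0.0, 1: 2.0}}

def q_value(V, s, a):
    """One-step lookahead: immediate reward plus discounted expected value."""
    return R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])

def value_iteration(tol=1e-10):
    """Apply the Bellman optimality backup until values stop changing."""
    V = {s: 0.0 for s in STATES}
    while True:
        V_new = {s: max(q_value(V, s, a) for a in ACTIONS) for s in STATES}
        if max(abs(V_new[s] - V[s]) for s in STATES) < tol:
            return V_new
        V = V_new

V = value_iteration()
# Greedy policy with respect to the converged values.
policy = {s: max(ACTIONS, key=lambda a: q_value(V, s, a)) for s in STATES}
```

The bullet above on leveraging large-scale ML for modeling MDPs amounts to learning `P` and `R` from data at scale; the backup itself is unchanged.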
  
=== Emma Brunskill ===
[[http://www.cs.cmu.edu/~ebrun/|CMU]]
  
Title: **Quickly Learning to Make Good Decisions**\\
Abstract:\\
A fundamental goal of artificial intelligence is to create agents that
learn to make good decisions as they interact with a stochastic
environment. Some of the most exciting and valuable potential
applications involve systems that interact directly with humans, such as
intelligent tutoring systems or medical interfaces. In these cases,
sample efficiency is critical, since each decision, good or bad,
affects a real person. I will describe our research on tackling
this challenge, as well as its relevance to improving educational tools.
  
=== Alan Fern ===
[[http://web.engr.oregonstate.edu/~afern/|Oregon State]]
  
Title: **Learning to Speedup Planning: Filling the Gap Between Reaction and Thinking**\\
Abstract:\\
The product of most learning algorithms for sequential decision
making is a policy, which supports very fast, or reactive, decision
making. Another way to compute decisions, given a domain model or
simulator, is to use a deliberative planning algorithm, potentially
at a high computational cost. One perspective is that algorithms
for learning reactive policies attempt to compile away the
deliberative "thinking" process of planners into fast circuits.
Intuition suggests, however, that such compilation will not support
quality decision making in the most difficult domains (e.g., chess,
logistics). In other words, some domains will always require
some amount of deliberative planning. Is there a role for learning
in such cases?

In this talk, I will revisit the old idea of speedup learning for
planning, where the goal of learning is to speed up a deliberative
planner in a domain, given experience in that domain. This speedup
learning framework offers a bridge between learning for purely
reactive behavior and pure deliberative planning. I will review
some prior work and speculate about why it produced only limited
successes. I will then review some of our own recent work in the
area of speedup learning for MDP tree search and discuss potential
future directions.
  
=== Mykel Kochenderfer ===

=== Milind Tambe ===
[[http://teamcore.usc.edu/tambe/|USC]]
  
Joint work with Eric Rice, Amulya Yadav, and Robin Petering.

Title: **PSINET: Assisting HIV Prevention Amongst Homeless Youth using POMDPs**\\
Abstract:\\
Homeless youth are highly vulnerable to Human Immunodeficiency
Virus (HIV) due to their engagement in high-risk behaviors
such as unprotected sex and sex under the influence of
drugs. Many non-profit agencies conduct interventions
to educate and train a select group of homeless
youth about HIV prevention and treatment practices, and
rely on word-of-mouth spread of information through
their social network. Previous work on strategic selection
of intervention participants does not handle uncertainties
in the social network's structure and evolving
network state, potentially causing significant shortcomings
in the spread of information. Thus, we developed
PSINET, a decision support system to aid the agencies
in this task. PSINET includes the following key novelties:
(i) it handles uncertainties in network structure
and evolving network state; (ii) it addresses these uncertainties
by using POMDPs in influence maximization;
and (iii) it provides algorithmic advances that allow high-quality
approximate solutions for such POMDPs. We are about
to conduct a pilot study with homeless youth in Los Angeles
and will present a progress report.

=== Jason Williams ===
[[http://research.microsoft.com/en-us/people/jawillia/|Microsoft Research]]

Title: **Decision-theoretic control in dialog systems: recent progress and opportunities for research**\\
Abstract:\\
Dialog systems interact with a person using natural language to help them
achieve some goal. Dialog systems are now a part of daily life, with
commercial systems including Microsoft Cortana, Apple Siri, Amazon Echo,
Google Now, Facebook M, in-car systems, and many others. Because dialog
is a sequential process, and because computers' ability to understand
human language is error-prone, dialog has long been an important
application for sequential decision making under uncertainty. In this
talk, I will first present the dialog system problem through the lens of
decision making under uncertainty. I'll then survey recent work that has
tailored methods for state tracking and action selection from the
general machine learning literature to the dialog problem. Finally, I'll
discuss open problems and current opportunities for research.
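
As a toy illustration of the state tracking mentioned above, a dialog system can maintain a belief (a probability distribution) over the user's goal and update it with Bayes' rule after each noisy speech-recognition result. The goals, observation model, and numbers below are hypothetical, not from the talk.

```python
# Toy Bayesian dialog-state tracking: keep a belief over the user's goal
# and update it from noisy speech-recognition observations.
# Goals, observation model, and probabilities are hypothetical.
GOALS = ["book_flight", "book_hotel"]
# P(observed phrase | true goal): the recognizer is error-prone.
OBS_MODEL = {
    "flight": {"book_flight": 0.8, "book_hotel": 0.2},
    "hotel":  {"book_flight": 0.2, "book_hotel": 0.8},
}

def update_belief(belief, observed_phrase):
    """One Bayes update: new_belief(g) is proportional to P(obs | g) * belief(g)."""
    unnorm = {g: OBS_MODEL[observed_phrase][g] * belief[g] for g in GOALS}
    z = sum(unnorm.values())
    return {g: v / z for g, v in unnorm.items()}

belief = {g: 1.0 / len(GOALS) for g in GOALS}   # uniform prior
for phrase in ["flight", "flight", "hotel"]:    # noisy recognition results
    belief = update_belief(belief, phrase)
```

Action selection then operates on this belief rather than on a single, possibly wrong, recognition hypothesis.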
  
=== Shlomo Zilberstein ===
[[http://rbr.cs.umass.edu/shlomo/|UMass Amherst]]
  
Title: **Do We Expect Too Much from DEC-POMDP Algorithms?**\\
Abstract:\\
Sequential decision models such as DEC-POMDPs are powerful and elegant
approaches for planning in situations that involve multiple cooperating
decision makers. They are powerful in the sense that we can, in
principle, capture a rich class of problems. They are elegant in the
sense that they include the minimal set of ingredients needed to analyze
these problems and facilitate rigorous mathematical examination of their
fundamental properties. An optimal solution of a DEC-POMDP explicitly
answers the question of what an agent should do to maximize value.
Implicitly, an optimal solution answers many other questions, including
the appropriate assignment of meaning to internal memory states,
appropriate adoption of goals and subgoals, appropriate assignment of
roles to agents, and appropriate assignment of meaning to messages that
agents exchange. In fact, an optimal policy optimizes all these choices
implicitly. In this talk, I argue that this is just too much to expect
from a computational point of view. There is much to be gained by
decomposing the planning problem so that some of these questions are
answered first and a simplified planning problem is then solved. I
discuss a few examples of such decompositions and examine their
contribution to the scalability of planning algorithms.
  