Quick Facts

Organizers: Jan Peters, Drew Bagnell, Stefan Schaal
Conference: NIPS 2006
Date: December 8, 2006
Room: Hilton: Sutcliffe B
Location: Westin Resort and Spa and Hilton Whistler Resort, Whistler, B.C., Canada
Website: http://www.jan-peters.net/Research/NIPS2006

Abstract

During the last decade, many areas of statistical machine learning have reached a high level of maturity, with novel, efficient, and theoretically well-founded algorithms that increasingly remove the need for the heuristics and manual parameter tuning that dominated the early days of neural networks. Reinforcement learning (RL) has also made major progress in theory and algorithms, but it still lags behind the success stories of classification, supervised, and unsupervised learning. Beyond the long-standing question of how to scale RL to larger, real-world problems, even simple scenarios require a significant amount of manual tuning and human insight to achieve good performance, as exemplified by issues such as eligibility trace factors, learning rates, and the choice of function approximators and their basis functions for policy and/or value functions. Much of the progress in other statistical learning disciplines stems from connections to well-established fundamental learning approaches, such as maximum likelihood with EM, Bayesian statistics, linear regression, linear and quadratic programming, graph theory, and function space analysis.

The main goal of this workshop is therefore to discuss how other statistical learning techniques can be used to develop new RL approaches with properties such as higher numerical robustness, fewer open parameters, probabilistic and Bayesian interpretations, better scalability, and the inclusion of prior knowledge.

Format

Our goal is to bring together researchers who are working on reinforcement learning techniques that bring other statistical learning methods to bear on RL. The workshop will consist of short presentations, posters, and panel discussions. Topics to be addressed include, but are not limited to:

  • Which methods from supervised and unsupervised learning are the most promising for developing new RL approaches?
  • How can modern probabilistic and Bayesian methods benefit reinforcement learning?
  • Which approaches can help reduce the number of open parameters in reinforcement learning?
  • Can the reinforcement learning problem be reduced to classification or regression?
  • Can reinforcement learning be seen as one large filtering or prediction problem in which predicting good actions is the main objective?
  • Are there useful alternative formulations of the RL problem, e.g., as a dynamic Bayesian network or via multiplicative rewards?
  • Can reinforcement learning be accelerated by incorporating biases, expert data from demonstrations, prior knowledge about reward functions, etc.?

Poster Size Recommendation

We recommend a poster size of 5' x 4'.

Program (tentative)

Morning session: 7:30am–10:30am
7:30am – Welcome, Jan Peters, Drew Bagnell, Stefan Schaal
 
A. Foundations for a new Reinforcement Learning
7:33am – Game theoretic learning and planning algorithms, Geoff Gordon [pdf]
7:50am – Reductive Reinforcement Learning, John Langford [link]
8:07am – The Importance of Measure in Reinforcement Learning, Sham Kakade [pdf]
8:24am – Sample Complexity Results for Reinforcement Learning in Large State Spaces, Csaba Szepesvari [pdf]
8:41am – Policies Based on Trajectory Libraries, Martin Stolle [pdf1] [pdf2]
 
8:58am – Coffee Break
 
B. Bayesian Approaches to Reinforcement Learning
9:03am – Towards Bayesian Reinforcement Learning, Pascal Poupart [pdf]
9:20am – Bayesian Policy Gradient Algorithms, Mohammad Ghavamzadeh [pdf]
9:37am – Bayesian RL for Partially Observable Domains, Joelle Pineau [pdf1] [pdf2]
9:54am – Bayesian Reinforcement Learning with Gaussian Processes, Yaakov Engel [pdf]
 
10:11am – Poster Spotlight Presentations: Part I
 
SKIING BREAK
 
Afternoon session: 3:30pm–6:30pm
C. Imitation and Apprenticeships in Reinforcement Learning
3:30pm – From Imitation Learning to Reinforcement Learning, Nathan Ratliff [pdf]
3:47pm – Graphical Models for Imitation: A New Approach to Speeding up RL, Deepak Verma
4:04pm – Apprenticeship learning and robotic control, Andrew Ng
 
4:21pm – Coffee Break
 
D. Using Inference for Reinforcement Learning
4:26pm – Variational Methods for Stochastic Optimization: A Unification of Population-Based Methods, Mark Andrews [pdf]
4:43pm – Reinforcement Learning by Reward-Weighted Regression, Jan Peters [pdf]
5:00pm – Probabilistic inference for solving structured MDPs and POMDPs, Marc Toussaint [pdf]
5:17pm – Particle methods for POMDPs, Nando de Freitas [pdf]
 
5:27pm – WHAT'S WRONG WITH REINFORCEMENT LEARNING!, Rich Sutton
 
5:37pm – Poster Spotlight Presentations: Part II
5:45pm – Poster Session (Open End)

Accepted Posters

Poster Spotlights: Session 1

  1. Reinforcement Learning with Dual Representations, Tao Wang, Michael Bowling, Dale Schuurmans. [pdf]
  2. Global Reinforcement Learning, Milen Pavlov, Pascal Poupart [pdf]
  3. Efficient exploration and learning of structure in factored-state MDPs, Carlos Diuk, Michael L. Littman, Alexander L. Strehl [pdf]
  4. Leveraging Anytime Classifiers to solve POMDPs, Erick Chastain, Rajesh Rao [pdf]
  5. Particle methods for POMDPs, Nando de Freitas, Arnaud Doucet, Matt Hoffman, Ruben Martinez-Cantin, Julia Vogel [pdf]
  6. Convexity, linearity and compositionality in stochastic optimal control, Emanuel Todorov [pdf1] [pdf2]
  7. Imitation Learning with a Value-Based Prior, Umar Syed, Robert E. Schapire [pdf]
  8. A path integral approach to agent planning, Hilbert J. Kappen, Wim Wiegerinck, B. van den Broek [pdf]

Poster Spotlights: Session 2

  1. Adaptive Tile-Coding for Reinforcement Learning, Shimon Whiteson, Matthew E. Taylor, Peter Stone [pdf]
  2. Model Learning for Dialog Management, Finale Doshi [pdf]
  3. Learning to acquire whole-body humanoid CoM movements to achieve dynamic tasks with a policy gradient method, Takamitsu Matsubara [pdf]
  4. Learning Omnidirectional Locomotion Using Dimensionality Reduction, J. Zico Kolter, Andrew Y. Ng [pdf]

Participants

This workshop will bring together researchers from different areas of machine learning in order to explore new approaches to reinforcement learning. Attendees are encouraged to participate actively with questions and comments about the talks.

Organizers

The workshop is organized by Jan Peters and Stefan Schaal from the Departments of Computer Science and Neuroscience, University of Southern California, Los Angeles, CA, USA, and by Drew Bagnell from the Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA.

Location and More Information

The most up-to-date information about NIPS 2006 can be found on the NIPS 2006 website.