Quick Facts
Organizers: Jan Peters, Drew Bagnell, Stefan Schaal
Conference: NIPS 2006
Date: December 8, 2006
Room: Sutcliffe B (Hilton)
Location: Westin Resort and Spa and Hilton Whistler Resort, Whistler, B.C., Canada
Website: http://www.jan-peters.net/Research/NIPS2006
Abstract
During the last decade, many areas of statistical machine learning have reached a high level of maturity, with novel, efficient, and theoretically well-founded algorithms that have increasingly removed the need for the heuristics and manual parameter tuning that dominated the early days of neural networks. Reinforcement learning (RL) has also made major progress in theory and algorithms, but it still lags behind the success stories of classification, supervised, and unsupervised learning. Besides the long-standing question of scaling RL to larger, real-world problems, even simple scenarios require a significant amount of manual tuning and human insight to achieve good performance, as exemplified by issues such as eligibility traces, learning rates, and the choice of function approximators and their basis functions for policy and/or value functions. Part of the progress of other statistical learning disciplines comes from their connections to well-established fundamental learning approaches such as maximum likelihood with EM, Bayesian statistics, linear regression, linear and quadratic programming, graph theory, and function space analysis.
The main goal of this workshop is therefore to discuss how other statistical learning techniques can be used to develop new RL approaches with properties such as higher numerical robustness, fewer open parameters, probabilistic and Bayesian interpretations, better scalability, and the inclusion of prior knowledge.
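To make this kind of connection concrete, the following minimal sketch (our own illustration, not part of the workshop material) casts policy improvement as reward-weighted maximum likelihood: actions are sampled from the current stochastic policy, weighted by the reward they earn, and the policy parameters are refit with an ordinary weighted regression step, in the spirit of EM-style policy search (cf. the reward-weighted regression talk in the afternoon program). The toy reward function and all variable names are assumptions made for this example.

```python
# Sketch only: EM-style policy update via reward-weighted regression on a toy task.
import numpy as np

rng = np.random.default_rng(0)
theta, sigma = 0.0, 1.0                      # mean and std of a Gaussian policy over a scalar action

def reward(a):
    # hypothetical reward landscape: the best action is a = 2
    return np.exp(-0.5 * (a - 2.0) ** 2)

for _ in range(20):
    actions = rng.normal(theta, sigma, size=200)   # sample actions from the current policy
    w = reward(actions)                            # weight each sample by the reward it earned
    theta = np.sum(w * actions) / np.sum(w)        # weighted-MLE (regression) update of the policy mean

print(f"learned policy mean: {theta:.3f}")         # drifts towards the high-reward region near 2
```

Here the "RL" step is nothing but a weighted least-squares fit, which is exactly the kind of reduction to well-understood statistical machinery the abstract asks about.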
Format
Our goal is to bring together researchers who are working towards new reinforcement learning approaches by bringing other statistical learning techniques to bear on RL. The workshop will consist of short presentations, posters, and panel discussions. Topics to be addressed include, but are not limited to:
- Which methods from supervised and unsupervised learning are the most promising for developing new RL approaches?
- How can modern probabilistic and Bayesian methods benefit reinforcement learning?
- Which approaches can help reduce the number of open parameters in reinforcement learning?
- Can the reinforcement learning problem be reduced to classification or regression? (See the sketch after this list.)
- Can reinforcement learning be seen as a large filtering or prediction problem in which the prediction of good actions is the main objective?
- Are there useful alternative ways to formulate the RL problem, e.g., as a dynamic Bayesian network or with multiplicative rewards?
- Can reinforcement learning be accelerated by incorporating biases, expert data from demonstration, prior knowledge on reward functions, etc.?
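On the reduction question above, a minimal sketch of fitted Q-iteration is shown below; it turns value-function learning into a sequence of ordinary supervised regression fits. This is our own illustration rather than workshop material, and the toy chain environment, the variable names, and the use of scikit-learn's ExtraTreesRegressor are assumptions.

```python
# Sketch only: fitted Q-iteration reduces RL to repeated supervised regression.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)
gamma, n_states, actions = 0.9, 10, (-1, +1)      # discount, chain length, move left/right

def step(s, a):
    s2 = int(np.clip(s + a, 0, n_states - 1))
    return s2, float(s2 == n_states - 1)           # reward 1 only at the rightmost state

# Collect a batch of random transitions (s, a, r, s')
batch = []
for _ in range(2000):
    s = int(rng.integers(n_states))
    a = int(rng.choice(actions))
    s2, r = step(s, a)
    batch.append((s, a, r, s2))
S, A, R, S2 = (np.array(col, dtype=float) for col in zip(*batch))

q = None
for _ in range(50):                                # each iteration is one regression problem
    if q is None:
        targets = R                                # first pass: Q is just the immediate reward
    else:
        q_next = np.max([q.predict(np.column_stack([S2, np.full_like(S2, a)]))
                         for a in actions], axis=0)
        targets = R + gamma * q_next               # bootstrapped regression targets
    q = ExtraTreesRegressor(n_estimators=25, random_state=0)
    q.fit(np.column_stack([S, A]), targets)        # the "RL" step is plain supervised regression
```

The only RL-specific ingredient is the construction of the bootstrapped targets; everything else is an off-the-shelf regressor, which illustrates one possible answer to the reduction question.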
Poster Size Recommendation
We recommend a poster size of 5' x 4'.
Program (tentative)
Morning session: 7:30am–10:30am

7:30am | Welcome, Jan Peters, Drew Bagnell, Stefan Schaal

A. Foundations for a New Reinforcement Learning
7:33am | Game theoretic learning and planning algorithms, Geoff Gordon [pdf]
7:50am | Reductive Reinforcement Learning, John Langford [link]
8:07am | The Importance of Measure in Reinforcement Learning, Sham Kakade [pdf]
8:24am | Sample Complexity Results for Reinforcement Learning in Large State Spaces, Csaba Szepesvari [pdf]
8:41am | Policies Based on Trajectory Libraries, Martin Stolle [pdf1] [pdf2]
8:58am | Coffee Break

B. Bayesian Approaches to Reinforcement Learning
9:03am | Towards Bayesian Reinforcement Learning, Pascal Poupart [pdf]
9:20am | Bayesian Policy Gradient Algorithms, Mohammad Ghavamzadeh [pdf]
9:37am | Bayesian RL for Partially Observable Domains, Joelle Pineau [pdf1] [pdf2]
9:54am | Bayesian Reinforcement Learning with Gaussian Processes, Yaakov Engel [pdf]
10:11am | Poster Spotlight Presentations: Part I

SKIING BREAK

Afternoon session: 3:30pm–6:30pm

C. Imitation and Apprenticeships in Reinforcement Learning
3:30pm | From Imitation Learning to Reinforcement Learning, Nathan Ratliff [pdf]
3:47pm | Graphical Models for Imitation: A New Approach to Speeding up RL, Deepak Verma
4:04pm | Apprenticeship learning and robotic control, Andrew Ng
4:21pm | Coffee Break

D. Using Inference for Reinforcement Learning
4:26pm | Variational Methods for Stochastic Optimization: A Unification of Population-Based Methods, Mark Andrews [pdf]
4:43pm | Reinforcement Learning by Reward-Weighted Regression, Jan Peters [pdf]
5:00pm | Probabilistic inference for solving structured MDPs and POMDPs, Marc Toussaint [pdf]
5:17pm | Particle methods for POMDPs, Nando de Freitas [pdf]
5:27pm | WHAT'S WRONG WITH REINFORCEMENT LEARNING!, Rich Sutton
5:37pm | Poster Spotlight Presentations: Part II
5:45pm | Poster Session (Open End)
Accepted Posters
Poster Spotlights: Session 1
- Reinforcement Learning with Dual Representations, Tao Wang, Michael Bowling, Dale Schuurmans. [pdf]
- Global Reinforcement Learning, Milen Pavlov, Pascal Poupart [pdf]
- Efficient exploration and learning of structure in factored-state MDPs, Carlos Diuk, Michael L. Littman, Alexander L. Strehl [pdf]
- Leveraging Anytime Classifiers to solve POMDPs, Erick Chastain, Rajesh Rao [pdf]
- Particle methods for POMDPs, Nando de Freitas, Arnaud Doucet, Matt Hoffman, Ruben Martinez-Cantin, Julia Vogel [pdf]
- Convexity, linearity and compositionality in stochastic optimal control, Emanuel Todorov [pdf1] [pdf2]
- Imitation Learning with a Value-Based Prior, Umar Syed, Robert E. Schapire [pdf]
- A path integral approach to agent planning, Hilbert J. Kappen, Wim Wiegerinck, B. van den Broek [pdf]
Poster Spotlights: Session 2
- Adaptive Tile-Coding for Reinforcement Learning, Shimon Whiteson, Matthew E. Taylor, Peter Stone [pdf]
- Model Learning for Dialog Management, Finale Doshi [pdf]
- Learning to acquire whole-body humanoid CoM movements to achieve dynamic tasks with a policy gradient method, Takamitsu Matsubara [pdf]
- Learning Omnidirectional Locomotion Using Dimensionality Reduction, J. Zico Kolter, Andrew Y. Ng [pdf]
Participants
This workshop will bring together researchers from different areas of machine learning to explore how to approach new topics in reinforcement learning. Attendees are encouraged to participate actively with questions and comments on the talks.
Organizers
The workshop is organized by Jan Peters and Stefan Schaal from the Departments of Computer Science and Neuroscience, University of Southern California, Los Angeles, CA, USA, and by Drew Bagnell from the Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA.
Location and More Information
The most up-to-date information about NIPS 2006 can be found on the NIPS 2006 website.