Tutorial: Reinforcement Learning for Robotics

Quick Facts

Organizers: Pieter Abbeel (University of California, Berkeley), Jan Peters (Technische Universitaet Darmstadt)
Conference: ICRA 2012
Date and Time: Monday, 14 May 2012, 09:00-15:30
Location: Room 11, Saint Paul RiverCentre
Website: http://www.robot-learning.de/Research/Tutorial

Abstract

This full-day tutorial introduces the audience to reinforcement learning; prior experience in the area is not assumed. In the first half of the tutorial we will cover the foundations of reinforcement learning: Markov decision processes, value iteration, policy iteration, linear programming for solving an MDP, function approximation, model-free versus model-based learning, Q-learning, TD-learning, policy search, the likelihood-ratio policy gradient, the policy gradient theorem, actor-critic methods, the natural gradient, and importance sampling. In the second half we will discuss example success stories and open problems.
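To give a flavor of the foundations covered in the first half, the following is a minimal sketch of value iteration on a small tabular MDP, written in Python. The toy transition model, rewards, function name, and convergence tolerance below are invented purely for illustration; they are not taken from the tutorial materials.

    import numpy as np

    def value_iteration(P, R, gamma=0.95, tol=1e-8):
        # Repeatedly apply the Bellman optimality backup until the value
        # function stops changing (up to the tolerance 'tol').
        # P[s, a, s'] is the transition probability, R[s, a] the reward.
        n_states, n_actions, _ = P.shape
        V = np.zeros(n_states)
        while True:
            # Q[s, a] = R[s, a] + gamma * sum_{s'} P[s, a, s'] * V[s']
            Q = R + gamma * (P @ V)
            V_new = Q.max(axis=1)
            if np.max(np.abs(V_new - V)) < tol:
                return V_new, Q.argmax(axis=1)  # optimal values, greedy policy
            V = V_new

    # Toy two-state, two-action MDP, purely for demonstration.
    P = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.0, 1.0], [0.5, 0.5]]])
    R = np.array([[1.0, 0.0],
                  [0.0, 2.0]])
    V, pi = value_iteration(P, R)
    print("V* =", V, "greedy policy =", pi)

The same backup applied to sampled transitions, rather than to a known model, is the starting point for the model-free methods (Q-learning, TD-learning) listed above.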

Schedule

Part | Time  | Speaker       | Topic                                        | Slides
1    | 08:45 | Pieter Abbeel | Introduction                                 | Attach:Part1_Introduction.pdf
2    | 09:10 | Jan Peters    | Background: Supervised Learning              | Attach:Part2_Background_Supervised_Learning.pdf
3a   | 09:35 | Pieter Abbeel | Optimal Control: Foundations                 | Attach:Part_3a_Optimal_Control.pdf
     | 10:00 |               | COFFEE BREAK                                 |
3b   | 10:30 | Jan Peters    | Optimal Control with Learned Forward Models | Attach:Part3b_Optimal_control_with_Learned_Models.pdf
4    | 10:55 | Pieter Abbeel | Value Function Methods                       | Attach:Part4_Value_Function_Methods.pdf
     |       |               | LUNCH BREAK                                  |
5    | 14:00 | Jan Peters    | Policy Search Methods                        | Attach:Part5_Policy_Search.pdf
6    | 14:55 | Pieter Abbeel | Exploration in Reinforcement Learning        | Attach:Part6_Exploration.pdf
7    | 15:10 | Both          | Wrap-Up and Conclusion                       |

Bios

Pieter Abbeel received a BS/MS in Electrical Engineering from KU Leuven (Belgium) and his Ph.D. in Computer Science from Stanford University in 2008. He joined the faculty at UC Berkeley in Fall 2008, with an appointment in the Department of Electrical Engineering and Computer Sciences.

He has won various awards, including best paper awards at ICML and ICRA, the Sloan Fellowship, the Air Force Office of Scientific Research Young Investigator Program (AFOSR-YIP) award, the Okawa Foundation award, and the 2011 TR35. He has developed apprenticeship learning algorithms that have enabled advanced helicopter aerobatics, including maneuvers such as tic-tocs, chaos, and auto-rotation, which only exceptional human pilots can perform. His group has also achieved the first end-to-end demonstration of a robot reliably picking up a crumpled laundry article and folding it. His work has been featured in many popular press outlets, including the BBC, the New York Times, MIT Technology Review, the Discovery Channel, SmartPlanet, and Wired. His current research focuses on robotics and machine learning, with a particular emphasis on challenges in personal robotics, surgical robotics, and connectomics.

Jan Peters is a computer scientist (Dipl.-Inform., FernUniversität Hagen; M.Sc. and Ph.D., University of Southern California), an electrical engineer (Dipl.-Ing., TU Muenchen), and a mechanical engineer (M.Sc., University of Southern California). He has held visiting research positions at ATR in Japan and at the National University of Singapore. He is currently a full professor (W3) at Technische Universitaet Darmstadt, heading the FG Intelligente Autonome Systeme (Intelligent Autonomous Systems group), while also leading the Robot Learning Lab at the Max Planck Institute for Intelligent Systems. His research interests span a wide variety of topics in robotics, machine learning, and biomimetic systems, with a strong focus on the learning of motor skills. He has (co-)authored 165 papers and received the 2007 Dick Volz Best US Robotics Ph.D. Runner-Up Award.