Hierarchical and Structured Learning for Robotics, Reinforcement Learning, Information Theoretic Policy Search
Mail. TU Darmstadt, FB-Informatik, FG-CLAS, Hochschulstr. 10, 64289 Darmstadt
Office. Room E304, Robert-Piloty-Gebaeude S2|02
In November 2016, I left the TU Darmstadt and took up a chair professorship at the University of Lincoln, where I joined the Lincoln Centre of Autonomous Systems. Check my new homepage. Before that, I was an Assistant Professor at the TU Darmstadt from September 2014 to October 2016 and head of the Computational Learning for Autonomous Systems (CLAS) group. Before becoming an Assistant Professor, I joined the IAS group as a Post-Doc in November 2011 and became a Group Leader for Machine Learning for Control in October 2013.
Introducing robots into our everyday life is one of the big visions in robotics. To achieve this goal, we have to enable robots to autonomously learn a rich set of complex behaviors. Current machine learning approaches have already produced encouraging results in this regard. For example, state-of-the-art approaches have been used to learn games like 'ball-in-the-cup', pancake flipping, and throwing darts. However, these tasks are tailored to fit the proposed methods. They are mostly homogeneous, i.e., learning a single type of movement is sufficient to solve the task. Hence, they do not reflect the complexities that are involved in solving real-world tasks. In a real-world environment, an autonomous agent has to acquire a rich set of different behaviors to achieve a variety of goals. The agent has to learn autonomously how to explore its environment and determine which features are important for making a decision. It has to identify relevant behaviors and needs to determine when to learn new behaviors. Furthermore, it needs to learn which goals are relevant and how to re-use behaviors in order to achieve new goals. Current machine learning approaches lack these types of autonomy in the majority of cases. They rely on hand-tuned parameters and hard-coded goals, or are over-engineered to match the specific problem. Moreover, the considered tasks are typically heavily structured by the experimenter. For example, the learning problem is reduced to learning a single movement. Such a reduction avoids many real-world problems, such as the autonomous discovery of relevant skills and autonomous goal discovery. This lack of autonomy is one of the main reasons why current approaches could not be scaled to more complex tasks that better reflect the challenges of real-world environments.
Only by solving these real-world challenges can we introduce robots into our everyday life, such as household robots or robots that care for the elderly.
My goal is to advance the state of the art in terms of autonomy, improve the quality and the generalization of the obtained policies, enhance data efficiency, and increase the flexibility of the policy representation. In this regard, I believe that autonomous discovery of the underlying structure of the task is of crucial importance. Most tasks can be decomposed into elemental behaviors that can be combined sequentially or even simultaneously by a modular control policy. Such modular structures allow an efficient transfer of learned skills to new tasks. Moreover, each elemental behavior typically solves a specific sub-goal. In order to exploit such a modular structure of a task, we require a learning system that is inherently hierarchical and can keep learning on several layers of abstraction. On the lower level, the agent needs to learn how to achieve the sub-goals, while on the upper level, the agent needs to learn to choose the sub-goals accordingly. Such automatic extraction of the modular task structure will offer the agent more flexibility in comparison to engineered approaches that fix the structure of the sub-goals.
My core research topics are:
For all publications please see my Publication Page
Before coming to Darmstadt, I did my Ph.D. at the Graz University of Technology (TUG) under the supervision of Wolfgang Maass. I started my Ph.D. studies in August 2005. During my Ph.D., I was involved in several nationally funded and EU-funded projects, which concentrated on reinforcement learning for robotics, biologically inspired robotics, neural motor control, and probabilistic inference for motor planning.
My thesis, "On Motor Skill Learning and Movement Representations for Robotics", concentrated on value-based algorithms for motor skill learning, learning with different movement representations, and policy search algorithms. I defended my Ph.D. thesis in April 2012. I was born in Graz, Austria. Before doing my Ph.D., I finished my studies in telematics at the TUG in 2005. For my Master's thesis, I developed the Reinforcement Learning Toolbox, a C++ software library for RL algorithms, which was frequently used by other scientists.
StochasticSearch: Contains implementations of episodic REPS, CECER, and MORE. This is a preliminary version that contains only basic documentation.
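For readers unfamiliar with episodic REPS, the following is a minimal sketch of the general idea and not the code or API of the StochasticSearch package. It assumes a Gaussian search distribution, and the function names (`reps_weights`, `reps_update`, `optimize`) and the default KL bound `epsilon=0.5` are hypothetical. To stay dependency-free, the temperature of the dual problem is found by a simple grid search here, whereas a numerical optimizer would typically be used.

```python
import numpy as np


def reps_weights(returns, epsilon=0.5):
    """Sample weights for an (approximate) KL bound epsilon."""
    returns = np.asarray(returns, dtype=float)
    shifted = returns - returns.max()  # shift for numerical stability
    scale = returns.max() - returns.min() + 1e-12
    # Candidate temperatures, relative to the spread of the returns
    # (grid range and resolution are assumptions of this sketch).
    etas = np.logspace(-3, 3, 400) * scale
    # Dual function g(eta) = eta * epsilon + eta * log(mean(exp(R / eta))),
    # up to an additive constant that does not change the minimizer.
    dual = [eta * epsilon + eta * np.log(np.mean(np.exp(shifted / eta)))
            for eta in etas]
    eta = etas[int(np.argmin(dual))]
    weights = np.exp(shifted / eta)
    return weights / weights.sum()


def reps_update(samples, returns, epsilon=0.5):
    """Weighted maximum-likelihood update of the Gaussian search distribution."""
    w = reps_weights(returns, epsilon)
    mean = w @ samples
    diff = samples - mean
    # Small diagonal term keeps the covariance positive definite.
    cov = (w[:, None] * diff).T @ diff + 1e-8 * np.eye(samples.shape[1])
    return mean, cov


def optimize(objective, dim, iterations=30, n_samples=50, seed=0):
    """Maximize `objective` over parameter vectors of length `dim`."""
    rng = np.random.default_rng(seed)
    mean, cov = np.zeros(dim), np.eye(dim)
    for _ in range(iterations):
        samples = rng.multivariate_normal(mean, cov, size=n_samples)
        returns = np.array([objective(s) for s in samples])
        mean, cov = reps_update(samples, returns)
    return mean


if __name__ == "__main__":
    target = np.array([1.0, -2.0])
    print(optimize(lambda theta: -np.sum((theta - target) ** 2), dim=2))
```

Each iteration samples parameter vectors, evaluates them, and re-fits the search distribution to the samples weighted by their exponentiated returns; the KL bound limits how far the new distribution may move from the old one.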