In November 2016, I left the TU Darmstadt. After holding a chair professorship at the University of Lincoln, where I was part of the Lincoln Centre of Autonomous Systems, I became a full professor at the KIT, where I have headed the chair "Autonomous Learning Robots" since January 2020. Check out my new homepage here! You can still reach me via

Gerhard Neumann

Research Interests

Hierarchical and Structured Learning for Robotics, Reinforcement Learning, Information Theoretic Policy Search

More Information

Curriculum Vitae Publications Google Citations DBLP

Contact Information

In November 2016, I left the TU Darmstadt and took up a chair professorship at the University of Lincoln, where I joined the Lincoln Centre of Autonomous Systems. Check my new homepage. Before that, I was an Assistant Professor at the TU Darmstadt from September 2014 to October 2016 and head of the Computational Learning for Autonomous Systems (CLAS) group. Before becoming an assistant professor, I joined the IAS group as a postdoc in November 2011 and became a Group Leader for Machine Learning for Control in October 2013. I was also an intern with Jan Peters at the Max Planck Institute for Biological Cybernetics in 2008, during my Ph.D.

Research interests

Introducing robots into our everyday life is one of the big visions in robotics. To achieve this goal, we have to enable robots to autonomously learn a rich set of complex behaviors. Current machine learning approaches have already produced encouraging results in this regard. For example, state-of-the-art approaches have been used to learn tasks such as 'ball-in-the-cup', pancake flipping, and throwing darts. However, these tasks are tailored to fit the proposed methods. They are mostly homogeneous, i.e., learning a single type of movement is sufficient to solve the task. Hence, they do not reflect the complexities involved in solving real-world tasks. In a real-world environment, an autonomous agent has to acquire a rich set of different behaviors to achieve a variety of goals. The agent has to learn autonomously how to explore its environment and determine which features are important for making a decision. It has to identify relevant behaviors and determine when to learn new ones. Furthermore, it needs to learn which goals are relevant and how to re-use behaviors to achieve new goals. Current machine learning approaches, in the majority of cases, lack these types of autonomy. They rely on hand-tuned parameters or hard-coded goals, or are over-engineered to match the specific problem. Moreover, the considered tasks are typically heavily structured by the experimenter. For example, the learning problem is reduced to learning a single movement. Such a reduction avoids many real-world problems, such as the autonomous discovery of relevant skills and autonomous goal discovery. This lack of autonomy is one of the main reasons why current approaches could not be scaled to more complex tasks that better reflect the challenges of real-world environments.
Only by solving these real-world challenges can we introduce robots into our everyday life, for example as household robots or robots that care for the elderly.

My goal is to advance the state of the art in terms of autonomy, improve the quality and generalization of the obtained policies, enhance data efficiency, and increase the flexibility of the policy representation. In this regard, I believe that an autonomous discovery of the underlying structure of the task is of crucial importance. Most tasks can be decomposed into elemental behaviors that can be combined sequentially or even simultaneously by a modular control policy. Such modular structures allow an efficient transfer of learned skills to new tasks. Moreover, elemental behaviors typically solve a specific sub-goal. In order to exploit such a modular structure of a task, we require a learning system that is inherently hierarchical and can keep learning on several layers of abstraction. On the lower level, the agent needs to learn how to achieve the sub-goals, while on the upper level, it needs to learn to choose the sub-goals accordingly. Such automatic extraction of the modular task structure will offer the agent more flexibility in comparison to engineered approaches that fix the structure of the sub-goals.
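The two-level structure described above can be illustrated with a small sketch. It is only a toy under strong assumptions, not one of the methods from my publications: the state is a single number, each elemental behavior is a fixed proportional controller rather than a learned skill, and the upper level is a hand-coded rule that stands in for a learned gating policy. All names (`Skill`, `upper_level`, `rollout`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Skill:
    """Lower level: an elemental behavior that pursues one sub-goal."""
    name: str
    sub_goal: float
    gain: float = 0.5  # assumed fixed here; a learning system would adapt it

    def act(self, state: float) -> float:
        # Move a fraction of the remaining distance toward the sub-goal.
        return self.gain * (self.sub_goal - state)

    def done(self, state: float, tol: float = 1e-2) -> bool:
        # The sub-goal counts as achieved once the state is within tol of it.
        return abs(self.sub_goal - state) < tol


def upper_level(skills: List[Skill], completed: List[str]) -> Optional[Skill]:
    """Upper level: choose which sub-goal to pursue next.

    Hand-coded stand-in for a learned gating policy: it simply returns
    the first skill whose sub-goal has not been achieved yet.
    """
    for skill in skills:
        if skill.name not in completed:
            return skill
    return None


def rollout(state: float, skills: List[Skill],
            max_steps: int = 200) -> Tuple[float, List[str]]:
    """Alternate between upper-level choices and lower-level actions."""
    completed: List[str] = []
    for _ in range(max_steps):
        skill = upper_level(skills, completed)
        if skill is None:
            break  # every sub-goal has been achieved
        if skill.done(state):
            completed.append(skill.name)
            continue
        state = state + skill.act(state)
    return state, completed
```

For example, `rollout(0.0, [Skill("reach_A", 1.0), Skill("reach_B", 3.0)])` first drives the state to the sub-goal at 1.0 and then to the one at 3.0. Because the skills are separate modules, the same `Skill` objects could be re-ordered or re-used for a different task without changing the lower level.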

My core research topics are:

  • Information-theoretic policy search methods
  • Hierarchical and structured representations for robot learning
  • Movement representations for motor skill learning
  • Hierarchical Bayesian Models
  • Reinforcement Learning for robotics
  • Cooperative autonomous systems and multi-agent systems
  • Robot Applications for Machine Learning
  • Stochastic Optimal Control
  • Biologically Inspired Robotics and Motor Control

Key References

  • Daniel, C.; van Hoof, H.; Peters, J.; Neumann, G. (2016). Probabilistic Inference for Determining Options in Reinforcement Learning, Machine Learning (MLJ), 104, 2-3, pp.337-357.
  • Akrour, R.; Abdolmaleki, A.; Abdulsamad, H.; Neumann, G. (2016). Model-Free Trajectory Optimization for Reinforcement Learning, Proceedings of the International Conference on Machine Learning (ICML).
  • Daniel, C.; Neumann, G.; Kroemer, O.; Peters, J. (2016). Hierarchical Relative Entropy Policy Search, Journal of Machine Learning Research (JMLR), 17, pp.1-50.
  • Abdolmaleki, A.; Lioutikov, R.; Peters, J.; Lau, N.; Reis, L.; Neumann, G. (2015). Model-Based Relative Entropy Stochastic Search, Advances in Neural Information Processing Systems (NIPS / NeurIPS), MIT Press.
  • Wirth, C.; Fürnkranz, J.; Neumann, G. (2015). Model-Free Preference-Based Reinforcement Learning, Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI-15).
  • Kupcsik, A.G.; Deisenroth, M.P.; Peters, J.; Ai Poh, L.; Vadakkepat, V.; Neumann, G. (2017). Model-based Contextual Policy Search for Data-Efficient Generalization of Robot Skills, Artificial Intelligence, 247, pp.415-439.
  • Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G. (2013). Probabilistic Movement Primitives, Advances in Neural Information Processing Systems (NIPS / NeurIPS), MIT Press.
  • Deisenroth, M. P.; Neumann, G.; Peters, J. (2013). A Survey on Policy Search for Robotics, Foundations and Trends in Robotics, 21, pp.388-403.
  • Neumann, G. (2011). Variational Inference for Policy Search in Changing Situations, Proceedings of the International Conference on Machine Learning (ICML 2011).
  • Neumann, G.; Maass, W.; Peters, J. (2009). Learning Complex Motions by Sequencing Simpler Motion Templates, Proceedings of the International Conference on Machine Learning (ICML 2009).
  • Neumann, G.; Peters, J. (2009). Fitted Q-iteration by Advantage Weighted Regression, Advances in Neural Information Processing Systems 22 (NIPS/NeurIPS), Cambridge, MA: MIT Press.

For all publications please see my Publication Page

Some more information about me

Before coming to Darmstadt, I did my Ph.D. at the Graz University of Technology (TUG) under the supervision of Wolfgang Maass. I started my Ph.D. studies in August 2005. During my Ph.D., I was involved in several nationally funded and EU-funded projects, which concentrated on reinforcement learning for robotics, biologically inspired robotics, neural motor control and probabilistic inference for motor planning.

My thesis, "On Motor Skill Learning and Movement Representations for Robotics", concentrated on value-based algorithms for motor skill learning, learning with different movement representations and policy search algorithms. I defended my Ph.D. thesis in April 2012. I was born in Graz, Austria. Before doing my Ph.D., I finished my studies in telematics at the TUG in 2005. As my Master's thesis, I developed the Reinforcement Learning Toolbox, a C++ software library for RL algorithms, which was frequently used by other scientists.



  • I have been appointed Chair of Autonomous Systems and Robotics at the University of Lincoln and will join the Lincoln Centre for Autonomous Systems (L-CAS). I am very excited about this great opportunity, but also a bit sad to leave Darmstadt. It was a great time here. I will start in Lincoln on the 1st of November.
  • I am giving the Probabilistic Graphical Models (Machine Learning 2) lecture in WS 2015. Check out the homepage for more information!
  • I am giving the Intelligent Multi Agent Systems lecture in SS 2015. Check out the homepage for more information!
  • I am giving the Robot Learning Lecture in WS 2015. Check out the homepage for more information!
  • I am co-organizing the HUMANOIDS workshop on Policy Representations for Humanoids with Heni Ben Amor and Neil Dantam.
  • I am organizing the NIPS workshop Autonomously Learning Robots with Marc Toussaint, Joelle Pineau and Peter Auer.
  • I am giving the Robot Learning Lecture in 2014. Check out the homepage for more information!


StochasticSearch: Contains implementations of episodic REPS, CECER, and MORE. This is a preliminary version with only basic documentation.