Simone Parisi

I have graduated and moved to the Meta AI team in Pittsburgh.

Research Interests

Reinforcement learning, policy search, multi-objective optimization, state representation, feature selection, robotics, sparse rewards, intrinsic motivation, exploration, hierarchical learning, deep learning

More Information

Curriculum Vitae · Publications · Google Scholar Citations · DBLP

Contact Information

Mail: simone(at)

Simone Parisi joined the Intelligent Autonomous Systems lab on October 1st, 2014 as a PhD student. His research interests include, amongst others, reinforcement learning, robotics, multi-objective optimization, and intrinsic motivation. During his PhD, Simone worked on Scalable Autonomous Reinforcement Learning (ScARL), developing and evaluating new methods in robotics that guarantee both a high degree of autonomy and the ability to solve complex tasks.

Before his PhD, Simone completed his MSc in Computer Science Engineering at the Politecnico di Milano, Italy, and at the University of Queensland, Australia. His thesis, entitled “Study and analysis of policy gradient approaches for multi-objective decision problems”, was written under the supervision of Marcello Restelli and Matteo Pirotta.

Research Interests

Over the last decade, reinforcement learning has established itself as a framework for solving a large variety of tasks in robotics. A lot of effort has been directed towards scaling reinforcement learning to control high-dimensional systems and tasks (such as skills with many degrees of freedom). These advances, however, generally depend on hand-crafted state descriptions as well as pre-structured parametrized policies. Furthermore, reward shaping using expert knowledge is frequently needed to scale reinforcement learning to high-dimensional tasks. This large amount of required pre-structuring is in stark contrast to the goal of autonomous learning. It is therefore necessary to develop systematic methods that increase the autonomy of the learning system while preserving its scalability, by going beyond traditional approaches.


Software

  • MiPS: A minimal Matlab toolbox with some of the most popular policy search algorithms, as well as recent multi-objective methods and benchmark problems in reinforcement learning. Developed with the support of Matteo Pirotta.
  • tensorl: A minimal TensorFlow toolbox with some of the most popular RL algorithms.
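To illustrate the kind of policy search algorithm these toolboxes cover, here is a minimal, self-contained sketch of the REINFORCE likelihood-ratio policy gradient on a hypothetical two-armed bandit. This example is not taken from MiPS or tensorl; the problem setup (arm means, learning rate) is invented for illustration.

```python
import numpy as np

# Hypothetical example: REINFORCE with a softmax policy over two arms.
rng = np.random.default_rng(0)
theta = np.zeros(2)                 # one logit per arm
true_means = np.array([0.2, 0.8])   # arm 1 yields higher reward on average
alpha = 0.1                         # learning rate

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for _ in range(2000):
    p = softmax(theta)
    a = rng.choice(2, p=p)                   # sample an action from the policy
    r = rng.normal(true_means[a], 0.1)       # stochastic reward
    grad_log = -p.copy()
    grad_log[a] += 1.0                       # grad of log pi(a | theta) for softmax
    theta += alpha * r * grad_log            # likelihood-ratio policy gradient step

print(np.argmax(softmax(theta)))             # the learned policy should prefer arm 1
```

In practice, policy search methods add a baseline or critic to reduce the variance of this gradient estimate, which is one of the themes of the references below.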

Key References

    • Parisi, S.; Tangkaratt, V.; Peters, J.; Khan, M. E. (2019). TD-Regularized Actor-Critic Methods. Machine Learning (MLJ), 108(8), pp. 1467-1501.
    • Tangkaratt, V.; van Hoof, H.; Parisi, S.; Neumann, G.; Peters, J.; Sugiyama, M. (2017). Policy Search with High-Dimensional Context Variables. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI).
    • Parisi, S.; Abdulsamad, H.; Paraschos, A.; Daniel, C.; Peters, J. (2015). Reinforcement Learning vs Human Programming in Tetherball Robot Games. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
    • Parisi, S.; Pirotta, M.; Peters, J. (2017). Manifold-based Multi-objective Policy Search with Sample Reuse. Neurocomputing, 263, pp. 3-14.
    • Parisi, S.; Pirotta, M.; Restelli, M. (2016). Multi-objective Reinforcement Learning through Continuous Pareto Manifold Approximation. Journal of Artificial Intelligence Research (JAIR), 57, pp. 187-227.