Currently Offered Thesis Topics

We offer these topics directly to Bachelor and Master students at TU Darmstadt; feel free to DIRECTLY contact the thesis advisor if you are interested in one of them. Excellent external students from other universities may be accepted but are required to first email Jan Peters before contacting any other lab member about a thesis topic. Note that we cannot provide funding for any of these thesis projects.

We highly recommend that you take either our robotics and machine learning lectures (Robot Learning, Statistical Machine Learning) or those of our colleagues (Grundlagen der Robotik, Probabilistic Graphical Models and/or Deep Learning). Even more important to us is that you take both Robot Learning: Integrated Project, Part 1 (Literature Review and Simulation Studies) and Part 2 (Evaluation and Submission to a Conference) before doing a thesis with us.

In addition, we are usually happy to devise new topics on request to suit the abilities of excellent students. Please DIRECTLY contact the thesis advisor if you are interested in one of these topics. When you contact the advisor, it would be nice if you could mention (1) WHY you are interested in the topic (dreams, parts of the problem, etc), and (2) WHAT makes you special for the projects (e.g., class work, project experience, special programming or math skills, prior work, etc.). Supplementary materials (CV, grades, etc) are highly appreciated. Of course, such materials are not mandatory but they help the advisor to see whether the topic is too easy, just about right or too hard for you.

Only contact *ONE* potential advisor at a time! If you contact a second one without first concluding discussions with the first advisor (i.e., deciding for or against the thesis with her or him), we may not consider you at all. Only if you are very excited about two topics (and no more than two) should you email both supervisors, so that each is aware of your additional interest.

FOR FB16+FB18 STUDENTS: If you are a student from another department at TU Darmstadt (e.g., ME, EE, IST), you need an additional formal supervisor who officially issues the topic. Please do not try to arrange your home department advisor yourself; instead, let the supervising IAS member get in touch with that person. Multiple professors from other departments have complained that they were asked to co-supervise before being contacted by our advising lab member.


Combining Deep Reinforcement Learning and 3D Vision for Dual-arm Robotic Tasks

Scope: Master thesis
Advisor: Snehal Jauhri
Added: November 10, 2022
Start: ASAP
Topic (in detail): Attach:Theses/OpenTopics/irosa_master_thesis_doc.pdf
Topic (in brief): Recent breakthroughs in Deep Reinforcement Learning (RL) have led to an increased deployment of learning-based methods in robotics. Nevertheless, RL for robotics has been limited to simple setups that assume perfect knowledge about the robot’s environment.
Recent work at the iRosa lab [1] has successfully utilized Deep RL for performing mobile manipulation tasks (i.e. picking and placing objects using the robot arm while moving using the wheeled base of the robot). However, even in these experiments, the robot just used one out of its two arms, and the method assumed perfect perception of the environment.
In this thesis, we aim to build on advances in 3D Vision [2] and combine them with Deep RL to learn using real-world, imperfect 3D information such as point-clouds or occupancy grids. We also aim to solve the more challenging problem of dual-arm mobile manipulation instead of just using a single arm [3].
Note that this thesis builds upon a successful research paper. Therefore, a good starting point in terms of code and theory exists.

Highly motivated students can apply by sending an e-mail expressing their interest to Snehal Jauhri, attaching a letter of motivation and possibly a CV.

Requirements: Enthusiasm, ambition, and a curious mind go a long way. There will be ample supervision provided to help the student understand basic as well as advanced concepts. However, prior knowledge about reinforcement learning, robotics and Python programming would be a plus.

[1] S. Jauhri, J. Peters, and G. Chalvatzaki, “Robot learning of mobile manipulation with reachability behavior priors,” IEEE Robotics and Automation Letters, vol. 7, no. 3, pp. 8399–8406, 2022.
[2] C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “Pointnet: Deep learning on point sets for 3d classification and segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 652–660.
[3] K. S. Luck and H. B. Amor, “Extracting bimanual synergies with reinforcement learning,” in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2017, pp. 4805–4812.

Learning Human-Robot Interaction with Gaussian Processes HSMMs

Scope: Bachelor/Master thesis
Advisor: Vignesh Prasad
Added: 2022-10-26
Start: ASAP
Topic: An important aspect of Human-Robot Interaction is enabling a robot to react quickly and efficiently to the motions of a human. Learning from Demonstration (LfD) techniques are useful for learning such interactions, where the robot's trajectory can be adapted to the human's trajectory at test time [1,2]. Such explicit conditioning approaches are common when using Gaussian Processes (GPs). Moreover, physical interactions can be naturally broken down into segments that follow a sequential progression; for example, a handshake has reaching, shaking, and receding phases. Hidden Markov Models (HMMs) and Hidden Semi-Markov Models (HSMMs) naturally lend themselves to such sequential segmentation tasks [3]. The main goal of this thesis is to explore how GPs can be combined with HMMs/HSMMs as in [3], and to extend them for learning physically interactive behaviours for humanoid robots.

Note that this thesis builds upon promising work so a good starting point in terms of code and theory exists.

Students can apply by sending an e-mail expressing their interest to the advisor, attaching a CV and grade sheet.


Requirements:

  • Good knowledge of Python
  • Prior experience with robotics, machine learning, learning from demonstration or imitation learning is a plus

[1] Bayesian interaction primitives: A slam approach to human-robot interaction; Campbell, J. and Ben Amor, H. (2017).
[2] Learning interaction for collaborative tasks with probabilistic movement primitives; Maeda et al. (2014).
[3] Sequence pattern extraction by segmenting time series data using GP-HSMM with hierarchical dirichlet process; Nagano et al. (2018).
[4] MILD: Multimodal Interactive Latent Dynamics for Learning Human-Robot Interaction; Prasad et al. (2022).

Energy-based Models for 6D Pose Estimation

Scope: Master thesis
Advisor: Niklas Funk
Added: 2022-10-06
Start: ASAP
Topic: Estimating an object’s pose is crucial in many robotic applications, and pose estimation remains an active research problem. It is challenging because the information about the scene captured by a camera is always incomplete: some parts of an object are always occluded, and multiple objects might occlude each other. Moreover, it is particularly challenging to estimate rotations, as they might not be unique and their representations are discontinuous. While many approaches to this problem have been proposed, deep learning-based methods have recently been dominant [1,2]. Yet, those approaches still struggle with the aforementioned problems.
To tackle these difficulties, we propose exploring Energy-Based Models (EBMs) [3] for multi-object pose estimation from images. Instead of directly predicting the target values from the input, EBMs measure the compatibility between potential target values and the input. It has been shown that many modern machine learning models can be interpreted as EBMs and that there are performance advantages when applying EBM-specific methods to these models [4,5]. We hope to achieve similar performance gains for 6D pose estimation.
Note that this thesis builds upon a very promising Master's thesis. Therefore, a good starting point in terms of code and theory exists.
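
To give a rough feeling for the difference between direct prediction and energy-based inference, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption and not the method of the thesis: the toy energy function, the single rotation angle standing in for a full 6D pose, and the names energy and infer_poses are all hypothetical. The object is assumed to have a 180-degree rotational symmetry, so two angles are equally compatible with the observation, which a single-output regressor could not express.

```python
import math


def energy(observed_angle, candidate_angle):
    """Toy energy: low when the candidate angle explains the observation.

    Assumption: the object has 180-degree rotational symmetry, so angles
    that differ by pi are equally compatible with the observation.
    """
    return 1.0 - math.cos(2.0 * (candidate_angle - observed_angle))


def infer_poses(observed_angle, num_candidates=360, tolerance=1e-9):
    """Score every candidate angle and keep those with (near) minimal energy."""
    candidates = [2.0 * math.pi * k / num_candidates for k in range(num_candidates)]
    energies = [energy(observed_angle, c) for c in candidates]
    best = min(energies)
    return [c for c, e in zip(candidates, energies) if e - best <= tolerance]


# The observation is compatible with two rotations, and both are recovered.
observed = math.radians(30.0)
modes = infer_poses(observed)
print([round(math.degrees(m)) for m in modes])
```

In a real system the energy would be a learned network over images and poses, and the exhaustive grid would be replaced by sampling or gradient-based search; the sketch only shows the "score candidates instead of predicting one value" idea.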

Highly motivated students can apply by sending an e-mail expressing their interest to the advisor, attaching a letter of motivation and possibly a CV.


Requirements:

  • Good knowledge of Python
  • Experience with deep learning libraries (in particular PyTorch)
  • Prior experience with pose estimation / image segmentation is a plus

[1] Posecnn: A convolutional neural network for 6d object pose estimation in cluttered scenes; Xiang et al.
[2] Single-stage 6d object pose estimation; Hu et al.
[3] A tutorial on energy-based learning; LeCun et al.
[4] Your GAN is secretly an energy-based model and you should use discriminator driven latent sampling; Che et al.
[5] Your classifier is secretly an energy based model and you should treat it like one; Grathwohl et al.

Robotic Tactile Exploratory Procedures for Identifying Object Properties

Scope: Master's thesis
Advisor: Tim Schneider, Alap Kshirsagar
Added: 2022-09-06
Start: ASAP
Topic: Identifying properties such as shape, deformability, roughness etc. is important for successful manipulation of an object. Humans use specific “exploratory procedures (EPs)” [1] to identify object properties, for example, lateral motion to detect texture and pressure to detect deformability. Our goal is to understand whether these exploratory procedures are optimal for robotic arms equipped with tactile sensors. We specifically focus on three properties and their corresponding EPs: texture (lateral motion), shape (contour following) and deformability (pressure).

Goals of the thesis

  • Literature review of robotic EPs for identifying object properties [2,3,4]
  • Develop and implement robotic EPs for a Digit tactile sensor
  • Compare performance of robotic EPs with human EPs

Desired Qualifications

  • Interested in working with real robotic systems
  • Python programming skills

[1] Lederman and Klatzky, “Haptic perception: a tutorial”
[2] Seminara et al., “Active Haptic Perception in Robots: A Review”
[3] Chu et al., “Using robotic exploratory procedures to learn the meaning of haptic adjectives”
[4] Kerzel et al., “Neuro-Robotic Haptic Object Classification by Active Exploration on a Novel Dataset”

On designing reward functions for robotic tasks

Scope: Master's thesis
Advisor: Davide Tateo, Georgia Chalvatzaki
Added: 2022-09-01
Start: ASAP
Topic: Defining a proper reward function to solve a robotic task is a complex and time-consuming process. Reinforcement Learning algorithms are sensitive to reward function definitions, and an improper design of the reward function may lead to suboptimal performance of the robotic agent, even in simple low-dimensional environments. This issue makes it complex to design novel reinforcement learning environments, as the reward tuning procedure takes too much time and leads to overcomplicated and algorithm-specific reward functions.

The objective of this thesis is to study and develop a set of guidelines for building Reinforcement Learning environments that represent simulated robotics tasks. We will analyze in depth the impact of different types of reward functions on very simple continuous control tasks such as navigation, manipulation, and locomotion. We will consider how the state space affects learning (e.g., dealing with rotations) and how we should deal with these issues in a standard Reinforcement Learning setting. Furthermore, we will investigate how to design a reward function that leads to a policy producing smooth actions, to minimize the issues of the sim-to-real transfer of the learned behavior.
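
As a concrete illustration of a reward that encourages smooth actions, here is a minimal sketch in plain Python. It is only one common design choice and not necessarily the one the thesis will arrive at: the function name shaped_reward, the distance-to-goal task term, and the smoothness_weight parameter are all assumptions made for this example.

```python
import math


def shaped_reward(position, goal, action, previous_action, smoothness_weight=0.1):
    """Negative distance to the goal minus a penalty on action changes.

    Assumption: penalising the squared difference between consecutive
    actions is used here as a simple proxy for "smooth" behaviour.
    """
    distance = math.dist(position, goal)
    action_change = sum((a - b) ** 2 for a, b in zip(action, previous_action))
    return -distance - smoothness_weight * action_change


# Same state, but the second call reverses the action abruptly.
smooth = shaped_reward((0.0, 0.0), (3.0, 4.0), (1.0, 0.0), (1.0, 0.0))
jerky = shaped_reward((0.0, 0.0), (3.0, 4.0), (1.0, 0.0), (-1.0, 0.0))
print(smooth, jerky)
```

How strongly to weight the smoothness term against the task term is exactly the kind of tuning question the thesis is meant to study systematically.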


To apply, please send:

  • Curriculum Vitae (CV);
  • A motivation letter explaining the reason for applying for this thesis and academic/career objectives.

Minimum knowledge

  • Good Python programming skills;
  • Basic knowledge of Reinforcement Learning.

Preferred knowledge

  • Knowledge of the PyTorch library;
  • Knowledge of the MuJoCo and PyBullet libraries.
  • Knowledge of the MushroomRL library.

The accepted candidate will

  • Port some classical MuJoCo/PyBullet locomotion environments into MushroomRL;
  • Design a set of simple manipulation tasks using PyBullet;
  • Design a set of simple navigation tasks;
  • Analyze the impact of different reward function definitions in these environments;
  • Verify whether the insights coming from the simple tasks still hold in more complex (already existing) environments;
  • (Optionally) Port some classical MuJoCo/PyBullet locomotion environments into MushroomRL.

Tell me what to imitate?

Scope: Master thesis
Advisor: An Thai Le, Ali Younes
Added: 2022-08-26
Start: ASAP
Topic: Imitation Learning has been around for over two decades and has achieved huge successes in the acquisition of new motor skills from expert demonstrations. However, only the question of "how to imitate" is well-addressed so far, provided that the expert demonstrations lie in the same domain as the robot's state space.

Recently, some works have implicitly addressed the question of "what and how to imitate", for example by learning an embodiment-agnostic task-progress indicator [1] from supervised datasets. This thesis explores learning methods from datasets such as Simitate [2] to address the "what to imitate" question explicitly, i.e. to identify what parts of the expert should be imitated before learning skills from demonstrations, given that the expert and learner have dissimilar embodiments. Identifying keypoints in human demonstrations and surrounding contexts [3] for imitation is also a promising perspective to deal with this problem. The thesis will open an opportunity for cross-domain imitation learning, e.g. robotic manipulation from human video demonstrations, which is still an open question to date.

Goals of the thesis:

  • Literature review on what has been done in Visual Imitation Learning.
  • Development of novel methods to realize Cross-Domain Imitation Learning in general, with a focus on human video demonstrations.
  • Implement the proposed method in the Human-Panda imitation scenarios, since both embodiments are strongly dissimilar.

Required Qualification:

  • Strong programming skills in Python and/or C++
  • Prior experience with ROS would be beneficial (optional).
  • Strong motivation and/or foundation in Robotics and Computer Vision.

[1] Zakka, Kevin, et al. "Xirl: Cross-embodiment inverse reinforcement learning." Conference on Robot Learning. PMLR, 2022.
[2] Memmesheimer, Raphael, et al. "Simitate: A hybrid imitation learning benchmark." IEEE/RSJ IROS, 2019.
[3] J. Gao, Z. Tao, N. Jaquier, and T. Asfour. "K-VIL: Keypoints-based Visual Imitation Learning." arXiv preprint, 2022.

Human Uncertainty Detection

Scope: Bachelor or Master thesis
Advisor: Lisa Scherf
Added: 2022-08-04
Start: August / September / October 2022
Topic: Detecting human uncertainty can be beneficial in many human-robot or human-system interaction scenarios. In an interactive reinforcement learning setting, human uncertainty could indicate a higher probability for incorrect advice and help to decide whether to follow provided human advice [1]. In student-tutor frameworks or in human-AI interaction in general, a user’s uncertainty can be used to decide when and how a robot or a system should offer help or provide suggestions [2]. Our goal is to analyze human uncertainty in a task with perceptual uncertainty and learn to detect uncertainty using related indicators, such as facial expressions [3], pupil dilation [4], speech [2] and/or reaction time. More specifically, first the experiment has to be designed and implemented. The multimodal data gathered in experimental trials with human participants is then used to train a model in order to predict human uncertainty.

The ideal candidate for this thesis has good Python programming skills and knowledge of machine learning.

Highly motivated students can apply by sending an e-mail expressing their interest to the advisor, attaching a letter of motivation and possibly a CV.

[1] L. Scherf, et al. "Learning from Unreliable Human Action Advice in Interactive Reinforcement Learning"
[2] K. Forbes-Riley, et al. "Benefits and challenges of real-time uncertainty detection and adaptation in a spoken dialogue computer tutor"
[3] P. Jahoda, et al. "Detecting decision ambiguity from facial images"
[4] A. E. Urai, et al. "Pupil-linked arousal is driven by decision uncertainty and alters serial choice bias"

Analyzing Graph Attention Mechanisms

Scope: Master thesis
Advisor: Niklas Funk
Added: 2022-08-04
Start: ASAP
Topic: By now Transformers are one of the most popular architectures across a wide range of fields [1]. Yet, the underlying mechanisms that are the basis of these powerful architectures are still not fully understood. In fact, all Transformers build up a graph in which information is passed through message passing. This procedure of message passing to obtain powerful representations is also known as Graph Attention. Recently, many different graph attention algorithms have been proposed [2-4].
The goal of this thesis is to thoroughly analyze these graph attention mechanisms to potentially draw connections to oversquashing [5], an effect that is limiting the expressivity of the networks.
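
To make the "attention as message passing on a graph" view concrete, here is a minimal sketch in plain Python. It is a simplification under stated assumptions: there are no learned query, key, or value projections (the raw node features play all three roles), there is a single head, and the function name attention_message_passing is hypothetical. With a fully connected adjacency the update corresponds to dense self-attention over all nodes; with a sparser adjacency each node only aggregates messages from its neighbours.

```python
import math


def attention_message_passing(features, adjacency):
    """One round of scaled dot-product attention restricted to graph edges.

    features:  list of node feature vectors (lists of floats, equal length).
    adjacency: adjacency[i][j] is True if node i may receive a message from
               node j. Every node needs at least one allowed sender.
    """
    dim = len(features[0])
    updated = []
    for i, query in enumerate(features):
        neighbours = [j for j, allowed in enumerate(adjacency[i]) if allowed]
        # Scaled dot-product score between node i and each neighbour.
        scores = [
            sum(q * k for q, k in zip(query, features[j])) / math.sqrt(dim)
            for j in neighbours
        ]
        # Numerically stable softmax over the neighbourhood only.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Aggregate the neighbours' features, weighted by attention.
        updated.append([
            sum(w * features[j][d] for w, j in zip(weights, neighbours))
            for d in range(dim)
        ])
    return updated


nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fully_connected = [[True] * 3 for _ in range(3)]
self_loops_only = [[i == j for j in range(3)] for i in range(3)]

dense_update = attention_message_passing(nodes, fully_connected)
sparse_update = attention_message_passing(nodes, self_loops_only)
print(dense_update)
print(sparse_update)
```

With self-loops only, every node attends just to itself and its features are unchanged, which shows how the graph structure limits where information can flow.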

Highly motivated students can apply by sending an e-mail expressing their interest to the advisor, attaching a letter of motivation and possibly a CV. Note that this is a pure machine learning topic.


Requirements:

  • Good knowledge of Python
  • Experience with deep learning libraries (in particular PyTorch)
  • Prior experience with graph neural networks / Transformers is a plus

[1] A Survey of Transformers; Tianyang Lin et al.
[2] Attention is all you need; Ashish Vaswani et al.
[3] Graph attention networks; Petar Velickovic et al.
[4] How attentive are graph attention networks?; Shaked Brody et al.
[5] Understanding over-squashing and bottlenecks on graphs via curvature; Jake Topping et al.

Scaling learned, graph-based assembly policies

Scope: Master thesis
Advisor: Niklas Funk
Added: 2022-08-04
Start: ASAP
Topic: Solving assembly tasks with a large number of building blocks and arbitrary desired designs (similar to playing LEGO) has recently attracted a lot of interest [1-3], yet it remains a challenging task for machine learning algorithms.
The goal of this thesis is to build upon our very recent work on assembly [1,2] and to scale the methods to handle a larger number of blocks. This thesis could go in multiple directions, including the following (just to give you an idea):

  • scaling our previous methods to incorporate mobile manipulators or the Kobo bi-manual manipulation platform. The increased workspace of both would allow for handling a wider range of objects
  • the approach in [2] has been shown to be more powerful, yet it requires solving a MILP for every desired structure. Thus, another idea could be to investigate approaches that approximate this solution
  • adapting the methods to handle more irregular-shaped objects / investigate curriculum learning

Highly motivated students can apply by sending an e-mail expressing their interest to the advisor, attaching a letter of motivation and possibly a CV.


Requirements:

  • Good knowledge of Python
  • Experience with deep learning libraries (in particular PyTorch) is a plus
  • Experience with reinforcement learning / having taken Robot Learning is also a plus

[1] Learn2Assemble with Structured Representations and Search for Robotic Architectural Construction; Niklas Funk et al.
[2] Graph-based Reinforcement Learning meets Mixed Integer Programs: An application to 3D robot assembly discovery; Niklas Funk et al.
[3] Structured agents for physical construction; Victor Bapst et al.

Opening the Human Black Box — Inverse Reinforcement Learning for Human Juggling

Scope: Master thesis
Advisor: Kai Ploeger
Added: 2022-06-27
Start: Immediately
Topic: Humans can perform complex dynamic motor skills remarkably well, despite slow reaction times, bad perception, and high actuation noise. We will figure out how humans cope with these weaknesses in the context of toss juggling by applying inverse reinforcement learning.

This project can be broken down into three parts:
1. Study design and preparation
2. Data recording
3. IRL and evaluation

For this project, it is helpful but not necessary to be able to juggle.
Contact the advisor.

Inverse Reinforcement Learning from Observation for Locomotion on the Unitree A1 Robot

Scope: Master thesis
Advisor: Firas Al-Hafez, Davide Tateo
Added: 2022-06-24
Start: October 2022
Topic: Reinforcement Learning (RL) has recently achieved remarkable success on locomotion tasks, for example with quadrupeds and humanoids. Despite this success, approaches building on RL usually require huge effort and expert knowledge to define the reward function in order to get a smooth and natural-looking gait. In contrast, Inverse Reinforcement Learning (IRL) infers a reward function given a set of expert demonstrations. While it is easy to observe an expert, the expert's true state and action might be hidden and not available for learning. Moreover, the expert's dynamics and kinematics might be unknown, resulting in a correspondence problem when mapping the expert data to the agent's observation space.

In this thesis, the following tasks need to be completed:
1. In-depth literature research on state-of-the-art IRL from observation approaches for quadruped locomotion
2. Setting up a simulation environment for the Unitree A1 in our mushroom-rl library
3. Implementation of a selected approach in our mushroom-rl library
4. Evaluation of the agent's performance in simulation and on the real robot (optional)

Required Qualification:
1. Strong Python programming skills
2. Prior experience in robotics
3. Attendance of the lecture "Reinforcement Learning: From Fundamentals to the Deep Approaches"

Desired Qualification:
1. Hands-on experience on robotics-related RL projects
2. Experience in ROS
3. Experience in C++
4. Attendance of the lectures "Statistical Machine Learning" and "Computational Engineering and Robotics"

[1] Peng, Jason et al. "Learning Agile Robotic Locomotion Skills by Imitating Animals"
[2] Li, Zhongyu et al. "Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots"
[3] Escontrela, Alejandro et al. "Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions"

Long-Horizon Manipulation Tasks from Visual Imitation Learning (LHMT-VIL): Algorithm

Scope: Master thesis
Advisor: Suman Pal, Ravi Prakash, Vignesh Prasad, Aiswarya Menon
Added: 2022-06-16
Start: Immediately
Topic: The objective of this thesis is to create a method to solve long-horizon robot manipulation tasks from Visual Imitation Learning (VIL). Thus, given a video demonstration of a human performing a long horizon manipulation task such as an assembly sequence, a robot should imitate the identical task by analyzing the video.

The proposed architecture can be broken down into the following sub-tasks:
1. Multi-object 6D pose estimation from video: Identify the object 6D poses in each video frame to generate the object trajectories
2. Action segmentation from video: Classify the action being performed in each video frame
3. High-level task representation learning: Learn the sequence of robotic movement primitives with the associated object poses such that the robot completes the demonstrated task
4. Low-level movement primitives: Create a database of low-level robotic movement primitives which can be sequenced to solve the long-horizon task

Desired Qualification:
1. Strong Python programming skills
2. Prior experience in Computer Vision and/or Robotics is preferred

Long-Horizon Manipulation Tasks from Visual Imitation Learning (LHMT-VIL): Dataset

Scope: Master thesis
Advisor: Suman Pal, Ravi Prakash, Vignesh Prasad, Aiswarya Menon
Added: 2022-06-16
Start: Immediately
Topic: The objective of this thesis is to create a large-scale dataset to solve long-horizon robot manipulation tasks from Visual Imitation Learning (VIL). Thus, given a video demonstration of a human performing a long horizon manipulation task such as an assembly sequence, a robot should imitate the identical task by analyzing the video.

During the project, we will create a large-scale dataset of videos of humans demonstrating industrial assembly sequences. The dataset will contain information on the 6D poses of the objects, the hand and body poses of the human, and the action sequences, among numerous other features. The dataset will be open-sourced to encourage further research on VIL.

Desired Qualification:
1. Strong Python programming skills
2. Prior experience in Computer Vision and/or Robotics is preferred

[1] F. Sener, et al. "Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities". CVPR 2022.
[2] P. Sharma, et al. "Multiple Interactions Made Easy (MIME) : Large Scale Demonstrations Data for Imitation." CoRL, 2018.

Self-Supervised Correspondence Learning for Cross-Domain Imitation Learning

Scope: Master thesis
Advisor: An Thai Le
Added: 2022-05-21
Start: October 2022
Topic: Imitation Learning has achieved huge successes over decades in the acquisition of new motor skills from expert demonstrations. However, most of these successes assume that expert demonstrations lie in the same domain as the learner, which hinders the application of Imitation Learning in a variety of cases, e.g. teaching skills from videos. Recently, a body of work has addressed the correspondence problem by directly learning the mapping between expert and learner state-action spaces [1, 2] or an embodiment-agnostic task-progress indicator [3] from supervised datasets. However, such supervised datasets could be hard or impossible to collect.

This project explores various techniques and insights from Self-Supervised Learning to design a method that learns the correspondence without human supervision. A promising direction could be applying Optimal Transport cost [4] to measure the similarity of state-action spaces having different supports in the self-supervised setting.

The ideal candidate for this thesis has good programming skills in Python as well as solid knowledge of (deep) RL algorithms/techniques.

[1] Raychaudhuri, Dripta S., et al. "Cross-domain imitation from observations." International Conference on Machine Learning. PMLR, 2021.
[2] Kim, Kuno, et al. "Domain adaptive imitation learning." International Conference on Machine Learning. PMLR, 2020.
[3] Zakka, Kevin, et al. "Xirl: Cross-embodiment inverse reinforcement learning." Conference on Robot Learning. PMLR, 2022.
[4] Fickinger, Arnaud, et al. "Cross-Domain Imitation Learning via Optimal Transport." arXiv preprint arXiv:2110.03684 (2021).

Learning 3D Inverted Pendulum Stabilization

Scope: Master thesis
Advisor: Pascal Klink, Kai Ploeger
Added: 2022-05-18
Start: End of 2022
Topic: The concept of starting small is widely applied in reinforcement learning to improve the learning speed and stability of autonomous agents [1, 2, 3]. This project focuses on applying this concept to the task of controlling a 3D inverted pendulum with a Barrett WAM robot. More precisely, the robot is tasked to follow increasingly complex trajectories with its end-effector while simultaneously stabilizing a pole that is attached to said end-effector. The generation of increasingly complex target trajectories will be performed by a novel algorithm that has been developed at IAS. The design of the overall learning agent will first be done in a simulator and then transferred to the real system. The transfer to the real system also comprises the design and assembly of the physical 3D inverted pendulum.

The ideal candidate for this thesis has a solid knowledge of robotics and simulators, good programming skills as well as knowledge of (deep) RL algorithms/techniques.

[1] OpenAI: Andrychowicz, Marcin, et al. "Learning dexterous in-hand manipulation." IJRR, 2020.
[2] Silver, David, et al. "Mastering the game of go without human knowledge." Nature, 2017.
[3] Rudin, Nikita, et al. "Learning to walk in minutes using massively parallel deep reinforcement learning." CoRL, 2022.

Adaptive Human-Robot Interactions with Human Trust Maximization

Scope: Master thesis
Advisor: Kay Hansel, Georgia Chalvatzaki
Added: 2022-03-18
Start: April
Topic: Building trust between humans and robots is a major goal of Human-Robot Interaction (HRI). Usually, trust in HRI has been associated with risk aversion: a robot is trustworthy when its actions do not put the human at risk. However, we believe that trust is a bilateral concept that governs the behavior and participation in the collaborative tasks of both interacting parties. On the one hand, the human has to trust the robot about its actions, e.g., delivering the requested object, acting safely, and interacting in a reasonable time horizon. On the other hand, the robot should trust the human regarding their actions, e.g., have a reliable belief about the human's next action that would not lead to task failure; a certainty in the requested task. However, providing a computational model of trust is extremely challenging.
Therefore, this thesis explores trust maximization as a partially observable problem, where trust is considered as a latent variable that needs to be inferred. This consideration results in a dual optimization problem for two reasons: (i) the robot behavior must be optimized to maximize the human's latent trust distribution; (ii) an optimization of the human's prediction model must be performed to maximize the robot's trust. To address this challenging optimization problem, we will rely on variational inference and metrics like Mutual Information for optimization.
Highly motivated students can apply by sending an e-mail expressing their interest to the advisor, attaching a letter of motivation and possibly a CV.


Requirements:

  • Good knowledge of Python and/or C++;
  • Good knowledge of Robotics and Machine Learning;
  • Good knowledge of Deep Learning frameworks, e.g., PyTorch.

[1] Xu, Anqi, and Gregory Dudek. "Optimo: Online probabilistic trust inference model for asymmetric human-robot collaborations." ACM/IEEE HRI, IEEE, 2015;
[2] Kwon, Minae, et al. "When humans aren’t optimal: Robots that collaborate with risk-aware humans." ACM/IEEE HRI, IEEE, 2020;
[3] Chen, Min, et al. "Planning with trust for human-robot collaboration." ACM/IEEE HRI, IEEE, 2018;
[4] Poole, Ben et al. “On variational bounds of mutual information”. ICML, PMLR, 2019.

Statistical Model-based Reinforcement Learning

Scope: Master thesis
Advisor: Joe Watson
Added: 2022-01-25
Start: ASAP
Topic: Revisit the idea of Gaussian processes for data-driven control. The project is well defined and similar to previous works such as PILCO and guided policy search. The student will learn about approximate inference, optimal control and Gaussian processes. Preferably, applicants have taken the Statistical Machine Learning, Robot Learning and/or Robot Learning: Integrated Project courses.

The goals of this thesis are:

  • Use Gaussian processes and approximate inference for model-based reinforcement learning
  • Implement and evaluate algorithms on real robotic systems
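
For readers new to the idea, the following is a minimal sketch of how a Gaussian process can serve as a learned one-step dynamics model, written in plain Python. It is only an illustration under assumptions: the toy system (next state equal to the sine of the current state), the fixed RBF kernel hyperparameters, and the names rbf_kernel, solve, and gp_predict are all hypothetical. It does not include the approximate inference or uncertainty propagation used in methods such as PILCO.

```python
import math


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        residual = aug[row][n] - sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = residual / aug[row][row]
    return x


def gp_predict(train_x, train_y, query, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single query input."""
    n = len(train_x)
    gram = [
        [rbf_kernel(train_x[i], train_x[j]) + (noise if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    k_star = [rbf_kernel(query, x) for x in train_x]
    alpha = solve(gram, train_y)   # (K + noise * I)^-1 y
    v = solve(gram, k_star)        # (K + noise * I)^-1 k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    variance = rbf_kernel(query, query) - sum(k * w for k, w in zip(k_star, v))
    return mean, variance


# Toy transition data: next_state = sin(state) (an assumption for this sketch).
states = [-2.0, -1.0, 0.0, 1.0, 2.0]
next_states = [math.sin(s) for s in states]

mean_near, var_near = gp_predict(states, next_states, 1.0)
mean_far, var_far = gp_predict(states, next_states, 10.0)
print(mean_near, var_near)
print(mean_far, var_far)
```

The predictive variance is small near observed transitions and grows back towards the prior far away from the data, which is the property model-based methods exploit to reason about where the learned model can be trusted.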

If you are interested in this thesis, please send an e-mail with your CV and transcripts to the advisor.


Requirements:

  • Python software development
  • Patience for research on real robotic systems


  1. PILCO: A Model-Based and Data-Efficient Approach to Policy Search, Deisenroth et al. (2011)
  2. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, Levine et al. (2014)
  3. Stochastic Control as Approximate Input Inference, Watson et al. (2021)

Hierarchical VADAM

Scope: Master's thesis, Bachelor's thesis
Advisor: Oleg Arenz
Added: 2021-11-04
Start: ASAP
Topic: TL;DR: Learning a mixture of mean-field Gaussians by combining VIPS with VADAM

ADAM is a popular method for minimizing a loss function in deep learning. Khan et al. [1] showed that a slight modification of ADAM (called VADAM) can be applied to find the parameters of a Gaussian (with diagonal covariance) that approximates the posterior in Bayesian inference (with NN loss functions). VIPS [2] is a method for optimizing a Gaussian mixture model to obtain better, multimodal posterior approximations by enabling us to optimize the individual Gaussian components independently. However, VIPS learns full covariance matrices (using the MORE algorithm) and thus does not scale to very high-dimensional problems, e.g., neural network parameters. In this thesis, you will replace the MORE optimizer within the VIPS framework with VADAM to efficiently learn mixtures of mean-field Gaussians for high-dimensional, multimodal variational inference. We will likely use our implementation of VIPS, which is written in TensorFlow 2.

The topic is suitable for a Bachelor's thesis, as it should be relatively straightforward to implement (if you are familiar with Python/TF2). However, the topic also has a lot of potential to be useful for a wide audience, and we should aim to publish the result, which would require extra effort from you. To apply, first try to grasp the main insights and the mechanics of VADAM and VIPS ([1] and [2]) and arrange a meeting with me.

[1] Khan, Mohammad, et al. "Fast and scalable bayesian deep learning by weight-perturbation in adam." ICML 2018.
[2] Oleg Arenz, Mingjun Zhong, Gerhard Neumann. "Efficient Gradient-Free Variational Inference using Policy Search". ICML. 2018.

Causal inference of human behavior dynamics for physical Human-Robot Interactions

Scope: Master's thesis
Advisor: Georgia Chalvatzaki, Kay Hansel
Added: 2021-10-16
Start: ASAP
Topic: In this thesis, we will study and develop ways of approximating an efficient behavior model of a human in close interaction with a robot. We will research the extension of our prior work on the graph-based representation of the human into a method that leverages multiple attention mechanisms to encode relative dynamics in the human body. Inspired by methods in causal discovery, we will treat motion prediction as a causal discovery problem. A differentiable and accurate human motion model is essential for efficient tracking and optimization of HRI dynamics. You will test your method in the context of motion prediction, especially for HRI tasks like human-robot handovers, and you could demonstrate your results in a real-world experiment.

Highly motivated students can apply by sending an e-mail expressing your interest to , attaching a letter of motivation and possibly your CV.

Minimum knowledge

  • Good knowledge of Python and/or C++;
  • Good knowledge of Robotics;
  • Good knowledge of Deep Learning frameworks, e.g., PyTorch


  1. Li, Q., Chalvatzaki, G., Peters, J., Wang, Y., Directed Acyclic Graph Neural Network for Human Motion Prediction, 2021 IEEE International Conference on Robotics and Automation (ICRA).
  2. Löwe, S., Madras, D., Zemel, R. and Welling, M., 2020. Amortized causal discovery: Learning to infer causal graphs from time-series data. arXiv preprint arXiv:2006.10833.
  3. Yang, W., Paxton, C., Mousavian, A., Chao, Y.W., Cakmak, M. and Fox, D., 2020. Reactive human-to-robot handovers of arbitrary objects. arXiv preprint arXiv:2011.08961.

Incorporating First and Second Order Mental Models for Human-Robot Cooperative Manipulation Under Partial Observability

Scope: Master Thesis
Advisor: Dorothea Koert, Joni Pajarinen
Added: 2021-06-08
Start: ASAP

The ability to model the beliefs and goals of a partner is an essential part of cooperative tasks. While humans develop theory of mind models for this purpose at a very early age [1], it is still an open question how to implement and make use of such models for cooperative robots [2,3,4]. In particular, human-robot collaboration in shared workspaces could profit from such models, e.g., if the robot can detect and react to planned human goals or a human's false beliefs during task execution. To make such robots a reality, the goal of this thesis is to investigate the use of first and second order mental models in a cooperative manipulation task under partial observability. Partially observable Markov decision processes (POMDPs) and interactive POMDPs (I-POMDPs) [5] define an optimal solution to the mental modeling task and may provide a solid theoretical basis for modeling. The thesis may also compare related approaches from the literature and set up an experimental design for evaluation with the bi-manual robot platform Kobo.

Highly motivated students can apply by sending an e-mail expressing your interest to attaching your CV and transcripts.


  1. Wimmer, H., & Perner, J. Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception (1983)
  2. Sandra Devin and Rachid Alami. An implemented theory of mind to improve human-robot shared plans execution (2016)
  3. Neil Rabinowitz, Frank Perbet, Francis Song, Chiyuan Zhang, SM Ali Eslami, and Matthew Botvinick. Machine theory of mind (2018)
  4. Connor Brooks and Daniel Szafir. Building second-order mental models for human-robot interaction. (2019)
  5. Prashant Doshi, Xia Qu, Adam Goodie, and Diana Young. Modeling recursive reasoning by humans using empirically informed interactive pomdps. (2010)

Discovering neural parts in objects with invertible NNs for robot grasping

Scope: Master Thesis
Advisor: Georgia Chalvatzaki, Despoina Paschalidou
Added: 2021-05-19
Start: ASAP

In this thesis, we will investigate the use of 3D primitive representations in objects using Invertible Neural Networks (INNs). Through INNs we can learn the implicit surface function of the objects and their mesh. Apart from extracting the object’s shape, we can parse the object into semantically interpretable parts. In our work our main focus will be to segment the parts in objects that are semantically related to object affordances. Moreover, the implicit representation of the primitive can allow us to compute directly the grasp configuration of the object, allowing grasp planning. Interested students are expected to have experience with Computer Vision and Deep Learning, but also know how to program in Python using DL libraries like PyTorch.

The thesis will be co-supervised by Despoina Paschalidou (Ph.D. candidate at the Max Planck Institute for Intelligent Systems and the Max Planck ETH Center for Learning Systems). Highly motivated students can apply by sending an e-mail expressing your interest to , attaching a letter of motivation and possibly your CV.


  1. Paschalidou, Despoina, Angelos Katharopoulos, Andreas Geiger, and Sanja Fidler. "Neural Parts: Learning expressive 3D shape abstractions with invertible neural networks." arXiv preprint arXiv:2103.10429 (2021).
  2. Karunratanakul, Korrawe, Jinlong Yang, Yan Zhang, Michael Black, Krikamol Muandet, and Siyu Tang. "Grasping Field: Learning Implicit Representations for Human Grasps." arXiv preprint arXiv:2008.04451 (2020).
  3. Chao, Yu-Wei, Wei Yang, Yu Xiang, Pavlo Molchanov, Ankur Handa, Jonathan Tremblay, Yashraj S. Narang et al. "DexYCB: A Benchmark for Capturing Hand Grasping of Objects." arXiv preprint arXiv:2104.04631 (2021).
  4. Do, Thanh-Toan, Anh Nguyen, and Ian Reid. "Affordancenet: An end-to-end deep learning approach for object affordance detection." In 2018 IEEE international conference on robotics and automation (ICRA), pp. 5882-5889. IEEE, 2018.

Cross-platform Benchmark of Robot Grasp Planning

Scope: Master Thesis
Advisor: Georgia Chalvatzaki, Daniel Leidner
Added: 2021-04-22
Start: ASAP

Grasp planning is one of the most challenging tasks in robot manipulation. Apart from perception ambiguity, grasp robustness and successful execution rely heavily on the dynamics of the robotic hands. The student is expected to research and develop benchmarking environments and evaluation metrics for grasp planning. Development in simulation environments such as Isaac Sim and Gazebo will allow us to integrate and evaluate different robotic hands for grasping a variety of everyday objects. We will evaluate grasp performance using different metrics (e.g., object-category-wise, affordance-wise, etc.), and finally, test the sim2real gap when transferring such approaches from popular simulators to real robots. The student will have the chance to work with different robotic hands (Justin hand, PAL TIAGo hands, Robotiq gripper, Panda gripper, etc.) and is expected to transfer the results to at least two robots (Rollin’ Justin at DLR and TIAGo++ at TU Darmstadt). The results of this thesis are intended to be made public (both the data and the benchmarking framework) for the benefit of the robotics community. As this thesis is offered in collaboration with the DLR Institute of Robotics and Mechatronics in Oberpfaffenhofen near Munich, the student is expected to work at DLR for a period of 8 months for the thesis. On-site work at the premises of DLR can be expected but not guaranteed due to COVID-19 restrictions. A large part of the project can be carried out remotely.

Highly motivated students can apply by sending an e-mail expressing your interest to and , attaching a letter of motivation and possibly your CV.


  1. Collins, Jack, Shelvin Chand, Anthony Vanderkop, and David Howard. "A Review of Physics Simulators for Robotic Applications." IEEE Access (2021).
  2. Bekiroglu, Y., Marturi, N., Roa, M. A., Adjigble, K. J. M., Pardi, T., Grimm, C., ... & Stolkin, R. (2019). Benchmarking protocol for grasp planning algorithms. IEEE Robotics and Automation Letters, 5(2), 315-322.

AADD: Reinforcement Learning for Unbiased Autonomous Driving

Scope: Master's thesis
Advisor: Tuan Dam, Carlo D'Eramo, Joni Pajarinen
Start: ASAP
Topic: Applying reinforcement learning to autonomous driving is a promising but challenging research direction due to the high uncertainty and the environmental conditions in the task. Efficient reinforcement learning is needed. For efficient reinforcement learning, recent work has suggested solving the Bellman optimality equation with stability guarantees. Unfortunately, no guarantee of zero bias has been proposed in this context, which makes reinforcement learning susceptible to getting stuck in dangerous solutions. In this work, we formulate the Bellman equation as a convex-concave saddle point problem and solve it using a newly proposed accelerated primal-dual algorithm [3]. We will test the algorithm in benchmark problems and in an autonomous driving task such as the one shown on the right (see video), where an efficient unbiased solution is needed.
[1] Ofir Nachum, Yinlam Chow, and Mohammad Ghavamzadeh. Path consistency learning in Tsallis entropy regularized mdps. arXiv preprint arXiv:1802.03501 , 2018.
[2] Dai, Bo, et al. "Sbeed: Convergent reinforcement learning with nonlinear function approximation." International Conference on Machine Learning. PMLR, 2018.
[3] Erfan Yazdandoost Hamedani and Necdet Serhat Aybat. A primal-dual algorithm for general convex-concave saddle point problems. arXiv preprint arXiv:1803.01401 , 2018

Above Average Decision Making Under Uncertainty

Scope: Master's thesis
Advisor: Tuan Dam, Joni Pajarinen
Start: ASAP
Topic: Google DeepMind recently showed how Monte Carlo Tree Search (MCTS) combined with neural networks can be used to play Go at a super-human level. However, one disadvantage of MCTS is that the search tree grows exponentially with the planning horizon. In this Master's thesis, the student will integrate the advantages of MCTS, that is, optimistic decision making, into a policy representation that is limited in size with respect to the planning horizon. The outcome will be an approach that can plan further into the future. The application domain will include partially observable problems where decisions can have far-reaching consequences.
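
To illustrate what optimistic decision making means in MCTS, the sketch below shows the UCB1 selection rule used by the common UCT variant. The topic description does not say which MCTS variant is meant, so UCB1, the function name and the exploration constant are assumptions.

```python
import math


def ucb1_select(value_sums, visit_counts, exploration=math.sqrt(2.0)):
    """Return the index of the child with the highest UCB1 score.

    Unvisited children are selected first.
    """
    total_visits = sum(visit_counts)
    best_index, best_score = 0, -math.inf
    for i, (value_sum, visits) in enumerate(zip(value_sums, visit_counts)):
        if visits == 0:
            return i
        mean_value = value_sum / visits
        # Optimism bonus: larger for children that were tried rarely.
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        score = mean_value + bonus
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

A child with a lower average return can still be selected if it has been visited rarely, which is the optimism the topic refers to. The tree built this way stores statistics per node, which is why its size grows with the planning horizon.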

Approximate Inference Methods for Stochastic Optimal Control

Scope: Master's thesis
Advisor: Joe Watson
Start: ASAP

Recent work has presented a control-as-inference formulation that frames optimal control as input estimation. The linear Gaussian assumption can be shown to be equivalent to the LQR solution, while approximate inference through linearization can be viewed as a Gauss–Newton method, similar to popular trajectory optimization methods (e.g. iLQR). However, the linearization approximation limits both the tolerable environment stochasticity and exploration during inference.

The aim of this thesis is to use alternative approximate inference methods (e.g., quadrature, Monte Carlo, variational) and to investigate the benefits for stochastic optimal control and trajectory optimization. Ideally, prospective students are interested in optimal control, approximate inference methods and model-based reinforcement learning.
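
For reference, the LQR solution that the linear Gaussian case corresponds to can be computed with a backward Riccati recursion. The sketch below assumes discrete-time, time-invariant dynamics x_{t+1} = A x_t + B u_t and a quadratic cost x'Qx + u'Ru with terminal cost x'Qx. The double-integrator matrices and all names are hypothetical toy choices, and the code is not taken from the cited work.

```python
import numpy as np


def lqr_backward_pass(A, B, Q, R, horizon):
    """Finite-horizon discrete-time LQR gains via the Riccati recursion.

    Returns gains K_t such that u_t = -K_t @ x_t, ordered from t = 0.
    """
    P = Q.copy()  # terminal cost-to-go matrix
    gains = []
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]


# Hypothetical double integrator with time step 0.1.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[0.1]])

gains = lqr_backward_pass(A, B, Q, R, horizon=100)

# Roll out the closed loop from an initial position offset.
x = np.array([1.0, 0.0])
for K in gains:
    u = -K @ x
    x = A @ x + B @ u
final_norm = np.linalg.norm(x)
```

Linearization-based methods such as iLQR repeat a backward pass of this form around a nominal trajectory. The alternative inference methods named above would replace that linearization step.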

Interactive dance: at the interface of high-level reactive robotic control and human-robot interactions.

Scope: Master's thesis
Advisor: Vincent Berenz (a collaborator at the Max Planck Institute for Intelligent Systems in Tübingen)
Start: ASAP

Scripted robotic dance is common. On the other hand, interactive dance, in which the robot uses runtime sensory information to continuously adapt its moves to those of its (human) partner, remains challenging. It requires the integration of various sensors, action modalities and cognitive processes. The selected candidate's objective will be to develop such an interactive dance, based on the software suite for simultaneous perception and motion generation that our department has built over the years. The target robot on which the dance will be applied is the wheeled robot SoftBank Robotics Pepper. This Master's thesis is with the Max Planck Institute for Intelligent Systems and is located in Tübingen. More information:

Minimum knowledge

  • Good Python programming skills.

Preferred knowledge

  • Knowledge of deep neural networks and deep recurrent neural networks
  • Basic knowledge of Reinforcement Learning, POMDPs, and memory representation in POMDPs
  • Knowledge of recent Deep RL methodologies

[1] Deep recurrent q-learning for partially observable mdps, Hausknecht et al.
[2] Learning deep neural network policies with continuous memory states, Zhang et al.
[3] Recurrent Ladder Networks, Prémont-Schwarz et al.

