Currently Offered Thesis Topics

We offer these topics directly to Bachelor's and Master's students at TU Darmstadt; if you are interested in one of them, feel free to contact the thesis advisor DIRECTLY. Excellent external students from other universities may be accepted, but please email Jan Peters first. Note that we cannot provide funding for any of these thesis projects.

In addition, we are usually happy to devise new topics on request to suit the abilities of excellent students. Please contact the thesis advisor DIRECTLY if you are interested in one of these topics. When you contact the advisor, it would be nice if you could mention (1) WHY you are interested in the topic (dreams, parts of the problem, etc.), and (2) WHAT makes you special for the project (e.g., class work, project experience, special programming or math skills, prior work, etc.). Supplementary materials (CV, grades, etc.) are highly appreciated. Of course, such materials are not mandatory, but they help the advisor to see whether the topic is too easy, just about right, or too hard for you.

Only contact *ONE* potential advisor at a time! If you contact a second advisor without first concluding discussions with the first (i.e., deciding for or against the thesis with him or her), we may not consider you at all. Only if you are extremely excited about at most two topics should you email both supervisors, so that they are aware of the additional interest.

AADD: Adaptive Autonomous Deep Driving

Scope: Master's thesis
Advisor: Joni Pajarinen, Dorothea Koert
Start: ASAP
Topic: This thesis project focuses on deep reinforcement learning for autonomous driving in challenging conditions. Due to pedestrians and bad weather, there can be high uncertainty about the true state of the world: humans are hard to detect and may behave erratically. This is especially true for small children. One example scenario is parking in a crowded parking lot. The thesis will focus on deep reinforcement learning methods which can take uncertainty into account. Learning will mainly be performed in a simulated environment such as the one shown on the right (video: https://youtu.be/Hp8Dz-Zek2E). If desired, the student will also get the chance to participate in the large autonomous driving project at Darmstadt, which will have a real autonomous car.

Contribute to an intelligent service robot for elderly people

Scope: Master's or Bachelor's thesis
Advisor: Dorothea Koert
Start: ASAP
Topic: The KoBo34 project aims to build an assistive robot for elderly people. Within this project, we offer different thesis topics on learning robot skills for human-robot interaction, predicting human motions into the future, and recognizing human intentions. If you are interested in this research area, please contact me directly to discuss more concrete topics.

Correlated Exploration in Deep Reinforcement Learning

Scope: Master's or Bachelor's thesis
Advisor: Riad Akrour, Oleg Arenz
Start: ASAP
Topic: Correlated exploration is any exploration mechanism that enforces correlation of the action noise with respect to time or states. Correlated exploration is important for robotics in order to reduce or eliminate jerkiness of exploration and maintain the physical integrity of the robot. Correlated exploration was studied on low-dimensional policy representations [1, 2], and we demonstrated the suitability of such a learning scheme, for specialized policies, directly on a robotics platform [3]. It has also been shown that correlated exploration can be applied to larger, neural-network-based policies [4]. However, the exploration scheme of [4], if seen as an episodic contextual policy search algorithm, is rather primitive in its adaptation of the exploration noise and does not offer the necessary guarantees to be applied directly on a robot. In this thesis, we propose to leverage our expertise in entropy-regularized policy search algorithms [5, 6] to address these shortcomings and provide a safe and efficient correlated exploration algorithm for robotics. The successful candidate is expected to investigate the following topics:

  • Set up a baseline by integrating the correlated exploration of [4] into recent versions of DDPG such as [7].
  • Compare uncorrelated and correlated exploration on simulated tasks and on the Quanser robots.
  • Improve over existing correlated exploration formulations by, for example, integrating the gradient update of DDPG into our well-founded formulations of entropy-regularized episodic policy search algorithms [5, 6].

The successful candidate is expected to conduct their thesis with scientific rigor and a drive for quality such that their work finds its place at a top machine learning or robotics conference.
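
As a toy illustration of what the points above target, the sketch below contrasts uncorrelated Gaussian action noise with an Ornstein-Uhlenbeck process, one classical way to obtain time-correlated exploration (the parameter-space noise of [4] is another); all parameter values here are illustrative assumptions, not taken from the cited papers.

```python
import numpy as np

def white_noise(steps, sigma=0.2, seed=0):
    """Uncorrelated Gaussian action noise: an independent sample each step."""
    rng = np.random.default_rng(seed)
    return sigma * rng.standard_normal(steps)

def ou_noise(steps, theta=0.15, sigma=0.2, dt=1e-2, seed=0):
    """Ornstein-Uhlenbeck process: temporally correlated, smoother noise."""
    rng = np.random.default_rng(seed)
    x = np.zeros(steps)
    for t in range(1, steps):
        x[t] = x[t-1] - theta * x[t-1] * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return x

def lag1_autocorr(x):
    """Empirical lag-1 autocorrelation of a noise sequence."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

if __name__ == "__main__":
    w, o = white_noise(5000), ou_noise(5000)
    print(f"white noise lag-1 autocorr: {lag1_autocorr(w):.3f}")  # near 0: jerky
    print(f"OU noise    lag-1 autocorr: {lag1_autocorr(o):.3f}")  # near 1: smooth
```

The high autocorrelation of the OU sequence is exactly what keeps exploratory actions from jerking the robot between consecutive control steps.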
[1] Rückstieß, T. et al.; State-dependent exploration for policy gradient methods; ECML 2008.
[2] van Hoof, H. et al.; Generalized exploration in policy search.; MLJ 2017.
[3] Parisi, S. et al.; Reinforcement learning vs human programming in tetherball robot games; IROS 2015.
[4] Plappert, M. et al.; Parameter space noise for exploration; ICLR 2018.
[5] Akrour, R. et al.; Model-free trajectory-based policy optimization with monotonic improvement; JMLR 2018.
[6] Arenz, O. et al.; Efficient gradient-free variational inference using policy search; ICML 2018.
[7] Fujimoto, S. et al.; Addressing function approximation error in actor-critic methods; ICML 2018.

Deep Adversarial Learning of Object Disentangling

Scope: Master's thesis
Advisor: Oleg Arenz, Joni Pajarinen
Start: ASAP
Topic: When confronted with large piles of entangled or otherwise stuck-together objects, a robot has to separate the objects before further manipulation is possible. For example, in waste segregation the robot may put different types of objects into different containers. In this Master's thesis project, one robot will learn to disentangle objects and another, adversarial robot will learn to entangle them. Learning will be done on the real robots shown in the picture on the right.
Background knowledge: robot learning

Deep Handshake Turing Test: Learning the handshake as an important part of the human-robot interaction

Scope: Master's thesis, Bachelor's thesis
Advisor: Jan Peters, Ruth Stock-Homburg, Katharina Schneider
Start: ASAP
Topic: Companies in various industries have started introducing anthropomorphic, social robots that interact with customers by gesturing and showing facial expressions with their extremities and head. In this way, they have a social presence that, in turn, can create an emotional bond with the human within the interaction. Accordingly, physical and haptic contact between a social robot and a human is an important part of human-robot interaction. Handshaking is a simple human interaction, but it is a complex movement and can be applied in several different social contexts, such as greeting or congratulating. Therefore, an anthropomorphic, social robot that interacts with humans should possess motor intelligence and the ability to show human-like, authentic handshake behavior. While first theoretical frameworks for human handshaking movements have been investigated, their implementation for anthropomorphic robots using the handshake Turing test is not yet well understood. The thesis is embedded in the interdisciplinary FIF project "Handshake Turing Test – Android robot vs. human." The aim of this thesis is first to survey the literature on theories of the human handshake and the handshake Turing test, second to develop a concept of handshaking for an anthropomorphic, social robot, and third to test the concept on our real anthropomorphic robot Elenoide (see picture).

Deep Model Learning for Inverse Dynamics

Scope: Master's thesis, Bachelor's thesis
Advisor: Michael Lutter
Start: Anytime
Topic: One common robot learning task is to learn the inverse dynamics model using machine learning techniques. However, up to now the very recent advances in Deep Learning and computing hardware have not been tried out for learning inverse dynamics models online.

Therefore, this thesis should implement a Deep Learning model that learns the inverse dynamics and evaluate its performance on a real robotic system. So if you are excited to try out Deep Learning and want to get your hands dirty with real robots, this thesis is perfect for you. Additionally, you will present your thesis to ABB, a robot manufacturer that has engineered inverse dynamics models for decades, and show them how efficient model learning is. So if you are interested in this topic, feel free to message me (michael@robot-learning.de).

TL;DR:

  • Learn Inverse Dynamics Models Online
  • Test your learned model on real robots
  • Impress ABB and publish your thesis at a conference/journal
  • Good knowledge of Machine Learning & Deep Learning required
  • Good programming skills in Python / C++ required
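
As a minimal sketch of the learning problem (not the deep model the thesis asks for), the following fits the inverse dynamics of a hypothetical single pendulum with a linear-in-features model; a deep network would replace the hand-crafted features with learned ones. All physical parameters below are made up for illustration.

```python
import numpy as np

def pendulum_torque(q, qd, qdd, m=1.0, l=0.5, b=0.1, g=9.81):
    """Ground-truth inverse dynamics of a point-mass pendulum (illustrative)."""
    return m * l**2 * qdd + m * g * l * np.sin(q) + b * qd

rng = np.random.default_rng(0)
q = rng.uniform(-np.pi, np.pi, 500)
qd = rng.uniform(-5.0, 5.0, 500)
qdd = rng.uniform(-10.0, 10.0, 500)
tau = pendulum_torque(q, qd, qdd)       # "measured" joint torques

# Linear-in-features model: tau ≈ Phi(q, qd, qdd) @ w, fit by least squares
Phi = np.column_stack([qdd, np.sin(q), qd])
w, *_ = np.linalg.lstsq(Phi, tau, rcond=None)

pred = Phi @ w
rmse = float(np.sqrt(np.mean((pred - tau) ** 2)))
print("learned weights:", np.round(w, 3))  # recovers [m*l^2, m*g*l, b]
print("train RMSE:", rmse)
```

With noiseless data the true parameters are recovered exactly; the online Deep Learning setting replaces the fixed feature matrix with a network trained incrementally on streaming samples.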

Deep Perceptual Primitives for Dynamic Environments

Scope: Master's thesis, Bachelor's thesis
Advisor: Michael Lutter
Start: Anytime
Topic: Learning primitives for robotics has mostly focused on learning open-loop policies for overly simplified static environments. Especially for two-armed robots such as Darias or YuMi, open-loop policies do not provide sufficient flexibility, as the static policies cannot adapt their movements w.r.t. the other arm. Additionally, the currently learned primitives rely on low-dimensional and engineered feature representations. With the recent advent of Deep Learning, which is especially suitable for feature learning in high-dimensional spaces, it should be possible to learn Deep Perceptual Primitives: closed-loop primitives using raw sensor data, including visual data, for action selection. The Deep Perceptual Primitives should be able to implicitly learn a good feature representation, the coordination across arms, and the adaptation to different environments.

Therefore, this thesis should develop a new primitive based on Deep Learning and evaluate its performance on a real robotic system. Based on this research scope and depending on your time, your current knowledge, and your interests, we can define a specific topic which fits you best. With your thesis you also get the chance to impress ABB, a leading manufacturer of robots, and publish your work at a leading conference. So if you are interested in this research scope, feel free to message me (michael@robot-learning.de).

TL;DR:

  • Develop Deep Learning Skills for robots
  • Test your learned skills on real robots
  • Impress ABB and publish your thesis at a conference/journal
  • Good knowledge of Machine Learning & Deep Learning required
  • Good programming skills in Python required

Deep multi-objective optimization

Scope: Master's thesis
Advisor: Simone Parisi
Start: ASAP
Topic: Recent advances in multi-objective optimization propose gradient-based approaches to learn an approximation of the manifold of the Pareto frontier over all possible policies. However, this work is limited to linear policies and linear manifold approximators. Furthermore, these methods require a differentiable indicator of the quality of each policy, while the most commonly used indicators are non-differentiable. Building on recent advances in multi-objective mathematical optimization and deep learning, we want to build differentiable approximators of the most commonly used indicators and to use deep networks to approximate the Pareto frontier manifold.
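
To make the non-differentiability concrete: the hypervolume, probably the most widely used Pareto quality indicator, involves sorting and discrete comparisons, as in this minimal two-objective sketch (maximization; the points and reference are illustrative).

```python
import numpy as np

def hypervolume_2d(points, ref):
    """Exact hypervolume dominated by a 2-D point set (maximization),
    measured against a reference point below/left of all points.
    The sort and the comparison make it non-differentiable in the points."""
    pts = np.asarray(points, dtype=float)
    pts = pts[np.argsort(-pts[:, 0])]   # first objective, descending
    hv, prev_y = 0.0, float(ref[1])
    for x, y in pts:
        if y > prev_y:                  # only non-dominated points add area
            hv += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return hv

print(hypervolume_2d([(3, 1), (2, 2), (1, 3)], ref=(0, 0)))  # 6.0
```

A differentiable surrogate would replace the hard comparison and sort with smooth approximations so the indicator can be used as a training loss.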

Interactive dance: at the interface of high-level reactive robot control and human-robot interaction

Scope: Master's thesis
Advisor: Vincent Berenz (a collaborator at the Max Planck Institute for Intelligent Systems in Tübingen)
Start: ASAP
Topic:

Robotic scripted dance is common. On the other hand, interactive dance, in which the robot uses runtime sensory information to continuously adapt its moves to those of its (human) partner, remains challenging. It requires the integration of various sensors, action modalities, and cognitive processes. The selected candidate's objective will be to develop such an interactive dance, based on the software suite for simultaneous perception and motion generation our department has built over the years. The target robot on which the dance will be applied is the wheeled Softbank Robotics Pepper. This Master's thesis is with the Max Planck Institute for Intelligent Systems and is located in Tuebingen. More information: https://am.is.tuebingen.mpg.de/jobs/master-thesis-interactive-dance-performed-by-sofbank-robotics-pepper

Joint Learning of Humans and Robots

Scope: Master's Thesis, Bachelor's thesis
Advisor: Marco Ewerton
Start: Already taken
Topic: Recent research has leveraged Learning from Demonstrations and probabilistic movement representations to allow humans and robots to efficiently perform tasks together, such as moving objects from one location to another without hitting obstacles in the way (see "Co-manipulation with a Library of Virtual Guiding Fixtures"). In some situations, however, it might not be trivial to provide good demonstrations to the robot. Moreover, the robot and the human might need to adapt their respective behaviors over time in order to get used to each other and achieve better performance in previously unknown environments. Intelligent prosthetic limbs or exoskeletons could, for instance, adapt to human users as the users themselves get accustomed to those devices and as both agents face new environments. In this project, the student will explore Learning from Demonstrations, Probabilistic Movement Primitives, and policy search algorithms in order to enable robots to assist humans in shared control tasks.

Learning a Friction Hysteresis with MOSAIC

Scope: Master's thesis, Bachelor's thesis
Advisor: Jan Peters
Start: ASAP
Topic: Inspired by results in neuroscience, especially on the Cerebellum, Kawato & Wolpert introduced the MOSAIC (modular selection and identification for control) learning architecture. In this architecture, local forward models, i.e., models that predict future states and events, are learned directly from observations. Based on the prediction accuracy of these models, corresponding inverse models can be learned. In this thesis, we want to focus on the problem of learning to control a robot system with a hysteresis in its friction.
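
The core MOSAIC mechanism can be sketched in a few lines: each module's forward model is scored by its prediction error, and a softmax over (assumed Gaussian) likelihoods yields responsibilities that weight the corresponding inverse models. The module names and numbers below are purely illustrative.

```python
import numpy as np

def responsibilities(pred_errors, sigma=0.1):
    """MOSAIC-style soft model selection: modules whose forward models
    predict the observed next state well (small error) get high weight."""
    ll = -np.asarray(pred_errors, dtype=float) ** 2 / (2 * sigma ** 2)
    ll -= ll.max()                  # subtract max for numerical stability
    r = np.exp(ll)
    return r / r.sum()

# Two hypothetical forward models of a joint: one for the low-friction
# branch of the hysteresis, one for the high-friction branch. Suppose the
# low-friction model predicts the observed transition much better:
r = responsibilities([0.02, 0.30])
print("responsibilities:", np.round(r, 3))  # module 0 dominates
```

For the friction-hysteresis task, one would expect the responsibilities to switch between modules as the joint moves through the two branches of the hysteresis loop.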

Learning Locally Linear Dynamical Systems For Reinforcement Learning

Scope: Master's Thesis, Bachelor's thesis
Advisor: Hany Abdulsamad
Start: ASAP
Topic: Model-based Reinforcement Learning is an approach to learn complex tasks given local approximations of the nonlinear dynamics of the environment and cost functions. It has proven to be a sample-efficient approach for learning on real robots. Classical approaches for learning such local models impose certain restrictions on the overall structure, for example the number of local components and the switching dynamics. State-of-the-art research has recently moved to more general settings with nonparametric approaches that require less structure. The aim of this thesis is to review the literature on this subject and to compare existing algorithms on real robots like the BioRob or the Barrett WAM.
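
For intuition, the classical local model such methods build is a linear fit x' ≈ A x + B u + c around an operating point, estimated by least squares from transition samples; the ground-truth system below is a made-up sanity check, not a real robot model.

```python
import numpy as np

def fit_local_linear(X, U, X_next):
    """Least-squares fit of x' ≈ A x + B u + c from transition samples."""
    Z = np.hstack([X, U, np.ones((X.shape[0], 1))])
    Theta, *_ = np.linalg.lstsq(Z, X_next, rcond=None)
    dx, du = X.shape[1], U.shape[1]
    A, B, c = Theta[:dx].T, Theta[dx:dx + du].T, Theta[-1]
    return A, B, c

# Hypothetical ground-truth linear system to verify the fit recovers it
rng = np.random.default_rng(0)
A_true = np.array([[1.0, 0.1], [0.0, 0.95]])
B_true = np.array([[0.0], [0.1]])
X = rng.standard_normal((200, 2))
U = rng.standard_normal((200, 1))
X_next = X @ A_true.T + U @ B_true.T

A, B, c = fit_local_linear(X, U, X_next)
print("max error in A:", float(np.abs(A - A_true).max()))
```

The nonparametric approaches the topic points to replace the fixed number of such components with a model whose complexity grows with the data.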

Learning Robust Stochastic Controllers in an Adversarial Setting

Scope: Master's Thesis
Advisor: Hany Abdulsamad
Start: ASAP
Topic: Standard learning control techniques focus on learning deterministic controllers. Even advanced policy search methods that rely on stochastic search distributions use stochastic controllers only for the purpose of exploration; the final policy is applied only in its deterministic form. There are, however, cases in which a deterministic controller is always sub-optimal, such as scenarios with random unstructured disturbances. In this thesis, we want to address the problem of learning truly stochastic optimal controllers in an adversarial setting and investigate whether adversarial learning can be used to generalize standard policy search methods. This topic includes very interesting and deep connections to robust control, game theory, and multi-agent learning.
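
A textbook illustration of why a deterministic controller can be always sub-optimal against an adversary is matching pennies: any deterministic choice is fully exploited by a best-responding opponent, while the uniform stochastic policy is safe.

```python
import numpy as np

# Payoff matrix for the row player in matching pennies; the adversary
# picks the column that minimizes the row player's expected payoff.
PAYOFF = np.array([[ 1.0, -1.0],
                   [-1.0,  1.0]])

def worst_case_value(policy):
    """Expected payoff of a (possibly stochastic) row policy against a
    best-responding adversary."""
    expected = policy @ PAYOFF      # expected payoff per adversary action
    return float(expected.min())    # adversary minimizes

det = worst_case_value(np.array([1.0, 0.0]))   # deterministic: always heads
mix = worst_case_value(np.array([0.5, 0.5]))   # uniform stochastic policy
print(det, mix)   # -1.0 vs 0.0: only the stochastic policy is safe
```

The thesis question is, in essence, how to obtain such safe stochastic equilibria with policy search in continuous control rather than in a two-by-two matrix game.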

Learning to Support a Learning Agent in a Cooperative Setup

Scope: Master's Thesis, Bachelor's thesis
Advisor: Hany Abdulsamad
Start: ASAP
Topic: A great challenge in applying Reinforcement Learning is the need for human intervention to reset the scenario of a learned task, making the process very tedious and time-consuming. A clear example is learning table tennis, where we are either limited to using a ball gun with a predictable pattern of initial positions, or a human is needed to play against the robot. However, given a second robotic player, we propose a new setup in which the two agents cooperate to develop two different strategies, where one agent learns to support the second in becoming a great table tennis player. It is interesting to see whether, in such a scenario, the agents would be able to discover what might resemble a defensive and an aggressive strategy in table tennis. The thesis will concentrate on developing the concept of cooperation and testing the results in simulation and on our own real table tennis setup.

Model-Based RL is Approx. Trajectory Optimization, which is Approx. Optimal Control

Scope: Master's Thesis
Advisor: Hany Abdulsamad
Start: ASAP
Topic: Let's stop reinventing the wheel. Most recent approaches to model-based RL revolve around the concept of trajectory optimization (iLQG, GPS, PILCO, all related to DDP). They can all be categorized in terms of direct and indirect shooting methods, two categories of optimal control that have existed for a very long time. It is time to clarify this connection. The aim of this thesis is, first, to dive into the literature of control and model-based RL, and second, to investigate the possibility of applying information-theoretic bounds to standard optimal control techniques.
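
The "indirect" optimal-control core that iLQG/DDP approximate locally is the finite-horizon LQR Riccati recursion, sketched below on a double-integrator toy system (cost weights and horizon are illustrative choices).

```python
import numpy as np

def lqr_backward_pass(A, B, Q, R, horizon):
    """Finite-horizon discrete-time LQR: backward Riccati recursion
    yielding time-varying feedback gains u_t = -K_t x_t. DDP/iLQG run
    this same backward pass on local linear-quadratic approximations."""
    P = Q.copy()
    gains = []
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]   # gains ordered from t = 0 to t = horizon - 1

# Double integrator: state = (position, velocity), input = force
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
K0 = lqr_backward_pass(A, B, np.eye(2), np.array([[0.1]]), horizon=100)[0]

# The closed loop A - B K should be stable (spectral radius < 1)
rho = float(max(abs(np.linalg.eigvals(A - B @ K0))))
print("first gain:", np.round(K0, 3), "closed-loop spectral radius:", round(rho, 3))
```

Direct shooting methods optimize the control sequence itself instead of running this backward pass, which is exactly the categorization the thesis should map model-based RL algorithms onto.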

Non Parametric Off-Policy Policy Gradient Estimation

Scope: Master's Thesis; Bachelor's Thesis
Advisor: Samuele Tosatto
Start: Fall Semester 2018
Topic: One of the major drawbacks of state-of-the-art RL is its sample inefficiency. Very often, the amount of interaction with the system needed to learn a task is not realistic for real-world applications. We find a fundamental cause of this inefficiency in the Policy Gradient Theorem: the estimation of the state distribution is usually done by Monte Carlo sampling, and is thus "on-policy". In fact, while it is easy to obtain an off-policy estimate of the value function, there is currently no method for estimating the state distribution offline. We propose a method which provides an off-policy estimate of the state distribution and an analytical solution for the policy gradient. The derived algorithm seems very promising for real-world applications, but it has yet to be enhanced. The thesis consists in improving the proposed method so as to solve a robotic task defined together with the student. The ideal applicant should have good mathematical skills, the will to study and deeply understand RL theory and come up with new ideas, and good programming skills (Python, TensorFlow). What the student will gain from this project is the possibility to better understand RL theory and the state of the art, to work with robots (our target platform is Darias), and hopefully to have a publication.
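
As generic background for the "nonparametric" part (this is standard kernel density estimation, not the proposed method), the sketch below estimates a one-dimensional state distribution from a batch of logged, behaviour-policy states; the data are synthetic.

```python
import numpy as np

def gaussian_kde(samples, query, bandwidth=0.2):
    """Nonparametric (Gaussian-kernel) density estimate of a 1-D state
    distribution from logged samples, evaluated at the query points."""
    samples = np.asarray(samples, dtype=float)
    d = (query[:, None] - samples[None, :]) / bandwidth
    k = np.exp(-0.5 * d ** 2) / np.sqrt(2 * np.pi)
    return k.mean(axis=1) / bandwidth

rng = np.random.default_rng(0)
states = rng.normal(loc=1.0, scale=0.5, size=2000)  # synthetic logged states
xs = np.linspace(-1, 3, 5)
density = gaussian_kde(states, xs)
print(np.round(density, 3))   # probability mass concentrated around 1.0
```

An off-policy policy gradient additionally has to correct such a behaviour-policy estimate toward the target policy's state distribution, which is the open problem the topic describes.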

The Recursive Newton Euler Algorithm on a Differentiable Computation Graph for learning Robot Dynamics

Scope: Master's Thesis
Advisor: Joe Watson, Michael Lutter
Start: Anytime
Topic: Model-based Reinforcement Learning for robotics typically requires learning the nonlinear dynamics of complex multibody mechanical systems. The Recursive Newton-Euler Algorithm (RNEA) is an existing means of efficiently modelling such systems. This project looks at using the Lie algebra perspective of RNEA to implement the algorithm on a differentiable computation graph, the basis of deep learning models. This potentially offers a means of learning high-fidelity and interpretable models for robotics from data. See this write-up for more details.
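
For the 1-DoF special case, the inverse dynamics that RNEA computes recursively reduce to a closed form, and since the torque is differentiable in the link parameters, a computation graph can learn them by gradient descent. The finite difference below stands in for the automatic differentiation a real implementation would use; all parameter values are illustrative.

```python
import numpy as np

def inverse_dynamics(q, qd, qdd, m, l, g=9.81, b=0.05):
    """Inverse dynamics of a single pendulum: the 1-DoF special case of
    what RNEA computes recursively along a kinematic chain."""
    return m * l**2 * qdd + m * g * l * np.sin(q) + b * qd

def grad_wrt_mass(q, qd, qdd, m, l, eps=1e-6):
    """Finite-difference stand-in for the parameter gradient that a
    differentiable computation graph would provide automatically."""
    hi = inverse_dynamics(q, qd, qdd, m + eps, l)
    lo = inverse_dynamics(q, qd, qdd, m - eps, l)
    return (hi - lo) / (2 * eps)

# Analytically, d tau / d m = l^2 * qdd + g * l * sin(q), so learning the
# mass from torque data is gradient descent on a squared prediction error.
g_num = grad_wrt_mass(q=0.3, qd=1.0, qdd=2.0, m=1.0, l=0.5)
g_ana = 0.5**2 * 2.0 + 9.81 * 0.5 * np.sin(0.3)
print(round(float(g_num), 4), round(float(g_ana), 4))
```

The appeal of putting full RNEA on such a graph is that the learned quantities remain physical parameters (masses, inertias) rather than opaque network weights.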

Robotics Under Partial Observability

Scope: Master's thesis, Bachelor's thesis
Advisor: Joni Pajarinen
Start: Already started

Topic: Partial observability is a defining property of robotics. Noisy sensors and actuators make state estimation difficult, and even with accurate sensors, occlusions prevent full observability. To gain full autonomy, a robot should use the available observations both for state estimation and to plan how to gain the information required for performing the assigned tasks. Recently, approaches which take partial observability into account have gained traction, for example in autonomous driving, household robotics, and interactive perception. This Bachelor's/Master's thesis focuses on surveying the literature with respect to partial observability in robotics, categorizing different approaches, and discussing open questions. This thesis topic is a good fit for a student who likes searching for and categorizing information and wants to gain a deeper understanding of the state of the art.

Targeted Exploration Using Value Bounds

Scope: Master's thesis
Advisor: Joni Pajarinen
Start: ASAP
Topic: Efficient exploration is one of the most prominent challenges in deep reinforcement learning. In reinforcement learning, exploration of the state space is critical for finding high-value outcomes and connecting them to the actions that caused them. Exploration in model-free reinforcement learning has relied on classical techniques, empirical uncertainty estimates of the value function, or random policies. In model-based reinforcement learning, value bounds have been used successfully to direct exploration. In this Master's thesis project, the student will investigate how lower and upper value bounds can be used to target exploration in model-free reinforcement learning towards the most promising parts of the state space. This thesis topic requires background knowledge in reinforcement learning gained e.g. through machine learning or robot learning courses.
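
On a toy bandit (no state space), the bound-directed idea looks as follows: always act where the upper value bound is highest, so exploration is targeted at poorly known but potentially promising choices. The arm means, bonus constant, and horizon below are illustrative assumptions.

```python
import numpy as np

def ucb_bandit(true_means, steps=2000, c=2.0, seed=0):
    """Bound-directed exploration on a toy bandit: pull the arm whose
    upper value bound is highest; bounds tighten as visit counts grow."""
    rng = np.random.default_rng(seed)
    n_arms = len(true_means)
    counts = np.zeros(n_arms)
    means = np.zeros(n_arms)
    for t in range(1, steps + 1):
        # Upper bound = empirical mean + exploration bonus (unvisited
        # arms get a near-infinite bound and are tried first)
        upper = means + c * np.sqrt(np.log(t) / np.maximum(counts, 1e-9))
        a = int(np.argmax(upper))
        reward = rng.normal(true_means[a], 1.0)
        counts[a] += 1
        means[a] += (reward - means[a]) / counts[a]   # incremental mean
    return counts

counts = ucb_bandit([0.1, 0.5, 0.9])
print("pull counts:", counts)   # the best arm (index 2) dominates
```

The thesis generalizes this one-step picture: in the sequential setting, lower and upper bounds on the value function would steer a model-free learner toward the promising regions of the state space.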
