Maximilian Tölle

Research Interests

Vision-Language-Action Models, Task Representations, 3D Representations for Robot Manipulation

Affiliations

1. TU Darmstadt, Intelligent Autonomous Systems, Computer Science Department
2. German Research Center for AI (DFKI), Research Department: SAIROL

Contact

maximilian.toelle(at)dfki(dot)de
Room 2.1.16, Building S4|14, DFKI, SAIROL, Mornewegstraße 30, 64293 Darmstadt
+49-151-12764557

Maximilian joined the Intelligent Autonomous Systems Group as a Ph.D. student in November 2022. He is employed as a Researcher at the DFKI Research Department SAIROL under the supervision of Dr. Puze Liu and Prof. Dr. Jan Peters.

Before starting his Ph.D., Maximilian studied at RWTH Aachen, where he completed his Bachelor's degree in Mechanical Engineering and his Master's degree in Automation Engineering. At IAS, he wrote his Master's thesis, "Curriculum Adversarial Reinforcement Learning", under the supervision of Prof. Dr. Carlo D'Eramo and Prof. Dr. Georgia Chalvatzaki. In his thesis, he explored a novel combination of concepts to improve the training of robust policies.

Teaching Assistant

Statistical Machine Learning (SS 2023, WS 2023/2024, SS 2024, WS 2024/2025)
Robot Learning (WS 2023/2024, SS 2024)

Research Interests

I am inspired by the vision of affordable robots that support us in our everyday lives. Communicating tasks to such robots should be as easy as talking to other humans in natural language. While today's language models already provide the common-ground knowledge needed to understand a language instruction, translating that instruction into low-level robot actions remains a highly complex, unsolved problem. Current approaches record large datasets of language-labeled robot trajectories and train goal-conditioned policies to imitate the recorded behavior. However, even the largest models at present cannot generalize beyond the skill distribution seen during training. Scaling up data is the most commonly proposed solution to this problem. I believe that we still lack answers to more fundamental research questions. What is a good vision-language task representation for robotic control? How do we learn such an action-centric task representation?

Key References

- Toelle, M.; Gruner, T.; Palenicek, D.; Schneider, T.; Guenster, J.; Watson, J.; Tateo, D.; Liu, P.; Peters, J. (2025). Towards Safe Robot Foundation Models using Inductive Biases, SafeVLM Workshop @ IEEE International Conference on Robotics and Automation (ICRA), Spotlight.
- Toelle, M.; Gruner, T.; Palenicek, D.; Guenster, J.; Liu, P.; Watson, J.; Tateo, D.; Peters, J. (2025). Towards Safe Robot Foundation Models, German Robotics Conference (GRC).
- Scherer, C. F.; Tölle, M.; Gruner, T.; Palenicek, D.; Schneider, T.; Schramowski, P.; Belousov, B.; Peters, J. (2025). AllmAN: A German Vision-Language-Action Model, German Robotics Conference (GRC).
- Toelle, M.; Belousov, B.; Peters, J. (2023). A Unifying Perspective on Language-Based Task Representations for Robot Control, CoRL Workshop on Language and Robot Learning: Language as Grounding.