Oleg Arenz

Quick Info

Research Interests

Machine Learning, Robotics, Inverse Reinforcement Learning, Imitation Learning, Grasping and Manipulation, Reinforcement Learning, Variational Inference

More Information

Publications Google Citations DBLP

Contact Information

Oleg Arenz
TU Darmstadt, FG IAS,
Hochschulstr. 10, 64289 Darmstadt
Office: Room E226, Building S2|02

Oleg Arenz joined the Computational Learning for Autonomous Systems Lab on May 1st, 2015 as a PhD student. His research includes imitation learning, inverse reinforcement learning, and robot grasping and manipulation. During his PhD, Oleg is working on the RoMaNS project.

Before his PhD, Oleg completed both his Bachelor's degree in Computer Science and his Master's degree in Autonomous Systems at Technische Universität Darmstadt. His master's thesis, entitled “Feature Extraction for Inverse Reinforcement Learning”, was written under the supervision of Gerhard Neumann and Christian Daniel.

Research Interests

Robots can learn to accomplish a given task by imitating previously observed demonstrations. However, in order to adapt to different situations, true imitation must go beyond blindly repeating demonstrated actions. Instead, imitation learning is deeply connected with the problem of inferring the intentions behind observed behaviour. Inverse Reinforcement Learning makes these intentions explicit by recovering a reward function that explains the demonstrations. Reinforcement Learning can then be applied for imitation learning by learning a policy that aims to maximize that reward function.
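The interplay described above can be sketched as an alternation between a reward-learning step and a policy-learning step. The following is a minimal illustrative sketch, not the method used in the publications below: it assumes a toy 5-state chain MDP with a hypothetical expert that always moves right, a linear reward r(s) = w·φ(s) over one-hot features, and feature matching as the IRL update.

```python
import numpy as np

# Hypothetical toy setup (illustration only): a 5-state chain MDP
# where the expert always moves right, toward state 4.
N_STATES, HORIZON = 5, 6

def step(s, a):
    """Deterministic chain dynamics; a is -1 (left) or +1 (right)."""
    return min(max(s + a, 0), N_STATES - 1)

def features(s):
    """One-hot state features phi(s)."""
    phi = np.zeros(N_STATES)
    phi[s] = 1.0
    return phi

def rollout(policy, s0=0):
    """Run the policy for HORIZON steps; return summed feature counts."""
    s, mu = s0, np.zeros(N_STATES)
    for _ in range(HORIZON):
        mu += features(s)
        s = step(s, policy[s])
    return mu

def rl_step(w):
    """'RL' step: finite-horizon value iteration for the linear
    reward r(s) = w . phi(s); returns the greedy policy."""
    V = w.copy()
    policy = np.ones(N_STATES, dtype=int)
    for _ in range(HORIZON):
        newV = np.empty(N_STATES)
        for s in range(N_STATES):
            q = {a: w[s] + V[step(s, a)] for a in (-1, 1)}
            policy[s] = max(q, key=q.get)
            newV[s] = q[policy[s]]
        V = newV
    return policy

# Expert demonstrations: always move right.
mu_expert = rollout(np.ones(N_STATES, dtype=int))

# 'IRL' step by feature matching: adjust the reward weights w until
# the learner's feature counts match the expert's.
w = np.zeros(N_STATES)
for _ in range(20):
    policy = rl_step(w)
    w += 0.1 * (mu_expert - rollout(policy))

learned = rl_step(w)
```

After a few iterations the learned reward makes the greedy policy reproduce the expert's feature expectations, which is the sense in which the recovered reward function "explains" the demonstrations.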

Many real-world applications such as autonomous driving have an intractable number of possible states. As even experts are usually not able to identify all relevant features, Inverse Reinforcement Learning depends on feature extraction for learning meaningful reward functions. Furthermore, intentions can be inferred at different levels of abstraction, e.g. steering a car to the right might serve the purpose of taking a corner, while taking that corner might itself serve the purpose of reaching a given destination. Hierarchical reward functions ease the task of both Inverse Reinforcement Learning and Reinforcement Learning by making it possible to reuse previously learned high-level goals and low-level strategies. The problem of building and utilizing such hierarchical decompositions provides an interesting route for future research.


Machine Learning, Robotics, Inverse Reinforcement Learning, Imitation Learning, Manipulation and Grasping, Hierarchical Learning, Feature Extraction, Reinforcement Learning

Key References

  1. Arenz, O.; Zhong, M.; Neumann, G. (2018). Efficient Gradient-Free Variational Inference using Policy Search, in: Dy, Jennifer and Krause, Andreas (eds.), Proceedings of the International Conference on Machine Learning (ICML), 80, pp. 234–243, PMLR.
  2. Arenz, O.; Abdulsamad, H.; Neumann, G. (2016). Optimal Control and Inverse Optimal Control by Distribution Matching, Proceedings of the International Conference on Intelligent Robots and Systems (IROS).
  3. Arenz, O. (2014). Feature Extraction for Inverse Reinforcement Learning, Master Thesis.

