Publication Details

Reference Type: Unpublished Work
Author(s): Arenz, O.; Neumann, G.
Title: Iterative Cost Learning from Different Types of Human Feedback
Journal/Conference/Book Title: IROS 2016 Workshop on Human-Robot Collaboration
Keywords: Inverse Reinforcement Learning, Preference Learning
Abstract: Human-robot collaboration in unstructured environments often involves different types of interactions. These interactions usually occur frequently during normal operation and may provide valuable information about the task to the robot. It is therefore sensible to utilize this data for lifelong robot learning. Learning from human interactions is an active field of research, e.g., Inverse Reinforcement Learning, which aims at learning from demonstrations, or Preference Learning, which aims at learning from human preferences. However, learning from a combination of different types of feedback is still little explored. In this paper, we propose a method for inferring a reward function from a combination of expert demonstrations, pairwise preferences, star ratings as well as oracle-based evaluations of the true reward function. Our method extends Maximum Entropy Inverse Reinforcement Learning in order to account for the additional types of human feedback by framing them as constraints to the original optimization problem. We demonstrate on a gridworld that the resulting optimization problem can be solved based on the Alternating Direction Method of Multipliers (ADMM), even when confronted with a large amount of training data.
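The starting point of the method described in the abstract, Maximum Entropy Inverse Reinforcement Learning, fits reward weights so that the learner's expected feature counts match those of the expert's demonstrations. A minimal illustrative sketch on a toy 1-D gridworld follows; the environment, the one-hot features, the hand-picked expert visitation, and all variable names are assumptions for illustration only and are not taken from the paper, which additionally incorporates preferences, ratings, and oracle evaluations as constraints solved via ADMM:

```python
import numpy as np

n_states, n_actions = 5, 2      # tiny 1-D gridworld; actions: 0 = left, 1 = right
gamma, horizon = 0.9, 20

def step(s, a):
    """Deterministic transition: move left or right, clipped to the grid."""
    return max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)

features = np.eye(n_states)     # one-hot state features; reward r = features @ w

def soft_value_iteration(r):
    """Soft (log-sum-exp) Bellman backups; returns a stochastic policy pi[s, a]."""
    q = np.zeros((n_states, n_actions))
    for _ in range(horizon):
        m = q.max(axis=1, keepdims=True)
        v = (m + np.log(np.exp(q - m).sum(axis=1, keepdims=True))).ravel()
        q = np.array([[r[step(s, a)] + gamma * v[step(s, a)]
                       for a in range(n_actions)] for s in range(n_states)])
    m = q.max(axis=1, keepdims=True)
    return np.exp(q - m) / np.exp(q - m).sum(axis=1, keepdims=True)

def state_visitation(pi, start=0, steps=20):
    """Normalized expected state-visitation frequencies under policy pi."""
    d = np.zeros(n_states); d[start] = 1.0
    total = d.copy()
    for _ in range(steps):
        nxt = np.zeros(n_states)
        for s in range(n_states):
            for a in range(n_actions):
                nxt[step(s, a)] += d[s] * pi[s, a]
        d = nxt
        total += d
    return total / total.sum()

# Hypothetical "expert" who spends most time in the rightmost state.
expert_visitation = np.array([0.05, 0.05, 0.10, 0.20, 0.60])

w = np.zeros(n_states)          # reward weights to learn
for _ in range(200):
    pi = soft_value_iteration(features @ w)
    gap = expert_visitation - state_visitation(pi)  # feature-expectation gap
    w += 0.5 * gap              # gradient ascent on the max-ent objective

# The learned reward should be highest for the states the expert frequents.
```

With one-hot features, the gradient reduces to the gap between the expert's and the learner's state-visitation frequencies; the paper's contribution is to constrain this optimization with the other feedback types and solve the constrained problem via ADMM, which this sketch does not attempt.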

