Publication Details

Reference Type: Conference Paper
Author(s): Muratore, F.; Gienger, M.; Peters, J.
Title: Assessing Transferability in Reinforcement Learning from Randomized Simulations
Journal/Conference/Book Title: Reinforcement Learning and Decision Making (RLDM)
Keywords: domain randomization, simulation optimization, sim-2-real
Abstract: Exploration-based reinforcement learning of control policies on physical systems is generally time-intensive and can lead to catastrophic failures. Therefore, simulation-based policy search appears to be an appealing alternative. Unfortunately, running policy search on a slightly faulty simulator can easily lead to the maximization of the ‘Simulation Optimization Bias’ (SOB), where the policy exploits modeling errors of the simulator such that the resulting behavior can potentially damage the device. For this reason, much work in reinforcement learning has focused on model-free methods. The resulting lack of safe simulation-based policy learning techniques imposes severe limitations on the application of reinforcement learning to real-world systems. In this paper, we explore how physics simulations can be utilized for a robust policy optimization by randomizing the simulator’s parameters and training from model ensembles. We propose an algorithm called Simulation-based Policy Optimization with Transferability Assessment (SPOTA) that uses an estimator of the SOB to formulate a stopping criterion for training. We show that the simulation-based policy search algorithm is able to learn a control policy exclusively from a randomized simulator that can be applied directly to a different system without using any data from the latter.

