Learning Operational Space Control
Operational space control (OSC) is one of the most elegant approaches to task control for complex, redundant robots. Its potential for dynamically consistent control, compliant control, force control, and hierarchical control has not been exhausted to date. Applications of OSC range from end-effector control of manipulators up to balancing and gait execution for humanoid robots [1].
(:youtube Q5UiPnXsFZA :) If the robot model is accurately known, operational space control is well understood, yielding a variety of solution alternatives. These choices include resolved motion rate control, resolved acceleration control, and direct force-based task-space control. However, as many new robotic systems are expected to operate safely in human environments, compliant, low-gain operational space control is desired. As a result, the practical use of operational space control becomes increasingly difficult in the presence of unmodeled nonlinearities, leading to reduced accuracy or even unpredictable and unstable null-space behavior in the robot system.
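For readers unfamiliar with resolved motion rate control, the classical model-based scheme maps a desired task-space velocity to joint velocities through a (pseudo-)inverse of the Jacobian. The following is a minimal sketch for an illustrative planar two-link arm; the link lengths, the analytic Jacobian, and the damping constant are assumptions for the example, not part of the project described above:

```python
import numpy as np

def jacobian_2link(q, l1=1.0, l2=1.0):
    """Analytic end-effector Jacobian of a planar two-link arm (illustrative)."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def resolved_rate_step(q, x_dot_des, damping=1e-2):
    """One resolved motion rate control step: compute joint velocities that
    realize a desired task-space velocity via a damped pseudo-inverse."""
    J = jacobian_2link(q)
    # Damped least squares avoids velocity blow-up near singularities.
    J_pinv = J.T @ np.linalg.inv(J @ J.T + damping * np.eye(2))
    return J_pinv @ x_dot_des

q = np.array([0.3, 0.8])                       # current joint configuration
qd = resolved_rate_step(q, np.array([0.1, 0.0]))  # joint velocity command
```

When the model (here, the Jacobian) is not accurately known, this mapping is exactly what the learning approaches below acquire from data instead.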
Learning control methods are a promising potential solution to this problem. However, learning methods do not easily provide the highly structured knowledge required in traditional operational space control laws, e.g., Jacobians, inertia matrices, and Coriolis/centripetal and gravity forces, since these terms are not always directly observable. They are therefore not suitable for formulating supervised learning as traditionally used in learning control approaches. In this project, we have designed novel approaches to learning operational space control that avoid extracting such structured knowledge and rather aim at learning the operational space control law directly, i.e., we pose OSC as a direct inverse model learning problem.

A first important insight for this project is that a physically correct solution to the inverse problem with redundant degrees of freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way [2, 3]. The second crucial component for our work is the insight that many operational space controllers can be understood in terms of a constrained optimal control problem [1]. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the machine learning point of view, this learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a simulated three-degree-of-freedom robot arm show that the approach always converges to the globally optimal solution if provided with sufficient data [3].
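The core of the expectation-maximization policy search used here can be illustrated on a toy problem. The sketch below shows the reward-weighted update idea in its simplest scalar form: sample actions from a Gaussian policy, weight them by their immediate reward, and refit the policy mean as a reward-weighted average. The toy reward function, the target value, and all constants are assumptions for illustration; the real algorithm in [2, 3] performs reward-weighted *regression* over state-dependent policies:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy immediate-reward problem: the optimal action is u_opt (illustrative).
u_opt = 1.5

def reward(u):
    # Exponentiated negative cost, so rewards are positive and usable as weights.
    return np.exp(-(u - u_opt) ** 2)

# Gaussian policy with mean theta; EM-style reward-weighted update:
#   E-step: weight sampled actions by their immediate reward,
#   M-step: new mean = reward-weighted average of the samples.
theta, sigma = 0.0, 1.0
for _ in range(20):
    u = rng.normal(theta, sigma, size=200)  # exploratory rollouts
    w = reward(u)                           # immediate rewards as weights
    theta = np.sum(w * u) / np.sum(w)       # reward-weighted M-step

# theta drifts toward the optimum u_opt as the iterations proceed.
```

Each iteration is a closed-form weighted least-squares fit, which is what makes this family of methods attractive for learning control laws online.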
The application to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex, high-degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on a real, physical Mitsubishi PA-10 robot arm [4] and a high-speed Barrett WAM robot arm. In future work, we will extend the kernel-based approaches from learning models for control to operational space control.
Contact persons: Jan Peters, Duy Nguyen-Tuong
Collaborators: Stefan Schaal (USC), Jun Nakanishi (ATR)
References
Peters, J.; Schaal, S. (2007). Reinforcement learning by reward-weighted regression for operational space control, Proceedings of the International Conference on Machine Learning (ICML 2007).

Peters, J.; Schaal, S. (2008). Learning to control in operational space, International Journal of Robotics Research (IJRR), 27, pp. 197-212.

Peters, J.; Nguyen-Tuong, D. (2008). Real-Time Learning of Resolved Velocity Control on a Mitsubishi PA-10, International Conference on Robotics and Automation (ICRA).