Publication Details

Reference Type: Conference Proceedings
Author(s): Schaal, S.; Peters, J.; Nakanishi, J.; Ijspeert, A.
Title: Control, planning, learning, and imitation with dynamic movement primitives
Journal/Conference/Book Title: Workshop on Bilateral Paradigms on Humans and Humanoids, IEEE International Conference on Intelligent Robots and Systems (IROS 2003)
Keywords: movement primitives, supervised learning, reinforcement learning, locomotion, phase resetting, learning from demonstration
Abstract: In both human and humanoid movement science, the topic of movement primitives has become central in understanding the generation of complex motion with high degree-of-freedom bodies. A theory of control, planning, learning, and imitation with movement primitives seems to be crucial in order to reduce the search space during motor learning and achieve a large level of autonomy and flexibility in dynamically changing environments. Movement recognition based on the same representations as used for movement generation, i.e., movement primitives, is equally intimately tied into these research questions. This paper discusses a comprehensive framework for motor control with movement primitives using a recently developed theory of dynamic movement primitives (DMP). DMPs are a formulation of movement primitives with autonomous nonlinear differential equations, whose time evolution creates smooth kinematic movement plans. Model-based control theory is used to convert such movement plans into motor commands. By means of coupling terms, on-line modifications can be incorporated into the time evolution of the differential equations, thus providing a rather flexible and reactive framework for motor planning and execution; indeed, DMPs form complete kinematic control policies, not just a particular desired trajectory. The linear parameterization of DMPs lends itself naturally to supervised learning from demonstrations. Moreover, the temporal, scale, and translation invariance of the differential equations with respect to these parameters provides a useful means for movement recognition. A novel reinforcement learning technique based on natural stochastic policy gradients allows a general approach of improving DMPs by trial and error learning with respect to almost arbitrary optimization criteria, including situations with delayed rewards. We demonstrate the different ingredients of the DMP approach in various examples, involving skill learning from demonstration on the humanoid robot DB and an application of learning simulated biped walking from a demonstrated trajectory, including self-improvement of the movement patterns in the spirit of energy efficiency through resonance tuning.
Link to PDF: [0].pdf

