Publication Details
You can download this complete BibTeX reference list as all-ias-publications.bib.

Reference Type | Journal Article
Author(s) | Look, A.; Rakitsch, B.; Kandemir, M.; Peters, J. |
Year | submitted |
Title | Sampling-Free Probabilistic Deep State-Space Models |
Journal/Conference/Book Title | Submitted to Transactions on Pattern Analysis and Machine Intelligence (PAMI) |
Link to PDF | https://arxiv.org/pdf/2309.08256.pdf |
Reference Type | Journal Article |
Author(s) | Klink, P.; Wolf, F.; Ploeger, K.; Peters, J.; Pajarinen, J.
Year | submitted |
Title | Tracking Control for a Spherical Pendulum via Curriculum Reinforcement Learning |
Journal/Conference/Book Title | Submitted to the IEEE Transactions on Robotics (T-RO)
URL(s) | https://arxiv.org/abs/2309.14096 |
Link to PDF | https://arxiv.org/pdf/2309.14096.pdf |
Reference Type | Journal Article |
Author(s) | Watson, J.; Song, C.; Weeger, O.; Gruner, T.; Le, A.T.; Hansel, K.; Hendawy, A.; Arenz, O.; Trojak, W.; Cranmer, M.; D'Eramo, C.; Bülow, F.; Goyal, T.; Peters, J.; Hoffman, M.W.
Year | submitted |
Title | Machine Learning with Physics Knowledge for Prediction: A Survey |
Journal/Conference/Book Title | Transactions on Machine Learning Research (TMLR) |
Link to PDF | https://www.arxiv.org/pdf/2408.09840 |
Reference Type | Journal Article |
Author(s) | Carvalho, J.; Le, A.T.; Jahr, P.; Sun, Q.; Urain, J.; Koert, D.; Peters, J.
Year | submitted |
Title | Grasp Diffusion Network: Learning Grasp Generators from Partial Point Clouds with Diffusion Models in SO(3)×R3
Journal/Conference/Book Title | Submitted to the IEEE Robotics and Automation Letters (R-AL) |
URL(s) | https://sites.google.com/view/graspdiffusionnetwork |
Link to PDF | https://arxiv.org/abs/2412.08398 |
Reference Type | Journal Article |
Author(s) | Palenicek, D.; Lutter, M.; Carvalho, J.; Dennert, D.; Ahmad, F.; Peters, J. |
Year | submitted |
Title | Diminishing Return of Value Expansion Methods |
Journal/Conference/Book Title | Submitted to the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)
Link to PDF | https://arxiv.org/pdf/2412.20537 |
Reference Type | Journal Article |
Author(s) | Carvalho, J.; Le, A.T.; Kicki, P.; Koert, D.; Peters, J.
Year | submitted |
Title | Motion Planning Diffusion: Learning and Adapting Robot Motion Planning with Diffusion Models |
Journal/Conference/Book Title | Submitted to the IEEE Transactions on Robotics (T-RO)
URL(s) | https://sites.google.com/view/motionplanningdiffusion |
Link to PDF | https://arxiv.org/abs/2412.19948 |
Reference Type | Journal Article |
Author(s) | Le, A. T.; Hansel, K.; Carvalho, J.; Watson, J.; Urain, J.; Biess, A.; Chalvatzaki, G.; Peters, J. |
Year | submitted |
Title | Global Tensor Motion Planning |
Journal/Conference/Book Title | Submitted to the IEEE Robotics and Automation Letters (R-AL) |
Link to PDF | https://arxiv.org/pdf/2411.19393 |
Reference Type | Journal Article |
Author(s) | Klink, P.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | in press |
Title | On the Benefit of Optimal Transport for Curriculum Reinforcement Learning |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PascalKlink/benefit-ot-crl-pami-with-ap.pdf |
Reference Type | Journal Article |
Author(s) | Luis, C.E.; Bottero, A.G.; Vinogradska, J.; Berkenkamp, F.; Peters, J. |
Year | in press |
Title | Uncertainty Representations in State-Space Layers for Deep Reinforcement Learning under Partial Observability |
Journal/Conference/Book Title | Transactions on Machine Learning Research (TMLR) |
Link to PDF | https://arxiv.org/pdf/2409.16824 |
Reference Type | Journal Article |
Author(s) | Prasad, V.; Heitlinger, L.; Koert, D.; Stock-Homburg, R.; Peters, J.; Chalvatzaki, G.
Year | conditionally accepted |
Title | Learning Multimodal Latent Dynamics for Human-Robot Interaction |
Journal/Conference/Book Title | Submitted to the IEEE Transactions on Robotics (T-RO)
Link to PDF | https://arxiv.org/abs/2311.16380 |
Reference Type | Journal Article |
Author(s) | Liu, P.; Bou-Ammar, H.; Peters, J.; Tateo, D.
Year | conditionally accepted |
Title | Safe Reinforcement Learning on the Constraint Manifold: Theory and Applications |
Journal/Conference/Book Title | Submitted to the IEEE Transactions on Robotics (T-RO)
Keywords | Safe Reinforcement Learning, Constraint Manifold, Safe Exploration |
Abstract | Integrating learning-based techniques, especially reinforcement learning, into robotics is promising for solving complex problems in unstructured environments. However, most existing approaches are trained in well-tuned simulators and subsequently deployed on real robots without online fine-tuning. In this setting, the simulation's realism seriously impacts the deployment's success rate. Instead, learning with real-world interaction data offers a promising alternative: it not only eliminates the need for a fine-tuned simulator but also applies to a broader range of tasks where accurate modeling is unfeasible. One major problem for on-robot reinforcement learning is ensuring safety, as uncontrolled exploration can cause catastrophic damage to the robot or the environment. Indeed, safety specifications, often represented as constraints, can be complex and non-linear, making safety challenging to guarantee in learning systems. In this paper, we show how we can impose complex safety constraints on learning-based robotics systems in a principled manner, both from theoretical and practical points of view. Our approach is based on the concept of the Constraint Manifold, representing the set of safe robot configurations. Exploiting differential geometry techniques, i.e., the tangent space, we can construct a safe action space, allowing learning agents to sample arbitrary actions while ensuring safety. We demonstrate the method's effectiveness in a real-world Robot Air Hockey task, showing that our method can handle high-dimensional tasks with complex constraints. Videos of the real robot experiments are available on the project website.
URL(s) | https://puzeliu.github.io/TRO-ATACOM |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PuzeLiu/TRO_ATACOM_submitted.pdf |
Reference Type | Journal Article |
Author(s) | Urain, J.; Mandlekar, A.; Du, Y.; Shafiullah, M.; Xu, D.; Fragkiadaki, K.; Chalvatzaki, G.; Peters, J. |
Year | conditionally accepted |
Title | Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations |
Journal/Conference/Book Title | IEEE Transactions on Robotics (T-RO) |
Link to PDF | https://arxiv.org/pdf/2408.04380 |
Reference Type | Journal Article |
Author(s) | Vincent, T.; Palenicek, D.; Belousov, B.; Peters, J.; D'Eramo, C. |
Year | 2025 |
Title | Iterated Q-Network: Beyond One-Step Bellman Updates in Deep Reinforcement Learning |
Journal/Conference/Book Title | Transactions on Machine Learning Research (TMLR) |
Keywords | deep reinforcement learning, temporal difference, approximate value iteration |
Abstract | The vast majority of Reinforcement Learning methods are largely impacted by the computational effort and data requirements needed to obtain effective estimates of action-value functions, which in turn determine the quality of the overall performance and the sample-efficiency of the learning procedure. Typically, action-value functions are estimated through an iterative scheme that alternates the application of an empirical approximation of the Bellman operator and a subsequent projection step onto a considered function space. It has been observed that this scheme can be potentially generalized to carry out multiple iterations of the Bellman operator at once, benefiting the underlying learning algorithm. However, until now, it has been challenging to effectively implement this idea, especially in high-dimensional problems. In this paper, we introduce iterated Q-Network (i-QN), a novel principled approach that enables multiple consecutive Bellman updates by learning a tailored sequence of action-value functions where each serves as the target for the next. We show that i-QN is theoretically grounded and that it can be seamlessly used in value-based and actor-critic methods. We empirically demonstrate the advantages of i-QN in Atari 2600 games and MuJoCo continuous control problems.
Link to PDF | https://arxiv.org/pdf/2403.02107 |
Reference Type | Journal Article |
Author(s) | Funk, N.; Chen, C.; Schneider, T.; Chalvatzaki, G.; Calandra, R.; Peters, J. |
Year | 2025 |
Title | On the Importance of Tactile Sensing for Imitation Learning: A Case Study on Robotic Match Lighting |
Journal/Conference/Book Title | ICRA 2025 Workshop on “Towards Human Level Intelligence Vision and Tactile Sensing” |
URL(s) | https://sites.google.com/view/tactile-il |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NiklasFunk/LfTactileDemos_18_04.pdf |
Reference Type | Conference Paper |
Author(s) | Alt, B.; Kienle, C.; Katic, D.; Jäkel, R.; Beetz, M. |
Year | 2025 |
Title | Shadow Program Inversion with Differentiable Planning: A Framework for Unified Robot Program Parameter and Trajectory Optimization |
Journal/Conference/Book Title | IEEE International Conference on Robotics and Automation (ICRA) |
URL(s) | https://arxiv.org/abs/2409.08678 |
Link to PDF | https://arxiv.org/pdf/2409.08678 |
Reference Type | Conference Paper |
Author(s) | Kienle, C.; Alt, B.; Katic, D.; Jäkel, R.; Peters, J. |
Year | 2025 |
Title | QueryCAD: Grounded Question Answering for CAD Models |
Journal/Conference/Book Title | IEEE International Conference on Robotics and Automation (ICRA) |
URL(s) | https://arxiv.org/abs/2409.08704 |
Link to PDF | https://arxiv.org/pdf/2409.08704 |
Reference Type | Conference Proceedings |
Author(s) | Vincent, T.; Wahren, F.; Peters, J.; Belousov, B.; D'Eramo, C. |
Year | 2025 |
Title | Adaptive Q-Network: On-the-fly Target Selection for Deep Reinforcement Learning |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Link to PDF | https://arxiv.org/pdf/2405.16195 |
Reference Type | Conference Proceedings |
Author(s) | Diwan, A.A.; Urain, J.; Kober, J.; Peters, J. |
Year | 2025 |
Title | Noise-conditioned Energy-based Annealed Rewards (NEAR): A Generative Framework for Imitation Learning from Observation |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Abstract | This paper introduces a new imitation learning framework based on energy-based generative models capable of learning complex, physics-dependent, robot motion policies through state-only expert motion trajectories. Our algorithm, called Noise-conditioned Energy-based Annealed Rewards (NEAR), constructs several perturbed versions of the expert's motion data distribution and learns smooth and well-defined representations of the data distribution's energy function using denoising score matching. We propose to use these learnt energy functions as reward functions to learn imitation policies via reinforcement learning. We also present a strategy to gradually switch between the learnt energy functions, ensuring that the learnt rewards are always well-defined in the manifold of policy-generated samples. We evaluate our algorithm on complex humanoid tasks such as locomotion and martial arts and compare it with state-only adversarial imitation learning algorithms like Adversarial Motion Priors (AMP). Our framework sidesteps the optimisation challenges of adversarial imitation learning techniques and produces results comparable to AMP in several quantitative metrics across multiple imitation settings.
URL(s) | https://anishhdiwan.github.io/noise-conditioned-energy-based-annealed-rewards/ |
Link to PDF | https://arxiv.org/pdf/2501.14856 |
Reference Type | Conference Proceedings |
Author(s) | Straub, D.; Niehues, T.F.; Peters, J.; Rothkopf, C.A. |
Year | 2025 |
Title | Inverse decision-making using neural amortized Bayesian actors |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Reference Type | Journal Article |
Author(s) | Huang, J.; Tateo, D.; Liu, P.; Peters, J. |
Year | 2025 |
Title | Adaptive Control based Friction Estimation for Tracking Control of Robot Manipulators |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters, and IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | Model Learning for Control; Robust/Adaptive Control; Calibration and Identification |
Abstract | Adaptive control is often used for friction compensation in trajectory tracking tasks because it does not require torque sensors. However, it has some drawbacks: first, the most common certainty-equivalence adaptive control design is based on linearized parameterization of the friction model, therefore nonlinear effects, including the stiction and Stribeck effect, are usually omitted. Second, the adaptive control-based estimation can be biased due to non-zero steady-state error. Third, neglecting unknown model mismatch could result in non-robust estimation. This paper proposes a novel linear parameterized friction model capturing the nonlinear static friction phenomenon. Subsequently, an adaptive control-based friction estimator is proposed to reduce the bias during estimation based on backstepping. Finally, we propose an algorithm to generate excitation for robust estimation. Using a KUKA iiwa 14, we conducted trajectory tracking experiments to evaluate the estimated friction model, including random Fourier and drawing trajectories, showing the effectiveness of our methodology in different control schemes. |
Volume | 10 |
Number | 3
Pages | 2454-2461 |
URL(s) | https://ieeexplore.ieee.org/document/10842510 |
Link to PDF | https://arxiv.org/pdf/2409.05054v2 |
Reference Type | Conference Proceedings |
Author(s) | Duret, G.; Bourennane, Y.; Mazurak, D.; Samsonenko, A.; Zara, F.; Chen, L.; Peters, J. |
Year | 2025 |
Title | Facilitate and scale up the creation of 3D meshes and 6D category-based datasets with generative models: GenVegeFruits |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Image Processing (ICIP) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/20250206023159_734992_2500.pdf |
Reference Type | Conference Proceedings |
Author(s) | Palenicek, D.; Vogt, F.; Peters, J. |
Year | 2025 |
Title | Scaling Off-Policy Reinforcement Learning with Batch and Weight Normalization |
Journal/Conference/Book Title | Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Link to PDF | https://arxiv.org/pdf/2502.07523 |
Reference Type | Conference Proceedings |
Author(s) | Bohlinger, N.; Peters, J. |
Year | 2025 |
Title | Massively Scaling Explicit Policy-conditioned Value Functions |
Journal/Conference/Book Title | Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Keywords | Deep Reinforcement Learning, Scaling Laws, Value Functions |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NicoBohlinger/pssvf_rldm.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bohlinger, N.; Czechmanowski, G.; Krupka, M.; Kicki, P.; Walas, K.; Peters, J.; Tateo, D. |
Year | 2025 |
Title | Morphology-Aware Legged Locomotion with Reinforcement Learning |
Journal/Conference/Book Title | German Robotics Conference (GRC) |
Keywords | Multi-Embodiment, Reinforcement Learning, Locomotion |
URL(s) | https://github.com/nico-bohlinger/one_policy_to_run_them_all_website |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NicoBohlinger/grc_2025.pdf |
Reference Type | Conference Proceedings |
Author(s) | Carvalho, J.; Le, A.; Jahr, P.; Sun, Q.; Urain, J.; Koert, D.; Peters, J.
Year | 2025 |
Title | Grasp Diffusion Network: Learning Grasp Generators from Partial Point Clouds with Diffusion Models in SO(3)×R3 |
Journal/Conference/Book Title | German Robotics Conference (GRC) |
URL(s) | https://sites.google.com/view/graspdiffusionnetwork |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoaoCarvalho/2025_carvalho_GDN_GRC.pdf |
Reference Type | Conference Proceedings |
Author(s) | Chen, J.; Kshirsagar, A.; Heller, F.; Gomez Andreu, M.; Belousov, B.; Schneider, T.; Lin, L. P. Y.; Doerschner, K.; Drewing, K.; Peters, J. |
Year | 2025 |
Title | Active Sampling for Hardness Classification with Vision-Based Tactile Sensors |
Journal/Conference/Book Title | German Robotics Conference (GRC) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/AlapKshirsagar/GRC25_0204_FI_Tactile_Hardness.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nonnengiesser, F.; Kshirsagar, A.; Belousov, B.; Peters, J. |
Year | 2025 |
Title | Visuotactile In-Hand Pose Estimation |
Journal/Conference/Book Title | German Robotics Conference (GRC) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/AlapKshirsagar/GRC25_0142_FI_In_Hand_pose.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen, D.H.; Schneider, T.; Duret, G.; Kshirsagar, A.; Belousov, B.; Peters, J. |
Year | 2025 |
Title | TacEx: GelSight Tactile Simulation in Isaac Sim – Combining Soft-Body and Visuotactile Simulators |
Journal/Conference/Book Title | German Robotics Conference (GRC) |
Link to PDF | https://arxiv.org/pdf/2411.04776 |
Reference Type | Conference Proceedings |
Author(s) | Vincent, T.; Faust, T.; Tripathi, Y.; Peters, J.; D'Eramo, C. |
Year | 2025 |
Title | Eau De Q-Network: Adaptive Distillation of Neural Networks in Deep Reinforcement Learning |
Journal/Conference/Book Title | Conference on Reinforcement Learning and Decision Making (RLDM) |
Keywords | Deep Reinforcement Learning, Sparse Training, Distillation |
Reference Type | Conference Proceedings |
Author(s) | Pompetzki, K.; Le, A. T.; Gruner, T.; Watson, J.; Chalvatzaki, G.; Peters, J. |
Year | 2025 |
Title | Geometrically-Aware Goal Inference: Leveraging Motion Planning as Inference |
Journal/Conference/Book Title | German Robotics Conference (GRC) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/KayHansel/grc_2025_geometrically_aware_goal_inference.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lenz, J.; Pfenning, I.; Gruner, T.; Palenicek, D.; Schneider, T.; Peters, J. |
Year | 2025 |
Title | Exploring the Role of Vision and Touch in Reinforcement Learning for Dexterous Insertion Tasks |
Journal/Conference/Book Title | Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/TheoGruner/rldm_visual_tactile_rl_preprint.pdf |
Reference Type | Conference Proceedings |
Author(s) | Celik, O.; Li, Z.; Blessing, D.; Li, G.; Palenicek, D.; Peters, J.; Chalvatzaki, G.; Neumann, G. |
Year | 2025 |
Title | Diffusion-Based Maximum Entropy Reinforcement Learning |
Journal/Conference/Book Title | 7th Robot Learning Workshop: Towards Robots with Human-Level Abilities (ICLR) |
Reference Type | Conference Proceedings |
Author(s) | Scherer, C. F.; Tölle, M.; Gruner, T.; Palenicek, D.; Schneider, T.; Schramowski, P.; Belousov, B.; Peters, J. |
Year | 2025 |
Title | AllmAN: A German Vision-Language-Action Model |
Journal/Conference/Book Title | German Robotics Conference (GRC) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/MaximilianTölle/AllmAN_A_German_Vision_Language_Action_Model.pdf |
Reference Type | Conference Proceedings |
Author(s) | Toelle, M.; Gruner, T.; Palenicek, D.; Guenster, J.; Liu, P.; Watson, J.; Tateo, D.; Peters, J. |
Year | 2025 |
Title | Towards Safe Robot Foundation Models |
Journal/Conference/Book Title | German Robotics Conference (GRC) |
URL(s) | https://sites.google.com/robot-learning.de/towards-safe-rfm |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/TheoGruner/grc_2025_srfm.pdf |
Reference Type | Conference Paper |
Author(s) | Schneider, T.; de Farias, C.; Calandra, R.; Chen, L.; Peters, J.
Year | 2025 |
Title | Active Perception for Tactile Sensing: A Task-Agnostic Attention-Based Approach |
Journal/Conference/Book Title | German Robotics Conference (GRC) |
Keywords | Force and Tactile Sensing; Perception-Action Coupling; Reinforcement Learning |
Abstract | Humans make extensive use of haptic exploration to map and identify the properties of the objects that they touch. Also, in robotics, the use of active tactile perception has emerged as an important research domain that complements vision for tasks such as object classification, shape reconstruction, and manipulation. In this work, we introduce TAP (Task-agnostic Active Perception) – a novel framework that leverages reinforcement learning (RL) and transformer-based architectures to address the challenges posed by partially observable environments. TAP integrates Soft Actor-Critic (SAC) and CrossQ algorithms within a unified optimization objective, jointly training a perception module and decision-making policy. By design, TAP is task-agnostic and can, in principle, generalize to any active perception problem. We evaluate TAP across diverse tasks, including toy examples and a realistic application involving haptic exploration of 3D models of handwritten digits. Experiments demonstrate the efficacy of TAP, achieving a classification accuracy of 92% on Tactile MNIST. These findings underscore the potential of TAP as a versatile and generalizable framework for advancing active tactile perception in robotics.
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/TimSchneider_RIG2025_TAP.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bohlinger, N.; Kinzel, J.; Palenicek, D.; Antczak, L.; Peters, J. |
Year | 2025 |
Title | Gait in Eight: Efficient On-Robot Learning for Omnidirectional Quadruped Locomotion |
Journal/Conference/Book Title | Under review |
Keywords | Reinforcement Learning, Locomotion |
Abstract | On-robot Reinforcement Learning is a promising approach to train embodiment-aware policies for legged robots. However, the computational constraints of real-time learning on robots pose a significant challenge. We present a framework for efficiently learning quadruped locomotion in just 8 minutes of raw real-time training, utilizing the sample efficiency and minimal computational overhead of the new off-policy algorithm CrossQ. We investigate two control architectures: Predicting joint target positions for agile, high-speed locomotion and Central Pattern Generators for stable, natural gaits. While prior work focused on learning simple forward gaits, our framework extends on-robot learning to omnidirectional locomotion. We demonstrate the robustness of our approach in different indoor and outdoor environments and provide the videos and code for our experiments at: https://nico-bohlinger.github.io/gait_in_eight_website
URL(s) | https://nico-bohlinger.github.io/gait_in_eight_website/ |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NicoBohlinger/gait_in_eight.pdf |
Reference Type | Conference Proceedings |
Author(s) | Stasica, M.; Bick, A.; Bohlinger, N.; Mohseni, O.; Fritzsche, J.; Hübler, C.; Peters, J.; Seyfarth, A. |
Year | 2025 |
Title | Bridge the Gap: Enhancing Quadruped Locomotion with Vertical Ground Perturbations |
Journal/Conference/Book Title | Under review |
Keywords | Reinforcement Learning, Locomotion |
Abstract | Legged robots, particularly quadrupeds, excel at navigating rough terrains, yet their performance under vertical ground perturbations, such as those from oscillating surfaces, remains underexplored. This study introduces a novel approach to enhance quadruped locomotion robustness by training the Unitree Go2 robot on an oscillating bridge—a 13.24-meter steel-and-concrete structure with a 2 Hz eigenfrequency designed to perturb locomotion. Using Reinforcement Learning (RL) with the Proximal Policy Optimization (PPO) algorithm in a MuJoCo simulation, we trained 15 distinct locomotion policies, combining five gaits (trot, pace, bound, free, default) with three training conditions: rigid bridge and two oscillating bridge setups with differing height regulation strategies (relative to bridge surface or ground). Domain randomization ensured zero-shot transfer to the real-world bridge. Our results demonstrate that policies trained on the oscillating bridge exhibit superior stability and adaptability compared to those trained on rigid surfaces. Our framework enables robust gait patterns even without prior bridge exposure. These findings highlight the potential of simulation-based RL to improve quadruped locomotion during dynamic ground perturbations, offering insights for designing robots capable of traversing vibrating environments. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NicoBohlinger/bridge_the_gap.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kienle, C.; Alt, B.; Schneider, F.; Pertlwieser, T.; Jäkel, R.; Rayyes, R.
Year | 2025 |
Title | AI-based Framework for Robust Model-Based Connector Mating in Robotic Wire Harness Installation |
Journal/Conference/Book Title | IEEE International Conference on Automation Science and Engineering (CASE) |
URL(s) | https://arxiv.org/abs/2503.09409 |
Link to PDF | https://arxiv.org/pdf/2503.09409 |
Reference Type | Conference Proceedings |
Author(s) | Le, A.T.; Pompetzki, K.; Peters, J.; Biess, A. |
Year | 2025 |
Title | Kinematics Correspondence As Inexact Graph Matching |
Journal/Conference/Book Title | German Robotics Conference (GRC) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/AnThaiLe/imcorr.pdf |
Reference Type | Conference Proceedings |
Author(s) | Moyen, S. B.; Krohn, R.; Lueth, S.; Pompetzki, K.; Chalvatzaki, G. |
Year | 2025 |
Title | Intuitive User Interfaces for Mobile Manipulation |
Journal/Conference/Book Title | German Robotics Conference (GRC) |
Link to PDF | 2025__intuitive_user_interfaces_for_mobile_manipulation__grc.pdf |
Reference Type | Conference Proceedings |
Author(s) | Koosha, T. A.; Kshirsagar, A.; Augustat, N.; Hahne, F.; Mühl, D.; Melzig, C. A.; Bremmer, F.; Peters, J.; Endres, D. M. |
Year | 2025 |
Title | Staring Down the Elevator Shaft: Postural Responses to Virtual Heights in an Indoor Environment |
Journal/Conference/Book Title | Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci) |
Keywords | Cognitive Neuroscience ; Action ; Behavioral Science ; Motor control ; Quantitative Behavior |
Link to PDF | /uploads/Team/AlapKshirsagar/2025-cogsci-fearbalance-init.pdf |
Reference Type | Journal Article |
Author(s) | Chowdhury, A.; Maurer, H.; Kshirsagar, A.; Ploeger, K.; Peters, J.; Mueller, H. |
Year | 2025 |
Title | The Earlier You Know, the Smoother You Act |
Journal/Conference/Book Title | Conference of the Human Movement Science Section of the German Association of Sports Science |
URL(s) | https://mediatum.ub.tum.de/1774621 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/AlapKshirsagar/2025-dmc-juggling.pdf |
Reference Type | Journal Article |
Author(s) | Dam, T.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | 2024 |
Title | A Unified Perspective on Value Backup and Exploration in Monte-Carlo Tree Search |
Journal/Conference/Book Title | Journal of Artificial Intelligence Research (JAIR) |
Volume | 81 |
Pages | 511-577 |
Link to PDF | https://arxiv.org/pdf/2202.07071 |
Reference Type | Journal Article |
Author(s) | Abdulsamad, H.; Nickl, P.; Klink, P.; Peters, J. |
Year | 2024 |
Title | Variational Hierarchical Mixtures for Probabilistic Learning of Inverse Dynamics |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) |
Volume | 46 |
Number | 4 |
Pages | 1950-1963 |
Link to PDF | https://arxiv.org/pdf/2211.01120.pdf |
Reference Type | Journal Article |
Author(s) | Kicki, P.; Liu, P.; Tateo, D.; Bou Ammar, H.; Walas, K.; Skrzypczynski, P.; Peters, J. |
Year | 2024 |
Title | Fast Kinodynamic Planning on the Constraint Manifold with Deep Neural Networks |
Journal/Conference/Book Title | IEEE Transactions on Robotics (T-Ro), and Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Volume | 40 |
Pages | 277-297 |
Link to PDF | https://arxiv.org/pdf/2301.04330.pdf |
Reference Type | Conference Paper |
Author(s) | Bhatt, A.; Palenicek, D.; Belousov, B.; Argus, M.; Amiranashvili, A.; Brox, T.; Peters, J. |
Year | 2024 |
Title | CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Abstract | Sample efficiency is a crucial problem in deep reinforcement learning. Recent algorithms, such as REDQ and DroQ, found a way to improve the sample efficiency by increasing the update-to-data (UTD) ratio to 20 gradient update steps on the critic per environment sample. However, this comes at the expense of a greatly increased computational cost. To reduce this computational burden, we introduce CrossQ: a lightweight algorithm that makes careful use of Batch Normalization and removes target networks to surpass the state-of-the-art in sample efficiency while maintaining a low UTD ratio of 1. Notably, CrossQ does not rely on advanced bias-reduction schemes used in current methods. CrossQ's contributions are thus threefold: (1) state-of-the-art sample efficiency, (2) substantial reduction in computational cost compared to REDQ and DroQ, and (3) ease of implementation, requiring just a few lines of code on top of SAC. |
Volume | Spotlight |
URL(s) | http://aditya.bhatts.org/CrossQ/ |
Link to PDF | https://openreview.net/pdf?id=PczQtTsTIX |
Reference Type | Conference Proceedings |
Author(s) | Derstroff, C.; Brugger, J.; Cerrato, M.; Peters, J.; Kramer, S. |
Year | 2024 |
Title | Peer Learning: Learning Complex Policies in Groups from Scratch via Action Recommendations |
Journal/Conference/Book Title | Proceedings of the National Conference on Artificial Intelligence (AAAI) |
Link to PDF | https://arxiv.org/pdf/2312.09950.pdf |
Reference Type | Conference Proceedings |
Author(s) | Vincent, T.; Metelli, A.; Belousov, B.; Peters, J.; Restelli, M.; D'Eramo, C. |
Year | 2024 |
Title | Parameterized Projected Bellman Operator |
Journal/Conference/Book Title | Proceedings of the National Conference on Artificial Intelligence (AAAI) |
Link to PDF | https://arxiv.org/pdf/2312.12869.pdf |
Reference Type | Journal Article |
Author(s) | Funk, N.; Helmut, E.; Chalvatzaki, G.; Calandra, R.; Peters, J. |
Year | 2024 |
Title | Evetac: An Event-based Optical Tactile Sensor for Robotic Manipulation |
Journal/Conference/Book Title | IEEE Transactions on Robotics (T-RO) |
Volume | 40 |
Pages | 3812-3832 |
ISBN/ISSN | 1941-0468 |
URL(s) | https://sites.google.com/view/evetac |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NiklasFunk/Evetac_journal.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tiboni, G.; Klink, P.; Peters, J.; Tommasi, T.; D'Eramo, C.; Chalvatzaki, G. |
Year | 2024 |
Title | Domain Randomization via Entropy Maximization |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Link to PDF | https://arxiv.org/pdf/2311.01885.pdf |
Reference Type | Conference Proceedings |
Author(s) | Hahne, F.; Prasad, V.; Kshirsagar, A.; Koert, D.; Stock-Homburg, R. M.; Peters, J.; Chalvatzaki, G.
Year | 2024 |
Title | Transition State Clustering for Interaction Segmentation and Learning |
Journal/Conference/Book Title | ACM/IEEE International Conference on Human Robot Interaction (HRI), Late Breaking Report |
Link to PDF | https://arxiv.org/pdf/2402.14548.pdf |
Reference Type | Conference Proceedings |
Author(s) | Goeksu, Y.; Almeida-Correia, A.; Prasad, V.; Kshirsagar, A.; Koert, D.; Peters, J.; Chalvatzaki, G. |
Year | 2024 |
Title | Kinematically Constrained Human-like Bimanual Robot-to-Human Handovers |
Journal/Conference/Book Title | ACM/IEEE International Conference on Human-Robot Interaction (HRI), Late Breaking Report |
Link to PDF | https://arxiv.org/pdf/2402.14525.pdf |
Reference Type | Conference Proceedings |
Author(s) | Hendawy, A.; Peters, J.; D'Eramo, C. |
Year | 2024 |
Title | Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Abstract | Multi-Task Reinforcement Learning (MTRL) tackles the long-standing problem of endowing agents with skills that generalize across a variety of problems. To this end, sharing representations plays a fundamental role in capturing both unique and common characteristics of the tasks. Tasks may exhibit similarities in terms of skills, objects, or physical properties while leveraging their representations eases the achievement of a universal policy. Nevertheless, the pursuit of learning a shared set of diverse representations is still an open challenge. In this paper, we introduce a novel approach for representation learning in MTRL that encapsulates common structures among the tasks using orthogonal representations to promote diversity. Our method, named Mixture Of Orthogonal Experts (MOORE), leverages a Gram-Schmidt process to shape a shared subspace of representations generated by a mixture of experts. When task-specific information is provided, MOORE generates relevant representations from this shared subspace. We assess the effectiveness of our approach on two MTRL benchmarks, namely MiniGrid and MetaWorld, showing that MOORE surpasses related baselines and establishes a new state-of-the-art result on MetaWorld. |
URL(s) | https://arxiv.org/abs/2311.11385 |
Link to PDF | https://arxiv.org/pdf/2311.11385.pdf |
Reference Type | Conference Paper |
Author(s) | Reddi, A.; Toelle, M.; Peters, J.; Chalvatzaki, G.; D'Eramo, C. |
Year | 2024 |
Title | Robust Adversarial Reinforcement Learning via Bounded Rationality Curricula |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Abstract | Robustness against adversarial attacks and distribution shifts is a long-standing goal of Reinforcement Learning (RL). To this end, Robust Adversarial Reinforcement Learning (RARL) trains a protagonist against destabilizing forces exercised by an adversary in a competitive zero-sum Markov game, whose optimal solution, i.e., rational strategy, corresponds to a Nash equilibrium. However, finding Nash equilibria requires facing complex saddle point optimization problems, which can be prohibitive to solve, especially for high-dimensional control. In this paper, we propose a novel approach for adversarial RL based on entropy regularization to ease the complexity of the saddle point optimization problem. We show that the solution of this entropy-regularized problem corresponds to a Quantal Response Equilibrium (QRE), a generalization of Nash equilibria that accounts for bounded rationality, i.e., agents sometimes play random actions instead of optimal ones. Crucially, the connection between the entropy-regularized objective and QRE enables free modulation of the rationality of the agents by simply tuning the temperature coefficient. We leverage this insight to propose our novel algorithm, Quantal Adversarial RL (QARL), which gradually increases the rationality of the adversary in a curriculum fashion until it is fully rational, easing the complexity of the optimization problem while retaining robustness. We provide extensive evidence of QARL outperforming RARL and recent baselines across several MuJoCo locomotion and navigation problems in overall performance and robustness. |
Volume | Spotlight |
URL(s) | https://arxiv.org/abs/2311.01642 |
Link to PDF | https://arxiv.org/pdf/2311.01642.pdf |
Reference Type | Conference Proceedings |
Author(s) | Al-Hafez, F.; Zhao, G.; Peters, J.; Tateo, D. |
Year | 2024 |
Title | Time-Efficient Reinforcement Learning with Stochastic Stateful Policies |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Keywords | reinforcement learning, imitation, stateful policies |
Link to PDF | https://arxiv.org/pdf/2311.04082.pdf |
Reference Type | Journal Article |
Author(s) | Prasad, V.; Kshirsagar, A.; Koert, D.; Stock-Homburg, R.; Peters, J.; Chalvatzaki, G. |
Year | 2024 |
Title | MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Volume | 9 |
Number | 7 |
Pages | 6043--6050 |
URL(s) | https://ieeexplore.ieee.org/abstract/document/10517380/ |
Link to PDF | https://arxiv.org/pdf/2407.07636 |
Reference Type | Conference Proceedings |
Author(s) | Scherf, L.; Gasche, L. A.; Chemangui, E.; Koert, D. |
Year | 2024 |
Title | Are You Sure? - Multi-Modal Human Decision Uncertainty Detection in Human-Robot Interaction |
Journal/Conference/Book Title | ACM/IEEE International Conference on Human-Robot Interaction (HRI) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/LisaScherf/Scherf_HRI2024.pdf |
Reference Type | Conference Proceedings |
Author(s) | Boehm, A.; Schneider, T.; Belousov, B.; Kshirsagar, A.; Lin, L.; Doerschner, K.; Drewing, K.; Rothkopf, C.A.; Peters, J. |
Year | 2024 |
Title | What Matters for Active Texture Recognition With Vision-Based Tactile Sensors |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/boehm24_tart.pdf |
Reference Type | Journal Article |
Author(s) | Weng, Y.; Chun, S.; Ohashi, M.; Matsuda, T.; Sekimori, Y.; Pajarinen, J.; Peters, J.; Maki, T. |
Year | 2024 |
Title | Autonomous Underwater Vehicle Link Alignment Control in Unknown Environments Using Reinforcement Learning |
Journal/Conference/Book Title | Journal of Field Robotics |
Volume | 41 |
Number | 6 |
Pages | 1724--1743 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoniPajarinen/Weng_JFR_2024.pdf |
Reference Type | Conference Proceedings |
Author(s) | Spartakov, R.; Kshirsagar, A.; Mühl, D.; Schween, R.; Endres, D.M.; Bremmer, F.; Melzig, C.; Peters, J. |
Year | 2024 |
Title | Balancing on the Edge: Review and Computational Framework on the Dynamics of Fear of Falling and Fear of Heights in Postural Control |
Journal/Conference/Book Title | Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci) |
Keywords | Cognitive Neuroscience, Psychology, Dynamical Systems, Motor control, Comparative Analysis |
URL(s) | https://escholarship.org/uc/item/13m560t0 |
Link to PDF | /uploads/Team/AlapKshirsagar/2024-cogsci-fearbalance.pdf |
Reference Type | Journal Article |
Author(s) | Holzmann, P.; Pfefferkorn, M.; Peters, J.; Findeisen, R. |
Year | 2024 |
Title | Learning Energy-Efficient Trajectory Planning for Robotic Manipulators using Bayesian Optimization |
Journal/Conference/Book Title | Proceedings of the European Control Conference (ECC) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/ByType/ECC_2024_final.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wiebe, F.; Turcato, N.; Dalla Libera, A.; Zhang, C.; Vincent, T.; Vyas, S.; Giacomuzzo, G.; Carli, R.; Romeres, D.; Sathuluri, A.; Zimmermann, M.; Belousov, B.; Peters, J.; Kirchner, F.; Kumar, S. |
Year | 2024 |
Title | Reinforcement Learning for Athletic Intelligence: Lessons from the 1st “AI Olympics with RealAIGym” Competition |
Journal/Conference/Book Title | Proceedings of the 33rd International Joint Conference on Artificial Intelligence (IJCAI) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/ijcai24realaigym.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lin, L.; Boehm, A.; Belousov, B.; Kshirsagar, A.; Schneider, T.; Peters, J.; Doerschner, K.; Drewing, K. |
Year | 2024 |
Title | Task-Adapted Single-Finger Explorations of Complex Objects |
Journal/Conference/Book Title | Eurohaptics |
Link to PDF | https://eurohaptics.org/ehc2024/wp-content/uploads/sites/5/2024/06/1106-doc.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen, D.M.H.; Lukashina, N.; Nguyen, N.; Le, A.T.; Nguyen, T.T.; Ho, N.; Peters, J.; Sonntag, D.; Zaverkin, V.; Niepert, M. |
Year | 2024 |
Title | Structure-Aware E(3)-Invariant Molecular Conformer Aggregation Networks |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Link to PDF | https://arxiv.org/pdf/2402.01975 |
Reference Type | Unpublished Work |
Author(s) | Becker, N.; Gattung, E.; Hansel, K.; Schneider, T.; Zhu, Y.; Hasegawa, Y.; Peters, J. |
Year | 2024 |
Title | Integrating Visuo-tactile Sensing with Haptic Feedback for Teleoperated Robot Manipulation |
Journal/Conference/Book Title | IEEE ICRA 2024 Workshop on Robot Embodiment through Visuo-Tactile Perception |
Abstract | Telerobotics enables humans to overcome spatial constraints and allows them to physically interact with the environment in remote locations. However, the sensory feedback provided by the system to the operator is often purely visual, limiting the operator’s dexterity in manipulation tasks. In this work, we address this issue by equipping the robot’s end-effector with high-resolution visuotactile GelSight sensors. Using low-cost MANUS-Gloves, we provide the operator with haptic feedback about forces acting at the points of contact in the form of vibration signals. We propose two different methods for estimating these forces; one based on estimating the movement of markers on the sensor surface and one deep-learning approach. Additionally, we integrate our system into a virtual-reality teleoperation pipeline in which a human operator controls both arms of a Tiago robot while receiving visual and haptic feedback. We believe that integrating haptic feedback is a crucial step for dexterous manipulation in teleoperated robotic systems. |
URL(s) | https://arxiv.org/abs/2404.19585 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/KayHansel/integrating_visuotactile_sensing_with_haptic_feedback_for_teleoperated_robot_manipulation.pdf |
Reference Type | Conference Proceedings |
Author(s) | Palenicek, D.; Gruner, T.; Schneider, T.; Böhm, A.; Lenz, J.; Pfenning, I.; Krämer, E.; Peters, J. |
Year | 2024 |
Title | Learning Tactile Insertion in the Real World |
Journal/Conference/Book Title | IEEE ICRA 2024 Workshop on Robot Embodiment through Visuo-Tactile Perception |
Link to PDF | https://arxiv.org/pdf/2405.00383 |
Reference Type | Journal Article |
Author(s) | Gu, S.; Liu, P.; Kshirsagar, A.; Chen, G.; Peters, J.; Knoll, A. |
Year | 2024 |
Title | ROSCOM: Robust Safe Reinforcement Learning on Stochastic Constraint Manifolds |
Journal/Conference/Book Title | IEEE Transactions on Automation Science and Engineering (T-ASE) |
Keywords | safe reinforcement learning, constrained manifolds, robust reinforcement learning, robotics |
URL(s) | https://ieeexplore.ieee.org/abstract/document/10616119 |
Link to PDF | /uploads/Team/AlapKshirsagar/2024-roscom-tase.pdf |
Reference Type | Conference Paper |
Author(s) | Vincent, T.; Wahren, F.; Peters, J.; Belousov, B.; D'Eramo, C. |
Year | 2024 |
Title | Adaptive Q-Network: On-the-fly Target Selection for Deep Reinforcement Learning |
Journal/Conference/Book Title | European Workshop on Reinforcement Learning (EWRL) |
Keywords | Reinforcement Learning |
Link to PDF | https://arxiv.org/pdf/2405.16195 |
Reference Type | Conference Paper |
Author(s) | Vincent, T.; Wahren, F.; Peters, J.; Belousov, B.; D'Eramo, C. |
Year | 2024 |
Title | Adaptive Q-Network: On-the-fly Target Selection for Deep Reinforcement Learning |
Journal/Conference/Book Title | ICML Workshop on Automated Reinforcement Learning |
Link to PDF | https://openreview.net/pdf?id=EJAEtpdtaW |
Reference Type | Journal Article |
Author(s) | Herrmann, F.; Zach, S.B.; Banfi, J.; Peters, J.; Chalvatzaki, G.; Tateo, D. |
Year | 2024 |
Title | Safe and Efficient Path Planning under Uncertainty via Deep Collision Probability Fields |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DavideTateo/DCPF_RAL2024.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bohlinger, N.; Czechmanowski, G.; Krupka, M.; Kicki, P.; Walas, K.; Peters, J.; Tateo, D. |
Year | 2024 |
Title | One Policy to Run Them All: an End-to-end Learning Approach to Multi-Embodiment Locomotion |
Journal/Conference/Book Title | Conference on Robot Learning (CoRL) |
Keywords | Multi-Embodiment, Reinforcement Learning, Locomotion |
URL(s) | https://github.com/nico-bohlinger/one_policy_to_run_them_all_website |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NicoBohlinger/one_policy_to_run_them_all.pdf |
Reference Type | Journal Article |
Author(s) | Drolet, M.; Stepputtis, S.; Kailas, S.; Jain, A.; Schaal, S.; Peters, J.; Ben Amor, H. |
Year | 2024 |
Title | A Comparison of Imitation Learning Algorithms for Bimanual Manipulation |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Keywords | imitation, manipulation |
URL(s) | https://bimanual-imitation.github.io/ |
Link to PDF | https://arxiv.org/pdf/2408.06536 |
Reference Type | Conference Paper |
Author(s) | Geiss, H.J.; Al-Hafez, F.; Seyfarth, A.; Peters, J.; Tateo, D. |
Year | 2024 |
Title | Exciting Action: Investigating Efficient Exploration for Learning Musculoskeletal Humanoid Locomotion |
Journal/Conference/Book Title | IEEE-RAS International Conference on Humanoid Robots (Humanoids) |
Link to PDF | https://arxiv.org/abs/2407.11658 |
Reference Type | Conference Paper |
Author(s) | Jansonnie, P.; Wu, B.; Perez, J.; Peters, J. |
Year | 2024 |
Title | Unsupervised Skill Discovery for Robotic Manipulation through Automatic Task Generation |
Journal/Conference/Book Title | IEEE-RAS International Conference on Humanoid Robots (Humanoids) |
URL(s) | https://arxiv.org/abs/2410.04855 |
Link to PDF | https://arxiv.org/pdf/2410.04855 |
Reference Type | Conference Proceedings |
Author(s) | Nguyen, D.H.M.*; Le, A.T.*; Nguyen, T.Q.; Nghiem, T.D.; Duong-Tran, D.; Peters, J.; Li, S.; Niepert, M.; Sonntag, D. |
Year | 2024 |
Title | Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model |
Journal/Conference/Book Title | Asian Conference on Machine Learning (ACML) |
Keywords | Optimal Transport, Prompt Learning |
Link to PDF | https://arxiv.org/pdf/2407.04489 |
Reference Type | Conference Proceedings |
Author(s) | Watson, J.; Hahner, B.; Belousov, B.; Peters, J. |
Year | 2024 |
Title | Tractable Bayesian Dynamics Priors from Differentiable Physics for Learning and Control |
Journal/Conference/Book Title | 40th Anniversary of the IEEE International Conference on Robotics and Automation (ICRA@40) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoeWatson/watson24sim2gp.pdf |
Reference Type | Conference Proceedings |
Author(s) | Palenicek, D.; Gruner, T.; Schneider, T.; Böhm, A.; Lenz, J.; Pfenning, I.; Krämer, E.; Peters, J. |
Year | 2024 |
Title | Learning Tactile Insertion in the Real World |
Journal/Conference/Book Title | 40th Anniversary of the IEEE International Conference on Robotics and Automation (ICRA@40) |
Link to PDF | https://arxiv.org/pdf/2405.00383 |
Reference Type | Conference Proceedings |
Author(s) | Bhatt, A.; Palenicek, D.; Belousov, B.; Argus, M.; Amiranashvili, A.; Brox, T.; Peters, J. |
Year | 2024 |
Title | CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity |
Journal/Conference/Book Title | European Workshop on Reinforcement Learning (EWRL) |
Link to PDF | https://openreview.net/pdf?id=PczQtTsTIX |
Reference Type | Conference Proceedings |
Author(s) | Al-Hafez, F.; Zhao, G.; Peters, J.; Tateo, D. |
Year | 2024 |
Title | Time-Efficient Reinforcement Learning with Stochastic Stateful Policies |
Journal/Conference/Book Title | European Workshop on Reinforcement Learning (EWRL) |
Link to PDF | https://arxiv.org/abs/2311.04082 |
Reference Type | Journal Article |
Author(s) | Luis, C.E.; Bottero, A.G.; Vinogradska, J.; Berkenkamp, F.; Peters, J. |
Year | 2024 |
Title | Value-Distributional Model-Based Reinforcement Learning |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Volume | 25 |
Number | 298 |
Pages | 1-42 |
Reference Type | Conference Proceedings |
Author(s) | Kshirsagar, A.; Heller, F.; Gomez Andreu, M.; Belousov, B.; Schneider, T.; Lin, L. P. Y.; Doerschner, K.; Drewing, K.; Peters, J. |
Year | 2024 |
Title | Hardness Similarity Detection Using Vision-Based Tactile Sensors |
Journal/Conference/Book Title | 40th Anniversary of the IEEE International Conference on Robotics and Automation (ICRA@40) |
Link to PDF | /uploads/Team/AlapKshirsagar/2024-icra40-hardness.pdf |
Reference Type | Conference Proceedings |
Author(s) | Guenster, J.; Liu, P.; Peters, J.; Tateo, D. |
Year | 2024 |
Title | Handling Long-Term Safety and Uncertainty in Safe Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the Conference on Robot Learning (CoRL) |
Link to PDF | https://arxiv.org/pdf/2409.12045 |
Reference Type | Conference Proceedings |
Author(s) | Kicki, P.; Tateo, D.; Liu, P.; Guenster, J.; Peters, J.; Walas, K. |
Year | 2024 |
Title | Bridging the gap between Learning-to-plan, Motion Primitives and Safe Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the Conference on Robot Learning (CoRL) |
Link to PDF | https://arxiv.org/pdf/2408.14063 |
Reference Type | Conference Proceedings |
Author(s) | Becker, N.; Sovailo, K.; Zhu, C.; Gattung, E.; Hansel, K.; Schneider, T.; Zhu, Y.; Hasegawa, Y.; Peters, J. |
Year | 2024 |
Title | Integrating and Evaluating Visuo-tactile Sensing with Haptic Feedback for Teleoperated Robot Manipulation |
Journal/Conference/Book Title | 40th Anniversary of the IEEE International Conference on Robotics and Automation (ICRA@40) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/KayHansel/Integrating_and_evaluating_visuo_tactile_sensing_with_haptic_feedback_for_teleoperated_robot_manipulation |
Reference Type | Conference Proceedings |
Author(s) | Qian, C.; Urain, J.; Zakka, K.; Peters, J. |
Year | 2024 |
Title | PianoMime: Learning a Generalist, Dexterous Piano Player from Internet Demonstrations |
Journal/Conference/Book Title | Conference on Robot Learning (CoRL) |
Link to PDF | https://arxiv.org/pdf/2407.18178 |
Reference Type | Conference Proceedings |
Author(s) | Gomez Andreu, M.A.; Ploeger, K.; Peters, J. |
Year | 2024 |
Title | Beyond the Cascade: Juggling Vanilla Siteswap Patterns |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Abstract | Being widespread in human motor behavior, dynamic movements demonstrate higher efficiency and greater capacity to address a broader range of skill domains compared to their quasi-static counterparts. Among the frequently studied dynamic manipulation problems, robotic juggling tasks stand out due to their inherent ability to scale their difficulty levels to arbitrary extents, making them an excellent subject for investigation. In this study, we explore juggling patterns with mixed throw heights, following the vanilla siteswap juggling notation, which jugglers widely adopted to describe toss juggling patterns. This requires extending our previous analysis of the simpler cascade juggling task by a throw-height sequence planner and further constraints on the end effector trajectory. These are not necessary for cascade patterns but are vital to achieving patterns with mixed throw heights. Using a simulated environment, we demonstrate successful juggling of most common 3-9 ball siteswap patterns up to 9 ball height, transitions between these patterns, and random sequences covering all possible vanilla siteswap patterns with throws between 2 and 9 ball height. https://kai-ploeger.com/beyond-cascades |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/KaiPloeger/iros2024_beyond_the_cascade.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lenz, J.; Gruner, T.; Palenicek, D.; Schneider, T.; Pfenning, I.; Peters, J. |
Year | 2024 |
Title | Analysing the Interplay of Vision and Touch for Dexterous Insertion Tasks |
Journal/Conference/Book Title | CoRL 2024 Workshop on Learning Robot Fine and Dexterous Manipulation: Perception and Control |
Link to PDF | https://arxiv.org/pdf/2410.23860 |
Reference Type | Conference Proceedings |
Author(s) | Prasad, V.; Kshirsagar, A.; Koert, D.; Stock-Homburg, R.; Peters, J.; Chalvatzaki, G. |
Year | 2024 |
Title | MoVEInt: Mixture of Variational Experts for Learning HRI from Demonstrations |
Journal/Conference/Book Title | Workshop on Nonverbal Cues for Human-Robot Cooperative Intelligence at IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | https://nocworkshop.github.io/2024/pdf/flash-talk+poster/IROS24_3160_moveint.pdf |
Reference Type | Conference Proceedings |
Author(s) | Prasad, V.; Kshirsagar, A.; Koert, D.; Stock-Homburg, R.; Peters, J.; Chalvatzaki, G. |
Year | 2024 |
Title | MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations |
Journal/Conference/Book Title | Workshop on Structural Priors as Inductive Biases for Learning Robot Dynamics at Robotics: Science and Systems (RSS) |
URL(s) | https://sites.google.com/alora.tech/priors4robots24/accepted-contributions?authuser=0 |
Reference Type | Conference Proceedings |
Author(s) | Helmut, E.; Dziarski, L.; Funk, N.; Belousov, B.; Peters, J. |
Year | 2024 |
Title | Learning Force Distribution Estimation for the GelSight Mini Optical Tactile Sensor Based on Finite Element Analysis |
Journal/Conference/Book Title | 2nd NeurIPS Workshop on Touch Processing: From Data to Knowledge |
URL(s) | https://feats-ai.github.io |
Link to PDF | https://arxiv.org/pdf/2411.03315 |
Reference Type | Conference Proceedings |
Author(s) | Le, A. T.; Hansel, K.; Carvalho, J.; Urain, J.; Biess, A.; Chalvatzaki, G.; Peters, J. |
Year | 2024 |
Title | Global Tensor Motion Planning |
Journal/Conference/Book Title | CoRL 2024 Workshop on Differentiable Optimization Everywhere |
Reference Type | Conference Proceedings |
Author(s) | Nguyen, D.H.; Schneider, T.; Duret, G.; Kshirsagar, A.; Belousov, B.; Peters, J. |
Year | 2024 |
Title | TacEx: GelSight Tactile Simulation in Isaac Sim – Combining Soft-Body and Visuotactile Simulators |
Journal/Conference/Book Title | CoRL 2024 Workshop on Learning Robot Fine and Dexterous Manipulation: Perception and Control |
URL(s) | https://sites.google.com/view/tacex |
Link to PDF | https://arxiv.org/pdf/2411.04776 |
Reference Type | Journal Article |
Author(s) | Liu, Y.; Belousov, B.; Schneider, T.; Harsono, K.; Cheng, T.W.; Shih, S.G.; Tessmann, O.; Peters, J. |
Year | 2024 |
Title | Advancing Sustainable Construction: Discrete Modular Systems & Robotic Assembly |
Journal/Conference/Book Title | Sustainability |
Publisher | MDPI |
Volume | 16 |
Pages | 6678 |
Date | 2024/8/4 |
URL(s) | https://www.mdpi.com/2071-1050/16/15/6678 |
Link to PDF | https://www.mdpi.com/2071-1050/16/15/6678/pdf?version=1722947710 |
Reference Type | Conference Proceedings |
Author(s) | Kornmann, M.; He, Q.; Kshirsagar, A.; Ploeger, K.; Peters, J. |
Year | 2024 |
Title | Learning to Accurately Throw Paper Planes |
Journal/Conference/Book Title | CoRL 2024 Workshop on Learning Robot Fine and Dexterous Manipulation: Perception and Control |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/AlapKshirsagar/paper-planes-corl-ws.pdf |
Reference Type | Conference Proceedings |
Author(s) | Meser, M.; Bhatt, A.; Belousov, B.; Peters, J. |
Year | 2024 |
Title | MuJoCo MPC for Humanoid Control: Evaluation on HumanoidBench |
Journal/Conference/Book Title | 40th Anniversary of the IEEE International Conference on Robotics and Automation (ICRA@40) |
Link to PDF | https://arxiv.org/pdf/2408.00342 |
Reference Type | Conference Proceedings |
Author(s) | Bohlinger, N.; Czechmanowski, G.; Krupka, M.; Kicki, P.; Walas, K.; Peters, J.; Tateo, D. |
Year | 2024 |
Title | One Policy to Run Them All: an End-to-end Learning Approach to Multi-Embodiment Locomotion |
Journal/Conference/Book Title | CoRL 2024 Workshop on Morphology-Aware Policy and Design Learning Workshop |
URL(s) | https://github.com/nico-bohlinger/one_policy_to_run_them_all_website |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NicoBohlinger/one_policy_to_run_them_all.pdf |
Reference Type | Conference Proceedings |
Author(s) | Drolet, M.; Stepputtis, S.; Kailas, S.; Jain, A.; Schaal, S.; Peters, J.; Ben Amor, H. |
Year | 2024 |
Title | A Comparison of Imitation Learning Algorithms for Bimanual Manipulation |
Journal/Conference/Book Title | CoRL 2024 Workshop on Whole-body Control and Bimanual Manipulation |
URL(s) | https://bimanual-imitation.github.io/ |
Link to PDF | https://arxiv.org/pdf/2408.06536 |
Reference Type | Conference Proceedings |
Author(s) | Funk, N.; Urain, J.; Carvalho, J.; Prasad, V.; Chalvatzaki, G.; Peters, J. |
Year | 2024 |
Title | ActionFlow: Equivariant, Accurate, and Efficient Manipulation Policies with Flow Matching |
Journal/Conference/Book Title | CoRL 2024 Workshop on Mastering Robot Manipulation in a World of Abundant Data |
URL(s) | https://flowbasedpolicies.github.io/ |
Link to PDF | https://arxiv.org/pdf/2409.04576 |
Reference Type | Conference Proceedings |
Author(s) | Kaidanov, O.; Al-Hafez, F.; Süvari, Y.; Belousov, B.; Peters, J. |
Year | 2024 |
Title | The Role of Domain Randomization in Training Diffusion Policies for Whole-Body Humanoid Control |
Journal/Conference/Book Title | CoRL 2024 Workshop on Whole-body Control and Bimanual Manipulation: Applications in Humanoids and Beyond |
URL(s) | https://sites.google.com/view/dps-for-humanoid-control |
Link to PDF | https://arxiv.org/pdf/2411.01349 |
Reference Type | Conference Proceedings |
Author(s) | Holgado-Alvarez, J.H.; Reddi, A.; D'Eramo, C. |
Year | 2024 |
Title | Dynamic Obstacle Avoidance with Bounded Rationality Adversarial Reinforcement Learning |
Journal/Conference/Book Title | CoRL 2024 Locolearn Workshop |
Link to PDF | https://openreview.net/pdf?id=YVHkV0ax7F |
Reference Type | Conference Proceedings |
Author(s) | Faust, T.L.; Maraqten, H.; Aghadavoodi, E.; Belousov, B.; Peters, J. |
Year | 2024 |
Title | Velocity-History-Based Soft Actor-Critic: Tackling IROS'24 Competition AI Olympics with RealAIGym |
Journal/Conference/Book Title | IROS'24 Competition AI Olympics with RealAIGym |
Link to PDF | https://arxiv.org/pdf/2410.20096 |
Reference Type | Conference Proceedings |
Author(s) | Bohlinger, N.; Czechmanowski, G.; Krupka, M.; Kicki, P.; Walas, K.; Peters, J.; Tateo, D. |
Year | 2024 |
Title | One Policy to Run Them All: Towards an End-to-end Learning Approach to Multi-Embodiment Locomotion |
Journal/Conference/Book Title | RSS 2024 Workshop on Embodiment-Aware Robot Learning |
URL(s) | https://github.com/nico-bohlinger/one_policy_to_run_them_all_website |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NicoBohlinger/rss_workshop_one_policy_to_run_them_all.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bohlinger, N.; Tateo, D.; Kicki, P.; Walas, K.; Peters, J. |
Year | 2024 |
Title | Benefits of an Actuated Spine in Agile Quadruped Locomotion |
Journal/Conference/Book Title | ICRA 2024 Workshop on Bio-inspired Robotics and Robotics for Biology |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NicoBohlinger/icra_2024_workshop.pdf |
Reference Type | Conference Proceedings |
Author(s) | Alt, B.; Zahn, J.; Kienle, C.; Dvorak, J.; May, M.; Katic, D.; Jäkel, R.; Kopp, T.; Beetz, M.; Lanza, G. |
Year | 2024 |
Title | Human-AI Interaction in Industrial Robotics: Design and Empirical Evaluation of a User Interface for Explainable AI-Based Robot Program Optimization |
Journal/Conference/Book Title | 57th CIRP Conference on Manufacturing Systems 2024 (CMS 2024) |
URL(s) | https://www.sciencedirect.com/science/article/pii/S2212827124012915 |
Link to PDF | https://arxiv.org/pdf/2404.19349 |
Reference Type | Conference Proceedings |
Author(s) | Kienle, C.; Alt, B.; Celik, O.; Becker, P.; Katic, D.; Jäkel, R.; Neumann, G. |
Year | 2024 |
Title | MuTT: A Multimodal Trajectory Transformer for Robot Skills |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
URL(s) | https://ieeexplore.ieee.org/abstract/document/10802198 |
Link to PDF | https://arxiv.org/pdf/2407.15660 |
Reference Type | Conference Proceedings |
Author(s) | Funk, N.; Urain, J.; Carvalho, J.; Prasad, V.; Chalvatzaki, G.; Peters, J. |
Year | 2024 |
Title | ActionFlow: Efficient, Accurate, and Fast Policies with Spatially Symmetric Flow Matching |
Journal/Conference/Book Title | RSS 2024 Workshop on Structural Priors as Inductive Biases for Learning Robot Dynamics |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NiklasFunk/actionflow_rss.pdf |
Reference Type | Conference Proceedings |
Author(s) | Liu, P.; Guenster, J.; Funk, N.; Groeger, S.; Chen, D.; Bou Ammar, H.; Jankowski, J.; Maric, A.; Calinon, S.; et al.; Lioutikov, R.; Neumann, G.; Likmeta, A.; Zhalehmehrabi, A.; Bonenfant, T.; Restelli, M.; Tateo, D.; Liu, Z.; Peters, J. |
Year | 2024 |
Title | A Retrospective on the Robot Air Hockey Challenge: Benchmarking Robust, Reliable, and Safe Learning Techniques for Real-world Robotics |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NeurIPS) |
URL(s) | http://air-hockey-challenge.robot-learning.net/ |
Link to PDF | https://arxiv.org/pdf/2411.05718 |
Reference Type | Journal Article |
Author(s) | Buechler, D.; Calandra, R.; Peters, J. |
Year | 2023 |
Title | Learning to Control Highly Accelerated Ballistic Movements on Muscular Robots |
Journal/Conference/Book Title | Robotics and Autonomous Systems |
Volume | 159 |
Number | 104230 |
Link to PDF | https://arxiv.org/pdf/1904.03665.pdf |
Reference Type | Conference Paper |
Author(s) | Toelle, M.; Belousov, B.; Peters, J. |
Year | 2023 |
Title | A Unifying Perspective on Language-Based Task Representations for Robot Control |
Journal/Conference/Book Title | CoRL Workshop on Language and Robot Learning: Language as Grounding |
Keywords | Language-Based Task Representations, Robot Control |
Abstract | Natural language is becoming increasingly important in robot control for both high-level planning and goal-directed conditioning of motor skills. While a number of solutions have been proposed already, it is yet to be seen what architecture will succeed in seamlessly integrating language, vision, and action. To better understand the landscape of existing methods, we propose to view the algorithms from the perspective of “Language-Based Task Representations”, i.e., categorizing the methods that condition robot action generation on natural language commands according to their task representation and embedding architecture. Our proposed taxonomy intuitively groups existing algorithms, highlights their commonalities and distinctions, and suggests directions for further investigation. |
URL(s) | https://openreview.net/pdf?id=X128TIOXXw |
Reference Type | Journal Article |
Author(s) | Lutter, M.; Peters, J. |
Year | 2023 |
Title | Combining Physics and Deep Learning to learn Continuous-Time Dynamics Models |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Volume | 42 |
Number | 3 |
Link to PDF | https://arxiv.org/pdf/2110.01894.pdf |
Reference Type | Journal Article |
Author(s) | Lutter, M.; Belousov, B.; Mannor, S.; Fox, D.; Garg, A.; Peters, J. |
Year | 2023 |
Title | Continuous-Time Fitted Value Iteration for Robust Policies |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) |
URL(s) | https://ieeexplore.ieee.org/document/9925102 |
Link to PDF | https://arxiv.org/pdf/2110.01954.pdf |
Reference Type | Journal Article |
Author(s) | Loeckel, S.; Ju, S.; Schaller, M.; van Vliet, P.; Peters, J. |
Year | 2023 |
Title | An Adaptive Human Driver Model for Realistic Race Car Simulations |
Journal/Conference/Book Title | IEEE Transactions on Systems, Man, and Cybernetics: Systems (TSMC) |
Volume | 53 |
Number | 11 |
Pages | 6718-6730 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/Loeckel_TSMC_2023.pdf |
Reference Type | Journal Article |
Author(s) | Look, A.; Kandemir, M.; Rakitsch, B.; Peters, J. |
Year | 2023 |
Title | A Deterministic Approximation to Neural SDEs |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) |
Volume | 45 |
Number | 4 |
Pages | 4023-4037 |
Link to PDF | https://arxiv.org/pdf/2006.08973.pdf |
Reference Type | Conference Proceedings |
Author(s) | Liu, Y.; Belousov, B.; Funk, N.; Chalvatzaki, G.; Peters, J.; Tessmann, O. |
Year | 2023 |
Title | Auto(mated)nomous Assembly |
Journal/Conference/Book Title | International Conference on Trends on Construction in the Post-Digital Era |
Publisher | Springer, Cham |
Pages | 167-181 |
URL(s) | https://link.springer.com/chapter/10.1007/978-3-031-20241-4_12 |
Reference Type | Conference Proceedings |
Author(s) | Liu, P.; Zhang, K.; Tateo, D.; Jauhri, S.; Hu, Z.; Peters, J.; Chalvatzaki, G. |
Year | 2023 |
Title | Safe Reinforcement Learning of Dynamic High-Dimensional Robotic Tasks: Navigation, Manipulation, Interaction |
Journal/Conference/Book Title | 2023 IEEE International Conference on Robotics and Automation (ICRA) |
Abstract | Safety is a crucial property of every robotic platform: any control policy should always comply with actuator limits and avoid collisions with the environment and humans. In reinforcement learning, safety is even more fundamental for exploring an environment without causing any damage. While there are many proposed solutions to the safe exploration problem, only a few of them can deal with the complexity of the real world. This paper introduces a new formulation of safe exploration for reinforcement learning of various robotic tasks. Our approach applies to a wide class of robotic platforms and enforces safety even under complex collision constraints learned from data by exploring the tangent space of the constraint manifold. Our proposed approach achieves state-of-the-art performance in simulated high-dimensional and dynamic tasks while avoiding collisions with the environment. We show safe real-world deployment of our learned controller on a TIAGo++ robot, achieving remarkable performance in manipulation and human-robot interaction tasks. |
Publisher | IEEE |
URL(s) | https://arxiv.org/abs/2209.13308 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PuzeLiu/ICRA_2023_ATACOM.pdf |
Reference Type | Conference Paper |
Author(s) | Zhu, Y.; Nazirjonov, S.; Jiang, B.; Colan, J.; Aoyama, T.; Hasegawa, Y.; Belousov, B.; Hansel, K.; Peters, J. |
Year | 2023 |
Title | Visual Tactile Sensor Based Force Estimation for Position-Force Teleoperation |
Journal/Conference/Book Title | IEEE International Conference on Cyborg and Bionic Systems (CBS) |
Pages | 49-52 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/KayHansel/2022__visual_tactile_sensor_based_force_estimation_for_position_force_teleoperation.pdf |
Reference Type | Conference Proceedings |
Author(s) | Zelch, C.; Peters, J.; von Stryk, O. |
Year | 2023 |
Title | Start State Selection for Control Policy Learning from Optimal Trajectories |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Research/Overview/acceptedversion_Zelch_ICRA23.pdf |
Reference Type | Conference Proceedings |
Author(s) | Urain, J.; Funk, N.; Peters, J.; Chalvatzaki, G. |
Year | 2023 |
Title | SE(3)-DiffusionFields: Learning smooth cost functions for joint grasp and motion optimization through diffusion |
Journal/Conference/Book Title | International Conference on Robotics and Automation (ICRA) |
Keywords | SE(3), 6D grasping, Robotics, Diffusion Models |
URL(s) | https://sites.google.com/view/se3dif/home |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JulenUrainDeJesus/2023se3graspurain.pdf |
Reference Type | Conference Proceedings |
Author(s) | Hansel, K.; Urain, J.; Peters, J.; Chalvatzaki, G. |
Year | 2023 |
Title | Hierarchical Policy Blending as Inference for Reactive Robot Control |
Journal/Conference/Book Title | 2023 IEEE International Conference on Robotics and Automation (ICRA) |
Abstract | Motion generation in cluttered, dense, and dynamic environments is a central topic in robotics, rendered as a multi-objective decision-making problem. Current approaches trade-off between safety and performance. On the one hand, reactive policies guarantee fast response to environmental changes at the risk of suboptimal behavior. On the other hand, planning-based motion generation provides feasible trajectories, but the high computational cost may limit the control frequency and thus safety. To combine the benefits of reactive policies and planning, we propose a hierarchical motion generation method. Moreover, we adopt probabilistic inference methods to formalize the hierarchical model and stochastic optimization. We realize this approach as a weighted product of stochastic, reactive expert policies, where planning is used to adaptively compute the optimal weights over the task horizon. This stochastic optimization avoids local optima and proposes feasible reactive plans that find paths in cluttered and dense environments. Our extensive experimental study in planar navigation and 6DoF manipulation shows that our proposed hierarchical motion generation method outperforms both myopic reactive controllers and online re-planning methods. |
Publisher | IEEE |
URL(s) | https://sites.google.com/view/hipbi |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/KayHansel/hierarchical_policy_blending_as_inference_icra_2023__version_2024.pdf |
Reference Type | Conference Proceedings |
Author(s) | Le, A. T.; Hansel, K.; Peters, J.; Chalvatzaki, G. |
Year | 2023 |
Title | Hierarchical Policy Blending As Optimal Transport |
Journal/Conference/Book Title | 5th Annual Learning for Dynamics & Control Conference (L4DC) |
Publisher | PMLR |
URL(s) | https://sites.google.com/view/hipobot |
Link to PDF | https://proceedings.mlr.press/v211/le23a/le23a.pdf |
Reference Type | Conference Proceedings |
Author(s) | Luis, C.; Bottero, A.G.; Vinogradska, J.; Berkenkamp, F.; Peters, J. |
Year | 2023 |
Title | Model-Based Uncertainty in Value Functions |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS) |
Link to PDF | https://arxiv.org/pdf/2302.12526.pdf |
Reference Type | Conference Proceedings |
Author(s) | Al-Hafez, F.; Tateo, D.; Arenz, O.; Zhao, G.; Peters, J. |
Year | 2023 |
Title | LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Keywords | Inverse Reinforcement Learning, Inverse Q-Learning, Implicit Reward Regularization, Imitation Learning, Locomotion |
Abstract | Recent methods for imitation learning directly learn a Q-function using an implicit reward formulation, rather than an explicit reward function. However, these methods generally require implicit reward regularization for improving stability, mistreating or even neglecting absorbing states. Previous works show that a squared norm regularization on the implicit reward function is effective, but do not provide a theoretical analysis of the resulting properties of the algorithms. In this work, we show that using this regularizer under a mixture distribution of the policy and the expert provides a particularly illuminating perspective: the original objective can be understood as squared Bellman error minimization, and the corresponding optimization problem minimizes the χ2-Divergence between the expert and the mixture distribution. This perspective allows us to address instabilities and properly treat absorbing states. We show that our method, Least Squares Inverse Q-Learning (LS-IQ), outperforms state-of-the-art algorithms, particularly in environments with absorbing states. Finally, we propose to use an inverse dynamics model to learn from observations only. Using this approach, we retain performance in settings where no expert actions are available. |
URL(s) | https://openreview.net/forum?id=o3Q4m8jg4BR&referrer=%5BAuthor%20Console%5D(%2Fgroup%3Fid%3DICLR.cc%2F2023%2FConference%2FAuthors%23your-submissions) |
Link to PDF | https://openreview.net/pdf?id=o3Q4m8jg4BR |
Reference Type | Conference Proceedings |
Author(s) | Palenicek, D.; Lutter, M.; Carvalho, J.; Peters, J. |
Year | 2023 |
Title | Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Link to PDF | https://openreview.net/pdf?id=H4Ncs5jhTCu |
Reference Type | Conference Proceedings |
Author(s) | Buechler, D.; Guist, S.; Calandra, R.; Berenz, V.; Schoelkopf, B.; Peters, J. |
Year | 2023 |
Title | Learning to Play Table Tennis From Scratch using Muscular Robots |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), IEEE T-RO Track |
Link to PDF | https://arxiv.org/pdf/2006.05935.pdf |
Reference Type | Conference Proceedings |
Author(s) | Urain, J.; Tateo, D.; Peters, J. |
Year | 2023 |
Title | Learning Stable Vector Fields on Lie Groups |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), IEEE R-AL Track |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/urain_2023_liesvf.pdf |
Reference Type | Journal Article |
Author(s) | Bjelonic, F.; Lee, J.; Arm, P.; Sako, D.; Tateo, D.; Peters, J.; Hutter, M. |
Year | 2023 |
Title | Learning-Based Design and Control for Quadrupedal Robots With Parallel-Elastic Actuators |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (R-AL) |
Volume | 8 |
Number | 3 |
Pages | 1611-1618 |
Link to PDF | https://arxiv.org/pdf/2301.03509 |
Reference Type | Conference Proceedings |
Author(s) | Bethge, J.; Pfefferkorn, M.; Rose, A.; Peters, J.; Findeisen, R. |
Year | 2023 |
Title | Model predictive control with Gaussian-process-supported dynamical constraints for autonomous vehicles |
Journal/Conference/Book Title | Proceedings of the 22nd World Congress of the International Federation of Automatic Control |
Link to PDF | https://arxiv.org/pdf/2303.04725.pdf |
Reference Type | Journal Article |
Author(s) | Urain, J.; Li, A.; Liu, P.; D'Eramo, C.; Peters, J. |
Year | 2023 |
Title | Composable energy policies for reactive motion generation and reinforcement learning |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/urain_2023_cep_ijrr.pdf |
Reference Type | Journal Article |
Author(s) | Abdulsamad, H.; Peters, J. |
Year | 2023 |
Title | Model-Based Reinforcement Learning via Stochastic Hybrid Models |
Journal/Conference/Book Title | IEEE Open Journal of Control Systems, Special Section: Intersection of Machine Learning with Control |
Publisher | IEEE |
Volume | 2 |
Pages | 155-170 |
Electronic Resource Number | DOI 10.1109/OJCSYS.2023.3277308 |
Link to PDF | https://arxiv.org/pdf/2111.06211.pdf |
Reference Type | Journal Article |
Author(s) | Ju, S.; van Vliet, P.; Arenz, O.; Peters, J. |
Year | 2023 |
Title | Digital Twin of a Driver-in-the-Loop Race Car Simulation with Contextual Reinforcement Learning |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (R-AL) |
Volume | 8 |
Number | 7 |
Pages | 4107-4114 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/RAL_Siwei_Ju.pdf |
Reference Type | Journal Article |
Author(s) | Peters, S.; Peters, J.; Findeisen, R. |
Year | 2023 |
Title | Quantifying Uncertainties along the Automated Driving Stack |
Journal/Conference/Book Title | ATZ worldwide |
Volume | 125 |
Pages | 62-65 |
URL(s) | https://link.springer.com/article/10.1007/s38311-023-1489-8 |
Reference Type | Journal Article |
Author(s) | Arenz, O.; Dahlinger, P.; Ye, Z.; Volpp, M.; Neumann, G. |
Year | 2023 |
Title | A Unified Perspective on Natural Gradient Variational Inference with Gaussian Mixture Models |
Journal/Conference/Book Title | Transactions on Machine Learning Research (TMLR) |
Abstract | Variational inference with Gaussian mixture models (GMMs) enables learning of highly tractable yet multi-modal approximations of intractable target distributions with up to a few hundred dimensions. The two currently most effective methods for GMM-based variational inference, VIPS and iBayes-GMM, both employ independent natural gradient updates for the individual components and their weights. We show for the first time, that their derived updates are equivalent, although their practical implementations and theoretical guarantees differ. We identify several design choices that distinguish both approaches, namely with respect to sample selection, natural gradient estimation, stepsize adaptation, and whether trust regions are enforced or the number of components adapted. We argue that for both approaches, the quality of the learned approximations can heavily suffer from the respective design choices: By updating the individual components using samples from the mixture model, iBayes-GMM often fails to produce meaningful updates to low-weight components, and by using a zero-order method for estimating the natural gradient, VIPS scales badly to higher-dimensional problems. Furthermore, we show that information-geometric trust-regions (used by VIPS) are effective even when using first-order natural gradient estimates, and often outperform the improved Bayesian learning rule (iBLR) update used by iBayes-GMM. We systematically evaluate the effects of design choices and show that a hybrid approach significantly outperforms both prior works. Along with this work, we publish our highly modular and efficient implementation for natural gradient variational inference with Gaussian mixture models, which supports 432 different combinations of design choices, facilitates the reproduction of all our experiments, and may prove valuable for the practitioner. |
Link to PDF | /uploads/Team/PubOlegArenz/gmmvi.pdf |
Reference Type | Conference Proceedings |
Author(s) | Carvalho, J.; Le, A. T.; Baierl, M.; Koert, D.; Peters, J. |
Year | 2023 |
Title | Motion Planning Diffusion: Learning and Planning of Robot Motions with Diffusion Models |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | Motion Planning, Diffusion Models |
URL(s) | https://sites.google.com/view/mp-diffusion |
Link to PDF | https://arxiv.org/abs/2308.01557 |
Reference Type | Conference Paper |
Author(s) | Funk, N.; Mueller, P.-O.; Belousov, B.; Savchenko, A.; Findeisen, R.; Peters, J. |
Year | 2023 |
Title | High-Resolution Pixelwise Contact Area and Normal Force Estimation for the GelSight Mini Visuotactile Sensor Using Neural Networks |
Journal/Conference/Book Title | Embracing Contacts Workshop at ICRA 2023 |
URL(s) | https://openreview.net/forum?id=dUO0QQw4FW |
Link to PDF | https://openreview.net/pdf?id=dUO0QQw4FW |
Reference Type | Journal Article |
Author(s) | Scherf, L.; Schmidt, A.; Pal, S.; Koert, D. |
Year | 2023 |
Title | Interactively learning behavior trees from imperfect human demonstrations |
Journal/Conference/Book Title | Frontiers in Robotics and AI |
Volume | 10 |
Link to PDF | https://www.frontiersin.org/articles/10.3389/frobt.2023.1152595/pdf |
Reference Type | Conference Paper |
Author(s) | Vincent, T.; Belousov, B.; D'Eramo, C.; Peters, J. |
Year | 2023 |
Title | Iterated Deep Q-Network: Efficient Learning of Bellman Iterations for Deep Reinforcement Learning |
Journal/Conference/Book Title | European Workshop on Reinforcement Learning (EWRL) |
Link to PDF | https://openreview.net/pdf?id=6dJGuVyR7K |
Reference Type | Journal Article |
Author(s) | Flynn, H.; Reeb, D.; Kandemir, M.; Peters, J. |
Year | 2023 |
Title | PAC-Bayes Bounds for Bandit Problems: A Survey and Experimental Comparison |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) |
Volume | 45 |
Number | 12 |
Pages | 15308-15327 |
Link to PDF | https://arxiv.org/pdf/2211.16110.pdf |
Reference Type | Conference Proceedings |
Author(s) | Al-Hafez, F.; Tateo, D.; Arenz, O.; Zhao, G.; Peters, J. |
Year | 2023 |
Title | Least Squares Inverse Q-Learning |
Journal/Conference/Book Title | European Workshop on Reinforcement Learning (EWRL) |
URL(s) | https://openreview.net/forum?id=BcHDYvNg-W4 |
Link to PDF | https://openreview.net/pdf?id=BcHDYvNg-W4 |
Reference Type | Journal Article |
Author(s) | Look, A.; Kandemir, M.; Rakitsch, B.; Peters, J. |
Year | 2023 |
Title | Cheap and Deterministic Inference for Deep State-Space Models of Interacting Dynamical Systems |
Journal/Conference/Book Title | Transactions on Machine Learning Research (TMLR) |
Abstract | Graph neural networks are often used to model interacting dynamical systems since they gracefully scale to systems with a varying and high number of agents. While there has been much progress made for deterministic interacting systems, modeling is much more challenging for stochastic systems in which one is interested in obtaining a predictive distribution over future trajectories. Existing methods are either computationally slow since they rely on Monte Carlo sampling or make simplifying assumptions such that the predictive distribution is unimodal. In this work, we present a deep state-space model which employs graph neural networks in order to model the underlying interacting dynamical system. The predictive distribution is multimodal and has the form of a Gaussian mixture model, where the moments of the Gaussian components can be computed via deterministic moment matching rules. Our moment matching scheme can be exploited for sample-free inference, leading to more efficient and stable training compared to Monte Carlo alternatives. Furthermore, we propose structured approximations to the covariance matrices of the Gaussian components in order to scale up to systems with many agents. We benchmark our novel framework on two challenging autonomous driving datasets. Both confirm the benefits of our method compared to state-of-the-art methods. We further demonstrate the usefulness of our individual contributions in a carefully designed ablation study and provide a detailed runtime analysis of our proposed covariance approximations. Finally, we empirically demonstrate the generalization ability of our method by evaluating its performance on unseen scenarios. |
Link to PDF | https://arxiv.org/abs/2305.01773 |
Reference Type | Conference Proceedings |
Author(s) | Flynn, H.; Reeb, D.; Kandemir, M.; Peters, J. |
Year | 2023 |
Title | Improved Algorithms for Stochastic Linear Bandits Using Tail Bounds for Martingale Mixtures |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
Link to PDF | /uploads/Team/HamishFlynn/mmucb2.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gruner, T.; Belousov, B.; Muratore, F.; Palenicek, D.; Peters, J. |
Year | 2023 |
Title | Pseudo-Likelihood Inference |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
Link to PDF | /uploads/Team/TheoGruner/pli_final |
Reference Type | Conference Proceedings |
Author(s) | Le, A. T.; Chalvatzaki, G.; Biess, A.; Peters, J. |
Year | 2023 |
Title | Accelerating Motion Planning via Optimal Transport |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
URL(s) | https://sites.google.com/view/sinkhorn-step/ |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/AnThaiLe/mpot_preprint.pdf |
Reference Type | Conference Proceedings |
Author(s) | Le, A. T.; Chalvatzaki, G.; Biess, A.; Peters, J. |
Year | 2023 |
Title | Accelerating Motion Planning via Optimal Transport |
Journal/Conference/Book Title | IROS 2023 Workshop on Differentiable Probabilistic Robotics: Emerging Perspectives on Robot Learning |
Keywords | Motion Planning, Trajectory Optimization, Optimal Transport |
Volume | [Oral] |
Link to PDF | https://openreview.net/pdf?id=Gx62uPXEul |
Reference Type | Conference Paper |
Author(s) | Rother, D.; Weisswange, T.H.; Peters, J. |
Year | 2023 |
Title | Disentangling Interaction using Maximum Entropy Reinforcement Learning in Multi-Agent Systems |
Journal/Conference/Book Title | European Conference on Artificial Intelligence (ECAI) |
Keywords | Ad-Hoc Teamwork, Maximum Entropy Reinforcement Learning, Coexistence, Mixed Motive |
Abstract | Research on multi-agent interaction involving both multiple artificial agents and humans is still in its infancy. Most recent approaches have focused on environments with collaboration-focused human behavior, or providing only a small, defined set of situations. When deploying robots in human-inhabited environments in the future, it will be unlikely that all interactions fit a predefined model of collaboration, where collaborative behavior is still expected from the robot. Existing approaches are unlikely to effectively create such behaviors in such "coexistence" environments. To tackle this issue, we introduce a novel framework that decomposes interaction and task-solving into separate learning problems and blends the resulting policies at inference time. Policies are learned with maximum entropy reinforcement learning, allowing us to create interaction-impact-aware agents and scale the cost of training agents linearly with the number of agents and available tasks. We propose a weighting function covering the alignment of interaction distributions with the original task. We demonstrate that our framework addresses the scaling problem while solving a given task and considering collaboration opportunities in a co-existence particle environment and a new cooking environment. Our work introduces a new learning paradigm that opens the path to more complex multi-robot, multi-human interactions. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DavidRother/ecai_2023.pdf |
Reference Type | Conference Proceedings |
Author(s) | Rother, D. |
Year | 2023 |
Title | Implicitly Cooperative Agents through Impact-Aware Learning |
Journal/Conference/Book Title | European Conference on Artificial Intelligence (ECAI) |
Abstract | This research explores how autonomous agents learn and interact in shared environments, emphasizing the understanding of others as explicit agents rather than simple dynamic obstacles. When deploying robots in human-inhabited environments in the future, it will be unlikely that all interactions fit a predefined model of collaboration, where collaborative behavior is still expected from the robot. Utilizing the "theory of mind" concept, the research aims to infer the beliefs, policies, intentions, and goals of other agents, enabling the evaluation of our agent's impact on them. The study aims to create a multi-agent system capable of promoting inherent cooperation even with mixed objectives and adapting to various applications. Using reinforcement learning, we develop a modular system that is capable of adapting to changing team sizes and motives for different agents. The developed method is trialed in a real-world assistant robot setup, testing cooperative actions without explicit initiation. Further evaluations occur in simulated environments, e.g., a cooking environment, to manage the policies of other agents and action recognition issues. We can measure the success of our method through the increased utility of either the population or single agents. Additionally, user studies can be conducted in which we can directly measure the satisfaction of humans when working alongside our agents and compare those to other methods. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DavidRother/doctoral_consortium.pdf |
Reference Type | Conference Paper |
Author(s) | Vincent, T.; Metelli, A.; Peters, J.; Restelli, M.; D'Eramo, C. |
Year | 2023 |
Title | Parameterized projected Bellman operator |
Journal/Conference/Book Title | ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems |
Link to PDF | https://openreview.net/pdf?id=UnNdjopNeW |
Reference Type | Conference Paper |
Author(s) | Chen, Q.; Zhu, Y.; Hansel, K.; Aoyama, T.; Hasegawa, Y. |
Year | 2023 |
Title | Human Preferences and Robot Constraints Aware Shared Control for Smooth Follower Motion Execution |
Journal/Conference/Book Title | IEEE International Symposium on Micro-NanoMechatronics and Human Science (MHS) |
Publisher | IEEE |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/KayHansel/human_preferences_and_robot_constraints_aware_shared_control.pdf |
Reference Type | Journal Article |
Author(s) | Gu, S.; Kshirsagar, A.; Du, Y.; Chen, G.; Peters, J.; Knoll, A. |
Year | 2023 |
Title | A Human-Centered Safe Robot Reinforcement Learning Framework with Interactive Behaviors |
Journal/Conference/Book Title | Frontiers in Neurorobotics |
Volume | 17 |
Number | 1280341 |
URL(s) | https://www.frontiersin.org/articles/10.3389/fnbot.2023.1280341/full |
Link to PDF | https://www.frontiersin.org/articles/10.3389/fnbot.2023.1280341/full |
Reference Type | Conference Paper |
Author(s) | Mittenbuehler, M.; Hendawy, A.; D'Eramo, C.; Chalvatzaki, G. |
Year | 2023 |
Title | Parameter-efficient Tuning of Pretrained Visual-Language Models in Multitask Robot Learning |
Journal/Conference/Book Title | CoRL 2023 Workshop on Learning Effective Abstractions for Planning (LEAP) |
Keywords | pretrained visual-language models, multitask robot learning, adapters |
Abstract | Multimodal pretrained visual-language models (pVLMs) have showcased excellence across several applications, like visual question-answering. Their recent application for policy learning manifested promising avenues for augmenting robotic capabilities in the real world. This paper delves into the problem of parameter-efficient tuning of pVLMs for adapting them to robotic manipulation tasks with low-resource data. We showcase how Low-Rank Adapters (LoRA) can be injected into behavioral cloning temporal transformers to fuse language, multi-view images, and proprioception for multitask robot learning, even for long-horizon tasks. Preliminary results indicate our approach vastly outperforms baseline architectures and tuning methods, paving the way toward parameter-efficient adaptation of pretrained large multimodal transformers for robot learning with only a handful of demonstrations. |
Reference Type | Conference Paper |
Author(s) | Metternich, H.; Hendawy, A.; Klink, P.; Peters, J.; D'Eramo, C. |
Year | 2023 |
Title | Using Proto-Value Functions for Curriculum Generation in Goal-Conditioned RL |
Journal/Conference/Book Title | NeurIPS 2023 Workshop on Goal-Conditioned Reinforcement Learning |
Keywords | Reinforcement Learning, Curriculum Learning, Graph Laplacian |
Abstract | In this paper, we investigate the use of Proto Value Functions (PVFs) for measuring the similarity between tasks in the context of Curriculum Learning (CL). PVFs serve as a mathematical framework for generating basis functions for the state space of a Markov Decision Process (MDP). They capture the structure of the state space manifold and have been shown to be useful for value function approximation in Reinforcement Learning (RL). We show that even a few PVFs allow us to estimate the similarity between tasks. Based on this observation, we introduce a new algorithm called Curriculum Representation Policy Iteration (CRPI) that uses PVFs for CL, and we provide a proof of concept in a Goal-Conditioned Reinforcement Learning (GCRL) setting. |
Reference Type | Conference Proceedings |
Author(s) | Le, A. T.; Chalvatzaki, G.; Biess, A.; Peters, J. |
Year | 2023 |
Title | Accelerating Motion Planning via Optimal Transport |
Journal/Conference/Book Title | NeurIPS 2023 Workshop Optimal Transport and Machine Learning |
Volume | [Oral] |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/AnThaiLe/mpot_preprint.pdf |
Reference Type | Conference Paper |
Author(s) | Boehm, A.; Schneider, T.; Belousov, B.; Kshirsagar, A.; Lin, L.; Doerschner, K.; Drewing, K.; Rothkopf, C.A.; Peters, J. |
Year | 2023 |
Title | Tactile Active Texture Recognition With Vision-Based Tactile Sensors |
Journal/Conference/Book Title | NeurIPS Workshop on Touch Processing: a new Sensing Modality for AI |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/boehm24_tart.pdf |
Reference Type | Conference Proceedings |
Author(s) | Watson, J.; Peters, J. |
Year | 2023 |
Title | Sample-Efficient Online Imitation Learning using Pretrained Behavioural Cloning Policies |
Journal/Conference/Book Title | NeurIPS 6th Robot Learning Workshop: Pretraining, Fine-Tuning, and Generalization with Large Scale Models |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoeWatson/watson23rlws.pdf |
Reference Type | Conference Proceedings |
Author(s) | Al-Hafez, F.; Zhao, G.; Peters, J.; Tateo, D. |
Year | 2023 |
Title | LocoMuJoCo: A Comprehensive Imitation Learning Benchmark for Locomotion |
Journal/Conference/Book Title | Robot Learning Workshop, Conference on Neural Information Processing Systems (NeurIPS) |
Abstract | Imitation Learning (IL) holds great promise for enabling agile locomotion in embodied agents. However, many existing locomotion benchmarks primarily focus on simplified toy tasks, often failing to capture the complexity of real-world scenarios and steering research toward unrealistic domains. To advance research in IL for locomotion, we present a novel benchmark designed to facilitate rigorous evaluation and comparison of IL algorithms. This benchmark encompasses a diverse set of environments, including quadrupeds, bipeds, and musculoskeletal human models, each accompanied by comprehensive datasets, such as real noisy motion capture data, ground truth expert data, and ground truth sub-optimal data, enabling evaluation across a spectrum of difficulty levels. To increase the robustness of learned agents, we provide an easy interface for dynamics randomization and offer a wide range of partially observable tasks to train agents across different embodiments. Finally, we provide handcrafted metrics for each task and ship our benchmark with state-of-the-art baseline algorithms to ease evaluation and enable fast benchmarking. |
Link to PDF | https://arxiv.org/pdf/2311.02496.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lach, L.; Haschke, R.; Tateo, D.; Peters, J.; Ritter, H.; Sol, J.; Torras, C. |
Year | 2023 |
Title | Towards Transferring Tactile-based Continuous Force Control Policies from Simulation to Robot |
Journal/Conference/Book Title | NeurIPS 2023 Workshop on Touch Processing |
Link to PDF | https://arxiv.org/pdf/2311.07245.pdf |
Reference Type | Conference Proceedings |
Author(s) | Zelch, C.; Peters, J.; von Stryk, O. |
Year | 2023 |
Title | Clustering of Motion Trajectories by a Distance Measure Based on Semantic Features |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Humanoid Robots (Humanoids) |
Link to PDF | https://arxiv.org/pdf/2404.17269 |
Reference Type | Journal Article |
Author(s) | Parisi, S.; Tateo, D.; Hensel, M.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | 2022 |
Title | Long-Term Visitation Value for Deep Exploration in Sparse-Reward Reinforcement Learning |
Journal/Conference/Book Title | Algorithms |
Volume | 15 |
Number | 3 |
Pages | 81 |
Link to PDF | https://arxiv.org/abs/2001.00119 |
Reference Type | Journal Article |
Author(s) | Akrour, R.; Tateo, D.; Peters, J. |
Year | 2022 |
Title | Continuous Action Reinforcement Learning from a Mixture of Interpretable Experts |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) |
Volume | 44 |
Number | 10 |
Pages | 6795-6806 |
Link to PDF | https://arxiv.org/pdf/2006.05911.pdf |
Reference Type | Journal Article |
Author(s) | Loeckel, S.; Kretschi, A.; van Vliet, P.; Peters, J. |
Year | 2022 |
Title | Identification and modelling of race driving styles |
Journal/Conference/Book Title | Vehicle System Dynamics |
Volume | 60 |
Number | 8 |
Pages | 2890--2918 |
Link to PDF | https://doi.org/10.1080/00423114.2021.1930070 |
Reference Type | Journal Article |
Author(s) | Tosatto, S.; Carvalho, J.; Peters, J. |
Year | 2022 |
Title | Batch Reinforcement Learning with a Nonparametric Off-Policy Policy Gradient |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) |
Volume | 44 |
Number | 10 |
Pages | 5996--6010 |
URL(s) | https://ieeexplore.ieee.org/document/9449972 |
Reference Type | Journal Article |
Author(s) | Belousov, B.; Wibranek, B.; Schneider, J.; Schneider, T.; Chalvatzaki, G.; Peters, J.; Tessmann, O. |
Year | 2022 |
Title | Robotic Architectural Assembly with Tactile Skills: Simulation and Optimization |
Journal/Conference/Book Title | Automation in Construction |
Volume | 133 |
Pages | 104006 |
URL(s) | https://doi.org/10.1016/j.autcon.2021.104006 |
Link to PDF | https://www.sciencedirect.com/science/article/pii/S092658052100457X/pdfft?md5=d34f2f24487d3e8e4c84d3d8a60a9f28&pid=1-s2.0-S092658052100457X-main.pdf |
Reference Type | Journal Article |
Author(s) | Funk, N.; Schaff, C.; Madan, R.; Yoneda, T.; Urain, J.; Watson, J.; Gordon, E.; Widmaier, F.; Bauer, S.; Srinivasa, S.; Bhattacharjee, T.; Walter, M.; Peters, J. |
Year | 2022 |
Title | Benchmarking Structured Policies and Policy Optimization for Real-World Dexterous Object Manipulation |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Abstract | Dexterous manipulation is a challenging and important problem in robotics. While data-driven methods are a promising approach, current benchmarks require simulation or extensive engineering support due to the sample inefficiency of popular methods. We present benchmarks for the TriFinger system, an open-source robotic platform for dexterous manipulation and the focus of the 2020 Real Robot Challenge. The benchmarked methods, which were successful in the challenge, can be generally described as structured policies, as they combine elements of classical robotics and modern policy optimization. This inclusion of inductive biases facilitates sample efficiency, interpretability, reliability and high performance. The key aspects of this benchmarking are the validation of the baselines across both simulation and the real system, a thorough ablation study over the core features of each solution, and a retrospective analysis of the challenge as a manipulation benchmark. The code and demo videos for this work can be found on our website (https://sites.google.com/view/benchmark-rrc). |
URL(s) | https://sites.google.com/view/benchmark-rrc |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NiklasFunk/RRC_v1.pdf |
Reference Type | Journal Article |
Author(s) | Muratore, F.; Ramos, F.; Turk, G.; Yu, W.; Gienger, M.; Peters, J. |
Year | 2022 |
Title | Robot Learning from Randomized Simulations: A Review |
Journal/Conference/Book Title | Frontiers in Robotics and AI |
Keywords | robotics, simulation, reality gap, simulation optimization bias, reinforcement learning, domain randomization, sim-to-real |
Abstract | The rise of deep learning has caused a paradigm shift in robotics research, favoring methods that require large amounts of data. It is prohibitively expensive to generate such data sets on a physical platform. Therefore, state-of-the-art approaches learn in simulation, where data generation is fast and inexpensive, and subsequently transfer the knowledge to the real robot (sim-to-real). Despite becoming increasingly realistic, all simulators are by construction based on models, hence inevitably imperfect. This raises the question of how simulators can be modified to facilitate learning robot control policies and overcome the mismatch between simulation and reality, often called the ‘reality gap’. We provide a comprehensive review of sim-to-real research for robotics, focusing on a technique named ‘domain randomization’ which is a method for learning from randomized simulations. |
Volume | 9 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Muratore_RTYGP--RobotLearningFromRandomizedSimulations-AReview.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Muratore_RTYGP--RobotLearningFromRandomizedSimulations-AReview.pdf |
Language | English |
Last Modified Date | 2022-01-22 |
Reference Type | Journal Article |
Author(s) | Cowen-Rivers, A.I.; Palenicek, D.; Moens, V.; Abdullah, M.A.; Sootla, A.; Wang, J.; Bou-Ammar, H. |
Year | 2022 |
Title | SAMBA: safe model-based & active reinforcement learning |
Journal/Conference/Book Title | Machine Learning |
Link to PDF | https://link.springer.com/article/10.1007/s10994-021-06103-6 |
Reference Type | Journal Article |
Author(s) | You, B.; Arenz, O.; Chen, Y.; Peters, J. |
Year | 2022 |
Title | Integrating Contrastive Learning with Dynamic Models for Reinforcement Learning from Images |
Journal/Conference/Book Title | Neurocomputing |
URL(s) | https://doi.org/10.1016/j.neucom.2021.12.094 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/OlegArenz/Integrating_Contrastive_Learning_with_Dynamic_Models_for_Reinforcement_Learning_from_Images.pdf |
Reference Type | Conference Proceedings |
Author(s) | Klink, P.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | 2022 |
Title | Boosted Curriculum Reinforcement Learning |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PascalKlink/boosted-crl.pdf |
Reference Type | Conference Proceedings |
Author(s) | Memmel, M.; Liu, P.; Tateo, D.; Peters, J. |
Year | 2022 |
Title | Dimensionality Reduction and Prioritized Exploration for Policy Search |
Journal/Conference/Book Title | 25th International Conference on Artificial Intelligence and Statistics (AISTATS) |
Keywords | Policy Search, Dimensionality Reduction, Exploration |
Abstract | Black-box policy optimization, a class of reinforcement learning algorithms, explores and updates policies at the parameter level. These algorithms are applied widely in robotics applications with movement primitives and non-differentiable policies. These methods are particularly relevant where exploration at the action level could lead to actuator damage or other safety issues. However, this class of algorithms does not scale well with the increasing dimensionality of the policy, leading to high demand for samples that are expensive to obtain on real-world systems. In most systems, policy parameters do not contribute equally to the return. Thus, identifying those parameters which contribute most allows us to narrow the exploration and speed up learning. Updating only the effective parameters requires fewer samples, solving the scalability issue. We present a novel method to prioritize exploration of effective parameters, coping with full covariance matrix updates. Our algorithm learns faster than recent approaches and requires fewer samples to achieve state-of-the-art results. To select these effective parameters, we consider both the Pearson correlation coefficient and the Mutual Information. We showcase the capabilities of our approach using the Relative Entropy Policy Search algorithm in several simulated environments, including robotics simulations. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PuzeLiu/AISTATS2022_DR-CREPS.pdf |
Reference Type | Journal Article |
Author(s) | Flynn, H.; Reeb, D.; Kandemir, M.; Peters, J. |
Year | 2022 |
Title | PAC-Bayesian Lifelong Learning For Multi-Armed Bandits |
Journal/Conference/Book Title | Data Mining and Knowledge Discovery |
Volume | 36 |
Number | 2 |
Pages | 841-876 |
Link to PDF | https://arxiv.org/pdf/2203.03303.pdf |
Reference Type | Journal Article |
Author(s) | Prasad, V.; Stock-Homburg, R.; Peters, J. |
Year | 2022 |
Title | Human-Robot Handshaking: A Review |
Journal/Conference/Book Title | International Journal of Social Robotics (IJSR) |
Keywords | Handshaking, Physical HRI, Social Robotics |
Abstract | For some years now, the use of social, anthropomorphic robots in various situations has been on the rise. These are robots developed to interact with humans and are equipped with corresponding extremities. They already support human users in various industries, such as retail, gastronomy, hotels, education and healthcare. During such Human-Robot Interaction (HRI) scenarios, physical touch plays a central role in the various applications of social robots as interactive non-verbal behaviour is a key factor in making the interaction more natural. Shaking hands is a simple, natural interaction used commonly in many social contexts and is seen as a symbol of greeting, farewell and congratulations. In this paper, we take a look at the existing state of Human-Robot Handshaking research, categorise the works based on their focus areas, and draw out the major findings of these areas while analysing their pitfalls. We mainly see that some form of synchronisation exists during the different phases of the interaction. In addition to this, we also find that additional factors like gaze, voice, facial expressions, etc. can affect the perception of a robotic handshake and that internal factors like personality and mood can affect the way in which handshaking behaviours are executed by humans. Based on the findings and insights, we finally discuss possible ways forward for research on such physically interactive behaviours. |
Volume | 14 |
Number | 1 |
Pages | 277-293 |
URL(s) | https://link.springer.com/content/pdf/10.1007/s12369-021-00763-z.pdf |
Link to PDF | https://link.springer.com/content/pdf/10.1007/s12369-021-00763-z.pdf |
Reference Type | Journal Article |
Author(s) | Zheng, Y.; Veiga, F.F.; Peters, J.; Santos, V.J. |
Year | 2022 |
Title | Autonomous Learning of Page Flipping Movements via Tactile Feedback |
Journal/Conference/Book Title | IEEE Transactions on Robotics (T-Ro) |
Volume | 38 |
Number | 5 |
Pages | 2734 - 2749 |
URL(s) | https://ieeexplore.ieee.org/document/9786532 |
Reference Type | Conference Paper |
Author(s) | Palenicek, D.; Lutter, M.; Peters, J. |
Year | 2022 |
Title | Revisiting Model-based Value Expansion |
Journal/Conference/Book Title | Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Link to PDF | https://arxiv.org/pdf/2203.14660.pdf |
Reference Type | Journal Article |
Author(s) | Hansel, K.; Moos, J.; Abdulsamad, H.; Stark, S.; Clever, D.; Peters, J. |
Year | 2022 |
Title | Robust Reinforcement Learning: A Review of Foundations and Recent Advances |
Journal/Conference/Book Title | Machine Learning and Knowledge Extraction (MAKE) |
Publisher | MDPI |
Volume | 4 |
Number | 1 |
Pages | 276--315 |
ISBN/ISSN | 2504-4990 |
URL(s) | https://www.mdpi.com/2504-4990/4/1/13 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/KayHansel/robustRLsurvey22_hansel.pdf |
Reference Type | Unpublished Work |
Author(s) | Carvalho, J.; Peters, J. |
Year | 2022 |
Title | An Analysis of Measure-Valued Derivatives for Policy Gradients |
Journal/Conference/Book Title | Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
URL(s) | https://arxiv.org/pdf/2203.03917.pdf |
Reference Type | Journal Article |
Author(s) | Weng, Y.; Pajarinen, J.; Akrour, R.; Matsuda, T.; Peters, J.; Maki, T. |
Year | 2022 |
Title | Reinforcement Learning Based Underwater Wireless Optical Communication Alignment for Multiple Autonomous Underwater Vehicles |
Journal/Conference/Book Title | IEEE Journal of Oceanic Engineering |
Volume | 47 |
Number | 4 |
Pages | 1231-1245 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Research/Overview/FinalSubmission-1.pdf |
Reference Type | Journal Article |
Author(s) | Buechler, D.; Guist, S.; Calandra, R.; Berenz, V.; Schoelkopf, B.; Peters, J. |
Year | 2022 |
Title | Learning to Play Table Tennis From Scratch using Muscular Robots |
Journal/Conference/Book Title | IEEE Transactions on Robotics (T-Ro) |
Volume | 38 |
Number | 6 |
Pages | 3850-3860 |
Link to PDF | https://arxiv.org/pdf/2006.05935.pdf |
Reference Type | Conference Proceedings |
Author(s) | Klink, P.; Yang, H.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | 2022 |
Title | Curriculum Reinforcement Learning via Constrained Optimal Transport |
Journal/Conference/Book Title | International Conference on Machine Learning (ICML) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PascalKlink/crrt.pdf |
Reference Type | Journal Article |
Author(s) | Weng, Y.; Matsuda, T.; Sekimori, Y.; Pajarinen, J.; Peters, J.; Maki, T. |
Year | 2022 |
Title | Pointing Error Control of Underwater Wireless Optical Communication on Mobile Platform |
Journal/Conference/Book Title | IEEE Photonics Technology Letters |
Volume | 34 |
Number | 13 |
Pages | 699-702 |
URL(s) | https://ieeexplore.ieee.org/document/9791364 |
Reference Type | Journal Article |
Author(s) | Cowen-Rivers, A.; Lyu, W.; Tutunov, R.; Wang, Z.; Grosnit, A.; Griffiths, R.R.; Maraval, A.; Jianye, H.; Wang, J.; Peters, J.; Bou Ammar, H. |
Year | 2022 |
Title | HEBO: An Empirical Study of Assumptions in Bayesian Optimisation |
Journal/Conference/Book Title | Journal of Artificial Intelligence Research |
Volume | 74 |
Pages | 1269-1349 |
Link to PDF | https://arxiv.org/pdf/2012.03826.pdf |
Reference Type | Conference Proceedings |
Author(s) | Urain, J.*; Le, A.T.*; Lambert, A.*; Chalvatzaki, G.; Boots, B.; Peters, J. |
Year | 2022 |
Title | Learning Implicit Priors for Motion Optimization |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | Motion Planning, Energy-Based Models |
URL(s) | https://sites.google.com/view/implicit-priors |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/AnThaiLe/iros2022_ebmtrajopt.pdf |
Reference Type | Conference Proceedings |
Author(s) | Liu, P.; Zhang, K.; Tateo, D.; Jauhri, S.; Peters, J.; Chalvatzaki, G. |
Year | 2022 |
Title | Regularized Deep Signed Distance Fields for Reactive Motion Generation |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PuzeLiu/IROS_2022_ReDSDF.pdf |
Reference Type | Conference Proceedings |
Author(s) | Zheng, Y.; Veiga, F.F.; Peters, J.; Santos, V.J. |
Year | 2022 |
Title | Autonomous Learning of Page Flipping Movements via Tactile Feedback |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Member/FilipeVeiga/IROS22_3682_MS.pdf |
Reference Type | Conference Proceedings |
Author(s) | Funk, N.; Menzenbach, S.; Chalvatzaki, G.; Peters, J. |
Year | 2022 |
Title | Graph-based Reinforcement Learning meets Mixed Integer Programs: An application to 3D robot assembly discovery |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
URL(s) | https://sites.google.com/view/rl-meets-milp |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NiklasFunk/GNN_meets_MILP_v1.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ploeger, K.; Peters, J. |
Year | 2022 |
Title | Controlling the Cascade: Kinematic Planning for N-ball Toss Juggling |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Abstract | Dynamic movements are ubiquitous in human motor behavior as they tend to be more efficient and can solve a broader range of skill domains than their quasi-static counterparts. For decades, robotic juggling tasks have been among the most frequently studied dynamic manipulation problems since the required dynamic dexterity can be scaled to arbitrarily high difficulty. However, successful approaches have been limited to basic juggling skills, indicating a lack of understanding of the required constraints for dexterous toss juggling. We present a detailed analysis of the toss juggling task, identifying the key challenges and formalizing it as a trajectory optimization problem. Building on our state-of-the-art, real-world toss juggling platform, we reach the theoretical limits of toss juggling in simulation, evaluate a resulting real-time controller in environments of varying difficulty and achieve robust toss juggling of up to 17 balls on two anthropomorphic manipulators. |
Link to PDF | https://arxiv.org/abs/2207.01414 |
Reference Type | Conference Proceedings |
Author(s) | Schneider, T.; Belousov, B.; Chalvatzaki, G.; Romeres, D.; Jha, D.K.; Peters, J. |
Year | 2022 |
Title | Active Exploration for Robotic Manipulation |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
URL(s) | https://sites.google.com/view/aerm |
Link to PDF | https://arxiv.org/pdf/2210.12806.pdf |
Reference Type | Conference Proceedings |
Author(s) | Schneider, T.; Belousov, B.; Abdulsamad, H.; Peters, J. |
Year | 2022 |
Title | Active Inference for Robotic Manipulation |
Journal/Conference/Book Title | 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Keywords | robotic manipulation, model-based reinforcement learning, information gain |
Abstract | Robotic manipulation stands as a largely unsolved problem despite significant advances in robotics and machine learning in the last decades. One of the central challenges of manipulation is partial observability, as the agent usually does not know all physical properties of the environment and the objects it is manipulating in advance. A recently emerging theory that deals with partial observability in an explicit manner is Active Inference. It does so by driving the agent to act in a way that is not only goal-directed but also informative about the environment. In this work, we apply Active Inference to a hard-to-explore simulated robotic manipulation task, in which the agent has to balance a ball into a target zone. Since the reward of this task is sparse, in order to explore this environment, the agent has to learn to balance the ball without any extrinsic feedback, purely driven by its own curiosity. We show that the information-seeking behavior induced by Active Inference allows the agent to explore these challenging, sparse environments systematically. Finally, we conclude that using an information-seeking objective is beneficial in sparse environments and allows the agent to solve tasks in which methods that do not exhibit directed exploration fail. |
URL(s) | https://arxiv.org/abs/2206.10313 |
Link to PDF | https://arxiv.org/pdf/2206.10313.pdf |
Language | English |
Last Modified Date | 01.06.2022 |
Reference Type | Conference Proceedings |
Author(s) | Galljamov, R.; Zhao, G.; Belousov, B.; Seyfarth, A.; Peters, J. |
Year | 2022 |
Title | Improving Sample Efficiency of Example-Guided Deep Reinforcement Learning for Bipedal Walking |
Journal/Conference/Book Title | 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids) |
URL(s) | https://ieeexplore.ieee.org/document/10000068 |
Reference Type | Conference Proceedings |
Author(s) | Watson, J.; Peters, J. |
Year | 2022 |
Title | Inferring Smooth Control: Monte Carlo Posterior Policy Iteration with Gaussian Processes |
Journal/Conference/Book Title | Conference on Robot Learning (CoRL) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoeWatson/watson22corl.pdf |
Reference Type | Journal Article |
Author(s) | Weng, Y.; Matsuda, T.; Sekimori, Y.; Pajarinen, J.; Peters, J.; Maki, T. |
Year | 2022 |
Title | Establishment of Line-of-Sight Optical Links Between Autonomous Underwater Vehicles: Field Experiment and Performance Validation |
Journal/Conference/Book Title | Applied Ocean Research |
Volume | 129 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Research/Overview/Establishment_of_Line_of_Sight_Optical_Links.pdf |
Reference Type | Conference Proceedings |
Author(s) | Carvalho, J.; Koert, D.; Daniv, M.; Peters, J. |
Year | 2022 |
Title | Adapting Object-Centric Probabilistic Movement Primitives with Residual Reinforcement Learning |
Journal/Conference/Book Title | 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids) |
Abstract | It is desirable for future robots to quickly learn new tasks and adapt learned skills to constantly changing environments. To this end, Probabilistic Movement Primitives (ProMPs) have shown to be a promising framework to learn generalizable trajectory generators from distributions over demonstrated trajectories. However, in practical applications that require high precision in the manipulation of objects, the accuracy of ProMPs is often insufficient, in particular when they are learned in Cartesian space from external observations and executed with limited controller gains. Therefore, we propose to combine ProMPs with the Residual Reinforcement Learning (RRL) framework, to account for both corrections in position and orientation during task execution. In particular, we learn a residual on top of a nominal ProMP trajectory with Soft Actor-Critic and incorporate the variability in the demonstrations as a decision variable to reduce the search space for RRL. As a proof of concept, we evaluate our proposed method on a 3D block insertion task with a 7-DoF Franka Emika Panda robot. Experimental results show that the robot successfully learns to complete the insertion, which was not possible using basic ProMPs alone. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoaoCarvalho/Adapting_Object_Centric_Probabilistic_Movement_Primitives_with_Residual_Reinforcement_Learning___compressed.pdf |
Reference Type | Conference Proceedings |
Author(s) | Vorndamme, J.; Carvalho, J.; Laha, R.; Koert, D.; Figueredo, L.; Peters, J.; Haddadin, S. |
Year | 2022 |
Title | Integrated Bi-Manual Motion Generation and Control shaped for Probabilistic Movement Primitives |
Journal/Conference/Book Title | 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids) |
Abstract | This work introduces a novel cooperative control framework that allows for real-time reactiveness and adaptation whilst satisfying implicit constraints stemming from probabilistic/stochastic trajectories. Stemming from task-oriented sampling and/or task-oriented demonstrations, e.g., learning based on motion primitives, such trajectories carry additional information often neglected during real-time control deployment. In particular, methods such as probabilistic movement primitives offer the advantage to capture the inherent stochasticity in human demonstrations – which in turn reflects the human's understanding of task variability and adaptation possibilities. This information, however, is often poorly exploited and, mostly, used during the offline trajectory-planning stage. Our work instead introduces a novel real-time motion-generation strategy that explicitly exploits such information to improve trajectories according to changes in the environmental condition and robot task-space topology. The proposed solution is particularly well suited for bimanual and coordinated systems where the increased kinematic complexity, tightly-coupled constraints and reduced workspace have detrimental effects on the manipulability and joint limits, and are even capable of causing unstable behavior and task-failure. Our methodology addresses these challenges, and improves performance and task-execution by taking the confidence range region explicitly into account whilst maneuvering towards better configurations. Furthermore, it can directly cope with different closed-chain kinematics and task-space topologies, resulting for instance from different grasps. Experimental evaluations on a bi-manual Franka Panda robot show that the proposed method can run in the inner control loop of the robot and enables successful execution of highly constrained tasks. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoaoCarvalho/IntegratedBiManualMotionGenerationandControlshapedforProMPs.pdf |
Reference Type | Conference Proceedings |
Author(s) | Scherf, L.; Turan, C.; Koert, D. |
Year | 2022 |
Title | Learning from Unreliable Human Action Advice in Interactive Reinforcement Learning |
Journal/Conference/Book Title | 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids) |
Abstract | Interactive Reinforcement Learning (IRL) uses human input to improve learning speed and enable learning in more complex environments. Human action advice is here one of the input channels preferred by human users. However, many existing IRL approaches do not explicitly consider the possibility of inaccurate human action advice. Moreover, most approaches that account for inaccurate advice compute trust in human action advice independent of a state. This can lead to problems in practical cases, where human input might be inaccurate only in some states while it is still useful in others. To this end, we propose a novel algorithm that can handle state-dependent unreliable human action advice in IRL. Here, we combine three potential indicator signals for unreliable advice, i.e. consistency of advice, retrospective optimality of advice, and behavioral cues that hint at human uncertainty. We evaluate our method in a simulated gridworld and in robotic sorting tasks with 28 subjects. We show that our method outperforms a state-independent baseline and analyze occurrences of behavioral cues related to unreliable advice. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/LisaScherf/Scherf_Humanoids_2022_final.pdf |
Reference Type | Conference Proceedings |
Author(s) | Prasad, V.; Koert, D.; Stock-Homburg, R.; Peters, J.; Chalvatzaki, G. |
Year | 2022 |
Title | MILD: Multimodal Interactive Latent Dynamics for Learning Human-Robot Interaction |
Journal/Conference/Book Title | 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids) |
Abstract | Modeling interaction dynamics to generate robot trajectories that enable a robot to adapt and react to a human’s actions and intentions is critical for efficient and effective collaborative Human-Robot Interactions (HRI). Learning from Demonstration (LfD) methods from Human-Human Interactions (HHI) have shown promising results, especially when coupled with representation learning techniques. However, such methods for learning HRI either do not scale well to high dimensional data or cannot accurately adapt to changing via-poses of the interacting partner. We propose Multimodal Interactive Latent Dynamics (MILD), a method that couples deep representation learning and probabilistic machine learning to address the problem of two-party physical HRIs. We learn the interaction dynamics from demonstrations, using Hidden Semi-Markov Models (HSMMs) to model the joint distribution of the interacting agents in the latent space of a Variational Autoencoder (VAE). Our experimental evaluations for learning HRI from HHI demonstrations show that MILD effectively captures the multimodality in the latent representations of HRI tasks, allowing us to decode the varying dynamics occurring in such tasks. Compared to related work, MILD generates more accurate trajectories for the controlled agent (robot) when conditioned on the observed agent’s (human) trajectory. Notably, MILD can learn directly from camera-based pose estimations to generate trajectories, which we then map to a humanoid robot without the need for any additional training. Supplementary Material: https://bit.ly/MILD-HRI |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/VigneshPrasad/HUMANOIDS22-MILD-HRI.pdf |
Reference Type | Unpublished Work |
Author(s) | Carvalho, J.; Baierl, M.; Urain, J.; Peters, J. |
Year | 2022 |
Title | Conditioned Score-Based Models for Learning Collision-Free Trajectory Generation |
Journal/Conference/Book Title | NeurIPS 2022 Workshop on Score-Based Methods |
Abstract | Planning a motion in a cluttered environment is a recurring task autonomous agents need to solve. This paper presents a first attempt to learn generative models for collision-free trajectory generation based on conditioned score-based models. Given multiple navigation tasks, environment maps and collision-free trajectories pre-computed with a sample-based planner, using a signed distance function loss we learn a vision encoder of the map and use its embedding to learn a conditioned score-based model for trajectory generation. A novelty of our method is to integrate in a temporal U-net architecture, using a cross-attention mechanism, conditioning variables such as the latent representation of the environment and task features. We validate our approach in a simulated 2D planar navigation toy task, where a robot needs to plan a path that avoids obstacles in a scene. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoaoCarvalho/Conditioned_Score_Based_Models_for_Learning_Collision_Free_Trajectory_Generation.pdf |
Reference Type | Conference Paper |
Author(s) | Siebenborn, M.; Belousov, B.; Huang, J.; Peters, J. |
Year | 2022 |
Title | How Crucial is Transformer in Decision Transformer? |
Journal/Conference/Book Title | Foundation Models for Decision Making Workshop at Neural Information Processing Systems |
URL(s) | https://arxiv.org/abs/2211.14655 |
Link to PDF | https://arxiv.org/pdf/2211.14655.pdf |
Reference Type | Journal Article |
Author(s) | Urain, J.; Tateo, D.; Peters, J. |
Year | 2022 |
Title | Learning Stable Vector Fields on Lie Groups |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Keywords | Imitation learning, Lie groups, learning from demonstration, machine learning for robot control, reactive motion generation |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JulenUrainDeJesus/2022msvfurain.pdf |
Reference Type | Conference Proceedings |
Author(s) | Watson, J.; Hanher, B.; Peters, J. |
Year | 2022 |
Title | Differentiable Simulators as Gaussian Processes |
Journal/Conference/Book Title | R:SS Workshop: Differentiable Simulation for Robotics |
Reference Type | Conference Proceedings |
Author(s) | Watson, J.; Peters, J. |
Year | 2022 |
Title | Stationary Posterior Policy Iteration with Variational Inference |
Journal/Conference/Book Title | Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Reference Type | Conference Proceedings |
Author(s) | Bottero, A.G.; Luis, C.E.; Vinogradska, J.; Berkenkamp, F.; Peters, J. |
Year | 2022 |
Title | Information-Theoretic Safe Exploration with Gaussian Processes |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
Link to PDF | https://proceedings.neurips.cc/paper_files/paper/2022/file/c628644624c1be9c8cfb1541fa6421fd-Paper-Conference.pdf |
Reference Type | Conference Proceedings |
Author(s) | Le, A. T.; Urain, J.; Lambert, A.; Chalvatzaki, G.; Boots, B.; Peters, J. |
Year | 2022 |
Title | Learning Implicit Priors for Motion Optimization |
Journal/Conference/Book Title | RSS 2022 Workshop on Implicit Representations for Robotic Manipulation |
Reference Type | Journal Article |
Author(s) | Dam, T.; Chalvatzaki, G.; Peters, J.; Pajarinen, J. |
Year | 2022 |
Title | Monte-Carlo Robot Path Planning |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Volume | 7 |
Number | 4 |
Pages | 11213-11220 |
Link to PDF | https://arxiv.org/pdf/2208.02673 |
Reference Type | Journal Article |
Author(s) | Muratore, F.; Gienger, M.; Peters, J. |
Year | 2021 |
Title | Assessing Transferability from Simulation to Reality for Reinforcement Learning |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) |
Abstract | Learning robot control policies from physics simulations is of great interest to the robotics community as it may render the learning process faster, cheaper, and safer by alleviating the need for expensive real-world experiments. However, the direct transfer of learned behavior from simulation to reality is a major challenge. Optimizing a policy on a slightly faulty simulator can easily lead to the maximization of the ‘Simulation Optimization Bias’ (SOB). In this case, the optimizer exploits modeling errors of the simulator such that the resulting behavior can potentially damage the robot. We tackle this challenge by applying domain randomization, i.e., randomizing the parameters of the physics simulations during learning. We propose an algorithm called Simulation-based Policy Optimization with Transferability Assessment (SPOTA) which uses an estimator of the SOB to formulate a stopping criterion for training. The introduced estimator quantifies the over-fitting to the set of domains experienced while training. Our experimental results on two different second order nonlinear systems show that the new simulation-based policy search algorithm is able to learn a control policy exclusively from a randomized simulator, which can be applied directly to real systems without any additional training. |
Publisher | IEEE |
Volume | 43 |
Number | 4 |
Pages | 1172-1183 |
Date | April 2021 |
ISBN/ISSN | 0162-8828 |
URL(s) | https://doi.org/10.1109/TPAMI.2019.2952353 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Muratore_Gienger_Peters--AssessingTransferabilityfromSimulationToRealityForReinforcementLearning.pdf |
Language | English |
Reference Type | Journal Article |
Author(s) | Rawal, N.; Koert, D.; Turan, C.; Kersting, K.; Peters, J.; Stock-Homburg, R. |
Year | 2021 |
Title | ExGenNet: Learning to Generate Robotic Facial Expression Using Facial Expression Recognition |
Journal/Conference/Book Title | Frontiers in Robotics and AI |
Volume | 8 |
Number | 730317 |
URL(s) | https://www.frontiersin.org/articles/10.3389/frobt.2021.730317/full |
Reference Type | Journal Article |
Author(s) | Tosatto, S.; Akrour, R.; Peters, J. |
Year | 2021 |
Title | An Upper Bound of the Bias of Nadaraya-Watson Kernel Regression under Lipschitz Assumptions |
Journal/Conference/Book Title | Stats |
Keywords | Nonparametric, Bias, Kernel Regression |
Abstract | The Nadaraya-Watson kernel estimator is among the most popular nonparametric regression techniques thanks to its simplicity. Its asymptotic bias was studied by Rosenblatt in 1969 and has been reported in several related works. However, given its asymptotic nature, it gives no access to a hard bound. The increasing popularity of predictive tools for automated decision-making heightens the need for hard (non-probabilistic) guarantees. To alleviate this issue, we propose an upper bound of the bias which holds for finite bandwidths, using Lipschitz assumptions and mitigating some of the prerequisites of Rosenblatt’s analysis. Our bound has potential applications in fields like surgical robotics or self-driving cars, where hard guarantees on the prediction error are needed. |
Place Published | Basel, Switzerland |
Volume | 4 |
Pages | 1-17 |
URL(s) | https://www.mdpi.com/2571-905X/4/1/1/htm |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Alumni/SamueleTosatto/tosatto-mdpi-2021.pdf |
Reference Type | Journal Article |
Author(s) | Muratore, F.; Eilers, C.; Gienger, M.; Peters, J. |
Year | 2021 |
Title | Data-efficient Domain Randomization with Bayesian Optimization |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L), with Presentation at the IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | sim-2-real, domain randomization, bayesian optimization |
Abstract | When learning policies for robot control, the required real-world data is typically prohibitively expensive to acquire, so learning in simulation is a popular strategy. Unfortunately, such policies are often not transferable to the real world due to a mismatch between the simulation and reality, called 'reality gap'. Domain randomization methods tackle this problem by randomizing the physics simulator (source domain) during training according to a distribution over domain parameters in order to obtain more robust policies that are able to overcome the reality gap. Most domain randomization approaches sample the domain parameters from a fixed distribution. This solution is suboptimal in the context of sim-to-real transferability, since it yields policies that have been trained without explicitly optimizing for the reward on the real system (target domain). Additionally, a fixed distribution assumes there is prior knowledge about the uncertainty over the domain parameters. In this paper, we propose Bayesian Domain Randomization (BayRn), a black-box sim-to-real algorithm that solves tasks efficiently by adapting the domain parameter distribution during learning given sparse data from the real-world target domain. BayRn uses Bayesian optimization to search the space of source domain distribution parameters such that this leads to a policy which maximizes the real-world objective, allowing for adaptive distributions during policy optimization. We experimentally validate the proposed approach in sim-to-sim as well as in sim-to-real experiments, comparing against three baseline methods on two robotic tasks. Our results show that BayRn is able to perform sim-to-real transfer, while significantly reducing the required prior knowledge. |
Publisher | IEEE |
URL(s) | https://ieeexplore.ieee.org/document/9327467 |
Link to PDF | https://arxiv.org/pdf/2003.02471.pdf |
Language | English |
Last Modified Date | 2021-01-06 |
Reference Type | Journal Article |
Author(s) | Hoefer, S.; Bekris, K.; Handa, A.; Gamboa, J.C.; Golemo, F.; Mozifian, M.; Atkeson, C.G.; Fox, D.; Goldberg, K.; Leonard, J.; Liu, C.K.; Peters, J.; Song, S.; Welinder, P.; White, M. |
Year | 2021 |
Title | Sim2Real in Robotics and Automation: Applications and Challenges |
Journal/Conference/Book Title | IEEE Transactions on Automation Science and Engineering (T-ASE) |
Volume | 18 |
Number | 2 |
Pages | 398-400 |
Link to PDF | https://arxiv.org/pdf/2012.03806.pdf |
Reference Type | Journal Article |
Author(s) | Akrour, R.; Atamna, A.; Peters, J. |
Year | 2021 |
Title | Convex Optimization with an Interpolation-based Projection and its Application to Deep Learning |
Journal/Conference/Book Title | Machine Learning (MACH) |
Volume | 110 |
Number | 8 |
Pages | 2267-2289 |
Link to PDF | https://arxiv.org/pdf/2011.07016.pdf |
Reference Type | Book |
Author(s) | Belousov, B.; Abdulsamad, H.; Klink, P.; Parisi, S.; Peters, J. |
Year | 2021 |
Title | Reinforcement Learning Algorithms: Analysis and Applications |
Journal/Conference/Book Title | Studies in Computational Intelligence |
Publisher | Springer International Publishing |
Edition | 1 |
URL(s) | https://www.springer.com/gp/book/9783030411879 |
Reference Type | Conference Paper |
Author(s) | Watson, J.; Lin, J. A.; Klink, P.; Peters, J. |
Year | 2021 |
Title | Neural Linear Models with Functional Gaussian Process Priors |
Journal/Conference/Book Title | 3rd Symposium on Advances in Approximate Bayesian Inference (AABI) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoeWatson/AABI21.pdf |
Reference Type | Conference Proceedings |
Author(s) | Watson, J.; Peters, J. |
Year | 2021 |
Title | Advancing Trajectory Optimization with Approximate Inference: Exploration, Covariance Control and Adaptive Risk |
Journal/Conference/Book Title | American Control Conference (ACC) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoeWatson/cdc21.pdf |
Reference Type | Conference Proceedings |
Author(s) | Watson, J.; Lin, J. A.; Klink, P.; Pajarinen, J.; Peters, J. |
Year | 2021 |
Title | Latent Derivative Bayesian Last Layer Networks |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoeWatson/AISTATS21.pdf |
Reference Type | Journal Article |
Author(s) | Klink, P.; Abdulsamad, H.; Belousov, B.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | 2021 |
Title | A Probabilistic Interpretation of Self-Paced Learning with Applications to Reinforcement Learning |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Link to PDF | http://arxiv.org/abs/2102.13176 |
Reference Type | Conference Proceedings |
Author(s) | Tosatto, S.; Chalvatzaki, G.; Peters, J. |
Year | 2021 |
Title | Contextual Latent-Movements Off-Policy Optimization for Robotic Manipulation Skills |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://arxiv.org/pdf/2010.13766.pdf |
Reference Type | Conference Proceedings |
Author(s) | Li, Q.; Chalvatzaki, G.; Peters, J.; Wang, Y. |
Year | 2021 |
Title | Directed Acyclic Graph Neural Network for Human Motion Prediction |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
URL(s) | https://ieeexplore.ieee.org/document/9561540 |
Reference Type | Conference Proceedings |
Author(s) | Prasad, V.; Stock-Homburg, R.; Peters, J. |
Year | 2021 |
Title | Learning Human-like Hand Reaching for Human-Robot Handshaking |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/VigneshPrasad/ICRA22-Prasad.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Silberbauer, J.; Watson, J.; Peters, J. |
Year | 2021 |
Title | Differentiable Physics Models for Real-world Offline Model-based Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoeWatson/LutterICRA2021.pdf |
Reference Type | Conference Proceedings |
Author(s) | Morgan, A.; Nandha, D.; Chalvatzaki, G.; D'Eramo, C.; Dollar, A.; Peters, J. |
Year | 2021 |
Title | Model Predictive Actor-Critic: Accelerating Robot Skill Acquisition with Deep Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
URL(s) | https://ieeexplore.ieee.org/document/9561298 |
Reference Type | Conference Proceedings |
Author(s) | Abdulsamad, H.; Nickl, P.; Klink, P.; Peters, J. |
Year | 2021 |
Title | A Variational Infinite Mixture for Probabilistic Inverse Dynamics Learning |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
URL(s) | https://arxiv.org/pdf/2011.05217.pdf |
Reference Type | Conference Paper |
Author(s) | Dam, T.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | 2021 |
Title | Convex Regularization in Monte-Carlo Tree Search |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Abstract | Monte-Carlo planning and Reinforcement Learning (RL) are essential to sequential decision making. The recent AlphaGo and AlphaZero algorithms have shown how to successfully combine these two paradigms in order to solve large scale sequential decision problems. These methodologies exploit a variant of the well-known UCT algorithm to trade off exploitation of good actions and exploration of unvisited states, but their empirical success comes at the cost of poor sample-efficiency and high computation time. In this paper, we overcome these limitations by considering convex regularization in Monte-Carlo Tree Search (MCTS), which has been successfully used in RL to efficiently drive exploration. First, we introduce a unifying theory on the use of generic convex regularizers in MCTS, deriving the regret analysis and providing guarantees of exponential convergence rate. Second, we exploit our theoretical framework to introduce novel regularized backup operators for MCTS, based on the relative entropy of the policy update, and on the Tsallis entropy of the policy. Finally, we empirically evaluate the proposed operators in AlphaGo and AlphaZero on problems of increasing dimensionality and branching factor, from a toy problem to several Atari games, showing their superiority with respect to representative baselines. |
Link to PDF | https://arxiv.org/pdf/2007.00391.pdf |
Reference Type | Journal Article |
Author(s) | Tanneberg, D.; Ploeger, K.; Rueckert, E.; Peters, J. |
Year | 2021 |
Title | SKID RAW: Skill Discovery from Raw Trajectories |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | https://arxiv.org/pdf/2103.14610.pdf |
Reference Type | Conference Proceedings |
Author(s) | Abdulsamad, H.; Dorau, T.; Belousov, B.; Zhu, J.-J.; Peters, J. |
Year | 2021 |
Title | Distributionally Robust Trajectory Optimization Under Uncertain Dynamics via Relative Entropy Trust-Regions |
Journal/Conference/Book Title | arXiv |
Link to PDF | https://arxiv.org/pdf/2103.15388.pdf |
Reference Type | Conference Proceedings |
Author(s) | Carvalho, J.; Tateo, D.; Muratore, F.; Peters, J. |
Year | 2021 |
Title | An Empirical Analysis of Measure-Valued Derivatives for Policy Gradients |
Journal/Conference/Book Title | International Joint Conference on Neural Networks (IJCNN) |
Keywords | Measure-Valued Derivative, Policy Gradient, Reinforcement Learning |
Abstract | Reinforcement learning methods for robotics are increasingly successful due to the constant development of better policy gradient techniques. A precise (low variance) and accurate (low bias) gradient estimator is crucial to face increasingly complex tasks. Traditional policy gradient algorithms use the likelihood-ratio trick, which is known to produce unbiased but high variance estimates. More modern approaches exploit the reparametrization trick, which gives lower variance gradient estimates but requires differentiable value function approximators. In this work, we study a different type of stochastic gradient estimator: the Measure-Valued Derivative. This estimator is unbiased, has low variance, and can be used with differentiable and non-differentiable function approximators. We empirically evaluate this estimator in the actor-critic policy gradient setting and show that it can reach comparable performance with methods based on the likelihood-ratio or reparametrization tricks, both in low and high-dimensional action spaces. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoaoCarvalho/2021_ijcnn-mvd_rl.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Mannor, S.; Peters, J.; Fox, D.; Garg, A. |
Year | 2021 |
Title | Value Iteration in Continuous Actions, States and Time |
Journal/Conference/Book Title | International Conference on Machine Learning (ICML) |
Link to PDF | https://arxiv.org/pdf/2105.04682.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Mannor, S.; Peters, J.; Fox, D.; Garg, A. |
Year | 2021 |
Title | Robust Value Iteration for Continuous Control Tasks |
Journal/Conference/Book Title | Robotics: Science and Systems (RSS) |
Link to PDF | https://arxiv.org/pdf/2105.12189.pdf |
Reference Type | Journal Article |
Author(s) | Funk, N.; Baumann, D.; Berenz, V.; Trimpe, S. |
Year | 2021 |
Title | Learning event-triggered control from data through joint optimization |
Journal/Conference/Book Title | IFAC Journal of Systems and Control |
Keywords | Event-triggered control, Reinforcement learning, Stability verification, Neural networks |
Abstract | We present a framework for model-free learning of event-triggered control strategies. Event-triggered methods aim to achieve high control performance while only closing the feedback loop when needed. This enables resource savings, e.g., network bandwidth if control commands are sent via communication networks, as in networked control systems. Event-triggered controllers consist of a communication policy, determining when to communicate, and a control policy, deciding what to communicate. It is essential to jointly optimize the two policies since individual optimization does not necessarily yield the overall optimal solution. To address this need for joint optimization, we propose a novel algorithm based on hierarchical reinforcement learning. The resulting algorithm is shown to accomplish high-performance control in line with resource savings and scales seamlessly to nonlinear and high-dimensional systems. The method’s applicability to real-world scenarios is demonstrated through experiments on a six degrees of freedom real-time controlled manipulator. Further, we propose an approach towards evaluating the stability of the learned neural network policies. |
Volume | 16 |
Pages | 100144 |
ISBN/ISSN | 2468-6018 |
URL(s) | https://www.sciencedirect.com/science/article/pii/S2468601821000055 |
Electronic Resource Number | https://doi.org/10.1016/j.ifacsc.2021.100144 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NiklasFunk/Learn_etc_v1.pdf |
Reference Type | Conference Proceedings |
Author(s) | Urain, J.; Li, A.; Liu, P.; D'Eramo, C.; Peters, J. |
Year | 2021 |
Title | Composable Energy Policies for Reactive Motion Generation and Reinforcement Learning |
Journal/Conference/Book Title | Robotics: Science and Systems (RSS) |
Abstract | Reactive motion generation problems are usually solved by computing actions as a sum of policies. However, these policies are independent of each other and thus, they can have conflicting behaviors when summing their contributions together. We introduce Composable Energy Policies (CEP), a novel framework for modular reactive motion generation. CEP computes the control action by optimization over the product of a set of stochastic policies. This product of policies will provide a high probability to those actions that satisfy all the components and low probability to the others. Optimizing over the product of the policies avoids the detrimental effect of conflicting behaviors between policies, choosing an action that satisfies all the objectives. Besides, we show that CEP naturally adapts to the Reinforcement Learning problem, allowing us to integrate, in a hierarchical fashion, any distribution as a prior, from multimodal distributions to non-smooth distributions, and learn a new policy given them. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JulenUrainDeJesus/cep_rss_2021.pdf |
Reference Type | Book Section |
Author(s) | Hansel, K.; Moos, J.; Derstroff, C. |
Year | 2021 |
Title | Benchmarking the Natural Gradient in Policy Gradient Methods and Evolution Strategies |
Journal/Conference/Book Title | Reinforcement Learning Algorithms: Analysis and Applications |
Publisher | Springer |
Pages | 69-84 |
URL(s) | https://link.springer.com/chapter/10.1007/978-3-030-41188-6_7 |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Clever, D.; Kirsten, R.; Listmann, K.; Peters, J. |
Year | 2021 |
Title | Building Skill Learning Systems for Robotics |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Automation Science and Engineering (CASE) |
URL(s) | https://ieeexplore.ieee.org/document/9551562 |
Reference Type | Conference Proceedings |
Author(s) | Knaust, M.; Koert, D. |
Year | 2021 |
Title | Guided Robot Skill Learning: A User-Study on Learning Probabilistic Movement Primitives with Non-Experts |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | KOBO |
URL(s) | https://ieeexplore.ieee.org/document/9555785 |
Reference Type | Journal Article |
Author(s) | Lampariello, R.; Mishra, H.; Oumer, N.W.; Peters, J. |
Year | 2021 |
Title | Robust Motion Prediction of a Free-Tumbling Satellite with On-Ground Experimental Validation |
Journal/Conference/Book Title | Journal of Guidance, Control, and Dynamics |
Volume | 44 |
Number | 10 |
Pages | 1777-1793 |
URL(s) | https://arc.aiaa.org/doi/pdf/10.2514/1.G005745 |
Reference Type | Conference Proceedings |
Author(s) | Liu, P.; Tateo, D.; Bou-Ammar, H.; Peters, J. |
Year | 2021 |
Title | Efficient and Reactive Planning for High Speed Robot Air Hockey |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Abstract | Highly dynamic robotic tasks require high-speed and reactive robots. These tasks are particularly challenging due to the physical constraints, hardware limitations, and the high uncertainty of dynamics and sensor measurements. To face these issues, it is crucial to design robotic agents that generate precise and fast trajectories and react immediately to environmental changes. Air hockey is an example of this kind of task. Due to the environment's characteristics, it is possible to formalize the problem and derive clean mathematical solutions. For these reasons, this environment is perfect for pushing to the limit the performance of currently available general-purpose robotic manipulators. Using two Kuka Iiwa 14, we show how to design a policy for general-purpose robotic manipulators for the air hockey game. We demonstrate that a real robot arm can perform fast-hitting movements and that the two robots can play against each other on a medium-size air hockey table in simulation. |
URL(s) | https://sites.google.com/view/robot-air-hockey |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PuzeLiu/IROS_2021_Air_Hockey.pdf |
Reference Type | Conference Proceedings |
Author(s) | Muratore, F.; Gruner, T.; Wiese, F.; Belousov, B.; Gienger, M.; Peters, J. |
Year | 2021 |
Title | Neural Posterior Domain Randomization |
Journal/Conference/Book Title | Conference on Robot Learning (CoRL) |
Keywords | sim-to-real, domain randomization, likelihood-free inference |
Abstract | Combining domain randomization and reinforcement learning is a widely used approach to obtain control policies that can bridge the gap between simulation and reality. However, existing methods make limiting assumptions on the form of the domain parameter distribution which prevents them from utilizing the full power of domain randomization. Typically, a restricted family of probability distributions (e.g., normal or uniform) is chosen a priori for every parameter. Furthermore, straightforward approaches based on deep learning require differentiable simulators, which are either not available or can only simulate a limited class of systems. Such rigid assumptions diminish the applicability of domain randomization in robotics. Building upon recently proposed neural likelihood-free inference methods, we introduce Neural Posterior Domain Randomization (NPDR), an algorithm that alternates between learning a policy from a randomized simulator and adapting the posterior distribution over the simulator’s parameters in a Bayesian fashion. Our approach only requires a parameterized simulator, coarse prior ranges, a policy (optionally with optimization routine), and a small set of real-world observations. Most importantly, the domain parameter distribution is not restricted to a specific family, parameters can be correlated, and the simulator does not have to be differentiable. We show that the presented method is able to efficiently adapt the posterior over the domain parameters to closer match the observed dynamics. Moreover, we demonstrate that NPDR can learn transferable policies using fewer real-world rollouts than comparable algorithms. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Muratore_GWBPG--NeuralPosteriorDomainRandomization.pdf |
Language | English |
Last Modified Date | 2021-11-06 |
Reference Type | Conference Proceedings |
Author(s) | Wibranek, B.; Liu, Y.; Funk, N.; Belousov, B.; Peters, J.; Tessmann, O. |
Year | 2021 |
Title | Reinforcement Learning for Sequential Assembly of SL-Blocks: Self-Interlocking Combinatorial Design Based on Machine Learning |
Journal/Conference/Book Title | Proceedings of the 39th eCAADe Conference |
Keywords | SKILLS4ROBOTS |
Link to PDF | http://papers.cumincad.org/data/works/att/ecaade2021_247.pdf |
Reference Type | Journal Article |
Author(s) | Bustamante, S.; Peters, J.; Schoelkopf, B.; Grosse-Wentrup, M.; Jayaram, V. |
Year | 2021 |
Title | ArmSym: a virtual human-robot interaction laboratory for assistive robotics |
Journal/Conference/Book Title | IEEE Transactions on Human-Machine Systems |
Volume | 51 |
Number | 6 |
Pages | 568-577 |
Link to PDF | https://elib.dlr.de/147359/1/ArmSym_elib.pdf |
Reference Type | Journal Article |
Author(s) | D'Eramo, C.; Tateo, D.; Bonarini, A.; Restelli, M.; Peters, J. |
Year | 2021 |
Title | MushroomRL: Simplifying Reinforcement Learning Research |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Volume | 22 |
Number | 131 |
Pages | 1-5 |
Link to PDF | https://jmlr.org/papers/volume22/18-056/18-056.pdf |
Reference Type | Journal Article |
Author(s) | D'Eramo, C.; Cini, A.; Nuara, A.; Pirotta, M.; Alippi, C.; Peters, J.; Restelli, M. |
Year | 2021 |
Title | Gaussian Approximation for Bias Reduction in Q-Learning |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Link to PDF | https://www.jmlr.org/papers/volume22/20-633/20-633.pdf |
Reference Type | Conference Proceedings |
Author(s) | Funk, N.; Chalvatzaki, G.; Belousov, B.; Peters, J. |
Year | 2021 |
Title | Learn2Assemble with Structured Representations and Search for Robotic Architectural Construction |
Journal/Conference/Book Title | Conference on Robot Learning (CoRL) |
Keywords | Structured representations, Autonomous assembly, Manipulation |
Abstract | Autonomous robotic assembly requires a well-orchestrated sequence of high-level actions and smooth manipulation executions. Learning to assemble complex 3D structures remains a challenging problem that requires drawing connections between target designs and building blocks, and creating valid assembly sequences considering structural stability and feasibility. To address the combinatorial complexity of the assembly tasks, we propose a multi-head attention graph representation that can be trained with reinforcement learning (RL) to encode the spatial relations and provide meaningful assembly actions. Combining structured representations with model-free RL and Monte-Carlo planning allows agents to operate with various target shapes and building block types. We design a hierarchical control framework that learns to sequence the building blocks to construct arbitrary 3D designs and ensures their feasibility, as we plan the geometric execution with the robot-in-the-loop. We demonstrate the flexibility of the proposed structured representation and our algorithmic solution in a series of simulated 3D assembly tasks with robotic evaluation, which showcases our method's ability to learn to construct stable structures with a large number of building blocks. Code and videos are available at: https://sites.google.com/view/learn2assemble |
URL(s) | https://sites.google.com/view/learn2assemble |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/NiklasFunk/L2A_v1.pdf |
Reference Type | Conference Proceedings |
Author(s) | Liu, P.; Tateo, D.; Bou-Ammar, H.; Peters, J. |
Year | 2021 |
Title | Robot Reinforcement Learning on the Constraint Manifold |
Journal/Conference/Book Title | Proceedings of the Conference on Robot Learning (CoRL) |
Keywords | Robot Learning, Constrained Reinforcement Learning, Safe Exploration |
Abstract | Reinforcement learning in robotics is extremely challenging due to many practical issues, including safety, mechanical constraints, and wear and tear. Typically, these issues are not considered in the machine learning literature. One crucial problem in applying reinforcement learning in the real world is Safe Exploration, which requires physical and safety constraints satisfaction throughout the learning process. To explore in such a safety-critical environment, leveraging known information such as robot models and constraints is beneficial to provide more robust safety guarantees. Exploiting this knowledge, we propose a novel method to learn robotics tasks in simulation efficiently while satisfying the constraints during the learning process. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PuzeLiu/CORL_2021_Learning_on_the_Manifold.pdf |
Reference Type | Journal Article |
Author(s) | Abdulsamad, H.; Peters, J. |
Year | 2021 |
Title | Model-Based Reinforcement Learning for Stochastic Hybrid Systems |
Journal/Conference/Book Title | arXiv |
Link to PDF | https://arxiv.org/pdf/2111.06211.pdf |
Reference Type | Book Section |
Author(s) | Palenicek, D. |
Year | 2021 |
Title | A Survey on Constraining Policy Updates Using the KL Divergence |
Journal/Conference/Book Title | Reinforcement Learning Algorithms: Analysis and Applications |
Pages | 49-57 |
URL(s) | https://link.springer.com/chapter/10.1007/978-3-030-41188-6_5 |
Reference Type | Conference Proceedings |
Author(s) | Bauer, S.; Wüthrich, W.; Widmaier, F.; Buchholz, A.; Stark, S.; Goyal, A.; Steinbrenner, T.; Akpo, J.; Joshi, S.; Berenz, V.; Agrawal, V.; Funk, N.; Urain, J.; Peters, J.; Watson, J.; et al. |
Year | 2021 |
Title | Real Robot Challenge: A Robotics Competition in the Cloud |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
Link to PDF | https://proceedings.mlr.press/v176/bauer22a/bauer22a.pdf |
Reference Type | Journal Article |
Author(s) | Veiga, F.F.; Edin, B.B.; Peters, J. |
Year | 2020 |
Title | Grip Stabilization through Independent Finger Tactile Feedback Control |
Journal/Conference/Book Title | Sensors (Special Issue on Sensors and Robot Control) |
Volume | 20 |
Link to PDF | https://www.mdpi.com/1424-8220/20/6/1748/pdf |
Reference Type | Journal Article |
Author(s) | Vinogradska, J.; Bischoff, B.; Koller, T.; Achterhold, J.; Peters, J. |
Year | 2020 |
Title | Numerical Quadrature for Probabilistic Policy Search |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) |
Volume | 42 |
Number | 1 |
Pages | 164-175 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NuQuPS_preprint.pdf |
Reference Type | Journal Article |
Author(s) | Lioutikov, R.; Maeda, G.; Veiga, F.F.; Kersting, K.; Peters, J. |
Year | 2020 |
Title | Learning Attribute Grammars for Movement Primitive Sequencing |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | SKILLS4ROBOTS |
Volume | 39 |
Number | 1 |
Pages | 21-38 |
Reference Type | Journal Article |
Author(s) | Arenz, O.; Zhong, M.; Neumann, G. |
Year | 2020 |
Title | Trust-Region Variational Inference with Gaussian Mixture Models |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Keywords | approximate inference, variational inference, sampling, policy search, mcmc, markov chain monte carlo |
Abstract | Many methods for machine learning rely on approximate inference from intractable probability distributions. Variational inference approximates such distributions by tractable models that can be subsequently used for approximate inference. Learning sufficiently accurate approximations requires a rich model family and careful exploration of the relevant modes of the target distribution. We propose a method for learning accurate GMM approximations of intractable probability distributions based on insights from policy search by using information-geometric trust regions for principled exploration. For efficient improvement of the GMM approximation, we derive a lower bound on the corresponding optimization objective enabling us to update the components independently. Our use of the lower bound ensures convergence to a stationary point of the original objective. The number of components is adapted online by adding new components in promising regions and by deleting components with negligible weight. We demonstrate on several domains that we can learn approximations of complex, multimodal distributions with a quality that is unmet by previous variational inference methods, and that the GMM approximation can be used for drawing samples that are on par with samples created by state-of-the-art MCMC samplers while requiring up to three orders of magnitude less computational resources. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/OlegArenz/VIPS_JMLR.pdf |
Reference Type | Journal Article |
Author(s) | Gomez-Gonzalez, S.; Neumann, G.; Schoelkopf, B.; Peters, J. |
Year | 2020 |
Title | Adaptation and Robust Learning of Probabilistic Movement Primitives |
Journal/Conference/Book Title | IEEE Transactions on Robotics (T-Ro) |
Volume | 36 |
Number | 2 |
Pages | 366-379 |
Link to PDF | https://arxiv.org/pdf/1808.10648.pdf |
Reference Type | Journal Article |
Author(s) | Koert, D.; Trick, S.; Ewerton, M.; Lutter, M.; Peters, J. |
Year | 2020 |
Title | Incremental Learning of an Open-Ended Collaborative Skill Library |
Journal/Conference/Book Title | International Journal of Humanoid Robotics (IJHR) |
Keywords | SKILLS4ROBOTS, KOBO |
Volume | 17 |
Number | 1 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/Koert_IJHR_2020.pdf |
Reference Type | Conference Proceedings |
Author(s) | Dam, T.; Klink, P.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | 2020 |
Title | Generalized Mean Estimation in Monte-Carlo Tree Search |
Journal/Conference/Book Title | Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) |
Abstract | We consider Monte-Carlo Tree Search (MCTS) applied to Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs), and the well-known Upper Confidence bound for Trees (UCT) algorithm. In UCT, a tree with nodes (states) and edges (actions) is incrementally built by the expansion of nodes, and the values of nodes are updated through a backup strategy based on the average value of child nodes. However, it has been shown that with enough samples the maximum operator yields more accurate node value estimates than averaging. Instead of settling for one of these value estimates, we go a step further proposing a novel backup strategy which uses the power mean operator, which computes a value between the average and maximum value. We call our new approach Power-UCT and argue how the use of the power mean operator helps to speed up the learning in MCTS. We theoretically analyze our method providing guarantees of convergence to the optimum. Moreover, we discuss a heuristic approach to balance the greediness of backups by tuning the power mean operator according to the number of visits to each node. Finally, we empirically demonstrate the effectiveness of our method in well-known MDP and POMDP benchmarks, showing significant improvement in performance and convergence speed w.r.t. UCT. |
URL(s) | https://www.ijcai.org/Proceedings/2020/0332.pdf |
Link to PDF | https://www.ijcai.org/Proceedings/2020/0332.pdf |
Reference Type | Journal Article |
Author(s) | Loeckel, S.; Peters, J.; van Vliet, P. |
Year | 2020 |
Title | A Probabilistic Framework for Imitating Human Race Driver Behavior |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L), with Presentation at the IEEE International Conference on Robotics and Automation (ICRA) |
Volume | 5 |
Number | 2 |
Link to PDF | https://arxiv.org/pdf/2001.08255.pdf |
Reference Type | Conference Proceedings |
Author(s) | Motokura, K.; Takahashi, M.; Ewerton, M.; Peters, J. |
Year | 2020 |
Title | Plucking Motions for Tea Harvesting Robots Using Probabilistic Movement Primitives |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (ICRA/RA-L), with Presentation at the IEEE International Conference on Robotics and Automation (ICRA) |
Volume | 5 |
Number | 2 |
ISBN/ISSN | 2377-3766 |
URL(s) | https://ieeexplore.ieee.org/document/9013082 |
Reference Type | Conference Proceedings |
Author(s) | Zelch, C.; Peters, J.; von Stryk, O. |
Year | 2020 |
Title | Learning Control Policies from Optimal Trajectories |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/Members/Zelch_ICRA_2020.pdf |
Reference Type | Journal Article |
Author(s) | Gomez-Gonzalez, S.; Prokudin, S.; Schoelkopf, B.; Peters, J. |
Year | 2020 |
Title | Real Time Trajectory Prediction Using Deep Conditional Generative Models |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (ICRA/RA-L), with Presentation at the IEEE International Conference on Robotics and Automation (ICRA) |
Volume | 5 |
Number | 2 |
Pages | 970-976 |
Link to PDF | https://arxiv.org/pdf/1909.03895.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tosatto, S.; Carvalho, J.; Abdulsamad, H.; Peters, J. |
Year | 2020 |
Title | A Nonparametric Off-Policy Policy Gradient |
Journal/Conference/Book Title | Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS) |
Keywords | nonparametric, policy gradient, off policy, reinforcement learning |
Abstract | Reinforcement learning (RL) algorithms still suffer from high sample complexity despite outstanding recent successes. The need for intensive interactions with the environment is especially observed in many widely popular policy gradient algorithms that perform updates using on-policy samples. The price of such inefficiency becomes evident in real-world scenarios such as interaction-driven robot learning, where the success of RL has been rather limited. We address this issue by building on the general sample efficiency of off-policy algorithms. With nonparametric regression and density estimation methods we construct a nonparametric Bellman equation in a principled manner, which allows us to obtain closed-form estimates of the value function, and to analytically express the full policy gradient. We provide a theoretical analysis of our estimate to show that it is consistent under mild smoothness assumptions and empirically show that our approach has better sample efficiency than state-of-the-art policy gradient methods. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SamueleTosatto/tosatto2020.pdf |
Reference Type | Conference Proceedings |
Author(s) | D'Eramo, C.; Tateo, D.; Bonarini, A.; Restelli, M.; Peters, J. |
Year | 2020 |
Title | Sharing Knowledge in Multi-Task Deep Reinforcement Learning |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Link to PDF | https://openreview.net/pdf?id=rkgpv2VFvr |
Reference Type | Conference Proceedings |
Author(s) | Eilers, C.; Eschmann, J.; Menzenbach, R.; Belousov, B.; Muratore, F.; Peters, J. |
Year | 2020 |
Title | Underactuated Waypoint Trajectory Optimization for Light Painting Photography |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | SKILLS4ROBOTS |
Abstract | Despite their abundance in robotics and nature, underactuated systems remain a challenge for control engineering. Trajectory optimization provides a generally applicable solution, however its efficiency strongly depends on the skill of the engineer to frame the problem in an optimizer-friendly way. This paper proposes a procedure that automates such problem reformulation for a class of tasks in which the desired trajectory is specified by a sequence of waypoints. The approach is based on introducing auxiliary optimization variables that represent waypoint activations. To validate the proposed method, a letter drawing task is set up where shapes traced by the tip of a rotary inverted pendulum are visualized using long exposure photography. |
Custom 1 | https://www.youtube.com/watch?v=IiophaKtWG0&feature=youtu.be |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Eilers_Eschmann_Menzenbach_BMP--UnderactuatedWaypointTrajectoryOptimizationforLightPaintingPhotography.pdf |
Reference Type | Conference Proceedings |
Author(s) | Stock-Homburg, R.; Peters, J.; Schneider, K.; Prasad, V.; Nukovic, L. |
Year | 2020 |
Title | Evaluation of the Handshake Turing Test for Anthropomorphic Robots |
Journal/Conference/Book Title | Proceedings of the ACM/IEEE International Conference on Human Robot Interaction (HRI), Late Breaking Report |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/VigneshPrasad/HRI20-RSH.pdf |
Reference Type | Report |
Author(s) | Tosatto, S.; Stadtmueller, J.; Peters, J. |
Year | 2020 |
Title | Dimensionality Reduction of Movement Primitives in Parameter Space |
Journal/Conference/Book Title | arXiv |
Keywords | movement primitives, dimensionality reduction, imitation learning, robot learning |
Abstract | Movement primitives are an important policy class for real-world robotics. However, the high dimensionality of their parametrization makes the policy optimization expensive both in terms of samples and computation. Enabling an efficient representation of movement primitives facilitates the application of machine learning techniques such as reinforcement on robotics. Motions, especially in highly redundant kinematic structures, exhibit high correlation in the configuration space. For these reasons, prior work has mainly focused on the application of dimensionality reduction techniques in the configuration space. In this paper, we investigate the application of dimensionality reduction in the parameter space, identifying principal movements. The resulting approach is enriched with a probabilistic treatment of the parameters, inheriting all the properties of the Probabilistic Movement Primitives. We test the proposed technique both on a real robotic task and on a database of complex human movements. The empirical analysis shows that the dimensionality reduction in parameter space is more effective than in configuration space, as it enables the representation of the movements with a significant reduction of parameters. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SamueleTosatto/tosatto2020b.pdf |
Reference Type | Conference Proceedings |
Author(s) | Almeida Santos, A.; Gil, C.E.M.; Peters, J.; Steinke, F. |
Year | 2020 |
Title | Decentralized Data-Driven Tuning of Droop Frequency Controllers |
Journal/Conference/Book Title | 2020 IEEE PES Innovative Smart Grid Technologies Europe |
Link to PDF | https://www.eins.tu-darmstadt.de/fileadmin/user_upload/publications_pdf/20_ISGTEU_SanMorPetSte_paper.pdf |
Reference Type | Conference Proceedings |
Author(s) | Abdulsamad, H.; Peters, J. |
Year | 2020 |
Title | Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation |
Journal/Conference/Book Title | 2nd Annual Conference on Learning for Dynamics and Control |
Link to PDF | https://arxiv.org/abs/2005.01432 |
Reference Type | Journal Article |
Author(s) | Ewerton, M.; Arenz, O.; Peters, J. |
Year | 2020 |
Title | Assisted Teleoperation in Changing Environments with a Mixture of Virtual Guides |
Journal/Conference/Book Title | Advanced Robotics |
Volume | 34 |
Number | 18 |
Link to PDF | https://arxiv.org/pdf/2008.05251.pdf |
Reference Type | Conference Proceedings |
Author(s) | Becker, P.; Arenz, O.; Neumann, G. |
Year | 2020 |
Title | Expected Information Maximization: Using the I-Projection for Mixture Density Estimation |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Link to PDF | /uploads/Team/OlegArenz/beckerEIM.pdf |
Reference Type | Journal Article |
Author(s) | Lauri, M.; Pajarinen, J.; Peters, J.; Frintrop, S. |
Year | 2020 |
Title | Multi-Sensor Next-Best-View Planning as Matroid-Constrained Submodular Maximization |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Volume | 5 |
Number | 4 |
Pages | 5323-5330 |
Link to PDF | https://arxiv.org/pdf/2007.02084.pdf |
Reference Type | Conference Proceedings |
Author(s) | Agudelo-Espana, D.; Gomez-Gonzalez, S.; Bauer, S.; Schoelkopf, B.; Peters, J. |
Year | 2020 |
Title | Bayesian Online Prediction of Change Points |
Journal/Conference/Book Title | Conference on Uncertainty in Artificial Intelligence (UAI) |
Link to PDF | https://arxiv.org/pdf/1902.04524.pdf |
Reference Type | Conference Proceedings |
Author(s) | Laux, M.; Arenz, O.; Pajarinen, J.; Peters, J. |
Year | 2020 |
Title | Deep Adversarial Reinforcement Learning for Object Disentangling |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2020) |
Link to PDF | /uploads/Site/EditPublication/Melvin_Iros.pdf |
Reference Type | Journal Article |
Author(s) | Koert, D.; Kircher, M.; Salikutluk, V.; D'Eramo, C.; Peters, J. |
Year | 2020 |
Title | Multi-Channel Interactive Reinforcement Learning for Sequential Tasks |
Journal/Conference/Book Title | Frontiers in Robotics and AI (Human-Robot Interaction Section) |
Keywords | SKILLS4ROBOTS, KOBO |
URL(s) | https://www.frontiersin.org/articles/10.3389/frobt.2020.00097/full |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DorotheaKoert/multi_channel_feedback_rl_sequential.pdf |
Reference Type | Report |
Author(s) | Arenz, O.; Neumann, G. |
Year | 2020 |
Title | Non-Adversarial Imitation Learning and its Connections to Adversarial Methods |
Journal/Conference/Book Title | arXiv |
Keywords | Imitation Learning, Inverse Reinforcement Learning, Non-Adversarial Imitation Learning, Adversarial Imitation Learning, AIRL |
Abstract | Many modern methods for imitation learning and inverse reinforcement learning, such as GAIL or AIRL, are based on an adversarial formulation. These methods apply GANs to match the expert's distribution over states and actions with the implicit state-action distribution induced by the agent's policy. However, by framing imitation learning as a saddle point problem, adversarial methods can suffer from unstable optimization, and convergence can only be shown for small policy updates. We address these problems by proposing a framework for non-adversarial imitation learning. The resulting algorithms are similar to their adversarial counterparts and, thus, provide insights for adversarial imitation learning methods. Most notably, we show that AIRL is an instance of our non-adversarial formulation, which enables us to greatly simplify its derivations and obtain stronger convergence guarantees. We also show that our non-adversarial formulation can be used to derive novel algorithms by presenting a method for offline imitation learning that is inspired by the recent ValueDice algorithm, but does not rely on small policy updates for convergence. In our simulated robot experiments, our offline method for non-adversarial imitation learning seems to perform best when using many updates for policy and discriminator at each iteration and outperforms behavioral cloning and ValueDice. |
Link to PDF | /uploads/Team/OlegArenz/nail_arxiv.pdf |
Reference Type | Unpublished Work |
Author(s) | Abi-Farraj, F.; Pacchierotti, C.; Arenz, O.; Neumann, G.; Giordano, P. |
Year | 2020 |
Title | Haptic-based Guided Grasping in a Cluttered Environment |
Journal/Conference/Book Title | IEEE Haptics Symposium |
Link to PDF | https://www.roboticvision.org/wp-content/uploads/Haptic-based-Guided-Grasping-in-a-Cluttered-Environment.pdf |
Reference Type | Conference Proceedings |
Author(s) | Keller, L.; Tanneberg, D.; Stark, S.; Peters, J. |
Year | 2020 |
Title | Model-Based Quality-Diversity Search for Efficient Robot Learning |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | https://arxiv.org/pdf/2008.04589.pdf |
Reference Type | Conference Proceedings |
Author(s) | Klink, P.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | 2020 |
Title | Self-Paced Deep Reinforcement Learning |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PascalKlink/neurips-2020-2.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ploeger, K.; Lutter, M.; Peters, J. |
Year | 2020 |
Title | High Acceleration Reinforcement Learning for Real-World Juggling with Binary Rewards |
Journal/Conference/Book Title | Conference on Robot Learning (CoRL) |
Link to PDF | https://arxiv.org/pdf/2010.13483.pdf |
Reference Type | Conference Proceedings |
Author(s) | Urain, J.; Ginesi, M.; Tateo, D.; Peters, J. |
Year | 2020 |
Title | ImitationFlow: Learning Deep Stable Stochastic Dynamic Systems by Normalizing Flows |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | Movement Primitives, Imitation Learning |
Abstract | We introduce ImitationFlow, a novel Deep generative model that allows learning complex globally stable, stochastic, nonlinear dynamics. Our approach extends the Normalizing Flows framework to learn stable Stochastic Differential Equations. We prove the Lyapunov stability for a class of Stochastic Differential Equations and we propose a learning algorithm to learn them from a set of demonstrated trajectories. Our model extends the set of stable dynamical systems that can be represented by state-of-the-art approaches, eliminates the Gaussian assumption on the demonstrations, and outperforms the previous algorithms in terms of representation accuracy. We show the effectiveness of our method with both standard datasets and a real robot experiment. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JulenUrainDeJesus/2020iflowurain.pdf |
Reference Type | Conference Proceedings |
Author(s) | Abdulsamad, H.; Peters, J. |
Year | 2020 |
Title | Learning Hybrid Dynamics and Control |
Journal/Conference/Book Title | ECML/PKDD Workshop on Deep Continuous-Discrete Machine Learning |
Reference Type | Journal Article |
Author(s) | Tanneberg, D.; Rueckert, E.; Peters, J. |
Year | 2020 |
Title | Evolutionary Training and Abstraction Yields Algorithmic Generalization of Neural Computers |
Journal/Conference/Book Title | Nature Machine Intelligence |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Volume | 2 |
Number | 12 |
Pages | 753-763 |
URL(s) | https://rdcu.be/caRlg |
Link to PDF | https://arxiv.org/pdf/2105.07957.pdf |
Reference Type | Journal Article |
Author(s) | Veiga, F. F.; Akrour, R.; Peters, J. |
Year | 2020 |
Title | Hierarchical Tactile-Based Control Decomposition of Dexterous In-Hand Manipulation Tasks |
Journal/Conference/Book Title | Frontiers in Robotics and AI |
URL(s) | https://www.frontiersin.org/articles/10.3389/frobt.2020.521448/full |
Reference Type | Conference Proceedings |
Author(s) | Urain, J.; Tateo, D.; Ren, T.; Peters, J. |
Year | 2020 |
Title | Structured Policy Representation: Imposing Stability in Arbitrarily Conditioned Dynamic Systems |
Journal/Conference/Book Title | NeurIPS 2020, 3rd Robot Learning Workshop |
Keywords | Movement Primitives, Imitation Learning, Inductive Bias |
Abstract | We present a new family of deep neural network-based dynamic systems. The presented dynamics are globally stable and can be conditioned with an arbitrary context state. We show how these dynamics can be used as structured robot policies. Global stability is one of the most important and straightforward inductive biases as it allows us to impose reasonable behaviors outside the region of the demonstrations. |
Pages | 7 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JulenUrainDeJesus/2020_structuredpolicy_urain.pdf |
Reference Type | Conference Proceedings |
Author(s) | Watson, J.; Imohiosen, A.; Peters, J. |
Year | 2020 |
Title | Active Inference or Control as Inference? A Unifying View |
Journal/Conference/Book Title | International Workshop on Active Inference |
Reference Type | Conference Proceedings |
Author(s) | Prasad, V.; Stock-Homburg, R.; Peters, J. |
Year | 2020 |
Title | Advances in Human-Robot Handshaking |
Journal/Conference/Book Title | International Conference on Social Robotics |
Publisher | Springer |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/VigneshPrasad/ICSR21-Prasad.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Clever, D.; Belousov, B.; Listmann, K.; Peters, J. |
Year | 2020 |
Title | Evaluating the Robustness of HJB Optimal Feedback Control |
Journal/Conference/Book Title | International Symposium on Robotics |
Reference Type | Conference Paper |
Author(s) | Rother, D.; Haider, T.; Eger, S. |
Year | 2020 |
Title | CMCE at SemEval-2020 Task 1: Clustering on Manifolds of Contextualized Embeddings to Detect Historical Meaning Shifts |
Journal/Conference/Book Title | 14th International Workshop on Semantic Evaluation (SemEval) |
Keywords | Natural Language Processing, Unsupervised Clustering, Semantic Shift Detection, Semantic Evaluation |
Abstract | This paper describes the system Clustering on Manifolds of Contextualized Embeddings (CMCE) submitted to the SemEval-2020 Task 1 on Unsupervised Lexical Semantic Change Detection. Subtask 1 asks to identify whether or not a word gained/lost a sense across two time periods. Subtask 2 is about computing a ranking of words according to the amount of change their senses underwent. Our system uses contextualized word embeddings from MBERT, whose dimensionality we reduce with an autoencoder and the UMAP algorithm, to be able to use a wider array of clustering algorithms that can automatically determine the number of clusters. We use Hierarchical Density Based Clustering (HDBSCAN) and compare it to Gaussian Mixture Models (GMMs) and other clustering algorithms. Remarkably, with only 10 dimensional MBERT embeddings (reduced from the original size of 768), our submitted model performs best on subtask 1 for English and ranks third in subtask 2 for English. In addition to describing our system, we discuss our hyperparameter configurations and examine why our system lags behind for the other languages involved in the shared task (German, Swedish, Latin). Our code is available at https://github.com/DavidRother/semeval2020-task1 |
Pages | 187-193 |
Link to PDF | https://pure.mpg.de/rest/items/item_3278784/component/file_3278792/content |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Silberbauer, J.; Watson, J.; Peters, J. |
Year | 2020 |
Title | A Differentiable Newton Euler Algorithm for Multi-body Model Learning |
Journal/Conference/Book Title | Robotics: Science and Systems (R:SS) Workshop on Structured Approaches to Robot Learning |
Reference Type | Journal Article |
Author(s) | Tanneberg, D.; Peters, J.; Rueckert, E. |
Year | 2019 |
Title | Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks |
Journal/Conference/Book Title | Neural Networks |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Volume | 109 |
Pages | 67-80 |
URL(s) | https://doi.org/10.1016/j.neunet.2018.10.005 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DanielTanneberg/tanneberg_NN18.pdf |
Reference Type | Journal Article |
Author(s) | Koc, O.; Peters, J. |
Year | 2019 |
Title | Learning to Serve: An Experimental Study for a New Learning from Demonstrations Framework |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (ICRA/RA-L), with Presentation at the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Learning_to_Serve_2019.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lauri, M.; Pajarinen, J.; Peters, J. |
Year | 2019 |
Title | Information Gathering in Decentralized POMDPs by Policy Graph Improvement |
Journal/Conference/Book Title | Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS) |
Keywords | ROBOLEAP,SKILLS4ROBOTS |
Link to PDF | https://arxiv.org/pdf/1902.09840 |
Reference Type | Journal Article |
Author(s) | Brandherm, F.; Peters, J.; Neumann, G.; Akrour, R. |
Year | 2019 |
Title | Learning Replanning Policies with Direct Policy Search |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/RiadAkrour/florian_ral_sub.pdf |
Reference Type | Journal Article |
Author(s) | Gebhardt, G.H.W.; Kupcsik, A.; Neumann, G. |
Year | 2019 |
Title | The Kernel Kalman Rule |
Journal/Conference/Book Title | Machine Learning Journal (MLJ) |
Publisher | Springer US |
Volume | 108 |
Number | 12 |
Pages | 2113-2157 |
ISBN/ISSN | 0885-6125 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/GregorGebhardt/TheKernelKalmanRuleJournal.pdf |
Reference Type | Journal Article |
Author(s) | Parisi, S.; Tangkaratt, V.; Peters, J.; Khan, M. E. |
Year | 2019 |
Title | TD-Regularized Actor-Critic Methods |
Journal/Conference/Book Title | Machine Learning (MLJ) |
Volume | 108 |
Number | 8 |
Pages | 1467-1501 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SimoneParisi/parisi2019mlj.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Ritter, C.; Peters, J. |
Year | 2019 |
Title | Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/MichaelLutter/lutter_iclr_2019.pdf |
Reference Type | Journal Article |
Author(s) | Koc, O.; Maeda, G.; Peters, J. |
Year | 2019 |
Title | Optimizing the Execution of Dynamic Robot Movements with Learning Control |
Journal/Conference/Book Title | IEEE Transactions on Robotics |
Volume | 35 |
Number | 4 |
ISBN/ISSN | 1552-3098 |
Link to PDF | https://arxiv.org/pdf/1807.01918.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tosatto, S.; D'Eramo, C.; Pajarinen, J.; Restelli, M.; Peters, J. |
Year | 2019 |
Title | Exploration Driven By an Optimistic Bellman Equation |
Journal/Conference/Book Title | Proceedings of the International Joint Conference on Neural Networks (IJCNN) |
Keywords | exploration; reinforcement learning; intrinsic motivation; Bosch-Forschungstiftung |
Abstract | Exploring high-dimensional state spaces and finding sparse rewards are central problems in reinforcement learning. Exploration strategies are frequently either naı̈ve (e.g., simplistic epsilon-greedy or Boltzmann policies), intractable (i.e., full Bayesian treatment of reinforcement learning) or rely heavily on heuristics. The lack of a tractable but principled exploration approach unnecessarily complicates the application of reinforcement learning to a broader range of problems. Efficient exploration can be accomplished by relying on the uncertainty of the state-action value function. To obtain the uncertainty, we maintain an ensemble of value function estimates and present an optimistic Bellman equation (OBE) for such ensembles. This OBE is derived from a relative entropy maximization principle and yields an implicit exploration bonus resulting in improved exploration during action selection. The implied exploration bonus can be seen as a well-principled type of intrinsic motivation and exhibits favorable theoretical properties. OBE can be applied to a wide range of algorithms. We propose two algorithms as an application of the principle: Optimistic Q-learning and Optimistic DQN which outperform comparison methods on standard benchmarks. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SamueleTosatto/TosattoIJCNN2019.pdf |
Language | English |
Reference Type | Conference Proceedings |
Author(s) | Wibranek, B.; Belousov, B.; Sadybakasov, A.; Tessmann, O. |
Year | 2019 |
Title | Interactive Assemblies: Man-Machine Collaboration through Building Components for As-Built Digital Models |
Journal/Conference/Book Title | Computer-Aided Architectural Design Futures (CAAD Futures) |
Keywords | SKILLS4ROBOTS |
Link to PDF | /uploads/Team/BorisBelousov/wibranek_caad19.pdf |
Reference Type | Journal Article |
Author(s) | Abi-Farraj, F.; Pacchierotti, C.; Arenz, O.; Neumann, G.; Giordano, P. |
Year | 2019 |
Title | A Haptic Shared-Control Architecture for Guided Multi-Target Robotic Grasping |
Journal/Conference/Book Title | IEEE Transactions on Haptics |
Keywords | Grasping, Task analysis, Manipulators, Grippers, Service robots |
Abstract | Although robotic telemanipulation has always been a key technology for the nuclear industry, little advancement has been seen over the last decades. Despite complex remote handling requirements, simple mechanically-linked master-slave manipulators still dominate the field. Nonetheless, there is a pressing need for more effective robotic solutions able to significantly speed up the decommissioning of legacy radioactive waste. This paper describes a novel haptic shared-control approach for assisting a human operator in the sort and segregation of different objects in a cluttered and unknown environment. A 3D scan of the scene is used to generate a set of potential grasp candidates on the objects at hand. These grasp candidates are then used to generate guiding haptic cues, which assist the operator in approaching and grasping the objects. The haptic feedback is designed to be smooth and continuous as the user switches from a grasp candidate to the next one, or from one object to another one, avoiding any discontinuity or abrupt changes. To validate our approach, we carried out two human-subject studies, enrolling 15 participants. We registered an average improvement of 20.8%, 20.1%, 32.5% in terms of completion time, linear trajectory, and perceived effectiveness, respectively, between the proposed approach and standard teleoperation. |
URL(s) | https://ieeexplore.ieee.org/document/8700204 |
Link to PDF | https://inria.hal.science/hal-02113206/file/abi_farraj-TOH-sharedcontrol-grasping.pdf |
Reference Type | Conference Proceedings |
Author(s) | Akrour, R.; Pajarinen, J.; Neumann, G.; Peters, J. |
Year | 2019 |
Title | Projections for Approximate Policy Iteration Algorithms |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Keywords | ROBOLEAP,SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/RiadAkrour/papi.pdf |
Reference Type | Conference Proceedings |
Author(s) | Becker-Ehmck, P.; Peters, J.; van der Smagt, P. |
Year | 2019 |
Title | Switching Linear Dynamics for Variational Bayes Filtering |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Link to PDF | https://arxiv.org/pdf/1905.12434.pdf |
Reference Type | Conference Proceedings |
Author(s) | Belousov, B.; Abdulsamad, H.; Schultheis, M.; Peters, J. |
Year | 2019 |
Title | Belief Space Model Predictive Control for Approximately Optimal System Identification |
Journal/Conference/Book Title | 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Keywords | SKILLS4ROBOTS |
Abstract | The fundamental problem of reinforcement learning is to control a dynamical system whose properties are not fully known in advance. Many articles nowadays are addressing the issue of optimal exploration in this setting by investigating ideas such as curiosity, intrinsic motivation, empowerment, and others. Interestingly, closely related questions of optimal input design with the goal of producing the most informative system excitation have been studied in adjacent fields grounded in statistical decision theory. In the most general terms, the problem faced by a curious reinforcement learning agent can be stated as a sequential Bayesian optimal experimental design problem. It is well known that finding an optimal feedback policy for this type of setting is extremely hard and analytically intractable even for linear systems due to the non-linearity of the Bayesian filtering step. Therefore, approximations are needed. We consider one type of approximation based on replacing the feedback policy by repeated trajectory optimization in the belief space. By reasoning about the future uncertainty over the internal world model, the agent can decide what actions to take at every moment given its current belief and expected outcomes of future actions. Such an approach became computationally feasible relatively recently, thanks to advances in automatic differentiation. Being straightforward to implement, it can serve as a strong baseline for exploration algorithms in continuous robotic control tasks. Preliminary evaluations on a physical pendulum with unknown system parameters indicate that the proposed approach can infer the correct parameter values quickly and reliably, outperforming random excitation and naive sinusoidal excitation signals, and matching the performance of the best manually designed system identification controller based on the knowledge of the system dynamics. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/rldm19_belousov.pdf |
Reference Type | Journal Article |
Author(s) | Belousov, B.; Peters, J. |
Year | 2019 |
Title | Entropic Regularization of Markov Decision Processes |
Journal/Conference/Book Title | Entropy |
Keywords | SKILLS4ROBOTS |
Publisher | MDPI |
Volume | 21 |
Number | 7 |
ISBN/ISSN | 1099-4300 |
URL(s) | https://www.mdpi.com/1099-4300/21/7/674 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/entropy19_belousov.pdf |
Reference Type | Journal Article |
Author(s) | Pajarinen, J.; Thai, H.L.; Akrour, R.; Peters, J.; Neumann, G. |
Year | 2019 |
Title | Compatible Natural Gradient Policy Search |
Journal/Conference/Book Title | Machine Learning (MLJ) |
Keywords | ROBOLEAP,SKILLS4ROBOTS |
Publisher | Springer |
Volume | 108 |
Number | 8 |
Pages | 1443--1466 |
Date | September 2019 |
Link to PDF | https://link.springer.com/content/pdf/10.1007%2Fs10994-019-05807-0.pdf |
Reference Type | Journal Article |
Author(s) | Celemin, C.; Maeda, G.; Peters, J.; Ruiz-del-Solar, J.; Kober, J. |
Year | 2019 |
Title | Reinforcement Learning of Motor Skills using Policy Search and Human Corrective Advice |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Volume | 38 |
Number | 14 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Alumni/JensKober/IJRR__Revision_.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nass, D.; Belousov, B.; Peters, J. |
Year | 2019 |
Title | Entropic Risk Measure in Policy Search |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/iros19_nass_v2.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ozdenizci, O.; Meyer, T.; Wichmann, F.; Peters, J.; Schoelkopf, B.; Cetin, M.; Grosse-Wentrup, M. |
Year | 2019 |
Title | Neural Signatures of Motor Skill in the Resting Brain |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC) |
Link to PDF | https://arxiv.org/pdf/1907.09533.pdf |
Reference Type | Conference Proceedings |
Author(s) | Urain, J.; Peters, J. |
Year | 2019 |
Title | Generalized Multiple Correlation Coefficient as a Similarity Measurement between Trajectories |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | 2019 |
Abstract | A similarity distance measure between two trajectories is an essential tool for understanding patterns in motion, for example, in Human-Robot Interaction or Imitation Learning. The problem has been studied in many fields, from signal processing and probability theory to topology and statistics. However, up to now, none of the existing trajectory similarity metrics is invariant to all possible linear transformations of the trajectories (rotation, scaling, reflection, shear mapping, or squeeze mapping), and not all of them are robust to noisy signals or fast enough for real-time trajectory classification. To overcome these limitations, this paper proposes a similarity distance metric that remains invariant under any linear transformation. Based on Pearson's Correlation Coefficient and the Coefficient of Determination, our similarity metric, the Generalized Multiple Correlation Coefficient (GMCC), is presented as the natural extension of the Multiple Correlation Coefficient. The motivation of this paper is two-fold: first, to introduce a new correlation metric with the best properties for computing similarities between trajectories invariant to linear transformations, and to compare it with state-of-the-art similarity distances; second, to present a natural way of integrating the similarity metric into an Imitation Learning scenario for clustering robot trajectories. |
Place Published | Macau, China |
Date | 2019 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JulenUrainDeJesus/julen_IROS_2019.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Peters, J. |
Year | 2019 |
Title | Deep Lagrangian Networks for End-to-End Learning of Energy-Based Control for Under-Actuated Systems |
Journal/Conference/Book Title | International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/MichaelLutter/IROS_2019_Final_DeLaN_Energy_Control.pdf |
Reference Type | Journal Article |
Author(s) | Koert, D.; Pajarinen, J.; Schotschneider, A.; Trick, S.; Rothkopf, C.; Peters, J. |
Year | 2019 |
Title | Learning Intention Aware Online Adaptation of Movement Primitives |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L), with presentation at the IEEE International Conference on Intelligent Robots and Systems (IROS) |
Keywords | SKILLS4ROBOTS, KOBO |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DorotheaKoert/final_ral_2019_koert.pdf |
Reference Type | Conference Proceedings |
Author(s) | Celik, O.; Abdulsamad, H.; Peters, J. |
Year | 2019 |
Title | Chance-Constrained Trajectory Optimization for Nonlinear Systems with Unknown Stochastic Dynamics |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Abstract | Iterative trajectory optimization techniques for non-linear dynamical systems are among the most powerful and sample-efficient methods of model-based reinforcement learning and approximate optimal control. By leveraging time-variant local linear-quadratic approximations of system dynamics and rewards, such methods are able to find both a target-optimal trajectory and time-variant optimal feedback controllers. However, the local linear-quadratic approximations are a major source of optimization bias that leads to catastrophic greedy updates, raising the issue of proper regularization. Moreover, the approximate models' disregard for any physical state-action limits of the system causes further aggravation of the problem, as the optimization moves towards unreachable areas of the state-action space. In this paper, we address these drawbacks in the scenario of online-fitted stochastic dynamics. We propose modeling state and action physical limits as probabilistic chance constraints and introduce a new trajectory optimization technique that integrates such probabilistic constraints by optimizing a relaxed quadratic program. Our empirical evaluations show a significant improvement in the robustness of the learning process, which enables our approach to perform more effective updates, and avoid premature convergence observed in other state-of-the-art techniques. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/HanyAbdulsamad/celik2019chance.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Peters, J. |
Year | 2019 |
Title | Deep Optimal Control: Using the Euler-Lagrange Equation to Learn an Optimal Feedback Control Law |
Journal/Conference/Book Title | Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/MichaelLutter/RLDM2019_Deep_Optimal_Control.pdf |
Reference Type | Conference Proceedings |
Author(s) | Trick, S.; Koert, D.; Peters, J.; Rothkopf, C. |
Year | 2019 |
Title | Multimodal Uncertainty Reduction for Intention Recognition in Human-Robot Interaction |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | SKILLS4ROBOTS, KOBO |
Link to PDF | https://arxiv.org/pdf/1907.02426.pdf |
Reference Type | Conference Proceedings |
Author(s) | Stark, S.; Peters, J.; Rueckert, E. |
Year | 2019 |
Title | Experience Reuse with Probabilistic Movement Primitives |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | GOAL-Robots |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SvenjaStark/stark_iros2019_update.pdf |
Reference Type | Conference Proceedings |
Author(s) | Liu, Z.; Hitzmann, A.; Ikemoto, S.; Stark, S.; Peters, J.; Hosoda, K. |
Year | 2019 |
Title | Local Online Motor Babbling: Learning Motor Abundance of a Musculoskeletal Robot Arm |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Abstract | Motor babbling and goal babbling have been used for sensorimotor learning of highly redundant systems in soft robotics. Recent works in goal babbling have demonstrated successful learning of inverse kinematics (IK) on such systems, and suggest that babbling in the goal space better resolves motor redundancy by learning as few yet efficient sensorimotor mappings as possible. However, for musculoskeletal robot systems, motor redundancy can provide useful information to explain muscle activation patterns, hence the term motor abundance. In this work, we introduce some simple heuristics to empirically define the unknown goal space, and learn the IK of a 10 DoF musculoskeletal robot arm using directed goal babbling. We then further propose local online motor babbling guided by the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), which bootstraps on the goal babbling samples for initialization, such that motor abundance can be queried online for any static goal. Our approach leverages the resolving of redundancies and the efficient guided exploration of motor abundance in two stages of learning, allowing both kinematic accuracy and motor variability at the queried goal. The results show that local online motor babbling guided by CMA-ES can efficiently explore motor abundance on musculoskeletal robot systems and gives useful insights in terms of muscle stiffness and synergy. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SvenjaStark/Liu_IROS_2019.pdf |
Reference Type | Conference Proceedings |
Author(s) | Belousov, B.; Sadybakasov, A.; Wibranek, B.; Veiga, F.; Tessmann, O.; Peters, J. |
Year | 2019 |
Title | Building a Library of Tactile Skills Based on FingerVision |
Journal/Conference/Book Title | Proceedings of the 2019 IEEE-RAS 19th International Conference on Humanoid Robots (Humanoids) |
Keywords | SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/belousov19_fingervision.pdf |
Reference Type | Conference Proceedings |
Author(s) | Schultheis, M.; Belousov, B.; Abdulsamad, H.; Peters, J. |
Year | 2019 |
Title | Receding Horizon Curiosity |
Journal/Conference/Book Title | Proceedings of the 3rd Conference on Robot Learning (CoRL) |
Keywords | SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/schultheis19_rhc.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Belousov, B.; Listmann, K.; Clever, D.; Peters, J. |
Year | 2019 |
Title | HJB Optimal Feedback Control with Deep Differential Value Functions and Action Constraints |
Journal/Conference/Book Title | Conference on Robot Learning (CoRL) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/MichaelLutter/CoRL2019_Deep_Optimal_HJB_Control.pdf |
Reference Type | Journal Article |
Author(s) | Ewerton, M.; Arenz, O.; Maeda, G.; Koert, D.; Kolev, Z.; Takahashi, M.; Peters, J. |
Year | 2019 |
Title | Learning Trajectory Distributions for Assisted Teleoperation and Path Planning |
Journal/Conference/Book Title | Frontiers in Robotics and AI |
URL(s) | https://www.frontiersin.org/articles/10.3389/frobt.2019.00089/full |
Link to PDF | https://www.frontiersin.org/articles/10.3389/frobt.2019.00089/full |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Maeda, G.; Koert, D.; Kolev, Z.; Takahashi, M.; Peters, J. |
Year | 2019 |
Title | Reinforcement Learning of Trajectory Distributions: Applications in Assisted Teleoperation and Motion Planning |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Abstract | The majority of learning from demonstration approaches do not address suboptimal demonstrations or cases when drastic changes in the environment occur after the demonstrations were made. For example, in real teleoperation tasks, the demonstrations provided by the user are often suboptimal due to interface and hardware limitations. In tasks involving co-manipulation and manipulation planning, the environment often changes due to unexpected obstacles rendering previous demonstrations invalid. This paper presents a reinforcement learning algorithm that exploits the use of relevance functions to tackle such problems. This paper introduces the Pearson correlation as a measure of the relevance of policy parameters with regard to each of the components of the cost function to be optimized. The method is demonstrated in a static environment where the quality of the teleoperation is compromised by the visual interface (operating a robot in a three-dimensional task by using a simple 2D monitor). Afterward, we tested the method on a dynamic environment using a real 7-DoF robot arm where distributions are computed online via Gaussian Process regression. |
Place Published | Macau, China |
Pages | 4294--4300 |
Date | November 4-8, 2019 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Member/PubMarcoEwerton/Ewerton_IROS_2019.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wibranek, B.; Belousov, B.; Sadybakasov, A.; Peters, J.; Tessmann, O. |
Year | 2019 |
Title | Interactive Structure: Robotic Repositioning of Vertical Elements in Man-Machine Collaborative Assembly through Vision-Based Tactile Sensing |
Journal/Conference/Book Title | Proceedings of the 37th eCAADe and 23rd SIGraDi Conference |
Keywords | SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/wibranek_sigradi19.pdf |
Reference Type | Conference Proceedings |
Author(s) | Klink, P.; Abdulsamad, H.; Belousov, B.; Peters, J. |
Year | 2019 |
Title | Self-Paced Contextual Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the 3rd Conference on Robot Learning (CoRL) |
Abstract | Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots. Although contextual reinforcement learning provides a principled framework for learning and generalization of behaviors across related tasks, it generally relies on uninformed sampling of environments from an unknown, uncontrolled context distribution, thus missing the benefits of structured, sequential learning. We introduce a novel relative entropy reinforcement learning algorithm that gives the agent the freedom to control the intermediate task distribution, allowing for its gradual progression towards the target context distribution. Empirical evaluation shows that the proposed curriculum learning scheme drastically improves sample efficiency and enables learning in scenarios with both broad and sharp target context distributions in which classical approaches perform sub-optimally. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PascalKlink/sprl.pdf |
Reference Type | Conference Paper |
Author(s) | Watson, J.; Abdulsamad, H.; Peters, J. |
Year | 2019 |
Title | Stochastic Optimal Control as Approximate Input Inference |
Journal/Conference/Book Title | Conference on Robot Learning (CoRL) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoeWatson/Watson19I2c.pdf |
Reference Type | Conference Paper |
Author(s) | Abdulsamad, H.; Naveh, K.; Peters, J. |
Year | 2019 |
Title | Model-Based Relative Entropy Policy Search for Stochastic Hybrid Systems |
Journal/Conference/Book Title | 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Reference Type | Journal Article |
Author(s) | Gomez Gonzalez, S.; Nemmour, Y.; Schoelkopf, B.; Peters, J. |
Year | 2019 |
Title | Reliable Real Time Ball Tracking for Robot Table Tennis |
Journal/Conference/Book Title | Robotics |
Volume | 8 |
Pages | 90 |
Number | 4 |
URL(s) | https://www.mdpi.com/2218-6581/8/4/90 |
Reference Type | Journal Article |
Author(s) | Schuermann, T.; Mohler, B.J.; Peters, J.; Beckerle, P. |
Year | 2019 |
Title | How Cognitive Models of Human Body Experience Might Push Robotics |
Journal/Conference/Book Title | Frontiers in Neurorobotics |
Reference Type | Conference Proceedings |
Author(s) | Delfosse, Q.; Stark, S.; Tanneberg, D.; Santucci, V. G.; Peters, J. |
Year | 2019 |
Title | Open-Ended Learning of Grasp Strategies using Intrinsically Motivated Self-Supervision |
Journal/Conference/Book Title | Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SvenjaStark/delfosse_iros2019.pdf |
Reference Type | Conference Paper |
Author(s) | Muratore, F.; Gienger, M.; Peters, J. |
Year | 2019 |
Title | Assessing Transferability in Reinforcement Learning from Randomized Simulations |
Journal/Conference/Book Title | Reinforcement Learning and Decision Making (RLDM) |
Keywords | domain randomization, simulation optimization, sim-2-real |
Abstract | Exploration-based reinforcement learning of control policies on physical systems is generally time-intensive and can lead to catastrophic failures. Therefore, simulation-based policy search appears to be an appealing alternative. Unfortunately, running policy search on a slightly faulty simulator can easily lead to the maximization of the ‘Simulation Optimization Bias’ (SOB), where the policy exploits modeling errors of the simulator such that the resulting behavior can potentially damage the device. For this reason, much work in reinforcement learning has focused on model-free methods. The resulting lack of safe simulation-based policy learning techniques imposes severe limitations on the application of reinforcement learning to real-world systems. In this paper, we explore how physics simulations can be utilized for a robust policy optimization by randomizing the simulator’s parameters and training from model ensembles. We propose an algorithm called Simulation-based Policy Optimization with Transferability Assessment (SPOTA) that uses an estimator of the SOB to formulate a stopping criterion for training. We show that the simulation-based policy search algorithm is able to learn a control policy exclusively from a randomized simulator that can be applied directly to a different system without using any data from the latter. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Muratore_GP--RLDM_2019.pdf |
Language | English |
Reference Type | Conference Proceedings |
Author(s) | Look, A.; Kandemir, M. |
Year | 2019 |
Title | Differential Bayesian Neural Nets |
Journal/Conference/Book Title | NeurIPS Bayesian Workshop |
Abstract | Neural Ordinary Differential Equations (N-ODEs) are a powerful building block for learning systems, which extend residual networks to a continuous-time dynamical system. We propose a Bayesian version of N-ODEs that enables well-calibrated quantification of prediction uncertainty, while maintaining the expressive power of their deterministic counterpart. We assign Bayesian Neural Nets (BNNs) to both the drift and the diffusion terms of a Stochastic Differential Equation (SDE) that models the flow of the activation map in time. We infer the posterior on the BNN weights using a straightforward adaptation of Stochastic Gradient Langevin Dynamics (SGLD). We illustrate significantly improved stability on two synthetic time series prediction tasks and report better model fit on UCI regression benchmarks with our method when compared to its non-Bayesian counterpart. |
Reference Type | Report |
Author(s) | Tanneberg, D.; Rueckert, E.; Peters, J. |
Year | 2019 |
Title | Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer Architecture |
Journal/Conference/Book Title | arXiv |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
URL(s) | https://arxiv.org/pdf/1911.00926.pdf |
Reference Type | Conference Paper |
Author(s) | Klink, P.; Peters, J. |
Year | 2019 |
Title | Measuring Similarities between Markov Decision Processes |
Journal/Conference/Book Title | 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Reference Type | Journal Article |
Author(s) | Kroemer, O.; Leischnig, S.; Luettgen, S.; Peters, J. |
Year | 2018 |
Title | A Kernel-based Approach to Learning Contact Distributions for Robot Manipulation Tasks |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Volume | 42 |
Number | 3 |
Pages | 581-600 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Alumni/OliverKroemer/KroemerAuRo17Updated2.pdf |
Reference Type | Journal Article |
Author(s) | Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Using Probabilistic Movement Primitives in Robotics |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Volume | 42 |
Number | 3 |
Pages | 529-551 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/AlexandrosParaschos/promps_auro.pdf |
Reference Type | Journal Article |
Author(s) | Yi, Z.; Zhang, Y.; Peters, J. |
Year | 2018 |
Title | Biomimetic Tactile Sensors and Signal Processing with Spike Trains: A Review |
Journal/Conference/Book Title | Sensors & Actuators: A. Physical |
Volume | 269 |
Pages | 41-52 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/SNA2018yi.pdf |
Reference Type | Journal Article |
Author(s) | Paraschos, A.; Rueckert, E.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Probabilistic Movement Primitives under Unknown System Dynamics |
Journal/Conference/Book Title | Advanced Robotics (ARJ) |
Volume | 32 |
Number | 6 |
Pages | 297-310 |
Link to PDF | https://www.ias.tu-darmstadt.de/uploads/Alumni/AlexandrosParaschos/Paraschos_AR_2018.pdf |
Reference Type | Journal Article |
Author(s) | Manschitz, S.; Gienger, M.; Kober, J.; Peters, J. |
Year | 2018 |
Title | Mixture of Attractors: A Novel Movement Primitive Representation for Learning Motor Skills from Demonstrations |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Volume | 3 |
Number | 2 |
Pages | 926-933 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/SimonManschitz/ManschitzRAL2018.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lioutikov, R.; Maeda, G.; Veiga, F.F.; Kersting, K.; Peters, J. |
Year | 2018 |
Title | Inducing Probabilistic Context-Free Grammars for the Sequencing of Robot Movement Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-Hand |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/RudolfLioutikov/lioutikov_movement_pcfg_icra2018.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gebhardt, G.H.W.; Daun, K.; Schnaubelt, M.; Neumann, G. |
Year | 2018 |
Title | Learning Robust Policies for Object Manipulation with Robot Swarms |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | swarm robotics, policy search, kernel methods, kilobots |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/GregorGebhardt/LearningRobustPoliciesForObjectManipulationWithRobotSwarms.pdf |
Reference Type | Journal Article |
Author(s) | Vinogradska, J.; Bischoff, B.; Peters, J. |
Year | 2018 |
Title | Approximate Value Iteration based on Numerical Quadrature |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation, and IEEE Robotics and Automation Letters (RA-L) |
Volume | 3 |
Number | 2 |
Pages | 1330-1337 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NQVI_RAL_manuscript.pdf |
Reference Type | Conference Proceedings |
Author(s) | Pinsler, R.; Akrour, R.; Osa, T.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Sample and Feedback Efficient Hierarchical Reinforcement Learning from Human Preferences |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | IAS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/RiadAkrour/icra18_robert.pdf |
Reference Type | Conference Proceedings |
Author(s) | Koert, D.; Maeda, G.; Neumann, G.; Peters, J. |
Year | 2018 |
Title | Learning Coupled Forward-Inverse Models with Combined Prediction Errors |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-Hand,SKILLS4ROBOTS |
Abstract | Challenging tasks in unstructured environments require robots to learn complex models. Given a large amount of information, learning multiple simple models can offer an efficient alternative to a monolithic complex network. Training multiple models---that is, learning their parameters and their responsibilities---has been shown to be prohibitively hard as optimization is prone to local minima. To efficiently learn multiple models for different contexts, we thus develop a new algorithm based on expectation maximization (EM). In contrast to comparable concepts, this algorithm trains multiple modules of paired forward-inverse models by using the prediction errors of both forward and inverse models simultaneously. In particular, we show that our method yields a substantial improvement over only considering the errors of the forward models on tasks where the inverse space contains multiple solutions. |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/DorotheaKoert/cfim_final.pdf |
Reference Type | Journal Article |
Author(s) | Osa, T.; Pajarinen, J.; Neumann, G.; Bagnell, J.A.; Abbeel, P.; Peters, J. |
Year | 2018 |
Title | An Algorithmic Perspective on Imitation Learning |
Journal/Conference/Book Title | Foundations and Trends in Robotics |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/1811.06711 |
Reference Type | Journal Article |
Author(s) | Veiga, F.; Peters, J.; Hermans, T. |
Year | 2018 |
Title | Grip Stabilization of Novel Objects using Slip Prediction |
Journal/Conference/Book Title | IEEE Transactions on Haptics |
Volume | 11 |
Number | 4 |
Pages | 531--542 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/veigaToH2018.pdf |
Reference Type | Journal Article |
Author(s) | Koc, O.; Maeda, G.; Peters, J. |
Year | 2018 |
Title | Online Optimal Trajectory Generation for Robot Table Tennis |
Journal/Conference/Book Title | Robotics and Autonomous Systems (RAS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Online_optimal_trajectory_generation.pdf |
Reference Type | Journal Article |
Author(s) | Ewerton, M.; Rother, D.; Weimar, J.; Kollegger, G.; Wiemeyer, J.; Peters, J.; Maeda, G. |
Year | 2018 |
Title | Assisting Movement Training and Execution with Visual and Haptic Feedback |
Journal/Conference/Book Title | Frontiers in Neurorobotics |
Keywords | 3rd-Hand, BIMROB, RoMaNS, SKILLS4ROBOTS, NEDO |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/fnbot-12-00024.pdf |
Reference Type | Conference Proceedings |
Author(s) | Belousov, B.; Peters, J. |
Year | 2018 |
Title | Entropic Regularization of Markov Decision Processes |
Journal/Conference/Book Title | 38th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering |
Keywords | reinforcement learning; actor-critic methods; entropic proximal mappings; policy search |
Abstract | The problem of synthesis of an optimal feedback controller for a given Markov decision process (MDP) can in principle be solved by value iteration or policy iteration. However, if system dynamics and the reward function are unknown, the only way for a learning agent to discover an optimal controller is through interaction with the MDP. During data gathering, it is crucial to account for the lack of information, because otherwise ignorance will push the agent towards dangerous areas of the state space. To prevent such behavior and smooth learning dynamics, prior works proposed to bound the information loss measured by the Kullback-Leibler (KL) divergence at every policy improvement step. In this paper, we consider a broader family of f-divergences that preserve the beneficial property of the KL divergence of providing the policy improvement step in closed form accompanied by a compatible dual objective for policy evaluation. Such an entropic proximal policy optimization view gives a unified perspective on compatible actor-critic architectures. In particular, common least squares value function fitting coupled with advantage-weighted maximum likelihood policy estimation is shown to correspond to the Pearson χ²-divergence penalty. Other connections can be established by considering different choices of the penalty generator function f. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/maxent18_belousov.pdf |
Reference Type | Conference Proceedings |
Author(s) | Parmas, P.; Doya, K.; Rasmussen, C.; Peters, J. |
Year | 2018 |
Title | PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Reference Type | Conference Proceedings |
Author(s) | Arenz, O.; Zhong, M.; Neumann, G. |
Year | 2018 |
Title | Efficient Gradient-Free Variational Inference using Policy Search |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Keywords | Variational Inference, Policy Search, Sampling |
Abstract | Inference from complex distributions is a common problem in machine learning needed for many Bayesian methods. We propose an efficient, gradient-free method for learning general GMM approximations of multimodal distributions based on recent insights from stochastic search methods. Our method establishes information-geometric trust regions to ensure efficient exploration of the sampling space and stability of the GMM updates, allowing for efficient estimation of multi-variate Gaussian variational distributions. For GMMs, we apply a variational lower bound to decompose the learning objective into sub-problems given by learning the individual mixture components and the coefficients. The number of mixture components is adapted online in order to allow for arbitrary exact approximations. We demonstrate on several domains that we can learn significantly better approximations than competing variational inference methods and that the quality of samples drawn from our approximations is on par with samples created by state-of-the-art MCMC samplers that require significantly more computational resources. |
Editor(s) | Dy, Jennifer and Krause, Andreas |
Publisher | PMLR |
Volume | 80 |
Pages | 234--243 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/OlegArenz/VIPS_full.pdf |
Reference Type | Journal Article |
Author(s) | Buechler, D.; Calandra, R.; Schoelkopf, B.; Peters, J. |
Year | 2018 |
Title | Control of Musculoskeletal Systems using Learned Dynamics Models |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters, and IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | https://ei.is.tuebingen.mpg.de/uploads_file/attachment/attachment/422/RAL18final.pdf |
Reference Type | Journal Article |
Author(s) | Sosic, A.; Rueckert, E.; Peters, J.; Zoubir, A.M.; Koeppl, H.
Year | 2018 |
Title | Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Volume | 19 |
Number | 69 |
Pages | 1--45 |
Reference Type | Conference Proceedings |
Author(s) | Akrour, R.; Veiga, F.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Regularizing Reinforcement Learning with State Abstraction |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/RiadAkrour/iros18_riad.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gondaliya, K.D.; Peters, J.; Rueckert, E. |
Year | 2018 |
Title | Learning to Categorize Bug Reports with LSTM Networks |
Journal/Conference/Book Title | Proceedings of the International Conference on Advances in System Testing and Validation Lifecycle |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VALID2018Gondaliya.pdf |
Reference Type | Journal Article |
Author(s) | Osa, T.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Hierarchical Reinforcement Learning of Multiple Grasping Strategies with Human Instructions |
Journal/Conference/Book Title | Advanced Robotics |
Volume | 32 |
Number | 18 |
Pages | 955-968 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/advanced_roboitcs_18osa.pdf |
Reference Type | Conference Paper |
Author(s) | Muratore, F.; Treede, F.; Gienger, M.; Peters, J. |
Year | 2018 |
Title | Domain Randomization for Simulation-Based Policy Optimization with Transferability Assessment |
Journal/Conference/Book Title | Conference on Robot Learning (CoRL) |
Keywords | domain randomization, simulation optimization, sim-2-real |
Abstract | Exploration-based reinforcement learning on real robot systems is generally time-intensive and can lead to catastrophic robot failures. Therefore, simulation-based policy search appears to be an appealing alternative. Unfortunately, running policy search on a slightly faulty simulator can easily lead to the maximization of the ‘Simulation Optimization Bias’ (SOB), where the policy exploits modeling errors of the simulator such that the resulting behavior can potentially damage the robot. For this reason, much work in robot reinforcement learning has focused on model-free methods that learn on real-world systems. The resulting lack of safe simulation-based policy learning techniques imposes severe limitations on the application of robot reinforcement learning. In this paper, we explore how physics simulations can be utilized for a robust policy optimization by perturbing the simulator’s parameters and training from model ensembles. We propose a new algorithm called Simulation-based Policy Optimization with Transferability Assessment (SPOTA) that uses a biased estimator of the SOB to formulate a stopping criterion for training. We show that the new simulation-based policy search algorithm is able to learn a control policy exclusively from a randomized simulator that can be applied directly to a different system without using any data from the latter. |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Muratore_Treede_Gienger_Peters--SPOTA_CoRL2018.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Muratore_Treede_Gienger_Peters--SPOTA_CoRL2018.pdf |
Language | English |
Reference Type | Conference Proceedings |
Author(s) | Koert, D.; Trick, S.; Ewerton, M.; Lutter, M.; Peters, J. |
Year | 2018 |
Title | Online Learning of an Open-Ended Skill Library for Collaborative Tasks |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | SKILLS4ROBOTS, KOBO |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DorotheaKoert/incremental_promp_2018.pdf |
Reference Type | Conference Paper |
Author(s) | Akrour, R.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Constraint-Space Projection Direct Policy Search |
Journal/Conference/Book Title | European Workshops on Reinforcement Learning (EWRL) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/RiadAkrour/ewrl18_riad.pdf |
Reference Type | Conference Proceedings |
Author(s) | Hoelscher, J.; Koert, D.; Peters, J.; Pajarinen, J. |
Year | 2018 |
Title | Utilizing Human Feedback in POMDP Execution and Specification |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | RoMaNS, SKILLS4ROBOTS, ROBOLEAP
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DorotheaKoert/pomdp_user_interaction_2018.pdf |
Reference Type | Conference Proceedings |
Author(s) | Belousov, B.; Peters, J. |
Year | 2018 |
Title | Mean Squared Advantage Minimization as a Consequence of Entropic Policy Improvement Regularization |
Journal/Conference/Book Title | European Workshops on Reinforcement Learning (EWRL) |
Keywords | policy optimization, entropic proximal mappings, actor-critic algorithms |
Abstract | Policy improvement regularization with entropy-like f-divergence penalties provides a unifying perspective on actor-critic algorithms, rendering policy improvement and policy evaluation steps as primal and dual subproblems of the same optimization problem. For small policy improvement steps, we show that all f-divergences with twice differentiable generator function f yield a mean squared advantage minimization objective for the policy evaluation step and an advantage-weighted maximum log-likelihood objective for the policy improvement step. The mean squared advantage objective fits in-between the well-known mean squared Bellman error and the mean squared temporal difference error objectives, requiring only the expectation of the temporal difference error with respect to the next state and not the policy, in contrast to the Bellman error, which requires both, and the temporal difference error, which requires none. The advantage-weighted maximum log-likelihood policy improvement rule emerges as a linear approximation to a more general weighting scheme where weights are a monotone function of the advantage. Thus, the entropic policy regularization framework provides a rigorous justification for the common practice of least squares value function fitting accompanied by advantage-weighted maximum log-likelihood policy parameters estimation, at the same time pointing at the direction in which this classical actor-critic approach can be extended. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/ewrl18_belousov.pdf |
Reference Type | Unpublished Work |
Author(s) | Pinsler, R.; Maag, M.; Arenz, O.; Neumann, G. |
Year | 2018 |
Title | Inverse Reinforcement Learning of Bird Flocking Behavior |
Journal/Conference/Book Title | Swarms: From Biology to Robotics and Back (ICRA Workshop) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/OlegArenz/PinslerEtAl_ICRA2018swarms.pdf |
Reference Type | Journal Article |
Author(s) | Kupcsik, A.G.; Deisenroth, M.P.; Peters, J.; Ai Poh, L.; Vadakkepat, V.; Neumann, G. |
Year | 2017 |
Title | Model-based Contextual Policy Search for Data-Efficient Generalization of Robot Skills |
Journal/Conference/Book Title | Artificial Intelligence |
Keywords | ComPLACS |
Volume | 247 |
Pages | 415-439 |
Date | June 2017 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kupcsik_AIJ_2015.pdf |
Reference Type | Journal Article |
Author(s) | Wang, Z.; Boularias, A.; Muelling, K.; Schoelkopf, B.; Peters, J. |
Year | 2017 |
Title | Anticipatory Action Selection for Human-Robot Table Tennis |
Journal/Conference/Book Title | Artificial Intelligence |
Volume | 247 |
Pages | 399-414 |
Date | June 2017 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Anticipatory_Action_Selection.pdf |
Reference Type | Journal Article |
Author(s) | Maeda, G.; Neumann, G.; Ewerton, M.; Lioutikov, R.; Kroemer, O.; Peters, J. |
Year | 2017 |
Title | Probabilistic Movement Primitives for Coordination of Multiple Human-Robot Collaborative Tasks |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Keywords | 3rd-Hand, BIMROB |
Volume | 41 |
Number | 3 |
Pages | 593-612 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/gjm_2016_AURO_c.pdf |
Reference Type | Journal Article |
Author(s) | Parisi, S.; Pirotta, M.; Peters, J. |
Year | 2017 |
Title | Manifold-based Multi-objective Policy Search with Sample Reuse |
Journal/Conference/Book Title | Neurocomputing |
Keywords | multi-objective, reinforcement learning, policy search, black-box optimization, importance sampling |
Volume | 263 |
Pages | 3-14 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/parisi_neurocomp_morl.pdf |
Reference Type | Book Section |
Author(s) | Peters, J.; Lee, D.; Kober, J.; Nguyen-Tuong, D.; Bagnell, J.; Schaal, S. |
Year | 2017 |
Title | Chapter 15: Robot Learning |
Journal/Conference/Book Title | Springer Handbook of Robotics, 2nd Edition |
Publisher | Springer International Publishing |
Pages | 357-394 |
Reference Type | Journal Article |
Author(s) | Maeda, G.; Ewerton, M.; Neumann, G.; Lioutikov, R.; Peters, J. |
Year | 2017 |
Title | Phase Estimation for Fast Action Recognition and Trajectory Generation in Human-Robot Collaboration |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | 3rd-Hand, BIMROB |
Volume | 36 |
Number | 13-14 |
Pages | 1579-1594 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/phase_estim_IJRR.pdf |
Reference Type | Journal Article |
Author(s) | Padois, V.; Ivaldi, S.; Babič, J.; Mistry, M.; Peters, J.; Nori, F.
Year | 2017 |
Title | Whole-body multi-contact motion in humans and humanoids |
Journal/Conference/Book Title | Robotics and Autonomous Systems |
Volume | 90 |
Pages | 97-117 |
Date | April 2017 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ias_padois_et_al_revised_finalised.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tangkaratt, V.; van Hoof, H.; Parisi, S.; Neumann, G.; Peters, J.; Sugiyama, M. |
Year | 2017 |
Title | Policy Search with High-Dimensional Context Variables |
Journal/Conference/Book Title | Proceedings of the AAAI Conference on Artificial Intelligence (AAAI) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/tangkaratt2017policy.pdf |
Reference Type | Journal Article |
Author(s) | Ivaldi, S.; Lefort, S.; Peters, J.; Chetouani, M.; Provasi, J.; Zibetti, E. |
Year | 2017 |
Title | Towards Engagement Models that Consider Individual Factors in HRI: On the Relation of Extroversion and Negative Attitude Towards Robots to Gaze and Speech During a Human-Robot Assembly Task |
Journal/Conference/Book Title | International Journal of Social Robotics |
Volume | 9 |
Number of Volumes | 1 |
Pages | 63-86 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IJSR_edhhi.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gebhardt, G.H.W.; Kupcsik, A.G.; Neumann, G. |
Year | 2017 |
Title | The Kernel Kalman Rule - Efficient Nonparametric Inference with Recursive Least Squares |
Journal/Conference/Book Title | Proceedings of the National Conference on Artificial Intelligence (AAAI) |
Abstract | Nonparametric inference techniques provide promising tools for probabilistic reasoning in high-dimensional nonlinear systems. Most of these techniques embed distributions into reproducing kernel Hilbert spaces (RKHS) and rely on the kernel Bayes’ rule (KBR) to manipulate the embeddings. However, the computational demands of the KBR scale poorly with the number of samples and the KBR often suffers from numerical instabilities. In this paper, we present the kernel Kalman rule (KKR) as an alternative to the KBR. The derivation of the KKR is based on recursive least squares, inspired by the derivation of the Kalman innovation update. We apply the KKR to filtering tasks where we use RKHS embeddings to represent the belief state, resulting in the kernel Kalman filter (KKF). We show on a nonlinear state estimation task with high dimensional observations that our approach provides a significantly improved estimation accuracy while the computational demands are significantly decreased. |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/GregorGebhardt/TheKernelKalmanRule.pdf |
Reference Type | Journal Article |
Author(s) | Yi, Z.; Zhang, Y.; Peters, J. |
Year | 2017 |
Title | Bioinspired Tactile Sensor for Surface Roughness Discrimination |
Journal/Conference/Book Title | Sensors and Actuators A: Physical |
Volume | 255 |
Pages | 46-53 |
Date | 1 March 2017 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bioinspired_tactile_sensor.pdf |
Reference Type | Journal Article |
Author(s) | Osa, T.; Ghalamzan, E. A. M.; Stolkin, R.; Lioutikov, R.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | Guiding Trajectory Optimization by Demonstrated Distributions |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Publisher | IEEE |
Volume | 2 |
Number | 2 |
Pages | 819-826 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Osa_RAL_2017.pdf |
Reference Type | Journal Article |
Author(s) | Kroemer, O.; Peters, J. |
Year | 2017 |
Title | A Comparison of Autoregressive Hidden Markov Models for Multi-Modal Manipulations with Variable Masses |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation, and IEEE Robotics and Automation Letters (RA-L) |
Keywords | 3rd-Hand, TACMAN |
Volume | 2 |
Number | 2 |
Pages | 1101-1108 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Kroemer_RAL_2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Kollegger, G.; Maeda, G.; Wiemeyer, J.; Peters, J. |
Year | 2017 |
Title | Iterative Feedback-basierte Korrekturstrategien beim Bewegungslernen von Mensch-Roboter-Dyaden |
Journal/Conference/Book Title | DVS Sportmotorik 2017 |
Keywords | BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_motorik2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kollegger, G.; Reinhardt, N.; Ewerton, M.; Peters, J.; Wiemeyer, J. |
Year | 2017 |
Title | Die Bedeutung der Beobachtungsperspektive beim Bewegungslernen von Mensch-Roboter-Dyaden |
Journal/Conference/Book Title | DVS Sportmotorik 2017 |
Keywords | BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/kollegger_motorik2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wiemeyer, J.; Peters, J.; Kollegger, G.; Ewerton, M. |
Year | 2017 |
Title | BIMROB – Bidirektionale Interaktion von Mensch und Roboter beim Bewegungslernen |
Journal/Conference/Book Title | DVS Sportmotorik 2017 |
Keywords | BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/wiemeyer_motorik2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Farraj, F. B.; Osa, T.; Pedemonte, N.; Peters, J.; Neumann, G.; Giordano, P.R. |
Year | 2017 |
Title | A Learning-based Shared Control Architecture for Interactive Task Execution |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/firas_ICRA17.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wilbers, D.; Lioutikov, R.; Peters, J. |
Year | 2017 |
Title | Context-Driven Movement Primitive Adaptation |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-Hand |
Link to PDF | /uploads/Member/PubRudolfLioutikov/wilbers_icra_2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | End, F.; Akrour, R.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | Layered Direct Policy Search for Learning Hierarchical Skills |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Research/Overview/icra17_felix.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gabriel, A.; Akrour, R.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | Empowered Skills |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | IAS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Research/Overview/icra17_alex.pdf |
Reference Type | Conference Proceedings |
Author(s) | Abdulsamad, H.; Arenz, O.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | State-Regularized Policy Search for Linearized Dynamical Systems |
Journal/Conference/Book Title | Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Abdulsamad_ICAPS_2017.pdf |
Reference Type | Journal Article |
Author(s) | Lioutikov, R.; Neumann, G.; Maeda, G.; Peters, J. |
Year | 2017 |
Title | Learning Movement Primitive Libraries through Probabilistic Segmentation |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | 3rd-Hand |
Volume | 36 |
Number | 8 |
Pages | 879-894 |
Link to PDF | /uploads/Publications/lioutikov_probs_ijrr2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Fiebig, K.H.; Jayaram, V.; Hesse, T.; Blank, A.; Peters, J.; Grosse-Wentrup, M. |
Year | 2017 |
Title | Bayesian Regression for Artifact Correction in Electroencephalography |
Journal/Conference/Book Title | Proceedings of the 7th Graz Brain-Computer Interface Conference |
Reference Type | Conference Proceedings |
Author(s) | Akrour, R.; Sorokin, D.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | Local Bayesian Optimization of Motor Skills |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Link to PDF | http://proceedings.mlr.press/v70/akrour17a/akrour17a.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gebhardt, G.H.W.; Daun, K.; Schnaubelt, M.; Hendrich, A.; Kauth, D.; Neumann, G. |
Year | 2017 |
Title | Learning to Assemble Objects with a Robot Swarm |
Journal/Conference/Book Title | Proceedings of the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS) |
Keywords | multi-agent learning, reinforcement learning, swarm robotics |
Abstract | Nature provides us with a multitude of examples that show how swarms of simple agents are much richer in their abilities than a single individual. This insight is a main principle that swarm robotics tries to exploit. In the last years, large swarms of low-cost robots such as the Kilobots have become available. This allows to bring algorithms developed for swarm robotics from simulations to the real world. Recently, the Kilobots have been used for an assembly task with multiple objects: a human operator controlled a light source to guide the swarm of light-sensitive robots such that they successfully assembled an object of multiple parts. However, hand-coding the control of the light source for autonomous assembly is not straight forward as the interactions of the swarm with the object or the reaction to the light source are hard to model. |
Publisher | International Foundation for Autonomous Agents and Multiagent Systems |
Pages | 1547--1549 |
URL(s) | http://dl.acm.org/citation.cfm?id=3091282.3091357 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/GregorGebhardt/LearningToAssembleObjectsWithARobotSwarm.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kollegger, G.; Ewerton, M.; Wiemeyer, J.; Peters, J. |
Year | 2017 |
Title | BIMROB – Bidirectional Interaction between human and robot for the learning of movements – Robot trains human – Human trains robot |
Journal/Conference/Book Title | 23. Sportwissenschaftlicher Hochschultag der dvs |
Keywords | BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/kollegger_dvs_hochschultag_2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tosatto, S.; Pirotta, M.; D'Eramo, C.; Restelli, M.
Year | 2017 |
Title | Boosted Fitted Q-Iteration |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Keywords | Bosch-Forschungstiftung |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/tosatto_icml2017.pdf |
Reference Type | Conference Paper |
Author(s) | Belousov, B.; Neumann, G.; Rothkopf, C.A.; Peters, J. |
Year | 2017 |
Title | Catching Heuristics Are Optimal Control Policies |
Journal/Conference/Book Title | Proceedings of the Karniel Thirteenth Computational Motor Control Workshop |
Keywords | SKILLS4ROBOTS |
Abstract | Two seemingly contradictory theories attempt to explain how humans move to intercept an airborne ball. One theory posits that humans predict the ball trajectory to optimally plan future actions; the other claims that, instead of performing such complicated computations, humans employ heuristics to reactively choose appropriate actions based on immediate visual feedback. In this paper, we show that interception strategies appearing to be heuristics can be understood as computational solutions to the optimal control problem faced by a ball-catching agent acting under uncertainty. Modeling catching as a continuous partially observable Markov decision process and employing stochastic optimal control theory, we discover that the four main heuristics described in the literature are optimal solutions if the catcher has sufficient time to continuously visually track the ball. Specifically, by varying model parameters such as noise, time to ground contact, and perceptual latency, we show that different strategies arise under different circumstances. The catcher’s policy switches between generating reactive and predictive behavior based on the ratio of system to observation noise and the ratio between reaction time and task duration. Thus, we provide a rational account of human ball-catching behavior and a unifying explanation for seemingly contradictory theories of target interception on the basis of stochastic optimal control. |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Belousov_ANIPS_2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | Busch, B.; Maeda, G.; Mollard, Y.; Demangeat, M.; Lopes, M. |
Year | 2017 |
Title | Postural Optimization for an Ergonomic Human-Robot Interaction |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | 3rd-Hand |
Reference Type | Conference Proceedings |
Author(s) | Pajarinen, J.; Kyrki, V.; Koval, M.; Srinivasa, S.; Peters, J.; Neumann, G.
Year | 2017 |
Title | Hybrid Control Trajectory Optimization under Uncertainty |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | RoMaNS |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/pajarinen_iros_2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Parisi, S.; Ramstedt, S.; Peters, J. |
Year | 2017 |
Title | Goal-Driven Dimensionality Reduction for Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/parisi2017iros.pdf |
Reference Type | Journal Article |
Author(s) | Paraschos, A.; Lioutikov, R.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | Probabilistic Prioritization of Movement Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Intelligent Robots and Systems, and IEEE Robotics and Automation Letters (RA-L) |
Keywords | codyco |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/AlexandrosParaschos/paraschos_prob_prio.pdf |
Reference Type | Journal Article |
Author(s) | van Hoof, H.; Tanneberg, D.; Peters, J. |
Year | 2017 |
Title | Generalized Exploration in Policy Search |
Journal/Conference/Book Title | Machine Learning (MLJ) |
Volume | 106 |
Number | 9-10 |
Pages | 1705-1724 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/vanHoof_MLJ_2017.pdf |
Reference Type | Journal Article |
Author(s) | van Hoof, H.; Neumann, G.; Peters, J. |
Year | 2017 |
Title | Non-parametric Policy Search with Limited Information Loss |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Keywords | TACMAN, reinforcement learning |
Volume | 18 |
Number | 73 |
Pages | 1-46 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Alumni/HerkeVanHoof/vanHoof_JMLR_2017.pdf |
Reference Type | Journal Article |
Author(s) | Vinogradska, J.; Bischoff, B.; Nguyen-Tuong, D.; Peters, J. |
Year | 2017 |
Title | Stability of Controllers for Gaussian Process Forward Models |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Volume | 18 |
Number | 100 |
Pages | 1-37 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/16-590.pdf |
Reference Type | Journal Article |
Author(s) | Dermy, O.; Paraschos, A.; Ewerton, M.; Charpillet, F.; Peters, J.; Ivaldi, S.
Year | 2017 |
Title | Prediction of intention during interaction with iCub with Probabilistic Movement Primitives |
Journal/Conference/Book Title | Frontiers in Robotics and AI |
Keywords | CoDyCo, BIMROB |
Volume | 4 |
Pages | 45 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/frobt-04-00045.pdf |
Reference Type | Generic |
Author(s) | Ewerton, M.; Maeda, G.; Rother, D.; Weimar, J.; Lotter, L.; Kollegger, G.; Wiemeyer, J.; Peters, J. |
Year | 2017 |
Title | Assisting the practice of motor skills by humans with a probability distribution over trajectories |
Journal/Conference/Book Title | Workshop Human-in-the-loop robotic manipulation: on the influence of the human role at IROS 2017, Vancouver, Canada |
Keywords | 3rd-Hand, BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Member/PubMarcoEwerton/WORKSHOP_IROS_2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tanneberg, D.; Peters, J.; Rueckert, E. |
Year | 2017 |
Title | Online Learning with Stochastic Recurrent Neural Networks using Intrinsic Motivation Signals |
Journal/Conference/Book Title | Proceedings of the Conference on Robot Learning (CoRL) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/DanielTanneberg/corl17_01.pdf |
Reference Type | Conference Proceedings |
Author(s) | Maeda, G.; Ewerton, M.; Osa, T.; Busch, B.; Peters, J. |
Year | 2017 |
Title | Active Incremental Learning of Robot Movement Primitives |
Journal/Conference/Book Title | Proceedings of the Conference on Robot Learning (CoRL) |
Keywords | 3rd-Hand, BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/maedaCoRL_20171014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Rueckert, E.; Nakatenus, M.; Tosatto, S.; Peters, J. |
Year | 2017 |
Title | Learning Inverse Dynamics Models in O(n) time with LSTM networks |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS, Bosch-Forschungstiftung |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Humanoids2017Rueckert.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tanneberg, D.; Peters, J.; Rueckert, E. |
Year | 2017 |
Title | Efficient Online Adaptation with Stochastic Recurrent Neural Networks |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/DanielTanneberg/humanoids17_01.pdf |
Reference Type | Conference Proceedings |
Author(s) | Stark, S.; Peters, J.; Rueckert, E. |
Year | 2017 |
Title | A Comparison of Distance Measures for Learning Nonparametric Motor Skill Libraries |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/SvenjaStark/stark_humanoids2017.pdf |
Reference Type | Journal Article |
Author(s) | Kollegger, G.; Ewerton, M.; Wiemeyer, J.; Peters, J. |
Year | 2017 |
Title | BIMROB -- Bidirectional Interaction Between Human and Robot for the Learning of Movements |
Journal/Conference/Book Title | Proceedings of the 11th International Symposium on Computer Science in Sport (IACSS 2017) |
Keywords | BIMROB |
Editor(s) | Lames, M.; Saupe, D.; Wiemeyer, J. |
Publisher | Springer International Publishing |
Pages | 151--163 |
ISBN/ISSN | 978-3-319-67846-7 |
URL(s) | https://doi.org/10.1007/978-3-319-67846-7_15 |
Reference Type | Conference Proceedings |
Author(s) | Thiem, S.; Stark, S.; Tanneberg, D.; Peters, J.; Rueckert, E. |
Year | 2017 |
Title | Simulation of the underactuated Sake Robotics Gripper in V-REP |
Journal/Conference/Book Title | Workshop at the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/PubElmarRueckert/Humanoids2017Thiem.pdf |
Reference Type | Conference Proceedings |
Author(s) | Grossberger, L.; Hohmann, M.R.; Peters, J.; Grosse-Wentrup, M. |
Year | 2017 |
Title | Investigating Music Imagery as a Cognitive Paradigm for Low-Cost Brain-Computer Interfaces |
Journal/Conference/Book Title | Proceedings of the 7th Graz Brain-Computer Interface Conference |
Reference Type | Conference Proceedings |
Author(s) | Kollegger, G.; Wiemeyer, J.; Ewerton, M.; Peters, J. |
Year | 2017 |
Title | BIMROB - Bidirectional Interaction between human and robot for the learning of movements - Robot trains human - Human trains robot |
Journal/Conference/Book Title | Innovation & Technologie im Sport - 23. Sportwissenschaftlicher Hochschultag der deutschen Vereinigung für Sportwissenschaft |
Keywords | BIMROB |
Editor(s) | Schwirtz, A.; Mess, F.; Demetriou, Y.; Senner, V. |
Place Published | Hamburg |
Publisher | Czwalina-Feldhaus |
Pages | 179 |
Reference Type | Report |
Author(s) | Belousov, B.; Peters, J. |
Year | 2017 |
Title | f-Divergence Constrained Policy Improvement |
Journal/Conference/Book Title | arXiv |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/1801.00056.pdf |
Reference Type | Conference Proceedings |
Author(s) | Osa, T.; Peters, J.; Neumann, G. |
Year | 2016 |
Title | Experiments with Hierarchical Reinforcement Learning of Multiple Grasping Policies |
Journal/Conference/Book Title | Proceedings of the International Symposium on Experimental Robotics (ISER) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/osa_ISER2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | Arenz, O.; Abdulsamad, H.; Neumann, G. |
Year | 2016 |
Title | Optimal Control and Inverse Optimal Control by Distribution Matching |
Journal/Conference/Book Title | Proceedings of the International Conference on Intelligent Robots and Systems (IROS) |
Keywords | Imitation Learning, Inverse Optimal Control, Optimal Control |
Abstract | Optimal control is a powerful approach to achieve optimal behavior. However, it typically requires a manual specification of a cost function which often contains several objectives, such as reaching goal positions at different time steps or energy efficiency. Manually trading off these objectives is often difficult and requires a high engineering effort. In this paper, we present a new approach to specify optimal behavior. We directly specify the desired behavior by a distribution over future states or features of the states. For example, the experimenter could choose to reach certain mean positions with given accuracy/variance at specified time steps. Our approach also unifies optimal control and inverse optimal control in one framework. Given a desired state distribution, we estimate a cost function such that the optimal controller matches the desired distribution. If the desired distribution is estimated from expert demonstrations, our approach performs inverse optimal control. We evaluate our approach on several optimal and inverse optimal control tasks on non-linear systems using incremental linearizations similar to differential dynamic programming approaches. |
Publisher | IEEE |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Team/OlegArenz/OC and IOC By Matching Distributions_withSupplements.pdf |
Reference Type | Journal Article |
Author(s) | Rueckert, E.; Kappel, D.; Tanneberg, D.; Pecevski, D.; Peters, J. |
Year | 2016 |
Title | Recurrent Spiking Networks Solve Planning Tasks |
Journal/Conference/Book Title | Nature PG: Scientific Reports |
Keywords | 3rdHand, CoDyCo |
Publisher | Nature Publishing Group |
Volume | 6 |
Number | 21142 |
Date | 2016/02/18/online |
DOI | 10.1038/srep21142 |
Custom 2 | http://www.nature.com/articles/srep21142#supplementary-information |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/srep21142 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/srep21142 |
Reference Type | Conference Proceedings |
Author(s) | Kohlschuetter, J.; Peters, J.; Rueckert, E. |
Year | 2016 |
Title | Learning Probabilistic Features from EMG Data for Predicting Knee Abnormalities |
Journal/Conference/Book Title | Proceedings of the XIV Mediterranean Conference on Medical and Biological Engineering and Computing (MEDICON) |
Keywords | CoDyCo, TACMAN |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/KohlschuetterMEDICON_2016.pdf |
Reference Type | Journal Article |
Author(s) | Maeda, G.; Ewerton, M.; Koert, D.; Peters, J. |
Year | 2016 |
Title | Acquiring and Generalizing the Embodiment Mapping from Human Observations to Robot Skills |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Keywords | 3rd-Hand, BIMROB |
Volume | 1 |
Number | 2 |
Pages | 784--791 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/GuilhermeMaeda/maeda_RAL_golf_2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | Modugno, V.; Neumann, G.; Rueckert, E.; Oriolo, G.; Peters, J.; Ivaldi, S. |
Year | 2016 |
Title | Learning soft task priorities for control of redundant robots |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | CoDyCo |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/main_revised.pdf |
Reference Type | Conference Proceedings |
Author(s) | Buechler, D.; Ott, H.; Peters, J. |
Year | 2016 |
Title | A Lightweight Robotic Arm with Pneumatic Muscles for Robot Learning |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Maeda, G.; Neumann, G.; Kisner, V.; Kollegger, G.; Wiemeyer, J.; Peters, J. |
Year | 2016 |
Title | Movement Primitives with Multiple Phase Parameters |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | BIMROB, 3rd-Hand |
Pages | 201--206 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_icra_2016_stockholm.pdf |
Reference Type | Journal Article |
Author(s) | Daniel, C.; Neumann, G.; Kroemer, O.; Peters, J. |
Year | 2016 |
Title | Hierarchical Relative Entropy Policy Search |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Volume | 17 |
Pages | 1-50 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Daniel2016JMLR.pdf |
Reference Type | Report |
Author(s) | Veiga, F.F.; Peters, J. |
Year | 2016 |
Title | Can Modular Finger Control for In-Hand Object Stabilization be accomplished by Independent Tactile Feedback Control Laws? |
Journal/Conference/Book Title | arXiv |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/1612.08202.pdf |
Reference Type | Journal Article |
Author(s) | Abdolmaleki, A.; Lau, N.; Reis, L.; Peters, J.; Neumann, G. |
Year | 2016 |
Title | Contextual Policy Search for Linear and Nonlinear Generalization of a Humanoid Walking Controller |
Journal/Conference/Book Title | Journal of Intelligent & Robotic Systems |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/contextualWalking.pdf |
Reference Type | Conference Proceedings |
Author(s) | Vinogradska, J.; Bischoff, B.; Nguyen-Tuong, D.; Romer, A.; Schmidt, H.; Peters, J. |
Year | 2016 |
Title | Stability of Controllers for Gaussian Process Forward Models |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Publications/Vinogradska_ICML_2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | Akrour, R.; Abdolmaleki, A.; Abdulsamad, H.; Neumann, G. |
Year | 2016 |
Title | Model-Free Trajectory Optimization for Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/akrour16.pdf |
Reference Type | Conference Proceedings |
Author(s) | Sharma, D.; Tanneberg, D.; Grosse-Wentrup, M.; Peters, J.; Rueckert, E. |
Year | 2016 |
Title | Adaptive Training Strategies for BCIs |
Journal/Conference/Book Title | Cybathlon Symposium |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/ElmarR%c3%bcckert/Cybathlon16_AdaptiveTrainingRL.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Peters, J.; Rasmussen, C.E.; Deisenroth, M.P. |
Year | 2016 |
Title | Manifold Gaussian Processes for Regression |
Journal/Conference/Book Title | Proceedings of the International Joint Conference on Neural Networks (IJCNN) |
Keywords | CoDyCo |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/1402.5876v4 |
Reference Type | Conference Proceedings |
Author(s) | Weber, P.; Rueckert, E.; Calandra, R.; Peters, J.; Beckerle, P. |
Year | 2016 |
Title | A Low-cost Sensor Glove with Vibrotactile Feedback and Multiple Finger Joint and Hand Motion Sensing for Human-Robot Interaction |
Journal/Conference/Book Title | Proceedings of the IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN) |
Keywords | CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/ElmarR%c3%bcckert/ROMANS16_daglove.pdf |
Reference Type | Journal Article |
Author(s) | Rueckert, E.; Camernik, J.; Peters, J.; Babic, J. |
Year | 2016 |
Title | Probabilistic Movement Models Show that Postural Control Precedes and Predicts Volitional Motor Control |
Journal/Conference/Book Title | Nature PG: Scientific Reports |
Keywords | CoDyCo; TACMAN |
Volume | 6 |
Number | 28455 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/srep28455 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/srep28455 |
Reference Type | Journal Article |
Author(s) | Daniel, C.; van Hoof, H.; Peters, J.; Neumann, G. |
Year | 2016 |
Title | Probabilistic Inference for Determining Options in Reinforcement Learning |
Journal/Conference/Book Title | Machine Learning (MLJ) |
Volume | 104 |
Number | 2-3 |
Pages | 337-357 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Daniel2016ECML.pdf |
Reference Type | Conference Proceedings |
Author(s) | Manschitz, S.; Gienger, M.; Kober, J.; Peters, J. |
Year | 2016 |
Title | Probabilistic Decomposition of Sequential Force Interaction Tasks into Movement Primitives |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | Honda, HRI-Collaboration |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/SimonManschitz/ManschitzIROS2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; Chen, N.; Karl, M.; van der Smagt, P.; Peters, J. |
Year | 2016 |
Title | Stable Reinforcement Learning with Autoencoders for Tactile and Visual Data |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | TACMAN, tactile manipulation |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/hoof2016IROS.pdf |
Reference Type | Conference Proceedings |
Author(s) | Yi, Z.; Calandra, R.; Veiga, F.; van Hoof, H.; Hermans, T.; Zhang, Y.; Peters, J. |
Year | 2016 |
Title | Active Tactile Object Exploration with Gaussian Processes |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | TACMAN, tactile manipulation |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Publications/Other/iros2016yi.pdf |
Reference Type | Conference Proceedings |
Author(s) | Koc, O.; Peters, J.; Maeda, G. |
Year | 2016 |
Title | A New Trajectory Generation Framework in Robotic Table Tennis |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Reference Type | Conference Proceedings |
Author(s) | Belousov, B.; Neumann, G.; Rothkopf, C.; Peters, J. |
Year | 2016 |
Title | Catching Heuristics Are Optimal Control Policies |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
Keywords | SKILLS4ROBOTS |
Abstract | Two seemingly contradictory theories attempt to explain how humans move to intercept an airborne ball. One theory posits that humans predict the ball trajectory to optimally plan future actions; the other claims that, instead of performing such complicated computations, humans employ heuristics to reactively choose appropriate actions based on immediate visual feedback. In this paper, we show that interception strategies appearing to be heuristics can be understood as computational solutions to the optimal control problem faced by a ball-catching agent acting under uncertainty. Modeling catching as a continuous partially observable Markov decision process and employing stochastic optimal control theory, we discover that the four main heuristics described in the literature are optimal solutions if the catcher has sufficient time to continuously visually track the ball. Specifically, by varying model parameters such as noise, time to ground contact, and perceptual latency, we show that different strategies arise under different circumstances. The catcher's policy switches between generating reactive and predictive behavior based on the ratio of system to observation noise and the ratio between reaction time and task duration. Thus, we provide a rational account of human ball-catching behavior and a unifying explanation for seemingly contradictory theories of target interception on the basis of stochastic optimal control. |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Belousov_ANIPS_2016.pdf |
Reference Type | Generic |
Author(s) | Maeda, G.; Maloo, A.; Ewerton, M.; Lioutikov, R.; Peters, J. |
Year | 2016 |
Title | Proactive Human-Robot Collaboration with Interaction Primitives |
Journal/Conference/Book Title | International Workshop on Human-Friendly Robotics (HFR), Genoa, Italy |
Keywords | 3rd-Hand, BIMROB |
Reference Type | Conference Proceedings |
Author(s) | Maeda, G.; Maloo, A.; Ewerton, M.; Lioutikov, R.; Peters, J. |
Year | 2016 |
Title | Anticipative Interaction Primitives for Human-Robot Collaboration |
Journal/Conference/Book Title | AAAI Fall Symposium Series. Shared Autonomy in Research and Practice, Arlington, VA, USA |
Keywords | 3rd-Hand, BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/maeda-maloo_AAAI_symposium.pdf |
Reference Type | Book Section |
Author(s) | Peters, J.; Tedrake, R.; Roy, N.; Morimoto, J. |
Year | 2016 |
Title | Robot Learning |
Journal/Conference/Book Title | Encyclopedia of Machine Learning, 2nd Edition, Invited Article |
Reference Type | Conference Proceedings |
Author(s) | Tanneberg, D.; Paraschos, A.; Peters, J.; Rueckert, E. |
Year | 2016 |
Title | Deep Spiking Networks for Model-based Planning in Humanoids |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo; TACMAN |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/DanielTanneberg/tanneberg_humanoids16.pdf |
Reference Type | Conference Proceedings |
Author(s) | Huang, Y.; Buechler, D.; Koc, O.; Schoelkopf, B.; Peters, J. |
Year | 2016 |
Title | Jointly Learning Trajectory Generation and Hitting Point Prediction in Robot Table Tennis |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Reference Type | Conference Proceedings |
Author(s) | Koert, D.; Maeda, G.J.; Lioutikov, R.; Neumann, G.; Peters, J. |
Year | 2016 |
Title | Demonstration Based Trajectory Optimization for Generalizable Robot Motions |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | 3rd-Hand, SKILLS4ROBOTS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/DorotheaKoert/Debato.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez-Gonzalez, S.; Neumann, G.; Schoelkopf, B.; Peters, J. |
Year | 2016 |
Title | Using Probabilistic Movement Primitives for Striking Movements |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Maeda, G.J.; Kollegger, G.; Wiemeyer, J.; Peters, J. |
Year | 2016 |
Title | Incremental Imitation Learning of Context-Dependent Motor Skills |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | BIMROB, 3rd-Hand |
Pages | 351--358 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_humanoids_2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | Azad, M.; Ortenzi, V.; Lin, H.C.; Rueckert, E.; Mistry, M. |
Year | 2016 |
Title | Model Estimation and Control of Compliant Contact Normal Force |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Humanoids2016Azad.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kollegger, G.; Ewerton, M.; Peters, J.; Wiemeyer, J. |
Year | 2016 |
Title | Bidirektionale Interaktion zwischen Mensch und Roboter beim Bewegungslernen (BIMROB) |
Journal/Conference/Book Title | 11. Symposium der DVS Sportinformatik |
Keywords | BIMROB |
Link to PDF | http://www.sportinformatik2016.ovgu.de/Tagung/Abstracts.html |
Reference Type | Conference Proceedings |
Author(s) | Parisi, S.; Blank, A.; Viernickel, T.; Peters, J. |
Year | 2016 |
Title | Local-utopia Policy Selection for Multi-objective Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/parisi2016local.pdf |
Reference Type | Book Section |
Author(s) | Peters, J.; Bagnell, J.A. |
Year | 2016 |
Title | Policy gradient methods |
Journal/Conference/Book Title | Encyclopedia of Machine Learning, 2nd Edition, Invited Article |
Reference Type | Conference Proceedings |
Author(s) | Fiebig, K.-H.; Jayaram, V.; Peters, J.; Grosse-Wentrup, M. |
Year | 2016 |
Title | Multi-Task Logistic Regression in Brain-Computer Interfaces |
Journal/Conference/Book Title | IEEE SMC 2016 — 6th Workshop on Brain-Machine Interface Systems |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/smc_2016_FiJaPeGW_mtl_logreg_v2.pdf |
Reference Type | Journal Article |
Author(s) | Yi, Z.; Zhang, Y.; Peters, J. |
Year | 2016 |
Title | Surface Roughness Discrimination Using Bioinspired Tactile Sensors |
Journal/Conference/Book Title | Proceedings of the 16th International Conference on Biomedical Engineering |
Reference Type | Unpublished Work |
Author(s) | Arenz, O.; Neumann, G. |
Year | 2016 |
Title | Iterative Cost Learning from Different Types of Human Feedback |
Journal/Conference/Book Title | IROS 2016 Workshop on Human-Robot Collaboration |
Keywords | Inverse Reinforcement Learning, Preference Learning |
Abstract | Human-robot collaboration in unstructured environments often involves different types of interactions. These interactions usually occur frequently during normal operation and may provide valuable information about the task to the robot. It is therefore sensible to utilize this data for lifelong robot learning. Learning from human interactions is an active field of research, e.g., Inverse Reinforcement Learning, which aims at learning from demonstrations, or Preference Learning, which aims at learning from human preferences. However, learning from a combination of different types of feedback is still little explored. In this paper, we propose a method for inferring a reward function from a combination of expert demonstrations, pairwise preferences, star ratings as well as oracle-based evaluations of the true reward function. Our method extends Maximum Entropy Inverse Reinforcement Learning in order to account for the additional types of human feedback by framing them as constraints to the original optimization problem. We demonstrate on a gridworld, that the resulting optimization problem can be solved based on the Alternating Direction Method of Multipliers (ADMM), even when confronted with a large amount of training data. |
URL(s) | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/arenz_workshop_IROS16.pdf |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/arenz_workshop_IROS16.pdf |
Reference Type | Unpublished Work |
Author(s) | Arenz, O.; Abdulsamad, H.; Neumann, G. |
Year | 2016 |
Title | (Inverse) Optimal Control for Matching Higher-Order Moments |
Journal/Conference/Book Title | DGR Days (Leipzig) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/OlegArenz/oleg_dgr_2016.pdf |
Reference Type | Journal Article |
Author(s) | Calandra, R.; Seyfarth, A.; Peters, J.; Deisenroth, M. |
Year | 2015 |
Title | Bayesian Optimization for Learning Gaits under Uncertainty |
Journal/Conference/Book Title | Annals of Mathematics and Artificial Intelligence (AMAI) |
Keywords | CoDyCo |
Abstract | Designing gaits and corresponding control policies is a key challenge in robot locomotion. Even with a viable controller parameterization, finding nearoptimal parameters can be daunting. Typically, this kind of parameter optimization requires specific expert knowledge and extensive robot experiments. Automatic black-box gait optimization methods greatly reduce the need for human expertise and time-consuming design processes. Many different approaches for automatic gait optimization have been suggested to date, such as grid search and evolutionary algorithms. In this article, we thoroughly discuss multiple of these optimization methods in the context of automatic gait optimization. Moreover, we extensively evaluate Bayesian optimization, a model-based approach to black-box optimization under uncertainty, on both simulated problems and real robots. This evaluation demonstrates that Bayesian optimization is particularly suited for robotic applications, where it is crucial to find a good set of gait parameters in a small number of experiments. |
URL(s) | http://dx.doi.org/10.1007/s10472-015-9463-9 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Calandra2015a.pdf |
Reference Type | Journal Article |
Author(s) | Mariti, C.; Muscolo, G.G.; Peters, J.; Puig, D.; Recchiuto, C.T.; Sighieri, C.; Solanas, A.; von Stryk, O. |
Year | 2015 |
Title | Developing biorobotics for veterinary research into cat movements |
Journal/Conference/Book Title | Journal of Veterinary Behavior: Clinical Applications and Research |
Volume | 10 |
Number | 3 |
Pages | 248-254 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Developing_biorobotics_for_veterinary_research.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; Peters, J.; Neumann, G. |
Year | 2015 |
Title | Learning of Non-Parametric Control Policies with High-Dimensional State Features |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS) |
Keywords | TACMAN |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/hoof2015learning.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Ivaldi, S.; Deisenroth, M.; Rueckert, E.; Peters, J. |
Year | 2015 |
Title | Learning Inverse Dynamics Models with Contacts |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_ICRA15.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Daniel, C.; Neumann, G.; van Hoof, H.; Peters, J. |
Year | 2015 |
Title | Towards Learning Hierarchical Skills for Multi-Phase Manipulation Tasks |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-hand, 3rdHand |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/KroemerICRA15.pdf |
Reference Type | Conference Proceedings |
Author(s) | Rueckert, E.; Mundo, J.; Paraschos, A.; Peters, J.; Neumann, G. |
Year | 2015 |
Title | Extracting Low-Dimensional Control Variables for Movement Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rdHand, CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Rueckert_ICRA14LMProMPsFinal.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Neumann, G.; Lioutikov, R.; Ben Amor, H.; Peters, J.; Maeda, G. |
Year | 2015 |
Title | Learning Multiple Collaborative Tasks with a Mixture of Interaction Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-Hand, CompLACS, BIMROB |
Pages | 1535--1542 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_icra_2015_seattle.pdf |
Reference Type | Conference Proceedings |
Author(s) | Traversaro, S.; Del Prete, A.; Ivaldi, S.; Nori, F. |
Year | 2015 |
Title | Avoiding to rely on Inertial Parameters in Estimating Joint Torques with proximal F/T sensing |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | CoDyCo |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICRA15_2129_FI.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lopes, M.; Peters, J.; Piater, J.; Toussaint, M.; Baisero, A.; Busch, B.; Erkent, O.; Kroemer, O.; Lioutikov, R.; Maeda, G.; Mollard, Y.; Munzer, T.; Shukla, D. |
Year | 2015 |
Title | Semi-Autonomous 3rd-Hand Robot |
Journal/Conference/Book Title | Workshop on Cognitive Robotics in Future Manufacturing Scenarios, European Robotics Forum, Vienna, Austria |
Keywords | 3rdhand |
Link to PDF | https://iis.uibk.ac.at/public/papers/Lopes-2015-CogRobFoF.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lioutikov, R.; Neumann, G.; Maeda, G.J.; Peters, J. |
Year | 2015 |
Title | Probabilistic Segmentation Applied to an Assembly Task |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | 3rd-hand |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/lioutikov_humanoids_2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Paraschos, A.; Rueckert, E.; Peters, J.; Neumann, G. |
Year | 2015 |
Title | Model-Free Probabilistic Movement Primitives for Physical Interaction |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/PubAlexParaschos/Paraschos_IROS_2015.pdf |
Reference Type | Conference Paper |
Author(s) | Rueckert, E.; Lioutikov, R.; Calandra, R.; Schmidt, M.; Beckerle, P.; Peters, J. |
Year | 2015 |
Title | Low-cost Sensor Glove with Force Feedback for Learning from Demonstrations using Probabilistic Trajectory Representations |
Journal/Conference/Book Title | ICRA 2015 Workshop on Tactile and force sensing for autonomous compliant intelligent robots |
Keywords | CoDyCo |
URL(s) | http://arxiv.org/abs/1510.03253 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Workshops/ICRA2015TactileForce/13_icra_ws_tactileforce.pdf |
Reference Type | Generic |
Author(s) | Ewerton, M.; Neumann, G.; Lioutikov, R.; Ben Amor, H.; Peters, J.; Maeda, G. |
Year | 2015 |
Title | Modeling Spatio-Temporal Variability in Human-Robot Interaction with Probabilistic Movement Primitives |
Journal/Conference/Book Title | Workshop on Machine Learning for Social Robotics, ICRA |
Keywords | 3rd-Hand, CompLACS, BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_workshop_ml_social_robotics_icra_2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Parisi, S.; Abdulsamad, H.; Paraschos, A.; Daniel, C.; Peters, J. |
Year | 2015 |
Title | Reinforcement Learning vs Human Programming in Tetherball Robot Games |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | SCARL |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Member/PubSimoneParisi/parisi_iros_2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Veiga, F.F.; van Hoof, H.; Peters, J.; Hermans, T. |
Year | 2015 |
Title | Stabilizing Novel Objects by Learning to Predict Tactile Slip |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | TACMAN |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/IROS2015veiga.pdf |
Reference Type | Conference Proceedings |
Author(s) | Huang, Y.; Schoelkopf, B.; Peters, J. |
Year | 2015 |
Title | Learning Optimal Striking Points for A Ping-Pong Playing Robot |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/YanlongHuang/Yanlong_IROS2015 |
Reference Type | Conference Proceedings |
Author(s) | Manschitz, S.; Kober, J.; Gienger, M.; Peters, J. |
Year | 2015 |
Title | Probabilistic Progress Prediction and Sequencing of Concurrent Movement Primitives |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | Honda |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/ManschitzIROS2015_v2.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Maeda, G.J.; Peters, J.; Neumann, G. |
Year | 2015 |
Title | Learning Motor Skills from Partially Observed Movements Executed at Different Speeds |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | BIMROB, 3rd-hand |
Pages | 456--463 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_iros_2015_hamburg.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wahrburg, A.; Zeiss, S.; Matthias, B.; Peters, J.; Ding, H. |
Year | 2015 |
Title | Combined Pose-Wrench and State Machine Representation for Modeling Robotic Assembly Skills |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | ABB |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Wahrburg_IROS_2015.pdf |
Reference Type | Journal Article |
Author(s) | Daniel, C.; Kroemer, O.; Viering, M.; Metz, J.; Peters, J. |
Year | 2015 |
Title | Active Reward Learning with a Novel Acquisition Function |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Keywords | ComPLACS |
Volume | 39 |
Number | 3 |
Pages | 389-405 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/ChristianDaniel/ActiveRewardLearning.pdf |
Reference Type | Conference Proceedings |
Author(s) | Fritsche, L.; Unverzagt, F.; Peters, J.; Calandra, R. |
Year | 2015 |
Title | First-Person Tele-Operation of a Humanoid Robot |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Fritsche_Humanoids15.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Ivaldi, S.; Deisenroth, M.; Peters, J. |
Year | 2015 |
Title | Learning Torque Control in Presence of Contacts using Tactile Sensing from Robot Skin |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_humanoids2015.pdf |
Reference Type | Journal Article |
Author(s) | Manschitz, S.; Kober, J.; Gienger, M.; Peters, J. |
Year | 2015 |
Title | Learning Movement Primitive Attractor Goals and Sequential Skills from Kinesthetic Demonstrations |
Journal/Conference/Book Title | Robotics and Autonomous Systems |
Keywords | Honda, HRI-Collaboration |
Volume | 74 |
Pages | 97-107 |
ISBN/ISSN | 0921-8890 |
URL(s) | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/ManschitzRAS2015_v2.pdf |
Reference Type | Conference Proceedings |
Author(s) | Maeda, G.; Neumann, G.; Ewerton, M.; Lioutikov, R.; Peters, J. |
Year | 2015 |
Title | A Probabilistic Framework for Semi-Autonomous Robots Based on Interaction Primitives with Phase Estimation |
Journal/Conference/Book Title | Proceedings of the International Symposium of Robotics Research (ISRR) |
Keywords | 3rd-Hand, BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/ISRR_uploaded_20150814_small.pdf |
Reference Type | Conference Proceedings |
Author(s) | Koc, O.; Maeda, G.; Neumann, G.; Peters, J. |
Year | 2015 |
Title | Optimizing Robot Striking Movement Primitives with Iterative Learning Control |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | 3rd-hand |
Reference Type | Conference Proceedings |
Author(s) | Hoelscher, J.; Peters, J.; Hermans, T. |
Year | 2015 |
Title | Evaluation of Interactive Object Recognition with Tactile Sensing |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | TACMAN, tactile manipulation |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Theses/hoelscher_ichr2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; Hermans, T.; Neumann, G.; Peters, J. |
Year | 2015 |
Title | Learning Robot In-Hand Manipulation with Tactile Features |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | TACMAN, tactile manipulation |
URL(s) | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/HoofHumanoids2015.pdf |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/HoofHumanoids2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Leischnig, S.; Luettgen, S.; Kroemer, O.; Peters, J. |
Year | 2015 |
Title | A Comparison of Contact Distribution Representations for Learning to Predict Object Interactions |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | TACMAN, tactile manipulation |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Leischnig-Humanoids-2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Abdolmaleki, A.; Lioutikov, R.; Peters, J.; Lau, N.; Reis, L.; Neumann, G. |
Year | 2015 |
Title | Model-Based Relative Entropy Stochastic Search |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
Keywords | LearnRobotS |
Place Published | Cambridge, MA |
Publisher | MIT Press |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/GerhardNeumann/Abdolmaleki_NIPS2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Dann, C.; Neumann, G.; Peters, J. |
Year | 2015 |
Title | Policy Evaluation with Temporal Differences: A Survey and Comparison |
Journal/Conference/Book Title | Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling (ICAPS) |
Pages | 359-360 |
Reference Type | Journal Article |
Author(s) | Lioutikov, R.; Paraschos, A.; Peters, J.; Neumann, G. |
Year | 2014 |
Title | Generalizing Movements with Information Theoretic Stochastic Optimal Control |
Journal/Conference/Book Title | Journal of Aerospace Information Systems |
Volume | 11 |
Number | 9 |
Pages | 579-595 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/lioutikov_2014_itsoc.pdf |
Reference Type | Journal Article |
Author(s) | Neumann, G.; Daniel, C.; Paraschos, A.; Kupcsik, A.; Peters, J. |
Year | 2014 |
Title | Learning Modular Policies for Robotics |
Journal/Conference/Book Title | Frontiers in Computational Neuroscience |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/fncom-08-00062.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nori, F.; Peters, J.; Padois, V.; Babic, J.; Mistry, M.; Ivaldi, S. |
Year | 2014 |
Title | Whole-body motion in humans and humanoids |
Journal/Conference/Book Title | Proceedings of the Workshop on New Research Frontiers for Intelligent Autonomous Systems (NRF-IAS) |
Keywords | CoDyCo |
Pages | 81-92 |
URL(s) | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/nori2014iascodyco.pdf |
Reference Type | Journal Article |
Author(s) | Dann, C.; Neumann, G.; Peters, J. |
Year | 2014 |
Title | Policy Evaluation with Temporal Differences: A Survey and Comparison |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Keywords | ComPLACS |
Volume | 15 |
Number | March |
Pages | 809-883 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/dann14a.pdf |
Reference Type | Journal Article |
Author(s) | Meyer, T.; Peters, J.; Zander, T.O.; Schoelkopf, B.; Grosse-Wentrup, M. |
Year | 2014 |
Title | Predicting Motor Learning Performance from Electroencephalographic Data |
Journal/Conference/Book Title | Journal of Neuroengineering and Rehabilitation |
Keywords | Team Athena-Minerva |
Volume | 11 |
Number | 1 |
URL(s) | http://www.ias.tu-darmstadt.de/uploads/Publications/Meyer_JNER_2013.pdf |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Meyer_JNER_2013.pdf |
Reference Type | Journal Article |
Author(s) | Bocsi, B.; Csato, L.; Peters, J. |
Year | 2014 |
Title | Indirect Robot Model Learning for Tracking Control |
Journal/Conference/Book Title | Advanced Robotics |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Bocsi_AR_2014.pdf |
Reference Type | Journal Article |
Author(s) | Ben Amor, H.; Saxena, A.; Hudson, N.; Peters, J. |
Year | 2014 |
Title | Special issue on autonomous grasping and manipulation |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Reference Type | Journal Article |
Author(s) | Deisenroth, M.P.; Fox, D.; Rasmussen, C.E. |
Year | 2014 |
Title | Gaussian Processes for Data-Efficient Learning in Robotics and Control |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/pami_final_w_appendix.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/pami_final_w_appendix.pdf |
Reference Type | Journal Article |
Author(s) | Wierstra, D.; Schaul, T.; Glasmachers, T.; Sun, Y.; Peters, J.; Schmidhuber, J. |
Year | 2014 |
Title | Natural Evolution Strategies |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Volume | 15 |
Number | March |
Pages | 949-980 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/wierstra14a.pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Englert, P.; Peters, J.; Fox, D. |
Year | 2014 |
Title | Multi-Task Policy Search for Robotics |
Journal/Conference/Book Title | Proceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Deisenroth_ICRA_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bischoff, B.; Nguyen-Tuong, D.; van Hoof, H.; McHutchon, A.; Rasmussen, C.E.; Knoll, A.; Peters, J.; Deisenroth, M.P. |
Year | 2014 |
Title | Policy Search For Learning Robot Control Using Sparse Data |
Journal/Conference/Book Title | Proceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Bischoff_ICRA_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Seyfarth, A.; Peters, J.; Deisenroth, M.P. |
Year | 2014 |
Title | An Experimental Comparison of Bayesian Optimization for Bipedal Locomotion |
Journal/Conference/Book Title | Proceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_ICRA2014.pdf |
Reference Type | Book |
Author(s) | Kober, J.; Peters, J. |
Year | 2014 |
Title | Learning Motor Skills - From Algorithms to Robot Experiments |
Journal/Conference/Book Title | Springer Tracts in Advanced Robotics 97 (STAR Series), Springer |
ISBN/ISSN | 978-3-319-03193-4 |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; van Hoof, H.; Neumann, G.; Peters, J. |
Year | 2014 |
Title | Learning to Predict Phases of Manipulation Tasks as Hidden States |
Journal/Conference/Book Title | Proceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | TACMAN, 3rd-Hand |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ben Amor, H.; Neumann, G.; Kamthe, S.; Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Interaction Primitives for Human-Robot Cooperation Tasks |
Journal/Conference/Book Title | Proceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | CoDyCo, ComPLACS |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/icraHeniInteract.pdf |
Reference Type | Conference Proceedings |
Author(s) | Haji Ghassemi, N.; Deisenroth, M.P. |
Year | 2014 |
Title | Approximate Inference for Long-Term Forecasting with Periodic Gaussian Processes |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Analytic_Long-Term_Forecasting.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Gopalan, N.; Seyfarth, A.; Peters, J.; Deisenroth, M.P. |
Year | 2014 |
Title | Bayesian Gait Optimization for Bipedal Locomotion |
Journal/Conference/Book Title | Proceedings of the 2014 Learning and Intelligent Optimization Conference (LION8) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_LION8.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kamthe, S.; Peters, J.; Deisenroth, M. |
Year | 2014 |
Title | Multi-modal filtering for non-linear estimation |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) |
Link to PDF | https://spiral.imperial.ac.uk:8443/bitstream/10044/1/12921/2/ICASSP_Final.pdf |
Reference Type | Conference Proceedings |
Author(s) | Manschitz, S.; Kober, J.; Gienger, M.; Peters, J. |
Year | 2014 |
Title | Learning to Unscrew a Light Bulb from Demonstrations |
Journal/Conference/Book Title | Proceedings of ISR/ROBOTIK 2014 |
Keywords | HRI-Collaboration |
Reference Type | Journal Article |
Author(s) | Muelling, K.; Boularias, A.; Schoelkopf, B.; Peters, J. |
Year | 2014 |
Title | Learning Strategies in Table Tennis using Inverse Reinforcement Learning |
Journal/Conference/Book Title | Biological Cybernetics |
Volume | 108 |
Number | 5 |
Pages | 603-619 |
Custom 1 | DOI 10.1007/s00422-014-0599-1 |
Custom 2 | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/Muelling_BICY_2014.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Research/Overview/Learning_strategies_in_table_tennis_usin.pdf |
Reference Type | Journal Article |
Author(s) | Saut, J.-P.; Ivaldi, S.; Sahbani, A.; Bidaud, P. |
Year | 2014 |
Title | Grasping objects localized from uncertain point cloud data |
Journal/Conference/Book Title | Robotics and Autonomous Systems |
Keywords | CoDyCo |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/auro2013_final.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lioutikov, R.; Kroemer, O.; Peters, J.; Maeda, G. |
Year | 2014 |
Title | Learning Manipulation by Sequencing Motor Primitives with a Two-Armed Robot |
Journal/Conference/Book Title | Proceedings of the 13th International Conference on Intelligent Autonomous Systems (IAS) |
Keywords | 3rd-Hand |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Member/PubRudolfLioutikov/lioutikov_ias13_conf.pdf |
Reference Type | Conference Proceedings |
Author(s) | Daniel, C.; Viering, M.; Metz, J.; Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Active Reward Learning |
Journal/Conference/Book Title | Proceedings of Robotics: Science & Systems (R:SS) |
Keywords | ComPLACS |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Daniel_RSS_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Predicting Object Interactions from Contact Distributions |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | 3rd-Hand, TACMAN, CoDyCo |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/KroemerIROS2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Chebotar, Y.; Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Learning Robot Tactile Sensing for Object Manipulation |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | 3rd-Hand, TACMAN |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/ChebotarIROS2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Manschitz, S.; Kober, J.; Gienger, M.; Peters, J. |
Year | 2014 |
Title | Learning to Sequence Movement Primitives from Demonstrations |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | HRI-Collaboration, Honda |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/ManschitzIROS2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Luck, K.S.; Neumann, G.; Berger, E.; Peters, J.; Ben Amor, H. |
Year | 2014 |
Title | Latent Space Policy Search for Robotics |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | ComPLACS, CoDyCo |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Luck_IROS_2014.pdf |
Reference Type | Journal Article |
Author(s) | van Hoof, H.; Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Probabilistic Segmentation and Targeted Exploration of Objects in Cluttered Environments |
Journal/Conference/Book Title | IEEE Transactions on Robotics (TRo) |
Volume | 30 |
Number | 5 |
Pages | 1198-1209 |
ISBN/ISSN | 1552-3098 |
URL(s) | http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6870500&tag=1 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/hoof2014probabilistic.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez, V.; Kappen, B.; Peters, J.; Neumann, G. |
Year | 2014 |
Title | Policy Search for Path Integral Control |
Journal/Conference/Book Title | Proceedings of the European Conference on Machine Learning (ECML) |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Gomez_ECML_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Maeda, G.J.; Ewerton, M.; Lioutikov, R.; Ben Amor, H.; Peters, J.; Neumann, G. |
Year | 2014 |
Title | Learning Interaction for Collaborative Tasks with Probabilistic Movement Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | 3rd-Hand, CompLACS |
Pages | 527--534 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/maeda2014InteractionProMP_HUMANOIDS.pdf |
Reference Type | Conference Proceedings |
Author(s) | Brandl, S.; Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Generalizing Pouring Actions Between Objects using Warped Parameters |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | 3rd-Hand |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/BrandlHumanoids2014Final.pdf |
Reference Type | Conference Proceedings |
Author(s) | Colome, A.; Neumann, G.; Peters, J.; Torras, C. |
Year | 2014 |
Title | Dimensionality Reduction for Probabilistic Movement Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Colome_Humanoids_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Rueckert, E.; Mindt, M.; Peters, J.; Neumann, G. |
Year | 2014 |
Title | Robust Policy Updates for Stochastic Optimal Control |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/AICOHumanoidsFinal.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ivaldi, S.; Peters, J.; Padois, V.; Nori, F. |
Year | 2014 |
Title | Tools for simulating humanoid robot dynamics: a survey based on user feedback |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/ivaldi2014simulators.pdf |
Reference Type | Journal Article |
Author(s) | Droniou, A.; Ivaldi, S.; Sigaud, O. |
Year | 2014 |
Title | Deep unsupervised network for multimodal perception, representation and classification |
Journal/Conference/Book Title | Robotics and Autonomous Systems |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Deep unsupervised network_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Hermans, T.; Veiga, F.; Hoelscher, J.; van Hoof, H.; Peters, J. |
Year | 2014 |
Title | Demonstration: Learning for Tactile Manipulation |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS/NeurIPS), Demonstration Track. |
Keywords | TACMAN, tactile manipulation |
Abstract | Tactile sensing affords robots the opportunity to dexterously manipulate objects in-hand without the need of strong object models and planning. Our demonstration focuses on learning for tactile, in-hand manipulation by robots. We address learning problems related to the control of objects in-hand, as well as perception problems encountered by a robot exploring its environment with a tactile sensor. We demonstrate applications for three specific learning problems: learning to detect slip for grasp stability, learning to reposition objects in-hand, and learning to identify objects and object properties through tactile exploration. We address the problem of learning to detect slip of grasped objects. We show that the robot can learn a detector for slip events which generalizes to novel objects. We leverage this slip detector to produce a feedback controller that can stabilize objects during grasping and manipulation. Our work compares a number of supervised learning approaches and feature representations in order to achieve reliable slip detection. Tactile sensors provide observations of high enough dimension to cause problems for traditional reinforcement learning methods. As such, we introduce a novel reinforcement learning (RL) algorithm which learns transition functions embedded in a reproducing kernel Hilbert space (RKHS). The resulting policy search algorithm provides robust policy updates which can efficiently deal with high-dimensional sensory input. We demonstrate the method on the problem of repositioning a grasped object in the hand. Finally, we present a method for learning to classify objects through tactile exploration. The robot collects data from a number of objects through various exploratory motions. The robot learns a classifier for each object to be used during exploration of its environment to detect objects in cluttered environments. Here again we compare a number of learning methods and features present in the literature and synthesize a method to best work in the human environments the robot is likely to encounter. Users will be able to interact with a robot hand by giving it objects to grasp and attempting to remove these objects from the robot. The hand will also perform some basic in-hand manipulation tasks such as rolling the object between the fingers and rotating the object about a fixed grasp point. Users will also be able to interact with a touch sensor capable of classifying objects as well as semantic events such as slipping from a stable contact location. |
Place Published | Cambridge, MA |
Publisher | MIT Press |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/TuckerHermans/learning_tactile_manipulation_demo.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lioutikov, R.; Paraschos, A.; Peters, J.; Neumann, G. |
Year | 2014 |
Title | Sample-Based Information-Theoretic Stochastic Optimal Control |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-Hand |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/RudolfLioutikov/lioutikov_icra_2014.pdf |
Reference Type | Journal Article |
Author(s) | Muelling, K.; Kober, J.; Kroemer, O.; Peters, J. |
Year | 2013 |
Title | Learning to Select and Generalize Striking Movements in Robot Table Tennis |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | GeRT |
Volume | 32 |
Number | 3 |
Pages | 263-279 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_IJRR_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_IJRR_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Daniel, C.; Neumann, G.; Kroemer, O.; Peters, J. |
Year | 2013 |
Title | Learning Sequential Motor Tasks |
Journal/Conference/Book Title | Proceedings of 2013 IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | GeRT, CompLACS |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_ICRA_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_ICRA_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Englert, P.; Paraschos, A.; Peters, J.; Deisenroth, M. P. |
Year | 2013 |
Title | Model-based Imitation Learning by Probabilistic Trajectory Matching |
Journal/Conference/Book Title | Proceedings of 2013 IEEE International Conference on Robotics and Automation (ICRA) |
Abstract | One of the most elegant ways of teaching new skills to robots is to provide demonstrations of a task and let the robot imitate this behavior. Such imitation learning is a non-trivial task: Different anatomies of robot and teacher, and reduced robustness towards changes in the control task are two major difficulties in imitation learning. We present an imitation-learning approach to efficiently learn a task from expert demonstrations. Instead of finding policies indirectly, either via state-action mappings (behavioral cloning), or cost function learning (inverse reinforcement learning), our goal is to find policies directly such that predicted trajectories match observed ones. To achieve this aim, we model the trajectory of the teacher and the predicted robot trajectory by means of probability distributions. We match these distributions by minimizing their Kullback-Leibler divergence. In this paper, we propose to learn probabilistic forward models to compute a probability distribution over trajectories. We compare our approach to model-based reinforcement learning methods with hand-crafted cost functions. Finally, we evaluate our method with experiments on a real compliant robot. |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Englert_ICRA_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Englert_ICRA_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gopalan, N.; Deisenroth, M. P.; Peters, J. |
Year | 2013 |
Title | Feedback Error Learning for Rhythmic Motor Primitives |
Journal/Conference/Book Title | Proceedings of 2013 IEEE International Conference on Robotics and Automation (ICRA) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gopalan_ICRA_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gopalan_ICRA_2013.pdf |
Reference Type | Journal Article |
Author(s) | Wang, Z.; Muelling, K.; Deisenroth, M. P.; Ben Amor, H.; Vogt, D.; Schoelkopf, B.; Peters, J. |
Year | 2013 |
Title | Probabilistic Movement Modeling for Intention Inference in Human-Robot Interaction |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Volume | 32 |
Number | 7 |
Pages | 841-858 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_IJRR_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_IJRR_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J.; Muelling, K.; Kroemer, O.; Neumann, G. |
Year | 2013 |
Title | Towards Robot Skill Learning: From Simple Skills to Table Tennis |
Journal/Conference/Book Title | Proceedings of the European Conference on Machine Learning (ECML), Nectar Track |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/peters_ECML_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/peters_ECML_2013.pdf |
Reference Type | Journal Article |
Author(s) | Kober, J.; Bagnell, D.; Peters, J. |
Year | 2013 |
Title | Reinforcement Learning in Robotics: A Survey |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Volume | 32 |
Number | 11 |
Pages | 1238-1274 |
URL(s) | http://www.ias.tu-darmstadt.de/uploads/Publications/Kober_IJRR_2013.pdf |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Kober_IJRR_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kupcsik, A.G.; Deisenroth, M.P.; Peters, J.; Neumann, G. |
Year | 2013 |
Title | Data-Efficient Generalization of Robot Skills with Contextual Policy Search |
Journal/Conference/Book Title | Proceedings of the National Conference on Artificial Intelligence (AAAI) |
Keywords | GeRT, ComPLACS |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kupcsik_AAAI_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kupcsik_AAAI_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bocsi, B.; Csato, L.; Peters, J. |
Year | 2013 |
Title | Alignment-based Transfer Learning for Robot Models |
Journal/Conference/Book Title | Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_IJCNN_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_IJCNN_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Daniel, C.; Neumann, G.; Peters, J. |
Year | 2013 |
Title | Autonomous Reinforcement Learning with Hierarchical REPS |
Journal/Conference/Book Title | Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN) |
Keywords | GeRT, CompLACS |
Reference Type | Journal Article |
Author(s) | Englert, P.; Paraschos, A.; Peters, J.; Deisenroth, M.P. |
Year | 2013 |
Title | Probabilistic Model-based Imitation Learning |
Journal/Conference/Book Title | Adaptive Behavior Journal |
Volume | 21 |
Pages | 388-403 |
URL(s) | http://www.ias.tu-darmstadt.de/uploads/Publications/Englert_ABJ_2013.pdf |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Englert_ABJ_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J.; Muelling, K.; Nguyen-Tuong, D.; Kroemer, O. |
Year | 2013 |
Title | Learning Skills with Motor Primitives |
Journal/Conference/Book Title | Proceedings of the 16th Yale Learning Workshop |
Reference Type | Conference Proceedings |
Author(s) | Neumann, G.; Kupcsik, A.G.; Deisenroth, M.P.; Peters, J. |
Year | 2013 |
Title | Information-Theoretic Motor Skill Learning |
Journal/Conference/Book Title | Proceedings of the AAAI 2013 Workshop on Intelligent Robotic Systems |
Keywords | ComPLACS |
Reference Type | Conference Proceedings |
Author(s) | Ben Amor, H.; Vogt, D.; Ewerton, M.; Berger, E.; Jung, B.; Peters, J. |
Year | 2013 |
Title | Learning Responsive Robot Behavior by Imitation |
Journal/Conference/Book Title | Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/iros2013Heni.pdf |
Reference Type | Journal Article |
Author(s) | Deisenroth, M. P.; Neumann, G.; Peters, J. |
Year | 2013 |
Title | A Survey on Policy Search for Robotics |
Journal/Conference/Book Title | Foundations and Trends in Robotics |
Keywords | CompLACS |
Volume | 2 |
Number | 1-2 |
Pages | 1-142 |
URL(s) | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/PolicySearchReview.pdf |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/PolicySearchReview.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; Kroemer, O.; Peters, J. |
Year | 2013 |
Title | Probabilistic Interactive Segmentation for Anthropomorphic Robots in Cluttered Environments |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GeRT, ComPLACS |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/hoof-HUMANOIDS.pdf |
Reference Type | Conference Proceedings |
Author(s) | Paraschos, A.; Neumann, G.; Peters, J. |
Year | 2013 |
Title | A Probabilistic Approach to Robot Trajectory Generation |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo, ComPLACS |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Paraschos_Humanoids_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Berger, E.; Vogt, D.; Haji-Ghassemi, N.; Jung, B.; Ben Amor, H. |
Year | 2013 |
Title | Inferring Guidance Information in Cooperative Human-Robot Tasks |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/humanoids2013Heni.pdf |
Reference Type | Conference Proceedings |
Author(s) | Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G. |
Year | 2013 |
Title | Probabilistic Movement Primitives |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
Keywords | CoDyCo, ComPLACS |
Place Published | Cambridge, MA |
Publisher | MIT Press |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Paraschos_NIPS_2013a.pdf |
Reference Type | Book Section |
Author(s) | Sigaud, O.; Peters, J. |
Year | 2012 |
Title | Robot Learning |
Journal/Conference/Book Title | Encyclopedia of the Sciences of Learning, Springer Verlag |
Publisher | Springer Verlag |
ISBN/ISSN | 978-1-4419-1428-6
URL(s) | http://dx.doi.org/10.1007/978-3-642-05181-4_1 |
Reference Type | Journal Article |
Author(s) | Lampert, C.H.; Peters, J. |
Year | 2012 |
Title | Real-Time Detection of Colored Objects In Multiple Camera Streams With Off-the-Shelf Hardware Components |
Journal/Conference/Book Title | Journal of Real-Time Image Processing |
Volume | 7 |
Number | 1 |
Pages | 31-41 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/rtblob-jrtip2010_6651[0].pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/rtblob-jrtip2010_6651[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Daniel, C.; Neumann, G.; Peters, J. |
Year | 2012 |
Title | Hierarchical Relative Entropy Policy Search |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS 2012) |
Keywords | GeRT, CompLACS |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Member/ChristianDaniel/DanielAISTATS2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Member/ChristianDaniel/DanielAISTATS2012.pdf |
Reference Type | Journal Article |
Author(s) | Deisenroth, M.P.; Turner, R.; Huber, M.; Hanebeck, U.D.; Rasmussen, C.E.
Year | 2012 |
Title | Robust Filtering and Smoothing with Gaussian Processes |
Journal/Conference/Book Title | IEEE Transactions on Automatic Control |
Keywords | Gaussian process, filtering, smoothing |
Abstract | We propose a principled algorithm for robust Bayesian filtering and smoothing in nonlinear stochastic dynamic systems when both the transition function and the measurement function are described by non-parametric Gaussian process (GP) models. GPs are gaining increasing importance in signal processing, machine learning, robotics, and control for representing unknown system functions by posterior probability distributions. This modern way of "system identification" is more robust than finding point estimates of a parametric function representation. Our principled filtering/smoothing approach for GP dynamic systems is based on analytic moment matching in the context of the forward-backward algorithm. Our numerical evaluations demonstrate the robustness of the proposed approach in situations where other state-of-the-art Gaussian filters and smoothers can fail. |
Publisher | IEEE
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/deisenroth_IEEE-TAC2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Ugur, E.; Oztop, E.; Peters, J.
Year | 2012 |
Title | A Kernel-based Approach to Direct Action Perception |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bocsi, B.; Hennig, P.; Csato, L.; Peters, J. |
Year | 2012 |
Title | Learning Tracking Control with Forward Models |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_ICRA_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_ICRA_2012.pdf |
Reference Type | Journal Article |
Author(s) | Kober, J.; Wilhelm, A.; Oztop, E.; Peters, J. |
Year | 2012 |
Title | Reinforcement Learning to Adjust Parametrized Motor Primitives to New Situations |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Keywords | Skill learning; Motor primitives; Reinforcement learning; Meta-parameters; Policy learning |
Publisher | Springer US |
Volume | 33 |
Number | 4 |
Pages | 361-379 |
ISBN/ISSN | 0929-5593 |
URL(s) | http://dx.doi.org/10.1007/s10514-012-9290-3 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/kober_auro2012.pdf |
Language | English |
Reference Type | Journal Article |
Author(s) | Vitzthum, A.; Ben Amor, H.; Heumer, G.; Jung, B. |
Year | 2012 |
Title | XSAMPL3D - An Action Description Language for the Animation of Virtual Characters |
Journal/Conference/Book Title | Journal of Virtual Reality and Broadcasting |
Volume | 9 |
Number | 1 |
URL(s) | http://www.jvrb.org/9.2012 |
Link to PDF | http://www.jvrb.org/9.2012/3262/920121.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wang, Z.; Deisenroth, M.P.; Ben Amor, H.; Vogt, D.; Schoelkopf, B.; Peters, J.
Year | 2012 |
Title | Probabilistic Modeling of Human Movements for Intention Inference |
Journal/Conference/Book Title | Proceedings of Robotics: Science and Systems (R:SS) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_RSS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_RSS_2012.pdf |
Reference Type | Journal Article |
Author(s) | Nguyen-Tuong, D.; Peters, J. |
Year | 2012 |
Title | Online Kernel-based Learning for Task-Space Tracking Robot Control |
Journal/Conference/Book Title | IEEE Transactions on Neural Networks and Learning Systems |
Volume | 23 |
Number | 9 |
Pages | 1417-1425 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/NguyenTuong_TNN_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/NguyenTuong_TNN_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Mohamed, S. |
Year | 2012 |
Title | Expectation Propagation in Gaussian Process Dynamical Systems |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 25 (NIPS/NeurIPS)
Abstract | Rich and complex time-series data, such as those generated from engineering systems, financial markets, videos, or neural recordings are now a common feature of modern data analysis. Explaining the phenomena underlying these diverse data sets requires flexible and accurate models. In this paper, we promote Gaussian process dynamical systems as a rich model class that is appropriate for such an analysis. We present a new approximate message-passing algorithm for Bayesian state estimation and inference in Gaussian process dynamical systems, a non-parametric probabilistic generalization of commonly used state-space models. We derive our message-passing algorithm using Expectation Propagation and provide a unifying perspective on message passing in general state-space models. We show that existing Gaussian filters and smoothers appear as special cases within our inference framework, and that these existing approaches can be improved upon using iterated message passing. Using both synthetic and real-world data, we demonstrate that iterated message passing can improve inference in a wide range of tasks in Bayesian state estimation, thus leading to improved predictions and more effective decision making. |
Publisher | The MIT Press |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_NIPS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J.; Muelling, K.; Nguyen-Tuong, D.; Kroemer, O. |
Year | 2012 |
Title | Robot Skill Learning |
Journal/Conference/Book Title | Proceedings of the European Conference on Artificial Intelligence (ECAI) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ECAI2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ECAI2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Boularias, A.; Kroemer, O.; Peters, J. |
Year | 2012 |
Title | Structured Apprenticeship Learning |
Journal/Conference/Book Title | Proceedings of the European Conference on Machine Learning (ECML) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_ECML_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_ECML_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Meyer, T.; Peters, J.; Broetz, D.; Zander, T.; Schoelkopf, B.; Soekadar, S.; Grosse-Wentrup, M.
Year | 2012 |
Title | A Brain-Robot Interface for Studying Motor Learning after Stroke |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS) |
Keywords | Team Athena-Minerva |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Meyer_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Meyer_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Calandra, R.; Seyfarth, A.; Peters, J. |
Year | 2012 |
Title | Toward Fast Policy Search for Learning Legged Locomotion |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS) |
Keywords | legged locomotion, policy search, reinforcement learning, Gaussian process |
Abstract | Legged locomotion is one of the most versatile forms of mobility. However, despite the importance of legged locomotion and the large number of legged robotics studies, no biped or quadruped matches the agility and versatility of their biological counterparts to date. Approaches to designing controllers for legged locomotion systems are often based on either the assumption of perfectly known dynamics or mechanical designs that substantially reduce the dimensionality of the problem. The few existing approaches for learning controllers for legged systems either require exhaustive real-world data or they improve controllers only conservatively, leading to slow learning. We present a data-efficient approach to learning feedback controllers for legged locomotive systems, based on learned probabilistic forward models for generating walking policies. On a compass walker, we show that our approach allows for learning gait policies from very little data. Moreover, we analyze learned locomotion models of a biomechanically inspired biped. Our approach has the potential to scale to high-dimensional humanoid robots with little loss in efficiency. |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; Kroemer, O.; Ben Amor, H.; Peters, J.
Year | 2012 |
Title | Maximally Informative Interaction Learning for Scene Exploration |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VanHoof_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VanHoof_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Daniel, C.; Neumann, G.; Peters, J. |
Year | 2012 |
Title | Learning Concurrent Motor Skills in Versatile Solution Spaces |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS) |
Keywords | GeRT, CompLACS |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ben Amor, H.; Kroemer, O.; Hillenbrand, U.; Neumann, G.; Peters, J. |
Year | 2012 |
Title | Generalization of Human Grasping for Multi-Fingered Robot Hands |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/BenAmor_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/BenAmor_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Muelling, K.; Peters, J.
Year | 2012 |
Title | Learning Throwing and Catching Skills |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS), Video Track |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kober_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kober_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Peters, J. |
Year | 2012 |
Title | Solving Nonlinear Continuous State-Action-Observation POMDPs for Mechanical Systems with Gaussian Noise |
Journal/Conference/Book Title | Proceedings of the European Workshop on Reinforcement Learning (EWRL) |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Deisenroth_EWRL_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Muelling, K.; Kober, J.; Kroemer, O.; Peters, J. |
Year | 2012 |
Title | Learning to Select and Generalize Striking Movements in Robot Table Tennis |
Journal/Conference/Book Title | Proceedings of the AAAI 2012 Fall Symposium on Robots that Learn Interactively from Human Teachers |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/aaaifss12rliht_submission_2.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/aaaifss12rliht_submission_2.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Raiko, T.; Deisenroth, M.P.; Montesino Pouzols, F. |
Year | 2012 |
Title | Learning Deep Belief Networks from Non-Stationary Streams |
Journal/Conference/Book Title | International Conference on Artificial Neural Networks (ICANN) |
Keywords | deep learning, non-stationary data |
Abstract | Deep learning has proven to be beneficial for complex tasks such as classifying images. However, this approach has been mostly applied to static datasets. The analysis of non-stationary (e.g., concept drift) streams of data involves specific issues connected with the temporal and changing nature of the data. In this paper, we propose a proof-of-concept method, called Adaptive Deep Belief Networks, of how deep learning can be generalized to learn online from changing streams of data. We do so by exploiting the generative properties of the model to incrementally re-train the Deep Belief Network whenever new data are collected. This approach eliminates the need to store past observations and, therefore, requires only constant memory consumption. Hence, our approach can be valuable for life-long learning from non-stationary data streams. |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/calandra_icann2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Meyer, T.; Peters, J.; Broetz, D.; Zander, T.; Schoelkopf, B.; Soekadar, S.; Grosse-Wentrup, M. |
Year | 2012 |
Title | Investigating the Neural Basis for Stroke Rehabilitation by Brain-Computer Interfaces |
Journal/Conference/Book Title | International Conference on Neurorehabilitation |
Keywords | Team Athena-Minerva |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Ben Amor, H.; Ewerton, M.; Peters, J. |
Year | 2012 |
Title | Point Cloud Completion Using Extrusions |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_Humanoids_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_Humanoids_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Boularias, A.; Kroemer, O.; Peters, J. |
Year | 2012 |
Title | Algorithms for Learning Markov Field Policies |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 25 (NIPS/NeurIPS)
Keywords | GeRT |
Place Published | Cambridge, MA |
Publisher | MIT Press |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_NIPS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_NIPS_2012.pdf |
Reference Type | Book |
Author(s) | Deisenroth, M.P.; Szepesvari, C.; Peters, J.
Year | 2012 |
Journal/Conference/Book Title | Proceedings of the 10th European Workshop on Reinforcement Learning |
Editor(s) | Deisenroth, M.P.; Szepesvari, C.; Peters, J.
Place Published | JMLR W&CP
Volume | 24 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_EWRLproceedings_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_EWRLproceedings_2012.pdf |
Reference Type | Journal Article |
Author(s) | Piater, J.; Jodogne, S.; Detry, R.; Kraft, D.; Krueger, N.; Kroemer, O.; Peters, J. |
Year | 2011 |
Title | Learning Visual Representations for Perception-Action Systems |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Volume | 30 |
Number | 3 |
Pages | 294-307 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Piater_IJRR_2010.pdf |
Reference Type | Journal Article |
Author(s) | Detry, R.; Kraft, D.; Kroemer, O.; Peters, J.; Krueger, N.; Piater, J.
Year | 2011 |
Title | Learning Grasp Affordance Densities |
Journal/Conference/Book Title | Paladyn Journal of Behavioral Robotics |
Keywords | GeRT |
Volume | 2 |
Number | 1 |
Pages | 1-17 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Detry_PJBR_2011.pdf |
Reference Type | Journal Article |
Author(s) | Kober, J.; Peters, J. |
Year | 2011 |
Title | Policy Search for Motor Primitives in Robotics |
Journal/Conference/Book Title | Machine Learning (MLJ) |
Volume | 84 |
Number | 1-2 |
Pages | 171-203 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/kober_MACH_2011.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/kober_MACH_2011.pdf |
Reference Type | Journal Article |
Author(s) | Nguyen-Tuong, D.; Peters, J.
Year | 2011 |
Title | Incremental Sparsification for Real-time Online Model Learning |
Journal/Conference/Book Title | Neurocomputing |
Volume | 74 |
Number | 11 |
Pages | 1859-1867 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Nguyen_NEURO_2011.pdf |
Reference Type | Journal Article |
Author(s) | Gomez Rodriguez, M.; Peters, J.; Hill, J.; Schoelkopf, B.; Gharabaghi, A.; Grosse-Wentrup, M. |
Year | 2011 |
Title | Closing the Sensorimotor Loop: Haptic Feedback Helps Decoding of Motor Imagery |
Journal/Conference/Book Title | Journal of Neural Engineering
Keywords | Team Athena-Minerva |
Volume | 8 |
Number | 3 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gomez-RodriguezJNE2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lampariello, R.; Nguyen-Tuong, D.; Castellini, C.; Hirzinger, G.; Peters, J.
Year | 2011 |
Title | Energy-Optimal Robot Catching in Real-Time
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Lampariello_ICRA_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Peters, J. |
Year | 2011 |
Title | A Flexible Hybrid Framework for Modeling Complex Manipulation Tasks |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | GeRT |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Peters, J. |
Year | 2011 |
Title | Active Exploration for Robot Parameter Selection in Episodic Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the 2011 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL) |
Keywords | GeRT |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ADPRL_2011.pdf |
Reference Type | Journal Article |
Author(s) | Kroemer, O.; Lampert, C.H.; Peters, J. |
Year | 2011 |
Title | Learning Dynamic Tactile Sensing with Robust Vision-based Training |
Journal/Conference/Book Title | IEEE Transactions on Robotics (T-Ro) |
Volume | 27 |
Number | 3 |
Pages | 545-557 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_TRo_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Boularias, A.; Kroemer, O.; Peters, J. |
Year | 2011 |
Title | Learning Robot Grasping from 3D Images with Markov Random Fields |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robot Systems (IROS) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Publications/Boularias_IROS_2011.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Boularias_IROS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Peters, J. |
Year | 2011 |
Title | A Non-Parametric Approach to Dynamic Programming |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 24 (NIPS/NeurIPS)
Keywords | GeRT |
Place Published | Cambridge, MA |
Publisher | MIT Press |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer2011NIPS.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer2011NIPS.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; van der Zant, T.; Wiering, M.A.
Year | 2011 |
Title | Adaptive Visual Face Tracking for an Autonomous Robot |
Journal/Conference/Book Title | Proceedings of the Belgian-Dutch Artificial Intelligence Conference (BNAIC 11) |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VanHoof_BNAIC_2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VanHoof_BNAIC_2011.pdf |
Reference Type | Journal Article |
Author(s) | Muelling, K.; Kober, J.; Peters, J. |
Year | 2011 |
Title | A Biomimetic Approach to Robot Table Tennis |
Journal/Conference/Book Title | Adaptive Behavior Journal |
Volume | 19 |
Number | 5 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bocsi, B.; Nguyen-Tuong, D.; Csato, L.; Schoelkopf, B.; Peters, J.
Year | 2011 |
Title | Learning Inverse Kinematics with Structured Prediction |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robot Systems (IROS) |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_IROS_2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_IROS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wang, Z.; Lampert, C.H.; Muelling, K.; Schoelkopf, B.; Peters, J.
Year | 2011 |
Title | Learning Anticipation Policies for Robot Table Tennis |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robot Systems (IROS) |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_IROS_2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_IROS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen-Tuong, D.; Peters, J.
Year | 2011 |
Title | Learning Task-Space Tracking Control with Kernels |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robot Systems (IROS) |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Nguyen_IROS_2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Nguyen_IROS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Peters, J. |
Year | 2011 |
Title | Learning Elementary Movements Jointly with a Higher Level Task |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robot Systems (IROS) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Kober_IROS_2011.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kober_IROS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez Rodriguez, M.; Grosse-Wentrup, M.; Hill, J.; Schoelkopf, B.; Gharabaghi, A.; Peters, J. |
Year | 2011 |
Title | Towards Brain-Robot Interfaces for Stroke Rehabilitation |
Journal/Conference/Book Title | Proceedings of the International Conference on Rehabilitation Robotics (ICORR) |
Keywords | Team Athena-Minerva |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gomez_ICORR_2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gomez_ICORR_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wang, Z.; Boularias, A.; Muelling, K.; Peters, J. |
Year | 2011 |
Title | Balancing Safety and Exploitability in Opponent Modeling |
Journal/Conference/Book Title | Proceedings of the Twenty-Fifth National Conference on Artificial Intelligence (AAAI) |
Keywords | GeRT |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_AAAI_2011.pdf |
Reference Type | Journal Article |
Author(s) | Hachiya, H.; Peters, J.; Sugiyama, M. |
Year | 2011 |
Title | Reward Weighted Regression with Sample Reuse for Direct Policy Search in Reinforcement Learning |
Journal/Conference/Book Title | Neural Computation |
Volume | 23 |
Number | 11 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Hachiya_NC2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Hachiya_NC2011.pdf |
Reference Type | Journal Article |
Author(s) | Nguyen-Tuong, D.; Peters, J.
Year | 2011 |
Title | Model Learning in Robotics: a Survey |
Journal/Conference/Book Title | Cognitive Processing |
Volume | 12 |
Number | 4 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Nguyen_CP_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Oztop, E.; Peters, J. |
Year | 2011 |
Title | Reinforcement Learning to Adjust Robot Movements to New Situations
Journal/Conference/Book Title | Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Best Paper Track |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kober_IJCAI_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Boularias, A.; Kober, J.; Peters, J. |
Year | 2011 |
Title | Relative Entropy Inverse Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011) |
Keywords | GeRT |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/boularias11a.pdf |
Reference Type | Journal Article |
Author(s) | Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J. |
Year | 2010 |
Title | Recurrent Policy Gradients |
Journal/Conference/Book Title | Logic Journal of the IGPL |
Volume | 18 |
Number | 5
Pages | 620-634 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/jzp049v1_5879.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Oztop, E.; Peters, J. |
Year | 2010 |
Title | Reinforcement Learning to Adjust Robot Movements to New Situations
Journal/Conference/Book Title | Proceedings of Robotics: Science and Systems (R:SS) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/RSS2010-Kober_6438[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Detry, R.; Piater, J.; Peters, J. |
Year | 2010 |
Title | Adapting Preshaped Grasping Movements using Vision Descriptors |
Journal/Conference/Book Title | From Animals to Animats 11, International Conference on the Simulation of Adaptive Behavior (SAB) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Kroemer_6437[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Kroemer_6437[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Detry, R.; Piater, J.; Peters, J. |
Year | 2010 |
Title | Grasping with Vision Descriptors and Motor Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Informatics in Control, Automation and Robotics (ICINCO) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/ICINCO2010-Kroemer_6436[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/ICINCO2010-Kroemer_6436[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Muelling, K.; Kober, J.; Peters, J. |
Year | 2010 |
Title | Simulating Human Table Tennis with a Biomimetic Robot Setup |
Journal/Conference/Book Title | From Animals to Animats 11, International Conference on the Simulation of Adaptive Behavior (SAB) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Muelling_6626[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Muelling_6626[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen-Tuong, D.; Peters, J.
Year | 2010 |
Title | Incremental Sparsification for Real-time Online Model Learning |
Journal/Conference/Book Title | Proceedings of Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/AISTATS2010-Nguyen-Tuong.pdf |
Reference Type | Journal Article |
Author(s) | Kober, J.; Peters, J. |
Year | 2010 |
Title | Imitation and Reinforcement Learning - Practical Algorithms for Motor Primitive Learning in Robotics |
Journal/Conference/Book Title | IEEE Robotics and Automation Magazine |
Volume | 17 |
Number | 2 |
Pages | 55-62 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/kober_RAM_2010.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/kober_RAM_2010.pdf |
Reference Type | Journal Article |
Author(s) | Kroemer, O.; Detry, R.; Piater, J.; Peters, J. |
Year | 2010 |
Title | Combining Active Learning and Reactive Control for Robot Grasping |
Journal/Conference/Book Title | Robotics and Autonomous Systems |
Keywords | GeRT |
Volume | 58 |
Number | 9 |
Pages | 1105-1116 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/KroemerJRAS_6636[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/KroemerJRAS_6636[0].pdf |
Reference Type | Book Section |
Author(s) | Nguyen-Tuong, D.; Peters, J.; Seeger, M.
Year | 2010 |
Title | Real-Time Local GP Model Learning |
Journal/Conference/Book Title | From Motor Learning to Interaction Learning in Robots, Springer Verlag |
Number | 264 |
ISBN/ISSN | 978-3-642-05180-7
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/LGP_IROS_Chapter_6233.pdf |
Reference Type | Book Section |
Author(s) | Peters, J.; Tedrake, R.; Roy, N.; Morimoto, J. |
Year | 2010 |
Title | Robot Learning |
Journal/Conference/Book Title | Encyclopedia of Machine Learning |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/EncyclopediaMachineLearning-Peters-RobotLearning_[0].pdf |
Reference Type | Book |
Author(s) | Sigaud, O.; Peters, J. |
Year | 2010 |
Title | From Motor Learning to Interaction Learning in Robots |
Journal/Conference/Book Title | Studies in Computational Intelligence, Springer Verlag |
Publisher | Springer Verlag
Number | 264 |
ISBN/ISSN | 978-3-642-05180-7
Link to PDF | http://dx.doi.org/10.1007/978-3-642-05181-4 |
Reference Type | Book Section |
Author(s) | Kober, J.; Mohler, B.; Peters, J. |
Year | 2010 |
Title | Imitation and Reinforcement Learning for Motor Primitives with Perceptual Coupling |
Journal/Conference/Book Title | From Motor Learning to Interaction Learning in Robots, Springer Verlag |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Imitation%20and%20Reinforcement%20Learning%20for%20Motor%20Primitives%20with%20Perceptual%20Coupling_6234[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Muelling, K.; Altun, Y. |
Year | 2010 |
Title | Relative Entropy Policy Search |
Journal/Conference/Book Title | Proceedings of the Twenty-Fourth National Conference on Artificial Intelligence (AAAI), Physically Grounded AI Track |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/Peters2010_REPS.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Muelling, K.; Kroemer, O.; Lampert, C.H.; Schoelkopf, B.; Peters, J. |
Year | 2010 |
Title | Movement Templates for Learning of Hitting and Batting |
Journal/Conference/Book Title | IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/ICRA2010-Kober_6231[1].pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen Tuong, D.; Peters, J. |
Year | 2010 |
Title | Using Model Knowledge for Learning Inverse Dynamics |
Journal/Conference/Book Title | IEEE International Conference on Robotics and Automation |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICRA2010-NguyenTuong_6232.pdf |
Reference Type | Journal Article |
Author(s) | Sehnke, F.; Osendorfer, C.; Rueckstiess, T.; Graves, A.; Peters, J.; Schmidhuber, J. |
Year | 2010 |
Title | Parameter-exploring Policy Gradients |
Journal/Conference/Book Title | Neural Networks |
Volume | 23 |
Number | 4 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Neural-Networks-2010-Sehnke.pdf |
Reference Type | Book Section |
Author(s) | Peters, J.; Bagnell, J.A. |
Year | 2010 |
Title | Policy gradient methods |
Journal/Conference/Book Title | Encyclopedia of Machine Learning (invited article) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Peters_EOMLA_submitted_6074[0].pdf |
Reference Type | Journal Article |
Author(s) | Morimura, T.; Uchibe, E.; Yoshimoto, J.; Peters, J.; Doya, K. |
Year | 2010 |
Title | Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning |
Journal/Conference/Book Title | Neural Computation |
Volume | 22 |
Number | 2 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/LSD_revise_ver3_5904[0].pdf |
Reference Type | Book Section |
Author(s) | Detry, R.; Baseski, E.; Popovic, M.; Touati, Y.; Krueger, N.; Kroemer, O.; Peters, J.; Piater, J. |
Year | 2010 |
Title | Learning Continuous Grasp Affordances by Sensorimotor Exploration |
Journal/Conference/Book Title | From Motor Learning to Interaction Learning in Robots, Springer Verlag |
Number | 264 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Detry-2010-MotorInteractionLearning_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Erkan, A.; Kroemer, O.; Detry, R.; Altun, Y.; Piater, J.; Peters, J. |
Year | 2010 |
Title | Learning Probabilistic Discriminative Models of Grasp Affordances under Limited Supervision |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/erkan_IROS_2010.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/erkan_IROS_2010.pdf |
Reference Type | Conference Proceedings |
Author(s) | Muelling, K.; Kober, J.; Peters, J. |
Year | 2010 |
Title | A Biomimetic Approach to Robot Table Tennis |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez Rodriguez, M.; Grosse-Wentrup, M.; Peters, J.; Naros, G.; Hill, J.; Gharabaghi, A.; Schoelkopf, B. |
Year | 2010 |
Title | Epidural ECoG Online Decoding of Arm Movement Intention in Hemiparesis |
Journal/Conference/Book Title | 1st ICPR Workshop on Brain Decoding: Pattern Recognition Challenges in Neuroimaging |
Keywords | Team Athena-Minerva |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICPR-WBD-2010-Gomez-Rodriguez.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez Rodriguez, M.; Peters, J.; Hill, J.; Schoelkopf, B.; Gharabaghi, A.; Grosse-Wentrup, M. |
Year | 2010 |
Title | Closing the Sensorimotor Loop: Haptic Feedback Facilitates Decoding of Arm Movement Imagery |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (Workshop on Brain-Machine Interfaces) |
Keywords | Team Athena-Minerva |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/eeg-smc2010_6591.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez Rodriguez, M.; Peters, J.; Hill, J.; Gharabaghi, A.; Schoelkopf, B.; Grosse-Wentrup, M. |
Year | 2010 |
Title | BCI and robotics framework for stroke rehabilitation |
Journal/Conference/Book Title | Proceedings of the 4th International BCI Meeting, May 31 - June 4, 2010. Asilomar, CA, USA |
Keywords | Team Athena-Minerva |
Link to PDF | http://bcimeeting.org/2010/ |
Reference Type | Conference Proceedings |
Author(s) | Lampert, C. H.; Kroemer, O. |
Year | 2010 |
Title | Weakly-Paired Maximum Covariance Analysis for Multimodal Dimensionality Reduction and Transfer Learning |
Journal/Conference/Book Title | Proceedings of the 11th European Conference on Computer Vision (ECCV 2010) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/lampert-eccv2010.pdf |
Reference Type | Conference Proceedings |
Author(s) | Chiappa, S.; Peters, J. |
Year | 2010 |
Title | Movement extraction by detecting dynamics switches and repetitions |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 24 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Chiappa_NIPS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Alvarez, M.; Peters, J.; Schoelkopf, B.; Lawrence, N. |
Year | 2010 |
Title | Switched Latent Force Models for Movement Segmentation |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 24 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Alvarez_NIPS_2011.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J.; Kober, J.; Schaal, S. |
Year | 2010 |
Title | Policy learning algorithms for motor learning (Algorithmen zum automatischen Erlernen von Motorfaehigkeiten) |
Journal/Conference/Book Title | Automatisierungstechnik |
Keywords | reinforcement learning, motor skills |
Abstract | Robot learning methods which allow autonomous robots to adapt to novel situations have been a long-standing vision of robotics, artificial intelligence, and the cognitive sciences. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics. If possible, scaling was usually only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution. |
Volume | 58 |
Number | 12 |
Pages | 688-694 |
Short Title | Policy learning algorithms for motor learning (Algorithmen zum automatischen Erlernen von Motorfähigkeiten) |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/at-Automatisierungstechnik-Algorithmen_zum_Automatischen_Erlernen_von_Motorfhigkeiten |
Reference Type | Conference Proceedings |
Author(s) | Muelling, K.; Kober, J.; Peters, J. |
Year | 2010 |
Title | Learning Table Tennis with a Mixture of Motor Primitives |
Journal/Conference/Book Title | 10th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2010) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_ICHR_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_ICHR_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Muelling, K.; Kober, J. |
Year | 2010 |
Title | Experiments with Motor Primitives to learn Table Tennis |
Journal/Conference/Book Title | 12th International Symposium on Experimental Robotics (ISER 2010) |
Reference Type | Conference Proceedings |
Author(s) | Hachiya, H.; Peters, J.; Sugiyama, M. |
Year | 2009 |
Title | Efficient Sample Reuse in EM-based Policy Search |
Journal/Conference/Book Title | Proceedings of the 16th European Conference on Machine Learning (ECML 2009) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ECML-PKDD-2009-Hachiya_6068[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J.; Muelling, K.; Nguyen-Tuong, D.; Kroemer, O. |
Year | 2009 |
Title | Towards Motor Skill Learning for Robotics |
Journal/Conference/Book Title | Proceedings of the International Symposium on Robotics Research (ISRR), Invited Paper |
Abstract | Learning robots that can acquire new motor skills and refine existing ones have been a long-standing vision of robotics, artificial intelligence, and the cognitive sciences. Early steps towards this goal in the 1980s made clear that reasoning and human insights will not suffice. Instead, new hope has been offered by the rise of modern machine learning approaches. However, to date, it has become increasingly clear that off-the-shelf machine learning approaches will not suffice for motor skill learning as these methods often do not scale into the high-dimensional domains of manipulator and humanoid robotics nor do they fulfill the real-time requirement of our domain. As an alternative, we propose to break the generic skill learning problem into parts that we can understand well from a robotics point of view. After designing appropriate learning approaches for these basic components, these will serve as the ingredients of a general approach to motor skill learning. In this paper, we discuss our recent and current progress in this direction. For doing so, we present our work on learning to control, on learning elementary movements as well as our steps towards learning of complex tasks. We show several evaluations both using real robots as well as physically realistic simulations. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/peters_ISRR_2007.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen Tuong, D.; Seeger, M.; Peters, J. |
Year | 2009 |
Title | Local Gaussian Process Regression for Real Time Online Model Learning and Control |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 22 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/3403-local-gaussian-process-regression-for-real-time-online-model-learning.pdf |
Reference Type | Conference Proceedings |
Author(s) | Neumann, G.; Peters, J. |
Year | 2009 |
Title | Fitted Q-iteration by Advantage Weighted Regression |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 22 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NIPS2008-Neumann_5520%5B0%5D.pdf |
Reference Type | Journal Article |
Author(s) | Hachiya, H.; Akiyama, T.; Sugiyama, M.; Peters, J. |
Year | 2009 |
Title | Adaptive Importance Sampling for Value Function Approximation in Off-policy Reinforcement Learning |
Journal/Conference/Book Title | Neural Networks |
Keywords | off-policy reinforcement learning; value function approximation; policy iteration; adaptive importance sampling; importance-weighted cross-validation; efficient sample reuse |
Abstract | Off-policy reinforcement learning is aimed at efficiently using data samples gathered from a different policy than the currently optimized one. A common approach is to use importance sampling techniques for compensating for the bias of value function estimators caused by the difference between the data-sampling policy and the target policy. However, existing off-policy methods often do not take the variance of the value function estimators explicitly into account and, therefore, their performance tends to be unstable. To cope with this problem, we propose using an adaptive importance sampling technique which allows us to actively control the trade-off between bias and variance. We further provide a method for optimally determining the trade-off parameter based on a variant of cross-validation. We demonstrate the usefulness of the proposed approach through simulations. |
Volume | 22 |
Number | 10 |
Pages | 1399-1410 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/hachiya-AdaptiveImportanceSampling_5530.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Peters, J. |
Year | 2009 |
Title | Policy Search for Motor Primitives in Robotics |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 22 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NIPS2008-Kober-Peters_5411[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Chiappa, S.; Kober, J.; Peters, J. |
Year | 2009 |
Title | Using Bayesian Dynamical Systems for Motion Template Libraries |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 22 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NIPS2008-Chiappa_5400[0].pdf |
Reference Type | Journal Article |
Author(s) | Deisenroth, M.P.; Rasmussen, C.E.; Peters, J. |
Year | 2009 |
Title | Gaussian Process Dynamic Programming |
Journal/Conference/Book Title | Neurocomputing |
Volume | 72 |
Pages | 1508-1524 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Neurocomputing-2009-Deisenroth-Preprint_5531.pdf |
Reference Type | Conference Proceedings |
Author(s) | Hoffman, M.; de Freitas, N.; Doucet, A.; Peters, J. |
Year | 2009 |
Title | An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Reward |
Journal/Conference/Book Title | Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AIStats) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/AIStats2009-Hoffman_5658.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J. |
Year | 2009 |
Title | Using Reward-Weighted Imitation for Robot Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the 2009 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/peters_ADPRL_2009.pdf |
Reference Type | Conference Proceedings |
Author(s) | Hachiya, H.; Akiyama, T.; Sugiyama, M.; Peters, J. |
Year | 2009 |
Title | Efficient Data Reuse in Value Function Approximation |
Journal/Conference/Book Title | Proceedings of the 2009 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ADPRL2009-Hachiya.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Peters, J. |
Year | 2009 |
Title | Learning Motor Primitives for Robotics |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/ICRA2009-Kober_5661[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Piater, J.; Jodogne, S.; Detry, R.; Kraft, D.; Krueger, N.; Kroemer, O.; Peters, J. |
Year | 2009 |
Title | Learning Visual Representations for Interactive Systems |
Journal/Conference/Book Title | Proceedings of the International Symposium on Robotics Research (ISRR), Invited Paper |
Abstract | We describe two quite different methods for associating action parameters to visual percepts. Our RLVC algorithm performs reinforcement learning directly on the visual input space. To make this very large space manageable, RLVC interleaves the reinforcement learner with a supervised classification algorithm that seeks to split perceptual states so as to reduce perceptual aliasing. This results in an adaptive discretization of the perceptual space based on the presence or absence of visual features. Its extension RLJC also handles continuous action spaces. In contrast to the minimalistic visual representations produced by RLVC and RLJC, our second method learns structural object models for robust object detection and pose estimation by probabilistic inference. To these models, the method associates grasp experiences autonomously learned by trial and error. These experiences form a non-parametric representation of grasp success likelihoods over gripper poses, which we call a grasp density. Thus, object detection in a novel scene simultaneously produces suitable grasping options. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Piater-2009-ISRR.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Peters, J. |
Year | 2009 |
Title | Learning new basic Movements for Robotics |
Journal/Conference/Book Title | Proceedings of Autonome Mobile Systeme (AMS 2009) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/paper_16.pdf |
Reference Type | Conference Proceedings |
Author(s) | Muelling, K.; Peters, J. |
Year | 2009 |
Title | A computational model of human table tennis for robot application |
Journal/Conference/Book Title | Proceedings of Autonome Mobile Systeme (AMS 2009) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/A_computational_model_of_human_table_tennis.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Detry, R.; Piater, J.; Peters, J. |
Year | 2009 |
Title | Active Learning Using Mean Shift Optimization for Robot Grasping |
Journal/Conference/Book Title | Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/kroemer_IROS_2009.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen Tuong, D.; Seeger, M.; Peters, J. |
Year | 2009 |
Title | Sparse Online Model Learning for Robot Control with Support Vector Regression |
Journal/Conference/Book Title | Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Sparse_Online_Model_Learning_for_Robot_Control.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J.; Ng, A. |
Year | 2009 |
Title | Guest Editorial: Special Issue on Robot Learning, Part B |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Volume | 27 |
Number | 2 |
Pages | 91-92 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Peters-Ng2009_Article_GuestEditorialSpecialIssueOnRo.pdf |
Reference Type | Conference Proceedings |
Author(s) | Sigaud, O.; Peters, J. |
Year | 2009 |
Title | From Motor Learning to Interaction Learning in Robots |
Journal/Conference/Book Title | Proceedings of Journees Nationales de la Recherche en Robotique |
Pages | 189-195 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/JNRR2009-Sigaud_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Neumann, G.; Maass, W.; Peters, J. |
Year | 2009 |
Title | Learning Complex Motions by Sequencing Simpler Motion Templates |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML2009) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICML2009-Neumann.pdf |
Reference Type | Conference Proceedings |
Author(s) | Detry, R.; Baseski, E.; Popovic, M.; Touati, Y.; Krueger, N.; Kroemer, O.; Peters, J.; Piater, J. |
Year | 2009 |
Title | Learning Object-specific Grasp Affordance Densities |
Journal/Conference/Book Title | Proceedings of the International Conference on Development & Learning (ICDL 2009) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/ICDL2009-Detry_[0].pdf |
Reference Type | Journal Article |
Author(s) | Nguyen Tuong, D.; Seeger, M.; Peters, J. |
Year | 2009 |
Title | Model Learning with Local Gaussian Process Regression |
Journal/Conference/Book Title | Advanced Robotics |
Volume | 23 |
Number | 15 |
Pages | 2015-2034 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Nguyen-Tuong-ModelLearningLocalGaussian.pdf |
Reference Type | Journal Article |
Author(s) | Kober, J.; Peters, J. |
Year | 2009 |
Title | Reinforcement Learning fuer Motor-Primitive |
Journal/Conference/Book Title | Kuenstliche Intelligenz |
Link to PDF | http://www.kuenstliche-intelligenz.de/index.php?id=7779&tx_ki_pi1[showUid]=1820&cHash=a9015a9e57 |
Reference Type | Journal Article |
Author(s) | Peters, J.; Morimoto, J.; Tedrake, R.; Roy, N. |
Year | 2009 |
Title | Robot Learning |
Journal/Conference/Book Title | IEEE Robotics & Automation Magazine |
Keywords | robot learning, tc spotlight |
Volume | 16 |
Number | 3 |
Pages | 19-20 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/05233410.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J.; Ng, A. |
Year | 2009 |
Title | Guest Editorial: Special Issue on Robot Learning, Part A |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Volume | 27 |
Number | 1 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Peters-Ng2009_Article_GuestEditorialSpecialIssueOnRo.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lampert, C.H.; Peters, J. |
Year | 2009 |
Title | Active Structured Learning for High-Speed Object Detection |
Journal/Conference/Book Title | Proceedings of the DAGM (Pattern Recognition) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/DAGM2009-Lampert.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez Rodriguez, M.; Kober, J.; Schoelkopf, B. |
Year | 2009 |
Title | Denoising photographs using dark frames optimized by quadratic programming |
Journal/Conference/Book Title | Proceedings of the First IEEE International Conference on Computational Photography (ICCP 2009) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/ICCP09-GomezRodriguez_5491[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/ICCP09-GomezRodriguez_5491[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Peters, J.; Rasmussen, C.E. |
Year | 2008 |
Title | Approximate Dynamic Programming with Gaussian Processes |
Journal/Conference/Book Title | American Control Conference |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Main/PublicationsByYear/deisenroth_ACC2008.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Seeger, M. |
Year | 2008 |
Title | Computed Torque Control with Nonparametric Regression Techniques |
Journal/Conference/Book Title | American Control Conference |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NguyenTuong_ACC2008.pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Rasmussen, C.E.; Peters, J. |
Year | 2008 |
Title | Model-Based Reinforcement Learning with Continuous States and Actions |
Journal/Conference/Book Title | Proceedings of the European Symposium on Artificial Neural Networks (ESANN 2008) |
Pages | 19-24 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/deisenroth_ESANN2008.pdf |
Reference Type | Journal Article |
Author(s) | Steinke, F.; Hein, M.; Peters, J.; Schoelkopf, B. |
Year | 2008 |
Title | Manifold-valued Thin-Plate Splines with Applications in Computer Graphics |
Journal/Conference/Book Title | Computer Graphics Forum (Special Issue on Eurographics 2008) |
Volume | 27 |
Number | 2 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Steinke_EGFinal-1049.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen Tuong, D.; Peters, J. |
Year | 2008 |
Title | Learning Inverse Dynamics: a Comparison |
Journal/Conference/Book Title | Proceedings of the European Symposium on Artificial Neural Networks (ESANN 2008) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NguyenTuong_ACC2008.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Nguyen-Tuong, D. |
Year | 2008 |
Title | Real-Time Learning of Resolved Velocity Control on a Mitsubishi PA-10 |
Journal/Conference/Book Title | International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICRA2008-Peters_4865[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Hachiya, H.; Akiyama, T.; Sugiyama, M.; Peters, J. |
Year | 2008 |
Title | Adaptive Importance Sampling with Automatic Model Selection in Value Function Approximation |
Journal/Conference/Book Title | Proceedings of the Twenty-Third National Conference on Artificial Intelligence (AAAI 2008) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/AAAI-2008-Hachiya_5096[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Wierstra, D.; Schaul, T.; Peters, J.; Schmidhuber, J. |
Year | 2008 |
Title | Natural Evolution Strategies |
Journal/Conference/Book Title | 2008 IEEE Congress on Evolutionary Computation |
Abstract | This paper presents Natural Evolution Strategies (NES), a novel algorithm for performing real-valued black box function optimization: optimizing an unknown objective function where algorithm-selected function measurements constitute the only information accessible to the method. Natural Evolution Strategies search the fitness landscape using a multivariate normal distribution with a self-adapting mutation matrix to generate correlated mutations in promising regions. NES shares this property with Covariance Matrix Adaption (CMA), an Evolution Strategy (ES) which has been shown to perform well on a variety of high-precision optimization tasks. The Natural Evolution Strategies algorithm, however, is simpler, less ad-hoc and more principled. Self-adaptation of the mutation matrix is derived using a Monte Carlo estimate of the natural gradient towards better expected fitness. By following the natural gradient instead of the "vanilla" gradient, we can ensure efficient update steps while preventing early convergence due to overly greedy updates, resulting in reduced sensitivity to local suboptima. We show NES has competitive performance with CMA on several tasks, while outperforming it on one task that is rich in deceptive local optima, the Rastrigin benchmark. |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/wierstra-CEC2008.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen Tuong, D.; Peters, J. |
Year | 2008 |
Title | Local Gaussian Processes Regression for Real-time Model-based Robot Control |
Journal/Conference/Book Title | International Conference on Intelligent Robot Systems (IROS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2008-Nguyen_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Mohler, B.; Peters, J. |
Year | 2008 |
Title | Learning Perceptual Coupling for Motor Primitives |
Journal/Conference/Book Title | International Conference on Intelligent Robot Systems (IROS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2008-Kober_5414[0].pdf |
Reference Type | Book |
Author(s) | Lesperance, Y.; Lakemeyer, G.; Peters, J.; Pirri, F. |
Year | 2008 |
Title | Proceedings of the 6th International Cognitive Robotics Workshop (CogRob 2008) |
Journal/Conference/Book Title | July 21-22, 2008, Patras, Greece, ISBN 978-960-6843-09-9 |
Reference Type | Conference Proceedings |
Author(s) | Wierstra, D.; Schaul, T.; Peters, J.; Schmidhuber, J. |
Year | 2008 |
Title | Fitness Expectation Maximization |
Journal/Conference/Book Title | 10th International Conference on Parallel Problem Solving from Nature (PPSN 2008) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ppsn08.pdf |
Reference Type | Journal Article |
Author(s) | Nakanishi, J.; Cory, R.; Mistry, M.; Peters, J.; Schaal, S. |
Year | 2008 |
Title | Operational space control: A theoretical and empirical comparison |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | task space control, operational space control, redundancy resolution, humanoid robotics |
Abstract | Dexterous manipulation with a highly redundant movement system is one of the hallmarks of human motor skills. From numerous behavioral studies, there is strong evidence that humans employ compliant task space control, i.e., they focus control only on task variables while keeping redundant degrees-of-freedom as compliant as possible. This strategy is robust towards unknown disturbances and simultaneously safe for the operator and the environment. The theory of operational space control in robotics aims to achieve similar performance properties. However, despite various compelling theoretical lines of research, advanced operational space control is hardly found in actual robotics implementations, in particular new kinds of robots like humanoids and service robots, which would strongly profit from compliant dexterous manipulation. To analyze the pros and cons of different approaches to operational space control, this paper focuses on a theoretical and empirical evaluation of different methods that have been suggested in the literature, but also some new variants of operational space controllers. We address formulations at the velocity, acceleration and force levels. First, we formulate all controllers in a common notational framework, including quaternion-based orientation control, and discuss some of their theoretical properties. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm with several benchmark tasks. As an aside, we also introduce a novel parameter estimation algorithm for rigid body dynamics, which ensures physical consistency, as this issue was crucial for our successful robot implementations. Our extensive empirical results demonstrate that one of the simplified acceleration-based approaches can be advantageous in terms of task performance, ease of parameter tuning, and general robustness and compliance in face of inevitable modeling errors. |
Volume | 27 |
Number | 6 |
Pages | 737-757 |
Short Title | Operational space control: A theoretical and empirical comparison |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Int-J-Robot-Res-2008-27-737_5027[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Wierstra, D.; Schaul, T.; Peters, J.; Schmidhuber, J. |
Year | 2008 |
Title | Episodic Reinforcement Learning by Logistic Reward-Weighted Regression |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Neural Networks (ICANN) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/wierstra_ICANN08.pdf |
Reference Type | Conference Proceedings |
Author(s) | Sehnke, F.; Osendorfer, C.; Rueckstiess, T.; Graves, A.; Peters, J.; Schmidhuber, J. |
Year | 2008 |
Title | Policy Gradients with Parameter-based Exploration for Control |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Neural Networks (ICANN) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/icann2008sehnke.pdf |
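The entry above carries no abstract; the core idea of parameter-based exploration, i.e., sampling whole policy-parameter vectors from a search distribution and following a likelihood-ratio gradient in that distribution, can be sketched on a toy problem. All names and constants below are illustrative assumptions (a quadratic stands in for the episode return, and the paper's update of the exploration variances is omitted):

```python
import numpy as np

# Toy sketch of parameter-based exploration (in the spirit of the paper, not
# its implementation). Whole parameter vectors are drawn from N(mu, sigma^2);
# each is evaluated with a deterministic "rollout", and mu follows the
# likelihood-ratio gradient of the expected return. episode_return is a
# hypothetical stand-in for a real episode; sigma is kept fixed for simplicity.
rng = np.random.default_rng(1)

def episode_return(w):
    # hypothetical rollout: the best return is obtained at w = [1, -1]
    return -np.sum((w - np.array([1.0, -1.0])) ** 2)

mu, sigma = np.zeros(2), np.ones(2)
for _ in range(300):
    eps = rng.normal(size=(50, 2)) * sigma            # parameter perturbations
    R = np.array([episode_return(mu + e) for e in eps])
    b = R.mean()                                      # baseline reduces variance
    # d log N(w; mu, sigma) / d mu = eps / sigma^2
    mu += 0.05 * np.mean((R - b)[:, None] * eps / sigma**2, axis=0)
```

With these settings `mu` settles near the optimum `[1, -1]`; since exploration happens in parameter space, each rollout runs a deterministic policy, which is the distinguishing feature of the approach.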
Reference Type | Book |
Author(s) | Peters, J. |
Year | 2008 |
Title | Machine Learning for Robotics |
Journal/Conference/Book Title | VDM-Verlag |
ISBN/ISSN | ISBN 978-3-639-02110-3 |
Link to PDF | http://www.amazon.de/Machine-Learning-Robotics-Methods-Skills/dp/363902110X/ref=sr_1_1?ie=UTF8&s=books&qid=1220658804&sr=8-1 |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J.; Nguyen-Tuong, D. |
Year | 2008 |
Title | Policy Learning - a unified perspective with applications in robotics |
Journal/Conference/Book Title | Proceedings of the European Workshop on Reinforcement Learning (EWRL) |
Keywords | reinforcement learning, policy gradient, weighted regression |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/8808e934beb11e344433a6c98a68269e26f1.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Peters, J. |
Year | 2008 |
Title | Reinforcement Learning of Perceptual Coupling for Motor Primitives |
Journal/Conference/Book Title | Proceedings of the European Workshop on Reinforcement Learning (EWRL) |
Reference Type | Journal Article |
Author(s) | Peters, J. |
Year | 2008 |
Title | Machine Learning for Motor Skills in Robotics |
Journal/Conference/Book Title | Kuenstliche Intelligenz |
Keywords | motor control, motor primitives, motor learning |
Abstract | Autonomous robots that can adapt to novel situations have been a long standing vision of robotics, artificial intelligence, and the cognitive sciences. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks of future robots. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only a few methods manage to scale into the high-dimensional domains of manipulator and humanoid robotics and usually scaling was only achieved in precisely pre-structured domains. We have investigated the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting. |
Number | 3 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/KuenstlicheIntelligenz-2008-Peters_[0].pdf |
Reference Type | Conference Paper |
Author(s) | Nguyen-Tuong, D.; Peters, J.; Seeger, M.; Schoelkopf, B. |
Year | 2008 |
Title | Learning Robot Dynamics for Computed Torque Control using Local Gaussian Processes Regression |
Journal/Conference/Book Title | Proceedings of the ECSIS Symposium on Learning and Adaptive Behavior in Robotic Systems, LAB-RS 2008 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/nguyen-ecsis.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J.; Schaal, S. |
Year | 2008 |
Title | Natural Actor-Critic |
Journal/Conference/Book Title | Neurocomputing |
Keywords | reinforcement learning, policy gradient, natural actor-critic, natural gradients |
Abstract | In this paper, we suggest a novel reinforcement learning architecture, the Natural Actor-Critic. The actor updates are achieved using stochastic policy gradients employing Amari's natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression. We show that actor improvements with natural policy gradients are particularly appealing as these are independent of the coordinate frame of the chosen policy representation, and can be estimated more efficiently than regular policy gradients. The critic makes use of a special basis function parameterization motivated by the policy-gradient compatible function approximation. We show that several well-known reinforcement learning methods such as the original Actor-Critic and Bradtke's Linear Quadratic Q-Learning are in fact Natural Actor-Critic algorithms. Empirical evaluations illustrate the effectiveness of our techniques in comparison to previous methods, and also demonstrate their applicability for learning control on an anthropomorphic robot arm. |
Volume | 71 |
Number | 7-9 |
Pages | 1180-1190 |
Short Title | Natural Actor-Critic |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NEUCOM-D-07-00618-1_[0].pdf |
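The natural-gradient actor update described in the abstract can be illustrated on a one-parameter toy problem. This is a hedged sketch under assumed settings (Gaussian policy, quadratic immediate reward, plain Monte-Carlo estimates in place of the paper's linear-regression critic):

```python
import numpy as np

# Toy natural policy gradient: actions a ~ N(theta, 1), reward peaks at a* = 2.
# Preconditioning the vanilla gradient with the inverse Fisher information
# makes the update invariant to the policy parameterization (Amari's idea).
rng = np.random.default_rng(0)
theta, sigma = 0.0, 1.0
for _ in range(200):
    a = rng.normal(theta, sigma, size=500)      # sample actions
    r = -(a - 2.0) ** 2                         # immediate reward
    score = (a - theta) / sigma**2              # d log pi / d theta
    g = np.mean(score * (r - r.mean()))         # vanilla gradient, baseline-subtracted
    F = np.mean(score**2)                       # Fisher information (a scalar here)
    theta += 0.1 * g / F                        # natural-gradient ascent step
```

With a single parameter the Fisher matrix is a scalar, so the natural gradient only rescales the step; the coordinate-frame independence the abstract highlights pays off once the policy has several unevenly scaled parameters.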
Reference Type | Journal Article |
Author(s) | Peters, J.; Schaal, S. |
Year | 2008 |
Title | Learning to control in operational space |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | operational space control, learning, EM algorithm, redundancy resolution, reinforcement learning |
Abstract | One of the most general frameworks for phrasing control problems for complex, redundant robots is operational space control. However, while this framework is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In this paper, we suggest a learning approach for operational space control as a direct inverse model learning problem. A first important insight for this paper is that a physically correct solution to the inverse problem with redundant degrees-of-freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on the insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the machine learning point of view, this learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a three degrees of freedom robot arm are used to illustrate the suggested approach. The application to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex high degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on a real, physical Mitsubishi PA-10 medical robotics arm. |
Volume | 27 |
Pages | 197-212 |
Short Title | Learning to control in operational space |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/Learning_to_Control_in_Operational_Space.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J.; Schaal, S. |
Year | 2008 |
Title | Reinforcement learning of motor skills with policy gradients |
Journal/Conference/Book Title | Neural Networks |
Keywords | Reinforcement learning, Policy gradient methods, Natural gradients, Natural Actor-Critic, Motor skills, Motor primitives |
Abstract | Autonomous learning is one of the hallmarks of human and animal behavior, and understanding the principles of learning will be crucial in order to achieve true autonomy in advanced machines like humanoid robots. In this paper, we examine learning of complex motor skills with human-like limbs. While supervised learning can offer useful tools for bootstrapping behavior, e.g., by learning from demonstration, it is only reinforcement learning that offers a general approach to the final trial-and-error improvement that is needed by each individual acquiring a skill. Neither neurobiological nor machine learning studies have, so far, offered compelling results on how reinforcement learning can be scaled to the high-dimensional continuous state and action spaces of humans or humanoids. Here, we combine two recent research developments on learning motor control in order to achieve this scaling. First, we interpret the idea of modular motor control by means of motor primitives as a suitable way to generate parameterized control policies for reinforcement learning. Second, we combine motor primitives with the theory of stochastic policy gradient learning, which currently seems to be the only feasible framework for reinforcement learning for humanoids. We evaluate different policy gradient methods with a focus on their applicability to parameterized motor primitives. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm. |
Volume | 21 |
Number | 4 |
Pages | 682-97 |
Date | May |
Short Title | Reinforcement learning of motor skills with policy gradients |
ISBN/ISSN | 0893-6080 (Print) |
Accession Number | 18482830 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Neural-Netw-2008-21-682_4867[0].pdf |
Address | Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 Tubingen, Germany; University of Southern California, 3710 S. McClintoch Ave-RTH401, Los Angeles, CA 90089-2905, USA. |
Language | eng |
Reference Type | Journal Article |
Author(s) | Peters, J.; Mistry, M.; Udwadia, F. E.; Nakanishi, J.; Schaal, S. |
Year | 2008 |
Title | A unifying framework for robot control with redundant DOFs |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Keywords | operational space control, inverse control, dexterous manipulation, optimal control |
Abstract | Recently, Udwadia (Proc. R. Soc. Lond. A 2003:1783–1800, 2003) suggested to derive tracking controllers for mechanical systems with redundant degrees-of-freedom (DOFs) using a generalization of Gauss’ principle of least constraint. This method allows reformulating control problems as a special class of optimal controllers. In this paper, we take this line of reasoning one step further and demonstrate that several well-known and also novel nonlinear robot control laws can be derived from this generic methodology. We show experimental verifications on a Sarcos Master Arm robot for some of the derived controllers. The suggested approach offers a promising unification and simplification of nonlinear control law design for robots obeying rigid body dynamics equations, both with or without external constraints, with over-actuation or underactuation, as well as open-chain and closed-chain kinematics. |
Volume | 24 |
Number | 1 |
Pages | 1-12 |
Short Title | A unifying methodology for robot control with redundant DOFs |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/AR-2008final_[0].pdf |
Reference Type | Thesis |
Author(s) | Kober, J. |
Year | 2008 |
Title | Reinforcement Learning for Motor Primitives |
Journal/Conference/Book Title | Dipl-Ing Thesis, University of Stuttgart |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/DiplomaThesis-Kober_5331[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/DiplomaThesis-Kober_5331[0].pdf |
Reference Type | Journal Article |
Author(s) | Peters, J. |
Year | 2007 |
Title | Computational Intelligence: By Amit Konar |
Journal/Conference/Book Title | The Computer Journal |
Keywords | book review |
Volume | 50 |
Number | 6 |
Pages | 758 |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S. |
Year | 2007 |
Title | Policy Learning for Motor Skills |
Journal/Conference/Book Title | Proceedings of the 14th International Conference on Neural Information Processing (ICONIP) |
Keywords | Machine Learning, Reinforcement Learning, Robotics, Motor Primitives, Policy Gradients, Natural Actor-Critic, Reward-Weighted Regression |
Abstract | Policy learning which allows autonomous robots to adapt to novel situations has been a long standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise as only a few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICONIP2007-Peters_4869[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J. |
Year | 2007 |
Title | Solving Deep Memory POMDPs with Recurrent Policy Gradients |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Neural Networks (ICANN) |
Keywords | policy gradients, reinforcement learning |
Abstract | This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a "Long Short-Term Memory" architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/icann2007.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S.; Schoelkopf, B. |
Year | 2007 |
Title | Towards Machine Learning of Motor Skills |
Journal/Conference/Book Title | Proceedings of Autonome Mobile Systeme (AMS) |
Keywords | Motor Skill Learning, Robotics, Natural Actor-Critic, Reward-Weighted Regression |
Abstract | Autonomous robots that can adapt to novel situations have been a long standing vision of robotics, artificial intelligence, and cognitive sciences. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only a few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting. |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Peters_POAMS_2007.pdf |
Reference Type | Conference Proceedings |
Author(s) | Theodorou, E; Peters, J.; Schaal, S. |
Year | 2007 |
Title | Reinforcement Learning for Optimal Control of Arm Movements |
Journal/Conference/Book Title | Abstracts of the 37th Meeting of the Society for Neuroscience |
Keywords | Optimal Control,Reinforcement Learning, Arm Movements |
Abstract | Everyday motor behavior consists of a plethora of challenging motor skills from discrete movements such as reaching and throwing to rhythmic movements such as walking, drumming and running. How this plethora of motor skills can be learned remains an open question. In particular, is there any unifying computational framework that could model the learning process of this variety of motor behaviors and at the same time be biologically plausible? In this work we aim to give an answer to these questions by providing a computational framework that unifies the learning mechanism of both rhythmic and discrete movements under optimization criteria, i.e., in a non-supervised trial-and-error fashion. Our suggested framework is based on Reinforcement Learning, which is mostly considered too costly to be a plausible mechanism for learning complex limb movement. However, recent work on reinforcement learning with policy gradients combined with parameterized movement primitives allows novel and more efficient algorithms. By using the representational power of such motor primitives we show how rhythmic motor behaviors such as walking, squashing and drumming as well as discrete behaviors like reaching and grasping can be learned with biologically plausible algorithms. Using extensive simulations and different reward functions, we provide results that support the hypothesis that Reinforcement Learning could be a viable candidate for motor learning of human motor behavior when other learning methods like supervised learning are not feasible. |
Reference Type | Journal Article |
Author(s) | Nakanishi, J.; Mistry, M.; Peters, J.; Schaal, S. |
Year | 2007 |
Title | Experimental evaluation of task space position/orientation control towards compliant control for humanoid robots |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2007) |
Keywords | operational space control, quaternion, task space control, resolved motion rate control, resolved acceleration, force control |
Abstract | Compliant control will be a prerequisite for humanoid robotics if these robots are supposed to work safely and robustly in human and/or dynamic environments. One view of compliant control is that a robot should control a minimal number of degrees-of-freedom (DOFs) directly, i.e., those relevant DOFs for the task, and keep the remaining DOFs maximally compliant, usually in the null space of the task. This view naturally leads to task space control. However, surprisingly few implementations of task space control can be found in actual humanoid robots. This paper makes a first step towards assessing the usefulness of task space controllers for humanoids by investigating which choices of controllers are available and what inherent control characteristics they have; this treatment will concern position and orientation control, where the latter is based on a quaternion formulation. Empirical evaluations on an anthropomorphic Sarcos master arm illustrate the robustness of the different controllers as well as the ease of implementing and tuning them. Our extensive empirical results demonstrate that simpler task space controllers, e.g., classical resolved motion rate control or resolved acceleration control can be quite advantageous in face of inevitable modeling errors in model-based control, and that well chosen formulations are easy to implement and quite robust, such that they are useful for humanoids. |
Place Published | San Diego, CA: Oct. 29 - Nov. 2 |
Short Title | Experimental evaluation of task space position/orientation control towards compliant control for humanoid robots |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2007-Nakanishi_4722[0].pdf |
Reference Type | Thesis |
Author(s) | Peters, J. |
Year | 2007 |
Title | Machine Learning of Motor Skills for Robotics |
Journal/Conference/Book Title | Ph.D. Thesis, Department of Computer Science, University of Southern California |
Keywords | Machine Learning, Reinforcement Learning, Robotics, Motor Primitives, Policy Gradients, Natural Actor-Critic, Reward-Weighted Regression |
Abstract | Autonomous robots that can assist humans in situations of daily life have been a long standing vision of robotics, artificial intelligence, and cognitive sciences. A first step towards this goal is to create robots that can accomplish a multitude of different tasks, triggered by environmental context or higher level instruction. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning and human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this thesis, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting. As a theoretical foundation, we first study a general framework to generate control laws for real robots with a particular focus on skills represented as dynamical systems in differential constraint form. We present a point-wise optimal control framework resulting from a generalization of Gauss' principle and show how various well-known robot control laws can be derived by modifying the metric of the employed cost function. 
The framework has been successfully applied to task space tracking control for holonomic systems for several different metrics on the anthropomorphic SARCOS Master Arm. In order to overcome the limiting requirement of accurate robot models, we first employ learning methods to find learning controllers for task space control. However, when learning to execute a redundant control problem, we face the general problem of the non-convexity of the solution space which can force the robot to steer into physically impossible configurations if supervised learning methods are employed without further consideration. This problem can be resolved using two major insights, i.e., the learning problem can be treated as locally convex and the cost function of the analytical framework can be used to ensure global consistency. Thus, we derive an immediate reinforcement learning algorithm from the expectation-maximization point of view which leads to a reward-weighted regression technique. This method can be used both for operational space control as well as general immediate reward reinforcement learning problems. We demonstrate the feasibility of the resulting framework on the problem of redundant end-effector tracking for both a simulated 3 degrees of freedom robot arm as well as for a simulated anthropomorphic SARCOS Master Arm. While learning to execute tasks in task space is an essential component to a general framework to motor skill learning, learning the actual task is of even higher importance, particularly as this issue is more frequently beyond the abilities of analytical approaches than execution. We focus on the learning of elemental tasks which can serve as the "building blocks of movement generation", called motor primitives. Motor primitives are parameterized task representations based on splines or nonlinear differential equations with desired attractor properties. 
While imitation learning of parameterized motor primitives is a relatively well-understood problem, the self-improvement by interaction of the system with the environment remains a challenging problem, tackled in the fourth chapter of this thesis. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline both established and novel algorithms for the gradient-based improvement of parameterized policies. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm. In conclusion, in this thesis, we have contributed a general framework for analytically computing robot control laws which can be used for deriving various previous control approaches and serves as foundation as well as inspiration for our learning algorithms. We have introduced two classes of novel reinforcement learning methods, i.e., the Natural Actor-Critic and the Reward-Weighted Regression algorithm. These algorithms have been used in order to replace the analytical components of the theoretical framework by learned representations. Evaluations have been performed on both simulated and real robot arms. |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S. |
Year | 2007 |
Title | Reinforcement learning for operational space control |
Journal/Conference/Book Title | International Conference on Robotics and Automation (ICRA2007) |
Keywords | operational space control, reinforcement learning, weighted regression, EM-Algorithm |
Abstract | While operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting supervised learning problem is ill-defined as it requires learning an inverse mapping of a usually redundant system, which is well known to suffer from the property of non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. The important insight that many operational space control algorithms can be reformulated as optimal control problems, however, allows addressing this inverse learning problem in the framework of reinforcement learning. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICRA2007-2111_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S. |
Year | 2007 |
Title | Using reward-weighted regression for reinforcement learning of task space control |
Journal/Conference/Book Title | Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning |
Keywords | reinforcement learning, cart-pole, policy gradient methods |
Abstract | In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, `vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease. |
Place Published | Honolulu, Hawaii, April 1-5, 2007 |
Short Title | Using reward-weighted regression for reinforcement learning of task space control |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ADPRL2007-Peters_[0].pdf |
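Of the three method families this abstract compares, the finite-difference gradient is the simplest to sketch. The toy objective and all constants below are hypothetical, not the paper's cart-pole plant:

```python
import numpy as np

# Finite-difference policy gradient: perturb the policy parameters, then
# regress the observed return differences on the perturbations to estimate
# the gradient. J is a hypothetical noise-free stand-in for a rollout return.
rng = np.random.default_rng(3)

def J(theta):
    # hypothetical return: best at theta = [0.5, -0.5]
    return -np.sum((theta - np.array([0.5, -0.5])) ** 2)

theta = np.zeros(2)
for _ in range(100):
    dtheta = rng.normal(0.0, 0.1, size=(20, 2))            # parameter perturbations
    dJ = np.array([J(theta + d) - J(theta) for d in dtheta])
    g, *_ = np.linalg.lstsq(dtheta, dJ, rcond=None)        # least-squares gradient estimate
    theta += 0.1 * g                                       # gradient ascent step
```

Finite differences need no knowledge of the policy's likelihood, but each estimate costs extra rollouts and degrades with noisy returns, which is exactly the trade-off against the likelihood-ratio ("vanilla") and natural gradients the abstract examines.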
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S. |
Year | 2007 |
Title | Applying the episodic natural actor-critic architecture to motor primitive learning |
Journal/Conference/Book Title | Proceedings of the 2007 European Symposium on Artificial Neural Networks (ESANN) |
Keywords | reinforcement learning, policy gradient methods, motor primitives, natural actor-critic |
Abstract | In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates which are achieved using natural stochastic policy gradients while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the "building blocks of movement generation", called motor primitives. Motor primitives are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. We show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm. |
Place Published | Bruges, Belgium, April 25-27 |
Short Title | Applying the episodic natural actor-critic architecture to motor primitive learning |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/es2007-125.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S.
Year | 2007 |
Title | Reinforcement learning by reward-weighted regression for operational space control |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML2007) |
Keywords | reinforcement learning, operational space control, weighted regression |
Abstract | Many robot control problems of practical importance, including operational space control, can be reformulated as immediate reward reinforcement learning problems. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots. |
Place Published | Corvallis, Oregon, June 19-21 |
Short Title | Reinforcement learning by reward-weighted regression for operational space control |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICML2007-Peters_4493[0].pdf |
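As a rough illustration of the reward-weighted regression idea described in the abstract above (EM-style policy search where the M-step is a regression weighted by transformed rewards), here is a minimal sketch on a toy immediate-reward problem. All quantities (the feature map, the target mapping `true_w`, the reward shape) are illustrative assumptions, not the paper's operational space controller:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy immediate-reward problem: linear-Gaussian policy a ~ N(theta^T phi(s), sigma^2);
# the reward peaks when the action matches a hypothetical target mapping.
def features(s):
    return np.stack([np.ones_like(s), s], axis=1)  # phi(s) = [1, s]

true_w = np.array([0.5, -1.2])   # hypothetical target mapping (illustration only)
theta = np.zeros(2)
sigma = 0.5                       # fixed exploration noise

for _ in range(50):
    s = rng.uniform(-1.0, 1.0, size=500)
    Phi = features(s)
    a = Phi @ theta + sigma * rng.standard_normal(500)   # sample actions (E-step data)
    r = np.exp(-((a - Phi @ true_w) ** 2))               # immediate reward in (0, 1]
    W = np.diag(r)
    # M-step: reward-weighted least squares, the "regression" in reward-weighted regression
    theta = np.linalg.solve(Phi.T @ W @ Phi, Phi.T @ W @ a)

print(theta)  # moves toward true_w
```

Each iteration re-fits the policy mean to its own samples, weighted by reward, so the policy shifts smoothly toward high-reward actions without explicit gradient steps.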
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Theodorou, E.; Schaal, S.
Year | 2007 |
Title | Policy gradient methods for machine learning |
Journal/Conference/Book Title | INFORMS Conference of the Applied Probability Society |
Keywords | policy gradient methods, reinforcement learning, simulation-optimization |
Abstract | We present an in-depth survey of policy gradient methods as they are used in the machine learning community for optimizing parameterized, stochastic control policies in Markovian systems with respect to the expected reward. Despite having been developed separately in the reinforcement learning literature, policy gradient methods employ likelihood ratio gradient estimators as also suggested in the stochastic simulation optimization community. It is well-known that this approach to policy gradient estimation traditionally suffers from three drawbacks, i.e., large variance, a strong dependence on baseline functions and an inefficient gradient descent. In this talk, we will present a series of recent results which tackle each of these problems. The variance of the gradient estimation can be reduced significantly through recently introduced techniques such as optimal baselines, compatible function approximations and all-action gradients. However, as even the analytically obtainable policy gradients perform unnaturally slowly, it required the step from 'vanilla' policy gradient methods towards natural policy gradients in order to overcome the inefficiency of the gradient descent. This development resulted in the Natural Actor-Critic architecture, which can be shown to be very efficient in application to motor primitive learning for robotics. |
Place Published | Eindhoven, Netherlands, July 9-11, 2007 |
Short Title | Policy gradient methods for machine learning |
Reference Type | Conference Proceedings |
Author(s) | Riedmiller, M.; Peters, J.; Schaal, S.
Year | 2007 |
Title | Evaluation of policy gradient methods and variants on the cart-pole benchmark |
Journal/Conference/Book Title | Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning |
Keywords | reinforcement learning, cart-pole, policy gradient methods |
Abstract | In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, `vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease. |
Place Published | Honolulu, Hawaii, April 1-5, 2007 |
Short Title | Evaluation of policy gradient methods and variants on the cart-pole benchmark |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ADPRL2007-Peters2_[0].pdf |
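The simplest of the three method families compared in the entry above, finite-difference gradients, can be sketched in a few lines: perturb the policy parameters, measure return differences, and regress them on the perturbations. The quadratic toy objective below stands in for an episodic rollout; it is an assumption for illustration, not the paper's cart-pole plant (which ships as C++ code):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for an episodic return J(theta); a quadratic toy objective,
# not the paper's cart-pole regulator.
def rollout_return(theta):
    opt = np.array([1.0, -2.0])
    return -np.sum((theta - opt) ** 2)

def finite_difference_gradient(theta, n_perturb=20, eps=0.1):
    # Collect return differences for random parameter perturbations and solve
    # the least-squares system dJ ~= dTheta @ g for the gradient estimate g.
    dTheta = eps * rng.standard_normal((n_perturb, theta.size))
    dJ = np.array([rollout_return(theta + d) - rollout_return(theta) for d in dTheta])
    g, *_ = np.linalg.lstsq(dTheta, dJ, rcond=None)
    return g

theta = np.zeros(2)
for _ in range(100):
    theta += 0.05 * finite_difference_gradient(theta)  # gradient ascent on the return

print(theta)  # climbs toward the optimum [1, -2]
```

'Vanilla' and natural policy gradients replace the perturbation regression with likelihood-ratio estimators, which is where the refinements compared in the paper come in.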
Reference Type | Report |
Author(s) | Peters, J. |
Year | 2007 |
Title | Relative Entropy Policy Search |
Journal/Conference/Book Title | CLMC Technical Report: TR-CLMC-2007-2, University of Southern California |
Keywords | relative entropy, policy search, natural policy gradient |
Abstract | This technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one shot solutions. Future work may include the application to interesting problems. |
Place Published | Los Angeles, CA |
Type of Work | CLMC Technical Report |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Peters-TR2007.pdf |
Research Notes | A longer and more complete version is under preparation. |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S.
Year | 2006 |
Title | Learning operational space control |
Journal/Conference/Book Title | Robotics: Science and Systems (RSS 2006) |
Keywords | operational space control, redundancy, forward models, inverse models, compliance, reinforcement learning, locally weighted learning
Abstract | While operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting learning problem is ill-defined as it requires to learn an inverse mapping of a usually redundant system, which is well known to suffer from the property of non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. A first important insight for this paper is that, nevertheless, a physically correct solution to the inverse problem does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on a recent insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the view of machine learning, the learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward and that employs an expectation-maximization policy search algorithm. Evaluations on a three degrees of freedom robot arm illustrate the feasibility of our suggested approach. |
Place Published | Philadelphia, PA, Aug.16-19 |
Publisher | Cambridge, MA: MIT Press |
Short Title | Learning operational space control |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/p33.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S.
Year | 2006 |
Title | Reinforcement Learning for Parameterized Motor Primitives |
Journal/Conference/Book Title | Proceedings of the 2006 International Joint Conference on Neural Networks (IJCNN) |
Keywords | motor primitives, reinforcement learning |
Abstract | One of the major challenges in both action generation for robotics and in the understanding of human motor control is to learn the "building blocks of movement generation", called motor primitives. Motor primitives, as used in this paper, are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. While a lot of progress has been made in teaching parameterized motor primitives using supervised or imitation learning, the self-improvement by interaction of the system with the environment remains a challenging problem. In this paper, we evaluate different reinforcement learning approaches for improving the performance of parameterized motor primitives. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline both established and novel algorithms for the gradient-based improvement of parameterized policies. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic, outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm. |
Short Title | Reinforcement Learning for Parameterized Motor Primitives |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/Reinforcement_Learning_for_Parameterized_Motor_Pri.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ting, J.; Mistry, M.; Nakanishi, J.; Peters, J.; Schaal, S.
Year | 2006 |
Title | A Bayesian approach to nonlinear parameter identification for rigid body dynamics |
Journal/Conference/Book Title | Robotics: Science and Systems (RSS 2006) |
Keywords | Bayesian regression linear models dimensionality reduction input noise rigid body dynamics parameter identification |
Abstract | For robots of increasing complexity such as humanoid robots, conventional identification of rigid body dynamics models based on CAD data and actuator models becomes difficult and inaccurate due to the large number of additional nonlinear effects in these systems, e.g., stemming from stiff wires, hydraulic hoses, protective shells, skin, etc. Data driven parameter estimation offers an alternative model identification method, but it is often burdened by various other problems, such as significant noise in all measured or inferred variables of the robot. The danger of physically inconsistent results also exists due to unmodeled nonlinearities or insufficiently rich data. In this paper, we address all these problems by developing a Bayesian parameter identification method that can automatically detect noise in both input and output data for the regression algorithm that performs system identification. A post-processing step ensures physically consistent rigid body parameters by nonlinearly projecting the result of the Bayesian estimation onto constraints given by positive definite inertia matrices and the parallel axis theorem. We demonstrate on synthetic and actual robot data that our technique performs parameter identification with 10 to 30% higher accuracy than traditional methods. Due to the resulting physically consistent parameters, our algorithm enables us to apply advanced control methods that algebraically require physical consistency on robotic platforms. |
Place Published | Philadelphia, PA, Aug.16-19 |
Publisher | Cambridge, MA: MIT Press |
Short Title | A Bayesian approach to nonlinear parameter identification for rigid body dynamics |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/p32.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S.
Year | 2006 |
Title | Policy gradient methods for robotics |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Intelligent Robotics Systems (IROS 2006) |
Keywords | policy gradient methods, reinforcement learning, robotics |
Abstract | The acquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-structured environments. However, to date only few existing reinforcement learning methods have been scaled into the domains of high-dimensional robots such as manipulator, legged or humanoid robots. Policy gradient methods remain one of the few exceptions and have found a variety of applications. Nevertheless, the application of such methods is not without peril if done in an uninformed manner. In this paper, we give an overview on learning with policy gradient methods for robotics with a strong focus on recent advances in the field. We outline previous applications to robotics and show how the most recently developed methods can significantly improve learning performance. Finally, we evaluate our most promising algorithm in the application of hitting a baseball with an anthropomorphic arm. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2006-Peters_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Nakanishi, J.; Cory, R.; Mistry, M.; Peters, J.; Schaal, S.
Year | 2005 |
Title | Comparative experiments on task space control with redundancy resolution |
Journal/Conference/Book Title | IEEE International Conference on Intelligent Robots and Systems (IROS 2005) |
Keywords | manipulator dynamics, redundant manipulators, space optimization, dynamical decoupling, humanoid robots, inverse kinematics, motor coordination, redundancy resolution, robot dynamics, seven-degree-of-freedom anthropomorphic robot arm, task space control
Abstract | Understanding the principles of motor coordination with redundant degrees of freedom still remains a challenging problem, particularly for new research in highly redundant robots like humanoids. Even after more than a decade of research, task space control with redundancy resolution still remains an incompletely understood theoretical topic, and also lacks a larger body of thorough experimental investigation on complex robotic systems. This paper presents our first steps towards the development of a working redundancy resolution algorithm which is robust against modeling errors and unforeseen disturbances arising from contact forces. To gain a better understanding of the pros and cons of different approaches to redundancy resolution, we focus on a comparative empirical evaluation. First, we review several redundancy resolution schemes at the velocity, acceleration and torque levels presented in the literature in a common notational framework and also introduce some new variants of these previous approaches. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm. Surprisingly, one of our simplest algorithms empirically demonstrates the best performance, even though, from a theoretical point of view, the algorithm does not share the same beauty as some of the other methods. Finally, we discuss practical properties of these control algorithms, particularly in light of inevitable modeling errors of the robot dynamics. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2005-Nakanishi_5051[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Vijayakumar, S.; Schaal, S.
Year | 2005 |
Title | Natural Actor-Critic |
Journal/Conference/Book Title | Proceedings of the 16th European Conference on Machine Learning (ECML 2005) |
Keywords | Reinforcement Learning, Policy Gradients, Natural Gradients |
Abstract | This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari's natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression. We show that actor improvements with natural policy gradients are particularly appealing as these are independent of the coordinate frame of the chosen policy representation, and can be estimated more efficiently than regular policy gradients. The critic makes use of a special basis function parameterization motivated by the policy-gradient compatible function approximation. We show that several well-known reinforcement learning methods such as the original Actor-Critic and Bradtke's Linear Quadratic Q-Learning are in fact Natural Actor-Critic algorithms. Empirical evaluations illustrate the effectiveness of our techniques in comparison to previous methods, and also demonstrate their applicability for learning control on an anthropomorphic robot arm. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NaturalActorCritic.pdf |
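The core idea behind the entry above, preconditioning the 'vanilla' policy gradient with the inverse Fisher information matrix, can be shown on the smallest possible case: a one-dimensional Gaussian policy on a bandit problem. This is a minimal sketch under assumed toy quantities (the reward function, fixed exploration noise), not the full episodic Natural Actor-Critic with its compatible-function-approximation critic:

```python
import numpy as np

rng = np.random.default_rng(2)

# Gaussian policy a ~ N(mu, sigma^2) with fixed sigma on a bandit whose
# reward r(a) = -(a - 2)^2 peaks at the hypothetical optimum a* = 2.
sigma = 1.0
mu = -3.0
for _ in range(200):
    a = mu + sigma * rng.standard_normal(500)
    r = -((a - 2.0) ** 2)
    score = (a - mu) / sigma**2                      # d log pi / d mu
    vanilla_grad = np.mean(score * (r - r.mean()))   # baseline-subtracted REINFORCE
    fisher = 1.0 / sigma**2                          # Fisher information of N(mu, sigma^2) w.r.t. mu
    mu += 0.1 * vanilla_grad / fisher                # natural gradient step
print(mu)  # approaches the optimum 2
```

Dividing by the Fisher information rescales the update into the geometry of the policy distribution rather than the raw parameter space, which is what makes the step independent of the chosen parameterization.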
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Mistry, M.; Udwadia, F. E.; Schaal, S.
Year | 2005 |
Title | A new methodology for robot control design |
Journal/Conference/Book Title | The 5th ASME International Conference on Multibody Systems, Nonlinear Dynamics, and Control (MSNDC 2005) |
Keywords | robot control, nonlinear control, gauss principle |
Abstract | Gauss' principle of least constraint and its generalizations have provided useful insights for the development of tracking controllers for mechanical systems (Udwadia, 2003). Using this concept, we present a novel methodology for the design of a specific class of robot controllers. With our new framework, we demonstrate that well-known and also several novel nonlinear robot control laws can be derived from this generic framework, and show experimental verifications on a Sarcos Master Arm robot for some of these controllers. We believe that the suggested approach unifies and simplifies the design of optimal nonlinear control laws for robots obeying rigid body dynamics equations, both with or without external constraints, holonomic or nonholonomic constraints, with over-actuation or underactuation, as well as open-chain and closed-chain kinematics. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/PetMisUdwSchASME2005_5054[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Mistry, M.; Udwadia, F. E.; Cory, R.; Nakanishi, J.; Schaal, S.
Year | 2005 |
Title | A unifying framework for the control of robotics systems |
Journal/Conference/Book Title | IEEE International Conference on Intelligent Robots and Systems (IROS 2005) |
Abstract | Recently, [1] suggested deriving tracking controllers for mechanical systems using a generalization of Gauss' principle of least constraint. This method allows us to reformulate control problems as a special class of optimal control. We take this line of reasoning one step further and demonstrate that well-known and also several novel nonlinear robot control laws can be derived from this generic methodology. We show experimental verifications on a Sarcos Master Arm robot for some of the derived controllers. We believe that the suggested approach offers a promising unification and simplification of nonlinear control law design for robots obeying rigid body dynamics equations, both with or without external constraints, with over-actuation or under-actuation, as well as open-chain and closed-chain kinematics. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2005-Peters_5052[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Schaal, S.; Peters, J.; Nakanishi, J.; Ijspeert, A.
Year | 2004 |
Title | Learning Movement Primitives |
Journal/Conference/Book Title | International Symposium on Robotics Research (ISRR2003) |
Keywords | movement primitives, supervised learning, reinforcement learning, locomotion, phase resetting, learning from demonstration
Abstract | This paper discusses a comprehensive framework for modular motor control based on a recently developed theory of dynamic movement primitives (DMP). DMPs are a formulation of movement primitives with autonomous nonlinear differential equations, whose time evolution creates smooth kinematic control policies. Model-based control theory is used to convert the outputs of these policies into motor commands. By means of coupling terms, on-line modifications can be incorporated into the time evolution of the differential equations, thus providing a rather flexible and reactive framework for motor planning and execution. The linear parameterization of DMPs lends itself naturally to supervised learning from demonstration. Moreover, the temporal, scale, and translation invariance of the differential equations with respect to these parameters provides a useful means for movement recognition. A novel reinforcement learning technique based on natural stochastic policy gradients allows a general approach of improving DMPs by trial and error learning with respect to almost arbitrary optimization criteria. We demonstrate the different ingredients of the DMP approach in various examples, involving skill learning from demonstration on the humanoid robot DB, and learning biped walking from demonstration in simulation, including self-improvement of the movement patterns towards energy efficiency through resonance tuning. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Schaal2005_Chapter_LearningMovementPrimitives.pdf |
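The abstract above describes dynamic movement primitives as autonomous nonlinear differential equations with attractor properties and a linear parameterization. A minimal discrete (point-attractor) DMP can be sketched as a damped spring system plus a phase-gated, basis-function forcing term; the gain values, basis placement, and integration settings below are illustrative assumptions, not constants from the paper:

```python
import numpy as np

# Minimal discrete dynamic movement primitive: a critically damped point
# attractor toward goal g, modulated by a forcing term that is linear in the
# weights and gated by a decaying phase variable x (the canonical system).
def dmp_rollout(g, y0, weights, centers, widths, tau=1.0, dt=0.001, alpha=25.0):
    beta, alpha_x = alpha / 4.0, alpha / 3.0   # illustrative gain choices
    y, z, x = y0, 0.0, 1.0
    traj = []
    for _ in range(int(1.0 / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)                  # Gaussian basis in phase
        f = x * (g - y0) * np.dot(psi, weights) / (psi.sum() + 1e-10)
        z += dt / tau * (alpha * (beta * (g - y) - z) + f)          # transformation system
        y += dt / tau * z
        x += dt / tau * (-alpha_x * x)                              # canonical system decays the phase
        traj.append(y)
    return np.array(traj)

centers = np.exp(-25.0 / 3.0 * np.linspace(0.0, 1.0, 10))           # basis centers along the phase
traj = dmp_rollout(g=1.0, y0=0.0, weights=np.zeros(10),
                   centers=centers, widths=np.full(10, 100.0))
print(traj[-1])  # converges to the goal g = 1.0
```

Because the forcing term vanishes as the phase decays, any weight setting preserves convergence to the goal, and because `f` is linear in `weights`, fitting a demonstrated trajectory reduces to linear regression, which is the property the supervised-learning and policy-gradient improvements in these entries exploit.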
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S. |
Year | 2004 |
Title | Learning Motor Primitives with Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the 11th Joint Symposium on Neural Computation |
Keywords | natural policy gradients, motor primitives, natural actor-critic |
Abstract | One of the major challenges in action generation for robotics and in the understanding of human motor control is to learn the "building blocks of movement generation," or more precisely, motor primitives. Recently, Ijspeert et al. [1, 2] suggested a novel framework for how to use nonlinear dynamical systems as motor primitives. While a lot of progress has been made in teaching these motor primitives using supervised or imitation learning, the self-improvement by interaction of the system with the environment remains a challenging problem. In this poster, we evaluate how different reinforcement learning approaches can be used in order to improve the performance of motor primitives. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline how these lead to a novel algorithm which is based on natural policy gradients [3]. We compare this algorithm to previous reinforcement learning algorithms in the context of dynamic motor primitive learning, and show that it outperforms these by at least an order of magnitude. We demonstrate the efficiency of the resulting reinforcement learning method for creating complex behaviors for autonomous robotics. The studied behaviors will include both discrete, finite tasks such as baseball swings, as well as complex rhythmic patterns as they occur in biped locomotion. |
Place Published | http://resolver.caltech.edu/CaltechJSNC:2004.poster020 |
Reference Type | Conference Proceedings |
Author(s) | Mohajerian, P.; Peters, J.; Ijspeert, A.; Schaal, S.
Year | 2003 |
Title | A unifying computational framework for optimization and dynamic systems approaches to motor control |
Journal/Conference/Book Title | Proceedings of the 10th Joint Symposium on Neural Computation (JSNC 2003) |
Keywords | computational motor control, optimization, dynamic systems, formal modeling |
Abstract | Theories of biological motor control have been pursued from at least two separate frameworks, the "Dynamic Systems" approach and the "Control Theoretic/Optimization" approach. Control and optimization theory emphasize motor control based on organizational principles in terms of generic cost criteria like "minimum jerk", "minimum torque-change", "minimum variance", etc., while dynamic systems theory puts larger focus on principles of self-organization in motor control, like synchronization, phase-locking, phase transitions, perception-action coupling, etc. Computational formalizations in both approaches have equally differed, using mostly time-indexed desired trajectory plans in control/optimization theory, and nonlinear autonomous differential equations in dynamic systems theory. Due to these differences in philosophy and formalization, optimization approaches and dynamic systems approaches have largely remained two separate research approaches in motor control, mostly conceived of as incompatible. In this poster, we present a novel formal framework for motor control that can harmoniously encompass both optimization and dynamic systems approaches. This framework is based on the discovery that almost arbitrary nonlinear autonomous differential equations can be acquired within a standard statistical (or neural network) learning framework without the need of tedious manual parameter tuning and the danger of entering unstable or chaotic regions of the differential equations. Both rhythmic (e.g., locomotion, swimming, etc.) and discrete (e.g., point-to-point reaching, grasping, etc.) movement can be modeled, either as single degree-of-freedom or multiple degree-of-freedom systems. 
Coupling parameters to the differential equations can create typical effects of self-organization in dynamic systems, while optimization approaches can be used numerically safely to improve the attractor landscape of the equations with respect to a given cost criterion, as demonstrated in modeling studies of several of the hallmarks of dynamic systems and optimization theory. We believe that this novel computational framework will allow a first step towards unifying dynamic systems and optimization approaches to motor control, and provide a set of principled modeling tools to both communities. |
Place Published | Irvine, CA, May 2003 |
Short Title | A unifying computational framework for optimization and dynamic systems approaches to motor control
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/JSNC2003-Mohajerian_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Vijayakumar, S.; Schaal, S.
Year | 2003 |
Title | Reinforcement learning for humanoid robotics |
Journal/Conference/Book Title | IEEE-RAS International Conference on Humanoid Robots (Humanoids2003) |
Keywords | reinforcement learning, policy gradients, movement primitives, behaviors, dynamic systems, humanoid robotics |
Abstract | Reinforcement learning offers one of the most general frameworks to take traditional robotics towards true autonomy and versatility. However, applying reinforcement learning to high dimensional movement systems like humanoid robots remains an unsolved problem. In this paper, we discuss different approaches of reinforcement learning in terms of their applicability in humanoid robotics. Methods can be coarsely classified into three different categories, i.e., greedy methods, `vanilla' policy gradient methods, and natural gradient methods. We discuss that greedy methods are not likely to scale into the domain of humanoid robotics as they are problematic when used with function approximation. `Vanilla' policy gradient methods on the other hand have been successfully applied on real-world robots including at least one humanoid robot. We demonstrate that these methods can be significantly improved using the natural policy gradient instead of the regular policy gradient. A derivation of the natural policy gradient is provided, proving that the average policy gradient of Kakade (2002) is indeed the true natural gradient. A general algorithm for estimating the natural gradient, the Natural Actor-Critic algorithm, is introduced. This algorithm converges to the nearest local minimum of the cost function with respect to the Fisher information metric under suitable conditions. The algorithm outperforms non-natural policy gradients by far in a cart-pole balancing evaluation, and for learning nonlinear dynamic motor primitives for humanoid robot control. It offers a promising route for the development of reinforcement learning for truly high-dimensional continuous state-action systems. |
Place Published | Karlsruhe, Germany, Sept.29-30 |
Short Title | Reinforcement learning for humanoid robotics |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/peters-ICHR2003.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Vijayakumar, S.; Schaal, S.
Year | 2003 |
Title | Scaling reinforcement learning paradigms for motor learning |
Journal/Conference/Book Title | Proceedings of the 10th Joint Symposium on Neural Computation (JSNC 2003) |
Keywords | Reinforcement learning, neurodynamic programming, actorcritic methods, policy gradient methods, natural policy gradient |
Abstract | Reinforcement learning offers a general framework to explain reward related learning in artificial and biological motor control. However, current reinforcement learning methods rarely scale to high dimensional movement systems and mainly operate in discrete, low dimensional domains like game-playing, artificial toy problems, etc. This drawback makes them unsuitable for application to human or bio-mimetic motor control. In this poster, we look at promising approaches that can potentially scale and suggest a novel formulation of the actor-critic algorithm which takes steps towards alleviating the current shortcomings. We argue that methods based on greedy policies are not likely to scale into high-dimensional domains as they are problematic when used with function approximation, a must when dealing with continuous domains. We adopt the path of direct policy gradient based policy improvements since they avoid the problems of destabilizing dynamics encountered in traditional value iteration based updates. While regular policy gradient methods have demonstrated promising results in the domain of humanoid motor control, we demonstrate that these methods can be significantly improved using the natural policy gradient instead of the regular policy gradient. Based on this, it is proved that Kakade's "average natural policy gradient" is indeed the true natural gradient. A general algorithm for estimating the natural gradient, the Natural Actor-Critic algorithm, is introduced. This algorithm converges with probability one to the nearest local minimum in Riemannian space of the cost function. The algorithm outperforms non-natural policy gradients by far in a cart-pole balancing evaluation, and offers a promising route for the development of reinforcement learning for truly high-dimensional continuous state-action systems. |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/petersVijayakumarSchaal_JSNC2003_5058[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Schaal, S.; Peters, J.; Nakanishi, J.; Ijspeert, A.
Year | 2003 |
Title | Control, planning, learning, and imitation with dynamic movement primitives |
Journal/Conference/Book Title | Workshop on Bilateral Paradigms on Humans and Humanoids, IEEE International Conference on Intelligent Robots and Systems (IROS 2003) |
Keywords | movement primitives, supervised learning, reinforcement learning, locomotion, phase resetting, learning from demonstration
Abstract | In both human and humanoid movement science, the topic of movement primitives has become central in understanding the generation of complex motion with high degree-of-freedom bodies. A theory of control, planning, learning, and imitation with movement primitives seems to be crucial in order to reduce the search space during motor learning and achieve a large level of autonomy and flexibility in dynamically changing environments. Movement recognition based on the same representations as used for movement generation, i.e., movement primitives, is equally intimately tied into these research questions. This paper discusses a comprehensive framework for motor control with movement primitives using a recently developed theory of dynamic movement primitives (DMP). DMPs are a formulation of movement primitives with autonomous nonlinear differential equations, whose time evolution creates smooth kinematic movement plans. Model-based control theory is used to convert such movement plans into motor commands. By means of coupling terms, on-line modifications can be incorporated into the time evolution of the differential equations, thus providing a rather flexible and reactive framework for motor planning and execution; indeed, DMPs form complete kinematic control policies, not just a particular desired trajectory. The linear parameterization of DMPs lends itself naturally to supervised learning from demonstrations. Moreover, the temporal, scale, and translation invariance of the differential equations with respect to these parameters provides a useful means for movement recognition. A novel reinforcement learning technique based on natural stochastic policy gradients allows a general approach of improving DMPs by trial and error learning with respect to almost arbitrary optimization criteria, including situations with delayed rewards. 
We demonstrate the different ingredients of the DMP approach in various examples, involving skill learning from demonstration on the humanoid robot DB and an application of learning simulated biped walking from a demonstrated trajectory, including self-improvement of the movement patterns in the spirit of energy efficiency through resonance tuning. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2003-Schaal_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Vijayakumar, S.; D'Souza, A.; Peters, J.; Conradt, J.; Rutkowski, T.; Ijspeert, A.; Nakanishi, J.; Inoue, M.; Shibata, T.; Wiryo, A.; Itti, L.; Amari, S.; Schaal, S. |
Year | 2002 |
Title | Real-Time Statistical Learning for Oculomotor Control and Visuomotor Coordination |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS/NeurIPS), Demonstration Track |
Reference Type | Report |
Author(s) | Peters, J. |
Year | 2002 |
Title | Policy Gradient Methods for Control Applications |
Journal/Conference/Book Title | CLMC Technical Report: TR-CLMC-2007-1, University of Southern California |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Member/JanPeters/techrep.pdf |
Reference Type | Conference Proceedings |
Author(s) | Burdet, E.; Tee, K.P.; Chew, C.M.; Peters, J.; Bt, V.L. |
Year | 2001 |
Title | Hybrid IDM/Impedance Learning in Human Movements |
Journal/Conference/Book Title | First International Symposium on Measurement, Analysis and Modeling of Human Functions Proceedings |
Keywords | human motor control |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/burdet_ISHF_2001.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/burdet_ISHF_2001.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Riener, R. |
Year | 2000 |
Title | A real-time model of the human knee for application in virtual orthopaedic trainer |
Journal/Conference/Book Title | Proceedings of the 10th International Conference on Biomedical Engineering (ICBME) |
Keywords | Biomechanics, human motor control |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ICBME_2000.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ICBME_2000.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J. |
Year | 1998 |
Title | Fuzzy Logic for Practical Applications |
Journal/Conference/Book Title | Kuenstliche Intelligenz (KI) |
Keywords | book review |
Number | 4 |
Pages | 60 |