Reference Type | Journal Article |
Author(s) | Parisi, S.; Tateo, D.; Hensel, M.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | submitted |
Title | Long-Term Visitation Value for Deep Exploration in Sparse Reward Reinforcement Learning |
Journal/Conference/Book Title | Submitted to the Journal of Machine Learning Research (JMLR) |
Link to PDF | https://arxiv.org/abs/2001.00119 |
Reference Type | Journal Article |
Author(s) | Tosatto, S.; Akrour, R.; Peters, J. |
Year | submitted |
Title | An Upper Bound of the Bias of Nadaraya-Watson Kernel Regression under Lipschitz Assumptions |
Journal/Conference/Book Title | Journal of Nonparametric Statistics (JNS)
Keywords | Nonparametric, Bias, Kernel Regression |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SamueleTosatto/biometrica2020preprint.pdf |
Reference Type | Journal Article |
Author(s) | Akrour, R.; Atamna, A.; Peters, J. |
Year | submitted |
Title | Convex Optimization with an Interpolation-based Projection and its Application to Deep Learning |
Journal/Conference/Book Title | Machine Learning (MACH) |
Link to PDF | https://arxiv.org/pdf/2011.07016.pdf |
Reference Type | Journal Article |
Author(s) | Akrour, R.; Tateo, D.; Peters, J. |
Year | submitted |
Title | Reinforcement Learning from a Mixture of Interpretable Experts |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Link to PDF | https://arxiv.org/pdf/2006.05911.pdf |
Reference Type | Journal Article |
Author(s) | Muratore, F.; Gienger, M.; Peters, J. |
Year | in press |
Title | Assessing Transferability from Simulation to Reality for Reinforcement Learning |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Abstract | Learning robot control policies from physics simulations is of great interest to the robotics community as it may render the learning process faster, cheaper, and safer by alleviating the need for expensive real-world experiments. However, the direct transfer of learned behavior from simulation to reality is a major challenge. Optimizing a policy on a slightly faulty simulator can easily lead to the maximization of the ‘Simulation Optimization Bias’ (SOB). In this case, the optimizer exploits modeling errors of the simulator such that the resulting behavior can potentially damage the robot. We tackle this challenge by applying domain randomization, i.e., randomizing the parameters of the physics simulations during learning. We propose an algorithm called Simulation-based Policy Optimization with Transferability Assessment (SPOTA) which uses an estimator of the SOB to formulate a stopping criterion for training. The introduced estimator quantifies the over-fitting to the set of domains experienced while training. Our experimental results on two different second order nonlinear systems show that the new simulation-based policy search algorithm is able to learn a control policy exclusively from a randomized simulator, which can be applied directly to real systems without any additional training. |
URL(s) | https://www.ncbi.nlm.nih.gov/pubmed/31722475 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Muratore_Gienger_Peters--AssessingTransferabilityfromSimulationToRealityForReinforcementLearning.pdf |
Language | English |
Reference Type | Journal Article |
Author(s) | Muratore, F.; Eilers, C.; Gienger, M.; Peters, J. |
Year | in press |
Title | Data-efficient Domain Randomization with Bayesian Optimization |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RAL) |
Keywords | sim-2-real, domain randomization, bayesian optimization |
Abstract | When learning policies for robot control, the required real-world data is typically prohibitively expensive to acquire, so learning in simulation is a popular strategy. Unfortunately, such policies are often not transferable to the real world due to a mismatch between the simulation and reality, called the 'reality gap'. Domain randomization methods tackle this problem by randomizing the physics simulator (source domain) during training according to a distribution over domain parameters in order to obtain more robust policies that are able to overcome the reality gap. Most domain randomization approaches sample the domain parameters from a fixed distribution. This solution is suboptimal in the context of sim-to-real transferability, since it yields policies that have been trained without explicitly optimizing for the reward on the real system (target domain). Additionally, a fixed distribution assumes there is prior knowledge about the uncertainty over the domain parameters. In this paper, we propose Bayesian Domain Randomization (BayRn), a black-box sim-to-real algorithm that solves tasks efficiently by adapting the domain parameter distribution during learning given sparse data from the real-world target domain. BayRn uses Bayesian optimization to search the space of source domain distribution parameters such that this leads to a policy which maximizes the real-world objective, allowing for adaptive distributions during policy optimization. We experimentally validate the proposed approach in sim-to-sim as well as in sim-to-real experiments, comparing against three baseline methods on two robotic tasks. Our results show that BayRn is able to perform sim-to-real transfer, while significantly reducing the required prior knowledge.
Editor(s) | Dana Kulic |
Publisher | IEEE |
URL(s) | https://ieeexplore.ieee.org/document/9327467 |
Link to PDF | https://arxiv.org/pdf/2003.02471.pdf |
Language | English |
Last Modified Date | 2021-01-06 |
Reference Type | Journal Article |
Author(s) | Ewerton, M.; Arenz, O.; Peters, J. |
Year | in press |
Title | Assisted Teleoperation in Changing Environments with a Mixture of Virtual Guides |
Journal/Conference/Book Title | Advanced Robotics |
Reference Type | Book |
Author(s) | Belousov, B.; Abdulsamad, H.; Klink, P.; Parisi, S.; Peters, J.
Year | 2021 |
Title | Reinforcement Learning Algorithms: Analysis and Applications |
Journal/Conference/Book Title | Studies in Computational Intelligence |
Publisher | Springer International Publishing |
Edition | 1 |
URL(s) | https://www.springer.com/gp/book/9783030411879 |
Reference Type | Conference Paper |
Author(s) | Watson, J.; Lin, J. A.; Klink, P.; Peters, J. |
Year | 2021 |
Title | Neural Linear Models with Functional Gaussian Process Priors |
Journal/Conference/Book Title | 3rd Symposium on Advances in Approximate Bayesian Inference (AABI) |
Reference Type | Conference Proceedings |
Author(s) | Watson, J.; Peters, J. |
Year | 2021 |
Title | Advancing Trajectory Optimization with Approximate Inference: Exploration, Covariance Control and Adaptive Risk |
Journal/Conference/Book Title | American Control Conference (ACC) |
Reference Type | Conference Proceedings |
Author(s) | Watson, J.; Lin, J. A.; Klink, P.; Pajarinen, J.; Peters, J.
Year | 2021 |
Title | Latent Derivative Bayesian Last Layer Networks |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS)
Reference Type | Journal Article |
Author(s) | Veiga, F. F.; Edin, B. B.; Peters, J.
Year | 2020 |
Title | Grip Stabilization through Independent Finger Tactile Feedback Control |
Journal/Conference/Book Title | Sensors (Special Issue on Sensors and Robot Control) |
Volume | 20 |
Link to PDF | https://www.mdpi.com/1424-8220/20/6/1748/pdf |
Reference Type | Journal Article |
Author(s) | Vinogradska, J.; Bischoff, B.; Koller, T.; Achterhold, J.; Peters, J. |
Year | 2020 |
Title | Numerical Quadrature for Probabilistic Policy Search |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Volume | 42 |
Number | 1 |
Pages | 164-175 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NuQuPS_preprint.pdf |
Reference Type | Journal Article |
Author(s) | Lioutikov, R.; Maeda, G.; Veiga, F.F.; Kersting, K.; Peters, J. |
Year | 2020 |
Title | Learning Attribute Grammars for Movement Primitive Sequencing |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | Skillz4robots |
Volume | 39 |
Number | 1 |
Pages | 21-38 |
Reference Type | Journal Article |
Author(s) | Arenz, O.; Zhong, M.; Neumann, G.
Year | 2020 |
Title | Trust-Region Variational Inference with Gaussian Mixture Models |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Keywords | approximate inference, variational inference, sampling, policy search, mcmc, markov chain monte carlo |
Abstract | Many methods for machine learning rely on approximate inference from intractable probability distributions. Variational inference approximates such distributions by tractable models that can be subsequently used for approximate inference. Learning sufficiently accurate approximations requires a rich model family and careful exploration of the relevant modes of the target distribution. We propose a method for learning accurate GMM approximations of intractable probability distributions based on insights from policy search by using information-geometric trust regions for principled exploration. For efficient improvement of the GMM approximation, we derive a lower bound on the corresponding optimization objective enabling us to update the components independently. Our use of the lower bound ensures convergence to a stationary point of the original objective. The number of components is adapted online by adding new components in promising regions and by deleting components with negligible weight. We demonstrate on several domains that we can learn approximations of complex, multimodal distributions with a quality that is unmet by previous variational inference methods, and that the GMM approximation can be used for drawing samples that are on par with samples created by state-of-the-art MCMC samplers while requiring up to three orders of magnitude less computational resources. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/OlegArenz/VIPS_JMLR.pdf |
Reference Type | Journal Article |
Author(s) | Gomez-Gonzalez, S.; Neumann, G.; Schölkopf, B.; Peters, J. |
Year | 2020 |
Title | Adaptation and Robust Learning of Probabilistic Movement Primitives |
Journal/Conference/Book Title | IEEE Transactions on Robotics (T-Ro) |
Volume | 36 |
Number | 2 |
Pages | 366-379 |
Link to PDF | https://arxiv.org/pdf/1808.10648.pdf |
Reference Type | Journal Article |
Author(s) | Koert, D.; Trick, S.; Ewerton, M.; Lutter, M.; Peters, J. |
Year | 2020 |
Title | Incremental Learning of an Open-Ended Collaborative Skill Library |
Journal/Conference/Book Title | International Journal of Humanoid Robotics (IJHR) |
Keywords | SKILLS4ROBOTS, KOBO |
Volume | 17 |
Number | 1 |
Reference Type | Conference Proceedings |
Author(s) | Dam, T.; Klink, P.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | 2020 |
Title | Generalized Mean Estimation in Monte-Carlo Tree Search |
Journal/Conference/Book Title | Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) |
Abstract | We consider Monte-Carlo Tree Search (MCTS) applied to Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs), and the well-known Upper Confidence bound for Trees (UCT) algorithm. In UCT, a tree with nodes (states) and edges (actions) is incrementally built by the expansion of nodes, and the values of nodes are updated through a backup strategy based on the average value of child nodes. However, it has been shown that with enough samples the maximum operator yields more accurate node value estimates than averaging. Instead of settling for one of these value estimates, we go a step further, proposing a novel backup strategy that uses the power mean operator, which computes a value between the average and the maximum value. We call our new approach Power-UCT and argue how the use of the power mean operator helps to speed up the learning in MCTS. We theoretically analyze our method providing guarantees of convergence to the optimum. Moreover, we discuss a heuristic approach to balance the greediness of backups by tuning the power mean operator according to the number of visits to each node. Finally, we empirically demonstrate the effectiveness of our method in well-known MDP and POMDP benchmarks, showing significant improvement in performance and convergence speed w.r.t. UCT.
URL(s) | https://www.ijcai.org/Proceedings/2020/0332.pdf |
Link to PDF | https://www.ijcai.org/Proceedings/2020/0332.pdf |
Reference Type | Journal Article |
Author(s) | Loeckel, S.; Peters, J.; van Vliet, P. |
Year | 2020 |
Title | A Probabilistic Framework for Imitating Human Race Driver Behavior |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L), with Presentation at the IEEE International Conference on Robotics and Automation (ICRA) |
Volume | 5 |
Number | 2 |
Link to PDF | https://arxiv.org/pdf/2001.08255.pdf |
Reference Type | Conference Proceedings |
Author(s) | Motokura, K.; Takahashi, M.; Ewerton, M.; Peters, J. |
Year | 2020 |
Title | Plucking Motions for Tea Harvesting Robots Using Probabilistic Movement Primitives |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (ICRA/RA-L), with Presentation at the IEEE International Conference on Robotics and Automation (ICRA) |
Volume | 5 |
Number | 2 |
ISBN/ISSN | 2377-3766
Reference Type | Conference Proceedings |
Author(s) | Zelch, C.; Peters, J.; von Stryk, O. |
Year | 2020 |
Title | Learning Control Policies from Optimal Trajectories |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/Members/Zelch_ICRA_2020.pdf |
Reference Type | Journal Article |
Author(s) | Gomez-Gonzalez, S.; Prokudin, S.; Schölkopf, B.; Peters, J. |
Year | 2020 |
Title | Real Time Trajectory Prediction Using Deep Conditional Generative Models |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (ICRA/RA-L), with Presentation at the IEEE International Conference on Robotics and Automation (ICRA) |
Volume | 5 |
Number | 2 |
Pages | 970-976 |
Link to PDF | https://arxiv.org/pdf/1909.03895.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tosatto, S.; Carvalho, J.; Abdulsamad, H.; Peters, J. |
Year | 2020 |
Title | A Nonparametric Off-Policy Policy Gradient |
Journal/Conference/Book Title | Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS) |
Keywords | nonparametric, policy gradient, off policy, reinforcement learning |
Abstract | Reinforcement learning (RL) algorithms still suffer from high sample complexity despite outstanding recent successes. The need for intensive interactions with the environment is especially observed in many widely popular policy gradient algorithms that perform updates using on-policy samples. The price of such inefficiency becomes evident in real-world scenarios such as interaction-driven robot learning, where the success of RL has been rather limited. We address this issue by building on the general sample efficiency of off-policy algorithms. With nonparametric regression and density estimation methods we construct a nonparametric Bellman equation in a principled manner, which allows us to obtain closed-form estimates of the value function, and to analytically express the full policy gradient. We provide a theoretical analysis of our estimate to show that it is consistent under mild smoothness assumptions and empirically show that our approach has better sample efficiency than state-of-the-art policy gradient methods. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SamueleTosatto/tosatto2020.pdf |
Reference Type | Conference Proceedings |
Author(s) | D'Eramo, C.; Tateo, D.; Bonarini, A.; Restelli, M.; Peters, J.
Year | 2020 |
Title | Sharing Knowledge in Multi-Task Deep Reinforcement Learning |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR)
Link to PDF | https://openreview.net/pdf?id=rkgpv2VFvr |
Reference Type | Conference Proceedings |
Author(s) | Eilers, C.; Eschmann, J.; Menzenbach, R.; Belousov, B.; Muratore, F.; Peters, J. |
Year | 2020 |
Title | Underactuated Waypoint Trajectory Optimization for Light Painting Photography |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | SKILLS4ROBOTS |
Abstract | Despite their abundance in robotics and nature, underactuated systems remain a challenge for control engineering. Trajectory optimization provides a generally applicable solution, however its efficiency strongly depends on the skill of the engineer to frame the problem in an optimizer-friendly way. This paper proposes a procedure that automates such problem reformulation for a class of tasks in which the desired trajectory is specified by a sequence of waypoints. The approach is based on introducing auxiliary optimization variables that represent waypoint activations. To validate the proposed method, a letter drawing task is set up where shapes traced by the tip of a rotary inverted pendulum are visualized using long exposure photography. |
Custom 1 | https://www.youtube.com/watch?v=IiophaKtWG0&feature=youtu.be |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Eilers_Eschmann_Menzenbach_BMP--UnderactuatedWaypointTrajectoryOptimizationforLightPaintingPhotography.pdf |
Reference Type | Conference Proceedings |
Author(s) | Stock-Homburg, R.; Peters, J.; Schneider, K.; Prasad, V.; Nukovic, L. |
Year | 2020 |
Title | Evaluation of the Handshake Turing Test for Anthropomorphic Robots
Journal/Conference/Book Title | Proceedings of the ACM/IEEE International Conference on Human Robot Interaction (HRI), Late Breaking Report |
Link to PDF | https://arxiv.org/pdf/2001.10464.pdf |
Reference Type | Journal Article |
Author(s) | Tosatto, S.; Stadtmueller, J.; Peters, J. |
Year | 2020 |
Title | Dimensionality Reduction of Movement Primitives in Parameter Space |
Journal/Conference/Book Title | arXiv |
Keywords | movement primitives, dimensionality reduction, imitation learning, robot learning |
Abstract | Movement primitives are an important policy class for real-world robotics. However, the high dimensionality of their parametrization makes the policy optimization expensive both in terms of samples and computation. Enabling an efficient representation of movement primitives facilitates the application of machine learning techniques such as reinforcement learning in robotics. Motions, especially in highly redundant kinematic structures, exhibit high correlation in the configuration space. For these reasons, prior work has mainly focused on the application of dimensionality reduction techniques in the configuration space. In this paper, we investigate the application of dimensionality reduction in the parameter space, identifying principal movements. The resulting approach is enriched with a probabilistic treatment of the parameters, inheriting all the properties of the Probabilistic Movement Primitives. We test the proposed technique both on a real robotic task and on a database of complex human movements. The empirical analysis shows that the dimensionality reduction in parameter space is more effective than in configuration space, as it enables the representation of the movements with a significant reduction of parameters.
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SamueleTosatto/tosatto2020b.pdf |
Reference Type | Conference Proceedings |
Author(s) | Almeida Santos, A.; Gil, C.E.M.; Peters, J.; Steinke, F. |
Year | 2020 |
Title | Decentralized Data-Driven Tuning of Droop Frequency Controllers |
Journal/Conference/Book Title | IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe)
Link to PDF | https://www.eins.tu-darmstadt.de/fileadmin/user_upload/publications_pdf/20_ISGTEU_SanMorPetSte_paper.pdf |
Reference Type | Conference Proceedings |
Author(s) | Abdulsamad, H.; Peters, J. |
Year | 2020 |
Title | Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation |
Journal/Conference/Book Title | 2nd Annual Conference on Learning for Dynamics and Control (L4DC)
Link to PDF | https://arxiv.org/abs/2005.01432 |
Reference Type | Conference Proceedings |
Author(s) | Becker, P.; Arenz, O.; Neumann, G. |
Year | 2020 |
Title | Expected Information Maximization: Using the I-Projection for Mixture Density Estimation |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Link to PDF | /uploads/Team/OlegArenz/beckerEIM.pdf |
Reference Type | Journal Article |
Author(s) | Lauri, M.; Pajarinen, J.; Peters, J.; Frintrop, S. |
Year | 2020 |
Title | Multi-Sensor Next-Best-View Planning as Matroid-Constrained Submodular Maximization |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Volume | 5 |
Number | 4 |
Pages | 5323-5330 |
Reference Type | Journal Article |
Author(s) | Agudelo-Espana, D.; Gomez-Gonzalez, S.; Bauer, S.; Schölkopf, B.; Peters, J.
Year | 2020 |
Title | Bayesian Online Prediction of Change Points |
Journal/Conference/Book Title | Conference on Uncertainty in Artificial Intelligence (UAI) |
Reference Type | Conference Proceedings |
Author(s) | Laux, M.; Arenz, O.; Pajarinen, J.; Peters, J. |
Year | 2020 |
Title | Deep Adversarial Reinforcement Learning for Object Disentangling |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2020) |
Link to PDF | /uploads/Site/EditPublication/Melvin_Iros.pdf |
Reference Type | Journal Article |
Author(s) | Koert, D.; Kircher, M.; Salikutluk, V.; D'Eramo, C.; Peters, J. |
Year | 2020 |
Title | Multi-Channel Interactive Reinforcement Learning for Sequential Tasks |
Journal/Conference/Book Title | Frontiers in Robotics and AI (Human-Robot Interaction)
Keywords | SKILLS4ROBOTS, KOBO |
URL(s) | https://www.frontiersin.org/articles/10.3389/frobt.2020.00097/full |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DorotheaKoert/multi_channel_feedback_rl_sequential.pdf |
Reference Type | Online Database |
Author(s) | Arenz, O.; Neumann, G. |
Year | 2020 |
Title | Non-Adversarial Imitation Learning and its Connections to Adversarial Methods |
Journal/Conference/Book Title | arXiv |
Keywords | Imitation Learning, Inverse Reinforcement Learning, Non-Adversarial Imitation Learning, Adversarial Imitation Learning, AIRL |
Abstract | Many modern methods for imitation learning and inverse reinforcement learning, such as GAIL or AIRL, are based on an adversarial formulation. These methods apply GANs to match the expert's distribution over states and actions with the implicit state-action distribution induced by the agent's policy. However, by framing imitation learning as a saddle point problem, adversarial methods can suffer from unstable optimization, and convergence can only be shown for small policy updates. We address these problems by proposing a framework for non-adversarial imitation learning. The resulting algorithms are similar to their adversarial counterparts and, thus, provide insights for adversarial imitation learning methods. Most notably, we show that AIRL is an instance of our non-adversarial formulation, which enables us to greatly simplify its derivations and obtain stronger convergence guarantees. We also show that our non-adversarial formulation can be used to derive novel algorithms by presenting a method for offline imitation learning that is inspired by the recent ValueDice algorithm, but does not rely on small policy updates for convergence. In our simulated robot experiments, our offline method for non-adversarial imitation learning seems to perform best when using many updates for policy and discriminator at each iteration and outperforms behavioral cloning and ValueDice. |
Link to PDF | /uploads/Team/OlegArenz/nail_arxiv.pdf |
Reference Type | Unpublished Work |
Author(s) | Abi-Farraj, F.; Pacchierotti, C.; Arenz, O.; Neumann, G.; Giordano, P. |
Year | 2020 |
Title | Haptic-based Guided Grasping in a Cluttered Environment |
Journal/Conference/Book Title | IEEE Haptics Symposium |
Link to PDF | https://www.roboticvision.org/wp-content/uploads/Haptic-based-Guided-Grasping-in-a-Cluttered-Environment.pdf |
Reference Type | Conference Proceedings |
Author(s) | Keller, L.; Tanneberg, D.; Stark, S.; Peters, J. |
Year | 2020 |
Title | Model-Based Quality-Diversity Search for Efficient Robot Learning |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | https://arxiv.org/pdf/2008.04589.pdf |
Reference Type | Conference Proceedings |
Author(s) | Klink, P.; D'Eramo, C.; Peters, J.; Pajarinen, J. |
Year | 2020 |
Title | Self-Paced Deep Reinforcement Learning |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PascalKlink/neurips-2020-2.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ploeger, K.; Lutter, M.; Peters, J. |
Year | 2020 |
Title | High Acceleration Reinforcement Learning for Real-World Juggling with Binary Rewards |
Journal/Conference/Book Title | Proceedings of the 4th Conference on Robot Learning (CoRL) |
Reference Type | Conference Proceedings |
Author(s) | Urain, J.; Ginesi, M.; Tateo, D.; Peters, J. |
Year | 2020 |
Title | ImitationFlow: Learning Deep Stable Stochastic Dynamic Systems by Normalizing Flows |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Keywords | Movement Primitives, Imitation Learning |
Abstract | We introduce ImitationFlow, a novel deep generative model that allows learning complex globally stable, stochastic, nonlinear dynamics. Our approach extends the Normalizing Flows framework to learn stable Stochastic Differential Equations. We prove the Lyapunov stability for a class of Stochastic Differential Equations and we propose a learning algorithm to learn them from a set of demonstrated trajectories. Our model extends the set of stable dynamical systems that can be represented by state-of-the-art approaches, eliminates the Gaussian assumption on the demonstrations, and outperforms the previous algorithms in terms of representation accuracy. We show the effectiveness of our method with both standard datasets and a real robot experiment.
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JulenUrainDeJesus/2020iflowurain.pdf |
Reference Type | Conference Proceedings |
Author(s) | Abdulsamad, H.; Nickl, P.; Klink, P.; Peters, J. |
Year | 2020 |
Title | A Variational Infinite Mixture for Probabilistic Inverse Dynamics Learning |
Journal/Conference/Book Title | arXiv |
Link to PDF | https://arxiv.org/pdf/2011.05217.pdf |
Reference Type | Conference Proceedings |
Author(s) | Abdulsamad, H.; Peters, J. |
Year | 2020 |
Title | Learning Hybrid Dynamics and Control |
Journal/Conference/Book Title | ECML/PKDD Workshop on Deep Continuous-Discrete Machine Learning |
Reference Type | Journal Article |
Author(s) | Tanneberg, D.; Rueckert, E.; Peters, J. |
Year | 2020 |
Title | Evolutionary training and abstraction yields algorithmic generalization of neural computers |
Journal/Conference/Book Title | Nature Machine Intelligence |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
URL(s) | https://rdcu.be/caRlg |
Reference Type | Journal Article |
Author(s) | Veiga, F. F.; Akrour, R.; Peters, J. |
Year | 2020 |
Title | Hierarchical Tactile-Based Control Decomposition of Dexterous In-Hand Manipulation Tasks |
Journal/Conference/Book Title | Frontiers in Robotics and AI |
URL(s) | https://www.frontiersin.org/articles/10.3389/frobt.2020.521448/full |
Reference Type | Conference Proceedings |
Author(s) | Urain, J.; Tateo, D.; Ren, T.; Peters, J. |
Year | 2020 |
Title | Structured Policy Representation: Imposing Stability in Arbitrarily Conditioned Dynamic Systems
Journal/Conference/Book Title | NeurIPS 2020, 3rd Robot Learning Workshop |
Keywords | Movement Primitives, Imitation Learning, Inductive Bias |
Abstract | We present a new family of deep neural network-based dynamic systems. The presented dynamics are globally stable and can be conditioned with an arbitrary context state. We show how these dynamics can be used as structured robot policies. Global stability is one of the most important and straightforward inductive biases as it allows us to impose reasonable behaviors outside the region of the demonstrations. |
Pages | 7 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JulenUrainDeJesus/2020_structuredpolicy_urain.pdf |
Reference Type | Conference Proceedings |
Author(s) | Watson, J.; Imohiosen, A.; Peters, J.
Year | 2020 |
Title | Active Inference or Control as Inference? A Unifying View |
Journal/Conference/Book Title | International Workshop on Active Inference |
Reference Type | Journal Article |
Author(s) | Tanneberg, D.; Peters, J.; Rueckert, E. |
Year | 2019 |
Title | Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks |
Journal/Conference/Book Title | Neural Networks |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Volume | 109 |
Pages | 67-80 |
URL(s) | https://doi.org/10.1016/j.neunet.2018.10.005 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DanielTanneberg/tanneberg_NN18.pdf |
Reference Type | Journal Article |
Author(s) | Koc, O.; Peters, J. |
Year | 2019 |
Title | Learning to serve: an experimental study for a new learning from demonstrations framework |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (ICRA/RA-L), with Presentation at the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Learning_to_Serve_2019.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lauri, M.; Pajarinen, J.; Peters, J. |
Year | 2019 |
Title | Information gathering in decentralized POMDPs by policy graph improvement |
Journal/Conference/Book Title | Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS) |
Keywords | ROBOLEAP,SKILLS4ROBOTS |
Link to PDF | https://arxiv.org/pdf/1902.09840 |
Reference Type | Journal Article |
Author(s) | Brandherm, F.; Peters, J.; Neumann, G.; Akrour, R. |
Year | 2019 |
Title | Learning Replanning Policies with Direct Policy Search |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/RiadAkrour/florian_ral_sub.pdf |
Reference Type | Journal Article |
Author(s) | Gebhardt, G.H.W.; Kupcsik, A.; Neumann, G. |
Year | 2019 |
Title | The Kernel Kalman Rule |
Journal/Conference/Book Title | Machine Learning Journal (MLJ) |
Publisher | Springer US |
Volume | 108 |
Number | 12 |
Pages | 2113–2157 |
ISBN/ISSN | 0885-6125 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/GregorGebhardt/TheKernelKalmanRuleJournal.pdf |
Reference Type | Journal Article |
Author(s) | Parisi, S.; Tangkaratt, V.; Peters, J.; Khan, M. E. |
Year | 2019 |
Title | TD-Regularized Actor-Critic Methods |
Journal/Conference/Book Title | Machine Learning (MLJ) |
Volume | 108 |
Number | 8 |
Pages | 1467-1501 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SimoneParisi/parisi2019mlj.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Ritter, C.; Peters, J. |
Year | 2019 |
Title | Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning |
Journal/Conference/Book Title | International Conference on Learning Representations (ICLR) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/MichaelLutter/lutter_iclr_2019.pdf |
Reference Type | Journal Article |
Author(s) | Koc, O.; Maeda, G.; Peters, J. |
Year | 2019 |
Title | Optimizing the Execution of Dynamic Robot Movements with Learning Control |
Journal/Conference/Book Title | IEEE Transactions on Robotics |
Volume | 35 |
Number | 4 |
ISBN/ISSN | 1552-3098 |
Link to PDF | https://arxiv.org/pdf/1807.01918.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tosatto, S.; D'Eramo, C.; Pajarinen, J.; Restelli, M.; Peters, J. |
Year | 2019 |
Title | Exploration Driven By an Optimistic Bellman Equation |
Journal/Conference/Book Title | Proceedings of the International Joint Conference on Neural Networks (IJCNN) |
Keywords | exploration; reinforcement learning; intrinsic motivation; Bosch-Forschungstiftung |
Abstract | Exploring high-dimensional state spaces and finding sparse rewards are central problems in reinforcement learning. Exploration strategies are frequently either naïve (e.g., simplistic epsilon-greedy or Boltzmann policies), intractable (i.e., full Bayesian treatment of reinforcement learning) or rely heavily on heuristics. The lack of a tractable but principled exploration approach unnecessarily complicates the application of reinforcement learning to a broader range of problems. Efficient exploration can be accomplished by relying on the uncertainty of the state-action value function. To obtain the uncertainty, we maintain an ensemble of value function estimates and present an optimistic Bellman equation (OBE) for such ensembles. This OBE is derived from a relative entropy maximization principle and yields an implicit exploration bonus resulting in improved exploration during action selection. The implied exploration bonus can be seen as a well-principled type of intrinsic motivation and exhibits favorable theoretical properties. OBE can be applied to a wide range of algorithms. We propose two algorithms as an application of the principle: Optimistic Q-learning and Optimistic DQN, which outperform comparison methods on standard benchmarks. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SamueleTosatto/TosattoIJCNN2019.pdf |
Language | English |
Reference Type | Conference Proceedings |
Author(s) | Wibranek, B.; Belousov, B.; Sadybakasov, A.; Tessmann, O. |
Year | 2019 |
Title | Interactive Assemblies: Man-Machine Collaboration through Building Components for As-Built Digital Models |
Journal/Conference/Book Title | Computer-Aided Architectural Design Futures (CAAD Futures) |
Keywords | SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/wibranek_caad19.pdf |
Reference Type | Journal Article |
Author(s) | Abi Farraj, F.; Pacchierotti, C.; Arenz, O.; Neumann, G.; Giordano, P. |
Year | 2019 |
Title | A Haptic Shared-Control Architecture for Guided Multi-Target Robotic Grasping |
Journal/Conference/Book Title | IEEE Transactions on Haptics |
Keywords | Grasping, Task analysis, Manipulators, Grippers, Service robots |
Abstract | Although robotic telemanipulation has always been a key technology for the nuclear industry, little advancement has been seen over the last decades. Despite complex remote handling requirements, simple mechanically-linked master-slave manipulators still dominate the field. Nonetheless, there is a pressing need for more effective robotic solutions able to significantly speed up the decommissioning of legacy radioactive waste. This paper describes a novel haptic shared-control approach for assisting a human operator in the sort and segregation of different objects in a cluttered and unknown environment. A 3D scan of the scene is used to generate a set of potential grasp candidates on the objects at hand. These grasp candidates are then used to generate guiding haptic cues, which assist the operator in approaching and grasping the objects. The haptic feedback is designed to be smooth and continuous as the user switches from a grasp candidate to the next one, or from one object to another one, avoiding any discontinuity or abrupt changes. To validate our approach, we carried out two human-subject studies, enrolling 15 participants. We registered an average improvement of 20.8%, 20.1%, 32.5% in terms of completion time, linear trajectory, and perceived effectiveness, respectively, between the proposed approach and standard teleoperation. |
URL(s) | https://ieeexplore.ieee.org/document/8700204 |
Link to PDF | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8700204 |
Reference Type | Conference Proceedings |
Author(s) | Akrour, R.; Pajarinen, J.; Neumann, G.; Peters, J. |
Year | 2019 |
Title | Projections for Approximate Policy Iteration Algorithms |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Keywords | ROBOLEAP,SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/RiadAkrour/papi.pdf |
Reference Type | Conference Proceedings |
Author(s) | Becker-Ehmck, P.; Peters, J.; van der Smagt, P. |
Year | 2019 |
Title | Switching Linear Dynamics for Variational Bayes Filtering |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Link to PDF | https://arxiv.org/pdf/1905.12434.pdf |
Reference Type | Conference Proceedings |
Author(s) | Belousov, B.; Abdulsamad, H.; Schultheis, M.; Peters, J. |
Year | 2019 |
Title | Belief space model predictive control for approximately optimal system identification |
Journal/Conference/Book Title | 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Keywords | SKILLS4ROBOTS |
Abstract | The fundamental problem of reinforcement learning is to control a dynamical system whose properties are not fully known in advance. Many articles nowadays are addressing the issue of optimal exploration in this setting by investigating ideas such as curiosity, intrinsic motivation, empowerment, and others. Interestingly, closely related questions of optimal input design with the goal of producing the most informative system excitation have been studied in adjacent fields grounded in statistical decision theory. In the most general terms, the problem faced by a curious reinforcement learning agent can be stated as a sequential Bayesian optimal experimental design problem. It is well known that finding an optimal feedback policy for this type of setting is extremely hard and analytically intractable even for linear systems due to the non-linearity of the Bayesian filtering step. Therefore, approximations are needed. We consider one type of approximation based on replacing the feedback policy by repeated trajectory optimization in the belief space. By reasoning about the future uncertainty over the internal world model, the agent can decide what actions to take at every moment given its current belief and expected outcomes of future actions. Such an approach became computationally feasible relatively recently, thanks to advances in automatic differentiation. Being straightforward to implement, it can serve as a strong baseline for exploration algorithms in continuous robotic control tasks. Preliminary evaluations on a physical pendulum with unknown system parameters indicate that the proposed approach can infer the correct parameter values quickly and reliably, outperforming random excitation and naive sinusoidal excitation signals, and matching the performance of the best manually designed system identification controller based on the knowledge of the system dynamics. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/rldm19_belousov.pdf |
Reference Type | Journal Article |
Author(s) | Belousov, B.; Peters, J. |
Year | 2019 |
Title | Entropic Regularization of Markov Decision Processes |
Journal/Conference/Book Title | Entropy |
Keywords | SKILLS4ROBOTS |
Publisher | MDPI |
Volume | 21 |
Number | 7 |
ISBN/ISSN | 1099-4300 |
URL(s) | https://www.mdpi.com/1099-4300/21/7/674 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/entropy19_belousov.pdf |
Reference Type | Journal Article |
Author(s) | Pajarinen, J.; Thai, H.L.; Akrour, R.; Peters, J.; Neumann, G. |
Year | 2019 |
Title | Compatible natural gradient policy search |
Journal/Conference/Book Title | Machine Learning (MLJ) |
Keywords | ROBOLEAP,SKILLS4ROBOTS |
Publisher | Springer |
Volume | 108 |
Number | 8 |
Pages | 1443-1466 |
Date | September 2019 |
Link to PDF | https://link.springer.com/content/pdf/10.1007%2Fs10994-019-05807-0.pdf |
Reference Type | Journal Article |
Author(s) | Celemin, C.; Maeda, G.; Peters, J.; Ruiz-del-Solar, J.; Kober, J. |
Year | 2019 |
Title | Reinforcement Learning of Motor Skills using Policy Search and Human Corrective Advice |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Volume | 38 |
Number | 14 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Alumni/JensKober/IJRR__Revision_.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nass, D.; Belousov, B.; Peters, J. |
Year | 2019 |
Title | Entropic Risk Measure in Policy Search |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/iros19_nass_v2.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ozdenizci, O.; Meyer, T.; Wichmann, F.; Peters, J.; Schölkopf, B.; Cetin, M.; Grosse-Wentrup, M. |
Year | 2019 |
Title | Neural Signatures of Motor Skill in the Resting Brain |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC) |
Reference Type | Conference Proceedings |
Author(s) | Urain, J.; Peters, J. |
Year | 2019 |
Title | Generalized Multiple Correlation Coefficient as a Similarity Measurement between Trajectories |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Abstract | A similarity distance measure between two trajectories is an essential tool for understanding patterns in motion, for example, in Human-Robot Interaction or Imitation Learning. The problem has been addressed in many fields, from Signal Processing and Probability Theory to Topology and Statistics. However, up to now, none of the trajectory similarity metrics are invariant to all possible linear transformations of the trajectories (rotation, scaling, reflection, shear mapping or squeeze mapping). Moreover, not all of them are robust to noisy signals or fast enough for real-time trajectory classification. To overcome this limitation, this paper proposes a similarity distance metric that remains invariant under any possible linear transformation. Based on Pearson's Correlation Coefficient and the Coefficient of Determination, our similarity metric, the Generalized Multiple Correlation Coefficient (GMCC), is presented as the natural extension of the Multiple Correlation Coefficient. The motivation of this paper is two-fold: first, to introduce a new correlation metric with the best properties for computing similarities between trajectories invariant to linear transformations, and to compare it with state-of-the-art similarity distances; second, to present a natural way of integrating the similarity metric in an Imitation Learning scenario for clustering robot trajectories. |
Place Published | Macau, China |
Date | 2019 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JulenUrainDeJesus/julen_IROS_2019.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Peters, J. |
Year | 2019 |
Title | Deep Lagrangian Networks for end-to-end learning of energy-based control for under-actuated systems |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/MichaelLutter/IROS_2019_Final_DeLaN_Energy_Control.pdf |
Reference Type | Journal Article |
Author(s) | Koert, D.; Pajarinen, J.; Schotschneider, A.; Trick, S.; Rothkopf, C.; Peters, J. |
Year | 2019 |
Title | Learning Intention Aware Online Adaptation of Movement Primitives |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L), with presentation at the IEEE International Conference on Intelligent Robots and Systems (IROS) |
Keywords | SKILLS4ROBOTS, KOBO |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DorotheaKoert/final_ral_2019_koert.pdf |
Reference Type | Conference Proceedings |
Author(s) | Celik, O.; Abdulsamad, H.; Peters, J. |
Year | 2019 |
Title | Chance-Constrained Trajectory Optimization for Nonlinear Systems with Unknown Stochastic Dynamics |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Abstract | Iterative trajectory optimization techniques for non-linear dynamical systems are among the most powerful and sample-efficient methods of model-based reinforcement learning and approximate optimal control. By leveraging time-variant local linear-quadratic approximations of system dynamics and rewards, such methods are able to find both a target-optimal trajectory and time-variant optimal feedback controllers. However, the local linear-quadratic approximations are a major source of optimization bias that leads to catastrophic greedy updates, raising the issue of proper regularization. Moreover, the approximate models’ disregard for any physical state-action limits of the system causes further aggravation of the problem, as the optimization moves towards unreachable areas of the state-action space. In this paper, we address these drawbacks in the scenario of online-fitted stochastic dynamics. We propose modeling state and action physical limits as probabilistic chance constraints and introduce a new trajectory optimization technique that integrates such probabilistic constraints by optimizing a relaxed quadratic program. Our empirical evaluations show a significant improvement in the robustness of the learning process, which enables our approach to perform more effective updates, and avoid premature convergence observed in other state-of-the-art techniques. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/HanyAbdulsamad/celik2019chance.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Peters, J. |
Year | 2019 |
Title | Deep Optimal Control: Using the Euler-Lagrange Equation to learn an Optimal Feedback Control Law |
Journal/Conference/Book Title | 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/MichaelLutter/RLDM2019_Deep_Optimal_Control.pdf |
Reference Type | Conference Proceedings |
Author(s) | Trick, S.; Koert, D.; Peters, J.; Rothkopf, C. |
Year | 2019 |
Title | Multimodal Uncertainty Reduction for Intention Recognition in Human-Robot Interaction |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | SKILLS4ROBOTS, KOBO |
Link to PDF | https://arxiv.org/pdf/1907.02426.pdf |
Reference Type | Conference Proceedings |
Author(s) | Stark, S.; Peters, J.; Rueckert, E. |
Year | 2019 |
Title | Experience Reuse with Probabilistic Movement Primitives |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | GOAL-Robots |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SvenjaStark/stark_iros2019_update.pdf |
Reference Type | Conference Proceedings |
Author(s) | Liu, Z.; Hitzmann, A.; Ikemoto, S.; Stark, S.; Peters, J.; Hosoda, K. |
Year | 2019 |
Title | Local Online Motor Babbling: Learning Motor Abundance of a Musculoskeletal Robot Arm |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Abstract | Motor babbling and goal babbling have been used for sensorimotor learning of highly redundant systems in soft robotics. Recent work in goal babbling has demonstrated successful learning of inverse kinematics (IK) on such systems, and suggests that babbling in the goal space better resolves motor redundancy by learning as few yet efficient sensorimotor mappings as possible. However, for musculoskeletal robot systems, motor redundancy can provide useful information to explain muscle activation patterns, thus the term motor abundance. In this work, we introduce some simple heuristics to empirically define the unknown goal space, and learn the IK of a 10 DoF musculoskeletal robot arm using directed goal babbling. We then further propose local online motor babbling guided by Covariance Matrix Adaptation Evolution Strategy (CMA-ES), which bootstraps on the goal babbling samples for initialization, such that motor abundance can be queried online for any static goal. Our approach leverages the resolving of redundancies and the efficient guided exploration of motor abundance in two stages of learning, allowing both kinematic accuracy and motor variability at the queried goal. The result shows that local online motor babbling guided by CMA-ES can efficiently explore motor abundance on musculoskeletal robot systems and gives useful insights in terms of muscle stiffness and synergy. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SvenjaStark/Liu_IROS_2019.pdf |
Reference Type | Conference Proceedings |
Author(s) | Belousov, B.; Sadybakasov, A.; Wibranek, B.; Veiga, F.F.; Tessmann, O.; Peters, J. |
Year | 2019 |
Title | Building a Library of Tactile Skills Based on FingerVision |
Journal/Conference/Book Title | Proceedings of the 2019 IEEE-RAS 19th International Conference on Humanoid Robots (Humanoids) |
Keywords | SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/belousov19_fingervision.pdf |
Reference Type | Conference Proceedings |
Author(s) | Schultheis, M.; Belousov, B.; Abdulsamad, H.; Peters, J. |
Year | 2019 |
Title | Receding Horizon Curiosity |
Journal/Conference/Book Title | Proceedings of the 3rd Conference on Robot Learning (CoRL) |
Keywords | SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/schultheis19_rhc.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lutter, M.; Belousov, B.; Listmann, K.; Clever, D.; Peters, J. |
Year | 2019 |
Title | HJB Optimal Feedback Control with Deep Differential Value Functions and Action Constraints |
Journal/Conference/Book Title | Proceedings of the 3rd Conference on Robot Learning (CoRL) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/MichaelLutter/CoRL2019_Deep_Optimal_HJB_Control.pdf |
Reference Type | Journal Article |
Author(s) | Ewerton, M.; Arenz, O.; Maeda, G.; Koert, D.; Kolev, Z.; Takahashi, M.; Peters, J. |
Year | 2019 |
Title | Learning Trajectory Distributions for Assisted Teleoperation and Path Planning |
Journal/Conference/Book Title | Frontiers in Robotics and AI |
URL(s) | https://www.frontiersin.org/articles/10.3389/frobt.2019.00089/full |
Link to PDF | https://www.frontiersin.org/articles/10.3389/frobt.2019.00089/full |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Maeda, G.; Koert, D.; Kolev, Z.; Takahashi, M.; Peters, J. |
Year | 2019 |
Title | Reinforcement Learning of Trajectory Distributions: Applications in Assisted Teleoperation and Motion Planning |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Abstract | The majority of learning from demonstration approaches do not address suboptimal demonstrations or cases when drastic changes in the environment occur after the demonstrations were made. For example, in real teleoperation tasks, the demonstrations provided by the user are often suboptimal due to interface and hardware limitations. In tasks involving co-manipulation and manipulation planning, the environment often changes due to unexpected obstacles rendering previous demonstrations invalid. This paper presents a reinforcement learning algorithm that exploits the use of relevance functions to tackle such problems. This paper introduces the Pearson correlation as a measure of the relevance of policy parameters with regard to each of the components of the cost function to be optimized. The method is demonstrated in a static environment where the quality of the teleoperation is compromised by the visual interface (operating a robot in a three-dimensional task by using a simple 2D monitor). Afterward, we tested the method on a dynamic environment using a real 7-DoF robot arm where distributions are computed online via Gaussian Process regression. |
Place Published | Macau, China |
Pages | 4294-4300 |
Date | November 4-8, 2019 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Member/PubMarcoEwerton/Ewerton_IROS_2019.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wibranek, B.; Belousov, B.; Sadybakasov, A.; Peters, J.; Tessmann, O. |
Year | 2019 |
Title | Interactive Structure: Robotic Repositioning of Vertical Elements in Man-Machine Collaborative Assembly through Vision-Based Tactile Sensing |
Journal/Conference/Book Title | Proceedings of the 37th eCAADe and 23rd SIGraDi Conference |
Keywords | SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/wibranek_sigradi19.pdf |
Reference Type | Conference Proceedings |
Author(s) | Klink, P.; Abdulsamad, H.; Belousov, B.; Peters, J. |
Year | 2019 |
Title | Self-Paced Contextual Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the 3rd Conference on Robot Learning (CoRL) |
Abstract | Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots. Although contextual reinforcement learning provides a principled framework for learning and generalization of behaviors across related tasks, it generally relies on uninformed sampling of environments from an unknown, uncontrolled context distribution, thus missing the benefits of structured, sequential learning. We introduce a novel relative entropy reinforcement learning algorithm that gives the agent the freedom to control the intermediate task distribution, allowing for its gradual progression towards the target context distribution. Empirical evaluation shows that the proposed curriculum learning scheme drastically improves sample efficiency and enables learning in scenarios with both broad and sharp target context distributions in which classical approaches perform sub-optimally. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/PascalKlink/sprl.pdf |
Reference Type | Conference Paper |
Author(s) | Watson, J.; Abdulsamad, H.; Peters, J. |
Year | 2019 |
Title | Stochastic Optimal Control as Approximate Input Inference |
Journal/Conference/Book Title | Proceedings of the 3rd Conference on Robot Learning (CoRL) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JoeWatson/Watson19I2c.pdf |
Reference Type | Conference Paper |
Author(s) | Abdulsamad, H.; Naveh, K.; Peters, J. |
Year | 2019 |
Title | Model-Based Relative Entropy Policy Search for Stochastic Hybrid Systems |
Journal/Conference/Book Title | 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Reference Type | Journal Article |
Author(s) | Gomez Gonzalez, S.; Nemmour, Y.; Schoelkopf, B.; Peters, J. |
Year | 2019 |
Title | Reliable Real Time Ball Tracking for Robot Table Tennis |
Journal/Conference/Book Title | Robotics |
Volume | 8 |
Pages | 90 |
Number | 4 |
URL(s) | https://www.mdpi.com/2218-6581/8/4/90 |
Reference Type | Journal Article |
Author(s) | Schuermann, T.; Mohler, B.J.; Peters, J.; Beckerle, P. |
Year | 2019 |
Title | How Cognitive Models of Human Body Experience Might Push Robotics |
Journal/Conference/Book Title | Frontiers in Neurorobotics |
Reference Type | Conference Proceedings |
Author(s) | Delfosse, Q.; Stark, S.; Tanneberg, D.; Santucci, V. G.; Peters, J. |
Year | 2019 |
Title | Open-Ended Learning of Grasp Strategies using Intrinsically Motivated Self-Supervision |
Journal/Conference/Book Title | Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/SvenjaStark/delfosse_iros2019.pdf |
Reference Type | Conference Paper |
Author(s) | Muratore, F.; Gienger, M.; Peters, J. |
Year | 2019 |
Title | Assessing Transferability in Reinforcement Learning from Randomized Simulations |
Journal/Conference/Book Title | 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Keywords | domain randomization, simulation optimization, sim-2-real |
Abstract | Exploration-based reinforcement learning of control policies on physical systems is generally time-intensive and can lead to catastrophic failures. Therefore, simulation-based policy search appears to be an appealing alternative. Unfortunately, running policy search on a slightly faulty simulator can easily lead to the maximization of the ‘Simulation Optimization Bias’ (SOB), where the policy exploits modeling errors of the simulator such that the resulting behavior can potentially damage the device. For this reason, much work in reinforcement learning has focused on model-free methods. The resulting lack of safe simulation-based policy learning techniques imposes severe limitations on the application of reinforcement learning to real-world systems. In this paper, we explore how physics simulations can be utilized for a robust policy optimization by randomizing the simulator’s parameters and training from model ensembles. We propose an algorithm called Simulation-based Policy Optimization with Transferability Assessment (SPOTA) that uses an estimator of the SOB to formulate a stopping criterion for training. We show that the simulation-based policy search algorithm is able to learn a control policy exclusively from a randomized simulator that can be applied directly to a different system without using any data from the latter. |
Language | English |
Reference Type | Conference Paper |
Author(s) | Tanneberg, D.; Rueckert, E.; Peters, J. |
Year | 2019 |
Title | Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer Architecture |
Journal/Conference/Book Title | arXiv |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
URL(s) | https://arxiv.org/pdf/1911.00926.pdf |
Reference Type | Conference Paper |
Author(s) | Klink, P.; Peters, J. |
Year | 2019 |
Title | Measuring Similarities between Markov Decision Processes |
Journal/Conference/Book Title | 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) |
Reference Type | Journal Article |
Author(s) | Kroemer, O.; Leischnig, S.; Luettgen, S.; Peters, J. |
Year | 2018 |
Title | A Kernel-based Approach to Learning Contact Distributions for Robot Manipulation Tasks |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Volume | 42 |
Number | 3 |
Pages | 581-600 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Alumni/OliverKroemer/KroemerAuRo17Updated2.pdf |
Reference Type | Journal Article |
Author(s) | Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Using Probabilistic Movement Primitives in Robotics |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Volume | 42 |
Number | 3 |
Pages | 529-551 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/AlexandrosParaschos/promps_auro.pdf |
Reference Type | Journal Article |
Author(s) | Yi, Z.; Zhang, Y.; Peters, J. |
Year | 2018 |
Title | Biomimetic Tactile Sensors and Signal Processing with Spike Trains: A Review |
Journal/Conference/Book Title | Sensors and Actuators A: Physical |
Volume | 269 |
Pages | 41-52 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/SNA2018yi.pdf |
Reference Type | Journal Article |
Author(s) | Paraschos, A.; Rueckert, E.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Probabilistic Movement Primitives under Unknown System Dynamics |
Journal/Conference/Book Title | Advanced Robotics (ARJ) |
Volume | 32 |
Number | 6 |
Pages | 297-310 |
Link to PDF | https://www.ias.tu-darmstadt.de/uploads/Alumni/AlexandrosParaschos/Paraschos_AR_2018.pdf |
Reference Type | Journal Article |
Author(s) | Manschitz, S.; Gienger, M.; Kober, J.; Peters, J. |
Year | 2018 |
Title | Mixture of Attractors: A novel Movement Primitive Representation for Learning Motor Skills from Demonstrations |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Volume | 3 |
Number | 2 |
Pages | 926-933 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/SimonManschitz/ManschitzRAL2018.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lioutikov, R.; Maeda, G.; Veiga, F.F.; Kersting, K.; Peters, J. |
Year | 2018 |
Title | Inducing Probabilistic Context-Free Grammars for the Sequencing of Robot Movement Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-Hand |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/RudolfLioutikov/lioutikov_movement_pcfg_icra2018.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gebhardt, G.H.W.; Daun, K.; Schnaubelt, M.; Neumann, G. |
Year | 2018 |
Title | Learning Robust Policies for Object Manipulation with Robot Swarms |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | swarm robotics, policy search, kernel methods, kilobots |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/GregorGebhardt/LearningRobustPoliciesForObjectManipulationWithRobotSwarms.pdf |
Reference Type | Journal Article |
Author(s) | Vinogradska, J.; Bischoff, B.; Peters, J. |
Year | 2018 |
Title | Approximate Value Iteration based on Numerical Quadrature |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation, and IEEE Robotics and Automation Letters (RA-L) |
Volume | 3 |
Number | 2 |
Pages | 1330-1337 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NQVI_RAL_manuscript.pdf |
Reference Type | Conference Proceedings |
Author(s) | Pinsler, R.; Akrour, R.; Osa, T.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Sample and Feedback Efficient Hierarchical Reinforcement Learning from Human Preferences |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | IAS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/RiadAkrour/icra18_robert.pdf |
Reference Type | Conference Proceedings |
Author(s) | Koert, D.; Maeda, G.; Neumann, G.; Peters, J. |
Year | 2018 |
Title | Learning Coupled Forward-Inverse Models with Combined Prediction Errors |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-Hand,SKILLS4ROBOTS |
Abstract | Challenging tasks in unstructured environments require robots to learn complex models. Given a large amount of information, learning multiple simple models can offer an efficient alternative to a monolithic complex network. Training multiple models, that is, learning their parameters and their responsibilities, has been shown to be prohibitively hard as optimization is prone to local minima. To efficiently learn multiple models for different contexts, we thus develop a new algorithm based on expectation maximization (EM). In contrast to comparable concepts, this algorithm trains multiple modules of paired forward-inverse models by using the prediction errors of both forward and inverse models simultaneously. In particular, we show that our method yields a substantial improvement over only considering the errors of the forward models on tasks where the inverse space contains multiple solutions. |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/DorotheaKoert/cfim_final.pdf |
Reference Type | Journal Article |
Author(s) | Osa, T.; Pajarinen, J.; Neumann, G.; Bagnell, J.A.; Abbeel, P.; Peters, J. |
Year | 2018 |
Title | An Algorithmic Perspective on Imitation Learning |
Journal/Conference/Book Title | Foundations and Trends in Robotics |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/1811.06711 |
Reference Type | Journal Article |
Author(s) | Veiga, F.; Peters, J.; Hermans, T. |
Year | 2018 |
Title | Grip Stabilization of Novel Objects using Slip Prediction |
Journal/Conference/Book Title | IEEE Transactions on Haptics |
Volume | 11 |
Number | 4 |
Pages | 531-542 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/veigaToH2018.pdf |
Reference Type | Journal Article |
Author(s) | Koc, O.; Maeda, G.; Peters, J. |
Year | 2018 |
Title | Online optimal trajectory generation for robot table tennis |
Journal/Conference/Book Title | Robotics and Autonomous Systems (RAS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Online_optimal_trajectory_generation.pdf |
Reference Type | Journal Article |
Author(s) | Ewerton, M.; Rother, D.; Weimar, J.; Kollegger, G.; Wiemeyer, J.; Peters, J.; Maeda, G. |
Year | 2018 |
Title | Assisting Movement Training and Execution with Visual and Haptic Feedback |
Journal/Conference/Book Title | Frontiers in Neurorobotics |
Keywords | 3rd-Hand, BIMROB, RoMaNS, SKILLS4ROBOTS, NEDO |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/fnbot-12-00024.pdf |
Reference Type | Conference Proceedings |
Author(s) | Belousov, B.; Peters, J. |
Year | 2018 |
Title | Entropic Regularization of Markov Decision Processes |
Journal/Conference/Book Title | 38th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering |
Keywords | reinforcement learning; actor-critic methods; entropic proximal mappings; policy search |
Abstract | The problem of synthesis of an optimal feedback controller for a given Markov decision process (MDP) can in principle be solved by value iteration or policy iteration. However, if system dynamics and the reward function are unknown, the only way for a learning agent to discover an optimal controller is through interaction with the MDP. During data gathering, it is crucial to account for the lack of information, because otherwise ignorance will push the agent towards dangerous areas of the state space. To prevent such behavior and smooth learning dynamics, prior works proposed to bound the information loss measured by the Kullback-Leibler (KL) divergence at every policy improvement step. In this paper, we consider a broader family of f-divergences that preserve the beneficial property of the KL divergence of providing the policy improvement step in closed form accompanied by a compatible dual objective for policy evaluation. Such an entropic proximal policy optimization view gives a unified perspective on compatible actor-critic architectures. In particular, common least squares value function fitting coupled with advantage-weighted maximum likelihood policy estimation is shown to correspond to the Pearson χ²-divergence penalty. Other connections can be established by considering different choices of the penalty generator function f. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/maxent18_belousov.pdf |
Reference Type | Conference Proceedings |
Author(s) | Parmas, P.; Doya, K.; Rasmussen, C.; Peters, J. |
Year | 2018 |
Title | PIPPS: Flexible Model-Based Policy Search Robust to the Curse of Chaos |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Reference Type | Conference Proceedings |
Author(s) | Arenz, O.; Zhong, M.; Neumann, G. |
Year | 2018 |
Title | Efficient Gradient-Free Variational Inference using Policy Search |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Keywords | Variational Inference, Policy Search, Sampling |
Abstract | Inference from complex distributions is a common problem in machine learning needed for many Bayesian methods. We propose an efficient, gradient-free method for learning general GMM approximations of multimodal distributions based on recent insights from stochastic search methods. Our method establishes information-geometric trust regions to ensure efficient exploration of the sampling space and stability of the GMM updates, allowing for efficient estimation of multi-variate Gaussian variational distributions. For GMMs, we apply a variational lower bound to decompose the learning objective into sub-problems given by learning the individual mixture components and the coefficients. The number of mixture components is adapted online in order to allow for arbitrary exact approximations. We demonstrate on several domains that we can learn significantly better approximations than competing variational inference methods and that the quality of samples drawn from our approximations is on par with samples created by state-of-the-art MCMC samplers that require significantly more computational resources. |
Editor(s) | Dy, Jennifer and Krause, Andreas |
Publisher | PMLR |
Volume | 80 |
Pages | 234-243 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/OlegArenz/VIPS_full.pdf |
Reference Type | Journal Article |
Author(s) | Buechler, D.; Calandra, R.; Schoelkopf, B.; Peters, J. |
Year | 2018 |
Title | Control of Musculoskeletal Systems using Learned Dynamics Models |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters, and IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | https://ei.is.tuebingen.mpg.de/uploads_file/attachment/attachment/422/RAL18final.pdf |
Reference Type | Journal Article |
Author(s) | Sosic, A.; Rueckert, E.; Peters, J.; Zoubir, A.M.; Koeppl, H. |
Year | 2018 |
Title | Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Volume | 19 |
Number | 69 |
Pages | 1-45 |
Reference Type | Conference Proceedings |
Author(s) | Akrour, R.; Veiga, F.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Regularizing Reinforcement Learning with State Abstraction |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/RiadAkrour/iros18_riad.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gondaliya, K.D.; Peters, J.; Rueckert, E. |
Year | 2018 |
Title | Learning to Categorize Bug Reports with LSTM Networks |
Journal/Conference/Book Title | Proceedings of the International Conference on Advances in System Testing and Validation Lifecycle |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VALID2018Gondaliya.pdf |
Reference Type | Journal Article |
Author(s) | Osa, T.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Hierarchical Reinforcement Learning of Multiple Grasping Strategies with Human Instructions |
Journal/Conference/Book Title | Advanced Robotics |
Volume | 32 |
Number | 18 |
Pages | 955-968 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/advanced_roboitcs_18osa.pdf |
Reference Type | Conference Paper |
Author(s) | Muratore, F.; Treede, F.; Gienger, M.; Peters, J. |
Year | 2018 |
Title | Domain Randomization for Simulation-Based Policy Optimization with Transferability Assessment |
Journal/Conference/Book Title | Conference on Robot Learning (CoRL) |
Keywords | domain randomization, simulation optimization, sim-2-real |
Abstract | Exploration-based reinforcement learning on real robot systems is generally time-intensive and can lead to catastrophic robot failures. Therefore, simulation-based policy search appears to be an appealing alternative. Unfortunately, running policy search on a slightly faulty simulator can easily lead to the maximization of the ‘Simulation Optimization Bias’ (SOB), where the policy exploits modeling errors of the simulator such that the resulting behavior can potentially damage the robot. For this reason, much work in robot reinforcement learning has focused on model-free methods that learn on real-world systems. The resulting lack of safe simulation-based policy learning techniques imposes severe limitations on the application of robot reinforcement learning. In this paper, we explore how physics simulations can be utilized for a robust policy optimization by perturbing the simulator’s parameters and training from model ensembles. We propose a new algorithm called Simulation-based Policy Optimization with Transferability Assessment (SPOTA) that uses a biased estimator of the SOB to formulate a stopping criterion for training. We show that the new simulation-based policy search algorithm is able to learn a control policy exclusively from a randomized simulator that can be applied directly to a different system without using any data from the latter. |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Muratore_Treede_Gienger_Peters--SPOTA_CoRL2018.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/FabioMuratore/Muratore_Treede_Gienger_Peters--SPOTA_CoRL2018.pdf |
Language | English |
Reference Type | Conference Proceedings |
Author(s) | Koert, D.; Trick, S.; Ewerton, M.; Lutter, M.; Peters, J. |
Year | 2018 |
Title | Online Learning of an Open-Ended Skill Library for Collaborative Tasks |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | SKILLS4ROBOTS, KOBO |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DorotheaKoert/incremental_promp_2018.pdf |
Reference Type | Conference Paper |
Author(s) | Akrour, R.; Peters, J.; Neumann, G. |
Year | 2018 |
Title | Constraint-Space Projection Direct Policy Search |
Journal/Conference/Book Title | European Workshops on Reinforcement Learning (EWRL) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/RiadAkrour/ewrl18_riad.pdf |
Reference Type | Conference Proceedings |
Author(s) | Hoelscher, J.; Koert, D.; Peters, J.; Pajarinen, J. |
Year | 2018 |
Title | Utilizing Human Feedback in POMDP Execution and Specification |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | ROMANS,SKILLS4ROBOTS,ROBOLEAP |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/DorotheaKoert/pomdp_user_interaction_2018.pdf |
Reference Type | Conference Proceedings |
Author(s) | Belousov, B.; Peters, J. |
Year | 2018 |
Title | Mean squared advantage minimization as a consequence of entropic policy improvement regularization |
Journal/Conference/Book Title | European Workshops on Reinforcement Learning (EWRL) |
Keywords | policy optimization, entropic proximal mappings, actor-critic algorithms |
Abstract | Policy improvement regularization with entropy-like f-divergence penalties provides a unifying perspective on actor-critic algorithms, rendering policy improvement and policy evaluation steps as primal and dual subproblems of the same optimization problem. For small policy improvement steps, we show that all f-divergences with twice differentiable generator function f yield a mean squared advantage minimization objective for the policy evaluation step and an advantage-weighted maximum log-likelihood objective for the policy improvement step. The mean squared advantage objective fits in-between the well-known mean squared Bellman error and the mean squared temporal difference error objectives, requiring only the expectation of the temporal difference error with respect to the next state and not the policy, in contrast to the Bellman error, which requires both, and the temporal difference error, which requires none. The advantage-weighted maximum log-likelihood policy improvement rule emerges as a linear approximation to a more general weighting scheme where weights are a monotone function of the advantage. Thus, the entropic policy regularization framework provides a rigorous justification for the common practice of least squares value function fitting accompanied by advantage-weighted maximum log-likelihood policy parameter estimation, at the same time pointing in the direction in which this classical actor-critic approach can be extended. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/BorisBelousov/ewrl18_belousov.pdf |
Reference Type | Unpublished Work |
Author(s) | Pinsler, R.; Maag, M.; Arenz, O.; Neumann, G. |
Year | 2018 |
Title | Inverse Reinforcement Learning of Bird Flocking Behavior |
Journal/Conference/Book Title | Swarms: From Biology to Robotics and Back (ICRA Workshop) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/OlegArenz/PinslerEtAl_ICRA2018swarms.pdf |
Reference Type | Journal Article |
Author(s) | Kupcsik, A.G.; Deisenroth, M.P.; Peters, J.; Ai Poh, L.; Vadakkepat, V.; Neumann, G. |
Year | 2017 |
Title | Model-based Contextual Policy Search for Data-Efficient Generalization of Robot Skills |
Journal/Conference/Book Title | Artificial Intelligence |
Keywords | ComPLACS |
Volume | 247 |
Pages | 415-439 |
Date | June 2017 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kupcsik_AIJ_2015.pdf |
Reference Type | Journal Article |
Author(s) | Wang, Z.; Boularias, A.; Muelling, K.; Schoelkopf, B.; Peters, J. |
Year | 2017 |
Title | Anticipatory Action Selection for Human-Robot Table Tennis |
Journal/Conference/Book Title | Artificial Intelligence |
Volume | 247 |
Pages | 399-414 |
Date | June 2017 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Anticipatory_Action_Selection.pdf |
Reference Type | Journal Article |
Author(s) | Maeda, G.; Neumann, G.; Ewerton, M.; Lioutikov, R.; Kroemer, O.; Peters, J. |
Year | 2017 |
Title | Probabilistic Movement Primitives for Coordination of Multiple Human-Robot Collaborative Tasks |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Keywords | 3rd-Hand, BIMROB |
Volume | 41 |
Number | 3 |
Pages | 593-612 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/gjm_2016_AURO_c.pdf |
Reference Type | Journal Article |
Author(s) | Parisi, S.; Pirotta, M.; Peters, J. |
Year | 2017 |
Title | Manifold-based Multi-objective Policy Search with Sample Reuse |
Journal/Conference/Book Title | Neurocomputing |
Keywords | multi-objective, reinforcement learning, policy search, black-box optimization, importance sampling |
Volume | 263 |
Pages | 3-14 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/parisi_neurocomp_morl.pdf |
Reference Type | Book Section |
Author(s) | Peters, J.; Lee, D.; Kober, J.; Nguyen-Tuong, D.; Bagnell, J.; Schaal, S. |
Year | 2017 |
Title | Chapter 15: Robot Learning |
Journal/Conference/Book Title | Springer Handbook of Robotics, 2nd Edition |
Publisher | Springer International Publishing |
Pages | 357-394 |
Reference Type | Journal Article |
Author(s) | Maeda, G.; Ewerton, M.; Neumann, G.; Lioutikov, R.; Peters, J. |
Year | 2017 |
Title | Phase Estimation for Fast Action Recognition and Trajectory Generation in Human-Robot Collaboration |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | 3rd-Hand, BIMROB |
Volume | 36 |
Number | 13-14 |
Pages | 1579-1594 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/phase_estim_IJRR.pdf |
Reference Type | Journal Article |
Author(s) | Padois, V.; Ivaldi, S.; Babič, J.; Mistry, M.; Peters, J.; Nori, F. |
Year | 2017 |
Title | Whole-body multi-contact motion in humans and humanoids |
Journal/Conference/Book Title | Robotics and Autonomous Systems |
Volume | 90 |
Pages | 97-117 |
Date | April 2017 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ias_padois_et_al_revised_finalised.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tangkaratt, V.; van Hoof, H.; Parisi, S.; Neumann, G.; Peters, J.; Sugiyama, M. |
Year | 2017 |
Title | Policy Search with High-Dimensional Context Variables |
Journal/Conference/Book Title | Proceedings of the AAAI Conference on Artificial Intelligence (AAAI) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/tangkaratt2017policy.pdf |
Reference Type | Journal Article |
Author(s) | Ivaldi, S.; Lefort, S.; Peters, J.; Chetouani, M.; Provasi, J.; Zibetti, E. |
Year | 2017 |
Title | Towards Engagement Models that Consider Individual Factors in HRI: On the Relation of Extroversion and Negative Attitude Towards Robots to Gaze and Speech During a Human-Robot Assembly Task |
Journal/Conference/Book Title | International Journal of Social Robotics |
Volume | 9 |
Number | 1 |
Pages | 63-86 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IJSR_edhhi.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gebhardt, G.H.W.; Kupcsik, A.G.; Neumann, G. |
Year | 2017 |
Title | The Kernel Kalman Rule - Efficient Nonparametric Inference with Recursive Least Squares |
Journal/Conference/Book Title | Proceedings of the National Conference on Artificial Intelligence (AAAI) |
Abstract | Nonparametric inference techniques provide promising tools for probabilistic reasoning in high-dimensional nonlinear systems. Most of these techniques embed distributions into reproducing kernel Hilbert spaces (RKHS) and rely on the kernel Bayes’ rule (KBR) to manipulate the embeddings. However, the computational demands of the KBR scale poorly with the number of samples and the KBR often suffers from numerical instabilities. In this paper, we present the kernel Kalman rule (KKR) as an alternative to the KBR. The derivation of the KKR is based on recursive least squares, inspired by the derivation of the Kalman innovation update. We apply the KKR to filtering tasks where we use RKHS embeddings to represent the belief state, resulting in the kernel Kalman filter (KKF). We show on a nonlinear state estimation task with high dimensional observations that our approach provides a significantly improved estimation accuracy while the computational demands are significantly decreased. |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/GregorGebhardt/TheKernelKalmanRule.pdf |
Reference Type | Journal Article |
Author(s) | Yi, Z.; Zhang, Y.; Peters, J. |
Year | 2017 |
Title | Bioinspired Tactile Sensor for Surface Roughness Discrimination |
Journal/Conference/Book Title | Sensors and Actuators A: Physical |
Volume | 255 |
Pages | 46-53 |
Date | 1 March 2017 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bioinspired_tactile_sensor.pdf |
Reference Type | Journal Article |
Author(s) | Osa, T.; Ghalamzan, E. A. M.; Stolkin, R.; Lioutikov, R.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | Guiding Trajectory Optimization by Demonstrated Distributions |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Publisher | IEEE |
Volume | 2 |
Number | 2 |
Pages | 819-826 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Osa_RAL_2017.pdf |
Reference Type | Journal Article |
Author(s) | Kroemer, O.; Peters, J. |
Year | 2017 |
Title | A Comparison of Autoregressive Hidden Markov Models for Multi-Modal Manipulations with Variable Masses |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation, and IEEE Robotics and Automation Letters (RA-L) |
Keywords | 3rd-Hand, TACMAN |
Volume | 2 |
Number | 2 |
Pages | 1101 - 1108 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Kroemer_RAL_2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Kollegger, G.; Maeda, G.; Wiemeyer, J.; Peters, J. |
Year | 2017 |
Title | Iterative Feedback-basierte Korrekturstrategien beim Bewegungslernen von Mensch-Roboter-Dyaden [Iterative feedback-based correction strategies in movement learning of human-robot dyads] |
Journal/Conference/Book Title | DVS Sportmotorik 2017 |
Keywords | BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_motorik2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kollegger, G.; Reinhardt, N.; Ewerton, M.; Peters, J.; Wiemeyer, J. |
Year | 2017 |
Title | Die Bedeutung der Beobachtungsperspektive beim Bewegungslernen von Mensch-Roboter-Dyaden [The role of the observation perspective in movement learning of human-robot dyads] |
Journal/Conference/Book Title | DVS Sportmotorik 2017 |
Keywords | BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/kollegger_motorik2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wiemeyer, J.; Peters, J.; Kollegger, G.; Ewerton, M. |
Year | 2017 |
Title | BIMROB – Bidirektionale Interaktion von Mensch und Roboter beim Bewegungslernen [BIMROB – bidirectional interaction of human and robot in movement learning] |
Journal/Conference/Book Title | DVS Sportmotorik 2017 |
Keywords | BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/wiemeyer_motorik2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Farraj, F. B.; Osa, T.; Pedemonte, N.; Peters, J.; Neumann, G.; Giordano, P.R. |
Year | 2017 |
Title | A Learning-based Shared Control Architecture for Interactive Task Execution |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/firas_ICRA17.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wilbers, D.; Lioutikov, R.; Peters, J. |
Year | 2017 |
Title | Context-Driven Movement Primitive Adaptation |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-Hand |
Link to PDF | /uploads/Member/PubRudolfLioutikov/wilbers_icra_2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | End, F.; Akrour, R.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | Layered Direct Policy Search for Learning Hierarchical Skills |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Research/Overview/icra17_felix.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gabriel, A.; Akrour, R.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | Empowered Skills |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | IAS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Research/Overview/icra17_alex.pdf |
Reference Type | Conference Proceedings |
Author(s) | Abdulsamad, H.; Arenz, O.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | State-Regularized Policy Search for Linearized Dynamical Systems |
Journal/Conference/Book Title | Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Abdulsamad_ICAPS_2017.pdf |
Reference Type | Journal Article |
Author(s) | Lioutikov, R.; Neumann, G.; Maeda, G.; Peters, J. |
Year | 2017 |
Title | Learning Movement Primitive Libraries through Probabilistic Segmentation |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | 3rd-hand |
Volume | 36 |
Number | 8 |
Pages | 879-894 |
Link to PDF | /uploads/Publications/lioutikov_probs_ijrr2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Fiebig, K.H.; Jayaram, V.; Hesse, T.; Blank, A.; Peters, J.; Grosse-Wentrup, M. |
Year | 2017 |
Title | Bayesian Regression for Artifact Correction in Electroencephalography |
Journal/Conference/Book Title | Proceedings of the 7th Graz Brain-Computer Interface Conference |
Reference Type | Conference Proceedings |
Author(s) | Akrour, R.; Sorokin, D.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | Local Bayesian Optimization of Motor Skills |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Link to PDF | http://proceedings.mlr.press/v70/akrour17a/akrour17a.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gebhardt, G.H.W.; Daun, K.; Schnaubelt, M.; Hendrich, A.; Kauth, D.; Neumann, G. |
Year | 2017 |
Title | Learning to Assemble Objects with a Robot Swarm |
Journal/Conference/Book Title | Proceedings of the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS) |
Keywords | multi-agent learning, reinforcement learning, swarm robotics |
Abstract | Nature provides us with a multitude of examples that show how swarms of simple agents are much richer in their abilities than a single individual. This insight is a main principle that swarm robotics tries to exploit. In the last years, large swarms of low-cost robots such as the Kilobots have become available. This makes it possible to bring algorithms developed for swarm robotics from simulation to the real world. Recently, the Kilobots have been used for an assembly task with multiple objects: a human operator controlled a light source to guide the swarm of light-sensitive robots such that they successfully assembled an object of multiple parts. However, hand-coding the control of the light source for autonomous assembly is not straightforward, as the interactions of the swarm with the object or the reaction to the light source are hard to model. |
Publisher | International Foundation for Autonomous Agents and Multiagent Systems |
Pages | 1547-1549 |
URL(s) | http://dl.acm.org/citation.cfm?id=3091282.3091357 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/GregorGebhardt/LearningToAssembleObjectsWithARobotSwarm.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kollegger, G.; Ewerton, M.; Wiemeyer, J.; Peters, J. |
Year | 2017 |
Title | BIMROB – Bidirectional Interaction between human and robot for the learning of movements – Robot trains human – Human trains robot |
Journal/Conference/Book Title | 23. Sportwissenschaftlicher Hochschultag der dvs |
Keywords | BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/kollegger_dvs_hochschultag_2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tosatto, S.; Pirotta, M.; D'Eramo, C.; Restelli, M. |
Year | 2017 |
Title | Boosted Fitted Q-Iteration |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Keywords | Bosch-Forschungstiftung |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/tosatto_icml2017.pdf |
Reference Type | Conference Paper |
Author(s) | Belousov, B.; Neumann, G.; Rothkopf, C.A.; Peters, J. |
Year | 2017 |
Title | Catching heuristics are optimal control policies |
Journal/Conference/Book Title | Proceedings of the Karniel Thirteenth Computational Motor Control Workshop |
Keywords | SKILLS4ROBOTS |
Abstract | Two seemingly contradictory theories attempt to explain how humans move to intercept an airborne ball. One theory posits that humans predict the ball trajectory to optimally plan future actions; the other claims that, instead of performing such complicated computations, humans employ heuristics to reactively choose appropriate actions based on immediate visual feedback. In this paper, we show that interception strategies appearing to be heuristics can be understood as computational solutions to the optimal control problem faced by a ball-catching agent acting under uncertainty. Modeling catching as a continuous partially observable Markov decision process and employing stochastic optimal control theory, we discover that the four main heuristics described in the literature are optimal solutions if the catcher has sufficient time to continuously visually track the ball. Specifically, by varying model parameters such as noise, time to ground contact, and perceptual latency, we show that different strategies arise under different circumstances. The catcher’s policy switches between generating reactive and predictive behavior based on the ratio of system to observation noise and the ratio between reaction time and task duration. Thus, we provide a rational account of human ball-catching behavior and a unifying explanation for seemingly contradictory theories of target interception on the basis of stochastic optimal control. |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Belousov_ANIPS_2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | Busch, B.; Maeda, G.; Mollard, Y.; Demangeat, M.; Lopes, M. |
Year | 2017 |
Title | Postural Optimization for an Ergonomic Human-Robot Interaction |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | 3rd-Hand |
Reference Type | Conference Proceedings |
Author(s) | Pajarinen, J.; Kyrki, V.; Koval, M.; Srinivasa, S.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | Hybrid Control Trajectory Optimization under Uncertainty |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | RoMaNS |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/pajarinen_iros_2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Parisi, S.; Ramstedt, S.; Peters, J. |
Year | 2017 |
Title | Goal-Driven Dimensionality Reduction for Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/parisi2017iros.pdf |
Reference Type | Journal Article |
Author(s) | Paraschos, A.; Lioutikov, R.; Peters, J.; Neumann, G. |
Year | 2017 |
Title | Probabilistic Prioritization of Movement Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Intelligent Robots and Systems, and IEEE Robotics and Automation Letters (RA-L) |
Keywords | codyco |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/AlexandrosParaschos/paraschos_prob_prio.pdf |
Reference Type | Journal Article |
Author(s) | van Hoof, H.; Tanneberg, D.; Peters, J. |
Year | 2017 |
Title | Generalized Exploration in Policy Search |
Journal/Conference/Book Title | Machine Learning (MLJ) |
Volume | 106 |
Number | 9-10 |
Pages | 1705-1724 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/vanHoof_MLJ_2017.pdf |
Reference Type | Journal Article |
Author(s) | van Hoof, H.; Neumann, G.; Peters, J. |
Year | 2017 |
Title | Non-parametric Policy Search with Limited Information Loss |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Keywords | TACMAN, reinforcement learning |
Volume | 18 |
Number | 73 |
Pages | 1-46 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Alumni/HerkeVanHoof/vanHoof_JMLR_2017.pdf |
Reference Type | Journal Article |
Author(s) | Vinogradska, J.; Bischoff, B.; Nguyen-Tuong, D.; Peters, J. |
Year | 2017 |
Title | Stability of Controllers for Gaussian Process Forward Models |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Volume | 18 |
Number | 100 |
Pages | 1-37 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/16-590.pdf |
Reference Type | Journal Article |
Author(s) | Dermy, O.; Paraschos, A.; Ewerton, M.; Charpillet, F.; Peters, J.; Ivaldi, S. |
Year | 2017 |
Title | Prediction of intention during interaction with iCub with Probabilistic Movement Primitives |
Journal/Conference/Book Title | Frontiers in Robotics and AI |
Keywords | CoDyCo, BIMROB |
Volume | 4 |
Pages | 45 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/frobt-04-00045.pdf |
Reference Type | Generic |
Author(s) | Ewerton, M.; Maeda, G.; Rother, D.; Weimar, J.; Lotter, L.; Kollegger, G.; Wiemeyer, J.; Peters, J. |
Year | 2017 |
Title | Assisting the practice of motor skills by humans with a probability distribution over trajectories |
Journal/Conference/Book Title | Workshop Human-in-the-loop robotic manipulation: on the influence of the human role at IROS 2017, Vancouver, Canada |
Keywords | 3rd-Hand, BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Member/PubMarcoEwerton/WORKSHOP_IROS_2017.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tanneberg, D.; Peters, J.; Rueckert, E. |
Year | 2017 |
Title | Online Learning with Stochastic Recurrent Neural Networks using Intrinsic Motivation Signals |
Journal/Conference/Book Title | Proceedings of the Conference on Robot Learning (CoRL) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/DanielTanneberg/corl17_01.pdf |
Reference Type | Conference Proceedings |
Author(s) | Maeda, G.; Ewerton, M.; Osa, T.; Busch, B.; Peters, J. |
Year | 2017 |
Title | Active Incremental Learning of Robot Movement Primitives |
Journal/Conference/Book Title | Proceedings of the Conference on Robot Learning (CoRL) |
Keywords | 3rd-Hand, BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/maedaCoRL_20171014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Rueckert, E.; Nakatenus, M.; Tosatto, S.; Peters, J. |
Year | 2017 |
Title | Learning Inverse Dynamics Models in O(n) time with LSTM networks |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS, Bosch-Forschungstiftung |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Humanoids2017Rueckert.pdf |
Reference Type | Conference Proceedings |
Author(s) | Tanneberg, D.; Peters, J.; Rueckert, E. |
Year | 2017 |
Title | Efficient Online Adaptation with Stochastic Recurrent Neural Networks |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/DanielTanneberg/humanoids17_01.pdf |
Reference Type | Conference Proceedings |
Author(s) | Stark, S.; Peters, J.; Rueckert, E. |
Year | 2017 |
Title | A Comparison of Distance Measures for Learning Nonparametric Motor Skill Libraries |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/SvenjaStark/stark_humanoids2017.pdf |
Reference Type | Journal Article |
Author(s) | Kollegger, G.; Ewerton, M.; Wiemeyer, J.; Peters, J. |
Year | 2017 |
Title | BIMROB -- Bidirectional Interaction Between Human and Robot for the Learning of Movements |
Journal/Conference/Book Title | Proceedings of the 11th International Symposium on Computer Science in Sport (IACSS 2017) |
Keywords | BIMROB |
Editor(s) | Lames, M.; Saupe, D.; Wiemeyer, J. |
Publisher | Springer International Publishing |
Pages | 151--163 |
ISBN/ISSN | 978-3-319-67846-7 |
URL(s) | https://doi.org/10.1007/978-3-319-67846-7_15 |
Reference Type | Conference Proceedings |
Author(s) | Thiem, S.; Stark, S.; Tanneberg, D.; Peters, J.; Rueckert, E. |
Year | 2017 |
Title | Simulation of the underactuated Sake Robotics Gripper in V-REP |
Journal/Conference/Book Title | Workshop at the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GOAL-Robots, SKILLS4ROBOTS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/PubElmarRueckert/Humanoids2017Thiem.pdf |
Reference Type | Conference Proceedings |
Author(s) | Grossberger, L.; Hohmann, M.R.; Peters, J.; Grosse-Wentrup, M. |
Year | 2017 |
Title | Investigating Music Imagery as a Cognitive Paradigm for Low-Cost Brain-Computer Interfaces |
Journal/Conference/Book Title | Proceedings of the 7th Graz Brain-Computer Interface Conference |
Reference Type | Conference Proceedings |
Author(s) | Kollegger, G.; Wiemeyer, J.; Ewerton, M.; Peters, J. |
Year | 2017 |
Title | BIMROB - Bidirectional Interaction between human and robot for the learning of movements - Robot trains human - Human trains robot |
Journal/Conference/Book Title | Innovation & Technologie im Sport - 23. Sportwissenschaftlicher Hochschultag der deutschen Vereinigung für Sportwissenschaft |
Keywords | BIMROB |
Editor(s) | Schwirtz, A.; Mess, F.; Demetriou, Y.; Senner, V. |
Place Published | Hamburg |
Publisher | Czwalina-Feldhaus |
Pages | 179 |
Reference Type | Journal Article |
Author(s) | Belousov, B.; Peters, J. |
Year | 2017 |
Title | f-Divergence constrained policy improvement |
Journal/Conference/Book Title | arXiv |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/1801.00056.pdf |
Reference Type | Conference Proceedings |
Author(s) | Osa, T.; Peters, J.; Neumann, G. |
Year | 2016 |
Title | Experiments with Hierarchical Reinforcement Learning of Multiple Grasping Policies |
Journal/Conference/Book Title | Proceedings of the International Symposium on Experimental Robotics (ISER) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/osa_ISER2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | Arenz, O.; Abdulsamad, H.; Neumann, G. |
Year | 2016 |
Title | Optimal Control and Inverse Optimal Control by Distribution Matching |
Journal/Conference/Book Title | Proceedings of the International Conference on Intelligent Robots and Systems (IROS) |
Keywords | Imitation Learning, Inverse Optimal Control, Optimal Control |
Abstract | Optimal control is a powerful approach to achieve optimal behavior. However, it typically requires a manual specification of a cost function which often contains several objectives, such as reaching goal positions at different time steps or energy efficiency. Manually trading off these objectives is often difficult and requires a high engineering effort. In this paper, we present a new approach to specify optimal behavior. We directly specify the desired behavior by a distribution over future states or features of the states. For example, the experimenter could choose to reach certain mean positions with given accuracy/variance at specified time steps. Our approach also unifies optimal control and inverse optimal control in one framework. Given a desired state distribution, we estimate a cost function such that the optimal controller matches the desired distribution. If the desired distribution is estimated from expert demonstrations, our approach performs inverse optimal control. We evaluate our approach on several optimal and inverse optimal control tasks on non-linear systems using incremental linearizations similar to differential dynamic programming approaches. |
Publisher | IEEE |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Team/OlegArenz/OC and IOC By Matching Distributions_withSupplements.pdf |
Reference Type | Journal Article |
Author(s) | Rueckert, E.; Kappel, D.; Tanneberg, D.; Pecevski, D.; Peters, J. |
Year | 2016 |
Title | Recurrent Spiking Networks Solve Planning Tasks |
Journal/Conference/Book Title | Nature PG: Scientific Reports |
Keywords | 3rdHand, CoDyCo |
Publisher | Nature Publishing Group |
Volume | 6 |
Number | 21142 |
Date | 2016/02/18 (online) |
DOI | 10.1038/srep21142 |
Custom 2 | http://www.nature.com/articles/srep21142#supplementary-information |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/srep21142 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/srep21142 |
Reference Type | Conference Proceedings |
Author(s) | Kohlschuetter, J.; Peters, J.; Rueckert, E. |
Year | 2016 |
Title | Learning Probabilistic Features from EMG Data for Predicting Knee Abnormalities |
Journal/Conference/Book Title | Proceedings of the XIV Mediterranean Conference on Medical and Biological Engineering and Computing (MEDICON) |
Keywords | CoDyCo, TACMAN |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/KohlschuetterMEDICON_2016.pdf |
Reference Type | Journal Article |
Author(s) | Maeda, G.; Ewerton, M.; Koert, D.; Peters, J. |
Year | 2016 |
Title | Acquiring and Generalizing the Embodiment Mapping from Human Observations to Robot Skills |
Journal/Conference/Book Title | IEEE Robotics and Automation Letters (RA-L) |
Keywords | 3rd-Hand, BIMROB |
Volume | 1 |
Number | 2 |
Pages | 784--791 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/GuilhermeMaeda/maeda_RAL_golf_2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | Modugno, V.; Neumann, G.; Rueckert, E.; Oriolo, G.; Peters, J.; Ivaldi, S. |
Year | 2016 |
Title | Learning soft task priorities for control of redundant robots |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | CoDyCo |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/main_revised.pdf |
Reference Type | Conference Proceedings |
Author(s) | Buechler, D.; Ott, H.; Peters, J. |
Year | 2016 |
Title | A Lightweight Robotic Arm with Pneumatic Muscles for Robot Learning |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Maeda, G.; Neumann, G.; Kisner, V.; Kollegger, G.; Wiemeyer, J.; Peters, J. |
Year | 2016 |
Title | Movement Primitives with Multiple Phase Parameters |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | BIMROB, 3rd-Hand |
Pages | 201--206 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_icra_2016_stockholm.pdf |
Reference Type | Journal Article |
Author(s) | Daniel, C.; Neumann, G.; Kroemer, O.; Peters, J. |
Year | 2016 |
Title | Hierarchical Relative Entropy Policy Search |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Volume | 17 |
Pages | 1-50 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Daniel2016JMLR.pdf |
Reference Type | Journal Article |
Author(s) | Veiga, F.F.; Peters, J. |
Year | 2016 |
Title | Can Modular Finger Control for In-Hand Object Stabilization be accomplished by Independent Tactile Feedback Control Laws? |
Journal/Conference/Book Title | arXiv |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/1612.08202.pdf |
Reference Type | Journal Article |
Author(s) | Abdolmaleki, A.; Lau, N.; Reis, L.; Peters, J.; Neumann, G. |
Year | 2016 |
Title | Contextual Policy Search for Linear and Nonlinear Generalization of a Humanoid Walking Controller |
Journal/Conference/Book Title | Journal of Intelligent & Robotic Systems |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/contextualWalking.pdf |
Reference Type | Conference Proceedings |
Author(s) | Vinogradska, J.; Bischoff, B.; Nguyen-Tuong, D.; Romer, A.; Schmidt, H.; Peters, J. |
Year | 2016 |
Title | Stability of Controllers for Gaussian Process Forward Models |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Publications/Vinogradska_ICML_2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | Akrour, R.; Abdolmaleki, A.; Abdulsamad, H.; Neumann, G. |
Year | 2016 |
Title | Model-Free Trajectory Optimization for Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/akrour16.pdf |
Reference Type | Conference Proceedings |
Author(s) | Sharma, D.; Tanneberg, D.; Grosse-Wentrup, M.; Peters, J.; Rueckert, E. |
Year | 2016 |
Title | Adaptive Training Strategies for BCIs |
Journal/Conference/Book Title | Cybathlon Symposium |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/ElmarR%c3%bcckert/Cybathlon16_AdaptiveTrainingRL.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Peters, J.; Rasmussen, C.E.; Deisenroth, M.P. |
Year | 2016 |
Title | Manifold Gaussian Processes for Regression |
Journal/Conference/Book Title | Proceedings of the International Joint Conference on Neural Networks (IJCNN) |
Keywords | CoDyCo |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/1402.5876v4 |
Reference Type | Conference Proceedings |
Author(s) | Weber, P.; Rueckert, E.; Calandra, R.; Peters, J.; Beckerle, P. |
Year | 2016 |
Title | A Low-cost Sensor Glove with Vibrotactile Feedback and Multiple Finger Joint and Hand Motion Sensing for Human-Robot Interaction |
Journal/Conference/Book Title | Proceedings of the IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN) |
Keywords | CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/ElmarR%c3%bcckert/ROMANS16_daglove.pdf |
Reference Type | Journal Article |
Author(s) | Rueckert, E.; Camernik, J.; Peters, J.; Babic, J. |
Year | 2016 |
Title | Probabilistic Movement Models Show that Postural Control Precedes and Predicts Volitional Motor Control |
Journal/Conference/Book Title | Nature PG: Scientific Reports |
Keywords | CoDyCo; TACMAN |
Volume | 6 |
Number | 28455 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/srep28455 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/srep28455 |
Reference Type | Journal Article |
Author(s) | Daniel, C.; van Hoof, H.; Peters, J.; Neumann, G. |
Year | 2016 |
Title | Probabilistic Inference for Determining Options in Reinforcement Learning |
Journal/Conference/Book Title | Machine Learning (MLJ) |
Volume | 104 |
Number | 2-3 |
Pages | 337-357 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Daniel2016ECML.pdf |
Reference Type | Conference Proceedings |
Author(s) | Manschitz, S.; Gienger, M.; Kober, J.; Peters, J. |
Year | 2016 |
Title | Probabilistic Decomposition of Sequential Force Interaction Tasks into Movement Primitives |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | Honda, HRI-Collaboration |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/SimonManschitz/ManschitzIROS2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; Chen, N.; Karl, M.; van der Smagt, P.; Peters, J. |
Year | 2016 |
Title | Stable Reinforcement Learning with Autoencoders for Tactile and Visual Data |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | TACMAN, tactile manipulation |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/hoof2016IROS.pdf |
Reference Type | Conference Proceedings |
Author(s) | Yi, Z.; Calandra, R.; Veiga, F.; van Hoof, H.; Hermans, T.; Zhang, Y.; Peters, J. |
Year | 2016 |
Title | Active Tactile Object Exploration with Gaussian Processes |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | TACMAN, tactile manipulation |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Publications/Other/iros2016yi.pdf |
Reference Type | Conference Proceedings |
Author(s) | Koc, O.; Peters, J.; Maeda, G. |
Year | 2016 |
Title | A New Trajectory Generation Framework in Robotic Table Tennis |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Reference Type | Conference Proceedings |
Author(s) | Belousov, B.; Neumann, G.; Rothkopf, C.; Peters, J. |
Year | 2016 |
Title | Catching heuristics are optimal control policies |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
Keywords | SKILLS4ROBOTS |
Abstract | Two seemingly contradictory theories attempt to explain how humans move to intercept an airborne ball. One theory posits that humans predict the ball trajectory to optimally plan future actions; the other claims that, instead of performing such complicated computations, humans employ heuristics to reactively choose appropriate actions based on immediate visual feedback. In this paper, we show that interception strategies appearing to be heuristics can be understood as computational solutions to the optimal control problem faced by a ball-catching agent acting under uncertainty. Modeling catching as a continuous partially observable Markov decision process and employing stochastic optimal control theory, we discover that the four main heuristics described in the literature are optimal solutions if the catcher has sufficient time to continuously visually track the ball. Specifically, by varying model parameters such as noise, time to ground contact, and perceptual latency, we show that different strategies arise under different circumstances. The catcher’s policy switches between generating reactive and predictive behavior based on the ratio of system to observation noise and the ratio between reaction time and task duration. Thus, we provide a rational account of human ball-catching behavior and a unifying explanation for seemingly contradictory theories of target interception on the basis of stochastic optimal control. |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Belousov_ANIPS_2016.pdf |
Reference Type | Generic |
Author(s) | Maeda, G.; Maloo, A.; Ewerton, M.; Lioutikov, R.; Peters, J. |
Year | 2016 |
Title | Proactive Human-Robot Collaboration with Interaction Primitives |
Journal/Conference/Book Title | International Workshop on Human-Friendly Robotics (HFR), Genoa, Italy |
Keywords | 3rd-Hand, BIMROB |
Reference Type | Conference Proceedings |
Author(s) | Maeda, G.; Maloo, A.; Ewerton, M.; Lioutikov, R.; Peters, J. |
Year | 2016 |
Title | Anticipative Interaction Primitives for Human-Robot Collaboration |
Journal/Conference/Book Title | AAAI Fall Symposium Series. Shared Autonomy in Research and Practice, Arlington, VA, USA |
Keywords | 3rd-Hand, BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/maeda-maloo_AAAI_symposium.pdf |
Reference Type | Book Section |
Author(s) | Peters, J.; Tedrake, R.; Roy, N.; Morimoto, J. |
Year | 2016 |
Title | Robot Learning |
Journal/Conference/Book Title | Encyclopedia of Machine Learning, 2nd Edition, Invited Article |
Reference Type | Conference Proceedings |
Author(s) | Tanneberg, D.; Paraschos, A.; Peters, J.; Rueckert, E. |
Year | 2016 |
Title | Deep Spiking Networks for Model-based Planning in Humanoids |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo; TACMAN |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/DanielTanneberg/tanneberg_humanoids16.pdf |
Reference Type | Conference Proceedings |
Author(s) | Huang, Y.; Buechler, D.; Koc, O.; Schoelkopf, B.; Peters, J. |
Year | 2016 |
Title | Jointly Learning Trajectory Generation and Hitting Point Prediction in Robot Table Tennis |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Reference Type | Conference Proceedings |
Author(s) | Koert, D.; Maeda, G.J.; Lioutikov, R.; Neumann, G.; Peters, J. |
Year | 2016 |
Title | Demonstration Based Trajectory Optimization for Generalizable Robot Motions |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | 3rd-Hand,SKILLS4ROBOTS |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/DorotheaKoert/Debato.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez-Gonzalez, S.; Neumann, G.; Schoelkopf, B.; Peters, J. |
Year | 2016 |
Title | Using Probabilistic Movement Primitives for Striking Movements |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Maeda, G.J.; Kollegger, G.; Wiemeyer, J.; Peters, J. |
Year | 2016 |
Title | Incremental Imitation Learning of Context-Dependent Motor Skills |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | BIMROB, 3rd-Hand |
Pages | 351--358 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_humanoids_2016.pdf |
Reference Type | Conference Proceedings |
Author(s) | Azad, M.; Ortenzi, V.; Lin, H.C.; Rueckert, E.; Mistry, M. |
Year | 2016 |
Title | Model Estimation and Control of Compliant Contact Normal Force |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Humanoids2016Azad.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kollegger, G.; Ewerton, M.; Peters, J.; Wiemeyer, J. |
Year | 2016 |
Title | Bidirektionale Interaktion zwischen Mensch und Roboter beim Bewegungslernen (BIMROB) |
Journal/Conference/Book Title | 11. Symposium der DVS Sportinformatik |
Keywords | BIMROB |
Link to PDF | http://www.sportinformatik2016.ovgu.de/Tagung/Abstracts.html |
Reference Type | Conference Proceedings |
Author(s) | Parisi, S.; Blank, A.; Viernickel, T.; Peters, J. |
Year | 2016 |
Title | Local-utopia Policy Selection for Multi-objective Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/parisi2016local.pdf |
Reference Type | Book Section |
Author(s) | Peters, J.; Bagnell, J.A. |
Year | 2016 |
Title | Policy gradient methods |
Journal/Conference/Book Title | Encyclopedia of Machine Learning, 2nd Edition, Invited Article |
Reference Type | Conference Proceedings |
Author(s) | Fiebig, K.-H.; Jayaram, V.; Peters, J.; Grosse-Wentrup, M. |
Year | 2016 |
Title | Multi-Task Logistic Regression in Brain-Computer Interfaces |
Journal/Conference/Book Title | IEEE SMC 2016 — 6th Workshop on Brain-Machine Interface Systems |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/smc_2016_FiJaPeGW_mtl_logreg_v2.pdf |
Reference Type | Journal Article |
Author(s) | Yi, Z.; Zhang, Y.; Peters, J. |
Year | 2016 |
Title | Surface Roughness Discrimination Using Bioinspired Tactile Sensors |
Journal/Conference/Book Title | Proceedings of the 16th International Conference on Biomedical Engineering |
Reference Type | Unpublished Work |
Author(s) | Arenz, O.; Neumann, G. |
Year | 2016 |
Title | Iterative Cost Learning from Different Types of Human Feedback |
Journal/Conference/Book Title | IROS 2016 Workshop on Human-Robot Collaboration |
Keywords | Inverse Reinforcement Learning, Preference Learning |
Abstract | Human-robot collaboration in unstructured environments often involves different types of interactions. These interactions usually occur frequently during normal operation and may provide valuable information about the task to the robot. It is therefore sensible to utilize this data for lifelong robot learning. Learning from human interactions is an active field of research, e.g., Inverse Reinforcement Learning, which aims at learning from demonstrations, or Preference Learning, which aims at learning from human preferences. However, learning from a combination of different types of feedback is still little explored. In this paper, we propose a method for inferring a reward function from a combination of expert demonstrations, pairwise preferences, star ratings as well as oracle-based evaluations of the true reward function. Our method extends Maximum Entropy Inverse Reinforcement Learning in order to account for the additional types of human feedback by framing them as constraints to the original optimization problem. We demonstrate on a gridworld, that the resulting optimization problem can be solved based on the Alternating Direction Method of Multipliers (ADMM), even when confronted with a large amount of training data. |
URL(s) | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/arenz_workshop_IROS16.pdf |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/arenz_workshop_IROS16.pdf |
Reference Type | Unpublished Work |
Author(s) | Arenz, O.; Abdulsamad, H.; Neumann, G. |
Year | 2016 |
Title | (Inverse) Optimal Control for Matching Higher-Order Moments |
Journal/Conference/Book Title | DGR Days (Leipzig) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/OlegArenz/oleg_dgr_2016.pdf |
Reference Type | Journal Article |
Author(s) | Calandra, R.; Seyfarth, A.; Peters, J.; Deisenroth, M. |
Year | 2015 |
Title | Bayesian Optimization for Learning Gaits under Uncertainty |
Journal/Conference/Book Title | Annals of Mathematics and Artificial Intelligence (AMAI) |
Keywords | CoDyCo |
Abstract | Designing gaits and corresponding control policies is a key challenge in robot locomotion. Even with a viable controller parameterization, finding near-optimal parameters can be daunting. Typically, this kind of parameter optimization requires specific expert knowledge and extensive robot experiments. Automatic black-box gait optimization methods greatly reduce the need for human expertise and time-consuming design processes. Many different approaches for automatic gait optimization have been suggested to date, such as grid search and evolutionary algorithms. In this article, we thoroughly discuss multiple of these optimization methods in the context of automatic gait optimization. Moreover, we extensively evaluate Bayesian optimization, a model-based approach to black-box optimization under uncertainty, on both simulated problems and real robots. This evaluation demonstrates that Bayesian optimization is particularly suited for robotic applications, where it is crucial to find a good set of gait parameters in a small number of experiments. |
URL(s) | http://dx.doi.org/10.1007/s10472-015-9463-9 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Calandra2015a.pdf |
Reference Type | Journal Article |
Author(s) | Mariti, C.; Muscolo, G.G.; Peters, J.; Puig, D.; Recchiuto, C.T.; Sighieri, C.; Solanas, A.; von Stryk, O. |
Year | 2015 |
Title | Developing biorobotics for veterinary research into cat movements |
Journal/Conference/Book Title | Journal of Veterinary Behavior: Clinical Applications and Research |
Volume | 10 |
Number | 3 |
Pages | 248-254 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Developing_biorobotics_for_veterinary_research.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; Peters, J.; Neumann, G. |
Year | 2015 |
Title | Learning of Non-Parametric Control Policies with High-Dimensional State Features |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS) |
Keywords | TACMAN |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/hoof2015learning.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Ivaldi, S.; Deisenroth, M.; Rueckert, E.; Peters, J. |
Year | 2015 |
Title | Learning Inverse Dynamics Models with Contacts |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_ICRA15.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Daniel, C.; Neumann, G.; van Hoof, H.; Peters, J. |
Year | 2015 |
Title | Towards Learning Hierarchical Skills for Multi-Phase Manipulation Tasks |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-hand, 3rdHand |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/KroemerICRA15.pdf |
Reference Type | Conference Proceedings |
Author(s) | Rueckert, E.; Mundo, J.; Paraschos, A.; Peters, J.; Neumann, G. |
Year | 2015 |
Title | Extracting Low-Dimensional Control Variables for Movement Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rdHand, CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Rueckert_ICRA14LMProMPsFinal.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Neumann, G.; Lioutikov, R.; Ben Amor, H.; Peters, J.; Maeda, G. |
Year | 2015 |
Title | Learning Multiple Collaborative Tasks with a Mixture of Interaction Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-Hand, CompLACS, BIMROB |
Pages | 1535--1542 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_icra_2015_seattle.pdf |
Reference Type | Conference Proceedings |
Author(s) | Traversaro, S.; Del Prete, A.; Ivaldi, S.; Nori, F. |
Year | 2015 |
Title | Avoiding to Rely on Inertial Parameters in Estimating Joint Torques with Proximal F/T Sensing |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | CoDyCo |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICRA15_2129_FI.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lopes, M.; Peters, J.; Piater, J.; Toussaint, M.; Baisero, A.; Busch, B.; Erkent, O.; Kroemer, O.; Lioutikov, R.; Maeda, G.; Mollard, Y.; Munzer, T.; Shukla, D. |
Year | 2015 |
Title | Semi-Autonomous 3rd-Hand Robot |
Journal/Conference/Book Title | Workshop on Cognitive Robotics in Future Manufacturing Scenarios, European Robotics Forum, Vienna, Austria |
Keywords | 3rdhand |
Link to PDF | https://iis.uibk.ac.at/public/papers/Lopes-2015-CogRobFoF.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lioutikov, R.; Neumann, G.; Maeda, G.J.; Peters, J. |
Year | 2015 |
Title | Probabilistic Segmentation Applied to an Assembly Task |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | 3rd-hand |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/lioutikov_humanoids_2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Paraschos, A.; Rueckert, E.; Peters, J.; Neumann, G. |
Year | 2015 |
Title | Model-Free Probabilistic Movement Primitives for Physical Interaction |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/PubAlexParaschos/Paraschos_IROS_2015.pdf |
Reference Type | Conference Paper |
Author(s) | Rueckert, E.; Lioutikov, R.; Calandra, R.; Schmidt, M.; Beckerle, P.; Peters, J. |
Year | 2015 |
Title | Low-cost Sensor Glove with Force Feedback for Learning from Demonstrations using Probabilistic Trajectory Representations |
Journal/Conference/Book Title | ICRA 2015 Workshop on Tactile and force sensing for autonomous compliant intelligent robots |
Keywords | CoDyCo |
URL(s) | http://arxiv.org/abs/1510.03253 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Workshops/ICRA2015TactileForce/13_icra_ws_tactileforce.pdf |
Reference Type | Generic |
Author(s) | Ewerton, M.; Neumann, G.; Lioutikov, R.; Ben Amor, H.; Peters, J.; Maeda, G. |
Year | 2015 |
Title | Modeling Spatio-Temporal Variability in Human-Robot Interaction with Probabilistic Movement Primitives |
Journal/Conference/Book Title | Workshop on Machine Learning for Social Robotics, ICRA |
Keywords | 3rd-Hand, CompLACS, BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_workshop_ml_social_robotics_icra_2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Parisi, S.; Abdulsamad, H.; Paraschos, A.; Daniel, C.; Peters, J. |
Year | 2015 |
Title | Reinforcement Learning vs Human Programming in Tetherball Robot Games |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | SCARL |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Member/PubSimoneParisi/parisi_iros_2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Veiga, F.F.; van Hoof, H.; Peters, J.; Hermans, T. |
Year | 2015 |
Title | Stabilizing Novel Objects by Learning to Predict Tactile Slip |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | TACMAN |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/IROS2015veiga.pdf |
Reference Type | Conference Proceedings |
Author(s) | Huang, Y.; Schoelkopf, B.; Peters, J. |
Year | 2015 |
Title | Learning Optimal Striking Points for A Ping-Pong Playing Robot |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/YanlongHuang/Yanlong_IROS2015 |
Reference Type | Conference Proceedings |
Author(s) | Manschitz, S.; Kober, J.; Gienger, M.; Peters, J. |
Year | 2015 |
Title | Probabilistic Progress Prediction and Sequencing of Concurrent Movement Primitives |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | Honda |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/ManschitzIROS2015_v2.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ewerton, M.; Maeda, G.J.; Peters, J.; Neumann, G. |
Year | 2015 |
Title | Learning Motor Skills from Partially Observed Movements Executed at Different Speeds |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | BIMROB, 3rd-hand |
Pages | 456-463 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_iros_2015_hamburg.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wahrburg, A.; Zeiss, S.; Matthias, B.; Peters, J.; Ding, H. |
Year | 2015 |
Title | Combined Pose-Wrench and State Machine Representation for Modeling Robotic Assembly Skills |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | ABB |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Wahrburg_IROS_2015.pdf |
Reference Type | Journal Article |
Author(s) | Daniel, C.; Kroemer, O.; Viering, M.; Metz, J.; Peters, J. |
Year | 2015 |
Title | Active Reward Learning with a Novel Acquisition Function |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Keywords | ComPLACS |
Volume | 39 |
Number | 3 |
Pages | 389-405 |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/ChristianDaniel/ActiveRewardLearning.pdf |
Reference Type | Conference Proceedings |
Author(s) | Fritsche, L.; Unverzagt, F.; Peters, J.; Calandra, R. |
Year | 2015 |
Title | First-Person Tele-Operation of a Humanoid Robot |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Fritsche_Humanoids15.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Ivaldi, S.; Deisenroth, M.; Peters, J. |
Year | 2015 |
Title | Learning Torque Control in Presence of Contacts using Tactile Sensing from Robot Skin |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_humanoids2015.pdf |
Reference Type | Journal Article |
Author(s) | Manschitz, S.; Kober, J.; Gienger, M.; Peters, J. |
Year | 2015 |
Title | Learning Movement Primitive Attractor Goals and Sequential Skills from Kinesthetic Demonstrations |
Journal/Conference/Book Title | Robotics and Autonomous Systems |
Keywords | Honda, HRI-Collaboration |
Volume | 74 |
Pages | 97-107 |
ISBN/ISSN | 0921-8890 |
URL(s) | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/ManschitzRAS2015_v2.pdf |
Reference Type | Conference Proceedings |
Author(s) | Maeda, G.; Neumann, G.; Ewerton, M.; Lioutikov, R.; Peters, J. |
Year | 2015 |
Title | A Probabilistic Framework for Semi-Autonomous Robots Based on Interaction Primitives with Phase Estimation |
Journal/Conference/Book Title | Proceedings of the International Symposium of Robotics Research (ISRR) |
Keywords | 3rd-Hand, BIMROB |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/ISRR_uploaded_20150814_small.pdf |
Reference Type | Conference Proceedings |
Author(s) | Koc, O.; Maeda, G.; Neumann, G.; Peters, J. |
Year | 2015 |
Title | Optimizing Robot Striking Movement Primitives with Iterative Learning Control |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | 3rd-hand |
Reference Type | Conference Proceedings |
Author(s) | Hoelscher, J.; Peters, J.; Hermans, T. |
Year | 2015 |
Title | Evaluation of Interactive Object Recognition with Tactile Sensing |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | TACMAN, tactile manipulation |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Theses/hoelscher_ichr2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; Hermans, T.; Neumann, G.; Peters, J. |
Year | 2015 |
Title | Learning Robot In-Hand Manipulation with Tactile Features |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | TACMAN, tactile manipulation |
URL(s) | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/HoofHumanoids2015.pdf |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/HoofHumanoids2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Leischnig, S.; Luettgen, S.; Kroemer, O.; Peters, J. |
Year | 2015 |
Title | A Comparison of Contact Distribution Representations for Learning to Predict Object Interactions |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | TACMAN, tactile manipulation |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Leischnig-Humanoids-2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Abdolmaleki, A.; Lioutikov, R.; Peters, J.; Lau, N.; Reis, L.; Neumann, G. |
Year | 2015 |
Title | Model-Based Relative Entropy Stochastic Search |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS/NeurIPS) |
Keywords | LearnRobotS |
Place Published | Cambridge, MA |
Publisher | MIT Press |
Link to PDF | http://www.ausy.tu-darmstadt.de/uploads/Team/GerhardNeumann/Abdolmaleki_NIPS2015.pdf |
Reference Type | Conference Proceedings |
Author(s) | Dann, C.; Neumann, G.; Peters, J. |
Year | 2015 |
Title | Policy Evaluation with Temporal Differences: A Survey and Comparison |
Journal/Conference/Book Title | Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling (ICAPS) |
Pages | 359-360 |
Reference Type | Journal Article |
Author(s) | Lioutikov, R.; Paraschos, A.; Peters, J.; Neumann, G. |
Year | 2014 |
Title | Generalizing Movements with Information Theoretic Stochastic Optimal Control |
Journal/Conference/Book Title | Journal of Aerospace Information Systems |
Volume | 11 |
Number | 9 |
Pages | 579-595 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/lioutikov_2014_itsoc.pdf |
Reference Type | Journal Article |
Author(s) | Neumann, G.; Daniel, C.; Paraschos, A.; Kupcsik, A.; Peters, J. |
Year | 2014 |
Title | Learning Modular Policies for Robotics |
Journal/Conference/Book Title | Frontiers in Computational Neuroscience |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/fncom-08-00062.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nori, F.; Peters, J.; Padois, V.; Babic, J.; Mistry, M.; Ivaldi, S. |
Year | 2014 |
Title | Whole-body motion in humans and humanoids |
Journal/Conference/Book Title | Proceedings of the Workshop on New Research Frontiers for Intelligent Autonomous Systems (NRF-IAS) |
Keywords | CoDyCo |
Pages | 81-92 |
URL(s) | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/nori2014iascodyco.pdf |
Reference Type | Journal Article |
Author(s) | Dann, C.; Neumann, G.; Peters, J. |
Year | 2014 |
Title | Policy Evaluation with Temporal Differences: A Survey and Comparison |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Keywords | ComPLACS |
Volume | 15 |
Number | March |
Pages | 809-883 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/dann14a.pdf |
Reference Type | Journal Article |
Author(s) | Meyer, T.; Peters, J.; Zander, T.O.; Schoelkopf, B.; Grosse-Wentrup, M. |
Year | 2014 |
Title | Predicting Motor Learning Performance from Electroencephalographic Data |
Journal/Conference/Book Title | Journal of Neuroengineering and Rehabilitation |
Keywords | Team Athena-Minerva |
Volume | 11 |
Number | 1 |
URL(s) | http://www.ias.tu-darmstadt.de/uploads/Publications/Meyer_JNER_2013.pdf |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Meyer_JNER_2013.pdf |
Reference Type | Journal Article |
Author(s) | Bocsi, B.; Csato, L.; Peters, J. |
Year | 2014 |
Title | Indirect Robot Model Learning for Tracking Control |
Journal/Conference/Book Title | Advanced Robotics |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Bocsi_AR_2014.pdf |
Reference Type | Journal Article |
Author(s) | Ben Amor, H.; Saxena, A.; Hudson, N.; Peters, J. |
Year | 2014 |
Title | Special issue on autonomous grasping and manipulation |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Reference Type | Journal Article |
Author(s) | Deisenroth, M.P.; Fox, D.; Rasmussen, C.E. |
Year | 2014 |
Title | Gaussian Processes for Data-Efficient Learning in Robotics and Control |
Journal/Conference/Book Title | IEEE Transactions on Pattern Analysis and Machine Intelligence |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/pami_final_w_appendix.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/pami_final_w_appendix.pdf |
Reference Type | Journal Article |
Author(s) | Wierstra, D.; Schaul, T.; Glasmachers, T.; Sun, Y.; Peters, J.; Schmidhuber, J. |
Year | 2014 |
Title | Natural Evolution Strategies |
Journal/Conference/Book Title | Journal of Machine Learning Research (JMLR) |
Volume | 15 |
Number | March |
Pages | 949-980 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/wierstra14a.pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Englert, P.; Peters, J.; Fox, D. |
Year | 2014 |
Title | Multi-Task Policy Search for Robotics |
Journal/Conference/Book Title | Proceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Deisenroth_ICRA_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bischoff, B.; Nguyen-Tuong, D.; van Hoof, H.; McHutchon, A.; Rasmussen, C.E.; Knoll, A.; Peters, J.; Deisenroth, M.P. |
Year | 2014 |
Title | Policy Search For Learning Robot Control Using Sparse Data |
Journal/Conference/Book Title | Proceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Bischoff_ICRA_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Seyfarth, A.; Peters, J.; Deisenroth, M.P. |
Year | 2014 |
Title | An Experimental Comparison of Bayesian Optimization for Bipedal Locomotion |
Journal/Conference/Book Title | Proceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_ICRA2014.pdf |
Reference Type | Book |
Author(s) | Kober, J.; Peters, J. |
Year | 2014 |
Title | Learning Motor Skills - From Algorithms to Robot Experiments |
Journal/Conference/Book Title | Springer Tracts in Advanced Robotics 97 (STAR Series), Springer |
ISBN/ISSN | 978-3-319-03193-4 |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; van Hoof, H.; Neumann, G.; Peters, J. |
Year | 2014 |
Title | Learning to Predict Phases of Manipulation Tasks as Hidden States |
Journal/Conference/Book Title | Proceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | TACMAN, 3rd-Hand |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ben Amor, H.; Neumann, G.; Kamthe, S.; Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Interaction Primitives for Human-Robot Cooperation Tasks |
Journal/Conference/Book Title | Proceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | CoDyCo, ComPLACS |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/icraHeniInteract.pdf |
Reference Type | Conference Proceedings |
Author(s) | Haji Ghassemi, N.; Deisenroth, M.P. |
Year | 2014 |
Title | Approximate Inference for Long-Term Forecasting with Periodic Gaussian Processes |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Analytic_Long-Term_Forecasting.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Gopalan, N.; Seyfarth, A.; Peters, J.; Deisenroth, M.P. |
Year | 2014 |
Title | Bayesian Gait Optimization for Bipedal Locomotion |
Journal/Conference/Book Title | Proceedings of the 2014 Learning and Intelligent Optimization Conference (LION8) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_LION8.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kamthe, S.; Peters, J.; Deisenroth, M. |
Year | 2014 |
Title | Multi-modal filtering for non-linear estimation |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) |
Link to PDF | https://spiral.imperial.ac.uk:8443/bitstream/10044/1/12921/2/ICASSP_Final.pdf |
Reference Type | Conference Proceedings |
Author(s) | Manschitz, S.; Kober, J.; Gienger, M.; Peters, J. |
Year | 2014 |
Title | Learning to Unscrew a Light Bulb from Demonstrations |
Journal/Conference/Book Title | Proceedings of ISR/ROBOTIK 2014 |
Keywords | HRI-Collaboration |
Reference Type | Journal Article |
Author(s) | Muelling, K.; Boularias, A.; Schoelkopf, B.; Peters, J. |
Year | 2014 |
Title | Learning Strategies in Table Tennis using Inverse Reinforcement Learning |
Journal/Conference/Book Title | Biological Cybernetics |
Volume | 108 |
Number | 5 |
Pages | 603-619 |
Custom 1 | DOI 10.1007/s00422-014-0599-1 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/Muelling_BICY_2014.pdf |
Reference Type | Journal Article |
Author(s) | Saut, J.-P.; Ivaldi, S.; Sahbani, A.; Bidaud, P. |
Year | 2014 |
Title | Grasping objects localized from uncertain point cloud data |
Journal/Conference/Book Title | Robotics and Autonomous Systems |
Keywords | CoDyCo |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/auro2013_final.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lioutikov, R.; Kroemer, O.; Peters, J.; Maeda, G. |
Year | 2014 |
Title | Learning Manipulation by Sequencing Motor Primitives with a Two-Armed Robot |
Journal/Conference/Book Title | Proceedings of the 13th International Conference on Intelligent Autonomous Systems (IAS) |
Keywords | 3rd-Hand |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Member/PubRudolfLioutikov/lioutikov_ias13_conf.pdf |
Reference Type | Conference Proceedings |
Author(s) | Daniel, C.; Viering, M.; Metz, J.; Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Active Reward Learning |
Journal/Conference/Book Title | Proceedings of Robotics: Science & Systems (R:SS) |
Keywords | complacs |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Daniel_RSS_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Predicting Object Interactions from Contact Distributions |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | 3rd-Hand, TACMAN, CoDyCo |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/KroemerIROS2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Chebotar, Y.; Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Learning Robot Tactile Sensing for Object Manipulation |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | 3rd-Hand, TACMAN |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/ChebotarIROS2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Manschitz, S.; Kober, J.; Gienger, M.; Peters, J. |
Year | 2014 |
Title | Learning to Sequence Movement Primitives from Demonstrations |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | HRI-Collaboration, Honda |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/ManschitzIROS2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Luck, K.S.; Neumann, G.; Berger, E.; Peters, J.; Ben Amor, H. |
Year | 2014 |
Title | Latent Space Policy Search for Robotics |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS) |
Keywords | Complacs, codyco |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Luck_IROS_2014.pdf |
Reference Type | Journal Article |
Author(s) | van Hoof, H.; Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Probabilistic Segmentation and Targeted Exploration of Objects in Cluttered Environments |
Journal/Conference/Book Title | IEEE Transactions on Robotics (TRo) |
Volume | 30 |
Number | 5 |
Pages | 1198-1209 |
ISBN/ISSN | 1552-3098 |
URL(s) | http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6870500&tag=1 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/hoof2014probabilistic.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez, V.; Kappen, B.; Peters, J.; Neumann, G. |
Year | 2014 |
Title | Policy Search for Path Integral Control |
Journal/Conference/Book Title | Proceedings of the European Conference on Machine Learning (ECML) |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Gomez_ECML_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Maeda, G.J.; Ewerton, M.; Lioutikov, R.; Ben Amor, H.; Peters, J.; Neumann, G. |
Year | 2014 |
Title | Learning Interaction for Collaborative Tasks with Probabilistic Movement Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | 3rd-Hand, CompLACS |
Pages | 527-534 |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/PubGJMaeda/maeda2014InteractionProMP_HUMANOIDS.pdf |
Reference Type | Conference Proceedings |
Author(s) | Brandl, S.; Kroemer, O.; Peters, J. |
Year | 2014 |
Title | Generalizing Pouring Actions Between Objects using Warped Parameters |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | 3rd-Hand |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/BrandlHumanoids2014Final.pdf |
Reference Type | Conference Proceedings |
Author(s) | Colome, A.; Neumann, G.; Peters, J.; Torras, C. |
Year | 2014 |
Title | Dimensionality Reduction for Probabilistic Movement Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Colome_Humanoids_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Rueckert, E.; Mindt, M.; Peters, J.; Neumann, G. |
Year | 2014 |
Title | Robust Policy Updates for Stochastic Optimal Control |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/AICOHumanoidsFinal.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ivaldi, S.; Peters, J.; Padois, V.; Nori, F. |
Year | 2014 |
Title | Tools for simulating humanoid robot dynamics: a survey based on user feedback |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/ivaldi2014simulators.pdf |
Reference Type | Journal Article |
Author(s) | Droniou, A.; Ivaldi, S.; Sigaud, O. |
Year | 2014 |
Title | Deep unsupervised network for multimodal perception, representation and classification |
Journal/Conference/Book Title | Robotics and Autonomous Systems |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Deep unsupervised network_2014.pdf |
Reference Type | Conference Proceedings |
Author(s) | Hermans, T.; Veiga, F.; Hölscher, J.; van Hoof, H.; Peters, J. |
Year | 2014 |
Title | Demonstration: Learning for Tactile Manipulation |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS/NeurIPS), Demonstration Track |
Keywords | TACMAN, tactile manipulation |
Abstract | Tactile sensing affords robots the opportunity to dexterously manipulate objects in-hand without the need of strong object models and planning. Our demonstration focuses on learning for tactile, in-hand manipulation by robots. We address learning problems related to the control of objects in-hand, as well as perception problems encountered by a robot exploring its environment with a tactile sensor. We demonstrate applications for three specific learning problems: learning to detect slip for grasp stability, learning to reposition objects in-hand, and learning to identify objects and object properties through tactile exploration. We address the problem of learning to detect slip of grasped objects. We show that the robot can learn a detector for slip events which generalizes to novel objects. We leverage this slip detector to produce a feedback controller that can stabilize objects during grasping and manipulation. Our work compares a number of supervised learning approaches and feature representations in order to achieve reliable slip detection. Tactile sensors provide observations of high enough dimension to cause problems for traditional reinforcement learning methods. As such, we introduce a novel reinforcement learning (RL) algorithm which learns transition functions embedded in a reproducing kernel Hilbert space (RKHS). The resulting policy search algorithm provides robust policy updates which can efficiently deal with high-dimensional sensory input. We demonstrate the method on the problem of repositioning a grasped object in the hand. Finally, we present a method for learning to classify objects through tactile exploration. The robot collects data from a number of objects through various exploratory motions. The robot learns a classifier for each object to be used during exploration of its environment to detect objects in cluttered environments. Here again we compare a number of learning methods and features present in the literature and synthesize a method to best work in human environments the robot is likely to encounter. Users will be able to interact with a robot hand by giving it objects to grasp and attempting to remove these objects from the robot. The hand will also perform some basic in-hand manipulation tasks such as rolling the object between the fingers and rotating the object about a fixed grasp point. Users will also be able to interact with a touch sensor capable of classifying objects as well as semantic events such as slipping from a stable contact location. |
Place Published | Cambridge, MA |
Publisher | MIT Press |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/TuckerHermans/learning_tactile_manipulation_demo.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lioutikov, R.; Paraschos, A.; Peters, J.; Neumann, G. |
Year | 2014 |
Title | Sample-Based Information-Theoretic Stochastic Optimal Control |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | 3rd-Hand |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Team/RudolfLioutikov/lioutikov_icra_2014.pdf |
Reference Type | Journal Article |
Author(s) | Muelling, K.; Kober, J.; Kroemer, O.; Peters, J. |
Year | 2013 |
Title | Learning to Select and Generalize Striking Movements in Robot Table Tennis |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | GeRT |
Volume | 32 |
Number | 3 |
Pages | 263-279 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_IJRR_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_IJRR_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Daniel, C.; Neumann, G.; Kroemer, O.; Peters, J. |
Year | 2013 |
Title | Learning Sequential Motor Tasks |
Journal/Conference/Book Title | Proceedings of 2013 IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | GeRT, CompLACS |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_ICRA_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_ICRA_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Englert, P.; Paraschos, A.; Peters, J.; Deisenroth, M. P. |
Year | 2013 |
Title | Model-based Imitation Learning by Probabilistic Trajectory Matching |
Journal/Conference/Book Title | Proceedings of 2013 IEEE International Conference on Robotics and Automation (ICRA) |
Abstract | One of the most elegant ways of teaching new skills to robots is to provide demonstrations of a task and let the robot imitate this behavior. Such imitation learning is a non-trivial task: Different anatomies of robot and teacher, and reduced robustness towards changes in the control task are two major difficulties in imitation learning. We present an imitation-learning approach to efficiently learn a task from expert demonstrations. Instead of finding policies indirectly, either via state-action mappings (behavioral cloning), or cost function learning (inverse reinforcement learning), our goal is to find policies directly such that predicted trajectories match observed ones. To achieve this aim, we model the trajectory of the teacher and the predicted robot trajectory by means of probability distributions. We match these distributions by minimizing their Kullback-Leibler divergence. In this paper, we propose to learn probabilistic forward models to compute a probability distribution over trajectories. We compare our approach to model-based reinforcement learning methods with hand-crafted cost functions. Finally, we evaluate our method with experiments on a real compliant robot. |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Englert_ICRA_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Englert_ICRA_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gopalan, N.; Deisenroth, M. P.; Peters, J. |
Year | 2013 |
Title | Feedback Error Learning for Rhythmic Motor Primitives |
Journal/Conference/Book Title | Proceedings of 2013 IEEE International Conference on Robotics and Automation (ICRA) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gopalan_ICRA_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gopalan_ICRA_2013.pdf |
Reference Type | Journal Article |
Author(s) | Wang, Z.; Muelling, K.; Deisenroth, M. P.; Ben Amor, H.; Vogt, D.; Schoelkopf, B.; Peters, J. |
Year | 2013 |
Title | Probabilistic Movement Modeling for Intention Inference in Human-Robot Interaction |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Volume | 32 |
Number | 7 |
Pages | 841-858 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_IJRR_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_IJRR_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J.; Muelling, K.; Kroemer, O.; Neumann, G. |
Year | 2013 |
Title | Towards Robot Skill Learning: From Simple Skills to Table Tennis |
Journal/Conference/Book Title | Proceedings of the European Conference on Machine Learning (ECML), Nectar Track |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/peters_ECML_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/peters_ECML_2013.pdf |
Reference Type | Journal Article |
Author(s) | Kober, J.; Bagnell, D.; Peters, J. |
Year | 2013 |
Title | Reinforcement Learning in Robotics: A Survey |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Volume | 32 |
Number | 11 |
Pages | 1238-1274 |
URL(s) | http://www.ias.tu-darmstadt.de/uploads/Publications/Kober_IJRR_2013.pdf |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Kober_IJRR_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kupcsik, A.G.; Deisenroth, M.P.; Peters, J.; Neumann, G. |
Year | 2013 |
Title | Data-Efficient Generalization of Robot Skills with Contextual Policy Search |
Journal/Conference/Book Title | Proceedings of the National Conference on Artificial Intelligence (AAAI) |
Keywords | GeRT, ComPLACS |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kupcsik_AAAI_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kupcsik_AAAI_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bocsi, B.; Csato, L.; Peters, J. |
Year | 2013 |
Title | Alignment-based Transfer Learning for Robot Models |
Journal/Conference/Book Title | Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_IJCNN_2013.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_IJCNN_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Daniel, C.; Neumann, G.; Peters, J. |
Year | 2013 |
Title | Autonomous Reinforcement Learning with Hierarchical REPS |
Journal/Conference/Book Title | Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN) |
Keywords | GeRT, CompLACS |
Reference Type | Journal Article |
Author(s) | Englert, P.; Paraschos, A.; Peters, J.; Deisenroth, M.P. |
Year | 2013 |
Title | Probabilistic Model-based Imitation Learning |
Journal/Conference/Book Title | Adaptive Behavior |
Volume | 21 |
Pages | 388-403 |
URL(s) | http://www.ias.tu-darmstadt.de/uploads/Publications/Englert_ABJ_2013.pdf |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Englert_ABJ_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J.; Muelling, K.; Nguyen-Tuong, D.; Kroemer, O. |
Year | 2013 |
Title | Learning Skills with Motor Primitives |
Journal/Conference/Book Title | Proceedings of the 16th Yale Learning Workshop |
Reference Type | Conference Proceedings |
Author(s) | Neumann, G.; Kupcsik, A.G.; Deisenroth, M.P.; Peters, J. |
Year | 2013 |
Title | Information-Theoretic Motor Skill Learning |
Journal/Conference/Book Title | Proceedings of the AAAI 2013 Workshop on Intelligent Robotic Systems |
Keywords | CompLACS
Reference Type | Conference Proceedings |
Author(s) | Ben Amor, H.; Vogt, D.; Ewerton, M.; Berger, E.; Jung, B.; Peters, J. |
Year | 2013 |
Title | Learning Responsive Robot Behavior by Imitation |
Journal/Conference/Book Title | Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/iros2013Heni.pdf |
Reference Type | Journal Article |
Author(s) | Deisenroth, M. P.; Neumann, G.; Peters, J. |
Year | 2013 |
Title | A Survey on Policy Search for Robotics |
Journal/Conference/Book Title | Foundations and Trends in Robotics |
Keywords | CompLACS |
Volume | 2
Number | 1-2
Pages | 1-142
URL(s) | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/PolicySearchReview.pdf |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/PolicySearchReview.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; Kroemer, O.; Peters, J.
Year | 2013 |
Title | Probabilistic Interactive Segmentation for Anthropomorphic Robots in Cluttered Environments |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GeRT, CompLACS
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/hoof-HUMANOIDS.pdf |
Reference Type | Conference Proceedings |
Author(s) | Paraschos, A.; Neumann, G.; Peters, J.
Year | 2013 |
Title | A Probabilistic Approach to Robot Trajectory Generation |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo, CompLACS
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Paraschos_Humanoids_2013.pdf |
Reference Type | Conference Proceedings |
Author(s) | Berger, E.; Vogt, D.; Haji-Ghassemi, N.; Jung, B.; Ben Amor, H. |
Year | 2013 |
Title | Inferring Guidance Information in Cooperative Human-Robot Tasks |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | CoDyCo |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/humanoids2013Heni.pdf |
Reference Type | Conference Proceedings |
Author(s) | Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G.
Year | 2013 |
Title | Probabilistic Movement Primitives |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS / NeurIPS) |
Keywords | CoDyCo, CompLACS
Place Published | Cambridge, MA |
Publisher | MIT Press |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Paraschos_NIPS_2013a.pdf |
Reference Type | Book Section |
Author(s) | Sigaud, O.; Peters, J. |
Year | 2012 |
Title | Robot Learning |
Journal/Conference/Book Title | Encyclopedia of the Sciences of Learning, Springer Verlag |
Publisher | Springer Verlag |
Reprint Edition | 978-1-4419-1428-6 |
URL(s) | http://dx.doi.org/10.1007/978-3-642-05181-4_1 |
Reference Type | Journal Article |
Author(s) | Lampert, C.H.; Peters, J. |
Year | 2012 |
Title | Real-Time Detection of Colored Objects In Multiple Camera Streams With Off-the-Shelf Hardware Components |
Journal/Conference/Book Title | Journal of Real-Time Image Processing |
Volume | 7 |
Number | 1 |
Pages | 31-41 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/rtblob-jrtip2010_6651[0].pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/rtblob-jrtip2010_6651[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Daniel, C.; Neumann, G.; Peters, J. |
Year | 2012 |
Title | Hierarchical Relative Entropy Policy Search |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS 2012) |
Keywords | GeRT, CompLACS |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Member/ChristianDaniel/DanielAISTATS2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Member/ChristianDaniel/DanielAISTATS2012.pdf |
Reference Type | Journal Article |
Author(s) | Deisenroth, M.P.; Turner, R.; Huber, M.; Hanebeck, U.D.; Rasmussen, C.E |
Year | 2012 |
Title | Robust Filtering and Smoothing with Gaussian Processes |
Journal/Conference/Book Title | IEEE Transactions on Automatic Control |
Keywords | Gaussian process, filtering, smoothing |
Abstract | We propose a principled algorithm for robust Bayesian filtering and smoothing in nonlinear stochastic dynamic systems when both the transition function and the measurement function are described by non-parametric Gaussian process (GP) models. GPs are gaining increasing importance in signal processing, machine learning, robotics, and control for representing unknown system functions by posterior probability distributions. This modern way of "system identification" is more robust than finding point estimates of a parametric function representation. Our principled filtering/smoothing approach for GP dynamic systems is based on analytic moment matching in the context of the forward-backward algorithm. Our numerical evaluations demonstrate the robustness of the proposed approach in situations where other state-of-the-art Gaussian filters and smoothers can fail. |
Publisher | IEEE
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/deisenroth_IEEE-TAC2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Ugur, E.; Oztop, E.; Peters, J.
Year | 2012 |
Title | A Kernel-based Approach to Direct Action Perception |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bocsi, B.; Hennig, P.; Csato, L.; Peters, J. |
Year | 2012 |
Title | Learning Tracking Control with Forward Models |
Journal/Conference/Book Title | Proceedings of the International Conference on Robotics and Automation (ICRA) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_ICRA_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_ICRA_2012.pdf |
Reference Type | Journal Article |
Author(s) | Kober, J.; Wilhelm, A.; Oztop, E.; Peters, J. |
Year | 2012 |
Title | Reinforcement Learning to Adjust Parametrized Motor Primitives to New Situations |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Keywords | Skill learning; Motor primitives; Reinforcement learning; Meta-parameters; Policy learning |
Publisher | Springer US |
Volume | 33 |
Number | 4 |
Pages | 361-379 |
ISBN/ISSN | 0929-5593 |
URL(s) | http://dx.doi.org/10.1007/s10514-012-9290-3 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/kober_auro2012.pdf |
Language | English |
Reference Type | Journal Article |
Author(s) | Vitzthum, A.; Ben Amor, H.; Heumer, G.; Jung, B. |
Year | 2012 |
Title | XSAMPL3D - An Action Description Language for the Animation of Virtual Characters |
Journal/Conference/Book Title | Journal of Virtual Reality and Broadcasting |
Volume | 9 |
Number | 1 |
URL(s) | http://www.jvrb.org/9.2012 |
Link to PDF | http://www.jvrb.org/9.2012/3262/920121.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wang, Z.; Deisenroth, M.P.; Ben Amor, H.; Vogt, D.; Schoelkopf, B.; Peters, J.
Year | 2012 |
Title | Probabilistic Modeling of Human Movements for Intention Inference |
Journal/Conference/Book Title | Proceedings of Robotics: Science and Systems (R:SS) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_RSS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_RSS_2012.pdf |
Reference Type | Journal Article |
Author(s) | Nguyen-Tuong, D.; Peters, J. |
Year | 2012 |
Title | Online Kernel-based Learning for Task-Space Tracking Robot Control |
Journal/Conference/Book Title | IEEE Transactions on Neural Networks and Learning Systems |
Volume | 23 |
Number | 9 |
Pages | 1417-1425 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/NguyenTuong_TNN_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/NguyenTuong_TNN_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Mohamed, S. |
Year | 2012 |
Title | Expectation Propagation in Gaussian Process Dynamical Systems |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 25 (NIPS/NeurIPS)
Place Published | Cambridge, MA
Abstract | Rich and complex time-series data, such as those generated from engineering systems, financial markets, videos, or neural recordings are now a common feature of modern data analysis. Explaining the phenomena underlying these diverse data sets requires flexible and accurate models. In this paper, we promote Gaussian process dynamical systems as a rich model class that is appropriate for such an analysis. We present a new approximate message-passing algorithm for Bayesian state estimation and inference in Gaussian process dynamical systems, a non-parametric probabilistic generalization of commonly used state-space models. We derive our message-passing algorithm using Expectation Propagation and provide a unifying perspective on message passing in general state-space models. We show that existing Gaussian filters and smoothers appear as special cases within our inference framework, and that these existing approaches can be improved upon using iterated message passing. Using both synthetic and real-world data, we demonstrate that iterated message passing can improve inference in a wide range of tasks in Bayesian state estimation, thus leading to improved predictions and more effective decision making. |
Publisher | The MIT Press |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_NIPS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J.; Muelling, K.; Nguyen-Tuong, D.; Kroemer, O. |
Year | 2012 |
Title | Robot Skill Learning |
Journal/Conference/Book Title | Proceedings of the European Conference on Artificial Intelligence (ECAI) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ECAI2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ECAI2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Boularias, A.; Kroemer, O.; Peters, J. |
Year | 2012 |
Title | Structured Apprenticeship Learning |
Journal/Conference/Book Title | Proceedings of the European Conference on Machine Learning (ECML) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_ECML_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_ECML_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Meyer, T.; Peters, J.;Broetz, D.; Zander, T.; Schoelkopf, B.; Soekadar, S.; Grosse-Wentrup, M. |
Year | 2012 |
Title | A Brain-Robot Interface for Studying Motor Learning after Stroke |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS) |
Keywords | Team Athena-Minerva |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Meyer_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Meyer_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Calandra, R.; Seyfarth, A.; Peters, J. |
Year | 2012 |
Title | Toward Fast Policy Search for Learning Legged Locomotion |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS) |
Keywords | legged locomotion, policy search, reinforcement learning, Gaussian process |
Abstract | Legged locomotion is one of the most versatile forms of mobility. However, despite the importance of legged locomotion and the large number of legged robotics studies, no biped or quadruped matches the agility and versatility of their biological counterparts to date. Approaches to designing controllers for legged locomotion systems are often based on either the assumption of perfectly known dynamics or mechanical designs that substantially reduce the dimensionality of the problem. The few existing approaches for learning controllers for legged systems either require exhaustive real-world data or they improve controllers only conservatively, leading to slow learning. We present a data-efficient approach to learning feedback controllers for legged locomotive systems, based on learned probabilistic forward models for generating walking policies. On a compass walker, we show that our approach allows for learning gait policies from very little data. Moreover, we analyze learned locomotion models of a biomechanically inspired biped. Our approach has the potential to scale to high-dimensional humanoid robots with little loss in efficiency. |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; Kroemer, O.;Ben Amor, H.; Peters, J. |
Year | 2012 |
Title | Maximally Informative Interaction Learning for Scene Exploration |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VanHoof_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VanHoof_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Daniel, C.; Neumann, G.; Peters, J. |
Year | 2012 |
Title | Learning Concurrent Motor Skills in Versatile Solution Spaces |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS) |
Keywords | GeRT, CompLACS |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ben Amor, H.; Kroemer, O.; Hillenbrand, U.; Neumann, G.; Peters, J. |
Year | 2012 |
Title | Generalization of Human Grasping for Multi-Fingered Robot Hands |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/BenAmor_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/BenAmor_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Muelling, K.; Peters, J.
Year | 2012 |
Title | Learning Throwing and Catching Skills |
Journal/Conference/Book Title | Proceedings of the International Conference on Robot Systems (IROS), Video Track |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kober_IROS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kober_IROS_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Peters, J. |
Year | 2012 |
Title | Solving Nonlinear Continuous State-Action-Observation POMDPs for Mechanical Systems with Gaussian Noise |
Journal/Conference/Book Title | Proceedings of the European Workshop on Reinforcement Learning (EWRL) |
Link to PDF | http://www.ias.tu-darmstadt.de/uploads/Publications/Deisenroth_EWRL_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Muelling, K.; Kober, J.; Kroemer, O.; Peters, J. |
Year | 2012 |
Title | Learning to Select and Generalize Striking Movements in Robot Table Tennis |
Journal/Conference/Book Title | Proceedings of the AAAI 2012 Fall Symposium on Robots that Learn Interactively from Human Teachers |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/aaaifss12rliht_submission_2.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/aaaifss12rliht_submission_2.pdf |
Reference Type | Conference Proceedings |
Author(s) | Calandra, R.; Raiko, T.; Deisenroth, M.P.; Montesino Pouzols, F. |
Year | 2012 |
Title | Learning Deep Belief Networks from Non-Stationary Streams |
Journal/Conference/Book Title | International Conference on Artificial Neural Networks (ICANN) |
Keywords | deep learning, non-stationary data |
Abstract | Deep learning has proven to be beneficial for complex tasks such as classifying images. However, this approach has been mostly applied to static datasets. The analysis of non-stationary (e.g., concept drift) streams of data involves specific issues connected with the temporal and changing nature of the data. In this paper, we propose a proof-of-concept method, called Adaptive Deep Belief Networks, of how deep learning can be generalized to learn online from changing streams of data. We do so by exploiting the generative properties of the model to incrementally re-train the Deep Belief Network whenever new data are collected. This approach eliminates the need to store past observations and, therefore, requires only constant memory consumption. Hence, our approach can be valuable for life-long learning from non-stationary data streams. |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/calandra_icann2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Meyer, T.; Peters, J.; Broetz, D.; Zander, T.; Schoelkopf, B.; Soekadar, S.; Grosse-Wentrup, M. |
Year | 2012 |
Title | Investigating the Neural Basis for Stroke Rehabilitation by Brain-Computer Interfaces |
Journal/Conference/Book Title | International Conference on Neurorehabilitation |
Keywords | Team Athena-Minerva |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Ben Amor, H.; Ewerton, M.; Peters, J. |
Year | 2012 |
Title | Point Cloud Completion Using Extrusions |
Journal/Conference/Book Title | Proceedings of the International Conference on Humanoid Robots (HUMANOIDS) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_Humanoids_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_Humanoids_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Boularias, A.; Kroemer, O.; Peters, J. |
Year | 2012 |
Title | Algorithms for Learning Markov Field Policies |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 25 (NIPS/NeurIPS)
Keywords | GeRT |
Place Published | Cambridge, MA |
Publisher | MIT Press |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_NIPS_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_NIPS_2012.pdf |
Reference Type | Book |
Author(s) | Deisenroth, M.P.; Szepesvari, C.; Peters, J.
Year | 2012 |
Journal/Conference/Book Title | Proceedings of the 10th European Workshop on Reinforcement Learning |
Editor(s) | Deisenroth, M.P.; Szepesvari, C.; Peters, J.
Place Published | JMLR W&CP
Volume | 24 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_EWRLproceedings_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_EWRLproceedings_2012.pdf |
Reference Type | Journal Article |
Author(s) | Piater, J.; Jodogne, S.; Detry, R.; Kraft, D.; Krueger, N.; Kroemer, O.; Peters, J. |
Year | 2011 |
Title | Learning Visual Representations for Perception-Action Systems |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Volume | 30 |
Number | 3 |
Pages | 294-307 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Piater_IJRR_2010.pdf |
Reference Type | Journal Article |
Author(s) | Detry, R.; Kraft, D.; Kroemer, O.; Peters, J.; Krueger, N.; Piater, J.
Year | 2011 |
Title | Learning Grasp Affordance Densities |
Journal/Conference/Book Title | Paladyn Journal of Behavioral Robotics |
Keywords | GeRT |
Volume | 2 |
Number | 1 |
Pages | 1-17 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Detry_PJBR_2011.pdf |
Reference Type | Journal Article |
Author(s) | Kober, J.; Peters, J. |
Year | 2011 |
Title | Policy Search for Motor Primitives in Robotics |
Journal/Conference/Book Title | Machine Learning (MLJ) |
Volume | 84 |
Number | 1-2 |
Pages | 171-203 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/kober_MACH_2011.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/kober_MACH_2011.pdf |
Reference Type | Journal Article |
Author(s) | Nguyen-Tuong, D.; Peters, J.
Year | 2011 |
Title | Incremental Sparsification for Real-time Online Model Learning |
Journal/Conference/Book Title | Neurocomputing |
Volume | 74 |
Number | 11 |
Pages | 1859-1867 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Nguyen_NEURO_2011.pdf |
Reference Type | Journal Article |
Author(s) | Gomez Rodriguez, M.; Peters, J.; Hill, J.; Schoelkopf, B.; Gharabaghi, A.; Grosse-Wentrup, M. |
Year | 2011 |
Title | Closing the Sensorimotor Loop: Haptic Feedback Helps Decoding of Motor Imagery |
Journal/Conference/Book Title | Journal of Neural Engineering
Keywords | Team Athena-Minerva |
Volume | 8 |
Number | 3 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gomez-RodriguezJNE2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lampariello, R.; Nguyen-Tuong, D.; Castellini, C.; Hirzinger, G.; Peters, J.
Year | 2011 |
Title | Energy-optimal robot catching in real-time |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Lampariello_ICRA_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Peters, J. |
Year | 2011 |
Title | A Flexible Hybrid Framework for Modeling Complex Manipulation Tasks |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Keywords | GeRT |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Peters, J. |
Year | 2011 |
Title | Active Exploration for Robot Parameter Selection in Episodic Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the 2011 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL) |
Keywords | GeRT |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ADPRL_2011.pdf |
Reference Type | Journal Article |
Author(s) | Kroemer, O.; Lampert, C.H.; Peters, J. |
Year | 2011 |
Title | Learning Dynamic Tactile Sensing with Robust Vision-based Training |
Journal/Conference/Book Title | IEEE Transactions on Robotics (T-Ro) |
Volume | 27 |
Number | 3 |
Pages | 545-557 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_TRo_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Boularias, A.; Kroemer, O.; Peters, J. |
Year | 2011 |
Title | Learning Robot Grasping from 3D Images with Markov Random Fields |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robot Systems (IROS) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Publications/Boularias_IROS_2011.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Boularias_IROS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Peters, J. |
Year | 2011 |
Title | A Non-Parametric Approach to Dynamic Programming |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 24 (NIPS/NeurIPS)
Keywords | GeRT |
Place Published | Cambridge, MA |
Publisher | MIT Press |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer2011NIPS.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer2011NIPS.pdf |
Reference Type | Conference Proceedings |
Author(s) | van Hoof, H.; van der Zant, T.; Wiering, M.A.
Year | 2011 |
Title | Adaptive Visual Face Tracking for an Autonomous Robot |
Journal/Conference/Book Title | Proceedings of the Belgian-Dutch Artificial Intelligence Conference (BNAIC 11) |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VanHoof_BNAIC_2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VanHoof_BNAIC_2011.pdf |
Reference Type | Journal Article |
Author(s) | Muelling, K.; Kober, J.; Peters, J. |
Year | 2011 |
Title | A Biomimetic Approach to Robot Table Tennis |
Journal/Conference/Book Title | Adaptive Behavior Journal |
Volume | 19 |
Number | 5 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Bocsi, B.; Nguyen-Tuong, D.; Csato, L.; Schoelkopf, B.; Peters, J.
Year | 2011 |
Title | Learning Inverse Kinematics with Structured Prediction |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robot Systems (IROS) |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_IROS_2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_IROS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wang, Z.; Lampert, C.H.; Muelling, K.; Schoelkopf, B.; Peters, J.
Year | 2011 |
Title | Learning Anticipation Policies for Robot Table Tennis |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robot Systems (IROS) |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_IROS_2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_IROS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen-Tuong, D.; Peters, J.
Year | 2011 |
Title | Learning Task-Space Tracking Control with Kernels |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robot Systems (IROS) |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Nguyen_IROS_2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Nguyen_IROS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Peters, J. |
Year | 2011 |
Title | Learning Elementary Movements Jointly with a Higher Level Task |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robot Systems (IROS) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Kober_IROS_2011.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kober_IROS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez Rodriguez, M.; Grosse-Wentrup, M.; Hill, J.; Schoelkopf, B.; Gharabaghi, A.; Peters, J. |
Year | 2011 |
Title | Towards Brain-Robot Interfaces for Stroke Rehabilitation |
Journal/Conference/Book Title | Proceedings of the International Conference on Rehabilitation Robotics (ICORR) |
Keywords | Team Athena-Minerva |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gomez_ICORR_2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gomez_ICORR_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Wang, Z.; Boularias, A.; Muelling, K.; Peters, J. |
Year | 2011 |
Title | Balancing Safety and Exploitability in Opponent Modeling |
Journal/Conference/Book Title | Proceedings of the Twenty-Fifth National Conference on Artificial Intelligence (AAAI) |
Keywords | GeRT |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_AAAI_2011.pdf |
Reference Type | Journal Article |
Author(s) | Hachiya, H.; Peters, J.; Sugiyama, M. |
Year | 2011 |
Title | Reward Weighted Regression with Sample Reuse for Direct Policy Search in Reinforcement Learning |
Journal/Conference/Book Title | Neural Computation |
Volume | 23 |
Number | 11 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Hachiya_NC2011.pdf |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Hachiya_NC2011.pdf |
Reference Type | Journal Article |
Author(s) | Nguyen-Tuong, D.; Peters, J.
Year | 2011 |
Title | Model Learning in Robotics: a Survey |
Journal/Conference/Book Title | Cognitive Processing |
Volume | 12 |
Number | 4 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Nguyen_CP_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Oztop, E.; Peters, J. |
Year | 2011 |
Title | Reinforcement Learning to Adjust Robot Movements to New Situations
Journal/Conference/Book Title | Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Best Paper Track |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Kober_IJCAI_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Boularias, A.; Kober, J.; Peters, J. |
Year | 2011 |
Title | Relative Entropy Inverse Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011) |
Keywords | GeRT |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/boularias11a.pdf |
Reference Type | Journal Article |
Author(s) | Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J. |
Year | 2010 |
Title | Recurrent Policy Gradients |
Journal/Conference/Book Title | Logic Journal of the IGPL |
Volume | 18 |
Number | 5
Pages | 620-634 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/jzp049v1_5879.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Oztop, E.; Peters, J. |
Year | 2010 |
Title | Reinforcement Learning to Adjust Robot Movements to New Situations
Journal/Conference/Book Title | Proceedings of Robotics: Science and Systems (R:SS) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/RSS2010-Kober_6438[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Detry, R.; Piater, J.; Peters, J. |
Year | 2010 |
Title | Adapting Preshaped Grasping Movements using Vision Descriptors |
Journal/Conference/Book Title | From Animals to Animats 11, International Conference on the Simulation of Adaptive Behavior (SAB) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Kroemer_6437[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Kroemer_6437[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Detry, R.; Piater, J.; Peters, J. |
Year | 2010 |
Title | Grasping with Vision Descriptors and Motor Primitives |
Journal/Conference/Book Title | Proceedings of the International Conference on Informatics in Control, Automation and Robotics (ICINCO) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/ICINCO2010-Kroemer_6436[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/ICINCO2010-Kroemer_6436[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Muelling, K.; Kober, J.; Peters, J. |
Year | 2010 |
Title | Simulating Human Table Tennis with a Biomimetic Robot Setup |
Journal/Conference/Book Title | From Animals to Animats 11, International Conference on the Simulation of Adaptive Behavior (SAB) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Muelling_6626[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Muelling_6626[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen Tuong, D.; Peters, J. |
Year | 2010 |
Title | Incremental Sparsification for Real-time Online Model Learning |
Journal/Conference/Book Title | Proceedings of Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/AISTATS2010-Nguyen-Tuong.pdf |
Reference Type | Journal Article |
Author(s) | Kober, J.; Peters, J. |
Year | 2010 |
Title | Imitation and Reinforcement Learning - Practical Algorithms for Motor Primitive Learning in Robotics |
Journal/Conference/Book Title | IEEE Robotics and Automation Magazine |
Volume | 17 |
Number | 2 |
Pages | 55-62 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/kober_RAM_2010.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/kober_RAM_2010.pdf |
Reference Type | Journal Article |
Author(s) | Kroemer, O.; Detry, R.; Piater, J.; Peters, J. |
Year | 2010 |
Title | Combining Active Learning and Reactive Control for Robot Grasping |
Journal/Conference/Book Title | Robotics and Autonomous Systems |
Keywords | GeRT |
Volume | 58 |
Number | 9 |
Pages | 1105-1116 |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/KroemerJRAS_6636[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/KroemerJRAS_6636[0].pdf |
Reference Type | Book Section |
Author(s) | Nguyen Tuong, D.; Peters, J.; Seeger, M. |
Year | 2010 |
Title | Real-Time Local GP Model Learning |
Journal/Conference/Book Title | From Motor Learning to Interaction Learning in Robots, Springer Verlag |
Number | 264 |
Reprint Edition | 978-3-642-05180-7 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/LGP_IROS_Chapter_6233.pdf |
Reference Type | Book Section |
Author(s) | Peters, J.; Tedrake, R.; Roy, N.; Morimoto, J. |
Year | 2010 |
Title | Robot Learning |
Journal/Conference/Book Title | Encyclopedia of Machine Learning |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/EncyclopediaMachineLearning-Peters-RobotLearning_[0].pdf |
Reference Type | Book |
Author(s) | Sigaud, O.; Peters, J. |
Year | 2010 |
Title | From Motor Learning to Interaction Learning in Robots |
Journal/Conference/Book Title | Studies in Computational Intelligence, Springer Verlag |
Number | 264 |
Reprint Edition | 978-3-642-05180-7 |
Link to PDF | http://dx.doi.org/10.1007/978-3-642-05181-4 |
Reference Type | Book Section |
Author(s) | Kober, J.; Mohler, B.; Peters, J. |
Year | 2010 |
Title | Imitation and Reinforcement Learning for Motor Primitives with Perceptual Coupling |
Journal/Conference/Book Title | From Motor Learning to Interaction Learning in Robots, Springer Verlag |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Imitation%20and%20Reinforcement%20Learning%20for%20Motor%20Primitives%20with%20Perceptual%20Coupling_6234[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Muelling, K.; Altun, Y. |
Year | 2010 |
Title | Relative Entropy Policy Search |
Journal/Conference/Book Title | Proceedings of the Twenty-Fourth National Conference on Artificial Intelligence (AAAI), Physically Grounded AI Track |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/Peters2010_REPS.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Muelling, K.; Kroemer, O.; Lampert, C.H.; Schoelkopf, B.; Peters, J. |
Year | 2010 |
Title | Movement Templates for Learning of Hitting and Batting |
Journal/Conference/Book Title | IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/ICRA2010-Kober_6231[1].pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen Tuong, D.; Peters, J. |
Year | 2010 |
Title | Using Model Knowledge for Learning Inverse Dynamics |
Journal/Conference/Book Title | IEEE International Conference on Robotics and Automation |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICRA2010-NguyenTuong_6232.pdf |
Reference Type | Journal Article |
Author(s) | Sehnke, F.; Osendorfer, C.; Rueckstiess, T.; Graves, A.; Peters, J.; Schmidhuber, J. |
Year | 2010 |
Title | Parameter-exploring Policy Gradients |
Journal/Conference/Book Title | Neural Networks |
Volume | 23 |
Number | 4 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Neural-Networks-2010-Sehnke.pdf |
Reference Type | Book Section |
Author(s) | Peters, J.; Bagnell, J.A. |
Year | 2010 |
Title | Policy gradient methods |
Journal/Conference/Book Title | Encyclopedia of Machine Learning (invited article) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Peters_EOMLA_submitted_6074[0].pdf |
Reference Type | Journal Article |
Author(s) | Morimura, T.; Uchibe, E.; Yoshimoto, J.; Peters, J.; Doya, K. |
Year | 2010 |
Title | Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning |
Journal/Conference/Book Title | Neural Computation |
Volume | 22 |
Number | 2 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/LSD_revise_ver3_5904[0].pdf |
Reference Type | Book Section |
Author(s) | Detry, R.; Baseski, E.; Popovic, M.; Touati, Y.; Krueger, N.; Kroemer, O.; Peters, J.; Piater, J. |
Year | 2010 |
Title | Learning Continuous Grasp Affordances by Sensorimotor Exploration |
Journal/Conference/Book Title | From Motor Learning to Interaction Learning in Robots, Springer Verlag |
Number | 264 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Detry-2010-MotorInteractionLearning_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Erkan, A.; Kroemer, O.; Detry, R.; Altun, Y.; Piater, J.; Peters, J. |
Year | 2010 |
Title | Learning Probabilistic Discriminative Models of Grasp Affordances under Limited Supervision |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
Keywords | GeRT |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/erkan_IROS_2010.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/erkan_IROS_2010.pdf |
Reference Type | Conference Proceedings |
Author(s) | Muelling, K.; Kober, J.; Peters, J. |
Year | 2010 |
Title | A Biomimetic Approach to Robot Table Tennis |
Journal/Conference/Book Title | Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez Rodriguez, M.; Grosse-Wentrup, M.; Peters, J.; Naros, G.; Hill, J.; Gharabaghi, A.; Schoelkopf, B. |
Year | 2010 |
Title | Epidural ECoG Online Decoding of Arm Movement Intention in Hemiparesis |
Journal/Conference/Book Title | 1st ICPR Workshop on Brain Decoding: Pattern Recognition Challenges in Neuroimaging |
Keywords | Team Athena-Minerva |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICPR-WBD-2010-Gomez-Rodriguez.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez Rodriguez, M.; Peters, J.; Hill, J.; Schoelkopf, B.; Gharabaghi, A.; Grosse-Wentrup, M. |
Year | 2010 |
Title | Closing the Sensorimotor Loop: Haptic Feedback Facilitates Decoding of Arm Movement Imagery |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (Workshop on Brain-Machine Interfaces) |
Keywords | Team Athena-Minerva |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/eeg-smc2010_6591.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez Rodriguez, M.; Peters, J.; Hill, J.; Gharabaghi, A.; Schoelkopf, B.; Grosse-Wentrup, M. |
Year | 2010 |
Title | BCI and robotics framework for stroke rehabilitation |
Journal/Conference/Book Title | Proceedings of the 4th International BCI Meeting, May 31 - June 4, 2010. Asilomar, CA, USA |
Keywords | Team Athena-Minerva |
Link to PDF | http://bcimeeting.org/2010/ |
Reference Type | Conference Proceedings |
Author(s) | Lampert, C. H.; Kroemer, O. |
Year | 2010 |
Title | Weakly-Paired Maximum Covariance Analysis for Multimodal Dimensionality Reduction and Transfer Learning |
Journal/Conference/Book Title | Proceedings of the 11th European Conference on Computer Vision (ECCV 2010) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/lampert-eccv2010.pdf |
Reference Type | Conference Proceedings |
Author(s) | Chiappa, S.; Peters, J. |
Year | 2010 |
Title | Movement extraction by detecting dynamics switches and repetitions |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 24 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Chiappa_NIPS_2011.pdf |
Reference Type | Conference Proceedings |
Author(s) | Alvarez, M.; Peters, J.; Schoelkopf, B.; Lawrence, N. |
Year | 2010 |
Title | Switched Latent Force Models for Movement Segmentation |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 24 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Alvarez_NIPS_2011.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J.; Kober, J.; Schaal, S. |
Year | 2010 |
Title | Policy learning algorithms for motor learning (Algorithmen zum automatischen Erlernen von Motorfaehigkeiten) |
Journal/Conference/Book Title | Automatisierungstechnik |
Keywords | reinforcement learning, motor skills |
Abstract | Robot learning methods which allow autonomous robots to adapt to novel situations have been a long-standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics. If possible, scaling was usually only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution. |
Volume | 58 |
Number | 12 |
Pages | 688-694 |
Short Title | Policy learning algorithms for motor learning (Algorithmen zum automatischen Erlernen von Motorfähigkeiten) |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/at-Automatisierungstechnik-Algorithmen_zum_Automatischen_Erlernen_von_Motorfhigkeiten |
Reference Type | Conference Proceedings |
Author(s) | Muelling, K.; Kober, J.; Peters, J. |
Year | 2010 |
Title | Learning Table Tennis with a Mixture of Motor Primitives |
Journal/Conference/Book Title | 10th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2010) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_ICHR_2012.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_ICHR_2012.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Muelling, K.; Kober, J. |
Year | 2010 |
Title | Experiments with Motor Primitives to learn Table Tennis |
Journal/Conference/Book Title | 12th International Symposium on Experimental Robotics (ISER 2010) |
Reference Type | Conference Proceedings |
Author(s) | Hachiya, H.; Peters, J.; Sugiyama, M. |
Year | 2009 |
Title | Efficient Sample Reuse in EM-based Policy Search |
Journal/Conference/Book Title | Proceedings of the 16th European Conference on Machine Learning (ECML 2009) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ECML-PKDD-2009-Hachiya_6068[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J.; Muelling, K.; Nguyen-Tuong, D.; Kroemer, O. |
Year | 2009 |
Title | Towards Motor Skill Learning for Robotics |
Journal/Conference/Book Title | Proceedings of the International Symposium on Robotics Research (ISRR), Invited Paper |
Abstract | Learning robots that can acquire new motor skills and refine existing ones has been a long-standing vision of robotics, artificial intelligence, and the cognitive sciences. Early steps towards this goal in the 1980s made clear that reasoning and human insights will not suffice. Instead, new hope has been offered by the rise of modern machine learning approaches. However, to date, it becomes increasingly clear that off-the-shelf machine learning approaches will not suffice for motor skill learning as these methods often do not scale into the high-dimensional domains of manipulator and humanoid robotics nor do they fulfill the real-time requirement of our domain. As an alternative, we propose to break the generic skill learning problem into parts that we can understand well from a robotics point of view. After designing appropriate learning approaches for these basic components, these will serve as the ingredients of a general approach to motor skill learning. In this paper, we discuss our recent and current progress in this direction. For doing so, we present our work on learning to control, on learning elementary movements as well as our steps towards learning of complex tasks. We show several evaluations both using real robots as well as physically realistic simulations. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/peters_ISRR_2007.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen Tuong, D.; Seeger, M.; Peters, J. |
Year | 2009 |
Title | Local Gaussian Process Regression for Real Time Online Model Learning and Control |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 22 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/3403-local-gaussian-process-regression-for-real-time-online-model-learning.pdf |
Reference Type | Conference Proceedings |
Author(s) | Neumann, G.; Peters, J. |
Year | 2009 |
Title | Fitted Q-iteration by Advantage Weighted Regression |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 22 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NIPS2008-Neumann_5520%5B0%5D.pdf |
Reference Type | Journal Article |
Author(s) | Hachiya, H.; Akiyama, T.; Sugiyama, M.; Peters, J. |
Year | 2009 |
Title | Adaptive Importance Sampling for Value Function Approximation in Off-policy Reinforcement Learning |
Journal/Conference/Book Title | Neural Networks |
Keywords | off-policy reinforcement learning; value function approximation; policy iteration; adaptive importance sampling; importance-weighted cross-validation; efficient sample reuse |
Abstract | Off-policy reinforcement learning is aimed at efficiently using data samples gathered from a different policy than the currently optimized one. A common approach is to use importance sampling techniques for compensating for the bias of value function estimators caused by the difference between the data-sampling policy and the target policy. However, existing off-policy methods often do not take the variance of the value function estimators explicitly into account and, therefore, their performance tends to be unstable. To cope with this problem, we propose using an adaptive importance sampling technique which allows us to actively control the trade-off between bias and variance. We further provide a method for optimally determining the trade-off parameter based on a variant of cross-validation. We demonstrate the usefulness of the proposed approach through simulations. |
Volume | 22 |
Number | 10 |
Pages | 1399-1410 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/hachiya-AdaptiveImportanceSampling_5530.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Peters, J. |
Year | 2009 |
Title | Policy Search for Motor Primitives in Robotics |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 22 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NIPS2008-Kober-Peters_5411[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Chiappa, S.; Kober, J.; Peters, J. |
Year | 2009 |
Title | Using Bayesian Dynamical Systems for Motion Template Libraries |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems 22 (NIPS/NeurIPS), Cambridge, MA: MIT Press |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NIPS2008-Chiappa_5400[0].pdf |
Reference Type | Journal Article |
Author(s) | Deisenroth, M.P.; Rasmussen, C.E.; Peters, J. |
Year | 2009 |
Title | Gaussian Process Dynamic Programming |
Journal/Conference/Book Title | Neurocomputing |
Number | 72 |
Pages | 1508-1524 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Neurocomputing-2009-Deisenroth-Preprint_5531.pdf |
Reference Type | Conference Proceedings |
Author(s) | Hoffman, M.; de Freitas, N.; Doucet, A.; Peters, J. |
Year | 2009 |
Title | An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Reward |
Journal/Conference/Book Title | Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AIStats) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/AIStats2009-Hoffman_5658.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J. |
Year | 2009 |
Title | Using Reward-Weighted Imitation for Robot Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the 2009 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/peters_ADPRL_2009.pdf |
Reference Type | Conference Proceedings |
Author(s) | Hachiya, H.; Akiyama, T.; Sugiyama, M.; Peters, J. |
Year | 2009 |
Title | Efficient Data Reuse in Value Function Approximation |
Journal/Conference/Book Title | Proceedings of the 2009 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ADPRL2009-Hachiya.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Peters, J. |
Year | 2009 |
Title | Learning Motor Primitives for Robotics |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/ICRA2009-Kober_5661[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Piater, J.; Jodogne, S.; Detry, R.; Kraft, D.; Krueger, N.; Kroemer, O.; Peters, J. |
Year | 2009 |
Title | Learning Visual Representations for Interactive Systems |
Journal/Conference/Book Title | Proceedings of the International Symposium on Robotics Research (ISRR), Invited Paper |
Abstract | We describe two quite different methods for associating action parameters to visual percepts. Our RLVC algorithm performs reinforcement learning directly on the visual input space. To make this very large space manageable, RLVC interleaves the reinforcement learner with a supervised classification algorithm that seeks to split perceptual states so as to reduce perceptual aliasing. This results in an adaptive discretization of the perceptual space based on the presence or absence of visual features. Its extension RLJC also handles continuous action spaces. In contrast to the minimalistic visual representations produced by RLVC and RLJC, our second method learns structural object models for robust object detection and pose estimation by probabilistic inference. To these models, the method associates grasp experiences autonomously learned by trial and error. These experiences form a non-parametric representation of grasp success likelihoods over gripper poses, which we call a grasp density. Thus, object detection in a novel scene simultaneously produces suitable grasping options. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Piater-2009-ISRR.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Peters, J. |
Year | 2009 |
Title | Learning new basic Movements for Robotics |
Journal/Conference/Book Title | Proceedings of Autonome Mobile Systeme (AMS 2009) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/paper_16.pdf |
Reference Type | Conference Proceedings |
Author(s) | Muelling, K.; Peters, J. |
Year | 2009 |
Title | A computational model of human table tennis for robot application |
Journal/Conference/Book Title | Proceedings of Autonome Mobile Systeme (AMS 2009) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/A_computational_model_of_human_table_tennis.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kroemer, O.; Detry, R.; Piater, J.; Peters, J. |
Year | 2009 |
Title | Active Learning Using Mean Shift Optimization for Robot Grasping |
Journal/Conference/Book Title | Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/kroemer_IROS_2009.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen Tuong, D.; Seeger, M.; Peters, J. |
Year | 2009 |
Title | Sparse Online Model Learning for Robot Control with Support Vector Regression |
Journal/Conference/Book Title | Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Sparse_Online_Model_Learning_for_Robot_Control.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J.; Ng, A. |
Year | 2009 |
Title | Guest Editorial: Special Issue on Robot Learning, Part B |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Volume | 27 |
Number | 2 |
Pages | 91-92 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Peters-Ng2009_Article_GuestEditorialSpecialIssueOnRo.pdf |
Reference Type | Conference Proceedings |
Author(s) | Sigaud, O.; Peters, J. |
Year | 2009 |
Title | From Motor Learning to Interaction Learning in Robots |
Journal/Conference/Book Title | Proceedings of Journees Nationales de la Recherche en Robotique |
Pages | 189-195 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/JNRR2009-Sigaud_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Neumann, G.; Maass, W.; Peters, J. |
Year | 2009 |
Title | Learning Complex Motions by Sequencing Simpler Motion Templates |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML2009) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICML2009-Neumann.pdf |
Reference Type | Conference Proceedings |
Author(s) | Detry, R.; Baseski, E.; Popovic, M.; Touati, Y.; Krueger, N.; Kroemer, O.; Peters, J.; Piater, J. |
Year | 2009 |
Title | Learning Object-specific Grasp Affordance Densities |
Journal/Conference/Book Title | Proceedings of the International Conference on Development & Learning (ICDL 2009) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/ICDL2009-Detry_[0].pdf |
Reference Type | Journal Article |
Author(s) | Nguyen Tuong, D.; Seeger, M.; Peters, J. |
Year | 2009 |
Title | Model Learning with Local Gaussian Process Regression |
Journal/Conference/Book Title | Advanced Robotics |
Volume | 23 |
Number | 15 |
Pages | 2015-2034 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Nguyen-Tuong-ModelLearningLocalGaussian.pdf |
Reference Type | Journal Article |
Author(s) | Kober, J.; Peters, J. |
Year | 2009 |
Title | Reinforcement Learning fuer Motor-Primitive |
Journal/Conference/Book Title | Kuenstliche Intelligenz |
Link to PDF | http://www.kuenstliche-intelligenz.de/index.php?id=7779&tx_ki_pi1[showUid]=1820&cHash=a9015a9e57 |
Reference Type | Journal Article |
Author(s) | Peters, J.; Morimoto, J.; Tedrake, R.; Roy, N. |
Year | 2009 |
Title | Robot Learning |
Journal/Conference/Book Title | IEEE Robotics & Automation Magazine |
Keywords | robot learning, tc spotlight |
Volume | 16 |
Number | 3 |
Pages | 19-20 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/05233410.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J.; Ng, A. |
Year | 2009 |
Title | Guest Editorial: Special Issue on Robot Learning, Part A |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Volume | 27 |
Number | 1 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Peters-Ng2009_Article_GuestEditorialSpecialIssueOnRo.pdf |
Reference Type | Conference Proceedings |
Author(s) | Lampert, C.H.; Peters, J. |
Year | 2009 |
Title | Active Structured Learning for High-Speed Object Detection |
Journal/Conference/Book Title | Proceedings of the DAGM (Pattern Recognition) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/DAGM2009-Lampert.pdf |
Reference Type | Conference Proceedings |
Author(s) | Gomez Rodriguez, M.; Kober, J.; Schoelkopf, B. |
Year | 2009 |
Title | Denoising photographs using dark frames optimized by quadratic programming |
Journal/Conference/Book Title | Proceedings of the First IEEE International Conference on Computational Photography (ICCP 2009) |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/ICCP09-GomezRodriguez_5491[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/ICCP09-GomezRodriguez_5491[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Peters, J.; Rasmussen, C.E. |
Year | 2008 |
Title | Approximate Dynamic Programming with Gaussian Processes |
Journal/Conference/Book Title | American Control Conference |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Main/PublicationsByYear/deisenroth_ACC2008.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Seeger, M. |
Year | 2008 |
Title | Computed Torque Control with Nonparametric Regression Techniques |
Journal/Conference/Book Title | American Control Conference |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NguyenTuong_ACC2008.pdf |
Reference Type | Conference Proceedings |
Author(s) | Deisenroth, M.P.; Rasmussen, C.E.; Peters, J. |
Year | 2008 |
Title | Model-Based Reinforcement Learning with Continuous States and Actions |
Journal/Conference/Book Title | Proceedings of the European Symposium on Artificial Neural Networks (ESANN 2008) |
Pages | 19-24 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/deisenroth_ESANN2008.pdf |
Reference Type | Journal Article |
Author(s) | Steinke, F.; Hein, M.; Peters, J.; Schoelkopf, B. |
Year | 2008 |
Title | Manifold-valued Thin-Plate Splines with Applications in Computer Graphics |
Journal/Conference/Book Title | Computer Graphics Forum (Special Issue on Eurographics 2008) |
Volume | 27 |
Number | 2 |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Steinke_EGFinal-1049.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen Tuong, D.; Peters, J. |
Year | 2008 |
Title | Learning Inverse Dynamics: a Comparison |
Journal/Conference/Book Title | Proceedings of the European Symposium on Artificial Neural Networks (ESANN 2008) |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NguyenTuong_ACC2008.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Nguyen-Tuong, D. |
Year | 2008 |
Title | Real-Time Learning of Resolved Velocity Control on a Mitsubishi PA-10 |
Journal/Conference/Book Title | International Conference on Robotics and Automation (ICRA) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICRA2008-Peters_4865[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Hachiya, H.; Akiyama, T.; Sugiyama, M.; Peters, J. |
Year | 2008 |
Title | Adaptive Importance Sampling with Automatic Model Selection in Value Function Approximation |
Journal/Conference/Book Title | Proceedings of the Twenty-Third National Conference on Artificial Intelligence (AAAI 2008) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/AAAI-2008-Hachiya_5096[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Wierstra, D.; Schaul, T.; Peters, J.; Schmidhuber, J. |
Year | 2008 |
Title | Natural Evolution Strategies |
Journal/Conference/Book Title | 2008 IEEE Congress on Evolutionary Computation |
Abstract | This paper presents Natural Evolution Strategies (NES), a novel algorithm for performing real-valued black box function optimization: optimizing an unknown objective function where algorithm-selected function measurements constitute the only information accessible to the method. Natural Evolution Strategies search the fitness landscape using a multivariate normal distribution with a self-adapting mutation matrix to generate correlated mutations in promising regions. NES shares this property with Covariance Matrix Adaption (CMA), an Evolution Strategy (ES) which has been shown to perform well on a variety of high-precision optimization tasks. The Natural Evolution Strategies algorithm, however, is simpler, less ad-hoc and more principled. Self-adaptation of the mutation matrix is derived using a Monte Carlo estimate of the natural gradient towards better expected fitness. By following the natural gradient instead of the 'vanilla' gradient, we can ensure efficient update steps while preventing early convergence due to overly greedy updates, resulting in reduced sensitivity to local suboptima. We show NES has competitive performance with CMA on several tasks, while outperforming it on one task that is rich in deceptive local optima, the Rastrigin benchmark. |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/wierstra-CEC2008.pdf |
Reference Type | Conference Proceedings |
Author(s) | Nguyen-Tuong, D.; Peters, J. |
Year | 2008 |
Title | Local Gaussian Process Regression for Real-time Model-based Robot Control |
Journal/Conference/Book Title | International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2008-Nguyen_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Mohler, B.; Peters, J. |
Year | 2008 |
Title | Learning Perceptual Coupling for Motor Primitives |
Journal/Conference/Book Title | International Conference on Intelligent Robots and Systems (IROS) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2008-Kober_5414[0].pdf |
Reference Type | Book |
Author(s) | Lesperance, Y.; Lakemeyer, G.; Peters, J.; Pirri, F. |
Year | 2008 |
Title | Proceedings of the 6th International Cognitive Robotics Workshop (CogRob 2008) |
Journal/Conference/Book Title | July 21-22, 2008, Patras, Greece, ISBN 978-960-6843-09-9 |
Reference Type | Conference Proceedings |
Author(s) | Wierstra, D.; Schaul, T.; Peters, J.; Schmidhuber, J. |
Year | 2008 |
Title | Fitness Expectation Maximization |
Journal/Conference/Book Title | 10th International Conference on Parallel Problem Solving from Nature (PPSN 2008) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ppsn08.pdf |
Reference Type | Journal Article |
Author(s) | Nakanishi, J.; Cory, R.; Mistry, M.; Peters, J.; Schaal, S. |
Year | 2008 |
Title | Operational space control: A theoretical and empirical comparison |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | task space control, operational space control, redundancy resolution, humanoid robotics |
Abstract | Dexterous manipulation with a highly redundant movement system is one of the hallmarks of human motor skills. From numerous behavioral studies, there is strong evidence that humans employ compliant task space control, i.e., they focus control only on task variables while keeping redundant degrees-of-freedom as compliant as possible. This strategy is robust towards unknown disturbances and simultaneously safe for the operator and the environment. The theory of operational space control in robotics aims to achieve similar performance properties. However, despite various compelling theoretical lines of research, advanced operational space control is hardly found in actual robotics implementations, in particular new kinds of robots like humanoids and service robots, which would strongly profit from compliant dexterous manipulation. To analyze the pros and cons of different approaches to operational space control, this paper focuses on a theoretical and empirical evaluation of different methods that have been suggested in the literature, but also some new variants of operational space controllers. We address formulations at the velocity, acceleration and force levels. First, we formulate all controllers in a common notational framework, including quaternion-based orientation control, and discuss some of their theoretical properties. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm with several benchmark tasks. As an aside, we also introduce a novel parameter estimation algorithm for rigid body dynamics, which ensures physical consistency, as this issue was crucial for our successful robot implementations. Our extensive empirical results demonstrate that one of the simplified acceleration-based approaches can be advantageous in terms of task performance, ease of parameter tuning, and general robustness and compliance in face of inevitable modeling errors. |
Volume | 27 |
Number | 6 |
Pages | 737-757 |
Short Title | Operational space control: A theoretical and empirical comparison |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Int-J-Robot-Res-2008-27-737_5027[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Wierstra, D.; Schaul, T.; Peters, J.; Schmidhuber, J. |
Year | 2008 |
Title | Episodic Reinforcement Learning by Logistic Reward-Weighted Regression |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Neural Networks (ICANN) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/wierstra_ICANN08.pdf |
Reference Type | Conference Proceedings |
Author(s) | Sehnke, F.; Osendorfer, C.; Rueckstiess, T.; Graves, A.; Peters, J.; Schmidhuber, J. |
Year | 2008 |
Title | Policy Gradients with Parameter-based Exploration for Control |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Neural Networks (ICANN) |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/icann2008sehnke.pdf |
Reference Type | Book |
Author(s) | Peters, J. |
Year | 2008 |
Title | Machine Learning for Robotics |
Journal/Conference/Book Title | VDM-Verlag, ISBN 978-3-639-02110-3 |
ISBN/ISSN | ISBN 978-3-639-02110-3 |
Link to PDF | http://www.amazon.de/Machine-Learning-Robotics-Methods-Skills/dp/363902110X/ref=sr_1_1?ie=UTF8&s=books&qid=1220658804&sr=8-1 |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Kober, J.; Nguyen-Tuong, D. |
Year | 2008 |
Title | Policy Learning - a unified perspective with applications in robotics |
Journal/Conference/Book Title | Proceedings of the European Workshop on Reinforcement Learning (EWRL) |
Keywords | reinforcement learning, policy gradient, weighted regression |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/8808e934beb11e344433a6c98a68269e26f1.pdf |
Reference Type | Conference Proceedings |
Author(s) | Kober, J.; Peters, J. |
Year | 2008 |
Title | Reinforcement Learning of Perceptual Coupling for Motor Primitives |
Journal/Conference/Book Title | Proceedings of the European Workshop on Reinforcement Learning (EWRL) |
Reference Type | Journal Article |
Author(s) | Peters, J. |
Year | 2008 |
Title | Machine Learning for Motor Skills in Robotics |
Journal/Conference/Book Title | Kuenstliche Intelligenz |
Keywords | motor control, motor primitives, motor learning |
Abstract | Autonomous robots that can adapt to novel situations has been a long standing vision of robotics, artificial intelligence, and the cognitive sciences. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks of future robots. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator and humanoid robotics and usually scaling was only achieved in precisely pre-structured domains. We have investigated the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting. |
Number | 3 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/KuenstlicheIntelligenz-2008-Peters_[0].pdf |
Reference Type | Conference Paper |
Author(s) | Nguyen-Tuong, D.; Peters, J.; Seeger, M.; Schoelkopf, B. |
Year | 2008 |
Title | Learning Robot Dynamics for Computed Torque Control using Local Gaussian Process Regression |
Journal/Conference/Book Title | Proceedings of the ECSIS Symposium on Learning and Adaptive Behavior in Robotic Systems, LAB-RS 2008 |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/nguyen-ecsis.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J.; Schaal, S. |
Year | 2008 |
Title | Natural Actor-Critic |
Journal/Conference/Book Title | Neurocomputing |
Keywords | reinforcement learning, policy gradient, natural actor-critic, natural gradients |
Abstract | In this paper, we suggest a novel reinforcement learning architecture, the Natural Actor-Critic. The actor updates are achieved using stochastic policy gradients employing Amari's natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression. We show that actor improvements with natural policy gradients are particularly appealing as these are independent of coordinate frame of the chosen policy representation, and can be estimated more efficiently than regular policy gradients. The critic makes use of a special basis function parameterization motivated by the policy-gradient compatible function approximation. We show that several well-known reinforcement learning methods such as the original Actor-Critic and Bradtke's Linear Quadratic Q-Learning are in fact Natural Actor-Critic algorithms. Empirical evaluations illustrate the effectiveness of our techniques in comparison to previous methods, and also demonstrate their applicability for learning control on an anthropomorphic robot arm. |
Volume | 71 |
Number | 7-9 |
Pages | 1180-1190 |
Short Title | Natural Actor-Critic |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NEUCOM-D-07-00618-1_[0].pdf |
Reference Type | Journal Article |
Author(s) | Peters, J.; Schaal, S. |
Year | 2008 |
Title | Learning to control in operational space |
Journal/Conference/Book Title | International Journal of Robotics Research (IJRR) |
Keywords | operational space control, learning, EM ALGORITHM, redundancy resolution, reinforcement learning |
Abstract | One of the most general frameworks for phrasing control problems for complex, redundant robots is operational space control. However, while this framework is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In this paper, we suggest a learning approach for operational space control as a direct inverse model learning problem. A first important insight for this paper is that a physically correct solution to the inverse problem with redundant degrees-of-freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on the insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the machine learning point of view, this learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a three degrees of freedom robot arm are used to illustrate the suggested approach. The application to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex high degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on a real, physical Mitsubishi PA-10 medical robotics arm. |
Volume | 27 |
Pages | 197-212 |
Short Title | Learning to control in operational space |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/Learning_to_Control_in_Operational_Space.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J.; Schaal, S. |
Year | 2008 |
Title | Reinforcement learning of motor skills with policy gradients |
Journal/Conference/Book Title | Neural Networks |
Keywords | Reinforcement learning, Policy gradient methods, Natural gradients, Natural Actor-Critic, Motor skills, Motor primitives |
Abstract | Autonomous learning is one of the hallmarks of human and animal behavior, and understanding the principles of learning will be crucial in order to achieve true autonomy in advanced machines like humanoid robots. In this paper, we examine learning of complex motor skills with human-like limbs. While supervised learning can offer useful tools for bootstrapping behavior, e.g., by learning from demonstration, it is only reinforcement learning that offers a general approach to the final trial-and-error improvement that is needed by each individual acquiring a skill. Neither neurobiological nor machine learning studies have, so far, offered compelling results on how reinforcement learning can be scaled to the high-dimensional continuous state and action spaces of humans or humanoids. Here, we combine two recent research developments on learning motor control in order to achieve this scaling. First, we interpret the idea of modular motor control by means of motor primitives as a suitable way to generate parameterized control policies for reinforcement learning. Second, we combine motor primitives with the theory of stochastic policy gradient learning, which currently seems to be the only feasible framework for reinforcement learning for humanoids. We evaluate different policy gradient methods with a focus on their applicability to parameterized motor primitives. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm. |
Volume | 21 |
Number | 4 |
Pages | 682-97 |
Date | May |
Short Title | Reinforcement learning of motor skills with policy gradients |
ISBN/ISSN | 0893-6080 (Print) |
Accession Number | 18482830 |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Neural-Netw-2008-21-682_4867[0].pdf |
Address | Max Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 Tubingen, Germany; University of Southern California, 3710 S. McClintoch Ave-RTH401, Los Angeles, CA 90089-2905, USA. |
Language | eng |
Reference Type | Journal Article |
Author(s) | Peters, J.; Mistry, M.; Udwadia, F. E.; Nakanishi, J.; Schaal, S. |
Year | 2008 |
Title | A unifying framework for robot control with redundant DOFs |
Journal/Conference/Book Title | Autonomous Robots (AURO) |
Keywords | operational space control, inverse control, dexterous manipulation, optimal control |
Abstract | Recently, Udwadia (Proc. R. Soc. Lond. A 2003:1783–1800, 2003) suggested to derive tracking controllers for mechanical systems with redundant degrees-of-freedom (DOFs) using a generalization of Gauss’ principle of least constraint. This method allows reformulating control problems as a special class of optimal controllers. In this paper, we take this line of reasoning one step further and demonstrate that several well-known and also novel nonlinear robot control laws can be derived from this generic methodology. We show experimental verifications on a Sarcos Master Arm robot for some of the derived controllers. The suggested approach offers a promising unification and simplification of nonlinear control law design for robots obeying rigid body dynamics equations, both with or without external constraints, with over-actuation or underactuation, as well as open-chain and closed-chain kinematics. |
Volume | 24 |
Number | 1 |
Pages | 1-12 |
Short Title | A unifying methodology for robot control with redundant DOFs |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/AR-2008final_[0].pdf |
Reference Type | Thesis |
Author(s) | Kober, J. |
Year | 2008 |
Title | Reinforcement Learning for Motor Primitives |
Journal/Conference/Book Title | Dipl.-Ing. Thesis, University of Stuttgart |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/publications/DiplomaThesis-Kober_5331[0].pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/publications/DiplomaThesis-Kober_5331[0].pdf |
Reference Type | Journal Article |
Author(s) | Peters, J. |
Year | 2007 |
Title | Computational Intelligence: By Amit Konar |
Journal/Conference/Book Title | The Computer Journal |
Keywords | book review |
Volume | 50 |
Number | 6 |
Pages | 758 |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S. |
Year | 2007 |
Title | Policy Learning for Motor Skills |
Journal/Conference/Book Title | Proceedings of the 14th International Conference on Neural Information Processing (ICONIP) |
Keywords | Machine Learning, Reinforcement Learning, Robotics, Motor Primitives, Policy Gradients, Natural Actor-Critic, Reward-Weighted Regression |
Abstract | Policy learning which allows autonomous robots to adapt to novel situations has been a long standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICONIP2007-Peters_4869[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J. |
Year | 2007 |
Title | Solving Deep Memory POMDPs with Recurrent Policy Gradients |
Journal/Conference/Book Title | Proceedings of the International Conference on Artificial Neural Networks (ICANN) |
Keywords | policy gradients, reinforcement learning |
Abstract | This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a 'Long Short-Term Memory' architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/icann2007.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S.; Schoelkopf, B. |
Year | 2007 |
Title | Towards Machine Learning of Motor Skills |
Journal/Conference/Book Title | Proceedings of Autonome Mobile Systeme (AMS) |
Keywords | Motor Skill Learning, Robotics, Natural Actor-Critic, Reward-Weighted Regeression |
Abstract | Autonomous robots that can adapt to novel situations has been a long standing vision of robotics, artificial intelligence, and cognitive sciences. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting. |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Peters_POAMS_2007.pdf |
Reference Type | Conference Proceedings |
Author(s) | Theodorou, E.; Peters, J.; Schaal, S. |
Year | 2007 |
Title | Reinforcement Learning for Optimal Control of Arm Movements |
Journal/Conference/Book Title | Abstracts of the 37th Meeting of the Society for Neuroscience |
Keywords | Optimal Control,Reinforcement Learning, Arm Movements |
Abstract | Everyday motor behavior consists of a plethora of challenging motor skills from discrete movements such as reaching and throwing to rhythmic movements such as walking, drumming and running. How this plethora of motor skills can be learned remains an open question. In particular, is there any unifying computational framework that could model the learning process of this variety of motor behaviors and at the same time be biologically plausible? In this work we aim to give an answer to these questions by providing a computational framework that unifies the learning mechanism of both rhythmic and discrete movements under optimization criteria, i.e., in a non-supervised trial-and-error fashion. Our suggested framework is based on Reinforcement Learning, which is mostly considered as too costly to be a plausible mechanism for learning complex limb movement. However, recent work on reinforcement learning with policy gradients combined with parameterized movement primitives allows novel and more efficient algorithms. By using the representational power of such motor primitives we show how rhythmic motor behaviors such as walking, squashing and drumming as well as discrete behaviors like reaching and grasping can be learned with biologically plausible algorithms. Using extensive simulations and by using different reward functions we provide results that support the hypothesis that Reinforcement Learning could be a viable candidate for motor learning of human motor behavior when other learning methods like supervised learning are not feasible. |
Reference Type | Journal Article |
Author(s) | Nakanishi, J.; Mistry, M.; Peters, J.; Schaal, S. |
Year | 2007 |
Title | Experimental evaluation of task space position/orientation control towards compliant control for humanoid robots |
Journal/Conference/Book Title | IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2007) |
Keywords | operational space control, quaternion, task space control, resolved motion rate control, resolved acceleration, force control |
Abstract | Compliant control will be a prerequisite for humanoid robotics if these robots are supposed to work safely and robustly in human and/or dynamic environments. One view of compliant control is that a robot should control a minimal number of degrees-of-freedom (DOFs) directly, i.e., those relevant DOFs for the task, and keep the remaining DOFs maximally compliant, usually in the null space of the task. This view naturally leads to task space control. However, surprisingly few implementations of task space control can be found in actual humanoid robots. This paper makes a first step towards assessing the usefulness of task space controllers for humanoids by investigating which choices of controllers are available and what inherent control characteristics they have; this treatment will concern position and orientation control, where the latter is based on a quaternion formulation. Empirical evaluations on an anthropomorphic Sarcos master arm illustrate the robustness of the different controllers as well as the ease of implementing and tuning them. Our extensive empirical results demonstrate that simpler task space controllers, e.g., classical resolved motion rate control or resolved acceleration control can be quite advantageous in face of inevitable modeling errors in model-based control, and that well chosen formulations are easy to implement and quite robust, such that they are useful for humanoids. |
Place Published | San Diego, CA: Oct. 29 - Nov. 2 |
Short Title | Experimental evaluation of task space position/orientation control towards compliant control for humanoid robots |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2007-Nakanishi_4722[0].pdf |
Reference Type | Thesis |
Author(s) | Peters, J. |
Year | 2007 |
Title | Machine Learning of Motor Skills for Robotics |
Journal/Conference/Book Title | Ph.D. Thesis, Department of Computer Science, University of Southern California |
Keywords | Machine Learning, Reinforcement Learning, Robotics, Motor Primitives, Policy Gradients, Natural Actor-Critic, Reward-Weighted Regression |
Abstract | Autonomous robots that can assist humans in situations of daily life have been a long standing vision of robotics, artificial intelligence, and cognitive sciences. A first step towards this goal is to create robots that can accomplish a multitude of different tasks, triggered by environmental context or higher level instruction. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning and human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this thesis, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting. As a theoretical foundation, we first study a general framework to generate control laws for real robots with a particular focus on skills represented as dynamical systems in differential constraint form. We present a point-wise optimal control framework resulting from a generalization of Gauss' principle and show how various well-known robot control laws can be derived by modifying the metric of the employed cost function. 
The framework has been successfully applied to task space tracking control for holonomic systems for several different metrics on the anthropomorphic SARCOS Master Arm. In order to overcome the limiting requirement of accurate robot models, we first employ learning methods to find learning controllers for task space control. However, when learning to execute a redundant control problem, we face the general problem of the non-convexity of the solution space which can force the robot to steer into physically impossible configurations if supervised learning methods are employed without further consideration. This problem can be resolved using two major insights, i.e., the learning problem can be treated as locally convex and the cost function of the analytical framework can be used to ensure global consistency. Thus, we derive an immediate reinforcement learning algorithm from the expectation-maximization point of view which leads to a reward-weighted regression technique. This method can be used both for operational space control as well as general immediate reward reinforcement learning problems. We demonstrate the feasibility of the resulting framework on the problem of redundant end-effector tracking for both a simulated 3 degrees of freedom robot arm as well as for a simulated anthropomorphic SARCOS Master Arm. While learning to execute tasks in task space is an essential component to a general framework to motor skill learning, learning the actual task is of even higher importance, particularly as this issue is more frequently beyond the abilities of analytical approaches than execution. We focus on the learning of elemental tasks which can serve as the "building blocks of movement generation", called motor primitives. Motor primitives are parameterized task representations based on splines or nonlinear differential equations with desired attractor properties. 
While imitation learning of parameterized motor primitives is a relatively well-understood problem, the self-improvement by interaction of the system with the environment remains a challenging problem, tackled in the fourth chapter of this thesis. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline both established and novel algorithms for the gradient-based improvement of parameterized policies. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm. In conclusion, in this thesis, we have contributed a general framework for analytically computing robot control laws which can be used for deriving various previous control approaches and serves as foundation as well as inspiration for our learning algorithms. We have introduced two classes of novel reinforcement learning methods, i.e., the Natural Actor-Critic and the Reward-Weighted Regression algorithm. These algorithms have been used in order to replace the analytical components of the theoretical framework by learned representations. Evaluations have been performed on both simulated and real robot arms. |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S. |
Year | 2007 |
Title | Reinforcement learning for operational space control |
Journal/Conference/Book Title | International Conference on Robotics and Automation (ICRA2007) |
Keywords | operational space control, reinforcement learning, weighted regression, EM-Algorithm |
Abstract | While operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting supervised learning problem is ill-defined as it requires to learn an inverse mapping of a usually redundant system, which is well known to suffer from the property of non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. The important insight that many operational space control algorithms can be reformulated as optimal control problems, however, allows addressing this inverse learning problem in the framework of reinforcement learning. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICRA2007-2111_[0].pdf |
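The EM-style reward-weighted regression update summarized in the abstract above can be sketched in a 1-D toy setting. This is an illustrative reconstruction, not the paper's implementation: the linear-Gaussian policy, the quadratic reward, and the fixed reward-transformation parameter `beta` are all assumptions.

```python
import math
import random

def reward_weighted_regression(samples, beta=2.0):
    """One EM-style update: fit a linear policy u = theta * s by least
    squares, weighting each sample by its exponentiated reward.
    1-D toy sketch of the reward-weighted regression idea."""
    num = 0.0
    den = 0.0
    for s, u, r in samples:
        w = math.exp(beta * r)          # reward transformation (r <= 0, so w <= 1)
        num += w * s * u
        den += w * s * s
    return num / den

random.seed(0)
theta_true = 2.0                        # unknown "correct" gain
theta = 0.0                             # initial policy parameter
for _ in range(20):
    samples = []
    for _ in range(500):
        s = random.uniform(-1.0, 1.0)               # state
        u = theta * s + random.gauss(0.0, 0.5)      # exploratory action
        r = -(u - theta_true * s) ** 2              # immediate reward
        samples.append((s, u, r))
    theta = reward_weighted_regression(samples)
```

Because each update is a weighted least-squares fit rather than a gradient step, the parameter moves smoothly toward high-reward actions, which is the "no dangerous jumps in solution space" property the abstract highlights.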
Reference Type | Conference Proceedings |
Author(s) | Peters, J.;Schaal, S. |
Year | 2007 |
Title | Using reward-weighted regression for reinforcement learning of task space control |
Journal/Conference/Book Title | Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning |
Keywords | reinforcement learning, cart-pole, policy gradient methods |
Abstract | In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, `vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease. |
Place Published | Honolulu, Hawaii, April 1-5, 2007 |
Short Title | Using reward-weighted regression for reinforcement learning of task space control |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ADPRL2007-Peters_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S. |
Year | 2007 |
Title | Applying the episodic natural actor-critic architecture to motor primitive learning |
Journal/Conference/Book Title | Proceedings of the 2007 European Symposium on Artificial Neural Networks (ESANN) |
Keywords | reinforcement learning, policy gradient methods, motor primitives, natural actor-critic |
Abstract | In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates, which are achieved using natural stochastic policy gradients, while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the "building blocks of movement generation", called motor primitives. Motor primitives are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. We show that our most modern algorithm, the Episodic Natural Actor-Critic, outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm. |
Place Published | Bruges, Belgium, April 25-27 |
Short Title | Applying the episodic natural actor-critic architecture to motor primitive learning |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/es2007-125.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.;Schaal, S. |
Year | 2007 |
Title | Reinforcement learning by reward-weighted regression for operational space control |
Journal/Conference/Book Title | Proceedings of the International Conference on Machine Learning (ICML2007) |
Keywords | reinforcement learning, operational space control, weighted regression |
Abstract | Many robot control problems of practical importance, including operational space control, can be reformulated as immediate reward reinforcement learning problems. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which is infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications to complex high degree-of-freedom robots. |
Place Published | Corvallis, Oregon, June 19-21 |
Short Title | Reinforcement learning by reward-weighted regression for operational space control |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ICML2007-Peters_4493[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.;Theodorou, E.;Schaal, S. |
Year | 2007 |
Title | Policy gradient methods for machine learning |
Journal/Conference/Book Title | INFORMS Conference of the Applied Probability Society |
Keywords | policy gradient methods, reinforcement learning, simulation-optimization |
Abstract | We present an in-depth survey of policy gradient methods as they are used in the machine learning community for optimizing parameterized, stochastic control policies in Markovian systems with respect to the expected reward. Despite having been developed separately in the reinforcement learning literature, policy gradient methods employ likelihood ratio gradient estimators as also suggested in the stochastic simulation optimization community. It is well-known that this approach to policy gradient estimation traditionally suffers from three drawbacks, i.e., large variance, a strong dependence on baseline functions, and inefficient gradient descent. In this talk, we will present a series of recent results which tackle each of these problems. The variance of the gradient estimation can be reduced significantly through recently introduced techniques such as optimal baselines, compatible function approximations, and all-action gradients. However, as even the analytically obtainable policy gradients perform unnaturally slowly, the step from vanilla policy gradient methods towards natural policy gradients was required to overcome the inefficiency of gradient descent. This development resulted in the Natural Actor-Critic architecture, which can be shown to be very efficient when applied to motor primitive learning for robotics. |
Place Published | Eindhoven, Netherlands, July 9-11, 2007 |
Short Title | Policy gradient methods for machine learning |
Reference Type | Conference Proceedings |
Author(s) | Riedmiller, M.;Peters, J.;Schaal, S. |
Year | 2007 |
Title | Evaluation of policy gradient methods and variants on the cart-pole benchmark |
Journal/Conference/Book Title | Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning |
Keywords | reinforcement learning, cart-pole, policy gradient methods |
Abstract | In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, `vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease. |
Place Published | Honolulu, Hawaii, April 1-5, 2007 |
Short Title | Evaluation of policy gradient methods and variants on the cart-pole benchmark |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/ADPRL2007-Peters2_[0].pdf |
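The first of the three method families compared in the paper, finite-difference gradients, admits a very small sketch. The quadratic "return" below stands in for a cart-pole rollout and is purely illustrative, as are all names:

```python
def finite_difference_gradient(J, theta, eps=0.05):
    """Central finite-difference estimate of the gradient of the
    return J with respect to the policy parameters theta."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta); plus[i] += eps
        minus = list(theta); minus[i] -= eps
        grad.append((J(plus) - J(minus)) / (2.0 * eps))
    return grad

# Toy deterministic 'return' standing in for a cart-pole rollout;
# its maximum is at theta = (1, -1).
def J(theta):
    return -((theta[0] - 1.0) ** 2 + (theta[1] + 1.0) ** 2)

theta = [0.0, 0.0]
for _ in range(100):
    g = finite_difference_gradient(J, theta)
    theta = [t + 0.1 * gi for t, gi in zip(theta, g)]   # gradient ascent
```

On a real plant each evaluation of `J` would be a noisy rollout, which is exactly why the paper compares this family against likelihood-ratio ("vanilla") and natural policy gradients.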
Reference Type | Report |
Author(s) | Peters, J. |
Year | 2007 |
Title | Relative Entropy Policy Search |
Journal/Conference/Book Title | CLMC Technical Report: TR-CLMC-2007-2, University of Southern California |
Keywords | relative entropy, policy search, natural policy gradient |
Abstract | This technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one shot solutions. Future work may include the application to interesting problems. |
Place Published | Los Angeles, CA |
Type of Work | CLMC Technical Report |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Peters-TR2007.pdf |
Research Notes | A longer and more complete version is under preparation. |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.;Schaal, S. |
Year | 2006 |
Title | Learning operational space control |
Journal/Conference/Book Title | Robotics: Science and Systems (RSS 2006) |
Keywords | operational space control, redundancy, forward models, inverse models, compliance, reinforcement learning, locally weighted learning |
Abstract | While operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting learning problem is ill-defined as it requires learning an inverse mapping of a usually redundant system, which is well known to suffer from the non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. A first important insight for this paper is that, nevertheless, a physically correct solution to the inverse problem does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component of our work is based on a recent insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the view of machine learning, the learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward and that employs an expectation-maximization policy search algorithm. Evaluations on a three-degrees-of-freedom robot arm illustrate the feasibility of our suggested approach. |
Place Published | Philadelphia, PA, Aug.16-19 |
Publisher | Cambridge, MA: MIT Press |
Short Title | Learning operational space control |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/p33.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.;Schaal, S. |
Year | 2006 |
Title | Reinforcement Learning for Parameterized Motor Primitives |
Journal/Conference/Book Title | Proceedings of the 2006 International Joint Conference on Neural Networks (IJCNN) |
Keywords | motor primitives, reinforcement learning |
Abstract | One of the major challenges in both action generation for robotics and in the understanding of human motor control is to learn the "building blocks of movement generation", called motor primitives. Motor primitives, as used in this paper, are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. While a lot of progress has been made in teaching parameterized motor primitives using supervised or imitation learning, the self-improvement by interaction of the system with the environment remains a challenging problem. In this paper, we evaluate different reinforcement learning approaches for improving the performance of parameterized motor primitives. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline both established and novel algorithms for the gradient-based improvement of parameterized policies. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm. |
Short Title | Reinforcement Learning for Parameterized Motor Primitives |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/Reinforcement_Learning_for_Parameterized_Motor_Pri.pdf |
Reference Type | Conference Proceedings |
Author(s) | Ting, J.;Mistry, M.;Nakanishi, J.;Peters, J.;Schaal, S. |
Year | 2006 |
Title | A Bayesian approach to nonlinear parameter identification for rigid body dynamics |
Journal/Conference/Book Title | Robotics: Science and Systems (RSS 2006) |
Keywords | Bayesian regression linear models dimensionality reduction input noise rigid body dynamics parameter identification |
Abstract | For robots of increasing complexity such as humanoid robots, conventional identification of rigid body dynamics models based on CAD data and actuator models becomes difficult and inaccurate due to the large number of additional nonlinear effects in these systems, e.g., stemming from stiff wires, hydraulic hoses, protective shells, skin, etc. Data-driven parameter estimation offers an alternative model identification method, but it is often burdened by various other problems, such as significant noise in all measured or inferred variables of the robot. The danger of physically inconsistent results also exists due to unmodeled nonlinearities or insufficiently rich data. In this paper, we address all these problems by developing a Bayesian parameter identification method that can automatically detect noise in both input and output data for the regression algorithm that performs system identification. A post-processing step ensures physically consistent rigid body parameters by nonlinearly projecting the result of the Bayesian estimation onto constraints given by positive definite inertia matrices and the parallel axis theorem. We demonstrate on synthetic and actual robot data that our technique performs parameter identification with 10 to 30% higher accuracy than traditional methods. Due to the resulting physically consistent parameters, our algorithm enables us to apply advanced control methods that algebraically require physical consistency on robotic platforms. |
Place Published | Philadelphia, PA, Aug.16-19 |
Publisher | Cambridge, MA: MIT Press |
Short Title | A Bayesian approach to nonlinear parameter identification for rigid body dynamics |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/p32.pdf |
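The post-processing step described above, projecting an estimate back onto physically consistent parameters, can be illustrated with a minimal positive-definite projection for a 2x2 symmetric matrix. This is a hypothetical simplification: the paper's actual projection is nonlinear and also enforces the parallel axis theorem, neither of which is modeled here.

```python
import math

def project_to_positive_definite(M, min_eig=1e-6):
    """Project a (possibly unphysical) estimated 2x2 inertia matrix onto
    the symmetric positive definite cone: symmetrize, then clip the
    eigenvalues from below. Closed-form 2x2 eigendecomposition."""
    a, d = M[0][0], M[1][1]
    b = 0.5 * (M[0][1] + M[1][0])          # symmetrize off-diagonal
    mean = 0.5 * (a + d)
    radius = math.sqrt(((a - d) * 0.5) ** 2 + b * b)
    lam1, lam2 = mean + radius, mean - radius
    lam1c, lam2c = max(lam1, min_eig), max(lam2, min_eig)
    if radius == 0.0:                       # already a multiple of identity
        return [[lam1c, 0.0], [0.0, lam1c]]
    # Eigenvector for lam1 of [[a, b], [b, d]] is (b, lam1 - a).
    vx, vy = b, lam1 - a
    n = math.hypot(vx, vy)
    vx, vy = (vx / n, vy / n) if n > 0.0 else (1.0, 0.0)
    wx, wy = -vy, vx                        # orthogonal eigenvector
    # Reassemble: lam1c * v v^T + lam2c * w w^T.
    off = lam1c * vx * vy + lam2c * wx * wy
    return [[lam1c * vx * vx + lam2c * wx * wx, off],
            [off, lam1c * vy * vy + lam2c * wy * wy]]

M = [[1.0, 2.0], [2.0, -0.5]]   # indefinite, physically impossible estimate
P = project_to_positive_definite(M)
```

Clipping eigenvalues rather than discarding the estimate keeps as much of the data-driven fit as possible while guaranteeing a matrix a controller can safely invert.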
Reference Type | Conference Proceedings |
Author(s) | Peters, J.;Schaal, S. |
Year | 2006 |
Title | Policy gradient methods for robotics |
Journal/Conference/Book Title | Proceedings of the IEEE International Conference on Intelligent Robotics Systems (IROS 2006) |
Keywords | policy gradient methods, reinforcement learning, robotics |
Abstract | The acquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots are ever to leave precisely pre-structured environments. However, to date only a few existing reinforcement learning methods have been scaled to the domains of high-dimensional robots such as manipulators, legged robots, or humanoid robots. Policy gradient methods remain one of the few exceptions and have found a variety of applications. Nevertheless, the application of such methods is not without peril if done in an uninformed manner. In this paper, we give an overview of learning with policy gradient methods for robotics with a strong focus on recent advances in the field. We outline previous applications to robotics and show how the most recently developed methods can significantly improve learning performance. Finally, we evaluate our most promising algorithm in the application of hitting a baseball with an anthropomorphic arm. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2006-Peters_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Nakanishi, J.;Cory, R.;Mistry, M.;Peters, J.;Schaal, S. |
Year | 2005 |
Title | Comparative experiments on task space control with redundancy resolution |
Journal/Conference/Book Title | IEEE International Conference on Intelligent Robots and Systems (IROS 2005) |
Keywords | manipulator dynamics, redundant manipulators, space optimization, dynamical decoupling, humanoid robots, inverse kinematics, motor coordination, redundancy resolution, robot dynamics, seven-degree-of-freedom anthropomorphic robot arm, task space control |
Abstract | Understanding the principles of motor coordination with redundant degrees of freedom still remains a challenging problem, particularly for new research in highly redundant robots like humanoids. Even after more than a decade of research, task space control with redundancy resolution still remains an incompletely understood theoretical topic, and also lacks a larger body of thorough experimental investigation on complex robotic systems. This paper presents our first steps towards the development of a working redundancy resolution algorithm which is robust against modeling errors and unforeseen disturbances arising from contact forces. To gain a better understanding of the pros and cons of different approaches to redundancy resolution, we focus on a comparative empirical evaluation. First, we review several redundancy resolution schemes at the velocity, acceleration and torque levels presented in the literature in a common notational framework and also introduce some new variants of these previous approaches. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm. Surprisingly, one of our simplest algorithms empirically demonstrates the best performance, even though, from a theoretical point of view, it does not share the same beauty as some of the other methods. Finally, we discuss practical properties of these control algorithms, particularly in light of inevitable modeling errors of the robot dynamics. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2005-Nakanishi_5051[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.;Vijayakumar, S.;Schaal, S. |
Year | 2005 |
Title | Natural Actor-Critic |
Journal/Conference/Book Title | Proceedings of the 16th European Conference on Machine Learning (ECML 2005) |
Keywords | Reinforcement Learning, Policy Gradients, Natural Gradients |
Abstract | This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari's natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression. We show that actor improvements with natural policy gradients are particularly appealing as these are independent of the coordinate frame of the chosen policy representation, and can be estimated more efficiently than regular policy gradients. The critic makes use of a special basis function parameterization motivated by the policy-gradient-compatible function approximation. We show that several well-known reinforcement learning methods such as the original Actor-Critic and Bradtke's Linear Quadratic Q-Learning are in fact Natural Actor-Critic algorithms. Empirical evaluations illustrate the effectiveness of our techniques in comparison to previous methods, and also demonstrate their applicability for learning control on an anthropomorphic robot arm. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/NaturalActorCritic.pdf |
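The critic's trick mentioned in the abstract, obtaining the natural policy gradient by linear regression onto the compatible (score-function) features, can be checked in a 1-D Gaussian-policy toy problem. The reward function, sample size, and variable names below are illustrative assumptions, not the paper's setup:

```python
import random

def natural_gradient_by_regression(theta, sigma, n=20000, seed=2):
    """For a Gaussian policy N(theta, sigma^2), regress rewards onto
    the score psi = d log pi / d theta. The least-squares weight equals
    the natural policy gradient F^{-1} g, since the Fisher information
    of the mean parameter is Var(psi) = 1 / sigma^2. 1-D toy sketch."""
    rng = random.Random(seed)
    psi, rew = [], []
    for _ in range(n):
        a = rng.gauss(theta, sigma)
        psi.append((a - theta) / sigma ** 2)   # score function
        rew.append(-(a - 3.0) ** 2)            # illustrative reward
    m_psi = sum(psi) / n
    m_rew = sum(rew) / n
    cov = sum((p - m_psi) * (r - m_rew) for p, r in zip(psi, rew)) / n
    var = sum((p - m_psi) ** 2 for p in psi) / n
    g = cov         # vanilla likelihood-ratio gradient (baseline-subtracted)
    w = cov / var   # regression weight = natural gradient estimate
    return w, g

w, g = natural_gradient_by_regression(theta=0.0, sigma=0.5)
# w is approximately sigma^2 * g, i.e. the Fisher-preconditioned gradient.
```

The key point is that `w` comes out of an ordinary regression, so no explicit Fisher matrix inversion is needed, which is what makes the architecture practical in high dimensions.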
Reference Type | Conference Proceedings |
Author(s) | Peters, J.;Mistry, M.;Udwadia, F. E.;Schaal, S. |
Year | 2005 |
Title | A new methodology for robot control design |
Journal/Conference/Book Title | The 5th ASME International Conference on Multibody Systems, Nonlinear Dynamics, and Control (MSNDC 2005) |
Keywords | robot control, nonlinear control, gauss principle |
Abstract | Gauss' principle of least constraint and its generalizations have provided useful insights for the development of tracking controllers for mechanical systems (Udwadia, 2003). Using this concept, we present a novel methodology for the design of a specific class of robot controllers. With our new framework, we demonstrate that well-known as well as several novel nonlinear robot control laws can be derived from this generic framework, and show experimental verifications on a Sarcos Master Arm robot for some of these controllers. We believe that the suggested approach unifies and simplifies the design of optimal nonlinear control laws for robots obeying rigid body dynamics equations, both with and without external constraints, with holonomic or nonholonomic constraints, with over-actuation or under-actuation, as well as with open-chain and closed-chain kinematics. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/PetMisUdwSchASME2005_5054[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.;Mistry, M.;Udwadia, F. E.;Cory, R.;Nakanishi, J.;Schaal, S. |
Year | 2005 |
Title | A unifying framework for the control of robotics systems |
Journal/Conference/Book Title | IEEE International Conference on Intelligent Robots and Systems (IROS 2005) |
Abstract | Recently, [1] suggested deriving tracking controllers for mechanical systems using a generalization of Gauss' principle of least constraint. This method allows us to reformulate control problems as a special class of optimal control. We take this line of reasoning one step further and demonstrate that well-known as well as several novel nonlinear robot control laws can be derived from this generic methodology. We show experimental verifications on a Sarcos Master Arm robot for some of the derived controllers. We believe that the suggested approach offers a promising unification and simplification of nonlinear control law design for robots obeying rigid body dynamics equations, both with and without external constraints, with over-actuation or under-actuation, as well as with open-chain and closed-chain kinematics. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2005-Peters_5052[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Schaal, S.;Peters, J.;Nakanishi, J.;Ijspeert, A. |
Year | 2004 |
Title | Learning Movement Primitives |
Journal/Conference/Book Title | International Symposium on Robotics Research (ISRR2003) |
Keywords | movement primitives, supervised learning, reinforcement learning, locomotion, phase resetting, learning from demonstration |
Abstract | This paper discusses a comprehensive framework for modular motor control based on a recently developed theory of dynamic movement primitives (DMP). DMPs are a formulation of movement primitives with autonomous nonlinear differential equations, whose time evolution creates smooth kinematic control policies. Model-based control theory is used to convert the outputs of these policies into motor commands. By means of coupling terms, on-line modifications can be incorporated into the time evolution of the differential equations, thus providing a rather flexible and reactive framework for motor planning and execution. The linear parameterization of DMPs lends itself naturally to supervised learning from demonstration. Moreover, the temporal, scale, and translation invariance of the differential equations with respect to these parameters provides a useful means for movement recognition. A novel reinforcement learning technique based on natural stochastic policy gradients allows a general approach of improving DMPs by trial and error learning with respect to almost arbitrary optimization criteria. We demonstrate the different ingredients of the DMP approach in various examples, involving skill learning from demonstration on the humanoid robot DB, and learning biped walking from demonstration in simulation, including self-improvement of the movement patterns towards energy efficiency through resonance tuning. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/Schaal2005_Chapter_LearningMovementPrimitives.pdf |
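The DMP transformation system the abstract refers to can be sketched with a plain Euler integrator. The gains, the canonical-system time constant, and the zero forcing term below are illustrative choices, not the values used in the paper:

```python
def integrate_dmp(y0, goal, forcing, tau=1.0, alpha=25.0, beta=6.25,
                  dt=0.001, steps=1000):
    """Euler integration of a discrete DMP's transformation system:
        tau * dz/dt = alpha * (beta * (goal - y) - z) + f(x)
        tau * dy/dt = z
    with a first-order canonical system x driving the (learned)
    forcing term f. alpha = 4 * beta gives critical damping."""
    y, z, x = y0, 0.0, 1.0
    for _ in range(steps):
        f = forcing(x) * x * (goal - y0)       # forcing gated by phase
        dz = (alpha * (beta * (goal - y) - z) + f) / tau
        dy = z / tau
        dx = -2.0 * x / tau                    # canonical system decay
        z += dz * dt
        y += dy * dt
        x += dx * dt
    return y

# With a zero forcing term the system is a critically damped spring
# and converges to the goal, illustrating the attractor property.
y_end = integrate_dmp(y0=0.0, goal=1.0, forcing=lambda x: 0.0)
```

Because the phase `x` decays to zero, any learned forcing term vanishes over time, so the goal attractor is preserved no matter what shape the forcing gives the transient, which is what makes the linear parameterization safe for learning from demonstration.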
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Schaal, S. |
Year | 2004 |
Title | Learning Motor Primitives with Reinforcement Learning |
Journal/Conference/Book Title | Proceedings of the 11th Joint Symposium on Neural Computation |
Keywords | natural policy gradients, motor primitives, natural actor-critic |
Abstract | One of the major challenges in action generation for robotics and in the understanding of human motor control is to learn the "building blocks of movement generation," or more precisely, motor primitives. Recently, Ijspeert et al. [1, 2] suggested a novel framework for using nonlinear dynamical systems as motor primitives. While a lot of progress has been made in teaching these motor primitives using supervised or imitation learning, the self-improvement by interaction of the system with the environment remains a challenging problem. In this poster, we evaluate how different reinforcement learning approaches can be used to improve the performance of motor primitives. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline how these lead to a novel algorithm based on natural policy gradients [3]. We compare this algorithm to previous reinforcement learning algorithms in the context of dynamic motor primitive learning, and show that it outperforms these by at least an order of magnitude. We demonstrate the efficiency of the resulting reinforcement learning method for creating complex behaviors for autonomous robotics. The studied behaviors include both discrete, finite tasks such as baseball swings, as well as complex rhythmic patterns as they occur in biped locomotion. |
Place Published | http://resolver.caltech.edu/CaltechJSNC:2004.poster020 |
Reference Type | Conference Proceedings |
Author(s) | Mohajerian, P.;Peters, J.;Ijspeert, A.;Schaal, S. |
Year | 2003 |
Title | A unifying computational framework for optimization and dynamic systems approaches to motor control |
Journal/Conference/Book Title | Proceedings of the 10th Joint Symposium on Neural Computation (JSNC 2003) |
Keywords | computational motor control, optimization, dynamic systems, formal modeling |
Abstract | Theories of biological motor control have been pursued from at least two separate frameworks, the "Dynamic Systems" approach and the "Control Theoretic/Optimization" approach. Control and optimization theory emphasize motor control based on organizational principles in terms of generic cost criteria like "minimum jerk", "minimum torque-change", "minimum variance", etc., while dynamic systems theory puts a larger focus on principles of self-organization in motor control, like synchronization, phase-locking, phase transitions, perception-action coupling, etc. Computational formalizations in both approaches have equally differed, using mostly time-indexed desired trajectory plans in control/optimization theory, and nonlinear autonomous differential equations in dynamic systems theory. Due to these differences in philosophy and formalization, optimization approaches and dynamic systems approaches have largely remained two separate research approaches in motor control, mostly conceived of as incompatible. In this poster, we present a novel formal framework for motor control that can harmoniously encompass both optimization and dynamic systems approaches. This framework is based on the discovery that almost arbitrary nonlinear autonomous differential equations can be acquired within a standard statistical (or neural network) learning framework without the need for tedious manual parameter tuning and the danger of entering unstable or chaotic regions of the differential equations. Both rhythmic (e.g., locomotion, swimming, etc.) and discrete (e.g., point-to-point reaching, grasping, etc.) movement can be modeled, either as single degree-of-freedom or multiple degree-of-freedom systems. Coupling parameters to the differential equations can create typical effects of self-organization in dynamic systems, while optimization approaches can safely be used numerically to improve the attractor landscape of the equations with respect to a given cost criterion, as demonstrated in modeling studies of several of the hallmarks of dynamic systems and optimization theory. We believe that this novel computational framework will allow a first step towards unifying dynamic systems and optimization approaches to motor control, and provide a set of principled modeling tools to both communities. |
Place Published | Irvine, CA, May 2003 |
Short Title | A unifying computational framework for optimization and dynamic systems approaches to motor control |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/JSNC2003-Mohajerian_[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.;Vijayakumar, S.;Schaal, S. |
Year | 2003 |
Title | Reinforcement learning for humanoid robotics |
Journal/Conference/Book Title | IEEE-RAS International Conference on Humanoid Robots (Humanoids2003) |
Keywords | reinforcement learning, policy gradients, movement primitives, behaviors, dynamic systems, humanoid robotics |
Abstract | Reinforcement learning offers one of the most general frameworks to take traditional robotics towards true autonomy and versatility. However, applying reinforcement learning to high-dimensional movement systems like humanoid robots remains an unsolved problem. In this paper, we discuss different approaches of reinforcement learning in terms of their applicability in humanoid robotics. Methods can be coarsely classified into three different categories, i.e., greedy methods, `vanilla' policy gradient methods, and natural gradient methods. We discuss that greedy methods are not likely to scale to the domain of humanoid robotics as they are problematic when used with function approximation. `Vanilla' policy gradient methods on the other hand have been successfully applied on real-world robots including at least one humanoid robot. We demonstrate that these methods can be significantly improved using the natural policy gradient instead of the regular policy gradient. A derivation of the natural policy gradient is provided, proving that the average policy gradient of Kakade (2002) is indeed the true natural gradient. A general algorithm for estimating the natural gradient, the Natural Actor-Critic algorithm, is introduced. This algorithm converges to the nearest local minimum of the cost function with respect to the Fisher information metric under suitable conditions. The algorithm outperforms non-natural policy gradients by far in a cart-pole balancing evaluation, and for learning nonlinear dynamic motor primitives for humanoid robot control. It offers a promising route for the development of reinforcement learning for truly high-dimensional continuous state-action systems. |
Place Published | Karlsruhe, Germany, Sept.29-30 |
Short Title | Reinforcement learning for humanoid robotics |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/peters-ICHR2003.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.;Vijayakumar, S.;Schaal, S. |
Year | 2003 |
Title | Scaling reinforcement learning paradigms for motor learning |
Journal/Conference/Book Title | Proceedings of the 10th Joint Symposium on Neural Computation (JSNC 2003) |
Keywords | Reinforcement learning, neurodynamic programming, actor-critic methods, policy gradient methods, natural policy gradient |
Abstract | Reinforcement learning offers a general framework to explain reward-related learning in artificial and biological motor control. However, current reinforcement learning methods rarely scale to high-dimensional movement systems and mainly operate in discrete, low-dimensional domains like game-playing, artificial toy problems, etc. This drawback makes them unsuitable for application to human or bio-mimetic motor control. In this poster, we look at promising approaches that can potentially scale and suggest a novel formulation of the actor-critic algorithm which takes steps towards alleviating the current shortcomings. We argue that methods based on greedy policies are not likely to scale into high-dimensional domains as they are problematic when used with function approximation, a must when dealing with continuous domains. We adopt the path of direct policy gradient based policy improvements since they avoid the problems of destabilizing dynamics encountered in traditional value iteration based updates. While regular policy gradient methods have demonstrated promising results in the domain of humanoid motor control, we demonstrate that these methods can be significantly improved using the natural policy gradient instead of the regular policy gradient. Based on this, it is proved that Kakade's 'average natural policy gradient' is indeed the true natural gradient. A general algorithm for estimating the natural gradient, the Natural Actor-Critic algorithm, is introduced. This algorithm converges with probability one to the nearest local minimum in Riemannian space of the cost function. The algorithm outperforms non-natural policy gradients by far in a cart-pole balancing evaluation, and offers a promising route for the development of reinforcement learning for truly high-dimensional continuous state-action systems. |
URL(s) | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/petersVijayakumarSchaal_JSNC2003_5058[0].pdf |
Reference Type | Conference Proceedings |
Author(s) | Schaal, S.; Peters, J.; Nakanishi, J.; Ijspeert, A. |
Year | 2003 |
Title | Control, planning, learning, and imitation with dynamic movement primitives |
Journal/Conference/Book Title | Workshop on Bilateral Paradigms on Humans and Humanoids, IEEE International Conference on Intelligent Robots and Systems (IROS 2003) |
Keywords | movement primitives, supervised learning, reinforcement learning, locomotion, phase resetting, learning from demonstration |
Abstract | In both human and humanoid movement science, the topic of movement primitives has become central in understanding the generation of complex motion with high degree-of-freedom bodies. A theory of control, planning, learning, and imitation with movement primitives seems to be crucial in order to reduce the search space during motor learning and achieve a high level of autonomy and flexibility in dynamically changing environments. Movement recognition based on the same representations as used for movement generation, i.e., movement primitives, is equally intimately tied into these research questions. This paper discusses a comprehensive framework for motor control with movement primitives using a recently developed theory of dynamic movement primitives (DMP). DMPs are a formulation of movement primitives with autonomous nonlinear differential equations, whose time evolution creates smooth kinematic movement plans. Model-based control theory is used to convert such movement plans into motor commands. By means of coupling terms, on-line modifications can be incorporated into the time evolution of the differential equations, thus providing a rather flexible and reactive framework for motor planning and execution; indeed, DMPs form complete kinematic control policies, not just a particular desired trajectory. The linear parameterization of DMPs lends itself naturally to supervised learning from demonstrations. Moreover, the temporal, scale, and translation invariance of the differential equations with respect to these parameters provides a useful means for movement recognition. A novel reinforcement learning technique based on natural stochastic policy gradients allows a general approach of improving DMPs by trial-and-error learning with respect to almost arbitrary optimization criteria, including situations with delayed rewards. We demonstrate the different ingredients of the DMP approach in various examples, involving skill learning from demonstration on the humanoid robot DB and an application of learning simulated biped walking from a demonstrated trajectory, including self-improvement of the movement patterns in the spirit of energy efficiency through resonance tuning. |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Publications/IROS2003-Schaal_[0].pdf |
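The DMP abstract above describes a damped spring system driven toward a goal by a learned forcing term that fades out as a canonical phase variable decays. A rough single-DoF sketch of a discrete DMP rollout is given below; the gains, basis widths, Euler integration, and all parameter names are assumptions chosen for illustration, not values from the paper:

```python
import math

def dmp_rollout(y0, g, weights, tau=1.0, alpha=25.0, beta=6.25,
                ax=1.0, dt=0.001, steps=1000):
    """Roll out a one-dimensional discrete DMP.

    Transformation system: tau*dy = z, tau*dz = alpha*(beta*(g - y) - z) + f(x),
    with a canonical phase x decaying as tau*dx = -ax*x. The forcing term f is a
    normalized mixture of Gaussian basis functions weighted by `weights`,
    scaled by x*(g - y0) so it vanishes at convergence.
    """
    n = len(weights)
    centers = [math.exp(-ax * i / max(n - 1, 1)) for i in range(n)]
    h = [2.0 * n * n] * n  # heuristic basis widths
    y, z, x = y0, 0.0, 1.0
    traj = []
    for _ in range(steps):
        psi = [math.exp(-h[i] * (x - centers[i]) ** 2) for i in range(n)]
        f = (sum(p * w for p, w in zip(psi, weights))
             / (sum(psi) + 1e-10)) * x * (g - y0)
        dz = (alpha * (beta * (g - y) - z) + f) / tau
        dy = z / tau
        z += dz * dt
        y += dy * dt
        x += (-ax * x / tau) * dt
        traj.append(y)
    return traj
```

With all weights zero, the forcing term vanishes and the rollout reduces to a critically damped point attractor converging to the goal g, which is the stability property the DMP formulation relies on; nonzero weights shape the transient, and the reinforcement learning technique mentioned in the abstract adapts those weights by trial and error.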
Reference Type | Conference Proceedings |
Author(s) | Vijayakumar, S.; D'Souza, A.; Peters, J.; Conradt, J.; Rutkowski, T.; Ijspeert, A.; Nakanishi, J.; Inoue, M.; Shibata, T.; Wiryo, A.; Itti, L.; Amari, S.; Schaal, S. |
Year | 2002 |
Title | Real-Time Statistical Learning for Oculomotor Control and Visuomotor Coordination |
Journal/Conference/Book Title | Advances in Neural Information Processing Systems (NIPS/NeurIPS), Demonstration Track |
Reference Type | Report |
Author(s) | Peters, J. |
Year | 2002 |
Title | Policy Gradient Methods for Control Applications |
Journal/Conference/Book Title | CLMC Technical Report: TR-CLMC-2007-1, University of Southern California |
Link to PDF | https://www.ias.informatik.tu-darmstadt.de/uploads/Member/JanPeters/techrep.pdf |
Reference Type | Conference Proceedings |
Author(s) | Burdet, E.; Tee, K.P.; Chew, C.M.; Peters, J.; Bt, V.L. |
Year | 2001 |
Title | Hybrid IDM/Impedance Learning in Human Movements |
Journal/Conference/Book Title | First International Symposium on Measurement, Analysis and Modeling of Human Functions Proceedings |
Keywords | human motor control |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/burdet_ISHF_2001.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/burdet_ISHF_2001.pdf |
Reference Type | Conference Proceedings |
Author(s) | Peters, J.; Riener, R. |
Year | 2000 |
Title | A real-time model of the human knee for application in virtual orthopaedic trainer |
Journal/Conference/Book Title | Proceedings of the 10th International Conference on Biomedical Engineering (ICBME) |
Keywords | Biomechanics, human motor control |
URL(s) | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ICBME_2000.pdf |
Link to PDF | http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ICBME_2000.pdf |
Reference Type | Journal Article |
Author(s) | Peters, J. |
Year | 1998 |
Title | Fuzzy Logic for Practical Applications |
Journal/Conference/Book Title | Kuenstliche Intelligenz (KI) |
Keywords | book review |
Number | 4 |
Pages | 60 |
Reference Type | Patent |
Author(s) | Bischoff, B.; Vinogradska, J.; Peters, J. |
Year | 06/25/2020 |
Title | METHOD AND DEVICE FOR SETTING AT LEAST ONE PARAMETER OF AN ACTUATOR CONTROL SYSTEM, ACTUATOR CONTROL SYSTEM AND DATA SET |
Journal/Conference/Book Title | United States Patent Application 20200201290, Granted as 16/625223 |