Publication Details

You can download this complete bibtex reference list as all-ias-publications.bib.
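Each entry below corresponds to one BibTeX record in that file. As a sketch of the mapping (the citation key and field layout are illustrative, not necessarily those used in the file), the first entry would look roughly like:

```bibtex
@article{maeda_ijrr_phase,
  author  = {Maeda, G. and Ewerton, M. and Neumann, G. and Lioutikov, R. and Peters, J.},
  title   = {Phase Estimation for Fast Action Recognition and Trajectory
             Generation in Human-Robot Collaboration},
  journal = {International Journal of Robotics Research (IJRR)},
  year    = {accepted},
  note    = {Keywords: 3rd-Hand, BIMROB},
  url     = {http://www.ausy.tu-darmstadt.de/uploads/Team/PubGJMaeda/phase_estim_IJRR.pdf}
}
```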

Reference Type: Journal Article
Author(s): Maeda, G.; Ewerton, M.; Neumann, G.; Lioutikov, R.; Peters, J.
Year: accepted
Title: Phase Estimation for Fast Action Recognition and Trajectory Generation in Human-Robot Collaboration
Journal/Conference/Book Title: International Journal of Robotics Research (IJRR)
Keywords: 3rd-Hand, BIMROB
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/PubGJMaeda/phase_estim_IJRR.pdf
Reference Type: Journal Article
Author(s): Lioutikov, R.; Neumann, G.; Maeda, G.; Peters, J.
Year: accepted
Title: Learning Movement Primitive Libraries through Probabilistic Segmentation
Journal/Conference/Book Title: International Journal of Robotics Research (IJRR)
Keywords: 3rd-Hand
Link to PDF: /uploads/Publications/lioutikov_probs_ijrr2017.pdf
Reference Type: Journal Article
Author(s): Kroemer, O.; Leischnig, S.; Luettgen, S.; Peters, J.
Year: accepted
Title: A Kernel-based Approach to Learning Contact Distributions for Robot Manipulation Tasks
Journal/Conference/Book Title: Autonomous Robots (AURO)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Alumni/OliverKroemer/KroemerAuRo17Updated2.pdf
Reference Type: Journal Article
Author(s): Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G.
Year: accepted
Title: Using Probabilistic Movement Primitives in Robotics
Journal/Conference/Book Title: Autonomous Robots (AURO)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/AlexandrosParaschos/promps_auro.pdf
Reference Type: Journal Article
Author(s): van Hoof, H.; Tanneberg, D.; Peters, J.
Year: accepted
Title: Generalized Exploration in Policy Search
Journal/Conference/Book Title: Machine Learning (MLJ)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/vanHoof_MLJ_2017.pdf
Reference Type: Journal Article
Author(s): Dermy, O.; Paraschos, A.; Ewerton, M.; Charpillet, F.; Peters, J.; Ivaldi, S.
Year: accepted
Title: Prediction of intention during interaction with iCub with Probabilistic Movement Primitives
Journal/Conference/Book Title: Frontiers in Robotics and AI
Keywords: CoDyCo
Reference Type: Journal Article
Author(s): Yi, Z.; Zhang, Y.; Peters, J.
Year: accepted
Title: Biomimetic Tactile Sensors and Signal Processing with Spike Trains: A Review
Journal/Conference/Book Title: Sensors and Actuators A: Physical
Reference Type: Journal Article
Author(s): Kupcsik, A.G.; Deisenroth, M.P.; Peters, J.; Ai Poh, L.; Vadakkepat, V.; Neumann, G.
Year: 2017
Title: Model-based Contextual Policy Search for Data-Efficient Generalization of Robot Skills
Journal/Conference/Book Title: Artificial Intelligence
Keywords: ComPLACS
Volume: 247
Pages: 415-439
Date: June 2017
Link to PDF: http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kupcsik_AIJ_2015.pdf
Reference Type: Journal Article
Author(s): Wang, Z.; Boularias, A.; Muelling, K.; Schoelkopf, B.; Peters, J.
Year: 2017
Title: Anticipatory Action Selection for Human-Robot Table Tennis
Journal/Conference/Book Title: Artificial Intelligence
Volume: 247
Pages: 399-414
Date: June 2017
Link to PDF: http://www.sciencedirect.com/science/article/pii/S0004370214001398
Reference Type: Journal Article
Author(s): Maeda, G.; Neumann, G.; Ewerton, M.; Lioutikov, R.; Kroemer, O.; Peters, J.
Year: 2017
Title: Probabilistic Movement Primitives for Coordination of Multiple Human-Robot Collaborative Tasks
Journal/Conference/Book Title: Autonomous Robots (AURO)
Keywords: 3rd-Hand, BIMROB
Volume: 41
Number: 3
Pages: 593-612
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/PubGJMaeda/gjm_2016_AURO_c.pdf
Reference Type: Journal Article
Author(s): Parisi, S.; Pirotta, M.; Peters, J.
Year: 2017
Title: Manifold-based Multi-objective Policy Search with Sample Reuse
Journal/Conference/Book Title: Neurocomputing, Special Issue on Multi-Objective Reinforcement Learning
Keywords: multi-objective, reinforcement learning, policy search, black-box optimization, importance sampling
Volume: 263
Pages: 3-14
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/parisi_neurocomp_morl.pdf
Reference Type: Book Section
Author(s): Peters, J.; Lee, D.; Kober, J.; Nguyen-Tuong, D.; Bagnell, J.; Schaal, S.
Year: 2017
Title: Chapter 15: Robot Learning
Journal/Conference/Book Title: Springer Handbook of Robotics, 2nd Edition
Publisher: Springer International Publishing
Pages: 357-394
Reference Type: Journal Article
Author(s): Padois, V.; Ivaldi, S.; Babič, J.; Mistry, M.; Peters, J.; Nori, F.
Year: 2017
Title: Whole-body multi-contact motion in humans and humanoids
Journal/Conference/Book Title: Robotics and Autonomous Systems
Volume: 90
Pages: 97-117
Date: April 2017
Reference Type: Conference Proceedings
Author(s): Tangkaratt, V.; van Hoof, H.; Parisi, S.; Neumann, G.; Peters, J.; Sugiyama, M.
Year: 2017
Title: Policy Search with High-Dimensional Context Variables
Journal/Conference/Book Title: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/tangkaratt2017policy.pdf
Reference Type: Journal Article
Author(s): Ivaldi, S.; Lefort, S.; Peters, J.; Chetouani, M.; Provasi, J.; Zibetti, E.
Year: 2017
Title: Towards Engagement Models that Consider Individual Factors in HRI: On the Relation of Extroversion and Negative Attitude Towards Robots to Gaze and Speech During a Human-Robot Assembly Task
Journal/Conference/Book Title: International Journal of Social Robotics
Volume: 9
Number of Volumes: 1
Pages: 63-86
Reference Type: Conference Proceedings
Author(s): Gebhardt, G.H.W.; Kupcsik, A.G.; Neumann, G.
Year: 2017
Title: The Kernel Kalman Rule - Efficient Nonparametric Inference with Recursive Least Squares
Journal/Conference/Book Title: Proceedings of the National Conference on Artificial Intelligence (AAAI)
Abstract: Nonparametric inference techniques provide promising tools for probabilistic reasoning in high-dimensional nonlinear systems. Most of these techniques embed distributions into reproducing kernel Hilbert spaces (RKHS) and rely on the kernel Bayes’ rule (KBR) to manipulate the embeddings. However, the computational demands of the KBR scale poorly with the number of samples and the KBR often suffers from numerical instabilities. In this paper, we present the kernel Kalman rule (KKR) as an alternative to the KBR. The derivation of the KKR is based on recursive least squares, inspired by the derivation of the Kalman innovation update. We apply the KKR to filtering tasks where we use RKHS embeddings to represent the belief state, resulting in the kernel Kalman filter (KKF). We show on a nonlinear state estimation task with high dimensional observations that our approach provides a significantly improved estimation accuracy while the computational demands are significantly decreased.
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/GregorGebhardt/TheKernelKalmanRule.pdf
Reference Type: Journal Article
Author(s): Yi, Z.; Zhang, Y.; Peters, J.
Year: 2017
Title: Bioinspired Tactile Sensor for Surface Roughness Discrimination
Journal/Conference/Book Title: Sensors and Actuators A: Physical
Volume: 255
Pages: 46-53
Date: 1 March 2017
URL(s): http://www.sciencedirect.com/science/article/pii/S0924424716311992
Reference Type: Journal Article
Author(s): Osa, T.; Ghalamzan, E. A. M.; Stolkin, R.; Lioutikov, R.; Peters, J.; Neumann, G.
Year: 2017
Title: Guiding Trajectory Optimization by Demonstrated Distributions
Journal/Conference/Book Title: IEEE Robotics and Automation Letters (RA-L)
Publisher: IEEE
Volume: 2
Number: 2
Pages: 819-826
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Osa_RAL_2017.pdf
Reference Type: Journal Article
Author(s): Kroemer, O.; Peters, J.
Year: 2017
Title: A Comparison of Autoregressive Hidden Markov Models for Multi-Modal Manipulations with Variable Masses
Journal/Conference/Book Title: Proceedings of the International Conference on Robotics and Automation, and IEEE Robotics and Automation Letters (RA-L)
Keywords: 3rd-Hand, TACMAN
Volume: 2
Number: 2
Pages: 1101-1108
Reference Type: Conference Proceedings
Author(s): Farraj, F.B.; Osa, T.; Pedemonte, N.; Peters, J.; Neumann, G.; Giordano, P.R.
Year: 2017
Title: A Learning-based Shared Control Architecture for Interactive Task Execution
Journal/Conference/Book Title: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/firas_ICRA17.pdf
Reference Type: Conference Proceedings
Author(s): Wilbers, D.; Lioutikov, R.; Peters, J.
Year: 2017
Title: Context-Driven Movement Primitive Adaptation
Journal/Conference/Book Title: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)
Keywords: 3rd-Hand, BIMROB
Link to PDF: /uploads/Member/PubRudolfLioutikov/wilbers_icra_2017.pdf
Reference Type: Conference Proceedings
Author(s): End, F.; Akrour, R.; Peters, J.; Neumann, G.
Year: 2017
Title: Layered Direct Policy Search for Learning Hierarchical Skills
Journal/Conference/Book Title: Proceedings of the International Conference on Robotics and Automation (ICRA)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Research/Overview/icra17_felix.pdf
Reference Type: Conference Proceedings
Author(s): Gabriel, A.; Akrour, R.; Peters, J.; Neumann, G.
Year: 2017
Title: Empowered Skills
Journal/Conference/Book Title: Proceedings of the International Conference on Robotics and Automation (ICRA)
Keywords: IAS
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Research/Overview/icra17_alex.pdf
Reference Type: Conference Proceedings
Author(s): Abdulsamad, H.; Arenz, O.; Peters, J.; Neumann, G.
Year: 2017
Title: State-Regularized Policy Search for Linearized Dynamical Systems
Journal/Conference/Book Title: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Abdulsamad_ICAPS_2017.pdf
Reference Type: Conference Proceedings
Author(s): Fiebig, K.H.; Jayaram, V.; Hesse, T.; Blank, A.; Peters, J.; Grosse-Wentrup, M.
Year: 2017
Title: Bayesian Regression for Artifact Correction in Electroencephalography
Journal/Conference/Book Title: Proceedings of the 7th Graz Brain-Computer Interface Conference
Reference Type: Conference Proceedings
Author(s): Akrour, R.; Sorokin, D.; Peters, J.; Neumann, G.
Year: 2017
Title: Local Bayesian Optimization of Motor Skills
Journal/Conference/Book Title: Proceedings of the International Conference on Machine Learning (ICML)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/lbayes_motor_skills.pdf
Reference Type: Conference Proceedings
Author(s): Gebhardt, G.H.W.; Daun, K.; Schnaubelt, M.; Hendrich, A.; Kauth, D.; Neumann, G.
Year: 2017
Title: Learning to Assemble Objects with a Robot Swarm
Journal/Conference/Book Title: Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems (AAMAS 17)
Keywords: multi-agent learning, reinforcement learning, swarm robotics
Abstract: Nature provides us with a multitude of examples that show how swarms of simple agents are much richer in their abilities than a single individual. This insight is a main principle that swarm robotics tries to exploit. In the last years, large swarms of low-cost robots such as the Kilobots have become available. This allows to bring algorithms developed for swarm robotics from simulations to the real world. Recently, the Kilobots have been used for an assembly task with multiple objects: a human operator controlled a light source to guide the swarm of light-sensitive robots such that they successfully assembled an object of multiple parts. However, hand-coding the control of the light source for autonomous assembly is not straightforward as the interactions of the swarm with the object or the reaction to the light source are hard to model.
Publisher: International Foundation for Autonomous Agents and Multiagent Systems
Pages: 1547-1549
URL(s): http://dl.acm.org/citation.cfm?id=3091282.3091357
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/GregorGebhardt/LearningToAssembleObjectsWithARobotSwarm.pdf
Reference Type: Conference Proceedings
Author(s): Tosatto, S.; D'Eramo, C.; Pirotta, M.; Restelli, M.
Year: 2017
Title: Boosted Fitted Q-Iteration
Journal/Conference/Book Title: Proceedings of the International Conference on Machine Learning (ICML)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/tosatto_icml2017.pdf
Reference Type: Conference Paper
Author(s): Belousov, B.; Neumann, G.; Rothkopf, C.A.; Peters, J.
Year: 2017
Title: Catching heuristics are optimal control policies
Journal/Conference/Book Title: Proceedings of the Karniel Thirteenth Computational Motor Control Workshop
Keywords: SKILLS4ROBOTS
Abstract: Two seemingly contradictory theories attempt to explain how humans move to intercept an airborne ball. One theory posits that humans predict the ball trajectory to optimally plan future actions; the other claims that, instead of performing such complicated computations, humans employ heuristics to reactively choose appropriate actions based on immediate visual feedback. In this paper, we show that interception strategies appearing to be heuristics can be understood as computational solutions to the optimal control problem faced by a ball-catching agent acting under uncertainty. Modeling catching as a continuous partially observable Markov decision process and employing stochastic optimal control theory, we discover that the four main heuristics described in the literature are optimal solutions if the catcher has sufficient time to continuously visually track the ball. Specifically, by varying model parameters such as noise, time to ground contact, and perceptual latency, we show that different strategies arise under different circumstances. The catcher’s policy switches between generating reactive and predictive behavior based on the ratio of system to observation noise and the ratio between reaction time and task duration. Thus, we provide a rational account of human ball-catching behavior and a unifying explanation for seemingly contradictory theories of target interception on the basis of stochastic optimal control.
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Belousov_ANIPS_2016.pdf
Reference Type: Conference Proceedings
Author(s): Busch, B.; Maeda, G.; Mollard, Y.; Demangeat, M.; Lopes, M.
Year: 2017
Title: Postural Optimization for an Ergonomic Human-Robot Interaction
Journal/Conference/Book Title: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Keywords: 3rd-Hand
Reference Type: Conference Proceedings
Author(s): Pajarinen, J.; Kyrki, V.; Koval, M.; Srinivasa, S.; Peters, J.; Neumann, G.
Year: 2017
Title: Hybrid Control Trajectory Optimization under Uncertainty
Journal/Conference/Book Title: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Keywords: RoMaNS
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/pajarinen_iros_2017.pdf
Reference Type: Conference Proceedings
Author(s): Parisi, S.; Ramstedt, S.; Peters, J.
Year: 2017
Title: Goal-Driven Dimensionality Reduction for Reinforcement Learning
Journal/Conference/Book Title: Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/parisi2017iros.pdf
Reference Type: Journal Article
Author(s): Paraschos, A.; Lioutikov, R.; Peters, J.; Neumann, G.
Year: 2017
Title: Probabilistic Prioritization of Movement Primitives
Journal/Conference/Book Title: Proceedings of the International Conference on Intelligent Robots and Systems, and IEEE Robotics and Automation Letters (RA-L)
Keywords: CoDyCo
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/AlexandrosParaschos/paraschos_prob_prio.pdf
Reference Type: Journal Article
Author(s): van Hoof, H.; Neumann, G.; Peters, J.
Year: 2017
Title: Non-parametric Policy Search with Limited Information Loss
Journal/Conference/Book Title: Journal of Machine Learning Research (JMLR)
Keywords: TACMAN, reinforcement learning
Volume: 18
Number: 73
Pages: 1-46
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Alumni/HerkeVanHoof/vanHoof_JMLR_2017.pdf
Reference Type: Journal Article
Author(s): Vinogradska, J.; Bischoff, B.; Nguyen-Tuong, D.; Peters, J.
Year: 2017
Title: Stability of Controllers for Gaussian Process Forward Models
Journal/Conference/Book Title: Journal of Machine Learning Research (JMLR)
Volume: 18
Number: 100
Pages: 1-37
Reference Type: Conference Proceedings
Author(s): Tanneberg, D.; Peters, J.; Rueckert, E.
Year: 2017
Title: Online Learning with Stochastic Recurrent Neural Networks using Intrinsic Motivation Signals
Journal/Conference/Book Title: Proceedings of the Conference on Robot Learning (CoRL)
Keywords: GOAL-Robots, SKILLS4ROBOTS
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/DanielTanneberg/corl17_01.pdf
Reference Type: Conference Proceedings
Author(s): Maeda, G.; Ewerton, M.; Osa, T.; Busch, B.; Peters, J.
Year: 2017
Title: Active Incremental Learning of Robot Movement Primitives
Journal/Conference/Book Title: Proceedings of the Conference on Robot Learning (CoRL)
Keywords: 3rd-Hand
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/PubGJMaeda/maedaCoRL_20171014.pdf
Reference Type: Conference Proceedings
Author(s): Rueckert, E.; Nakatenus, M.; Tosatto, S.; Peters, J.
Year: 2017
Title: Learning Inverse Dynamics Models in O(n) time with LSTM networks
Journal/Conference/Book Title: Proceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Keywords: GOAL-Robots, SKILLS4ROBOTS
Link to PDF: https://rueckert.lima-city.de/papers/Humanoids2017Rueckert.pdf
Reference Type: Conference Proceedings
Author(s): Tanneberg, D.; Peters, J.; Rueckert, E.
Year: 2017
Title: Efficient Online Adaptation with Stochastic Recurrent Neural Networks
Journal/Conference/Book Title: Proceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Keywords: GOAL-Robots, SKILLS4ROBOTS
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/DanielTanneberg/humanoids17_01.pdf
Reference Type: Conference Proceedings
Author(s): Stark, S.; Peters, J.; Rueckert, E.
Year: 2017
Title: A Comparison of Distance Measures for Learning Nonparametric Motor Skill Libraries
Journal/Conference/Book Title: Proceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Keywords: GOAL-Robots, SKILLS4ROBOTS
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/SvenjaStark/stark_humanoids2017.pdf
Reference Type: Conference Proceedings
Author(s): Osa, T.; Peters, J.; Neumann, G.
Year: 2016
Title: Experiments with Hierarchical Reinforcement Learning of Multiple Grasping Policies
Journal/Conference/Book Title: Proceedings of the International Symposium on Experimental Robotics (ISER)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/osa_ISER2016.pdf
Reference Type: Conference Proceedings
Author(s): Arenz, O.; Abdulsamad, H.; Neumann, G.
Year: 2016
Title: Optimal Control and Inverse Optimal Control by Distribution Matching
Journal/Conference/Book Title: Proceedings of the International Conference on Intelligent Robots and Systems (IROS)
Keywords: Imitation Learning, Inverse Optimal Control, Optimal Control
Abstract: Optimal control is a powerful approach to achieve optimal behavior. However, it typically requires a manual specification of a cost function which often contains several objectives, such as reaching goal positions at different time steps or energy efficiency. Manually trading off these objectives is often difficult and requires a high engineering effort. In this paper, we present a new approach to specify optimal behavior. We directly specify the desired behavior by a distribution over future states or features of the states. For example, the experimenter could choose to reach certain mean positions with given accuracy/variance at specified time steps. Our approach also unifies optimal control and inverse optimal control in one framework. Given a desired state distribution, we estimate a cost function such that the optimal controller matches the desired distribution. If the desired distribution is estimated from expert demonstrations, our approach performs inverse optimal control. We evaluate our approach on several optimal and inverse optimal control tasks on non-linear systems using incremental linearizations similar to differential dynamic programming approaches.
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/OlegArenz/OC and IOC By Matching Distributions_withSupplements.pdf
Reference Type: Journal Article
Author(s): Rueckert, E.; Kappel, D.; Tanneberg, D.; Pecevski, D.; Peters, J.
Year: 2016
Title: Recurrent Spiking Networks Solve Planning Tasks
Journal/Conference/Book Title: Nature PG: Scientific Reports
Keywords: 3rd-Hand, CoDyCo
Publisher: Nature Publishing Group
Volume: 6
Number: 21142
Date: 2016/02/18 (online)
DOI: 10.1038/srep21142
Custom 2: http://www.nature.com/articles/srep21142#supplementary-information
URL(s): http://www.nature.com/articles/srep21142
Link to PDF: http://dx.doi.org/10.1038/srep21142
Reference Type: Conference Proceedings
Author(s): Kohlschuetter, J.; Peters, J.; Rueckert, E.
Year: 2016
Title: Learning Probabilistic Features from EMG Data for Predicting Knee Abnormalities
Journal/Conference/Book Title: Proceedings of the XIV Mediterranean Conference on Medical and Biological Engineering and Computing (MEDICON)
Keywords: CoDyCo, TACMAN
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/KohlschuetterMEDICON_2016.pdf
Reference Type: Journal Article
Author(s): Maeda, G.; Ewerton, M.; Koert, D.; Peters, J.
Year: 2016
Title: Acquiring and Generalizing the Embodiment Mapping from Human Observations to Robot Skills
Journal/Conference/Book Title: IEEE Robotics and Automation Letters (RA-L)
Keywords: 3rd-Hand, BIMROB
Volume: 1
Number: 2
Pages: 784-791
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/GuilhermeMaeda/maeda_RAL_golf_2016.pdf
Reference Type: Conference Proceedings
Author(s): Modugno, V.; Neumann, G.; Rueckert, E.; Oriolo, G.; Peters, J.; Ivaldi, S.
Year: 2016
Title: Learning soft task priorities for control of redundant robots
Journal/Conference/Book Title: Proceedings of the International Conference on Robotics and Automation (ICRA)
Keywords: CoDyCo
Reference Type: Conference Proceedings
Author(s): Buechler, D.; Ott, H.; Peters, J.
Year: 2016
Title: A Lightweight Robotic Arm with Pneumatic Muscles for Robot Learning
Journal/Conference/Book Title: Proceedings of the International Conference on Robotics and Automation (ICRA)
Reference Type: Conference Proceedings
Author(s): Ewerton, M.; Maeda, G.; Neumann, G.; Kisner, V.; Kollegger, G.; Wiemeyer, J.; Peters, J.
Year: 2016
Title: Movement Primitives with Multiple Phase Parameters
Journal/Conference/Book Title: Proceedings of the International Conference on Robotics and Automation (ICRA)
Keywords: BIMROB, 3rd-Hand
Pages: 201-206
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_icra_2016_stockholm.pdf
Reference Type: Journal Article
Author(s): Daniel, C.; Neumann, G.; Kroemer, O.; Peters, J.
Year: 2016
Title: Hierarchical Relative Entropy Policy Search
Journal/Conference/Book Title: Journal of Machine Learning Research (JMLR)
Volume: 17
Pages: 1-50
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Daniel2016JMLR.pdf
Reference Type: Journal Article
Author(s): Veiga, F.F.; Peters, J.
Year: 2016
Title: Can Modular Finger Control for In-Hand Object Stabilization be accomplished by Independent Tactile Feedback Control Laws?
Journal/Conference/Book Title: arXiv
Link to PDF: https://arxiv.org/pdf/1612.08202.pdf
Reference Type: Journal Article
Author(s): Abdolmaleki, A.; Lau, N.; Reis, L.; Peters, J.; Neumann, G.
Year: 2016
Title: Contextual Policy Search for Linear and Nonlinear Generalization of a Humanoid Walking Controller
Journal/Conference/Book Title: Journal of Intelligent & Robotic Systems
Link to PDF: http://download.springer.com/static/pdf/812/art%253A10.1007%252Fs10846-016-0347-y.pdf?originUrl=http%3A%2F%2Flink.springer.com%2Farticle%2F10.1007%2Fs10846-016-0347-y&token2=exp=1456995635~acl=%2Fstatic%2Fpdf%2F812%2Fart%25253A10.1007%25252Fs10846-016-034
Reference Type: Conference Proceedings
Author(s): Vinogradska, J.; Bischoff, B.; Nguyen-Tuong, D.; Romer, A.; Schmidt, H.; Peters, J.
Year: 2016
Title: Stability of Controllers for Gaussian Process Forward Models
Journal/Conference/Book Title: Proceedings of the International Conference on Machine Learning (ICML)
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Publications/Vinogradska_ICML_2016.pdf
Reference Type: Conference Proceedings
Author(s): Akrour, R.; Abdolmaleki, A.; Abdulsamad, H.; Neumann, G.
Year: 2016
Title: Model-Free Trajectory Optimization for Reinforcement Learning
Journal/Conference/Book Title: Proceedings of the International Conference on Machine Learning (ICML)
Link to PDF: https://arxiv.org/pdf/1606.09197.pdf
Reference Type: Conference Proceedings
Author(s): Sharma, D.; Tanneberg, D.; Grosse-Wentrup, M.; Peters, J.; Rueckert, E.
Year: 2016
Title: Adaptive Training Strategies for BCIs
Journal/Conference/Book Title: Cybathlon Symposium
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/ElmarR%c3%bcckert/Cybathlon16_AdaptiveTrainingRL.pdf
Reference Type: Conference Proceedings
Author(s): Calandra, R.; Peters, J.; Rasmussen, C.E.; Deisenroth, M.P.
Year: 2016
Title: Manifold Gaussian Processes for Regression
Journal/Conference/Book Title: Proceedings of the International Joint Conference on Neural Networks (IJCNN)
Keywords: CoDyCo
Link to PDF: http://arxiv.org/pdf/1402.5876v4
Reference Type: Conference Proceedings
Author(s): Weber, P.; Rueckert, E.; Calandra, R.; Peters, J.; Beckerle, P.
Year: 2016
Title: A Low-cost Sensor Glove with Vibrotactile Feedback and Multiple Finger Joint and Hand Motion Sensing for Human-Robot Interaction
Journal/Conference/Book Title: Proceedings of the IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)
Keywords: CoDyCo
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/ElmarR%c3%bcckert/ROMANS16_daglove.pdf
Reference Type: Journal Article
Author(s): Rueckert, E.; Camernik, J.; Peters, J.; Babic, J.
Year: 2016
Title: Probabilistic Movement Models Show that Postural Control Precedes and Predicts Volitional Motor Control
Journal/Conference/Book Title: Nature PG: Scientific Reports
Keywords: CoDyCo, TACMAN
Volume: 6
Number: 28455
URL(s): http://www.nature.com/articles/srep28455
Link to PDF: http://dx.doi.org/10.1038/srep28455
Reference Type: Journal Article
Author(s): Daniel, C.; van Hoof, H.; Peters, J.; Neumann, G.
Year: 2016
Title: Probabilistic Inference for Determining Options in Reinforcement Learning
Journal/Conference/Book Title: Machine Learning (MLJ)
Volume: 104
Number: 2-3
Pages: 337-357
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Daniel2016ECML.pdf
Reference Type: Conference Proceedings
Author(s): Manschitz, S.; Gienger, M.; Kober, J.; Peters, J.
Year: 2016
Title: Probabilistic Decomposition of Sequential Force Interaction Tasks into Movement Primitives
Journal/Conference/Book Title: Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
Keywords: Honda, HRI-Collaboration
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/SimonManschitz/ManschitzIROS2016.pdf
Reference Type: Conference Proceedings
Author(s): van Hoof, H.; Chen, N.; Karl, M.; van der Smagt, P.; Peters, J.
Year: 2016
Title: Stable Reinforcement Learning with Autoencoders for Tactile and Visual Data
Journal/Conference/Book Title: Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
Keywords: TACMAN, tactile manipulation
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/hoof2016IROS.pdf
Reference Type: Conference Proceedings
Author(s): Yi, Z.; Calandra, R.; Veiga, F.; van Hoof, H.; Hermans, T.; Zhang, Y.; Peters, J.
Year: 2016
Title: Active Tactile Object Exploration with Gaussian Processes
Journal/Conference/Book Title: Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
Keywords: TACMAN, tactile manipulation
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Publications/Other/iros2016yi.pdf
Reference Type: Conference Proceedings
Author(s): Koc, O.; Peters, J.; Maeda, G.
Year: 2016
Title: A New Trajectory Generation Framework in Robotic Table Tennis
Journal/Conference/Book Title: Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
Reference Type: Conference Proceedings
Author(s): Belousov, B.; Neumann, G.; Rothkopf, C.; Peters, J.
Year: 2016
Title: Catching heuristics are optimal control policies
Journal/Conference/Book Title: Advances in Neural Information Processing Systems (NIPS)
Keywords: SKILLS4ROBOTS
Abstract: Two seemingly contradictory theories attempt to explain how humans move to intercept an airborne ball. One theory posits that humans predict the ball trajectory to optimally plan future actions; the other claims that, instead of performing such complicated computations, humans employ heuristics to reactively choose appropriate actions based on immediate visual feedback. In this paper, we show that interception strategies appearing to be heuristics can be understood as computational solutions to the optimal control problem faced by a ball-catching agent acting under uncertainty. Modeling catching as a continuous partially observable Markov decision process and employing stochastic optimal control theory, we discover that the four main heuristics described in the literature are optimal solutions if the catcher has sufficient time to continuously visually track the ball. Specifically, by varying model parameters such as noise, time to ground contact, and perceptual latency, we show that different strategies arise under different circumstances. The catcher’s policy switches between generating reactive and predictive behavior based on the ratio of system to observation noise and the ratio between reaction time and task duration. Thus, we provide a rational account of human ball-catching behavior and a unifying explanation for seemingly contradictory theories of target interception on the basis of stochastic optimal control.
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Belousov_ANIPS_2016.pdf
Reference Type: Book Section
Author(s): Peters, J.; Tedrake, R.; Roy, N.; Morimoto, J.
Year: 2016
Title: Robot Learning
Journal/Conference/Book Title: Encyclopedia of Machine Learning, 2nd Edition, Invited Article
Reference Type: Conference Proceedings
Author(s): Tanneberg, D.; Paraschos, A.; Peters, J.; Rueckert, E.
Year: 2016
Title: Deep Spiking Networks for Model-based Planning in Humanoids
Journal/Conference/Book Title: Proceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Keywords: CoDyCo, TACMAN
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/DanielTanneberg/tanneberg_humanoids16.pdf
Reference Type: Conference Proceedings
Author(s): Huang, Y.; Buechler, D.; Koc, O.; Schoelkopf, B.; Peters, J.
Year: 2016
Title: Jointly Learning Trajectory Generation and Hitting Point Prediction in Robot Table Tennis
Journal/Conference/Book Title: Proceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Reference Type: Conference Proceedings
Author(s): Koert, D.; Maeda, G.J.; Lioutikov, R.; Neumann, G.; Peters, J.
Year: 2016
Title: Demonstration Based Trajectory Optimization for Generalizable Robot Motions
Journal/Conference/Book Title: Proceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Keywords: 3rd-Hand, SKILLS4ROBOTS
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/DorotheaKoert/Debato.pdf
Reference Type: Conference Proceedings
Author(s): Gomez-Gonzalez, S.; Neumann, G.; Schoelkopf, B.; Peters, J.
Year: 2016
Title: Using Probabilistic Movement Primitives for Striking Movements
Journal/Conference/Book Title: Proceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Reference Type: Conference Proceedings
Author(s): Ewerton, M.; Maeda, G.J.; Kollegger, G.; Wiemeyer, J.; Peters, J.
Year: 2016
Title: Incremental Imitation Learning of Context-Dependent Motor Skills
Journal/Conference/Book Title: Proceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Keywords: BIMROB, 3rd-Hand
Link to PDF: http://www.ausy.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_humanoids_2016.pdf
Reference TypeConference Proceedings
Author(s)Azad, M.; Ortenzi, V.; Lin, H.C.; Rueckert, E.; Mistry, M.
Year2016
TitleModel Estimation and Control of Compliant Contact Normal Force
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsCoDyCo
Link to PDFhttps://rueckert.lima-city.de/papers/Humanoids2016Azad.pdf
Reference TypeConference Proceedings
Author(s)Parisi, S.; Blank, A.; Viernickel, T.; Peters, J.
Year2016
TitleLocal-utopia Policy Selection for Multi-objective Reinforcement Learning
Journal/Conference/Book TitleProceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/parisi2016local.pdf
Reference TypeBook Section
Author(s)Peters, J.; Bagnell, J.A.
Year2016
TitlePolicy Gradient Methods
Journal/Conference/Book TitleEncyclopedia of Machine Learning, 2nd Edition, Invited Article
Reference TypeConference Proceedings
Author(s)Fiebig, K.-H.; Jayaram, V.; Peters, J.; Grosse-Wentrup, M.
Year2016
TitleMulti-Task Logistic Regression in Brain-Computer Interfaces
Journal/Conference/Book TitleIEEE SMC 2016 - 6th Workshop on Brain-Machine Interface Systems
Reference TypeJournal Article
Author(s)Yi, Z.; Zhang, Y.; Peters, J.
Year2016
TitleSurface Roughness Discrimination Using Bioinspired Tactile Sensors
Journal/Conference/Book TitleProceedings of the 16th International Conference on Biomedical Engineering
Reference TypeJournal Article
Author(s)Calandra, R.; Seyfarth, A.; Peters, J.; Deisenroth, M.
Year2015
TitleBayesian Optimization for Learning Gaits under Uncertainty
Journal/Conference/Book TitleAnnals of Mathematics and Artificial Intelligence (AMAI)
KeywordsCoDyCo
AbstractDesigning gaits and corresponding control policies is a key challenge in robot locomotion. Even with a viable controller parameterization, finding near-optimal parameters can be daunting. Typically, this kind of parameter optimization requires specific expert knowledge and extensive robot experiments. Automatic black-box gait optimization methods greatly reduce the need for human expertise and time-consuming design processes. Many different approaches for automatic gait optimization have been suggested to date, such as grid search and evolutionary algorithms. In this article, we thoroughly discuss multiple of these optimization methods in the context of automatic gait optimization. Moreover, we extensively evaluate Bayesian optimization, a model-based approach to black-box optimization under uncertainty, on both simulated problems and real robots. This evaluation demonstrates that Bayesian optimization is particularly suited for robotic applications, where it is crucial to find a good set of gait parameters in a small number of experiments.
URL(s) http://dx.doi.org/10.1007/s10472-015-9463-9
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Calandra2015a.pdf
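The abstract above centers on Bayesian optimization as a sample-efficient black-box optimizer for gait parameters. As an illustration only (not code from the paper; the 1-D objective, kernel length-scale, and evaluation budget are made-up stand-ins for gait parameters and robot experiments), a minimal numpy sketch of the loop with a Gaussian-process surrogate and expected-improvement acquisition:

```python
import math
import numpy as np

def rbf(a, b, ell=0.3, sf=1.0):
    # Squared-exponential kernel on 1-D inputs
    d = a.reshape(-1, 1) - b.reshape(1, -1)
    return sf ** 2 * np.exp(-0.5 * (d / ell) ** 2)

def gp_posterior(x_tr, y_tr, x_te, noise=1e-6):
    # Standard zero-mean GP regression posterior (mean and marginal variance)
    K = rbf(x_tr, x_tr) + noise * np.eye(len(x_tr))
    Ks = rbf(x_tr, x_te)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_tr))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf(x_te, x_te) - v.T @ v)
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, y_best):
    # EI acquisition for minimization
    sigma = np.sqrt(var)
    z = (y_best - mu) / sigma
    Phi = 0.5 * (1.0 + np.array([math.erf(zi / math.sqrt(2.0)) for zi in z]))
    phi = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (y_best - mu) * Phi + sigma * phi

def objective(x):
    # Hypothetical stand-in for a costly gait-performance measure to minimize
    return (x - 0.6) ** 2 + 0.05 * np.sin(15.0 * x)

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 200)
x_tr = rng.uniform(0.0, 1.0, 3)      # three initial "robot experiments"
y_tr = objective(x_tr)

for _ in range(10):                  # ten further experiments chosen by EI
    mu, var = gp_posterior(x_tr, y_tr, grid)
    x_next = grid[np.argmax(expected_improvement(mu, var, y_tr.min()))]
    x_tr = np.append(x_tr, x_next)
    y_tr = np.append(y_tr, objective(x_next))

best = x_tr[np.argmin(y_tr)]
print(f"best parameter after 13 evaluations: {best:.3f}")
```

The same loop structure extends to multi-dimensional gait parameters; the surrogate is what lets each costly robot experiment be chosen deliberately rather than by grid search.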
Reference TypeJournal Article
Author(s)Mariti, C.; Muscolo, G.G.; Peters, J.; Puig, D.; Recchiuto, C.T.; Sighieri, C.; Solanas, A.; von Stryk, O.
Year2015
TitleDeveloping biorobotics for veterinary research into cat movements
Journal/Conference/Book TitleJournal of Veterinary Behavior: Clinical Applications and Research
Volume10
Number3
Pages248-254
Link to PDFhttp://www.sciencedirect.com/science/article/pii/S1558787815000052
Reference TypeConference Proceedings
Author(s)van Hoof, H.; Peters, J.; Neumann, G.
Year2015
TitleLearning of Non-Parametric Control Policies with High-Dimensional State Features
Journal/Conference/Book TitleProceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS)
KeywordsTACMAN
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/hoof2015learning.pdf
Reference TypeConference Proceedings
Author(s)Calandra, R.; Ivaldi, S.; Deisenroth, M.; Rueckert, E.; Peters, J.
Year2015
TitleLearning Inverse Dynamics Models with Contacts
Journal/Conference/Book TitleProceedings of the International Conference on Robotics and Automation (ICRA)
KeywordsCoDyCo
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_ICRA15.pdf
Reference TypeConference Proceedings
Author(s)Kroemer, O.; Daniel, C.; Neumann, G; van Hoof, H.; Peters, J.
Year2015
TitleTowards Learning Hierarchical Skills for Multi-Phase Manipulation Tasks
Journal/Conference/Book TitleProceedings of the International Conference on Robotics and Automation (ICRA)
Keywords3rd-Hand
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/KroemerICRA15.pdf
Reference TypeConference Proceedings
Author(s)Rueckert, E.; Mundo, J.; Paraschos, A.; Peters, J.; Neumann, G.
Year2015
TitleExtracting Low-Dimensional Control Variables for Movement Primitives
Journal/Conference/Book TitleProceedings of the International Conference on Robotics and Automation (ICRA)
Keywords3rdHand, CoDyCo
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Rueckert_ICRA14LMProMPsFinal.pdf
Reference TypeConference Proceedings
Author(s)Ewerton, M.; Neumann, G.; Lioutikov, R.; Ben Amor, H.; Peters, J.; Maeda, G.
Year2015
TitleLearning Multiple Collaborative Tasks with a Mixture of Interaction Primitives
Journal/Conference/Book TitleProceedings of the International Conference on Robotics and Automation (ICRA)
Keywords3rd-Hand, CompLACS, BIMROB
Pages1535--1542
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_icra_2015_seattle.pdf
Reference TypeConference Proceedings
Author(s)Traversaro, S.; Del Prete, A.; Ivaldi, S.; Nori, F.
Year2015
TitleAvoiding to rely on Inertial Parameters in Estimating Joint Torques with proximal F/T sensing
Journal/Conference/Book TitleProceedings of the International Conference on Robotics and Automation (ICRA)
KeywordsCoDyCo
Reference TypeConference Proceedings
Author(s)Lopes, M.; Peters, J.; Piater, J.; Toussaint, M.; Baisero, A.; Busch, B.; Erkent, O.; Kroemer, O.; Lioutikov, R.; Maeda, G.; Mollard, Y.; Munzer, T.; Shukla, D.
Year2015
TitleSemi-Autonomous 3rd-Hand Robot
Journal/Conference/Book TitleWorkshop on Cognitive Robotics in Future Manufacturing Scenarios, European Robotics Forum, Vienna, Austria
Keywords3rdhand
Link to PDFhttps://iis.uibk.ac.at/public/papers/Lopes-2015-CogRobFoF.pdf
Reference TypeConference Proceedings
Author(s)Lioutikov, R.; Neumann, G.; Maeda, G.J.; Peters, J.
Year2015
TitleProbabilistic Segmentation Applied to an Assembly Task
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Keywords3rd-hand
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/lioutikov_humanoids_2015.pdf
Reference TypeConference Proceedings
Author(s)Paraschos, A.; Rueckert, E.; Peters, J.; Neumann, G.
Year2015
TitleModel-Free Probabilistic Movement Primitives for Physical Interaction
Journal/Conference/Book TitleProceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
KeywordsCoDyCo
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Team/PubAlexParaschos/Paraschos_IROS_2015.pdf
Reference TypeConference Paper
Author(s)Rueckert, E.; Lioutikov, R.; Calandra, R.; Schmidt, M.; Beckerle, P.; Peters, J.
Year2015
TitleLow-cost Sensor Glove with Force Feedback for Learning from Demonstrations using Probabilistic Trajectory Representations
Journal/Conference/Book TitleICRA 2015 Workshop on Tactile and force sensing for autonomous compliant intelligent robots
KeywordsCoDyCo
URL(s) http://arxiv.org/abs/1510.03253
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Workshops/ICRA2015TactileForce/13_icra_ws_tactileforce.pdf
Reference TypeConference Proceedings
Author(s)Ewerton, M.; Neumann, G.; Lioutikov, R.; Ben Amor, H.; Peters, J.; Maeda, G.
Year2015
TitleModeling Spatio-Temporal Variability in Human-Robot Interaction with Probabilistic Movement Primitives
Journal/Conference/Book TitleWorkshop on Machine Learning for Social Robotics, ICRA
Keywords3rd-Hand, CompLACS, BIMROB
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_workshop_ml_social_robotics_icra_2015.pdf
Reference TypeConference Proceedings
Author(s)Parisi, S.; Abdulsamad, H.; Paraschos, A.; Daniel, C.; Peters, J.
Year2015
TitleReinforcement Learning vs Human Programming in Tetherball Robot Games
Journal/Conference/Book TitleProceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
KeywordsSCARL
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Team/SimoneParisi/parisi_iros_2015
Reference TypeConference Proceedings
Author(s)Veiga, F.F.; van Hoof, H.; Peters, J.; Hermans, T.
Year2015
TitleStabilizing Novel Objects by Learning to Predict Tactile Slip
Journal/Conference/Book TitleProceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
KeywordsTACMAN
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/IROS2015veiga.pdf
Reference TypeConference Proceedings
Author(s)Huang, Y.; Schoelkopf, B.; Peters, J.
Year2015
TitleLearning Optimal Striking Points for A Ping-Pong Playing Robot
Journal/Conference/Book TitleProceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Team/YanlongHuang/Yanlong_IROS2015
Reference TypeConference Proceedings
Author(s)Manschitz, S.; Kober, J.; Gienger, M.; Peters, J.
Year2015
TitleProbabilistic Progress Prediction and Sequencing of Concurrent Movement Primitives
Journal/Conference/Book TitleProceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
KeywordsHonda
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/ManschitzIROS2015_v2.pdf
Reference TypeConference Proceedings
Author(s)Ewerton, M.; Maeda, G.J.; Peters, J.; Neumann, G.
Year2015
TitleLearning Motor Skills from Partially Observed Movements Executed at Different Speeds
Journal/Conference/Book TitleProceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
KeywordsBIMROB, 3rd-hand
Pages456--463
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Team/MarcoEwerton/ewerton_iros_2015_hamburg.pdf
Reference TypeConference Proceedings
Author(s)Wahrburg, A.; Zeiss, S.; Matthias, B.; Peters, J.; Ding, H.
Year2015
TitleCombined Pose-Wrench and State Machine Representation for Modeling Robotic Assembly Skills
Journal/Conference/Book TitleProceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
KeywordsABB
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Wahrburg_IROS_2015.pdf
Reference TypeJournal Article
Author(s)Daniel, C.; Kroemer, O.; Viering, M.; Metz, J.; Peters, J.
Year2015
TitleActive Reward Learning with a Novel Acquisition Function
Journal/Conference/Book TitleAutonomous Robots (AURO)
KeywordsComPLACS
Volume39
Number3
Pages389-405
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Team/ChristianDaniel/ActiveRewardLearning.pdf
Reference TypeConference Proceedings
Author(s)Fritsche, L.; Unverzagt, F.; Peters, J.; Calandra, R.
Year2015
TitleFirst-Person Tele-Operation of a Humanoid Robot
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsCoDyCo
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Fritsche_Humanoids15.pdf
Reference TypeConference Proceedings
Author(s)Calandra, R.; Ivaldi, S.; Deisenroth, M.; Peters, J.
Year2015
TitleLearning Torque Control in Presence of Contacts using Tactile Sensing from Robot Skin
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsCoDyCo
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_humanoids2015.pdf
Reference TypeJournal Article
Author(s)Manschitz, S.; Kober, J.; Gienger, M.; Peters, J.
Year2015
TitleLearning Movement Primitive Attractor Goals and Sequential Skills from Kinesthetic Demonstrations
Journal/Conference/Book TitleRobotics and Autonomous Systems
KeywordsHonda, HRI-Collaboration
Volume74
Pages97-107
ISBN/ISSN0921-8890
URL(s) http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/ManschitzRAS2015_v2.pdf
Reference TypeConference Proceedings
Author(s)Maeda, G.; Neumann, G.; Ewerton, M.; Lioutikov, R.; Peters, J.
Year2015
TitleA Probabilistic Framework for Semi-Autonomous Robots Based on Interaction Primitives with Phase Estimation
Journal/Conference/Book TitleProceedings of the International Symposium of Robotics Research (ISRR)
Keywords3rd-Hand, BIMROB
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Team/PubGJMaeda/ISRR_uploaded_20150814_small.pdf
Reference TypeConference Proceedings
Author(s)Koc, O.; Maeda, G.; Neumann, G.; Peters, J.
Year2015
TitleOptimizing Robot Striking Movement Primitives with Iterative Learning Control
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Keywords3rd-hand
Reference TypeConference Proceedings
Author(s)Hoelscher, J.; Peters, J.; Hermans, T.
Year2015
TitleEvaluation of Interactive Object Recognition with Tactile Sensing
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsTACMAN, tactile manipulation
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Theses/hoelscher_ichr2015.pdf
Reference TypeConference Proceedings
Author(s)van Hoof, H.; Hermans, T.; Neumann, G.; Peters, J.
Year2015
TitleLearning Robot In-Hand Manipulation with Tactile Features
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsTACMAN, tactile manipulation
URL(s) http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/HoofHumanoids2015.pdf
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/HoofHumanoids2015.pdf
Reference TypeConference Proceedings
Author(s)Leischnig, S.; Luettgen, S.; Kroemer, O.; Peters, J.
Year2015
TitleA Comparison of Contact Distribution Representations for Learning to Predict Object Interactions
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsTACMAN, tactile manipulation
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/Leischnig-Humanoids-2015.pdf
Reference TypeConference Proceedings
Author(s)Abdolmaleki, A.; Lioutikov, R.; Peters, J.; Lau, N.; Reis, L.; Neumann, G.
Year2015
TitleModel-Based Relative Entropy Stochastic Search
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems (NIPS)
KeywordsLearnRobotS
Place PublishedCambridge, MA
PublisherMIT Press
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Team/GerhardNeumann/Abdolmaleki_NIPS2015.pdf
Reference TypeConference Proceedings
Author(s)Dann, C.; Neumann, G.; Peters, J.
Year2015
TitlePolicy Evaluation with Temporal Differences: A Survey and Comparison
Journal/Conference/Book TitleProceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling (ICAPS)
Pages359-360
Reference TypeJournal Article
Author(s)Lioutikov, R.; Paraschos, A.; Peters, J.; Neumann, G.
Year2014
TitleGeneralizing Movements with Information Theoretic Stochastic Optimal Control
Journal/Conference/Book TitleJournal of Aerospace Information Systems
Volume11
Number9
Pages579-595
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/lioutikov_2014_itsoc.pdf
Reference TypeJournal Article
Author(s)Neumann, G.; Daniel, C.; Paraschos, A.; Kupcsik, A.; Peters, J.
Year2014
TitleLearning Modular Policies for Robotics
Journal/Conference/Book TitleFrontiers in Computational Neuroscience
Link to PDFhttp://www.frontiersin.org/Journal/Abstract.aspx?s=237&name=computational_neuroscience&ART_DOI=10.3389/fncom.2014.00062&utm_source=Email_to_authors_&utm_medium=Email&utm_content=T1_11.5e1_author&utm_campaign=Email_publication&journalName=Frontiers_in_Comp
Reference TypeConference Proceedings
Author(s)Nori, F.; Peters, J.; Padois, V.; Babic, J.; Mistry, M.; Ivaldi, S.
Year2014
TitleWhole-body motion in humans and humanoids
Journal/Conference/Book TitleProceedings of the Workshop on New Research Frontiers for Intelligent Autonomous Systems (NRF-IAS)
KeywordsCoDyCo
Pages81-92
URL(s) http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/nori2014iascodyco.pdf
Reference TypeJournal Article
Author(s)Dann, C.; Neumann, G.; Peters, J.
Year2014
TitlePolicy Evaluation with Temporal Differences: A Survey and Comparison
Journal/Conference/Book TitleJournal of Machine Learning Research (JMLR)
KeywordsComPLACS
Volume15
NumberMarch
Pages809-883
Link to PDFhttp://jmlr.org/papers/volume15/dann14a/dann14a.pdf
Reference TypeJournal Article
Author(s)Meyer, T.; Peters, J.; Zander, T.O.; Schoelkopf, B.; Grosse-Wentrup, M.
Year2014
TitlePredicting Motor Learning Performance from Electroencephalographic Data
Journal/Conference/Book TitleJournal of Neuroengineering and Rehabilitation
KeywordsTeam Athena-Minerva
Volume11
Number1
URL(s) http://www.ias.tu-darmstadt.de/uploads/Publications/Meyer_JNER_2013.pdf
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Meyer_JNER_2013.pdf
Reference TypeJournal Article
Author(s)Bocsi, B.; Csato, L.; Peters, J.
Year2014
TitleIndirect Robot Model Learning for Tracking Control
Journal/Conference/Book TitleAdvanced Robotics
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Bocsi_AR_2014.pdf
Reference TypeJournal Article
Author(s)Ben Amor, H.; Saxena, A.; Hudson, N.; Peters, J.
Year2014
TitleSpecial issue on autonomous grasping and manipulation
Journal/Conference/Book TitleAutonomous Robots (AURO)
Reference TypeJournal Article
Author(s)Deisenroth, M.P.; Fox, D.; Rasmussen, C.E.
Year2014
TitleGaussian Processes for Data-Efficient Learning in Robotics and Control
Journal/Conference/Book TitleIEEE Transactions on Pattern Analysis and Machine Intelligence
URL(s) http://www.doc.ic.ac.uk/~mpd37/publications/pami_final_w_appendix.pdf
Link to PDFhttp://www.doc.ic.ac.uk/~mpd37/publications/pami_final_w_appendix.pdf
Reference TypeJournal Article
Author(s)Wierstra, D.; Schaul, T.; Glasmachers, T.; Sun, Y.; Peters, J.; Schmidhuber, J.
Year2014
TitleNatural Evolution Strategies
Journal/Conference/Book TitleJournal of Machine Learning Research (JMLR)
Volume15
NumberMarch
Pages949-980
Link to PDFhttp://jmlr.org/papers/volume15/wierstra14a/wierstra14a.pdf
Reference TypeConference Proceedings
Author(s)Deisenroth, M.P.; Englert, P.; Peters, J.; Fox, D.
Year2014
TitleMulti-Task Policy Search for Robotics
Journal/Conference/Book TitleProceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA)
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Deisenroth_ICRA_2014.pdf
Reference TypeConference Proceedings
Author(s)Bischoff, B.; Nguyen-Tuong, D.; van Hoof, H.; McHutchon, A.; Rasmussen, C.E.; Knoll, A.; Peters, J.; Deisenroth, M.P.
Year2014
TitlePolicy Search For Learning Robot Control Using Sparse Data
Journal/Conference/Book TitleProceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA)
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Bischoff_ICRA_2014.pdf
Reference TypeConference Proceedings
Author(s)Calandra, R.; Seyfarth, A.; Peters, J.; Deisenroth, M.P.
Year2014
TitleAn Experimental Comparison of Bayesian Optimization for Bipedal Locomotion
Journal/Conference/Book TitleProceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA)
KeywordsCoDyCo
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_ICRA2014.pdf
Reference TypeBook
Author(s)Kober, J.; Peters, J.
Year2014
TitleLearning Motor Skills - From Algorithms to Robot Experiments
Journal/Conference/Book TitleSpringer Tracts in Advanced Robotics 97 (STAR Series), Springer
ISBN/ISSN978-3-319-03193-4
Reference TypeConference Proceedings
Author(s)Kroemer, O.; van Hoof, H.; Neumann, G.; Peters, J.
Year2014
TitleLearning to Predict Phases of Manipulation Tasks as Hidden States
Journal/Conference/Book TitleProceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA)
KeywordsTACMAN, 3rd-Hand
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2014.pdf
Reference TypeConference Proceedings
Author(s)Ben Amor, H.; Neumann, G.; Kamthe, S.; Kroemer, O.; Peters, J.
Year2014
TitleInteraction Primitives for Human-Robot Cooperation Tasks
Journal/Conference/Book TitleProceedings of 2014 IEEE International Conference on Robotics and Automation (ICRA)
KeywordsCoDyCo, ComPLACS
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/icraHeniInteract.pdf
Reference TypeConference Proceedings
Author(s)Haji-Ghassemi, N.; Deisenroth, M.P.
Year2014
TitleApproximate Inference for Long-Term Forecasting with Periodic Gaussian Processes
Journal/Conference/Book TitleProceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS)
Link to PDFhttps://spiral.imperial.ac.uk:8443/bitstream/10044/1/12886/2/paper.pdf
Reference TypeConference Proceedings
Author(s)Calandra, R.; Gopalan, N.; Seyfarth, A.; Peters, J.; Deisenroth, M.P.
Year2014
TitleBayesian Gait Optimization for Bipedal Locomotion
Journal/Conference/Book TitleProceedings of the 2014 Learning and Intelligent Optimization Conference (LION8)
KeywordsCoDyCo
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/Calandra_LION8.pdf
Reference TypeConference Proceedings
Author(s)Kamthe, S.; Peters, J.; Deisenroth, M.
Year2014
TitleMulti-modal filtering for non-linear estimation
Journal/Conference/Book TitleProceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Link to PDFhttps://spiral.imperial.ac.uk:8443/bitstream/10044/1/12921/2/ICASSP_Final.pdf
Reference TypeConference Proceedings
Author(s)Manschitz, S.; Kober, J.; Gienger, M.; Peters, J.
Year2014
TitleLearning to Unscrew a Light Bulb from Demonstrations
Journal/Conference/Book TitleProceedings of ISR/ROBOTIK 2014
KeywordsHRI-Collaboration
Reference TypeJournal Article
Author(s)Muelling, K.; Boularias, A.; Schoelkopf, B.; Peters, J.
Year2014
TitleLearning Strategies in Table Tennis using Inverse Reinforcement Learning
Journal/Conference/Book TitleBiological Cybernetics
Volume108
Number5
Pages603-619
Custom 1DOI 10.1007/s00422-014-0599-1
Custom 2http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/Muelling_BICY_2014.pdf
Reference TypeJournal Article
Author(s)Saut, J.-P.; Ivaldi, S.; Sahbani, A.; Bidaud, P.
Year2014
TitleGrasping objects localized from uncertain point cloud data
Journal/Conference/Book TitleRobotics and Autonomous Systems
KeywordsCoDyCo
Reference TypeConference Proceedings
Author(s)Lioutikov, R.; Kroemer, O.; Peters, J.; Maeda, G.
Year2014
TitleLearning Manipulation by Sequencing Motor Primitives with a Two-Armed Robot
Journal/Conference/Book TitleProceedings of the 13th International Conference on Intelligent Autonomous Systems (IAS)
Keywords3rd-Hand
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Member/PubRudolfLioutikov/lioutikov_ias13_conf.pdf
Reference TypeConference Proceedings
Author(s)Daniel, C.; Viering, M.; Metz, J.; Kroemer, O.; Peters, J.
Year2014
TitleActive Reward Learning
Journal/Conference/Book TitleProceedings of Robotics: Science & Systems (R:SS)
Keywordscomplacs
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Daniel_RSS_2014.pdf
Reference TypeConference Proceedings
Author(s)Kroemer, O.; Peters, J.
Year2014
TitlePredicting Object Interactions from Contact Distributions
Journal/Conference/Book TitleProceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
Keywords3rd-Hand, TACMAN, CoDyCo
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/KroemerIROS2014.pdf
Reference TypeConference Proceedings
Author(s)Chebotar, Y.; Kroemer, O.; Peters, J.
Year2014
TitleLearning Robot Tactile Sensing for Object Manipulation
Journal/Conference/Book TitleProceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
Keywords3rd-Hand, TACMAN
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/ChebotarIROS2014.pdf
Reference TypeConference Proceedings
Author(s)Manschitz, S.; Kober, J.; Gienger, M.; Peters, J.
Year2014
TitleLearning to Sequence Movement Primitives from Demonstrations
Journal/Conference/Book TitleProceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
KeywordsHRI-Collaboration, Honda
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/ManschitzIROS2014.pdf
Reference TypeConference Proceedings
Author(s)Luck, K.S.; Neumann, G.; Berger, E.; Peters, J.; Ben Amor, H.
Year2014
TitleLatent Space Policy Search for Robotics
Journal/Conference/Book TitleProceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS)
KeywordsComplacs, codyco
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Luck_IROS_2014.pdf
Reference TypeJournal Article
Author(s)van Hoof, H.; Kroemer, O.; Peters, J.
Year2014
TitleProbabilistic Segmentation and Targeted Exploration of Objects in Cluttered Environments
Journal/Conference/Book TitleIEEE Transactions on Robotics (TRo)
Volume30
Number5
Pages1198-1209
ISBN/ISSN1552-3098
URL(s) http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6870500&tag=1
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/hoof2014probabilistic.pdf
Reference TypeConference Proceedings
Author(s)Gomez, V.; Kappen, B.; Peters, J.; Neumann, G.
Year2014
TitlePolicy Search for Path Integral Control
Journal/Conference/Book TitleProceedings of the European Conference on Machine Learning (ECML)
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Gomez_ECML_2014.pdf
Reference TypeConference Proceedings
Author(s)Maeda, G.J.; Ewerton, M.; Lioutikov, R.; Ben Amor, H.; Peters, J.; Neumann, G.
Year2014
TitleLearning Interaction for Collaborative Tasks with Probabilistic Movement Primitives
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Keywords3rd-Hand, CompLACS
Pages527--534
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Team/PubGJMaeda/maeda2014InteractionProMP_HUMANOIDS.pdf
Reference TypeConference Proceedings
Author(s)Brandl, S.; Kroemer, O.; Peters, J.
Year2014
TitleGeneralizing Pouring Actions Between Objects using Warped Parameters
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Keywords3rd-Hand
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/BrandlHumanoids2014Final.pdf
Reference TypeConference Proceedings
Author(s)Colome, A.; Neumann, G.; Peters, J.; Torras, C.
Year2014
TitleDimensionality Reduction for Probabilistic Movement Primitives
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Colome_Humanoids_2014.pdf
Reference TypeConference Proceedings
Author(s)Rueckert, E.; Mindt, M.; Peters, J.; Neumann, G.
Year2014
TitleRobust Policy Updates for Stochastic Optimal Control
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsCoDyCo
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/AICOHumanoidsFinal.pdf
Reference TypeConference Proceedings
Author(s)Ivaldi, S.; Peters, J.; Padois, V.; Nori, F.
Year2014
TitleTools for simulating humanoid robot dynamics: a survey based on user feedback
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsCoDyCo
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/ivaldi2014simulators.pdf
Reference TypeJournal Article
Author(s)Droniou, A.; Ivaldi, S.; Sigaud, O.
Year2014
TitleDeep unsupervised network for multimodal perception, representation and classification
Journal/Conference/Book TitleRobotics and Autonomous Systems
Reference TypeConference Proceedings
Author(s)Hermans, T.; Veiga, F.; Hölscher, J.; van Hoof, H.; Peters, J.
Year2014
TitleDemonstration: Learning for Tactile Manipulation
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems (NIPS), Demonstration Track.
KeywordsTACMAN, tactile manipulation
AbstractTactile sensing affords robots the opportunity to dexterously manipulate objects in-hand without the need of strong object models and planning. Our demonstration focuses on learning for tactile, in-hand manipulation by robots. We address learning problems related to the control of objects in-hand, as well as perception problems encountered by a robot exploring its environment with a tactile sensor. We demonstrate applications for three specific learning problems: learning to detect slip for grasp stability, learning to reposition objects in-hand, and learning to identify objects and object properties through tactile exploration. We address the problem of learning to detect slip of grasped objects. We show that the robot can learn a detector for slip events which generalizes to novel objects. We leverage this slip detector to produce a feedback controller that can stabilize objects during grasping and manipulation. Our work compares a number of supervised learning approaches and feature representations in order to achieve reliable slip detection. Tactile sensors provide observations of high enough dimension to cause problems for traditional reinforcement learning methods. As such, we introduce a novel reinforcement learning (RL) algorithm which learns transition functions embedded in a reproducing kernel Hilbert space (RKHS). The resulting policy search algorithm provides robust policy updates which can efficiently deal with high-dimensional sensory input. We demonstrate the method on the problem of repositioning a grasped object in the hand. Finally, we present a method for learning to classify objects through tactile exploration. The robot collects data from a number of objects through various exploratory motions. The robot learns a classifier for each object to be used during exploration of its environment to detect objects in cluttered environments. Here again we compare a number of learning methods and features present in the literature and synthesize a method to best work in human environments the robot is likely to encounter. Users will be able to interact with a robot hand by giving it objects to grasp and attempting to remove these objects from the robot. The hand will also perform some basic in-hand manipulation tasks such as rolling the object between the fingers and rotating the object about a fixed grasp point. Users will also be able to interact with a touch sensor capable of classifying objects as well as semantic events such as slipping from a stable contact location.
Place PublishedCambridge, MA
PublisherMIT Press
Link to PDFhttp://www.ausy.tu-darmstadt.de/uploads/Team/TuckerHermans/learning_tactile_manipulation_demo.pdf
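The slip-detection problem in the demonstration abstract above is, at its core, supervised binary classification on tactile features. A self-contained sketch under stated assumptions (the "vibration energy" and "force change" features and their distributions are invented for illustration, not the demo's actual sensor pipeline):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for tactile features: assume high-frequency vibration
# energy and normal-force change both tend to rise during slip events.
n = 400
slip = rng.integers(0, 2, n)  # 1 = slip event, 0 = stable contact
vib = slip * rng.normal(2.0, 0.5, n) + (1 - slip) * rng.normal(0.5, 0.5, n)
dforce = slip * rng.normal(1.5, 0.7, n) + (1 - slip) * rng.normal(0.0, 0.7, n)
X = np.column_stack([vib, dforce, np.ones(n)])  # last column is a bias term

# Logistic-regression slip detector trained by plain gradient descent
w = np.zeros(3)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ w))          # predicted slip probability
    w -= 0.1 * X.T @ (p - slip) / n           # gradient of the logistic loss

pred = (1.0 / (1.0 + np.exp(-X @ w)) > 0.5).astype(int)
acc = (pred == slip).mean()
print(f"training accuracy: {acc:.2f}")
```

A detector of this shape can sit inside a feedback loop, triggering a grasp-force adjustment whenever the predicted slip probability crosses a threshold, which is the controller role the abstract describes.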
Reference TypeJournal Article
Author(s)Muelling, K.; Kober, J.; Kroemer, O.; Peters, J.
Year2013
TitleLearning to Select and Generalize Striking Movements in Robot Table Tennis
Journal/Conference/Book TitleInternational Journal of Robotics Research (IJRR)
KeywordsGeRT
Volume32
Number3
Pages263-279
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_IJRR_2013.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_IJRR_2013.pdf
Reference TypeConference Proceedings
Author(s)Daniel, C.; Neumann, G.; Kroemer, O.; Peters, J.
Year2013
TitleLearning Sequential Motor Tasks
Journal/Conference/Book TitleProceedings of 2013 IEEE International Conference on Robotics and Automation (ICRA)
KeywordsGeRT, CompLACS
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_ICRA_2013.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_ICRA_2013.pdf
Reference TypeConference Proceedings
Author(s)Englert, P.; Paraschos, A.; Peters, J.; Deisenroth, M. P.
Year2013
TitleModel-based Imitation Learning by Probabilistic Trajectory Matching
Journal/Conference/Book TitleProceedings of 2013 IEEE International Conference on Robotics and Automation (ICRA)
AbstractOne of the most elegant ways of teaching new skills to robots is to provide demonstrations of a task and let the robot imitate this behavior. Such imitation learning is a non-trivial task: Different anatomies of robot and teacher, and reduced robustness towards changes in the control task are two major difficulties in imitation learning. We present an imitation-learning approach to efficiently learn a task from expert demonstrations. Instead of finding policies indirectly, either via state-action mappings (behavioral cloning), or cost function learning (inverse reinforcement learning), our goal is to find policies directly such that predicted trajectories match observed ones. To achieve this aim, we model the trajectory of the teacher and the predicted robot trajectory by means of probability distributions. We match these distributions by minimizing their Kullback-Leibler divergence. In this paper, we propose to learn probabilistic forward models to compute a probability distribution over trajectories. We compare our approach to model-based reinforcement learning methods with hand-crafted cost functions. Finally, we evaluate our method with experiments on a real compliant robot.
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Englert_ICRA_2013.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Englert_ICRA_2013.pdf
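The trajectory-matching abstract above minimizes the Kullback-Leibler divergence between the teacher's and the predicted robot's trajectory distributions. For Gaussian distributions this divergence has a closed form; the sketch below evaluates it on toy 3-step trajectory means and covariances (the numbers are illustrative, not the paper's learned models).

```python
import numpy as np

def kl_gauss(mu_p, cov_p, mu_q, cov_q):
    """Closed-form KL(p || q) between two multivariate Gaussians."""
    k = len(mu_p)
    iq = np.linalg.inv(cov_q)
    d = mu_q - mu_p
    return 0.5 * (np.trace(iq @ cov_p) + d @ iq @ d - k
                  + np.log(np.linalg.det(cov_q) / np.linalg.det(cov_p)))

# toy "trajectory distributions" over 3 time steps
mu_teacher = np.array([0.0, 0.5, 1.0])
cov_teacher = 0.01 * np.eye(3)
mu_robot = np.array([0.0, 0.4, 0.9])   # predicted robot trajectory
cov_robot = 0.02 * np.eye(3)

kl = kl_gauss(mu_robot, cov_robot, mu_teacher, cov_teacher)
```

Policy search then amounts to adjusting the policy parameters so that this scalar shrinks; the KL is zero exactly when the predicted distribution matches the demonstrated one.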
Reference TypeConference Proceedings
Author(s)Gopalan, N.; Deisenroth, M. P.; Peters, J.
Year2013
TitleFeedback Error Learning for Rhythmic Motor Primitives
Journal/Conference/Book TitleProceedings of 2013 IEEE International Conference on Robotics and Automation (ICRA)
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gopalan_ICRA_2013.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Gopalan_ICRA_2013.pdf
Reference TypeJournal Article
Author(s)Wang, Z.; Muelling, K.; Deisenroth, M. P.; Ben Amor, H.; Vogt, D.; Schoelkopf, B.; Peters, J.
Year2013
TitleProbabilistic Movement Modeling for Intention Inference in Human-Robot Interaction
Journal/Conference/Book TitleInternational Journal of Robotics Research (IJRR)
Volume32
Number7
Pages841-858
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_IJRR_2013.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_IJRR_2013.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.; Kober, J.; Muelling, K.; Kroemer, O.; Neumann, G.
Year2013
TitleTowards Robot Skill Learning: From Simple Skills to Table Tennis
Journal/Conference/Book TitleProceedings of the European Conference on Machine Learning (ECML), Nectar Track
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/peters_ECML_2013.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/peters_ECML_2013.pdf
Reference TypeJournal Article
Author(s)Kober, J.; Bagnell, D.; Peters, J.
Year2013
TitleReinforcement Learning in Robotics: A Survey
Journal/Conference/Book TitleInternational Journal of Robotics Research (IJRR)
Volume32
Number11
Pages1238-1274
URL(s) http://www.ias.tu-darmstadt.de/uploads/Publications/Kober_IJRR_2013.pdf
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Kober_IJRR_2013.pdf
Reference TypeConference Proceedings
Author(s)Kupcsik, A.G.; Deisenroth, M.P.; Peters, J.; Neumann, G.
Year2013
TitleData-Efficient Generalization of Robot Skills with Contextual Policy Search
Journal/Conference/Book TitleProceedings of the National Conference on Artificial Intelligence (AAAI)
KeywordsGeRT, CompLACS
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kupcsik_AAAI_2013.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kupcsik_AAAI_2013.pdf
Reference TypeConference Proceedings
Author(s)Bocsi, B.; Csato, L.; Peters, J.
Year2013
TitleAlignment-based Transfer Learning for Robot Models
Journal/Conference/Book TitleProceedings of the 2013 International Joint Conference on Neural Networks (IJCNN)
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_IJCNN_2013.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_IJCNN_2013.pdf
Reference TypeConference Proceedings
Author(s)Daniel, C.; Neumann, G.; Peters, J.
Year2013
TitleAutonomous Reinforcement Learning with Hierarchical REPS
Journal/Conference/Book TitleProceedings of the 2013 International Joint Conference on Neural Networks (IJCNN)
KeywordsGeRT, CompLACS
Reference TypeJournal Article
Author(s)Englert, P.; Paraschos, A.; Peters, J.; Deisenroth, M.P.
Year2013
TitleProbabilistic Model-based Imitation Learning
Journal/Conference/Book TitleAdaptive Behavior Journal
Volume21
Pages388-403
URL(s) http://www.ias.tu-darmstadt.de/uploads/Publications/Englert_ABJ_2013.pdf
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Englert_ABJ_2013.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.; Kober, J.; Muelling, K.; Nguyen-Tuong, D.; Kroemer, O.
Year2013
TitleLearning Skills with Motor Primitives
Journal/Conference/Book TitleProceedings of the 16th Yale Learning Workshop
Reference TypeConference Proceedings
Author(s)Neumann, G.; Kupcsik, A.G.; Deisenroth, M.P.; Peters, J.
Year2013
TitleInformation-Theoretic Motor Skill Learning
Journal/Conference/Book TitleProceedings of the AAAI 2013 Workshop on Intelligent Robotic Systems
KeywordsCompLACS
Reference TypeConference Proceedings
Author(s)Ben Amor, H.; Vogt, D.; Ewerton, M.; Berger, E.; Jung, B.; Peters, J.
Year2013
TitleLearning Responsive Robot Behavior by Imitation
Journal/Conference/Book TitleProceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
KeywordsCoDyCo
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/iros2013Heni.pdf
Reference TypeJournal Article
Author(s)Deisenroth, M. P.; Neumann, G.; Peters, J.
Year2013
TitleA Survey on Policy Search for Robotics
Journal/Conference/Book TitleFoundations and Trends in Robotics
KeywordsCompLACS
Volume21
Pages388-403
URL(s) http://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/PolicySearchReview.pdf
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Site/EditPublication/PolicySearchReview.pdf
Reference TypeConference Proceedings
Author(s)van Hoof, H.; Kroemer, O.; Peters, J.
Year2013
TitleProbabilistic Interactive Segmentation for Anthropomorphic Robots in Cluttered Environments
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsGeRT, CompLACS
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/hoof-HUMANOIDS.pdf
Reference TypeConference Proceedings
Author(s)Paraschos, A.; Neumann, G.; Peters, J.
Year2013
TitleA Probabilistic Approach to Robot Trajectory Generation
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsCoDyCo, CompLACS
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Paraschos_Humanoids_2013.pdf
Reference TypeConference Proceedings
Author(s)Berger, E.; Vogt, D.; Haji-Ghassemi, N.; Jung, B.; Ben Amor, H.
Year2013
TitleInferring Guidance Information in Cooperative Human-Robot Tasks
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsCoDyCo
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/humanoids2013Heni.pdf
Reference TypeConference Proceedings
Author(s)Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G.
Year2013
TitleProbabilistic Movement Primitives
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems (NIPS)
KeywordsCoDyCo, CompLACS
Place PublishedCambridge, MA
PublisherMIT Press
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Paraschos_NIPS_2013a.pdf
Reference TypeBook Section
Author(s)Sigaud, O.; Peters, J.
Year2012
TitleRobot Learning
Journal/Conference/Book TitleEncyclopedia of the Sciences of Learning, Springer Verlag
Editor(s)Seel, Norbert M.
ISBN/ISSN978-1-4419-1428-6
URL(s) http://dx.doi.org/10.1007/978-3-642-05181-4_1
Reference TypeJournal Article
Author(s)Lampert, C.H.; Peters, J.
Year2012
TitleReal-Time Detection of Colored Objects In Multiple Camera Streams With Off-the-Shelf Hardware Components
Journal/Conference/Book TitleJournal of Real-Time Image Processing
Volume7
Number1
Pages31-41
URL(s) http://robot-learning.de/uploads/Publications/rtblob-jrtip2010_6651[0].pdf
Link to PDFhttp://robot-learning.de/uploads/Publications/rtblob-jrtip2010_6651[0].pdf
Reference TypeConference Proceedings
Author(s)Daniel, C.; Neumann, G.; Peters, J.
Year2012
TitleHierarchical Relative Entropy Policy Search
Journal/Conference/Book TitleProceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS 2012)
KeywordsGeRT, CompLACS
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Member/ChristianDaniel/DanielAISTATS2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Member/ChristianDaniel/DanielAISTATS2012.pdf
Reference TypeJournal Article
Author(s)Deisenroth, M.P.; Turner, R.; Huber, M.; Hanebeck, U.D.; Rasmussen, C.E.
Year2012
TitleRobust Filtering and Smoothing with Gaussian Processes
Journal/Conference/Book TitleIEEE Transactions on Automatic Control
KeywordsGaussian process, filtering, smoothing
AbstractWe propose a principled algorithm for robust Bayesian filtering and smoothing in nonlinear stochastic dynamic systems when both the transition function and the measurement function are described by non-parametric Gaussian process (GP) models. GPs are gaining increasing importance in signal processing, machine learning, robotics, and control for representing unknown system functions by posterior probability distributions. This modern way of "system identification" is more robust than finding point estimates of a parametric function representation. Our principled filtering/smoothing approach for GP dynamic systems is based on analytic moment matching in the context of the forward-backward algorithm. Our numerical evaluations demonstrate the robustness of the proposed approach in situations where other state-of-the-art Gaussian filters and smoothers can fail.
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/deisenroth_IEEE-TAC2012.pdf
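The GP filtering/smoothing abstract above builds on GP models of the transition and measurement functions. A minimal sketch of the basic ingredient, GP regression with an RBF kernel on a toy 1-D function, is given below; the kernel settings, noise level, and test inputs are all illustrative choices, not those of the paper.

```python
import numpy as np

def rbf(a, b, ell=1.0, sf=1.0):
    # squared-exponential kernel between two 1-D input sets
    return sf**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

# noisy observations of an unknown function (stand-in for a transition model)
X = np.linspace(-3, 3, 20)
y = np.sin(X) + 0.05 * np.random.default_rng(1).standard_normal(20)

sn = 0.05                                   # observation noise std
K = rbf(X, X) + sn**2 * np.eye(len(X))
alpha = np.linalg.solve(K, y)               # precomputed weights

def gp_predict(xs):
    """Posterior mean and (latent) variance at test inputs xs."""
    Ks = rbf(xs, X)
    mean = Ks @ alpha
    var = rbf(xs, xs).diagonal() - np.einsum(
        'ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
    return mean, var

m, v = gp_predict(np.array([0.0, 1.5]))
```

The paper's contribution is propagating such GP posteriors through time analytically (moment matching) inside the forward-backward algorithm, rather than the static regression shown here.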
Reference TypeConference Proceedings
Author(s)Kroemer, O.; Ugur, E.; Oztop, E.; Peters, J.
Year2012
TitleA Kernel-based Approach to Direct Action Perception
Journal/Conference/Book TitleProceedings of the International Conference on Robotics and Automation (ICRA)
KeywordsGeRT
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2012.pdf
Reference TypeConference Proceedings
Author(s)Bocsi, B.; Hennig, P.; Csato, L.; Peters, J.
Year2012
TitleLearning Tracking Control with Forward Models
Journal/Conference/Book TitleProceedings of the International Conference on Robotics and Automation (ICRA)
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_ICRA_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Bocsi_ICRA_2012.pdf
Reference TypeJournal Article
Author(s)Kober, J.; Wilhelm, A.; Oztop, E.; Peters, J.
Year2012
TitleReinforcement Learning to Adjust Parametrized Motor Primitives to New Situations
Journal/Conference/Book TitleAutonomous Robots (AURO)
KeywordsSkill learning; Motor primitives; Reinforcement learning; Meta-parameters; Policy learning
PublisherSpringer US
Volume33
Number4
Pages361-379
ISBN/ISSN0929-5593
URL(s) http://dx.doi.org/10.1007/s10514-012-9290-3
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/kober_auro2012.pdf
LanguageEnglish
Reference TypeJournal Article
Author(s)Vitzthum, A.; Ben Amor, H.; Heumer, G.; Jung, B.
Year2012
TitleXSAMPL3D - An Action Description Language for the Animation of Virtual Characters
Journal/Conference/Book TitleJournal of Virtual Reality and Broadcasting
Volume9
Number1
URL(s) http://www.jvrb.org/9.2012
Link to PDFhttp://www.jvrb.org/9.2012/3262/920121.pdf
Reference TypeConference Proceedings
Author(s)Wang, Z.; Deisenroth, M.; Ben Amor, H.; Vogt, D.; Schoelkopf, B.; Peters, J.
Year2012
TitleProbabilistic Modeling of Human Movements for Intention Inference
Journal/Conference/Book TitleProceedings of Robotics: Science and Systems (R:SS)
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_RSS_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Wang_RSS_2012.pdf
Reference TypeJournal Article
Author(s)Nguyen-Tuong, D.; Peters, J.
Year2012
TitleOnline Kernel-based Learning for Task-Space Tracking Robot Control
Journal/Conference/Book TitleIEEE Transactions on Neural Networks and Learning Systems
Volume23
Number9
Pages1417-1425
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/NguyenTuong_TNN_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/NguyenTuong_TNN_2012.pdf
Reference TypeConference Proceedings
Author(s)Deisenroth, M.P.; Mohamed, S.
Year2012
TitleExpectation Propagation in Gaussian Process Dynamical Systems
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems 26 (NIPS), Cambridge, MA: MIT Press.
AbstractRich and complex time-series data, such as those generated from engineering systems, financial markets, videos, or neural recordings are now a common feature of modern data analysis. Explaining the phenomena underlying these diverse data sets requires flexible and accurate models. In this paper, we promote Gaussian process dynamical systems as a rich model class that is appropriate for such an analysis. We present a new approximate message-passing algorithm for Bayesian state estimation and inference in Gaussian process dynamical systems, a non-parametric probabilistic generalization of commonly used state-space models. We derive our message-passing algorithm using Expectation Propagation and provide a unifying perspective on message passing in general state-space models. We show that existing Gaussian filters and smoothers appear as special cases within our inference framework, and that these existing approaches can be improved upon using iterated message passing. Using both synthetic and real-world data, we demonstrate that iterated message passing can improve inference in a wide range of tasks in Bayesian state estimation, thus leading to improved predictions and more effective decision making.
PublisherThe MIT Press
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_NIPS_2012.pdf
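The abstract above notes that existing Gaussian filters appear as special cases of the message-passing framework. The most familiar such special case, a scalar linear-Gaussian Kalman filter, can be sketched as follows; the coefficients and noise levels are arbitrary illustrative values.

```python
import numpy as np

# linear-Gaussian special case: standard scalar Kalman filter recursion
A, C = 0.95, 1.0        # transition and measurement coefficients
Q, R = 0.1, 0.5         # process and measurement noise variances

def kf_step(m, P, z):
    # predict step
    m_pred, P_pred = A * m, A * P * A + Q
    # update step with measurement z
    S = C * P_pred * C + R            # innovation variance
    K = P_pred * C / S                # Kalman gain
    return m_pred + K * (z - C * m_pred), (1 - K * C) * P_pred

rng = np.random.default_rng(2)
x, m, P = 0.0, 0.0, 1.0
errs = []
for _ in range(200):
    x = A * x + rng.normal(0, np.sqrt(Q))     # latent state
    z = C * x + rng.normal(0, np.sqrt(R))     # noisy observation
    m, P = kf_step(m, P, z)
    errs.append((m - x)**2)
mse = float(np.mean(errs))
```

The filtered estimate should track the state with a mean-squared error well below the raw measurement-noise variance; the paper's iterated message passing improves on such single-sweep Gaussian filters in the nonlinear GP setting.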
Reference TypeConference Proceedings
Author(s)Peters, J.; Kober, J.; Muelling, K.; Nguyen-Tuong, D.; Kroemer, O.
Year2012
TitleRobot Skill Learning
Journal/Conference/Book TitleProceedings of the European Conference on Artificial Intelligence (ECAI)
KeywordsGeRT
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ECAI2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ECAI2012.pdf
Reference TypeConference Proceedings
Author(s)Boularias, A.; Kroemer, O.; Peters, J.
Year2012
TitleStructured Apprenticeship Learning
Journal/Conference/Book TitleProceedings of the European Conference on Machine Learning (ECML)
KeywordsGeRT
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_ECML_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_ECML_2012.pdf
Reference TypeConference Proceedings
Author(s)Meyer, T.; Peters, J.; Broetz, D.; Zander, T.; Schoelkopf, B.; Soekadar, S.; Grosse-Wentrup, M.
Year2012
TitleA Brain-Robot Interface for Studying Motor Learning after Stroke
Journal/Conference/Book TitleProceedings of the International Conference on Robot Systems (IROS)
KeywordsTeam Athena-Minerva
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Meyer_IROS_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Meyer_IROS_2012.pdf
Reference TypeConference Proceedings
Author(s)Deisenroth, M.P.; Calandra, R.; Seyfarth, A.; Peters, J.
Year2012
TitleToward Fast Policy Search for Learning Legged Locomotion
Journal/Conference/Book TitleProceedings of the International Conference on Robot Systems (IROS)
Keywordslegged locomotion, policy search, reinforcement learning, Gaussian process
AbstractLegged locomotion is one of the most versatile forms of mobility. However, despite the importance of legged locomotion and the large number of legged robotics studies, no biped or quadruped matches the agility and versatility of their biological counterparts to date. Approaches to designing controllers for legged locomotion systems are often based on either the assumption of perfectly known dynamics or mechanical designs that substantially reduce the dimensionality of the problem. The few existing approaches for learning controllers for legged systems either require exhaustive real-world data or they improve controllers only conservatively, leading to slow learning. We present a data-efficient approach to learning feedback controllers for legged locomotive systems, based on learned probabilistic forward models for generating walking policies. On a compass walker, we show that our approach allows for learning gait policies from very little data. Moreover, we analyze learned locomotion models of a biomechanically inspired biped. Our approach has the potential to scale to high-dimensional humanoid robots with little loss in efficiency.
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_IROS_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_IROS_2012.pdf
Reference TypeConference Proceedings
Author(s)van Hoof, H.; Kroemer, O.; Ben Amor, H.; Peters, J.
Year2012
TitleMaximally Informative Interaction Learning for Scene Exploration
Journal/Conference/Book TitleProceedings of the International Conference on Robot Systems (IROS)
KeywordsGeRT
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VanHoof_IROS_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/VanHoof_IROS_2012.pdf
Reference TypeConference Proceedings
Author(s)Daniel, C.; Neumann, G.; Peters, J.
Year2012
TitleLearning Concurrent Motor Skills in Versatile Solution Spaces
Journal/Conference/Book TitleProceedings of the International Conference on Robot Systems (IROS)
KeywordsGeRT, CompLACS
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_IROS_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Daniel_IROS_2012.pdf
Reference TypeConference Proceedings
Author(s)Ben Amor, H.; Kroemer, O.; Hillenbrand, U.; Neumann, G.; Peters, J.
Year2012
TitleGeneralization of Human Grasping for Multi-Fingered Robot Hands
Journal/Conference/Book TitleProceedings of the International Conference on Robot Systems (IROS)
KeywordsGeRT
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/BenAmor_IROS_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/BenAmor_IROS_2012.pdf
Reference TypeConference Proceedings
Author(s)Kober, J.; Muelling, K.; Peters, J.
Year2012
TitleLearning Throwing and Catching Skills
Journal/Conference/Book TitleProceedings of the International Conference on Robot Systems (IROS), Video Track
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kober_IROS_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Kober_IROS_2012.pdf
Reference TypeConference Proceedings
Author(s)Deisenroth, M.P.; Peters, J.
Year2012
TitleSolving Nonlinear Continuous State-Action-Observation POMDPs for Mechanical Systems with Gaussian Noise
Journal/Conference/Book TitleProceedings of the European Workshop on Reinforcement Learning (EWRL)
Link to PDFhttp://www.ias.tu-darmstadt.de/uploads/Publications/Deisenroth_EWRL_2012.pdf
Reference TypeConference Proceedings
Author(s)Muelling, K.; Kober, J.; Kroemer, O.; Peters, J.
Year2012
TitleLearning to Select and Generalize Striking Movements in Robot Table Tennis
Journal/Conference/Book TitleProceedings of the AAAI 2012 Fall Symposium on Robots that Learn Interactively from Human Teachers
URL(s) http://www.cs.utexas.edu/~bradknox/AAAIFSS-RLIHT12-papers/aaaifss12rliht_submission_2.pdf
Link to PDFhttp://www.cs.utexas.edu/~bradknox/AAAIFSS-RLIHT12-papers/aaaifss12rliht_submission_2.pdf
Reference TypeConference Proceedings
Author(s)Calandra, R.; Raiko, T.; Deisenroth, M.P.; Montesino Pouzols, F.
Year2012
TitleLearning Deep Belief Networks from Non-Stationary Streams
Journal/Conference/Book TitleInternational Conference on Artificial Neural Networks (ICANN)
Keywordsdeep learning, non-stationary data
AbstractDeep learning has proven to be beneficial for complex tasks such as classifying images. However, this approach has been mostly applied to static datasets. The analysis of non-stationary (e.g., concept drift) streams of data involves specific issues connected with the temporal and changing nature of the data. In this paper, we propose a proof-of-concept method, called Adaptive Deep Belief Networks, of how deep learning can be generalized to learn online from changing streams of data. We do so by exploiting the generative properties of the model to incrementally re-train the Deep Belief Network whenever new data are collected. This approach eliminates the need to store past observations and, therefore, requires only constant memory consumption. Hence, our approach can be valuable for life-long learning from non-stationary data streams.
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Site/EditPublication/calandra_icann2012.pdf
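The Adaptive DBN abstract above re-trains on a mix of new data and samples drawn from the model itself, so past observations never need to be stored. That constant-memory idea can be sketched with a deliberately trivial generative model (a single Gaussian standing in for the DBN); the drift schedule and batch sizes are invented for illustration.

```python
import numpy as np

# Constant-memory stream learning via generative replay: at each step,
# sample "rehearsal" data from the current model, mix it with the new
# batch, and refit. No past observations are ever stored.
rng = np.random.default_rng(3)
mu, sigma = 0.0, 1.0                      # current model parameters

for t in range(50):
    drift_mean = t / 50.0                 # slowly drifting data stream
    new_batch = rng.normal(drift_mean, 1.0, 100)
    replay = rng.normal(mu, sigma, 100)   # samples from the model itself
    data = np.r_[new_batch, replay]
    mu, sigma = data.mean(), data.std()   # "re-train" the model
```

After the loop, the model mean has tracked the drifting stream with a small lag, using memory that is constant in the length of the stream; the paper does the analogous thing with samples generated by the DBN.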
Reference TypeConference Proceedings
Author(s)Meyer, T.; Peters, J.; Broetz, D.; Zander, T.; Schoelkopf, B.; Soekadar, S.; Grosse-Wentrup, M.
Year2012
TitleInvestigating the Neural Basis for Stroke Rehabilitation by Brain-Computer Interfaces
Journal/Conference/Book TitleInternational Conference on Neurorehabilitation
KeywordsTeam Athena-Minerva
Reference TypeConference Proceedings
Author(s)Kroemer, O.; Ben Amor, H.; Ewerton, M.; Peters, J.
Year2012
TitlePoint Cloud Completion Using Extrusions
Journal/Conference/Book TitleProceedings of the International Conference on Humanoid Robots (HUMANOIDS)
KeywordsGeRT
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_Humanoids_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_Humanoids_2012.pdf
Reference TypeConference Proceedings
Author(s)Boularias, A.; Kroemer, O.; Peters, J.
Year2012
TitleAlgorithms for Learning Markov Field Policies
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems 26 (NIPS 2012)
KeywordsGeRT
Place PublishedCambridge, MA
PublisherMIT Press
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_NIPS_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Boularias_NIPS_2012.pdf
Reference TypeBook
Author(s)Deisenroth, M. P.; Szepesvari, C.; Peters, J.
Year2012
Journal/Conference/Book TitleProceedings of the 10th European Workshop on Reinforcement Learning
Editor(s)Deisenroth, M. P.; Szepesvari, C.; Peters, J.
Place PublishedJMLR W&C
Volume24
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_EWRLproceedings_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Deisenroth_EWRLproceedings_2012.pdf
Reference TypeJournal Article
Author(s)Piater, J.; Jodogne, S.; Detry, R.; Kraft, D.; Krueger, N.; Kroemer, O.; Peters, J.
Year2011
TitleLearning Visual Representations for Perception-Action Systems
Journal/Conference/Book TitleInternational Journal of Robotics Research (IJRR)
Volume30
Number3
Pages294-307
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Piater_IJRR_2010.pdf
Reference TypeJournal Article
Author(s)Detry, R.; Kraft, D.; Kroemer, O.; Peters, J.; Krueger, N.; Piater, J.
Year2011
TitleLearning Grasp Affordance Densities
Journal/Conference/Book TitlePaladyn Journal of Behavioral Robotics
KeywordsGeRT
Volume2
Number1
Pages1-17
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Detry_PJBR_2011.pdf
Reference TypeJournal Article
Author(s)Kober, J.; Peters, J.
Year2011
TitlePolicy Search for Motor Primitives in Robotics
Journal/Conference/Book TitleMachine Learning (MLJ)
Volume84
Number1-2
Pages171-203
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/kober_MACH_2011.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/kober_MACH_2011.pdf
Reference TypeJournal Article
Author(s)Nguyen-Tuong, D.; Peters, J.
Year2011
TitleIncremental Sparsification for Real-time Online Model Learning
Journal/Conference/Book TitleNeurocomputing
Volume74
Number11
Pages1859-1867
Link to PDFhttp://robot-learning.de/uploads/Publications/Nguyen_NEURO_2011.pdf
Reference TypeJournal Article
Author(s)Gomez-Rodriguez, M.; Peters, J.; Hill, J.; Schoelkopf, B.; Gharabaghi, A.; Grosse-Wentrup, M.
Year2011
TitleClosing the Sensorimotor Loop: Haptic Feedback Helps Decoding of Motor Imagery
Journal/Conference/Book TitleJournal of Neuroengineering
KeywordsTeam Athena-Minerva
Volume8
Number3
Link to PDFhttp://robot-learning.de/uploads/Publications/Gomez-RodriguezJNE2011.pdf
Reference TypeConference Proceedings
Author(s)Lampariello, R.; Nguyen-Tuong, D.; Castellini, C.; Hirzinger, G.; Peters, J.
Year2011
TitleEnergy-optimal robot catching in real-time
Journal/Conference/Book TitleProceedings of the IEEE International Conference on Robotics and Automation (ICRA)
Link to PDFhttp://robot-learning.de/uploads/Publications/Lampariello_ICRA_2011.pdf
Reference TypeConference Proceedings
Author(s)Kroemer, O.; Peters, J.
Year2011
TitleA Flexible Hybrid Framework for Modeling Complex Manipulation Tasks
Journal/Conference/Book TitleProceedings of the IEEE International Conference on Robotics and Automation (ICRA)
KeywordsGeRT
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ICRA_2011.pdf
Reference TypeConference Proceedings
Author(s)Kroemer, O.; Peters, J.
Year2011
TitleActive Exploration for Robot Parameter Selection in Episodic Reinforcement Learning
Journal/Conference/Book TitleProceedings of the 2011 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL)
KeywordsGeRT
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_ADPRL_2011.pdf
Reference TypeJournal Article
Author(s)Kroemer, O.; Lampert, C.H.; Peters, J.
Year2011
TitleLearning Dynamic Tactile Sensing with Robust Vision-based Training
Journal/Conference/Book TitleIEEE Transactions on Robotics (T-Ro)
Volume27
Number3
Pages545-557
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Kroemer_TRo_2011.pdf
Reference TypeConference Proceedings
Author(s)Boularias, A.; Kroemer, O.; Peters, J.
Year2011
TitleLearning Robot Grasping from 3D Images with Markov Random Fields
Journal/Conference/Book TitleIEEE/RSJ International Conference on Intelligent Robot Systems (IROS)
KeywordsGeRT
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/Publications/Boularias_IROS_2011.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Boularias_IROS_2011.pdf
Reference TypeConference Proceedings
Author(s)Kroemer, O.; Peters, J.
Year2011
TitleA Non-Parametric Approach to Dynamic Programming
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems 25 (NIPS 2011)
KeywordsGeRT
Place PublishedCambridge, MA
PublisherMIT Press
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/Kroemer2011NIPS.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Kroemer2011NIPS.pdf
Reference TypeConference Proceedings
Author(s)van Hoof, H.; van der Zant, T.; Wiering, M.A.
Year2011
TitleAdaptive Visual Face Tracking for an Autonomous Robot
Journal/Conference/Book TitleProceedings of the Belgian-Dutch Artificial Intelligence Conference (BNAIC 11)
URL(s) http://robot-learning.de/uploads/Publications/VanHoof_BNAIC_2011.pdf
Link to PDFhttp://robot-learning.de/uploads/Publications/VanHoof_BNAIC_2011.pdf
Reference TypeJournal Article
Author(s)Muelling, K.; Kober, J.; Peters, J.
Year2011
TitleA Biomimetic Approach to Robot Table Tennis
Journal/Conference/Book TitleAdaptive Behavior Journal
Volume19
Number5
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf
Reference TypeConference Proceedings
Author(s)Bocsi, B.; Nguyen-Tuong, D.; Csato, L.; Schoelkopf, B.; Peters, J.
Year2011
TitleLearning Inverse Kinematics with Structured Prediction
Journal/Conference/Book TitleIEEE/RSJ International Conference on Intelligent Robot Systems (IROS)
URL(s) http://robot-learning.de/uploads/Publications/Bocsi_IROS_2011.pdf
Link to PDFhttp://robot-learning.de/uploads/Publications/Bocsi_IROS_2011.pdf
Reference TypeConference Proceedings
Author(s)Wang, Z.; Lampert, C.; Muelling, K.; Schoelkopf, B.; Peters, J.
Year2011
TitleLearning Anticipation Policies for Robot Table Tennis
Journal/Conference/Book TitleIEEE/RSJ International Conference on Intelligent Robot Systems (IROS)
URL(s) http://robot-learning.de/uploads/Publications/Wang_IROS_2011.pdf
Link to PDFhttp://robot-learning.de/uploads/Publications/Wang_IROS_2011.pdf
Reference TypeConference Proceedings
Author(s)Nguyen-Tuong, D.; Peters, J.
Year2011
TitleLearning Task-Space Tracking Control with Kernels
Journal/Conference/Book TitleIEEE/RSJ International Conference on Intelligent Robot Systems (IROS)
URL(s) http://robot-learning.de/uploads/Publications/Nguyen_IROS_2011.pdf
Link to PDFhttp://robot-learning.de/uploads/Publications/Nguyen_IROS_2011.pdf
Reference TypeConference Proceedings
Author(s)Kober, J.; Peters, J.
Year2011
TitleLearning Elementary Movements Jointly with a Higher Level Task
Journal/Conference/Book TitleIEEE/RSJ International Conference on Intelligent Robot Systems (IROS)
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/Kober_IROS_2011.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Kober_IROS_2011.pdf
Reference TypeConference Proceedings
Author(s)Gomez-Rodriguez, M.; Grosse-Wentrup, M.; Hill, J.; Schoelkopf, B.; Gharabaghi, A.; Peters, J.
Year2011
TitleTowards Brain-Robot Interfaces for Stroke Rehabilitation
Journal/Conference/Book TitleProceedings of the International Conference on Rehabilitation Robotics (ICORR)
KeywordsTeam Athena-Minerva
URL(s) http://robot-learning.de/uploads/Publications/Gomez_ICORR_2011.pdf
Link to PDFhttp://robot-learning.de/uploads/Publications/Gomez_ICORR_2011.pdf
Reference TypeConference Proceedings
Author(s)Wang, Z.; Boularias, A.; Muelling, K.; Peters, J.
Year2011
TitleBalancing Safety and Exploitability in Opponent Modeling
Journal/Conference/Book TitleProceedings of the Twenty-Fifth National Conference on Artificial Intelligence (AAAI)
KeywordsGeRT
Link to PDFhttp://robot-learning.de/uploads/Publications/Wang_AAAI_2011.pdf
Reference TypeJournal Article
Author(s)Hachiya, H.; Peters, J.; Sugiyama, M.
Year2011
TitleReward Weighted Regression with Sample Reuse for Direct Policy Search in Reinforcement Learning
Journal/Conference/Book TitleNeural Computation
Volume23
Number11
URL(s) http://robot-learning.de/uploads/Publications/Hachiya_NC2011.pdf
Link to PDFhttp://robot-learning.de/uploads/Publications/Hachiya_NC2011.pdf
Reference TypeJournal Article
Author(s)Nguyen-Tuong, D.; Peters, J.
Year2011
TitleModel Learning in Robotics: a Survey
Journal/Conference/Book TitleCognitive Processing
Volume12
Number4
Link to PDFhttp://robot-learning.de/uploads/Publications/Nguyen_CP_2011.pdf
Reference TypeConference Proceedings
Author(s)Kober, J.; Oztop, E.; Peters, J.
Year2011
TitleReinforcement Learning to adjust Robot Movements to New Situations
Journal/Conference/Book TitleProceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Best Paper Track
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Kober_IJCAI_2011.pdf
Reference TypeConference Proceedings
Author(s)Boularias, A.; Kober, J.; Peters, J.
Year2011
TitleRelative Entropy Inverse Reinforcement Learning
Journal/Conference/Book TitleProceedings of Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011)
KeywordsGeRT
Link to PDFhttp://jmlr.csail.mit.edu/proceedings/papers/v15/boularias11a/boularias11a.pdf
Reference TypeJournal Article
Author(s)Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J.
Year2010
TitleRecurrent Policy Gradients
Journal/Conference/Book TitleLogic Journal of the IGPL
Volume18
Number of Volumes5
Pages620-634
Link to PDFhttp://robot-learning.de/uploads/Publications/jzp049v1_5879[0].pdf
Reference TypeConference Proceedings
Author(s)Kober, J.; Oztop, E.; Peters, J.
Year2010
TitleReinforcement Learning to adjust Robot Movements to New Situations
Journal/Conference/Book TitleProceedings of Robotics: Science and Systems (R:SS)
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/RSS2010-Kober_6438[0].pdf
Reference TypeConference Proceedings
Author(s)Kroemer, O.; Detry, R.; Piater, J.; Peters, J.
Year2010
TitleAdapting Preshaped Grasping Movements using Vision Descriptors
Journal/Conference/Book TitleFrom Animals to Animats 11, International Conference on the Simulation of Adaptive Behavior (SAB)
KeywordsGeRT
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Kroemer_6437[0].pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Kroemer_6437[0].pdf
Reference TypeConference Proceedings
Author(s)Kroemer, O.; Detry, R.; Piater, J.; Peters, J.
Year2010
TitleGrasping with Vision Descriptors and Motor Primitives
Journal/Conference/Book TitleProceedings of the International Conference on Informatics in Control, Automation and Robotics (ICINCO)
KeywordsGeRT
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/ICINCO2010-Kroemer_6436[0].pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/ICINCO2010-Kroemer_6436[0].pdf
Reference TypeConference Proceedings
Author(s)Muelling, K.; Kober, J.; Peters, J.
Year2010
TitleSimulating Human Table Tennis with a Biomimetic Robot Setup
Journal/Conference/Book TitleFrom Animals to Animats 11, International Conference on the Simulation of Adaptive Behavior (SAB)
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Muelling_6626[0].pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/SAB2010-Muelling_6626[0].pdf
Reference TypeConference Proceedings
Author(s)Nguyen-Tuong, D.; Peters, J.
Year2010
TitleIncremental Sparsification for Real-time Online Model Learning
Journal/Conference/Book TitleProceedings of Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010)
Link to PDFhttp://robot-learning.de/uploads/Publications/AISTATS2010-Nguyen-Tuong_[0].pdf
Reference TypeJournal Article
Author(s)Kober, J.; Peters, J.
Year2010
TitleImitation and Reinforcement Learning - Practical Algorithms for Motor Primitive Learning in Robotics
Journal/Conference/Book TitleIEEE Robotics and Automation Magazine
Volume17
Number2
Pages55-62
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/kober_RAM_2010.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/kober_RAM_2010.pdf
Reference TypeJournal Article
Author(s)Kroemer, O.; Detry, R.; Piater, J.; Peters, J.
Year2010
TitleCombining Active Learning and Reactive Control for Robot Grasping
Journal/Conference/Book TitleRobotics and Autonomous Systems
KeywordsGeRT
Volume58
Number9
Pages1105-1116
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/KroemerJRAS_6636[0].pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/KroemerJRAS_6636[0].pdf
Reference TypeBook Section
Author(s)Nguyen-Tuong, D.; Seeger, M.; Peters, J.
Year2010
TitleReal-Time Local GP Model Learning
Journal/Conference/Book TitleFrom Motor Learning to Interaction Learning in Robots, Springer Verlag
Number264
Reprint Edition978-3-642-05180-7
Link to PDFhttp://robot-learning.de/uploads/Publications/LGP_IROS_Chapter_6233[0].pdf
Reference TypeBook Section
Author(s)Peters, J.; Tedrake, R.; Roy, N.; Morimoto, J.
Year2010
TitleRobot Learning
Journal/Conference/Book TitleEncyclopedia of Machine Learning
Link to PDFhttp://robot-learning.de/uploads/Publications/EncyclopediaMachineLearning-Peters-RobotLearning_[0].pdf
Reference TypeBook
Author(s)Sigaud, O.; Peters, J.
Year2010
TitleFrom Motor Learning to Interaction Learning in Robots
Journal/Conference/Book TitleStudies in Computational Intelligence, Springer Verlag
Number of VolumesSpringer V
Number264
Reprint Edition978-3-642-05180-7
Link to PDFhttp://dx.doi.org/10.1007/978-3-642-05181-4
Reference TypeBook Section
Author(s)Kober, J.; Mohler, B.; Peters, J.
Year2010
TitleImitation and Reinforcement Learning for Motor Primitives with Perceptual Coupling
Journal/Conference/Book TitleFrom Motor Learning to Interaction Learning in Robots, Springer Verlag
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Imitation%20and%20Reinforcement%20Learning%20for%20Motor%20Primitives%20with%20Perceptual%20Coupling_6234[0].pdf
Reference TypeConference Proceedings
Author(s)Peters, J.; Muelling, K.; Altun, Y.
Year2010
TitleRelative Entropy Policy Search
Journal/Conference/Book TitleProceedings of the Twenty-Fourth National Conference on Artificial Intelligence (AAAI), Physically Grounded AI Track
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Team/JanPeters/Peters2010_REPS.pdf
Reference TypeConference Proceedings
Author(s)Kober, J.; Muelling, K.; Kroemer, O.; Lampert, C.H.; Schoelkopf, B.; Peters, J.
Year2010
TitleMovement Templates for Learning of Hitting and Batting
Journal/Conference/Book TitleIEEE International Conference on Robotics and Automation (ICRA)
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/ICRA2010-Kober_6231[1].pdf
Reference TypeConference Proceedings
Author(s)Nguyen-Tuong, D.; Peters, J.
Year2010
TitleUsing Model Knowledge for Learning Inverse Dynamics
Journal/Conference/Book TitleIEEE International Conference on Robotics and Automation
Link to PDFhttp://robot-learning.de/uploads/Publications/ICRA2010-NguyenTuong_6232[0].pdf
Reference TypeJournal Article
Author(s)Sehnke, F.; Osendorfer, C.; Rueckstiess, T.; Graves, A.; Peters, J.; Schmidhuber, J.
Year2010
TitleParameter-exploring Policy Gradients
Journal/Conference/Book TitleNeural Networks
Volume23
Number of Volumes4
Link to PDFhttp://robot-learning.de/uploads/Publications/Neural-Networks-2010-Sehnke_[0].pdf
Reference TypeBook Section
Author(s)Peters, J.; Bagnell, J.A.
Year2010
TitlePolicy gradient methods
Journal/Conference/Book TitleEncyclopedia of Machine Learning (invited article)
Number of VolumesSpringer V
Link to PDFhttp://www-clmc.usc.edu/publications/P/Peters_EOMLA_submitted.pdf
Reference TypeJournal Article
Author(s)Morimura, T.; Uchibe, E.; Yoshimoto, J.; Peters, J.; Doya, K.
Year2010
TitleDerivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning
Journal/Conference/Book TitleNeural Computation
Volume22
Number2
Link to PDFhttp://www-clmc.usc.edu/publications/M/Morimura_NC_2010.pdf
Reference TypeBook Section
Author(s)Detry, R.; Baseski, E.; Popovic, M.; Touati, Y.; Krueger, N.; Kroemer, O.; Peters, J.; Piater, J.
Year2010
TitleLearning Continuous Grasp Affordances by Sensorimotor Exploration
Journal/Conference/Book TitleFrom Motor Learning to Interaction Learning in Robots, Springer Verlag
Number264
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Detry-2010-MotorInteractionLearning_[0].pdf
Reference TypeConference Proceedings
Author(s)Erkan, A.; Kroemer, O.; Detry, R.; Altun, Y.; Piater, J.; Peters, J.
Year2010
TitleLearning Probabilistic Discriminative Models of Grasp Affordances under Limited Supervision
Journal/Conference/Book TitleProceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
KeywordsGeRT
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/erkan_IROS_2010.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/erkan_IROS_2010.pdf
Reference TypeConference Proceedings
Author(s)Muelling, K.; Kober, J.; Peters, J.
Year2010
TitleA Biomimetic Approach to Robot Table Tennis
Journal/Conference/Book TitleProceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/Muelling_ABJ2011.pdf
Reference TypeConference Proceedings
Author(s)Gomez-Rodriguez, M.; Grosse-Wentrup, M.; Peters, J.; Naros, G.; Hill, J.; Gharabaghi, A.; Schoelkopf, B.
Year2010
TitleEpidural ECoG Online Decoding of Arm Movement Intention in Hemiparesis
Journal/Conference/Book Title1st ICPR Workshop on Brain Decoding: Pattern Recognition Challenges in Neuroimaging
KeywordsTeam Athena-Minerva
Link to PDFhttp://robot-learning.de/uploads/Publications/ICPR-WBD-2010-Gomez-Rodriguez_[0].pdf
Reference TypeConference Proceedings
Author(s)Gomez-Rodriguez, M.; Peters, J.; Hill, J.; Schoelkopf, B.; Gharabaghi, A.; Grosse-Wentrup, M.
Year2010
TitleClosing the Sensorimotor Loop: Haptic Feedback Facilitates Decoding of Arm Movement Imagery
Journal/Conference/Book TitleProceedings of the IEEE International Conference on Systems, Man, and Cybernetics (Workshop on Brain-Machine Interfaces)
KeywordsTeam Athena-Minerva
Link to PDFhttp://robot-learning.de/uploads/Publications/eeg-smc2010_6591[0].pdf
Reference TypeConference Proceedings
Author(s)Gomez Rodriguez, M.; Peters, J.; Hill, J.; Gharabaghi, A.; Schoelkopf, B.; Grosse-Wentrup, M.
Year2010
TitleBCI and robotics framework for stroke rehabilitation
Journal/Conference/Book TitleProceedings of the 4th International BCI Meeting, May 31 - June 4, 2010. Asilomar, CA, USA
KeywordsTeam Athena-Minerva
Link to PDFhttp://bcimeeting.org/2010/
Reference TypeConference Proceedings
Author(s)Lampert, C. H.; Kroemer, O.
Year2010
TitleWeakly-Paired Maximum Covariance Analysis for Multimodal Dimensionality Reduction and Transfer Learning
Journal/Conference/Book TitleProceedings of the 11th European Conference on Computer Vision (ECCV 2010)
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/lampert-eccv2010.pdf
Reference TypeConference Proceedings
Author(s)Chiappa, S.; Peters, J.
Year2010
TitleMovement extraction by detecting dynamics switches and repetitions
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems 24 (NIPS 2010), Cambridge, MA: MIT Press
Link to PDFhttp://robot-learning.de/uploads/Publications/Chiappa_NIPS_2011.pdf
Reference TypeConference Proceedings
Author(s)Alvarez, M.; Peters, J.; Schoelkopf, B.; Lawrence, N.
Year2010
TitleSwitched Latent Force Models for Movement Segmentation
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems 24 (NIPS 2010), Cambridge, MA: MIT Press
Link to PDFhttp://robot-learning.de/uploads/Publications/Alvarez_NIPS_2011.pdf
Reference TypeJournal Article
Author(s)Peters, J.; Kober, J.; Schaal, S.
Year2010
TitlePolicy learning algorithms for motor learning (Algorithmen zum automatischen Erlernen von Motorfaehigkeiten)
Journal/Conference/Book TitleAutomatisierungstechnik
Keywordsreinforcement learning, motor skills
AbstractRobot learning methods which allow autonomous robots to adapt to novel situations have been a long standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics. If possible, scaling was usually only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution.
Volume58
Number12
Pages688-694
Short TitlePolicy learning algorithms for motor learning (Algorithmen zum automatischen Erlernen von Motorfähigkeiten)
URL(s) http://www-clmc.usc.edu/publications/P/peters-Auto2010.pdf
Reference TypeConference Proceedings
Author(s)Muelling, K.; Kober, J.; Peters, J.
Year2010
TitleLearning Table Tennis with a Mixture of Motor Primitives
Journal/Conference/Book Title10th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2010)
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_ICHR_2012.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/Muelling_ICHR_2012.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.; Muelling, K.; Kober, J.
Year2010
TitleExperiments with Motor Primitives to learn Table Tennis
Journal/Conference/Book Title12th International Symposium on Experimental Robotics (ISER 2010)
Reference TypeConference Proceedings
Author(s)Hachiya, H.; Peters, J.; Sugiyama, M.
Year2009
TitleEfficient Sample Reuse in EM-based Policy Search
Journal/Conference/Book TitleProceedings of the 16th European Conference on Machine Learning (ECML 2009)
Link to PDFhttp://www-clmc.usc.edu/publications/H/hachiya_ECML2009.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.; Kober, J.; Muelling, K.; Nguyen-Tuong, D.; Kroemer, O.
Year2009
TitleTowards Motor Skill Learning for Robotics
Journal/Conference/Book TitleProceedings of the International Symposium on Robotics Research (ISRR), Invited Paper
AbstractLearning robots that can acquire new motor skills and refine existing ones has been a long standing vision of robotics, artificial intelligence, and the cognitive sciences. Early steps towards this goal in the 1980s made clear that reasoning and human insights will not suffice. Instead, new hope has been offered by the rise of modern machine learning approaches. However, to date, it becomes increasingly clear that off-the-shelf machine learning approaches will not suffice for motor skill learning as these methods often do not scale into the high-dimensional domains of manipulator and humanoid robotics nor do they fulfill the real-time requirement of our domain. As an alternative, we propose to break the generic skill learning problem into parts that we can understand well from a robotics point of view. After designing appropriate learning approaches for these basic components, these will serve as the ingredients of a general approach to motor skill learning. In this paper, we discuss our recent and current progress in this direction. For doing so, we present our work on learning to control, on learning elementary movements as well as our steps towards learning of complex tasks. We show several evaluations both using real robots as well as physically realistic simulations.
Link to PDFhttp://www-clmc.usc.edu/publications/P/Peters_ISRR2009.pdf
Reference TypeConference Proceedings
Author(s)Nguyen-Tuong, D.; Seeger, M.; Peters, J.
Year2009
TitleLocal Gaussian Process Regression for Real Time Online Model Learning and Control
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems 22 (NIPS 2008), Cambridge, MA: MIT Press
Link to PDFhttp://www-clmc.usc.edu/publications/N/nguyen_NIPS2008.pdf
Reference TypeConference Proceedings
Author(s)Neumann, G.; Peters, J.
Year2009
TitleFitted Q-iteration by Advantage Weighted Regression
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems 22 (NIPS 2008), Cambridge, MA: MIT Press
Link to PDFhttp://www-clmc.usc.edu/publications/N/neumann_NIPS2008.pdf
Reference TypeJournal Article
Author(s)Hachiya, H.; Akiyama, T.; Sugiyama, M.; Peters, J.
Year2009
TitleAdaptive Importance Sampling for Value Function Approximation in Off-policy Reinforcement Learning
Journal/Conference/Book TitleNeural Networks
Keywordsoff-policy reinforcement learning; value function approximation; policy iteration; adaptive importance sampling; importance-weighted cross-validation; efficient sample reuse
AbstractOff-policy reinforcement learning is aimed at efficiently using data samples gathered from a different policy than the currently optimized one. A common approach is to use importance sampling techniques for compensating for the bias of value function estimators caused by the difference between the data-sampling policy and the target policy. However, existing off-policy methods often do not take the variance of the value function estimators explicitly into account and, therefore, their performance tends to be unstable. To cope with this problem, we propose using an adaptive importance sampling technique which allows us to actively control the trade-off between bias and variance. We further provide a method for optimally determining the trade-off parameter based on a variant of cross-validation. We demonstrate the usefulness of the proposed approach through simulations.
Volume22
Number10
Pages1399-1410
Link to PDFhttp://robot-learning.de/uploads/Publications/hachiya-AdaptiveImportanceSampling_5530[0].pdf
Reference TypeConference Proceedings
Author(s)Kober, J.; Peters, J.
Year2009
TitlePolicy Search for Motor Primitives in Robotics
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems 22 (NIPS 2008), Cambridge, MA: MIT Press
Link to PDFhttp://www-clmc.usc.edu/publications/K/kober_NIPS2008.pdf
Reference TypeConference Proceedings
Author(s)Chiappa, S.; Kober, J.; Peters, J.
Year2009
TitleUsing Bayesian Dynamical Systems for Motion Template Libraries
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems 22 (NIPS 2008), Cambridge, MA: MIT Press
Link to PDFhttp://www-clmc.usc.edu/publications/C/chiappa_NIPS2008.pdf
Reference TypeJournal Article
Author(s)Deisenroth, M.P.; Rasmussen, C.E.; Peters, J.
Year2009
TitleGaussian Process Dynamic Programming
Journal/Conference/Book TitleNeurocomputing
Number72
Pages1508-1524
Link to PDFhttp://robot-learning.de/uploads/Publications/Neurocomputing-2009-Deisenroth-Preprint_5531[0].pdf
Reference TypeConference Proceedings
Author(s)Hoffman, M.; de Freitas, N. ; Doucet, A.; Peters, J.
Year2009
TitleAn Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Reward
Journal/Conference/Book TitleProceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AIStats)
Link to PDFhttp://robot-learning.de/uploads/Publications/AIStats2009-Hoffman_5658[0].pdf
Reference TypeConference Proceedings
Author(s)Peters, J.; Kober, J.
Year2009
TitleUsing Reward-Weighted Imitation for Robot Reinforcement Learning
Journal/Conference/Book TitleProceedings of the 2009 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/peters_ADPRL_2009.pdf
Reference TypeConference Proceedings
Author(s)Hachiya, H.; Akiyama, T.; Sugiyama, M.; Peters, J.
Year2009
TitleEfficient Data Reuse in Value Function Approximation
Journal/Conference/Book TitleProceedings of the 2009 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
Link to PDFhttp://robot-learning.de/uploads/Publications/ADPRL2009-Hachiya_[0].pdf
Reference TypeConference Proceedings
Author(s)Kober, J.; Peters, J.
Year2009
TitleLearning Motor Primitives for Robotics
Journal/Conference/Book TitleProceedings of the IEEE International Conference on Robotics and Automation (ICRA)
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/ICRA2009-Kober_5661[0].pdf
Reference TypeConference Proceedings
Author(s)Piater, J.; Jodogne, S.; Detry, R.; Kraft, D.; Krueger, N.; Kroemer, O.; Peters, J.
Year2009
TitleLearning Visual Representations for Interactive Systems
Journal/Conference/Book TitleProceedings of the International Symposium on Robotics Research (ISRR), Invited Paper
AbstractWe describe two quite different methods for associating action parameters to visual percepts. Our RLVC algorithm performs reinforcement learning directly on the visual input space. To make this very large space manageable, RLVC interleaves the reinforcement learner with a supervised classification algorithm that seeks to split perceptual states so as to reduce perceptual aliasing. This results in an adaptive discretization of the perceptual space based on the presence or absence of visual features. Its extension RLJC also handles continuous action spaces. In contrast to the minimalistic visual representations produced by RLVC and RLJC, our second method learns structural object models for robust object detection and pose estimation by probabilistic inference. To these models, the method associates grasp experiences autonomously learned by trial and error. These experiences form a non-parametric representation of grasp success likelihoods over gripper poses, which we call a grasp density. Thus, object detection in a novel scene simultaneously produces suitable grasping options.
Link to PDFhttp://www-clmc.usc.edu/publications//P/Piater_POTISORRIP_2009.pdf
Reference TypeConference Proceedings
Author(s)Kober, J.; Peters, J.
Year2009
TitleLearning new basic Movements for Robotics
Journal/Conference/Book TitleProceedings of Autonome Mobile Systeme (AMS 2009)
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/paper_16.pdf
Reference TypeConference Proceedings
Author(s)Muelling, K.; Peters, J.
Year2009
TitleA computational model of human table tennis for robot application
Journal/Conference/Book TitleProceedings of Autonome Mobile Systeme (AMS 2009)
Link to PDFhttp://robot-learning.de/uploads/Publications/paper_10.pdf
Reference TypeConference Proceedings
Author(s)Kroemer, O.; Detry, R.; Piater, J.; Peters, J.
Year2009
TitleActive Learning Using Mean Shift Optimization for Robot Grasping
Journal/Conference/Book TitleProceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009)
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/kroemer_IROS_2009.pdf
Reference TypeConference Proceedings
Author(s)Nguyen-Tuong, D.; Schoelkopf, B.; Peters, J.
Year2009
TitleSparse Online Model Learning for Robot Control with Support Vector Regression
Journal/Conference/Book TitleProceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009)
Link to PDFhttp://dx.doi.org/10.1109/IROS.2009.5354609
Reference TypeJournal Article
Author(s)Peters, J.; Ng, A.
Year2009
TitleGuest Editorial: Special Issue on Robot Learning, Part B
Journal/Conference/Book TitleAutonomous Robots (AURO)
Volume27
Number2
Pages91-92
Link to PDFhttp://dx.doi.org/10.1007/s10514-009-9131-1
Reference TypeConference Proceedings
Author(s)Sigaud, O.; Peters, J.
Year2009
TitleFrom Motor Learning to Interaction Learning in Robots
Journal/Conference/Book TitleProceedings of Journees Nationales de la Recherche en Robotique
Pages189-195
Link to PDFhttp://robot-learning.de/uploads/Publications/JNRR2009-Sigaud_[0].pdf
Reference TypeConference Proceedings
Author(s)Neumann, G.; Maass, W; Peters, J.
Year2009
TitleLearning Complex Motions by Sequencing Simpler Motion Templates
Journal/Conference/Book TitleProceedings of the International Conference on Machine Learning (ICML2009)
Link to PDFhttp://robot-learning.de/uploads/Publications/ICML2009-Neumann_[0].pdf
Reference TypeConference Proceedings
Author(s)Detry, R.; Baseski, E.; Popovic, M.; Touati, Y.; Krueger, N.; Kroemer, O.; Peters, J.; Piater, J.
Year2009
TitleLearning Object-specific Grasp Affordance Densities
Journal/Conference/Book TitleProceedings of the International Conference on Development & Learning (ICDL 2009)
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/ICDL2009-Detry_[0].pdf
Reference TypeJournal Article
Author(s)Nguyen-Tuong, D.; Seeger, M.; Peters, J.
Year2009
TitleModel Learning with Local Gaussian Process Regression
Journal/Conference/Book TitleAdvanced Robotics
Volume23
Number15
Pages2015-2034
Link to PDFhttp://robot-learning.de/uploads/Publications/Nguyen-Tuong-ModelLearningLocalGaussianl_6067[0].pdf
Reference TypeJournal Article
Author(s)Kober, J.; Peters, J.
Year2009
TitleReinforcement Learning fuer Motor-Primitive
Journal/Conference/Book TitleKuenstliche Intelligenz
Link to PDFhttp://www.kuenstliche-intelligenz.de/index.php?id=7779&tx_ki_pi1[showUid]=1820&cHash=a9015a9e57
Reference TypeJournal Article
Author(s)Peters, J.; Morimoto, J.; Tedrake, R.; Roy, N.
Year2009
TitleRobot Learning
Journal/Conference/Book TitleIEEE Robotics & Automation Magazine
Keywordsrobot learning, tc spotlight
Volume16
Number3
Pages19-20
Link to PDFhttp://dx.doi.org/10.1109/MRA.2009.933618
Reference TypeJournal Article
Author(s)Peters, J.; Ng, A.
Year2009
TitleGuest Editorial: Special Issue on Robot Learning, Part A
Journal/Conference/Book TitleAutonomous Robots (AURO)
Volume27
Number1
Link to PDFhttp://dx.doi.org/10.1007/s10514-009-9131-1
Reference TypeConference Proceedings
Author(s)Lampert, C.H.; Peters, J.
Year2009
TitleActive Structured Learning for High-Speed Object Detection
Journal/Conference/Book TitleProceedings of the DAGM (Pattern Recognition)
Link to PDFhttp://robot-learning.de/uploads/Publications/DAGM2009-Lampert_[0].pdf
Reference TypeConference Proceedings
Author(s)Gomez Rodriguez, M.; Kober, J.; Schoelkopf, B.
Year2009
TitleDenoising photographs using dark frames optimized by quadratic programming
Journal/Conference/Book TitleProceedings of the First IEEE International Conference on Computational Photography (ICCP 2009)
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/ICCP09-GomezRodriguez_5491[0].pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/ICCP09-GomezRodriguez_5491[0].pdf
Reference TypeConference Proceedings
Author(s)Deisenroth, M.P.; Peters, J.; Rasmussen, C.E.
Year2008
TitleApproximate Dynamic Programming with Gaussian Processes
Journal/Conference/Book TitleAmerican Control Conference
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Main/PublicationsByYear/deisenroth_ACC2008.pdf
Reference TypeConference Proceedings
Author(s)Nguyen-Tuong, D.; Peters, J.; Seeger, M.; Schoelkopf, B.
Year2008
TitleComputed Torque Control with Nonparametric Regression Techniques
Journal/Conference/Book TitleAmerican Control Conference
Link to PDFhttp://robot-learning.de/uploads/Publications/NguyenTuong_ACC2008.pdf
Reference TypeConference Proceedings
Author(s)Deisenroth, M.P.; Rasmussen, C.E.; Peters, J.
Year2008
TitleModel-Based Reinforcement Learning with Continuous States and Actions
Journal/Conference/Book TitleProceedings of the European Symposium on Artificial Neural Networks (ESANN 2008)
Pages19-24
Link to PDFhttp://robot-learning.de/uploads/Publications/deisenroth_ESANN2008.pdf
Reference TypeJournal Article
Author(s)Steinke, F.; Hein, M.; Peters, J.; Schoelkopf, B
Year2008
TitleManifold-valued Thin-Plate Splines with Applications in Computer Graphics
Journal/Conference/Book TitleComputer Graphics Forum (Special Issue on Eurographics 2008)
Volume27
Number2
Link to PDFhttp://robot-learning.de/uploads/Publications/Steinke_EGFinal-1049.pdf
Reference TypeConference Proceedings
Author(s)Nguyen-Tuong, D.; Peters, J.; Seeger, M.; Schoelkopf, B.
Year2008
TitleLearning Inverse Dynamics: a Comparison
Journal/Conference/Book TitleProceedings of the European Symposium on Artificial Neural Networks (ESANN 2008)
Reference TypeConference Proceedings
Author(s)Peters, J.; Nguyen-Tuong, D.
Year2008
TitleReal-Time Learning of Resolved Velocity Control on a Mitsubishi PA-10
Journal/Conference/Book TitleInternational Conference on Robotics and Automation (ICRA)
Link to PDFhttp://www-clmc.usc.edu/publications/P/peters-ICRA2008.pdf
Reference TypeConference Proceedings
Author(s)Hachiya, H.; Akiyama, T.; Sugiyama, M.; Peters, J.
Year2008
TitleAdaptive Importance Sampling with Automatic Model Selection in Value Function Approximation
Journal/Conference/Book TitleProceedings of the Twenty-Third National Conference on Artificial Intelligence (AAAI 2008)
Link to PDFhttp://www-clmc.usc.edu/publications/H/hachiya-AAAI08.pdf
Reference TypeConference Proceedings
Author(s)Wierstra, D.; Schaul, T.; Peters, J.; Schmidhuber, J.
Year2008
TitleNatural Evolution Strategies
Journal/Conference/Book Title2008 IEEE Congress on Evolutionary Computation
AbstractThis paper presents Natural Evolution Strategies (NES), a novel algorithm for performing real-valued black box function optimization: optimizing an unknown objective function where algorithm-selected function measurements constitute the only information accessible to the method. Natural Evolution Strategies search the fitness landscape using a multivariate normal distribution with a self-adapting mutation matrix to generate correlated mutations in promising regions. NES shares this property with Covariance Matrix Adaption (CMA), an Evolution Strategy (ES) which has been shown to perform well on a variety of high-precision optimization tasks. The Natural Evolution Strategies algorithm, however, is simpler, less ad-hoc and more principled. Self-adaptation of the mutation matrix is derived using a Monte Carlo estimate of the natural gradient towards better expected fitness. By following the natural gradient instead of the "vanilla" gradient, we can ensure efficient update steps while preventing early convergence due to overly greedy updates, resulting in reduced sensitivity to local suboptima. We show NES has competitive performance with CMA on several tasks, while outperforming it on one task that is rich in deceptive local optima, the Rastrigin benchmark.
Link to PDFhttp://www-clmc.usc.edu/publications/W/wierstra-CEC2008.pdf
Reference TypeConference Proceedings
Author(s)Nguyen-Tuong, D.; Peters, J.
Year2008
TitleLocal Gaussian Process Regression for Real-time Model-based Robot Control
Journal/Conference/Book TitleInternational Conference on Intelligent Robot Systems (IROS)
Link to PDFhttp://www-clmc.usc.edu/publications/N/nguyen_IROS2008.pdf
Reference TypeConference Proceedings
Author(s)Kober, J.; Mohler, B.; Peters, J.
Year2008
TitleLearning Perceptual Coupling for Motor Primitives
Journal/Conference/Book TitleInternational Conference on Intelligent Robot Systems (IROS)
Link to PDFhttp://www-clmc.usc.edu/publications/K/kober_IROS2008.pdf
Reference TypeBook
Author(s)Lesperance, Y.; Lakemeyer, G.; Peters, J.; Pirri, F.
Year2008
TitleProceedings of the 6th International Cognitive Robotics Workshop (CogRob 2008)
Journal/Conference/Book TitleJuly 21-22, 2008, Patras, Greece, ISBN 978-960-6843-09-9
Reference TypeConference Proceedings
Author(s)Wierstra, D.; Schaul, T.; Peters, J.; Schmidhuber, J.
Year2008
TitleFitness Expectation Maximization
Journal/Conference/Book Title10th International Conference on Parallel Problem Solving from Nature (PPSN 2008)
Link to PDFhttp://www-clmc.usc.edu/publications/W/wierstra_PPSN08.pdf
Reference TypeJournal Article
Author(s)Nakanishi, J.; Cory, R.; Mistry, M.; Peters, J.; Schaal, S.
Year2008
TitleOperational space control: A theoretical and empirical comparison
Journal/Conference/Book TitleInternational Journal of Robotics Research (IJRR)
Keywordstask space control, operational space control, redundancy resolution, humanoid robotics
AbstractDexterous manipulation with a highly redundant movement system is one of the hallmarks of human motor skills. From numerous behavioral studies, there is strong evidence that humans employ compliant task space control, i.e., they focus control only on task variables while keeping redundant degrees-of-freedom as compliant as possible. This strategy is robust towards unknown disturbances and simultaneously safe for the operator and the environment. The theory of operational space control in robotics aims to achieve similar performance properties. However, despite various compelling theoretical lines of research, advanced operational space control is hardly found in actual robotics implementations, in particular new kinds of robots like humanoids and service robots, which would strongly profit from compliant dexterous manipulation. To analyze the pros and cons of different approaches to operational space control, this paper focuses on a theoretical and empirical evaluation of different methods that have been suggested in the literature, but also some new variants of operational space controllers. We address formulations at the velocity, acceleration and force levels. First, we formulate all controllers in a common notational framework, including quaternion-based orientation control, and discuss some of their theoretical properties. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm with several benchmark tasks. As an aside, we also introduce a novel parameter estimation algorithm for rigid body dynamics, which ensures physical consistency, as this issue was crucial for our successful robot implementations. Our extensive empirical results demonstrate that one of the simplified acceleration-based approaches can be advantageous in terms of task performance, ease of parameter tuning, and general robustness and compliance in face of inevitable modeling errors.
Volume27
Number6
Pages737-757
Short TitleOperational space control: A theoretical and empirical comparison
URL(s) http://www-clmc.usc.edu/publications/N/nakanishi-IJRR2008.pdf
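Of the formulations compared in this paper, the velocity-level one (resolved motion rate control) is the simplest to sketch: joint velocities are obtained from the task-space error through the Jacobian pseudoinverse. The planar two-link arm below is a toy stand-in with assumed unit link lengths, not the seven-DOF setup used in the paper:

```python
import numpy as np

def fk(q, l1=1.0, l2=1.0):
    # Forward kinematics of a planar two-link arm.
    x = l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1])
    y = l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def jacobian(q, l1=1.0, l2=1.0):
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def resolved_rate_reach(target, q0, kp=2.0, dt=0.01, steps=1000):
    # Velocity-level task space control: qdot = pinv(J) * kp * (x_desired - x).
    q = np.array(q0, dtype=float)
    for _ in range(steps):
        err = target - fk(q)
        qdot = np.linalg.pinv(jacobian(q)) @ (kp * err)
        q += dt * qdot                     # Euler integration of the joint velocities
    return fk(q)

print(resolved_rate_reach(np.array([1.0, 1.0]), [0.3, 0.5]))  # close to the target (1, 1)
```

The acceleration- and force-level controllers evaluated in the paper replace the pseudoinverse velocity map with dynamics-aware counterparts; this sketch only conveys the common structure of task-space error feedback mapped through the Jacobian.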
Reference TypeConference Proceedings
Author(s)Wierstra, D.; Schaul, T.; Peters, J.; Schmidhuber, J.
Year2008
TitleEpisodic Reinforcement Learning by Logistic Reward-Weighted Regression
Journal/Conference/Book TitleProceedings of the International Conference on Artificial Neural Networks (ICANN)
Link to PDFhttp://www-clmc.usc.edu/publications/W/wierstra_ICANN08.pdf
Reference TypeConference Proceedings
Author(s)Sehnke, F.; Osendorfer, C.; Rueckstiess, T.; Graves, A.; Peters, J.; Schmidhuber, J.
Year2008
TitlePolicy Gradients with Parameter-based Exploration for Control
Journal/Conference/Book TitleProceedings of the International Conference on Artificial Neural Networks (ICANN)
Link to PDFhttp://www-clmc.usc.edu/publications/S/sehnke_ICANN2008.pdf
Reference TypeBook
Author(s)Peters, J.
Year2008
TitleMachine Learning for Robotics
Journal/Conference/Book TitleVDM-Verlag, ISBN 978-3-639-02110-3
ISBN/ISSNISBN 978-3-639-02110-3
Link to PDFhttp://www.amazon.de/Machine-Learning-Robotics-Methods-Skills/dp/363902110X/ref=sr_1_1?ie=UTF8&s=books&qid=1220658804&sr=8-1
Reference TypeConference Proceedings
Author(s)Peters, J.; Kober, J.; Nguyen-Tuong, D.
Year2008
TitlePolicy Learning - a unified perspective with applications in robotics
Journal/Conference/Book TitleProceedings of the European Workshop on Reinforcement Learning (EWRL)
Keywordsreinforcement learning, policy gradient, weighted regression
Link to PDFhttp://www-clmc.usc.edu/publications/P/peters_EWRL2008.pdf
Reference TypeConference Proceedings
Author(s)Kober, J.; Peters, J.
Year2008
TitleReinforcement Learning of Perceptual Coupling for Motor Primitives
Journal/Conference/Book TitleProceedings of the European Workshop on Reinforcement Learning (EWRL)
Reference TypeJournal Article
Author(s)Peters, J.
Year2008
TitleMachine Learning for Motor Skills in Robotics
Journal/Conference/Book TitleKuenstliche Intelligenz
Keywordsmotor control, motor primitives, motor learning
AbstractAutonomous robots that can adapt to novel situations have been a long standing vision of robotics, artificial intelligence, and the cognitive sciences. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks of future robots. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator and humanoid robotics and usually scaling was only achieved in precisely pre-structured domains. We have investigated the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting.
Number3
Link to PDFhttp://www-clmc.usc.edu/publications//P/Peters_KI_2008.pdf
Reference TypeConference Paper
Author(s)Nguyen-Tuong, D.; Peters, J.
Year2008
TitleLearning Robot Dynamics for Computed Torque Control using Local Gaussian Processes Regression
Journal/Conference/Book TitleProceedings of the ECSIS Symposium on Learning and Adaptive Behavior in Robotic Systems, LAB-RS 2008
Link to PDFhttp://www-clmc.usc.edu/publications/N/nguyen_ECSIS2008.pdf
Reference TypeJournal Article
Author(s)Peters, J.; Schaal, S.
Year2008
TitleNatural actor critic
Journal/Conference/Book TitleNeurocomputing
Keywordsreinforcement learning, policy gradient, natural actor-critic, natural gradients
AbstractIn this paper, we suggest a novel reinforcement learning architecture, the Natural Actor-Critic. The actor updates are achieved using stochastic policy gradients employing Amari's natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression. We show that actor improvements with natural policy gradients are particularly appealing as these are independent of the coordinate frame of the chosen policy representation, and can be estimated more efficiently than regular policy gradients. The critic makes use of a special basis function parameterization motivated by the policy-gradient compatible function approximation. We show that several well-known reinforcement learning methods such as the original Actor-Critic and Bradtke's Linear Quadratic Q-Learning are in fact Natural Actor-Critic algorithms. Empirical evaluations illustrate the effectiveness of our techniques in comparison to previous methods, and also demonstrate their applicability for learning control on an anthropomorphic robot arm.
Volume71
Number7-9
Pages1180-1190
Short TitleNatural actor critic
URL(s) http://www-clmc.usc.edu/publications//P/peters-NC2008.pdf
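The key identity behind the Natural Actor-Critic, that least-squares regression of rewards onto the compatible features yields the natural policy gradient directly as the critic's weight vector, can be checked on a one-step toy problem. The problem, features and constants below are our own illustration, not from the paper:

```python
import numpy as np

def nac_bandit(iters=200, batch=100, alpha=0.2, sigma=0.5, seed=0):
    # One-step problem: state s, action a ~ N(theta*s, sigma^2), reward -(a - 2s)^2.
    # The optimal policy parameter is theta = 2.
    rng = np.random.default_rng(seed)
    theta = 0.0
    for _ in range(iters):
        s = rng.uniform(0.5, 1.5, batch)
        a = theta * s + sigma * rng.standard_normal(batch)
        r = -(a - 2.0 * s) ** 2
        # Compatible feature: d/dtheta log pi(a|s) = (a - theta*s) * s / sigma^2.
        g = (a - theta * s) * s / sigma ** 2
        X = np.stack([g, np.ones(batch)], axis=1)       # compatible feature plus constant baseline
        w, _, _, _ = np.linalg.lstsq(X, r, rcond=None)  # critic: linear regression of reward
        theta += alpha * w[0]                           # actor: natural gradient step = critic weight
    return theta

print(nac_bandit())  # theta approaches 2
```

The full Natural Actor-Critic handles multi-step returns with additional value-function parameters; the sketch keeps only the immediate-reward case where the identity is easiest to see.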
Reference TypeJournal Article
Author(s)Peters, J.; Schaal, S.
Year2008
TitleLearning to control in operational space
Journal/Conference/Book TitleInternational Journal of Robotics Research (IJRR)
Keywordsoperational space control, learning, EM ALGORITHM, redundancy resolution, reinforcement learning
AbstractOne of the most general frameworks for phrasing control problems for complex, redundant robots is operational space control. However, while this framework is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In this paper, we suggest a learning approach for operational space control as a direct inverse model learning problem. A first important insight for this paper is that a physically correct solution to the inverse problem with redundant degrees-of-freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on the insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the machine learning point of view, this learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a three degrees of freedom robot arm are used to illustrate the suggested approach. The application to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex high degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on a real, physical Mitsubishi PA-10 medical robotics arm.
Volume27
Pages197-212
Short TitleLearning to control in operational space
URL(s) http://www-clmc.usc.edu/publications/P/peters-IJRR2008.pdf
Reference TypeJournal Article
Author(s)Peters, J.; Schaal, S.
Year2008
TitleReinforcement learning of motor skills with policy gradients
Journal/Conference/Book TitleNeural Networks
KeywordsReinforcement learning, Policy gradient methods, Natural gradients, Natural Actor-Critic, Motor skills, Motor primitives
AbstractAutonomous learning is one of the hallmarks of human and animal behavior, and understanding the principles of learning will be crucial in order to achieve true autonomy in advanced machines like humanoid robots. In this paper, we examine learning of complex motor skills with human-like limbs. While supervised learning can offer useful tools for bootstrapping behavior, e.g., by learning from demonstration, it is only reinforcement learning that offers a general approach to the final trial-and-error improvement that is needed by each individual acquiring a skill. Neither neurobiological nor machine learning studies have, so far, offered compelling results on how reinforcement learning can be scaled to the high-dimensional continuous state and action spaces of humans or humanoids. Here, we combine two recent research developments on learning motor control in order to achieve this scaling. First, we interpret the idea of modular motor control by means of motor primitives as a suitable way to generate parameterized control policies for reinforcement learning. Second, we combine motor primitives with the theory of stochastic policy gradient learning, which currently seems to be the only feasible framework for reinforcement learning for humanoids. We evaluate different policy gradient methods with a focus on their applicability to parameterized motor primitives. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.
Volume21
Number4
Pages682-97
DateMay
Short TitleReinforcement learning of motor skills with policy gradients
ISBN/ISSN0893-6080 (Print)
Accession Number18482830
URL(s) http://www-clmc.usc.edu/publications/P/peters-NN2008.pdf
AddressMax Planck Institute for Biological Cybernetics, Spemannstr. 38, 72076 Tubingen, Germany; University of Southern California, 3710 S. McClintoch Ave-RTH401, Los Angeles, CA 90089-2905, USA.
Languageeng
Reference TypeJournal Article
Author(s)Peters, J.; Mistry, M.; Udwadia, F. E.; Nakanishi, J.; Schaal, S.
Year2008
TitleA unifying framework for robot control with redundant DOFs
Journal/Conference/Book TitleAutonomous Robots (AURO)
Keywordsoperational space control, inverse control, dexterous manipulation, optimal control
AbstractRecently, Udwadia (Proc. R. Soc. Lond. A 2003:1783–1800, 2003) suggested to derive tracking controllers for mechanical systems with redundant degrees-of-freedom (DOFs) using a generalization of Gauss’ principle of least constraint. This method allows reformulating control problems as a special class of optimal controllers. In this paper, we take this line of reasoning one step further and demonstrate that several well-known and also novel nonlinear robot control laws can be derived from this generic methodology. We show experimental verifications on a Sarcos Master Arm robot for some of the derived controllers. The suggested approach offers a promising unification and simplification of nonlinear control law design for robots obeying rigid body dynamics equations, both with or without external constraints, with over-actuation or underactuation, as well as open-chain and closed-chain kinematics.
Volume24
Number1
Pages1-12
Short TitleA unifying methodology for robot control with redundant DOFs
URL(s) http://www-clmc.usc.edu/publications/P/peters-AR2008.pdf
Reference TypeThesis
Author(s)Kober, J.
Year2008
TitleReinforcement Learning for Motor Primitives
Journal/Conference/Book TitleDipl-Ing Thesis, University of Stuttgart
URL(s) http://www.ias.informatik.tu-darmstadt.de/publications/DiplomaThesis-Kober_5331[0].pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/publications/DiplomaThesis-Kober_5331[0].pdf
Reference TypeJournal Article
Author(s)Peters, J.
Year2007
TitleComputational Intelligence: By Amit Konar
Journal/Conference/Book TitleThe Computer Journal
Keywordsbook review
Volume50
Number6
Pages758
Reference TypeConference Proceedings
Author(s)Peters, J.; Schaal, S.
Year2007
TitlePolicy Learning for Motor Skills
Journal/Conference/Book TitleProceedings of 14th International Conference on Neural Information Processing (ICONIP)
KeywordsMachine Learning, Reinforcement Learning, Robotics, Motor Primitives, Policy Gradients, Natural Actor-Critic, Reward-Weighted Regression
AbstractPolicy learning which allows autonomous robots to adapt to novel situations has been a long standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution.
Link to PDFhttp://www-clmc.usc.edu/publications/P/peters_ICONIP2007.pdf
Reference TypeConference Proceedings
Author(s)Wierstra, D.; Foerster, A.; Peters, J.; Schmidhuber, J.
Year2007
TitleSolving Deep Memory POMDPs with Recurrent Policy Gradients
Journal/Conference/Book TitleProceedings of the International Conference on Artificial Neural Networks (ICANN)
Keywordspolicy gradients, reinforcement learning
AbstractThis paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov decision problems (POMDPs) that require long-term memories of past observations. The approach involves approximating a policy gradient for a Recurrent Neural Network (RNN) by backpropagating return-weighted characteristic eligibilities through time. Using a Long Short-Term Memory architecture, we are able to outperform other RL methods on two important benchmark tasks. Furthermore, we show promising results on a complex car driving simulation task.
Link to PDFhttp://www-clmc.usc.edu/publications//D/Wierstra_ICANN_2007.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.; Schaal, S.; Schoelkopf, B.
Year2007
TitleTowards Machine Learning of Motor Skills
Journal/Conference/Book TitleProceedings of Autonome Mobile Systeme (AMS)
KeywordsMotor Skill Learning, Robotics, Natural Actor-Critic, Reward-Weighted Regression
AbstractAutonomous robots that can adapt to novel situations have been a long standing vision of robotics, artificial intelligence, and cognitive sciences. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting.
Link to PDFhttp://www-clmc.usc.edu/publications//P/Peters_POAMS_2007.pdf
Reference TypeConference Proceedings
Author(s)Theodorou, E.; Peters, J.; Schaal, S.
Year2007
TitleReinforcement Learning for Optimal Control of Arm Movements
Journal/Conference/Book TitleAbstracts of the 37st Meeting of the Society of Neuroscience
KeywordsOptimal Control,Reinforcement Learning, Arm Movements
AbstractEveryday motor behavior consists of a plethora of challenging motor skills from discrete movements such as reaching and throwing to rhythmic movements such as walking, drumming and running. How this plethora of motor skills can be learned remains an open question. In particular, is there any unifying computational framework that could model the learning process of this variety of motor behaviors and at the same time be biologically plausible? In this work we aim to give an answer to these questions by providing a computational framework that unifies the learning mechanism of both rhythmic and discrete movements under optimization criteria, i.e., in a non-supervised trial-and-error fashion. Our suggested framework is based on Reinforcement Learning, which is mostly considered as too costly to be a plausible mechanism for learning complex limb movement. However, recent work on reinforcement learning with policy gradients combined with parameterized movement primitives allows novel and more efficient algorithms. By using the representational power of such motor primitives we show how rhythmic motor behaviors such as walking, squashing and drumming as well as discrete behaviors like reaching and grasping can be learned with biologically plausible algorithms. Using extensive simulations and by using different reward functions we provide results that support the hypothesis that Reinforcement Learning could be a viable candidate for motor learning of human motor behavior when other learning methods like supervised learning are not feasible.
Reference TypeJournal Article
Author(s)Nakanishi, J.; Mistry, M.; Peters, J.; Schaal, S.
Year2007
TitleExperimental evaluation of task space position/orientation control towards compliant control for humanoid robots
Journal/Conference/Book TitleIEEE International Conference on Intelligent Robotics Systems (IROS 2007)
Keywordsoperational space control, quaternion, task space control, resolved motion rate control, resolved acceleration, force control
AbstractCompliant control will be a prerequisite for humanoid robotics if these robots are supposed to work safely and robustly in human and/or dynamic environments. One view of compliant control is that a robot should control a minimal number of degrees-of-freedom (DOFs) directly, i.e., those relevant DOFs for the task, and keep the remaining DOFs maximally compliant, usually in the null space of the task. This view naturally leads to task space control. However, surprisingly few implementations of task space control can be found in actual humanoid robots. This paper makes a first step towards assessing the usefulness of task space controllers for humanoids by investigating which choices of controllers are available and what inherent control characteristics they have; this treatment will concern position and orientation control, where the latter is based on a quaternion formulation. Empirical evaluations on an anthropomorphic Sarcos master arm illustrate the robustness of the different controllers as well as the ease of implementing and tuning them. Our extensive empirical results demonstrate that simpler task space controllers, e.g., classical resolved motion rate control or resolved acceleration control can be quite advantageous in face of inevitable modeling errors in model-based control, and that well chosen formulations are easy to implement and quite robust, such that they are useful for humanoids.
Place PublishedSan Diego, CA: Oct. 29 - Nov. 2
Short TitleExperimental evaluation of task space position/orientation control towards compliant control for humanoid robots
URL(s) http://www-clmc.usc.edu/publications/T/nakanishi-IROS2007.pdf
Reference TypeThesis
Author(s)Peters, J.
Year2007
TitleMachine Learning of Motor Skills for Robotics
Journal/Conference/Book TitlePh.D. Thesis, Department of Computer Science, University of Southern California
KeywordsMachine Learning, Reinforcement Learning, Robotics, Motor Primitives, Policy Gradients, Natural Actor-Critic, Reward-Weighted Regression
AbstractAutonomous robots that can assist humans in situations of daily life have been a long standing vision of robotics, artificial intelligence, and cognitive sciences. A first step towards this goal is to create robots that can accomplish a multitude of different tasks, triggered by environmental context or higher level instruction. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning and human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this thesis, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting. As a theoretical foundation, we first study a general framework to generate control laws for real robots with a particular focus on skills represented as dynamical systems in differential constraint form. We present a point-wise optimal control framework resulting from a generalization of Gauss' principle and show how various well-known robot control laws can be derived by modifying the metric of the employed cost function. 
The framework has been successfully applied to task space tracking control for holonomic systems for several different metrics on the anthropomorphic SARCOS Master Arm. In order to overcome the limiting requirement of accurate robot models, we first employ learning methods to find learning controllers for task space control. However, when learning to execute a redundant control problem, we face the general problem of the non-convexity of the solution space which can force the robot to steer into physically impossible configurations if supervised learning methods are employed without further consideration. This problem can be resolved using two major insights, i.e., the learning problem can be treated as locally convex and the cost function of the analytical framework can be used to ensure global consistency. Thus, we derive an immediate reinforcement learning algorithm from the expectation-maximization point of view which leads to a reward-weighted regression technique. This method can be used both for operational space control as well as general immediate reward reinforcement learning problems. We demonstrate the feasibility of the resulting framework on the problem of redundant end-effector tracking for both a simulated 3 degrees of freedom robot arm as well as for a simulated anthropomorphic SARCOS Master Arm. While learning to execute tasks in task space is an essential component to a general framework to motor skill learning, learning the actual task is of even higher importance, particularly as this issue is more frequently beyond the abilities of analytical approaches than execution. We focus on the learning of elemental tasks which can serve as the "building blocks of movement generation", called motor primitives. Motor primitives are parameterized task representations based on splines or nonlinear differential equations with desired attractor properties. 
While imitation learning of parameterized motor primitives is a relatively well-understood problem, the self-improvement by interaction of the system with the environment remains a challenging problem, tackled in the fourth chapter of this thesis. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline both established and novel algorithms for the gradient-based improvement of parameterized policies. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm. In conclusion, in this thesis, we have contributed a general framework for analytically computing robot control laws which can be used for deriving various previous control approaches and serves as foundation as well as inspiration for our learning algorithms. We have introduced two classes of novel reinforcement learning methods, i.e., the Natural Actor-Critic and the Reward-Weighted Regression algorithm. These algorithms have been used in order to replace the analytical components of the theoretical framework by learned representations. Evaluations have been performed on both simulated and real robot arms.
Reference TypeConference Proceedings
Author(s)Peters, J.; Schaal, S.
Year2007
TitleReinforcement learning for operational space control
Journal/Conference/Book TitleInternational Conference on Robotics and Automation (ICRA2007)
Keywordsoperational space control, reinforcement learning, weighted regression, EM-Algorithm
AbstractWhile operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting supervised learning problem is ill-defined as it requires to learn an inverse mapping of a usually redundant system, which is well known to suffer from the property of non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. The important insight that many operational space control algorithms can be reformulated as optimal control problems, however, allows addressing this inverse learning problem in the framework of reinforcement learning. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.
Link to PDFhttp://www-clmc.usc.edu/publications/P/peters-ICRA2007.pdf
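The reward-weighted regression update derived from the EM view described above (each M-step refits the policy to the sampled actions, weighted by transformed reward) reduces, for a Gaussian policy mean, to a weighted average of actions. A one-dimensional sketch with an illustrative reward function and exponent, not the operational space control task from the paper:

```python
import numpy as np

def rwr(iters=60, batch=200, sigma=0.4, beta=2.0, seed=0):
    # Maximize r(a) = exp(-(a - 1.5)^2) with a Gaussian policy N(theta, sigma^2).
    rng = np.random.default_rng(seed)
    theta = 0.0
    for _ in range(iters):
        a = theta + sigma * rng.standard_normal(batch)  # sample actions from the current policy
        r = np.exp(-(a - 1.5) ** 2)
        w = r ** beta                      # reward transformation (sharper weighting for larger beta)
        theta = np.sum(w * a) / np.sum(w)  # M-step: reward-weighted mean of the sampled actions
    return theta

print(rwr())  # theta approaches the optimum 1.5
```

Because each update is a weighted regression rather than a gradient step, the policy moves smoothly toward high-reward actions without a step-size parameter, which is the property the abstract emphasizes.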
Reference TypeConference Proceedings
Author(s)Peters, J.; Schaal, S.
Year2007
TitleUsing reward-weighted regression for reinforcement learning of task space control
Journal/Conference/Book TitleProceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Keywordsreinforcement learning, cart-pole, policy gradient methods
AbstractIn this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, `vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease.
Place PublishedHonolulu, Hawaii, April 1-5, 2007
Short TitleUsing reward-weighted regression for reinforcement learning of task space control
URL(s) http://www-clmc.usc.edu/publications/P/peters-ADPRL2007.pdf
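The first family evaluated in the abstract above, finite-difference gradients, estimates the policy gradient by perturbing the parameters and regressing the change in performance onto the perturbations. A minimal sketch on a deterministic toy objective with a known gradient (not the cart-pole plant from the paper; all names and constants are illustrative):

```python
import numpy as np

def fd_gradient(J, theta, delta=0.05, rollouts=20, seed=0):
    # Finite-difference policy gradient: sample parameter perturbations,
    # then least-squares regress the performance differences onto them.
    rng = np.random.default_rng(seed)
    dtheta = delta * rng.standard_normal((rollouts, theta.size))
    dj = np.array([J(theta + d) - J(theta) for d in dtheta])
    g, _, _, _ = np.linalg.lstsq(dtheta, dj, rcond=None)
    return g

# Toy performance measure with known gradient -2*(theta - 1) at any theta.
J = lambda th: -np.sum((th - 1.0) ** 2)
theta = np.zeros(3)
print(fd_gradient(J, theta))  # close to the true gradient [2, 2, 2]
```

'Vanilla' and natural policy gradients, the other two families compared in the paper, avoid these explicit parameter perturbations by differentiating the policy's log-likelihood instead.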
Reference TypeConference Proceedings
Author(s)Peters, J.; Schaal, S.
Year2007
TitleApplying the episodic natural actor-critic architecture to motor primitive learning
Journal/Conference/Book TitleProceedings of the 2007 European Symposium on Artificial Neural Networks (ESANN)
Keywordsreinforcement learning, policy gradient methods, motor primitives, natural actor-critic
AbstractIn this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates which are achieved using natural stochastic policy gradients while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the "building blocks of movement generation", called motor primitives. Motor primitives are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. We show that our most modern algorithm, the Episodic Natural Actor-Critic outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.
Place PublishedBruges, Belgium, April 25-27
Short TitleApplying the episodic natural actor-critic architecture to motor primitive learning
URL(s) http://www-clmc.usc.edu/publications//P/peters-ESANN2007.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.;Schaal, S.
Year2007
TitleReinforcement learning by reward-weighted regression for operational space control
Journal/Conference/Book TitleProceedings of the International Conference on Machine Learning (ICML2007)
Keywordsreinforcement learning, operational space control, weighted regression
AbstractMany robot control problems of practical importance, including operational space control, can be reformulated as immediate reward reinforcement learning problems. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which is infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.
Place PublishedCorvallis, Oregon, June 19-21
Short TitleReinforcement learning by reward-weighted regression for operational space control
URL(s) http://www-clmc.usc.edu/publications//P/peters_ICML2007.pdf
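The reduction the abstract describes, immediate-reward reinforcement learning solved as a weighted regression, can be sketched in a few lines: samples are weighted by an exponential transformation of their reward, and the policy parameters are refit by weighted least squares. The function below is an illustrative sketch, not the paper's implementation; the linear policy model and the fixed temperature `beta` are assumptions (the paper uses an adaptive, integrated reward transformation).

```python
import numpy as np

def reward_weighted_regression(states, actions, rewards, beta=1.0):
    """One EM-style update: fit a linear policy a = theta^T s by least
    squares, with each sample weighted by exp(beta * reward).
    Illustrative sketch; `beta` stands in for the paper's adaptive
    reward transformation."""
    # Subtract the max reward before exponentiating for numerical stability.
    w = np.exp(beta * (rewards - rewards.max()))
    # Weighted least squares via scaling rows by sqrt(w).
    sw = np.sqrt(w)[:, None]
    theta, *_ = np.linalg.lstsq(sw * states, sw * actions, rcond=None)
    return theta
```

With a strong weighting, the fit is dominated by the high-reward samples, which is the mechanism that makes the regression act like a policy improvement step.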
Reference TypeConference Proceedings
Author(s)Peters, J.;Theodorou, E.;Schaal, S.
Year2007
TitlePolicy gradient methods for machine learning
Journal/Conference/Book TitleINFORMS Conference of the Applied Probability Society
Keywordspolicy gradient methods, reinforcement learning, simulation-optimization
AbstractWe present an in-depth survey of policy gradient methods as they are used in the machine learning community for optimizing parameterized, stochastic control policies in Markovian systems with respect to the expected reward. Despite having been developed separately in the reinforcement learning literature, policy gradient methods employ likelihood ratio gradient estimators as also suggested in the stochastic simulation optimization community. It is well known that this approach to policy gradient estimation traditionally suffers from three drawbacks, i.e., large variance, a strong dependence on baseline functions and an inefficient gradient descent. In this talk, we will present a series of recent results which tackle each of these problems. The variance of the gradient estimation can be reduced significantly through recently introduced techniques such as optimal baselines, compatible function approximations and all-action gradients. However, as even the analytically obtainable policy gradients perform unnaturally slowly, the step from vanilla policy gradient methods towards natural policy gradients was required to overcome the inefficiency of the gradient descent. This development resulted in the Natural Actor-Critic architecture, which can be shown to be very efficient in application to motor primitive learning for robotics.
Place PublishedEindhoven, Netherlands, July 9-11, 2007
Short TitlePolicy gradient methods for machine learning
Reference TypeConference Proceedings
Author(s)Riedmiller, M.;Peters, J.;Schaal, S.
Year2007
TitleEvaluation of policy gradient methods and variants on the cart-pole benchmark
Journal/Conference/Book TitleProceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Keywordsreinforcement learning, cart-pole, policy gradient methods
AbstractIn this paper, we evaluate different versions of the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, `vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease.
Place PublishedHonolulu, Hawaii, April 1-5, 2007
Short TitleEvaluation of policy gradient methods and variants on the cart-pole benchmark
URL(s) http://www-clmc.usc.edu/publications/P/riedmiller-ADPRL2007.pdf
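The simplest of the three families compared in the paper, finite difference gradients, admits a very small sketch: perturb each policy parameter in turn and difference the resulting returns. This is an illustrative reimplementation, not the portable C++ code the paper provides; `J` is assumed here to be a deterministic return evaluator (in practice it would be a noisy rollout average).

```python
import numpy as np

def finite_difference_gradient(J, theta, eps=1e-2):
    """Estimate the gradient of the return J at policy parameters theta
    by central differences, one parameter at a time. Illustrative sketch;
    real policy evaluations are stochastic and would be averaged."""
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        d = np.zeros_like(theta)
        d[i] = eps
        grad[i] = (J(theta + d) - J(theta - d)) / (2.0 * eps)
    return grad
```

For a quadratic return the central-difference estimate is exact up to rounding, which is why such estimators serve as a useful baseline despite needing 2n rollouts per gradient.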
Reference TypeReport
Author(s)Peters, J.
Year2007
TitleRelative Entropy Policy Search
Journal/Conference/Book TitleCLMC Technical Report: TR-CLMC-2007-2
Keywordsrelative entropy, policy search, natural policy gradient
AbstractThis technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one shot solutions. Future work may include the application to interesting problems.
Place PublishedLos Angeles, CA
Type of WorkCLMC Technical Report
URL(s) http://www-clmc.usc.edu/publications/P/Peters-TR2007.pdf
Link to PDFhttp://www-clmc.usc.edu/publications/P/Peters-TR2007.pdf
Research NotesA longer and more complete version is under preparation.
Reference TypeConference Proceedings
Author(s)Peters, J.;Schaal, S.
Year2006
TitleLearning operational space control
Journal/Conference/Book TitleRobotics: Science and Systems (RSS 2006)
Keywordsoperational space control, redundancy, forward models, inverse models, compliance, reinforcement learning, locally weighted learning
AbstractWhile operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting learning problem is ill-defined as it requires learning an inverse mapping of a usually redundant system, which is well known to suffer from the property of non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. A first important insight for this paper is that, nevertheless, a physically correct solution to the inverse problem does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component for our work is based on a recent insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the view of machine learning, the learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward and that employs an expectation-maximization policy search algorithm. Evaluations on a three-degrees-of-freedom robot arm illustrate the feasibility of our suggested approach.
Place PublishedPhiladelphia, PA, Aug.16-19
PublisherCambridge, MA: MIT Press
Short TitleLearning operational space control
URL(s) http://www-clmc.usc.edu/publications/P/peters-RSS2006.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.;Schaal, S.
Year2006
TitleReinforcement Learning for Parameterized Motor Primitives
Journal/Conference/Book TitleProceedings of the 2006 International Joint Conference on Neural Networks (IJCNN)
Keywordsmotor primitives, reinforcement learning
AbstractOne of the major challenges in both action generation for robotics and in the understanding of human motor control is to learn the "building blocks of movement generation", called motor primitives. Motor primitives, as used in this paper, are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. While a lot of progress has been made in teaching parameterized motor primitives using supervised or imitation learning, the self-improvement by interaction of the system with the environment remains a challenging problem. In this paper, we evaluate different reinforcement learning approaches for improving the performance of parameterized motor primitives. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline both established and novel algorithms for the gradient-based improvement of parameterized policies. We compare these algorithms in the context of motor primitive learning, and show that our most modern algorithm, the Episodic Natural Actor-Critic, outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.
Short TitleReinforcement Learning for Parameterized Motor Primitives
URL(s) http://www-clmc.usc.edu/publications/P/peters-IJCNN2006.pdf
Reference TypeConference Proceedings
Author(s)Ting, J.;Mistry, M.;Nakanishi, J.;Peters, J.;Schaal, S.
Year2006
TitleA Bayesian approach to nonlinear parameter identification for rigid body dynamics
Journal/Conference/Book TitleRobotics: Science and Systems (RSS 2006)
KeywordsBayesian regression linear models dimensionality reduction input noise rigid body dynamics parameter identification
AbstractFor robots of increasing complexity such as humanoid robots, conventional identification of rigid body dynamics models based on CAD data and actuator models becomes difficult and inaccurate due to the large number of additional nonlinear effects in these systems, e.g., stemming from stiff wires, hydraulic hoses, protective shells, skin, etc. Data driven parameter estimation offers an alternative model identification method, but it is often burdened by various other problems, such as significant noise in all measured or inferred variables of the robot. The danger of physically inconsistent results also exists due to unmodeled nonlinearities or insufficiently rich data. In this paper, we address all these problems by developing a Bayesian parameter identification method that can automatically detect noise in both input and output data for the regression algorithm that performs system identification. A post-processing step ensures physically consistent rigid body parameters by nonlinearly projecting the result of the Bayesian estimation onto constraints given by positive definite inertia matrices and the parallel axis theorem. We demonstrate on synthetic and actual robot data that our technique performs parameter identification with 10 to 30% higher accuracy than traditional methods. Due to the resulting physically consistent parameters, our algorithm enables us to apply advanced control methods that algebraically require physical consistency on robotic platforms.
Place PublishedPhiladelphia, PA, Aug.16-19
PublisherCambridge, MA: MIT Press
Short TitleA Bayesian approach to nonlinear parameter identification for rigid body dynamics
URL(s) http://www-clmc.usc.edu/publications/T/ting-RSS2006.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.;Schaal, S.
Year2006
TitlePolicy gradient methods for robotics
Journal/Conference/Book TitleProceedings of the IEEE International Conference on Intelligent Robotics Systems (IROS 2006)
Keywordspolicy gradient methods, reinforcement learning, robotics
AbstractThe acquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots are ever to leave precisely pre-structured environments. However, to date only a few existing reinforcement learning methods have been scaled into the domains of high-dimensional robots such as manipulator, legged or humanoid robots. Policy gradient methods remain one of the few exceptions and have found a variety of applications. Nevertheless, the application of such methods is not without peril if done in an uninformed manner. In this paper, we give an overview of learning with policy gradient methods for robotics with a strong focus on recent advances in the field. We outline previous applications to robotics and show how the most recently developed methods can significantly improve learning performance. Finally, we evaluate our most promising algorithm in the application of hitting a baseball with an anthropomorphic arm.
Link to PDFhttp://www-clmc.usc.edu/publications/P/peters-IROS2006.pdf
Reference TypeConference Proceedings
Author(s)Nakanishi, J.;Cory, R.;Mistry, M.;Peters, J.;Schaal, S.
Year2005
TitleComparative experiments on task space control with redundancy resolution
Journal/Conference/Book TitleIEEE International Conference on Intelligent Robots and Systems (IROS 2005)
Keywordsmanipulator dynamics, redundant manipulators, space optimization, dynamical decoupling, humanoid robots, inverse kinematics, motor coordination, redundancy resolution, robot dynamics, seven-degree-of-freedom anthropomorphic robot arm, task space control
AbstractUnderstanding the principles of motor coordination with redundant degrees of freedom still remains a challenging problem, particularly for new research in highly redundant robots like humanoids. Even after more than a decade of research, task space control with redundancy resolution still remains an incompletely understood theoretical topic, and also lacks a larger body of thorough experimental investigation on complex robotic systems. This paper presents our first steps towards the development of a working redundancy resolution algorithm which is robust against modeling errors and unforeseen disturbances arising from contact forces. To gain a better understanding of the pros and cons of different approaches to redundancy resolution, we focus on a comparative empirical evaluation. First, we review several redundancy resolution schemes at the velocity, acceleration and torque levels presented in the literature in a common notational framework and also introduce some new variants of these previous approaches. Second, we present experimental comparisons of these approaches on a seven-degree-of-freedom anthropomorphic robot arm. Surprisingly, one of our simplest algorithms empirically demonstrates the best performance, even though, from a theoretical point of view, it does not share the same elegance as some of the other methods. Finally, we discuss practical properties of these control algorithms, particularly in light of inevitable modeling errors of the robot dynamics.
URL(s) http://www-clmc.usc.edu/publications/N/nakanishi-IROS2005.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.;Vijayakumar, S.;Schaal, S.
Year2005
TitleNatural Actor-Critic
Journal/Conference/Book TitleProceedings of the 16th European Conference on Machine Learning (ECML 2005)
KeywordsReinforcement Learning, Policy Gradients, Natural Gradients
AbstractThis paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari's natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression. We show that actor improvements with natural policy gradients are particularly appealing as these are independent of the coordinate frame of the chosen policy representation, and can be estimated more efficiently than regular policy gradients. The critic makes use of a special basis function parameterization motivated by the policy-gradient compatible function approximation. We show that several well-known reinforcement learning methods such as the original Actor-Critic and Bradtke's Linear Quadratic Q-Learning are in fact Natural Actor-Critic algorithms. Empirical evaluations illustrate the effectiveness of our techniques in comparison to previous methods, and also demonstrate their applicability for learning control on an anthropomorphic robot arm.
Link to PDFhttp://www-clmc.usc.edu/publications/P/peters-ECML2005.pdf
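The step that distinguishes the Natural Actor-Critic from `vanilla' gradient ascent can be sketched in a few lines: precondition the regular policy gradient with the inverse Fisher information matrix, which makes the update invariant to the chosen policy parameterization. The function below is an illustrative sketch assuming a Fisher matrix estimate is already available; the damping term is a common numerical safeguard added here, not part of the paper.

```python
import numpy as np

def natural_gradient(vanilla_grad, fisher, damping=1e-6):
    """Natural gradient step: solve F w = g rather than stepping along g.
    Illustrative sketch; `fisher` is assumed to be an estimate of the
    Fisher information matrix of the policy, and `damping` regularizes
    the solve for near-singular estimates."""
    F = fisher + damping * np.eye(len(vanilla_grad))
    return np.linalg.solve(F, vanilla_grad)
```

Intuitively, directions in which the policy distribution is very sensitive (large Fisher curvature) get scaled down, so equal-sized steps correspond to equal changes in the policy distribution rather than in the raw parameters.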
Reference TypeConference Proceedings
Author(s)Peters, J.;Mistry, M.;Udwadia, F. E.;Schaal, S.
Year2005
TitleA new methodology for robot control design
Journal/Conference/Book TitleThe 5th ASME International Conference on Multibody Systems, Nonlinear Dynamics, and Control (MSNDC 2005)
Keywordsrobot control, nonlinear control, gauss principle
AbstractGauss' principle of least constraint and its generalizations have provided useful insights for the development of tracking controllers for mechanical systems (Udwadia, 2003). Using this concept, we present a novel methodology for the design of a specific class of robot controllers. With our new framework, we demonstrate that well-known and also several novel nonlinear robot control laws can be derived from this generic framework, and show experimental verifications on a Sarcos Master Arm robot for some of these controllers. We believe that the suggested approach unifies and simplifies the design of optimal nonlinear control laws for robots obeying rigid body dynamics equations, both with or without external constraints, holonomic or nonholonomic constraints, with over-actuation or underactuation, as well as open-chain and closed-chain kinematics.
Link to PDFhttp://www-clmc.usc.edu/publications/P/peters-MSNDC2005.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.;Mistry, M.;Udwadia, F. E.;Cory, R.;Nakanishi, J.;Schaal, S.
Year2005
TitleA unifying framework for the control of robotics systems
Journal/Conference/Book TitleIEEE International Conference on Intelligent Robots and Systems (IROS 2005)
AbstractRecently, [1] suggested deriving tracking controllers for mechanical systems using a generalization of Gauss' principle of least constraint. This method allows us to reformulate control problems as a special class of optimal control. We take this line of reasoning one step further and demonstrate that well-known and also several novel nonlinear robot control laws can be derived from this generic methodology. We show experimental verifications on a Sarcos Master Arm robot for some of the derived controllers. We believe that the suggested approach offers a promising unification and simplification of nonlinear control law design for robots obeying rigid body dynamics equations, both with or without external constraints, with over-actuation or under-actuation, as well as open-chain and closed-chain kinematics.
Link to PDFhttp://www-clmc.usc.edu/publications/P/peters-IROS2005.pdf
Reference TypeConference Proceedings
Author(s)Schaal, S.;Peters, J.;Nakanishi, J.;Ijspeert, A.
Year2004
TitleLearning Movement Primitives
Journal/Conference/Book TitleInternational Symposium on Robotics Research (ISRR2003)
Keywordsmovement primitives, supervised learning, reinforcement learning, locomotion, phase resetting, learning from demonstration
AbstractThis paper discusses a comprehensive framework for modular motor control based on a recently developed theory of dynamic movement primitives (DMP). DMPs are a formulation of movement primitives with autonomous nonlinear differential equations, whose time evolution creates smooth kinematic control policies. Model-based control theory is used to convert the outputs of these policies into motor commands. By means of coupling terms, on-line modifications can be incorporated into the time evolution of the differential equations, thus providing a rather flexible and reactive framework for motor planning and execution. The linear parameterization of DMPs lends itself naturally to supervised learning from demonstration. Moreover, the temporal, scale, and translation invariance of the differential equations with respect to these parameters provides a useful means for movement recognition. A novel reinforcement learning technique based on natural stochastic policy gradients allows a general approach of improving DMPs by trial and error learning with respect to almost arbitrary optimization criteria. We demonstrate the different ingredients of the DMP approach in various examples, involving skill learning from demonstration on the humanoid robot DB, and learning biped walking from demonstration in simulation, including self-improvement of the movement patterns towards energy efficiency through resonance tuning.
URL(s) http://www-clmc.usc.edu/publications/S/schaal-ISRR2003.pdf
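The DMP formulation summarized in the abstract, a damped spring system pulled toward a goal plus a nonlinear forcing term driven by a decaying phase variable, can be sketched as a simple Euler integration. The gains and the canonical decay rate below are typical defaults assumed for illustration, not values taken from the paper; with zero forcing the system simply converges smoothly to the goal.

```python
def integrate_dmp(y0, goal, tau=1.0, alpha=25.0, beta=25.0 / 4,
                  dt=0.001, T=1.0, forcing=None):
    """Integrate a minimal discrete dynamic movement primitive:
        tau * z' = alpha * (beta * (g - y) - z) + f(x)
        tau * y' = z
    with the canonical phase x decaying exponentially. Illustrative
    sketch: beta = alpha/4 gives critical damping, and the canonical
    decay rate `ax` is an assumed default."""
    y, z, x = y0, 0.0, 1.0
    ax = 8.0  # canonical system decay rate (assumption)
    for _ in range(int(T / dt)):
        f = forcing(x) if forcing is not None else 0.0
        z += dt / tau * (alpha * (beta * (goal - y) - z) + f)
        y += dt / tau * z
        x += dt / tau * (-ax * x)
    return y
```

A learned forcing term (e.g., a weighted sum of radial basis functions of the phase) shapes the transient trajectory, while the spring-damper part guarantees the attractor property at the goal that the abstract emphasizes.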
Reference TypeConference Proceedings
Author(s)Peters, J.; Schaal, S.
Year2004
TitleLearning Motor Primitives with Reinforcement Learning
Journal/Conference/Book TitleProceedings of the 11th Joint Symposium on Neural Computation
Keywordsnatural policy gradients, motor primitives, natural actor-critic
AbstractOne of the major challenges in action generation for robotics and in the understanding of human motor control is to learn the "building blocks of movement generation," or more precisely, motor primitives. Recently, Ijspeert et al. [1, 2] suggested a novel framework for using nonlinear dynamical systems as motor primitives. While a lot of progress has been made in teaching these motor primitives using supervised or imitation learning, the self-improvement by interaction of the system with the environment remains a challenging problem. In this poster, we evaluate how different reinforcement learning approaches can be used to improve the performance of motor primitives. For pursuing this goal, we highlight the difficulties with current reinforcement learning methods, and outline how these lead to a novel algorithm which is based on natural policy gradients [3]. We compare this algorithm to previous reinforcement learning algorithms in the context of dynamic motor primitive learning, and show that it outperforms these by at least an order of magnitude. We demonstrate the efficiency of the resulting reinforcement learning method for creating complex behaviors for autonomous robotics. The studied behaviors include both discrete, finite tasks such as baseball swings, as well as complex rhythmic patterns as they occur in biped locomotion.
Place Publishedhttp://resolver.caltech.edu/CaltechJSNC:2004.poster020
Reference TypeConference Proceedings
Author(s)Mohajerian, P.;Peters, J.;Ijspeert, A.;Schaal, S.
Year2003
TitleA unifying computational framework for optimization and dynamic systems approaches to motor control
Journal/Conference/Book TitleProceedings of the 10th Joint Symposium on Neural Computation (JSNC 2003)
Keywordscomputational motor control, optimization, dynamic systems, formal modeling
AbstractTheories of biological motor control have been pursued from at least two separate frameworks, the "Dynamic Systems" approach and the "Control Theoretic/Optimization" approach. Control and optimization theory emphasize motor control based on organizational principles in terms of generic cost criteria like "minimum jerk", "minimum torque-change", "minimum variance", etc., while dynamic systems theory puts a larger focus on principles of self-organization in motor control, like synchronization, phase-locking, phase transitions, perception-action coupling, etc. Computational formalizations in both approaches have equally differed, using mostly time-indexed desired trajectory plans in control/optimization theory, and nonlinear autonomous differential equations in dynamic systems theory. Due to these differences in philosophy and formalization, optimization approaches and dynamic systems approaches have largely remained two separate research approaches in motor control, mostly conceived of as incompatible. In this poster, we present a novel formal framework for motor control that can harmoniously encompass both optimization and dynamic systems approaches. This framework is based on the discovery that almost arbitrary nonlinear autonomous differential equations can be acquired within a standard statistical (or neural network) learning framework without the need for tedious manual parameter tuning and the danger of entering unstable or chaotic regions of the differential equations. Both rhythmic (e.g., locomotion, swimming, etc.) and discrete (e.g., point-to-point reaching, grasping, etc.) movement can be modeled, either as single degree-of-freedom or multiple degree-of-freedom systems. Coupling parameters to the differential equations can create typical effects of self-organization in dynamic systems, while optimization approaches can be used numerically safely to improve the attractor landscape of the equations with respect to a given cost criterion, as demonstrated in modeling studies of several of the hallmarks of dynamic systems and optimization theory. We believe that this novel computational framework will allow a first step towards unifying dynamic systems and optimization approaches to motor control, and provide a set of principled modeling tools to both communities.
Place PublishedIrvine, CA, May 2003
Short TitleA unifying computational framework for optimization and dynamic systems approaches to motor control
URL(s) http://www-clmc.usc.edu/M/mohajerian-JSNC2003.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.;Vijayakumar, S.;Schaal, S.
Year2003
TitleReinforcement learning for humanoid robotics
Journal/Conference/Book TitleIEEE-RAS International Conference on Humanoid Robots (Humanoids2003)
Keywordsreinforcement learning, policy gradients, movement primitives, behaviors, dynamic systems, humanoid robotics
AbstractReinforcement learning offers one of the most general frameworks for taking traditional robotics towards true autonomy and versatility. However, applying reinforcement learning to high-dimensional movement systems like humanoid robots remains an unsolved problem. In this paper, we discuss different approaches to reinforcement learning in terms of their applicability to humanoid robotics. Methods can be coarsely classified into three different categories, i.e., greedy methods, `vanilla' policy gradient methods, and natural gradient methods. We discuss that greedy methods are not likely to scale into the domain of humanoid robotics as they are problematic when used with function approximation. `Vanilla' policy gradient methods, on the other hand, have been successfully applied on real-world robots, including at least one humanoid robot. We demonstrate that these methods can be significantly improved using the natural policy gradient instead of the regular policy gradient. A derivation of the natural policy gradient is provided, proving that the average policy gradient of Kakade (2002) is indeed the true natural gradient. A general algorithm for estimating the natural gradient, the Natural Actor-Critic algorithm, is introduced. This algorithm converges to the nearest local minimum of the cost function with respect to the Fisher information metric under suitable conditions. The algorithm outperforms non-natural policy gradients by far in a cart-pole balancing evaluation, and for learning nonlinear dynamic motor primitives for humanoid robot control. It offers a promising route for the development of reinforcement learning for truly high-dimensional continuous state-action systems.
Place PublishedKarlsruhe, Germany, Sept.29-30
Short TitleReinforcement learning for humanoid robotics
URL(s) http://www-clmc.usc.edu/publications/p/peters-ICHR2003.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.;Vijayakumar, S.;Schaal, S.
Year2003
TitleScaling reinforcement learning paradigms for motor learning
Journal/Conference/Book TitleProceedings of the 10th Joint Symposium on Neural Computation (JSNC 2003)
KeywordsReinforcement learning, neurodynamic programming, actorcritic methods, policy gradient methods, natural policy gradient
AbstractReinforcement learning offers a general framework to explain reward-related learning in artificial and biological motor control. However, current reinforcement learning methods rarely scale to high-dimensional movement systems and mainly operate in discrete, low-dimensional domains like game-playing, artificial toy problems, etc. This drawback makes them unsuitable for application to human or bio-mimetic motor control. In this poster, we look at promising approaches that can potentially scale and suggest a novel formulation of the actor-critic algorithm which takes steps towards alleviating the current shortcomings. We argue that methods based on greedy policies are not likely to scale into high-dimensional domains as they are problematic when used with function approximation, a must when dealing with continuous domains. We adopt the path of direct policy gradient based policy improvements since they avoid the problems of destabilizing dynamics encountered in traditional value iteration based updates. While regular policy gradient methods have demonstrated promising results in the domain of humanoid motor control, we demonstrate that these methods can be significantly improved using the natural policy gradient instead of the regular policy gradient. Based on this, it is proved that Kakade's "average natural policy gradient" is indeed the true natural gradient. A general algorithm for estimating the natural gradient, the Natural Actor-Critic algorithm, is introduced. This algorithm converges with probability one to the nearest local minimum of the cost function in Riemannian space. The algorithm outperforms non-natural policy gradients by far in a cart-pole balancing evaluation, and offers a promising route for the development of reinforcement learning for truly high-dimensional continuous state-action systems.
URL(s) http://www-clmc.usc.edu/publicatons/P/peters-JSNC2003.pdf
Reference TypeConference Proceedings
Author(s)Schaal, S.;Peters, J.;Nakanishi, J.;Ijspeert, A.
Year2003
TitleControl, planning, learning, and imitation with dynamic movement primitives
Journal/Conference/Book TitleWorkshop on Bilateral Paradigms on Humans and Humanoids, IEEE International Conference on Intelligent Robots and Systems (IROS 2003)
Keywordsmovement primitives, supervised learning, reinforcement learning, locomotion, phase resetting, learning from demonstration
AbstractIn both human and humanoid movement science, the topic of movement primitives has become central in understanding the generation of complex motion with high degree-of-freedom bodies. A theory of control, planning, learning, and imitation with movement primitives seems to be crucial in order to reduce the search space during motor learning and achieve a large level of autonomy and flexibility in dynamically changing environments. Movement recognition based on the same representations as used for movement generation, i.e., movement primitives, is equally intimately tied into these research questions. This paper discusses a comprehensive framework for motor control with movement primitives using a recently developed theory of dynamic movement primitives (DMP). DMPs are a formulation of movement primitives with autonomous nonlinear differential equations, whose time evolution creates smooth kinematic movement plans. Model-based control theory is used to convert such movement plans into motor commands. By means of coupling terms, on-line modifications can be incorporated into the time evolution of the differential equations, thus providing a rather flexible and reactive framework for motor planning and execution; indeed, DMPs form complete kinematic control policies, not just a particular desired trajectory. The linear parameterization of DMPs lends itself naturally to supervised learning from demonstrations. Moreover, the temporal, scale, and translation invariance of the differential equations with respect to these parameters provides a useful means for movement recognition. A novel reinforcement learning technique based on natural stochastic policy gradients allows a general approach of improving DMPs by trial and error learning with respect to almost arbitrary optimization criteria, including situations with delayed rewards. We demonstrate the different ingredients of the DMP approach in various examples, involving skill learning from demonstration on the humanoid robot DB and an application of learning simulated biped walking from a demonstrated trajectory, including self-improvement of the movement patterns in the spirit of energy efficiency through resonance tuning.
Link to PDFhttp://www-clmc.usc.edu/publications/S/schaal-IROS2003.pdf
Reference TypeConference Proceedings
Author(s)Vijayakumar, S.; D'Souza, A.; Peters, J.; Conradt, J.; Rutkowski, T.; Ijspeert, A.; Nakanishi, J.; Inoue, M.; Shibata, T.; Wiryo, A.; Itti, L.; Amari, S.; Schaal, S.
Year2002
TitleReal-Time Statistical Learning for Oculomotor Control and Visuomotor Coordination
Journal/Conference/Book TitleAdvances in Neural Information Processing Systems (NIPS), Demonstration Track
Reference TypeConference Proceedings
Author(s)Burdet, E.; Tee, K.P.; Chew, C.M.; Peters, J.; Bt, V.L.
Year2001
TitleHybrid IDM/Impedance Learning in Human Movements
Journal/Conference/Book TitleFirst International Symposium on Measurement, Analysis and Modeling of Human Functions Proceedings
Keywordshuman motor control
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/burdet_ISHF_2001.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/burdet_ISHF_2001.pdf
Reference TypeConference Proceedings
Author(s)Peters, J.; Riener, R.
Year2000
TitleA real-time model of the human knee for application in virtual orthopaedic trainer
Journal/Conference/Book TitleProceedings of the 10th International Conference on Biomedical Engineering (ICBME)
KeywordsBiomechanics, human motor control
URL(s) http://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ICBME_2000.pdf
Link to PDFhttp://www.ias.informatik.tu-darmstadt.de/uploads/Publications/peters_ICBME_2000.pdf
Reference TypeJournal Article
Author(s)Peters, J.
Year1998
TitleFuzzy Logic for Practical Applications
Journal/Conference/Book TitleKünstliche Intelligenz (KI)
Keywordsbook review
Number4
Pages60