Research Overview

Creating autonomous robots that can learn to assist humans in situations of daily life is a fascinating challenge for machine learning. While this aim has been a long-standing vision of artificial intelligence and the cognitive sciences, we have yet to achieve the first step of creating robots that can learn to accomplish many different tasks triggered by environmental context or higher-level instruction. The goal of our robot learning laboratory is to investigate the ingredients of such a general approach to motor skill learning and thereby move closer to human-like performance in robotics. We thus focus on solving basic problems in robotics while developing domain-appropriate machine-learning methods. Starting from theoretically well-founded approaches to representing the required control structures for task representation and execution, we replace the analytically derived modules with more flexible, learned ones.

This page presents an overview of our work on the following topics:

  1. Motor Skill Learning
  2. Imitation Learning
  3. Reinforcement Learning
  4. Robot Grasping and Manipulation
  5. Semi-Autonomous Assistive Robots and Interaction Learning
  6. Tactile Sensing for Manipulation
  7. Visual and Tactile Object Exploration
  8. Bayesian Optimization
  9. Biomimetic Robotics, Human Motor Control and Brain-Robot Interfaces
  10. Whole Body Movement and Robot Locomotion
  11. Modeling Human Motor Learning
  12. Spiking Neural Networks for Motor Control, Planning and Learning
  13. Learning Diverse Policies

 

Motor Skill Learning

In the context of Motor Skill Learning, we are interested in a variety of topics that can be classified by three layers of abstraction:

A) Learning to Execute: An essential problem in robotics is the accurate execution of desired movements using only low-gain control, such that the robot accomplishes the desired task without harming human beings in its environment. Following a trajectory with little feedback requires accurate prediction of the needed torques, which cannot be achieved with classical methods for sufficiently complex robots. Learning such models is hard, however, as the joint space can never be fully explored and the learning algorithm has to cope with a never-ending data stream in real time. We have developed learning methods for accomplishing tasks represented both in operational space and in joint space; a minimal code sketch of the underlying idea follows the references below. For more information on learning to execute see:

  1. Nguyen-Tuong, D.; Peters, J. (2011). Model Learning in Robotics: a Survey, Cognitive Processing, 12, 4.
  2. Nguyen-Tuong, D.; Seeger, M.; Peters, J. (2009). Model Learning with Local Gaussian Process Regression, Advanced Robotics, 23, 15, pp.2015-2034.
  3. Peters, J.; Schaal, S. (2008). Learning to control in operational space, International Journal of Robotics Research, 27, pp.197-212.
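
To make the idea concrete, here is a minimal sketch of learning an inverse dynamics model from data and using it as a feed-forward term in a low-gain PD controller. It assumes a single joint, random placeholder data, and a generic Gaussian process regressor; it illustrates the concept only and is not the local GP method of the references above.

```python
# Minimal sketch: learn an inverse dynamics model tau = f(q, qd, qdd) from data
# and use it as a feed-forward term in a low-gain PD controller.
# Single joint for readability; data, gains and model choice are placeholders.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical training data recorded on the robot: states and measured torques.
X_train = np.random.randn(500, 3)                        # columns: position, velocity, acceleration
tau_train = np.sin(X_train[:, 0]) + 0.1 * X_train[:, 2]  # stand-in for measured torques

model = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
model.fit(X_train, tau_train)

def control(q, qd, q_des, qd_des, qdd_des, kp=5.0, kd=1.0):
    """Low-gain PD feedback plus learned feed-forward torque."""
    tau_ff = model.predict(np.array([[q_des, qd_des, qdd_des]]))[0]
    return tau_ff + kp * (q_des - q) + kd * (qd_des - qd)
```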

B) Learning new Elementary Tasks: While learning to execute tasks is an essential component of a motor skill learning framework, learning the actual task is of even higher importance. We focus on learning elementary tasks or movement primitives, which are parameterized task representations based on nonlinear differential equations with desired attractor properties. Mimicking how children learn new motor skills, we use imitation learning to initialize these movement primitives and reinforcement learning to subsequently improve task performance. We have learned tasks such as Ball-in-a-Cup or bouncing a ball on a string with this approach. In hitting and batting tasks, movement templates with a learned global shape need to be adapted during execution so that the racket reaches a target position and velocity that returns the ball to the other side of the net or court; this requires a reformulation of motor primitives into hitting primitives. A key motor skill for manipulating the environment is grasping, which is why one of our research goals is adapting machine learning algorithms to the robot grasping domain. For grasping or hitting, several alternative motor primitives might be available. A compact sketch of a dynamic movement primitive follows the references below.

  1. Peters, J.; Schaal, S. (2008). Reinforcement learning of motor skills with policy gradients, Neural Networks, 21, 4, pp.682-97.
  2. Kober, J.; Peters, J. (2011). Policy Search for Motor Primitives in Robotics, Machine Learning, 84, 1-2, pp.171-203.
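
The following is a compact sketch of a discrete dynamic movement primitive: a stable point attractor whose shape is modulated by a learnable forcing term. The gains, basis-function layout and weights are illustrative placeholders rather than the exact formulation used in the publications above.

```python
# Minimal sketch of a discrete dynamic movement primitive (DMP):
# a stable point attractor modulated by a learnable forcing term.
import numpy as np

class DMP:
    def __init__(self, n_basis=20, alpha=25.0, beta=6.25, alpha_x=3.0):
        self.n_basis, self.alpha, self.beta, self.alpha_x = n_basis, alpha, beta, alpha_x
        self.c = np.exp(-alpha_x * np.linspace(0, 1, n_basis))  # basis centres in phase space
        self.h = n_basis / self.c                               # basis widths (heuristic)
        self.w = np.zeros(n_basis)                              # weights set by imitation / RL

    def rollout(self, y0, g, tau=1.0, dt=0.01, T=1.0):
        y, yd, x, traj = y0, 0.0, 1.0, []
        for _ in range(int(T / dt)):
            psi = np.exp(-self.h * (x - self.c) ** 2)
            f = (psi @ self.w) / (psi.sum() + 1e-10) * x * (g - y0)  # forcing term
            ydd = (self.alpha * (self.beta * (g - y) - yd) + f) / tau
            yd += ydd * dt
            y += yd * dt
            x += -self.alpha_x * x / tau * dt                        # canonical system
            traj.append(y)
        return np.array(traj)
```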

C) Learning to Compose Complex Tasks: Most complex tasks require several motor primitives to be executed in parallel or in sequence. The selection and composition of motor primitives requires a perceptuo-motor perspective and is necessary for learning complex tasks. An example of a complex task that requires motor primitive selection and hitting primitives is learning to play ping-pong. Moving towards learning complex tasks requires solving a variety of hard problems. Among these are the decomposition of large tasks into movement primitives (MPs), the acquisition and self-improvement of MPs, the determination of the number of MPs in a data set, the determination of the relevant task space, perceptual context estimation and goal learning for MPs, as well as the composition of MPs for new complex tasks. We tackle these questions in order to make progress towards fast and general motor skill learning for robotics. A simple sketch of context-based primitive selection follows the references below.

  1. Peters, J.; Kober, J.; Muelling, K.; Kroemer, O.; Neumann, G. (2013). Towards Robot Skill Learning: From Simple Skills to Table Tennis, Proceedings of the European Conference on Machine Learning (ECML), Nectar Track.
  2. Muelling, K.; Kober, J.; Kroemer, O.; Peters, J. (2013). Learning to Select and Generalize Striking Movements in Robot Table Tennis, International Journal of Robotics Research, 32, 3, pp.263-279.
  3. Lioutikov, R.; Neumann, G.; Maeda, G.J.; Peters, J. (2015). Probabilistic Segmentation Applied to an Assembly Task, Proceedings of the International Conference on Humanoid Robots (HUMANOIDS).
  4. Kober, J.; Wilhelm, A.; Oztop, E.; Peters, J. (2012). Reinforcement Learning to Adjust Parametrized Motor Primitives to New Situations, Autonomous Robots, 33, 4, pp.361-379, Springer US.
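
As a simplistic illustration of selecting among alternative primitives, the sketch below uses a kernel gating function over context prototypes (e.g., the predicted ball state). The prototypes and bandwidth are hypothetical placeholders; the actual mixture-of-movement-primitives approach used in the table tennis work above is more involved.

```python
# Sketch: select among alternative motor primitives via a kernel gating function.
# The primitives, prototypes and bandwidth are illustrative placeholders.
import numpy as np

def gating_weights(context, prototypes, bandwidth=1.0):
    """Soft responsibilities of each primitive for the current context."""
    d2 = np.sum((prototypes - context) ** 2, axis=1)
    w = np.exp(-0.5 * d2 / bandwidth ** 2)
    return w / (w.sum() + 1e-10)

prototypes = np.array([[0.2, 1.0], [0.8, -0.5], [0.5, 0.3]])  # e.g. ball position/velocity per primitive
context = np.array([0.75, -0.4])                              # currently perceived context
weights = gating_weights(context, prototypes)
selected = int(np.argmax(weights))                            # execute (or blend) the best-matching primitive
```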

Selected Related Research Topics: learning models for control, learning operational space control, reinforcement learning, learning motor primitives, learning complex tasks, learning to grasp, learning to play ping-pong, motor primitives for hitting, brain-robot interfaces

Imitation Learning

Research in robotics and artificial intelligence has led to the development of complex robots such as humanoids and androids. In order to be meaningfully applied in human-inhabited environments, robots need to possess a variety of physical abilities and skills. However, programming such skills is a labour- and time-intensive task that requires a large amount of expert knowledge. In particular, it often involves transforming intuitive concepts of motions and actions into formal mathematical descriptions and algorithms.

To overcome such difficulties, we use imitation learning to teach robots new motor skills. A human demonstrator first provides one or several examples of the skill. Information recorded through motion capture or physical interaction is used by the robot to automatically generate a controller that can replicate the observed movements, using modern machine learning techniques. Imitation learning also allows robots to improve upon the observed behavior. This so-called self-improvement helps the robot adapt the learned movement to the characteristics of its own body or to the requirements of the current context. Hence, even if the examples presented by the human are not optimal, the robot can still use them to bootstrap its behavior.

At IAS, imitation learning has already been used to teach complex motor skills to various kinds of robots, including locomotion, grasping of novel objects, ping-pong, ball-in-a-cup and tetherball. We are developing new machine learning methods that reduce the time needed to acquire a motor skill. The goal of this research is to have intelligent robots that can autonomously enlarge their repertoire of skills by observing or interacting with human teachers.

Additionally, we recently developed a novel movement primitive representation well suited for imitation learning. The new approach, called Probabilistic Movement Primitives (ProMPs), allows for blending between motions, adapting to altered task variables, and co-activating multiple primitives in parallel. It is based on a probabilistic formulation of the movement primitive concept that maintains a distribution over trajectories, from which new operations can be derived that implement all of the aforementioned properties in one framework. In order to use such a trajectory distribution for robot movement control, we analytically derive a stochastic feedback controller that reproduces the given trajectory distribution. We have evaluated our approach and compared it to existing methods on several simulated as well as real robot scenarios. The sketch below illustrates the central operation of conditioning a ProMP on a via-point.
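
A minimal numerical sketch of this conditioning operation, assuming a one-dimensional trajectory, hand-picked radial basis functions, and a placeholder prior over weights (in practice the prior is learned from demonstrations):

```python
# Sketch: conditioning a probabilistic movement primitive (ProMP) on a via-point.
# The trajectory is modelled as y_t = phi(t)^T w with w ~ N(mu_w, Sigma_w);
# basis functions and the prior are placeholders rather than learned quantities.
import numpy as np

def basis(t, n_basis=10, width=0.02):
    c = np.linspace(0, 1, n_basis)
    phi = np.exp(-0.5 * (t - c) ** 2 / width)
    return phi / phi.sum()

n = 10
mu_w = np.zeros(n)                        # prior over weights (would be learned from demonstrations)
Sigma_w = np.eye(n)
t_star, y_star, sigma_y = 0.5, 1.2, 1e-3  # desired via-point at phase 0.5

phi = basis(t_star)                       # conditioning = Gaussian posterior over the weights
K = Sigma_w @ phi / (phi @ Sigma_w @ phi + sigma_y)
mu_post = mu_w + K * (y_star - phi @ mu_w)
Sigma_post = Sigma_w - np.outer(K, phi @ Sigma_w)

mean_traj = [basis(t) @ mu_post for t in np.linspace(0, 1, 100)]  # adapted mean trajectory
```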

Key References

  1. Kober, J.; Peters, J. (2010). Imitation and Reinforcement Learning - Practical Algorithms for Motor Primitive Learning in Robotics, IEEE Robotics and Automation Magazine, 17, 2, pp.55-62.
  2. Paraschos, A.; Daniel, C.; Peters, J.; Neumann, G. (2013). Probabilistic Movement Primitives, Advances in Neural Information Processing Systems (NIPS), MIT Press.
  3. Paraschos, A.; Neumann, G.; Peters, J. (2013). A Probabilistic Approach to Robot Trajectory Generation, Proceedings of the International Conference on Humanoid Robots (HUMANOIDS).
  4. Boularias, A.; Kober, J.; Peters, J. (2011). Relative Entropy Inverse Reinforcement Learning, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011).
  5. Arenz, O.; Abdulsamad, H.; Neumann, G. (2016). Optimal Control and Inverse Optimal Control by Distribution Matching, Proceedings of the International Conference on Intelligent Robots and Systems (IROS).

Contact: Jan Peters, Gerhard Neumann, Abdeslam Boularias, Jens Kober, Katharina Muelling, Alexandros Paraschos, Oleg Arenz
Download: Flyer as PDF

Reinforcement Learning

Supervised learning is not always sufficient for motor learning problems, in part because an expert teacher or an idealized version of the behavior is often not available. One of our goals is therefore the development of reinforcement learning methods that scale to the dimensionality of humanoid robots and can generate actions for seven or more degrees of freedom.

Efficient reinforcement learning for continuous states and actions is essential for robotics and control. We follow two approaches depending on the dimensionality of the domain. For high-dimensional state and action spaces, it is often easier to learn policies directly without estimating accurate system models. The resulting algorithms are parametric policy search algorithms inspired by expectation-maximization methods or information-theoretic policy updates, and they can be employed for motor primitive learning. Information-theoretic policy updates appear particularly well suited for the high-dimensional problems that occur in robotics, as the update rules yield smooth and robust policy updates that always `stay close to the data'. If we can learn a model of the robot and its environment, we can employ model-based reinforcement learning to drastically improve the sample efficiency of policy search. As a result, these methods can learn good policies rapidly with only little interaction with the system.

Our general goal in reinforcement learning is the development of methods that scale to the dimensionality of humanoid robots. Such high dimensionality poses a tremendous challenge for reinforcement learning, as a complete exploration of the underlying state-action space is impossible and few existing techniques scale to this domain. We therefore rely on a combination of watching a teacher and subsequent self-improvement. In more technical terms: first, a control policy is obtained by imitation and then improved using reinforcement learning.

Parametrized policies allow for an efficient abstraction of the high-dimensional continuous action spaces that are common in robotics. We can directly optimize the parameters of a primitive with policy search methods. Members of IAS have developed a variety of novel algorithms in this context, which have been applied to learning to play table tennis, tetherball, the game ball-in-a-cup, and darts.

A) Natural Actor-Critic (NAC): The NAC is currently considered the most efficient policy gradient method. It exploits the fact that a natural gradient usually outperforms a vanilla gradient; a toy sketch of this idea follows the references below. For more information read:

  1. Peters, J.; Schaal, S. (2008). Reinforcement learning of motor skills with policy gradients, Neural Networks, 21, 4, pp.682-97.
  2. Peters, J.; Schaal, S. (2008). Natural actor critic, Neurocomputing, 71, 7-9, pp.1180-1190.
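
The toy sketch below illustrates the core idea on a trivial problem: the vanilla (REINFORCE-style) gradient of a Gaussian search distribution is preconditioned with the inverse Fisher information matrix. It is a conceptual illustration only, not the full Natural Actor-Critic algorithm of the references above.

```python
# Toy sketch of a natural policy gradient step for a Gaussian (exploration) policy.
# The vanilla gradient is preconditioned with the inverse Fisher matrix.
import numpy as np

def natural_gradient_step(theta, sigma, reward_fn, n_samples=100, lr=0.1):
    actions = theta + sigma * np.random.randn(n_samples, theta.size)
    rewards = np.array([reward_fn(a) for a in actions])
    # REINFORCE-style vanilla gradient of the expected reward w.r.t. the mean theta
    grad_log = (actions - theta) / sigma ** 2
    vanilla = grad_log.T @ (rewards - rewards.mean()) / n_samples
    fisher = np.eye(theta.size) / sigma ** 2       # Fisher information of the Gaussian mean
    natural = np.linalg.solve(fisher, vanilla)     # natural gradient = F^{-1} * vanilla gradient
    return theta + lr * natural

theta = np.zeros(2)
for _ in range(50):                                # toy quadratic reward with optimum at (1, 1)
    theta = natural_gradient_step(theta, 0.3, lambda a: -np.sum((a - 1.0) ** 2))
```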

B) EM-like Reinforcement Learning: We formulated policy search as an inference problem, which has led to efficient algorithms such as reward-weighted regression and PoWER; a minimal sketch of the resulting update follows the references below. For more information read:

  1. Peters, J.; Schaal, S. (2007). Reinforcement learning by reward-weighted regression for operational space control, Proceedings of the International Conference on Machine Learning (ICML 2007).
  2. Kober, J.; Peters, J. (2009). Policy Search for Motor Primitives in Robotics, Advances in Neural Information Processing Systems 22 (NIPS 2008), Cambridge, MA: MIT Press.
  3. Neumann, G.; Peters, J. (2009). Fitted Q-iteration by Advantage Weighted Regression, Advances in Neural Information Processing Systems 22 (NIPS 2008), Cambridge, MA: MIT Press.
  4. Neumann, G. (2011). Variational Inference for Policy Search in Changing Situations, Proceedings of the International Conference on Machine Learning (ICML 2011).
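
A minimal sketch of such an EM-like update: sampled primitive parameters are re-weighted by an exponential transformation of their rewards and averaged, in the spirit of reward-weighted regression and PoWER. The temperature and the toy reward below are placeholders.

```python
# Sketch of an EM-like policy update in the spirit of reward-weighted regression / PoWER:
# sampled parameters are re-weighted by transformed rewards and averaged.
import numpy as np

def em_policy_update(theta, sigma, reward_fn, n_samples=50, temperature=5.0):
    samples = theta + sigma * np.random.randn(n_samples, theta.size)
    rewards = np.array([reward_fn(s) for s in samples])
    w = np.exp(temperature * (rewards - rewards.max()))    # soft-max weighting of the rollouts
    return (w[:, None] * samples).sum(axis=0) / w.sum()    # weighted maximum-likelihood step

theta = np.zeros(3)
for _ in range(30):                                        # toy reward: reach a target parameter vector
    theta = em_policy_update(theta, 0.2, lambda p: -np.linalg.norm(p - np.array([1.0, 0.0, -1.0])))
```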

C) Information-theoretic Policy Search: The optimization in policy search can rapidly change the control policy, which might lead to jumps in the policy as well as in the resulting trajectory distribution. While such behavior can already be dangerous on a real robot, it may also lead to premature convergence to suboptimal solutions or even oscillations. Information-theoretic policy search addresses this problem by bounding the relative entropy (KL divergence) between two subsequent policies. Hence, the new policy always stays close to the `data' generated by the old policy while maximizing the reward locally. Such policy updates are mathematically sound and allow the derivation of a whole range of new algorithms, including contextual policy search and learning hierarchical policies. A sketch of the resulting sample re-weighting step follows the references below. For more information read:

  1. Peters, J.; Muelling, K.; Altun, Y. (2010). Relative Entropy Policy Search, Proceedings of the Twenty-Fourth National Conference on Artificial Intelligence (AAAI), Physically Grounded AI Track.
  2. Daniel, C.; Neumann, G.; Peters, J. (2012). Hierarchical Relative Entropy Policy Search, Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS 2012).
  3. Kupcsik, A.G.; Deisenroth, M.P.; Peters, J.; Ai Poh, L.; Vadakkepat, V.; Neumann, G. (2015). Model-based Contextual Policy Search for Data-Efficient Generalization of Robot Skills, Artificial Intelligence.
  4. van Hoof, H.; Peters, J.; Neumann, G. (2015). Learning of Non-Parametric Control Policies with High-Dimensional State Features, Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS).
  5. Parisi, S.; Abdulsamad, H.; Paraschos, A.; Daniel, C.; Peters, J. (2015). Reinforcement Learning vs Human Programming in Tetherball Robot Games, Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS).
  6. Akrour, R.; Abdolmaleki, A.; Abdulsamad, H.; Neumann, G. (2016). Model-Free Trajectory Optimization for Reinforcement Learning, Proceedings of the International Conference on Machine Learning (ICML).
  7. Tangkaratt, V.; van Hoof, H.; Parisi, S.; Neumann, G.; Peters, J.; Sugiyama, M. (2017). Policy Search with High-Dimensional Context Variables, Proceedings of the AAAI Conference on Artificial Intelligence (AAAI).
  8. End, F.; Akrour, R.; Peters, J.; Neumann, G. (2017). Layered Direct Policy Search for Learning Hierarchical Skills, Proceedings of the International Conference on Robotics and Automation (ICRA).
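
The sketch below illustrates the episodic re-weighting step: a temperature is found by minimizing a dual function so that the re-weighted samples respect a KL bound with respect to the old policy. It is a simplified, feature-free variant for illustration only, not the full algorithms in the references above.

```python
# Sketch of the episodic Relative Entropy Policy Search (REPS) weighting step:
# the temperature eta is found by minimizing the dual so that the re-weighted
# sample distribution stays within a KL bound epsilon of the old one.
import numpy as np
from scipy.optimize import minimize

def reps_weights(rewards, epsilon=0.5):
    R = rewards - rewards.max()                  # shift for numerical stability
    def dual(eta):
        eta = max(eta, 1e-6)
        return eta * epsilon + eta * np.log(np.mean(np.exp(R / eta)))
    eta = minimize(lambda e: dual(e[0]), x0=[1.0], bounds=[(1e-6, None)]).x[0]
    w = np.exp(R / eta)
    return w / w.sum()                           # weights for the subsequent policy update

rewards = np.random.randn(100)                   # hypothetical episode returns
weights = reps_weights(rewards)                  # used e.g. in a weighted maximum-likelihood fit
```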

For an overview of these approaches, please also consult the following survey papers by IAS members:

  1. Deisenroth, M.P.; Neumann, G.; Peters, J. (2013). A Survey on Policy Search for Robotics, Foundations and Trends in Robotics, 21, pp.388-403.
  2. Kober, J.; Bagnell, D.; Peters, J. (2013). Reinforcement Learning in Robotics: A Survey, International Journal of Robotics Research, 32, 11, pp.1238-1274.

Contact: Jan Peters, Marc Deisenroth, Gerhard Neumann, Jens Kober, Christian Daniel, Abdeslam Boularias, Simone Parisi, Joni Pajarinen, Herke van Hoof, Takayuki Osa, Riad Akrour
Download: Flyer as PDF

Robot Grasping and Manipulation

A key long-term goal of robotics is to create autonomous robots that can perform a wide range of tasks to help humans in daily life. One of the main requirements of such domestic and service robots is the ability to manipulate a wide range of objects in their surroundings.

Modern industrial robots usually only have to manipulate a single set of identical objects, using preprogrammed actions. However, future service robots will need to operate in unstructured environments and perform tasks with novel objects. These robots will therefore need to learn to optimize their actions to specific objects, as well as generalize their actions between objects.

A) Improving Grasps through Experience: The ability to grasp objects is an important prerequisite for performing various manipulation tasks. Using trial and error, a robot can autonomously optimize its grasps of objects. In particular, the grasp selection process can be framed as a continuum-armed bandit reinforcement learning problem: the robot actively balances executing grasps that are known to be good and exploring new grasps that may be better. A simplified bandit sketch follows the references below.

  1. Kroemer, O.; Detry, R.; Piater, J.; Peters, J. (2010). Combining Active Learning and Reactive Control for Robot Grasping, Robotics and Autonomous Systems, 58, 9, pp.1105-1116.
  2. Boularias, A.; Kroemer, O.; Peters, J. (2011). Learning Robot Grasping from 3D Images with Markov Random Fields, IEEE/RSJ International Conference on Intelligent Robot Systems (IROS).
  3. Ben Amor, H.; Kroemer, O.; Hillenbrand, U.; Neumann, G.; Peters, J. (2012). Generalization of Human Grasping for Multi-Fingered Robot Hands, Proceedings of the International Conference on Robot Systems (IROS).
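
As a simplified illustration, the sketch below treats grasp selection as a discrete-armed bandit with an upper-confidence-bound rule; the cited work uses a continuum-armed bandit over grasp parameters, and the grasp outcomes here are simulated placeholders.

```python
# Simplified sketch of grasp selection as a bandit problem: an upper-confidence-bound
# rule balances re-using grasps known to work and exploring untried ones.
import numpy as np

def select_grasp(successes, attempts, exploration=1.0):
    t = attempts.sum() + 1
    means = successes / np.maximum(attempts, 1)
    bonus = exploration * np.sqrt(np.log(t) / np.maximum(attempts, 1))
    return int(np.argmax(means + bonus))             # grasp candidate to execute next

successes = np.zeros(10)                             # hypothetical set of 10 grasp candidates
attempts = np.zeros(10)
for _ in range(100):
    g = select_grasp(successes, attempts)
    outcome = np.random.rand() < 0.3 + 0.05 * g      # stand-in for the real grasp outcome
    attempts[g] += 1
    successes[g] += outcome
```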

B) Affordance Learning: An object's affordances are the actions that the robot can perform using the object. The affordances of basic objects, such as tools, are usually defined by their surface structures. By finding similar surface structures in different objects, the robot can transfer its knowledge of afforded actions between objects. In this manner, the robot can predict whether a novel object affords a specific action, as well as adapt the action to this object.

The robot can learn an initial action from a human demonstration. Adapting this action to new objects is achieved autonomously by the robot, using a trial-and-error learning approach.

  1. Piater, J.; Jodogne, S.; Detry, R.; Kraft, D.; Krueger, N.; Kroemer, O.; Peters, J. (2011). Learning Visual Representations for Perception-Action Systems, International Journal of Robotics Research, 30, 3, pp.294-307.
  2. Kroemer, O.; Ugur, E.; Oztop, E.; Peters, J. (2012). A Kernel-based Approach to Direct Action Perception, Proceedings of the International Conference on Robotics and Automation (ICRA).

C) Multi-Phase Manipulations: Manipulation tasks can be decomposed into phases, wherein the robot's actions have distinct effects depending on the current phase. In order to perform a task, the robot will first need to reach a phase that affords the desired manipulation, which may require transitioning through other phases first. The phases thus define a sequence of subtasks for the robot to complete in order to manipulate different parts of the environment.

Our research has focused on developing methods for learning the phase structure of tasks, as well as learning manipulation skills for transitioning between the phases. The robot first learns the conditions for transitioning between phases and then optimizes its motor skills accordingly. In order to learn versatile manipulation skills, we have also investigated representations for generalizing phase transitions between different objects. As part of our research, the Darias robot has learned the phases of a variety of tasks, including stacking objects, pouring, two-handed grasping, and turning a pepper mill.

  1. Kroemer, O.; Daniel, C.; Neumann, G.; van Hoof, H.; Peters, J. (2015). Towards Learning Hierarchical Skills for Multi-Phase Manipulation Tasks, Proceedings of the International Conference on Robotics and Automation (ICRA).
  2. Kroemer, O.; van Hoof, H.; Neumann, G.; Peters, J. (2014). Learning to Predict Phases of Manipulation Tasks as Hidden States, Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA).
  3. Kroemer, O.; Peters, J. (2014). Predicting Object Interactions from Contact Distributions, Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS).
  4. Brandl, S.; Kroemer, O.; Peters, J. (2014). Generalizing Pouring Actions Between Objects using Warped Parameters, Proceedings of the International Conference on Humanoid Robots (HUMANOIDS).

Contact: Jan Peters, Oliver Kroemer, Abdeslam Boularias, Herke van Hoof, Joni Pajarinen, Takayuki Osa
Download: Flyer as PDF

Semi-Autonomous Assistive Robots and Interaction Learning

Semi-autonomous robots are robots that physically interact with a human partner in order to achieve a task in a collaborative manner. Fundamental research in semi-autonomous robotics has potential applications in a variety of scenarios where humans need assistance: from the assembly of products in factories, to the aid of the elderly at home, to the control of actuated prosthetics, to the shared control in repetitive tele-operated processes.

Research for semi-autonomous robots needs to go beyond the usual sense-plan-act paradigm to account for the interaction with humans. Sensing and perception must involve the observation and interpretation of the human movement. Planning has to take into account the inferred human intent and, from a possibly limited repertoire of robot skills, search for an appropriate combination of actions. Acting for collaboration requires algorithms that model and predict human trajectories and generate safe and appropriate commands for both human and robot.

One of the challenges that we are currently addressing is related to the fact that humans usually execute a variety of unforeseen tasks; hence pre-programming a robot for all possible tasks is infeasible. We approach this problem by investigating new algorithms that allow a robot: (1) to learn and maintain a dynamic repertoire of skills and to assess its own ability to assist the human in accomplishing a given task, and (2) to make requests for human demonstration in order to acquire a new skill via imitation and reinforcement learning.

We also investigate ways to model collaborative interaction. Such a model is used to predict the intention and movement of the human while simultaneously generating collaborative control actions for the robot. Our current approach leverages probabilistic tools to realize interaction primitives: a prior model of the interaction is encoded as a joint distribution over the human and robot movements, from which a posterior distribution of the interaction can be inferred by observing only the human. The sketch below illustrates this conditioning step.
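
A minimal sketch of this conditioning, assuming the human and robot motions are summarized by low-dimensional parameter vectors and that their joint Gaussian prior has already been learned from demonstrations (all quantities below are placeholders):

```python
# Sketch of the interaction-primitive idea: a joint Gaussian over human and robot
# trajectory parameters is conditioned on the observed human part to infer the
# robot's collaborative motion.
import numpy as np

d_h, d_r = 5, 5                                   # human / robot parameter dimensions (placeholders)
mu = np.zeros(d_h + d_r)                          # joint prior, in practice learned from demonstrations
Sigma = np.eye(d_h + d_r) + 0.5 * np.ones((d_h + d_r, d_h + d_r))

w_h_obs = np.random.randn(d_h)                    # parameters fitted to the observed human motion

S_hh = Sigma[:d_h, :d_h]
S_rh = Sigma[d_h:, :d_h]
mu_r_post = mu[d_h:] + S_rh @ np.linalg.solve(S_hh, w_h_obs - mu[:d_h])
Sigma_r_post = Sigma[d_h:, d_h:] - S_rh @ np.linalg.solve(S_hh, S_rh.T)
# mu_r_post parameterizes the robot's predicted collaborative trajectory.
```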

We validate and demonstrate our algorithms using real robotic systems, with the milestones of the 3rd Hand Project (visit the website here, download the brochure here) as one of our driving applications. You can watch our latest results here.

  1. Maeda, G.; Neumann, G.; Ewerton, M.; Lioutikov, R.; Peters, J. (2015). A Probabilistic Framework for Semi-Autonomous Robots Based on Interaction Primitives with Phase Estimation, Proceedings of the International Symposium of Robotics Research (ISRR).
  2. Ewerton, M.; Maeda, G.J.; Peters, J.; Neumann, G. (2015). Learning Motor Skills from Partially Observed Movements Executed at Different Speeds, Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS), pp.456-463.
  3. Ewerton, M.; Neumann, G.; Lioutikov, R.; Ben Amor, H.; Peters, J.; Maeda, G. (2015). Learning Multiple Collaborative Tasks with a Mixture of Interaction Primitives, Proceedings of the International Conference on Robotics and Automation (ICRA), pp.1535-1542. Finalist: Best Paper, Best Student Paper, and Best Service Robotics Paper.
  4. Maeda, G.J.; Ewerton, M.; Lioutikov, R.; Ben Amor, H.; Peters, J.; Neumann, G. (2014). Learning Interaction for Collaborative Tasks with Probabilistic Movement Primitives, Proceedings of the International Conference on Humanoid Robots (HUMANOIDS), pp.527-534.
  5. Ben Amor, H.; Neumann, G.; Kamthe, S.; Kroemer, O.; Peters, J. (2014). Interaction Primitives for Human-Robot Cooperation Tasks, Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA).

Contacts: Guilherme Maeda, Rudolf Lioutikov, Oliver Kroemer, Jan Peters

Tactile Sensing for Manipulation

Many current robots lack fine manipulation skills. A major reason is that most industrial robotic arm-hand systems do not receive sufficient feedback about the contact with the object. Neuroscience has shown that such feedback from tactile sensors is a critical component of the human ability to perform such tasks. Therefore, we aim to equip our robots with tactile sensors and use the sensor feedback for robot control. Research issues include how to interpret the signals from the sensors (which are often noisy and high-dimensional) and how to include such signals in a robot control loop. Among other directions, we address these issues using supervised, unsupervised, and reinforcement learning techniques; a small sketch of learned slip prediction follows the references below.

  1. Veiga, F.F.; van Hoof, H.; Peters, J.; Hermans, T. (2015). Stabilizing Novel Objects by Learning to Predict Tactile Slip, Proceedings of the IEEE/RSJ Conference on Intelligent Robots and Systems (IROS).
  2. van Hoof, H.; Hermans, T.; Neumann, G.; Peters, J. (2015). Learning Robot In-Hand Manipulation with Tactile Features, Proceedings of the International Conference on Humanoid Robots (HUMANOIDS).
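
As a rough illustration, the sketch below trains a generic supervised classifier to predict slip from short windows of tactile readings; the features, labels and sensor layout are placeholders, and the references above should be consulted for the actual approach.

```python
# Sketch: predicting slip from short windows of tactile sensor readings with a
# generic supervised classifier. Data and sensor layout are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

n_windows, n_taxels, window = 1000, 24, 10
X = np.random.randn(n_windows, n_taxels * window)    # flattened tactile time windows
y = np.random.randint(0, 2, n_windows)               # 1 = slip occurred shortly afterwards

clf = RandomForestClassifier(n_estimators=100).fit(X, y)
p_slip = clf.predict_proba(X[:1])[0, 1]              # could be used to trigger a grip-force increase
```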

Contact: Herke van Hoof, Filipe Veiga, Daniel Tanneberg, Elmar Rückert, Jan Peters

Visual and Tactile Object Exploration

A robot needs to be aware of the properties of objects to efficiently perform tasks with them. Currently, most robots are provided with this object information by a programmer. However, autonomous robots working in service industries and domestic settings will need to perform tasks with novel objects. Hence, relying on predefined object knowledge will not be an option.

Instead, robots will have to learn about objects by exploring their properties through physical interactions, such as pushing, stroking, and lifting. As random exploration is an inefficient approach, we develop methods for efficiently gathering information.

The fundamental knowledge learned about objects and primitive actions will later on form the basis for learning complex behaviors and predicting the properties of novel objects. By discovering accurate representations of objects, the robot will be able to plan and execute manipulations more precisely.

A) Learning Tactile Sensing using Vision: The textures of object surfaces can be observed both by visual inspection and by sliding a dynamic tactile sensor across the surface. The robot can combine these types of sensor readings to determine which components of the data contain information pertaining to the texture.

In particular, the robot can find the components that are maximally correlated between the two sensor modalities. Given the relevant data components, the robot can create a compact representation of object textures, which allows it to distinguish between surfaces more accurately. A small sketch of this correlation analysis follows the reference below.

  1. Kroemer, O.; Lampert, C.H.; Peters, J. (2011). Learning Dynamic Tactile Sensing with Robust Vision-based Training, IEEE Transactions on Robotics (T-Ro), 27, 3, pp.545-557.
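
A small sketch of this idea using canonical correlation analysis (CCA) to extract maximally correlated visual and tactile components; the feature dimensions and data are placeholders, not the descriptors used in the cited work.

```python
# Sketch: finding maximally correlated components of visual and tactile texture
# features with canonical correlation analysis (CCA).
import numpy as np
from sklearn.cross_decomposition import CCA

n_surfaces = 200
X_vision = np.random.randn(n_surfaces, 50)           # placeholder visual texture descriptors
X_tactile = np.random.randn(n_surfaces, 30)          # placeholder dynamic tactile descriptors

cca = CCA(n_components=5).fit(X_vision, X_tactile)
vis_c, tac_c = cca.transform(X_vision, X_tactile)    # compact, correlated texture representations
```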

B) Active Learning of Object Properties: Learning about objects is not a passive perceptual process: a robot's embodiment allows it to discover object properties by actively changing its point of view and by interacting with objects. At the same time, the robot observes the effects of its actions and learns how it can bring about desired effects.

To learn efficiently, the robot should select the exploratory action that yields the highest information gain out of all possible actions. By efficiently exploring its environment, a robot develops object knowledge grounded in its sensorimotor experience.

The developed object knowledge can be used to manipulate previously unknown objects in unstructured environments. For example, the robot could teach itself to perform tasks in domestic environments.

  1. van Hoof, H.; Kroemer, O.; Peters, J. (2014). Probabilistic Segmentation and Targeted Exploration of Objects in Cluttered Environments, IEEE Transactions on Robotics (T-Ro), 30, 5, pp.1198-1209.
  2. van Hoof, H.; Kroemer, O.; Peters, J. (2013). Probabilistic Interactive Segmentation for Anthropomorphic Robots in Cluttered Environments, Proceedings of the International Conference on Humanoid Robots (HUMANOIDS).

Contact: Jan Peters, Oliver Kroemer, Herke van Hoof
Download: Flyer as PDF

Bayesian Optimization

Optimizing parameters in robotics is a necessary but often hard task. The challenges arise from the presence of multiple local optima, noise in the measurements, the lack of analytical gradients, and the limited number of experiments possible on a real robot. Hence, typical approaches such as grid search, random search, gradient descent, and genetic algorithms often perform poorly.

Bayesian optimization is designed to naturally deal with these challenges. Thanks to the use of a response surface model, Bayesian optimization drastically reduces the number of experiments needed. Hence, it can be successfully employed on real systems where other approaches would fail. A minimal sketch follows the references below.

  1. Calandra, R.; Gopalan, N.; Seyfarth, A.; Peters, J.; Deisenroth, M.P. (2014). Bayesian Gait Optimization for Bipedal Locomotion, Proceedings of the 2014 Learning and Intelligent Optimization Conference (LION8).
  2. Calandra, R.; Seyfarth, A.; Peters, J.; Deisenroth, M.P. (2014). An Experimental Comparison of Bayesian Optimization for Bipedal Locomotion, Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA).
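
A minimal sketch of Bayesian optimization with a Gaussian-process response surface and the expected-improvement acquisition function; the objective below is a stand-in for an expensive robot experiment such as a single walking trial, and all parameter ranges are placeholders.

```python
# Minimal sketch of Bayesian optimization: a Gaussian-process surrogate is fitted to
# past trials and the next experiment is chosen by maximizing expected improvement.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):                                    # placeholder for a real robot experiment
    return -np.sum((x - 0.3) ** 2) + 0.01 * np.random.randn()

X = np.random.rand(5, 2)                             # a few initial gait-parameter trials
y = np.array([objective(x) for x in X])

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    cand = np.random.rand(1000, 2)                   # random candidates in the parameter box
    mu, std = gp.predict(cand, return_std=True)
    best = y.max()
    z = (mu - best) / np.maximum(std, 1e-9)
    ei = (mu - best) * norm.cdf(z) + std * norm.pdf(z)   # expected improvement (maximization)
    x_next = cand[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next))
```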

Contact: Roberto Calandra, Jan Peters, Marc Deisenroth

Biomimetic Robotics, Human Motor Control and Brain-Robot Interfaces

A different research goal focuses on human motion: the tools that we develop for robot motor learning may help us understand human motor control and can also be used in brain-robot interfaces that learn to help stroke patients rehabilitate. For more information on some of our work in this context see:
  1. Gomez-Rodriguez, M.; Grosse-Wentrup, M.; Hill, J.; Schoelkopf, B.; Gharabaghi, A.; Peters, J. (2011). Towards Brain-Robot Interfaces for Stroke Rehabilitation, Proceedings of the International Conference on Rehabilitation Robotics (ICORR).
  2. Meyer, T.; Peters, J.; Broetz, D.; Zander, T.; Schoelkopf, B.; Soekadar, S.; Grosse-Wentrup, M. (2012). A Brain-Robot Interface for Studying Motor Learning after Stroke, Proceedings of the International Conference on Robot Systems (IROS).
  3. Gomez-Rodriguez, M.; Peters, J.; Hill, J.; Schoelkopf, B.; Gharabaghi, A.; Grosse-Wentrup, M. (2011). Closing the Sensorimotor Loop: Haptic Feedback Helps Decoding of Motor Imagery, Journal of Neuroengineering, 8, 3.
  4. Muelling, K.; Kober, J.; Peters, J. (2011). A Biomimetic Approach to Robot Table Tennis, Adaptive Behavior Journal, 19, 5.
  5. Belousov, B.; Neumann, G.; Rothkopf, C.; Peters, J. (2016). Catching heuristics are optimal control policies, Advances in Neural Information Processing Systems (NIPS).

Contact: Timm Meyer, Jan Peters, Oliver Kroemer, Katharina Muelling

Whole Body Movement and Robot Locomotion

Dealing with whole body movement of humanoid robots is a highly challenging domain involving problems ranging from stabilizing behaviors (e.g., balancing, supporting oneself against objects) through static motions (e.g., getting up from a chair, push-ups) to locomotion (e.g., walking, running). As of now, no robot exhibits the same dexterity, efficiency, speed and robustness as a human, due to the difficulty of controlling and planning with a high number of degrees of freedom as well as the complexity of modeling and estimating physical contacts.

We aim to develop new approaches that improve the current state of the art for whole body movement of humanoid robots in general and for robot locomotion in particular.

  1. Calandra, R.; Ivaldi, S.; Deisenroth, M.; Rueckert, E.; Peters, J. (2015). Learning Inverse Dynamics Models with Contacts, Proceedings of the International Conference on Robotics and Automation (ICRA).
  2. Calandra, R.; Gopalan, N.; Seyfarth, A.; Peters, J.; Deisenroth, M.P. (2014). Bayesian Gait Optimization for Bipedal Locomotion, Proceedings of the 2014 Learning and Intelligent Optimization Conference (LION8).
  3. Calandra, R.; Seyfarth, A.; Peters, J.; Deisenroth, M.P. (2014). An Experimental Comparison of Bayesian Optimization for Bipedal Locomotion, Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA).

Contact: Roberto Calandra, Alexandros Paraschos, Elmar Rückert, Serena Ivaldi, Jan Peters

Modeling Human Motor Learning

The human central nervous system has an incredible ability for learning new motor skills. For example, when learning to control a novel sport device, new movement and balancing strategies may be discovered within seconds. Such rapid learning processes are based on learned abstract concepts, current beliefs of the environment and future expectations of desired postures, reached targets or achieved rewards (among others). Little is known about how these underlying processes develop and how they interplay during motor learning.

In our research, we develop probabilistic models and inference techniques to gain a better understanding of the amazing human learning abilities. Our models reproduce characteristic features like motor variability, continuous exploration, stochastic decisions and the ability to learn task abstractions. Applications range from medical diagnoses and rehabilitation research to smart robot learning and control frameworks.

  1. Rueckert, E.; Mundo, J.; Paraschos, A.; Peters, J.; Neumann, G. (2015). Extracting Low-Dimensional Control Variables for Movement Primitives, Proceedings of the International Conference on Robotics and Automation (ICRA).
  2. Rueckert, E.; Lioutikov, R.; Calandra, R.; Schmidt, M.; Beckerle, P.; Peters, J. (2015). Low-cost Sensor Glove with Force Feedback for Learning from Demonstrations using Probabilistic Trajectory Representations, ICRA 2015 Workshop on Tactile and force sensing for autonomous compliant intelligent robots.
  3. Rueckert, E.; Camernik, J.; Peters, J.; Babic, J. (2016). Probabilistic Movement Models Show that Postural Control Precedes and Predicts Volitional Motor Control, Nature PG: Scientific Reports, 6, 28455.

Contact: Elmar Rückert, Jan Peters

Spiking Neural Networks for Motor Control, Planning and Learning

Spiking neural networks are powerful computational models of brain functions and are a key technology in, e.g., the Human Brain Project, with estimated total costs of 1.19 billion Euros. A second driving force for intensive research efforts is their capacity for massively parallel computing. Companies like IBM, SPS or HP develop 'brain-like' memory and computing chips with the long-term goal of developing neuromorphic computers.

We investigate how spiking neural network models for neuromorphic chips can be used for robot motor control and learning. A strong focus is put on the real-time processing of huge sensory streams from tactile, visual and other sensors and on learning from rewards. The implemented computational principles are based on powerful machine learning algorithms like probabilistic inference, contrastive divergence or stochastic policy search that are used in a large variety of other applications like visual scene understanding, speech processing and cognitive reasoning.

  1. Rueckert, E.; Kappel, D.; Tanneberg, D.; Pecevski, D.; Peters, J. (2016). Recurrent Spiking Networks Solve Planning Tasks, Nature PG: Scientific Reports, 6, 21142, Nature Publishing Group.
  2. Tanneberg, D.; Paraschos, A.; Peters, J.; Rueckert, E. (2016). Deep Spiking Networks for Model-based Planning in Humanoids, Proceedings of the International Conference on Humanoid Robots (HUMANOIDS).

Contact: Elmar Rückert, Daniel Tanneberg, Jan Peters

Learning Diverse Policies

In robotics, learning algorithms often return a single policy that is optimal with respect to a given cost or reward function. Learning a set of policies instead of a single one is of particular interest in two cases. First, the cost function has a significant impact on the shape of the returned policy, yet its exact definition can sometimes be arbitrary if it is a trade-off between several objectives (e.g., a trade-off between state and action costs). In this case, it is more informative to explore the set of all policies that are optimal for some trade-off. Second, even when the cost is known, finding all policies that perform above a given threshold can result in faster adaptation to a changing environment (e.g., adapting to a different opponent in robot table tennis).

A) Multi-Objective Reinforcement Learning

Many real-world applications are characterized by multiple conflicting objectives. In such problems, optimality is replaced by Pareto optimality and the goal is to find the Pareto frontier: a set of solutions representing different compromises among the objectives. Despite recent advances in multi-objective optimization, achieving an accurate representation of the Pareto frontier is still an important challenge.

We formulate the problem of approximating Pareto frontiers as a multi-objective Markov decision process (MOMDP) and solve it with a manifold-based approach: we combine episodic exploration strategies and importance sampling to efficiently learn a manifold in the policy parameter space such that its image in the objective space accurately approximates the Pareto frontier. A small sketch of the underlying non-dominance criterion follows the reference below.

  1. Parisi, S.; Pirotta, M.; Peters, J. (accepted). Manifold-based Multi-objective Policy Search with Sample Reuse, Neurocomputing.
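
For illustration, the sketch below extracts the non-dominated (Pareto-optimal) members from a finite set of candidate policies; the returns are placeholders, and the cited work instead learns a continuous manifold of policies whose image approximates the frontier.

```python
# Sketch: extracting the non-dominated (Pareto-optimal) policies from a set of
# candidates evaluated on several objectives (higher is better on every objective).
import numpy as np

def pareto_front(returns):
    """returns: (n_policies, n_objectives) array; gives indices of non-dominated rows."""
    n = returns.shape[0]
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(returns[j] >= returns[i]) and np.any(returns[j] > returns[i]):
                keep[i] = False
                break
    return np.where(keep)[0]

returns = np.random.rand(100, 2)          # e.g. task reward vs. negated control cost per policy
pareto_idx = pareto_front(returns)
```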

B) Intrinsically Motivated Reinforcement Learning

Intrinsic motivation defines a set of objectives the robot can pursue even in the absence of a task-related reward. We are especially interested in combining intrinsic motivation with task-specific objectives. For instance, a robot can decide to maximize the entropy of its sensorimotor stream, resulting in exploratory behavior. When combined with a task reward, this can ensure that the exploration of diverse behaviors is restricted to the area of interest of the task.

Learning a diverse set of well-performing behaviors has several advantages. In an adversarial setting such as robot table tennis, diversity of behaviors makes the robot harder for the opponent to predict and hence harder to counter. In a collaborative task with a human, an emphasis on diversity gives the human more opportunity to guide the robot and prevents it from getting stuck in local optima.

  1. Gabriel, A.; Akrour, R.; Peters, J.; Neumann, G. (2017). Empowered Skills, Proceedings of the International Conference on Robotics and Automation (ICRA).

Contact: Simone Parisi, Riad Akrour, Jan Peters

  
