Zhikun Wang was a Ph.D. student at the MPI for Intelligent Systems in Tuebingen and at TU Darmstadt, advised by Jan Peters and Bernhard Schölkopf. He graduated with a Ph.D. in September 2013 and has recently joined Google in Mountain View, CA, USA. Before joining IAS, he obtained his B.Sc. and M.Sc. degrees in computer science from Tsinghua University in Beijing.
Mail. Spemannstr. 38, 72076 Tuebingen, Germany
Recent advances in sensors and algorithms allow for robots with improved perception abilities. However, effective perception alone may not be sufficient for human-robot interaction, since the robot's reaction should depend on understanding the human's intention. Hence, my research interests lie in the strategic level of human-robot interaction, which serves as a bridge between perception of human action and planning for reaction. On one side, the robot needs to infer the underlying intention of humans. On the other side, efficient planning for reaction can be achieved by utilizing motor skills with reactive policies learned to choose the right skill at the right time.
I have been developing and implementing machine learning algorithms for intention inference and for learning reactive policies. I have chosen robot table tennis as a benchmark, as it is a sufficiently complex scenario for evaluation while intuition still allows the results to be interpreted. We have achieved promising experimental results, which suggest that these algorithms have potential in many other human-robot interaction scenarios.
Gaussian processes, Bayesian inference, Graphical models, Reinforcement learning, Human-robot interaction
Intention inference can be an essential step toward efficient human-robot interaction. For this purpose, we propose the Intention-Driven Dynamics Model (IDDM) to probabilistically model the generative process of movements that are directed by the intention. The IDDM allows the intention to be inferred from observed movements using Bayes' theorem. The IDDM simultaneously finds a latent state representation of noisy and high-dimensional observations, and models the intention-driven dynamics in the latent states. As most robotics applications are subject to real-time constraints, we develop an efficient online algorithm that allows for real-time intention inference. Two human-robot interaction scenarios, i.e., target prediction for robot table tennis and action recognition for interactive humanoid robots, are used to evaluate the performance of our inference algorithm. In both intention inference tasks, the proposed algorithm achieves substantial improvements over support vector machines and Gaussian processes.
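The idea of inferring an intention from observed movements with Bayes' theorem can be illustrated with a minimal sketch. This is not the IDDM itself: it assumes a small discrete set of intentions, a one-dimensional observation, and a fixed Gaussian observation model per intention, whereas the IDDM learns latent dynamics from data. The names `update_belief` and `means`, and all numeric values, are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def update_belief(belief, observation, means, std):
    """One recursive Bayes step: posterior is proportional to likelihood times prior."""
    unnormalized = {
        intention: prior * gaussian_pdf(observation, means[intention], std)
        for intention, prior in belief.items()
    }
    total = sum(unnormalized.values())
    return {intention: w / total for intention, w in unnormalized.items()}


# Assumed toy model: each intention predicts a different mean observation
# (for example, a lateral racket velocity). These values are illustrative only.
means = {"forehand": 1.0, "backhand": -1.0}
belief = {"forehand": 0.5, "backhand": 0.5}  # uniform prior over intentions

# Observations arrive one at a time, so the belief can be refined online.
for obs in [0.8, 1.1, 0.4]:
    belief = update_belief(belief, obs, means, std=1.0)
```

Because each step reuses the previous posterior as the new prior, the cost per observation is constant, which is the property an online algorithm needs under real-time constraints.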
Opponent modeling is a critical mechanism in repeated games. It allows a player to adapt its strategy in order to better respond to the presumed preferences of its opponents. We introduce a new modeling technique that adaptively balances exploitability and risk reduction. An opponent's strategy is modeled with a set of possible strategies that contains the actual strategy with high probability. The algorithm is safe, as the expected payoff is above the minimax payoff with high probability, and it can exploit the opponent's preferences once sufficient observations have been obtained. We apply the technique to normal-form games and to stochastic games with a finite number of stages. The performance of the proposed approach is first demonstrated on repeated rock-paper-scissors games. Subsequently, the approach is evaluated in a human-robot table-tennis setting where the robot player learns to prepare to return a served ball. By modeling the human players, the robot chooses a forehand, backhand or middle preparation pose before the opponent serves. The learned strategies can exploit the opponent's preferences, leading to a higher rate of successful returns.
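The trade-off between safety and exploitation can be sketched for rock-paper-scissors as follows. This is an illustrative sketch under stated assumptions, not the published algorithm: the set of possible opponent strategies is assumed to be an L1 ball around the empirical action frequencies, its radius is taken from a Weissman-type concentration bound, and the player's own mixed strategy is found by a coarse grid search. All function names and parameter values are hypothetical.

```python
import math

# Our payoff in rock-paper-scissors: PAYOFF[mine][theirs],
# with actions ordered as rock, paper, scissors.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]


def l1_radius(n, k=3, delta=0.05):
    """Assumed confidence radius: the true strategy lies within this L1
    distance of the empirical one with probability at least 1 - delta."""
    if n == 0:
        return 2.0  # no data: every strategy on the simplex is possible
    return min(2.0, math.sqrt(2.0 * math.log((2 ** k - 2) / delta) / n))


def worst_case_payoff(x, p_hat, eps):
    """Minimum expected payoff of mixed strategy x over all opponent
    strategies within L1 distance eps of p_hat."""
    k = len(p_hat)
    u = [sum(x[i] * PAYOFF[i][j] for i in range(k)) for j in range(k)]
    q = list(p_hat)
    j_min = min(range(k), key=lambda j: u[j])
    # Shift up to eps/2 probability mass onto the action that hurts us most,
    moved = min(eps / 2.0, 1.0 - q[j_min])
    q[j_min] += moved
    # taking it first from the actions that benefit us most.
    for j in sorted(range(k), key=lambda j: u[j], reverse=True):
        if j == j_min:
            continue
        take = min(q[j], moved)
        q[j] -= take
        moved -= take
    return sum(u[j] * q[j] for j in range(k))


def best_safe_strategy(counts, steps=30, delta=0.05):
    """Grid search for the mixed strategy with the best worst-case payoff."""
    n = sum(counts)
    p_hat = [c / n for c in counts] if n else [1.0 / 3.0] * 3
    eps = l1_radius(n, len(counts), delta)
    best_x, best_value = None, -math.inf
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            x = (a / steps, b / steps, (steps - a - b) / steps)
            value = worst_case_payoff(x, p_hat, eps)
            if value > best_value:
                best_x, best_value = x, value
    return best_x, best_value


# With no observations the search falls back to the minimax (uniform) strategy.
x0, v0 = best_safe_strategy([0, 0, 0])
# After 100 observed rounds dominated by rock, it shifts toward paper.
x1, v1 = best_safe_strategy([80, 10, 10])
```

As observations accumulate the radius shrinks, so the worst case over the set approaches the payoff against the empirical strategy and the chosen response becomes more exploitative.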