

Skill memory in biped locomotion: Using perceptual information to predict task outcome

J. Andre¹, C. Santos², L. Costa³
¹ Department of Industrial Electronics, University of Minho, joaocandre@dei.uminho.pt
² Department of Industrial Electronics, University of Minho, cristina@dei.uminho.pt
³ Department of Production and Systems, University of Minho, lac@dps.uminho.pt

Abstract

Robots must be able to adapt their motor behavior to unexpected situations in order to move safely among humans. A necessary step is the ability to predict failures, which manifest as behavioral abnormalities and may cause irrecoverable damage to the robot and its surroundings, i.e. humans. In this paper we build a predictive model of sensor traces that enables early failure detection by means of a skill memory. Specifically, we propose an architecture based on a biped locomotion solution whose robustness is improved through sensory feedback, and extend the concept of Associative Skill Memories (ASMs) to periodic movements by introducing several mechanisms into the training workflow, such as linear interpolation and regression into a Dynamic Movement Primitive (DMP) system, so that the representation becomes time-invariant and easily parameterizable. The failure detection mechanism applies statistical tests to determine the optimal operating conditions. Both training and failure testing were conducted on a DARwIn-OP inside a simulation environment to assess and validate the proposed failure detection system. Results show that system performance, in terms of the compromise between sensitivity and specificity, is similar with and without the proposed mechanism, while achieving a significant reduction in data size due to the periodic approach taken.

Keywords: Reinforcement learning · Bio-inspired · Skill Memory
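As a minimal sketch of the kind of statistical test the abstract refers to, a current sensor reading can be compared against a stored reference footprint at the same gait phase and flagged when it deviates by more than a fixed number of standard deviations. The function name, the 3-sigma threshold, and the numeric values below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def deviates(sample, ref_mean, ref_std, z_max=3.0, eps=1e-6):
    """Flag a potential failure when the sensor reading strays more than
    z_max standard deviations from the reference footprint at the same
    phase (a hypothetical stand-in for the paper's statistical tests)."""
    z = np.abs(sample - ref_mean) / (ref_std + eps)
    return bool(np.any(z > z_max))

# reference footprint for one gait-phase window (illustrative values)
ref_mean = np.zeros(10)
ref_std = np.full(10, 0.1)

print(deviates(np.full(10, 0.05), ref_mean, ref_std))  # well within 3 sigma
print(deviates(np.full(10, 0.80), ref_mean, ref_std))  # far outside the envelope
```

In practice the threshold would be tuned to trade sensitivity against specificity, which is exactly the compromise the abstract evaluates.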

robótica 99, 2nd Quarter 2015


Scientific Article

Part 1

1. INTRODUCTION

Learning is as necessary a skill for robots as it is for humans [21], and remains a demanding effort. It is an on-going process: even after grasping the basic motions needed for a specific movement, there is room for improvement, either by trial-and-error or by inference from perceptual data and previous experiences. Humans use the latter almost unconsciously, constantly predicting their actions' impact on their surroundings and adjusting them to achieve better outcomes. On robots, however, the use of sensor information is not nearly as automated, and achieving a proper parameterization with satisfactory results can be time-consuming [20]. Robotic control is a challenging task due to the great number of variables at play in the interaction between a robot and its surroundings. Since it is not realistically possible to account in advance for every potential source of disruptive interference in the way of movement execution, alternative ways to increase movement robustness must be found, either at the hardware level (passive adaptation with compliant joints [22]) or at the control level (extensive calibration, exploration through vision systems [22]).

Popular solutions are often based on a feedback loop that corrects movement plans as needed in response to perceptible changes in the environment [22]. An attractive enhancement to robotic systems would therefore be the capability to use sensor information to improve performance. Moreover, being able to identify potential failure conditions from perceptual information, while learning from past experiences, can prompt the robot to take timely countermeasures to ensure task completion, reacting to sudden fluctuations in the environment that would otherwise have a disruptive effect on movement execution. Several failure detection systems exist, including particle filters [24] and neural-network based approaches [4, 5]; however, these are highly focused on industrial and wheeled robots, and to the authors' knowledge such a framework is lacking for humanoid/legged robots. It thus becomes necessary to find a way to store relevant perceptual information consistently, in order to use it to monitor and improve robotic movement: what we would call a skill memory.

Based on the premise that stereotypical movements tend to leave similar sensor footprints with each execution, even if the environment is dynamically changing, Pastor et al. proposed the notion of Associative Skill Memory (ASM) [22]. An ASM saves a reference sensor footprint capturing the most common values and variation of perceptual data when executing a certain task; it is associative because the memory is not based solely on sensor information, but rather on the association of this specific sensor footprint with the corresponding movement, skill or end task [21]. Overall, this work presents a first step towards a truly sensor-driven CPG biped locomotion approach, using ASMs to construct a vast library of locomotor movements and parameters, selectable according to the external context to achieve the most appropriate gait.
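A rough sketch of how such a periodic skill memory could be assembled follows; the function names, grid size, and synthetic traces are our own assumptions, not the authors' implementation. Each execution's sensor trace is linearly interpolated onto a common phase grid, making it time-invariant, and the memory stores the per-phase mean and standard deviation across executions.

```python
import numpy as np

def to_phase_grid(t, y, n_bins=100):
    """Linearly interpolate one periodic sensor trace onto a fixed
    phase grid in [0, 1], removing the dependence on execution speed."""
    phase = (t - t[0]) / (t[-1] - t[0])
    return np.interp(np.linspace(0.0, 1.0, n_bins), phase, y)

def build_asm(traces):
    """Minimal associative skill memory: per-phase mean and standard
    deviation over phase-aligned sensor traces (one row per execution)."""
    data = np.stack(traces)
    return data.mean(axis=0), data.std(axis=0)

# synthetic gait cycles recorded at different speeds and sample rates
rng = np.random.default_rng(0)
cycles = []
for duration, n in [(0.8, 40), (1.0, 50), (1.2, 75)]:
    t = np.linspace(0.0, duration, n)
    y = np.sin(2 * np.pi * t / duration) + rng.normal(0, 0.02, n)
    cycles.append(to_phase_grid(t, y))

asm_mean, asm_std = build_asm(cycles)  # two 100-point reference footprints
```

The phase-grid alignment is what yields the data-size reduction mentioned in the abstract: one compact footprint stands in for arbitrarily many variable-length executions.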
There are, however, several problems to deal with, and failures are highly undesirable for autonomous humanoid robots expected to cope with realistic and possibly unforeseen physically interactive human environments. Throughout this work we define a failure as any deviation from optimal behavior; in the specific case of biped locomotion, this includes not only falling but also unstable or inefficient walking patterns. During a robotic walk, these failures manifest as abnormalities in the robot's normal behavior and may lead to falls and, consequently, irrecoverable damage to the robot and its surroundings, often populated by humans. In this paper we present a method to construct a predictive model for locomotion from sensor traces acquired during past attempts. This method is explored to reliably predict potential locomotion failures. This ASM-based failure predictor module

