[IROS 2021]

Learning from Successful and Failed Demonstrations via Optimization

Abstract— Learning from Demonstration (LfD) is a popular approach that allows humans to teach robots new skills by showing the correct way(s) of performing the desired skill. Human-provided demonstrations, however, are not always optimal, and the teacher usually addresses this issue by discarding or replacing sub-optimal (noisy or faulty) demonstrations. We propose a novel LfD representation that learns from both successful and failed demonstrations of a skill. Our approach encodes the two subsets of captured demonstrations (labeled by the teacher) into a statistical skill model, constructs a set of quadratic costs, and finds an optimal reproduction of the skill under novel problem conditions (i.e., constraints). The optimal reproduction balances convergence towards successful examples and divergence from failed examples. We evaluate our approach through several 2D and 3D real-world experiments using a UR5e manipulator arm and also show that it can reproduce a skill from only failed demonstrations. The benefits of exploiting both failed and successful demonstrations are shown through comparison with two existing LfD approaches. We also compare our approach against an existing skill refinement method and show its capabilities in a multi-coordinate setting.
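The balance between convergence and divergence described above can be illustrated with a minimal sketch (not the paper's exact formulation): at one point along the reproduction, a quadratic cost attracts the solution toward the mean of the successful demonstrations and repels it from the mean of the failed ones. The means `mu_s`, `mu_f` and the weight `lam` are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def reproduce_step(mu_s, mu_f, lam=0.3, x0=None):
    """Find a point balancing convergence to mu_s and divergence from mu_f."""
    def cost(x):
        attract = np.sum((x - mu_s) ** 2)        # pull toward successful demos
        repel = -lam * np.sum((x - mu_f) ** 2)   # push away from failed demos
        return attract + repel                   # convex as long as lam < 1

    x0 = mu_s.copy() if x0 is None else x0
    return minimize(cost, x0).x

mu_s = np.array([1.0, 2.0])   # mean of successful demos at this step (assumed)
mu_f = np.array([1.2, 1.8])   # mean of failed demos at this step (assumed)
x = reproduce_step(mu_s, mu_f)
```

The minimizer lands at `(mu_s - lam * mu_f) / (1 - lam)`, i.e. slightly past the successful mean, on the side away from the failures.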


[ALA 2021]

Reward-Sharing Relational Networks in Multi-Agent Reinforcement Learning as a Framework for Emergent Behavior

Abstract— In this work, we integrate ‘social’ interactions into the MARL setup through a user-defined relational network and examine the effects of agent-agent relations on the rise of emergent behaviors. Leveraging insights from sociology and neuroscience, our proposed framework models agent relationships using the notion of Reward-Sharing Relational Networks (RSRN), where network edge weights act as a measure of how much one agent is invested in the success of (or ‘cares about’) another. We construct relational rewards as a function of the RSRN interaction weights to collectively train the multi-agent system via a multi-agent reinforcement learning algorithm. The performance of the system is tested for a 3-agent scenario with different relational network structures (e.g., self-interested, communitarian, and authoritarian networks). Our results indicate that reward-sharing relational networks can significantly influence learned behaviors. We posit that RSRN can act as a framework where different relational networks produce distinct emergent behaviors, often analogous to the intuited sociological understanding of such networks.
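The core reward-sharing mechanism can be sketched in a few lines: each agent's training reward is a weighted sum of all agents' environment rewards, with the weights taken from the relational network's adjacency matrix. The example matrices and reward values below are illustrative assumptions, not data from the paper.

```python
import numpy as np

def relational_rewards(W, env_rewards):
    """Return each agent's training reward under reward-sharing network W,
    where W[i, j] measures how much agent i cares about agent j."""
    return W @ env_rewards

env_rewards = np.array([1.0, 0.0, 0.5])        # raw per-agent rewards (assumed)

self_interested = np.eye(3)                    # each agent cares only about itself
communitarian = np.full((3, 3), 1.0 / 3.0)     # everyone cares equally about all

r_self = relational_rewards(self_interested, env_rewards)
r_comm = relational_rewards(communitarian, env_rewards)
```

Under the self-interested network each agent keeps its own reward, while the communitarian network gives every agent the same averaged reward, which is what drives the distinct emergent behaviors.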


[ICRA 2020]

Towards Mobile Multi-Task Manipulation in a Confined and Integrated Environment with Irregular Objects

Abstract—The FetchIt! Mobile Manipulation Challenge, held at the IEEE International Conference on Robotics and Automation (ICRA) in May 2019, offered an environment with complex and integrated task sets, irregular objects, confined space, and machining, introducing new challenges in the mobile manipulation domain. Here we describe our efforts to address these challenges by demonstrating the assembly of a kit of mechanical parts in a caddy. In addition to implementation details, we examine the issues in this task set extensively, and we discuss our software architecture in the hope of providing a base for other researchers. To evaluate performance and consistency, we conducted 20 full runs, then examined failure cases with possible solutions. We conclude by identifying future research directions to address the open challenges.


[ICRA 2020]

Benchmark of Skill Learning from Demonstration: Impact of User Experience, Task Complexity, and Start Configuration on Performance

Abstract—We contribute a study benchmarking the performance of multiple motion-based learning from demonstration approaches. Given the number and diversity of existing methods, it is critical that comprehensive empirical studies be performed comparing the relative strengths of these techniques. In particular, we evaluate four approaches based on properties an end user may desire for real-world tasks. To perform this evaluation, we collected data from nine participants across four manipulation tasks. The resulting demonstrations were used to train 180 task models and evaluated on 720 task reproductions on a physical robot. Our results detail how i) the complexity of the task, ii) the expertise of the human demonstrator, and iii) the starting configuration of the robot affect task performance. The collected dataset of demonstrations, robot executions, and evaluations is publicly available. We also provide research insights and guidelines to inform future work and deployment choices regarding these approaches.

Skill Acquisition via Automated Multi-Coordinate Cost Balancing

Abstract - We propose a learning framework, named Multi-Coordinate Cost Balancing (MCCB), to address the problem of acquiring point-to-point movement skills from demonstrations. MCCB encodes demonstrations simultaneously in multiple differential coordinates that specify local geometric properties. MCCB generates reproductions by solving a convex optimization problem with a multi-coordinate cost function and linear constraints on the reproductions, such as initial, target, and via points. Further, since the relative importance of each coordinate system in the cost function might be unknown for a given skill, MCCB learns optimal weighting factors that balance the cost function. We demonstrate the effectiveness of MCCB via detailed experiments conducted on one handwriting dataset and three complex skill datasets.


[Frontiers 2018]

Trajectory-based Skill Learning Using Generalized Cylinders

Abstract - In this article, we introduce Trajectory Learning using Generalized Cylinders (TLGC), a novel trajectory-based skill learning approach from human demonstrations. To model a demonstrated skill, TLGC uses a Generalized Cylinder—a geometric representation composed of an arbitrary space curve called the spine and a surface with smoothly varying cross-sections. Our approach is the first application of Generalized Cylinders to manipulation, and its geometric representation offers several key features: it identifies and extracts the implicit characteristics and boundaries of the skill by encoding the demonstration space; it supports generation of multiple skill reproductions that maintain those characteristics; it can generalize the skill to unforeseen situations through trajectory editing techniques; and it allows for obstacle avoidance and interactive human refinement of the resulting model through kinesthetic correction. We validate our approach through a set of real-world experiments with both a Jaco 6-DOF and a Sawyer 7-DOF robotic arm.

[CoRL 2017]

Towards Robust Skill Generalization: Unifying Learning from Demonstration and Motion Planning  

Abstract - In this paper, we present Combined Learning from demonstration And Motion Planning (CLAMP) as an efficient approach to skill learning and generalizable skill reproduction. CLAMP combines the strengths of Learning from Demonstration (LfD) and motion planning into a unifying framework. We carry out probabilistic inference to find trajectories which are optimal with respect to a given skill and also feasible in different scenarios. We use factor graph optimization to speed up inference. To encode optimality, we provide a new probabilistic skill model based on a stochastic dynamical system. This skill model requires minimal parameter tuning to learn, is suitable to encode skill constraints, and allows efficient inference. Preliminary experimental results showing skill generalization over initial robot state and unforeseen obstacles are presented. 

Published in: Conference on Robot Learning (CoRL 2017), Proceedings of Machine Learning Research (PMLR), Nov. 2017 [PDF][Video]

Generalized Cylinders for Learning, Reproduction, Generalization, and Refinement of Robot Skills

Abstract - This paper presents a novel geometric approach for learning and reproducing trajectory-based skills from human demonstrations. Our approach models a skill as a Generalized Cylinder, a geometric representation composed of an arbitrary space curve called the spine and a smoothly varying cross-section. While this model has been utilized to solve other robotics problems, this is the first application of Generalized Cylinders to manipulation. The strengths of our approach are the model’s ability to identify and extract the implicit characteristics of the demonstrated skill, support for multiple reproductions of trajectories that maintain those characteristics, generalization to new situations through nonrigid registration, and interactive human refinement of the resulting model through kinesthetic teaching. We validate our approach through several real-world experiments with a Jaco 6-DOF robotic arm.

Published in: Robotics: Science and Systems (RSS 2017), MIT in Cambridge, MA, USA, Jul. 12-16, 2017 [PDF][video]

Trajectory Learning from Demonstration with Canal Surfaces: A Parameter-free Approach

Abstract— We present a novel geometric framework for intuitively encoding and learning a wide range of trajectory-based skills from human demonstrations. Our approach identifies and extracts the main characteristics of the demonstrated skill, which are spatial correlations across different demonstrations. Using the extracted characteristics, the proposed approach generates a continuous representation of the skill based on the concept of canal surfaces. Canal surfaces are Euclidean surfaces formed as the envelope of a family of regular surfaces (e.g. spheres) whose centers lie on a space curve. The learned skill can be reproduced, as a time-independent trajectory, and generalized to unforeseen situations inside the canal while its main characteristics are preserved. The main advantages of the proposed approach include: (a) requiring no parameter tuning, (b) maintaining the main characteristics and implicit boundaries of the skill, and (c) generalizing the learned skill over the initial condition of the movement, while exploiting the whole demonstration space to reproduce a variety of successful movements. Evaluations using simulated and real-world data exemplify the feasibility and robustness of our approach.
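A minimal 2-D sketch of canal-surface reproduction with circular cross-sections: the spine and radius function below stand in for the model learned from demonstrations and are purely illustrative assumptions. A reproduction starts at a new point inside the canal and preserves its initial offset-to-radius ratio along the spine, so it stays within the canal by construction.

```python
import numpy as np

t = np.linspace(0, 1, 100)
spine = np.stack([t, np.sin(2 * np.pi * t)], axis=1)   # learned mean curve (assumed)
radius = 0.2 + 0.1 * np.sin(np.pi * t)                 # learned boundary (assumed)

# Unit normals to the spine, rotated tangents (-y', x').
normal = np.stack([-np.gradient(spine[:, 1], t),
                   np.gradient(spine[:, 0], t)], axis=1)
normal /= np.linalg.norm(normal, axis=1, keepdims=True)

start_offset = 0.1                                     # new initial condition (assumed)
ratio = start_offset / radius[0]                       # ratio preserved along the canal
reproduction = spine + (ratio * radius)[:, None] * normal
```

Because the offset scales with the local radius, the reproduction inherits the demonstrated characteristics (the canal's shape) while generalizing over the initial condition.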

Published in: IEEE-RAS International Conference on Humanoid Robotics (Humanoids 2016), Cancun, Mexico, Nov 15-17, 2016. [PDF]

Learning Symbolic Representations of Actions from Human Demonstrations

Abstract— In this paper, a robot learning approach is proposed that integrates Visuospatial Skill Learning, Imitation Learning, and conventional planning methods. In our approach, the sensorimotor skills (i.e., actions) are learned through a learning from demonstration strategy. The sequence of performed actions is learned through demonstrations using Visuospatial Skill Learning. A standard action-level planner maintains a symbolic description of the skill, which allows the system to represent the skill in a discrete, symbolic form. The Visuospatial Skill Learning module identifies the underlying constraints of the task and extracts symbolic predicates (i.e., action preconditions and effects), thereby updating the planner representation while the skills are being learned. The planner therefore maintains a generalized representation of each skill as a reusable action, which can be planned and performed independently during the learning phase. Preliminary experimental results on the iCub robot are presented.

Published in: IEEE International Conference on Robotics and Automation (ICRA 2015), Seattle, Washington, USA, May 26-30, 2015. [PDF][bibtex][video]

Learning Reactive Robot Behavior for Autonomous Valve Turning

Abstract— A learning approach is proposed for the challenging task of autonomous robotic valve turning in the presence of active disturbances and uncertainties. The valve turning task comprises two phases: reaching and turning. For the reaching phase the manipulator learns how to generate trajectories to reach or retract from the target. The learning is based on a set of trajectories demonstrated in advance by the operator. The turning phase is accomplished using a hybrid force/motion control strategy. Furthermore, a reactive decision making system is devised to react to the disturbances and uncertainties arising during the valve turning process. The reactive controller monitors the changes in force, movement of the arm with respect to the valve, and changes in the distance to the target. Observing the uncertainties, the reactive system modulates the valve turning task by changing the direction and rate of the movement. A real-world experiment with a robot manipulator mounted on a movable base is conducted to show the efficiency and validity of the proposed approach.

Published in: IEEE-RAS International Conference on Humanoid Robots (Humanoids 2014), Madrid, Spain, Nov 18-20. [PDF][bibtex]

Multi-Objective Reinforcement Learning for AUV Thruster Failure Recovery

Abstract—This paper investigates learning approaches for discovering fault-tolerant control policies to overcome thruster failures in Autonomous Underwater Vehicles (AUVs). The proposed approach is a model-based direct policy search that learns on an on-board simulated model of the vehicle. When a fault is detected and isolated, the model of the AUV is reconfigured according to the new condition. To discover a set of optimal solutions, a multi-objective reinforcement learning approach is employed that can deal with multiple conflicting objectives. Each optimal solution can be used to generate a trajectory that is able to navigate the AUV towards a specified target while satisfying multiple objectives. The discovered policies are executed on the robot in a closed loop using the AUV's state feedback. Unlike most existing methods, which disregard the faulty thruster, our approach can also deal with partially broken thrusters to increase the persistent autonomy of the AUV. In addition, the proposed approach is applicable whether the AUV becomes under-actuated or remains redundant in the presence of a fault. We validate the proposed approach on the model of the Girona500.

Published in: IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2014), Orlando, FL, Dec 9-12. [PDF][bibtex][IEEE]

Online Discovery of AUV Control Policies to Overcome Thruster Failures

Abstract— We investigate methods to improve fault-tolerance of Autonomous Underwater Vehicles (AUVs) to increase their reliability and persistent autonomy. We propose a learning-based approach that is able to discover new control policies to overcome thruster failures as they happen. The proposed approach is a model-based direct policy search that learns on an on-board simulated model of the AUV. The model is adapted to a new condition when a fault is detected and isolated. Since the approach generates an optimal trajectory, the learned fault-tolerant policy is able to navigate the AUV towards a specified target with minimum cost. Finally, the learned policy is executed on the real robot in a closed-loop using the state feedback of the AUV. Unlike most existing methods which rely on the redundancy of thrusters, our approach is also applicable when the AUV becomes under-actuated in the presence of a fault. To validate the feasibility and efficiency of the presented approach, we evaluate it with three learning algorithms and three policy representations with increasing complexity. The proposed method is tested on a real AUV, Girona500. 
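The model-based direct policy search described above can be sketched on a toy problem: candidate policies are rolled out on a simulated model of the (possibly reconfigured) vehicle, and the search keeps the best-performing parameters. The 1-D surge model, linear policy form, and random search below are simplifications and assumptions, not the paper's exact algorithm or policy representations.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(params, target=1.0, steps=50):
    """Roll out a linear policy u = k * (target - x) on a toy surge model."""
    k, = params
    x, v, cost = 0.0, 0.0, 0.0
    for _ in range(steps):
        u = np.clip(k * (target - x), -1.0, 1.0)   # saturated thrust command
        v = 0.9 * v + 0.1 * u                      # damped velocity update
        x += 0.1 * v                               # position update
        cost += (target - x) ** 2 + 0.01 * u ** 2  # tracking error + effort
    return cost

# Direct policy search: sample parameters, evaluate on the model, keep the best.
best_params, best_cost = None, np.inf
for _ in range(200):
    params = rng.uniform(0.0, 10.0, size=1)
    c = simulate(params)
    if c < best_cost:
        best_params, best_cost = params, c
```

Because the rollouts run on the on-board model rather than the damaged vehicle, the search can be repeated whenever the model is reconfigured after a fault.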

Published in: IEEE International Conference on Robotics and Automation (ICRA 2014), Hong Kong, China, May 31 - June 7, 2014. [PDF][bibtex][Video][IEEE]

Interactive Robot Learning of Visuospatial Skills

Abstract-- This paper proposes a novel interactive robot learning approach for acquiring visuospatial skills. It allows a robot to acquire new capabilities by observing a demonstration while interacting with a human caregiver. Most existing learning from demonstration approaches focus on the trajectories, whereas in our approach the focus is placed on achieving a desired goal configuration of objects relative to one another. Our approach is based on visual perception, which captures the object’s context for each demonstrated action. The context implicitly embodies the visuospatial representation, including the relative positioning of the object with respect to multiple other objects simultaneously. The proposed approach is capable of learning and generalizing different skills such as object reconfiguration, classification, and turn-taking interaction. The robot learns to achieve the goal from a single demonstration while requiring minimum a priori knowledge about the environment. We illustrate the capabilities of our approach using four real-world experiments with a Barrett WAM robot.

*Visuospatial Skill Learning has been used by Microsoft Robotics [Paper][Video]

Published in: Proceedings of the 16th IEEE International Conference on Advanced Robotics (ICAR 2013), Montevideo, Uruguay, 25-29 Nov 2013. [PDF][bibtex][video][IEEE]

Online Direct Policy Search for Thruster Failure Recovery in Autonomous Underwater Vehicles

Abstract-- Autonomous underwater vehicles are prone to various factors that may lead a mission to fail and cause unrecoverable damages. Even robust controllers cannot make sure that the robot is able to navigate to a safe location in such situations. In this paper we propose an online learning method for reconfiguring the controller, which tries to recover the robot and survive the mission using the current asset of the system. The proposed method is framed in the reinforcement learning setting, and in particular as a model-based direct policy search approach. Since learning on a damaged vehicle would be impossible owing to time and energy constraints, learning is performed on a model which is identified and kept updated online. We evaluate the applicability of our method with different policy representations and learning algorithms, on the model of the Girona500 autonomous underwater vehicle.

Published in: 6th International Workshop on Evolutionary and Reinforcement Learning for Autonomous Robot Systems (ERLARS 2013), Taormina, Italy, 2 Sept 2013 [PDF][bibtex]

On-line Learning to Recover from Thruster Failures on Autonomous Underwater Vehicles

Abstract—We propose a method for computing on-line the controller of an Autonomous Underwater Vehicle under thruster failures. The method is general and can be applied to both redundant and under-actuated AUVs, as it does not rely on the modification of the thruster control matrix. We define an optimization problem on a specific class of functions in order to compute the optimal control law that achieves the target without using the faulty thruster. The method is framed within model-based policy search for reinforcement learning, and we study its applicability on the model of the AUV Girona500. We performed experiments with policies of increasing complexity, testing the on-line feasibility of the approach as the optimization problem becomes more complex.

Published in: OCEANS 2013 MTS/IEEE, San Diego, USA, 23-26 Sept 2013. [PDF][bibtex][IEEE]

Visuospatial Skill Learning for Object Reconfiguration Tasks

Abstract-- We present a novel robot learning approach based on visual perception that allows a robot to acquire new skills by observing a demonstration from a tutor. Unlike most existing learning from demonstration approaches, where the focus is placed on the trajectories, in our approach the focus is on achieving a desired goal configuration of objects relative to one another. Our approach is based on visual perception which captures the object's context for each demonstrated action. This context is the basis of the visuospatial representation and encodes implicitly the relative positioning of the object with respect to multiple other objects simultaneously. The proposed approach is capable of learning and generalizing multi-operation skills from a single demonstration, while requiring minimum a priori knowledge about the environment. The learned skills comprise a sequence of operations that aim to achieve the desired goal configuration using the given objects. We illustrate the capabilities of our approach using three object reconfiguration tasks with a Barrett WAM robot.
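The goal-configuration idea can be illustrated with a hedged sketch: rather than recording trajectories, compare object poses before and after each demonstrated action and extract an operation for any object that moved. The pose dictionaries, object names, and threshold below are illustrative assumptions, not the paper's perception pipeline.

```python
import numpy as np

def extract_operations(before, after, tol=1e-3):
    """Return (object, start_pose, goal_pose) for every object that moved."""
    ops = []
    for name, pose in before.items():
        if np.linalg.norm(np.asarray(after[name]) - np.asarray(pose)) > tol:
            ops.append((name, tuple(pose), tuple(after[name])))
    return ops

before = {"cup": (0.1, 0.2), "plate": (0.4, 0.4)}   # pre-demonstration poses (assumed)
after = {"cup": (0.3, 0.5), "plate": (0.4, 0.4)}    # only the cup moved
ops = extract_operations(before, after)
```

Chaining such extracted operations over a demonstration yields the sequence needed to reach the goal configuration, independent of the particular trajectory the tutor used.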

*Visuospatial Skill Learning has been used by Microsoft Robotics [Paper][Video]

Published in: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2013), Tokyo, Japan, 3-8 Nov 2013. [PDF][bibtex][video][IEEE]

Autonomous Robotic Valve Turning: A Hierarchical Learning Approach

Abstract— Autonomous valve turning is an extremely challenging task for an Autonomous Underwater Vehicle (AUV). To address this challenge, this paper proposes a set of different computational techniques integrated in a three-layer hierarchical scheme. Each layer realizes specific subtasks to improve the persistent autonomy of the system. In the first layer, the robot acquires the motor skills of approaching and grasping the valve by kinesthetic teaching. A Reactive Fuzzy Decision Maker (RFDM), devised in the second layer, reacts to the relative movement between the valve and the AUV and alters the robot’s movement accordingly. An apprenticeship learning method, implemented in the third layer, tunes the RFDM based on expert knowledge. Although the long-term goal is to perform the valve turning task on a real AUV, as a first step the proposed approach is tested in a laboratory environment.

Published in: IEEE International Conference on Robotics and Automation (ICRA 2013), Karlsruhe, Germany, May 6-10, 2013.

Modeling of Hyper-Redundant Manipulators Dynamics and Design of Fuzzy Controller for the System

Abstract-- Hyper-redundant manipulators, which have many more degrees of freedom than required, offer many advantages and important capabilities. In this paper, the manipulator dynamics are modeled using a special curve called the 'backbone' curve, the modal method, Lagrangian mechanics, and a geometric transformation between the variable spaces of the backbone curve and the manipulator joints. The dependency of the nonlinear and coupled terms of the dynamic model on the joint variables complicates classical controller design. To overcome this problem, fuzzy controllers, which perform well in complex and nonlinear systems, are used. To demonstrate this, the dynamics of a 10-degree-of-freedom manipulator are modeled, and a fuzzy controller is designed with attention to the dynamic behavior of the system. The manipulator's behavior under various and noisy inputs is evaluated by simulating the model with the fuzzy controller. The results show very small errors in the manipulator's motions and confirm the suitability of the designed fuzzy controller based on the dynamic model.

Published in: IEEE International Conference on Integration of Knowledge Intensive Multi-Agent Systems (KIMAS), Waltham, MA, USA, April 18-21, 2005. [PDF][bibtex][IEEE]
