Multi-Objective Reinforcement Learning for AUV Thruster Failure Recovery

Seyed Reza Ahmadzadeh, Petar Kormushev, Darwin G. Caldwell

Reference:
Seyed Reza Ahmadzadeh, Petar Kormushev, Darwin G. Caldwell, "Multi-Objective Reinforcement Learning
for AUV Thruster Failure Recovery", In Proc. IEEE Symp. Series on Adaptive Dynamic Programming and
Reinforcement Learning (ADPRL 2014), IEEE Symp. Series on Computational Intelligence (SSCI 2014),
Orlando, FL, USA, 8-12 Dec. 2014.
BibTeX Entry:
@INPROCEEDINGS{ahmadzadeh2014multi,
  TITLE={Multi-Objective Reinforcement Learning for {AUV} Thruster Failure Recovery},
  AUTHOR={Ahmadzadeh, Seyed Reza and Kormushev, Petar and Caldwell, Darwin G.},
  BOOKTITLE={{IEEE} Symposium on Adaptive Dynamic Programming and Reinforcement Learning ({ADPRL}), Proc. {IEEE} Symposium Series on Computational Intelligence ({SSCI})},
  PAGES={1--8},
  YEAR={2014},
  MONTH={December},
  ORGANIZATION={IEEE},
  ADDRESS={Florida, USA},
  DOI={10.1109/ADPRL.2014.7010621}
}
DOI:
10.1109/ADPRL.2014.7010621
Abstract:
This paper investigates learning approaches for discovering fault-tolerant control policies to
overcome thruster failures in Autonomous Underwater Vehicles (AUVs). The proposed approach is a
model-based direct policy search that learns on an on-board simulated model of the vehicle. When
a fault is detected and isolated, the model of the AUV is reconfigured according to the new
condition. To discover a set of optimal solutions, a multi-objective reinforcement learning approach
is employed, which can deal with multiple conflicting objectives. Each optimal solution can be used
to generate a trajectory that navigates the AUV towards a specified target while satisfying
multiple objectives. The discovered policies are executed on the robot in closed loop using the
AUV's state feedback. Unlike most existing methods, which disregard the faulty thruster, our
approach can also deal with partially broken thrusters to increase the persistent autonomy of the
AUV. In addition, the proposed approach is applicable whether the AUV becomes under-actuated
or remains redundant in the presence of a fault. We validate the proposed approach on a model of
the Girona500 AUV.
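
Illustrative note: the "set of optimal solutions" for conflicting objectives mentioned in the abstract is commonly understood as a set of non-dominated (Pareto-optimal) candidates. The sketch below is not the paper's algorithm; it only shows, under assumptions, how non-dominated candidates could be filtered from a list of evaluated policies. The two objectives (final distance to target and thrust effort, both minimized), the candidate values, and the function names dominates and pareto_front are all hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if cost vector a dominates b (all objectives minimized):
    a is no worse than b everywhere and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(costs: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of the cost vectors not dominated by any other."""
    return [
        i
        for i, c in enumerate(costs)
        if not any(dominates(o, c) for j, o in enumerate(costs) if j != i)
    ]


# Hypothetical evaluations of five candidate policies on a simulated model:
# (final distance to target, total thrust effort). Values are made up.
candidates = [(0.5, 9.0), (1.0, 4.0), (2.0, 3.5), (1.5, 6.0), (0.5, 10.0)]
front = pareto_front(candidates)
print(front)  # indices of the non-dominated candidates
```

Each index returned corresponds to one trade-off between the two objectives; choosing among them is a separate decision that this sketch does not address.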
