Multi-Objective Reinforcement Learning for AUV Thruster Failure Recovery

Post date: May 30, 2016 10:17:41 PM

Seyed Reza Ahmadzadeh, Petar Kormushev, Darwin G. Caldwell

Reference:

Seyed Reza Ahmadzadeh, Petar Kormushev, Darwin G. Caldwell, "Multi-Objective Reinforcement Learning for AUV Thruster Failure Recovery", In Proc. IEEE Symp. Series on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2014), IEEE Symp. Series on Computational Intelligence (SSCI 2014), Orlando, FL, USA, 8-12 Dec. 2014.

Bibtex Entry:

@INPROCEEDINGS{ahmadzadeh2014multi,
  TITLE={Multi-Objective Reinforcement Learning for {AUV} Thruster Failure Recovery},
  AUTHOR={Ahmadzadeh, Seyed Reza and Kormushev, Petar and Caldwell, Darwin G.},
  BOOKTITLE={{IEEE} Symposium on Adaptive Dynamic Programming and Reinforcement Learning ({ADPRL}), Proc. {IEEE} Symposium Series on Computational Intelligence ({SSCI})},
  PAGES={1--8},
  YEAR={2014},
  MONTH={December},
  ORGANIZATION={IEEE},
  ADDRESS={Florida, USA},
  DOI={10.1109/ADPRL.2014.7010621}
}

DOI:

10.1109/ADPRL.2014.7010621

Abstract:

This paper investigates learning approaches for discovering fault-tolerant control policies to overcome thruster failures in Autonomous Underwater Vehicles (AUVs). The proposed approach is a model-based direct policy search that learns on an on-board simulated model of the vehicle. When a fault is detected and isolated, the model of the AUV is reconfigured according to the new condition. To discover a set of optimal solutions, a multi-objective reinforcement learning approach is employed, which can deal with multiple conflicting objectives. Each optimal solution can be used to generate a trajectory that is able to navigate the AUV towards a specified target while satisfying multiple objectives. The discovered policies are executed on the robot in closed loop using the AUV's state feedback. Unlike most existing methods, which disregard the faulty thruster, our approach can also deal with partially broken thrusters to increase the persistent autonomy of the AUV. In addition, the proposed approach is applicable when the AUV either becomes under-actuated or remains redundant in the presence of a fault. We validate the proposed approach on the model of the Girona500 AUV.
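
Illustrative sketch (not taken from the paper): the "set of optimal solutions" for conflicting objectives can be read as a set of non-dominated, or Pareto-optimal, candidates. The Python snippet below shows one simple way to filter such a set from evaluated policies. The two objectives (distance to target and thruster effort), the cost values, and the function names dominates and pareto_front are assumptions made purely for illustration; the paper's actual objectives and search method may differ.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if cost vector a is no worse than b in every objective
    and strictly better in at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(costs: List[Sequence[float]]) -> List[int]:
    """Return the indices of cost vectors that no other vector dominates."""
    return [
        i
        for i, c in enumerate(costs)
        if not any(dominates(other, c) for j, other in enumerate(costs) if j != i)
    ]


# Hypothetical (distance-to-target, thruster-effort) costs for five
# candidate policies evaluated on a simulated vehicle model.
candidate_costs = [
    (0.5, 9.0),
    (1.0, 6.0),
    (2.0, 6.5),  # dominated by (1.0, 6.0)
    (3.0, 2.0),
    (3.5, 2.5),  # dominated by (3.0, 2.0)
]

front = pareto_front(candidate_costs)
print(front)
```

Each index kept in front corresponds to a different trade-off between the two assumed objectives; choosing among them is a separate decision that this sketch does not address.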