Online Discovery of AUV Control Policies to Overcome Thruster Failures

Post date: May 30, 2016 10:08:44 PM

Seyed Reza Ahmadzadeh, Matteo Leonetti, Arnau Carrera, Marc Carreras, Petar Kormushev, Darwin G. Caldwell

Reference:

Seyed Reza Ahmadzadeh, Matteo Leonetti, Arnau Carrera, Marc Carreras, Petar Kormushev, Darwin G. Caldwell, "Online Discovery of AUV Control Policies to Overcome Thruster Failures", In Proc. IEEE Intl. Conf. on Robotics and Automation (ICRA 2014), Hong Kong, China, pp. 6522-6528, 31 May-7 June 2014.

Bibtex Entry:

@INPROCEEDINGS{ahmadzadeh2014online,
  TITLE={Online Discovery of {AUV} Control Policies to Overcome Thruster Failures},
  AUTHOR={Ahmadzadeh, Seyed Reza and Leonetti, Matteo and Carrera, Arnau and Carreras, Marc and Kormushev, Petar and Caldwell, Darwin G.},
  BOOKTITLE={Robotics and Automation ({ICRA}), {IEEE} International Conference on},
  PAGES={6522--6528},
  YEAR={2014},
  MONTH={May},
  ADDRESS={Hong Kong, China},
  ORGANIZATION={IEEE},
  DOI={10.1109/ICRA.2014.6907821}
}

DOI:

10.1109/ICRA.2014.6907821

Abstract:

We investigate methods to improve fault-tolerance of Autonomous Underwater Vehicles (AUVs) to increase their reliability and persistent autonomy. We propose a learning-based approach that is able to discover new control policies to overcome thruster failures as they happen. The proposed approach is a model-based direct policy search that learns on an on-board simulated model of the AUV. The model is adapted to a new condition when a fault is detected and isolated. Since the approach generates an optimal trajectory, the learned fault-tolerant policy is able to navigate the AUV towards a specified target with minimum cost. Finally, the learned policy is executed on the real robot in closed loop using the state feedback of the AUV. Unlike most existing methods, which rely on the redundancy of thrusters, our approach is also applicable when the AUV becomes under-actuated in the presence of a fault. To validate the feasibility and efficiency of the presented approach, we evaluate it with three learning algorithms and three policy representations with increasing complexity. The proposed method is tested on a real AUV, Girona500.