A Survey of Multi-Agent Reinforcement Learning for Cooperative Control in Multi-AUV Systems
Author(s) -
Arif Wibisono,
Hyoung-Kyu Song,
Byung Moo Lee
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3609457
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
The growing demand for adaptive and autonomous smart ocean systems has driven the adoption of Autonomous Underwater Vehicles (AUVs) in a range of complex underwater missions, including collaborative navigation, target tracking, data collection, and energy management. To address the unique challenges of the underwater environment, such as limited acoustic communication, unpredictable environmental dynamics, and partial observability, Multi-Agent Reinforcement Learning (MARL) under the Centralized Training with Decentralized Execution (CTDE) paradigm has emerged as a promising solution. This paper presents a comprehensive survey of MARL algorithms and their applications in multi-AUV systems. We classify the algorithms into value-based approaches (such as VDN and QMIX), policy-based approaches (such as Multi-Agent Deep Deterministic Policy Gradient (MADDPG) and Multi-Agent Proximal Policy Optimization (MAPPO)), and hybrid approaches that support explicit communication (such as CommNet, DIAL, and ROMA). We also review relevant simulation environments, including PettingZoo, UWSim, and Gazebo, as well as underwater acoustic channel modeling and the key performance metrics used to evaluate these systems. To complement the survey, we present experimental studies on data collection and energy efficiency scenarios using MADDPG and MAPPO. The results include a comparative analysis of average reward, overflow, Flight eXceedance (FX), and energy consumption, along with real-time AUV trajectory visualization through a Matplotlib interface. Finally, we propose future research directions, including the integration of Meta-Reinforcement Learning (Meta-RL), graph-based role allocation, and training in 3D physics simulators, to accelerate the adoption of MARL in large-scale autonomous ocean systems.
This survey is intended to serve as a strategic reference for the development of smarter, more collaborative, and more resilient AUVs capable of withstanding extreme conditions in support of future smart ocean initiatives.
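Among the value-based approaches the abstract classifies, VDN's core idea is to decompose the joint action-value into a sum of per-agent utilities, so that training can be centralized over the joint reward while each agent acts greedily on its own utility at execution time. A toy tabular sketch of that additivity property (function names and numbers are hypothetical illustrations, not from the paper):

```python
# Minimal sketch of VDN-style value decomposition under CTDE.
# Per-agent Q-values are plain lists indexed by discrete action.

def joint_q(per_agent_q, actions):
    """VDN additivity: Q_tot(s, a_1..a_n) = sum_i Q_i(s, a_i)."""
    return sum(q[a] for q, a in zip(per_agent_q, actions))

def decentralized_greedy(per_agent_q):
    """Each agent maximizes its own Q_i independently; under the
    additive decomposition this also maximizes Q_tot, which is
    what makes decentralized execution consistent with
    centralized training."""
    return [max(range(len(q)), key=q.__getitem__) for q in per_agent_q]

# Toy example: two AUV agents, three discrete actions each.
q1 = [0.1, 0.8, 0.3]   # agent 1's utilities for its actions
q2 = [0.5, 0.2, 0.9]   # agent 2's utilities
acts = decentralized_greedy([q1, q2])
print(acts)                      # [1, 2]
print(joint_q([q1, q2], acts))   # 1.7
```

QMIX generalizes this by replacing the sum with a learned monotonic mixing network, preserving the same greedy-consistency property.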