Quantum-Inspired Multi-Agent Reinforcement Learning for Exploration–Exploitation Optimization in UAV-Assisted 6G Network Deployment
Published in Electrical & Electronic Engineering, Computational Sciences, and Statistics
This study introduces a quantum-inspired framework for optimizing the
exploration–exploitation tradeoff in multi-agent reinforcement learning (MARL),
applied to UAV-assisted 6G network deployment. We consider a cooperative scenario
where ten intelligent UAVs autonomously coordinate to maximize signal
coverage and support efficient network expansion under partial observability and
dynamic conditions. The proposed approach integrates classical MARL algorithms
with quantum-inspired optimization techniques, leveraging variational quantum
circuits (VQCs) as the core structure and employing the Quantum Approximate
Optimization Algorithm (QAOA) as a representative VQC-based method for
combinatorial optimization. Complementary probabilistic modeling is incorporated
through Bayesian inference, Gaussian processes, and variational inference to
capture latent environmental dynamics. A centralized training with decentralized
execution (CTDE) paradigm is adopted, where shared memory and local view
grids enhance local observability among agents.
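
As a rough illustration of what a local view grid and a coverage objective could look like, the following minimal sketch places UAVs on a square cell grid, marks the cells each one covers, and extracts the window a single agent would observe. It is not taken from the authors' code: the function names, the square (Chebyshev) coverage footprint, the grid size, and the padding value for out-of-bounds cells are all assumptions made for the example.

```python
from typing import List, Tuple

Grid = List[List[int]]


def coverage_grid(size: int, uavs: List[Tuple[int, int]], radius: int) -> Grid:
    """Mark every cell within Chebyshev distance `radius` of any UAV as covered (1).

    The square footprint is an assumption for illustration, not the paper's signal model.
    """
    grid = [[0] * size for _ in range(size)]
    for r, c in uavs:
        for i in range(max(0, r - radius), min(size, r + radius + 1)):
            for j in range(max(0, c - radius), min(size, c + radius + 1)):
                grid[i][j] = 1
    return grid


def coverage_ratio(grid: Grid) -> float:
    """Fraction of cells covered by at least one UAV."""
    total = sum(len(row) for row in grid)
    return sum(map(sum, grid)) / total


def local_view(grid: Grid, pos: Tuple[int, int], view: int, pad: int = -1) -> Grid:
    """Return the (2*view+1) x (2*view+1) window centred on `pos`.

    Cells outside the map are filled with `pad`, so every agent sees a
    fixed-size observation even at the map boundary (partial observability).
    """
    r, c = pos
    size = len(grid)
    return [
        [grid[i][j] if 0 <= i < size and 0 <= j < size else pad
         for j in range(c - view, c + view + 1)]
        for i in range(r - view, r + view + 1)
    ]


if __name__ == "__main__":
    # Two hypothetical UAVs on a 6x6 map, each covering a 3x3 footprint.
    grid = coverage_grid(6, [(1, 1), (4, 4)], radius=1)
    print(coverage_ratio(grid))
    print(local_view(grid, (0, 0), view=1))
```

In a CTDE setup, a window like the one returned by `local_view` would be the kind of input a decentralized policy sees at execution time, while a global quantity such as `coverage_ratio` would only be available to the centralized critic during training.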
Comprehensive experiments (scalability tests, sensitivity analysis, and
comparisons with PPO and DDPG baselines) demonstrate that the proposed
framework improves sample efficiency, accelerates convergence, and enhances
coverage performance while maintaining robustness. Radar-chart and convergence
analyses further show that the proposed quantum-inspired MARL framework
(QI-MARL) achieves a better balance between exploration and exploitation
than classical methods. All implementation
code and supplementary materials are publicly available on GitHub to ensure
reproducibility.