Quantum computing and neuromorphic computing for safe, reliable, and explainable multi-agent reinforcement learning: optimal control in autonomous robotics

Quantum computing and neuromorphic computing for multi-agent reinforcement learning

This paper introduces a novel hybrid framework that integrates quantum computing and neuromorphic computing to enhance
the safety, reliability, and explainability of Multi-Agent Reinforcement Learning (MARL) in autonomous robotic systems.
The proposed architecture employs quantum variational circuits for high-level policy exploration and spiking neural networks
for energy-efficient, low-latency motor control. Adopting a centralized training and decentralized execution paradigm, the
framework enables agents to optimize joint policies that combine quantum planning with neuromorphic execution under
partial observability and safety constraints. We evaluate the framework in a simulated environment featuring ten UAV agents
navigating dynamic forest terrain with limited visibility and obstacle avoidance requirements. Empirical results demonstrate
that the hybrid system significantly reduces safety violations while maintaining entropy-based exploration and interpretable
spike-based decision traces. KL divergence metrics confirm the convergence of quantum policies toward safe priors, while
spike entropy analysis reveals temporal diversity in control signals. The key contributions of this work include: (i) a modular
quantum-neuromorphic MARL architecture, (ii) a hybrid training framework incorporating safety-aware coordination, and
(iii) empirical validation through both visual diagnostics and formal metrics. This research establishes a foundation for next-generation
embodied AI systems that unify the optimization capabilities of quantum computing with the biological plausibility
of neuromorphic control.
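The KL-divergence and spike-entropy diagnostics mentioned above can be sketched in a few lines of code. The following is a minimal illustration, not the paper's implementation: the function names, the uniform safe prior, and the example action distributions and spike counts are all assumptions for demonstration.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions (nats)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def spike_entropy(spike_counts, eps=1e-12):
    """Shannon entropy (nats) of a normalized spike-count histogram,
    measuring temporal diversity of spiking activity."""
    h = np.asarray(spike_counts, dtype=float) + eps
    h /= h.sum()
    return float(-np.sum(h * np.log(h)))

# Hypothetical quantum policy over 4 discrete actions vs. a uniform safe prior:
policy = [0.70, 0.15, 0.10, 0.05]
safe_prior = [0.25, 0.25, 0.25, 0.25]
print(f"KL(policy || safe prior) = {kl_divergence(policy, safe_prior):.4f}")

# Hypothetical spike counts per time bin from a spiking motor-control layer:
spikes = [3, 5, 2, 4, 6, 1]
print(f"Spike entropy = {spike_entropy(spikes):.4f} nats")
```

A KL value shrinking over training would indicate the quantum policy converging toward the safe prior, while higher spike entropy indicates more temporally diverse control signals.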