DRL-SecRoute: A Synergetic Deep Reinforcement Learning Paradigm for Mitigating Byzantine Faults and SSDF Attacks through Heuristic Spectrum Cognizance in Next-Generation Cognitive Radio Networks

Manish Kumar Dixit Jun 15, 2026

Cognitive radio wireless sensor networks (CR-WSNs) are particularly susceptible to
routing vulnerabilities arising from dynamic spectrum availability and sophisticated
adversarial attacks, emphasizing the need for secure and efficient routing mechanisms.
Current solutions address routing optimization, spectrum management, and
security as independent tasks, leading to suboptimal performance and susceptibility
to Byzantine jamming, spectrum sensing data falsification (SSDF), and primary
user emulation attacks (PUEA). This article introduces DRL-SecRoute (deep reinforcement
learning-based secure routing), a new unified secure routing framework
that synergistically integrates deep reinforcement learning with adaptive spectrum
sensing to address the multi-dimensional optimization problem of secure routing
in dynamic CR-WSN environments. The key contributions are fourfold: (1) a twin
delayed deep deterministic policy gradient (TD3) algorithm enhanced with prioritized
experience replay (PER), specifically designed for continuous state-action
spaces in CR-WSNs, achieving 40% faster convergence than existing discrete-action
methods; (2) a hybrid adaptive spectrum sensing mechanism combining Bayesian
inference with Kalman filtering that reduces sensing overhead by 37.4% while maintaining
high prediction accuracy; (3) a lightweight multi-layered anomaly detection
system integrating statistical divergence analysis with unsupervised learning (Isolation
Forest) to detect diverse attacks with 91.7% accuracy and a false-positive rate
below 6%; and (4) a multi-objective optimization framework that jointly optimizes
routing latency, energy consumption, spectrum efficiency, and security risk
through the unified DRL approach. Extensive simulation experiments across varying
network densities (50–200 nodes), traffic loads, spectrum activity levels, and
six attack scenarios, including an adaptive reinforcement learning-based adversary,
demonstrate that DRL-SecRoute achieves a packet delivery ratio (PDR) of up to
97.8% under collaborative attack conditions and sustains 84.2% PDR even against
the adaptive adversary, improves energy efficiency by 32.7% over existing
protocols, maintains a spectrum access collision rate below 3.5% across all primary
user activity levels, and extends network lifetime by 41.3%.
These results confirm that DRL-SecRoute offers a reliable and scalable foundation
for next-generation CR-WSN deployments.
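
As an illustration of the prioritized experience replay (PER) component named in contribution (1), the following is a minimal sketch of standard proportional prioritization: transition i is sampled with probability P(i) = p_i^alpha / sum_k p_k^alpha, where p_i = |TD error_i| + eps, and the resulting bias is corrected with importance-sampling weights w_i = (N * P(i))^(-beta) normalized by the largest weight. The function names, the default values of alpha, beta, and eps, and the list-based buffer are assumptions made for illustration only; the abstract does not specify how DRL-SecRoute implements PER.

```python
import random


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha.

    p_i = |td_error_i| + eps, so no transition has zero sampling probability.
    alpha = 0 recovers uniform sampling. (alpha and eps are assumed defaults.)
    """
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def importance_weights(probs, beta=0.4):
    """Importance-sampling weights w_i = (N * P(i))^(-beta).

    Weights are divided by the largest weight so they only scale updates down.
    Here the maximum is taken over the whole buffer, a simplification.
    """
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    max_w = max(raw)
    return [w / max_w for w in raw]


def sample_batch(buffer, td_errors, batch_size, alpha=0.6, beta=0.4, rng=random):
    """Draw a batch (with replacement) of (transition, weight) pairs."""
    probs = per_probabilities(td_errors, alpha)
    weights = importance_weights(probs, beta)
    indices = rng.choices(range(len(buffer)), weights=probs, k=batch_size)
    return [(buffer[i], weights[i]) for i in indices]


if __name__ == "__main__":
    # Hypothetical transitions labelled by name, with made-up TD errors.
    transitions = ["t0", "t1", "t2"]
    errors = [0.1, 1.0, 2.0]
    for transition, weight in sample_batch(transitions, errors, batch_size=4):
        print(transition, round(weight, 3))
```

In a TD3 training loop, each sampled weight would multiply the corresponding critic loss term, and the stored TD errors would be refreshed after every update so that priorities track the current critics.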