Dynamical transition in controllable quantum neural networks with large depth

Quantum neural networks are crucial for near-term quantum applications, yet their training dynamics remain poorly understood. We derive first-principle equations to describe these dynamics, identifying a dynamical transition that informs the design of cost functions to accelerate convergence.
Published in Physics and Computational Sciences

Introduction

Quantum neural networks (QNNs), a paradigm for near-term quantum computing, have been widely applied in fields such as chemistry, optimization, quantum simulation, condensed matter physics, sensing, and machine learning. However, the late-time training dynamics of QNNs remain largely unexplored. Previous results, such as the barren plateau phenomenon and Quantum Neural Tangent Kernel (QNTK) theory, focus on QNNs at random initialization and fail to capture the behavior of the circuit as it approaches late-time convergence. Can we open this black box by providing a theoretical understanding of QNN training dynamics at later stages?

Fig.1 We study the training dynamics of quantum neural networks with a quadratic loss function and identify a dynamical transition. We derive a first-principle generalized Lotka-Volterra model to characterize it, and also provide interpretations from the random unitary ensemble and the Schrödinger equation.

Dynamical transition in training dynamics

Training a QNN amounts to minimizing a cost function, for instance the squared error between the measurement expectation of the output state and a target value (Fig.1). At each training step, every parameter in the quantum circuit is updated via gradient descent. Through a first-order Taylor expansion, we derive a generalized Lotka-Volterra equation that simultaneously describes the training dynamics of the error and of the QNTK, where the QNTK is the squared norm of the gradient. Interestingly, the generalized Lotka-Volterra equation here parallels predator-prey dynamics in ecology, with a zero birth rate. By identifying a conserved quantity of the dynamical equation, we obtain analytical solutions for two branches of dynamics and a critical point, summarized in the left corner of Fig.1. When the target is greater than the minimum eigenvalue of the measurement operator, the error decays exponentially while the QNTK remains constant, giving the frozen-kernel dynamics. When the target equals the minimum eigenvalue, both the error and the QNTK decay polynomially with the number of training steps; this is the critical point. When the target is set below the minimum eigenvalue, the total error converges to a nonzero constant because the target is unachievable, giving the frozen-error dynamics; nevertheless, the vanishing part of the error and the QNTK still decay exponentially. This dynamical transition corresponds to a transcritical bifurcation, where two fixed points exchange stability.
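To make the three branches concrete, the sketch below numerically integrates a generalized Lotka-Volterra system of the zero-birth-rate form described above. The specific update rule, the learning rate eta, and the coupling lam are schematic placeholders chosen for illustration, not the exact coefficients derived in the paper.

```python
# Minimal numerical sketch of generalized Lotka-Volterra-type training dynamics.
# Schematic form assumed from the description above (zero birth rate):
#   d(eps)/dt = -eta * K * eps        (gradient descent on the quadratic loss)
#   d(K)/dt   = -eta * lam * eps * K  (kernel dragged along by the error)
# The combination C = K - lam * eps is conserved, and its sign selects the branch.

def simulate(eps0, K0, eta=0.01, lam=1.0, steps=5000):
    eps, K = eps0, K0
    traj = []
    for t in range(steps):
        traj.append((t, eps, K))
        # Simultaneous Euler update using the old (eps, K) values.
        eps, K = eps - eta * K * eps, K - eta * lam * eps * K
    return traj

# Frozen-kernel branch: C = K0 - lam*eps0 > 0, error decays exponentially.
frozen_kernel = simulate(eps0=0.5, K0=2.0)
# Critical point: C = 0, both error and kernel decay polynomially (~1/t).
critical = simulate(eps0=1.0, K0=1.0)
# Frozen-error branch: C < 0, error saturates at -C/lam while K decays exponentially.
frozen_error = simulate(eps0=2.0, K0=1.0)

for name, traj in [("frozen-kernel", frozen_kernel),
                   ("critical", critical),
                   ("frozen-error", frozen_error)]:
    t, eps, K = traj[-1]
    print(f"{name:13s}  final error {eps:.4f}   final QNTK {K:.4f}")
```

Running the sketch prints a vanishing error with a finite kernel (frozen-kernel branch), a slow ~1/t decay of both quantities at the critical point, and a finite residual error with a vanishing kernel (frozen-error branch).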

Speeding up the convergence

Our theoretical framework also applies to training with a linear loss, where the error decays exponentially toward the ground-state energy at a fixed rate. With a quadratic loss, setting the target at the ground-state energy only enables a polynomial decay of the error, which is much slower than ground-state preparation with the linear loss. However, setting the target below the ground-state energy restores exponential convergence toward the ground state, with a rate controlled by the target, and thus yields faster convergence. We compare the error decay for these cases in Fig.2 and verify the predicted speedup against theory.
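Schematically, with $E_{\min}$ the ground-state energy, $O_0$ the target value, and $c_1, c_2$ unspecified positive constants standing in for the precise rates derived in the paper (and assuming, consistent with the discussion above, that the rate in the frozen-error regime scales with the gap $E_{\min}-O_0$), the three behaviors can be summarized as

$$
\epsilon_{\mathrm{lin}}(t)\sim e^{-c_1 t},\qquad
\epsilon_{\mathrm{quad}}(t)\big|_{O_0=E_{\min}}\sim \frac{1}{t},\qquad
\big(\langle \hat O\rangle(t)-E_{\min}\big)\big|_{O_0<E_{\min}}\sim e^{-c_2\,(E_{\min}-O_0)\,t},
$$

so pushing the target further below $E_{\min}$ speeds up the exponential approach to the ground state, at the price of a total loss that never reaches zero.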

Fig.2 Training dynamics comparison. Black: linear loss. Green: critical point. Red and blue: frozen-error dynamics.

Discussion

Our results go beyond the early-time, random-initialization assumptions common in QNN studies and uncover rich physics in the training dynamics. The target-driven transcritical bifurcation in QNN dynamics points to a transition that occurs without symmetry breaking and offers guidance for designing better cost functions in practical applications. Our work can also be viewed as a single-data scenario of supervised learning, paving the way toward a theory of data in quantum machine learning.
