Potential and limitations of Quantum Extreme Learning Machines

Published in Physics

Introduction

Classical ELM    In the vast landscape of machine learning, two models that have received significant interest for their simplicity and potential are extreme learning machines (ELMs) and reservoir computers (RCs). These models work by preprocessing input data through a potentially complex, but untrained, nonlinear mapping, referred to in this context as a reservoir. Such preprocessing makes it possible to learn patterns in the input data by training only a simple linear readout layer. In particular, RCs are distinguished by their capability to process sequences of data that change over time, employing a reservoir which retains memory of the inputs seen at previous times. A distinctive feature of ELMs and RCs is the simplicity of the training phase, which amounts to solving a linear regression problem.

Figure 1: A classical input x (e.g., an image) is processed by a reservoir that gives an output y (e.g., 1 or 0 if the image is that of a cat or a dog, respectively).

Within the ELM framework, an input datum undergoes a nonlinear transformation, and is then processed linearly by a matrix. This nonlinear transformation embodies the so-called reservoir, and is typically implemented via a deep neural network with random weights. While the function is fixed beforehand, the subsequent readout weights undergo a training procedure, learning how to extract the desired features from the input data. A schematic representation of the operation of ELMs is provided in Fig. 1, taking the task of processing input images as a concrete example.
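As a toy illustration of this pipeline (our own sketch, assuming NumPy; the task and all names here are illustrative, not from the paper), the following code builds a random nonlinear feature map playing the role of the reservoir and trains only the linear readout with a single least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: learn y = sin(3x) from scalar inputs.
x_train = rng.uniform(-1, 1, size=(200, 1))
y_train = np.sin(3 * x_train)

# Reservoir: a fixed, random nonlinear feature map (never trained).
n_features = 50
W_in = rng.normal(size=(1, n_features))
b = rng.normal(size=n_features)

def reservoir(x):
    # Random nonlinear expansion of the input.
    return np.tanh(x @ W_in + b)

# Training: one least-squares fit of the linear readout weights.
H = reservoir(x_train)
W_out, *_ = np.linalg.lstsq(H, y_train, rcond=None)

# Testing on fresh inputs.
x_test = rng.uniform(-1, 1, size=(100, 1))
y_pred = reservoir(x_test) @ W_out
mse = np.mean((y_pred - np.sin(3 * x_test)) ** 2)
```

The reservoir weights `W_in` and `b` are drawn once and frozen; the entire "training" is the `lstsq` call, which is the point of the ELM design.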

Quantum ELM     The most straightforward way to define a “quantum ELM (QELM)” is to replace the reservoir mapping with some kind of quantum dynamics, the input data vectors with quantum states, and then apply the final linear transformation on the measurement probabilities obtained upon measuring the output states. QELMs have recently garnered attention due to their potential for processing classical and quantum information.

In the most general terms, this kind of apparatus can always be formalised by modelling the reservoir dynamics as a quantum channel and the measurement stage as a positive operator-valued measure (POVM), as schematically illustrated in Fig. 2.

Figure 2: A quantum system interacts with a quantum ancillary system (reservoir). Information from the latter is extracted with a measurement (POVM) as a vector of probabilities p, which is then linearly processed with a matrix of weights W.

In this figure the classical “cat” is replaced by a quantum system whose features we are interested in. 

As in the classical case, the reservoir is fixed and the training optimises the weights by a simple linear regression. This notable advantage of ELM schemes eliminates the need for complicated and resource-expensive algorithms, which instead are inherent to more standard Machine Learning techniques, like feedforward neural networks.

After collecting testing data, the performance is quantified by comparing expected and predicted outputs and evaluating their mean squared error (MSE).

Results

Observable reachability     A common “task” for quantum information processing is to estimate linear properties of the input state, i.e. quantum averages of observables. For example, one may be interested in the averages of spin observables, Hamiltonian averages, quantum fidelities or entanglement witnesses.

A question naturally arises: which observables can a QELM learn? By working in the Heisenberg picture on operators, we showed that the model can successfully learn only the observables that lie within the linear subspace spanned by the POVM elements propagated back in time. This feature is no surprise once we realise that, unlike in the classical case, the input-output map is linear at each of its steps.
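This criterion can be checked numerically. In the sketch below (our own construction, with the same toy reservoir as before, written as an isometry V: rho -> V rho V†), we back-propagate each POVM effect through the channel and test whether a target observable lies in their linear span:

```python
import numpy as np

rng = np.random.default_rng(2)

def random_unitary(d, rng):
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

# Channel: input qubit + ancilla |0>, random joint unitary.
U = random_unitary(4, rng)
V = U @ np.kron(np.eye(2), np.array([[1.0], [0.0]]))  # 4x2 isometry

# Heisenberg picture: back-propagate each basis projector |k><k|
# to an effect F_k = V† |k><k| V acting on the input qubit.
effects = [V.conj().T @ np.diag(np.eye(4)[k]).astype(complex) @ V
           for k in range(4)]

def in_span(O, effects, tol=1e-9):
    # Is observable O in the linear span of the back-propagated effects?
    A = np.column_stack([E.ravel() for E in effects])
    c, *_ = np.linalg.lstsq(A, O.ravel(), rcond=None)
    return np.linalg.norm(A @ c - O.ravel()) < tol

X = np.array([[0, 1], [1, 0]], dtype=complex)  # Pauli X

# Full 4-outcome POVM: effects generically span all qubit observables.
reachable_full = in_span(X, effects)

# Measuring only the ancilla gives just 2 effects: too few to span
# all qubit observables, so X is (generically) not learnable.
P0 = np.kron(np.eye(2), np.diag([1.0, 0.0]))
P1 = np.kron(np.eye(2), np.diag([0.0, 1.0]))
effects_anc = [V.conj().T @ P @ V for P in (P0, P1)]
reachable_anc = in_span(X, effects_anc)
```

The second case makes the limitation concrete: a two-outcome measurement spans at most a two-dimensional operator subspace, so most observables fall outside it and no choice of readout weights can recover them.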

Noise     Another important question, especially from the experimental point of view, is the following: how does the performance change when random errors are introduced in the measurements?

The previous condition is sufficient for perfect reconstruction in the ideal measurement scenario, but in this more realistic setting the situation is different. Fig. 3b shows what happens if we “spoil” the measurements during both the training and testing phases, with the same number of samples (shown here with different colours).

Furthermore, noise can, perhaps surprisingly, reduce the instability of the protocol, as shown in Fig. 3a by the so-called condition number of the probability matrix as the number of measurement samples varies. This improvement is particularly relevant when the training measurements are made more precise than the testing measurements.

Figure 3: Condition number (a) and error (MSE) (b) against the number of measurement outcomes and for different values of the number N of measurement samples. Case of the reconstruction of the x-Pauli matrix for 1 input qubit.
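A simple experiment in the same spirit (again our own toy construction): instead of exact probabilities, estimate them from N measurement shots via multinomial sampling, and compare the test MSE for small and large N. The probability matrix and its condition number, the quantity tracked in Fig. 3a, are also computed:

```python
import numpy as np

rng = np.random.default_rng(3)

def random_unitary(d, rng):
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def random_qubit_state(rng):
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())

U = random_unitary(4, rng)
V = U @ np.kron(np.eye(2), np.array([[1.0], [0.0]]))  # isometry
Z = np.diag([1.0, -1.0])

def sampled_probs(rho, n_samples):
    # Exact outcome distribution, then finite-statistics estimate.
    p = np.real(np.diag(V @ rho @ V.conj().T))
    p = np.clip(p, 0, None)
    p /= p.sum()
    return rng.multinomial(n_samples, p) / n_samples

def run(n_samples, n_train=200, n_test=100):
    states = [random_qubit_state(rng) for _ in range(n_train + n_test)]
    P = np.array([sampled_probs(r, n_samples) for r in states])
    y = np.array([np.real(np.trace(Z @ r)) for r in states])
    w, *_ = np.linalg.lstsq(P[:n_train], y[:n_train], rcond=None)
    mse = np.mean((P[n_train:] @ w - y[n_train:]) ** 2)
    return mse, np.linalg.cond(P[:n_train])

mse_small, cond_small = run(10 ** 2)   # few shots: noisy estimates
mse_large, cond_large = run(10 ** 5)   # many shots: near-exact
```

As expected, the reconstruction error shrinks as the number of samples grows, since the statistical fluctuations of the estimated probabilities scale roughly as 1/sqrt(N).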

Multiple injections     The model just described, though useful, is not able to reconstruct target functionals which are nonlinear with respect to the input state, such as the purity of a density matrix.

Figure 4: Schematics of multiple injection QELMs.

However, we found a new approach, illustrated in Fig. 4, which we have named "multiple-injection QELM." By allowing the reservoir to interact with multiple copies of the same quantum state, a greater amount of information can be progressively acquired. With this change of perspective we can reconstruct nonlinear polynomial target functionals of a certain degree n provided that at least n injections are used.

Fig. 5 shows the performance as a function of the number of injections: reconstruction is only possible when the number of injections is greater than or equal to the degree of the polynomial target function. The reconstruction also fails when the effective input space becomes too large compared with the dimension of the reservoir.

Figure 5: Reconstruction error (MSE) vs number of injections for polynomial target functionals with k=1,…,7 and for a reservoir of 8 qubits. 
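A minimal sketch of the two-injection idea, in our own drastically simplified form: with two copies, the outcome probabilities are linear in ρ⊗ρ, so a degree-2 functional such as the purity Tr(ρ²) = Tr[SWAP (ρ⊗ρ)] becomes reachable by the same linear readout. Here a handful of random measurement bases on the two copies stand in for the reservoir dynamics plus POVM:

```python
import numpy as np

rng = np.random.default_rng(4)

def random_unitary(d, rng):
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def random_mixed_state(rng):
    # Random 1-qubit density matrix (generically mixed).
    g = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = g @ g.conj().T
    return rho / np.trace(rho)

# Two injections: measure rho (x) rho in several random bases,
# so that the probabilities are linear in the two-copy state.
bases = [random_unitary(4, rng) for _ in range(6)]  # 24 outcomes total

def probabilities(rho):
    rr = np.kron(rho, rho)
    return np.concatenate(
        [np.real(np.diag(B @ rr @ B.conj().T)) for B in bases])

# Training and testing data: probabilities vs the purity Tr(rho^2).
P, y = [], []
for _ in range(300):
    rho = random_mixed_state(rng)
    P.append(probabilities(rho))
    y.append(np.real(np.trace(rho @ rho)))
P, y = np.array(P), np.array(y)

w, *_ = np.linalg.lstsq(P[:200], y[:200], rcond=None)
mse = np.mean((P[200:] @ w - y[200:]) ** 2)
```

With 24 generic effects on the 16-dimensional two-copy operator space, the SWAP operator lies in their span, so the purity, which no single-copy QELM can reach, is reconstructed essentially exactly.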

Conclusions and perspectives

Our work helps to shed light on the general structure of the problems solvable using quantum extreme learning machines. While in this paper we focused on the quantum state estimation perspective, where the goal is to retrieve properties of the input quantum states, these results can be used to gain deeper insight into how extreme learning machines can function as a hybrid quantum machine learning architecture to process classical data. Furthermore, the extension of our results to time-dependent sequences can be used to find precise relations between the memory properties of a quantum reservoir and the information processing capability of the corresponding quantum machine learning model.
