Introduction
Classical ELM In the vast landscape of machine learning, two models that have received significant interest for their simplicity and potential are extreme learning machines (ELMs) and reservoir computers (RCs). These models work by preprocessing input data through a potentially complex, but untrained, nonlinear mapping, referred to in this context as a reservoir. Such preprocessing makes it possible to learn patterns in the input data more easily, by training only a simple linear readout layer. In particular, RCs are distinguished by their ability to process sequences of data that change over time, employing a reservoir which holds memory of the inputs seen at previous times. A distinctive feature of ELMs and RCs is the simplicity of the training phase, which amounts to solving a linear regression problem.
Within the ELM framework, an input datum undergoes a nonlinear transformation and is then processed linearly by a readout matrix. This nonlinear transformation embodies the so-called reservoir, and is typically implemented via a deep neural network with random weights. While the reservoir function is fixed beforehand, the subsequent readout weights are trained to extract the desired features from the input data. A schematic representation of the operation of ELMs is provided in Fig. 1, taking the task of processing input images as a concrete example.
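To make this concrete, here is a minimal numerical sketch of the classical ELM idea (a toy example of ours, not the architecture used in the paper): a fixed random nonlinear map plays the role of the reservoir, and training reduces to a single least-squares fit of the linear readout.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: learn y = sin(x) from noisy samples.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=200)

# Fixed, untrained "reservoir": a random projection followed by a nonlinearity.
n_features = 100
W_in = rng.normal(size=(1, n_features))
b = rng.normal(size=n_features)
H = np.tanh(X @ W_in + b)                       # reservoir features

# Training the readout is just a linear regression.
w_out, *_ = np.linalg.lstsq(H, y, rcond=None)

# Prediction on new inputs reuses the same fixed reservoir.
X_test = np.linspace(-3, 3, 50).reshape(-1, 1)
y_pred = np.tanh(X_test @ W_in + b) @ w_out
```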
Quantum ELM The most straightforward way to define a “quantum ELM” (QELM) is to replace the reservoir mapping with some kind of quantum dynamics and the input data vectors with quantum states, and then to apply the final linear transformation to the measurement probabilities obtained by measuring the output states. QELMs have recently garnered attention due to their potential for processing both classical and quantum information.
In the most general terms, this kind of apparatus can always be formalised by modelling the reservoir dynamics as some kind of quantum channel, and the measurement stage as a positive operator-valued measure (POVM), as schematically illustrated in Fig. 2.
In this figure the classical “cat” is replaced by a quantum system whose features we are interested in.
As in the classical case, the reservoir is fixed and the training optimises the readout weights by a simple linear regression. This notable advantage of ELM schemes eliminates the need for complicated and resource-expensive training algorithms, which are instead inherent to more standard machine learning techniques, such as feedforward neural networks.
After collecting test data, the performance is quantified by comparing expected and predicted outputs and evaluating their mean squared error (MSE).
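A minimal end-to-end simulation of this pipeline might look as follows (a toy sketch under assumptions of ours: single-qubit inputs, a fixed Haar-random unitary playing the role of the reservoir channel, a computational-basis measurement as the POVM, and the expectation value of sigma_z as the target). It is only meant to illustrate that training amounts to a linear regression on measured outcome probabilities, evaluated through the MSE.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_state(rng, d=2):
    """Random d-dimensional density matrix."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho)

def haar_unitary(d, rng):
    """Haar-random unitary via QR decomposition."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    Q, R = np.linalg.qr(G)
    return Q * (np.diag(R) / np.abs(np.diag(R)))

dim_in, dim_res = 2, 4
U = haar_unitary(dim_in * dim_res, rng)          # fixed, untrained reservoir dynamics
res0 = np.zeros((dim_res, dim_res))
res0[0, 0] = 1.0                                 # reservoir initial state |0><0|
sigma_z = np.diag([1.0, -1.0])

def probabilities(rho):
    """Computational-basis outcome probabilities after the reservoir dynamics."""
    out = U @ np.kron(rho, res0) @ U.conj().T
    return np.real(np.diag(out))

# Training set: outcome probabilities vs. the target expectation value.
train = [random_state(rng) for _ in range(300)]
P = np.array([probabilities(r) for r in train])
y = np.array([np.real(np.trace(sigma_z @ r)) for r in train])
w, *_ = np.linalg.lstsq(P, y, rcond=None)        # the only trained parameters

# Testing: mean squared error between predicted and true expectation values.
test = [random_state(rng) for _ in range(100)]
P_test = np.array([probabilities(r) for r in test])
y_test = np.array([np.real(np.trace(sigma_z @ r)) for r in test])
print("test MSE:", np.mean((P_test @ w - y_test) ** 2))
```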
Results
Observable reachability A common “task” in quantum information processing is to estimate linear properties of the input state, i.e. expectation values of observables. For example, one may be interested in the averages of spin observables, Hamiltonian averages, quantum fidelities, or entanglement witnesses.
A question naturally arises: which observables can a QELM learn? By working in the Heisenberg picture on operators, we showed that the model can successfully learn only those observables that lie within the linear subspace spanned by the POVM elements propagated back in time. This feature is no surprise once we realise that, unlike in the classical case, the input-output map is linear at each of its steps.
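As a numerical illustration of this criterion (again a toy sketch of ours, not the paper's code), one can vectorise the Heisenberg-evolved POVM elements, contracted with the reservoir's initial state, and check whether a target observable lies in their real linear span:

```python
import numpy as np

rng = np.random.default_rng(2)

def haar_unitary(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    Q, R = np.linalg.qr(G)
    return Q * (np.diag(R) / np.abs(np.diag(R)))

dim_in, dim_res = 2, 4
d = dim_in * dim_res
U = haar_unitary(d, rng)

# Effective POVM on the input: F_i = <0|_res U^dag |i><i| U |0>_res
effective_povm = []
for i in range(d):
    proj = np.zeros((d, d))
    proj[i, i] = 1.0
    back = (U.conj().T @ proj @ U).reshape(dim_in, dim_res, dim_in, dim_res)
    effective_povm.append(back[:, 0, :, 0])      # reservoir initialised in |0>

def real_vec(M):
    """Hermitian matrix as a real vector (real and imaginary parts)."""
    return np.concatenate([M.real.ravel(), M.imag.ravel()])

def in_span(O, povm, tol=1e-8):
    """True iff O is (numerically) a real linear combination of the POVM elements."""
    A = np.array([real_vec(F) for F in povm]).T
    coeffs, *_ = np.linalg.lstsq(A, real_vec(O), rcond=None)
    return np.linalg.norm(A @ coeffs - real_vec(O)) < tol

sigma_z = np.diag([1.0, -1.0]).astype(complex)
print(in_span(sigma_z, effective_povm))          # enough outcomes: typically learnable
coarse = [sum(effective_povm[0::2]), sum(effective_povm[1::2])]
print(in_span(sigma_z, coarse))                  # too coarse a measurement: not learnable
```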
Noise Another important question, especially from the experimental point of view, is the following: how does the performance change when random errors are introduced in the measurements?
The previous condition is sufficient for perfect reconstruction in the ideal measurement scenario, but in this more realistic setting the situation changes. Fig. 3b shows what happens if we “spoil” the measurements during both the training and the testing phase, using the same number of samples for each (shown here with different colours).
Furthermore, noise can surprisingly reduce the instability of the protocol, as shown in Fig. 3a by the behaviour of the so-called condition number of the probability matrix as the number of measurement samples varies. This improvement is particularly relevant when the training measurements are made more precise than the testing measurements.
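The mechanism behind this effect can be mimicked with a toy calculation (hypothetical numbers of ours, not the paper's simulation): when the exact probability matrix is close to rank-deficient, sampling noise lifts its smallest singular values, so estimating each row from a finite number of shots can actually yield a better-conditioned matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
n_states, n_outcomes = 50, 8

# Nearly rank-3 "exact" probability matrix (rows are outcome distributions).
base = rng.dirichlet(np.ones(n_outcomes), size=3)
mix = rng.dirichlet(np.ones(3), size=n_states)
P_exact = 0.999 * (mix @ base) + 0.001 * rng.dirichlet(np.ones(n_outcomes), size=n_states)

print(f"exact matrix: condition number = {np.linalg.cond(P_exact):.2e}")
for shots in (10**2, 10**4, 10**6):
    # Finite statistics: each row becomes an empirical frequency vector.
    P_sampled = np.array([rng.multinomial(shots, p / p.sum()) / shots for p in P_exact])
    print(f"{shots:>8} shots: condition number = {np.linalg.cond(P_sampled):.2e}")
```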
Multiple injections The model just described, useful as it is, cannot reconstruct target functionals that are nonlinear in the input state, such as, for instance, the purity of a density matrix.
However, we found a new approach, illustrated in Fig. 4, which we have named "multiple-injection QELM". By allowing the reservoir to interact with multiple copies of the same quantum state, a greater amount of information is progressively acquired. With this change of perspective, we can reconstruct nonlinear polynomial target functionals of degree n, provided that at least n injections are used.
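The intuition for why multiple copies help can be seen in a standard identity, checked numerically below as a sanity check (this is a general fact about polynomial functionals, not taken from the paper's code): the purity Tr[rho^2], a degree-2 polynomial in rho, is a *linear* functional of the two-copy state rho ⊗ rho, since Tr[rho^2] = Tr[SWAP (rho ⊗ rho)]. With n copies available, degree-n polynomials likewise become linear and thus fall back into the learnable class.

```python
import numpy as np

rng = np.random.default_rng(4)

d = 2
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = G @ G.conj().T
rho /= np.trace(rho)

# SWAP operator on two copies: SWAP |i>|j> = |j>|i>
SWAP = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        SWAP[j * d + i, i * d + j] = 1.0

print(np.real(np.trace(rho @ rho)))                 # purity, computed directly
print(np.real(np.trace(SWAP @ np.kron(rho, rho))))  # same value as a linear functional of two copies
```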
Fig. 5 shows the performance as a function of the number of injections: reconstruction is not possible unless the number of injections is at least equal to the degree of the polynomial target functional. The reconstruction also fails when the effective input space becomes too large compared with the dimension of the reservoir.
Conclusions and perspectives
Our work helps to shed light on the general structure of the problems solvable using quantum extreme learning machines. While in this paper we focused on the quantum state estimation perspective, where the goal is to retrieve properties of the input quantum states, these results can also be used to gain deeper insight into how extreme learning machines can function as a hybrid quantum machine learning architecture for processing classical data. Furthermore, the extension of our results to time-dependent sequences can be used to find precise relations between the memory properties of a quantum reservoir and the information processing capability of the corresponding quantum machine learning model.