
Learning Control of Quantum Systems

Daoyi Dong D. Dong is with the School of Engineering and Information Technology, University of New South Wales, Canberra, ACT 2600, Australia [email protected]
Abstract

This paper provides a brief introduction to learning control of quantum systems. In particular, the following aspects are outlined: gradient-based learning for optimal control of quantum systems, evolutionary computation for learning control of quantum systems, learning-based quantum robust control, and reinforcement learning for quantum control.

I Introduction

Controlling quantum systems has become a central task in the development of quantum technologies, and quantum control has witnessed rapid progress in the last two decades; for an overview, see, e.g., the survey papers [1, 2, 3, 4, 5] or the monographs [6, 7]. The general goal of quantum control is to actively manipulate and control the dynamics of quantum systems to achieve given objectives [8, 9] (e.g., rapid state transfer or high-fidelity gate operation). Two fundamental issues in quantum control are investigating the controllability of quantum systems and designing control laws to achieve the expected control performance. Controllability is concerned with which control targets can be achieved; the controllability of finite-dimensional closed systems has been well addressed [7], and a few results on the controllability of open quantum systems have also been presented. For control law design, optimal control theory [1], Lyapunov control approaches [10], learning control algorithms [2] and robust control methods [11] have been developed for manipulating quantum systems to achieve various control objectives.

Among the various control design approaches, learning control is recognized as a powerful method for many complex quantum control tasks and has achieved great success in laser control of molecules and other applications since the approach was presented in the seminal paper [12]. Many quantum control tasks may be formulated as an optimization problem, and a learning algorithm can be employed to search for an optimal control field satisfying a desired performance condition. Gradient algorithms have been demonstrated to be an excellent candidate for numerically finding an optimal field and, owing to their high efficiency, have achieved successful applications in nuclear magnetic resonance (NMR) systems [13]. In many other optimal control problems, however, the gradient information may not be easy to obtain, and some complex problems may have local optima. For these situations, stochastic search algorithms usually perform better in finding a good control field. The genetic algorithm (GA) and differential evolution (DE) [11] have been widely used in the quantum control of molecular systems and have achieved great success [2]. Another task in quantum control is to achieve robust performance in quantum systems; gradient-based learning algorithms and stochastic search algorithms can be suitably modified to search for robust control fields. Also, other machine learning algorithms such as reinforcement learning have found successful applications in various tasks (e.g., quantum error correction [14]).

II Gradient-based learning for optimal control of quantum systems

Consider a finite-dimensional quantum control system whose state $|\psi(t)\rangle$ (in Dirac notation) is described by the following Schrödinger equation (setting $\hbar=1$):

$$\frac{d}{dt}|\psi(t)\rangle=-i\Big[H_{0}+\sum_{m=1}^{M}u_{m}(t)H_{m}\Big]|\psi(t)\rangle,\quad t\in[0,T],\qquad(1)$$

where $H_{0}$ is the free Hamiltonian of the system and $H_{c}(t)=\sum_{m=1}^{M}u_{m}(t)H_{m}$ is the control Hamiltonian at time $t$, representing the interaction of the system with the external fields $u_{m}(t)$; the $H_{m}$ are Hermitian operators through which the controls couple to the system. The objective of quantum optimal control is to find control fields $u_{m}(t)$ that maximize a performance functional $\Phi$, which may be a given functional of the state $|\psi\rangle$ and the controls, defined according to practical requirements. For example, for a state transfer task the performance index may be the fidelity $\Phi=|\langle\psi(T)|\psi_{f}\rangle|^{2}$ between the final state $|\psi(T)\rangle$ and a target state $|\psi_{f}\rangle$, or the squared expectation $\Phi=|\langle\psi(T)|\hat{O}|\psi(T)\rangle|^{2}$ of an operator $\hat{O}$. To maximize the performance $\Phi$, we may employ the GRAPE (gradient ascent pulse engineering) algorithm [13] or the Krotov method [15] to search for the control fields. For simplicity, we may discretize the time interval $[0,T]$ into $N$ equal steps and let the control fields $u_{m}$ be constant during each step. The basic idea of the GRAPE algorithm is that the control fields are iteratively updated along the gradient direction $\frac{\delta\Phi}{\delta u_{m}(k)}$ with a learning rate $\eta$, i.e.,

$$u_{m}(k+1)=u_{m}(k)+\eta\frac{\delta\Phi}{\delta u_{m}(k)}.\qquad(2)$$
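As a concrete illustration of the update rule (2), the following sketch applies gradient ascent to a single-qubit state transfer with an assumed drift $H_{0}=\sigma_{z}$ and a single control Hamiltonian $H_{1}=\sigma_{x}$ (a toy model, not an example from the paper); for simplicity the gradient is approximated by finite differences rather than the analytic GRAPE expression.

```python
import numpy as np

# Toy model (assumed, not from the paper): qubit with drift H0 = sigma_z
# and one control Hamiltonian H1 = sigma_x
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)

def evolve(u, dt):
    """Propagate |0> under the piecewise-constant controls u[k]."""
    psi = np.array([1, 0], dtype=complex)
    for uk in u:
        w, V = np.linalg.eigh(SZ + uk * SX)        # H = H0 + u_k H1
        psi = V @ np.diag(np.exp(-1j * w * dt)) @ V.conj().T @ psi
    return psi

def fidelity(u, dt):
    """Phi = |<psi_f|psi(T)>|^2 with target state |psi_f> = |1>."""
    return abs(evolve(u, dt)[1]) ** 2

N, T = 10, 2.0                  # N piecewise-constant steps on [0, T]
dt, eta, eps = T / N, 5.0, 1e-6
u = 0.1 * np.ones(N)            # initial trial field
for _ in range(1000):
    base = fidelity(u, dt)
    grad = np.zeros(N)
    for k in range(N):          # finite-difference gradient of Phi
        up = u.copy(); up[k] += eps
        grad[k] = (fidelity(up, dt) - base) / eps
    u += eta * grad             # gradient ascent update, Eq. (2)

print(round(fidelity(u, dt), 2))
```

Because the landscape of such an unconstrained bilinear control problem is essentially trap-free, plain gradient ascent typically drives the fidelity close to one.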

The gradient-based learning method can also be extended to the optimal control of unitary transformations (e.g., quantum gates) and of open quantum systems. For example, the evolution of a unitary transformation $U(t)$ is described by the following equation:

$$\dot{U}(t)=-i\Big[H_{0}+\sum_{m=1}^{M}u_{m}(t)H_{m}\Big]U(t),\qquad U(0)=I.\qquad(3)$$

Now the objective is to design the controls $u_{m}(t)$ to steer the unitary $U(t)$ from $U(0)=I$ to a desired target $U_{F}$ with high fidelity. We may define the performance function as $\Phi=|\langle U_{F}|e^{i\varphi}U(T)\rangle|^{2}$ for an arbitrary phase factor $\varphi$. Then we can calculate the gradient $\delta\Phi/\delta u_{m}(k)$ and search for the optimal control field by following the gradient [16]. When we consider the optimal control problem of an open quantum system, its state should be represented by a density matrix $\rho$ and its dynamics described by a master equation. The dynamics of a Markovian open quantum system can be described by the following master equation in Lindblad form [6]:

$$\dot{\rho}(t)=-i\Big[H_{0}+\sum_{m=1}^{M}u_{m}(t)H_{m},\,\rho(t)\Big]+\sum_{k}\gamma_{k}\mathcal{D}[L_{k}]\rho(t),\qquad(4)$$

where the non-negative coefficients $\gamma_{k}$ specify the relevant relaxation rates, the $L_{k}$ are appropriate Lindblad operators, and $\mathcal{D}[L_{k}]\rho=L_{k}\rho L_{k}^{\dagger}-\frac{1}{2}L_{k}^{\dagger}L_{k}\rho-\frac{1}{2}\rho L_{k}^{\dagger}L_{k}$. The open GRAPE algorithm has also been developed to calculate the gradient based on the master equation (see [17]).
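To make the dissipator concrete, the following sketch (a minimal toy model with assumed parameters) integrates Eq. (4) by explicit Euler steps for a single qubit undergoing spontaneous emission; the trace of $\rho$ is preserved and the excited-state population decays as $e^{-\gamma t}$.

```python
import numpy as np

def dissipator(L, rho):
    """Lindblad dissipator D[L]rho = L rho Ldag - (1/2){Ldag L, rho}."""
    LdL = L.conj().T @ L
    return L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL)

# Illustrative qubit (assumed parameters): drift sigma_z, no control field,
# and spontaneous emission with Lindblad operator sigma_minus
H = np.array([[1, 0], [0, -1]], dtype=complex)      # H0 = sigma_z
L = np.array([[0, 1], [0, 0]], dtype=complex)       # sigma_minus
gamma = 0.5
rho = np.array([[0, 0], [0, 1]], dtype=complex)     # excited state |1><1|

dt, steps = 1e-3, 2000                              # integrate to t = 2
for _ in range(steps):
    drho = -1j * (H @ rho - rho @ H) + gamma * dissipator(L, rho)
    rho = rho + dt * drho                           # explicit Euler step of Eq. (4)

# excited-state population ~ exp(-gamma t); trace remains 1
print(round(rho[1, 1].real, 3), round(np.trace(rho).real, 3))
```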

Using the basic idea of gradient-based learning control, several variants have been developed for various requirements in quantum optimal control. For example, a data-driven gradient optimization algorithm (d-GRAPE) has been proposed to correct deterministic gate errors in high-precision quantum control by jointly learning from a design model and experimental data from quantum tomography [18]. A gradient-based frequency-domain optimization algorithm has been developed to solve optimal control problems with constraints in the frequency domain [19]. Existing results show that gradient-based learning methods can usually achieve excellent performance for solving optimal control problems when the system model is known and the dynamics can be equivalently (or approximately) described by a closed quantum system; this behavior has also been analyzed using quantum control landscape theory [20].

III Evolutionary computation for learning control of quantum systems

Gradient algorithms have shown powerful capability for numerically finding optimal controls [13]. In many practical applications, however, it may be difficult to obtain the gradient information, or local optima may exist in complex quantum control problems. For these situations, a natural idea is to employ stochastic search algorithms to seek good controls. Evolutionary computation, including GA and DE, has been widely used in the area of quantum control. In these evolutionary computation methods, crossover, mutation and selection operations are iteratively implemented to search for good solutions (optimal controls) in a parameter space. For example, a subspace-selective self-adaptive differential evolution (SUSSADE) algorithm has been proposed to achieve a high-fidelity single-shot Toffoli gate and single-shot three-qubit gates [21], [22]. Existing results show that DE with equally-mixed strategies can achieve improved performance for quantum control problems [23]. Several promising evolutionary algorithms have been investigated comparatively in [24], where it was found that DE usually outperformed GA and particle swarm optimization for hard quantum control problems.

The above introduction to gradient-based learning and evolutionary computation mainly involves open-loop control strategies. Evolutionary computation has demonstrated extremely powerful capability when integrated into closed-loop control design. Closed-loop learning control, where each cycle of the closed loop is executed on a new sample, has achieved great success in the laser control of laboratory chemical reactions [2, 12]. A closed-loop learning control procedure generally involves three components [1, 2]: (i) a trial laser control input; (ii) the laboratory generation of the control, which is applied to the sample and subsequently observed for its impact; and (iii) a learning algorithm that suggests the form of the next control input by considering the prior experiments. The initial trial control input may be a random input field or a well-designed laser pulse, and a feature of a good closed-loop learning control design is its insensitivity to the initial trials. A key task is to develop a good learning algorithm ensuring that the learning process converges to a predetermined objective. GA, DE and several rapidly converging algorithms have been developed for this task [4, 12]. The optimal control problem is usually formulated as maximizing a functional that depends on variables such as the control inputs, quantum states and control time but may have no analytical form. In the learning process, the optimization problem is solved iteratively. First, a trial input is applied to a sample to be controlled and the result is observed. Second, a learning algorithm suggests a better control input based on the prior experiments. Third, the “better” control input is applied to a new sample. This process continues until the control objective is achieved or the maximum permitted number of iterations is reached. It is often feasible to produce many identical-state samples for laboratory chemical molecules. If the control objective is well selected, specified control inputs can be applied to the samples, and the learning algorithm is sufficiently smart in searching for good control inputs, then this process will converge and an optimal control pulse can be found [2].
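The three-step loop above can be sketched in simulation, with a simulated fidelity measurement standing in for the laboratory apparatus (an assumed single-qubit model, not one of the cited experiments). A DE/rand/1/bin algorithm suggests trial piecewise-constant amplitudes, “applies” each trial field to a fresh sample, and keeps the better field:

```python
import numpy as np

rng = np.random.default_rng(0)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)

def measure_fidelity(u, T=2.0):
    """Simulated 'laboratory' measurement: fidelity of a |0> -> |1> transfer
    for a qubit driven by piecewise-constant amplitudes u (toy stand-in)."""
    psi = np.array([1, 0], dtype=complex)
    dt = T / len(u)
    for uk in u:
        w, V = np.linalg.eigh(SZ + uk * SX)
        psi = V @ np.diag(np.exp(-1j * w * dt)) @ V.conj().T @ psi
    return abs(psi[1]) ** 2

# DE/rand/1/bin over the control amplitudes
NP, D, F, CR = 30, 10, 0.7, 0.9
pop = rng.uniform(-4, 4, size=(NP, D))          # random initial trial fields
fit = np.array([measure_fidelity(x) for x in pop])
for gen in range(200):
    for i in range(NP):
        a, b, c = pop[rng.choice(NP, 3, replace=False)]  # base vectors (may
        mutant = a + F * (b - c)                # include i; fine for a sketch)
        cross = rng.random(D) < CR
        cross[rng.integers(D)] = True           # at least one gene crosses over
        trial = np.where(cross, mutant, pop[i]) # crossover
        f = measure_fidelity(trial)             # apply the trial to a new sample
        if f > fit[i]:                          # selection
            pop[i], fit[i] = trial, f

print(round(float(fit.max()), 2))
```

The only interface to the “plant” is the black-box fidelity evaluation, which is what makes the same loop usable in a laboratory closed-loop setting.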

IV Learning-based quantum robust control

The robust control of quantum systems has been recognized as a key task in developing practical quantum technology, since noise and uncertainties are unavoidable. Learning control is an effective candidate for achieving robust performance in some quantum control problems [11]. We first consider the control problem of inhomogeneous quantum ensembles. An inhomogeneous quantum ensemble consists of many individual quantum systems (e.g., atoms, molecules or spin systems), and the parameters describing the dynamics of these individual systems may have variations [25, 26]. For example, a spin ensemble in NMR may encounter a large dispersion in the strength of the applied radio-frequency field, and there may also exist variations in the natural frequencies of the spins [25]. Inhomogeneous quantum ensembles have wide applications in many fields ranging from quantum memory to magnetic-resonance imaging. Hence, it is highly desirable to design control laws for an inhomogeneous ensemble that employ the same control inputs to steer individual systems with different dynamics from a given initial state to a target state.

A sampling-based learning control (SLC) method has been developed to achieve high-fidelity control of inhomogeneous quantum ensembles [26]. Consider an inhomogeneous ensemble in which the Hamiltonian of each individual system has the following form:

$$H_{\omega,\theta}(t)=\omega H_{0}+\sum_{m=1}^{M}\theta\,u_{m}(t)H_{m}.\qquad(5)$$

We assume that the parameters $\omega\in[1-\Omega,1+\Omega]$ and $\theta\in[1-\Theta,1+\Theta]$, where the constants $\Omega\in[0,1]$ and $\Theta\in[0,1]$ represent the bounds of the parameter dispersion. The objective is to design controls $\{u_{m}(t)\}$ that simultaneously steer the individual systems (with different $\omega$ and $\theta$) of the quantum ensemble from an initial state $|\psi_{0}\rangle$ to the same target state $|\psi_{f}\rangle$ with high fidelity. This task can be achieved using the SLC method, which consists of two steps: “training” and “testing and evaluation” [26]. In the training step, we select $N$ samples from the quantum ensemble according to the distribution (e.g., uniform) of the inhomogeneity parameters and then construct a generalized system as follows:

$$\frac{d}{dt}\begin{pmatrix}|\psi_{\omega_{1},\theta_{1}}(t)\rangle\\ |\psi_{\omega_{2},\theta_{2}}(t)\rangle\\ \vdots\\ |\psi_{\omega_{N},\theta_{N}}(t)\rangle\end{pmatrix}=-i\begin{pmatrix}H_{\omega_{1},\theta_{1}}(t)|\psi_{\omega_{1},\theta_{1}}(t)\rangle\\ H_{\omega_{2},\theta_{2}}(t)|\psi_{\omega_{2},\theta_{2}}(t)\rangle\\ \vdots\\ H_{\omega_{N},\theta_{N}}(t)|\psi_{\omega_{N},\theta_{N}}(t)\rangle\end{pmatrix}\qquad(6)$$

where $H_{\omega_{n},\theta_{n}}=\omega_{n}H_{0}+\sum_{m}\theta_{n}u_{m}(t)H_{m}$ with $n=1,2,\dots,N$. The performance function for the generalized system is defined by

$$\Phi_{N}(u):=\frac{1}{N}\sum_{n=1}^{N}\Phi_{\omega_{n},\theta_{n}}(u).\qquad(7)$$

The task of the training step is to find a control strategy $u^{*}$ that maximizes the performance functional $\Phi_{N}(u)$. A gradient-based learning algorithm (s-GRAPE) can be developed to complete this task. In the testing and evaluation step, a number of individual systems are randomly sampled to evaluate the control performance. Results show that the SLC method has good potential for the control design of various inhomogeneous quantum ensembles (including inhomogeneous open quantum ensembles).
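The two SLC steps can be sketched numerically for an assumed single-qubit ensemble with Hamiltonian (5) ($H_{0}=\sigma_{z}$, one control $H_{1}=\sigma_{x}$, $\Omega=\Theta=0.1$, transfer from $|0\rangle$ to $|1\rangle$); a finite-difference gradient on $\Phi_{N}$ serves as a simple stand-in for s-GRAPE.

```python
import numpy as np

SZ = np.array([[1, 0], [0, -1]], dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)

def fidelity(u, omega, theta, T=2.0):
    """|<1|psi(T)>|^2 for one ensemble member, H = omega*H0 + theta*u*H1."""
    psi = np.array([1, 0], dtype=complex)
    dt = T / len(u)
    for uk in u:
        w, V = np.linalg.eigh(omega * SZ + theta * uk * SX)
        psi = V @ np.diag(np.exp(-1j * w * dt)) @ V.conj().T @ psi
    return abs(psi[1]) ** 2

def Phi_N(u, samples):
    """Averaged performance over the sampled members, as in Eq. (7)."""
    return float(np.mean([fidelity(u, w, t) for w, t in samples]))

# Training step: N = 9 samples on a uniform grid of (omega, theta)
train = [(w, t) for w in (0.9, 1.0, 1.1) for t in (0.9, 1.0, 1.1)]
u, eta, eps = 0.1 * np.ones(10), 5.0, 1e-6
for _ in range(400):
    base = Phi_N(u, train)
    grad = np.zeros(len(u))
    for k in range(len(u)):     # finite-difference stand-in for s-GRAPE
        up = u.copy(); up[k] += eps
        grad[k] = (Phi_N(up, train) - base) / eps
    u += eta * grad

# Testing and evaluation step: randomly drawn ensemble members
rng = np.random.default_rng(1)
test = rng.uniform(0.9, 1.1, size=(50, 2))
print(round(Phi_N(u, test), 2))
```

Ascending the sample-averaged $\Phi_{N}$ rather than the fidelity of a single nominal system is what yields a field that remains effective across the parameter dispersion.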

Besides inhomogeneous quantum ensembles, the SLC method is useful for the robust control of single quantum systems with various uncertainties. For example, Eq. (5) can also correspond to the Hamiltonian of a quantum system with an inaccurate model parameter $\omega$ and uncertain multiplicative noise $\theta$. To achieve robust control for such a quantum system, we may employ the SLC method to search for robust control pulses [27, 28, 29]. The performance of the SLC approach can be further improved by exploiting the richness and diversity of the samples. Inspired by deep learning, a batch-based gradient algorithm (b-GRAPE) has been presented for efficiently seeking robust quantum controls; numerical results showed that b-GRAPE can remarkably enhance control robustness over the SLC method while maintaining high fidelity [30]. In other applications where we need to enhance the robustness of closed-loop learning control, we may either use Hessian-matrix information [31] or integrate the idea of SLC into the learning algorithm when searching for robust control fields. For example, an improved DE algorithm (called msMS_DE) has been proposed to search for robust femtosecond laser pulses to control fragmentation of the molecule $\text{CH}_{2}\text{BrI}$ [11]. In msMS_DE, multiple samples are used for fitness evaluation and a mixed strategy is employed for the mutation operation.

V Reinforcement learning for quantum control

Reinforcement learning (RL) [32] is another important machine learning approach; it addresses the problem of how an active agent can learn an approximately optimal strategy while interacting with its environment. It is a model-free, feedback-based approach that works well even when the system model is unknown or uncertain. RL has been used for learning control of quantum systems. For example, a fidelity-based probabilistic Q-learning approach has been presented that naturally solves the balance between exploration and exploitation, and it was applied to learning control of quantum systems [33]. The authors of [34] showed that the performance of RL is comparable to that of optimal control approaches for the task of finding short, high-fidelity protocols that drive an initial state to a given target state in nonintegrable many-body quantum systems of interacting qubits; RL can also help identify variational protocols with nearly optimal fidelity even in the glassy phase. In [35], deep reinforcement learning was employed to simultaneously optimize the speed and fidelity of quantum computation against both leakage and stochastic control errors, and a universal quantum control framework was presented that improves control robustness by adding control noise into the training environments of RL agents trained with trust-region policy optimization. By utilizing two-stage learning with teacher and student networks and a reward quantifying the capability to recover the quantum information stored in a quantum system, the authors of [14] showed how a neural-network-based “agent” in RL can discover good quantum error correction strategies to protect qubits against noise.
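As a minimal illustration of this viewpoint (loosely in the spirit of fidelity-based Q-learning [33], but a heavily simplified toy problem rather than a reproduction of any cited work), the following sketch applies tabular $\varepsilon$-greedy Q-learning to single-qubit state transfer: the RL state is a coarse discretization of the Bloch vector, the actions are three control amplitudes, and the reward is the step-wise fidelity gain.

```python
import numpy as np

rng = np.random.default_rng(0)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
DT, STEPS, ACTIONS = 0.1, 40, (-2.0, 0.0, 2.0)   # assumed toy parameters

def apply(psi, u):
    """One step of exp(-i(sigma_z + u sigma_x) DT) via the 2x2 closed form."""
    H = SZ + u * SX
    r = np.sqrt(1.0 + u * u)                     # since H^2 = (1+u^2) I
    return (np.cos(r * DT) * np.eye(2) - 1j * (np.sin(r * DT) / r) * H) @ psi

def bins(psi):
    """Coarse Bloch-sphere discretization used as the RL state."""
    a, b = psi
    z = abs(a) ** 2 - abs(b) ** 2
    phi = np.angle(np.conj(a) * b)
    return (min(int((z + 1) / 2 * 6), 5),
            min(int((phi + np.pi) / (2 * np.pi) * 8), 7))

Q = np.zeros((6, 8, len(ACTIONS)))
alpha, gamma, eps = 0.3, 0.99, 0.2
for ep in range(3000):                           # training episodes
    psi = np.array([1, 0], dtype=complex)        # start in |0>
    s = bins(psi)
    for t in range(STEPS):
        a = int(rng.integers(3)) if rng.random() < eps else int(np.argmax(Q[s]))
        psi2 = apply(psi, ACTIONS[a])
        s2 = bins(psi2)
        r = abs(psi2[1]) ** 2 - abs(psi[1]) ** 2  # fidelity-gain reward
        Q[s][a] += alpha * (r + gamma * Q[s2].max() - Q[s][a])
        psi, s = psi2, s2

# greedy rollout with the learned Q-table
psi = np.array([1, 0], dtype=complex)
for t in range(STEPS):
    psi = apply(psi, ACTIONS[int(np.argmax(Q[bins(psi)]))])
print(round(abs(psi[1]) ** 2, 2))
```

The coarse state discretization makes the problem partially observed, so this sketch only reaches a moderate fidelity; the cited works replace the table with neural networks and far richer state representations.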

VI Conclusions

Machine learning has shown powerful capability in discovering high-quality controls to achieve optimal control and enhance robust performance for quantum systems. On the one hand, it is necessary to further develop or improve existing machine learning algorithms to effectively solve complex quantum control problems emerging from new quantum technologies. On the other hand, various cutting-edge machine learning techniques should be able to find new application opportunities in the area of quantum control.

References

  • [1] D. Dong and I. R. Petersen, “Quantum control theory and applications: a survey,” IET Control Theory & Applications, vol. 4, no. 12, pp. 2651-2671, 2010.
  • [2] H. Rabitz, R. De Vivie-Riedle, M. Motzkus, and K. Kompa, “Whither the future of controlling quantum phenomena?” Science, vol. 288, no. 5467, pp. 824-828, 2000.
  • [3] S. J. Glaser, U. Boscain, T. Calarco, C. P. Koch, W. Köckenberger, R. Kosloff, I. Kuprov, B. Luy, S. Schirmer, T. Schulte-Herbrüggen, D. Sugny and F. K. Wilhelm, “Training Schrödinger’s cat: quantum optimal control,” The European Physical Journal D, vol. 69, p. 279, 2015.
  • [4] C. Brif, R. Chakrabarti and H. Rabitz, “Control of quantum phenomena: past, present and future,” New Journal of Physics, vol. 12, p. 075008, 2010.
  • [5] C. Altafini and F. Ticozzi, “Modeling and control of quantum systems: an introduction,” IEEE Transactions on Automatic Control, vol.57, no.8, pp.1898-1917, 2012.
  • [6] H. M. Wiseman and G. J. Milburn, Quantum Measurement and Control, Cambridge, England: Cambridge University Press, 2010.
  • [7] D. D’Alessandro, Introduction to Quantum Control and Dynamics, Chapman & Hall/CRC, 2007.
  • [8] A. Acín, I. Bloch, H. Buhrman, T. Calarco, C. Eichler, J. Eisert, et al., “The quantum technologies roadmap: a European community view”, New Journal of Physics, vol. 20, p. 080201, 2018.
  • [9] Y. Guo, C. C. Shu, D. Dong, and F. Nori, “Vanishing and revival of resonance Raman scattering,” Physical Review Letters, vol. 123, p. 223202, 2019.
  • [10] S. Kuang, D. Dong, and I. R. Petersen, “Rapid Lyapunov control of finite-dimensional quantum systems,” Automatica, vol. 81, pp. 164-175, 2017.
  • [11] D. Dong, X. Xing, H. Ma, C. Chen, Z. Liu and H. Rabitz, “Learning-based quantum robust control: algorithm, applications and experiments,” IEEE Transactions on Cybernetics, vol. 50, pp. 3581-3593, 2020.
  • [12] R. S. Judson, and H. Rabitz, “Teaching lasers to control molecules,” Physical Review Letters, vol. 68, pp. 1500-1503, 1992.
  • [13] N. Khaneja, T. Reiss, C. Kehlet, T. Schulte-Herbrüggen, and S. J. Glaser, “Optimal control of coupled spin dynamics: design of NMR pulse sequences by gradient ascent algorithms,” Journal of Magnetic Resonance, vol. 172, no. 2, pp. 296-305, 2005.
  • [14] T. Fösel, P. Tighineanu, T. Weiss, and F. Marquardt, “Reinforcement learning with neural networks for quantum feedback”, Physical Review X, vol. 8, p. 031084, 2018.
  • [15] G. Jäger, D. M. Reich, M. H. Goerz, C. P. Koch, and U. Hohenester, “Optimal quantum control of Bose-Einstein condensates in magnetic microtraps: Comparison of gradient-ascent-pulse-engineering and Krotov optimization schemes,” Physical Review A, vol. 90, p. 033628, 2014.
  • [16] D. Dong, C. Wu, C. Chen, B. Qi, I. R. Petersen and F. Nori, “Learning robust pulses for generating universal quantum gates,” Scientific Reports, vol. 6, p. 36090, 2016.
  • [17] T. Schulte-Herbrüggen, A. Spörl, N. Khaneja, and S. J. Glaser, “Optimal control for generating quantum gates in open dissipative systems,” Journal of Physics B: Atomic, Molecular and Optical Physics, vol. 44, p. 154013, 2011.
  • [18] R. B. Wu, B. Chu, D. H. Owens, and H. Rabitz, “Data-driven gradient algorithm for high-precision quantum control,” Physical Review A, vol. 97, p. 042122, 2018.
  • [19] C. C. Shu, T. S. Ho, X. Xing and H. Rabitz, “Frequency domain quantum optimal control under multiple constraints,” Physical Review A, vol. 93, p. 033417, 2016.
  • [20] R. Chakrabarti and H. Rabitz, “Quantum control landscapes,” International Reviews in Physical Chemistry, vol. 26, no. 4, pp. 671-735, 2007.
  • [21] E. Zahedinejad, J. Ghosh, and B. C. Sanders, “High-fidelity single-shot Toffoli gate via quantum control,” Physical Review Letters, vol. 114, no. 20, p. 200502, 2015.
  • [22] E. Zahedinejad, J. Ghosh, and B. C. Sanders, “Designing high-fidelity single-shot three-qubit gates: A machine-learning approach,” Physical Review Applied, vol. 6, p. 054005, 2016.
  • [23] H. Ma, D. Dong, C.-C. Shu, Z. Zhu, and C. Chen, “Differential evolution with equally-mixed strategies for robust control of open quantum systems,” Control Theory and Technology, vol. 15, pp. 226-241, 2017.
  • [24] E. Zahedinejad, S. Schirmer, and B. C. Sanders, “Evolutionary algorithms for hard quantum control,” Physical Review A, vol. 90, no. 3, p. 032310, 2014.
  • [25] J. S. Li, and N. Khaneja, “Control of inhomogeneous quantum ensembles,” Physical Review A, vol. 73, no. 3, p. 030302, 2006.
  • [26] C. Chen, D. Dong, R. Long, I. R. Petersen, and H. A. Rabitz, “Sampling-based learning control of inhomogeneous quantum ensembles,” Physical Review A, vol. 89, no. 2, p. 023402, 2014.
  • [27] D. Dong, M. A. Mabrok, I. R. Petersen, B. Qi, C. Chen, and H. Rabitz, “Sampling-based learning control for quantum systems with uncertainties,” IEEE Transactions on Control Systems Technology, vol. 23, pp. 2155-2166, 2015.
  • [28] D. Dong, C. Chen, B. Qi, I. R. Petersen and F. Nori, “Robust manipulation of superconducting qubits in the presence of fluctuations,” Scientific Reports, vol. 5, p. 7873, 2015.
  • [29] C. Wu, B. Qi, C. Chen, and D. Dong, “Robust learning control design for quantum unitary transformations,” IEEE Transactions on Cybernetics, vol. 47, pp. 4405-4417, 2017.
  • [30] R. Wu, H. Ding, D. Dong and X. Wang, “Learning robust and high-precision quantum controls,” Physical Review A, vol. 99, p. 042327, 2019.
  • [31] X. Xing, R. Rey-de-Castro and H. Rabitz, “Assessment of optimal control mechanism complexity by experimental landscape Hessian analysis: fragmentation of CH2BrI,” New Journal of Physics, vol. 16, p. 125004, 2014.
  • [32] R. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.
  • [33] C. Chen, D. Dong, H. X. Li, J. Chu, and T. J. Tarn, “Fidelity-based probabilistic Q-learning for control of quantum systems,” IEEE Transactions on Neural Networks and Learning Systems, vol. 25, pp. 920-933, 2014.
  • [34] M. Bukov, A. G. R. Day, D. Sels, P. Weinberg, A. Polkovnikov, and P. Mehta, “Reinforcement learning in different phases of quantum control”, Physical Review X, vol. 8, p. 031086, 2018.
  • [35] M. Y. Niu, S. Boixo, V. N. Smelyanskiy, and H. Neven, “Universal quantum control through deep reinforcement learning”, npj Quantum Information, vol. 5, p. 33, 2019.