Schrödinger-Heisenberg Variational Quantum Algorithms

Zhong-xia Shang (1, 2, 3), Ming-cheng Chen (1, 2, 3), Xiao Yuan (4, 5), Chao-yang Lu (1, 2, 3), Jian-wei Pan (1, 2, 3)

(1) Hefei National Laboratory for Physical Sciences at Microscale and Department of Modern Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
(2) Shanghai Branch, CAS Centre for Excellence and Synergetic Innovation Centre in Quantum Information and Quantum Physics, University of Science and Technology of China, Shanghai 201315, China
(3) Shanghai Research Center for Quantum Sciences, Shanghai 201315, China
(4) Center on Frontiers of Computing Studies, Peking University, Beijing 100871, China
(5) School of Computer Science, Peking University, Beijing 100871, China

Abstract

Recent breakthroughs have opened the possibility of intermediate-scale quantum computing with tens to hundreds of qubits and have shown the potential for solving classically challenging problems, such as those in chemistry and condensed matter physics. However, the high accuracy needed to surpass classical computers places a critical demand on the circuit depth, which is severely limited by the non-negligible gate infidelity, currently around $0.1$ to $1\%$. The limited circuit depth restricts the performance of variational quantum algorithms (VQA) and prevents VQAs from exploring the desired non-trivial quantum states. To resolve this problem, we propose a paradigm of Schrödinger-Heisenberg variational quantum algorithms (SH-VQA). Using SH-VQA, the expectation values of operators on states that require very deep circuits to prepare can be efficiently measured with rather shallow circuits. The idea is to combine a virtual Heisenberg circuit, which acts effectively on the measurement observables, with a real shallow Schrödinger circuit, which is implemented on the quantum hardware. We choose a Clifford virtual circuit, whose effect on the Hamiltonian amounts to efficient classical processing. Yet it greatly enlarges the state expressivity, realizing much larger unitary $t$-designs. Our method enables accurate quantum simulation and computation that are otherwise achievable only with much deeper circuits or more accurate operations. We verify this in numerical experiments that show a better approximation of random states and higher-fidelity solutions to the XXZ model and to the electronic structure Hamiltonians of small molecules. Thus, together with effective quantum error mitigation, our work paves the way for realizing accurate quantum computing algorithms with near-term quantum devices.

Figure 1: SH-VQE. (a): The SH-VQE circuit. The circuit is composed of the Schrödinger circuit $U$ and the Heisenberg circuit $T$, where $U$ is the local unitary circuit running on a real quantum computer and $T$ is the virtual circuit acting on the Hamiltonian. $T$ consists of two parts, the Clifford part and the single-qubit layer. The architecture we use for $U$ throughout this work is layers of parallel 2-qubit gates, which has a well-defined light cone that constrains the propagation of correlations and entanglement. (b): Improvements of SH-VQE. By adding the virtual circuit, $TU\left|0^{\otimes n}\right\rangle$ is able to explore more of the Hilbert space than $U\left|0^{\otimes n}\right\rangle$ in conventional VQE, so the trainable region of the Hilbert space is much larger. (c): Algorithm structure comparison between VQE and SH-VQE. The transformed Hamiltonian $H_{T}$ replaces $H$ in SH-VQE. We update the parameters in both $U$ and $T$ to minimize the expectation value of $H_{T}$.

Almost four decades after Richard Feynman put forward the idea of quantum computing [1], quantum advantage has recently been tested experimentally in solid-state systems [2, 3, 4] and photonic systems [5, 6]. However, those quantum computational advantage experiments focused on well-defined quantum sampling problems that were not designed to be practically useful. Therefore, the next important near-term milestone is to find algorithms for noisy intermediate-scale quantum (NISQ) [7] devices that solve non-trivial practical problems intractable for classical computation.

One of the most promising NISQ applications is the use of variational quantum algorithms (VQA) [8, 9], such as the variational quantum eigensolver (VQE) [10] and variational quantum simulation (VQS) [11]. In these algorithms a quantum circuit is optimized classically to approximate the energy of an eigenstate or to simulate the dynamics of a Hamiltonian, respectively, for tasks that arise widely in combinatorial optimization [12], condensed matter physics [13], and quantum chemistry [14, 15]. A practical advantage of such hybrid algorithms is their partial resilience to noise in the optimization and in the quantum hardware [8, 16, 17].

Considering the limitations of NISQ devices, VQAs generally use a shallow local unitary circuit (LUC) (Fig. 1(a)) to approximate the target quantum states. States prepared by shallow LUCs, however, could be trivial: they obey the entanglement area law [18] and can be well captured by classical tensor networks [19]. Indeed, the Lieb-Robinson bound [20] indicates that the entanglement light cone restricts the propagation of correlations and, therefore, shallow LUCs cannot generate long-range entanglement. However, the ground states of some Hamiltonians of interest, such as interacting spins at critical points [22, 23], topological quantum orders [24, 25], and interacting fermions in complex molecules [15], could be highly non-trivial and require a relatively deep LUC whose depth scales linearly or even faster with the qubit number [20, 21]. This is a big challenge for NISQ devices. Indeed, without effective quantum error correction, the final fidelity of a quantum circuit drops exponentially with the number of gates. For example, a state-of-the-art random quantum circuit with 60 qubits and 24 layers [3] ended up with a cross-entropy benchmarking fidelity as low as $0.037\%$. We would thus need to significantly improve the NISQ hardware to implement those VQAs to the desired accuracy.

This situation can be summarized as a tradeoff between the fidelity of the LUC and its expressivity [9] (i.e., the ability of the quantum circuit to “express” a sufficiently large volume of quantum states to include the non-trivial ones). To circumvent this problem, we propose a new framework of VQAs, enhanced by virtual Heisenberg circuits, which can noiselessly increase the effective circuit depth and thus simultaneously improve expressivity and fidelity. We note a related work by Zhang et al., in which classical neural networks serve a purpose similar to that of our virtual Heisenberg circuits [26]. The orbital-optimized unitary coupled cluster method [27] also shares a similar idea: it turns single-excitation circuits into classical processing of chemical Hamiltonians. We call our scheme Schrödinger-Heisenberg variational quantum algorithms (SH-VQA). The name reflects the main idea: in addition to the physical unitary circuit $U$ acting on the quantum state in the Schrödinger picture, we bring in a virtual circuit $T$ acting on the target Hamiltonian $H$ in the Heisenberg picture (see Fig. 1(a)). In the following, we consider SH-VQE as an example, but we note that the algorithm works for general VQAs. In this case, the energy expectation value of the system, $E(T,U)=\left\langle 0^{\otimes n}\left|U^{\dagger}T^{\dagger}HTU\right|0^{\otimes n}\right\rangle$, becomes

$$E(T,U)=\left\langle 0^{\otimes n}\left|U^{\dagger}H_{T}U\right|0^{\otimes n}\right\rangle \tag{1}$$

where the classically calculated transformed Hamiltonian $H_{T}=T^{\dagger}HT$ has the same energy spectrum as $H$. By properly choosing a relatively deep but noiseless $T$, the state $TU\left|0^{\otimes n}\right\rangle$ can explore the Hilbert space far outside the range of $U\left|0^{\otimes n}\right\rangle$ (see Fig. 1(b)) and hence can reach a lower and more accurate ground-state energy than conventional VQE for non-trivial problems. We show the workflow of SH-VQE together with a comparison to conventional VQE in Fig. 1(c). In contrast to VQE, both the real Schrödinger circuit $U$ and the virtual Heisenberg circuit $T$ in SH-VQE are parametrized and updated when minimizing the expectation value $E(T,U)$. The key feature of SH-VQE is that only $U$, a shallow LUC, is physically implemented, whereas the relatively deep circuit $T$ is performed virtually and noiselessly on a classical computer.
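The equivalence between the two pictures in Eq. (1) can be checked numerically on a toy case. The following is a minimal sketch, not the authors' implementation: the 2-qubit Hamiltonian, the choice $T=\mathrm{CNOT}\cdot(\mathrm{H}\otimes I)$, and the rotation angles in $U$ are all illustrative assumptions.

```python
import numpy as np

# Single- and two-qubit building blocks.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
HAD = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def ry(theta):
    """Single-qubit rotation about the Y axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

# Toy Hamiltonian (assumption): H = Z(x)Z + 0.5 X(x)I.
H = np.kron(Z, Z) + 0.5 * np.kron(X, I2)

# Virtual Heisenberg circuit T (Clifford) and physical Schrödinger circuit U.
T = CNOT @ np.kron(HAD, I2)
U = np.kron(ry(0.3), ry(1.1))

zero = np.zeros(4, dtype=complex)
zero[0] = 1.0

# Heisenberg picture: transform the Hamiltonian, keep the shallow state.
H_T = T.conj().T @ H @ T
psi_U = U @ zero
E_heisenberg = np.vdot(psi_U, H_T @ psi_U).real

# Schrödinger picture: apply T to the state, keep the original Hamiltonian.
psi_TU = T @ psi_U
E_schrodinger = np.vdot(psi_TU, H @ psi_TU).real

# The unitary transformation leaves the spectrum unchanged.
spec_H = np.linalg.eigvalsh(H)
spec_HT = np.linalg.eigvalsh(H_T)
```

Here `E_heisenberg` and `E_schrodinger` agree up to floating-point error, and `spec_H` matches `spec_HT`, which is the property the text relies on.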

We first show how to measure $H_{T}$ efficiently. In general, the target Hamiltonian $H$ can be expressed as a linear combination of multi-qubit Pauli terms, $H=\sum_{i=1}^{m}g_{i}P_{i}$, where $P_{i}\in\left\{\sigma_{I},\sigma_{X},\sigma_{Y},\sigma_{Z}\right\}^{\otimes n}$. We can then measure each $P_{i}$ with a total number of samples $\left(\frac{m}{\epsilon^{2}}\right)\sum_{i}g_{i}^{2}\operatorname{Var}\left[P_{i}\right]$, proportional to the number of terms $m$ in the Hamiltonian [28], to evaluate the energy expectation value within an error of $\epsilon$. Here $\operatorname{Var}\left[P_{i}\right]=\left\langle P_{i}^{2}\right\rangle-\left\langle P_{i}\right\rangle^{2}$. We can measure $H_{T}$ in the same way by decomposing each $T^{\dagger}P_{i}T$ into Pauli strings. While most practical Hamiltonians $H$ contain only a polynomial number of terms, this might not be the case for $T^{\dagger}P_{i}T$ or $H_{T}$ after the transformation (see Appendix A).

Here we propose a structure of the Heisenberg circuit that leads to efficiently measurable $T^{\dagger}P_{i}T$ and $H_{T}$. The circuit consists of two parts (Fig. 1(a)): the first part is an arbitrary Clifford circuit, which can be decomposed into a sequence of $O\left(n^{2}\right)$ basic gates from the set $\{\mathrm{H},\mathrm{S},\mathrm{CNOT}\}$, and the second part is a layer of single-qubit gates. The first part uses discrete gates such as CNOT to build correlations between any two qubits, and the second part makes the ansatz continuous. The Clifford circuit maps the multi-qubit Pauli group to itself, which conserves the number of terms of the Hamiltonian. Moreover, the Gottesman-Knill theorem [30] indicates that calculating the transformed Hamiltonian is easy. While the second part might increase the number of terms of the Hamiltonian, the overhead is polynomial for Hamiltonians $H$ consisting only of terms of weight at most $k$ (i.e., the Pauli operators $\left\{\sigma_{X},\sigma_{Y},\sigma_{Z}\right\}$ act on at most $k$ qubits), since a single-qubit layer leaves the weight unchanged and each such term expands into at most $3^{k}$ Pauli strings. We note that one can replace this part with simpler or more complex circuits for different Hamiltonians, considering the trade-off between the circuit power and the measurement cost.
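The term-counting argument above can be illustrated by brute force on a few qubits. The sketch below is an assumption-laden toy (3 qubits, a hand-picked Clifford circuit, random single-qubit unitaries from a QR decomposition, and hypothetical helper names); it uses dense matrices rather than the efficient stabilizer update implied by the Gottesman-Knill theorem.

```python
import itertools
import numpy as np

PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def kron_all(mats):
    """Tensor product of a list of matrices, leftmost factor first."""
    out = np.eye(1, dtype=complex)
    for m in mats:
        out = np.kron(out, m)
    return out

def pauli_terms(op, n, tol=1e-10):
    """Return {label: coefficient} for the nonzero Pauli components of op."""
    terms = {}
    for chars in itertools.product("IXYZ", repeat=n):
        label = "".join(chars)
        basis = kron_all([PAULI[ch] for ch in label])
        coeff = np.trace(basis @ op) / 2**n
        if abs(coeff) > tol:
            terms[label] = coeff
    return terms

def random_unitary(rng):
    """Random 2x2 unitary from the QR decomposition of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))
    return q

n = 3
P = kron_all([PAULI[ch] for ch in "ZZI"])  # a weight-2 Pauli string

# Clifford part: CNOT on qubits (0, 1), then S on qubit 0 and H on qubit 2.
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
S = np.diag([1, 1j])
HAD = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
C = kron_all([S, PAULI["I"], HAD]) @ np.kron(CNOT, PAULI["I"])
clifford_terms = pauli_terms(C.conj().T @ P @ C, n)

# Single-qubit layer: independent random unitaries on every qubit.
rng = np.random.default_rng(7)
V = kron_all([random_unitary(rng) for _ in range(n)])
layer_terms = pauli_terms(V.conj().T @ P @ V, n)
```

In this toy case `clifford_terms` holds a single Pauli string, while `layer_terms` holds at most $3^{2}=9$ strings, all of which still act trivially on the third qubit.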

We now study the expressivity of the circuit in SH-VQE. We measure expressivity using the notion of a quantum complex projective $t$-design [31], which means that the distribution of the output states has moments equal to those of Haar-uniformly distributed states over the whole Hilbert space up to order $t$. Intuitively, as illustrated in Fig. 2(a) [32], a higher $t$-design indicates a more uniform and denser state distribution in the Hilbert space, and vice versa. In general, a LUC of depth $O\left(nt^{10}\right)$ is needed to generate a $t$-design [33], and Clifford circuits can produce a 3-design [34]. Using the tight Page’s theorem [29], we define the logarithmic difference of entanglement entropy as

$$\Delta_{t}=\log\left(E_{\mathrm{Haar}}\left[\operatorname{Tr}\left(\rho_{n/2}^{t}\right)\right]\right)-\log\left(E_{\mathrm{SH}}\left[\operatorname{Tr}\left(\rho_{n/2}^{t}\right)\right]\right) \tag{2}$$

to identify the order of expressivity of SH-VQE, where $\rho_{n/2}$ is the reduced density matrix of half of the system, $E_{\mathrm{Haar}}$ is the average over Haar random states, and $E_{\mathrm{SH}}$ is the average over the quantum states $TU\left|0^{\otimes n}\right\rangle$. If $\Delta_{t}$ increases and approaches 0, the ensemble $TU\left|0^{\otimes n}\right\rangle$ behaves as a $t$-design.
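To make Eq. (2) concrete, the sketch below estimates $\Delta_{t}$ for two reference ensembles on 6 qubits: random product states (no entanglement) and a second, independent Haar sample. It is only an illustration under stated assumptions; the qubit number, sample size, and helper names are not from the paper, and the SH-VQE ensemble itself is not simulated here.

```python
import numpy as np

def haar_state(n, rng):
    """Haar-random pure state on n qubits (normalized complex Gaussian vector)."""
    v = rng.normal(size=2**n) + 1j * rng.normal(size=2**n)
    return v / np.linalg.norm(v)

def product_state(n, rng):
    """Tensor product of n independent random single-qubit states."""
    state = np.ones(1, dtype=complex)
    for _ in range(n):
        state = np.kron(state, haar_state(1, rng))
    return state

def half_chain_moment(state, n, t):
    """Tr(rho_{n/2}^t) for the first n/2 qubits of a pure n-qubit state."""
    m = state.reshape(2 ** (n // 2), -1)
    # Squared singular values are the eigenvalues of the reduced density matrix.
    probs = np.linalg.svd(m, compute_uv=False) ** 2
    return float(np.sum(probs ** t))

def delta_t(ensemble, haar_ensemble, n, t):
    """Logarithmic difference of Eq. (2) between a test ensemble and Haar samples."""
    e_haar = np.mean([half_chain_moment(s, n, t) for s in haar_ensemble])
    e_test = np.mean([half_chain_moment(s, n, t) for s in ensemble])
    return float(np.log(e_haar) - np.log(e_test))

rng = np.random.default_rng(1)
n, t, samples = 6, 2, 200
haar_a = [haar_state(n, rng) for _ in range(samples)]
haar_b = [haar_state(n, rng) for _ in range(samples)]
products = [product_state(n, rng) for _ in range(samples)]

delta_product = delta_t(products, haar_a, n, t)  # clearly negative
delta_haar = delta_t(haar_b, haar_a, n, t)       # close to zero
```

Product states have $\operatorname{Tr}(\rho_{n/2}^{t})=1$, so `delta_product` is well below zero, whereas an ensemble that reproduces the Haar moments gives a value near zero, matching the reading of $\Delta_{t}$ in the text.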

Fig. 2(b) compares the expressivity of SH-VQE and VQE through a numerical experiment on a 12-qubit system. In the VQE setting, we run a random LUC at different depths and calculate $\Delta_{t}$ to characterize the $t$-design. In the SH-VQE setting, we implement both the real Schrödinger circuits $U$ and the virtual Heisenberg circuits $T$, which are pure Clifford circuits consisting of 500 random gates from $\{\mathrm{H},\mathrm{S},\mathrm{CNOT}\}$. The key quantity in both cases is the critical depth at which $\Delta_{t}$ rises to and saturates at around 0. The $\Delta_{t}$ curves for SH-VQE rise much more rapidly than those for VQE for all $t$ values from 3 to 12. The SH-VQE curve reaches the saturation point at a Schrödinger circuit depth of $\sim 2$, while the VQE curve reaches it only at a much larger depth of $\sim 36$. This indicates that SH-VQE can reduce the gate depth by more than one order of magnitude while achieving the same level of expressivity. For a larger number of qubits, we expect an even more dramatic advantage, as can be inferred from the qubit-number-dependent test of depth reduction shown in Fig. D.3. These results indicate that we can use current NISQ hardware to effectively run deep quantum circuits while maintaining high fidelity. In particular, with a two-qubit gate fidelity of $99.5\%$, SH-VQE allows us to run, for instance, a 12-qubit 4-depth quantum circuit with an output fidelity of $90\%$, which in conventional VQE would demand a two-qubit gate fidelity of $99.95\%$ (currently unrealistic) at a depth of 40 (Fig. D.2). Note that shallow LUCs or Clifford circuits alone can only generate small design orders, but their combination can achieve high expressivity.
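The fidelity figures quoted above can be roughly reproduced with back-of-the-envelope arithmetic. The sketch below rests on two assumptions that are not spelled out in the text: the output fidelity is approximated by the product of the two-qubit gate fidelities, and the layers of parallel 2-qubit gates on 12 qubits alternate between 6 and 5 gates (an open-chain brickwork layout). The function names are hypothetical.

```python
def brickwork_gate_count(n_qubits, depth):
    """Two-qubit gates in `depth` alternating layers on an open chain."""
    even_layer = n_qubits // 2        # pairs (0,1), (2,3), ...
    odd_layer = (n_qubits - 1) // 2   # pairs (1,2), (3,4), ...
    return sum(even_layer if layer % 2 == 0 else odd_layer
               for layer in range(depth))

def output_fidelity(gate_fidelity, n_qubits, depth):
    """Crude estimate: product of all two-qubit gate fidelities."""
    return gate_fidelity ** brickwork_gate_count(n_qubits, depth)

# SH-VQE setting: depth 4 at 99.5% gate fidelity.
shallow = output_fidelity(0.995, 12, 4)
# Conventional VQE setting: depth 40 at 99.95% gate fidelity.
deep = output_fidelity(0.9995, 12, 40)
```

Under these assumptions both `shallow` and `deep` come out close to $0.9$, in line with the $90\%$ output fidelity stated in the text; a different gate layout or noise model would shift the numbers.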

We consider an example of the XXZ spin model with a periodic boundary condition

$$H_{XXZ}=\sum_{i=1}^{n}\left[\sigma_{i}^{x}\sigma_{i+1}^{x}+\sigma_{i}^{y}\sigma_{i+1}^{y}+\Delta\sigma_{i}^{z}\sigma_{i+1}^{z}\right] \tag{3}$$

to demonstrate the workflow of SH-VQE. At the critical point $\Delta=1$, the XXZ model is equivalent to the Heisenberg model, whose ground state has a logarithmic scaling of entanglement entropy [22, 23] and hence cannot be prepared by a constant-depth LUC. Since we aim to boost the performance of NISQ experiments, we use the hardware-efficient ansatz [14] for the real Schrödinger circuit, even though this may lead to barren plateau problems [35]. Each circuit layer is composed of a layer of CZ gates and a layer of parametrized arbitrary single-qubit gates (with parameters denoted as $\vec{\theta}$).
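For reference, the Hamiltonian of Eq. (3) can be built as a dense matrix for small chains. This is a minimal sketch for checking small instances by exact diagonalization, not the simulation code used in the paper; the helper names are hypothetical and the construction assumes $n\geq 3$ so that the periodic bonds are all distinct.

```python
import numpy as np

PAULIS = {
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def two_site(op, i, j, n):
    """Operator acting with `op` on sites i and j and the identity elsewhere."""
    out = np.eye(1, dtype=complex)
    for site in range(n):
        out = np.kron(out, op if site in (i, j) else np.eye(2, dtype=complex))
    return out

def xxz_hamiltonian(n, delta):
    """Dense XXZ Hamiltonian of Eq. (3) with periodic boundary conditions (n >= 3)."""
    H = np.zeros((2**n, 2**n), dtype=complex)
    for i in range(n):
        j = (i + 1) % n  # periodic boundary: site n wraps around to site 1
        for name, weight in (("X", 1.0), ("Y", 1.0), ("Z", delta)):
            H += weight * two_site(PAULIS[name], i, j, n)
    return H

# Four-spin ring at the critical point Delta = 1 (Heisenberg model).
H4 = xxz_hamiltonian(4, 1.0)
ground_energy = float(np.linalg.eigvalsh(H4)[0])
```

For the four-spin ring at $\Delta=1$ the exact ground-state energy is $-8$ in these Pauli-matrix units, which gives a quick sanity check of the construction.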

For the Heisenberg circuit, the single-qubit gate layer is parametrized with parameters $\vec{\phi}$, and we restrict the Clifford part to graph circuits [37], in which only commuting CZ gates are used. We separate the graph circuit into patterns of different connectivity that share the translationally invariant (TI) symmetry of $H_{XXZ}$. More concretely, for an $n$-qubit circuit, we define $\lfloor n/2\rfloor$ elementary graphs, where in the $j$-th elementary graph each node $i$ is connected to node $i+j$ and $\lfloor\cdot\rfloor$ is the floor function. As each elementary graph can be turned on or off, the total number of possible patterns is $2^{\lfloor n/2\rfloor}$. We label the patterns with $\lfloor n/2\rfloor$-bit strings such as ‘$01001\ldots$’, where 1 means the corresponding elementary graph is turned on and 0 means it is turned off (Fig. 3(a)), so that the all-ones string is the fully connected graph. To search efficiently through the exponentially large space of Clifford gate patterns, we borrow the idea of differentiable quantum architecture search [38], where each elementary TI graph is turned on independently with a probability given by a two-parameter softmax function [39]. Thus, only $\lfloor n/2\rfloor\times 2$ parameters (denoted as $\vec{\alpha}$) are needed to implement the discrete search over the huge set of Clifford patterns. Therefore, the circuit ansatz for SH-VQE is

$$T(\vec{\alpha},\vec{\phi})U(\vec{\theta})\left|0^{\otimes n}\right\rangle \tag{4}$$

where $\vec{\alpha}$ and $\vec{\phi}$ represent all configurations of the Heisenberg circuit $T$ and $\vec{\theta}$ are the continuous parameters in the single-qubit gates inside the Schrödinger circuit $U$ . The parameters $\vec{\alpha}$ are used to generate samples of different circuits and the cost function is the average of the Hamiltonian expectation values of these circuits under the same gate parameters $\vec{\theta}$ and $\vec{\phi}$ . The SH-VQE method then optimizes over all the parameters to search for the ground state of the Hamiltonian.
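The pattern bookkeeping described above can be sketched as follows. The code takes ‘1’ to mean that an elementary graph is switched on, consistent with the fully connected graph being labeled ‘1111’ for 8 qubits, and it uses the product-of-softmaxes probability of note [39]. Function names, the array layout of $\vec{\alpha}$, and the sampling step are illustrative assumptions rather than the authors' code.

```python
import itertools
import numpy as np

def elementary_graph(n, j):
    """Edges of the j-th elementary TI graph: node i is joined to node (i + j) mod n."""
    return sorted({tuple(sorted((i, (i + j) % n))) for i in range(n)})

def pattern_edges(n, pattern):
    """CZ edges of a pattern such as '1011'; bit j-1 switches elementary graph j."""
    edges = set()
    for j, bit in enumerate(pattern, start=1):
        if bit == "1":
            edges.update(elementary_graph(n, j))
    return sorted(edges)

def pattern_probability(alpha, pattern):
    """Product of independent two-way softmaxes; alpha has shape (n // 2, 2)."""
    z = np.exp(alpha - alpha.max(axis=1, keepdims=True))  # numerically stable softmax
    probs = z / z.sum(axis=1, keepdims=True)
    return float(np.prod([probs[i, int(k)] for i, k in enumerate(pattern)]))

def sample_pattern(alpha, rng):
    """Draw one pattern by sampling every elementary graph independently."""
    z = np.exp(alpha - alpha.max(axis=1, keepdims=True))
    probs = z / z.sum(axis=1, keepdims=True)
    return "".join(str(rng.choice(2, p=row)) for row in probs)

n = 8
rng = np.random.default_rng(0)
alpha = rng.normal(size=(n // 2, 2))  # hypothetical initial architecture parameters

all_patterns = ["".join(bits) for bits in itertools.product("01", repeat=n // 2)]
total_probability = sum(pattern_probability(alpha, p) for p in all_patterns)
full_graph = pattern_edges(n, "1111")
one_sample = sample_pattern(alpha, rng)
```

For 8 qubits the four elementary graphs contribute 8, 8, 8, and 4 edges, so the pattern ‘1111’ yields all 28 qubit pairs, and the probabilities of the 16 patterns sum to one. In an optimization loop, patterns drawn with `sample_pattern` would each define a Clifford part whose energy enters the averaged cost function.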

In our numerical simulation, we consider an 8-spin XXZ model with a 4-depth circuit $U$ and 4 elementary TI graphs for the Clifford layer, as shown in Fig. 3(a). Fig. 3(b) shows the energy expectation value and the probabilities of all 16 configurations as functions of the number of optimization iterations. When the energy expectation value has converged, the probabilities of the candidate circuit structures concentrate on the optimal configuration, the fully connected graph ‘1111’. In Fig. 3(c), we show the optimal energies of all 16 candidate circuit configurations, which verifies that the optimal configuration is indeed the fully connected graph ‘1111’. We further solve larger models of up to 16 spins to show the improvement of SH-VQE over conventional VQE using the same Schrödinger circuits (Fig. 3(d)). For SH-VQE, we directly use the generalized fully connected graph circuits as the Clifford part. We find that, at the same circuit depth, SH-VQE obtains higher fidelities than VQE (an average improvement of $25.2\%$).

To further demonstrate the practical value of our algorithm, we apply it to the electronic structure problems of the $\mathrm{H}_{4}$ and $\mathrm{H}_{2}\mathrm{O}$ molecules, following the same workflow as above. The $\mathrm{H}_{4}$ molecule corresponds to an 8-qubit Hamiltonian. For the $\mathrm{H}_{2}\mathrm{O}$ molecule, we use the active space method [40] to create an effective 10-qubit Hamiltonian containing 10 spin orbitals and 6 electrons. Since SH-VQA restricts the Pauli weight, we use the Bravyi-Kitaev mapping, which transforms an $M$-mode fermionic Hamiltonian into a spin Hamiltonian of $O\left(\log_{2}M\right)$ Pauli weight [41, 40]. Note that the ground states of these molecular Hamiltonians have the correct number of electrons. The results are shown in Fig. 4, where SH-VQE reaches chemical accuracy ($1.6\times 10^{-3}$ Hartree) with Schrödinger circuits of much shallower depth than VQE.

We now discuss several aspects of SH-VQA. First, we emphasize that the states $TU\left|0^{\otimes n}\right\rangle$ are both hard to prepare on NISQ devices, as doing so requires implementing the relatively deep circuit $T$, and hard to simulate on classical computers, as they can be treated as outputs of Clifford circuits with non-stabilizer input states. Interestingly, however, within the SH-VQE framework the operator expectation values under these states can be efficiently evaluated as long as $U$ is classically tractable. Second, we comment on the trainability of SH-VQA. A known result is that, in general, an ansatz with high expressivity may have low trainability [42]. We emphasize that the expressivity benchmarked under the highly random settings of Fig. 2 should be understood as the achievable expressivity of NISQ devices enhanced by Heisenberg circuits, not as the actual expressivity of the ansatz used within the SH-VQA framework for a specific problem. Thus, SH-VQA can be understood as a general methodology for improving existing variational algorithms, within which biased and trainable ansatzes can be tested. We summarize some strategies in the Appendix.

In summary, we have introduced a novel variational quantum algorithm, SH-VQA, to efficiently extend the circuit depth of near-term noisy quantum processors. By virtually introducing relatively deep and non-local Clifford circuits, we show that the expressivity of shallow quantum circuits can be significantly enhanced without sacrificing fidelity. We use the XXZ model to demonstrate the workflow of SH-VQA and further demonstrate its practical value by solving small molecules. Our method is directly applicable to current quantum hardware and is compatible with most existing quantum algorithms. Combined with quantum error mitigation, our work pushes near-term quantum hardware toward a wide range of non-trivial applications.

We use the Qulacs [43] and Qiskit [44] packages for parts of the simulations.

References

[1] Richard P Feynman. Simulating physics with computers. In Feynman and computation, pages 133–153. CRC Press, 2018.
[2] Frank Arute, Kunal Arya, Ryan Babbush, Dave Bacon, Joseph C Bardin, Rami Barends, Rupak Biswas, Sergio Boixo, Fernando GSL Brandao, David A Buell, et al. Quantum supremacy using a programmable superconducting processor. Nature, 574(7779):505–510, 2019.
[3] Qingling Zhu, Sirui Cao, Fusheng Chen, Ming-Cheng Chen, Xiawei Chen, Tung-Hsun Chung, Hui Deng, Yajie Du, Daojin Fan, Ming Gong, et al. Quantum computational advantage via 60-qubit 24-cycle random circuit sampling. Science bulletin, 67(3):240–245, 2022.
[4] Yulin Wu, Wan-Su Bao, Sirui Cao, Fusheng Chen, Ming-Cheng Chen, Xiawei Chen, Tung-Hsun Chung, Hui Deng, Yajie Du, Daojin Fan, et al. Strong quantum computational advantage using a superconducting quantum processor. Physical review letters, 127(18):180501, 2021.
[5] Han-Sen Zhong, Hui Wang, Yu-Hao Deng, Ming-Cheng Chen, Li-Chao Peng, Yi-Han Luo, Jian Qin, Dian Wu, Xing Ding, Yi Hu, et al. Quantum computational advantage using photons. Science, 370(6523):1460–1463, 2020.
[6] Han-Sen Zhong, Yu-Hao Deng, Jian Qin, Hui Wang, Ming-Cheng Chen, Li-Chao Peng, Yi-Han Luo, Dian Wu, Si-Qiu Gong, Hao Su, et al. Phase-programmable Gaussian boson sampling using stimulated squeezed light. Physical Review Letters, 127(18):180502, 2021.
[7] John Preskill. Quantum computing in the NISQ era and beyond. Quantum, 2:79, 2018.
[8] Suguru Endo, Zhenyu Cai, Simon C Benjamin, and Xiao Yuan. Hybrid quantum-classical algorithms and quantum error mitigation. Journal of the Physical Society of Japan, 90(3):032001, 2021.
[9] Marco Cerezo, Andrew Arrasmith, Ryan Babbush, Simon C Benjamin, Suguru Endo, Keisuke Fujii, Jarrod R McClean, Kosuke Mitarai, Xiao Yuan, Lukasz Cincio, et al. Variational quantum algorithms. Nature Reviews Physics, 3(9):625–644, 2021.
[10] Alberto Peruzzo, Jarrod McClean, Peter Shadbolt, Man-Hong Yung, Xiao-Qi Zhou, Peter J Love, Alán Aspuru-Guzik, and Jeremy L O’Brien. A variational eigenvalue solver on a photonic quantum processor. Nature Communications, 5(1):4213, 2014.
[11] Xiao Yuan, Suguru Endo, Qi Zhao, Ying Li, and Simon C Benjamin. Theory of variational quantum simulation. Quantum, 3:191, 2019.
[12] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm. arXiv preprint arXiv:1411.4028, 2014.
[13] Dave Wecker, Matthew B Hastings, and Matthias Troyer. Progress towards practical quantum variational algorithms. Physical Review A, 92(4):042303, 2015.
[14] Abhinav Kandala, Antonio Mezzacapo, Kristan Temme, Maika Takita, Markus Brink, Jerry M Chow, and Jay M Gambetta. Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature, 549(7671):242–246, 2017.
[15] Jonathan Romero, Ryan Babbush, Jarrod R McClean, Cornelius Hempel, Peter J Love, and Alán Aspuru-Guzik. Strategies for quantum computing molecular energies using the unitary coupled cluster ansatz. Quantum Science and Technology, 4(1):014008, 2018.
[16] Kristan Temme, Sergey Bravyi, and Jay M Gambetta. Error mitigation for short-depth quantum circuits. Physical review letters, 119(18):180509, 2017.
[17] Ying Li and Simon C Benjamin. Efficient variational quantum simulator incorporating active error minimization. Physical Review X, 7(2):021050, 2017.
[18] Fernando GSL Brandao and Michał Horodecki. Exponential decay of correlations implies area law. Communications in mathematical physics, 333:761–798, 2015.
[19] Frank Verstraete, Valentin Murg, and J Ignacio Cirac. Matrix product states, projected entangled pair states, and variational renormalization group methods for quantum spin systems. Advances in physics, 57(2):143–224, 2008.
[20] Sergey Bravyi, Matthew B Hastings, and Frank Verstraete. Lieb-Robinson bounds and the generation of correlations and topological quantum order. Physical Review Letters, 97(5):050401, 2006.
[21] Wen Wei Ho and Timothy H Hsieh. Efficient variational simulation of non-trivial quantum states. SciPost Phys, 6:029, 2019.
[22] Guifre Vidal, José Ignacio Latorre, Enrique Rico, and Alexei Kitaev. Entanglement in quantum critical phenomena. Physical review letters, 90(22):227902, 2003.
[23] José Ignacio Latorre, Enrique Rico, and Guifré Vidal. Ground state entanglement in quantum spin chains. arXiv preprint quant-ph/0304098, 2003.
[24] Yichen Huang, Xie Chen, et al. Quantum circuit complexity of one-dimensional topological phases. Physical Review B, 91(19):195143, 2015.
[25] Xie Chen, Zheng-Cheng Gu, and Xiao-Gang Wen. Local unitary transformation, long-range quantum entanglement, wave function renormalization, and topological order. Physical review b, 82(15):155138, 2010.
[26] Shi-Xin Zhang, Zhou-Quan Wan, Chee-Kong Lee, Chang-Yu Hsieh, Shengyu Zhang, and Hong Yao. Variational quantum-neural hybrid eigensolver. Physical Review Letters, 128(12):120502, 2022.
[27] Wataru Mizukami, Kosuke Mitarai, Yuya O Nakagawa, Takahiro Yamamoto, Tennin Yan, and Yu-ya Ohnishi. Orbital optimized unitary coupled cluster theory for quantum computer. Physical Review Research, 2(3):033421, 2020.
[28] Jarrod R McClean, Jonathan Romero, Ryan Babbush, and Alán Aspuru-Guzik. The theory of variational hybrid quantum-classical algorithms. New Journal of Physics, 18(2):023023, 2016.
[29] Zi-Wen Liu, Seth Lloyd, Elton Zhu, and Huangjun Zhu. Entanglement, quantum randomness, and complexity beyond scrambling. Journal of High Energy Physics, 2018(7):1–62, 2018.
[30] Daniel Gottesman. The Heisenberg representation of quantum computers. arXiv preprint quant-ph/9807006, 1998.
[31] Stuart G Hoggar. t-designs in projective spaces. European Journal of Combinatorics, 3(3):233–254, 1982.
[32] Data source. https://www.polyu.edu.hk/ama/staff/xjchen/sphdesigns.html.
[33] Fernando GSL Brandao, Aram W Harrow, and Michał Horodecki. Local random quantum circuits are approximate polynomial-designs. Communications in Mathematical Physics, 346(2):397–434, 2016.
[34] Huangjun Zhu. Multiqubit Clifford groups are unitary 3-designs. Physical Review A, 96(6):062336, 2017.
[35] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven. Barren plateaus in quantum neural network training landscapes. Nature communications, 9(1):4812, 2018.
[36] Yazan Arouri and Mohammad Sayyafzadeh. An accelerated gradient algorithm for well control optimization. Journal of Petroleum Science and Engineering, 190:106872, 2020.
[37] Marc Hein, Wolfgang Dür, Jens Eisert, Robert Raussendorf, M Nest, and H-J Briegel. Entanglement in graph states and its applications. arXiv preprint quant-ph/0602096, 2006.
[38] Shi-Xin Zhang, Chang-Yu Hsieh, Shengyu Zhang, and Hong Yao. Differentiable quantum architecture search. arXiv preprint arXiv:2010.08561, 2020.
[39] The probability of picking a graph $\vec{k}=$ ‘$k_{1}k_{2}\ldots k_{\lfloor n/2\rfloor}$’ is given by the product of $\lfloor n/2\rfloor$ independent softmax functions, $\prod_{i=1}^{\lfloor n/2\rfloor}\frac{\exp\left(\alpha_{i,k_{i}}\right)}{\exp\left(\alpha_{i,0}\right)+\exp\left(\alpha_{i,1}\right)}$.
[40] Sam McArdle, Suguru Endo, Alán Aspuru-Guzik, Simon C Benjamin, and Xiao Yuan. Quantum computational chemistry. Reviews of Modern Physics, 92(1):015003, 2020.
[41] Sergey B Bravyi and Alexei Yu Kitaev. Fermionic quantum computation. Annals of Physics, 298(1):210–226, 2002.
[42] Zoë Holmes, Kunal Sharma, Marco Cerezo, and Patrick J Coles. Connecting ansatz expressibility to gradient magnitudes and barren plateaus. PRX Quantum, 3(1):010313, 2022.
[43] Yasunari Suzuki, Yoshiaki Kawase, Yuya Masumura, Yuria Hiraga, Masahiro Nakadai, Jiabao Chen, Ken M Nakanishi, Kosuke Mitarai, Ryosuke Imai, Shiro Tamiya, et al. Qulacs: a fast and versatile quantum circuit simulator for research purpose. arXiv preprint arXiv:2011.13524, 2020.
[44] Andrew Cross. The IBM Q experience and Qiskit open-source quantum computing software. In APS March Meeting Abstracts, volume 2018, pages L58–003, 2018.

Appendix A Measurement cost

For variational hybrid quantum-classical algorithms, a key step is to evaluate operators’ expectation values. Typically, we use the so-called operator averaging method, which has no requirements on circuits but requires a large number of measurements.

Without loss of generality, we consider a simple original Hamiltonian $H$ that is composed of only one Pauli operator. Let $P_{h}$ be the Pauli operator whose expectation value $\left\langle P_{h}\right\rangle=\operatorname{tr}\left(\rho P_{h}\right)$ we want to evaluate. The variance of measuring $P_{h}$ is $\operatorname{Var}\left[P_{h}\right]_{\rho}=\operatorname{tr}\left(\rho P_{h}^{2}\right)-\operatorname{tr}\left(\rho P_{h}\right)^{2}$. If we repeat the measurement $N_{1}$ times, the variance of the estimate becomes

$$\operatorname{Var}\left[P_{h}\right]_{\rho}\rightarrow\frac{\operatorname{Var}\left[P_{h}\right]_{\rho}}{N_{1}} \tag{5}$$

If we want to reach an accuracy of $\epsilon$ , the number of measurements we need is

$$N_{1}=\frac{\operatorname{Var}\left[P_{h}\right]_{\rho}}{\epsilon^{2}} \tag{6}$$

Now suppose we apply a circuit $T$ to the Pauli operator $P_{h}$:

$$P_{h}\rightarrow T^{\dagger}P_{h}T=\sum_{i=1}^{m_{h}}c_{i}P_{i} \tag{7}$$

where $\sum_{i=1}^{m_{h}}c_{i}^{2}=1$. The circuit transforms $P_{h}$ into $m_{h}$ parts. For the corresponding state $\rho_{T}=T^{\dagger}\rho T$, the variance is unchanged under the transformation: $\operatorname{Var}\left[P_{h}\right]_{\rho}=\operatorname{Var}\left[T^{\dagger}P_{h}T\right]_{\rho_{T}}$. However, we need to measure each term individually, so the effective variance of those measurements does increase. One natural way is to use $N_{2}/m_{h}$ measurements to evaluate each Pauli term and then sum the results. The variance obtained in this way is

\sum_{i=1}^{m_{h}}\frac{m_{h}c_{i}^{2}\operatorname{Var}\left[P_{i}\right]_{\rho_{T}}}{N_{2}}

(8)

If we want to reach the same accuracy of $\epsilon$ , the number of measurements $N_{2}$ we need is

N_{2}=\frac{m_{h}\sum_{i=1}^{m_{h}}c_{i}^{2}\operatorname{Var}\left[P_{i}\right]_{\rho_{T}}}{\epsilon^{2}}\approx m_{h}\frac{\operatorname{Var}\left[P_{h}\right]_{\rho}}{\epsilon^{2}}=m_{h}N_{1}

(9)

The approximation above holds because every $\left\langle P_{i}\right\rangle$ can be treated as the expectation value of a Bernoulli-type random variable with outcomes $\pm 1$, $\left\langle P_{i}\right\rangle=p_{1}\cdot 1+p_{-1}\cdot(-1)$, whose variance $4p_{1}p_{-1}$ is at most 1. We therefore assume that the variances of the Pauli terms are at the same level. From Eq. 9 we can see that $m_{h}$ must be kept at a tolerable level, which explains why the structure of VHC must be restricted. This conclusion does not change even if one groups commuting terms.
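The scaling in Eq. 9 can be checked with a short Python sketch. It evaluates the uniform-allocation budget $N_{2}$ for a given set of coefficients and per-term variances and compares it with $N_{1}$. The helper names and the four-term example are hypothetical and only serve as an illustration under the equal-variance assumption.

```python
def single_term_budget(variance: float, eps: float) -> float:
    """N_1 from Eq. 6 for a single Pauli operator (not rounded)."""
    return variance / eps ** 2


def uniform_shot_budget(coeffs, variances, eps: float) -> float:
    """N_2 from Eq. 9 when each of the m_h terms receives N_2 / m_h shots."""
    if len(coeffs) != len(variances):
        raise ValueError("need one variance per coefficient")
    m_h = len(coeffs)
    # Weighted sum of the per-term variances, sum_i c_i^2 Var[P_i].
    weighted = sum(c * c * v for c, v in zip(coeffs, variances))
    return m_h * weighted / eps ** 2


if __name__ == "__main__":
    # Hypothetical example: four equally weighted terms with sum_i c_i^2 = 1.
    coeffs = [0.5, 0.5, 0.5, 0.5]
    variances = [1.0] * 4          # assumed equal, worst-case variances
    n1 = single_term_budget(1.0, 0.5)
    n2 = uniform_shot_budget(coeffs, variances, 0.5)
    print(n1, n2, n2 / n1)         # the ratio equals m_h in this setting
```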

Another point worth mentioning is that assigning $N_{2}/m_{h}$ measurements to each of the Pauli terms is not the best choice. One can treat the allocation as an optimization problem over the fractions $p_{i}$ of the $N_{2}$ measurements assigned to each term:

	minimize:		$\displaystyle f\left(p_{i}\right)=\sum_{i}\frac{c_{i}^{2}\operatorname{Var}\left[P_{i}\right]_{\rho_{T}}}{N_{2}p_{i}}$
	subject to:		$\displaystyle\sum_{i}p_{i}=1\text{ and }p_{i}\geq 0,i=1,\ldots,m_{h}$		(10)

This problem can be easily solved using the Lagrange multiplier method. Under the assumption that the variances of the Pauli terms are at the same level, the best choice is $p_{i}=\left|c_{i}\right|/\sum_{j=1}^{m_{h}}\left|c_{j}\right|$ (which does not change the basic conclusion). Another choice is $p_{i}=c_{i}^{2}$. At first sight, this choice should be better than the uniform choice $p_{i}=1/m_{h}$, since the terms contributing more variance are assigned more measurements. However, the two choices actually have the same performance:

\sum_{i=1}^{m_{h}}\frac{c_{i}^{2}\operatorname{Var}\left[P_{i}\right]_{\rho_{T}}}{N_{2}c_{i}^{2}}=\frac{\sum_{i=1}^{m_{h}}\operatorname{Var}\left[P_{i}\right]_{\rho_{T}}}{N_{2}}\approx\sum_{i=1}^{m_{h}}\frac{m_{h}c_{i}^{2}\operatorname{Var}\left[P_{i}\right]_{\rho_{T}}}{N_{2}}

(11)
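The three allocations discussed above can be compared numerically. The following Python sketch evaluates the objective of Eq. 10 for the uniform choice, the choice $p_{i}=c_{i}^{2}$, and the choice $p_{i}\propto\left|c_{i}\right|$. The function names and the two-term example are assumptions made for illustration, and all coefficients are assumed to be nonzero so that every $p_{i}$ is positive.

```python
def estimator_variance(coeffs, variances, fractions, n_shots: int) -> float:
    """Objective of Eq. 10: sum_i c_i^2 Var[P_i] / (N_2 p_i)."""
    return sum(
        c * c * v / (n_shots * p)
        for c, v, p in zip(coeffs, variances, fractions)
    )


def allocations(coeffs):
    """Shot fractions p_i for the three strategies (coefficients must be nonzero)."""
    m_h = len(coeffs)
    abs_sum = sum(abs(c) for c in coeffs)
    sq_sum = sum(c * c for c in coeffs)
    return {
        "uniform": [1.0 / m_h] * m_h,
        "squared": [c * c / sq_sum for c in coeffs],
        "absolute": [abs(c) / abs_sum for c in coeffs],
    }


if __name__ == "__main__":
    coeffs = [0.8, 0.6]        # hypothetical coefficients with sum_i c_i^2 = 1
    variances = [1.0, 1.0]     # equal-variance assumption
    for name, p in allocations(coeffs).items():
        print(name, estimator_variance(coeffs, variances, p, 100))
```

In this example the uniform and squared allocations both give $m_{h}/N_{2}$, while the allocation proportional to $\left|c_{i}\right|$ gives $\left(\sum_{i}\left|c_{i}\right|\right)^{2}/N_{2}$, which is never larger by the Cauchy-Schwarz inequality.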

Appendix B The structure of parametrized circuit

The structure of parametrized circuits used when solving XXZ models is shown in Fig. B.1.

Appendix C The expressivity and the trainability of SH-VQA

According to Ref. [42], there is a general trade-off between the expressivity and the trainability of an ansatz. Specifically, higher expressivity leads to lower variance of the cost gradients. Based on this result, we can discuss the trainability of SH-VQA. The expressivity in Fig. 2 is the highest expressivity that SH-VQA can achieve. If we use an ansatz with the same setting as Fig. 2 to solve problems, the trainability will be rather poor. However, in real situations, for specific problems, there are numerous strategies to restrict the expressivity.

To make SH-VQA trainable, a Schrödinger circuit with good trainability, such as the Unitary Coupled Cluster (UCC) [15] or the Hamiltonian Variational Ansatz (HVA) [13], is a prerequisite. Under this condition, we can further restrict the size of the pool of Clifford circuits. Since each Clifford circuit maps the exploration subspace of $U$ to another subspace, we can regard the expressivity as proportional to the number of Clifford circuits in the pool in the worst case, where the subspaces mapped by different Clifford circuits are orthogonal. The number of all possible Clifford circuits is exponentially large, but we can restrict the pool to a small part of them using prior knowledge of the Hamiltonian, good Schrödinger ansatzes, and NISQ devices. An example is to choose Clifford circuits that transform the connectivity of NISQ circuits to be close to that of problem-inspired ansatzes with good trainability, such as UCC and HVA. If we can fix the Clifford circuit or choose it from a restricted set, the $t$-design order can hardly change. There are also other strategies, as introduced in Ref. [42]. For example, we can alternately optimize the parameters in the Schrödinger circuit and the parameters in the Heisenberg circuit. When one part is fixed, the expressivity depends only on the other part. Note that both $U$ and $T$ are circuits with poor expressivity; it is their combination that has high expressivity.
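The alternating strategy mentioned above can be sketched as a block-coordinate descent in which one group of parameters is frozen while the other is updated. The Python example below is only a schematic illustration: the quadratic cost, its gradients `toy_grad_s` and `toy_grad_h`, and all hyperparameters are hypothetical stand-ins and do not represent an actual SH-VQA energy or circuit.

```python
def alternating_descent(grad_s, grad_h, theta_s, theta_h,
                        rounds=50, inner_steps=20, lr=0.1):
    """Alternate gradient descent on two parameter blocks, freezing one at a time."""
    for _ in range(rounds):
        # Heisenberg-side parameter frozen: update only the Schrödinger-side one.
        for _ in range(inner_steps):
            theta_s -= lr * grad_s(theta_s, theta_h)
        # Schrödinger-side parameter frozen: update only the Heisenberg-side one.
        for _ in range(inner_steps):
            theta_h -= lr * grad_h(theta_s, theta_h)
    return theta_s, theta_h


# Hypothetical toy cost, NOT an SH-VQA energy:
# f(s, h) = (s - 1)^2 + (h + 2)^2 + 0.5 * s * h, minimized at s = 1.6, h = -2.4.
def toy_grad_s(s, h):
    return 2.0 * (s - 1.0) + 0.5 * h


def toy_grad_h(s, h):
    return 2.0 * (h + 2.0) + 0.5 * s


if __name__ == "__main__":
    print(alternating_descent(toy_grad_s, toy_grad_h, 0.0, 0.0))
```

In a real setting each gradient would be estimated from measured expectation values, and for non-convex costs the alternating scheme may converge to a local minimum only.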

The purpose of the above discussion is to show that we can find biased, problem-inspired, trainable SH-VQA ansatzes, which do not conflict with the expressivity enhancement shown in Fig. 2. A better understanding can be obtained from Fig. C.1.

Appendix D Circuit expressivity and hardware requirements

More detailed 12-qubit simulation results comparing the expressivity of SH-VQE circuits and VQE circuits are shown in Fig. D.1. From Fig. D.1, we observe that, by using the idea of SH-VQE, quantum hardware with a depth of 4 may have a potential close to that of depth-40 hardware running VQE. This significantly relaxes the requirement on quantum gate fidelities in practical experiments. We show such a reduction in Fig. D.2.

We further run numerical experiments to examine the scalability of our algorithm. We treat expressivity as a measure of the equivalence between a VQE circuit and an SH-VQE circuit. The results are shown in Fig. D.3, where we fix the SH-VQE Schrödinger circuit depth at 2 and 4 and show the equivalent VQE circuit depth as a function of the system size. From the figure, we can see that the larger the qubit number, the greater the equivalent VQE circuit depth, which gives SH-VQE a growing advantage as quantum devices become larger.