This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Quantum algorithm for time-dependent Hamiltonian simulation
by permutation expansion

Yi-Hsiang Chen Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, USA Department of Physics and Astronomy, and Center for Quantum Information Science & Technology,University of Southern California, Los Angeles, California 90089, USA    Amir Kalev Information Sciences Institute, University of Southern California, Arlington, VA 22203, USA    Itay Hen Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, USA Department of Physics and Astronomy, and Center for Quantum Information Science & Technology,University of Southern California, Los Angeles, California 90089, USA
Abstract

We present a quantum algorithm for the dynamical simulation of time-dependent Hamiltonians. Our method involves expanding the interaction-picture Hamiltonian as a sum of generalized permutations, which leads to an integral-free Dyson series of the time-evolution operator. Under this representation, we perform a quantum simulation for the time-evolution operator by means of the linear combination of unitaries technique. We optimize the time steps of the evolution based on the Hamiltonian’s dynamical characteristics, leading to a gate count that scales with an L1L^{1}-norm-like scaling with respect only to the norm of the interaction Hamiltonian, rather than that of the total Hamiltonian. We demonstrate that the cost of the algorithm is independent of the Hamiltonian’s frequencies, implying its advantage for systems with highly oscillating components, and for time-decaying systems the cost does not scale with the total evolution time asymptotically. In addition, our algorithm retains the near optimal log(1/ϵ)/loglog(1/ϵ)\log(1/\epsilon)/\log\log(1/\epsilon) scaling with simulation error ϵ\epsilon.

I Introduction

The problem of simulating quantum systems, whether it is to study their dynamics, or to infer their salient equilibrium properties, was the original motivation for quantum computers Feynman (1982) and remains one of their major potential applications Reiher et al. (2017); Babbush et al. (2018). Classical algorithms for this problem are known to be grossly inefficient. Nonetheless, a significant fraction of the world’s computing power today is spent on solving instances of this problem — a reflection on their importance Gioiosa (2017); Sterling et al. (2018); Lee (2014).

An important class of quantum simulations that is known to be particularly challenging, and is the focus of this work, is that of time-dependent quantum processes, which are at the heart of many important quantum phenomena. These include for example quantum control schemes Pang and Jordan (2017), transition states of chemical reactions Butler (1998) analog quantum computers such as quantum annealers Farhi et al. (2001) and the quantum approximate optimization algorithm Farhi et al. (2014). Devising state-of-the-art resource efficient quantum algorithms to simulate these types of processes on quantum circuits is therefore a very worthy cause: it will allow for the studying of said phenomena in a controllable and vastly more illuminating manner.

In the literature, a number of quantum algorithms designed to simulate the dynamics of time-dependent quantum many-body Hamiltonians already exist. However, most of them are variants of algorithms that suit time-independent Hamiltonians but lack optimizations for dynamical ones. For example, Hamiltonians based on the Lie-Trotter-Suzuki decomposition were developed in Refs. Wiebe et al. (2011); Poulin et al. (2011), where the complexity scales polynomially with error. More recent advances Berry et al. (2014, 2015) improve it to a logarithmic error scaling, which directly lead to applications in time-dependent Hamiltonian simulations Low and Wiebe (2019); Kieferová et al. (2019). A recent study by Berry et al. Berry et al. (2020) improves the Hamiltonian scaling to L1L^{1}-norm, by considering the dynamical properties of the time-dependent Hamiltonian. However, these mostly comprise of slicing the dynamics into a sequence of ‘quasi-static’ steps, each of which implementing a static quantum simulation module. In addition, all the above-mentioned algorithms assume a time-dependent oracle — a straightforward but not necessarily practical assumption that can obscure the true complexity of the simulation when physical models are considered.

The sub-optimality that characterizes existing quantum algorithms can be attributed mainly to the fact that the time-evolution operator for time-dependent Hamiltonians is a more intricate entity than its time-independent counterpart (this matter is discussed in more detail below): While in the time-independent case the Schrödinger equation can be formally integrated, the time-evolution unitary operator for time-dependent systems is given in terms a Dyson series Dyson (1949) — a perturbative expansion, wherein each summand is a multi-dimensional integral over a time-ordered product of the (usually interaction-picture) Hamiltonian at different points in time. These time-ordered integrals pose multiple algorithmic and implementation challenges.

In this paper, we provide a quantum algorithm for simulating a time-dependent Hamiltonian dynamics. This algorithm invokes a separation of the Hamiltonian H(t)H(t) into a sum of a static diagonal part H0H_{0} and a dynamical part V(t)V(t), i.e., H(t)=H0+V(t)H(t)=H_{0}+V(t), and switches to the interaction-picture with respect to H0H_{0}. The target evolution operator becomes a product of an interaction-picture unitary UI(t)U_{I}(t) followed by a diagonal unitary eiH0t{\rm e}^{-iH_{0}t} that can be simulated efficiently. The interaction Hamiltonian V(t)V(t) is expanded as a sum of generalized permutations, and the resulting Dyson series of the evolution operator UI(t)U_{I}(t) becomes an integral-free representation Kalev and Hen (2020) with the notion of divided differences, which is a well-studied quantity de Boor (2005); Davis (1975); Mccurdy (1980); Gupta et al. (2020a); McCurdy et al. (1984); Zivcovich (2019). The divided differences have an intuition of discretized derivatives and is closely related to polynomial interpolations de Boor (2005). We refer the reader to Appendix A for a short summary of the notion of the divided differences. Under this representation, we use the LCU method Berry et al. (2015) to simulate UI(t)U_{I}(t) with a truncated Dyson series. We find a partitioning scheme that determines the duration of the time steps along the simulation. Following this procedure, in general, each time interval has a different duration which is determined form the Hamiltonian’s dynamical characteristics and can lead to substantially fewer number of steps as compared to using identical-length simulation segments, typically used in quantum simulation algorithms. We analyze the implementation gate and qubit costs and discuss the circumstances under which our simulation algorithm provides improvements over the state-of-the-art. Specifically, our algorithm is independent of the oscillation frequencies of the Hamiltonian. This is in stark contrast to existing algorithms which have dependence on dH(t)/dt||d{H}(t)/dt||, which grows with oscillation rates. Another class of Hamiltonians for which our algorithm is preferred over others is those with exponential decays. We show that for these systems, our algorithms requires asymptotically a finite number of steps which does not scale with the evolution time, leading in turn to an exponential saving comparing to the linear scaling in existing approaches. Moreover, the cost with Hamiltonian norm only mainly depends on the interaction Hamiltonian V(t)V(t) and not the total Hamiltonian H(t)H(t) Berry et al. (2020). This also indicates an advantage of the algorithm when the time-dependent Hamiltonian is dominant by a static part.

The paper is organized as follows. In Sec. II, we review the permutation expansion method that leads to an integral-free representation for the Dyson series, as introduced in Ref. Kalev and Hen (2020). In Sec. III, we present in detail the simulation algorithm that combines the integral-free expression of the evolution operator with the LCU method, and analyze the circuit costs. We highlight the main advantages of our algorithm in Sec. III.4.3. In Sec. III.5, we address the cases when the exponential-sum expansion of the time-dependence is not exact and estimate the error that stems from a finite sum approximation. Finally, we give a brief summary for our methods and results in Sec. V.

II Permutation expansion method for time-dependent Hamiltonians

In this section, we briefly describe the integral-free Dyson series expression of the evolution operator, derived from a permutation expansion of the time-dependent Hamiltonian Kalev and Hen (2020). Without loss of generality Gupta et al. (2020b), we expand a general time-dependent Hamiltonian in terms of products of time-dependent diagonal matrices, Di(t)D_{i}(t), and permutation operators, PiP_{i}, i.e.,

H(t)=i=0MDi(t)Pi,H(t)=\displaystyle\sum_{i=0}^{M}D_{i}(t)P_{i}, (1)

where P0𝟙P_{0}\equiv\mathds{1}. This decomposition can be done efficiently as long as MM scales polynomially with logd\log d, where dd is the dimension of the Hamiltonian. We decompose each diagonal matrix into a finite sum of exponential functions, i.e.,

Di(t)=k=1Kiexp(Λi(k)t)Di(k),D_{i}(t)=\sum_{k=1}^{K_{i}}\exp\left(\Lambda^{(k)}_{i}t\right)D_{i}^{(k)}, (2)

where Λi(k)\Lambda^{(k)}_{i} and Di(k)D_{i}^{(k)} are complex diagonal matrices with diagonal elements being

λi,z(k)z|Λi(k)|z,\displaystyle\lambda^{(k)}_{i,z}\equiv\langle z|\Lambda^{(k)}_{i}|z\rangle, (3)
di,z(k)z|Di(k)|z,\displaystyle d_{i,z}^{(k)}\equiv\langle z|D_{i}^{(k)}|z\rangle, (4)

in some basis {|z}\{|z\rangle\} (the basis in which D0D_{0} is diagonal) and KiK_{i} indicates the number of terms in the exponential decomposition for Di(t)D_{i}(t). This can be done for many cases when the time dependencies are simple combinations of exponential terms. For simplicity we assume here that the KiK_{i}’s are finite, and address the most general time dependence in detail in Sec. III.5 and refer to various algorithms Beylkin and Monzón (2005, 2010); Braess and Hackbusch (2009); Wiscombe and Evans (1977); Norvidas (2010) for efficiently finding an exponential sum approximation of a function.

For a lighter notation, we set Ki=KK_{i}=K for all ii. We can evaluate the time-evolution operator U(t)U(t) corresponding to H(t)H(t) as

U(t)\displaystyle U(t) 𝒯exp[i0tH(t)𝑑t]\displaystyle\equiv\mathcal{T}\text{exp}\left[-i\int_{0}^{t}H(t^{\prime})dt^{\prime}\right]
=q=0(i)q0t𝑑τq0τ2𝑑τ1H(τq)H(τ1)\displaystyle=\sum_{q=0}^{\infty}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}H(\tau_{q})\cdots H(\tau_{1}) (5)
=q=0𝕚q𝕜q(i)q0t𝑑τq0τ2𝑑τ1exp(Λiq(kq)τq)Diq(kq)Piqexp(Λi1(k1)τ1)Di1(k1)Pi1,\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\Lambda^{(k_{q})}_{i_{q}}\tau_{q}\right)D_{i_{q}}^{(k_{q})}P_{i_{q}}\cdots\exp\left(\Lambda^{(k_{1})}_{i_{1}}\tau_{1}\right)D_{i_{1}}^{(k_{1})}P_{i_{1}}, (6)

where 𝕚q={iq,,i1}\mathbb{i}_{q}=\{i_{q},\cdots,i_{1}\} and 𝕜q={kq,,k1}\mathbb{k}_{q}=\{k_{q},\cdots,k_{1}\} are multi-indices. The action of U(t)U(t) on a basis vector |z|z\rangle is

U(t)|z\displaystyle U(t)|z\rangle =q=0𝕚q𝕜q(i)q0t𝑑τq0τ2𝑑τ1exp(Λiq(kq)τq)Diq(kq)Piqexp(Λi1(k1)τ1)Di1(k1)Pi1|z\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\Lambda^{(k_{q})}_{i_{q}}\tau_{q}\right)D_{i_{q}}^{(k_{q})}P_{i_{q}}\cdots\exp\left(\Lambda^{(k_{1})}_{i_{1}}\tau_{1}\right)D_{i_{1}}^{(k_{1})}P_{i_{1}}|z\rangle (7)
=q=0𝕚q𝕜q(i)q0t𝑑τq0τ2𝑑τ1exp(λiq,z𝕚q(kq)τq++λi1,z𝕚1(k1)τ1)diq,z𝕚q(kq)di1,z𝕚1(k1)PiqPi1|z\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}\tau_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}\tau_{1}\right)d_{i_{q},z_{\mathbb{i}_{q}}}^{(k_{q})}\cdots d_{i_{1},z_{\mathbb{i}_{1}}}^{(k_{1})}P_{i_{q}}\cdots P_{i_{1}}|z\rangle
=q=0𝕚q𝕜q(i)q0t𝑑τq0τ2𝑑τ1exp(λiq,z𝕚q(kq)τq++λi1,z𝕚1(k1)τ1)d𝕚q,z(𝕜q)P𝕚q|z,\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}\tau_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}\tau_{1}\right)d_{\mathbb{i}_{q},z}^{(\mathbb{k}_{q})}P_{\mathbb{i}_{q}}|z\rangle,

where |z𝕚jPijPi1|z|z_{\mathbb{i}_{j}}\rangle\equiv P_{i_{j}}\cdots P_{i_{1}}|z\rangle with jj ranging from 1 to qq, and λij,z𝕚j(kj)(dij,z𝕚j(kj))\lambda^{(k_{j})}_{i_{j},z_{\mathbb{i}_{j}}}\left(d_{i_{j},z_{\mathbb{i}_{j}}}^{(k_{j})}\right) is the z𝕚jz_{\mathbb{i}_{j}}th diagonal element of Λij(kj)(Dij(kj))\Lambda^{(k_{j})}_{i_{j}}\left(D^{(k_{j})}_{i_{j}}\right). P𝕚qP_{\mathbb{i}_{q}} is a shorthand of PiqPi1P_{i_{q}}\cdots P_{i_{1}}, and similarly d𝕚q,z(𝕜q)diq,z𝕚q(kq)di1,z𝕚1(k1)d_{\mathbb{i}_{q},z}^{(\mathbb{k}_{q})}\equiv d_{i_{q},z_{\mathbb{i}_{q}}}^{(k_{q})}\cdots d_{i_{1},z_{\mathbb{i}_{1}}}^{(k_{1})}. Figure 1 illustrates the accumulative actions of Di(k)PiD^{(k)}_{i}P_{i} on a basis vector |z|z\rangle.

Refer to caption
Figure 1: The actions of a sequence of generalized permutations. This figure gives a pictorial illustration on how the elements of the diagonal matrices are picked up when interleaving with permutations. In this example, we have q=2q=2.

To proceed, we use the following identity to simplify the expression in terms of divided differences. It is a variant of Hermite-Genocchi formula de Boor (2005) applying to the exponential function.

Identity 1.

For λ1,,λq\lambda_{1},\cdots,\lambda_{q}\in\mathbb{C},

01𝑑sq0s2𝑑s1e(λ1s1++λqsq)=e[x1,,xq,0],\int^{1}_{0}ds_{q}\cdots\int^{s_{2}}_{0}ds_{1}{\rm e}^{(\lambda_{1}s_{1}+\cdots+\lambda_{q}s_{q})}={\rm e}^{[x_{1},\cdots,x_{q},0]}, (8)

where xj=l=jqλlx_{j}=\sum_{l=j}^{q}\lambda_{l} and e[x1,,xq,0]{\rm e}^{[x_{1},\cdots,x_{q},0]} is the divided difference of the exponential function with inputs x1,,xq,0x_{1},\cdots,x_{q},0. (The case with q=1q=1 can be shown by explicit integration, and the identity follows by induction. For more details, see Ref. Kalev and Hen (2020).)

With this property, the multi-dimensional integration in the time-evolution operator can be simplified as

U(t)|z\displaystyle U(t)|z\rangle =q=0𝕚q𝕜q(i)q0t𝑑τq0τ2𝑑τ1exp(λiq,z𝕚q(kq)τq++λi1,z𝕚1(k1)τ1)d𝕚q,z(𝕜q)P𝕚q|z\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}\tau_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}\tau_{1}\right)d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle
=q=0𝕚q𝕜q(it)q01𝑑sq0s2𝑑s1exp[t(λiq,z𝕚q(kq)sq++λi1,z𝕚1(k1)s1)]d𝕚q,z(𝕜q)P𝕚q|z\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-it)^{q}\int^{1}_{0}ds_{q}\cdots\int^{s_{2}}_{0}ds_{1}\exp\left[t\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}s_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}s_{1}\right)\right]d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle
=q=0𝕚q𝕜q(i)qet[x1,x2,,xq,0]d𝕚q,z(𝕜q)P𝕚q|z,\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle, (9)

where xj=l=jqλil,z𝕚l(kl)x_{j}=\sum_{l=j}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}. The second equality uses the change of variable dτ=tdsd\tau=tds, and the last equality follows from Identity 1 and the identity of tqe[tx0,,txq]=et[x0,,xq]t^{q}{\rm e}^{[tx_{0},\cdots,tx_{q}]}={\rm e}^{t[x_{0},\cdots,x_{q}]}. By completing the basis, we get

U(t)=zU(t)|zz|=zq=0𝕚q𝕜q(i)qet[x1,x2,,xq,0]d𝕚q,z(𝕜q)P𝕚q|zz|.U(t)=\sum_{z}U(t)|z\rangle\langle z|=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle\langle z|. (10)

This is an integral-free expression for the unitary time-evolution operator of the time-dependent Hamiltonian H(t)H(t). We will later approximate the unitary by truncating the series at some order q=Qq=Q that scales as 𝒪(log(1/ϵ)loglog(1/ϵ))\mathcal{O}\left(\frac{\text{log}(1/\epsilon)}{\text{loglog}(1/\epsilon)}\right) Berry et al. (2015), where ϵ\epsilon is the required accuracy.

III Time-dependent Hamiltonian simulation algorithm

A time-dependent Hamiltonian H(t)H(t) can be expressed as a sum of two Hamiltonians—a time-independent H0H_{0} and a dynamical V(t)V(t), i.e.,

H(t)=H0+V(t).H(t)=H_{0}+V(t). (11)

In many practical models, H0H_{0} represents a static and simple Hamiltonian that is often diagonal in a known basis (which we will identify with the computational basis). Hence, hereafter, we assume that H0H_{0} is a diagonal operator with real diagonal elements. The V(t)V(t) component represents the nontrivial interactions between subsystems. Assume111For the most general cases, one can set H0=0H_{0}=0. H0H_{0} is diagonal in the computational basis {|z}\{|z\rangle\}. We switch to the interaction picture, i.e.,

ddt|ψ(t)=iH(t)|ψ(t)ddt|ψI(t)=iHI(t)|ψI(t),\frac{d}{dt}|\psi(t)\rangle=-iH(t)|\psi(t)\rangle\to\frac{d}{dt}|\psi_{I}(t)\rangle=-iH_{I}(t)|\psi_{I}(t)\rangle, (12)

where

|ψI(t)=eiH0t|ψ(t)andHI(t)=eiH0tV(t)eiH0t.|\psi_{I}(t)\rangle={\rm e}^{iH_{0}t}|\psi(t)\rangle\ \ \text{and}\ \ H_{I}(t)={\rm e}^{iH_{0}t}V(t){\rm e}^{-iH_{0}t}. (13)

The Schrödinger-picture unitary operator U(t)U(t), satisfying |ψ(t)=U(t)|ψ(0)|\psi(t)\rangle=U(t)|\psi(0)\rangle, is equivalent to a time-ordered matrix exponential followed by a diagonal unitary, i.e.,

U(t)=eiH0t𝒯exp[i0tHI(t)𝑑t]=eiH0t𝒯exp[i0teiH0tV(t)eiH0t𝑑t].U(t)={\rm e}^{-iH_{0}t}\mathcal{T}\exp\left[-i\int_{0}^{t}H_{I}(t^{\prime})dt^{\prime}\right]={\rm e}^{-iH_{0}t}\mathcal{T}\exp\left[-i\int_{0}^{t}{\rm e}^{iH_{0}t^{\prime}}V(t^{\prime}){\rm e}^{-iH_{0}t^{\prime}}dt^{\prime}\right]. (14)

Hence, the simulation of U(t)=eiH0tUI(t)U(t)={\rm e}^{-iH_{0}t}U_{I}(t) consists of two parts—a complicated UI(t)U_{I}(t) and a simple diagonal unitary eiH0t.{\rm e}^{-iH_{0}t}. The simulation of eiH0t{\rm e}^{-iH_{0}t} can be achieved with a gate cost that scales only linearly with the locality of H0H_{0}. When we write H0=γ=0LJγZγH_{0}=\sum^{L}_{\gamma=0}J_{\gamma}Z_{\gamma}, where each ZγZ_{\gamma} is some tensor product of (single-qubit) Pauli-ZZ operators acting on at most dd qubits, it can be shown that the gate cost scales as 𝒪(Ld)\mathcal{O}(Ld) Nielsen and Chuang (2011); Kalev and Hen (2021). Therefore, the main focus of our simulation is on UIU_{I}.

We next provide an overview of the simulation algorithm in Sec. III.1. In Sec. III.2, we incorporate the LCU framework with the permutation expansion method. Sec. III.3.2 provides the state preparation operation and Sec. III.4 evaluates the simulation cost for the whole procedure.

III.1 An overview of the algorithm

Our proposed simulation algorithm consists of a permutation expansion procedure for UIU_{I} and the LCU method for the quantum simulation. In Sec. III.2, we explain in detail the essential ingredients for merging these two approaches. Before delving into technical details, we provide an overview of the algorithm in this section.

Given a time-dependent Hamiltonian H(t)H(t), we first decompose H(t)H(t) into a sum of a static diagonal term H0H_{0} (if exists) and a dynamical term V(t)V(t). We switch to an interaction picture so that the target unitary evolution U(T)U(T) over a period TT becomes

U(T)𝒯exp[i0TH(t)𝑑t]=eiH0T𝒯exp[i0TeiH0tV(t)eiH0t𝑑t]eiH0TUI(T).U(T)\equiv\mathcal{T}\exp\left[-i\int_{0}^{T}H(t)dt\right]={\rm e}^{-iH_{0}T}\mathcal{T}\exp\left[-i\int_{0}^{T}{\rm e}^{iH_{0}t}V(t){\rm e}^{-iH_{0}t}dt\right]\equiv{\rm e}^{-iH_{0}T}U_{I}(T). (15)

Therefore, the simulation of U(T)U(T) is equivalent to applying UI(T)U_{I}(T) followed by eiH0T{\rm e}^{-iH_{0}T}. Since the diagonal unitary eiH0T{\rm e}^{-iH_{0}T} can be efficiently simulated, we focus on UI(T)U_{I}(T) hereafter.

Let us expand V(t)V(t) as a sum of permutations as

V(t)=i=0MDi(t)Pi,V(t)=\displaystyle\sum_{i=0}^{M}D_{i}(t)P_{i}, (16)

where PiP_{i} are permutations (P0𝟙P_{0}\equiv\mathds{1}) and Di(t)D_{i}(t) are some diagonal matrices that are expressed as exponential sums, i.e.,

Di(t)=k=1Kexp(Λi(k)t)Di(k).D_{i}(t)=\sum_{k=1}^{K}\exp\left(\Lambda^{(k)}_{i}t\right)D_{i}^{(k)}. (17)

Λi(k)\Lambda^{(k)}_{i} and Di(k)D_{i}^{(k)} are some complex diagonal matrices. Partition UI(T)U_{I}(T) into rr segments UI(T,tr1)UI(t1,0)U_{I}(T,t_{r-1})\cdots U_{I}(t_{1},0), whose respective durations Δtw\Delta t_{w}, w=0,,r1w=0,\ldots,r-1 are determined by the partitioning scheme given in Sec. III.2 and the time markers twt_{w}, are defined as tw=l=0w1Δtlt_{w}=\sum_{l=0}^{w-1}\Delta t_{l}. The total number of steps is denoted as rr. The evolution operator from twt_{w} to tw+Δtwt_{w}+\Delta t_{w} is expressed as

UI(tw+Δtw,tw)=q=0𝕚q𝕜qx=±(i)q(eΔtwλ1λ)q2q!Γ𝕚q(𝕜q)(tw)P𝕚qΦ𝕚q,x(𝕜q,w),U_{I}(t_{w}+\Delta t_{w},t_{w})=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=\pm}(-i)^{q}\frac{\left(\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\lambda}\right)^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})P_{\mathbb{i}_{q}}\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x}, (18)

where we denote

Γ𝕚q(𝕜q)(tw)=Diq(kq)maxetwλ(iq,kq)Di1(k1)maxetwλ(i1,k1),\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})=\left|\left|D_{i_{q}}^{(k_{q})}\right|\right|_{\max}{\rm e}^{t_{w}\lambda_{(i_{q},k_{q})}}\cdots\left|\left|D_{i_{1}}^{(k_{1})}\right|\right|_{\max}{\rm e}^{t_{w}\lambda_{(i_{1},k_{1})}}, (19)

where ||||max||\cdot||_{\max} the max norm, λ(i,k)=maxz(z|Λi(k)|z)\lambda_{(i,k)}=\max_{z}\Re(\langle z|\Lambda^{(k)}_{i}|z\rangle) the maximum real part of Λi(k)\Lambda^{(k)}_{i} and λ=maxi,k{λ(i,k)}\lambda=\max_{i,k}\{\lambda_{(i,k)}\}. Here, Φ𝕚q,±(𝕜q,w)\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},\pm} are some diagonal unitaries as derived later in Eq. (38) and each P𝕚qP_{\mathbb{i}_{q}} is a unique product of permutations. Note that the above evolution operators are given as a linear combination of unitaries (LCU). We provide a review for the LCU method in Appendix C. We set the truncation order QQ to be222An exact truncation order that guarantees the accuracy ϵ\epsilon is ln(2r/ϵ)W(ln(2r/ϵ)eln2)1\left\lceil{\frac{\text{ln}(2r/\epsilon)}{W\left(\frac{\text{ln}(2r/\epsilon)}{e\text{ln}2}\right)}-1}\right\rceil, where W()W(\cdot) is the Lambert W-function.

Q=𝒪(log(r/ϵ)loglog(r/ϵ)),Q=\mathcal{O}\left(\frac{\log(r/\epsilon)}{\log\log(r/\epsilon)}\right), (20)

where ϵ\epsilon is the overall simulation accuracy.

To implement the LCU routine for each UI(tw+Δtw,tw)U_{I}(t_{w}+\Delta t_{w},t_{w}), we require preparing a state

|ψ0=1sq=0Q𝕚q𝕜qx=0,1(eΔtwλ1λ)qΓ𝕚q(𝕜q)(tw)2q!|𝕚q|𝕜q|x,\displaystyle|\psi_{0}\rangle=\frac{1}{\sqrt{s}}\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}\sqrt{\frac{\left(\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\lambda}\right)^{q}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}{2q!}}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle, (21)

where |𝕚q|\mathbb{i}_{q}\rangle represents QQ quantum registers that each has dimension MM and |𝕜q|\mathbb{k}_{q}\rangle represents QQ quantum registers that each has dimension KK, and ss is the normalization factor. Following the same notation in Sec. C, let us denote the state preparation unitary as BB, i.e., B|02Q+1=|ψ0B|0\rangle^{\otimes 2Q+1}=|\psi_{0}\rangle (B is explicitly given in Sec. III.3.2). Let us denote VcV_{c} the control unitary such that

Vc|𝕚q|𝕜q|x|ψ=|𝕚q|𝕜q|x(i)qP𝕚qΦ𝕚q,x(𝕜q,w)|ψ.V_{c}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle|\psi\rangle=|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle(-i)^{q}P_{\mathbb{i}_{q}}\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x}|\psi\rangle. (22)

The Oblivious Amplitude Amplification (OAA) involves interleaving the operator W=(BI)Vc(BI)W=(B^{\dagger}\otimes I)V_{c}(B\otimes I) as

A=WRWRW,A=-WRW^{\dagger}RW, (23)

where RI2(|00|I)R\equiv I-2(|0\rangle\langle 0|\otimes I). For each piece of the unitary, we implement AA on the extended system |0(2Q+1)|ψ|0\rangle^{\otimes(2Q+1)}|\psi\rangle. By construction, we have

A|0(2Q+1)|ψ|0(2Q+1)UI(tw+Δtw,tw)|ψ=𝒪(ϵr).\left|\left|A|0\rangle^{\otimes(2Q+1)}|\psi\rangle-|0\rangle^{\otimes(2Q+1)}U_{I}(t_{w}+\Delta t_{w},t_{w})|\psi\rangle\right|\right|=\mathcal{O}\left(\frac{\epsilon}{r}\right). (24)

This means that applying AA effectively performs the unitary UI(tw+Δtw,tw)U_{I}(t_{w}+\Delta t_{w},t_{w}) on the main system |ψ|\psi\rangle, with error 𝒪(ϵ/r)\mathcal{O}(\epsilon/r). Combining rr pieces of the procedure, it effectively simulates UI(T)U_{I}(T) with overall error 𝒪(ϵ)\mathcal{O}(\epsilon), i.e.,

Ar1A1A0|0(2Q+1)|ψ|0(2Q+1)UI(T)|ψ=𝒪(ϵ),\left|\left|A_{r-1}\cdots A_{1}A_{0}|0\rangle^{\otimes(2Q+1)}|\psi\rangle-|0\rangle^{\otimes(2Q+1)}U_{I}(T)|\psi\rangle\right|\right|=\mathcal{O}\left(\epsilon\right), (25)

where AwA_{w} are the OAA operators for the corresponding piece of evolution. This implies that applying the sequence of AAs followed by the circuit for eiH0T{\rm e}^{-iH_{0}T} can approach the action of U(T)U(T) to an arbitrary accuracy.

III.2 Permutation expansion for UI(t)U_{I}(t)

In this section, we give a thorough introduction of the permutation expansion in the Dyson series and the conditions arisen from implementing the LCU method. We focus on addressing the interaction-picture unitary UI(t)U_{I}(t), i.e., the time-ordered operator in Eq. (14). Using the expansions introduced in Eqs. (16) and (17), we get

UI(t)𝒯exp[i0teiH0tV(t)eiH0t𝑑t]\displaystyle U_{I}(t)\equiv\mathcal{T}\exp\left[-i\int_{0}^{t}{\rm e}^{iH_{0}t^{\prime}}V(t^{\prime}){\rm e}^{-iH_{0}t^{\prime}}dt^{\prime}\right]
=q=0(i)q0t𝑑τq0τ2𝑑τ1eiH0τqV(τq)eiH0τqeiH0τ1V(τ1)eiH0τ1\displaystyle=\sum_{q=0}^{\infty}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}{\rm e}^{iH_{0}\tau_{q}}V(\tau_{q}){\rm e}^{-iH_{0}\tau_{q}}\cdots{\rm e}^{iH_{0}\tau_{1}}V(\tau_{1}){\rm e}^{-iH_{0}\tau_{1}}
=q=0𝕚q𝕜q(i)q0t𝑑τq0τ2𝑑τ1eiH0τqeΛiq(kq)τqDiq(kq)PiqeiH0τqeiH0τ1eΛi1(k1)τ1Di1(k1)Pi1eiH0τ1,\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}{\rm e}^{iH_{0}\tau_{q}}{\rm e}^{\Lambda^{(k_{q})}_{i_{q}}\tau_{q}}D_{i_{q}}^{(k_{q})}P_{i_{q}}{\rm e}^{-iH_{0}\tau_{q}}\cdots{\rm e}^{iH_{0}\tau_{1}}{\rm e}^{\Lambda^{(k_{1})}_{i_{1}}\tau_{1}}D_{i_{1}}^{(k_{1})}P_{i_{1}}{\rm e}^{-iH_{0}\tau_{1}}, (26)

We denote the basis in which H0H_{0} is diagonal by {|z}\{|z\rangle\} and its diagonal elements by Ez=z|H0|zE_{z}=\langle z|H_{0}|z\rangle. The action of UI(t)U_{I}(t) on a basis vector |z|z\rangle becomes

UI(t)|z=q=0𝕚q𝕜q(i)q0t𝑑τq0τ2𝑑τ1\displaystyle U_{I}(t)|z\rangle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1} (27)
×exp[(iEz𝕚qiEz𝕚q1+λiq,z𝕚q(kq))τq++(iEz𝕚1iEz+λi1,z𝕚1(k1))τ1]d𝕚q,z(𝕜q)P𝕚q|z,\displaystyle\times\exp\left[\left(iE_{z_{\mathbb{i}_{q}}}-iE_{z_{\mathbb{i}_{q-1}}}+\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}\right)\tau_{q}+\cdots+\left(iE_{z_{\mathbb{i}_{1}}}-iE_{z}+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}\right)\tau_{1}\right]d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle,

where Ez𝕚jE_{z_{\mathbb{i}_{j}}} is the z𝕚jz_{\mathbb{i}_{j}}th diagonal element of H0H_{0}, i.e., Ez𝕚j=z𝕚j|H0|z𝕚jE_{z_{\mathbb{i}_{j}}}=\langle z_{\mathbb{i}_{j}}|H_{0}|z_{\mathbb{i}_{j}}\rangle, and |z𝕚j=P𝕚j|z|z_{\mathbb{i}_{j}}\rangle=P_{\mathbb{i}_{j}}|z\rangle with P𝕚j=PijPi1P_{\mathbb{i}_{j}}=P_{i_{j}}\cdots P_{i_{1}}. By Identity 1, this can be further simplified as

UI(t)|z=q=0𝕚q𝕜q(i)qet[x1,,xq,0]d𝕚q,z(𝕜q)P𝕚q|z,\displaystyle U_{I}(t)|z\rangle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t[x_{1},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle, (28)

where

xj=i(Ez𝕚qEz𝕚j1)+l=jqλil,z𝕚l(kl).x_{j}=i\left(E_{z_{\mathbb{i}_{q}}}-E_{z_{\mathbb{i}_{j-1}}}\right)+\sum_{l=j}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}. (29)

III.3 The LCU routine

To implement the LCU method for a quantum simulation of UI(T)U_{I}(T), we first decompose the overall simulation duration TT into rr pieces in sequence, i.e.,

UI(T)=UI(T,tr1)UI(tr1,tr2)UI(t1,0)=w=0r1UI(tw+Δtw,tw),U_{I}(T)=U_{I}(T,t_{r-1})U_{I}(t_{r-1},t_{r-2})\cdots U_{I}(t_{1},0)=\prod^{r-1}_{w=0}U_{I}(t_{w}+\Delta t_{w},t_{w}), (30)

where the operators in the product of the last equation are understood to be ordered, tw+1=tw+Δtwt_{w+1}=t_{w}+\Delta t_{w} and t00t_{0}\equiv 0 and trTt_{r}\equiv T. The number of steps, rr, and the step size, Δtw\Delta t_{w}, are to be determined. When acting on a computational basis state, each piece in the decomposition can be written as

UI(tw+Δtw,tw)|z=𝒯exp[itwtw+ΔtwHI(t)𝑑t]|z\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle=\mathcal{T}\text{exp}\left[-i\int_{t_{w}}^{t_{w}+\Delta t_{w}}H_{I}(t^{\prime})dt^{\prime}\right]|z\rangle
=q=0𝕚q𝕜q(i)qtwtw+Δtw𝑑τqtwτ2𝑑τ1exp[l=1q(iEz𝕚liEz𝕚l1+λil,z𝕚l(kl))τl]d𝕚q,z(𝕜q)P𝕚q|z,\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t_{w}+\Delta t_{w}}_{t_{w}}d\tau_{q}\cdots\int^{\tau_{2}}_{t_{w}}d\tau_{1}\exp\Bigg{[}\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\tau_{l}\Bigg{]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle,
=q=0𝕚q𝕜q(i)qexp[twl=1q(iEz𝕚liEz𝕚l1+λil,z𝕚l(kl))]\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\exp\left[t_{w}\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\right]
×0Δtwdτq0τ2dτ1exp[l=1q(iEz𝕚liEz𝕚l1+λil,z𝕚l(kl))τl]d𝕚q,z(𝕜q)P𝕚q|z\displaystyle\times\int^{\Delta t_{w}}_{0}d\tau^{\prime}_{q}\cdots\int^{\tau^{\prime}_{2}}_{0}d\tau^{\prime}_{1}\exp\Bigg{[}\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\tau^{\prime}_{l}\Bigg{]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle
=q=0𝕚q𝕜q(i)qexp[twl=1q(iEz𝕚liEz𝕚l1+λil,z𝕚l(kl))]eΔtw[x1,x2,,xq,0]d𝕚q,z(𝕜q)P𝕚q|z\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\exp\left[t_{w}\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\right]{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle
=q=0𝕚q𝕜q(i)qeitw(Ez𝕚0Ez𝕚q)etwl=1qλil,z𝕚l(kl)eΔtw[x1,x2,,xq,0]d𝕚q,z(𝕜q)P𝕚q|z,\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{-it_{w}(E_{z_{\mathbb{i}_{0}}}-E_{z_{\mathbb{i}_{q}}})}{\rm e}^{t_{w}\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle, (31)

which has the same form as Eq. (28) except that the integration intervals are shifted (with Ez𝕚0EzE_{z_{\mathbb{i}_{0}}}\equiv E_{z}). We can denote

d𝕚q,z(𝕜q)(tw)=d𝕚q,z(𝕜q)etwl=1qλil,z𝕚l(kl),d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})=d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}{\rm e}^{t_{w}\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}},

which leads to

UI(tw+Δtw,tw)|z=q=0𝕚q𝕜q(i)qeitw(Ez𝕚0Ez𝕚q)eΔtw[x1,x2,,xq,0]d𝕚q,z(𝕜q)(tw)P𝕚q|z,\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{-it_{w}(E_{z_{\mathbb{i}_{0}}}-E_{z_{\mathbb{i}_{q}}})}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})P_{\mathbb{i}_{q}}|z\rangle, (32)

To formulate the above expression in terms of a linear combination of unitaries, we need to evaluate the norms of eΔtw[x1,x2,,xq,0]{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]} and d𝕚q,z(𝕜q)(tw)d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w}). The norm of d𝕚q,z(𝕜q)(tw)d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w}) is bounded by

|d𝕚q,z(𝕜q)(tw)|Diq(kq)maxetwλ(iq,kq)Di1(k1)maxetwλ(i1,k1)=Γ𝕚q(𝕜q)(tw).\left|d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})\right|\leq||D^{(k_{q})}_{i_{q}}||_{\max}{\rm e}^{t_{w}\lambda_{(i_{q},k_{q})}}\cdots||D^{(k_{1})}_{i_{1}}||_{\max}\,{\rm e}^{t_{w}\lambda_{(i_{1},k_{1})}}=\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})\,. (33)

The norm of the eΔtw[x1,x2,,xq,0]{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]} can be bounded by using the following identity.

Identity 2.

For any q+1q+1 complex values x0,,xqx_{0},\cdots,x_{q}\in\mathbb{C},

|e[x0,,xq]|e[(x0),,(xq)]=eξq!,\left|{\rm e}^{[x_{0},\cdots,x_{q}]}\right|\leq{\rm e}^{[\Re(x_{0}),\cdots,\Re(x_{q})]}=\frac{{\rm e}^{\xi}}{q!}, (34)

where ()\Re(\cdot) denotes the real part of an input and ξ[min{(x0),,(xq)},max{(x0),,(xq)}]\xi\in\left[\min\{\Re(x_{0}),\cdots,\Re(x_{q})\},\max\{\Re(x_{0}),\cdots,\Re(x_{q})\}\right].

The proof can be found in Appendix A. From Identity 2, we show in Appendix B that

|eΔtw[x1,x2,,xq,0]|eΔtw[qλ,(q1)λ,,λ,0]=1q!(eΔtwλ1λ)qΔt~wqq!,\left|{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}\right|\leq{\rm e}^{\Delta t_{w}[q\lambda,(q-1)\lambda,\ldots,\lambda,0]}=\frac{1}{q!}\left(\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\lambda}\right)^{q}\equiv\frac{\widetilde{\Delta t}_{w}^{q}}{q!}, (35)

where we denoted the quantity

Δt~weΔtwλ1λ.\widetilde{\Delta t}_{w}\equiv\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\lambda}. (36)

With these bounds, the factors in the expansion form in Eq. (32) can be written as

eΔtw[x1,x2,,xq,0]d𝕚q,z(𝕜q)(tw)\displaystyle{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w}) =Δt~wqq!Γ𝕚q(𝕜q)(tw)\displaystyle=\frac{{\widetilde{\Delta t}_{w}}^{q}}{q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w}) (37)
×(eΔtw[x1,x2,,xq,0]Δt~wq/q!eitw(Ez𝕚0Ez𝕚q)diq,z(kq)di1,z(k1)etwl=1qλil,z𝕚l(kl)Γ𝕚q(𝕜q)(tw))\displaystyle\times\left(\frac{{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}}{\widetilde{\Delta t}_{w}^{q}/q!}\frac{{\rm e}^{-it_{w}(E_{z_{\mathbb{i}_{0}}}-E_{z_{\mathbb{i}_{q}}})}d^{(k_{q})}_{i_{q},z}\cdots d^{(k_{1})}_{i_{1},z}\,{\rm e}^{t_{w}\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}}{\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}\right)
=Δt~wqq!Γ𝕚q(𝕜q)(tw)cos[ϕ𝕚q,z(𝕜q)]eiθ𝕚q,z(𝕜q)=Δt~wq2q!Γ𝕚q(𝕜q)(tw)(eiϕ𝕚q,z(𝕜q)+iθ𝕚q,z(𝕜q)+eiϕ𝕚q,z(𝕜q)+iθ𝕚q,z(𝕜q)),\displaystyle=\frac{\widetilde{\Delta t}_{w}^{q}}{q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})\cos\left[\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}\right]{\rm e}^{i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}=\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})\left({\rm e}^{i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}+{\rm e}^{-i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}\right),

where

ϕ𝕚q,z(𝕜q)=cos1[|eΔtw[x1,x2,,xq,0]d𝕚q,z(𝕜q)(tw)Δt~wqq!Γ𝕚q(𝕜q)(tw)|]\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}=\cos^{-1}\left[\left|\frac{{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})}{\frac{\widetilde{\Delta t}_{w}^{q}}{q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}\right|\right]

and

θ𝕚q,z(𝕜q)=arg[eΔtw[x1,x2,,xq,0]d𝕚q,z(𝕜q)(tw)Δt~wqq!Γ𝕚q(𝕜q)(tw)].\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}=\arg{\left[\frac{{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})}{\frac{\widetilde{\Delta t}_{w}^{q}}{q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}\right]}.

The evolution operator from twt_{w} to tw+Δtwt_{w}+\Delta t_{w} becomes

UI(tw+Δtw,tw)\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w}) =zUI(tw+Δtw,tw)|zz|\displaystyle=\sum_{z}U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle\langle z|
=zq=0𝕚q𝕜q(i)qΔt~wq2q!Γ𝕚q(𝕜q)(tw)(eiϕ𝕚q,z(𝕜q)+iθ𝕚q,z(𝕜q)+eiϕ𝕚q,z(𝕜q)+iθ𝕚q,z(𝕜q))P𝕚q|zz|\displaystyle=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})\left({\rm e}^{i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}+{\rm e}^{-i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}\right)P_{\mathbb{i}_{q}}|z\rangle\langle z|
=q=0𝕚q𝕜qx=±(i)qΔt~wq2q!Γ𝕚q(𝕜q)(tw)P𝕚qΦ𝕚q,x(𝕜q,w),\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=\pm}(-i)^{q}\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})P_{\mathbb{i}_{q}}\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x}, (38)

where Φ𝕚q,±(𝕜q,w)\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},\pm} are diagonal unitaries with diagonal elements being ei(±ϕ𝕚q,z(𝕜q)+θ𝕚q,z(𝕜q)){\rm e}^{i\left(\pm\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}\right)}.

To implement the LCU method for simulating UI(tw+Δtw,tw)U_{I}(t_{w}+\Delta t_{w},t_{w}), we require a preparation of the state

|ψ0\displaystyle|\psi_{0}\rangle =1sq=0Q𝕚q𝕜qx=0,1Δt~wq2q!Γ𝕚q(𝕜q)(tw)|i1|iq|0(Qq)|k1|kq|0(Qq)|x\displaystyle=\frac{1}{\sqrt{s}}\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}\sqrt{\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}|i_{1}\rangle\cdots|i_{q}\rangle\otimes|0\rangle^{\otimes(Q-q)}|k_{1}\rangle\cdots|k_{q}\rangle\otimes|0\rangle^{\otimes(Q-q)}|x\rangle
1sq=0Q𝕚q𝕜qx=0,1Δt~wq2q!Γ𝕚q(𝕜q)(tw)|𝕚q|𝕜q|x,\displaystyle\equiv\frac{1}{\sqrt{s}}\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}\sqrt{\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle, (39)

where |𝕚q|\mathbb{i}_{q}\rangle represents QQ quantum registers that each has dimension MM and |𝕜q|\mathbb{k}_{q}\rangle represents QQ quantum registers that each has dimension KK. The normalization constant is

s=q=0Q𝕚q𝕜qx=0,1Δt~wq2q!Γ𝕚q(𝕜q)(tw)q=0Q(Γ(tw)Δt~w)qq!,s=\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})\equiv\sum_{q=0}^{Q}\frac{(\Gamma(t_{w})\widetilde{\Delta t}_{w})^{q}}{q!}, (40)

where we define the Γ(tw)\Gamma(t_{w}) as

Γ(tw)i=0Mk=1KDi(k)maxetwλ(i,k),\Gamma(t_{w})\equiv\sum_{i=0}^{M}\sum_{k=1}^{K}\left|\left|D^{(k)}_{i}\right|\right|_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}}\,, (41)

and we note that Γ(tw)\Gamma(t_{w}) is an upper bound on the max-norm of the interaction Hamiltonian at time twt_{w}, Γ(tw)V(tw)max\Gamma(t_{w})\geq\|V(t_{w})\|_{\max}. The quantity Γ(tw)\Gamma(t_{w}) is related to the energy strength in a typical LCU setup Berry et al. (2015). In Appendix D, we provide an alternative way that uses a larger bound Γ=MKmaxk,iDi(k)max\Gamma=MK\max_{\forall k,i}||D^{(k)}_{i}||_{\max}, which leads to an exponential saving for the state preparation. We proceed with Γ(tw)\Gamma(t_{w}) hereafter.

The OAA step in the LCU method requires s2s\approx 2. This leads to

Γ(tw)Δt~w=Γ(tw)eΔtwλ1λ=ln2,\Gamma(t_{w})\widetilde{\Delta t}_{w}=\Gamma(t_{w})\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\lambda}=\text{ln}2, (42)

and Eq. (40) becomes a truncated Taylor expansion of 2 up to order QQ, i.e., 2q=0Q(ln2)qq!2\approx\sum_{q=0}^{Q}\frac{(\text{ln}2)^{q}}{q!}. If we require |s2|ϵ/r|s-2|\leq\epsilon/r, where rr is the total number of steps and ϵ\epsilon is some positive number, then the simulation error for each UI(tw+Δtw,tw)U_{I}(t_{w}+\Delta t_{w},t_{w}) is also within ϵ/r\epsilon/r. The required truncation order with this accuracy scales as

Q=𝒪(log(r/ϵ)loglog(r/ϵ)).Q=\mathcal{O}\left(\frac{\log(r/\epsilon)}{\log\log(r/\epsilon)}\right). (43)

III.3.1 Time partitioning and number of time steps

The condition in Eq. (42) imposes a constraint on the next step size Δtw\Delta t_{w} given the current time twt_{w},

Δtw=1λln(1+λΓ(tw)ln2).\Delta t_{w}=\frac{1}{\lambda}\ln\left(1+\frac{\lambda}{\Gamma(t_{w})}\ln 2\right)\,. (44)

Remembering that Γ(tw)\Gamma(t_{w}) is a function of tw=l=0w1Δtlt_{w}=\sum_{l=0}^{w-1}\Delta t_{l}, this condition determines the schedule, as every Δtw\Delta t_{w} is determined by the preceding time steps.

Special care should be given when setting the last time step, as Δtw\Delta t_{w} can become too large that exceeds the total desired evolution time TT. Whenever tw+1t_{w+1} is found to be greater than TT (or if the argument inside the ln()\ln(\cdot) is found to be negative), one should replace the bound Γ(tw)\Gamma(t_{w}) with a larger bound Γ~(tw)=λln2/(eλΔtw1)\tilde{\Gamma}(t_{w})=\lambda\ln 2/({\rm e}^{\lambda\Delta t_{w}}-1) and set the final step Δtw=Ttw\Delta t_{w}=T-t_{w}.

Let us now examine the dependence of Δtw\Delta t_{w} on Γ(tw)\Gamma(t_{w}) in order to determine a bound on the number of time steps (equivalently, number of repetitions) rr required for the execution of the entire time evolution. We distinguish between three cases. (i) When λ=0\lambda=0, we have ΔtwΓ(tw)=ln2\Delta t_{w}\Gamma(t_{w})=\ln 2, similar to the time-independent case though we note that a vanishing maximal λ\lambda could imply time-dependent oscillations as well. This can be seen by taking the λ0\lambda\to 0 limit of Eq. (44). (ii) In the case where λ<0\lambda<0, i.e., a system with a decaying Γ(tw)\Gamma(t_{w}), we have ΔtwΓ(tw)ln2\Delta t_{w}\Gamma(t_{w})\geq\ln 2, i.e., the time steps are longer than ln2/Γ(tw)\ln 2/\Gamma(t_{w}). Furthermore, the total number of steps rr is finite even for an arbitrarily large evolution time TT. Note that since Γ(tw)\Gamma(t_{w}) approaches zero asymptotically, for a large enough time twt_{w*}, we have Γ(tw)<|λ|ln2\Gamma(t_{w*})<|\lambda|\ln 2, i.e., the argument inside the logarithm above becomes negative. This indicates it reaches the final step, i.e., the bound should be modified as Γ~(tw)=λln2/(eλΔtw1)\tilde{\Gamma}(t_{w*})=\lambda\ln 2/({\rm e}^{\lambda\Delta t_{w*}}-1) and Δtw=Ttw\Delta t_{w*}=T-t_{w*} becomes the final step. (iii) In the case where λ>0\lambda>0 (an amplified Γ(tw)\Gamma(t_{w})), we have Γ(tw)λ\Gamma(t_{w})\gg\lambda at large simulation times, twt_{w}. From Eq. (44), we have Δtwln2/Γ(tw)\Delta t_{w}\to\ln 2/\Gamma(t_{w}) in this limit.

We see that (for large enough simulation times) the time step Δtw\Delta t_{w} is inversely proportional to Γ(tw)\Gamma(t_{w}) which upper-bounds the max-norm of the interaction Hamiltonian at time twt_{w}. Therefore, we have w=0r1Γ(tw)Δtwrln2\sum^{r-1}_{w=0}\Gamma(t_{w})\Delta t_{w}\gtrsim r\ln 2, which implies rw=0r1Γ(tw)Δtw/ln2r\lesssim\sum^{r-1}_{w=0}\Gamma(t_{w})\Delta t_{w}/\ln 2.

It would be instructive to compare the above scaling with that of Ref. Berry et al. (2020) in which the simulation algorithm is said to have an L1L^{1}-norm scaling, i.e., an algorithm cost scaling linearly with ot𝑑τHmax(τ)\int_{o}^{t}d\tau H_{\max}(\tau) up to logarithmic factors. Under a similar intuition, our algorithm has a discretized L1L^{1}-norm-like scaling with w=0r1Γ(tw)Δtw\sum^{r-1}_{w=0}\Gamma(t_{w})\Delta t_{w}. However in our case, Γ(tw)\Gamma(t_{w}) is related to the norm of the interaction Hamiltonian.

III.3.2 State preparation

In this subsection, we provide a procedure to prepare the state |ψ0|\psi_{0}\rangle given in Eq. (39). First, we initialize a state |0Q|0Q|0|0\rangle^{\otimes Q}|0\rangle^{\otimes Q}|0\rangle, where each of the first QQ registers has dimension MM (responsible for |𝕚q|\mathbb{i}_{q}\rangle part), each of the later QQ registers has dimension KK (responsible for |𝕜q|\mathbb{k}_{q}\rangle part), and the last register is a qubit (for the cosine decomposition). For simplicity, we can perform a Hadamard gate on the last qubit and then omit its dependence for the following discussion. The next step is to create a state in following the form,

1sq=0Qsq|1q|0(Qq)|1q|0(Qq),\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{s_{q}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}, (45)

where sq(Γ(tw)Δt~w)q/q!s_{q}\equiv\left(\Gamma(t_{w})\widetilde{\Delta t}_{w}\right)^{q}/q!. For each |1|1\rangle from the first QQ registers (the |𝕚q|\mathbb{i}_{q}\rangle part) and the corresponding |1|1\rangle in the later QQ registers (the |𝕜q|\mathbb{k}_{q}\rangle part), we make

|1|1i=0Mk=1KDi(k)maxetwλ(i,k)Γ(tw)|i|k.|1\rangle|1\rangle\to\sum_{i=0}^{M}\sum_{k=1}^{K}\sqrt{\frac{||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}}}{\Gamma(t_{w})}}|i\rangle|k\rangle. (46)

Then Eq. (45) becomes

1sq=0Qsq𝕚q𝕜qΓ𝕚q(𝕜q)(tw)(Γ(tw))q|𝕚q|𝕜q=1sq=0Q𝕚q𝕜q(Δt~w)qq!Γ𝕚q(𝕜q)(tw)|𝕚q|𝕜q,\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{s_{q}}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sqrt{\frac{\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}{(\Gamma(t_{w}))^{q}}}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle=\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sqrt{\frac{(\widetilde{\Delta t}_{w})^{q}}{q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle, (47)

which is the required |ψ0|\psi_{0}\rangle in Eq. (39), when combined with |x|x\rangle.

Next, we provide a process that produces the state in Eq. (45). First, we perform a rotation that takes the first register in the |𝕚q|\mathbb{i}_{q}\rangle part to

|01s(|0+q=1Qsq|1),|0\rangle\to\frac{1}{\sqrt{s}}\left(|0\rangle+\sqrt{\displaystyle\sum_{q=1}^{Q}s_{q}}|1\rangle\right), (48)

and perform a control gate from the first register to the second (both in the |𝕚q|\mathbb{i}_{q}\rangle part) such that

1s(|0+q=1Qsq|1)|0\displaystyle\frac{1}{\sqrt{s}}\left(|0\rangle+\sqrt{\displaystyle\sum_{q=1}^{Q}s_{q}}|1\rangle\right)|0\rangle
1s[|00+q=1Qsq|11q=1Qsq(s1|0+q=2Qsq|1)]\displaystyle\to\frac{1}{\sqrt{s}}\Bigg{[}|00\rangle+\sqrt{\sum_{q=1}^{Q}s_{q}}|1\rangle\frac{1}{\sqrt{\displaystyle\sum_{q=1}^{Q}s_{q}}}\Bigg{(}\sqrt{s_{1}}|0\rangle+\sqrt{\displaystyle\sum_{q=2}^{Q}s_{q}}|1\rangle\Bigg{)}\Bigg{]}
=1s(|00+s1|10+q=2Qsq|11).\displaystyle=\frac{1}{\sqrt{s}}\left(|00\rangle+\sqrt{s_{1}}|10\rangle+\sqrt{\displaystyle\sum_{q=2}^{Q}s_{q}}|11\rangle\right). (49)

Continuing this procedure for the rest of the registers in the |𝕚q|\mathbb{i}_{q}\rangle part, the state becomes

|0Q1sq=0Qsq|1q|0(Qq).\displaystyle|0\rangle^{\otimes Q}\to\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{s_{q}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}. (50)

At this step, we perform CNOT operations333Strictly speaking, they are not standard CNOTs but higher-dimensional operations that act like a CNOT on the first two levels. from the first QQ registers (|𝕚q|\mathbb{i}_{q}\rangle part) to the last QQ registers (|𝕜q|\mathbb{k}_{q}\rangle part) correspondingly, e.g., perform a CNOT from the first register in the |𝕚q|\mathbb{i}_{q}\rangle part to the first register in the |𝕜q|\mathbb{k}_{q}\rangle part, and so on and so forth. Finally, we have

1sq=0Qsq|1q|0(Qq)|0Q1sq=0Qsq|1q|0(Qq)|1q|0(Qq),\displaystyle\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{s_{q}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}|0\rangle^{\otimes Q}\to\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{s_{q}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}, (51)

which gives Eq. (45) as required. The estimated gate cost for the preparation of |ψ0|\psi_{0}\rangle is 𝒪(QMK)\mathcal{O}(QMK). More detail regarding the cost is provided in Sec. III.4.1.

III.3.3 Implementation of the controlled unitaries

The second ingredient of the LCU routine is the construction of the controlled operation

Vc|𝕚q|𝕜q|x|ψ=|𝕚q|𝕜q|x(i)qP𝕚qΦ𝕚q,x(𝕜q,w)|ψ.V_{c}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle|\psi\rangle=|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle(-i)^{q}P_{\mathbb{i}_{q}}\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x}|\psi\rangle. (52)

Taking an approach similar to that taken in Ref. Kalev and Hen (2021), we first note that Eq. (52) indicates that VcV_{c} can be carried out in two steps: a controlled-phase operation (VcΦV_{c\Phi}) followed by a controlled-permutation operation (VcPV_{cP}).

The controlled-phase operation VcΦV_{c\Phi} requires a somewhat intricate calculation of non-trivial phases. We therefore carry out the required algebra with the help of additional ancillary registers and then ‘push’ the results into phases. The latter step is done by employing the unitary

Uph|φ=eiφ|φ,\displaystyle U_{\text{ph}}|\varphi\rangle={\rm e}^{-i\varphi}|\varphi\rangle\,, (53)

whose implementation cost depends only on the precision with which we specify φ\varphi and is independent of Hamiltonian parameters Nielsen and Chuang (2011) (see Ref. Kalev and Hen (2021) for a complete derivation). With the help of the (controlled) unitary transformation

Vχϕ|𝕚q|𝕜q|x|z|0=|𝕚q|𝕜q|x|z|χ𝐢q(z)+(1)kϕ𝐢q(z),V_{\chi\phi}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle|z\rangle|0\rangle=|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle|z\rangle|\chi_{{\bf i}_{q}}^{(z)}+(-1)^{k}\phi_{{\bf i}_{q}}^{(z)}\rangle\,, (54)

we can write VcΦ=Vχϕ(𝟙Uph)VχϕV_{c{\Phi}}=V_{\chi\phi}^{\dagger}(\mathds{1}\otimes U_{\text{ph}})V_{\chi\phi}, so that

VcΦ|𝕚q|𝕜q|x|z=|𝕚q|𝕜q|xΦ𝐢q(k)|z.V_{c{\Phi}}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle|z\rangle=|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle\Phi_{{\bf i}_{q}}^{(k)}|z\rangle\,. (55)

Note that VχϕV_{\chi\phi} sends computational basis states to computational basis states. We provide an explicit construction of VχϕV_{\chi\phi} in Ref. Kalev and Hen (2021). We find that its gate cost is 𝒪(QM(kod+logM)+QMK(CD+CΔH0+CΛ))\mathcal{O}(QM(k_{od}+\log M)+QMK(C_{D}+C_{\Delta H_{0}}+C_{\Lambda})) and qubit cost is 𝒪(Qlog(MK)).\mathcal{O}(Q\log(MK)). Addiitonal details are provided in Sec. III.4.1.

The construction of VcPV_{cP} is carried out by a repeated execution of the simpler unitary transformation Up|i|z=|iPi|zU_{p}|i\rangle|z\rangle=|i\rangle P_{i}|z\rangle. Recall that PiP_{i} are the off-diagonal permutation operators that appear in the Hamiltonian. The gate cost of UpU_{p} is therefore 𝒪(M(kod+logM)){\cal O}(M(k_{\rm od}+\log M)). Additional details may be found in Ref. Kalev and Hen (2021).

III.4 Algorithm cost

We next analyze the circuit costs for the permutation expansion algorithm. Recall that the simulation of U(T)U(T) consists of two operations—eiH0T{\rm e}^{-iH_{0}T} and UI(T)U_{I}(T). The diagonal unitary eiH0T{\rm e}^{-iH_{0}T} can be implemented efficiently with a gate cost that scales linearly with the system size. To observe this, note that H0H_{0} is a diagonal matrix with real diagonal elements and can be written as H0=γ=0LJγZγH_{0}=\sum_{\gamma=0}^{L}J_{\gamma}Z_{\gamma}, where each ZγZ_{\gamma} is a tensor product of Pauli-ZZ’s (ZZZ\otimes\cdots\otimes Z) acting on at most dd qubits (weight-dd operators). Hence, we can write eiH0T=γ=0LeiJγZγT{\rm e}^{-iH_{0}T}=\prod_{\gamma=0}^{L}{\rm e}^{-iJ_{\gamma}Z_{\gamma}T}. Each eiJγZγT{\rm e}^{-iJ_{\gamma}Z_{\gamma}T} can be simulated using at most 2d2d CNOT gates with a single ancillary qubit. For example, let ZγZ_{\gamma} be a weight-mm (mdm\leq d) operator, then eiJγZγT{\rm e}^{-iJ_{\gamma}Z_{\gamma}T} can be implemented as

[Uncaptioned image]

where |z1|zm|z_{1}\rangle\cdots|z_{m}\rangle are the qubits ZγZ_{\gamma} acts on and |0|0\rangle is an ancillary qubit for extracting the phase. There are total LL such implementations for eiH0T{\rm e}^{-iH_{0}T}. Therefore, the total gate cost is 𝒪(Ld)\mathcal{O}(Ld) and the qubit cost is 𝒪(1)\mathcal{O}(1). Since LL usually grows linearly with the system size, the gate cost also scales linearly.

III.4.1 The cost for the state preparation and the controlled unitaries

The cost of implementing UI(T)U_{I}(T) resembles those in Ref. Kalev and Hen (2021). The first ingredient is the preparation of state |ψ0|\psi_{0}\rangle. Recall from Sec. III.3.2, the operation that takes |0Q|0Q|0|0\rangle^{\otimes Q}|0\rangle^{\otimes Q}|0\rangle to 1sq=0Qsq|1q|0(Qq)|1q|0(Qq)\frac{1}{\sqrt{s}}\sum_{q=0}^{Q}\sqrt{s_{q}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)} has gate cost 𝒪(Q)\mathcal{O}(Q). The operation for |1|1i=0Mk=1KDi(k)maxetwλ(i,k)Γ(tw)|i|k|1\rangle|1\rangle\to\sum_{i=0}^{M}\sum_{k=1}^{K}\sqrt{\frac{||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}}}{\Gamma(t_{w})}}|i\rangle|k\rangle costs 𝒪(MK)\mathcal{O}(MK) Shende et al. (2006). The total gate cost for the preparation of |ψ0|\psi_{0}\rangle (i.e., BB) is 𝒪(QMK)\mathcal{O}(QMK) (Lemma 8 in Childs et al. (2017)). In Appendix D, we provide an alternative procedure that leads to a 𝒪(Qlog(MK))\mathcal{O}(Q\log(MK)) scaling for implementing BB. The qubit cost in the state preparation is 𝒪(Qlog(MK)).\mathcal{O}(Q\log(MK)).

The next component is the implementation of the control unitary VcV_{c}. As shown in Kalev and Hen (2021), the gate cost of performing the control permutation P𝕚qP_{\mathbb{i}_{q}} is 𝒪(QM(kod+logM))\mathcal{O}(QM(k_{od}+\log M)), where kodk_{od} is the “locality,” i.e., each permutation PiP_{i} is a tensor product of at most kodk_{od} Pauli-XX operators. The implementation of the control phase Φ𝕚q,x(𝕜q,w)\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x} involves the calculation of d𝕚q,z(𝕜q)d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z} (the product of diagonal elements in the permutation expansion) and the divided differences (with xjx_{j}’s being the inputs). The cost of the former is 𝒪(QMKCD)\mathcal{O}(QMKC_{D}), where CDC_{D} is the cost of obtaining an element of Di(k)D^{(k)}_{i}. The cost of later is 𝒪(QM(kod+logM)+QMK(CΔH0+CΛ))\mathcal{O}(QM(k_{od}+\log M)+QMK(C_{\Delta H_{0}}+C_{\Lambda})), where CΔH0C_{\Delta H_{0}} (CΛ)(C_{\Lambda}) is the cost of obtaining energy differences of H0H_{0} (elements of Λi(k)\Lambda^{(k)}_{i}) [therefore, CΔH0+CΛC_{\Delta H_{0}}+C_{\Lambda} is the cost for obtaining the inputs xjx_{j}’s as defined in Eq. (29)]. The additional cost for the reversibility of the process scales as 𝒪(Q2)\mathcal{O}(Q^{2}). A detailed discussion of the costs of CΔH0C_{\Delta H_{0}} and CΛC_{\Lambda} may be found in Ref. Kalev and Hen (2021). Combining these, we estimate the total cost for VcV_{c} is

𝒪(Q2+QM(kod+logM)+QMK(CD+CΔH0+CΛ)).\mathcal{O}(Q^{2}+QM(k_{od}+\log M)+QMK(C_{D}+C_{\Delta H_{0}}+C_{\Lambda})). (56)

III.4.2 Overall cost of the algorithm

The full simulation for UI(T)U_{I}(T) is a product of segments UI(tw+Δtw,tw)U_{I}(t_{w}+\Delta t_{w},t_{w}), where each segment is simulated by interleaving BB and VcV_{c}. The total number of segments, rr, is determined by T=w=0r1ΔtwT=\sum_{w=0}^{r-1}\Delta t_{w}, where each Δtw\Delta t_{w} is determined by partitioning scheme described in Sec. III.3.1.

As discussed above, the number of LCU applications rr can be upper-bounded by rw=0r1Γ(tw)Δtw/ln2r\lesssim\sum^{r-1}_{w=0}\Gamma(t_{w})\Delta t_{w}/\ln 2 (in the long simulation time limit), which can be viewed as a discretized L1L^{1}-norm-like scaling with the norm of the non-static component of the Hamiltonian V(t)V(t).

Combining with the cost for simulating eiH0T{\rm e}^{-iH_{0}T} and the cost for each step (56), we conclude that at worst, the total gate cost scales as

𝒪(r(Q2+QM(kod+logM)+QMK(CD+CΔH0+CΛ))+Ld),\mathcal{O}\left(r\left(Q^{2}+QM(k_{od}+\log M)+QMK(C_{D}+C_{\Delta H_{0}}+C_{\Lambda})\right)+Ld\right), (57)

and the qubit cost scales as

𝒪(Qlog(MK)),\mathcal{O}(Q\log(MK)), (58)

where QQ scales as 𝒪(log(r/ϵ)/loglog(r/ϵ))\mathcal{O}(\log(r/\epsilon)/\log\log(r/\epsilon)). For convenience, we provide a glossary of symbols in Table 1. A summary of the gate and qubit costs of the simulation circuit and the various sub-routines used to construct it is given in Table 2.

Symbol Meaning
MM the number of permutation expansion terms of the non-static Hamitonain, c.f., Eq. (16)
KK the length of exponential sum expansion, c.f., Eq. (17)
rr the number partitions, c.f., Sec. III.3.1
QQ the series expansion truncation order, Q=𝒪(log(r/ϵ)loglog(r/ϵ))Q={\cal O}\Bigl{(}\frac{\log(r/\epsilon)}{\log\log(r/\epsilon)}\Bigr{)}
kodk_{\rm od} the upper bound on the locality of PiP_{i}
CDC_{D} the cost of obtaining an element of Di(k)D^{(k)}_{i}
CΔH0C_{\Delta H_{0}} the cost of obtaining energy differences of H0H_{0}
CΛC_{\Lambda} the cost of obtaining an element of Λi(k)\Lambda^{(k)}_{i}
LL the number of terms in the static Hamiltonian, i.e., H0=γ=0LJγZγH_{0}=\sum_{\gamma=0}^{L}J_{\gamma}Z_{\gamma}
dd the locality of ZγZ_{\gamma}
Table 1: Glossary of symbols.
Unitary Gate cost Qubit cost
eiH0T{\rm e}^{-iH_{0}T} 𝒪(Ld){\cal O}(Ld) 𝒪(1){\cal O}(1)
VcV_{c} 𝒪(Q2+QM(kod+logM)+QMK(CD+CΔH0+CΛ))\mathcal{O}(Q^{2}+QM(k_{od}+\log M)+QMK(C_{D}+C_{\Delta H_{0}}+C_{\Lambda})) 𝒪(QlogMK){\cal O}(Q\log MK)
UI(T)U_{I}(T) 𝒪(r(Q2+QM(kod+logM)+QMK(CD+CΔH0+CΛ)))\mathcal{O}\left(r\left(Q^{2}+QM(k_{od}+\log M)+QMK(C_{D}+C_{\Delta H_{0}}+C_{\Lambda})\right)\right) 𝒪(QlogMK){\cal O}(Q\log MK)
Table 2: A summary of resources for the circuit.

III.4.3 Example advantages of the algorithm

To illustrate how our simulation algorithm can provide speedups over existing algorithms, we focus in this subsection on two types of Hamiltonian systems: highly oscillating systems and decaying systems.

The cost of our algorithm is independent of the oscillation rates of the dynamics, whereas the cost of any simulation algorithm (e.g., Berry et al. (2020); Kieferová et al. (2019); Low and Wiebe (2019); Poulin et al. (2011)) that depends on dH(t)/dt||dH(t)/dt|| would depend on oscillation rates of the system. To illustrate this advantage, consider a two-level system with a Hamiltonian

H(t)=hZ+Γ(eiαt|01|+eiαt|10|)=H0+V(t),H(t)=hZ+\Gamma\left({\rm e}^{-i\alpha t}|0\rangle\langle 1|+{\rm e}^{i\alpha t}|1\rangle\langle 0|\right)=H_{0}+V(t), (59)

where h,Γ,αh,\Gamma,\alpha\in\mathbb{R}, H0=hZH_{0}=hZ and V(t)=Γ(eiαt|01|+eiαt|10|)V(t)=\Gamma\left({\rm e}^{-i\alpha t}|0\rangle\langle 1|+{\rm e}^{i\alpha t}|1\rangle\langle 0|\right). In this case, we have kod=M=K=1k_{od}=M=K=1 and λ=0\lambda=0. The gate cost of simulating U(T)U(T) scales as

𝒪(ΓT(logΓT/ϵloglogΓT/ϵ)2),\mathcal{O}\left(\Gamma T\left(\frac{\log\Gamma T/\epsilon}{\log\log\Gamma T/\epsilon}\right)^{2}\right), (60)

which is independent of α\alpha. This means the simulation cost remains the same even if α\alpha becomes arbitrarily large. One can realize the absence of α\alpha owing to the fact that phases are explicitly integrated out into an integral-free expansion series, where the bound of each term does not depend on the oscillations (due to Identity 2). Therefore, our simulation can be significantly more effective when the time dependence of the Hamiltonian has very high frequencies. Note that while the example above was given for a simple qubit system with pure oscillation, the frequency-independence in cost holds for any system.

Another class of systems for which our algorithm can provide seedup are Hamiltonians with exponential decays, i.e., λ<0\lambda<0. For concreteness, consider the Hamiltonian

H(t)=hZ+ΓeαtX=H0+V(t),H(t)=hZ+\Gamma{\rm e}^{-\alpha t}X=H_{0}+V(t), (61)

where h,Γh,\Gamma\in\mathbb{R} and α>0\alpha>0 and H0=hZH_{0}=hZ and V(t)=ΓeαtXV(t)=\Gamma{\rm e}^{-\alpha t}X. In this case, λ=α\lambda=-\alpha and V(t)max=Γeαt||V(t)||_{\max}=\Gamma{\rm e}^{-\alpha t}.

The L1L^{1}-norm defined in Ref. Berry et al. (2020) is 0TH(t)max𝑑t\int^{T}_{0}||H(t)||_{\max}dt, which has a linear scaling 𝒪(hT)\mathcal{O}(hT) with the simulation duration TT, whereas our discretized L1L^{1}-norm w=0r1V(tw)maxΔtw\sum^{r-1}_{w=0}||V(t_{w})||_{\max}\Delta t_{w} tends to a constant in the long time limit. This can be seen from the fact that the partition terminates at a large enough time twt_{w} (T\leq T), where Δtw=Ttw\Delta t_{w}=T-t_{w} becomes the final simulation step, as described in Sec. III.3.1. The above results also hold for any combination of exponential decays (even when these are multiplied by oscillatory terms) with which different time decay dependencies may be constructed.

III.5 Hamiltonians with arbitrary time dependence

The simulation algorithm invokes a switch to the interaction picture, by dividing the Hamiltonian into a static diagonal part H0H_{0} and a time-dependent Hermitian operator V(t)V(t). The V(t)V(t) is expanded using permutations and exponential sums as presented in Eq. (17). There, we assume that the time dependence can be expressed as exponential sums with a finite number of terms, KK. Although this assumption holds for many models (e.g., when the time dependencies are some combinations of trigonometric functions and exponential decays), the exponential series generally requires an infinite sum (e.g., a Fourier series). A straightforward procedure to obtain a finite sum approximation is via a truncated Fourier series. As an example, let us consider a polynomial function of time, i.e., f(t)=l=0pcltlf(t)=\sum_{l=0}^{p}c_{l}t^{l}. Using the proof of Theorem 8.14 in Ref. Rudin (1976), it can be shown that a truncated Fourier series of f(t)f(t) is 𝒪(ϵ)\mathcal{O}(\epsilon) close to f(t)f(t) when the truncation order is 𝒪(1/ϵ)\mathcal{O}(1/\epsilon). We also note that, other than Fourier series, there have been numerous studies Norvidas (2010); Beylkin and Monzón (2005, 2010); Braess and Hackbusch (2009); Wiscombe and Evans (1977) regarding finding an exponential-sum approximation of a function. Some of them, e.g., Beylkin and Monzón (2005), provide efficient algorithms with logarithmically scaling terms (with respect to the inverse of a required accuracy). These results suggest that efficient methods for finding the exponential-sum decompositions of the time dependences of V(t)V(t) can exist in many cases.

Suppose that V(t)V(t) is approximated by a finite series of exponential sum. The resulting error of the unitary evolution, due to the Hamiltonian approximation, scales only at most linearly with the evolution duration. This can be shown using the following property. Given two time-dependent Hamiltonians H1(t)H_{1}(t) and H2(t)H_{2}(t) such that

H1(t)H2(t)ϵfor all t[0,T],||H_{1}(t)-H_{2}(t)||\leq\epsilon\ \ \text{for all }t\in[0,T], (62)

then

U1(T,0)U2(T,0)𝒯exp[i0TH1(t)𝑑t]𝒯exp[i0TH2(t)𝑑t]ϵT.||U_{1}(T,0)-U_{2}(T,0)||\equiv\left|\left|\mathcal{T}\text{exp}\left[-i\int_{0}^{T}H_{1}(t)dt\right]-\mathcal{T}\text{exp}\left[-i\int_{0}^{T}H_{2}(t)dt\right]\right|\right|\leq\epsilon T. (63)

This holds true for any norm ||||||\cdot||. Before proving this, we first note a property of the so-called Subadditivity of error in implementing unitaries Nielsen and Chuang (2011). It says that for unitaries U1,U2,V3U_{1},U_{2},V_{3} and V4V_{4}, we have

U2U1V2V1U2V2+U1V1.||U_{2}U_{1}-V_{2}V_{1}||\leq||U_{2}-V_{2}||+||U_{1}-V_{1}||. (64)

This can be easily shown by

U2U1V2V1\displaystyle||U_{2}U_{1}-V_{2}V_{1}|| =U2U1V2U1+V2U1V2V1\displaystyle=||U_{2}U_{1}-V_{2}U_{1}+V_{2}U_{1}-V_{2}V_{1}||
(U2V2)U1+V2(U1V1)U2V2+U1V1,\displaystyle\leq||(U_{2}-V_{2})U_{1}||+||V_{2}(U_{1}-V_{1})||\leq||U_{2}-V_{2}||+||U_{1}-V_{1}||, (65)

where the basic operator norm inequalities are used. Now we prove the bound in Eq. (63). We divide TT into nn segments such that each segment has width T/nT/n. We can rewrite the time evolution operators as

U1(T,0)=U1(T,n1nT)U1(Tn,0)\displaystyle U_{1}(T,0)=U_{1}\left(T,\frac{n-1}{n}T\right)\cdots U_{1}\left(\frac{T}{n},0\right)
U2(T,0)=U2(T,n1nT)U2(Tn,0).\displaystyle U_{2}(T,0)=U_{2}\left(T,\frac{n-1}{n}T\right)\cdots U_{2}\left(\frac{T}{n},0\right).

Repeatedly using the subadditivity of error, we have

U1(T,0)U2(T,0)m=1nU1(mTn,(m1)Tn)U2(mTn,(m1)Tn)\displaystyle||U_{1}(T,0)-U_{2}(T,0)||\leq\sum_{m=1}^{n}\left|\left|U_{1}\left(\frac{mT}{n},\frac{(m-1)T}{n}\right)-U_{2}\left(\frac{mT}{n},\frac{(m-1)T}{n}\right)\right|\right|
=m=1ni(m1)TnmTn[H1(t)H2(t)]𝑑t+(i)2(m1)TnmTn𝑑t2(m1)Tnt2𝑑t1[H1(t2)H1(t1)H2(t2)H2(t1)]+\displaystyle=\sum_{m=1}^{n}\left|\left|-i\int_{\frac{(m-1)T}{n}}^{\frac{mT}{n}}\left[H_{1}(t)-H_{2}(t)\right]dt+(-i)^{2}\int^{\frac{mT}{n}}_{\frac{(m-1)T}{n}}dt_{2}\int_{\frac{(m-1)T}{n}}^{t_{2}}dt_{1}\left[H_{1}(t_{2})H_{1}(t_{1})-H_{2}(t_{2})H_{2}(t_{1})\right]+\cdots\right|\right|
m=1nϵTn+m=1n𝒪[(Tn)2]=ϵT+𝒪[T2n].\displaystyle\leq\sum_{m=1}^{n}\epsilon\frac{T}{n}+\sum_{m=1}^{n}\mathcal{O}\left[\left(\frac{T}{n}\right)^{2}\right]=\epsilon T+\mathcal{O}\left[\frac{T^{2}}{n}\right]. (66)

Since this inequality holds for any nn, we can take nn\to\infty and it yields Eq. (63) as claimed.

Now we apply this property to the simulation of UI(T)U_{I}(T). Suppose that we have an δ~\tilde{\delta}-accurate approximation of V(t)V(t), i.e., V~(t)V(t)δ~||\tilde{V}(t)-V(t)||\leq\tilde{\delta} for all t[0,T]t\in[0,T], where V~(t)\tilde{V}(t) is the finite exponential-sum approximation of V(t)V(t). The accumulative error from this approximation is bounded by δ~T\tilde{\delta}T and the overall error is 𝒪(δ~T+δ)\mathcal{O}(\tilde{\delta}T+\delta), where δ\delta is the error from LCU implementation. Recall that δ~\tilde{\delta} is closely related to KK (the number of terms). Although it is intuitive that a larger KK can allow for a smaller δ~\tilde{\delta}, the explicit relation between the two largely depends on the model and the expansion method. Nonetheless, we can expect KK to scale at least linearly with 1/δ~1/\tilde{\delta} for many cases, e.g., aforementioned truncated Fourier series for a polynomial.

The simulation cost also depends on MM, the number of terms in the permutation expansion. This quantity usually scales linearly with the system size and can be easily determined. For example, a typical spin model usually involves a sum of tensor products of Pauli-XX’s (or YY’s) and Pauli-ZZ’s. Each tensor product represents an interaction between qubits on certain lattice sites. Due to the common locality constraint that prevents a qubit interacting with the ones arbitrarily far apart, the number of interacting terms, MM, scales at most linearly with the number of qubits. In addition, a tensor product of Pauli operators can be easily separated into a product of diagonal matrix and a permutation, e.g., XXY=(IIiZ)(XXX)X\otimes X\otimes Y=(I\otimes I\otimes-iZ)(X\otimes X\otimes X). We conclude that MM will have modest linear scaling for most practical models.

IV Alternative scheme and reduction to the time-independent case

In this section, we provide an alternative yet equivalent scheme for the dynamical simulation, one that will allow us to establish an immediate connection to the time-independent Hamiltonian simulation formalism (specifically to the scheme presented in Ref. Kalev and Hen (2021)), in which H(t)H(t) is assumed constant in time.

In previous sections, we have chosen to partition the interaction-picture unitary UI(T)U_{I}(T) into short time segments and then follow its execution by the application of a diagonal eiH0T{\rm e}^{iH_{0}T} bringing it back to the Schrödinger picture. Here, we show that the Schrödinger picture U(T)U(T) can be partitioned similarly.

Recalling the expansion of UI(tw+Δtw,tw)U_{I}(t_{w}+\Delta t_{w},t_{w}) in Eq. (32), we have

UI(tw+Δtw,tw)|z=q=0𝕚q𝕜q(i)qetwx1eΔtw[x1,x2,,xq,0]d𝕚q,z(𝕜q)P𝕚q|z,\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t_{w}x_{1}}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle,

with

x1=i(Ez𝕚qEz)+l=1qλil,z𝕚l(kl).x_{1}=i\left(E_{z_{\mathbb{i}_{q}}}-E_{z}\right)+\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}.

Breaking the etwx1{\rm e}^{t_{w}x_{1}} phase, we get:

UI(tw+Δtw,tw)|z\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle
=q=0𝕚q𝕜q(i)qetwl=1qλil,z𝕚l(kl)eitwEzei(tw+Δtw)Ez𝕚qeiΔtwEz𝕚qeΔtw[x1,x2,,xq,0]d𝕚q,z(𝕜q)P𝕚q|z.\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t_{w}\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{-it_{w}E_{z}}{\rm e}^{i(t_{w}+\Delta t_{w})E_{z_{\mathbb{i}_{q}}}}{\rm e}^{-i\Delta t_{w}E_{z_{\mathbb{i}_{q}}}}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle. (67)

We find that

eiH0(tw+Δtw)UI(tw+Δtw,tw)=U~I(tw+Δtw,tw)eiH0tw{\rm e}^{-iH_{0}(t_{w}+\Delta t_{w})}U_{I}(t_{w}+\Delta t_{w},t_{w})=\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w}){\rm e}^{-iH_{0}t_{w}} (68)

where

U~I(tw+Δtw,tw)=zq=0𝕚q𝕜q(i)qetwl=1qλil,z𝕚l(kl)eiΔtwEz𝕚qeΔtw[x1,x2,,xq,0]d𝕚q,z(𝕜q)P𝕚q|zz|.\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w})=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t_{w}\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{-i\Delta t_{w}E_{z_{\mathbb{i}_{q}}}}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle\langle z|\,. (69)

Inspecting the full unitary evolution, we observe

U(T)\displaystyle U(T) =eiH0TUI(T)\displaystyle={\rm e}^{-iH_{0}T}U_{I}(T)
=eiH0TUI(T,tr1)UI(tr1,tr2)UI(t1,0)=U~I(T,tr1)D(tr1)UI(tr1,tr2)UI(t1,0).\displaystyle={\rm e}^{-iH_{0}T}U_{I}(T,t_{r-1})U_{I}(t_{r-1},t_{r-2})\cdots U_{I}(t_{1},0)=\widetilde{U}_{I}(T,t_{r-1})D(t_{r-1})U_{I}(t_{r-1},t_{r-2})\cdots U_{I}(t_{1},0). (70)

The evolution operator U(T)U(T) can be simplifies as

U(T)=U~I(T,tr1)U~I(tr1,tr2)U~I(t1,0),U(T)=\widetilde{U}_{I}(T,t_{r-1})\widetilde{U}_{I}(t_{r-1},t_{r-2})\cdots\widetilde{U}_{I}(t_{1},0)\,, (71)

eliminating the diagonal piece. Each U~I(tw+Δtw,tw)\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w}) can be rewritten as:

U~I(tw+Δtw,tw)\displaystyle\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w})
=zq=0𝕚q𝕜q(i)qe(tw+Δtw)l=1qλil,z𝕚l(kl)eΔtw(iEz𝕚q+l=1qλil,z𝕚l(kl))eΔtw[x1,x2,,xq,0]d𝕚q,z(𝕜q)P𝕚q|zz|.\displaystyle=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{(t_{w}+\Delta t_{w})\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{-\Delta t_{w}(iE_{z_{\mathbb{i}_{q}}}+\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}})}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle\langle z|\,. (72)

The factor eΔtw(iEz𝕚q+l=1qλil,z𝕚l(kl)){\rm e}^{-\Delta t_{w}(iE_{z_{\mathbb{i}_{q}}}+\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}})} can be absorbed into the divided difference:

U~I(tw+Δtw,tw)=zq=0𝕚q𝕜q(i)qe(tw+Δtw)l=1qλil,z𝕚l(kl)eΔtw[y~1,y~2,,y~q,y~q+1]d𝕚q,z(𝕜q)P𝕚q|zz|.\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w})=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{(t_{w}+\Delta t_{w})\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{\Delta t_{w}[\tilde{y}_{1},\tilde{y}_{2},\cdots,\tilde{y}_{q},\tilde{y}_{q+1}]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle\langle z|\,. (73)

with

y~j=xj(iEz𝕚q+l=1qλil,z𝕚l(kl))=i(Ez𝕚qEz𝕚j1)+l=jqλil,z𝕚l(kl)(iEz𝕚q+l=1qλil,z𝕚l(kl)),\tilde{y}_{j}=x_{j}-\left(iE_{z_{\mathbb{i}_{q}}}+\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)=i\left(E_{z_{\mathbb{i}_{q}}}-E_{z_{\mathbb{i}_{j-1}}}\right)+\sum_{l=j}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}-\left(iE_{z_{\mathbb{i}_{q}}}+\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\,, (74)

which simplifies to

y~j=iEz𝕚j1l=1j1λil,z𝕚l(kl).\tilde{y}_{j}=-iE_{z_{\mathbb{i}_{j-1}}}-\sum_{l=1}^{j-1}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\,. (75)

By inserting additional iΔtwEzi\Delta t_{w}E_{z} phases into the divided differences, we can rewrite

U~I(tw+Δtw,tw)=(zq=0𝕚q𝕜q(i)qe(tw+Δtw)l=1qλil,z𝕚l(kl)eΔtw[y1,y2,,yq,yq+1]d𝕚q,z(𝕜q)P𝕚q|zz|)eiH0Δtw.\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w})=\left(\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{(t_{w}+\Delta t_{w})\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{\Delta t_{w}[y_{1},y_{2},\cdots,y_{q},y_{q+1}]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle\langle z|\right){\rm e}^{-iH_{0}\Delta t_{w}}\,. (76)

with

yj=y~j+iEz=i(Ez𝕚j1Ez)l=1j1λil,z𝕚l(kl)=iΔEz𝕚j1l=1j1λil,z𝕚l(kl).y_{j}=\tilde{y}_{j}+iE_{z}=-i(E_{z_{\mathbb{i}_{j-1}}}-E_{z})-\sum_{l=1}^{j-1}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}=-i\Delta E_{z_{\mathbb{i}_{j-1}}}-\sum_{l=1}^{j-1}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\,. (77)

Now, we can write U(T)U(T) as alternating off-diagonal and diagonal unitaries:

U(T)=wU~I(tw+Δtw,tw)wUod(tw+Δtw,tw)eiH0Δtw.U(T)=\prod_{w}\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w})\equiv\prod_{w}U_{\textrm{od}}(t_{w}+\Delta t_{w},t_{w}){\rm e}^{-iH_{0}\Delta t_{w}}\,. (78)

When H(t)H(t) becomes time-independent, λil,z𝕚l(kl)=0\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}=0 and Δtw=Δt=ln2/Γ\Delta t_{w}=\Delta t=\ln 2/\Gamma. To synchronize the notation with Kalev and Hen (2021), we identify H0=D0H_{0}=D_{0} and Uod(tw+Δtw,tw)=UodU_{\textrm{od}}(t_{w}+\Delta t_{w},t_{w})=U_{\textrm{od}}. The evolution operator becomes U(T)=UodeiD0ΔtUodeiD0ΔtU(T)=U_{\textrm{od}}{\rm e}^{-iD_{0}\Delta t}\cdots U_{\textrm{od}}{\rm e}^{-iD_{0}\Delta t}, which coincides with Kalev and Hen (2021).

V Conclusions

We presented a quantum algorithm for simulating the evolution operator generated from a time-dependent Hamiltonian. The algorithm involves a permutation expansion for the interaction Hamiltonian, a switch to the interaction-picture, and the incorporation of the LCU technique. Combining the permutation expansion with the Dyson series has led to an integral-free representation for the interaction-picture unitary with coefficients involving the notion of divided differences with complex inputs.

We found that our expansion allowed us to adjust the time steps based on the dynamical characteristics of the Hamiltonian, providing a resource saving as compared to the equal-size partition with the largest bound. This further resulted in a gate resource that scales with an L1L^{1}-norm-like scaling with respect only to the ‘non-static’ norm of the Hamiltonian.

Specifically, we demonstrated that for systems with a decaying non-static component, the resources do not scale with the total evolution time asymptotically. Furthermore, the simulation cost is independent of the frequencies, implying a significant advantage for systems with highly oscillating components.

Acknowledgements.
This work is supported by the U.S. Department of Energy (DOE), Office of Science, Basic Energy Sciences (BES) under Award No. DE-SC0020280.

References

Appendix

Appendix A Properties of divided difference

We begin with a formal definition of divided difference for complex-valued functions and follow with some properties that will be of use to us when deriving the new bound. The main results are derived for the exponential functions.

Definition 1.

Let 𝕌\mathbb{U} be an open subset of \mathbb{C}, and f:𝕌f:\mathbb{U}\to\mathbb{C} is analytic in 𝕌\mathbb{U}. For any non-negative integer qq and x0,x1,,xq𝕌x_{0},x_{1},\cdots,x_{q}\in\mathbb{U}, the divided difference of ff is denoted as f[x0,x1,,xq]f[x_{0},x_{1},\cdots,x_{q}]. If q=0q=0, f[x0]f(x0)f[x_{0}]\equiv f(x_{0}). Suppose {x0,x1,,xq}\{x_{0},x_{1},\cdots,x_{q}\} has rr distinct elements. Let S={xσ(0),xσ(1),,xσ(q)}S=\{x_{\sigma(0)},x_{\sigma(1)},\cdots,x_{\sigma(q)}\} be a sorted set of {x0,x1,,xq}\{x_{0},x_{1},\cdots,x_{q}\}, i.e., there exists a permutation σ\sigma such that the first n1n_{1} elements of SS are equal and the following n2n_{2} elements of SS are equal and so on and so forth. There are rr same-element clusters and i=1rni=q+1\sum_{i=1}^{r}n_{i}=q+1. The divided difference of ff is defined as

f[x0,x1,,xq]={f[xσ(1),,xσ(q)]f[xσ(0),,xσ(q1)]xσ(q)xσ(0)if r>1,f(q)(x0)q!if r=1,f[x_{0},x_{1},\cdots,x_{q}]=\begin{cases}\frac{f[x_{\sigma(1)},\cdots,x_{\sigma(q)}]-f[x_{\sigma(0)},\cdots,x_{\sigma(q-1)}]}{x_{\sigma(q)}-x_{\sigma(0)}}&\text{if }r>1,\\ \frac{f^{(q)}(x_{0})}{q!}&\text{if }r=1,\end{cases}

where f(q)f^{(q)} denotes the qqth derivative of ff.

Although the above sorting procedure is not unique, it can be shown that any choice of the permutation gives the same result, and hence the definition is well-defined.

The divided difference involves a recursive relation that connects a q+1q+1 input case to two qq cases. For q=1q=1,

f[x0,x1]={f(x1)f(x0)x1x0if x0x1,f(x0)if x0=x1.\displaystyle f[x_{0},x_{1}]=\begin{cases}\frac{f(x_{1})-f(x_{0})}{x_{1}-x_{0}}&\text{if }x_{0}\neq x_{1},\\ f^{\prime}(x_{0})&\text{if }x_{0}=x_{1}.\end{cases} (79)

For q=2q=2, and suppose x0x_{0}, x1x_{1} and x2x_{2} are all distinct,

f[x0,x1,x2]\displaystyle f[x_{0},x_{1},x_{2}] =f(x2)f(x1)x2x1f(x1)f(x0)x1x0x2x0\displaystyle=\frac{\frac{f(x_{2})-f(x_{1})}{x_{2}-x_{1}}-\frac{f(x_{1})-f(x_{0})}{x_{1}-x_{0}}}{x_{2}-x_{0}}
=f(x0)(x0x1)(x0x2)+f(x1)(x1x2)(x1x0)+f(x2)(x2x0)(x2x1).\displaystyle=\frac{f(x_{0})}{(x_{0}-x_{1})(x_{0}-x_{2})}+\frac{f(x_{1})}{(x_{1}-x_{2})(x_{1}-x_{0})}+\frac{f(x_{2})}{(x_{2}-x_{0})(x_{2}-x_{1})}. (80)

In fact, it can be shown that for distinct x0,x1,,xqx_{0},x_{1},\cdots,x_{q},

f[x0,x1,,xq]=i=0qf(xi)ki(xixk).\displaystyle f[x_{0},x_{1},\cdots,x_{q}]=\sum^{q}_{i=0}\frac{f(x_{i})}{\prod_{k\neq i}(x_{i}-x_{k})}. (81)
Remark.

Since any analytic function admits a Taylor expansion representation and the divided difference is a linear functional, the divided difference of an analytic function ff has a series expansion form, i.e., for x0,,xqx_{0},\cdots,x_{q} and yy in ff’s analytic domain,

f[x0,,xq]=n=0f(n)(y)n!pn|y[x0,,xq],f[x_{0},\cdots,x_{q}]=\sum_{n=0}^{\infty}\frac{f^{(n)}(y)}{n!}p_{n|y}[x_{0},\cdots,x_{q}], (82)

where pn|y(x)(xy)np_{n|y}(x)\equiv(x-y)^{n}. Because pn|y[x0,,xq]=0p_{n|y}[x_{0},\cdots,x_{q}]=0 for all n<qn<q, the non-vanishing term of the series starts from the qqth order.

For simplicity, we denote the divided difference for the exponential function as e[x0,,xq]{\rm e}^{[x_{0},\cdots,x_{q}]}, i.e.,

e[x0,,xq]f[x0,,xq],wheref(x)=ex.{\rm e}^{[x_{0},\cdots,x_{q}]}\equiv f[x_{0},\cdots,x_{q}],\ \text{where}\ f(x)={\rm e}^{x}. (83)
Property 1.

For any non-negative integer qq and x0,x1,,xqx_{0},x_{1},\cdots,x_{q}\in\mathbb{C},

e[x0,x1,,xq]=ex0e[0,x1x0,,xqx0].{\rm e}^{[x_{0},x_{1},\cdots,x_{q}]}={\rm e}^{x_{0}}{\rm e}^{[0,x_{1}-x_{0},\cdots,x_{q}-x_{0}]}. (84)

This property and the fact that divided differences are permutation symmetric among inputs imply that any input can be factored out of the divided difference by subtracting it from every entry.

Property 2.

For any non-negative integer qq and x0,x1,,xqx_{0},x_{1},\cdots,x_{q}\in\mathbb{C},

e[x0,x1,,xq]=n=q1n!kj=nqj=0q(xj)kj.{\rm e}^{[x_{0},x_{1},\cdots,x_{q}]}=\displaystyle\sum^{\infty}_{n=q}\frac{1}{n!}\sum_{\sum k_{j}=n-q}\prod^{q}_{j=0}(x_{j})^{k_{j}}. (85)

An equivalent definition of divided difference for an analytic function is via its Taylor expansion. It amounts to apply divided difference on every order of the series. Since any polynomial of order less than qq is annihilated, the series starts from the order qq. PropertyProperty 2 is derived from the Taylor expansion of ex{\rm e}^{x} with respect to the origin.

Lemma 1.

For any non-negative integer qq and x0,x1,,xqx_{0},x_{1},\cdots,x_{q}\in\mathbb{C},

01aqe[ax0,ax1,,axq]𝑑a=e[0,x0,x1,,xq].\int_{0}^{1}a^{q}{\rm e}^{[ax_{0},ax_{1},\cdots,ax_{q}]}da={\rm e}^{[0,x_{0},x_{1},\cdots,x_{q}]}. (86)
Proof.

This can be observed from the series expansion of the divided difference for the exponential function, i.e., from PropertyProperty 2,

aqe[ax0,ax1,,axq]\displaystyle a^{q}{\rm e}^{[ax_{0},ax_{1},\cdots,ax_{q}]} =aqn=q1n!kj=nqj=0q(axj)kj\displaystyle=a^{q}\displaystyle\sum^{\infty}_{n=q}\frac{1}{n!}\sum_{\sum k_{j}=n-q}\prod^{q}_{j=0}(ax_{j})^{k_{j}}
=aqn=q1n!kj=nqanqj=0q(xj)kj=n=qann!kj=nqj=0q(xj)kj.\displaystyle=a^{q}\displaystyle\sum^{\infty}_{n=q}\frac{1}{n!}\sum_{\sum k_{j}=n-q}a^{n-q}\prod^{q}_{j=0}(x_{j})^{k_{j}}=\displaystyle\sum^{\infty}_{n=q}\frac{a^{n}}{n!}\sum_{\sum k_{j}=n-q}\prod^{q}_{j=0}(x_{j})^{k_{j}}. (87)

Performing term-by-term integration over aa on both side, we have

01aqe[ax0,ax1,,axq]𝑑a\displaystyle\int_{0}^{1}a^{q}{\rm e}^{[ax_{0},ax_{1},\cdots,ax_{q}]}da =n=q(01ann!𝑑a)kj=nqj=0q(xj)kj\displaystyle=\displaystyle\sum^{\infty}_{n=q}\left(\int_{0}^{1}\frac{a^{n}}{n!}da\right)\sum_{\sum k_{j}=n-q}\prod^{q}_{j=0}(x_{j})^{k_{j}}
=n=q1(n+1)!kj=nqj=0q(xj)kj=e[0,x0,x1,,xq],\displaystyle=\displaystyle\sum^{\infty}_{n=q}\frac{1}{(n+1)!}\sum_{\sum k_{j}=n-q}\prod^{q}_{j=0}(x_{j})^{k_{j}}={\rm e}^{[0,x_{0},x_{1},\cdots,x_{q}]}, (88)

where the last equality follows from the series expansion representation of e[0,x0,x1,,xq]{\rm e}^{[0,x_{0},x_{1},\cdots,x_{q}]}. This completes the proof. ∎

Corrolary 1.

Let f(x)=etxf(x)={\rm e}^{tx}, where tt\in\mathbb{R} and xx\in\mathbb{C}. We denote et[x0,,xq]f[x0,,xq]{\rm e}^{t[x_{0},\cdots,x_{q}]}\equiv f[x_{0},\cdots,x_{q}], where x0,,xqx_{0},\cdots,x_{q}\in\mathbb{C}. For any τ\tau\in\mathbb{R},

0τet[x0,,xq]𝑑t=eτ[0,x0,,xq].\int^{\tau}_{0}{\rm e}^{t[x_{0},\cdots,x_{q}]}dt={\rm e}^{\tau[0,x_{0},\cdots,x_{q}]}. (89)

This can be verified by evaluating the series expansion form on both side, by a similar manner in the proof of Lemma 1 .

With these properties, we are ready to prove the bound in Identity 2 in the main context.

Theorem 1.

For any non-negative integer qq and x0,x1,,xqx_{0},x_{1},\cdots,x_{q}\in\mathbb{C},

|e[x0,x1,,xq]|e[(x0),(x1),,(xq)],\left|{\rm e}^{[x_{0},x_{1},\cdots,x_{q}]}\right|\leq{\rm e}^{[\Re(x_{0}),\Re(x_{1}),\cdots,\Re(x_{q})]}, (90)

where ()\Re(\cdot) gives the real part of the input.

Proof.

We proceed by induction. Eq. (90) is trivially satisfied with the equality when q=0q=0. For the case q=1q=1, we have

|e[x0,x1]|\displaystyle\left|{\rm e}^{[x_{0},x_{1}]}\right| =|ex0||e[0,x1x0]|\displaystyle=\left|{\rm e}^{x_{0}}\right|\left|{\rm e}^{[0,x_{1}-x_{0}]}\right|
=e(x0)|01aea(x1x0)𝑑a|\displaystyle={\rm e}^{\Re(x_{0})}\left|\int_{0}^{1}a{\rm e}^{a(x_{1}-x_{0})}da\right|
e(x0)01a|ea(x1x0)|𝑑a\displaystyle\leq{\rm e}^{\Re(x_{0})}\int_{0}^{1}a\left|{\rm e}^{a(x_{1}-x_{0})}\right|da
=e(x0)01aea(x1x0)𝑑a\displaystyle={\rm e}^{\Re(x_{0})}\int_{0}^{1}a{\rm e}^{a\Re(x_{1}-x_{0})}da
=e[(x0),(x1)],\displaystyle={\rm e}^{[\Re(x_{0}),\Re(x_{1})]}, (91)

where Lemma 1 is used. Assume that we have

|e[x0,,xq]|e[(x0),,(xq)],\left|{\rm e}^{[x_{0},\cdots,x_{q}]}\right|\leq{\rm e}^{[\Re(x_{0}),\cdots,\Re(x_{q})]}, (92)

which it is true for q=0,1q=0,1. It follows that

|e[x0,,xq,xq+1]|\displaystyle\left|{\rm e}^{[x_{0},\cdots,x_{q},x_{q+1}]}\right| =|exq+1||e[0,x0xq+1,,xqxq+1]|\displaystyle=\left|{\rm e}^{x_{q+1}}\right|\left|{\rm e}^{[0,x_{0}-x_{q+1},\cdots,x_{q}-x_{q+1}]}\right|
=e(xq+1)|01aqe[a(x0xq+1),,a(xqxq+1)]𝑑a|\displaystyle={\rm e}^{\Re(x_{q+1})}\left|\int_{0}^{1}a^{q}{\rm e}^{[a(x_{0}-x_{q+1}),\cdots,a(x_{q}-x_{q+1})]}da\right|
e(xq+1)01aq|e[a(x0xq+1),,a(xqxq+1)]|𝑑a\displaystyle\leq{\rm e}^{\Re(x_{q+1})}\int_{0}^{1}a^{q}\left|{\rm e}^{[a(x_{0}-x_{q+1}),\cdots,a(x_{q}-x_{q+1})]}\right|da
e(xq+1)01aqe[(ax0axq+1),,(axqaxq+1)]𝑑a\displaystyle\leq{\rm e}^{\Re(x_{q+1})}\int_{0}^{1}a^{q}{\rm e}^{[\Re(ax_{0}-ax_{q+1}),\cdots,\Re(ax_{q}-ax_{q+1})]}da
=e(xq+1)e[0,(x0xq+1),,(xqxq+1)]\displaystyle={\rm e}^{\Re(x_{q+1})}{\rm e}^{[0,\Re(x_{0}-x_{q+1}),\cdots,\Re(x_{q}-x_{q+1})]}
=e[(x0),,(xq),(xq+1)],\displaystyle={\rm e}^{[\Re(x_{0}),\cdots,\Re(x_{q}),\Re(x_{q+1})]}, (93)

where the second and the third equalities use Lemma 1 and the second inequality uses (92). This proves that the inequality holds for any number of complex inputs. ∎

Appendix B Bounding |eΔtw[x1,x2,,xq,0]|\left|{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}\right|

For |eΔtw[x1,x2,,xq,0]|\left|{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}\right|, we use the following theorem,

Theorem 1.

For any q+1q+1 complex values x0,,xqx_{0},\cdots,x_{q}\in\mathbb{C},

|e[x0,,xq]|e[(x0),,(xq)]=eξq!,\left|{\rm e}^{[x_{0},\cdots,x_{q}]}\right|\leq{\rm e}^{[\Re(x_{0}),\cdots,\Re(x_{q})]}=\frac{{\rm e}^{\xi}}{q!}, (94)

where ()\Re(\cdot) denotes the real part of an input and ξ[min{(x0),,(xq)},max{(x0),,(xq)}]\xi\in\left[\min\{\Re(x_{0}),\cdots,\Re(x_{q})\},\max\{\Re(x_{0}),\cdots,\Re(x_{q})\}\right].

This is proved in Appendix A. From this, we have

|eΔtw[x1,,xq,0]|=(Δtw)q|e[Δtwx1,,Δtwxq,0]|(Δtw)qe[Δtw(x1),,Δtw(xq),0].\left|{\rm e}^{\Delta t_{w}[x_{1},\cdots,x_{q},0]}\right|=({\Delta t_{w}})^{q}\left|{\rm e}^{[\Delta t_{w}x_{1},\cdots,\Delta t_{w}x_{q},0]}\right|\leq({\Delta t_{w}})^{q}{\rm e}^{[\Delta t_{w}\Re(x_{1}),\cdots,\Delta t_{w}\Re(x_{q}),0]}. (95)

From the definition of xjx_{j}, we have

j{1,,q},(xj)=l=jq(λil,z𝕚l(kl))(qj+1)λ.\forall j\in\{1,\cdots,q\},\ \ \ \ \ \Re(x_{j})=\sum_{l=j}^{q}\Re\left(\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\leq(q-j+1)\lambda. (96)

Based on the property that increasing any input in e[,,]{\rm e}^{[\cdot,\cdots,\cdot]} will only increase its value (can be proved by taking derivatives in the Hermite-Genocchi form), we have

|eΔtw[x1,,xq,0]|(Δtw)qe[Δtw(x1),,Δtw(xq),0](Δtw)qe[Δtwqλ,Δtw(q1)λ,,Δtwλ,0].\left|{\rm e}^{\Delta t_{w}[x_{1},\cdots,x_{q},0]}\right|\leq({\Delta t_{w}})^{q}{\rm e}^{[\Delta t_{w}\Re(x_{1}),\cdots,\Delta t_{w}\Re(x_{q}),0]}\leq({\Delta t_{w}})^{q}{\rm e}^{[\Delta t_{w}q\lambda,\Delta t_{w}(q-1)\lambda,\cdots,\Delta t_{w}\lambda,0]}. (97)

Using the permutation symmetric property and Property 1, we have

(Δtw)qe[Δtwqλ,Δtw(q1)λ,,Δtwλ,0]\displaystyle({\Delta t_{w}})^{q}{\rm e}^{[\Delta t_{w}q\lambda,\Delta t_{w}(q-1)\lambda,\cdots,\Delta t_{w}\lambda,0]} =Δtwqe[Δtwqλ,Δtw(q1)λ,,Δtwλ]e[Δtw(q1)λ,,Δtwλ,0]Δtwλq\displaystyle={\Delta t_{w}}^{q}\frac{{\rm e}^{[\Delta t_{w}q\lambda,\Delta t_{w}(q-1)\lambda,\cdots,\Delta t_{w}\lambda]}-{\rm e}^{[\Delta t_{w}(q-1)\lambda,\cdots,\Delta t_{w}\lambda,0]}}{\Delta t_{w}\lambda q}
=(Δtw)qeΔtwλ1Δtwλqe[Δtw(q1)λ,,Δtwλ,0]==(eλΔtw1λ)q1q!.\displaystyle=({\Delta t_{w}})^{q}\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\Delta t_{w}\lambda q}{\rm e}^{[\Delta t_{w}(q-1)\lambda,\cdots,\Delta t_{w}\lambda,0]}=\cdots=\left(\frac{{\rm e}^{\lambda\Delta t_{w}}-1}{\lambda}\right)^{q}\frac{1}{q!}. (98)

Therefore, we have

|eΔtw[x1,x2,,xq,0]|1q!(eλΔtw1λ)q.\left|{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}\right|\leq\frac{1}{q!}\left(\frac{{\rm e}^{\lambda\Delta t_{w}}-1}{\lambda}\right)^{q}. (99)

Appendix C LCU method review

We give a brief introduction to the LCU method in this section, and we adapt the original paper Berry et al. (2015)’s notations for a more convenient reference to readers. Suppose we have a unitary UU, which is an infinite sum of unitaries, i.e.,

U=j=0βjVj,U=\sum_{j=0}^{\infty}\beta_{j}V_{j}, (100)

where βj>0\beta_{j}>0 and VjV_{j} are some unitaries. A truncated series, up to order m1m-1, yields an operator

U~=j=0m1βjVj,\tilde{U}=\sum_{j=0}^{m-1}\beta_{j}V_{j}, (101)

which approaches UU as mm increases. We perform the following procedure to effectively implement U~\tilde{U} on a state |ψ|\psi\rangle embedded in a larger system. Prepare an mm-dimensional ancilla |0|0\rangle and implement a unitary BB such that

B|0=1sj=0m1βj|j,B|0\rangle=\frac{1}{\sqrt{s}}\sum_{j=0}^{m-1}\sqrt{\beta_{j}}|j\rangle, (102)

where s=j0m1βjs=\sum_{j-0}^{m-1}\beta_{j}. Suppose we have access to a control unitary VcV_{c} such that for each jj,

Vc|j|ψ=|jVj|ψ.V_{c}|j\rangle|\psi\rangle=|j\rangle V_{j}|\psi\rangle. (103)

Consider the following combination of the above operations

W(BI)Vc(BI).W\equiv\left(B^{\dagger}\otimes I\right)V_{c}\left(B\otimes I\right). (104)

We have

W|0|ψ=1s|0U~|ψ+11s2|Φ,W|0\rangle|\psi\rangle=\frac{1}{s}|0\rangle\tilde{U}|\psi\rangle+\sqrt{1-\frac{1}{s^{2}}}|\Phi\rangle, (105)

where |Φ|\Phi\rangle’s ancillary part is orthogonal to |00||0\rangle\langle 0|. Let us denote P|00|IP\equiv|0\rangle\langle 0|\otimes I the orthogonal projection onto that subspace and RI2PR\equiv I-2P the reflection operator with respect to PP. It is shown that the sequence of operations AWRWRWA\equiv-WRW^{\dagger}RW, acting on the total system is A|0|ψ=|0U~|ψA|0\rangle|\psi\rangle=|0\rangle\tilde{U}|\psi\rangle when U~\tilde{U} is unitary and s=2s=2. This procedure is the so-called Oblivious Amplitude Amplification (OAA). However, U~\tilde{U} is in general not unitary because it is a truncated series of UU. This nonunitarity can be accounted for when U~U\tilde{U}\approx U and s2s\approx 2. More specifically, it is shown that if UU~=𝒪(δ)||U-\tilde{U}||=\mathcal{O}(\delta) and |s2|=𝒪(δ)|s-2|=\mathcal{O}(\delta), then

PA|0|ψ|0U|ψ=𝒪(δ).\left|\left|PA|0\rangle|\psi\rangle-|0\rangle U|\psi\rangle\right|\right|=\mathcal{O}(\delta). (106)

This means when U~\tilde{U} is δ\delta-close to UU and ss is δ\delta-close to 2, the effect of the operator AA on the whole system is δ\delta-close to only UU acting on |ψ|\psi\rangle.

Note that the condition UU~=𝒪(δ)||U-\tilde{U}||=\mathcal{O}(\delta) can be satisfied when the truncation order mm is high enough. However, the condition |s2|=𝒪(δ)|s-2|=\mathcal{O}(\delta) is satisfied only when βj\beta_{j} are specifically chosen. By construction, we require s=j=0m1βjs=\sum_{j=0}^{m-1}\beta_{j}. If we choose βj=(ln2)j/j!\beta_{j}=(\text{ln}2)^{j}/j!, then

s=j=0m1(ln2)jj!s=\sum_{j=0}^{m-1}\frac{(\text{ln}2)^{j}}{j!} (107)

becomes a truncated Taylor expansion of 2, i.e., 2=eln22={\rm e}^{\text{ln}2}. In fact, it can be shown that the required truncation order mm such that |s2|=𝒪(δ)|s-2|=\mathcal{O}(\delta) scales like log(1/δ)/log(log(1/δ))\log(1/\delta)/\log(\log(1/\delta)). With this mm, it also guarantees that UU~=𝒪(δ)||U-\tilde{U}||=\mathcal{O}(\delta), because

UU~=j=m(ln2)jj!Vjj=m(ln2)jj!=|2s|.\left|\left|U-\tilde{U}\right|\right|=\left|\left|\sum_{j=m}^{\infty}\frac{(\text{ln}2)^{j}}{j!}V_{j}\right|\right|\leq\sum_{j=m}^{\infty}\frac{(\text{ln}2)^{j}}{j!}=|2-s|. (108)

In summary, performing AA on an extended system |0|ψ|0\rangle|\psi\rangle, with βj=(ln2)j/j!\beta_{j}=(\text{ln}2)^{j}/j! and m=𝒪(log(1/δ)/loglog(1/δ))m=\mathcal{O}(\log(1/\delta)/\log\log(1/\delta)), effectively performs UU on |ψ|\psi\rangle with 𝒪(δ)\mathcal{O}(\delta) accuracy.

Appendix D An alternative approach for the LCU setup

We provide an alternative procedure for the LCU routine that leads to an exponential saving for the state preparation. Let us define

Γmaxk,iDi(k)max.\Gamma\equiv\max_{\forall k,i}||D^{(k)}_{i}||_{\max}. (109)

Re-evaluate the coefficients in Eq. (32) using the Γ\Gamma above, we have

|etwx1eΔtw[x1,x2,,xq,0]d𝕚q,z(𝕜q)|=(ΓΔt~wetwλ)qq!cos[ϕ𝕚q,z(𝕜q)]eiθ𝕚q,z(𝕜q).\displaystyle\left|{\rm e}^{t_{w}x_{1}}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}\right|=\frac{\left(\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{q!}\cos\left[\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}\right]{\rm e}^{i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}. (110)

The evolution operator from twt_{w} to tw+Δtwt_{w}+\Delta t_{w} becomes

UI(tw+Δtw,tw)=zUI(tw+Δtw,tw)|zz|\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})=\sum_{z}U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle\langle z|
=zq=0𝕚q𝕜q(i)q(ΓΔt~wetwλ)q2q!(eiϕ𝕚q,z(𝕜q)+iθ𝕚q,z(𝕜q)+eiϕ𝕚q,z(𝕜q)+iθ𝕚q,z(𝕜q))P𝕚q|zz|\displaystyle=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\frac{\left(\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{2q!}\left({\rm e}^{i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}+{\rm e}^{-i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}\right)P_{\mathbb{i}_{q}}|z\rangle\langle z|
=q=0(ΓΔt~wetwλ)q2q!𝕚q𝕜qx=±(i)qP𝕚qΦ𝕚q,x(𝕜q,w).\displaystyle=\sum_{q=0}^{\infty}\frac{\left(\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{2q!}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=\pm}(-i)^{q}P_{\mathbb{i}_{q}}\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x}. (111)

The required state |ψ0|\psi_{0}\rangle for LCU becomes

|ψ0=1sq=0Q(ΓΔt~wetwλ)q2q!𝕚q𝕜qx=0,1|𝕚q|𝕜q|x,|\psi_{0}\rangle=\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{\frac{\left(\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{2q!}}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle, (112)

where ss is the normalization factor, i.e.,

s=q=0Q(MKΓΔt~wetwλ)qq!.s=\displaystyle\sum_{q=0}^{Q}\frac{\left(MK\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{q!}. (113)

To prepare the state (112), we first prepare a state in the following form,

1sq=0Q(MKΓΔt~wetwλ)qq!|1q|0(Qq)|1q|0(Qq).\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{\frac{\left(MK\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{q!}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}. (114)

Subsequently, for each |1|1\rangle in the first QQ registers (𝕚q\mathbb{i}_{q} part), we transform it to (1/M)i=0M|i(1/\sqrt{M})\sum_{i=0}^{M}|i\rangle, and for each |1|1\rangle in the later QQ registers (𝕜q\mathbb{k}_{q} part), we transform it to (1/K)k=1K|k(1/\sqrt{K})\sum_{k=1}^{K}|k\rangle. The state (114) becomes

1sq=0Q(MKΓΔt~wetwλ)qq!𝕚q1Mq|𝕚q𝕜q1Kq|𝕜q=1sq=0Q𝕚q𝕜q(ΓΔt~wetwλ)qq!|𝕚q|𝕜q,\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{\frac{\left(MK\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{q!}}\sum_{\mathbb{i}_{q}}\frac{1}{\sqrt{M^{q}}}|\mathbb{i}_{q}\rangle\sum_{\mathbb{k}_{q}}\frac{1}{\sqrt{K^{q}}}|\mathbb{k}_{q}\rangle=\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sqrt{\frac{\left(\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{q!}}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle, (115)

which is the required |ψ0|\psi_{0}\rangle in (112), when combined with |x|x\rangle. Note that since the transformations |1(1/M)i=0M|i|1\rangle\to(1/\sqrt{M})\sum_{i=0}^{M}|i\rangle and |1(1/K)k=1K|k|1\rangle\to(1/\sqrt{K})\sum_{k=1}^{K}|k\rangle are mappings to the equally distributed state, they can be done with a column of parallel Hadamard gates, which has a gate cost 𝒪(log(MK))\mathcal{O}(\log(MK)). This provides an exponential saving comparing to 𝒪(MK)\mathcal{O}(MK) given in the main context. This saving can be apparent when MKMK becomes large. However, this can create an overhead in the required number of repetitions. Indeed, we have Γetwλ=MKmaxk,iDi(k)maxetwλ\Gamma{\rm e}^{t_{w}\lambda}=MK\max_{\forall k,i}||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda} here comparing to Γ(tw)=ikDi(k)maxetwλ(i,k)\Gamma(t_{w})=\sum_{i}\sum_{k}||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}} in the main context, and the overall simulation cost monotonically increases with this quantity. If only a few Di(k)maxetwλ(i,k)||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}} is much larger than the others such that MKmaxk,iDi(k)maxetwλikDi(k)maxetwλ(i,k)MK\max_{\forall k,i}||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda}\gg\sum_{i}\sum_{k}||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}}, then the method provided in the main context is preferred. Depending on the models, one may favors one over the other.