Quantum algorithm for time-dependent Hamiltonian simulation
by permutation expansion

Yi-Hsiang Chen
Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, USA
Department of Physics and Astronomy, and Center for Quantum Information Science & Technology, University of Southern California, Los Angeles, California 90089, USA

Amir Kalev
Information Sciences Institute, University of Southern California, Arlington, VA 22203, USA

Itay Hen
Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, USA
Department of Physics and Astronomy, and Center for Quantum Information Science & Technology, University of Southern California, Los Angeles, California 90089, USA

Abstract

We present a quantum algorithm for the dynamical simulation of time-dependent Hamiltonians. Our method involves expanding the interaction-picture Hamiltonian as a sum of generalized permutations, which leads to an integral-free Dyson series of the time-evolution operator. Under this representation, we perform a quantum simulation of the time-evolution operator by means of the linear combination of unitaries technique. We optimize the time steps of the evolution based on the Hamiltonian’s dynamical characteristics, leading to a gate count with an $L^{1}$-norm-like scaling that depends only on the norm of the interaction Hamiltonian, rather than that of the total Hamiltonian. We demonstrate that the cost of the algorithm is independent of the Hamiltonian’s frequencies, implying an advantage for systems with highly oscillating components; for time-decaying systems, the cost asymptotically does not scale with the total evolution time. In addition, our algorithm retains the near-optimal $\log(1/\epsilon)/\log\log(1/\epsilon)$ scaling with simulation error $\epsilon$ .

I Introduction

The problem of simulating quantum systems, whether to study their dynamics or to infer their salient equilibrium properties, was the original motivation for quantum computers Feynman (1982) and remains one of their major potential applications Reiher et al. (2017); Babbush et al. (2018). Classical algorithms for this problem are known to be grossly inefficient. Nonetheless, a significant fraction of the world’s computing power today is spent on solving instances of this problem, a reflection of its importance Gioiosa (2017); Sterling et al. (2018); Lee (2014).

An important class of quantum simulations that is known to be particularly challenging, and is the focus of this work, is that of time-dependent quantum processes, which are at the heart of many important quantum phenomena. These include, for example, quantum control schemes Pang and Jordan (2017), transition states of chemical reactions Butler (1998), analog quantum computers such as quantum annealers Farhi et al. (2001), and the quantum approximate optimization algorithm Farhi et al. (2014). Devising resource-efficient quantum algorithms to simulate these types of processes on quantum circuits is therefore worthwhile: it would allow these phenomena to be studied in a controllable and more illuminating manner.

In the literature, a number of quantum algorithms designed to simulate the dynamics of time-dependent quantum many-body Hamiltonians already exist. However, most of them are variants of algorithms that suit time-independent Hamiltonians and lack optimizations for dynamical ones. For example, algorithms based on the Lie-Trotter-Suzuki decomposition were developed in Refs. Wiebe et al. (2011); Poulin et al. (2011), where the complexity scales polynomially with error. More recent advances Berry et al. (2014, 2015) improve this to a logarithmic error scaling, which directly led to applications in time-dependent Hamiltonian simulations Low and Wiebe (2019); Kieferová et al. (2019). A recent study by Berry et al. (2020) improves the Hamiltonian scaling to the $L^{1}$-norm by considering the dynamical properties of the time-dependent Hamiltonian. However, these approaches mostly consist of slicing the dynamics into a sequence of ‘quasi-static’ steps, each of which implements a static quantum simulation module. In addition, all the above-mentioned algorithms assume a time-dependent oracle, a straightforward but not necessarily practical assumption that can obscure the true complexity of the simulation when physical models are considered.

The sub-optimality that characterizes existing quantum algorithms can be attributed mainly to the fact that the time-evolution operator for time-dependent Hamiltonians is a more intricate entity than its time-independent counterpart (this matter is discussed in more detail below). While in the time-independent case the Schrödinger equation can be formally integrated, the time-evolution unitary operator for time-dependent systems is given in terms of a Dyson series Dyson (1949), a perturbative expansion wherein each summand is a multi-dimensional integral over a time-ordered product of the (usually interaction-picture) Hamiltonian at different points in time. These time-ordered integrals pose multiple algorithmic and implementation challenges.

In this paper, we provide a quantum algorithm for simulating the dynamics of time-dependent Hamiltonians. The algorithm separates the Hamiltonian $H(t)$ into a sum of a static diagonal part $H_{0}$ and a dynamical part $V(t)$ , i.e., $H(t)=H_{0}+V(t)$ , and switches to the interaction picture with respect to $H_{0}$ . The target evolution operator becomes a product of an interaction-picture unitary $U_{I}(t)$ followed by a diagonal unitary ${\rm e}^{-iH_{0}t}$ that can be simulated efficiently. The interaction Hamiltonian $V(t)$ is expanded as a sum of generalized permutations, and the resulting Dyson series of the evolution operator $U_{I}(t)$ becomes an integral-free representation Kalev and Hen (2020) expressed through divided differences, a well-studied quantity de Boor (2005); Davis (1975); Mccurdy (1980); Gupta et al. (2020a); McCurdy et al. (1984); Zivcovich (2019). Divided differences can be thought of as discretized derivatives and are closely related to polynomial interpolation de Boor (2005). We refer the reader to Appendix A for a short summary of divided differences. Under this representation, we use the LCU method Berry et al. (2015) to simulate $U_{I}(t)$ with a truncated Dyson series. We find a partitioning scheme that determines the duration of the time steps along the simulation. Under this scheme, each time interval in general has a different duration, which is determined from the Hamiltonian’s dynamical characteristics and can lead to substantially fewer steps than the identical-length simulation segments typically used in quantum simulation algorithms. We analyze the gate and qubit costs of the implementation and discuss the circumstances under which our simulation algorithm improves on the state of the art. Specifically, the cost of our algorithm is independent of the oscillation frequencies of the Hamiltonian. This is in stark contrast to existing algorithms, whose cost depends on $||d{H}(t)/dt||$ , which grows with the oscillation rates. Another class of Hamiltonians for which our algorithm is preferable is those with exponential decays. We show that for these systems our algorithm asymptotically requires a finite number of steps that does not scale with the evolution time, leading in turn to an exponential saving compared with the linear scaling of existing approaches. Moreover, the dependence of the cost on the Hamiltonian norm comes mainly from the interaction Hamiltonian $V(t)$ and not the total Hamiltonian $H(t)$ Berry et al. (2020). This also indicates an advantage of the algorithm when the time-dependent Hamiltonian is dominated by a static part.

The paper is organized as follows. In Sec. II, we review the permutation expansion method that leads to an integral-free representation for the Dyson series, as introduced in Ref. Kalev and Hen (2020). In Sec. III, we present in detail the simulation algorithm that combines the integral-free expression of the evolution operator with the LCU method, and analyze the circuit costs. We highlight the main advantages of our algorithm in Sec. III.4.3. In Sec. III.5, we address the cases when the exponential-sum expansion of the time-dependence is not exact and estimate the error that stems from a finite sum approximation. Finally, we give a brief summary for our methods and results in Sec. V.

II Permutation expansion method for time-dependent Hamiltonians

In this section, we briefly describe the integral-free Dyson series expression of the evolution operator, derived from a permutation expansion of the time-dependent Hamiltonian Kalev and Hen (2020). Without loss of generality Gupta et al. (2020b), we expand a general time-dependent Hamiltonian in terms of products of time-dependent diagonal matrices, $D_{i}(t)$ , and permutation operators, $P_{i}$ , i.e.,

H(t)=\displaystyle\sum_{i=0}^{M}D_{i}(t)P_{i},

(1)

where $P_{0}\equiv\mathds{1}$ . This decomposition can be done efficiently as long as $M$ scales polynomially with $\log d$ , where $d$ is the dimension of the Hamiltonian. We decompose each diagonal matrix into a finite sum of exponential functions, i.e.,

D_{i}(t)=\sum_{k=1}^{K_{i}}\exp\left(\Lambda^{(k)}_{i}t\right)D_{i}^{(k)},

(2)

where $\Lambda^{(k)}_{i}$ and $D_{i}^{(k)}$ are complex diagonal matrices with diagonal elements being

	$\displaystyle\lambda^{(k)}_{i,z}\equiv\langle z|\Lambda^{(k)}_{i}|z\rangle,$		(3)
	$\displaystyle d_{i,z}^{(k)}\equiv\langle z|D_{i}^{(k)}|z\rangle,$		(4)

in some basis $\{|z\rangle\}$ (the basis in which $D_{0}$ is diagonal) and $K_{i}$ indicates the number of terms in the exponential decomposition for $D_{i}(t)$ . This can be done for many cases when the time dependencies are simple combinations of exponential terms. For simplicity we assume here that the $K_{i}$ ’s are finite, and address the most general time dependence in detail in Sec. III.5 and refer to various algorithms Beylkin and Monzón (2005, 2010); Braess and Hackbusch (2009); Wiscombe and Evans (1977); Norvidas (2010) for efficiently finding an exponential sum approximation of a function.
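As a small illustration of the exponential-sum form in Eq. (2), the sketch below stores each $\Lambda^{(k)}_{i}$ and $D_{i}^{(k)}$ as a vector of diagonal entries and rebuilds $D_{i}(t)$ from them. The target $D(t)=\mathrm{diag}(\cos\omega t,\,{\rm e}^{-\gamma t})$ with $K=2$, the parameter values, and the function name are hypothetical choices made for this example only.

```python
import numpy as np

def diagonal_from_exponential_sum(t, rates, weights):
    """Diagonal of D(t) = sum_k exp(Lambda^(k) t) D^(k), all stored as vectors."""
    rates = np.asarray(rates, dtype=complex)      # shape (K, dim): diagonals of Lambda^(k)
    weights = np.asarray(weights, dtype=complex)  # shape (K, dim): diagonals of D^(k)
    return np.sum(np.exp(rates * t) * weights, axis=0)

# Hypothetical example: D(t) = diag(cos(omega t), exp(-gamma t)) with K = 2 terms.
# cos(omega t) = (exp(i omega t) + exp(-i omega t)) / 2 uses both terms;
# exp(-gamma t) uses only the first term (its weight in the second term is zero).
OMEGA, GAMMA = 3.0, 0.5
RATES = [[1j * OMEGA, -GAMMA], [-1j * OMEGA, 0.0]]
WEIGHTS = [[0.5, 1.0], [0.5, 0.0]]
```

Evaluating `diagonal_from_exponential_sum(t, RATES, WEIGHTS)` at any `t` should reproduce the two target functions; an oscillating entry needs purely imaginary rates, while a decaying entry needs a negative real rate.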

For a lighter notation, we set $K_{i}=K$ for all $i$ . We can evaluate the time-evolution operator $U(t)$ corresponding to $H(t)$ as

$\displaystyle U(t)$	$\displaystyle\equiv\mathcal{T}\text{exp}\left[-i\int_{0}^{t}H(t^{\prime})dt^{\prime}\right]$
	$\displaystyle=\sum_{q=0}^{\infty}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}H(\tau_{q})\cdots H(\tau_{1})$	(5)
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\Lambda^{(k_{q})}_{i_{q}}\tau_{q}\right)D_{i_{q}}^{(k_{q})}P_{i_{q}}\cdots\exp\left(\Lambda^{(k_{1})}_{i_{1}}\tau_{1}\right)D_{i_{1}}^{(k_{1})}P_{i_{1}},$	(6)

where $\mathbb{i}_{q}=\{i_{q},\cdots,i_{1}\}$ and $\mathbb{k}_{q}=\{k_{q},\cdots,k_{1}\}$ are multi-indices. The action of $U(t)$ on a basis vector $|z\rangle$ is

$\displaystyle U(t)|z\rangle$	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\Lambda^{(k_{q})}_{i_{q}}\tau_{q}\right)D_{i_{q}}^{(k_{q})}P_{i_{q}}\cdots\exp\left(\Lambda^{(k_{1})}_{i_{1}}\tau_{1}\right)D_{i_{1}}^{(k_{1})}P_{i_{1}}|z\rangle$	(7)
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}\tau_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}\tau_{1}\right)d_{i_{q},z_{\mathbb{i}_{q}}}^{(k_{q})}\cdots d_{i_{1},z_{\mathbb{i}_{1}}}^{(k_{1})}P_{i_{q}}\cdots P_{i_{1}}|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}\tau_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}\tau_{1}\right)d_{\mathbb{i}_{q},z}^{(\mathbb{k}_{q})}P_{\mathbb{i}_{q}}|z\rangle,$

where $|z_{\mathbb{i}_{j}}\rangle\equiv P_{i_{j}}\cdots P_{i_{1}}|z\rangle$ with $j$ ranging from 1 to $q$ , and $\lambda^{(k_{j})}_{i_{j},z_{\mathbb{i}_{j}}}$ and $d_{i_{j},z_{\mathbb{i}_{j}}}^{(k_{j})}$ are the $z_{\mathbb{i}_{j}}$ th diagonal elements of $\Lambda^{(k_{j})}_{i_{j}}$ and $D^{(k_{j})}_{i_{j}}$ , respectively. $P_{\mathbb{i}_{q}}$ is a shorthand for $P_{i_{q}}\cdots P_{i_{1}}$ , and similarly $d_{\mathbb{i}_{q},z}^{(\mathbb{k}_{q})}\equiv d_{i_{q},z_{\mathbb{i}_{q}}}^{(k_{q})}\cdots d_{i_{1},z_{\mathbb{i}_{1}}}^{(k_{1})}$ . Figure 1 illustrates the cumulative action of the operators $D^{(k)}_{i}P_{i}$ on a basis vector $|z\rangle$ .

Figure 1: The actions of a sequence of generalized permutations. This figure gives a pictorial illustration of how the elements of the diagonal matrices are picked up when interleaved with permutations. In this example, we have $q=2$ .

To proceed, we use the following identity to simplify the expression in terms of divided differences. It is a variant of the Hermite-Genocchi formula de Boor (2005) applied to the exponential function.

Identity 1.

For $\lambda_{1},\cdots,\lambda_{q}\in\mathbb{C}$ ,

\int^{1}_{0}ds_{q}\cdots\int^{s_{2}}_{0}ds_{1}{\rm e}^{(\lambda_{1}s_{1}+\cdots+\lambda_{q}s_{q})}={\rm e}^{[x_{1},\cdots,x_{q},0]},

(8)

where $x_{j}=\sum_{l=j}^{q}\lambda_{l}$ and ${\rm e}^{[x_{1},\cdots,x_{q},0]}$ is the divided difference of the exponential function with inputs $x_{1},\cdots,x_{q},0$ . (The case with $q=1$ can be shown by explicit integration, and the identity follows by induction. For more details, see Ref. Kalev and Hen (2020).)
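Identity 1 can be checked numerically for small $q$. The sketch below compares a nested Gauss-Legendre quadrature of the left-hand side of Eq. (8) with a divided difference computed by the textbook recursion. It assumes pairwise distinct nodes (the recursion divides by node differences), and all function names are chosen for this illustration.

```python
import numpy as np

def exp_divided_difference(nodes):
    """Divided difference of exp over pairwise distinct nodes (standard recursion)."""
    nodes = np.asarray(nodes, dtype=complex)
    table = np.exp(nodes)
    count = len(nodes)
    for level in range(1, count):
        table = (table[1:] - table[:-1]) / (nodes[level:] - nodes[:count - level])
    return table[0]

def simplex_integral(lams, quad_order=20):
    """Nested integral on the left-hand side of Eq. (8), by Gauss-Legendre quadrature."""
    points, weights = np.polynomial.legendre.leggauss(quad_order)

    def integrate(level, upper):
        # Integrate over one variable in [0, upper]; level 0 is the innermost s_1.
        s = 0.5 * upper * (points + 1.0)
        w = 0.5 * upper * weights
        if level == 0:
            return np.sum(w * np.exp(lams[0] * s))
        return sum(wi * np.exp(lams[level] * si) * integrate(level - 1, si)
                   for wi, si in zip(w, s))

    return integrate(len(lams) - 1, 1.0)

def identity_gap(lams):
    """|LHS - RHS| of Eq. (8) for lams = (lambda_1, ..., lambda_q)."""
    lams = np.asarray(lams, dtype=complex)
    # x_j = sum_{l=j}^{q} lambda_l, followed by the extra node 0.
    nodes = np.append(np.cumsum(lams[::-1])[::-1], 0.0)
    return abs(simplex_integral(lams) - exp_divided_difference(nodes))
```

For a few complex $\lambda_{l}$ of order one, `identity_gap` should return a value at the level of floating-point round-off. The naive recursion loses accuracy when nodes nearly coincide, so it is meant only as a check, not as a robust evaluator.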

With this property, the multi-dimensional integration in the time-evolution operator can be simplified as

$\displaystyle U(t)|z\rangle$	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}\tau_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}\tau_{1}\right)d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-it)^{q}\int^{1}_{0}ds_{q}\cdots\int^{s_{2}}_{0}ds_{1}\exp\left[t\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}s_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}s_{1}\right)\right]d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle,$	(9)

where $x_{j}=\sum_{l=j}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}$ . The second equality uses the change of variables $\tau_{j}=ts_{j}$ , and the last equality follows from Identity 1 together with the identity $t^{q}{\rm e}^{[tx_{0},\cdots,tx_{q}]}={\rm e}^{t[x_{0},\cdots,x_{q}]}$ . By completing the basis, we get

U(t)=\sum_{z}U(t)|z\rangle\langle z|=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle\langle z|.

(10)

This is an integral-free expression for the unitary time-evolution operator of the time-dependent Hamiltonian $H(t)$ . We will later approximate the unitary by truncating the series at some order $q=Q$ that scales as $\mathcal{O}\left(\frac{\text{log}(1/\epsilon)}{\text{loglog}(1/\epsilon)}\right)$ Berry et al. (2015), where $\epsilon$ is the required accuracy.

III Time-dependent Hamiltonian simulation algorithm

A time-dependent Hamiltonian $H(t)$ can be expressed as a sum of two Hamiltonians—a time-independent $H_{0}$ and a dynamical $V(t)$ , i.e.,

H(t)=H_{0}+V(t).

(11)

In many practical models, $H_{0}$ represents a static and simple Hamiltonian that is often diagonal in a known basis (which we will identify with the computational basis). Hence, hereafter, we assume that $H_{0}$ is a diagonal operator with real diagonal elements in the computational basis $\{|z\rangle\}$ (in the most general case, one can set $H_{0}=0$ ). The $V(t)$ component represents the nontrivial interactions between subsystems. We switch to the interaction picture, i.e.,

\frac{d}{dt}|\psi(t)\rangle=-iH(t)|\psi(t)\rangle\to\frac{d}{dt}|\psi_{I}(t)\rangle=-iH_{I}(t)|\psi_{I}(t)\rangle,

(12)

where

|\psi_{I}(t)\rangle={\rm e}^{iH_{0}t}|\psi(t)\rangle\ \ \text{and}\ \ H_{I}(t)={\rm e}^{iH_{0}t}V(t){\rm e}^{-iH_{0}t}.

(13)

The Schrödinger-picture unitary operator $U(t)$ , satisfying $|\psi(t)\rangle=U(t)|\psi(0)\rangle$ , is equivalent to a time-ordered matrix exponential followed by a diagonal unitary, i.e.,

U(t)={\rm e}^{-iH_{0}t}\mathcal{T}\exp\left[-i\int_{0}^{t}H_{I}(t^{\prime})dt^{\prime}\right]={\rm e}^{-iH_{0}t}\mathcal{T}\exp\left[-i\int_{0}^{t}{\rm e}^{iH_{0}t^{\prime}}V(t^{\prime}){\rm e}^{-iH_{0}t^{\prime}}dt^{\prime}\right].

(14)

Hence, the simulation of $U(t)={\rm e}^{-iH_{0}t}U_{I}(t)$ consists of two parts—a complicated $U_{I}(t)$ and a simple diagonal unitary ${\rm e}^{-iH_{0}t}.$ The simulation of ${\rm e}^{-iH_{0}t}$ can be achieved with a gate cost that scales only linearly with the locality of $H_{0}$ . When we write $H_{0}=\sum^{L}_{\gamma=0}J_{\gamma}Z_{\gamma}$ , where each $Z_{\gamma}$ is some tensor product of (single-qubit) Pauli- $Z$ operators acting on at most $d$ qubits, it can be shown that the gate cost scales as $\mathcal{O}(Ld)$ Nielsen and Chuang (2011); Kalev and Hen (2021). Therefore, the main focus of our simulation is on $U_{I}$ .
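The factorization $U(t)={\rm e}^{-iH_{0}t}U_{I}(t)$ in Eq. (14) can be sanity-checked on a toy model. The sketch below approximates both time-ordered exponentials by a product of short midpoint-rule steps and compares the two sides. The single-qubit model ($H_{0}$ with diagonal $\pm 0.5$ and $V(t)=0.3\cos(2t)\,X$), the step count, and all names are assumptions made for this illustration and are unrelated to the quantum algorithm itself.

```python
import numpy as np

H0_DIAG = np.array([0.5, -0.5])                # hypothetical diagonal of H_0
PAULI_X = np.array([[0.0, 1.0], [1.0, 0.0]])

def v_of_t(t):
    """Hypothetical interaction term V(t)."""
    return 0.3 * np.cos(2.0 * t) * PAULI_X

def h_of_t(t):
    """Schrodinger-picture Hamiltonian H(t) = H_0 + V(t)."""
    return np.diag(H0_DIAG) + v_of_t(t)

def h_interaction(t):
    """H_I(t) = exp(i H_0 t) V(t) exp(-i H_0 t) for diagonal H_0."""
    phase = np.exp(1j * H0_DIAG * t)
    return (phase[:, None] * v_of_t(t)) * phase.conj()[None, :]

def step_unitary(hamiltonian, dt):
    """exp(-i * hamiltonian * dt) for a Hermitian matrix, via eigendecomposition."""
    evals, evecs = np.linalg.eigh(hamiltonian)
    return (evecs * np.exp(-1j * evals * dt)) @ evecs.conj().T

def ordered_exponential(hamiltonian_at, total_time, steps=2000):
    """Time-ordered exponential as a product of midpoint steps (later times on the left)."""
    dt = total_time / steps
    unitary = np.eye(len(H0_DIAG), dtype=complex)
    for n in range(steps):
        unitary = step_unitary(hamiltonian_at((n + 0.5) * dt), dt) @ unitary
    return unitary

def picture_mismatch(total_time=1.0):
    """Norm of U(t) - exp(-i H_0 t) U_I(t), limited only by discretization error."""
    u_schrodinger = ordered_exponential(h_of_t, total_time)
    u_interaction = ordered_exponential(h_interaction, total_time)
    u_rebuilt = np.exp(-1j * H0_DIAG * total_time)[:, None] * u_interaction
    return np.linalg.norm(u_schrodinger - u_rebuilt)
```

The mismatch shrinks as `steps` grows, since both sides carry only the second-order error of the midpoint rule.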

We next provide an overview of the simulation algorithm in Sec. III.1. In Sec. III.2, we incorporate the LCU framework with the permutation expansion method. Sec. III.3.2 provides the state preparation operation and Sec. III.4 evaluates the simulation cost for the whole procedure.

III.1 An overview of the algorithm

Our proposed simulation algorithm consists of a permutation expansion procedure for $U_{I}$ and the LCU method for the quantum simulation. In Sec. III.2, we explain in detail the essential ingredients for merging these two approaches. Before delving into technical details, we provide an overview of the algorithm in this section.

Given a time-dependent Hamiltonian $H(t)$ , we first decompose $H(t)$ into a sum of a static diagonal term $H_{0}$ (if one exists) and a dynamical term $V(t)$ . We switch to the interaction picture so that the target unitary evolution $U(T)$ over a period $T$ becomes

U(T)\equiv\mathcal{T}\exp\left[-i\int_{0}^{T}H(t)dt\right]={\rm e}^{-iH_{0}T}\mathcal{T}\exp\left[-i\int_{0}^{T}{\rm e}^{iH_{0}t}V(t){\rm e}^{-iH_{0}t}dt\right]\equiv{\rm e}^{-iH_{0}T}U_{I}(T).

(15)

Therefore, the simulation of $U(T)$ is equivalent to applying $U_{I}(T)$ followed by ${\rm e}^{-iH_{0}T}$ . Since the diagonal unitary ${\rm e}^{-iH_{0}T}$ can be efficiently simulated, we focus on $U_{I}(T)$ hereafter.

Let us expand $V(t)$ as a sum of permutations as

V(t)=\displaystyle\sum_{i=0}^{M}D_{i}(t)P_{i},

(16)

where $P_{i}$ are permutations ( $P_{0}\equiv\mathds{1}$ ) and $D_{i}(t)$ are some diagonal matrices that are expressed as exponential sums, i.e.,

D_{i}(t)=\sum_{k=1}^{K}\exp\left(\Lambda^{(k)}_{i}t\right)D_{i}^{(k)}.

(17)
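The structure of the expansion in Eq. (16) can be made concrete for a small matrix. One simple choice, assumed here purely for illustration (the text does not prescribe a particular set of permutations), is to take the $P_{i}$ to be bit-flip patterns $|z\rangle\mapsto|z\oplus s\rangle$ ; the diagonal matrix multiplying each pattern is then read off from the matrix entries. The sketch applies this to a snapshot of $V$ at a fixed time.

```python
import numpy as np

def xor_shift_permutation(dim, shift):
    """Permutation matrix P_s with P_s|z> = |z XOR s> (dim must be a power of two)."""
    perm = np.zeros((dim, dim))
    for z in range(dim):
        perm[z ^ shift, z] = 1.0
    return perm

def permutation_expansion(matrix):
    """List of (diagonal_vector, permutation) pairs with sum_s diag(d_s) P_s = matrix."""
    dim = matrix.shape[0]
    rows = np.arange(dim)
    terms = []
    for shift in range(dim):
        diagonal = matrix[rows, rows ^ shift]   # d_s[z'] = <z'|M|z' XOR s>
        if np.any(diagonal != 0):
            terms.append((diagonal, xor_shift_permutation(dim, shift)))
    return terms

def rebuild(terms):
    """Sum the generalized permutations diag(d_s) P_s back into a matrix."""
    return sum(np.diag(d) @ p for d, p in terms)
```

For a sum of a few Pauli strings only a few shifts contribute, which is the regime where $M$ stays small. A generic dense matrix produces up to `dim` terms under this choice, so the sketch illustrates the form of the expansion rather than an efficient decomposition.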

$\Lambda^{(k)}_{i}$ and $D_{i}^{(k)}$ are complex diagonal matrices. We partition $U_{I}(T)$ into $r$ segments $U_{I}(T,t_{r-1})\cdots U_{I}(t_{1},0)$ , whose respective durations $\Delta t_{w}$ , $w=0,\ldots,r-1$ , are determined by the partitioning scheme given in Sec. III.2; the time markers $t_{w}$ are defined as $t_{w}=\sum_{l=0}^{w-1}\Delta t_{l}$ . The evolution operator from $t_{w}$ to $t_{w}+\Delta t_{w}$ is expressed as

U_{I}(t_{w}+\Delta t_{w},t_{w})=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=\pm}(-i)^{q}\frac{\left(\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\lambda}\right)^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})P_{\mathbb{i}_{q}}\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x},

(18)

where we denote

\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})=\left|\left|D_{i_{q}}^{(k_{q})}\right|\right|_{\max}{\rm e}^{t_{w}\lambda_{(i_{q},k_{q})}}\cdots\left|\left|D_{i_{1}}^{(k_{1})}\right|\right|_{\max}{\rm e}^{t_{w}\lambda_{(i_{1},k_{1})}},

(19)

where $||\cdot||_{\max}$ is the max norm, $\lambda_{(i,k)}=\max_{z}\Re(\langle z|\Lambda^{(k)}_{i}|z\rangle)$ is the maximum real part of the diagonal elements of $\Lambda^{(k)}_{i}$ , and $\lambda=\max_{i,k}\{\lambda_{(i,k)}\}$ . Here, $\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},\pm}$ are diagonal unitaries, derived later in Eq. (38), and each $P_{\mathbb{i}_{q}}$ is a unique product of permutations. Note that the above evolution operators are given as a linear combination of unitaries (LCU). We provide a review of the LCU method in Appendix C. We set the truncation order $Q$ to be

Q=\mathcal{O}\left(\frac{\log(r/\epsilon)}{\log\log(r/\epsilon)}\right),

(20)

where $\epsilon$ is the overall simulation accuracy. An exact truncation order that guarantees the accuracy $\epsilon$ is $\left\lceil{\frac{\text{ln}(2r/\epsilon)}{W\left(\frac{\text{ln}(2r/\epsilon)}{e\text{ln}2}\right)}-1}\right\rceil$ , where $W(\cdot)$ is the Lambert W-function.
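The exact truncation order quoted above, in terms of the Lambert W-function, is easy to evaluate numerically. The sketch below does so with a short Newton iteration for $W$ and then compares the result against a Taylor-type tail $\sum_{q>Q}(\ln 2)^{q}/q!$ . Reading the formula as a bound on this particular tail, with target $\epsilon/(2r)$ , is an assumption suggested by the $\ln 2$ and $2r/\epsilon$ that appear in it; it is not stated in the text here. All function names are illustrative.

```python
import math

def lambert_w(x, iterations=50):
    """Principal branch of the Lambert W-function for x > 0, by Newton iteration."""
    w = math.log1p(x)  # starts above the root, so Newton converges monotonically
    for _ in range(iterations):
        ew = math.exp(w)
        w -= (w * ew - x) / (ew * (w + 1.0))
    return w

def truncation_order(num_segments, epsilon):
    """Exact truncation order: ceil( ln(2r/eps) / W( ln(2r/eps) / (e ln 2) ) - 1 )."""
    log_term = math.log(2.0 * num_segments / epsilon)
    return math.ceil(log_term / lambert_w(log_term / (math.e * math.log(2.0))) - 1.0)

def series_tail(order, x=math.log(2.0), extra_terms=60):
    """Assumed error proxy: sum_{q > order} x^q / q!, truncated after extra_terms terms."""
    return sum(x ** q / math.factorial(q)
               for q in range(order + 1, order + 1 + extra_terms))
```

Under this reading, `series_tail(truncation_order(r, eps))` should not exceed `eps / (2 * r)`, and the order grows only slowly as `eps` decreases, in line with the $\log/\log\log$ scaling of Eq. (20).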

To implement the LCU routine for each $U_{I}(t_{w}+\Delta t_{w},t_{w})$ , we require preparing a state

\displaystyle|\psi_{0}\rangle=\frac{1}{\sqrt{s}}\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}\sqrt{\frac{\left(\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\lambda}\right)^{q}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}{2q!}}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle,

(21)

where $|\mathbb{i}_{q}\rangle$ represents $Q$ quantum registers, each of dimension $M$ , $|\mathbb{k}_{q}\rangle$ represents $Q$ quantum registers, each of dimension $K$ , and $s$ is the normalization factor. Following the notation of Appendix C, we denote the state-preparation unitary by $B$ , i.e., $B|0\rangle^{\otimes 2Q+1}=|\psi_{0}\rangle$ ( $B$ is given explicitly in Sec. III.3.2). We denote by $V_{c}$ the controlled unitary such that

V_{c}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle|\psi\rangle=|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle(-i)^{q}P_{\mathbb{i}_{q}}\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x}|\psi\rangle.

(22)

The Oblivious Amplitude Amplification (OAA) involves interleaving the operator $W=(B^{\dagger}\otimes I)V_{c}(B\otimes I)$ as

A=-WRW^{\dagger}RW,

(23)

where $R\equiv I-2(|0\rangle\langle 0|\otimes I)$ . For each piece of the unitary, we implement $A$ on the extended system $|0\rangle^{\otimes(2Q+1)}|\psi\rangle$ . By construction, we have

\left|\left|A|0\rangle^{\otimes(2Q+1)}|\psi\rangle-|0\rangle^{\otimes(2Q+1)}U_{I}(t_{w}+\Delta t_{w},t_{w})|\psi\rangle\right|\right|=\mathcal{O}\left(\frac{\epsilon}{r}\right).

(24)
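The mechanism behind Eqs. (23) and (24) can be seen in a minimal toy instance of oblivious amplitude amplification. The sketch below uses one ancilla qubit and writes a random single-qubit unitary $U$ as the sum ${\rm e}^{i\pi/3}U+{\rm e}^{-i\pi/3}U$ of two unitaries with unit coefficients, so the normalization is $s=2$ and no truncation error is involved. The setup and names are assumptions for illustration; the registers of Eq. (21) are far larger.

```python
import numpy as np

def oaa_residual(seed=0):
    """Norm of A|0>|psi> - |0>U|psi> for a toy LCU with s = 2."""
    rng = np.random.default_rng(seed)

    # Random target unitary U (QR of a complex Gaussian matrix) and random state |psi>.
    gauss = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    target, _ = np.linalg.qr(gauss)
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi /= np.linalg.norm(psi)

    # U = V_plus + V_minus, since exp(i pi/3) + exp(-i pi/3) = 2 cos(pi/3) = 1.
    v_plus = np.exp(1j * np.pi / 3) * target
    v_minus = np.exp(-1j * np.pi / 3) * target

    eye = np.eye(2)
    proj0, proj1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
    hadamard = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)

    prepare = np.kron(hadamard, eye)                             # B tensor I
    select = np.kron(proj0, v_plus) + np.kron(proj1, v_minus)    # controlled unitary V_c
    w = prepare.conj().T @ select @ prepare                      # W = (B^dag x I) V_c (B x I)
    reflect = np.eye(4) - 2.0 * np.kron(proj0, eye)              # R = I - 2 |0><0| x I
    amplify = -w @ reflect @ w.conj().T @ reflect @ w            # A = -W R W^dag R W

    ancilla0 = np.array([1.0, 0.0])
    return np.linalg.norm(amplify @ np.kron(ancilla0, psi) - np.kron(ancilla0, target @ psi))
```

With $s=2$ , a single application of $W$ leaves amplitude $1/2$ on the ancilla state $|0\rangle$ , and the sequence $-WRW^{\dagger}RW$ rotates it to amplitude one, so the residual should vanish up to round-off.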

This means that applying $A$ effectively performs the unitary $U_{I}(t_{w}+\Delta t_{w},t_{w})$ on the main system $|\psi\rangle$ , with error $\mathcal{O}(\epsilon/r)$ . Combining $r$ pieces of the procedure, it effectively simulates $U_{I}(T)$ with overall error $\mathcal{O}(\epsilon)$ , i.e.,

\left|\left|A_{r-1}\cdots A_{1}A_{0}|0\rangle^{\otimes(2Q+1)}|\psi\rangle-|0\rangle^{\otimes(2Q+1)}U_{I}(T)|\psi\rangle\right|\right|=\mathcal{O}\left(\epsilon\right),

(25)

where $A_{w}$ are the OAA operators for the corresponding piece of evolution. This implies that applying the sequence of $A$ s followed by the circuit for ${\rm e}^{-iH_{0}T}$ can approach the action of $U(T)$ to an arbitrary accuracy.

III.2 Permutation expansion for $U_{I}(t)$

In this section, we give a thorough introduction to the permutation expansion of the Dyson series and the conditions arising from implementing the LCU method. We focus on the interaction-picture unitary $U_{I}(t)$ , i.e., the time-ordered operator in Eq. (14). Using the expansions introduced in Eqs. (16) and (17), we get

	$\displaystyle U_{I}(t)\equiv\mathcal{T}\exp\left[-i\int_{0}^{t}{\rm e}^{iH_{0}t^{\prime}}V(t^{\prime}){\rm e}^{-iH_{0}t^{\prime}}dt^{\prime}\right]$
	$\displaystyle=\sum_{q=0}^{\infty}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}{\rm e}^{iH_{0}\tau_{q}}V(\tau_{q}){\rm e}^{-iH_{0}\tau_{q}}\cdots{\rm e}^{iH_{0}\tau_{1}}V(\tau_{1}){\rm e}^{-iH_{0}\tau_{1}}$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}{\rm e}^{iH_{0}\tau_{q}}{\rm e}^{\Lambda^{(k_{q})}_{i_{q}}\tau_{q}}D_{i_{q}}^{(k_{q})}P_{i_{q}}{\rm e}^{-iH_{0}\tau_{q}}\cdots{\rm e}^{iH_{0}\tau_{1}}{\rm e}^{\Lambda^{(k_{1})}_{i_{1}}\tau_{1}}D_{i_{1}}^{(k_{1})}P_{i_{1}}{\rm e}^{-iH_{0}\tau_{1}},$		(26)

We denote the basis in which $H_{0}$ is diagonal by $\{|z\rangle\}$ and its diagonal elements by $E_{z}=\langle z|H_{0}|z\rangle$ . The action of $U_{I}(t)$ on a basis vector $|z\rangle$ becomes

	$\displaystyle U_{I}(t)|z\rangle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}$		(27)
	$\displaystyle\times\exp\left[\left(iE_{z_{\mathbb{i}_{q}}}-iE_{z_{\mathbb{i}_{q-1}}}+\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}\right)\tau_{q}+\cdots+\left(iE_{z_{\mathbb{i}_{1}}}-iE_{z}+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}\right)\tau_{1}\right]d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle,$

where $E_{z_{\mathbb{i}_{j}}}$ is the $z_{\mathbb{i}_{j}}$ th diagonal element of $H_{0}$ , i.e., $E_{z_{\mathbb{i}_{j}}}=\langle z_{\mathbb{i}_{j}}|H_{0}|z_{\mathbb{i}_{j}}\rangle$ , and $|z_{\mathbb{i}_{j}}\rangle=P_{\mathbb{i}_{j}}|z\rangle$ with $P_{\mathbb{i}_{j}}=P_{i_{j}}\cdots P_{i_{1}}$ . By Identity 1, this can be further simplified as

\displaystyle U_{I}(t)|z\rangle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t[x_{1},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle,

(28)

where

x_{j}=i\left(E_{z_{\mathbb{i}_{q}}}-E_{z_{\mathbb{i}_{j-1}}}\right)+\sum_{l=j}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}.

(29)
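Equations (28) and (29) can be tested end to end on a toy model. The sketch below takes a single qubit with $H_{0}=\mathrm{diag}(E_{0},E_{1})$ and a decaying drive $V(t)=g\,{\rm e}^{-\gamma t}X$ , so that $M=K=1$ , $\lambda=-\gamma$ , $d=g$ and $P_{1}=X$ . It sums the integral-free series up to a fixed order and compares it with a brute-force product of short time steps. The model, parameter values, truncation order and function names are all assumptions made for this illustration.

```python
import numpy as np

def exp_divided_difference(nodes):
    """Divided difference of exp over pairwise distinct nodes (standard recursion)."""
    nodes = np.asarray(nodes, dtype=complex)
    table = np.exp(nodes)
    count = len(nodes)
    for level in range(1, count):
        table = (table[1:] - table[:-1]) / (nodes[level:] - nodes[:count - level])
    return table[0]

def dyson_column(z, t, energies, g, gamma, max_order=12):
    """U_I(t)|z> from Eqs. (28)-(29) for V(t) = g exp(-gamma t) X on one qubit."""
    column = np.zeros(2, dtype=complex)
    for q in range(max_order + 1):
        labels = [(z + j) % 2 for j in range(q + 1)]   # basis label z_j after j bit flips
        # Eq. (29): x_j = i (E_{z_q} - E_{z_{j-1}}) + sum_{l=j}^{q} (-gamma), then the node 0.
        nodes = [1j * (energies[labels[q]] - energies[labels[j - 1]]) - (q - j + 1) * gamma
                 for j in range(1, q + 1)] + [0.0]
        # exp(t[x_1..x_q,0]) = t^q exp[t x_1, ..., t x_q, 0]; the product of d's is g^q.
        weight = (-1j * g * t) ** q * exp_divided_difference(t * np.array(nodes))
        column[labels[q]] += weight
    return column

def reference_unitary(t, energies, g, gamma, steps=4000):
    """Brute-force U_I(t): ordered product of midpoint steps exp(-i H_I dt)."""
    dt = t / steps
    gap = energies[0] - energies[1]
    unitary = np.eye(2, dtype=complex)
    for n in range(steps):
        tau = (n + 0.5) * dt
        off = g * np.exp(-gamma * tau) * np.exp(1j * gap * tau)   # <0|H_I(tau)|1>
        h_int = np.array([[0.0, off], [np.conj(off), 0.0]])
        evals, evecs = np.linalg.eigh(h_int)
        unitary = (evecs * np.exp(-1j * evals * dt)) @ evecs.conj().T @ unitary
    return unitary
```

Stacking `dyson_column(0, ...)` and `dyson_column(1, ...)` as columns gives a matrix that should agree with `reference_unitary` up to the discretization error of the reference. Because $\gamma>0$ , the nodes have distinct real parts, which keeps the simple recursion well defined.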

III.3 The LCU routine

To implement the LCU method for a quantum simulation of $U_{I}(T)$ , we first decompose the overall simulation duration $T$ into $r$ pieces in sequence, i.e.,

U_{I}(T)=U_{I}(T,t_{r-1})U_{I}(t_{r-1},t_{r-2})\cdots U_{I}(t_{1},0)=\prod^{r-1}_{w=0}U_{I}(t_{w}+\Delta t_{w},t_{w}),

(30)

where the operators in the product of the last equation are understood to be ordered, $t_{w+1}=t_{w}+\Delta t_{w}$ and $t_{0}\equiv 0$ and $t_{r}\equiv T$ . The number of steps, $r$ , and the step size, $\Delta t_{w}$ , are to be determined. When acting on a computational basis state, each piece in the decomposition can be written as

	$\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle=\mathcal{T}\text{exp}\left[-i\int_{t_{w}}^{t_{w}+\Delta t_{w}}H_{I}(t^{\prime})dt^{\prime}\right]|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t_{w}+\Delta t_{w}}_{t_{w}}d\tau_{q}\cdots\int^{\tau_{2}}_{t_{w}}d\tau_{1}\exp\Bigg[\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\tau_{l}\Bigg]d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\exp\left[t_{w}\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\right]$
	$\displaystyle\times\int^{\Delta t_{w}}_{0}d\tau^{\prime}_{q}\cdots\int^{\tau^{\prime}_{2}}_{0}d\tau^{\prime}_{1}\exp\Bigg[\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\tau^{\prime}_{l}\Bigg]d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\exp\left[t_{w}\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\right]{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{-it_{w}(E_{z_{\mathbb{i}_{0}}}-E_{z_{\mathbb{i}_{q}}})}{\rm e}^{t_{w}\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle,$		(31)

which has the same form as Eq. (28) except that the integration intervals are shifted (with $E_{z_{\mathbb{i}_{0}}}\equiv E_{z}$ ). We can denote

d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})=d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}{\rm e}^{t_{w}\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}},

which leads to

\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{-it_{w}(E_{z_{\mathbb{i}_{0}}}-E_{z_{\mathbb{i}_{q}}})}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})P_{\mathbb{i}_{q}}|z\rangle,

(32)

To formulate the above expression in terms of a linear combination of unitaries, we need to evaluate the norms of ${\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}$ and $d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})$ . The norm of $d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})$ is bounded by

\left|d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})\right|\leq||D^{(k_{q})}_{i_{q}}||_{\max}{\rm e}^{t_{w}\lambda_{(i_{q},k_{q})}}\cdots||D^{(k_{1})}_{i_{1}}||_{\max}\,{\rm e}^{t_{w}\lambda_{(i_{1},k_{1})}}=\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})\,.

(33)

The norm of ${\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}$ can be bounded by using the following identity.

Identity 2.

For any $q+1$ complex values $x_{0},\cdots,x_{q}\in\mathbb{C}$ ,

\left|{\rm e}^{[x_{0},\cdots,x_{q}]}\right|\leq{\rm e}^{[\Re(x_{0}),\cdots,\Re(x_{q})]}=\frac{{\rm e}^{\xi}}{q!},

(34)

where $\Re(\cdot)$ denotes the real part of an input and $\xi\in\left[\min\{\Re(x_{0}),\cdots,\Re(x_{q})\},\max\{\Re(x_{0}),\cdots,\Re(x_{q})\}\right]$ .

The proof can be found in Appendix A. From Identity 2, we show in Appendix B that

\left|{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}\right|\leq{\rm e}^{\Delta t_{w}[q\lambda,(q-1)\lambda,\ldots,\lambda,0]}=\frac{1}{q!}\left(\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\lambda}\right)^{q}\equiv\frac{\widetilde{\Delta t}_{w}^{q}}{q!},

(35)

where we denoted the quantity

\widetilde{\Delta t}_{w}\equiv\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\lambda}.

(36)
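The closed form in Eq. (35) and the effective step of Eq. (36) can be checked numerically for small $q$ . The sketch below evaluates ${\rm e}^{\Delta t_{w}[q\lambda,\ldots,\lambda,0]}$ by the textbook divided-difference recursion and compares it with $\widetilde{\Delta t}_{w}^{q}/q!$ . The function names are illustrative, the recursion assumes distinct nodes, and `effective_step` assumes $\lambda\neq 0$ .

```python
import math
import numpy as np

def exp_divided_difference(nodes):
    """Divided difference of exp over pairwise distinct nodes (standard recursion)."""
    nodes = np.asarray(nodes, dtype=complex)
    table = np.exp(nodes)
    count = len(nodes)
    for level in range(1, count):
        table = (table[1:] - table[:-1]) / (nodes[level:] - nodes[:count - level])
    return table[0]

def scaled_divided_difference(dt, xs):
    """exp(dt[x_0, ..., x_q]), defined as dt^q * exp[dt x_0, ..., dt x_q]."""
    xs = np.asarray(xs, dtype=complex)
    return dt ** (len(xs) - 1) * exp_divided_difference(dt * xs)

def effective_step(dt, lam):
    """Effective step of Eq. (36): (exp(dt * lam) - 1) / lam, for lam != 0."""
    return math.expm1(dt * lam) / lam

def closed_form_bound(dt, lam, q):
    """Right-hand side of Eq. (35): effective_step^q / q!."""
    return effective_step(dt, lam) ** q / math.factorial(q)
```

For $\lambda<0$ (a decaying Hamiltonian) the effective step is shorter than $\Delta t_{w}$ and saturates at $1/|\lambda|$ for long segments, while for $\lambda>0$ it is longer; as $\lambda\to 0$ it reduces to $\Delta t_{w}$ .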

With these bounds, the factors in the expansion form in Eq. (32) can be written as

$\displaystyle{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})$	$\displaystyle=\frac{{\widetilde{\Delta t}_{w}}^{q}}{q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})$	(37)
	$\displaystyle\times\left(\frac{{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}}{\widetilde{\Delta t}_{w}^{q}/q!}\frac{{\rm e}^{-it_{w}(E_{z_{\mathbb{i}_{0}}}-E_{z_{\mathbb{i}_{q}}})}d^{(k_{q})}_{i_{q},z}\cdots d^{(k_{1})}_{i_{1},z}\,{\rm e}^{t_{w}\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}}{\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}\right)$
	$\displaystyle=\frac{\widetilde{\Delta t}_{w}^{q}}{q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})\cos\left[\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}\right]{\rm e}^{i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}=\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})\left({\rm e}^{i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}+{\rm e}^{-i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}\right),$

where

\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}=\cos^{-1}\left[\left|\frac{{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})}{\frac{\widetilde{\Delta t}_{w}^{q}}{q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}\right|\right]

and

\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}=\arg{\left[\frac{{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}(t_{w})}{\frac{\widetilde{\Delta t}_{w}^{q}}{q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}\right]}.

The evolution operator from $t_{w}$ to $t_{w}+\Delta t_{w}$ becomes

$\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})$	$\displaystyle=\sum_{z}U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle\langle z|$
	$\displaystyle=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})\left({\rm e}^{i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}+{\rm e}^{-i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}\right)P_{\mathbb{i}_{q}}|z\rangle\langle z|$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=\pm}(-i)^{q}\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})P_{\mathbb{i}_{q}}\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x},$	(38)

where $\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},\pm}$ are diagonal unitaries with diagonal elements being ${\rm e}^{i\left(\pm\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}\right)}$ .
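The step from Eq. (37) to Eq. (38) relies on the elementary fact that any complex number of modulus at most one is the average of two unit-modulus phases. A minimal sketch of that decomposition follows; the function name `two_phase_split` is hypothetical.

```python
import cmath
import math


def two_phase_split(c):
    """Write c with |c| <= 1 as (u_plus + u_minus) / 2 with |u_plus| = |u_minus| = 1.

    phi = arccos|c| and theta = arg(c), mirroring the definitions of
    phi and theta that follow Eq. (37).
    """
    modulus = abs(c)
    if modulus > 1.0 + 1e-12:
        raise ValueError("the decomposition requires |c| <= 1")
    phi = math.acos(min(1.0, modulus))
    theta = cmath.phase(c)
    return cmath.exp(1j * (theta + phi)), cmath.exp(1j * (theta - phi))
```

This is why the bound of Eq. (35) matters: it guarantees that the ratio inside the $\cos^{-1}$ never exceeds one.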

To implement the LCU method for simulating $U_{I}(t_{w}+\Delta t_{w},t_{w})$ , we require a preparation of the state

	$\displaystyle|\psi_{0}\rangle$	$\displaystyle=\frac{1}{\sqrt{s}}\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}\sqrt{\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}|i_{1}\rangle\cdots|i_{q}\rangle\otimes|0\rangle^{\otimes(Q-q)}|k_{1}\rangle\cdots|k_{q}\rangle\otimes|0\rangle^{\otimes(Q-q)}|x\rangle$
		$\displaystyle\equiv\frac{1}{\sqrt{s}}\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}\sqrt{\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle,$		(39)

s=\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})\equiv\sum_{q=0}^{Q}\frac{(\Gamma(t_{w})\widetilde{\Delta t}_{w})^{q}}{q!},

(40)

where we define $\Gamma(t_{w})$ as

\Gamma(t_{w})\equiv\sum_{i=0}^{M}\sum_{k=1}^{K}\left|\left|D^{(k)}_{i}\right|\right|_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}}\,,

(41)

and we note that $\Gamma(t_{w})$ is an upper bound on the max-norm of the interaction Hamiltonian at time $t_{w}$ , $\Gamma(t_{w})\geq\|V(t_{w})\|_{\max}$ . The quantity $\Gamma(t_{w})$ is related to the energy strength in a typical LCU setup Berry et al. (2015). In Appendix D, we provide an alternative way that uses a larger bound $\Gamma=MK\max_{\forall k,i}||D^{(k)}_{i}||_{\max}$ , which leads to an exponential saving for the state preparation. We proceed with $\Gamma(t_{w})$ hereafter.

The OAA step in the LCU method requires $s\approx 2$ . This leads to

\Gamma(t_{w})\widetilde{\Delta t}_{w}=\Gamma(t_{w})\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\lambda}=\text{ln}2,

(42)

and Eq. (40) becomes a truncated Taylor expansion of 2 up to order $Q$ , i.e., $2\approx\sum_{q=0}^{Q}\frac{(\text{ln}2)^{q}}{q!}$ . If we require $|s-2|\leq\epsilon/r$ , where $r$ is the total number of steps and $\epsilon$ is some positive number, then the simulation error for each $U_{I}(t_{w}+\Delta t_{w},t_{w})$ is also within $\epsilon/r$ . The required truncation order with this accuracy scales as

Q=\mathcal{O}\left(\frac{\log(r/\epsilon)}{\log\log(r/\epsilon)}\right).

(43)
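For a given tolerance $\epsilon/r$, the smallest admissible truncation order can be found by direct summation of the series for $2={\rm e}^{\ln 2}$. The sketch below assumes double-precision arithmetic, which is why very small tolerances are rejected; the function name `truncation_order` is hypothetical.

```python
import math


def truncation_order(tolerance):
    """Smallest Q with |2 - sum_{q<=Q} (ln 2)^q / q!| <= tolerance.

    `tolerance` plays the role of epsilon / r in the text.
    """
    if tolerance < 1e-12:
        raise ValueError("tolerance is below what double precision can resolve")
    partial, term, order = 1.0, 1.0, 0
    while abs(2.0 - partial) > tolerance:
        order += 1
        term *= math.log(2.0) / order  # term is now (ln 2)^order / order!
        partial += term
    return order
```

Because the terms decay factorially, tightening the tolerance by several orders of magnitude raises $Q$ only by a few units, in line with Eq. (43).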

III.3.1 Time partitioning and number of time steps

The condition in Eq. (42) imposes a constraint on the next step size $\Delta t_{w}$ given the current time $t_{w}$ ,

\Delta t_{w}=\frac{1}{\lambda}\ln\left(1+\frac{\lambda}{\Gamma(t_{w})}\ln 2\right)\,.

(44)

Remembering that $\Gamma(t_{w})$ is a function of $t_{w}=\sum_{l=0}^{w-1}\Delta t_{l}$ , this condition determines the schedule, as every $\Delta t_{w}$ is determined by the preceding time steps.

Special care is needed when setting the last time step, since $t_{w}+\Delta t_{w}$ may exceed the total desired evolution time $T$ . Whenever $t_{w+1}$ is found to be greater than $T$ (or the argument of the $\ln(\cdot)$ in Eq. (44) is found to be non-positive), one should replace the bound $\Gamma(t_{w})$ with a larger bound $\tilde{\Gamma}(t_{w})=\lambda\ln 2/({\rm e}^{\lambda\Delta t_{w}}-1)$ and set the final step $\Delta t_{w}=T-t_{w}$ .
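The partitioning rule of Eq. (44), together with the final-step rule, can be sketched as a short loop. This is an illustrative sketch only: the function name `step_schedule` is hypothetical, `gamma` is assumed to be a user-supplied callable returning a strictly positive $\Gamma(t)$, and the $\lambda=0$ case is handled through its limit $\Delta t_{w}=\ln 2/\Gamma(t_{w})$.

```python
import math

LN2 = math.log(2.0)


def step_schedule(gamma, lam, total_time):
    """Return the list of step sizes dt_w generated by Eq. (44).

    gamma:      callable t -> Gamma(t) > 0 (assumed supplied by the user)
    lam:        the rate lambda entering Eq. (36)
    total_time: the evolution time T > 0
    """
    steps, t = [], 0.0
    while True:
        g = gamma(t)
        if lam == 0.0:
            dt = LN2 / g                        # lambda -> 0 limit of Eq. (44)
        else:
            ratio = lam * LN2 / g
            # A non-positive log argument means no finite regular step exists.
            dt = math.log1p(ratio) / lam if ratio > -1.0 else math.inf
        if t + dt >= total_time:                # final step: dt = T - t_w
            steps.append(total_time - t)
            return steps
        steps.append(dt)
        t += dt
```

The length of the returned list is the number of segments $r$.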

Let us now examine the dependence of $\Delta t_{w}$ on $\Gamma(t_{w})$ in order to bound the number of time steps (equivalently, the number of repetitions) $r$ required for the entire time evolution. We distinguish between three cases. (i) When $\lambda=0$ , we have $\Delta t_{w}\Gamma(t_{w})=\ln 2$ , as can be seen by taking the $\lambda\to 0$ limit of Eq. (44). This is similar to the time-independent case, although we note that a vanishing maximal $\lambda$ is also compatible with time-dependent oscillations. (ii) In the case where $\lambda<0$ , i.e., a system with a decaying $\Gamma(t_{w})$ , we have $\Delta t_{w}\Gamma(t_{w})\geq\ln 2$ , i.e., the time steps are longer than $\ln 2/\Gamma(t_{w})$ . Furthermore, the total number of steps $r$ is finite even for an arbitrarily large evolution time $T$ . Since $\Gamma(t_{w})$ approaches zero asymptotically, for a large enough time $t_{w*}$ we have $\Gamma(t_{w*})<|\lambda|\ln 2$ , i.e., the argument of the logarithm in Eq. (44) becomes negative. This signals that the final step has been reached: the bound should be modified to $\tilde{\Gamma}(t_{w*})=\lambda\ln 2/({\rm e}^{\lambda\Delta t_{w*}}-1)$ and $\Delta t_{w*}=T-t_{w*}$ becomes the final step. (iii) In the case where $\lambda>0$ (an amplified $\Gamma(t_{w})$ ), we have $\Gamma(t_{w})\gg\lambda$ at large simulation times $t_{w}$ . From Eq. (44), we have $\Delta t_{w}\to\ln 2/\Gamma(t_{w})$ in this limit.

We see that (for large enough simulation times) the time step $\Delta t_{w}$ is inversely proportional to $\Gamma(t_{w})$ which upper-bounds the max-norm of the interaction Hamiltonian at time $t_{w}$ . Therefore, we have $\sum^{r-1}_{w=0}\Gamma(t_{w})\Delta t_{w}\gtrsim r\ln 2$ , which implies $r\lesssim\sum^{r-1}_{w=0}\Gamma(t_{w})\Delta t_{w}/\ln 2$ .

It is instructive to compare the above scaling with that of Ref. Berry et al. (2020), in which the simulation algorithm is said to have an $L^{1}$ -norm scaling, i.e., an algorithm cost scaling linearly with $\int_{0}^{t}d\tau H_{\max}(\tau)$ up to logarithmic factors. Under a similar intuition, our algorithm has a discretized $L^{1}$ -norm-like scaling with $\sum^{r-1}_{w=0}\Gamma(t_{w})\Delta t_{w}$ . In our case, however, $\Gamma(t_{w})$ is related to the norm of the interaction Hamiltonian only.

III.3.2 State preparation

In this subsection, we provide a procedure to prepare the state $|\psi_{0}\rangle$ given in Eq. (39). First, we initialize a state $|0\rangle^{\otimes Q}|0\rangle^{\otimes Q}|0\rangle$ , where each of the first $Q$ registers has dimension $M$ (responsible for the $|\mathbb{i}_{q}\rangle$ part), each of the latter $Q$ registers has dimension $K$ (responsible for the $|\mathbb{k}_{q}\rangle$ part), and the last register is a qubit (for the cosine decomposition). For simplicity, we can perform a Hadamard gate on the last qubit and then omit its dependence in the following discussion. The next step is to create a state of the following form,

\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{s_{q}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)},

(45)

where $s_{q}\equiv\left(\Gamma(t_{w})\widetilde{\Delta t}_{w}\right)^{q}/q!$ . For each $|1\rangle$ from the first $Q$ registers (the $|\mathbb{i}_{q}\rangle$ part) and the corresponding $|1\rangle$ in the latter $Q$ registers (the $|\mathbb{k}_{q}\rangle$ part), we make

|1\rangle|1\rangle\to\sum_{i=0}^{M}\sum_{k=1}^{K}\sqrt{\frac{||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}}}{\Gamma(t_{w})}}|i\rangle|k\rangle.

(46)

Then Eq. (45) becomes

\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{s_{q}}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sqrt{\frac{\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}{(\Gamma(t_{w}))^{q}}}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle=\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sqrt{\frac{(\widetilde{\Delta t}_{w})^{q}}{q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle,

(47)

which is the required $|\psi_{0}\rangle$ in Eq. (39), when combined with $|x\rangle$ .

Next, we provide a process that produces the state in Eq. (45). First, we perform a rotation that takes the first register in the $|\mathbb{i}_{q}\rangle$ part to

|0\rangle\to\frac{1}{\sqrt{s}}\left(|0\rangle+\sqrt{\displaystyle\sum_{q=1}^{Q}s_{q}}|1\rangle\right),

(48)

and perform a control gate from the first register to the second (both in the $|\mathbb{i}_{q}\rangle$ part) such that

	$\displaystyle\frac{1}{\sqrt{s}}\left(|0\rangle+\sqrt{\displaystyle\sum_{q=1}^{Q}s_{q}}|1\rangle\right)|0\rangle$
	$\displaystyle\to\frac{1}{\sqrt{s}}\Bigg{[}|00\rangle+\sqrt{\sum_{q=1}^{Q}s_{q}}|1\rangle\frac{1}{\sqrt{\displaystyle\sum_{q=1}^{Q}s_{q}}}\Bigg{(}\sqrt{s_{1}}|0\rangle+\sqrt{\displaystyle\sum_{q=2}^{Q}s_{q}}|1\rangle\Bigg{)}\Bigg{]}$
	$\displaystyle=\frac{1}{\sqrt{s}}\left(|00\rangle+\sqrt{s_{1}}|10\rangle+\sqrt{\displaystyle\sum_{q=2}^{Q}s_{q}}|11\rangle\right).$		(49)

Continuing this procedure for the rest of the registers in the $|\mathbb{i}_{q}\rangle$ part, the state becomes

\displaystyle|0\rangle^{\otimes Q}\to\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{s_{q}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}.

(50)

At this step, we perform CNOT operations (strictly speaking, these are not standard CNOTs but higher-dimensional operations that act like a CNOT on the first two levels) from the first $Q$ registers ( $|\mathbb{i}_{q}\rangle$ part) to the last $Q$ registers ( $|\mathbb{k}_{q}\rangle$ part) correspondingly, e.g., a CNOT from the first register in the $|\mathbb{i}_{q}\rangle$ part to the first register in the $|\mathbb{k}_{q}\rangle$ part, and so forth. Finally, we have

\displaystyle\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{s_{q}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}|0\rangle^{\otimes Q}\to\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{s_{q}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)},

(51)

which gives Eq. (45) as required. The estimated gate cost for the preparation of $|\psi_{0}\rangle$ is $\mathcal{O}(QMK)$ . More detail regarding the cost is provided in Sec. III.4.1.
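The cascade of Eqs. (48)-(50) can be checked classically by tracking probabilities instead of amplitudes. In the sketch below, `x` stands for $\Gamma(t_{w})\widetilde{\Delta t}_{w}$ and `unary_cascade` is a hypothetical helper name; the code only verifies that the chained conditional rotations reproduce the weights $s_{q}/s$ and does not model the quantum registers themselves.

```python
import math


def unary_cascade(x, max_order):
    """Return (s_q list, excitation probabilities, resulting distribution).

    p_excite[j-1] is the probability that register j is rotated to |1>,
    given that register j-1 is |1>; it equals
    sum_{q >= j} s_q / sum_{q >= j-1} s_q, as in Eqs. (48) and (49).
    """
    s_q = [x**q / math.factorial(q) for q in range(max_order + 1)]
    tails = [sum(s_q[j:]) for j in range(max_order + 2)]  # tails[Q+1] = 0
    p_excite = [tails[j] / tails[j - 1] for j in range(1, max_order + 1)]
    distribution = []
    for q in range(max_order + 1):
        prob = 1.0
        for j in range(q):                 # first q registers end in |1>
            prob *= p_excite[j]
        if q < max_order:                  # register q+1 stays in |0>
            prob *= 1.0 - p_excite[q]
        distribution.append(prob)
    return s_q, p_excite, distribution
```

Each entry of `p_excite` fixes the angle of one (controlled) rotation, so the cascade uses $Q$ rotations in total.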

III.3.3 Implementation of the controlled unitaries

The second ingredient of the LCU routine is the construction of the controlled operation

V_{c}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle|\psi\rangle=|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle(-i)^{q}P_{\mathbb{i}_{q}}\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x}|\psi\rangle.

(52)

Taking an approach similar to that taken in Ref. Kalev and Hen (2021), we first note that Eq. (52) indicates that $V_{c}$ can be carried out in two steps: a controlled-phase operation ( $V_{c\Phi}$ ) followed by a controlled-permutation operation ( $V_{cP}$ ).

The controlled-phase operation $V_{c\Phi}$ requires a somewhat intricate calculation of non-trivial phases. We therefore carry out the required algebra with the help of additional ancillary registers and then ‘push’ the results into phases. The latter step is done by employing the unitary

\displaystyle U_{\text{ph}}|\varphi\rangle={\rm e}^{-i\varphi}|\varphi\rangle\,,

(53)

whose implementation cost depends only on the precision with which we specify $\varphi$ and is independent of Hamiltonian parameters Nielsen and Chuang (2011) (see Ref. Kalev and Hen (2021) for a complete derivation). With the help of the (controlled) unitary transformation

V_{\chi\phi}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle|z\rangle|0\rangle=|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle|z\rangle|\chi_{{\bf i}_{q}}^{(z)}+(-1)^{k}\phi_{{\bf i}_{q}}^{(z)}\rangle\,,

(54)

we can write $V_{c{\Phi}}=V_{\chi\phi}^{\dagger}(\mathds{1}\otimes U_{\text{ph}})V_{\chi\phi}$ , so that

V_{c{\Phi}}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle|z\rangle=|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle\Phi_{{\bf i}_{q}}^{(k)}|z\rangle\,.

(55)

Note that $V_{\chi\phi}$ sends computational basis states to computational basis states. We provide an explicit construction of $V_{\chi\phi}$ in Ref. Kalev and Hen (2021). We find that its gate cost is $\mathcal{O}(QM(k_{od}+\log M)+QMK(C_{D}+C_{\Delta H_{0}}+C_{\Lambda}))$ and its qubit cost is $\mathcal{O}(Q\log(MK))$ . Additional details are provided in Sec. III.4.1.

The construction of $V_{cP}$ is carried out by a repeated execution of the simpler unitary transformation $U_{p}|i\rangle|z\rangle=|i\rangle P_{i}|z\rangle$ . Recall that $P_{i}$ are the off-diagonal permutation operators that appear in the Hamiltonian. The gate cost of $U_{p}$ is therefore ${\cal O}(M(k_{\rm od}+\log M))$ . Additional details may be found in Ref. Kalev and Hen (2021).

III.4 Algorithm cost

We next analyze the circuit costs for the permutation expansion algorithm. Recall that the simulation of $U(T)$ consists of two operations— ${\rm e}^{-iH_{0}T}$ and $U_{I}(T)$ . The diagonal unitary ${\rm e}^{-iH_{0}T}$ can be implemented efficiently with a gate cost that scales linearly with the system size. To observe this, note that $H_{0}$ is a diagonal matrix with real diagonal elements and can be written as $H_{0}=\sum_{\gamma=0}^{L}J_{\gamma}Z_{\gamma}$ , where each $Z_{\gamma}$ is a tensor product of Pauli- $Z$ ’s ( $Z\otimes\cdots\otimes Z$ ) acting on at most $d$ qubits (weight- $d$ operators). Hence, we can write ${\rm e}^{-iH_{0}T}=\prod_{\gamma=0}^{L}{\rm e}^{-iJ_{\gamma}Z_{\gamma}T}$ . Each ${\rm e}^{-iJ_{\gamma}Z_{\gamma}T}$ can be simulated using at most $2d$ CNOT gates with a single ancillary qubit. For example, let $Z_{\gamma}$ be a weight- $m$ ( $m\leq d$ ) operator, then ${\rm e}^{-iJ_{\gamma}Z_{\gamma}T}$ can be implemented as

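[Circuit diagram not reproduced: CNOT gates from each of $|z_{1}\rangle,\ldots,|z_{m}\rangle$ onto an ancillary qubit $|0\rangle$ , a single-qubit phase rotation ${\rm e}^{-iJ_{\gamma}TZ}$ on the ancilla, and the same CNOT gates applied again to reset the ancilla.]

where $|z_{1}\rangle\cdots|z_{m}\rangle$ are the qubits $Z_{\gamma}$ acts on and $|0\rangle$ is an ancillary qubit for extracting the phase. There are $\mathcal{O}(L)$ such implementations in total for ${\rm e}^{-iH_{0}T}$ . Therefore, the total gate cost is $\mathcal{O}(Ld)$ and the qubit cost is $\mathcal{O}(1)$ . Since $L$ usually grows linearly with the system size, the gate cost also scales linearly.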
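A parity-based realization consistent with the stated cost ( $2m\leq 2d$ CNOT gates and one ancilla) can be verified with small matrices. The sketch below is an assumption about the circuit layout rather than a reproduction of the original figure: it copies the parity of the $m$ data qubits onto the ancilla, applies ${\rm e}^{-i\,\text{angle}\,Z}$ there (with angle $=J_{\gamma}T$ ), and uncomputes. The function names are hypothetical.

```python
import numpy as np


def cnot(n_qubits, control, target):
    """Permutation matrix of a CNOT; qubit j corresponds to bit j of the index."""
    dim = 2**n_qubits
    u = np.zeros((dim, dim))
    for b in range(dim):
        image = b ^ (1 << target) if (b >> control) & 1 else b
        u[image, b] = 1.0
    return u


def z_phase(n_qubits, qubit, angle):
    """exp(-i * angle * Z) acting on a single qubit."""
    signs = np.array([1 - 2 * ((b >> qubit) & 1) for b in range(2**n_qubits)])
    return np.diag(np.exp(-1j * angle * signs))


def parity_phase_circuit(m, angle):
    """2m CNOTs plus one ancilla rotation; the ancilla is qubit index m."""
    n_qubits, ancilla = m + 1, m
    compute = np.eye(2**n_qubits)
    for j in range(m):
        compute = cnot(n_qubits, j, ancilla) @ compute
    # compute is a permutation matrix, so its transpose is its inverse.
    return compute.T @ z_phase(n_qubits, ancilla, angle) @ compute
```

On states with the ancilla in $|0\rangle$ , the circuit multiplies $|z_{1}\cdots z_{m}\rangle$ by ${\rm e}^{-i\,\text{angle}\,(-1)^{z_{1}+\cdots+z_{m}}}$ , which is the action of ${\rm e}^{-i\,\text{angle}\,Z\otimes\cdots\otimes Z}$ .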

III.4.1 The cost for the state preparation and the controlled unitaries

The cost of implementing $U_{I}(T)$ resembles that in Ref. Kalev and Hen (2021). The first ingredient is the preparation of the state $|\psi_{0}\rangle$ . Recall from Sec. III.3.2 that the operation that takes $|0\rangle^{\otimes Q}|0\rangle^{\otimes Q}|0\rangle$ to $\frac{1}{\sqrt{s}}\sum_{q=0}^{Q}\sqrt{s_{q}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}$ has gate cost $\mathcal{O}(Q)$ . The operation for $|1\rangle|1\rangle\to\sum_{i=0}^{M}\sum_{k=1}^{K}\sqrt{\frac{||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}}}{\Gamma(t_{w})}}|i\rangle|k\rangle$ costs $\mathcal{O}(MK)$ Shende et al. (2006). The total gate cost for the preparation of $|\psi_{0}\rangle$ (i.e., $B$ ) is $\mathcal{O}(QMK)$ (Lemma 8 in Childs et al. (2017)). In Appendix D, we provide an alternative procedure that leads to an $\mathcal{O}(Q\log(MK))$ scaling for implementing $B$ . The qubit cost in the state preparation is $\mathcal{O}(Q\log(MK))$ .

The next component is the implementation of the controlled unitary $V_{c}$ . As shown in Kalev and Hen (2021), the gate cost of performing the controlled permutation $P_{\mathbb{i}_{q}}$ is $\mathcal{O}(QM(k_{od}+\log M))$ , where $k_{od}$ is the “locality,” i.e., each permutation $P_{i}$ is a tensor product of at most $k_{od}$ Pauli- $X$ operators. The implementation of the controlled phase $\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x}$ involves the calculation of $d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}$ (the product of diagonal elements in the permutation expansion) and the divided differences (with the $x_{j}$ ’s being the inputs). The cost of the former is $\mathcal{O}(QMKC_{D})$ , where $C_{D}$ is the cost of obtaining an element of $D^{(k)}_{i}$ . The cost of the latter is $\mathcal{O}(QM(k_{od}+\log M)+QMK(C_{\Delta H_{0}}+C_{\Lambda}))$ , where $C_{\Delta H_{0}}$ $(C_{\Lambda})$ is the cost of obtaining energy differences of $H_{0}$ (elements of $\Lambda^{(k)}_{i}$ ) [therefore, $C_{\Delta H_{0}}+C_{\Lambda}$ is the cost for obtaining the inputs $x_{j}$ ’s as defined in Eq. (29)]. The additional cost for the reversibility of the process scales as $\mathcal{O}(Q^{2})$ . A detailed discussion of the costs $C_{\Delta H_{0}}$ and $C_{\Lambda}$ may be found in Ref. Kalev and Hen (2021). Combining these, we estimate that the total cost for $V_{c}$ is

\mathcal{O}(Q^{2}+QM(k_{od}+\log M)+QMK(C_{D}+C_{\Delta H_{0}}+C_{\Lambda})).

(56)

III.4.2 Overall cost of the algorithm

The full simulation for $U_{I}(T)$ is a product of segments $U_{I}(t_{w}+\Delta t_{w},t_{w})$ , where each segment is simulated by interleaving $B$ and $V_{c}$ . The total number of segments, $r$ , is determined by $T=\sum_{w=0}^{r-1}\Delta t_{w}$ , where each $\Delta t_{w}$ is determined by the partitioning scheme described in Sec. III.3.1.

As discussed above, the number of LCU applications $r$ can be upper-bounded by $r\lesssim\sum^{r-1}_{w=0}\Gamma(t_{w})\Delta t_{w}/\ln 2$ (in the long simulation time limit), which can be viewed as a discretized $L^{1}$ -norm-like scaling with the norm of the non-static component of the Hamiltonian $V(t)$ .

Combining this with the cost for simulating ${\rm e}^{-iH_{0}T}$ and the cost for each step [Eq. (56)], we conclude that, at worst, the total gate cost scales as

\mathcal{O}\left(r\left(Q^{2}+QM(k_{od}+\log M)+QMK(C_{D}+C_{\Delta H_{0}}+C_{\Lambda})\right)+Ld\right),

(57)

and the qubit cost scales as

\mathcal{O}(Q\log(MK)),

(58)

where $Q$ scales as $\mathcal{O}(\log(r/\epsilon)/\log\log(r/\epsilon))$ . For convenience, we provide a glossary of symbols in Table 1. A summary of the gate and qubit costs of the simulation circuit and the various sub-routines used to construct it is given in Table 2.

Symbol	Meaning
$M$	the number of permutation expansion terms of the non-static Hamiltonian, cf. Eq. (16)
$K$	the length of the exponential-sum expansion, cf. Eq. (17)
$r$	the number of partitions, cf. Sec. III.3.1
$Q$	the series expansion truncation order, $Q={\cal O}\Bigl{(}\frac{\log(r/\epsilon)}{\log\log(r/\epsilon)}\Bigr{)}$
$k_{\rm od}$	the upper bound on the locality of $P_{i}$
$C_{D}$	the cost of obtaining an element of $D^{(k)}_{i}$
$C_{\Delta H_{0}}$	the cost of obtaining energy differences of $H_{0}$
$C_{\Lambda}$	the cost of obtaining an element of $\Lambda^{(k)}_{i}$
$L$	the number of terms in the static Hamiltonian, i.e., $H_{0}=\sum_{\gamma=0}^{L}J_{\gamma}Z_{\gamma}$
$d$	the locality of $Z_{\gamma}$

Table 1: Glossary of symbols.

Unitary	Gate cost	Qubit cost
${\rm e}^{-iH_{0}T}$	${\cal O}(Ld)$	${\cal O}(1)$
$V_{c}$	$\mathcal{O}(Q^{2}+QM(k_{od}+\log M)+QMK(C_{D}+C_{\Delta H_{0}}+C_{\Lambda}))$	${\cal O}(Q\log(MK))$
$U_{I}(T)$	$\mathcal{O}\left(r\left(Q^{2}+QM(k_{od}+\log M)+QMK(C_{D}+C_{\Delta H_{0}}+C_{\Lambda})\right)\right)$	${\cal O}(Q\log(MK))$

Table 2: A summary of resources for the circuit.

III.4.3 Example advantages of the algorithm

To illustrate how our simulation algorithm can provide speedups over existing algorithms, we focus in this subsection on two types of Hamiltonian systems: highly oscillating systems and decaying systems.

The cost of our algorithm is independent of the oscillation rates of the dynamics, whereas the cost of any simulation algorithm (e.g., Berry et al. (2020); Kieferová et al. (2019); Low and Wiebe (2019); Poulin et al. (2011)) that depends on $||dH(t)/dt||$ would depend on oscillation rates of the system. To illustrate this advantage, consider a two-level system with a Hamiltonian

H(t)=hZ+\Gamma\left({\rm e}^{-i\alpha t}|0\rangle\langle 1|+{\rm e}^{i\alpha t}|1\rangle\langle 0|\right)=H_{0}+V(t),

(59)

where $h,\Gamma,\alpha\in\mathbb{R}$ , $H_{0}=hZ$ and $V(t)=\Gamma\left({\rm e}^{-i\alpha t}|0\rangle\langle 1|+{\rm e}^{i\alpha t}|1\rangle\langle 0|\right)$ . In this case, we have $k_{od}=M=K=1$ and $\lambda=0$ . The gate cost of simulating $U(T)$ scales as

\mathcal{O}\left(\Gamma T\left(\frac{\log(\Gamma T/\epsilon)}{\log\log(\Gamma T/\epsilon)}\right)^{2}\right),

(60)

which is independent of $\alpha$ . This means that the simulation cost remains the same even if $\alpha$ becomes arbitrarily large. The absence of $\alpha$ can be understood from the fact that the phases are explicitly integrated out into an integral-free expansion series, in which the bound on each term does not depend on the oscillations (due to Identity 2). Therefore, our simulation can be significantly more efficient when the time dependence of the Hamiltonian has very high frequencies. Note that while the example above was given for a simple qubit system with pure oscillation, the frequency independence of the cost holds for any system.

Another class of systems for which our algorithm can provide a speedup is that of Hamiltonians with exponential decays, i.e., $\lambda<0$ . For concreteness, consider the Hamiltonian

H(t)=hZ+\Gamma{\rm e}^{-\alpha t}X=H_{0}+V(t),

(61)

where $h,\Gamma\in\mathbb{R}$ , $\alpha>0$ , $H_{0}=hZ$ and $V(t)=\Gamma{\rm e}^{-\alpha t}X$ . In this case, $\lambda=-\alpha$ and $||V(t)||_{\max}=\Gamma{\rm e}^{-\alpha t}$ .

The $L^{1}$ -norm defined in Ref. Berry et al. (2020) is $\int^{T}_{0}||H(t)||_{\max}dt$ , which has a linear scaling $\mathcal{O}(hT)$ with the simulation duration $T$ , whereas our discretized $L^{1}$ -norm $\sum^{r-1}_{w=0}||V(t_{w})||_{\max}\Delta t_{w}$ tends to a constant in the long time limit. This can be seen from the fact that the partition terminates at a large enough time $t_{w}$ ( $\leq T$ ), where $\Delta t_{w}=T-t_{w}$ becomes the final simulation step, as described in Sec. III.3.1. The above results also hold for any combination of exponential decays (even when these are multiplied by oscillatory terms) with which different time decay dependencies may be constructed.
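For the decaying example of Eq. (61), the termination of the partition can be made explicit. Inserting $\lambda=-\alpha$ and $\Gamma(t)=\Gamma{\rm e}^{-\alpha t}$ into Eq. (44) gives ${\rm e}^{-\alpha\Delta t_{w}}=1-\alpha\ln 2/\Gamma(t_{w})$ , so that $\Gamma(t_{w+1})=\Gamma(t_{w})-\alpha\ln 2$ : each regular step lowers the bound by a fixed amount. The following sketch (hypothetical function names, $\Gamma>0$ assumed) counts the steps using this recurrence and evaluates the continuous integral $\int_{0}^{T}||V(t)||_{\max}dt$ for comparison.

```python
import math

LN2 = math.log(2.0)


def count_steps(gamma0, alpha, total_time):
    """Number of segments r for V(t) = gamma0 * exp(-alpha * t) * X."""
    g, t, r = gamma0, 0.0, 0
    while True:
        r += 1
        if g <= alpha * LN2:       # log argument in Eq. (44) is non-positive
            return r               # this segment is the final one
        dt = -math.log(1.0 - alpha * LN2 / g) / alpha
        if t + dt >= total_time:   # the segment would overshoot T
            return r
        t += dt
        g -= alpha * LN2           # Gamma(t_{w+1}) = Gamma(t_w) - alpha * ln 2


def interaction_l1_norm(gamma0, alpha, total_time):
    """Closed form of the integral of gamma0 * exp(-alpha * t) over [0, T]."""
    return gamma0 * (1.0 - math.exp(-alpha * total_time)) / alpha
```

In this sketch, once $T$ is large enough the step count saturates at $\lceil\Gamma/(\alpha\ln 2)\rceil$ , and the integral never exceeds $\Gamma/\alpha$ , whereas $hT$ keeps growing.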

III.5 Hamiltonians with arbitrary time dependence

The simulation algorithm invokes a switch to the interaction picture, by dividing the Hamiltonian into a static diagonal part $H_{0}$ and a time-dependent Hermitian operator $V(t)$ . The operator $V(t)$ is expanded using permutations and exponential sums as presented in Eq. (17). There, we assume that the time dependence can be expressed as exponential sums with a finite number of terms, $K$ . Although this assumption holds for many models (e.g., when the time dependencies are combinations of trigonometric functions and exponential decays), the exponential series generally requires an infinite sum (e.g., a Fourier series). A straightforward procedure to obtain a finite-sum approximation is via a truncated Fourier series. As an example, let us consider a polynomial function of time, i.e., $f(t)=\sum_{l=0}^{p}c_{l}t^{l}$ . Using the proof of Theorem 8.14 in Ref. Rudin (1976), it can be shown that a truncated Fourier series of $f(t)$ is $\mathcal{O}(\epsilon)$ close to $f(t)$ when the truncation order is $\mathcal{O}(1/\epsilon)$ . We also note that, other than Fourier series, there have been numerous studies Norvidas (2010); Beylkin and Monzón (2005, 2010); Braess and Hackbusch (2009); Wiscombe and Evans (1977) on finding an exponential-sum approximation of a function. Some of them, e.g., Beylkin and Monzón (2005), provide efficient algorithms in which the number of terms scales logarithmically with the inverse of the required accuracy. These results suggest that efficient methods for finding the exponential-sum decompositions of the time dependences of $V(t)$ can exist in many cases.

Suppose that $V(t)$ is approximated by a finite exponential sum. The resulting error of the unitary evolution, due to the Hamiltonian approximation, scales at most linearly with the evolution duration. This can be shown using the following property. Given two time-dependent Hamiltonians $H_{1}(t)$ and $H_{2}(t)$ such that

||H_{1}(t)-H_{2}(t)||\leq\epsilon\ \ \text{for all }t\in[0,T],

(62)

then

||U_{1}(T,0)-U_{2}(T,0)||\equiv\left|\left|\mathcal{T}\text{exp}\left[-i\int_{0}^{T}H_{1}(t)dt\right]-\mathcal{T}\text{exp}\left[-i\int_{0}^{T}H_{2}(t)dt\right]\right|\right|\leq\epsilon T.

(63)

This holds for the operator (spectral) norm and, more generally, for any unitarily invariant norm $||\cdot||$ , which is what the argument below uses. Before proving this, we first note the so-called subadditivity of error in implementing unitaries Nielsen and Chuang (2011). It says that for unitaries $U_{1},U_{2},V_{1}$ and $V_{2}$ , we have

||U_{2}U_{1}-V_{2}V_{1}||\leq||U_{2}-V_{2}||+||U_{1}-V_{1}||.

(64)

This can be easily shown by

	$\displaystyle||U_{2}U_{1}-V_{2}V_{1}||$	$\displaystyle=||U_{2}U_{1}-V_{2}U_{1}+V_{2}U_{1}-V_{2}V_{1}||$
		$\displaystyle\leq||(U_{2}-V_{2})U_{1}||+||V_{2}(U_{1}-V_{1})||\leq||U_{2}-V_{2}||+||U_{1}-V_{1}||,$		(65)

where the basic operator norm inequalities are used. Now we prove the bound in Eq. (63). We divide $T$ into $n$ segments such that each segment has width $T/n$ . We can rewrite the time evolution operators as

	$\displaystyle U_{1}(T,0)=U_{1}\left(T,\frac{n-1}{n}T\right)\cdots U_{1}\left(\frac{T}{n},0\right)$
	$\displaystyle U_{2}(T,0)=U_{2}\left(T,\frac{n-1}{n}T\right)\cdots U_{2}\left(\frac{T}{n},0\right).$

Repeatedly using the subadditivity of error, we have

	$\displaystyle||U_{1}(T,0)-U_{2}(T,0)||\leq\sum_{m=1}^{n}\left|\left|U_{1}\left(\frac{mT}{n},\frac{(m-1)T}{n}\right)-U_{2}\left(\frac{mT}{n},\frac{(m-1)T}{n}\right)\right|\right|$
	$\displaystyle=\sum_{m=1}^{n}\left|\left|-i\int_{\frac{(m-1)T}{n}}^{\frac{mT}{n}}\left[H_{1}(t)-H_{2}(t)\right]dt+(-i)^{2}\int^{\frac{mT}{n}}_{\frac{(m-1)T}{n}}dt_{2}\int_{\frac{(m-1)T}{n}}^{t_{2}}dt_{1}\left[H_{1}(t_{2})H_{1}(t_{1})-H_{2}(t_{2})H_{2}(t_{1})\right]+\cdots\right|\right|$
	$\displaystyle\leq\sum_{m=1}^{n}\epsilon\frac{T}{n}+\sum_{m=1}^{n}\mathcal{O}\left[\left(\frac{T}{n}\right)^{2}\right]=\epsilon T+\mathcal{O}\left[\frac{T^{2}}{n}\right].$		(66)

Since this inequality holds for any $n$ , we can take $n\to\infty$ and it yields Eq. (63) as claimed.
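The bound of Eq. (63) can be illustrated on a single qubit. The sketch below is only a numerical sanity check under stated assumptions: the two Hamiltonians are arbitrary illustrative choices (not taken from the text) that differ by at most `eps` in spectral norm, and the time-ordered exponentials are approximated by products of short-time exponentials evaluated at interval midpoints.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def propagator(hamiltonian, total_time, slices=2000):
    """Product of exp(-i H(t_mid) dt) over equal slices (piecewise-constant H)."""
    dt = total_time / slices
    u = np.eye(2, dtype=complex)
    for m in range(slices):
        energies, vectors = np.linalg.eigh(hamiltonian((m + 0.5) * dt))
        step = (vectors * np.exp(-1j * energies * dt)) @ vectors.conj().T
        u = step @ u
    return u


def evolution_gap(eps, total_time):
    """Spectral-norm distance between the two propagators."""
    h1 = lambda t: Z + np.cos(t) * X
    h2 = lambda t: h1(t) + eps * np.sin(3.0 * t) * Y   # ||h1 - h2|| <= eps
    return np.linalg.norm(propagator(h1, total_time) - propagator(h2, total_time), 2)
```

Since the sliced evolutions are themselves exact evolutions under piecewise-constant Hamiltonians that differ by at most `eps`, the computed gap should not exceed `eps * total_time`.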

Now we apply this property to the simulation of $U_{I}(T)$ . Suppose that we have a $\tilde{\delta}$ -accurate approximation of $V(t)$ , i.e., $||\tilde{V}(t)-V(t)||\leq\tilde{\delta}$ for all $t\in[0,T]$ , where $\tilde{V}(t)$ is the finite exponential-sum approximation of $V(t)$ . The accumulated error from this approximation is bounded by $\tilde{\delta}T$ and the overall error is $\mathcal{O}(\tilde{\delta}T+\delta)$ , where $\delta$ is the error from the LCU implementation. Recall that $\tilde{\delta}$ is closely related to $K$ (the number of terms). Although it is intuitive that a larger $K$ can allow for a smaller $\tilde{\delta}$ , the explicit relation between the two largely depends on the model and the expansion method. Nonetheless, we can expect $K$ to scale at least linearly with $1/\tilde{\delta}$ for many cases, e.g., the aforementioned truncated Fourier series for a polynomial.

The simulation cost also depends on $M$ , the number of terms in the permutation expansion. This quantity usually scales linearly with the system size and can be easily determined. For example, a typical spin model usually involves a sum of tensor products of Pauli- $X$ ’s (or $Y$ ’s) and Pauli- $Z$ ’s. Each tensor product represents an interaction between qubits on certain lattice sites. Due to the common locality constraint that prevents a qubit interacting with the ones arbitrarily far apart, the number of interacting terms, $M$ , scales at most linearly with the number of qubits. In addition, a tensor product of Pauli operators can be easily separated into a product of diagonal matrix and a permutation, e.g., $X\otimes X\otimes Y=(I\otimes I\otimes-iZ)(X\otimes X\otimes X)$ . We conclude that $M$ will have modest linear scaling for most practical models.
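The separation of a Pauli string into a diagonal matrix times a permutation can be sketched as follows. The single-qubit factorizations used are $X=I\cdot X$ , $Y=(-iZ)\cdot X$ and $Z=Z\cdot I$ ; the names `FACTORS`, `kron_all` and `split_pauli_string` are hypothetical.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Each Pauli written as (diagonal factor, permutation factor).
FACTORS = {"I": (I2, I2), "X": (I2, X), "Y": (-1j * Z, X), "Z": (Z, I2)}


def kron_all(matrices):
    """Tensor product of a list of matrices, left to right."""
    return reduce(np.kron, matrices)


def split_pauli_string(label):
    """Return (D, P) with D diagonal, P a permutation, and D @ P the Pauli string."""
    diagonal = kron_all([FACTORS[c][0] for c in label])
    permutation = kron_all([FACTORS[c][1] for c in label])
    return diagonal, permutation
```

For the label `"XXY"` this reproduces the example in the text, with $D=I\otimes I\otimes(-iZ)$ and $P=X\otimes X\otimes X$ .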

IV Alternative scheme and reduction to the time-independent case

In this section, we provide an alternative yet equivalent scheme for the dynamical simulation, one that will allow us to establish an immediate connection to the time-independent Hamiltonian simulation formalism (specifically to the scheme presented in Ref. Kalev and Hen (2021)), in which $H(t)$ is assumed constant in time.

In previous sections, we have chosen to partition the interaction-picture unitary $U_{I}(T)$ into short time segments and then follow its execution by the application of the diagonal unitary ${\rm e}^{-iH_{0}T}$ , bringing it back to the Schrödinger picture. Here, we show that the Schrödinger-picture $U(T)$ can be partitioned similarly.

Recalling the expansion of $U_{I}(t_{w}+\Delta t_{w},t_{w})$ in Eq. (32), we have

\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t_{w}x_{1}}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle,

with

x_{1}=i\left(E_{z_{\mathbb{i}_{q}}}-E_{z}\right)+\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}.

Breaking the ${\rm e}^{t_{w}x_{1}}$ phase, we get:

	$\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t_{w}\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{-it_{w}E_{z}}{\rm e}^{i(t_{w}+\Delta t_{w})E_{z_{\mathbb{i}_{q}}}}{\rm e}^{-i\Delta t_{w}E_{z_{\mathbb{i}_{q}}}}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle.$		(67)

We find that

{\rm e}^{-iH_{0}(t_{w}+\Delta t_{w})}U_{I}(t_{w}+\Delta t_{w},t_{w})=\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w}){\rm e}^{-iH_{0}t_{w}}

(68)

where

\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w})=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t_{w}\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{-i\Delta t_{w}E_{z_{\mathbb{i}_{q}}}}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle\langle z|\,.

(69)

Inspecting the full unitary evolution, we observe

	$\displaystyle U(T)$	$\displaystyle={\rm e}^{-iH_{0}T}U_{I}(T)$
		$\displaystyle={\rm e}^{-iH_{0}T}U_{I}(T,t_{r-1})U_{I}(t_{r-1},t_{r-2})\cdots U_{I}(t_{1},0)=\widetilde{U}_{I}(T,t_{r-1}){\rm e}^{-iH_{0}t_{r-1}}U_{I}(t_{r-1},t_{r-2})\cdots U_{I}(t_{1},0).$		(70)

Applying Eq. (68) repeatedly to the remaining segments, the evolution operator $U(T)$ simplifies to

U(T)=\widetilde{U}_{I}(T,t_{r-1})\widetilde{U}_{I}(t_{r-1},t_{r-2})\cdots\widetilde{U}_{I}(t_{1},0)\,,

(71)

eliminating the diagonal piece. Each $\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w})$ can be rewritten as:

	$\displaystyle\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w})$
	$\displaystyle=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{(t_{w}+\Delta t_{w})\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{-\Delta t_{w}(iE_{z_{\mathbb{i}_{q}}}+\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}})}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle\langle z|\,.$		(72)

The factor ${\rm e}^{-\Delta t_{w}(iE_{z_{\mathbb{i}_{q}}}+\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}})}$ can be absorbed into the divided difference:

\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w})=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{(t_{w}+\Delta t_{w})\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{\Delta t_{w}[\tilde{y}_{1},\tilde{y}_{2},\cdots,\tilde{y}_{q},\tilde{y}_{q+1}]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle\langle z|\,.

(73)

with

\tilde{y}_{j}=x_{j}-\left(iE_{z_{\mathbb{i}_{q}}}+\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)=i\left(E_{z_{\mathbb{i}_{q}}}-E_{z_{\mathbb{i}_{j-1}}}\right)+\sum_{l=j}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}-\left(iE_{z_{\mathbb{i}_{q}}}+\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\,,

(74)

which simplifies to

\tilde{y}_{j}=-iE_{z_{\mathbb{i}_{j-1}}}-\sum_{l=1}^{j-1}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\,.

(75)
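
The absorption step from Eq. (72) to Eq. (73) and the simplification from Eq. (74) to Eq. (75) can be checked numerically. The following Python sketch is an illustration only: the energies `E`, the exponents `lam`, the step `dt`, and the helper names are arbitrary choices made here, and the divided differences are evaluated with a truncated series (Property 2 of Appendix A) rather than with any routine prescribed by the algorithm.

```python
import cmath
import math
import random

def exp_dd(xs, terms=80):
    """Approximate e^{[x_0,...,x_q]} by the truncated series of Property 2."""
    q = len(xs) - 1
    h = [1 + 0j] + [0j] * terms          # h[k]: sum of all degree-k monomials in xs
    for x in xs:
        for k in range(1, terms + 1):
            h[k] += x * h[k - 1]
    return sum(h[k] / math.factorial(q + k) for k in range(terms + 1))

def exp_dd_t(t, xs):
    """e^{t[x_0,...,x_q]} = t^q e^{[t x_0,...,t x_q]}."""
    return t ** (len(xs) - 1) * exp_dd([t * x for x in xs])

random.seed(1)
q, dt = 3, 0.4
E = [random.uniform(-2, 2) for _ in range(q + 1)]    # E[j] plays the role of E_{z_{i_j}}
lam = [complex(random.uniform(-1, 0.5), random.uniform(-1, 1)) for _ in range(q)]

# Inputs x_1..x_q as in Eq. (74), followed by the trailing 0.
xs = [1j * (E[q] - E[j - 1]) + sum(lam[j - 1:]) for j in range(1, q + 1)] + [0]
c = 1j * E[q] + sum(lam)                             # the exponent being absorbed
y_tilde = [x - c for x in xs]
y_closed = [-1j * E[j - 1] - sum(lam[:j - 1]) for j in range(1, q + 2)]  # Eq. (75)

lhs = cmath.exp(-dt * c) * exp_dd_t(dt, xs)          # prefactor times old divided difference
rhs = exp_dd_t(dt, y_tilde)                          # shifted divided difference
shift_error = abs(lhs - rhs)
formula_error = max(abs(a - b) for a, b in zip(y_tilde, y_closed))
print(shift_error, formula_error)
```

Both printed numbers should be at the level of floating-point rounding, reflecting that subtracting a common constant from every input only multiplies the divided difference by the corresponding exponential.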

By inserting additional $i\Delta t_{w}E_{z}$ phases into the divided differences, we can rewrite

\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w})=\left(\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{(t_{w}+\Delta t_{w})\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{\Delta t_{w}[y_{1},y_{2},\cdots,y_{q},y_{q+1}]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle\langle z|\right){\rm e}^{-iH_{0}\Delta t_{w}}\,.

(76)

with

y_{j}=\tilde{y}_{j}+iE_{z}=-i(E_{z_{\mathbb{i}_{j-1}}}-E_{z})-\sum_{l=1}^{j-1}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}=-i\Delta E_{z_{\mathbb{i}_{j-1}}}-\sum_{l=1}^{j-1}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\,.

(77)

Now, we can write $U(T)$ as alternating off-diagonal and diagonal unitaries:

U(T)=\prod_{w}\widetilde{U}_{I}(t_{w}+\Delta t_{w},t_{w})\equiv\prod_{w}U_{\textrm{od}}(t_{w}+\Delta t_{w},t_{w}){\rm e}^{-iH_{0}\Delta t_{w}}\,.

(78)

When $H(t)$ becomes time-independent, $\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}=0$ and $\Delta t_{w}=\Delta t=\ln 2/\Gamma$ . To synchronize the notation with Kalev and Hen (2021), we identify $H_{0}=D_{0}$ and $U_{\textrm{od}}(t_{w}+\Delta t_{w},t_{w})=U_{\textrm{od}}$ . The evolution operator becomes $U(T)=U_{\textrm{od}}{\rm e}^{-iD_{0}\Delta t}\cdots U_{\textrm{od}}{\rm e}^{-iD_{0}\Delta t}$ , which coincides with Kalev and Hen (2021).

V Conclusions

We presented a quantum algorithm for simulating the evolution operator generated by a time-dependent Hamiltonian. The algorithm involves a permutation expansion of the interaction Hamiltonian, a switch to the interaction picture, and the incorporation of the LCU technique. Combining the permutation expansion with the Dyson series leads to an integral-free representation of the interaction-picture unitary, with coefficients given by divided differences with complex inputs.

We found that our expansion allows the time steps to be adjusted according to the dynamical characteristics of the Hamiltonian, which saves resources compared with an equal-size partition based on the largest bound. As a result, the gate count has an $L^{1}$ -norm-like scaling with respect only to the ‘non-static’ norm of the Hamiltonian.

Specifically, we demonstrated that for systems with a decaying non-static component, the resources do not scale with the total evolution time asymptotically. Furthermore, the simulation cost is independent of the frequencies, implying a significant advantage for systems with highly oscillating components.

Acknowledgements.

This work is supported by the U.S. Department of Energy (DOE), Office of Science, Basic Energy Sciences (BES) under Award No. DE-SC0020280.

References

Feynman (1982) R. P. Feynman, International Journal of Theoretical Physics 21, 467 (1982).
Reiher et al. (2017) M. Reiher, N. Wiebe, K. M. Svore, D. Wecker, and M. Troyer, Proceedings of the National Academy of Sciences 114, 7555 (2017), http://www.pnas.org/content/114/29/7555.full.pdf .
Babbush et al. (2018) R. Babbush, N. Wiebe, J. McClean, J. McClain, H. Neven, and G. K.-L. Chan, Phys. Rev. X 8, 011044 (2018).
Gioiosa (2017) R. Gioiosa, in Rugged Embedded Systems, edited by A. Vega, P. Bose, and A. Buyuktosunoglu (Morgan Kaufmann, Boston, 2017) pp. 123–148.
Sterling et al. (2018) T. Sterling, M. Anderson, and M. Brodowicz, in High Performance Computing, edited by T. Sterling, M. Anderson, and M. Brodowicz (Morgan Kaufmann, Boston, 2018) pp. 285–311.
Lee (2014) G. Lee, in Cloud Networking, edited by G. Lee (Morgan Kaufmann, Boston, 2014) pp. 179–189.
Pang and Jordan (2017) S. Pang and A. N. Jordan, Nature Communications 8, 14695 (2017).
Butler (1998) L. J. Butler, Annual Review of Physical Chemistry 49, 125 (1998), PMID: 15012427, https://doi.org/10.1146/annurev.physchem.49.1.125 .
Farhi et al. (2001) E. Farhi, J. Goldstone, S. Gutmann, J. Lapan, A. Lundgren, and D. Preda, Science 292, 472 (2001).
Farhi et al. (2014) E. Farhi, J. Goldstone, and S. Gutmann, arXiv e-prints , arXiv:1411.4028 (2014), arXiv:1411.4028 [quant-ph] .
Wiebe et al. (2011) N. Wiebe, D. W. Berry, P. Høyer, and B. C. Sanders, Journal of Physics A: Mathematical and Theoretical 44, 445308 (2011).
Poulin et al. (2011) D. Poulin, A. Qarry, R. Somma, and F. Verstraete, Phys. Rev. Lett. 106, 170501 (2011).
Berry et al. (2014) D. W. Berry, A. M. Childs, R. Cleve, R. Kothari, and R. D. Somma, in Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing, STOC ’14 (Association for Computing Machinery, New York, NY, USA, 2014) pp. 283–292.
Berry et al. (2015) D. W. Berry, A. M. Childs, R. Cleve, R. Kothari, and R. D. Somma, Phys. Rev. Lett. 114, 090502 (2015).
Low and Wiebe (2019) G. H. Low and N. Wiebe, “Hamiltonian simulation in the interaction picture,” (2019), arXiv:1805.00675 [quant-ph] .
Kieferová et al. (2019) M. Kieferová, A. Scherer, and D. W. Berry, Phys. Rev. A 99, 042314 (2019).
Berry et al. (2020) D. W. Berry, A. M. Childs, Y. Su, X. Wang, and N. Wiebe, Quantum 4, 254 (2020).
Dyson (1949) F. J. Dyson, Phys. Rev. 75, 486 (1949).
Kalev and Hen (2020) A. Kalev and I. Hen, “An integral-free representation of the Dyson series using divided differences,” (2020), arXiv:2010.09888 [quant-ph] .
de Boor (2005) C. de Boor, Surveys in Approximation Theory 1, 46 (2005).
Davis (1975) P. Davis, Interpolation and Approximation, Dover Books on Mathematics (Dover Publications, 1975).
Mccurdy (1980) A. C. Mccurdy, Accurate Computation of Divided Differences, Ph.D. thesis, University of California, Berkeley (1980), AAI8029490.
Gupta et al. (2020a) L. Gupta, L. Barash, and I. Hen, Computer Physics Communications 254, 107385 (2020a).
McCurdy et al. (1984) A. McCurdy, K. C. Ng, and B. N. Parlett, Mathematics of Computation 43, 501 (1984).
Zivcovich (2019) F. Zivcovich, Dolomites Research Notes on Approximation 12, 28 (2019).
Gupta et al. (2020b) L. Gupta, T. Albash, and I. Hen, Journal of Statistical Mechanics: Theory and Experiment 2020, 073105 (2020b).
Beylkin and Monzón (2005) G. Beylkin and L. Monzón, Applied and Computational Harmonic Analysis 19, 17 (2005).
Beylkin and Monzón (2010) G. Beylkin and L. Monzón, Applied and Computational Harmonic Analysis 28, 131 (2010), Special Issue on Continuous Wavelet Transform in Memory of Jean Morlet, Part I.
Braess and Hackbusch (2009) D. Braess and W. Hackbusch, “On the efficient computation of high-dimensional integrals and the approximation by exponential sums,” (2009) pp. 39–74.
Wiscombe and Evans (1977) W. Wiscombe and J. Evans, Journal of Computational Physics 24, 416 (1977).
Norvidas (2010) S. Norvidas, Acta Mathematica Hungarica 128, 26 (2010).
Nielsen and Chuang (2011) M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information: 10th Anniversary Edition, 10th ed. (Cambridge University Press, USA, 2011).
Kalev and Hen (2021) A. Kalev and I. Hen, Quantum 5, 426 (2021).
Shende et al. (2006) V. V. Shende, S. S. Bullock, and I. L. Markov, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 25, 1000 (2006).
Childs et al. (2017) A. M. Childs, R. Kothari, and R. D. Somma, SIAM Journal on Computing 46, 1920 (2017), https://doi.org/10.1137/16M1087072 .
Rudin (1976) W. Rudin, Principles of Mathematical Analysis, 3rd ed. (McGraw-Hill Education, 1976).

Appendix

Appendix A Properties of divided difference

We begin with a formal definition of divided difference for complex-valued functions and follow with some properties that will be of use to us when deriving the new bound. The main results are derived for the exponential functions.

Definition 1.

Let $\mathbb{U}$ be an open subset of $\mathbb{C}$ , and let $f:\mathbb{U}\to\mathbb{C}$ be analytic in $\mathbb{U}$ . For any non-negative integer $q$ and $x_{0},x_{1},\cdots,x_{q}\in\mathbb{U}$ , the divided difference of $f$ is denoted by $f[x_{0},x_{1},\cdots,x_{q}]$ . If $q=0$ , $f[x_{0}]\equiv f(x_{0})$ . Suppose $\{x_{0},x_{1},\cdots,x_{q}\}$ has $r$ distinct elements. Let $S=\{x_{\sigma(0)},x_{\sigma(1)},\cdots,x_{\sigma(q)}\}$ be a sorted arrangement of $\{x_{0},x_{1},\cdots,x_{q}\}$ , i.e., $\sigma$ is a permutation such that the first $n_{1}$ elements of $S$ are equal, the following $n_{2}$ elements of $S$ are equal, and so on. There are $r$ such clusters of equal elements, with $\sum_{i=1}^{r}n_{i}=q+1$ . The divided difference of $f$ is defined as

f[x_{0},x_{1},\cdots,x_{q}]=\begin{cases}\frac{f[x_{\sigma(1)},\cdots,x_{\sigma(q)}]-f[x_{\sigma(0)},\cdots,x_{\sigma(q-1)}]}{x_{\sigma(q)}-x_{\sigma(0)}}&\text{if }r>1,\\ \frac{f^{(q)}(x_{0})}{q!}&\text{if }r=1,\end{cases}

where $f^{(q)}$ denotes the $q$ th derivative of $f$ .

Although the above sorting procedure is not unique, it can be shown that any choice of the permutation gives the same result, and hence the divided difference is well defined.

The divided difference obeys a recursive relation that connects a case with $q+1$ inputs to two cases with $q$ inputs. For $q=1$ ,

\displaystyle f[x_{0},x_{1}]=\begin{cases}\frac{f(x_{1})-f(x_{0})}{x_{1}-x_{0}}&\text{if }x_{0}\neq x_{1},\\ f^{\prime}(x_{0})&\text{if }x_{0}=x_{1}.\end{cases}

(79)

For $q=2$ , supposing $x_{0}$ , $x_{1}$ and $x_{2}$ are all distinct,

	$\displaystyle f[x_{0},x_{1},x_{2}]$	$\displaystyle=\frac{\frac{f(x_{2})-f(x_{1})}{x_{2}-x_{1}}-\frac{f(x_{1})-f(x_{0})}{x_{1}-x_{0}}}{x_{2}-x_{0}}$
		$\displaystyle=\frac{f(x_{0})}{(x_{0}-x_{1})(x_{0}-x_{2})}+\frac{f(x_{1})}{(x_{1}-x_{2})(x_{1}-x_{0})}+\frac{f(x_{2})}{(x_{2}-x_{0})(x_{2}-x_{1})}.$		(80)

In fact, it can be shown that for distinct $x_{0},x_{1},\cdots,x_{q}$ ,

\displaystyle f[x_{0},x_{1},\cdots,x_{q}]=\sum^{q}_{i=0}\frac{f(x_{i})}{\prod_{k\neq i}(x_{i}-x_{k})}.

(81)
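
As a numerical illustration (not part of the original derivation), the following Python sketch evaluates the divided difference for pairwise distinct nodes in two ways, through the recursion of Definition 1 and through the explicit sum in Eq. (81), and compares the two. The function names and the sample nodes are illustrative choices made here; the recursion as written is only valid when all nodes are distinct.

```python
import cmath

def dd_recursive(f, xs):
    """Divided difference f[x_0,...,x_q] for pairwise distinct nodes (recursion)."""
    if len(xs) == 1:
        return f(xs[0])
    return (dd_recursive(f, xs[1:]) - dd_recursive(f, xs[:-1])) / (xs[-1] - xs[0])

def dd_explicit(f, xs):
    """The same quantity from the explicit sum of Eq. (81)."""
    total = 0
    for i, xi in enumerate(xs):
        denom = 1
        for k, xk in enumerate(xs):
            if k != i:
                denom *= xi - xk
        total += f(xi) / denom
    return total

nodes = [0.3 + 0.1j, -0.5j, 1.2, -0.7 + 0.4j]
a = dd_recursive(cmath.exp, nodes)
b = dd_explicit(cmath.exp, nodes)
print(abs(a - b))                                   # agreement of the two forms
print(abs(dd_recursive(cmath.exp, nodes[::-1]) - a))  # permutation symmetry
```

The second printed number illustrates that the result does not depend on the ordering of the inputs.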

Remark.

Since any analytic function admits a Taylor expansion representation and the divided difference is a linear functional, the divided difference of an analytic function $f$ has a series expansion form, i.e., for $x_{0},\cdots,x_{q}$ and $y$ in $f$ ’s analytic domain,

f[x_{0},\cdots,x_{q}]=\sum_{n=0}^{\infty}\frac{f^{(n)}(y)}{n!}p_{n|y}[x_{0},\cdots,x_{q}],

(82)

where $p_{n|y}(x)\equiv(x-y)^{n}$ . Because $p_{n|y}[x_{0},\cdots,x_{q}]=0$ for all $n<q$ , the non-vanishing term of the series starts from the $q$ th order.

For simplicity, we denote the divided difference for the exponential function as ${\rm e}^{[x_{0},\cdots,x_{q}]}$ , i.e.,

{\rm e}^{[x_{0},\cdots,x_{q}]}\equiv f[x_{0},\cdots,x_{q}],\ \text{where}\ f(x)={\rm e}^{x}.

(83)

Property 1.

For any non-negative integer $q$ and $x_{0},x_{1},\cdots,x_{q}\in\mathbb{C}$ ,

{\rm e}^{[x_{0},x_{1},\cdots,x_{q}]}={\rm e}^{x_{0}}{\rm e}^{[0,x_{1}-x_{0},\cdots,x_{q}-x_{0}]}.

(84)

This property and the fact that divided differences are permutation symmetric among inputs imply that any input can be factored out of the divided difference by subtracting it from every entry.

Property 2.

For any non-negative integer $q$ and $x_{0},x_{1},\cdots,x_{q}\in\mathbb{C}$ ,

{\rm e}^{[x_{0},x_{1},\cdots,x_{q}]}=\displaystyle\sum^{\infty}_{n=q}\frac{1}{n!}\sum_{\sum k_{j}=n-q}\prod^{q}_{j=0}(x_{j})^{k_{j}}.

(85)

An equivalent definition of the divided difference of an analytic function is via its Taylor expansion: one applies the divided difference to every order of the series. Since any polynomial of order less than $q$ is annihilated, the series starts from order $q$ . Property 2 is derived from the Taylor expansion of ${\rm e}^{x}$ about the origin.
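
The series in Property 2 also gives a simple way to evaluate ${\rm e}^{[x_{0},\cdots,x_{q}]}$ numerically, including for repeated inputs. The Python sketch below is an illustration under stated assumptions: the truncation at `terms` orders and the function name `exp_dd` are choices made here, and the truncated series is adequate only for inputs of moderate magnitude. It checks Property 1 and the repeated-input case of Definition 1.

```python
import cmath
import math

def exp_dd(xs, terms=80):
    """Approximate e^{[x_0,...,x_q]} by truncating the series of Property 2."""
    q = len(xs) - 1
    # h[k] accumulates the sum of all monomials prod_j x_j^{k_j} with sum_j k_j = k.
    h = [1 + 0j] + [0j] * terms
    for x in xs:
        for k in range(1, terms + 1):
            h[k] += x * h[k - 1]
    # The order-n term of the series carries 1/n! with n = q + k.
    return sum(h[k] / math.factorial(q + k) for k in range(terms + 1))

xs = [0.3 + 0.7j, -0.2, 1.1j]
# Property 1: factor the first input out of the divided difference.
shifted = cmath.exp(xs[0]) * exp_dd([x - xs[0] for x in xs])
property1_error = abs(exp_dd(xs) - shifted)
# Repeated inputs (r = 1 in Definition 1): e^{[x,x,x]} = e^x / 2!.
repeated_error = abs(exp_dd([0.5, 0.5, 0.5]) - math.exp(0.5) / 2)
print(property1_error, repeated_error)
```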

Lemma 1.

For any non-negative integer $q$ and $x_{0},x_{1},\cdots,x_{q}\in\mathbb{C}$ ,

\int_{0}^{1}a^{q}{\rm e}^{[ax_{0},ax_{1},\cdots,ax_{q}]}da={\rm e}^{[0,x_{0},x_{1},\cdots,x_{q}]}.

(86)

Proof.

This can be observed from the series expansion of the divided difference of the exponential function, i.e., from Property 2,

	$\displaystyle a^{q}{\rm e}^{[ax_{0},ax_{1},\cdots,ax_{q}]}$	$\displaystyle=a^{q}\displaystyle\sum^{\infty}_{n=q}\frac{1}{n!}\sum_{\sum k_{j}=n-q}\prod^{q}_{j=0}(ax_{j})^{k_{j}}$
		$\displaystyle=a^{q}\displaystyle\sum^{\infty}_{n=q}\frac{1}{n!}\sum_{\sum k_{j}=n-q}a^{n-q}\prod^{q}_{j=0}(x_{j})^{k_{j}}=\displaystyle\sum^{\infty}_{n=q}\frac{a^{n}}{n!}\sum_{\sum k_{j}=n-q}\prod^{q}_{j=0}(x_{j})^{k_{j}}.$		(87)

Performing term-by-term integration over $a$ on both sides, we have

	$\displaystyle\int_{0}^{1}a^{q}{\rm e}^{[ax_{0},ax_{1},\cdots,ax_{q}]}da$	$\displaystyle=\displaystyle\sum^{\infty}_{n=q}\left(\int_{0}^{1}\frac{a^{n}}{n!}da\right)\sum_{\sum k_{j}=n-q}\prod^{q}_{j=0}(x_{j})^{k_{j}}$
		$\displaystyle=\displaystyle\sum^{\infty}_{n=q}\frac{1}{(n+1)!}\sum_{\sum k_{j}=n-q}\prod^{q}_{j=0}(x_{j})^{k_{j}}={\rm e}^{[0,x_{0},x_{1},\cdots,x_{q}]},$		(88)

where the last equality follows from the series expansion representation of ${\rm e}^{[0,x_{0},x_{1},\cdots,x_{q}]}$ . This completes the proof. ∎
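
Lemma 1 can also be checked numerically. The Python sketch below is an illustration only: it integrates the left-hand side of Eq. (86) with a composite Simpson rule and compares it with the right-hand side, using a truncated Property 2 series for the divided differences. The sample inputs, the number of subintervals, and the helper names are arbitrary choices made here.

```python
import math

def exp_dd(xs, terms=80):
    """Approximate e^{[x_0,...,x_q]} by the truncated series of Property 2."""
    q = len(xs) - 1
    h = [1 + 0j] + [0j] * terms          # h[k]: sum of all degree-k monomials in xs
    for x in xs:
        for k in range(1, terms + 1):
            h[k] += x * h[k - 1]
    return sum(h[k] / math.factorial(q + k) for k in range(terms + 1))

def simpson(g, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * step)
    return total * step / 3

xs = [0.4 + 1.0j, -0.3 - 0.5j, 0.2j]
q = len(xs) - 1
lhs = simpson(lambda a: a ** q * exp_dd([a * x for x in xs]), 0.0, 1.0)
rhs = exp_dd([0] + xs)                   # e^{[0, x_0, ..., x_q]}
print(abs(lhs - rhs))
```

The residual is limited by the quadrature error of the Simpson rule, not by the identity itself.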

Corollary 1.

Let $f(x)={\rm e}^{tx}$ , where $t\in\mathbb{R}$ and $x\in\mathbb{C}$ . We denote ${\rm e}^{t[x_{0},\cdots,x_{q}]}\equiv f[x_{0},\cdots,x_{q}]$ , where $x_{0},\cdots,x_{q}\in\mathbb{C}$ . For any $\tau\in\mathbb{R}$ ,

\int^{\tau}_{0}{\rm e}^{t[x_{0},\cdots,x_{q}]}dt={\rm e}^{\tau[0,x_{0},\cdots,x_{q}]}.

(89)

This can be verified by evaluating the series expansion on both sides, in a manner similar to the proof of Lemma 1.

With these properties, we are ready to prove the bound in Identity 2 of the main text.

Theorem 1.

For any non-negative integer $q$ and $x_{0},x_{1},\cdots,x_{q}\in\mathbb{C}$ ,

\left|{\rm e}^{[x_{0},x_{1},\cdots,x_{q}]}\right|\leq{\rm e}^{[\Re(x_{0}),\Re(x_{1}),\cdots,\Re(x_{q})]},

(90)

where $\Re(\cdot)$ gives the real part of the input.

Proof.

We proceed by induction. Eq. (90) is trivially satisfied with the equality when $q=0$ . For the case $q=1$ , we have

$\displaystyle\left|{\rm e}^{[x_{0},x_{1}]}\right|$	$\displaystyle=\left|{\rm e}^{x_{0}}\right|\left|{\rm e}^{[0,x_{1}-x_{0}]}\right|$
	$\displaystyle={\rm e}^{\Re(x_{0})}\left|\int_{0}^{1}{\rm e}^{a(x_{1}-x_{0})}da\right|$
	$\displaystyle\leq{\rm e}^{\Re(x_{0})}\int_{0}^{1}\left|{\rm e}^{a(x_{1}-x_{0})}\right|da$
	$\displaystyle={\rm e}^{\Re(x_{0})}\int_{0}^{1}{\rm e}^{a\Re(x_{1}-x_{0})}da$
	$\displaystyle={\rm e}^{[\Re(x_{0}),\Re(x_{1})]},$	(91)

where the first equality uses Property 1, and the second and last equalities use Lemma 1 with a single input, for which the weight is $a^{0}=1$ . Assume that we have

\left|{\rm e}^{[x_{0},\cdots,x_{q}]}\right|\leq{\rm e}^{[\Re(x_{0}),\cdots,\Re(x_{q})]},

(92)

which is true for $q=0,1$ . It follows that

$\displaystyle\left|{\rm e}^{[x_{0},\cdots,x_{q},x_{q+1}]}\right|$	$\displaystyle=\left|{\rm e}^{x_{q+1}}\right|\left|{\rm e}^{[0,x_{0}-x_{q+1},\cdots,x_{q}-x_{q+1}]}\right|$
	$\displaystyle={\rm e}^{\Re(x_{q+1})}\left|\int_{0}^{1}a^{q}{\rm e}^{[a(x_{0}-x_{q+1}),\cdots,a(x_{q}-x_{q+1})]}da\right|$
	$\displaystyle\leq{\rm e}^{\Re(x_{q+1})}\int_{0}^{1}a^{q}\left|{\rm e}^{[a(x_{0}-x_{q+1}),\cdots,a(x_{q}-x_{q+1})]}\right|da$
	$\displaystyle\leq{\rm e}^{\Re(x_{q+1})}\int_{0}^{1}a^{q}{\rm e}^{[\Re(ax_{0}-ax_{q+1}),\cdots,\Re(ax_{q}-ax_{q+1})]}da$
	$\displaystyle={\rm e}^{\Re(x_{q+1})}{\rm e}^{[0,\Re(x_{0}-x_{q+1}),\cdots,\Re(x_{q}-x_{q+1})]}$
	$\displaystyle={\rm e}^{[\Re(x_{0}),\cdots,\Re(x_{q}),\Re(x_{q+1})]},$	(93)

where the second and the third equalities use Lemma 1 and the second inequality uses (92). This proves that the inequality holds for any number of complex inputs. ∎
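
The bound of Theorem 1 can be probed numerically. The following Python sketch is a sanity check under stated assumptions, not a proof: it draws random complex inputs (the sampling ranges and the seed are arbitrary choices made here), evaluates both sides of Eq. (90) with a truncated Property 2 series, and records the largest observed value of the left-hand side minus the right-hand side.

```python
import math
import random

def exp_dd(xs, terms=80):
    """Approximate e^{[x_0,...,x_q]} by the truncated series of Property 2."""
    q = len(xs) - 1
    h = [1 + 0j] + [0j] * terms          # h[k]: sum of all degree-k monomials in xs
    for x in xs:
        for k in range(1, terms + 1):
            h[k] += x * h[k - 1]
    return sum(h[k] / math.factorial(q + k) for k in range(terms + 1))

random.seed(0)
worst = float("-inf")
for _ in range(200):
    q = random.randint(0, 5)
    xs = [complex(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(q + 1)]
    lhs = abs(exp_dd(xs))                            # |e^{[x_0,...,x_q]}|
    rhs = exp_dd([x.real for x in xs]).real          # e^{[Re x_0,...,Re x_q]}
    worst = max(worst, lhs - rhs)
print(worst)   # expected to be non-positive up to rounding
```

Because only the real parts of the inputs enter the right-hand side, the imaginary parts (the frequencies) can only reduce the magnitude of the divided difference.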

Appendix B Bounding $\left|{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}\right|$

To bound $\left|{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}\right|$ , we use the following theorem.

Theorem 1.

For any $q+1$ complex values $x_{0},\cdots,x_{q}\in\mathbb{C}$ ,

\left|{\rm e}^{[x_{0},\cdots,x_{q}]}\right|\leq{\rm e}^{[\Re(x_{0}),\cdots,\Re(x_{q})]}=\frac{{\rm e}^{\xi}}{q!},

(94)

where $\Re(\cdot)$ denotes the real part of an input and $\xi\in\left[\min\{\Re(x_{0}),\cdots,\Re(x_{q})\},\max\{\Re(x_{0}),\cdots,\Re(x_{q})\}\right]$ .

This is proved in Appendix A. From this, we have

\left|{\rm e}^{\Delta t_{w}[x_{1},\cdots,x_{q},0]}\right|=({\Delta t_{w}})^{q}\left|{\rm e}^{[\Delta t_{w}x_{1},\cdots,\Delta t_{w}x_{q},0]}\right|\leq({\Delta t_{w}})^{q}{\rm e}^{[\Delta t_{w}\Re(x_{1}),\cdots,\Delta t_{w}\Re(x_{q}),0]}.

(95)

From the definition of $x_{j}$ , we have

\forall j\in\{1,\cdots,q\},\ \ \ \ \ \Re(x_{j})=\sum_{l=j}^{q}\Re\left(\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\leq(q-j+1)\lambda.

(96)

Since increasing any real input of ${\rm e}^{[\cdot,\cdots,\cdot]}$ can only increase its value (this can be proved by differentiating the Hermite-Genocchi form), we have

\left|{\rm e}^{\Delta t_{w}[x_{1},\cdots,x_{q},0]}\right|\leq({\Delta t_{w}})^{q}{\rm e}^{[\Delta t_{w}\Re(x_{1}),\cdots,\Delta t_{w}\Re(x_{q}),0]}\leq({\Delta t_{w}})^{q}{\rm e}^{[\Delta t_{w}q\lambda,\Delta t_{w}(q-1)\lambda,\cdots,\Delta t_{w}\lambda,0]}.

(97)

Using the recursive definition of the divided difference, its permutation symmetry, and Property 1, we have

	$\displaystyle({\Delta t_{w}})^{q}{\rm e}^{[\Delta t_{w}q\lambda,\Delta t_{w}(q-1)\lambda,\cdots,\Delta t_{w}\lambda,0]}$	$\displaystyle=({\Delta t_{w}})^{q}\frac{{\rm e}^{[\Delta t_{w}q\lambda,\Delta t_{w}(q-1)\lambda,\cdots,\Delta t_{w}\lambda]}-{\rm e}^{[\Delta t_{w}(q-1)\lambda,\cdots,\Delta t_{w}\lambda,0]}}{\Delta t_{w}\lambda q}$
		$\displaystyle=({\Delta t_{w}})^{q}\frac{{\rm e}^{\Delta t_{w}\lambda}-1}{\Delta t_{w}\lambda q}{\rm e}^{[\Delta t_{w}(q-1)\lambda,\cdots,\Delta t_{w}\lambda,0]}=\cdots=\left(\frac{{\rm e}^{\lambda\Delta t_{w}}-1}{\lambda}\right)^{q}\frac{1}{q!}.$		(98)

Therefore, we have

\left|{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}\right|\leq\frac{1}{q!}\left(\frac{{\rm e}^{\lambda\Delta t_{w}}-1}{\lambda}\right)^{q}.

(99)
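
The closed form in Eq. (98) and the resulting bound in Eq. (99) can be checked with a short Python sketch. This is an illustration under stated assumptions: the values of `q`, `lam` (playing the role of $\lambda$ ), `dt`, the random energies and exponents, and the helper names are hypothetical choices made here, and the divided differences are evaluated with a truncated Property 2 series.

```python
import math
import random

def exp_dd(xs, terms=80):
    """Approximate e^{[x_0,...,x_q]} by the truncated series of Property 2."""
    q = len(xs) - 1
    h = [1 + 0j] + [0j] * terms          # h[k]: sum of all degree-k monomials in xs
    for x in xs:
        for k in range(1, terms + 1):
            h[k] += x * h[k - 1]
    return sum(h[k] / math.factorial(q + k) for k in range(terms + 1))

def closed_form(q, lam, dt):
    """Right-hand side of Eq. (99): ((e^{lam dt} - 1) / lam)^q / q!."""
    return ((math.exp(lam * dt) - 1) / lam) ** q / math.factorial(q)

q, lam, dt = 4, 0.8, 0.5
# Equally spaced inputs [q a, (q-1) a, ..., a, 0] with a = dt * lam, as in Eq. (98).
ladder = [dt * lam * j for j in range(q, -1, -1)]
ladder_value = dt ** q * exp_dd(ladder).real

# Sample inputs x_j whose real parts obey Re(x_j) <= (q - j + 1) * lam, as in Eq. (96).
random.seed(2)
E = [random.uniform(-3, 3) for _ in range(q + 1)]
lams = [complex(random.uniform(-1, lam), random.uniform(-2, 2)) for _ in range(q)]
xs = [1j * (E[q] - E[j - 1]) + sum(lams[j - 1:]) for j in range(1, q + 1)] + [0]
sample_value = dt ** q * abs(exp_dd([dt * x for x in xs]))

print(ladder_value, closed_form(q, lam, dt), sample_value)
```

The first two printed numbers should coincide, and the third should not exceed them.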

Appendix C LCU method review

We give a brief introduction to the LCU method in this section, adopting the notation of the original paper, Berry et al. (2015), for the reader's convenience. Suppose we have a unitary $U$ that is an infinite sum of unitaries, i.e.,

U=\sum_{j=0}^{\infty}\beta_{j}V_{j},

(100)

where $\beta_{j}>0$ and $V_{j}$ are some unitaries. A truncated series, up to order $m-1$ , yields an operator

\tilde{U}=\sum_{j=0}^{m-1}\beta_{j}V_{j},

(101)

which approaches $U$ as $m$ increases. We perform the following procedure to effectively implement $\tilde{U}$ on a state $|\psi\rangle$ embedded in a larger system. Prepare an $m$ -dimensional ancilla $|0\rangle$ and implement a unitary $B$ such that

B|0\rangle=\frac{1}{\sqrt{s}}\sum_{j=0}^{m-1}\sqrt{\beta_{j}}|j\rangle,

(102)

where $s=\sum_{j=0}^{m-1}\beta_{j}$ . Suppose we have access to a controlled unitary $V_{c}$ such that for each $j$ ,

V_{c}|j\rangle|\psi\rangle=|j\rangle V_{j}|\psi\rangle.

(103)

Consider the following combination of the above operations

W\equiv\left(B^{\dagger}\otimes I\right)V_{c}\left(B\otimes I\right).

(104)

We have

W|0\rangle|\psi\rangle=\frac{1}{s}|0\rangle\tilde{U}|\psi\rangle+\sqrt{1-\frac{1}{s^{2}}}|\Phi\rangle,

(105)

where the ancillary part of $|\Phi\rangle$ is orthogonal to $|0\rangle$ . Let $P\equiv|0\rangle\langle 0|\otimes I$ denote the orthogonal projection onto the subspace in which the ancilla is in $|0\rangle$ , and $R\equiv I-2P$ the reflection operator with respect to $P$ . It can be shown that the sequence of operations $A\equiv-WRW^{\dagger}RW$ , acting on the total system, gives $A|0\rangle|\psi\rangle=|0\rangle\tilde{U}|\psi\rangle$ when $\tilde{U}$ is unitary and $s=2$ . This procedure is known as oblivious amplitude amplification (OAA). However, $\tilde{U}$ is in general not unitary because it is a truncated series of $U$ . This nonunitarity can be accounted for when $\tilde{U}\approx U$ and $s\approx 2$ . More specifically, it can be shown that if $||U-\tilde{U}||=\mathcal{O}(\delta)$ and $|s-2|=\mathcal{O}(\delta)$ , then

\left|\left|PA|0\rangle|\psi\rangle-|0\rangle U|\psi\rangle\right|\right|=\mathcal{O}(\delta).

(106)

This means when $\tilde{U}$ is $\delta$ -close to $U$ and $s$ is $\delta$ -close to 2, the effect of the operator $A$ on the whole system is $\delta$ -close to only $U$ acting on $|\psi\rangle$ .
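
The exact case ( $\tilde{U}$ unitary and $s=2$ ) can be illustrated on the smallest possible example. The Python sketch below is a toy instance chosen here, not taken from the original paper: one ancilla qubit, one system qubit, $\beta_{0}=\beta_{1}=1$ , $V_{0}=I$ and $V_{1}={\rm e}^{-i(2\pi/3)X}$ , so that $V_{0}+V_{1}={\rm e}^{-i(\pi/3)X}$ is unitary, with a Hadamard gate playing the role of $B$ . The small matrix helpers are written out in plain Python to keep the example self-contained.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def add(A, B, scale=1):
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

I2 = [[1, 0], [0, 1]]
c, sn = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)
V = [I2, [[c, -1j * sn], [-1j * sn, c]]]      # V_0 = I, V_1 = exp(-i (2 pi / 3) X)
hd = 1 / math.sqrt(2)
B = [[hd, hd], [hd, -hd]]                     # B|0> = (|0> + |1>)/sqrt(2), i.e. s = 2
P0, P1 = [[1, 0], [0, 0]], [[0, 0], [0, 1]]

Vc = add(kron(P0, V[0]), kron(P1, V[1]))      # controlled unitary, ancilla first
BI = kron(B, I2)
W = matmul(dagger(BI), matmul(Vc, BI))        # Eq. (104)
R = add(kron(I2, I2), kron(P0, I2), scale=-2) # R = I - 2P
A = matmul(W, matmul(R, matmul(dagger(W), matmul(R, W))))
A = [[-a for a in row] for row in A]          # A = -W R W^dagger R W

psi = [0.6, 0.8j]
state = psi + [0, 0]                          # |0>|psi>
out = [sum(A[i][j] * state[j] for j in range(4)) for i in range(4)]
U_tilde = add(V[0], V[1])
expected = [sum(U_tilde[i][j] * psi[j] for j in range(2)) for i in range(2)] + [0, 0]
err = max(abs(o - e) for o, e in zip(out, expected))
print(err)
```

The printed deviation should be at the level of rounding error, showing that a single round of OAA maps $|0\rangle|\psi\rangle$ to $|0\rangle\tilde{U}|\psi\rangle$ in this toy case.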

Note that the condition $||U-\tilde{U}||=\mathcal{O}(\delta)$ can be satisfied when the truncation order $m$ is high enough. However, the condition $|s-2|=\mathcal{O}(\delta)$ is satisfied only when $\beta_{j}$ are specifically chosen. By construction, we require $s=\sum_{j=0}^{m-1}\beta_{j}$ . If we choose $\beta_{j}=(\text{ln}2)^{j}/j!$ , then

s=\sum_{j=0}^{m-1}\frac{(\text{ln}2)^{j}}{j!}

(107)

becomes a truncated Taylor expansion of 2, i.e., $2={\rm e}^{\text{ln}2}$ . In fact, it can be shown that the required truncation order $m$ such that $|s-2|=\mathcal{O}(\delta)$ scales like $\log(1/\delta)/\log(\log(1/\delta))$ . With this $m$ , it also guarantees that $||U-\tilde{U}||=\mathcal{O}(\delta)$ , because

\left|\left|U-\tilde{U}\right|\right|=\left|\left|\sum_{j=m}^{\infty}\frac{(\text{ln}2)^{j}}{j!}V_{j}\right|\right|\leq\sum_{j=m}^{\infty}\frac{(\text{ln}2)^{j}}{j!}=|2-s|.

(108)

In summary, performing $A$ on an extended system $|0\rangle|\psi\rangle$ , with $\beta_{j}=(\text{ln}2)^{j}/j!$ and $m=\mathcal{O}(\log(1/\delta)/\log\log(1/\delta))$ , effectively performs $U$ on $|\psi\rangle$ with $\mathcal{O}(\delta)$ accuracy.
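
The growth of the truncation order with the target accuracy can be seen with a few lines of Python. The sketch below is illustrative only: the function name and the guard on `delta` are choices made here, and it simply finds the smallest $m$ for which the partial sum in Eq. (107) is within `delta` of 2.

```python
import math

def truncation_order(delta):
    """Smallest m with 2 - sum_{j<m} (ln 2)^j / j! <= delta, and the partial sum s."""
    if delta < 1e-12:
        raise ValueError("delta is too close to double-precision rounding for this sketch")
    s, term, m = 0.0, 1.0, 0          # term holds (ln 2)^m / m!
    while 2.0 - s > delta:
        s += term
        m += 1
        term *= math.log(2) / m
    return m, s

for delta in (1e-2, 1e-4, 1e-8):
    m, s = truncation_order(delta)
    print(delta, m, 2.0 - s)
```

The required order increases only slowly as `delta` decreases, in line with the $\log(1/\delta)/\log\log(1/\delta)$ scaling quoted above.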

Appendix D An alternative approach for the LCU setup

We provide an alternative procedure for the LCU routine that leads to an exponential saving for the state preparation. Let us define

\Gamma\equiv\max_{\forall k,i}||D^{(k)}_{i}||_{\max}.

(109)

Re-evaluating the coefficients in Eq. (32) using the $\Gamma$ above, we have

\displaystyle{\rm e}^{t_{w}x_{1}}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}=\frac{\left(\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{q!}\cos\left[\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}\right]{\rm e}^{i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}.

(110)

The evolution operator from $t_{w}$ to $t_{w}+\Delta t_{w}$ becomes

	$\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})=\sum_{z}U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle\langle z|$
	$\displaystyle=\sum_{z}\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\frac{\left(\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{2q!}\left({\rm e}^{i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}+{\rm e}^{-i\phi^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}+i\theta^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}}\right)P_{\mathbb{i}_{q}}|z\rangle\langle z|$
	$\displaystyle=\sum_{q=0}^{\infty}\frac{\left(\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{2q!}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=\pm}(-i)^{q}P_{\mathbb{i}_{q}}\Phi^{(\mathbb{k}_{q},w)}_{\mathbb{i}_{q},x}.$		(111)

The required state $|\psi_{0}\rangle$ for LCU becomes

|\psi_{0}\rangle=\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{\frac{\left(\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{2q!}}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle,

(112)

where $s$ is the normalization factor, i.e.,

s=\displaystyle\sum_{q=0}^{Q}\frac{\left(MK\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{q!}.

(113)

To prepare the state (112), we first prepare a state in the following form,

\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{\frac{\left(MK\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{q!}}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}|1\rangle^{\otimes q}|0\rangle^{\otimes(Q-q)}.

(114)

Subsequently, for each $|1\rangle$ in the first $Q$ registers (the $\mathbb{i}_{q}$ part), we transform it to $(1/\sqrt{M})\sum_{i=0}^{M}|i\rangle$ , and for each $|1\rangle$ in the latter $Q$ registers (the $\mathbb{k}_{q}$ part), we transform it to $(1/\sqrt{K})\sum_{k=1}^{K}|k\rangle$ . The state (114) becomes

\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sqrt{\frac{\left(MK\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{q!}}\sum_{\mathbb{i}_{q}}\frac{1}{\sqrt{M^{q}}}|\mathbb{i}_{q}\rangle\sum_{\mathbb{k}_{q}}\frac{1}{\sqrt{K^{q}}}|\mathbb{k}_{q}\rangle=\frac{1}{\sqrt{s}}\displaystyle\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sqrt{\frac{\left(\Gamma\widetilde{\Delta t}_{w}{\rm e}^{t_{w}\lambda}\right)^{q}}{q!}}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle,

(115)

which is the required $|\psi_{0}\rangle$ in (112) when combined with $|x\rangle$ . Since the transformations $|1\rangle\to(1/\sqrt{M})\sum_{i=0}^{M}|i\rangle$ and $|1\rangle\to(1/\sqrt{K})\sum_{k=1}^{K}|k\rangle$ map $|1\rangle$ to uniform superpositions, they can be implemented with a column of parallel Hadamard gates at a gate cost of $\mathcal{O}(\log(MK))$ . This is an exponential saving compared with the $\mathcal{O}(MK)$ cost given in the main text, and it becomes significant when $MK$ is large. However, it can create an overhead in the required number of repetitions. Indeed, the relevant quantity here is $MK\Gamma{\rm e}^{t_{w}\lambda}=MK\max_{\forall k,i}||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda}$ , compared with $\Gamma(t_{w})=\sum_{i}\sum_{k}||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}}$ in the main text, and the overall simulation cost increases monotonically with this quantity. If only a few of the $||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}}$ are much larger than the others, such that $MK\max_{\forall k,i}||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda}\gg\sum_{i}\sum_{k}||D^{(k)}_{i}||_{\max}{\rm e}^{t_{w}\lambda_{(i,k)}}$ , then the method provided in the main text is preferred. Depending on the model, one may favor one approach over the other.
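
The trade-off described above can be made concrete with a small Python sketch. All numbers below are hypothetical: `norms[i][k]` stands for $||D^{(k)}_{i}||_{\max}$ , `rates[i][k]` for $\lambda_{(i,k)}$ , and $\lambda$ is taken here to be the largest of the `rates` entries, which is an assumption of this illustration. The sketch compares the weighted quantity of the main text with the uniform quantity of this appendix.

```python
import math

def lcu_weights(norms, rates, t_w):
    """Return (weighted, uniform) for hypothetical tables of norms and rates.

    weighted = sum_{i,k} norms[i][k] * exp(t_w * rates[i][k])
    uniform  = M * K * max(norms) * exp(t_w * max(rates))
    """
    M, K = len(norms), len(norms[0])
    lam = max(max(row) for row in rates)       # assumed common bound on the rates
    gamma = max(max(row) for row in norms)     # Gamma of Eq. (109)
    weighted = sum(norms[i][k] * math.exp(t_w * rates[i][k])
                   for i in range(M) for k in range(K))
    uniform = M * K * gamma * math.exp(t_w * lam)
    return weighted, uniform

# One dominant term and several small ones: the regime where the main-text method wins.
norms = [[1.0, 0.01, 0.02], [0.03, 0.01, 0.02]]
rates = [[-0.1, -0.5, -0.3], [-0.2, -0.4, -0.1]]
weighted, uniform = lcu_weights(norms, rates, t_w=1.0)
print(weighted, uniform, uniform / weighted)
```

When the entries are all comparable the ratio approaches one and the cheaper state preparation of this appendix costs little; when a few entries dominate, as in the sample data, the ratio is large and the main-text preparation is preferable.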

$\displaystyle U(t)|z\rangle$	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\Lambda^{(k_{q})}_{i_{q}}\tau_{q}\right)D_{i_{q}}^{(k_{q})}P_{i_{q}}\cdots\exp\left(\Lambda^{(k_{1})}_{i_{1}}\tau_{1}\right)D_{i_{1}}^{(k_{1})}P_{i_{1}}|z\rangle$	(7)
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}\tau_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}\tau_{1}\right)d_{i_{q},z_{\mathbb{i}_{q}}}^{(k_{q})}\cdots d_{i_{1},z_{\mathbb{i}_{1}}}^{(k_{1})}P_{i_{q}}\cdots P_{i_{1}}|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}\tau_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}\tau_{1}\right)d_{\mathbb{i}_{q},z}^{(\mathbb{k}_{q})}P_{\mathbb{i}_{q}}|z\rangle,$

$\displaystyle U(t)|z\rangle$	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t}_{0}d\tau_{q}\cdots\int^{\tau_{2}}_{0}d\tau_{1}\exp\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}\tau_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}\tau_{1}\right)d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-it)^{q}\int^{1}_{0}ds_{q}\cdots\int^{s_{2}}_{0}ds_{1}\exp\left[t\left(\lambda^{(k_{q})}_{i_{q},z_{\mathbb{i}_{q}}}s_{q}+\cdots+\lambda^{(k_{1})}_{i_{1},z_{\mathbb{i}_{1}}}s_{1}\right)\right]d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{t[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle,$	(9)

	$\displaystyle U_{I}(t_{w}+\Delta t_{w},t_{w})|z\rangle=\mathcal{T}\text{exp}\left[-i\int_{t_{w}}^{t_{w}+\Delta t_{w}}H_{I}(t^{\prime})dt^{\prime}\right]|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\int^{t_{w}+\Delta t_{w}}_{t_{w}}d\tau_{q}\cdots\int^{\tau_{2}}_{t_{w}}d\tau_{1}\exp\Bigg{[}\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\tau_{l}\Bigg{]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle,$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\exp\left[t_{w}\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\right]$
	$\displaystyle\times\int^{\Delta t_{w}}_{0}d\tau^{\prime}_{q}\cdots\int^{\tau^{\prime}_{2}}_{0}d\tau^{\prime}_{1}\exp\Bigg{[}\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\tau^{\prime}_{l}\Bigg{]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}\exp\left[t_{w}\sum_{l=1}^{q}\left(iE_{z_{\mathbb{i}_{l}}}-iE_{z_{\mathbb{i}_{l-1}}}+\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}\right)\right]{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle$
	$\displaystyle=\sum_{q=0}^{\infty}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}(-i)^{q}{\rm e}^{-it_{w}(E_{z_{\mathbb{i}_{0}}}-E_{z_{\mathbb{i}_{q}}})}{\rm e}^{t_{w}\sum_{l=1}^{q}\lambda^{(k_{l})}_{i_{l},z_{\mathbb{i}_{l}}}}{\rm e}^{\Delta t_{w}[x_{1},x_{2},\cdots,x_{q},0]}d^{(\mathbb{k}_{q})}_{\mathbb{i}_{q},z}P_{\mathbb{i}_{q}}|z\rangle,$		(31)

	$\displaystyle|\psi_{0}\rangle$	$\displaystyle=\frac{1}{\sqrt{s}}\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}\sqrt{\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}|i_{1}\rangle\cdots|i_{q}\rangle\otimes|0\rangle^{\otimes(Q-q)}|k_{1}\rangle\cdots|k_{q}\rangle\otimes|0\rangle^{\otimes(Q-q)}|x\rangle$
		$\displaystyle\equiv\frac{1}{\sqrt{s}}\sum_{q=0}^{Q}\sum_{\mathbb{i}_{q}}\sum_{\mathbb{k}_{q}}\sum_{x=0,1}\sqrt{\frac{\widetilde{\Delta t}_{w}^{q}}{2q!}\Gamma^{(\mathbb{k}_{q})}_{\mathbb{i}_{q}}(t_{w})}|\mathbb{i}_{q}\rangle|\mathbb{k}_{q}\rangle|x\rangle,$		(39)

	$\displaystyle\frac{1}{\sqrt{s}}\left(|0\rangle+\sqrt{\displaystyle\sum_{q=1}^{Q}s_{q}}|1\rangle\right)|0\rangle$
	$\displaystyle\to\frac{1}{\sqrt{s}}\Bigg{[}|00\rangle+\sqrt{\sum_{q=1}^{Q}s_{q}}|1\rangle\frac{1}{\sqrt{\displaystyle\sum_{q=1}^{Q}s_{q}}}\Bigg{(}\sqrt{s_{1}}|0\rangle+\sqrt{\displaystyle\sum_{q=2}^{Q}s_{q}}|1\rangle\Bigg{)}\Bigg{]}$
	$\displaystyle=\frac{1}{\sqrt{s}}\left(|00\rangle+\sqrt{s_{1}}|10\rangle+\sqrt{\displaystyle\sum_{q=2}^{Q}s_{q}}|11\rangle\right).$		(49)
