Optimal Strategies of Quantum Metrology with a Strict Hierarchy

Qiushi Liu, QICI Quantum Information and Computation Initiative, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, China

Zihao Hu, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong, China

Haidong Yuan, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong, China

Yuxiang Yang, QICI Quantum Information and Computation Initiative, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, China

Abstract

One of the main quests in quantum metrology is to attain the ultimate precision limit with given resources, where the resources are not only of the number of queries, but more importantly of the allowed strategies. With the same number of queries, the restrictions on the strategies constrain the achievable precision. In this work, we establish a systematic framework to identify the ultimate precision limit of different families of strategies, including the parallel, the sequential, and the indefinite-causal-order strategies, and provide an efficient algorithm that determines an optimal strategy within the family of strategies under consideration. With our framework, we show there exists a strict hierarchy of the precision limits for different families of strategies.


Introduction.—Quantum metrology [1, 2] features a series of promising applications in the near future [3]. In the prototypical setting of quantum metrology, the goal is to estimate an unknown parameter carried by a quantum channel, given $N$ queries to it. A pivotal task is to design a strategy that utilizes these $N$ queries to generate a quantum state with as much information about the unknown parameter as possible. This often involves, for example, preparing a suitable input probe state [4, 5, 6] and applying intermediate quantum control [7, 8, 9, 10] as well as quantum error correction [11, 12, 13, 14].

In reality, the implementation of strategies is subject to physical restrictions. In particular, in the noisy intermediate-scale quantum (NISQ) era [15], we have to adjust the strategy to accommodate the limitations of the system. For example, for systems with short coherence time it might be favorable to adopt the parallel strategy [Fig. 1(a)], where multiple queries of the unknown channel are applied simultaneously on a multipartite entangled state [4]. When the system has longer coherence time and can be better controlled, one could choose to query the channel sequentially [Fig. 1(b)], which may potentially enhance the precision. In addition to the parallel and sequential strategies, it was recently discovered that the quantum SWITCH [16], a primitive where the order of making queries to the unknown channel is in a quantum superposition [Fig. 1(c)], can be employed to generate new strategies of quantum metrology [17, 18, 19] that may even break the Heisenberg limit [19]. Moreover, indefinite causal structures beyond the quantum SWITCH [16, 20, 21] [Figs. 1(d) and 1(e)] have recently been shown to further boost the performance of certain information processing tasks [22, 23]. The ultimate performance of these strategies in quantum metrology, however, remains unknown. This is mainly due to the lack of a systematic method that optimizes the probe state, the control, and the other degrees of freedom of a strategy in a unified fashion and thereby yields the ultimate precision limit.

Figure 1: Prototypical strategies of quantum metrology (for the $N=2$ case). $\mathcal{E}_{\phi}$ is a quantum channel carrying an unknown parameter $\phi$, and the blue shaded area represents a \emph{strategy}. (a) A parallel strategy. (b) A sequential strategy, where $U$ is a control operation. (c) A quantum SWITCH strategy. The blue and red lines, respectively, correspond to two different execution orders entangled with a control qubit. (d) A causal superposition strategy. Two sequential strategies, plotted in blue and in red, respectively, are entangled with a control qubit (not shown in the figure) and the output will be measured with the control qubit collectively. (e) A general indefinite-causal-order strategy.

In this work, we develop a semidefinite programming (SDP) method of evaluating the optimal precision of single-parameter quantum metrology for finite $N$ (which we call the nonasymptotic regime) over a family of admissible strategies. With this method, we show a strict hierarchy (see Fig. 2) of the optimal performances under different families of strategies, which include the parallel, the sequential, and the indefinite-causal-order [16, 20, 21] ones (see Fig. 1). In conjunction, we design an algorithm to obtain an optimal strategy achieving the highest precision. For the strategy set that admits a symmetric structure, we develop a method of reducing the complexity of our algorithms by an exponential factor.

Quantum Fisher information.—The uncertainty $\delta\hat{\phi}$ of estimating an unknown parameter $\phi$ encoded in a quantum state $\rho_{\phi}$ , for any unbiased estimator $\hat{\phi}$ , can be determined via the quantum Cramér-Rao bound (QCRB) as $\delta\hat{\phi}\geq 1/\sqrt{\nu J_{Q}(\rho_{\phi})}$ [24, 25, 26], where $J_{Q}(\rho_{\phi})$ is the quantum Fisher information (QFI) of the state $\rho_{\phi}$ and $\nu$ is the number of repeated measurements [27]. For single-parameter estimation, the QCRB is achievable, and the QFI thus quantifies the amount of information that can be extracted from the quantum state. One way to compute the QFI is [28, 29]

J_{Q}(\rho_{\phi})=4\min_{\Tr_{A}{(\outerproduct{\Psi_{\phi}}{\Psi_{\phi}})}=\rho_{\phi}}\innerproduct{\dot{\Psi}_{\phi}}{\dot{\Psi}_{\phi}},

(1)

where $\ket{\Psi_{\phi}}$ is the purification of $\rho_{\phi}$ with an ancillary space $\mathcal{H}_{A}$ , $\Tr_{A}$ denotes the partial trace over $\mathcal{H}_{A}$ , and $\dot{\Psi}:=\partial\Psi/\partial\phi$ . When the parameter is carried by a quantum channel $\mathcal{E}_{\phi}$ , i.e., a completely positive trace-preserving (CPTP) map, the channel QFI can be defined as the maximal QFI of output states using the optimal input assisted by arbitrary ancillae [28, 30, 31, 32]: $J_{Q}^{(\mathrm{chan})}(\mathcal{E}_{\phi})=\max_{\rho_{\mathrm{in}}\in\mathcal{S}(\mathcal{H}_{S}\otimes\mathcal{H}_{A})}J_{Q}[(\mathcal{E}_{\phi}\otimes\mathcal{I}_{A})(\rho_{\mathrm{in}})]$ , where $\mathcal{S}(\mathcal{H})$ denotes the space of density operators on the Hilbert space $\mathcal{H}$ , $\mathcal{H}_{S/A}$ denotes the Hilbert space of the system or ancillae, and $\mathcal{I}_{A}$ is the identity on $\mathcal{H}_{A}$ .
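As a minimal numerical sketch of these definitions, the snippet below evaluates the QFI of a pure probe state and the resulting QCRB. It relies on the standard closed form for pure states, $J_{Q}=4\left(\innerproduct{\dot{\psi}_{\phi}}{\dot{\psi}_{\phi}}-\lvert\innerproduct{\psi_{\phi}}{\dot{\psi}_{\phi}}\rvert^{2}\right)$, rather than on the minimization in Eq. (1). The probe $e^{-\mathrm{i}\phi t\sigma_{z}/2}(\ket{0}+\ket{1})/\sqrt{2}$ and all function names are illustrative assumptions, not part of the original text.

```python
import cmath
import math


def probe_state(phi, t=1.0):
    """Return |psi_phi> = exp(-i phi t sigma_z / 2)|+> and its derivative in phi."""
    a = cmath.exp(-0.5j * phi * t) / math.sqrt(2)  # amplitude on |0>
    b = cmath.exp(+0.5j * phi * t) / math.sqrt(2)  # amplitude on |1>
    psi = [a, b]
    dpsi = [-0.5j * t * a, +0.5j * t * b]          # analytic derivative
    return psi, dpsi


def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def pure_state_qfi(psi, dpsi):
    """Pure-state QFI: 4 (<dpsi|dpsi> - |<psi|dpsi>|^2)."""
    return 4.0 * (inner(dpsi, dpsi).real - abs(inner(psi, dpsi)) ** 2)


def qcrb(qfi, nu):
    """Quantum Cramer-Rao lower bound on the estimation uncertainty."""
    return 1.0 / math.sqrt(nu * qfi)


t = 2.0
psi, dpsi = probe_state(phi=0.3, t=t)
J = pure_state_qfi(psi, dpsi)   # equals t**2 for this probe
bound = qcrb(J, nu=100)         # 1 / sqrt(100 * 4)
```

For this noiseless probe the QFI equals $t^{2}$ independently of $\phi$, so with $\nu=100$ repetitions and $t=2$ the bound on $\delta\hat{\phi}$ is $0.05$.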

We denote by $\mathcal{L}(\mathcal{H})$ the set of linear operators on the finite-dimensional Hilbert space $\mathcal{H}$ , and $\mathcal{L}[\mathcal{L}(\mathcal{H}_{1}),\mathcal{L}(\mathcal{H}_{2})]$ denotes the set of linear maps from $\mathcal{L}(\mathcal{H}_{1})$ to $\mathcal{L}(\mathcal{H}_{2})$ . By the Choi-Jamiołkowski (CJ) isomorphism, a parametrized quantum channel $\mathcal{E}_{\phi}\in\mathcal{L}[\mathcal{L}(\mathcal{H}_{2i-1}),\mathcal{L}(\mathcal{H}_{2i})]$ (for $1\leq i\leq N$ ) can be represented by a positive semidefinite operator (called the CJ operator) $E_{\phi}=\mathsf{Choi}(\mathcal{E}_{\phi})=\mathcal{E}_{\phi}\otimes\mathcal{I}(\lvert I\rrangle\llangle I\rvert)$ , where $\lvert I\rrangle=\sum_{i}\ket{i}\ket{i}$ . The CJ operator of $N$ identical quantum channels is $N_{\phi}=E_{\phi}^{\otimes N}\in\mathcal{L}\left(\otimes_{i=1}^{2N}\mathcal{H}_{i}\right)$ .
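To make the CJ representation concrete, the following sketch builds $E_{\phi}$ from a Kraus decomposition and checks the trace-preservation condition. As an example it uses the $z$ rotation followed by amplitude damping that appears later in the text; the tensor-factor ordering (output first, input second) and all function names are assumptions made for illustration.

```python
import cmath
import math


def amplitude_damping_kraus(phi, p, t=1.0):
    """Kraus operators K_i^(AD) U_z(phi): a z rotation followed by amplitude damping."""
    u0 = cmath.exp(-0.5j * phi * t)  # U_z(phi) = diag(u0, u1)
    u1 = cmath.exp(+0.5j * phi * t)
    k1 = [[u0, 0], [0, math.sqrt(1 - p) * u1]]
    k2 = [[0, math.sqrt(p) * u1], [0, 0]]
    return [k1, k2]


def choi(kraus, d=2):
    """CJ operator E = sum_{i,j} E(|i><j|) (x) |i><j|, output factor first."""
    dim = d * d
    E = [[0j] * dim for _ in range(dim)]
    for K in kraus:
        for a in range(d):
            for i in range(d):
                for b in range(d):
                    for j in range(d):
                        E[a * d + i][b * d + j] += K[a][i] * complex(K[b][j]).conjugate()
    return E


def partial_trace_output(E, d=2):
    """Trace out the first (output) tensor factor of E."""
    return [[sum(E[a * d + i][a * d + j] for a in range(d)) for j in range(d)]
            for i in range(d)]


E = choi(amplitude_damping_kraus(phi=0.7, p=0.5))
trace_E = sum(E[k][k] for k in range(4))   # equals the input dimension
reduced = partial_trace_output(E)          # identity for a CPTP map
```

For a trace-preserving map, tracing out the output factor of the CJ operator gives the identity on the input space, so $\Tr E_{\phi}$ equals the input dimension.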

Strategy set in quantum metrology.—A strategy is an arrangement of physical processes (the blue shaded area in Fig. 1) which, when concatenated with given queries to $\mathcal{E}_{\phi}$ , generates an output quantum state carrying the information about $\phi$ . A strategy can be described by a CJ operator on $\mathcal{L}\left(\mathcal{H}_{F}\otimes_{i=1}^{2N}\mathcal{H}_{i}\right)$ , where $\mathcal{H}_{F}$ denotes the output Hilbert space of the concatenation, referred to as the global future space. The concatenation of two processes is characterized by the link product [33, 34] of two corresponding CJ operators $A\in\mathcal{L}\left(\otimes_{a\in\mathcal{A}}\mathcal{H}_{a}\right)$ and $B\in\mathcal{L}\left(\otimes_{b\in\mathcal{B}}\mathcal{H}_{b}\right)$ as

A*B:=\Tr_{\mathcal{A}\cap\mathcal{B}}\left[\left(\mathds{1}_{\mathcal{B}\backslash\mathcal{A}}\otimes A^{T_{\mathcal{A}\cap\mathcal{B}}}\right)\left(B\otimes\mathds{1}_{\mathcal{A}\backslash\mathcal{B}}\right)\right],

(2)

where $T_{i}$ denotes the partial transpose on $\mathcal{H}_{i}$ , and $\mathcal{H}_{\mathcal{A}/\mathcal{B}}$ denotes $\otimes_{i\in\mathcal{A}/\mathcal{B}}\mathcal{H}_{i}$ . The output state lies in the global future $F$ , which should not affect any state in the past.
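As a minimal illustration of Eq. (2) (our own example, not taken from the original text), consider an input state $\rho\in\mathcal{L}(\mathcal{H}_{1})$ linked with the CJ operator $E\in\mathcal{L}(\mathcal{H}_{1}\otimes\mathcal{H}_{2})$ of a channel $\mathcal{E}$ from $\mathcal{L}(\mathcal{H}_{1})$ to $\mathcal{L}(\mathcal{H}_{2})$. Here $\mathcal{A}=\{1\}$ and $\mathcal{B}=\{1,2\}$, so $\mathcal{A}\cap\mathcal{B}=\{1\}$, $\mathcal{B}\backslash\mathcal{A}=\{2\}$, and $\mathcal{A}\backslash\mathcal{B}$ is empty. The link product then reduces to the familiar action of the channel on the state:

\rho*E=\Tr_{1}\left[\left(\mathds{1}_{2}\otimes\rho^{T}\right)E\right]=\mathcal{E}(\rho).

The second equality follows by inserting $E=\sum_{i,j}\mathcal{E}(\outerproduct{i}{j})\otimes\outerproduct{i}{j}$ (output factor written first) and using $\Tr\left(\rho^{T}\outerproduct{i}{j}\right)=\rho_{ij}$.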

Following the above formalism, given a sufficiently large ancillary Hilbert space, a strategy set determined by the relevant causal constraints is described by a subset $\mathsf{P}$ of

\mathsf{Strat}:=\left\{P\in\mathcal{L}\left(\mathcal{H}_{F}\otimes_{i=1}^{2N}\mathcal{H}_{i}\right)\middle|P\geq 0,\mathsf{rank}(P)=1\right\}.

(3)

Here, without loss of generality [35], we have restricted $P$ to pure processes (rank-1 operators) due to the monotonicity of QFI [36]. Our goal is to identify the ultimate precision limit of parameter estimation characterized by the QFI within such constraints:

Definition 1.

The QFI of $N$ quantum channels $\mathcal{E}_{\phi}$ [37] given a strategy set $\mathsf{P}$ is

J^{(\mathsf{P})}(N_{\phi}):=\max_{P\in\mathsf{P}}J_{Q}(P*N_{\phi}),

(4)

where $J_{Q}(\rho)$ is the QFI of the state $\rho$ , and $N_{\phi}$ is the CJ operator of $N$ channels.

In general we can write the ensemble decomposition [28] of the CJ operator $N_{\phi}$ as $N_{\phi}=\sum_{i=1}^{r}\outerproduct{N_{\phi,i}}{N_{\phi,i}}=\mathbf{N}_{\phi}\mathbf{N}_{\phi}^{\dagger}$ , where $\mathbf{N}_{\phi}:=\left(\ket{N_{\phi,1}},\dots,\ket{N_{\phi,r}}\right)$ and $r:=\max_{\phi}\mathsf{rank}({N_{\phi}})$ . We also define $\dot{\tilde{\mathbf{N}}}_{\phi}:=\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h$ for $h\in\mathbb{H}_{r}$ , where $\mathbb{H}_{r}$ denotes the set of $r\times r$ Hermitian matrices, and the performance operator [38]

\Omega_{\phi}(h):=4\left(\dot{\tilde{\mathbf{N}}}_{\phi}\dot{\tilde{\mathbf{N}}}_{\phi}^{\dagger}\right)^{T}.

(5)

With these notions, we can show (see Appendix A, which is analogous to the approach in [38]) that the QFI admits the form

J^{(\mathsf{P})}(N_{\phi})=\max_{\tilde{P}\in\tilde{\mathsf{P}}}\min_{h\in\mathbb{H}_{r}}\Tr\left[\tilde{P}\Omega_{\phi}(h)\right],

(6)

with

\tilde{\mathsf{P}}:=\left\{\tilde{P}=\Tr_{F}P\middle|P\in\mathsf{P}\right\}.

(7)

To evaluate the QFI, we first exchange $\max_{\tilde{P}}$ and $\min_{h}$ without changing the optimal QFI, which is justified by the minimax theorem [39, 40] since the objective function is concave in $\tilde{P}$ and convex in $h$ [41]. Hence, the problem is cast into

J^{(\mathsf{P})}(N_{\phi})=\min_{h}\max_{\tilde{P}\in\tilde{\mathsf{P}}}\Tr\left[\tilde{P}\Omega_{\phi}(h)\right].

(8)

Then we fix $h$ and formulate the dual problem of maximization over the set $\tilde{\mathsf{P}}$ . Finally we further optimize the value of $h$ . To simplify the calculation of QFI we require that

\tilde{\mathsf{P}}=\mathsf{Conv}\left\{\bigcup_{i=1}^{K}\left\{S^{i}\geq 0\middle|S^{i}\in\mathsf{S}^{i}\right\}\right\},

(9)

where $\mathsf{Conv}\{\cdot\}$ denotes the convex hull, and each $\mathsf{S}^{i}$ for $i=1,\dots,K$ is an affine space of Hermitian operators. Adopting the above-mentioned method, we get

Theorem 1.

Given an arbitrary strategy set $\mathsf{P}$ such that $\tilde{\mathsf{P}}$ given by Eq. (7) satisfies the condition Eq. (9), the QFI of $N$ quantum channels $\mathcal{E}_{\phi}$ can be expressed as the following optimization problem:

\begin{aligned}
J^{(\mathsf{P})}(N_{\phi})=&\min_{\lambda,Q^{i},h}\lambda,\\
\mathrm{s.t.}\ &\lambda Q^{i}\geq\Omega_{\phi}(h),\ Q^{i}\in\overline{\mathsf{S}}^{i},\ i=1,\dots,K,
\end{aligned}

(10)

where $\overline{\mathsf{S}}^{i}:=\left\{Q\mathrm{\ is\ Hermitian}\middle|\Tr(QS)=1,S\in\mathsf{S}^{i}\right\}$ is the dual affine space of $\mathsf{S}^{i}$ .

The proof can be found in Appendix B. We remark that similar optimization ideas have been applied to other tasks, such as quantum Bayesian estimation [42], quantum network optimization [43], non-Markovian quantum metrology [38], and quantum channel discrimination [23]. The minimization problem in Theorem 1 can be further written in the form of SDP and solved efficiently, with detailed numerically solvable forms given in Appendix E, where the constraints in Eq. (10) can be further simplified in some cases.
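One standard route to such an SDP form is sketched here under our own assumptions; the formulation actually used by the authors is the one in Appendix E and may differ. For $\lambda>0$, absorb $\lambda$ into the dual variable by writing $\tilde{Q}^{i}:=\lambda Q^{i}$, so that the condition $Q^{i}\in\overline{\mathsf{S}}^{i}$ becomes the linear constraint $\Tr(\tilde{Q}^{i}S)=\lambda$ for $S\in\mathsf{S}^{i}$. From Eq. (5), $\Omega_{\phi}(h)=4XX^{\dagger}$ with $X:=\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h\right)^{*}$, where ${}^{*}$ denotes the entrywise complex conjugate, and $X$ is affine in $h$. By the Schur complement, the constraint $\lambda Q^{i}\geq\Omega_{\phi}(h)$ is then equivalent to the linear matrix inequality

\begin{pmatrix}\tilde{Q}^{i}/4&X\\ X^{\dagger}&\mathds{1}_{r}\end{pmatrix}\geq 0,

which is jointly linear in $(\tilde{Q}^{i},h)$, so that minimizing $\lambda$ over $(\lambda,\tilde{Q}^{i},h)$ is an SDP.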

Optimal strategies.—By itself, the QFI does not reveal how to implement the optimal strategy achieving the highest precision. Here, in addition to Theorem 1, we design an algorithm that yields a strategy attaining the optimal QFI for any strategy set satisfying Eq. (9). The method, which generalizes the method of finding an optimal probe state for a single channel [44, 45], is summarized as Algorithm 1 (see Appendix C for its derivation).

Algorithm 1 Find an optimal strategy in the set $\mathsf{P}$

(i) Given $N_{\phi}$, the CJ operator of $N$ channels, solve for an optimal value $h=h^{(\mathrm{opt})}$ in Eq. (10) of Theorem 1 via SDP.

(ii) Fixing $h=h^{(\mathrm{opt})}$, solve for an optimal value $\tilde{P}^{(\mathrm{opt})}$ of $\tilde{P}\in\tilde{\mathsf{P}}$ in Eq. (6) via SDP such that

\real\left\{\Tr\left\{\tilde{P}^{(\mathrm{opt})}\left[-\mathrm{i}\mathbf{N}_{\phi}\mathscr{H}\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h^{(\mathrm{opt})}\right)^{\dagger}\right]^{T}\right\}\right\}=0\ \mathrm{for}\ \mathrm{all}\ \mathscr{H}\in\mathbb{H}_{r},

(11)

where $\mathbf{N}_{\phi}:=\left(\ket{N_{\phi,1}},\dots,\ket{N_{\phi,r}}\right)$. An optimal strategy $P^{(\mathrm{opt})}\in\mathsf{P}$ can be taken as a purification of $\tilde{P}^{(\mathrm{opt})}$.
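The final purification step can be made explicit. As a sketch, with the labels $p_{k}$, $\ket{v_{k}}$, and $\ket{k}_{F}$ introduced here only for illustration, diagonalize $\tilde{P}^{(\mathrm{opt})}=\sum_{k}p_{k}\outerproduct{v_{k}}{v_{k}}$ with $p_{k}>0$ and take

\ket{P^{(\mathrm{opt})}}=\sum_{k}\sqrt{p_{k}}\,\ket{k}_{F}\otimes\ket{v_{k}},\quad P^{(\mathrm{opt})}=\outerproduct{P^{(\mathrm{opt})}}{P^{(\mathrm{opt})}},

where $\{\ket{k}_{F}\}$ is an orthonormal set in $\mathcal{H}_{F}$. Then $\Tr_{F}P^{(\mathrm{opt})}=\tilde{P}^{(\mathrm{opt})}$ and $\mathsf{rank}(P^{(\mathrm{opt})})=1$, and a global future space of dimension $\mathsf{rank}(\tilde{P}^{(\mathrm{opt})})$ suffices.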

By Algorithm 1 we obtain the CJ operator of a strategy that attains the optimal QFI. For strategies following definite causal order, there exists an operational method of mapping the CJ operator of the strategy to a probe state and a sequence of in-between control operations with minimal memory space [46]. For causal order superposition strategies (see the strategy set $\mathsf{Sup}$ ), we show that they can always be implemented by controlling the order of operations in a circuit with a quantum SWITCH (see Appendix I). In this way, we obtain a systematic method to identify optimal sequential and causal superposition strategies, one of the key problems in quantum metrology.

Strategy sets.—We consider the evaluation of QFI for five different families of strategies. In all the following definitions the subscript $i$ of an operator denotes the Hilbert space $\mathcal{H}_{i}$ it acts on.

The family of parallel strategies [see Fig. 1(a)] is the first and one of the most successful examples of quantum-enhanced metrology, featuring the usage of entanglement to achieve precision beyond the classical limit [47]. By making parallel use of $N$ quantum channels together with ancillae, we can regard these $N$ channels as one single channel from $\mathcal{L}\left(\otimes_{i=1}^{N}\mathcal{H}_{2i-1}\right)$ to $\mathcal{L}\left(\otimes_{i=1}^{N}\mathcal{H}_{2i}\right)$. A parallel strategy set $\mathsf{Par}$ is defined as the collection of $P\in\mathsf{Strat}$ such that [34]

\Tr_{F}{P}=\mathds{1}_{2,4,\dots,2N}\otimes P^{(1)},\ \Tr P^{(1)}=1.

(12)

Note that the optimal QFI of parallel strategies can also be evaluated using the method in [28, 30].

A more general protocol is to allow for sequential use of $N$ channels assisted by ancillae, where only the output of the former channel can affect the input of the latter channel, and any control gates can be inserted between channels [see Fig. 1(b)]. A sequential strategy set $\mathsf{Seq}$ is defined as the collection of $P\in\mathsf{Strat}$ such that [34]

\begin{aligned}
&\Tr_{F}P=\mathds{1}_{2N}\otimes P^{(N)},\ \Tr P^{(1)}=1,\\
&\Tr_{2k-1}P^{(k)}=\mathds{1}_{2k-2}\otimes P^{(k-1)},\ k=2,\dots,N.
\end{aligned}

(13)

Unlike the case of parallel strategies, there is no existing way of evaluating the exact QFI using sequential strategies.
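For concreteness, the sequential constraints can be unrolled for $N=2$ (a worked instance added here for illustration). With $P^{(2)}\in\mathcal{L}(\mathcal{H}_{1}\otimes\mathcal{H}_{2}\otimes\mathcal{H}_{3})$ and $P^{(1)}\in\mathcal{L}(\mathcal{H}_{1})$, Eq. (13) reads

\Tr_{F}P=\mathds{1}_{4}\otimes P^{(2)},\quad\Tr_{3}P^{(2)}=\mathds{1}_{2}\otimes P^{(1)},\quad\Tr P^{(1)}=1.

Taking the full trace step by step gives $\Tr(\Tr_{F}P)=\mathsf{dim}(\mathcal{H}_{4})\Tr P^{(2)}=\mathsf{dim}(\mathcal{H}_{2})\,\mathsf{dim}(\mathcal{H}_{4})$, so every element of $\tilde{\mathsf{P}}$ for the sequential set has the same fixed normalization.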

We also consider families of strategies involving indefinite causal order. The first one, denoted by $\mathsf{SWI}$, takes advantage of the (generalized) quantum SWITCH [48, 49], where the execution order of $N$ channels is entangled with the state of an $N!$-dimensional control system [see Fig. 1(c)]. See Appendix E for the formal definition.

More generally, we consider the quantum superposition of multiple sequential orders, each with a unique order of querying the $N$ channels [see Fig. 1(d)]. This can be implemented by entangling $N!$ definite causal orders with a quantum control system [50]. If $N=2$ and the control system is traced out, this notion is equivalent to causal separability [20, 21]. A causal superposition strategy set $\mathsf{Sup}$ is defined as the collection of $P\in\mathsf{Strat}$ such that

\begin{aligned}
&\Tr_{F}P=\sum_{\pi}q^{\pi}P^{\pi},\ \sum_{\pi\in S_{N}}q^{\pi}=1,\\
&P^{\pi}\in\mathsf{Seq}^{\pi},\ q^{\pi}\geq 0,\ \pi\in S_{N},
\end{aligned}

(14)

where each permutation $\pi$ is an element of the symmetric group $S_{N}$ of degree $N$, and each $\mathsf{Seq}^{\pi}$ denotes a sequential strategy set whose execution order of $N$ channels is $\mathcal{E}_{\phi}^{\pi(1)}\rightarrow\mathcal{E}_{\phi}^{\pi(2)}\rightarrow\cdots\rightarrow\mathcal{E}_{\phi}^{\pi(N)}$, having denoted by $\mathcal{E}_{\phi}^{k}$ the channel from $\mathcal{L}(\mathcal{H}_{2k-1})$ to $\mathcal{L}(\mathcal{H}_{2k})$. Note that $\mathsf{SWI}$ is a subset of $\mathsf{Sup}$, where the intermediate control is trivial. There are other strategies, such as quantum circuits with quantum controlled causal order (QC-QCs) and probabilistic QC-QCs [50, 51], which we will not discuss here.

Finally, we introduce the family of general indefinite-causal-order strategies [see Fig. 1(e)], which is the most general strategy set considered in this work. Here the only requirement is that the concatenation of the strategy $P$ with $N$ arbitrary channels results in a legitimate quantum state. The causal relations in this case [21] are a bit cumbersome, but for our purpose what matters is the dual affine space (see Theorem 1), which is simply the space of no-signaling channels [16, 43]. A general indefinite-causal-order strategy set $\mathsf{ICO}$ is defined as the collection of $P\in\mathsf{Strat}$ such that

\rho_{F}=P*\left(\otimes_{j=1}^{N}E^{j}\right),\ \rho_{F}\geq 0,\ \Tr\rho_{F}=1,

(15)

for any $E^{j}\in\mathcal{L}(\mathcal{H}_{2j-1}\otimes\mathcal{H}_{2j}\otimes\mathcal{H}_{A_{j}})$ that denotes the CJ operator of an arbitrary quantum channel with an arbitrary ancillary space $\mathcal{H}_{A_{j}}$ .

We note that, unlike the previous strategies that can always be physically realized, a physical realization of the general $\mathsf{ICO}$ is not known in general [51, 50]. The optimal value obtained with general $\mathsf{ICO}$ nevertheless serves as a useful benchmark for gauging the performance of the other strategies. For example, as we will show, in some cases the optimal QFI $J^{(\mathsf{Sup})}$ and $J^{(\mathsf{ICO})}$ are equal or nearly equal. This shows that the physically realizable strategy obtained from the set $\mathsf{Sup}$ is already optimal or nearly optimal among all possible strategies, which we would not be able to tell without $J^{(\mathsf{ICO})}$.

Symmetry reduced programs for optimal metrology.—The complexity of the original optimization problems in Theorem 1 and Algorithm 1 can be reduced by exploiting the permutation symmetry. In Appendix D, we prove that we can choose a permutation-invariant matrix $h$ for Theorem 1 and solve for a permutation-invariant optimal strategy [52] by Algorithm 1 based on this choice, if any permutation $\pi\in S_{N}$ bijectively maps each affine space $\mathsf{S}^{i}$ [in Eq. (9)] to some affine space $\mathsf{S}^{j}$ . That is, for any $\pi\in S_{N}$ and any $i$ , there exists a $j$ such that the mapping $S\mapsto G_{\pi}SG_{\pi}^{\dagger}$ on $\mathsf{S}^{i}$ is a bijective function from $\mathsf{S}^{i}$ to $\mathsf{S}^{j}$ , where $G_{\pi}$ is a unitary representation of $\pi$ . Furthermore, if each space $\mathsf{S}^{i}$ itself is permutation invariant, we can restrict each $Q^{i}\in\overline{\mathsf{S}}^{i}$ to be permutation invariant, further reducing the complexity of optimization. For both optimization problems we can apply the technique of group-invariant SDP to reduce the size as there exists an isomorphism which preserves positive semidefiniteness, from the permutation-invariant subspace to the space of block-diagonal matrices [53, Theorem 9.1]. Table 1 compares the number of variables involved in QFI evaluation with and without exploiting the symmetry (see Appendix E for its derivation as well as Appendix F for the complexity of Algorithm 1, where by group-invariant SDP we also numerically evaluate the growth of QFI $J^{(\mathsf{ICO})}$ up to $N=5$ ).

Table 1: Complexity of QFI evaluation for each family of strategies (with respect to $N$). The asymptotic numbers of variables in optimization are compared between the original (Ori.) and group-invariant (Inv.) SDP. We denote $d:=\mathsf{dim}(\mathcal{H}_{1})\mathsf{dim}(\mathcal{H}_{2})$ and $s:=\max_{\phi}\mathsf{rank}(E_{\phi})\leq d$.

SDP	$\mathsf{Par}$	$\mathsf{Seq}$	$\mathsf{SWI}$	$\mathsf{Sup}$	$\mathsf{ICO}$
Ori.	$O\left(s^{N}\right)$	$O\left(d^{N}\right)$	$O\left(s^{N}\right)$	$O\left(N!\,d^{N}\right)$	$O\left(d^{N}\right)$
Inv.	$O\left(N^{d^{2}-1}\right)$	$O\left(d^{N}\right)$	$O\left(N^{s^{2}-1}\right)$	$O\left(d^{N}\right)$	$O\left(N^{d^{2}-1}\right)$
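The polynomial scaling $O\left(N^{d^{2}-1}\right)$ in the table can be illustrated with a simple count. The sketch below compares the number of entries of a generic operator on $(\mathbb{C}^{d})^{\otimes N}$ with the dimension of the permutation-invariant operator subspace, which equals $\binom{N+d^{2}-1}{d^{2}-1}$. This is only a dimension count under our own simplifying assumptions, not the exact number of SDP variables derived in Appendix E, and the function names are hypothetical.

```python
from math import comb


def full_operator_dim(d, n):
    """Number of complex entries of a generic operator on (C^d)^{(x) n}."""
    return d ** (2 * n)


def invariant_operator_dim(d, n):
    """Dimension of the permutation-invariant operator subspace on (C^d)^{(x) n}."""
    return comb(n + d * d - 1, d * d - 1)


# d = 4 corresponds to a qubit-to-qubit channel, where d = dim(H_1) dim(H_2).
rows = [(n, full_operator_dim(4, n), invariant_operator_dim(4, n)) for n in range(1, 6)]
for n, full, inv in rows:
    print(f"N={n}: full={full}, permutation-invariant={inv}")
```

For $d=4$ and $N=5$ the count drops from $4^{10}=1048576$ to $\binom{20}{15}=15504$, which indicates why group-invariant SDP makes larger $N$ accessible.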

Hierarchy of strategies.—By substituting the definitions of different strategy sets into Theorem 1, we obtain the exact values of the optimal QFI. We find that a strict hierarchy of QFI exists quite prevalently. For demonstration purposes, here we show only the result for the amplitude damping channel for $N=2$ and supplement our findings with bountiful numerical results in Appendix G. In this case, the process encoding $\phi$ is a $z$ rotation $U_{z}(\phi)=e^{-\mathrm{i}\phi t\sigma_{z}/2}$ , where $t$ is the evolution time, followed by an amplitude damping channel described by two Kraus operators: $K_{1}^{(\mathrm{AD})}=\outerproduct{0}{0}+\sqrt{1-p}\outerproduct{1}{1}$ and $K_{2}^{(\mathrm{AD})}=\sqrt{p}\outerproduct{0}{1}$ , with the decay parameter $p$ .

In Fig. 2 we plot the QFI versus $p$ for the amplitude damping noise with all five strategy sets for $N=2$. A strict hierarchy of $\mathsf{Par}$, $\mathsf{Seq}$, and $\mathsf{ICO}$ holds if $p$ is neither 1 nor 0, i.e., $J^{(\mathsf{Par})}<J^{(\mathsf{Seq})}<J^{(\mathsf{ICO})}$. This is in contrast to the asymptotic regime of $N\to\infty$, where the relative difference between $J^{(\mathsf{Seq})}$ and $J^{(\mathsf{Par})}$ vanishes for this channel [45]. Besides, in this case general $\mathsf{ICO}$ cannot strictly outperform $\mathsf{Sup}$, implying that causally superposing two sequential strategies is sufficient to achieve the general optimality in this particular scenario. The gap between $J^{(\mathsf{Sup})}$ and $J^{(\mathsf{ICO})}$, however, can be observed for the same channel with larger $N$ or for other channels when $N=2$ (see Appendix G). In fact, by randomly sampling noise channels from CPTP channel ensembles, we find that for 984 of 1000 random channels, a strict hierarchy $J^{(\mathsf{Par})}<J^{(\mathsf{Seq})}<J^{(\mathsf{Sup})}<J^{(\mathsf{ICO})}$ holds for $N=2$, implying that there exist more powerful strategies than causal superposition strategies in these cases. We note that a strict hierarchy of strategies has been found for channel discrimination in [23], but much less was known in quantum metrology prior to our work.

Our method can also test the tightness of existing QFI bounds in the nonasymptotic regime, which has seldom been done until this work. Here we take the commonly used, asymptotically tight [45] upper bound for parallel strategies [see [28, Theorem 4 and Eq. (17)] or [30, Eq. (16)]]. For $p=0.5$ , our result shows that the exact parallel QFI $J^{(\mathsf{Par})}=1.795$ is $32.7\%$ lower than the asymptotically tight parallel upper bound $2.667$ , and even the exact sequential QFI $J^{(\mathsf{Seq})}=2.179$ is $18.3\%$ lower than this parallel upper bound [54]. Similar phenomena are observed in other noise models and for different $N$ (see Appendix H).
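The quoted percentages follow directly from the three numbers given above. The short check below, with a hypothetical helper name, reproduces them.

```python
def relative_gap(exact, bound):
    """Fraction by which `exact` falls below `bound`."""
    return (bound - exact) / bound


upper_bound = 2.667  # asymptotically tight parallel upper bound at p = 0.5 (from the text)
j_par = 1.795        # exact parallel QFI (from the text)
j_seq = 2.179        # exact sequential QFI (from the text)

gap_par = round(100 * relative_gap(j_par, upper_bound), 1)  # percent
gap_seq = round(100 * relative_gap(j_seq, upper_bound), 1)  # percent
```

This gives $32.7\%$ for the parallel QFI and $18.3\%$ for the sequential QFI, in agreement with the values stated in the text.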

With Algorithm 1 we can also construct strategies to achieve the optimal QFI. Remarkably, we find that a simple strategy of applying a quantum SWITCH using a control qubit $\ket{+}_{C}:=\left(\ket{0}_{C}+\ket{1}_{C}\right)/\sqrt{2}$ (without any additional control operations on the probe) beats any sequential strategy (which can involve complex control) in certain cases (e.g., $p<0.5$). To the best of our knowledge, this is the only instance of noisy quantum metrology so far where the advantage of indefinite causal orders is established rigorously. In Appendix I we also present two explicit examples of implementing optimal sequential and causal superposition strategies, obtained by first applying Algorithm 1 and then converting the CJ operators into quantum circuits consisting of single-qubit rotations and CNOT gates (as well as a quantum SWITCH for the case of $\mathsf{Sup}$). For optimal causal superposition strategies, the permutation symmetry allows us to control only the execution order of the channels while fixing the state preparation and intermediate control [$\rho_{\uparrow}=\rho_{\downarrow}$ and $U_{\uparrow}=U_{\downarrow}$ in Fig. 1(d)], which can be implemented by a $(2N-1)$-quantum SWITCH of $N$ channels $\mathcal{E}_{\phi}$ and $N-1$ intermediate operations.
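To illustrate the simple quantum SWITCH strategy for $N=2$, the sketch below uses the standard Kraus representation of the two-channel quantum SWITCH, $W_{ij}=K_{i}K_{j}\otimes\outerproduct{0}{0}_{C}+K_{j}K_{i}\otimes\outerproduct{1}{1}_{C}$, applied to the noise model of the text. Because $W_{ij}$ is block diagonal in the control qubit, only the two $2\times 2$ blocks are stored. The code merely verifies that the $W_{ij}$ form a valid channel; it does not compute the QFI, and the function names are assumptions made for illustration.

```python
import cmath
import math


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]


def kraus_ops(phi, p, t=1.0):
    """Kraus operators K_i^(AD) U_z(phi) for the noise model in the text."""
    u0 = cmath.exp(-0.5j * phi * t)
    u1 = cmath.exp(+0.5j * phi * t)
    return [[[u0, 0], [0, math.sqrt(1 - p) * u1]],
            [[0, math.sqrt(p) * u1], [0, 0]]]


def switch_blocks(kraus):
    """Blocks (K_i K_j, K_j K_i) of W_ij for control states |0> and |1>."""
    return [(matmul(Ki, Kj), matmul(Kj, Ki)) for Ki in kraus for Kj in kraus]


def branch_completeness(blocks, branch):
    """Sum over (i, j) of W_ij^dagger W_ij restricted to one control branch."""
    n = len(blocks[0][0])
    total = [[0j] * n for _ in range(n)]
    for pair in blocks:
        M = matmul(dagger(pair[branch]), pair[branch])
        total = [[total[r][c] + M[r][c] for c in range(n)] for r in range(n)]
    return total


blocks = switch_blocks(kraus_ops(phi=0.7, p=0.3))
sums = [branch_completeness(blocks, b) for b in (0, 1)]  # each should be the identity
```

Both branch sums equal the identity, confirming that $\sum_{i,j}W_{ij}^{\dagger}W_{ij}=\mathds{1}$. Preparing the control in $\ket{+}_{C}$ then superposes the two execution orders of the channels.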

Our result serves as a versatile tool for the demonstration of optimal quantum metrology and the design of optimal quantum sensors, especially in the context of control optimization [55, 56] and indefinite causal orders [20, 16, 21, 18, 22, 19, 17, 23].

The code accompanying the paper is openly available [57].

We thank Cyril Branciard, Alastair A. Abbott, and Raphael Mothe for helpful discussions and comments on our first manuscript. This work is supported by Guangdong Natural Science Fund—General Programme via Project 2022A1515010340, by HKU Seed Fund for Basic Research for New Staff via Project 202107185045, and by the Research Grants Council of Hong Kong through the Grant No. 14307420.

Note added.—Recently, it has been shown that neither sequential nor causal superposition strategies provide any advantage over parallel strategies asymptotically [58].

References

Giovannetti et al. [2011] V. Giovannetti, S. Lloyd, and L. Maccone, Advances in quantum metrology, Nat. Photonics 5, 222 (2011).
Degen et al. [2017] C. L. Degen, F. Reinhard, and P. Cappellaro, Quantum sensing, Rev. Mod. Phys. 89, 035002 (2017).
Martinis [2015] J. M. Martinis, Qubit metrology for building a fault-tolerant quantum computer, npj Quantum Inf. 1, 15005 (2015).
Lee et al. [2002] H. Lee, P. Kok, and J. P. Dowling, A quantum rosetta stone for interferometry, J. Mod. Opt. 49, 2325 (2002).
Bužek et al. [1999] V. Bužek, R. Derka, and S. Massar, Optimal Quantum Clocks, Phys. Rev. Lett. 82, 2207 (1999).
Kitagawa and Ueda [1993] M. Kitagawa and M. Ueda, Squeezed spin states, Phys. Rev. A 47, 5138 (1993).
Demkowicz-Dobrzański and Maccone [2014] R. Demkowicz-Dobrzański and L. Maccone, Using Entanglement Against Noise in Quantum Metrology, Phys. Rev. Lett. 113, 250801 (2014).
Yuan and Fung [2015] H. Yuan and C.-H. F. Fung, Optimal Feedback Scheme and Universal Time Scaling for Hamiltonian Parameter Estimation, Phys. Rev. Lett. 115, 110401 (2015).
Yuan [2016] H. Yuan, Sequential Feedback Scheme Outperforms the Parallel Scheme for Hamiltonian Parameter Estimation, Phys. Rev. Lett. 117, 160801 (2016).
Pang and Jordan [2017] S. Pang and A. N. Jordan, Optimal adaptive control for quantum metrology with time-dependent hamiltonians, Nat. Commun. 8, 14695 (2017).
Dür et al. [2014] W. Dür, M. Skotiniotis, F. Fröwis, and B. Kraus, Improved Quantum Metrology Using Quantum Error Correction, Phys. Rev. Lett. 112, 080801 (2014).
Kessler et al. [2014] E. M. Kessler, I. Lovchinsky, A. O. Sushkov, and M. D. Lukin, Quantum Error Correction for Metrology, Phys. Rev. Lett. 112, 150802 (2014).
Demkowicz-Dobrzański et al. [2017] R. Demkowicz-Dobrzański, J. Czajkowski, and P. Sekatski, Adaptive Quantum Metrology under General Markovian Noise, Phys. Rev. X 7, 041009 (2017).
Zhou et al. [2018] S. Zhou, M. Zhang, J. Preskill, and L. Jiang, Achieving the heisenberg limit in quantum metrology using quantum error correction, Nat. Commun. 9, 78 (2018).
Preskill [2018] J. Preskill, Quantum Computing in the NISQ era and beyond, Quantum 2, 79 (2018).
Chiribella et al. [2013] G. Chiribella, G. M. D’Ariano, P. Perinotti, and B. Valiron, Quantum computations without definite causal structure, Phys. Rev. A 88, 022318 (2013).
Chapeau-Blondeau [2021] F. Chapeau-Blondeau, Noisy quantum metrology with the assistance of indefinite causal order, Phys. Rev. A 103, 032615 (2021).
Mukhopadhyay et al. [2018] C. Mukhopadhyay, M. K. Gupta, and A. K. Pati, Superposition of causal order as a metrological resource for quantum thermometry (2018), arXiv:1812.07508 [quant-ph] .
Zhao et al. [2020] X. Zhao, Y. Yang, and G. Chiribella, Quantum Metrology with Indefinite Causal Order, Phys. Rev. Lett. 124, 190503 (2020).
Oreshkov et al. [2012] O. Oreshkov, F. Costa, and Č. Brukner, Quantum correlations with no causal order, Nat. Commun. 3, 1092 (2012).
Araújo et al. [2015] M. Araújo, C. Branciard, F. Costa, A. Feix, C. Giarmatzi, and Č. Brukner, Witnessing causal nonseparability, New J. Phys. 17, 102001 (2015).
Quintino et al. [2019] M. T. Quintino, Q. Dong, A. Shimbo, A. Soeda, and M. Murao, Reversing Unknown Quantum Transformations: Universal Quantum Circuit for Inverting General Unitary Operations, Phys. Rev. Lett. 123, 210502 (2019).
Bavaresco et al. [2021] J. Bavaresco, M. Murao, and M. T. Quintino, Strict Hierarchy between Parallel, Sequential, and Indefinite-Causal-Order Strategies for Channel Discrimination, Phys. Rev. Lett. 127, 200504 (2021).
Helstrom [1976] C. Helstrom, Quantum Detection and Estimation Theory (Academic Press, New York, 1976).
Holevo [2011] A. S. Holevo, Probabilistic and Statistical Aspects of Quantum Theory, Vol. 1 (Springer Science & Business Media, Berlin, 2011).
Braunstein and Caves [1994] S. L. Braunstein and C. M. Caves, Statistical distance and the geometry of quantum states, Phys. Rev. Lett. 72, 3439 (1994).
[27] For $\nu\rightarrow\infty$ , the maximum likelihood estimator is unbiased and the quantum Cramér-Rao bound is achieved. This should not be confused with the nonasymptotic regime (where $N$ is finite).
Fujiwara and Imai [2008] A. Fujiwara and H. Imai, A fibre bundle over manifolds of quantum channels and its application to quantum statistics, J. Phys. A 41, 255304 (2008).
Escher et al. [2011] B. M. Escher, R. L. de Matos Filho, and L. Davidovich, General framework for estimating the ultimate precision limit in noisy quantum-enhanced metrology, Nat. Phys. 7, 406 (2011).
Demkowicz-Dobrzański et al. [2012] R. Demkowicz-Dobrzański, J. Kołodyński, and M. Guţă, The elusive Heisenberg limit in quantum-enhanced metrology, Nat. Commun. 3, 1063 (2012).
Yuan and Fung [2017a] H. Yuan and C.-H. F. Fung, Fidelity and Fisher information on quantum channels, New J. Phys. 19, 113039 (2017a).
Yuan and Fung [2017b] H. Yuan and C.-H. F. Fung, Quantum parameter estimation with general dynamics, npj Quantum Inf. 3, 14 (2017b).
Chiribella et al. [2008a] G. Chiribella, G. M. D’Ariano, and P. Perinotti, Quantum circuit architecture, Phys. Rev. Lett. 101, 060401 (2008a).
Chiribella et al. [2009] G. Chiribella, G. M. D’Ariano, and P. Perinotti, Theoretical framework for quantum networks, Phys. Rev. A 80, 022339 (2009).
[35] For any CJ operator $P$ of rank $r_{P}$ , we can always choose an $r_{P}$ -dimensional ancillary space to purify $P$ , which only extends the global future space $\mathcal{H}_{F}$ without changing the causal relations between queries to the channel $\mathcal{E}_{\phi}$ . As QFI is monotonic under CPTP maps, the optimal QFI can be achieved by a purified process, and thus it is sufficient to only consider pure processes.
Petz [2008] D. Petz, Quantum Information Theory and Quantum Statistics (Springer, Berlin, 2008).
Yang [2019] Y. Yang, Memory Effects in Quantum Metrology, Phys. Rev. Lett. 123, 110501 (2019).
Altherr and Yang [2021] A. Altherr and Y. Yang, Quantum Metrology for Non-markovian Processes, Phys. Rev. Lett. 127, 060501 (2021).
Fan [1953] K. Fan, Minimax theorems, Proc. Natl. Acad. Sci. U.S.A. 39, 42 (1953).
Rockafellar [1970] R. T. Rockafellar, Convex Analysis (Princeton University Press, Princeton, NJ, 1970).
[41] We also suppose that the strategy set $\mathsf{P}$ is compact in order to apply Fan’s minimax theorem.
Chiribella [2012] G. Chiribella, Optimal networks for quantum metrology: Semidefinite programs and product rules, New J. Phys. 14, 125008 (2012).
Chiribella and Ebler [2016] G. Chiribella and D. Ebler, Optimal quantum networks and one-shot entropies, New J. Phys. 18, 093053 (2016).
Zhou and Jiang [2020] S. Zhou and L. Jiang, Optimal approximate quantum error correction for quantum metrology, Phys. Rev. Res. 2, 013235 (2020).
Zhou and Jiang [2021] S. Zhou and L. Jiang, Asymptotic theory of quantum channel estimation, PRX Quantum 2, 010343 (2021).
Bisio et al. [2011] A. Bisio, G. M. D’Ariano, P. Perinotti, and G. Chiribella, Minimal computational-space implementation of multiround quantum protocols, Phys. Rev. A 83, 022325 (2011).
Giovannetti et al. [2006] V. Giovannetti, S. Lloyd, and L. Maccone, Quantum Metrology, Phys. Rev. Lett. 96, 010401 (2006).
Colnaghi et al. [2012] T. Colnaghi, G. M. D’Ariano, S. Facchini, and P. Perinotti, Quantum computation with programmable connections between gates, Phys. Lett. A 376, 2940 (2012).
Araújo et al. [2014] M. Araújo, F. Costa, and Č. Brukner, Computational Advantage from Quantum-Controlled Ordering of Gates, Phys. Rev. Lett. 113, 250402 (2014).
Wechs et al. [2021] J. Wechs, H. Dourdent, A. A. Abbott, and C. Branciard, Quantum circuits with classical versus quantum control of causal order, PRX Quantum 2, 030335 (2021).
Purves and Short [2021] T. Purves and A. J. Short, Quantum Theory Cannot Violate a Causal Inequality, Phys. Rev. Lett. 127, 110402 (2021).
[52] Concretely, $\tilde{P}^{(\mathrm{opt})}=\Tr_{F}P^{(\mathrm{opt})}$ is permutation invariant in this case.
Bachoc et al. [2012] C. Bachoc, D. C. Gijswijt, A. Schrijver, and F. Vallentin, Invariant semidefinite programs, in Handbook on Semidefinite, Conic and Polynomial Optimization, edited by M. F. Anjos and J. B. Lasserre (Springer US, Boston, MA, 2012) pp. 219–269.
[54] Note that the best existing upper bound on the QFI of sequential strategies [7, 45] is greater than or equal to that of parallel strategies and is thus omitted here. The sequential upper bound is also harder to evaluate numerically because it cannot be readily formulated as a convex optimization problem.
Hou et al. [2019] Z. Hou, R.-J. Wang, J.-F. Tang, H. Yuan, G.-Y. Xiang, C.-F. Li, and G.-C. Guo, Control-Enhanced Sequential Scheme for General Quantum Parameter Estimation at the Heisenberg Limit, Phys. Rev. Lett. 123, 040501 (2019).
Hou et al. [2021] Z. Hou, Y. Jin, H. Chen, J.-F. Tang, C.-J. Huang, H. Yuan, G.-Y. Xiang, C.-F. Li, and G.-C. Guo, “Super-Heisenberg” and Heisenberg Scalings Achieved Simultaneously in the Estimation of a Rotating Field, Phys. Rev. Lett. 126, 070503 (2021).
Liu et al. [2022] Q. Liu, Z. Hu, H. Yuan, and Y. Yang, Quantum channel estimation within constraints on strategies, https://github.com/qiushi-liu/strategies_in_metrology (2022).
Kurdzialek et al. [2022] S. Kurdzialek, W. Gorecki, F. Albarelli, and R. Demkowicz-Dobrzanski, Using adaptiveness and causal superpositions against noise in quantum metrology (2022), arXiv:2212.08106 [quant-ph] .
Watrous [2018] J. Watrous, The Theory of Quantum Information (Cambridge University Press, Cambridge, England, 2018).
Fulton and Harris [2013] W. Fulton and J. Harris, Representation Theory: A First Course, Graduate Texts in Mathematics (Springer New York, NY, 2013).
Schur [1901] I. Schur, Ueber eine Klasse von Matrizen, die sich einer gegebenen Matrix zuordnen lassen, Ph.D. dissertation, Friedrich-Wilhelms-Universität, Berlin (1901).
Montealegre-Mora et al. [2021] F. Montealegre-Mora, D. Rosset, J.-D. Bancal, and D. Gross, Certifying numerical decompositions of compact group representations (2021), arXiv:2101.12244 [math.RT] .
Rosset et al. [2021] D. Rosset, F. Montealegre-Mora, and J.-D. Bancal, RepLAB: A computational/numerical approach to representation theory, in Quantum Theory and Symmetries, edited by M. B. Paranjape, R. MacKenzie, Z. Thomova, P. Winternitz, and W. Witczak-Krempa (Springer International Publishing, Cham, 2021) pp. 643–653.
Chiribella et al. [2008b] G. Chiribella, G. M. D’Ariano, and P. Perinotti, Transforming quantum operations: Quantum supermaps, Europhys. Lett. 83, 30004 (2008b).
Horn and Zhang [2005] R. A. Horn and F. Zhang, Basic properties of the Schur complement, in The Schur Complement and Its Applications, edited by F. Zhang (Springer US, Boston, MA, 2005) pp. 17–46.
Diamond and Boyd [2016] S. Diamond and S. Boyd, CVXPY: A Python-embedded modeling language for convex optimization, J. Mach. Learn. Res. 17, 1 (2016).
Agrawal et al. [2018] A. Agrawal, R. Verschueren, S. Diamond, and S. Boyd, A rewriting system for convex optimization problems, J. Control Decis. 5, 42 (2018).
MOSEK ApS [2021] MOSEK ApS, The MOSEK Optimizer API for Python manual. Version 9.3. (2021).
Bruzda et al. [2009] W. Bruzda, V. Cappellini, H.-J. Sommers, and K. Życzkowski, Random quantum operations, Phys. Lett. A 373, 320 (2009).
Johansson et al. [2012] J. Johansson, P. Nation, and F. Nori, QuTiP: An open-source Python framework for the dynamics of open quantum systems, Comput. Phys. Commun. 183, 1760 (2012).
Johansson et al. [2013] J. Johansson, P. Nation, and F. Nori, QuTiP 2: A Python framework for the dynamics of open quantum systems, Comput. Phys. Commun. 184, 1234 (2013).
Iten et al. [2016] R. Iten, R. Colbeck, I. Kukuljan, J. Home, and M. Christandl, Quantum circuits for isometries, Phys. Rev. A 93, 032318 (2016).
Žnidarič et al. [2008] M. Žnidarič, O. Giraud, and B. Georgeot, Optimal number of controlled-not gates to generate a three-qubit state, Phys. Rev. A 77, 032320 (2008).
Plesch and Brukner [2011] M. Plesch and Č. Brukner, Quantum-state preparation with universal gate decompositions, Phys. Rev. A 83, 032302 (2011).
Iten et al. [2019] R. Iten, O. Reardon-Smith, E. Malvetti, L. Mondada, G. Pauvert, E. Redmond, R. S. Kohli, and R. Colbeck, Introduction to UniversalQCompiler (2019), arXiv:1904.01072 [quant-ph] .


Appendix A Proof of Eq. (6) of the main text

The formalism of this proof has been developed for strategies following a definite causal order in [38], and the generalization to indefinite-causal-order strategies considered here is straightforward.

In [28], the QFI of a quantum state is expressed as a minimization problem

J_{Q}(\rho_{\phi})=4\min_{\ket{\psi_{\phi,i}}}\sum_{i=1}^{q}\Tr\left(\outerproduct{\dot{\psi}_{\phi,i}}{\dot{\psi}_{\phi,i}}\right),

(16)

for any integer $q\geq\max_{\phi}\mathsf{rank}(\rho_{\phi})$ , where $\left\{\ket{\psi_{\phi,i}}\right\}$ is a set of unnormalized vectors such that $\rho_{\phi}=\sum_{i}\outerproduct{\psi_{\phi,i}}{\psi_{\phi,i}}$ (we assume that $\left\{\ket{\psi_{\phi,i}}\right\}$ is continuously differentiable with respect to $\phi$ ). In the main text, the QFI of $N$ quantum channels $\mathcal{E}_{\phi}$ is defined as the QFI of the output state obtained from the concatenation of the CJ operator $N_{\phi}$ of $N$ quantum channels and an optimal strategy $P$ in a given strategy set $\mathsf{P}$ :

J^{(\mathsf{P})}(N_{\phi}):=\max_{P\in\mathsf{P}}J_{Q}(P*N_{\phi}).

(17)

Due to the monotonicity [36] of the QFI under CPTP maps (e.g., the partial trace operation over ancillary space in this case), by choosing a proper global future space $\mathcal{H}_{F}$ an optimal $P$ can be taken as a pure process (rank-1 operator) denoted by $\outerproduct{P}{P}$ . For a fixed $P=\outerproduct{P}{P}$ , minimization over decompositions of $P*N_{\phi}$ is equivalent to minimization over decompositions of $N_{\phi}$ , as $\mathsf{rank}(P*N_{\phi})\leq\mathsf{rank}(N_{\phi})$ .

As a positive semidefinite operator, $N_{\phi}$ has a decomposition as:

N_{\phi}=\sum_{i=1}^{r}\outerproduct{N_{\phi,i}}{N_{\phi,i}},

(18)

where $r:=\max_{\phi}\mathsf{rank}(N_{\phi})$ (we also assume that $\left\{\ket{N_{\phi,i}}\right\}$ is continuously differentiable with respect to $\phi$ ). Note that the decomposition is nonunique. Defining $\mathbf{N}_{\phi}:=\left(\ket{N_{\phi,1}},\dots,\ket{N_{\phi,r}}\right)$ , an arbitrary alternative decomposition $\tilde{\mathbf{N}}_{\phi}:=\left(\ket{\tilde{N}_{\phi,1}},\dots,\ket{\tilde{N}_{\phi,r}}\right)$ can be related to $\mathbf{N}_{\phi}$ by $\tilde{\mathbf{N}}_{\phi}=\mathbf{N}_{\phi}V_{\phi}$ , where $V_{\phi}$ is an $r\times r$ unitary matrix. Then the QFI of $N$ channels can be expressed as

J^{(\mathsf{P})}(N_{\phi})=\max_{P\in\mathsf{P}}\min_{V_{\phi}}\Tr\left[P\left(\mathds{1}_{F}\otimes\Omega_{\phi}\right)\right],

(19)

having defined the performance operator $\Omega_{\phi}:=4\left(\dot{\tilde{\mathbf{N}}}_{\phi}\dot{\tilde{\mathbf{N}}}_{\phi}^{\dagger}\right)^{T}=4\sum_{i=1}^{r}\left(\outerproduct{\dot{\tilde{N}}_{\phi,i}}{\dot{\tilde{N}}_{\phi,i}}\right)^{T}$ , where the superscript $T$ denotes the transpose. We can further define an $r\times r$ Hermitian matrix $h:=\mathrm{i}\dot{V}_{\phi}V_{\phi}^{\dagger}$ to take care of the freedom of choice for the decomposition $\tilde{\mathbf{N}}_{\phi}$ , by noting that $\dot{\tilde{\mathbf{N}}}_{\phi}\dot{\tilde{\mathbf{N}}}_{\phi}^{\dagger}=\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h\right)\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h\right)^{\dagger}$ . Now by redefining $\dot{\tilde{\mathbf{N}}}_{\phi}:=\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h\right)$ , we rewrite $\Omega_{\phi}$ as $\Omega_{\phi}(h)$ to explicitly manifest its dependence on $h$ . Hence, we finally arrive at Eq. (6) of the main text:

	$\displaystyle J^{(\mathsf{P})}(N_{\phi})$	$\displaystyle=\max_{P\in\mathsf{P}}\min_{h\in\mathbb{H}_{r}}\Tr\left\{P\left[\mathds{1}_{F}\otimes\Omega_{\phi}(h)\right]\right\}$		(20)
		$\displaystyle=\max_{\tilde{P}\in\tilde{\mathsf{P}}}\min_{h\in\mathbb{H}_{r}}\Tr\left[\tilde{P}\Omega_{\phi}(h)\right],$		(20)

where $\tilde{\mathsf{P}}:=\left\{\tilde{P}=\Tr_{F}P\middle|P\in\mathsf{P}\right\}$ . ∎
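As a minimal numerical sketch of the performance operator $\Omega_{\phi}(h)=4\left[\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h\right)\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h\right)^{\dagger}\right]^{T}$ , the following NumPy snippet evaluates it for a single-qubit phase unitary. The function name `performance_operator`, the array layout (columns of `N_mat` are the vectors $\ket{N_{\phi,i}}$ ), and the example channel are illustrative assumptions and are not taken from the accompanying code of this work.

```python
import numpy as np

def performance_operator(N_mat, dN_mat, h):
    """Return Omega(h) = 4 * [(dN - i N h)(dN - i N h)^dagger]^T.

    N_mat, dN_mat: (D, r) arrays whose columns are |N_i> and d|N_i>/dphi.
    h: (r, r) Hermitian array that fixes the choice of decomposition.
    """
    dN_tilde = dN_mat - 1j * N_mat @ h
    return 4 * (dN_tilde @ dN_tilde.conj().T).T

# Illustrative example (an assumption): the unitary U = exp(-i*phi*Z/2),
# whose vectorization has a single column, so r = 1 and h is a 1x1 matrix.
phi = 0.7
a, b = np.exp(-0.5j * phi), np.exp(0.5j * phi)
N_mat = np.array([[a], [0], [0], [b]])
dN_mat = np.array([[-0.5j * a], [0], [0], [0.5j * b]])

omega0 = performance_operator(N_mat, dN_mat, np.array([[0.0]]))
omega1 = performance_operator(N_mat, dN_mat, np.array([[0.3]]))
print(np.trace(omega0).real, np.trace(omega1).real)
```

For every Hermitian $h$ the output is Hermitian and positive semidefinite by construction, which is what allows it to appear on the right-hand side of operator inequalities in the dual problem.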

Appendix B Proof of Theorem 1

Starting from Eq. (6) of the main text, we exchange the order of minimization and maximization thanks to Fan’s minimax theorem [39], since $\Tr\left[\tilde{P}\Omega_{\phi}(h)\right]$ is concave on $\tilde{P}$ and convex on $h$ , and $\tilde{\mathsf{P}}$ is assumed to be a compact set. Then the problem of QFI evaluation can be rewritten as

J^{(\mathsf{P})}(N_{\phi})=\min_{h}\max_{\tilde{P}\in\tilde{\mathsf{P}}}\Tr\left[\tilde{P}\Omega_{\phi}(h)\right].

(21)

Reformulating the condition of Theorem 1, we require that each operator $\tilde{P}\in\tilde{\mathsf{P}}$ can be written as a convex combination of positive semidefinite operators $S^{i}$ , $i=1,\dots,K$ :

\tilde{P}=\sum_{i=1}^{K}q^{i}S^{i},\ \mathrm{for}\ \sum_{i=1}^{K}q^{i}=1,\ q^{i}\geq 0,\ S^{i}\geq 0,\ S^{i}\in\mathsf{S}^{i},\ i=1,\dots,K,

(22)

where each $\mathsf{S}^{i}$ is an affine space of Hermitian operators. Thus Eq. (21) can be reformulated as

$\displaystyle J^{(\mathsf{P})}(N_{\phi})=$	$\displaystyle\min_{h}\max_{\tilde{P}}\Tr\left[\tilde{P}\Omega_{\phi}(h)\right],$	(23)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\tilde{P}=\sum_{i=1}^{K}q^{i}S^{i},$
	$\displaystyle\sum_{i=1}^{K}q^{i}=1,$
	$\displaystyle\,q^{i}\geq 0,\ S^{i}\geq 0,\ S^{i}\in\mathsf{S}^{i},\ i=1,\dots,K.$

For now we fix $h$ and consider the dual problem of maximization over $\tilde{\mathsf{P}}$ . For each affine space $\mathsf{S}^{i}$ we have defined its dual affine space $\overline{\mathsf{S}}^{i}$ , whose dual affine space in turn is exactly $\mathsf{S}^{i}$ [43]. Choose an affine basis $\{Q^{i,j}\}_{j=1}^{L_{i}}$ for $\overline{\mathsf{S}}^{i}$ , and the maximization problem is further expressed as

$\displaystyle\max_{\tilde{P}}$	$\displaystyle\Tr\left[\tilde{P}\Omega_{\phi}(h)\right],$	(24)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\tilde{P}=\sum_{i=1}^{K}q^{i}S^{i},$
	$\displaystyle\sum_{i=1}^{K}q^{i}=1,$
	$\displaystyle\,q^{i}\geq 0,\ S^{i}\geq 0,\ \Tr\left(S^{i}Q^{i,j}\right)=1,\ i=1,\dots,K,\ j=1,\dots,L_{i}.$

Defining $P^{i}:=q^{i}S^{i}$ to avoid the product of variables in optimization, we have

$\displaystyle\max$	$\displaystyle\Tr\left[\tilde{P}\Omega_{\phi}(h)\right],$	(25)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\tilde{P}=\sum_{i=1}^{K}P^{i},$
	$\displaystyle\sum_{i=1}^{K}q^{i}=1,$
	$\displaystyle\,P^{i}\geq 0,\ \Tr\left(P^{i}Q^{i,j}\right)=q^{i},\ i=1,\dots,K,\ j=1,\dots,L_{i},$

where the constraints $q^{i}\geq 0$ can be safely removed, since $\Tr S^{i}=\prod_{j=1}^{N}d_{2j}$ , implying that $\overline{\mathsf{S}}^{i}$ includes a positive operator proportional to identity for any $i=1,\dots,K$ , having denoted $d_{j}:=\mathsf{dim}(\mathcal{H}_{j})$ for simplicity. The Lagrangian of the problem is given by

	$\displaystyle L$	$\displaystyle=\sum_{i}\Tr\left[P^{i}\Omega_{\phi}(h)\right]+\left(1-\sum_{i}q^{i}\right)\lambda+\sum_{i}\Tr\left(P^{i}\tilde{Q}^{i}\right)+\sum_{i,j}\left[q^{i}-\Tr\left(P^{i}Q^{i,j}\right)\right]\lambda^{i,j}$		(26)
		$\displaystyle=\lambda+\sum_{i}\Tr\left\{P^{i}\left[\Omega_{\phi}(h)+\tilde{Q}^{i}-\sum_{j}\lambda^{i,j}Q^{i,j}\right]\right\}+\sum_{i}\left[q^{i}\left(\sum_{j}\lambda^{i,j}-\lambda\right)\right],$		(26)

for $\tilde{Q}^{i}\geq 0$ . Hence, by removing $\tilde{Q}^{i}$ the dual problem is written as

	$\displaystyle\min$	$\displaystyle\,\lambda,$		(27)
	$\displaystyle\mathrm{s.t.}$	$\displaystyle\sum_{j}\lambda^{i,j}Q^{i,j}\geq\Omega_{\phi}(h),\ \lambda=\sum_{j}\lambda^{i,j},\ i=1,\dots,K,\ j=1,\dots,L_{i}.$		(27)

We define $Q^{i}:=\sum_{j}\lambda^{i,j}Q^{i,j}/\lambda$ if $\lambda\neq 0$ ( $\lambda=0$ corresponds to a trivial case where the QFI is zero), and clearly $Q^{i}$ is an arbitrary operator in the set $\overline{\mathsf{S}}^{i}$ . Therefore, we cast the dual problem into

	$\displaystyle\min$	$\displaystyle\,\lambda,$		(28)
	$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\lambda Q^{i}\geq\Omega_{\phi}(h),\ Q^{i}\in\overline{\mathsf{S}}^{i},\ i=1,\dots,K.$		(28)

Slater’s theorem [59] implies that the strong duality holds, since the QFI is finite and the inequality constraints can be strictly satisfied for a positive semidefinite operator $\Omega_{\phi}(h)$ , by choosing $\lambda Q^{i}=\mu\lVert\Omega_{\phi}(h)\rVert\mathds{1}_{1,2,\dots,2N}$ for $\mu>1$ and any $i=1,\dots,K$ , having denoted the operator norm by $\lVert\cdot\rVert$ . Finally, by optimizing the choice of $h$ we derive the result of Theorem 1. ∎

Appendix C Proof of the validity of Algorithm 1

We first recall the minimax theorem:

\min_{x}\max_{y}f(x,y)=\max_{y}\min_{x}f(x,y)

(29)

for a function $f(x,y)$ convex on $x$ and concave on $y$ . Assume $(x_{0},y_{1})$ is a solution for the L.H.S. of Eq. (29) and $(x_{1},y_{0})$ is a solution for the R.H.S. of Eq. (29). It is easy to see that

f(x_{0},y_{1})\geq f(x_{0},y_{0})\geq f(x_{1},y_{0}).

(30)

In view of Eq. (29) both equalities hold. Therefore, $(x_{0},y_{0})$ is a saddle point of $f(x,y)$ , i.e., $x_{0}=\operatorname*{arg\,min}_{x}{f(x,y_{0})}$ and $y_{0}=\operatorname*{arg\,max}_{y}{f(x_{0},y)}$ . Since the objective function $\Tr\left[\tilde{P}\Omega_{\phi}(h)\right]$ in the primal problem of estimating QFI is convex on $h$ and concave on $\tilde{P}$ , we can substitute $x=h$ and $y=\tilde{P}$ . Obviously, $h^{(\mathrm{opt})}$ is an optimal solution for $\min_{h}\max_{\tilde{P}}\Tr\left[\tilde{P}\Omega_{\phi}(h)\right]$ and thus corresponds to a saddle point. Then $x_{0}=\operatorname*{arg\,min}_{x}{f(x,y_{0})}$ can be satisfied by requiring $\partial_{h}f\left(h,\tilde{P}^{(\mathrm{opt})}\right)|_{h=h^{(\mathrm{opt})}}=0$ , resulting in Eq. (11) in the main text:

\real\left\{\Tr\left\{\tilde{P}^{(\mathrm{opt})}\left[-\mathrm{i}\mathbf{N}_{\phi}\mathscr{H}\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h^{(\mathrm{opt})}\right)^{\dagger}\right]^{T}\right\}\right\}=0\ \mathrm{for}\ \mathrm{all}\ \mathscr{H}\in\mathbb{H}_{r}.

(31)

Meanwhile $y_{0}=\operatorname*{arg\,max}_{y}{f(x_{0},y)}$ corresponds to the requirement that $\tilde{P}^{(\mathrm{opt})}$ is an optimal solution for fixed $h=h^{(\mathrm{opt})}$ . Therefore, $\left(h^{(\mathrm{opt})},\tilde{P}^{(\mathrm{opt})}\right)$ is a saddle point and an optimal solution for $\max_{\tilde{P}}\min_{h}\Tr\left[\tilde{P}\Omega_{\phi}(h)\right]$ . By definition a purification of $\tilde{P}^{(\mathrm{opt})}$ is an optimal physically implemented strategy, i.e., we can choose a strategy $P^{(\mathrm{opt})}$ such that $\Tr_{F}P^{(\mathrm{opt})}=\tilde{P}^{(\mathrm{opt})}$ . ∎
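The saddle-point argument can be illustrated on a toy function. The sketch below uses $f(x,y)=x^{2}+xy-y^{2}$ , which is convex in $x$ and concave in $y$ , and compares the two orders of optimization on a grid. The function and the grid are hypothetical stand-ins for $\Tr\left[\tilde{P}\Omega_{\phi}(h)\right]$ and its domains, chosen only so that the result can be checked by hand.

```python
import numpy as np

def f(x, y):
    # Toy objective: convex in x (plays the role of h), concave in y (plays the role of P~).
    return x**2 + x * y - y**2

grid = np.linspace(-1.0, 1.0, 201)
F = f(grid[:, None], grid[None, :])   # F[i, j] = f(x_i, y_j)

min_max = F.max(axis=1).min()         # min over x of max over y
max_min = F.min(axis=0).max()         # max over y of min over x

# Grid point that attains the min-max value; it sits at the saddle point.
ix = F.max(axis=1).argmin()
iy = F[ix].argmax()
print(min_max, max_min, grid[ix], grid[iy])
```

Both orders of optimization give the same value, and the optimizer of one order together with the optimizer of the other forms the saddle point, here the origin.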

We remark that Eq. (31) can be reformulated as a set of linear constraints by choosing a basis $\{\mathscr{H}^{i}\}_{i=1}^{r^{2}}$ for the space $\mathbb{H}_{r}$ of Hermitian matrices. For example, denoting by $E_{jk}$ the $r\times r$ matrix of which only the $(j,k)$ -th element is $1$ and all other elements are $0$ , we can choose $\mathscr{H}=E_{jj}$ for $j=1,\dots,r$ , together with $\mathscr{H}=E_{jk}+E_{kj}$ and $\mathscr{H}=\mathrm{i}\left(E_{jk}-E_{kj}\right)$ for $1\leq j<k\leq r$ , and obtain a series of $r^{2}$ constraints, which are equivalent to Eq. (31).
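A basis of this kind can be generated as in the following sketch, where each off-diagonal pair of indices is taken once so that exactly $r^{2}$ elements result. The helper name `hermitian_basis` is an assumption introduced for illustration.

```python
import numpy as np

def hermitian_basis(r):
    """Return r**2 Hermitian matrices spanning the real space of r x r Hermitian matrices."""
    basis = []
    for j in range(r):
        diag = np.zeros((r, r), dtype=complex)
        diag[j, j] = 1.0
        basis.append(diag)                      # E_jj
    for j in range(r):
        for k in range(j + 1, r):
            sym = np.zeros((r, r), dtype=complex)
            sym[j, k] = sym[k, j] = 1.0         # E_jk + E_kj
            asym = np.zeros((r, r), dtype=complex)
            asym[j, k], asym[k, j] = 1j, -1j    # i (E_jk - E_kj)
            basis.extend([sym, asym])
    return basis

basis = hermitian_basis(3)
print(len(basis))
```

Imposing the linear condition for each basis element is sufficient because the left-hand side of the condition is real-linear in $\mathscr{H}$ .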

Appendix D Symmetry reduced optimization

This section demonstrates how to reduce the size of the optimization problems in Theorem 1 and Algorithm 1 by exploiting permutation symmetry where applicable.

Let us begin with some notations. We consider the action of $S_{N}$ , the symmetric group of degree $N$ , on the finite-dimensional representation space $\otimes_{i=1}^{N}\mathcal{W}_{i}$ . $G_{\pi}$ is a unitary (and orthogonal) operator on $\otimes_{i=1}^{N}\mathcal{W}_{i}$ corresponding to the permutation $\pi\in S_{N}$ : $G_{\pi}=\sum_{\mathbf{i}=(i_{1},\dots,i_{N})}\left(\otimes_{j=1}^{N}\ket{i_{\pi(j)}}\right)\otimes_{k=1}^{N}\bra{i_{k}}$ , where $\{\ket{i_{j}}\}_{i}$ denotes an orthonormal basis of $\mathcal{W}_{j}$ . Then an operator $X$ on $\otimes_{i=1}^{N}\mathcal{W}_{i}$ is said to be permutation invariant iff $G_{\pi}XG_{\pi}^{\dagger}=X$ for all $\pi\in S_{N}$ . Analogously, a space $\mathcal{X}$ is permutation invariant iff $G_{\pi}XG_{\pi}^{\dagger}\in\mathcal{X}$ for any $X\in\mathcal{X}$ and any $\pi\in S_{N}$ .
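The permutation operator and the notion of permutation invariance can be made concrete with a short sketch. It builds $G_{\pi}$ as a matrix, assuming for simplicity that all factors $\mathcal{W}_{i}$ have the same dimension, and symmetrizes an arbitrary operator by averaging over the group. The names `permutation_operator` and `twirl`, and the use of zero-based permutations, are assumptions made for illustration.

```python
import itertools
import numpy as np

def permutation_operator(perm, dim):
    """Matrix of G_pi: maps |i_1,...,i_N> to |i_pi(1),...,i_pi(N)> (zero-based perm)."""
    n = len(perm)
    shape = (dim,) * n
    G = np.zeros((dim**n, dim**n))
    for idx in itertools.product(range(dim), repeat=n):
        src = np.ravel_multi_index(idx, shape)
        dst = np.ravel_multi_index(tuple(idx[p] for p in perm), shape)
        G[dst, src] = 1.0
    return G

def twirl(X, dim, n):
    """Average G_pi X G_pi^dagger over all permutations, giving an invariant operator."""
    perms = list(itertools.permutations(range(n)))
    total = sum(permutation_operator(p, dim) @ X @ permutation_operator(p, dim).T
                for p in perms)
    return total / len(perms)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 8))
X_inv = twirl(X, dim=2, n=3)
G = permutation_operator((1, 2, 0), dim=2)
print(np.allclose(G @ X_inv @ G.T, X_inv))
```

Since $G_{\pi}$ is a real permutation matrix, its adjoint is its transpose, which is why `.T` is used in place of a conjugate transpose.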

We now explicitly express the components of the performance operator $\Omega_{\phi}(h)$ . Given $N$ identical quantum channels $\mathcal{E}_{\phi}(\rho)=\sum_{i}K_{\phi,i}\rho K_{\phi,i}^{\dagger}$ , we decompose the CJ operator $E_{\phi}=\sum_{i}\outerproduct{E_{\phi,i}}{E_{\phi,i}}$ corresponding to each channel. Note that $\ket{E_{\phi,i}}$ is the vectorization of the Kraus operator $K_{\phi,i}$ , i.e., $\ket{E_{\phi,i}}=\sum_{m,n}\matrixelement{m}{K_{\phi,i}}{n}\ket{m}\ket{n}$ . The CJ operator of $N$ identical quantum channels is $N_{\phi}=E_{\phi}^{\otimes N}=\sum_{\mathbf{i}}\outerproduct{N_{\phi,\mathbf{i}}}{N_{\phi,\mathbf{i}}}$ , where we use the notation $\ket{N_{\phi,\mathbf{i}=(i_{1},\dots,i_{N})}}=\otimes_{n=1}^{N}\ket{E_{\phi,i_{n}}}$ . Taking the derivative results in (we omit the subscript $\phi$ in $\ket{E_{\phi,i}}$ ) $\ket{\dot{N}_{\phi,\mathbf{i}}}=\sum_{j=1}^{N}\ket{E_{i_{1}}}\cdots\ket{E_{i_{j-1}}}\ket{\dot{E}_{i_{j}}}\ket{E_{i_{j+1}}}\cdots\ket{E_{i_{N}}}$ , from which we can obtain the performance operator $\Omega_{\phi}(h)=4\left(\dot{\tilde{\mathbf{N}}}_{\phi}\dot{\tilde{\mathbf{N}}}_{\phi}^{\dagger}\right)^{T}=4\sum_{\mathbf{i}}\left(\outerproduct{\dot{\tilde{N}}_{\phi,\mathbf{i}}}{\dot{\tilde{N}}_{\phi,\mathbf{i}}}\right)^{T}$ , where $\ket{\dot{\tilde{N}}_{\phi,\mathbf{j}}}=\ket{\dot{N}_{\phi,\mathbf{j}}}-\mathrm{i}\sum_{\mathbf{k}}\ket{N_{\phi,\mathbf{k}}}h_{\mathbf{k}\mathbf{j}}$ .
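The construction of $\ket{N_{\phi,\mathbf{i}}}$ and $\ket{\dot{N}_{\phi,\mathbf{i}}}$ from the Kraus operators can be sketched as follows. The example channel (a phase rotation followed by dephasing with probability `p`) and all function names are illustrative assumptions. The vectorization uses row-major flattening, which matches $\sum_{m,n}\matrixelement{m}{K}{n}\ket{m}\ket{n}$ .

```python
import functools
import itertools
import numpy as np

Z = np.diag([1.0, -1.0])

def kraus(phi, p=0.1):
    """Kraus operators of an assumed example channel: phase rotation, then dephasing."""
    U = np.diag([np.exp(-0.5j * phi), np.exp(0.5j * phi)])
    return [np.sqrt(1 - p) * U, np.sqrt(p) * Z @ U]

def kraus_derivative(phi, p=0.1):
    """Derivatives of the Kraus operators above with respect to phi."""
    dU = np.diag([-0.5j * np.exp(-0.5j * phi), 0.5j * np.exp(0.5j * phi)])
    return [np.sqrt(1 - p) * dU, np.sqrt(p) * Z @ dU]

def n_copy_vectors(phi, n):
    """Columns |N_i> and |dN_i/dphi> for every multi-index i = (i_1, ..., i_n)."""
    E = [K.reshape(-1) for K in kraus(phi)]              # |E_i> = vec(K_i)
    dE = [K.reshape(-1) for K in kraus_derivative(phi)]
    N_cols, dN_cols = [], []
    for idx in itertools.product(range(len(E)), repeat=n):
        N_cols.append(functools.reduce(np.kron, [E[i] for i in idx]))
        # Product rule: differentiate one tensor factor at a time and sum.
        dN_cols.append(sum(
            functools.reduce(np.kron, [dE[i] if pos == j else E[i]
                                       for pos, i in enumerate(idx)])
            for j in range(n)))
    return np.column_stack(N_cols), np.column_stack(dN_cols)

N_mat, dN_mat = n_copy_vectors(0.7, n=2)
print(N_mat.shape, dN_mat.shape)
```

The product-rule derivative can be compared against a central finite difference of `N_mat` as a sanity check on the index bookkeeping.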

D.1 Symmetry reduced QFI evaluation

With these notations, we first consider the optimization in QFI evaluation and have the following:

Lemma 1.

In the optimization problem of Theorem 1, if, for any $\pi\in S_{N}$ and any $i$ , there exists a $j$ such that the mapping $S\mapsto G_{\pi}SG_{\pi}^{\dagger}$ on $\mathsf{S}^{i}$ is a bijective function from $\mathsf{S}^{i}$ to $\mathsf{S}^{j}$ , then there must exist a permutation-invariant $h$ as a feasible solution.

Proof.

Without loss of generality we assume all $\mathsf{S}^{i}$ are distinct spaces; otherwise, we just remove the duplicate ones. We first prove that, under any permutation operation $Q\mapsto G_{\pi}^{\dagger}QG_{\pi}$ , for any $i\in\{1,\dots,K\}$ there exists a unique $j\in\{1,\dots,K\}$ such that the dual affine space $\overline{\mathsf{S}}^{j}$ is bijectively mapped to $\overline{\mathsf{S}}^{i}$ . The condition of the lemma implies that, for any $G_{\pi}$ and any $i$ we can find $j_{i}$ such that $G_{\pi}S^{i}G_{\pi}^{\dagger}\in\mathsf{S}^{j_{i}}$ for all $S^{i}\in\mathsf{S}^{i}$ . Due to the bijectivity, $\mathsf{S}^{i}$ and the corresponding $\mathsf{S}^{j_{i}}$ are isomorphic, with different $j_{i}$ for different $i$ . Apparently $\overline{\mathsf{S}}^{i}$ and $\overline{\mathsf{S}}^{j_{i}}$ are also isomorphic. If we choose any $Q^{j_{i}}\in\overline{\mathsf{S}}^{j_{i}}$ , i.e., $\Tr(Q^{j_{i}}S^{j_{i}})=1$ for all $S^{j_{i}}\in\mathsf{S}^{j_{i}}$ , then we have $\Tr(G_{\pi}^{\dagger}Q^{j_{i}}G_{\pi}S^{i})=\Tr(Q^{j_{i}}G_{\pi}S^{i}G_{\pi}^{\dagger})=1$ for all $S^{i}\in\mathsf{S}^{i}$ . Therefore, $G_{\pi}^{\dagger}Q^{j_{i}}G_{\pi}\in\overline{\mathsf{S}}^{i}$ for all $Q^{j_{i}}\in\overline{\mathsf{S}}^{j_{i}}$ . Furthermore, the permutation operation $Q\mapsto G_{\pi}^{\dagger}QG_{\pi}$ from $\overline{\mathsf{S}}^{j_{i}}$ to $\overline{\mathsf{S}}^{i}$ is bijective as $\overline{\mathsf{S}}^{j_{i}}$ and $\overline{\mathsf{S}}^{i}$ are isomorphic. In particular, any set of $K$ operators $\{Q^{i}\}_{i=1}^{K}$ for $Q^{i}\in\overline{\mathsf{S}}^{i}$ is mapped to a set $\{\tilde{Q}^{i}\}_{i=1}^{K}$ such that $\tilde{Q}^{i}\in\overline{\mathsf{S}}^{i}$ .

We then prove that there exists a permutation-invariant performance operator $\Omega_{\phi}(h)$ as a feasible solution. The group action is characterized by the permutation operator $G_{\pi}$ on the representation space $\otimes_{i}\mathcal{W}_{i}=\otimes_{i}\left(\mathcal{H}_{2i-1}\otimes\mathcal{H}_{2i}\right)$ . Suppose $h^{(\mathrm{opt})}$ is an optimal solution, i.e., for any $i\in\{1,\dots,K\}$ , there exist optimal values of $\lambda$ and $Q^{i}$ such that $\lambda Q^{i}\geq\Omega_{\phi}\left(h^{(\mathrm{opt})}\right)$ for $Q^{i}\in\overline{\mathsf{S}}^{i}$ . Then for any permutation $\pi\in S_{N}$ we have

\lambda G_{\pi}Q^{i}G_{\pi}^{\dagger}\geq G_{\pi}\Omega_{\phi}\left(h^{(\mathrm{opt})}\right)G_{\pi}^{\dagger},\ Q^{i}\in\overline{\mathsf{S}}^{i},\ i=1,\dots,K.

(32)

Since $G_{\pi}$ maps $\{Q^{i}\}_{i}$ to a set $\{\tilde{Q}^{i}\}_{i}$ such that $\tilde{Q}^{i}\in\overline{\mathsf{S}}^{i}$ , the constraint Eq. (32) becomes

\lambda\tilde{Q}^{i}\geq G_{\pi}\Omega_{\phi}\left(h^{(\mathrm{opt})}\right)G_{\pi}^{\dagger},\ \tilde{Q}^{i}\in\overline{\mathsf{S}}^{i},\ i=1,\dots,K,

(33)

which implies that $G_{\pi}\Omega_{\phi}(h^{(\mathrm{opt})})G_{\pi}^{\dagger}$ is also a feasible solution for any $\pi$ . Furthermore, the permutation-invariant solution of the performance operator $\frac{1}{N!}\sum_{\pi\in S_{N}}G_{\pi}\Omega_{\phi}(h^{(\mathrm{opt})})G_{\pi}^{\dagger}$ is feasible.

Next, we show that we can choose a permutation-invariant $h$ such that the performance operator $\Omega_{\phi}(h)=\frac{1}{N!}\sum_{\pi\in S_{N}}G_{\pi}\Omega_{\phi}\left(h^{(\mathrm{opt})}\right)G_{\pi}^{\dagger}$ . $h$ is an operator on $\otimes_{i=1}^{N}\mathbb{C}^{s}$ , where $s:=\max_{\phi}\mathsf{rank}(E_{\phi})\leq d$ . For distinction we denote the group action on $h$ by $G_{\pi}^{\prime}hG_{\pi}^{\prime\dagger}$ and the group action on $\Omega_{\phi}(h)$ by $G_{\pi}\Omega_{\phi}(h)G_{\pi}^{\dagger}$ . Writing $h$ in components, we have

$\displaystyle\sum_{\mathbf{i},\mathbf{j}}G_{\pi}^{\prime}h_{\mathbf{i}\mathbf{j}}\outerproduct{\mathbf{i}}{\mathbf{j}}G_{\pi}^{\prime\dagger}$	$\displaystyle=\sum_{\mathbf{i},\mathbf{j}}h_{\mathbf{i}\mathbf{j}}\outerproduct{\pi(\mathbf{i})}{\pi(\mathbf{j})}$	(34)
	$\displaystyle=\sum_{\pi^{-1}(\mathbf{i}),\pi^{-1}(\mathbf{j})}h_{\pi^{-1}(\mathbf{i})\pi^{-1}(\mathbf{j})}\outerproduct{\mathbf{i}}{\mathbf{j}}$
	$\displaystyle=\sum_{\mathbf{i},\mathbf{j}}h_{\pi^{-1}(\mathbf{i})\pi^{-1}(\mathbf{j})}\outerproduct{\mathbf{i}}{\mathbf{j}},$

where $\pi(\mathbf{i}):=(i_{\pi(1)},\dots,i_{\pi(N)})$ . Note that $G_{\pi}\ket{N_{\phi,\mathbf{i}}}=\ket{N_{\phi,\pi(\mathbf{i})}}$ and

$\displaystyle G_{\pi}\ket{\dot{N}_{\phi,\mathbf{i}}}$	$\displaystyle=G_{\pi}\sum_{j=1}^{N}\ket{E_{i_{1}}}\cdots\ket{E_{i_{j-1}}}\ket{\dot{E}_{i_{j}}}\ket{E_{i_{j+1}}}\cdots\ket{E_{i_{N}}}$	(35)
	$\displaystyle=\sum_{j=1}^{N}\ket{E_{i_{\pi(1)}}}\cdots\ket{E_{i_{\pi[\pi^{-1}(j)-1]}}}\ket{\dot{E}_{i_{j}}}\ket{E_{i_{\pi[\pi^{-1}(j)+1]}}}\cdots\ket{E_{i_{\pi(N)}}}$
	$\displaystyle=\sum_{j=1}^{N}\ket{E_{i_{\pi(1)}}}\cdots\ket{E_{i_{\pi(j-1)}}}\ket{\dot{E}_{i_{\pi(j)}}}\ket{E_{i_{\pi(j+1)}}}\cdots\ket{E_{i_{\pi(N)}}}$
	$\displaystyle=\ket{\dot{N}_{\phi,\pi(\mathbf{i})}},$

which results in

$\displaystyle G_{\pi}\Omega_{\phi}(h)G_{\pi}^{\dagger}$	$\displaystyle=4\sum_{\mathbf{j}}G_{\pi}\left(\outerproduct{\dot{\tilde{N}}_{\phi,\mathbf{j}}}{\dot{\tilde{N}}_{\phi,\mathbf{j}}}\right)^{T}G_{\pi}^{\dagger}$	(36)
	$\displaystyle=4\sum_{\mathbf{j}}G_{\pi}\outerproduct{\dot{\tilde{N}}_{\phi,\mathbf{j}}^{*}}{\dot{\tilde{N}}_{\phi,\mathbf{j}}^{*}}G_{\pi}^{\dagger}$
	$\displaystyle=4\sum_{\mathbf{j}}G_{\pi}\left(\ket{\dot{N}_{\phi,\mathbf{j}}^{*}}+\mathrm{i}\sum_{\mathbf{k}}\ket{N_{\phi,\mathbf{k}}^{*}}h_{\mathbf{k}\mathbf{j}}^{*}\right)\left(\bra{\dot{N}_{\phi,\mathbf{j}}^{*}}-\mathrm{i}\sum_{\mathbf{l}}h_{\mathbf{l}\mathbf{j}}\bra{N_{\phi,\mathbf{l}}^{*}}\right)G_{\pi}^{\dagger}$
	$\displaystyle=4\sum_{\mathbf{j}}\left(\ket{\dot{N}_{\phi,\pi(\mathbf{j})}^{*}}+\mathrm{i}\sum_{\mathbf{k}}\ket{N_{\phi,\pi(\mathbf{k})}^{*}}h_{\mathbf{k}\mathbf{j}}^{*}\right)\left(\bra{\dot{N}_{\phi,\pi(\mathbf{j})}^{*}}-\mathrm{i}\sum_{\mathbf{l}}h_{\mathbf{l}\mathbf{j}}\bra{N_{\phi,\pi(\mathbf{l})}^{*}}\right)$
	$\displaystyle=4\sum_{\mathbf{j}}\left(\ket{\dot{N}_{\phi,\mathbf{j}}^{*}}+\mathrm{i}\sum_{\mathbf{k}}\ket{N_{\phi,\mathbf{k}}^{*}}h_{\pi^{-1}(\mathbf{k})\pi^{-1}(\mathbf{j})}^{*}\right)\left(\bra{\dot{N}_{\phi,\mathbf{j}}^{*}}-\mathrm{i}\sum_{\mathbf{l}}h_{\pi^{-1}(\mathbf{l})\pi^{-1}(\mathbf{j})}\bra{N_{\phi,\mathbf{l}}^{*}}\right)$
	$\displaystyle=\Omega_{\phi}\left(G_{\pi}^{\prime}hG_{\pi}^{\prime\dagger}\right),$

where in the last equation we have used Eq. (34). Therefore, by choosing the permutation-invariant solution $h=\frac{1}{N!}\sum_{\pi\in S_{N}}G_{\pi}^{\prime}h^{(\mathrm{opt})}G_{\pi}^{\prime\dagger}$ we obtain $\Omega_{\phi}(h)=\frac{1}{N!}\sum_{\pi\in S_{N}}G_{\pi}\Omega_{\phi}(h^{(\mathrm{opt})})G_{\pi}^{\dagger}$ , and this permutation-invariant choice of $h$ is also a feasible solution. ∎
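The covariance property $G_{\pi}\Omega_{\phi}(h)G_{\pi}^{\dagger}=\Omega_{\phi}\left(G_{\pi}^{\prime}hG_{\pi}^{\prime\dagger}\right)$ used in this proof can be checked numerically. The sketch below does so for $N=2$ with random complex vectors standing in for $\ket{E_{i}}$ and $\ket{\dot{E}_{i}}$ , since the identity relies only on the tensor-product structure. All names and the random test data are assumptions made for illustration.

```python
import functools
import itertools
import numpy as np

def permutation_operator(perm, dim):
    """Matrix mapping |i_1,...,i_N> to |i_perm(1),...,i_perm(N)> (zero-based perm)."""
    n = len(perm)
    shape = (dim,) * n
    G = np.zeros((dim**n, dim**n))
    for idx in itertools.product(range(dim), repeat=n):
        dst = np.ravel_multi_index(tuple(idx[p] for p in perm), shape)
        G[dst, np.ravel_multi_index(idx, shape)] = 1.0
    return G

def n_copy(E, dE, n):
    """Columns |N_i> and |dN_i> over all multi-indices, using the product rule."""
    N_cols, dN_cols = [], []
    for idx in itertools.product(range(len(E)), repeat=n):
        N_cols.append(functools.reduce(np.kron, [E[i] for i in idx]))
        dN_cols.append(sum(
            functools.reduce(np.kron, [dE[i] if pos == j else E[i]
                                       for pos, i in enumerate(idx)])
            for j in range(n)))
    return np.column_stack(N_cols), np.column_stack(dN_cols)

def omega(N_mat, dN_mat, h):
    """Performance operator 4 * [(dN - i N h)(dN - i N h)^dagger]^T."""
    v = dN_mat - 1j * N_mat @ h
    return 4 * (v @ v.conj().T).T

def random_hermitian(rng, size):
    A = rng.normal(size=(size, size)) + 1j * rng.normal(size=(size, size))
    return A + A.conj().T

rng = np.random.default_rng(1)
s, w, n = 2, 4, 2   # s vectors per channel, each of dimension w, n copies
E = [rng.normal(size=w) + 1j * rng.normal(size=w) for _ in range(s)]
dE = [rng.normal(size=w) + 1j * rng.normal(size=w) for _ in range(s)]
N_mat, dN_mat = n_copy(E, dE, n)
h = random_hermitian(rng, s**n)

G = permutation_operator((1, 0), w)    # acts on the space carrying Omega
Gp = permutation_operator((1, 0), s)   # acts on the index space carrying h
lhs = G @ omega(N_mat, dN_mat, h) @ G.T
rhs = omega(N_mat, dN_mat, Gp @ h @ Gp.T)
print(np.allclose(lhs, rhs))
```

Averaging both sides over all permutations then gives the symmetrized performance operator as $\Omega_{\phi}$ evaluated at the symmetrized $h$ only when $\Omega_{\phi}$ is handled as in the proof; the snippet checks the single-permutation identity, not the averaging step.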

If a stronger condition holds, then not only can we choose a permutation-invariant $h$ , but we can also restrict $Q^{i}\in\overline{\mathsf{S}}^{i}$ to be permutation invariant. In this case all matrix variables concerned in the optimization are permutation invariant.

Lemma 2.

In the optimization problem of Theorem 1, if each affine space $\mathsf{S}^{i}$ is permutation invariant for any $i=1,\dots,K$ , then there must exist a permutation-invariant $Q^{i}\in\overline{\mathsf{S}}^{i}$ as a feasible solution for each $i$ .

Proof.

By Lemma 1 we can restrict $h$ to be permutation invariant in optimization, and therefore the performance operator $\Omega_{\phi}(h)$ is permutation invariant. It is easy to see that each dual affine space $\overline{\mathsf{S}}^{i}$ is also permutation invariant. Then for any $i=1,\dots,K$ , for $Q^{i,(\mathrm{opt})}\in\overline{\mathsf{S}}^{i}$ satisfying the constraint $\lambda Q^{i}\geq\Omega_{\phi}(h)$ , we have the permutation-invariant solution $Q^{i}=\frac{1}{N!}\sum_{\pi\in S_{N}}G_{\pi}Q^{i,(\mathrm{opt})}G_{\pi}^{\dagger}\in\overline{\mathsf{S}}^{i}$ which also satisfies the same constraint. Hence, this permutation-invariant choice of $Q^{i}$ is also a feasible solution. ∎

Now since the optimization is restricted to the invariant subspace, we can reduce the matrix sizes by block diagonalization. Generally, for a group-invariant space of complex matrices $\mathbb{C}^{n\times n,(\mathrm{inv})}$ , there exists an isomorphism $\varphi$ preserving positive semidefiniteness between $\mathbb{C}^{n\times n,(\mathrm{inv})}$ and a direct sum of complex matrix spaces [53, Theorem 9.1]:

\varphi:\mathbb{C}^{n\times n,(\mathrm{inv})}\rightarrow\bigoplus_{k=1}^{I}\mathbb{C}^{m_{k}\times m_{k}},

(37)

where $m_{k}$ is the multiplicity of the $k$ -th inequivalent irreducible representation of the group, and $I$ is the number of inequivalent irreducible representations. To be more specific, for the symmetric group $S_{N}$ , the representation space $\mathcal{W}=(\mathbb{C}^{d})^{\otimes N}$ can be decomposed into $\bigoplus_{|\mu|=N}\mathcal{W}^{\mu}$ , where each partition $\mu:=(\mu_{1},\dots,\mu_{d})$ (with nonnegative integers $\mu_{i}\geq\mu_{j}$ for any $i<j$ ) corresponds to a Young diagram, and $|\mu|:=\sum_{i}\mu_{i}$ . Each isotypic component $\mathcal{W}^{\mu}$ can be further decomposed into a direct sum of equivalent irreducible subspaces: $\mathcal{W}^{\mu}=\bigoplus_{i=1}^{m_{\mu}}\mathcal{W}^{\mu,i}$ . We define the dimension of the irreducible representation $d_{\mu}:=\mathsf{dim}\left(\mathcal{W}^{\mu,i}\right)$ . It is worth mentioning that the first decomposition into isotypic components is unique while the second decomposition into equivalent irreducible representation spaces is not (see, e.g., [60]). In terms of the unitary group action $G_{\pi}$ on $\mathcal{W}$ , there exists a unitary transformation of basis $U$ such that for any $\pi$ we have

U^{\dagger}G_{\pi}U=\bigoplus_{|\mu|=N}G_{\pi}^{\mu}\otimes\mathds{1}\left(m_{\mu}\right),

(38)

where $G_{\pi}^{\mu}$ is a unitary operator on the irreducible representation space associated with the Young diagram label $\mu$ , $m_{\mu}$ is the corresponding multiplicity, and $\mathds{1}\left(m_{\mu}\right)$ is an $m_{\mu}\times m_{\mu}$ identity matrix acting on the multiplicity subspace. By Schur’s lemma, for any group-invariant operator $X$ on $\mathcal{W}$ , i.e., $X$ commuting with all $G_{\pi}$ , we have

U^{\dagger}XU=\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes X^{\mu},

(39)

where $X^{\mu}$ is an $m_{\mu}\times m_{\mu}$ matrix. With this block diagonalization of a permutation-invariant operator, the dimension of the matrix space is reduced from $d^{2N}$ to [61, Eq. (57)]

\sum_{|\mu|=N}m_{\mu}^{2}=\binom{N+d^{2}-1}{d^{2}-1}\leq(N+1)^{d^{2}-1},

(40)

where the upper bound follows from writing the binomial coefficient as $\prod_{i=1}^{d^{2}-1}\frac{N+i}{i}$ and noting that each factor is at most $N+1$ . Specifically, if $X$ is further restricted to be a Hermitian matrix variable, then the number of real scalar variables contained in all elements of $X$ is reduced from $d^{2N}$ to $\binom{N+d^{2}-1}{d^{2}-1}$ .
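As a minimal numerical illustration of Eq. (39), the following Python sketch treats the smallest nontrivial case $N=2$ , where $S_{2}$ has only the symmetric and antisymmetric irreducible representations (both one-dimensional, so $d_{\mu}=1$ ). The helper names `swap_operator`, `twirl` and `sym_antisym_basis` are hypothetical and introduced only for this example, NumPy is assumed to be available, and the explicit basis is written by hand rather than obtained from a general decomposition routine.

```python
import numpy as np

def swap_operator(d):
    """Permutation operator G exchanging the two factors of C^d (x) C^d."""
    G = np.zeros((d * d, d * d))
    for a in range(d):
        for b in range(d):
            G[a * d + b, b * d + a] = 1.0
    return G

def twirl(X, G):
    """Project X onto the S_2-invariant subspace by averaging over {1, G}."""
    return 0.5 * (X + G @ X @ G.T)

def sym_antisym_basis(d):
    """Columns: orthonormal basis of the symmetric, then the antisymmetric subspace."""
    cols = []
    for a in range(d):
        for b in range(a, d):
            v = np.zeros(d * d)
            if a == b:
                v[a * d + a] = 1.0
            else:
                v[a * d + b] = v[b * d + a] = 1 / np.sqrt(2)
            cols.append(v)
    for a in range(d):
        for b in range(a + 1, d):
            v = np.zeros(d * d)
            v[a * d + b] = 1 / np.sqrt(2)
            v[b * d + a] = -1 / np.sqrt(2)
            cols.append(v)
    return np.column_stack(cols)

d = 3
rng = np.random.default_rng(0)
M = rng.normal(size=(d * d, d * d)) + 1j * rng.normal(size=(d * d, d * d))
X = twirl(M + M.conj().T, swap_operator(d))  # Hermitian and permutation invariant
U = sym_antisym_basis(d)
Xb = U.T @ X @ U                             # U is real, so U^dagger = U^T
m_sym = d * (d + 1) // 2                     # multiplicity of the symmetric irrep
off_block = np.linalg.norm(Xb[:m_sym, m_sym:])
print(f"off-diagonal block norm: {off_block:.2e}")
print("block sizes:", m_sym, d * d - m_sym)
```

In this toy case the two diagonal blocks have sizes $d(d+1)/2$ and $d(d-1)/2$ , and the sum of their squares agrees with the binomial coefficient in Eq. (40) for $N=2$ .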

Now let us turn to the optimization problem of QFI evaluation. If the Hermitian matrix $h$ is taken to be permutation invariant by Lemma 1, the number of variables in $h$ is reduced from $s^{2N}$ to $\binom{N+s^{2}-1}{s^{2}-1}$ . Similarly, if by Lemma 2 each Hermitian matrix $Q^{i}$ is also taken to be permutation invariant, the number of variables in $Q^{i}$ is reduced from $d^{2N}$ to $\binom{N+d^{2}-1}{d^{2}-1}$ , where we redefine $d:=\mathsf{dim}(\mathcal{H}_{1})\mathsf{dim}(\mathcal{H}_{2})$ . With this reduction the number of variables grows polynomially rather than exponentially with $N$ .
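The counting in Eq. (40) can be checked directly for small cases. The sketch below is pure Python and uses the standard hook length formula for $d_{\mu}$ and the hook content formula for $m_{\mu}$ ; these two formulas are not stated in the text and are brought in here as an assumption of the example, as are the function names. It enumerates the Young diagrams with $N$ boxes and at most $d$ rows and compares $\sum_{\mu}m_{\mu}^{2}$ with the binomial coefficient.

```python
from fractions import Fraction
from math import comb, factorial

def partitions(n, max_parts, max_part=None):
    """Yield partitions of n with at most max_parts parts as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, max_parts - 1, first):
            yield (first,) + rest

def hooks_and_contents(mu):
    """Return (hook length, content j - i) for every box (i, j) of the diagram mu."""
    conj = [sum(1 for row in mu if row > j) for j in range(mu[0])]
    return [((row - j) + (conj[j] - i) - 1, j - i)
            for i, row in enumerate(mu) for j in range(row)]

def irrep_dim(mu):
    """d_mu: dimension of the S_N irrep (hook length formula)."""
    prod = 1
    for hook, _ in hooks_and_contents(mu):
        prod *= hook
    return factorial(sum(mu)) // prod

def multiplicity(mu, d):
    """m_mu: multiplicity of the irrep mu in (C^d)^{(x)N} (hook content formula)."""
    val = Fraction(1)
    for hook, content in hooks_and_contents(mu):
        val *= Fraction(d + content, hook)
    return int(val)

def reduced_dimension(N, d):
    """Dimension of the permutation-invariant matrix space, sum of m_mu^2."""
    return sum(multiplicity(mu, d) ** 2 for mu in partitions(N, d))

s = 2
print(" N   s^(2N)   reduced   binomial")
for N in range(1, 6):
    print(f"{N:2d} {s ** (2 * N):8d} {reduced_dimension(N, s):9d} "
          f"{comb(N + s * s - 1, s * s - 1):10d}")
```

The same functions also confirm the consistency check $\sum_{\mu}d_{\mu}m_{\mu}=d^{N}$ for the full decomposition of the representation space.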

We can then reformulate the optimization in Theorem 1 with the reduced form. We consider two cases relevant to the strategy sets mentioned in the main text. First, under certain circumstances we can reduce the number of constraints in optimization as follows:

Theorem 2 (Symmetry reduced Theorem 1, first case).

If, for any $\pi\in S_{N}$ and any $i$ , there exists a $j$ such that the mapping $S\mapsto G_{\pi}SG_{\pi}^{\dagger}$ on $\mathsf{S}^{i}$ is a bijective function from $\mathsf{S}^{i}$ to $\mathsf{S}^{j}$ , and meanwhile for any $i,j$ there exists some permutation operation such that $\mathsf{S}^{i}$ is bijectively mapped to $\mathsf{S}^{j}$ , then the QFI of $N$ quantum channels $\mathcal{E}_{\phi}$ can be expressed as:

	$\displaystyle J^{(\mathsf{P})}(N_{\phi})=$	$\displaystyle\min_{\lambda,Q^{1},h^{\mu}}\lambda,$		(41)
	$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\lambda Q^{1}\geq\Omega_{\phi}(h),\ Q^{1}\in\overline{\mathsf{S}}^{1},$

where $h=U^{\prime}\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes h^{\mu}\right]U^{\prime\dagger}$ with each $h^{\mu}$ as an $m_{\mu}^{\prime}\times m_{\mu}^{\prime}$ matrix variable and $U^{\prime}$ as a unitary transformation of basis.

Proof.

We first prove that, for any $i,j$ there exists some permutation operation $Q\mapsto G_{\pi}QG_{\pi}^{\dagger}$ such that $\overline{\mathsf{S}}^{i}$ is bijectively mapped to $\overline{\mathsf{S}}^{j}$ . Given any $i$ and $j$ , if we choose an arbitrary $Q^{i}\in\overline{\mathsf{S}}^{i}$ , i.e., $\Tr(Q^{i}S^{i})=1$ for all $S^{i}\in\mathsf{S}^{i}$ , then the condition of the theorem implies that there exists $G_{\pi}$ such that $G_{\pi}S^{i}G_{\pi}^{\dagger}\in\mathsf{S}^{j}$ . As the map is bijective, in fact $G_{\pi}^{\dagger}S^{j}G_{\pi}\in\mathsf{S}^{i}$ for all $S^{j}\in\mathsf{S}^{j}$ . Then we have $\Tr(G_{\pi}Q^{i}G_{\pi}^{\dagger}S^{j})=\Tr(Q^{i}G_{\pi}^{\dagger}S^{j}G_{\pi})=1$ for all $S^{j}\in\mathsf{S}^{j}$ . Therefore, following the same argument in the proof of Lemma 1, under $Q\mapsto G_{\pi}QG_{\pi}^{\dagger}$ , $\overline{\mathsf{S}}^{i}$ is bijectively mapped to $\overline{\mathsf{S}}^{j}$ .

By Lemma 1 we can take $h$ to be permutation invariant and apply the block diagonalization to $h$ given by Eq. (39). Then for any $Q^{1}\in\overline{\mathsf{S}}^{1}$ satisfying the constraint $\lambda Q^{1}\geq\Omega_{\phi}(h)$ , there exists $G_{\pi}$ such that $Q^{i}=G_{\pi}Q^{1}G_{\pi}^{\dagger}\in\overline{\mathsf{S}}^{i}$ satisfying $\lambda Q^{i}\geq G_{\pi}\Omega_{\phi}(h)G_{\pi}^{\dagger}=\Omega_{\phi}(h)$ for any $i=2,\dots,K$ . Therefore, all the constraints except for $\lambda Q^{1}\geq\Omega_{\phi}(h)$ are redundant and can be removed. ∎

Now we consider the second case. If, by Lemmas 1 and 2, $h$ and each $Q^{i}$ are taken to be permutation invariant, we can reformulate the constraints in the optimization with reduced matrix dimensions. To relate the group representation on $\otimes_{i}\left(\mathcal{H}_{2i-1}\otimes\mathcal{H}_{2i}\right)$ to the representation on $(\mathbb{C}^{s})^{\otimes N}$ , we choose a unitary transformation of basis $U^{\prime}$ for $h$ , decompose $h=U^{\prime}\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes h^{\mu}\right]U^{\prime\dagger}$ with $h^{\mu}$ as an $m_{\mu}^{\prime}\times m_{\mu}^{\prime}$ matrix, and divide $U^{\prime}=\left(U^{\prime,\mu^{1}},\dots,U^{\prime,\mu^{I}}\right)$ into blocks, where $\mu^{i}$ is the label of the $i$ -th Young diagram, and $U^{\prime,\mu^{i}}$ has $d_{\mu^{i}}m_{\mu^{i}}^{\prime}$ columns. Thus by defining the matrix

\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu}=\dot{\mathbf{N}}_{\phi}U^{\prime,\mu}-\mathrm{i}\mathbf{N}_{\phi}U^{\prime,\mu}\left[\mathds{1}\left(d_{\mu}\right)\otimes h^{\mu}\right],

(42)

we have the permutation-invariant performance operator

\Omega_{\phi}(h)=4\sum_{|\mu|=N}\left(\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu\dagger}\right)^{T}=4\sum_{|\mu|=N}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*\dagger}.

(43)

Analogously, we apply the block diagonalization using a change of basis $U$ to

Q^{i}=U\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes Q^{i,\mu}\right]U^{\dagger}\in\overline{\mathsf{S}}^{i},

(44)

where $Q^{i,\mu}$ is an $m_{\mu}\times m_{\mu}$ matrix variable. The unitary transformation $U=\left(U^{\mu^{1}},\dots,U^{\mu^{I}}\right)$ is first divided into blocks, and then each $U^{\mu^{i}}=\left(U^{\mu^{i},1},\dots,U^{\mu^{i},d_{\mu^{i}}}\right)$ is divided into blocks for $i=1,\dots,I$ , where $\mu^{i}$ is the label of the $i$ -th Young diagram, and $U^{\mu^{i},j}$ has $m_{\mu^{i}}$ columns for $j=1,\dots,d_{\mu^{i}}$ . Note that $U$ also gives the block diagonalization of $\Omega_{\phi}(h)$ .

Before proceeding further we prove that the range of $\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*}$ is contained in the isotypic component $\mathcal{W}^{\mu}$ associated with $\mu$ :

Lemma 3.

If we decompose the representation space $\mathcal{W}=\otimes_{i=1}^{N}\left(\mathcal{H}_{2i-1}\otimes\mathcal{H}_{2i}\right)=\bigoplus_{|\mu|=N}\mathcal{W}^{\mu}$ , then $\mathsf{Range}\left(\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*}\right)\subseteq\mathcal{W}^{\mu}$ for any Young diagram label $\mu$ .

Proof.

To characterize the representation space $\mathcal{W}^{\mu}$ we briefly introduce the Young symmetrizer and related concepts (see, e.g., [60] for mathematical details). A Young tableau is a filling of the boxes of a Young diagram with positive integers that are weakly increasing along each row and strictly increasing along each column. For a standard Young tableau labeled by $\nu$ , i.e., a Young tableau in which each of the entries $1,\dots,N$ appears exactly once, we define two permutation subgroups

P_{\nu}:=\left\{\sigma\in S_{N}\middle|\sigma\ \mathrm{preserves}\ \mathrm{each}\ \mathrm{row}\right\}

(45)

and

Q_{\nu}:=\left\{\sigma\in S_{N}\middle|\sigma\ \mathrm{preserves}\ \mathrm{each}\ \mathrm{column}\right\}.

(46)

In the group algebra $\mathbb{C}S_{N}$ we define two elements $a_{\nu}:=\sum_{\sigma\in P_{\nu}}e_{\sigma}$ and $b_{\nu}:=\sum_{\sigma\in Q_{\nu}}\mathrm{sgn}(\sigma)e_{\sigma}$ , where $e_{\sigma}$ is the unit vector corresponding to $\sigma$ and $\mathrm{sgn}(\cdot)$ denotes the sign of the permutation ( $+1$ for even and $-1$ for odd permutations). Then the Young symmetrizer is defined by

c_{\nu}:=a_{\nu}b_{\nu}=\sum_{\sigma\in P_{\nu}}\sum_{\pi\in Q_{\nu}}\mathrm{sgn}(\pi)e_{\sigma\pi}\in\mathbb{C}S_{N}.

(47)

It is known that a Young diagram of $\mu$ corresponds to $d_{\mu}$ standard Young tableaux, with each Young tableau of $\nu$ characterizing an irreducible representation space, given by the image of $c_{\nu}$ on $\otimes_{i}\mathcal{W}_{i}$ under the natural group algebra representation $\mathbb{C}S_{N}\rightarrow\mathrm{End}\left(\otimes_{i}\mathcal{W}_{i}\right)$ , where $\mathrm{End}(V)$ denotes the set of endomorphisms on $V$ .
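To make the construction of $c_{\nu}$ concrete, the following Python sketch builds the operators representing $a_{\nu}$ and $b_{\nu}$ on $(\mathbb{C}^{d})^{\otimes N}$ by brute-force enumeration of $S_{N}$ and reports the rank of their product. The function names and the 0-based tableau entries are choices made for this example only, and NumPy is assumed. Because $a_{\nu}$ and $b_{\nu}$ are each summed over a subgroup, the result does not depend on whether `perm_operator` realizes a left or a right action.

```python
import itertools
import numpy as np

def perm_operator(perm, d):
    """Matrix permuting the N tensor factors of (C^d)^{(x)N}; perm holds 0-based positions."""
    N = len(perm)
    G = np.zeros((d ** N, d ** N))
    for idx in itertools.product(range(d), repeat=N):
        new = tuple(idx[perm[k]] for k in range(N))
        G[np.ravel_multi_index(new, (d,) * N),
          np.ravel_multi_index(idx, (d,) * N)] = 1.0
    return G

def sign(perm):
    """Sign of a permutation, computed from its number of inversions."""
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def young_symmetrizer(tableau, d):
    """Operator representing c_nu = a_nu b_nu for a tableau given as a list of rows."""
    N = sum(len(row) for row in tableau)
    rows = [set(row) for row in tableau]
    cols = [set(row[j] for row in tableau if len(row) > j)
            for j in range(len(tableau[0]))]

    def preserves(p, blocks):
        return all({p[x] for x in block} == block for block in blocks)

    perms = list(itertools.permutations(range(N)))
    a = sum(perm_operator(p, d) for p in perms if preserves(p, rows))
    b = sum(sign(p) * perm_operator(p, d) for p in perms if preserves(p, cols))
    return a @ b

d = 2
for tableau in ([[0, 1, 2]], [[0, 1], [2]], [[0], [1], [2]]):
    rank = np.linalg.matrix_rank(young_symmetrizer(tableau, d))
    print(tableau, "rank of c_nu:", rank)
```

For three qubits the fully antisymmetric tableau gives a vanishing operator, reflecting that diagrams with more than $d$ rows do not contribute to the decomposition.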

With these notions we now have an explicit characterization of the representation space. Note that we can prove $\mathsf{Range}\left(\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*}\right)\subseteq\mathcal{W}^{\mu}$ if $\mathsf{Range}\left(\dot{\mathbf{N}}_{\phi}^{*}U^{\prime,\mu*}\right)\subseteq\mathcal{W}^{\mu}$ and $\mathsf{Range}\left(\mathbf{N}_{\phi}^{*}U^{\prime,\mu*}\right)\subseteq\mathcal{W}^{\mu}$ . In the proof of Lemma 1 we have seen $G_{\pi}\ket{\dot{N}_{\phi,\mathbf{i}}}=\ket{\dot{N}_{\phi,\pi(\mathbf{i})}}$ and $G_{\pi}\ket{N_{\phi,\mathbf{i}}}=\ket{N_{\phi,\pi(\mathbf{i})}}$ , from which it follows that

G_{\pi}\dot{\mathbf{N}}_{\phi}=\dot{\mathbf{N}}_{\phi}G_{\pi}^{\prime}

(48)

and $G_{\pi}\mathbf{N}_{\phi}=\mathbf{N}_{\phi}G_{\pi}^{\prime}$ .

Now we prove that $\mathsf{Range}\left(\dot{\mathbf{N}}_{\phi}^{*}U^{\prime,\mu*}\right)\subseteq\mathcal{W}^{\mu}$ . From the discussions above we know that

\mathsf{Range}\left(U^{\prime,\mu}\right)=\bigoplus_{\nu}c_{\nu}\left[\otimes_{i}\mathbb{C}^{s}\right],

(49)

for all Young tableau labels $\nu$ corresponding to the Young diagram of $\mu$ . Explicitly, we have

c_{\nu}\left[\otimes_{i}\mathbb{C}^{s}\right]=\mathsf{Range}\left[\sum_{\sigma\in P_{\nu}}\sum_{\pi\in Q_{\nu}}\mathrm{sgn}(\pi)G_{\sigma\pi}^{\prime}\right].

(50)

Note that there always exists a unitary transformation of basis $V$ such that we can obtain a real matrix $U_{(\mathrm{real})}^{\prime,\mu}=U^{\prime,\mu}V^{\dagger}$ , which leads to

\mathsf{Range}\left(U^{\prime,\mu*}\right)=\mathsf{Range}\left(U_{(\mathrm{real})}^{\prime,\mu*}V^{*}\right)=\mathsf{Range}\left(U_{(\mathrm{real})}^{\prime,\mu}V^{*}\right)=\mathsf{Range}\left(U^{\prime,\mu}\right).

(51)

Thus $\mathsf{Range}\left(\dot{\mathbf{N}}_{\phi}^{*}U^{\prime,\mu*}\right)=\mathsf{Range}\left(\dot{\mathbf{N}}_{\phi}^{*}U^{\prime,\mu}\right)$ . Furthermore, from Eqs. (49) and (50) we have $\mathsf{Range}\left(\dot{\mathbf{N}}_{\phi}^{*}U^{\prime,\mu*}\right)\subseteq\mathcal{W}^{\mu}$ if

\mathsf{Range}\left[\dot{\mathbf{N}}_{\phi}^{*}\sum_{\sigma\in P_{\nu}}\sum_{\pi\in Q_{\nu}}\mathrm{sgn}(\pi)G_{\sigma\pi}^{\prime}\right]\subseteq\mathcal{W}^{\mu}

(52)

for all Young tableau labels $\nu$ corresponding to the Young diagram of $\mu$ . To show Eq. (52), note that by Eq. (48) we have

\mathsf{Range}\left[\dot{\mathbf{N}}_{\phi}^{*}\sum_{\sigma\in P_{\nu}}\sum_{\pi\in Q_{\nu}}\mathrm{sgn}(\pi)G_{\sigma\pi}^{\prime}\right]=\mathsf{Range}\left[\sum_{\sigma\in P_{\nu}}\sum_{\pi\in Q_{\nu}}\mathrm{sgn}(\pi)G_{\sigma\pi}\dot{\mathbf{N}}_{\phi}^{*}\right].

(53)

Since the R.H.S. of Eq. (53) is the image of the Young symmetrizer $c_{\nu}$ on $\mathsf{Range}\left(\dot{\mathbf{N}}_{\phi}^{*}\right)$ , it is a subset of $\mathcal{W}^{\mu}$ . Therefore, $\mathsf{Range}\left(\dot{\mathbf{N}}_{\phi}^{*}U^{\prime,\mu*}\right)\subseteq\mathcal{W}^{\mu}$ . In the same way we can also show that $\mathsf{Range}\left(\mathbf{N}_{\phi}^{*}U^{\prime,\mu*}\right)\subseteq\mathcal{W}^{\mu}$ and thus complete the proof. ∎

Therefore, $U^{\dagger}\Omega_{\phi}(h)U$ is block diagonal with each block on the space $\mathsf{Range}\left(\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu}\right)$ . This results in:

Theorem 3 (Symmetry reduced Theorem 1, second case).

If each affine space $\mathsf{S}^{i}$ is permutation invariant for any $i=1,\dots,K$ , then the QFI of $N$ quantum channels $\mathcal{E}_{\phi}$ can be expressed as:

	$\displaystyle J^{(\mathsf{P})}(N_{\phi})=$	$\displaystyle\min_{\lambda,Q^{i,\mu},h^{\mu}}\lambda,$		(54)
	$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\lambda Q^{i,\mu}\geq 4U^{\mu,1\dagger}\left(\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu\dagger}\right)^{T}U^{\mu,1},\ i=1,\dots,K,\ |\mu|=N,$

where $Q^{i,\mu}$ is given by Eq. (44).

Proof.

By Lemmas 1 and 2, $h$ , $\Omega_{\phi}(h)$ and all $Q^{i}$ are taken to be permutation invariant. The original constraint $\lambda Q^{i}\geq\Omega_{\phi}(h)$ on the permutation-invariant space is reformulated as

\lambda\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes Q^{i,\mu}\geq\bigoplus_{|\mu|=N}4U^{\mu\dagger}\sum_{|\nu|=N}\dot{\tilde{\mathbf{N}}}_{\phi}^{\nu*}\dot{\tilde{\mathbf{N}}}_{\phi}^{\nu*\dagger}U^{\mu}.

(55)

Since by Lemma 3 $\mathsf{Range}\left(\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*}\right)\subseteq\mathcal{W}^{\mu}=\mathsf{Range}\left(U^{\mu}\right)$ for any $\mu$ , we have $U^{\mu\dagger}\sum_{|\nu|=N}\dot{\tilde{\mathbf{N}}}_{\phi}^{\nu*}\dot{\tilde{\mathbf{N}}}_{\phi}^{\nu*\dagger}U^{\mu}=U^{\mu\dagger}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*\dagger}U^{\mu}$ . Then Eq. (55) can be reformulated as

\lambda\mathds{1}\left(d_{\mu}\right)\otimes Q^{i,\mu}\geq 4U^{\mu\dagger}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*\dagger}U^{\mu},\ |\mu|=N.

(56)

Furthermore, both sides of the inequality in Eq. (56) are block diagonal with the same repeating blocks and we only need to compare one of the blocks. Without loss of generality, we choose the first block for comparison and obtain

\lambda Q^{i,\mu}\geq 4U^{\mu,1\dagger}\left(\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu\dagger}\right)^{T}U^{\mu,1},

(57)

which holds for all $i=1,\dots,K$ and $|\mu|=N$ . ∎

We have not characterized each dual affine space $\overline{\mathsf{S}}^{i}$ for $i=1,\dots,K$ yet. If we choose an affine basis $\{S^{i,j}\}_{j=1}^{M_{i}}$ for each $\mathsf{S}^{i}$ , then the constraint $Q^{i}\in\overline{\mathsf{S}}^{i}$ can be reformulated as a set of linear constraints $\Tr(Q^{i}S^{i,j})=1$ for all $j=1,\dots,M_{i}$ . If each affine space $\mathsf{S}^{i}$ is permutation invariant, then by the proof of Lemma 2 we have, for a feasible solution of $Q^{i}\in\overline{\mathsf{S}}^{i}$ , $G_{\pi}Q^{i}G_{\pi}^{\dagger}$ is also a feasible solution for any $\pi\in S_{N}$ . Then by defining

\tilde{S}^{i,j}:=\frac{1}{N!}\sum_{\pi\in S_{N}}G_{\pi}S^{i,j}G_{\pi}^{\dagger},

(58)

each linear constraint $\Tr(Q^{i}S^{i,j})=1$ can be replaced by $\Tr(Q^{i}\tilde{S}^{i,j})=1$ without changing the problem. Since $\tilde{S}^{i,j}$ is permutation invariant, similar to Eq. (44) it can be decomposed as

\tilde{S}^{i,j}=U\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes\tilde{S}^{i,j,\mu}\right]U^{\dagger},

(59)

where $\tilde{S}^{i,j,\mu}$ is an $m_{\mu}\times m_{\mu}$ matrix. Combining Eqs. (44) and (59), we can reformulate the constraint $Q^{i}\in\overline{\mathsf{S}}^{i}$ with reduced matrix sizes as

\sum_{|\mu|=N}d_{\mu}\Tr(Q^{i,\mu}\tilde{S}^{i,j,\mu})=1,\ j=1,\dots,M_{i}.

(60)
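The replacement of $S^{i,j}$ by its group average in Eq. (58) can be checked numerically. The sketch below, which assumes NumPy and uses hypothetical helper names, draws a random permutation-invariant $Q$ and a generic Hermitian $S$ on $(\mathbb{C}^{2})^{\otimes 3}$ and compares $\Tr(QS)$ with $\Tr(Q\tilde{S})$ . The random matrices merely stand in for the actual elements of the dual affine space and of the affine basis.

```python
import itertools
import numpy as np

def perm_operator(perm, d):
    """Matrix permuting the N tensor factors of (C^d)^{(x)N}; perm holds 0-based positions."""
    N = len(perm)
    G = np.zeros((d ** N, d ** N))
    for idx in itertools.product(range(d), repeat=N):
        new = tuple(idx[perm[k]] for k in range(N))
        G[np.ravel_multi_index(new, (d,) * N),
          np.ravel_multi_index(idx, (d,) * N)] = 1.0
    return G

def twirl(X, ops):
    """Group average (1/N!) sum_pi G_pi X G_pi^dagger, as in Eq. (58)."""
    return sum(G @ X @ G.T for G in ops) / len(ops)

def random_hermitian(rng, dim):
    M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return M + M.conj().T

d, N = 2, 3
ops = [perm_operator(p, d) for p in itertools.permutations(range(N))]
rng = np.random.default_rng(1)
Q = twirl(random_hermitian(rng, d ** N), ops)  # permutation-invariant stand-in for Q^i
S = random_hermitian(rng, d ** N)              # generic stand-in for S^{i,j}
S_tilde = twirl(S, ops)

lhs = np.trace(Q @ S)
rhs = np.trace(Q @ S_tilde)
print("Tr(Q S)       =", lhs.real)
print("Tr(Q S_tilde) =", rhs.real)
```

The two traces coincide because $Q$ commutes with every $G_{\pi}$ , which is exactly why the constraint can be imposed on the permutation-invariant blocks alone.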

Finally, it remains to find the unitary transformations $U$ and $U^{\prime}$ for block diagonalization. Known rigorous numerical algorithms for identifying such transformations are fairly expensive [62]. Fortunately, RepLAB [63], a numerical approach to decomposing representations based on randomized heuristics, works very well in practice and is thus adopted here.

D.2 Symmetry reduced algorithm for optimal strategies

The idea is similar to the symmetry reduced evaluation of QFI. Since the first step of Algorithm 1 is simply solving the optimization problem in Theorem 1, now we only consider its second step, where we need to solve for the optimal value of $\tilde{P}$ in

\max_{\tilde{P}\in\tilde{\mathsf{P}}}\Tr\left[\tilde{P}\Omega_{\phi}\left(h^{(\mathrm{opt})}\right)\right],

(61)

where $\tilde{\mathsf{P}}=\mathsf{Conv}\left\{\bigcup_{i=1}^{K}\left\{S^{i}\geq 0\middle|S^{i}\in\mathsf{S}^{i}\right\}\right\}$ and $\tilde{P}$ is additionally required to satisfy

\real\left\{\Tr\left\{\tilde{P}\left[-\mathrm{i}\mathbf{N}_{\phi}\mathscr{H}\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h^{(\mathrm{opt})}\right)^{\dagger}\right]^{T}\right\}\right\}=0\ \mathrm{for}\ \mathrm{all}\ \mathscr{H}\in\mathbb{H}_{r}.

(62)

We have the following:

Lemma 4.

If, for any $\pi\in S_{N}$ and any $i$ , there exists a $j$ such that the mapping $S\mapsto G_{\pi}SG_{\pi}^{\dagger}$ on $\mathsf{S}^{i}$ is a bijective function from $\mathsf{S}^{i}$ to $\mathsf{S}^{j}$ , then there must exist a permutation-invariant $\tilde{P}$ as an optimal solution in Algorithm 1.

Proof.

By definition $\tilde{\mathsf{P}}$ is permutation invariant. By Lemma 1 we can choose a permutation-invariant $h^{(\mathrm{opt})}$ and thus $\Omega_{\phi}\left(h^{(\mathrm{opt})}\right)$ is also permutation invariant. If $\tilde{P}^{(\mathrm{opt})}$ is an optimal solution of $\tilde{P}$ , then $G_{\pi}\tilde{P}^{(\mathrm{opt})}G_{\pi}^{\dagger}$ for any $\pi\in S_{N}$ is also an optimal solution, since

\Tr\left[G_{\pi}\tilde{P}^{(\mathrm{opt})}G_{\pi}^{\dagger}\Omega_{\phi}\left(h^{(\mathrm{opt})}\right)\right]=\Tr\left[\tilde{P}^{(\mathrm{opt})}G_{\pi}^{\dagger}\Omega_{\phi}\left(h^{(\mathrm{opt})}\right)G_{\pi}\right]=\Tr\left[\tilde{P}^{(\mathrm{opt})}\Omega_{\phi}\left(h^{(\mathrm{opt})}\right)\right]

(63)

and for all $\mathscr{H}\in\mathbb{H}_{r}$

	$\displaystyle\real\left\{\Tr\left\{G_{\pi}\tilde{P}^{(\mathrm{opt})}G_{\pi}^{\dagger}\left[-\mathrm{i}\mathbf{N}_{\phi}G_{\pi}^{\prime}\mathscr{H}G_{\pi}^{\prime\dagger}\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h^{(\mathrm{opt})}\right)^{\dagger}\right]^{T}\right\}\right\}$	(64)
$\displaystyle=$	$\displaystyle\real\left\{\Tr\left\{\tilde{P}^{(\mathrm{opt})}G_{\pi}^{\dagger}\left[-\mathrm{i}\mathbf{N}_{\phi}G_{\pi}^{\prime}\mathscr{H}G_{\pi}^{\prime\dagger}\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h^{(\mathrm{opt})}\right)^{\dagger}\right]^{T}G_{\pi}\right\}\right\}$
$\displaystyle=$	$\displaystyle\real\left\{\Tr\left\{\tilde{P}^{(\mathrm{opt})}\left[-\mathrm{i}\mathbf{N}_{\phi}\mathscr{H}\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h^{(\mathrm{opt})}\right)^{\dagger}\right]^{T}\right\}\right\}$
$\displaystyle=$	$\displaystyle\,0,$

having used $G_{\pi}^{\dagger}\mathbf{N}_{\phi}=\mathbf{N}_{\phi}G_{\pi}^{\prime\dagger}$ and $G_{\pi}^{\dagger}\dot{\mathbf{N}}_{\phi}=\dot{\mathbf{N}}_{\phi}G_{\pi}^{\prime\dagger}$ for any $\pi\in S_{N}$ in the second equality of Eq. (64). Therefore, there exists a permutation-invariant solution $\tilde{P}=\frac{1}{N!}\sum_{\pi\in S_{N}}G_{\pi}\tilde{P}^{(\mathrm{opt})}G_{\pi}^{\dagger}$ . ∎

Now by following the same line of arguments as used from Eqs. (58) to (60), we define

	$\displaystyle O:=$	$\displaystyle\,\frac{1}{N!}\sum_{\pi\in S_{N}}G_{\pi}\left[-\mathrm{i}\mathbf{N}_{\phi}\mathscr{H}\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h^{(\mathrm{opt})}\right)^{\dagger}\right]^{T}G_{\pi}^{\dagger}$		(65)
	$\displaystyle=$	$\displaystyle\,\left[-\mathrm{i}\mathbf{N}_{\phi}\tilde{\mathscr{H}}\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h^{(\mathrm{opt})}\right)^{\dagger}\right]^{T},$

having defined the permutation-invariant $\tilde{\mathscr{H}}:=\frac{1}{N!}\sum_{\pi\in S_{N}}G_{\pi}^{\prime}\mathscr{H}G_{\pi}^{\prime\dagger}$ . Similar to the arguments in Appendix C, by choosing a basis $\left\{\tilde{\mathscr{H}}^{i}\right\}_{i=1}^{J}$ for the permutation-invariant subspace of $\mathbb{H}_{r}$ , where $J=\binom{N+s^{2}-1}{s^{2}-1}$ , Eq. (31) can be reformulated as a set of $J$ linear constraints. We further define

O^{i}:=\left[-\mathrm{i}\mathbf{N}_{\phi}\tilde{\mathscr{H}}^{i}\left(\dot{\mathbf{N}}_{\phi}-\mathrm{i}\mathbf{N}_{\phi}h^{(\mathrm{opt})}\right)^{\dagger}\right]^{T}

(66)

for $i=1,\dots,J$ . Now we can decompose $O^{i}=U\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes\tilde{O}^{i,\mu}\right]U^{\dagger}$ , $\tilde{P}=U\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes\tilde{P}^{\mu}\right]U^{\dagger}$ , $\Omega_{\phi}\left(h^{(\mathrm{opt})}\right)=U\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes\Omega_{\phi}^{\mu}\left(h^{(\mathrm{opt})}\right)\right]U^{\dagger}$ , and then reformulate the optimization problem as the following reduced form:

	$\displaystyle\max_{\tilde{P}^{\mu}}$	$\displaystyle\sum_{|\mu|=N}d_{\mu}\Tr\left[\tilde{P}^{\mu}\Omega_{\phi}^{\mu}\left(h^{(\mathrm{opt})}\right)\right],$		(67)
	$\displaystyle\mathrm{s.t.}$	$\displaystyle\sum_{|\mu|=N}d_{\mu}\real\left[\Tr\left(\tilde{P}^{\mu}\tilde{O}^{i,\mu}\right)\right]=0\ \mathrm{for}\ \mathrm{all}\ i=1,\dots,J.$

Recall that $\tilde{P}=\sum_{i=1}^{K}q^{i}S^{i}$ for $S^{i}\in\mathsf{S}^{i}$ . By choosing an affine basis $\{Q^{i,j}\}_{j=1}^{L_{i}}$ for $\overline{\mathsf{S}}^{i}$ we can also characterize $S^{i}\in\mathsf{S}^{i}$ by a set of linear constraints $\Tr\left(S^{i}Q^{i,j}\right)=1$ for $i=1,\dots,K$ and $j=1,\dots,L_{i}$ , and follow the same routine to tackle the constraints on the permutation-invariant subspace. Thus both the number of variables and the number of constraints are polynomial with respect to $N$ .

Appendix E Evaluation of QFI using different strategies

In this section we provide explicit formulas of the QFI for all strategy sets considered in the main text, in forms that can be solved numerically by SDP. Without the positivity constraints, the parallel, sequential and general indefinite-causal-order strategy sets are affine spaces themselves, while the quantum SWITCH and causal superposition strategy sets are convex hulls of affine spaces. In some cases the result of Theorem 1 can be slightly simplified, as it is possible to trace over certain subspaces when formulating the primal problem at the beginning.

E.1 Parallel strategies

When definite causal order is obeyed, a strategy can be described by a quantum comb [33, 64, 34]. The dual affine space is the set of dual combs without the positivity constraint [43]. For parallel strategies the primal problem can be written as

$\displaystyle J^{(\mathsf{Par})}(N_{\phi})=$	$\displaystyle\min_{h\in\mathbb{H}_{r}}\max_{\tilde{P}}\Tr\left[\tilde{P}\Omega_{\phi}(h)\right],$	(68)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\tilde{P}\geq 0,$
	$\displaystyle\,\tilde{P}=\mathds{1}_{2,4,\dots,2N}\otimes\tilde{P}^{(1)},$
	$\displaystyle\Tr\tilde{P}^{(1)}=1.$

Equivalently, the problem can be formulated as

$\displaystyle\min_{h\in\mathbb{H}_{r}}\max_{P}$	$\displaystyle\Tr\left[P\Tr_{2,4,\dots,2N}\Omega_{\phi}(h)\right],$	(69)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,P\geq 0,$
	$\displaystyle\Tr P=1.$

The dual problem is given by

	$\displaystyle\min_{\lambda,h}$	$\displaystyle\,\lambda,$		(70)
	$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\lambda\mathds{1}_{1,3,\dots,2N-1}\geq\Tr_{2,4,\dots,2N}\Omega_{\phi}(h),$

which slightly simplifies the result obtained directly from Theorem 1. To solve the problem via SDP, we define a block matrix

A:=\left(\begin{array}[]{c | c}\frac{\lambda}{4}\mathds{1}\left(r\prod_{i=1}^{N}d_{2i}\right)&\begin{array}[]{c}\bra{n_{1,1}}\\ \vdots\\ \bra{n_{r,\prod_{i=1}^{N}d_{2i}}}\end{array}\\ \hline\cr\begin{array}[]{ccc}\ket{n_{1,1}}&\ldots&\ket{n_{r,\prod_{i=1}^{N}d_{2i}}}\end{array}&\mathds{1}_{1,3,\dots,2N-1}\end{array}\right),

(71)

wherein $\mathds{1}(d)$ denotes a $d$ -dimensional identity matrix, and

\ket{n_{i,j}}:=\innerproduct{j}{\dot{\tilde{N}}_{\phi,i}^{*}},

(72)

where $\ket{\dot{\tilde{N}}_{\phi,i}}=\ket{\dot{N}_{\phi,i}}-\mathrm{i}\sum_{k}\ket{N_{\phi,k}}h_{ki}$ and $\left\{\ket{j},\ j=1,\dots,\prod_{k=1}^{N}d_{2k}\right\}$ forms an orthonormal basis of $\otimes_{k=1}^{N}\mathcal{H}_{2k}$ ; the dual vector $\bra{j}$ is understood to act as the identity on the remaining subspaces. Note that

\sum_{i,j}\outerproduct{n_{i,j}}{n_{i,j}}=\frac{1}{4}\Tr_{2,4,\dots,2N}\Omega_{\phi}(h).

(73)

By the Schur complement lemma [65, Theorem 1.12], the constraint in Eq. (70) is equivalent to the requirement that $A\geq 0$ . Hence, the QFI for parallel strategies is solved by

$\displaystyle\min_{\lambda,h}$	$\displaystyle\,\lambda,$	(74)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,A\geq 0,$
	$\displaystyle\,h\in\mathbb{H}_{r}.$

The problem can be solved by SDP since $h$ is incorporated linearly in the blocks of $A$ .
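The Schur complement step can be illustrated numerically for a fixed set of vectors. In the sketch below (NumPy assumed, all names hypothetical) the columns of a random matrix `M` play the role of the vectors $\ket{n_{i,j}}$ at a fixed $h$ , so that the smallest feasible $\lambda$ in Eq. (70) is four times the largest eigenvalue of $\sum_{i,j}\outerproduct{n_{i,j}}{n_{i,j}}$ . The block matrix of Eq. (71) should then be positive semidefinite just above this value and indefinite just below it. This is only a check of the equivalence, not an SDP solver, and it does not optimize over $h$ .

```python
import numpy as np

def block_matrix(lam, M):
    """Block matrix of the form in Eq. (71): [[lam/4 * 1, M^dagger], [M, 1]]."""
    n, K = M.shape
    return np.block([[lam / 4 * np.eye(K), M.conj().T],
                     [M, np.eye(n)]])

def min_eig(A):
    """Smallest eigenvalue of a Hermitian matrix."""
    return np.linalg.eigvalsh(A).min()

rng = np.random.default_rng(2)
n, K = 4, 6   # n: dimension of the lower-right block, K: number of vectors
M = rng.normal(size=(n, K)) + 1j * rng.normal(size=(n, K))

# Smallest lam with lam * 1 >= 4 * sum_k |n_k><n_k| = 4 M M^dagger
lam_star = 4 * np.linalg.eigvalsh(M @ M.conj().T).max()

above = min_eig(block_matrix(1.01 * lam_star, M))
below = min_eig(block_matrix(0.99 * lam_star, M))
print("lam_star =", lam_star)
print("min eigenvalue of A just above lam_star:", above)
print("min eigenvalue of A just below lam_star:", below)
```

In the actual problem the entries of `M` depend linearly on $h$ , which is what keeps the constraint $A\geq 0$ a linear matrix inequality in $(\lambda,h)$ .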

Symmetry reduction.—We can reduce the problem using permutation symmetry. For $\mathsf{Par}$ , the set $\tilde{\mathsf{P}}$ is given by $\tilde{\mathsf{P}}=\left\{S\geq 0\middle|S\in\mathsf{S}\right\}$ , and the affine space $\mathsf{S}$ is permutation invariant. As explained in Appendix D we can decompose the permutation-invariant $h=U^{\prime}\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes h^{\mu}\right]U^{\prime\dagger}$ with $h^{\mu}$ as an $m_{\mu}^{\prime}\times m_{\mu}^{\prime}$ matrix. $Q\in\overline{\mathsf{S}}$ is characterized by the constraint $\Tr_{2,4,\dots,2N}Q=\mathds{1}_{1,3,\dots,2N-1}$ , and we can decompose $Q=U\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes Q^{\mu}\right]U^{\dagger}$ with $Q^{\mu}$ as an $m_{\mu}\times m_{\mu}$ matrix. If we define

A^{\mu}:=\left(\begin{array}[]{c | c}\frac{\lambda}{4}\mathds{1}\left(d_{\mu}m_{\mu}^{\prime}\right)&\left(U^{\mu,1\dagger}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*}\right)^{\dagger}\\ \hline\cr U^{\mu,1\dagger}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*}&Q^{\mu}\end{array}\right),

(75)

where $\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu}$ is given by Eq. (42), then by Theorem 3 we can reformulate the optimization problem as

$\displaystyle\min_{\lambda,Q^{\mu},h^{\mu}}$	$\displaystyle\,\lambda,$	(76)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,A^{\mu}\geq 0,\ h^{\mu}\in\mathbb{H}_{m_{\mu}^{\prime}},\ |\mu|=N,$
	$\displaystyle\Tr_{2,4,\dots,2N}Q=\mathds{1}_{1,3,\dots,2N-1}.$

The constraint $\Tr_{2,4,\dots,2N}Q=\mathds{1}_{1,3,\dots,2N-1}$ only needs to be imposed explicitly on the permutation-invariant subspace, since $\Tr_{2,4,\dots,2N}Q$ is permutation invariant. Therefore, both the number of scalar variables and the number of scalar constraints are polynomial with respect to $N$ .

E.2 Sequential strategies

For sequential strategies the problem can be written as (having traced over $\mathcal{H}_{2N}$ )

$\displaystyle J^{(\mathsf{Seq})}(N_{\phi})=$	$\displaystyle\min_{h\in\mathbb{H}_{r}}\max_{P^{(k)}}\Tr\left[P^{(N)}\Tr_{2N}\Omega_{\phi}(h)\right],$	(77)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,P^{(N)}\geq 0,$
	$\displaystyle\Tr_{2k-1}P^{(k)}=\mathds{1}_{2k-2}\otimes P^{(k-1)},\ k=2,\dots,N,$
	$\displaystyle\Tr P^{(1)}=1,$

from which it follows that the dual problem is

$\displaystyle\min_{\lambda,Q^{(k)},h}$	$\displaystyle\,\lambda,$	(78)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\lambda\mathds{1}_{2N-1}\otimes Q^{(N-1)}\geq\Tr_{2N}\Omega_{\phi}(h),$
	$\displaystyle\Tr_{2k}Q^{(k)}=\mathds{1}_{2k-1}\otimes Q^{(k-1)},\ k=2,\dots,N-1,$
	$\displaystyle\Tr_{2}Q^{(1)}=\mathds{1}_{1},$

where $Q^{(N-1)}$ is Hermitian.
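The linear constraints in Eq. (78) are partial-trace conditions, which are straightforward to encode once a partial-trace routine is available. The following sketch assumes NumPy; the helper `partial_trace` and the product-form operators are hypothetical and serve only to produce one point that satisfies the recursion, with subsystems ordered as $\mathcal{H}_{1},\mathcal{H}_{2},\dots$ and indexed from 0 in the code.

```python
import numpy as np

def partial_trace(X, dims, traced):
    """Trace out the subsystems listed in `traced` (0-based) of an operator on (x)_k C^{dims[k]}."""
    n = len(dims)
    T = X.reshape(tuple(dims) + tuple(dims))
    # Contract the row and column index of each traced subsystem, highest index first
    for k in sorted(traced, reverse=True):
        T = np.trace(T, axis1=k, axis2=k + T.ndim // 2)
    kept = int(np.prod([dims[k] for k in range(n) if k not in traced]))
    return T.reshape(kept, kept)

def random_unit_trace_hermitian(rng, d):
    """Hermitian matrix with unit trace (not necessarily positive)."""
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    H = M + M.conj().T
    return H / np.trace(H).real

d = 2
rng = np.random.default_rng(3)
sigma1 = random_unit_trace_hermitian(rng, d)
sigma2 = random_unit_trace_hermitian(rng, d)

Q1 = np.kron(np.eye(d), sigma1)                # acts on H_1 (x) H_2
Q2 = np.kron(Q1, np.kron(np.eye(d), sigma2))   # acts on H_1 (x) ... (x) H_4

# Tr_2 Q^(1) = 1_1
ok_first = np.allclose(partial_trace(Q1, [d, d], [1]), np.eye(d))
# Tr_4 Q^(2) = Q^(1) (x) 1_3
ok_second = np.allclose(partial_trace(Q2, [d] * 4, [3]), np.kron(Q1, np.eye(d)))
print(ok_first, ok_second)
```

Since positivity is not imposed on the dual variables, the random Hermitian factors need not be density matrices; only Hermiticity and the trace conditions matter here.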

Similarly, in order to solve the problem via SDP we rewrite it as

$\displaystyle\min_{\lambda,Q^{(k)},h}$	$\displaystyle\,\lambda,$	(79)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,A\geq 0,$
	$\displaystyle\Tr_{2k}Q^{(k)}=\mathds{1}_{2k-1}\otimes Q^{(k-1)},\ k=2,\dots,N-1,$
	$\displaystyle\Tr_{2}Q^{(1)}=\mathds{1}_{1},$
	$\displaystyle\,h\in\mathbb{H}_{r},$

for

A:=\left(\begin{array}[]{c | c}\frac{\lambda}{4}\mathds{1}\left(rd_{2N}\right)&\begin{array}[]{c}\bra{n_{1,1}}\\ \vdots\\ \bra{n_{r,d_{2N}}}\end{array}\\ \hline\cr\begin{array}[]{ccc}\ket{n_{1,1}}&\ldots&\ket{n_{r,d_{2N}}}\end{array}&\mathds{1}_{2N-1}\otimes Q^{(N-1)}\end{array}\right),

(80)

having defined

\ket{n_{i,j}}:=\innerproduct{j}{\dot{\tilde{N}}_{\phi,i}^{*}},

(81)

where $\ket{\dot{\tilde{N}}_{\phi,i}}=\ket{\dot{N}_{\phi,i}}-\mathrm{i}\sum_{k}\ket{N_{\phi,k}}h_{ki}$ and $\left\{\ket{j},\ j=1,\dots,d_{2N}\right\}$ forms an orthonormal basis of $\mathcal{H}_{2N}$ .

E.3 Quantum SWITCH strategies

We first formally define a quantum SWITCH strategy set $\mathsf{SWI}$ as the collection of $P\in\mathsf{Strat}$ such that

P=\left(\rho_{T,A,C}\right)*\outerproduct{P^{(\mathrm{SW})}}{P^{(\mathrm{SW})}},\ \rho_{T,A,C}\geq 0,\ \Tr\rho_{T,A,C}=1,

(82)

where $\ket{P^{(\mathrm{SW})}}:=\lvert I\rrangle_{A,F_{A}}\sum_{\pi\in S_{N}}\left[\ket{\pi}_{C}\lvert I\rrangle_{T,2\pi(1)-1}\left(\otimes_{i=1}^{N-1}\lvert I\rrangle_{2\pi(i),2\pi(i+1)-1}\right)\lvert I\rrangle_{2\pi(N),F_{T}}\ket{\pi}_{F_{C}}\right]$ corresponds to a (generalized) quantum SWITCH for $N$ operations, each permutation $\pi$ is an element of the symmetric group $S_{N}$ whose order is $N!$ , and $\{\ket{\pi}_{C}\}$ forms an orthonormal basis. We suppose each $\mathcal{H}_{i}$ for $i=1,\dots,2N$ has the same dimension $d_{1}$ . Here $\mathcal{H}_{T}\simeq\mathcal{H}_{i}$ denotes the input space of the target system, $\mathcal{H}_{A}$ the ancillary space, and $\mathcal{H}_{C}$ the space of the control system. Correspondingly, $\mathcal{H}_{F_{T}}$ , $\mathcal{H}_{F_{A}}$ and $\mathcal{H}_{F_{C}}$ denote the future output spaces of each part. The global future space is $\mathcal{H}_{F}=\mathcal{H}_{F_{T}}\otimes\mathcal{H}_{F_{A}}\otimes\mathcal{H}_{F_{C}}$ .

Using the quantum SWITCH strategy set, after tracing over the global future space $\mathcal{H}_{F}$ the QFI evaluation problem is written as

$\displaystyle J^{(\mathsf{SWI})}(N_{\phi})=$	$\displaystyle\min_{h\in\mathbb{H}_{r}}\max_{\tilde{P}}\Tr\left[\tilde{P}\Omega_{\phi}(h)\right],$	(83)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\tilde{P}=\sum_{\pi\in S_{N}}q^{\pi}\rho_{2\pi(1)-1}^{\pi}\left(\otimes_{i=1}^{N-1}\lvert I\rrangle_{2\pi(i),2\pi(i+1)-1}\llangle I\rvert_{2\pi(i),2\pi(i+1)-1}\right)\otimes\mathds{1}_{2\pi(N)},$
	$\displaystyle\sum_{\pi\in S_{N}}q^{\pi}=1,$
	$\displaystyle\,\rho_{2\pi(1)-1}^{\pi}\geq 0,\ \Tr\rho_{2\pi(1)-1}^{\pi}=1,\ q^{\pi}\geq 0,\ \pi\in S_{N},$

where the superscript $\pi$ of an operator denotes a permutation label, and the subscript denotes the subspace it lies in. Note that the primal set of $\tilde{P}$ is a convex hull of affine spaces. Equivalently the problem can be rewritten as

$\displaystyle\min_{h\in\mathbb{H}_{r}}\max_{\rho_{2\pi(1)-1}^{\pi},q^{\pi}}$	$\displaystyle\sum_{\pi\in S_{N}}\Tr\left[q^{\pi}\rho_{2\pi(1)-1}^{\pi}\left(\otimes_{i=1}^{N-1}\llangle I\rvert_{2\pi(i),2\pi(i+1)-1}\right)\Tr_{2\pi(N)}\Omega_{\phi}(h)\left(\otimes_{j=1}^{N-1}\lvert I\rrangle_{2\pi(j),2\pi(j+1)-1}\right)\right],$	(84)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\sum_{\pi\in S_{N}}q^{\pi}=1,$
	$\displaystyle\,\rho_{2\pi(1)-1}^{\pi}\geq 0,\ \Tr\rho_{2\pi(1)-1}^{\pi}=1,\ q^{\pi}\geq 0,\ \pi\in S_{N}.$

Following the method in the proof of Theorem 1, the dual problem is given by

	$\displaystyle\min_{\lambda,h}$	$\displaystyle\,\lambda,$		(85)
	$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\lambda\mathds{1}_{2\pi(1)-1}\geq\Omega_{\phi}^{\pi}(h),\ \Omega_{\phi}^{\pi}(h):=\left(\otimes_{i=1}^{N-1}\llangle I\rvert_{2\pi(i),2\pi(i+1)-1}\right)\Tr_{2\pi(N)}\Omega_{\phi}(h)\left(\otimes_{j=1}^{N-1}\lvert I\rrangle_{2\pi(j),2\pi(j+1)-1}\right),\ \pi\in S_{N}.$

Equivalently in an SDP form the problem is written as

$\displaystyle\min_{\lambda,h}$	$\displaystyle\,\lambda,$	(86)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,A^{\pi}\geq 0,\ \pi\in S_{N},$
	$\displaystyle\,h\in\mathbb{H}_{r},$

having defined

A^{\pi}:=\left(\begin{array}[]{c | c}\frac{\lambda}{4}\mathds{1}(rd_{1})&\begin{array}[]{c}\bra{n_{1,1}^{\pi}}\\ \vdots\\ \bra{n_{r,d_{1}}^{\pi}}\end{array}\\ \hline\cr\begin{array}[]{ccc}\ket{n_{1,1}^{\pi}}&\ldots&\ket{n_{r,d_{1}}^{\pi}}\end{array}&\mathds{1}_{2\pi(1)-1}\end{array}\right),

(87)

for

\ket{n_{i,j}^{\pi}}:=\bra{j^{\pi}}\left(\otimes_{k=1}^{N-1}\llangle I\rvert_{2\pi(k),2\pi(k+1)-1}\right)\ket{\dot{\tilde{N}}_{\phi,i}^{*}},

(88)

where $\ket{\dot{\tilde{N}}_{\phi,i}}=\ket{\dot{N}_{\phi,i}}-\mathrm{i}\sum_{k}\ket{N_{\phi,k}}h_{ki}$ and $\{\ket{j^{\pi}}\}$ forms an orthonormal basis of $\mathcal{H}_{2\pi(N)}$ .

Symmetry reduction.—For $\mathsf{SWI}$ , by Theorem 2, $h$ can be taken to be permutation invariant and the constraint corresponding to one permutation $\pi$ (e.g., the identity element of $S_{N}$ ) is sufficient. As explained in Appendix D we can decompose the permutation-invariant $h=U^{\prime}\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes h^{\mu}\right]U^{\prime\dagger}$ with $h^{\mu}$ as an $m_{\mu}^{\prime}\times m_{\mu}^{\prime}$ matrix. We define

\mathbf{n}_{j_{2N}}^{\mu}:=\bra{j_{2N}}\left(\otimes_{i=1}^{N-1}\llangle I\rvert_{2i,2i+1}\right)\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*},

(89)

where $\{\ket{j_{2N}}\}$ forms an orthonormal basis of $\mathcal{H}_{2N}$ and $\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu}$ is given by Eq. (42). If we define

A^{(\mathrm{inv})}:=\left(\begin{array}[]{c | c}\frac{\lambda}{4}\mathds{1}(rd_{1})&\begin{array}[]{c}\mathbf{n}_{1}^{\mu^{1}\dagger}\\ \vdots\\ \mathbf{n}_{d_{1}}^{\mu^{I}\dagger}\end{array}\\ \hline\cr\begin{array}[]{ccc}\mathbf{n}_{1}^{\mu^{1}}&\ldots&\mathbf{n}_{d_{1}}^{\mu^{I}}\end{array}&\mathds{1}_{1}\end{array}\right),

(90)

then by Theorem 2 we can reformulate the optimization problem as

$\displaystyle\min_{\lambda,h^{\mu}}$	$\displaystyle\,\lambda,$	(91)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,A^{(\mathrm{inv})}\geq 0,$
	$\displaystyle\,h^{\mu}\in\mathbb{H}_{m_{\mu}^{\prime}},\ |\mu|=N.$

E.4 Causal superposition strategies

Following a similar route for the causal superposition strategy set the problem can be written as

$\displaystyle J^{(\mathsf{Sup})}(N_{\phi})=$	$\displaystyle\min_{h\in\mathbb{H}_{r}}\max_{P^{\pi,(N)},q^{\pi}}\sum_{\pi\in S_{N}}\Tr\left(q^{\pi}P^{\pi,(N)}\Tr_{2\pi(N)}\Omega_{\phi}(h)\right),$	(92)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\sum_{\pi\in S_{N}}q^{\pi}=1,$
	$\displaystyle\,q^{\pi}\geq 0,\ P^{\pi,(N)}\geq 0,\ \Tr P^{\pi,(1)}=1,\ \Tr_{2k-1}P^{\pi,(k)}=\mathds{1}_{2k-2}\otimes P^{\pi,(k-1)}\ \mathrm{for}\ k=2,\dots,N,\ \pi\in S_{N}.$

For each causal order in the superposition the dual affine space is the set of dual combs without the positivity constraint. Thus the dual problem is given by

$\displaystyle\min$	$\displaystyle\,\lambda,$	(93)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,\lambda\mathds{1}_{2\pi(N)-1}\otimes Q^{\pi,(N-1)}\geq\Tr_{2\pi(N)}\Omega_{\phi}(h),\ \Tr_{2\pi(1)}Q^{\pi,(1)}=\mathds{1}_{2\pi(1)-1},\ \pi\in S_{N},$
	$\displaystyle\Tr_{2\pi(k)}Q^{\pi,(k)}=\mathds{1}_{2\pi(k)-1}\otimes Q^{\pi,(k-1)}\ \mathrm{for}\ k=2,\dots,N-1,\ \pi\in S_{N},$

where the constraints hold for any $\pi\in S_{N}$ . To solve the problem via SDP we can formulate it as

$\displaystyle\min_{\lambda,h}$	$\displaystyle\,\lambda,$	(94)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,A^{\pi}\geq 0,\ \Tr_{2\pi(1)}Q^{\pi,(1)}=\mathds{1}_{2\pi(1)-1},\ \Tr_{2\pi(k)}Q^{\pi,(k)}=\mathds{1}_{2\pi(k)-1}\otimes Q^{\pi,(k-1)}\ \mathrm{for}\ k=2,\dots,N-1,\ \pi\in S_{N},$
	$\displaystyle\,h\in\mathbb{H}_{r},$

having defined

A^{\pi}:=\left(\begin{array}[]{c | c}\frac{\lambda}{4}\mathds{1}\left(rd_{2\pi(N)}\right)&\begin{array}[]{c}\bra{n_{1,1}^{\pi}}\\ \vdots\\ \bra{n_{r,d_{2\pi(N)}}^{\pi}}\end{array}\\ \hline\cr\begin{array}[]{ccc}\ket{n_{1,1}^{\pi}}&\ldots&\ket{n_{r,d_{2\pi(N)}}^{\pi}}\end{array}&\mathds{1}_{2\pi(N)-1}\otimes Q^{\pi,(N-1)}\end{array}\right),

(95)

for

\ket{n_{i,j}^{\pi}}:=\bra{j^{\pi}}\ket{\dot{\tilde{N}}_{\phi,i}^{*}},

(96)

where $\ket{\dot{\tilde{N}}_{\phi,j}}=\ket{\dot{N}_{\phi,j}}-\mathrm{i}\sum_{k}\ket{N_{\phi,k}}h_{kj}$ and $\{\ket{j^{\pi}}\}$ forms an orthonormal basis of $\mathcal{H}_{2\pi(N)}$ .

Symmetry reduction.—Similar to $\mathsf{SWI}$ , by Theorem 2, for $\mathsf{Sup}$ we take a permutation-invariant $h$ and only need the constraint corresponding to the identity element of $S_{N}$ . As explained in Appendix D we can decompose the permutation-invariant $h=U^{\prime}\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes h^{\mu}\right]U^{\prime\dagger}$ with $h^{\mu}$ as an $m_{\mu}^{\prime}\times m_{\mu}^{\prime}$ matrix. We define

\mathbf{n}_{j_{2N}}^{\mu}:=\bra{j_{2N}}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*},

(97)

where $\{\ket{j_{2N}}\}$ forms an orthonormal basis of $\mathcal{H}_{2N}$ and $\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu}$ is given by Eq. (42). If we define

A^{(\mathrm{inv})}:=\left(\begin{array}[]{c | c}\frac{\lambda}{4}\mathds{1}\left(rd_{2N}\right)&\begin{array}[]{c}\mathbf{n}_{1}^{\mu^{1}\dagger}\\ \vdots\\ \mathbf{n}_{d_{1}}^{\mu^{I}\dagger}\end{array}\\ \hline\cr\begin{array}[]{ccc}\mathbf{n}_{1}^{\mu^{1}}&\ldots&\mathbf{n}_{d_{1}}^{\mu^{I}}\end{array}&\mathds{1}_{2N-1}\otimes Q^{(N-1)}\end{array}\right),

(98)

then by Theorem 2 we can reformulate the optimization problem as

$\displaystyle\min_{\lambda,Q^{(k)},h}$	$\displaystyle\,\lambda,$	(99)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,A^{(\mathrm{inv})}\geq 0,$
	$\displaystyle\Tr_{2k}Q^{(k)}=\mathds{1}_{2k-1}\otimes Q^{(k-1)},\ k=2,\dots,N-1,$
	$\displaystyle\Tr_{2}Q^{(1)}=\mathds{1}_{1},$
	$\displaystyle\,h^{\mu}\in\mathbb{H}_{m_{\mu}^{\prime}},\ |\mu|=N.$

E.5 General indefinite-causal-order strategies

In this case the explicit linear constraints on strategies have been derived in [21], and the dual affine space turns out to be the set of CJ operators of $N$ -partite no-signaling quantum channels without the positivity constraint [16, 43], mathematically defined by

	$\displaystyle\Tr_{2k}Q$	$\displaystyle=\frac{\mathds{1}_{2k-1}}{d_{2k-1}}\otimes\Tr_{2k-1,2k}Q,\ k=1,\dots,N,$		(100)
	$\displaystyle\Tr Q$	$\displaystyle=\prod_{i=1}^{N}d_{2i-1}.$

The intuitive interpretation for no-signaling channels is that locally the input of each channel only affects the output of this single channel, but cannot transmit any information to $N-1$ other channels. To solve the QFI evaluation problem via SDP we can write it in the form

$\displaystyle\min_{\lambda,Q,h}$	$\displaystyle\,\lambda,$	(101)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,A\geq 0,$
	$\displaystyle\Tr_{2k}Q=\frac{\mathds{1}_{2k-1}}{d_{2k-1}}\otimes\Tr_{2k-1,2k}Q,\ k=1,\dots,N,$
	$\displaystyle\Tr Q=\prod_{i=1}^{N}d_{2i-1},$
	$\displaystyle\,h\in\mathbb{H}_{r},$

having defined

A=\left(\begin{array}[]{c | c}\frac{\lambda}{4}\mathds{1}\left(r\right)&\begin{array}[]{c}\bra{\dot{\tilde{N}}_{\phi,1}^{*}}\\ \vdots\\ \bra{\dot{\tilde{N}}_{\phi,r}^{*}}\end{array}\\ \hline\cr\begin{array}[]{ccc}\ket{\dot{\tilde{N}}_{\phi,1}^{*}}&\ldots&\ket{\dot{\tilde{N}}_{\phi,r}^{*}}\end{array}&Q\end{array}\right).

(102)

Symmetry reduction.—For $\mathsf{ICO}$ , by Lemmas 1 and 2, both $h$ and $Q$ can be taken to be permutation invariant. As explained in Appendix D we can decompose the permutation-invariant $h=U^{\prime}\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes h^{\mu}\right]U^{\prime\dagger}$ with $h^{\mu}$ as an $m_{\mu}^{\prime}\times m_{\mu}^{\prime}$ matrix, and decompose the permutation-invariant $Q=U\left[\bigoplus_{|\mu|=N}\mathds{1}\left(d_{\mu}\right)\otimes Q^{\mu}\right]U^{\dagger}$ with $Q^{\mu}$ as an $m_{\mu}\times m_{\mu}$ matrix. If we define

A^{\mu}:=\left(\begin{array}[]{c | c}\frac{\lambda}{4}\mathds{1}\left(d_{\mu}m_{\mu}^{\prime}\right)&\left(U^{\mu,1\dagger}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*}\right)^{\dagger}\\ \hline\cr U^{\mu,1\dagger}\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu*}&Q^{\mu}\end{array}\right),

(103)

where $\dot{\tilde{\mathbf{N}}}_{\phi}^{\mu}$ is given by Eq. (42), then by Theorem 3 we can reformulate the optimization problem as

$\displaystyle\min_{\lambda,Q^{\mu},h}$	$\displaystyle\,\lambda,$	(104)
$\displaystyle\mathrm{s.t.}$	$\displaystyle\,A^{\mu}\geq 0,\ h^{\mu}\in\mathbb{H}_{m_{\mu}^{\prime}},\ |\mu|=N,$
	$\displaystyle\Tr_{2N}Q=\frac{\mathds{1}_{2N-1}}{d_{2N-1}}\otimes\Tr_{2N-1,2N}Q,$
	$\displaystyle\sum_{|\mu|=N}d_{\mu}\Tr Q^{\mu}=\prod_{i=1}^{N}d_{2i-1},$

having removed the redundant constraints on $Q$ by permutation symmetry. Similar to the case of $\mathsf{Par}$ , we only need to consider the constraint on $Q$ on the permutation-invariant subspace, since both $Q$ and $\Tr_{2N-1,2N}Q$ are permutation invariant. Therefore, both the number of scalar variables and the number of constraints in terms of scalar variables are polynomial with respect to $N$ .

Appendix F Complexity analysis

Here we refer to the number of real scalar variables concerned in optimization as the complexity. In Appendix E we have presented both the original and the symmetry reduced QFI evaluation for all families of strategies considered in this work as applicable, from which we can obtain Table 1 in the main text.

Now we consider the algorithm for identifying an optimal strategy. Since the algorithm is based on Theorem 1, its complexity is no less than that of the QFI evaluation. For the second step of the algorithm, by Lemma 4 there exists a permutation-invariant optimal strategy for all families of strategies except for the sequential one, and thus we can apply the group-invariant SDP to achieve the permutation-invariant solution. Recall that $\tilde{P}=\sum_{i=1}^{K}q^{i}S^{i}$ for $S^{i}\in\mathsf{S}^{i}$ . In fact, for $\mathsf{SWI}$ and $\mathsf{Sup}$ , we only need to characterize $S^{1}\in\mathsf{S}^{1}$ and obtain $\tilde{P}=\sum_{i=1}^{K}\frac{1}{K}S^{1}$ due to the permutation invariance. In summary, taking both steps of the algorithm into account, we obtain Table 2:

Table 2: Complexity of Algorithm 1 for each family of strategies (with respect to $N$). The asymptotic numbers of variables in optimization are compared between the original (Ori.) and group-invariant (Inv.) SDP. We denote $d:=d_{1}d_{2}$ for $d_{i}:=\mathsf{dim}(\mathcal{H}_{i})$ and $s:=\max_{\phi}\mathsf{rank}(E_{\phi})\leq d$.

SDP	$\mathsf{Par}$	$\mathsf{Seq}$	$\mathsf{SWI}$	$\mathsf{Sup}$	$\mathsf{ICO}$
Ori.	$O\left(\max(s,d_{1})^{N}\right)$	$O\left(d^{N}\right)$	$O\left(N!\right)$	$O\left(N!\,d^{N}\right)$	$O\left(d^{N}\right)$
Inv.	$O\left(N^{d^{2}-1}\right)$	$O\left(d^{N}\right)$	$O(N^{s^{2}-1})$	$O\left(d^{N}\right)$	$O\left(N^{d^{2}-1}\right)$

As a concrete example, we apply the group-invariant SDP to the evaluation of the QFI $J^{(\mathsf{ICO})}$ for the general indefinite-causal-order strategies up to $N=5$. We consider the amplitude damping noise and take $\phi=1.0$, $t=1.0$ and $p=0.5$. As illustrated in Fig. 3, $J^{(\mathsf{ICO})}$ grows faster than linearly but slower than quadratically in $N$. Table 3 compares the complexity of the original and the group-invariant SDP and indicates that the symmetry-reduced approach dramatically reduces the computational resources required.

Table 3: Number of real scalar variables concerned in the evaluation of $J^{(\mathsf{ICO})}$. For the original SDP the total number of real scalar variables for $Q$, $h$ and $\lambda$ is $16^{N}+4^{N}+1$, while for the group-invariant SDP the total number of real scalar variables for $Q^{\mu}$, $h^{\mu}$ and $\lambda$ for all Young diagram labels $|\mu|=N$ is $\binom{N+15}{15}+\binom{N+3}{3}+1$.

$N$	1	2	3	4	5
No. of variables in the original SDP	21	273	4161	65793	1049601
No. of variables in the group-invariant SDP	21	147	837	3912	15561
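The two closed-form counts quoted in the caption of Table 3 can be checked against the tabulated entries with a few lines of Python. This is only a sanity-check sketch; the function names `n_vars_original` and `n_vars_invariant` are illustrative and do not come from the paper.

```python
from math import comb

def n_vars_original(n: int) -> int:
    """Variables for Q, h and lambda in the original SDP: 16^N + 4^N + 1."""
    return 16**n + 4**n + 1

def n_vars_invariant(n: int) -> int:
    """Variables for Q^mu, h^mu and lambda in the group-invariant SDP."""
    return comb(n + 15, 15) + comb(n + 3, 3) + 1

# Print one row per N, in the same order as the columns of Table 3.
for n in range(1, 6):
    print(n, n_vars_original(n), n_vars_invariant(n))
```

The first count grows exponentially in $N$, whereas the second is a polynomial of degree 15 in $N$, which is where the savings at $N=5$ come from.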

Appendix G Supplementary numerical results

G.1 Hierarchy for the $N=3$ case

Numerical results in this work are obtained by implementing the SDPs using the open-source Python package CVXPY [66, 67] with the solver MOSEK [68]. We plot the QFI for the amplitude damping noise with $N=3$ in Fig. 4 and observe a similar hierarchy of the estimation performance using parallel, sequential and indefinite-causal-order strategies. Different from the $N=2$ case presented in the main text, general indefinite-causal-order strategies indeed provide a small advantage over causal superposition strategies, as shown in Table 4.

Table 4: Hierarchy of QFI using $\mathsf{ICO}$ and $\mathsf{Sup}$

$p$	$J^{(\mathsf{Sup})}(N_{\phi})$	$J^{(\mathsf{ICO})}(N_{\phi})$
0.1	8.185	8.200
0.2	7.364	7.375
0.3	6.523	6.524
0.4	5.642	5.647
0.5	4.725	4.743
0.6	3.786	3.815
0.7	2.832	2.870
0.8	1.871	1.909
0.9	0.918	0.930

On the other hand, analogous to the $N=2$ case, simple quantum SWITCH strategies without any additional intermediate control operations can have an advantage over any definite-causal-order strategy when the decay parameter $p$ is small, but this advantage becomes less significant. This should not be surprising, since the control can make a bigger difference as $N$ grows.

G.2 Estimation of randomly sampled channels

To demonstrate the universality of the hierarchy of different families of strategies considered in the main text, we randomly sample noise channels drawn from an ensemble of CPTP maps defined by Bruzda et al. in [69]. In this work we only sample rank-2 qubit channels for $N=2$, which is enough to show the hierarchy. The sampling process is implemented via the open-source Python package QuTiP [70, 71]. We set an error tolerance of $10^{-8}$, i.e., we claim $J_{1}>J_{2}$ only if the gap is no smaller than $10^{-8}$. We find that for 984 of 1000 random channels, a strict hierarchy $J^{(\mathsf{Par})}<J^{(\mathsf{Seq})}<J^{(\mathsf{Sup})}<J^{(\mathsf{ICO})}$ holds, implying that general indefinite-causal-order strategies can provide an advantage over causal superposition strategies. In addition, we find that among the same 1000 channels $J^{(\mathsf{Par})}<J^{(\mathsf{SWI})}$ holds for 34 channels and $J^{(\mathsf{Seq})}<J^{(\mathsf{SWI})}$ for only 1 channel, so with high probability quantum SWITCH strategies cannot outperform strategies following a definite causal order for a random noise channel, which highlights the estimation enhancement from intermediate control in the general case.
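The tolerance rule stated above can be written as a small comparison helper. The sketch below is an illustration only: the helper names are hypothetical, and the sample inputs are the $p=0.5$ entries of Table 4 rather than any of the randomly sampled channels, whose individual QFI values are not listed in this appendix.

```python
TOL = 1e-8  # error tolerance quoted in the text

def strictly_less(j_low: float, j_high: float, tol: float = TOL) -> bool:
    """Claim j_low < j_high only if the gap is no smaller than tol."""
    return j_high - j_low >= tol

def strict_hierarchy(values, tol: float = TOL) -> bool:
    """True if every consecutive pair is strictly ordered beyond the tolerance."""
    return all(strictly_less(a, b, tol) for a, b in zip(values, values[1:]))

# Table 4, p = 0.5: J^(Sup) = 4.725 and J^(ICO) = 4.743.
print(strictly_less(4.725, 4.743))
# A gap of 1e-10 is below the tolerance, so no strict ordering is claimed.
print(strictly_less(1.0, 1.0 + 1e-10))
```

With this convention, differences below the tolerance are treated as ties, so a channel is counted towards the strict hierarchy only when every consecutive gap clears the threshold.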

Appendix H Comparison with asymptotic results

In this section we focus on strategies following definite causal order, i.e., parallel and sequential ones, and compare our results with those of the extensively studied asymptotic theory.

H.1 Preliminaries

We first introduce some basic notions. If we write the operator-sum representation of the channel as $\mathcal{E}_{\phi}(\rho)=\sum_{i=1}^{r}K_{\phi,i}\rho K_{\phi,i}^{\dagger}$, where $\{K_{\phi,i}\}$ is a set of Kraus operators and $r$ is the rank of the channel, the channel QFI can be evaluated by optimization:

J_{Q}^{(\mathrm{chan})}(\mathcal{E}_{\phi})=4\min_{h\in\mathbb{H}_{r}}\lVert\alpha\rVert,

(105)

where $\lVert\cdot\rVert$ denotes the operator norm and $\alpha=\sum_{i}\dot{\tilde{K}}^{\dagger}_{\phi,i}\dot{\tilde{K}}_{\phi,i}$ . Here $\dot{\tilde{K}}_{\phi,i}=\dot{K}_{\phi,i}-\mathrm{i}\sum_{j=1}^{r}h_{ij}K_{\phi,j}$ is nothing but the derivative of an equivalent Kraus representation, given an $r\times r$ Hermitian matrix $h$ .

The upper bounds on QFI of $N$ quantum channels have been derived for both sequential and parallel strategies. For parallel strategies an asymptotically tight upper bound is [28, 30]

J^{(\mathsf{Par})}(N_{\phi})\leq 4\min_{h\in\mathbb{H}_{r}}[N\lVert\alpha\rVert+N(N-1)\lVert\beta\rVert^{2}],

(106)

where $\beta=\mathrm{i}\sum_{i}K_{\phi,i}^{\dagger}\dot{\tilde{K}}_{\phi,i}$ . An upper bound was also derived for sequential strategies [7, 45]:

J^{(\mathsf{Seq})}(N_{\phi})\leq 4\min_{h\in\mathbb{H}_{r}}[N\lVert\alpha\rVert+N(N-1)\lVert\beta\rVert(\lVert\beta\rVert+2\sqrt{\lVert\alpha\rVert})].

(107)

It has been shown that the QFI follows the standard quantum limit if and only if there exists an $h$ such that $\beta=0$ [45]. In this case sequential strategies provide no advantage asymptotically, and we have

\lim_{N\rightarrow\infty}\frac{1}{N}J^{(\mathsf{Par})}(N_{\phi})=\lim_{N\rightarrow\infty}\frac{1}{N}J^{(\mathsf{Seq})}(N_{\phi})=4\min_{h\in\mathbb{H}_{r}\ \mathrm{s.t.}\ \beta=0}\lVert\alpha\rVert.

(108)

We remark that the minimization in Eq. (106) can be efficiently evaluated via SDP [30].
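For a quick numerical feel for Eqs. (106) and (107), the bracketed expressions can be evaluated for given values of $\lVert\alpha\rVert$ and $\lVert\beta\rVert$. The sketch below assumes these two norms have already been computed for some fixed $h$; since both bounds are minima over $h$, evaluating at a fixed $h$ still gives a valid, though possibly loose, upper bound. The function names are illustrative.

```python
import math

def parallel_bound(n: int, alpha_norm: float, beta_norm: float) -> float:
    """Right-hand side of Eq. (106) evaluated at a fixed h."""
    return 4 * (n * alpha_norm + n * (n - 1) * beta_norm**2)

def sequential_bound(n: int, alpha_norm: float, beta_norm: float) -> float:
    """Right-hand side of Eq. (107) evaluated at a fixed h."""
    cross = beta_norm * (beta_norm + 2 * math.sqrt(alpha_norm))
    return 4 * (n * alpha_norm + n * (n - 1) * cross)

# When beta = 0 both expressions reduce to 4 N ||alpha||, i.e. linear scaling in N.
print(parallel_bound(3, 1.0, 0.0), sequential_bound(3, 1.0, 0.0))
# For beta != 0 the sequential expression is the larger of the two.
print(parallel_bound(3, 1.0, 0.5), sequential_bound(3, 1.0, 0.5))
```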

H.2 Tightness of QFI bounds in nonasymptotic channel estimation

We compare our nonasymptotic results with existing asymptotically tight bounds for two types of quantum channels. Apart from the amplitude damping noise described by $K_{1}^{(\mathrm{AD})}=\outerproduct{0}{0}+\sqrt{1-p}\outerproduct{1}{1}$ and $K_{2}^{(\mathrm{AD})}=\sqrt{p}\outerproduct{0}{1}$ considered in the main text, here we also present a second example, in which the noise is described by a SWAP-type interaction $V_{\mathrm{int}}=e^{-\mathrm{i}g\tau H_{\mathrm{SWAP}}}$ between a qubit system $S$ and a qubit environment $E$. Here $g$ is the interaction strength, $\tau$ is the interaction time and the Hamiltonian is given by $H_{\mathrm{SWAP}}(\ket{i}_{S}\ket{j}_{E})=\ket{j}_{S}\ket{i}_{E}$. The initial environment state is $\ket{0}$, and the Kraus operators can be written as $K_{1}^{(\mathrm{SWAP})}=\bra{0}_{E}V_{\mathrm{int}}\ket{0}_{E}=e^{-\mathrm{i}g\tau}\outerproduct{0}{0}+\cos(g\tau)\outerproduct{1}{1}$ and $K_{2}^{(\mathrm{SWAP})}=\bra{1}_{E}V_{\mathrm{int}}\ket{0}_{E}=-\mathrm{i}\sin(g\tau)\outerproduct{0}{1}$.
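The Kraus operators of the SWAP-type interaction can be rederived numerically from $V_{\mathrm{int}}$. The sketch below uses the identity $e^{-\mathrm{i}g\tau H_{\mathrm{SWAP}}}=\cos(g\tau)\mathds{1}-\mathrm{i}\sin(g\tau)H_{\mathrm{SWAP}}$, which follows from $H_{\mathrm{SWAP}}^{2}=\mathds{1}$; the function name and the value of $g\tau$ are arbitrary choices for illustration.

```python
import cmath
import math

def swap_kraus(g_tau: float):
    """Return [K1, K2] with K_e = <e|_E V_int |0>_E as 2x2 nested lists."""
    c, s = math.cos(g_tau), math.sin(g_tau)

    def v(s_out, e_out, s_in, e_in):
        # Matrix element <s_out, e_out| V_int |s_in, e_in> of cos*1 - i sin*SWAP.
        identity = (s_out == s_in) and (e_out == e_in)
        swapped = (s_out == e_in) and (e_out == s_in)
        return c * identity - 1j * s * swapped

    # The environment starts in |0> and is projected onto |e>, e = 0, 1.
    return [[[v(so, e, si, 0) for si in (0, 1)] for so in (0, 1)] for e in (0, 1)]

g_tau = 0.7
k1, k2 = swap_kraus(g_tau)
expected_k1 = [[cmath.exp(-1j * g_tau), 0], [0, math.cos(g_tau)]]
expected_k2 = [[0, -1j * math.sin(g_tau)], [0, 0]]
print(k1 == expected_k1 or "compare entrywise within floating-point error")
```

Entrywise comparison with the closed forms above, together with the completeness relation $\sum_{i}K_{i}^{\dagger}K_{i}=\mathds{1}$, provides a simple consistency check.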

We plot the QFI for the two examples in Fig. 5. Both examples show the advantage of sequential strategies over parallel ones, as well as gaps between the exact QFI for sequential strategies and the parallel upper bound given by Eq. (106).

H.3 Elusive advantage of sequential strategies in the asymptotic limit

We observe a gap between parallel and sequential strategies for amplitude damping channels and SWAP-type interactions for $N=2$ and $3$ . Now we show that for both examples there is no advantage of sequential strategies asymptotically since there exists an $h$ such that $\beta=0$ .

For the amplitude damping channel, $K_{\phi,i}^{(\mathrm{AD})}=K_{i}^{(\mathrm{AD})}U_{z}(\phi),\ i=1,2$ . Direct calculation leads to

\beta^{(\mathrm{AD})}=\left(\frac{t}{2}+h^{(\mathrm{AD})}_{11}\right)\outerproduct{0}{0}+\left[h^{(\mathrm{AD})}_{11}-\frac{t}{2}+\left(h^{(\mathrm{AD})}_{22}-h^{(\mathrm{AD})}_{11}\right)p\right]\outerproduct{1}{1}.

(109)

To obtain $\beta^{(\mathrm{AD})}=0$ we just need to take $h^{(\mathrm{AD})}_{11}=-t/2$ and $h^{(\mathrm{AD})}_{22}=(2-p)t/(2p)$.
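The choice of $h^{(\mathrm{AD})}$ above can be verified numerically by building $\alpha$ and $\beta$ directly from their definitions in Appendix H.1. The sketch below rests on two assumptions that are not restated in this appendix: the signal unitary is taken to be $U_{z}(\phi)=e^{-\mathrm{i}\phi t\sigma_{z}/2}$ (which reproduces the $t/2$ terms of Eq. (109)), and $h$ is taken to be diagonal. Helper names are illustrative.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def ad_alpha_beta(p, t, phi, h11, h22):
    """Return (alpha, beta) for amplitude damping with diagonal h (h12 = 0 assumed)."""
    # Assumed signal unitary U_z(phi) = exp(-i phi t sigma_z / 2) and its phi-derivative.
    u = [[cmath.exp(-0.5j * phi * t), 0], [0, cmath.exp(0.5j * phi * t)]]
    du = [[-0.5j * t * u[0][0], 0], [0, 0.5j * t * u[1][1]]]
    kraus = [[[1, 0], [0, math.sqrt(1 - p)]], [[0, math.sqrt(p)], [0, 0]]]
    alpha = [[0j, 0j], [0j, 0j]]
    beta = [[0j, 0j], [0j, 0j]]
    for k, h_kk in zip(kraus, (h11, h22)):
        k_phi, dk = matmul(k, u), matmul(k, du)
        # Derivative of the equivalent Kraus operator: dK - i h_kk K.
        dk_tilde = [[dk[i][j] - 1j * h_kk * k_phi[i][j] for j in range(2)] for i in range(2)]
        a_term = matmul(dagger(dk_tilde), dk_tilde)
        b_term = matmul(dagger(k_phi), dk_tilde)
        for i in range(2):
            for j in range(2):
                alpha[i][j] += a_term[i][j]
                beta[i][j] += 1j * b_term[i][j]
    return alpha, beta

p, t, phi = 0.5, 1.0, 1.0
alpha, beta = ad_alpha_beta(p, t, phi, h11=-t / 2, h22=(2 - p) * t / (2 * p))
print(max(abs(x) for row in beta for x in row))  # numerically zero
print(alpha[1][1].real)                          # only nonzero entry of alpha
```

Under these assumptions $\alpha$ comes out diagonal with a single nonzero entry $t^{2}(1-p)/p$, so $\lVert\alpha\rVert=t^{2}(1-p)/p$ for this $h$.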

Similarly, for the SWAP-type interaction we have

\beta^{(\mathrm{SWAP})}=\left(\frac{t}{2}+h^{(\mathrm{SWAP})}_{11}\right)\outerproduct{0}{0}+\left[h^{(\mathrm{SWAP})}_{11}-\frac{t}{2}+\left(h^{(\mathrm{SWAP})}_{22}-h^{(\mathrm{SWAP})}_{11}\right)\sin^{2}(g\tau)\right]\outerproduct{1}{1}.

(110)

Thus taking $h^{(\mathrm{SWAP})}_{11}=-t/2$ and $h^{(\mathrm{SWAP})}_{22}=\left[2-\sin^{2}(g\tau)\right]t/\left[2\sin^{2}(g\tau)\right]$ yields $\beta^{(\mathrm{SWAP})}=0$.

Appendix I Implementation of optimal strategies with universal quantum gates

In this section we apply Algorithm 1 in the main text to numerically find an optimal strategy in the set of sequential strategies and in the set of causal superposition strategies, respectively. The CJ operator of an optimal sequential strategy corresponds to a sequence of isometries with a minimal ancilla-space implementation provided by [46], and can then be decomposed into single-qubit gates and CNOT gates [72]. By taking advantage of the freedom of choosing a parameter-independent unitary on the final output state, we can adjust the strategy to reduce the CNOT count without affecting the QFI. For an optimal causal superposition strategy, we follow the same routine for each sequential branch of the superposition.

I.1 Optimal sequential strategy

Let $\mathcal{H}_{0}=\mathbb{C}$ and $\mathcal{H}_{2N+1}=\mathcal{H}_{F}$ , and we have the CJ operator of a sequential strategy $P\in\mathcal{L}\left(\mathcal{H}_{F}\otimes_{i=0}^{2N}\mathcal{H}_{i}\right)$ (an $(N+1)$ -step quantum comb), as illustrated in Fig. 6. In this way of relabeling Hilbert spaces we have

P=P^{(N+1)}\geq 0,\ \Tr_{2k-1}P^{(k)}=\mathds{1}_{2k-2}\otimes P^{(k-1)}\ \mathrm{for}\ k=2,\dots,N+1,\ \Tr P^{(1)}=P^{(0)}=1.

(111)

According to Theorems 1 and 2 in [46], $P$ corresponds to a sequence of isometries $\{V^{(k)}\}$ for $k=1,\dots,N+1$ by Stinespring dilation:

\mathcal{P}(\rho)=\Tr_{A_{N+1}}\left[\left(V^{(N+1)}\otimes\mathds{1}_{1,3,\dots,2N-1}\right)\cdots\left(V^{(1)}\otimes\mathds{1}_{2,4,\dots,2N}\right)\rho\left(V^{(1)}\otimes\mathds{1}_{2,4,\dots,2N}\right)^{\dagger}\cdots\left(V^{(N+1)}\otimes\mathds{1}_{1,3,\dots,2N-1}\right)^{\dagger}\right]

(112)

for any input state $\rho\in\mathcal{L}\left(\otimes_{i=0}^{N}\mathcal{H}_{2i}\right)$ , where the whole process corresponding to $P$ is described by an isometry $\mathcal{P}\in\mathcal{L}\left(\otimes_{i=0}^{N}\mathcal{H}_{2i},\otimes_{i=0}^{N}\mathcal{H}_{2i+1}\right)$ , and in each step a choice of isometry $V^{(k)}\in\mathcal{L}\left(\mathcal{H}_{2k-2}\otimes\mathcal{H}_{A_{k-1}},\mathcal{H}_{2k-1}\otimes\mathcal{H}_{A_{k}}\right)$ with minimal ancilla space is given by

V^{(k)}=\mathds{1}_{2k-1}\otimes\left(P^{(k)*}\right)^{\frac{1}{2}}\left[\lvert I\rrangle_{2k-1,(2k-1)^{\prime}}\mathds{1}_{2k-2\rightarrow(2k-2)^{\prime}}\otimes\left(P^{(k-1)*}\right)^{-\frac{1}{2}}\right],

(113)

where $\mathcal{H}_{A_{k}}=\mathsf{Supp}(P^{(k)*})$ is an ancillary space given by the support of $P^{(k)*}$ with $\mathcal{H}_{A_{0}}=\mathbb{C}$ , $\mathcal{H}_{i^{\prime}}$ is a copy of the Hilbert space $\mathcal{H}_{i}$ , $\mathds{1}_{2k-2\rightarrow(2k-2)^{\prime}}:=\sum_{i}\ket{i}_{(2k-2)^{\prime}}\bra{i}_{2k-2}$ is an identity map from $\mathcal{H}_{2k-2}$ to $\mathcal{H}_{(2k-2)^{\prime}}$ , and $\left(P^{(k-1)*}\right)^{-\frac{1}{2}}$ denotes the Moore–Penrose pseudoinverse of $\left(P^{(k-1)*}\right)^{\frac{1}{2}}$ with its support on $\mathcal{H}_{A_{k-1}}$ .

As the last isometry $V^{(N+1)}$ preserves the QFI, it is therefore only necessary to consider the implementation of $P^{(N)}$ instead of the full strategy $P^{(N+1)}$ . From this explicit construction it follows that the minimal dimension of the ancilla space for implementing the sequential strategy $P$ is $\mathsf{dim}(\mathcal{H}_{A_{N}})=\mathsf{rank}(P^{(N)})$ . In the case of $N=2$ for the amplitude damping noise considered in the main text, it is easy to see that $\mathsf{dim}(\mathcal{H}_{A_{1}})\leq 2$ and $\mathsf{dim}(\mathcal{H}_{A_{2}})\leq 8$ , so $V^{(1)}$ is an isometry from $0$ to (at most) $2$ qubits and $V^{(2)}$ is an isometry from $2$ to (at most) $4$ qubits, as illustrated in Fig. 7.

\Qcircuit @C=1em @R=.7em {
\lstick{\ket{0}} & \multigate{1}{V^{(1)}} & \gate{\mathcal{E}_{\phi}} & \multigate{3}{V^{(2)}} & \gate{\mathcal{E}_{\phi}} & \qw \\
\lstick{\ket{0}} & \ghost{V^{(1)}} & \qw & \ghost{V^{(2)}} & \qw & \qw \\
\lstick{\ket{0}} & \qw & \qw & \ghost{V^{(2)}} & \qw & \qw \\
\lstick{\ket{0}} & \qw & \qw & \ghost{V^{(2)}} & \qw & \qw
}

Figure 7: A sequence of isometries corresponding to a sequential strategy for $N=2$. The first qubit is the system qubit going through the channel $\mathcal{E}_{\phi}$ twice, while the three other qubits are ancillary.

Next, we apply a circuit decomposition of each isometry into single-qubit gates and CNOT gates. In practice it is often desirable to achieve a CNOT count as low as possible. Note that $V^{(1)}$ actually corresponds to the preparation of a $2$-qubit state, which in general requires only one CNOT gate [73]. As for $V^{(2)}$, an isometry from $2$ to $4$ qubits, the state-of-the-art decomposition scheme is the column-by-column approach, which requires at most 54 CNOT gates [72]. However, since an arbitrary parameter-independent unitary on the $3$ ancillae can always be deferred and regrouped into $V^{(3)}$ and therefore does not affect the QFI, we can choose a proper $V^{(2)}$ to further reduce the worst-case CNOT count to $47$ without changing the QFI. To see this we need to briefly introduce the main ideas behind the column-by-column decomposition scheme.

An isometry $V$ from $m$ to $n$ qubits ($m\leq n$) can be represented in matrix form by $V=U^{\dagger}\mathds{1}(2^{n}\times 2^{m})$, where $U^{\dagger}$ is a $2^{n}\times 2^{n}$ unitary matrix and $\mathds{1}(2^{n}\times 2^{m})$ consists of the first $2^{m}$ columns of the $2^{n}\times 2^{n}$ identity matrix. If we obtain a decomposition of $U^{\dagger}$, then we can simply initialize the state of the first $n-m$ qubits to $\ket{0}$ to implement $V$. Equivalently we can find a decomposition of $U$ such that $UV=\mathds{1}(2^{n}\times 2^{m})$ and then invert the circuit representing $U$. The idea is to find a sequence of unitary operations such that $U=U_{2^{m}-1}\cdots U_{0}$ transforms $V$ into $\mathds{1}(2^{n}\times 2^{m})$ column by column. More specifically, we first choose a proper $U_{0}$ to map the first column of $V$ to the first column of $\mathds{1}(2^{n}\times 2^{m})$, i.e., $U_{0}V\ket{0}_{m}=\ket{0}_{n}$, then choose $U_{1}$ satisfying $U_{1}U_{0}V\ket{1}_{m}=\ket{1}_{n}$ as well as $U_{1}U_{0}V\ket{0}_{m}=\ket{0}_{n}$, and so on until we determine $U_{2^{m}-1}$.

Here we only focus on $U_{0}$, the inverse of which can be seen as a process preparing a state $V\ket{0}_{m}$ from $\ket{0}_{n}$. For the decomposition of $V^{(2)}$ from $m=2$ to $n=4$ qubits, preparing a $4$-qubit state in general requires $8$ CNOT gates [74]. Fortunately, without changing the QFI, we have the freedom to choose a unitary $U_{\mathrm{anc}}$ on the ancillae after applying $V^{(2)}$ such that the state $V^{(2)\prime}\ket{0}_{2}=U_{\mathrm{anc}}V^{(2)}\ket{0}_{2}$ can be prepared using only one CNOT gate. This can be seen by dividing the $4$ qubits into two parties, namely the single system qubit (in the space $\mathcal{H}_{S}$) and the three ancillary qubits (in the space $\mathcal{H}_{A}$), and taking the Schmidt decomposition of the $4$-qubit state $V^{(2)}\ket{0}_{2}$:

\ket{\psi}_{SA}:=V^{(2)}\ket{0}_{2}=\sum_{i=0}^{1}\lambda_{i}\ket{e_{i}}_{S}\ket{f_{i}}_{A},

(114)

where $\{\ket{e_{i}/f_{i}}_{S/A}\}$ forms an orthonormal basis of $\mathcal{H}_{S/A}$, and $\{\lambda_{i}\}$ is a set of nonnegative real numbers satisfying $\sum_{i}\lambda_{i}^{2}=1$. Therefore, to prepare $V^{(2)}\ket{0}_{2}$, we only need a local unitary on $\mathcal{H}_{S}$ to generate $\sum_{i=0}^{1}\lambda_{i}\ket{i}_{S}\ket{0}_{A}$, then apply one CNOT gate taking the system qubit as the control to obtain $\sum_{i=0}^{1}\lambda_{i}\ket{i}_{S}\ket{i}_{A}$, and finally apply the local unitary operations $U_{S}=\sum_{i}\ket{e_{i}}_{S}\bra{i}_{S}$ on the system and $U_{A}=\sum_{i}\ket{f_{i}}_{A}\bra{i}_{A}$ on the ancillae, respectively. If we take $U_{\mathrm{anc}}=U_{A}^{\dagger}$, then $V^{(2)\prime}\ket{0}_{2}=U_{\mathrm{anc}}V^{(2)}\ket{0}_{2}=\sum_{i=0}^{1}\lambda_{i}\ket{e_{i}}_{S}\ket{i}_{A}$ can be prepared using one CNOT gate. This choice of $V^{(2)\prime}$ saves $7$ CNOT gates compared to the general state preparation scheme, and leads to a worst-case CNOT count of $47$ in total.
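The key point of the argument above is that, across the cut between one system qubit and three ancillae, any $4$-qubit pure state has at most two Schmidt terms. A minimal sketch of how the Schmidt coefficients $\lambda_{i}$ of Eq. (114) could be computed is given below. It assumes the state is stored as a list of $16$ amplitudes with the system qubit as the most significant index, and it obtains $\lambda_{i}^{2}$ as the eigenvalues of the $2\times 2$ reduced state of the system qubit; the function name and the example states are illustrative.

```python
import math

def schmidt_coefficients_1_vs_rest(psi):
    """Schmidt coefficients of a pure state across the cut (first qubit | rest)."""
    half = len(psi) // 2
    top, bottom = psi[:half], psi[half:]  # rows of the 2 x (dim/2) amplitude matrix
    # Reduced state of the system qubit, rho_S = M M^dagger (2x2 Hermitian).
    r00 = sum(abs(x) ** 2 for x in top)
    r11 = sum(abs(x) ** 2 for x in bottom)
    r01 = sum(x * complex(y).conjugate() for x, y in zip(top, bottom))
    trace, det = r00 + r11, r00 * r11 - abs(r01) ** 2
    disc = math.sqrt(max(trace**2 - 4 * det, 0.0))
    eigenvalues = [(trace + disc) / 2, max((trace - disc) / 2, 0.0)]
    return [math.sqrt(e) for e in eigenvalues]

# Product state |0>_S |000>_A: a single Schmidt term.
product = [1.0] + [0.0] * 15
# (|0>_S |000>_A + |1>_S |111>_A) / sqrt(2): two equal Schmidt terms.
entangled = [0.0] * 16
entangled[0] = entangled[15] = 1 / math.sqrt(2)

print(schmidt_coefficients_1_vs_rest(product))
print(schmidt_coefficients_1_vs_rest(entangled))
```

Because the smaller party is a single qubit, the list always has length two, which is why one CNOT gate is enough to create the entanglement between the system and the ancillae.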

Now we present numerical results of the circuit implementation of an optimal sequential strategy. The decomposition of isometries is implemented using the Mathematica package UniversalQCompiler [75] based on the method described above. As in the main text, we consider the amplitude damping noise and take $N=2$, $\phi=1.0$, $p=0.5$ and $t=1.0$. The circuits implementing $V^{(1)}$ and $V^{(2)\prime}$ are illustrated in Fig. 8. The state preparation $V^{(1)}$ requires $1$ CNOT gate and the intermediate control operation $V^{(2)\prime}$ requires $33$ CNOT gates.

\Qcircuit @C=1em @R=.7em {
\lstick{\ket{0}} & \gate{R_y} & \gate{R_z} & \ctrl{1} & \gate{R_y} & \qw \\
\lstick{\ket{0}} & \gate{R_y} & \qw & \targ & \qw & \qw
}

(a) Decomposition of $V^{(1)}$. For simplicity the angles of single-qubit rotation gates are not depicted.

\Qcircuit @C=.6em @R=.7em {
& \ctrl{1} & \ctrl{1} & \ctrl{2} & \qw & \ctrl{2} & \ctrl{3} & \qw & \ctrl{3} & \qw & \qw & \qw & \qw & \qw & \qw & \targ & \targ & \targ & \targ & \targ & \targ & \targ & \ctrl{2} & \qw & \ctrl{3} & \qw & \qw & \qw & \targ & \targ & \targ & \targ & \targ & \ctrl{1} & \qw \\
& \targ & \targ & \qw & \ctrl{1} & \qw & \qw & \ctrl{2} & \qw & \ctrl{1} & \qw & \ctrl{2} & \targ & \targ & \targ & \ctrl{-1} & \qw & \ctrl{-1} & \qw & \ctrl{-1} & \qw & \ctrl{-1} & \qw & \qw & \qw & \targ & \targ & \targ & \qw & \qw & \ctrl{-1} & \qw & \ctrl{-1} & \targ & \qw \\
\lstick{\ket{0}} & \qw & \qw & \targ & \targ & \targ & \qw & \qw & \qw & \targ & \ctrl{1} & \qw & \qw & \ctrl{-1} & \qw & \qw & \qw & \qw & \ctrl{-2} & \qw & \qw & \qw & \targ & \ctrl{1} & \qw & \qw & \ctrl{-1} & \qw & \qw & \ctrl{-2} & \qw & \qw & \qw & \qw & \qw \\
\lstick{\ket{0}} & \qw & \qw & \qw & \qw & \qw & \targ & \targ & \targ & \qw & \targ & \targ & \ctrl{-2} & \qw & \ctrl{-2} & \qw & \ctrl{-3} & \qw & \qw & \qw & \ctrl{-3} & \qw & \qw & \targ & \targ & \ctrl{-2} & \qw & \ctrl{-2} & \ctrl{-3} & \qw & \qw & \ctrl{-3} & \qw & \qw & \qw
}

(b) Decomposition of $V^{(2)\prime}=U_{\mathrm{anc}}V^{(2)}$. For simplicity single-qubit gates, which might be required in addition to CNOT gates, are not depicted.

Figure 8: Decomposition of isometries corresponding to an optimal sequential strategy for $N=2$. We apply $V^{(2)\prime}$ instead of $V^{(2)}$ to achieve the maximal QFI with fewer CNOT gates.

I.2 Optimal causal superposition strategy

A causal superposition strategy for estimating $N$ channels can be implemented by an $N!$-dimensional quantum control system entangled with $N!$ sequential strategies of applying the channels:

P=\outerproduct{P}{P}\ \mathrm{for}\ \ket{P}=\sum_{\pi\in S_{N}}\ket{P^{\pi}}\ket{\pi}_{C},

(115)

where $\{\ket{\pi}_{C}\}$ forms an orthonormal basis of the Hilbert space $\mathcal{H}_{C}$ of the control system, and each $P^{\pi}=\outerproduct{P^{\pi}}{P^{\pi}}$ is a sequential strategy. Once we obtain an optimal causal superposition strategy by applying Algorithm 1, we can apply the circuit decomposition for each sequential strategy in the superposition, following the method described in Appendix I.1. Taking account of the permutation symmetry, we can choose an optimal causal superposition strategy such that apart from the execution order of two channels, sequential strategies in the decomposition of the strategy are the same, containing the same state preparation and intermediate control.

As a concrete example, we again take $N=2$ , $\phi=1.0$ , $p=0.5$ and $t=1.0$ for the amplitude damping noise and present numerical results of the circuit implementation of an optimal causal superposition strategy. As illustrated in Fig. 9, we use the qubit $\ket{\psi}_{C}$ to coherently control which sequential order is executed. Due to the permutation invariance of the optimal strategy, we can simply control the query order of the identical channels while fixing $V^{(1)}$ and $V^{(2)}$ for all sequential orders. In view of this, generally we can use a $(2N-1)$ -quantum SWITCH to control the order of $N$ channels $\mathcal{E}_{\phi}$ and $N-1$ intermediate control operations $V^{(i)}$ .
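To make the branch structure concrete, the sketch below enumerates, for each permutation $\pi\in S_{N}$, the order in which the fixed control operations $V^{(k)}$ and the channel copies $\mathcal{E}_{\phi}^{(\pi(k))}$ are applied. It is purely a bookkeeping illustration with hypothetical names: the string labels stand in for the actual operations, and no quantum simulation is performed.

```python
from itertools import permutations
from math import ceil, factorial, log2

def branch_sequences(n: int):
    """Map each permutation of the N channel copies to its list of operations."""
    branches = {}
    for pi in permutations(range(1, n + 1)):
        ops = []
        for step, channel in enumerate(pi, start=1):
            ops.append(f"V({step})")     # same control operation in every branch
            ops.append(f"E({channel})")  # channel copy queried at this step
        branches[pi] = ops
    return branches

def control_qubits(n: int) -> int:
    """Qubits needed to encode an N!-dimensional control system."""
    return ceil(log2(factorial(n)))

for pi, ops in branch_sequences(2).items():
    print(pi, ops)
print(control_qubits(2))
```

For $N=2$ the two branches reproduce the two circuits of Fig. 9, and a single control qubit suffices.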

If $\ket{\psi}_{C}=\ket{0}_{C}$:

\Qcircuit @C=1em @R=.7em {
\lstick{\ket{0}} & \multigate{1}{V^{(1)}} & \gate{\mathcal{E}_{\phi}^{(1)}} & \multigate{3}{V^{(2)}} & \gate{\mathcal{E}_{\phi}^{(2)}} & \qw \\
\lstick{\ket{0}} & \ghost{V^{(1)}} & \qw & \ghost{V^{(2)}} & \qw & \qw \\
\lstick{\ket{0}} & \qw & \qw & \ghost{V^{(2)}} & \qw & \qw \\
\lstick{\ket{0}} & \qw & \qw & \ghost{V^{(2)}} & \qw & \qw
}

If $\ket{\psi}_{C}=\ket{1}_{C}$:

\Qcircuit @C=1em @R=.7em {
\lstick{\ket{0}} & \multigate{1}{V^{(1)}} & \gate{\mathcal{E}_{\phi}^{(2)}} & \multigate{3}{V^{(2)}} & \gate{\mathcal{E}_{\phi}^{(1)}} & \qw \\
\lstick{\ket{0}} & \ghost{V^{(1)}} & \qw & \ghost{V^{(2)}} & \qw & \qw \\
\lstick{\ket{0}} & \qw & \qw & \ghost{V^{(2)}} & \qw & \qw \\
\lstick{\ket{0}} & \qw & \qw & \ghost{V^{(2)}} & \qw & \qw
}

Figure 9: Sequences of isometries corresponding to each sequential order in the causal superposition for $N=2$. The first qubit of the circuit is the system qubit, and the query order of two identical channels $\mathcal{E}_{\phi}^{(1)}$ and $\mathcal{E}_{\phi}^{(2)}$ is entangled with the state of the control qubit $\ket{\psi}_{C}$. When $\ket{\psi}_{C}$ is a superposition of the two states shown in the figure, the causal order is also in a superposition given by Eq. (115).

Further decomposition shows that each sequential branch requires one CNOT gate for state preparation $V^{(1)}$ and 36 CNOT gates for intermediate control $V^{(2)}$ , as illustrated in Fig. 10.

\Qcircuit @C=1em @R=.7em {
\lstick{\ket{0}} & \gate{R_y} & \gate{R_z} & \ctrl{1} & \gate{R_y} & \qw \\
\lstick{\ket{0}} & \gate{R_y} & \qw & \targ & \qw & \qw
}

(a) Decomposition of $V^{(1)}$. For simplicity the angles of single-qubit rotation gates are not depicted.

\Qcircuit @C=.6em @R=.7em {
& \ctrl{1} & \ctrl{1} & \ctrl{2} & \qw & \ctrl{2} & \qw & \ctrl{3} & \qw & \ctrl{3} & \qw & \qw & \qw & \qw & \qw & \qw & \qw & \qw & \qw & \targ & \targ & \targ & \targ & \targ & \targ & \targ & \ctrl{2} & \qw & \ctrl{3} & \qw & \qw & \qw & \ctrl{1} & \targ & \targ & \targ & \ctrl{1} & \qw \\
& \targ & \targ & \qw & \ctrl{1} & \qw & \qw & \qw & \ctrl{2} & \qw & \targ & \targ & \targ & \ctrl{1} & \qw & \ctrl{2} & \targ & \targ & \targ & \ctrl{-1} & \qw & \ctrl{-1} & \qw & \ctrl{-1} & \qw & \ctrl{-1} & \qw & \qw & \qw & \targ & \targ & \targ & \targ & \ctrl{-1} & \qw & \ctrl{-1} & \targ & \qw \\
\lstick{\ket{0}} & \qw & \qw & \targ & \targ & \targ & \ctrl{1} & \qw & \qw & \qw & \qw & \ctrl{-1} & \qw & \targ & \ctrl{1} & \qw & \qw & \ctrl{-1} & \qw & \qw & \qw & \qw & \ctrl{-2} & \qw & \qw & \qw & \targ & \ctrl{1} & \qw & \qw & \ctrl{-1} & \qw & \qw & \qw & \ctrl{-2} & \qw & \qw & \qw \\
\lstick{\ket{0}} & \qw & \qw & \qw & \qw & \qw & \targ & \targ & \targ & \targ & \ctrl{-2} & \qw & \ctrl{-2} & \qw & \targ & \targ & \ctrl{-2} & \qw & \ctrl{-2} & \qw & \ctrl{-3} & \qw & \qw & \qw & \ctrl{-3} & \qw & \qw & \targ & \targ & \ctrl{-2} & \qw & \ctrl{-2} & \qw & \qw & \qw & \qw & \qw & \qw
}

(b) Decomposition of $V^{(2)}$. For simplicity single-qubit gates, which might be required in addition to CNOT gates, are not depicted.

Figure 10: Decomposition of isometries corresponding to one causal order in the decomposition of an optimal causal superposition strategy for $N=2$. We have already taken advantage of the freedom to choose a $V^{(2)}$ implemented by fewer CNOT gates, as explained in Appendix I.1.
