Engineering the Cost Function of a Variational Quantum Algorithm for Implementation on Near-Term Devices
Abstract
Variational hybrid quantum-classical algorithms are some of the most promising workloads for near-term quantum computers without error correction. The aim of these variational algorithms is to guide the quantum system to a target state that minimizes a cost function, by varying certain parameters in a quantum circuit. This paper proposes a new approach for engineering cost functions to improve the performance of a certain class of these variational algorithms on today’s small qubit systems. We apply this approach to a variational algorithm that generates thermofield double states of the transverse field Ising model, which are relevant when studying phase transitions in condensed matter systems. We discuss the benefits and drawbacks of various cost functions, apply our new engineering approach, and show that it yields good agreement across the full temperature range.
Index Terms:
NISQ devices, thermofield double, variational algorithm, cost functionI Introduction
Noisy intermediate scale quantum (NISQ) computers are near-term systems that have up to a few hundred non-error-corrected qubits that suffer from decoherence and noise, limiting the overall quantum circuit depth [15]. Some of the main limitations of NISQ systems include limited qubit number, limited qubit connectivity, and hardware-specific quantum gate alphabets [7]. Algorithms must be optimized carefully to smaller quantum circuit depths in order to run on NISQ systems [7, 21]. Hybrid classical-quantum variational algorithms are promising implementations for NISQ systems and include the quantum approximate optimization algorithm (QAOA) [4], and Variational Quantum Eigensolver (VQE) [12, 10]. These variational algorithms use classical compute resources to enable the execution of more complicated algorithms on the small quantum compute resources available today.
Implementation of a variational algorithm typically involves the minimization of a cost function in order to obtain the optimal variational parameters to generate the desired quantum state. The purpose of the variational algorithm described in this work is to construct a quantum state called a thermofield double (TFD) state in the transverse field Ising model (TFIM), which is important in understanding thermal phase transitions in condensed matter systems. TFD states are entangled pure states between two systems which yield a thermal state when one of the systems is traced out [11, 9]. We develop an engineering approach to construct an optimal cost function for implementation on a small qubit system, and show that cost functions engineered using our approach can result in more accurate outcomes on real hardware systems. This higher accuracy will yield better experimental results in state-of-the-art NISQ systems, as evidenced by the implementation of a simpler variant of a similarly engineered cost function to generate TFD states in an actual superconducting quantum processor [17, 16].
II Thermofield Double States Generation
II-A System Definition
The Hamiltonian of the TFIM for a one-dimensional ring of qubits is given by
(1) |
where the transverse field direction was chosen for convenience of implementation in superconducting systems (compare to [20] with ). We use natural units (i.e. ) throughout the manuscript. Consider the special case of generating Thermofield Double (TFD) states in a four-qubit system which was recently demonstrated experimentally, using superconducting qubits by [17]. In this case, the intra-system Hamiltonian reduces to
(2) |
which describes interactions within each of the two subsystems ( and ) with being the transverse field strength. The ultimate objective is to have the full system undergo unitary evolution such that it will yield a thermal state (or Gibbs state) on subsystem , if it is considered in isolation. In practice, this can be studied by performing a partial trace over subsystem [11]. Conversely, this technique can be viewed as a purification of the Gibbs state, resulting in a TFD in the full system [19]. The TFD state at an inverse temperature is thus defined as
(3) |
where is a normalization factor, and
(4) |
is the TFD state at or , which should be a pairwise maximally entangled Bell state (since tracing subsystem out of this full state yields a maximally-mixed state for subsystem ). For simulation purposes we thus set the initial state to be
(5) |
where the qubits are labeled as and qubits are pairwise maximally entangled through applied unitaries.
II-B System Evolution
The protocol for generation of TFD states is described in [20], and we follow a similar path to invoke the variational ansatz motivated by the quantum alternating operator ansatz [5]. This involves alternating between the subsystem Hamiltonian and the entangling Hamiltonian . Here the subsystem Hamiltonian is defined as
(6) |
and the entangling Hamiltonian is defined as
(7) |
Near-term quantum computing systems have stringent coherent operations limits and it is desirable to minimize the number of steps required for a given workload [21]. Hence, here we focus on the case of increasing efficiency of single-step TFD generation under the given Hamiltonians (eqs. 2 and 7). With our restricted model, the quantum circuit for TFD generation is described by eq. 8 with as variational parameters.
(8) |
III Comparison of Cost Functions for TFD States’ Generation
To obtain the optimal variational parameters experimentally, a quantum computer is used for system evolution while a classical computer is used for cost function evaluation based on the measurements returned by the quantum computer. The ultimate success of the protocol depends on the effectiveness of the cost function evaluation as well as the capability of the quantum circuit to accurately generate the desired TFD state.
III-A Error in fidelity between generated and target states
For numerical convenience it is typical to utilize the error in fidelity with respect to the target as the cost function during optimization [19],
(9) |
Evaluation of the latter expression requires access to the actual wavefunction of the generated state. Experimental determination of the wavefunction is typically limited to either estimation based on extensive tomographic reconstruction [1], or an array of sophisticated weak measurements [8]. Thus it is undesirable in its present form for evaluation in an actual hybrid quantum-classical experimental system. In principle, it is possible to modify eq. 9 to use the density matrix of the generated state, but this also is cumbersome and reasons are discussed further in the next section.
III-B Free energy of the system
An alternative to eq. 9 is using the free energy of the system as the cost function during optimization [19]. In this case, the free energy is calculated on the reduced density matrix for subsystem as
(10) |
where , , are the energy, von Neumann entropy, and the reduced density operator for subsystem , respectively, and is the system temperature. In this case quantum state tomography (QST) [1] of subsystem is necessary to calculate . Recently, methods to approximate the von Neumann entropy have also been proposed [2].
Typically, QST of a system requires a complete set of measurements related to the number of unknowns of the system size [6]. Given a system of qubits, the number of unique density matrix elements is given by . Thus, this is the total number of unique measurements required to fully characterize the qubit system (e.g. for a two-qubit subsystem, this gives 15 measurements). As system size increases, QST becomes prohibitively expensive and renders free energy a poor choice for large-scale optimization problems.
III-C Hypothesized cost function based on correlator expectation values
To formulate a more experiment-friendly expression, it is possible to hypothesize a cost function based on the form of as described in section III-B. This would involve substituting alternative expressions for and that require fewer measurements than QST. It is straightforward to infer that will be dominated by at higher temperatures, and by at lower temperatures. Given the simplicity of the TFIM Hamiltonian, the expression for the low temperature regime is easily obtained as follows,
(11) |
where the notation indicates the expectation value of the relevant correlator with respect to the evolved wavefunction in eq. 8. The reduction in the number of terms is not strictly due to the use of correlators, but it is rather the underlying symmetry of the problem that allows us to determine with only three system measurements for subsystem . However, this method of constructing the cost function allows an explicit represention using tangible measurements for the hybrid quantum-classical optimization algorithm.
There is no straightforward method to derive an expression for based on correlator expectation values as before. However, it is trivial to verify that is the ground state of the negative of the inter-system Hamiltonian, . Given that we expect to dominate at and is the infinite temperature TFD state, it is natural to hypothesize an approximate form of to be generalized to when considering the full system (i.e. both and ) [14]. This results in the following expression for a cost function , which is more amenable for practical implementation in a quantum-classical optimization algorithm.
(12) |
where the system temperature is a parameter, and are optimization variables. Here, we have generalized the cost function to the full system and included in the energy of the system at low temperatures.
III-D Free Energy vs. Hypothesized Cost Function
When the strength of the transverse field is set to , a critical point between antiferromagnetic and paramagnetic quantum phases is expected [3, 13]. Hence, for cost function performance comparison purposes, we will primarily explore the case of . The case of will be considered for completeness in appendix C. We simulate TFD state generation using Differential Evolution, which is a global optimization algorithm [18] supported by Wolfram Mathematica for non-linear optimization. The optimization is performed over a wide inverse temperature range of six orders of magnitude to ensure complete coverage.
During optimization we minimize the cost functions corresponding to and separately, and see that there is excellent agreement between them for extreme low and high temperatures (see fig. 1). We have chosen trace distance and fidelity as defined below [11], as proximity measures comparing the ideal TFD state and the optimally generated state with the different cost functions.
(13) | ||||
(14) |
Here, is the density matrix corresponding to the ideal TFD state, represents the density matrix for the circuit-generated state utilizing the relevant cost function, and are corresponding subsystem states after tracing out , respectively.

For perfect TFD state preparation, we would expect and . However, in the range , the performance of as a cost function is poor, and demonstrates difficulty in reaching the target TFD state. This is somewhat expected, given that we hypothesized the form of the simple cost function based on extreme high/low temperature behavior. Also note that for , and indicating that even while using free energy as the cost function we do not construct the ideal TFD state. This is also expected since is the most difficult regime to generate the TFD state based on the given protocol [19]. This indicates that the depth of the circuit is most likely insufficient to construct a better state approaching the TFD state. However depending on the measure used, we find that better approximations to the TFD states are possible using different engineered cost functions.
IV Engineering Improved Cost Functions
IV-A Enhancing the hypothesized cost function
Given the shortcomings of at intermediate values, we began our new approach to construct a better cost function by generalizing eq. 12 for by adding coefficients to the expression containing correlators as follows,
(15) |
where
(16) |
Here and are parameters which we optimize to find a cost function that can yield better approximations to TFD states across the full inverse temperature range. For simplicity and clarity, instead of performing nested optimizations over and , we execute TFD generation for a range of and values (see appendix A for details). By varying , and , and evaluating the minimization quantity of interest , we see that the best agreement between and is obtained for and (see fig. 2). The improvements from using are discussed in section IV-C.

IV-B Analytically obtaining a better cost function
To further the work towards engineering a better cost function, we used an alternative approach based on density matrix elements’ closeness to generate a cost function that shows significant improvement over both the hypothesized cost function , and its enhanced version . In this case, we studied the characteristics of the density matrices of the target TFD state, , and the single-step circuit-generated state, . Note that we are not tracing out subsystem of these matrices to obtain the thermal state. Instead we directly compare the density matrices of the target and generated states. Also we relax the assumption of , and keep as a parameter throughout the analysis.
First, we observe the redundancies present in the ideal TFD state and obtain 15 unique real elements for :
(17) |
where the density matrix elements are labeled consistent with the notation in eq. 5 (see appendix B for details). Similarly, we observe the redundancies and Hermiticity in to obtain 10 unique complex elements for off-diagonal elements, and 5 unique real elements for the diagonal elements:
(18) |
Inspired by trial optimizations, we find that is explicitly symmetrized by choosing particular values for the inter-system variational parameters ,
(19) |
resulting in 14 unique real elements in . With this choice for , it is found that , indicating that this constrained will be limited in fully matching .
Following symmetrization, a cost function is constructed explicitly by calculating the sum of the square of differences between the density matrix elements for and :
(20) |
where , , and is a weight used to prune elements. We find that choosing
(23) |
generates a simple cost function which yields good results across the full temperature range and for different values. For this system of four qubits, we find that it is beneficial to substitute variational angles in the engineered cost function with the operator expectation values as was the case in and . This will enable the experiments to be performed using identical quantum processor measurements, while modifying the classical processor evaluation to improve efficiency. Thus, the non-zero elements for the engineered cost function are explicitly given by eq. 61. Note that given the symmetric nature of the evolution of the subsystems, it is sufficient to include only the intra-system expectation values for one subsystem.
IV-C Improvements from Engineered Cost Functions
We now evaluate the performance of the various cost functions when generating the TFD states for a wide temperature range. Fidelity and trace distances are calculated for the traced out subsystem as described in section III-D. In fig. 3, we observe that yields vastly superior results compared to the original cost function , especially at intermediate temperatures. We also find that performs significantly better than at high temperatures.

For , we note that the two measures ( and ) offer somewhat inconsistent results. When considering trace distance as the proximity measure, and seems to give better results compared to both and . In appendix C we study a few cases of and find that outperforms all other cost functions (including ) irrespective of the proximity measure used. We attribute this anomaly to the definition of the measures themselves, and further study of this aspect is beyond the scope of this work.
V Conclusion
In this article, we explored different cost functions that can be used during hybrid classical-quantum variational algorithm execution for TFD generation on real qubit systems. NISQ systems are constrained due to imperfect quantum operations with relatively low fidelities, low qubit lifetimes, and small qubit numbers. Hence, it is necessary to implement quantum algorithms in the most effective manner to utilize the available resources to their fullest extent. Our aim was to engineer a cost function which will generate the TFD states in a NISQ four-qubit system, in the most efficient manner. The constructed cost functions yielded a substantial improvement over the original cost function results (e.g. reduction in relative error for intermediate temperature TFD states).
The originally hypothesized cost function was found to be inadequate in approximating the TFD state at intermediate temperatures. The enhanced cost function obtained via modifications to the original cost function yielded better results at intermediate temperatures for the quantum critical case of . Subsequently, the cost function constructed based on the closeness of density matrix elements yielded good results for the full temperature range and for the transverse field range .
The method of improving the cost function to formulate as discussed in section IV-A is amenable to experimental exploration of cost function generation, and could result in novel methods for obtaining better cost functions for special state generation. This method can be further extended by adding other correlator expectation values in the cost function with coefficients to be found experimentally. As quantum state evolution typically incurs the highest resource cost during the experiment, this method of constructing cost functions could lead to more efficient scaling of variational algorithms to higher qubit numbers.
Although eq. 61 only included the intra-system expectation values for one subsystem, it is possible to include the other subsystem’s measurements when performing an experiment. Judicious choice of the measurements between the different subsystems should allow higher throughput of measurements for faster variational algorithm execution. Although the construction of is not a scalable technique for higher qubit numbers, excellent results were obtained for all temperature regimes. In construction only 20% of the density matrix elements evaluated for closeness, indicating that the encoding of the thermal state is primarily in a few populations and coherences of the TFD state. Studying how this encoding space will scale with qubit number should shed light on methods to improve practical TFD generation.
Appendix A Definition of Minimization Quantity of Interest for Improving the hypothesized cost function
The list of 15 operators considered in section IV-A to compute the minimization quantity of interest is given by,
(24) |
and the range of 55 temperatures used for the optimization is,
(25) |
In section IV-A, we calculate the differences in operator expectation values between the ideal TFD state and the circuit-generated state. The total of the absolute differences (for all temperatures) between each ideal and generated state is chosen as the minimization quantity of interest, , for finding optimal and values:
(26) |
where and are the ideal and circuit-generated states, respectively for each value. This is an expression that is more conducive for experimental implementation as well depending on the ease of obtaining various operator expectation values.
Appendix B Density Matrix Analysis for TFD Generation
We label the density matrices corresponding to the full system in the binary ordering of the ket occupation states as defined in section II. For example the relevant elements for , and can be found as follows,
In eq. 43, each unique label in is assigned when first encountered dduring enumeration of the matrix elements. Note that this is a symmetric matrix as expected from the definition of the purified thermal state in eq. 3. Conversely, the circuit-generated state is given by a Hermitian matrix as seen in eq. 60. The non-zero elements in the evaluation of eq. 20 are given by the elements in eq. 61.
(43) |
(60) |
(61) |
Appendix C Performance of cost functions for varying
In fig. 4 we compare the different cost functions’ performance at generating TFD states for different values. We find that outperforms for most cases. Note that the low temperature performance of is poor for the case of , indicating the optimal coefficients should be re-optimized for various values. Finding a general expression for the coefficients is desirable as the intermediate temperature performance is still superior to . We note that as a cost function occasionally had difficulty converging to a minimum, especially for and , as can be seen from sudden jumps in the traces.

Acknowledgments
The authors thank Sonika Johri and Xiang Chris Zou for insightful discussions.
References
- [1] J. B. Altepeter, D. F. James, and P. G. Kwiat, 4 Qubit Quantum State Tomography. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004, pp. 113–145.
- [2] A. N. Chowdhury, G. H. Low, and N. Wiebe, “A Variational Quantum Algorithm for Preparing Quantum Gibbs States,” 2020. [Online]. Available: https://arxiv.org/abs/2002.00055
- [3] O. F. de Alcantara Bonfim, B. Boechat, and J. Florencio, “Ground-state properties of the one-dimensional transverse Ising model in a longitudinal magnetic field,” Phys. Rev. E, vol. 99, p. 012122, Jan 2019. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevE.99.012122
- [4] E. Farhi, J. Goldstone, and S. Gutmann, “A Quantum Approximate Optimization Algorithm,” 2014, MIT-CTP/4610. [Online]. Available: https://arxiv.org/abs/1411.4028
- [5] S. Hadfield, Z. Wang, B. O’Gorman, E. G. Rieffel, D. Venturelli, and R. Biswas, “From the Quantum Approximate Optimization Algorithm to a Quantum Alternating Operator Ansatz,” Algorithms, vol. 12, no. 2, 2019. [Online]. Available: https://www.mdpi.com/1999-4893/12/2/34
- [6] D. F. V. James, P. G. Kwiat, W. J. Munro, and A. G. White, “Measurement of qubits,” Phys. Rev. A, vol. 64, p. 052312, Oct 2001. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.64.052312
- [7] S. Khatri, R. LaRose, A. Poremba, L. Cincio, A. T. Sornborger, and P. J. Coles, “Quantum-assisted quantum compiling,” Quantum, vol. 3, p. 140, May 2019. [Online]. Available: https://doi.org/10.22331/q-2019-05-13-140
- [8] J. S. Lundeen, B. Sutherland, A. Patel, C. Stewart, and C. Bamber, “Direct measurement of the quantum wavefunction,” Nature, vol. 474, no. 7350, pp. 188–191, 2011. [Online]. Available: https://doi.org/10.1038/nature10120
- [9] J. Maziero, “Computing partial traces and reduced density matrices,” International Journal of Modern Physics C, vol. 28, no. 01, p. 1750005, 2017. [Online]. Available: https://doi.org/10.1142/S012918311750005X
- [10] N. Moll, P. Barkoutsos, L. S. Bishop, J. M. Chow, A. Cross, D. J. Egger, S. Filipp, A. Fuhrer, J. M. Gambetta, M. Ganzhorn, A. Kandala, A. Mezzacapo, P. Müller, W. Riess, G. Salis, J. Smolin, I. Tavernelli, and K. Temme, “Quantum optimization using variational algorithms on near-term quantum devices,” Quantum Science and Technology, vol. 3, no. 3, p. 030503, jun 2018.
- [11] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information: 10th Anniversary Edition. Cambridge University Press, 2010.
- [12] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, “A variational eigenvalue solver on a photonic quantum processor,” Nature Communications, vol. 5, no. 1, p. 4213, 2014. [Online]. Available: https://doi.org/10.1038/ncomms5213
- [13] P. Pfeuty, “The one-dimensional Ising model with a transverse field,” Annals of Physics, vol. 57, no. 1, pp. 79 – 90, 1970. [Online]. Available: http://www.sciencedirect.com/science/article/pii/0003491670902708
- [14] S. Premaratne, S. Johri, X. C. Zou, R. Sagastizabal, M. A. Rol, B. Klaver, M. Moreira, C. Almudever, L. D. Carlo, and A. Matsuura, “Efficient Variational Generation of Thermofield Double States on a Superconducting Quantum Processor: Theory (Part 1),” in APS March Meeting, 2020, p. F07.00011.
- [15] J. Preskill, “Quantum Computing in the NISQ era and beyond,” Quantum, vol. 2, p. 79, Aug. 2018. [Online]. Available: https://doi.org/10.22331/q-2018-08-06-79
- [16] R. Sagastizabal, S. Premaratne, S. Johri, B. Klaver, M. Moreira, V. Negrîneac, M. A. Rol, X. C. Zou, C. Almudever, A. Matsuura, and L. D. Carlo, unpublished.
- [17] R. Sagastizabal, M. A. Rol, B. Klaver, M. Moreira, S. Premaratne, S. Johri, X. C. Zou, C. Almudever, A. Matsuura, and L. D. Carlo, “Efficient Variational Generation of Thermofield Double States on a Superconducting Quantum Processor: Theory (Part 2),” in APS March Meeting, 2020, p. F07.00012.
- [18] R. Storn and K. Price, “Differential Evolution – A Simple and Efficient Heuristic for global Optimization over Continuous Spaces,” Journal of Global Optimization, vol. 11, no. 4, pp. 341–359, Dec 1997. [Online]. Available: https://doi.org/10.1023/A:1008202821328
- [19] J. Wu and T. H. Hsieh, “Variational Thermal Quantum Simulation via Thermofield Double States,” Phys. Rev. Lett., vol. 123, p. 220502, Nov 2019. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.123.220502
- [20] D. Zhu, S. Johri, N. M. Linke, K. A. Landsman, N. H. Nguyen, C. H. Alderete, A. Y. Matsuura, T. H. Hsieh, and C. Monroe, “Variational Generation of Thermofield Double States and Critical Ground States with a Quantum Computer,” 2019. [Online]. Available: https://arxiv.org/abs/1906.02699
- [21] X. Zou, S. P. Premaratne, M. A. Rol, S. Johri, V. Ostroukh, D. J. Michalak, R. Caudillo, J. S. Clarke, L. DiCarlo, and A. Y. Matsuura, “Enhancing a Near-Term Quantum Accelerator’s Instruction Set Architecture for Materials Science Applications,” IEEE Transactions on Quantum Engineering, vol. 1, pp. 1–7, 2020.