Improving thermodynamic bounds using correlations

Andreas Dechant and Shin-ichi Sasa

Department of Physics #1, Graduate School of Science, Kyoto University, Kyoto 606-8502, Japan

Abstract

We discuss how to use correlations between different physical observables to improve recently obtained thermodynamic bounds, notably the fluctuation-response inequality (FRI) and the thermodynamic uncertainty relation (TUR). We show that increasing the number of measured observables will always produce a tighter bound. This tighter bound becomes particularly useful if one of the observables is a conserved quantity, whose expectation is invariant under a given perturbation of the system. For the case of the TUR, we show that this applies to any function of the state of the system. The resulting correlation-TUR takes into account the correlations between a current and a non-current observable, thereby tightening the TUR. We demonstrate our finding on a model of the $\text{F}_{1}$ -ATPase molecular motor, a Markov jump model consisting of two rings and transport through a two-dimensional channel. We find that the correlation-TUR is significantly tighter than the TUR and can be close to an equality even far from equilibrium.

I Introduction

Entropy production is a fundamental concept of non-equilibrium statistical mechanics. It relates the asymmetry of microscopic transitions in a system to the measurable loss of energy in the form of heat dissipated into the environment. For macroscopic systems, measuring the latter thus provides a measure of microscopic time-reversal symmetry breaking. While the same relation holds for microscopic systems and can even be formulated on the level of single trajectories [1, 2], measuring the dissipated heat is generally very challenging, as the resulting temperature changes are very small and typically lost among the fluctuations of the noisy environment. A more practical way to measure the entropy production in microscopic systems is provided by the work of Harada and Sasa [3], who show that the entropy production can be obtained from the violation of the fluctuation-dissipation relation. We remark that, in principle, the entropy production may also be obtained directly from the probabilities of microscopic transitions in the system; however, this requires very good spatial and temporal resolution as well as a large amount of statistics.

A different way of estimating entropy production has recently been suggested [4, 5, 6, 7, 8] using the thermodynamic uncertainty relation (TUR) [9, 10, 11, 12]. The TUR establishes a connection between entropy production on the one hand, and measurable currents in the system and their fluctuations on the other hand. It may be understood as a more precise formulation of the second law, since it not only establishes the positivity of entropy production but also provides a finite lower bound in terms of experimentally accessible quantities. However, since the TUR is an inequality, there is generally no guarantee that the lower bound is tight, i.e. that a useful estimate of entropy production is obtained from a given measurement. In principle, the lower bound can be optimized to produce an accurate estimate of entropy production [4, 5, 6, 7] and even realize equality [13]; however, the resulting quantities may not be any easier to measure than the entropy production itself.

From an experimental point of view, it is thus highly desirable to improve the tightness of the bound using available data. However, the tightness of the bound is also of fundamental interest: For example, it has been shown [14] that the TUR is generally not very tight for models of biological molecular motors, with the lower estimate on entropy production being on the order of $10$ to $40\%$ of the actual value. This raises the intriguing question of whether evolution is “bad” at saturating thermodynamic bounds, or whether indeed a tighter bound exists.

So far, applications and extensions of the TUR have mostly focused on current-like observables (for example the displacement of a particle or the heat exchanged with the environment) [15, 16, 17], although it has been found [18, 19, 20] that, in the presence of time-dependent driving, state-dependent observables (like the instantaneous position or potential energy) may also yield information about the entropy production. The presence of non-zero average currents clearly distinguishes a non-equilibrium steady state from an equilibrium system; it is thus reasonable that a relation between currents and the entropy production should exist. By contrast, the average of state-dependent observables is independent of time both in equilibrium and non-equilibrium steady states. Intuitively, it seems that such observables can provide no additional information about the steady state entropy production. As the main result of this article, we show that this intuitive notion is not correct. We can exploit the correlations between a state-dependent observable $z$ and a current $j$ to obtain a tighter version of the TUR. We formulate the TUR in terms of the transport efficiency $\eta_{J}$ [11]

\displaystyle\eta_{J}=\frac{2\langle J\rangle^{2}}{\text{Var}_{J}\Delta S^{\text{irr}}}\leq 1,

(1)

where $J$ is the time-integrated current, $\langle J\rangle$ denotes the average and $\text{Var}_{J}$ the variance, and $\Delta S^{\text{irr}}$ is the total entropy production. Our main result is the bound

\displaystyle\eta_{J}+{\chi_{J,Z}}^{2}\leq 1,

(2)

where $Z$ is the time-integral of the state-dependent observable $z$ and $\chi_{J,Z}=\text{Cov}_{J,Z}/\sqrt{\text{Var}_{J}\text{Var}_{Z}}$ , with $\text{Cov}_{J,Z}$ the covariance, is the Pearson correlation coefficient, which satisfies $-1\leq\chi_{J,Z}\leq 1$ . We refer to Eq. (2) as the correlation TUR (CTUR). As a consequence, we obtain a tighter bound on the entropy production

\displaystyle\frac{\langle J\rangle^{2}}{\text{Var}_{J}}\leq\frac{\langle J\rangle^{2}}{\text{Var}_{J}\big{(}1-{\chi_{J,Z}}^{2}\big{)}}\leq\frac{1}{2}\Delta S^{\text{irr}},

(3)

where the leftmost expression corresponds to the TUR. Surprisingly, the observable $Z$ can be almost arbitrary, as long as it is the time-integral of a quantity which only depends on the state of the system. This implies that virtually any additional observable that can be obtained from a measurement may be used to tighten the TUR. As we show below, a tight bound is generally obtained when $Z$ is chosen as the local average value of $J$ . Importantly, the CTUR can be evaluated using only the experimentally obtained trajectory data and does not require any additional information about the parameters of the model. This suggests that taking into account correlations between observables may indeed be crucial to obtaining accurate estimates of the entropy production in terms of experimentally accessible quantities.
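As an illustration of how Eq. (3) might be evaluated from data, the following is a minimal sketch, assuming that each measured trajectory has already been reduced to one value of the current $J$ and one value of the state-dependent observable $Z$. The function name and the input format are hypothetical and not part of the original analysis.

```python
import numpy as np

def tur_and_ctur_bounds(J, Z):
    """Lower bounds on the entropy production from samples of J and Z.

    J, Z: 1D arrays with one value per trajectory (assumed input format).
    Returns (tur_bound, ctur_bound, chi).
    """
    J = np.asarray(J, dtype=float)
    Z = np.asarray(Z, dtype=float)
    # Sample Pearson coefficient chi_{J,Z}
    chi = np.corrcoef(J, Z)[0, 1]
    # TUR: 2 <J>^2 / Var_J is a lower bound on the entropy production
    tur_bound = 2.0 * J.mean() ** 2 / J.var(ddof=1)
    # CTUR: the same bound divided by (1 - chi^2), cf. Eq. (3)
    ctur_bound = tur_bound / (1.0 - chi ** 2)
    return tur_bound, ctur_bound, chi
```

Since $0\leq 1-{\chi_{J,Z}}^{2}\leq 1$ , the second value returned is never smaller than the first; for finite data sets both are subject to sampling error.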

We demonstrate the usefulness of the CTUR by applying it to three distinct examples: For a model for the $\text{F}_{1}$ -ATPase molecular motor [21, 22], we find that, while the bound obtained on the entropy production using the TUR for the displacement of the motor is only around $40\%$ of the actual value, measuring the time-integrated local mean velocity in addition to the displacement and using the CTUR yields an estimate that is about $90\%$ accurate over a wide range of parameters. For a Markov jump model, in which two currents are driven through two connected rings, we show that, even though measuring the current in one of the rings can only give an estimate on the contribution to the entropy production stemming from this ring, this estimate can be tightened considerably using the CTUR. Finally, for transport in a two-dimensional channel, we demonstrate that even simple choices of the state-dependent observable $z$ can yield a significant improvement over the TUR.

II Multidimensional FRI and monotonicity of information

The mathematical basis of our results is an extension of the fluctuation-response inequality (FRI) [23] to multiple observables, similar to the multidimensional TUR [24]. The FRI gives an upper bound on the ratio $\mathcal{Q}(r)$ between the response of the average of an observable $Y$ to a small perturbation of the system, and its fluctuations in the unperturbed system,

\displaystyle\frac{\big{(}\delta\langle Y\rangle\big{)}^{2}}{\text{Var}_{Y}}\leq 2D_{\text{KL}}(\tilde{p}\|p).

(4)

Here, $\delta\langle Y\rangle=\widetilde{\langle Y\rangle}-\langle Y\rangle$ is the response of the observable $Y$ to the perturbation which changes the probability density describing the system from $p(\omega)$ to $\tilde{p}(\omega)$ and $D_{\text{KL}}(\tilde{p}\|p)$ is the Kullback-Leibler divergence between the probability densities. Here, $\omega$ may be the state of the system, but it may also represent a trajectory of the system during the measurement interval. When we consider the perturbation to be described by a parameter $\theta$ , such that $p(\omega)=p^{\theta}(\omega)$ and $\tilde{p}(\omega)=p^{\theta+d\theta}(\omega)$ , then this is equivalent to the Cramér-Rao inequality [25, 26]

\displaystyle\frac{\big{(}\partial_{\theta}\langle Y\rangle\big{)}^{2}}{\text{Var}_{Y}}\leq I(\theta),

(5)

where $I(\theta)$ is the Fisher information

\displaystyle I(\theta)=\int d\omega\ \big{(}\partial_{\theta}\ln p^{\theta}(\omega)\big{)}^{2}p^{\theta}(\omega).

(6)
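The identification of Eq. (4) with Eq. (5) rests on the standard small-perturbation expansions of the Kullback-Leibler divergence and of the response. A short sketch of this step, assuming that $p^{\theta}(\omega)$ depends smoothly on $\theta$ , reads:

```latex
\begin{align}
D_{\text{KL}}\big(p^{\theta+d\theta}\|p^{\theta}\big)
  &= \frac{1}{2}I(\theta)\,d\theta^{2}+O(d\theta^{3}), \\
\delta\langle Y\rangle
  &= \partial_{\theta}\langle Y\rangle\,d\theta+O(d\theta^{2}).
\end{align}
```

Inserting both expansions into Eq. (4) and dividing by $d\theta^{2}$ yields Eq. (5) to leading order.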

With this identification, we can use the Cramér-Rao inequality for vector-valued observables, $\bm{Y}^{(K)}=(Y_{1},Y_{2},\ldots,Y_{K})$ ,

\displaystyle\mathcal{Q}^{(K)}_{Y}\equiv\big{(}\partial_{\theta}\langle\bm{Y}^{(K)}\rangle\big{)}^{\text{T}}\big{(}\bm{\Xi}_{Y}^{(K)}\big{)}^{-1}\big{(}\partial_{\theta}\langle\bm{Y}^{(K)}\rangle\big{)}\leq I(\theta),

(7)

where the superscript T denotes transposition and $\bm{\Xi}_{Y}^{(K)}$ is the covariance matrix with entries $(\bm{\Xi}_{Y}^{(K)})_{ij}=\text{Cov}_{Y_{i},Y_{j}}$ . Note that here we assumed that the observables are not linearly dependent such that the covariance matrix is positive definite. As noted in Ref. [24], Eq. (7) is the extension of the FRI to more than one observable.
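The left-hand side of Eq. (7) is a quadratic form that is straightforward to evaluate numerically. The following is a minimal sketch, assuming the response vector and the covariance matrix are already known; the function name `fri_ratio` is hypothetical.

```python
import numpy as np

def fri_ratio(response, cov):
    """Quadratic form Q = r^T Xi^{-1} r appearing in Eq. (7).

    response: vector of derivatives of the averages with respect to theta.
    cov: covariance matrix of the observables (assumed positive definite).
    """
    response = np.asarray(response, dtype=float)
    cov = np.asarray(cov, dtype=float)
    # Solve Xi x = r instead of forming the inverse explicitly
    return float(response @ np.linalg.solve(cov, response))
```

For example, `fri_ratio([1.0, 0.0], [[2.0, 1.0], [1.0, 2.0]])` evaluates to $2/3$ , which exceeds the single-observable value $1/2$ obtained from the first observable alone.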

Next, we want to show that increasing the number of observables results in a tighter bound, i. e. that $\mathcal{Q}^{(K)}_{Y}\leq\mathcal{Q}^{(K+1)}_{Y}$ . We write the covariance matrix $\bm{\Xi}_{Y}^{(K+1)}$ of $K+1$ observables as

\displaystyle\bm{\Xi}_{Y}^{(K+1)}=\begin{pmatrix}\bm{A}&\bm{b}\\ \bm{b}^{\text{T}}&c\end{pmatrix}\quad\text{with}\quad\bm{A}=\bm{\Xi}_{Y}^{(K)},\;b_{k}=\text{Cov}_{Y_{k},Y_{K+1}},\;c=\text{Var}_{Y_{K+1}}.

(8)

We compute its inverse using the block-inversion formula

\displaystyle\big{(}\bm{\Xi}_{Y}^{(K+1)}\big{)}^{-1}=\begin{pmatrix}\bm{A}^{-1}&0\\ 0&0\end{pmatrix}+\bm{D}\quad\text{with}\quad\bm{D}=\frac{1}{c-\bm{b}^{\text{T}}\bm{A}^{-1}\bm{b}}\bm{d}\bm{d}^{\text{T}},\;\bm{d}=\begin{pmatrix}-\bm{A}^{-1}\bm{b}\\ 1\end{pmatrix}.

(9)

Further, we have the Schur determinant identity

\displaystyle\det\big{(}\bm{\Xi}_{Y}^{(K+1)}\big{)}=\det\big{(}\bm{\Xi}_{Y}^{(K)}\big{)}\big{(}c-\bm{b}^{\text{T}}\bm{A}^{-1}\bm{b}\big{)}.

(10)

Since $\bm{\Xi}_{Y}^{(K+1)}$ and $\bm{\Xi}_{Y}^{(K)}$ are positive definite, the second factor on the right-hand side is also positive. As a consequence, the matrix $\bm{D}$ in Eq. (9) is positive semi-definite and we have for any $(K+1)$ -vector $\bm{v}^{(K+1)}$ ,

\displaystyle\begin{aligned}\bm{v}^{(K+1),\text{T}}\big{(}\bm{\Xi}_{Y}^{(K+1)}\big{)}^{-1}\bm{v}^{(K+1)}&=\bm{v}^{(K),\text{T}}\big{(}\bm{\Xi}_{Y}^{(K)}\big{)}^{-1}\bm{v}^{(K)}+\bm{v}^{(K+1),\text{T}}\bm{D}\bm{v}^{(K+1)}\\ &\geq\bm{v}^{(K),\text{T}}\big{(}\bm{\Xi}_{Y}^{(K)}\big{)}^{-1}\bm{v}^{(K)},\end{aligned}

(11)

where $\bm{v}^{(K)}$ is the vector $\bm{v}^{(K+1)}$ with the $(K+1)$ -th component removed. For $\bm{v}^{(K+1)}=\partial_{\theta}\langle\bm{Y}^{(K+1)}\rangle$ this yields the desired inequality

\displaystyle\mathcal{Q}^{(K+1)}_{Y}\geq\mathcal{Q}^{(K)}_{Y}.

(12)

In light of the Cramér-Rao inequality Eq. (7), this means that considering more observables yields more information about the parameter $\theta$ (i. e. the perturbation) and thus a tighter lower bound on the Fisher information. In that sense, the information obtained from a measurement increases monotonically with increasing the number of measured observables. This holds true only as long as the additional observables are not linearly dependent on the existing ones; if this is not the case, then the covariance matrix becomes singular and the bound saturates, as the additional observables do not contain any new information.
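The monotonicity argument can be checked numerically. The following is a minimal sketch, assuming a randomly generated positive definite matrix in place of a measured covariance matrix and a random vector in place of the response; all variable names are hypothetical. It uses the standard Schur-complement form of the block inverse, in which the rank-one correction is $\bm{d}\bm{d}^{\text{T}}$ divided by the Schur complement $c-\bm{b}^{\text{T}}\bm{A}^{-1}\bm{b}$ .

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4

# Random positive definite (K+1)x(K+1) matrix standing in for the covariance
M = rng.normal(size=(K + 1, K + 1))
cov = M @ M.T + 0.1 * np.eye(K + 1)
# Random vector standing in for the response of the K+1 observables
v = rng.normal(size=K + 1)

def quad(vec, mat):
    """Quadratic form vec^T mat^{-1} vec."""
    return float(vec @ np.linalg.solve(mat, vec))

# Q^(k) for the first k observables, k = 1, ..., K+1
q_values = [quad(v[:k], cov[:k, :k]) for k in range(1, K + 2)]

# Block inverse: [[A^{-1}, 0], [0, 0]] + d d^T / (c - b^T A^{-1} b)
A, b, c = cov[:K, :K], cov[:K, K], cov[K, K]
schur = c - b @ np.linalg.solve(A, b)
d = np.append(-np.linalg.solve(A, b), 1.0)
inv_block = np.zeros((K + 1, K + 1))
inv_block[:K, :K] = np.linalg.inv(A)
inv_block += np.outer(d, d) / schur

print(q_values)
print(np.allclose(inv_block, np.linalg.inv(cov)))
```

The list `q_values` is non-decreasing, and the block formula agrees with the directly computed inverse, in line with Eq. (12).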

In the case of two observables $Y_{1}$ and $Y_{2}$ , the inverse of the covariance matrix can be computed explicitly and we obtain the bound

\displaystyle\frac{\big{(}\partial_{\theta}\langle Y_{1}\rangle\big{)}^{2}\text{Var}_{Y_{2}}-2\big{(}\partial_{\theta}\langle Y_{1}\rangle\big{)}\big{(}\partial_{\theta}\langle Y_{2}\rangle\big{)}\text{Cov}_{Y_{1},Y_{2}}+\big{(}\partial_{\theta}\langle Y_{2}\rangle\big{)}^{2}\text{Var}_{Y_{1}}}{\text{Var}_{Y_{1}}\text{Var}_{Y_{2}}-{\text{Cov}_{Y_{1},Y_{2}}}^{2}}\leq I(\theta).

(13)

This expression simplifies further if $Y_{2}$ is a conserved quantity with respect to the perturbation, $\partial_{\theta}\langle Y_{2}\rangle=0$ . Then, we find

\displaystyle\frac{\big{(}\partial_{\theta}\langle Y_{1}\rangle\big{)}^{2}}{\text{Var}_{Y_{1}}\big{(}1-{\chi_{Y_{1},Y_{2}}}^{2}\big{)}}\leq I(\theta).

(14)

In this case it is obvious that the bound is tighter than Eq. (5). This shows that, even if the average of $Y_{2}$ contains no information about the parameter $\theta$ and the perturbation, we may still use its correlations with $Y_{1}$ to obtain a tighter version of the Cramér-Rao inequality and thus the FRI.
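For completeness, the step from Eq. (13) to Eq. (14) can be written out explicitly. Setting $\partial_{\theta}\langle Y_{2}\rangle=0$ removes the last two terms in the numerator of Eq. (13), and the remaining factor $\text{Var}_{Y_{2}}$ cancels against the denominator:

```latex
\begin{align}
\frac{\big(\partial_{\theta}\langle Y_{1}\rangle\big)^{2}\,\text{Var}_{Y_{2}}}
     {\text{Var}_{Y_{1}}\text{Var}_{Y_{2}}-{\text{Cov}_{Y_{1},Y_{2}}}^{2}}
 &= \frac{\big(\partial_{\theta}\langle Y_{1}\rangle\big)^{2}}
     {\text{Var}_{Y_{1}}\Big(1-\frac{{\text{Cov}_{Y_{1},Y_{2}}}^{2}}
     {\text{Var}_{Y_{1}}\text{Var}_{Y_{2}}}\Big)} \\
 &= \frac{\big(\partial_{\theta}\langle Y_{1}\rangle\big)^{2}}
     {\text{Var}_{Y_{1}}\big(1-{\chi_{Y_{1},Y_{2}}}^{2}\big)} .
\end{align}
```

Since $0\leq 1-{\chi_{Y_{1},Y_{2}}}^{2}\leq 1$ , the left-hand side of Eq. (14) is never smaller than that of Eq. (5).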

III Continuous time-reversal and TUR

In view of later applications, we slightly generalize the discussion to a Langevin dynamics in $\mathbb{R}^{N}$ with an internal degree of freedom

\displaystyle\dot{\bm{x}}(t)=\bm{a}_{i(t)}(\bm{x}(t))+\bm{G}_{i(t)}\bm{\xi}(t),

(15)

where the drift vector $\bm{a}_{i}(\bm{x})$ and diffusion matrix $\bm{G}_{i}$ depend on the discrete state $i=1,\ldots,M$ . The dynamics of the discrete state are governed by a Markov jump process with transition rates $W_{ij}(\bm{x})$ from state $j$ to state $i$ . We take the diffusion matrix to be independent of the position in order to simplify some of the following notation; however, the extension to a position-dependent diffusion matrix can be readily obtained. The evolution of the probability density $p_{i}(\bm{x},t)$ for being at position $\bm{x}$ and in state $i$ at time $t$ is governed by the Fokker-Planck master equation

\displaystyle\partial_{t}p_{i}(\bm{x},t)=-\bm{\nabla}\big{(}\bm{\nu}_{i}(\bm{x},t)p_{i}(\bm{x},t)\big{)}+2\sum_{j}V_{ij}(\bm{x},t)p_{j}(\bm{x},t),

(16)

\displaystyle\text{with}\quad\bm{\nu}_{i}(\bm{x},t)=\bm{a}_{i}(\bm{x})-\bm{\nabla}^{\text{T}}\bm{B}_{i}\ln p_{i}(\bm{x},t),

\displaystyle\text{and}\quad V_{ij}(\bm{x},t)=\frac{1}{2}\bigg{(}W_{ij}(\bm{x})-W_{ji}(\bm{x})\frac{p_{i}(\bm{x},t)}{p_{j}(\bm{x},t)}\bigg{)}.

Here, $\bm{B}_{i}=2\bm{G}_{i}\bm{G}_{i}^{\text{T}}$ is assumed to be positive definite (i. e. $\bm{G}_{i}$ should have full rank). This dynamics reduces to a pure Langevin dynamics in absence of the discrete degree of freedom and to a pure Markov jump dynamics if there is no dependence on $\bm{x}$ . The quantity $\bm{\nu}_{i}(\bm{x},t)$ is called the local mean velocity and characterizes the local flows in the system. We have also introduced its analog $V_{ij}(\bm{x},t)$ for the jump part. For the type of dynamics Eq. (15), we may consider two flavors of currents [27]


$\displaystyle J_{\text{d}}$	$\displaystyle=\int_{0}^{\tau}dt\ \bm{w}^{\text{T}}_{i(t)}(\bm{x}(t))\circ\dot{\bm{x}}(t),$	(17a)
$\displaystyle J_{\text{j}}$	$\displaystyle=\int_{0}^{\tau}\ \omega_{j(t+dt),j(t)}(\bm{x}(t)).$	(17b)

Here $\bm{w}_{i}(\bm{x})$ is a differentiable vector field, $\omega_{ij}(\bm{x})=-\omega_{ji}(\bm{x})$ are the entries of an antisymmetric matrix and $\circ$ denotes the Stratonovich product. Intuitively, the diffusive current $J_{\text{d}}$ may be interpreted as a generalized displacement, in which the velocity is weighted by the position- and state-dependent function $\bm{w}_{i}(\bm{x})$ . The jump current $J_{\text{j}}$ , on the other hand, counts transitions between different states, which are weighted by the function $\omega_{ij}(\bm{x})$ . The averages of these quantities in the steady state are proportional to time and given by


$\displaystyle\langle J_{\text{d}}\rangle$	$\displaystyle=\tau\sum_{i}\int d\bm{x}\ \bm{w}^{\text{T}}_{i}(\bm{x})\bm{\nu}^{\text{st}}_{i}(\bm{x})p_{i}^{\text{st}}(\bm{x}),$	(18a)
$\displaystyle\langle J_{\text{j}}\rangle$	$\displaystyle=\tau\sum_{ij}\int d\bm{x}\ \omega_{ij}(\bm{x})V_{ij}^{\text{st}}(\bm{x})p_{j}^{\text{st}}(\bm{x}).$	(18b)

Here the superscript st denotes the steady state value of the respective quantity. Note that both types of current are proportional to the respective local mean velocity. Next, we briefly summarize the continuous time-reversal introduced in Ref. [13]. This transformation is defined by a family of dynamics of the type Eq. (15), with a parameter $\theta\in[-1,1]$ ; $\theta=1$ corresponds to the time-forward dynamics, while $\theta=-1$ represents the time-reversed dynamics [28]. So, we can connect the time-forward and time-reversed dynamics by a continuous family of dynamics. In the present context, the most important property of this transformation is that it leads to a rescaling of the local mean velocities $\bm{\nu}_{i}^{\text{st},\theta}(\bm{x})=\theta\bm{\nu}_{i}^{\text{st}}(\bm{x})$ and $V_{ij}^{\text{st},\theta}(\bm{x})=\theta V_{ij}^{\text{st}}(\bm{x})$ , while the steady state probability density $p_{i}^{\text{st}}(\bm{x})$ is independent of $\theta$ . This generalizes the intuitive notion that time-reversal should lead to a reversal of all flows in the system to a continuous transformation. From Eq. (18), we then see that the averages of currents are also rescaled by the continuous time-reversal operation:

\displaystyle\langle J\rangle^{\theta}=\theta\langle J\rangle.

(19)

This implies that $\partial_{\theta}\langle J\rangle^{\theta}=\langle J\rangle$ . Further, the Fisher information corresponding to the path probabilities of the dynamics parameterized by $\theta$ is related to the entropy production [29, 23, 13],

\displaystyle I(\theta)\leq\frac{1}{2}\Delta S^{\text{irr}},

(20)

where equality holds for a pure Langevin dynamics. With this, Eq. (7) implies the multidimensional TUR [24],

\displaystyle\langle\bm{Y}^{(K)}\rangle^{\text{T}}\big{(}\bm{\Xi}_{Y}^{(K)}\big{)}^{-1}\langle\bm{Y}^{(K)}\rangle\leq\frac{1}{2}\Delta S^{\text{irr}},

(21)

where the components of $\bm{Y}^{(K)}$ are currents of either type in Eq. (17). The new insight from the preceding discussion is that the left-hand side increases monotonically when increasing the number of measured currents. This fact is very useful when we want to use the left-hand side to estimate the entropy production: Any additional information that can be obtained from a measurement can be used to improve the estimate. We note that, in the steady state of Eq. (15), the entropy production is explicitly given by

\displaystyle\Delta S^{\text{irr}}=\tau\bigg{(}\sum_{i}\int d\bm{x}\ \bm{\nu}_{i}^{\text{st},\text{T}}(\bm{x})\bm{B}_{i}^{-1}\bm{\nu}_{i}^{\text{st}}(\bm{x})p_{i}^{\text{st}}(\bm{x})+\sum_{i,j}\int d\bm{x}\ \ln\bigg{(}\frac{W_{ij}(\bm{x})p_{j}^{\text{st}}(\bm{x})}{W_{ji}(\bm{x})p_{i}^{\text{st}}(\bm{x})}\bigg{)}W_{ij}(\bm{x})p_{j}^{\text{st}}(\bm{x})\bigg{)}.

(22)

Crucially, Eq. (21) is not restricted to current observables. To see this, we recall that the steady state probability density $p^{\text{st}}_{i}(\bm{x})$ is invariant under changing the parameter $\theta$ [13]. As a consequence, for a state-dependent (or non-current) observable $Z$

\displaystyle Z=\int_{0}^{\tau}dt\ z_{i(t)}(\bm{x}(t),t),

(23)

where the function $z_{i}(\bm{x},t)$ may depend on the position, the internal state and time, its average does not depend on $\theta$

\displaystyle\langle Z\rangle^{\theta}=\int_{0}^{\tau}dt\sum_{i}\int d\bm{x}\ z_{i}(\bm{x},t)p_{i}^{\text{st}}(\bm{x})=\langle Z\rangle.

(24)

Thus, $\partial_{\theta}\langle Z\rangle^{\theta}=0$ , and we may include such observables in Eq. (21) by setting the corresponding entries in the vector $\langle\bm{Y}^{(K)}\rangle$ to zero. However, such observables do contribute to the covariance matrix $\bm{\Xi}_{Y}^{(K)}$ , and Eq. (12) guarantees that the resulting bound will be tighter than the one without these observables. For the case of one current and one state-dependent observable, we may use Eq. (14) to write the bound explicitly

\displaystyle\frac{\langle J\rangle^{2}}{\text{Var}_{J}\big{(}1-{\chi_{J,Z}}^{2}\big{)}}\leq\frac{1}{2}\Delta S^{\text{irr}},

(25)

which is the second inequality in Eq. (3) and thus equivalent to the CTUR. This is very appealing from an experimental point of view: Currents as in Eq. (17) depend on the velocity or transitions between the internal states. Since observing these requires a high time-resolution, such quantities are generally challenging to measure accurately. The only exceptions are specific choices of the weighting functions for which the time-integrated observable can be measured directly, for example the displacement of a particle. By contrast, observables of the type Eq. (23), which depend only on the position and the internal state, can easily be evaluated from trajectory data. The CTUR Eq. (25) implies that, provided at least one current can be obtained from the measurement, we may use other, non-current observables to improve the lower bound on the entropy production.
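The inclusion of state-dependent observables in Eq. (21) can be sketched as follows, assuming that every trajectory has been reduced to a row of current values and a row of state-dependent observable values. The function name and the array layout are hypothetical. The entries of the response vector that belong to state-dependent observables are set to zero, while all observables enter the covariance matrix.

```python
import numpy as np

def multidim_tur_bound(currents, state_obs):
    """Lower bound on the entropy production from Eq. (21).

    currents:  array of shape (n_traj, n_currents), time-integrated currents.
    state_obs: array of shape (n_traj, n_state), time-integrated
               state-dependent observables (n_state may be zero).
    """
    currents = np.asarray(currents, dtype=float)
    state_obs = np.asarray(state_obs, dtype=float)
    data = np.hstack([currents, state_obs])
    # Sample covariance matrix of all observables
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    # Averages of currents; zeros for the state-dependent observables
    response = np.concatenate(
        [currents.mean(axis=0), np.zeros(state_obs.shape[1])]
    )
    return 2.0 * float(response @ np.linalg.solve(cov, response))
```

For a single current and a single state-dependent observable, this reduces to twice the left-hand side of Eq. (25); with `state_obs` of shape `(n_traj, 0)` it reduces to the multidimensional TUR for currents only.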

IV Optimal observables and stochastic entropy production

Given that the choice of the observable $Z$ in Eq. (25) has a lot of freedom, a natural question is whether there exists an optimal observable which maximizes the bound. This is equivalent to finding $Z$ such that the magnitude of the Pearson coefficient $\chi_{J,Z}$ becomes maximal for given $J$ . Unfortunately, we have not been able to solve this optimization problem in general. However, there is one particular case where we can obtain the solution explicitly. For a pure Langevin dynamics without internal states, we may consider the stochastic entropy production $\Sigma$ as the observable $J$ . This corresponds to the weighting function

\displaystyle\bm{w}(\bm{x})=\bm{B}^{-1}\bm{\nu}^{\text{st}}(\bm{x}).

(26)

As we show in Appendix A, in this case, the optimal choice for $Z$ is

\displaystyle Z=\bar{\Sigma}=\int_{0}^{\tau}dt\ \bm{\nu}^{\text{st,T}}(\bm{x}(t))\bm{B}^{-1}\bm{\nu}^{\text{st}}(\bm{x}(t)).

(27)

This quantity can be interpreted as a local mean entropy production, i. e. the expected entropy production rate at position $\bm{x}(t)$ integrated along the trajectory. Note that both $\Sigma$ and $\bar{\Sigma}$ have the entropy production $\Delta S^{\text{irr}}$ as their average value. As it turns out, this choice turns Eq. (25) into an equality,

\displaystyle 2\Delta S^{\text{irr}}=\text{Var}_{\Sigma}\big{(}1-{\chi_{\Sigma,\bar{\Sigma}}}^{2}\big{)},

(28)

which shows that this really is the optimal choice of $Z$ . We remark that this equality is equivalent to the equality $2\Delta S^{\text{irr}}=\text{Var}(\delta\Sigma)$ with $\delta\Sigma=\Sigma-\bar{\Sigma}$ derived in Ref. [13]. For general currents, while the optimal $Z$ could not be obtained explicitly, we note that the average current is expressed in terms of the local mean velocity as

\displaystyle\langle J\rangle=\tau\int d\bm{x}\ \bm{w}^{\text{T}}(\bm{x})\bm{\nu}^{\text{st}}(\bm{x})p^{\text{st}}(\bm{x}).

(29)

Comparison with Eq. (27) suggests that a good choice for $Z$ may be

\displaystyle Z=\bar{J}=\int_{0}^{\tau}dt\ \bm{w}^{\text{T}}(\bm{x}(t))\bm{\nu}^{\text{st}}(\bm{x}(t)).

(30)

This choice is the local mean value of the current, which has the same average as the current itself.

Further insight into the meaning of the optimal observable $Z$ can be gained from the following consideration. Since the averages of the observables $J$ and $\tilde{J}=J-Z$ exhibit the same scaling under continuous time-reversal

\displaystyle\partial_{\theta}\langle J\rangle^{\theta}=\partial_{\theta}\langle\tilde{J}\rangle^{\theta}=\langle J\rangle,

(31)

they both satisfy a TUR

\displaystyle\frac{\langle J\rangle^{2}}{\text{Var}_{J}}\leq\frac{1}{2}\Delta S^{\text{irr}}\quad\text{and}\quad\frac{\langle J\rangle^{2}}{\text{Var}_{\tilde{J}}}\leq\frac{1}{2}\Delta S^{\text{irr}}.

(32)

Since the choice of $Z$ is arbitrary within the class of observables Eq. (23), we may minimize the variance of $\tilde{J}$ with respect to $Z$ ,

\displaystyle\frac{\langle J\rangle^{2}}{\inf_{Z}\big{(}\text{Var}_{\tilde{J}}\big{)}}\leq\frac{1}{2}\Delta S^{\text{irr}}.

(33)

We may generalize this slightly by choosing $\tilde{J}=J-\alpha Z$ , where $\alpha$ is a constant. In this case, the minimization with respect to $\alpha$ can be done explicitly and yields

\displaystyle\inf_{\alpha}\big{(}\text{Var}_{\tilde{J}}\big{)}=\text{Var}_{J}\big{(}1-{\chi_{J,Z}}^{2}\big{)},

(34)

from which we readily obtain Eq. (25). Finding the optimal observable thus corresponds to maximizing the magnitude of the Pearson coefficient. Intuitively, the optimal observable is the state-dependent observable whose fluctuations most closely mimic those of the current $J$ , thus minimizing the variance of $J-Z$ . The tightness of the TUR is thus limited by how closely the current $J$ can be emulated by a state-dependent observable $Z$ , i. e., the magnitude of the predictable part of $J$ . An extreme case is given by the fluctuations of the stochastic entropy production around its local mean value, $\delta\Sigma$ . This quantity is equivalent to the entropy production measured in terms of the stochastic time coordinate introduced in Ref. [30]. Its statistics are described by simple Brownian motion and thus its predictable part is zero and the corresponding TUR is an equality [13].
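The minimization over $\alpha$ in Eq. (34) is an ordinary least-squares problem with the solution $\alpha=\text{Cov}_{J,Z}/\text{Var}_{Z}$ . The following is a minimal sketch of this step on sampled data, assuming one value of $J$ and $Z$ per trajectory; the function name is hypothetical.

```python
import numpy as np

def min_variance_shift(J, Z):
    """Minimize Var(J - alpha * Z) over the constant alpha.

    Returns (alpha, minimal variance). The minimal variance equals
    Var_J * (1 - chi^2), with chi the Pearson coefficient of J and Z.
    """
    cov = np.cov(J, Z)                 # 2x2 sample covariance matrix
    alpha = cov[0, 1] / cov[1, 1]      # optimal coefficient Cov_{J,Z} / Var_Z
    var_min = cov[0, 0] - cov[0, 1] ** 2 / cov[1, 1]
    return alpha, var_min
```

For the samples `J = [1, 2, 3, 4]` and `Z = [1, 3, 2, 4]` one finds $\alpha=0.8$ and a minimal variance of $0.6$ , compared to $\text{Var}_{J}=5/3$ .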

In the case of a pure Markov jump dynamics, the right-hand side of Eq. (25) can be replaced by a tighter bound

\displaystyle\frac{\langle J\rangle^{2}}{\text{Var}_{J}\big{(}1-{\chi_{J,Z}}^{2}\big{)}}\leq\frac{1}{2}R\leq\frac{1}{2}\Delta S^{\text{irr}}

(35)

with the quantity $R$ given by

\displaystyle R=\tau\sum_{i,j}\frac{\big{(}W_{ij}p^{\text{st}}_{j}-W_{ji}p^{\text{st}}_{i}\big{)}^{2}}{W_{ij}p^{\text{st}}_{j}+W_{ji}p^{\text{st}}_{i}}.

(36)

The inequality $R\leq\Delta S^{\text{irr}}$ is a straightforward consequence of the elementary inequality

\displaystyle\frac{(a-b)^{2}}{a+b}\leq\frac{1}{2}(a-b)\ln\bigg{(}\frac{a}{b}\bigg{)},

(37)

which holds for arbitrary $a,b>0$ . We note that the quantity $R$ has been introduced several times in the recent literature [24, 6, 31] and has been termed pseudo entropy production in Ref. [31]. Like the entropy production Eq. (22), this quantity measures the degree to which the system is out of equilibrium. As shown in Appendix A, the choice

\displaystyle\omega_{ij}=2\frac{W_{ij}p^{\text{st}}_{j}-W_{ji}p^{\text{st}}_{i}}{W_{ij}p^{\text{st}}_{j}+W_{ji}p^{\text{st}}_{i}}

(38)

in the current Eq. (17) allows $J=\mathcal{R}$ to be interpreted as a stochastic version of the pseudo entropy in the sense that $\langle\mathcal{R}\rangle=R$ . Further, choosing

\displaystyle Z=\bar{\mathcal{R}}=\int_{0}^{\tau}dt\ \sum_{i}\frac{\Big{(}W_{ij(t)}-W_{j(t)i}\frac{p_{i}^{\text{st}}}{p_{j(t)}^{\text{st}}}\Big{)}^{2}}{W_{ij(t)}+W_{j(t)i}\frac{p_{i}^{\text{st}}}{p_{j(t)}^{\text{st}}}},

(39)

that is, the local mean of the current $\mathcal{R}$ , we obtain equality in Eq. (35),

\displaystyle 2R=\text{Var}_{\mathcal{R}}\big{(}1-{\chi_{\mathcal{R},\bar{\mathcal{R}}}}^{2}\big{)},

(40)

which is the analog of Eq. (28). This shows that, as in the Langevin case, we may realize equality in Eq. (35), with the difference that the quantity being estimated is the pseudo entropy production $R$ instead of the entropy production $\Delta S^{\text{irr}}$ . As a consequence, Eq. (35) can only yield a reasonable estimate of the entropy production if $\Delta S^{\text{irr}}\approx R$ . From Eq. (37), this holds when $|W_{ij}p^{\text{st}}_{j}-W_{ji}p^{\text{st}}_{i}|\ll W_{ij}p^{\text{st}}_{j}+W_{ji}p^{\text{st}}_{i}$ for all transitions, i. e. the bias across any transition is small compared to the total activity. The latter condition is realized either near equilibrium (where the total bias is small) or in the continuum limit (where a finite total bias is distributed over many individual transitions). In all other cases, Eq. (35) can at most yield a lower bound on $\Delta S^{\text{irr}}$ , the extreme case being unidirectional transitions [32], where $\Delta S^{\text{irr}}$ diverges while $R$ remains finite. Note that comparing Eq. (39) with the definition of a general current Eq. (17) suggests that a good choice for the state-dependent observable should be

\displaystyle Z=\bar{J}=\int_{0}^{\tau}dt\ \sum_{i}\omega_{ij(t)}V^{\text{st}}_{ij(t)},

(41)

with the “local mean velocity”

\displaystyle V_{ij}^{\text{st}}=\frac{1}{2}\bigg{(}W_{ij}-W_{ji}\frac{p^{\text{st}}_{i}}{p^{\text{st}}_{j}}\bigg{)}.

(42)
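For a pure Markov jump process with position-independent rates, the quantities in Eqs. (22), (36) and (42) can be computed directly from the rate matrix. The following is a minimal sketch, assuming that every transition is reversible (so that the logarithm in Eq. (22) is finite) and that the steady state is unique; the function names are hypothetical. The returned rates are the quantities per unit time, i. e. $R/\tau$ and $\Delta S^{\text{irr}}/\tau$ .

```python
import numpy as np

def steady_state(W):
    """Steady state of a jump process; W[i, j] is the rate from j to i."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)
    # Generator with columns summing to zero
    generator = W - np.diag(W.sum(axis=0))
    n = W.shape[0]
    # Solve generator @ p = 0 together with sum(p) = 1
    A = np.vstack([generator, np.ones(n)])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    p, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return p

def jump_rates_summary(W):
    """Return (p_st, R / tau, entropy production / tau, V_st)."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)
    p = steady_state(W)
    a = W * p[None, :]          # a[i, j] = W_ij p_j
    b = a.T                     # b[i, j] = W_ji p_i
    m = (a > 0) & (b > 0)       # reversible transitions only (assumption)
    R_rate = np.sum((a[m] - b[m]) ** 2 / (a[m] + b[m]))      # Eq. (36)
    S_rate = np.sum(a[m] * np.log(a[m] / b[m]))              # Eq. (22)
    V = (a - b) / (2.0 * p[None, :])                         # Eq. (42)
    return p, R_rate, S_rate, V
```

For a three-state ring with forward rate 2 and backward rate 1, this gives a uniform steady state, $R/\tau=2/3$ and $\Delta S^{\text{irr}}/\tau=\ln 2\approx 0.693$ , consistent with $R\leq\Delta S^{\text{irr}}$ .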

V Improved TUR from trajectory data

Based on the considerations in the previous sections, we now provide a general recipe to obtain sharper TUR-like bounds from existing trajectory data. The goal of this procedure is to use the trajectory data to obtain an estimate of the entropy production. We assume that the current $J$ , i. e. the weighting functions $\bm{w}_{i}(\bm{x})$ and $\omega_{ij}(\bm{x})$ in Eq. (17) are fixed. The reasoning behind this is that, in order to optimize the current, we need to explicitly observe all the transitions along the trajectory, which typically requires a sufficiently high time-resolution. Failing that, we are limited to observing current observables whose time-integral directly corresponds to a measurable quantity, for example the total displacement of a tracer particle. Then, obtaining a good estimate for the entropy production corresponds to finding a good choice for the observable $Z$ in Eq. (25). We propose three methods for finding such a choice. We also point the reader to Appendix B, where we investigate the dependence of the estimate on the size of the data set and the sampling interval.

V.1 Explicit optimization

Given a set of trajectory data, we may in principle optimize the function $z_{i}(\bm{x})$ in Eq. (23) explicitly by maximizing the magnitude of the Pearson coefficient between $J$ and $Z$ . In the case of a pure jump process with $M$ states, this corresponds to optimizing $M-1$ parameters: since the Pearson coefficient is invariant under a global rescaling of $Z$ , we may set $z_{1}=1$ without loss of generality. In the case of a diffusion process, we may choose a reasonable set of (say $K$ ) basis functions $f_{k}(\bm{x})$ and write $z(\bm{x})=\sum_{k=1}^{K}c_{k}f_{k}(\bm{x})$ , which (setting $c_{1}=1$ ) again corresponds to optimizing $K-1$ parameters. Provided that $K$ and the data set are not too large, this explicit optimization is feasible and has the advantage of yielding the tightest possible bound within the accuracy of the optimization algorithm. However, as $K$ or the size of the data set increases, this procedure becomes increasingly infeasible, and heuristic bounds that are less tight but easier to compute may be desirable.
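When $Z$ depends linearly on the coefficients $c_{k}$ , one possible shortcut (a substitute for a generic numerical optimizer, not necessarily the procedure used for the results in this article) is ordinary least squares: the linear combination of the time-integrated basis observables that best fits $J$ also maximizes the sample Pearson coefficient. The following is a minimal sketch, assuming an array `F` whose columns are the time-integrals of the basis functions $f_{k}$ along each trajectory; all names are hypothetical.

```python
import numpy as np

def best_linear_state_observable(J, F):
    """Maximize the Pearson coefficient between J and Z = F @ c.

    J: shape (n_traj,), time-integrated current per trajectory.
    F: shape (n_traj, K), time-integrated basis functions per trajectory.
    Returns (c, chi, bound), where bound is the resulting lower bound
    on the entropy production according to Eq. (25).
    """
    J = np.asarray(J, dtype=float)
    F = np.asarray(F, dtype=float)
    # Least-squares fit of the centered current by the centered basis
    c, *_ = np.linalg.lstsq(F - F.mean(axis=0), J - J.mean(), rcond=None)
    Z = F @ c
    chi = np.corrcoef(J, Z)[0, 1]
    bound = 2.0 * J.mean() ** 2 / (J.var(ddof=1) * (1.0 - chi ** 2))
    return c, chi, bound
```

The coefficients are fixed only up to a common factor, so rescaling them such that $c_{1}=1$ does not change the bound. Evaluating the bound on the same data that were used for the fit may overestimate the correlation when $K$ is comparable to the number of trajectories.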

V.2 Approximate local mean velocity

Given that, in many physically relevant situations, the local mean current Eq. (30) is expected to give a reasonably tight bound, we may use an approximate expression $\tilde{\bm{\nu}}^{\text{st}}(\bm{x})$ for the local mean velocity to compute $Z$ . Such an approximate expression may be obtained directly from the trajectory data. Even if the measurement resolution is not sufficient to observe every transition, we may still obtain a coarse-grained mean velocity profile. For a spatial resolution $\Delta x$ and a temporal resolution $\Delta t$ , we may divide the observation volume into cells of size $\Delta x^{N}$ and obtain an estimate for the local mean velocity in a given cell by considering the displacement at time $t+\Delta t$ , averaged over all particles starting out in the cell at time $t$ . An approximate local mean velocity may also be obtained from theoretical considerations by considering a simplified version of the dynamics, in which the local mean velocity can be computed explicitly and is expected to resemble the one of the actual dynamics. A simplified model may be obtained, for example, by linearizing a given non-linear model or by constructing a lower-dimensional effective model. Even though both approaches are generally valid only in a certain limit, in many cases, the resulting local mean velocity still yields an improved bound via the CTUR even beyond the strict validity of the approximations. As a concrete example, we consider in the next section the flow through a two-dimensional channel, which we approximate by a one-dimensional flow in an effective potential, for which an explicit expression of the local mean velocity can be obtained.
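The cell-averaging recipe for a coarse-grained velocity profile can be sketched in one dimension as follows, assuming positions sampled at a fixed interval `dt` and stored as an array with one row per trajectory. The function names and the data layout are hypothetical, and the weighting function is taken as $w=1$ (displacement); for periodic systems the positions would have to be wrapped into one period before binning.

```python
import numpy as np

def coarse_grained_velocity(x, dt, edges):
    """Average displacement per unit time for samples starting in each cell.

    x: shape (n_traj, n_steps), positions sampled every dt.
    edges: increasing cell boundaries; cells without data get velocity 0.
    """
    x = np.asarray(x, dtype=float)
    start = x[:, :-1].ravel()              # position at time t
    disp = np.diff(x, axis=1).ravel()      # displacement until t + dt
    n_cells = len(edges) - 1
    cell = np.digitize(start, edges) - 1
    inside = (cell >= 0) & (cell < n_cells)
    counts = np.bincount(cell[inside], minlength=n_cells)
    sums = np.bincount(cell[inside], weights=disp[inside], minlength=n_cells)
    return np.divide(sums, counts * dt, out=np.zeros(n_cells), where=counts > 0)

def local_mean_displacement(x, dt, edges, v):
    """Time-integral of the coarse-grained velocity along each trajectory."""
    x = np.asarray(x, dtype=float)
    n_cells = len(edges) - 1
    cell = np.digitize(x[:, :-1], edges) - 1
    inside = (cell >= 0) & (cell < n_cells)
    # Samples outside the binned region contribute zero
    v_along = np.where(inside, v[np.clip(cell, 0, n_cells - 1)], 0.0)
    return v_along.sum(axis=1) * dt
```

The output of `local_mean_displacement` can then serve as the observable $Z$ in Eq. (25), with $J$ the measured displacement of each trajectory.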

Another application, for which we anticipate this approach to be useful, is for systems in the presence of spatial disorder. Since the disorder configuration is generally not known precisely, we also cannot obtain an explicit expression for the local mean velocity. In this case, we may compute the local mean velocity for the disorder-free system, and provided that the disorder is not too strong, we can still expect the CTUR with the corresponding observable $Z$ to provide a notable improvement over the TUR. For example, we can imagine a channel as in Fig. 5, through which a current is driven by an external bias. However, due to variances in the fabrication of the channel or the presence of contamination, the actual potential experienced by the particle may be more complicated. Nevertheless, if we can obtain an expression for the local mean velocity of the ideal situation, we can use it to obtain an improved estimate of the entropy production.

V.3 Inverse occupation fraction

Finally, we provide a concrete formula for obtaining a candidate for the observable $Z$. This relies on the observation that, in many systems, the local mean velocity at some position $\bm{x}$ and the probability of finding the particle at position $\bm{x}$ are approximately inversely proportional. This relation is exact for Langevin dynamics in a spatially periodic one-dimensional system; in this case, we have $\nu^{\text{st}}(x)p^{\text{st}}(x)=v^{\text{d}}/L$, where $v^{\text{d}}$ is the drift velocity and $L$ the period of the system. The same relation holds for Markov jump dynamics on a ring, $V^{\text{st}}_{i\pm 1,i}p_{i}^{\text{st}}=\pm\mathcal{J}/N$, where $\mathcal{J}$ is the total current on the ring.

However, even in higher-dimensional settings, we often expect the probability density $p^{\text{st}}(\bm{x})$ to be small where the magnitude of $\bm{\nu}^{\text{st}}(\bm{x})$ is large and vice versa. The reasoning for this relation is that, for ergodic dynamics, $p^{\text{st}}(\bm{x})d\bm{x}$ is equal to the temporal occupation fraction, i. e. the fraction of time that the particle resides in the volume element $d\bm{x}$ around $\bm{x}$ in the limit of long observation times. On the other hand, if the local mean velocity at $\bm{x}$ is large, this means that the particle tends to move at a large velocity at $\bm{x}$ and thus we expect it to only spend a short time in the volume element $d\bm{x}$ . Note that this qualitative argument may break down, e. g. near vortices in the local mean velocity, where particles may move fast yet still mostly reside in a small volume. However, since we only require an approximation to the local mean velocity to improve the TUR bound on the entropy production, using the time-integral of the inverse occupation fraction as the observable $Z$ is still expected to yield a notable improvement in most cases. Specifically, dividing the observation volume into $\mathcal{N}$ cells (for simplicity, we assume they have equal volume $v$ ) and the observation time into $\mathcal{M}$ steps, we define as the occupation number $\mathcal{K}_{i}$ the number of times that any trajectory takes a value in the $i$ -th cell. Then, we may define the observable $Z$ as

\displaystyle Z=\sum_{m=1}^{\mathcal{M}}\frac{1}{\mathcal{K}_{i(m)}},

(43)

where $i(m)$ is the cell the trajectory is located in at time step $m$ . Note that, while $Z$ is evaluated for each individual trajectory, $\mathcal{K}_{i}$ is computed from the entire set of trajectory data. We do not need to normalize the occupation number into an occupation fraction, since the overall normalization factor does not impact the Pearson coefficient. While the observable Eq. (43) may not yield the tightest bound in Eq. (25), it has the advantage that it may be computed using only the trajectory data without any additional input.
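As an illustration of Eq. (43), the following minimal sketch computes the occupation numbers from the entire data set and then evaluates $Z$ for each trajectory. It assumes that the trajectories have already been discretized into integer cell indices $i(m)$; the function name and the array layout are hypothetical.

```python
import numpy as np

def inverse_occupation_observable(cells, n_cells):
    """Evaluate Eq. (43) for every trajectory.

    cells   : integer array (n_traj, n_steps), cell index i(m) of each
              trajectory at each time step.
    n_cells : total number of cells.
    Returns Z (one value per trajectory) and the occupation numbers K.
    """
    cells = np.asarray(cells)
    # Occupation numbers are computed from the entire data set.
    K = np.bincount(cells.ravel(), minlength=n_cells)
    # Every cell appearing in the data has K >= 1, so no division by zero.
    Z = (1.0 / K[cells]).sum(axis=1)
    return Z, K
```

With this definition, each visited cell contributes exactly one to the sum of $Z$ over all trajectories, so that this sum equals the number of visited cells.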

Finally, we remark upon a possible issue with the definition Eq. (43). If the phase-space volume $\mathcal{V}$ that is accessible to the system is not finite, then it becomes necessary to introduce a lower cutoff on the occupation number $\mathcal{K}_{i}$ . To see this, note that for an ergodic system in the limit of long time and/or a large number of trajectories, we have $\mathcal{K}_{i}\propto p^{\text{st}}(\bm{x}_{i})v$ , where $\bm{x}_{i}$ is the location of the $i$ -th cell and $v$ is the cell volume. Then,

\displaystyle\langle Z\rangle\propto\int_{\mathcal{V}}d\bm{x}\ \frac{1}{p^{\text{st}}(\bm{x})}p^{\text{st}}(\bm{x})=\mathcal{V}.

(44)

So the average of $Z$ is proportional to the volume $\mathcal{V}$ and diverges if the latter becomes infinite. For example, if $p^{\text{st}}(\bm{x})\propto\exp(-\|\bm{x}\|^{2}/(2r_{0}))$, then $1/p^{\text{st}}(\bm{x})$ grows exponentially for large $\|\bm{x}\|$ and its average is not well-defined. Intuitively, this means that Eq. (43) is dominated by rare events, which are never well sampled no matter how many trajectories we record. To avoid this problem, we may introduce a lower cutoff on $\mathcal{K}_{i}$, that is, we only count cells that have been visited in at least a fraction $\epsilon$ of all data points. In terms of the probability density, this means that we discard all points with $p^{\text{st}}(\bm{x})<\frac{\epsilon}{v}$. In doing so, the average of $Z$ becomes finite and we may use the resulting observable as a qualitative estimate for the local mean velocity. The resulting quantity is only an approximation to the actual occupation fraction; however, we may still expect it to yield a useful improvement over the TUR, provided that $\epsilon$ is sufficiently small and the statistics of the current are not dominated by rare events. This lower cutoff also avoids another possible issue: In Eq. (43), we use the same data set to evaluate the occupation fraction and the trajectory-dependent observable $Z$. While this is not problematic for a large number of trajectories, since the influence of any single trajectory on $\mathcal{K}_{i}$ is small, it may lead to unintended correlations if the number of trajectories is not large. In the latter case, it may be preferable to divide the data set into two parts, using one to determine $\mathcal{K}_{i}$ and the other to compute $Z$. In doing so, we may encounter the situation where a given cell is visited in the second part but not the first part of the data set. In this case, Eq. (43) is no longer well defined since we divide by zero. Introducing a lower cutoff excludes this possibility.
However, we stress that even if $\mathcal{K}_{i}$ and $Z$ are computed from the same data set, Eq. (3) remains valid, so the worst outcome of unintended correlations is to reduce the quality of the estimate on the entropy production.
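The cutoff and the splitting of the data set can be combined in a short routine. The sketch below is one possible reading of the procedure, in which cells below the cutoff simply contribute zero to $Z$; the function name, the argument names and this treatment of discarded cells are assumptions made for illustration.

```python
import numpy as np

def inverse_occupation_with_cutoff(cells_ref, cells_eval, n_cells, eps):
    """Inverse occupation observable with a lower cutoff on K_i.

    cells_ref  : integer array of cell indices used to determine K_i.
    cells_eval : integer array (n_traj, n_steps) on which Z is evaluated.
    eps        : minimal fraction of reference data points per cell.
    Cells below the cutoff (or never visited) contribute zero to Z.
    """
    cells_ref = np.asarray(cells_ref)
    cells_eval = np.asarray(cells_eval)
    K = np.bincount(cells_ref.ravel(), minlength=n_cells)
    # Keep only cells visited in at least a fraction eps of all data points.
    keep = (K >= eps * cells_ref.size) & (K > 0)
    weights = np.zeros(n_cells)
    weights[keep] = 1.0 / K[keep]
    return weights[cells_eval].sum(axis=1)
```

Passing the same array as `cells_ref` and `cells_eval` corresponds to using a single data set for both the occupation numbers and $Z$.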

VI Demonstrations

VI.1 Molecular motor model

To demonstrate how Eq. (25) may be used to obtain a tight bound on the entropy production, we consider the model for the $\text{F}_{1}$-ATPase molecular motor introduced in Ref. [21] and further studied in Ref. [22]. This model describes the motion of a probe bead coupled to a rotating molecular motor, which either consumes a molecule called ATP in order to generate rotational motion, or, conversely, generates ATP when driven by an external torque. The probe is considered to be trapped inside a potential $U_{i}(x)$, which is determined by both the joint between the probe and the motor and the internal structure of the motor. As the motor rotates in steps of length $L$, the potential depends on the current state of the motor as $U_{i}(x)=U_{0}(x-iL)$. In the simplest form of the model, the trapping potential is harmonic, $U_{0}(x)=kx^{2}/2$, and the motor rotates in steps of $L=120^{\circ}$. The transitions between the states of the motor are described by the position-dependent transition rates

	$\displaystyle W_{i}^{+}$	$\displaystyle=W_{0}\exp\Big{[}\frac{\alpha}{k_{\text{B}}T}\big{(}U_{i}(x)-U_{i+1}(x)+\Delta\mu\big{)}\Big{]},$		(45)
	$\displaystyle W_{i+1}^{-}$	$\displaystyle=W_{0}\exp\Big{[}-\frac{1-\alpha}{k_{\text{B}}T}\big{(}U_{i}(x)-U_{i+1}(x)+\Delta\mu\big{)}\Big{]},$

where $W_{i}^{+}$ ($W_{i+1}^{-}$) is the rate of transitions from $i$ to $i+1$ (from $i+1$ to $i$). Here, $W_{0}$ quantifies the overall activity of the motor, which is proportional to the concentration of ATP in the environment of the motor. $\Delta\mu$ is the chemical potential difference driving the rotation; it is the amount of energy gained by the motor when consuming a single molecule of ATP. The parameter $0\leq\alpha\leq 1$ characterizes the asymmetry of the position-dependence of the rates. The spatial motion of the probe is described by one-dimensional Brownian motion in the potential $U_{i}(x)$. In addition, an external torque may be applied to the probe, which enters the equation of motion in the form of a non-conservative bias force $F$. As a consequence, the drift coefficient in Eq. (15) is given by $a_{i}(x)=(-U^{\prime}_{i}(x)-F)/\gamma$, where $\gamma$ is the friction coefficient, while the diffusion matrix is $G=\sqrt{2k_{\text{B}}T/\gamma}$. An experimentally accessible current is the total displacement of the probe,

\displaystyle J=\int_{0}^{\tau}dt\ \dot{x}(t).

(46)

In the steady state, we have $\langle J\rangle=v^{\text{d}}\tau$, where $v^{\text{d}}$ is the drift velocity. Because the system is effectively one-dimensional, the local mean velocity in the steady state is given by $\nu^{\text{st}}(x)p^{\text{st}}(x)=v^{\text{d}}/L$, where $p^{\text{st}}(x)$ is the $L$-periodic steady state probability density. In this case, we can thus reconstruct the local mean velocity from the trajectory data of the probe by taking a histogram of the probe positions. However, it should be noted that this local mean velocity is a coarse-grained quantity; the true local mean velocity entering Eq. (16) also depends on the state $i$ of the motor. We define the observable

\displaystyle Z=\int_{0}^{\tau}dt\ \frac{1}{p^{\text{st}}(x(t))},

(47)

which is proportional to $\bar{J}$. Since the proportionality factor cancels in the Pearson coefficient, $Z$ and $\bar{J}$ are equivalent with respect to Eq. (25). To assess the tightness of the various inequalities, we introduce the transport efficiencies

	$\displaystyle\eta_{J}$	$\displaystyle=\frac{2\langle J\rangle^{2}}{\text{Var}_{J}\Delta S^{\text{irr}}},$		(48)
	$\displaystyle\eta_{J,\bar{J}}$	$\displaystyle=\frac{2\langle J\rangle^{2}}{\text{Var}_{J}\Delta S^{\text{irr}}\big{(}1-{\chi_{J,\bar{J}}}^{2}\big{)}}.$

Both of these quantities are smaller than unity and measure the magnitude of the average transport relative to its fluctuations and the dissipation.
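When the entropy production is not known, the same expressions can be read as lower bounds on $\Delta S^{\text{irr}}$, obtained by setting the respective efficiency in Eq. (48) to unity. The following minimal sketch evaluates both estimates from sampled values of the current and of a state-dependent observable; the function name is hypothetical, and the entropy production is understood in units of $k_{\text{B}}$.

```python
import numpy as np

def entropy_production_bounds(J, Z):
    """Lower bounds on the entropy production from trajectory samples.

    J : array (n_traj,), current evaluated on each trajectory.
    Z : array (n_traj,), state-dependent observable on each trajectory.
    Returns the TUR estimate 2<J>^2/Var_J, the correlation-improved
    estimate 2<J>^2/(Var_J (1 - chi^2)) and the Pearson coefficient chi.
    """
    J = np.asarray(J, dtype=float)
    Z = np.asarray(Z, dtype=float)
    chi = np.corrcoef(J, Z)[0, 1]           # Pearson coefficient of J and Z
    tur = 2.0 * J.mean() ** 2 / J.var()     # estimate from the current alone
    ctur = tur / (1.0 - chi ** 2)           # estimate including correlations
    return tur, ctur, chi
```

Dividing either estimate by the true entropy production, where available, yields the corresponding efficiency in Eq. (48).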

Figure 1: The transport efficiency Eq. (48) for the displacement of the probe attached to the molecular motor as a function of different parameters. The black circles correspond to the TUR for the displacement $z$ only, while the solid orange squares show the CTUR Eq. (25) including the correlations between the displacement and its local mean value $\bar{z}$. The data are obtained by numerical simulations of Eq. (15) with $W_{0}\tau_{\text{v}}=10$, $U_{0}=50k_{\text{B}}T$, $\Delta\mu=19k_{\text{B}}T$, $\alpha=0.1$, $\gamma=2.5\cdot 10^{3}$ and $F=0$, except where noted differently. (Top left) As a function of the base activity $W_{0}$. The horizontal axis is scaled by the timescale $\tau_{\text{v}}=\gamma k_{\text{B}}T/(kL)^{2}$ [22]. (Top right) As a function of the chemical potential difference $\Delta\mu$. (Bottom left) As a function of the external load $F$. The empty orange squares are Eq. (25) with a numerically optimized observable, see Eq. (49). (Bottom right) As a function of the asymmetry parameter $\alpha$.

The efficiencies Eq. (48) are shown for the molecular motor model in Fig. 1 as a function of various parameters. The top-left panel shows $\eta$ as a function of the base activity $W_{0}$ , which corresponds to the concentration of ATP in the experiment. For small activity both $\eta_{J}$ and $\eta_{J,\bar{J}}$ are comparable and small; in this limit, the transitions between the different motor conformations are not translated efficiently into motion and the dissipation is not reflected in the motion of the probe [22]. For large activity, $\eta_{J}$ saturates at a value of around $0.4$ . However, when we compute $\eta_{J,\bar{J}}$ in this regime, we find that it saturates at a value close to unity, i. e. the maximum possible value. The top-right panel shows $\eta$ as a function of the chemical potential difference $\Delta\mu$ . While this value cannot be readily changed in experiment, it yields important insight into the nature of the bound Eq. (25). For small $\Delta\mu$ the system is almost in equilibrium, and both the TUR and CTUR are close to an equality. However, as we drive the system out of equilibrium, the TUR ratio quickly drops, while the CTUR remains close to unity. This suggests that, while the TUR is generically only saturated close to equilibrium [33, 4], the improved bound from the CTUR Eq. (25) can yield an accurate estimate of the entropy production even far from equilibrium. The bottom-left panel shows $\eta$ as a function of the external load force. Close to the stall condition $FL=\Delta\mu$ neither of the bounds is tight. This is reasonable, since when the motor stalls, also the probe stops moving, while the motor keeps changing its conformation and thus dissipating energy. Interestingly, both bounds are close to unity slightly above the stall condition, i. e. when the external load is just strong enough to turn the motor in the opposite direction. 
Away from the stall condition, we again find that the TUR is rather loose, while the CTUR remains tight. Note that in this panel we also included the results obtained by optimizing the observable $Z$ . Specifically, we write

\displaystyle\tilde{Z}=\int_{0}^{\tau}dt\sum_{k=1}^{K}\bigg{(}a_{k}\sin\Big{(}\frac{2\pi kx(t)}{L}\Big{)}+b_{k}\cos\Big{(}\frac{2\pi kx(t)}{L}\Big{)}\bigg{)}

(49)

and then numerically optimize the parameters $a_{k}$, $b_{k}$ such that $\chi(J,\tilde{Z})^{2}$ is maximal for the given trajectory data. Here we use $K=10$; further increasing the number of parameters provides no notable improvement. The numerical optimization is done using Mathematica’s NMaximize command. The results are shown as the empty orange squares in the bottom-left panel of Fig. 1. As can be seen, the observable $\bar{Z}$ is not truly optimal, so that some improvement of the lower bound on entropy production is possible. However, the heuristic choice $\bar{Z}$ already provides a useful estimate without the need for any parameter optimization. Finally, the bottom-right panel shows $\eta$ as a function of the asymmetry parameter $\alpha$. As was shown in Ref. [22], the motor can operate without internal dissipation close to $\alpha=0$, whereas other values of $\alpha$ result in a finite amount of internal dissipation. Since this difference is most pronounced at low activity, we choose $W_{0}$ such that the velocity remains constant at $v=0.65v_{\text{max}}$ for all values of $\alpha$; this corresponds to $W_{0}\tau_{v}\approx 10^{-2}$ in the top-left panel. While neither bound is saturated in this regime of low activity, Eq. (25) does yield a considerable improvement over the TUR. Interestingly, neither bound shows a pronounced dependence on $\alpha$, which indicates that their tightness, at least for this model, is not related to the amount of internal vs. external dissipation.

The molecular motor model is essentially one-dimensional: Since the energy scale of the coupling between the probe particle and the motor is large compared to the temperature, the position of the probe and the state of the motor are tightly coupled, so measuring the displacement of the probe is almost equivalent to measuring the internal transitions of the motor. Indeed, considering the internal transitions of the motor as an additional current $J_{2}$, Eq. (21) yields no notable improvement over the TUR. Because of this, it is reasonable that the entropy production can be estimated accurately from a measurement of the probe position. However, this tight coupling breaks down for slow switching (see the top-left panel of Fig. 1): In this regime, while the asymmetry in the transitions between the motor states still gives rise to entropy production, these transitions increasingly fail to produce a directed motion of the probe and thus its displacement can no longer be used to obtain an accurate estimate of entropy production.

In summary, we find that the TUR is generally not tight for this model, which mirrors the behavior in other types of molecular motors [14]. Viewed on its own, this would suggest that $\text{F}_{1}$-ATPase is not efficient at saturating the bound set by the TUR. When we take into account the correlations between the velocity and its local mean value via the CTUR, the resulting inequality is almost saturated. Thus, in reality, the motor operates close to the limit permitted by the thermodynamic bound in the biologically relevant parameter regime. In contrast to the TUR, which represents a trade-off between dissipation and precision, the biological meaning of saturating the CTUR is not so clear. We note that, in the case of a particle moving in a flat potential, where the local mean velocity is independent of space, the TUR and CTUR are equivalent and both saturated. The presence of a spatially varying local mean velocity generally increases the amount of fluctuations in the current and also induces correlations between the former and the latter. As a consequence, the TUR becomes less tight and we explicitly need to take into account the correlations to counter this. However, in the case of the $\text{F}_{1}$-motor, the spatial inhomogeneity of the dynamics is also an integral part of the mechanism creating the motion of the motor in the first place. In that sense, the CTUR may be interpreted as a trade-off between dissipation, precision and the spatial structure necessary for the motion of the motor.

VI.2 Two-cycle Markov jump process

An obvious question is how the CTUR fares in the presence of multiple independent currents. To investigate this issue, we introduce a simple Markov jump model consisting of $N=10$ sites forming two loops with a common edge, see Fig. 2. While such two-cycle models are also used in modeling the chemical state space of molecular motors [34], we use a simpler setup in order to better understand its behavior in relation to the CTUR.

We parameterize the transition rates between connected states as

\displaystyle W_{ij}=\exp\bigg{[}\frac{\beta}{2}\big{(}E_{i}-E_{j}\pm f\big{)}\bigg{]},

(50)

where $\beta$ is the inverse temperature, $E_{i}$ the energy of state $i$ and $f$ is the bias, with the choice $+f$ corresponding to counter-clockwise and $-f$ to clockwise transitions. As indicated in Fig. 2, we keep the bias of the right cycle fixed at a value $f_{0}$ while varying the bias $f$ in the left cycle; the bias on the shared link $5\rightarrow 6$ is set to $f-f_{0}$ . As the current $J_{\text{L}}$ , we choose the number of counterclockwise transitions through the link $2\rightarrow 3$ (indicated in red in Fig. 2), i. e. $\omega_{32}=1=-\omega_{23}$ and $\omega_{ij}=0$ otherwise in Eq. (17).

As the state-dependent observable $Z_{\text{L}}$ , we choose the inverse of the steady state occupation fraction in the left ring, $1/p^{\text{st}}_{i|L}$ , which corresponds to measuring the ratio of the time spent in state $i$ and the time spent in the left ring. The motivation behind this choice is that, even though the system no longer consists of a single cycle, we still expect the occupation fraction to be approximately inversely proportional to the current through the site, as argued in Section V.3. This observable can be measured by observing only the states in the left ring, for example if the right ring cannot be directly observed. As in the previous example, we define the ratios

	$\displaystyle\eta_{J}$	$\displaystyle=\frac{2\langle J\rangle^{2}}{\text{Var}_{J}\Delta S^{\text{irr}}},$		(51)
	$\displaystyle\eta_{J,Z}$	$\displaystyle=\frac{2\langle J\rangle^{2}}{\text{Var}_{J}\Delta S^{\text{irr}}\big{(}1-\chi_{J,Z}^{2}\big{)}},$

which are shown in Fig. 3. In the absence of bias, $f=0$, the average current $J_{\text{L}}$ in the left ring vanishes and thus neither the TUR nor the CTUR gives us a non-trivial estimate of the entropy production. The contribution of $J_{\text{L}}$ to the total entropy production increases with increasing bias $f$ and thus we can get a non-trivial lower bound on the entropy production. Quantitatively, we observe that the estimate provided by the CTUR is considerably more accurate than the TUR, in particular for larger bias. This finding is independent of the precise potential landscape; indeed, the bound obtained from the CTUR for flat and random configurations of energies $E_{i}$ is qualitatively similar. However, as noted in Eq. (35), in the case of a Markov jump process, we are actually estimating the pseudo entropy production $R$, Eq. (36). Since the latter is smaller than the entropy production $\Delta S^{\text{irr}}$ for large bias (i. e. asymmetry between forward and reverse transitions), the estimate becomes worse when we increase the bias further, even though most of the entropy production in the system now stems from the left ring. To quantify the difference between $R$ and $\Delta S^{\text{irr}}$, we also consider the modified efficiencies

	$\displaystyle\tilde{\eta}_{J}$	$\displaystyle=\frac{2\langle J\rangle^{2}}{\text{Var}_{J}R},$		(52)
	$\displaystyle\tilde{\eta}_{J,Z}$	$\displaystyle=\frac{2\langle J\rangle^{2}}{\text{Var}_{J}R\big{(}1-\chi_{J,Z}^{2}\big{)}}.$

As can be seen from Fig. 3, the CTUR saturates this bound in the limit of large bias; thus, we can accurately estimate the pseudo entropy production from a measurement of the current and the occupation probabilities in the left ring. By contrast, just as for the molecular motor model, the TUR estimate becomes less accurate in the limit of large bias.

In the case of small bias, it is expected that the (pseudo) entropy production cannot be estimated accurately, since it mostly originates from the right ring, and is not reflected in the current through the left ring. Thus, at most, we can estimate the contribution to the (pseudo) entropy production that originates from the left ring. To illustrate this, we compare the estimate of $R$ obtained from the TUR and CTUR with the pseudo entropy production $R_{\text{L}}$ of only the left ring, i. e. the system in Fig. 2 with $f_{0}=0$. There is no rigorous inequality between the two, since the bias in the right ring may affect the occupation probabilities and currents in the left ring. However, we do observe that the CTUR for the current in the left ring almost exactly reproduces $R_{\text{L}}$. This is an example of a more general statement: While the CTUR can potentially provide an accurate estimate of the (pseudo) entropy production, it can only capture the contributions to entropy production that correspond to the chosen current observable.

VI.3 Transport in a two-dimensional channel

Both models in the previous examples have a very simple geometry: transitions only occur between neighboring sites along one-dimensional structures. To illustrate the usefulness of the CTUR also in more complicated situations, we study the transport of a Brownian particle in a two-dimensional, soft-walled channel described by the potential

\displaystyle U(x,y)=\frac{k}{2}\Big{(}1+\alpha\sin(\lambda x)\Big{)}y^{2}.

(53)

Along the transverse $y$ -direction, the particle is confined in a parabolic trap, whose strength is periodically modulated in the longitudinal $x$ -direction, see Fig. 5. The period $L$ of the modulation is described by the wave number $\lambda=2\pi/L$ , its amplitude by the parameter $0\leq\alpha<1$ , where the condition $\alpha<1$ ensures that the potential remains trapping in the transverse direction. The longitudinal motion in the potential Eq. (53) is qualitatively similar to the motion in a one-dimensional periodic potential. This analogy can be made explicit by assuming that the relaxation in the transverse direction is fast compared to the longitudinal motion, so that the transverse coordinate is distributed according to the equilibrium distribution,

\displaystyle p^{\text{eq}}(y|x)=\frac{1}{Z(x)}\exp\bigg{[}-\frac{U(x,y)}{k_{\text{B}}T}\bigg{]},

(54)

where $p^{\text{eq}}(y|x)$ is the conditional probability density of finding the particle at transverse coordinate $y$ given the longitudinal position $x$ and $Z(x)$ is the normalizing partition function. This assumption is justified for a relatively narrow channel and in the absence of bias. If it holds, we can describe the motion in the longitudinal direction via an effective one-dimensional Langevin dynamics

\displaystyle\dot{x}(t)=-\mu\partial_{x}\tilde{U}(x)+\sqrt{2\mu k_{\text{B}}T}\xi(t),

(55)

with the potential

\displaystyle\tilde{U}(x)=-k_{\text{B}}T\ln\bigg{(}\int_{-\infty}^{\infty}dy\ e^{-\frac{U(x,y)}{k_{\text{B}}T}}\bigg{)}.

(56)

For the form Eq. (53) of the potential, the integral can be evaluated explicitly and we find

\displaystyle\tilde{U}(x)=\frac{k_{\text{B}}T}{2}\ln\Big{(}1+\alpha\sin(\lambda x)\Big{)}.

(57)
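As a consistency check, the closed form Eq. (57) can be compared with a direct numerical evaluation of the integral in Eq. (56). The two agree up to the $x$-independent constant $-\frac{k_{\text{B}}T}{2}\ln(2\pi k_{\text{B}}T/k)$, which does not affect the force. The sketch below performs this comparison on a grid; the parameter values, the integration range and the function names are arbitrary choices made for illustration.

```python
import numpy as np

def effective_potential_numeric(x, k=1.0, alpha=0.5, lam=2 * np.pi, kT=1.0,
                                y_max=20.0, n_y=4001):
    """Evaluate Eq. (56) by summing over a uniform grid in y."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    y = np.linspace(-y_max, y_max, n_y)
    dy = y[1] - y[0]
    # Channel potential Eq. (53) on the (x, y) grid.
    U = 0.5 * k * (1.0 + alpha * np.sin(lam * x))[:, None] * y[None, :] ** 2
    integral = np.exp(-U / kT).sum(axis=1) * dy
    return -kT * np.log(integral)

def effective_potential_closed(x, alpha=0.5, lam=2 * np.pi, kT=1.0):
    """Closed form Eq. (57), defined up to an additive constant."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return 0.5 * kT * np.log(1.0 + alpha * np.sin(lam * x))
```

The integration range must be large compared to the transverse width $\sqrt{k_{\text{B}}T/[k(1-\alpha)]}$ of the widest part of the channel for the numerical result to be accurate.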

The effective potential governing the motion in the longitudinal direction appears only due to diffusion in the transverse direction—in the center of the channel, the potential is flat—and is thus proportional to the temperature, rather than the depth of the channel potential. If we assume Eq. (55) to describe the dynamics in the longitudinal direction, then we can use it to compute the local mean velocity in the presence of a constant bias $F_{0}$ ,

\displaystyle\tilde{\nu}(x)=\mu k_{\text{B}}T\big{(}1-e^{-\frac{F_{0}L}{k_{\text{B}}T}}\big{)}\frac{e^{\frac{\tilde{U}(x)-F_{0}x}{k_{\text{B}}T}}}{\int_{x}^{x+L}dz\ e^{\frac{\tilde{U}(z)-F_{0}z}{k_{\text{B}}T}}}.

(58)
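Equation (58) is straightforward to evaluate numerically. The following minimal sketch does so with a midpoint rule over one period, using the effective potential Eq. (57) and writing `kT` for the thermal energy $k_{\text{B}}T$; the function name, the default parameter values and the number of quadrature points are assumptions made for illustration.

```python
import numpy as np

def approx_local_mean_velocity(x, F0, alpha=0.5, L=1.0, mu=1.0, kT=1.0, n=2000):
    """Evaluate the approximate local mean velocity Eq. (58)."""
    lam = 2.0 * np.pi / L
    x = np.atleast_1d(np.asarray(x, dtype=float))

    def tilted(u):
        # Effective potential Eq. (57) minus the work done by the bias F0.
        return 0.5 * kT * np.log(1.0 + alpha * np.sin(lam * u)) - F0 * u

    s = (np.arange(n) + 0.5) * L / n      # midpoints covering one period
    z = x[:, None] + s[None, :]
    # Integral of exp[(V(z) - V(x)) / kT] over z in [x, x + L]; subtracting
    # V(x) inside the exponent avoids overflow for large bias.
    integral = np.exp((tilted(z) - tilted(x)[:, None]) / kT).sum(axis=1) * (L / n)
    return mu * kT * (1.0 - np.exp(-F0 * L / kT)) / integral
```

For $\alpha=0$ the channel is uniform and the expression reduces to the constant drift velocity $\mu F_{0}$, which provides a simple check of the implementation.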

We remark that this expression is only an approximation, since, strictly speaking, the assumption of a transverse equilibrium distribution Eq. (54) breaks down in the presence of a bias. Thus, the true local mean velocity has to be computed using Eq. (53) and depends on both the longitudinal and transverse coordinate. However, the solution of the two-dimensional problem is considerably harder and does not permit a closed-form expression. Thus, we use Eq. (58) as an approximate expression and the corresponding observable

\displaystyle Z=\int_{0}^{\tau}dt\ \tilde{\nu}(x(t))

(59)

to compute the CTUR for the current given by the displacement in the longitudinal direction. Note that, in principle, we may obtain even simpler approximations for the local mean velocity. Since we expect the dynamics to be described by an effectively one-dimensional motion, the local mean velocity and probability density are related by $\nu^{\text{st}}(x)\propto 1/p^{\text{st}}(x)$ , see the discussion in Section V.3. Provided that the bias is not too strong, we may further expect the probability density to be qualitatively similar to the equilibrium density. Thus, a crude approximation of the local mean velocity is given by

\displaystyle\hat{\nu}(x)=\exp\Bigg{[}\frac{U\big{(}x,\sqrt{\frac{k_{\text{B}}T}{k}}\big{)}}{k_{\text{B}}T}\Bigg{]},

(60)

where we evaluated the potential at $y=\sqrt{k_{\text{B}}T/k}$ , which is a measure of the average distance of the particle from the center of the channel. The resulting bounds Eq. (48) obtained from the TUR and CTUR are shown in Fig. 6. For small bias and thus close to equilibrium, both the TUR and CTUR are close to an equality [33]. As we drive the system further away from equilibrium by increasing the bias, the TUR ratio decreases sharply. In this regime, the CTUR with either approximate local mean velocity yields a tighter bound than the TUR and remains close to an equality. For even stronger bias, the equilibrium argument behind the approximate expressions Eq. (58) and Eq. (60) as well as the one-dimensional approximation become increasingly invalid, and thus the tightness of the CTUR likewise starts to decrease. However, we stress that for moderately strong bias, the estimate on the entropy production from the CTUR with Eq. (58) is around $80\%$ of the true value, which is considerably tighter than the TUR estimate of less than $20\%$ .

VII Discussion

In this article, we have shown how to improve the TUR by taking into account the correlations between current and non-current observables and that the resulting inequality can yield a much improved estimate of entropy production. We remark that Eq. (14) is more general: It allows us to improve any bound provided by the FRI, provided that we can find a quantity whose average is invariant under a suitable perturbation of the dynamics. This is a manifestation of the monotonicity of information established in Eq. (12). Since many generalizations of the TUR may be derived from the FRI [35, 18, 19, 20, 17], these generalizations can be improved in a completely analogous manner, by exploiting the existence of symmetries and conserved quantities. For example, in Ref. [18], the TUR was generalized from steady states to systems in the presence of time-periodic driving. In this case, the perturbation that leads to the TUR is a change in the driving frequency, and thus any non-current observable that is independent of the driving frequency can be used to obtain a tighter bound, similar to Eq. (25).

In the presence of several currents, it is only possible to estimate the partial entropy production corresponding to the measured current [36, 6], see also the discussion in Section VI.2. In such situations, measuring several currents and state-dependent observables is required to obtain a good estimate of the entropy production using Eq. (21). We note that, since the derivation of Ref. [36] involves minimizing the entropy production constrained on the measured current while keeping the steady state probability fixed, the CTUR extends in a straightforward manner to the tighter bounds involving the partial entropy production. In general, caution is needed when interpreting the results of the estimation of entropy production for complex systems. This is particularly true for biological systems, in which there are typically many out-of-equilibrium processes that do not contribute to any directly accessible observable, yet still result in (considerable) contributions to entropy production. For example, observing the motion of a biological cell can yield an accurate estimate of the entropy production corresponding to its mechanical motion, but it should not be interpreted as the entropy production of the cell, which involves a multitude of different chemical processes.

Finally, the fact that the model of $\text{F}_{1}$ -ATPase is close to saturating the CTUR at biological conditions poses the question of whether this finding extends to models of other molecular motors. In the light of interpreting Eq. (25) as a transport efficiency, Eq. (48), this appears reasonable: Achieving precise transport at minimal dissipation would obviously be advantageous for any machine, whether artificial or naturally occurring. We leave this issue for future research.

Acknowledgements.

This work was supported by KAKENHI (Nos. 17H01148, 19H05795 and 20K20425).

References

Sekimoto [2010] K. Sekimoto, Stochastic Energetics, Lecture Notes in Physics (Springer Berlin Heidelberg, 2010).
Seifert [2012] U. Seifert, Stochastic thermodynamics, fluctuation theorems and molecular machines, Rep. Prog. Phys. 75, 126001 (2012).
Harada and Sasa [2005] T. Harada and S.-i. Sasa, Equality connecting energy dissipation with a violation of the fluctuation-response relation, Phys. Rev. Lett. 95, 130602 (2005).
Li et al. [2019] J. Li, J. M. Horowitz, T. R. Gingrich, and N. Fakhri, Quantifying dissipation using fluctuating currents, Nature Comm. 10, 1 (2019).
Manikandan et al. [2020] S. K. Manikandan, D. Gupta, and S. Krishnamurthy, Inferring entropy production from short experiments, Phys. Rev. Lett. 124, 120603 (2020).
Otsubo et al. [2020] S. Otsubo, S. Ito, A. Dechant, and T. Sagawa, Estimating entropy production by machine learning of short-time fluctuating currents, Phys. Rev. E 101, 062106 (2020).
Van Vu et al. [2020] T. Van Vu, V. T. Vo, and Y. Hasegawa, Entropy production estimation with optimal current, Phys. Rev. E 101, 042138 (2020).
Horowitz and Gingrich [2020] J. M. Horowitz and T. R. Gingrich, Thermodynamic uncertainty relations constrain non-equilibrium fluctuations, Nature Phys. 16, 15 (2020).
Barato and Seifert [2015] A. C. Barato and U. Seifert, Thermodynamic uncertainty relation for biomolecular processes, Phys. Rev. Lett. 114, 158101 (2015).
Gingrich et al. [2016] T. R. Gingrich, J. M. Horowitz, N. Perunov, and J. L. England, Dissipation bounds all steady-state current fluctuations, Phys. Rev. Lett. 116, 120601 (2016).
Dechant and Sasa [2018a] A. Dechant and S.-i. Sasa, Current fluctuations and transport efficiency for general Langevin systems, J. Stat. Mech. Theory E. 2018, 063209 (2018a).
Pietzonka et al. [2017] P. Pietzonka, F. Ritort, and U. Seifert, Finite-time generalization of the thermodynamic uncertainty relation, Phys. Rev. E 96, 012101 (2017).
Dechant and Sasa [2020] A. Dechant and S.-i. Sasa, Continuous time-reversal and equality in the thermodynamic uncertainty relation (2020), arXiv:2010.14769 [cond-mat.stat-mech].
Hwang and Hyeon [2018] W. Hwang and C. Hyeon, Energetic costs, precision, and transport efficiency of molecular motors, J. Phys. Chem. Lett. 9, 513 (2018).
Dechant and Sasa [2018b] A. Dechant and S.-i. Sasa, Entropic bounds on currents in Langevin systems, Phys. Rev. E 97, 062101 (2018b).
Pal et al. [2020] S. Pal, S. Saryal, D. Segal, T. S. Mahesh, and B. K. Agarwalla, Experimental study of the thermodynamic uncertainty relation, Phys. Rev. Research 2, 022044(R) (2020).
Liu et al. [2020] K. Liu, Z. Gong, and M. Ueda, Thermodynamic uncertainty relation for arbitrary initial states, Phys. Rev. Lett. 125, 140602 (2020).
Koyuk and Seifert [2019] T. Koyuk and U. Seifert, Operationally accessible bounds on fluctuations and entropy production in periodically driven systems, Phys. Rev. Lett. 122, 230601 (2019).
Koyuk and Seifert [2020] T. Koyuk and U. Seifert, Thermodynamic uncertainty relation for time-dependent driving, Phys. Rev. Lett. 125, 260604 (2020).
Van Vu and Hasegawa [2020] T. Van Vu and Y. Hasegawa, Thermodynamic uncertainty relations under arbitrary control protocols, Phys. Rev. Research 2, 013060 (2020).
Zimmermann and Seifert [2012] E. Zimmermann and U. Seifert, Efficiencies of a molecular motor: a generic hybrid model applied to the F1-ATPase, New J. Phys. 14, 103023 (2012).
Kawaguchi et al. [2014] K. Kawaguchi, S.-i. Sasa, and T. Sagawa, Nonequilibrium dissipation-free transport in F1-ATPase and the thermodynamic role of asymmetric allosterism, Biophys. J. 106, 2450 (2014).
Dechant and Sasa [2020] A. Dechant and S.-i. Sasa, Fluctuation–response inequality out of equilibrium, Proc. Natl. Acad. Sci. 117, 6430 (2020).
Dechant [2018] A. Dechant, Multidimensional thermodynamic uncertainty relations, J. Phys. A Math. Theor. 52, 035001 (2018).
Radhakrishna Rao [1945] C. Radhakrishna Rao, Information and the accuracy attainable in the estimation of statistical parameters, Bull. Calcutta Math. Soc. 37, 81 (1945).
Cramér [2016] H. Cramér, Mathematical methods of statistics, Vol. 9 (Princeton university press, 2016).
Chetrite and Touchette [2015] R. Chetrite and H. Touchette, Nonequilibrium Markov processes conditioned on large deviations, Ann. Henri Poincaré 16, 2005 (2015).
Sasa [2014] S.-i. Sasa, Possible extended forms of thermodynamic entropy, J. Stat. Mech. Theory E. 2014, P01004 (2014).
Hasegawa and Van Vu [2019] Y. Hasegawa and T. Van Vu, Uncertainty relations in stochastic processes: An information inequality approach, Phys. Rev. E 99, 062126 (2019).
Pigolotti et al. [2017] S. Pigolotti, I. Neri, E. Roldán, and F. Jülicher, Generic properties of stochastic entropy production, Phys. Rev. Lett. 119, 140604 (2017).
Shiraishi [2021] N. Shiraishi, Optimal thermodynamic uncertainty relation in Markov jump processes (2021), arXiv:2106.11634 [cond-mat.stat-mech].
Pal et al. [2021] A. Pal, S. Reuveni, and S. Rahav, Thermodynamic uncertainty relation for systems with unidirectional transitions, Phys. Rev. Research 3, 013273 (2021).
Macieszczak et al. [2018] K. Macieszczak, K. Brandner, and J. P. Garrahan, Unified thermodynamic uncertainty relations in linear response, Phys. Rev. Lett. 121, 130601 (2018).
Liepelt and Lipowsky [2007] S. Liepelt and R. Lipowsky, Kinesin’s network of chemomechanical motor cycles, Phys. Rev. Lett. 98, 258102 (2007).
Di Terlizzi and Baiesi [2018] I. Di Terlizzi and M. Baiesi, Kinetic uncertainty relation, J. Phys. A Math. Theor. 52, 02LT03 (2018).
Polettini et al. [2016] M. Polettini, A. Lazarescu, and M. Esposito, Tightening the uncertainty principle for stochastic currents, Phys. Rev. E 94, 052104 (2016).

Appendix A Optimal state dependent observables for stochastic entropy production

In Ref. [13], an explicit expression for the variance of a stochastic current in a diffusion process was derived,

$\displaystyle\text{Var}_{J}$	$\displaystyle=\int_{0}^{\tau}dt\int_{0}^{\tau}ds\int d\bm{x}\int d\bm{y}\ \mu(\bm{x})\mu(\bm{y})\Big{(}p(\bm{x},t;\bm{y},s)-p^{\text{st}}(\bm{x})p^{\text{st}}(\bm{y})\Big{)}$	(61)
	$\displaystyle\quad+\int_{0}^{\tau}dt\int_{0}^{t}ds\int d\bm{x}\int d\bm{y}\ \Big{(}\mu(\bm{y})\psi(\bm{x})-\mu(\bm{x})\psi(\bm{y})\Big{)}p(\bm{x},t;\bm{y},s)$
	$\displaystyle\quad+2\tau\int d\bm{x}\ \bm{w}(\bm{x})\cdot\bm{B}\bm{w}(\bm{x})p^{\text{st}}(\bm{x})-\int_{0}^{\tau}dt\int_{0}^{\tau}ds\int d\bm{x}\int d\bm{y}\ \psi(\bm{x})\psi(\bm{y})p(\bm{x},t;\bm{y},s),$

where $p(\bm{x},t;\bm{y},s)$ is the joint probability density corresponding to Eq. (15) in the absence of the discrete degree of freedom. In the above expression, we defined

\displaystyle\mu(\bm{x})=\bm{w}^{\text{T}}(\bm{x})\bm{\nu}^{\text{st}}(\bm{x})\qquad\text{and}\qquad\psi(\bm{x})=\bm{w}^{\text{T}}(\bm{x})\bm{\phi}^{\text{st}}(\bm{x})+\text{tr}\big{(}\bm{B}\bm{\mathcal{J}}_{w}(\bm{x})\big{)}.

(62)

Here, $\bm{\phi}^{\text{st}}(\bm{x})$ is the reversible part of the drift,

\displaystyle\bm{\phi}^{\text{st}}(\bm{x})=\bm{B}\bm{\nabla}\ln p^{\text{st}}(\bm{x}),

(63)

tr denotes the trace and $\bm{\mathcal{J}}_{w}(\bm{x})$ is the Jacobian matrix of the weighting function $\bm{w}(\bm{x})$ . In a similar manner, we may derive the covariance between the current $J$ and the state-dependent observable $Z$ ,

	$\displaystyle\text{Cov}_{J,Z}$	$\displaystyle=\int_{0}^{\tau}dt\int_{0}^{\tau}ds\int d\bm{x}\int d\bm{y}\ z(\bm{x})\mu(\bm{y})\Big{(}p(\bm{x},t;\bm{y},s)-p^{\text{st}}(\bm{x})p^{\text{st}}(\bm{y})\Big{)}$		(64)
		$\displaystyle\quad+\int_{0}^{\tau}dt\int_{0}^{t}ds\int d\bm{x}\int d\bm{y}\ \Big{(}z(\bm{y})\psi(\bm{x})-z(\bm{x})\psi(\bm{y})\Big{)}p(\bm{x},t;\bm{y},s),$

while the variance of $Z$ is given by

\displaystyle\text{Var}_{Z}=\int_{0}^{\tau}dt\int_{0}^{\tau}ds\int d\bm{x}\int d\bm{y}\ z(\bm{x})z(\bm{y})\Big{(}p(\bm{x},t;\bm{y},s)-p^{\text{st}}(\bm{x})p^{\text{st}}(\bm{y})\Big{)}.

(65)

The stochastic entropy production $\Sigma$ is the current with the weighting function $\bm{w}(\bm{x})=\bm{B}^{-1}\bm{\nu}^{\text{st}}(\bm{x})$ . For this choice, we find

\displaystyle\psi(\bm{x})=\bm{\nu}^{\text{st,T}}(\bm{x})\bm{\nabla}\ln p^{\text{st}}(\bm{x})+\bm{\nabla}^{\text{T}}\bm{\nu}^{\text{st}}(\bm{x})=\frac{1}{p^{\text{st}}(\bm{x})}\Big{(}\bm{\nabla}^{\text{T}}\big{(}\bm{\nu}^{\text{st}}(\bm{x})p^{\text{st}}(\bm{x})\big{)}\Big{)}=0,

(66)

where we used that the expression in parentheses is nothing but the steady-state condition of the Fokker-Planck equation Eq. (16). In this case, the expressions Eq. (61) and Eq. (64) simplify,

	$\displaystyle\text{Var}_{\Sigma}$	$\displaystyle=\int_{0}^{\tau}dt\int_{0}^{\tau}ds\int d\bm{x}\int d\bm{y}\ \big{(}\bm{\nu}^{\text{st,T}}(\bm{x})\cdot\bm{B}^{-1}\bm{\nu}^{\text{st}}(\bm{x})\big{)}\big{(}\bm{\nu}^{\text{st,T}}(\bm{y})\cdot\bm{B}^{-1}\bm{\nu}^{\text{st}}(\bm{y})\big{)}\Big{(}p(\bm{x},t;\bm{y},s)-p^{\text{st}}(\bm{x})p^{\text{st}}(\bm{y})\Big{)}+2\Delta S^{\text{irr}},$
	$\displaystyle\text{Cov}_{\Sigma,Z}$	$\displaystyle=\int_{0}^{\tau}dt\int_{0}^{\tau}ds\int d\bm{x}\int d\bm{y}\ z(\bm{x})\big{(}\bm{\nu}^{\text{st,T}}(\bm{y})\cdot\bm{B}^{-1}\bm{\nu}^{\text{st}}(\bm{y})\big{)}\Big{(}p(\bm{x},t;\bm{y},s)-p^{\text{st}}(\bm{x})p^{\text{st}}(\bm{y})\Big{)}.$		(67)

With these expressions, it is obvious that choosing $Z=\bar{\Sigma}$, i.e.,

\displaystyle z(\bm{x})=\bm{\nu}^{\text{st,T}}(\bm{x})\bm{B}^{-1}\bm{\nu}^{\text{st}}(\bm{x}),

(68)

results in

\displaystyle\text{Var}_{\Sigma}\big{(}1-{\chi_{\Sigma,\bar{\Sigma}}}^{2}\big{)}=\text{Var}_{\Sigma}-\frac{{\text{Cov}_{\Sigma,\bar{\Sigma}}}^{2}}{\text{Var}_{\bar{\Sigma}}}=2\Delta S^{\text{irr}}.

(69)

This choice thus realizes the equality in the CTUR Eq. (3).
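The step from Eq. (67) to Eq. (69) can be made explicit. The shorthand $C$ below is introduced only for this illustration and does not appear in the original derivation; it denotes the common double integral obtained by inserting Eq. (68) into Eq. (65) and Eq. (67).

```latex
% Illustration only: C is a shorthand that is not part of the original text.
\begin{align}
C &\equiv \int_{0}^{\tau}dt\int_{0}^{\tau}ds\int d\bm{x}\int d\bm{y}\
  z(\bm{x})z(\bm{y})\Big(p(\bm{x},t;\bm{y},s)-p^{\text{st}}(\bm{x})p^{\text{st}}(\bm{y})\Big),
  \qquad z(\bm{x})=\bm{\nu}^{\text{st,T}}(\bm{x})\bm{B}^{-1}\bm{\nu}^{\text{st}}(\bm{x}), \\
\text{Var}_{\bar{\Sigma}} &= C, \qquad
\text{Cov}_{\Sigma,\bar{\Sigma}} = C, \qquad
\text{Var}_{\Sigma} = C+2\Delta S^{\text{irr}}, \\
\text{Var}_{\Sigma}-\frac{{\text{Cov}_{\Sigma,\bar{\Sigma}}}^{2}}{\text{Var}_{\bar{\Sigma}}}
  &= C+2\Delta S^{\text{irr}}-\frac{C^{2}}{C} = 2\Delta S^{\text{irr}}.
\end{align}
```

The correlated part of the fluctuations of $\Sigma$ is thus removed entirely by $\bar{\Sigma}$, and only the term $2\Delta S^{\text{irr}}$ remains.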

For the case of a pure jump process, we have for the variances and covariance,

$\displaystyle\text{Var}_{J}$	$\displaystyle=\int_{0}^{\tau}dt\int_{0}^{t}ds\sum_{i,j,k,l}\omega_{ij}\omega_{kl}W_{ij}W_{kl}\Big{(}p(j,t|k,s)p_{l}^{\text{st}}+p(l,t|i,s)p_{j}^{\text{st}}-2p_{j}^{\text{st}}p_{l}^{\text{st}}\Big{)}+\tau\sum_{i,j}\omega_{ij}^{2}W_{ij}p_{j}^{\text{st}},$	(70)
$\displaystyle\text{Var}_{Z}$	$\displaystyle=\int_{0}^{\tau}dt\int_{0}^{t}ds\sum_{j,l}z_{j}z_{l}\Big{(}p(j,t|l,s)p_{l}^{\text{st}}+p(l,t|j,s)p_{j}^{\text{st}}-2p_{j}^{\text{st}}p_{l}^{\text{st}}\Big{)},$
$\displaystyle\text{Cov}_{J,Z}$	$\displaystyle=\int_{0}^{\tau}dt\int_{0}^{t}ds\ \sum_{j,k,l}z_{j}\omega_{kl}W_{kl}\Big{(}p(j,t|k,s)p_{l}^{\text{st}}+p(l,t|j,s)p_{j}^{\text{st}}-2p_{j}^{\text{st}}p_{l}^{\text{st}}\Big{)}.$

We decompose the transition rates into the (irreversible) local mean velocity and a reversible part,

\displaystyle W_{ij}=V^{\text{st}}_{ij}+\Phi^{\text{st}}_{ij}\qquad\text{with}\qquad V_{ij}^{\text{st}}=\frac{1}{2}\bigg{(}W_{ij}-W_{ji}\frac{p_{i}^{\text{st}}}{p_{j}^{\text{st}}}\bigg{)}\qquad\text{and}\qquad\Phi_{ij}^{\text{st}}=\frac{1}{2}\bigg{(}W_{ij}+W_{ji}\frac{p_{i}^{\text{st}}}{p_{j}^{\text{st}}}\bigg{)}.

(71)

Using that the weights of a current are antisymmetric, $\omega_{ij}=-\omega_{ji}$, we can write

	$\displaystyle\text{Var}_{J}$	$\displaystyle=\int_{0}^{\tau}dt\int_{0}^{t}ds\sum_{i,j,k,l}\omega_{ij}\omega_{kl}\big{(}V^{\text{st}}_{ij}+\Phi^{\text{st}}_{ij}\big{)}\big{(}V^{\text{st}}_{kl}+\Phi^{\text{st}}_{kl}\big{)}\Big{(}p(j,t|k,s)p_{l}^{\text{st}}+p(l,t|i,s)p_{j}^{\text{st}}-2p_{j}^{\text{st}}p_{l}^{\text{st}}\Big{)}+\tau\sum_{i,j}\omega_{ij}^{2}\Phi^{\text{st}}_{ij}p_{j}^{\text{st}},$
	$\displaystyle\text{Cov}_{J,Z}$	$\displaystyle=\int_{0}^{\tau}dt\int_{0}^{t}ds\ \sum_{j,k,l}z_{j}\omega_{kl}\big{(}V^{\text{st}}_{kl}+\Phi^{\text{st}}_{kl}\big{)}\Big{(}p(j,t|k,s)p_{l}^{\text{st}}+p(l,t|j,s)p_{j}^{\text{st}}-2p_{j}^{\text{st}}p_{l}^{\text{st}}\Big{)}.$		(72)

We focus on the stochastic pseudo-entropy $\mathcal{R}$ with the weighting function

\displaystyle\omega_{ij}=2\frac{W_{ij}p_{j}^{\text{st}}-W_{ji}p_{i}^{\text{st}}}{W_{ij}p_{j}^{\text{st}}+W_{ji}p_{i}^{\text{st}}}=2\frac{V_{ij}^{\text{st}}}{\Phi_{ij}^{\text{st}}}.

(73)

For this choice, we obtain

\displaystyle\sum_{k,l}\omega_{kl}\Phi_{kl}^{\text{st}}p(j,t|l,s)p_{l}^{\text{st}}=2\sum_{k,l}p(j,t|l,s)\big{(}W_{kl}p_{l}^{\text{st}}-W_{lk}p_{k}^{\text{st}}\big{)}=2\sum_{l}p(j,t|l,s)\sum_{k}\big{(}W_{kl}p_{l}^{\text{st}}-W_{lk}p_{k}^{\text{st}}\big{)}=0,

(74)

where we used the steady state condition of the master equation. As a consequence, all terms involving $\Phi_{ij}^{\text{st}}$ in Eq. (72) vanish, and we have

	$\displaystyle\text{Var}_{\mathcal{R}}$	$\displaystyle=4\int_{0}^{\tau}dt\int_{0}^{t}ds\sum_{i,j,k,l}\frac{(V_{ij}^{\text{st}})^{2}}{\Phi_{ij}^{\text{st}}}\frac{(V_{kl}^{\text{st}})^{2}}{\Phi_{kl}^{\text{st}}}\Big{(}p(j,t|l,s)p_{l}^{\text{st}}+p(l,t|j,s)p_{j}^{\text{st}}-2p_{j}^{\text{st}}p_{l}^{\text{st}}\Big{)}+2R,$		(75)
	$\displaystyle\text{Cov}_{\mathcal{R},Z}$	$\displaystyle=2\int_{0}^{\tau}dt\int_{0}^{t}ds\ \sum_{j,k,l}z_{j}\frac{(V_{kl}^{\text{st}})^{2}}{\Phi_{kl}^{\text{st}}}\Big{(}p(j,t|l,s)p_{l}^{\text{st}}+p(l,t|j,s)p_{j}^{\text{st}}-2p_{j}^{\text{st}}p_{l}^{\text{st}}\Big{)}.$

Note that we can change the second argument of the conditional probabilities, since we have

\displaystyle\sum_{i,j}\frac{(V_{ij}^{\text{st}})^{2}}{\Phi_{ij}^{\text{st}}}p(l,t|i,s)p_{j}^{\text{st}}=-\sum_{i,j}\frac{V_{ij}^{\text{st}}}{\Phi_{ij}^{\text{st}}}V_{ji}^{\text{st}}p(l,t|i,s)p_{i}^{\text{st}}=\sum_{i,j}\frac{V_{ji}^{\text{st}}}{\Phi_{ji}^{\text{st}}}V_{ji}^{\text{st}}p(l,t|i,s)p_{i}^{\text{st}}=\sum_{i,j}\frac{(V_{ij}^{\text{st}})^{2}}{\Phi_{ij}^{\text{st}}}p(l,t|j,s)p_{j}^{\text{st}},

(76)

where we used $V_{ij}^{\text{st}}p_{j}^{\text{st}}=-V_{ji}^{\text{st}}p_{i}^{\text{st}}$ in the first, $V_{ij}^{\text{st}}/\Phi_{ij}^{\text{st}}=-V_{ji}^{\text{st}}/\Phi_{ji}^{\text{st}}$ in the second and renamed indices in the third step. We choose the function $z_{j}$ corresponding to the local average $\bar{\mathcal{R}}$ of the stochastic pseudo-entropy,

\displaystyle z_{j}=\sum_{i}\frac{\Big{(}W_{ij}-W_{ji}\frac{p_{i}^{\text{st}}}{p_{j}^{\text{st}}}\Big{)}^{2}}{W_{ij}+W_{ji}\frac{p_{i}^{\text{st}}}{p_{j}^{\text{st}}}}=2\sum_{i}\frac{(V_{ij}^{\text{st}})^{2}}{\Phi_{ij}^{\text{st}}},

(77)

which, when plugged into the above expressions, yields

\displaystyle\text{Cov}_{\mathcal{R},\bar{\mathcal{R}}}=\text{Var}_{\bar{\mathcal{R}}}=4\int_{0}^{\tau}dt\int_{0}^{t}ds\sum_{i,j,k,l}\frac{(V_{ij}^{\text{st}})^{2}}{\Phi_{ij}^{\text{st}}}\frac{(V_{kl}^{\text{st}})^{2}}{\Phi_{kl}^{\text{st}}}\Big{(}p(j,t|l,s)p_{l}^{\text{st}}+p(l,t|j,s)p_{j}^{\text{st}}-2p_{j}^{\text{st}}p_{l}^{\text{st}}\Big{)}.

(78)

This is precisely the same as the first term in $\text{Var}_{\mathcal{R}}$ and thus, we finally obtain

\displaystyle\text{Var}_{\mathcal{R}}\big{(}1-{\chi_{\mathcal{R},\bar{\mathcal{R}}}}^{2}\big{)}=2R,

(79)

which yields equality in Eq. (35).
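The equality Eq. (79) can be checked numerically on a small example. The following is a minimal sketch, not the procedure used for the figures: the three-state rate matrix `W`, the function names and the Gillespie-type sampler are all assumptions made for illustration. It builds the decomposition Eq. (71), the weights Eq. (73) and the function $z_{j}$ of Eq. (77), samples steady-state trajectories, and compares $\text{Var}_{\mathcal{R}}(1-\chi^{2})$ with $2R$, where $R=\tau\sum_{j}z_{j}p_{j}^{\text{st}}$ follows from the last term of Eq. (70) together with Eq. (75).

```python
import numpy as np

# Hypothetical 3-state example (not from the paper); W[i, j] is the rate of the jump j -> i.
W = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])

def steady_state(W):
    """Solve the stationary master equation G p = 0 with sum(p) = 1."""
    G = W - np.diag(W.sum(axis=0))            # generator, columns sum to zero
    A = np.vstack([G, np.ones(len(W))])
    b = np.append(np.zeros(len(W)), 1.0)
    return np.linalg.lstsq(A, b, rcond=None)[0]

def decompose(W, p):
    """Return V, Phi of Eq. (71), omega of Eq. (73) and z of Eq. (77)."""
    W_rev = W.T * p[:, None] / p[None, :]     # W_ji p_i / p_j
    V, Phi = 0.5 * (W - W_rev), 0.5 * (W + W_rev)
    off = ~np.eye(len(W), dtype=bool)         # skip the (vanishing) diagonal
    omega, ratio = np.zeros_like(W), np.zeros_like(W)
    omega[off] = 2.0 * V[off] / Phi[off]
    ratio[off] = V[off] ** 2 / Phi[off]
    return V, Phi, omega, 2.0 * ratio.sum(axis=0)

def sample(W, p, omega, z, tau, n_traj, seed=0):
    """Gillespie sampling of the current R and the time integral Z of z."""
    rng = np.random.default_rng(seed)
    n = len(W)
    exit_rate = W.sum(axis=0)
    cum_jump = np.cumsum(W / exit_rate, axis=0)   # jump probabilities per column
    cum_p = np.cumsum(p)
    R, Z = np.zeros(n_traj), np.zeros(n_traj)
    for m in range(n_traj):
        # Draw the initial state from the steady state.
        j = min(int(np.searchsorted(cum_p, rng.random(), side="right")), n - 1)
        t = 0.0
        while True:
            wait = rng.exponential(1.0 / exit_rate[j])
            if t + wait >= tau:
                Z[m] += z[j] * (tau - t)
                break
            Z[m] += z[j] * wait
            t += wait
            i = min(int(np.searchsorted(cum_jump[:, j], rng.random(), side="right")), n - 1)
            R[m] += omega[i, j]               # weight of the jump j -> i
            j = i
    return R, Z

def check_equality(tau=2.0, n_traj=20000, seed=0):
    """Return (Var_R (1 - chi^2), 2 R) as estimated and as predicted."""
    p = steady_state(W)
    V, Phi, omega, z = decompose(W, p)
    R, Z = sample(W, p, omega, z, tau, n_traj, seed)
    lhs = R.var(ddof=1) - np.cov(R, Z)[0, 1] ** 2 / Z.var(ddof=1)
    rhs = 2.0 * tau * np.sum(z * p)
    return lhs, rhs

if __name__ == "__main__":
    print(check_equality())
```

Since the left-hand side is estimated from a finite number of trajectories, the two returned values agree only up to statistical error, which shrinks as `n_traj` grows.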

Appendix B Dependence of the CTUR on the size of the data set and sampling interval

The CTUR Eq. (3) is a relation between the averages and variances of different physical observables. Formally, these quantities are defined for an infinite ensemble. However, in reality the number of trajectories $N_{\text{t}}$ is always finite, both in experiments and numerical simulations. It is clear that the averages and variances computed from a finite number of random trajectories are themselves random quantities; thus, for any finite data set, there is a finite probability that the relation Eq. (3) is violated. Moreover, the state-dependent observable Eq. (23) is defined as a time-integral over continuous-time trajectories. However, in reality, the sampling interval is likewise finite, that is, instead of a continuous trajectory $\{\bm{x}(t)\}_{t\in[0,\tau]}$, we observe a sequence of $M$ discrete values $\bm{x}(t_{j})$ with $t_{j}=j\Delta t_{\text{m}}$ and $\Delta t_{\text{m}}=\tau/M$. Here, we want to investigate the dependence of the bound Eq. (3) on $N_{\text{t}}$ and $\Delta t_{\text{m}}$. For simplicity, we focus on a simple model that serves as a minimal example of a non-equilibrium steady state. We consider a single overdamped particle in a one-dimensional periodic potential $U(x+L)=U(x)$, which is driven by a constant force $F_{0}$, while the environment is characterized by the mobility $\mu$ and temperature $T$. The corresponding Langevin equation reads

\displaystyle\dot{x}(t)=\mu\big{(}-U^{\prime}(x)+F_{0}\big{)}+\sqrt{2\mu T}\xi(t).

(80)

For long times, this system settles into a periodic steady state with probability density $p^{\text{st}}(x+L)=p^{\text{st}}(x)$ . Since the system is one-dimensional, this also fixes the steady-state local mean velocity $\nu^{\text{st}}(x)=v^{\text{d}}/(Lp^{\text{st}}(x))$ , see Section V.3. As a current observable, we consider the total displacement

\displaystyle J=\int_{0}^{\tau}dt\ \dot{x}(t).

(81)

As in Section VI.1, we choose the state-dependent observable defined by the inverse probability density,

\displaystyle Z=\int_{0}^{\tau}dt\ \frac{1}{p^{\text{st}}(x(t))},

(82)

which is proportional to the heuristic observable defined in Section IV. The density $p^{\text{st}}(x)$ is estimated from the occupation fraction of the entire set of trajectories. For a finite sampling interval $\Delta t_{\text{m}}$, we instead use the discrete approximation of the integral,

\displaystyle Z=\sum_{j=1}^{M}\frac{1}{p^{\text{st}}(x(t_{j}))}\Delta t_{\text{m}}.

(83)

We further define an observable $\tilde{Z}$ parameterized as Eq. (49) with $K=10$ and numerically maximize the Pearson coefficient with respect to the parameters $a_{k}$ and $b_{k}$. As a concrete example, we consider the case $U(x)=-U_{0}\cos(2\pi x/L)$ with $U_{0}=1$, $L=1$, $F_{0}=5$, $\mu=1$ and $T=0.2$. This corresponds to a moderately strong driving at relatively low temperature. In this regime, the relevant timescale describing the dynamics is the time it takes the particle to traverse one period of the potential, $\tau^{\text{d}}=L/v^{\text{d}}\approx 1.3$. For the above parameter values, the TUR is generally not tight; the ratio $\eta_{J}$ defined in Eq. (48) has a value of $\eta_{J}\approx 0.12$ and thus the ratio between the average current and its variance underestimates the entropy production by about a factor of $8$.

By contrast, as shown in Fig. 7, the CTUR with the numerically optimized observable $\tilde{Z}$ can provide an accurate estimate of the entropy production, given a sufficient number of trajectories and a sufficiently small sampling interval. As a function of $N_{\text{t}}$ (left panel of Fig. 7), the ratios corresponding to both the TUR and the CTUR show noticeable fluctuations for a small number of trajectories. However, while the value of the TUR ratio stabilizes at around $N_{\text{t}}\approx 100$, we need around $N_{\text{t}}\approx 1000$ trajectories for the CTUR to provide a reliable estimate. The reason is that, since, in this case, the CTUR is much tighter than the TUR, the Pearson coefficient $\chi_{J,Z}$ is close to $1$. Since the left-hand side of Eq. (3) diverges as $\chi_{J,Z}$ approaches unity, small fluctuations in its value have a large impact on the estimate.

As a function of the sampling interval $\Delta t_{\text{m}}$, we note that, if $\Delta t_{\text{m}}$ is comparable to the transport timescale $\tau^{\text{d}}$, the CTUR does not yield any improvement over the TUR. The reason is that, for such a large sampling interval, the spatially periodic function $p^{\text{st}}(x)$ in Eq. (83) can no longer resolve the motion of the particle. Thus, the observable $Z$ is no longer correlated with the displacement of the particle, causing the Pearson coefficient to vanish. Note that the TUR, which only depends on the accumulated current until the final time, is independent of the sampling interval. We see that the CTUR can yield a significant improvement over the TUR for $\Delta t_{\text{m}}<0.1\tau^{\text{d}}$ and saturates at $\Delta t_{\text{m}}\approx 0.01\tau^{\text{d}}$. We remark that this is still two orders of magnitude larger than the sampling interval needed to accurately determine the entropy production directly from the trajectory using its representation as a stochastic current,

\displaystyle\Sigma=\frac{1}{T}\int_{0}^{\tau}dt\ F(x(t))\circ\dot{x}(t).

(84)
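The finite-data estimate described in this appendix can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for Fig. 7: the function names, the Euler-Maruyama integrator, the burn-in period, the histogram estimate of $p^{\text{st}}(x)$ with `n_bins` bins, and the identification of the sampling interval with the integration step `dt` are all choices made here. The sketch uses the observable $Z$ of Eq. (83) rather than the optimized $\tilde{Z}$, and it assumes that the left-hand side of the CTUR Eq. (3) has the form $2\langle J\rangle^{2}/\big(\text{Var}_{J}(1-{\chi_{J,Z}}^{2})\big)$, which is the form suggested by Eq. (69).

```python
import numpy as np

def simulate(n_traj=200, tau=5.0, dt=1e-3, burn_in=2.0, n_bins=50,
             U0=1.0, L=1.0, F0=5.0, mu=1.0, T=0.2, seed=0):
    """Euler-Maruyama integration of Eq. (80); returns (J, Z) per trajectory."""
    rng = np.random.default_rng(seed)
    k = 2.0 * np.pi / L

    def step(x):
        # U(x) = -U0 cos(k x), so the potential force is -U'(x) = -U0 k sin(k x).
        drift = mu * (-U0 * k * np.sin(k * x) + F0)
        noise = np.sqrt(2.0 * mu * T * dt) * rng.standard_normal(x.shape)
        return x + drift * dt + noise

    x = np.zeros(n_traj)
    for _ in range(int(round(burn_in / dt))):     # relax towards the steady state
        x = step(x)

    n_steps = int(round(tau / dt))
    x_start = x.copy()
    cell = np.empty((n_steps, n_traj), dtype=int)  # histogram bin visited at each t_j
    for j in range(n_steps):
        x = step(x)
        cell[j] = np.minimum((np.mod(x, L) / L * n_bins).astype(int), n_bins - 1)
    J = x - x_start                                # total displacement, Eq. (81)

    # Occupation fraction of the entire data set as an estimate of p_st(x).
    counts = np.bincount(cell.ravel(), minlength=n_bins)
    p_st = counts / (counts.sum() * L / n_bins)
    Z = np.sum(1.0 / p_st[cell], axis=0) * dt      # discrete sum, Eq. (83)
    return J, Z

def tur_and_ctur(J, Z):
    """Left-hand sides of the TUR and of the assumed form of the CTUR."""
    chi = np.corrcoef(J, Z)[0, 1]                  # Pearson coefficient
    tur = 2.0 * J.mean() ** 2 / J.var(ddof=1)
    return tur, tur / (1.0 - chi ** 2), chi

if __name__ == "__main__":
    J, Z = simulate()
    tur, ctur, chi = tur_and_ctur(J, Z)
    # Mean of Eq. (84): the periodic potential contributes only a boundary
    # term whose steady-state average vanishes, leaving F0 <J> / T.
    entropy = 5.0 * J.mean() / 0.2
    print(f"TUR: {tur:.2f}  CTUR: {ctur:.2f}  chi: {chi:.3f}  entropy: {entropy:.2f}")
```

Repeating the call with different `n_traj` and `dt` gives a rough impression of how both estimates depend on the size of the data set and on the sampling interval; with the unoptimized $Z$, the improvement over the TUR may be smaller than that reported for $\tilde{Z}$.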