
Hitting Times for Continuous-Time Imprecise-Markov Chains

Thomas Krak
[email protected]
Uncertainty in Artificial Intelligence, Eindhoven University of Technology
Abstract

We study the problem of characterizing the expected hitting times for a robust generalization of continuous-time Markov chains. This generalization is based on the theory of imprecise probabilities, and the models with which we work essentially constitute sets of stochastic processes. Inferences for these models are tight lower and upper bounds with respect to all variation within these sets.

We consider three distinct types of these models, corresponding to different levels of generality and structural independence assumptions on the constituent processes.

Our main results are twofold: first, we demonstrate that the expected hitting times for all three types of models coincide. Second, we show that these inferences are characterized by a straightforward generalization of a well-known linear system of equations for the expected hitting times of traditional time-homogeneous continuous-time Markov chains.

1 Introduction

We consider the problem of characterizing the expected hitting times for continuous-time imprecise-Markov chains [Škulj, 2015, Krak et al., 2017, Krak, 2021, Erreygers, 2021]. These are robust, set-valued generalizations of (traditional) Markov chains [Norris, 1998], based on the theory of imprecise probabilities [Walley, 1991, Augustin et al., 2014]. From a sensitivity-analysis perspective, we may interpret these sets as hedging against model-uncertainties with respect to a model’s numerical parameters and/or structural (independence) assumptions.

The inference problem of hitting times essentially deals with the question of how long it will take the underlying system to reach some particular subset of its states. This is a common and important problem in such fields as, e.g., reliability analysis, where it can capture the expected time-to-failure of a system; and epidemiology, to model the expected time-until-extinction of an epidemic. For imprecise-Markov chains, then, we are interested in evaluating these quantities in a manner that is robust against, and conservative with respect to, any variation that is compatible with one’s uncertainty about the model specification.

Erreygers [2021] has recently obtained some partial results towards characterizing such inferences, but has not given a complete characterization and has largely studied the finite time-horizon case. The problem of hitting times for discrete-time imprecise-Markov chains was previously studied by Krak et al. [2019] and Krak [2020]. In the present work, we largely emulate and extend their results to the continuous-time setting.

We will be concerned with three different types of imprecise-Markov chains. These are all sets of stochastic processes that are in a specific sense compatible with a given set of numerical parameters, but the three types differ in the independence properties of their elements. In particular, they correspond to (i) a set of (time-)homogeneous Markov chains, (ii) a set of (not-necessarily homogeneous) Markov chains, and (iii) a set of general—not-necessarily homogeneous nor Markovian—stochastic processes. It is known (and perhaps not very surprising) that inferences with respect to these three models do not in general agree; see e.g. [Krak, 2021] for a detailed analysis of their differences.

However, our first main result in this work is that the expected hitting time is the same for these three different types of models. Besides its theoretical interest, we want to emphasize the practical power of this result: it means that even if a practitioner using Markov chains is uncertain whether the system they are studying is truly homogeneous and/or Markovian, relaxing these assumptions does not influence inferences about the hitting times in this sense. Purely pragmatically, it also means that we can use computational methods tailored to any one of these types of models to compute these inferences.

Our second main result is that these hitting times are characterized by a generalization of a well-known system of equations that holds for continuous-time homogeneous Markov chains; see Proposition 2 for this linear system.

The remainder of this paper is structured as follows. In Section 2 we introduce the basic required concepts that we will use throughout, formalizing the notion of stochastic processes and defining the inference problem of interest. In Section 3, we define the various types of imprecise-Markov chains that we use throughout this work. We spend some effort in Section 4 to study the transition dynamics of these models, from a perspective that is particularly relevant for the inference problem of hitting times. In Section 5 we explain and sketch the proofs of our main results, and we give a summary in Section 6.

Because we have quite a lot of conceptual material to cover before we can explain our main results, we are not able to fit any real proofs in the main body of this work. Instead, these—together with a number of technical lemmas—have largely been relegated to the supplementary material.

2 Preliminaries

Throughout, we consider a fixed, finite state space $\mathcal{X}$ with at least two elements. This set contains all possible values for some abstract underlying process. An element of $\mathcal{X}$ is called a state, and is usually generically denoted as $x\in\mathcal{X}$.

We use $\mathbb{R}$, $\mathbb{R}_{\geq 0}$, and $\mathbb{R}_{>0}$ to denote the reals, the non-negative reals, and the positive reals, respectively. $\mathbb{N}$ denotes the natural numbers without zero, and we let $\mathbb{N}_{0}\coloneqq\mathbb{N}\cup\{0\}$.

For any $\mathcal{Y}\subseteq\mathcal{X}$, we use $\mathbb{R}^{\mathcal{Y}}$ to denote the vector space of real-valued functions on $\mathcal{Y}$; in particular, $\mathbb{R}^{\mathcal{X}}$ denotes the space of all real functions on $\mathcal{X}$. We use $\lVert\cdot\rVert$ to denote the supremum norm on any such space; for any $f\in\mathbb{R}^{\mathcal{Y}}$ we let $\lVert f\rVert\coloneqq\max\{\lvert f(x)\rvert\,:\,x\in\mathcal{Y}\}$. Throughout, we make extensive use of indicator functions, which are defined for all $A\subseteq\mathcal{Y}$ as $\mathbb{I}_{A}(x)\coloneqq 1$ if $x\in A$ and $\mathbb{I}_{A}(x)\coloneqq 0$ otherwise. We use the shorthand $\mathbb{I}_{y}\coloneqq\mathbb{I}_{\{y\}}$. Let $\mathbf{1}$ denote the function that is identically equal to $1$; its dimensionality is to be understood from context.

A map $M:\mathbb{R}^{\mathcal{Y}}\to\mathbb{R}^{\mathcal{Y}}$ is also called an operator, and we denote its evaluation in $f\in\mathbb{R}^{\mathcal{Y}}$ as $Mf$. If it holds for all $\lambda\in\mathbb{R}_{\geq 0}$ that $M(\lambda f)=\lambda Mf$, then $M$ is called non-negatively homogeneous. For any non-negatively homogeneous operator on $\mathbb{R}^{\mathcal{Y}}$, we define the induced operator norm $\lVert M\rVert\coloneqq\sup\{\lVert Mf\rVert\,:\,f\in\mathbb{R}^{\mathcal{Y}},\lVert f\rVert=1\}$. We reserve the symbol $I$ to denote the identity operator on any space; the domain is to be understood from context.

Note that any linear operator is also non-negatively homogeneous. Moreover, if $M$ is linear it can be represented as a $\lvert\mathcal{Y}\rvert\times\lvert\mathcal{Y}\rvert$ matrix by arbitrarily fixing an ordering on $\mathcal{Y}$. However, without fixing such an ordering, we simply use $M(x,y)\coloneqq M\mathbb{I}_{y}(x)$ to denote the entry in the $x$-row and $y$-column of such a matrix, for any $x,y\in\mathcal{Y}$. For any $f\in\mathbb{R}^{\mathcal{Y}}$ and $x\in\mathcal{Y}$ we then have $Mf(x)=\sum_{y\in\mathcal{Y}}M(x,y)f(y)$, so that $Mf$ simply represents the usual matrix-vector product of $M$ with the (column) vector $f$. In the sequel, we interchangeably refer to linear operators also as matrices. We note the well-known equality $\lVert M\rVert=\max_{x\in\mathcal{Y}}\sum_{y\in\mathcal{Y}}\lvert M(x,y)\rvert$ for the induced matrix norm.
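As a quick illustration of this induced norm identity, the following minimal Python sketch (the matrix $M$ is just a made-up example) computes the maximum absolute row sum and checks it against a direct evaluation of the definition, where for this norm the supremum over $\lVert f\rVert=1$ is attained at a vector of signs.

import numpy as np

# A small, arbitrary linear operator (matrix) M on R^Y with |Y| = 2.
M = np.array([[0.5, -0.25],
              [2.0,  1.0]])

# Induced supremum norm: the maximum absolute row sum.
print(np.max(np.sum(np.abs(M), axis=1)))          # 3.0

# Direct check of the definition: sup of ||Mf|| over ||f|| = 1; the supremum
# is attained at a vector with entries +/- 1.
signs = [np.array([a, b]) for a in (-1.0, 1.0) for b in (-1.0, 1.0)]
print(max(np.max(np.abs(M @ f)) for f in signs))  # also 3.0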

2.1 Processes & Markov Chains

We now turn to stochastic processes, which are fundamentally the subject of this work. The typical (measure-theoretic) way to define a stochastic process is simply as a family $(X_{i})_{i\in\mathcal{I}}$ of random variables with index set $\mathcal{I}$. This index set represents the time domain of the stochastic process. The random variables are understood to be taken with respect to some underlying probability space $(\Omega_{\mathcal{I}},\mathcal{F}_{\mathcal{I}},P)$, where $\Omega_{\mathcal{I}}$ is a set of sample paths, which are functions from $\mathcal{I}$ to $\mathcal{X}$ representing possible realizations of the evolution of the underlying process through $\mathcal{X}$. The random variables $X_{i}$, $i\in\mathcal{I}$, are canonically the maps $X_{i}:\omega\mapsto\omega(i)$ on $\Omega_{\mathcal{I}}$.

However, for our purposes it will be more convenient to instead refer to the probability measure $P$ as the stochastic process. Different processes $P$ may then be taken over the same measurable space $(\Omega_{\mathcal{I}},\mathcal{F}_{\mathcal{I}})$, using the same canonical variables $(X_{i})_{i\in\mathcal{I}}$ for all these processes.

In this work we will use both discrete- and continuous-time stochastic processes, which correspond to choosing $\mathcal{I}=\mathbb{N}_{0}$ or $\mathcal{I}=\mathbb{R}_{\geq 0}$, respectively. In both cases we take $\mathcal{F}_{\mathcal{I}}$ to be the $\sigma$-algebra generated by the cylinder sets; this ensures that all functions that we consider are measurable.

In the discrete-time case, we let $\Omega_{\mathbb{N}_{0}}$ be the set of all functions from $\mathbb{N}_{0}$ to $\mathcal{X}$. A discrete-time stochastic process $P$ is then simply a probability measure on $(\Omega_{\mathbb{N}_{0}},\mathcal{F}_{\mathbb{N}_{0}})$. Moreover, $P$ is said to be a Markov chain if it satisfies the (discrete-time) Markov property, meaning that

P(X_{n+1}=x_{n+1}\,|\,X_{0}=x_{0},\ldots,X_{n}=x_{n})=P(X_{n+1}=x_{n+1}\,|\,X_{n}=x_{n})\,,

for all $x_{0},\ldots,x_{n+1}\in\mathcal{X}$ and $n\in\mathbb{N}_{0}$. If, additionally, it holds for all $x,y\in\mathcal{X}$ and $n\in\mathbb{N}_{0}$ that

P(X_{n+1}=y\,|\,X_{n}=x)=P(X_{1}=y\,|\,X_{0}=x)\,,

then $P$ is said to be a (time-)homogeneous Markov chain. We use $\mathbb{P}_{\mathbb{N}_{0}}$, $\mathbb{P}_{\mathbb{N}_{0}}^{\mathrm{M}}$, and $\mathbb{P}_{\mathbb{N}_{0}}^{\mathrm{HM}}$ to denote, respectively, the set of all discrete-time stochastic processes; the set of all discrete-time Markov chains; and the set of all discrete-time homogeneous Markov chains.

In the continuous-time case, we let $\Omega_{\mathbb{R}_{\geq 0}}$ be the set of all càdlàg functions from $\mathbb{R}_{\geq 0}$ to $\mathcal{X}$. A continuous-time stochastic process $P$ is a probability measure on $(\Omega_{\mathbb{R}_{\geq 0}},\mathcal{F}_{\mathbb{R}_{\geq 0}})$. The process $P$ is said to be a Markov chain if it satisfies the (continuous-time) Markov property,

P(X_{t_{n+1}}=x_{t_{n+1}}\,|\,X_{t_{0}}=x_{t_{0}},\ldots,X_{t_{n}}=x_{t_{n}})=P(X_{t_{n+1}}=x_{t_{n+1}}\,|\,X_{t_{n}}=x_{t_{n}})

for all $x_{t_{0}},\ldots,x_{t_{n+1}}\in\mathcal{X}$, all $t_{0}<\cdots<t_{n}\leq t_{n+1}\in\mathbb{R}_{\geq 0}$, and all $n\in\mathbb{N}_{0}$. If, additionally, it holds that

P(X_{s}=y\,|\,X_{t}=x)=P(X_{s-t}=y\,|\,X_{0}=x)

for all $x,y\in\mathcal{X}$ and all $t,s\in\mathbb{R}_{\geq 0}$ with $t\leq s$, then $P$ is said to be a (time-)homogeneous Markov chain. We use $\mathbb{P}_{\mathbb{R}_{\geq 0}}$, $\mathbb{P}_{\mathbb{R}_{\geq 0}}^{\mathrm{M}}$, and $\mathbb{P}_{\mathbb{R}_{\geq 0}}^{\mathrm{HM}}$ to denote, respectively, the set of all continuous-time stochastic processes; the set of all continuous-time Markov chains; and the set of all continuous-time homogeneous Markov chains.

We refer to [Norris, 1998] for an excellent further introduction to discrete-time and continuous-time Markov chains.

2.2 Transition Dynamics

Throughout this work, we make extensive use of operator-theoretic representations of the behavior of stochastic processes, and Markov chains in particular. The first reason for this is that such operators serve as a way to parameterize Markov chains. Moreover, they are also useful as a computational tool, since they can often be used to express inferences of interest; see, e.g., Propositions 1 and 2 further on. We introduce the basic concepts below, and refer to e.g. [Norris, 1998] for details.

A transition matrix $T$ is a linear operator on $\mathbb{R}^{\mathcal{X}}$ such that, for all $x\in\mathcal{X}$, it holds that $T(x,y)\geq 0$ for all $y\in\mathcal{X}$, and $\sum_{y\in\mathcal{X}}T(x,y)=1$. There is an important and well-known connection between Markov chains and transition matrices; for any discrete-time homogeneous Markov chain $P$, we can define the corresponding transition matrix ${}^{P}T$ as

{}^{P}T(x,y)\coloneqq P(X_{1}=y\,|\,X_{0}=x)\quad\text{for all $x,y\in\mathcal{X}$.}

Since $P$ is a probability measure, we clearly have that ${}^{P}T$ is a transition matrix. Conversely, a given transition matrix $T$ uniquely determines a discrete-time homogeneous Markov chain $P$ with ${}^{P}T=T$, up to the specification of the initial distribution $P(X_{0})$. For this reason, transition matrices are often taken as a crucial parameter to specify (discrete-time, homogeneous) Markov chains.

Analogously, for a (not-necessarily homogeneous) discrete-time Markov chain $P$, we might define a family $({}^{P}T_{n})_{n\in\mathbb{N}_{0}}$ of time-dependent corresponding transition matrices, with

{}^{P}T_{n}(x,y)\coloneqq P(X_{n+1}=y\,|\,X_{n}=x)\,,

for all $x,y\in\mathcal{X}$ and $n\in\mathbb{N}_{0}$. Conversely, any family $(T_{n})_{n\in\mathbb{N}_{0}}$ of transition matrices uniquely determines a discrete-time Markov chain $P$ with ${}^{P}T_{n}=T_{n}$ for all $n\in\mathbb{N}_{0}$, again up to the specification of $P(X_{0})$.

In the continuous-time setting, transition matrices are also of great importance. However, it will be instructive to first introduce rate matrices. A rate matrix $Q$ is a linear operator on $\mathbb{R}^{\mathcal{X}}$ such that, for all $x\in\mathcal{X}$, it holds that $Q(x,y)\geq 0$ for all $y\in\mathcal{X}$ with $x\neq y$, and $\sum_{y\in\mathcal{X}}Q(x,y)=0$.

For any rate matrix $Q$ and any $t\in\mathbb{R}_{\geq 0}$, the matrix exponential $e^{Qt}$ of $Qt$ can be defined as [Van Loan, 2006]

e^{Qt}\coloneqq\lim_{n\to+\infty}\bigl(I+\nicefrac{t}{n}Q\bigr)^{n}\,.

An alternative characterization is as the (unique) solution to the matrix ordinary differential equation [Van Loan, 2006]

\frac{\mathrm{d}}{\mathrm{d}s}e^{Qs}=Qe^{Qs}=e^{Qs}Q,\quad\text{with $e^{Q0}=I$.}   (1)

For any $t,s\in\mathbb{R}_{\geq 0}$ it holds that $e^{Q(t+s)}=e^{Qt}e^{Qs}$, and we immediately have $e^{Q0}=I$. The family $(e^{Qt})_{t\in\mathbb{R}_{\geq 0}}$ is therefore called the semigroup generated by $Q$, and $Q$ is called the generator of this semigroup. Moreover, for any rate matrix $Q$ and any $t\in\mathbb{R}_{\geq 0}$, $e^{Qt}$ is a transition matrix [Norris, 1998, Thm 2.1.2].
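To make the limit definition above concrete, here is a minimal Python sketch (the rate matrix $Q$ is just an illustrative example, not taken from the paper) that compares the truncated limit with a library matrix exponential and verifies that the result is a transition matrix.

import numpy as np
from scipy.linalg import expm

# A small, arbitrary rate matrix: non-negative off-diagonal entries, rows sum to zero.
Q = np.array([[-2.0,  1.5,  0.5],
              [ 1.0, -1.0,  0.0],
              [ 0.2,  0.8, -1.0]])
t = 0.7

# Truncation of the limit definition e^{Qt} = lim_n (I + t/n Q)^n.
n = 10_000
T_limit = np.linalg.matrix_power(np.eye(3) + (t / n) * Q, n)

# Reference value from scipy's matrix exponential.
T_exact = expm(Q * t)

print(np.max(np.abs(T_limit - T_exact)))  # small approximation error
print(T_exact.sum(axis=1))                # rows sum to 1: e^{Qt} is a transition matrix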

Now let us consider a continuous-time homogeneous Markov chain $P$, and define the corresponding transition matrix ${}^{P}T_{t}$ (note that in continuous time we always have to specify the transition-time interval $[0,t]$ for these matrices) for all $t\in\mathbb{R}_{\geq 0}$ and $x,y\in\mathcal{X}$ as

{}^{P}T_{t}(x,y)\coloneqq P(X_{t}=y\,|\,X_{0}=x)\,.   (2)

It turns out that there is then a unique rate matrix ${}^{P}\!Q$ associated with $P$ such that ${}^{P}T_{t}=e^{{}^{P}\!Qt}$ for all $t\in\mathbb{R}_{\geq 0}$. By combining Equations (1) and (2), we can identify ${}^{P}\!Q$ as

{}^{P}\!Q=\Bigl(\frac{\mathrm{d}}{\mathrm{d}t}{}^{P}T_{t}\Bigr)\bigg|_{t=0}\,.

As before, in the other direction we have that any fixed rate matrix $Q$ uniquely determines a continuous-time homogeneous Markov chain $P$ with ${}^{P}\!Q=Q$, up to the specification of $P(X_{0})$. For this reason, rate matrices are often used to specify (continuous-time, homogeneous) Markov chains.

Let us finally consider a (not-necessarily homogeneous) continuous-time Markov chain $P$. For any $t,s\in\mathbb{R}_{\geq 0}$ with $t\leq s$, we can then define a transition matrix ${}^{P}T_{t}^{s}$ with, for all $x,y\in\mathcal{X}$, ${}^{P}T_{t}^{s}(x,y)\coloneqq P(X_{s}=y\,|\,X_{t}=x)$. Under appropriate assumptions of differentiability, this induces a family $({}^{P}\!Q_{t})_{t\in\mathbb{R}_{\geq 0}}$ of rate matrices ${}^{P}\!Q_{t}$, as

{}^{P}\!Q_{t}=\Bigl(\frac{\mathrm{d}}{\mathrm{d}s}{}^{P}T_{t}^{s}\Bigr)\bigg|_{s=t}\,.   (3)

In the converse direction we might try to reconstruct the transition matrices of $P$ by solving the matrix ordinary differential equation(s)

\frac{\mathrm{d}}{\mathrm{d}s}{}^{P}T_{t}^{s}={}^{P}T_{t}^{s}\,{}^{P}\!Q_{s},\quad\text{with ${}^{P}T_{t}^{t}=I$.}   (4)

By comparing with Equation (1), we see that in the special case where ${}^{P}\!Q_{s}$ does not depend on $s$ (that is, where $P$ is homogeneous with ${}^{P}\!Q_{s}={}^{P}\!Q$, say) we indeed obtain ${}^{P}T_{t}^{s}=e^{{}^{P}\!Q(s-t)}$. However, in general the non-autonomous system (4) does not have such a closed-form solution, and we cannot move beyond this implicit characterization.

2.3 Hitting Times

We now have all the pieces to introduce the inference problem that is the subject of this work, viz. the expected hitting times of some non-empty set of states $A\subset\mathcal{X}$ with respect to a particular stochastic process. We take this set $A$ to be fixed for the remainder of this work.

In the discrete-time case, we consider the extended real-valued function $\tau_{\mathbb{N}_{0}}:\Omega_{\mathbb{N}_{0}}\to\mathbb{R}_{\geq 0}\cup\{+\infty\}$ given by

\tau_{\mathbb{N}_{0}}(\omega)\coloneqq\inf\bigl\{n\in\mathbb{N}_{0}\,:\,\omega(n)\in A\bigr\}\quad\text{for all $\omega\in\Omega_{\mathbb{N}_{0}}$.}

Here and throughout, we agree that $0\cdot(+\infty)=0$, that $(+\infty)+(+\infty)=+\infty$, and, for any $c\in\mathbb{R}$, that $(+\infty)+c=+\infty$ and $c\cdot(+\infty)=+\infty$ if $c>0$. This function captures the number of steps before a process $P$ “hits” any state in $A$. The expected hitting time for a discrete-time process $P$ starting in $x\in\mathcal{X}$ is then defined as

\mathbb{E}_{P}\bigl[\tau_{\mathbb{N}_{0}}\,|\,X_{0}=x\bigr]\coloneqq\int_{\Omega_{\mathbb{N}_{0}}}\tau_{\mathbb{N}_{0}}(\omega)\,\mathrm{d}P(\omega\,|\,X_{0}=x)\,.

We use $\mathbb{E}_{P}\bigl[\tau_{\mathbb{N}_{0}}\,|\,X_{0}\bigr]$ to denote the extended real-valued function on $\mathcal{X}$ given by $x\mapsto\mathbb{E}_{P}\bigl[\tau_{\mathbb{N}_{0}}\,|\,X_{0}=x\bigr]$. When dealing with homogeneous Markov chains, this quantity has the following simple characterization:

Proposition 1.

[Norris, 1998, Thm 1.3.5] Let $P$ be a discrete-time homogeneous Markov chain with corresponding transition matrix ${}^{P}T$. Then $h\coloneqq\mathbb{E}_{P}\bigl[\tau_{\mathbb{N}_{0}}\,|\,X_{0}\bigr]$ is the minimal non-negative solution to the linear system

h=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\,{}^{P}T h\,.

Throughout, for any $f,g\in\mathbb{R}^{\mathcal{X}}$, the quantity $fg$ is understood as the pointwise product of the functions $f$ and $g$. Strictly speaking, the system above also requires extending the domain of ${}^{P}T$ to extended real-valued functions, but we will shortly introduce some assumptions that obviate such an exposition.
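As a quick numerical sketch of this system (the transition matrix and target set below are made up purely for illustration), the following Python snippet solves the restricted linear system on $A^{c}$, which has a unique solution here because $A$ is reachable from every other state; on $A$ itself the hitting time is zero.

import numpy as np

# States {0, 1, 2} with target set A = {2}; T is a transition matrix (rows sum to 1).
T = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.0, 0.0, 1.0]])
A, Ac = [2], [0, 1]

# With h = 0 on A, the system h = I_{A^c} + I_{A^c} T h reduces on A^c to
# (I - T|_{A^c}) h|_{A^c} = 1.
T_sub = T[np.ix_(Ac, Ac)]
h_Ac = np.linalg.solve(np.eye(len(Ac)) - T_sub, np.ones(len(Ac)))

h = np.zeros(3)
h[Ac] = h_Ac
print(h)  # expected number of steps to reach A, per starting state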

In the continuous-time case, the definition is analogous; we introduce a function $\tau_{\mathbb{R}_{\geq 0}}:\Omega_{\mathbb{R}_{\geq 0}}\to\mathbb{R}_{\geq 0}\cup\{+\infty\}$ as

\tau_{\mathbb{R}_{\geq 0}}(\omega)\coloneqq\inf\bigl\{t\in\mathbb{R}_{\geq 0}\,:\,\omega(t)\in A\bigr\}\quad\text{for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$.}

This function measures the time until a process “hits” any state in $A$ on a given sample path. The expected hitting time for a continuous-time process $P$ starting in $x\in\mathcal{X}$ is

\mathbb{E}_{P}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}=x\bigr]\coloneqq\int_{\Omega_{\mathbb{R}_{\geq 0}}}\tau_{\mathbb{R}_{\geq 0}}(\omega)\,\mathrm{d}P(\omega\,|\,X_{0}=x)\,.

We again use $\mathbb{E}_{P}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr]$ to denote the extended real-valued function on $\mathcal{X}$ given by $x\mapsto\mathbb{E}_{P}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}=x\bigr]$. Also in this case, the characterization for homogeneous Markov chains is particularly simple:

Proposition 2.

[Norris, 1998, Thm 3.3.3] Let $P$ be a continuous-time homogeneous Markov chain with rate matrix ${}^{P}\!Q$ such that ${}^{P}\!Q(x,x)\neq 0$ for all $x\in A^{c}$. Then $h\coloneqq\mathbb{E}_{P}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr]$ is the minimal non-negative solution to

\mathbb{I}_{A}h=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\,{}^{P}\!Q h\,.   (5)
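For a concrete (hypothetical) instance of this system, consider a two-state space $\mathcal{X}=\{x_{1},x_{2}\}$ with $A=\{x_{2}\}$, and a rate matrix with ${}^{P}\!Q(x_{1},x_{1})=-\lambda$ and ${}^{P}\!Q(x_{1},x_{2})=\lambda$ for some $\lambda>0$. On $A$ the system (5) gives $h(x_{2})=0$, and on $A^{c}$ it reduces to $0=1+{}^{P}\!Qh(x_{1})=1-\lambda h(x_{1})$, so that $h(x_{1})=\nicefrac{1}{\lambda}$; this is just the mean of the exponentially distributed holding time in $x_{1}$ before the jump to $x_{2}$.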

3 Imprecise-Markov Chains

Let us now introduce imprecise-Markov chains [Hermans and Škulj, 2014, Škulj, 2015, Krak et al., 2017], which are the stochastic processes that we aim to study in this work. Their characterization is based on the theory of imprecise probabilities [Walley, 1991, Augustin et al., 2014].

We here adopt the “sensitivity analysis” interpretation of imprecise probabilities. This means that we represent an imprecise-Markov chain simply as a set $\mathcal{P}$ of stochastic processes. Intuitively, the idea is that we collect in $\mathcal{P}$ all (traditional, “precise”) stochastic processes that we deem to plausibly capture the dynamics of the underlying system of interest. Inferences with respect to $\mathcal{P}$ are defined using lower- and upper expectations, given respectively as

\underline{\mathbb{E}}_{\mathcal{P}}[\cdot\,|\,\cdot]\coloneqq\inf_{P\in\mathcal{P}}\mathbb{E}_{P}[\cdot\,|\,\cdot]\quad\text{and}\quad\overline{\mathbb{E}}_{\mathcal{P}}[\cdot\,|\,\cdot]\coloneqq\sup_{P\in\mathcal{P}}\mathbb{E}_{P}[\cdot\,|\,\cdot]\,.

So, these inferences represent robust (i.e. conservative) and tight lower and upper bounds on inferences with respect to all stochastic processes that we deem to be plausible.

3.1 Sets of Processes & Types

We already mentioned that an imprecise-Markov chain is, in essence, simply a set $\mathcal{P}$ of stochastic processes. Let us now consider how to define such sets.

We start by considering the discrete-time case; then, clearly, $\mathcal{P}$ will be a set of discrete-time processes. We will parameterize such a set with some non-empty set $\mathcal{T}$ of transition matrices. Our aim is then to include in $\mathcal{P}$ all processes that are in some sense “compatible” with $\mathcal{T}$. (We will not constrain the initial models $P(X_{0})$ of the elements of $\mathcal{P}$, since in any case such a choice would not influence the inferences that we study in this work.) However, at this point we are faced with a choice about which type of processes to include in this set, and these different choices lead to different types of imprecise-Markov chains.

Arguably the conceptually simplest model is $\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}}$, which contains all homogeneous Markov chains $P$ whose corresponding transition matrix is included in $\mathcal{T}$:

\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}}\coloneqq\bigl\{P\in\mathbb{P}^{\mathrm{HM}}_{\mathbb{N}_{0}}\,:\,{}^{P}T\in\mathcal{T}\bigr\}\,.

However, we could instead consider $\mathcal{P}^{\mathrm{M}}_{\mathcal{T}}$, which is the set of all (not-necessarily homogeneous) Markov chains whose time-dependent transition matrices are contained in $\mathcal{T}$:

\mathcal{P}^{\mathrm{M}}_{\mathcal{T}}\coloneqq\bigl\{P\in\mathbb{P}^{\mathrm{M}}_{\mathbb{N}_{0}}\,:\,{}^{P}T_{n}\in\mathcal{T}\,\text{for all $n\in\mathbb{N}_{0}$}\bigr\}\,.

The last choice that we consider here is the set $\mathcal{P}^{\mathrm{I}}_{\mathcal{T}}$, which essentially contains all discrete-time processes whose single-step transition dynamics are described by $\mathcal{T}$. Its characterization is more cumbersome since we have not expressed these general processes in terms of transition matrices, but we can say that it is the set of all $P\in\mathbb{P}_{\mathbb{N}_{0}}$ such that for all $n\in\mathbb{N}_{0}$ and all $x_{0},\ldots,x_{n}\in\mathcal{X}$, there is some $T\in\mathcal{T}$ such that for all $y\in\mathcal{X}$ it holds that

P(X_{n+1}=y\,|\,X_{0}=x_{0},\ldots,X_{n}=x_{n})=T(x_{n},y)\,.

This last type is called an imprecise-Markov chain under epistemic irrelevance, whence the superscript ‘$\mathrm{I}$’.

Note that the three types $\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}}$, $\mathcal{P}^{\mathrm{M}}_{\mathcal{T}}$, and $\mathcal{P}^{\mathrm{I}}_{\mathcal{T}}$ capture not only “plausible” variation in terms of parameter uncertainty, expressed through the set $\mathcal{T}$, but also variation in terms of the structural independence conditions that we consider! So, from an applied perspective, if someone is not sure whether the underlying system that they are studying is truly Markovian and/or time-homogeneous, they might choose to use different such sets in their analysis.

In the continuous-time case, we again proceed analogously. First, we fix a non-empty set $\mathcal{Q}$ of rate matrices, which will be the parameter for our models. We then first consider the set $\mathcal{P}^{\mathrm{HM}}_{\mathcal{Q}}$ of all homogeneous Markov chains whose rate matrix is included in $\mathcal{Q}$:

\mathcal{P}^{\mathrm{HM}}_{\mathcal{Q}}\coloneqq\bigl\{P\in\mathbb{P}_{\mathbb{R}_{\geq 0}}^{\mathrm{HM}}\,:\,{}^{P}\!Q\in\mathcal{Q}\bigr\}\,.

The other two types are constructed in analogy to the discrete-time case, but unfortunately we do not have the space for a complete exposition of their characterization. Instead we refer the interested reader to [Krak et al., 2017, Krak, 2021] for an in-depth study of these different types and comparisons between them; in what follows we limit ourselves to a largely intuitive specification.

The model $\mathcal{P}^{\mathrm{M}}_{\mathcal{Q}}$ is the set of all continuous-time (not-necessarily homogeneous) Markov chains whose transition dynamics are compatible with $\mathcal{Q}$ at every point in time. This includes in particular all Markov chains $P$ satisfying the appropriate differentiability assumptions to meaningfully say that the time-dependent rate matrices ${}^{P}\!Q_{t}$, as in Equation (3), are included in $\mathcal{Q}$ for all $t\in\mathbb{R}_{\geq 0}$. However, $\mathcal{P}^{\mathrm{M}}_{\mathcal{Q}}$ also contains other processes that are not (everywhere) differentiable; see e.g. [Krak, 2021, Sec 4.6 and 5.2] for the technical details.

The most involved model to explain is again $\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$, which includes all continuous-time processes whose time- and history-dependent transition dynamics can be described using elements of $\mathcal{Q}$. It includes, but is not limited to, appropriately differentiable processes $P$ such that for all $n\in\mathbb{N}_{0}$, all $t_{0}<\cdots<t_{n}\in\mathbb{R}_{\geq 0}$, and all $x_{t_{0}},\ldots,x_{t_{n}}\in\mathcal{X}$, there is some $Q\in\mathcal{Q}$ such that for all $y\in\mathcal{X}$ it holds that

\biggl(\frac{\mathrm{d}}{\mathrm{d}s}P(X_{s}=y\,|\,X_{t_{0}}=x_{t_{0}},\ldots,X_{t_{n}}=x_{t_{n}})\biggr)\bigg|_{s=t_{n}}=Q(x_{t_{n}},y)\,.

We again refer to [Krak, 2021, Sec 4.6 and 5.2] for the technical details involving the additional elements of $\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$ that are not appropriately differentiable. Importantly, we note the nested structure [Krak, 2021, Prop 5.9]

\mathcal{P}^{\mathrm{HM}}_{\mathcal{Q}}\subseteq\mathcal{P}^{\mathrm{M}}_{\mathcal{Q}}\subseteq\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}\,,

where the inclusions are strict provided $\mathcal{Q}$ is not trivial.

For notational convenience, we will use identical sub- and superscripts to denote the corresponding lower- and upper expectations for any of these imprecise-Markov chains; e.g., we let $\underline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{T}}[\cdot\,|\,\cdot]\coloneqq\underline{\mathbb{E}}_{\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}}}[\cdot\,|\,\cdot]$.

3.2 Imprecise Transition Dynamics

Let us now introduce some machinery to describe the dynamics of imprecise-Markov chains. In particular, we here move from the set-valued parameters $\mathcal{T}$ and $\mathcal{Q}$ used in Section 3.1 to their dual representations; these are operators that can serve as computational tools.

In Section 3.1, we described discrete-time imprecise-Markov chains using non-empty sets $\mathcal{T}$ of transition matrices. With any such set, we can associate the corresponding lower- and upper transition operators $\underline{T}$ and $\overline{T}$ on $\mathbb{R}^{\mathcal{X}}$, defined respectively as

\underline{T}f\coloneqq\inf_{T\in\mathcal{T}}Tf\quad\text{and}\quad\overline{T}f\coloneqq\sup_{T\in\mathcal{T}}Tf\quad\text{for all $f\in\mathbb{R}^{\mathcal{X}}$.}

More generally, any operator $\underline{T}$ (resp. $\overline{T}$) on $\mathbb{R}^{\mathcal{X}}$ is a lower (resp. upper) transition operator if for all $f,g\in\mathbb{R}^{\mathcal{X}}$, all $\lambda\in\mathbb{R}_{\geq 0}$, and all $x\in\mathcal{X}$, it holds that [De Bock, 2017]

  1. $\min_{y\in\mathcal{X}}f(y)\leq\underline{T}f(x)$ and $\overline{T}f(x)\leq\max_{y\in\mathcal{X}}f(y)$;

  2. $\underline{T}f+\underline{T}g\leq\underline{T}(f+g)$ and $\overline{T}(f+g)\leq\overline{T}f+\overline{T}g$;

  3. $\underline{T}(\lambda f)=\lambda\underline{T}f$ and $\overline{T}(\lambda f)=\lambda\overline{T}f$.

It should be noted that lower- and upper transition operators are conjugate, in that any $\underline{T}$ induces a corresponding upper transition operator $\overline{T}(\cdot)=-\underline{T}(-\cdot)$, and vice versa. Moreover, any transition matrix $T$ is also a lower (and, by its linearity, upper) transition operator.

It is easily verified that the lower- and upper transition operators corresponding to a given non-empty set $\mathcal{T}$ are, indeed, lower- and upper transition operators. Conversely, with a given lower transition operator $\underline{T}$, we can associate the set of transition matrices that dominate it, in the sense that

\mathcal{T}_{\underline{T}}\coloneqq\bigl\{T\,:\,\text{$T$ a transition matrix},\,Tf\geq\underline{T}f\,\text{for all $f\in\mathbb{R}^{\mathcal{X}}$}\bigr\}\,.

This set satisfies the following important properties:

Proposition 3.

[Krak, 2021, Sec 3.4] Let $\underline{T}$ be a lower transition operator with conjugate upper transition operator $\overline{T}(\cdot)=-\underline{T}(-\cdot)$ and dominating set of transition matrices $\mathcal{T}_{\underline{T}}$. Then $\mathcal{T}_{\underline{T}}$ is a non-empty, closed, and convex set of transition matrices that has separately specified rows (a set $\mathcal{M}$ of matrices is said to have separately specified rows if, intuitively, it is closed under the row-wise recombination of its elements; see e.g. [Hermans and Škulj, 2014] for details), and for all $f\in\mathbb{R}^{\mathcal{X}}$ it holds that $\underline{T}f=\inf_{T\in\mathcal{T}_{\underline{T}}}Tf$ and $\overline{T}f=\sup_{T\in\mathcal{T}_{\underline{T}}}Tf$. Moreover, for all $f\in\mathbb{R}^{\mathcal{X}}$ there is some $T\in\mathcal{T}_{\underline{T}}$ such that $Tf=\underline{T}f$, and there is some (possibly different) $T\in\mathcal{T}_{\underline{T}}$ such that $Tf=\overline{T}f$.

Notably, there is a one-to-one relation between non-empty sets of transition matrices that are closed and convex and have separately specified rows, and lower (or upper) transition operators: if $\underline{T}$ is the lower transition operator for the set $\mathcal{T}$, and if $\mathcal{T}$ satisfies these properties, then $\mathcal{T}=\mathcal{T}_{\underline{T}}$ [Krak, 2021, Cor 3.38]. Hence these objects may serve as dual representations for each other.

One reason that this is important is the use of $\underline{T}$ as a computational tool; under the conditions of this duality it holds for any function $f\in\mathbb{R}^{\mathcal{X}}$ and any $n\in\mathbb{N}_{0}$ that [Hermans and Škulj, 2014]

\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{T}}[f(X_{n})\,|\,X_{0}=x]=\underline{\mathbb{E}}^{\mathrm{M}}_{\mathcal{T}}[f(X_{n})\,|\,X_{0}=x]=\underline{T}^{n}f(x)\,,

where $\underline{T}$ is the lower transition operator for $\mathcal{T}$. This reduces the problem of computing such inferences for the imprecise-Markov chains $\mathcal{P}^{\mathrm{M}}_{\mathcal{T}}$ and $\mathcal{P}^{\mathrm{I}}_{\mathcal{T}}$ to solving $n$ independent linear optimization problems over $\mathcal{T}$: first compute $f_{1}\coloneqq\underline{T}f$, then compute $f_{2}\coloneqq\underline{T}f_{1}=\underline{T}^{2}f$, and so forth. Note that this method in general only yields a conservative bound on the corresponding inference for $\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}}$, as the minimizers $T_{k}$ that obtain $T_{k}f_{k-1}=\underline{T}f_{k-1}$ may be different at each step.
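The following minimal Python sketch illustrates this iterative scheme. Purely for illustration, the set $\mathcal{T}$ is taken to be the convex hull of two made-up transition matrices, so that each lower envelope is a pointwise minimum over the two extreme points rather than a general linear program; in the theory above, $\mathcal{T}$ is additionally assumed to have separately specified rows.

import numpy as np

# Two extreme points of an illustrative set of transition matrices on two states.
T1 = np.array([[0.7, 0.3],
               [0.4, 0.6]])
T2 = np.array([[0.5, 0.5],
               [0.1, 0.9]])
extreme_points = [T1, T2]

def lower_T(f):
    """Lower transition operator of the set: the pointwise minimum of Tf."""
    return np.min([T @ f for T in extreme_points], axis=0)

# Iterating the operator gives lower_T^n f; under the duality conditions above,
# this is the lower expectation of f(X_n) given X_0 for the Markov and
# epistemic-irrelevance models.
f = np.array([1.0, 0.0])
for _ in range(5):
    f = lower_T(f)
print(f)  # lower bound on the expectation of f(X_5) given X_0 = x, per state x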

We next consider the dynamics in the continuous-time setting. We proceed analogously to the above: we first consider a non-empty and bounded (in the induced operator norm) set $\mathcal{Q}$ of rate matrices. With this set, we then associate the corresponding lower- and upper rate operators $\underline{Q}$ and $\overline{Q}$ on $\mathbb{R}^{\mathcal{X}}$, defined as

\underline{Q}f\coloneqq\inf_{Q\in\mathcal{Q}}Qf\quad\text{and}\quad\overline{Q}f\coloneqq\sup_{Q\in\mathcal{Q}}Qf\quad\text{for all $f\in\mathbb{R}^{\mathcal{X}}$.}

More generally, any operator $\underline{Q}$ (resp. $\overline{Q}$) on $\mathbb{R}^{\mathcal{X}}$ is a lower (resp. upper) rate operator if for all $f,g\in\mathbb{R}^{\mathcal{X}}$, all $\lambda\in\mathbb{R}_{\geq 0}$ and $\mu\in\mathbb{R}$, and all $x,y\in\mathcal{X}$ with $y\neq x$, it holds that [De Bock, 2017]

  1. $\underline{Q}(\mu\mathbf{1})(x)=0$ and $\overline{Q}(\mu\mathbf{1})(x)=0$;

  2. $\underline{Q}\mathbb{I}_{y}(x)\geq 0$ and $\overline{Q}\mathbb{I}_{y}(x)\geq 0$;

  3. $\underline{Q}f+\underline{Q}g\leq\underline{Q}(f+g)$ and $\overline{Q}(f+g)\leq\overline{Q}f+\overline{Q}g$;

  4. $\underline{Q}(\lambda f)=\lambda\underline{Q}f$ and $\overline{Q}(\lambda f)=\lambda\overline{Q}f$.

As before, such objects are conjugate, in that if $\underline{Q}$ is a lower rate operator, then $\overline{Q}(\cdot)=-\underline{Q}(-\cdot)$ is an upper rate operator. Moreover, any rate matrix $Q$ is also a lower (and upper) rate operator. There is again a duality between lower (or upper) rate operators and sets of rate matrices. For fixed $\underline{Q}$ and with the dominating set of rate matrices $\mathcal{Q}_{\underline{Q}}$ defined as

\mathcal{Q}_{\underline{Q}}\coloneqq\bigl\{Q\,:\,\text{$Q$ a rate matrix},\,Qf\geq\underline{Q}f\,\text{for all $f\in\mathbb{R}^{\mathcal{X}}$}\bigr\}\,,

we have the following result:

Proposition 4.

[Krak, 2021, Sec 6.2] Let $\underline{Q}$ be a lower rate operator with conjugate upper rate operator $\overline{Q}(\cdot)=-\underline{Q}(-\cdot)$ and dominating set of rate matrices $\mathcal{Q}_{\underline{Q}}$. Then $\mathcal{Q}_{\underline{Q}}$ is a non-empty, compact, and convex set of rate matrices that has separately specified rows, and for all $f\in\mathbb{R}^{\mathcal{X}}$ it holds that $\underline{Q}f=\inf_{Q\in\mathcal{Q}_{\underline{Q}}}Qf$ and $\overline{Q}f=\sup_{Q\in\mathcal{Q}_{\underline{Q}}}Qf$. Moreover, for all $f\in\mathbb{R}^{\mathcal{X}}$ there is some $Q\in\mathcal{Q}_{\underline{Q}}$ such that $Qf=\underline{Q}f$, and there is some (possibly different) $Q\in\mathcal{Q}_{\underline{Q}}$ such that $Qf=\overline{Q}f$.

Now fix any lower rate operator $\underline{Q}$ and any $t\in\mathbb{R}_{\geq 0}$, and let

e^{\underline{Q}t}\coloneqq\lim_{n\to+\infty}\bigl(I+\nicefrac{t}{n}\underline{Q}\bigr)^{n}\,.   (6)

The operator $e^{\underline{Q}t}$ is then a lower transition operator [De Bock, 2017], and the family $(e^{\underline{Q}t})_{t\in\mathbb{R}_{\geq 0}}$ is a semigroup of lower transition operators; it satisfies $e^{\underline{Q}(t+s)}=e^{\underline{Q}t}e^{\underline{Q}s}$ for all $t,s\in\mathbb{R}_{\geq 0}$, and $e^{\underline{Q}0}=I$. The analogous construction with an upper rate operator $\overline{Q}$ instead generates a semigroup $(e^{\overline{Q}t})_{t\in\mathbb{R}_{\geq 0}}$ of upper transition operators. When $\underline{Q}$ and $\overline{Q}$ are taken with respect to the same set $\mathcal{Q}$, these semigroups satisfy, for all $t\in\mathbb{R}_{\geq 0}$, $f\in\mathbb{R}^{\mathcal{X}}$, and $Q\in\mathcal{Q}$,

e^{\underline{Q}t}f\leq e^{Qt}f\leq e^{\overline{Q}t}f\,.   (7)

Here the importance again derives from the use as a computational tool; under the conditions of duality between $\mathcal{Q}$ and $\underline{Q}$, we have for any $f\in\mathbb{R}^{\mathcal{X}}$ and any $t\in\mathbb{R}_{\geq 0}$ that [Škulj, 2015, Krak et al., 2017]

\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[f(X_{t})\,|\,X_{0}=x]=\underline{\mathbb{E}}^{\mathrm{M}}_{\mathcal{Q}}[f(X_{t})\,|\,X_{0}=x]=e^{\underline{Q}t}f(x)\,.

Hence such inferences can be numerically computed by approximating (6) with a finite choice of $n$, and then solving $n$ independent linear optimization problems over $\mathcal{Q}$. Error bounds for this scheme are available in the literature [Škulj, 2015, Krak et al., 2017, Erreygers, 2021].
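A minimal Python sketch of this approximation scheme is given below. As before, the set $\mathcal{Q}$ is represented, purely for illustration, by two made-up rate matrices, so that each application of $\underline{Q}$ is a pointwise minimum instead of a general linear program.

import numpy as np

# Two illustrative rate matrices on a two-state space (rows sum to zero).
Q1 = np.array([[-1.0,  1.0],
               [ 2.0, -2.0]])
Q2 = np.array([[-3.0,  3.0],
               [ 0.5, -0.5]])
candidates = [Q1, Q2]

def lower_Q(f):
    """Lower rate operator: the pointwise minimum of Qf over the candidates."""
    return np.min([Q @ f for Q in candidates], axis=0)

def lower_exp(t, f, n=10_000):
    """Approximate e^{underline{Q} t} f by applying (I + t/n underline{Q}) n times."""
    for _ in range(n):
        f = f + (t / n) * lower_Q(f)
    return f

f = np.array([1.0, 0.0])
# Approximates e^{underline{Q} t} f; under the duality conditions this is the
# lower expectation of f(X_t) given X_0 = x, per state x.
print(lower_exp(0.5, f))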

3.3 Class Structure

Let us now fix a set $\mathcal{Q}$ of rate matrices that we will use in the remainder of this work. Throughout, let $\underline{Q}$ and $\overline{Q}$ denote the lower- and upper rate operators associated with $\mathcal{Q}$. We impose several standard regularity conditions on this set: we assume that $\mathcal{Q}$ is non-empty, compact, convex, and that it has separately specified rows. These are common assumptions that are imposed to ensure the duality between $\mathcal{Q}$ and $\underline{Q}$, which in turn guarantees that inferences with the induced imprecise-Markov chains remain well-behaved, as well as analytically (and, often, computationally) tractable.

We now have all the pieces to start studying the inference problem that is the subject of this work: the lower- and upper expected hitting times of the set $A\subset\mathcal{X}$ for continuous-time imprecise-Markov chains described by $\mathcal{Q}$.

Before we begin, let us impose two additional conditions on the dynamics of the system.

Assumption 1.

We assume that all states in $A$ are absorbing, which is equivalent to requiring that $Q(x,x)=0$ for all $Q\in\mathcal{Q}$ and all $x\in A$.

Note that this does not influence the inferences in which we are interested, since those only deal with behavior at times before states in $A$ are reached. However, imposing this explicitly substantially simplifies the analysis.

Next, we assume that the set $A$ is lower reachable from any state $x\in A^{c}$ [De Bock, 2017]. This means that we can construct a sequence $x_{1},\ldots,x_{n+1}\in\mathcal{X}$ starting in any $x_{1}\in A^{c}$ and ending in some $x_{n+1}\in A$ such that, for all $k=1,\ldots,n$, it holds that $\underline{Q}\mathbb{I}_{x_{k+1}}(x_{k})>0$. This is equivalent [De Bock, 2017] to

Assumption 2.

We assume that $e^{\underline{Q}t}\mathbb{I}_{A}(x)>0$ for all $t\in\mathbb{R}_{>0}$ and all $x\in A^{c}$.

Essentially, this means that for all elements of our imprecise-probabilistic models the probability of eventually hitting $A$ is bounded away from zero. This ensures that the expected hitting times remain bounded for all $P\in\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$, so that we can avoid any extended real-valued analysis. It also implies that for all $Q\in\mathcal{Q}$ we have that $Q(x,x)\neq 0$ for all $x\in A^{c}$, which is relevant to meet the precondition of Proposition 2. As a practical point, De Bock [2017] gives an algorithm to check whether a given set $\mathcal{Q}$ satisfies this condition; a direct check based on the definition of lower reachability above is sketched below.
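As a rough illustration (this is not De Bock's algorithm itself, and $\mathcal{Q}$ is again represented by two made-up rate matrices), the following Python sketch checks lower reachability directly from the definition above, by growing the set of states from which $A$ can be reached until it stabilizes.

import numpy as np

# Illustrative rate matrices on three states, with A = {1} absorbing.
Q1 = np.array([[-1.0,  0.5,  0.5],
               [ 0.0,  0.0,  0.0],
               [ 1.0,  1.0, -2.0]])
Q2 = np.array([[-2.0,  1.0,  1.0],
               [ 0.0,  0.0,  0.0],
               [ 0.5,  0.5, -1.0]])
candidates = [Q1, Q2]
n_states = 3

def lower_Q(f):
    return np.min([Q @ f for Q in candidates], axis=0)

def A_is_lower_reachable(A):
    """True if A is lower reachable from every state outside A."""
    B = set(A)                      # states from which A is (lower) reachable
    changed = True
    while changed:
        changed = False
        for x in set(range(n_states)) - B:
            # x joins B if it has a positive lower rate into some state already in B.
            for y in B:
                if lower_Q(np.eye(n_states)[y])[x] > 0:
                    B.add(x)
                    changed = True
                    break
    return B == set(range(n_states))

print(A_is_lower_reachable(A=[1]))  # True for this example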

On a technical level, Assumption 2 is the crucial one for our results, and, unlike Assumption 1, it cannot really be ignored in practice. However, based on earlier work by Krak et al. [2019] in the discrete-time setting, we hope in the future to strengthen our results to hold without this assumption.

4 Subspace Dynamics

In the context of hitting times, the interesting behavior of a process actually occurs before it has reached a target state in $A$. Hence it will be useful to introduce some machinery to study the transition dynamics as they relate to the states in $A^{c}$.

To introduce the notation in a general way, choose any non-empty $\mathcal{Y}\subset\mathcal{X}$. Then for any $f\in\mathbb{R}^{\mathcal{X}}$, let $f|_{\mathcal{Y}}\in\mathbb{R}^{\mathcal{Y}}$ denote the restriction of $f$ to $\mathcal{Y}$. Conversely, for any $f\in\mathbb{R}^{\mathcal{Y}}$, let $f\!\uparrow_{\mathcal{X}}\in\mathbb{R}^{\mathcal{X}}$ denote the unique extension of $f$ to $\mathcal{X}$ that satisfies $f\!\uparrow_{\mathcal{X}}(x)=0$ for all $x\in\mathcal{X}\setminus\mathcal{Y}$. Moreover, for any operator $M$ on $\mathbb{R}^{\mathcal{X}}$, we define the operator $M|_{\mathcal{Y}}$ on $\mathbb{R}^{\mathcal{Y}}$ as

M|_{\mathcal{Y}}f\coloneqq\bigl(M(f\!\uparrow_{\mathcal{X}})\bigr)|_{\mathcal{Y}}\quad\text{for all $f\in\mathbb{R}^{\mathcal{Y}}$.}

This somewhat verbose notation is perhaps most easily understood when $M$ is a linear operator, i.e. a matrix. In that case, $M|_{\mathcal{Y}}$ is simply the $\lvert\mathcal{Y}\rvert\times\lvert\mathcal{Y}\rvert$ sub-matrix of $M$ on the coordinates in $\mathcal{Y}$. The definition above allows us to extend this notion also to non-linear operators, and to lower- and upper transition and rate operators specifically.

Now for any rate matrix $Q\in\mathcal{Q}$, we call $G\coloneqq Q|_{A^{c}}$ its corresponding subgenerator. For any $t\in\mathbb{R}_{\geq 0}$, we then define $e^{Gt}\coloneqq e^{Qt}|_{A^{c}}$. We have the following result:

Proposition 5.

Fix $Q\in\mathcal{Q}$ and let $G$ be its subgenerator. Then $e^{Gt}=\lim_{n\to+\infty}\bigl(I+\nicefrac{t}{n}G\bigr)^{n}$ for all $t\in\mathbb{R}_{\geq 0}$. Moreover, the family $(e^{Gt})_{t\in\mathbb{R}_{\geq 0}}$ is a semigroup.

Analogously, we define $\underline{G}\coloneqq\underline{Q}|_{A^{c}}$ and $\overline{G}\coloneqq\overline{Q}|_{A^{c}}$ to be the lower- and upper subgenerators corresponding to $\underline{Q}$ and $\overline{Q}$, respectively. We also let $e^{\underline{G}t}\coloneqq e^{\underline{Q}t}|_{A^{c}}$ and $e^{\overline{G}t}\coloneqq e^{\overline{Q}t}|_{A^{c}}$. Perhaps unsurprisingly, we then have:

Proposition 6.

It holds that $e^{\underline{G}t}=\lim_{n\to+\infty}\bigl(I+\nicefrac{t}{n}\underline{G}\bigr)^{n}$ and $e^{\overline{G}t}=\lim_{n\to+\infty}\bigl(I+\nicefrac{t}{n}\overline{G}\bigr)^{n}$ for all $t\in\mathbb{R}_{\geq 0}$. Moreover, the families $(e^{\underline{G}t})_{t\in\mathbb{R}_{\geq 0}}$ and $(e^{\overline{G}t})_{t\in\mathbb{R}_{\geq 0}}$ are semigroups.

Our Assumption 2 implies the following norm bound:

Proposition 7.

For any $t>0$, it holds that $\lVert e^{\overline{G}t}\rVert<1$.

It is a straightforward consequence of the use of the supremum norm, together with Equation (7) and the fact that $e^{Qt}$ and $e^{\overline{Q}t}$ are (upper) transition operators, that also $\lVert e^{Gt}\rVert\leq\lVert e^{\overline{G}t}\rVert<1$ for all $t\in\mathbb{R}_{>0}$. Hence by the semigroup property we immediately have that $\lim_{t\to+\infty}\lVert e^{Gt}\rVert=0$. This also implies the following well-known result.

Proposition 8.

[Taylor and Lay, 1958, Thm IV.1.4] For any $Q\in\mathcal{Q}$ with subgenerator $G$, and all $t>0$, the inverse operator $(I-e^{Gt})^{-1}$ exists, and $(I-e^{Gt})^{-1}=\sum_{k=0}^{+\infty}e^{Gtk}$.

This allows us to characterize the hitting times of discrete-time homogeneous Markov chains whose transition matrix is given by $e^{Qt}$, as follows.

Proposition 9.

Choose any $Q\in\mathcal{Q}$, let $G$ be its subgenerator, and fix any $\Delta>0$. Let $P\in\mathbb{P}^{\mathrm{HM}}_{\mathbb{N}_{0}}$ be such that ${}^{P}T=e^{Q\Delta}$. Then the expected hitting times $h\coloneqq\mathbb{E}_{P}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]$ satisfy $h|_{A^{c}}=(I-e^{G\Delta})^{-1}\mathbf{1}$ and $h(x)=0$ for all $x\in A$.

Proof.

By Proposition 1, in $x\in A^{c}$ we have that

h(x)=\mathbb{I}_{A^{c}}(x)+\mathbb{I}_{A^{c}}(x)e^{Q\Delta}h(x)=1+e^{Q\Delta}h(x)\,.

Conversely, it is immediate from the definition that $h(x)=0$ for all $x\in A$. This implies that $h=(h|_{A^{c}})\!\uparrow_{\mathcal{X}}$, and hence

h|_{A^{c}}=\mathbf{1}+\bigl(e^{Q\Delta}(h|_{A^{c}})\!\uparrow_{\mathcal{X}}\bigr)|_{A^{c}}=\mathbf{1}+e^{G\Delta}h|_{A^{c}}\,.

Re-ordering terms we have $(I-e^{G\Delta})h|_{A^{c}}=\mathbf{1}$. Now use Proposition 8 and multiply with $(I-e^{G\Delta})^{-1}$. ∎

We need the following observation:

Lemma 1.

Consider any $Q\in\mathcal{Q}$ with subgenerator $G$, and let $\sigma(G)$ be the set of eigenvalues of $G$. Then $\mathrm{Re}\,\lambda<0$ for all $\lambda\in\sigma(G)$.

This implies that $0\notin\sigma(G)$, and so we have:

Corollary 1.

For any $Q\in\mathcal{Q}$ with subgenerator $G$, the inverse operator $G^{-1}$ exists.

This allows us to characterize hitting times for continuous-time homogeneous Markov chains:

Proposition 10.

Choose any $Q\in\mathcal{Q}$, let $G$ be its subgenerator, and let $P\in\mathbb{P}_{\mathbb{R}_{\geq 0}}^{\mathrm{HM}}$ with ${}^{P}\!Q=Q$. Then the expected hitting times $h\coloneqq\mathbb{E}_{P}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]$ satisfy $h|_{A^{c}}=-G^{-1}\mathbf{1}$ and $h(x)=0$ for all $x\in A$.

Proof.

By Proposition 2, in $x\in A^{c}$ we have that

-1=-\mathbb{I}_{A^{c}}(x)=\mathbb{I}_{A^{c}}(x)Qh(x)=Qh(x)\,.

Conversely, it is immediate from the definition that $h(x)=0$ for all $x\in A$. This implies $h=(h|_{A^{c}})\!\uparrow_{\mathcal{X}}$, and hence

Gh|_{A^{c}}=\bigl(Q(h|_{A^{c}})\!\uparrow_{\mathcal{X}}\bigr)|_{A^{c}}=(Qh)|_{A^{c}}=-\mathbf{1}\,.

Now use Corollary 1 and multiply with $G^{-1}$. ∎

4.1 Quasicontractivity of Subspace Dynamics

We already know from Proposition 7 that $\lVert e^{\overline{G}t}\rVert<1$ for all $t\in\mathbb{R}_{>0}$. Since $e^{\overline{G}0}=I$ (because it is a semigroup), it follows that $\lVert e^{\overline{G}t}\rVert\leq 1$ for all $t\in\mathbb{R}_{\geq 0}$. A semigroup that satisfies this property is said to be contractive. Moreover, Proposition 7 together with the semigroup property implies that $\lim_{t\to+\infty}\lVert e^{\overline{G}t}\rVert=0$. A semigroup that satisfies this property is said to be uniformly exponentially stable, and in such a case the following result holds:

Proposition 11.

There are $M\geq 1$ and $\xi>0$ such that $\lVert e^{\overline{G}t}\rVert\leq Me^{-\xi t}$ for all $t\in\mathbb{R}_{\geq 0}$.

This result means that the norm $\lVert e^{\overline{G}t}\rVert$ decays exponentially as $t$ grows. However, for technical reasons we require an exponentially decaying norm bound with $M=1$; if this holds, the semigroup is said to be quasicontractive.

It is not clear that such a bound can be obtained when $\lVert e^{\overline{G}t}\rVert$ is induced by the supremum norm $\lVert\cdot\rVert$ on $\mathbb{R}^{A^{c}}$. However, we can get it by defining a different norm $\lVert\cdot\rVert_{*}$ on $\mathbb{R}^{A^{c}}$; we then obtain quasicontractivity with respect to the induced operator norm $\lVert\cdot\rVert_{*}$. Because $\mathbb{R}^{A^{c}}$ is finite-dimensional these norms are equivalent, and such a result suffices for our purposes. This re-norming trick is originally due to Feller [1953], and an analogous construction is commonly used for semigroups of linear operators; see e.g. [Renardy and Rogers, 2006, Thm 12.21].

So, consider the $\xi>0$ from Proposition 11, and let

\lVert f\rVert_{*}\coloneqq\sup_{t\in\mathbb{R}_{\geq 0}}\bigl\lVert e^{\xi t}e^{\overline{G}t}\lvert f\rvert\bigr\rVert\quad\text{for all $f\in\mathbb{R}^{A^{c}}$,}   (8)

where $\lvert f\rvert$ denotes the elementwise absolute value of $f$.

Proposition 12.

The map $f\mapsto\lVert f\rVert_{*}$ is a norm on $\mathbb{R}^{A^{c}}$.

Moreover, we have the desired result:

Proposition 13.

We have $\lVert e^{\overline{G}t}\rVert_{*}\leq e^{-\xi t}$ for all $t\in\mathbb{R}_{\geq 0}$.

Finally, the same bound holds for precise models:

Proposition 14.

For any $Q\in\mathcal{Q}$ with subgenerator $G$ it holds that $\lVert e^{Gt}\rVert_{*}\leq e^{-\xi t}$ for all $t\in\mathbb{R}_{\geq 0}$.

5 Hitting Times as Limits

We now have all the pieces to explain the proof of our main results. The trick will be to establish a connection between hitting times for continuous-time imprecise-Markov chains, and hitting times for discrete-time imprecise-Markov chains, for which analogous results were previously established by Krak et al. [2019].

We essentially just look at a discretized continuous-time Markov chain taking steps of some fixed size $\Delta>0$, derive the expected hitting time for this discrete-time Markov chain, and then take the limit $\Delta\to 0^{+}$. The main difficulty is in establishing that this converges uniformly for all elements in our sets of processes; this is why we went through the trouble of establishing quasicontractivity in Section 4.1.

To start, for any $Q\in\mathcal{Q}$ and $\Delta>0$, let $h^{Q}_{\Delta}$ be the minimal non-negative solution to the linear system

h^{Q}_{\Delta}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}e^{Q\Delta}h^{Q}_{\Delta}\,,   (9)

and let $h^{Q}$ be the minimal non-negative solution to

\mathbb{I}_{A}h^{Q}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Qh^{Q}\,.   (10)

Note the re-scaled term $\Delta\mathbb{I}_{A^{c}}$ on the right-hand side of (9), which distinguishes it from the system in Proposition 1; this is required because the hitting times of discrete-time Markov chains count numbers of steps, and to pass to continuous time we need to account for the size of these steps. We thus know from Propositions 1 and 2 that $\nicefrac{1}{\Delta}h^{Q}_{\Delta}$ represents the expected hitting times of a discrete-time homogeneous Markov chain with transition matrix $e^{Q\Delta}$, and that $h^{Q}$ does the same for a continuous-time homogeneous Markov chain with rate matrix $Q$. We now have the following result:

Proposition 15.

There are $\delta>0$ and $L>0$ such that $\lVert h^{Q}_{\Delta}-h^{Q}\rVert<\Delta L\lVert h^{Q}\rVert$ for all $\Delta\in(0,\delta)$ and all $Q\in\mathcal{Q}$.

Since $\lVert h^{Q}\rVert$ is bounded due to Proposition 10, this yields:

Corollary 2.

We have $\lim_{\Delta\to 0^{+}}h^{Q}_{\Delta}=h^{Q}$ for all $Q\in\mathcal{Q}$.
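For a single (made-up) rate matrix, the following Python sketch compares the closed-form hitting times from Proposition 10 with the discretized quantities from Equation (9), whose restriction to $A^{c}$ equals $\Delta(I-e^{G\Delta})^{-1}\mathbf{1}$ by the argument in the proof of Proposition 9; the error indeed shrinks roughly linearly in $\Delta$, in line with Proposition 15.

import numpy as np
from scipy.linalg import expm

# A single illustrative rate matrix with A = {2} absorbing; G is its subgenerator.
Q = np.array([[-1.0,  0.5,  0.5],
              [ 0.3, -0.8,  0.5],
              [ 0.0,  0.0,  0.0]])
Ac = [0, 1]
G = Q[np.ix_(Ac, Ac)]
ones = np.ones(len(Ac))

# Continuous-time hitting times (Proposition 10): h|_{A^c} = -G^{-1} 1.
h = np.linalg.solve(-G, ones)

# Discretized hitting times (Equation (9)): h_Delta|_{A^c} = Delta (I - e^{G Delta})^{-1} 1.
for delta in [1.0, 0.1, 0.01, 0.001]:
    h_delta = delta * np.linalg.solve(np.eye(len(Ac)) - expm(G * delta), ones)
    print(delta, np.max(np.abs(h_delta - h)))  # error decreases roughly linearly in Delta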

We will now set up the analogous results for imprecise-Markov chains. First, let

\underline{h}\coloneqq\inf_{Q\in\mathcal{Q}}h^{Q}\quad\text{and}\quad\overline{h}\coloneqq\sup_{Q\in\mathcal{Q}}h^{Q}\,.   (11)

Clearly, it follows from Proposition 2 and the definition of lower- and upper expectations that these quantities represent the lower- and upper expected hitting times for the imprecise-Markov chain $\mathcal{P}_{\mathcal{Q}}^{\mathrm{HM}}$, i.e. it holds that

\underline{h}=\underline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{HM}}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr]\quad\text{and}\quad\overline{h}=\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{HM}}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr]\,.

Now for any $\Delta>0$, let $\underline{h}_{\Delta}$ and $\overline{h}_{\Delta}$ denote the minimal non-negative solutions to the non-linear systems

\underline{h}_{\Delta}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}e^{\underline{Q}\Delta}\underline{h}_{\Delta}   (12)

and

\overline{h}_{\Delta}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}e^{\overline{Q}\Delta}\overline{h}_{\Delta}\,.   (13)

It was previously shown by Krak et al. [2019] that, up to re-scaling with $\nicefrac{1}{\Delta}$, the quantities $\underline{h}_{\Delta}$ and $\overline{h}_{\Delta}$ represent the lower (resp. upper) expected hitting times of, identically, the discrete-time imprecise-Markov chains $\mathcal{P}_{\mathcal{T}_{\Delta}}^{\mathrm{HM}}$, $\mathcal{P}_{\mathcal{T}_{\Delta}}^{\mathrm{M}}$, and $\mathcal{P}_{\mathcal{T}_{\Delta}}^{\mathrm{I}}$ parameterized by the set $\mathcal{T}_{\Delta}$ of transition matrices that dominate $e^{\underline{Q}\Delta}$. We now set out to prove an analogous result for continuous-time imprecise-Markov chains. We start with the following:

Proposition 16.

It holds that $\lim_{\Delta\to 0^{+}}\underline{h}_{\Delta}=\underline{h}$ and $\lim_{\Delta\to 0^{+}}\overline{h}_{\Delta}=\overline{h}$.

This property allows us to leverage recent results by Erreygers [2021] and Krak [2021] regarding discrete and finite approximations of lower- and upper expectations in continuous-time imprecise-Markov chains, to obtain our first main result:

Theorem 1.

It holds that

\underline{h}=\underline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{HM}}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr]=\underline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{M}}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr]=\underline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr]\,,

and, moreover, that

\overline{h}=\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{HM}}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr]=\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{M}}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr]=\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}\bigl[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr]\,.

Moreover, it follows relatively straightforwardly from Proposition 16 that the lower- and upper expected hitting times for continuous-time imprecise-Markov chains satisfy an immediate generalization of the system that characterizes the expected hitting times for (precise) continuous-time homogeneous Markov chains. This is our second main result:

Theorem 2.

Let h¯\underline{h} and h¯\overline{h} denote the lower- and upper expected hitting times for any one of 𝒫𝒬HM\mathcal{P}^{\mathrm{HM}}_{\mathcal{Q}}, 𝒫𝒬M\mathcal{P}^{\mathrm{M}}_{\mathcal{Q}}, or 𝒫𝒬I\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}. Then h¯\underline{h} is the minimal non-negative solution to the non-linear system 𝕀Ah¯=𝕀Ac+𝕀AcQ¯h¯\mathbb{I}_{A}\underline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\underline{Q}\,\underline{\vphantom{Q}h}, and h¯\overline{h} is the minimal non-negative solution to the non-linear system 𝕀Ah¯=𝕀Ac+𝕀AcQ¯h¯\mathbb{I}_{A}\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,\overline{h}.
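As a small worked illustration of Theorem 2, consider a two-state example of our own choosing (it is not taken from the paper): let $\mathcal{X}=\{1,2\}$ with $A=\{2\}$, and let $\mathcal{Q}$ consist of all rate matrices that move mass from state $1$ to the absorbing state $2$ at some rate $\lambda\in[\underline{\lambda},\overline{\lambda}]$, where $0<\underline{\lambda}\leq\overline{\lambda}$. Since $\underline{h}(2)=0$ and $\underline{h}(1)\geq 0$, the lower rate operator evaluates on $A^{c}=\{1\}$ to

$\underline{Q}\,\underline{h}(1)=\inf_{\lambda\in[\underline{\lambda},\overline{\lambda}]}\bigl(-\lambda\,\underline{h}(1)\bigr)=-\overline{\lambda}\,\underline{h}(1)\,,$

so the system $\mathbb{I}_{A}\underline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\underline{Q}\,\underline{h}$ reduces to $0=1-\overline{\lambda}\,\underline{h}(1)$, giving $\underline{h}(1)=\nicefrac{1}{\overline{\lambda}}$; completely analogously, $\overline{h}(1)=\nicefrac{1}{\underline{\lambda}}$. This matches the intuition that the fastest admissible escape rate yields the smallest expected hitting time, and vice versa.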

6 Summary & Conclusion

We have investigated the problem of characterizing expected hitting times for continuous-time imprecise-Markov chains. We have shown that under two relatively mild assumptions on the system’s class structure—viz. that the target states are absorbing, and can be reached by any non-target state—the corresponding lower (resp. upper) expected hitting time is the same for all three types of imprecise-Markov chains.

We have also demonstrated that these lower- and upper expected hitting times h¯\underline{h} and h¯\overline{h} satisfy the non-linear systems

𝕀Ah¯=𝕀Ac+𝕀AcQ¯h¯and𝕀Ah¯=𝕀Ac+𝕀AcQ¯h¯,\mathbb{I}_{A}\underline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\underline{Q}\,\underline{\vphantom{Q}h}\quad\text{and}\quad\mathbb{I}_{A}\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,\overline{h}\,,

in analogy with the precise linear system (5). Indeed, we conclude that the lower- and upper expected hitting times for any of these three types of imprecise-Markov chains can be fully characterized as the unique minimal non-negative solutions to these respective systems.

We aim to strengthen these results in future work to hold with fewer assumptions on the system’s class structure.

Acknowledgements

We would like to sincerely thank Jasper De Bock for many stimulating discussions on the subject of imprecise-Markov chains, and for pointing out a technical error in an earlier draft of this work. We are also grateful for the constructive feedback of three anonymous reviewers.

References

  • Augustin et al. [2014] Thomas Augustin, Frank P. A. Coolen, Gert de Cooman, and Matthias C. M. Troffaes, editors. Introduction to Imprecise Probabilities. John Wiley & Sons, 2014.
  • De Bock [2017] Jasper De Bock. The limit behaviour of imprecise continuous-time Markov chains. Journal of Nonlinear Science, 27(1):159–196, 2017.
  • Engel and Nagel [2000] Klaus-Jochen Engel and Rainer Nagel. One-parameter semigroups for linear evolution equations, volume 194. Springer, 2000.
  • Erreygers [2021] Alexander Erreygers. Markovian Imprecise Jump Processes: Foundations, Algorithms and Applications. PhD thesis, Ghent University, 2021.
  • Feller [1953] William Feller. On the generation of unbounded semi-groups of bounded linear operators. Annals of Mathematics, pages 166–174, 1953.
  • Hermans and Škulj [2014] Filip Hermans and Damjan Škulj. Stochastic processes. In Thomas Augustin, Frank P. A. Coolen, Gert de Cooman, and Matthias C. M. Troffaes, editors, Introduction to Imprecise Probabilities, chapter 11. Wiley, 2014.
  • Krak [2020] Thomas Krak. Computing expected hitting times for imprecise Markov chains. In International Conference on Uncertainty Quantification & Optimisation, pages 185–205. Springer, 2020.
  • Krak [2021] Thomas Krak. Continuous-Time Imprecise-Markov Chains: Theory and Algorithms. PhD thesis, Ghent University, 2021.
  • Krak et al. [2017] Thomas Krak, Jasper De Bock, and Arno Siebes. Imprecise continuous-time Markov chains. International Journal of Approximate Reasoning, 88:452–528, 2017.
  • Krak et al. [2019] Thomas Krak, Natan T’Joens, and Jasper De Bock. Hitting times and probabilities for imprecise Markov chains. In Proceedings of ISIPTA 2019, volume 103 of Proceedings of Machine Learning Research, pages 265–275. PMLR, 2019.
  • Norris [1998] James Robert Norris. Markov chains. Cambridge University Press, 1998.
  • Renardy and Rogers [2006] Michael Renardy and Robert C. Rogers. An introduction to partial differential equations. Springer Science & Business Media, 2006.
  • Škulj [2015] Damjan Škulj. Efficient computation of the bounds of continuous time imprecise Markov chains. Applied Mathematics and Computation, 250:165–180, 2015.
  • Taylor and Lay [1958] Angus E. Taylor and David C. Lay. Introduction to functional analysis, volume 1. Wiley, New York, 1958.
  • Van Loan [2006] Charles F. Van Loan. A study of the matrix exponential. 2006.
  • Walley [1991] Peter Walley. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, London, 1991.

Appendix A Proofs and Lemmas for Section 4

For certain operators, we note that subspace restriction distributes over operator composition:

Lemma 2.

Let MM and NN be operators on 𝒳\smash{\mathbb{R}^{\mathcal{X}}} such that N|A=IN|_{A}=I. Then (MN)|Ac=M|AcN|Ac\bigl{(}MN\bigr{)}|_{A^{c}}=M|_{A^{c}}N|_{A^{c}}.

Proof.

Fix any fAcf\in\smash{\mathbb{R}^{A^{c}}}. Then

M|AcN|Acf=M(((Nf𝒳)|Ac)𝒳)|Ac.\displaystyle M|_{A^{c}}N|_{A^{c}}f=M\Bigl{(}\bigl{(}(Nf\!\!\uparrow_{\mathcal{X}})|_{A^{c}}\bigr{)}\!\!\uparrow_{\mathcal{X}}\Bigr{)}|_{A^{c}}\,.

Note that since N|A=IN|_{A}=I and f𝒳(x)=0f\!\!\uparrow_{\mathcal{X}}(x)=0 for all xAx\in A, we also have Nf𝒳(x)=0Nf\!\!\uparrow_{\mathcal{X}}(x)=0 for all xAx\in A. Hence in particular, it holds that ((Nf𝒳)|Ac)𝒳=Nf𝒳\bigl{(}(Nf\!\!\uparrow_{\mathcal{X}})|_{A^{c}}\bigr{)}\!\!\uparrow_{\mathcal{X}}=Nf\!\!\uparrow_{\mathcal{X}}. We therefore find that

M|AcN|Acf\displaystyle M|_{A^{c}}N|_{A^{c}}f =(MNf𝒳)|Ac=(MN)|Acf,\displaystyle=\bigl{(}MNf\!\!\uparrow_{\mathcal{X}}\bigr{)}|_{A^{c}}=(MN)|_{A^{c}}f\,,

which concludes the proof. ∎
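To make the restriction notation concrete, here is a small numerical check of Lemma 2 for the special case of linear operators (matrices), where $M|_{A^{c}}$ is simply the submatrix of $M$ with rows and columns indexed by $A^{c}$, and the hypothesis $N|_{A}=I$ amounts to the rows of $N$ indexed by $A$ being identity rows. The matrices themselves are randomly generated and purely illustrative.

```python
# A minimal numerical check of Lemma 2 in the linear (matrix) case, on a
# three-state space with A = {2}.
import numpy as np

A, Ac = [2], [0, 1]
rng = np.random.default_rng(0)

M = rng.normal(size=(3, 3))
N = rng.normal(size=(3, 3))
N[2, :] = np.eye(3)[2]            # enforce N|_A = I

restrict = lambda X: X[np.ix_(Ac, Ac)]
lhs = restrict(M @ N)             # (MN)|_{A^c}
rhs = restrict(M) @ restrict(N)   # M|_{A^c} N|_{A^c}
assert np.allclose(lhs, rhs)
```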

This can be used in particular for certain operators associated with Q𝒬Q\in\mathcal{Q} and the associated lower- and upper rate operators:

Lemma 3.

Fix any Δ0\Delta\geq 0 and any Q𝒬Q\in\mathcal{Q}. Then

(I+ΔQ)|A=(I+ΔQ¯)|A=(I+ΔQ¯)|A=I.(I+\Delta Q)|_{A}=(I+\Delta\underline{Q})|_{A}=(I+\Delta\overline{Q})|_{A}=I\,.
Proof.

Fix any Q𝒬Q\in\mathcal{Q}, and first choose any f𝒳f\in\smash{\mathbb{R}^{\mathcal{X}}} and xAx\in A. By Assumption 1 and the definition of rate matrices, we have Q(x,y)=0Q(x,y)=0 for all y𝒳y\in\mathcal{X}, whence Qf(x)=y𝒳Q(x,y)f(y)=0Qf(x)=\sum_{y\in\mathcal{X}}Q(x,y)f(y)=0. Since Q𝒬Q\in\mathcal{Q} is arbitrary, we also have Q¯f(x)=0\underline{Q}f(x)=0 and Q¯f(x)=0\overline{Q}f(x)=0. It follows that

f(x)=(I+ΔQ)f(x)=(I+ΔQ¯)f(x)=(I+ΔQ¯)f(x).f(x)=(I+\Delta Q)f(x)=(I+\Delta\underline{Q})f(x)=(I+\Delta\overline{Q})f(x)\,.

Since this is true for all f𝒳f\in\smash{\mathbb{R}^{\mathcal{X}}} and all xAx\in A, the result is now immediate. ∎

Corollary 3.

For all Q𝒬Q\in\mathcal{Q} and t0t\in\mathbb{R}_{\geq 0} it holds that

eQt|A=eQ¯t|A=eQ¯t|A=I.e^{Qt}|_{A}=e^{\underline{Q}t}|_{A}=e^{\overline{Q}t}|_{A}=I\,.
Proof.

Use Lemma 3 and the definitions of eQte^{Qt}, eQ¯te^{\underline{Q}t}, eQ¯te^{\overline{Q}t}. ∎

Lemma 4.

Let MM and NN be operators on 𝒳\smash{\mathbb{R}^{\mathcal{X}}} such that M|A=I=N|AM|_{A}=I=N|_{A}. Then M|AcN|AcMN\left\lVert M|_{A^{c}}-N|_{A^{c}}\right\rVert\leq\left\lVert M-N\right\rVert.

Proof.

Fix any fAcf\in\smash{\mathbb{R}^{A^{c}}} with f=1\left\lVert f\right\rVert=1. Then f𝒳=1\left\lVert f\!\!\uparrow_{\mathcal{X}}\right\rVert=1. Moreover, since f𝒳(x)=0f\!\!\uparrow_{\mathcal{X}}(x)=0 for all xAx\in A and since M|A=I=N|AM|_{A}=I=N|_{A}, we have that (Mf𝒳)(x)=0=(Nf𝒳)(x)(Mf\!\!\uparrow_{\mathcal{X}})(x)=0=(Nf\!\!\uparrow_{\mathcal{X}})(x) for all xAx\in A. Hence we find

(M|AcN|Ac)f\displaystyle\left\lVert(M|_{A^{c}}-N|_{A^{c}})f\right\rVert
=((MN)f𝒳)|Ac\displaystyle\quad\quad=\left\lVert\bigl{(}(M-N)f\!\!\uparrow_{\mathcal{X}}\bigr{)}|_{A^{c}}\right\rVert
=(MN)f𝒳\displaystyle\quad\quad=\left\lVert(M-N)f\!\!\uparrow_{\mathcal{X}}\right\rVert
sup{(MN)g:g𝒳,g=1}\displaystyle\quad\quad\leq\sup\bigl{\{}\left\lVert(M-N)g\right\rVert\,:\,g\in\smash{\mathbb{R}^{\mathcal{X}}},\left\lVert g\right\rVert=1\bigr{\}}
=MN.\displaystyle\quad\quad=\left\lVert M-N\right\rVert\,.

The result follows since fAcf\in\smash{\mathbb{R}^{A^{c}}} is arbitrary. ∎

Proof of Proposition 5.

Fix Q𝒬Q\in\mathcal{Q} and let GG be its subgenerator. First fix any t0t\in\mathbb{R}_{\geq 0} and any ϵ>0\epsilon>0. Then by definition of eQte^{Qt}, for all nn\in\mathbb{N} large enough it holds that

eQt(I+t/nQ)n<ϵ.\left\lVert e^{Qt}-\bigl{(}I+\nicefrac{{t}}{{n}}Q\bigr{)}^{n}\right\rVert<\epsilon\,.

Moreover, by Lemmas 2 and 3 we have

((I+t/nQ)n)|Ac=(I+t/nG)n,\Bigl{(}\bigl{(}I+\nicefrac{{t}}{{n}}Q\bigr{)}^{n}\Bigr{)}|_{A^{c}}=\bigl{(}I+\nicefrac{{t}}{{n}}G\bigr{)}^{n}\,,

and, by Corollary 3, that eQt|A=Ie^{Qt}|_{A}=I. Hence by Lemma 4 we find

eGt(I+t/nG)neQt(I+t/nQ)n<ϵ.\left\lVert e^{Gt}-\bigl{(}I+\nicefrac{{t}}{{n}}G\bigr{)}^{n}\right\rVert\leq\left\lVert e^{Qt}-\bigl{(}I+\nicefrac{{t}}{{n}}Q\bigr{)}^{n}\right\rVert<\epsilon\,.

Since ϵ>0\epsilon>0 is arbitrary, we have

eGt=limn+(I+t/nG)n.e^{Gt}=\lim_{n\to+\infty}\bigl{(}I+\nicefrac{{t}}{{n}}G\bigr{)}^{n}\,.

This concludes the proof of the first claim.

To see that (eGt)t0(e^{Gt})_{t\in\mathbb{R}_{\geq 0}} is a semigroup, note that (eQt)t0(e^{Qt})_{t\in\mathbb{R}_{\geq 0}} is a semigroup, then apply Lemma 2 and Corollary 3. ∎

Proof of Proposition 6.

The proof is completely analogous to the proof of Proposition 5; simply replace QQ with either Q¯\underline{Q} or Q¯\overline{Q} as appropriate. ∎

Proof of Proposition 7.

Let ϵminxAceQ¯t𝕀A(x)\epsilon\coloneqq\min_{x\in A^{c}}e^{\underline{Q}t}\mathbb{I}_{A}(x); then ϵ>0\epsilon>0 due to Assumption 2. Fix any fAcf\in\smash{\mathbb{R}^{A^{c}}} with f=1\left\lVert f\right\rVert=1. By definition, we have eG¯tf=eQ¯t|Acf=(eQ¯tf𝒳)|Ace^{\overline{G}t}f=e^{\overline{Q}t}|_{A^{c}}f=\bigl{(}e^{\overline{Q}t}f\!\!\uparrow_{\mathcal{X}}\bigr{)}|_{A^{c}}.

Let $\mathcal{T}_{t}$ denote the set of transition matrices that dominate $e^{\underline{Q}t}$. Due to Proposition 3, there is some $T\in\mathcal{T}_{t}$ such that $Tf\!\!\uparrow_{\mathcal{X}}=e^{\overline{Q}t}f\!\!\uparrow_{\mathcal{X}}$. Fix any $x\in A^{c}$. Then, using that $f\!\!\uparrow_{\mathcal{X}}(y)=0$ for all $y\in A$, together with the fact that $T$ is a transition matrix, we have

|Tf𝒳(x)|\displaystyle\left\lvert Tf\!\!\uparrow_{\mathcal{X}}(x)\right\rvert =|y𝒳T(x,y)f𝒳(y)|yAcT(x,y),\displaystyle=\left\lvert\sum_{y\in\mathcal{X}}T(x,y)f\!\!\uparrow_{\mathcal{X}}(y)\right\rvert\leq\sum_{y\in A^{c}}T(x,y)\,,

and hence $\left\lvert Tf\!\!\uparrow_{\mathcal{X}}(x)\right\rvert\leq T\mathbb{I}_{A^{c}}(x)$. We have $\mathbb{I}_{A}+\mathbb{I}_{A^{c}}=\mathbf{1}$ and $T\mathbf{1}(x)=1$ since $T$ is a transition matrix. Using the linearity of $T$, we find that

T𝕀Ac(x)=T(𝟏𝕀A)(x)=1T𝕀A(x).\displaystyle T\mathbb{I}_{A^{c}}(x)=T(\mathbf{1}-\mathbb{I}_{A})(x)=1-T\mathbb{I}_{A}(x)\,.

Since T𝒯tT\in\mathcal{T}_{t} and xAcx\in A^{c} we have

0<ϵ=minyAceQ¯t𝕀A(y)eQ¯t𝕀A(x)T𝕀A(x).0<\epsilon=\min_{y\in A^{c}}e^{\underline{Q}t}\mathbb{I}_{A}(y)\leq e^{\underline{Q}t}\mathbb{I}_{A}(x)\leq T\mathbb{I}_{A}(x)\,.

Combining the above we find that

|Tf𝒳(x)|T𝕀Ac(x)=1T𝕀A(x)1ϵ.\left\lvert Tf\!\!\uparrow_{\mathcal{X}}(x)\right\rvert\leq T\mathbb{I}_{A^{c}}(x)=1-T\mathbb{I}_{A}(x)\leq 1-\epsilon\,.

Since this is true for all xAcx\in A^{c}, we find that (Tf𝒳)|Ac1ϵ\left\lVert(Tf\!\!\uparrow_{\mathcal{X}})|_{A^{c}}\right\rVert\leq 1-\epsilon. Moreover, since Tf𝒳=eQ¯tf𝒳Tf\!\!\uparrow_{\mathcal{X}}=e^{\overline{Q}t}f\!\!\uparrow_{\mathcal{X}}, it follows that (eQ¯tf𝒳)|Ac1ϵ\left\lVert(e^{\overline{Q}t}f\!\!\uparrow_{\mathcal{X}})|_{A^{c}}\right\rVert\leq 1-\epsilon, or in other words, that

eG¯tf1ϵ,with ϵ>0.\left\lVert e^{\overline{G}t}f\right\rVert\leq 1-\epsilon\,,\quad\quad\text{with $\epsilon>0$.}

The result follows since fAcf\in\smash{\mathbb{R}^{A^{c}}} with f=1\left\lVert f\right\rVert=1 is arbitrary. ∎

Proof of Lemma 1.

Let ρ(eG)maxλσ(eG)|λ|\rho(e^{G})\coloneqq\max_{\lambda\in\sigma(e^{G})}\left\lvert\lambda\right\rvert denote the spectral radius of eGe^{G}. We know from Section 4 that eG<1\left\lVert e^{G}\right\rVert<1, and hence we have ρ(eG)eG<1\rho(e^{G})\leq\left\lVert e^{G}\right\rVert<1 [Taylor and Lay, 1958, Thm V.3.5]. This implies that |λ|<1\left\lvert\lambda\right\rvert<1 for all λσ(eG)\lambda\in\sigma(e^{G}).

By the spectral mapping theorem [Engel and Nagel, 2000, Lemma I.3.13] we then have eReλ<1e^{\mathrm{Re}\,\lambda}<1 for all λσ(G)\lambda\in\sigma(G), or in other words, that Reλ<0\mathrm{Re}\,\lambda<0 for all λσ(G)\lambda\in\sigma(G). ∎
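For a quick numerical illustration of Lemma 1 (with an illustrative rate matrix, not from the paper), one can verify that the spectrum of $e^{G}$ lies strictly inside the unit disc and that the eigenvalues of $G$ have strictly negative real parts:

```python
# Numerical illustration of Lemma 1 for a toy subgenerator G on A^c, A = {2}.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-3.0, 2.0, 1.0], [1.0, -2.0, 1.0], [0.0, 0.0, 0.0]])
G = Q[np.ix_([0, 1], [0, 1])]               # subgenerator of Q on A^c

print(np.abs(np.linalg.eigvals(expm(G))))   # all strictly less than 1
print(np.linalg.eigvals(G).real)            # all strictly negative
```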

Appendix B Proofs and Lemmas for Section 4.1

Proof of Proposition 11.

This proof is a straightforward generalization of an argument in [Engel and Nagel, 2000, Prop I.3.12].

Let first qeG¯q\coloneqq\left\lVert e^{\overline{G}}\right\rVert; then 0<q<10<q<1 due to Proposition 7. Define

msups[0,1]eG¯s.m\coloneqq\sup_{s\in[0,1]}\left\lVert e^{\overline{G}s}\right\rVert\,.

Then m1m\geq 1 since meG¯0=I=1m\geq\left\lVert e^{\overline{G}0}\right\rVert=\left\lVert I\right\rVert=1. Moreover, m1m\leq 1 due to Proposition 7, and hence m=1m=1. Now set M1/qM\coloneqq\nicefrac{{1}}{{q}} and ξlogq\xi\coloneqq-\log q; then ξ>0\xi>0 since q<1q<1.

Fix any t0t\in\mathbb{R}_{\geq 0}. If t=0t=0 then the result is trivial, so let us suppose that t>0t>0. Then there are k0k\in\mathbb{N}_{0} and s[0,1)s\in[0,1) such that t=k+st=k+s. Using the semigroup property, we have

eG¯t=eG¯(s+k)\displaystyle\left\lVert e^{\overline{G}t}\right\rVert=\left\lVert e^{\overline{G}(s+k)}\right\rVert eG¯seG¯kmqk=eklogq.\displaystyle\leq\left\lVert e^{\overline{G}s}\right\rVert\left\lVert e^{\overline{G}}\right\rVert^{k}\leq mq^{k}=e^{k\log q}\,.

We have k=tsk=t-s and s[0,1)s\in[0,1), and so

eG¯t\displaystyle\left\lVert e^{\overline{G}t}\right\rVert eklogq\displaystyle\leq e^{k\log q}
=e(ts)logq\displaystyle=e^{(t-s)\log q}
=etlogqeslogq\displaystyle=e^{t\log q}e^{-s\log q}
=eξteslogq\displaystyle=e^{-\xi t}e^{-s\log q}
eξtelogq=1qeξt=Meξt,\displaystyle\leq e^{-\xi t}e^{-\log q}=\frac{1}{q}e^{-\xi t}=Me^{-\xi t}\,,

which concludes the proof. ∎

Proof of Proposition 12.

It follows from the definition that for any upper transition operator T¯\overline{T} and any non-negative f𝒳f\in\smash{\mathbb{R}^{\mathcal{X}}}, also T¯f\overline{T}f is non-negative. In the sequel, we will therefore say that upper transition operators preserve non-negativity. Since eQ¯te^{\overline{Q}t} is an upper transition operator, this property clearly extends also to eG¯te^{\overline{G}t}.

Now fix f,gAcf,g\in\smash{\mathbb{R}^{A^{c}}} and t0t\in\mathbb{R}_{\geq 0}. By preservation of non-negativity we have for any xAcx\in A^{c} that

|eG¯t|f+g||(x)=eG¯t|f+g|(x).\left\lvert e^{\overline{G}t}\left\lvert f+g\right\rvert\right\rvert(x)=e^{\overline{G}t}\left\lvert f+g\right\rvert(x)\,.

Moreover, we clearly have |f+g||f|+|g|\left\lvert f+g\right\rvert\leq\left\lvert f\right\rvert+\left\lvert g\right\rvert, and so by the monotonicity of upper transition operators, we have

eG¯t|f+g|(x)eG¯t(|f|+|g|)(x).e^{\overline{G}t}\left\lvert f+g\right\rvert(x)\leq e^{\overline{G}t}(\left\lvert f\right\rvert+\left\lvert g\right\rvert)(x)\,.

Finally, by the subadditivity of upper transition operators, we find that

eG¯t(|f|+|g|)(x)eG¯t|f|(x)+eG¯t|g|(x).e^{\overline{G}t}(\left\lvert f\right\rvert+\left\lvert g\right\rvert)(x)\leq e^{\overline{G}t}\left\lvert f\right\rvert(x)+e^{\overline{G}t}\left\lvert g\right\rvert(x)\,.

Again by preservation of non-negativity we have

eG¯t|f|(x)+eG¯t|g|(x)=|eG¯t|f|(x)+eG¯t|g||(x).e^{\overline{G}t}\left\lvert f\right\rvert(x)+e^{\overline{G}t}\left\lvert g\right\rvert(x)=\left\lvert e^{\overline{G}t}\left\lvert f\right\rvert(x)+e^{\overline{G}t}\left\lvert g\right\rvert\right\rvert(x)\,.

Because this is true for all xAcx\in A^{c}, we find that

eG¯t|f+g|\displaystyle\left\lVert e^{\overline{G}t}\left\lvert f+g\right\rvert\right\rVert eG¯t|f|+eG¯t|g|\displaystyle\leq\left\lVert e^{\overline{G}t}\left\lvert f\right\rvert+e^{\overline{G}t}\left\lvert g\right\rvert\right\rVert
eG¯t|f|+eG¯t|g|.\displaystyle\leq\left\lVert e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert+\left\lVert e^{\overline{G}t}\left\lvert g\right\rvert\right\rVert\,.

Multiplying both sides with eξte^{\xi t} and noting that t0t\in\mathbb{R}_{\geq 0} is arbitrary, we find that

f+g\displaystyle\left\lVert f+g\right\rVert_{*} =supt0eξteG¯t|f+g|\displaystyle=\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert f+g\right\rvert\right\rVert
supt0eξteG¯t|f|+eξteG¯t|g|\displaystyle\leq\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert+\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert g\right\rvert\right\rVert
supt0eξteG¯t|f|+supt0eξteG¯t|g|\displaystyle\leq\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert+\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert g\right\rvert\right\rVert
=f+g.\displaystyle=\left\lVert f\right\rVert_{*}+\left\lVert g\right\rVert_{*}\,.

Hence we have established that \left\lVert\cdot\right\rVert_{*} satisfies the triangle inequality.

Next, fix any fAcf\in\smash{\mathbb{R}^{A^{c}}} and cc\in\mathbb{R}. Then

cf\displaystyle\left\lVert cf\right\rVert_{*} =supt0eξteG¯t|cf|\displaystyle=\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert cf\right\rvert\right\rVert
=supt0eξteG¯t|c||f|\displaystyle=\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert c\right\rvert\left\lvert f\right\rvert\right\rVert
=|c|supt0eξteG¯t|f|=|c|f.\displaystyle=\left\lvert c\right\rvert\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert=\left\lvert c\right\rvert\left\lVert f\right\rVert_{*}\,.

So \left\lVert\cdot\right\rVert_{*} is absolutely homogeneous.

Finally, fix fAcf\in\smash{\mathbb{R}^{A^{c}}} and suppose that f=0\left\lVert f\right\rVert_{*}=0. It holds that

0=feξ0eG¯0|f|0,\displaystyle 0=\left\lVert f\right\rVert_{*}\geq\left\lVert e^{\xi 0}e^{\overline{G}0}\left\lvert f\right\rvert\right\rVert\geq 0\,,

whence it holds that eξ0eG¯0|f|=0\left\lVert e^{\xi 0}e^{\overline{G}0}\left\lvert f\right\rvert\right\rVert=0. This implies that also eG¯0|f|=0\left\lVert e^{\overline{G}0}\left\lvert f\right\rvert\right\rVert=0. Since eG¯0=Ie^{\overline{G}0}=I, we have

0=eG¯0|f|=|f|=f,\displaystyle 0=\left\lVert e^{\overline{G}0}\left\lvert f\right\rvert\right\rVert=\left\lVert\left\lvert f\right\rvert\right\rVert=\left\lVert f\right\rVert\,,

whence f=0f=0. Hence \left\lVert\cdot\right\rVert_{*} separates Ac\smash{\mathbb{R}^{A^{c}}}. ∎

Lemma 5.

For any Q𝒬Q\in\mathcal{Q} with subgenerator GG, any fAcf\in\smash{\mathbb{R}^{A^{c}}}, and any t0t\geq 0, it holds that eGtfeG¯tf\left\lVert e^{Gt}f\right\rVert_{*}\leq\left\lVert e^{\overline{G}t}f\right\rVert_{*}.

Proof.

Choose fAcf\in\smash{\mathbb{R}^{A^{c}}}. Let TT be any matrix with non-negative entries. Then |Tf(x)||T|f|(x)|\left\lvert Tf(x)\right\rvert\leq\left\lvert T\left\lvert f\right\rvert(x)\right\rvert for all xAcx\in A^{c}. In particular, we have

|Tf(x)|\displaystyle\left\lvert Tf(x)\right\rvert =|yAcT(x,y)f(y)|\displaystyle=\left\lvert\sum_{y\in A^{c}}T(x,y)f(y)\right\rvert
yAc|T(x,y)||f(y)|\displaystyle\leq\sum_{y\in A^{c}}\left\lvert T(x,y)\right\rvert\left\lvert f(y)\right\rvert
=T|f|(x)=|T|f|(x)|,\displaystyle=T\left\lvert f\right\rvert(x)=\left\lvert T\left\lvert f\right\rvert(x)\right\rvert\,,

where the final two equalities follow from the fact that TT only has non-negative entries. Since this is true for any matrix TT with non-negative entries, we have in particular that |eGtf|(x)eGt|f|(x)\left\lvert e^{Gt}f\right\rvert(x)\leq e^{Gt}\left\lvert f\right\rvert(x). Similarly, it holds that

|eG¯tf|(x)\displaystyle\left\lvert e^{\overline{G}t}f\right\rvert(x) =|supT𝒯tTf(x)|\displaystyle=\left\lvert\sup_{T\in\mathcal{T}_{t}}Tf(x)\right\rvert
supT𝒯t|Tf(x)|\displaystyle\leq\sup_{T\in\mathcal{T}_{t}}\left\lvert Tf(x)\right\rvert
supT𝒯tT|f|(x)=eG¯t|f|(x).\displaystyle\leq\sup_{T\in\mathcal{T}_{t}}T\left\lvert f\right\rvert(x)=e^{\overline{G}t}\left\lvert f\right\rvert(x)\,.

It follows that, for any s0s\in\mathbb{R}_{\geq 0}, we have

eG¯s|eGtf|(x)eG¯seGt|f|(x).\displaystyle e^{\overline{G}s}\left\lvert e^{Gt}f\right\rvert(x)\leq e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert(x)\,.

Due to preservation of non-negativity, and since this is true for any xAcx\in A^{c}, we have

eG¯s|eGtf|eG¯seGt|f|.\left\lVert e^{\overline{G}s}\left\lvert e^{Gt}f\right\rvert\right\rVert\leq\left\lVert e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rVert\,.

Now let fAcf\in\smash{\mathbb{R}^{A^{c}}} be such that f=1\left\lVert f\right\rVert_{*}=1 and eGt=eGtf\left\lVert e^{Gt}\right\rVert_{*}=\left\lVert e^{Gt}f\right\rVert_{*}; this ff clearly exists since Ac\smash{\mathbb{R}^{A^{c}}} is finite-dimensional. Then we have

eGt\displaystyle\left\lVert e^{Gt}\right\rVert_{*} =eGtf\displaystyle=\left\lVert e^{Gt}f\right\rVert_{*}
=sups0eξseG¯s|eGtf|\displaystyle=\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}\left\lvert e^{Gt}f\right\rvert\right\rVert
sups0eξseG¯seGt|f|=eGt|f|eGt,\displaystyle\leq\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rVert=\left\lVert e^{Gt}\left\lvert f\right\rvert\right\rVert_{*}\leq\left\lVert e^{Gt}\right\rVert_{*}\,,

where the final inequality used that |f|=f=1\left\lVert\left\lvert f\right\rvert\right\rVert_{*}=\left\lVert f\right\rVert_{*}=1. Hence we have found that eGt=eGt|f|\left\lVert e^{Gt}\right\rVert_{*}=\left\lVert e^{Gt}\left\lvert f\right\rvert\right\rVert_{*}.

Since eQt𝒯te^{Qt}\in\mathcal{T}_{t} by Equation (7), we also have

eGt|f|eG¯t|f|.e^{Gt}\left\lvert f\right\rvert\leq e^{\overline{G}t}\left\lvert f\right\rvert\,.

By monotonicity of upper transition operators, this implies that

eG¯seGt|f|eG¯seG¯t|f|e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\leq e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert

and, due to the preservation of non-negativity, we have

eG¯seGt|f|=|eG¯seGt|f||,e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert=\left\lvert e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rvert\,,

and

eG¯seG¯t|f|=|eG¯seG¯t|f||.e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert=\left\lvert e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rvert\,.

Hence for all xAcx\in A^{c} we have

|eG¯seGt|f||(x)|eG¯seG¯t|f||(x),\left\lvert e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rvert(x)\leq\left\lvert e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rvert(x)\,,

or in other words, that

eG¯seGt|f|eG¯seG¯t|f|.\left\lVert e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rVert\leq\left\lVert e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert\,.

Since this holds for all s0s\in\mathbb{R}_{\geq 0}, we have

eGt=eGt|f|\displaystyle\left\lVert e^{Gt}\right\rVert_{*}=\left\lVert e^{Gt}\left\lvert f\right\rvert\right\rVert_{*} =sups0eξseG¯seGt|f|\displaystyle=\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rVert
sups0eξseG¯seG¯t|f|\displaystyle\leq\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert
=eG¯t|f|eG¯t,\displaystyle=\left\lVert e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert_{*}\leq\left\lVert e^{\overline{G}t}\right\rVert_{*}\,,

which concludes the proof. ∎

Proof of Proposition 13.

The argument is analogous to the well-known case for linear quasicontractive semigroups; for a similar result, see e.g. [Renardy and Rogers, 2006, Thm 12.21]. So, fix any t0t\in\mathbb{R}_{\geq 0} and fAcf\in\smash{\mathbb{R}^{A^{c}}}.

Using a similar argument as used in the proof of Lemma 5, we use the preservation of non-negativity and the monotonicity of upper transition operators, to find for any s0s\in\mathbb{R}_{\geq 0} that

eξseG¯s|eG¯tf|eξseG¯seG¯t|f|.\left\lVert e^{\xi s}e^{\overline{G}s}\left\lvert e^{\overline{G}t}f\right\rvert\right\rVert\leq\left\lVert e^{\xi s}e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert\,.

Hence we have

eG¯tf\displaystyle\left\lVert e^{\overline{G}t}f\right\rVert_{*} =sups0eξseG¯s|eG¯tf|\displaystyle=\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}\left\lvert e^{\overline{G}t}f\right\rvert\right\rVert
sups0eξseG¯seG¯t|f|\displaystyle\leq\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert
=eξteξtsups0eξseG¯(s+t)|f|\displaystyle=e^{-\xi t}e^{\xi t}\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}(s+t)}\left\lvert f\right\rvert\right\rVert
=eξtsups0eξ(s+t)eG¯(s+t)|f|\displaystyle=e^{-\xi t}\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi(s+t)}e^{\overline{G}(s+t)}\left\lvert f\right\rvert\right\rVert
\displaystyle=e^{-\xi t}\sup_{s\in\mathbb{R}_{\geq t}}\left\lVert e^{\xi s}e^{\overline{G}s}\left\lvert f\right\rvert\right\rVert\leq e^{-\xi t}\left\lVert f\right\rVert_{*}\,,

where for the second equality we used the semigroup property. Since fAcf\in\smash{\mathbb{R}^{A^{c}}} is arbitrary, this implies that

eG¯t=sup{eG¯tf:fAc,f=1}eξt,\left\lVert e^{\overline{G}t}\right\rVert_{*}=\sup\bigl{\{}\left\lVert e^{\overline{G}t}f\right\rVert_{*}\,:\,f\in\smash{\mathbb{R}^{A^{c}}},\left\lVert f\right\rVert_{*}=1\bigr{\}}\leq e^{-\xi t}\,,

which completes the proof. ∎

Proof of Proposition 14.

This is immediate from Lemma 5 and Proposition 13. ∎

Appendix C Proofs and Lemmas for Section 5

The following result is well-known, but we state it here for convenience:

Lemma 6.

Let TT be a linear bounded operator on a Banach space with norm \left\lVert\cdot\right\rVert_{*}. Suppose that T<1\left\lVert T\right\rVert_{*}<1 and that (IT)1(I-T)^{-1} exists. Then

(IT)111T.\left\lVert(I-T)^{-1}\right\rVert_{*}\leq\frac{1}{1-\left\lVert T\right\rVert_{*}}\,.
Proof.

Since T<1\left\lVert T\right\rVert_{*}<1 we have (IT)1=k=0+Tk(I-T)^{-1}=\sum_{k=0}^{+\infty}T^{k}. Taking norms,

(IT)1=k=0+Tkk=0+Tk=11T,\left\lVert(I-T)^{-1}\right\rVert_{*}=\left\lVert\sum_{k=0}^{+\infty}T^{k}\right\rVert_{*}\leq\sum_{k=0}^{+\infty}\left\lVert T\right\rVert_{*}^{k}=\frac{1}{1-\left\lVert T\right\rVert_{*}}\,,

where the final step used the value of the geometric series and that T<1\left\lVert T\right\rVert_{*}<1. ∎
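For concreteness, the bound of Lemma 6 is easily checked numerically in the sup-norm; the contraction $T$ below is an arbitrary illustrative choice.

```python
# Numerical check of Lemma 6: if ||T|| < 1 then ||(I - T)^{-1}|| <= 1 / (1 - ||T||).
import numpy as np

T = np.array([[0.2, 0.3], [0.1, 0.4]])
norm = lambda X: np.linalg.norm(X, np.inf)   # operator norm induced by the sup-norm

lhs = norm(np.linalg.inv(np.eye(2) - T))
rhs = 1.0 / (1.0 - norm(T))
print(lhs, rhs, lhs <= rhs)
```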

Lemma 7.

There is some C>0C>0 such that for any Δ>0\Delta>0 with Δξ<1\Delta\xi<1, and any Q𝒬Q\in\mathcal{Q} with subgenerator GG, it holds that (IeGΔ)1<C/Δ\left\lVert(I-e^{G\Delta})^{-1}\right\rVert<\nicefrac{{C}}{{\Delta}}.

Proof.

Let ξ>0\xi>0 be as in Proposition 11, and let \left\lVert\cdot\right\rVert_{*} be the norm from Equation (8). Since Ac\smash{\mathbb{R}^{A^{c}}} is finite-dimensional the norms \left\lVert\cdot\right\rVert and \left\lVert\cdot\right\rVert_{*} are equivalent, and hence there is some c>0c>0 such that fcf\left\lVert f\right\rVert\leq c\left\lVert f\right\rVert_{*} for all fAcf\in\smash{\mathbb{R}^{A^{c}}}. Set C2c/ξC\coloneqq\nicefrac{{2c}}{{\xi}}; then C>0C>0 since ξ>0\xi>0.

Fix any Δ>0\Delta>0 such that Δξ<1\Delta\xi<1, and any Q𝒬Q\in\mathcal{Q} with subgenerator GG. It follows from Proposition 14 that eGΔeξΔ\left\lVert e^{G\Delta}\right\rVert_{*}\leq e^{-\xi\Delta}. Using a standard quadratic bound on the negative scalar exponential, we have

eGΔeξΔ1ξΔ+12Δ2ξ2<1Δξ2<1,\left\lVert e^{G\Delta}\right\rVert_{*}\leq e^{-\xi\Delta}\leq 1-\xi\Delta+\frac{1}{2}\Delta^{2}\xi^{2}<1-\frac{\Delta\xi}{2}<1\,, (14)

where the third inequality used that Δξ<1\Delta\xi<1. Notice that eGΔeξΔ<1\left\lVert e^{G\Delta}\right\rVert_{*}\leq e^{-\xi\Delta}<1. Moreover, (IeGΔ)1(I-e^{G\Delta})^{-1} exists by Proposition 8. By the norm equivalence, we have

(IeGΔ)1c(IeGΔ)1,\left\lVert(I-e^{G\Delta})^{-1}\right\rVert\leq c\left\lVert(I-e^{G\Delta})^{-1}\right\rVert_{*}\,, (15)

and, by Lemma 6, that

(IeGΔ)111eGΔ.\left\lVert(I-e^{G\Delta})^{-1}\right\rVert_{*}\leq\frac{1}{1-\left\lVert e^{G\Delta}\right\rVert_{*}}\,.

Using Equation (14) we obtain

(IeGΔ)111eGΔ<111+Δξ2=1Δ2ξ.\left\lVert(I-e^{G\Delta})^{-1}\right\rVert_{*}\leq\frac{1}{1-\left\lVert e^{G\Delta}\right\rVert_{*}}<\frac{1}{1-1+\frac{\Delta\xi}{2}}=\frac{1}{\Delta}\frac{2}{\xi}\,.

Combining with Equation (15) yields

(IeGΔ)1<c1Δ2ξ=CΔ,\left\lVert(I-e^{G\Delta})^{-1}\right\rVert<c\frac{1}{\Delta}\frac{2}{\xi}=\frac{C}{\Delta}\,,

which concludes the proof. ∎

Proof of Proposition 15.

Let ξ,C>0\xi,C>0 be as in Lemma 7, and let δ1/ξ\delta\coloneqq\nicefrac{{1}}{{\xi}} and LC𝒬2L\coloneqq C\left\lVert\mathcal{Q}\right\rVert^{2} with 𝒬supQ𝒬Q\left\lVert\mathcal{Q}\right\rVert\coloneqq\sup_{Q\in\mathcal{Q}}\left\lVert Q\right\rVert; note that 𝒬0\left\lVert\mathcal{Q}\right\rVert\in\mathbb{R}_{\geq 0} since 𝒬\mathcal{Q} is bounded by assumption. Observe that we must have 𝒬>0\left\lVert\mathcal{Q}\right\rVert>0 due to Assumption 2, whence L>0L>0.

Choose any Δ(0,δ)\Delta\in(0,\delta) and Q𝒬Q\in\mathcal{Q}. It is immediate from the definitions that hQ(x)=0=hΔQ(x)h^{Q}(x)=0=h^{Q}_{\Delta}(x) for all xAx\in A and all Q𝒬Q\in\mathcal{Q}, so it remains to bound the norm on AcA^{c}.

Let GG be the subgenerator of QQ on AcA^{c}. By Proposition 9 we have that hΔQ|Ac=(IeGΔ)1Δ𝟏h^{Q}_{\Delta}|_{A^{c}}=(I-e^{G\Delta})^{-1}\Delta\mathbf{1}. Using the definition of hQh^{Q} this implies that

hΔQ|AceGΔhΔQ|Ac=Δ𝟏=ΔGhQ|Ac.\displaystyle h^{Q}_{\Delta}|_{A^{c}}-e^{G\Delta}h^{Q}_{\Delta}|_{A^{c}}=\Delta\mathbf{1}=-\Delta Gh^{Q}|_{A^{c}}\,.

Re-ordering terms we have

hΔQ|Ac=eGΔhΔQ|AcΔGhQ|Ac.h^{Q}_{\Delta}|_{A^{c}}=e^{G\Delta}h^{Q}_{\Delta}|_{A^{c}}-\Delta Gh^{Q}|_{A^{c}}\,.

Let B=eGΔ(I+ΔG)B=e^{G\Delta}-(I+\Delta G). We find that

hΔQ\displaystyle h^{Q}_{\Delta} |AchQ|Ac\displaystyle|_{A^{c}}-h^{Q}|_{A^{c}}
=eGΔhΔQ|AcΔGhQ|AchQ|Ac\displaystyle=e^{G\Delta}h^{Q}_{\Delta}|_{A^{c}}-\Delta Gh^{Q}|_{A^{c}}-h^{Q}|_{A^{c}}
=eGΔhΔQ|Ac(I+ΔG)hQ|Ac\displaystyle=e^{G\Delta}h^{Q}_{\Delta}|_{A^{c}}-(I+\Delta G)h^{Q}|_{A^{c}}
=eGΔ(hΔQ|AchQ|Ac)+(eGΔ(I+ΔG))hQ|Ac\displaystyle=e^{G\Delta}(h^{Q}_{\Delta}|_{A^{c}}-h^{Q}|_{A^{c}})+\bigl{(}e^{G\Delta}-(I+\Delta G)\bigr{)}h^{Q}|_{A^{c}}
=eGΔ(hΔQ|AchQ|Ac)+BhQ|Ac.\displaystyle=e^{G\Delta}(h^{Q}_{\Delta}|_{A^{c}}-h^{Q}|_{A^{c}})+Bh^{Q}|_{A^{c}}\,.

We see that the difference on the left-hand side occurs again on the right-hand side. Hence we can substitute the same expansion nn\in\mathbb{N} times to get

hΔQ|AchQ|Ac\displaystyle h^{Q}_{\Delta}|_{A^{c}}-h^{Q}|_{A^{c}}
=eGΔ(n+1)(hΔQ|AchQ|Ac)+k=0neGΔkBhQ|Ac.\displaystyle\quad=e^{G\Delta(n+1)}(h^{Q}_{\Delta}|_{A^{c}}-h^{Q}|_{A^{c}})+\sum_{k=0}^{n}e^{G\Delta k}Bh^{Q}|_{A^{c}}\,.

Since we know from Section 4 that $\lim_{t\to+\infty}e^{Gt}=0$, the first term on the right-hand side vanishes as we take $n\to+\infty$ and, using Proposition 8, we have $(I-e^{G\Delta})^{-1}=\sum_{k=0}^{+\infty}e^{G\Delta k}$. So, passing to this limit and taking norms, we find

hΔQ|AchQ|Ac\displaystyle\left\lVert h^{Q}_{\Delta}|_{A^{c}}-h^{Q}|_{A^{c}}\right\rVert =(IeGΔ)1BhQ|Ac\displaystyle=\left\lVert(I-e^{G\Delta})^{-1}Bh^{Q}|_{A^{c}}\right\rVert
(IeGΔ)1BhQ|Ac.\displaystyle\leq\left\lVert(I-e^{G\Delta})^{-1}\right\rVert\left\lVert B\right\rVert\left\lVert h^{Q}|_{A^{c}}\right\rVert\,.

Using Lemmas 3 and 4 and Corollary 3, we have

B=eGΔ(I+ΔG)eQΔ(I+ΔQ),\left\lVert B\right\rVert=\left\lVert e^{G\Delta}-(I+\Delta G)\right\rVert\leq\left\lVert e^{Q\Delta}-(I+\Delta Q)\right\rVert\,,

and so, due to [Krak, 2021, Lemma B.8], we have BΔ2Q2\left\lVert B\right\rVert\leq\Delta^{2}\left\lVert Q\right\rVert^{2}. Since Q𝒬Q\in\mathcal{Q} it follows that Q𝒬\left\lVert Q\right\rVert\leq\left\lVert\mathcal{Q}\right\rVert, and so BΔ2𝒬2\left\lVert B\right\rVert\leq\Delta^{2}\left\lVert\mathcal{Q}\right\rVert^{2}. Since Δ<δ\Delta<\delta we have Δξ<1\Delta\xi<1, whence (IeGΔ)1<C/Δ\left\lVert(I-e^{G\Delta})^{-1}\right\rVert<\nicefrac{{C}}{{\Delta}} due to Lemma 7. In summary we find

hΔQ|AchQ|Ac\displaystyle\left\lVert h^{Q}_{\Delta}|_{A^{c}}-h^{Q}|_{A^{c}}\right\rVert <CΔΔ2𝒬2hQ|Ac=ΔLhQ,\displaystyle<\frac{C}{\Delta}\Delta^{2}\left\lVert\mathcal{Q}\right\rVert^{2}\left\lVert h^{Q}|_{A^{c}}\right\rVert=\Delta L\left\lVert h^{Q}\right\rVert\,,

which concludes the proof. ∎

Proposition 17.

[Krak, 2020, Prop 7] Fix any Δ>0\Delta>0, and let 𝒯Δ\mathcal{T}_{\Delta} denote the set of transition matrices that dominate eQ¯Δe^{\underline{Q}\Delta}. Choose any T0𝒯ΔT_{0}\in\mathcal{T}_{\Delta}. For all n0n\in\mathbb{N}_{0}, let hnh_{n} be the (unique) non-negative solution to hn=Δ𝕀Ac+𝕀AcTnhnh_{n}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}T_{n}h_{n}, and let Tn+1𝒯ΔT_{n+1}\in\mathcal{T}_{\Delta} be such that Tn+1hn=eQ¯ΔhnT_{n+1}h_{n}=e^{\underline{Q}\Delta}h_{n}.

Then limn+hn=h¯Δ\lim_{n\to+\infty}h_{n}=\underline{h}_{\Delta}.

Proof.

The preconditions of the reference actually require every TnT_{n} to be an extreme point of 𝒯Δ\mathcal{T}_{\Delta}, but inspection of the proof of [Krak, 2020, Prop 7] shows that this is not required; the superfluous condition is only used to streamline the statement of an algorithmic result further on in that work. ∎

We next need some results that involve transition matrices ${}^{P}T_{t}^{s}$ corresponding to (not-necessarily homogeneous) Markov chains $P\in\mathcal{P}^{\mathrm{M}}_{\mathcal{Q}}$. We recall from Section 2.2 that these are defined for any $t,s\in\mathbb{R}_{\geq 0}$ with $t\leq s$ as

TtsP(x,y)P(Xs=y|Xt=x)for all x,y𝒳.{}^{P}T_{t}^{s}(x,y)\coloneqq P(X_{s}=y\,|\,X_{t}=x)\quad\text{for all $x,y\in\mathcal{X}$.}
Lemma 8.

Consider the sequence (hn)n0(h_{n})_{n\in\mathbb{N}_{0}} constructed as in Proposition 17. For any n0n\in\mathbb{N}_{0}, there is a Markov chain Pn+1𝒫𝒬MP_{n+1}\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}} with corresponding transition matrix T0Δ(n+1){}^{(n+1)}T_{0}^{\Delta} such that T0Δ(n+1)hn=eQ¯Δhn{}^{(n+1)}T_{0}^{\Delta}h_{n}=e^{\underline{Q}\Delta}h_{n}.

Hence in particular, we can choose the co-sequence (Tn)n(T_{n})_{n\in\mathbb{N}} in Proposition 17 to be (T0Δ(n))n({}^{(n)}T_{0}^{\Delta})_{n\in\mathbb{N}}.

Proof.

This follows from [Krak, 2021, Cor 6.24] and the fact that 𝒬\mathcal{Q} is non-empty, compact, convex, and has separately specified rows. ∎

Proposition 18.

For all Δ>0\Delta>0 there is a Markov chain P𝒫𝒬MP\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}} with corresponding transition matrix T=T0ΔPT={}^{P}T_{0}^{\Delta}, such that the unique solution hh to h=Δ𝕀Ac+𝕀AcThh=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Th satisfies h=h¯Δh=\underline{h}_{\Delta}.

Proof.

Let $\mathcal{T}_{\Delta}^{\mathrm{M}}\coloneqq\{{}^{P}T_{0}^{\Delta}\,:\,P\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}}\}$, and let $(h_{n})_{n\in\mathbb{N}}$ be as in Proposition 17, with the co-sequence $(T_{n})_{n\in\mathbb{N}}$ chosen as in Lemma 8 to consist of transition matrices corresponding to Markov chains in $\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}}$. Then $(T_{n})_{n\in\mathbb{N}}$ lives in $\mathcal{T}_{\Delta}^{\mathrm{M}}$.

The set 𝒯ΔM\mathcal{T}_{\Delta}^{\mathrm{M}} is compact by [Krak, 2021, Cor 5.18] and the fact that 𝒬\mathcal{Q} is non-empty, compact, and convex. Hence we can find a subsequence (Tnj)j(T_{n_{j}})_{j\in\mathbb{N}} with limj+Tnj=:T𝒯ΔM\lim_{j\to+\infty}T_{n_{j}}=:T\in\mathcal{T}_{\Delta}^{\mathrm{M}}. Since T𝒯ΔMT\in\mathcal{T}_{\Delta}^{\mathrm{M}}, there is a Markov chain P𝒫𝒬MP\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}} with corresponding transition matrix T=T0ΔPT={}^{P}T_{0}^{\Delta}.

Moreover, since T,Tnj𝒯ΔMT,T_{n_{j}}\in\mathcal{T}_{\Delta}^{\mathrm{M}}, it follows from [Krak, 2021, Cor 6.24] that the transition matrices TT and all TnjT_{n_{j}} dominate the lower transition operator eQ¯Δe^{\underline{Q}\Delta}. Together with Assumption 2, this allows us to invoke [Krak, 2020, Prop 6], by which we can let hh be the unique solution to h=Δ𝕀Ac+𝕀AcThh=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Th, and it holds for any jj\in\mathbb{N} that hnj|A=0h_{n_{j}}|_{A}=0, and

hnj|Ac=(ITnj|Ac)1𝟏Δ.h_{n_{j}}|_{A^{c}}=(I-T_{n_{j}}|_{A^{c}})^{-1}\mathbf{1}\Delta\,.

Similarly, it holds that h|A=0h|_{A}=0, and

h|Ac=(IT|Ac)1𝟏Δ.h|_{A^{c}}=(I-T|_{A^{c}})^{-1}\mathbf{1}\Delta\,.

Since $\lim_{j\to+\infty}T_{n_{j}}=T$ and by continuity of the map $M\mapsto(I-M)^{-1}$ (which holds since all these inverses exist), it follows that $h|_{A^{c}}=\lim_{j\to+\infty}h_{n_{j}}|_{A^{c}}$. Since also $h|_{A}=h_{n_{j}}|_{A}$, it follows that $\lim_{j\to+\infty}h_{n_{j}}=h$.

By Proposition 17 we have limn+hn=h¯Δ\lim_{n\to+\infty}h_{n}=\underline{h}_{\Delta}, and hence we conclude that h¯Δ=limj+hnj=h\underline{h}_{\Delta}=\lim_{j\to+\infty}h_{n_{j}}=h. ∎

Proposition 19.

Fix any t0t\geq 0 and consider any Markov chain P𝒫𝒬MP\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}} with transition matrix T0tP{}^{P}T_{0}^{t}. Choose any ϵ>0\epsilon>0. Then there is some mm\in\mathbb{N} such that for all nmn\geq m there are Q1,,Qn𝒬Q_{1},\ldots,Q_{n}\in\mathcal{Q}, such that

T0tPi=1n(I+t/nQi)<ϵ.\left\lVert{}^{P}T_{0}^{t}-\prod_{i=1}^{n}(I+\nicefrac{{t}}{{n}}Q_{i})\right\rVert<\epsilon\,.
Proof.

The result is trivial if t=0t=0, so let us consider the case where t>0t>0. Let ϵϵ/2t\epsilon^{\prime}\coloneqq\nicefrac{{\epsilon}}{{2t}}. By [Krak, 2021, Lemma 5.12] there is some mm\in\mathbb{N} such that for all nmn\geq m and with Δt/n\Delta\coloneqq\nicefrac{{t}}{{n}}, for all i=1,,ni=1,\ldots,n there is some Qi𝒬Q_{i}\in\mathcal{Q} such that

T(i1)ΔiΔP(I+ΔQi)Δϵ.\left\lVert{}^{P}T_{(i-1)\Delta}^{i\Delta}-(I+\Delta Q_{i})\right\rVert\leq\Delta\epsilon^{\prime}\,.

Since PP is a Markov chain, we can factor its transition matrices [Krak, 2021, Prop 5.1] as

T0tP=T0ΔPTΔ2ΔPTtΔtP=i=1nT(i1)ΔiΔP.{}^{P}T_{0}^{t}={}^{P}T_{0}^{\Delta}{}^{P}T_{\Delta}^{2\Delta}\cdots{}^{P}T_{t-\Delta}^{t}=\prod_{i=1}^{n}{}^{P}T_{(i-1)\Delta}^{i\Delta}\,.

Using [Krak, 2021, Lemma B.5] for the first inequality, we have

T0tPi=1n(I+ΔQi)\displaystyle\left\lVert{}^{P}T_{0}^{t}-\prod_{i=1}^{n}(I+\Delta Q_{i})\right\rVert
=i=1nT(i1)ΔiΔPi=1n(I+ΔQi)\displaystyle=\left\lVert\prod_{i=1}^{n}{}^{P}T_{(i-1)\Delta}^{i\Delta}-\prod_{i=1}^{n}(I+\Delta Q_{i})\right\rVert
i=1nT(i1)ΔiΔP(I+ΔQi)\displaystyle\leq\sum_{i=1}^{n}\left\lVert{}^{P}T_{(i-1)\Delta}^{i\Delta}-(I+\Delta Q_{i})\right\rVert
i=1nΔϵ=ntnϵ2t=ϵ2,\displaystyle\leq\sum_{i=1}^{n}\Delta\epsilon^{\prime}=n\frac{t}{n}\frac{\epsilon}{2t}=\frac{\epsilon}{2}\,,

which concludes the proof. ∎
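The content of Proposition 19 can be illustrated numerically. The sketch below assumes, for illustration only, a non-homogeneous Markov chain whose rate matrix is $Q_{1}$ on $[0,\nicefrac{t}{2})$ and $Q_{2}$ on $[\nicefrac{t}{2},t]$; its transition matrix $T_{0}^{t}$ is then the ordered product of the two matrix exponentials, and it is approximated arbitrarily well by products of the form $\prod_{i=1}^{n}(I+\nicefrac{t}{n}Q_{i})$.

```python
# A sketch of the product approximation in Proposition 19, for an
# illustrative non-homogeneous chain that switches from Q1 to Q2 at time t/2.
import numpy as np
from scipy.linalg import expm

Q1 = np.array([[-3.0, 2.0, 1.0], [1.0, -2.0, 1.0], [0.0, 0.0, 0.0]])
Q2 = np.array([[-4.0, 2.0, 2.0], [1.0, -3.0, 2.0], [0.0, 0.0, 0.0]])
t = 2.0
T_exact = expm(Q1 * t / 2) @ expm(Q2 * t / 2)   # T_0^t for this chain

for n in [4, 16, 64, 256]:
    Delta = t / n
    # Q_i = Q1 on the first half of the grid, Q2 on the second half.
    factors = [Q1] * (n // 2) + [Q2] * (n // 2)
    T_approx = np.eye(3)
    for Qi in factors:
        T_approx = T_approx @ (np.eye(3) + Delta * Qi)
    print(n, np.max(np.abs(T_exact - T_approx)))   # error shrinks as n grows
```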

Lemma 9.

Consider a sequence (Qn)n(Q_{n})_{n\in\mathbb{N}} in 𝒬\mathcal{Q} with limit Qlimn+QnQ_{*}\coloneqq\lim_{n\to+\infty}Q_{n}. For all nn\in\mathbb{N}, let hnh_{n} denote the minimal non-negative solution to 𝕀Ahn=𝕀Ac+𝕀AcQnhn\mathbb{I}_{A}h_{n}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Q_{n}h_{n}, and let hh_{*} denote the minimal non-negative solution to 𝕀Ah=𝕀Ac+𝕀AcQh\mathbb{I}_{A}h_{*}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Q_{*}h_{*}. Then h=limn+hnh_{*}=\lim_{n\to+\infty}h_{n}.

Proof.

Since $\mathcal{Q}$ is closed, we have $Q_{*}\in\mathcal{Q}$. Let $(G_{n})_{n\in\mathbb{N}}$ and $G_{*}$ denote the subgenerators of $(Q_{n})_{n\in\mathbb{N}}$ and $Q_{*}$, respectively. Then $G_{*}^{-1}$ and $G_{n}^{-1}$, $n\in\mathbb{N}$, exist by Corollary 1; since $\lim_{n\to+\infty}G_{n}=G_{*}$ and matrix inversion is continuous at invertible operators, we also have $\lim_{n\to+\infty}G_{n}^{-1}=G_{*}^{-1}$. Applying these inverses to $-\mathbf{1}$ and invoking Proposition 10 gives

limn+hn|Ac=limn+Gn1𝟏=G1𝟏=h|Ac.\lim_{n\to+\infty}h_{n}|_{A^{c}}=\lim_{n\to+\infty}-G_{n}^{-1}\mathbf{1}=-G_{*}^{-1}\mathbf{1}=h_{*}|_{A^{c}}\,.

Finally, by definition we trivially have hn(x)=0=h(x)h_{n}(x)=0=h_{*}(x) for all xAx\in A. Hence also limn+hn|A=h|A\lim_{n\to+\infty}h_{n}|_{A}=h_{*}|_{A}. ∎

Lemma 10.

[Krak et al., 2019, Cor 13] Fix any Δ>0\Delta>0 and let h¯Δ\underline{h}_{\Delta} be the minimal non-negative solution to the non-linear system (12). Let 𝒯Δ\mathcal{T}_{\Delta} denote the set of transition matrices that dominate eQ¯Δe^{\underline{Q}\Delta} and, for all T𝒯ΔT\in\mathcal{T}_{\Delta}, let hTh_{T} denote the minimal non-negative solution to the linear system hT=Δ𝕀Ac+𝕀AcThTh_{T}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Th_{T}. Then it holds that

h¯Δ=infT𝒯ΔhT.\underline{h}_{\Delta}=\inf_{T\in\mathcal{T}_{\Delta}}h_{T}\,.
Proof of Proposition 16.

We only give the proof for the lower hitting times, i.e. that limΔ0+h¯Δh¯=0\lim_{\Delta\to 0^{+}}\left\lVert\underline{h}_{\Delta}-\underline{h}\right\rVert=0. The argument for the upper hitting times is completely analogous.

Choose any two sequences (Δn)n(\Delta_{n})_{n\in\mathbb{N}} and (ϵn)n(\epsilon_{n})_{n\in\mathbb{N}} in >0\mathbb{R}_{>0} such that limn+Δn=0\lim_{n\to+\infty}\Delta_{n}=0 and limn+ϵn=0\lim_{n\to+\infty}\epsilon_{n}=0. We will assume without loss of generality that Δn𝒬1\Delta_{n}\left\lVert\mathcal{Q}\right\rVert\leq 1 for all nn\in\mathbb{N}, where 𝒬=supQ𝒬Q\left\lVert\mathcal{Q}\right\rVert=\sup_{Q\in\mathcal{Q}}\left\lVert Q\right\rVert.

Now first fix any nn\in\mathbb{N}, and consider h¯Δn\underline{h}_{\Delta_{n}}. By Proposition 18 there is a Markov chain Pn𝒫𝒬MP_{n}\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}} with transition matrix TnT0ΔnPnT_{n}\coloneqq{}^{P_{n}}T_{0}^{\Delta_{n}} such that the unique solution hnh_{n} to hn=Δn𝕀Ac+𝕀AcTnhnh_{n}=\Delta_{n}\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}T_{n}h_{n} satisfies hn=h¯Δnh_{n}=\underline{h}_{\Delta_{n}}.

By Proposition 19, there are mnm_{n}\in\mathbb{N} with mnnm_{n}\geq n and Q1(n),,Qmn(n)Q^{(n)}_{1},\ldots,Q^{(n)}_{m_{n}} in 𝒬\mathcal{Q} such that, with

Φni=1mn(I+ΔnmnQi(n)),\Phi_{n}\coloneqq\prod_{i=1}^{m_{n}}\left(I+\frac{\Delta_{n}}{m_{n}}Q_{i}^{(n)}\right)\,,

it holds that TnΦn<ϵn\left\lVert T_{n}-\Phi_{n}\right\rVert<\epsilon_{n}. Now define

Qni=1mn1mnQi(n).Q_{n}\coloneqq\sum_{i=1}^{m_{n}}\frac{1}{m_{n}}Q_{i}^{(n)}\,.

Then Qn𝒬Q_{n}\in\mathcal{Q} since 𝒬\mathcal{Q} is convex. Let hQnh_{Q_{n}} denote the minimal non-negative solution to 𝕀AhQn=𝕀Ac+𝕀AcQnhQn\mathbb{I}_{A}h_{Q_{n}}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Q_{n}h_{Q_{n}}.

By repeating this construction for all nn\in\mathbb{N}, we obtain a sequence (Qn)n(Q_{n})_{n\in\mathbb{N}} in 𝒬\mathcal{Q}. Since 𝒬\mathcal{Q} is (sequentially) compact, we can consider a subsequence (Qnj)j(Q_{n_{j}})_{j\in\mathbb{N}} such that limj+Qnj=:Q𝒬\lim_{j\to+\infty}Q_{n_{j}}=:Q_{*}\in\mathcal{Q}.

Let $h_{*}$ be the minimal non-negative solution to $\mathbb{I}_{A}h_{*}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Q_{*}h_{*}$. We now establish a number of norm bounds that hold for all large enough $j$. Let $K=5$ and fix any $\delta>0$.

Since (Qnj)j(Q_{n_{j}})_{j\in\mathbb{N}} converges to QQ_{*}, it follows from Lemma 9 that for jj large enough, we have

hQnjh<δK\left\lVert h_{Q_{n_{j}}}-h_{*}\right\rVert<\frac{\delta}{K} (16)

Since hh_{*} is bounded, this also implies that the sequence (hQnj)j(h_{Q_{n_{j}}})_{j\in\mathbb{N}} is eventually uniformly bounded above in norm by some constant M0M\geq 0, say.

For all jj\in\mathbb{N}, let h^nj\hat{h}_{n_{j}} be such that h^nj|Ac(IeGnjΔnj)1Δnj𝟏\hat{h}_{n_{j}}|_{A^{c}}\coloneqq(I-e^{G_{n_{j}}\Delta_{n_{j}}})^{-1}\Delta_{n_{j}}\mathbf{1} and h^nj|A0\hat{h}_{n_{j}}|_{A}\coloneqq 0. Then

h^nj=Δnj𝕀Ac+𝕀AceQnjΔnjh^nj.\hat{h}_{n_{j}}=\Delta_{n_{j}}\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}e^{Q_{n_{j}}\Delta_{n_{j}}}\hat{h}_{n_{j}}\,.

For jj large enough we eventually have Δnjξ<1\Delta_{n_{j}}\xi<1, and so by Proposition 15, we then have

h^njhQnj\displaystyle\left\lVert\hat{h}_{n_{j}}-h_{Q_{n_{j}}}\right\rVert <ΔnjLhQnj\displaystyle<\Delta_{n_{j}}L\left\lVert h_{Q_{n_{j}}}\right\rVert
ΔnjLM,\displaystyle\leq\Delta_{n_{j}}LM\,,

with L,ML,M independent of jj. Hence for jj large enough we have

h^njhQnj<δK.\left\lVert\hat{h}_{n_{j}}-h_{Q_{n_{j}}}\right\rVert<\frac{\delta}{K}\,. (17)

Let next h~nj\tilde{h}_{n_{j}} be the minimal non-negative solution to h~nj=Δnj𝕀Ac+𝕀AcΦnjh~nj\tilde{h}_{n_{j}}=\Delta_{n_{j}}\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\Phi_{n_{j}}\tilde{h}_{n_{j}}. Since mnjnjm_{n_{j}}\geq n_{j}, for jj large enough we have Φnj|Ac<1\left\lVert\Phi_{n_{j}}|_{A^{c}}\right\rVert<1 due to Assumption 2.

By [Krak, 2021, Lemmas B.8 and B.12] we have

ΦnjeQnjΔnj2Δnj2𝒬2,\left\lVert\Phi_{n_{j}}-e^{Q_{n_{j}}\Delta_{n_{j}}}\right\rVert\leq 2\Delta_{n_{j}}^{2}\left\lVert\mathcal{Q}\right\rVert^{2}\,,

and so, for any ϵ>0\epsilon>0, we can choose jj large enough so that eventually ΦnjeQnjΔnj<ϵ\left\lVert\Phi_{n_{j}}-e^{Q_{n_{j}}\Delta_{n_{j}}}\right\rVert<\epsilon. Using the continuity of the map T(IT)1T\mapsto(I-T)^{-1} on operators TT for which this inverse exists, for large enough jj we therefore find that

h~nj|Ach^nj|Ac\displaystyle\left\lVert\tilde{h}_{n_{j}}|_{A^{c}}-\hat{h}_{n_{j}}|_{A^{c}}\right\rVert
=((IΦnj|Ac)1(IeQnjΔnj|Ac)1)Δnj𝟏\displaystyle=\left\lVert\bigl{(}(I-\Phi_{n_{j}}|_{A^{c}})^{-1}-(I-e^{Q_{n_{j}}\Delta_{n_{j}}}|_{A^{c}})^{-1}\bigr{)}\Delta_{n_{j}}\mathbf{1}\right\rVert
<ΔnjδKδK.\displaystyle<\Delta_{n_{j}}\frac{\delta}{K}\leq\frac{\delta}{K}\,.

Since h~nj|A=0=h^nj|A\tilde{h}_{n_{j}}|_{A}=0=\hat{h}_{n_{j}}|_{A}, this implies that then also

h~njh^nj<δK.\left\lVert\tilde{h}_{n_{j}}-\hat{h}_{n_{j}}\right\rVert<\frac{\delta}{K}\,. (18)

Next, we recall that h¯Δnj=hnj\underline{h}_{\Delta_{n_{j}}}=h_{n_{j}}, and

TnjΦnj<ϵnj.\left\lVert T_{n_{j}}-\Phi_{n_{j}}\right\rVert<\epsilon_{n_{j}}\,.

Hence by continuity of the map T(IT)1T\mapsto(I-T)^{-1} on operators TT for which this inverse exists, for large enough jj we find that

hnj|Ach~nj|Ac\displaystyle\left\lVert h_{n_{j}}|_{A^{c}}-\tilde{h}_{n_{j}}|_{A^{c}}\right\rVert
=((ITnj|Ac)1(IΦnj|Ac)1)Δnj𝟏\displaystyle=\left\lVert\bigl{(}(I-T_{n_{j}}|_{A^{c}})^{-1}-(I-\Phi_{n_{j}}|_{A^{c}})^{-1}\bigr{)}\Delta_{n_{j}}\mathbf{1}\right\rVert
<ΔnjδKδK.\displaystyle<\Delta_{n_{j}}\frac{\delta}{K}\leq\frac{\delta}{K}\,.

Since hnj|A=0=h~nj|Ah_{n_{j}}|_{A}=0=\tilde{h}_{n_{j}}|_{A}, this implies that also

hnjh~nj<δK.\left\lVert h_{n_{j}}-\tilde{h}_{n_{j}}\right\rVert<\frac{\delta}{K}\,. (19)

Putting Equations (16)–(19) together, we find that for any large enough jj it holds that

h¯Δnjh\displaystyle\left\lVert\underline{h}_{\Delta_{n_{j}}}-h_{*}\right\rVert =hnjh\displaystyle=\left\lVert h_{n_{j}}-h_{*}\right\rVert
hnjh~nj\displaystyle\leq\left\lVert h_{n_{j}}-\tilde{h}_{n_{j}}\right\rVert
+h~njh^nj\displaystyle\quad+\left\lVert\tilde{h}_{n_{j}}-\hat{h}_{n_{j}}\right\rVert
+h^njhQnj\displaystyle\quad\quad\quad+\left\lVert\hat{h}_{n_{j}}-h_{Q_{n_{j}}}\right\rVert
+hQnjh\displaystyle\quad\quad\quad\quad+\left\lVert h_{Q_{n_{j}}}-h_{*}\right\rVert
<4δK.\displaystyle<4\frac{\delta}{K}\,. (20)

Since δ>0\delta>0 is arbitrary this clearly implies that

limj+h¯Δnj=h.\lim_{j\to+\infty}\underline{h}_{\Delta_{n_{j}}}=h_{*}\,. (21)

Next, let us show that $h_{*}=\underline{h}$. To this end, assume ex absurdo that there is some $Q\in\mathcal{Q}$ such that $h^{Q}(x)<h_{*}(x)$ for some $x\in A^{c}$. Let $\delta\coloneqq h_{*}(x)-h^{Q}(x)>0$. Due to Corollary 2, for any $\Delta>0$ small enough it holds that

hΔQhQ<δK.\left\lVert h_{\Delta}^{Q}-h^{Q}\right\rVert<\frac{\delta}{K}\,.

This implies in particular that for large enough $j$ it holds that $h_{\Delta_{n_{j}}}^{Q}(x)<h^{Q}(x)+\nicefrac{\delta}{K}$. Moreover, it follows from Equation (20) that for large enough $j$ we have $\underline{h}_{\Delta_{n_{j}}}(x)>h_{*}(x)-4\nicefrac{\delta}{K}$. It holds that $h^{Q}(x)=h_{*}(x)-\delta$, and hence, since $K=5$, we find that for large enough $j$,

hΔnjQ(x)\displaystyle h_{\Delta_{n_{j}}}^{Q}(x) <hQ(x)+δ/K\displaystyle<h^{Q}(x)+\nicefrac{{\delta}}{{K}}
=h(x)δ+δ/K\displaystyle=h_{*}(x)-\delta+\nicefrac{{\delta}}{{K}}
=h(x)KδK+δ/K\displaystyle=h_{*}(x)-K\frac{\delta}{K}+\nicefrac{{\delta}}{{K}}
=h(x)(K1)δK\displaystyle=h_{*}(x)-(K-1)\frac{\delta}{K}
=h(x)4δK<h¯Δnj(x).\displaystyle=h_{*}(x)-4\frac{\delta}{K}<\underline{h}_{\Delta_{n_{j}}}(x)\,.

In other words, and using Lemma 10, we then have

hΔnjQ(x)<h¯Δnj(x)=infT𝒯ΔnjhT(x)hΔnjQ(x),h^{Q}_{\Delta_{n_{j}}}(x)<\underline{h}_{\Delta_{n_{j}}}(x)=\inf_{T\in\mathcal{T}_{\Delta_{n_{j}}}}h_{T}(x)\leq h^{Q}_{\Delta_{n_{j}}}(x)\,,

where the last step used that eQΔnj𝒯Δnje^{Q\Delta_{n_{j}}}\in\mathcal{T}_{\Delta_{n_{j}}}. From this contradiction we conclude that our earlier assumption must be wrong, and so it holds that h(x)hQ(x)h_{*}(x)\leq h^{Q}(x) for all x𝒳x\in\mathcal{X} and Q𝒬Q\in\mathcal{Q}. This implies that hh¯h_{*}\leq\underline{h}. Since it clearly also holds that h¯h\underline{h}\leq h_{*} because Q𝒬Q_{*}\in\mathcal{Q}, this implies that, indeed as claimed, h=h¯h_{*}=\underline{h}.

In summary, at this point we have shown that for any sequence (Δn)n(\Delta_{n})_{n\in\mathbb{N}} in >0\mathbb{R}_{>0} with limn+Δn=0\lim_{n\to+\infty}\Delta_{n}=0, there is a subsequence such that limj+h¯Δnj=h¯\lim_{j\to+\infty}\underline{h}_{\Delta_{n_{j}}}=\underline{h}.

So, finally, suppose ex absurdo that limΔ0+h¯Δh¯\lim_{\Delta\to 0^{+}}\underline{h}_{\Delta}\neq\underline{h}. Then there is some sequence (Δn)n(\Delta_{n})_{n\in\mathbb{N}} in >0\mathbb{R}_{>0} such that limn+Δn=0\lim_{n\to+\infty}\Delta_{n}=0, and some ϵ>0\epsilon>0, such that h¯Δnh¯ϵ\left\lVert\underline{h}_{\Delta_{n}}-\underline{h}\right\rVert\geq\epsilon for all nn\in\mathbb{N}. By the above result, there is a subsequence such that limj+h¯Δnj=h¯\lim_{j\to+\infty}\underline{h}_{\Delta_{n_{j}}}=\underline{h}, which is a contradiction. ∎

Proof of Theorem 1.

The key idea of this proof is to emulate Erreygers [2021, Sec 6.3] and consider discretized and truncated hitting times. By taking appropriate limits of such approximations, we then recover the “real” hitting times. We do, however, need to be somewhat careful with these constructions, since lower (and upper) expectation operators for continuous-time imprecise-Markov chains are not necessarily continuous with respect to arbitrary limits of such approximations [Erreygers, 2021, Chap 5]. This fairly long proof is therefore roughly divided into two parts; first, we construct a specific sequence of approximations and establish the relevant continuity properties with respect to this sequence. Then, in the second part of the proof, we use this continuity to establish the main claim of the theorem.

To this end, for any t0t\in\mathbb{R}_{\geq 0} and Δ>0\Delta\in\mathbb{R}_{>0}, we first consider a fixed-step grid νΔt\nu_{\Delta}^{t} over [0,t][0,t] with step-size Δ\Delta, as

νΔt{iΔ:i0,iΔt}.\nu_{\Delta}^{t}\coloneqq\bigl{\{}i\Delta\,:\,i\in\mathbb{N}_{0},i\Delta\leq t\bigr{\}}\,. (22)

We define the associated approximate hitting time functions τΔt:Ω0\tau_{\Delta}^{t}:\Omega_{\mathbb{R}_{\geq 0}}\to\mathbb{R} for all ωΩ0\omega\in\Omega_{\mathbb{R}_{\geq 0}} as

τΔt(ω)min({sνΔt:ω(s)A}{t}).\tau_{\Delta}^{t}(\omega)\coloneqq\min\Bigl{(}\bigl{\{}s\in\nu_{\Delta}^{t}\,:\,\omega(s)\in A\}\cup\{t\}\Bigr{)}\,. (23)

Then by [Erreygers, 2021, Lemma 6.19], as we take the time-horizon tt to infinity and the step-size Δ\Delta to zero, we have the point-wise limit to the actual hitting time function τ0\tau_{\mathbb{R}_{\geq 0}}, in that

τ0(ω)=limt+,Δ0+τΔt(ω)for all ωΩ0.\tau_{\mathbb{R}_{\geq 0}}(\omega)=\lim_{t\to+\infty,\Delta\to 0^{+}}\tau_{\Delta}^{t}(\omega)\quad\text{for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$.} (24)
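To make the discretization concrete, the following is a minimal sketch of Equations (22) and (23) for a right-continuous piecewise-constant path, represented (purely for illustration) by its jump times and the states visited after each jump.

```python
# A small sketch of the approximate hitting times in Equations (22)-(23).
import numpy as np

def tau_Delta_t(jump_times, states, A, Delta, t):
    # Return the first grid point s in {0, Delta, 2*Delta, ...} with s <= t at
    # which the path is in A, and t if there is no such grid point.
    grid = np.arange(0.0, t + 1e-12, Delta)
    for s in grid:
        # omega(s) is the state entered at the last jump time <= s
        state = states[np.searchsorted(jump_times, s, side="right") - 1]
        if state in A:
            return s
    return t

# Path: starts in state 0, jumps to 1 at time 0.7, is absorbed in 2 at time 1.9.
jump_times, states = np.array([0.0, 0.7, 1.9]), np.array([0, 1, 2])
print(tau_Delta_t(jump_times, states, A={2}, Delta=0.5, t=10.0))   # 2.0
print(tau_Delta_t(jump_times, states, A={2}, Delta=0.1, t=10.0))   # ~1.9 (up to rounding)
```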

Let us now construct a specific sequence of approximate hitting time functions that will converge to this limit. To this end, first fix an arbitrary sequence (ϵn)n0(\epsilon_{n})_{n\in\mathbb{N}_{0}} in >0\mathbb{R}_{>0} such that limn+ϵn=0\lim_{n\to+\infty}\epsilon_{n}=0. Moreover, for any n0n\in\mathbb{N}_{0}, we introduce the (discrete-time) truncated hitting time τ0:n:Ω0\tau_{0:n}:\Omega_{\mathbb{N}_{0}}\to\mathbb{R}, defined for all ωΩ0\omega\in\Omega_{\mathbb{N}_{0}} as

τ0:n(ω)min({t{0,,n}:ω(t)A}{n}).\tau_{0:n}(\omega)\coloneqq\min\Bigl{(}\bigl{\{}t\in\{0,\ldots,n\}\,:\,\omega(t)\in A\bigr{\}}\cup\{n\}\Bigr{)}\,.

Now fix any $k\in\mathbb{N}_{0}$, let $\Delta_{k}\coloneqq 2^{-k}$, and let $\mathcal{T}_{k}$ denote the set of transition matrices that dominate $e^{\underline{Q}\Delta_{k}}$. We now consider discrete-time imprecise-Markov chains parameterized by $\mathcal{T}_{k}$. As discussed in [Krak et al., 2019], for all $n\in\mathbb{N}_{0}$ there are functions $\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]$ and $\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]$ in $\mathbb{R}^{\mathcal{X}}$ (these represent a lower and an upper expectation with respect to a game-theoretic imprecise-Markov chain, but the details do not concern us here) such that

𝔼¯𝒯kV[τ0:n|X0]\displaystyle\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}] 𝔼¯𝒯kI[τ0:n|X0]\displaystyle\leq\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{I}}[\tau_{0:n}\,|\,X_{0}]
𝔼¯𝒯kI[τ0:n|X0]𝔼¯𝒯kV[τ0:n|X0]\displaystyle\leq\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{I}}[\tau_{0:n}\,|\,X_{0}]\leq\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]

that, moreover, satisfy

limn+𝔼¯𝒯kV[τ0:n|X0]=𝔼¯𝒯kV[τ0|X0]=𝔼¯𝒯kHM[τ0|X0]\lim_{n\to+\infty}\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]=\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]=\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]

and

limn+𝔼¯𝒯kV[τ0:n|X0]=𝔼¯𝒯kV[τ0|X0]=𝔼¯𝒯kHM[τ0|X0].\lim_{n\to+\infty}\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]=\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]=\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]\,.

We already noted in Section 5 that the functions h¯Δk\underline{h}_{\Delta_{k}} and h¯Δk\overline{h}_{\Delta_{k}} from Equations (12) and (13) satisfy

\underline{h}_{\Delta_{k}}=\Delta_{k}\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]\quad\text{and}\quad\overline{h}_{\Delta_{k}}=\Delta_{k}\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]\,.

Combining the above, we find that

limn+Δk𝔼¯𝒯kV[τ0:n|X0]=h¯Δk\lim_{n\to+\infty}\Delta_{k}\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]=\underline{h}_{\Delta_{k}}

and

limn+Δk𝔼¯𝒯kV[τ0:n|X0]=h¯Δk.\lim_{n\to+\infty}\Delta_{k}\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]=\overline{h}_{\Delta_{k}}\,.

Hence for all k0k\in\mathbb{N}_{0}, we can now choose tk0t_{k}\in\mathbb{N}_{0} large enough such that tkkt_{k}\geq k, and so that with nk=2ktkn_{k}=2^{k}t_{k} we have both

Δk𝔼¯𝒯kV[τ0:nk|X0]h¯Δk<ϵk,\left\lVert\Delta_{k}\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:{n_{k}}}\,|\,X_{0}]-\underline{h}_{\Delta_{k}}\right\rVert<\epsilon_{k}\,, (25)

and

Δk𝔼¯𝒯kV[τ0:nk|X0]h¯Δk<ϵk.\left\lVert\Delta_{k}\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:{n_{k}}}\,|\,X_{0}]-\overline{h}_{\Delta_{k}}\right\rVert<\epsilon_{k}\,. (26)

With these selections, we now define the sequence (τk)k0(\tau_{k})_{k\in\mathbb{N}_{0}} of approximate hitting times as τkτΔktk\tau_{k}\coloneqq\tau_{\Delta_{k}}^{t_{k}} for all k0k\in\mathbb{N}_{0}. Clearly we have limk+Δk=0\lim_{k\to+\infty}\Delta_{k}=0, and since tkkt_{k}\geq k we also find that limk+tk=+\lim_{k\to+\infty}t_{k}=+\infty. Hence by Equation (24) we have the pointwise limit

τ0(ω)=limk+τk(ω)for all ωΩ0.\tau_{\mathbb{R}_{\geq 0}}(\omega)=\lim_{k\to+\infty}\tau_{k}(\omega)\quad\text{for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$.} (27)

Having constructed this specific sequence that converges to the “true” hitting time function, we will now demonstrate the relevant continuity properties of the lower- and upper expectations of interest, with respect to this sequence.

To this end, we define τ^:Ω0{+}\hat{\tau}:\Omega_{\mathbb{R}_{\geq 0}}\to\mathbb{R}\cup\{+\infty\} as

τ^(ω)supt0supn0τΔnt(ω)for all ωΩ0.\hat{\tau}(\omega)\coloneqq\sup_{t\in\mathbb{N}_{0}}\sup_{n\in\mathbb{N}_{0}}\tau_{\Delta_{n}}^{t}(\omega)\quad\text{for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$.} (28)

Then for all k0k\in\mathbb{N}_{0} we have τk(ω)=τΔktk(ω)τ^(ω)\tau_{k}(\omega)=\tau_{\Delta_{k}}^{t_{k}}(\omega)\leq\hat{\tau}(\omega) for all ωΩ0\omega\in\Omega_{\mathbb{R}_{\geq 0}}. Moreover, since every τk\tau_{k} is non-negative, it holds in fact that |τk(ω)|τ^(ω)\left\lvert\tau_{k}(\omega)\right\rvert\leq\hat{\tau}(\omega) for all k0k\in\mathbb{N}_{0} and ωΩ0\omega\in\Omega_{\mathbb{R}_{\geq 0}}. This means that if we can show that the upper expectation 𝔼¯𝒬I[τ^|X0=x]\smash{\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}}[\hat{\tau}\,|\,X_{0}=x] is bounded for all x𝒳x\in\mathcal{X}, then we can use the imprecise version of the dominated convergence theorem [Erreygers, 2021, Thm 5.32] to take lower- and upper expectations of the limit in Equation (27). So, we will now show that this boundedness indeed holds.

We note that, for fixed $t\in\mathbb{N}_{0}$, $\tau_{\Delta_{n}}^{t}$ is monotonically decreasing as we increase $n\in\mathbb{N}_{0}$. To see this, first consider the grids $\nu_{\Delta_{n}}^{t}$ and $\nu_{\Delta_{n+1}}^{t}$ over $[0,t]$. For any $s\in\nu_{\Delta_{n}}^{t}$ there is some $i\in\mathbb{N}_{0}$ such that $s=i\Delta_{n}$, and since $\Delta_{n}=2^{-n}=2\Delta_{n+1}$, we find that also $s=2i\Delta_{n+1}\in\nu_{\Delta_{n+1}}^{t}$. Hence we conclude that $\nu_{\Delta_{n}}^{t}\subseteq\nu_{\Delta_{n+1}}^{t}$. From this set inclusion, we also clearly have for any $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$ that

\[
\{s\in\nu_{\Delta_{n}}^{t}\,:\,\omega(s)\in A\}\subseteq\{s\in\nu_{\Delta_{n+1}}^{t}\,:\,\omega(s)\in A\}\,,
\]

and so, together with the fact that $s\leq t$ for all $s\in\nu_{\Delta_{n+1}}^{t}$, it then follows from Equation (23) that $\tau_{\Delta_{n}}^{t}(\omega)\geq\tau_{\Delta_{n+1}}^{t}(\omega)$.

Using this observation, we immediately find that for any $t\in\mathbb{N}_{0}$ and $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$ it holds that

\[
\sup_{n\in\mathbb{N}_{0}}\tau_{\Delta_{n}}^{t}(\omega)=\tau_{\Delta_{0}}^{t}(\omega)=\tau_{1}^{t}(\omega)\,,
\]

and so from Equation (28), we have

\[
\hat{\tau}(\omega)=\sup_{t\in\mathbb{N}_{0}}\tau_{1}^{t}(\omega)\,.
\]

Next, we observe that $\tau_{1}^{t}$ is monotonically increasing as we increase $t\in\mathbb{N}_{0}$. Indeed, for any $t\in\mathbb{N}_{0}$ the grid $\nu_{1}^{t}$ over $[0,t]$ is simply the set $\nu_{1}^{t}=\{0,\ldots,t\}$, so the monotonicity of $\tau_{1}^{t}$ in $t$ follows immediately from Equation (23). In particular, this implies that the sequence $(\tau_{1}^{t})_{t\in\mathbb{N}_{0}}$ converges monotonically to $\hat{\tau}$. Moreover, we have $\tau_{1}^{0}(\omega)=0$ for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$, and so we find that, identically,

\[
\underline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\tau_{1}^{0}\,|\,X_{0}]=\inf_{P\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{I}}}\mathbb{E}_{P}[\tau_{1}^{0}\,|\,X_{0}]=0\,.
\]

Hence by the continuity of upper expectations with respect to monotonically increasing convergent sequences of functions that are bounded below [Erreygers, 2021, Thm 5.31], we have

\[
\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}]=\lim_{t\to+\infty,\,t\in\mathbb{N}_{0}}\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\tau_{1}^{t}\,|\,X_{0}]\,.
\tag{29}
\]

Now, for every $t\in\mathbb{N}_{0}$, $\tau_{1}^{t}$ only depends on finitely many time points; indeed, $\tau_{1}^{t}(\omega)$ only depends on the value of $\omega(s)$ for $s\in\{0,\ldots,t\}$. Using [Krak, 2021, Thm 7.2], this means that the lower- and upper expectations of these functions with respect to the imprecise-Markov chain $\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$ can also be expressed as lower (resp. upper) expectations with respect to an induced discrete-time imprecise-Markov chain. Indeed, since the step size used in these approximating functions is uniformly equal to one, and using the obvious correspondence between $\tau_{1}^{t}$ and $\tau_{0:t}$, it is not difficult to see that

\[
\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\tau_{1}^{t}\,|\,X_{0}]=\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{I}}[\tau_{0:t}\,|\,X_{0}]\quad\text{for all $t\in\mathbb{N}_{0}$,}
\tag{30}
\]

where $\mathcal{T}$ is the set of transition matrices that dominate $e^{\underline{Q}}$.
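As an aside, one standard way to approximate the lower transition operator $e^{\underline{Q}}$ that $\mathcal{T}$ dominates is through the limit $e^{\underline{Q}t}f=\lim_{n\to+\infty}(I+\tfrac{t}{n}\underline{Q})^{n}f$. The following minimal Python sketch does this for a hypothetical lower rate operator given as the lower envelope of two rate matrices; the matrices, the helper names, and all numbers are illustrative assumptions only.

\begin{verbatim}
import numpy as np

# Hypothetical finite set of rate matrices (rows sum to zero); its lower
# envelope Q_lower plays the role of the lower rate operator.
Q1 = np.array([[-2.0, 1.5, 0.5], [1.0, -3.0, 2.0], [0.0, 0.0, 0.0]])
Q2 = np.array([[-1.0, 0.5, 0.5], [2.0, -4.0, 2.0], [0.0, 0.0, 0.0]])

def Q_lower(f):
    # Componentwise minimum of Qf over the set of rate matrices.
    return np.minimum(Q1 @ f, Q2 @ f)

def lower_exp_operator(f, t=1.0, n=10_000):
    # Approximate e^{Q_lower * t} f by iterating (I + (t/n) Q_lower) n times.
    for _ in range(n):
        f = f + (t / n) * Q_lower(f)
    return f

# Applied to the indicator of A = {2}: lower probabilities of being in A at time 1.
print(lower_exp_operator(np.array([0.0, 0.0, 1.0])))
\end{verbatim}

In particular, positivity of such values on $A^{c}$ is the kind of condition that Assumption 2 imposes further on.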

We now again invoke the previously mentioned results from [Krak et al., 2019]; for any $t\in\mathbb{N}_{0}$, there is a function $\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{0:t}\,|\,X_{0}]\in\mathbb{R}^{\mathcal{X}}$ that satisfies

\[
\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{I}}[\tau_{0:t}\,|\,X_{0}]\leq\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{0:t}\,|\,X_{0}]\,.
\tag{31}
\]

Combining Equations (29), (30), and (31), and using [Krak et al., 2019, Prop 7] to establish the limit on the final right-hand side, we have

\begin{align*}
\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}]
&=\lim_{t\to+\infty,\,t\in\mathbb{N}_{0}}\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\tau_{1}^{t}\,|\,X_{0}]\\
&=\lim_{t\to+\infty,\,t\in\mathbb{N}_{0}}\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{I}}[\tau_{0:t}\,|\,X_{0}]\\
&\leq\lim_{t\to+\infty,\,t\in\mathbb{N}_{0}}\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{0:t}\,|\,X_{0}]=\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]\,.
\end{align*}

By [Krak et al., 2019, Thm 12] it holds that

\[
\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]=\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]\,,
\]

and, moreover, that there is some homogeneous discrete-time Markov chain $P\in\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}}$ with associated transition matrix $T={}^{P}T\in\mathcal{T}$ and hitting times $h=\mathbb{E}_{P}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]$ such that $h=\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]$. Putting this together, we find that

\[
\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}=x]\leq h(x)\quad\text{for all $x\in\mathcal{X}$.}
\tag{32}
\]

By Proposition 1, $h$ is also the minimal non-negative solution to the system

\[
h=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Th\,.
\tag{33}
\]

It is immediate from the definition that $h|_{A}=0$, and since $\hat{\tau}$ is clearly non-negative, we obtain from Equation (32) that for all $x\in A$ we have

\[
0\leq\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}=x]\leq h(x)=0\,,
\]

or in other words, that $\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}=x]=0$ for all $x\in A$. So, it remains to bound this upper expectation on $A^{c}$.

By our Assumption 2, it holds for all $x\in A^{c}$ that $e^{\underline{Q}}\mathbb{I}_{A}(x)>0$. Since, by Proposition 3, $e^{\underline{Q}}$ is the lower transition operator corresponding to $\mathcal{T}$, it follows that $e^{\underline{Q}}$ satisfies conditions C1–C3 and R1 from [Krak, 2020]. We now recall that $T={}^{P}T\in\mathcal{T}$. Since the preconditions C1–C3 and R1 of that reference are all satisfied, we can invoke [Krak, 2020, Lemma 10], which states that the inverse operator $(I-T|_{A^{c}})^{-1}$ exists.

We note that $h|_{A}=0$, and so $h=(h|_{A^{c}})\!\!\uparrow_{\mathcal{X}}$. Hence, in particular, we have $T|_{A^{c}}h|_{A^{c}}=(Th)|_{A^{c}}$. From Equation (33), we now find that

\[
h|_{A^{c}}=\mathbf{1}+(Th)|_{A^{c}}=\mathbf{1}+T|_{A^{c}}h|_{A^{c}}\,,
\]

and so, re-ordering terms, we have $(I-T|_{A^{c}})h|_{A^{c}}=\mathbf{1}$. Using the existence of the inverse operator established above, we obtain

\[
h|_{A^{c}}=(I-T|_{A^{c}})^{-1}\mathbf{1}\,.
\]

Since $(I-T|_{A^{c}})$ is an invertible bounded linear operator, $(I-T|_{A^{c}})^{-1}$ is clearly also bounded. Hence we have

\[
\left\lVert h|_{A^{c}}\right\rVert=\left\lVert(I-T|_{A^{c}})^{-1}\mathbf{1}\right\rVert\leq\left\lVert(I-T|_{A^{c}})^{-1}\right\rVert<+\infty\,.
\]
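For intuition, the restricted linear system just derived is easy to solve numerically; the following minimal Python sketch does so for a single, purely hypothetical transition matrix $T$ and target set $A$, so it illustrates the precise special case of Proposition 1 and Equation (33) rather than the imprecise models themselves.

\begin{verbatim}
import numpy as np

# Hypothetical 3-state example: states {0, 1, 2}, target set A = {2}.
T = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.0, 0.0, 1.0]])   # a transition matrix (rows sum to one)
A, Ac = [2], [0, 1]

# Restriction T|_{A^c}: keep only the rows and columns indexed by A^c.
T_Ac = T[np.ix_(Ac, Ac)]

# h|_{A^c} = (I - T|_{A^c})^{-1} 1, cf. Equation (33) after re-ordering terms.
h_Ac = np.linalg.solve(np.eye(len(Ac)) - T_Ac, np.ones(len(Ac)))

h = np.zeros(T.shape[0])   # full hitting-time vector: zero on A, h_Ac on A^c
h[Ac] = h_Ac
print(h)                   # expected number of steps until A is hit
\end{verbatim}

The inverse exists in this example because $A$ is reachable from every state in $A^{c}$, which is the kind of reachability that Assumption 2 encodes in the general argument.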

Combining the bound above with Equation (32), we find that $\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}=x]<+\infty$ for all $x\in A^{c}$. In summary, at this point we have shown that $\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}=x]$ is bounded for all $x\in\mathcal{X}$. Since we already established that $\hat{\tau}$ absolutely dominates the sequence $(\tau_{k})_{k\in\mathbb{N}_{0}}$, we can now finally use the limit (27) and the dominated convergence theorem [Erreygers, 2021, Thm 5.32] to establish that

\[
\limsup_{k\to+\infty}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k}\,|\,X_{0}]\leq\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\,,
\tag{34}
\]

and

\[
\overline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\leq\liminf_{k\to+\infty}\overline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k}\,|\,X_{0}]\,.
\tag{35}
\]

This concludes the first part of this proof. Our next step will be to identify the limits superior and inferior in the above inequalities as corresponding to, respectively, $\underline{h}$ and $\overline{h}$.

Let us start by obtaining the required result for the lower expectation. From the definition of the limit superior, there is a convergent subsequence such that

\[
\underline{s}\coloneqq\lim_{j\to+\infty}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k_{j}}\,|\,X_{0}]=\limsup_{k\to+\infty}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k}\,|\,X_{0}]\,.
\tag{36}
\]

Now fix any $j\in\mathbb{N}_{0}$, and consider the approximate function $\tau_{k_{j}}=\tau_{\Delta_{k_{j}}}^{t_{k_{j}}}$. As before, this function really only depends on the system at finitely many time points; specifically, those on the grid $\nu_{\Delta_{k_{j}}}^{t_{k_{j}}}$ over $[0,t_{k_{j}}]$. We can therefore again use [Krak, 2021, Thm 7.2] to express the lower- and upper expectations of this function with respect to the imprecise-Markov chain $\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$ as lower- and upper expectations of a function with respect to an induced discrete-time imprecise-Markov chain. Since the step size of this grid is now equal to $\Delta_{k_{j}}$ rather than one, this requires a bit more effort than before. In particular, we now need to compensate for the step size $\Delta_{k_{j}}$ of the grid. Indeed, the corresponding discrete-time imprecise-Markov chain should consider steps that are implicitly of this “length”, so we consider the model induced by the set $\mathcal{T}_{k_{j}}$ of transition matrices that dominate $e^{\underline{Q}\Delta_{k_{j}}}$. It then remains to find an appropriate translation $\tilde{\tau}_{k_{j}}$ of $\tau_{k_{j}}$ to the domain $\Omega_{\mathbb{N}_{0}}$.

As a first observation, we note that this “translation” $\tilde{\tau}_{k_{j}}:\Omega_{\mathbb{N}_{0}}\to\mathbb{R}$ should depend on the same number of time points as $\tau_{k_{j}}$. Since $t_{k_{j}}\in\mathbb{N}_{0}$, it holds that $t_{k_{j}}\in\nu^{t_{k_{j}}}_{\Delta_{k_{j}}}$. Hence it follows from Equation (22) that $\nu^{t_{k_{j}}}_{\Delta_{k_{j}}}$ contains exactly $\Delta_{k_{j}}^{-1}t_{k_{j}}=2^{k_{j}}t_{k_{j}}=n_{k_{j}}$ time points in addition to the origin $0$, and that $\tau_{k_{j}}$ depends exactly on these time points. Indeed, inspection of Equation (23) reveals that, by re-scaling to compensate for the step size $\Delta_{k_{j}}$, the quantity $\Delta_{k_{j}}^{-1}\tau_{k_{j}}(\omega)$ is simply the index of the grid element of $\nu^{t_{k_{j}}}_{\Delta_{k_{j}}}$ on which $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$ did (or did not) first hit $A$. Adapting Equation (23), we therefore define, for any $\omega\in\Omega_{\mathbb{N}_{0}}$,

\[
\tilde{\tau}_{k_{j}}(\omega)\coloneqq\min\Bigl(\bigl\{s\in\nu_{\Delta_{k_{j}}}^{t_{k_{j}}}\,:\,\omega\bigl(\Delta_{k_{j}}^{-1}s\bigr)\in A\bigr\}\cup\{t_{k_{j}}\}\Bigr)\,.
\]

We see that, as required, $\Delta_{k_{j}}^{-1}\tilde{\tau}_{k_{j}}(\omega)$ is again simply the index of the step on which $\omega\in\Omega_{\mathbb{N}_{0}}$ did (or did not) first hit $A$. This implies the relation to the discrete-time truncated hitting time $\tau_{0:n_{k_{j}}}$; for any $\omega\in\Omega_{\mathbb{N}_{0}}$ we have

\begin{align*}
\tilde{\tau}_{k_{j}}(\omega)
&=\Delta_{k_{j}}\Delta_{k_{j}}^{-1}\tilde{\tau}_{k_{j}}(\omega)\\
&=\Delta_{k_{j}}\min\Bigl(\bigl\{s\in\Delta_{k_{j}}^{-1}\nu_{\Delta_{k_{j}}}^{t_{k_{j}}}\,:\,\omega(s)\in A\bigr\}\cup\bigl\{\Delta_{k_{j}}^{-1}t_{k_{j}}\bigr\}\Bigr)\\
&=\Delta_{k_{j}}\min\Bigl(\bigl\{s\in\{0,\ldots,n_{k_{j}}\}\,:\,\omega(s)\in A\bigr\}\cup\{n_{k_{j}}\}\Bigr)\\
&=\Delta_{k_{j}}\tau_{0:n_{k_{j}}}(\omega)\,,
\end{align*}

and so we simply have that $\tilde{\tau}_{k_{j}}=\Delta_{k_{j}}\tau_{0:n_{k_{j}}}$. Following the discussion in [Krak, 2021, Chap 7], and [Krak, 2021, Thm 7.2] in particular, we therefore find the identity

\[
\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k_{j}}\,|\,X_{0}]=\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]\quad\text{for all $j\in\mathbb{N}_{0}$.}
\tag{37}
\]
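For concreteness, here is a small worked instance of this rescaling, with purely illustrative numbers: take $\Delta_{k_{j}}=\nicefrac{1}{4}$ and $t_{k_{j}}=2$, so that $n_{k_{j}}=8$. If a path first visits $A$ at the grid point $s=\nicefrac{5}{4}$, then $\tilde{\tau}_{k_{j}}=\nicefrac{5}{4}$, while the corresponding discrete-time path first hits $A$ at step $5$, so that $\tau_{0:8}=5$ and indeed $\tilde{\tau}_{k_{j}}=\Delta_{k_{j}}\tau_{0:8}=\nicefrac{1}{4}\cdot 5$.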

We again recall from Krak et al. [2019] the objects $\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]$ in $\mathbb{R}^{\mathcal{X}}$ satisfying

\[
\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]\leq\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]\quad\text{for all $j\in\mathbb{N}_{0}$.}
\]

Hence from Equations (36) and (37) we now find that

\begin{align*}
\underline{s}
&=\lim_{j\to+\infty}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k_{j}}\,|\,X_{0}]\\
&=\lim_{j\to+\infty}\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]\\
&\geq\lim_{j\to+\infty}\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]\,.\tag{38}
\end{align*}

We now note that, for all $j\in\mathbb{N}_{0}$, we have

\begin{align*}
&\left\lVert\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]-\underline{h}\right\rVert\\
&\quad\leq\left\lVert\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]-\underline{h}_{\Delta_{k_{j}}}\right\rVert+\left\lVert\underline{h}_{\Delta_{k_{j}}}-\underline{h}\right\rVert\\
&\quad<\epsilon_{k_{j}}+\left\lVert\underline{h}_{\Delta_{k_{j}}}-\underline{h}\right\rVert\,,
\end{align*}

where we used Equation (25) for the final inequality. Using that $\lim_{j\to+\infty}\epsilon_{k_{j}}=0$ and $\lim_{j\to+\infty}\Delta_{k_{j}}=0$, together with Proposition 16, we see that both summands vanish as we increase $j\in\mathbb{N}_{0}$, and so we have

\[
\lim_{j\to+\infty}\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]=\underline{h}\,.
\tag{39}
\]

We already established in Section 5 that $\underline{h}=\underline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]$. Hence by combining Equations (34), (36), (38), and (39), we now find that

\[
\underline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]=\underline{h}\leq\underline{s}\leq\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\,.
\tag{40}
\]

However, as noted in Section 3.1 we have the inclusion $\mathcal{P}^{\mathrm{HM}}_{\mathcal{Q}}\subseteq\mathcal{P}^{\mathrm{M}}_{\mathcal{Q}}\subseteq\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$, so it immediately follows from the definition of the lower expectations that

\[
\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\leq\underline{\mathbb{E}}^{\mathrm{M}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\leq\underline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\,.
\]

Hence by Equation (40) we obtain the identity

\[
\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]=\underline{\mathbb{E}}^{\mathrm{M}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]=\underline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\,,
\]

which concludes the proof that the lower expected hitting times are the same for all three types of continuous-time imprecise-Markov chains. We omit the proof for the upper expected hitting times; this is completely analogous, starting instead from Equation (35) and using the norm bound (26) to pass to the limit $\overline{h}=\overline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]$. ∎

Proof of Theorem 2.

First fix any $\Delta>0$. For $\underline{h}_{\Delta}$ it then holds that

\[
\underline{h}_{\Delta}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}e^{\underline{Q}\Delta}\underline{h}_{\Delta}\,.
\]

Since $\underline{h}_{\Delta}(x)=0$ for all $x\in A$, we have $\underline{h}_{\Delta}=\mathbb{I}_{A^{c}}\underline{h}_{\Delta}$ and $\mathbb{I}_{A}\underline{h}_{\Delta}=0$. We can therefore subtract $\mathbb{I}_{A^{c}}\underline{h}_{\Delta}$ on both sides of the above and divide by $\Delta$; since the resulting left-hand side equals $\frac{1}{\Delta}\mathbb{I}_{A}\underline{h}_{\Delta}=\mathbb{I}_{A}\underline{h}_{\Delta}=0$, this yields

\[
\mathbb{I}_{A}\underline{h}_{\Delta}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\frac{e^{\underline{Q}\Delta}-I}{\Delta}\underline{h}_{\Delta}\,.
\]

Because the individual limits exist by Proposition 16 and [De Bock, 2017, Prop 9], taking $\Delta$ to zero yields

\begin{align*}
\mathbb{I}_{A}\underline{h}
&=\mathbb{I}_{A}\lim_{\Delta\to 0^{+}}\underline{h}_{\Delta}\\
&=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\lim_{\Delta\to 0^{+}}\frac{e^{\underline{Q}\Delta}-I}{\Delta}\underline{h}_{\Delta}\\
&=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\lim_{\Delta\to 0^{+}}\frac{e^{\underline{Q}\Delta}-I}{\Delta}\lim_{\Delta\to 0^{+}}\underline{h}_{\Delta}\\
&=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\underline{Q}\,\underline{h}\,.
\end{align*}

So, $\underline{h}$ is indeed a solution to the system $\mathbb{I}_{A}\underline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\underline{Q}\,\underline{h}$. It follows from a completely analogous argument that also $\overline{h}$ is a solution to $\mathbb{I}_{A}\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,\overline{h}$.
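For intuition, the role of the vanishing step size can be checked numerically in the precise special case of a single rate matrix $Q$ (rather than a set $\mathcal{Q}$), where the lower and upper systems coincide. The following Python sketch is purely illustrative: the matrix, the target set, and the step sizes are hypothetical, and the discrete-time solution is the precise analogue of the $\Delta$-system used at the start of this proof.

\begin{verbatim}
import numpy as np
from scipy.linalg import expm

# Hypothetical rate matrix (rows sum to zero) with target set A = {2}.
Q = np.array([[-2.0,  1.5,  0.5],
              [ 1.0, -3.0,  2.0],
              [ 0.0,  0.0,  0.0]])
Ac = [0, 1]   # complement of A

# Continuous-time system restricted to A^c:  0 = 1 + Q|_{A^c} h|_{A^c}.
h_cont = np.linalg.solve(-Q[np.ix_(Ac, Ac)], np.ones(len(Ac)))

# Delta-discretized solution h_Delta|_{A^c} = Delta (I - e^{Q Delta}|_{A^c})^{-1} 1;
# it should approach h_cont as Delta shrinks.
for Delta in [1.0, 0.1, 0.01]:
    P_Ac = expm(Q * Delta)[np.ix_(Ac, Ac)]
    h_Delta = Delta * np.linalg.solve(np.eye(len(Ac)) - P_Ac, np.ones(len(Ac)))
    print(Delta, h_Delta, h_cont)
\end{verbatim}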

That $\underline{h}$ and $\overline{h}$ are non-negative is clear. We now first show that $\underline{h}$ is the minimal non-negative solution to its corresponding system. To this end, suppose ex absurdo that there is some non-negative $h\in\mathbb{R}^{\mathcal{X}}$ such that $h(x)<\underline{h}(x)$ for some $x\in\mathcal{X}$, and $\mathbb{I}_{A}h=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\underline{Q}h$. Then clearly $x\in A^{c}$, since $\underline{h}(y)=0$ for all $y\in A$ and $h$ is non-negative.

By Proposition 4, there is then some $Q\in\mathcal{Q}$ such that $Qh=\underline{Q}h$, and so also $\mathbb{I}_{A}h=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Qh$. By Proposition 2 there is some minimal non-negative solution $h_{*}$ to the system $\mathbb{I}_{A}h_{*}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Qh_{*}$, where the minimality implies in particular that $h_{*}\leq h$. Since $Q\in\mathcal{Q}$, we obtain

\[
h_{*}(x)\leq h(x)<\underline{h}(x)=\inf_{Q^{\prime}\in\mathcal{Q}}h^{Q^{\prime}}(x)\leq h_{*}(x)\,,
\]

which is a contradiction.

We next show that $\overline{h}$ is the minimal non-negative solution to its corresponding system; this requires a bit more effort, and we start with some auxiliary constructions. By Proposition 4, there is some $Q\in\mathcal{Q}$ such that $Q\overline{h}=\overline{Q}\,\overline{h}$. Let $G$ be the subgenerator of $Q$.

Consider the $\xi>0$ from Proposition 11, and let $\left\lVert\cdot\right\rVert_{*}$ be the norm from Section 4.1. Since $\mathbb{R}^{A^{c}}$ is finite-dimensional, the norms $\left\lVert\cdot\right\rVert$ and $\left\lVert\cdot\right\rVert_{*}$ are equivalent, whence there is some $c\in\mathbb{R}_{>0}$ such that $\left\lVert f\right\rVert_{*}\leq c\left\lVert f\right\rVert$ for all $f\in\mathbb{R}^{A^{c}}$.

Now let $\Delta>0$ be such that $\Delta\left\lVert Q\right\rVert\leq 1$, $\Delta\left\lVert\underline{Q}\right\rVert\leq 1$, $\Delta\xi<\nicefrac{2}{3}$, and $\Delta c\left\lVert Q\right\rVert^{2}<\nicefrac{\xi}{3}$; this is clearly always possible. Define the map $H:\mathbb{R}^{A^{c}}\to\mathbb{R}^{A^{c}}$ for all $f\in\mathbb{R}^{A^{c}}$ as

\[
H(f)\coloneqq f+\Delta\mathbf{1}+\Delta Gf=\Delta\mathbf{1}+(I+\Delta G)f\,.
\]

Let us show that $H$ is a contraction on the Banach space $(\mathbb{R}^{A^{c}},\left\lVert\cdot\right\rVert_{*})$, or in other words, that there is some $\alpha\in[0,1)$ such that $\left\lVert H(f)-H(g)\right\rVert_{*}\leq\alpha\left\lVert f-g\right\rVert_{*}$ for all $f,g\in\mathbb{R}^{A^{c}}$ [Renardy and Rogers, 2006, Sec 10.1.1]. So, fix any $f,g\in\mathbb{R}^{A^{c}}$. Then we have

\begin{align*}
\left\lVert H(f)-H(g)\right\rVert_{*}
&=\left\lVert(I+\Delta G)f-(I+\Delta G)g\right\rVert_{*}\\
&=\left\lVert(I+\Delta G)(f-g)\right\rVert_{*}\\
&\leq\left\lVert I+\Delta G\right\rVert_{*}\left\lVert f-g\right\rVert_{*}\,,
\end{align*}

from which we find that

\[
\left\lVert H(f)-H(g)\right\rVert_{*}\leq\left(\left\lVert(I+\Delta G)-e^{G\Delta}\right\rVert_{*}+\left\lVert e^{G\Delta}\right\rVert_{*}\right)\left\lVert f-g\right\rVert_{*}\,.
\tag{41}
\]

By Proposition 14 we have $\left\lVert e^{G\Delta}\right\rVert_{*}\leq e^{-\xi\Delta}$, and so, using the standard quadratic bound $e^{-x}\leq 1-x+\frac{1}{2}x^{2}$ on the negative scalar exponential,

\[
\left\lVert e^{G\Delta}\right\rVert_{*}\leq 1-\Delta\xi+\frac{1}{2}\Delta^{2}\xi^{2}<1-\frac{2}{3}\Delta\xi\,,
\tag{42}
\]

where the final inequality holds because $\frac{1}{2}\Delta^{2}\xi^{2}<\frac{1}{3}\Delta\xi$, which follows from $\Delta\xi<\nicefrac{2}{3}$.

Moreover, we have that

\begin{align*}
\left\lVert(I+\Delta G)-e^{G\Delta}\right\rVert_{*}
&\leq c\left\lVert(I+\Delta G)-e^{G\Delta}\right\rVert\\
&\leq c\left\lVert(I+\Delta Q)-e^{Q\Delta}\right\rVert\\
&\leq c\Delta^{2}\left\lVert Q\right\rVert^{2}\,,
\end{align*}

where the second inequality follows from Lemmas 3 and 4 and Corollary 3, and the final inequality from [Krak, 2021, Lemma B.8]. Since $\Delta c\left\lVert Q\right\rVert^{2}<\nicefrac{\xi}{3}$, we have

\[
\left\lVert(I+\Delta G)-e^{G\Delta}\right\rVert_{*}<\frac{1}{3}\Delta\xi\,.
\tag{43}
\]

Combining Equations (41), (42), and (43) we obtain

\[
\left\lVert H(f)-H(g)\right\rVert_{*}\leq\left(1-\frac{1}{3}\Delta\xi\right)\left\lVert f-g\right\rVert_{*}\,.
\]

Since $\Delta>0$, $\xi>0$, and $\Delta\xi<\nicefrac{2}{3}$, we conclude that $H$ is indeed a contraction. Hence by the Banach fixed-point theorem [Renardy and Rogers, 2006, Thm 10.1], there is a unique fixed point $f\in\mathbb{R}^{A^{c}}$ such that $H(f)=f$ and, for any $g\in\mathbb{R}^{A^{c}}$, it holds that

\[
\lim_{n\to+\infty}H^{n}(g)=f\,.
\tag{44}
\]

It is easy to see that this unique fixed point is given by $\overline{h}|_{A^{c}}$. Indeed, from the choice of $Q$ and the fact that $\overline{h}$ satisfies $\mathbb{I}_{A}\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,\overline{h}$, we have

\[
\mathbb{I}_{A}\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Q\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,\overline{h}\,.
\]

Moreover, since $\overline{h}|_{A}=0$ we have $\mathbb{I}_{A}\overline{h}=0$ and $\overline{h}=(\overline{h}|_{A^{c}})\!\!\uparrow_{\mathcal{X}}$, whence

\[
\bigl(Q\,\overline{h}\bigr)|_{A^{c}}=\bigl(Q\,(\overline{h}|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\bigr)|_{A^{c}}=G\,\overline{h}|_{A^{c}}\,.
\]

Noting that $\mathbb{I}_{A^{c}}|_{A^{c}}=\mathbf{1}$, after multiplying with $\Delta$ we find that $\Delta\mathbf{1}+\Delta G\,\overline{h}|_{A^{c}}=0$. Comparing with the definition of $H$, we have

\[
H(\overline{h}|_{A^{c}})=\overline{h}|_{A^{c}}+\Delta\mathbf{1}+\Delta G\,\overline{h}|_{A^{c}}=\overline{h}|_{A^{c}}+0=\overline{h}|_{A^{c}}\,,
\]

so $\overline{h}|_{A^{c}}$ is indeed a fixed point of $H$. Since we already established that $H$ has a unique fixed point, we conclude from Equation (44) that

\[
\overline{h}|_{A^{c}}=\lim_{n\to+\infty}H^{n}(g)\quad\text{for all $g\in\mathbb{R}^{A^{c}}$.}
\tag{45}
\]
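To illustrate this fixed-point iteration concretely, the following minimal Python sketch again uses a single hypothetical rate matrix $Q$ (the precise special case, with $G$ simply the restriction of $Q$ to $A^{c}$) and a step size $\Delta$ chosen small enough; the iterates $H^{n}(g)$ then converge to the hitting-time vector on $A^{c}$, which can be compared against a direct linear solve.

\begin{verbatim}
import numpy as np

# Hypothetical rate matrix with target set A = {2}; G is Q restricted to A^c.
Q = np.array([[-2.0,  1.5,  0.5],
              [ 1.0, -3.0,  2.0],
              [ 0.0,  0.0,  0.0]])
Ac = [0, 1]
G = Q[np.ix_(Ac, Ac)]

Delta = 0.1   # small enough, in the spirit of the conditions on Delta above
def H(f):
    # H(f) = Delta * 1 + (I + Delta * G) f
    return Delta * np.ones(len(Ac)) + (np.eye(len(Ac)) + Delta * G) @ f

f = np.zeros(len(Ac))     # an arbitrary starting point g
for _ in range(2000):     # iterate H^n(g)
    f = H(f)

print(f)                                       # the fixed point of H
print(np.linalg.solve(-G, np.ones(len(Ac))))   # h restricted to A^c, for comparison
\end{verbatim}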

Next, let us show that $H$ is monotone. To this end, fix any $f,g\in\mathbb{R}^{A^{c}}$ such that $f\leq g$; then clearly also $f\!\!\uparrow_{\mathcal{X}}\leq g\!\!\uparrow_{\mathcal{X}}$. Since $\Delta>0$ is such that $\Delta\left\lVert Q\right\rVert\leq 1$, it follows from [Krak, 2021, Prop 4.9] that $(I+\Delta Q)$ is a transition matrix. By the monotonicity of transition matrices [Krak, 2021, Prop 3.32], we therefore find that

\[
(I+\Delta Q)f\!\!\uparrow_{\mathcal{X}}\leq(I+\Delta Q)g\!\!\uparrow_{\mathcal{X}}\,,
\]

which in turn implies that

\begin{align*}
(I+\Delta G)f
&=\bigl((I+\Delta Q)f\!\!\uparrow_{\mathcal{X}}\bigr)|_{A^{c}}\\
&\leq\bigl((I+\Delta Q)g\!\!\uparrow_{\mathcal{X}}\bigr)|_{A^{c}}=(I+\Delta G)g\,.
\end{align*}

From the definition of $H$ we therefore conclude that $H(f)\leq H(g)$. Since $f,g$ with $f\leq g$ are arbitrary, this concludes the proof of the monotonicity of $H$.

Now, let us define $\overline{H}:\mathbb{R}^{\mathcal{X}}\to\mathbb{R}^{\mathcal{X}}$ for all $f\in\mathbb{R}^{\mathcal{X}}$ as

\[
\overline{H}(f)\coloneqq\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}(I+\Delta\overline{Q})f\,.
\]

We first note that, since $\Delta\left\lVert\underline{Q}\right\rVert\leq 1$, it follows from [De Bock, 2017, Prop 5] that $(I+\Delta\underline{Q})$ is a lower transition operator. From the conjugacy of $\underline{Q}$ and $\overline{Q}$, we have for any $f\in\mathbb{R}^{\mathcal{X}}$ that

\begin{align*}
-(I+\Delta\underline{Q})(-f)
&=f-\Delta\underline{Q}(-f)\\
&=f+\Delta\overline{Q}f=(I+\Delta\overline{Q})f\,,
\end{align*}

which implies that $(I+\Delta\overline{Q})$ is the upper transition operator that is conjugate to the lower transition operator $(I+\Delta\underline{Q})$. By the monotonicity of upper transition operators (see Section 3.2), it follows that $\overline{H}$ is monotone, or in other words that for all $f,g\in\mathbb{R}^{\mathcal{X}}$ with $f\leq g$ it holds that $\overline{H}(f)\leq\overline{H}(g)$.
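This conjugacy is easy to check numerically when the lower and upper rate operators are the lower and upper envelopes of a set of rate matrices; the following Python sketch uses two hypothetical matrices and a hypothetical test function (the finite set is purely for illustration, since the conjugacy argument itself does not require it).

\begin{verbatim}
import numpy as np

# Hypothetical rate matrices (rows sum to zero) generating the envelopes.
Q1 = np.array([[-2.0, 1.5, 0.5], [1.0, -3.0, 2.0], [0.0, 0.0, 0.0]])
Q2 = np.array([[-1.0, 0.5, 0.5], [2.0, -4.0, 2.0], [0.0, 0.0, 0.0]])

def Q_lower(f):
    # Lower envelope: componentwise minimum of Qf over the set.
    return np.minimum(Q1 @ f, Q2 @ f)

def Q_upper(f):
    # Upper envelope: componentwise maximum of Qf over the set.
    return np.maximum(Q1 @ f, Q2 @ f)

f = np.array([1.0, -2.0, 0.5])
Delta = 0.1

# Conjugacy: Q_upper(f) == -Q_lower(-f), and consequently
# (I + Delta Q_upper)(f) == -(I + Delta Q_lower)(-f).
print(np.allclose(Q_upper(f), -Q_lower(-f)))                              # True
print(np.allclose(f + Delta * Q_upper(f), -(-f + Delta * Q_lower(-f))))   # True
\end{verbatim}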

Let us next show that

\[
\overline{H}(f)\geq H(f|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\quad\text{for all $f\in\mathbb{R}^{\mathcal{X}}$ with $f|_{A}=0$.}
\tag{46}
\]

Indeed, if $f|_{A}=0$ then $(f|_{A^{c}})\!\!\uparrow_{\mathcal{X}}=f$, and since $Q\in\mathcal{Q}$, it follows from the definitions of $\overline{Q}$ and $G$ that then

\[
(\overline{Q}\,f)|_{A^{c}}\geq(Q\,f)|_{A^{c}}=(Q\,(f|_{A^{c}})\!\!\uparrow_{\mathcal{X}})|_{A^{c}}=Gf|_{A^{c}}\,.
\]

Hence we have

\begin{align*}
\overline{H}(f)|_{A^{c}}
&=\Delta\mathbf{1}+f|_{A^{c}}+\Delta(\overline{Q}\,f)|_{A^{c}}\\
&\geq\Delta\mathbf{1}+f|_{A^{c}}+\Delta Gf|_{A^{c}}=H(f|_{A^{c}})\,.
\end{align*}

Moreover, we immediately have from the definition that $\overline{H}(f)|_{A}=0=\bigl(H(f|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\bigr)|_{A}$, and so Equation (46) indeed holds.

Next, we note that for any $f\in\mathbb{R}^{\mathcal{X}}$ it holds that $\overline{H}(f)|_{A}=0$, and so by Equation (46) we find that

\[
\overline{H}(\overline{H}(f))\geq H(\overline{H}(f)|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\,.
\]

If moreover $f|_{A}=0$, then using the previously established monotonicity of $H$ we obtain

\begin{align*}
\overline{H}(\overline{H}(f))
&\geq H(\overline{H}(f)|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\\
&\geq H(H(f|_{A^{c}})\!\!\uparrow_{\mathcal{X}}|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\,.
\end{align*}

We clearly have $H(f|_{A^{c}})\!\!\uparrow_{\mathcal{X}}|_{A^{c}}=H(f|_{A^{c}})$, whence

\[
\overline{H}^{2}(f)=\overline{H}(\overline{H}(f))\geq H(H(f|_{A^{c}}))\!\!\uparrow_{\mathcal{X}}=(H^{2}(f|_{A^{c}}))\!\!\uparrow_{\mathcal{X}}\,.
\]

Indeed, we can repeat this reasoning for $n\in\mathbb{N}$ steps to conclude that

\[
\overline{H}^{n}(f)\geq(H^{n}(f|_{A^{c}}))\!\!\uparrow_{\mathcal{X}}\quad\text{for all $f\in\mathbb{R}^{\mathcal{X}}$ with $f|_{A}=0$.}
\tag{47}
\]

Now suppose ex absurdo that there is some non-negative $g\in\mathbb{R}^{\mathcal{X}}$ such that $g(x)<\overline{h}(x)$ for some $x\in\mathcal{X}$, and such that $\mathbb{I}_{A}g=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,g$. Since $g$ is non-negative and $\overline{h}|_{A}=0$, we must have that $x\in A^{c}$. Moreover, since the left-hand side of this system vanishes on $A^{c}$ while the right-hand side vanishes on $A$, both sides must be identically zero. Hence $\mathbb{I}_{A}g=0$, which implies that $g|_{A}=0$ and so $g=\mathbb{I}_{A^{c}}g$; and furthermore $\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\Delta\overline{Q}g=0$, so that $\overline{H}(g)=\mathbb{I}_{A^{c}}g=g$. Hence $g$ is a fixed point of $\overline{H}$. Since $g|_{A}=0$, and using Equation (47), this implies that for any $n\in\mathbb{N}_{0}$ we have

\[
g=\overline{H}^{n}(g)\geq(H^{n}(g|_{A^{c}}))\!\!\uparrow_{\mathcal{X}}\,.
\]

Recalling that $x\in A^{c}$ is such that $g(x)<\overline{h}(x)$, we use Equation (45) to take limits in $n$ and find that

\[
g(x)\geq\lim_{n\to+\infty}H^{n}(g|_{A^{c}})(x)=\overline{h}(x)>g(x)\,,
\]

which is a contradiction. ∎