Hitting Times for Continuous-Time Imprecise-Markov Chains

Thomas Krak

([email protected]
Uncertainty in Artificial Intelligence – Eindhoven University of Technology)

Abstract

We study the problem of characterizing the expected hitting times for a robust generalization of continuous-time Markov chains. This generalization is based on the theory of imprecise probabilities, and the models with which we work essentially constitute sets of stochastic processes. Their inferences are tight lower- and upper bounds with respect to variation within these sets.

We consider three distinct types of these models, corresponding to different levels of generality and structural independence assumptions on the constituent processes.

Our main results are twofold; first, we demonstrate that the hitting times for all three types are equivalent. Moreover, we show that these inferences are described by a straightforward generalization of a well-known linear system of equations that characterizes expected hitting times for traditional time-homogeneous continuous-time Markov chains.

1 Introduction

We consider the problem of characterizing the expected hitting times for continuous-time imprecise-Markov chains [Škulj, 2015, Krak et al., 2017, Krak, 2021, Erreygers, 2021]. These are robust, set-valued generalizations of (traditional) Markov chains [Norris, 1998], based on the theory of imprecise probabilities [Walley, 1991, Augustin et al., 2014]. From a sensitivity-analysis perspective, we may interpret these sets as hedging against model-uncertainties with respect to a model’s numerical parameters and/or structural (independence) assumptions.

The inference problem of hitting times essentially deals with the question of how long it will take the underlying system to reach some particular subset of its states. This is a common and important problem in such fields as, e.g., reliability analysis, where it can capture the expected time-to-failure of a system; and epidemiology, to model the expected time-until-extinction of an epidemic. For imprecise-Markov chains, then, we are interested in evaluating these quantities in a manner that is robust against, and conservative with respect to, any variation that is compatible with one’s uncertainty about the model specification.

Erreygers [2021] has recently obtained some partial results towards characterizing such inferences, but has not been able to give a complete characterization and has largely studied the finite-time horizon case. The problem of hitting times for discrete-time imprecise-Markov chains was previously studied by Krak et al. [2019], Krak [2020]. In this present work, we largely emulate and extend their results to the continuous-time setting.

We will be concerned with three different types of imprecise-Markov chains. These are all sets of stochastic processes that are in a specific sense compatible with a given set of numerical parameters, but the three types differ in the independence properties of their elements. In particular, they correspond to (i) a set of (time-)homogeneous Markov chains, (ii) a set of (not-necessarily homogeneous) Markov chains, and (iii) a set of general—not-necessarily homogeneous nor Markovian—stochastic processes. It is known (and perhaps not very surprising) that inferences with respect to these three models do not in general agree; see e.g. [Krak, 2021] for a detailed analysis of their differences.

However, our first main result in this work is that the expected hitting time is the same for these three different types of models. Besides being of theoretical interest, we want to emphasize the power of this result: it means that even if a practitioner using Markov chains would be uncertain whether the system they are studying is truly homogeneous and/or Markovian, relaxing these assumptions would not influence inferences about the hitting times in this sense. Purely pragmatically, it also means that we can use computational methods tailored to any one of these types of models, to compute these inferences.

Our second main result is that these hitting times are characterized by a generalization of a well-known system of equations that holds for continuous-time homogeneous Markov chains; see Proposition 2 for this linear system.

The remainder of this paper is structured as follows. In Section 2 we introduce the basic required concepts that we will use throughout, formalizing the notion of stochastic processes and defining the inference problem of interest. In Section 3, we define the various types of imprecise-Markov chains that we use throughout this work. We spend some effort in Section 4 to study the transition dynamics of these models, from a perspective that is particularly relevant for the inference problem of hitting times. In Section 5 we explain and sketch the proofs of our main results, and we give a summary in Section 6.

Because we have quite a lot of conceptual material to cover before we can explain our main results, we are not able to fit any real proofs in the main body of this work. Instead, these—together with a number of technical lemmas—have largely been relegated to the supplementary material.

2 Preliminaries

Throughout, we consider a fixed, finite state space $\mathcal{X}$ with at least two elements. This set contains all possible values for some abstract underlying process. An element of $\mathcal{X}$ is called a state, and is usually generically denoted as $x\in\mathcal{X}$ .

We use $\mathbb{R},\mathbb{R}_{\geq 0}$ , and $\mathbb{R}_{>0}$ to denote the reals, the non-negative reals, and the positive reals, respectively. $\mathbb{N}$ denotes the natural numbers without zero, and we let $\mathbb{N}_{0}\coloneqq\mathbb{N}\cup\{0\}$ .

For any $\mathcal{Y}\subseteq\mathcal{X}$ , we use $\smash{\mathbb{R}^{\mathcal{Y}}}$ to denote the vector space of real-valued functions on $\mathcal{Y}$ ; in particular, $\smash{\mathbb{R}^{\mathcal{X}}}$ denotes the space of all real functions on $\mathcal{X}$ . We use $\left\lVert\cdot\right\rVert$ to denote the supremum norm on any such space; for any $f\in\smash{\mathbb{R}^{\mathcal{Y}}}$ we let $\left\lVert f\right\rVert\coloneqq\max\{\left\lvert f(x)\right\rvert\,:\,x\in\mathcal{Y}\}$ . Throughout, we make extensive use of indicator functions, which are defined for all $A\subseteq\mathcal{Y}$ as $\mathbb{I}_{A}(x)\coloneqq 1$ if $x\in A$ and $\mathbb{I}_{A}(x)\coloneqq 0$ , otherwise. We use the shorthand $\mathbb{I}_{y}\coloneqq\mathbb{I}_{\{y\}}$ . Let $\mathbf{1}$ denote the function that is identically equal to $1$ ; its dimensionality is to be understood from context.

A map $M:\smash{\mathbb{R}^{\mathcal{Y}}}\to\smash{\mathbb{R}^{\mathcal{Y}}}$ is also called an operator, and we denote its evaluation in $f\in\smash{\mathbb{R}^{\mathcal{Y}}}$ as $Mf$ . If it holds for all $\lambda\in\mathbb{R}_{\geq 0}$ that $M(\lambda f)=\lambda Mf$ then $M$ is called non-negatively homogeneous. For any non-negatively homogeneous operator on $\smash{\mathbb{R}^{\mathcal{Y}}}$ , we define the induced operator norm $\left\lVert M\right\rVert\coloneqq\sup\{\left\lVert Mf\right\rVert\,:\,f\in\smash{\mathbb{R}^{\mathcal{Y}}},\left\lVert f\right\rVert=1\}$ . We reserve the symbol $I$ to denote the identity operator on any space; the domain is to be understood from context.

Note that any linear operator is also non-negatively homogeneous. Moreover, if $M$ is linear it can be represented as an $\left\lvert\mathcal{Y}\right\rvert\times\left\lvert\mathcal{Y}\right\rvert$ matrix by arbitrarily fixing an ordering on $\mathcal{Y}$ . However, without fixing such an ordering, we simply use $M(x,y)\coloneqq M\mathbb{I}_{y}(x)$ to denote the entry in the $x$ -row and $y$ -column of such a matrix, for any $x,y\in\mathcal{Y}$ . For any $f\in\smash{\mathbb{R}^{\mathcal{Y}}}$ and $x\in\mathcal{Y}$ we then have $Mf(x)=\sum_{y\in\mathcal{Y}}M(x,y)f(y)$ , so that $Mf$ simply represents the usual matrix-vector product of $M$ with the (column) vector $f$ . In the sequel, we interchangeably refer to linear operators also as matrices. We note the well-known equality $\left\lVert M\right\rVert=\max_{x\in\mathcal{Y}}\sum_{y\in\mathcal{Y}}\left\lvert M(x,y)\right\rvert$ for the induced matrix norm.

2.1 Processes & Markov Chains

We now turn to stochastic processes, which are fundamentally the subject of this work. The typical (measure-theoretic) way to define a stochastic process is simply as a family $(X_{i})_{i\in\mathcal{I}}$ of random variables with index set $\mathcal{I}$ . This index set represents the time domain of the stochastic process. The random variables are understood to be taken with respect to some underlying probability space $(\Omega_{\mathcal{I}},\mathcal{F}_{\mathcal{I}},P)$ , where $\Omega_{\mathcal{I}}$ is a set of sample paths, which are functions from $\mathcal{I}$ to $\mathcal{X}$ representing possible realizations of the evolution of the underlying process through $\mathcal{X}$ . The random variables $X_{i}$ , $i\in\mathcal{I}$ are canonically the maps $X_{i}:\omega\mapsto\omega(i)$ on $\Omega_{\mathcal{I}}$ .

However, for our purposes it will be more convenient to instead refer to the probability measure $P$ as the stochastic process. Different processes $P$ may then be taken over the same measurable space $(\Omega_{\mathcal{I}},\mathcal{F}_{\mathcal{I}})$ , using the same canonical variables $(X_{i})_{i\in\mathcal{I}}$ for all these processes.

In this work we will use both discrete- and continuous-time stochastic processes, which corresponds to choosing $\mathcal{I}=\mathbb{N}_{0}$ or $\mathcal{I}=\mathbb{R}_{\geq 0}$ , respectively. In both cases we take $\mathcal{F}_{\mathcal{I}}$ to be the $\sigma$ -algebra generated by the cylinder sets; this ensures that all functions that we consider are measurable.

In the discrete-time case, we let $\Omega_{\mathbb{N}_{0}}$ be the set of all functions from $\mathbb{N}_{0}$ to $\mathcal{X}$ . A discrete-time stochastic process $P$ is then simply a probability measure on $(\Omega_{\mathbb{N}_{0}},\mathcal{F}_{\mathbf{\mathbb{N}_{0}}})$ . Moreover, $P$ is said to be a Markov chain if it satisfies the (discrete-time) Markov property, meaning that

	$\displaystyle P(X_{n+1}=x_{n+1}\,$	$\displaystyle\|\,X_{0}=x_{0},\ldots,X_{n}=x_{n})$
		$\displaystyle=P(X_{n+1}=x_{n+1}\,\|\,X_{n}=x_{n})\,,$

for all $x_{0},\ldots,x_{n+1}\in\mathcal{X}$ and $n\in\mathbb{N}_{0}$ . If, additionally, it holds for all $x,y\in\mathcal{X}$ and $n\in\mathbb{N}_{0}$ that

P(X_{n+1}=y\,|\,X_{n}=x)=P(X_{1}=y\,|\,X_{0}=x)\,,

then $P$ is said to be a (time-)homogeneous Markov chain. We use $\mathbb{P}_{\mathbb{N}_{0}},\mathbb{P}_{\mathbb{N}_{0}}^{\mathrm{M}}$ , and $\mathbb{P}_{\mathbb{N}_{0}}^{\mathrm{HM}}$ to denote, respectively, the set of all discrete-time stochastic processes; the set of all discrete-time Markov chains; and the set of all discrete-time homogeneous Markov chains.

In the continuous-time case, we let $\Omega_{\mathbb{R}_{\geq 0}}$ be the set of all cadlag functions from $\mathbb{R}_{\geq 0}$ to $\mathcal{X}$ . A continuous-time stochastic process $P$ is a probability measure on $(\Omega_{\mathbb{R}_{\geq 0}},\mathcal{F}_{\mathbb{R}_{\geq 0}})$ . The process $P$ is said to be a Markov chain if it satisfies the (continuous-time) Markov property,

	$\displaystyle P(X_{t_{n+1}}=x_{t_{n+1}}\,$	$\displaystyle\|\,X_{t_{0}}=x_{t_{0}},\ldots,X_{t_{n}}=x_{t_{n}})$
		$\displaystyle=P(X_{t_{n+1}}=x_{t_{n+1}}\,\|\,X_{t_{n}}=x_{t_{n}})$

for all $x_{t_{0}},\ldots,x_{t_{n+1}}\in\mathcal{X}$ , $t_{0}<\cdots<t_{n}\leq t_{n+1}\in\mathbb{R}_{\geq 0}$ , and all $n\in\mathbb{N}_{0}$ . If, additionally, it holds that

P(X_{s}=y\,|\,X_{t}=x)=P(X_{s-t}=y\,|\,X_{0}=x)

for all $x,y\in\mathcal{X}$ and all $t,s\in\mathbb{R}_{\geq 0}$ with $t\leq s$ , then $P$ is said to be a (time-)homogeneous Markov chain. We use $\mathbb{P}_{\mathbb{R}_{\geq 0}},\mathbb{P}_{\mathbb{R}_{\geq 0}}^{\mathrm{M}}$ , and $\mathbb{P}_{\mathbb{R}_{\geq 0}}^{\mathrm{HM}}$ to denote, respectively, the set of all continuous-time stochastic processes; the set of all continuous-time Markov chains; and the set of all continuous-time homogeneous Markov chains.

We refer to [Norris, 1998] for an excellent further introduction to discrete-time and continuous-time Markov chains.

2.2 Transition Dynamics

Throughout this work, we make extensive use of operator-theoretic representations of the behavior of stochastic processes, and Markov chains in particular. The first reason for this is that such operators serve as a way to parameterize Markov chains. Moreover, they are also useful as a computational tool, since they can often be used to express inferences of interest; see, e.g., Propositions 1 and 2 further on. We introduce the basic concepts below, and refer to e.g. [Norris, 1998] for details.

A transition matrix $T$ is a linear operator on $\smash{\mathbb{R}^{\mathcal{X}}}$ such that, for all $x\in\mathcal{X}$ , it holds that $T(x,y)\geq 0$ for all $y\in\mathcal{X}$ , and $\sum_{y\in\mathcal{X}}T(x,y)=1$ . There is an important and well-known connection between Markov chains and transition matrices; for any discrete-time homogeneous Markov chain $P$ , we can define the corresponding transition matrix ${}^{P}T$ as

{}^{P}T(x,y)\coloneqq P(X_{1}=y\,|\,X_{0}=x)\quad\text{for all $x,y\in\mathcal{X}$.}

Since $P$ is a probability measure, we clearly have that ${}^{P}T$ is a transition matrix. Conversely, a given transition matrix $T$ uniquely determines a discrete-time homogeneous Markov chain $P$ with ${}^{P}T=T$ , up to the specification of the initial distribution $P(X_{0})$ . For this reason, transition matrices are often taken as a crucial parameter to specify (discrete-time, homogeneous) Markov chains.

Analogously, for a (non-homogeneous) discrete-time Markov chain $P$ , we might define a family $({}^{P}T_{n})_{n\in\mathbb{N}_{0}}$ of time-dependent corresponding transition matrices, with

{}^{P}T_{n}(x,y)\coloneqq P(X_{n+1}=y\,|\,X_{n}=x)\,,

for all $x,y\in\mathcal{X}$ and $n\in\mathbb{N}_{0}$ . Conversely, any family $(T_{n})_{n\in\mathbb{N}_{0}}$ of transition matrices uniquely determines a discrete-time Markov chain $P$ with ${}^{P}T_{n}=T_{n}$ for all $n\in\mathbb{N}_{0}$ , again up to the specification of $P(X_{0})$ .

In the continuous-time setting, transition matrices are also of great importance. However, it will be instructive to first introduce rate matrices. A rate matrix $Q$ is a linear operator on $\smash{\mathbb{R}^{\mathcal{X}}}$ such that, for all $x\in\mathcal{X}$ , it holds that $Q(x,y)\geq 0$ for all $y\in\mathcal{X}$ with $x\neq y$ , and $\sum_{y\in\mathcal{X}}Q(x,y)=0$ .

For any rate matrix $Q$ and any $t\in\mathbb{R}_{\geq 0}$ , the matrix exponential $e^{Qt}$ of $Qt$ can be defined as [Van Loan, 2006]

e^{Qt}\coloneqq\lim_{n\to+\infty}\bigl{(}I+\nicefrac{{t}}{{n}}Q\bigr{)}^{n}\,.

An alternative characterization is as the (unique) solution to the matrix ordinary differential equation [Van Loan, 2006]

\frac{\mathrm{d}}{\mathrm{d}\,s}e^{Qs}=Qe^{Qs}=e^{Qs}Q,\quad\text{with $e^{Q0}=I$.}

(1)

For any $t,s\in\mathbb{R}_{\geq 0}$ it holds that $e^{Q(t+s)}=e^{Qt}e^{Qs}$ , and we immediately have $e^{Q0}=I$ . The family $(e^{Qt})_{t\in\mathbb{R}_{\geq 0}}$ is therefore called the semigroup generated by $Q$ , and $Q$ is called the generator of this semigroup. Moreover, for any rate matrix $Q$ and any $t\in\mathbb{R}_{\geq 0}$ , $e^{Qt}$ is a transition matrix [Norris, 1998, Thm 2.1.2].

Now let us consider a continuous-time homogeneous Markov chain $P$ , and define the corresponding transition matrix¹¹1Note that in continuous-time, we always have to measure the transition-time interval $[0,t]$ to specify these matrices. ${}^{P}T_{t}$ for all $t\in\mathbb{R}_{\geq 0}$ and $x,y\in\mathcal{X}$ as

{}^{P}T_{t}(x,y)\coloneqq P(X_{t}=y\,|\,X_{0}=x)\,.

(2)

It turns out that there is then a unique rate matrix ${}^{P}\!Q$ associated with $P$ such that ${}^{P}T_{t}=e^{{}^{P}\!Qt}$ for all $t\in\mathbb{R}_{\geq 0}$ . By combining Equations (1) and (2), we can identify ${}^{P}\!Q$ as

{}^{P}\!Q=\Bigl{(}\frac{\mathrm{d}}{\mathrm{d}\,t}{}^{P}T_{t}\Bigr{)}\bigg{|}_{t=0}\,.

As before, in the other direction we have that any fixed rate matrix $Q$ uniquely determines a continuous-time homogeneous Markov chain $P$ with ${}^{P}\!Q=Q$ , up to the specification of $P(X_{0})$ . For this reason, rate matrices are often used to specify (continuous-time, homogeneous) Markov chains.

Let us finally consider a (not-necessarily homogeneous) continuous-time Markov chain $P$ . For any $t,s\in\mathbb{R}_{\geq 0}$ with $t\leq s$ , we can then define a transition matrix ${}^{P}T_{t}^{s}$ with, for all $x,y\in\mathcal{X}$ , ${}^{P}T_{t}^{s}(x,y)\coloneqq P(X_{s}=y\,|\,X_{t}=x)$ . Under appropriate assumptions of differentiability, this induces a family $({}^{P}\!Q_{t})_{t\in\mathbb{R}_{\geq 0}}$ of rate matrices ${}^{P}\!Q_{t}$ , as

{}^{P}\!Q_{t}=\Bigl{(}\frac{\mathrm{d}}{\mathrm{d}\,s}{}^{P}T_{t}^{s}\Bigr{)}\bigg{|}_{s=t}\,.

(3)

In the converse direction we might try to reconstruct the transition matrices of $P$ by solving the matrix ordinary differential equation(s)

\frac{\mathrm{d}}{\mathrm{d}\,s}{}^{P}T_{t}^{s}={}^{P}T_{t}^{s}{}^{P}\!Q_{s},\quad\text{with ${}^{P}T_{t}^{t}=I$.}

(4)

By comparing with Equation (1), we see that in the special case where ${}^{P}\!Q_{s}$ does not depend on $s$ —that is, where $P$ is homogeneous with ${}^{P}\!Q_{s}={}^{P}\!Q$ , say—we indeed obtain ${}^{P}T_{t}^{s}=e^{{}^{P}\!Q(s-t)}$ . However, in general the non-autonomous system (4) does not have such a closed-form solution, and we cannot move beyond this implicit characterization.

2.3 Hitting Times

We now have all the pieces to introduce the inference problem that is the subject of this work, viz. the expected hitting times of some non-empty set of states $A\subset\mathcal{X}$ with respect to a particular stochastic process. We take this set $A$ to be fixed for the remainder of this work.

In the discrete-time case, we consider the (extended real-valued)²²2We agree that $0(+\infty)=0$ ; $(+\infty)+(+\infty)=+\infty$ ; and, for any $c\in\mathbb{R}$ , $(+\infty)+c=+\infty$ and $c(+\infty)=+\infty$ if $c>0$ . function $\tau_{\mathbb{N}_{0}}:\Omega_{\mathbb{N}_{0}}\to\mathbb{R}_{\geq 0}\cup\{+\infty\}$ given by

\tau_{\mathbb{N}_{0}}(\omega)\coloneqq\inf\bigl{\{}n\in\mathbb{N}_{0}\,:\,\omega(n)\in A\bigr{\}}\quad\text{for all $\omega\in\Omega_{\mathbb{N}_{0}}$.}

This captures the number of steps before a process $P$ “hits” any state in $A$ . The expected hitting time for a discrete-time process $P$ starting in $x\in\mathcal{X}$ is then defined as

\mathbb{E}_{P}\bigl{[}\tau_{\mathbb{N}_{0}}\,|\,X_{0}=x\bigr{]}\coloneqq\int_{\Omega_{\mathbb{N}_{0}}}\tau_{\mathbb{N}_{0}}(\omega)\,\mathrm{d}P(\omega\,|\,X_{0}=x)\,.

We use $\mathbb{E}_{P}\bigl{[}\tau_{\mathbb{N}_{0}}\,|\,X_{0}\bigr{]}$ to denote the extended real-valued function on $\mathcal{X}$ given by $x\mapsto\mathbb{E}_{P}\bigl{[}\tau_{\mathbb{N}_{0}}\,|\,X_{0}=x\bigr{]}$ . When dealing with homogeneous Markov chains, this quantity has the following simple characterization:

Proposition 1.

[Norris, 1998, Thm 1.3.5] Let $P$ be a discrete-time homogeneous Markov chain with corresponding transition matrix ${}^{P}T$ . Then $h\coloneqq\mathbb{E}_{P}\bigl{[}\tau_{\mathbb{N}_{0}}\,|\,X_{0}\bigr{]}$ is the minimal non-negative solution to the linear system³³3Throughout, for any $f,g\in\smash{\mathbb{R}^{\mathcal{X}}}$ , the quantity $fg$ is understood as the pointwise product between the functions $f$ and $g$ .⁴⁴4Strictly speaking this requires extending the domain of ${}^{P}T$ to extended-real valued functions, but we will shortly introduce some assumptions that obviate such an exposition.

h=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}{}^{P}Th\,.

In the continuous-time case, the definition is analogous; we introduce a function $\tau_{\mathbb{R}_{\geq 0}}:\Omega_{\mathbb{R}_{\geq 0}}\to\mathbb{R}_{\geq 0}\cup\{+\infty\}$ as

\tau_{\mathbb{R}_{\geq 0}}(\omega)\coloneqq\inf\bigl{\{}t\in\mathbb{R}_{\geq 0}\,:\,\omega(t)\in A\bigr{\}}\,\,\text{for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$.}

This function measures the time until a process “hits” any state in $A$ on a given sample path. The expected hitting time for a continuous-time process $P$ starting in $x\in\mathcal{X}$ is

\mathbb{E}_{P}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}=x\bigr{]}\coloneqq\int_{\Omega_{\mathbb{R}_{\geq 0}}}\tau_{\mathbb{R}_{\geq 0}}(\omega)\,\mathrm{d}P(\omega\,|\,X_{0}=x)\,.

We again use $\smash{\mathbb{E}_{P}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr{]}}$ to denote the extended-real valued function on $\mathcal{X}$ given by $x\mapsto\smash{\mathbb{E}_{P}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}=x\bigr{]}}$ . Also in this case, the characterization for homogeneous Markov chains is particularly simple:

Proposition 2.

[Norris, 1998, Thm 3.3.3] Let $P$ be a continuous-time homogeneous Markov chain with rate matrix $\smash{{}^{P}\!Q}$ such that $\smash{{}^{P}\!Q}(x,x)\neq 0$ for all $x\in A^{c}$ . Then $h\coloneqq\smash{\mathbb{E}_{P}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr{]}}$ is the minimal non-negative solution to

\mathbb{I}_{A}h=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\smash{{}^{P}\!Q}h\,.

(5)

3 Imprecise-Markov Chains

Let us now introduce imprecise-Markov chains [Hermans and Škulj, 2014, Škulj, 2015, Krak et al., 2017], which are the stochastic processes that we aim to study in this work. Their characterization is based on the theory of imprecise probabilities [Walley, 1991, Augustin et al., 2014].

We here adopt the “sensitivity analysis” interpretation of imprecise probabilities. This means that we represent an imprecise-Markov chain simply as a set $\mathcal{P}$ of stochastic processes. Intuitively, the idea is that we collect in $\mathcal{P}$ all (traditional, “precise”) stochastic processes that we deem to plausibly capture the dynamics of the underlying system of interest. Inferences with respect to $\mathcal{P}$ are defined using lower- and upper expectations, given respectively as

\underline{\mathbb{E}}_{\mathcal{P}}[\cdot\,|\,\cdot]\coloneqq\inf_{P\in\mathcal{P}}\mathbb{E}_{P}[\cdot\,|\,\cdot]\quad\text{and}\quad\overline{\mathbb{E}}_{\mathcal{P}}[\cdot\,|\,\cdot]\coloneqq\sup_{P\in\mathcal{P}}\mathbb{E}_{P}[\cdot\,|\,\cdot]\,.

So, their inferences represent robust—i.e. conservative—and tight lower- and upper bounds on inferences with respect to all stochastic processes that we deem to be plausible.

3.1 Sets of Processes & Types

We already mentioned that an imprecise-Markov chain is essentially simply a set $\mathcal{P}$ of stochastic processes. Let us now consider how to define such sets.

We start by considering the discrete-time case; then, clearly, $\mathcal{P}$ will be a set of discrete-time processes. We will parameterize such a set with some non-empty set $\mathcal{T}$ of transition matrices. Our aim is then to include in $\mathcal{P}$ all processes that are in some sense “compatible” with $\mathcal{T}$ .⁵⁵5We will not constrain the initial models $P(X_{0})$ of the elements of $\mathcal{P}$ , since in any case such a choice would not influence the inferences that we study in this work. However, at this point we are faced with a choice about which type of processes to include in this set, and these different choices lead to different types of imprecise-Markov chains.

Arguably the conceptually most simple model is $\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}}$ , which contains all homogeneous Markov chains $P$ whose corresponding transition matrix is included in $\mathcal{T}$ :

\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}}\coloneqq\bigl{\{}P\in\mathbb{P}^{\mathrm{HM}}_{\mathbb{N}_{0}}\,:\,{}^{P}T\in\mathcal{T}\bigr{\}}\,.

However, we could instead consider $\mathcal{P}^{\mathrm{M}}_{\mathcal{T}}$ , which is the set of all (not-necessarily homogeneous) Markov chains whose time-dependent transition matrices are contained in $\mathcal{T}$ :

\mathcal{P}^{\mathrm{M}}_{\mathcal{T}}\coloneqq\bigl{\{}P\in\mathbb{P}^{\mathrm{M}}_{\mathbb{N}_{0}}\,:\,{}^{P}T_{n}\in\mathcal{T}\,\text{for all $n\in\mathbb{N}_{0}$}\bigr{\}}\,.

The last choice that we consider here is the set $\mathcal{P}^{\mathrm{I}}_{\mathcal{T}}$ , which essentially contains all discrete-time processes whose single-step transition dynamics are described by $\mathcal{T}$ . Its characterization is more cumbersome since we have not expressed these general processes in terms of transition matrices, but we can say that it is the set of all $P\in\mathbb{P}_{\mathbb{N}_{0}}$ such that for all $n\in\mathbb{N}_{0}$ and all $x_{0},\ldots,x_{n}\in\mathcal{X}$ , there is some $T\in\mathcal{T}$ such that for all $y\in\mathcal{X}$ it holds that

P(X_{n+1}=y\,|\,X_{0}=x_{0},\ldots,X_{n}=x_{n})=T(x_{n},y)\,.

This last type is called an imprecise-Markov chain under epistemic irrelevance, whence the superscript ‘ $\mathrm{I}$ ’.

Note that the three types $\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}},\mathcal{P}^{\mathrm{M}}_{\mathcal{T}}$ , and $\mathcal{P}^{\mathrm{I}}_{\mathcal{T}}$ capture not only “plausible” variation in terms of parameter uncertainty—expressed through the set $\mathcal{T}$ —but also variation in terms of the structural independence conditions that we consider! So, from an applied perspective, if someone is not sure whether the underlying system that they are studying is truly Markovian and/or time-homogeneous, they might choose to use different such sets in their analysis.

In the continuous-time case, we again proceed analogously. First, we fix a non-empty set $\mathcal{Q}$ of rate matrices, which will be the parameter for our models. We then first consider the set $\mathcal{P}^{\mathrm{HM}}_{\mathcal{Q}}$ of all homogeneous Markov chains whose rate matrix is included in $\mathcal{Q}$ :

\mathcal{P}^{\mathrm{HM}}_{\mathcal{Q}}\coloneqq\bigl{\{}P\in\mathbb{P}_{\mathbb{R}_{\geq 0}}^{\mathrm{HM}}\,:\,{}^{P}\!Q\in\mathcal{Q}\bigr{\}}\,.

The other two types are constructed in analogy to the discrete-time case, but unfortunately we don’t have the space for a complete exposition of their characterization. Instead we refer the interested reader to [Krak et al., 2017, Krak, 2021] for an in-depth study of these different types and comparisons between them; in what follows we limit ourselves to a largely intuitive specification.

The model $\mathcal{P}^{\mathrm{M}}_{\mathcal{Q}}$ is the set of all continuous-time (not-necessarily homogeneous) Markov chains whose transition dynamics are compatible with $\mathcal{Q}$ at every point in time. This includes in particular all Markov chains $P$ satisfying the appropriate differentiability assumptions to meaningfully say that the time-dependent rate matrices ${}^{P}\!Q_{t}$ —as in Equation (3)—are included in $\mathcal{Q}$ for all $t\in\mathbb{R}_{\geq 0}$ . However, $\mathcal{P}^{\mathrm{M}}_{\mathcal{Q}}$ also contains other processes that are not (everywhere) differentiable; see e.g. [Krak, 2021, Sec 4.6 and 5.2] for the technical details.

The most involved model to explain is again $\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$ , which includes all continuous-time processes whose time- and history-dependent transition dynamics can be described using elements of $\mathcal{Q}$ . It includes, but is not limited to, appropriately differentiable processes $P$ such that for all $n\in\mathbb{N}_{0}$ , all $t_{0}<\cdots<t_{n}\in\mathbb{R}_{\geq 0}$ , and all $x_{t_{0}},\ldots,x_{t_{n}}\in\mathcal{X}$ , there is some $Q\in\mathcal{Q}$ such that for all $y\in\mathcal{X}$ it holds that

	$\displaystyle\biggl{(}\frac{\mathrm{d}}{\mathrm{d}\,s}P(X_{s}=y\,\|\,X_{t_{0}}=x_{t_{0}},\ldots,$	$\displaystyle X_{t_{n}}=x_{t_{n}})\biggr{)}\bigg{\|}_{s=t_{n}}$
		$\displaystyle=Q(x_{t_{n}},y)$

We again refer to [Krak, 2021, Sec 4.6 and 5.2] for the technical details involving the additional elements of $\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$ that are not appropriately differentiable. Importantly, we note the nested structure [Krak, 2021, Prop 5.9]

\mathcal{P}^{\mathrm{HM}}_{\mathcal{Q}}\subseteq\mathcal{P}^{\mathrm{M}}_{\mathcal{Q}}\subseteq\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}\,,

where the inclusions are strict provided $\mathcal{Q}$ isn’t trivial.

For notational convenience, we will use identical sub- and superscripts to denote the corresponding lower- and upper expectations for any of these imprecise-Markov chains; e.g., we let $\underline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{T}}[\cdot\,|\,\cdot]\coloneqq\underline{\mathbb{E}}_{\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}}}[\cdot\,|\,\cdot]$ .

3.2 Imprecise Transition Dynamics

Let us now introduce some machinery to describe the dynamics of imprecise-Markov chains. In particular, we here move from the set-valued parameters $\mathcal{T}$ and $\mathcal{Q}$ used in Section 3.1, to their dual representations; these are operators that can serve as computational tools.

In Section 3.1, we described discrete-time imprecise-Markov chains using non-empty sets $\mathcal{T}$ of transition matrices. With any such set, we can associate the corresponding lower- and upper transition operators $\underline{T}$ and $\overline{T}$ on $\smash{\mathbb{R}^{\mathcal{X}}}$ , defined respectively as

\underline{T}f\coloneqq\inf_{T\in\mathcal{T}}Tf\quad\text{and}\quad\overline{T}f\coloneqq\sup_{T\in\mathcal{T}}Tf\quad\text{for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$.}

More generally, any operator $\underline{T}$ (resp. $\overline{T}$ ) on $\smash{\mathbb{R}^{\mathcal{X}}}$ is a lower (resp. upper) transition operator if for all $f,g\in\smash{\mathbb{R}^{\mathcal{X}}}$ , all $\lambda\in\mathbb{R}_{\geq 0}$ , and all $x\in\mathcal{X}$ , it holds that [De Bock, 2017]

1.

$\min_{y\in\mathcal{X}}f(y)\leq\underline{T}f(x)$ and $\overline{T}f(x)\leq\max_{y\in\mathcal{X}}f(y)$
2.

$\underline{T}f+\underline{T}g\leq\underline{T}(f+g)$ and $\overline{T}(f+g)\leq\overline{T}f+\overline{T}g$
3.

$\underline{T}(\lambda f)=\lambda\underline{T}f$ and $\overline{T}(\lambda f)=\lambda\overline{T}f$ .

It should be noted that lower- and upper transition operators are conjugate, in that any $\underline{T}$ induces a corresponding upper transition operator $\overline{T}(\cdot)=-\underline{T}(-\cdot)$ , and vice versa. Moreover, any transition matrix $T$ is also a lower—and, by its linearity, upper—transition operator.

It is easily verified that the lower- and upper transition operators corresponding to a given non-empty set $\mathcal{T}$ are, indeed, lower- and upper transition operators. Conversely, with a given lower transition operator $\underline{T}$ , we can associate the set of transition matrices that dominate it, in the sense that

\mathcal{T}_{\underline{T}}\coloneqq\bigl{\{}T\,:\,\text{$T$ a trans. mat.},\,Tf\geq\underline{T}f\,\text{for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$}\bigr{\}}\,.

This set satisfies the following important properties:

Proposition 3.

[Krak, 2021, Sec 3.4] Let $\underline{T}$ be a lower transition operator with conjugate upper transition operator $\overline{T}(\cdot)=-\underline{T}(-\cdot)$ and dominating set of transition matrices $\mathcal{T}_{\underline{T}}$ . Then $\mathcal{T}_{\underline{T}}$ is a non-empty, closed, and convex set of transition matrices that has separately specified rows,⁶⁶6A set $\mathcal{M}$ of matrices is said to have separately specified rows if, intuitively, it is closed under the row-wise recombination of its elements; see e.g. [Hermans and Škulj, 2014] for details. and for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ it holds that $\underline{T}f=\inf_{T\in\mathcal{T}_{\underline{T}}}Tf$ and $\overline{T}f=\sup_{T\in\mathcal{T}_{\underline{T}}}Tf$ . Moreover, for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ there is some $T\in\mathcal{T}_{\underline{T}}$ such that $Tf=\underline{T}f$ , and there is some—possibly different— $T\in\mathcal{T}_{\underline{T}}$ such that $Tf=\overline{T}f$ .

Notably, there is a one-to-one relation between non-empty sets of transition matrices that are closed and convex and have separately specified rows, and lower (or upper) transition operators: if $\underline{T}$ is the lower transition operator for the set $\mathcal{T}$ , and if $\mathcal{T}$ satisfies these properties, then $\mathcal{T}=\mathcal{T}_{\underline{T}}$ [Krak, 2021, Cor 3.38]. Hence these objects may serve as dual representations for each other.

One reason that this is important is the use of $\underline{T}$ as a computational tool; under the conditions of this duality it holds that for any function $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ and any $n\in\mathbb{N}_{0}$ , we can write [Hermans and Škulj, 2014]

\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{T}}[f(X_{n})|X_{0}=x]=\underline{\mathbb{E}}^{\mathrm{M}}_{\mathcal{T}}[f(X_{n})|X_{0}=x]=\underline{T}^{n}f(x)\,,

where $\underline{T}$ is the lower transition operator for $\mathcal{T}$ . This reduces the problem of computing such inferences for the imprecise-Markov chains $\mathcal{P}^{\mathrm{M}}_{\mathcal{T}}$ and $\mathcal{P}^{\mathrm{I}}_{\mathcal{T}}$ to solving $n$ independent linear optimization problems over $\mathcal{T}$ ; first compute $f_{1}\coloneqq\underline{T}f$ , then compute $f_{2}\coloneqq\underline{T}\,f_{1}=\underline{T}^{2}f$ , and so forth. Note that this method in general only yields a conservative bound on the corresponding inference for $\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}}$ , as the minimizers $T_{k}$ that obtain $T_{k}f_{k-1}=\underline{T}f_{k-1}$ may be different at each step.

We next consider the dynamics in the continuous-time setting. We proceed analogously to the above: we first consider a non-empty and bounded⁷⁷7In the induced operator norm. set $\mathcal{Q}$ of rate matrices. With this set, we then associate the corresponding lower- and upper rate operators $\underline{Q}$ and $\overline{Q}$ on $\smash{\mathbb{R}^{\mathcal{X}}}$ , defined as

\underline{Q}f\coloneqq\inf_{Q\in\mathcal{Q}}Qf\quad\text{and}\quad\overline{Q}f\coloneqq\sup_{Q\in\mathcal{Q}}Qf\quad\text{for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$.}

More generally, any operator $\underline{Q}$ (resp. $\overline{Q}$ ) on $\smash{\mathbb{R}^{\mathcal{X}}}$ is a lower (resp. upper) rate operator if for all $f,g\in\smash{\mathbb{R}^{\mathcal{X}}}$ , all $\lambda\in\mathbb{R}_{\geq 0}$ and $\mu\in\mathbb{R}$ , and all $x,y\in\mathcal{X}$ with $y\neq x$ , it holds that [De Bock, 2017]

1.

$\underline{Q}(\mu\mathbf{1})(x)=0$ and $\overline{Q}(\mu\mathbf{1})(x)=0$
2.

$\underline{Q}\mathbb{I}_{y}(x)\geq 0$ and $\overline{Q}\mathbb{I}_{y}(x)\geq 0$
3.

$\underline{Q}f+\underline{Q}g\leq\underline{Q}(f+g)$ and $\overline{Q}(f+g)\leq\overline{Q}f+\overline{Q}g$
4.

$\underline{Q}(\lambda f)=\lambda\underline{Q}f$ and $\overline{Q}(\lambda f)=\lambda\overline{Q}f$

As before, such objects are conjugate, in that if $\underline{Q}$ is a lower rate operator, then $\smash{\overline{Q}}(\cdot)=-\smash{\underline{Q}}(-\cdot)$ is an upper rate operator. Moreover, any rate matrix $Q$ is also a lower (and upper) rate operator. There is again a duality between lower (or upper) rate operators, and sets of rate matrices. For fixed $\underline{Q}$ and with the dominating set of rate matrices $\mathcal{Q}_{\underline{Q}}$ defined as

\mathcal{Q}_{\underline{Q}}\coloneqq\bigl{\{}Q\,:\,\text{$Q$ a rate mat.},\,Qf\geq\underline{Q}f\,\text{for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$}\bigr{\}}\,,

we have the following result:

Proposition 4.

[Krak, 2021, Sec 6.2] Let $\underline{Q}$ be a lower rate operator with conjugate upper rate operator $\overline{Q}(\cdot)=-\underline{Q}(-\cdot)$ and dominating set of rate matrices $\mathcal{Q}_{\underline{Q}}$ . Then $\mathcal{Q}_{\underline{Q}}$ is a non-empty, compact, and convex set of rate matrices that has separately specified rows, and for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ it holds that $\underline{Q}f=\inf_{Q\in\mathcal{Q}_{\underline{Q}}}Qf$ and $\overline{Q}f=\sup_{Q\in\mathcal{Q}_{\underline{Q}}}Qf$ . Moreover, for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ there is some $Q\in\mathcal{Q}_{\underline{Q}}$ such that $Qf=\underline{Q}f$ , and there is some—possibly different— $Q\in\mathcal{Q}_{\underline{Q}}$ such that $Qf=\overline{Q}f$ .

Now fix any lower rate operator $\underline{Q}$ and any $t\in\mathbb{R}_{\geq 0}$ , and let

e^{\underline{Q}t}\coloneqq\lim_{n\to+\infty}\bigl{(}I+\nicefrac{{t}}{{n}}\underline{Q}\bigr{)}^{n}\,.

(6)

The operator $e^{\underline{Q}t}$ is then a lower transition operator [De Bock, 2017], and the family $(e^{\underline{Q}t})_{t\in\mathbb{R}_{\geq 0}}$ is a semigroup of lower transition operators; it satisfies $e^{\underline{Q}(t+s)}=e^{\underline{Q}t}e^{\underline{Q}s}$ for all $t,s\in\mathbb{R}_{\geq 0}$ , and $e^{\underline{Q}0}=I$ . The analogous construction with an upper rate operator $\overline{Q}$ instead generates a semigroup $(e^{\overline{Q}t})_{t\in\mathbb{R}_{\geq 0}}$ of upper transition operators. When $\underline{Q}$ and $\overline{Q}$ are taken with respect to the same set $\mathcal{Q}$ , these semigroups satisfy, for all $t\in\mathbb{R}_{\geq 0}$ , $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ , and $Q\in\mathcal{Q}$ ,

e^{\underline{Q}t}f\leq e^{Qt}f\leq e^{\overline{Q}t}f\,.

(7)

Here the importance again derives from the use as a computational tool; under the conditions of duality between $\mathcal{Q}$ and $\underline{Q}$ , we have for any $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ and any $t\in\mathbb{R}_{\geq 0}$ that [Škulj, 2015, Krak et al., 2017]

\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[f(X_{t})|X_{0}=x]=\underline{\mathbb{E}}^{\mathrm{M}}_{\mathcal{Q}}[f(X_{t})|X_{0}=x]=e^{\underline{Q}t}f(x)\,.

Hence such inferences can be numerically computed by approximating (6) with a finite choice of $n$ , and then solving $n$ independent linear optimization problems over $\mathcal{Q}$ . Error bounds for this scheme are available in the literature [Škulj, 2015, Krak et al., 2017, Erreygers, 2021].

3.3 Class Structure

Let us now fix a set $\mathcal{Q}$ of rate matrices that we will use in the remainder of this work. Throughout, let $\underline{Q}$ and $\overline{Q}$ denote the lower- and upper rate operators associated with $\mathcal{Q}$ . We impose several standard regularity conditions on this set: we assume that $\mathcal{Q}$ is non-empty, compact, convex, and that it has separately specified rows. These are common assumptions that are imposed to ensure the duality between $\mathcal{Q}$ and $\underline{Q}$ , which in turn guarantees that inferences with the induced imprecise-Markov chains remain well-behaved, as well as analytically (and, often, computationally) tractable.

We now have all the pieces to start studying the inference problem that is the subject of this work: the lower- and upper expected hitting times of the set $A\subset\mathcal{X}$ for continuous-time imprecise-Markov chains described by $\mathcal{Q}$ .

Before we begin, let us impose two additional conditions on the dynamics of the system.

Assumption 1.

We assume that all states in $A$ are absorbing, which is equivalent to requiring that $Q(x,x)=0$ for all $Q\in\mathcal{Q}$ and all $x\in A$ .

Note that this does not influence the inferences in which we are interested, since those only deal with behavior at times before states in $A$ are reached. However, imposing this explicitly substantially simplifies the analysis.

Next, we assume that the set $A$ is lower reachable from any state $x\in A^{c}$ [De Bock, 2017]. This means that we can construct a sequence $x_{1},\ldots,x_{n+1}\in\mathcal{X}$ starting in any $x_{1}\in A^{c}$ and ending in some $x_{n+1}\in A$ such that, for all $k=1,\ldots,n$ , it holds that $\underline{Q}\mathbb{I}_{x_{k+1}}(x_{k})>0$ . This is equivalent [De Bock, 2017] to

Assumption 2.

We assume $e^{\underline{Q}t}\mathbb{I}_{A}(x)>0$ for all $t\in\mathbb{R}_{>0}$ and all $x\in A^{c}$ .

Essentially, this means that for all elements of our imprecise-probabilistic models the probability of eventually hitting $A$ is bounded away from zero. This ensures that the expected hitting times remain bounded for all $P\in\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$ , so that we can ignore any extended real-valued analysis. It also implies that for all $Q\in\mathcal{Q}$ we have that $Q(x,x)\neq 0$ for all $x\in A^{c}$ , which is relevant to meet the precondition of Proposition 2. As a practical point, De Bock [2017] gives an algorithm to check whether a given set $\mathcal{Q}$ satisfies this condition.

On a technical level, Assumption 2 is the crucial one for our results, and—unlike with Assumption 1—it cannot really be ignored in practice. However, based on earlier work by Krak et al. [2019] in the discrete-time setting, we hope in the future to strengthen our results to hold without this assumption.

4 Subspace Dynamics

In the context of hitting times, the interesting behavior of a process actually occurs before it has reached a target state in $A$ . Hence it will be useful to introduce some machinery to study the transition dynamics as it relates to the states $A^{c}$ .

To introduce the notation in a general way, choose any non-empty $\mathcal{Y}\subset\mathcal{X}$ . Then for any $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ , let $f|_{\mathcal{Y}}\in\smash{\mathbb{R}^{\mathcal{Y}}}$ denote the restriction of $f$ to $\mathcal{Y}$ . Conversely, for any $f\in\smash{\mathbb{R}^{\mathcal{Y}}}$ , let $f\!\!\uparrow_{\mathcal{X}}\in\smash{\mathbb{R}^{\mathcal{X}}}$ denote the unique extension of $f$ to $\mathcal{X}$ that satisfies $f(x)=0$ for all $x\in\mathcal{X}\setminus\mathcal{Y}$ . Moreover, for any operator $M$ on $\smash{\mathbb{R}^{\mathcal{X}}}$ , we define the operator $M|_{\mathcal{Y}}$ on $\smash{\mathbb{R}^{\mathcal{Y}}}$ as

M|_{\mathcal{Y}}f\coloneqq\bigl{(}M(f\!\!\uparrow_{\mathcal{X}})\bigr{)}|_{\mathcal{Y}}\quad\quad\text{for all $f\in\smash{\mathbb{R}^{\mathcal{Y}}}$.}

This somewhat verbose notation is perhaps most easily understood when $M$ is a linear operator, i.e. a matrix. In that case, $M|_{\mathcal{Y}}$ is simply the $\left\lvert\mathcal{Y}\right\rvert\times\left\lvert\mathcal{Y}\right\rvert$ sub-matrix of $M$ on the coordinates in $\mathcal{Y}$ . The definition above allows us to extend this notion also to non-linear operators, and to lower- and upper transition and rate operators, specifically.

Now for any rate matrix $Q\in\mathcal{Q}$ , we call $G\coloneqq Q|_{A^{c}}$ its corresponding subgenerator. For any $t\in\mathbb{R}_{\geq 0}$ , we then define $e^{Gt}\coloneqq e^{Qt}|_{A^{c}}$ . We have the following result:

Proposition 5.

Fix $Q\in\mathcal{Q}$ and let $G$ be its subgenerator. Then $e^{Gt}=\lim_{n\to+\infty}\bigl{(}I+\nicefrac{{t}}{{n}}G\bigr{)}^{n}$ for all $t\in\mathbb{R}_{\geq 0}$ . Moreover, the family $(e^{Gt})_{t\in\mathbb{R}_{\geq 0}}$ is a semigroup.

Analogously, we define $\underline{G}\coloneqq\underline{Q}|_{A^{c}}$ and $\overline{G}\coloneqq\overline{Q}|_{A^{c}}$ to be the lower- and upper subgenerators corresponding to $\underline{Q}$ and $\overline{Q}$ , respectively. We also let $e^{\underline{G}t}\coloneqq e^{\underline{Q}t}|_{A^{c}}$ and $e^{\overline{G}t}\coloneqq e^{\overline{Q}t}|_{A^{c}}$ . Perhaps unsurprisingly, we then have:

Proposition 6.

It holds that $e^{\underline{G}t}=\lim_{n\to+\infty}\bigl{(}I+\nicefrac{{t}}{{n}}\underline{G}\bigr{)}^{n}$ and $e^{\overline{G}t}=\lim_{n\to+\infty}\bigl{(}I+\nicefrac{{t}}{{n}}\overline{G}\bigr{)}^{n}$ for all $t\in\mathbb{R}_{\geq 0}$ . Moreover, the families $(e^{\underline{G}t})_{t\in\mathbb{R}_{\geq 0}}$ , $(e^{\overline{G}t})_{t\in\mathbb{R}_{\geq 0}}$ are semigroups.

Our Assumption 2 implies the norm bound:

Proposition 7.

For any $t>0$ , it holds that $\left\lVert\smash{e^{\overline{G}t}}\right\rVert<1$ .

It is a straightforward consequence of the use of the supremum norm, together with Equation (7) and the fact that $e^{Qt}$ and $e^{\overline{Q}t}$ are (upper) transition operators, that also $\left\lVert\smash{e^{Gt}}\right\rVert\leq\left\lVert\smash{e^{\overline{G}t}}\right\rVert<1$ for all $t\in\mathbb{R}_{>0}$ . Hence by the semigroup property we immediately have that $\lim_{t\to+\infty}\left\lVert\smash{e^{Gt}}\right\rVert=0$ . This also implies the following well-known result.

Proposition 8.

[Taylor and Lay, 1958, Thm IV.1.4] For any $Q\in\mathcal{Q}$ with subgenerator $G$ , and all $t>0$ , the inverse operator $(I-e^{Gt})^{-1}$ exists, and $(I-e^{Gt})^{-1}=\sum_{k=0}^{+\infty}e^{Gtk}$ .

This allows us to characterize hitting times for discrete-time homogeneous Markov chains whose transition matrix is given by $e^{Qt}$ , as follows.

Proposition 9.

Choose any $Q\in\mathcal{Q}$ , let $G$ be its subgenerator, and fix any $\Delta>0$ . Let $P\in\mathbb{P}^{\mathrm{HM}}_{\mathbb{N}_{0}}$ be such that ${}^{P}T=e^{Q\Delta}$ . Then the expected hitting times $h\coloneqq\mathbb{E}_{P}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]$ satisfy $h|_{A^{c}}=(I-e^{G\Delta})^{-1}\mathbf{1}$ and $h(x)=0$ for all $x\in A$ .

Proof.

By Proposition 1, in $x\in A^{c}$ we have that

h(x)=\mathbb{I}_{A^{c}}(x)+\mathbb{I}_{A^{c}}(x)e^{Q\Delta}h(x)=1+e^{Q\Delta}h(x)\,.

Conversely, it is immediate from the definition that $h(x)=0$ for all $x\in A$ . This implies that $h=(h|_{A^{c}})\!\!\uparrow_{\mathcal{X}}$ , and hence

h|_{A^{c}}=\mathbf{1}+\bigl{(}e^{Q\Delta}(h|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\bigr{)}|_{A^{c}}=\mathbf{1}+e^{G\Delta}h|_{A^{c}}\,.

Re-ordering terms we have $(I-e^{G\Delta})h|_{A^{c}}=\mathbf{1}$ . Now use Proposition 8 and multiply with $(I-e^{G\Delta})^{-1}$ . ∎

We need the following observation:

Lemma 1.

Consider any $Q\in\mathcal{Q}$ with subgenerator $G$ , and let $\sigma(G)$ be the set of eigenvalues of $G$ . Then $\mathrm{Re}\,\lambda<0$ for all $\lambda\in\sigma(G)$ .

This implies that $0\notin\sigma(G)$ , and so we have:

Corollary 1.

For any $Q\in\mathcal{Q}$ with subgenerator $G$ , the inverse operator $G^{-1}$ exists.

This allows us to characterize hitting times for continuous-time homogeneous Markov chains:

Proposition 10.

Choose any $Q\in\mathcal{Q}$ , let $G$ be its subgenerator, and let $P\in\mathbb{P}_{\mathbb{R}_{\geq 0}}^{\mathrm{HM}}$ with ${}^{P}\!Q=Q$ . Then the expected hitting times $h\coloneqq\mathbb{E}_{P}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]$ satisfy $h|_{A^{c}}=-G^{-1}\mathbf{1}$ and $h(x)=0$ for all $x\in A$ .

Proof.

By Proposition 2, in $x\in A^{c}$ we have that

-1=-\mathbb{I}_{A^{c}}(x)=\mathbb{I}_{A^{c}}(x)Qh(x)=Qh(x)\,.

Conversely, it is immediate from the definition that $h(x)=0$ for all $x\in A$ . This implies $h=(h|_{A^{c}})\!\!\uparrow_{\mathcal{X}}$ , and hence

Gh|_{A^{c}}=\bigl{(}Q(h|_{A^{c}})\!\!\uparrow_{\mathcal{X}})\bigr{|_{A^{c}}}=(Qh)|_{A^{c}}=-\mathbf{1}\,.

Now use Corollary 1 and multiply with $G^{-1}$ . ∎

4.1 Quasicontractivity of Subspace Dynamics

We already know from Proposition 7 that $\left\lVert\smash{e^{\overline{G}t}}\right\rVert<1$ for all $t\in\mathbb{R}_{>0}$ . Since $\smash{e^{\overline{G}0}}=I$ (because it is a semigroup), it follows that $\left\lVert\smash{e^{\overline{G}t}}\right\rVert\leq 1$ for all $t\in\mathbb{R}_{\geq 0}$ . A semigroup that satisfies this property is said to be contractive. Moreover, Proposition 7 together with the semigroup property implies that $\lim_{t\to+\infty}\left\lVert\smash{e^{\overline{G}t}}\right\rVert=0$ . A semigroup that satisfies this property is said to be uniformly exponentially stable, and in such a case the following result holds:

Proposition 11.

There are $M\geq 1$ and $\xi>0$ such that $\left\lVert\smash{e^{\overline{G}t}}\right\rVert\leq Me^{-\xi t}$ for all $t\in\mathbb{R}_{\geq 0}$ .

This result means that the norm $\left\lVert\smash{e^{\overline{G}t}}\right\rVert$ decays exponentially as $t$ grows. However, for technical reasons we require an exponentially decaying norm bound with $M=1$ ; if this holds the semigroup is said to be quasicontractive.

It is not clear that obtaining such a bound is possible when $\left\lVert\smash{e^{\overline{G}t}}\right\rVert$ is induced by the supremum norm $\left\lVert\cdot\right\rVert$ on $\smash{\mathbb{R}^{A^{c}}}$ . However, we can get it by defining a different norm $\left\lVert\cdot\right\rVert_{*}$ on $\smash{\mathbb{R}^{A^{c}}}$ . We then obtain the quasicontractivity with respect to the induced operator norm $\left\lVert\cdot\right\rVert_{*}$ . Because $\smash{\mathbb{R}^{A^{c}}}$ is finite-dimensional these norms are equivalent, and such a result suffices for our purposes. This re-norming trick is originally due to Feller [1953], and an analogous construction is commonly used for semigroups of linear operators; see e.g. [Renardy and Rogers, 2006, Thm 12.21].

So, consider the $\xi>0$ from Proposition 11, and let

\left\lVert f\right\rVert_{*}\coloneqq\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert\quad\text{for all $f\in\smash{\mathbb{R}^{A^{c}}}$,}

(8)

where $\left\lvert f\right\rvert$ denotes the elementwise-absolute value of $f$ .

Proposition 12.

The map $f\mapsto\left\lVert f\right\rVert_{*}$ is a norm on $\smash{\mathbb{R}^{A^{c}}}$ .

Moreover, we have the desired result:

Proposition 13.

We have $\left\lVert\smash{e^{\overline{G}t}}\right\rVert_{*}\leq e^{-\xi t}$ for all $t\in\mathbb{R}_{\geq 0}$ .

Finally, the same bound holds for precise models:

Proposition 14.

For any $Q\in\mathcal{Q}$ with subgenerator $G$ it holds that $\left\lVert e^{Gt}\right\rVert_{*}\leq e^{-\xi t}$ for all $t\in\mathbb{R}_{\geq 0}$ .

5 Hitting Times as Limits

We now have all the pieces to explain the proof of our main results. The trick will be to establish a connection between hitting times for continuous-time imprecise-Markov chains, and hitting times for discrete-time imprecise-Markov chains, for which analogous results were previously established by Krak et al. [2019].

We essentially just look at a discretized continuous-time Markov chain taking steps of some fixed size $\Delta>0$ , derive the expected hitting time for this discrete-time Markov chain, and then take the limit $\Delta\to 0^{+}$ . The main difficulty is in establishing that this converges uniformly for all elements in our sets of processes; this is why we went through the trouble of establishing quasicontractivity in Section 4.1.

To start, for any $Q\in\mathcal{Q}$ and $\Delta>0$ , let $h^{Q}_{\Delta}$ be the minimal non-negative solution to the linear system⁸⁸8Note the re-scaled term $\Delta\mathbb{I}_{A^{c}}$ on the right-hand side, which distinguishes this from the system in Proposition 1; this is required since the hitting times for discrete-time Markov chains are expressed in the number of steps, and to pass to continuous-time we need to measure the size of these steps.

h^{Q}_{\Delta}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}e^{Q\Delta}h^{Q}_{\Delta}\,,

(9)

and let $h^{Q}$ be the minimal non-negative solution to

\mathbb{I}_{A}h^{Q}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Qh^{Q}\,.

(10)

Then we know from Propositions 1 and 2 that $\nicefrac{{1}}{{\Delta}}h^{Q}_{\Delta}$ represents the expected hitting times of a discrete-time homogeneous Markov chain with transition matrix $e^{Q\Delta}$ , and that $h^{Q}$ does the same for a continuous-time homogeneous Markov chain with rate matrix $Q$ . We now have the following result:

Proposition 15.

There are $\delta>0$ and $L>0$ such that $\left\lVert h^{Q}_{\Delta}-h^{Q}\right\rVert<\Delta L\left\lVert h^{Q}\right\rVert$ for all $\Delta\in(0,\delta)$ and all $Q\in\mathcal{Q}$ .

Since $\left\lVert h^{Q}\right\rVert$ is bounded due to Proposition 10:

Corollary 2.

We have $\lim_{\Delta\to 0^{+}}h^{Q}_{\Delta}=h^{Q}$ for all $Q\in\mathcal{Q}$ .

We will now set up the analogous results for imprecise-Markov chains. First, let

\underline{h}\coloneqq\inf_{Q\in\mathcal{Q}}h^{Q}\quad\text{and}\quad\overline{h}\coloneqq\sup_{Q\in\mathcal{Q}}h^{Q}\,.

(11)

Clearly, it follows from Proposition 2 and the definition of lower- and upper expectations that these quantities represent the lower- and upper expected hitting times for the imprecise-Markov chain $\mathcal{P}_{\mathcal{Q}}^{\mathrm{HM}}$ , i.e. it holds that

\underline{h}=\underline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{HM}}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr{]}\quad\text{and}\quad\overline{h}=\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{HM}}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr{]}\,.

Now for any $\Delta>0$ , let $\underline{h}_{\Delta}$ and $\overline{h}_{\Delta}$ denote the minimal non-negative solutions to the non-linear systems

\underline{h}_{\Delta}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}e^{\underline{Q}\Delta}\underline{h}_{\Delta}

(12)

and

\overline{h}_{\Delta}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}e^{\overline{Q}\Delta}\overline{h}_{\Delta}\,.

(13)

It was previously shown by Krak et al. [2019] that—up to re-scaling with $\nicefrac{{1}}{{\Delta}}$ —the quantities $\underline{h}_{\Delta}$ and $\overline{h}_{\Delta}$ represent the lower (resp. upper) expected hitting times of, identically, the discrete-time imprecise-Markov chains $\mathcal{P}_{\mathcal{T}_{\Delta}}^{\mathrm{HM}}$ , $\mathcal{P}_{\mathcal{T}_{\Delta}}^{\mathrm{M}}$ , and $\mathcal{P}_{\mathcal{T}_{\Delta}}^{\mathrm{I}}$ parameterized by the set $\mathcal{T}_{\Delta}$ of transition matrices that dominate $e^{\underline{Q}\Delta}$ . We now set out of prove an analogous result for continuous-time imprecise-Markov chains. We start with the following:

Proposition 16.

It holds that $\lim_{\Delta\to 0^{+}}\underline{h}_{\Delta}=\underline{h}$ and $\lim_{\Delta\to 0^{+}}\overline{h}_{\Delta}=\overline{h}$ .

This property allows us to leverage recent results by Erreygers [2021] and Krak [2021] regarding discrete and finite approximations of lower- and upper expectations in continuous-time imprecise-Markov chains, to obtain our first main result:

Theorem 1.

It holds that

\underline{h}=\underline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{HM}}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr{]}=\underline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{M}}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr{]}=\underline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr{]}\,,

and, moreover, that

\overline{h}=\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{HM}}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr{]}=\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{M}}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr{]}=\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}\bigl{[}\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}\bigr{]}\,.

Moreover, it follows relatively straightforwardly from Proposition 16 that the lower- and upper expected hitting times for continuous-time imprecise-Markov chains satisfy an immediate generalization of the system that characterizes the expected hitting times for (precise) continuous-time homogeneous Markov chains. This is our second main result:

Theorem 2.

Let $\underline{h}$ and $\overline{h}$ denote the lower- and upper expected hitting times for any one of $\mathcal{P}^{\mathrm{HM}}_{\mathcal{Q}}$ , $\mathcal{P}^{\mathrm{M}}_{\mathcal{Q}}$ , or $\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$ . Then $\underline{h}$ is the minimal non-negative solution to the non-linear system $\mathbb{I}_{A}\underline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\underline{Q}\,\underline{\vphantom{Q}h}$ , and $\overline{h}$ is the minimal non-negative solution to the non-linear system $\mathbb{I}_{A}\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,\overline{h}$ .

6 Summary & Conclusion

We have investigated the problem of characterizing expected hitting times for continuous-time imprecise-Markov chains. We have shown that under two relatively mild assumptions on the system’s class structure—viz. that the target states are absorbing, and can be reached by any non-target state—the corresponding lower (resp. upper) expected hitting time is the same for all three types of imprecise-Markov chains.

We have also demonstrated that these lower- and upper expected hitting times $\underline{h}$ and $\overline{h}$ satisfy the non-linear systems

\mathbb{I}_{A}\underline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\underline{Q}\,\underline{\vphantom{Q}h}\quad\text{and}\quad\mathbb{I}_{A}\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,\overline{h}\,,

in analogy with the precise linear system (5). Indeed, we conclude that the lower- and upper expected hitting times for any of these three types of imprecise-Markov chains, can be fully characterized as the unique minimal non-negative solutions to these respective systems.

We aim to strengthen these results in future work to hold with fewer assumptions on the system’s class structure.

acknowledgements

We would like to sincerely thank Jasper De Bock for many stimulating discussions on the subject of imprecise-Markov chains, and for pointing out a technical error in an earlier draft of this work. We are also grateful for the constructive feedback of three anonymous reviewers.

References

Augustin et al. [2014] Thomas Augustin, Frank P. A. Coolen, Gert de Cooman, and Matthias C. M. Troffaes, editors. Introduction to Imprecise Probabilities. John Wiley & Sons, 2014.
De Bock [2017] Jasper De Bock. The limit behaviour of imprecise continuous-time Markov chains. Journal of Nonlinear Science, 27(1):159–196, 2017.
Engel and Nagel [2000] Klaus-Jochen Engel and Rainer Nagel. One-parameter semigroups for linear evolution equations, volume 194. Springer, 2000.
Erreygers [2021] Alexander Erreygers. Markovian Imprecise Jump Processes: Foundations, Algorithms and Applications. PhD thesis, Ghent University, 2021.
Feller [1953] William Feller. On the generation of unbounded semi-groups of bounded linear operators. Annals of Mathematics, pages 166–174, 1953.
Hermans and Škulj [2014] Filip Hermans and Damjan Škulj. Stochastic processes. In Thomas Augustin, Frank P. A. Coolen, Gert de Cooman, and Matthias C. M. Troffaes, editors, Introduction to Imprecise Probabilities, chapter 11. Wiley, 2014.
Krak [2020] Thomas Krak. Computing expected hitting times for imprecise Markov chains. In International Conference on Uncertainty Quantification & Optimisation, pages 185–205. Springer, 2020.
Krak [2021] Thomas Krak. Continuous-Time Imprecise-Markov Chains: Theory and Algorithms. PhD thesis, Ghent University, 2021.
Krak et al. [2017] Thomas Krak, Jasper De Bock, and Arno Siebes. Imprecise continuous-time Markov chains. International Journal of Approximate Reasoning, 88:452–528, 2017.
Krak et al. [2019] Thomas Krak, Natan T’Joens, and Jasper De Bock. Hitting times and probabilities for imprecise Markov chains. In Proceedings of ISIPTA 2019, volume 103 of Proceedings of Machine Learning Research, pages 265–275. PMLR, 2019.
Norris [1998] James Robert Norris. Markov chains. Cambridge university press, 1998.
Renardy and Rogers [2006] Michael Renardy and Robert C. Rogers. An introduction to partial differential equations. Springer Science & Business Media, 2006.
Škulj [2015] Damjan Škulj. Efficient computation of the bounds of continuous time imprecise Markov chains. Applied mathematics and computation, 250:165–180, 2015.
Taylor and Lay [1958] Angus E. Taylor and David C. Lay. Introduction to functional analysis, volume 1. Wiley New York, 1958.
Van Loan [2006] Charles F. Van Loan. A study of the matrix exponential. 2006.
Walley [1991] Peter Walley. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, London, 1991.

Appendix A Proofs and Lemmas for Section 4

For certain operators, we note that subspace restriction distributes over operator composition:

Lemma 2.

Let $M$ and $N$ be operators on $\smash{\mathbb{R}^{\mathcal{X}}}$ such that $N|_{A}=I$ . Then $\bigl{(}MN\bigr{)}|_{A^{c}}=M|_{A^{c}}N|_{A^{c}}$ .

Proof.

Fix any $f\in\smash{\mathbb{R}^{A^{c}}}$ . Then

\displaystyle M|_{A^{c}}N|_{A^{c}}f=M\Bigl{(}\bigl{(}(Nf\!\!\uparrow_{\mathcal{X}})|_{A^{c}}\bigr{)}\!\!\uparrow_{\mathcal{X}}\Bigr{)}|_{A^{c}}\,.

Note that since $N|_{A}=I$ and $f\!\!\uparrow_{\mathcal{X}}(x)=0$ for all $x\in A$ , we also have $Nf\!\!\uparrow_{\mathcal{X}}(x)=0$ for all $x\in A$ . Hence in particular, it holds that $\bigl{(}(Nf\!\!\uparrow_{\mathcal{X}})|_{A^{c}}\bigr{)}\!\!\uparrow_{\mathcal{X}}=Nf\!\!\uparrow_{\mathcal{X}}$ . We therefore find that

\displaystyle M|_{A^{c}}N|_{A^{c}}f

\displaystyle=\bigl{(}MNf\!\!\uparrow_{\mathcal{X}}\bigr{)}|_{A^{c}}=(MN)|_{A^{c}}f\,,

which concludes the proof. ∎

This can be used in particular for certain operators associated with $Q\in\mathcal{Q}$ and the associated lower- and upper rate operators:

Lemma 3.

Fix any $\Delta\geq 0$ and any $Q\in\mathcal{Q}$ . Then

(I+\Delta Q)|_{A}=(I+\Delta\underline{Q})|_{A}=(I+\Delta\overline{Q})|_{A}=I\,.

Proof.

Fix any $Q\in\mathcal{Q}$ , and first choose any $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ and $x\in A$ . By Assumption 1 and the definition of rate matrices, we have $Q(x,y)=0$ for all $y\in\mathcal{X}$ , whence $Qf(x)=\sum_{y\in\mathcal{X}}Q(x,y)f(y)=0$ . Since $Q\in\mathcal{Q}$ is arbitrary, we also have $\underline{Q}f(x)=0$ and $\overline{Q}f(x)=0$ . It follows that

f(x)=(I+\Delta Q)f(x)=(I+\Delta\underline{Q})f(x)=(I+\Delta\overline{Q})f(x)\,.

Since this is true for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ and all $x\in A$ , the result is now immediate. ∎

Corollary 3.

For all $Q\in\mathcal{Q}$ and $t\in\mathbb{R}_{\geq 0}$ it holds that

e^{Qt}|_{A}=e^{\underline{Q}t}|_{A}=e^{\overline{Q}t}|_{A}=I\,.

Proof.

Use Lemma 3 and the definitions of $e^{Qt}$ , $e^{\underline{Q}t}$ , $e^{\overline{Q}t}$ . ∎

Lemma 4.

Let $M$ and $N$ be operators on $\smash{\mathbb{R}^{\mathcal{X}}}$ such that $M|_{A}=I=N|_{A}$ . Then $\left\lVert M|_{A^{c}}-N|_{A^{c}}\right\rVert\leq\left\lVert M-N\right\rVert$ .

Proof.

Fix any $f\in\smash{\mathbb{R}^{A^{c}}}$ with $\left\lVert f\right\rVert=1$ . Then $\left\lVert f\!\!\uparrow_{\mathcal{X}}\right\rVert=1$ . Moreover, since $f\!\!\uparrow_{\mathcal{X}}(x)=0$ for all $x\in A$ and since $M|_{A}=I=N|_{A}$ , we have that $(Mf\!\!\uparrow_{\mathcal{X}})(x)=0=(Nf\!\!\uparrow_{\mathcal{X}})(x)$ for all $x\in A$ . Hence we find

	$\displaystyle\left\lVert(M\|_{A^{c}}-N\|_{A^{c}})f\right\rVert$
	$\displaystyle\quad\quad=\left\lVert\bigl{(}(M-N)f\!\!\uparrow_{\mathcal{X}}\bigr{)}\|_{A^{c}}\right\rVert$
	$\displaystyle\quad\quad=\left\lVert(M-N)f\!\!\uparrow_{\mathcal{X}}\right\rVert$
	$\displaystyle\quad\quad\leq\sup\bigl{\{}\left\lVert(M-N)g\right\rVert\,:\,g\in\smash{\mathbb{R}^{\mathcal{X}}},\left\lVert g\right\rVert=1\bigr{\}}$
	$\displaystyle\quad\quad=\left\lVert M-N\right\rVert\,.$

The result follows since $f\in\smash{\mathbb{R}^{A^{c}}}$ is arbitrary. ∎

Proof of Proposition 5.

Fix $Q\in\mathcal{Q}$ and let $G$ be its subgenerator. First fix any $t\in\mathbb{R}_{\geq 0}$ and any $\epsilon>0$ . Then by definition of $e^{Qt}$ , for all $n\in\mathbb{N}$ large enough it holds that

\left\lVert e^{Qt}-\bigl{(}I+\nicefrac{{t}}{{n}}Q\bigr{)}^{n}\right\rVert<\epsilon\,.

Moreover, by Lemmas 2 and 3 we have

\Bigl{(}\bigl{(}I+\nicefrac{{t}}{{n}}Q\bigr{)}^{n}\Bigr{)}|_{A^{c}}=\bigl{(}I+\nicefrac{{t}}{{n}}G\bigr{)}^{n}\,,

and, by Corollary 3, that $e^{Qt}|_{A}=I$ . Hence by Lemma 4 we find

\left\lVert e^{Gt}-\bigl{(}I+\nicefrac{{t}}{{n}}G\bigr{)}^{n}\right\rVert\leq\left\lVert e^{Qt}-\bigl{(}I+\nicefrac{{t}}{{n}}Q\bigr{)}^{n}\right\rVert<\epsilon\,.

Since $\epsilon>0$ is arbitrary, we have

e^{Gt}=\lim_{n\to+\infty}\bigl{(}I+\nicefrac{{t}}{{n}}G\bigr{)}^{n}\,.

This concludes the proof of the first claim.

To see that $(e^{Gt})_{t\in\mathbb{R}_{\geq 0}}$ is a semigroup, note that $(e^{Qt})_{t\in\mathbb{R}_{\geq 0}}$ is a semigroup, then apply Lemma 2 and Corollary 3. ∎

Proof of Proposition 6.

The proof is completely analogous to the proof of Proposition 5; simply replace $Q$ with either $\underline{Q}$ or $\overline{Q}$ as appropriate. ∎

Proof of Proposition 7.

Let $\epsilon\coloneqq\min_{x\in A^{c}}e^{\underline{Q}t}\mathbb{I}_{A}(x)$ ; then $\epsilon>0$ due to Assumption 2. Fix any $f\in\smash{\mathbb{R}^{A^{c}}}$ with $\left\lVert f\right\rVert=1$ . By definition, we have $e^{\overline{G}t}f=e^{\overline{Q}t}|_{A^{c}}f=\bigl{(}e^{\overline{Q}t}f\!\!\uparrow_{\mathcal{X}}\bigr{)}|_{A^{c}}$ .

Let $\mathcal{T}_{t}$ denote the set of transition matrices that dominates $e^{\underline{Q}t}$ . Due to Proposition 3, there is some $T\in\mathcal{T}_{t}$ such that $Tf\!\!\uparrow_{\mathcal{X}}=e^{\overline{Q}t}f\!\!\uparrow_{\mathcal{X}}$ . Fix any $x\in A^{c}$ . Then, using that $f\!\!\uparrow_{\mathcal{X}}(y)=0$ for all $y\in A$ , together with the fact that $T$ is a transition matrix, we have

\displaystyle\left\lvert Tf\!\!\uparrow_{\mathcal{X}}(x)\right\rvert

\displaystyle=\left\lvert\sum_{y\in\mathcal{X}}T(x,y)f\!\!\uparrow_{\mathcal{X}}(y)\right\rvert\leq\sum_{y\in A^{c}}T(x,y)\,,

and hence $\left\lvert Tf\!\!\uparrow_{\mathcal{X}}(x)\right\rvert\leq T\mathbb{I}_{A^{c}}(x)$ . We have $\mathbb{I}_{A}+\mathbb{I}_{A^{c}}=\mathbf{1}$ and $T\mathbf{1}(x)=1$ since $T$ is a transition matrix. Using the linear character of $T$ , we find that

\displaystyle T\mathbb{I}_{A^{c}}(x)=T(\mathbf{1}-\mathbb{I}_{A})(x)=1-T\mathbb{I}_{A}(x)\,.

Since $T\in\mathcal{T}_{t}$ and $x\in A^{c}$ we have

0<\epsilon=\min_{y\in A^{c}}e^{\underline{Q}t}\mathbb{I}_{A}(y)\leq e^{\underline{Q}t}\mathbb{I}_{A}(x)\leq T\mathbb{I}_{A}(x)\,.

Combining the above we find that

\left\lvert Tf\!\!\uparrow_{\mathcal{X}}(x)\right\rvert\leq T\mathbb{I}_{A^{c}}(x)=1-T\mathbb{I}_{A}(x)\leq 1-\epsilon\,.

Since this is true for all $x\in A^{c}$ , we find that $\left\lVert(Tf\!\!\uparrow_{\mathcal{X}})|_{A^{c}}\right\rVert\leq 1-\epsilon$ . Moreover, since $Tf\!\!\uparrow_{\mathcal{X}}=e^{\overline{Q}t}f\!\!\uparrow_{\mathcal{X}}$ , it follows that $\left\lVert(e^{\overline{Q}t}f\!\!\uparrow_{\mathcal{X}})|_{A^{c}}\right\rVert\leq 1-\epsilon$ , or in other words, that

\left\lVert e^{\overline{G}t}f\right\rVert\leq 1-\epsilon\,,\quad\quad\text{with $\epsilon>0$.}

The result follows since $f\in\smash{\mathbb{R}^{A^{c}}}$ with $\left\lVert f\right\rVert=1$ is arbitrary. ∎

Proof of Lemma 1.

Let $\rho(e^{G})\coloneqq\max_{\lambda\in\sigma(e^{G})}\left\lvert\lambda\right\rvert$ denote the spectral radius of $e^{G}$ . We know from Section 4 that $\left\lVert e^{G}\right\rVert<1$ , and hence we have $\rho(e^{G})\leq\left\lVert e^{G}\right\rVert<1$ [Taylor and Lay, 1958, Thm V.3.5]. This implies that $\left\lvert\lambda\right\rvert<1$ for all $\lambda\in\sigma(e^{G})$ .

By the spectral mapping theorem [Engel and Nagel, 2000, Lemma I.3.13] we then have $e^{\mathrm{Re}\,\lambda}<1$ for all $\lambda\in\sigma(G)$ , or in other words, that $\mathrm{Re}\,\lambda<0$ for all $\lambda\in\sigma(G)$ . ∎

Appendix B Proofs and Lemmas for Section 4.1

Proof of Proposition 11.

This proof is a straightforward generalization of an argument in [Engel and Nagel, 2000, Prop I.3.12].

Let first $q\coloneqq\left\lVert e^{\overline{G}}\right\rVert$ ; then $0<q<1$ due to Proposition 7. Define

m\coloneqq\sup_{s\in[0,1]}\left\lVert e^{\overline{G}s}\right\rVert\,.

Then $m\geq 1$ since $m\geq\left\lVert e^{\overline{G}0}\right\rVert=\left\lVert I\right\rVert=1$ . Moreover, $m\leq 1$ due to Proposition 7, and hence $m=1$ . Now set $M\coloneqq\nicefrac{{1}}{{q}}$ and $\xi\coloneqq-\log q$ ; then $\xi>0$ since $q<1$ .

Fix any $t\in\mathbb{R}_{\geq 0}$ . If $t=0$ then the result is trivial, so let us suppose that $t>0$ . Then there are $k\in\mathbb{N}_{0}$ and $s\in[0,1)$ such that $t=k+s$ . Using the semigroup property, we have

\displaystyle\left\lVert e^{\overline{G}t}\right\rVert=\left\lVert e^{\overline{G}(s+k)}\right\rVert

\displaystyle\leq\left\lVert e^{\overline{G}s}\right\rVert\left\lVert e^{\overline{G}}\right\rVert^{k}\leq mq^{k}=e^{k\log q}\,.

We have $k=t-s$ and $s\in[0,1)$ , and so

	$\displaystyle\left\lVert e^{\overline{G}t}\right\rVert$	$\displaystyle\leq e^{k\log q}$
		$\displaystyle=e^{(t-s)\log q}$
		$\displaystyle=e^{t\log q}e^{-s\log q}$
		$\displaystyle=e^{-\xi t}e^{-s\log q}$
		$\displaystyle\leq e^{-\xi t}e^{-\log q}=\frac{1}{q}e^{-\xi t}=Me^{-\xi t}\,,$

which concludes the proof. ∎

Proof of Proposition 12.

It follows from the definition that for any upper transition operator $\overline{T}$ and any non-negative $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ , also $\overline{T}f$ is non-negative. In the sequel, we will therefore say that upper transition operators preserve non-negativity. Since $e^{\overline{Q}t}$ is an upper transition operator, this property clearly extends also to $e^{\overline{G}t}$ .

Now fix $f,g\in\smash{\mathbb{R}^{A^{c}}}$ and $t\in\mathbb{R}_{\geq 0}$ . By preservation of non-negativity we have for any $x\in A^{c}$ that

\left\lvert e^{\overline{G}t}\left\lvert f+g\right\rvert\right\rvert(x)=e^{\overline{G}t}\left\lvert f+g\right\rvert(x)\,.

Moreover, we clearly have $\left\lvert f+g\right\rvert\leq\left\lvert f\right\rvert+\left\lvert g\right\rvert$ , and so by the monotonicity of upper transition operators, we have

e^{\overline{G}t}\left\lvert f+g\right\rvert(x)\leq e^{\overline{G}t}(\left\lvert f\right\rvert+\left\lvert g\right\rvert)(x)\,.

Finally, by the subadditivity of upper transition operators, we find that

e^{\overline{G}t}(\left\lvert f\right\rvert+\left\lvert g\right\rvert)(x)\leq e^{\overline{G}t}\left\lvert f\right\rvert(x)+e^{\overline{G}t}\left\lvert g\right\rvert(x)\,.

Again by preservation of non-negativity we have

e^{\overline{G}t}\left\lvert f\right\rvert(x)+e^{\overline{G}t}\left\lvert g\right\rvert(x)=\left\lvert e^{\overline{G}t}\left\lvert f\right\rvert(x)+e^{\overline{G}t}\left\lvert g\right\rvert\right\rvert(x)\,.

Because this is true for all $x\in A^{c}$ , we find that

	$\displaystyle\left\lVert e^{\overline{G}t}\left\lvert f+g\right\rvert\right\rVert$	$\displaystyle\leq\left\lVert e^{\overline{G}t}\left\lvert f\right\rvert+e^{\overline{G}t}\left\lvert g\right\rvert\right\rVert$
		$\displaystyle\leq\left\lVert e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert+\left\lVert e^{\overline{G}t}\left\lvert g\right\rvert\right\rVert\,.$

Multiplying both sides with $e^{\xi t}$ and noting that $t\in\mathbb{R}_{\geq 0}$ is arbitrary, we find that

	$\displaystyle\left\lVert f+g\right\rVert_{*}$	$\displaystyle=\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert f+g\right\rvert\right\rVert$
		$\displaystyle\leq\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert+\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert g\right\rvert\right\rVert$
		$\displaystyle\leq\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert+\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert g\right\rvert\right\rVert$
		$\displaystyle=\left\lVert f\right\rVert_{}+\left\lVert g\right\rVert_{}\,.$

Hence we have established that $\left\lVert\cdot\right\rVert_{*}$ satisfies the triangle inequality.

Next, fix any $f\in\smash{\mathbb{R}^{A^{c}}}$ and $c\in\mathbb{R}$ . Then

	$\displaystyle\left\lVert cf\right\rVert_{*}$	$\displaystyle=\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert cf\right\rvert\right\rVert$
		$\displaystyle=\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert c\right\rvert\left\lvert f\right\rvert\right\rVert$
		$\displaystyle=\left\lvert c\right\rvert\sup_{t\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi t}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert=\left\lvert c\right\rvert\left\lVert f\right\rVert_{*}\,.$

So $\left\lVert\cdot\right\rVert_{*}$ is absolutely homogeneous.

Finally, fix $f\in\smash{\mathbb{R}^{A^{c}}}$ and suppose that $\left\lVert f\right\rVert_{*}=0$ . It holds that

\displaystyle 0=\left\lVert f\right\rVert_{*}\geq\left\lVert e^{\xi 0}e^{\overline{G}0}\left\lvert f\right\rvert\right\rVert\geq 0\,,

whence it holds that $\left\lVert e^{\xi 0}e^{\overline{G}0}\left\lvert f\right\rvert\right\rVert=0$ . This implies that also $\left\lVert e^{\overline{G}0}\left\lvert f\right\rvert\right\rVert=0$ . Since $e^{\overline{G}0}=I$ , we have

\displaystyle 0=\left\lVert e^{\overline{G}0}\left\lvert f\right\rvert\right\rVert=\left\lVert\left\lvert f\right\rvert\right\rVert=\left\lVert f\right\rVert\,,

whence $f=0$ . Hence $\left\lVert\cdot\right\rVert_{*}$ separates $\smash{\mathbb{R}^{A^{c}}}$ . ∎

Lemma 5.

For any $Q\in\mathcal{Q}$ with subgenerator $G$ , any $f\in\smash{\mathbb{R}^{A^{c}}}$ , and any $t\geq 0$ , it holds that $\left\lVert e^{Gt}f\right\rVert_{*}\leq\left\lVert e^{\overline{G}t}f\right\rVert_{*}$ .

Proof.

Choose $f\in\smash{\mathbb{R}^{A^{c}}}$ . Let $T$ be any matrix with non-negative entries. Then $\left\lvert Tf(x)\right\rvert\leq\left\lvert T\left\lvert f\right\rvert(x)\right\rvert$ for all $x\in A^{c}$ . In particular, we have

	$\displaystyle\left\lvert Tf(x)\right\rvert$	$\displaystyle=\left\lvert\sum_{y\in A^{c}}T(x,y)f(y)\right\rvert$
		$\displaystyle\leq\sum_{y\in A^{c}}\left\lvert T(x,y)\right\rvert\left\lvert f(y)\right\rvert$
		$\displaystyle=T\left\lvert f\right\rvert(x)=\left\lvert T\left\lvert f\right\rvert(x)\right\rvert\,,$

where the final two equalities follow from the fact that $T$ only has non-negative entries. Since this is true for any matrix $T$ with non-negative entries, we have in particular that $\left\lvert e^{Gt}f\right\rvert(x)\leq e^{Gt}\left\lvert f\right\rvert(x)$ . Similarly, it holds that

	$\displaystyle\left\lvert e^{\overline{G}t}f\right\rvert(x)$	$\displaystyle=\left\lvert\sup_{T\in\mathcal{T}_{t}}Tf(x)\right\rvert$
		$\displaystyle\leq\sup_{T\in\mathcal{T}_{t}}\left\lvert Tf(x)\right\rvert$
		$\displaystyle\leq\sup_{T\in\mathcal{T}_{t}}T\left\lvert f\right\rvert(x)=e^{\overline{G}t}\left\lvert f\right\rvert(x)\,.$

It follows that, for any $s\in\mathbb{R}_{\geq 0}$ , we have

\displaystyle e^{\overline{G}s}\left\lvert e^{Gt}f\right\rvert(x)\leq e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert(x)\,.

Due to preservation of non-negativity, and since this is true for any $x\in A^{c}$ , we have

\left\lVert e^{\overline{G}s}\left\lvert e^{Gt}f\right\rvert\right\rVert\leq\left\lVert e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rVert\,.

Now let $f\in\smash{\mathbb{R}^{A^{c}}}$ be such that $\left\lVert f\right\rVert_{*}=1$ and $\left\lVert e^{Gt}\right\rVert_{*}=\left\lVert e^{Gt}f\right\rVert_{*}$ ; this $f$ clearly exists since $\smash{\mathbb{R}^{A^{c}}}$ is finite-dimensional. Then we have

	$\displaystyle\left\lVert e^{Gt}\right\rVert_{*}$	$\displaystyle=\left\lVert e^{Gt}f\right\rVert_{*}$
		$\displaystyle=\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}\left\lvert e^{Gt}f\right\rvert\right\rVert$
		$\displaystyle\leq\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rVert=\left\lVert e^{Gt}\left\lvert f\right\rvert\right\rVert_{}\leq\left\lVert e^{Gt}\right\rVert_{}\,,$

where the final inequality used that $\left\lVert\left\lvert f\right\rvert\right\rVert_{*}=\left\lVert f\right\rVert_{*}=1$ . Hence we have found that $\left\lVert e^{Gt}\right\rVert_{*}=\left\lVert e^{Gt}\left\lvert f\right\rvert\right\rVert_{*}$ .

Since $e^{Qt}\in\mathcal{T}_{t}$ by Equation (7), we also have

e^{Gt}\left\lvert f\right\rvert\leq e^{\overline{G}t}\left\lvert f\right\rvert\,.

By monotonicity of upper transition operators, this implies that

e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\leq e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert

and, due to the preservation of non-negativity, we have

e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert=\left\lvert e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rvert\,,

and

e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert=\left\lvert e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rvert\,.

Hence for all $x\in A^{c}$ we have

\left\lvert e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rvert(x)\leq\left\lvert e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rvert(x)\,,

or in other words, that

\left\lVert e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rVert\leq\left\lVert e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert\,.

Since this holds for all $s\in\mathbb{R}_{\geq 0}$ , we have

	$\displaystyle\left\lVert e^{Gt}\right\rVert_{}=\left\lVert e^{Gt}\left\lvert f\right\rvert\right\rVert_{}$	$\displaystyle=\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}e^{Gt}\left\lvert f\right\rvert\right\rVert$
		$\displaystyle\leq\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert$
		$\displaystyle=\left\lVert e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert_{}\leq\left\lVert e^{\overline{G}t}\right\rVert_{}\,,$

which concludes the proof. ∎

Proof of Proposition 13.

The argument is analogous to the well-known case for linear quasicontractive semigroups; for a similar result, see e.g. [Renardy and Rogers, 2006, Thm 12.21]. So, fix any $t\in\mathbb{R}_{\geq 0}$ and $f\in\smash{\mathbb{R}^{A^{c}}}$ .

Using a similar argument as used in the proof of Lemma 5, we use the preservation of non-negativity and the monotonicity of upper transition operators, to find for any $s\in\mathbb{R}_{\geq 0}$ that

\left\lVert e^{\xi s}e^{\overline{G}s}\left\lvert e^{\overline{G}t}f\right\rvert\right\rVert\leq\left\lVert e^{\xi s}e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert\,.

Hence we have

	$\displaystyle\left\lVert e^{\overline{G}t}f\right\rVert_{*}$	$\displaystyle=\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}\left\lvert e^{\overline{G}t}f\right\rvert\right\rVert$
		$\displaystyle\leq\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}s}e^{\overline{G}t}\left\lvert f\right\rvert\right\rVert$
		$\displaystyle=e^{-\xi t}e^{\xi t}\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi s}e^{\overline{G}(s+t)}\left\lvert f\right\rvert\right\rVert$
		$\displaystyle=e^{-\xi t}\sup_{s\in\mathbb{R}_{\geq 0}}\left\lVert e^{\xi(s+t)}e^{\overline{G}(s+t)}\left\lvert f\right\rvert\right\rVert$
		$\displaystyle=e^{-\xi t}\sup_{s\in\mathbb{R}_{\geq t}}\left\lVert e^{\xi(s)}e^{\overline{G}s}\left\lvert f\right\rvert\right\rVert\leq e^{-\xi t}\left\lVert f\right\rVert_{*}\,,$

where for the second equality we used the semigroup property. Since $f\in\smash{\mathbb{R}^{A^{c}}}$ is arbitrary, this implies that

\left\lVert e^{\overline{G}t}\right\rVert_{*}=\sup\bigl{\{}\left\lVert e^{\overline{G}t}f\right\rVert_{*}\,:\,f\in\smash{\mathbb{R}^{A^{c}}},\left\lVert f\right\rVert_{*}=1\bigr{\}}\leq e^{-\xi t}\,,

which completes the proof. ∎

Proof of Proposition 14.

This is immediate from Lemma 5 and Proposition 13. ∎

Appendix C Proofs and Lemmas for Section 5

The following result is well-known, but we state it here for convenience:

Lemma 6.

Let $T$ be a linear bounded operator on a Banach space with norm $\left\lVert\cdot\right\rVert_{*}$ . Suppose that $\left\lVert T\right\rVert_{*}<1$ and that $(I-T)^{-1}$ exists. Then

\left\lVert(I-T)^{-1}\right\rVert_{*}\leq\frac{1}{1-\left\lVert T\right\rVert_{*}}\,.

Proof.

Since $\left\lVert T\right\rVert_{*}<1$ we have $(I-T)^{-1}=\sum_{k=0}^{+\infty}T^{k}$ . Taking norms,

\left\lVert(I-T)^{-1}\right\rVert_{*}=\left\lVert\sum_{k=0}^{+\infty}T^{k}\right\rVert_{*}\leq\sum_{k=0}^{+\infty}\left\lVert T\right\rVert_{*}^{k}=\frac{1}{1-\left\lVert T\right\rVert_{*}}\,,

where the final step used the value of the geometric series and that $\left\lVert T\right\rVert_{*}<1$ . ∎

Lemma 7.

There is some $C>0$ such that for any $\Delta>0$ with $\Delta\xi<1$ , and any $Q\in\mathcal{Q}$ with subgenerator $G$ , it holds that $\left\lVert(I-e^{G\Delta})^{-1}\right\rVert<\nicefrac{{C}}{{\Delta}}$ .

Proof.

Let $\xi>0$ be as in Proposition 11, and let $\left\lVert\cdot\right\rVert_{*}$ be the norm from Equation (8). Since $\smash{\mathbb{R}^{A^{c}}}$ is finite-dimensional the norms $\left\lVert\cdot\right\rVert$ and $\left\lVert\cdot\right\rVert_{*}$ are equivalent, and hence there is some $c>0$ such that $\left\lVert f\right\rVert\leq c\left\lVert f\right\rVert_{*}$ for all $f\in\smash{\mathbb{R}^{A^{c}}}$ . Set $C\coloneqq\nicefrac{{2c}}{{\xi}}$ ; then $C>0$ since $\xi>0$ .

Fix any $\Delta>0$ such that $\Delta\xi<1$ , and any $Q\in\mathcal{Q}$ with subgenerator $G$ . It follows from Proposition 14 that $\left\lVert e^{G\Delta}\right\rVert_{*}\leq e^{-\xi\Delta}$ . Using a standard quadratic bound on the negative scalar exponential, we have

\left\lVert e^{G\Delta}\right\rVert_{*}\leq e^{-\xi\Delta}\leq 1-\xi\Delta+\frac{1}{2}\Delta^{2}\xi^{2}<1-\frac{\Delta\xi}{2}<1\,,

(14)

where the third inequality used that $\Delta\xi<1$ . Notice that $\left\lVert e^{G\Delta}\right\rVert_{*}\leq e^{-\xi\Delta}<1$ . Moreover, $(I-e^{G\Delta})^{-1}$ exists by Proposition 8. By the norm equivalence, we have

\left\lVert(I-e^{G\Delta})^{-1}\right\rVert\leq c\left\lVert(I-e^{G\Delta})^{-1}\right\rVert_{*}\,,

(15)

and, by Lemma 6, that

\left\lVert(I-e^{G\Delta})^{-1}\right\rVert_{*}\leq\frac{1}{1-\left\lVert e^{G\Delta}\right\rVert_{*}}\,.

Using Equation (14) we obtain

\left\lVert(I-e^{G\Delta})^{-1}\right\rVert_{*}\leq\frac{1}{1-\left\lVert e^{G\Delta}\right\rVert_{*}}<\frac{1}{1-1+\frac{\Delta\xi}{2}}=\frac{1}{\Delta}\frac{2}{\xi}\,.

Combining with Equation (15) yields

\left\lVert(I-e^{G\Delta})^{-1}\right\rVert<c\frac{1}{\Delta}\frac{2}{\xi}=\frac{C}{\Delta}\,,

which concludes the proof. ∎

Proof of Proposition 15.

Let $\xi,C>0$ be as in Lemma 7, and let $\delta\coloneqq\nicefrac{{1}}{{\xi}}$ and $L\coloneqq C\left\lVert\mathcal{Q}\right\rVert^{2}$ with $\left\lVert\mathcal{Q}\right\rVert\coloneqq\sup_{Q\in\mathcal{Q}}\left\lVert Q\right\rVert$ ; note that $\left\lVert\mathcal{Q}\right\rVert\in\mathbb{R}_{\geq 0}$ since $\mathcal{Q}$ is bounded by assumption. Observe that we must have $\left\lVert\mathcal{Q}\right\rVert>0$ due to Assumption 2, whence $L>0$ .

Choose any $\Delta\in(0,\delta)$ and $Q\in\mathcal{Q}$ . It is immediate from the definitions that $h^{Q}(x)=0=h^{Q}_{\Delta}(x)$ for all $x\in A$ and all $Q\in\mathcal{Q}$ , so it remains to bound the norm on $A^{c}$ .

Let $G$ be the subgenerator of $Q$ on $A^{c}$ . By Proposition 9 we have that $h^{Q}_{\Delta}|_{A^{c}}=(I-e^{G\Delta})^{-1}\Delta\mathbf{1}$ . Using the definition of $h^{Q}$ this implies that

\displaystyle h^{Q}_{\Delta}|_{A^{c}}-e^{G\Delta}h^{Q}_{\Delta}|_{A^{c}}=\Delta\mathbf{1}=-\Delta Gh^{Q}|_{A^{c}}\,.

Re-ordering terms we have

h^{Q}_{\Delta}|_{A^{c}}=e^{G\Delta}h^{Q}_{\Delta}|_{A^{c}}-\Delta Gh^{Q}|_{A^{c}}\,.

Let $B=e^{G\Delta}-(I+\Delta G)$ . We find that

	$\displaystyle h^{Q}_{\Delta}$	$\displaystyle\|_{A^{c}}-h^{Q}\|_{A^{c}}$
		$\displaystyle=e^{G\Delta}h^{Q}_{\Delta}\|_{A^{c}}-\Delta Gh^{Q}\|_{A^{c}}-h^{Q}\|_{A^{c}}$
		$\displaystyle=e^{G\Delta}h^{Q}_{\Delta}\|_{A^{c}}-(I+\Delta G)h^{Q}\|_{A^{c}}$
		$\displaystyle=e^{G\Delta}(h^{Q}_{\Delta}\|_{A^{c}}-h^{Q}\|_{A^{c}})+\bigl{(}e^{G\Delta}-(I+\Delta G)\bigr{)}h^{Q}\|_{A^{c}}$
		$\displaystyle=e^{G\Delta}(h^{Q}_{\Delta}\|_{A^{c}}-h^{Q}\|_{A^{c}})+Bh^{Q}\|_{A^{c}}\,.$

We see that the difference on the left-hand side occurs again on the right-hand side. Hence we can substitute the same expansion $n\in\mathbb{N}$ times to get

	$\displaystyle h^{Q}_{\Delta}\|_{A^{c}}-h^{Q}\|_{A^{c}}$
	$\displaystyle\quad=e^{G\Delta(n+1)}(h^{Q}_{\Delta}\|_{A^{c}}-h^{Q}\|_{A^{c}})+\sum_{k=0}^{n}e^{G\Delta k}Bh^{Q}\|_{A^{c}}\,.$

Since we know from Section 4 that $\lim_{t\to+\infty}e^{Gt}=0$ , we see that the left summand vanishes as we take $n\to+\infty$ and, using Proposition 8, we have $(I-e^{Q\Delta})^{-1}=\sum_{k=0}^{+\infty}e^{G\Delta k}$ . So, passing to this limit and taking norms, we find

	$\displaystyle\left\lVert h^{Q}_{\Delta}\|_{A^{c}}-h^{Q}\|_{A^{c}}\right\rVert$	$\displaystyle=\left\lVert(I-e^{G\Delta})^{-1}Bh^{Q}\|_{A^{c}}\right\rVert$
		$\displaystyle\leq\left\lVert(I-e^{G\Delta})^{-1}\right\rVert\left\lVert B\right\rVert\left\lVert h^{Q}\|_{A^{c}}\right\rVert\,.$

Using Lemmas 3 and 4 and Corollary 3, we have

\left\lVert B\right\rVert=\left\lVert e^{G\Delta}-(I+\Delta G)\right\rVert\leq\left\lVert e^{Q\Delta}-(I+\Delta Q)\right\rVert\,,

and so, due to [Krak, 2021, Lemma B.8], we have $\left\lVert B\right\rVert\leq\Delta^{2}\left\lVert Q\right\rVert^{2}$ . Since $Q\in\mathcal{Q}$ it follows that $\left\lVert Q\right\rVert\leq\left\lVert\mathcal{Q}\right\rVert$ , and so $\left\lVert B\right\rVert\leq\Delta^{2}\left\lVert\mathcal{Q}\right\rVert^{2}$ . Since $\Delta<\delta$ we have $\Delta\xi<1$ , whence $\left\lVert(I-e^{G\Delta})^{-1}\right\rVert<\nicefrac{{C}}{{\Delta}}$ due to Lemma 7. In summary we find

\displaystyle\left\lVert h^{Q}_{\Delta}|_{A^{c}}-h^{Q}|_{A^{c}}\right\rVert

\displaystyle<\frac{C}{\Delta}\Delta^{2}\left\lVert\mathcal{Q}\right\rVert^{2}\left\lVert h^{Q}|_{A^{c}}\right\rVert=\Delta L\left\lVert h^{Q}\right\rVert\,,

which concludes the proof. ∎

Proposition 17.

[Krak, 2020, Prop 7] Fix any $\Delta>0$ , and let $\mathcal{T}_{\Delta}$ denote the set of transition matrices that dominate $e^{\underline{Q}\Delta}$ . Choose any $T_{0}\in\mathcal{T}_{\Delta}$ . For all $n\in\mathbb{N}_{0}$ , let $h_{n}$ be the (unique) non-negative solution to $h_{n}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}T_{n}h_{n}$ , and let $T_{n+1}\in\mathcal{T}_{\Delta}$ be such that $T_{n+1}h_{n}=e^{\underline{Q}\Delta}h_{n}$ .

Then $\lim_{n\to+\infty}h_{n}=\underline{h}_{\Delta}$ .

Proof.

The preconditions of the reference actually require every $T_{n}$ to be an extreme point of $\mathcal{T}_{\Delta}$ , but inspection of the proof of [Krak, 2020, Prop 7] shows that this is not required; the superfluous condition is only used to streamline the statement of an algorithmic result further on in that work. ∎

We next need some results that involve transition matrices ${}^{P}T_{t}^{s}$ corresponding to (not-necessarily homogeneous) Markov chains $P\in\mathbb{P}^{\mathrm{M}}_{\mathcal{Q}}$ . We recall from Section 2.2 that these are defined for any $t,s\in\mathbb{R}_{\geq 0}$ with $t\leq s$ as

{}^{P}T_{t}^{s}(x,y)\coloneqq P(X_{s}=y\,|\,X_{t}=x)\quad\text{for all $x,y\in\mathcal{X}$.}

Lemma 8.

Consider the sequence $(h_{n})_{n\in\mathbb{N}_{0}}$ constructed as in Proposition 17. For any $n\in\mathbb{N}_{0}$ , there is a Markov chain $P_{n+1}\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}}$ with corresponding transition matrix ${}^{(n+1)}T_{0}^{\Delta}$ such that ${}^{(n+1)}T_{0}^{\Delta}h_{n}=e^{\underline{Q}\Delta}h_{n}$ .

Hence in particular, we can choose the co-sequence $(T_{n})_{n\in\mathbb{N}}$ in Proposition 17 to be $({}^{(n)}T_{0}^{\Delta})_{n\in\mathbb{N}}$ .

Proof.

This follows from [Krak, 2021, Cor 6.24] and the fact that $\mathcal{Q}$ is non-empty, compact, convex, and has separately specified rows. ∎

Proposition 18.

For all $\Delta>0$ there is a Markov chain $P\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}}$ with corresponding transition matrix $T={}^{P}T_{0}^{\Delta}$ , such that the unique solution $h$ to $h=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Th$ satisfies $h=\underline{h}_{\Delta}$ .

Proof.

Let $\mathcal{T}_{\Delta}^{\mathrm{M}}\coloneqq\{{}^{P}T_{0}^{\Delta}\,:\,P\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}}\}$ , and let $(h_{n})_{\in\mathbb{N}}$ be as in Proposition 17, with the co-sequence $(T_{n})_{n\in\mathbb{N}}$ chosen as in Lemma 8 to consist of transition matrices corresponding to Markov chains in $\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}}$ . Then $(T_{n})_{n\in\mathbb{N}}$ lives in $\mathcal{T}_{\Delta}^{\mathrm{M}}$ .

The set $\mathcal{T}_{\Delta}^{\mathrm{M}}$ is compact by [Krak, 2021, Cor 5.18] and the fact that $\mathcal{Q}$ is non-empty, compact, and convex. Hence we can find a subsequence $(T_{n_{j}})_{j\in\mathbb{N}}$ with $\lim_{j\to+\infty}T_{n_{j}}=:T\in\mathcal{T}_{\Delta}^{\mathrm{M}}$ . Since $T\in\mathcal{T}_{\Delta}^{\mathrm{M}}$ , there is a Markov chain $P\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}}$ with corresponding transition matrix $T={}^{P}T_{0}^{\Delta}$ .

Moreover, since $T,T_{n_{j}}\in\mathcal{T}_{\Delta}^{\mathrm{M}}$ , it follows from [Krak, 2021, Cor 6.24] that the transition matrices $T$ and all $T_{n_{j}}$ dominate the lower transition operator $e^{\underline{Q}\Delta}$ . Together with Assumption 2, this allows us to invoke [Krak, 2020, Prop 6], by which we can let $h$ be the unique solution to $h=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Th$ , and it holds for any $j\in\mathbb{N}$ that $h_{n_{j}}|_{A}=0$ , and

h_{n_{j}}|_{A^{c}}=(I-T_{n_{j}}|_{A^{c}})^{-1}\mathbf{1}\Delta\,.

Similarly, it holds that $h|_{A}=0$ , and

h|_{A^{c}}=(I-T|_{A^{c}})^{-1}\mathbf{1}\Delta\,.

Since $\lim_{h\to+\infty}T_{n_{j}}=T$ and by continuity of the map $M\mapsto(I-M)^{-1}$ —which holds since all these inverses exist—it follows that $h|_{A^{c}}=\lim_{j\to+\infty}h_{n_{j}}|_{A^{c}}$ . Since also $h|_{A}=h_{n_{j}}|_{A}$ , it follows that $\lim_{j\to+\infty}h_{n_{j}}=h$ .

By Proposition 17 we have $\lim_{n\to+\infty}h_{n}=\underline{h}_{\Delta}$ , and hence we conclude that $\underline{h}_{\Delta}=\lim_{j\to+\infty}h_{n_{j}}=h$ . ∎

Proposition 19.

Fix any $t\geq 0$ and consider any Markov chain $P\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}}$ with transition matrix ${}^{P}T_{0}^{t}$ . Choose any $\epsilon>0$ . Then there is some $m\in\mathbb{N}$ such that for all $n\geq m$ there are $Q_{1},\ldots,Q_{n}\in\mathcal{Q}$ , such that

\left\lVert{}^{P}T_{0}^{t}-\prod_{i=1}^{n}(I+\nicefrac{{t}}{{n}}Q_{i})\right\rVert<\epsilon\,.

Proof.

The result is trivial if $t=0$ , so let us consider the case where $t>0$ . Let $\epsilon^{\prime}\coloneqq\nicefrac{{\epsilon}}{{2t}}$ . By [Krak, 2021, Lemma 5.12] there is some $m\in\mathbb{N}$ such that for all $n\geq m$ and with $\Delta\coloneqq\nicefrac{{t}}{{n}}$ , for all $i=1,\ldots,n$ there is some $Q_{i}\in\mathcal{Q}$ such that

\left\lVert{}^{P}T_{(i-1)\Delta}^{i\Delta}-(I+\Delta Q_{i})\right\rVert\leq\Delta\epsilon^{\prime}\,.

Since $P$ is a Markov chain, we can factor its transition matrices [Krak, 2021, Prop 5.1] as

{}^{P}T_{0}^{t}={}^{P}T_{0}^{\Delta}{}^{P}T_{\Delta}^{2\Delta}\cdots{}^{P}T_{t-\Delta}^{t}=\prod_{i=1}^{n}{}^{P}T_{(i-1)\Delta}^{i\Delta}\,.

Using [Krak, 2021, Lemma B.5] for the first inequality, we have

	$\displaystyle\left\lVert{}^{P}T_{0}^{t}-\prod_{i=1}^{n}(I+\Delta Q_{i})\right\rVert$
	$\displaystyle=\left\lVert\prod_{i=1}^{n}{}^{P}T_{(i-1)\Delta}^{i\Delta}-\prod_{i=1}^{n}(I+\Delta Q_{i})\right\rVert$
	$\displaystyle\leq\sum_{i=1}^{n}\left\lVert{}^{P}T_{(i-1)\Delta}^{i\Delta}-(I+\Delta Q_{i})\right\rVert$
	$\displaystyle\leq\sum_{i=1}^{n}\Delta\epsilon^{\prime}=n\frac{t}{n}\frac{\epsilon}{2t}=\frac{\epsilon}{2}\,,$

which concludes the proof. ∎

Lemma 9.

Consider a sequence $(Q_{n})_{n\in\mathbb{N}}$ in $\mathcal{Q}$ with limit $Q_{*}\coloneqq\lim_{n\to+\infty}Q_{n}$ . For all $n\in\mathbb{N}$ , let $h_{n}$ denote the minimal non-negative solution to $\mathbb{I}_{A}h_{n}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Q_{n}h_{n}$ , and let $h_{*}$ denote the minimal non-negative solution to $\mathbb{I}_{A}h_{*}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Q_{*}h_{*}$ . Then $h_{*}=\lim_{n\to+\infty}h_{n}$ .

Proof.

Since $\mathcal{Q}$ is closed, we have $Q_{*}\in\mathcal{Q}$ . Let $(G_{n})_{n\in\mathbb{N}}$ and $G_{*}$ denote the subgenerators of $(Q_{n})_{n\in\mathbb{N}}$ and $Q_{*}$ , respectively. Then $G_{*}^{-1}$ and $G_{n}^{-1}$ , $n\in\mathbb{N}$ exist by Corollary 1, and hence we also have $\lim_{n\to+\infty}G_{n}^{-1}=G_{*}^{-1}$ . Right-multiplying with $-\mathbf{1}$ and applying Proposition 10 gives

\lim_{n\to+\infty}h_{n}|_{A^{c}}=\lim_{n\to+\infty}-G_{n}^{-1}\mathbf{1}=-G_{*}^{-1}\mathbf{1}=h_{*}|_{A^{c}}\,.

Finally, by definition we trivially have $h_{n}(x)=0=h_{*}(x)$ for all $x\in A$ . Hence also $\lim_{n\to+\infty}h_{n}|_{A}=h_{*}|_{A}$ . ∎

Lemma 10.

[Krak et al., 2019, Cor 13] Fix any $\Delta>0$ and let $\underline{h}_{\Delta}$ be the minimal non-negative solution to the non-linear system (12). Let $\mathcal{T}_{\Delta}$ denote the set of transition matrices that dominate $e^{\underline{Q}\Delta}$ and, for all $T\in\mathcal{T}_{\Delta}$ , let $h_{T}$ denote the minimal non-negative solution to the linear system $h_{T}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Th_{T}$ . Then it holds that

\underline{h}_{\Delta}=\inf_{T\in\mathcal{T}_{\Delta}}h_{T}\,.

Proof of Proposition 16.

We only give the proof for the lower hitting times, i.e. that $\lim_{\Delta\to 0^{+}}\left\lVert\underline{h}_{\Delta}-\underline{h}\right\rVert=0$ . The argument for the upper hitting times is completely analogous.

Choose any two sequences $(\Delta_{n})_{n\in\mathbb{N}}$ and $(\epsilon_{n})_{n\in\mathbb{N}}$ in $\mathbb{R}_{>0}$ such that $\lim_{n\to+\infty}\Delta_{n}=0$ and $\lim_{n\to+\infty}\epsilon_{n}=0$ . We will assume without loss of generality that $\Delta_{n}\left\lVert\mathcal{Q}\right\rVert\leq 1$ for all $n\in\mathbb{N}$ , where $\left\lVert\mathcal{Q}\right\rVert=\sup_{Q\in\mathcal{Q}}\left\lVert Q\right\rVert$ .

Now first fix any $n\in\mathbb{N}$ , and consider $\underline{h}_{\Delta_{n}}$ . By Proposition 18 there is a Markov chain $P_{n}\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{M}}$ with transition matrix $T_{n}\coloneqq{}^{P_{n}}T_{0}^{\Delta_{n}}$ such that the unique solution $h_{n}$ to $h_{n}=\Delta_{n}\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}T_{n}h_{n}$ satisfies $h_{n}=\underline{h}_{\Delta_{n}}$ .

By Proposition 19, there are $m_{n}\in\mathbb{N}$ with $m_{n}\geq n$ and $Q^{(n)}_{1},\ldots,Q^{(n)}_{m_{n}}$ in $\mathcal{Q}$ such that, with

\Phi_{n}\coloneqq\prod_{i=1}^{m_{n}}\left(I+\frac{\Delta_{n}}{m_{n}}Q_{i}^{(n)}\right)\,,

it holds that $\left\lVert T_{n}-\Phi_{n}\right\rVert<\epsilon_{n}$ . Now define

Q_{n}\coloneqq\sum_{i=1}^{m_{n}}\frac{1}{m_{n}}Q_{i}^{(n)}\,.

Then $Q_{n}\in\mathcal{Q}$ since $\mathcal{Q}$ is convex. Let $h_{Q_{n}}$ denote the minimal non-negative solution to $\mathbb{I}_{A}h_{Q_{n}}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Q_{n}h_{Q_{n}}$ .

By repeating this construction for all $n\in\mathbb{N}$ , we obtain a sequence $(Q_{n})_{n\in\mathbb{N}}$ in $\mathcal{Q}$ . Since $\mathcal{Q}$ is (sequentially) compact, we can consider a subsequence $(Q_{n_{j}})_{j\in\mathbb{N}}$ such that $\lim_{j\to+\infty}Q_{n_{j}}=:Q_{*}\in\mathcal{Q}$ .

Let $h_{*}$ be the minimal non-negative solution to $\mathbb{I}_{A}h_{*}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Q_{*}h_{*}$ . We now need to estimate some norm bounds that hold by choosing $j$ large enough. Let $K=5$ and fix any $\delta>0$ .

Since $(Q_{n_{j}})_{j\in\mathbb{N}}$ converges to $Q_{*}$ , it follows from Lemma 9 that for $j$ large enough, we have

\left\lVert h_{Q_{n_{j}}}-h_{*}\right\rVert<\frac{\delta}{K}

(16)

Since $h_{*}$ is bounded, this also implies that the sequence $(h_{Q_{n_{j}}})_{j\in\mathbb{N}}$ is eventually uniformly bounded above in norm by some constant $M\geq 0$ , say.

For all $j\in\mathbb{N}$ , let $\hat{h}_{n_{j}}$ be such that $\hat{h}_{n_{j}}|_{A^{c}}\coloneqq(I-e^{G_{n_{j}}\Delta_{n_{j}}})^{-1}\Delta_{n_{j}}\mathbf{1}$ and $\hat{h}_{n_{j}}|_{A}\coloneqq 0$ . Then

\hat{h}_{n_{j}}=\Delta_{n_{j}}\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}e^{Q_{n_{j}}\Delta_{n_{j}}}\hat{h}_{n_{j}}\,.

For $j$ large enough we eventually have $\Delta_{n_{j}}\xi<1$ , and so by Proposition 15, we then have

	$\displaystyle\left\lVert\hat{h}_{n_{j}}-h_{Q_{n_{j}}}\right\rVert$	$\displaystyle<\Delta_{n_{j}}L\left\lVert h_{Q_{n_{j}}}\right\rVert$
		$\displaystyle\leq\Delta_{n_{j}}LM\,,$

with $L,M$ independent of $j$ . Hence for $j$ large enough we have

\left\lVert\hat{h}_{n_{j}}-h_{Q_{n_{j}}}\right\rVert<\frac{\delta}{K}\,.

(17)

Let next $\tilde{h}_{n_{j}}$ be the minimal non-negative solution to $\tilde{h}_{n_{j}}=\Delta_{n_{j}}\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\Phi_{n_{j}}\tilde{h}_{n_{j}}$ . Since $m_{n_{j}}\geq n_{j}$ , for $j$ large enough we have $\left\lVert\Phi_{n_{j}}|_{A^{c}}\right\rVert<1$ due to Assumption 2.

By [Krak, 2021, Lemmas B.8 and B.12] we have

\left\lVert\Phi_{n_{j}}-e^{Q_{n_{j}}\Delta_{n_{j}}}\right\rVert\leq 2\Delta_{n_{j}}^{2}\left\lVert\mathcal{Q}\right\rVert^{2}\,,

and so, for any $\epsilon>0$ , we can choose $j$ large enough so that eventually $\left\lVert\Phi_{n_{j}}-e^{Q_{n_{j}}\Delta_{n_{j}}}\right\rVert<\epsilon$ . Using the continuity of the map $T\mapsto(I-T)^{-1}$ on operators $T$ for which this inverse exists, for large enough $j$ we therefore find that

	$\displaystyle\left\lVert\tilde{h}_{n_{j}}\|_{A^{c}}-\hat{h}_{n_{j}}\|_{A^{c}}\right\rVert$
	$\displaystyle=\left\lVert\bigl{(}(I-\Phi_{n_{j}}\|_{A^{c}})^{-1}-(I-e^{Q_{n_{j}}\Delta_{n_{j}}}\|_{A^{c}})^{-1}\bigr{)}\Delta_{n_{j}}\mathbf{1}\right\rVert$
	$\displaystyle<\Delta_{n_{j}}\frac{\delta}{K}\leq\frac{\delta}{K}\,.$

Since $\tilde{h}_{n_{j}}|_{A}=0=\hat{h}_{n_{j}}|_{A}$ , this implies that then also

\left\lVert\tilde{h}_{n_{j}}-\hat{h}_{n_{j}}\right\rVert<\frac{\delta}{K}\,.

(18)

Next, we recall that $\underline{h}_{\Delta_{n_{j}}}=h_{n_{j}}$ , and

\left\lVert T_{n_{j}}-\Phi_{n_{j}}\right\rVert<\epsilon_{n_{j}}\,.

Hence by continuity of the map $T\mapsto(I-T)^{-1}$ on operators $T$ for which this inverse exists, for large enough $j$ we find that

	$\displaystyle\left\lVert h_{n_{j}}\|_{A^{c}}-\tilde{h}_{n_{j}}\|_{A^{c}}\right\rVert$
	$\displaystyle=\left\lVert\bigl{(}(I-T_{n_{j}}\|_{A^{c}})^{-1}-(I-\Phi_{n_{j}}\|_{A^{c}})^{-1}\bigr{)}\Delta_{n_{j}}\mathbf{1}\right\rVert$
	$\displaystyle<\Delta_{n_{j}}\frac{\delta}{K}\leq\frac{\delta}{K}\,.$

Since $h_{n_{j}}|_{A}=0=\tilde{h}_{n_{j}}|_{A}$ , this implies that also

\left\lVert h_{n_{j}}-\tilde{h}_{n_{j}}\right\rVert<\frac{\delta}{K}\,.

(19)

Putting Equations (16)–(19) together, we find that for any large enough $j$ it holds that

$\displaystyle\left\lVert\underline{h}_{\Delta_{n_{j}}}-h_{*}\right\rVert$	$\displaystyle=\left\lVert h_{n_{j}}-h_{*}\right\rVert$
	$\displaystyle\leq\left\lVert h_{n_{j}}-\tilde{h}_{n_{j}}\right\rVert$
	$\displaystyle\quad+\left\lVert\tilde{h}_{n_{j}}-\hat{h}_{n_{j}}\right\rVert$
	$\displaystyle\quad\quad\quad+\left\lVert\hat{h}_{n_{j}}-h_{Q_{n_{j}}}\right\rVert$
	$\displaystyle\quad\quad\quad\quad+\left\lVert h_{Q_{n_{j}}}-h_{*}\right\rVert$
	$\displaystyle<4\frac{\delta}{K}\,.$	(20)

Since $\delta>0$ is arbitrary this clearly implies that

\lim_{j\to+\infty}\underline{h}_{\Delta_{n_{j}}}=h_{*}\,.

(21)

Next, let us show that $h_{*}=\underline{h}$ . To this end, assume ex absurdo that there is some $Q\in\mathcal{Q}$ such that $h^{Q}(x)<h_{*}(x)$ for some $x\in A^{c}$ . Let $\delta\coloneqq h_{*}(x)-h_{Q}(x)>0$ . Due to Corollary 2, for any $\Delta>0$ small enough it holds that

\left\lVert h_{\Delta}^{Q}-h^{Q}\right\rVert<\frac{\delta}{K}\,.

This implies in particular that for large enough $j$ it holds that $h_{\Delta_{n_{j}}}^{Q}(x)<h^{Q}(x)+\nicefrac{{\delta}}{{K}}$ . Moreover, it follows from Equation (20) that for large enough $j$ we have $\underline{h}_{\Delta_{n_{j}}}(x)>h_{*}(x)-4\nicefrac{{\delta}}{{K}}$ . It holds that $h_{Q}(x)=h_{*}(x)-\delta$ , and hence, since $K=5$ , we find that that for large enough $j$ ,

	$\displaystyle h_{\Delta_{n_{j}}}^{Q}(x)$	$\displaystyle<h^{Q}(x)+\nicefrac{{\delta}}{{K}}$
		$\displaystyle=h_{*}(x)-\delta+\nicefrac{{\delta}}{{K}}$
		$\displaystyle=h_{*}(x)-K\frac{\delta}{K}+\nicefrac{{\delta}}{{K}}$
		$\displaystyle=h_{*}(x)-(K-1)\frac{\delta}{K}$
		$\displaystyle=h_{*}(x)-4\frac{\delta}{K}<\underline{h}_{\Delta_{n_{j}}}(x)\,.$

In other words, and using Lemma 10, we then have

h^{Q}_{\Delta_{n_{j}}}(x)<\underline{h}_{\Delta_{n_{j}}}(x)=\inf_{T\in\mathcal{T}_{\Delta_{n_{j}}}}h_{T}(x)\leq h^{Q}_{\Delta_{n_{j}}}(x)\,,

where the last step used that $e^{Q\Delta_{n_{j}}}\in\mathcal{T}_{\Delta_{n_{j}}}$ . From this contradiction we conclude that our earlier assumption must be wrong, and so it holds that $h_{*}(x)\leq h^{Q}(x)$ for all $x\in\mathcal{X}$ and $Q\in\mathcal{Q}$ . This implies that $h_{*}\leq\underline{h}$ . Since it clearly also holds that $\underline{h}\leq h_{*}$ because $Q_{*}\in\mathcal{Q}$ , this implies that, indeed as claimed, $h_{*}=\underline{h}$ .

In summary, at this point we have shown that for any sequence $(\Delta_{n})_{n\in\mathbb{N}}$ in $\mathbb{R}_{>0}$ with $\lim_{n\to+\infty}\Delta_{n}=0$ , there is a subsequence such that $\lim_{j\to+\infty}\underline{h}_{\Delta_{n_{j}}}=\underline{h}$ .

So, finally, suppose ex absurdo that $\lim_{\Delta\to 0^{+}}\underline{h}_{\Delta}\neq\underline{h}$ . Then there is some sequence $(\Delta_{n})_{n\in\mathbb{N}}$ in $\mathbb{R}_{>0}$ such that $\lim_{n\to+\infty}\Delta_{n}=0$ , and some $\epsilon>0$ , such that $\left\lVert\underline{h}_{\Delta_{n}}-\underline{h}\right\rVert\geq\epsilon$ for all $n\in\mathbb{N}$ . By the above result, there is a subsequence such that $\lim_{j\to+\infty}\underline{h}_{\Delta_{n_{j}}}=\underline{h}$ , which is a contradiction. ∎

Proof of Theorem 1.

The crucial approach of this proof is to emulate Erreygers [2021, Sec 6.3] and consider discretized and truncated hitting times. By taking appropriate limits of such approximations, we then recover the “real” hitting times. We however need to be a bit careful with these constructions, since lower (and upper) expectation operators for continuous-time imprecise-Markov chains are not necessarily continuous with respect to arbitrary limits of such approximations [Erreygers, 2021, Chap 5]. This—fairly long—proof is therefore roughly divided into two parts; first, we construct a specific sequence of approximations, and establish the relevant continuity properties with respect to this sequence. Then, in the second part of this proof, we use this continuity to establish the main claim of this theorem.

To this end, for any $t\in\mathbb{R}_{\geq 0}$ and $\Delta\in\mathbb{R}_{>0}$ , we first consider a fixed-step grid $\nu_{\Delta}^{t}$ over $[0,t]$ with step-size $\Delta$ , as

\nu_{\Delta}^{t}\coloneqq\bigl{\{}i\Delta\,:\,i\in\mathbb{N}_{0},i\Delta\leq t\bigr{\}}\,.

(22)

We define the associated approximate hitting time functions $\tau_{\Delta}^{t}:\Omega_{\mathbb{R}_{\geq 0}}\to\mathbb{R}$ for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$ as

\tau_{\Delta}^{t}(\omega)\coloneqq\min\Bigl{(}\bigl{\{}s\in\nu_{\Delta}^{t}\,:\,\omega(s)\in A\}\cup\{t\}\Bigr{)}\,.

(23)

Then by [Erreygers, 2021, Lemma 6.19], as we take the time-horizon $t$ to infinity and the step-size $\Delta$ to zero, we have the point-wise limit to the actual hitting time function $\tau_{\mathbb{R}_{\geq 0}}$ , in that

\tau_{\mathbb{R}_{\geq 0}}(\omega)=\lim_{t\to+\infty,\Delta\to 0^{+}}\tau_{\Delta}^{t}(\omega)\quad\text{for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$.}

(24)

Let us now construct a specific sequence of approximate hitting time functions that will converge to this limit. To this end, first fix an arbitrary sequence $(\epsilon_{n})_{n\in\mathbb{N}_{0}}$ in $\mathbb{R}_{>0}$ such that $\lim_{n\to+\infty}\epsilon_{n}=0$ . Moreover, for any $n\in\mathbb{N}_{0}$ , we introduce the (discrete-time) truncated hitting time $\tau_{0:n}:\Omega_{\mathbb{N}_{0}}\to\mathbb{R}$ , defined for all $\omega\in\Omega_{\mathbb{N}_{0}}$ as

\tau_{0:n}(\omega)\coloneqq\min\Bigl{(}\bigl{\{}t\in\{0,\ldots,n\}\,:\,\omega(t)\in A\bigr{\}}\cup\{n\}\Bigr{)}\,.

Now fix any $k\in\mathbb{N}_{0}$ , let $\Delta_{k}\coloneqq 2^{-k}$ , and let $\mathcal{T}_{k}$ denote the set of transition matrices that dominate $e^{\underline{Q}\Delta_{k}}$ . We now consider discrete-time imprecise-Markov chains parameterized by $\mathcal{T}_{k}$ . As discussed in [Krak et al., 2019], for all $n\in\mathbb{N}_{0}$ there are functions⁹⁹9These represent lower and an upper expectations with respect to a game-theoretic imprecise-Markov chain, but the details don’t concern us here. $\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]$ and $\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]$ in $\smash{\mathbb{R}^{\mathcal{X}}}$ such that

	$\displaystyle\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,\|\,X_{0}]$	$\displaystyle\leq\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{I}}[\tau_{0:n}\,\|\,X_{0}]$
		$\displaystyle\leq\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{I}}[\tau_{0:n}\,\|\,X_{0}]\leq\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,\|\,X_{0}]$

that, moreover, satisfy

\lim_{n\to+\infty}\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]=\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]=\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]

and

\lim_{n\to+\infty}\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]=\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]=\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]\,.

We already noted in Section 5 that the functions $\underline{h}_{\Delta_{k}}$ and $\overline{h}_{\Delta_{k}}$ from Equations (12) and (13) satisfy

\underline{h}_{\Delta_{k}}=\Delta_{k}\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]\,\,\text{and}\,\,\overline{h}_{\Delta_{k}}=\Delta_{k}\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]\,.

Combining the above, we find that

\lim_{n\to+\infty}\Delta_{k}\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]=\underline{h}_{\Delta_{k}}

and

\lim_{n\to+\infty}\Delta_{k}\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:n}\,|\,X_{0}]=\overline{h}_{\Delta_{k}}\,.

Hence for all $k\in\mathbb{N}_{0}$ , we can now choose $t_{k}\in\mathbb{N}_{0}$ large enough such that $t_{k}\geq k$ , and so that with $n_{k}=2^{k}t_{k}$ we have both

\left\lVert\Delta_{k}\underline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:{n_{k}}}\,|\,X_{0}]-\underline{h}_{\Delta_{k}}\right\rVert<\epsilon_{k}\,,

(25)

and

\left\lVert\Delta_{k}\overline{\mathbb{E}}_{\mathcal{T}_{k}}^{\mathrm{V}}[\tau_{0:{n_{k}}}\,|\,X_{0}]-\overline{h}_{\Delta_{k}}\right\rVert<\epsilon_{k}\,.

(26)

With these selections, we now define the sequence $(\tau_{k})_{k\in\mathbb{N}_{0}}$ of approximate hitting times as $\tau_{k}\coloneqq\tau_{\Delta_{k}}^{t_{k}}$ for all $k\in\mathbb{N}_{0}$ . Clearly we have $\lim_{k\to+\infty}\Delta_{k}=0$ , and since $t_{k}\geq k$ we also find that $\lim_{k\to+\infty}t_{k}=+\infty$ . Hence by Equation (24) we have the pointwise limit

\tau_{\mathbb{R}_{\geq 0}}(\omega)=\lim_{k\to+\infty}\tau_{k}(\omega)\quad\text{for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$.}

(27)

Having constructed this specific sequence that converges to the “true” hitting time function, we will now demonstrate the relevant continuity properties of the lower- and upper expectations of interest, with respect to this sequence.

To this end, we define $\hat{\tau}:\Omega_{\mathbb{R}_{\geq 0}}\to\mathbb{R}\cup\{+\infty\}$ as

\hat{\tau}(\omega)\coloneqq\sup_{t\in\mathbb{N}_{0}}\sup_{n\in\mathbb{N}_{0}}\tau_{\Delta_{n}}^{t}(\omega)\quad\text{for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$.}

(28)

Then for all $k\in\mathbb{N}_{0}$ we have $\tau_{k}(\omega)=\tau_{\Delta_{k}}^{t_{k}}(\omega)\leq\hat{\tau}(\omega)$ for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$ . Moreover, since every $\tau_{k}$ is non-negative, it holds in fact that $\left\lvert\tau_{k}(\omega)\right\rvert\leq\hat{\tau}(\omega)$ for all $k\in\mathbb{N}_{0}$ and $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$ . This means that if we can show that the upper expectation $\smash{\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}}[\hat{\tau}\,|\,X_{0}=x]$ is bounded for all $x\in\mathcal{X}$ , then we can use the imprecise version of the dominated convergence theorem [Erreygers, 2021, Thm 5.32] to take lower- and upper expectations of the limit in Equation (27). So, we will now show that this boundedness indeed holds.

We note that, for fixed $t\in\mathbb{N}_{0}$ , $\tau_{\Delta_{n}}^{t}$ is monotonically decreasing as we increase $n\in\mathbb{N}_{0}$ . To see this, first consider the grids $\nu_{\Delta_{n}}^{t}$ and $\nu_{\Delta_{n+1}}^{t}$ over $[0,t]$ . For any $s\in\nu_{\Delta_{n}}^{t}$ there is some $i\in\mathbb{N}_{0}$ such that $s=i\Delta_{n}$ , and since $\Delta_{n}=2^{-n}=2\Delta_{n+1}$ , we find that also $s=2i\Delta_{n+1}\in\nu_{\Delta_{n+1}}^{t}$ . Hence we conclude that $\nu_{\Delta_{n}}^{t}\subseteq\nu_{\Delta_{n+1}}^{t}$ . From this set inclusion, we also clearly have for any $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$ that

\{s\in\nu_{\Delta_{n}}^{t}\,:\,\omega(s)\in A\}\subseteq\{s\in\nu_{\Delta_{n+1}}^{t}\,:\,\omega(s)\in A\}\,,

and so together with the fact that $s\leq t$ for all $s\in\nu_{\Delta_{n+1}}^{t}$ , it then follows from Equation (23) that $\tau_{\Delta_{n}}^{t}(\omega)\geq\tau_{\Delta_{n+1}}^{t}(\omega)$ .

Using this observation, we immediately find that for any $t\in\mathbb{N}_{0}$ and $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$ it holds that

\sup_{n\in\mathbb{N}_{0}}\tau_{\Delta_{n}}^{t}(\omega)=\tau_{\Delta_{0}}^{t}(\omega)=\tau_{1}^{t}(\omega)\,,

and so from Equation (28), we have

\hat{\tau}(\omega)=\sup_{t\in\mathbb{N}_{0}}\tau_{1}^{t}(\omega)\,.

Next, we observe that $\tau_{1}^{t}$ is monotonically increasing as we increase $t\in\mathbb{N}_{0}$ . Indeed, for any $t\in\mathbb{N}_{0}$ the grid $\nu_{1}^{t}$ over $[0,t]$ simply constitutes the set $\nu_{1}^{t}=\{0,\ldots,t\}$ . Hence that $\tau_{1}^{t}$ is monotonically increasing as we increase $t\in\mathbb{N}_{0}$ , follows immediately from Equation (23). In particular, this implies that the sequence $(\tau_{1}^{t})_{t\in\mathbb{N}_{0}}$ converges monotonically to $\hat{\tau}$ . Moreover, we have $\tau_{1}^{0}(\omega)=0$ for all $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$ , and so we find that identically

\displaystyle\underline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\tau_{1}^{0}\,|\,X_{0}]=\inf_{P\in\mathcal{P}_{\mathcal{Q}}^{\mathrm{I}}}\mathbb{E}_{P}[\tau_{1}^{0}\,|\,X_{0}]=0\,.

Hence by the continuity of upper expectations with respect to monotonically increasing convergent sequences of functions that are bounded below [Erreygers, 2021, Thm 5.31], we have

\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}]=\lim_{t\to+\infty,t\in\mathbb{N}_{0}}\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\tau_{1}^{t}\,|\,X_{0}]\,.

(29)

Now, for every $t\in\mathbb{N}_{0}$ , $\tau_{1}^{t}$ only depends on finitely many time-points; indeed, $\tau_{1}^{t}(\omega)$ only depends on the value of $\omega(s)$ for $s\in\{0,\ldots,t\}$ . Using [Krak, 2021, Thm 7.2], this means that the lower- and upper expectations of these functions with respect to the imprecise-Markov chain $\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$ , can also be expressed as lower (resp. upper) expectations of this function with respect to an induced discrete-time imprecise-Markov chain. Indeed, since the step-size used in these approximating functions is uniformly equal to one, and using the obvious correspondence between $\tau_{1}^{t}$ and $\tau_{0:t}$ , it is not difficult to see that

\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\tau_{1}^{t}\,|\,X_{0}]=\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{I}}[\tau_{0:t}\,|\,X_{0}]\quad\text{for all $t\in\mathbb{N}_{0}$,}

(30)

where $\mathcal{T}$ is the set of transition matrices that dominates $e^{\underline{Q}}$ .

We now again invoke the previously mentioned results from [Krak et al., 2019]; for any $t\in\mathbb{N}_{0}$ , there is a function $\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{0:t}\,|\,X_{0}]\in\smash{\mathbb{R}^{\mathcal{X}}}$ that satisfies

\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{I}}[\tau_{0:t}\,|\,X_{0}]\leq\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{0:t}\,|\,X_{0}]\,.

(31)

Combining Equations (29), (30), and (31), and using [Krak et al., 2019, Prop 7] to establish the limit on the final right-hand side, we have

	$\displaystyle\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,\|\,X_{0}]$	$\displaystyle=\lim_{t\to+\infty,t\in\mathbb{N}_{0}}\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\tau_{1}^{t}\,\|\,X_{0}]$
		$\displaystyle=\lim_{t\to+\infty,t\in\mathbb{N}_{0}}\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{I}}[\tau_{0:t}\,\|\,X_{0}]$
		$\displaystyle\leq\lim_{t\to+\infty,t\in\mathbb{N}_{0}}\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{0:t}\,\|\,X_{0}]=\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{\mathbb{N}_{0}}\,\|\,X_{0}]\,.$

By [Krak et al., 2019, Thm 12] it holds that

\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]=\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]\,,

and, moreover, that there is some homogeneous discrete-time Markov chain $P\in\mathcal{P}^{\mathrm{HM}}_{\mathcal{T}}$ with associated transition matrix $T={}^{P}T\in\mathcal{T}$ and hitting times $h=\mathbb{E}_{P}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]$ such that $h=\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{HM}}[\tau_{\mathbb{N}_{0}}\,|\,X_{0}]$ . Putting this together, we find that

\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}=x]\leq h(x)\quad\text{for all $x\in\mathcal{X}$.}

(32)

By Proposition 1, $h$ is also the minimal non-negative solution to the system

h=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Th\,.

(33)

It is immediate from the definition that $h|_{A}=0$ , and since $\hat{\tau}$ is clearly non-negative, we obtain from Equation (32) that for all $x\in A$ we have

0\leq\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}=x]\leq h(x)=0\,,

or in other words, that $\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}=x]=0$ for all $x\in A$ . So, it remains to bound this upper expectation on $A^{c}$ .

By our Assumption 2, it holds for all $x\in A^{c}$ that $e^{\underline{Q}}\mathbb{I}_{A}(x)>0$ . Since $e^{\underline{Q}}$ is the lower transition operator corresponding to $\mathcal{T}$ due to Proposition 3, it follows that $e^{\underline{Q}}$ satisfies conditions C1–C3 and R1 from Reference [Krak, 2020]. We now recall that $T={}^{P}T\in\mathcal{T}$ . Since the preconditions C1–C3 and R1 of this reference are all satisfied, we can now invoke [Krak, 2020, Lemma 10], which states that the inverse operator $(I-T|_{A^{c}})^{-1}$ exists.

We note that $h|_{A}=0$ , and so $h=(h|_{A^{c}})\!\!\uparrow_{\mathcal{X}}$ . Hence in particular, we have $T|_{A^{c}}h|_{A^{c}}=(Th)|_{A^{c}}$ . From Equation (33), we now find that

h|_{A^{c}}=\mathbf{1}+(Th)|_{A^{c}}=\mathbf{1}+T|_{A^{c}}h|_{A^{c}}\,,

and so re-ordering terms, we have $(I-T|_{A^{c}})h|_{A^{c}}=\mathbf{1}$ . Using the existence of the inverse operator established above, we obtain

h|_{A^{c}}=(I-T|_{A^{c}})^{-1}\mathbf{1}\,.

Since $(I-T|_{A^{c}})$ is an invertible bounded linear operator, also clearly $(I-T|_{A^{c}})^{-1}$ is bounded. Hence we have

\left\lVert h|_{A^{c}}\right\rVert=\left\lVert(I-T|_{A^{c}})^{-1}\mathbf{1}\right\rVert\leq\left\lVert(I-T|_{A^{c}})^{-1}\right\rVert<+\infty\,.

From Equation (32) we find that $\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}=x]<+\infty$ for all $x\in A^{c}$ . In summary, at this point we have shown that $\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,|\,X_{0}=x]$ is bounded for all $x\in\mathcal{X}$ . Since we already established that $\hat{\tau}$ absolutely dominates the sequence $(\tau_{k})_{k\in\mathbb{N}_{0}}$ , we can now finally use the limit (27) and the dominated convergence theorem [Erreygers, 2021, Thm 5.32] to establish that

\limsup_{k\to+\infty}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k}\,|\,X_{0}]\leq\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\,,

(34)

and

\overline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\leq\liminf_{k\to+\infty}\overline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k}\,|\,X_{0}]\,.

(35)

This concludes the first part of this proof. Our next step will be to identify the limits superior and inferior in the above inequalities as corresponding to, respectively, $\underline{h}$ and $\overline{h}$ .

Let us start by obtaining the required result for the lower expectation. From the definition of the limit superior, there is a convergent subsequence such that

\underline{s}\coloneqq\lim_{j\to+\infty}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k_{j}}\,|\,X_{0}]=\limsup_{k\to+\infty}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k}\,|\,X_{0}]\,.

(36)

Now fix any $j\in\mathbb{N}_{0}$ , and consider the approximate function $\tau_{k_{j}}=\tau_{\smash{\Delta_{k_{j}}}}^{t_{k_{j}}}$ . As before, this function really only depends on the system at finitely many time points; specifically, those on the grid $\nu_{\smash{\Delta_{k_{j}}}}^{t_{k_{j}}}$ over $[0,t_{k_{j}}]$ . We can therefore again use [Krak, 2021, Thm 7.2], to express the lower- and upper expectations of this function with respect to the imprecise-Markov chain $\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$ , as lower- and upper expectations of a function with respect to an induced discrete-time imprecise-Markov chain. Since the step size of this grid is now equal to $\Delta_{k_{j}}$ rather than one, this requires a bit more effort than before. In particular, we now need to compensate for the step-size $\Delta_{k_{j}}$ of the grid. Indeed, the corresponding discrete-time imprecise-Markov chain should consider steps that are implicitly of this “length”, so we consider the model induced by the set $\mathcal{T}_{k_{j}}$ of transition matrices that dominate $e^{\underline{Q}\Delta_{k_{j}}}$ . It then remains to find an appropriate translation $\tilde{\tau}_{k_{j}}$ of $\tau_{k_{j}}$ to the domain $\Omega_{\mathbb{N}_{0}}$ .

As a first observation, we note that this “translation” $\tilde{\tau}_{k_{j}}:\Omega_{\mathbb{N}_{0}}\to\mathbb{R}$ should depend on the same number of time points as $\tau_{k_{j}}$ . We note that since $t_{k}\in\mathbb{N}_{0}$ it holds that $t_{k}\in\nu^{t_{k_{j}}}_{\smash{\Delta_{k_{j}}}}$ . Hence it follows from Equation (22) that $\nu^{t_{k_{j}}}_{\smash{\Delta_{k_{j}}}}$ contains exactly $\Delta_{k_{j}}^{-1}t_{k_{j}}=2^{k_{j}}t_{k_{j}}=n_{k_{j}}$ time points, in addition to the origin $0$ , and that $\tau_{k_{j}}$ depends exactly on these time points. Indeed, inspection of Equation (23) reveals that, by re-scaling to compensate for the step size $\Delta_{k_{j}}$ , the quantity $\Delta_{k_{j}}^{-1}\tau_{k_{j}}(\omega)$ simply represents the natural index of the discrete grid element of $\nu^{t_{k_{j}}}_{\smash{\Delta_{k_{j}}}}$ on which $\omega\in\Omega_{\mathbb{R}_{\geq 0}}$ did (or did not) initially hit $A$ . Adapting Equation (23), we therefore define for any $\omega\in\Omega_{\mathbb{N}_{0}}$ that

\displaystyle\tilde{\tau}_{k_{j}}(\omega)\coloneqq\min\Bigl{(}\bigl{\{}s\in\nu_{\Delta_{k_{j}}}^{t_{k_{j}}}\,:\,\omega\bigl{(}\Delta_{k_{j}}^{-1}s\bigr{)}\in A\bigr{\}}\cup\{t_{k_{j}}\}\Bigr{)}\,.

We see that, as required, $\Delta_{k_{j}}^{-1}\tilde{\tau}_{k_{j}}(\omega)$ is again simply the identity of the step on which $\omega\in\Omega_{\mathbb{N}_{0}}$ did (or did not) initially hit $A$ . This implies the relation to the discrete-time truncated hitting time $\tau_{0:n_{k_{j}}}$ ; for any $\omega\in\Omega_{\mathbb{N}_{0}}$ we have

	$\displaystyle\tilde{\tau}_{k_{j}}(\omega)$
	$\displaystyle=\Delta_{k_{j}}\Delta_{k_{j}}^{-1}\tilde{\tau}_{k_{j}}(\omega)$
	$\displaystyle=\Delta_{k_{j}}\min\Bigl{(}\bigl{\{}s\in\Delta_{k_{j}}^{-1}\nu_{\Delta_{k_{j}}}^{t_{k_{j}}}\,:\,\omega(s)\in A\bigr{\}}\cup\{\Delta_{k_{j}}^{-1}t_{k_{j}}\}\Bigr{)}$
	$\displaystyle=\Delta_{k_{j}}\min\Bigl{(}\bigl{\{}s\in\{0,\ldots,n_{k_{j}}\}\,:\,\omega\bigl{(}s\bigr{)}\in A\bigr{\}}\cup\{n_{k_{j}}\}\Bigr{)}$
	$\displaystyle=\Delta_{k_{j}}\tau_{0:n_{k_{j}}}(\omega)\,,$

and so we simply have that $\tilde{\tau}_{k_{j}}=\Delta_{k_{j}}\tau_{0:n_{k_{j}}}$ . Following the discussion in [Krak, 2021, Chap 7], and [Krak, 2021, Thm 7.2] in particular, we therefore find the identity

\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k_{j}}\,|\,X_{0}]=\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]\quad\text{for all $j\in\mathbb{N}_{0}$.}

(37)

We again recall from Krak et al. [2019] the objects $\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]$ in $\smash{\mathbb{R}^{\mathcal{X}}}$ satisfying

\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]\leq\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]\quad\text{for all $j\in\mathbb{N}_{0}$.}

Hence from Equations (36) and (37) we now find that

$\displaystyle\underline{s}$	$\displaystyle=\lim_{j\to+\infty}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k_{j}}\,\|\,X_{0}]$
	$\displaystyle=\lim_{j\to+\infty}\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,\|\,X_{0}]$
	$\displaystyle\geq\lim_{j\to+\infty}\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,\|\,X_{0}]\,.$	(38)

We now note that, for all $j\in\mathbb{N}_{0}$ , we have

	$\displaystyle\left\lVert\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,\|\,X_{0}]-\underline{h}\right\rVert$
	$\displaystyle\quad\leq\left\lVert\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,\|\,X_{0}]-\underline{h}_{\Delta_{k_{j}}}\right\rVert+\left\lVert\underline{h}_{\Delta_{k_{j}}}-\underline{h}\right\rVert$
	$\displaystyle\quad<\epsilon_{k_{j}}+\left\lVert\underline{h}_{\Delta_{k_{j}}}-\underline{h}\right\rVert\,,$

where we used Equation (25) for the final inequality. Using that $\lim_{j\to+\infty}\epsilon_{k_{j}}=0$ and $\lim_{j\to+\infty}\Delta_{k_{j}}=0$ , together with Proposition 16, we see that both summands vanish as we increase $j\in\mathbb{N}_{0}$ , and so we have

\lim_{j\to+\infty}\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,|\,X_{0}]=\underline{h}\,.

(39)

We already established in Section 5 that $\underline{h}=\underline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]$ . Hence by combining Equations (34), (36), (38), and (39), we now find that

\underline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]=\underline{h}\leq\underline{s}\leq\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\,.

(40)

However, as noted in Section 3.1 we have the inclusion $\mathcal{P}^{\mathrm{HM}}_{\mathcal{Q}}\subseteq\mathcal{P}^{\mathrm{M}}_{\mathcal{Q}}\subseteq\mathcal{P}^{\mathrm{I}}_{\mathcal{Q}}$ , so it immediately follows from the definition of the lower expectations that

\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\leq\underline{\mathbb{E}}^{\mathrm{M}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\leq\underline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\,.

Hence by Equation (40) we obtain the identity

\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]=\underline{\mathbb{E}}^{\mathrm{M}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]=\underline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]\,,

which concludes the proof that the lower expected hitting times are the same for all three types of continuous-time imprecise-Markov chains. We omit the proof for the upper expected hitting times; this is completely analogous, starting instead from Equation (35) and using the norm bound (26) to pass to the limit $\overline{h}=\overline{\mathbb{E}}^{\mathrm{HM}}_{\mathcal{Q}}[\tau_{\mathbb{R}_{\geq 0}}\,|\,X_{0}]$ . ∎

Proof of Theorem 2.

First fix any $\Delta>0$ . For $\underline{h}_{\Delta}$ it then holds that

\displaystyle\underline{h}_{\Delta}=\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}e^{\underline{Q}\Delta}\underline{h}_{\Delta}\,.

Since $\underline{h}_{\Delta}(x)=0$ for all $x\in A$ , we have $\underline{h}_{\Delta}=\mathbb{I}_{A^{c}}\underline{h}_{\Delta}$ and $\mathbb{I}_{A}\underline{h}_{\Delta}=0$ . We can therefore rearrange terms and add $\mathbb{I}_{A}\underline{h}_{\Delta}$ to the above, to obtain

\mathbb{I}_{A}\underline{h}_{\Delta}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\frac{e^{\underline{Q}\Delta}-I}{\Delta}\underline{h}_{\Delta}\,.

Because the individual limits exist by Proposition 16 and [De Bock, 2017, Prop 9], taking $\Delta$ to zero yields

	$\displaystyle\mathbb{I}_{A}\underline{h}$	$\displaystyle=\mathbb{I}_{A}\lim_{\Delta\to 0^{+}}\underline{h}_{\Delta}$
		$\displaystyle=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\lim_{\Delta\to 0^{+}}\frac{e^{\underline{Q}\Delta}-I}{\Delta}\underline{h}_{\Delta}$
		$\displaystyle=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\lim_{\Delta\to 0^{+}}\frac{e^{\underline{Q}\Delta}-I}{\Delta}\lim_{\Delta\to 0^{+}}\underline{h}_{\Delta}$
		$\displaystyle=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\underline{Q}\,\underline{\vphantom{Q}h}\,.$

So, $\underline{h}$ is indeed a solution to the system $\mathbb{I}_{A}\underline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\underline{Q}\,\underline{\vphantom{Q}h}$ . It follows from a completely analogous argument that also $\overline{h}$ is a solution to $\mathbb{I}_{A}\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,\overline{h}$ .

That $\underline{h}$ and $\overline{h}$ are non-negative is clear. We now first show that $\underline{h}$ is the minimal non-negative solution to its corresponding system. To this end, suppose ex absurdo that there is some non-negative $h\in\smash{\mathbb{R}^{\mathcal{X}}}$ such that $h(x)<\underline{h}(x)$ for some $x\in\mathcal{X}$ , and $\mathbb{I}_{A}h=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\underline{Q}h$ . Then clearly $x\in A^{c}$ since $\underline{h}(y)=0$ for all $y\in A$ and $h$ is non-negative.

By Proposition 4, there is then some $Q\in\mathcal{Q}$ such that $Qh=\underline{Q}h$ , and so also $\mathbb{I}_{A}h=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Qh$ . By Proposition 2 there is some minimal non-negative solution $h_{*}$ to the system $\mathbb{I}_{A}h_{*}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Qh_{*}$ , where the minimality implies in particular that $h_{*}\leq h$ . Since $Q\in\mathcal{Q}$ , we obtain

h_{*}(x)\leq h(x)<\underline{h}(x)=\inf_{Q^{\prime}\in\mathcal{Q}}h^{Q^{\prime}}(x)\leq h_{*}(x)\,,

which is a contradiction.

We next show that $\overline{h}$ is the minimal non-negative solution to its corresponding system; this will require a bit more effort and we need to start with some auxiliary constructions. By Proposition 4, there is some $Q\in\mathcal{Q}$ such that $Q\overline{h}=\overline{Q}\,\overline{h}$ . Let $G$ be the subgenerator of $Q$ .

Consider the $\xi>0$ from Proposition 11, and let $\left\lVert\cdot\right\rVert_{*}$ be the norm from Section 4.1. Since $\smash{\mathbb{R}^{A^{c}}}$ is finite-dimensional, the norms $\left\lVert\cdot\right\rVert$ and $\left\lVert\cdot\right\rVert_{*}$ are equivalent, whence there is some $c\in\mathbb{R}_{>0}$ such that $\left\lVert f\right\rVert_{*}\leq c\left\lVert f\right\rVert$ for all $f\in\smash{\mathbb{R}^{A^{c}}}$ .

Now let $\Delta>0$ be such that $\Delta\left\lVert Q\right\rVert\leq 1$ , $\Delta\left\lVert\underline{Q}\right\rVert\leq 1$ , $\Delta\xi<\nicefrac{{2}}{{3}}$ , and $\Delta c\left\lVert Q\right\rVert^{2}<\nicefrac{{\xi}}{{3}}$ ; this is clearly always possible. Define the map $H:\smash{\mathbb{R}^{A^{c}}}\to\smash{\mathbb{R}^{A^{c}}}$ for all $f\in\smash{\mathbb{R}^{A^{c}}}$ as

H(f)\coloneqq f+\Delta\mathbf{1}+\Delta Gf=\Delta\mathbf{1}+(I+\Delta G)f\,.

Let us show that $H$ is a contraction on the Banach space $(\smash{\mathbb{R}^{A^{c}}},\left\lVert\cdot\right\rVert_{*})$ , or in other words, that there is some $\alpha\in[0,1)$ such that $\left\lVert H(f)-H(g)\right\rVert_{*}\leq\alpha\left\lVert f-g\right\rVert_{*}$ for all $f,g\in\smash{\mathbb{R}^{A^{c}}}$ [Renardy and Rogers, 2006, Sec 10.1.1]. So, fix any $f,g\in\smash{\mathbb{R}^{A^{c}}}$ . Then we have

	$\displaystyle\left\lVert H(f)-H(g)\right\rVert_{*}$	$\displaystyle=\left\lVert(I+\Delta G)f-(I+\Delta G)g\right\rVert_{*}$
		$\displaystyle=\left\lVert(I+\Delta G)(f-g)\right\rVert_{*}$
		$\displaystyle\leq\left\lVert I+\Delta G\right\rVert_{}\left\lVert f-g\right\rVert_{}\,,$

from which we find that

	$\displaystyle\left\lVert H(f)-H(g)\right\rVert_{*}$
	$\displaystyle\quad\quad\leq\left(\left\lVert(I+\Delta G)-e^{G\Delta}\right\rVert_{}+\left\lVert e^{G\Delta}\right\rVert_{}\right)\left\lVert f-g\right\rVert_{*}\,.$		(41)

By Proposition 14 we have $\left\lVert e^{G\Delta}\right\rVert_{*}\leq e^{-\xi\Delta}$ , and so using a standard quadratic bound on the negative scalar exponential,

\left\lVert e^{G\Delta}\right\rVert_{*}\leq 1-\Delta\xi+\frac{1}{2}\Delta^{2}\xi^{2}<1-\frac{2}{3}\Delta\xi\,,

(42)

where we used that $\Delta\xi<\nicefrac{{2}}{{3}}$ .

Moreover, we have that

	$\displaystyle\left\lVert(I+\Delta G)-e^{G\Delta}\right\rVert_{*}$	$\displaystyle\leq c\left\lVert(I+\Delta G)-e^{G\Delta}\right\rVert$
		$\displaystyle\leq c\left\lVert(I+\Delta Q)-e^{Q\Delta}\right\rVert$
		$\displaystyle\leq c\Delta^{2}\left\lVert Q\right\rVert^{2}\,,$

where the second inequality used Lemmas 3 and 4 and Corollary 3; and the final inequality used [Krak, 2021, Lemma B.8]. Since $\Delta c\left\lVert Q\right\rVert^{2}<\nicefrac{{\xi}}{{3}}$ we have

\left\lVert(I+\Delta G)-e^{G\Delta}\right\rVert_{*}<\frac{1}{3}\Delta\xi\,.

(43)

Combining Equations (41), (42), and (43) we obtain

\left\lVert H(f)-H(g)\right\rVert_{*}\leq\left(1-\frac{1}{3}\Delta\xi\right)\left\lVert f-g\right\rVert_{*}\,.

Since $\Delta>0$ , $\xi>0$ , and $\Delta\xi<\nicefrac{{2}}{{3}}$ , we conclude that $H$ is indeed a contraction. Hence by the Banach fixed-point theorem [Renardy and Rogers, 2006, Thm 10.1], there is a unique fixed point $f\in\smash{\mathbb{R}^{A^{c}}}$ such that $H(f)=f$ and, for any $g\in\smash{\mathbb{R}^{A^{c}}}$ , it holds that

\lim_{n\to+\infty}H^{n}(g)=f\,.

(44)

It is easy to see that this unique fixed point is given by $\overline{h}|_{A^{c}}$ . Indeed, from the choice of $Q$ and the fact that $\overline{h}$ satisfies $\mathbb{I}_{A}\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,\overline{h}$ , we have

\mathbb{I}_{A}\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}Q\overline{h}=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,\overline{h}\,.

Moreover, since $\overline{h}|_{A}=0$ we have $\mathbb{I}_{A}\overline{h}=0$ and $\overline{h}=(\overline{h}|_{A^{c}})\!\!\uparrow_{\mathcal{X}}$ , whence

\bigl{(}Q\,\overline{h}\bigr{)}|_{A^{c}}=\bigl{(}Q\,(\overline{h}|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\bigr{)}|_{A^{c}}=G\,\overline{h}|_{A^{c}}\,.

Noting that $\mathbb{I}_{A^{c}}|_{A^{c}}=\mathbf{1}$ , after multiplying with $\Delta$ we find that $\Delta\mathbf{1}+\Delta G\,\overline{h}|_{A^{c}}=0$ . Comparing with the definition of $H$ , we have

\displaystyle H(\overline{h}|_{A^{c}})=\overline{h}|_{A^{c}}+\Delta\mathbf{1}+\Delta G\,\overline{h}|_{A^{c}}=\overline{h}|_{A^{c}}+0=\overline{h}|_{A^{c}}\,,

so $\overline{h}|_{A^{c}}$ is indeed a fixed-point of $H$ . Since we already established that $H$ has a unique fixed-point, we conclude from Equation (44) that

\overline{h}|_{A^{c}}=\lim_{n\to+\infty}H^{n}(g)\quad\text{for all $g\in\smash{\mathbb{R}^{A^{c}}}$.}

(45)

Next let us show that $H$ is monotone. To this end, fix any $f,g\in\smash{\mathbb{R}^{A^{c}}}$ such that $f\leq g$ ; then clearly also $f\!\!\uparrow_{\mathcal{X}}\leq g\!\!\uparrow_{\mathcal{X}}$ . Since $\Delta>0$ is such that $\Delta\left\lVert Q\right\rVert\leq 1$ , it follows from [Krak, 2021, Prop 4.9] that $(I+\Delta Q)$ is a transition matrix. By the monotonicity of transition matrices [Krak, 2021, Prop 3.32], we find that therefore

(I+\Delta Q)f\!\!\uparrow_{\mathcal{X}}\leq(I+\Delta Q)g\!\!\uparrow_{\mathcal{X}}\,,

which in turn implies that

	$\displaystyle(I+\Delta G)f$	$\displaystyle=\bigl{(}(I+\Delta Q)f\!\!\uparrow_{\mathcal{X}}\bigr{)}\|_{A^{c}}$
		$\displaystyle\leq\bigl{(}(I+\Delta Q)g\!\!\uparrow_{\mathcal{X}}\bigr{)}\|_{A^{c}}=(I+\Delta G)g\,.$

From the definition of $H$ we therefore conclude that $H(f)\leq H(g)$ . Since $f,g$ with $f\leq g$ are arbitrary, this concludes the proof of the monotonicity of $H$ .

Now, let us define $\overline{H}:\smash{\mathbb{R}^{\mathcal{X}}}\to\smash{\mathbb{R}^{\mathcal{X}}}$ for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ as

\overline{H}(f)\coloneqq\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}(I+\Delta\overline{Q})f\,.

We first note that, since $\Delta\left\lVert\underline{Q}\right\rVert\leq 1$ , it follows from [De Bock, 2017, Prop 5] that $(I+\Delta\underline{Q})$ is a lower transition operator. From the conjugacy of $\underline{Q}$ and $\overline{Q}$ , we have for any $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ that

	$\displaystyle-(I+\Delta\underline{Q})(-f)$	$\displaystyle=f+\Delta\cdot-\underline{Q}(-f)$
		$\displaystyle=f+\Delta\overline{Q}f=(I+\Delta\overline{Q})f\,,$

which implies that $(I+\Delta\overline{Q})$ is the upper transition operator that is conjugate to the lower transition operator $(I+\Delta\underline{Q})$ . By the monotonicity of upper transition operators—see Section 3.2—this implies that $\overline{H}$ is monotone, or in other words that for all $f,g\in\smash{\mathbb{R}^{\mathcal{X}}}$ with $f\leq g$ it holds that $\overline{H}(f)\leq\overline{H}(g)$ .

Let us next show that

\overline{H}(f)\geq H(f|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\quad\text{for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ with $f|_{A}=0$.}

(46)

Indeed, if $f|_{A}=0$ then $(f|_{A^{c}})\!\!\uparrow_{\mathcal{X}}=f$ , and since $Q\in\mathcal{Q}$ , it follows from the definitions of $\overline{Q}$ and $G$ that then

(\overline{Q}\,f)|_{A^{c}}\geq(Q\,f)|_{A^{c}}=(Q\,(f|_{A^{c}})\!\!\uparrow_{\mathcal{X}})|_{A^{c}}=Gf|_{A^{c}}\,.

Hence we have

	$\displaystyle\overline{H}(f)\|_{A^{c}}$	$\displaystyle=\Delta\mathbf{1}+f\|_{A^{c}}+\Delta(\overline{Q}\,f)\|_{A^{c}}$
		$\displaystyle\geq\Delta\mathbf{1}+f\|_{A^{c}}+\Delta Gf\|_{A^{c}}=H(f\|_{A^{c}})\,.$

Moreover, we immediately have from the definition that $\overline{H}(f)|_{A}=0=\bigl{(}H(f|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\bigr{)}|_{A}$ , and so Equation (46) indeed holds.

Next, we note that for any $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ it holds that $\overline{H}(f)|_{A}=0$ , and so by Equation (46) we find that

\overline{H}(\overline{H}(f))\geq H(\overline{H}(f)|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\,.

Provided that also $f|_{A}=0$ , then using the previously established monotonicity of $H$ we obtain

	$\displaystyle\overline{H}(\overline{H}(f))$	$\displaystyle\geq H(\overline{H}(f)\|_{A^{c}})\!\!\uparrow_{\mathcal{X}}$
		$\displaystyle\geq H(H(f\|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\|_{A^{c}})\!\!\uparrow_{\mathcal{X}}\,.$

We clearly have $H(f|_{A^{c}})\!\!\uparrow_{\mathcal{X}}|_{A^{c}}=H(f|_{A^{c}})$ , whence

\overline{H}^{2}(f)=\overline{H}(\overline{H}(f))\geq H(H(f|_{A^{c}}))\!\!\uparrow_{\mathcal{X}}=(H^{2}(f|_{A^{c}}))\!\!\uparrow_{\mathcal{X}}\,.

Indeed, we can repeat this reasoning for $n\in\mathbb{N}$ steps, to conclude that

\overline{H}^{n}(f)\geq(H^{n}(f|_{A^{c}}))\!\!\uparrow_{\mathcal{X}}\quad\text{for all $f\in\smash{\mathbb{R}^{\mathcal{X}}}$ with $f|_{A}=0$.}

(47)

Now suppose ex absurdo that there is some non-negative $g\in\smash{\mathbb{R}^{\mathcal{X}}}$ , such that $g(x)<\overline{h}(x)$ for some $x\in\mathcal{X}$ , and such that $\mathbb{I}_{A}g=\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\overline{Q}\,g$ . Since $g$ is non-negative and $\overline{h}|_{A}=0$ , we must have that $x\in A^{c}$ . Moreover, we clearly have $\mathbb{I}_{A}g=0$ , which implies that $g|_{A}=0$ and so $g=\mathbb{I}_{A^{c}}g$ . Hence it follows that $\Delta\mathbb{I}_{A^{c}}+\mathbb{I}_{A^{c}}\Delta\overline{Q}g=0$ , and we find that $\overline{H}(g)=\mathbb{I}_{A^{c}}g=g$ . Hence $g$ is a fixed point of $\overline{H}$ . Since $g|_{A}=0$ , and using Equation (47), this implies that for any $n\in\mathbb{N}_{0}$ we have

g=\overline{H}^{n}(g)\geq(H^{n}(g|_{A^{c}}))\!\!\uparrow_{\mathcal{X}}\,.

Recalling that $x\in A^{c}$ is such that $g(x)<\overline{h}(x)$ , we use Equation (45) to take limits in $n$ and find that

g(x)\geq\lim_{n\to+\infty}H^{n}(g|_{A^{c}})(x)=\overline{h}(x)>g(x)\,,

which is a contradiction. ∎

	$\displaystyle h^{Q}_{\Delta}$	$\displaystyle\|_{A^{c}}-h^{Q}\|_{A^{c}}$
		$\displaystyle=e^{G\Delta}h^{Q}_{\Delta}\|_{A^{c}}-\Delta Gh^{Q}\|_{A^{c}}-h^{Q}\|_{A^{c}}$
		$\displaystyle=e^{G\Delta}h^{Q}_{\Delta}\|_{A^{c}}-(I+\Delta G)h^{Q}\|_{A^{c}}$
		$\displaystyle=e^{G\Delta}(h^{Q}_{\Delta}\|_{A^{c}}-h^{Q}\|_{A^{c}})+\bigl{(}e^{G\Delta}-(I+\Delta G)\bigr{)}h^{Q}\|_{A^{c}}$
		$\displaystyle=e^{G\Delta}(h^{Q}_{\Delta}\|_{A^{c}}-h^{Q}\|_{A^{c}})+Bh^{Q}\|_{A^{c}}\,.$

	$\displaystyle\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\hat{\tau}\,\|\,X_{0}]$	$\displaystyle=\lim_{t\to+\infty,t\in\mathbb{N}_{0}}\overline{\mathbb{E}}_{\mathcal{Q}}^{\mathrm{I}}[\tau_{1}^{t}\,\|\,X_{0}]$
		$\displaystyle=\lim_{t\to+\infty,t\in\mathbb{N}_{0}}\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{I}}[\tau_{0:t}\,\|\,X_{0}]$
		$\displaystyle\leq\lim_{t\to+\infty,t\in\mathbb{N}_{0}}\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{0:t}\,\|\,X_{0}]=\overline{\mathbb{E}}_{\mathcal{T}}^{\mathrm{V}}[\tau_{\mathbb{N}_{0}}\,\|\,X_{0}]\,.$

$\displaystyle\underline{s}$	$\displaystyle=\lim_{j\to+\infty}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{Q}}[\tau_{k_{j}}\,\|\,X_{0}]$
	$\displaystyle=\lim_{j\to+\infty}\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{I}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,\|\,X_{0}]$
	$\displaystyle\geq\lim_{j\to+\infty}\Delta_{k_{j}}\underline{\mathbb{E}}^{\mathrm{V}}_{\mathcal{T}_{k_{j}}}[\tau_{0:n_{k_{j}}}\,\|\,X_{0}]\,.$	(38)