
Technische Universität Dresden
This work was funded by DFG grant 389792660 as part of TRR 248, the Cluster of Excellence EXC 2050/1 (CeTI, project ID 390696704, as part of Germany’s Excellence Strategy), DFG-projects BA-1679/11-1 and BA-1679/12-1, and the Research Training Group QuantLA (GRK 1763).
Email: {christel.baier, florian.funke, simon.jantsch, jakob.piribauer, robin.ziemek}@tu-dresden.de

Probabilistic causes in Markov chains

Christel Baier    Florian Funke    Simon Jantsch   
Jakob Piribauer
   Robin Ziemek
Abstract

The paper studies a probabilistic notion of causes in Markov chains that relies on the counterfactuality principle and the probability-raising property. This notion is motivated by the use of causes for monitoring purposes, where the aim is to detect faulty or undesired behaviours before they actually occur. A cause is a set of finite executions of the system after which the probability of the effect exceeds a given threshold. We introduce multiple types of costs that capture the consumption of resources from different perspectives, and study the complexity of computing cost-minimal causes.

1 Introduction

The study of cause-effect relationships in formal systems has received considerable attention over the past 25 years. Notions of causality have been proposed within various models, including structural equation models [37, 27, 26], temporal logics in Kripke structures [10, 3] and Markov chains [34, 35], and application areas have been identified in abundance, ranging from finance [32] to medicine [33] to aeronautics [30]. These approaches form an increasingly powerful toolkit aimed at explaining why an observable phenomenon (the effect) has happened, and which previous events (the causes) are logically linked to its occurrence. As such, causality plays a fundamental building block in determining moral responsibility [9, 5] or legal accountability [20], and ultimately fosters user acceptance through an increased level of transparency [36].

Despite the variety of models, application areas, and involved disciplines, all approaches essentially rely on (one of) two central paradigms that dictate how causes are linked to their effects: the counterfactuality principle and the probability-raising property. Counterfactual reasoning prescribes that an effect would not have happened if the cause had not occurred. Probability-raising states that the probability of the effect is higher whenever the cause has been observed.

The contribution of this paper is twofold: First, we define a novel notion of cause for ω\omega-regular properties in stochastic operational models. Second, we study the complexity of computing optimal causes for cost mechanisms motivated by monitoring applications.

The causes presented in this paper combine the two prevailing causality paradigms mentioned above into a single concept. More specifically, a pp-cause for an ω\omega-regular property \mathcal{L} in a discrete-time Markov chain is a set of finite executions π\pi of the system such that the probability that \mathcal{L} occurs after executing π\pi is at least pp, where pp is typically larger than the overall probability of \mathcal{L}. The counterfactuality principle is invoked through the additional requirement that almost every execution exhibiting the event \mathcal{L} contains a finite prefix which is a member of the pp-cause. This condition makes our approach amenable to the needs of monitoring a system at runtime.

Imagine a critical event that the system should avoid (e.g., a fully automated drone crashing onto the ground), and assume that a pp-cause for this event is known (e.g., physical specifications foreshadowing a crash). Typically, the probability threshold pp – which can be thought of as the sensitivity of the monitor – should be lower if the criticality of the event is higher. As the system is running, as soon as the execution seen so far is part of the pp-cause, the monitor can trigger an alarm and suitable countermeasures can be taken (e.g., manual control instead of automated behavior). As such, our approach can be preventive in nature.

The monitoring application outlined above suggests computing a pp-cause from the model before the system is put to use. However, multiple pp-causes may exist for the same property, which raises the question which one to choose. Cyber-physical systems consume time, energy and other resources, which are often subject to budget restrictions. Furthermore, the intended countermeasures may incur different costs depending on the system state. Such costs can be modelled using state weights in the Markov chain, which induce weight functions on the finite executions either in an accumulative (total resource consumption) or instantaneous (current consumption intensity) fashion. On top of this model, we present three cost mechanisms for causes: (1) The expected cost measures the expected resource consumption until the monitor triggers an alarm or reaches a safe state, (2) the partial expected cost measures the expected consumption where executions reaching a safe state do not incur any cost, and (3) the maximal cost measures the maximal consumption that can occur until an alarm is triggered.

Accumulated, non-negative weights: expcost in P (7); pexpcost pseudo-polynomial (4), PP-hard (6); maxcost in P (9).
Accumulated, arbitrary weights: expcost in P (7); pexpcost PP-hard (6); maxcost pseudo-polynomial (9), in NP ∩ coNP.
Instantaneous, arbitrary weights: expcost in P (4.4); pexpcost in P (4.4); maxcost in P (4.4).
Figure 1: Summary of complexity results for different kinds of cost.

Figure 1 summarizes our results regarding the complexity of computing cost-minimal pp-causes for the different combinations of weight type and cost mechanism. To obtain these results we utilize a web of connections to the rich landscape of computational problems for discrete-time Markovian models. More precisely, the results for the expected cost rely on connections to the stochastic shortest path problem (SSP) studied in [4]. The pseudo-polynomial algorithm for partial expected costs on non-negative, accumulated weights uses partial expectations in Markov decision processes [38]. The 𝙿𝙿\mathtt{PP}-hardness result is proved by reduction from the cost problem for acyclic Markov chains stated in [24]. The pseudo-polynomial algorithm for the maximal cost on arbitrary, accumulated weights applies insights from total-payoff games [6, 8].

Full proofs omitted from the main text can be found in the appendix.

1.0.1 Related Work.

The structural model approach to actual causality [27] has sparked notions of causality in formal verification [10, 3]. The complexity of computing actual causes has been studied in [16, 17]. A probabilistic extension of this framework has been proposed in [21]. Recent work on checking and inferring actual causes is given in [29], and an application-oriented framework for it is presented in [30]. The work [34] builds a framework for actual causality in Markov chains and applies it to infer causal relationships in data sets. It was later extended to continuous time data [32] and to token causality [35] and has been refined using new measures for the significance of actual and token causes [28, 45].

A logic for probabilistic causal reasoning is given in [44] in combination with logical programming. The work [43] compares this approach to Pearl’s theory of causality involving Bayesian networks [37]. The CP-logic of [44] is close to the representation of causal mechanisms of [14]. The probability-raising principle goes back to Reichenbach [39]. It has been identified as a key ingredient to causality in various philosophical accounts, see e.g. [15].

Monitoring ω\omega-regular properties in stochastic systems modeled as Hidden Markov Chains (HMCs) was studied in [40, 23] and has recently been revived [18]. The trade-off between accuracy and overhead in runtime verification has been studied in [2, 41, 31]. In particular [2] uses HMCs to estimate how likely each monitor instance is to violate a temporal property. Monitoring the evolution of finite executions has also been investigated in the context of statistical model checking of LTL properties [13]. How randomization can improve monitors for non-probabilistic systems has been examined in [7]. The safety level of [19] measures which portion of a language admits bad prefixes, in the sense classically used for safety languages.

2 Preliminaries

2.0.1 Markov chains.

A discrete-time Markov chain (DTMC) MM is a tuple (S,s0,𝐏)(S,s_{0},\mathbf{P}), where SS is a finite set of states, s0Ss_{0}\in S is the initial state, and 𝐏:S×S[0,1]\mathbf{P}\colon S\times S\to[0,1] is the transition probability function where we require sS𝐏(s,s)=1\sum_{s^{\prime}\in S}\mathbf{P}(s,s^{\prime})=1 for all sSs\in S. For algorithmic problems all transition probabilities are assumed to be rational. A finite path π^\hat{\pi} in MM is a sequence s0s1sns_{0}s_{1}\ldots s_{n} of states such that 𝐏(si,si+1)>0\mathbf{P}(s_{i},s_{i+1})>0 for all 0in10\leq i\leq n-1. Let last(s0sn)=sn\operatorname{last}(s_{0}\ldots s_{n})=s_{n}. Similarly one defines the notion of an infinite path π\pi. Let Pathsfin(M)\operatorname{Paths}_{\operatorname{fin}}(M) and Paths(M)\operatorname{Paths}(M) be the set of finite and infinite paths. The set of prefixes of a path π\pi is denoted by Pref(π)\operatorname{Pref}(\pi). The cylinder set of a finite path π^\hat{\pi} is Cyl(π^)={πPaths(M)π^Pref(π)}\operatorname{Cyl}(\hat{\pi})=\{\pi\in\ \operatorname{Paths}(M)\mid\hat{\pi}\in\operatorname{Pref}(\pi)\}. We consider Paths(M)\operatorname{Paths}(M) as a probability space whose σ\sigma-algebra is generated by such cylinder sets and whose probability measure is induced by Pr(Cyl(s0sn))=𝐏(s0,s1)𝐏(sn1,sn)\mathrm{Pr}(\operatorname{Cyl}(s_{0}\ldots s_{n}))=\mathbf{P}(s_{0},s_{1})\cdot\ldots\cdot\mathbf{P}(s_{n-1},s_{n}) (see [1, Chapter 10] for more details).
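The definitions above can be made concrete in a few lines. The following sketch represents a DTMC as a nested dictionary and computes the cylinder-set probability of a finite path; the five-state chain is a hypothetical example chosen only for illustration (it is not the chain from the paper's figures).

```python
# A small DTMC as a nested dict: P[s][s2] is the transition probability P(s, s2).
# The concrete chain below is a hypothetical example for illustration only.
P = {
    "s":     {"t": 0.5, "u": 0.5},
    "t":     {"error": 0.875, "safe": 0.125},
    "u":     {"error": 0.75,  "safe": 0.25},
    "error": {"error": 1.0},
    "safe":  {"safe": 1.0},
}

def cylinder_probability(path):
    """Pr(Cyl(s_0 ... s_n)) = P(s_0, s_1) * ... * P(s_{n-1}, s_n)."""
    prob = 1.0
    for s, s_next in zip(path, path[1:]):
        prob *= P[s].get(s_next, 0.0)
    return prob
```

For instance, `cylinder_probability(["s", "t", "error"])` returns 0.5 · 0.875 = 0.4375, and a path of length one has cylinder probability 1 (the whole space).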

For an ω\omega-regular language Sω\mathcal{L}\subseteq S^{\omega} let PathsM()=Paths(M)\operatorname{Paths}_{M}(\mathcal{L})=\operatorname{Paths}(M)\cap\mathcal{L}. The probability of \mathcal{L} in MM is defined as PrM()=Pr(PathsM())\mathrm{Pr}_{M}(\mathcal{L})=\mathrm{Pr}(\operatorname{Paths}_{M}(\mathcal{L})). Given a state sSs\in S, let PrM,s()=PrMs()\mathrm{Pr}_{M,s}(\mathcal{L})=\mathrm{Pr}_{M_{s}}(\mathcal{L}), where MsM_{s} is the DTMC obtained from MM by replacing the initial state s0s_{0} with ss. If MM is clear from the context, we omit the subscript. For a finite path π^Pathsfin(M)\hat{\pi}\in\operatorname{Paths}_{\operatorname{fin}}(M), define the conditional probability

PrM(π^)=PrM(PathsM()Cyl(π^))PrM(Cyl(π^)).\mathrm{Pr}_{M}(\mathcal{L}\mid\hat{\pi})=\frac{\mathrm{Pr}_{M}\left(\operatorname{Paths}_{M}(\mathcal{L})\cap\operatorname{Cyl}(\hat{\pi})\right)}{\mathrm{Pr}_{M}(\operatorname{Cyl}(\hat{\pi}))}.

Given ESE\subseteq S, let E={s0s1Paths(M)i0.siE}\lozenge E=\{s_{0}s_{1}\ldots\in\operatorname{Paths}(M)\mid\exists i\geq 0.\;s_{i}\in E\}. For such reachability properties we have PrM(Es0sn)=PrM,sn(E)\mathrm{Pr}_{M}(\lozenge E\mid s_{0}\ldots s_{n})=\mathrm{Pr}_{M,s_{n}}(\lozenge E) for any s0snPathsfin(M)s_{0}\ldots s_{n}\in\operatorname{Paths}_{\operatorname{fin}}(M). We assume Prs0(s)>0\mathrm{Pr}_{s_{0}}(\lozenge s)>0 for all states sSs\in S. Furthermore, define a weight function on MM as a map c:Sc:S\to\mathbb{Q}. We typically use it to induce a weight function c:Pathsfin(M)c:\operatorname{Paths}_{\operatorname{fin}}(M)\to\mathbb{Q} (denoted by the same letter) by accumulation, i.e., c(s0sn)=i=0nc(si)c(s_{0}\cdots s_{n})=\sum_{i=0}^{n}c(s_{i}). Finally, a set ΠPathsfin(M)\Pi\subseteq\operatorname{Paths}_{\operatorname{fin}}(M) is called prefix-free if for every π^Π\hat{\pi}\in\Pi we have ΠPref(π^)={π^}\Pi\cap\operatorname{Pref}(\hat{\pi})=\{\hat{\pi}\}.
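The identity above reduces the conditional probability of a reachability property after a finite path to a per-state value, which can be approximated by iterating the standard fixed-point characterization. A minimal sketch on a hypothetical chain (not the one from the paper's figures):

```python
# Approximate x(s) = Pr_s(◇ target) by iterating
#   x(target) = 1,   x(s) = Σ_{s2} P(s, s2) · x(s2)
# States from which target is unreachable stay at 0 (e.g. the absorbing safe state).
P = {
    "s":     {"t": 0.5, "u": 0.5},
    "t":     {"error": 0.875, "safe": 0.125},
    "u":     {"error": 0.75,  "safe": 0.25},
    "error": {"error": 1.0},
    "safe":  {"safe": 1.0},
}

def reach_probs(P, target, iters=200):
    x = {s: (1.0 if s == target else 0.0) for s in P}
    for _ in range(iters):
        x = {s: 1.0 if s == target
             else sum(p * x[s2] for s2, p in P[s].items())
             for s in P}
    return x

probs = reach_probs(P, "error")
# Pr_s(◇error) = 0.5·0.875 + 0.5·0.75 = 0.8125, and the conditional probability
# after any finite path ending in t equals Pr_t(◇error) = 0.875.
```

A production implementation would instead solve the corresponding linear system exactly; the iteration is just the shortest self-contained sketch.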

2.0.2 Markov decision processes.

A Markov decision process (MDP) \mathcal{M} is a tuple (S,Act,s0,𝐏)(S,\operatorname{Act},s_{0},\mathbf{P}), where SS is a finite set of states, Act\operatorname{Act} is a finite set of actions, s0s_{0} is the initial state, and 𝐏:S×Act×S[0,1]\mathbf{P}\colon S\times\operatorname{Act}\times S\to[0,1] is the transition probability function such that for all states sSs\in S and actions αAct\alpha\in\operatorname{Act} we have sS𝐏(s,α,s){0,1}\sum_{s^{\prime}\in S}\mathbf{P}(s,\alpha,s^{\prime})\in\{0,1\}. An action α\alpha is enabled in state sSs\in S if sS𝐏(s,α,s)=1\sum_{s^{\prime}\in S}\mathbf{P}(s,\alpha,s^{\prime})=1 and we define Act(s)={αα is enabled in s}\operatorname{Act}(s)=\{\alpha\mid\alpha\text{ is enabled in }s\}. We require Act(s)\operatorname{Act}(s)\neq\emptyset for all states sSs\in S.

An infinite path in \mathcal{M} is an infinite sequence π=s0α1s1α2s2(S×Act)ω\pi=s_{0}\alpha_{1}s_{1}\alpha_{2}s_{2}\dots\in(S\times\operatorname{Act})^{\omega} such that for all i0i\geq 0 we have 𝐏(si,αi+1,si+1)>0\mathbf{P}(s_{i},\alpha_{i+1},s_{i+1})>0. Any finite prefix of π\pi that ends in a state is a finite path. A scheduler 𝔖\mathfrak{S} is a function that maps a finite path s0α1s1sns_{0}\alpha_{1}s_{1}\dots s_{n} to an enabled action αAct(sn)\alpha\in\operatorname{Act}(s_{n}). Therefore it resolves the nondeterminism of the MDP and induces a (potentially infinite) Markov chain 𝔖\mathcal{M}_{\mathfrak{S}}. If the chosen action only depends on the last state of the path, i.e., 𝔖(s0α1s1sn)=𝔖(sn)\mathfrak{S}(s_{0}\alpha_{1}s_{1}\dots s_{n})=\mathfrak{S}(s_{n}), then the scheduler is called memoryless and naturally induces a finite DTMC. For more details on DTMCs and MDPs we refer to [1].

3 Causes

This section introduces a notion of cause for ω\omega-regular properties in Markov chains. For the rest of this section we fix a DTMC MM with state space SS, an ω\omega-regular language \mathcal{L} over the alphabet SS and a threshold p(0,1]p\in(0,1].

Definition 1 (pp-critical prefix)

A finite path π^\hat{\pi} is a pp-critical prefix for \mathcal{L} if Pr(π^)p\mathrm{Pr}(\mathcal{L}\mid\hat{\pi})\geq p.

Definition 2 (pp-cause)

A pp-cause for \mathcal{L} in MM is a prefix-free set of finite paths ΠPathsfin(M)\Pi\subseteq\operatorname{Paths}_{\operatorname{fin}}(M) such that

  1. (1)

    almost every πPathsM()\pi\in\operatorname{Paths}_{M}(\mathcal{L}) has a prefix π^Π\hat{\pi}\in\Pi, and

  2. (2)

    every π^Π\hat{\pi}\in\Pi is a pp-critical prefix for \mathcal{L}.

Note that conditions (1) and (2) are in the spirit of completeness and soundness as used in [11]. The first condition is our invocation of the counterfactuality principle: Almost every occurrence of the effect (for example, reaching a target set) is preceded by an element in the cause. If the threshold is chosen such that p>Prs0()p>\mathrm{Pr}_{s_{0}}(\mathcal{L}), then the second condition reflects the probability-raising principle in that seeing an element of Π\Pi implies that the probability of the effect \mathcal{L} has increased over the course of the execution. For monitoring purposes as described in the introduction it would be misleading to choose pp below Prs0()\mathrm{Pr}_{s_{0}}(\mathcal{L}) as this could instantly trigger an alarm before the system is put to use. Also, pp should not be too close to 11 as this may result in an alarm being triggered too late.

If \mathcal{L} coincides with a reachability property, one could equivalently remove the “almost” from (1) of Definition 2. In general, however, ignoring paths with probability zero is necessary to guarantee the existence of pp-causes for all pp.

Example 1

Consider the DTMC MM depicted in Figure 2. For p=3/4p=3/4, a possible pp-cause for =𝑒𝑟𝑟𝑜𝑟\mathcal{L}=\lozenge\mathit{error} in MM is given by the set Π1={st,su}\Pi_{1}=\{st,su\} since both tt and uu reach 𝑒𝑟𝑟𝑜𝑟\mathit{error} with probability at least pp. The sets Θ1={st,su,stu}\Theta_{1}=\{st,su,stu\} and Θ2={st𝑒𝑟𝑟𝑜𝑟,su}\Theta_{2}=\{st\mathit{error},su\} are not pp-causes: Θ1\Theta_{1} is not prefix-free and for Θ2\Theta_{2} the path stu𝑒𝑟𝑟𝑜𝑟stu\mathit{error} has no prefix in Θ2\Theta_{2}. Another pp-cause is Π2={st𝑒𝑟𝑟𝑜𝑟,su,stu}\Pi_{2}=\{st\mathit{error},su,stu\}.

Example 2

It can happen that there does not exist any finite pp-cause. Consider Figure 3 and p=1/2p=1/2. Since Prs(𝑒𝑟𝑟𝑜𝑟)<p\mathrm{Pr}_{s}(\lozenge\mathit{error})<p, the singleton {s}\{s\} is not a pp-cause. Thus, for every n0n\geq 0 either snts^{n}t or snt𝑒𝑟𝑟𝑜𝑟s^{n}t\mathit{error} is contained in any pp-cause, which must therefore be infinite. There may also exist non-regular pp-causes (as languages of finite words over SS). For example, for A={nn prime}A=\{n\in\mathbb{N}\mid n\text{ prime}\} the pp-cause ΠA={sntnA}{smt𝑒𝑟𝑟𝑜𝑟mA}\Pi_{A}=\{s^{n}t\mid n\in A\}\cup\{s^{m}t\mathit{error}\mid m\notin A\} is non-regular.

Figure 2: Example DTMC MM
Figure 3: Infinite and non-regular 1/21/2-causes
Remark 1 (Reduction to reachability properties)

Let 𝒜\mathcal{A} be a deterministic Rabin automaton for \mathcal{L} and consider the product Markov chain M𝒜M\otimes\mathcal{A} as in [1, Section 10.3]. For any finite path π^=s0snPathsfin(M)\hat{\pi}=s_{0}\ldots s_{n}\in\operatorname{Paths}_{\operatorname{fin}}(M) there is a unique path a(π^)=(s0,q1)(s1,q2)(sn,qn+1)Pathsfin(M𝒜)a(\hat{\pi})=(s_{0},q_{1})(s_{1},q_{2})\ldots(s_{n},q_{n+1})\in\operatorname{Paths}_{\operatorname{fin}}(M\otimes\mathcal{A}) whose projection onto the first factor is π^\hat{\pi}. Under this correspondence, a bottom strongly connected component (BSCC) of M𝒜M\otimes\mathcal{A} is either accepting or rejecting, meaning that for every finite path reaching this BSCC the corresponding path π^\hat{\pi} in MM satisfies PrM(π^)=1\mathrm{Pr}_{M}(\mathcal{L}\mid\hat{\pi})=1, or respectively, PrM(π^)=0\mathrm{Pr}_{M}(\mathcal{L}\mid\hat{\pi})=0 [1, Section 10.3]. This readily implies that almost every πPathsM()\pi\in\operatorname{Paths}_{M}(\mathcal{L}) has a 11-critical prefix and that, therefore, pp-causes exist for any pp.

Moreover, if UU is the union of all accepting BSCCs in M𝒜M\otimes\mathcal{A}, then

PrM(π^)=PrM𝒜(Ua(π^))\mathrm{Pr}_{M}(\mathcal{L}\mid\hat{\pi})=\mathrm{Pr}_{M\otimes\mathcal{A}}\big{(}\lozenge U\mid a(\hat{\pi})\big{)} (3.1)

holds for all finite paths π^\hat{\pi} of MM [1, Theorem 10.56]. Hence every pp-cause Π1\Pi_{1} for \mathcal{L} in MM induces a pp-cause Π2\Pi_{2} for U\lozenge U in M𝒜M\otimes\mathcal{A} by taking Π2={a(π^)π^Π1}\Pi_{2}=\{a(\hat{\pi})\mid\hat{\pi}\in\Pi_{1}\}. Vice versa, given a pp-cause Π2\Pi_{2} for U\lozenge U in M𝒜M\otimes\mathcal{A}, then the set of projections of paths in Π2\Pi_{2} onto their first component is a pp-cause for \mathcal{L} in MM. In summary, the reduction of ω\omega-regular properties on MM to reachability properties on the product M𝒜M\otimes\mathcal{A} also induces a reduction on the level of causes.

Remark 1 motivates us to focus on reachability properties henceforth. To apply the algorithms presented in Section 4 to specifications given in richer formalisms such as LTL, one would first have to apply the reduction to reachability given above, which increases the worst-case complexity exponentially.

In order to align the exposition with the monitoring application we are targeting, we will consider the target set as representing an erroneous behavior that is to be avoided. After collapsing the target set, we may assume that there is a unique state 𝑒𝑟𝑟𝑜𝑟S\mathit{error}\in S, so =𝑒𝑟𝑟𝑜𝑟\mathcal{L}=\lozenge\mathit{error} is the language we are interested in. Further, we collapse all states from which 𝑒𝑟𝑟𝑜𝑟\mathit{error} is not reachable to a unique state 𝑠𝑎𝑓𝑒S\mathit{safe}\in S with the property Pr𝑠𝑎𝑓𝑒(𝑒𝑟𝑟𝑜𝑟)=0\mathrm{Pr}_{\mathit{safe}}(\lozenge\mathit{error})=0. After this pre-processing, we have Prs0({𝑒𝑟𝑟𝑜𝑟,𝑠𝑎𝑓𝑒})=1\mathrm{Pr}_{s_{0}}(\lozenge\{\mathit{error},\mathit{safe}\})=1. Define the set

Sp:={sSPrs(𝑒𝑟𝑟𝑜𝑟)p}S_{p}:=\{s\in S\mid\mathrm{Pr}_{s}(\lozenge\mathit{error})\geq p\}

of all acceptable final states for pp-critical prefixes. This set is never empty as 𝑒𝑟𝑟𝑜𝑟Sp\mathit{error}\in S_{p} for all p(0,1]p\in(0,1].
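Computing SpS_{p} only requires the per-state reachability probabilities. A self-contained sketch on a hypothetical chain (the fixed-point iteration for Pr_s(◇error) is inlined; an exact implementation would solve the linear system instead):

```python
# Compute S_p = { s | Pr_s(◇ error) >= p } on a hypothetical chain.
P = {
    "s":     {"t": 0.5, "u": 0.5},
    "t":     {"error": 0.875, "safe": 0.125},
    "u":     {"error": 0.75,  "safe": 0.25},
    "error": {"error": 1.0},
    "safe":  {"safe": 1.0},
}

def reach_probs(P, target, iters=200):
    x = {s: (1.0 if s == target else 0.0) for s in P}
    for _ in range(iters):
        x = {s: 1.0 if s == target
             else sum(p * x[s2] for s2, p in P[s].items())
             for s in P}
    return x

def compute_Sp(P, p):
    pr = reach_probs(P, "error")
    return {s for s, v in pr.items() if v >= p}
```

For this chain the reachability values are s: 0.8125, t: 0.875, u: 0.75, so with p = 0.85 only t and error qualify, while with p = 0.75 all states except safe do.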

There is a partial order on the set of pp-causes defined as follows: ΠΦ\Pi\preceq\Phi if and only if for all ϕΦ\phi\in\Phi there exists πΠ\pi\in\Pi such that πPref(ϕ)\pi\in\operatorname{Pref}(\phi). Reflexivity and transitivity are straightforward, and antisymmetry follows from the fact that pp-causes are prefix-free. Note, however, that this order is not tied to the conditional probabilities of the effect: for two pp-causes Π,Φ\Pi,\Phi with ΠΦ\Pi\preceq\Phi it can happen that for πΠ,ϕΦ\pi\in\Pi,\phi\in\Phi we have Pr(𝑒𝑟𝑟𝑜𝑟π)Pr(𝑒𝑟𝑟𝑜𝑟ϕ)\mathrm{Pr}(\lozenge\mathit{error}\mid\pi)\geq\mathrm{Pr}(\lozenge\mathit{error}\mid\phi). This partial order admits a minimal element which is a regular language over SS and which plays a crucial role for finding optimal causes in Section 4.

Proposition (Canonical pp-cause)

Let

Θ={s0snPathsfin(M)|snSp and for all i<nsiSp}.\displaystyle\Theta=\left\{s_{0}\cdots s_{n}\in\operatorname{Paths}_{\operatorname{fin}}(M)\;\middle|\;s_{n}\in S_{p}\text{ and for all $i<n$: }\>s_{i}\notin S_{p}\right\}.

Then Θ\Theta is a regular pp-cause (henceforth called the canonical pp-cause) and for all pp-causes Π\Pi we have ΘΠ\Theta\preceq\Pi.
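The canonical pp-cause Θ can be enumerated up to a length bound by following paths from the initial state and cutting each one at its first visit to SpS_{p}. A sketch on a hypothetical chain, with S_p = {t, u, error} assumed purely for illustration:

```python
# Enumerate Θ restricted to paths with at most max_len states: a path is kept
# exactly when its last state is the first state on it that lies in S_p.
P = {
    "s":     {"t": 0.5, "u": 0.5},
    "t":     {"error": 0.875, "safe": 0.125},
    "u":     {"error": 0.75,  "safe": 0.25},
    "error": {"error": 1.0},
    "safe":  {"safe": 1.0},
}
Sp = {"t", "u", "error"}   # hypothetical choice of S_p

def canonical_cause(P, Sp, s0, max_len):
    cause, frontier = [], [[s0]]
    for _ in range(max_len):
        extended = []
        for path in frontier:
            if path[-1] in Sp:          # first visit to S_p: cut the path here
                cause.append(path)
            else:                       # otherwise keep extending it
                for s2, prob in P[path[-1]].items():
                    if prob > 0:
                        extended.append(path + [s2])
        frontier = extended
    return cause
```

Here `canonical_cause(P, Sp, "s", 4)` yields the two paths st and su, each of which ends at its first S_p state; in general Θ is regular but need not be finite, so the length bound is essential for enumeration.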

We now introduce an MDP associated with MM whose schedulers correspond to the pp-causes of MM. This is useful both to represent pp-causes and for algorithmic questions we consider later.

Definition 3 (pp-causal MDP)

For the DTMC M=(S,s0,𝐏)M=(S,s_{0},\mathbf{P}) define the pp-causal MDP 𝒞p(M)=(S,{continue,pick},s0,𝐏)\mathcal{C}_{p}(M)=(S,\{continue,pick\},s_{0},\mathbf{P}^{\prime}) associated with MM, where 𝐏\mathbf{P}^{\prime} is defined as follows:

𝐏(s,continue,s)\displaystyle\mathbf{P^{\prime}}(s,continue,s^{\prime}) =𝐏(s,s) for all s,sS\displaystyle=\mathbf{P}(s,s^{\prime})\text{ for all }s,s^{\prime}\in S
𝐏(s,pick,𝑒𝑟𝑟𝑜𝑟)\displaystyle\mathbf{P^{\prime}}(s,pick,\mathit{error}) ={1 if sSp0 otherwise\displaystyle=\begin{cases}1&\text{ if }s\in S_{p}\\ 0&\text{ otherwise}\end{cases}

Given a weight function cc on MM, we consider cc also as weight function on 𝒞p(M)\mathcal{C}_{p}(M).
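Definition 3 amounts to a small transformation of the transition function: every state keeps a continue action copying the DTMC's transitions, and states in SpS_{p} additionally get a pick action leading to error. A sketch, reusing the hypothetical chain and S_p from the earlier snippets:

```python
# Build the transition function of the p-causal MDP C_p(M).
P = {
    "s":     {"t": 0.5, "u": 0.5},
    "t":     {"error": 0.875, "safe": 0.125},
    "u":     {"error": 0.75,  "safe": 0.25},
    "error": {"error": 1.0},
    "safe":  {"safe": 1.0},
}
Sp = {"t", "u", "error"}   # hypothetical choice of S_p

def p_causal_mdp(P, Sp):
    mdp = {}
    for s in P:
        mdp[(s, "continue")] = dict(P[s])   # 'continue' copies M's transitions
        if s in Sp:
            mdp[(s, "pick")] = {"error": 1.0}   # 'pick' jumps to error w.p. 1
    return mdp
```

In the result, pick is enabled exactly in the SpS_{p}-states, matching the definition of 𝐏\mathbf{P}^{\prime} above.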

Example 3

Figure 4 demonstrates the pp-causal MDP construction of 𝒞p(M)\mathcal{C}_{p}(M). The black edges are transitions of MM, probabilities are omitted. Let us assume Sp\{𝑒𝑟𝑟𝑜𝑟}={s1,s3,s4}S_{p}\backslash\{\mathit{error}\}=\{s_{1},s_{3},s_{4}\}. To construct 𝒞p(M)\mathcal{C}_{p}(M) one adds transitions for the action pickpick, as shown by red edges.


Figure 4: Illustration of the pp-causal MDP construction

Technically, schedulers are defined on all finite paths of an MDP \mathcal{M}. However, for any given scheduler there are typically finite paths that cannot occur under it, and the scheduler's choices on such paths are irrelevant. Thus we define an equivalence relation \equiv on the set of schedulers of \mathcal{M} by setting 𝔖𝔖\mathfrak{S}\equiv\mathfrak{S}^{\prime} if Paths(𝔖)=Paths(𝔖)\operatorname{Paths}(\mathcal{M}_{\mathfrak{S}})=\operatorname{Paths}(\mathcal{M}_{\mathfrak{S}^{\prime}}). Note that two schedulers equivalent under \equiv behave identically on all paths that can actually occur under them.

Lemma 1

There is a one-to-one correspondence between equivalence classes of schedulers in 𝒞p(M)\mathcal{C}_{p}(M) w.r.t. \equiv and pp-causes in MM for 𝑒𝑟𝑟𝑜𝑟\lozenge\mathit{error}.

Proof

Given a pp-cause Π\Pi for 𝑒𝑟𝑟𝑜𝑟\lozenge\mathit{error} in MM, we construct the equivalence class of scheduler [𝔖Π][\mathfrak{S}_{\Pi}] by defining 𝔖Π(π^)=pick\mathfrak{S}_{\Pi}(\hat{\pi})=pick if π^Π\hat{\pi}\in\Pi, and otherwise 𝔖Π(π^)=continue\mathfrak{S}_{\Pi}(\hat{\pi})=continue. Vice versa, given an equivalence class [𝔖][\mathfrak{S}] of schedulers, we define the pp-cause

Π𝔖={π^Pathsfin(M)|𝔖(π^)=pick or π^ ends in 𝑒𝑟𝑟𝑜𝑟 and𝔖 does not choose pick on any prefix of π^}\displaystyle\Pi_{\mathfrak{S}}=\left\{\hat{\pi}\in\operatorname{Paths}_{\operatorname{fin}}(M)\;\middle|\;\begin{aligned} &\mathfrak{S}(\hat{\pi})=pick\text{ or $\hat{\pi}$ ends in $\mathit{error}$ and}\\ &\text{$\mathfrak{S}$ does not choose $pick$ on any prefix of $\hat{\pi}$}\end{aligned}\right\}

Since pickpick can only be chosen once on every path in Paths(𝔖)\operatorname{Paths}(\mathcal{M}_{\mathfrak{S}}), it is easy to see that 𝔖𝔖\mathfrak{S}\equiv\mathfrak{S}^{\prime} implies Π𝔖=Π𝔖\Pi_{\mathfrak{S}}=\Pi_{\mathfrak{S}^{\prime}}. Note that every π^Π𝔖\hat{\pi}\in\Pi_{\mathfrak{S}} is a pp-critical prefix since it ends in SpS_{p} and every path in 𝑒𝑟𝑟𝑜𝑟\lozenge\mathit{error} is covered since either pickpick is chosen or π^\hat{\pi} ends in 𝑒𝑟𝑟𝑜𝑟\mathit{error}. Furthermore, the second condition makes Π\Pi prefix-free. ∎

3.1 Types of pp-causes and induced monitors

We now introduce two classes of pp-causes which have a comparatively simple representation, and we explain what classes of schedulers they correspond to in the pp-causal MDP and how monitors can be derived for them.

Definition 4 (State-based pp-cause)

A pp-cause Π\Pi is state-based if there exists a set of states QSpQ\subseteq S_{p} such that Π={s0snPathsfin(M)snQ and i<n:siQ}\Pi=\{s_{0}\ldots s_{n}\in\operatorname{Paths}_{\operatorname{fin}}(M)\mid s_{n}\in Q\text{ and }\forall i<n:\;s_{i}\notin Q\}.

State-based pp-causes correspond to memoryless schedulers of 𝒞p(M)\mathcal{C}_{p}(M) which choose pickpick exactly for paths ending in QQ. For DTMCs equipped with a weight function we introduce threshold-based pp-causes:

Definition 5 (Threshold-based pp-cause)

A pp-cause Π\Pi is threshold-based if there exists a map T:Sp{}T:S_{p}\to\mathbb{Q}\cup\{\infty\} such that

Π={s0snPathsfin(M)|s0snpick(T) ands0sipick(T) for i<n}\Pi=\left\{s_{0}\cdots s_{n}\in\operatorname{Paths}_{\operatorname{fin}}(M)\;\middle|\;\begin{aligned} &s_{0}\cdots s_{n}\in\operatorname{pick}(T)\text{ and}\\ &s_{0}\cdots s_{i}\notin\operatorname{pick}(T)\text{ for }i<n\end{aligned}\right\}

where pick(T)={s0snPathsfin(M)snSp and c(s0sn)<T(sn)}\operatorname{pick}(T)=\{s_{0}\ldots s_{n}\in\operatorname{Paths}_{\operatorname{fin}}(M)\mid s_{n}\in S_{p}\text{ and }c(s_{0}\ldots s_{n})<T(s_{n})\}.

Threshold-based pp-causes correspond to a simple class of weight-based schedulers of the pp-causal MDP, which base their decision in a state only on whether the current weight exceeds the threshold or not. Intuitively, threshold-based pp-causes are useful if triggering an alarm causes costs while reaching a safe state does not (see Section 4.2): The idea is that cheap paths (satisfying c(s0sn)<T(sn)c(s_{0}\ldots s_{n})<T(s_{n})) are picked for the pp-cause, while expensive paths are continued in order to realize the chance (with probability 1p\leq 1{-}p) that a safe state is reached and therefore the high cost that has already been accumulated is avoided.
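Membership of a finite path in a threshold-based pp-cause depends only on its accumulated weight and the map TT. A sketch with hypothetical weights, thresholds, and SpS_{p}:

```python
# Decide whether a finite path belongs to the threshold-based p-cause Π:
# the path must lie in pick(T) while no proper prefix of it does.
Sp = {"t", "u", "error"}                               # hypothetical S_p
c  = {"s": 1, "t": 2, "u": 3, "error": 0, "safe": 0}   # hypothetical weights
T  = {"t": 4, "u": 3, "error": float("inf")}           # hypothetical thresholds

def in_pick(path):
    """path ∈ pick(T): ends in S_p with accumulated weight below T(last state)."""
    return path[-1] in Sp and sum(c[s] for s in path) < T[path[-1]]

def in_threshold_cause(path):
    return in_pick(path) and not any(in_pick(path[:i]) for i in range(1, len(path)))
```

With these values the path st has weight 3 < T(t) = 4 and is picked, while su has weight 4 ≥ T(u) = 3 and is continued in the hope of reaching a safe state, exactly the trade-off described above.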

The concept of pp-causes can be used as a basis for monitors that raise an alarm as soon as a state sequence in the pp-cause has been observed. State-based pp-causes have the advantage that they are realizable by “memoryless” monitors that only need the information on the current state of the Markov chain. Threshold-based monitors additionally need to track the weight that has been accumulated so far until the threshold value of the current state is exceeded. So, the memory requirements of monitors realizing a threshold-based pp-cause are bounded by the number of bits of the largest threshold value for SpS_{p}-states. All algorithms proposed in Section 4 for computing cost-minimal pp-causes will return pp-causes that are either state-based or threshold-based with polynomially bounded memory requirements.

3.2 Comparison to prima facie causes

The work [34] presents the notion of prima facie causes in DTMCs where both causes and events are formalized as PCTL state formulae. In our setting we can equivalently consider a state 𝑒𝑟𝑟𝑜𝑟S\mathit{error}\in S as the effect and a state subset CSC\subseteq S constituting the cause. We then reformulate [34, Definition 4.1] to our setting.

Definition 6 (cf. [34])

A set CSC\subseteq S is a pp-prima facie cause of 𝑒𝑟𝑟𝑜𝑟\lozenge\mathit{error} if the following three conditions hold:

  1. (1)

    The set CC is reachable from the initial state and 𝑒𝑟𝑟𝑜𝑟C\mathit{error}\notin C.

  2. (2)

    sC:Prs(𝑒𝑟𝑟𝑜𝑟)p\forall s\in C:\;\mathrm{Pr}_{s}(\lozenge\mathit{error})\geq p

  3. (3)

    Prs0(𝑒𝑟𝑟𝑜𝑟)<p\mathrm{Pr}_{s_{0}}(\lozenge\mathit{error})<p

The condition p>Prs0(𝑒𝑟𝑟𝑜𝑟)p>\mathrm{Pr}_{s_{0}}(\lozenge\mathit{error}) we discussed for pp-causes is hard-coded here as (3). In [34] the value pp is implicitly existentially quantified and thus conditions (2) and (3) can be combined to Prs(𝑒𝑟𝑟𝑜𝑟)>Prs0(𝑒𝑟𝑟𝑜𝑟)\mathrm{Pr}_{s}(\lozenge\mathit{error})>\mathrm{Pr}_{s_{0}}(\lozenge\mathit{error}) for all sCs\in C. This encapsulates the probability-raising property. However, 𝑒𝑟𝑟𝑜𝑟\mathit{error} may be reached while avoiding the cause CC, so pp-prima facie causes do not entail the counterfactuality principle. Definition 2 can be seen as an extension of pp-prima facie causes by virtue of the following lemma:

Lemma 2

For p>Prs0(𝑒𝑟𝑟𝑜𝑟)p>\mathrm{Pr}_{s_{0}}(\lozenge\mathit{error}) every pp-prima facie cause induces a state-based pp-cause.

Proof

Let CSC\subseteq S be a pp-prima facie cause. By condition (1) and (2) of Definition 6 we have CSp\{𝑒𝑟𝑟𝑜𝑟}C\subseteq S_{p}\backslash\{\mathit{error}\}. Since every path reaching 𝑒𝑟𝑟𝑜𝑟\mathit{error} trivially visits a state in Q:=C{𝑒𝑟𝑟𝑜𝑟}SpQ:=C\cup\{\mathit{error}\}\subseteq S_{p}, the set Π={s0snPathsfin(M)snQ and i<n:siQ}\Pi=\{s_{0}\ldots s_{n}\in\operatorname{Paths}_{\operatorname{fin}}(M)\mid s_{n}\in Q\text{ and }\forall i<n:\;s_{i}\notin Q\} is a state-based pp-cause. ∎

4 Costs of pp-causes

In this section we fix a DTMC MM with state space SS, unique initial state s0s_{0}, unique target and safe state 𝑒𝑟𝑟𝑜𝑟,𝑠𝑎𝑓𝑒S\mathit{error},\mathit{safe}\in S and a threshold p(0,1]p\in(0,1]. As motivated in the introduction, we equip the DTMC of our model with a weight function c:Sc\colon S\to\mathbb{Q} on states and consider the induced accumulated weight function c:Pathsfin(M)c\colon\operatorname{Paths}_{\operatorname{fin}}(M)\to\mathbb{Q}. These weights typically represent resources spent, e.g., energy, time, material, etc.

4.1 Expected cost of a pp-cause

Definition 7 (Expected cost)

Given a pp-cause Π\Pi for 𝑒𝑟𝑟𝑜𝑟\lozenge\mathit{error} in MM consider the random variable 𝒳:Paths(M)\mathcal{X}:\operatorname{Paths}(M)\to\mathbb{Q} with

𝒳(π)=c(π^) for {π^ΠPref(π) if such π^ existsπ^Pref(π) minimal with last(π^)=𝑠𝑎𝑓𝑒 otherwise.\mathcal{X}(\pi)=c(\hat{\pi})\text{ for }\begin{cases}\hat{\pi}\in\Pi\cap\operatorname{Pref}(\pi)&\text{ if such }\hat{\pi}\text{ exists}\\ \hat{\pi}\in\operatorname{Pref}(\pi)\text{ minimal with }\operatorname{last}(\hat{\pi})=\mathit{safe}&\text{ otherwise}.\end{cases}

Since Prs0({𝑒𝑟𝑟𝑜𝑟,𝑠𝑎𝑓𝑒})=1\mathrm{Pr}_{s_{0}}(\lozenge\{\mathit{error},\mathit{safe}\})=1, paths not falling under the two cases above have measure 0. Then the expected cost expcost(Π)\operatorname{expcost}(\Pi) of Π\Pi is the expected value of 𝒳\mathcal{X}.

The expected cost is a means by which the efficiency of causes for monitoring purposes can be estimated. Assume a pp-cause Π\Pi is used to monitor critical scenarios of a probabilistic system. This means that at some point either a critical scenario is predicted by the monitor (i.e., the execution seen so far lies in Π\Pi), or the monitor reports that no critical scenario will arise (i.e., 𝑠𝑎𝑓𝑒\mathit{safe} has been reached) and can therefore be turned off. If the weight function on the state space is chosen such that it models the cost of monitoring the respective states, then expcost(Π)\operatorname{expcost}(\Pi) estimates the average total resource consumption of the monitor.
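For a state-based pp-cause given by a set QQ, the expected cost solves a standard expected-reward system: stopping states contribute their own weight, every other state adds its weight to the expected value of its successors. A sketch on the hypothetical chain and weights from the earlier snippets:

```python
# Expected accumulated weight until the run first enters Q ∪ {safe}:
#   v(s) = c(s)                         if s ∈ Q ∪ {safe}
#   v(s) = c(s) + Σ P(s, s2)·v(s2)      otherwise
# Q must be chosen so that almost every error path visits it (completeness).
P = {
    "s":     {"t": 0.5, "u": 0.5},
    "t":     {"error": 0.875, "safe": 0.125},
    "u":     {"error": 0.75,  "safe": 0.25},
    "error": {"error": 1.0},
    "safe":  {"safe": 1.0},
}
c = {"s": 1, "t": 2, "u": 3, "error": 0, "safe": 5}   # hypothetical weights

def expcost_state_based(P, c, Q, s0, iters=500):
    stop = set(Q) | {"safe"}
    v = {s: 0.0 for s in P}
    for _ in range(iters):
        v = {s: c[s] if s in stop
             else c[s] + sum(p * v[s2] for s2, p in P[s].items())
             for s in P}
    return v[s0]
```

For Q = {t, u} every run stops after its second state, so the expected cost is c(s) + ½·c(t) + ½·c(u) = 3.5. As before, the iteration stands in for an exact linear-system solution.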

We say that a $p$-cause $\Pi$ is $\operatorname{expcost}$-minimal if for all $p$-causes $\Phi$ we have $\operatorname{expcost}(\Pi) \leq \operatorname{expcost}(\Phi)$. By $\operatorname{expcost}^{\min}$ we denote the value $\operatorname{expcost}(\Pi)$ of any $\operatorname{expcost}$-minimal $p$-cause $\Pi$.

Theorem (expcostComplexity)

(1) Given a non-negative weight function $c\colon S \to \mathbb{Q}_{\geq 0}$, the canonical $p$-cause $\Theta$ from Remark 1 is $\operatorname{expcost}$-minimal.

(2) For an arbitrary weight function $c\colon S \to \mathbb{Q}$, an $\operatorname{expcost}$-minimal state-based $p$-cause $\Pi$ and the value $\operatorname{expcost}^{\min}$ can be computed in polynomial time.

Proof

Statement (1) follows from the fact that if $\Pi \preceq \Phi$ holds for two $p$-causes, then $\operatorname{expcost}(\Pi) \leq \operatorname{expcost}(\Phi)$; this is shown in the appendix. The value $\operatorname{expcost}^{\min} = \operatorname{expcost}(\Theta)$ can then be computed in polynomial time using methods for expected rewards in Markov chains [1, Section 10.5].

To show (2), we reduce our problem to the stochastic shortest path problem (SSP) [4] from $s_0$ to $\{\mathit{error}, \mathit{safe}\}$. By Lemma 1, equivalence classes of schedulers in $\mathcal{C}_p(M)$ are in one-to-one correspondence with $p$-causes in $M$. Let $\Pi_{\mathfrak{S}}$ be the $p$-cause associated with a representative scheduler $\mathfrak{S}$. One can show that $\operatorname{expcost}(\Pi_{\mathfrak{S}})$ equals the expected accumulated weight of paths under the scheduler $\mathfrak{S}$ in $\mathcal{C}_p(M)$ upon reaching $\{\mathit{error}, \mathit{safe}\}$. A scheduler $\mathfrak{S}^*$ minimizing this value can be computed in polynomial time by solving the SSP in $\mathcal{C}_p(M)$ [4], and the algorithm returns a memoryless such $\mathfrak{S}^*$. It follows that $\Pi_{\mathfrak{S}^*}$ is an $\operatorname{expcost}$-minimal state-based $p$-cause. ∎
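The two ingredients of this computation can be sketched for a toy example. The following is a minimal sketch, not the paper's implementation: the 4-state chain, the state names, and the use of value iteration (in place of solving the linear equation systems of [1, Section 10.5]) are all assumptions made for illustration. It computes $\mathrm{Pr}_s(\lozenge\mathit{error})$, derives $S_p$, and evaluates $\operatorname{expcost}(\Theta)$ as the expected accumulated weight until the first visit to $S_p \cup \{\mathit{safe}\}$.

```python
# Hypothetical 4-state DTMC: s0 loops or moves to t or safe; t resolves
# to error or safe.  All names and numbers are illustrative only.
P = {
    "s0":    {"s0": 0.25, "t": 0.25, "safe": 0.5},
    "t":     {"error": 0.5, "safe": 0.5},
    "error": {"error": 1.0},
    "safe":  {"safe": 1.0},
}
cost = {"s0": 1.0, "t": 0.0, "error": 0.0, "safe": 0.0}
p = 0.5

def reach_error_probs(P, eps=1e-12):
    # Value iteration for Pr_s(<>error); an exact method would solve
    # the corresponding linear equation system instead.
    x = {s: (1.0 if s == "error" else 0.0) for s in P}
    while True:
        y = {s: 1.0 if s == "error" else
             sum(q * x[t] for t, q in P[s].items()) for s in P}
        if max(abs(y[s] - x[s]) for s in P) < eps:
            return y
        x = y

pr = reach_error_probs(P)
S_p = {s for s in P if pr[s] >= p}          # here: {"t", "error"}

def expcost_canonical(P, cost, S_p, eps=1e-12):
    # Expected accumulated cost until the first visit to S_p or safe,
    # i.e. expcost(Theta) with all S_p-states made terminal.
    goal = S_p | {"safe"}
    x = {s: 0.0 for s in P}
    while True:
        y = {s: 0.0 if s in goal else
             cost[s] + sum(q * x[t] for t, q in P[s].items()) for s in P}
        if max(abs(y[s] - x[s]) for s in P) < eps:
            return y
        x = y

print(expcost_canonical(P, cost, S_p)["s0"])   # approximately 4/3 here
```

For this chain $\mathrm{Pr}_{s_0}(\lozenge\mathit{error}) = 1/6 < p$ while $\mathrm{Pr}_t = 1/2 \geq p$, so the canonical cause picks paths at their first visit to $t$.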

4.2 Partial expected cost of a $p$-cause

In this section we study a variant of the expected cost in which paths with no prefix in the $p$-cause are assigned zero cost. A use case for this cost mechanism arises if costs are incurred not by monitoring the system, but by the countermeasures taken upon triggering the alarm. For example, an alarm might be followed by a downtime of the system whose cost depends on the current state and the history of the execution. In such cases no costs are incurred if no alarm is triggered.

Definition 8 (Partial expected cost)

For a $p$-cause $\Pi$ for $\lozenge\mathit{error}$ in $M$ consider the random variable $\mathcal{X}\colon \operatorname{Paths}(M) \to \mathbb{Q}$ with

\[\mathcal{X}(\pi) = \begin{cases} c(\hat{\pi}) & \text{for } \hat{\pi} \in \Pi \cap \operatorname{Pref}(\pi) \text{ if such a } \hat{\pi} \text{ exists} \\ 0 & \text{otherwise.} \end{cases}\]

The partial expected cost $\operatorname{pexpcost}(\Pi)$ of $\Pi$ is the expected value of $\mathcal{X}$.

The analogue of statement (1) of the theorem above does not hold for partial expected costs, as the following example shows.

Example 4
[Figure 5 shows a DTMC with states $s_0$ (weight $1$), $t$ (weight $0$), $u$ (weight $w$), $\mathit{error}$, and $\mathit{safe}$, with transition probabilities $P(s_0, s_0) = P(s_0, t) = 1/4$, $P(s_0, \mathit{safe}) = 1/2$, $P(t, \mathit{error}) = P(t, u) = 1/2$, $P(u, \mathit{error}) = 3/4$, and $P(u, \mathit{safe}) = 1/4$.]

Figure 5: An example showing that the partial expected cost is not monotone on $p$-causes even when $c$ is non-negative.

Consider the Markov chain depicted in Figure 5. For $p = 1/2$ and $\lozenge\mathit{error}$ we have $S_p = \{t, u, \mathit{error}\}$. The canonical $p$-cause is $\Theta = \{s_0^k t \mid k \geq 1\}$ with $\operatorname{pexpcost}(\Theta) = \sum_{k \geq 1} (1/4)^k \cdot k = 4/9$. Now let $\Pi$ be any $p$-cause for $\lozenge\mathit{error}$. If the path $s_0^\ell t$ belongs to $\Pi$, then it contributes $(1/4)^\ell \cdot \ell$ to $\operatorname{pexpcost}(\Pi)$. If instead the paths $s_0^\ell t\,\mathit{error}$ and $s_0^\ell t u\,\mathit{error}$ belong to $\Pi$, they contribute $(1/4)^\ell \cdot 1/2 \cdot \ell + (1/4)^\ell \cdot 1/2 \cdot 3/4 \cdot (\ell + w)$. So the latter case yields a smaller $\operatorname{pexpcost}$ if $\ell > 3w$, and the $\operatorname{pexpcost}$-minimal $p$-cause is therefore

\[\Pi = \{s_0^k t \mid 1 \leq k \leq 3w\} \cup \{s_0^k t\,\mathit{error},\; s_0^k t u\,\mathit{error} \mid 3w < k\}.\]

For $w = 1$, the partial expected cost of this $p$-cause is $511/1152 = 4/9 - 1/1152$, which is indeed smaller than $\operatorname{pexpcost}(\Theta)$.
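The two values in Example 4 can be checked numerically. The following sketch simply sums the path contributions stated above for $w = 1$, truncating the geometrically vanishing tails; the truncation depth is an arbitrary choice.

```python
# Numerical check of Example 4 (w = 1) for the chain of Figure 5.
w = 1
N = 2000  # truncation depth; the tails decay like (1/4)^k

# pexpcost(Theta): each path s0^k t has probability (1/4)^k and cost k.
theta = sum((0.25 ** k) * k for k in range(1, N))

# pexpcost(Pi): pick s0^k t for k <= 3w; for k > 3w wait and pick
# s0^k t error (prob. factor 1/2, cost k) and s0^k t u error
# (prob. factor 1/2 * 3/4, cost k + w).
pi = sum((0.25 ** k) * k for k in range(1, 3 * w + 1))
pi += sum((0.25 ** k) * (0.5 * k + 0.5 * 0.75 * (k + w))
          for k in range(3 * w + 1, N))

print(theta, pi)   # approximately 4/9 and 511/1152
```

The sums agree with the closed forms $4/9 \approx 0.4444$ and $511/1152 \approx 0.4436$ up to the negligible truncation error.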

Theorem (pexpcostPseudoP)

Given a non-negative weight function $c\colon S \to \mathbb{Q}_{\geq 0}$, a $\operatorname{pexpcost}$-minimal threshold-based $p$-cause $\Pi$ and the value $\operatorname{pexpcost}^{\min}$ can be computed in pseudo-polynomial time. Moreover, $\Pi$ has a polynomially bounded representation.

Proof

For the pseudo-polynomial time bound we apply the techniques of [38] for optimizing the partial expected cost in MDPs to the $p$-causal MDP $\mathcal{C}_p(M)$. It is shown in [38] that there is an optimal scheduler whose decision depends only on the current state and the accumulated weight, and that such a scheduler and its partial expectation can be computed in pseudo-polynomial time. It is further shown that a rational number $K$ can be computed in polynomial time such that for accumulated weights above $K$, an optimal scheduler has to minimize the probability of reaching $\mathit{error}$. In our case, this means choosing the action $\mathit{continue}$. Due to the special structure of $\mathcal{C}_p(M)$, we can further show that there is indeed a threshold $T(s)$ for each state $s$ such that the action $\mathit{pick}$ is optimal after a path $\hat{\pi}$ ending in $s$ if and only if $c(\hat{\pi}) < T(s)$. So a threshold-based $\operatorname{pexpcost}$-minimal $p$-cause can be computed in pseudo-polynomial time. Furthermore, we have $T(s) < K$ for each state $s$, and as $K$ has a polynomially bounded representation, the same applies to the values $T(s)$ for all states $s$. ∎

Since the causal MDP $\mathcal{C}_p(M)$ has a comparatively simple form, one might expect to do better than the pseudo-polynomial algorithm obtained by the reduction to [38]. Nevertheless, in the remainder of this section we argue that computing a $\operatorname{pexpcost}$-minimal $p$-cause is computationally hard, in contrast to $\operatorname{expcost}$ (cf. Definition 7). Recall that the complexity class $\mathtt{PP}$ [22] is characterized as the class of languages $\mathcal{L}$ that have a probabilistic polynomial-time bounded Turing machine $M_{\mathcal{L}}$ such that for all words $\tau$ one has $\tau \in \mathcal{L}$ if and only if $M_{\mathcal{L}}$ accepts $\tau$ with probability at least $1/2$ (cf. [25]). We use polynomial Turing reductions, which, in contrast to many-one reductions, may query an oracle for the problem we reduce to a polynomial number of times. A polynomial-time algorithm for a problem that is $\mathtt{PP}$-hard under polynomial Turing reductions would imply a collapse of the polynomial hierarchy [42]. We reduce the $\mathtt{PP}$-complete cost problem stated in [24, Theorem 3] to the problem of computing $\operatorname{pexpcost}^{\min}$.

[Figure 6 shows the DTMCs $N_i$ ($i = 0,1$), which embed the given DTMC $M$ together with auxiliary states $a$ and $c$ of weight $0$ and a state $b$ of weight $R + i$, using branching probabilities $2/3$, $1/3$, and $1/2$ towards $\mathit{safe}$ and $\mathit{error}$.]

Figure 6: The DTMCs $N_i$ for $i = 0,1$.
Theorem (pexpcostPP)

Given an acyclic DTMC $M$, a weight function $c\colon S \to \mathbb{N}$ and a rational $\vartheta \in \mathbb{Q}$, deciding whether $\operatorname{pexpcost}^{\min} \leq \vartheta$ is $\mathtt{PP}$-hard under Turing reductions.

Proof

We sketch a Turing reduction from the following problem, shown to be $\mathtt{PP}$-hard in [24]: given an acyclic DTMC $M$ over state space $S$ with initial state $s$, an absorbing state $t$ such that $\mathrm{Pr}_s(\lozenge t) = 1$, a weight function $c\colon S \to \mathbb{N}$, and a natural number $R \in \mathbb{N}$, decide whether

\[\mathrm{Pr}_M(\{\pi \in \operatorname{Paths}(M) \mid c(\pi) \leq R\}) \geq 1/2.\]

In an acyclic Markov chain $M$ the values of $\operatorname{pexpcost}$ have a polynomially bounded binary representation, as shown in the appendix. This allows a binary search that computes $\operatorname{pexpcost}^{\min}$ with polynomially many calls to the corresponding threshold problem. We use this procedure in a polynomial-time Turing reduction.
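The binary-search step can be sketched as follows. This is an illustrative sketch, not part of the reduction itself: the oracle, the upper bound, and the denominator bound are placeholder assumptions standing in for the threshold problem and the polynomially bounded representation. Once the search interval is shorter than $1/(2D^2)$, it contains a unique rational with denominator at most $D$, which can be recovered exactly.

```python
from fractions import Fraction

def min_pexpcost(oracle, upper, denom_bound):
    """Compute pexpcost^min exactly from a threshold oracle with
    oracle(theta) <=> pexpcost^min <= theta.  Assumes the value is a
    rational in [0, upper] with denominator <= denom_bound."""
    lo, hi = Fraction(0), Fraction(upper)
    gap = Fraction(1, 2 * denom_bound ** 2)   # two candidates differ by >= 1/D^2
    while hi - lo >= gap:
        mid = (lo + hi) / 2
        if oracle(mid):
            hi = mid          # invariant: pexpcost^min <= hi
        else:
            lo = mid          # invariant: pexpcost^min > lo
    # hi is closer to the true value than to any other bounded rational
    return hi.limit_denominator(denom_bound)

# Toy oracle standing in for the PP-hard threshold problem:
value = Fraction(511, 1152)
print(min_pexpcost(lambda t: value <= t, upper=1, denom_bound=1152))
```

The number of oracle calls is logarithmic in $\mathit{upper} \cdot D^2$, hence polynomial in the input size when $D$ has a polynomially bounded representation.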

Now let $M$ be a Markov chain as in [24, Theorem 3] and let $R$ be a natural number. We construct the two Markov chains $N_0$ and $N_1$ depicted in Figure 6. The $\operatorname{pexpcost}$-minimal $p$-cause in both Markov chains consists of all paths reaching $c$ with weight $\leq R$ and all paths reaching $\mathit{error}$ that do not have a prefix reaching $c$ with weight $\leq R$. The difference between the values $\operatorname{pexpcost}^{\min}$ of the two Markov chains depends only on the probability that a path of the minimal $p$-cause collects the additional weight $+1$ in $N_1$. This probability is $\frac{1}{6}\mathrm{Pr}_M(\{\pi \in \operatorname{Paths}(M) \mid c(\pi) \leq R\})$. By computing $\operatorname{pexpcost}^{\min}$ in $N_0$ and $N_1$ via repeated calls to the threshold problem as described above, we can hence decide the problem from [24, Theorem 3]. More details can be found in the appendix. ∎

4.3 Maximal cost of a $p$-cause

In practice, the weight function on the Markov chain often models resources whose consumption has a tight upper bound. For example, the amount of energy a drone can draw from its battery is naturally limited. Instead of just knowing that the consumption lies below a given bound on average, it is therefore often desirable to find monitors whose costs are guaranteed to lie below this limit on (almost) every evolution of the system.

Definition 9 (Maximal cost)

Let $\Pi$ be a $p$-cause for $\lozenge\mathit{error}$ in $M$. We define the maximal cost of $\Pi$ to be

\[\operatorname{maxcost}(\Pi) = \sup\{c(\hat{\pi}) \mid \hat{\pi} \in \Pi\}.\]

The maximal cost of a $p$-cause measures the worst-case resource consumption among executions of the system. Knowing the minimal value $\operatorname{maxcost}^{\min}$ over all $p$-causes therefore makes it possible to ensure that no critical scenario arises from resource management.

Theorem (maxcostSummary)

(1) Given a non-negative weight function $c\colon S \to \mathbb{Q}_{\geq 0}$, the canonical $p$-cause $\Theta$ is $\operatorname{maxcost}$-minimal and $\operatorname{maxcost}^{\min}$ can be computed in time polynomial in the size of $M$.

(2) For an arbitrary weight function $c\colon S \to \mathbb{Q}$, a $\operatorname{maxcost}$-minimal state-based $p$-cause $\Pi$ and $\operatorname{maxcost}^{\min}$ can be computed in pseudo-polynomial time.

(3) Given a rational $\vartheta \in \mathbb{Q}$, deciding whether $\operatorname{maxcost}^{\min} \leq \vartheta$ is in $\mathtt{NP} \cap \mathtt{coNP}$.

Proof

To show (1) it suffices to note that for non-negative weight functions, $\operatorname{maxcost}$ is monotone with respect to the partial order $\preceq$ on $p$-causes; therefore $\Theta$ is $\operatorname{maxcost}$-minimal. For (2) we reduce the problem to a max-cost reachability game as defined in [6]. The algorithm from [6] computes the lowest maximal cost and runs in pseudo-polynomial time. Since the minimizing player has a memoryless strategy, we can compute a set of states $Q \subseteq S_p$ on which a $\operatorname{maxcost}$-minimal $p$-cause $\Pi$ is based. For (3) we further reduce the max-cost reachability game from (2) to mean-payoff games, as in [8]. Mean-payoff games are known to lie in $\mathtt{NP} \cap \mathtt{coNP}$ [8]. ∎

4.4 Instantaneous cost

The given weight function $c$ on states also induces an instantaneous weight function $c_{\operatorname{inst}}\colon \operatorname{Paths}_{\operatorname{fin}}(M) \to \mathbb{Q}$ which just takes the weight of the state visited last, i.e., $c_{\operatorname{inst}}(s_0 \cdots s_n) = c(s_n)$. This yields an alternative cost mechanism intended to model situations where the cost of repairing or rebooting depends only on the current state, e.g., the altitude an automated drone has reached.

We add the subscript '$\operatorname{inst}$' to the three cost variants in which the accumulated weight function $c$ is replaced with the instantaneous weight function $c_{\operatorname{inst}}$; moreover, the error state is replaced by an error set $E$ and the safe state by a set of terminal safe states $F$. We thus optimize $p$-causes for $\lozenge E$ in $M$.

Theorem (instcost)

For $\operatorname{expcost}_{\operatorname{inst}}$, $\operatorname{pexpcost}_{\operatorname{inst}}$, and $\operatorname{maxcost}_{\operatorname{inst}}$, a cost-minimal $p$-cause $\Pi$ and the value of the minimal cost can be computed in time polynomial in $M$. In all cases $\Pi$ can be chosen to be a state-based $p$-cause.

Proof

We first note that $\operatorname{pexpcost}_{\operatorname{inst}}$ reduces to $\operatorname{expcost}_{\operatorname{inst}}$ by setting the weight of all states in $F$ to $0$. We then construct an MDP (different from $\mathcal{C}_p(M)$) which emulates the instantaneous weight function via an accumulated weight function. Thus, finding an $\operatorname{expcost}_{\operatorname{inst}}$-minimal $p$-cause $\Pi$ reduces to the SSP of [4], which can be solved in polynomial time. The solution admits a memoryless scheduler, and thus $\Pi$ is state-based in this case.

For $\operatorname{maxcost}_{\operatorname{inst}}$ we order the states in $S_p$ by their cost and iteratively remove the states with lowest cost until $E$ is no longer reachable. The set $Q$ of removed states induces a state-based $\operatorname{maxcost}_{\operatorname{inst}}$-minimal $p$-cause $\Pi$. This yields a polynomial-time procedure to compute $\operatorname{maxcost}_{\operatorname{inst}}^{\min}$ and $\Pi$. ∎
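The greedy procedure for $\operatorname{maxcost}_{\operatorname{inst}}$ can be sketched as follows. This is an illustrative sketch under stated assumptions: the DTMC is given as nested dicts, $E$ is assumed reachable initially, and the example chain (that of Figure 5 with $w = 1$, reread with instantaneous costs) is chosen for illustration only.

```python
def maxcost_inst_min(P, cost, S_p, E, s0):
    # Greedy sketch: block S_p-states in order of increasing
    # instantaneous cost until E becomes unreachable from s0; the
    # blocked set Q induces a state-based p-cause, and the largest
    # cost in Q bounds maxcost_inst.
    def error_reachable(blocked):
        seen, stack = set(), [s0]
        while stack:
            s = stack.pop()
            if s in seen or s in blocked:
                continue
            seen.add(s)
            stack.extend(t for t, q in P[s].items() if q > 0)
        return bool(seen & E)

    Q = set()
    for s in sorted(S_p, key=lambda s: cost[s]):   # cheapest first
        if not error_reachable(Q):
            break
        Q.add(s)
    return Q, max(cost[s] for s in Q)

# Chain of Figure 5 with w = 1, read with instantaneous state costs:
P = {"s0": {"s0": 0.25, "t": 0.25, "safe": 0.5},
     "t": {"error": 0.5, "u": 0.5},
     "u": {"error": 0.75, "safe": 0.25},
     "error": {}, "safe": {}}
cost = {"s0": 1, "t": 0, "u": 1, "error": 0, "safe": 0}
Q, m = maxcost_inst_min(P, cost, {"t", "u", "error"}, {"error"}, "s0")
print(Q, m)   # a single zero-cost state already cuts off error
```

Here a single zero-cost state from $S_p$ separates $s_0$ from $\mathit{error}$, so $\operatorname{maxcost}_{\operatorname{inst}}^{\min} = 0$.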

5 Conclusion

We combined the counterfactuality principle and the probability-raising property into the notion of $p$-causes in DTMCs. In order to find suitable $p$-causes we defined different cost models and gave algorithms to compute the corresponding cost-minimal causes.

Cyber-physical systems are often not fully probabilistic, but involve a certain amount of control in the form of decisions depending on the system state. Such systems can be modeled by MDPs, to which we intend to generalize the causality framework presented here. Our approach also assumes that the probabilistic system described by the Markov chain is fully observable. By considering execution traces instead of paths of the system, the notion of $p$-causes generalizes naturally to hidden Markov models. However, the corresponding computational problems exhibit additional difficulties, which we will address in future work.

References

  • [1] Baier, C., Katoen, J.P.: Principles of Model Checking (Representation and Mind Series). The MIT Press, Cambridge, MA (2008)
  • [2] Bartocci, E., Grosu, R., Karmarkar, A., Smolka, S.A., Stoller, S.D., Zadok, E., Seyster, J.: Adaptive Runtime Verification. In: Runtime Verification. pp. 168–182. Springer Berlin Heidelberg (2013). https://doi.org/10.1007/978-3-642-35632-2_18
  • [3] Beer, I., Ben-David, S., Chockler, H., Orni, A., Trefler, R.: Explaining counterexamples using causality. In: Computer Aided Verification (CAV’09). pp. 94–108. Springer Berlin Heidelberg (2009). https://doi.org/10.1007/978-3-642-02658-4_11
  • [4] Bertsekas, D.P., Tsitsiklis, J.N.: An analysis of stochastic shortest path problems. Math. Oper. Res. 16(3), 580–595 (1991)
  • [5] Braham, M., van Hees, M.: An anatomy of moral responsibility. Mind 121 (483), 601–634 (2012)
  • [6] Brihaye, T., Geeraerts, G., Haddad, A., Monmege, B.: To Reach or not to Reach? Efficient Algorithms for Total-Payoff Games. In: Proceedings of the 26th International Conference on Concurrency Theory (CONCUR’15). LIPIcs, vol. 42, pp. 297–310 (2015). https://doi.org/10.4230/LIPIcs.CONCUR.2015.297
  • [7] Chadha, R., Sistla, A.P., Viswanathan, M.: On the expressiveness and complexity of randomization in finite state monitors. J. ACM 56(5) (2009). https://doi.org/10.1145/1552285.1552287
  • [8] Chatterjee, K., Doyen, L., Henzinger, T.A.: The Cost of Exactness in Quantitative Reachability. In: Models, Algorithms, Logics and Tools. pp. 367–381. Springer International Publishing, Cham (2017). https://doi.org/10.1007/978-3-319-63121-9_18
  • [9] Chockler, H., Halpern, J.Y.: Responsibility and Blame: A Structural-Model Approach. J. Artif. Int. Res. 22(1), 93–115 (Oct 2004)
  • [10] Chockler, H., Halpern, J.Y., Kupferman, O.: What causes a system to satisfy a specification? ACM Transactions on Computational Logic 9(3), 20:1–20:26 (2008)
  • [11] Cini, C., Francalanza, A.: An LTL Proof System for Runtime Verification. In: Baier, C., Tinelli, C. (eds.) Tools and Algorithms for the Construction and Analysis of Systems. pp. 581–595. Springer Berlin Heidelberg (2015)
  • [12] Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, Third Edition. The MIT Press, 3rd edn. (2009)
  • [13] Daca, P., Henzinger, T.A., Křetínský, J., Petrov, T.: Faster Statistical Model Checking for Unbounded Temporal Properties. In: Tools and Algorithms for the Construction and Analysis of Systems. pp. 112–129. Springer Berlin Heidelberg (2016). https://doi.org/10.1007/978-3-662-49674-9_7
  • [14] Dash, D., Voortman, M., De Jongh, M.: Sequences of Mechanisms for Causal Reasoning in Artificial Intelligence. In: Proceedings of the 23rd International Joint Conference on Artificial Intelligence. p. 839–845. IJCAI ’13, AAAI Press (2013)
  • [15] Eells, E.: Probabilistic Causality. Cambridge Studies in Probability, Induction and Decision Theory, Cambridge University Press (1991)
  • [16] Eiter, T., Lukasiewicz, T.: Complexity results for explanations in the structural-model approach. Artificial Intelligence 154(1-2), 145–198 (2004)
  • [17] Eiter, T., Lukasiewicz, T.: Causes and explanations in the structural-model approach: Tractable cases. Artificial Intelligence 170(6-7), 542–580 (2006)
  • [18] Esparza, J., Kiefer, S., Kretinsky, J., Weininger, M.: Online monitoring ω\omega-regular properties in unknown Markov chains. Arxiv preprint, 2010.08347 (2020)
  • [19] Faran, R., Kupferman, O.: Spanning the spectrum from safety to liveness. Acta Informatica 55(8), 703–732 (2018). https://doi.org/10.1007/s00236-017-0307-4
  • [20] Feigenbaum, J., Hendler, J.A., Jaggard, A.D., Weitzner, D.J., Wright, R.N.: Accountability and Deterrence in Online Life. In: Proceedings of WebSci ’11. ACM, New York, NY, USA (2011). https://doi.org/10.1145/2527031.2527043
  • [21] Fenton-Glynn, L.: A Proposed Probabilistic Extension of the Halpern and Pearl Definition of ‘Actual Cause’. The British Journal for the Philosophy of Science 68(4), 1061–1124 (2016)
  • [22] Gill, J.: Computational complexity of probabilistic Turing machines. SIAM Journal on Computing 6(4), 675–695 (1977). https://doi.org/10.1137/0206049
  • [23] Gondi, K., Patel, Y., Sistla, A.P.: Monitoring the Full Range of ω\omega-Regular Properties of Stochastic Systems. In: Proceedings of VMCAI’09. pp. 105–119. Springer Berlin Heidelberg (2009). https://doi.org/10.1007/978-3-540-93900-9_12
  • [24] Haase, C., Kiefer, S.: The Odds of Staying on Budget. In: Automata, Languages, and Programming. pp. 234–246. Springer Berlin Heidelberg (2015). https://doi.org/10.1007/978-3-662-47666-6_19
  • [25] Haase, C., Kiefer, S.: The Complexity of the Kth Largest Subset Problem and Related Problems. Inf. Process. Lett. 116(2), 111–115 (2016). https://doi.org/10.1016/j.ipl.2015.09.015
  • [26] Halpern, J.Y.: A Modification of the Halpern-Pearl Definition of Causality. In: Proceedings of IJCAI’15. p. 3022–3033. AAAI Press (2015)
  • [27] Halpern, J.Y., Pearl, J.: Causes and Explanations: A Structural-Model Approach: Part 1: Causes. In: Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence (UAI). pp. 194–202 (2001)
  • [28] Huang, Y., Kleinberg, S.: Fast and Accurate Causal Inference from Time Series Data. In: Proceedings of FLAIRS 2015. pp. 49–54. AAAI Press (2015)
  • [29] Ibrahim, A., Pretschner, A.: From Checking to Inference: Actual Causality Computations as Optimization Problems. In: Proceedings of ATVA’20. pp. 343–359. Springer Internat. Publishing, Cham (2020). https://doi.org/10.1007/978-3-030-59152-6_19
  • [30] Ibrahim, A., Pretschner, A., Klesel, T., Zibaei, E., Kacianka, S., Pretschner, A.: Actual Causality Canvas: A General Framework for Explanation-Based Socio-Technical Constructs. In: Proceedings of ECAI’20. pp. 2978 – 2985. IOS Press Ebooks (2020). https://doi.org/10.3233/FAIA200472
  • [31] Kalajdzic, K., Bartocci, E., Smolka, S.A., Stoller, S.D., Grosu, R.: Runtime Verification with Particle Filtering. In: Runtime Verification. pp. 149–166. Springer Berlin Heidelberg (2013). https://doi.org/10.1007/978-3-642-40787-1_9
  • [32] Kleinberg, S.: A Logic for Causal Inference in Time Series with Discrete and Continuous Variables. In: Proceedings of IJCAI’11. pp. 943–950 (2011)
  • [33] Kleinberg, S., Hripcsak, G.: A review of causal inference for biomedical informatics. J Biomed Inform. 44(6), 1102–12 (2011). https://doi.org/10.1016/j.jbi.2011.07.001
  • [34] Kleinberg, S., Mishra, B.: The Temporal Logic of Causal Structures. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI). pp. 303–312 (2009)
  • [35] Kleinberg, S., Mishra, B.: The Temporal Logic of Token Causes. In: Proceedings of KR’10. p. 575–577. AAAI Press (2010)
  • [36] Miller, T.: Explanation in Artificial Intelligence: Insights from the Social Sciences. Artificial Intelligence 267 (06 2017). https://doi.org/10.1016/j.artint.2018.07.007
  • [37] Pearl, J.: Causality. Cambridge University Press, 2nd edn. (2009)
  • [38] Piribauer, J., Baier, C.: Partial and conditional expectations in Markov decision processes with integer weights. In: Proceedings of FoSSaCS’19. Lecture Notes in Computer Science, vol. 11425, pp. 436–452. Springer (2019)
  • [39] Reichenbach, H.: The Direction of Time. Dover Publications (1956)
  • [40] Sistla, A.P., Srinivas, A.R.: Monitoring Temporal Properties of Stochastic Systems. In: Logozzo, F., Peled, D.A., Zuck, L.D. (eds.) Verification, Model Checking, and Abstract Interpretation. pp. 294–308. Springer Berlin Heidelberg (2008)
  • [41] Stoller, S.D., Bartocci, E., Seyster, J., Grosu, R., Havelund, K., Smolka, S.A., Zadok, E.: Runtime Verification with State Estimation. In: Runtime Verification. pp. 193–207. Springer Berlin Heidelberg (2012). https://doi.org/10.1007/978-3-642-29860-8_15
  • [42] Toda, S.: PP is as Hard as the Polynomial-Time Hierarchy. SIAM Journal on Computing 20(5), 865–877 (1991). https://doi.org/10.1137/0220053
  • [43] Vennekens, J., Bruynooghe, M., Denecker, M.: Embracing Events in Causal Modelling: Interventions and Counterfactuals in CP-Logic. In: Logics in Artificial Intelligence. pp. 313–325. Springer Berlin Heidelberg (2010)
  • [44] Vennekens, J., Denecker, M., Bruynooghe, M.: CP-logic: A language of causal probabilistic events and its relation to logic programming. Theory and Practice of Logic Programming 9(3), 245–308 (2009). https://doi.org/10.1017/S1471068409003767
  • [45] Zheng, M., Kleinberg, S.: A Method for Automating Token Causal Explanation and Discovery. In: Proceedings of FLAIRS’17 (2017)

6 Appendix

We denote nondeterministic finite automata as tuples $\mathcal{A} = (Q, \Sigma, Q_0, T, F)$, where $Q$ is the set of states, $\Sigma$ is the alphabet, $Q_0 \subseteq Q$ is the set of initial states, $T\colon Q \times \Sigma \to 2^Q$ is the transition function, and $F \subseteq Q$ is the set of final states. A transition $q' \in T(q, \alpha)$ is also denoted by $q \overset{\alpha}{\mapsto} q'$.

Theorem (canonicalpath; restated)

Proof

The first condition, $s_n \in S_p = \{s \in S \mid \mathrm{Pr}_s(\lozenge\mathit{error}) \geq p\}$, ensures that every path in $\Theta$ is a $p$-critical prefix for $\lozenge\mathit{error}$ in $M$. The second condition ensures that no proper prefix is a $p$-critical prefix, and therefore $\Theta$ is prefix-free. Clearly, every path reaching $\mathit{error}$ passes through $S_p$, since $\mathit{error}$ itself is an element of $S_p$. These properties together show that $\Theta$ is a $p$-cause.

Let $\Pi$ be another $p$-cause for $\lozenge\mathit{error}$ in $M$ and consider an arbitrary path $s_0 \ldots s_n \in \Pi$. This means $s_n \in S_p$. If $\mathrm{Pr}_{s_i}(\lozenge\mathit{error}) < p$ holds for all $i < n$, then $s_0 \ldots s_n \in \Theta$ and there is nothing to prove. Therefore assume that there exists $i < n$ with $s_i \in S_p$. For the minimal such $i$ we have $s_0 \ldots s_i \in \Theta$, and thus $\Theta \preceq \Pi$.

To prove that $\Theta$ is regular, consider the deterministic finite automaton $\mathcal{A} = (Q, \Sigma, q_0, T, F)$ with $Q = S \cup \{\iota\}$, $\Sigma = S$, $q_0 = \iota$, and final states $F = S_p$. The transition relation is given by

\[T = \{\iota \overset{s_0}{\mapsto} s_0\} \cup \{s \overset{s'}{\mapsto} s' \mid P(s, s') > 0 \wedge s \notin S_p\},\]

i.e., all states in $S_p$ are terminal. We argue that this automaton describes the language $\Theta$:

If $s_0 \in S_p$, then no transitions are possible, and $\mathcal{L}_{\mathcal{A}} = \{s_0\} = \Theta$ by definition.

For $\mathcal{L}_{\mathcal{A}} \subseteq \Theta$, let $s_0 \ldots s_n$ be a word accepted by $\mathcal{A}$. By definition we have $s_n \in S_p$. For any state $s \in S_p$ visited by $s_0 \cdots s_n$ we know that $s = s_n$, since all states in $S_p$ are terminal. This means $\mathrm{Pr}_{s_i}(\lozenge\mathit{error}) < p$ for all $i < n$, and thus $s_0 \ldots s_n \in \Theta$.

For $\mathcal{L}_{\mathcal{A}} \supseteq \Theta$, let $s_0 \ldots s_n \in \Theta$. A run for $s_0 \ldots s_n$ in $\mathcal{A}$ is possible, since there is an edge for every transition of $M$ except for the outgoing edges of states in $S_p$. These exceptions are avoided: for $i < n$ we have $\mathrm{Pr}_{s_i}(\lozenge\mathit{error}) < p$, so no state but the last one lies in $S_p$. On the other hand, $s_n \in S_p$ by assumption, and thus the word $s_0 \ldots s_n$ is accepted by $\mathcal{A}$. ∎
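The automaton $\mathcal{A}$ can be sketched directly from the transition matrix. The following is an illustrative sketch, not part of the proof: the dict representation and the reuse of the chain of Example 4 (with $S_p = \{t, u, \mathit{error}\}$) are assumptions made for illustration. Undefined transitions are modeled by rejecting immediately, so states in $S_p$ act as terminal accepting states.

```python
# Sketch of the DFA A from the proof: states of M plus a fresh initial
# state iota; every S_p-state is terminal and accepting.
def make_theta_dfa(P, S_p, s0):
    def step(q, a):
        if q == "iota":
            return s0 if a == s0 else None   # only the initial edge iota -> s0
        if q in S_p:                          # terminal: no outgoing edges
            return None
        return a if P[q].get(a, 0) > 0 else None

    def accepts(word):
        q = "iota"
        for a in word:
            q = step(q, a)
            if q is None:                     # undefined transition: reject
                return False
        return q in S_p

    return accepts

# The chain of Example 4 (Figure 5) with p = 1/2, so S_p = {t, u, error}:
P = {"s0": {"s0": 0.25, "t": 0.25, "safe": 0.5},
     "t": {"error": 0.5, "u": 0.5},
     "u": {"error": 0.75, "safe": 0.25},
     "error": {}, "safe": {}}
in_theta = make_theta_dfa(P, {"t", "u", "error"}, "s0")
print(in_theta(["s0", "s0", "t"]), in_theta(["s0", "t", "error"]))
```

The path $s_0 s_0 t$ lies in $\Theta$ (first visit to $S_p$ at its last state), while $s_0 t\,\mathit{error}$ does not, since its proper prefix $s_0 t$ is already $p$-critical.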

In the following proof, let $\operatorname{Cyl}(\Pi) = \bigcup_{\pi \in \Pi} \operatorname{Cyl}(\pi)$ for $\Pi \subseteq \operatorname{Paths}_{\operatorname{fin}}(M)$.

Theorem (expcostComplexity; restated)

Proof

To show (1), we claim that for two $p$-causes $\Pi$ and $\Phi$ with $\Pi \preceq \Phi$ we have $\operatorname{expcost}(\Pi) \leq \operatorname{expcost}(\Phi)$. Let $\mathcal{X}_\Pi$ and $\mathcal{X}_\Phi$ be the random variables underlying $\operatorname{expcost}(\Pi)$ and $\operatorname{expcost}(\Phi)$, respectively. We prove the claim by showing $\mathcal{X}_\Pi(\pi) \leq \mathcal{X}_\Phi(\pi)$ for almost all $\pi \in \operatorname{Paths}(M)$; in the argument below we ignore cases of measure 0.

From $\Pi \preceq \Phi$ it follows that $\operatorname{Cyl}(\Pi) \supseteq \operatorname{Cyl}(\Phi)$. We now deal separately with the three cases of the partition

\[\operatorname{Paths}(M) = \operatorname{Cyl}(\Phi) \;\dot{\cup}\; \bigl(\operatorname{Cyl}(\Pi) \setminus \operatorname{Cyl}(\Phi)\bigr) \;\dot{\cup}\; \bigl(\operatorname{Paths}(M) \setminus \operatorname{Cyl}(\Pi)\bigr).\]

For $s_0 s_1 \ldots \in \operatorname{Cyl}(\Phi)$, both random variables consider the unique prefix of $s_0 s_1 \ldots$ in the respective $p$-cause. For any $s_0 \ldots s_n \in \Phi$ we know by the definition of $\preceq$ that there is $s_0 \ldots s_m \in \Pi$ with $m \leq n$. Therefore

\[\mathcal{X}_\Pi(s_0 s_1 \ldots) = c(s_0 \ldots s_m) \leq c(s_0 \ldots s_n) = \mathcal{X}_\Phi(s_0 s_1 \ldots).\]

For $s_0 s_1 \ldots \in \operatorname{Cyl}(\Pi) \setminus \operatorname{Cyl}(\Phi)$, the random variable $\mathcal{X}_\Pi$ takes the unique prefix $s_0 \ldots s_m \in \Pi$, whereas $\mathcal{X}_\Phi$ takes the path $s_0 \ldots s_n$ with $s_n = \mathit{safe}$. Since $\mathrm{Pr}_{\mathit{safe}}(\lozenge\mathit{error}) = 0$, we have $m < n$ almost surely, and therefore

\[\mathcal{X}_\Pi(s_0 s_1 \ldots) = c(s_0 \ldots s_m) \leq c(s_0 \ldots s_n) = \mathcal{X}_\Phi(s_0 s_1 \ldots).\]

For $s_0 s_1 \ldots \in \operatorname{Paths}(M) \setminus \operatorname{Cyl}(\Pi)$, both random variables evaluate the same path, so $\mathcal{X}_\Pi(s_0 s_1 \ldots) = \mathcal{X}_\Phi(s_0 s_1 \ldots)$. In all three cases we thus have $\mathcal{X}_\Pi(\pi) \leq \mathcal{X}_\Phi(\pi)$.

To compute $\operatorname{expcost}^{\min} = \operatorname{expcost}(\Theta)$, consider the DTMC $M$ with its transitions changed such that every state in $S_p$ is terminal. The value $\operatorname{expcost}(\Theta)$ is then the expected reward $\operatorname{ExpRew}(s_0 \models \lozenge(S_p \cup \{\mathit{safe}\}))$ defined in [1, Definition 10.71]. Expected rewards in DTMCs can be computed in polynomial time via a classical linear equation system [1, Section 10.5.1].

We show (2) by reducing the problem of finding an $\operatorname{expcost}$-minimal $p$-cause to the stochastic shortest path problem (SSP) [4] in $\mathcal{C}_p(M)$. For a scheduler $\mathfrak{S}$ of $\mathcal{C}_p(M)$, denote the corresponding $p$-cause for $\lozenge\mathit{error}$ in $M$ by $\Pi_{\mathfrak{S}}$ (cf. Lemma 1). In [4] the authors define the expected cost of a scheduler $\mathfrak{S}$ in an MDP; with respect to the target set $\{\mathit{error}, \mathit{safe}\}$, this value coincides with $\operatorname{expcost}(\Pi_{\mathfrak{S}})$.

The SSP asks for a scheduler $\mathfrak{S}^*$ in $\mathcal{C}_p(M)$ minimizing the expected cost. It is shown in [4] that a memoryless such $\mathfrak{S}^*$ and the value $\operatorname{expcost}^{\min}$ can be computed in polynomial time. The scheduler $\mathfrak{S}^*$ corresponds to an $\operatorname{expcost}$-minimal state-based $p$-cause $\Pi$. ∎

Theorem (pexpcostPseudoP; restated)

Proof

Consider a scheduler $\mathfrak{S}$ of $\mathcal{C}_p(M)$ with weight function $c$, and recall that $\mathcal{C}_p(M)_{\mathfrak{S}}$ denotes the Markov chain induced by $\mathfrak{S}$. Define the random variable $\oplus_{\mathfrak{S}}\mathit{error}\colon \operatorname{Paths}(\mathcal{C}_p(M)_{\mathfrak{S}}) \to \mathbb{Q}$ by

\[\oplus_{\mathfrak{S}}\mathit{error}(\pi) = \begin{cases} c(\pi) & \text{if } \pi \in \lozenge\mathit{error} \\ 0 & \text{otherwise.} \end{cases}\]

The partial expectation 𝔼𝔖{\mathbb{PE}}^{\mathfrak{S}} of a scheduler 𝔖\mathfrak{S} in 𝒞p(M)\mathcal{C}_{p}(M) is defined as the expected value of 𝔖𝑒𝑟𝑟𝑜𝑟\oplus_{\mathfrak{S}}\mathit{error}. The minimal partial expectation is 𝔼min=inf𝔖𝔼𝔖{\mathbb{PE}}^{\min}=\inf_{\mathfrak{S}}{\mathbb{PE}}^{\mathfrak{S}}. It is known that there is a scheduler obtaining the minimal partial expectation [38].

Then a pp-cause Π\Pi and the corresponding scheduler 𝔖Π\mathfrak{S}_{\Pi} satisfy pexpcost(Π)=𝔼𝔖Π\operatorname{pexpcost}(\Pi)=\mathbb{PE}^{\mathfrak{S}_{\Pi}}. A cost-minimal scheduler for the partial expectation in an MDP with non-negative weights can be computed in pseudo-polynomial time by [38]. In this process we also compute pexpcostmin=𝔼min\operatorname{pexpcost}^{\min}={\mathbb{PE}}^{\min}.

Furthermore, we can show that once the action continuecontinue is optimal in a state ss with accumulated weight ww, it is also optimal for all weights w>ww^{\prime}>w: Suppose choosing continuecontinue is optimal in some state ss when the accumulated weight is ww. Let EE be the partial expected accumulated weight that the optimal scheduler collects from then on and let q=Prs(𝑒𝑟𝑟𝑜𝑟)q=\mathrm{Pr}_{s}(\lozenge\mathit{error}). The optimality of continuecontinue implies that E+wqwE+w\cdot q\leq w. For all w>ww^{\prime}>w, this implies E+wqwE+w^{\prime}\cdot q\leq w^{\prime} as well. We conclude the existence of T:ST:S\to\mathbb{Q} such that continuecontinue is optimal if and only if the accumulated weight is at least T(s)T(s). If pickpick is not enabled in a state ss, we have T(s)=0T(s)=0. Therefore Π\Pi is a threshold-based pp-cause defined by TT. As shown in [38], there is a saturation point KK such that schedulers minimizing the partial expectation can be chosen to behave memorylessly as soon as the accumulated weight exceeds KK. This means that T(s)T(s) can be chosen to be either at most KK or \infty for each state ss. The saturation point KK and hence all thresholds T(s)T(s) have a polynomial representation. ∎

\pexpcostPP

*

Proof
Figure 7: The DTMCs NiN_{i} for i=0,1i=0,1

We provide a Turing reduction from the following problem that is shown to be 𝙿𝙿\mathtt{PP}-hard in [24]: Given an acyclic DTMC MM over state space SS with initial state ss, absorbing state tt such that Prs(t)=1\mathrm{Pr}_{s}(\lozenge t)=1, weight function c:Sc:S\to\mathbb{N} and natural number RR\in\mathbb{N}, decide whether

PrM({φPaths(M)c(φ)R})12.\mathrm{Pr}_{M}(\{\varphi\in\operatorname{Paths}(M)\mid c(\varphi)\leq R\})\geq\frac{1}{2}.

Given such an acyclic DTMC MM we construct the two DTMCs N0N_{0} and N1N_{1} depicted in Figure 7. We consider the pexpcost\operatorname{pexpcost}-minimal pp-causes for p=1/2p=1/2 in NiN_{i} for 𝑒𝑟𝑟𝑜𝑟\lozenge\mathit{error} and i{0,1}i\in\{0,1\}. Suppose a path π\pi arrives at state cic_{i} with probability Prs(π)\mathrm{Pr}_{s}(\pi) and accumulated weight ww. We have to decide whether the path π\pi or the extension π=πbi𝑒𝑟𝑟𝑜𝑟\pi^{\prime}=\pi b_{i}\mathit{error} should be included in the cost-minimal pp-cause. The path π\pi^{\prime} has weight w+R+iw+R+i and probability Prs(π)/2\mathrm{Pr}_{s}(\pi)/2. We observe that π\pi is the optimal choice if

Prs(π)wPrs(π)w+R+i2.\mathrm{Pr}_{s}(\pi)\cdot w\leq\mathrm{Pr}_{s}(\pi)\cdot\frac{w+R+i}{2}.

This is the case if and only if wR+iw\leq R+i. If i=1i=1 and w=R+iw=R+i, both choices are equally good and we decide to include the path π\pi in this case. Hence, the pexpcost\operatorname{pexpcost}-minimal pp-cause for p=1/2p=1/2 in N0N_{0} is

Π0=\displaystyle\Pi_{0}= {π^Pathsfin(N0)last(π^)=c0 and c(π^)R}\displaystyle\{\hat{\pi}\in\operatorname{Paths}_{\operatorname{fin}}(N_{0})\mid\operatorname{last}(\hat{\pi})=c_{0}\text{ and }c(\hat{\pi})\leq R\}\cup
{π^Pathsfin(N0)last(π^)=𝑒𝑟𝑟𝑜𝑟 and c(π^)>2R}.\displaystyle\{\hat{\pi}\ \in\operatorname{Paths}_{\operatorname{fin}}(N_{0})\mid\operatorname{last}(\hat{\pi})=\mathit{error}\text{ and }c(\hat{\pi})>2R\}.

Similarly in N1N_{1} the following pp-cause is pexpcost\operatorname{pexpcost}-minimal:

Π1=\displaystyle\Pi_{1}= {π^Pathsfin(N1)last(π^)=c1 and c(π^)R}\displaystyle\{\hat{\pi}\in\operatorname{Paths}_{\operatorname{fin}}(N_{1})\mid\operatorname{last}(\hat{\pi})=c_{1}\text{ and }c(\hat{\pi})\leq R\}\cup
{π^Pathsfin(N1)last(π^)=𝑒𝑟𝑟𝑜𝑟 and c(π^)>2R+1}.\displaystyle\{\hat{\pi}\ \in\operatorname{Paths}_{\operatorname{fin}}(N_{1})\mid\operatorname{last}(\hat{\pi})=\mathit{error}\text{ and }c(\hat{\pi})>2R+1\}.

Therefore we have

3pexpcostN0(Π0)=\displaystyle 3\cdot\operatorname{pexpcost}_{N_{0}}(\Pi_{0})= φPathsfin(M),c(φ)RPr(φ)c(φ)\displaystyle\sum_{\varphi\in\operatorname{Paths}_{\operatorname{fin}}(M),c(\varphi)\leq R}\mathrm{Pr}(\varphi)c(\varphi)
+φPathsfin(M),c(φ)>RPr(φ)c(φ)+R2,\displaystyle+\sum_{\varphi\in\operatorname{Paths}_{\operatorname{fin}}(M),c(\varphi)>R}\mathrm{Pr}(\varphi)\frac{c(\varphi)+R}{2},
3pexpcostN1(Π1)=\displaystyle 3\cdot\operatorname{pexpcost}_{N_{1}}(\Pi_{1})= φPathsfin(M),c(φ)RPr(φ)c(φ)\displaystyle\sum_{\varphi\in\operatorname{Paths}_{\operatorname{fin}}(M),c(\varphi)\leq R}\mathrm{Pr}(\varphi)c(\varphi)
+φPathsfin(M),c(φ)>RPr(φ)c(φ)+R+12.\displaystyle+\sum_{\varphi\in\operatorname{Paths}_{\operatorname{fin}}(M),c(\varphi)>R}\mathrm{Pr}(\varphi)\frac{c(\varphi)+R+1}{2}.

We conclude that

pexpcostN1minpexpcostN0min\displaystyle\operatorname{pexpcost}_{N_{1}}^{\min}-\operatorname{pexpcost}_{N_{0}}^{\min}
=\displaystyle= pexpcostN1(Π1)pexpcostN0(Π0)\operatorname{pexpcost}_{N_{1}}(\Pi_{1})-\operatorname{pexpcost}_{N_{0}}(\Pi_{0})
=\displaystyle= 16PrM({φPaths(M)c(φ)>R})\frac{1}{6}\cdot\mathrm{Pr}_{M}(\{\varphi\in\operatorname{Paths}(M)\mid c(\varphi)>R\})
=\displaystyle= 16(1PrM({φPaths(M)c(φ)R})).\frac{1}{6}\cdot(1-\mathrm{Pr}_{M}(\{\varphi\in\operatorname{Paths}(M)\mid c(\varphi)\leq R\})).

In the sequel we prove that we can use an oracle for the threshold problem for pexpcostmin\operatorname{pexpcost}^{\min} to compute the values

pexpcostN0minandpexpcostN1min.\operatorname{pexpcost}_{N_{0}}^{\min}\qquad\text{and}\qquad\operatorname{pexpcost}_{N_{1}}^{\min}.

This in turn allows us to compute 16PrM({π^Paths(M)c(π^)R})\frac{1}{6}\mathrm{Pr}_{M}(\{\hat{\pi}\in\operatorname{Paths}(M)\mid c(\hat{\pi})\leq R\}) and thus to decide the problem from [24].

For any acyclic Markov chain KK the following holds: assuming the transition probabilities are encoded as fractions of coprime integers, the product LL of all denominators can be computed in polynomial time and has a polynomially bounded encoding. The maximal weight WW of a path in KK can be computed in linear time and also has a polynomially bounded encoding. Hence the minimal pexpcost\operatorname{pexpcost} is an integer multiple of 1/L1/L and there are at most WLW\cdot L different values for pexpcost\operatorname{pexpcost}. This in particular applies to the value of the pexpcost\operatorname{pexpcost}-optimal pp-cause.

Note that there are still exponentially many possible values for the minimal partial expected cost in the two Markov chains NiN_{i}. Nevertheless, a binary search over these values requires only polynomially many applications of the threshold problem:

Is pexpcostminϑ\operatorname{pexpcost}^{\min}\leq\vartheta for a rational ϑ\vartheta\in\mathbb{Q}?

We can therefore compute the exact value pexpcostNimin\operatorname{pexpcost}_{N_{i}}^{\min} for both DTMCs Ni(i=0,1)N_{i}\ (i=0,1) by solving polynomially many instances of the threshold problem. This yields a polynomial Turing reduction to the problem stated in [24]. ∎
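The binary search over the candidate values k/Lk/L can be sketched as follows; the oracle answering the threshold question is a hypothetical function, and exact rationals are used to avoid rounding.

```python
from fractions import Fraction

def find_min_value(oracle, W, L):
    # The minimal pexpcost lies in [0, W] and is an integer multiple of 1/L,
    # so there are at most W*L + 1 candidates k/L.  Binary search needs only
    # O(log(W*L)) oracle calls, i.e. polynomially many in the encoding size.
    lo, hi = 0, W * L
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(Fraction(mid, L)):  # "pexpcost_min <= mid/L ?"
            hi = mid
        else:
            lo = mid + 1
    return Fraction(lo, L)
```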

\maxcostSummary

*

Proof

We start with some preliminary considerations. If there exists a path s0sns_{0}\ldots s_{n} entirely contained in S\SpS\backslash S_{p} containing a cycle with positive weight, then maxcost\operatorname{maxcost} of any pp-cause is \infty: Consider such a positive cycle reachable in S\SpS\backslash S_{p}. Then there are paths in 𝑒𝑟𝑟𝑜𝑟\lozenge\mathit{error} which contain this cycle arbitrarily often. For any pp-cause Π\Pi almost all of these paths need a prefix in Π\Pi. Since no state in the positive cycle nor in the path from s0s_{0} to the cycle is in SpS_{p}, such prefixes also contain the cycle arbitrarily often. This means these prefixes accumulate the positive weight of the cycle arbitrarily often. Therefore, all pp-causes contain paths of arbitrarily high weight. Thus, before optimizing Π\Pi we check whether there are positive cycles reachable in the induced graph of MM on S\SpS\backslash S_{p}. This can be done in polynomial time with the Bellman-Ford algorithm [12]. Henceforth we assume there are no such positive cycles.
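The cycle check can be sketched as follows, assuming a hypothetical edge-list encoding of the induced graph on S\SpS\backslash S_{p} with the state weights already folded into edge weights.

```python
def has_positive_cycle(n, edges, source):
    # Detect a positive-weight cycle reachable from `source` by negating all
    # edge weights and running Bellman-Ford: a reachable negative cycle in the
    # negated graph is exactly a reachable positive cycle in the original.
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0.0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] - w < dist[v]:
                dist[v] = dist[u] - w
    # One more round: a still-improvable, reachable edge witnesses a cycle.
    return any(dist[u] < INF and dist[u] - w < dist[v] for u, v, w in edges)
```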

For (1) we show that for two pp-causes Π,Φ\Pi,\Phi with ΠΦ\Pi\preceq\Phi we have maxcost(Π)maxcost(Φ)\operatorname{maxcost}(\Pi)\leq\operatorname{maxcost}(\Phi). Let π^Π\hat{\pi}\in\Pi be arbitrary. Since π^\hat{\pi} is a pp-critical prefix (and p>0p>0), there is a path πPathsM(𝑒𝑟𝑟𝑜𝑟)\pi\in\operatorname{Paths}_{M}(\lozenge\mathit{error}) with π^Pref(π)\hat{\pi}\in\operatorname{Pref}(\pi). Since Φ\Phi is a pp-cause, there exists φ^ΦPref(π)\hat{\varphi}\in\Phi\cap\operatorname{Pref}(\pi). The assumption ΠΦ\Pi\preceq\Phi and the fact that both Π\Pi and Φ\Phi are prefix-free then force π^\hat{\pi} to be a prefix of φ^\hat{\varphi}. Hence c(π^)c(φ^)c(\hat{\pi})\leq c(\hat{\varphi}), and since π^\hat{\pi} was arbitrary, it follows that maxcost(Π)maxcost(Φ)\operatorname{maxcost}(\Pi)\leq\operatorname{maxcost}(\Phi). This implies that Θ\Theta is maxcost\operatorname{maxcost}-minimal.

Computing maxcostmin=maxcost(Θ)\operatorname{maxcost}^{\min}=\operatorname{maxcost}(\Theta) for a non-negative weight function can be reduced to the computation of a longest path in a modified version of MM. There can be cycles with weight 0 in S\SpS\backslash S_{p}, but in such a cycle every state has weight 0. Therefore we collapse such cycles completely without changing the value maxcost(Θ)\operatorname{maxcost}(\Theta). We further collapse the set SpS_{p} into one absorbing state ff. Computing maxcost(Θ)\operatorname{maxcost}(\Theta) now amounts to searching for a longest path from s0s_{0} to ff in this modified weighted directed acyclic graph. This can be done in linear time by finding a shortest path after multiplying all weights by 1-1. Therefore the problem can be solved in overall polynomial time [12].
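The longest-path computation on the collapsed graph can be sketched as follows (hypothetical edge-list encoding); a single pass in topological order is the linear-time equivalent of negating the weights and taking shortest paths.

```python
from collections import deque

def longest_path_dag(n, edges, source, target):
    # Longest source-to-target path in a weighted DAG: compute a topological
    # order by Kahn's algorithm, then relax all outgoing edges in that order.
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1
    order, queue = [], deque(i for i in range(n) if indeg[i] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    dist = [float("-inf")] * n
    dist[source] = 0.0
    for u in order:
        if dist[u] == float("-inf"):
            continue  # unreachable from source
        for v, w in adj[u]:
            dist[v] = max(dist[v], dist[u] + w)
    return dist[target]
```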

For (2) we reduce the problem of finding a maxcost\operatorname{maxcost}-minimal pp-cause to the solution of a max-cost reachability game as defined in [6]. Define the game arena 𝒜=(V,VMax,VMin,E)\mathcal{A}=(V,V_{Max},V_{Min},E) with

V=S˙S˙p,\displaystyle V=S\>\dot{\cup}\>\dot{S}_{p}, VMax=S,\displaystyle V_{Max}=S, VMin=S˙p,\displaystyle V_{Min}=\dot{S}_{p},

where S˙p\dot{S}_{p} is a copy of SpS_{p}. The copy of state sSps\in S_{p} in S˙p\dot{S}_{p} will be written as s˙\dot{s}. There is an edge (s,t)E(s,t)\in E between states ss and tt in VV if and only if one of the following conditions holds:

  1. (1) sS,tS\Sps\in S,t\in S\backslash S_{p}, and 𝐏(s,t)>0\mathbf{P}(s,t)>0,

  2. (2) sS,tS˙ps\in S,t\in\dot{S}_{p}, and for uSpu\in S_{p} with t=u˙t=\dot{u} we have 𝐏(s,u)>0\mathbf{P}(s,u)>0,

  3. (3) sS˙p,tSps\in\dot{S}_{p},t\in S_{p}, and s=t˙s=\dot{t}, or

  4. (4) sS˙ps\in\dot{S}_{p} and t=𝑒𝑟𝑟𝑜𝑟t=\mathit{error}.

We equip 𝒜\mathcal{A} with a weight function w:Vw:V\to\mathbb{Q} by mirroring the weight function of MM in the following way:

  1. (1) w(s)=c(s)w(s)=c(s) for sS\Sps\in S\backslash S_{p},

  2. (2) w(s)=0w(s)=0 for sSps\in S_{p}, and

  3. (3) w(s˙)=c(s)w(\dot{s})=c(s) for sSps\in S_{p}.

In [6] the authors define a min-cost reachability game. For our purposes we need the dual notion of a max-cost reachability game, which is obtained by just changing the total payoff of a play avoiding the target set to be -\infty instead of ++\infty. The objective of player MaxMax with vertices VMaxV_{Max} is then to maximize the total payoff of the play, while player MinMin with vertices VMinV_{Min} wants to minimize the total payoff. By changing from min-cost to max-cost reachability games, the results of [6] concerning strategies for MaxMax and MinMin are reversed.

We consider the max-cost reachability game on 𝒜\mathcal{A} with target 𝑒𝑟𝑟𝑜𝑟\mathit{error}. Define Val(s)Val(s) as the total payoff when both sides play optimally and the play starts in ss. By the argument above we have Val(s0)=Val(s_{0})=\infty if there is a positive cycle reachable in S\SpS\backslash S_{p}. In contrast we always have Val(s0)Val(s_{0})\neq-\infty since 𝑒𝑟𝑟𝑜𝑟\mathit{error} is reachable and 𝑠𝑎𝑓𝑒\mathit{safe} can be avoided by MaxMax. We proceed to show that our reduction is correct.

Claim

There is a one-to-one correspondence between strategies σ\sigma of MinMin and pp-causes in MM for 𝑒𝑟𝑟𝑜𝑟\lozenge\mathit{error}.

Proof (of the claim)

Let σ\sigma be a strategy for MinMin and consider the set of consistent plays starting in s0s_{0} that we denote by Playss0(𝒜,σ)Plays_{s_{0}}(\mathcal{A},\sigma). Every play πPlayss0(𝒜,σ)\pi\in Plays_{s_{0}}(\mathcal{A},\sigma) that reaches 𝑒𝑟𝑟𝑜𝑟\mathit{error} corresponds to a pp-critical prefix. To see this, omit all states contained in S˙p\dot{S}_{p} from the play. If π\pi reaches 𝑒𝑟𝑟𝑜𝑟\mathit{error} and the last state before 𝑒𝑟𝑟𝑜𝑟\mathit{error} is in S˙p\dot{S}_{p} then omit 𝑒𝑟𝑟𝑜𝑟\mathit{error} as well. The resulting path in MM is a pp-critical prefix. Let ΠPathsfin(M)\Pi\subseteq\operatorname{Paths}_{\operatorname{fin}}(M) be the set of pp-critical prefixes obtained in this way from plays in Playss0(𝒜,σ)Plays_{s_{0}}(\mathcal{A},\sigma) reaching 𝑒𝑟𝑟𝑜𝑟\mathit{error}.

To see that any path π\pi to 𝑒𝑟𝑟𝑜𝑟\mathit{error} in MM has a prefix in Π\Pi, let τ\tau be the strategy of player MaxMax that moves along the steps of π\pi. In the resulting play, either player MinMin moves to 𝑒𝑟𝑟𝑜𝑟\mathit{error} from some state sS˙ps\in\dot{S}_{p} according to σ\sigma and the corresponding prefix of π\pi is in Π\Pi, or the play reaches 𝑒𝑟𝑟𝑜𝑟\mathit{error} from a state not in S˙p\dot{S}_{p} and hence the path π\pi itself belongs to Π\Pi. Since the strategy has to make decisions for every sS˙ps\in\dot{S}_{p}, every path πPathsM(𝑒𝑟𝑟𝑜𝑟)\pi\in\operatorname{Paths}_{M}(\lozenge\mathit{error}) has a prefix in Π\Pi. Π\Pi is prefix-free, since a violation of this property would correspond to a non-consistent play: MinMin can only choose the edge to 𝑒𝑟𝑟𝑜𝑟\mathit{error} once. Therefore Π\Pi is a pp-cause in MM for 𝑒𝑟𝑟𝑜𝑟\lozenge\mathit{error}.

Since the reverse of this construction follows along completely analogous lines, we omit it here. \blacksquare

Claim

We have maxcostmin=Val(s0)\operatorname{maxcost}^{\min}=Val(s_{0}).

Proof (of the claim)

Recall that the value Val(s0)Val(s_{0}) is defined as the total payoff of the unique play πPlayss0(𝒜,σ)Playss0(𝒜,τ)\pi\in Plays_{s_{0}}(\mathcal{A},\sigma)\cap Plays_{s_{0}}(\mathcal{A},\tau), where σ\sigma is the optimal strategy of MinMin and τ\tau is the optimal strategy of MaxMax. For each strategy σ\sigma^{\prime} for MinMin, let Πσ\Pi_{\sigma^{\prime}} be the corresponding pp-cause as provided by the above claim. The optimal strategy τ\tau^{\prime} for MaxMax against σ\sigma^{\prime} has to avoid 𝑠𝑎𝑓𝑒\mathit{safe} and hence has to create a play that ends in 𝑒𝑟𝑟𝑜𝑟\mathit{error}. The total payoff of the induced play φ\varphi is equal to the weight of the corresponding path φ^Πσ\hat{\varphi}\in\Pi_{\sigma^{\prime}} and the total payoff that MaxMax can achieve is precisely maxcost(Πσ)\operatorname{maxcost}(\Pi_{\sigma^{\prime}}). The optimal strategy σ\sigma for MinMin hence corresponds to a maxcost\operatorname{maxcost}-minimal pp-cause Πσ\Pi_{\sigma} and maxcost(Πσ)=Val(s0)\operatorname{maxcost}(\Pi_{\sigma})=Val(s_{0}). \blacksquare

Now we apply the results for max-cost reachability games to pp-causes. This means we can use [6, Algorithm 1] to compute the value Val(s)Val(s) of the game for all states sSs\in S in pseudo-polynomial time. This includes the value Val(s0)=maxcostminVal(s_{0})=\operatorname{maxcost}^{\min}. From the values of the game starting at different states, an optimal memoryless strategy σ\sigma for MinMin can be derived by fixing a successor ss^{\prime} for each state sVMins\in V_{Min} with (s,s)E(s,s^{\prime})\in E and

Val(s)=w(s)+Val(s).Val(s)=w(s^{\prime})+Val(s^{\prime}).

Since the strategy is memoryless, we get a set QS˙pQ\subseteq\dot{S}_{p} for which MinMin chooses the edge to 𝑒𝑟𝑟𝑜𝑟\mathit{error}. By the construction from before the maxcost\operatorname{maxcost}-minimal pp-cause Π\Pi obtained in this way is state-based.
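Extracting the memoryless strategy from the computed game values can be sketched as follows; the dictionaries Val, succ and w (game values, MinMin's arena edges, vertex weights) are hypothetical encodings, and for each MinMin vertex any successor whose weight plus value attains the vertex's value is fixed.

```python
def extract_min_strategy(Val, succ, w):
    # For every Min vertex s, fix a successor s' with Val[s] == w[s'] + Val[s'].
    # The copies in S_p whose fixed successor is `error` yield the set Q that
    # induces the state-based p-cause.
    strategy = {}
    for s, successors in succ.items():
        for t in successors:
            if Val[s] == w[t] + Val[t]:
                strategy[s] = t
                break
    return strategy
```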

For (3) we note that the decision problem “Is Val(s0)0Val(s_{0})\leq 0?” is in 𝙽𝙿𝚌𝚘𝙽𝙿\mathtt{NP}\cap\mathtt{coNP} by a reduction to mean-payoff games, as shown in [8]. The reduction introduces an edge from 𝑒𝑟𝑟𝑜𝑟\mathit{error} back to s0s_{0} and removes the state 𝑠𝑎𝑓𝑒\mathit{safe} from the game. We have that Val(s0)0Val(s_{0})\leq 0 in the original max-cost reachability game if and only if the value of the constructed mean-payoff game is at most 0. The reason is that the value of the mean-payoff game is at most 0 if there is a strategy for MinMin such that MaxMax can neither force a positive cycle in the original max-cost reachability game nor reach 𝑒𝑟𝑟𝑜𝑟\mathit{error} with positive weight in the original game. We adapt the construction to show that the decision problem “Is maxcostminϑ\operatorname{maxcost}^{\min}\leq\vartheta?”, for ϑ\vartheta\in\mathbb{Q}, is also in 𝙽𝙿𝚌𝚘𝙽𝙿\mathtt{NP}\cap\mathtt{coNP}. This can be achieved by adding an additional vertex ss with weight ϑ-\vartheta to VMaxV_{Max}, removing the edge between 𝑒𝑟𝑟𝑜𝑟\mathit{error} and s0s_{0} and adding two new edges, one from 𝑒𝑟𝑟𝑜𝑟\mathit{error} to ss and one from ss to s0s_{0}. The value of the resulting mean-payoff game is then at most 0 if there is a strategy for MinMin such that MaxMax can neither force a positive cycle in the original max-cost reachability game nor reach 𝑒𝑟𝑟𝑜𝑟\mathit{error} with weight above ϑ\vartheta in the original game. ∎

\instcost

*

Proof

Recall that we now work with a set of error states EE instead of a single state 𝑒𝑟𝑟𝑜𝑟\mathit{error} and a set of terminal safe states FF such that Prf(E)=0\mathrm{Pr}_{f}(\Diamond E)=0 if and only if fFf\in F. We first note that for an instantaneous weight function the partial expected cost reduces to the expected cost, so we only need to consider one case: given a weight function c:Sc\colon S\to\mathbb{Q}, consider the weight function cc^{\prime} obtained from cc by setting c(f)=0c^{\prime}(f)=0 for all fFf\in F. Then the partial expected cost with respect to cinstc_{\operatorname{inst}} equals the expected cost with respect to cinstc^{\prime}_{\operatorname{inst}}.

For expcostinst\operatorname{expcost}_{\operatorname{inst}} we construct an MDP 𝒩=(S˙S˙p,{pick,continue},s0,𝐏)\mathcal{N}=(S\>\dot{\cup}\>\dot{S}_{p},\{pick,continue\},s_{0},\mathbf{P}^{\prime}), where S˙p\dot{S}_{p} is a disjoint copy of SpS_{p} in which all states are terminal. The copy of state sSps\in S_{p} in S˙p\dot{S}_{p} will be written as s˙\dot{s}. We define 𝐏(s,continue,s)=𝐏(s,s)\mathbf{P}^{\prime}(s,continue,s^{\prime})=\mathbf{P}(s,s^{\prime}) for s,sSs,s^{\prime}\in S, and 𝐏(s,pick,s˙)=1\mathbf{P}^{\prime}(s,pick,\dot{s})=1 for sSps\in S_{p}. The action pickpick is not enabled in states outside of SpS_{p}, and the action continuecontinue is not enabled in S˙p\dot{S}_{p}. We define the weight function c𝒩:S˙S˙pc_{\mathcal{N}}:S\>\dot{\cup}\>\dot{S}_{p}\to\mathbb{Q} by

c𝒩(s)={c(s)if sFEc(t)if sS˙p and t˙=s0else.c_{\mathcal{N}}(s)=\begin{cases}c(s)&\text{if }s\in F\cup E\\ c(t)&\text{if $s\in\dot{S}_{p}$ and $\dot{t}=s$}\\ 0&\text{else.}\end{cases}

The construction is illustrated in Figures 8 and 9: Consider the DTMC MM^{\prime} depicted in Figure 8. The transition probabilities are omitted. It is enough to know Sp={s0,s1}S_{p}^{\prime}=\{s_{0},s_{1}\} and F={𝑠𝑎𝑓𝑒}F^{\prime}=\{\mathit{safe}\}. The constructed MDP 𝒩\mathcal{N}^{\prime} is shown in Figure 9, where the black edges are inherited from MM^{\prime} and the red edges are added transitions belonging to the action pickpick.

Figure 8: The DTMC MM^{\prime} with instantaneous weight
Figure 9: The MDP 𝒩\mathcal{N}^{\prime} emulating instantaneous weight

For the constructed MDP 𝒩\mathcal{N} we consider the accumulated weight function. This emulates an instantaneous weight function for the random variable 𝒳c\mathcal{X}_{c} of expcostinst\operatorname{expcost}_{\operatorname{inst}}. A scheduler of this MDP corresponds to a pp-cause for MM in the same way as established in Lemma 1. Therefore the problem of finding an expcost\operatorname{expcost}-minimal pp-cause for instantaneous weight c:Sc:S\to\mathbb{Q} in MM for E\lozenge E is equivalent to finding a cost-minimal scheduler in 𝒩\mathcal{N} for EFS˙p\lozenge E\cup F\cup\dot{S}_{p}. This is again the stochastic shortest path problem for 𝒩\mathcal{N}, which can be solved in polynomial time by [4]. Since the SSP is solved by a memoryless scheduler, the expcostinst\operatorname{expcost}_{\operatorname{inst}}-minimal pp-cause is state-based.

For the computation of the minimal value of maxcostinst\operatorname{maxcost}_{\operatorname{inst}}, we enumerate the set SpS_{p} as s0,,sks^{0},\dots,s^{k} where k=|Sp|1k=|S_{p}|-1 such that c(si)c(sj)c(s^{i})\leq c(s^{j}) for all 0i<jk0\leq i<j\leq k. Now we iteratively remove states in increasing order starting with s0s^{0}. After removing a state sis^{i}, we check whether EE is reachable in the resulting Markov chain. If this is the case, we continue by removing the next state. If EE is not reachable anymore, the set Si:={s0,,si}S^{i}:=\{s^{0},\dots,s^{i}\} induces a state-based pp-cause ΠSi\Pi_{S^{i}}. This follows from the fact that each path from the initial state to EE contains a state in SiS^{i} and that SiSpS^{i}\subseteq S_{p}. Furthermore, maxcostinst(ΠSi)c(si)\operatorname{maxcost}_{\operatorname{inst}}(\Pi_{S^{i}})\leq c(s^{i}). Let jj be the largest number less than ii such that c(sj)<c(si)c(s^{j})<c(s^{i}). There is no pp-cause in which all paths end in {s0,,sj}\{s^{0},\dots,s^{j}\} as EE was still reachable after removing these states from MM. So, there is no pp-cause Π\Pi with maxcostinst(Π)<c(si)\operatorname{maxcost}_{\operatorname{inst}}(\Pi)<c(s^{i}). Therefore, ΠSi\Pi_{S^{i}} is indeed a maxcostinst\operatorname{maxcost}_{\operatorname{inst}}-minimal pp-cause. Since ESpE\subseteq S_{p}, the procedure terminates at the latest when the states in EE are removed. Hence the algorithm finds a state-based maxcostinst\operatorname{maxcost}_{\operatorname{inst}}-minimal pp-cause in polynomial time. ∎
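The iterative removal procedure for maxcostinst\operatorname{maxcost}_{\operatorname{inst}} can be sketched as follows; the successor map, the sets S_p and E, the weight dictionary and the initial state are hypothetical encodings of MM (probabilities are irrelevant here, only reachability matters).

```python
def minimal_maxcost_inst(succ, S_p, E, cost, s0):
    # Remove the states of S_p in order of increasing instantaneous weight
    # until E becomes unreachable from s0; the removed set then induces a
    # state-based p-cause whose maxcost_inst is the weight of the last
    # removed state.
    def E_reachable(removed):
        seen, stack = set(), [s0]
        while stack:
            s = stack.pop()
            if s in seen or s in removed:
                continue
            seen.add(s)
            stack.extend(succ.get(s, []))
        return bool(seen & E)

    removed = set()
    for s in sorted(S_p, key=lambda t: cost[t]):
        removed.add(s)
        if not E_reachable(removed):
            return cost[s], removed
    return None  # not reached: E is contained in S_p
```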