UMONS – Université de Mons, Belgium
F.R.S.-FNRS & UMONS – Université de Mons, Belgium (F.R.S.-FNRS Research Associate)
Università degli Studi di Torino, Italy
\Copyright James C. A. Main, Mickael Randour, and Jeremy Sproston
\ccsdesc[500] Theory of computation, Formal languages and automata theory
\funding Research supported by F.R.S.-FNRS under Grant n° F.4520.18 (ManySynth).
\hideLIPIcs
Time Flies When Looking out of the Window: Timed Games with Window Parity Objectives
Abstract
The window mechanism was introduced by Chatterjee et al. to reinforce mean-payoff and total-payoff objectives with time bounds in two-player turn-based games on graphs [17]. It has since proved useful in a variety of settings, including parity objectives in games [14] and both mean-payoff and parity objectives in Markov decision processes [12].
We study window parity objectives in timed automata and timed games: given a bound on the window size, a path satisfies such an objective if, in all states along the path, we see a sufficiently small window in which the smallest priority is even. We show that checking that all time-divergent paths of a timed automaton satisfy such a window parity objective can be done in polynomial space, and that the corresponding timed games can be solved in exponential time. This matches the complexity class of timed parity games, while adding the ability to reason about time bounds. We also consider multi-dimensional objectives and show that the complexity class does not increase. To the best of our knowledge, this is the first study of the window mechanism in a real-time setting.
Keywords: window objectives, timed automata, timed games, parity games

1 Introduction
Timed automata and games.
Timed automata [2] are extensions of finite automata with real-valued variables called clocks. Clocks increase at the same rate and measure the elapse of time between actions. Transitions are constrained by the values of clocks, and clocks can be reset on transitions. Timed automata are used to model real-time systems [4]. Not all paths of timed automata are meaningful; infinite paths that take a finite amount of time, called time-convergent paths, are often disregarded when checking properties of timed automata. Timed automata induce uncountable transition systems. However, many properties can be checked using the region abstraction, which is a finite quotient of the transition system.
Timed automaton games [25], or simply timed games, are games played on timed automata: one player represents the system and the other its environment. Players play an infinite number of rounds: in each round, both players simultaneously present a delay and an action, and the play proceeds according to the fastest move (note that we use paths for automata and plays for games to refer to sequences of consecutive states and transitions). When defining winning conditions for players, convergent plays must be taken into account; we must not allow a player to achieve its objective by forcing convergence, but neither can we require a player to force divergence (as divergence also depends on its opponent). Given an objective as a set of plays, following [20], we declare a play winning for a player if either it is time-divergent and belongs to the objective, or it is time-convergent and the player is not responsible for convergence.
Parity conditions.
The class of $\omega$-regular specifications is widely used (e.g., it can express liveness and safety), and parity conditions are a canonical way of representing them. In (timed) parity games, locations are labeled with a non-negative integer priority and the parity objective is to ensure that the smallest priority occurring infinitely often along the path/play is even. Timed games with $\omega$-regular objectives given as parity automata are shown to be solvable in exponential time in [20]. Furthermore, a reduction from timed parity games to classical turn-based parity games on a graph is established in [19].
Real-timed windows.
The parity objective can be reformulated: for all odd priorities seen infinitely often, a smaller even priority must be seen infinitely often. One can see the odd priority as a request and the even one as a response. The parity objective does not specify any timing constraints between requests and responses. In applications however, this may not be sufficient: for example, a server should respond to requests in a timely manner.
We revisit the window mechanism introduced by Chatterjee et al. for mean-payoff and total-payoff games [17] and later applied to parity games [14] and to parity and mean-payoff objectives in Markov decision processes [12]: we provide the first (to the best of our knowledge) study of window objectives in the real-time setting. More precisely, we lift the (resp. direct) fixed window parity objective of [14] to its real-time counterpart, the (resp. direct) timed window parity objective, and study it in timed automata and games.
Intuitively, given a non-negative integer bound on the window size, the direct timed window parity objective requires that at all times along a path/play, we see a window of size at most the bound such that the smallest priority in this window is even. While time was counted as steps in prior works (all in a discrete setting), we naturally measure window size using delays between configurations in real-time models. The (non-direct) timed window parity objective is simply a prefix-independent version of the direct one, thus more closely matching the spirit of classical parity: it asks that some suffix satisfy the direct objective.
Contributions.
We extend window parity objectives to a dense-time setting, and study both verification of timed automata and realizability in timed games. We consider adaptations of the fixed window parity objectives of [14], where the window size is given as a parameter. We establish that (a) verifying that all time-divergent paths of a timed automaton satisfy a timed window parity specification is PSPACE-complete; and that (b) checking the existence of a winning strategy for a window parity objective in timed games is EXPTIME-complete. These results (Theorem 6.8) hold for both the direct and prefix-independent variants, and they extend to multi-dimensional objectives, i.e., conjunctions of window parity objectives.
All algorithms are based on a reduction to an expanded timed automaton (Definition 4.1). We establish that, similarly to the discrete case, it suffices to keep track of one window at a time (or one per objective in the multi-dimensional case) instead of all currently open windows, thanks to the so-called inductive property of windows (Lemma 3.3). A window can be summarized using its smallest priority and its current size: we encode the priorities in a window by extending locations with priorities and using an additional clock to measure the window's size. The non-direct (resp. direct) timed window parity objective translates to a co-Büchi (resp. safety) objective on the expanded automaton. Locations to avoid for the co-Büchi (resp. safety) objective indicate a window exceeding the supplied bound without the smallest priority of the window being even, i.e., a bad window. To check that all time-divergent paths of the expanded automaton satisfy the safety (resp. co-Büchi) objective, we check for the existence of a time-divergent path visiting an unsafe location (resp. visiting unsafe locations infinitely often) using the PSPACE algorithm of [2]. To solve the similarly-constructed expanded game, we use the EXPTIME algorithm of [20].
Comparison.
Window variants constitute conservative approximations of classical objectives (e.g., [17, 14, 12]), strengthening them by enforcing timing constraints. Complexity-wise, the situation is varied. In one-dimensional turn-based games on graphs, window variants [17, 14] provide polynomial-time alternatives to the classical objectives, bypassing long-standing complexity barriers. However, in multi-dimensional games, their complexity becomes worse than that of the original objectives: in particular, fixed window parity games are EXPTIME-complete for multiple dimensions [14]. We show that timed games with multi-dimensional window parity objectives are in the same complexity class as untimed ones, i.e., dense time comes for free.
For classical parity objectives, timed games can be solved in exponential time [19, 20]. The approach of [19] is as follows: from a timed parity game, one builds a corresponding turn-based parity game on a graph, the construction being polynomial in the number of priorities and the size of the region abstraction. We recall that despite recent progress on quasi-polynomial-time algorithms (starting with [16]), no polynomial-time algorithm is known for parity games; the blow-up comes from the number of priorities. Overall, the two sources of blow-up — region abstraction and number of priorities — combine in a single-exponential solution for timed parity games. We establish that (multi-dimensional) window parity games can be solved in time polynomial in the size of the region abstraction, the number of priorities and the window size, and exponential in the number of dimensions. Thus even for conjunctions of objectives, we match the complexity class of single parity objectives of timed games, while avoiding the blow-up related to the number of priorities and enforcing time bounds between odd priorities and smaller even priorities via the window mechanism.
Outline.
This work is organized as follows. Section 2 summarizes all prerequisite notions and vocabulary. In Section 3, we introduce the timed window parity objective, compare it to the classical parity objective, and establish some useful properties. Our reduction from window parity objectives to safety or co-Büchi objectives is presented in Section 4: the construction of the expanded timed automaton/game used in the reduction is provided in Section 4.1; Section 4.2 develops mappings between plays of a game and plays of its expansion such that a time-divergent play in one game satisfies its objective if and only if its image satisfies the objective of the other game; and Section 4.3 shows how these mappings can be used to transfer winning strategies between a timed game and its expansion. The reduction is extended to the case of multiple window parity objectives in Section 5. Finally, Section 6 presents the complexity results.
This paper is the full version of a preceding conference paper [24]. It presents the contributions of the conference version in full detail, with complete proofs.
Related work.
In addition to the aforementioned foundational works, the window mechanism has seen diverse extensions and applications: e.g., [5, 3, 11, 15, 23, 27, 8]. Window parity games are strongly linked to the concept of finitary -regular games: see, e.g., [18], or [14] for a complete list of references. The window mechanism can be used to ensure a certain form of (local) guarantee over paths: different techniques have been considered in stochastic models, notably variance-based [10] or worst-case-based [13, 7] methods. Finally, let us recall that game models provide a useful framework for controller synthesis [26], and that timed automata have been extended in a number of directions (see, e.g., [9] and references therein): applications of the window mechanism in such richer models could be of interest.
2 Preliminaries
Timed automata.
A clock variable, or clock, is a real-valued variable. Let $C$ be a set of clocks. A clock constraint over $C$ is a conjunction of formulae of the form $x \sim c$ with $x \in C$, $c \in \mathbb{N}$, and $\sim\, \in \{\leq, <, =, >, \geq\}$. We write $\mathit{true}$ as shorthand for the empty clock constraint, which is satisfied by every valuation. Let $\Phi(C)$ denote the set of clock constraints over $C$.
Let $\mathbb{R}_{\geq 0}$ denote the set of non-negative real numbers. We refer to functions $v\colon C \to \mathbb{R}_{\geq 0}$ as clock valuations over $C$. A clock valuation $v$ over a set of clocks $C$ satisfies a clock constraint of the form $x \sim c$ if $v(x) \sim c$, and a conjunction $\varphi_1 \wedge \varphi_2$ if it satisfies both $\varphi_1$ and $\varphi_2$. Given a clock constraint $\varphi$ and a clock valuation $v$, we write $v \models \varphi$ if $v$ satisfies $\varphi$.
For a clock valuation $v$ and $d \in \mathbb{R}_{\geq 0}$, we let $v + d$ be the valuation defined by $(v + d)(x) = v(x) + d$ for all $x \in C$. For any valuation $v$ and set $D \subseteq C$, we define $v[D := 0]$ to be the valuation agreeing with $v$ for clocks in $C \setminus D$ and that assigns $0$ to clocks in $D$. We denote by $\mathbf{0}_C$ the zero valuation, assigning $0$ to all clocks in $C$.
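As an illustration, the operations on valuations can be sketched in a few lines. The dictionary representation, function names and constraint encoding below are ours, not the paper's; this is a minimal sketch, not a definitive implementation.

```python
import operator

# Clock valuations as dictionaries from clock names to non-negative reals.

def delay(v, d):
    """Let d time units elapse: all clocks advance at the same rate."""
    return {x: t + d for x, t in v.items()}

def reset(v, clocks):
    """Reset the clocks in `clocks` to 0; the other clocks are unchanged."""
    return {x: (0.0 if x in clocks else t) for x, t in v.items()}

OPS = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
       ">=": operator.ge, ">": operator.gt}

def satisfies(v, constraint):
    """A clock constraint is encoded as a conjunction (list) of atoms
    (clock, op, constant)."""
    return all(OPS[op](v[x], c) for (x, op, c) in constraint)

v = reset(delay({"x": 0.0, "y": 0.0}, 1.5), {"y"})   # x = 1.5, y = 0
assert satisfies(v, [("x", ">=", 1), ("y", "<", 1)])
```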
A timed automaton (TA) is a tuple $\mathcal{A} = (L, \ell_{\mathsf{init}}, C, \Sigma, \mathsf{Inv}, E)$ where $L$ is a finite set of locations, $\ell_{\mathsf{init}} \in L$ is an initial location, $C$ a finite set of clocks containing a special clock which keeps track of the total time elapsed, $\Sigma$ a finite set of actions, $\mathsf{Inv}\colon L \to \Phi(C)$ an invariant assignment function and $E \subseteq L \times \Phi(C) \times \Sigma \times 2^C \times L$ an edge relation. We only consider deterministic timed automata, i.e., we assume that in any location $\ell$, there are no two distinct outgoing edges $(\ell, g_1, a, D_1, \ell_1)$ and $(\ell, g_2, a, D_2, \ell_2)$ sharing the same action $a$ such that the conjunction $g_1 \wedge g_2$ is satisfiable. For an edge $(\ell, g, a, D, \ell')$, the clock constraint $g$ is called the guard of the edge.
A TA gives rise to an uncountable transition system with the state space $S = \{(\ell, v) \in L \times \mathbb{R}_{\geq 0}^C \mid v \models \mathsf{Inv}(\ell)\}$, the initial state $(\ell_{\mathsf{init}}, \mathbf{0}_C)$, set of actions $\mathbb{R}_{\geq 0} \times (\Sigma \cup \{\bot\})$ and the transition relation defined as follows: for any action $a \in \Sigma$ and delay $d \in \mathbb{R}_{\geq 0}$, we have that $(\ell, v) \xrightarrow{(d, a)} (\ell', v')$ if and only if there is some edge $(\ell, g, a, D, \ell') \in E$ such that $v + d \models g$, $v + d \models \mathsf{Inv}(\ell)$, $v' = (v + d)[D := 0]$ and $v' \models \mathsf{Inv}(\ell')$; for any delay $d \in \mathbb{R}_{\geq 0}$, $(\ell, v) \xrightarrow{(d, \bot)} (\ell, v + d)$ if and only if $v + d \models \mathsf{Inv}(\ell)$. Let us note that the satisfaction set of a clock constraint is convex: it is described by a conjunction of inequalities. Whenever $v \models \mathsf{Inv}(\ell)$, the above condition $v + d \models \mathsf{Inv}(\ell)$ (the invariant holds after the delay) is equivalent to requiring $v + d' \models \mathsf{Inv}(\ell)$ for all $d' \in [0, d]$ (the invariant holds at each intermediate time step).
A move is any pair $(d, a) \in \mathbb{R}_{\geq 0} \times (\Sigma \cup \{\bot\})$ (i.e., an action in the transition system). For any move $m = (d, a)$ and states $s$, $s'$, we write $s \xrightarrow{m} s'$ or $s \xrightarrow{(d, a)} s'$ as shorthand for $(s, (d, a), s')$ being a transition. Moves of the form $(d, \bot)$ are called delay moves. We say a move $m$ is enabled in a state $s$ if there is some state $s'$ such that $s \xrightarrow{m} s'$. There is at most one successor per move in a state, as we do not allow two guards on edges labeled by the same action to be simultaneously satisfied.
A path in a TA is a finite or infinite sequence alternating states and moves, such that consecutive states are related by a transition labeled by the intermediate move. A path is initial if it starts in the initial state of the TA.
An infinite path is time-divergent if the sequence of total elapsed times along it is not bounded from above. A path which is not time-divergent is called time-convergent; time-convergent paths are traditionally ignored in the analysis of timed automata [1, 2] as they model unrealistic behavior. In particular, Zeno paths, i.e., time-convergent paths along which infinitely many actions appear, are also ignored.
Priorities.
A priority function is a function $p\colon L \to \{0, 1, \ldots, D\}$ for some $D \in \mathbb{N}$. We use priority functions to express parity objectives. A $K$-dimensional priority function is a function $p\colon L \to \{0, 1, \ldots, D\}^K$ which assigns vectors of priorities to locations.
Timed games.
We consider two-player games played on TAs. We refer to the players as player 1 ($\mathcal{P}_1$) for the system and player 2 ($\mathcal{P}_2$) for the environment. We use the notion of timed automaton games of [20].
A timed (automaton) game (TG) is a tuple $(\mathcal{A}, \Sigma_1, \Sigma_2)$ where $\mathcal{A}$ is a TA and $(\Sigma_1, \Sigma_2)$ is a partition of its set of actions. We refer to actions in $\Sigma_i$ as actions of $\mathcal{P}_i$.
Recall a move is a pair $(d, a)$. Let $S$ denote the set of states of $\mathcal{A}$. In each state $s \in S$, the moves available to $\mathcal{P}_1$ are the elements of the set $M_1(s)$, which contains the moves with actions of $\mathcal{P}_1$ and the delay moves that are enabled in $s$. The set $M_2(s)$ is defined analogously with actions of $\mathcal{P}_2$. We write $M_1$ and $M_2$ for the sets of all moves of $\mathcal{P}_1$ and $\mathcal{P}_2$ respectively.
At each state $s$ along a play, both players simultaneously select moves $m_1 \in M_1(s)$ and $m_2 \in M_2(s)$. Intuitively, the fastest player gets to act and, in case of a tie, the move is chosen non-deterministically. This is formalized by the joint destination function, which maps the current state and the two selected moves to the set of possible successor states: the successor via the move with the smaller delay, or either successor if both delays coincide.
For $m_1 \in M_1$ and $m_2 \in M_2$, the delay occurring when $\mathcal{P}_1$ and $\mathcal{P}_2$ play $m_1$ and $m_2$ respectively is the minimum of the delays of the two moves.
A play is defined similarly to a path: it is a finite or infinite sequence alternating states and pairs of moves, one per player, such that each step is consistent with the joint destination function. A play is initial if it starts in the initial state. A play follows a path in the TA, but there need not be a unique path compatible with a play: if along a play, at some step, the moves of both players share the same delay and target state, either move can label the corresponding transition in a matching path.
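The resolution of simultaneous moves can be sketched as follows. This is a minimal illustration in our own notation: `step` stands for the (deterministic) transition function of the underlying TA, and the returned set models the non-deterministic choice on ties.

```python
# Joint-move resolution: the move with the smaller delay is taken;
# on a tie, either move may be selected.

def joint_destinations(state, move1, move2, step):
    """move_i = (delay_i, action_i); returns the set of states the play
    may move to when the players select move1 and move2."""
    (d1, _), (d2, _) = move1, move2
    if d1 < d2:
        return {step(state, move1)}
    if d2 < d1:
        return {step(state, move2)}
    return {step(state, move1), step(state, move2)}
```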
Similarly to paths, an infinite play is time-divergent if and only if the sequence of total elapsed times along it is not bounded from above. Otherwise, we say a play is time-convergent. We use dedicated notation for the set of plays of the TG, for the set of its finite plays and for the set of its time-divergent plays, as well as for the set of plays starting in a given state.
Note that our games are built on deterministic TAs. From a modeling standpoint, this is not restrictive, as non-determinism can be simulated through the actions of the players.
Strategies.
A strategy for $\mathcal{P}_i$ is a function describing which move the player should use based on a play history. Formally, a strategy for $\mathcal{P}_i$ is a function assigning a move of $\mathcal{P}_i$ to each finite play, with the requirement that each move given by the strategy be enabled in the last state of the play.
A play is said to be consistent with a $\mathcal{P}_i$-strategy if, at each step, the move used by $\mathcal{P}_i$ is the move suggested by the strategy on the current history. Given a $\mathcal{P}_i$-strategy, we consider the set of plays consistent with it, as well as the set of consistent plays starting in a given state.
A $\mathcal{P}_i$-strategy is move-independent if the move it suggests depends only on the sequence of states seen in the play. Formally, a strategy is move-independent if it suggests the same move for any two finite plays that visit the same sequence of states. We use move-independent strategies in the proof of our reduction to relabel some moves of a play without affecting the suggestions of the strategy.
Objectives.
An objective represents the property we desire on paths of a TA or a goal of a player in a TG. Formally, we define an objective as a set of infinite paths (when studying TAs) or a set of infinite plays (when studying TGs). An objective is state-based (resp. location-based) if it depends solely on the sequence of states (resp. of locations) in a path or play. Any location-based objective is state-based.
Remark 2.1.
In the sequel, we present objectives exclusively as sets of plays. Definitions for paths are analogous as all the objectives defined hereafter are state-based.
We use the following classical location-based objectives. A reachability objective for a set of locations $F \subseteq L$ is the set of plays that pass through a location in $F$. The complement of a reachability objective is a safety objective; given a set $F$, it is the set of plays that never visit a location in $F$. A Büchi objective for a set $F$ contains all plays that pass through locations in $F$ infinitely often, and the complement co-Büchi objective consists of plays traversing locations in $F$ finitely often. The parity objective for a priority function $p$ over the set of locations requires that the smallest priority seen infinitely often is even.

Fix a set of locations $F \subseteq L$ and a priority function $p$. Writing $\ell_0 \ell_1 \ell_2 \ldots$ for the sequence of locations of a play $\pi$, the aforementioned objectives are formally defined as follows.

- $\mathsf{Reach}(F) = \{\pi \mid \exists\, n \in \mathbb{N},\ \ell_n \in F\}$;
- $\mathsf{Safe}(F) = \{\pi \mid \forall\, n \in \mathbb{N},\ \ell_n \notin F\}$;
- $\mathsf{Buchi}(F) = \{\pi \mid \forall\, n \in \mathbb{N},\ \exists\, m \geq n,\ \ell_m \in F\}$;
- $\mathsf{coBuchi}(F) = \{\pi \mid \exists\, n \in \mathbb{N},\ \forall\, m \geq n,\ \ell_m \notin F\}$;
- $\mathsf{Parity}(p) = \{\pi \mid \min\{c \mid p(\ell_n) = c \text{ for infinitely many } n\} \text{ is even}\}$.
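On an ultimately periodic play, described by a finite prefix followed by a cycle repeated forever, these objectives admit simple checks, since the locations seen infinitely often are exactly those of the cycle. The following sketch uses a representation and names of our own:

```python
# A lasso-shaped sequence of locations: `prefix`, then `cycle` forever.

def reach(prefix, cycle, F):    # some location of F is visited
    return any(l in F for l in prefix + cycle)

def safe(prefix, cycle, F):     # no location of F is ever visited
    return all(l not in F for l in prefix + cycle)

def buchi(prefix, cycle, F):    # locations of F are seen infinitely often
    return any(l in F for l in cycle)

def cobuchi(prefix, cycle, F):  # locations of F are seen finitely often
    return all(l not in F for l in cycle)

def parity(prefix, cycle, p):   # least priority seen infinitely often is even
    return min(p[l] for l in cycle) % 2 == 0
```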
Winning conditions.
In games, we distinguish objectives and winning conditions. We adopt the definition of [20]. Let $\Psi$ be an objective. It is desirable to have victory be achieved in a physically meaningful way: for example, it is unrealistic to have a safety objective be achieved by stopping time. This motivates a restriction to time-divergent plays. However, a player cannot reasonably be required to force the divergence of plays, as the opponent can stall using delays of zero time units. Thus we also declare winning those time-convergent plays in which the player is blameless. We define the set of $\mathcal{P}_i$-blameless plays in the following way.
Let $\pi$ be a (possibly finite) play. We say $\mathcal{P}_i$ is not responsible (or not to be blamed) for the transition at step $n$ in $\pi$ if either the move of the other player has a strictly smaller delay ($\mathcal{P}_i$'s opponent is faster), or the two moves share the same delay and the opponent's move was selected and did not have the same target state as $\mathcal{P}_i$'s. The set of $\mathcal{P}_i$-blameless plays is formally defined as the set of infinite plays $\pi$ such that there is some $n_0 \in \mathbb{N}$ such that for all $n \geq n_0$, $\mathcal{P}_i$ is not responsible for the transition at step $n$ in $\pi$.
Given an objective $\Psi$, we set the winning condition of $\mathcal{P}_1$ for $\Psi$ to be the set of plays that are either time-divergent and in $\Psi$, or time-convergent and $\mathcal{P}_1$-blameless.
Winning conditions for are defined by exchanging the roles of the players in the former definition.
We consider that the two players are adversaries and have opposite objectives, $\Psi$ and $\neg\Psi$ (shorthand for the set of plays not in $\Psi$). Let us note that there may be plays winning for neither player, e.g., any time-convergent play in which neither player is blameless.
A winning strategy of $\mathcal{P}_i$ for an objective $\Psi$ from a state $s$ is a strategy such that all plays starting in $s$ consistent with it are in the winning condition of $\mathcal{P}_i$ for $\Psi$. Move-independent strategies are known to suffice for timed automaton games with state-based objectives [20].
Decision problems.
We consider two different problems for an objective $\Psi$. The first is the verification problem for $\Psi$, which asks, given a timed automaton, whether all of its time-divergent initial paths satisfy the objective. The second is the realizability problem, which asks whether, in a timed automaton game with objective $\Psi$, $\mathcal{P}_1$ has a winning strategy from the initial state.
3 Window objectives
We consider the fixed window parity and direct fixed window parity problems from [14] and adapt the discrete-time requirements from their initial formulation to dense-time requirements for TAs and TGs. Intuitively, a direct fixed window parity objective for some bound requires that at all points along a play or a path, we see a window of size less than the bound in which the smallest priority is even. The (non-direct) fixed window parity objective requires that the direct objective holds for some suffix. In the sequel, we drop "fixed" from the name of these objectives.
In this section, we formalize the timed window parity objective in TGs as sets of plays. The definition for paths of TAs is analogous (see Remark 2.1). First, we define the timed good window objective, which formalizes the notion of good windows. Then we introduce the timed window parity objective and its direct variant. We compare these objectives to the parity objective and argue that satisfying a window objective implies satisfying a parity objective, and that window objectives do not coincide with parity objectives in general, via an example. We conclude this section by proving some useful properties of this objective.
For this entire section, we fix a TG, a priority function $p$ and a bound on the size of windows.
3.1 Definitions
Good windows.
A window objective is based on a notion of good windows. Intuitively, a good window for the parity objective is a fragment of a play in which less time than the fixed bound passes and the smallest priority of the locations appearing in this fragment is even.
The timed good window objective encompasses plays in which there is a good window at the start of the play: formally, the timed good window (parity) objective is the set of plays along which there is some position such that the time elapsed up to this position is less than the bound and the smallest priority of the locations seen up to this position is even.
The timed good window objective is a state-based objective.
We introduce some terminology related to windows. Let $\pi$ be an infinite play and let $p(\ell_i)$ denote the priority of the location at step $i$ of $\pi$. We say that the window opened at step $j$ closes at step $k$ if $\min_{j \leq i \leq k} p(\ell_i)$ is even and, for all $j \leq k' < k$, $\min_{j \leq i \leq k'} p(\ell_i)$ is odd. Note that, in this case, we must have $p(\ell_k) = \min_{j \leq i \leq k} p(\ell_i)$. In other words, a window closes when an even priority smaller than all other priorities in the window is encountered. The window opened at step $j$ is said to close immediately if $p(\ell_j)$ is even.
If a window does not close within the bound, we refer to it as a bad window: the window opened at step $j$ is a bad window if there is some $k \geq j$ such that the time elapsed between steps $j$ and $k$ is at least the bound and, for all $j \leq k' \leq k$ such that the time elapsed between steps $j$ and $k'$ is less than the bound, $\min_{j \leq i \leq k'} p(\ell_i)$ is odd.
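For intuition, the status of a single window can be computed on a finite prefix of a play, represented here as a list of (priority, timestamp) pairs. This discrete-step sketch, in our own notation, ignores the continuum between steps:

```python
def window_opened_at(trace, j, bound):
    """Status of the window opened at step j of `trace`, a list of
    (priority, timestamp) pairs: 'good' if it closes after less than
    `bound` time units, 'bad' if the bound elapses with an odd minimum,
    'open' if the prefix is too short to decide."""
    t0 = trace[j][1]
    lowest = trace[j][0]
    for prio, t in trace[j:]:
        lowest = min(lowest, prio)
        if lowest % 2 == 0 and t - t0 < bound:
            return "good"        # window closes in time
        if t - t0 >= bound:
            return "bad"         # bound reached with an odd minimum
    return "open"
```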
Direct timed window objective.
The direct window parity objective in graph games requires that every suffix of the play belongs to the good window objective. To adapt this objective to a dense-time setting, we must express that at all times, we have a good window. We require that this property hold not only at states which appear explicitly along plays, but also in the continuum between them (during the delay within a location). To this end, let us introduce a notation for suffixes of plays.
Let $\pi$ be a play. For any position $n$ and any delay $d$ no greater than the delay taken at step $n$, we define the delayed suffix of $\pi$ starting in position $n$ delayed by $d$ time units: it is obtained from the suffix of $\pi$ starting at position $n$ by letting $d$ time units elapse in its first state and shortening the first delay accordingly. If $d = 0$, we simply speak of the suffix of $\pi$ starting in position $n$.
Using the notations above, we define the direct timed window (parity) objective as the set of plays all of whose delayed suffixes belong to the timed good window objective.
The direct timed window objective is state-based: the timed good window objective is state-based and the delays are encoded in states (measured by the special clock keeping track of the total time elapsed), thus all conditions in the definition of the direct timed window objective depend only on the sequence of states of a play.
A good window for a delayed suffix can be expressed using exclusively indices from the play $\pi$. In fact, the delayed suffix of $\pi$ starting in position $n$ delayed by $d$ time units belongs to the timed good window objective if and only if there is some $k \geq n$ such that $\min_{n \leq i \leq k} p(\ell_i)$ is even and the time elapsed between steps $n$ and $k$, minus the delay $d$, is less than the bound. We use this characterization to avoid mixing indices from plays and their suffixes in proofs.
Timed window objective.
We define the timed window (parity) objective as a prefix-independent variant of the direct timed window objective. Formally, we let it be the set of plays that have a suffix satisfying the direct timed window objective.
The timed window objective requires the direct timed window objective to hold from some point on. This implies that the timed window objective is state-based.
3.2 Comparison with parity objectives
Both the direct and non-direct timed window objectives reinforce the parity objective with time bounds. It can easily be shown that satisfying the direct timed window objective implies satisfying a parity objective. Any odd priority seen along a play satisfying the direct timed window objective is answered within less time than the bound by a smaller even priority. Therefore, should any odd priority appear infinitely often, it is followed infinitely often by a smaller even priority. As the set of priorities is finite, there must be some smaller even priority appearing infinitely often. This in turn implies that the parity objective is fulfilled. Using prefix-independence of the parity objective, we can also conclude that satisfying the non-direct timed window objective implies satisfaction of the parity objective.
However, in some cases, the timed window objectives may not hold even though the parity objective holds. For simplicity, we provide an example on a TA, rather than a TG. Consider the timed automaton depicted in Figure 1.
All time-divergent paths of this TA satisfy the parity objective. We can classify time-divergent paths into two families: either a distinguished location is visited infinitely often, or from some point on only delay transitions are taken in a single location. In both cases, the smallest priority seen infinitely often is even.
However, there is a path that, for all window sizes, violates the direct and non-direct timed window objectives. This path can be described by a loop in which the delay spent before answering an odd priority increases at each iteration: the window opened at the odd-priority location only closes when the location with a smaller even priority is entered, and at the $n$-th iteration of the loop, this window only closes after a delay that grows with $n$. As these delays increase to infinity, there is no window size for which this path satisfies the direct or non-direct timed window objectives.
This example demonstrates the interest of reinforcing parity objectives with time bounds; we can enforce that there is a bounded delay between an odd priority and a smaller even priority in a path.
3.3 Properties of window objectives
We present several properties of the timed window objective. First, we show that we need only check good windows for non-delayed suffixes. Once this property is proven, we move on to the inductive property of windows, which is the crux of the reduction in the next section. This inductive property states that when a window closes within less time than the bound, all other windows opened in the meantime also close within less time than the bound.
The definition of the direct timed window objective requires checking uncountably many windows. This can be reduced to a countable number of windows: those opened when entering states appearing along a play. Let us explain why no information is lost through such a restriction. We rely on a timeline-like visual representation given in Figure 2. Consider a window that does not close immediately and is opened in some state of the play delayed by time units, of the form (depicted by the circle at the start of the bottom line of Figure 2). This implies that the priority of is odd, otherwise this window would close immediately. Assume the window opened at step closes at step (illustrated by the middle line of the figure) in less than time units. As the priority of is odd, we must have (i.e., the window opened at step is still open as long as is not left). These lines cover the same locations, i.e., the set of locations appearing along the time-frame given by both the dotted and dashed lines coincide. Thus, the window opened time units after step closes in at most time units, at the same time as the window opened at step .
Lemma 3.1.
Let $\pi$ be a play, let $n$ be a position and let $d$ range over the delays admissible at step $n$. Then the suffix of $\pi$ starting in position $n$ belongs to the timed good window objective if and only if, for all such $d$, the delayed suffix of $\pi$ starting in position $n$ delayed by $d$ time units does. Furthermore, $\pi$ satisfies the direct timed window objective if and only if, for all positions $n$, the suffix of $\pi$ starting in position $n$ belongs to the timed good window objective.
Proof 3.2.
Assume that, for all admissible $d$, the delayed suffix of $\pi$ starting in position $n$ delayed by $d$ time units belongs to the timed good window objective. Selecting $d = 0$ yields the claim for the suffix of $\pi$ starting in position $n$.

Conversely, assume that the suffix of $\pi$ starting in position $n$ belongs to the timed good window objective, and fix an admissible delay $d$. By definition of timed good window objectives, there is some $k \geq n$ such that the time elapsed between steps $n$ and $k$ is less than the bound and $\min_{n \leq i \leq k} p(\ell_i)$ is even. The delayed suffix inherits this good window: the time elapsed from the delayed starting point to step $k$ is at most the time elapsed from step $n$ to step $k$, which follows immediately from the chain of inequalities relating these durations.
The last claim of the lemma follows immediately from the first part of the lemma and the definition of direct timed window objectives.
In turn-based games on graphs, window objectives exhibit an inductive property: when a window closes, all subsequently opened windows close (or were closed earlier) [14]. This is also the case for the timed variant. A window closes when an even priority smaller than all priorities seen in the window is encountered. This priority is also smaller than priorities in all windows opened in the meantime, therefore they must close at this point (if they are not yet closed). We state this property only for windows opened at steps along the run and neglect the continuum in between due to Lemma 3.1.
Lemma 3.3 (Inductive property).
Let $\pi$ be a play and let $j \leq k$ be positions of $\pi$. Assume the window opened at step $j$ closes at step $k$ and that less time than the bound elapses between steps $j$ and $k$. Then, for all $j \leq j' \leq k$, the suffix of $\pi$ starting in position $j'$ belongs to the timed good window objective.
Proof 3.4.
Fix $j'$ with $j \leq j' \leq k$. The sequence of elapsed times is non-decreasing, which implies that the time elapsed between steps $j'$ and $k$ is less than the bound. It remains to show that $\min_{j' \leq i \leq k} p(\ell_i)$ is even. As the window opened at step $j$ closes at step $k$, we have $p(\ell_k) = \min_{j \leq i \leq k} p(\ell_i)$ and this priority is even. We have the inequalities $p(\ell_k) = \min_{j \leq i \leq k} p(\ell_i) \leq \min_{j' \leq i \leq k} p(\ell_i) \leq p(\ell_k)$: the first follows from the above, the second because we take a minimum over a smaller set and the third by definition of the minimum. Thus $\min_{j' \leq i \leq k} p(\ell_i) = p(\ell_k)$ is even, ending the proof.
It follows from this inductive property that it suffices to keep track of one window at a time when checking whether a play satisfies the (direct) timed window objective.
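Concretely, a monitor needs only one window summary: the smallest priority since the current window opened and the opening time. The following sketch (names and trace representation ours) reports bad windows on a discrete trace of (priority, timestamp) pairs; in the expanded automaton of the next section, the same bookkeeping is done with an extra clock, which detects a bad window exactly when it reaches the bound rather than at the next observed step.

```python
def monitor(trace, bound):
    """Detect bad windows on `trace`, a list of (priority, timestamp)
    pairs, tracking a single window at a time (inductive property).
    Returns the timestamps at which a bad window is reported."""
    bad_at = []
    lowest, opened = trace[0][0], trace[0][1]
    for prio, t in trace:
        lowest = min(lowest, prio)
        if lowest % 2 == 1 and t - opened >= bound:
            bad_at.append(t)            # bad window: bound elapsed, odd minimum
            lowest, opened = prio, t    # restart with a fresh window
        elif lowest % 2 == 0:
            lowest, opened = prio, t    # window closed: open a new one here
    return bad_at
```

A trace prefix raises no alarm exactly when, up to the discretization caveat above, no bad window has been observed.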
4 Reduction
We establish in this section that the realizability (resp. verification) problem for the direct and non-direct timed window parity objectives can be reduced to the realizability (resp. verification) problem for safety and co-Büchi objectives on an expanded TG (resp. TA). Our reduction uses the same construction of an expanded TA for both the verification and realizability problems. A state of the expanded TA describes the status of a window, allowing the detection of bad windows. This section is divided into three parts.
First, we describe how a TA can be expanded with window-related information. Second, we show that time-divergent plays in a TG and its expansion can be related, by constructing two (non-bijective) mappings, in such a way that a time-divergent play in the base TG satisfies the direct (resp. non-direct) timed window parity objective if and only if its related play in the expanded TG satisfies the safety (resp. co-Büchi) objective. These results, developed for plays, are (indirectly) applied to paths in order to show the correctness of the reduction for the verification problem. Third, we establish that the mappings developed in the second part can be leveraged to translate strategies between TGs, and prove that the presented translations preserve winning strategies, proving correctness of the reduction for TGs.
For this section, we fix a TG built on a TA, a priority function $p$ and a bound on the size of windows.
4.1 Encoding the objective in an automaton
To solve the verification and realizability problems for the direct/non-direct timed window objective, we rely on a reduction to a safety/co-Büchi objective in an expanded TA. The inductive property (Lemma 3.3) implies that it suffices to keep track of one window at a time when checking a window objective. Following this, we encode the status of a window in the TA.
A window can be summarized by two characteristics: the lowest priority within it and for how long it has been open. To keep track of the first trait, we encode the lowest priority seen in the current window in the locations of the TA. An expanded location is a pair $(\ell, q)$ where $\ell$ is a location of the TA and $q$ a priority; the number $q$ represents the smallest priority in the window currently under consideration. We say a pair $(\ell, q)$ is an even (resp. odd) location if $q$ is even (resp. odd). To measure for how long a window has been open, we use an additional clock that does not appear in the TA. This clock is reset whenever a new window opens or a bad window is detected.
The reduction focuses on time-divergent plays. Some time-convergent plays may violate a timed good window objective without ever seeing a bad window, e.g., when time does not progress up to the supplied window size. Along time-divergent plays, however, the lack of a good window at any point equates to the presence of a bad window. We encode the (resp. direct) timed window objective as a co-Büchi (resp. safety) objective. Locations to avoid in both cases indicate bad windows and are additional expanded locations , referred to as bad locations. We introduce two new actions and , one per player, for entering and exiting bad locations. While the action alone would suffice for the reduction to be correct, introducing two actions allows for a simpler correctness proof in the case of TGs; we can exploit the fact that can enter and exit bad locations. We use two new actions no matter the considered problem: this enables us to use the same expanded TA construction for both the verification problem and the realizability problem.
It remains to discuss how the initial location, edges and invariants of an expanded TA are defined. We discuss edges and invariants for each type of expanded location, starting with even locations, then odd locations and finally bad locations. Each rule we introduce hereafter is followed by an application on an example. We depict the TA of Figure 1 and the reachable fragment of its expansion in Figure 3 and use these TAs for our example. For this explanation, we use the terminology of TAs (paths) rather than that of TGs (plays).
The initial location of an expanded TA encodes the window opened at the start of an initial path of the original TA. This window contains only a single priority, that is the priority of the initial location of the original TA. Thus, the initial location of the expanded TA is the expanded location . In the case of our example, the initial location is and the priority of is , thus the initial location of the expanded TA is .
Even expanded locations encode windows that are closed and do not need to be monitored anymore. Therefore, the invariant of an even expanded location is unchanged from the invariant of the original location in the original TA. Similarly, we do not add any additional constraints on the edges leaving even expanded locations. Leaving an even expanded location means opening a new window: any edge leaving an even expanded location has an expanded location of the form as its target ( is the only priority occurring in the new window) and resets to start measuring the size of the new window. For example, in Figure 3, the edge from to of the expanded TA is obtained this way from the edge from to in the original TA.
Odd expanded locations represent windows that are still open. The clock measures how long a window has been open. If reaches in an odd expanded location, this equates to a bad window in the original TA. In this case, we force time-divergent paths of the expanded TA to visit a bad location. This is done in three steps. We strengthen the invariant of odd expanded locations to prevent from exceeding . We also disable the edges that leave odd expanded locations and do not go to a bad location whenever holds, by reinforcing the guards of such edges by . Finally, we include two edges to a bad location (one per additional action and ), which can only be used whenever there is a bad window, i.e., when . In the case of our example, if reaches in , we redirect the path to location , indicating a window has not closed in time in . When reaches in , no more non-zero delays are possible, the edge from to is disabled and only the edges to are enabled.
When leaving an odd expanded location using an edge, assuming we do not go to a bad location, the smallest priority of the window has to be updated. The new smallest priority is the minimum of the smallest priority of the window prior to traversing the edge and the priority of the target location. In our example for instance, the edge from to is derived from the edge from to in the original TA. As the priority of is 0 and is smaller than the current smallest priority of the window encoded by location , the smallest priority of the window is updated to when traversing the edge. Note that we do not reset despite the encoded window closing upon entering : the value of does not matter while in even locations, thus there is no need for a reset when closing the window.
A bad location is entered whenever a bad window is detected while in location . Bad locations are equipped with the invariant preventing the passage of time. In this way, for time-divergent paths, a new window is opened immediately after a bad window is detected. For each additional action and , we add an edge exiting the bad location. Edges leaving a bad location have as their target the expanded location ; we reopen a window in the location in which a bad window was detected. The clock is not reset by these edges, as it was reset prior to entering the bad location and the invariant prevents any non-zero delay in the bad location. For instance, the edges from to in our example represent that when reopening while in location , the smallest priority of this window is .
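The edge rules described above can be summarized as a handful of successor functions. This is an illustrative sketch under assumed encodings, not the paper's formal definition: an expanded location is a pair of a location and the smallest priority of the open window, the string `"bad"` marks bad locations, `z` is the value of the new clock, `lam` is the window bound, and `prio_of` is a hypothetical map from locations to priorities.

```python
def leave_even(q, dst_loc, prio_of):
    # Leaving an even expanded location opens a new window: reset z
    # and move to the expanded location carrying the target's priority.
    assert q % 2 == 0
    return (dst_loc, prio_of[dst_loc]), 0.0

def leave_odd(q, dst_loc, z, lam, prio_of):
    # Leaving an odd expanded location (not towards a bad location):
    # the guard is reinforced with z < lam; the smallest priority is
    # updated to the minimum, and z is not reset.
    assert q % 2 == 1 and z < lam
    return (dst_loc, min(q, prio_of[dst_loc])), z

def enter_bad(src_loc, q, z, lam):
    # A bad window: the edges to the bad location are enabled exactly
    # when z reaches lam in an odd location; z is reset upon entering.
    assert q % 2 == 1 and z == lam
    return (src_loc, "bad"), 0.0

def leave_bad(src_loc, z, prio_of):
    # The invariant forbids delays in bad locations; exiting reopens a
    # window in the same location, without resetting z (it is already 0).
    return (src_loc, prio_of[src_loc]), z
```

Each function returns the successor expanded location together with the new value of `z`, mirroring the four kinds of edges of the expansion.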
The expansion depends on the priority function and the bound on the size of windows . Therefore, we write for the expansion. The formal definition of follows.
Definition 4.1.
Given a TA , the TA is defined to be the TA such that
• ;
• ;
• where is a new clock;
• is an expanded set of actions with special actions , for bad locations;
• for all and even , for all and odd , and for all ;
• the set of edges of is the smallest set satisfying the following rules:
– if is an even number and , then
– if is an odd number and , then
– for all locations , odd and ,
For a TG , we set .
We write for states of instead of , for conciseness. The bar over the valuation is a visual indicator of the different domain.
We say that a play of is well-initialized if and . A well-initialized play can be seen as a play with a window opening at its start. Any initial play in is well-initialized. Proving statements related to the direct timed window objective only for initial plays (rather than well-initialized plays) is too restrictive to effectively apply them to the timed window objective, as this objective deals with suffixes. We later define expansions of plays as specific well-initialized plays. We write for the set of bad locations.
4.2 Expanding and projecting plays
We prove that any play of has an expansion in , and conversely, any play in projects to a play in . This is done by constructing an expansion mapping and a projection mapping, both of which are shown to behave well w.r.t. our objectives (Lemma 4.3).
Remark 4.2.
Note that we do not construct a bijection between the set of plays of and the set of plays of . This cannot be achieved naturally due to the additional information encoded in the expanded automaton, and notably the presence of bad locations. We illustrate this by showing there are some plays of that are intuitively indistinguishable if seen as plays of .
Consider the initial location of , and assume that its priority is odd and its invariant is . Consider the initial play of where the actions are used by both players with a delay of at the start of the play and then only delay moves are taken in the reached bad location, i.e., , where for all and . As the actions and do not exist in , cannot be discerned from the similar play of where delay moves are used instead of the actions , i.e., with for all .
This motivates using two mappings instead of a bijection to prove the correctness of our reduction.
Expansion mapping
The expansion mapping between plays of and of is defined by an inductive construction. We construct expansions step by step. The rough idea is to use the same moves in the play of being expanded and in its expansion in , as long as these moves do not allow to reach in an odd location of , i.e., as long as these moves do not make us see a bad window. In fact the construction addresses how to proceed if a move enabled in would allow to reach in an odd location. If only one of the two players, say , suggests a move with a large enough delay for clock to reach , then their adversary preempts them and it suffices to replace ’s move by any valid move with a larger delay than ’s. However, if both players suggest moves with too large a delay, the expanded play goes through a bad location (possibly multiple times) until enough time passes and one of the two players can use their move (with the remaining delay) in the expanded game.
Before presenting a formal construction of the expansion mapping, let us describe the structure of the inductive step of the construction. We number the different cases in the same order as they appear in the upcoming formal definition. Let be a play of , and assume the expansion of its prefix has already been constructed. We assume inductively that the last states of and its expansion share the same location of the original TG and that their clock valuations agree over . Furthermore, we also inductively assume the last state of is not in a bad location. Write and . Denote by the delay of the last pair of moves of the players in .
1. If is even, the same moves are available in and ; the expansion can be extended using the pair of moves .
2. If is odd, the invariant of the expanded location prevents from exceeding . We distinguish cases depending on whether the delay allows to reach .
(a) If , then one of the players has offered a move enabled in : this move determines how to extend the play.
(b) Otherwise, . The construction makes the expansion go through location . When is exited, the path goes to , the invariant of which depends on the parity of . We treat each case differently.
i. If is even, then the invariant of matches that of . Once is left, we reason similarly to case 1, using the moves with whatever delay remains.
ii. If is odd, it may be required to go to a bad location more than once if the remaining delay after the first visit to the bad location exceeds . Once the remaining delay is strictly less than , we can operate as in case 2.a.
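The delay-splitting at the heart of case 2.b can be sketched numerically. This is a hypothetical rendering with illustrative names: `z` is the current value of the window clock, `lam` the bound, and `reopened_loc_odd` tells whether the location in which the window reopens after the bad location is odd.

```python
import math

def decompose_delay(d, z, lam, reopened_loc_odd):
    """Sketch of the delay-splitting in case 2.b: a delay d taken in an odd
    expanded location with clock value z (so the window turns bad after
    lam - z time units) is split into visits to the bad location plus a
    remaining delay. Returns (number_of_bad_visits, remaining_delay)."""
    r = lam - z                     # time until the current window turns bad
    assert d >= r                   # otherwise case 2.a applies
    if not reopened_loc_odd:
        # Case 2.b.i: the window reopens in an even location; one visit
        # to the bad location suffices, the rest of d is a plain delay.
        return 1, d - r
    # Case 2.b.ii: the window reopens in an odd location; every further
    # lam time units of d produce another bad window.
    extra = math.floor((d - r) / lam)
    return 1 + extra, (d - r) - extra * lam
```

The returned pair mirrors the structure of the expansion: enter and exit the bad location the indicated number of times, then play the original move with the remaining delay.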
The formal construction of the expansion mapping follows. The inductive hypothesis in this construction of is the following: for all finite plays , using to denote and to denote , then , and . We proceed by induction on the number of moves along a play.
The base case consists of plays of with no moves, i.e., plays in which there is a single state. For any play , we set to be the play of consisting of a single state, where and . The inductive hypothesis is verified: the states and share the same location of , their clock valuations agree over and is not a bad location.
Next we assume that expansions are defined for all plays with moves. Fix a play with moves and assume the expansion of its prefix has already been constructed. Write , , and for , . We assume w.l.o.g. that holds: the induction step can be done similarly by exchanging the roles of the players if does not hold. This assumption implies is faster or as fast as . If were strictly faster, then would hold, in turn implying that ( cannot be reset). However, since holds, it follows that , contradicting the assumption that is faster. In other words, we must have . We separate the construction into multiple cases.
1. If is even, the moves and are enabled in by construction. Indeed, and agree over , and we have and for any outgoing edge of in there is an edge in with the same guard and which resets . We distinguish cases depending on whether a delay is taken or not.
If is a delay move, we set . This play is well-defined: is even, thus and share the same invariant and support the same delay moves. The inductive hypothesis is verified: because a transition labeled by a delay move was taken, and .
Otherwise, is not a delay move and is associated with an edge of the TA. We set with and . This is a well-defined play, owing to the edges recalled above. It is not difficult to verify that the inductive hypothesis is satisfied.
Note that is reset in the second case. It may be the case that is not responsible for the last transition in the expansion despite being responsible for the last transition of . In other words, it is possible that both and hold, but that does not hold. This occurs whenever the moves of both players share the same target state in the base TG but one player uses a delay move and the other a move with an action. We choose to have ’s move be responsible for the transition in the expansion in this case. This choice is for technical reasons related to blamelessness.
2. If is odd, one or both of the moves or may not be enabled in due to the different invariant. Recall that and for any outgoing edge of in there is an edge in . If , a move is disabled in state if . We reason as follows, depending on whether a delay of allows clock to reach from the state .
(a) Assume . Then is enabled in (recall we assume is responsible for the last transition of ). To ensure the -selected move in the expansion is enabled in , we alter it if its delay is too large: let if and otherwise.
If is a delay move, define . This is a well-defined play: holds because , and coincide on and the move is available in , and . Otherwise, if is not a delay move, define where and . By definition of the edges recalled above, and because and coincide on and the move is available in , we conclude that is a well-defined play. In either case, the inductive hypothesis is satisfied.
(b) Otherwise, assume . In this case, a bad location appears along . Denote by the time left before the current window becomes a bad window. For any non-negative real , we write for the clock valuation obtained by shifting by time units and then resetting , for the move and for , i.e., the move with a delay shortened by time units.
Recall that when is left, by definition of the edges of , the location is entered. Depending on the parity of , the invariant of is different. Thus, there are two cases to consider: is even and is odd.
i. If is even, we set to be the play
(1) where and if and otherwise ( is reset if ’s action is not a delay). We obtain this expansion by first using actions with the delay to enter a bad location, then the actions immediately again to exit the bad location and finally using the original moves of the players, but with an offset of time units ( time units passed before entering the bad location). This expansion is a play of : the moves and are enabled in as (because as is enabled in , and ), and lead to the state (recall edges entering bad locations reset ). The moves and are enabled in due to the edges for of . One can argue the moves and are enabled in using the same arguments as in case 1. The inductive hypothesis is preserved in this case.
ii. Whenever is odd, the invariant of implies . Let denote the integral part of ; represents the number of bad windows we detect with our construction during delay . In a nutshell, we divide the single step in the original TG into steps in the expansion: we enter and exit the bad location times and finally play the original move with the remaining delay. It may be necessary to modify ’s move due to the invariant of the odd location implying and the guard of its outgoing edges labeled by actions in implying . To this end, let if and otherwise. If , we get an expansion similar to the previous case. We set to be
where , and . That is indeed a play of : moves ( and ) are enabled for the same reasons as the previous case and the other moves are valid for the same reasons as in case 2.a: because . For , we define as
where , represents the time spent repeatedly entering and exiting the bad location, and . This is a well-defined play of for the reasons argued above. Extending the expansion this way preserves the inductive hypothesis.
This construction generalizes to infinite plays. In some cases, an expansion of a finite play may contain more steps than the play itself. However, this is only the case when a bad location appears in an expansion. If a finite play and its expansion share the same number of steps, we say that they are coherent. The expansion mapping preserves time-convergence and time-divergence for infinite paths: the sums of delays in a play and its expansion are identical.
The behavior of the expansion mapping w.r.t. suffixes and well-initialized plays is of interest for studying the connection between the non-direct timed window objective on the base TG and the co-Büchi objective on the expanded TG. Given a (finite or infinite) play , a suffix of is not necessarily the expansion of a suffix of . However, any well-initialized suffix of can be shown to be the expansion of a delayed suffix for some and . A well-initialized suffix of an expansion starts whenever an even location is left through an edge (case 1) or a bad location is left through an edge (case 2.b). In the former case, the suffix of the expansion is the expansion of a suffix by construction (the last state of the expansion, viewed as a play, is well-initialized). In the latter case, a delayed suffix may be required. This is observable with equation (1) (and is similar in other cases involving bad locations): the suffix of the expansion under construction is the expansion of the shifted suffix of the finite play under consideration and is well-initialized.
Projection mapping
The counterpart to the expansion mapping is the projection mapping . The projection mapping removes window information from any play in to obtain a play in . Any action or is replaced by the action . Formally, we define the projection mapping over finite and infinite plays as follows.
For any (finite or infinite) play , we set to be the sequence where for all and all , if and otherwise. This sequence is indeed a well-defined play: any move enabled in a state such that is enabled in . If the expanded location of is bad, the only such move is . If it is not bad, guards of outgoing edges and invariants in are either the same as or strengthened from their counterparts in , i.e., if the constraints hold in the expanded TA, they must hold in the original one. Furthermore, as edges unrelated to bad locations are derived from the edges of the original TA, this ensures that for all (where is the joint destination function).
The projection mapping preserves time-divergence. Unlike the expansion mapping, projecting a finite play does not alter the number of moves. This mapping respects suffixes: for all finite plays and (finite or infinite) plays of such that is a well-defined play, we have , where the move is if and otherwise. We refer to this property as suffix compatibility.
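A minimal sketch of the projection, under an assumed representation of plays as alternating state/move sequences; the names `enter_bad`, `exit_bad`, `tau` and the clock name `z` are hypothetical stand-ins for the paper's symbols.

```python
def project(play):
    """Sketch of the projection mapping on a finite play, represented as an
    alternating sequence state, move, state, ...  A state of the expanded TG
    is assumed to be ((loc, window_info), valuation); a move is (delay, act).
    Window information and the new clock (called "z" here) are dropped, and
    the special actions for entering/exiting bad locations are replaced by
    an ordinary action (called "tau" here); all names are illustrative."""
    BAD_ACTIONS = {"enter_bad", "exit_bad"}   # hypothetical action names
    out = []
    for i, item in enumerate(play):
        if i % 2 == 0:                        # states sit at even indices
            (loc, _window_info), val = item
            out.append((loc, {c: v for c, v in val.items() if c != "z"}))
        else:                                 # moves sit at odd indices
            delay, act = item
            out.append((delay, "tau" if act in BAD_ACTIONS else act))
    return out
```

Note that the projection is purely erasing: delays are untouched, which is why it preserves time-divergence and the number of moves.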
Objective preservation
We now establish the main theorem of this section: a play of satisfies the (resp. direct) timed window objective if and only if its expansion satisfies the co-Büchi (resp. safety) objective over bad locations; and a play of satisfies the co-Büchi (resp. safety) objective over bad locations if and only if its projection satisfies the (resp. direct) timed window objective.
Lemma 4.3.
The following assertions hold. For all time-divergent plays :
A.1. if and only if ;
A.2. if and only if .
For all well-initialized time-divergent plays :
B.1. if and only if ;
B.2. if and only if .
The form of this result is due to the lack of a bijection between the sets of plays of a TG and its expansion (Remark 4.2). This lemma essentially follows from the construction of and the definitions of the expansion and projection mappings.
Proof 4.4.
We start with Item A.1. Fix a time-divergent play and write for its expansion. Assume satisfies the direct timed window objective. We establish that is safe and all of its prefixes are coherent with their expansion using an inductive argument. Intuitively, for the first step, we show that if the window opened at step closes at step , then the th location of is even and is reached in less than time units.
Since , there is some such that , is odd for all and is even. Since , and are coherent up to step as does not reach (i.e., no bad locations can occur up to step ). By construction of the expansion mapping (see case 2 of the construction), and is even. From there, there are two possibilities: either location is never left (only delays are taken from there) or after some delay moves, location is exited. In the first case, the safety objective is trivially satisfied as the location is never left in the expansion and no bad locations were visited beforehand. Otherwise, some edge is traversed (for the first time since step ) at some step . Then, , and no bad locations appear in the first steps of .
We can repeat the first argument in position because and thus . We conclude there is some such that for and is even. Once more, we separate cases depending on whether is left through an edge. Iterating this argument shows that no bad locations appear along .
Conversely, assume does not satisfy the direct window objective. Let be the smallest index such that (Lemma 3.1). We can argue, using a similar inductive argument as above, that and are coherent up to step , no bad locations occur up to step , and that and . In the sequel, using the fact that there is no good window at step , we argue that a bad location is entered.
The negation of yields that for all , if , then is odd. There is some such that as is assumed to be time-divergent. Write for the smallest such . As is minimal, holds. It follows from the above that is odd as and that there are no resets of between steps and (resets of require an even location or a bad location), hence . The delay is such that ; the definition of the expansion function redirects to a bad location (case 2.b of the expansion definition). This shows that does not satisfy .
Let us move on to Item A.2. Fix a time-divergent play and write for its expansion. Assume does not satisfy the objective , i.e., there are infinitely many occurrences of bad locations along . We establish that , using Item A.1.
It suffices to show that for infinitely many , there is some such that . Indeed, Lemma 3.1 and this last assertion imply that for infinitely many , , hence, there are infinitely many bad windows along . In turn, this property ensures no suffix of satisfies the direct timed window objective, i.e., .
Recall a well-initialized suffix of is the expansion of some and a well-initialized suffix always follows after a bad location. Bad locations are assumed to occur infinitely often along , therefore has infinitely many well-initialized suffixes. It follows from that there are infinitely many well-initialized suffixes of that violate . Therefore, there are infinitely many such that does not satisfy for some , because each step of induces finitely many visits to a bad location in the expansion by construction (in other words, there are finitely many well-initialized suffixes of per step in ). From Item A.1, there are infinitely many such that does not satisfy the direct window objective for some . This ends this direction of the proof.
Conversely, assume satisfies the objective . If is safe, then it follows from Item A.1 that . If is unsafe, there is a well-initialized suffix of that is safe, i.e., the suffix of starts after the last occurrence of a bad location. This suffix is of the form for some and . By Item A.1, , which implies that . Thus, we have , which ends the proof of this item.
Let us proceed to Item B.1. Let be a time-divergent well-initialized play and write for its projection. Assume satisfies the objective and let us prove that . It suffices to establish that for all , (Lemma 3.1). First, we argue there is an even location along . Then we establish that when the first even location is entered in , the window opened at step 0 in closes. We conclude, using the inductive property of windows and an inductive argument, that satisfies the direct window objective.
Safety and divergence of ensure that an even location appears along . Assume that there are no even locations along . Then every location appearing in must be odd by safety of . Thus cannot be reset as it requires exiting an even location or entering a bad location. The invariant of odd locations would prevent time-divergence of , which would be contradictory. Thus, there must be some even location along .
Let denote the smallest index such that is even. We establish that the window opened at step in closes at step . It must hold that , as the outgoing edges of odd locations that do not target bad locations have guards implying and do not reset . Furthermore, is odd for all and , by definition of the edge relation of and since . This proves that the window opened at step 0 closes at step in less than time units. It follows from the inductive property of windows (Lemma 3.3) that for all , .
As is odd for , we have . There are two possibilities for : either the even location is never left or there is some such that at step , the expanded location is exited through an edge (via the pair of moves ). In the first case, only delays are taken in the location in , the priority of which is even, yielding for all (all windows after step close immediately), and combining this with the previous paragraph implies . In the latter case, we have for (similarly to the former case, only delays are taken in in up to step ), and and as edges leaving even locations reset and lead to locations of the form . Repeating the previous arguments from position ( is a time-divergent well-initialized suffix of ), one can find some such that is even and for all , . Once more, we split into cases depending on whether any edge is traversed in location . It follows from an induction that .
Assume now that does not satisfy the safety objective. There is some smallest such that . Let be if was never reset before position or the greatest such that the pair of moves induces a reset of otherwise. We have and : if , this is due to being well-initialized and if , this follows from being the smallest index of a bad location and from the definition of edges in ; edges that reset have as their target expanded locations of the form or . We argue that does not satisfy the timed good window objective. There cannot be any even locations between positions and ; there are no edges to bad locations in even locations and edges leaving even locations reset . Thus, for all , and is odd. Bad locations can only be entered when reaches in an odd location: the window opened at step does not close in time. Therefore, .
For Item B.2, we use compatibility of the projection mapping with suffixes. Fix a divergent well-initialized play and let be its projection. Assume satisfies the objective . If is safe, we have the property by Item B.1. Assume some bad location appears along . As the co-Büchi objective is satisfied, this location is left and the suffix following this exit is well-initialized. As there are finitely many bad locations along , it follows that has a well-initialized suffix satisfying . From the compatibility of projections with suffixes and Item B.1, has a suffix satisfying the direct timed window objective, hence satisfies the timed window objective.
Conversely, assume does not satisfy , i.e., there are infinitely many occurrences of bad locations along . Divergence ensures any bad location is eventually left through an edge, yielding a well-initialized suffix. From Item B.1 and suffix compatibility of the projection mapping, it follows that has infinitely many suffixes that do not satisfy the direct timed window objective. This implies that does not satisfy the timed window objective, ending the proof.
This lemma completely disregards plays that are not well-initialized. This is not an issue, as any play starting in the initial state of the TG is well-initialized. Indeed, if is the initial location of , then is the initial location of , and thus is the initial state of . Any initial play of expands to an initial play of and any initial play of projects to an initial play of .
The previous result can be leveraged to prove that the verification problem for the (resp. direct) timed window objective on can be reduced to the verification problem for the co-Büchi (resp. safety) objective on .
Theorem 4.5.
Let be a TA, a priority function and . All time-divergent paths of satisfy the (resp. direct) timed window objective if and only if all time-divergent paths of satisfy the co-Büchi (resp. safety) objective over bad locations.
Proof 4.6.
We show that if there is a time-divergent path of that violates the (resp. direct) timed window objective, then there is a time-divergent path of that violates the co-Büchi (resp. safety) objective for bad locations. The other direction is proven using similar arguments, with the projection mapping rather than the expansion mapping.
Assume there is a time-divergent initial path of that does not satisfy the (resp. direct) timed window objective. Consider the TG . The sequence is a time-divergent initial play of as is a time-divergent initial path of . Furthermore, does not satisfy the (resp. direct) timed window objective as it shares the same sequence of states as . By Lemma 4.3, the time-divergent play of does not satisfy the co-Büchi (resp. safety) objective over bad locations. There is some path of the TA that shares the same sequence of states as . This path is time-divergent and does not satisfy the co-Büchi (resp. safety) objective over bad locations, ending the proof.
4.3 Translating strategies
In this section, we present how strategies can be translated from the base game to the expanded game and vice-versa, using the expansion and projection mappings. We restrict our attention to move-independent strategies, as this subclass of strategies suffices for state-based objectives [20].
We open the section with a binary classification of time-convergent plays of the expanded TG, useful to prove how our translations affect blamelessness of outcomes. Then we proceed to our translations. We define how a strategy of can be translated to a strategy of using the expansion mapping. Then we define a translation of strategies of to strategies of , using the projection mapping. Each translation definition is accompanied by a technical result that establishes a connection between outcomes of a translated strategy and outcomes of the original strategy, through the projection or expansion mapping. It follows from these technical results that the translation of a winning strategy in one game is a winning strategy in the other.
Recall that given a winning strategy of , all of its time-convergent outcomes are -blameless. Therefore, when translating a winning strategy from one game to the other, we must argue that all time-convergent outcomes of the obtained strategy are -blameless. To argue how our translations preserve this property, let us introduce a binary classification of time-convergent plays of . We argue that any time-convergent play of the expanded TG either remains in a bad location from some point on or visits finitely many bad locations. Whenever a bad location is left, a subsequent visit to a bad location requires at least time units to elapse. It is thus impossible to visit bad locations infinitely often without either remaining in a bad location from some point onward or having time diverge. We now formalize this property and present its proof.
Proposition 4.7.
Let be an infinite play of . If is time-convergent, then .
Proof 4.8.
Assume towards a contradiction that neither nor hold. It follows that there are infinitely many such that and infinitely many such that . Consider some such that . There is some such that and some such that . We argue that at least time units pass between steps and , i.e., . Once this is established, iterating the argument yields divergence of , a contradiction. The invariant of ensures that . The bad location is exited at some stage to reach and the guards of edges to the location are . It follows that at least time units must elapse between steps and .
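The counting argument above can be rendered as a small sketch (hypothetical names; `bad_flags` abstracts a play as the sequence of "is this location bad?" values, and `lam` is the window bound):

```python
def total_time_lower_bound(bad_flags, lam):
    """Sketch of the counting argument in Proposition 4.7: each return to a
    bad location after having left one requires at least lam time units to
    elapse, so a play that leaves and re-enters bad locations k times takes
    at least k * lam time."""
    k = 0
    outside = False
    for bad in bad_flags:
        if not bad:
            outside = True
        elif outside:       # re-entering a bad location after being outside
            k += 1
            outside = False
    return k * lam
```

An infinite alternation between bad and non-bad locations thus forces the total time to exceed every finite bound, contradicting time-convergence.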
We now move on to our translations. We translate -strategies of the expanded TG to -strategies of the original TG by evaluating the strategy on the expanded TG on expansions of plays provided by the expansion mapping and by replacing any occurrences of by . The fact that translating a winning strategy of the expanded TG this way yields a winning strategy of the original TG is not straightforward: when we translate a strategy of to a strategy of , the expansion of an outcome of may not be consistent with , preventing the direct use of Lemma 4.3. Indeed, the definition of the expansion mapping may impose moves in where would suggest . However, we can mitigate this issue by constructing another play in parallel that is consistent with and shares the same sequence of states as . We leverage the non-deterministic behavior of tie-breaking and the move-independence of to ensure consistency of with , by changing the moves of in compared to . We also prove that if is -blameless and time-convergent, then is -blameless as well.
Lemma 4.9.
Let be a move-independent strategy of in . Let be the -strategy in defined by
for all finite plays . For all , there is a play such that shares the same sequence of states as and such that if is time-convergent and -blameless, then is -blameless.
Proof 4.10.
Fix an infinite play . We construct inductively: at step of the construction, we assume that we have constructed a finite play such that shares the same sequence of states as and is consistent with .
Initially, we set . This is a play consisting of only one state and with no moves. It shares the same sequence of states as by construction and is consistent with .
Now assume by induction that we have constructed a finite play that shares the same sequence of states as and is consistent with . Let . Let be a move of and be a state of such that . Write . We construct so that it shares the same sequence of states as and so that it is consistent with .
Recall that an expansion never ends in a bad location. The proof is divided into cases. Case 1 is when is in an even location. Case 2 is when is in an odd location. We further split case 2 into four sub-cases:
- 2.a ;
- 2.b and ;
- 2.c and is responsible for the last transition;
- 2.d and is not responsible for the last transition (this implies ).
These four sub-cases are disjoint. We show that they cover all possibilities before moving on to the remainder of the construction. Assume is in an odd location. The inequality holds because is the delay of the move , because this delay is the same as the delay of (this follows from the definition of ), and because is an odd location with an invariant implying . We now describe how the construction proceeds in each case.
Case 1. Assume is in an even location. Then the same moves are enabled in and . We have for some state of given by the definition of . We let be . This is a well-defined play: being a play implies and and share the same sequence of states, hence . Furthermore, by move-independence of and definition of , . The play is consistent with by construction.
Case 2. Now assume that is in an odd location. Recall that we must have .
Sub-case 2.a. Assume . Then for some state of given by the definition of the expansion mapping and where if and otherwise. We define to be the play , which is a well-defined play sharing the sequence of states of and consistent with for the same reasons as case 1.
Sub-case 2.b. Assume and . Then the move of is changed in the expansion to : we have for some state of . It follows from preempting that . By move-independence of and definition of , the delay of the moves and match. Let be . Thus is a well-defined play sharing the same sequence of states as and consistent with .
Sub-cases 2.c. and 2.d. For the two remaining cases, assume that . Thus, : the only actions available in state are and as implies and edges leaving with actions other than and have guards requiring . We write and for and , we write in the following.
Sub-case 2.c. If and is responsible for the last transition of , then is of the form
for some given by the expansion definition. We construct in three steps. The sequence is a well-defined play: both appended moves share the same delay because the delay of the move is that of by definition of and move-independence of , and holds. Its extension is also a well-defined play as must have a delay of zero due to the invariant of bad locations enforcing . We define to be the play . The sequence is a play: the move is available in any state and performing it does not change the state, and it cannot be outsped. By construction, shares the same sequence of states as and is consistent with .
Sub-case 2.d. Assume now that and that is not responsible for the last transition. Then is of the form
for some state of . As in sub-case 2.c, we extend using the same states and change the moves along the above to ensure consistency with . Let and be defined identically to sub-case 2.c. We define to be . The sequence is a well-defined play (the last transition depends on ’s move), is consistent with and shares the same sequence of states as .
The inductive construction above yields a play that shares the same sequence of states as . It remains to show that if is time-convergent and -blameless, then is -blameless. Assume that is time-convergent and -blameless. By Proposition 4.7, either does not leave a bad location from some point on or satisfies . An expansion always exits a bad location at the step following entry; therefore, is necessarily of the second kind, as it shares its sequence of states with . Thus, from some point on, the inductive construction of is done using cases 1, 2.a or 2.b. Furthermore, from some point on, ’s moves are no longer responsible for transitions in . In cases 1 and 2.a, is responsible for the added transition in if and only if is responsible for the last transition of (by construction of the expansion mapping). In case 2.b, is strictly faster in both plays. Therefore, if from some point on is no longer responsible for transitions in , then from some point on is no longer responsible for transitions in , i.e., is -blameless.
We can also translate -strategies defined on to -strategies on using the projection mapping. Translating strategies this way must be done with care: we must consider the case where a move suggested by the strategy in requires too long a delay to be played in the expanded TG . In this case, the suggested move is replaced by for a suitable delay . By construction, the translated strategy always suggests the move when the play ends in a bad location.
Similarly to the first translation, when deriving a strategy of by translating a strategy of , the projection of an outcome of may not be consistent with . However, we show that there is a play consistent with that shares the same sequence of states as , by using techniques similar to those used to prove Lemma 4.9. Analogously to Lemma 4.9, we establish that time-convergence and -blamelessness of imply -blamelessness of .
Lemma 4.11.
Let be a move-independent strategy of in . Let be a finite play in and let denote and . We set:
For all , there is a play such that shares the same sequence of states as and if is time-convergent and -blameless, then is -blameless.
Proof 4.12.
Fix . We construct the sought play by induction. We assume that at step of the construction, we have constructed that shares the same sequence of states as , and such that is consistent with .
The base case is straightforward. We set . Now, assume that we have constructed a finite play as described above and let us construct . Let be a move of and be a state of such that . Write , and .
We discuss different cases: 1. is an even number, 2. and 3. is an odd number. The third case is divided into three disjoint sub-cases: 3.a. ; 3.b. and ; 3.c. and .
Case 1. Assume is an even number. Then the same moves are enabled in and . By definition of and move-independence of , . We have . We define as the play . This is a well-defined play consistent with that shares the same sequence of states as , due to the fact that and share the same sequence of states and .
Case 2. Assume . We have , . Indeed, the only moves available in a bad location are and . The play either proceeds using an edge to or stays in . The values of clocks do not change in either of these cases. By definition of the projection mapping, . We let be : is a well-defined play (the last transition is valid by the move of ), is consistent with and has the same sequence of states as .
Case 3. Assume is an odd number. We study three sub-cases: a. ; b. and ; c. and . They are disjoint and cover all possibilities.
We start with case 3.a. Assume that . Then holds: first, to enter a bad location from an odd location, must reach , thus for ; second, the invariant of odd locations implies , thus no delay greater than can be played by either player in the state . It follows from that the delay of the move must be greater than or equal to (by line 1 of the definition of ). We have . We define to be . The sequence is a well-defined play consistent with ; the last step is valid because the delay of is at least . By construction, has the same sequence of states as .
Assume for cases 3.b and 3.c that . We start with case 3.b. Assume . Then because is a move enabled in the state which is constrained by the invariant . As , necessarily and is responsible for the last transition of . We have . In this case, we set . This is a well-defined play, as and is smaller than the delay suggested by by definition of .
For case 3.c, assume . Then by definition of . To avoid distinguishing cases following whether , let denote if and otherwise. We have . We let be the sequence . It is a well-defined play, consistent with and sharing the same sequence of states as . These last statements follow from and sharing the same sequence of states and move-independence of ensuring .
The previous inductive construction shows the existence of a play such that and share the same sequence of states. We now argue that if the constructed play is time-convergent and -blameless, then is also -blameless.
Assume that is time-convergent and -blameless. It follows that is also time-convergent because and share the same sequence of states (recall the projection mapping preserves time-convergence). By Proposition 4.7, either does not leave a bad location from some point on or satisfies . If from some point on does not leave a bad location, then it is -blameless by construction of : in a bad location, always suggests a move that exits the bad location. Assume now that satisfies . Then, from some point on, the construction of proceeds following cases 1, 3.b and 3.c of the inductive construction. In cases 1 and 3.c, the player responsible for the transition that is added is the same in both games. In case 3.b, is not responsible for the last transition of . Therefore, if from some point on is no longer responsible for transitions in , then from some point on, is no longer responsible for transitions in , i.e., is -blameless.
The translations of strategies described in Lemmas 4.9 and 4.11 establish a relation that associates with any outcome of a translated strategy an outcome of the original strategy sharing its states with the expansion (resp. projection) of this outcome. Recall that all the objectives we have considered are state-based, and therefore move-independent strategies suffice for these objectives. Using the relations developed in Lemmas 4.9 and 4.11 together with Lemma 4.3, we can establish that translating a winning strategy of yields a winning strategy of , and vice versa. From this, we conclude that the realizability problem on TGs with (resp. direct) timed window objectives can be reduced to the realizability problem on TGs with co-Büchi (resp. safety) objectives.
Theorem 4.13.
Let be the initial state of and be the initial state of . There is a winning strategy for for the objective (resp. ) from in if and only if there is a winning strategy for for the objective (resp. ) from in .
Proof 4.14.
Let .
Assume has a winning strategy for the objective in from . We assume this strategy to be move-independent because is a state-based objective. By Lemma 4.11, there is a strategy such that for any play , there is a play such that the sequence of states of and coincide, and if is time-convergent and -blameless, then is -blameless.
Fix and let be as above. We establish that . Because is winning, . First, assume is time-divergent. Then is also time-divergent. It follows that , and since is state-based, . Lemma 4.3 ensures that . Assume now that is time-convergent. Then is also time-convergent. Because is winning, is -blameless. This implies -blamelessness of . Thus, we have , and have shown that is winning.
Conversely, assume has a winning strategy in for objective from . We assume that the strategy is move-independent. We show that the strategy defined in Lemma 4.9 is winning in from . Fix . Let be such that the sequence of states of coincides with that of , and such that if is time-convergent and -blameless, then is also -blameless. We prove that and distinguish cases following time-divergence of . Assume is time-divergent; then so is . The play is consistent with a winning strategy and time-divergent, thus . Safety and co-Büchi objectives are state-based, hence if and only if . It follows from Lemma 4.3 that . Now assume that is time-convergent. Then is time-convergent, thus -blameless, therefore is also -blameless. We have proven that , which ends the proof.
5 Multi-dimensional objectives
We can generalize the preceding reduction to conjunctions of (resp. direct) timed window parity objectives. Fix a TG with set of locations . We consider a -dimensional priority function . We write for each component function. Fix a vector of bounds on window sizes.
Generalized (resp. direct) timed window objectives are defined as conjunctions of (resp. direct) timed window objectives. Formally, the generalized timed window (parity) objective is defined as
and the generalized direct timed window (parity) objective as
The verification and realizability problems for these objectives can be solved using a similar construction to the single-dimensional case. The inductive property (Lemma 3.3) ensures only one window needs to be monitored at a time on each dimension. We adapt the construction of the expanded TA to keep track of several windows at once.
Expanded locations are labeled with vectors where represents the smallest priority in the current window on dimension for each . Besides these, for each location , there is an expanded location marked by . We do not keep track of the dimension on which a bad window was seen: we only consider one kind of bad location. To measure the size of a window in each dimension, we introduce new clocks . For any , denote by the set of indices of components that are odd.
Updates of the vector of priorities are independent between dimensions and follow the same logic as the single-dimensional case: there, an edge leaving an even location leads to a location of the form and an edge leaving an odd location leads to a location of the form . The behavior of edges is similar here. The update function generalizes the handling of updates. Let be a vector of current priorities and be a location of . We set to be the vector such that if and otherwise.
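As an illustration, the per-dimension logic of the update function can be sketched as follows; the name `update_priorities` and the encoding of priority vectors as Python lists are our own illustrative assumptions, not notation from the construction.

```python
def update_priorities(current, priorities_of_target):
    """Illustrative sketch of the per-dimension priority update.

    `current` is the vector of smallest priorities seen in the
    currently open window on each dimension; `priorities_of_target`
    is the priority vector of the location being entered.  On each
    dimension, an even current priority means the window has closed,
    so tracking restarts from the new location's priority; an odd
    current priority keeps the window open, so the minimum is kept.
    """
    return [
        q if p % 2 == 0 else min(p, q)
        for p, q in zip(current, priorities_of_target)
    ]

# The window that closed on dimension 0 restarts at the new priority,
# while dimension 1 keeps its running minimum.
print(update_priorities([2, 3], [5, 4]))  # -> [5, 3]
```

Each dimension is updated independently, mirroring the claim above that updates do not interact across dimensions.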
For any non-bad location and edge leaving location , there is a matching edge in the expanded TA. The target location of this edge is . Its guard is a strengthened version of : it is disabled if a bad window is detected on some dimension by adding a conjunct for each to the guard of the edge. In the same manner that odd locations were equipped with a strengthened invariant of the form , expanded locations are equipped with an invariant that is the conjunction of with the conjunction over of .
It remains to discuss how bad locations are generalized. They have the invariant ( is chosen arbitrarily) and for each and location , there is an edge . In the single-dimensional case, each odd location had an edge to a bad location with a guard . Analogously, for any location with non-empty, there are edges to . There are two such edges (one for each ) per dimension in . Recall that there cannot be two edges and with satisfiable. This prevents adding edges guarded by . However, we do not need more than edges. For instance, instead of having one edge guarded by and another guarded by , we can replace the guard of the second edge by to ensure the guards are incompatible. Upon entry to a bad location, all additional clocks , …, are reset, no matter the dimension on which a bad window was detected.
The formal definition of follows.
Definition 5.1.
Given a TA , is defined to be the TA such that
- ;
- ;
- where ;
- where , ;
- for all , , and for all ;
- the set of edges of is the smallest set that satisfies the following rules:
  - if and , then
  - for all , , and , where the clock condition is defined by
  - for all and ,
For a TG , we set .
The generalized (resp. direct) timed window objective on a TA or TG translates to a co-Büchi (resp. safety) objective on the expansion and respectively. Let . We can derive the following theorems by adapting the proofs of the single-dimensional case (Theorems 4.5 and 4.13). The only nuance is that there are several dimensions to be handled in parallel. However, different dimensions minimally interact with one another. The only way a dimension may affect the others is by resetting them in the expansion (when a bad location is visited), and that would entail a bad window was detected on one dimension.
Theorem 5.2.
All time-divergent paths of satisfy the generalized (resp. direct) timed window objective if and only if all time-divergent paths of satisfy the co-Büchi (resp. safety) objective over bad locations.
Theorem 5.3.
There is a winning strategy for for the objective (resp. ) from the initial state of if and only if there is a winning strategy for for the objective (resp. ) from the initial state of .
6 Algorithms and complexity
This section presents algorithms for solving the verification and realizability problems for the generalized (resp. direct) timed window parity objective. We establish lower bounds that match the complexity class of our algorithms. We consider the general multi-dimensional setting and we denote by the number of timed window parity objectives under consideration.
6.1 Algorithms
We use the construction presented in the previous sections to solve both the realizability problem and the verification problem for the (resp. direct) timed window objective. In both cases, we invoke known sub-algorithms. For games, we rely on a general algorithm for solving TGs with -regular location-based objectives. For automata, we use an algorithm for the emptiness problem of timed automata.
First, we analyze the time required to construct the expanded TA with respect to the inputs to the problem.
Lemma 6.1.
Let be a TA. Let be a -dimensional priority function and be a vector of bounds on window sizes. The expanded TA can be computed in time exponential in and polynomial in the size of , the size of , the size of , , the length of the encoding of the clock constraints of and in the encoding of .
Proof 6.2.
We analyze each component of . First, we study the size of the set of expanded locations. The set of expanded locations is given by . Therefore, . The set of clocks contains clocks.
To determine a bound on the size of the set of edges, we rely on the rules that define . For each edge and , there is an edge in exiting location . There are edges obtained this way. Furthermore, for each non-bad expanded location , there are up to edges to the location . There are at most edges obtained by this rule. Finally, it remains to count the edges that leave bad locations. There are two such edges per bad location, for a total of . Overall, an upper bound on the size of is given by .
It remains to study the total length of the encoding of the clock constraints of . We have shown above that and are of size exponential in and polynomial in , and . There are as many clock constraints in as there are locations and edges. Therefore, it suffices to show that the encoding of each individual clock constraint of is polynomial in the total length of the encoding of the clock constraints of , and the encoding of to end the proof.
A clock constraint of is either derived from a clock constraint of , the invariant of a bad location or the guard of an edge to or from a bad location. Clock constraints derived from are either unchanged, or they are obtained by reinforcing a clock constraint of with conjuncts in the case of invariants or with conjuncts in the case of guards. At most, we extend clock constraints of with conjuncts that can be encoded linearly w.r.t. the encoding of the . The invariant of bad locations is and is therefore constant in length. The guards of edges exiting bad locations are and are also constant in length. Finally, guards of edges to bad locations are conjunctions of the form , which can be encoded linearly w.r.t. and the encoding of the .
Timed games.
To solve the realizability problem, we rely on the algorithm of [20], which requires objectives to be specified by deterministic parity automata; we introduce these now. For all finite alphabets , deterministic parity automata can represent all -regular subsets of . We use deterministic parity automata to encode location-based objectives (the set of locations of the studied timed automaton serves as the alphabet of the parity automaton).
Let be a finite non-empty alphabet. A deterministic parity automaton (DPA) of order is a tuple , where is a finite set of states, is the initial state, is the transition function and is a function assigning a priority to each state of the DPA. An execution of on an infinite word is an infinite sequence of states that starts in the initial state of , i.e., and such that for all , , i.e., each step of the execution is performed by reading a letter of the input word. An infinite word is accepted by if there is an execution such that the smallest priority appearing infinitely often along the execution is even, i.e. if . A DPA is total if its transition function is total, i.e., for all and , is defined.
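To make the acceptance condition concrete, the following sketch checks whether a total DPA accepts an ultimately periodic word given as a finite prefix followed by an infinitely repeated cycle; the dictionary-based encoding and all names (`dpa_accepts_lasso`, `delta`, `priority`) are illustrative assumptions, not part of the paper's formalism.

```python
def dpa_accepts_lasso(delta, priority, q0, prefix, cycle):
    """Check acceptance of the word prefix . cycle^omega by a total DPA.

    `delta` maps (state, letter) to the successor state, `priority`
    maps states to priorities, and `q0` is the initial state.  The
    word is accepted iff the smallest priority among the states
    visited infinitely often along the (unique) run is even.
    """
    # Run the prefix.
    q = q0
    for a in prefix:
        q = delta[(q, a)]
    # Record the run state at each cycle boundary until one repeats;
    # the run is then periodic from that repeated state onward.
    boundary = [q]
    while True:
        for a in cycle:
            q = delta[(q, a)]
        if q in boundary:
            break
        boundary.append(q)
    # States visited infinitely often are those seen while reading the
    # cycle from the first occurrence of q back to itself.
    start = boundary.index(q)
    recurrent = []
    for s in boundary[start:]:
        for a in cycle:
            recurrent.append(s)
            s = delta[(s, a)]
    return min(priority[s] for s in recurrent) % 2 == 0
```

Since the automaton is deterministic and total, the run on such a word is unique and eventually periodic, which is why unrolling the cycle until a boundary state repeats suffices.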
The algorithm of [20] checks the existence of a -winning strategy in a TG where the set of locations of is and the set of clocks is with a location objective specified by a total DPA with set of states and of order in time
(2)
where is the largest constant to which is compared to in the clock constraints of .
We now show that the realizability problem for generalized (resp. direct) timed window objectives is in EXPTIME using the algorithm described above. This essentially boils down to specifying DPAs that encode safety and co-Büchi objectives and using these along with the algorithm of [20] to check the existence of a -winning strategy in .
Proof 6.3.
Fix a TG and let denote the set of locations of . Let be a multi-dimensional priority function and be a vector of window sizes. Let denote the set of locations of . We describe total DPAs and for the objectives and in the TG .
We encode the safety objective using a DPA with two states. Intuitively, the initial state is never left as long as no bad location is read. If the DPA reads a bad location, it moves to a sink state, representing that the safety objective was violated. The initial state is given an even priority and the sink state an odd priority so that accepting executions are those that read sequences of locations that respect the safety objective. Formally, let be the DPA where , and , and the transition function is defined by for all , for all and for all . This DPA is total by construction.
The co-Büchi objective is also encoded by a DPA with two states. The first state is entered every time a non-bad location is read by the DPA and the second state whenever a bad location is read. Runs that visit infinitely often violate the co-Büchi objective, therefore we give an odd priority smaller than the even priority of . Formally, let be the DPA where , and , and the transition function is defined by for all and , and for all and . This deterministic parity automaton is total by construction.
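The two automata just described can be written down concretely; the following sketch builds them over an arbitrary location alphabet, with state names, the priorities 0/1/2, and the dictionary encoding chosen for illustration only.

```python
def safety_dpa(locations, bad):
    """Two-state DPA for the safety objective: stay in the even-priority
    initial state while no bad location is read; fall into an
    odd-priority sink once one is read."""
    delta = {}
    for l in locations:
        delta[('safe', l)] = 'sink' if l in bad else 'safe'
        delta[('sink', l)] = 'sink'
    return delta, {'safe': 0, 'sink': 1}, 'safe'

def cobuchi_dpa(locations, bad):
    """Two-state DPA for the co-Buchi objective: the state records
    whether the last location read was bad.  The 'bad seen' state has
    an odd priority smaller than the even priority of the other state,
    so acceptance fails iff bad locations are read infinitely often."""
    delta = {}
    for l in locations:
        target = 'last_bad' if l in bad else 'last_good'
        delta[('last_good', l)] = target
        delta[('last_bad', l)] = target
    return delta, {'last_good': 2, 'last_bad': 1}, 'last_good'
```

Both automata are total by construction, matching the requirement of the algorithm of [20].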
By Theorem 5.3, it suffices to check the existence of a winning strategy for in the expanded TG to answer the realizability problem over . By Lemma 6.1, the construction of the TG from takes exponential time w.r.t. the inputs to the problem.
Recall that there are locations in . We have shown that the objective (resp. ) can be encoded using a DPA with two states and order at most . Combining this with equation (2), it follows that we can check the existence of a -winning strategy in for a safety or co-Büchi objective in time
(3)
where is the largest constant to which is compared to in the clock constraints of .
Overall, can be constructed in exponential time and the existence of a -winning strategy in can be checked in exponential time, establishing EXPTIME-membership of the realizability problem for (resp. direct) timed window objectives.
Timed automata.
We describe a polynomial space algorithm for the verification problem for generalized (resp. direct) timed window objectives derived from our reduction. First, we remark that to check that all time-divergent paths of a TA satisfy a conjunction of objectives, we can proceed one objective at a time: for any family of objectives, there is some time-divergent path that does not satisfy the conjunction of objectives in the family if and only if, for some objective from the family, there is some time-divergent path that does not satisfy it. This contrasts with TGs, in which may have a winning strategy for each individual objective but is unable to satisfy their conjunction.
Following the observation above, we establish membership in PSPACE of the verification problem for the generalized (resp. direct) timed window objective in two steps. First, we argue that the verification problem is in PSPACE for the single-dimensional (resp. direct) timed window parity objective. Then, in the multi-dimensional setting, we use the single-dimensional case as an oracle to check satisfaction of a generalized objective one dimension at a time.
Lemma 6.4.
The generalized (resp. direct) timed window verification problem is in PSPACE.
Proof 6.5.
Fix a TA , a priority function and a bound on window sizes. By Theorem 4.5, the verification problem on for the (resp. direct) timed window objective can be reduced to the verification problem on for the co-Büchi objective (resp. the safety objective ). The complement of a co-Büchi (resp. safety) objective is a Büchi (resp. reachability) objective. Therefore, there is a time-divergent path of that does not satisfy the (resp. direct) timed window objective if and only if there is some time-divergent path in that satisfies (resp. ).
Our algorithm for the verification problem on for the (resp. direct) timed window objective proceeds as follows: construct , check whether there is some time-divergent path in satisfying (resp. ), and return no if that is the case and yes otherwise.
This algorithm runs in polynomial space. By Lemma 6.1, as we have fixed , can be computed in time polynomial in , the sizes of , , , the length of the encoding of the clock constraints of and the encoding of . In other words, the verification problem for the (resp. direct) timed window objective can be reduced in polynomial time to checking the existence of a time-divergent path of a TA satisfying a Büchi (resp. reachability) objective. Checking the existence of a time-divergent path satisfying a Büchi or reachability objective can be done in polynomial space [2]. Thus, the verification problem for the (one-dimensional) (resp. direct) timed window objective can be solved in polynomial space.
For the -dimensional case, the previous algorithm can be used on each individual component to check that every dimension satisfies its respective objective. Complexity-wise, this shows that the multi-dimensional problem belongs to (with an oracle in PSPACE for the single-dimensional case). Because this class coincides with PSPACE [6], the generalized (resp. direct) timed window verification problem is in PSPACE.
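The dimension-by-dimension check can be summarized by the following sketch, where `verify_single_dimension` stands in for the single-dimensional PSPACE procedure described above (its name and signature are our own assumptions):

```python
def verify_generalized(dimensions, verify_single_dimension):
    """Illustrative reduction of multi-dimensional verification to the
    single-dimensional case: all time-divergent paths satisfy the
    conjunction iff every conjunct is satisfied by all of them, so each
    dimension can be checked independently, e.g. via a PSPACE oracle
    for the single-dimensional problem."""
    return all(verify_single_dimension(k) for k in range(dimensions))
```

Note that this independence is specific to verification; as remarked above, in games a player may win each objective separately yet lose their conjunction.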
6.2 Lower bounds
We have presented algorithms whose complexity class is the same for one or for multiple dimensions. In this section, we establish that our bounds are tight. It suffices to show hardness for the single-dimensional problems, as they are subsumed by the -dimensional case.
The verification and realizability problems for timed window objectives can be shown to be at least as hard as the verification and realizability problems for safety objectives. The safety verification problem is PSPACE-complete [2]. The realizability problem for safety objectives is EXPTIME-complete (this follows from EXPTIME-completeness of the safety control problem [22]). The same construction is used for both the verification and realizability problems. Given a timed automaton, we expand it so as to encode in locations whether the safety objective has been violated at some point. We assign an even priority to locations indicating that the safety objective was never violated and an odd priority to the other locations: as long as the safety objective is not violated, windows close immediately, and as soon as it is violated, it is no longer possible to close windows.
Lemma 6.6.
The verification (resp. realizability) problem for the (direct) timed window objective is PSPACE-hard (resp. EXPTIME-hard).
Proof 6.7.
Fix a TA and a set of unsafe locations . We construct a TA and a priority function such that all time-divergent initial paths of satisfy if and only if all time-divergent initial paths of satisfy (resp. ) (the choice of 1 as the bound on the window size is arbitrary; the reduction works for any bound on the size of windows). We encode the safety objective in a TA as a (resp. direct) timed window objective.
We expand locations of with a Boolean representing whether was visited. Formally, let be the TA with , if and otherwise, for all and , and the set of edges is the smallest set satisfying, for each edge : if ; if ; and . The priority function defined over locations of is defined by .
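The location expansion and priority assignment of this reduction can be sketched as follows; the function and variable names are illustrative, and the transition structure of the TA itself is elided.

```python
def safety_to_window_parity(locations, unsafe):
    """Sketch of the hardness reduction: each location l of the original
    TA is paired with a Boolean recording whether an unsafe location has
    been visited.  Pairs where the flag is False get even priority 0,
    so windows close immediately; pairs where it is True get odd
    priority 1, so windows can never close again."""
    expanded = [(l, b) for l in locations for b in (False, True)]
    priority = {(l, b): 1 if b else 0 for (l, b) in expanded}
    # The flag is set upon entering an unsafe location and never cleared,
    # matching the fact that the flagged locations cannot be left.
    def next_flag(b, target):
        return b or (target in unsafe)
    return expanded, priority, next_flag
```

The doubling of locations and edges, with clocks untouched, is what makes the reduction polynomial, as argued below.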
There is a natural bijection between the set of initial paths of and the set of initial paths of . An initial path is mapped via to the initial path of , where the sequence is if for all , and otherwise, for all and for all , where denotes . This mapping is well-defined and bijective: the same moves are enabled in a state of and in the state of for all . Furthermore, for all and all moves enabled in , holds if and only if there is some such that
The bijection preserves time-divergence. Furthermore, a path of satisfies if and only if does not visit a location of the form . The initial paths of that visit a location of the form are exactly those that do not satisfy the (resp. direct) timed window objective: once such a location is entered, it is no longer possible to close windows, as the set of locations cannot be left by construction. Therefore, there is a time-divergent path of that does not satisfy if and only if there is a time-divergent path of that does not satisfy (resp. ). Furthermore, the TA has the same set of clocks as and twice as many locations and edges as , and the overall length of the encoding of the clock constraints of is double that of . This shows that the safety verification problem can be reduced in polynomial time to the verification problem for the (resp. direct) timed window objective, establishing PSPACE-hardness of the verification problem for (resp. direct) timed window objectives.
The same construction can be used for the realizability problem. There is an analogous mapping for initial plays. This mapping can be used to establish a bijection between the restriction of strategies over initial paths in the safety TG and the (resp. direct) timed window TG. It follows that the realizability problem for safety objectives can be reduced to the realizability problem for (resp. direct) timed window objectives in polynomial time, establishing EXPTIME-hardness of the realizability problem for (resp. direct) timed window objectives.
6.3 Wrap-up
We summarize our complexity results in the following theorem.
Theorem 6.8.
The verification problem for the (direct) generalized timed window parity objective is PSPACE-complete and the realizability problem for the (direct) generalized timed window parity objective is EXPTIME-complete.
We conclude with a comparison of TGs with parity objectives and TGs with (resp. direct) timed window objectives. It was shown in [19] that TGs with parity objectives can be reduced to turn-based parity games on graphs: to check whether a winning strategy exists in a TG with a parity objective, one constructs a turn-based parity game on a graph whose number of states is polynomial in the size of the region abstraction of the TG, which is exponential in the size of the TA, and whose number of priorities is that of the original objective. Despite recent progress on quasi-polynomial-time algorithms for parity games [16], no polynomial-time algorithm is known, and, in many techniques, the blow-up is due to the number of priorities. Therefore, checking the existence of a winning strategy in a TG with a parity objective suffers from a blow-up related to both the number of priorities and the size of the region abstraction.
In contrast, our solution for TGs with (resp. direct) timed window objectives does not suffer from a blow-up due to the number of priorities: for a single dimension, the complexity given in (3) is polynomial in the number of states of the region abstraction and in the number of priorities. This shows that adding time bounds to parity objectives in TGs comes for free complexity-wise.
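To make the window mechanism concrete, here is a discrete analogue of the direct fixed window parity condition on a finite sequence of priorities, where the elapse of time is replaced by steps and `lmbda` plays the role of the window bound. This is an illustrative sketch only, not the verification algorithm of the paper:

```python
def direct_fixed_window_parity(priorities, lmbda):
    """Check a discrete analogue of the direct fixed window parity
    condition on a finite prefix: from every position, the minimum
    priority over some window of at most lmbda steps must be even."""
    n = len(priorities)
    for i in range(n):
        closed = False
        lo = priorities[i]
        for j in range(i, min(i + lmbda, n)):
            lo = min(lo, priorities[j])
            if lo % 2 == 0:  # window closes: smallest priority seen is even
                closed = True
                break
        if not closed:
            return False
    return True
```

For instance, the sequence 1, 0, 1, 0 satisfies the condition for a window bound of 2 (each odd priority is followed one step later by the smaller even priority 0), while 1, 2, 1, 0 does not: the window opened at the first position sees minimum priority 1 for two steps.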
References
- [1] Rajeev Alur, Costas Courcoubetis, and David L. Dill. Model-checking in dense real-time. Inf. Comput., 104(1):2–34, 1993. doi:10.1006/inco.1993.1024.
- [2] Rajeev Alur and David L. Dill. A theory of timed automata. Theor. Comput. Sci., 126(2):183–235, 1994. doi:10.1016/0304-3975(94)90010-8.
- [3] Christel Baier. Reasoning about cost-utility constraints in probabilistic models. In Mikolaj Bojanczyk, Slawomir Lasota, and Igor Potapov, editors, Reachability Problems - 9th International Workshop, RP 2015, Warsaw, Poland, September 21-23, 2015, Proceedings, volume 9328 of Lecture Notes in Computer Science, pages 1–6. Springer, 2015. doi:10.1007/978-3-319-24537-9\_1.
- [4] Christel Baier and Joost-Pieter Katoen. Principles of model checking. MIT Press, 2008.
- [5] Christel Baier, Joachim Klein, Sascha Klüppelholz, and Sascha Wunderlich. Weight monitoring with linear temporal logic: complexity and decidability. In Thomas A. Henzinger and Dale Miller, editors, Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS ’14, Vienna, Austria, July 14 - 18, 2014, pages 11:1–11:10. ACM, 2014. doi:10.1145/2603088.2603162.
- [6] Theodore P. Baker, John Gill, and Robert Solovay. Relativizations of the P =? NP question. SIAM J. Comput., 4(4):431–442, 1975. doi:10.1137/0204037.
- [7] Raphaël Berthon, Mickael Randour, and Jean-François Raskin. Threshold constraints with guarantees for parity objectives in Markov decision processes. In Ioannis Chatzigiannakis, Piotr Indyk, Fabian Kuhn, and Anca Muscholl, editors, 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017, July 10-14, 2017, Warsaw, Poland, volume 80 of LIPIcs, pages 121:1–121:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPIcs.ICALP.2017.121.
- [8] Benjamin Bordais, Shibashis Guha, and Jean-François Raskin. Expected window mean-payoff. In Arkadev Chattopadhyay and Paul Gastin, editors, 39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2019, December 11-13, 2019, Bombay, India, volume 150 of LIPIcs, pages 32:1–32:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019. doi:10.4230/LIPIcs.FSTTCS.2019.32.
- [9] Patricia Bouyer, Thomas Brihaye, Mickael Randour, Cédric Rivière, and Pierre Vandenhove. Decisiveness of stochastic systems and its application to hybrid models. In Jean-François Raskin and Davide Bresolin, editors, Proceedings 11th International Symposium on Games, Automata, Logics, and Formal Verification, GandALF 2020, Brussels, Belgium, September 21-22, 2020, volume 326 of EPTCS, pages 149–165, 2020. doi:10.4204/EPTCS.326.10.
- [10] Tomás Brázdil, Krishnendu Chatterjee, Vojtech Forejt, and Antonín Kucera. Trading performance for stability in Markov decision processes. J. Comput. Syst. Sci., 84:144–170, 2017. doi:10.1016/j.jcss.2016.09.009.
- [11] Tomás Brázdil, Vojtech Forejt, Antonín Kucera, and Petr Novotný. Stability in graphs and games. In Desharnais and Jagadeesan [21], pages 10:1–10:14. doi:10.4230/LIPIcs.CONCUR.2016.10.
- [12] Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, and Mickael Randour. Life is random, time is not: Markov decision processes with window objectives. Log. Methods Comput. Sci., 16(4), 2020. URL: https://lmcs.episciences.org/6975.
- [13] Véronique Bruyère, Emmanuel Filiot, Mickael Randour, and Jean-François Raskin. Meet your expectations with guarantees: Beyond worst-case synthesis in quantitative games. Inf. Comput., 254:259–295, 2017. doi:10.1016/j.ic.2016.10.011.
- [14] Véronique Bruyère, Quentin Hautem, and Mickael Randour. Window parity games: an alternative approach toward parity games with time bounds. In Domenico Cantone and Giorgio Delzanno, editors, Proceedings of the Seventh International Symposium on Games, Automata, Logics and Formal Verification, GandALF 2016, Catania, Italy, 14-16 September 2016, volume 226 of EPTCS, pages 135–148, 2016. doi:10.4204/EPTCS.226.10.
- [15] Véronique Bruyère, Quentin Hautem, and Jean-François Raskin. On the complexity of heterogeneous multidimensional games. In Desharnais and Jagadeesan [21], pages 11:1–11:15. doi:10.4230/LIPIcs.CONCUR.2016.11.
- [16] Cristian S. Calude, Sanjay Jain, Bakhadyr Khoussainov, Wei Li, and Frank Stephan. Deciding parity games in quasipolynomial time. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 252–263. ACM, 2017. doi:10.1145/3055399.3055409.
- [17] Krishnendu Chatterjee, Laurent Doyen, Mickael Randour, and Jean-François Raskin. Looking at mean-payoff and total-payoff through windows. Inf. Comput., 242:25–52, 2015. doi:10.1016/j.ic.2015.03.010.
- [18] Krishnendu Chatterjee, Thomas A. Henzinger, and Florian Horn. Finitary winning in omega-regular games. ACM Trans. Comput. Log., 11(1):1:1–1:27, 2009. doi:10.1145/1614431.1614432.
- [19] Krishnendu Chatterjee, Thomas A. Henzinger, and Vinayak S. Prabhu. Timed parity games: Complexity and robustness. Log. Methods Comput. Sci., 7(4), 2011. doi:10.2168/LMCS-7(4:8)2011.
- [20] Luca de Alfaro, Marco Faella, Thomas A. Henzinger, Rupak Majumdar, and Mariëlle Stoelinga. The element of surprise in timed games. In Roberto M. Amadio and Denis Lugiez, editors, CONCUR 2003 - Concurrency Theory, 14th International Conference, Marseille, France, September 3-5, 2003, Proceedings, volume 2761 of Lecture Notes in Computer Science, pages 142–156. Springer, 2003. doi:10.1007/978-3-540-45187-7\_9.
- [21] Josée Desharnais and Radha Jagadeesan, editors. 27th International Conference on Concurrency Theory, CONCUR 2016, August 23-26, 2016, Québec City, Canada, volume 59 of LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016. URL: http://www.dagstuhl.de/dagpub/978-3-95977-017-0.
- [22] Thomas A. Henzinger and Peter W. Kopke. Discrete-time control for rectangular hybrid automata. Theor. Comput. Sci., 221(1-2):369–392, 1999. doi:10.1016/S0304-3975(99)00038-9.
- [23] Paul Hunter, Guillermo A. Pérez, and Jean-François Raskin. Looking at mean payoff through foggy windows. Acta Inf., 55(8):627–647, 2018. doi:10.1007/s00236-017-0304-7.
- [24] James C. A. Main, Mickael Randour, and Jeremy Sproston. Time flies when looking out of the window: Timed games with window parity objectives. In Serge Haddad and Daniele Varacca, editors, 32nd International Conference on Concurrency Theory, CONCUR 2021, August 23–27, 2021, Paris, France (Virtual Conference), volume 203 of LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
- [25] Oded Maler, Amir Pnueli, and Joseph Sifakis. On the synthesis of discrete controllers for timed systems (an extended abstract). In Ernst W. Mayr and Claude Puech, editors, STACS 95, 12th Annual Symposium on Theoretical Aspects of Computer Science, Munich, Germany, March 2-4, 1995, Proceedings, volume 900 of Lecture Notes in Computer Science, pages 229–242. Springer, 1995. doi:10.1007/3-540-59042-0\_76.
- [26] Mickael Randour. Automated synthesis of reliable and efficient systems through game theory: A case study. In Proc. of ECCS 2012, Springer Proceedings in Complexity XVII, pages 731–738. Springer, 2013. doi:10.1007/978-3-319-00395-5\_90.
- [27] Stéphane Le Roux, Arno Pauly, and Mickael Randour. Extending finite-memory determinacy by Boolean combination of winning conditions. In Sumit Ganguly and Paritosh K. Pandya, editors, 38th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2018, December 11-13, 2018, Ahmedabad, India, volume 122 of LIPIcs, pages 38:1–38:20. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.FSTTCS.2018.38.