Serge Haddad and Daniele Varacca (Eds.): 32nd International Conference on Concurrency Theory (CONCUR 2021), August 23–27, 2021, Virtual Conference. LIPIcs, Volume 203, Article No. 32.
Thorsten Wißmann, Radboud University, Nijmegen, The Netherlands (https://orcid.org/0000-0001-8993-6486). Work forms part of the NWO TOP project 612.001.852 and the DFG-funded project COAX (MI 717/5-2).
Stefan Milius, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany (https://orcid.org/0000-0002-2021-1644). Work forms part of the DFG-funded project CoMoC (MI 717/7-1).
Lutz Schröder, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany (https://orcid.org/0000-0002-3146-5906). Work forms part of the DFG-funded project CoMoC (SCHR 1118/15-1).
Copyright: T. Wißmann, S. Milius, and L. Schröder. Related version: Full Version with Appendix, https://arxiv.org/abs/2105.00669.
2012 ACM Subject Classification: Theory of computation → Logic; Theory of computation → Program analysis.
Explaining Behavioural Inequivalence Generically
in Quasilinear Time
Abstract
We provide a generic algorithm for constructing formulae that distinguish behaviourally inequivalent states in systems of various transition types such as nondeterministic, probabilistic or weighted; genericity over the transition type is achieved by working with coalgebras for a set functor in the paradigm of universal coalgebra. For every behavioural equivalence class in a given system, we construct a formula which holds precisely at the states in that class. The algorithm instantiates to deterministic finite automata, transition systems, labelled Markov chains, and systems of many other types. The ambient logic is a modal logic featuring modalities that are generically extracted from the functor; these modalities can be systematically translated into custom sets of modalities in a postprocessing step. The new algorithm builds on an existing coalgebraic partition refinement algorithm. It runs in time O((m + n) log n) on systems with n states and m transitions, and the same asymptotic bound applies to the dag size of the formulae it constructs. This improves the bounds on run time and formula size compared to previous algorithms even for previously known specific instances, viz. transition systems and Markov chains; in particular, the best previous bound for transition systems was O(m n).
keywords:
bisimulation, partition refinement, modal logic, distinguishing formulae, coalgebra

1 Introduction
For finite transition systems, the Hennessy-Milner theorem guarantees that two states are bisimilar if and only if they satisfy the same modal formulae. This implies that whenever two states are not bisimilar, then one can find a modal formula that holds at one of the states but not at the other. Such a formula explains the difference of the two states’ behaviour and is thus usually called a distinguishing formula [13]. For example, in the transition system in Figure 2, the formula distinguishes the states and because satisfies whereas does not. Given two states in a finite transition system with n states and m transitions, the algorithm by Cleaveland [13] computes a distinguishing formula in time O(m n). The algorithm builds on the Kanellakis-Smolka partition refinement algorithm [28, 29], which computes the bisimilarity relation on a transition system within the same time bound.
Similar logical characterizations of bisimulation exist for other system types. For instance, Desharnais et al. [16, 17] characterize probabilistic bisimulation on (labelled) Markov chains, in the sense of Larsen and Skou [33] (for each label, every state has either no successors or a probability distribution on successors). In their logic, a formula holds at states that have a transition probability of at least to states satisfying . For example, the state in Figure 2 satisfies but does not. Desharnais et al. provide an algorithm that computes distinguishing formulae for labelled Markov chains in run time (roughly) .
In the present work, we construct such counterexamples generically for a variety of system types. We achieve genericity over the system type by modelling state-based systems as coalgebras for a set functor in the framework of universal coalgebra [40]. Examples of coalgebras for a set functor include transition systems, deterministic automata, or weighted systems (e.g. Markov chains). Universal coalgebra provides a generic notion of behavioural equivalence that instantiates to standard notions for concrete system types, e.g. bisimilarity (transition systems), language equivalence (deterministic automata), or probabilistic bisimilarity (Markov chains). Moreover, coalgebras come equipped with a generic notion of modal logic that is parametric in a choice of modalities whose semantics is constructed so as to guarantee invariance w.r.t. behavioural equivalence; under easily checked conditions, such a coalgebraic modal logic in fact characterizes behavioural equivalence in the same sense as Hennessy-Milner logic characterizes bisimilarity [39, 42]. Hence, as soon as suitable modal operators are found, coalgebraic modal formulae serve as distinguishing formulae.
In a nutshell, the contribution of the present paper is an algorithm that computes distinguishing formulae for behaviourally inequivalent states in quasilinear time, and in fact certificates that uniquely describe behavioural equivalence classes in a system, in coalgebraic generality. We build on an existing efficient coalgebraic partition refinement algorithm [46], thus achieving run time O((m + n) log n) on coalgebras with n states and m transitions (in a suitable encoding). The dag size of formulae is also O((m + n) log n) (for tree size, exponential lower bounds are known [22]); even for labelled transition systems, we thus improve the previous best bound of O(m n) [13] for both run time and formula size. We systematically extract the requisite modalities from the functor at hand, requiring binary and nullary modalities in the general case, and then give a systematic method to translate these generic modal operators into more customary ones (such as the standard operators of Hennessy-Milner logic).
We subsequently identify a notion of cancellative functor that allows for additional optimization. E.g. functors modelling weighted systems are cancellative if and only if the weights come from a cancellative monoid, such as , or as used in probabilistic systems. For cancellative functors, much simpler distinguishing formulae can be constructed: the binary modalities can be replaced by unary ones, and only conjunction is needed in the propositional base. On labelled Markov chains, this complements the result that a logic with only conjunction and different unary modalities (mentioned above) suffices for the construction of distinguishing formulae (but not certificates) [17] (see also [19]).
Related Work
Cleaveland’s algorithm [13] for labelled transition systems is based on Kanellakis and Smolka’s partition refinement algorithm [29]. The coalgebraic partition refinement algorithm we employ [46] is instead related to the more efficient Paige-Tarjan algorithm [36]. König et al. [32] extract formulae from winning strategies in a bisimulation game in coalgebraic generality; their algorithm runs in and does not support negative transition weights. Characteristic formulae for behavioural equivalence classes taken across all models require the use of fixpoint logics [21]. The mentioned algorithm by Desharnais et al. for distinguishing formulae on labelled Markov processes [17, Fig. 4] is based on Cleaveland’s. No complexity analysis is made but the algorithm has four nested loops, so its run time is roughly . Bernardo and Miculan [10] provide a similar algorithm for a logic with only disjunction. There are further generalizations along other axes, e.g. to behavioural preorders [12]. The TwoTowers tool set for the analysis of stochastic process algebras [9, 8] computes distinguishing formulae for inequivalent processes, using variants of Cleaveland’s algorithm. Some approaches construct alternative forms of certificates for inequivalence, such as Cranen et al.’s notion of evidence [14] or methods employed on business process models, based on model differences and event structures [18, 6, 5].
2 Preliminaries
We first recall some basic notation. We denote by 0 = ∅, 1 = {0}, and 2 = {0, 1} the sets representing the natural numbers 0, 1, and 2. For every set X, there is a unique map ! : X → 1. We write Y^X for the set of functions X → Y. In particular, 2^X is the set of 2-valued predicates on X, which is in bijection with the powerset of X, i.e. the set of all subsets of X; in this bijection, a subset S ⊆ X corresponds to its characteristic function χ_S : X → 2, given by χ_S(x) = 1 if x ∈ S, and χ_S(x) = 0 otherwise. We generally indicate injective maps by ↪. Given maps f : X → Y, g : X → Z, we write ⟨f, g⟩ : X → Y × Z for the map given by ⟨f, g⟩(x) = (f(x), g(x)). We denote the disjoint union of sets X, Y by X + Y, with canonical inclusion maps in_1 : X → X + Y and in_2 : Y → X + Y. More generally, we write ∐_{i∈I} X_i for the disjoint union of an I-indexed family of sets (X_i)_{i∈I}, and in_i for the i-th inclusion map. For a map f : X → Y (not necessarily surjective), we denote by ker f the kernel of f, i.e. the equivalence relation

    ker f = {(x, x′) ∈ X × X | f(x) = f(x′)}.    (1)
Notation \thetheorem (Partitions).
Given an equivalence relation R on X, we write [x]_R for the equivalence class of x ∈ X. If R is the kernel of a map f, we simply write [x]_f in lieu of [x]_{ker f}. The intersection R ∩ S of equivalence relations is again an equivalence relation. The partition corresponding to R is denoted by X/R = {[x]_R | x ∈ X}. Note that [−]_R : X → X/R is a surjective map and that ker [−]_R = R.
A signature is a set Σ, whose elements are called operation symbols, equipped with a function ar : Σ → ℕ assigning to each operation symbol its arity. We write σ/n ∈ Σ for σ ∈ Σ with ar(σ) = n. We will apply the same terminology and notation to collections of modal operators.
2.1 Coalgebra
Universal coalgebra [40] provides a generic framework for the modelling and analysis of state-based systems. Its key abstraction is to parametrize notions and results over the transition type of systems, encapsulated as an endofunctor on a given base category. Instances cover, for example, deterministic automata, labelled (weighted) transition systems, and Markov chains.
Definition \thetheorem.
A set functor F assigns to every set X a set FX and to every map f : X → Y a map Ff : FX → FY such that identity maps and composition are preserved: F id_X = id_{FX} and F(g ∘ f) = Fg ∘ Ff. An F-coalgebra is a pair (C, c) consisting of a set C (the carrier) and a map c : C → FC (the structure). When F is clear from the context, we simply speak of a coalgebra.
In a coalgebra (C, c), we understand the carrier set C as consisting of states, and the structure c : C → FC as assigning to each state x ∈ C a structured collection c(x) ∈ FC of successor states, with the structure of collections determined by F. In this way, the notion of coalgebra subsumes numerous types of state-based systems, as illustrated next.
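Read computationally, the definition of a coalgebra is nothing but a function from states into an F-structured collection of successors. The following minimal Haskell rendering (our own illustration, not part of the formal development) fixes this reading; later sketches reuse the type synonym.

```haskell
-- A coalgebra for a functor f on a state space x: every state is mapped to an
-- f-structured collection of successors; f determines the transition type.
type Coalgebra f x = x -> f x
```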
Example \thetheorem.
-
1.
The powerset functor 𝒫 sends a set X to its powerset 𝒫X and a map f : X → Y to the map 𝒫f : 𝒫X → 𝒫Y taking direct images. A 𝒫-coalgebra c : C → 𝒫C is precisely a transition system: it assigns to every state x ∈ C a set c(x) ⊆ C of successor states, inducing a transition relation given by x → y iff y ∈ c(x). Similarly, coalgebras for the finite powerset functor 𝒫_f (with 𝒫_f X being the set of finite subsets of X) are finitely branching transition systems.
-
2.
Coalgebras for the functor FX = 2 × X^A, where A is a fixed input alphabet, are deterministic automata (without an explicit initial state). Indeed, a coalgebra structure ⟨o, t⟩ : C → 2 × C^A consists of a finality predicate o : C → 2 and a transition map t : C → C^A in curried form.
-
3.
Every signature defines a signature functor that maps a set to the set
whose elements we may understand as flat -terms with variables from . The action of on maps is then given by . For simplicity, we write (instead of ) for the coproduct injections, and in lieu of for the signature functor. States in -coalgebras describe possibly infinite -trees.
-
4.
For a commutative monoid , the monoid-valued functor [25] is given by
(2) on sets ; for a map , the map is defined by
A coalgebra is a finitely branching weighted transition system, where is the transition weight from to . For the Boolean monoid , we recover . Coalgebras for , with understood as the additive monoid of the reals, are -weighted transition systems. The functor
which assigns to a set the set of all finite probability distributions on (represented as finitely supported probability mass functions), is a subfunctor of .
-
5.
Functors can be composed; for instance, given a set of labels, the composite of and the functor (whose action on sets maps a set to the set ) is the functor , whose coalgebras are -labelled transition systems. Coalgebras for have been termed probabilistic transition systems [33] or labelled Markov chains [17], and coalgebras for are partial labelled Markov chains [17]. Coalgebras for are variously known as simple Segala systems or Markov decision processes.
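As a concrete illustration of the examples above (our own sketch; the state spaces and transition structures are made up), transition systems, deterministic automata and Markov chains all arise by instantiating the functor in the `Coalgebra` type synonym from the previous sketch.

```haskell
{-# LANGUAGE DeriveFunctor #-}
import qualified Data.Set as Set
import qualified Data.Map.Strict as Map

type Coalgebra f x = x -> f x

-- (1) Finite powerset functor: finitely branching transition systems.
ts :: Coalgebra Set.Set Int
ts 0 = Set.fromList [1, 2]
ts 1 = Set.fromList [2]
ts _ = Set.empty                       -- state 2 is a deadlock

-- (2) Functor 2 x X^A: deterministic automata (finality flag plus successors).
data Dfa a x = Dfa { isFinal :: Bool, next :: a -> x } deriving Functor

evenAs :: Coalgebra (Dfa Char) Bool    -- state: parity of 'a's read so far
evenAs p = Dfa { isFinal = p, next = \c -> if c == 'a' then not p else p }

-- (4) Monoid-valued functor for (Double, +, 0), as finitely supported maps
-- (absent keys mean weight 0): a small Markov chain, i.e. every state's
-- weights form a probability distribution.
chain :: Int -> Map.Map Int Double     -- a coalgebra for the real-weighted functor
chain 0 = Map.fromList [(1, 0.5), (2, 0.5)]
chain 1 = Map.fromList [(1, 1.0)]
chain _ = Map.fromList [(2, 1.0)]
```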
We have a canonical notion of behaviour on -coalgebras:
Definition \thetheorem.
An F-coalgebra morphism h : (C, c) → (D, d) is a map h : C → D such that d ∘ h = Fh ∘ c. States x, y in an F-coalgebra (C, c) are behaviourally equivalent (x ∼ y) if there exists a coalgebra morphism h such that h(x) = h(y).
Thus, we effectively define the behaviour of a state as those of its properties that are preserved by coalgebra morphisms. The notion of behavioural equivalence subsumes standard branching-time equivalences:
Example \thetheorem.
-
1.
For , behavioural equivalence on -coalgebras, i.e. on transition systems, is bisimilarity in the usual sense.
-
2.
For deterministic automata as coalgebras for , two states are behaviourally equivalent iff they accept the same formal language.
-
3.
For a signature functor , two states of a -coalgebra are behaviourally equivalent iff they describe the same -tree.
-
4.
For labelled transition systems as coalgebras for , coalgebraic behavioural equivalence precisely captures Milner’s strong bisimilarity [1].
- 5.
Remark \thetheorem.
-
1.
The notion of behavioural equivalence extends straightforwardly to states in different coalgebras, as one can canonically define the disjoint union of coalgebras.
- 2.
2.2 Coalgebraic Logics
We briefly review basic concepts of coalgebraic modal logic [38, 42]. Coalgebraic modal logics are parametric in a functor determining the type of systems underlying the semantics, and additionally in a choice of modalities interpreted in terms of predicate liftings. For now, we use as a basic example, deferring further examples to section 5.
Syntax
The syntax of coalgebraic modal logic is parametrized over the choice of a signature Λ of modal operators (with assigned arities). Formulae φ, ψ are then generated by the grammar

    φ, ψ ::= ⊤ | φ ∧ ψ | ¬φ | ♡(φ_1, …, φ_n)    (♡/n ∈ Λ).
Example \thetheorem.
For F = 𝒫, one often takes Λ = {◇}; the induced syntax is that of (single-action) Hennessy-Milner logic. As usual, we write □φ := ¬◇¬φ.
Semantics
We interpret formulae as sets of states in F-coalgebras. This interpretation arises by assigning to each modal operator ♡/n ∈ Λ an n-ary predicate lifting ⟦♡⟧ [38, 42], i.e. a family of maps ⟦♡⟧_X : (2^X)^n → 2^{FX}, one for every set X, such that the naturality condition

    ⟦♡⟧_X(f^{-1}[P_1], …, f^{-1}[P_n]) = (Ff)^{-1}[⟦♡⟧_Y(P_1, …, P_n)]    (3)

holds for all f : X → Y and all P_1, …, P_n ∈ 2^Y (for categorically-minded readers, ⟦♡⟧ is a natural transformation (2^{(−)})^n → 2^{F(−)}); the idea being to lift given predicates on states to predicates on structured collections of states. Given these data, the extension of a formula φ in an F-coalgebra (C, c) is a predicate ⟦φ⟧_{(C,c)}, or just ⟦φ⟧, on C, recursively defined by

    ⟦⊤⟧ = C,  ⟦φ ∧ ψ⟧ = ⟦φ⟧ ∩ ⟦ψ⟧,  ⟦¬φ⟧ = C ∖ ⟦φ⟧,  ⟦♡(φ_1, …, φ_n)⟧ = c^{-1}[⟦♡⟧_C(⟦φ_1⟧, …, ⟦φ_n⟧)]

(where we apply set operations to predicates with the evident meaning). We say that a state x ∈ C satisfies φ if x ∈ ⟦φ⟧. Notice how the clause for modalities says that x satisfies ♡(φ_1, …, φ_n) iff c(x) satisfies the predicate obtained by lifting the predicates ⟦φ_i⟧ on C to a predicate on FC according to ⟦♡⟧.
Example \thetheorem.
Over 𝒫, we interpret ◇ by the predicate lifting

    ⟦◇⟧_X(P) = {S ∈ 𝒫X | S ∩ P ≠ ∅}.

The arising notion of satisfaction over 𝒫-coalgebras is precisely the standard one: x ⊨ ◇φ iff y ⊨ φ for some transition x → y.
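For the powerset case just described, the lifting and the recursive semantics can be phrased executably; the following Haskell sketch (ours, with hypothetical names) makes the clause for modalities explicit: a state satisfies ◇φ iff its set of successors satisfies the lifted predicate.

```haskell
import qualified Data.Set as Set

type Coalgebra f x = x -> f x

-- Single-action Hennessy-Milner formulae.
data Formula = Top | And Formula Formula | Neg Formula | Dia Formula

-- Predicate lifting for the diamond over the finite powerset functor:
-- a set of successors satisfies the lifted predicate iff some element does.
liftDia :: (x -> Bool) -> (Set.Set x -> Bool)
liftDia p = any p . Set.toList

-- Recursive semantics: Boolean connectives are interpreted as usual, and a
-- state x satisfies Dia phi iff c(x) lies in the lifted predicate.
sat :: Coalgebra Set.Set x -> Formula -> x -> Bool
sat _ Top       _ = True
sat c (And f g) x = sat c f x && sat c g x
sat c (Neg f)   x = not (sat c f x)
sat c (Dia f)   x = liftDia (sat c f) (c x)
```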
The naturality condition (3) of predicate liftings guarantees invariance of the logic under coalgebra morphisms, and hence under behavioural equivalence:
Proposition 2.1 (Adequacy [38, 42]).
Behaviourally equivalent states satisfy the same formulae: x ∼ y implies that for all formulae φ, we have x ∈ ⟦φ⟧ iff y ∈ ⟦φ⟧.
In our running example , this instantiates to the well-known fact that modal formulae are bisimulation-invariant, that is, bisimilar states in transition systems satisfy the same formulae of Hennessy-Milner logic.
3 Constructing Distinguishing Formulae
A proof method certifying behavioural equivalence x ∼ y of states in a coalgebra is immediate by definition: one simply needs to exhibit a coalgebra morphism h such that h(x) = h(y). In fact, for many system types, it suffices to relate x and y by a coalgebraic bisimulation in a suitable sense (e.g. [1, 40, 24, 34]), generalizing the Park-Milner bisimulation principle [35, 37]. It is less obvious how to certify behavioural inequivalence x ≁ y, showing that such a morphism does not exist. By Proposition 2.1, one option is to exhibit a (coalgebraic) modal formula that is satisfied by x but not by y. In the case of (image-finite) transition systems, such a formula is guaranteed to exist by the Hennessy-Milner theorem, which moreover is known to generalize to coalgebras [39, 42]. More generally, we consider separation of sets of states by formulae, following Cleaveland [13, Def. 2.4]:
Definition 1.
Let (C, c) be an F-coalgebra. A formula φ distinguishes a set X ⊆ C from a set Y ⊆ C if X ⊆ ⟦φ⟧ and Y ∩ ⟦φ⟧ = ∅. In case X = {x} and Y = {y}, we just say that φ distinguishes x from y. We say that φ is a certificate of X if φ distinguishes X from its complement C ∖ X, that is, if ⟦φ⟧ = X.
Note that φ distinguishes X from Y iff ¬φ distinguishes Y from X. Certificates have also been referred to as descriptions [22]. If φ is a certificate of a behavioural equivalence class [x]_∼, then by definition φ distinguishes x from y whenever x ≁ y. To obtain distinguishing formulae for behaviourally inequivalent states in a coalgebra, it thus suffices to construct certificates for all behavioural equivalence classes, and our algorithm does just that. Of course, every certificate must be at least as large as a smallest distinguishing formula. However, already on transition systems, distinguishing formulae and certificates have the same asymptotic worst-case size (cf. section 6).
A natural approach to computing certificates for behavioural equivalence classes is to extend algorithms that compute these equivalence classes. In particular, partition refinement algorithms compute a sequence of consecutively finer partitions (i.e. each partition refines the previous one) on the state space, where every block is a union of behavioural equivalence classes, and the final partition is precisely C/∼. Indeed, Cleaveland’s algorithm for computing certificates on labelled transition systems [13] correspondingly extends Kanellakis and Smolka’s partition refinement algorithm [28, 29], which runs in time O(m n) on systems with n states and m transitions. Our generic algorithm will be based on a more efficient partition refinement algorithm.
3.1 Paige-Tarjan with Certificates
Before we turn to constructing certificates in coalgebraic generality, we informally recall and extend the Paige-Tarjan algorithm [36], which computes the partition modulo bisimilarity of a given transition system with n states and m transitions in time O((m + n) log n). We fix a given finite transition system, viewed as a 𝒫-coalgebra c : C → 𝒫C.
The algorithm computes two sequences and of partitions of (with equivalence relations), where only the most recent partition is held in memory and indexes the iterations of the main loop. Throughout the execution, is finer than (that is, for all ), and the algorithm terminates when . Intuitively, is ‘one transition ahead’ of : if distinguishes states and , then is based on distinguishing transitions to from transitions to .
Initially, consists of only one block and of two blocks: the live states and the deadlocks (i.e. states with no outgoing transitions). If , then there is a block that is the union of at least two blocks in . In such a situation, the algorithm chooses in to have at most half the size of and then splits the block into and in the partition :
This is correct because every state in is already known to be behaviourally inequivalent to every state in . By the definition of bisimilarity, this implies that every block with some transition to may contain behaviourally inequivalent states as illustrated in Figure 3; that is, may need to be split into smaller blocks, as follows:
-
(C1)
states in with successors in but not in (e.g. in Figure 3),
-
(C2)
states in with successors in and (e.g. ), and
-
(C3)
states in with successors but not in (e.g. ).
The partition arises from by splitting all such predecessor blocks of accordingly. If no such is properly split, then , and the algorithm terminates. It is straightforward to construct certificates for the blocks arising during the execution:
-
•
The certificate for the only block C is ⊤, and the blocks for live states and deadlocks have certificates ◇⊤ and ¬◇⊤, respectively.
-
•
In the refinement step, suppose that are certificates of and , respectively, where . For every predecessor block of , the three blocks obtained by splitting are distinguished (see 1) as follows:
(C1) , (C2) , (C3) . (4) Of course these formulae only distinguish the states in from each other (e.g. there may be states in other blocks with transitions to both and ). Hence, given a certificate of , one obtains certificates of the three resulting blocks in via conjunction: , etc.
Upon termination, every bisimilarity class in the transition system is annotated with a certificate. A key step in the generic development will be to come up with a coalgebraic generalization of the formulae for (C1)–(C3).
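The case distinction (C1)–(C3) is easy to phrase executably for transition systems; the following Haskell sketch (ours, with hypothetical names; it assumes, as guaranteed by stability, that every state of the predecessor block has some successor in B) splits a predecessor block T by a subblock S ⊆ B.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

data SplitCase = C1 | C2 | C3 deriving (Eq, Show)

-- Split a predecessor block t of s by the subblock s of b: a state goes to
-- (C1), (C2) or (C3) according to whether its successors meet only s, both
-- s and b \\ s, or only b \\ s. Stability of t w.r.t. b is assumed, so the
-- "fourth case" (successors in neither part of b) does not occur.
splitBlock :: Ord x => (x -> Set x) -> Set x -> Set x -> Set x -> [(SplitCase, Set x)]
splitBlock succs t b s = [ (k, Set.filter ((== k) . classify) t) | k <- [C1, C2, C3] ]
  where
    rest = b `Set.difference` s
    classify x
      | hitsS && hitsRest = C2
      | hitsS             = C1
      | otherwise         = C3
      where hitsS    = not (Set.null (succs x `Set.intersection` s))
            hitsRest = not (Set.null (succs x `Set.intersection` rest))
```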
3.2 Generic Partition Refinement
The Paige-Tarjan algorithm has been adapted to other system types, e.g. weighted systems [44], and it has recently been generalized to coalgebras [46, 20]. A crucial step in this generalization is to rephrase the case distinction (C1)–(C3) in terms of the functor : Given a predecessor block in for , the three cases distinguish between the equivalence classes for , where the map in the composite is defined by
(5) |
Every case is a possible value of : (C1) , (C2) , and (C3) . Since is a predecessor block of , the ‘fourth case’ is not possible. There is a transition from to some state outside of iff . However, because of the previous refinement steps performed by the algorithm, either all or no states of have an edge to (a property called stability [36]), hence no distinction on is necessary.
It is now easy to generalize from transition systems to coalgebras by simply replacing the functor with in the refinement step. We recall the algorithm:
Algorithm 3.1 ([46, Alg. 4.9, (5.1)]).
Given a coalgebra , put
Starting at iteration , repeat the following while :
-
(A1)
Pick and such that and
-
(A2)
-
(A3)
This algorithm formalizes the intuitive steps from subsection 3.1. Again, two sequences of partitions , are constructed, and upon termination. Initially, identifies all states and distinguishes states by only their output behaviour; e.g. for and , the value is if is a deadlock, and if is a live state, and for , the value indicates whether is a final or non-final state.
In the main loop, blocks and witnessing are picked, and is split into and , like in the Paige-Tarjan algorithm. Note that step (A2) is equivalent to directly defining the equivalence relation as
A similar intersection of equivalence relations is performed in step (A3). The intersection splits every block into smaller blocks such that end up in the same block iff , i.e. is replaced by . Again, this corresponds to the distinction of the three cases (C1)–(C3). For example, for , there are cases to be distinguished, and so every is split into at most that many blocks.
The following property of is needed for correctness [46, Ex. 5.11].
Definition 2 ([46]).
A functor F is zippable if the map

    ⟨F(A + !), F(! + B)⟩ : F(A + B) → F(A + 1) × F(1 + B)

is injective for all sets A and B.
Intuitively, is a term in variables from and . If is zippable, then is uniquely determined by the two elements in and obtained by identifying all - and all -variables with , respectively. E.g. is zippable: is uniquely determined by and , and similarly for the three other cases of . In fact, all signature functors as well as and all monoid-valued functors are zippable. Moreover, the class of zippable functors is closed under products, coproducts, and subfunctors but not under composition, e.g. is not zippable [46].
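For instance, for the finite powerset functor the map whose injectivity defines zippability can be written down directly; a set of A+B-elements is recovered from its two images, which is why the finite powerset functor is zippable (a sketch, ours):

```haskell
import qualified Data.Set as Set

-- The map F(A + B) -> F(A + 1) x F(1 + B) for F = finite powerset: the first
-- component collapses all B-elements, the second collapses all A-elements.
zipPow :: (Ord a, Ord b) => Set.Set (Either a b) -> (Set.Set (Either a ()), Set.Set (Either () b))
zipPow s = ( Set.map (either Left  (const (Right ()))) s
           , Set.map (either (const (Left ())) Right)  s )
-- The original set is uniquely determined by the resulting pair of sets.
```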
Remark 3.
To apply the algorithm to coalgebras for composites of zippable functors, e.g. , there is a reduction [46, Section 8] that embeds every -coalgebra into a coalgebra for the zippable functor . This reduction preserves and reflects behavioural equivalence, but introduces an intermediate state for every transition.
3.3 Generic Modal Operators
The extended Paige-Tarjan algorithm (subsection 3.1) constructs a distinguishing formula according to the three cases (C1)–(C3). In the coalgebraic 3.1, these cases correspond to elements of , which determine in which block an element of a predecessor block ends up. Indeed, the elements of will also serve as generic modalities in characteristic formulae for blocks of states, essentially by the known equivalence between -ary predicate liftings and (in this case, singleton) subsets of [42] (termed tests by Klin [30]):
Definition 4.
The signature of -modalities for a functor is
that is, we write for the syntactic representation of a binary modality for every . The interpretation of for is given by the predicate lifting
The intended use of is as follows: Suppose a block is split into subblocks and with certificates and , respectively: and . As in Figure 3, we then split every predecessor block of into smaller parts, each of which is uniquely characterized by the formula for some .
Example 3.3.
For , is equivalent to .
Lemma 3.4.
Given an -coalgebra , , and formulae and such that , we have if and only if .
In the initial partition on a transition system, we used the formulae ◇⊤ and ¬◇⊤ to distinguish live states and deadlocks. In general, we can similarly describe the initial partition using nullary modalities induced by elements of F1:
Notation 3.5.
Define the injective map by . Then the injection provides a way to interpret elements as nullary modalities :
(Alternatively, we could introduce directly as a nullary modality.)
Lemma 3.6.
For , , and , we have if and only if .
3.4 Algorithmic Construction of Certificates
The -modalities introduced above (4) induce an instance of coalgebraic modal logic (subsection 2.2). We refer to coalgebraic modal formulae employing the -modalities as -modal formulae, and write for the set of -modal formulae. As in the extended Paige-Tarjan algorithm (subsection 3.1), we annotate every block arising during the execution of 3.1 with a certificate in the shape of an -modal formula. Annotating blocks with formulae means that we construct maps
As in 3.1, indexes the loop iterations. For blocks in the respective partition, , denote corresponding certificates: we will have
(6) |
We construct and iteratively, using certificates for the blocks at every iteration:
Algorithm 3.7.
Like in subsection 3.1, the only block of has as a certificate. Since the partition distinguishes by the ‘output’ (e.g. final vs. non-final states), the certificate of specifies what is (3.6).
In the -th iteration of the main loop, we have certificates and for in step (A1) satisfying (6) available from the previous iterations. In (A’2), the Boolean connectives describe how is split into and . In (A’3), new certificates are constructed for every predecessor block that is refined. If does not change, then neither does its certificate. Otherwise, the block is split into the blocks for in step (A3), which is reflected by the modality as per 3.4.
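To convey the idea behind 3.7 independently of the efficient data structures, the following naive Haskell sketch (ours; it works level by level on a finite transition system and is not the quasilinear algorithm) refines a partition by one-step signatures and, whenever a block splits, extends the parent block's certificate by modal conjuncts recording which previous certificates are, or are not, reachable.

```haskell
import qualified Data.Set as Set
import Data.List (sortOn, groupBy)
import Data.Function (on)

data Formula = Top | And [Formula] | Neg Formula | Dia Formula deriving Show

-- Naive certificate construction for finite transition systems: blocks are
-- lists of states, each paired with a formula holding exactly on that block.
-- Tree sizes can explode; Algorithm 3.7 avoids this by sharing subformulae.
certificates :: Ord x => (x -> Set.Set x) -> [x] -> [([x], Formula)]
certificates succs states = go [(states, Top)]
  where
    go blocks
      | length refined == length blocks = blocks
      | otherwise                       = go refined
      where
        -- signature of a state: which current blocks its successors hit
        sig x = [ d | (d, _) <- blocks
                    , not (Set.null (succs x `Set.intersection` Set.fromList d)) ]
        refine (b, phi) =
          [ (b', And (phi : [ Dia psi       | (d, psi) <- blocks, d `elem`    hits ]
                          ++ [ Neg (Dia psi) | (d, psi) <- blocks, d `notElem` hits ]))
          | b' <- groupOn sig b, let hits = sig (head b') ]
        refined = concatMap refine blocks
    groupOn f = map (map snd) . groupBy ((==) `on` fst) . sortOn fst . map (\x -> (f x, x))
```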
Remark 5.
In step (A’2), can be simplified to be no larger than . To see this, note that for , , and , every conjunct of is also a conjunct of . In , one can hence remove all conjuncts of from , obtaining a formula , and then equivalently use in the definition of .
Theorem 3.8.
Corollary 3.9 (Hennessy-Milner).
For zippable , states in a finite -coalgebra are behaviourally equivalent iff they agree on all -modal formulae.
Remark 6.
A smaller formula distinguishing a state from a state can be extracted from the certificates in time . It is the leftmost conjunct that is different in the respective certificates of and . This is the subformula starting at the modal operator introduced in for the least with ; hence, satisfies but satisfies for some in .
3.5 Complexity Analysis
The operations introduced by 3.7 can be implemented with only constant run time overhead. To this end, one implements the certificate maps as arrays of formulae of length n (note that at any point, there are at most n-many blocks). In the refinable-partition data structure [45], every block has an index (a natural number) and there is an array of length n mapping every state to the block it is contained in. Hence, for both partitions, one can look up a state’s block and a block’s certificate in constant time.
It is very likely that the certificates contain a particular subformula multiple times and that certificates of different blocks share common subformulae. For example, every certificate of a block refined in the -th iteration using contains the subformulae and . Therefore, it is advantageous to represent all certificates constructed as one directed acyclic graph (dag) with nodes labelled by modal operators and conjunction and having precisely two outgoing edges. Moreover, edges have a binary flag indicating whether they represent negation . Initially, there is only one node representing , and the operations of 3.7 allocate new nodes and update the arrays for and to point to the right nodes. For example, if the predecessor block is refined in step (A’3), yielding a new block , then a new node labelled is allocated with edges to the nodes and to another new node labelled with edges to the nodes and .
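A possible shape for such a dag representation (our sketch; all constructor and field names are made up) stores each node once and lets the block-certificate arrays point into it, so that shared subformulae are allocated only once.

```haskell
import qualified Data.Map.Strict as Map

-- An edge carries a negation flag and the index of its target node.
type Edge = (Bool, Int)
type Tag  = Int   -- opaque stand-in for an element of F3 naming a modality

-- Dag nodes: the constant "true", binary conjunction, and a binary modality.
data Node = TopN | AndN Edge Edge | ModN Tag Edge Edge

data Dag = Dag { nextId :: Int, nodes :: Map.Map Int Node }

emptyDag :: Dag
emptyDag = Dag 1 (Map.singleton 0 TopN)   -- node 0 is the initial certificate "true"

-- Allocating a node is cheap (an array in an actual implementation);
-- certificates of blocks are just node indices, never copied formulae.
alloc :: Node -> Dag -> (Int, Dag)
alloc n (Dag i m) = (i, Dag (i + 1) (Map.insert i n m))

-- Certificate for a block split off in the refinement step: conjoin the old
-- block's certificate with a fresh modality node over the two given certificates.
refineCert :: Tag -> Int -> Int -> Int -> Dag -> (Int, Dag)
refineCert t certOld delta beta dag =
  let (modId, dag')  = alloc (ModN t (False, delta) (False, beta)) dag
      (andId, dag'') = alloc (AndN (False, certOld) (False, modId)) dag'
  in (andId, dag'')
```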
For purposes of estimating the size of formulae generated by the algorithm, we use a notion of transition in coalgebras, inspired by the notion of canonical graph [26].
Definition 3.10.
For states in an -coalgebra , we say that there is a transition if is not in the image , where is the inclusion map.
Theorem 3.11.
For a coalgebra with states and transitions, the formula dag constructed by 3.7 has size and height at most .
Theorem 3.12.
For a tighter run time analysis of the underlying partition refinement algorithm, one additionally requires that F is equipped with a refinement interface [46, Def. 6.4], which is based on a given encoding of F-coalgebras in terms of edges between states (encodings serve only as data structures and have no direct semantic meaning, in particular do not entail a semantic reduction to relational structures). This notion of edge yields the same numbers (in O-notation) as 3.10 for all functors considered. All zippable functors we consider here have refinement interfaces [46, 15]. In presence of a refinement interface, step (A3) can be implemented efficiently, with resulting overall run time O((m + n) · log n · p(c)), where n is the number of states, m is the number of edges in the encoding of the input coalgebra c, and the run-time factor p(c) is associated with the refinement interface. In most instances, e.g. for the powerset and monoid-valued functors, one has p(c) = 1; in particular, the generic algorithm recovers the run time of the Paige-Tarjan algorithm.
Remark 7.
The claimed run time relies on close attention to a number of implementation details. This includes use of an efficient data structure for the partition [31, 45]; the other partition is only represented implicitly in terms of a queue of blocks witnessing , requiring additional care when splitting blocks in the queue [44, Fig. 3]. Moreover, grouping the elements of a block by involves the consideration of a possible majority candidate [44].
Theorem 3.13.
On a coalgebra with n states and m transitions for a zippable set functor with a refinement interface with factor p(c), 3.7 runs in time O((m + n) · log n · p(c)).
4 Cancellative Functors
Our use of binary modalities relates to the fact that, as observed already by Paige and Tarjan, when splitting a block according to an existing partition of a block into and , it is not in general sufficient to look only at the successors in . However, this does suffice for some transition types; e.g. Hopcroft’s algorithm for deterministic automata [27] and Valmari and Franceschinis’ algorithm for weighted systems (e.g. Markov chains) [44] both split only with respect to . In the following, we exhibit a criterion on the level of functors that captures that splitting w.r.t. only is sufficient:
Definition 8.
A functor is cancellative if the map
is injective.
To understand the role of the above map, recall the function from (5) and note that and , so the composite yields information about the accumulated transition weights into and but not about the one into ; the injectivity condition means that for cancellative functors, this information suffices in the splitting step for . The term cancellative stems from the respective property on monoids; recall that a monoid is cancellative if s + u = t + u implies s = t for all monoid elements s, t, u.
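The role of cancellativity can be seen concretely for real-valued weights (a sketch, ours): since (ℝ, +, 0) is cancellative, the accumulated weight into B ∖ S is already determined by the weights into B and into S, so the splitting step never needs to inspect B ∖ S separately, whereas for the powerset functor ("Boolean weights") having successors in B and in S does not determine whether there are successors in B ∖ S.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Weights x = Map.Map x Double   -- finitely supported map into (Double, +, 0)

-- Accumulated transition weight from a state (given by its weight map) into a block.
weightInto :: Ord x => Weights x -> Set.Set x -> Double
weightInto w block = sum [v | (y, v) <- Map.toList w, y `Set.member` block]

-- For a cancellative monoid:  weightInto w (b \\ s) == weightInto w b - weightInto w s,
-- so grouping states by their weights into b and into s suffices in the splitting step.
```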
Proposition 4.1.
The monoid-valued functor M^(−) for a commutative monoid M is cancellative if and only if M is a cancellative monoid.
Hence, is cancellative, but is not. Moreover, all signature functors are cancellative:
Proposition 4.2.
The class of cancellative functors contains all constant functors as well as the identity functor, and it is closed under subfunctors, products, and coproducts.
For example, is cancellative, but is not because of its subfunctor .
Remark 9.
Cancellative functors are neither closed under quotients nor under composition. Zippability and cancellativity are independent properties. Zippability in conjunction with cancellativity implies k-zippability for all k, the k-ary variant [32] of zippability.
Theorem 4.3.
If is a cancellative functor, in 3.7 can be replaced with . Then, the algorithm still correctly computes certificates in the given -coalgebra .
Note that in this optimized algorithm, the computation of can be omitted because it is not used anymore. Hence, the resulting formulae only involve , , and modalities from the set (with the second parameter fixed to ). These modalities are equivalently unary modalities induced by elements of , which we term -modalities; hence, the corresponding Hennessy-Milner Theorem (3.9) adapts to for cancellative functors, as follows:
Corollary 4.4.
For zippable and cancellative , states in an -coalgebra are behaviourally equivalent iff they agree on modal formulae built using , , and unary -modalities.
5 Domain-Specific Certificates
On a given specific system type, one is typically interested in certificates and distinguishing formulae expressed via modalities whose use is established in the respective domain, e.g. and for transition systems. We next describe how the generic modalities can be rewritten to domain-specific ones in a postprocessing step. The domain-specific modalities will not in general be equivalent to -modalities, but still yield certificates.
Definition 10.
The Boolean closure of a modal signature has as -ary modalities propositional combinations of atoms of the form , for , where are propositional combinations of elements of . Such a modality is interpreted by predicate liftings defined inductively in the obvious way.
For example, the Boolean closure of contains the unary modality .
Definition 11.
Given a modal signature for a functor , a domain-specific interpretation consists of functions and assigning to each a nullary modality and to each a binary modality such that the predicate liftings and satisfy
(Recall that is the characteristic function of , and denotes the equivalence class of w.r.t. .)
Thus, holds precisely at states with output behaviour . Intuitively, describes the refinement step of a predecessor block when splitting into and (Figure 3), which translates into the arguments and of . In the refinement step, we know from previous iterations that all elements have the same behaviour w.r.t. . This is reflected in the intersection with . The axiom guarantees that characterizes uniquely, but only within the equivalence class representing a predecessor block. Thus, can be much smaller than equivalents of (cf. 3.3):
Example 5.1.
-
1.
For , we have a domain-specific interpretation over the modal signature . For , take and . For , we put
The certificates obtained via this translation are precisely the ones generated in the example using the Paige-Tarjan algorithm, cf. (4), with in lieu of .
-
2.
For a signature (functor) , take . We interpret by the predicate liftings
Intuitively, states that the th successor satisfies iff . We then have a domain-specific interpretation given by for and for and .
-
3.
For a monoid-valued functor , take , interpreted by the predicate liftings given by
A formula thus states that the accumulated weight of the successors satisfying is exactly . A domain-specific interpretation is then given by for and for . In case is cancellative, we can also simply put .
-
4.
For labelled Markov chains, i.e. , let , where denotes that on input , the next state will satisfy with probability at least , as in cited work by Desharnais et al. [17]. This gives rise to the interpretation:
Given a domain-specific interpretation for a modal signature for the functor , we can postprocess certificates produced by 3.7 by replacing the modalities for according to the translation recursively defined by the following clauses for modalities and by commutation with propositional operators:
Note that one can replace with for the optimized from 5; the latter conjunction has essentially the same size as .
Proposition 5.2.
For every certificate of a behavioural equivalence class of a given coalgebra produced by either 3.7 or its optimization (Theorem 4.3), is also a certificate for that class.
Thus, the domain-specific modal signatures also inherit a Hennessy-Milner Theorem.
Example 5.3.
For labelled Markov chains () and the interpretation via the modalities (5.1.4), this yields certificates (thus in particular distinguishing formulae) in run time , with the same bound on formula size. Desharnais et al. describe an algorithm [17, Fig. 4] that computes distinguishing formulae in the negation-free fragment of the same logic (they note also that this fragment does not suffice for certificates). They do not provide a run-time analysis, but the nested loop structure indicates that the asymptotic complexity is roughly .
6 Worst Case Tree Size of Certificates
In the complexity analysis (subsection 3.5), we have seen that certificates – and thus also distinguishing formulae – have dag size O((m + n) log n) on input coalgebras with n states and m transitions. However, when formulae are written in the usual linear way, multiple occurrences of the same subformula lead to an exponential blow-up of the formula size in this sense, which for emphasis we refer to as the tree size.
Figueira and Gorín [22] show that exponential tree size is inevitable even for distinguishing formulae. The proof is based on winning strategies in bisimulation games, a technique that is also applied in other results on lower bounds on formula size [23, 3, 4].
Open Problem 6.1.
Do states in -coalgebras generally have certificates of subexponential tree size in the number of states? If yes, can small certificates be computed efficiently?
We note that for another cancellative functor, the answer is well-known: On deterministic automata, i.e. coalgebras for , the standard minimization algorithm constructs distinguishing words of linear length.
Remark 12.
Cleaveland [13, p. 368] also mentions that minimal distinguishing formulae may be exponential in size, however for a slightly different notion of minimality: a formula distinguishing from is minimal if no obtained by replacing a non-trivial subformula of with the formula distinguishes from . This is weaker than demanding that the formula size of is as small as possible. For example, in the transition system
for ,
the formula distinguishes from and is minimal in the above sense. However, can in fact be distinguished from in size , by the formula .
7 Conclusions and Further Work
We have presented a generic algorithm that computes distinguishing formulae for behaviourally inequivalent states in state-based systems of various types, cast as coalgebras for a functor capturing the system type. Our algorithm is based on coalgebraic partition refinement [46], and like that algorithm runs in time O((m + n) · log n · p(c)), with a functor-specific factor p(c) that is 1 in many cases of interest. Independently of this factor, the distinguishing formulae constructed by the algorithm have dag size O((m + n) log n); they live in a dedicated instance of coalgebraic modal logic [39, 42], with binary modalities extracted from the type functor in a systematic way. We have shown that for cancellative functors, the construction of formulae and, more importantly, the logic can be simplified, requiring only unary modalities and conjunction. We have also discussed how distinguishing formulae can be translated into a more familiar domain-specific syntax (e.g. Hennessy-Milner logic for transition systems).
There is an open source implementation of the underlying partition refinement algorithm [15], which may serve as a basis for a future implementation.
In partition refinement, blocks are successively refined in a top-down manner, and this is reflected by the use of conjunction in distinguishing formulae. Alternatively, bisimilarity may be computed bottom-up, as in a recent partition aggregation algorithm [11]. It is an interesting point for future investigation whether this algorithm can be extended to compute distinguishing formulae, which would likely be of a rather different shape than those computed via partition refinement.
References
- [1] Peter Aczel and Nax Mendler. A final coalgebra theorem. In Proc. Category Theory and Computer Science (CTCS), volume 389 of LNCS, pages 357–365. Springer, 1989.
- [2] Jiří Adámek, Stefan Milius, and Lawrence S. Moss. Initial algebras, terminal coalgebras, and the theory of fixed points of functors. draft book, available online at https://www8.cs.fau.de/ext/milius/publications/files/CoalgebraBook.pdf, 2021.
- [3] Micah Adler and Neil Immerman. An n! lower bound on formula size. In LICS 2001, pages 197–206. IEEE Computer Society, 2001.
- [4] Micah Adler and Neil Immerman. An n! lower bound on formula size. ACM Trans. Comput. Log., 4(3):296–314, 2003.
- [5] Abel Armas-Cervantes, Paolo Baldan, Marlon Dumas, and Luciano García-Bañuelos. Behavioral comparison of process models based on canonically reduced event structures. In Business Process Management, pages 267–282. Springer, 2014.
- [6] Abel Armas-Cervantes, Luciano García-Bañuelos, and Marlon Dumas. Event structures as a foundation for process model differencing, part 1: Acyclic processes. In Web Services and Formal Methods, pages 69–86. Springer, 2013.
- [7] Falk Bartels, Ana Sokolova, and Erik de Vink. A hierarchy of probabilistic system types. Theoret. Comput. Sci., 327:3–22, 2004.
- [8] Marco Bernardo. TwoTowers 5.1 user manual, 2004.
- [9] Marco Bernardo, Rance Cleaveland, Steve Sims, and W. Stewart. TwoTowers: A tool integrating functional and performance analysis of concurrent systems. In Formal Description Techniques and Protocol Specification, Testing and Verification, FORTE / PSTV 1998, volume 135 of IFIP Conference Proceedings, pages 457–467. Kluwer, 1998.
- [10] Marco Bernardo and Marino Miculan. Constructive logical characterizations of bisimilarity for reactive probabilistic systems. Theoretical Computer Science, 764:80 – 99, 2019. Selected papers of ICTCS 2016.
- [11] Johanna Björklund and Loek Cleophas. Aggregation-based minimization of finite state automata. Acta Informatica, 2020.
- [12] Ufuk Celikkan and Rance Cleaveland. Generating diagnostic information for behavioral preorders. Distributed Computing, 9(2):61–75, 1995.
- [13] Rance Cleaveland. On automatically explaining bisimulation inequivalence. In Computer-Aided Verification, pages 364–372. Springer, 1991.
- [14] Sjoerd Cranen, Bas Luttik, and Tim A. C. Willemse. Evidence for Fixpoint Logic. In 24th EACSL Annual Conference on Computer Science Logic (CSL 2015), volume 41 of LIPIcs, pages 78–93. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2015.
- [15] Hans-Peter Deifel, Stefan Milius, Lutz Schröder, and Thorsten Wißmann. Generic partition refinement and weighted tree automata. In Formal Methods – The Next 30 Years, Proc. 3rd World Congress on Formal Methods (FM 2019), volume 11800 of LNCS, pages 280–297. Springer, 10 2019.
- [16] J. Desharnais, A. Edalat, and P. Panangaden. A logical characterization of bisimulation for labeled Markov processes. In Proceedings. Thirteenth Annual IEEE Symposium on Logic in Computer Science (Cat. No.98CB36226), pages 478–487, 1998.
- [17] Josée Desharnais, Abbas Edalat, and Prakash Panangaden. Bisimulation for labelled Markov processes. Information and Computation, 179(2):163–193, 2002.
- [18] Remco Dijkman. Diagnosing differences between business process models. In Business Process Management, pages 261–277, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg.
- [19] Ernst-Erich Doberkat. Stochastic Coalgebraic Logic. Springer, 2009.
- [20] Ulrich Dorsch, Stefan Milius, Lutz Schröder, and Thorsten Wißmann. Efficient coalgebraic partition refinement. In Proc. 28th International Conference on Concurrency Theory (CONCUR 2017), LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.
- [21] Ulrich Dorsch, Stefan Milius, Lutz Schröder, and Thorsten Wißmann. Predicate liftings and functor presentations in coalgebraic expression languages. In Coalgebraic Methods in Computer Science, CMCS 2018, volume 11202 of LNCS, pages 56–77. Springer, 2018.
- [22] Santiago Figueira and Daniel Gorín. On the size of shortest modal descriptions. In Advances in Modal Logic 8, papers from the eighth conference on "Advances in Modal Logic," held in Moscow, Russia, 24-27 August 2010, pages 120–139. College Publications, 2010.
- [23] Tim French, Wiebe van der Hoek, Petar Iliev, and Barteld Kooi. On the succinctness of some modal logics. Artificial Intelligence, 197:56 – 85, 2013.
- [24] Daniel Gorín and Lutz Schröder. Simulations and bisimulations for coalgebraic modal logics. In Algebra and Coalgebra in Computer Science - 5th International Conference, CALCO 2013, volume 8089 of LNCS, pages 253–266. Springer, 2013.
- [25] H. Peter Gumm and Tobias Schröder. Monoid-labeled transition systems. In Coalgebraic Methods in Computer Science, CMCS 2001, volume 44(1) of ENTCS, pages 185–204. Elsevier, 2001.
- [26] H. Peter Gumm. From T-coalgebras to filter structures and transition systems. In Algebra and Coalgebra in Computer Science, volume 3629 of LNCS, pages 194–212. Springer, 2005.
- [27] John Hopcroft. An algorithm for minimizing states in a finite automaton. In Theory of Machines and Computations, pages 189–196. Academic Press, 1971.
- [28] Paris C. Kanellakis and Scott A. Smolka. CCS expressions, finite state processes, and three problems of equivalence. In Proceedings of the Second Annual ACM Symposium on Principles of Distributed Computing, PODC ’83, pages 228–240. ACM, 1983.
- [29] Paris C. Kanellakis and Scott A. Smolka. CCS expressions, finite state processes, and three problems of equivalence. Inf. Comput., 86(1):43–68, 1990.
- [30] Bartek Klin. The least fibred lifting and the expressivity of coalgebraic modal logic. In Algebra and Coalgebra in Computer Science, CALCO 2005, volume 3629 of LNCS, pages 247–262. Springer, 2005.
- [31] Timo Knuutila. Re-describing an algorithm by Hopcroft. Theor. Comput. Sci., 250:333 – 363, 2001.
- [32] Barbara König, Christina Mika-Michalski, and Lutz Schröder. Explaining non-bisimilarity in a coalgebraic approach: Games and distinguishing formulas. In Coalgebraic Methods in Computer Science, pages 133–154. Springer, 2020.
- [33] Kim Guldstrand Larsen and Arne Skou. Bisimulation through probabilistic testing. Inform. Comput., 94(1):1–28, 1991.
- [34] Johannes Marti and Yde Venema. Lax extensions of coalgebra functors and their logic. J. Comput. Syst. Sci., 81(5):880–900, 2015.
- [35] R. Milner. Communication and Concurrency. International series in computer science. Prentice-Hall, 1989.
- [36] Robert Paige and Robert E. Tarjan. Three partition refinement algorithms. SIAM J. Comput., 16(6):973–989, 1987.
- [37] D. Park. Concurrency and automata on infinite sequences. In Proceedings of 5th GI-Conference on Theoretical Computer Science, volume 104 of LNCS, pages 167–183, 1981.
- [38] Dirk Pattinson. Coalgebraic modal logic: soundness, completeness and decidability of local consequence. Theoretical Computer Science, 309(1):177 – 193, 2003.
- [39] Dirk Pattinson. Expressive logics for coalgebras via terminal sequence induction. Notre Dame J. Formal Log., 45(1):19–33, 2004.
- [40] Jan Rutten. Universal coalgebra: a theory of systems. Theor. Comput. Sci., 249:3–80, 2000.
- [41] Jan Rutten and Erik de Vink. Bisimulation for probabilistic transition systems: a coalgebraic approach. Theoret. Comput. Sci., 221:271–293, 1999.
- [42] Lutz Schröder. Expressivity of coalgebraic modal logic: The limits and beyond. Theor. Comput. Sci., 390(2-3):230–247, 2008.
- [43] Věra Trnková. On a descriptive classification of set functors I. Commentationes Mathematicae Universitatis Carolinae, 12(1):143–174, 1971.
- [44] Antti Valmari and Giuliana Franceschinis. Simple O(m log n) time Markov chain lumping. In Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2010, volume 6015 of LNCS, pages 38–52. Springer, 2010.
- [45] Antti Valmari and Petri Lehtinen. Efficient minimization of DFAs with partial transition functions. In Theoretical Aspects of Computer Science, STACS 2008, volume 1 of LIPIcs, pages 645–656. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Germany, 2008.
- [46] Thorsten Wißmann, Ulrich Dorsch, Stefan Milius, and Lutz Schröder. Efficient and Modular Coalgebraic Partition Refinement. Logical Methods in Computer Science, Volume 16, Issue 1, January 2020.
Appendix A: Omitted Proofs
Details for section 2 (Preliminaries)
Details for subsection 2.1.
Given a pair of -coalgebras and , we have a canonical -coalgebra structure on the disjoint union of their carriers:
The canonical inclusion maps and are -coalgebra morphisms. We say that states and are behaviourally equivalent if .
Note that this definition extends the original definition of , in the sense that in the same coalgebra are behaviourally equivalent () iff in the canonical coalgebra on .
Details on Predicate Liftings in subsection 2.2.
The naturality of in for means that for every map , the diagram
commutes. Since is contravariant, the map is sent to which takes inverse images; writing down the commutativity element-wise yields (3). By the Yoneda lemma, one can define predicate liftings
Lemma A.1.
A predicate lifting for is uniquely defined by a map . Then is given by
or written as sets (considering ):
Proof A.2.
The following mathematical objects are in one-to-one correspondence
The first correspondence is the Yoneda lemma and the second correspondence is a power law. On the right, the inhabitants of the sets are listed when starting with . By the definition of we have:
Details for section 3 (Constructing Distinguishing Formulae)
Verification of 4.
Proof of 3.4.
This follows directly from 4 for and , using that :
Proof of 3.6.
Note that for , we have where .
(3.5) | ||||
(3.4, ) | ||||
() | ||||
( injective) |
In the last step we use that, w.l.o.g., preserves injective maps (subsection 2.1). ∎
Proof of Theorem 3.8.
- 1.
- 2.
Details for 6.
In order to verify that the first differing conjunct is a distinguishing formula, we perform a case distinction on the least with :
If and are already split by , then the conjunct at index in the respective certificates of and differs, and we have and . By 3.6, distinguishes from (and distinguishes from ).
If and are split by (but ) in the th iteration, then
Thus, the conjuncts that differs in the respective certificates for and are the following conjuncts at index :
By 3.4, distinguishes from (and distinguishes from ).
Proof of Theorem 3.11.
Before proving Theorem 3.11, we need to establish a sequence of lemmas on the underlying partition refinement algorithm. We assume w.l.o.g. that preserves finite intersections, that is, pullbacks of pairs of injective maps. In fact, the functor mentioned in subsection 2.1, which coincides with on all nonempty sets and maps and therefore has the same coalgebras, preserves finite intersections.
Let be a coalgebra for . As additional notation, we define for all sets and :
In other words, we write if there is a transition from (some state of) to (some state of) . Also we define the set of predecessor states of a set as:
Lemma A.3.
For every -coalgebra , , and with finite, we have
Proof A.4.
For every , we have that . Hence, for every , there exists such that
The set is the intersection of all sets with :
Since preserves finite intersections and is finite, we have that
Since is contained in every (as witnessed by ) it is also contained in their intersection. That is, for being the inclusion map, there is with . Now consider the following diagrams:
and |
Both triangles commute because and . Thus, we conclude
Lemma A.5.
For all and in 3.1, we have
Proof A.6.
Lemma A.9.
For and finite in the th iteration of 3.1,
Proof A.10.
Let be used for splitting in iteration . By contraposition, A.7 implies that if and , then (the unique) with satisfies and therefore has a transition to . By the finiteness of , the block is split into finitely many blocks , representing the equivalence classes for . By A.3 we know that if has no transition to , then . Moreover, all elements of are sent to the same value by (A.5). Hence, there is at most one block with no transition to , and all other blocks , , have a transition to . Therefore the number of blocks is bounded above as follows: . Summing over all predecessor blocks we obtain:
(A.7) | ||||
(bound on above) | ||||
( are disjoint) |
This completes the proof.
Lemma A.11.
Throughout the execution of 3.1 for an input coalgebra with states and transitions, we have
Remark.
Proof A.12.
Because holds in step (A1) of 3.1, one can show that every state is contained in the set in at most iterations [46, Lem. 7.15]. More formally, let be the blocks picked in the th iteration of 3.1. Then we have
(9) |
Let the algorithm terminate after iterations returning . Then, the number of new blocks introduced by step (A3) is bounded as follows (note that after the third step, is a side condition enforcing that we have a summand provided that lies in , whereas before we sum over all ):
(A.9) | ||||
by (9) | ||||
The only blocks we have not counted so far are the blocks of . Since , we have at most different blocks in .
We are now ready to prove the main theorem on the dag size of formulae created by 3.7.
Proof A.13 (Proof of Theorem 3.11).
Regarding the height of the dag, it is immediate that and have a height of at most . Since for all , there are at most iterations, with the final partition being .
In 3.7 we create a new modal operator formula whenever 3.1 creates a new block in . By A.11, the number of modalities in the dag is thus bounded by
In every iteration of the main loop, is extended by two new formulae, one for and one for . The formula does not increase the size of the dag, because no new node needs to be allocated. For , we need to allocate one new node for the conjunction, so there are at most new such nodes allocated throughout the execution of the whole algorithm. Even if the optimization in 5 is applied, the additional run time can be neglected under the -notation.
Proof of Theorem 3.12.
We implement every operation of 3.7 in constant time. The arrays for and are re-used in every iteration. Hence the index is entirely neglected and only serves as an indicator for whether we refer to a value before or after the loop iteration. We proceed by case distinction as follows:
-
1.
Initialization step:
-
•
The only block in has index 0, and so we make point to the node .
-
•
For every block in , 3.1 has computed for some (in fact every) . Since canonically embeds into (3.5), we create a new node labelled with two edges to .
For every , this runs in constant time and can be performed whenever the original 3.1 creates a new such block .
-
•
-
2.
In the refinement step, we can look up the certificates resp. for resp. in constant time using the indices of the blocks and . Whenever the original algorithm creates a new block, we also immediately construct the certificate of this new block by creating at most two new nodes in the dag (with at most four outgoing edges). However, if a block does not change (that is, or , resp.), then the corresponding certificate is not changed either in steps (A’2) and (A’3), respectively.
In the loop body we update the certificates as follows:
(A'2) The new block just points to the certificate constructed earlier. For the new block , we allocate a new node , with one edge to and one negated edge to . (See also details for 5 on the run time for computing the optimized negation.)
(A'3) Not all resulting blocks have a transition to . There may be (at most) one new block , with no transition to (see the proof of A.9). In the refinable partition structure, such a block will inherit the index from (i.e. the index of in equals the index of in ). Moreover, every fulfils (by A.3), and for every (by A.5).
Now, one first saves the node of the certificate in some variable , say. Then the array is updated at index by the formula
Consequently, any block inheriting the index of automatically has the correct certificate.
The allocation of nodes for this formula is completely analogous to the one for an ordinary block having edges to : One allocates a new node labelled with edges to the saved node (the original value of ) and to another newly allocated node labelled with edges to the nodes and . ∎
Details for 5.
In order to keep the formula size smaller, one can implement the optimization of 5, but one has to take care not to increase the run time. To this end, mark every modal operator node in the formula dag with a boolean flag expressing whether is a conjunct of some -formula.
Thus, every new modal operator in (A’3) is flagged ‘false’ initially. When splitting the block in into and in step (A’2), the formula for block is a conjunction of and the negation of all ‘false’-marked conjuncts of . Afterwards these conjuncts are all marked ‘true’, because they are inherited by . The ‘false’-marked conjuncts always form a prefix of all conjuncts of a formula in . It therefore suffices to greedily take conjuncts from the root of a formula graph while they are marked ‘false’.
As a consequence, step (A'3) does not run in constant time but instead takes as many steps as there are 'false'-marked conjuncts in . However, over the whole execution of the algorithm this cost amortizes, because every newly allocated modal operator is initially marked 'false' and later marked 'true' precisely once.
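As an illustration of the amortization argument, the greedy collection of 'false'-marked conjuncts could look as follows; this is a sketch under assumed data structures (conjuncts are chained from the root, and each modal-operator node carries the boolean flag described above):

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Box:
    """A modal-operator node; 'used' is the boolean flag described above."""
    arg: "Formula"
    used: bool = False

@dataclass
class Conj:
    """A conjunction node; 'newest' is the most recently added conjunct."""
    rest: "Formula"
    newest: Box

Formula = Union[Box, Conj]

def collect_false_conjuncts(phi: Formula) -> list[Box]:
    """Greedily take conjuncts from the root while they are still flagged 'false'
    and flip them to 'true'. Each Box is flipped at most once over the whole run,
    so the extra cost amortizes to O(1) per allocated modal operator."""
    taken = []
    while isinstance(phi, Conj) and not phi.newest.used:
        phi.newest.used = True
        taken.append(phi.newest)
        phi = phi.rest
    return taken
```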
Proof of Theorem 3.13.
The overall run time is immediate, because the underlying 3.1 has run time and 3.7 preserves this run time by Theorem 3.12.
Details for section 4 (Cancellative Functors)
Proof of 4.1.
For , let with
which is written point-wise as follows:
Hence, , and moreover
Since is cancellative, we have , which proves that . Thus, the map is injective.
For , let with . Define by
Thus,
Since is injective, we see that holds. Thus, we have , which proves that is cancellative.∎
Proof of 4.2.
1. For the constant functor with value , is the identity map on for every set . Therefore is cancellative.
2. The identity functor is cancellative because the map is clearly injective.
3. Let be a natural transformation with injective components and let be cancellative. Combining the naturality squares of for and , we obtain the commutative square:
Every composition of injective maps is injective, and so by standard cancellation laws for injective maps, is injective as well, showing that the subfunctor is cancellative.
4. Let be a family of cancellative functors, and suppose that we have elements with
Write for the th projection function from the product. For every we have:
Since every is cancellative, we have for every . This implies since the product projections are jointly injective.
5. Again, let be a family of cancellative functors. Suppose that we have elements satisfying
This implies that there exists an and with , , and
Since is cancellative, we have as desired. ∎
Details for 9.
| Operation | cancellative | non-cancellative |
|---|---|---|
| Quotient | | |
| Composition | | |
| | cancellative | non-cancellative |
|---|---|---|
| zippable | | |
| non-zippable | see (10) | |
1. Cancellative functors are not closed under quotients: e.g. the non-cancellative functor is a quotient of the signature functor (which is cancellative by 4.2).
2. Cancellative functors are not closed under composition. For the additive monoid of natural numbers, the monoid-valued functor sends to the set of finite multisets on ('bags'). Since is cancellative, is a cancellative functor. However, is not:
Here, we use to denote multisets, so but .
3.
4.
5. The functor is neither zippable [46, Ex. 5.10] nor cancellative because
6. Every functor satisfying and is cancellative but not zippable:
• Indeed, every map with domain is injective, in particular the map
whence is cancellative.
• If and , we have that the map
is not injective, whence is not zippable.
An example of such a functor is given by (10), which sends a map to the map defined by
7. For the proof of
recall from König et al. [32] that a functor is -zippable if the canonical map
is injective. Formally, is given by
where is the set and the map is defined by
First, we show that for a zippable and cancellative functor , the map
is injective for all sets . Indeed, we have the following chain of injective maps, where the index at the is only notation to distinguish coproduct components more easily:
The steps of this chain are injective because is zippable (used twice) and because is cancellative. Call this composition . The injective map factors through , because it matches with on the components and , and for the other components, one has the map
with . Since is injective, must be injective, too.
Also note that, equivalently, a functor is cancellative iff the map
is injective, for and and .
We now proceed with the proof of the desired implication by induction on . In the base cases and there is nothing to show because every functor is - and -zippable, and for , the implication is trivial (zippability and -zippability are identical properties). In the inductive step, given that is -zippable, -zippable (), and cancellative, we show that is -zippable.
We have the following chain of injective maps, where we again annotate some of the singleton sets with indices to indicate from which coproduct components they come:
Both steps of this chain are injective: the first because is -zippable, and the second by the injective helper map above. Thus this composition is injective as well, and in fact the composition is precisely , showing that is -zippable. ∎
The optimization present in the algorithms for Markov chains [44] and automata [27] can now be adapted to coalgebras for cancellative functors, where it suffices to split only according to transitions into , neglecting transitions into . More formally, this means that we replace the three-valued with in the refinement step (A3):
Proposition A.14.
Let be a cancellative set functor. For in the -th iteration of 3.1, we have
Proof of A.14.
From the definition (1) of the kernel, we immediately obtain the following properties for all maps , :
(11), (12), and (13). For every coalgebra and we have
Since is cancellative, is injective, and we thus obtain
(14) By (12), this implies that
(15) Let be the block that is split into and in iteration . Since is finer than and , we have ; thus:
(16) Now we verify the desired property:
a chain of equalities justified, step by step, by (A3), by (15), by definition, by (13), and by (16). This completes the proof. ∎
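For a concrete instance of this optimization, consider systems weighted in the cancellative monoid (ℕ, +); the following Python sketch and its names are our own choice of example, not the authors' implementation. Within a predecessor block all states agree on their total weight into the compound block C (by stability of the previous partition), so keying the split on the weight into the splitter S alone produces the same partition as keying it on the pair of weights into S and into C∖S:

```python
from collections import defaultdict

def weight_into(w, x, targets):
    """Total transition weight from state x into the set 'targets';
    w maps (source, target) pairs to weights in the monoid (N, +)."""
    return sum(w.get((x, y), 0) for y in targets)

def split(block, w, C, S, optimized):
    """Split 'block' w.r.t. the splitter S, a subset of C.  Unoptimized: group by the
    pair of weights into S and into C\\S (the three-valued information).  Optimized:
    group by the weight into S only.  If all states of 'block' agree on their weight
    into C and the monoid is cancellative, both keys induce the same grouping."""
    groups = defaultdict(list)
    for x in block:
        key = (weight_into(w, x, S) if optimized
               else (weight_into(w, x, S), weight_into(w, x, C - S)))
        groups[key].append(x)
    return sorted(map(sorted, groups.values()))

# Tiny check: states 1..3 all have total weight 2 into C = {"a", "b"}.
w = {(1, "a"): 2, (2, "a"): 1, (2, "b"): 1, (3, "b"): 2}
C, S = {"a", "b"}, {"a"}
assert split([1, 2, 3], w, C, S, optimized=True) == split([1, 2, 3], w, C, S, optimized=False)
```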
Example A.15.
Observe that, in the optimized step (A3), is no longer mentioned. It is therefore unsurprising that we do not need a certificate for it when constructing certificates for the blocks of . Instead, we can reflect the map in the coalgebraic modal formula and take as the (unary) modal operators. Just like in 3.5, the set canonically embeds into :
Proof of Theorem 4.3.
Before proving Theorem 4.3, we define a new set of (unary) modalities (A.16), establish a lemma about its semantics (A.17), state the optimized algorithm in full (A.19), and then show its correctness (Theorem A.21).
Notation A.16.
Define the injective map by and . Then the injection provides a way to interpret elements as unary modalities :
Remark to A.16.
There are several different ways to define for , depending on the definition of the inclusion .
All these variants make the following A.17 true because in any case:
(17) Analogously to 3.4 we can show:
Lemma A.17.
Given a cancellative functor , an -coalgebra , , a formula , and , we have if and only if .
In 3.7, the family is only used in the definition of to characterize the larger block that has been split into the smaller blocks and . For a cancellative functor, we can replace
in the definition of . Hence, we can omit from 3.7 altogether, obtaining the following algorithm, which is again based on coalgebraic partition refinement (3.1).
Proof A.18 (Proof of A.17).
Since we put with and , we have for all .
a chain of equivalences justified, step by step, by (A.16), by 3.4, and by the injectivity of . In the last step, we use that preserves injective maps (subsection 2.1).
Algorithm A.19.
The certificates thus computed are reduced to roughly half the size compared to 3.7; the asymptotic run time and formula size (subsection 3.5) remain unchanged. More importantly:
Remark 13.
The certificates constructed by A.19 do not contain negation (or disjunction); they are built from , conjunction , and unary modal operators for (the nullary operators for embed into ).
Proof A.20 (Details on 13).
Define the injective map by . Hence, we can also embed the nullary into :
This is compatible with the notations established so far because we have for the inclusions defined in 3.5 and A.16. Thus, we obtain the same modal operator regardless of whether we embed first into and from there into (, A.16) or directly into (, 3.5):
Theorem A.21.
For cancellative functors, A.19 is correct; that is, for all we have:
Note that the optimized A.19 can also be implemented by directly constructing certificates for the unary modal operators . That is, one can treat the modal operators as first class citizens, in lieu of embedding them into the set as we did in A.16. The only difference between the two implementation approaches w.r.t. the size of the formula dag is one edge per modality, namely the edge to the node from the node , which arises when step 21 is expanded according to A.16.
Proof A.22 (Proof of Theorem A.21).
We prove the desired correctness by induction over , the index of loop iterations.
The definition of is identical to the definition in 3.7 whence
which is proved completely analogously to the proof of Theorem 3.8.
Details for section 5 (Domain-Specific Certificates)
Details for 10.
For every set , define the set as terms over the grammar
(18) There is an obvious way to evaluate boolean combinations of predicates using the maps
defined inductively as follows:
Given a signature of modal operators and corresponding predicate liftings , we can combine all of them. To this end, write for the corresponding signature functor (cf. subsection 2.1.3); we define a family of maps as follows:
Since every is natural in , so is . We can replace with the signature
where , has the arity . Observe that is functorial; in fact, it is the (free or term) monad for the signature functor associated to the grammar in (18). Thus is a functor, too. Applying the Yoneda Lemma to this functor, we have for every the (natural) family of maps :
Hence, we obtain a predicate lifting for by defining:
It is a composition of natural transformations and so is itself natural in .
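To illustrate how boolean combinations over a set of atomic predicates, as in the grammar (18), can be evaluated, here is a small Python sketch; the constructor names and the function `evaluate` are our own choices and are not fixed by the paper:

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class Atom:
    pred: object            # an element of the given set of predicates

@dataclass
class Top:
    pass

@dataclass
class Neg:
    arg: "Term"

@dataclass
class And:
    left: "Term"
    right: "Term"

@dataclass
class Or:
    left: "Term"
    right: "Term"

Term = Union[Atom, Top, Neg, And, Or]

def evaluate(t: Term, truth: Callable[[object], bool]) -> bool:
    """Evaluate a boolean combination, given a valuation 'truth' of the atomic predicates."""
    if isinstance(t, Atom):
        return truth(t.pred)
    if isinstance(t, Top):
        return True
    if isinstance(t, Neg):
        return not evaluate(t.arg, truth)
    if isinstance(t, And):
        return evaluate(t.left, truth) and evaluate(t.right, truth)
    if isinstance(t, Or):
        return evaluate(t.left, truth) or evaluate(t.right, truth)
    raise TypeError(t)
```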
Definition 14.
Given a modal signature for a functor , a simple domain-specific interpretation consists of functions and assigning a nullary modality to each and a unary modality to each such that the predicate liftings and satisfy
Proposition A.23.
Let be a modal signature for a cancellative functor , and a simple domain-specific interpretation. Define by . Then is a domain-specific interpretation.
Proof A.24.
We verify that is a domain-specific interpretation (11) by showing that, for every , defining
satisfies
In the following, we put . By the naturality of the predicate lifting of , the following square commutes (recall that is contravariant):
(19) We thus have:
a chain of equalities justified by the definitions involved and by (19). For every , we have
a chain of equivalences justified by the above calculation, reflexivity, the definitions involved, the assumption on the interpretation, and cancellativity of . Note that is injective because is cancellative.
Details for 5.1.
(a)
(b) 5.1.2: For the verification for signature functors, define a helper map by . The predicate lifting for the (unary) modal operator , for , is obtained from A.1 by the predicate corresponding to the set
This gives rise to the predicate lifting
(A.1) (def. ) Similarly, for the nullary modal operator (for the -ary operation symbol ), take given by the set
(noting that ). This gives rise to the predicate lifting
(A.1) (def. ) For the verification of the (simple) domain-specific interpretation (14), we put
with then induces the claimed via A.23:
There is nothing to show for since it has the correct semantics by the definition of . Note that is injective because for every the operation symbol and all its parameters (from ) are uniquely determined by and . For , , we have
Thus, we compute
a chain of equalities justified by the definitions involved and by the injectivity of .
(c) 5.1.3: For every , define the map
which gives rise to the predicate lifting of the unary modal operator :
(A.1) (def. ) For the verification of the axioms of the domain-specific interpretation (11), we have that satisfies the axiom:
() For the other component of the domain-specific interpretation, we proceed by case distinction:
• If is non-cancellative, we have for and thus we have for every :
a chain of equalities justified by the definitions involved.
(d)
Proof of 5.2.
Lemma A.25.
Let be a domain-specific interpretation for . For all and we have:
Proof A.26.
Proof A.27 (Proof of 5.2).
We prove by induction over the index of main loop iterations that and are certificates for and , respectively. (In the cancellative case, and are not defined; so just put , for convenience.)
(a) For , we trivially have
Furthermore, unravelling 3.5,
Consequently,
using . The naturality of , , implies that . Hence,
(b) In the inductive step, there is nothing to show for because it is only a boolean combination of and . For , we distinguish two cases: whether the class is refined or not. If , then
and we are done. Now suppose that in the -th iteration with chosen . By (A'3) or step 21, respectively, we have:
where is or ; in any case . Note that here is either (3.7) or (A.19). Put in the first case and else. Using , we see that
where the last equation follows from the inductive hypothesis. Thus, we have
and therefore
Moreover, we have
in the first case by item (A3), in the second case by A.14, recalling that .
We are now prepared for our final computation:
a chain of equivalences justified by the semantics of , the inductive hypothesis, and the domain-specific interpretation (A.25).
This completes the proof.
Details for 5.3.
3.7 runs in , producing certificates of a total size of . When translating these certificates for the modalities via the translation , we obtain certificates for the input coalgebra (5.2). However, the formula size blows up by an additional factor because of the big conjunctions in the domain-specific interpretation (5.1.4).
This is still a better run time than that of the algorithm by Desharnais et al. [17, Fig. 4], which nests multiple loops: four loops over all blocks seen so far and one loop over , roughly leading to a total run time in .
Details for section 6 (Worst Case Tree Size of Certificates)
Details for 12.
To verify the minimality of , one considers all possible replacements of subformulae of by :
All of these hold at both and , because can perform arbitrarily many transitions and can perform transitions.
We note additionally that even the optimized algorithm for cancellative functors (cf. A.19) constructs certificates of exponential worst-case tree size:
Example A.28.
Define the -coalgebra on by
The optimized A.19 constructs a certificate of size in the -th layer. In this example, however, linear-sized certificates do exist for all states, e.g.
Details for A.28.
Define the -coalgebra on the carrier
We put
For the analysis of the size of the formulae generated, consider the subcoalgebra on .
The initial partition distinguishes states by their total out-degree (1, 2, 3, 4, or 6). Suppose that after iterations of the main loop of the algorithm, the states have just been found to be behaviourally different and all states of are still identified. Then the algorithm has to use one of the blocks , , , as a splitter for further refinement. Assume w.l.o.g. that is used as the splitter first. This will have the effect that will be refined into the blocks
Assume that the formula for is at this point (we omit the index, since the singleton block cannot be refined further). The definition of in the algorithm annotates the block with and the block with .
Splitting by does not lead to further refinement. However, when splitting by (or equivalently ), is split into and and likewise into and . Let be the certificate constructed for . This implies that the formulas for and are respectively extended by the conjunct ; likewise, the formulas for and obtain a new conjunct . Hence, for every the tree-size of the formula constructed is at least:
Thus the tree-size of the certificate constructed for cancellative functors may grow exponentially with the state count.
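The growth can also be illustrated numerically. The following Python sketch is a generic illustration with our own names, not the exact formulas of A.28: each round adds a conjunct mentioning the previous certificate, so the tree-size doubles per round while the dag size (the quantity bounded in Theorem 3.11) grows only by a constant:

```python
class N:
    """A formula node with a label and (possibly shared) subformula edges."""
    def __init__(self, label, *children):
        self.label, self.children = label, children

def tree_size(phi):
    """Size when shared subformulae are counted once per occurrence."""
    return 1 + sum(tree_size(c) for c in phi.children)

def dag_size(phi, seen=None):
    """Size when shared subformulae are counted only once."""
    seen = set() if seen is None else seen
    if id(phi) in seen:
        return 0
    seen.add(id(phi))
    return 1 + sum(dag_size(c, seen) for c in phi.children)

phi = N("true")
for _ in range(10):
    # each round conjoins a modality applied to the previous certificate;
    # the previous certificate is reused as a shared node, not copied
    phi = N("and", phi, N("dia", phi))

print(tree_size(phi), dag_size(phi))    # 3070 vs 21: exponential vs linear growth
```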
Despite the exponential tree-size of the formulas constructed, there exist linearly sized certificates for all states in the above coalgebra . First, we have
This lets us define certificates for and :
For the remaining two state sequences and we first note
and thus have certificates
Since involves modal operators, every state in has a certificate with at most modal operators.