Serge Haddad and Daniele Varacca \EventNoEds2 \EventLongTitle32nd International Conference on Concurrency Theory (CONCUR 2021) \EventShortTitleCONCUR 2021 \EventAcronymCONCUR \EventYear2021 \EventDateAugust 23–27, 2021 \EventLocationVirtual Conference \EventLogo \SeriesVolume203 \ArticleNo32 stix@largesymbols"0E stix@largesymbols"0F Radboud University, Nijmegen, The Netherlands [email protected]://orcid.org/0000-0001-8993-6486Work forms part of the NWO TOP project 612.001.852 and the DFG-funded project COAX (MI 717/5-2) Friedrich-Alexander-Universität Erlangen-Nürnberg, [email protected]://orcid.org/0000-0002-2021-1644 Work forms part of the DFG-funded project CoMoC (MI 717/7-1) Friedrich-Alexander-Universität Erlangen-Nürnberg, [email protected]://orcid.org/0000-0002-3146-5906Work forms part of the DFG-funded project CoMoC (SCHR 1118/15-1) \CopyrightT. Wißmann, S. Milius, and L. Schröder \relatedversion \relatedversiondetailsFull Version with Appendixhttps://arxiv.org/abs/2105.00669 {CCSXML} <ccs2012> <concept> <concept_id>10003752.10003790</concept_id> <concept_desc>Theory of computation Logic</concept_desc> <concept_significance>500</concept_significance> </concept> </ccs2012> \ccsdesc[500]Theory of computation Logic {CCSXML} <ccs2012> <concept> <concept_id>10003752.10010124.10010138.10010143</concept_id> <concept_desc>Theory of computation Program analysis</concept_desc> <concept_significance>500</concept_significance> </concept> </ccs2012> \ccsdesc[500]Theory of computation Program analysis
Explaining Behavioural Inequivalence Generically
in Quasilinear Time
Abstract
We provide a generic algorithm for constructing formulae that distinguish behaviourally inequivalent states in systems of various transition types such as nondeterministic, probabilistic or weighted; genericity over the transition type is achieved by working with coalgebras for a set functor in the paradigm of universal coalgebra. For every behavioural equivalence class in a given system, we construct a formula which holds precisely at the states in that class. The algorithm instantiates to deterministic finite automata, transition systems, labelled Markov chains, and systems of many other types. The ambient logic is a modal logic featuring modalities that are generically extracted from the functor; these modalities can be systematically translated into custom sets of modalities in a postprocessing step. The new algorithm builds on an existing coalgebraic partition refinement algorithm. It runs in time on systems with states and transitions, and the same asymptotic bound applies to the dag size of the formulae it constructs. This improves the bounds on run time and formula size compared to previous algorithms even for previously known specific instances, viz. transition systems and Markov chains; in particular, the best previous bound for transition systems was .
keywords:
bisimulation, partition refinement, modal logic, distinguishing formulae, coalgebra1 Introduction
For finite transition systems, the Hennessy-Milner theorem guarantees that two states are bisimilar if and only if they satisfy the same modal formulae. This implies that whenever two states are not bisimilar, then one can find a modal formula that holds at one of the states but not at the other. Such a formula explains the difference of the two states’ behaviour and is thus usually called a distinguishing formula [13]. For example, in the transition system in Figure 2, the formula distinguishes the states and because satisfies whereas does not. Given two states in a finite transition system with states and transitions, the algorithm by Cleaveland [13] computes a distinguishing formula in time . The algorithm builds on the Kanellakis-Smolka partition refinement algorithm [28, 29], which computes the bisimilarity relation on a transition system within the same time bound.
Similar logical characterizations of bisimulation exist for other system types. For instance, Desharnais et al. [16, 17] characterize probabilistic bisimulation on (labelled) Markov chains, in the sense of Larsen and Skou [33] (for each label, every state has either no successors or a probability distribution on successors). In their logic, a formula holds at states that have a transition probability of at least to states satisfying . For example, the state in Figure 2 satisfies but does not. Desharnais et al. provide an algorithm that computes distinguishing formulae for labelled Markov chains in run time (roughly) .
In the present work, we construct such counterexamples generically for a variety of system types. We achieve genericity over the system type by modelling state-based systems as coalgebras for a set functor in the framework of universal coalgebra [40]. Examples of coalgebras for a set functor include transition systems, deterministic automata, or weighted systems (e.g. Markov chains). Universal coalgebra provides a generic notion of behavioural equivalence that instantiates to standard notions for concrete system types, e.g. bisimilarity (transtion systems), language equivalence (deterministic automata), or probabilistic bisimilarity (Markov chains). Moreover, coalgebras come equipped with a generic notion of modal logic that is parametric in a choice of modalities whose semantics is constructed so as to guarantee invariance w.r.t. behavioural equivalence; under easily checked conditions, such a coalgebraic modal logic in fact characterizes behavioural equivalence in the same sense as Hennessy-Milner logic characterizes bisimilarity [39, 42]. Hence, as soon as suitable modal operators are found, coalgebraic modal formulae serve as distinguishing formulae.
In a nutshell, the contribution of the present paper is an algorithm that computes distinguishing formulae for behaviourally inequivalent states in quasilinear time, and in fact certificates that uniquely describe behavioural equivalence classes in a system, in coalgebraic generality. We build on an existing efficient coalgebraic partition refinement algorithm [46], thus achieving run time on coalgebras with states and transitions (in a suitable encoding). The dag size of formulae is also (for tree size, exponential lower bounds are known [22]); even for labelled transition systems, we thus improve the previous best bound [13] for both run time and formula size. We systematically extract the requisite modalities from the functor at hand, requiring binary and nullary modalities in the general case, and then give a systematic method to translate these generic modal operators into more customary ones (such as the standard operators of Hennessy-Milner logic).
We subsequently identify a notion of cancellative functor that allows for additional optimization. E.g. functors modelling weighted systems are cancellative if and only if the weights come from a cancellative monoid, such as , or as used in probabilistic systems. For cancellative functors, much simpler distinguishing formulae can be constructed: the binary modalities can be replaced by unary ones, and only conjunction is needed in the propositional base. On labelled Markov chains, this complements the result that a logic with only conjunction and different unary modalities (mentioned above) suffices for the construction of distinguishing formulae (but not certificates) [17] (see also [19]).
Related Work
Cleaveland’s algorithm [13] for labelled transition systems is is based on Kanellakis and Smolka’s partition refinement algorithm [29]. The coalgebraic partition refinement algorithm we employ [46] is instead related to the more efficient Paige-Tarjan algorithm [36]. König et al. [32] extract formulae from winning strategies in a bisimulation game in coalgebraic generality; their algorithm runs in and does not support negative transition weights. Characteristic formulae for behavioural equivalence classes taken across all models require the use of fixpoint logics [21]. The mentioned algorithm by Desharnais et al. for distinguishing formulae on labelled Markov processes [17, Fig. 4] is based on Cleaveland’s. No complexity analysis is made but the algorithm has four nested loops, so its run time is roughly . Bernardo and Miculan [10] provide a similar algorithm for a logic with only disjunction. There are further generalizations along other axes, e.g. to behavioural preorders [12]. The TwoTowers tool set for the analysis of stochastic process algebras [9, 8] computes distinguishing formulae for inequivalent processes, using variants of Cleaveland’s algorithm. Some approaches construct alternative forms of certificates for inequivalence, such as Cranen et al.’s notion of evidence [14] or methods employed on business process models, based on model differences and event structures [18, 6, 5].
2 Preliminaries
We first recall some basic notation. We denote by , , and the sets representing the natural numbers , , and . For every set , there is a unique map . We write for the set of functions , so e.g. . In particular, is the set of -valued predicates on , which is in bijection with the powerset of , i.e. the set of all subsets of ; in this bijection, a subset corresponds to its characteristic function , given by if , and otherwise. We generally indicate injective maps by . Given maps , , we write for the map given by . We denote the disjoint union of sets , by , with canonical inclusion maps and . More generally, we write for the disjoint union of an -indexed family of sets , and for the -th inclusion map. For a map (not necessarily surjective), we denote by the kernel of , i.e. the equivalence relation
(1) |
Notation \thetheorem (Partitions).
Given an equivalence relation on , we write for the equivalence class of . If is the kernel of a map , we simply write in lieu of . The intersection of equivalence relations is again an equivalence relation. The partition corresponding to is denoted by . Note that is a surjective map and that .
A signature is a set , whose elements are called operation symbols, equipped with a function assigning to each operation symbol its arity. We write for with . We will apply the same terminology and notation to collections of modal operators.
2.1 Coalgebra
Universal coalgebra [40] provides a generic framework for the modelling and analysis of state-based systems. Its key abstraction is to parametrize notions and results over the transition type of systems, encapsulated as an endofunctor on a given base category. Instances cover, for example, deterministic automata, labelled (weighted) transition systems, and Markov chains.
Definition \thetheorem.
A set functor assigns to every set a set and to every map a map such that identity maps and composition are preserved: and . An -coalgebra is a pair consisting of a set (the carrier) and a map (the structure). When is clear from the context, we simply speak of a coalgebra.
In a coalgebra , we understand the carrier set as consisting of states, and the structure as assigning to each state a structured collection of successor states, with the structure of collections determined by . In this way, the notion of coalgebra subsumes numerous types of state-based systems, as illustrated next.
Example \thetheorem.
-
1.
The powerset functor sends a set to its powerset and a map to the map taking direct images. A -coalgebra is precisely a transition system: It assigns to every state a set of successor states, inducing a transition relation given by iff . Similarly, coalgebras for the finite powerset functor (with being the set of finite subsets of ) are finitely branching transition systems.
-
2.
Coalgebras for the functor , where is a fixed input alphabet, are deterministic automata (without an explicit initial state). Indeed, a coalgebra structure consists of a finality predicate and a transition map in curried form .
-
3.
Every signature defines a signature functor that maps a set to the set
whose elements we may understand as flat -terms with variables from . The action of on maps is then given by . For simplicity, we write (instead of ) for the coproduct injections, and in lieu of for the signature functor. States in -coalgebras describe possibly infinite -trees.
-
4.
For a commutative monoid , the monoid-valued functor [25] is given by
(2) on sets ; for a map , the map is defined by
A coalgebra is a finitely branching weighted transition system, where is the transition weight from to . For the Boolean monoid , we recover . Coalgebras for , with understood as the additive monoid of the reals, are -weighted transition systems. The functor
which assigns to a set the set of all finite probability distributions on (represented as finitely supported probability mass functions), is a subfunctor of .
-
5.
Functors can be composed; for instance, given a set of labels, the composite of and the functor (whose action on sets maps a set to the set ) is the functor , whose coalgebras are -labelled transition systems. Coalgebras for have been termed probabilistic transition systems [33] or labelled Markov chains [17], and coalgebras for are partial labelled Markov chains [17]. Coalgebras for are variously known as simple Segala systems or Markov decision processes.
We have a canonical notion of behaviour on -coalgebras:
Definition \thetheorem.
An -coalgebra morphism is a map such that . States in an -coalgebra are behaviourally equivalent () if there exists a coalgebra morphism such that .
Thus, we effectively define the behaviour of a state as those of its properties that are preserved by coalgebra morphisms. The notion of behavioural equivalence subsumes standard branching-time equivalences:
Example \thetheorem.
-
1.
For , behavioural equivalence on -coalgebras, i.e. on transition systems, is bisimilarity in the usual sense.
-
2.
For deterministic automata as coalgebras for , two states are behaviourally equivalent iff they accept the same formal language.
-
3.
For a signature functor , two states of a -coalgebra are behaviourally equivalent iff they describe the same -tree.
-
4.
For labelled transition systems as coalgebras for , coalgebraic behavioural equivalence precisely captures Milner’s strong bisimilarity [1].
- 5.
Remark \thetheorem.
-
1.
The notion of behavioural equivalence extends straightforwardly to states in different coalgebras, as one can canonically define the disjoint union of coalgebras.
- 2.
2.2 Coalgebraic Logics
We briefly review basic concepts of coalgebraic modal logic [38, 42]. Coalgebraic modal logics are parametric in a functor determining the type of systems underlying the semantics, and additionally in a choice of modalities interpreted in terms of predicate liftings. For now, we use as a basic example, deferring further examples to section 5.
Syntax
The syntax of coalgebraic modal logic is
parametrized over the choice of signature of modal
operators (with assigned arities). Then, formulae are
generated by the grammar
Example \thetheorem.
For , one often takes ; the induced syntax is that of (single-action) Hennessy-Milner logic. As usual, we write .
Semantics
We interpret formulae as sets of states in -coalgebras. This interpretation arises by assigning to each modal operator an -ary predicate lifting [38, 42], i.e. a family of maps , one for every set , such that the naturality condition
(3) |
for all and all (for categorically-minded readers, is a natural transformation ); the idea being to lift given predicates on states to predicates on structured collections of states. Given these data, the extension of a formula in an -coalgebra is a predicate , or just , on , recursively defined by
(where we apply set operations to predicates with the evident meaning). We say that a state satisfies if . Notice how the clause for modalities says that satisfies iff satisfies the predicate obtained by lifting the predicates on to a predicate on according to .
Example \thetheorem.
Over , we interpret by the predicate lifting
The arising notion of satisfaction over -coalgebras is precisely the standard one: iff for some transition .
The naturality condition (3) of predicate liftings guarantees invariance of the logic under coalgebra morphisms, and hence under behavioural equivalence:
Proposition 2.1 (Adequacy [38, 42]).
Behaviourally equivalent states satisfy the same formulae: implies that for all formulae , we have iff .
In our running example , this instantiates to the well-known fact that modal formulae are bisimulation-invariant, that is, bisimilar states in transition systems satisfy the same formulae of Hennessy-Milner logic.
3 Constructing Distinguishing Formulae
A proof method certifying behavioural equivalence of states in a coalgebra is immediate by definition: One simply needs to exhibit a coalgebra morphism such that . In fact, for many system types, it suffices to relate and by a coalgebraic bisimulation in a suitable sense (e.g. [1, 40, 24, 34]), generalizing the Park-Milner bisimulation principle [35, 37]. It is less obvious how to certify behavioural inequivalence , showing that such a morphism does not exist. By 2.1, one option is to exhibit a (coalgebraic) modal formula that is satisfied by but not by . In the case of (image-finite) transition systems, such a formula is guaranteed to exist by the Hennessy-Milner theorem, which moreover is known to generalize to coalgebras [39, 42]. More generally, we consider separation of sets of states by formulae, following Cleaveland [13, Def. 2.4]:
Definition 1.
Let be an -coalgebra. A formula distinguishes a set from a set if and . In case and , we just say that distinguishes from . We say that is a certificate of if distinguishes from , that is if .
Note that distinguishes from iff distinguishes from . Certificates have also been referred to as descriptions [22]. If is a certificate of a behavioural equivalence class , then by definition distinguishes from whenever . To obtain distinguishing formulae for behaviourally inequivalent states in a coalgebra, it thus suffices to construct certificates for all behavioural equivalence classes, and our algorithm does just that. Of course, every certificate must be at least as large as a smallest distinguishing formula. However, already on transition systems, distinguishing formulae and certificates have the same asymptotic worst-case size (cf. section 6).
A natural approach to computing certificates for behavioural equivalence classes is to extend algorithms that compute these equivalence classes. In particular, partition refinement algorithms compute a sequence of consecutively finer partitions (i.e. ) on the state space, where every block is a union of behavioural equivalence classes, and the final partition is precisely . Indeed, Cleaveland’s algorithm for computing certificates on labelled transition systems [13] correspondingly extends Kanellakis and Smolka’s partition refinement algorithm [28, 29], which runs in on systems with states and transitions. Our generic algorithm will be based on a more efficient partition refinement algorithm.
3.1 Paige-Tarjan with Certificates
Before we turn to constructing certificates in coalgebraic generality, we informally recall and extend the Paige-Tarjan algorithm [36], which computes the partition modulo bisimilarity of a given transition system with states and transitions in time . We fix a given finite transition system, viewed as a -coalgebra .
The algorithm computes two sequences and of partitions of (with equivalence relations), where only the most recent partition is held in memory and indexes the iterations of the main loop. Throughout the execution, is finer than (that is, for all ), and the algorithm terminates when . Intuitively, is ‘one transition ahead’ of : if distinguishes states and , then is based on distinguishing transitions to from transitions to .
Initially, consists of only one block and of two blocks: the live states and the deadlocks (i.e. states with no outgoing transitions). If , then there is a block that is the union of at least two blocks in . In such a situation, the algorithm chooses in to have at most half the size of and then splits the block into and in the partition :
This is correct because every state in is already known to be behaviourally inequivalent to every state in . By the definition of bisimilarity, this implies that every block with some transition to may contain behaviourally inequivalent states as illustrated in Figure 3; that is, may need to be split into smaller blocks, as follows:
-
(C1)
states in with successors in but not in (e.g. in Figure 3),
-
(C2)
states in with successors in and (e.g. ), and
-
(C3)
states in with successors but not in (e.g. ).
The partition arises from by splitting all such predecessor blocks of accordingly. If no such is properly split, then , and the algorithm terminates. It is straightforward to construct certificates for the blocks arising during the execution:
-
•
The certificate for the only block is , and the blocks for live states and deadlocks in have certificates and , respectively.
-
•
In the refinement step, suppose that are certificates of and , respectively, where . For every predecessor block of , the three blocks obtained by splitting are distinguished (see 1) as follows:
(C1) , (C2) , (C3) . (4) Of course these formulae only distinguish the states in from each other (e.g. there may be states in other blocks with transitions to both and ). Hence, given a certificate of , one obtains certificates of the three resulting blocks in via conjunction: , etc.
Upon termination, every bisimilarity class in the transition system is annotated with a certificate. A key step in the generic development will be to come up with a coalgebraic generalization of the formulae for (C1)–(C3).
3.2 Generic Partition Refinement
The Paige-Tarjan algorithm has been adapted to other system types, e.g. weighted systems [44], and it has recently been generalized to coalgebras [46, 20]. A crucial step in this generalization is to rephrase the case distinction (C1)–(C3) in terms of the functor : Given a predecessor block in for , the three cases distinguish between the equivalence classes for , where the map in the composite is defined by
(5) |
Every case is a possible value of : (C1) , (C2) , and (C3) . Since is a predecessor block of , the ‘fourth case’ is not possible. There is a transition from to some state outside of iff . However, because of the previous refinement steps performed by the algorithm, either all or no states states of have an edge to (a property called stability [36]), hence no distinction on is necessary.
It is now easy to generalize from transition systems to coalgebras by simply replacing the functor with in the refinement step. We recall the algorithm:
Algorithm 3.1 ([46, Alg. 4.9, (5.1)]).
Given a coalgebra , put
Starting at iteration , repeat the following while :
-
(A1)
Pick and such that and
-
(A2)
-
(A3)
This algorithm formalizes the intuitive steps from subsection 3.1. Again, two sequences of partitions , are constructed, and upon termination. Initially, identifies all states and distinguishes states by only their output behaviour; e.g. for and , the value is if is a deadlock, and if is a live state, and for , the value indicates whether is a final or non-final state.
In the main loop, blocks and witnessing are picked, and is split into and , like in the Paige-Tarjan algorithm. Note that step (A2) is equivalent to directly defining the equivalence relation as
A similar intersection of equivalence relations is performed in step (A3). The intersection splits every block into smaller blocks such that end up in the same block iff , i.e. is replaced by . Again, this corresponds to the distinction of the three cases (C1)–(C3). For example, for , there are cases to be distinguished, and so every is split into at most that many blocks.
The following property of is needed for correctness [46, Ex. 5.11].
Definition 2 ([46]).
A functor is zippable if map
is injective for all sets .
Intuitively, is a term in variables from and . If is zippable, then is uniquely determined by the two elements in and obtained by identifying all - and all -variables with , respectively. E.g. is zippable: is uniquely determined by and , and similarly for the three other cases of . In fact, all signature functors as well as and all monoid-valued functors are zippable. Moreover, the class of zippable functors is closed under products, coproducts, and subfunctors but not under composition, e.g. is not zippable [46].
Remark 3.
To apply the algorithm to coalgebras for composites of zippable functors, e.g. , there is a reduction [46, Section 8] that embeds every -coalgebra into a coalgebra for the zippable functor . This reduction preserves and reflects behavioural equivalence, but introduces an intermediate state for every transition.
3.3 Generic Modal Operators
The extended Paige-Tarjan algorithm (subsection 3.1) constructs a distinguishing formula according to the three cases (C1)–(C3). In the coalgebraic 3.1, these cases correspond to elements of , which determine in which block an element of a predecessor block ends up. Indeed, the elements of will also serve as generic modalities in characteristic formulae for blocks of states, essentially by the known equivalence between -ary predicate liftings and (in this case, singleton) subsets of [42] (termed tests by Klin [30]):
Definition 4.
The signature of -modalities for a functor is
that is, we write for the syntactic representation of a binary modality for every . The interpretation of for is given by the predicate lifting
The intended use of is as follows: Suppose a block is split into subblocks and with certificates and , respectively: and . As in Figure 3, we then split every predecessor block of into smaller parts, each of which is uniquely characterized by the formula for some .
Example 3.3.
For , is equivalent to .
Lemma 3.4.
Given an -coalgebra , , and formulae and such that , we have if and only if .
In the initial partition on a transition system , we used the formulae and to distinguish live states and deadlocks. In general, we can similarly describe the initial partition using modalities induced by elements of :
Notation 3.5.
Define the injective map by . Then the injection provides a way to interpret elements as nullary modalities :
(Alternatively, we could introduce directly as a nullary modality.)
Lemma 3.6.
For , , and , we have if and only if .
3.4 Algorithmic Construction of Certificates
The -modalities introduced above (4) induce an instance of coalgebraic modal logic (subsection 2.2). We refer to coalgebraic modal formulae employing the -modalities as -modal formulae, and write for the set of -modal formulae. As in the extended Paige-Tarjan algorithm (subsection 3.1), we annotate every block arising during the execution of 3.1 with a certificate in the shape of an -modal formula. Annotating blocks with formulae means that we construct maps
As in 3.1, indexes the loop iterations. For blocks in the respective partition, , denote corresponding certificates: we will have
(6) |
We construct and iteratively, using certificates for the blocks at every iteration:
Algorithm 3.7.
Like in subsection 3.1, the only block of has as a certificate. Since the partition distinguishes by the ‘output’ (e.g. final vs. non-final states), the certificate of specifies what is (3.6).
In the -th iteration of the main loop, we have certificates and for in step (A1) satisfying (6) available from the previous iterations. In (A’2), the Boolean connectives describe how is split into and . In (A’3), new certificates are constructed for every predecessor block that is refined. If does not change, then neither does its certificate. Otherwise, the block is split into the blocks for in step (A3), which is reflected by the modality as per 3.4.
Remark 5.
In step (A’2), can be simplified to be no larger than . To see this, note that for , , and , every conjunct of is also a conjunct of . In , one can hence remove all conjuncts of from , obtaining a formula , and then equivalently use in the definition of .
Theorem 3.8.
Corollary 3.9 (Hennessy-Milner).
For zippable , states in a finite -coalgebra are behaviourally equivalent iff they agree on all -modal formulae.
Remark 6.
A smaller formula distinguishing a state from a state can be extracted from the certificates in time . It is the leftmost conjunct that is different in the respective certificates of and . This is the subformula starting at the modal operator introduced in for the least with ; hence, satisfies but satisfies for some in .
3.5 Complexity Analysis
The operations introduced by 3.7 can be implemented with only constant run time overhead. To this end, one implements and as arrays of formulae of length (note that at any point, there are at most -many blocks). In the refinable-partition data structure [45], every block has an index (a natural number) and there is an array of length mapping every state to the block it is contained in. Hence, for both partitions and , one can look up a state’s block and a block’s certificate in constant time.
It is very likely that the certificates contain a particular subformula multiple times and that certificates of different blocks share common subformulae. For example, every certificate of a block refined in the -th iteration using contains the subformulas and . Therefore, it is advantageous to represent all certificates constructed as one directed acyclic graph (dag) with nodes labelled by modal operators and conjunction and having precisely two outgoing edges. Moreover, edges have a binary flag indicating whether they represent negation . Initially, there is only one node representing , and the operations of 3.7 allocate new nodes and update the arrays for and to point to the right nodes. For example, if the predecessor block is refined in step (A’3), yielding a new block , then a new node labelled is allocated with edges to the nodes and to another new node labelled with edges to the nodes and .
For purposes of estimating the size of formulae generated by the algorithm, we use a notion of transition in coalgebras, inspired by the notion of canonical graph [26].
Definition 3.10.
For states in an -coalgebra , we say that there is a transition if is not in the image , where is the inclusion map.
Theorem 3.11.
For a coalgebra with states and transitions, the formula dag constructed by 3.7 has size and height at most .
Theorem 3.12.
For a tighter run time analysis of the underlying partition refinement algorithm, one additionally requires that is equipped with a refinement interface [46, Def. 6.4], which is based on a given encoding of -coalgebras in terms of edges between states (encodings serve only as data structures and have no direct semantic meaning, in particular do not entail a semantic reduction to relational structures). This notion of edge yields the same numbers (in -notation) as 3.10 for all functors considered. All zippable functors we consider here have refinement interfaces [46, 15]. In presence of a refinement interface, step (A3) can be implemented efficiently, with resulting overall run time where , is the number of edges in the encoding of the input coalgebra , and the run-time factor is associated with the refinement interface. In most instances, e.g. for , , one has ; in particular, the generic algorithm recovers the run time of the Paige-Tarjan algorithm.
Remark 7.
The claimed run time relies on close attention to a number of implementation details. This includes use of an efficient data structure for the partition [31, 45]; the other partition is only represented implicitly in terms of a queue of blocks witnessing , requiring additional care when splitting blocks in the queue [44, Fig. 3]. Moreover, grouping the elements of a block by involves the consideration of a possible majority candidate [44].
Theorem 3.13.
On a coalgebra with states and transitions for a zippable set functor with a refinement interface with factor , 3.7 runs in time .
4 Cancellative Functors
Our use of binary modalities relates to the fact that, as observed already by Paige and Tarjan, when splitting a block according to an existing partition of a block into and , it is not in general sufficient to look only at the successors in . However, this does suffice for some transition types; e.g. Hopcroft’s algorithm for deterministic automata [27] and Valmari and Franceschinis’ algorithm for weighted systems (e.g. Markov chains) [44] both split only with respect to . In the following, we exhibit a criterion on the level of functors that captures that splitting w.r.t. only is sufficient:
Definition 8.
A functor is cancellative if the map
is injective.
To understand the role of the above map, recall the function from (5) and note that and , so the composite yields information about the accumulated transition weights into and but not about the one into ; the injectivity condition means that for cancellative functors, this information suffices in the splitting step for . The term cancellative stems from the respective property on monoids; recall that a monoid is cancellative if implies for all .
Proposition 4.1.
The monoid-valued functor for a commutative monoid is cancellative if and only if is a cancellative monoid.
Hence, is cancellative, but is not. Moreover, all signature functors are cancellative:
Proposition 4.2.
The class of cancellative functors contains the all constant functors as well as the identity functor, and it is closed under subfunctors, products, and coproducts.
For example, is cancellative, but is not because of its subfunctor .
Remark 9.
Cancellative functors are neither closed under quotients nor under composition. Zippability and cancellativity are independent properties. Zippability in conjunction with cancellativity implies -zippability for all , the -ary variant [32] of zippability.
Theorem 4.3.
If is a cancellative functor, in 3.7 can be replaced with . Then, the algorithm still correctly computes certificates in the given -coalgebra .
Note that in this optimized algorithm, the computation of can be omitted because it is not used anymore. Hence, the resulting formulae only involve , , and modalities from the set (with the second parameter fixed to ). These modalities are equivalently unary modalities induced by elements of , which we term -modalities; hence, the corresponding Hennessy-Milner Theorem (3.9) adapts to for cancellative functors, as follows:
Corollary 4.4.
For zippable and cancellative , states in an -coalgebra are behaviourally equivalent iff they agree on modal formulae built using , , and unary -modalities.
5 Domain-Specific Certificates
On a given specific system type, one is typically interested in certificates and distinguishing formulae expressed via modalities whose use is established in the respective domain, e.g. and for transition systems. We next describe how the generic modalities can be rewritten to domain-specific ones in a postprocessing step. The domain-specific modalities will not in general be equivalent to -modalities, but still yield certificates.
Definition 10.
The Boolean closure of a modal signature has as -ary modalities propositional combinations of atoms of the form , for , where are propositional combinations of elements of . Such a modality is interpreted by predicate liftings defined inductively in the obvious way.
For example, the boolean closure of contains the unary modality .
Definition 11.
Given a modal signature for a functor , a domain-specific interpretation consists of functions and assigning to each a nullary modality and to each a binary modality such that the predicate liftings and satisfy
(Recall that is the characteristic function of , and denotes the equivalence class of w.r.t. .)
Thus, holds precisely at states with output behaviour . Intuitively, describes the refinement step of a predecessor block when splitting into and (Figure 3), which translates into the arguments and of . In the refinement step, we know from previous iterations that all elements have the same behaviour w.r.t. . This is reflected in the intersection with . The axiom guarantees that characterizes uniquely, but only within the equivalence class representing a predecessor block. Thus, can be much smaller than equivalents of (cf. 3.3):
Example 5.1.
-
1.
For , we have a domain-specific interpretation over the modal signature . For , take and . For , we put
The certificates obtained via this translation are precisely the ones generated in the example using the Paige-Tarjan algorithm, cf. (4), with in lieu of .
-
2.
For a signature (functor) , take . We interpret by the predicate liftings
Intuitively, states that the th successor satisfies iff . We then have a domain-specific interpretation given by for and for and .
-
3.
For a monoid-valued functor , take , interpreted by the predicate liftings given by
A formula thus states that the accumulated weight of the successors satisfying is exactly . A domain-specific interpretation is then given by for and for . In case is cancellative, we can also simply put .
-
4.
For labelled Markov chains, i.e. , let , where denotes that on input , the next state will satisfy with probability at least , as in cited work by Desharnais et al. [17]. This gives rise to the interpretation:
Given a domain-specific interpretation for a modal signature for the functor , we can postprocess certificates produced by 3.7 by replacing the modalities for according to the translation recursively defined by the following clauses for modalities and by commutation with propositional operators:
Note that one can replace with for the optimized from 5; the latter conjunction has essentially the same size as .
Proposition 5.2.
For every certificate of a behavioural equivalence class of a given coalgebra produced by either 3.7 or its optimization (Theorem 4.3), is also a certificate for that class.
Thus, the domain-specific modal signatures also inherit a Hennessy-Milner Theorem.
Example 5.3.
For labelled Markov chains () and the interpretation via the modalities (5.1.4), this yields certificates (thus in particular distinguishing formulae) in run time , with the same bound on formula size. Desharnais et al. describe an algorithm [17, Fig. 4] that computes distinguishing formulae in the negation-free fragment of the same logic (they note also that this fragment does not suffice for certificates). They do not provide a run-time analysis, but the nested loop structure indicates that the asymptotic complexity is roughly .
6 Worst Case Tree Size of Certificates
In the complexity analysis (subsection 3.5), we have seen that certificates – and thus also distinguishing formulae – have dag size on input coalgebras with states and transitions. However, when formulae are written in the usual linear way, multiple occurrences of the same subformula lead to an exponential blow up of the formula size in this sense, which for emphasis we refer to as the tree size.
Figueira and Gorín [22] show that exponential tree size is inevitable even for distinguishing formulae. The proof is based on winning strategies in bisimulation games, a technique that is also applied in other results on lower bounds on formula size [23, 3, 4].
Open Problem 6.1.
Do states in -coalgebras generally have certificates of subexponential tree size in the number of states? If yes, can small certificates be computed efficiently?
We note that for another cancellative functor, the answer is well-known: On deterministic automata, i.e. coalgebras for , the standard minimization algorithm constructs distinguishing words of linear length.
Remark 12.
Cleaveland [13, p. 368] also mentions that minimal distinguishing formulae may be exponential in size, however for a slightly different notion of minimality: a formula distinguishing from is minimal if no obtained by replacing a non-trivial subformula of with the formula distinguishes from . This is weaker than demanding that the formula size of is as small as possible. For example, in the transition system
for ,
the formula distinguishes from and is minimal in the above sense. However, can in fact be distinguished from in size , by the formula .
7 Conclusions and Further Work
We have presented a generic algorithm that computes distinguishing formulae for behaviourally inequivalent states in state-based systems of various types, cast as coalgebras for a functor capturing the system type. Our algorithm is based on coalgebraic partition refinement [46], and like that algorithm runs in time , with a functor-specific factor that is in many cases of interest. Independently of this factor, the distinguishing formulae constructed by the algorithm have dag size ; they live in a dedicated instance of coalgebraic modal logic [39, 42], with binary modalities extracted from the type functor in a systematic way. We have shown that for cancellative functors, the construction of formulae and, more importantly, the logic can be simplified, requiring only unary modalities and conjunction. We have also discussed how distinguishing formulae can be translated into a more familiar domain-specific syntax (e.g. Hennessy-Milner logic for transition systems).
There is an open source implementation of the underlying partition refinement algorithm [15], which may serve as a basis for a future implementation.
In partition refinement, blocks are successively refined in a top-down manner, and this is reflected by the use of conjunction in distinguishing formulae. Alternatively, bisimilarity may be computed bottom-up, as in a recent partition aggregation algorithm [11]. It is an interesting point for future investigation whether this algorithm can be extended to compute distinguishing formulae, which would likely be of a rather different shape than those computed via partition refinement.
References
- [1] Peter Aczel and Nax Mendler. A final coalgebra theorem. In Proc. Category Theory and Computer Science (CTCS), volume 389 of LNCS, pages 357–365. Springer, 1989.
- [2] Jiří Adámek, Stefan Milius, and Lawrence S. Moss. Initial algebras, terminal coalgebras, and the theory of fixed points of functors. draft book, available online at https://www8.cs.fau.de/ext/milius/publications/files/CoalgebraBook.pdf, 2021.
- [3] Micah Adler and Neil Immerman. An n! lower bound on formula size. In LICS 2001, pages 197–206. IEEE Computer Society, 2001.
- [4] Micah Adler and Neil Immerman. An n! lower bound on formula size. ACM Trans. Comput. Log., 4(3):296–314, 2003.
- [5] Abel Armas-Cervantes, Paolo Baldan, Marlon Dumas, and Luciano García-Bañuelos. Behavioral comparison of process models based on canonically reduced event structures. In Business Process Management, pages 267–282. Springer, 2014.
- [6] Abel Armas-Cervantes, Luciano García-Bañuelos, and Marlon Dumas. Event structures as a foundation for process model differencing, part 1: Acyclic processes. In Web Services and Formal Methods, pages 69–86. Springer, 2013.
- [7] Falk Bartels, Ana Sokolova, and Erik de Vink. A hierarchy of probabilistic system types. Theoret. Comput. Sci., 327:3–22, 2004.
- [8] Marco Bernardo. TwoTowers 5.1 user manual, 2004.
- [9] Marco Bernardo, Rance Cleaveland, Steve Sims, and W. Stewart. TwoTowers: A tool integrating functional and performance analysis of concurrent systems. In Formal Description Techniques and Protocol Specification, Testing and Verification, FORTE / PSTV 1998, volume 135 of IFIP Conference Proceedings, pages 457–467. Kluwer, 1998.
- [10] Marco Bernardo and Marino Miculan. Constructive logical characterizations of bisimilarity for reactive probabilistic systems. Theoretical Computer Science, 764:80 – 99, 2019. Selected papers of ICTCS 2016.
- [11] Johanna Björklund and Loek Cleophas. Aggregation-based minimization of finite state automata. Acta Informatica, 2020.
- [12] Ufuk Celikkan and Rance Cleaveland. Generating diagnostic information for behavioral preorders. Distributed Computing, 9(2):61–75, 1995.
- [13] Rance Cleaveland. On automatically explaining bisimulation inequivalence. In Computer-Aided Verification, pages 364–372. Springer, 1991.
- [14] Sjoerd Cranen, Bas Luttik, and Tim A. C. Willemse. Evidence for Fixpoint Logic. In 24th EACSL Annual Conference on Computer Science Logic (CSL 2015), volume 41 of LIPIcs, pages 78–93. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2015.
- [15] Hans-Peter Deifel, Stefan Milius, Lutz Schröder, and Thorsten Wißmann. Generic partition refinement and weighted tree automata. In Formal Methods – The Next 30 Years, Proc. 3rd World Congress on Formal Methods (FM 2019), volume 11800 of LNCS, pages 280–297. Springer, 10 2019.
- [16] J. Desharnais, A. Edalat, and P. Panangaden. A logical characterization of bisimulation for labeled markov processes. In Proceedings. Thirteenth Annual IEEE Symposium on Logic in Computer Science (Cat. No.98CB36226), pages 478–487, 1998.
- [17] Josée Desharnais, Abbas Edalat, and Prakash Panangaden. Bisimulation for labelled markov processes. Information and Computation, 179(2):163–193, 2002.
- [18] Remco Dijkman. Diagnosing differences between business process models. In Business Process Management, pages 261–277, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg.
- [19] Ernst-Erich Doberkat. Stochastic Coalgebraic Logic. Springer, 2009.
- [20] Ulrich Dorsch, Stefan Milius, Lutz Schröder, and Thorsten Wißmann. Efficient coalgebraic partition refinement. In Proc. 28th International Conference on Concurrency Theory (CONCUR 2017), LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.
- [21] Ulrich Dorsch, Stefan Milius, Lutz Schröder, and Thorsten Wißmann. Predicate liftings and functor presentations in coalgebraic expression languages. In Coalgebraic Methods in Computer Science, CMCS 2018, volume 11202 of LNCS, pages 56–77. Springer, 2018.
- [22] Santiago Figueira and Daniel Gorín. On the size of shortest modal descriptions. In Advances in Modal Logic 8, papers from the eighth conference on "Advances in Modal Logic," held in Moscow, Russia, 24-27 August 2010, pages 120–139. College Publications, 2010.
- [23] Tim French, Wiebe van der Hoek, Petar Iliev, and Barteld Kooi. On the succinctness of some modal logics. Artificial Intelligence, 197:56 – 85, 2013.
- [24] Daniel Gorín and Lutz Schröder. Simulations and bisimulations for coalgebraic modal logics. In Algebra and Coalgebra in Computer Science - 5th International Conference, CALCO 2013, volume 8089 of LNCS, pages 253–266. Springer, 2013.
- [25] H. Peter Gumm and Tobias Schröder. Monoid-labeled transition systems. In Coalgebraic Methods in Computer Science, CMCS 2001, volume 44(1) of ENTCS, pages 185–204. Elsevier, 2001.
- [26] H.Peter Gumm. From -coalgebras to filter structures and transition systems. In Algebra and Coalgebra in Computer Science, volume 3629 of LNCS, pages 194–212. Springer, 2005.
- [27] John Hopcroft. An algorithm for minimizing states in a finite automaton. In Theory of Machines and Computations, pages 189–196. Academic Press, 1971.
- [28] Paris C. Kanellakis and Scott A. Smolka. Ccs expressions, finite state processes, and three problems of equivalence. In Proceedings of the Second Annual ACM Symposium on Principles of Distributed Computing, PODC ’83, pages 228–240. ACM, 1983.
- [29] Paris C. Kanellakis and Scott A. Smolka. CCS expressions, finite state processes, and three problems of equivalence. Inf. Comput., 86(1):43–68, 1990.
- [30] Bartek Klin. The least fibred lifting and the expressivity of coalgebraic modal logic. In Algebra and Coalgebra in Computer Science, CALCO 2005, volume 3629 of LNCS, pages 247–262. Springer, 2005.
- [31] Timo Knuutila. Re-describing an algorithm by Hopcroft. Theor. Comput. Sci., 250:333 – 363, 2001.
- [32] Barbara König, Christina Mika-Michalski, and Lutz Schröder. Explaining non-bisimilarity in a coalgebraic approach: Games and distinguishing formulas. In Coalgebraic Methods in Computer Science, pages 133–154. Springer, 2020.
- [33] Kim Guldstrand Larsen and Arne Arne Skou. Bisimulation through probabilistic testing. Inform. Comput., 94(1):1–28, 1991.
- [34] Johannes Marti and Yde Venema. Lax extensions of coalgebra functors and their logic. J. Comput. Syst. Sci., 81(5):880–900, 2015.
- [35] R. Milner. Communication and Concurrency. International series in computer science. Prentice-Hall, 1989.
- [36] Robert Paige and Robert E. Tarjan. Three partition refinement algorithms. SIAM J. Comput., 16(6):973–989, 1987.
- [37] D. Park. Concurrency and automata on infinite sequences. In Proceedings of 5th GI-Conference on Theoretical Computer Science, volume 104 of LNCS, pages 167–183, 1981.
- [38] Dirk Pattinson. Coalgebraic modal logic: soundness, completeness and decidability of local consequence. Theoretical Computer Science, 309(1):177 – 193, 2003.
- [39] Dirk Pattinson. Expressive logics for coalgebras via terminal sequence induction. Notre Dame J. Formal Log., 45(1):19–33, 2004.
- [40] Jan Rutten. Universal coalgebra: a theory of systems. Theor. Comput. Sci., 249:3–80, 2000.
- [41] Jan Rutten and Erik de Vink. Bisimulation for probabilistic transition systems: a coalgebraic approach. Theoret. Comput. Sci., 221:271–293, 1999.
- [42] Lutz Schröder. Expressivity of coalgebraic modal logic: The limits and beyond. Theor. Comput. Sci., 390(2-3):230–247, 2008.
- [43] Věra Trnková. On a descriptive classification of set functors I. Commentationes Mathematicae Universitatis Carolinae, 12(1):143–174, 1971.
- [44] Antti Valmari and Giuliana Franceschinis. Simple time Markov chain lumping. In Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2010, volume 6015 of LNCS, pages 38–52. Springer, 2010.
- [45] Antti Valmari and Petri Lehtinen. Efficient minimization of dfas with partial transition. In Theoretical Aspects of Computer Science, STACS 2008, volume 1 of LIPIcs, pages 645–656. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Germany, 2008.
- [46] Thorsten Wißmann, Ulrich Dorsch, Stefan Milius, and Lutz Schröder. Efficient and Modular Coalgebraic Partition Refinement. Logical Methods in Computer Science, Volume 16, Issue 1, January 2020.
Appendix A Appendix: Omitted Proofs
Details for section 2 (Preliminaries)
Details for subsection 2.1.
Given a pair of -coalgebra and , we have a canonical -coalgebra structure on the the disjoint union of their carriers:
The canonical inclusion maps and are -coalgebra morphisms. We say that states and are behavioural equivalent if .
Note that this definition extends the original definition of , in the sense that in the same coalgebra are behaviourally equivalent () iff in the canonical coalgebra on .
Details on Predicate Liftings in subsection 2.2.
The naturality of in for means that for every map , the diagram
commutes. Since is contravariant, the map is sent to which takes inverse images; writing down the commutativity element-wise yields (3). By the Yoneda lemma, one can define predicate liftings
Lemma A.1.
A predicate lifting for is uniquely defined by a map . Then is given by
or written as sets (considering ):
Proof A.2.
The following mathematical objects are in one-to-one correspondence
The first correspondence is the Yoneda lemma and the second correspondence is a power law. On the right, the inhabitants of the sets are listed when starting with . By the definition of we have:
Details for section 3 (Constructing Distinguishing Formulae)
Verification of 4.
Proof of 3.4.
This follows directly from 4 for and , using that :
Proof of 3.6.
Note that for , we have where .
(3.5) | ||||
(3.4, ) | ||||
() | ||||
( injective) |
In the last step use that, w.l.o.g., preserves injective maps (subsection 2.1). ∎
Proof of Theorem 3.8.
- 1.
- 2.
Details for 6.
In order to verify t