Finite Model Property and Bisimulation for LFD
Abstract
Recently, Baltag & van Benthem introduced a decidable logic of functional dependence (LFD) that extends the logic of Cylindrical Relativized Set Algebras (CRS) with atomic local dependence statements. Its semantics can be given in terms of generalised assignment models or their modal counterparts, hence the logic is both a first-order and a modal logic. We show that LFD has the finite model property (FMP) using Herwig’s theorem on extending partial isomorphisms, and prove a bisimulation invariance theorem characterizing LFD as a fragment of first-order logic.
1 Introduction
Recently, Baltag & van Benthem introduced a decidable logic of functional dependence (LFD) that extends the logic of Cylindrical Relativized Set Algebras (CRS) [2] with atomic dependence statements. The semantics is given in terms of dependence models111These are just the generalised assignment models known from [2][5]., which are pairs of a first-order structure together with a fixed set of variable assignments (or ’team’) on , where is some (possibly finite) ambient set of variables. Formulas are evaluated at individual assignments ; in particular the dependence atoms get the following semantics: if for all , implies . This is in contrast with logics based on team semantics, where dependence formulas are evaluated at teams and this team is dynamically changed over the course of evaluation. Whereas most logics based on team semantics are undecidable, non-classical and have expressive going beyond FOL, LFD is decidable with a classical semantics and can be considered a fragment of FOL.
Many interesting notions of dependence (such as lineair dependence in vector spaces, temporal dependence in dynamical systems and strategic interaction in a multi-player game) can be formalized in LFD [3]. Moreover, LFD invites a natural epistemic interpretation where (sets of) variables may represent (groups of) agents, (joint) questions or objects.[3] The dependence modalities then capture distributed knowledge or the interrogative modality, while the dependence atoms capture epistemic superiority or inquisitive implication (or other ’mixed’ notions). More spectacularly, [4] introduces a complete and decidable dynamic-epistemic logic based on LFD with so-called ’reading events’ as well as a notion of ’common distributed knowledge’ that combines features of common knowledge and distributed knowledge.
Dependence models are closely related to relational databases: assignments are rows in the table and each variable represents a column, or attribute. Here is a simple numerical example of a dependence model, viewed as a database:
x | y | z |
---|---|---|
1 | 0 | 1 |
1 | 0 | 0 |
0 | 1 | 1 |
2 | 0 | 2 |
In this table we see that e.g. locally depends on in the first row, because the second row, which agrees on with the first, also agrees on with it (and no other rows agree on with the first row). In fact, this dependence holds at all rows, in which case we say that globally depends on . Conversely, does not depend globally on because it does not locally depend on at the first row: both and occur as -values of rows that share the current -value . Finally, because the fourth row is the only row with -value , all other variables locally depend on there.
The foregoing example witnesses the close connection between LFD and the study of dependence in databases, and indeed the Projection and Transitivity axioms of LFD recapture Armstrong’s Axioms for functional dependence [3]. Deeper connections with database theory as well as team semantics might arise by introducing dynamics on the level of teams, generalizing the semantics to dependence universes, i.e. families of dependence models [3]. In particular, dependence models
The decidability proof in [3] uses completeness of LFD w.r.t a purely syntactic ’type semantics’ resembling the ’quasi-models’ studied in connection with the Guarded Fragment [5][2]. The question whether LFD has the finite model property (FMP) w.r.t. dependence models remained an open problem [3]. Our main result is that LFD has the FMP, by a new application of Herwig’s theorem on extending partial isomorphisms. Moreover, we define dependence bisimulations and show that LFD can be characterized as the fragment of FOL that is invariant under this notion. Independently, another notion of bisimulation for LFD along more standard lines has been proposed in [6]. We show that these notions are equivalent, but that dependence bisimulations suggest a more efficient procedure for checking bisimilarity.
2 Preliminaries
We first introduce the language LFD, dependence models and type models. A pair , where is set of variables and is a relational language is called a vocabulary. When both and are finite, we say that is a finite vocabulary. We write for the set of first-order formulas with variables in (both free and bound) and predicates in , and similarly for . We assume that each vocabulary becomes equipped with an arity map .
Definition 2.1.
(Syntax) The language is recursively defined by:
where is a finite set of variables, an individual variable, a predicate symbol and a finite string of variables.222LFD as a modal language is generated by the same definition, but where become unary predicates in . Fixing notation, for any , we write if holds for all . We also skip the set brackets for singletons, writing for , and for . For every , we define its free variables by:
-
•
-
•
-
•
,
Moreover, we let denote the set of variables occurring in . This is in general a superset of the free of variables, i.e. . Further, we say that .
Definition 2.2.
(Dependence Models) A dependence model (for the vocabulary ) is a pair of a relational structure for , together with a fixed team .333We use letters for first-order structures and blackboard bold letters for dependence models. For each , we define an agreement relation on the team:
Note that may be finite. We call a dependence model distinguished if all the assignments are injective.
Definition 2.3.
(Semantics) Truth of a formula in a dependence model at an assignment is defined by the following clauses (the Boolean cases are defined as usual:
Where denotes the tuple for . Clearly, for every dependence model and assignments , there is a unique set that is the maximal set of variables on which agree. An important feature of the semantics is LFD satisfies Locality.
Next to dependence models, LFD is weakly complete w.r.t. a non-standard type semantics.[3] In other words, only LFD over finite vocabularies is complete for this semantics. Type models were used in [3] as a technical auxiliary to prove completeness and decidability. Types are defined relative to closures. We obtain the closure of a formula by adding to all formulas for and closing the resulting set under subformulas and single negation. 444For every non-negated formula (i.e. a formula whose principal connective is not ) we add to the closure, and for negated formulas we we do nothing. The resulting closure set will not contain any formulas with double negations.
Definition 2.4.
(-Types) Let be a closure in . A subset is a -type if it satisfies the following conditions (where all formulas mentioned run over only):
-
(a)
iff
-
(b)
iff and
-
(c)
if , then
-
(d)
for all
-
(e)
implies
For , we define a relation on types :
where is the dependence-closure of w.r.t and . Observe that implies as .
Definition 2.5.
(Type Models) A type model (for ) is a family of -types satisfying:
-
•
if , then there exists a , such that and .
-
•
holds for all .
Type models are always finite, as there are only finitely many -types for a given closure . This proves the decidability as LFD is (weakly) complete w.r.t. type models [3]. The semantic conditions for type models are given by membership:
2.1 Tree Model Property
Every satisfiable LFD formula can be satisfied on a certain tree-like dependence model. This fact follows from the fact that dependence models and type models provide equivalent semantics for LFD, i.e. each type model can be represented as a dependence model and vice versa [3]. The interesting direction is representing arbitrary type models as dependence models by means of an unravelling construction in the sense of modal logic. To say what we mean by ’tree-like’ we need the graph-theoretic notion of a -tree (the definition is taken from [7]). Say that an -tuple of objects from a -structure is live in , if there is some -ary such that .
Definition 2.6.
(-Tree) A -structure is a -tree if there exists a tree (i.e. an acyclic, connected graph) and a function , assigning to every node of a set of at most elements of , such that the following two conditions hold.
-
(i)
For every live tuple from , there is some node such that .
-
(ii)
For every element of , the set of nodes is connected (and hence induces a subtree of ).
is of finite branching degree if is, that is if the set of neighbours of every node in is finite.
Theorem 2.1.
Representation of Type Models [3]
Let be a type model for . There exists a dependence model with .
Proof.
Let and where is the set of variables occurring in formulas in (i.e. ). Fix a type . A good path is a sequence with such that for each (a) and (b) . Write for the last element of , and for the length of (not counting the variable sets). For each good path , we define the path assignment , assigning objects of the form to variables :
(1) | |||
(2) | |||
(3) |
So new objects are created whenever the value for a variable is not locally determined by the predecessor path. We obtain a team on the structure with domain and where an -ary holds of an -tuple iff all paths are linearly ordered by initial segment and the formula , where is the longest path amongst .
This yields a distinguished dependence model whose objects are typed by a unique variable . The set of all good paths, ordered by initial segment, forms a tree whose branching degree is bounded by . Together with the map , this shows that is a -tree of finite branching degree. Finally, we have the following crucial truth lemma [3]:
Lemma 2.1.
Truth Lemma
For all formulas and good paths iff
This lemma implies that : because every type occurs as the for some unique good path of length 2 already, namely . Moreover, note that we are free to choose the initial fixed type from in the definition of good path, and hence, by the truth lemma, we can choose what type to be satisfied at the root. ∎
Corollary 2.1.
Tree Model Property
If is satisfiable and , there is a dependence model , where is -tree of finite branching degree, satisfying at the root assignment.
Definition 2.7.
(First-Order Translation) Although interpreted over a generalised semantics, LFD in finitely many variables can be encoded back into FOL over standard structures. So let be a finite set of variables with enumeration . We double the amount of variables, creating a set of copied variables from the variables in . We ensure that the relevant assignments agree on their values for variables and their copies by the conjunction .555This additional condition (it was not in the original formulation in [3]) is essential for encoding the semantics of the dependence atoms into FOL, which treats as variables as completely independent otherwise. Further, we introduce a new -ary predicate such that encodes the fact that the tuples of values assigned to by the current assignment is the range of some admissible assignment from the team (this is a tuple because is finite). The first-order translation is defined by [3]:
-
•
and commutes with Boolean connectives
-
•
, where is the enumeration of all the variables in and is the enumeration of all the variables in .
-
•
, where are as in part (d), and are the corresponding fresh -copies of and respectively.666Furthermore, denotes the formula that is obtained by replacing the variables by in the formula .
There is a one-to-one correspondence between dependence models and structures in this extended language. If is a dependence model, is the expansion of with the interpretation . Conversely, given any -structure we obtain a team which together with a reduct of makes for the corresponding dependence model. We have the equivalence:
for every and all assignments extending . This translation easily adapts to other local dependence atoms proposed in [6], e.g. and .
3 Characterization
The original paper [3] left finding a bisimulation-invariance theorem characterizing LFD as an open problem. precisely which formulas in are equivalent to the -translation of an LFD-formula over standard structures.777There is also a modal translation of LFD into FOL that extends the well-known standard translation of modal logic into the 2-variable fragment of FOL. A similar characterization theorem can be proved via this translation and the relational semantics for LFD, as our notion of bisimulation as well as the one proposed in [6] are naturally formulated on dependence models as well as their modal counterparts. The following notion of dependence bisimulation exactly characterizes as the largest fragment of invariant under this notion. Say that a set of variables is dependence-closed at if , or equivalently if implies .
Definition 3.1.
(Dependence Bisimulation) Let be dependence models. We say that a non-empty relation is a dependence-bisimulation if for every :
-
(Atom)
iff
-
(Forth)
For every , (i) the set is dependence-closed at and
there is some such that (ii) and (iii) -
(Back)
symmetric to the (Forth) clause
Dependence bisimulations are always total; every state is related to another by the bisimulation.
Proposition 3.1.
LFD-formulas are invariant under dependence bisimulations.
Proof.
Let be dependence models and a dependence bisimulation with and LFD. We show that iff by induction on the complexity of ; the atomic and Boolean cases are trivial. For the other cases, we show only one direction.
( Suppose that and let , i.e. , for some . By the (Back)-clause there is some such that and . Hence and so by .
( Suppose that and let for some . We want to show that , i.e. . By the (Back)-clause there is some with , and is dependence-closed at . As , by monotonicity of dependence we have . This shows that as is dependence-closed at . ∎
Dependence bisimulations in fact characterize LFD as a fragment of FOL. This can be shown by formulating an analogue of dependence bisimulations for structures of the form , and showing that on -saturated structures of this form, LFD-equivalence implies dependence-bisimilarity.
Independently, another notion of bisimulation characterizing LFD has been proposed in [6] that treats dependence atoms like ordinary relational atoms. That is, instead of the dependence-closed condition they simply require that ” iff ” holds for all . It follows that proposition 3.1 shows that dependence bisimulations are also bisimulations in their sense. Conversely, ” iff ” clearly implies the dependence-closed condition, hence the two notions are equivalent. It follows that the proof given in [6] also shows that LFD is the dependence bisimulation-invariant fragment of FOL.
Theorem 3.1.
Van Benthem Characterization
is the largest fragment of that is invariant under dependence bisimulations.
Dependence bisimulations suggest a more efficient way to implement a bisimilarity-checking algorithm for LFD compared to the definition in [6]. For what proposition 3.1 shows is that, given with , it actually suffices to check that ” iff for all ” holds for all in order to conclude that ” iff for all ” holds for all . This could be used to avoid an exponential blow-up in .
Dependence bisimulations generalise naturally to extensions of LFD. For instance, we can extend with the equality relation , yielding the logic which was shown to be a conservative reduction class of FOL and hence undecidable in [6]. Dependence bisimulations with an extended (Atom) clause that also ranges over equality can be shown to characterize as a fragment to FOL. Interestingly, over full dependence models (i.e. those with , which are standard first-order structures repackaged as dependence models), dependence bisimulations (for LFD over a finite vocabulary with ) coincides with -potential isomorphism, which characterizes first-order logic in variables.
4 Finite Model Property
We show that LFD has the FMP w.r.t the intended dependence model semantics, by an application of Herwig’s theorem similar to the one in [7]. Fix a satisfiable LFD-formula , and let . We let be the smallest vocabulary containing and hence . Note that is a finite vocabulary, so let . We know that there is a tree-like dependence model , with associated tree of good paths, satisfying at the root assignment. Furthermore, the degree of is bounded by , where is the number of distinct -types. Our strategy is as follows: we will cut the underlying -tree to a finite structure, encode the dependence atoms in a richer language and finally use Herwig’s theorem to generate out of this a finite dependence model that is bisimilar to the original tree-model. Define a sub-team of by:
and let be the submodel of induced by ; we call the cut-off model. This is a finite model because the branching degree of is bounded and is finite. The truth lemma clearly no longer holds on this cut-off model, because some existential witnesses are missing for assignments of length 3.
We extend the language to include an -ary relation for each , and obtain the (still finite) richer language . We will use these relations to encode the semantics of the dependence atoms. We expand the structure underlying the cut-off model to a structure by putting:
so that iff . In the end, we want to show that holds on the Herwig extension, so that we can recover an appropriate dependence model from it. To show this, we will need the following restricted version of this claim on the cut-off model:
Proposition 4.1.
For each of length .
Proof.
By contraposition, so suppose that . This means that , so for the good path (it is a good path as trivially holds) we have that and , i.e. . ∎
Herwig’s theorem on extending partial isomorphism [8] is a result about first-order relational languages. It tells us that any finite structure with some set of partial isomorphisms on it has a finite extension in which all these partial isomorphisms extend to automorphisms. This theorem has already been used to show the FMP of the Guarded Fragment (GF) [7].
Theorem 4.1.
Herwig
Let be a finite relational language, a finite -structure and a (finite) set of partial isomorphisms on . Then there exists a finite extension of that satisfies the following conditions:
-
(i)
Every extends to a unique automorphism of . This yields a subgroup of the automorphism group of .
-
(ii)
If a tuple from is live or , then there exists an automorphism such that for each , .
-
(iii)
If and such that , then either or there is a unique such that and .
where is the collection of all partial isomorphisms that can be obtained by composing the with their inverses. Note that is strictly speaking not a group as it need not be the case that is the identity on (in general, it is the identity on a subset of ).
Condition (iii) is in need of further clarification. In words, it says that elements in the submodel are only mapped to each other by some if this is forced given the choice of partial isomorphisms. Uniqueness of in this condition is ensured by the fact that the map extends to a bijective map that commutes with the operations (and the identity ). By condition (i), is defined on the subset . Set and ; so commutation follows by definition. It immediately follows that the map is injective. For surjectivity, let . By definition for some and for each . Define . Now observe:
We proceed with specifying a choice of partial isomorphisms on the cut-off model. If is a good path of and , then there is a partial isomorphism such that , where so . We pick the finite set of partial isomorphisms . The following proposition tells us what kind of partial isomorphisms are in .
Lemma 4.1.
If with , then there are with such that , and .
Proof.
Let such that . By definition, for some and for each . Note that for each we have that , so 888More specifically , but this is the same as equality as . Another way of putting this is that and . for some with . In particular, there are such that and . Set . It follows that and so , i.e.
This was the base case for an inductive argument up to . So let and suppose that with and
Now recall that for some with . Moreover, it follows that . Hence by induction, there is some with such that and
then for and we have proved the lemma ∎
The associated first-order structure of the Herwig extension is a finite model in a finite relational language , and is a finite set of partial isomorphisms on it. Hence, by Herwig’s theorem, there exists a finite extension of this structure, the Herwig extension, satisfying conditions (i)-(iii) w.r.t . It is easy to see that the Herwig extension corresponds in the canonical way (i.e. see the first-order translation above) to a dependence model such that . Recall that we want to establish a bisimulation between the finite Herwig extension and the infinite tree model . To do this, we will need the following lemmas.
Lemma 4.2.
Level 2 Lemma
For every there is an such that where .
Proof.
Let . Then the tuple is live in . Hence by condition (ii) there is some automorphism such that is a tuple of objects of the submodel . As is an isomorphism, it follows that as well. But this can only be if for some . Now suppose that , with , then by (i) there is an automorphism such that and , where . Hence we may assume that there exists some such that for some path assignment of length . ∎
Next, we generalise the notion of ’underlying type’ (i.e. for a path assignment ) to all assignments in . We define a function . Set for all . For , by the level 2 lemma we know there is such that , and we set .
Proof.
Well-definedness of
Let be automorphisms with and . Observe that is an automorphism in the subgroup that maps elements in to each other, as . Hence by (iii) there must be a unique such that and thus . Lemma 4.1 tells us that there are assignments with , and . It is an easy consequence of the Truth Lemma (lemma 2.1) and Locality that implies that and similarly for .999For suppose that . Observe that for any path assignment with . By Locality the hypothesis gives that . By the Truth Lemma, this in turn implies that which says that , but this clearly implies that . Hence .
∎
This last fact used, i.e. that implies , we will now generalise to all assignments in w.r.t their ’underlying types’ .
Lemma 4.3.
Type Lemma
If with , then .
Proof.
Let with . By the level 2 lemma, there is such that with , so . As is an isomorphism on , we know that is an assignment as well, with . By applying the level 2 lemma again to , we get a such that with , so . Again, we know that is also an assignment (though in general not one in ) such that . But observe that the automorphism maps for all , hence by condition (iii) there must be a unique such that and so . By Lemma 4.1, there are such that , and . Invoking the Truth Lemma and Locality as before this implies that and . Concatenating these facts we see that
∎
Lemma 4.4.
Encoding Lemma
For all and all
Proof.
() By the level 2 lemma, there is such that with . Applying the first-order translation to proposition 3.1 we get that . But observe that
is an existential first-order formula. Hence by the dualized version of the Łoś-Tarski theorem, this still holds in the Herwig extension, i.e. . As is an isomorphism on and , we get that , as desired.
() Suppose that , and let for some . The former fact implies that and the latter by the Type Lemma implies that . It follows that as well. Applying the level 2 lemma two times successively as before, we obtain automorphism such that , with (recall the notation ). As in the type lemma, we see that (i.e. ) for all and thus by condition (iii) there is a unique such that and hence .
We know by Lemma 4.1 that there must be with such that , and . By fact 4.9 from [3], this implies that there is a path and similarly for . We saw that , so in fact must be in all the types along these paths. But then it follows from condition (2) of the recursive definition of path assignments (in the proof of theorem 2.1) that and . But recall that so:
But then so by transitivity and we conclude that . ∎
Theorem 4.2.
The dependence models and are dependence-bisimilar.
Proof.
We show that the relation defined by is an LFD-bisimulation in the sense of [6] and hence, by our remark above, also a dependence bisimulation. Pick an arbitrary pair . By the level 2 lemma, there is some such that with , hence . As is well-defined, it follows that . We show that the pair satisfies (Atom) (i.e. the one which also ranges over dependence atoms [6]) and is closed under the (Back) & (Forth) clauses (without the dependence-closedness condition).
(Atom) Observe that the chain of equivalences:
holds for every (i.e. including the relations !) by the fact that is an isomorphism with and the way we have specified the interpretation on both models. Invoking the encoding lemma, this implies that iff .
(Forth) Let be some assignment in the Herwig extension, and let be the maximal set of variables on which and agree. By the Type Lemma . But , so it follows that is a good path. Clearly with , and lastly as .
(Back) Let , with the maximal set ot variables on which agree. By a now familiar argument involving the Truth lemma and Locality (i.e. the analogue of the Type Lemma for ), it follows that . As , we see that is a good path. Moreover, we know that which implies that and so is in the cut-off model. Clearly . Set , then as and moreover since . ∎
Corollary 4.1.
Bounded Model Property
Every satisfiable in LFD has a finite model whose size is bounded by a computable function of . 101010Any formula determines a unique smallest finite vocabulary such that belongs to ; the computable function takes as input , the maximal arity of relations in , and the number of distinct -types .
Proof.
Let be a satisfiable LFD-formula with closure in the language and . By the tree model property, there is a -tree and a team such that is a dependence model satisfying at the root assignment. We cut this tree at length 3 and obtain the cut-off model whose size is upper bounded by , where is the branching degree of the -tree . Note that itself has as upper bound, where and . It follows that the size of the cut-off model is already exponential in the size of the variables .
Now construct the Herwig extension as above. Using the bound given in [8], we get that is upper bounded by an iterated exponential of a polynomial function of degree of , where is the maximal arity of predicates in . By theorem 4.2, and are-bisimilar. As dependence bisimulations are always total, there is some assignment with . By the invariance result above (proposition 3.1), it follows that . ∎
5 Conclusion
We have introduced dependence bisimulations and have shown that this notion characterizes LFD as a fragment of FOL. Furthermore, we have shown that LFD has the finite (or bounded) model property, by a new application of Herwig’s theorem and a tree-model property established in [3]. The same strategy can be used to carry out a direct proof of the FMP through the equivalent modal semantics.111111The proof of this can be found in an extended version of this paper (arXiv:2107.06042). With minor adaptations, the proof goes through, though we need to appeal to a more general version of Herwig’s theorem (theorem 5 in [8]) to ensure that the Herwig extension is a tree in order to obtain a standard relational model from it. By reducing the maximal arity to , going through the modal semantics significantly lowers the upper bound on the size of the Herwig extension to being singly exponential in the size of the cut-off model.
While LFD only adds local dependence atoms to CRS, extensions of CRS with other local versions of atomic dependency properties have been considered in [6].121212We will consider only the logics defined in [6] that are closed under negation, i.e. those for which is closed under negation. The authors show that LFD extended with either equality or inclusion is undecidable and that the extension of CRS with both inclusion and equality is contained in GF. CRS with independence atoms was shown undecidable in [3], resulting in a complete characterization of the satisfiability problems of such logics. The same paper also studies the model-checking problem for such logics, and shows it to be PTIME-complete in restriction to finitely many variables. However, this tight bound is only obtained on the assumption that the local atoms considered (i.e. inclusion, dependence, independence and equality) are all efficiently checkable.
One open problem is to determine the computational complexity of the satisfiability problem for LFD. It seems that, with a few adaptations, the satisfiability test for GF given in [7] can be used for the case of LFD. Indeed, the ’witnesses for satisfiability’ defined there closely resemble type models. A more conceptual challenge is connecting the qualitative notion of dependence studied by LFD to probabilistic, i.e. quantitative notions of correlation and dependence.
References
- [1]
- [2] Hajnal Andréka, István Németi & Johan van Benthem (1998): Modal Languages and Bounded Fragments of Predicate Logic. J. Philos. Log. 27(3), pp. 217–274, 10.1023/A:1004275029985.
- [3] Alexandru Baltag & Johan van Benthem (2021): A Simple Logic of Functional Dependence. Journal of Philosophical Logic, 10.1007/s10992-020-09588-z.
- [4] Alexandru Baltag & Sonja Smets (2020): Learning What Others Know. In Elvira Albert & Laura Kovacs, editors: LPAR23. LPAR-23: 23rd International Conference on Logic for Programming, Artificial Intelligence and Reasoning, EPiC Series in Computing 73, EasyChair, pp. 90–119, 10.29007/plm4. Available at https://easychair.org/publications/paper/V8Jp.
- [5] Johan van Benthem (2001): Exploring Logical Dynamics. Studia Logica 67(1), pp. 111–114, 10.1023/A:1017389612557.
- [6] Erich Grädel & Phil Pützstück (2021): Logics of Dependence and Independence: The Local Variants.
- [7] Erich Grädel (1999): On the Restraining Power of Guards. Journal of Symbolic Logic 64(4), p. 1719–1742, 10.2307/2586808.
- [8] B. Herwig (1998): Extending partial isomorphisms for the small index property of many -categorical structures. Israel Journal of Mathematics 107, pp. 93–123, 10.1007/BF02764005.