Language Models for Some Extensions of the Lambek Calculus
Abstract
We investigate language interpretations of two extensions of the Lambek calculus: with additive conjunction and disjunction and with additive conjunction and the unit constant. For extensions with additive connectives, we show that conjunction and disjunction behave differently. Adding both of them leads to incompleteness due to the distributivity law. We show that with conjunction only no issues with distributivity arise. In contrast, there exists a corollary of the distributivity law in the language with disjunction only which is not derivable in the non-distributive system. Moreover, this difference keeps valid for systems with permutation and/or weakening structural rules, that is, intuitionistic linear and affine logics and affine multiplicative-additive Lambek calculus. For the extension of the Lambek with the unit constant, we present a calculus which reflects natural algebraic properties of the empty word. We do not claim completeness for this calculus, but we prove undecidability for the whole range of systems extending this minimal calculus and sound w.r.t. language models. As a corollary, we show that in the language with the unit there exissts a sequent that is true if all variables are interpreted by regular language, but not true in language models in general.
keywords:
Lambek calculus , language models , relational models , distributive law , incompleteness , undecidability1 Introduction
The Lambek calculus was introduced by Joachim Lambek [1] for mathematical modelling of natural language syntax. This suggests the natural interpretation of the Lambek calculus as the algebraic logic of operations on formal languages. Such interpretations of the Lambek calculus are called language models, or L-models for short.
The Lambek calculus, as originally formulated by Lambek, includes three operations: (product), (left division), and (right division). A distinctive feature of the Lambek calculus is the so-called Lambek’s non-emptiness restriction. In terms of L-models, this means that the empty word is disallowed, and we consider, for a given alphabet , subsets of . Lambek operations on languages are defined as follows:
The division operations, and , are indeed residuals of the product w.r.t. the subset relation:
These equivalences form the core of the Lambek calculus. Along with transitivity (), reflexivity (), and associativity (), they form a complete axiomatization of all generally true atomic statements about Lambek operations on formal languages. This axiomatization is the Lambek calculus in its non-sequential form.
The sequential formulation of the Lambek calculus [1] is as follows. Formulae are constructed from variables () using three connectives: , , . (We use capital Latin letters both for languages and for Lambek formulae.) Sequents are expressions of the form , where the antecedent is a sequence of formulae and the succedent is one formula (intuitionistic style). The calculus includes axioms of the form and the following rules of inference:
An L-model, formally, is a mapping of Lambek formulae to subsets of (languages without the empty word), which commutes with Lambek operations: , , and . A sequent is true in this model, if .
According to Lambek’s non-emptiness restriction, all sequents in derivations are required to have non-empty antecedents. This constraint is motivated by linguistic applications: without it, Lambek categorial grammars generate ungrammatical sentences [2, § 2.5].
Abolishing Lambek’s restriction—that is, removing constraints “ is non-empty” on and —yields the Lambek calculus allowing empty antecedents, denoted by [3]. Language models are easily adapted for the case of : now we consider languages, which are subsets of (that is, they are allowed to include the empty word ). The definition of division operations is also modified: for models of ,
This modification can alter the values of and even if and do not contain the empty word. For example, now always includes , and therefore is always a subset of . Hence, is not a conservative extension of : the sequent has a non-empty antecedent, but is derivable only in , not in . For these modified L-models, let us use the term L-models.
In an L-model , a sequent of the form is true if , and a sequent of the form , with an empty antecedent, is true if .
Completeness theorems for and w.r.t. corresponding versions of L-models were proved by Pentus [4, 5]. Pentus’ proofs are highly non-trivial. If one considers the fragment without (the product-free fragment), however, proving L-completeness becomes much easier. This was done by Buszkowski [6]; Buszkowski’s proof applies both to and , w.r.t. L-models and L-models, respectively.
Besides product and two divisions, natural operations on formal languages include set-theoretic intersection and union. These operations correspond to so-called additive conjunction and disjunction. Additive operations are usually axiomatized by the following inference rules (cf. [7]):
The Lambek calculus extended with these rules is denoted by (multiplicative-additive Lambek calculus); is the variant of without Lambek’s restriction (that is, allowing empty antecedents). L-completeness, however, fails for in general. Further, in Section 2, we discuss this issue in detail.
Following Abrusci [8], we put the Lambek calculus into a broader context of linear logic. Namely, can be viewed as a fragment of intuitionistic non-commutative linear logic. (This fragment includes multiplicative and additive operations, but lacks the exponential and constants.) We also consider commutative systems: intuitionistic linear logic and intuitionistic affine logic .
Calculi and can be obtained from by adding structural rules: permutation for and permutation and weakening for . In the language of , the rules of permutation and weakening are formulated as follows:
Adding only weakening yields non-commutative intuitionistic affine logic, or affine (monotone) multiplicative-additive Lambek calculus. We denote this system by (in the presence of extra structural rules, we do not impose Lambek’s restriction).
We shall also use alternative calculi for the commutative systems and , in which structural rules are hidden in axioms and in the format of sequents. First, we change the language of formulae, introducing one connective instead of and (these are equivalent in and ). We also write instead of , following Girard’s [9] linear logic notations.
Sequents are now going to be expressions of the form , where is a multiset of formulae. Further means , and means , where is multiset union.
Axioms are of the form , for each variable , in the case of , and of the form for . Inference rules for both systems are as follows:
For , the weakening rule is not officially included in the system, but is admissible:
(it is hidden in axioms).
The cut rule of the following form is admissible both in and :
This is shown by a standard inductive argument.
Finally, let us introduce the multiplicative unit constant, . The unit constant is added to systems without Lambek’s restriction extending (i.e., itself, , , , ). The Lambek calculus with the unit, [10], is obtained from by adding one axiom, (its antecedent is empty), and one inference rule,
L-completeness, however, does not hold for . Indeed, since should be the unit w.r.t. , that is for any , in L-models it should be interpreted as . The following sequent is a counter-example for L-completeness: . This sequent is true in all models for any interpretation of , but is not derivable in .
Throughout this paper, we shall frequently consider fragments of the calculi defined above in languages with restricted sets of connectives. Such a fragment will be denoted by the name of the calculus, followed by the list of connectives in parentheses: e.g., .
2 Distributivity Law in Fragments with One Additive
It is well known, that is incomplete w.r.t. L-models. The reason is the distributivity principle,
On one hand, this principle is obviously true in all L-models. On the other hand, as noticed by Ono and Komori [11], one needs the structural rules of contraction and weakening to derive it. In particular, the distributivity principle is not derivable in , , , , and .
The distributivity principle, as formulated above, includes both additive connectives, and . We investigate fragments of with only one additive, or . The result of our study is that with respect to distributivity and behave in opposite ways.
Let denote with the distributivity principle added as an extra axiom scheme. In the presence of this axiom scheme, we have to keep cut as an official rule of the system (it is now not eliminable). A hypersequential system for was developed by Kozak [12].
Let us restrict ourselves to the product-free language (with product, proving L-completeness is hard even without extra connections [4, 5]). We also consider calculi without the unit constant: issues connected with are discussed in Section 3. Thus, we consider two fragments of the multiplicative-additive Lambek calculus: and , and the corresponding fragments of bigger system up to . (For commutative calculi, we have only one implication, that is, consider fragments in the language of and .)
As shown by Buszkowski [6], is complete w.r.t. L-models. This yields the following corollary: is a conservative fragment of both and . Indeed, any sequent provable in is true in all L-models; if it is in the language of , it is derivable in by L-completeness. In other words, the distributivity principle has no non-trivial corollaries in the language of .
The situation with is opposite. Namely, we present a corollary of the distributivity principle in the language of , which is not provable in . Thus, is not a conservative fragment of , and is therefore incomplete w.r.t. L-models. Moreover, we show that this effect is of a more general nature. Namely, the same holds for the corresponding fragments of , , , and : distributivity has no new corollaries in the language with , but has such in the language with .
2.1 Completeness with Additive Conjunction Only
For the first series of results, concerning , we give a semantic proof. For each system, we consider a specific version of L-semantics. For and , these are L-models and L-models respectively. For other systems, let us first give some definitions and prove correctness statements for them.
Definition 1.
A language is called monotone, if for any word and an arbitrary word the word also belongs to .
Proposition 1.
If and are both monotone, then so are , , and .
Proof.
Let . Then for any we have . Now take for an arbitrary . By monotonicity of , the word is also in . Since this holds for any , we get . The reasoning for is symmetric. The case of is trivial. ∎
Definition 2.
A language is called commutative, if for any word belonging to and an arbitrary transposition on the word also belongs to .
Commutative languages are in one-to-one correspondence with multisets of letters from . Thus, we can define the operation of multiset union, , for two commutative languages and , which can be expressed as follows:
If is a commutative language, then if and only if . Therefore, for commutative and , we have ; we denote this language by .
Proposition 2.
If and are commutative, then so is and .
Proof.
Commutativity of is obvious. For , take any and let . Now for any . By definiton of , we have . Now by commutativity of , the word also belongs to . Indeed, it is obtained from by the following transposition:
Since was taken arbitrarily, we conclude that . ∎
Having the class of monotone languages and the class of commutative languages closed under our operations (, , ), we can define the classes of restricted L-models for all our systems.
Definition 3.
An L-model is monotone, if all languages in it are monotone. Truth of sequents is defined as in ordinary L-models.
Definition 4.
A commutative L-model is an L-model, where all languages are commutative.
In commutative models actually plays the role of product (while we do not have product as a connective, we still have the sequential comma, which is a hidden product), due to the following fact.
Proposition 3.
In a commutative L-model , a sequent is true if and only if .
Proof.
The “if” part is due to the fact that . The “only if” part holds since is closed under transpositions. ∎
Now we prove an extension of Buszkowski’s completeness result
Theorem 4.
Each of , , , , is sound and complete w.r.t. the corresponding class of models, according to the following table:
Calculus | Models |
---|---|
L-models | |
L-models | |
monotone L-models | |
commutative L-models | |
L-models, which are both monotone and commutative |
Proof.
The cases of and are due to Buszkowski [6]. Let us consider the remaining three systems.
The soundness part is easy: our conditions on models were specifically designed to reflect structural rules. In a monotone model, if , then also , thus the weakening rule is valid. If we have a commutative L-model, then the permutation rule is valid. This is obvious from Proposition 3: unlike , is just commutative. All other rules and axioms are valid in arbitrary L-models.
Completeness is proved by Buszkowski’s canonical model argument. We do it uniformly for all systems. In the canonical model, the alphabet is the set of all formulae of the given calculus, and for any formula let
First we show that is indeed an L-model:
This is performed exactly as in Buszkowski’s proof. Indeed, if , then for an arbitrary we have and . Applying cut with , we get derivable in our system. Thus, , therefore . Notice that cut is available in all systems we consider. Dually, if , then, since by the axiom, . This means derivability , thus . Hence, .
The case is symmetric. For , we use the equivalence if and only if and . Here the “if” part is an application of , and the “only if” part is by cut with and .
Next, is easy to see that the canonical model belongs to the corresponding class of models: monotone for , commutative for , commutative and monotone for .
Finally, suppose a sequent is not derivable. Consider two cases. If is non-empty, then, since each belongs to , we have . On the other hand, . This falsifies under interpretation . If is empty, then we have , which again falsifies . This finishes the completeness proof. ∎
It is easy to see that soundness actually extends to the language with (interpreted as set-theoretic union). Unions of monotone languages are also monotone, the same for commutative languages. The situation with product is a bit more complicated for commutative systems, since is usually not commutative, even for commutative and . Thus, we have to alter the definition of language models in the commutative case, requiring instead of . Under this modification, soundness holds for product also. Finally, notice that in all models we consider and are interpreted set-theoretically, thus, obey the distributivity law. These considerations yield the following soundness result:
Proposition 5.
Each of , , , , is sound w.r.t. the corresponding class of models, according to the table in Theorem 4; for and in the models we use to interpret .
Now we are ready to state and prove our conservativity result.
Theorem 6.
The systems in the restricted language without , , , , , and are conservative fragments of , , , , and respectively.
2.2 Incompleteness with Additive Disjunction Only
If we take instead of , however, no analog of the conservativity result like Theorem 6 is possible, due to the following counter-example.
Theorem 7.
The sequent
is derivable in but this sequent is not derivable in .
This sequent is in the language of . The theorem states that it is derivable in , and therefore in all its extensions up to , but not in the corresponding () fragments without the distributivity law added. Thus, this is a non-trivial corollary of in the language without . In particular, Theorem 7 implies that is incomplete w.r.t. L-models, as well as , , , are incomplete w.r.t. the corresponding modifications of L-models (compare with Theorem 4).
Before proving Theorem 7, let us make some remarks. First, let us notice that the sequent in this theorem is slightly different from the one in our WoLLIC 2019 paper [13], where one variable is used for and . The reason is that the old example happens to be derivable in (but still not in and weaker systems).
Second, the hard part of Theorem 7 is, of course, the second one (non-derivability). Fortunately, the derivability problem in is algorithmically decidable (belongs to PSPACE), thus, it is possible to establish non-derivability by exhaustive proof search. This proof search was first performed, as a pre-verification of the result, automatically using an affine modification of llprover by Tamura [14]. (For the WoLLIC 2019 paper, we used a prover by Jipsen [15], based on the algorithm by Okada and Terui [16].) In order to make this article self-contained and independent from proof-search software, here we present a complete manual proof search.
One of the WoLLIC 2019 reviewers suggested a shorter method of proving non-derivability of the given sequent in , via an algebraic counter-model. This counter-model is a commutative residuated lattice on the set . The order is defined as follows: ; are incomparable. Product and residual are defined as follows:
(In the commutative situation, we have only one residual, which we denote by .) Variables are interpreted as follows: as , as , and both as . This algebraic model falsifies the sequent in Theorem 7. However, is insufficient for our new purposes. The reason is that in this model , while in the presence of weakening should be true. Thus, in order to establish non-derivability of our sequent not only in , but also in , we use the good old syntactic method.
Proof of Theorem 7.
The first statement is proved using the joining (diamond) construction, the idea of which goes back to Lambek [1] and Pentus [17]. Indeed, let and . Then is equivalent to . One can easily check derivability of and in . Notice that the antecedent of this sequent is exactly the one in the sequent of our theorem. Next, we derive , and further by distributivity .
The second statement is proved by an exhaustive proof search for the sequent
(the translation of our sequent into the commutative language) in .
In order to facilitate proof search, we take into account the following considerations.
First, the rules and are invertible. Thus, we can suppose they are applied immediately. Moreover, has two premises, and when disproving derivability we have the right to choose one and establish non-derivability there.
Second, we can suppose that in our (hypothetic) derivation instances of of the form are directly preceded by axioms. Indeed, such instances are interchangeable upwards with and , and cannot appear before this , since is a variable. Other rules are impossible by the polarized subformula property.
Third, we establish non-derivability of several sequents, which will appear frequently in our proof search:
(1) |
(2) |
(3) |
(4) |
(5) |
(6) |
(7) |
(8) |
Now we are ready to start proof search. First we invert introducing and choose :
Now we have a choice of 4 principal connectives (denoted by numbers in the sequent) to be decomposed first.
Case 1. In this case, we use , thanks to our consideration that with should be applied immediately after an axiom.
Invert and , choosing out of :
Now we can decompose (by ) one of the implications 2–4, and for each we have a choice of ways of splitting the rest of the antecedent into and . Making use of the weakening rule, however, we can reduce the number of cases.
Subcase 1–2. If includes , then the right premise is , where is a subset of . Notice that if and the sequent is not derivable with , it is also not derivable with (otherwise we could derive it with using the weakening rule). However, the sequent is not derivable even with the maximal :
Indeed, invert and choose :
Here one should use , but then in its right premise we can again invert choosing , which yields one of:
None of these is derivable.
If does not include , then is a subset of , and we again take the maximal in the left premise:
(9) |
Decomposing yields either or , both not derivable by (1) and (2). Thus, we have to decompose on the right.
Taking (and inverting ) yields
Now we again have to use . The new cases are and , both not derivable (3)(4).
Subcase 1–3. Apply for and consider its left premise with the maximal possible :
(10) |
Subsubcase 1–3–2. Decompose . If the big formula with goes to the new , then the new is either or empty. However, neither nor is derivable (6)(7). If the formula with goes to the new , then the new is either or empty. This gives, at maximum, , which is falsified by choosing in the inverted : .
Subsubcase 1–3–4. Decompose . Again, if the big formula (now with ) goes to the new , we falsify the left premise by (1) or (2). Otherwise, the right premise is, at maximum, , which is again falsified by choosing .
Subcase 1–4. If includes , then the right premise is, at maximum,
Invert and choose :
Now we have to use . Its right premise is, at maximum, . Choosing falsifies it.
If is in , then the maximal version of the left premise is
(11) |
Applying right now is impossible: its left premise gets falsified by (7) or (6). Apply (recall that is used only directly below axiom) and invert :
Here the left premise of is also falsified by (7), (6), or (8).
Case 2. Consider again two cases, depending on whether goes to or to . If it goes to , then the right premise is, at maximum,
Invert and choose :
(12) |
For reusal of our reasoning in further cases, we shall falsify a stronger sequent
(13) |
Indeed, , and therefore is derivable in . Thus, if (12) happens to be derivable then, by cut, so will be (13).
Now we decompose one of , , in (13).
Subcase 2––1. Recall that we never choose in , and invert :
Invert and choose :
Subsubcase 2––1–5. The left premise is, at maximum,
Applying is impossible, since its right premise is falsified by choosing : and .
Subsubcase 2––1–4. Again, if goes to the new , then the right premise is, at maximum, which is falsified by choosing . If it goes to the new , then the new left premise is, at maximum, , which is not derivable by (6).
Subcase 2––4. If the new is empty, then the left premise is falsified by (7). Otherwise, the right premise is
Invert and choose :
Applying is impossible (); also . Thus, we have to use , and we can immediately apply afterwards: Inverting and choosing falsifies this sequent: .
Subcase 2––5. The left premise is, at maximum,
This is not derivable.
Now let, in Case 2, go to . Then the left premise is, at maximum,
This sequent is stronger than (9)—that is, (9) can be obtained from it by weakening. Therefore, it cannot be derivable, since we’ve already falsified (9) in Case 1.
Case 3. Take the maximal possible and consider the left premise:
This sequent is stronger than (10), and therefore not derivable: (10) was falsified in Case 1.
Case 4. If goes to , then the maximal version of the right premise of is
Invert and choose :
Suppose this sequent is derivable. Then it will also be derivable after swapping variables and :
This sequent, however, is exactly (13), up to commutativity; (13) was falsified in Case 2.
Finally, if , in Case 4, goes to , then the maximal version of the left premise of is
This sequent is stronger than (11) and therefore cannot be derivable. ∎
3 Undecidability with , , and
3.1 The System and Its Undecidability
In this section we consider the extension of the Lambek calculus with the multiplicative unit constant. The language of our fragment will be as follows: , , . As shown by Buszkowski [6], in the fragment of and the Lambek calculus with empty antecedents is complete w.r.t. L-models. As noticed in the Introduction, however, this is not the case if we add . In L-models, because of the principle , the unit constant is necessarily interpreted as the singleton set , where is the empty word. (In the presence of the unit constant, we allow the empty word to belong to our languages and abolish Lambek’s non-emptiness restriction.) This particular interpretation of satisfies certain principles, including and . Moreover, these principles keep valid for languages of the form (for any ). Indeed, this language is either or , and for the empty set we also have and .
Below we present a calculus denoted by , which reflects these principles as sequential rules:
The rules and are called “commuting” rules; they reflect the fact that, for any set , and . The “doubling” rule is caused by and . Thus, these rules express natural algebraic properties of the interpretation of as . However, we emphasize that they are not admissible in the standard calculus , introduced by Lambek [10], that is, non-commutative intuitionistic multiplicative-additive linear logic.
The rules , , and are not new. Their underlying principles, namely, and appear in works of the Hungarian school (Andréka, Mikulás, Németi). Namely, in [18] one can find the first of these equivalences (denoted there as formula 3.2), as one of the principles which is true in language algebras, but not in algebras of binary relations. The second equivalence is true for binary relations also; formula (CbI) in [19] is actually its stronger version, . We get our by taking .
Andréka, Mikulás, and Sain [20] also sketch an undecidability proof for a system related to the one considered here. Their proof is based on the technique of Kurucz et al. [21]. The system considered in [20] is the logic of residuated distributive lattices over monoids. Unlike the case we consider in this section, their system requires product, the unit and also the zero constant (the minimal element of the lattice) to be present in the language. Here we require only division, additive conjunction, and the unit. The trade-off is that we consider a narrower class of models. Namely, we consider only L-models, and these models, as shown above, allow extra principles for .
We do not claim that is an L-complete system. Indeed, the L-complete extension of happens to be quite involved (cf. [22]). In particular, it is still an open problem whether such a complete system is recursively enumerable. The cut rule is not included in , so all our derivations will be cut-free. We do not claim that cut is admissible in this system.
We prove undecidability for the whole range of systems between and the L-complete system in the language of , , .
Theorem 8.
Let be an L-sound logic which includes . Then the derivability problem for is undecidable.
Our undecidability proof is based on encoding computations of 2-counter Minsky machines [23]. In the forward encoding, from Minsky computations to derivations in our calculus, we present explicit derivations in . For the backwards direction, from derivations to computations, we use a semantic approach using L-models (cf. [24, 25, 16], where phase semantics was used for similar purposes). Thus, we get undecidability not only for itself, but for the whole range of its L-sound extensions.
Before going further, let us introduce the relative double negation construction. We fix a variable (atomic proposition) and define relative negation as
The term “negation” here is motivated as follows. In linear logic with the falsity constant , negation is expressed as . Here we do the same non-commutatively, but due to lack of the constant we replace it by a fixed variable. This is the minimal logic approach: variable can be read as “false,” but no specific axioms like (ex falso) are imposed for .
The relative double negation now is
Notice the difference from the more usual in the Lambek calculus “type raising” version of something like double negation: . In our setting, we have neither (due to the intuitionistic nature of the Lambek calculus), nor (due to non-commutativity; in contrast, is derivable). Nevertheless, will be useful for our construction.
Given a sequence of formulae and a formula , we introduce the notation
In particular,
and
In what follows, we suppose that the bb operation has a higher priority than ordinary division .
Consider a non-deterministic Minsky machine with a finite set of states . A configuration of is a triple , where is the current state and and are the current values of ’s two counters. The counters themselves are denoted by and . The configuration is considered the final one; the initial configuration can be taken arbitrarily.
Configurations of Minsky machines are encoded as follows. We introduce distinct variables , , , , , , …, and represent configuration as
In particular, the final configuration is represented as .
Minsky instructions are encoded according to the following table:
Instruction | Formula |
---|---|
inc | |
inc | |
dec | |
dec | |
jz | |
jz |
Here instruction inc (increment) means “at state , increase by 1 and go to ” (). Instruction dec (decrement) means “at state , decrease by 1 and go to .” If , then this instruction cannot be applied. Finally, jz (zero-test) means “at state , if , go to .” Now if , then the instruction cannot be applied.
Notice that our version of zero-test and decrement instructions are very restrictive. Once the counter has a wrong value (zero for decreasing or non-zero for zero-test), the machine just fails to proceed. Usually, in such cases the machine is allowed to perform conditional branching (e.g., zero-test jumps to if the counter is zero and safely stays at if not). These restrictions, however, are compensated by the allowed non-determinism of . Indeed, the compound jzdec instruction from Minsky’s original formalism [26], meaning “at state , if , decrease by one and go to , and if , go to ,” is modelled by adding simultaneously two instructions: dec and jz. This non-deterministically branches computation; however, exactly one branch (depending on whether ) could be successful, the other one immediately fails.
Let us denote the set of our variables, except , by :
Finally, the Minsky machine is represented by the following formula
Here in the first big conjunction ranges among all instructions of .
Now we are ready to state our main encoding theorems.
Theorem 9.
If can reach the final configuration , starting from , then the following sequent is derivable in :
Theorem 10.
If the sequent is true in all L-models, then can reach from .
Notice that our encodings are in a sense “upside-down”: the starting configuration corresponds to the goal sequent in our derivation, and the sequent encoding the final configuration is on the top of the derivation, very close to axioms (see proof of Theorem 9 below). The right intuition here is to consider the derivation in the direction of proof search, developing from the goal up to axioms. This direction correctly reflects the direction of Minsky computation.
Theorem 8 (our undecidability result) immediately follows from Theorems 9 and 10. Indeed, if is a logic which is L-sound and includes , then is provable in if and only if can reach from . Indeed, the “if” direction is by Theorem 9, and the “only if” direciton is by Theorem 10. Since reachability in Minsky computations is undecidable, we get undecidability of .
Notice that each formula in the big conjunction , except the first one, is of the form . The key lemma for such formulae, in the view of Theorem 9, is as follows.
Lemma 11.
If the big conjunction includes and is derivable in , then so is .
Proof.
The derivation is as follows:
∎
Corollary 12 (“Post-ish productions”).
Let and be sequences of variables from (no complex formulae). Then, provided that includes for any , the sequent is derivable in from .
Proof.
It is sufficient to consider the case of ; then we proceed by induction on the length of . For , we apply Lemma 11 with . ∎
Corollary 13 (One step of Minsky computation).
Suppose the Minsky machine can make a computation step from configuration to configuration , and let be the instance of for . Then is derivable from in .
Proof.
The proof is performed uniformly for all Minsky instructions. For any instruction , the corresponding formula is of the form . On the other hand, is obtained from by replacing with in the antecedent.
For example, for the instruction inc in the center of we have , which is replaced with in . This exactly corresponds to the computation step: the number of ’s (that is, the value of ) gets increased by 1, and the state is changed to . For jz, the replacement happens at the edge of the antecedent, involving or .
Thus, is of the form and is . Now we derive from in the following way:
∎
Now we are ready to prove Theorem 9.
Proof of Theorem 9.
Using Corollary 13 and induction on the number of steps in Minsky computation from to , we derive from
This sequent is derived as follows:
∎
The backwards direction, Theorem 10, is proved by constructing a specific L-model. Let and fefine as the set of “terminating words” for , defined as follows:
Now define the interpreting function on variables as follows:
Lemma 14.
.
Proof.
It is sufficient to show that , that is, belongs to interpretation of all formulae in the big conjunction .
First, . Indeed, , thus, we have to show that . This is indeed so by the definition of , since is reachable from itself in zero steps.
Second, we prove that for each instruction of . Recall that , and if instruction changes the configuration from to , then the code of the second configuration is obtained from the code of the first one by replacing with . In other words, the code of is and the code of is . We have to prove that . Since ( contains only letters from ), this means that should belong to .
In turn, means that for any word we have . The fact that , since , actually means that . Thus, we have to prove, for an arbitrary , that if , then .
If , then we have , and . Here cannot be split between and , because any word in should begin with and end on . This means that is a code of some configuration , from which can reach the final configuration. As noticed above, this means that encodes a configuration , which transforms into by applying instruction . Therefore, from our Minsky machine can also reach the final state, hence . This yields , which is our goal.
Third, consider , where . We have to show that , that is, . The latter means that for any the word should belong to . This is indeed so: if , then , and since is closed under cyclic transpositions, also . ∎
Now we are ready to prove Theorem 10.
Proof of Theorem 10.
If is true in all L-models, it is true in the specific model defined above. By Lemma 14, ; for any . Thus, we have
and therefore
(No cyclic transposition is possible, since and should start and end the word.)
By definition of , this means that can reach the final state , starting from . ∎
3.2 Models on Regular Languages with the Unit Constant
Let denote the set of all sequents in the language of , , which are true in all L-models, that is, the complete theory of this class of models.
As noticed above, the question of axiomatizing this theory is quite involved. We know that this theory includes , introduced in the previous section, but it is probably much more complicated. For example, as shown in [22], Sobociński’s 3-valued logic can be embedded into .
It follows from Theorem 8 that is undecidable. More precisely, it is -hard (hard w.r.t. the class of recursively enumerable sets). The upper complexity bound, however, is not known: this theory could possibly be even not recursively enumerable. Having the algorithmic complexity question for open, we can still obtain an interesting corollary of our complexity estimations.
Recall the standard notion of regular expression. Regular expressions are constructed from constants and using two binary operations, and , and one unary operation, ∗. The language described by a given regular expression is defined recursively:
Languages described by regular expressions are called regular languages.
By Lreg-models let us denote a subclass of L-models in which every variable as interpreted as a regular language, that is, a set of words described by a regular expression. It is well-known that the class of regular languages is closed under intersection (see, for example, [27, Theorem 2.8]). Moreover, it is also closed under division:
Proposition 15.
If and are regular languages, then so are and .
Proof.
A more well-known fact (see, for example, [27, Exercise 2.3.17a]) is that the class of regular languages is closed under the following modified division operation with the existential quantifier instead of the universal one: . Our “normal” division can be reduced to by the complement operation: , where . Since the class of regular languages is closed under and complement (again, see [27, Theorem 2.8]), it is also closed under . The case is symmetric. ∎
Thus, in Lreg-models interpretations of all formulae are regular languages.
In the language without the unit constant, namely, , , , the theory of Lreg-models coincides with the theory of all L-models:
Proposition 16.
Proof.
On the one hand, the calculus is sound w.r.t. all L-models. On the other hand, as shown by Buszkowski [28], it is complete w.r.t. a class of models which is even narrower than the class of Lreg-models. Namely, is complete w.r.t. the class of L-models in which variables are interpreted by cofinite languages. (A cofinite language is a language which includes all words over a given alphabet, except for a finite set.) In this case, formulae are interpreted by cofinite or finite languages, and any finite or cofinite language is regular. Therefore, both and are axiomatized by the same calculus . ∎
The unit changes things dramatically. With the unit, there is no completeness result, like Theorem 4, but also no equivalence between theories of all L-models and Lreg-models.
Theorem 17.
Proof.
As follows from Theorem 8, is -hard. On the other hand, following Buszkowski [29], we can show that belongs to the class. Indeed, a sequent belongs to this theory if and only if it is true in all Lreg-models. A concrete sequent includes only a finite number of variables, , …, . Thus, a model for this sequent is defined by a finite number of regular expressions , …, , which describe the languages , …, . This means that the general truth condition for this sequent can be written down as the following formula:
Quantifiers , …, can be encoded as quantifiers over natural numbers representing the regular expressions. The quantifier-free part of the formula (truth condition under a concrete ) is decidable, because all necessary operations on regular expressions are computable. Thus, we get a representation of the set of sequents which are true in all Lreg-models.
It is well known that a set cannot belong to and be -hard at the same time. (Otherwise, for any set in there would be a computable reduction to a set in , which would yield , which is not the case.) Therefore,
∎
Notice that our proof of Theorem 8 does not apply to , because the language there is non-regular (in fact, it is undecidable).
Since the class of Lreg-models is narrower than the class of all L-models, we have (by Galois connection) an inverted inclusion of theories:
By our Theorem 17, this inclusion is strict. Thus, the other inclusion should fail:
In other words, there exists a sequent which is true in all Lreg-models, but not in all L-models. Our proof, however, is non-constructive, and we do not present a concrete example of such sequent. Constructing such a concrete example is left for further research.
Notice that if we apply the reasoning establishing the upper bound of to , we shall have to quantify over arbitrary formal languages , …, . This results in hyperarithmetical quantifiers, and yields only a very high, complexity upper bound for .
4 Concluding Remarks
In this article, we have investigated language interpretations of natural extensions of the Lambek calculus: with additive operations ( and ) and with additive conjunction () and the unit constant ().
For extensions with additive connectives (Section 2), we have shown that conjunction and disjunction show significantly different behaviour. It is known that adding both conjunction and disjunction leads to incompleteness due to the distributivity law . This law is true in all language models, but not derivable in the multiplicative-additive Lambek calculus (). Adding only conjunction, however, still provides completeness. Any sequent in the language of , , (but not ) that is derivable with the help of , is also derivable without it. For disjunction the situation is opposite: there exists a sequent in the language of , , , which is derivable using , but not derivable without it.
Moreover, this difference between and keeps valid for systems with permutation and/or weakening structural rules, that is, intuitionistic linear (), and affine () logics and affine .
For the extension of the Lambek calculus with the unit, , it is well-known that its standard axiomatization in the style of linear logic does not give an L-complete system. In Section 3, we present a system in the language , where rules for reflect natural algebraic properties of the empty word in the algebra of formal languages. This system is denoted by . We do not claim L-completeness of . Instead, we consider the whole range of logics between and the L-complete system denoted by . For any logic within this range, we show that it is undecidable; more precisely, -complete. As a corollary, we also show that, in the language of , the complete theory of all L-models is different from that of Lreg-models, where formulae are interpreted by regular languages.
A preliminary version of this article was presented at WoLLIC 2019 and published in its lecture notes [13]. Let us briefly list the results which are new in this article, compared to the WoLLIC paper.
-
1.
In the language without additive conjunction, we show incompleteness not only for , but also for its extensions: , , , and .
-
2.
We prove that is a conservative fragment of extended with the distributity law . Moreover, we prove similar results for , , , and .
-
3.
We prove that, in the language including , the theory of all L-models is different from the theory of LReg-models, in which formulae are interpreted by regular languages. In the language of (without ), the corresponding theories coincide due to a completeness result by Buszkowski [28].
While in Section 2 we have presented a quite completed study, Section 3 leaves many questions open for further investigations. Among these, we would like to emphasize the following ones.
-
1.
The question of axiomatization, or even recursive enumerability for complete theories and is still open, and potentially very hard. Notice that these theories are different (Theorem 17) and that for enumerability will immediately mean decidability.
-
2.
A possibly easier question would be to construct a concrete formula distinguishing and . That is, we are looking for an explicit example for Theorem 17.
-
3.
Without the unit, we know that
By the completeness theorem of Pentus [4, 5], the first equality is also true for the language of (with product instead of additive conjunction). There are two open questions. First, whether Pentus’ theorem is true for the language with both conjunctions (). Second, whether Pentus’ theorem is true for Lreg-models. Both questions are questions of making Pentus’ result stronger. Recalling that Pentus’ proofs are quite sophisticated, these questions are also probably very hard.
References
- [1] J. Lambek, The mathematics of sentence structure, Amer. Math. Monthly 65 (1958) 154–170.
- [2] R. Moot, C. Retoré, The logic of categorial grammars: a deductive account of natural language syntax and semantics, Vol. 6850 of LNCS, Springer, 2012.
- [3] J. Lambek, On the calculus of syntactic types, in: Structure of Language and Its Mathematical Aspects, Vol. 12 of Proc. Symposia Appl. Math., AMS, 1961, pp. 166–178.
- [4] M. Pentus, Models for the Lambek calculus, Annals of Pure and Applied Logic 75 (1–2) (1995) 179–213.
- [5] M. Pentus, Free monoid completeness of the Lambek calculus allowing empty premises, in: Proc. Logic Colloquium ’96, Vol. 12 of Lect. Notes Logic, Springer, 1998, pp. 171–209.
- [6] W. Buszkowski, Compatibility of a categorial grammar with an associated category system, Zeitschr. Math. Logik Grundl. Math. (Math. Logic Q.) 28 (1982) 229–238.
- [7] M. Kanazawa, The Lambek calculus enriched with additional connectives, J. Logic Lang. Inform. 1 (2) (1992) 141–171.
- [8] V. M. Abrusci, A comparison between Lambek syntactic calculus and intuitionistic linear logic, Zeitschr. math. Logik Grundl. Math. (Math. Logic Q.) 36 (1990) 11–15.
- [9] J.-Y. Girard, Linear logic, Theor. Comput. Sci. 50 (1) (1987) 1–102.
- [10] J. Lambek, Deductive systems and categories II. Standard constructions and closed categories, in: Category Theory, Homology Theory and Their Applications I, Vol. 86 of Lect. Notes Math., Springer, 1969, pp. 76–122.
- [11] H. Ono, Y. Komori, Logics without contraction rule, J. Symb. Log. 50 (1) (1985) 169–201.
- [12] M. Kozak, Distributive full Lambek calculus has the finite model property, Studia Logica 91 (2009) 201–216.
- [13] M. Kanovich, S. Kuznetsov, A. Scedrov, L-models and R-models for Lambek calculus enriched with additives and the multiplicative unit, in: WoLLIC 2019: Logic, Language, Information, and Computation, Vol. 11541 of Lect. Notes Comput. Sci., 2019, pp. 373–391.
- [14] N. Tamura, A linear logic prover (llprover), http://bach.istc.kobe-u.ac.jp/llprover/ (1998–2007).
- [15] P. Jipsen, Deciding equations in residuated lattices, http://www1.chapman.edu/~jipsen/reslat/.
- [16] M. Okada, K. Terui, The finite model property for various fragments of intuitionistic linear logic, J. Symb. Log. 64 (2) (1999) 790–802.
- [17] M. Pentus, The conjoinability relation in Lambek calculus and linear logic, J. Log. Lang. Inform. 3 (1994) 121–140.
- [18] H. Andréka, S. Mikulás, I. Németi, The equational theory of Kleene lattices, Theoretical Computer Science 412 (52) (2011) 7099–7108.
- [19] H. Andréka, S. Mikulás, Axiomatizability of positive algebras of binary relations, Algebra Universalis 66 (2011) 7–34.
- [20] H. Andréka, I. Németi, I. Sain, Some new landmarks on the roadmap of two dimensional logics, in: J. van Eijck, A. Visser (Eds.), Logic and Information Flow, MIT Press, 1994, pp. 163–169.
- [21] A. Kurucz, I. Németi, I. Sain, A. Simon, Undecidable varieties of semilattice-ordered semigroups, of Boolean algebras with operators, and logics extending Lambek calculus, Bulletin of the IGPL 1 (1) (1993) 91–98.
- [22] S. L. Kuznetsov, Trivalent logics arising from L-models for the Lambek calculus with constants, Journal of Applied Non-Classical Logics 24 (1–2) (2014) 132–137.
- [23] M. Kanovich, The direct simulation of Minsky machines in linear logic, in: Advances in Linear Logic, Vol. 222 of London Mathematical Society Lecture Notes, Cambridge University Press, 1995, pp. 123–145.
- [24] Y. Lafont, The undecidability of second order linear logic without exponentials, Journal of Symbolic Logic 61 (2) (1996) 541–548.
- [25] Y. Lafont, A. Scedrov, The undecidability of second order multiplicative linear logic, Information and Computation 125 (1) (1996) 46–51.
- [26] M. L. Minsky, Recursive unsolvability of Post’s problem of “Tag” and other topics in theory of Turing machines, Annals of Mathematics 74 (3) (1961) 437–455.
- [27] A. V. Aho, J. D. Ullman, The theory of parsing, translation, and compiling. Vol. I: Parsing, Prentice-Hall, 1972.
- [28] W. Buszkowski, The finite model property for BCI and related systems, Studia Logica 57 (1996) 303–323.
- [29] W. Buszkowski, On the complexity of the equational theory of relational action algebras, in: RelMiCS 2006: Relations and Kleene Algebra in Computer Science, Vol. 4136 of LNCS, Springer, 2006, pp. 106–119.