This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Language Models for Some Extensions of the Lambek Calculus

Max Kanovich Stepan Kuznetsov Andre Scedrov University College London Steklov Mathematical Institute of RAS University of Pennsylvania National Research University Higher School of Economics
Abstract

We investigate language interpretations of two extensions of the Lambek calculus: with additive conjunction and disjunction and with additive conjunction and the unit constant. For extensions with additive connectives, we show that conjunction and disjunction behave differently. Adding both of them leads to incompleteness due to the distributivity law. We show that with conjunction only no issues with distributivity arise. In contrast, there exists a corollary of the distributivity law in the language with disjunction only which is not derivable in the non-distributive system. Moreover, this difference keeps valid for systems with permutation and/or weakening structural rules, that is, intuitionistic linear and affine logics and affine multiplicative-additive Lambek calculus. For the extension of the Lambek with the unit constant, we present a calculus which reflects natural algebraic properties of the empty word. We do not claim completeness for this calculus, but we prove undecidability for the whole range of systems extending this minimal calculus and sound w.r.t. language models. As a corollary, we show that in the language with the unit there exissts a sequent that is true if all variables are interpreted by regular language, but not true in language models in general.

keywords:
Lambek calculus , language models , relational models , distributive law , incompleteness , undecidability
journal: Information and Computation

1 Introduction

The Lambek calculus was introduced by Joachim Lambek [1] for mathematical modelling of natural language syntax. This suggests the natural interpretation of the Lambek calculus as the algebraic logic of operations on formal languages. Such interpretations of the Lambek calculus are called language models, or L-models for short.

The Lambek calculus, as originally formulated by Lambek, includes three operations: \cdot (product), \\mathop{\backslash} (left division), and /\mathop{/} (right division). A distinctive feature of the Lambek calculus is the so-called Lambek’s non-emptiness restriction. In terms of L-models, this means that the empty word is disallowed, and we consider, for a given alphabet Σ\Sigma, subsets of Σ+\Sigma^{+}. Lambek operations on languages are defined as follows:

AB={uvuA,vB},\displaystyle A\cdot B=\{uv\mid u\in A,v\in B\},
A\B={uΣ+(vA)vuB},\displaystyle A\mathop{\backslash}B=\{u\in\Sigma^{+}\mid(\forall v\in A)\>vu\in B\},
B/A={uΣ+(vA)uvB}.\displaystyle B\mathop{/}A=\{u\in\Sigma^{+}\mid(\forall v\in A)\>uv\in B\}.

The division operations, \\mathop{\backslash} and /\mathop{/}, are indeed residuals of the product w.r.t. the subset relation:

BA\CABCAC/B.B\subseteq A\mathop{\backslash}C\iff A\cdot B\subseteq C\iff A\subseteq C\mathop{/}B.

These equivalences form the core of the Lambek calculus. Along with transitivity (ABCACA\subseteq B\subseteq C\Rightarrow A\subseteq C), reflexivity (AAA\subseteq A), and associativity (A(BC)=(AB)CA\cdot(B\cdot C)=(A\cdot B)\cdot C), they form a complete axiomatization of all generally true atomic statements about Lambek operations on formal languages. This axiomatization is the Lambek calculus in its non-sequential form.

The sequential formulation of the Lambek calculus [1] is as follows. Formulae are constructed from variables (p,q,r,p,q,r,\ldots) using three connectives: \cdot, \\mathop{\backslash}, /\mathop{/}. (We use capital Latin letters both for languages and for Lambek formulae.) Sequents are expressions of the form ΓC\Gamma\vdash C, where the antecedent Γ\Gamma is a sequence of formulae and the succedent CC is one formula (intuitionistic style). The calculus 𝐋\mathbf{L} includes axioms of the form AAA\vdash A and the following rules of inference:

ΠAΓ,B,ΔC\LΓ,Π,A\B,ΔCA,ΠB\R; Π is non-emptyΠA\B\Gamma,\Pi,A\mathop{\backslash}B,\Delta\vdash C\lx@proof@logical@and\Pi\vdash A\Gamma,B,\Delta\vdash C\qquad\Pi\vdash A\mathop{\backslash}BA,\Pi\vdash B
ΠAΓ,B,ΔC/LΓ,B/A,Π,ΔCΠ,AB/R; Π is non-emptyΠB/A\Gamma,B\mathop{/}A,\Pi,\Delta\vdash C\lx@proof@logical@and\Pi\vdash A\Gamma,B,\Delta\vdash C\qquad\Pi\vdash B\mathop{/}A\Pi,A\vdash B
Γ,A,B,ΔCLΓ,AB,ΔCΓAΔBRΓ,ΔAB\Gamma,A\cdot B,\Delta\vdash C\Gamma,A,B,\Delta\vdash C\qquad\Gamma,\Delta\vdash A\cdot B\lx@proof@logical@and\Gamma\vdash A\Delta\vdash B
ΠAΓ,A,ΔCCutΓ,Π,ΔC\Gamma,\Pi,\Delta\vdash C\lx@proof@logical@and\Pi\vdash A\Gamma,A,\Delta\vdash C

The cut rule is eliminable [1].

An L-model, formally, is a mapping ww of Lambek formulae to subsets of Σ+\Sigma^{+} (languages without the empty word), which commutes with Lambek operations: w(AB)=w(A)w(B)w(A\cdot B)=w(A)\cdot w(B), w(A\B)=w(A)\w(B)w(A\mathop{\backslash}B)=w(A)\mathop{\backslash}w(B), and w(B/A)=w(B)/w(A)w(B\mathop{/}A)=w(B)\mathop{/}w(A). A sequent A1,,AnBA_{1},\ldots,A_{n}\vdash B is true in this model, if w(A1)w(An)w(B)w(A_{1})\cdot\ldots\cdot w(A_{n})\subseteq w(B).

According to Lambek’s non-emptiness restriction, all sequents in derivations are required to have non-empty antecedents. This constraint is motivated by linguistic applications: without it, Lambek categorial grammars generate ungrammatical sentences [2, § 2.5].

Abolishing Lambek’s restriction—that is, removing constraints “Π\Pi is non-empty” on \R\mathop{\backslash}R and /R\mathop{/}R—yields the Lambek calculus allowing empty antecedents, denoted by 𝐋\mathbf{L}^{\boldsymbol{*}} [3]. Language models are easily adapted for the case of 𝐋\mathbf{L}^{\boldsymbol{*}}: now we consider languages, which are subsets of Σ\Sigma^{*} (that is, they are allowed to include the empty word ε\varepsilon). The definition of division operations is also modified: for models of 𝐋\mathbf{L}^{\boldsymbol{*}},

A\B={uΣ(vA)vuB},\displaystyle A\mathop{\backslash}B=\{u\in\Sigma^{*}\mid(\forall v\in A)\,vu\in B\},
B/A={uΣ(vA)uvB}.\displaystyle B\mathop{/}A=\{u\in\Sigma^{*}\mid(\forall v\in A)\,uv\in B\}.

This modification can alter the values of A\BA\mathop{\backslash}B and B/AB\mathop{/}A even if AA and BB do not contain the empty word. For example, A\AA\mathop{\backslash}A now always includes ε\varepsilon, and therefore (A\A)\B(A\mathop{\backslash}A)\mathop{\backslash}B is always a subset of BB. Hence, 𝐋\mathbf{L}^{\boldsymbol{*}} is not a conservative extension of 𝐋\mathbf{L}: the sequent (p\p)\qq(p\mathop{\backslash}p)\mathop{\backslash}q\vdash q has a non-empty antecedent, but is derivable only in 𝐋\mathbf{L}^{\boldsymbol{*}}, not in 𝐋\mathbf{L}. For these modified L-models, let us use the term Lε\varepsilon-models.

In an Lε\varepsilon-model ww, a sequent of the form A1,,AnBA_{1},\ldots,A_{n}\vdash B is true if w(A1)w(An)w(B)w(A_{1})\cdot\ldots\cdot w(A_{n})\subseteq w(B), and a sequent of the form B\vdash B, with an empty antecedent, is true if εw(B)\varepsilon\in w(B).

Completeness theorems for 𝐋\mathbf{L} and 𝐋\mathbf{L}^{\boldsymbol{*}} w.r.t. corresponding versions of L-models were proved by Pentus [4, 5]. Pentus’ proofs are highly non-trivial. If one considers the fragment without \cdot (the product-free fragment), however, proving L-completeness becomes much easier. This was done by Buszkowski [6]; Buszkowski’s proof applies both to 𝐋\mathbf{L} and 𝐋\mathbf{L}^{\boldsymbol{*}}, w.r.t. L-models and Lε\varepsilon-models, respectively.

Besides product and two divisions, natural operations on formal languages include set-theoretic intersection and union. These operations correspond to so-called additive conjunction and disjunction. Additive operations are usually axiomatized by the following inference rules (cf. [7]):

Γ,A,ΔCΓ,B,ΔCLΓ,AB,ΔCΠARlΠABΠBRrΠAB\Gamma,A\vee B,\Delta\vdash C\lx@proof@logical@and\Gamma,A,\Delta\vdash C\Gamma,B,\Delta\vdash C\qquad\Pi\vdash A\vee B\Pi\vdash A\qquad\Pi\vdash A\vee B\Pi\vdash B
Γ,A,ΔCLlΓ,AB,ΔCΓ,B,ΔCLrΓ,AB,ΔCΠAΠBRΠAB\Gamma,A\wedge B,\Delta\vdash C\Gamma,A,\Delta\vdash C\qquad\Gamma,A\wedge B,\Delta\vdash C\Gamma,B,\Delta\vdash C\qquad\Pi\vdash A\wedge B\lx@proof@logical@and\Pi\vdash A\Pi\vdash B

The Lambek calculus 𝐋\mathbf{L} extended with these rules is denoted by 𝐌𝐀𝐋𝐂\mathbf{MALC} (multiplicative-additive Lambek calculus); 𝐌𝐀𝐋𝐂\mathbf{MALC}^{\boldsymbol{*}} is the variant of 𝐌𝐀𝐋𝐂\mathbf{MALC} without Lambek’s restriction (that is, allowing empty antecedents). L-completeness, however, fails for 𝐌𝐀𝐋𝐂\mathbf{MALC} in general. Further, in Section 2, we discuss this issue in detail.

Following Abrusci [8], we put the Lambek calculus into a broader context of linear logic. Namely, 𝐌𝐀𝐋𝐂\mathbf{MALC}^{\boldsymbol{*}} can be viewed as a fragment of intuitionistic non-commutative linear logic. (This fragment includes multiplicative and additive operations, but lacks the exponential and constants.) We also consider commutative systems: intuitionistic linear logic 𝐈𝐋𝐋\mathbf{ILL} and intuitionistic affine logic 𝐈𝐀𝐋\mathbf{IAL}.

Calculi 𝐈𝐋𝐋\mathbf{ILL} and 𝐈𝐀𝐋\mathbf{IAL} can be obtained from 𝐌𝐀𝐋𝐂\mathbf{MALC}^{\boldsymbol{*}} by adding structural rules: permutation for 𝐈𝐋𝐋\mathbf{ILL} and permutation and weakening for 𝐈𝐀𝐋\mathbf{IAL}. In the language of 𝐌𝐀𝐋𝐂\mathbf{MALC}, the rules of permutation and weakening are formulated as follows:

Γ,B,A,ΔCPΓ,A,B,ΔCΓ,ΔCWΓ,A,ΔC\Gamma,A,B,\Delta\vdash C\Gamma,B,A,\Delta\vdash C\qquad\Gamma,A,\Delta\vdash C\Gamma,\Delta\vdash C

Adding only weakening yields non-commutative intuitionistic affine logic, or affine (monotone) multiplicative-additive Lambek calculus. We denote this system by 𝐀𝐌𝐀𝐋𝐂\mathbf{AMALC}^{\boldsymbol{*}} (in the presence of extra structural rules, we do not impose Lambek’s restriction).

We shall also use alternative calculi for the commutative systems 𝐈𝐋𝐋\mathbf{ILL} and 𝐈𝐀𝐋\mathbf{IAL}, in which structural rules are hidden in axioms and in the format of sequents. First, we change the language of formulae, introducing one connective ABA\multimap B instead of A\BA\mathop{\backslash}B and B/AB\mathop{/}A (these are equivalent in 𝐈𝐋𝐋\mathbf{ILL} and 𝐈𝐀𝐋\mathbf{IAL}). We also write ABA\otimes B instead of ABA\cdot B, following Girard’s [9] linear logic notations.

Sequents are now going to be expressions of the form ΓC\Gamma\vdash C, where Γ\Gamma is a multiset of formulae. Further Γ,A\Gamma,A means Γ{A}\Gamma\uplus\{A\}, and Γ,Π\Gamma,\Pi means ΓΠ\Gamma\uplus\Pi, where \uplus is multiset union.

Axioms are of the form ppp\vdash p, for each variable pp, in the case of 𝐈𝐋𝐋\mathbf{ILL}, and of the form Γ,pp\Gamma,p\vdash p for 𝐈𝐀𝐋\mathbf{IAL}. Inference rules for both systems are as follows:

ΠAΓ,BCLΓ,Π,ABCΠ,ABRΠAB\Gamma,\Pi,A\multimap B\vdash C\lx@proof@logical@and\Pi\vdash A\Gamma,B\vdash C\qquad\Pi\vdash A\multimap B\Pi,A\vdash B
Γ,A,BCLΓ,ABCΓAΔBRΓ,ΔAB\Gamma,A\otimes B\vdash C\Gamma,A,B\vdash C\qquad\Gamma,\Delta\vdash A\otimes B\lx@proof@logical@and\Gamma\vdash A\Delta\vdash B
Γ,ACΓ,BCLΓ,ABCΠARlΠABΠBRrΠAB\Gamma,A\vee B\vdash C\lx@proof@logical@and\Gamma,A\vdash C\Gamma,B\vdash C\qquad\Pi\vdash A\vee B\Pi\vdash A\qquad\Pi\vdash A\vee B\Pi\vdash B
Γ,ACLlΓ,ABCΓ,BCLrΓ,ABCΠAΠBRΠAB\Gamma,A\vee B\vdash C\Gamma,A\vdash C\qquad\Gamma,A\vee B\vdash C\Gamma,B\vdash C\qquad\Pi\vdash A\wedge B\lx@proof@logical@and\Pi\wedge A\Pi\wedge B

For 𝐈𝐀𝐋\mathbf{IAL}, the weakening rule is not officially included in the system, but is admissible:

ΓCWΓ,AC\Gamma,A\vdash C\Gamma\vdash C

(it is hidden in axioms).

The cut rule of the following form is admissible both in 𝐈𝐋𝐋\mathbf{ILL} and 𝐈𝐀𝐋\mathbf{IAL}:

ΠAΓ,ACCutΓ,ΠC\Gamma,\Pi\vdash C\lx@proof@logical@and\Pi\vdash A\Gamma,A\vdash C

This is shown by a standard inductive argument.

Finally, let us introduce the multiplicative unit constant, 𝟏\mathbf{1}. The unit constant is added to systems without Lambek’s restriction extending 𝐋\mathbf{L}^{\boldsymbol{*}} (i.e., 𝐋\mathbf{L}^{\boldsymbol{*}} itself, 𝐌𝐀𝐋𝐂\mathbf{MALC}^{\boldsymbol{*}}, 𝐀𝐌𝐀𝐋𝐂\mathbf{AMALC}^{\boldsymbol{*}}, 𝐈𝐋𝐋\mathbf{ILL}, 𝐈𝐀𝐋\mathbf{IAL}). The Lambek calculus with the unit, 𝐋𝟏\mathbf{L}_{\mathbf{1}} [10], is obtained from 𝐋\mathbf{L}^{\boldsymbol{*}} by adding one axiom, 𝟏\vdash\mathbf{1} (its antecedent is empty), and one inference rule,

Γ,ΔC𝟏LΓ,𝟏,ΔC\Gamma,\mathbf{1},\Delta\vdash C\Gamma,\Delta\vdash C

L-completeness, however, does not hold for 𝐋𝟏\mathbf{L}_{\mathbf{1}}. Indeed, since 𝟏\mathbf{1} should be the unit w.r.t. \cdot, that is A𝟏=A=𝟏AA\cdot\mathbf{1}=A=\mathbf{1}\cdot A for any AA, in Lε\varepsilon-models it should be interpreted as {ε}\{\varepsilon\}. The following sequent is a counter-example for L-completeness: 𝟏/p,𝟏/p𝟏/p\mathbf{1}\mathop{/}p,\mathbf{1}\mathop{/}p\vdash\mathbf{1}\mathop{/}p. This sequent is true in all models for any interpretation of pp, but is not derivable in 𝐋𝟏\mathbf{L}_{\mathbf{1}}.

Throughout this paper, we shall frequently consider fragments of the calculi defined above in languages with restricted sets of connectives. Such a fragment will be denoted by the name of the calculus, followed by the list of connectives in parentheses: e.g., 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge).

2 Distributivity Law in Fragments with One Additive

It is well known, that 𝐌𝐀𝐋𝐂\mathbf{MALC} is incomplete w.r.t. L-models. The reason is the distributivity principle,

(AC)(BC)(AB)C.(A\vee C)\wedge(B\vee C)\vdash(A\wedge B)\vee C. (𝒟)

On one hand, this principle is obviously true in all L-models. On the other hand, as noticed by Ono and Komori [11], one needs the structural rules of contraction and weakening to derive it. In particular, the distributivity principle is not derivable in 𝐌𝐀𝐋𝐂\mathbf{MALC}, 𝐌𝐀𝐋𝐂\mathbf{MALC}^{\boldsymbol{*}}, 𝐀𝐌𝐀𝐋𝐂\mathbf{AMALC}^{\boldsymbol{*}}, 𝐈𝐋𝐋\mathbf{ILL}, and 𝐈𝐋𝐋\mathbf{ILL}.

The distributivity principle, as formulated above, includes both additive connectives, \wedge and \vee. We investigate fragments of 𝐌𝐀𝐋𝐂\mathbf{MALC} with only one additive, \wedge or \vee. The result of our study is that with respect to distributivity \wedge and \vee behave in opposite ways.

Let 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}+\mathcal{D} denote 𝐌𝐀𝐋𝐂\mathbf{MALC} with the distributivity principle added as an extra axiom scheme. In the presence of this axiom scheme, we have to keep cut as an official rule of the system (it is now not eliminable). A hypersequential system for 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}+\mathcal{D} was developed by Kozak [12].

Let us restrict ourselves to the product-free language (with product, proving L-completeness is hard even without extra connections [4, 5]). We also consider calculi without the unit constant: issues connected with 𝟏\mathbf{1} are discussed in Section 3. Thus, we consider two fragments of the multiplicative-additive Lambek calculus: 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge) and 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\vee), and the corresponding fragments of bigger system up to 𝐈𝐀𝐋\mathbf{IAL}. (For commutative calculi, we have only one implication, that is, consider fragments in the language of ,\multimap,\wedge and ,\multimap,\vee.)

As shown by Buszkowski [6], 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge) is complete w.r.t. L-models. This yields the following corollary: 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge) is a conservative fragment of both 𝐌𝐀𝐋𝐂\mathbf{MALC} and 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}+\mathcal{D}. Indeed, any sequent provable in 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}+\mathcal{D} is true in all L-models; if it is in the language of \,/,\mathop{\backslash},\mathop{/},\wedge, it is derivable in 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge) by L-completeness. In other words, the distributivity principle has no non-trivial corollaries in the language of \,/,\mathop{\backslash},\mathop{/},\wedge.

The situation with 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\vee) is opposite. Namely, we present a corollary of the distributivity principle in the language of \,/,\mathop{\backslash},\mathop{/},\vee, which is not provable in 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\vee). Thus, 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\vee) is not a conservative fragment of 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}+\mathcal{D}, and is therefore incomplete w.r.t. L-models. Moreover, we show that this effect is of a more general nature. Namely, the same holds for the corresponding fragments of 𝐌𝐀𝐋𝐂\mathbf{MALC}^{\boldsymbol{*}}, 𝐀𝐌𝐀𝐋𝐂\mathbf{AMALC}^{\boldsymbol{*}}, 𝐈𝐋𝐋\mathbf{ILL}, and 𝐈𝐀𝐋\mathbf{IAL}: distributivity has no new corollaries in the language with \wedge, but has such in the language with \vee.

2.1 Completeness with Additive Conjunction Only

For the first series of results, concerning \wedge, we give a semantic proof. For each system, we consider a specific version of L-semantics. For 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge) and 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge), these are L-models and Lε\varepsilon-models respectively. For other systems, let us first give some definitions and prove correctness statements for them.

Definition 1.

A language AA is called monotone, if for any word u1u2Au_{1}u_{2}\in A and an arbitrary word ww the word u1wu2u_{1}wu_{2} also belongs to AA.

Proposition 1.

If AA and BB are both monotone, then so are A\BA\mathop{\backslash}B, B/AB\mathop{/}A, and ABA\wedge B.

Proof.

Let u=u1u2A\Bu=u_{1}u_{2}\in A\mathop{\backslash}B. Then for any vAv\in A we have vu1u2Bvu_{1}u_{2}\in B. Now take u=u1wu2u^{\prime}=u_{1}wu_{2} for an arbitrary ww. By monotonicity of BB, the word vu=vu1wu2vu^{\prime}=vu_{1}wu_{2} is also in BB. Since this holds for any vAv\in A, we get uA\Bu^{\prime}\in A\mathop{\backslash}B. The reasoning for B/AB\mathop{/}A is symmetric. The case of ABA\wedge B is trivial. ∎

Definition 2.

A language AA is called commutative, if for any word u=a1anu=a_{1}\ldots a_{n} belonging to AA and an arbitrary transposition σ𝐒n\sigma\in\mathbf{S}_{n} on {1,,n}\{1,\ldots,n\} the word aσ(1)aσ(n)a_{\sigma(1)}\ldots a_{\sigma(n)} also belongs to AA.

Commutative languages are in one-to-one correspondence with multisets of letters from Σ\Sigma. Thus, we can define the operation of multiset union, ABA\uplus B, for two commutative languages AA and BB, which can be expressed as follows:

AB={aσ(1)aσ(n)σ𝐒n and a1anAB}.A\uplus B=\{a_{\sigma(1)}\ldots a_{\sigma(n)}\mid\sigma\in\mathbf{S}_{n}\mbox{ and }a_{1}\ldots a_{n}\in A\cdot B\}.

If AA is a commutative language, then vuAvu\in A if and only if uvAuv\in A. Therefore, for commutative AA and BB, we have A\B=B/AA\mathop{\backslash}B=B\mathop{/}A; we denote this language by ABA\multimap B.

Proposition 2.

If AA and BB are commutative, then so is ABA\multimap B and ABA\wedge B.

Proof.

Commutativity of ABA\wedge B is obvious. For ABA\multimap B, take any u=a1anAB=B/Au=a_{1}\ldots a_{n}\in A\multimap B=B\mathop{/}A and let u=uσ(1)uσ(n)u^{\prime}=u_{\sigma(1)}\ldots u_{\sigma(n)}. Now for any v=an+1amAv=a_{n+1}\ldots a_{m}\in A. By definiton of B/AB\mathop{/}A, we have uvBuv\in B. Now by commutativity of BB, the word uvu^{\prime}v also belongs to BB. Indeed, it is obtained from uvuv by the following transposition:

σ~=(12nn+1mσ(1)σ(2)σ(n)n+1m).\tilde{\sigma}=\left(\begin{matrix}1&2&\ldots&n&n+1&\ldots&m\\ \sigma(1)&\sigma(2)&\ldots&\sigma(n)&n+1&\ldots&m\end{matrix}\right).

Since vAv\in A was taken arbitrarily, we conclude that uB/A=ABu^{\prime}\in B\mathop{/}A=A\multimap B. ∎

Having the class of monotone languages and the class of commutative languages closed under our operations (\\mathop{\backslash}, /\mathop{/}, \wedge), we can define the classes of restricted Lε\varepsilon-models for all our systems.

Definition 3.

An Lε\varepsilon-model is monotone, if all languages in it are monotone. Truth of sequents is defined as in ordinary Lε\varepsilon-models.

Definition 4.

A commutative Lε\varepsilon-model is an Lε\varepsilon-model, where all languages are commutative.

In commutative models \uplus actually plays the role of product (while we do not have product as a connective, we still have the sequential comma, which is a hidden product), due to the following fact.

Proposition 3.

In a commutative Lε\varepsilon-model ww, a sequent A1,,AnBA_{1},\ldots,A_{n}\vdash B is true if and only if w(A1)w(An)w(B)w(A_{1})\uplus\ldots\uplus w(A_{n})\subseteq w(B).

Proof.

The “if” part is due to the fact that w(A1)w(An)w(A1)w(An)w(A_{1})\cdot\ldots\cdot w(A_{n})\subseteq w(A_{1})\uplus\ldots\uplus w(A_{n}). The “only if” part holds since w(B)w(B) is closed under transpositions. ∎

Now we prove an extension of Buszkowski’s completeness result

Theorem 4.

Each of 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge), 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge), 𝐀𝐌𝐀𝐋𝐂(\,/,)\mathbf{AMALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge), 𝐈𝐋𝐋(,)\mathbf{ILL}(\multimap,\wedge), 𝐈𝐀𝐋(,)\mathbf{IAL}(\multimap,\wedge) is sound and complete w.r.t. the corresponding class of models, according to the following table:

Calculus Models
𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge) L-models
𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge) Lε\varepsilon-models
𝐀𝐌𝐀𝐋𝐂(\,/,)\mathbf{AMALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge) monotone Lε\varepsilon-models
𝐈𝐋𝐋(,)\mathbf{ILL}(\multimap,\wedge) commutative Lε\varepsilon-models
𝐈𝐀𝐋(,)\mathbf{IAL}(\multimap,\wedge) Lε\varepsilon-models, which are both monotone and commutative
Proof.

The cases of 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge) and 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge) are due to Buszkowski [6]. Let us consider the remaining three systems.

The soundness part is easy: our conditions on models were specifically designed to reflect structural rules. In a monotone model, if w(A1)w(Ak)w(Ak+1)w(An)w(B)w(A_{1})\cdot\ldots\cdot w(A_{k})\cdot w(A_{k+1})\cdot\ldots\cdot w(A_{n})\subseteq w(B), then also w(A1)w(Ak)w(A)w(Ak+1)w(An)w(B)w(A_{1})\cdot\ldots\cdot w(A_{k})\cdot w(A)\cdot w(A_{k+1})\cdot\ldots\cdot w(A_{n})\subseteq w(B), thus the weakening rule is valid. If we have a commutative Lε\varepsilon-model, then the permutation rule is valid. This is obvious from Proposition 3: unlike \cdot, \uplus is just commutative. All other rules and axioms are valid in arbitrary Lε\varepsilon-models.

Completeness is proved by Buszkowski’s canonical model argument. We do it uniformly for all systems. In the canonical model, the alphabet Σ\Sigma is the set of all formulae of the given calculus, and for any formula AA let

w0(A)={ΓΓA is derivable in the given system}.w_{0}(A)=\{\Gamma\mid\Gamma\vdash A\mbox{ is derivable in the given system}\}.

First we show that w0w_{0} is indeed an Lε{\varepsilon}-model:

w0(A\B)=w0(A)\w0(B);\displaystyle w_{0}(A\mathop{\backslash}B)=w_{0}(A)\mathop{\backslash}w_{0}(B);
w0(B/A)=w0(B)/w0(A);\displaystyle w_{0}(B\mathop{/}A)=w_{0}(B)\mathop{/}w_{0}(A);
w0(AB)=w0(A)w0(B).\displaystyle w_{0}(A\wedge B)=w_{0}(A)\wedge w_{0}(B).

This is performed exactly as in Buszkowski’s proof. Indeed, if Γw0(A\B)\Gamma\in w_{0}(A\mathop{\backslash}B), then for an arbitrary Δw0(A)\Delta\in w_{0}(A) we have ΓA\B\Gamma\vdash A\mathop{\backslash}B and ΔA\Delta\vdash A. Applying cut with A,A\BBA,A\mathop{\backslash}B\vdash B, we get Δ,ΓA\Delta,\Gamma\vdash A derivable in our system. Thus, ΔΓw0(B)\Delta\Gamma\in w_{0}(B), therefore Γw0(A)\w0(B)\Gamma\in w_{0}(A)\mathop{\backslash}w_{0}(B). Notice that cut is available in all systems we consider. Dually, if Γw0(A)\w0(B)\Gamma\in w_{0}(A)\mathop{\backslash}w_{0}(B), then, since Aw0(A)A\in w_{0}(A) by the axiom, AΓw0(B)A\Gamma\in w_{0}(B). This means derivability A,ΓBA,\Gamma\vdash B, thus ΓA\B\Gamma\vdash A\mathop{\backslash}B. Hence, Γw0(A\B)\Gamma\in w_{0}(A\mathop{\backslash}B).

The /\mathop{/} case is symmetric. For \wedge, we use the equivalence ΓAB\Gamma\vdash A\wedge B if and only if ΓA\Gamma\vdash A and ΓB\Gamma\vdash B. Here the “if” part is an application of R\wedge R, and the “only if” part is by cut with ABAA\wedge B\vdash A and ABBA\wedge B\vdash B.

Next, is easy to see that the canonical model w0w_{0} belongs to the corresponding class of models: monotone for 𝐀𝐌𝐀𝐋𝐂(\,/,)\mathbf{AMALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge), commutative for 𝐈𝐋𝐋(\,/,)\mathbf{ILL}(\mathop{\backslash},\mathop{/},\wedge), commutative and monotone for 𝐈𝐀𝐋(\,/,)\mathbf{IAL}(\mathop{\backslash},\mathop{/},\wedge).

Finally, suppose a sequent ΠB\Pi\vdash B is not derivable. Consider two cases. If Π=A1,,An\Pi=A_{1},\ldots,A_{n} is non-empty, then, since each AiA_{i} belongs to w(Ai)w(A_{i}), we have Γw(A1)w(An)\Gamma\in w(A_{1})\cdot\ldots\cdot w(A_{n}). On the other hand, Γw(B)\Gamma\notin w(B). This falsifies ΠB\Pi\vdash B under interpretation w0w_{0}. If Π\Pi is empty, then we have εw(B)\varepsilon\notin w(B), which again falsifies ΠB\Pi\vdash B. This finishes the completeness proof. ∎

It is easy to see that soundness actually extends to the language with \vee (interpreted as set-theoretic union). Unions of monotone languages are also monotone, the same for commutative languages. The situation with product is a bit more complicated for commutative systems, since ABA\cdot B is usually not commutative, even for commutative AA and BB. Thus, we have to alter the definition of language models in the commutative case, requiring w(AB)=w(A)w(B)w(A\cdot B)=w(A)\uplus w(B) instead of w(AB)=w(A)w(B)w(A\cdot B)=w(A)\cdot w(B). Under this modification, soundness holds for product also. Finally, notice that in all models we consider \vee and \wedge are interpreted set-theoretically, thus, obey the distributivity law. These considerations yield the following soundness result:

Proposition 5.

Each of 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}+\mathcal{D}, 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}^{\boldsymbol{*}}+\mathcal{D}, 𝐀𝐌𝐀𝐋𝐂+𝒟\mathbf{AMALC}^{\boldsymbol{*}}+\mathcal{D}, 𝐈𝐋𝐋+𝒟\mathbf{ILL}+\mathcal{D}, 𝐈𝐀𝐋+𝒟\mathbf{IAL}+\mathcal{D} is sound w.r.t. the corresponding class of models, according to the table in Theorem 4; for 𝐈𝐋𝐋\mathbf{ILL} and 𝐈𝐀𝐋\mathbf{IAL} in the models we use \uplus to interpret \cdot.

Now we are ready to state and prove our conservativity result.

Theorem 6.

The systems in the restricted language without \vee, 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge), 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge), 𝐀𝐌𝐀𝐋𝐂(\,/,)\mathbf{AMALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge), 𝐈𝐋𝐋(,)\mathbf{ILL}(\multimap,\wedge), and 𝐈𝐀𝐋(,)\mathbf{IAL}(\multimap,\wedge) are conservative fragments of 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}+\mathcal{D}, 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}^{\boldsymbol{*}}+\mathcal{D}, 𝐀𝐌𝐀𝐋𝐂+𝒟\mathbf{AMALC}^{\boldsymbol{*}}+\mathcal{D}, 𝐈𝐋𝐋+𝒟\mathbf{ILL}+\mathcal{D}, and 𝐈𝐀𝐋+𝒟\mathbf{IAL}+\mathcal{D} respectively.

Proof.

Let ΠB\Pi\vdash B be a sequent in the language of \,/,\mathop{\backslash},\mathop{/},\wedge (in the commutative case, ,\multimap,\wedge). Suppose it is derivable in one of the distributive systems, 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}+\mathcal{D}, …, 𝐈𝐀𝐋+𝒟\mathbf{IAL}+\mathcal{D}. Then by Proposition 5 it is true in all models of the corresponding class. By Theorem 4 it is derivable in, respectively, 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge), …, 𝐈𝐀𝐋(,)\mathbf{IAL}(\multimap,\wedge). ∎

2.2 Incompleteness with Additive Disjunction Only

If we take \vee instead of \wedge, however, no analog of the conservativity result like Theorem 6 is possible, due to the following counter-example.

Theorem 7.

The sequent

((x/y)w)/((x/y)(x/z)w),(x/y)w,((x/y)w)\((x/z)w)(x/(yz))w((x\mathop{/}y)\vee w)\mathop{/}((x\mathop{/}y)\vee(x\mathop{/}z)\vee w),(x\mathop{/}y)\vee w,\\ ((x\mathop{/}y)\vee w)\mathop{\backslash}((x\mathop{/}z)\vee w)\vdash(x\mathop{/}(y\vee z))\vee w

is derivable in 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}+\mathcal{D} but this sequent is not derivable in 𝐈𝐀𝐋\mathbf{IAL}.

This sequent is in the language of \,/,\mathop{\backslash},\mathop{/},\vee. The theorem states that it is derivable in 𝐌𝐀𝐋𝐂+𝒟\mathbf{MALC}+\mathcal{D}, and therefore in all its extensions up to 𝐈𝐀𝐋+𝒟\mathbf{IAL}+\mathcal{D}, but not in the corresponding (\,/,\mathop{\backslash},\mathop{/},\vee) fragments without the distributivity law added. Thus, this is a non-trivial corollary of 𝒟\mathcal{D} in the language without \wedge. In particular, Theorem 7 implies that 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\vee) is incomplete w.r.t. L-models, as well as 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\vee), 𝐀𝐌𝐀𝐋𝐂(\,/,)\mathbf{AMALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\vee), 𝐈𝐋𝐋(,)\mathbf{ILL}(\multimap,\vee), 𝐈𝐀𝐋(,)\mathbf{IAL}(\multimap,\vee) are incomplete w.r.t. the corresponding modifications of L-models (compare with Theorem 4).

Before proving Theorem 7, let us make some remarks. First, let us notice that the sequent in this theorem is slightly different from the one in our WoLLIC 2019 paper [13], where one variable is used for xx and ww. The reason is that the old example happens to be derivable in 𝐈𝐀𝐋\mathbf{IAL} (but still not in 𝐈𝐋𝐋\mathbf{ILL} and weaker systems).

Second, the hard part of Theorem 7 is, of course, the second one (non-derivability). Fortunately, the derivability problem in 𝐌𝐀𝐋𝐂\mathbf{MALC} is algorithmically decidable (belongs to PSPACE), thus, it is possible to establish non-derivability by exhaustive proof search. This proof search was first performed, as a pre-verification of the result, automatically using an affine modification of llprover by Tamura [14]. (For the WoLLIC 2019 paper, we used a 𝐌𝐀𝐋𝐂\mathbf{MALC} prover by Jipsen [15], based on the algorithm by Okada and Terui [16].) In order to make this article self-contained and independent from proof-search software, here we present a complete manual proof search.

One of the WoLLIC 2019 reviewers suggested a shorter method of proving non-derivability of the given sequent in 𝐌𝐀𝐋𝐂\mathbf{MALC}, via an algebraic counter-model. This counter-model is a commutative residuated lattice on the set R={0,a,b,c,1}R=\{0,a,b,c,1\}. The order is defined as follows: 0a,b,c10\prec a,b,c\prec 1; a,b,ca,b,c are incomparable. Product and residual are defined as follows:

0abc1000000a0abc1b0bac1c0cc0c1011c10abc1011111a0abc1b0bac1cccc111000c1\begin{array}[]{c|ccccc}\cdot&0&a&b&c&1\\ \hline\cr 0&0&0&0&0&0\\ a&0&a&b&c&1\\ b&0&b&a&c&1\\ c&0&c&c&0&c\\ 1&0&1&1&c&1\\ \end{array}\qquad\begin{array}[]{c|ccccc}\multimap&0&a&b&c&1\\ \hline\cr 0&1&1&1&1&1\\ a&0&a&b&c&1\\ b&0&b&a&c&1\\ c&c&c&c&1&1\\ 1&0&0&0&c&1\\ \end{array}

(In the commutative situation, we have only one residual, which we denote by \multimap.) Variables are interpreted as follows: yy as bb, zz as cc, xx and ww both as aa. This algebraic model falsifies the sequent in Theorem 7. However, is insufficient for our new purposes. The reason is that in this model ab=baa\cdot b=b\not\preceq a, while in the presence of weakening ABAA\cdot B\vdash A should be true. Thus, in order to establish non-derivability of our sequent not only in 𝐌𝐀𝐋𝐂\mathbf{MALC}, but also in 𝐈𝐀𝐋\mathbf{IAL}, we use the good old syntactic method.

Proof of Theorem 7.

The first statement is proved using the joining (diamond) construction, the idea of which goes back to Lambek [1] and Pentus [17]. Indeed, let A=(x/y)wA=(x\mathop{/}y)\vee w and B=(x/z)wB=(x\mathop{/}z)\vee w. Then ABA\vee B is equivalent to (x/y)(x/z)w(x\mathop{/}y)\vee(x\mathop{/}z)\vee w. One can easily check derivability of A/(AB),A,A\BAA\mathop{/}(A\vee B),A,A\mathop{\backslash}B\vdash A and A/(AB),A,A\BBA\mathop{/}(A\vee B),A,A\mathop{\backslash}B\vdash B in 𝐌𝐀𝐋𝐂\mathbf{MALC}. Notice that the antecedent of this sequent is exactly the one in the sequent of our theorem. Next, we derive A/(AB),A,A\BABA\mathop{/}(A\vee B),A,A\mathop{\backslash}B\vdash A\wedge B, and further by distributivity AB((x/y)(x/z))w(x/(yz))wA\wedge B\equiv((x\mathop{/}y)\wedge(x\mathop{/}z))\vee w\equiv(x\mathop{/}(y\vee z))\wedge w.

The second statement is proved by an exhaustive proof search for the sequent

((yx)(zx)w)((yx)w),(yx)w,((yx)w)((zx)w)((yz)x)w((y\multimap x)\vee(z\multimap x)\vee w)\multimap((y\multimap x)\vee w),(y\multimap x)\vee w,\\ ((y\multimap x)\vee w)\multimap((z\multimap x)\vee w)\vdash((y\vee z)\multimap x)\vee w

(the translation of our sequent into the commutative language) in 𝐈𝐀𝐋\mathbf{IAL}.

In order to facilitate proof search, we take into account the following considerations.

First, the rules L\vee L and R\multimap R are invertible. Thus, we can suppose they are applied immediately. Moreover, L\vee L has two premises, and when disproving derivability we have the right to choose one and establish non-derivability there.

Second, we can suppose that in our (hypothetic) derivation instances of Lr\vee L_{r} of the form ΓwΓFw\dfrac{\Gamma\vdash w}{\Gamma\vdash F\vee w} are directly preceded by axioms. Indeed, such instances are interchangeable upwards with L\multimap L and L\vee L, and R\multimap R cannot appear before this Lr\vee L_{r}, since ww is a variable. Other rules are impossible by the polarized subformula property.

Third, we establish non-derivability of several sequents, which will appear frequently in our proof search:

⊬(yx)w\not\vdash(y\multimap x)\vee w (1)
z⊬(yx)wz\not\vdash(y\multimap x)\vee w (2)
y⊬(yx)wy\not\vdash(y\multimap x)\vee w (3)
z,y⊬(yx)wz,y\not\vdash(y\multimap x)\vee w (4)
z,z⊬(yx)wz,z\not\vdash(y\multimap x)\vee w (5)
z⊬(yx)(zx)wz\not\vdash(y\multimap x)\vee(z\multimap x)\vee w (6)
⊬(yx)(zx)w\not\vdash(y\multimap x)\vee(z\multimap x)\vee w (7)
z,y⊬(yx)(zx)wz,y\not\vdash(y\multimap x)\vee(z\multimap x)\vee w (8)

Now we are ready to start proof search. First we invert L\vee L introducing (yx)w(y\multimap x)\vee w and choose yxy\multimap x:

((yx)(zx)w)2((yx)w),y3x,((yx)w)4((zx)w)((yz)x)1w((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 2}}{{\multimap}}((y\multimap x)\vee w),y\stackrel{{\scriptstyle 3}}{{\multimap}}x,\\ ((y\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w)\vdash((y\vee z)\multimap x)\stackrel{{\scriptstyle 1}}{{\vee}}w

Now we have a choice of 4 principal connectives (denoted by numbers in the sequent) to be decomposed first.

Case 1. In this case, we use Rl\vee R_{l}, thanks to our consideration that Rr\vee R_{r} with ww should be applied immediately after an axiom.

((yx)(zx)w)2((yx)w),y3x,((yx)w)4((zx)w)(yz)x((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 2}}{{\multimap}}((y\multimap x)\vee w),y\stackrel{{\scriptstyle 3}}{{\multimap}}x,\\ ((y\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w)\vdash(y\vee z)\multimap x

Invert R\multimap R and L\vee L, choosing zz out of yzy\vee z:

((yx)(zx)w)2((yx)w),y3x,((yx)w)4((zx)w),zx((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 2}}{{\multimap}}((y\multimap x)\vee w),y\stackrel{{\scriptstyle 3}}{{\multimap}}x,\\ ((y\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w),z\vdash x

Now we can decompose (by L\multimap L) one of the implications 2–4, and for each we have a choice of 8=238=2^{3} ways of splitting the rest of the antecedent into Π\Pi and Γ\Gamma. Making use of the weakening rule, however, we can reduce the number of cases.

Subcase 1–2. If Π\Pi includes yx3y\stackrel{{\scriptstyle 3}}{{\multimap x}}, then the right premise is Γ,(yx)wx\Gamma,(y\multimap x)\vee w\vdash x, where Γ\Gamma is a subset of z,((yx)w)((zx)w)z,((y\multimap x)\vee w)\multimap((z\multimap x)\vee w). Notice that if ΓΓ\Gamma^{\prime}\subseteq\Gamma and the sequent is not derivable with Γ\Gamma, it is also not derivable with Γ\Gamma^{\prime} (otherwise we could derive it with Γ\Gamma using the weakening rule). However, the sequent is not derivable even with the maximal Γ\Gamma:

z,((yx)w)((zx)w),(yx)w⊬x.z,((y\multimap x)\vee w)\multimap((z\multimap x)\vee w),(y\multimap x)\vee w\not\vdash x.

Indeed, invert L\vee L and choose ww:

z,w,((yx)w)((zx)w)⊬x.z,w,((y\multimap x)\vee w)\multimap((z\multimap x)\vee w)\not\vdash x.

Here one should use L\multimap L, but then in its right premise we can again invert L\vee L choosing ww, which yields one of:

wxz,wxw,wxz,w,wx.w\vdash x\qquad z,w\vdash x\qquad w,w\vdash x\qquad z,w,w\vdash x.

None of these is derivable.

If Π\Pi does not include y3xy\stackrel{{\scriptstyle 3}}{{\multimap}}x, then Π\Pi is a subset of ((yx)w)((zx)w),z((y\multimap x)\vee w)\multimap((z\multimap x)\vee w),z, and we again take the maximal Π\Pi in the left premise:

((yx)w)((zx)w),z(yx)(zx)w((y\multimap x)\vee w)\multimap((z\multimap x)\vee w),z\vdash(y\multimap x)\vee(z\multimap x)\vee w (9)

Decomposing \multimap yields either (yx)w\vdash(y\multimap x)\vee w or z(yx)wz\vdash(y\multimap x)\vee w, both not derivable by (1) and (2). Thus, we have to decompose \vee on the right.

Taking yxy\multimap x (and inverting R\multimap R) yields

((yx)w)((zx)w),z,yx.((y\multimap x)\vee w)\multimap((z\multimap x)\vee w),z,y\vdash x.

Now we again have to use L\multimap L. The new cases are y(yx)wy\vdash(y\multimap x)\vee w and z,y(yx)wz,y\vdash(y\multimap x)\vee w, both not derivable (3)(4).

Taking zxz\multimap x and inverting R\multimap R gives

((yx)w)((zx)w),z,zx.((y\multimap x)\vee w)\multimap((z\multimap x)\vee w),z,z\vdash x.

Decomposing \multimap fails due to (1)(2)(5).

Subcase 1–3. Apply L\multimap L for 3\stackrel{{\scriptstyle 3}}{{\multimap}} and consider its left premise with the maximal possible Π\Pi:

((yx)(zx)w)2((yx)w),((yx)w)4((zx)w),zy.((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 2}}{{\multimap}}((y\multimap x)\vee w),((y\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w),z\vdash y. (10)

Subsubcase 1–3–2. Decompose 2\stackrel{{\scriptstyle 2}}{{\multimap}}. If the big formula with 4\stackrel{{\scriptstyle 4}}{{\multimap}} goes to the new Γ\Gamma, then the new Π\Pi is either zz or empty. However, neither z(yx)(zx)wz\vdash(y\multimap x)\vee(z\multimap x)\vee w nor (yx)(zx)w\vdash(y\multimap x)\vee(z\multimap x)\vee w is derivable (6)(7). If the formula with 4\stackrel{{\scriptstyle 4}}{{\multimap}} goes to the new Π\Pi, then the new Γ\Gamma is either zz or empty. This gives, at maximum, z,(yx)wyz,(y\multimap x)\vee w\vdash y, which is falsified by choosing ww in the inverted L\vee L: z,w⊬yz,w\not\vdash y.

Subsubcase 1–3–4. Decompose 4\stackrel{{\scriptstyle 4}}{{\multimap}}. Again, if the big formula (now with 2\stackrel{{\scriptstyle 2}}{{\multimap}}) goes to the new Γ\Gamma, we falsify the left premise by (1) or (2). Otherwise, the right premise is, at maximum, z,(zx)wyz,(z\multimap x)\vee w\vdash y, which is again falsified by choosing ww.

Subcase 1–4. If Π\Pi includes y3xy\stackrel{{\scriptstyle 3}}{{\multimap}}x, then the right premise is, at maximum,

((yx)(zx)w)2((yx)w),z,(zx)wx((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 2}}{{\multimap}}((y\multimap x)\vee w),z,(z\multimap x)\vee w\vdash x

Invert L\vee L and choose ww:

((yx)(zx)w)2((yx)w),z,wx((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 2}}{{\multimap}}((y\multimap x)\vee w),z,w\vdash x

Now we have to use 2L\stackrel{{\scriptstyle 2}}{{\multimap}}L. Its right premise is, at maximum, z,w,(yx)wxz,w,(y\multimap x)\vee w\vdash x. Choosing ww falsifies it.

If y3xy\stackrel{{\scriptstyle 3}}{{\multimap}}x is in Γ\Gamma, then the maximal version of the left premise is

((yx)(zx)w)2((yx)w),z(yx)w.((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 2}}{{\multimap}}((y\multimap x)\vee w),z\vdash(y\multimap x)\vee w. (11)

Applying 2\stackrel{{\scriptstyle 2}}{{\multimap}} right now is impossible: its left premise gets falsified by (7) or (6). Apply Rl\vee R_{l} (recall that Rr\vee R_{r} is used only directly below axiom) and invert R\multimap R:

((yx)(zx)w)2((yx)w),z,yx.((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 2}}{{\multimap}}((y\multimap x)\vee w),z,y\vdash x.

Here the left premise of 2\stackrel{{\scriptstyle 2}}{{\multimap}} is also falsified by (7), (6), or (8).

Case 2. Consider again two cases, depending on whether y3xy\stackrel{{\scriptstyle 3}}{{\multimap}}x goes to Π\Pi or to Γ\Gamma. If it goes to Π\Pi, then the right premise is, at maximum,

(yx)w,((yx)w)4((zx)w)((yz)x)1w.(y\multimap x)\vee w,((y\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w)\vdash((y\vee z)\multimap x)\stackrel{{\scriptstyle 1}}{{\vee}}w.

Invert L\vee L and choose yxy\multimap x:

y5x,((yx)w)4((zx)w)((yz)x)1w.y\stackrel{{\scriptstyle 5}}{{\multimap}}x,((y\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w)\vdash((y\vee z)\multimap x)\stackrel{{\scriptstyle 1}}{{\vee}}w. (12)

For reusal of our reasoning in further cases, we shall falsify a stronger sequent

y5x,((yx)(zx)w)4((zx)w)((yz)x)1w.y\stackrel{{\scriptstyle 5}}{{\multimap}}x,((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w)\vdash((y\vee z)\multimap x)\stackrel{{\scriptstyle 1}}{{\vee}}w. (13)

Indeed, (yx)w(yx)(zx)w(y\multimap x)\vee w\vdash(y\multimap x)\vee(z\multimap x)\vee w, and therefore ((yx)(zx)w)((zx)w)((yx)w)((zx)w)((y\multimap x)\vee(z\multimap x)\vee w)\multimap((z\multimap x)\vee w)\vdash((y\multimap x)\vee w)\multimap((z\multimap x)\vee w) is derivable in 𝐈𝐀𝐋\mathbf{IAL}. Thus, if (12) happens to be derivable then, by cut, so will be (13).

Now we decompose one of 1\stackrel{{\scriptstyle 1}}{{\vee}}, 4\stackrel{{\scriptstyle 4}}{{\multimap}}, 5\stackrel{{\scriptstyle 5}}{{\multimap}} in (13).

Subcase 2–Π\Pi–1. Recall that we never choose ww in R\vee R, and invert R\multimap R:

y5x,((yx)(zx)w)4((zx)w),yzx.y\stackrel{{\scriptstyle 5}}{{\multimap}}x,((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w),y\vee z\vdash x.

Invert L\vee L and choose zz:

y5x,((yx)(zx)w)4((zx)w),zx.y\stackrel{{\scriptstyle 5}}{{\multimap}}x,((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w),z\vdash x.

Subsubcase 2–Π\Pi–1–5. The left premise is, at maximum,

((yx)(zx)w)4((zx)w),zy.((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w),z\vdash y.

Applying 4L\stackrel{{\scriptstyle 4}}{{\multimap}}L is impossible, since its right premise is falsified by choosing ww: w,z⊬yw,z\not\vdash y and w⊬yw\not\vdash y.

Subsubcase 2–Π\Pi–1–4. Again, if y5xy\stackrel{{\scriptstyle 5}}{{\multimap}}x goes to the new Π\Pi, then the right premise is, at maximum, z,(zx)wx,z,(z\multimap x)\vee w\vdash x, which is falsified by choosing ww. If it goes to the new Γ\Gamma, then the new left premise is, at maximum, z(yx)(zx)wz\vdash(y\multimap x)\vee(z\multimap x)\vee w, which is not derivable by (6).

Subcase 2–Π\Pi–4. If the new Π\Pi is empty, then the left premise is falsified by (7). Otherwise, the right premise is

(zx)w((yz)x)w.(z\multimap x)\vee w\vdash((y\vee z)\multimap x)\vee w.

Invert L\vee L and choose zxz\multimap x:

zx((yz)x)w.z\multimap x\vdash((y\vee z)\multimap x)\vee w.

Applying L\multimap L is impossible (⊬z\not\vdash z); also zx⊬wz\multimap x\not\vdash w. Thus, we have to use Rl\vee R_{l}, and we can immediately apply R\multimap R afterwards: zx,yzx.z\multimap x,y\vee z\vdash x. Inverting L\vee L and choosing yy falsifies this sequent: zx,y⊬xz\multimap x,y\not\vdash x.

Subcase 2–Π\Pi–5. The left premise is, at maximum,

((yx)(zx)w)((zx)w)y.((y\multimap x)\vee(z\multimap x)\vee w)\multimap((z\multimap x)\vee w)\vdash y.

This is not derivable.

Now let, in Case 2, y3xy\stackrel{{\scriptstyle 3}}{{\multimap}}x go to Γ\Gamma. Then the left premise is, at maximum,

((yx)w)4((zx)w)(yx)(zx)w.((y\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w)\vdash(y\multimap x)\vee(z\multimap x)\vee w.

This sequent is stronger than (9)—that is, (9) can be obtained from it by weakening. Therefore, it cannot be derivable, since we’ve already falsified (9) in Case 1.

Case 3. Take the maximal possible Π\Pi and consider the left premise:

((yx)(zx)w)2((yx)w),((yx)w)4((zx)w)y.((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 2}}{{\multimap}}((y\multimap x)\vee w),((y\multimap x)\vee w)\stackrel{{\scriptstyle 4}}{{\multimap}}((z\multimap x)\vee w)\vdash y.

This sequent is stronger than (10), and therefore not derivable: (10) was falsified in Case 1.

Case 4. If y3xy\stackrel{{\scriptstyle 3}}{{\multimap}}x goes to Π\Pi, then the maximal version of the right premise of 4L\stackrel{{\scriptstyle 4}}{{\multimap}}L is

((yx)(zx)w)2((yx)w),(zx)w((yz)x)1w.((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 2}}{{\multimap}}((y\multimap x)\vee w),(z\multimap x)\vee w\vdash((y\vee z)\multimap x)\stackrel{{\scriptstyle 1}}{{\vee}}w.

Invert L\vee L and choose zxz\multimap x:

((yx)(zx)w)2((yx)w),z6x((yz)x)1w.((y\multimap x)\vee(z\multimap x)\vee w)\stackrel{{\scriptstyle 2}}{{\multimap}}((y\multimap x)\vee w),z\stackrel{{\scriptstyle 6}}{{\multimap}}x\vdash((y\vee z)\multimap x)\stackrel{{\scriptstyle 1}}{{\vee}}w.

Suppose this sequent is derivable. Then it will also be derivable after swapping variables yy and zz:

((zx)(yx)w)((zx)w),yx((zy)x)w.((z\multimap x)\vee(y\multimap x)\vee w)\multimap((z\multimap x)\vee w),y\multimap x\vdash((z\vee y)\multimap x)\vee w.

This sequent, however, is exactly (13), up to commutativity; (13) was falsified in Case 2.

Finally, if y3xy\stackrel{{\scriptstyle 3}}{{\multimap}}x, in Case 4, goes to Γ\Gamma, then the maximal version of the left premise of 4L\stackrel{{\scriptstyle 4}}{{\multimap}}L is

((yx)(zx)w)((yx)w)((yz)x)w.((y\multimap x)\vee(z\multimap x)\vee w)\multimap((y\multimap x)\vee w)\vdash((y\vee z)\multimap x)\vee w.

This sequent is stronger than (11) and therefore cannot be derivable. ∎

3 Undecidability with \\mathop{\backslash}, \wedge, and 𝟏\mathbf{1}

3.1 The System 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}) and Its Undecidability

In this section we consider the extension of the Lambek calculus with the multiplicative unit constant. The language of our fragment will be as follows: \\mathop{\backslash}, \wedge, 𝟏\mathbf{1}. As shown by Buszkowski [6], in the fragment of \\mathop{\backslash} and \wedge the Lambek calculus with empty antecedents is complete w.r.t. Lε\varepsilon-models. As noticed in the Introduction, however, this is not the case if we add 𝟏\mathbf{1}. In Lε\varepsilon-models, because of the principle A𝟏𝟏A\cdot\mathbf{1}\vdash\mathbf{1}, the unit constant 𝟏\mathbf{1} is necessarily interpreted as the singleton set {ε}\{\varepsilon\}, where ε\varepsilon is the empty word. (In the presence of the unit constant, we allow the empty word to belong to our languages and abolish Lambek’s non-emptiness restriction.) This particular interpretation of 𝟏\mathbf{1} satisfies certain principles, including A{ε}={ε}AA\cdot\{\varepsilon\}=\{\varepsilon\}\cdot A and {ε}{ε}={ε}\{\varepsilon\}\cdot\{\varepsilon\}=\{\varepsilon\}. Moreover, these principles keep valid for languages of the form {ε}B\{\varepsilon\}\cap B (for any BB). Indeed, this language is either {ε}\{\varepsilon\} or \varnothing, and for the empty set \varnothing we also have A=AA\cdot\varnothing=\varnothing\cdot A and =\varnothing\cdot\varnothing=\varnothing.

Below we present a calculus denoted by 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}), which reflects these principles as sequential rules:

IdAA𝟏A,𝟏AA\vdash A\qquad A,\mathbf{1}\vdash A
ΠAΓ,B,ΔC\LΓ,Π,A\B,ΔCA,ΠB\RΠA\B\Gamma,\Pi,A\mathop{\backslash}B,\Delta\vdash C\lx@proof@logical@and\Pi\vdash A\Gamma,B,\Delta\vdash C\qquad\Pi\vdash A\mathop{\backslash}BA,\Pi\vdash B
Γ,ACLlΓ,ABCΓ,BCLrΓ,ABCΠAΠBRΠAB\Gamma,A\vee B\vdash C\Gamma,A\vdash C\qquad\Gamma,A\vee B\vdash C\Gamma,B\vdash C\qquad\Pi\vdash A\wedge B\lx@proof@logical@and\Pi\wedge A\Pi\wedge B
Γ,A,𝟏G,ΔCLεΓ,𝟏G,A,ΔCΓ,𝟏G,A,ΔCRεΓ,A,𝟏G,ΔCΓ,𝟏G,𝟏G,ΔCDεΓ,𝟏G,ΔC\Gamma,\mathbf{1}\wedge G,A,\Delta\vdash C\Gamma,A,\mathbf{1}\wedge G,\Delta\vdash C\quad\Gamma,A,\mathbf{1}\wedge G,\Delta\vdash C\Gamma,\mathbf{1}\wedge G,A,\Delta\vdash C\quad\Gamma,\mathbf{1}\wedge G,\Delta\vdash C\Gamma,\mathbf{1}\wedge G,\mathbf{1}\wedge G,\Delta\vdash C

The rules LεL\varepsilon and RεR\varepsilon are called “commuting” rules; they reflect the fact that, for any set XX, X{ε}={ε}XX\cdot\{\varepsilon\}=\{\varepsilon\}\cdot X and X=XX\cdot\varnothing=\varnothing\cdot X. The “doubling” rule DεD\varepsilon is caused by {ε}{ε}={ε}\{\varepsilon\}\cdot\{\varepsilon\}=\{\varepsilon\} and =\varnothing\cdot\varnothing=\varnothing. Thus, these rules express natural algebraic properties of the interpretation of 𝟏\mathbf{1} as \varnothing. However, we emphasize that they are not admissible in the standard calculus 𝐋𝟏\mathbf{L}_{\mathbf{1}}, introduced by Lambek [10], that is, non-commutative intuitionistic multiplicative-additive linear logic.

The rules LεL\varepsilon, RεR\varepsilon, and DεD\varepsilon are not new. Their underlying principles, namely, (𝟏G)AA(𝟏G)(\mathbf{1}\wedge G)\cdot A\equiv A\cdot(\mathbf{1}\wedge G) and (𝟏G)(𝟏G)𝟏G(\mathbf{1}\wedge G)\cdot(\mathbf{1}\wedge G)\equiv\mathbf{1}\wedge G appear in works of the Hungarian school (Andréka, Mikulás, Németi). Namely, in [18] one can find the first of these equivalences (denoted there as formula 3.2), as one of the principles which is true in language algebras, but not in algebras of binary relations. The second equivalence is true for binary relations also; formula (CbI) in [19] is actually its stronger version, (𝟏G)(𝟏F)𝟏GF(\mathbf{1}\wedge G)\cdot(\mathbf{1}\wedge F)\equiv\mathbf{1}\wedge G\wedge F. We get our (𝟏G)(𝟏G)𝟏G(\mathbf{1}\wedge G)\cdot(\mathbf{1}\wedge G)\equiv\mathbf{1}\wedge G by taking F=GF=G.

Andréka, Mikulás, and Sain [20] also sketch an undecidability proof for a system related to the one considered here. Their proof is based on the technique of Kurucz et al. [21]. The system considered in [20] is the logic of residuated distributive lattices over monoids. Unlike the case we consider in this section, their system requires product, the unit and also the zero constant (the minimal element of the lattice) to be present in the language. Here we require only division, additive conjunction, and the unit. The trade-off is that we consider a narrower class of models. Namely, we consider only Lε\varepsilon-models, and these models, as shown above, allow extra principles for 𝟏\mathbf{1}.

We do not claim that 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}) is an Lε\varepsilon-complete system. Indeed, the Lε\varepsilon-complete extension of 𝐋𝟏\mathbf{L}_{\mathbf{1}} happens to be quite involved (cf. [22]). In particular, it is still an open problem whether such a complete system is recursively enumerable. The cut rule is not included in 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}), so all our derivations will be cut-free. We do not claim that cut is admissible in this system.

We prove undecidability for the whole range of systems between 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}) and the Lε\varepsilon-complete system in the language of \\mathop{\backslash}, \wedge, 𝟏\mathbf{1}.

Theorem 8.

Let \mathcal{L} be an Lε\varepsilon-sound logic which includes 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}). Then the derivability problem for \mathcal{L} is undecidable.

Our undecidability proof is based on encoding computations of 2-counter Minsky machines [23]. In the forward encoding, from Minsky computations to derivations in our calculus, we present explicit derivations in 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}). For the backwards direction, from derivations to computations, we use a semantic approach using L-models (cf. [24, 25, 16], where phase semantics was used for similar purposes). Thus, we get undecidability not only for 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}) itself, but for the whole range of its Lε\varepsilon-sound extensions.

Before going further, let us introduce the relative double negation construction. We fix a variable (atomic proposition) bb and define relative negation AbA^{b} as

Ab=A\b.A^{b}=A\mathop{\backslash}b.

The term “negation” here is motivated as follows. In linear logic with the falsity constant \bot, negation is expressed as A=AA^{\bot}=A\multimap\bot. Here we do the same non-commutatively, but due to lack of the \bot constant we replace it by a fixed variable. This is the minimal logic approach: variable bb can be read as “false,” but no specific axioms like bAb\vdash A (ex falso) are imposed for bb.

The relative double negation now is

Abb=(A\b)\b.A^{bb}=(A\mathop{\backslash}b)\mathop{\backslash}b.

Notice the difference from the more usual in the Lambek calculus “type raising” version of something like double negation: Abb=b/(A\b){}^{b}A^{b}=b\mathop{/}(A\mathop{\backslash}b). In our setting, we have neither AbbAA^{bb}\vdash A (due to the intuitionistic nature of the Lambek calculus), nor AAbbA\vdash A^{bb} (due to non-commutativity; in contrast, AAbbA\vdash{}^{b}A^{b} is derivable). Nevertheless, AbbA^{bb} will be useful for our construction.

Given a sequence of formulae Φ=A1,A2,,Am1,Am\Phi=A_{1},A_{2},\ldots,A_{m-1},A_{m} and a formula CC, we introduce the notation

Φ\C=Am\(Am1\\(A2\(A1\C))).\Phi\mathop{\backslash}C=A_{m}\mathop{\backslash}(A_{m-1}\mathop{\backslash}\ldots\mathop{\backslash}(A_{2}\mathop{\backslash}(A_{1}\mathop{\backslash}C))\ldots).

In particular,

Φb=Am\(Am1\\(A2\(A1\b)))\Phi^{b}=A_{m}\mathop{\backslash}(A_{m-1}\mathop{\backslash}\ldots\mathop{\backslash}(A_{2}\mathop{\backslash}(A_{1}\mathop{\backslash}b))\ldots)

and

Φbb=(Am\(Am1\\(A2\(A1\b))))\b.\Phi^{bb}=\bigl{(}A_{m}\mathop{\backslash}(A_{m-1}\mathop{\backslash}\ldots\mathop{\backslash}(A_{2}\mathop{\backslash}(A_{1}\mathop{\backslash}b))\ldots)\bigr{)}\mathop{\backslash}b.

In what follows, we suppose that the bb operation has a higher priority than ordinary division \\mathop{\backslash}.

Consider a non-deterministic Minsky machine 𝔐\mathfrak{M} with a finite set of states {L0,L1,,Ln}\{L_{0},L_{1},\ldots,L_{n}\}. A configuration of 𝔐\mathfrak{M} is a triple (Li,k1,k2)(L_{i},k_{1},k_{2}), where LiL_{i} is the current state and k1k_{1} and k2k_{2} are the current values of 𝔐\mathfrak{M}’s two counters. The counters themselves are denoted by c1c_{1} and c2c_{2}. The configuration (L0,0,0)(L_{0},0,0) is considered the final one; the initial configuration can be taken arbitrarily.

Configurations of Minsky machines are encoded as follows. We introduce distinct variables e1e_{1}, e2e_{2}, p1p_{1}, p2p_{2}, l0l_{0}, l1l_{1}, …, lnl_{n} and represent configuration (Li,k1,k2)(L_{i},k_{1},k_{2}) as

e1,p1,,p1k1 times,li,p2,,p2k2 times,e2.e_{1},\underbrace{p_{1},\ldots,p_{1}}_{\text{$k_{1}$ times}},l_{i},\underbrace{p_{2},\ldots,p_{2}}_{\text{$k_{2}$ times}},e_{2}.

In particular, the final configuration (L0,0,0)(L_{0},0,0) is represented as e1,l0,e2e_{1},l_{0},e_{2}.

Minsky instructions are encoded according to the following table:

Instruction II Formula AIA_{I}
inc(Li,1,Lj)(L_{i},1,L_{j}) li\(p1,lj)bbl_{i}\mathop{\backslash}(p_{1},l_{j})^{bb}
inc(Li,2,Lj)(L_{i},2,L_{j}) li\(lj,p2)bbl_{i}\mathop{\backslash}(l_{j},p_{2})^{bb}
dec(Li,1,Lj)(L_{i},1,L_{j}) (p1,li)\ljbb(p_{1},l_{i})\mathop{\backslash}l_{j}^{bb}
dec(Li,2,Lj)(L_{i},2,L_{j}) (li,p2)\ljbb(l_{i},p_{2})\mathop{\backslash}l_{j}^{bb}
jz(Li,1,Lj)(L_{i},1,L_{j}) (e1,li)\(e1,lj)bb(e_{1},l_{i})\mathop{\backslash}(e_{1},l_{j})^{bb}
jz(Li,2,Lj)(L_{i},2,L_{j}) (li,e2)\(lj,e2)bb(l_{i},e_{2})\mathop{\backslash}(l_{j},e_{2})^{bb}

Here instruction inc(Li,r,Lj)(L_{i},r,L_{j}) (increment) means “at state LiL_{i}, increase crc_{r} by 1 and go to LjL_{j}” (r=1,2r=1,2). Instruction dec(Li,r,Lj)(L_{i},r,L_{j}) (decrement) means “at state LiL_{i}, decrease crc_{r} by 1 and go to LjL_{j}.” If kr=0k_{r}=0, then this instruction cannot be applied. Finally, jz(Li,r,Lj)(L_{i},r,L_{j}) (zero-test) means “at state LiL_{i}, if kr=0k_{r}=0, go to LjL_{j}.” Now if kr0k_{r}\neq 0, then the instruction cannot be applied.

Notice that our version of zero-test and decrement instructions are very restrictive. Once the counter has a wrong value (zero for decreasing or non-zero for zero-test), the machine just fails to proceed. Usually, in such cases the machine is allowed to perform conditional branching (e.g., zero-test jumps to LjL_{j} if the counter is zero and safely stays at LiL_{i} if not). These restrictions, however, are compensated by the allowed non-determinism of 𝔐\mathfrak{M}. Indeed, the compound jzdec(Li,r,Lj1,Lj2)(L_{i},r,L_{j_{1}},L_{j_{2}}) instruction from Minsky’s original formalism [26], meaning “at state LiL_{i}, if kr0k_{r}\neq 0, decrease crc_{r} by one and go to Lj1L_{j_{1}}, and if kr=0k_{r}=0, go to Lj2L_{j_{2}},” is modelled by adding simultaneously two instructions: dec(Li,r,Lj1)(L_{i},r,L_{j_{1}}) and jz(Li,r,Lj2)(L_{i},r,L_{j_{2}}). This non-deterministically branches computation; however, exactly one branch (depending on whether kr=0k_{r}=0) could be successful, the other one immediately fails.

Let us denote the set of our variables, except bb, by 𝒱\mathcal{V}:

𝒱={e1,e2,p1,p2,l0,l1,,ln}.\mathcal{V}=\{e_{1},e_{2},p_{1},p_{2},l_{0},l_{1},\ldots,l_{n}\}.

Finally, the Minsky machine 𝔐\mathfrak{M} is represented by the following formula

G=((e1,l0,e2)\b)IAIq𝒱(q\qbb).G=((e_{1},l_{0},e_{2})\mathop{\backslash}b)\wedge\bigwedge_{I}A_{I}\wedge\bigwedge_{q\in\mathcal{V}}(q\mathop{\backslash}q^{bb}).

Here in the first big conjunction II ranges among all instructions of 𝔐\mathfrak{M}.

Now we are ready to state our main encoding theorems.

Theorem 9.

If 𝔐\mathfrak{M} can reach the final configuration (L0,0,0)(L_{0},0,0), starting from (Li,k1,k2)(L_{i},k_{1},k_{2}), then the following sequent is derivable in 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}):

𝟏G,e1,p1,,p1k1 times,li,p2,,p2k2 times,e2b.\mathbf{1}\wedge G,e_{1},\underbrace{p_{1},\ldots,p_{1}}_{\text{$k_{1}$ times}},l_{i},\underbrace{p_{2},\ldots,p_{2}}_{\text{$k_{2}$ times}},e_{2}\vdash b. ()
Theorem 10.

If the sequent ()(*) is true in all Lε\varepsilon-models, then 𝔐\mathfrak{M} can reach (L0,0,0)(L_{0},0,0) from (Li,k1,k2)(L_{i},k_{1},k_{2}).

Notice that our encodings are in a sense “upside-down”: the starting configuration corresponds to the goal sequent in our derivation, and the sequent encoding the final configuration (L0,0,0)(L_{0},0,0) is on the top of the derivation, very close to axioms (see proof of Theorem 9 below). The right intuition here is to consider the derivation in the direction of proof search, developing from the goal up to axioms. This direction correctly reflects the direction of Minsky computation.

Theorem 8 (our undecidability result) immediately follows from Theorems 9 and 10. Indeed, if \mathcal{L} is a logic which is Lε\varepsilon-sound and includes 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}), then ()(*) is provable in \mathcal{L} if and only if 𝔐\mathfrak{M} can reach (L0,0,0)(L_{0},0,0) from (L1,k1,k2)(L_{1},k_{1},k_{2}). Indeed, the “if” direction is by Theorem 9, and the “only if” direciton is by Theorem 10. Since reachability in Minsky computations is undecidable, we get undecidability of \mathcal{L}.

Before proving Theorems 9 and 10, we establish several technical results.

Notice that each formula in the big conjunction GG, except the first one, is of the form GΦ,Ψ=Ψ\ΦbbG_{\Phi,\Psi}=\Psi\mathop{\backslash}\Phi^{bb}. The key lemma for such formulae, in the view of Theorem 9, is as follows.

Lemma 11.

If the big conjunction GG includes GΦ,ΨG_{\Phi,\Psi} and 𝟏G,Φ,Δb\mathbf{1}\wedge G,\Phi,\Delta\vdash b is derivable in 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}), then so is 𝟏G,Δ,Ψb\mathbf{1}\wedge G,\Delta,\Psi\vdash b.

Proof.

The derivation is as follows:

ΨΨ𝟏G,Φ,ΔbRε several timesΦ,𝟏G,Δbbb\R𝟏G,ΔΦ\b\L𝟏G,Δ,(Φ\b)\bB\L𝟏G,Δ,Ψ,Ψ\ΦbbbLr𝟏G,Δ,Ψ,𝟏(Ψ\Φbb)bLε several times𝟏G,𝟏(Ψ\Φbb),Δ,ΨbL several times𝟏G,𝟏G,Δ,Ψ,bDε𝟏G,Δ,Ψb\mathbf{1}\wedge G,\Delta,\Psi\vdash b\mathbf{1}\wedge G,\mathbf{1}\wedge G,\Delta,\Psi,\vdash b\mathbf{1}\wedge G,\mathbf{1}\wedge(\Psi\mathop{\backslash}\Phi^{bb}),\Delta,\Psi\vdash b\mathbf{1}\wedge G,\Delta,\Psi,\mathbf{1}\wedge(\Psi\mathop{\backslash}\Phi^{bb})\vdash b\mathbf{1}\wedge G,\Delta,\Psi,\Psi\mathop{\backslash}\Phi^{bb}\vdash b\lx@proof@logical@and\Psi\vdash\Psi\mathbf{1}\wedge G,\Delta,(\Phi\mathop{\backslash}b)\mathop{\backslash}b\vdash B\mathbf{1}\wedge G,\Delta\vdash\Phi\mathop{\backslash}b\lx@proof@logical@and\Phi,\mathbf{1}\wedge G,\Delta\vdash b\mathbf{1}\wedge G,\Phi,\Delta\vdash bb\vdash b

Corollary 12 (“Post-ish productions”).

Let Δ1\Delta_{1} and Δ2\Delta_{2} be sequences of variables from 𝒱\mathcal{V} (no complex formulae). Then, provided that GG includes q\qbbq\mathop{\backslash}q^{bb} for any q𝒱q\in\mathcal{V}, the sequent 𝟏G,Δ2,Δ1b\mathbf{1}\wedge G,\Delta_{2},\Delta_{1}\vdash b is derivable in 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}) from 𝟏G,Δ1,Δ2b\mathbf{1}\wedge G,\Delta_{1},\Delta_{2}\vdash b.

Proof.

It is sufficient to consider the case of Δ1=q\Delta_{1}=q; then we proceed by induction on the length of Δ1\Delta_{1}. For Δ1=q\Delta_{1}=q, we apply Lemma 11 with Φ=Ψ=q\Phi=\Psi=q. ∎

Corollary 13 (One step of Minsky computation).

Suppose the Minsky machine 𝔐\mathfrak{M} can make a computation step from configuration (Li,k1,k2)(L_{i},k_{1},k_{2}) to configuration (Li,k1,k2)(L_{i^{\prime}},k^{\prime}_{1},k^{\prime}_{2}), and let ()(*^{\prime}) be the instance of ()(*) for (Li,k1,k2)(L_{i^{\prime}},k^{\prime}_{1},k^{\prime}_{2}). Then ()(*) is derivable from ()(*^{\prime}) in 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}).

Proof.

The proof is performed uniformly for all Minsky instructions. For any instruction II, the corresponding formula AIA_{I} is of the form GΦ,Ψ=Ψ\ΦbbG_{\Phi,\Psi}=\Psi\mathop{\backslash}\Phi^{bb}. On the other hand, ()(*^{\prime}) is obtained from ()(*) by replacing Ψ\Psi with Φ\Phi in the antecedent.

For example, for the instruction inc(Li,1,Lj)(L_{i},1,L_{j}) in the center of ()(*) we have li=Ψl_{i}=\Psi, which is replaced with p1,lj=Φp_{1},l_{j}=\Phi in ()(*^{\prime}). This exactly corresponds to the computation step: the number of p1p_{1}’s (that is, the value of c1c_{1}) gets increased by 1, and the state is changed to ljl_{j}. For jz, the replacement happens at the edge of the antecedent, involving e1e_{1} or e2e_{2}.

Thus, ()(*) is of the form 𝟏G,Δ1,Ψ,Δ2b\mathbf{1}\wedge G,\Delta_{1},\Psi,\Delta_{2}\vdash b and ()(*^{\prime}) is 𝟏G,Δ1,Φ,Δ2b\mathbf{1}\wedge G,\Delta_{1},\Phi,\Delta_{2}\vdash b. Now we derive ()(*) from ()(*^{\prime}) in the following way:

𝟏G,Δ1,Φ,Δ2bCorollary 12𝟏G,Φ,Δ2,Δ1bLemma 11𝟏G,Δ2,Δ1,ΨbCorollary 12𝟏G,Δ1,Ψ,Δ2b\mathbf{1}\wedge G,\Delta_{1},\Psi,\Delta_{2}\vdash b\mathbf{1}\wedge G,\Delta_{2},\Delta_{1},\Psi\vdash b\mathbf{1}\wedge G,\Phi,\Delta_{2},\Delta_{1}\vdash b\mathbf{1}\wedge G,\Delta_{1},\Phi,\Delta_{2}\vdash b

Now we are ready to prove Theorem 9.

Proof of Theorem 9.

Using Corollary 13 and induction on the number of steps in Minsky computation from (Li,k1,k2)(L_{i},k_{1},k_{2}) to (L0,0,0)(L_{0},0,0), we derive ()(*) from

𝟏G,e1,l0,e2b\mathbf{1}\wedge G,e_{1},l_{0},e_{2}\vdash b (0)

This sequent (0)(*_{0}) is derived as follows:

e2e2l0l0e1e1bb\Le1,e1\bb\Le1,l0,l0\(e1\b)b\Le1,l0,e2,e2\(l0\(e1\b))bL several timese1,l0,e2,𝟏GbLε 3 times𝟏G,e1,l0,e2b\mathbf{1}\wedge G,e_{1},l_{0},e_{2}\vdash be_{1},l_{0},e_{2},\mathbf{1}\wedge G\vdash be_{1},l_{0},e_{2},e_{2}\mathop{\backslash}(l_{0}\mathop{\backslash}(e_{1}\mathop{\backslash}b))\vdash b\lx@proof@logical@and e_{2}\vdash e_{2}e_{1},l_{0},l_{0}\mathop{\backslash}(e_{1}\mathop{\backslash}b)\vdash b\lx@proof@logical@and l_{0}\vdash l_{0}e_{1},e_{1}\mathop{\backslash}b\vdash b\lx@proof@logical@and e_{1}\vdash e_{1}b\vdash b

The backwards direction, Theorem 10, is proved by constructing a specific Lε\varepsilon-model. Let Σ=𝒱\Sigma=\mathcal{V} and fefine B𝔐B_{\mathfrak{M}} as the set of “terminating words” for 𝔐\mathfrak{M}, defined as follows:

B𝔐={e1p1p1k1 timeslip2p2k2 timese2 𝔐 can reach (L0,0,0) from (Li,k1,k2) }.B_{\mathfrak{M}}=\{e_{1}\underbrace{p_{1}\ldots p_{1}}_{\text{$k_{1}$ times}}l_{i}\underbrace{p_{2}\ldots p_{2}}_{\text{$k_{2}$ times}}e_{2}\mid\mbox{ $\mathfrak{M}$ can reach $(L_{0},0,0)$ from $(L_{i},k_{1},k_{2})$ }\}.

Now define the interpreting function ww on variables as follows:

w(q)={q}for q𝒱;\displaystyle w(q)=\{q\}\qquad\mbox{for $q\in\mathcal{V}$;}
w(b)={ΞΥ Ξ and Υ are words over Σ such that ΥΞB𝔐 }.\displaystyle w(b)=\{\Xi\Upsilon\mid\mbox{ $\Xi$ and $\Upsilon$ are words over $\Sigma$ such that $\Upsilon\Xi\in B_{\mathfrak{M}}$ }\}.
Lemma 14.

w(𝟏G)={ε}w(\mathbf{1}\wedge G)=\{\varepsilon\}.

Proof.

It is sufficient to show that εw(G)\varepsilon\in w(G), that is, ε\varepsilon belongs to interpretation of all formulae in the big conjunction GG.

First, εw((e1,l0,e2)\b)\varepsilon\in w((e_{1},l_{0},e_{2})\mathop{\backslash}b). Indeed, w((e1,l0,e2)\b)={e1l0e2}\w(b)w((e_{1},l_{0},e_{2})\mathop{\backslash}b)=\{e_{1}l_{0}e_{2}\}\mathop{\backslash}w(b), thus, we have to show that e1l0e2ε=e1l0e2w(b)e_{1}l_{0}e_{2}\varepsilon=e_{1}l_{0}e_{2}\in w(b). This is indeed so by the definition of B𝔐B_{\mathfrak{M}}, since (L0,0,0)(L_{0},0,0) is reachable from itself in zero steps.

Second, we prove that εw(AI)\varepsilon\in w(A_{I}) for each instruction II of 𝔐\mathfrak{M}. Recall that AI=Ψ\Φbb=Ψ\((Φ\b)\b)A_{I}=\Psi\mathop{\backslash}\Phi^{bb}=\Psi\mathop{\backslash}((\Phi\mathop{\backslash}b)\mathop{\backslash}b), and if instruction II changes the configuration from (Li,k1,k2)(L_{i},k_{1},k_{2}) to (Li,k1,k2)(L_{i^{\prime}},k^{\prime}_{1},k^{\prime}_{2}), then the code of the second configuration is obtained from the code of the first one by replacing Ψ\Psi with Φ\Phi. In other words, the code of (Li,k1,k2)(L_{i},k_{1},k_{2}) is Δ1ΨΔ2\Delta_{1}\Psi\Delta_{2} and the code of (Li,k1,k2)(L_{i^{\prime}},k^{\prime}_{1},k^{\prime}_{2}) is Δ1ΦΔ2\Delta_{1}\Phi\Delta_{2}. We have to prove that εw(Ψ)\w((Φ\b)\b)\varepsilon\in w(\Psi)\mathop{\backslash}w((\Phi\mathop{\backslash}b)\mathop{\backslash}b). Since w(Ψ)={Ψ}w(\Psi)=\{\Psi\} (Ψ\Psi contains only letters from 𝒱\mathcal{V}), this means that Ψ\Psi should belong to w((Φ\b)\b)w((\Phi\mathop{\backslash}b)\mathop{\backslash}b).

In turn, Ψw((Φ\b)\b)\Psi\in w((\Phi\mathop{\backslash}b)\mathop{\backslash}b) means that for any word Δw(Φ\b)\Delta\in w(\Phi\mathop{\backslash}b) we have ΔΨw(b)\Delta\Psi\in w(b). The fact that Δw(Φ\b)\Delta\in w(\Phi\mathop{\backslash}b), since w(Φ)={Φ}w(\Phi)=\{\Phi\}, actually means that ΦΔw(b)\Phi\Delta\in w(b). Thus, we have to prove, for an arbitrary Δ\Delta, that if ΦΔw(b)\Phi\Delta\in w(b), then ΔΨw(b)\Delta\Psi\in w(b).

If ΦΔw(b)\Phi\Delta\in w(b), then we have Δ=Δ1Δ2\Delta=\Delta_{1}\Delta_{2}, and Δ2ΦΔ1B𝔐\Delta_{2}\Phi\Delta_{1}\in B_{\mathfrak{M}}. Here Φ\Phi cannot be split between Ξ\Xi and Υ\Upsilon, because any word in B𝔐B_{\mathfrak{M}} should begin with e1e_{1} and end on e2e_{2}. This means that Δ2ΦΔ1\Delta_{2}\Phi\Delta_{1} is a code of some configuration (Li,k1,k2)(L_{i^{\prime}},k^{\prime}_{1},k^{\prime}_{2}), from which 𝔐\mathfrak{M} can reach the final configuration. As noticed above, this means that Δ2ΨΔ1\Delta_{2}\Psi\Delta_{1} encodes a configuration (Li,k1,k2)(L_{i},k_{1},k_{2}), which transforms into (Li,k1,k2)(L_{i^{\prime}},k^{\prime}_{1},k^{\prime}_{2}) by applying instruction II. Therefore, from (Li,k1,k2)(L_{i},k_{1},k_{2}) our Minsky machine can also reach the final state, hence Δ2ΨΔ1B𝔐\Delta_{2}\Psi\Delta_{1}\in B_{\mathfrak{M}}. This yields ΔΨ=Δ1Δ2Ψw(b)\Delta\Psi=\Delta_{1}\Delta_{2}\Psi\in w(b), which is our goal.

Third, consider q\qbbq\mathop{\backslash}q^{bb}, where q𝒱q\in\mathcal{V}. We have to show that εw(q)\w(qbb)\varepsilon\in w(q)\mathop{\backslash}w(q^{bb}), that is, qw(qbb)q\in w(q^{bb}). The latter means that for any Δw(q\b)\Delta\in w(q\mathop{\backslash}b) the word Δq\Delta q should belong to w(b)w(b). This is indeed so: if Δw(q\b)\Delta\in w(q\mathop{\backslash}b), then qΔw(b)q\Delta\in w(b), and since w(b)w(b) is closed under cyclic transpositions, also Δqw(b)\Delta q\in w(b). ∎

Now we are ready to prove Theorem 10.

Proof of Theorem 10.

If ()(*) is true in all Lε\varepsilon-models, it is true in the specific model defined above. By Lemma 14, w(𝟏G)={ε}w(\mathbf{1}\wedge G)=\{\varepsilon\}; w(q)={q}w(q)=\{q\} for any q𝒱q\in\mathcal{V}. Thus, we have

e1p1p1k1 timeslip2p2k2 timese2w(b),e_{1}\underbrace{p_{1}\ldots p_{1}}_{\text{$k_{1}$ times}}l_{i}\underbrace{p_{2}\ldots p_{2}}_{\text{$k_{2}$ times}}e_{2}\in w(b),

and therefore

e1p1p1k1 timeslip2p2k2 timese2B𝔐.e_{1}\underbrace{p_{1}\ldots p_{1}}_{\text{$k_{1}$ times}}l_{i}\underbrace{p_{2}\ldots p_{2}}_{\text{$k_{2}$ times}}e_{2}\in B_{\mathfrak{M}}.

(No cyclic transposition is possible, since e1e_{1} and e2e_{2} should start and end the word.)

By definition of B𝔐B_{\mathfrak{M}}, this means that 𝔐\mathfrak{M} can reach the final state (L0,0,0)(L_{0},0,0), starting from (Li,k1,k2)(L_{i},k_{1},k_{2}). ∎

3.2 Models on Regular Languages with the Unit Constant

Let Th(Lε-models;\,,𝟏)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}) denote the set of all sequents in the language of \\mathop{\backslash}, \wedge, 𝟏\mathbf{1} which are true in all Lε\varepsilon-models, that is, the complete theory of this class of models.

As noticed above, the question of axiomatizing this theory is quite involved. We know that this theory includes 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}), introduced in the previous section, but it is probably much more complicated. For example, as shown in [22], Sobociński’s 3-valued logic 𝐑𝐌3\mathbf{RM}_{3} can be embedded into Th(Lε-models;\,,𝟏)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}).

It follows from Theorem 8 that Th(Lε-models;\,,𝟏)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}) is undecidable. More precisely, it is Σ10\Sigma_{1}^{0}-hard (hard w.r.t. the class of recursively enumerable sets). The upper complexity bound, however, is not known: this theory could possibly be even not recursively enumerable. Having the algorithmic complexity question for Th(Lε-models;\,,𝟏)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}) open, we can still obtain an interesting corollary of our complexity estimations.

Recall the standard notion of regular expression. Regular expressions are constructed from constants 0 and 11 using two binary operations, \cdot and ++, and one unary operation, . The language (R)\mathscr{L}(R) described by a given regular expression RR is defined recursively:

(0)=;\displaystyle\mathscr{L}(0)=\varnothing;
(1)={ε};\displaystyle\mathscr{L}(1)=\{\varepsilon\};
(AB)=(A)(B);\displaystyle\mathscr{L}(A\cdot B)=\mathscr{L}(A)\cdot\mathscr{L}(B);
(A+B)=(A)(B);\displaystyle\mathscr{L}(A+B)=\mathscr{L}(A)\cup\mathscr{L}(B);
(A)=((A))={u1unn0,ui(A)}.\displaystyle\mathscr{L}(A^{*})=\bigl{(}\mathscr{L}(A)\bigr{)}^{*}=\{u_{1}\ldots u_{n}\mid n\geq 0,u_{i}\in\mathscr{L}(A)\}.

Languages described by regular expressions are called regular languages.

By Lregε\varepsilon-models let us denote a subclass of Lε\varepsilon-models in which every variable as interpreted as a regular language, that is, a set of words described by a regular expression. It is well-known that the class of regular languages is closed under intersection (see, for example, [27, Theorem 2.8]). Moreover, it is also closed under division:

Proposition 15.

If AA and BB are regular languages, then so are A\BA\mathop{\backslash}B and B/AB\mathop{/}A.

Proof.

A more well-known fact (see, for example, [27, Exercise 2.3.17a]) is that the class of regular languages is closed under the following modified division operation with the existential quantifier instead of the universal one: AB={uΣ(vA)vuB}A\mathop{\mbox{\raisebox{5.0pt}{\rotatebox{-60.0}{${\sim}$}}}}B=\{u\in\Sigma^{*}\mid(\exists v\in A)\,vu\in B\}. Our “normal” division \\mathop{\backslash} can be reduced to \mathop{\mbox{\raisebox{5.0pt}{\rotatebox{-60.0}{${\sim}$}}}} by the complement operation: A\B=AB¯¯A\mathop{\backslash}B=\overline{A\mathop{\mbox{\raisebox{5.0pt}{\rotatebox{-60.0}{${\sim}$}}}}\overline{B}}, where X¯=ΣX\overline{X}=\Sigma^{*}-X. Since the class of regular languages is closed under \mathop{\mbox{\raisebox{5.0pt}{\rotatebox{-60.0}{${\sim}$}}}} and complement (again, see [27, Theorem 2.8]), it is also closed under \\mathop{\backslash}. The /\mathop{/} case is symmetric. ∎

Thus, in Lregε\varepsilon-models interpretations of all formulae are regular languages.

In the language without the unit constant, namely, \\mathop{\backslash}, /\mathop{/}, \wedge, the theory of Lregε\varepsilon-models coincides with the theory of all Lε\varepsilon-models:

Proposition 16.

Th(Lregε-models;\,/,)=Th(Lε-models;\,/,).\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\mathop{\backslash},\mathop{/},\wedge)=\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\mathop{/},\wedge).

Proof.

On the one hand, the calculus 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge) is sound w.r.t. all Lε\varepsilon-models. On the other hand, as shown by Buszkowski [28], it is complete w.r.t. a class of models which is even narrower than the class of Lregε\varepsilon-models. Namely, 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge) is complete w.r.t. the class of Lε\varepsilon-models in which variables are interpreted by cofinite languages. (A cofinite language is a language which includes all words over a given alphabet, except for a finite set.) In this case, formulae are interpreted by cofinite or finite languages, and any finite or cofinite language is regular. Therefore, both Th(Lregε-models;\,/,)\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\mathop{\backslash},\mathop{/},\wedge) and Th(Lε-models;\,/,)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\mathop{/},\wedge) are axiomatized by the same calculus 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge). ∎

The unit changes things dramatically. With the unit, there is no completeness result, like Theorem 4, but also no equivalence between theories of all Lε\varepsilon-models and Lregε\varepsilon-models.

Theorem 17.

Th(Lregε-models;\,,𝟏)Th(Lε-models;\,,𝟏).\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1})\neq\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}).

Proof.

As follows from Theorem 8, Th(Lε-models;\,,𝟏)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}) is Σ10\Sigma_{1}^{0}-hard. On the other hand, following Buszkowski [29], we can show that Th(Lε-models;\,,𝟏)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}) belongs to the Π10\Pi_{1}^{0} class. Indeed, a sequent belongs to this theory if and only if it is true in all Lregε\varepsilon-models. A concrete sequent includes only a finite number of variables, p1p_{1}, …, pnp_{n}. Thus, a model for this sequent is defined by a finite number of regular expressions R1R_{1}, …, RnR_{n}, which describe the languages w(p1)w(p_{1}), …, w(pn)w(p_{n}). This means that the general truth condition for this sequent can be written down as the following formula:

R1Rn(the sequent is true under interpretationwhere w(pi) is the language of Ri).\forall R_{1}\,\ldots\,\forall R_{n}\bigl{(}\mbox{the sequent is true under interpretation}\\ \mbox{where $w(p_{i})$ is the language of $R_{i}$}\bigr{)}.

Quantifiers R1\forall R_{1}, …, Rn\forall R_{n} can be encoded as quantifiers over natural numbers representing the regular expressions. The quantifier-free part of the formula (truth condition under a concrete ww) is decidable, because all necessary operations on regular expressions are computable. Thus, we get a Π10\Pi_{1}^{0} representation of the set of sequents which are true in all Lregε\varepsilon-models.

It is well known that a set cannot belong to Π10\Pi_{1}^{0} and be Σ10\Sigma_{1}^{0}-hard at the same time. (Otherwise, for any set in Σ10\Sigma_{1}^{0} there would be a computable reduction to a set in Π10\Pi_{1}^{0}, which would yield Σ10Π10\Sigma_{1}^{0}\subseteq\Pi_{1}^{0}, which is not the case.) Therefore,

Th(Lregε-models;\,,𝟏)Th(Lε-models;\,,𝟏).\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1})\neq\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}).

Notice that our proof of Theorem 8 does not apply to Th(Lregε-models;\,,𝟏)\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\linebreak\mathop{\backslash},\wedge,\mathbf{1}), because the language w(b)w(b) there is non-regular (in fact, it is undecidable).

Since the class of Lregε\varepsilon-models is narrower than the class of all Lε\varepsilon-models, we have (by Galois connection) an inverted inclusion of theories:

Th(Lregε-models;\,,𝟏)Th(Lε-models;\,,𝟏).\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1})\supset\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}).

By our Theorem 17, this inclusion is strict. Thus, the other inclusion should fail:

Th(Lregε-models;\,,𝟏)Th(Lε-models;\,,𝟏).\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1})\not\subset\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}).

In other words, there exists a sequent which is true in all Lregε\varepsilon-models, but not in all Lε\varepsilon-models. Our proof, however, is non-constructive, and we do not present a concrete example of such sequent. Constructing such a concrete example is left for further research.

Notice that if we apply the reasoning establishing the upper Π10\Pi_{1}^{0} bound of Th(Lregε-models;\,,𝟏)\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}) to Th(Lε-models;\,,𝟏)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}), we shall have to quantify over arbitrary formal languages w(p1)w(p_{1}), …, w(pn)w(p_{n}). This results in hyperarithmetical quantifiers, and yields only a very high, Π11\Pi_{1}^{1} complexity upper bound for Th(Lε-models;\,,𝟏)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}).

4 Concluding Remarks

In this article, we have investigated language interpretations of natural extensions of the Lambek calculus: with additive operations (\vee and \wedge) and with additive conjunction (\wedge) and the unit constant (𝟏\mathbf{1}).

For extensions with additive connectives (Section 2), we have shown that conjunction and disjunction show significantly different behaviour. It is known that adding both conjunction and disjunction leads to incompleteness due to the distributivity law 𝒟\mathcal{D}. This law is true in all language models, but not derivable in the multiplicative-additive Lambek calculus (𝐌𝐀𝐋𝐂\mathbf{MALC}). Adding only conjunction, however, still provides completeness. Any sequent in the language of \\mathop{\backslash}, /\mathop{/}, \wedge (but not \vee) that is derivable with the help of 𝒟\mathcal{D}, is also derivable without it. For disjunction the situation is opposite: there exists a sequent in the language of \\mathop{\backslash}, /\mathop{/}, \vee, which is derivable using 𝒟\mathcal{D}, but not derivable without it.

Moreover, this difference between \wedge and \vee keeps valid for systems with permutation and/or weakening structural rules, that is, intuitionistic linear (𝐈𝐋𝐋\mathbf{ILL}), and affine (𝐈𝐀𝐋\mathbf{IAL}) logics and affine 𝐌𝐀𝐋𝐂\mathbf{MALC}.

For the extension of the Lambek calculus with the unit, 𝟏\mathbf{1}, it is well-known that its standard axiomatization in the style of linear logic does not give an Lε\varepsilon-complete system. In Section 3, we present a system in the language \,,𝟏\mathop{\backslash},\wedge,\mathbf{1}, where rules for 𝟏\mathbf{1} reflect natural algebraic properties of the empty word in the algebra of formal languages. This system is denoted by 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}). We do not claim Lε\varepsilon-completeness of 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}). Instead, we consider the whole range of logics between 𝐋+ε(\,,𝟏)\mathbf{L}^{+\varepsilon}(\mathop{\backslash},\wedge,\mathbf{1}) and the Lε\varepsilon-complete system denoted by Th(Lε-models;\,,𝟏)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}). For any logic within this range, we show that it is undecidable; more precisely, Σ10\Sigma_{1}^{0}-complete. As a corollary, we also show that, in the language of \,,𝟏\mathop{\backslash},\wedge,\mathbf{1}, the complete theory of all Lε\varepsilon-models is different from that of Lregε\varepsilon-models, where formulae are interpreted by regular languages.

A preliminary version of this article was presented at WoLLIC 2019 and published in its lecture notes [13]. Let us briefly list the results which are new in this article, compared to the WoLLIC paper.

  • 1.

    In the language without additive conjunction, we show incompleteness not only for 𝐌𝐀𝐋𝐂\mathbf{MALC}, but also for its extensions: 𝐌𝐀𝐋𝐂\mathbf{MALC}^{\boldsymbol{*}}, 𝐀𝐌𝐀𝐋𝐂\mathbf{AMALC}^{\boldsymbol{*}}, 𝐈𝐋𝐋\mathbf{ILL}, and 𝐈𝐀𝐋\mathbf{IAL}.

  • 2.

    We prove that 𝐌𝐀𝐋𝐂(\,/,)\mathbf{MALC}(\mathop{\backslash},\mathop{/},\wedge) is a conservative fragment of 𝐌𝐀𝐋𝐂\mathbf{MALC} extended with the distributity law 𝒟\mathcal{D}. Moreover, we prove similar results for 𝐌𝐀𝐋𝐂\mathbf{MALC}^{\boldsymbol{*}}, 𝐀𝐌𝐀𝐋𝐂\mathbf{AMALC}^{\boldsymbol{*}}, 𝐈𝐋𝐋\mathbf{ILL}, and 𝐈𝐀𝐋\mathbf{IAL}.

  • 3.

    We prove that, in the language including 𝟏\mathbf{1}, the theory of all Lε\varepsilon-models is different from the theory of LRegε\varepsilon-models, in which formulae are interpreted by regular languages. In the language of \,/,\mathop{\backslash},\mathop{/},\wedge (without 𝟏\mathbf{1}), the corresponding theories coincide due to a completeness result by Buszkowski [28].

While in Section 2 we have presented a quite completed study, Section 3 leaves many questions open for further investigations. Among these, we would like to emphasize the following ones.

  • 1.

    The question of axiomatization, or even recursive enumerability for complete theories Th(Lε-models;\,,𝟏)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}) and Th(Lregε-models;\,,𝟏)\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}) is still open, and potentially very hard. Notice that these theories are different (Theorem 17) and that for Th(Lregε-models;\,,𝟏)\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}) enumerability will immediately mean decidability.

  • 2.

    A possibly easier question would be to construct a concrete formula distinguishing Th(Lε-models;\,,𝟏)\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}) and Th(Lregε-models;\,,𝟏)\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\mathop{\backslash},\wedge,\mathbf{1}). That is, we are looking for an explicit example for Theorem 17.

  • 3.

    Without the unit, we know that

    𝐌𝐀𝐋𝐂(\,/,)=Th(Lε-models;\,/,)=Th(Lregε-models;\,/,).\mathbf{MALC}^{\boldsymbol{*}}(\mathop{\backslash},\mathop{/},\wedge)=\mathrm{Th}(\mbox{L$\varepsilon$-models};\mathop{\backslash},\mathop{/},\wedge)=\mathrm{Th}(\mbox{{\sc Lreg}$\varepsilon$-models};\mathop{\backslash},\mathop{/},\wedge).

    By the completeness theorem of Pentus [4, 5], the first equality is also true for the language of \,/,\mathop{\backslash},\mathop{/},\cdot (with product instead of additive conjunction). There are two open questions. First, whether Pentus’ theorem is true for the language with both conjunctions (\,/,,\mathop{\backslash},\mathop{/},\cdot,\wedge). Second, whether Pentus’ theorem is true for Lregε\varepsilon-models. Both questions are questions of making Pentus’ result stronger. Recalling that Pentus’ proofs are quite sophisticated, these questions are also probably very hard.

References

  • [1] J. Lambek, The mathematics of sentence structure, Amer. Math. Monthly 65 (1958) 154–170.
  • [2] R. Moot, C. Retoré, The logic of categorial grammars: a deductive account of natural language syntax and semantics, Vol. 6850 of LNCS, Springer, 2012.
  • [3] J. Lambek, On the calculus of syntactic types, in: Structure of Language and Its Mathematical Aspects, Vol. 12 of Proc. Symposia Appl. Math., AMS, 1961, pp. 166–178.
  • [4] M. Pentus, Models for the Lambek calculus, Annals of Pure and Applied Logic 75 (1–2) (1995) 179–213.
  • [5] M. Pentus, Free monoid completeness of the Lambek calculus allowing empty premises, in: Proc. Logic Colloquium ’96, Vol. 12 of Lect. Notes Logic, Springer, 1998, pp. 171–209.
  • [6] W. Buszkowski, Compatibility of a categorial grammar with an associated category system, Zeitschr. Math. Logik Grundl. Math. (Math. Logic Q.) 28 (1982) 229–238.
  • [7] M. Kanazawa, The Lambek calculus enriched with additional connectives, J. Logic Lang. Inform. 1 (2) (1992) 141–171.
  • [8] V. M. Abrusci, A comparison between Lambek syntactic calculus and intuitionistic linear logic, Zeitschr. math. Logik Grundl. Math. (Math. Logic Q.) 36 (1990) 11–15.
  • [9] J.-Y. Girard, Linear logic, Theor. Comput. Sci. 50 (1) (1987) 1–102.
  • [10] J. Lambek, Deductive systems and categories II. Standard constructions and closed categories, in: Category Theory, Homology Theory and Their Applications I, Vol. 86 of Lect. Notes Math., Springer, 1969, pp. 76–122.
  • [11] H. Ono, Y. Komori, Logics without contraction rule, J. Symb. Log. 50 (1) (1985) 169–201.
  • [12] M. Kozak, Distributive full Lambek calculus has the finite model property, Studia Logica 91 (2009) 201–216.
  • [13] M. Kanovich, S. Kuznetsov, A. Scedrov, L-models and R-models for Lambek calculus enriched with additives and the multiplicative unit, in: WoLLIC 2019: Logic, Language, Information, and Computation, Vol. 11541 of Lect. Notes Comput. Sci., 2019, pp. 373–391.
  • [14] N. Tamura, A linear logic prover (llprover), http://bach.istc.kobe-u.ac.jp/llprover/ (1998–2007).
  • [15] P. Jipsen, Deciding equations in residuated lattices, http://www1.chapman.edu/~jipsen/reslat/.
  • [16] M. Okada, K. Terui, The finite model property for various fragments of intuitionistic linear logic, J. Symb. Log. 64 (2) (1999) 790–802.
  • [17] M. Pentus, The conjoinability relation in Lambek calculus and linear logic, J. Log. Lang. Inform. 3 (1994) 121–140.
  • [18] H. Andréka, S. Mikulás, I. Németi, The equational theory of Kleene lattices, Theoretical Computer Science 412 (52) (2011) 7099–7108.
  • [19] H. Andréka, S. Mikulás, Axiomatizability of positive algebras of binary relations, Algebra Universalis 66 (2011) 7–34.
  • [20] H. Andréka, I. Németi, I. Sain, Some new landmarks on the roadmap of two dimensional logics, in: J. van Eijck, A. Visser (Eds.), Logic and Information Flow, MIT Press, 1994, pp. 163–169.
  • [21] A. Kurucz, I. Németi, I. Sain, A. Simon, Undecidable varieties of semilattice-ordered semigroups, of Boolean algebras with operators, and logics extending Lambek calculus, Bulletin of the IGPL 1 (1) (1993) 91–98.
  • [22] S. L. Kuznetsov, Trivalent logics arising from L-models for the Lambek calculus with constants, Journal of Applied Non-Classical Logics 24 (1–2) (2014) 132–137.
  • [23] M. Kanovich, The direct simulation of Minsky machines in linear logic, in: Advances in Linear Logic, Vol. 222 of London Mathematical Society Lecture Notes, Cambridge University Press, 1995, pp. 123–145.
  • [24] Y. Lafont, The undecidability of second order linear logic without exponentials, Journal of Symbolic Logic 61 (2) (1996) 541–548.
  • [25] Y. Lafont, A. Scedrov, The undecidability of second order multiplicative linear logic, Information and Computation 125 (1) (1996) 46–51.
  • [26] M. L. Minsky, Recursive unsolvability of Post’s problem of “Tag” and other topics in theory of Turing machines, Annals of Mathematics 74 (3) (1961) 437–455.
  • [27] A. V. Aho, J. D. Ullman, The theory of parsing, translation, and compiling. Vol. I: Parsing, Prentice-Hall, 1972.
  • [28] W. Buszkowski, The finite model property for BCI and related systems, Studia Logica 57 (1996) 303–323.
  • [29] W. Buszkowski, On the complexity of the equational theory of relational action algebras, in: RelMiCS 2006: Relations and Kleene Algebra in Computer Science, Vol. 4136 of LNCS, Springer, 2006, pp. 106–119.