This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

11institutetext: Institute of Logic and Cognition, Department of Philosophy,
Sun Yat-sen University, Guangzhou, 510275, China.

The Complexity and Expressive Power of Second-Order Extended Logic

Shiguang Feng 11    Xishun Zhao 11
Abstract

We study the expressive powers of SO-HORN, SO-HORNr and SO-HORN∗r on all finite structures. We show that SO-HORNr, SO-HORN∗r, FO(LFP) coincide with each other and SO-HORN is proper sublogic of SO-HORNr. To prove this result, we introduce the notions of DATALOG program, DATALOGr program and their stratified versions, S-DATALOG program and S-DATALOGr program. It is shown that, on all structures, DATALOGr and S-DATALOGr are equivalent and DATALOG is a proper sublogic of DATALOGr. SO-HORN and SO-HORNr can be treated as the negations of DATALOG and DATALOGr, respectively. We also show that SO-EHORNr logic which is an extended version of SO-HORN captures co-NP on all finite structures.

1 Introduction

Descriptive complexity is a bridge between complexity theory and mathematical logic. It uses logic systems to measure the resources that are necessary for some complexity classes. The first capture result is Fagin’s theorem. In 1974, Fagin showed that existential second-order logic(\existsSO) captures NP on all finite structures. This is a seminal work that has been followed by many studies in the characterization of other complexity classes by means of logical systems, the research of NP-complete approximation problems from a descriptive point of view, and so on. Through the effort of many researchers in this area, almost every important complexity classes have their corresponding capture logic systems. First-order logic (FO) which is a very powerful logic in classic model theory is too weak when only consider finite structures. By augmenting some recursion operators, researchers got many powerful logic. FO(IFP), FO(PFP), FO(DTC) and FO(TC) are the logic that equip first-order logic with inflationary fixpoint operator, partial fixpoint operator, deterministic transitive closure operator and transitive closure operator, respectively. Immerman and Vardi both showed that FO(IFP) captures PTIME on ordered structures in 1986 and 1982, respectively. Abiteboul and Vainu showed that FO(PFP) captures PSPACE on ordered structures in 1983. Immerman showed that FO(DTC) and FO(TC) capture L and NL on ordered structures, respectively, in 1987. In[4], Grädel introduced SO-HORN logic which is a fragment of second-order logic and showed that SO-HORN captures PTIME on ordered structures. Whereas all these results are obtained on ordered structures, a very important open problem in descriptive complexity is that whether there is an effective logic captures PTIME on all structures. It turns out that if there is no such a logic, then PNPP\neq NP. Due to the definition of SO-HORN, on all structures, it is closed under substructures and can not express some first-order definable properties. Grädel introduced another version of this logic in [6] which is called SO-HORN in this paper. SO-HORN allows first-order formula in it, but we show that it still is not expressive enough to capture PTIME on all structures. In[21], we defined SO-HORNr logic which is an revised version of SO-HORN and showed that it captures PTIME on ordered structures. We are interested in finding that whether this logic can be a candidate to capture PTIME on all finite structures. To study the expressive powers of these logic on all finite structures, we introduce the notions of DATALOG program and DATALOGr program and their stratified versions (S-DATALOG, S-DATALOGr, respectively). In this paper, we show that SO-HORNr has the same expressive power as FO(LFP) which is a very powerful logic. We also study the expressive power of SO-EHORNr and prove that SO-EHORNr captures co-NP on all finite structures. This logic is an extended version of SO-HORN and has been shown that it can capture co-NP on ordered structures.

This paper is organized as follows: In section 2, we set up notations and terminologies. In section 3, we study the expressive powers of SO-HORN, SO-HORNr and SO-HORN∗r on all finite structures. We use DATALOG and S-DATALOG program as intermediate logic, and introduce the notions of DATALOG program, DATALOGr program and their stratified versions. We show that SO-HORNr, SO-HORN∗r, FO(LFP) coincide with each other and SO-HORN is a proper sublogic of SO-HORNr. In section 4, we use the normal forms of Σ11\Sigma_{1}^{1} formulas to show that SO-EHORNr captures co-NP on all finite structures. We prove this result in two versions. Section 5 is the conclusion of this paper.

2 Preliminaries

We assume the reader has already some familiarity with mathematical logic and complexity theory. Nonetheless, we give a brief description of some basic definitions.

A vocabulary is a finite set of relation symbols P,Q,R,P,Q,R,\cdots and constant symbols 𝐜,𝐝,\mathbf{c},\mathbf{d},\cdots. Each relation symbol is equipped with a natural number r1r\geq 1, its arity. We use lower-case Greek letters τ,σ,\tau,\sigma,\cdots to denote a vocabulary. Given a vocabulary τ={P1,P2,,Pn,𝐜1,𝐜2,,𝐜m}\tau=\{P_{1},P_{2},\cdots,P_{n},\mathbf{c}_{1},\mathbf{c}_{2},\cdots,\mathbf{c}_{m}\}, a τ\tau-structure 𝒜\mathcal{A} is a tuple

A,P1A,P2A,,PnA,𝐜1A,𝐜2A,,𝐜mA\left\langle A,P_{1}^{A},P_{2}^{A},\cdots,P_{n}^{A},\mathbf{c}_{1}^{A},\mathbf{c}_{2}^{A},\cdots,\mathbf{c}_{m}^{A}\right\rangle

where AA is the domain of 𝒜\mathcal{A}, for each relation symbol PiP_{i} of arity ri(1in)r_{i}\,(1\leq i\leq n), PiAAriP_{i}^{A}\subseteq A^{r_{i}} and for each constant symbol 𝐜j(1jm)\mathbf{c}_{j}\,(1\leq j\leq m), 𝐜jAA\mathbf{c}_{j}^{A}\in A. From now on, we use calligraphic letters 𝒜,,\mathcal{A},\mathcal{B},\cdots and capital letters A,B,A,B,\dots to denote structures and the corresponding domains, respectively. We use |A||A| to denote the cardinality of set AA and a¯\bar{a} to denote a sequence a1,a2,,ana_{1},a_{2},\cdots,a_{n} or a tuple (a1,a2,,an)(a_{1},a_{2},\cdots,a_{n}) of elements (variables). A structure 𝒜\mathcal{A} is finite if its domain AA is a finite set. Without otherwise specified, we restrict ourself to finite structures in this paper. Let STRUC(τ)STRUC(\tau) denote the set of all finite τ\tau-structures, and if \mathcal{L} is a logic, then (τ)\mathcal{L}(\tau) denotes the set of all formulas of \mathcal{L} over the vocabulary τ\tau. Given a formula ϕ(τ)\phi\in\mathcal{L}(\tau), define

Mod(ϕ)={𝒜𝒜 is a τ-structure and 𝒜ϕ}Mod(\phi)=\{\mathcal{A}\mid\mathcal{A}\textrm{ is a $\tau$-structure and }\mathcal{A}\models\phi\}

Let 1\mathcal{L}_{1} and 2\mathcal{L}_{2} be two logic, we use 12\mathcal{L}_{1}\subseteq\mathcal{L}_{2} to denote the expressive power of 2\mathcal{L}_{2} is no less than that of 1\mathcal{L}_{1}, i.e. for any vocabulary τ\tau and any ϕ1(τ)\phi\in\mathcal{L}_{1}(\tau), there exists ψ2(τ)\psi\in\mathcal{L}_{2}(\tau) such that Mod(ϕ)=Mod(ψ)Mod(\phi)=Mod(\psi). 12\mathcal{L}_{1}\subset\mathcal{L}_{2} denotes that 2\mathcal{L}_{2} is strictly more expressive than 1\mathcal{L}_{1}, and 12\mathcal{L}_{1}\equiv\mathcal{L}_{2} denotes that 1\mathcal{L}_{1} has the same expressive power as 2\mathcal{L}_{2}.

Define τ<=τ{<,succ,𝐦𝐢𝐧,𝐦𝐚𝐱}\tau_{<}=\tau\cup\{<,succ,\mathbf{min},\mathbf{max}\}, where τ\tau is a vocabulary. A τ<\tau_{<}-structure 𝒜\mathcal{A} is ordered if the reduct 𝒜{<,succ,𝐦𝐢𝐧,𝐦𝐚𝐱}\mathcal{A}\mid\{<,succ,\mathbf{min},\mathbf{max}\} is an ordering (that is, <<, succsucc, 𝐦𝐢𝐧\mathbf{min} and 𝐦𝐚𝐱\mathbf{max} are interpreted by the ordering relation, successor relation, the least and the last element of the ordering, respectively.) Let STRUC<(τ)STRUC_{<}(\tau) denote the set of all ordered structures over τ<\tau_{<}. Given a formula ϕ(τ<)\phi\in\mathcal{L}(\tau_{<}), define

Mod<(ϕ)={𝒜ϕ𝒜 is an ordered τ<-structure}Mod_{<}(\phi)=\{\mathcal{A}\models\phi\mid\mathcal{A}\text{ is an ordered }\tau_{<}\text{-structure}\}

Let \mathcal{L} be a logic and 𝒞\mathcal{C} be a complexity class, we say logic \mathcal{L} captures complexity class 𝒞\mathcal{C} if the following two conditions are satisfied:

1)

The data complexity of \mathcal{L} is in 𝒞\mathcal{C}.

  • For any vocabulary τ\tau and any closed formula ϕ(τ)\phi\in\mathcal{L}(\tau), the membership problem of Mod(ϕ)Mod(\phi) is in 𝒞\mathcal{C}.

2)

𝒞\mathcal{C} is expressible in \mathcal{L}.

  • For any vocabulary τ\tau and any set KSTRUC(τ)K\subseteq STRUC(\tau), if the membership problem of KK is in 𝒞\mathcal{C}, then ϕ(τ)\exists\phi\in\mathcal{L}(\tau) such that K=Mod(ϕ)K=Mod(\phi).

We say \mathcal{L} captures 𝒞\mathcal{C} on ordered structures if we only consider ordered structures and it satisfies:

1)

For any vocabulary τ\tau and any closed formula ϕ(τ<)\phi\in\mathcal{L}(\tau_{<}), the membership problem of Mod<(ϕ)Mod_{<}(\phi) is in 𝒞\mathcal{C}.

2)

For any vocabulary τ\tau and any set KSTRUC<(τ)K\subseteq STRUC_{<}(\tau), if the membership problem of KK is in 𝒞\mathcal{C}, then ϕ(τ<)\exists\phi\in\mathcal{L}(\tau_{<}) such that K=Mod<(ϕ)K=Mod_{<}(\phi).

We recall some notions and results of quantified Boolean formulas (QBF). A QBF formula has the form:

Φ=Q1x1Qnxnφ\Phi=Q_{1}x_{1}\cdots Q_{n}x_{n}\varphi

where Qi{,}Q_{i}\in\{\forall,\exists\} and φ\varphi is a propositional CNF formula over Boolean variables. A literal xix_{i} or ¬xi\neg x_{i} is called universal (resp. existential) if QiQ_{i} is \forall (resp. \exists). If every clause in φ\varphi contains at most one positive literal (resp. positive existential literal) then Φ\Phi is called a quantified Horn formula (QHORN) (resp. quantified extended Horn formula (QEHORN)). The evaluation problem for QBF is PSPACE-complete [9], for QHORN is in PTIME, whereas it remains PSPACE-complete for QEHORN. However, for each fixed k1k\geq 1, the evaluation problem for QEHORN formulas with prefix type x¯1y¯1x¯ky¯k\forall\overline{x}_{1}\exists\overline{y}_{1}\cdots\forall\overline{x}_{k}\exists\overline{y}_{k} (here x¯i,y¯i\overline{x}_{i},\overline{y}_{i} are sequences of Boolean variables) is co-NP complete[1, 2].

3 Expressive powers of Horn logic

Definition 1

Second-order Horn logic, denoted by SO-HORN, is the set of second-order formulas of the form

Q1R1QmRmx¯(C1Cn)Q_{1}R_{1}\cdots Q_{m}R_{m}\forall\bar{x}(C_{1}\wedge\cdots\wedge C_{n})

where Qi{,}Q_{i}\in\{\forall,\exists\}, R1,,RmR_{1},\cdots,R_{m} are relation symbols and C1,,CnC_{1},\cdots,C_{n} are Horn clauses with respect to R1,,RmR_{1},\cdots,R_{m}, more precisely, each CjC_{j} is an implication of the form

α1αlβ1βqH\alpha_{1}\wedge\cdots\wedge\alpha_{l}\wedge\beta_{1}\wedge\cdots\wedge\beta_{q}\rightarrow H

where

1)

each αs\alpha_{s} is an atomic formula Rix¯R_{i}\bar{x},

2)

each βt\beta_{t} is either an atomic formula Py¯P\bar{y} or a negated atomic formula ¬Py¯\neg P\bar{y} where P{R1,,Rm}P\notin\{R_{1},\cdots,R_{m}\},

3)

HH is either an atomic formula Rkz¯R_{k}\overline{z} or the Boolean constant \bot (for false).

  • If we replace condition 2) by

    2’)

    each βt\beta_{t} is a first-order formula ϕ(y¯)\phi(\bar{y}) containing no R1,,RmR_{1},\cdots,R_{m},

    we denote this logic by SO-HORN.

  • If we replace condition 1) by

    1’)

    each αs\alpha_{s} is either an atomic formula Rix¯R_{i}\bar{x} or y¯Riy¯z¯\forall\bar{y}R_{i}\bar{y}\bar{z},

    we call this logic second-order revised Horn logic, denoted by SO-HORNr.

  • If we replace conditions 1), 2) by 1’), 2’), respectively, we denote this logic by SO-HORN∗r.

On ordered structures, Grädel[4, 6] showed that SO-HORN and SO-HORN capture PTIME and we[21] showed that SO-HORN∗r captures PTIME. These four logic coincide with each other. On all structures, it turns out that SO-HORN is strictly less expressive than FO(LFP) which is a very powerful logic. It is not hard to see that the following holds from the definition above:

SO-HORNSO-HORNSO-HORNrSO-HORNr\begin{array}[]{cccc}&\subseteq&\textrm{SO-HORN}^{\mathrm{*}}\\ \textrm{SO-HORN}&&&\subseteq\textrm{SO-HORN}^{\mathrm{*r}}\\ &\subseteq&\textrm{SO-HORN}^{\mathrm{r}}\end{array}

An important open problem in descriptive complexity is that whether there is an effective logic captures PTIME on all structures. We are interesting in finding the expressive powers of these Horn logic on all structures. At first, we give a lemma that is very useful. Let ϕ\phi be a logic formula, we use ϕ[α/β]\phi[\alpha/\beta] to denote the formula obtained by replacing the formula α\alpha by β\beta in ϕ\phi.

Lemma 1

Every second-order formula xRϕ(x,x¯)\forall x\exists R\phi(x,\bar{x}) is equivalent to a formula of the form Rxϕ[Ry¯1/Rxy¯1,,Ry¯n/Rxy¯n](x,x¯)\exists R^{\prime}\forall x\phi[R\bar{y}_{1}/R^{\prime}x\bar{y}_{1},\cdots,R\bar{y}_{n}/R^{\prime}x\bar{y}_{n}](x,\bar{x}) where xx occurs free in ϕ\phi, RR is an rr-ary relation symbol, RR^{\prime} is an r+1r+1-ary relation symbol and Ry¯1,,Ry¯nR\bar{y}_{1},\cdots,R\bar{y}_{n} are all the different atomic formulas that occur in ϕ\phi and over the relation symbol RR.

Proof

Given a structure 𝒜\mathcal{A} and RAr+1R^{\prime}\subseteq A^{r+1}, write

Ra={(a1,,ar)(a,a1,,ar)R}R_{a}=\{(a_{1},\cdots,a_{r})\mid(a,a_{1},\cdots,a_{r})\in R^{\prime}\}

then for any (a1,,ar)Ar(a_{1},\cdots,a_{r})\in A^{r}

(a1,,ar)Raiff(a,a1,,ar)R\begin{array}[]{ccc}(a_{1},\cdots,a_{r})\in R_{a}&\textrm{iff}&(a,a_{1},\cdots,a_{r})\in R^{\prime}\end{array} (1)

Set ϕ(x,x¯)=ϕ[Ry¯1/Rxy¯1,,Ry¯n/Rxy¯n](x,x¯)\phi^{\prime}(x,\bar{x})=\phi[R\bar{y}_{1}/R^{\prime}x\bar{y}_{1},\cdots,R\bar{y}_{n}/R^{\prime}x\bar{y}_{n}](x,\bar{x}). Suppose x¯=x1,,xk\bar{x}=x_{1},\cdots,x_{k}, for any b¯=b1,,bkA\bar{b}=b_{1},\cdots,b_{k}\in A, RAr+1R^{\prime}\subseteq A^{r+1} and aAa\in A, by (1) it is easy to check that

(𝒜,Ra)ϕ[a,b¯]iff(𝒜,R)ϕ[a,b¯]\begin{array}[]{ccc}(\mathcal{A},R_{a})\models\phi[a,\bar{b}]&\textrm{iff}&(\mathcal{A},R^{\prime})\models\phi^{\prime}[a,\bar{b}]\end{array} (2)

For any structure 𝒜\mathcal{A} and b¯=b1,,bk\bar{b}=b_{1},\cdots,b_{k},

𝒜xRϕ[b¯]ifffor any aAthere exists  RaArsuch that (𝒜,Ra)ϕ[a,b¯]iff(𝒜,R)ϕ[a,b¯] for any aA (by (2), where  R=aA{(a,a1,,ar)(a1,,ar)Ra} )iff(𝒜,R)xϕ[b¯]iff𝒜Rxϕ[b¯]\begin{array}[]{lll}\mathcal{A}\models\forall x\exists R\phi[\bar{b}]&\textrm{iff}&\textrm{for any $a\in A$there exists $\textrm{ }R_{a}\subseteq A^{r}$}\\ &&\textrm{such that $(\mathcal{A},R_{a})\models\phi[a,\bar{b}]$}\\ \\ &\textrm{iff}&(\mathcal{A},R^{\prime})\models\phi^{\prime}[a,\bar{b}]\textrm{ for any $a\in A$\,(by \eqref{eq:all_before_exist}, where }\textrm{ }\\ &&R^{\prime}=\underset{a\in A}{\bigcup}\{(a,a_{1},\cdots,a_{r})\mid(a_{1},\cdots,a_{r})\in R_{a}\}\textrm{ )}\\ \\ &\textrm{iff}&(\mathcal{A},R^{\prime})\models\forall x\phi^{\prime}[\bar{b}]\\ \\ &\textrm{iff}&\mathcal{A}\models\exists R^{\prime}\forall x\phi^{\prime}[\bar{b}]\end{array}

which completes the proof.

It turns out that SO-HORN collapses to its existential fragment ((SO\exists)-HORN, i.e. the formulas where all the second-order quantifiers are existential[4]). In fact, many second-order formulas equal to the formulas that only have existential second-order prefix. We extend this result to a more general form in the following proposition.

Proposition 1

Every second-order formula of the form

Q1R1QmRmx¯(j=1𝑙(αj1αjhjβj1βjqjHj))Q_{1}R_{1}\cdots Q_{m}R_{m}\forall\bar{x}\left(\overset{l}{\underset{j=1}{\bigwedge}}(\alpha_{j1}\wedge\cdots\wedge\alpha_{jh_{j}}\wedge\beta_{j1}\wedge\cdots\wedge\beta_{jq_{j}}\rightarrow H_{j})\right) (3)

where Qi{,}Q_{i}\in\{\forall,\exists\} and

1)

each αkt\alpha_{kt} is either an atomic formula Rix¯R_{i}\bar{x} or y¯Riy¯z¯\forall\bar{y}R_{i}\bar{y}\bar{z},

2)

each βef\beta_{ef} is a second-order formula that does not contain R1,,RmR_{1},\cdots,R_{m},

3)

HjH_{j} is either an atomic formula Rkz¯R_{k}\overline{z} or the Boolean constant \bot (for false),

is equivalent to a formula of the form

R1Rnx¯(j=1l(αj1αjhjβj1βjqjHj))\exists R^{\prime}_{1}\cdots\exists R^{\prime}_{n}\forall\bar{x}^{\prime}\left(\overset{l^{\prime}}{\underset{j=1}{\bigwedge}}(\alpha^{\prime}_{j1}\wedge\cdots\wedge\alpha^{\prime}_{jh^{\prime}_{j}}\wedge\beta^{\prime}_{j1}\wedge\cdots\wedge\beta^{\prime}_{jq^{\prime}_{j}}\rightarrow H^{\prime}_{j})\right) (4)

where

1)

each αkt\alpha^{\prime}_{kt} is either an atomic formula Rix¯R^{\prime}_{i}\bar{x} or y¯Riy¯z¯\forall\bar{y}R^{\prime}_{i}\bar{y}\bar{z},

2)

each βef\beta^{\prime}_{e^{\prime}f^{\prime}} is some βef\beta_{ef} in (3) that does not contain R1,,Rm,R1,,RnR_{1},\cdots,R_{m},R^{\prime}_{1},\cdots,R^{\prime}_{n},

3)

HjH^{\prime}_{j} is either an atomic formula Rkz¯R^{\prime}_{k}\overline{z}, or the Boolean constant \bot (for false).

Proof

This proof is adapted from that of Theorem 5 in [4]. It is suffice to prove for formulas of the form PR1Rnx¯ϕ\forall P\exists R_{1}\cdots\exists R_{n}\forall\bar{x}\phi. For an arbitrary second-order prefix we can successively remove the innermost universal second-order quantifier. The main idea is that for any structure 𝒜\mathcal{A}

𝒜PR1Rnx¯ϕiff(𝒜,P)R1Rnx¯ϕ for any Pthat are false at at most one point.\begin{array}[]{ccl}\mathcal{A}\models\forall P\exists R_{1}\cdots\exists R_{n}\forall\bar{x}\phi&\textrm{iff}&(\mathcal{A},P)\models\exists R_{1}\cdots\exists R_{n}\forall\bar{x}\phi\textrm{ for any }P\\ &&\textrm{that are false at at most one point.}\end{array}

Suppose that the arity of PP is kk and y¯=y1,,yk\bar{y}=y_{1},\cdots,y_{k} are new variables that don’t occur in ϕ\phi, we only need to show that

𝒜PR1Rnx¯ϕiff𝒜y¯R1Rnx¯ϕ[Pz¯/z¯y¯]R1Rnx¯ϕ[Pz¯/z¯=z¯]\begin{array}[]{ccr}\mathcal{A}\models\forall P\exists R_{1}\cdots\exists R_{n}\forall\bar{x}\phi&\textrm{iff}&\mathcal{A}\models\forall\bar{y}\exists R{}_{1}\cdots\exists R{}_{n}\forall\bar{x}\phi[P\bar{z}/\bar{z}\neq\bar{y}]\\ &&\bigwedge\exists R_{1}\cdots\exists R_{n}\forall\bar{x}\phi[P\bar{z}/\bar{z}=\bar{z}]\end{array}

The """\Rightarrow" direction is easy. For the """\Leftarrow" direction, suppose

𝒜y¯R1Rnx¯ϕ[Pz¯/z¯y¯]R1Rnx¯ϕ[Pz¯/z¯=z¯]\mathcal{A}\models\forall\bar{y}\exists R{}_{1}\cdots\exists R{}_{n}\forall\bar{x}\phi[P\bar{z}/\bar{z}\neq\bar{y}]\wedge\exists R_{1}\cdots\exists R_{n}\forall\bar{x}\phi[P\bar{z}/\bar{z}=\bar{z}]

The case that P=AkP=A^{k} or PP is false at only one tuple is trivial. Let Pa¯P_{\bar{a}} be the subset of AkA^{k} that only the tuple a¯\bar{a} is not in PP, we use R1Pa¯,,RnPa¯R_{1}^{P_{\bar{a}}},\cdots,R_{n}^{P_{\bar{a}}} to denote the corresponding relations such that (𝒜,Pa¯,R1Pa¯,,RnPa¯)x¯ϕ(\mathcal{A},P_{\bar{a}},R_{1}^{P_{\bar{a}}},\cdots,R_{n}^{P_{\bar{a}}})\models\forall\bar{x}\phi. For the other cases, consider an arbitrary PAkP\subset A^{k}, set RiP=a¯PRiPa¯(1in)R_{i}^{P}=\underset{\bar{a}\notin P}{\bigcap}R_{i}^{P_{\bar{a}}}\,(1\leq i\leq n). We proceed to show that

(𝒜,P,R1P,,RnP)x¯ϕ(\mathcal{A},P,R_{1}^{P},\cdots,R_{n}^{P})\models\forall\bar{x}\phi

Conversely, suppose that

(𝒜,P,R1P,,RnP)x¯ϕ(\mathcal{A},P,R_{1}^{P},\cdots,R_{n}^{P})\nvDash\forall\bar{x}\phi

then there exist b¯A\bar{b}\in A and a clause α1αhβ1βqHj\alpha_{1}\wedge\cdots\wedge\alpha_{h}\wedge\beta_{1}\wedge\cdots\wedge\beta_{q}\rightarrow H_{j} such that

(𝒜,P,R1P,,RnP)α1αhβ1βq[b¯](\mathcal{A},P,R_{1}^{P},\cdots,R_{n}^{P})\models\alpha_{1}\wedge\cdots\wedge\alpha_{h}\wedge\beta_{1}\wedge\cdots\wedge\beta_{q}[\bar{b}] (5)

and

(𝒜,P,R1P,,RnP)Hj[b¯](\mathcal{A},P,R_{1}^{P},\cdots,R_{n}^{P})\nvDash H_{j}[\bar{b}] (6)

Since P=a¯PPa¯P=\underset{\bar{a}\notin P}{\bigcap}P_{\bar{a}} and RiP=a¯PRiPa¯(1in)R_{i}^{P}=\underset{\bar{a}\notin P}{\bigcap}R_{i}^{P_{\bar{a}}}\,(1\leq i\leq n), it follows that for each a¯P\bar{a}\notin P and Q{P,R1,,Rn}Q\in\{P,R_{1},\cdots,R_{n}\},

(𝒜,P,R1P,,RnP)Qc¯(𝒜,Pa¯,R1Pa¯,,RnPa¯)Qc¯(\mathcal{A},P,R_{1}^{P},\cdots,R_{n}^{P})\models Q\bar{c}\Rightarrow(\mathcal{A},P_{\bar{a}},R_{1}^{P_{\bar{a}}},\cdots,R_{n}^{P_{\bar{a}}})\models Q\bar{c}

and

(𝒜,P,R1P,,RnP)y¯Qy¯c¯(𝒜,Pa¯,R1Pa¯,,RnPa¯)y¯Qy¯c¯(\mathcal{A},P,R_{1}^{P},\cdots,R_{n}^{P})\models\forall\bar{y}Q\bar{y}\bar{c}^{\prime}\Rightarrow(\mathcal{A},P_{\bar{a}},R_{1}^{P_{\bar{a}}},\cdots,R_{n}^{P_{\bar{a}}})\models\forall\bar{y}Q\bar{y}\bar{c}^{\prime}

where c¯\bar{c} and c¯\bar{c}^{\prime} tuples of elements of AA. Because P,R1,,RnP,R_{1},\cdots,R_{n} don’t occur in β1,,βq\beta_{1},\cdots,\beta_{q}, by (5) we see that for each a¯P\bar{a}\notin P

(𝒜,Pa¯,R1Pa¯,,RnPa¯)α1αhβ1βq[b¯](\mathcal{A},P_{\bar{a}},R_{1}^{P_{\bar{a}}},\cdots,R_{n}^{P_{\bar{a}}})\models\alpha_{1}\wedge\cdots\wedge\alpha_{h}\wedge\beta_{1}\wedge\cdots\wedge\beta_{q}[\bar{b}]
  • If HjH_{j} is \bot in (6), then (𝒜,Pa¯,R1Pa¯,,RnPa¯)(\mathcal{A},P_{\bar{a}},R_{1}^{P_{\bar{a}}},\cdots,R_{n}^{P_{\bar{a}}})\nvDash\bot, contrary to

    (𝒜,Pa¯,R1Pa¯,,RnPa¯)x¯ϕ(\mathcal{A},P_{\bar{a}},R_{1}^{P_{\bar{a}}},\cdots,R_{n}^{P_{\bar{a}}})\models\forall\bar{x}\phi
  • If HjH_{j} is some Qz¯Q\bar{z} where Q{P,R1,,Rn}Q\in\{P,R_{1},\cdots,R_{n}\}, there must be a a¯P\bar{a}\notin P such that (𝒜,Pa¯,R1Pa¯,,RnPa¯)Hj[b¯](\mathcal{A},P_{\bar{a}},R_{1}^{P_{\bar{a}}},\cdots,R_{n}^{P_{\bar{a}}})\nvDash H_{j}[\bar{b}], contrary to

    (𝒜,Pa¯,R1Pa¯,,RnPa¯)x¯ϕ(\mathcal{A},P_{\bar{a}},R_{1}^{P_{\bar{a}}},\cdots,R_{n}^{P_{\bar{a}}})\models\forall\bar{x}\phi

By repeating use of lemma 1, y¯R1Rnx¯ϕ[Pz¯/z¯y¯]R1Rnx¯ϕ[Pz¯/z¯=z¯]\forall\bar{y}\exists R{}_{1}\cdots\exists R{}_{n}\forall\bar{x}\phi[P\bar{z}/\bar{z}\neq\bar{y}]\wedge\exists R_{1}\cdots\exists R_{n}\forall\bar{x}\phi[P\bar{z}/\bar{z}=\bar{z}] is equivalent to a formula of the required form.

Corollary 1

SO-HORNr and SO-HORN∗r are both collapse to their existential fragments.

From now on, unless explicitly stated, we will restrict ourselves to the existential fragments of these logic defined above in the rest of this paper. To compare the expressive power of the four logic on all structures, we use DATALOG and S-DATALOG program as intermediate logic, and introduce the notion of DATALOGr program. For the detailed definition and semantics of DATALOG program and S-DATALOG program we refer the reader to [8].

Definition 2

A DATALOG program Π\Pi over a vocabulary τ\tau is a finite set of rules of the form

βα1,,αl\beta\leftarrow\alpha_{1},\cdots,\alpha_{l}

where l0l\geq 0 and

(1)

each αi\alpha_{i} is either an atomic formula or a negated atomic formula or zero-ary relation symbol,

(2)

β\beta is either an atomic formula Rx¯R\bar{x} or a zero-ary relation symbol QQ where RR and QQ only occur positively in each rule of Π\Pi.

α1,,αl\alpha_{1},\cdots,\alpha_{l} constitute the body of the rule, β\beta is the head of the rule. Every relation symbols in the head of some rule of Π\Pi are intentional, all the other symbols are extensional. We use (τ,Π)int(\tau,\Pi)_{int} and (τ,Π)ext(\tau,\Pi)_{ext} to denote the set of intentional and extensional symbols(constant symbols are extensional symbols), respectively. We allow zero-ary relation symbols here to be able to compare the expressive power of DATALOG with other logic. A zero-ary relation has the Boolean value TRUE or FALSE. Let QQ be a zero-ary relation symbol and 𝒜\mathcal{A} be a structure, the interpretation of QQ in 𝒜\mathcal{A} is: “QA=Q^{A}=\emptyset” means “QA=FALSEQ^{A}=FALSE”, “QA={}Q^{A}=\{\emptyset\}” means “QA=TRUEQ^{A}=TRUE”.

Example 1

This is a DATALOG program over the vocabulary τ={E,R,Q,𝐬,𝐭}\tau=\{E,R,Q,\mathbf{s},\mathbf{t}\} where (τ,Π)int={R,Q}(\tau,\Pi)_{int}=\{R,Q\}, (τ,Π)ext={E,𝐬,𝐭}(\tau,\Pi)_{ext}=\{E,\mathbf{s},\mathbf{t}\} and 𝐬,𝐭\mathbf{s},\mathbf{t} are constant symbols.

Π:RxyExyRxyExz,RzyQR𝐬𝐭\begin{array}[]{cccl}\Pi:&Rxy&\leftarrow&Exy\\ &Rxy&\leftarrow&Exz,\,Rzy\\ &Q&\leftarrow&R\mathbf{st}\end{array}

Instead of giving a detailed definition of the semantics of DATALOG programs, we present it briefly using an example. Given a DATALOG program Π\Pi and a (τ,Π)ext(\tau,\Pi)_{ext}-structure, we can apply all the rules of Π\Pi simultaneously to generate consecutive stages of the intentional symbols. Consider the above example, let R(0)=R_{(0)}=\emptyset, Q(0)=Q_{(0)}=\emptyset and 𝒜\mathcal{A} be a (τ,Π)ext(\tau,\Pi)_{ext}-structure. Suppose R(i)R_{(i)} and Q(i)Q_{(i)} are known, R(i+1)R_{(i+1)} and Q(i+1)Q_{(i+1)} can be evaluated by

R(i+1)xyExyR(i+1)xyExz,R(i)zyQ(i+1)R(i)𝐬𝐭\begin{array}[]{lcl}R_{(i+1)}xy&\leftarrow&Exy\\ R_{(i+1)}xy&\leftarrow&Exz,\,R_{(i)}zy\\ Q_{(i+1)}&\leftarrow&R_{(i)}\mathbf{st}\end{array}

where Q(i+1)={}Q_{(i+1)}=\{\emptyset\} iff 𝒜R(i)𝐬𝐭\mathcal{A}\models R_{(i)}\mathbf{st}. Because RR occurs only positively in the body of each rules, we can reach a fixed-point in polynomial many steps with respect to |A||A|. Let R()=n0R(n)R_{(\infty)}=\underset{n\geq 0}{\bigcup}R_{(n)} denote the fixed-point. Then 𝒜\mathcal{A} gives rise to a τ\tau-structure

𝒜[Π]=(𝒜,R(),Q())\mathcal{A}[\Pi]=(\mathcal{A},R_{(\infty)},Q_{(\infty)})

A DATALOG formula has the form (Π,P)t¯(\Pi,P)\bar{t} where PP is an rr-ary intentional relation symbol and t¯=t1,,tr\bar{t}=t_{1},\cdots,t_{r} are variables that don’t occur in Π\Pi. Given a (τ,Π)ext(\tau,\Pi)_{ext}-structure 𝒜\mathcal{A} and a¯=a1,,arA\bar{a}=a_{1},\cdots,a_{r}\in A

𝒜(Π,P)[a¯]iff(a1,,ar)P𝒜[Π]\begin{array}[]{ccc}\mathcal{A}\models(\Pi,P)[\bar{a}]&\textrm{iff}&(a_{1},\cdots,a_{r})\in P^{\mathcal{A}[\Pi]}\end{array}

If PP is a zero-ary relation symbol, then

𝒜(Π,P)iffP𝒜[Π]is TRUE\begin{array}[]{ccc}\mathcal{A}\models(\Pi,P)&\textrm{iff}&P^{\mathcal{A}[\Pi]}\end{array}\textrm{is TRUE}

In the above example, given a,bAa,b\in A,

𝒜(Π,R)[a,b]iff(a,b)R𝒜[Π]iffthere is a directed path from a to b,𝒜(Π,Q)iffQ𝒜[Π] is TRUEiffthere is a directed path from 𝐬 to 𝐭.\begin{array}[]{lcccc}\mathcal{A}\models(\Pi,R)[a,b]&\textrm{iff}&(a,b)\in R^{\mathcal{A}[\Pi]}&\textrm{iff}&\textrm{there is a directed path from a to b,}\\ \\ \mathcal{A}\models(\Pi,Q)&\textrm{iff}&Q^{\mathcal{A}[\Pi]}\textrm{ is TRUE}&\textrm{iff}&\textrm{there is a directed path from $\mathbf{s}$ to $\mathbf{t}$.}\end{array}
Definition 3

We define three extensions of DATALOG programs:

  • If we allow first-order formulas only over extensional symbols in the body of rules of a DATALOG program, we denote this logic program by DATALOG program.

  • If we allow formulas of the form y¯Ry¯z¯\forall\bar{y}R\bar{y}\bar{z} where RR is an intentional relation symbol in the body of rules of a DATALOG program, we denote this logic program by DATALOGr program.

  • If we allow both first-order formulas only over extensional symbols and formulas of the form y¯Ry¯z¯\forall\bar{y}R\bar{y}\bar{z} where RR is an intentional relation symbol in the body of rules of a DATALOG program, we denote this logic program by DATALOG∗r program.

Example 2

This is a DATALOG∗r program over τ={R,Q,P1,,Pn}\tau=\{R,Q,P_{1},\cdots,P_{n}\} where (τ,Π)int={R,Q}(\tau,\Pi)_{int}=\{R,Q\}, (τ,Π)ext={P1,,Pn}(\tau,\Pi)_{ext}=\{P_{1},\cdots,P_{n}\} and ϕ(x,y)\phi(x,y) is a first-order formula over (τ,Π)ext(\tau,\Pi)_{ext}.

Π:Rxyϕ(x,y)Rxyϕ(x,z),RzyQxyRxy\begin{array}[]{cccl}\Pi:&Rxy&\leftarrow&\phi(x,y)\\ &Rxy&\leftarrow&\phi(x,z),\,Rzy\\ &Qx&\leftarrow&\forall yRxy\end{array}

Relation RR denotes the transitive closure of the graph defined by ϕ(x,y)\phi(x,y). QQ denotes the set of vertices that can reach every vertex in this graph.

For a DATALOG ( DATALOG, DATALOGr, DATALOG∗r, respectively ) program Π\Pi, if there is a zero-ary relation symbol occurring in the body of some rules, e.g. QQ, we replace every occurrences of QQ by QxQ^{\prime}x, where QQ^{\prime} is a new relation symbol and xx is a new variable that do not occur in Π\Pi, and then add the rule QQxQ\leftarrow Q^{\prime}x in Π\Pi. Let Π\Pi^{\prime} denote the resulting DATALOG ( DATALOG, DATALOGr, DATALOG∗r, respectively ) program. It is easily check that the fixed-points of all common intentional relations of Π\Pi and Π\Pi^{\prime} coincide on all structures. So for technical reason, unless stated explicitly, we restrict that the zero-ary relation symbols only occur in the head of a rule.

It turns out that many first-order definable properties can not be defined by DATALOG formulas. DATALOG and DATALOG∗r programs which are equipped with first-order formulas are more expressive than DATALOG programs. In the following we show that DATALOGr formulas is enough to express all first-order definable properties. We say a DATALOG ( DATALOG, DATALOGr, DATALOG∗r, respectively) formula (Π,P)t¯(\Pi,P)\bar{t} is bounded if there exists a fixed number k0k\geq 0 such that P(k)=P()P_{(k)}=P_{(\infty)} for all structures.

Proposition 2

Every first-order formula is equivalent to a bounded DATALOGr formula.

Proof

We assign every first-order formula ϕ(x¯)\phi(\bar{x}) an equivalent bounded DATALOGr formula (Πϕ,Pϕ)t¯(\Pi_{\phi},P_{\phi})\bar{t}. We prove this result by induction on the quantifier rank of ϕ(x¯)\phi(\bar{x}).

With no loss of generality, suppose ϕ(x¯)\phi(\bar{x}) is in DNF form

Q1x1Qmxm(C1Cn)Q_{1}x_{1}\cdots Q_{m}x_{m}(C_{1}\vee\cdots\vee C_{n})

where Qi{,}Q_{i}\in\{\forall,\exists\} and each CjC_{j} is a conjunction of atomic or negated atomic formulas. Assume v¯=v1,,vr\bar{v}=v_{1},\cdots,v_{r} are all the variables in C1CnC_{1}\vee\cdots\vee C_{n},

  • for each Cj=αj1αj2αjkjC_{j}=\alpha_{j1}\wedge\alpha_{j2}\wedge\cdots\wedge\alpha_{jk_{j}}, set

    ΠCj={PCjv¯αj1,αj2,,αjkj}\Pi_{C_{j}}=\{P_{C_{j}}\bar{v}\leftarrow\alpha_{j1},\alpha_{j2},\cdots,\alpha_{jk_{j}}\}
  • set

    ΠC1Cn=j=1𝑛ΠCjj=1𝑛{PC1Cnv¯PCjv¯}\Pi_{C_{1}\vee\cdots\vee C_{n}}=\overset{n}{\underset{j=1}{\bigcup}}\Pi_{C_{j}}\cup\overset{n}{\underset{j=1}{\bigcup}}\{P_{C_{1}\vee\cdots\vee C_{n}}\bar{v}\leftarrow P_{C_{j}}\bar{v}\}

It’s easily seen that C1Cn(v¯)C_{1}\vee\cdots\vee C_{n}(\bar{v}) and (ΠC1Cn,PC1Cn)t¯(\Pi_{C_{1}\vee\cdots\vee C_{n}},P_{C_{1}\vee\cdots\vee C_{n}})\bar{t} are equivalent. Suppose ψ(x¯,x)\psi(\bar{x},x) and (Πψ,Pψ)t¯(\Pi_{\psi},P_{\psi})\bar{t} are equivalent and PϕP_{\phi} is a new relation symbol not in Πψ\Pi_{\psi},

  • if ϕ(x¯)=xψ(x¯,x)\phi(\bar{x})=\forall x\psi(\bar{x},x), set

    Πϕ=Πψ{Pϕx¯xPψx¯x}\Pi_{\phi}=\Pi_{\psi}\cup\{P_{\phi}\bar{x}\leftarrow\forall xP_{\psi}\bar{x}x\}
  • if ϕ(x¯)=xψ(x¯,x)\phi(\bar{x})=\exists x\psi(\bar{x},x), set

    Πϕ=Πψ{Pϕx¯Pψx¯x}\Pi_{\phi}=\Pi_{\psi}\cup\{P_{\phi}\bar{x}\leftarrow P_{\psi}\bar{x}x\}

It is easy to check that ϕ(x¯)\phi(\bar{x}) and (Πϕ,Pϕ)t¯(\Pi_{\phi},P_{\phi})\bar{t} are equivalent. To prove (Πϕ,Pϕ)t¯(\Pi_{\phi},P_{\phi})\bar{t} is bounded, observe that each PCjP_{C_{j}} can reach its fixed-point in one step and PC1CnP_{C_{1}\vee\cdots\vee C_{n}} can reach its fixed-point in two steps. Suppose the quantifier rank of ϕ(x¯)\phi(\bar{x}) is mm, PϕP_{\phi} can reach its fixed-point in m+2m+2 steps.

Corollary 2

DATALOG{}^{*}\subseteq DATALOGr{}^{r}\equiv DATALOG∗r

Proof

Suppose (Π,P)t¯(\Pi,P)\bar{t} is a DATALOG( DATALOG∗r ) formula over vocabulary τ\tau and ϕ1(x¯1),,ϕn(x¯n)\phi_{1}(\bar{x}_{1}),\cdots,\phi_{n}(\bar{x}_{n}) are the no atomic (or negated atomic) first-order formulas that occur in Π\Pi and over (τ,Π)ext(\tau,\Pi)_{ext}. Let (Πϕi,Pϕi)t¯(1in)(\Pi_{\phi_{i}},P_{\phi_{i}})\bar{t}\,(1\leq i\leq n) be the corresponding DATALOGr formulas that equivalent to ϕi(x¯i)(1in)\phi_{i}(\bar{x}_{i})\,(1\leq i\leq n), respectively. Set

Π=Π[ϕ1(x¯1)/Pϕ1x¯1,,ϕn(x¯n)/Pϕnx¯n]i=1𝑛Πϕi\Pi^{\prime}=\Pi[\phi_{1}(\bar{x}_{1})/P_{\phi_{1}}\bar{x}_{1},\cdots,\phi_{n}(\bar{x}_{n})/P_{\phi_{n}}\bar{x}_{n}]\cup\overset{n}{\underset{i=1}{\bigcup}}\Pi_{\phi_{i}}

where Π[ϕ1(x¯1)/Pϕ1x¯1,,ϕn(x¯n)/Pϕnx¯n]\Pi[\phi_{1}(\bar{x}_{1})/P_{\phi_{1}}\bar{x}_{1},\cdots,\phi_{n}(\bar{x}_{n})/P_{\phi_{n}}\bar{x}_{n}] denotes the logic program obtained by replacing each ϕi(x¯i)\phi_{i}(\bar{x}_{i}) by Pϕix¯i(1in)P_{\phi_{i}}\bar{x}_{i}\,(1\leq i\leq n) in Π\Pi. The DATALOGr formula (Π,P)t¯(\Pi^{\prime},P)\bar{t} is equivalent to (Π,P)t¯(\Pi,P)\bar{t}.

A DATALOG program Π\Pi is positive if no atomic occurs negatively in any rule of Π\Pi. M. Ajtai and Y. Gurevich[12] showed that a positive DATALOG formula is bounded if and only if it is equivalent to a (existential positive) first-order formula. For a DATALOG formula, if it is bounded then it equals to a first-order formula, but the converse is false. They find a DATALOG formula which is equivalent to a first-order formula but not bounded. We show that this statement is true for DATALOGr formulas.

Theorem 3.1

For any DATALOGr formula (Π,P)t¯(\Pi,P)\bar{t} the following are equivalent:

(i)

(Π,P)t¯(\Pi,P)\bar{t} is equivalent to a first-order formula.

(ii)

(Π,P)t¯(\Pi,P)\bar{t} is equivalent to a bounded DATALOGr formula.

Proof

(i) \Rightarrow (ii) is by Proposition 2. The proof for (ii) \Rightarrow (i) is similar to that of Proposition 9.3.1 in [8], we give only the main ideas. Observing that given a DATALOGr formula (Π,P)t¯(\Pi,P)\bar{t} and a fixed kk, the kk-th stage P(k)P_{(k)} in the evaluation is expressible by a first-order formula. If (Π,P)t¯(\Pi,P)\bar{t} is bounded then there exists an l0l\geq 0 such that P(l)=P()P_{(l)}=P_{(\infty)} on all structures.

Remark 1

If a DATALOG formula is equivalent to a bounded DATALOG formula, then the DATALOG formula itself is bounded. But if a DATALOGr formula is equivalent to a bounded DATALOGr formula, itself is not necessary bounded. We give a counterexample. Consider an atomic formula PxPx, let << and succsucc be two 2-ary relation symbols and 𝐦𝐢𝐧\mathbf{min} be a constant symbol. Construct two first-order formulas ψ1(<,𝐦𝐢𝐧)\psi_{1}(<,\mathbf{min}) and ψ2(succ)\psi_{2}(succ), where ψ1(<,𝐦𝐢𝐧)\psi_{1}(<,\mathbf{min}) says that << is a linear ordering relation and 𝐦𝐢𝐧\mathbf{min} is the least element, ψ2(succ)\psi_{2}(succ) says that succsucc is a successor relation. By Proposition 2, ψ1\psi_{1} is equivalent to a DATALOGr formula (Π1,P<)(\Pi_{1},P_{<}), ¬ψ1\neg\psi_{1} is equivalent to a DATALOGr formula (Π1,P)(\Pi^{\prime}_{1},P_{\nless}), ψ2\psi_{2} is equivalent to a DATALOGr formula (Π2,Psucc)(\Pi_{2},P_{succ}) and ¬ψ2\neg\psi_{2} is equivalent to a DATALOGr formula (Π2,P¬succ)(\Pi^{\prime}_{2},P_{\neg succ}). Set

Π3:Q𝐦𝐢𝐧P<,PsuccQyQx,succ(x,y),P<,PsuccPxQx,Px,P<,PsuccPxPx,P<,P¬succPxPx,P\begin{array}[]{lrcl}\Pi_{3}:&Q\mathbf{min}&\leftarrow&P_{<},P_{succ}\\ &Qy&\leftarrow&Qx,succ(x,y),P_{<},P_{succ}\\ &P^{\prime}x&\leftarrow&Qx,Px,P_{<},P_{succ}\\ &P^{\prime}x&\leftarrow&Px,P_{<},P_{\neg succ}\\ &P^{\prime}x&\leftarrow&Px,P_{\nless}\end{array}

and Π=Π1Π2Π1Π2Π3\Pi=\Pi_{1}\cup\Pi_{2}\cup\Pi^{\prime}_{1}\cup\Pi^{\prime}_{2}\cup\Pi_{3}. The DATALOGr program Π3\Pi_{3} says that if << is a linear ordering relation, 𝐦𝐢𝐧\mathbf{min} is the least element and succsucc is a successor relation, then it checks all elements one by one that whether they have the property PP. Otherwise, just let P=PP^{\prime}=P. The DATALOGr formula (Π,P)t(\Pi,P^{\prime})t is equivalent to PxPx which is a first-order formula, but for a {P,<,succ,𝐦𝐢𝐧}\{P,<,succ,\mathbf{min}\}-structure 𝒜\mathcal{A} where << interpreted as a linear ordering relation, 𝐦𝐢𝐧\mathbf{min} interpreted as the least element and succsucc interpreted as a successor relation, PP^{\prime} reach its fixed-point in more than |A||A| steps.

Corollary 2 shows that DATALOG programs is a sublogic of DATALOGr programs. In fact, it is proper. Before proving this result, we review stratified DATALOG program ( by short: S-DATALOG program ) which is more powerful than DATALOG program.

Definition 4

A stratified DATALOG program Σ\Sigma, denoted by S-DATALOG program Σ\Sigma, is a sequence Π0,Π1,,Πn\Pi_{0},\Pi_{1},\cdots,\Pi_{n} of DATALOG programs over vocabularies τ0,τ1,,τn\tau_{0},\tau_{1},\cdots,\tau_{n}, respectively, such that (τm+1,Πm+1)ext=τm(0m<n)(\tau_{m+1},\Pi_{m+1})_{ext}=\tau_{m}\,(0\leq m<n).

Given a S-DATALOG program Σ=(Π0,Π1,,Πn)\Sigma=(\Pi_{0},\Pi_{1},\cdots,\Pi_{n}) and a (τ0,Π0)ext(\tau_{0},\Pi_{0})_{ext}-structure 𝒜\mathcal{A}, we set

𝒜[Σ]=((𝒜[Π0])[Π1])[Πn]\mathcal{A}[\Sigma]=(\ldots(\mathcal{A}[\Pi_{0}])[\Pi_{1}]\ldots)[\Pi_{n}]

A S-DATALOG formula has the form (Σ,P)t¯(\Sigma,P)\bar{t} where PP is an rr-ary intentional relation symbol of one of the constitutes Πi\Pi_{i} and t¯=t1,,tr\bar{t}=t_{1},\cdots,t_{r} are variables that do not occur in Σ\Sigma.

Example 3

The following is an S-DATALOG program Σ=(Π0,Π1)\Sigma=(\Pi_{0},\Pi_{1}) over the vocabulary τ0={E,R},τ1={E,R,P}\tau_{0}=\{E,R\},\tau_{1}=\{E,R,P\} where (τ0,Π0)int={R}(\tau_{0},\Pi_{0})_{int}=\{R\}, (τ0,Π0)ext={E}(\tau_{0},\Pi_{0})_{ext}=\{E\}, (τ1,Π1)int={P}(\tau_{1},\Pi_{1})_{int}=\{P\} and (τ1,Π1)ext={E,R}(\tau_{1},\Pi_{1})_{ext}=\{E,R\}.

Π0:RxyExyRxyExz,RzyΠ1:Pxy¬Rxy\begin{array}[]{crcl}\Pi_{0}:&Rxy&\leftarrow&Exy\\ &Rxy&\leftarrow&Exz,Rzy\end{array}\begin{array}[]{cccc}\Pi_{1}:&Pxy&\leftarrow&\neg Rxy\\ \\ \end{array}

For any (τ0,Π0)ext(\tau_{0},\Pi_{0})_{ext}-structure 𝒜\mathcal{A} and a,bAa,b\in A,

𝔄(Σ,R)[a,b]iff(a,b)R𝒜[Σ]iffthere is a directed path from a to b,𝔄(Σ,P)[a,b]iff(a,b)P𝒜[Σ]iffthere is no directed path from a to b.\begin{array}[]{lcccc}\mathfrak{A}\models(\Sigma,R)[a,b]&\textrm{iff}&(a,b)\in R^{\mathcal{A}[\Sigma]}&\textrm{iff}&\textrm{there is a directed path from a to b,}\\ \\ \mathfrak{A}\models(\Sigma,P)[a,b]&\textrm{iff}&(a,b)\in P^{\mathcal{A}[\Sigma]}&\textrm{iff}&\textrm{there is no directed path from a to b.}\end{array}
Definition 5

In Definition 4,

  • if Σ\Sigma is a sequence Π0,Π1,,Πn\Pi_{0},\Pi_{1},\cdots,\Pi_{n} of DATALOG programs, we denote this logic program by S-DATALOG program,

  • if Σ\Sigma is a sequence Π0,Π1,,Πn\Pi_{0},\Pi_{1},\cdots,\Pi_{n} of DATALOGr programs, we denote this logic program by S-DATALOGr program.

Proposition 3

S-DATALOG \equiv S-DATALOG

Proof

The main idea of the proof is that every first-order formula ϕ(x¯)\phi(\bar{x}) is equivalent to a S-DATALOG formula (Σϕ,Pϕ)t¯(\Sigma_{\phi},P_{\phi})\bar{t}. So we can replace ϕ(x¯)\phi(\bar{x}) in S-DATALOG program Σ\Sigma by Pϕx¯P_{\phi}\bar{x} and add Σϕ\Sigma_{\phi} to Σ\Sigma. The details are left to the reader.

Theorem 3.2

DATALOGr{}^{r}\equiv S-DATALOGr{}^{r}\equiv FO(LFP)

Proof

We prove by showing that DATALOGr{}^{r}\subseteq S-DATALOGr{}^{r}\subseteq FO(LFP) \subseteq DATALOGr. The containment DATALOGr{}^{r}\subseteq S-DATALOGr is trivial. To prove S-DATALOGr{}^{r}\subseteq FO(LFP), we first consider the case when (Π,P)t¯(\Pi,P)\bar{t} is a DATALOGr formula. For each intentional relation symbol QQ in Π\Pi, we construct a first-order formula

ϕQ(x¯Q)={y¯(γ1γl)Qx¯Qγ1,,γlΠ and y¯are the free variables in γ1,,γlexcept x¯Q}\begin{array}[]{r}\phi_{Q}(\bar{x}_{Q})=\bigvee\{\exists\bar{y}(\gamma_{1}\wedge\cdots\wedge\gamma_{l})\mid Q\bar{x}_{Q}\leftarrow\gamma_{1},\cdots,\gamma_{l}\in\Pi\textrm{ and }\bar{y}\\ \textrm{are the free variables in $\gamma_{1},\cdots,\gamma_{l}$except $\bar{x}_{Q}$}\}\end{array}

By the semantics of DATALOGr, (Π,P)t¯(\Pi,P)\bar{t} is equivalent to the FO(S-LFP) formula (for details of FO(S-LFP) we refer the reader to [8])

[S-LFPx¯P,P,x¯Q1,Q1,,x¯Qn,QnϕP,ϕQ1,,ϕQn]t¯[\textrm{S-LFP}_{\bar{x}_{P},\,P,\,\bar{x}_{Q_{1}},\,Q_{1},\cdots,\bar{x}_{Q_{n}},\,Q_{n}}\phi_{P},\phi_{Q_{1}},\cdots,\phi_{Q_{n}}]\bar{t}

which is equivalent to a FO(LFP) formula. For an S-DATALOGr formula (Σ,P)t¯(\Sigma,P)\bar{t} which over the program Σ=(Π0,Π1,,Πn)\Sigma=(\Pi_{0},\Pi_{1},\cdots,\Pi_{n}), we prove by induction on nn. For each formula (Π0,P)t¯(\Pi_{0},P)\bar{t} we find an equivalent FO(LFP) formula as above. Suppose every formula (Πi,P)t¯(0ij)(\Pi_{i},P)\bar{t}\,(0\leq i\leq j), where PP is an intentional relation symbol of Πi\Pi_{i}, has an equivalent FO(LFP) formula, we replace Px¯P\bar{x} by the equivalent FO(LFP) formula in Πj+1\Pi_{j+1} and deal with Πj+1\Pi_{j+1} as a DATALOGr program.

To prove FO(LFP) \subseteq DATALOGr, we use the normal form of FO(LFP) formulas. It has been shown that every FO(LFP) formula ϕ(x¯)\phi(\bar{x}) is equivalent to a formula of the form (Theorem 9.4.2, [8])

u[LFPz¯,Zψ(x¯,z¯)]u~\exists u[\textrm{LFP}_{\bar{z},Z}\psi(\bar{x},\bar{z})]\tilde{u}

where ψ\psi is a first-order formula. Without loss of generality, we assume that ψ\psi is in DNF form

Q1x1Qmxm(C1Cn)Q_{1}x_{1}\cdots Q_{m}x_{m}(C_{1}\vee\cdots\vee C_{n})

where Qi{,}Q_{i}\in\{\forall,\exists\} and each CjC_{j} is a conjunction of atomic or negated atomic formulas. By Proposition 2 we know that there is a DATALOGr formula (Π,P)x¯z¯(\Pi,P)\bar{x}\bar{z} such that ψ(x¯,z¯)\psi(\bar{x},\bar{z}) and (Π,P)x¯z¯(\Pi,P)\bar{x}\bar{z} are equivalent. Set

Π=Π[Zz¯/Zx¯z¯]{Zx¯z¯Px¯z¯,Qx¯Zx¯u~}\Pi^{\prime}=\Pi[Z\bar{z}/Z\bar{x}\bar{z}]\cup\{Z\bar{x}\bar{z}\leftarrow P\bar{x}\bar{z},Q\bar{x}\leftarrow Z\bar{x}\tilde{u}\}

where Π[Zz¯/Zx¯z¯]\Pi[Z\bar{z}/Z\bar{x}\bar{z}] denotes the logic program obtained by replacing each Zz¯Z\bar{z} by Zx¯z¯Z\bar{x}\bar{z} in Π\Pi. By the definition of FO(LFP) we know that ZZ occurs only positively in each CjC_{j}, so Π\Pi^{\prime} is a DATALOGr program and (Π,Q)t¯(\Pi^{\prime},Q)\bar{t} is equivalent to ϕ(x¯)\phi(\bar{x}).

P. Kolaitis[16] showed that S-DATALOG is equivalent to EFP, H. Ebbinghaus and J. Flum[8] showed that S-DATALOG is equivalent to BFP. Both EFP and BFP are proper subsets of FO(LFP). Combining Proposition 3 and Theorem 3.2 we conclude the following corollary.

Corollary 3

DATALOG{}^{*}\subset DATALOGr

It is natural to try to relate Horn logic to Logic programs. First we prove a lemma that shows a property of SO-HORN formulas. We say a formula ϕ\phi is closed under substructures if for any structure 𝒜\mathcal{A} and any 𝒜\mathcal{B}\subseteq\mathcal{A}, 𝒜ϕ\mathcal{A}\models\phi implies ϕ\mathcal{B}\models\phi. A formula ϕ\phi is closed under extensions if for any structure 𝒜\mathcal{A} and any 𝒜\mathcal{A}\subseteq\mathcal{B}, 𝒜ϕ\mathcal{A}\models\phi implies ϕ\mathcal{B}\models\phi. The following lemma shows that adding a second-order quantifier does not change these preservation properties of a formula.

Lemma 2

If ϕ\phi is a second-order formula closed under substructures (extensions), then both Rϕ\exists R\phi and Rϕ\forall R\phi are closed under substructures (extensions).

Proof

Given a structure 𝒜\mathcal{A} and an rr-ary relation RR over 𝒜\mathcal{A}, if 𝒜\mathcal{B}\subseteq\mathcal{A}, then we use RBrR\cap B^{r} to denote the reduct of RR to \mathcal{B}. Suppose that ϕ\phi is closed under substructures, then for any structure 𝒜\mathcal{A} and 𝒜\mathcal{B}\subseteq\mathcal{A},

𝒜Rϕthere exists a RArsuch that (𝒜,R)ϕ(,RBr)ϕ(ϕ is closed under substructures)Rϕ.\begin{array}[]{ccl}\mathcal{A}\models\exists R\phi&\Rightarrow&\textrm{there exists a }R\subseteq A^{r}\\ &&\textrm{such that }(\mathcal{A},R)\models\phi\\ &\Rightarrow&(\mathcal{B},R\cap B^{r})\models\phi\\ &&(\phi\textrm{ is closed under substructures})\\ &\Rightarrow&\mathcal{B}\models\exists R\phi.\end{array}
𝒜Rϕ(𝒜,R)ϕ for any RAr(,RBr)ϕ for any RAr(ϕ is closed under substructures)(,R)ϕ for any RBrRϕ.\begin{array}[]{ccl}\mathcal{A}\models\forall R\phi&\Rightarrow&(\mathcal{A},R)\models\phi\textrm{ for any }R\subseteq A^{r}\\ &\Rightarrow&(\mathcal{B},R\cap B^{r})\models\phi\textrm{ for any }R\subseteq A^{r}\\ &&(\phi\textrm{ is closed under substructures})\\ &\Rightarrow&(\mathcal{B},R^{\prime})\models\phi\textrm{ for any }R^{\prime}\subseteq B^{r}\\ &\Rightarrow&\mathcal{B}\models\forall R\phi.\end{array}

If ϕ\phi is closed under extensions, then for any structure 𝒜\mathcal{A} and 𝒜\mathcal{A}\subseteq\mathcal{B}

𝒜Rϕthere exists a RArsuch that (A,R)ϕ(,R)ϕ(ϕ is closed under extensions)Rϕ.\begin{array}[]{ccl}\mathcal{A}\models\exists R\phi&\Rightarrow&\textrm{there exists a }R\subseteq A^{r}\\ &&\textrm{such that }(A,R)\models\phi\\ &\Rightarrow&(\mathcal{B},R)\models\phi\\ &&(\phi\textrm{ is closed under extensions})\\ &\Rightarrow&\mathcal{B}\models\exists R\phi.\end{array}
𝒜Rϕ(𝒜,R)ϕ for any RAr(𝒜,RAr)ϕ for any RBr(,R)ϕ for any RBr(ϕ is closed under extensions)Rϕ.\begin{array}[]{ccl}\mathcal{A}\models\forall R\phi&\Rightarrow&(\mathcal{A},R)\models\phi\textrm{ for any }R\subseteq A^{r}\\ &\Rightarrow&(\mathcal{A},R^{\prime}\cap A^{r})\models\phi\textrm{ for any }R^{\prime}\subseteq B^{r}\\ &\Rightarrow&(\mathcal{B},R^{\prime})\models\phi\textrm{ for any }R^{\prime}\subseteq B^{r}\\ &&(\phi\textrm{ is closed under extensions})\\ &\Rightarrow&\mathcal{B}\models\forall R\phi.\end{array}
Corollary 4

SO-HORN is closed under substructures.

Proof

It’s easily seen that for a SO-HORN formula Φ=Q1R1QnRnx¯ϕ\Phi=Q_{1}R_{1}\cdots Q_{n}R_{n}\forall\bar{x}\phi where Qi{,}(1in)Q_{i}\in\{\forall,\exists\}\,(1\leq i\leq n), the first-order part x¯ϕ\forall\bar{x}\phi is closed under substructures. By Lemma 2, Φ\Phi is closed under substructures.

For any formula ϕ\phi, ϕ\phi is closed under substructures if and only if ¬ϕ\neg\phi is closed under extensions. M. Ajtai and Y. Gurevich[12] mentioned that DATALOG formulas are closed under extensions. The following proposition implies that, in a sense, DATALOG is equivalent to the negation of SO-HORN.

Proposition 4

For every SO-HORN ( SO-HORN, SO-HORNr, SO-HORN∗r, respectively ) formula ϕ(x¯)\phi(\bar{x}) over vocabulary σ\sigma we can find a DATALOG ( DATALOG, DATALOGr, DATALOG∗r, respectively ) formula (Π,P)t¯(\Pi,P)\bar{t} over vocabulary τ\tau such that σ=(τ,Π)ext\sigma=(\tau,\Pi)_{ext} and the following are satisfied:

  • If ϕ\phi is a sentence, then PP is a zero-ary intentional relation symbol such that for any (τ,Π)ext(\tau,\Pi)_{ext}-structure 𝒜\mathcal{A}

    𝒜ϕiff𝒜(Π,P)i.e. P𝒜[Π] is FALSE\begin{array}[]{cccc}\mathcal{A}\models\phi&\textrm{iff}&\mathcal{A}\nvDash(\Pi,P)&\textrm{i.e. }P^{\mathcal{A}[\Pi]}\textrm{ is FALSE}\end{array}
  • If ϕ(x¯)\phi(\bar{x}) is a formula with free variables x¯=x1,,xr\bar{x}=x_{1},\cdots,x_{r}, then PP is an rr-ary intentional relation symbol such that for any (τ,Π)ext(\tau,\Pi)_{ext}-structure 𝒜\mathcal{A} and any a¯=a1,,arA\bar{a}=a_{1},\cdots,a_{r}\in A

    𝒜ϕ[a¯]iff𝒜(Π,P)[a¯]i.e. (a1,,ar)P𝒜[Π]\begin{array}[]{cccc}\mathcal{A}\models\phi[\bar{a}]&\textrm{iff}&\mathcal{A}\nvDash(\Pi,P)[\bar{a}]&\textrm{i.e. }(a_{1},\cdots,a_{r})\notin P^{\mathcal{A}[\Pi]}\end{array}

The converse of this statement is also true.

Proof

We give the proof only for the case of DATALOG and SO-HORN, the others follow by the same method. For every SO-HORN formula ϕ(x¯)\phi(\bar{x}) we construct a DATALOG formula (Π,P)t¯(\Pi,P)\bar{t} which is equivalent to ¬ϕ(x¯)\neg\phi(\bar{x}). Suppose ϕ(x¯)\phi(\bar{x}) has the form

R1Rnx1xm(i=1𝑙(αi1αisiHi))(x¯)\exists R_{1}\cdots\exists R_{n}\forall x_{1}\cdots\forall x_{m}\left(\overset{l}{\underset{i=1}{\bigwedge}}(\alpha_{i1}\wedge\cdots\wedge\alpha_{is_{i}}\rightarrow H_{i})\right)(\bar{x})

where x¯=x1,,xr\bar{x}=x_{1},\cdots,x_{r} are the free variables in ϕ\phi. Suppose v¯=v1,,vr\bar{v}=v_{1},\cdots,v_{r} are the variables that don’t occur in α\alpha, we use

α=α[Ru¯11/Ru¯11v¯,,Ru¯nn/Ru¯nnv¯]\alpha^{\prime}=\alpha[R{}_{1}\bar{u}_{1}/R{}_{1}^{\prime}\bar{u}_{1}\bar{v},\cdots,R{}_{n}\bar{u}_{n}/R{}_{n}^{\prime}\bar{u}_{n}\bar{v}]

to denote the formula obtained by replacing each Riu¯iR_{i}\bar{u}_{i} by Riu¯iv¯(1in)R_{i}^{\prime}\bar{u}_{i}\bar{v}\,(1\leq i\leq n), respectively. We construct Π\Pi as following:

  • for each clause αi1αisiHi\alpha_{i1}\wedge\cdots\wedge\alpha_{is_{i}}\rightarrow H_{i} in ϕ\phi,

    • if HiH_{i} is an atomic formula Rju¯R_{j}\bar{u}, we add the following rule in Π\Pi,

      Rju¯v¯x¯=v¯,αi1,,αisiR_{j}^{\prime}\bar{u}\bar{v}\leftarrow\bar{x}=\bar{v},\alpha^{\prime}_{i1},\cdots,\alpha^{\prime}_{is_{i}}
    • if HiH_{i} is the Boolean constant \bot, we add the following rule in Π\Pi,

      Pv¯x¯=v¯,αi1,,αisiP\bar{v}\leftarrow\bar{x}=\bar{v},\alpha^{\prime}_{i1},\cdots,\alpha^{\prime}_{is_{i}}
  • if no HiH_{i} is the Boolean constant \bot in ϕ\phi, we add the following rule in Π\Pi,

    Pv¯Pv¯P\bar{v}\leftarrow P\bar{v}

It’s easily seen that if ϕ\phi is a sentence then PP is a zero-ary intentional relation symbol in Π\Pi. From the semantics of DATALOG program and SO-HORN logic we know that for any structure 𝒜\mathcal{A} and a¯=a1,,arA\bar{a}=a_{1},\cdots,a_{r}\in A, 𝒜ϕ[a¯]\mathcal{A}\models\phi[\bar{a}] iff 𝒜(Π,P)[a¯]\mathcal{A}\nvDash(\Pi,P)[\bar{a}].

For the converse direction, given a DATALOG formula (Π,P)t¯(\Pi,P)\bar{t}, we construct a SO-HORN formula ϕ(x¯)\phi(\bar{x}) such that (Π,P)t¯(\Pi,P)\bar{t} and ¬ϕ(x¯)\neg\phi(\bar{x}) are equivalent. Remember that we assume no zero-ary relation symbol in the body of any rule of Π\Pi, so we can ignore the rules Qα1,,αlQ\leftarrow\alpha_{1},\cdots,\alpha_{l} in Π\Pi where QQ is a zero-ary relation symbol and QPQ\neq P in the following construction with no result affected. Let R1,,RnR_{1},\cdots,R_{n} be the all no zero-ary intentional relation symbols and z1,,zmz_{1},\cdots,z_{m} be the all free variables in Π\Pi. The prefix of ϕ(x¯)\phi(\bar{x}) is R1Rnz1zm\exists R_{1}\cdots\exists R_{n}\forall z_{1}\cdots\forall z_{m}, the matrix of ϕ(x¯)\phi(\bar{x}) is obtained by

  • if PP is a zero-ary relation symbol,

    • if Pα1,,αlP\leftarrow\alpha_{1},\cdots,\alpha_{l} is a rule of Π\Pi, then we add the clause α1αl\alpha_{1}\wedge\cdots\wedge\alpha_{l}\rightarrow\bot as a conjunct,

    • if Riu¯α1,,αlR_{i}\bar{u}\leftarrow\alpha_{1},\cdots,\alpha_{l} is a rule of Π\Pi, then we add the clause α1αlRiu¯\alpha_{1}\wedge\cdots\wedge\alpha_{l}\rightarrow R_{i}\bar{u} as a conjunct,

  • if PP is an rr-ary (r>0)(r>0) relation symbol,

    • if Pu¯α1,,αlP\bar{u}\leftarrow\alpha_{1},\cdots,\alpha_{l} is a rule of Π\Pi, then let x¯=x1,,xr\bar{x}=x_{1},\cdots,x_{r} be new variables that don’t occur in Π\Pi, we add α1αlPu¯\alpha_{1}\wedge\cdots\wedge\alpha_{l}\rightarrow P\bar{u} and Px¯P\bar{x}\rightarrow\bot as a conjuncts,

    • if Riu¯α1,,αlR_{i}\bar{u}\leftarrow\alpha_{1},\cdots,\alpha_{l} is a rule of Π\Pi and RiPR_{i}\neq P, then we add the clause α1αlRiu¯\alpha_{1}\wedge\cdots\wedge\alpha_{l}\rightarrow R_{i}\bar{u} as a conjunct.

This completes the proof.

From the above proposition and Corollary 2 we can conclude the following corollary.

Corollary 5

SO-HORN{}^{*}\subseteq SO-HORNr{}^{r}\equiv SO-HORN∗r

Proof

We show that every SO-HORN (SO-HORN∗r) formula is equivalent to a SO-HORNr formula.

ϕ(x¯)\phi(\bar{x}) is a SO-HORN (SO-HORN∗r) formula,

\Rightarrow there exists a DATALOG (DATALOG∗r) formula (Π,P)t¯(\Pi,P)\bar{t} such that (Π,P)t¯(\Pi,P)\bar{t} and ¬ϕ(x¯)\neg\phi(\bar{x}) are equivalent (by Proposition 4),

\Rightarrow there exists a DATALOGr formula (Π,P)t¯(\Pi^{\prime},P^{\prime})\bar{t} such that (Π,P)t¯(\Pi^{\prime},P^{\prime})\bar{t} and ¬ϕ(x¯)\neg\phi(\bar{x}) are equivalent (by Corollary 2),

\Rightarrow there exists a SO-HORNr formula ψ(x¯)\psi(\bar{x}) such that ψ(x¯)\psi(\bar{x}) and ϕ(x¯)\phi(\bar{x}) are equivalent (by Proposition 4).

To compare the expressive power of the several Horn logic defined above and DATALOG program, we define stratified versions of these Horn logic in the following.

Definition 6

For each number s1s\geq 1, SO-HORNs is defined inductively by

  • SO-HORN=1{}_{1}= SO-HORN,

  • SO-HORNj+1 is the set of formulas of the form

    Q1R1QmRmx¯(C1Cn)Q_{1}R_{1}\cdots Q_{m}R_{m}\forall\bar{x}(C_{1}\wedge\cdots\wedge C_{n})

    where Qi{,}Q_{i}\in\{\forall,\exists\}, R1,,RmR_{1},\cdots,R_{m} are relation symbols and each CjC_{j} is an implication of the form

    α1αlβ1βqH\alpha_{1}\wedge\cdots\wedge\alpha_{l}\wedge\beta_{1}\wedge\cdots\wedge\beta_{q}\rightarrow H

    where

    1)

    each αs\alpha_{s} is an atomic formula Rix¯R_{i}\bar{x},

    2)

    each βt\beta_{t} is either a SO-HORNl formula ϕ(x¯)\phi(\bar{x}) or its negation¬ϕ(x¯)\neg\phi(\bar{x}) where ljl\leq j and R1,,RmR_{1},\cdots,R_{m} don’t occur in ϕ\phi,

    3)

    HH is either a atomic formula Rkz¯R_{k}\overline{z} or the Boolean constant \bot (for false).

Set SO-HORN=s1{}_{\infty}=\underset{s\geq 1}{\bigcup}SO-HORNs.

If we replace SO-HORN by SO-HORN, we denote this logic by SO-HORN{}_{\infty}^{*}. And if we replace SO-HORN by SO-HORNr, and replace condition 1) by

1’)

each αs\alpha_{s} is either an atomic formula Rix¯R_{i}\bar{x} or y¯Riy¯z¯\forall\bar{y}R_{i}\bar{y}\bar{z}, we denote this logic by SO-HORNr{}_{\infty}^{r}.

Proposition 5

SO-HORN, SO-HORN{}_{\infty}^{*}, SO-HORNr{}_{\infty}^{r} are equivalent to S-DATALOG, S-DATALOG, S-DATALOGr, respectively.

Proof

The proof is similar to that of Proposition 4. The details are left to the reader.

Corollary 6

SO-HORN{}_{\infty}\equiv SO-HORN{}_{\infty}^{*}\equiv FO(BFP)

From the results above we obtain our main theorem in the following.

Theorem 3.3

On all finite structures the followings are hold:

SO-HORNSO-HORNSO-HORNrand SO-HORNrSO-HORNrSO-HORNrFO(LFP)\begin{array}[]{ll}&\text{SO-HORN}\subset\text{SO-HORN}^{*}\subset\text{SO-HORN}^{r}\\ \text{and }\\ &\text{SO-HORN}^{r}\equiv\text{SO-HORN}^{*r}\equiv\text{SO-HORN}_{\infty}^{r}\equiv\text{FO(LFP)}\end{array}

In [5] Grädel asked that whether SO-HORN captures the class of problems that are in P and closed under substructures. Using some results above and a technique that encode a graph into a tree which is exponentially larger than it, we show that if the answer is “Yes”, then the 3-COLORABILITY problem is in P, which implies that P=NP. So it seems that the answer is not “Yes” for this question. We ask that whether every problem that is expressible in FO(LFP) and closed under substructures is expressible by a SO-HORN formula. This question is equivalent to the question that whether every problem that is expressible in DATALOGr and closed under extensions is expressible by a DATALOG formula.

4 Expressive power of Extended Horn Logic

In [21] we introduced SO-EHORN and SO-EHORNr logic and proved that they capture co-NP on ordered structures. It is well known that Σ11\Sigma_{1}^{1} captures NP[17], as a corollary, Π11\Pi_{1}^{1} captures co-NP on all structures. So it is of interest to know whether SO-EHORNr captures co-NP on all structures. In this section, we show that SO-EHORNrΠ11{}^{r}\equiv\Pi_{1}^{1} which implies that SO-EHORNr captures co-NP on all structures.

Definition 7

Second-order Extended Horn logic, denoted by SO-EHORN, is the set of second-order formulas of the form

P1R1PmRmx¯(C1Cn)\forall P_{1}\exists R_{1}\cdots\forall P_{m}\exists R_{m}\forall\bar{x}(C_{1}\wedge\cdots\wedge C_{n})

where R1,,Rm,P1,,PmR_{1},\cdots,R_{m},P_{1},\cdots,P_{m} are relation symbols and C1,,CnC_{1},\cdots,C_{n} are extended Horn clauses with respect to R1,,RmR_{1},\cdots,R_{m}, more precisely, each CjC_{j} is an implication of the form

α1αlβ1βqH\alpha_{1}\wedge\cdots\wedge\alpha_{l}\wedge\beta_{1}\wedge\cdots\wedge\beta_{q}\rightarrow H

where

1)

each αs\alpha_{s} is an atomic formula Rix¯R_{i}\bar{x},

2)

each βt\beta_{t} is either an atomic formula Qy¯Q\bar{y} or a negated atomic formula ¬Qy¯\neg Q\bar{y} where Q{R1,,Rm}Q\notin\{R_{1},\cdots,R_{m}\},

3)

HH is either an atomic formula Rkz¯R_{k}\overline{z} or the Boolean constant \bot (for False).

If we replace condition 1) by

1’)

each αs\alpha_{s} is either an atomic formula Rix¯R_{i}\overline{x} or y¯Riy¯z¯\forall\overline{y}R_{i}\overline{y}\overline{z},

then we call the logic second-order Extended revised Horn Logic, denoted by SO-EHORNr.

By Lemma 2, we know that SO-EHORN is closed substructures. So it can not captures co-NP on all structures. But SO-EHORNr is more expressive, the following example shows that it can express some co-NP complete problem.

Example 4

The decision problem for the set {ϕ|ϕ\{\phi|\phi is an unsatisfiable propositional CNF formula}\} is co-NP complete. We can encode ϕ\phi via the structure 𝒜ϕ=A,Cla,Var,P,N\mathcal{A}_{\phi}=\left\langle A,Cla,Var,P,N\right\rangle[15], where AA is the domain, ClaCla and VarVar are two unary relations, PP and NN are two binary relations, and for any i,jA,Cla(i),Var(j),P(i,j)i,j\in A,\ Cla(i),Var(j),P(i,j) and N(i,j)N(i,j) means that ii is a clause, jj is a variable, variable jj occurs positively in clause ii and variable jj occurs negatively in clause ii, respectively.

For example, the CNF formula

ϕ=((p1p3)(p2¬p3)(p1p2))\phi=((p_{1}\vee p_{3})\wedge(p_{2}\vee\neg p_{3})\wedge(p_{1}\vee p_{2}))

can be encoded as

𝒜ϕ={1,2,3},Cla,VarP,NCla={1,2,3}Var={1,2,3}P={(1,1),(1,3),(2,2),(3,1),(3,2)}N={(2,3)}\begin{array}[]{l}\mathcal{A}_{\phi}=\left\langle\{1,2,3\},Cla,VarP,N\right\rangle\\ Cla=\{1,2,3\}\\ Var=\{1,2,3\}\\ P=\{(1,1),(1,3),(2,2),(3,1),(3,2)\}\\ N=\{(2,3)\}\end{array}

The following formula

Φ=XYxy(z¬Y(z)(¬Y(x)Cla(x))(Cla(x)Var(y)¬Y(x)P(x,y)¬X(y))(Cla(x)Var(y)¬Y(x)N(x,y)X(y)))\Phi=\forall X\exists Y\forall x\forall y\left(\begin{array}[]{cl}&\exists z\neg Y(z)\\ \wedge&(\neg Y(x)\rightarrow Cla(x))\\ \wedge&(Cla(x)\wedge Var(y)\wedge\neg Y(x)\wedge P(x,y)\rightarrow\neg X(y))\\ \wedge&(Cla(x)\wedge Var(y)\wedge\neg Y(x)\wedge N(x,y)\rightarrow X(y))\end{array}\right)

means that for any valuation XX, where X(i)X(i) holds iff variable ii is true under this valuation, there exists a set of YY of clauses, such that any clause that doesn’t occur in YY is false under this valuation. Φ\Phi is a SO-EHORNr formula and for any CNF formula ϕ\phi, 𝒜ϕΦ\mathcal{A}_{\phi}\models\Phi iff ϕ\phi is unsatisfiable.

Theorem 4.1

SO-EHORNrΠ11{}^{r}\equiv\Pi_{1}^{1} on all structures.

Proof

We give here two different proofs of this theorem.

#1. It is well known that every Σ11\Sigma_{1}^{1} formula is equivalent to a formula of skolem normal form[4]

P1Pnx1xmy1yrϕ\exists P_{1}\cdots\exists P_{n}\forall x_{1}\cdots\forall x_{m}\exists y_{1}\cdots\exists y_{r}\phi (7)

where ϕ\phi is a quantifier-free CNF formula. Let PP be a new rr-ary second-order relation symbol and z¯=z1,,zr\bar{z}=z_{1},\cdots,z_{r} be new variables that don’t occur in (7). It is easily seen that y1yrϕ\exists y_{1}\cdots\exists y_{r}\phi is equivalent to

P(z¯Pz¯y¯(Py¯ϕ))\exists P(\exists\bar{z}P\bar{z}\wedge\forall\bar{y}(P\bar{y}\rightarrow\phi))

Then (7) is equivalent to

P1Pnx¯Py¯(z¯Pz¯(Py¯ϕ))\exists P_{1}\cdots\exists P_{n}\forall\bar{x}\exists P\forall\bar{y}(\exists\bar{z}P\bar{z}\wedge(P\bar{y}\rightarrow\phi))

Repeated application of Lemma 1 shows that the above formula is equivalent to

P1PnPx¯y¯(z¯Px¯z¯(Px¯y¯ϕ))\exists P_{1}\cdots\exists P_{n}\exists P^{\prime}\forall\bar{x}\forall\bar{y}(\exists\bar{z}P^{\prime}\bar{x}\bar{z}\wedge(P^{\prime}\bar{x}\bar{y}\rightarrow\phi)) (8)

where PP^{\prime} is an m+rm+r-ary second-order relation symbol. The left is the same as that of Theorem 7 in [21], just replace the formula Φ\Phi by the formula (8) above.

#2. By (7) we know that every Π11\Pi_{1}^{1} formula is equivalent to a formula of the form

P1Pnx1xmy1yrϕ\forall P_{1}\cdots\forall P_{n}\exists x_{1}\cdots\exists x_{m}\forall y_{1}\cdots\forall y_{r}\phi^{\prime} (9)

where ϕ\phi^{\prime} is a quantifier-free formula. With no loss of generality, suppose ϕ\phi^{\prime} is in CNF form. Let RR be a new mm-ary second-order relation symbol and z¯=z1,,zm\bar{z}=z_{1},\cdots,z_{m} be new variables that don’t occur in (9), then x¯y¯ϕ\exists\bar{x}\forall\bar{y}\phi^{\prime} is equivalent to R(z¯¬Rz¯x¯(¬Rx¯y¯ϕ))\exists R(\exists\bar{z}\neg R\bar{z}\wedge\forall\bar{x}(\neg R\bar{x}\rightarrow\forall\bar{y}\phi^{\prime})). So (9) is equivalent to

P1PnRx¯y¯(z¯¬Rz¯(¬Rx¯ϕ))\forall P_{1}\cdots\forall P_{n}\exists R\forall\bar{x}\forall\bar{y}(\exists\bar{z}\neg R\bar{z}\wedge(\neg R\bar{x}\rightarrow\phi^{\prime})) (10)

It is easily seen that (10) is equivalent to a SO-EHORNr formula.

Corollary 7

SO-EHORNr captures co-NP on all structures.

Corollary 8

SO-EHORNr collapses to Π21\Pi_{2}^{1}-EHORNr on all structures.

Proof

In the second proof of Theorem (4.1), we have proved more, namely that every Π11\Pi_{1}^{1} formula is equivalent to a Π21\Pi_{2}^{1}-EHORNr formula which implies that SO-EHORNrΠ21{}^{r}\subseteq\Pi_{2}^{1}-EHORNr.

5 Conclusion

In this paper, we showed that, on all finite structures, SO-HORNr, SO-HORN∗r, FO(LFP) coincide with each other and SO-HORN is a proper sublogic of SO-HORNr. We introduced the notions of DATALOG program and DATALOGr program and their stratified versions, S-DATALOG program and S-DATALOGr program. We proved that DATALOGr and S-DATALOGr are equivalent and DATALOG is a proper sublogic of DATALOGr. We also showed that SO-EHORNr which is an extended version of SO-HORN captures co-NP on all structures. We prove this this result in two versions. In the first version we improved the proof in [21] using the skolem normal forms of Σ11\Sigma_{1}^{1} formulas. In the second version, we gave a straightforward proof using the normal form.

References

  • [1] A. Flögel, M. Karpinski, and H. Kleine Büning. Subclasses of Quantified Boolean Formulas. In Lecture Notes in Computer Science, 553: 145-155, Springer, 1990.
  • [2] A. Flögel. Resolution für quantifizierte Bool’sche Formeln. Dissertation, University Paderborn ,1993.
  • [3] A. Chandra, D. Harel. Horn Clause Queries and Generalizations. J. Logic Programming 1 (1985), 1–15.
  • [4] Erich Grädel. The expressive power of second order Horn logic. Proceedings of the 8th annual symposium on Theoretical aspects of computer science, p.466-477, February 1991, Hamburg.
  • [5] E. Grädel. Capturing Complexity Classes by Fragments of Second-order Logic. Theoretical Computer Science, 101: 35-57, 1992.
  • [6] Erich Grädel , P. G. Kolaitis, L. Libkin , M. Marx , J. Spencer , Moshe Y. Vardi , Y. Venema , Scott Weinstein. Finite Model Theory and Its Applications. Springer-Verlag New York, Inc., Secaucus, NJ, 2005.
  • [7] Evgeny Dantsin, Thomas Eiter, Georg Gottlob, Andrei Voronkov. Complexity and Expressive Power of Logic Programming. ACM Computing Surveys, Vol. 33, No. 3, September 2001, pp. 374–425.
  • [8] H.-D. Ebbinghaus, J. Flum. Finite Model Theory. Springer, 1999.
  • [9] H. Kleine Büning, T. Lettmann. Propositional Logic: Deduction and Algorithms. Cambridge University Press, 1999.
  • [10] Hans Kleine Büning, Xishun Zhao. On Models for Quantified Boolean Formulas. In Proc. of SAT’04, 2004.
  • [11] H. Kleine Büning, Xishun Zhao. Computational Complexity of Quantified Boolean Formulas with Fixed Maximal Deficiency. Theoretical Computer Science, 407: 448-457, 2008.
  • [12] Miklos Ajtai, Yuri Gurevich. Datalog vs first-order logic. Journal of Computer and System Sciences, Volume 49, Issue 3, December 1994, Pages 562-588.
  • [13] N. Immerman. Relational Queries Computable in Polynomial Time. Inf. and Control 68 (1986), 86–104.
  • [14] N. Immerman. Languages that Capture Complexity Classes. SIAM J. Comput. 16 (1987), 760–778.
  • [15] N. Immerman. Descriptive Complexity. Springer-Verlag, New York, 1999.
  • [16] P. Kolaitis. The Expressive Power of Stratified Logic Programs. Information and Computation archive Volume 90 , Issue 1, Pages: 50 - 66,1991.
  • [17] R. Fagin. Generalized First-Order Spectra and Polynomial-Time Recognizable Sets. SIAM-AMS Proc. 7 (1974), 43–73.
  • [18] S. Abiteboul , V. Vianu. Fixpoint extensions of first-order logic and datalog-like languages. Proceedings of the Fourth Annual Symposium on Logic in computer science, p.71-79, June 1989, California, United States
  • [19] S. Abiteboul , V. Vianu. DATALOG extensions for database queries and updates. Journal of Computer and System Sciences 43,62-124,1991.
  • [20] S. Abiteboul , R. Hull, V. Vianu. Foundations of Data Bases. Addison Wesley, 1995.
  • [21] Shiguang Feng, Xishun Zhao. Complexity and Expressive Power of Second-Order Extended Horn Logic. Mathematical Logic Quarterly, under review.
  • [22] Xishun Zhao, Hans Kleine Büning. Model-Equivalent Reductions. Lecture Notes in Computer Science, Volume 3569/2005, 355-370, Springer, 2005.
  • [23] Y. Gurevich. Logic and the Challenge of Computer Science. Computer Science Press (1988), 1–57.