This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\publyear

2021 \papernumber2095

\finalVersionForIOS

A Non-Deterministic Multiset Query Language

Bartosz Zieliński
Department of Computer Science
Faculty of Physics and Applied Informatics
Address for correspondence: Department of Computer Science, Faculty of Physics and Applied Informatics, University of Łódź, Pomorska 149/153 90-236 Łódź, Poland.

Received November 2021;  revised November 2021.
   University of Łódź
Pomorska 149/153 90-236 Łódź
   Poland
[email protected]
Abstract

We develop a multiset query and update language executable in a term rewriting system. Its most remarkable feature, besides non-standard approach to quantification and introduction of fresh values, is non-determinism — a query result is not uniquely determined by the database. We argue that this feature is very useful, e.g., in modelling user choices during simulation or reachability analysis of a data-centric business process — the intended application of our work. Query evaluation is implemented by converting the query into a terminating term rewriting system and normalizing the initial term which encapsulates the current database. A normal form encapsulates a query result. We prove that our language can express any relational algebra query. Finally, we present a simple business process specification framework (and an example specification). Both syntax and semantics of our query language is implemented in Maude.

keywords:
term rewriting, query languages, business process modelling
volume: 184issue: 2

A Non-deterministic Multiset Query Language

1 Introduction

In a data-centric approach to business process modelling (see, e.g., [1, 2]), specification of data transformation during case execution is an integral part of the business process model. This new paradigm requires new tools and formalisms for effective specification, simulation and validation. Task-centric models are commonly formalized using Petri nets (see, e.g., [3, 4]). Adapting Petri net-based formalizations to data-centric models is, however, problematic: While simple transformations on data can be represented directly within a coloured Petri net, Petri nets lack facilities for complex data processing and querying. Even so, there exists a large amound of literature (see e.g., [5, 6, 7]) devoted to enriching Petri nets with with data processing capabilities and automated verification of their properties. Recent paper [8] introduced DB-Nets — an attempt to integrate coloured Petri nets with relational databases. The use of two separate formalisms complicates verification and simulation (though it corresponds to actual implementations of BPM systems). In the following paper [9] a subset of database operations was implemented inside coloured Petri nets with name creation and transition priorities.

Conditional term rewriting [10] was proposed as an alternative (if less popular) generic framework for specification of dynamic systems [11]. It subsumes a variety of Petri nets [12] and their simulation is one of popular applications of the term rewriting system Maude [13, 14]). More precisely, it is well known (see e.g., [12]) that coloured Petri nets can be implemented by multiset rewriting systems, and, conversely, rewriting systems which rewrite multisets of terms representing colour tokens can be interpreted as Petri nets: just identify places with colour tokens, and each rewriting rule of the form

a1a2anb1b2bma_{1}a_{2}\ldots a_{n}\Rightarrow b_{1}b_{2}\ldots b_{m}

with a transition with input arcs from a1a_{1}, a2a_{2}, \ldots, ana_{n} and output arcs to b1b_{1}, b2b_{2}, \ldots, bmb_{m}. Other constructs, such as inhibitor arcs can be easily implemented with conditional rewriting rules. Rewriting systems are more general than Petri Nets since they are not limited to rewriting multisets (on the other hand, Petri nets are much better supported by tools and programming libraries). However, rewriting systems still share with Petri nets the limitation and inconvenience of not directly supporting bulk, complex operations on data which involve some kinds of quantification. For example, suppose that the state of a data driven business process related to e-commerce is represented by a multiset of terms. In particular, terms of the form item(p,c)\text{\tt item}(p,c) denote the presence of a product pp in the basket of a customer cc. Suppose now that cc cancels the case, and so we need to remove all cc’s items from the multiset. Describing removal of a single (nondeterministically chosen) item is easy with the rule item(c,x)\text{\tt item}(c,x)\Rightarrow\emptyset where xx is a variable, but specifiying in a rewriting system (or a Petri net) that all of them need to be removed before the next business step is more complex (though clearly possible) and would require auxilliary tokens and conditional rewrite rules corresponding to inhibitor arcs in a Petri net preventing other transitions as long as there are still some cc’s items present in the rewritten multiset. Thus, a high-level query language adding quantifier constructs on top of conventional rewriting system or a Petri net is clearly desirable, particularly if rewriting systems or Petri nets are to become convenient formalisms to model data driven business processes.

In this paper we present an expressive multiset query and update language 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}}, designed to be executable in a term rewriting system, useful for a unified and somewhat “Petri nettish” formalization of data-centric business processes. The connection with Petri nets is admittedly tenuous and follows from the fact that the language acts on data represented as multisets of terms which could be viewed as tokens (see the discussion above). Since this language specifies changes to data, instead of being a reimplementation of relational calculus, it contains linear-like features fitting a term rewriting implementation. Most remarkably, 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} is non-deterministic — the result of a query or update is not, in general, uniquely determined by the database. This permits modelling user choices, just like in the case of Petri nets. 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} consists of three sublanguages, parametrized with respect to a signature Σ\Sigma and Σ\Sigma-algebra of facts 𝒟\mathcal{D}:

  1. 1.

    Language 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} of conditions (Boolean queries) which can be used independently as constraints, or as components of queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} and 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}.

  2. 2.

    Data manipulation language𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}. A DML query QQ in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}} defines new facts to be added to the database. Some of the old facts used in constructing the new ones may be deleted.

  3. 3.

    Language 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} of queries which only return facts but do not change the database. Both syntax, and to some extent semantics of 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} is a restriction of syntax and semantics of 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}.

A query QQ in 𝒬Σ,𝒟α\mathcal{Q}_{\Sigma,\mathcal{D}}^{\alpha}, α{𝐜𝐧𝐝,𝐪𝐫𝐲,𝐝𝐦𝐥}\alpha\in\{\mathbf{cnd},\mathbf{qry},\mathbf{dml}\}, is given semantics by assignment of a rewriting system Σ,𝒟α(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\alpha}(Q). To evaluate QQ in a database FF we start with an initial term IQ(F)\mathrm{I}_{Q}(F). A normal form of IQ(F)\mathrm{I}_{Q}(F) wraps a result of QQ’s evaluation: a Boolean value indicating validity of a condition, a query answer or a new database resulting from execution of a DML query. As remarked above, while Σ,𝒟α(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\alpha}(Q) is always terminating (i.e., there are no infinite execution paths, so we always do get some result from evaluating QQ), it is not confluent (i.e., divergent execution paths may not eventually converge) in general, hence we may get distinct results depending on nondeterministic choices. We identify syntactic constraints on queries in each of the sublanguages which ensure confluence for the rewriting system, and hence determinism for the results of the query. 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} shares with the language introduced in [15] a non-standard approach to variable binding. The approach avoids problems with capture-free substitutions without dispensing with explicit variables, but at the price of non-compositionality: The surrounding context may determine whether a variable in a subterm is free or bound in this subterm. 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}} supports introduction of fresh values to the database, which is used, e.g., to generate identifiers for newly created artifacts or to simulate user input (cf. [7]).

Since queries in 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} are converted to a term rewriting system, verification of a business process specified as a set of DML queries (see Section 7) can be assisted with symbolic reachability analysis techniques based on narrowing (see e.g., [16]) . We plan to expand on this idea in future research. Note that our results in this article regarding the confluence of the rewriting systems to which a particular subclass of queries compiles to, may be relevant for narrowing (see e.g., [17, 18], c.f. [19]). E.g., narrowing a confluent system may provide a more efficient search procedure than in the case of a non-confluent one.

1.1 Prior work

The present paper builds on the previous paper [15] (cf. [20]) where a multiset query language executed in Maude was proposed. 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} shares many similarities with the language described in [15], particularly the treatment of quantification. It has, however, distinct syntax (with, e.g., fact markings in quantifiers) and distinct semantics. We consider both languages to be alternatives, each of which with its own strengths. A side-by-side comparison is presented in Table 1. Observe that while the language described in [15] can be defined both in the set and multiset setting (in the former case we match multisets of facts modulo idempotence in addition to commutativity, associativity and identity), here we assume exclusively multiset setting. This is because the language described here is implemented through multiset rewriting. Making multiset constructor idempotent would make it impossible to consistently replace matched terms, a feature which is crucial for our formalism to behave sensibly. E.g., given a rule aba\Rightarrow b, term ab=𝒜aabab=_{\mathcal{A}}aab rewrites in one step both to bb=𝒜bbb=_{\mathcal{A}}b (intended) and abb=𝒜ababb=_{\mathcal{A}}ab (not intended).

Table 1: Comparison between 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} and 𝒞Σ,𝒟(X)\mathcal{CF}_{\Sigma,\mathcal{D}}(X) from [15]
𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} 𝒞Σ,𝒟(X)\mathcal{CF}_{\Sigma,\mathcal{D}}(X)
Semantics of (possibly) non-deterministic queries is based on translation into term rewriting systems. Semantics of queries is based on term matching on the metalevel. Queries are always deterministic.
Queries return facts in the same signature as the database against which the query is evaluated. Queries return objects of an arbitrary signature (as long as it contains a “union” operator).
DML queries construct multisets of facts to be added to the current database. Some of the current facts used in the construction may be deleted. DML expressions construct a pair of sets or multisets of facts — those to be deleted from and those to be added to the current database.
Fresh values are introduced through “virtual fresh facts”. Actual freshness in ensured through rewrite rules which define the semantics of DML query. Attributes of input facts can be marked by special sorts as fresh. Testing framework ensures that values injected in those columns are actually fresh.
It is assumed that the database is a multiset of facts The language can be defined both in set and multiset setting
Treatment of variable binding is identical in both languages

𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}}, has some similarities to matching logic [21, 22], as both are based on term matching. They have different purposes, however, and different syntax and semantics. Unlike matching logic statements, 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} queries are non-deterministic. Matching logic is used for software verification, while 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} is a query language intended to be a component of the system. Finally, matching logic has conventional quantifiers, whereas we use a non-standard quantification over “relation patterns”.

CINNI [23] is a generic calculus of substitutions implemented in Maude which combines de Bruijn indices with explicit names to solve the problem of capture-free substitutions. To avoid the associated complexity, we decided not to use conventional variable binding implemented, e.g., using CINNI.

Our quantification over “relation patterns” instead of variables, resembles quantifier constructs in description logic 111We are grateful to Prof. Andrzej Tarlecki for this observation. [24]. Our syntactic construct, however, uses explicit variable names and is not limited to binary relations where only the second column is bound by the quantifier.

Data-centric business process models are formalized in a variety of ways. First-order logic and its restricted variations (see e.g., [25, 26, 27, 28]), datalog [29], and UML [30], are popular choices. Those formalisms are excellent for the specification of data models, and they come with expressive query languages; they are, however, not so ideally suited for modelling change, because of frame problems [31], where it is not always obvious what information is modified and what stays the same. Rewriting formalisms [10], which are explicit about scope of change have a clear advantage here. On the other hand, a great deal of work has been devoted to formal verification of logic based business process formalism, see e.g., [32, 33] in the context of hierarchical artifact systems. For another example see [34] where a data aware extension of BPMN was proposed together with SMT (satisfiability modulo theories) based verification techniques.

In [35] a language called Reseda, was introduced for specification of data driven business process. The language integrates data description with behaviour. What makes it relevant as a prior work to the present paper is that Reseda’s semantics is defined by associating with a Reseda program a transition system. Such a program can then be executed by rewriting data in accordance with the transition rules, similarly to the execution of the language described here.

As we remarked earlier, a recent paper [8] introduced DB-Nets which integrate coloured Petri nets with relational databases. Since the use of two separate formalisms complicates verification and simulation in the following paper [9] a subset of database operations was implemented inside coloured Petri nets with name creation and transition priorities. Thus, the motivation of [9] is analogous to the motivation of this paper, but in the world of Petri nets instead of term rewriting systems. There are however two important differences: First, our language is meant to provide complete specification of data driven business processes, whereas in [9] the business process is still specified as a Petri net, and there is just an interface between relational queries (perhaps implemented inside the net itself) and the main net describing the process. Secondly (and this is what makes the first point possible) we do not simply implement a conventional relational dml and query language in a rewriting system. Instead, we implement a linear and non-deterministic query language which can emulate user choices and creation of new objects.

1.2 Preliminaries on term rewriting

We recall basic notions related to term rewriting [10, 11], and many sorted equational logic [36].

Let SS be a poset (partially ordered set). A family X={Xs|sS}X=\{X_{s}\;|\;s\in S\} of sets is called an SS-sorted set if XsXsX_{s}\subseteq X_{s^{\prime}} whenever sss\leq s^{\prime}. We abbreviate xXx\in\bigcup X as xXx\in X. We write x:sx:s iff xXsx\in X_{s}.

An algebraic signature Σ=(ΣS,ΣF)\Sigma=(\Sigma_{S},\Sigma_{F}) consists of a finite poset of sorts ΣS\Sigma_{S} and a finite set ΣF\Sigma_{F} of function symbols. The set of function symbols ΣF\Sigma_{F} is ΣS+\Sigma_{S}^{+}-sorted, where ΣS+\Sigma_{S}^{+} is the set of finite, non-empty sequences of elements of ΣS\Sigma_{S} partially ordered with s0snt0tms_{0}\cdots s_{n}\leq t_{0}\cdots t_{m} iff m=nm=n, s0t0s_{0}\leq t_{0} and sitis_{i}\geq t_{i} for all i{1,,n}i\in\{1,\ldots,n\}. Traditionally we write f:s1sns0f:s_{1}\ldots s_{n}\rightarrow s_{0} when f(ΣF)s0snf\in(\Sigma_{F})_{s_{0}\ldots s_{n}}, where we denote by (ΣF)s0sn(\Sigma_{F})_{s_{0}\ldots s_{n}} the set of function symbols of sort s0sns_{0}\ldots s_{n}. This explains the somewhat confusing ordering on ΣS+\Sigma_{S}^{+}: we are covariant on return value and contravariant on arguments. Symbols c:sc:\rightarrow s are called constants of sort ss. A Σ\Sigma-algebra 𝔸\mathbb{A} is an assignment of a set s𝔸\llbracket s\rrbracket_{\mathbb{A}} to each sΣSs\in\Sigma_{S} such that s𝔸s𝔸\llbracket s\rrbracket_{\mathbb{A}}\subseteq\llbracket s^{\prime}\rrbracket_{\mathbb{A}} if sss\leq s^{\prime}, and a function f𝔸:s1𝔸××sn𝔸s𝔸\llbracket f\rrbracket_{\mathbb{A}}:\llbracket s_{1}\rrbracket_{\mathbb{A}}\times\cdots\times\llbracket s_{n}\rrbracket_{\mathbb{A}}\rightarrow\llbracket s\rrbracket_{\mathbb{A}} to each f:s1snsf:s_{1}\ldots s_{n}\rightarrow s in ΣF\Sigma_{F}. Let V:={Vs|sΣS}V:=\{V_{s}\;|\;s\in\Sigma_{S}\} be a ΣS\Sigma_{S}-sorted set of variables. A term algebra 𝒯Σ(V)\mathcal{T}_{\Sigma}(V) has “sort-safe” terms as elements and function symbols interpreted by themselves. We denote by 𝒯Σ\mathcal{T}_{\Sigma} the algebra of ground Σ\Sigma-terms. We often use mixfix syntax where underscores in the function name correspond to consecutive arguments. Thus, if ΣF\Sigma_{F} contains _+_:AAA\_+\_:A\;A\rightarrow A and 0:A0:\rightarrow A then 0+00+0 is a ground term of sort AA. Positions in a term are denoted by strings of positive integers. Denote by ε\varepsilon the empty string, and by t|κt|_{\kappa} the subterm of t𝒯Σ(V)t\in\mathcal{T}_{\Sigma}(V) at position κ+\kappa\in\mathbb{Z}^{*}_{+} (if defined), i.e., t|ε:=tt|_{\varepsilon}:=t, and f(t1,,tn)|kκ:=tk|κf(t_{1},\ldots,t_{n})|_{k\kappa}:=t_{k}|_{\kappa}. Let 𝑃𝑜𝑠(t):={κ+|t|κis defined}\mathit{Pos}(t):=\{\kappa\in\mathbb{Z}^{*}_{+}\;|\;t|_{\kappa}\;\text{is defined}\}. If κ𝑃𝑜𝑠(t)\kappa\in\mathit{Pos}(t) and uu is a term of the same sort as t|κt|_{\kappa}, then we denote by t[u]κt[u]_{\kappa} the result of replacing t|κt|_{\kappa} in tt with uu. We use a standard notation for substitutions. Let a=a1,,an\vec{a}=a_{1},\ldots,a_{n} be a list of terms, v=v1,,vn\vec{v}=v_{1},\ldots,v_{n} a list of distinct variables. Then we denote σ={a/v}={a1/v1,,an/vn}\sigma=\{\vec{a}/\vec{v}\}=\{a_{1}/v_{1},\ldots,a_{n}/v_{n}\} when σ(vi)=ai\sigma(v_{i})=a_{i}, i{1,,n}i\in\{1,\ldots,n\}, and σ(v)=v\sigma(v)=v for any variable v{v1,,vn}v\notin\{v_{1},\ldots,v_{n}\}.

A Σ\Sigma-algebra may be defined as a quotient of 𝒯Σ\mathcal{T}_{\Sigma} by a congruence generated by a set AEA\cup E of equalities, where equalities in AA, referred to as equational attributes, define structural properties such as associativity, commutativity, or identity, and EE consists of conditional equalities interpreted as directed simplification rules on 𝒯Σ\mathcal{T}_{\Sigma}. It is assumed that simplifications terminate and are confluent, hence each tt has the unique (modulo AA) irreducible form tE/A𝒯Σt{\downarrow_{E/A}}\in\mathcal{T}_{\Sigma} representing a class of tt in 𝒯Σ/=AE\mathcal{T}_{\Sigma}/{=}_{A\cup E}.

Simplification with respect to equalities computes values. The behaviour is represented with rewritings. A rewriting system =(Σ,A,E,R)\mathcal{R}=(\Sigma,A,E,R) consists of a signature Σ\Sigma, a set of equations AEA\cup E were EE defines confluent and terminating (modulo AA) simplifications on 𝒯Σ\mathcal{T}_{\Sigma} , and a finite set RR of conditional rewriting rules of the form λ:t1t2𝑖𝑓C\lambda:t_{1}\Rightarrow t_{2}\ \mathit{if}\ C, where optional condition CC is a conjunction of equalities, and λ\lambda is the rule’s label. A one-step rewrite u𝜆uu\xrightarrow{\lambda}_{\mathcal{R}}u^{\prime} from uu to uu^{\prime} using such a rule is possible if there exists a position κ\kappa, term vv, and a substitution σ\sigma such that u=Avu=_{A}v, v|κ=Aσ(t1)v|_{\kappa}=_{A}\sigma(t_{1}), u=Av[σ(t2)]κu^{\prime}=_{A}v[\sigma(t_{2})]_{\kappa} and σ(C)\sigma(C) is satisfied. We write uuu\rightarrow_{\mathcal{R}}u^{\prime} iff there exist terms s,s𝒯Σs,s^{\prime}\in\mathcal{T}_{\Sigma} and a label λ\lambda of a rule in RR such that uE/A=Asu{\downarrow}_{E/A}=_{A}s, s𝜆ss\xrightarrow{\lambda}_{\mathcal{R}}s^{\prime}, and sE/A=AuE/As^{\prime}{\downarrow}_{E/A}=_{A}u^{\prime}{\downarrow}_{E/A}. We denote by +\rightarrow_{\mathcal{R}}^{+} and \rightarrow_{\mathcal{R}}^{*} the transitive and reflexive-transitive closures of \rightarrow_{\mathcal{R}}. We also write u!uu\rightarrow_{\mathcal{R}}^{!}u^{\prime} if uuu\rightarrow_{\mathcal{R}}^{*}u^{\prime} and there is no u′′u^{\prime\prime} such that uu′′u^{\prime}\rightarrow_{\mathcal{R}}u^{\prime\prime}. If \mathcal{R} is implied by the context, we omit \mathcal{R} from arrows.

Variants of the following definition and easy to prove lemma appear in the literature (see, e.g., [37]):

Definition 1.1

Let (X,)(X,\rightarrow), where X×X\rightarrow\subseteq X\times X, be a transition system. Assume that \equiv is an equivalence on XX which is a bisimulation on (X,)(X,\rightarrow). We call \rightarrow semiconfluent at xXx\in X modulo \equiv if for all y,yXy,y^{\prime}\in X such that yxyy\leftarrow x\rightarrow^{*}y^{\prime} there exist z,zXz,z^{\prime}\in X such that yzzyy\rightarrow^{*}z\equiv z^{\prime}\;{}^{*}\!\!\leftarrow y^{\prime}. We call \rightarrow semiconfluent modulo \equiv if \rightarrow is semiconfluent at all xXx\in X. We call \rightarrow confluent at xXx\in X modulo \equiv if for all y,yXy,y^{\prime}\in X such that yxyy\;{}^{*}\!\!\leftarrow x\rightarrow^{*}y^{\prime} there exist z,zXz,z^{\prime}\in X such that yzzyy\rightarrow^{*}z\equiv z^{\prime}\;{}^{*}\!\!\leftarrow y^{\prime}. We call \rightarrow confluent modulo \equiv if \rightarrow is confluent at all xXx\in X.

Lemma 1.2

Semiconfluence modulo equivalence implies confluence modulo equivalence.

2 Multisets of facts, fresh facts and patterns

Our queries are evaluated against, or act on, finite multisets of facts. Duplicate facts can be genuinely useful and removing them is computationally expensive. If necessary, duplicates can be removed explicitly or, better, one can ensure that no duplicates are introduced in the first place by judicious choice of DML operations. In fact, SQL is a multiset query language as well, hence by using multisets we are closer to the actual relational database practice than formal systems based on sets.

𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} is parametrized with respect to a signature of facts Σ\Sigma and a Σ\Sigma-algebra of facts 𝒟\mathcal{D}. ΣS\Sigma_{S} must contain sorts Fact and Bool for facts and Booleans, respectively. All constructors for facts are contained in ΣF\Sigma_{F}. Facts are reifications of predicate instances. A typical fact has the form f(a1,,an)f(a_{1},\ldots,a_{n}), where f:s1snFactf:s_{1}\ldots s_{n}\rightarrow\text{\tt Fact} is a fact constructor. 𝒟\mathcal{D} defines all the data types used in facts and is specifiable in terms of directed equations and equational attributes. 𝒟\mathcal{D} must define Boolean connectives and Boolean-valued equality _=_:ssBool\_=\_:s\;s\rightarrow\text{\tt Bool} for all sΣS,Ks\in\Sigma_{S,K}. We assume that all ground terms of sort Bool simplify to either t or f. This is non-trivial: Define a function f:NatBoolf:\text{\tt Nat}\rightarrow\text{\tt Bool} with a single equation f(0)=tf(0)=\text{\bf t}. Then f(1)f(1) is fully reduced and distinct from both t and f.

Multisets of facts. The signature of multisets extends ΣS\Sigma_{S} with sorts (Ne)FSet of finite (non-empty) multisets of facts. The subsort ordering is given by Fact<NeFSet<FSet\text{\tt Fact}<\text{\tt NeFSet}<\text{\tt FSet}. In particular, each fact is a non-empty multiset of facts. Finite multisets of facts are constructed with an associative and commutative binary operator __:FSetFSetFSet\_\circ\_:\text{\tt FSet}\;\text{\tt FSet}\rightarrow\text{\tt FSet} (cf. [38]) with identity element :FSet\emptyset:\rightarrow\text{\tt FSet}. Operator __\_\circ\_ is subsort overloaded with the additional declaration __:FSetNeFSetNeFSet\_\circ\_:\text{\tt FSet}\;\text{\tt NeFSet}\rightarrow\text{\tt NeFSet}. Thus, a multiset constructed from a multiset and a non-empty multiset is non-empty.

Freshness and nominal sorts. Support for creation of fresh values is a common requirement (cf. [7]): Identifiers for new objects must not belong to the present nor any past active domain of the database. To understand why reusing identifiers from past domains is bad consider situation where a new business object is created with the identifier of a previously deleted one. In this case the attempt to verify if the deleted object is present in the final database may yield an incorrect affirmative answer. We support creation of fresh values of nominal sorts only. Usually this suffices, and freshness for non-nominal data types is problematic (cf. [39]). A sort ss is nominal (relative to Σ\Sigma-algebra 𝒟\mathcal{D}) if values of this sort have no non-trivial algebraic or relational structure beside equality. In particular, for nominal ss, sss′′s^{\prime}\leq s\leq s^{\prime\prime} if and only if s=s=s′′s^{\prime}=s=s^{\prime\prime}. To create values of each nominal sort ss we have constructor ı_s:Nats\imath^{s}_{\_}:\text{\tt Nat}\rightarrow s which belongs neither to Σ\Sigma nor to the signature of 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}}.

Example 2.1

Consider a client basket database. Identifiers of customers, products and baskets have sorts 𝔠\mathfrak{c}, 𝔭\mathfrak{p}, and 𝔟\mathfrak{b}, respectively. We use two fact constructors: _owns_:𝔠𝔟Fact\_\text{\tt owns}\_:\mathfrak{c}\;\mathfrak{b}\rightarrow\text{\tt Fact} and _in_:𝔭𝔟Fact\_\text{\tt in}\_:\mathfrak{p}\;\mathfrak{b}\rightarrow\text{\tt Fact}. Multiset of facts (ı1𝔠ownsı1𝔟)(ı2𝔭inı1𝔟)(ı3𝔭inı1𝔟)(\imath^{\mathfrak{c}}_{1}\;\text{\tt owns}\;\imath^{\mathfrak{b}}_{1})\circ(\imath^{\mathfrak{p}}_{2}\;\text{\tt in}\;\imath^{\mathfrak{b}}_{1})\circ(\imath^{\mathfrak{p}}_{3}\;\text{\tt in}\;\imath^{\mathfrak{b}}_{1}) denotes the state in which customer ı1𝔠\imath^{\mathfrak{c}}_{1} is the owner of basket ı1𝔟\imath^{\mathfrak{b}}_{1} containing products ı2𝔭\imath^{\mathfrak{p}}_{2} and ı3𝔭\imath^{\mathfrak{p}}_{3}. Using multisets instead of sets can be useful: multiset (ı3𝔭inı1𝔟)(ı3𝔭inı1𝔟)(\imath^{\mathfrak{p}}_{3}\;\text{\tt in}\;\imath^{\mathfrak{b}}_{1})\circ(\imath^{\mathfrak{p}}_{3}\;\text{\tt in}\;\imath^{\mathfrak{b}}_{1}) denotes the situation where basket ı1𝔟\imath^{\mathfrak{b}}_{1} contains two items of ı3𝔭\imath^{\mathfrak{p}}_{3}.

Constructing values of nominal sorts from natural numbers simplifies creation of fresh values: to ensure freshness one can construct the new value with the smallest natural number which was not used so far. To keep track of those “smallest unused naturals”, and to make retrieval of fresh values similar to retrieval of data we use fresh facts of sort Fact𝐧\text{\tt Fact}^{\mathbf{n}} unrelated to Fact. For each nominal sort ss we have a single-argument constructor of the form Cs:sFact𝐧C_{s}:s\rightarrow\text{\tt Fact}^{\mathbf{n}} which wraps value ıns\imath^{s}_{n} such that for all mnm\geq n, ıms\imath^{s}_{m} was never used before. When fresh value of sort ss is requested, we return ıns\imath^{s}_{n} and update the fresh fact to Cs(ın+1s)C_{s}(\imath^{s}_{n+1}). Fresh facts are combined into (non-empty) multisets of sort (Ne)FSet𝐧\text{\tt FSet}^{\mathbf{n}} using commutative and associative operator __:FSet𝐧FSet𝐧FSet𝐧\_\circ\_:\text{\tt FSet}^{\mathbf{n}}\;\text{\tt FSet}^{\mathbf{n}}\rightarrow\text{\tt FSet}^{\mathbf{n}} with identity :FSet𝐧\emptyset:\rightarrow\text{\tt FSet}^{\mathbf{n}}. To facilitate bulk updates of fresh facts needed in the semantics of DML queries, we define the following function:

υ:FSet𝐧FSet𝐧,υ(Cs1(ım1s1)Csn(ımnsn))=Cs1(ım1+1s1)Csn(ımn+1sn).\upsilon:\text{\tt FSet}^{\mathbf{n}}\rightarrow\text{\tt FSet}^{\mathbf{n}},\quad\upsilon\bigl{(}C_{s_{1}}(\imath^{s_{1}}_{m_{1}})\circ\cdots\circ C_{s_{n}}(\imath^{s_{n}}_{m_{n}})\bigr{)}=C_{s_{1}}(\imath^{s_{1}}_{m_{1}+1})\circ\cdots\circ C_{s_{n}}(\imath^{s_{n}}_{m_{n}+1}). (1)

Patterns. Quantifiers in 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} quantify over patterns (of sort Pat) containing non-ground multisets of facts and fresh facts marked by modalities, which control retention of matched facts (i.e., whether upon matching they are removed temporarily, permanently, or not at all from the database), and syntactically wrap fresh facts (i.e., fresh facts can appear in the pattern only inside specialized modality). Patterns can be preserving (of sort Patp\text{\tt Pat}^{p}), semi-terminating (of sort Patst\text{\tt Pat}^{\text{\it st}}), terminating (of sort Patt\text{\tt Pat}^{t}), terminating and preserving (of sort Pattp\text{\tt Pat}^{\text{\it tp}}), or neither. The subsort relation is defined by Pattp<Patt<Patst<Pat\text{\tt Pat}^{\text{\it tp}}<\text{\tt Pat}^{t}<\text{\tt Pat}^{\text{\it st}}<\text{\tt Pat} and Pattp<Patp<Pat\text{\tt Pat}^{\text{\it tp}}<\text{\tt Pat}^{p}<\text{\tt Pat}. Patterns are constructed with modalities

[_]!:NeFSetPatp,[_]?:NeFSetPattp,[_]0:NeFSetPatst,[_]𝐧:NeFSet𝐧Pat.[\_]_{!}:\text{\tt NeFSet}\rightarrow\text{\tt Pat}^{p},\quad\!\![\_]_{?}:\text{\tt NeFSet}\rightarrow\text{\tt Pat}^{\text{\it tp}},\quad\!\![\_]_{0}:\text{\tt NeFSet}\rightarrow\text{\tt Pat}^{\text{\it st}},\quad\!\![\_]_{\mathbf{n}}:\text{\tt NeFSet}^{\mathbf{n}}\rightarrow\text{\tt Pat}.

and associative and commutative, subsort overloaded operator:

__:PatPatPat.__:PatPattPatt,__:PatpPatpPatp,\displaystyle\_\circ\_:\text{\tt Pat}\;\text{\tt Pat}\rightarrow\text{\tt Pat}.\quad\_\circ\_:\text{\tt Pat}\;\text{\tt Pat}^{t}\rightarrow\text{\tt Pat}^{t},\quad\_\circ\_:\text{\tt Pat}^{p}\;\text{\tt Pat}^{p}\rightarrow\text{\tt Pat}^{p},
__:PatpPattpPattp,__:PatstPatPatst.\displaystyle\_\circ\_:\text{\tt Pat}^{p}\;\text{\tt Pat}^{\text{\it tp}}\rightarrow\text{\tt Pat}^{\text{\it tp}},\quad\_\circ\_:\text{\tt Pat}^{\text{\it st}}\;\text{\tt Pat}\rightarrow\text{\tt Pat}^{\text{\it st}}.

Thus, a terminating (resp. a semi-terminating) pattern has to contain at least one fact wrapped with [_]?[\_]_{?} (resp. with either [_]?[\_]_{?} or [_]0[\_]_{0}). A terminating and preserving pattern consists of facts marked only with [_]![\_]_{!} or [_]?[\_]_{?}, and it contains at least one fact wrapped with [_]?[\_]_{?}. Directed equalities [F1]m[F2]m=[F1F2]m[F_{1}]_{m}\circ[F_{2}]_{m}=[F_{1}\circ F_{2}]_{m}, where m{!,?,0,𝐧}m\in\{!,?,0,\mathbf{n}\}, and F1F_{1}, F2F_{2} are non-empty multisets of (fresh) facts guarantee that fully reduced patterns have facts gathered in groups of the same modality.

Example 2.2

Let Id be a sort, let f,g:IdNatFactf,g:\text{\tt Id}\;\text{\tt Nat}\rightarrow\text{\tt Fact} and h:IdFacth:\text{\tt Id}\rightarrow\text{\tt Fact} be fact constructors and let xx, yy, zz and tt be variables. Then

[f(x,y)h(x)]?[g(x,y)]![CId(t)]𝐧[g(x,1)]0[f(x,y)\circ h(x)]_{?}\circ[g(x,y)]_{!}\circ[C_{\text{\tt Id}}(t)]_{\mathbf{n}}\circ[g(x,1)]_{0}

is a pattern. It is terminating because of a presence of [f(x,y)h(x)]?[f(x,y)\circ h(x)]_{?} subpattern. It is not, however, preserving since it contains facts wrapped in [_]0[\_]_{0} and [_]𝐧[\_]_{\mathbf{n}}.

The informal meaning of modalities is as follows: Facts matched by those marked by [_]?[\_]_{?} can be considered at most once during quantifier evaluation, but they are not removed from the database. Facts marked by [_]0[\_]_{0} are removed from the database when matched, but they are returned if the computation branch this matching leads to is unsuccessful. Thus, the presence of facts marked by [_]0[\_]_{0} in the pattern may not guarantee termination, unless one can prove that the formula under quantifier is always successful. Facts marked by [_]![\_]_{!} are always retained in the database, and [_]𝐧[\_]_{\mathbf{n}} wraps fresh facts.

Remark 2.3

In what follows we use the following notation. Let PP be a pattern. We denote by P0P_{0}, P!P_{!}, P?P_{?}, P𝐧P_{\mathbf{n}} the multisets of facts consisting of those facts in PP which are wrapped by modalities [_]0[\_]_{0}, [_]![\_]_{!}, [_]?[\_]_{?}, and [_]𝐧[\_]_{\mathbf{n}}, respectively. Thus, e.g., ([F1]![F2]?)?:=F2([F_{1}]_{!}\circ[F_{2}]_{?})_{?}:=F_{2} and ([F1]![F2]?)𝐧:=([F_{1}]_{!}\circ[F_{2}]_{?})_{\mathbf{n}}:=\emptyset.

3 Query and condition languages

This section introduces the three sublanguages of 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}}, their syntax and informal semantics. Formal semantics based on conditional term rewriting is provided in the subsequent sections.

3.1 Conditions

The language 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} of conditions on finite multisets of facts is analogous to first-order logic with quantification restricted to the active domain.

Definition 3.1

Let Σ\Sigma be a signature and let 𝒟\mathcal{D} be a Σ\Sigma-algebra of facts. Formulas of 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} are (generally non-ground) terms of sort Cnd constructed with

:Cnd,{_}:BoolCnd,¬_:CndCnd,__:CndCndCnd,_._:PattpCndCnd.\displaystyle\bot:\rightarrow\text{\tt Cnd},\ \{\_\}:\text{\tt Bool}\rightarrow\text{\tt Cnd},\ \neg\_:\text{\tt Cnd}\rightarrow\text{\tt Cnd},\ \_\vee\_:\text{\tt Cnd}\;\text{\tt Cnd}\rightarrow\text{\tt Cnd},\ \exists\_.\_:\text{\tt Pat}^{\text{\it tp}}\;\text{\tt Cnd}\rightarrow\text{\tt Cnd}.

Thus, \bot, {B}\{B\}, ¬ψ\neg\psi, ψψ\psi\vee\psi^{\prime} and P.ψ\exists P\mathbin{.}\psi are conditions if BB is a term of sort Bool, PP is a terminating and preserving pattern, and ψ\psi and ψ\psi^{\prime} are conditions. Consider condition T:=P.ψT:=\exists P\mathbin{.}\psi. Existential quantifier \exists binds in ψ\psi all the variables appearing in PP which were not bound by the term surrounding TT. Thus, the meaning of the formula may change when it is placed in a different context.

Example 3.2

Let R:NatNatFactR:\text{\tt Nat}\ \text{\tt Nat}\rightarrow\text{\tt Fact}, and suppose that terms R(t1,t2)R(t_{1},t_{2}) represent rows of a relation 𝐑𝐍×𝐍\mathbf{R}\subseteq\mathbf{N}\times\mathbf{N}. Let xx, yy, and zz be distinct variables. Then condition ¬[R(x,y)]?.[R(x,z)]?.¬{y=z}\neg\exists[R(x,y)]_{?}.\exists[R(x,z)]_{?}.\neg\{y=z\} expresses functional dependency from the first to the second column of 𝐑\mathbf{R}. The first quantifier binds xx and yy, the second one binds zz. The condition is closed. The subcondition [R(x,z)]?.¬{y=z}\exists[R(x,z)]_{?}.\neg\{y=z\} taken on its own is open, but only yy is free and the quantifier now binds both xx and zz.

Closed formulas in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} are called sentences in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}. Let 𝑉𝑎𝑟(t)\mathit{Var}(t) be the set of variables of tt, and let 𝑐𝑙?(ϕ)\mathit{cl?}(\phi) iff ϕ\phi in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} is closed. To define 𝑐𝑙?(ϕ)\mathit{cl?}(\phi) by structural recursion we need to keep track of variables bound by the context of ϕ\phi. Thus, 𝑐𝑙?(ϕ):=𝑐𝑙?(ϕ,)\mathit{cl?}(\phi):=\mathit{cl?}(\phi,\emptyset), where, for any set of variables VV,

𝑐𝑙?(,V)=t,𝑐𝑙?(¬ϕ,V)=𝑐𝑙?(ϕ,V),𝑐𝑙?({t},V)=𝑉𝑎𝑟(t)V\displaystyle\mathit{cl?}(\bot,V)=\text{\bf t},\quad\mathit{cl?}(\neg\phi,V)=\mathit{cl?}(\phi,V),\quad\mathit{cl?}(\{t\},V)=\mathit{Var}(t)\subseteq V
𝑐𝑙?(ϕ1ϕ2,V)=𝑐𝑙?(ϕ1,V)𝑐𝑙?(ϕ2,V),𝑐𝑙?(P.ϕ,V)=𝑐𝑙?(ϕ,V𝑉𝑎𝑟(P)).\displaystyle\mathit{cl?}(\phi_{1}\vee\phi_{2},V)=\mathit{cl?}(\phi_{1},V)\wedge\mathit{cl?}(\phi_{2},V),\quad\mathit{cl?}(\exists P\mathbin{.}\phi,V)=\mathit{cl?}\bigl{(}\phi,V\cup\mathit{Var}(P)\bigr{)}. (2)

As the syntactic sugar we define operators :Cnd\top:\rightarrow\text{\tt Cnd}, __:CndCndCnd\_\wedge\_:\text{\tt Cnd}\;\text{\tt Cnd}\rightarrow\text{\tt Cnd}, _._:PattpCndCnd\forall\_.\_:\text{\tt Pat}^{\text{\it tp}}\;\text{\tt Cnd}\rightarrow\text{\tt Cnd}. with equalities =¬\top=\neg\bot, ϕ1ϕ2=¬(¬ϕ1¬ϕ2)\phi_{1}\wedge\phi_{2}=\neg(\neg\phi_{1}\vee\neg\phi_{2}), P.ϕ=¬P.¬ϕ\forall P\mathbin{.}\phi=\neg\exists P\mathbin{.}\neg\phi. Later we prove that, for any condition ϕ\phi, ¬¬ϕ\neg\neg\phi is logically equivalent to ϕ\phi. Then the functional dependency in Example 3.2 can be equivalently written as [R(x,y)]?.[R(x,z)]?.{y=z}.\forall[R(x,y)]_{?}.\forall[R(x,z)]_{?}.\{y=z\}.

Definition 3.3

A subcondition ψ\psi of ϕ\phi in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} is a subterm of ϕ\phi of sort Cnd.

3.2 Syntax of queries and DML queries

Syntactically 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} is a restriction of 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}. Therefore, we define their syntax jointly as follows:

Definition 3.4

Queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} are terms of sort Qy. DML queries in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}} are terms of sort DQy. Success assured (DML) queries are terms of sorts DQys\text{\tt DQy}^{s}. The sorts are ordered with Fact<Qy<DQy\text{\tt Fact}<\text{\tt Qy}<\text{\tt DQy} and Fact<DQys<DQy\text{\tt Fact}<\text{\tt DQy}^{s}<\text{\tt DQy}. Thus, every query in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} can be also interpreted as DML query (which inserts but doesn’t delete). Every fact is also a query. Success assured DML queries are DML queries guaranteed to return some facts (or at least \surd). Terms of sort DQy are constructed with

:DQys,__:DQyDQyDQy,__:QyQyQy,__:DQysDQyDQys,\displaystyle\surd:\rightarrow\text{\tt DQy}^{s},\quad\_\rhd\_:\text{\tt DQy}\;\text{\tt DQy}\rightarrow\text{\tt DQy},\quad\_\rhd\_:\text{\tt Qy}\;\text{\tt Qy}\rightarrow\text{\tt Qy},\quad\_\rhd\_:\text{\tt DQy}^{s}\;\text{\tt DQy}\rightarrow\text{\tt DQy}^{s},
__:DQyDQysDQys,:Qy,__:CndDQyDQy,__:CndQyQy,\displaystyle\_\rhd\_:\text{\tt DQy}\;\text{\tt DQy}^{s}\rightarrow\text{\tt DQy}^{s},\quad\emptyset:\rightarrow\text{\tt Qy},\quad\_\Rightarrow\_:\text{\tt Cnd}\;\text{\tt DQy}\rightarrow\text{\tt DQy},\quad\_\Rightarrow\_:\text{\tt Cnd}\;\text{\tt Qy}\rightarrow\text{\tt Qy},
_._:PattDQyDQy,_._:PattpQyQy,_._:PatstDQysDQy.\displaystyle\nabla\_.\_:\text{\tt Pat}^{t}\;\text{\tt DQy}\rightarrow\text{\tt DQy},\quad\nabla\_.\_:\text{\tt Pat}^{\text{\it tp}}\;\text{\tt Qy}\rightarrow\text{\tt Qy},\quad\nabla\_.\_:\text{\tt Pat}^{\text{\it st}}\;\text{\tt DQy}^{s}\rightarrow\text{\tt DQy}.

Thus, \emptyset, ff, QQQ\rhd Q^{\prime}, ϕQ\phi\Rightarrow Q, and P.Q\nabla P\mathbin{.}Q are queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} if ff is a fact, PP is a terminating and preserving pattern, QQ and QQ^{\prime} are queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}, and ϕ\phi is a condition. Similarly, \emptyset, \surd, ff, DDD\rhd D^{\prime}, ϕD\phi\Rightarrow D, and P.D\nabla P\mathbin{.}D are DML queries in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}} if ff is a fact, PP is a terminating pattern, DD and DD^{\prime} are DML queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}, and ϕ\phi is a condition. In addition, P.D\nabla P\mathbin{.}D is a DML query if PP is a more general semi-terminating pattern, but DD is success-assured, i.e., either \surd, a fact, or of the form D1D2D_{1}\rhd D_{2} where at least one of D1D_{1}, D2D_{2} is success assured. We force DD to be success assured when PP is semi-terminating, but not terminating because quantification over semi-terminating pattern may not terminate if the quantified expression can fail. Informal semantics of (DML) queries is given by:

  1. 1.

    Let f:Factf:\text{\tt Fact}. A query ff returns ff. A DML query ff adds ff to the current database.

  2. 2.

    \surd is used to mark a branch of a DML query as successful even if it does not add any facts.

  3. 3.

    \emptyset is a query returning the empty multiset of facts or a DML query which does nothing.

  4. 4.

    A query QQQ\rhd Q^{\prime} returns the multiset union of results of QQ and QQ^{\prime}. A DML query DDD\rhd D^{\prime} adds to and removes from the database a multiset union of what DD and DD^{\prime} add and remove, respectively. Since facts are removed immediately, __\_\rhd\_ is not commutative for DML queries.

  5. 5.

    A query ϕQ\phi\Rightarrow Q returns what QQ returns if ϕ\phi is satisfied. It returns \emptyset otherwise. A DML query ϕD\phi\Rightarrow D does what DD does if ϕ\phi is satisfied. It does nothing otherwise.

  6. 6.

    Quantifier P.Q\nabla P.Q denotes iteration over facts in the database matching PP. At each iteration step σ(Q)\sigma(Q) is executed, where σ\sigma is the matching substitution. If PP contains fresh facts, execution of a DML query P.Q\nabla P\mathbin{.}Q may introduce fresh values.

Let 𝑐𝑙?(Q)\mathit{cl?}(Q) if and only if the query ϕ\phi in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}} (or 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}) is closed. Similarly as in the case of conditions, 𝑐𝑙?(Q):=𝑐𝑙?(Q,)\mathit{cl?}(Q):=\mathit{cl?}(Q,\emptyset), where, for any set of variables VV,

𝑐𝑙?(,V)=𝑐𝑙?(,V)=t,𝑐𝑙?(ϕQ,V)=𝑐𝑙?(ϕ,V)𝑐𝑙?(Q,V),\displaystyle\mathit{cl?}(\emptyset,V)=\mathit{cl?}(\surd,V)=\text{\bf t},\quad\mathit{cl?}(\phi\Rightarrow Q,V)=\mathit{cl?}(\phi,V)\wedge\mathit{cl?}(Q,V),
𝑐𝑙?(f,V)=𝑉𝑎𝑟(f)Viff:Fact\displaystyle\mathit{cl?}(f,V)=\mathit{Var}(f)\subseteq V\ \text{if}\ f:\text{\tt Fact}
𝑐𝑙?(Q1Q2,V)=𝑐𝑙?(Q1,V)𝑐𝑙?(Q2,V),𝑐𝑙?(P.Q,V)=𝑐𝑙?(Q,V𝑉𝑎𝑟(P)).\displaystyle\mathit{cl?}(Q_{1}\rhd Q_{2},V)=\mathit{cl?}(Q_{1},V)\wedge\mathit{cl?}(Q_{2},V),\quad\mathit{cl?}(\nabla P\mathbin{.}Q,V)=\mathit{cl?}\bigl{(}Q,V\cup\mathit{Var}(P)\bigr{)}.
Definition 3.5

A (DML) subquery of a (DML) query QQ is a subterm of QQ of sort (D)Qy.

Example 3.6

The following query in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} is closed:

[f(x)g]?[h(x)]!.({x>5}f(x+1)).\nabla[f(x)\circ g]_{?}\circ[h(x)]_{!}\mathbin{.}\bigl{(}\{x>5\}\Rightarrow f(x+1)\bigr{)}.

It returns a multiset of facts consisting of facts of the form f(x+1)f(x+1) where f(x)f(x) and h(x)h(x) belong to the multiset we query and x>5x>5. When evaluating the query each source fact f(x)f(x) (such that h(x)h(x) is in the multiset and x>5x>5) and some fact gg are matched only once during the evaluation of this query (on the other hand, facts of the form h(x)h(x) can be matched multiple times). Thus, we return at most nn facts f(x+1)f(x+1), where nn is the number of gg facts in the database. For example, for multiset f(7)f(8)f(7)f(4)h(8)h(7)ggf(7)\circ f(8)\circ f(7)\circ f(4)\circ h(8)\circ h(7)\circ g\circ g the query returns either f(7)f(8)f(7)\circ f(8) or f(7)f(7)f(7)\circ f(7) depending on whether we match f(7)gf(7)\circ g and f(8)gf(8)\circ g or we match f(7)gf(7)\circ g twice (and then, in both cases, we run out of gg-facts).

Example 3.7

The following (closed) “DML query” is not syntactically correct (it cannot be assigned sort DQy):

[f(x)]0.({x>5}g(x))\nabla[f(x)]_{0}\mathbin{.}\bigl{(}\{x>5\}\Rightarrow g(x)\bigr{)}

This is because the pattern [f(x)]0[f(x)]_{0} is only semiterminating, but not terminating. In this case the subquery in the scope of [f(x)]!._\nabla[f(x)]_{!}\mathbin{.}\_ should be success assured, but {x>5}g(x)\{x>5\}\Rightarrow g(x) clearly isn’t (if x5x\leq 5 then it fails). If we would execute this query against a multiset f(1)f(6)f(1)\circ f(6) it would loop forever matching f(1)f(1), removing it, failing when executing {1>5}g(1)\{1>5\}\Rightarrow g(1), returning f(1)f(1) to the multiset, matching it again, and so on.

As the following example demonstrates, some incorrect DML queries of the form P.ψ\nabla P\mathbin{.}\psi, where PP is semiterminating but not terminating and ϕ\phi is not success assured, provably always terminate. Nevertheless, we prefer to reject them regardless, since usually making them syntactically correct is not difficult:

Example 3.8

Assume that xx is a variable of sort Nat (natural numbers). Execution of the following syntactically incorrect DML query always terminates (it replaces each fact of the form f(x)f(x) where x>5x>5 with fact g(x)g(x), and returns to the multiset all other f(x)f(x)’s unchanged):

[f(x)]0.(({x>5}g(x))({x5}f(x))).\nabla[f(x)]_{0}\mathbin{.}\bigl{(}(\{x>5\}\Rightarrow g(x))\rhd(\{x\leq 5\}\Rightarrow f(x))\bigr{)}.

Termination follows from the fact that ({x>5}g(x))({x5}f(x))(\{x>5\}\Rightarrow g(x))\rhd(\{x\leq 5\}\Rightarrow f(x)\bigr{)} always succeeds regardless of which natural number is bound to xx: depending on whether x>5x>5 or x5x\leq 5 (and one of these must be true) either left or right argument of \rhd succeeds and hence the whole subquery succeeds.

Unfortunately, neither {x>5}g(x)\{x>5\}\Rightarrow g(x) nor {x5}f(x)\{x\leq 5\}\Rightarrow f(x) is syntactically success assured and hence ({x>5}g(x))({x5}f(x))(\{x>5\}\Rightarrow g(x))\rhd(\{x\leq 5\}\Rightarrow f(x)\bigr{)} is also not success assured. However, it is very easy to modify the above query so that it describes the same modification to the database, but is now syntactically correct and can be assigned sort DQy:

[f(x)]0.(({x>5}g(x))({x5}f(x))).\nabla[f(x)]_{0}\mathbin{.}\bigl{(}(\{x>5\}\Rightarrow g(x))\rhd(\{x\leq 5\}\Rightarrow f(x))\rhd\mathit{\surd}\bigr{)}.

4 Rewriting semantics of 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}

Semantics of a sentence ϕ\phi in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} is given by the rewriting system Σ,𝒟𝐜𝐧𝐝(ϕ)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi). Terms rewritten by Σ,𝒟𝐜𝐧𝐝(ϕ)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi) are of the form {F,S}c\{F,S\}^{c}, where {_,_}c:FSetStkcStatec\{\_,\_\}^{c}:\text{\tt FSet}\;\text{\tt Stk}^{c}\rightarrow\text{\tt State}^{c}, FF is the database of facts on which ϕ\phi is checked, and SS is a stack of sort Stkc\text{\tt Stk}^{c} implementing structural recursion. Normal forms, constructed with 𝔰:BoolStatec\mathfrak{s}:\text{\tt Bool}\rightarrow\text{\tt State}^{c}, encapsulate the result of ϕ\phi’s evaluation. Sort Statec\text{\tt State}^{c} of terms holding the full state of evaluation must be distinct from all the other sorts and it must not have any super- or sub-sort relation to other sorts. This guarantees that there are no constructors accepting terms of sort Statec\text{\tt State}^{c} as arguments. Consequently, if all rewrite rules have terms of this sort on the left-hand side, then no subterms can be rewritten, i.e., the defined rewriting system is top-level. Terms of sort Stkc\text{\tt Stk}^{c} are built from frames of sort Frmc\text{\tt Frm}^{c}, where Frmc<Stkc\text{\tt Frm}^{c}<\text{\tt Stk}^{c}, using an associative binary operator __:StkcStkcStkc\_\_:\text{\tt Stk}^{c}\;\text{\tt Stk}^{c}\rightarrow\text{\tt Stk}^{c} with identity \emptyset. Most constructors of frames are indexed by subconditions ψ\psi of ϕ\phi, and lists of distinct variable names v:=v1,,vn\vec{v}:=v_{1},\ldots,v_{n} of respective sorts s1,,sns_{1},\ldots,s_{n} (v\vec{v} can be of any length, even be empty, as long as {v}𝑉𝑎𝑟(ϕ)\{\vec{v}\}\subseteq\mathit{Var}(\phi) and it contains free variables of ψ\psi):

𝔯:BoolFrmc,¬:Frmc,[_,,_]ψv|c,[_,,_]ψv|c,:s1snFrmc,\displaystyle\mathfrak{r}:\text{\tt Bool}\rightarrow\text{\tt Frm}^{c},\quad{\neg}:\rightarrow\text{\tt Frm}^{c},\quad\quad[\_,\ldots,\_]^{\vec{v}|c}_{\psi},[\_,\ldots,\_]^{\vec{v}|c,\downarrow}_{\psi}:s_{1}\ldots s_{n}\rightarrow\text{\tt Frm}^{c},
[_|_,,_]P.ψv|c:FSets1snFrmc\displaystyle[\_|\_,\ldots,\_]^{\vec{v}|c}_{\exists P\mathbin{.}\psi}:\text{\tt FSet}\;s_{1}\ldots s_{n}\rightarrow\text{\tt Frm}^{c}

As v\vec{v} can be empty, the above signature templates include []ψ|c,[]ψ|c,:Frmc[]^{|c}_{\psi},[]^{|c,\downarrow}_{\psi}:\rightarrow\text{\tt Frm}^{c} and [_|]ψ|c:FSetFrmc[\_|]^{|c}_{\psi}:\text{\tt FSet}\rightarrow\text{\tt Frm}^{c}. 𝔯(B)\mathfrak{r}(B) encapsulates the result of evaluation of a subcondition. Frame ¬{\neg} negates the result of the next frame on the stack. Frames of the form [a]ψv|c[\vec{a}]^{\vec{v}|c}_{\psi}, [a]ψv|c,[\vec{a}]^{\vec{v}|c,\downarrow}_{\psi}, or [F|a]ψv|c[F^{\prime}|\vec{a}]^{\vec{v}|c}_{\psi} are called (ψ,σ)(\psi,\sigma)-frames, where σ:={a/v}\sigma:=\{\vec{a}/\vec{v}\} is the current substitution. They are related to evaluation of σ(ψ)\sigma(\psi). Marked frames [a]ψv|c,[\vec{a}]^{\vec{v}|c,\downarrow}_{\psi} occur in evaluation of disjunctions __\_\vee\_. Iterator frames [F|a]P.ψv|c[F^{\prime}|\vec{a}]^{\vec{v}|c}_{\exists P\mathbin{.}\psi} represent iterative evaluation of quantifiers. Multiset FF^{\prime}, called iterator state, contains facts available for matching with P!P?P_{!}\circ P_{?}. Given a database FF, to evaluate sentence ϕ\phi we rewrite a state term Iϕ(F):={F,[]ϕ|c}\mathrm{I}_{\phi}(F):=\{F,[]^{|c}_{\phi}\} until a normal form 𝔰(B)\mathfrak{s}(B) is reached. If BB then ϕ\phi is satisfied in FF.

Now we present rule schemata for Σ,𝒟𝐜𝐧𝐝(ϕ)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi) instantiated for a given formula ϕ\phi in 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}}. It is important to distinguish between object variables substituted when applying actual rules, and metavariables used to define rule templates. Below, F,F:FSetF,F^{\prime}:\text{\tt FSet}, S:StkcS:\text{\tt Stk}^{c}, and B:BoolB:\text{\tt Bool} are object variables. We also denote by v:=v1,,vn\vec{v}:=v_{1},\ldots,v_{n} and w=w1,,wm\vec{w}=w_{1},\ldots,w_{m} sequences of object variables in 𝑉𝑎𝑟(ϕ)\mathit{Var}(\phi). Metavariables, ψ\psi, ψ1\psi_{1}, and ψ2\psi_{2} stand for arbitrary subconditions of ϕ\phi, metavariable \mathcal{B} stands for Boolean subterms of ϕ\phi. Metavariable PP stands for arbitrary patterns in ϕ\phi, and P!P_{!} and P?P_{?} denote multisets of facts in the instance of pattern PP marked by respective modalities (see Remark 2.3).

Constant \bot evaluates to f, and {}\{\mathcal{B}\} evaluates to σ()\sigma(\mathcal{B}), where σ\sigma is the current substitution of variables v\vec{v} (σ()\sigma(\mathcal{B}) is ground since {v}\{\vec{v}\} contains all free variables (i.e., all variables) of \mathcal{B}, hence, by assumption, it simplifies to either f or t):

λ:{F,S[v]v|c}c{F,S𝔯(f)}c,λ{_}:{F,S[v]{}v|c}c{F,S𝔯()}c.\lambda_{\bot}:\{F,S[\vec{v}]_{\bot}^{\vec{v}|c}\}^{c}\Rightarrow\{F,S\mathfrak{r}(\text{\bf f})\}^{c},\quad\quad{\lambda_{\{\_\}}}:\{F,S[\vec{v}]_{\{\mathcal{B}\}}^{\vec{v}|c}\}^{c}\Rightarrow\{F,S\mathfrak{r}(\mathcal{B})\}^{c}. (3)

To evaluate σ(¬ψ)\sigma(\neg\psi), we “unfold” by replacing the (¬ψ,σ)(\neg\psi,\sigma)-frame with ¬\neg and (ψ,σ)(\psi,\sigma)-frame. When σ(ψ)\sigma(\psi) is evaluated we “fold” by negating the result:

λ¬unf:{F,S[v]¬ψv|c}{F,S¬[v]ψv|c}c,λ¬fld:{F,S¬𝔯(B)}{F,S𝔯(¬B)}c.{\lambda_{\neg}^{\text{unf}}}:\{F,S[\vec{v}]_{\neg\psi}^{\vec{v}|c}\}\Rightarrow\{F,S{\neg}[\vec{v}]_{\psi}^{\vec{v}|c}\}^{c},\quad\quad{\lambda_{\neg}^{\text{fld}}}:\{F,S{\neg}\mathfrak{r}(B)\}\Rightarrow\{F,S\mathfrak{r}(\neg B)\}^{c}. (4)
Remark 4.1

In our notation the same symbols often play a dual role — as subterms, and as part of function symbols (in sub- and super-scripts). Consider the following instantiation of schema λ¬unf{\lambda_{\neg}^{\text{unf}}}:

λ¬unf:{F,S[x,y]¬{x=y}x,y|c}c{F,S¬[x,y]{x=y}x,y|c}c{\lambda_{\neg}^{\text{unf}}}:\{F,S[x,y]^{x,y|c}_{\neg\{x=y\}}\}^{c}\Rightarrow\{F,S{\neg}[x,y]^{x,y|c}_{\{x=y\}}\}^{c}

Variables in sub- and super-scripts are never substituted: with the above rule we have a one step rewrite of ground terms {,[0,1]¬{x=y}x,y|c}cλ¬unf{,¬[0,1]{x=y}x,y|c}c.\{\emptyset,[0,1]^{x,y|c}_{\neg\{x=y\}}\}^{c}\xrightarrow{{\lambda_{\neg}^{\text{unf}}}}\{\emptyset,{\neg}[0,1]^{x,y|c}_{\{x=y\}}\}^{c}. In schema λ{_}{\lambda_{\{\_\}}} metavariable \mathcal{B} occurs both as part of a name and as a subterm. An instantiation λ{_}:{F,S[x,y]{x=y}x,y|c}c{F,S𝔯(x=y)}c{\lambda_{\{\_\}}}:\{F,S[x,y]^{x,y|c}_{\{x=y\}}\}^{c}\Rightarrow\{F,S\mathfrak{r}(x=y)\}^{c} yields (with S=¬S={\neg}, x=0x=0, and y=1y=1) a one step rewrite {,¬[0,1]{x=y}x,y|c}cλ{_}{,¬𝔯(0=1)}c={,¬𝔯(f)}c.\{\emptyset,{\neg}[0,1]^{x,y|c}_{\{x=y\}}\}^{c}\xrightarrow{{\lambda_{\{\_\}}}}\{\emptyset,{\neg}\mathfrak{r}(0=1)\}^{c}=\{\emptyset,{\neg}\mathfrak{r}(\text{\bf f})\}^{c}.

To evaluate disjunction ψ1ψ2\psi_{1}\vee\psi_{2} we create two frames corresponding to the disjuncts. If ψ2\psi_{2} evaluates to t, the frame marked by \downarrow is dropped (disjunctions are short circuited). If ψ2\psi_{2} evaluates to f, the frame corresponding to ψ1\psi_{1} drops \downarrow and is evaluated normally.

λunf:{F,S[v]ψ1ψ2v|c}c{F,S[v]ψ1v|c,[v]ψ2v|c}c,\displaystyle{\lambda_{\vee}^{\text{unf}}}:\{F,S[\vec{v}]_{\psi_{1}\vee\psi_{2}}^{\vec{v}|c}\}^{c}\Rightarrow\{F,S[\vec{v}]_{\psi_{1}}^{\vec{v}|c,\downarrow}[\vec{v}]_{\psi_{2}}^{\vec{v}|c}\}^{c},
λ;tfld:{F,S[v]ψv|c,𝔯(t)}c{F,S𝔯(t)}c,λ;ffld:{F,S[v]ψv|c,𝔯(f)}c{F,S[v]ψv|c}c,\displaystyle{\lambda_{\vee;\text{\bf t}}^{\text{fld}}}:\{F,S[\vec{v}]_{\psi}^{\vec{v}|c,\downarrow}\mathfrak{r}(\text{\bf t})\}^{c}\Rightarrow\{F,S\mathfrak{r}(\text{\bf t})\}^{c},\quad{\lambda_{\vee;\text{\bf f}}^{\text{fld}}}:\{F,S[\vec{v}]_{\psi}^{\vec{v}|c,\downarrow}\mathfrak{r}(\text{\bf f})\}^{c}\Rightarrow\{F,S[\vec{v}]_{\psi}^{\vec{v}|c}\}^{c},\quad (5)

Quantifier evaluation is initialized with the whole database available for matching:

λinit:{F,S[v]P.ψv|c}c{F,S[F|v]P.ψv|c}c.{\lambda_{\exists}^{\text{init}}}:\{F,S[\vec{v}]^{\vec{v}|c}_{\exists P\mathbin{.}\psi}\}^{c}\Rightarrow\{F,S[F\;|\;\vec{v}]^{\vec{v}|c}_{\exists P\mathbin{.}\psi}\}^{c}. (6)

Let w\vec{w} be a sequence of all the distinct variables in 𝑉𝑎𝑟(P){v}\mathit{Var}(P)\setminus\{\vec{v}\}. Let σ\sigma be the current substitution. Rule λunf{\lambda_{\exists}^{\text{unf}}} pushes onto the stack a (σ,ψ)(\sigma^{\prime},\psi)-frame, where σ=σ{b/w}\sigma^{\prime}=\sigma\cup\{\vec{b}/\vec{w}\} is defined by matching Fσ(P!P?)F^{\prime}\circ\sigma(P_{!}\circ P_{?}) with iterator state, and it removes σ(P?)\sigma^{\prime}(P_{?}) from the iterator state. We keep applying λunf{\lambda_{\exists}^{\text{unf}}} until σ(ψ)\sigma^{\prime}(\psi) evaluates to t or we cannot match σ(P!P?)\sigma(P_{!}\circ P_{?}) with iterator state:

λunf:{F,S[FP!P?|v]P.ψv|c}c{F,S[FP!|v]P.ψv|c[v,w]ψv,w|c}c,\displaystyle{\lambda_{\exists}^{\text{unf}}}:\bigl{\{}F,S[F^{\prime}\circ P_{!}\circ P_{?}\;|\;\vec{v}]^{\vec{v}|c}_{\exists P\mathbin{.}\psi}\bigr{\}}^{c}\Rightarrow\bigl{\{}F,S[F^{\prime}\circ P_{!}\;|\;\vec{v}]^{\vec{v}|c}_{\exists P\mathbin{.}\psi}[\vec{v},\vec{w}]^{\vec{v},\vec{w}|c}_{\psi}\bigr{\}}^{c}, (7)
λ;ffld:{F,S[F|v]P.ψv|c𝔯(f)}c{F,S[F|v]P.ψv|c}c,λ;tfld:{F,S[F|v]P.ψv|c𝔯(t)}c{F,S𝔯(t)}c.\displaystyle{\lambda_{\exists;\text{\bf f}}^{\text{fld}}}:\bigr{\{}F,S[F^{\prime}|\vec{v}\bigr{]}^{\vec{v}|c}_{\exists P\mathbin{.}\psi}\mathfrak{r}(\text{\bf f})\bigr{\}}^{c}\Rightarrow\bigl{\{}F,S[F^{\prime}|\vec{v}]^{\vec{v}|c}_{\exists P\mathbin{.}\psi}\bigr{\}}^{c},\quad{\lambda_{\exists;\text{\bf t}}^{\text{fld}}}:\bigl{\{}F,S[F^{\prime}|\vec{v}\bigr{]}^{\vec{v}|c}_{\exists P\mathbin{.}\psi}\mathfrak{r}(\text{\bf t})\bigr{\}}^{c}\Rightarrow\{F,S\mathfrak{r}(\text{\bf t})\bigr{\}}^{c}.

Let Yes? be a sort and let yes:Yes?\text{\it yes}:\rightarrow\text{\tt Yes?}. For all v𝑉𝑎𝑟(ϕ)\vec{v}\subseteq\mathit{Var}(\phi) and patterns PP occurring in ϕ\phi we define a function μP,v:Pattps1snYes?\mu_{P,\vec{v}}:\text{\tt Pat}^{\text{\it tp}}\;s_{1}\ldots s_{n}\rightarrow\text{\tt Yes?} with the single equation

μP,v(FP!P?,v)=yes.\mu_{P,\vec{v}}(F^{\prime}\circ P_{!}\circ P_{?},\vec{v})=\text{\it yes}. (8)

Thus, μP,v(F,a)=yes\mu_{P,\vec{v}}(F,\vec{a})=\text{\it yes} if and only if FF matches with F{a/v}(P?P0)F^{\prime}\circ\{\vec{a}/\vec{v}\}(P_{?}\circ P_{0}). Since facts matched by σ(P?)\sigma(P_{?}) are removed from the iteration state, λunf{\lambda_{\exists}^{\text{unf}}} cannot be applied infinitely many times. The following rule schema makes σ(P.ψ)\sigma(\exists P\mathbin{.}\psi) evaluate to f when λunf{\lambda_{\exists}^{\text{unf}}} can no longer be applied:

λend:{F,S[F|v]P.ψv|c}c{F,S𝔯(f)}cif(μP,v(F,v)=yes)=f.{\lambda_{\exists}^{\text{end}}}:\bigl{\{}F,S[F^{\prime}\;|\;\vec{v}]^{\vec{v}|c}_{\exists P\mathbin{.}\psi}\}^{c}\Rightarrow\{F,S\mathfrak{r}(\text{\bf f})\}^{c}\quad\text{if}\ (\mu_{P,\vec{v}}(F^{\prime},\vec{v})=\text{\it yes})=\text{\bf f}. (9)

Finally, the rule λsat:{F,𝔯(B)}c𝔰(B)\lambda_{\text{sat}}:\{F,\mathfrak{r}(B)\}^{c}\Rightarrow\mathfrak{s}(B) finishes evaluation of ϕ\phi.

Theorem 4.2

Σ,𝒟𝐜𝐧𝐝(ϕ)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi) is a terminating rewriting system.

Proof 4.3

It suffices to define a partial well-order _<c_\_<_{c}\_ on terms of sort Statec\text{\tt State}^{c} which makes rewriting strictly monotonic, i.e., such that t1t2t_{1}\rightarrow t_{2} implies t2<ct1t_{2}<_{c}t_{1} for all t1,t2:Statect_{1},t_{2}:\text{\tt State}^{c}. The order is defined by

𝔰(B)<c{F,S}c,{F,S}c<c{F,S}ciffS<sS,\mathfrak{s}(B)<_{c}\{F,S\}^{c},\quad\{F,S\}^{c}<_{c}\{F,S^{\prime}\}^{c}\ \text{iff}\ S<_{s}S^{\prime},

for all FF, SS, SS^{\prime}, BB. Here _<s_\_<_{s}\_ is the lexicographic order on stacks derived from partial order _<f_\_<_{f}\_ on frames, i.e., for all frames D1,,DnD_{1},\ldots,D_{n}, E1,,EmE_{1},\ldots,E_{m}, D1Dn<sE1EmD_{1}\ldots D_{n}<_{s}E_{1}\ldots E_{m} iff either (1) for some kk, Dk<fEkD_{k}<_{f}E_{k} and Di=EiD_{i}=E_{i} for i{1,,k1}i\in\{1,\ldots,k-1\}, or (2) n<mn<m and Di=EiD_{i}=E_{i} for i{1,,n}i\in\{1,\ldots,n\}. Frame ordering is defined by 𝔯(B)<f¬<f[F|a]ψv|c<f[F|a]ψv|c<f[b]ψw|c<f[b]ψw|c,<f[F′′|c]ψx|c\mathfrak{r}(B)<_{f}{\neg}<_{f}[F\;|\;\vec{a}]^{\vec{v}|c}_{\psi}<_{f}[F^{\prime}\;|\;\vec{a}]^{\vec{v}|c}_{\psi}<_{f}[\vec{b}]_{\psi}^{\vec{w}|c}<_{f}[\vec{b}]_{\psi}^{\vec{w}|c,\downarrow}<_{f}[F^{\prime\prime}\;|\;\vec{c}]^{\vec{x}|c}_{\psi^{\prime}}, for all FF, FF^{\prime}, F′′F^{\prime\prime}, ψ\psi, ψ\psi^{\prime}, BB, v\vec{v}, w\vec{w}, x\vec{x}, a\vec{a}, b\vec{b}, c\vec{c} such that FFF\subsetneq F^{\prime} and ψ\psi is a proper subcondition of ψ\psi^{\prime}. Partial order _<f_\_<_{f}\_ is Noetherian because multisets of facts FF, FF^{\prime} and conditions ψ\psi, ψ\psi^{\prime} are finite terms. Hence, if the stacks are of bounded size, also _<c_\_<_{c}\_ is Noetherian. The size of stacks is bounded because each stack size increasing rule is of the form {F,S[]ψ1}c{F,SA[]ψ2}c\{F,S[\ldots]^{\ldots}_{\psi_{1}}\}^{c}\rightarrow\{F,SA[\ldots]^{\ldots}_{\psi_{2}}\}^{c}, where AA is a frame and ψ2\psi_{2} is a proper subterm of ψ1\psi_{1}. Since rules in Σ,𝒟𝐜𝐧𝐝(ϕ)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi) are topmost, the rewriting is strictly monotonic because t2<ct1t_{2}<_{c}t_{1} for each rule schema t1t2ifCt_{1}\Rightarrow t_{2}\;\text{if}\;C in Σ,𝒟𝐜𝐧𝐝(ϕ)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi).

The following useful observation can be trivially verified by examining the rule schemas:

Lemma 4.4

Let ψ\psi be a subcondition of ϕ\phi. For any finite multiset of facts FF, stack SS, Boolean BB, variables v=v1,,vn\vec{v}=v_{1},\ldots,v_{n} and values a=a1,,an\vec{a}=a_{1},\ldots,a_{n}, {F,S[a]ψv|c}c{F,S𝔯(B)}c\{F,S[\vec{a}]^{\vec{v}|c}_{\psi}\}^{c}\rightarrow^{*}\{F,S\mathfrak{r}(B)\}^{c} in Σ,𝒟𝐜𝐧𝐝(ϕ)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi) iff {F,[]σ(ψ)|c}c𝔰(B)\{F,[]^{|c}_{\sigma(\psi)}\}^{c}\rightarrow^{*}\mathfrak{s}(B) in Σ,𝒟𝐜𝐧𝐝(σ(ψ))\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\sigma(\psi)), where substitution σ:={a/v}\sigma:=\{\vec{a}/\vec{v}\}.

The following example shows verification of a condition in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} for a given multiset of facts using rewriting semantics. It also shows non-confluence of the resulting rewrite system.

Example 4.5

Suppose p,q:NatFactp,q:\text{\tt Nat}\rightarrow\text{\tt Fact}. Let ψ:=[q(z)]?.{z=x}\psi:=\exists\;[q(z)]_{?}\mathbin{.}\{z=x\}, K:=p(1)q(0)K:=p(1)\circ q(0), L:=p(0)q(0)L:=p(0)\circ q(0), and M:=p(0)p(0)p(1)M:=p(0)\circ p(0)\circ p(1). To check if condition

ϕ:=[p(x)p(y)]?[q(y)]!.(¬ψ{x=y}).\phi:=\exists[p(x)\circ p(y)]_{?}\circ[q(y)]_{!}\mathbin{.}\bigl{(}\neg\psi\vee\{x=y\}\bigr{)}.

is satisfied in a multiset H:=p(0)p(0)K=Lp(0)p(1)=Mq(0)H:=p(0)\circ p(0)\circ K=L\circ p(0)\circ p(1)=M\circ q(0) we normalize

Iϕ(H)λinit\displaystyle\mathrm{I}_{\phi}(H)\xrightarrow{{\lambda_{\exists}^{\text{init}}}} {H,[H|]ϕ|c}cλunf{H,[K|]ϕ|c[0,0]¬ψ{x=y}x,y|c}cλunf{H,[K|]ϕ|c[0,0]¬ψx,y|c,[0,0]{x=y}x,y|c}c\displaystyle\bigl{\{}H,[H|\;]^{|c}_{\phi}\bigr{\}}^{c}\xrightarrow{{\lambda_{\exists}^{\text{unf}}}}\bigl{\{}H,[K|\;]^{|c}_{\phi}[0,0]^{x,y|c}_{\neg\psi\vee\{x=y\}}\bigr{\}}^{c}\xrightarrow{{\lambda_{\vee}^{\text{unf}}}}\bigl{\{}H,[K|\;]^{|c}_{\phi}[0,0]^{x,y|c,\downarrow}_{\neg\psi}[0,0]^{x,y|c}_{\{x=y\}}\bigr{\}}^{c}
λ{_}\displaystyle\xrightarrow{{\lambda_{\{\_\}}}} {H,[K|]ϕ|c[0,0]¬ψx,y|c,𝔯(0=0)}cλ;tfld{H,[K|]ϕ|c𝔯(t)}cλ;tfld{H,𝔯(t)}cλsat𝔰(t).\displaystyle\bigl{\{}H,[K|\;]^{|c}_{\phi}[0,0]^{x,y|c,\downarrow}_{\neg\psi}\mathfrak{r}(0=0)\bigr{\}}^{c}\xrightarrow{{\lambda_{\vee;\text{\bf t}}^{\text{fld}}}}\bigl{\{}H,[K|\;]^{|c}_{\phi}\mathfrak{r}(\text{\bf t})\bigr{\}}^{c}\xrightarrow{{\lambda_{\exists;\text{\bf t}}^{\text{fld}}}}\bigl{\{}H,\mathfrak{r}(\text{\bf t})\bigr{\}}^{c}\xrightarrow{\lambda_{\text{sat}}}\mathfrak{s}(\text{\bf t}).

Thus, ϕ\phi is satisfied in HH. However, we have also a normalizing sequence ending with 𝔰(f)\mathfrak{s}(\text{\bf f}):

Iϕ(H)λinit\displaystyle\mathrm{I}_{\phi}(H)\xrightarrow{{\lambda_{\exists}^{\text{init}}}} {H,[H|]ϕ|c}cλunf{H,[L|]ϕ|c[0,1]¬ψ{x=y}x,y|c}c{H,[L|]ϕ|c[0,1]¬ψx,y|c}c\displaystyle\bigl{\{}H,[H\;|\;]^{|c}_{\phi}\bigr{\}}^{c}\xrightarrow{{\lambda_{\exists}^{\text{unf}}}}\bigl{\{}H,[L\;|\;]^{|c}_{\phi}[0,1]^{x,y|c}_{\neg\psi\vee\{x=y\}}\bigr{\}}^{c}\rightarrow^{*}\bigl{\{}H,[L\;|\;]^{|c}_{\phi}[0,1]^{x,y|c}_{\neg\psi}\bigr{\}}^{c}
\displaystyle\rightarrow^{*} {H,[L|]ϕ|c¬[H| 0,1]ψx,y|c}cλunf{H,[L|]ϕ|c¬[M| 0,1]ψx,y|c[0,1,0]{z=x}x,y,z|c}c\displaystyle\bigl{\{}H,[L\;|\;]^{|c}_{\phi}{\neg}[H\;|\;0,1]^{x,y|c}_{\psi}\bigr{\}}^{c}\xrightarrow{{\lambda_{\exists}^{\text{unf}}}}\bigl{\{}H,[L\;|\;]^{|c}_{\phi}{\neg}[M\;|\;0,1]^{x,y|c}_{\psi}[0,1,0]^{x,y,z|c}_{\{z=x\}}\bigr{\}}^{c}
\displaystyle\rightarrow^{*} {H,[L|]ϕ|c¬𝔯(t)}c!𝔰(f).\displaystyle\bigl{\{}H,[L\;|\;]^{|c}_{\phi}{\neg}\mathfrak{r}(\text{\bf t})\bigr{\}}^{c}\rightarrow^{!}\mathfrak{s}(\text{\bf f}).

Thus, evaluation of conditions does not necessarily lead to a unique result (the rewriting system is not confluent). This requires making the definition of logical equivalence bisimulation-like:

Definition 4.6

Let ϕ1\phi_{1} and ϕ2\phi_{2} be conditions in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}. We say that ϕ1\phi_{1} is logically equivalent to ϕ2\phi_{2}, writing ϕ1ϕ2\phi_{1}\equiv\phi_{2}, if and only if for all ground multisets of facts FF, ground substitutions σ\sigma such that σ(ϕ1)\sigma(\phi_{1}) and σ(ϕ2)\sigma(\phi_{2}) are closed, and a Boolean B{t,f}B\in\{\text{\bf t},\text{\bf f}\} we have

Iσ(ϕ1)(F)!𝔰(B)if and only ifIσ(ϕ2)(F)!𝔰(B).\mathrm{I}_{\sigma(\phi_{1})}(F)\rightarrow^{!}\mathfrak{s}(B)\quad\text{if and only if}\quad\mathrm{I}_{\sigma(\phi_{2})}(F)\rightarrow^{!}\mathfrak{s}(B).

The following result is an immediate consequence of Lemma 4.4:

Lemma 4.7

Logical equivalence on conditions in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} is an equivalence and a congruence, i.e., if κ\kappa is a position in a condition ϕ\phi in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} such that ϕ|κ\phi|_{\kappa} is a condition, and ψϕ|κ\psi\equiv\phi|_{\kappa}, then ϕϕ[ψ]κ\phi\equiv\phi[\psi]_{\kappa}.

A renaming is an injective substitution σ\sigma mapping variables to variables.

Lemma 4.8

For any closed condition ϕ\phi in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}, and any renaming σ\sigma, ϕσ(ϕ)\phi\equiv\sigma(\phi).

The following result clarifies elements of rewriting semantics of sentences in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}:

Lemma 4.9

For each ground multiset of facts FF, and all sentences ϕ\phi, ϕ1\phi_{1}, ϕ2\phi_{2} and P.ψ\exists P\mathbin{.}\psi :

  1. 1.

    Iϕ(F)!𝔰(t)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}) or Iϕ(F)!𝔰(f)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}), and these are the only possible normal forms of Iϕ(F)\mathrm{I}_{\phi}(F).

  2. 2.

    I(F)!𝔰(f)\mathrm{I}_{\bot}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}) and never I(F)!𝔰(t)\mathrm{I}_{\bot}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}).

  3. 3.

    Iϕ(F)!𝔰(B)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{s}(B) iff I¬ϕ(F)!𝔰(¬B)\mathrm{I}_{\neg\phi}(F)\rightarrow^{!}\mathfrak{s}(\neg B) for all B:BoolB:\text{\tt Bool}.

  4. 4.

    Iϕ1ϕ2(F)!𝔰(t)\mathrm{I}_{\phi_{1}\vee\phi_{2}}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}) iff Iϕ1(F)!𝔰(t)\mathrm{I}_{\phi_{1}}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}) or Iϕ2(F)!𝔰(t)\mathrm{I}_{\phi_{2}}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}).

  5. 5.

    Iϕ1ϕ2(F)!𝔰(f)\mathrm{I}_{\phi_{1}\vee\phi_{2}}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}) iff Iϕ1(F)!𝔰(f)\mathrm{I}_{\phi_{1}}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}) and Iϕ2(F)!𝔰(f)\mathrm{I}_{\phi_{2}}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}).

  6. 6.

    IP.ψ(F)!𝔰(t)\mathrm{I}_{\exists P\mathbin{.}\psi}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}) iff there exists a substitution σ\sigma, and a multiset FF^{\prime} such that Fσ(P?P!)=FF^{\prime}\circ\sigma(P_{?}\circ P_{!})=F and Iσ(ψ)(F)!𝔰(t)\mathrm{I}_{\sigma(\psi)}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}).

  7. 7.

    IP.ψ(F)!𝔰(f)\mathrm{I}_{\exists P\mathbin{.}\psi}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}) iff there exist two sequences of ground multisets of facts F0,F1,,FnF_{0},F_{1},\ldots,F_{n} and G0,G1,,Gn1G_{0},G_{1},\ldots,G_{n-1}, and a sequence of substitutions σ0,σ1,,σn1\sigma_{0},\sigma_{1},\ldots,\sigma_{n-1} such that

    1. (a)

      F0=FF_{0}=F, Fi+1=Giσi(P!)F_{i+1}=G_{i}\circ\sigma_{i}(P_{!}), and Fi=Giσi(P!P?)F_{i}=G_{i}\circ\sigma_{i}(P_{!}\circ P_{?}), for all i{0,,n1}i\in\{0,\ldots,n-1\},

    2. (b)

      Iσi(ψ)(F)!𝔰(f)\mathrm{I}_{\sigma_{i}(\psi)}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}), for all i{0,,n1}i\in\{0,\ldots,n-1\},

    3. (c)

      there exists no substitution σn\sigma_{n} and multiset of facts GnG_{n} such that Fn=Gnσn(P!P?)F_{n}=G_{n}\circ\sigma_{n}(P_{!}\circ P_{?}).

  8. 8.

    Iϕ¬ϕ(F)!𝔰(t)\mathrm{I}_{\phi\vee\neg\phi}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}). If both Iϕ(F)!𝔰(t)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}) and Iϕ(F)!𝔰(f)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}) then also Iϕ¬ϕ(F)!𝔰(f)\mathrm{I}_{\phi\vee\neg\phi}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}).

Proof 4.10

The first point is verified by structural recursion using Lemma 4.4 and rules in Equations (3)–(9). Points 2–7 are verified using rules in Equations (3)–(9). Point 8 is verified using points 3-5.

Lemma 4.11

The following logical equivalences hold between conditions in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}:

ϕϕ,ϕ1ϕ2ϕ2ϕ1,ϕ1(ϕ2ϕ3)(ϕ1ϕ2)ϕ3,\displaystyle\phi\vee\bot\equiv\phi,\quad\phi_{1}\vee\phi_{2}\equiv\phi_{2}\vee\phi_{1},\quad\phi_{1}\vee(\phi_{2}\vee\phi_{3})\equiv(\phi_{1}\vee\phi_{2})\vee\phi_{3},
¬¬ϕϕ,P.(ϕ1ϕ2)(P.ϕ1)(P.ϕ2).\displaystyle\neg\neg\phi\equiv\phi,\quad\exists P\mathbin{.}(\phi_{1}\vee\phi_{2})\equiv(\exists P\mathbin{.}\phi_{1})\vee(\exists P\mathbin{.}\phi_{2}).
Proof 4.12

The above equivalences can be proven using points 1-7 in Lemma 4.9. Only the last equivalence’s proof is non-trivial. Let FF be a ground multiset of facts and let σ\sigma be a ground substitution such that σ(P.(ϕ1ϕ2))\sigma(\exists P\mathbin{.}(\phi_{1}\vee\phi_{2})) (or, equivalently, σ((P.ϕ1)(P.ϕ2))\sigma((\exists P\mathbin{.}\phi_{1})\vee(\exists P\mathbin{.}\phi_{2}))) is closed. Denote Q:=σ(P)Q:=\sigma(P), ψi:=σ(ϕi)\psi_{i}:=\sigma(\phi_{i}), for i{1,2}i\in\{1,2\}. Using Lemma 4.9, p. 6, we see that IQ.(ψ1ψ2)(F)!𝔰(t)\mathrm{I}_{\exists Q\mathbin{.}(\psi_{1}\vee\psi_{2})}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}) iff there exists a substitution σ\sigma^{\prime}, and a multiset of facts FF^{\prime} such that Fσ(Q?Q!)=FF^{\prime}\circ\sigma^{\prime}(Q_{?}\circ Q_{!})=F and Iσ(ψ1)σ(ψ2)(F)!𝔰(t)\mathrm{I}_{\sigma^{\prime}(\psi_{1})\vee\sigma^{\prime}(\psi_{2})}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}). The latter holds iff there exists i{1,2}i\in\{1,2\} such that Iσ(ψi)(F)!𝔰(t)\mathrm{I}_{\sigma^{\prime}(\psi_{i})}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}), by Lemma 4.9, p. 4. It follows, again using Lemma 4.9, point 6, that IQ.(ψ1ψ2)(F)!𝔰(t)\mathrm{I}_{\exists Q\mathbin{.}(\psi_{1}\vee\psi_{2})}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}) iff IQ.ψi(F)!𝔰(t)\mathrm{I}_{\exists Q\mathbin{.}\psi_{i}}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}) for some i{1,2}i\in\{1,2\}, i.e., iff (by Lemma 4.9, p. 4) I(Q.ψ1)(Q.ψ2)(F)!𝔰(t)\mathrm{I}_{(\exists Q\mathbin{.}\psi_{1})\vee(\exists Q\mathbin{.}\psi_{2})}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}). The part of the proof with falsity is more complex. Using Lemma 4.9, p. 7, we see that IQ.(ψ1ψ2)(F)!𝔰(f)\mathrm{I}_{\exists Q\mathbin{.}(\psi_{1}\vee\psi_{2})}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}) iff there exist two sequences of ground multisets of facts F0,,FnF_{0},\ldots,F_{n} and G0,,Gn1G_{0},\ldots,G_{n-1}, and a sequence of substitutions σ0,,σn1\sigma_{0},\ldots,\sigma_{n-1} such that (a) F0=FF_{0}=F, Fi+1=Giσi(Q!)F_{i+1}=G_{i}\circ\sigma_{i}(Q_{!}) and Fi=Giσi(Q!Q?)F_{i}=G_{i}\circ\sigma_{i}(Q_{!}\circ Q_{?}), for all i{0,,n1}i\in\{0,\ldots,n-1\}, (b) Iσi(ψ1)σi(ψ2)(F)!𝔰(f)\mathrm{I}_{\sigma_{i}(\psi_{1})\vee\sigma_{i}(\psi_{2})}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}), for all i{0,,n1}i\in\{0,\ldots,n-1\}, (c) there exists no substitution σn\sigma_{n} and multiset of facts GnG_{n} such that Fn=Gnσn(Q!Q?)F_{n}=G_{n}\circ\sigma_{n}(Q_{!}\circ Q_{?}). Then by Lemma 4.9, p. 5, (b) iff, for j{1,2}j\in\{1,2\}, (bj) Iσi(ψj)(Fi)!𝔰(f)\mathrm{I}_{\sigma_{i}(\psi_{j})}(F_{i})\rightarrow^{!}\mathfrak{s}(\text{\bf f}), for all i{0,,n1}i\in\{0,\ldots,n-1\}. Then (by Lemma 4.9, p. 7) (a), (b1), (b2) and (c) iff IQ.ψj(F)!𝔰(f)\mathrm{I}_{\exists Q\mathbin{.}\psi_{j}}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}) for j{1,2}j\in\{1,2\} iff, by Lemma 4.9, p. 5, I(Q.ψ1)(Q.ψ2)(F)!𝔰(f)\mathrm{I}_{(\exists Q\mathbin{.}\psi_{1})\vee(\exists Q\mathbin{.}\psi_{2})}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}).

Non-confluence of Σ,𝒟𝐜𝐧𝐝(ϕ)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi) in Example 4.5 depended on patterns in ϕ\phi with more than one fact. It turns out that Σ,𝒟𝐜𝐧𝐝(ϕ)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi) may be non-confluent even if ϕ\phi contains only single-fact patterns:

Example 4.13

Let r:NatNatFactr:\text{\tt Nat}\;\text{\tt Nat}\rightarrow\text{\tt Fact} be commutative. Consider condition ϕ:=[r(x,y)]?.{x=1}\phi:=\exists[r(x,y)]_{?}\mathbin{.}\{x=1\} evaluated in a database F=r(1,2)F=r(1,2). Since rr is commutative, there are two distinct substitutions {1/x,2/y}\{1/x,2/y\} and {2/x,1/y}\{2/x,1/y\} which match r(x,y)r(x,y) with r(1,2)r(1,2). Consequently, there are two distinct paths of evaluating ϕ\phi: Iϕ(F){F,[F|]ϕ|c}cλunf{{F,[|]ϕ|c[1,2]{x=1}x,y|c}c!𝔰(t){F,[|]ϕ|c[2,1]{x=1}x,y|c}c!𝔰(f).\mathrm{I}_{\phi}(F)\rightarrow^{*}\bigl{\{}F,[F|]_{\phi}^{|c}\bigr{\}}^{c}\xrightarrow{{\lambda_{\exists}^{\text{unf}}}}\left\{\begin{array}[]{l}\bigl{\{}F,[\emptyset|]_{\phi}^{|c}[1,2]^{x,y|c}_{\{x=1\}}\bigr{\}}^{c}\rightarrow^{!}\mathfrak{s}(\text{\bf t})\\ \bigl{\{}F,[\emptyset|]_{\phi}^{|c}[2,1]^{x,y|c}_{\{x=1\}}\bigr{\}}^{c}\rightarrow^{!}\mathfrak{s}(\text{\bf f})\end{array}\right..

Definition 4.14

A fully reduced term t:Factt:\text{\tt Fact} is said to have a unique matching property iff for any ground, fully reduced term t:Factt^{\prime}:\text{\tt Fact} there exists at most one substitution σ\sigma such that σ(t)=At\sigma(t)=_{A}t^{\prime}.

Definition 4.15

A condition ϕ\phi in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} is called deterministic if and only if all quantification patterns in ϕ\phi contain only single facts with unique matching property.

The following theorem states that while evaluation of a deterministic condition is not itself deterministic, but its results are.

Theorem 4.16

Let ϕ\phi be a deterministic condition in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}. Then Σ,𝒟𝐜𝐧𝐝(ϕ)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi) is confluent. In particular, given a ground multiset of facts FF, there is a unique B{t,f}B\in\{\text{\bf t},\text{\bf f}\} such that Iϕ(F)!𝔰(B)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{s}(B).

Proof 4.17

We argue by induction on the complexity of formulas indexing frames on the top of a stack. First observe that only terms of the form t:={F,S[F|a]P.ψv|c}ct:=\bigl{\{}F,S[F^{\prime}\;|\;\vec{a}]^{\vec{v}|c}_{\exists P\mathbin{.}\psi}\bigr{\}}^{c} can be rewritten in a single step into two distinct terms. As semiconfluence implies confluence it suffices to prove that if tt+t′′t^{\prime}\leftarrow t\rightarrow^{+}t^{\prime\prime} then there exists ss such that tst′′t^{\prime}\rightarrow^{*}s\;{}^{*}\!\!\leftarrow t^{\prime\prime}. By Lemmas 4.4 and 4.9, p. 1, a rewrite sequence t+t′′t\rightarrow^{+}t^{\prime\prime} must (1) contain {F,S𝔯(B)}c\{F,S\mathfrak{r}(B)\}^{c}, or (2) can be extended to the sequence ending in {F,S𝔯(B)}c\{F,S\mathfrak{r}(B)\}^{c}. If we prove that t{F,S𝔯(B)}ct^{\prime}\rightarrow^{*}\{F,S\mathfrak{r}(B)\}^{c} then we can set s=t′′s=t^{\prime\prime} (if (1)) or s={F,S𝔯(B)}cs=\{F,S\mathfrak{r}(B)\}^{c} (if (2)). It remains to prove that t{F,S𝔯(B)}ct^{\prime}\rightarrow^{*}\{F,S\mathfrak{r}(B)\}^{c}. Under the theorem’s assumption, P=[f]?P=[f]_{?}, where ff is a fact with a unique matching property (Definition 4.14). Thus, if for some fact gg in FF^{\prime} there exists a substitution σg\sigma_{g} extending {a/v}\{\vec{a}/\vec{v}\} such that g=σg(f)g=\sigma_{g}(f) and Iσg(ψ)(F)𝔯(t)\mathrm{I}_{\sigma_{g}(\psi)}(F)\rightarrow\mathfrak{r}(\text{\bf t}) (which implies, by inductive assumption, that Iσg(ψ)(F)↛𝔯(f)\mathrm{I}_{\sigma_{g}(\psi)}(F)\not\rightarrow^{*}\mathfrak{r}(\text{\bf f})), it is unique, and cannot be missed during evaluation of the iterator frame. Either such gg exists, and then t{F,S𝔯(t)}ct\rightarrow^{*}\{F,S\mathfrak{r}(\text{\bf t})\}^{c} but t↛{F,S𝔯(f)}ct\not\rightarrow^{*}\{F,S\mathfrak{r}(\text{\bf f})\}^{c} (hence necessarily both B=tB=\text{\bf t} and t{F,S𝔯(t)}ct^{\prime}\rightarrow^{*}\{F,S\mathfrak{r}(\text{\bf t})\}^{c}), or it doesn’t, and hence, both B=fB=\text{\bf f} and t{F,S𝔯(f)}ct^{\prime}\rightarrow^{*}\{F,S\mathfrak{r}(\text{\bf f})\}^{c}

Example 4.18

Suppose p:NatFactp:\text{\tt Nat}\rightarrow\text{\tt Fact}, q:Factq:\rightarrow\text{\tt Fact}, and let r:NatNatFactr:\text{\tt Nat}\;\text{\tt Nat}\rightarrow\text{\tt Fact} be a commutative operator (i.e., r(x,y)=Ar(y,x)r(x,y)=_{A}r(y,x)). Clearly, p(x)p(x) and qq have unique matching property (Definition 4.14), while r(x,y)r(x,y) does not, since, e.g., if σ1:={0/x,1/y}\sigma_{1}:=\{0/x,1/y\} and σ2:={1/x,0/y}\sigma_{2}:=\{1/x,0/y\} then σ1(r(x,y))=Aσ2(r(x,y))=Ar(0,1)\sigma_{1}(r(x,y))=_{A}\sigma_{2}(r(x,y))=_{A}r(0,1). Let

ϕ1:=[p(x)]?.{x=0},ϕ2:=[p(x)q]?.{x=0},ϕ3:=[r(x,y)]?.{x=0}.\phi_{1}:=\exists[p(x)]_{?}\mathbin{.}\{x=0\},\quad\phi_{2}:=\exists[p(x)\circ q]_{?}\mathbin{.}\{x=0\},\quad\phi_{3}:=\exists[r(x,y)]_{?}\mathbin{.}\{x=0\}.

Then ϕ1\phi_{1} is deterministic (Definition 4.15), while ϕ2\phi_{2} and ϕ3\phi_{3} are not deterministic. Let F:=p(0)p(1)qr(0,1)F:=p(0)\circ p(1)\circ q\circ r(0,1). We now consider evaluation of all the ϕi\phi_{i}’s on FF. First, the reader will easily verify that while evaluating the existential quantifier in ϕ1\phi_{1} we can either first match p(x)p(x) with p(1)p(1) and then, upon failure, with p(0)p(0), or first match with p(0)p(0). Eventually, both paths yield satisfaction of ϕ1\phi_{1} on FF (although the first path is is longer). On the other hand, evaluation of ϕ2\phi_{2} and ϕ3\phi_{3} demonstrate two ways in which non-determinism occurs in evaluation of conditions in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}. First, consider two non-convergent paths of rewriting Iϕ2(F)I_{\phi_{2}}(F):

{F,[p(1)r(0,1)|]ϕ2|c[0]{x=0}x|c}c\textstyle{\bigl{\{}F,[p(1)\circ r(0,1)|\;]^{|c}_{\phi_{2}}[0]^{x|c}_{\{x=0\}}\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}λ{_}\scriptstyle{{\lambda_{\{\_\}}}}{F,[p(1)r(0,1)|]ϕ2|c𝔯(t)}c\textstyle{\bigl{\{}F,[p(1)\circ r(0,1)|\;]^{|c}_{\phi_{2}}\mathfrak{r}(\text{\bf t})\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}\scriptstyle{*}𝔰(t)\textstyle{\mathfrak{s}(\text{\bf t})}{F,[p(0)p(1)qr(0,1)|]ϕ2|c}c\textstyle{\bigl{\{}F,[p(0)\circ p(1)\circ q\circ r(0,1)|\;]^{|c}_{\phi_{2}}\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces}λunf\scriptstyle{{\lambda_{\exists}^{\text{unf}}}}λunf\scriptstyle{{\lambda_{\exists}^{\text{unf}}}}Iϕ2(F)\textstyle{I_{\phi_{2}}(F)\ignorespaces\ignorespaces\ignorespaces\ignorespaces}λinit\scriptstyle{{\lambda_{\exists}^{\text{init}}}}𝔰(f)\textstyle{\mathfrak{s}(\text{\bf f})}{F,[p(0)r(0,1)|]ϕ2|c[1]{x=0}x|c}c\textstyle{\bigl{\{}F,[p(0)\circ r(0,1)|\;]^{|c}_{\phi_{2}}[1]^{x|c}_{\{x=0\}}\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}λ{_}\scriptstyle{{\lambda_{\{\_\}}}}{F,[p(0)r(0,1)|]ϕ2|c𝔯(f)}c\textstyle{\bigl{\{}F,[p(0)\circ r(0,1)|\;]^{|c}_{\phi_{2}}\mathfrak{r}(\text{\bf f})\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}λend\scriptstyle{{\lambda_{\exists}^{\text{end}}}}{F,𝔯(f)}c\textstyle{\bigl{\{}F,\mathfrak{r}(\text{\bf f})\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}\scriptstyle{*}

Here the reason for non-determinism which ultimately leads to non-convergent paths of execution is that when evaluating the existential quantifier at each attempt we have to consume both p(x)p(x)-fact and qq-fact: since there is only one qq fact, if we start from wrong p(x)p(x)-fact (i.e., p(1)p(1)), we do not get the second chance.

Denote for brevity K:=p(0)p(1)qK:=p(0)\circ p(1)\circ q. Recall that r(x,y)=Ar(y,x)r(x,y)=_{A}r(y,x). In particular, r(0,1)=Ar(1,0)r(0,1)=_{A}r(1,0). Consider now two non-convergent paths of rewriting Iϕ3(F)I_{\phi_{3}}(F):

{F,[K|]ϕ3|c[0,1]{x=0}x,y|c}c\textstyle{\bigl{\{}F,[K|\;]^{|c}_{\phi_{3}}[0,1]^{x,y|c}_{\{x=0\}}\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}λ{_}\scriptstyle{{\lambda_{\{\_\}}}}{F,[K|]ϕ3|c𝔯(t)}c\textstyle{\bigl{\{}F,[K|\;]^{|c}_{\phi_{3}}\mathfrak{r}(\text{\bf t})\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}\scriptstyle{*}𝔰(t)\textstyle{\mathfrak{s}(\text{\bf t})}{F,[Kr(0,1)|]ϕ3|c}c\textstyle{\bigl{\{}F,[K\circ r(0,1)|\;]^{|c}_{\phi_{3}}\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces}λunf\scriptstyle{{\lambda_{\exists}^{\text{unf}}}}λunf\scriptstyle{{\lambda_{\exists}^{\text{unf}}}}Iϕ3(F)\textstyle{I_{\phi_{3}}(F)\ignorespaces\ignorespaces\ignorespaces\ignorespaces}λinit\scriptstyle{{\lambda_{\exists}^{\text{init}}}}𝔰(f)\textstyle{\mathfrak{s}(\text{\bf f})}{F,[K|]ϕ3|c[1,0]{x=0}x,y|c}c\textstyle{\bigl{\{}F,[K|\;]^{|c}_{\phi_{3}}[1,0]^{x,y|c}_{\{x=0\}}\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}λ{_}\scriptstyle{{\lambda_{\{\_\}}}}{F,[K|]ϕ3|c𝔯(f)}c\textstyle{\bigl{\{}F,[K|\;]^{|c}_{\phi_{3}}\mathfrak{r}(\text{\bf f})\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}λend\scriptstyle{{\lambda_{\exists}^{\text{end}}}}{F,𝔯(f)}c\textstyle{\bigl{\{}F,\mathfrak{r}(\text{\bf f})\bigr{\}}^{c}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}\scriptstyle{*}

Thus, in this case the reason of non-determinism leading to non-convergent paths was the possibility of two distinct matchings of r(0,1)r(0,1) with r(x,y)r(x,y) given by {0/x,1/y}\{0/x,1/y\} and {1/x,0/y}\{1/x,0/y\}.

5 Rewriting semantics of 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}

Let QQ be a query in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}. We associate with QQ the rewriting system Σ,𝒟𝐪𝐫𝐲(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}(Q). Terms rewritten with the rules of the rewriting system Σ,𝒟𝐪𝐫𝐲(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}(Q) are of the form {F,F,S}q\{F,F^{\prime},S\}^{q}, where {_,_,_}q:FSetFSetStkqStateq\{\_,\_,\_\}^{q}:\text{\tt FSet}\;\text{\tt FSet}\;\text{\tt Stk}^{q}\rightarrow\text{\tt State}^{q}, FF is a database of facts against which we issue the query, FF^{\prime} is a partial answer (i.e., an answer built so far in the rewriting process), and SS is a stack of sort Stkq\text{\tt Stk}^{q} which simulates structural recursion. Normal forms encapsulating an answer to QQ are constructed with 𝔞:FSetStateq\mathfrak{a}:\text{\tt FSet}\rightarrow\text{\tt State}^{q}. Terms of sort Stkq\text{\tt Stk}^{q} are constructed from local computation frames of sort Frmq\text{\tt Frm}^{q}, where Frmq<Stkq\text{\tt Frm}^{q}<\text{\tt Stk}^{q}, using an associative binary operator __:StkqStkqStkq\_\_:\text{\tt Stk}^{q}\;\text{\tt Stk}^{q}\rightarrow\text{\tt Stk}^{q} with identity element \emptyset. Constructors of frames are indexed by sub-queries RR of QQ, and lists of distinct variable names v:=v1,,vn\vec{v}:=v_{1},\ldots,v_{n} of respective sorts s1,,sns_{1},\ldots,s_{n} (v\vec{v} can be of any length as long as {v}𝑉𝑎𝑟(Q)\{\vec{v}\}\subseteq\mathit{Var}(Q) and it contains all free variables of RR):

[_,,_]Rv|q:s1snFrmq,[_,,_|_]Rv|q:s1snStkcFrmq,\displaystyle[\_,\ldots,\_]^{\vec{v}|q}_{R}:s_{1}\ldots s_{n}\rightarrow\text{\tt Frm}^{q},\quad[\_,\ldots,\_|\_]^{\vec{v}|q}_{R}:s_{1}\ldots s_{n}\;\text{\tt Stk}^{c}\rightarrow\text{\tt Frm}^{q},
[_|_,,_]Rv|q:FSets1snFrmqifR=P.R.\displaystyle[\_|\_,\ldots,\_]^{\vec{v}|q}_{R}:\text{\tt FSet}\;s_{1}\ldots s_{n}\rightarrow\text{\tt Frm}^{q}\ \text{if}\ R=\nabla P\mathbin{.}R^{\prime}.

As v\vec{v} can be empty, the above signature templates include []R|c:Frmq[]^{|c}_{R}:\rightarrow\text{\tt Frm}^{q}, etc. As in the case of 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}, variables in super- and sub-scripts are part of function names and are never matched or substituted — Remark 4.1 applies here and in the next section. Frames of the form [a]Rv|q[\vec{a}]^{\vec{v}|q}_{R}, [a|S]Rv|q[\vec{a}|S]^{\vec{v}|q}_{R}, or [F|a]Rv|q[F^{\prime}|\vec{a}]^{\vec{v}|q}_{R} are called (R,σ)(R,\sigma)-frames, where σ:={a/v}\sigma:=\{\vec{a}/\vec{v}\} is the current substitution. They indicate evaluation of σ(R)\sigma(R). Conditional frames [a|S]Rv|q[\vec{a}|S]^{\vec{v}|q}_{R} are used in evaluation of conditionals ϕR\phi\Rightarrow R, where S:StkcS:\text{\tt Stk}^{c} represents evaluation of ϕ\phi. Iterator frames [F|a]P.Rv|q[F^{\prime}|\vec{a}]^{\vec{v}|q}_{\nabla P\mathbin{.}R}, represent iterative evaluation of σ(P.R)\sigma(\nabla P\mathbin{.}R). Multiset FF^{\prime}, called iterator state, contains facts available for matching with P!P?P_{!}\circ P_{?}. Given a database FF, to evaluate a closed query QQ we rewrite a state term IQ(F):={F,,[]Q|q}q\mathrm{I}_{Q}(F):=\{F,\emptyset,[]^{|q}_{Q}\}^{q} until a normal form 𝔞(F)\mathfrak{a}(F^{\prime}) is reached. Then we conclude that evaluation of QQ on FF yields FF^{\prime} as an answer.

Now we are ready to define the rules of Σ,𝒟𝐪𝐫𝐲(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}(Q). Literal facts are added to the partial answer multiset after applying the current substitution, and empty queries return nothing:

λfact:{F,F,S[v]fv|q}q{F,Ff,S}q,λ:{F,F,S[v]v|q}q{F,F,S}q.\lambda_{\text{fact}}:\bigl{\{}F,F^{\prime},S[\vec{v}]^{\vec{v}|q}_{f}\bigr{\}}^{q}\Rightarrow\bigl{\{}F,F^{\prime}\circ f,S\bigr{\}}^{q},\quad\lambda_{\emptyset}:\bigl{\{}F,F^{\prime},S[\vec{v}]^{\vec{v}|q}_{\emptyset}\bigr{\}}^{q}\Rightarrow\bigl{\{}F,F^{\prime},S\bigr{\}}^{q}. (10)

Evaluation of “union” __\_\rhd\_ is implemented by replacing the frame corresponding to R1R2R_{1}\rhd R_{2} with two frames corresponding to R1R_{1} and R2R_{2}, respectively:

λunf:{F,F,S[v]R1R2v|q}q{F,F,S[v]R2v|q[v]R1v|q}q\lambda_{\rhd}^{\text{unf}}:\ \bigl{\{}F,F^{\prime},S[\vec{v}]^{\vec{v}|q}_{R_{1}\rhd R_{2}}\bigr{\}}^{q}\Rightarrow\bigl{\{}F,F^{\prime},S[\vec{v}]^{\vec{v}|q}_{R_{2}}[\vec{v}]^{\vec{v}|q}_{R_{1}}\bigr{\}}^{q} (11)

To compute a conditional ϕR\phi\Rightarrow R we first embed a stack representing computation of condition ϕ\phi within the frame corresponding to the conditional. Once this condition is evaluated, we either evaluate RR if the condition is satisfied, or drop the conditional if it is not:

λcondunf:\displaystyle\lambda_{\text{cond}}^{\text{unf}}: {F,F,S[v]ϕRv|q}q{F,F,S[v|[v]ϕv|c]Rv|q}q,\displaystyle\ \bigl{\{}F,F^{\prime},S[\vec{v}]^{\vec{v}|q}_{\phi\Rightarrow R}\bigr{\}}^{q}\Rightarrow\bigl{\{}F,F^{\prime},S\bigl{[}\vec{v}\;|\;[\vec{v}]^{\vec{v}|c}_{\phi}\bigr{]}^{\vec{v}|q}_{R}\bigr{\}}^{q},
λcond;funf:\displaystyle\lambda_{\text{cond};\text{\bf f}}^{\text{unf}}: {F,F,S[v|𝔯(f)]Rv|q}q{F,F,S}q,\displaystyle\ \bigl{\{}F,F^{\prime},S[\vec{v}\;|\;\mathfrak{r}(\text{\bf f})]^{\vec{v}|q}_{R}\bigr{\}}^{q}\Rightarrow\{F,F^{\prime},S\}^{q},
λcond;tunf:\displaystyle\lambda_{\text{cond};\text{\bf t}}^{\text{unf}}: {F,F,S[v|𝔯(t)]Rv|q}q{F,F,S[v]Rv|q}q.\displaystyle\ \bigl{\{}F,F^{\prime},S[\vec{v}\;|\;\mathfrak{r}(\text{\bf t})]^{\vec{v}|q}_{R}\bigr{\}}^{q}\Rightarrow\bigl{\{}F,F^{\prime},S[\vec{v}]_{R}^{\vec{v}|q}\bigr{\}}^{q}. (12)

To compute ϕ\phi we add, for every rule λ:{F,S}c{F,S′′}cifC\lambda:\{F,S^{\prime}\}^{c}\Rightarrow\{F,S^{\prime\prime}\}^{c}\ \text{if}\ C in Σ,𝒟𝐜𝐧𝐝(ϕ)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi) the rule schema

λq:{F,F,S[v|S]Rv|q}q{F,F,S[v|S′′]Rv|q}qifC\lambda^{q}:\{F,F^{\prime},S[\vec{v}\;|\;S^{\prime}]^{\vec{v}|q}_{R}\}^{q}\Rightarrow\{F,F^{\prime},S[\vec{v}\;|\;S^{\prime\prime}]^{\vec{v}|q}_{R}\}^{q}\ \text{if}\ C (13)

Evaluation of _._\nabla\_.\_ subquery is initialized with the whole database available for matching:

λinit:{F,F,S[v]P.Rv|q}q{F,F,S[F|v]P.Rv|q}q{\lambda_{\nabla}^{\text{init}}}:\bigl{\{}F,F^{\prime},S[\vec{v}]^{\vec{v}|q}_{\nabla P\mathbin{.}R}\bigr{\}}^{q}\Rightarrow\bigl{\{}F,F^{\prime},S[F\;|\;\vec{v}]^{\vec{v}|q}_{\nabla P\mathbin{.}R}\bigr{\}}^{q} (14)

Let w\vec{w} be a sequence of all the distinct variables in 𝑉𝑎𝑟(P){v}\mathit{Var}(P)\setminus\{\vec{v}\}. Let σ\sigma be the current substitution. Rule λunf{\lambda_{\nabla}^{\text{unf}}} pushes onto the stack a (σ,R)(\sigma^{\prime},R)-frame, where σ=σ{b/w}\sigma^{\prime}=\sigma\cup\{\vec{b}/\vec{w}\} is defined by matching F′′σ(P!P?)F^{\prime\prime}\circ\sigma(P_{!}\circ P_{?}) with iterator state, and it removes σ(P?)\sigma^{\prime}(P_{?}) from the iterator state:

λunf:{F,F,S[F′′P!P?|v]P.Rv|q}c{F,F,S[F′′P!|v]P.Rv|q[v,w]Rv,w|q}q{\lambda_{\nabla}^{\text{unf}}}:\bigl{\{}F,F^{\prime},S\bigl{[}F^{\prime\prime}\circ P_{!}\circ P_{?}\;|\;\vec{v}\bigr{]}^{\vec{v}|q}_{\nabla P\mathbin{.}R}\bigr{\}}^{c}\Rightarrow\bigl{\{}F,F^{\prime},S\bigl{[}F^{\prime\prime}\circ P_{!}\;|\;\vec{v}\bigr{]}^{\vec{v}|q}_{\nabla P\mathbin{.}R}[\vec{v},\vec{w}]^{\vec{v},\vec{w}|q}_{R}\bigr{\}}^{q} (15)

We keep applying λunf{\lambda_{\nabla}^{\text{unf}}} until we cannot match F′′σ(P!P?)F^{\prime\prime}\circ\sigma(P_{!}\circ P_{?}) with iterator state. Then we remove the iterator frame from the stack. To prevent premature application, rule schema λend{\lambda_{\nabla}^{\text{end}}} is conditional, where the condition uses functions μP,v:Pattps1snYes?\mu_{P,\vec{v}}:\text{\tt Pat}^{\text{\it tp}}\;s_{1}\ldots s_{n}\rightarrow\text{\tt Yes?} defined for each v𝑉𝑎𝑟(Q)\vec{v}\subseteq\mathit{Var}(Q) and pattern PP occurring in QQ with the single equation μP,v(FP!P?,v)=yes\mu_{P,\vec{v}}(F\circ P_{!}\circ P_{?},\vec{v})=\text{\it yes} (cf. Equation (8)):

λend:{F,F,S[F′′|v]P.Rv|q}q{F,F,S}qif(μP,v(F′′,v)=yes)=f.{\lambda_{\nabla}^{\text{end}}}:\{F,F^{\prime},S[F^{\prime\prime}\;|\;\vec{v}]^{\vec{v}|q}_{\nabla P\mathbin{.}R}\}^{q}\Rightarrow\{F,F^{\prime},S\}^{q}\quad\text{if}\ (\mu_{P,\vec{v}}(F^{\prime\prime},\vec{v})=\text{\it yes})=\text{\bf f}. (16)

Finally, the rule λans:{F,F,}c𝔞(F)\lambda_{\text{ans}}:\{F,F^{\prime},\emptyset\}^{c}\Rightarrow\mathfrak{a}(F^{\prime}) finishes evaluation of QQ.

The following result can be proven similarly to Theorem 4.2:

Theorem 5.1

Σ,𝒟𝐪𝐫𝐲(R)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}(R) is a terminating rewriting system.

The following useful observation can be trivially verified by examining the rule schemas:

Lemma 5.2

Let RR be a subcondition of QQ. Then, for all multisets of facts FF, FF^{\prime}, F′′F^{\prime\prime}, stacks SS, lists of variables v=v1,,vn\vec{v}=v_{1},\ldots,v_{n} and values a=a1,,an\vec{a}=a_{1},\ldots,a_{n}, {F,F,S[a]Rv|q}q{F,FF′′,S}q\{F,F^{\prime},S[\vec{a}]^{\vec{v}|q}_{R}\}^{q}\rightarrow^{*}\{F,F^{\prime}\circ F^{\prime\prime},S\}^{q} in Σ,𝒟𝐪𝐫𝐲(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}(Q) if and only if {F,,[]σ(R)|q}q!𝔞(F′′)\{F,\emptyset,[]^{|q}_{\sigma(R)}\}^{q}\rightarrow^{!}\mathfrak{a}(F^{\prime\prime}) in Σ,𝒟𝐪𝐫𝐲(σ(R))\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}(\sigma(R)), where substitution σ:={a/v}\sigma:=\{\vec{a}/\vec{v}\}.

The following example shows that evaluation of queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} does not, in general, return a unique answer, and, in particular, that Σ,𝒟𝐪𝐫𝐲(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}(Q) for general queries QQ is not confluent.

Example 5.3

Let :Fact\sharp:\rightarrow\text{\tt Fact} and b:NatFactb:\text{\tt Nat}\rightarrow\text{\tt Fact}. Consider the query Q:=[]?[b(x)]!.b(x)Q:=\nabla[\sharp]_{?}\circ[b(x)]_{!}\mathbin{.}b(x) executed against database F:=b(1)b(2)F:=\sharp\circ b(1)\circ b(2). Then IQ(F)\mathrm{I}_{Q}(F) can normalized in two ways:

IQ(F)λinit{F,,[F|]Q|q}qλunf{{F,,[b(1)b(2)|]Q|q[1]b(x)x|q}q!𝔞(b(1)){F,,[b(1)b(2)|]Q|q[2]b(x)x|q}q!𝔞(b(2)).\mathrm{I}_{Q}(F)\xrightarrow{{\lambda_{\nabla}^{\text{init}}}}\bigl{\{}F,\emptyset,[F\;|\;]^{|q}_{Q}\bigr{\}}^{q}\xrightarrow{{\lambda_{\nabla}^{\text{unf}}}}\left\{\begin{array}[]{l}\bigl{\{}F,\emptyset,[b(1)\circ b(2)\;|\;]^{|q}_{Q}[1]^{x|q}_{b(x)}\bigr{\}}^{q}\rightarrow^{!}\mathfrak{a}(b(1))\\ \bigl{\{}F,\emptyset,[b(1)\circ b(2)\;|\;]^{|q}_{Q}[2]^{x|q}_{b(x)}\bigr{\}}^{q}\rightarrow^{!}\mathfrak{a}(b(2))\end{array}\right..

Queries like QQ are useful as a simulation of a non-deterministic choice (say, by a human agent) of a subset of values stored in the database with a fixed maximal cardinality. E.g., [b(x)]?.b(x)\nabla[b(x)]_{?}\mathbin{.}b(x) returns all “bb-facts” stored in the database. Query QQ defined above, however, chooses (with repetitions) at most as many bb-facts as there are tokens \sharp. Query [b(x)]?.b(x)\nabla[\sharp\circ b(x)]_{?}\mathbin{.}b(x) avoids repetitions.

The following is the bisimulation-like definition of logical equivalence between queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}:

Definition 5.4

Let Q1Q_{1} and Q2Q_{2} be two conditions in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}. Recall that IQ(F):={F,,[]Q|q}q\mathrm{I}_{Q}(F):=\{F,\emptyset,[]^{|q}_{Q}\}^{q}. We say that Q1Q_{1} is logically equivalent to Q2Q_{2}, writing Q1Q2Q_{1}\equiv Q_{2}, if and only if, for all ground multisets of facts FF and FF^{\prime}, and ground substitutions σ\sigma such that σ(Q1)\sigma(Q_{1}) and σ(Q2)\sigma(Q_{2}) are closed, we have

Iσ(Q1)(F)!𝔞(F)iffIσ(Q2)(F)!𝔞(F).\mathrm{I}_{\sigma(Q_{1})}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime})\quad\text{iff}\quad\mathrm{I}_{\sigma(Q_{2})}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}).

In other words, queries are equivalent if they can match each other’s answers. The following result is an immediate consequence of Lemma 5.2:

Lemma 5.5

Logical equivalence on queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} is an equivalence relation and a congruence, i.e., if κ\kappa is a position in a query QQ in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} such that Q|κQ|_{\kappa} is a query, and RQ|κR\equiv Q|_{\kappa}, then QQ[R]κQ\equiv Q[R]_{\kappa}.

We leave proof of the next observation to the reader:

Lemma 5.6

For any closed query QQ in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}, and any renaming σ\sigma, Qσ(Q)Q\equiv\sigma(Q).

The following clarification of semantics of queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} is proven similarly to Lemma 4.9:

Lemma 5.7

For all ground multisets of facts FF, FF^{\prime} and F′′F^{\prime\prime}, all closed queries QQ, Q1Q_{1}, Q2Q_{2}, and P.R\nabla P\mathbin{.}R in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}, and all closed conditions ϕ\phi in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}, the following statements hold:

  1. 1.

    If IQ(F)!Γ\mathrm{I}_{Q}(F)\rightarrow^{!}\Gamma then Γ=𝔞(G)\Gamma=\mathfrak{a}(G) for some ground multiset of facts GG.

  2. 2.

    Let ff be a fact. If If(F)!Γ\mathrm{I}_{f}(F)\rightarrow^{!}\Gamma then Γ=𝔞(f)\Gamma=\mathfrak{a}(f). If I(F)!Γ\mathrm{I}_{\emptyset}(F)\rightarrow^{!}\Gamma then Γ=𝔞()\Gamma=\mathfrak{a}(\emptyset).

  3. 3.

    Let FF^{\prime}\neq\emptyset. In this case IϕQ(F)!𝔞(F)\mathrm{I}_{\phi\Rightarrow Q}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}) iff Iϕ(F)!𝔰(t)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}) and IQ(F)!𝔞(F)\mathrm{I}_{Q}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}).

  4. 4.

    IϕQ(F)!𝔞()\mathrm{I}_{\phi\Rightarrow Q}(F)\rightarrow^{!}\mathfrak{a}(\emptyset) iff either (non-exclusively) Iϕ(F)!𝔰(f)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}) or IQ(F)!𝔞()\mathrm{I}_{Q}(F)\rightarrow^{!}\mathfrak{a}(\emptyset).

  5. 5.

    IQ1Q2(F)!𝔞(F)\mathrm{I}_{Q_{1}\rhd Q_{2}}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}) iff IQ1(F)!𝔞(F1)\mathrm{I}_{Q_{1}}(F)\rightarrow^{!}\mathfrak{a}(F_{1}) and IQ2(F)!𝔞(F2)\mathrm{I}_{Q_{2}}(F)\rightarrow^{!}\mathfrak{a}(F_{2}) for some multisets F1F_{1} and F2F_{2} such that F=F1F2F^{\prime}=F_{1}\circ F_{2}.

  6. 6.

    IP.R(F)!𝔞(F)\mathrm{I}_{\nabla P\mathbin{.}R}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}) iff there exist lists of ground multisets of facts F0,,FnF_{0},\ldots,F_{n}, G0,,Gn1G_{0},\ldots,G_{n-1}, and H0,,Hn1H_{0},\ldots,H_{n-1}, and a sequence of substitutions σ0,σ1,,σn1\sigma_{0},\sigma_{1},\ldots,\sigma_{n-1} such that

    1. (a)

      F0=FF_{0}=F, Fi+1=Giσi(P!)F_{i+1}=G_{i}\circ\sigma_{i}(P_{!}), and Fi=Giσi(P!P?)F_{i}=G_{i}\circ\sigma_{i}(P_{!}\circ P_{?}), for all i{0,,n1}i\in\{0,\ldots,n-1\},

    2. (b)

      Iσi(R)(F)!𝔞(Hi)\mathrm{I}_{\sigma_{i}(R)}(F)\rightarrow^{!}\mathfrak{a}(H_{i}), for all i{0,,n1}i\in\{0,\ldots,n-1\},

    3. (c)

      there exists no substitution σn\sigma_{n} and multiset of facts GnG_{n} such that Fn=Gnσn(P!P?)F_{n}=G_{n}\circ\sigma_{n}(P_{!}\circ P_{?}),

    4. (d)

      F=H0H1Hn1F^{\prime}=H_{0}\circ H_{1}\circ\cdots H_{n-1}.

Lemma 5.8

The following logical equivalences hold between conditions in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}:

QQ,Q1Q2Q2Q1,Q1(Q2Q3)(Q1Q2)Q3,\displaystyle\emptyset\rhd Q\equiv Q,\quad Q_{1}\rhd Q_{2}\equiv Q_{2}\rhd Q_{1},\quad Q_{1}\rhd(Q_{2}\rhd Q_{3})\equiv(Q_{1}\rhd Q_{2})\rhd Q_{3},
R,¬RR,P.\displaystyle\bot\Rightarrow R\equiv\emptyset,\quad\neg\bot\Rightarrow R\equiv R,\quad\exists P\mathbin{.}\emptyset\equiv\emptyset

The next results show that non-confluence of queries makes some natural equivalences invalid:

Lemma 5.9

Let ϕ\phi be a condition in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} and let Q1Q_{1}, Q2Q_{2} be queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}. For all ground multisets of facts FF, FF^{\prime} and all substitutions σ\sigma such that σ(ϕ)\sigma(\phi), σ(Q1)\sigma(Q_{1}) and σ(Q2)\sigma(Q_{2}) are closed, we have

Iσ((ϕQ1)(ϕQ2))(F)!𝔞(F)ifIσ(ϕ(Q1Q2))(F)!𝔞(F),\mathrm{I}_{\sigma((\phi\Rightarrow Q_{1})\rhd(\phi\Rightarrow Q_{2}))}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime})\quad\text{if}\quad\mathrm{I}_{\sigma(\phi\Rightarrow(Q_{1}\rhd Q_{2}))}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}), (17)

however, the inverse implication does not hold in general. If ϕ\phi is deterministic (Definition 4.15), then

(ϕQ1)(ϕQ2)ϕ(Q1Q2).(\phi\Rightarrow Q_{1})\rhd(\phi\Rightarrow Q_{2})\equiv\phi\Rightarrow(Q_{1}\rhd Q_{2}). (18)
Proof 5.10

To prove the implication (17) assume Iσ(ϕ(Q1Q2))(F)!𝔞(F)\mathrm{I}_{\sigma(\phi\Rightarrow(Q_{1}\rhd Q_{2}))}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}). Either FF^{\prime}\neq\emptyset or F=F^{\prime}=\emptyset. In the first case, by Lemma 5.7, p. 3 , our assumption is equivalent to Iϕ(F)!𝔯(t)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{r}(\text{\bf t}) and Iσ(Q1Q2)(F)!𝔞(F)\mathrm{I}_{\sigma(Q_{1}\rhd Q_{2})}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}), the latter of which, in turn, is equivalent, by Lemma 5.7, p. 5, to Iσ(Q1)(F)!𝔞(F1)\mathrm{I}_{\sigma(Q_{1})}(F)\rightarrow^{!}\mathfrak{a}(F_{1}) and Iσ(Q2)(F)!𝔞(F2)\mathrm{I}_{\sigma(Q_{2})}(F)\rightarrow^{!}\mathfrak{a}(F_{2}) for some multisets F1F_{1} and F2F_{2} such that F=F1F2F^{\prime}=F_{1}\circ F_{2}. But this is equivalent, by Lemma 5.7, p. 3 and 5, to Iσ((ϕQ1)(ϕQ2))(F)!𝔞(F).\mathrm{I}_{\sigma((\phi\Rightarrow Q_{1})\rhd(\phi\Rightarrow Q_{2}))}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}). The case F=F^{\prime}=\emptyset is dealt with similarly.

To show that, in general, the inverse implication does not hold, consider ϕ:=[ab(x)]?.{x=1}\phi:=\exists[a\circ b(x)]_{?}\mathbin{.}\{x=1\}, Q:=ϕ(cd)Q:=\phi\Rightarrow(c\rhd d), Q:=(ϕc)(ϕd)Q^{\prime}:=(\phi\Rightarrow c)\rhd(\phi\Rightarrow d), where a,c,d:Facta,c,d:\rightarrow\text{\tt Fact} and b:NatFactb:\text{\tt Nat}\rightarrow\text{\tt Fact}. Let F:=ab(1)b(2)F:=a\circ b(1)\circ b(2). Since Iϕ(F)!𝔰(B)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{s}(B) for B{t,f}B\in\{\text{\bf t},\text{\bf f}\}, IQ(F)!𝔞(F)\mathrm{I}_{Q}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}) iff F{,cd}F^{\prime}\in\{\emptyset,c\circ d\}, whereas IQ(F)!𝔞(F)\mathrm{I}_{Q^{\prime}}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}) iff F{,c,d,cd}F^{\prime}\in\{\emptyset,c,d,c\circ d\}.

Assume that ϕ\phi is deterministic. By Theorem 4.16, either Iσ(ϕ)(F)!𝔰(t)\mathrm{I}_{\sigma(\phi)}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}) or Iσ(ϕ)(F)!𝔰(f)\mathrm{I}_{\sigma(\phi)}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}), but not both. Let R1:=σ((ϕQ1)(ϕQ2))R_{1}:=\sigma\bigl{(}(\phi\Rightarrow Q_{1})\rhd(\phi\Rightarrow Q_{2})\bigr{)}, R2:=σ(ϕ(Q1Q2))R_{2}:=\sigma\bigl{(}\phi\Rightarrow(Q_{1}\rhd Q_{2})\bigr{)}. If Iσ(ϕ)(F)!𝔰(f)\mathrm{I}_{\sigma(\phi)}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f}) then IRi(F)𝔞(F)\mathrm{I}_{R_{i}}(F)\rightarrow\mathfrak{a}(F^{\prime}) iff F=F^{\prime}=\emptyset for i{1,2}i\in\{1,2\}. If Iσ(ϕ)(F)!𝔰(t)\mathrm{I}_{\sigma(\phi)}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t}) then, for i{1,2}i\in\{1,2\}, IRi(F)𝔞(F)\mathrm{I}_{R_{i}}(F)\rightarrow\mathfrak{a}(F^{\prime}) iff F=F1F2F^{\prime}=F_{1}\circ F_{2} for some multisets F1F_{1} and F2F_{2} such that IQj(F)𝔞(Fj)\mathrm{I}_{Q_{j}}(F)\rightarrow\mathfrak{a}(F_{j}), j{1,2}j\in\{1,2\}.

Example 5.11

In general, equivalence P.(Q1Q2)(P.Q1)(P.Q2)\nabla P\mathbin{.}(Q_{1}\rhd Q_{2})\equiv(\nabla P\mathbin{.}Q_{1})\rhd(\nabla P\mathbin{.}Q_{2}) is not valid. Indeed, let a:Facta:\rightarrow\text{\tt Fact} and b,c:NatFactb,c:\text{\tt Nat}\rightarrow\text{\tt Fact}, and let P:=[ab(x)]?P:=[a\circ b(x)]_{?}, Q:=P.(b(x)c(x))Q:=\nabla P\mathbin{.}(b(x)\rhd c(x)), Q:=(P.b(x))(P.c(x))Q^{\prime}:=(\nabla P\mathbin{.}b(x))\rhd(\nabla P\mathbin{.}c(x)). Suppose that F:=ab(1)b(2)F:=a\circ b(1)\circ b(2). Then IQ(F)!𝔯(F)\mathrm{I}_{Q}(F)\rightarrow^{!}\mathfrak{r}(F^{\prime}) iff F{b(i)c(i)|i{1,2}}F^{\prime}\in\bigl{\{}b(i)\circ c(i)\;|\;i\in\{1,2\}\bigr{\}}. However, IQ(F)!𝔯(F)\mathrm{I}_{Q^{\prime}}(F)\rightarrow^{!}\mathfrak{r}(F^{\prime}) iff F{b(i)c(j)|i,j{1,2}}F^{\prime}\in\bigl{\{}b(i)\circ c(j)\;|\;i,j\in\{1,2\}\bigr{\}}.

Here we define a class of queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} which evaluate to unique answers:

Definition 5.12

A query QQ in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} is called deterministic if all quantification patterns in QQ (including those inside conditions) contain only single facts with unique matching property.

Theorem 5.13

Let QQ be a deterministic query in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}. Then Σ,𝒟(Q)\mathcal{R}_{\Sigma,\mathcal{D}}(Q) is confluent, and, in particular, given a ground multiset of facts FF, there is a unique multiset of facts FF^{\prime} such that IQ(F)!𝔞(F)\mathrm{I}_{Q}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}).

Proof 5.14

As semiconfluence implies confluence, to show confluence at t:Stateqt:\text{\tt State}^{q} it suffices to prove that if tt+t′′t^{\prime}\leftarrow t\rightarrow^{+}t^{\prime\prime} for some tt^{\prime} and t′′t^{\prime\prime}, then there exists ss such that tst′′t^{\prime}\rightarrow^{*}s\;{}^{*}\!\!\leftarrow t^{\prime\prime}. Semiconfluence is immediate at irreducible terms 𝔞(F)\mathfrak{a}(F^{\prime}), as well as terms {F,F,}q\{F,F^{\prime},\emptyset\}^{q} which can only rewrite to 𝔞(F)\mathfrak{a}(F^{\prime}). Let t:={F,F,SA}qt:=\{F,F^{\prime},SA\}^{q} where S:StkqS:\text{\tt Stk}^{q}, and A:FrmqA:\text{\tt Frm}^{q}. If there exists a unique multiset of facts KK such that every rewrite sequence starting with tt either (1) contains h:={F,FK,S}qh:=\{F,F^{\prime}\circ K,S\}^{q} or (2) it can be extended to reach hh, then semiconfluence holds at tt. Indeed, in this case, either s=t′′s=t^{\prime\prime} witnesses the semiconfluence (if (1) holds for t+t′′t\rightarrow^{+}t^{\prime\prime}) or s=hs=h does (if (2) holds for t+t′′t\rightarrow^{+}t^{\prime\prime}). It remains to prove the existence of the unique multiset KK. We argue by induction on the structure of formulas indexing the frame AA on the top of the stack. Most cases are dealt by trivial application of Lemmas 5.7 and 5.2. For frames related to conditionals __\_\Rightarrow\_ we have to also use Lemma 4.16. The only non-trivial part of the proof concerns frames AA of the form [F′′|a]P.Rv|q[F^{\prime\prime}\;|\;\vec{a}]^{\vec{v}|q}_{\nabla P\mathbin{.}R}.

Under the theorem’s assumption, P=[f]?P=[f]_{?}, where ff is a fact with unique matching property (Definition 4.14). Let w\vec{w} be a sequence of all variables in 𝑉𝑎𝑟(f){v}\mathit{Var}(f)\setminus\{\vec{v}\}. If there is no fact ff^{\prime} in FF^{\prime} such that ff^{\prime} matches {a/v}(f)\{\vec{a}/\vec{v}\}(f) then necessarily K=K=\emptyset. Otherwise F=Gf1fnF^{\prime}=G\circ f_{1}\circ\cdots f_{n} for some n>0n>0, where (1) for all i{1,,n}i\in\{1,\ldots,n\} there exists a unique substitution σi={a/v,bi/w}\sigma_{i}=\{\vec{a}/\vec{v},\vec{b^{i}}/\vec{w}\} such that fi=σi(f)f_{i}=\sigma_{i}(f), (2) there is no fact ff^{\prime} in GG such that ff^{\prime} matches {a/v}(f)\{\vec{a}/\vec{v}\}(f). In this case, necessarily, by Lemma 5.7, p. 6, K=K1KnK=K_{1}\circ\cdots\circ K_{n}, where, for all i{1,,n}i\in\{1,\ldots,n\}, KiK_{i} is the unique multiset of facts (uniqueness and existence follows from inductive assumption) such that Iσi(R)(F)!𝔞(Ki)\mathrm{I}_{\sigma_{i}(R)}(F)\rightarrow^{!}\mathfrak{a}(K_{i}).

There is a useful relationship between queries and conditions:

Lemma 5.15

Let r:s1snFactr:s_{1}\ldots s_{n}\rightarrow\text{\tt Fact}, let QQ be a query in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}}, and let x:=x1,,xn\vec{x}:=x_{1},\ldots,x_{n} be a list of nn distinct variables such that {x}𝑉𝑎𝑟(Q)=\{\vec{x}\}\cap\mathit{Var}(Q)=\emptyset and xi:six_{i}:s_{i} for i{1,,n}i\in\{1,\ldots,n\}. Then, there exists a condition ϕQx|r\phi_{Q}^{\vec{x}|r} in 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}} such that, for all ground multisets of facts FF, and for all ground substitutions σ:={t/x,a/v}\sigma:=\{\vec{t}/\vec{x},\vec{a}/\vec{v}\} such that {a/v}(Q)\{\vec{a}/\vec{v}\}(Q) is closed, σ(ϕQx|r)\sigma(\phi_{Q}^{\vec{x}|r}) is closed, and, moreover,

Iσ(ϕQx|r)(F)!𝔰(t)iffF.(IQ(F)!𝔞(Fr(t))),\displaystyle\mathrm{I}_{\sigma(\phi_{Q}^{\vec{x}|r})}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf t})\quad\text{iff}\quad\exists F^{\prime}\mathbin{.}\bigl{(}\mathrm{I}_{Q}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime}\circ r(\vec{t}))\bigr{)},
Iσ(ϕQx|r)(F)!𝔰(f)iffF.(r(t)FIQ(F)!𝔞(F)),\displaystyle\mathrm{I}_{\sigma(\phi_{Q}^{\vec{x}|r})}(F)\rightarrow^{!}\mathfrak{s}(\text{\bf f})\quad\text{iff}\quad\exists F^{\prime}\mathbin{.}\bigl{(}r(\vec{t})\notin F^{\prime}\wedge\mathrm{I}_{Q}(F)\rightarrow^{!}\mathfrak{a}(F^{\prime})\bigr{)}, (19)

i.e., σ(ϕQx|r)\sigma(\phi_{Q}^{\vec{x}|r}) evaluates to t (resp. f) on FF if and only if there is some FF^{\prime} returned by QQ when evaluated against FF, such that r(t)Fr(\vec{t})\in F^{\prime} (resp. r(t)Fr(\vec{t})\notin F^{\prime}). Moreover, if QQ is deterministic, then so is ϕQx|r\phi_{Q}^{\vec{x}|r}

Proof 5.16

We define ϕQx|r\phi_{Q}^{\vec{x}|r} by recursion on the structure of a query QQ:

ϕx|r=,ϕh(t)x|r={r(x)=h(t)},ϕQ1Q2x|r=ϕQ1x|rϕQ2x|r,ϕψRx|r=ψϕRx|r,ϕP.Rx|r=P.ϕRx|r.\displaystyle\phi_{\emptyset}^{\vec{x}|r}=\bot,\ \phi_{h(\vec{t})}^{\vec{x}|r}=\{r(\vec{x})=h(\vec{t})\},\ \phi_{Q_{1}\rhd Q_{2}}^{\vec{x}|r}=\phi_{Q_{1}}^{\vec{x}|r}\vee\phi_{Q_{2}}^{\vec{x}|r},\ \phi_{\psi\Rightarrow R}^{\vec{x}|r}=\psi\wedge\phi_{R}^{\vec{x}|r},\ \phi_{\nabla P\mathbin{.}R}^{\vec{x}|r}=\exists P.\phi_{R}^{\vec{x}|r}.

The easy if laborious proof that ϕQx|r\phi_{Q}^{\vec{x}|r} really satisfies all the conditions in the statement is left to the reader.

Example 5.17

Consider the following query in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} (where xx and yy are distinct variables):

Q:=[f(x,y)]?.(({x=y}r(x))r(s(x))).Q:=\nabla[f(x,y)]_{?}\mathbin{.}\bigl{(}(\{x=y\}\Rightarrow r(x))\rhd r(s(x))\bigr{)}.

Thus, for any fact of the form f(x,y)f(x,y) in the database, QQ will output r(s(x))r(s(x)), and, if x=yx=y, also r(x)r(x). Using the recursive formula from the proof of Lemma 5.15 we easily see that

ϕQz|r:=[f(x,y)]?.(({x=y}{r(z)=r(x)}){r(z)=r(s(x))}).\phi^{z|r}_{Q}:=\exists[f(x,y)]_{?}\mathbin{.}\bigl{(}(\{x=y\}\wedge\{r(z)=r(x)\})\vee\{r(z)=r(s(x))\}\bigr{)}.

The next result shows that 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} can emulate relational algebra.

Theorem 5.18

Denote by RelAlg(𝒮,𝒟)\text{RelAlg}(\mathcal{S},\mathcal{D}) the relational algebra over the relational schema 𝒮\mathcal{S} and the domain 𝒟\mathcal{D} of atomic values (which we silently identify with its algebraic representation, where types are sorts and predicates are represented with equationally defined operators into the Bool sort). Furthermore, assume that for each relational algebra expression RR in RelAlg(𝒮,𝒟)\text{RelAlg}(\mathcal{S},\mathcal{D}) a function symbol R:s1snFact\mathcal{R}_{R}:s_{1}\ldots s_{n}\rightarrow\text{\tt Fact} is in ΣF\Sigma_{F}, where sis_{i} is the sort (domain) of the ii-th column of RR and nn is the arity of RR. For any relational database II with schema 𝒮\mathcal{S}, let 𝑡𝑟¯(I)\overline{\mathit{tr}}(I) be a multiset corresponding to the database II. More precisely,

𝑡𝑟¯(I):={r(t)|r is a relation symbol in 𝒮 and (t)rI},\overline{\mathit{tr}}(I):=\circ\{\mathcal{R}_{r}(\vec{t})\;|\;\text{$r$ is a relation symbol in $\mathcal{S}$ and\ }(\vec{t})\in r^{I}\},

where rIr^{I} is the set of tuples of rr in II. Then, for all formulas RR in RelAlg(𝒮,𝒟)\text{RelAlg}(\mathcal{S},\mathcal{D}), there exists a closed query 𝑡𝑟(R)\mathit{tr}(R) in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} such that for all relational databases II with schema 𝒮\mathcal{S},

F.(I𝑡𝑟(R)(𝑡𝑟¯(I))!𝔞(FR(t)))iff(t)𝑒𝑣𝑎𝑙I(R),\exists F\mathbin{.}\bigl{(}\mathrm{I}_{\mathit{tr}(R)}(\overline{\mathit{tr}}(I))\rightarrow^{!}\mathfrak{a}(F\circ\mathcal{R}_{R}(\vec{t}))\bigr{)}\quad\text{iff}\quad(\vec{t})\in\mathit{eval}_{I}(R), (20)

where 𝑒𝑣𝑎𝑙I(R)\mathit{eval}_{I}(R) is the set of tuples obtained by evaluation of relational query RR against the relational database II. Furthermore, for any relational expression RR, 𝑡𝑟(R)\mathit{tr}(R) is deterministic.

Proof 5.19

To prove the theorem we define 𝑡𝑟(R)\mathit{tr}(R) by recursion on the structure of RR. We consider relational algebra expressions to be constructed with base relations, projections, selections, set unions, Cartesian products, and set differences. Other well known relational operators such as joins can be defined in terms of basic operators mentioned above. Attribute renaming is not relevant in our case, since we represent relations with positional arguments only. We denote set unions, Cartesian products, and set differences with the usual mathematical notation (e.g., RSR\cup S, R×SR\times S and RSR\setminus S, respectively). Notation for projections and selections is less standardised (and needs to be adapted to relations with positional arguments). We denote by πi1,,ik(R)\pi_{i_{1},\ldots,i_{k}}(R) the projection of RR onto i1i_{1}’th, i2i_{2}’th, \ldots, and iki_{k}’th “leg” (i.e., on a single tuple, πi1,,ik((t1,t2,,tn)):=(ti1,ti2,,tik)\pi_{i_{1},\ldots,i_{k}}((t_{1},t_{2},\ldots,t_{n})):=(t_{i_{1}},t_{i_{2}},\ldots,t_{i_{k}})). We denote by σϕ(R)\sigma_{\phi}(R) the selection with condition ϕ\phi applied to RR (i.e., it returns those tuples of RR which satisfy ϕ\phi). If RR is nn-ary, then we will assume that in ϕ\phi expression $i\$_{i} corresponds to the ii-th column of RR, for i{1,,n}i\in\{1,\ldots,n\}.

Let rr be a base relation of arity kk (for simplicity we omit the typing information), and let x\vec{x} be a list of kk distinct variables. Then 𝑡𝑟(r):=[r(x)]?.r(x).\mathit{tr}(r):=\nabla[\mathcal{R}_{r}(\vec{x})]_{?}\mathbin{.}\mathcal{R}_{r}(\vec{x}). We now consider selected relational operators:

  1. 1.

    Let RR be a relational formula defining an nn-ary relation, and let i1,,iki_{1},\ldots,i_{k} be a subsequence of 1,,n1,\ldots,n. We define 𝑡𝑟(πi1,,ik(R))\mathit{tr}(\pi_{i_{1},\ldots,i_{k}}(R)) to be 𝑡𝑟(R)\mathit{tr}(R) with each subquery of the form R(t1,,tn)\mathcal{R}_{R}(t_{1},\ldots,t_{n}) replaced with πi1,,ik(R)(ti1,,tik)\mathcal{R}_{\pi_{i_{1},\ldots,i_{k}}(R)}(t_{i_{1}},\ldots,t_{i_{k}}).

  2. 2.

    Let RR be a relational formula of arity kk, and let ϕ\phi be a term of sort Bool representing condition on rows of RR (where in ϕ\phi special variable $i\$_{i}, i{1,,n}i\in\{1,\ldots,n\} corresponds to the ii-th “attribute” of RR). We define 𝑡𝑟(σϕ(R))\mathit{tr}(\sigma_{\phi}(R)) to be 𝑡𝑟(R)\mathit{tr}(R) with each subquery of the form R(t)\mathcal{R}_{R}(\vec{t}) replaced with {{t/$}(ϕ)}σϕ(R)(t)\bigl{\{}\{\vec{t}/\vec{\$}\}(\phi)\bigr{\}}\Rightarrow\mathcal{R}_{\sigma_{\phi}(R)}(\vec{t}).

  3. 3.

    𝑡𝑟(R1R2):=𝑡𝑟(R1)𝑡𝑟(R2)\mathit{tr}(R_{1}\cup R_{2}):=\mathit{tr}(R_{1})\rhd\mathit{tr}(R_{2}).

  4. 4.

    Let RR and SS be relational formulas. Let 𝑡𝑟(S):=σ(𝑡𝑟(S))\mathit{tr}(S)^{\prime}:=\sigma(\mathit{tr}(S)) for some renaming σ\sigma such that 𝑉𝑎𝑟(𝑡𝑟(R))𝑉𝑎𝑟(𝑡𝑟(S))=\mathit{Var}(\mathit{tr}(R))\cap\mathit{Var}(\mathit{tr}(S)^{\prime})=\emptyset (𝑡𝑟(S)𝑡𝑟(S)\mathit{tr}(S)\equiv\mathit{tr}(S)^{\prime} by Lemma 5.6 since 𝑡𝑟(S)\mathit{tr}(S) is closed). Further, let α(t)\alpha(\vec{t}) be 𝑡𝑟(S)\mathit{tr}(S)^{\prime} with each subquery of the form S(s)\mathcal{R}_{S}(\vec{s}) replaced with R×S(t,s)\mathcal{R}_{R\times S}(\vec{t},\vec{s}). Then we define 𝑡𝑟(R×S)\mathit{tr}(R\times S) to be 𝑡𝑟(R)\mathit{tr}(R) with each subquery of the form R(t)\mathcal{R}_{R}(\vec{t}) replaced with α(t)\alpha(\vec{t}).

  5. 5.

    Let RR and SS be relational formulas of arity kk. Let 𝑡𝑟(S)\mathit{tr}(S)^{\prime} be like in the previous point. Let x\vec{x} be a list of kk distinct variables such that {x}(𝑉𝑎𝑟(𝑡𝑟(R))𝑉𝑎𝑟(𝑡𝑟(S)))=\{\vec{x}\}\cap(\mathit{Var}(\mathit{tr}(R))\cup\mathit{Var}(\mathit{tr}(S)^{\prime}))=\emptyset. We define 𝑡𝑟(RS))\mathit{tr}(R\setminus S)) to be 𝑡𝑟(R)\mathit{tr}(R) with each subquery of the form R(t)\mathcal{R}_{R}(\vec{t}) replaced with ¬{t/x}(ϕ𝑡𝑟(S)x|S)RS(t)\neg\{\vec{t}/\vec{x}\}\bigl{(}\phi_{\mathit{tr}(S)^{\prime}}^{\vec{x}|\mathcal{R}_{S}}\bigr{)}\Rightarrow\mathcal{R}_{R\setminus S}(\vec{t}) (see Lemma 5.15).

An easy induction on the structure of RR shows that 𝑡𝑟(R)\mathit{tr}(R) is closed and deterministic, and Equation (20) is satisfied (in the case of induction step for set difference we also use Lemma 5.15)

Remark 5.20

Relational queries RR evaluate to sets of tuples. However, 𝑡𝑟(R)\mathit{tr}(R) may evaluate to a multiset of facts — e.g., when evaluating unions duplicate facts are not removed.

Example 5.21

Let rr and ss be binary relations. Consider the following relational algebra expression:

R:=π1,4(σ$2=$3(r×s)).R:=\pi_{1,4}\bigl{(}\sigma_{\$_{2}=\$_{3}}(r\times s)\bigr{)}.

Thus, (x,y)R(x,y)\in R iff (x,z)r(x,z)\in r and (z,y)s(z,y)\in s for some zz. Let represent RR as a query in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} using definition of 𝑡𝑟(_)\mathit{tr}(\_) from the proof of Theorem 5.18. First,

𝑡𝑟(r)=[r(x1,x2)]?.r(x1,x2),𝑡𝑟(s)=[s(y1,y2)]?.s(y1,y2),\mathit{tr}(r)=\nabla[\mathcal{R}_{r}(x_{1},x_{2})]_{?}\mathbin{.}\mathcal{R}_{r}(x_{1},x_{2}),\quad\mathit{tr}(s)=\nabla[\mathcal{R}_{s}(y_{1},y_{2})]_{?}\mathbin{.}\mathcal{R}_{s}(y_{1},y_{2}),

where x1,x2,y2,y2x_{1},x_{2},y_{2},y_{2} are distinct variables. Then, by the point 4 in the proof

𝑡𝑟(r×s)=[r(x1,x2)]?.[s(y1,y2)]?.r×s(x1,x2,y1,y2).\mathit{tr}(r\times s)=\nabla[\mathcal{R}_{r}(x_{1},x_{2})]_{?}\mathbin{.}\nabla[\mathcal{R}_{s}(y_{1},y_{2})]_{?}\mathbin{.}\mathcal{R}_{r\times s}(x_{1},x_{2},y_{1},y_{2}).

Finally, to deal with selection and projection we apply points 2 and 1, respectively, from the proof:

𝑡𝑟(π1,4(σ$2=$3(r×s)))=[r(x1,x2)]?.[s(y1,y2)]?.({x2=y1}π1,4(σ$2=$3(r×s))(x1,y2)).\mathit{tr}\bigl{(}\pi_{1,4}\bigl{(}\sigma_{\$_{2}=\$_{3}}(r\times s)\bigr{)}\bigr{)}\\ =\nabla[\mathcal{R}_{r}(x_{1},x_{2})]_{?}\mathbin{.}\nabla[\mathcal{R}_{s}(y_{1},y_{2})]_{?}\mathbin{.}\bigl{(}\{x_{2}=y_{1}\}\Rightarrow\mathcal{R}_{\pi_{1,4}(\sigma_{\$_{2}=\$_{3}}(r\times s))}(x_{1},y_{2})\bigr{)}.

6 Rewriting semantics of 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}

We associate with a DML query QQ in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}} the rewriting system Σ,𝒟𝐝𝐦𝐥(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}(Q). Terms rewritten with the rules of Σ,𝒟𝐝𝐦𝐥(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}(Q) are of the form {F,F,F𝐧,S}d\{F,F^{\prime},F_{\mathbf{n}},S\}^{d}, where {_,_,_,_}d:FSetFSetFSet𝐧StkdStated\{\_,\_,\_,\_\}^{d}:\text{\tt FSet}\;\text{\tt FSet}\;\text{\tt FSet}^{\mathbf{n}}\;\text{\tt Stk}^{d}\rightarrow\text{\tt State}^{d}. FF is the database of facts against which we issue the DML query. FF changes during execution of the query while the facts are removed from it. A multiset FF^{\prime}, expanded during query execution, contains new facts to be added to the database. F𝐧F_{\mathbf{n}} is a multiset of fresh facts (see Section 2) from which fresh values are drawn. SS is a stack of sort Stkd\text{\tt Stk}^{d} which simulates structural recursion. Normal forms encapsulating a new database and new multiset of fresh facts after successful or, respectively, failing execution of a DML query, are constructed with 𝔫,𝔣:FSetFSet𝐧Stated\mathfrak{n},\mathfrak{f}:\text{\tt FSet}\;\text{\tt FSet}^{\mathbf{n}}\rightarrow\text{\tt State}^{d}. We consider only terms tt of sort Stated\text{\tt State}^{d} which satisfy the following freshness condition:

Definition 6.1

A term tt of sort Stated\text{\tt State}^{d} satisfies the freshness condition if and only if m>nm>n for all positions κ\kappa, κ\kappa^{\prime} in tt such that κκ1\kappa\neq\kappa^{\prime}1, t|κ=Cs(ıms)t|_{\kappa}=C_{s}(\imath^{s}_{m}) and t|κ=ınst|_{\kappa^{\prime}}=\imath^{s}_{n}.

Example 6.2

Consider the following term of Stated\text{\tt State}^{d} sort:

𝔫(f(ı10s)f(ı3s),Cs(ı7s)).\mathfrak{n}\bigl{(}f(\imath^{s}_{10})\circ f(\imath^{s}_{3}),C_{s}(\imath^{s}_{7})\bigr{)}.

It does not satisfy the freshness condition having as subterms both Cs(ı7s)C_{s}(\imath^{s}_{7}) and ı10s\imath^{s}_{10} with 10>710>7. Our general assumption which justifies the freshness condition is that a fresh fact of the form Cs(ıms)C_{s}(\imath^{s}_{m}) means that we never before used any value of the form ıns\imath^{s}_{n} with n>mn>m. Clearly, terms which do not satisfy freshness condition (like the term above) violate this assumption.

Terms of sort Stkd\text{\tt Stk}^{d} are constructed from local computation frames of sort Frmd\text{\tt Frm}^{d}, where Frmd<Stkd\text{\tt Frm}^{d}<\text{\tt Stk}^{d}, using an associative binary operator __:StkdStkdStkd\_\_:\text{\tt Stk}^{d}\;\text{\tt Stk}^{d}\rightarrow\text{\tt Stk}^{d} with identity \emptyset. Most constructors of frames are indexed by DML sub-queries RR of QQ, and lists of distinct variable names v:=v1,,vn\vec{v}:=v_{1},\ldots,v_{n} of respective sorts s1,,sns_{1},\ldots,s_{n} such that {v}𝑉𝑎𝑟(Q)\{\vec{v}\}\subseteq\mathit{Var}(Q) contains all free variables of RR:

[_,,_]Rv|d,[_,,_]Rv|d,:s1snFrmd,[_,,_|_]Rv|d:s1snStkcFrmd,\displaystyle[\_,\ldots,\_]^{\vec{v}|d}_{R},[\_,\ldots,\_]^{\vec{v}|d,\downarrow}_{R}:s_{1}\ldots s_{n}\rightarrow\text{\tt Frm}^{d},\quad[\_,\ldots,\_\;|\;\_]^{\vec{v}|d}_{R}:s_{1}\ldots s_{n}\;\text{\tt Stk}^{c}\rightarrow\text{\tt Frm}^{d},
:Frmd,[_|_,,_|_]P.Rv|d:FSets1snBoolFrmd,\displaystyle\mathit{\surd}:\rightarrow\text{\tt Frm}^{d},\quad[\_\;|\;\_,\ldots,\_\;|\;\_]^{\vec{v}|d}_{\nabla P\mathbin{.}R}:\text{\tt FSet}\;s_{1}\ldots s_{n}\;\text{\tt Bool}\rightarrow\text{\tt Frm}^{d},
[_|_,,_|_,_]P.Rv|d:FSets1snBoolFSetFrmd.\displaystyle[\_\;|\;\_,\ldots,\_\;|\;\_,\_]^{\vec{v}|d}_{\nabla P\mathbin{.}R}:\text{\tt FSet}\;s_{1}\ldots s_{n}\;\text{\tt Bool}\;\text{\tt FSet}\rightarrow\text{\tt Frm}^{d}.

As v\vec{v} can be empty, the above signature templates include []Rv|d,[]Rv|d,:Frmd[]^{\vec{v}|d}_{R},[]_{R}^{\vec{v}|d,\downarrow}:\rightarrow\text{\tt Frm}^{d}, etc. A constant \mathit{\surd} marks successful branches of computation, i.e., those which created either new facts or :DQy\surd:\text{\tt DQy}. Marking such branches is necessary as facts deleted by unsuccessful branches are restored as soon as the branch finishes. Frames of the form [a]Rv|d[\vec{a}]^{\vec{v}|d}_{R}, [a]Rv|d,[\vec{a}]^{\vec{v}|d,\downarrow}_{R}, [a|S]Rv|d[\vec{a}|S]^{\vec{v}|d}_{R}, [F|a|B]Rv|d[F^{\prime}|\vec{a}|B]^{\vec{v}|d}_{R}, or [F|a|B,F′′]Rv|d[F^{\prime}|\vec{a}|B,F^{\prime\prime}]^{\vec{v}|d}_{R} are called (R,σ)(R,\sigma)-frames, where σ:={a/v}\sigma:=\{\vec{a}/\vec{v}\} is the current substitution. They are related to evaluation of σ(R)\sigma(R). Marked frames [a]v|d,[\vec{a}]^{\vec{v}|d,\downarrow} occur in the execution of “unions” __\_\rhd\_. Conditional frames [a|S]Rv|d[\vec{a}|S]^{\vec{v}|d}_{R} are used in execution of conditionals ϕR\phi\Rightarrow R, where S:StkcS:\text{\tt Stk}^{c} represents evaluation of ϕ\phi. Iterator frames [F|a|B]P.Rv|d[F^{\prime}|\vec{a}|B]^{\vec{v}|d}_{\nabla P\mathbin{.}R} and [F|a|B,F′′]P.Rv|d[F^{\prime}|\vec{a}|B,F^{\prime\prime}]^{\vec{v}|d}_{\nabla P\mathbin{.}R} represent iterative execution of σ(P.R)\sigma(\nabla P\mathbin{.}R). Multiset FF^{\prime}, called iterator state, contains facts available for matching with P!P?P0P_{!}\circ P_{?}\circ P_{0} (cf. Remark 2.3). Iteration status BB is equal to t iff the iteration already generated either new facts or \surd (i.e., if the branch related to σ(P.R)\sigma(\nabla P\mathbin{.}R) was successful). “Tentative” iterator frames [F|a|B,F′′]P.Rv|q[F^{\prime}|\vec{a}|B,F^{\prime\prime}]^{\vec{v}|q}_{\nabla P\mathbin{.}R} store multiset F′′F^{\prime\prime} of facts deleted from the database in the present step, so that they can be restored if the step is unsuccessful. Given a database FF, in order to execute a closed DML query QQ we rewrite a state term IQ(F,F𝐧):={F,,F𝐧,[]Q|d}d\mathrm{I}_{Q}(F,F_{\mathbf{n}}):=\{F,\emptyset,F_{\mathbf{n}},[]^{|d}_{Q}\}^{d} until a normal form 𝔫(F,F𝐧)\mathfrak{n}(F^{\prime},F_{\mathbf{n}}^{{}^{\prime}}) or 𝔣(F,F𝐧)\mathfrak{f}(F^{\prime},F_{\mathbf{n}}^{{}^{\prime}}) is reached, indicating that a successful or, respectively, unsuccessful execution of QQ in the database FF yielded a new database FF^{\prime}, and a new multiset of fresh facts F𝐧F_{\mathbf{n}}^{{}^{\prime}}. Now we are ready to define rule schemas of Σ,𝒟𝐝𝐦𝐥(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}(Q).

Two consecutive occurrences of \mathit{\surd} are collapsed, and (,σ)(\mathit{\surd},\sigma)-frames are replaced with \mathit{\surd}:

λcol:{F,F,F𝐧,S}d{F,F,F𝐧,S}d,λ:{F,F,F𝐧,S[v]v|d}d{F,F,F𝐧,S}d.\lambda_{\text{col}}:\{F,F^{\prime},F_{\mathbf{n}},S\mathit{\surd}\mathit{\surd}\bigr{\}}^{d}\Rightarrow\!\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S\mathit{\surd}\}^{d},\;\;\lambda_{\surd}:\{F,F^{\prime},F_{\mathbf{n}},S[\vec{v}]^{\vec{v}|d}_{\surd}\}^{d}\Rightarrow\!\{F,F^{\prime},F_{\mathbf{n}},S\mathit{\surd}\}^{d}. (21)

An (,σ)(\emptyset,\sigma)-frame is removed from the stack. An (f,σ)(f,\sigma)-frame, where ff is a fact and σ\sigma is the current substitution, is replaced by \mathit{\surd} and σ(f)\sigma(f) is added to FF^{\prime} (since 𝑉𝑎𝑟(f){v}\mathit{Var}(f)\subseteq\{\vec{v}\}, σ(f)\sigma(f) is closed):

λ:{F,F,F𝐧,S[v]v|d}d{F,F,F𝐧,S}q,λfact:{F,F,F𝐧,S[v]fv|d}d{F,Ff,F𝐧,S}q.\lambda_{\emptyset}:\bigl{\{}F,F^{\prime}\!,F_{\mathbf{n}},S[\vec{v}]^{\vec{v}|d}_{\emptyset}\bigr{\}}^{d}\Rightarrow\bigl{\{}F,F^{\prime}\!,F_{\mathbf{n}},S\bigr{\}}^{q},\ \lambda_{\text{fact}}:\bigl{\{}F,F^{\prime}\!,F_{\mathbf{n}},S[\vec{v}]^{\vec{v}|d}_{f}\bigr{\}}^{d}\Rightarrow\bigl{\{}F,F^{\prime}\circ f,F_{\mathbf{n}},S\mathit{\surd}\bigr{\}}^{q}. (22)

An (R1R2,σ)(R_{1}\rhd R_{2},\sigma)-frame is split into the (R1,σ)(R_{1},\sigma)-frame and the (R2,σ)(R_{2},\sigma)-frame. The (R2,σ)(R_{2},\sigma)-frame is marked with \downarrow so that the evaluation of σ(R1R2)\sigma(R_{1}\rhd R_{2}) can be marked as successful when at least one of the branches is successful. When both branches are successful, this can produce two consecutive \mathit{\surd} constants on the stack which are then collapsed using λcol\lambda_{\text{col}} rule in Equation (21).

λunf:\displaystyle\lambda_{\rhd}^{\text{unf}}: {F,F,F𝐧,S[v]R1R2v|d}d{F,F,F𝐧,S[v]R2v|d,[v]R1v|d}d,\displaystyle\ \bigl{\{}F,F^{\prime},F_{\mathbf{n}},S[\vec{v}]^{\vec{v}|d}_{R_{1}\rhd R_{2}}\bigr{\}}^{d}\Rightarrow\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S[\vec{v}]^{\vec{v}|d,\downarrow}_{R_{2}}[\vec{v}]^{\vec{v}|d}_{R_{1}}\bigr{\}}^{d},
λ;fld:\displaystyle\lambda_{\rhd;\mathit{\surd}}^{\text{fld}}: {F,F,F𝐧,S[v]R2v|d,}d{F,F,F𝐧,S[v]R2v|d}d,\displaystyle\ \bigl{\{}F,F^{\prime},F_{\mathbf{n}},S[\vec{v}]^{\vec{v}|d,\downarrow}_{R_{2}}\mathit{\surd}\bigr{\}}^{d}\Rightarrow\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S\mathit{\surd}[\vec{v}]^{\vec{v}|d}_{R_{2}}\bigr{\}}^{d},
λ;fld:\displaystyle\lambda_{\rhd;\emptyset}^{\text{fld}}: {F,F,F𝐧,S[v]R2v|d,}d{F,F,F𝐧,S[v]R2v|d}d,\displaystyle\ \bigl{\{}F,F^{\prime},F_{\mathbf{n}},S[\vec{v}]^{\vec{v}|d,\downarrow}_{R_{2}}\bigr{\}}^{d}\Rightarrow\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S[\vec{v}]^{\vec{v}|d}_{R_{2}}\bigr{\}}^{d}, (23)

Rule schemas for execution of conditionals ϕQ\phi\Rightarrow Q are similar to those in Equations (12), (13):

λcondunf:\displaystyle\lambda_{\text{cond}}^{\text{unf}}: {F,F,F𝐧,S[v]ϕRv|d}d{F,F,F𝐧,S[v|[v]ϕv|c]Rv|d}d,\displaystyle\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S[\vec{v}]^{\vec{v}|d}_{\phi\Rightarrow R}\bigr{\}}^{d}\Rightarrow\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S\bigl{[}\vec{v}\;|\;[\vec{v}]^{\vec{v}|c}_{\phi}\bigr{]}^{\vec{v}|d}_{R}\bigr{\}}^{d},
λcond;funf:\displaystyle\lambda_{\text{cond};\text{\bf f}}^{\text{unf}}: {F,F,F𝐧,S[v|𝔯(f)]Rv|d}d{F,F,F𝐧,S}d,\displaystyle\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S[\vec{v}\;|\;\mathfrak{r}(\text{\bf f})]^{\vec{v}|d}_{R}\bigr{\}}^{d}\Rightarrow\{F,F^{\prime},F_{\mathbf{n}},S\}^{d},
λcond;tunf:\displaystyle\lambda_{\text{cond};\text{\bf t}}^{\text{unf}}: {F,F,F𝐧,S[v|𝔯(t)]Rv|d}d{F,F,F𝐧,S[v]Rv|d}d,\displaystyle\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S[\vec{v}\;|\;\mathfrak{r}(\text{\bf t})]^{\vec{v}|d}_{R}\bigr{\}}^{d}\Rightarrow\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S[\vec{v}]_{R}^{\vec{v}|d}\bigr{\}}^{d},
λd:\displaystyle\lambda^{d}: {F,F,F𝐧,S[v|S]Rv|d}d{F,F,F𝐧,S[v|S′′]Rv|d}difC\displaystyle\{F,F^{\prime},F_{\mathbf{n}},S[\vec{v}\;|\;S^{\prime}]^{\vec{v}|d}_{R}\}^{d}\Rightarrow\{F,F^{\prime},F_{\mathbf{n}},S[\vec{v}\;|\;S^{\prime\prime}]^{\vec{v}|d}_{R}\}^{d}\ \text{if}\ C
for allλ:{F,S}c{F,S′′}cifCinΣ,𝒟𝐜𝐧𝐝(ϕ).\displaystyle\quad\quad\text{for all}\ \lambda:\{F,S^{\prime}\}^{c}\Rightarrow\{F,S^{\prime\prime}\}^{c}\ \text{if}\ C\ \text{in}\ \mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}(\phi). (24)

Evaluation of a _._\nabla\_.\_ subquery is initialized with the whole database available for matching. Iteration status is f since evaluation is not successful yet.

λinit:{F,F,F𝐧,S[v]P.Rv|d}d{F,F,F𝐧,S[F|v|f]P.Rv|d}d.{\lambda_{\nabla}^{\text{init}}}:\quad\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S[\vec{v}]^{\vec{v}|d}_{\nabla P\mathbin{.}R}\bigr{\}}^{d}\Rightarrow\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S[F\;|\;\vec{v}\;|\;\text{\bf f}]^{\vec{v}|d}_{\nabla P\mathbin{.}R}\bigr{\}}^{d}. (25)

Let w\vec{w} be a sequence of all the distinct variables in 𝑉𝑎𝑟(P){v}\mathit{Var}(P)\setminus\{\vec{v}\}. Let σ\sigma be the current substitution. Rule λunf{\lambda_{\nabla}^{\text{unf}}} pushes onto the stack a (σ,R)(\sigma^{\prime},R)-frame, where σ=σ{b/w}\sigma^{\prime}=\sigma\cup\{\vec{b}/\vec{w}\} is defined by matching F′′σ(P!P?P0)F^{\prime\prime}\circ\sigma(P_{!}\circ P_{?}\circ P_{0}) with iterator state and P𝐧P_{\mathbf{n}} with the multiset of fresh facts F𝐧F_{\mathbf{n}}. It also removes σ(P?P0)\sigma^{\prime}(P_{?}\circ P_{0}) from the iterator state and σ(P0)\sigma^{\prime}(P_{0}) from the database state, and, finally, it updates fresh facts using υ\upsilon defined in Equation (1). Removed facts σ(P0)\sigma^{\prime}(P_{0}) are stored in the tentative iterator frame:

λunf:\displaystyle{\lambda_{\nabla}^{\text{unf}}}: {FP!P?P0,F,F𝐧P𝐧,S[F′′P!P?P0|v|B]P.Rv|d}d\displaystyle\quad\bigl{\{}F\circ P_{!}\circ P_{?}\circ P_{0},F^{\prime},F_{\mathbf{n}}\circ P_{\mathbf{n}},S\bigl{[}F^{\prime\prime}\circ P_{!}\circ P_{?}\circ P_{0}\;|\;\vec{v}\;|\;B\bigr{]}^{\vec{v}|d}_{\nabla P\mathbin{.}R}\bigr{\}}^{d}
{FP!P?,F,F𝐧υ(P𝐧),S[F′′P!|v|B,P0]P.Rv|d[v,w]Rv,w|d}d,\displaystyle\quad\Rightarrow\bigl{\{}F\circ P_{!}\circ P_{?},F^{\prime},F_{\mathbf{n}}\circ\upsilon(P_{\mathbf{n}}),S\bigl{[}F^{\prime\prime}\circ P_{!}\;|\;\vec{v}\;|\;B,P_{0}\bigr{]}^{\vec{v}|d}_{\nabla P\mathbin{.}R}[\vec{v},\vec{w}]^{\vec{v},\vec{w}|d}_{R}\bigr{\}}^{d}, (26)

If execution of σ(R)\sigma^{\prime}(R) proves unsuccessful, removed facts can be restored both to iterator state and database. Otherwise, we discard them and set the iteration status to t:

λ;fld:\displaystyle\lambda^{\text{fld}}_{\nabla;\emptyset}: {F,F,F𝐧,S[F′′|v|B,F0]P.Rv|d}d{FF0,F,F𝐧,S[F′′F0|v|B]P.Rv|d}d,\displaystyle\ \bigl{\{}F,F^{\prime},F_{\mathbf{n}},S\bigl{[}F^{\prime\prime}\;|\;\vec{v}\;|\;B,F_{0}\bigr{]}^{\vec{v}|d}_{\nabla P\mathbin{.}R}\bigr{\}}^{d}\Rightarrow\bigl{\{}F\circ F_{0},F^{\prime},F_{\mathbf{n}},S\bigl{[}F^{\prime\prime}\circ F_{0}\;|\;\vec{v}\;|\;B\bigr{]}^{\vec{v}|d}_{\nabla P\mathbin{.}R}\bigr{\}}^{d},
λ;fld:\displaystyle\lambda^{\text{fld}}_{\nabla;\mathit{\surd}}: {F,F,F𝐧,S[F′′|v|B,F0]P.Rv|d}d{F,F,F𝐧,S[F′′|v|t]P.Rv|d}d.\displaystyle\ \bigl{\{}F,F^{\prime},F_{\mathbf{n}},S\bigl{[}F^{\prime\prime}\;|\;\vec{v}\;|\;B,F_{0}\bigr{]}^{\vec{v}|d}_{\nabla P\mathbin{.}R}\mathit{\surd}\bigr{\}}^{d}\Rightarrow\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S\bigl{[}F^{\prime\prime}\;|\;\vec{v}\;|\;\text{\bf t}\bigr{]}^{\vec{v}|d}_{\nabla P\mathbin{.}R}\bigr{\}}^{d}. (27)

We keep applying λunf{\lambda_{\nabla}^{\text{unf}}} until we cannot match σ(P!P?P0)\sigma(P_{!}\circ P_{?}\circ P_{0}) with iterator state. Then we replace the iterator frame with δ(B)\delta(B) where BB is the iteration status, δ(t)=\delta(\text{\bf t})=\mathit{\surd}, and δ(f)=\delta(\text{\bf f})=\emptyset. To prevent premature application, rule schema λend{\lambda_{\nabla}^{\text{end}}} is conditional, where the condition uses functions μP,v:Pats1snYes?\mu_{P,\vec{v}}:\text{\tt Pat}\;s_{1}\ldots s_{n}\rightarrow\text{\tt Yes?} defined for each v𝑉𝑎𝑟(Q)\vec{v}\subseteq\mathit{Var}(Q) and pattern PP occurring in QQ with the single equation μP,v(FP!P?P0,v)=yes\mu_{P,\vec{v}}(F\circ P_{!}\circ P_{?}\circ P_{0},\vec{v})=\text{\it yes} (cf. Equation (8)):

λend:{F,F,F𝐧,S[F′′|v|B]P.Rv|d}d{F,F,F𝐧,Sδ(B)}dif(μP,v(F′′,v)=yes)=f.{\lambda_{\nabla}^{\text{end}}}:\bigl{\{}F,F^{\prime},F_{\mathbf{n}},S\bigl{[}F^{\prime\prime}\;|\;\vec{v}\;|\;B\bigr{]}^{\vec{v}|d}_{\nabla P\mathbin{.}R}\bigr{\}}^{d}\Rightarrow\{F,F^{\prime},F_{\mathbf{n}},S\delta(B)\}^{d}\ \text{if}\ (\mu_{P,\vec{v}}(F^{\prime\prime},\vec{v})=\text{\it yes})=\text{\bf f}. (28)

The last two rules reduce the Stated\text{\tt State}^{d}-terms with an empty stack or stack containing only the \mathit{\surd} constant into a term constructed with 𝔫\mathfrak{n} or 𝔣\mathfrak{f}, respectively:

λdml:{F,F,F𝐧,}𝔫(FF,F𝐧),λdml:{F,F,F𝐧,}𝔣(FF,F𝐧).\lambda_{\text{dml}}^{\mathit{\surd}}:\{F,F^{\prime},F_{\mathbf{n}},\mathit{\surd}\}\Rightarrow\mathfrak{n}(F\circ F^{\prime},F_{\mathbf{n}}),\quad\lambda_{\text{dml}}^{\emptyset}:\{F,F^{\prime},F_{\mathbf{n}},\emptyset\}\Rightarrow\mathfrak{f}(F\circ F^{\prime},F_{\mathbf{n}}). (29)
Theorem 6.3

Σ,𝒟𝐝𝐦𝐥(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}(Q) is a terminating rewriting system.

Proof 6.4

The proof is similar to the proof of Theorem 4.2. Only subqueries of the form P.R\nabla P\mathbin{.}R, where PP is semiterminating but not terminating (i.e., P0P_{0}\neq\emptyset, but P?=P_{?}=\emptyset) require special care. In this case, the signature of _._\nabla\_.\_ forces RR to be success assured, i.e., RR’s evaluation always succeeds, and hence removed facts matching P0P_{0} are never restored. This ensures termination of P.R\nabla P\mathbin{.}R’s execution.

The following useful observation can be trivially verified by examining the rule schemas:

Lemma 6.5

Let RR be a DML subquery of QQ. Then, for all multisets of facts FF, FF^{\prime}, GG, GG^{\prime}, multisets F𝐧F_{\mathbf{n}}, G𝐧G_{\mathbf{n}} of fresh facts, stacks SS, lists of variables v=v1,,vn\vec{v}=v_{1},\ldots,v_{n} and of values a=a1,,an\vec{a}=a_{1},\ldots,a_{n},

  1. 1.

    {F,F,F𝐧,S[a]Rv|d}d1{G,FG,G𝐧,S}d\{F,F^{\prime},F_{\mathbf{n}},S[\vec{a}]^{\vec{v}|d}_{R}\}^{d}\rightarrow^{*}_{\mathcal{R}_{1}}\{G,F^{\prime}\circ G^{\prime},G_{\mathbf{n}},S\mathit{\surd}\}^{d} iff {F,,F𝐧,[]σ(R)|d}d2𝔫(GG,G𝐧)\{F,\emptyset,F_{\mathbf{n}},[]^{|d}_{\sigma(R)}\}^{d}\rightarrow^{*}_{\mathcal{R}_{2}}\mathfrak{n}(G\circ G^{\prime},G_{\mathbf{n}}),

  2. 2.

    {F,F,F𝐧,S[a]Rv|d}d1{F,F,G𝐧,S}d\{F,F^{\prime},F_{\mathbf{n}},S[\vec{a}]^{\vec{v}|d}_{R}\}^{d}\rightarrow^{*}_{\mathcal{R}_{1}}\{F,F^{\prime},G_{\mathbf{n}},S\}^{d} iff {F,,F𝐧,[]σ(R)|d}d2𝔣(F,G𝐧)\{F,\emptyset,F_{\mathbf{n}},[]^{|d}_{\sigma(R)}\}^{d}\rightarrow^{*}_{\mathcal{R}_{2}}\mathfrak{f}(F,G_{\mathbf{n}}).

where σ:={a/v}\sigma:=\{\vec{a}/\vec{v}\}, 1:=Σ,𝒟𝐝𝐦𝐥(Q)\mathcal{R}_{1}:=\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}(Q), and 2:=Σ,𝒟𝐝𝐦𝐥(σ(R))\mathcal{R}_{2}:=\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}(\sigma(R)).

As in the case of pure queries, we consider two DML queries equivalent if and only if they can match their results. We should not, however, distinguish results differing only by the choice of fresh values. Let 𝒮𝐧\mathcal{S}_{\mathbf{n}} denote the set of nominal sorts in ΣS\Sigma_{S}. Let 𝑛𝑜𝑚(t)\mathit{nom}(t) be the 𝒮𝐧\mathcal{S}_{\mathbf{n}}-sorted set of nominal values contained in term tt. For any 𝒮𝐧\mathcal{S}_{\mathbf{n}}-sorted bijection α:XY\alpha:X\rightarrow Y between sets of nominal values we denote by α^\hat{\alpha} the natural extension of α\alpha to terms tt such that 𝑛𝑜𝑚(t)X\mathit{nom}(t)\subseteq X. More precisely, α^(x)=α(x)\hat{\alpha}(x)=\alpha(x) if xXx\in X, α(c)=c\alpha(c)=c if cc is a constant of non-nominal sort or a variable, and α^(f(t1,,tn))=f(α^(t1),,α^(tn))\hat{\alpha}(f(t_{1},\ldots,t_{n}))=f(\hat{\alpha}(t_{1}),\ldots,\hat{\alpha}(t_{n})) if f(t1,,tn)f(t_{1},\ldots,t_{n}) is of non-nominal sort. With those notions we define equivalence on DML queries as follows:

Definition 6.6

Let Q1Q_{1} and Q2Q_{2} be two DML queries in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}. We say that Q1Q_{1} is logically equivalent to Q2Q_{2}, writing Q1Q2Q_{1}\equiv Q_{2}, if and only if, for all ground substitutions σ\sigma such that σ(Q1)\sigma(Q_{1}) and σ(Q2)\sigma(Q_{2}) are closed, all ground multisets of facts FF, GG, all ground multisets of fresh facts F𝐧F_{\mathbf{n}}, G𝐧G_{\mathbf{n}}, and all i{1,2}i\in\{1,2\}

  1. 1.

    if Iσ(Qi)(F,F𝐧)!𝔫(G,G𝐧)\mathrm{I}_{\sigma(Q_{i})}(F,F_{\mathbf{n}})\rightarrow^{!}\mathfrak{n}(G,G_{\mathbf{n}}) then there exist multisets of fresh facts F𝐧F^{{}^{\prime}}_{\mathbf{n}}, G𝐧G^{{}^{\prime}}_{\mathbf{n}}, multisets of facts FF^{\prime}, GG^{\prime}, and an 𝒮𝐧\mathcal{S}_{\mathbf{n}}-sorted bijection α:𝑛𝑜𝑚(F)𝑛𝑜𝑚(G)𝑛𝑜𝑚(F)𝑛𝑜𝑚(G)\alpha:\mathit{nom}(F)\cup\mathit{nom}(G)\rightarrow\mathit{nom}(F^{\prime})\cup\mathit{nom}(G^{\prime}) such that F=α^(F)F^{\prime}=\hat{\alpha}(F), G=α^(G)G^{\prime}=\hat{\alpha}(G), and Iσ(Q3i)(F,F𝐧)!𝔫(G,G𝐧)\mathrm{I}_{\sigma(Q_{3-i})}(F^{\prime},F^{{}^{\prime}}_{\mathbf{n}})\rightarrow^{!}\mathfrak{n}(G^{\prime},G^{{}^{\prime}}_{\mathbf{n}}).

  2. 2.

    if Iσ(Qi)(F,F𝐧)!𝔣(F,G𝐧)\mathrm{I}_{\sigma(Q_{i})}(F,F_{\mathbf{n}})\rightarrow^{!}\mathfrak{f}(F,G_{\mathbf{n}}) then there exist multisets of fresh facts F𝐧F^{{}^{\prime}}_{\mathbf{n}}, G𝐧G^{{}^{\prime}}_{\mathbf{n}}, a multiset of facts FF^{\prime}, and an 𝒮𝐧\mathcal{S}_{\mathbf{n}}-sorted bijection α:𝑛𝑜𝑚(F)𝑛𝑜𝑚(F)\alpha:\mathit{nom}(F)\rightarrow\mathit{nom}(F^{\prime}) such that F=α^(F)F^{\prime}=\hat{\alpha}(F) and Iσ(Q3i)(F,F𝐧)!𝔣(F,G𝐧)\mathrm{I}_{\sigma(Q_{3-i})}(F^{\prime},F^{{}^{\prime}}_{\mathbf{n}})\rightarrow^{!}\mathfrak{f}(F^{\prime},G^{{}^{\prime}}_{\mathbf{n}}).

The following result is an immediate consequence of Lemma 6.5:

Lemma 6.7

Logical equivalence on queries in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}} is an equivalence relation and a congruence, i.e., if κ\kappa is a position in a DML query QQ such that Q|κQ|_{\kappa} is a DML query, and RQ|κR\equiv Q|_{\kappa}, then QQ[R]κQ\equiv Q[R]_{\kappa}.

Lemma 6.8

For any closed DML query QQ in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}, and any renaming σ\sigma, Qσ(Q)Q\equiv\sigma(Q).

Lemma 6.9, proven similarly to Lemma 4.9, clarifies elements of rewriting semantics of DML queries in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}} (we leave _._\nabla\_.\_ evaluation where we are better off with the rewriting definition).

Lemma 6.9

For all ground multisets of facts FF, GG and of fresh facts F𝐧F_{\mathbf{n}}, G𝐧G_{\mathbf{n}}, as well as closed DML queries QQ, Q1Q_{1}, Q2Q_{2} and all sentences ϕ\phi, the following holds:

  1. 1.

    If IQ(F,F𝐧)!Γ\mathrm{I}_{Q}(F,F_{\mathbf{n}})\rightarrow^{!}\Gamma then Γ=𝔫(G,G𝐧)\Gamma=\mathfrak{n}(G,G_{\mathbf{n}}) or Γ=𝔣(F,G𝐧)\Gamma=\mathfrak{f}(F,G_{\mathbf{n}}) for some ground multiset of facts GG and ground multiset of fresh facts G𝐧G_{\mathbf{n}}. If IQ(F,F𝐧)!𝔣(G,G𝐧)\mathrm{I}_{Q}(F,F_{\mathbf{n}})\rightarrow^{!}\mathfrak{f}(G,G_{\mathbf{n}}) then F=GF=G.

  2. 2.

    If If(F,F𝐧)!Γ\mathrm{I}_{f}(F,F_{\mathbf{n}})\rightarrow^{!}\Gamma then Γ=𝔫(Ff,F𝐧)\Gamma=\mathfrak{n}(F\circ f,F_{\mathbf{n}}). If I(F,F𝐧)!Γ\mathrm{I}_{\emptyset}(F,F_{\mathbf{n}})\rightarrow^{!}\Gamma then Γ=𝔣(F,F𝐧)\Gamma=\mathfrak{f}(F,F_{\mathbf{n}}).

  3. 3.

    IϕQ(F,F𝐧)!𝔫(G,G𝐧)\mathrm{I}_{\phi\Rightarrow Q}(F,F_{\mathbf{n}})\rightarrow^{!}\mathfrak{n}(G,G_{\mathbf{n}}) iff Iϕ(F)!𝔯(t)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{r}(\text{\bf t}) and IQ(F,F𝐧)!𝔫(G,G𝐧)\mathrm{I}_{Q}(F,F_{\mathbf{n}})\rightarrow^{!}\mathfrak{n}(G,G_{\mathbf{n}}).

  4. 4.

    IϕQ(F,F𝐧)!𝔣(F,G𝐧)\mathrm{I}_{\phi\Rightarrow Q}(F,F_{\mathbf{n}})\rightarrow^{!}\mathfrak{f}(F,G_{\mathbf{n}}) iff Iϕ(F)!𝔯(f)\mathrm{I}_{\phi}(F)\rightarrow^{!}\mathfrak{r}(\text{\bf f}) or IQ(F,F𝐧)!𝔣(F,G𝐧)\mathrm{I}_{Q}(F,F_{\mathbf{n}})\rightarrow^{!}\mathfrak{f}(F,G_{\mathbf{n}}).

  5. 5.

    IQ1Q2(F,F𝐧)!𝔣(F,G𝐧)\mathrm{I}_{Q_{1}\rhd Q_{2}}(F,F_{\mathbf{n}})\rightarrow^{!}\mathfrak{f}(F,G_{\mathbf{n}}) iff there exists a multiset of fresh facts G𝐧G^{{}^{\prime}}_{\mathbf{n}} such that IQ1(F,F𝐧)!𝔣(F,G𝐧)\mathrm{I}_{Q_{1}}(F,F_{\mathbf{n}})\rightarrow^{!}\mathfrak{f}(F,G^{{}^{\prime}}_{\mathbf{n}}) and IQ2(F,G𝐧)!𝔣(F,G𝐧)\mathrm{I}_{Q_{2}}(F,G^{{}^{\prime}}_{\mathbf{n}})\rightarrow^{!}\mathfrak{f}(F,G_{\mathbf{n}}).

  6. 6.

    IQ1Q2(F,F𝐧)!𝔫(G,G𝐧)\mathrm{I}_{Q_{1}\rhd Q_{2}}(F,F_{\mathbf{n}})\rightarrow^{!}\mathfrak{n}(G,G_{\mathbf{n}}) iff, for some ground multisets of facts FF^{\prime}, F′′F^{\prime\prime}, GG^{\prime}, G′′G^{\prime\prime} such that G=G′′FF′′G=G^{\prime\prime}\circ F^{\prime}\circ F^{\prime\prime}, a ground multiset of fresh facts G𝐧G_{\mathbf{n}}^{{}^{\prime}}, and stacks S,S{,}S,S^{\prime}\in\{\emptyset,\mathit{\surd}\} where S=S=\mathit{\surd} or S=S^{\prime}=\mathit{\surd}, we have IQ1(F,F𝐧){G,F,G𝐧,S}d\mathrm{I}_{Q_{1}}(F,F_{\mathbf{n}})\rightarrow^{*}\{G^{\prime},F^{\prime},G_{\mathbf{n}}^{{}^{\prime}},S\}^{d} and IQ2(G,G𝐧){G′′,F′′,G𝐧,S}d\mathrm{I}_{Q_{2}}(G^{\prime},G^{{}^{\prime}}_{\mathbf{n}})\rightarrow^{*}\{G^{\prime\prime},F^{\prime\prime},G_{\mathbf{n}},S^{\prime}\}^{d}.

Lemma 6.10

The following logical equivalences hold between queries in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}:

QQ,QQ,Q1(Q2Q3)(Q1Q2)Q3,\displaystyle\emptyset\rhd Q\equiv Q,\quad Q\rhd\emptyset\equiv Q,\quad Q_{1}\rhd(Q_{2}\rhd Q_{3})\equiv(Q_{1}\rhd Q_{2})\rhd Q_{3},
R,¬RR,P.\displaystyle\bot\Rightarrow R\equiv\emptyset,\quad\neg\bot\Rightarrow R\equiv R,\quad\exists P\mathbin{.}\emptyset\equiv\emptyset

Lemma 6.10 is very similar to Lemma 5.8, except that __\_\rhd\_ is not commutative. Q1Q2Q_{1}\rhd Q_{2} may be non-equivalent with Q2Q1Q_{2}\rhd Q_{1} if, say, Q1Q_{1} deletes a fact which is referred to in some pattern in Q2Q_{2}.

We need to generalize the notion of confluence, lest rewriting paths leading to terms differing only by distinct choices of fresh values (as in the next example) are to be considered non-convergent.

Example 6.11

Let I be a nominal sort. Let r:NatFactr:\text{\tt Nat}\rightarrow\text{\tt Fact}, s:INatFacts:\text{\tt I}\;\text{\tt Nat}\rightarrow\text{\tt Fact}. Consider DML query Q:=[CI(x)]𝐧[r(y)]?.s(x,y)Q:=\nabla[C_{\text{\tt I}}(x)]_{\mathbf{n}}\circ[r(y)]_{?}\mathbin{.}s(x,y), and let F:=r(1)r(2)F:=r(1)\circ r(2). Then

IQ(F,CI(ı0I)){F,,CI(ı0I),[F||f]Q|d}dλunf{F,,CI(ı1I),[r(2)||f,]Q|d[ı0I,1]s(x,y)x,y|d}d{F,s(ı0I,1),CI(ı1I),[r(2)||t]Q|d}dλunf{F,S(ı0I,1),CI(ı2I),[||t]Q|d[ı1I,2]s(x,y)x,y|d}d𝔫(Fs(ı0I,1)s(ı1I,2),CI(ı2I)),\mathrm{I}_{Q}\bigl{(}F,C_{\text{\tt I}}(\imath^{\text{\tt I}}_{0})\bigr{)}\rightarrow^{*}\bigl{\{}F,\emptyset,C_{\text{\tt I}}(\imath^{\text{\tt I}}_{0}),[F||\text{\bf f}]^{|d}_{Q}\bigr{\}}^{d}\xrightarrow{{\lambda_{\nabla}^{\text{unf}}}}\bigl{\{}F,\emptyset,C_{\text{\tt I}}(\imath^{\text{\tt I}}_{1}),[r(2)||\text{\bf f},\emptyset]^{|d}_{Q}[\imath^{\text{\tt I}}_{0},1]^{x,y|d}_{s(x,y)}\bigr{\}}^{d}\\ \rightarrow^{*}\bigl{\{}F,s(\imath^{\text{\tt I}}_{0},1),C_{\text{\tt I}}(\imath^{\text{\tt I}}_{1}),[r(2)||\text{\bf t}]^{|d}_{Q}\bigr{\}}^{d}\xrightarrow{{\lambda_{\nabla}^{\text{unf}}}}\bigl{\{}F,S(\imath^{\text{\tt I}}_{0},1),C_{\text{\tt I}}(\imath^{\text{\tt I}}_{2}),[\emptyset||\text{\bf t}]^{|d}_{Q}[\imath^{\text{\tt I}}_{1},2]^{x,y|d}_{s(x,y)}\bigr{\}}^{d}\\ \rightarrow^{*}\mathfrak{n}\bigl{(}F\circ s(\imath^{\text{\tt I}}_{0},1)\circ s(\imath^{\text{\tt I}}_{1},2),C_{\text{\tt I}}(\imath^{\text{\tt I}}_{2})\bigr{)},

and, if we match r(1)r(1) and r(2)r(2) in reverse order in applications of λunf{\lambda_{\nabla}^{\text{unf}}} rule, then

IQ(F,CI(ı0I))𝔫(Fs(ı0I,2)s(ı1I,1),CI(ı2I)).\mathrm{I}_{Q}\bigl{(}F,C_{\text{\tt I}}(\imath^{\text{\tt I}}_{0})\bigr{)}\rightarrow^{*}\mathfrak{n}\bigl{(}F\circ s(\imath^{\text{\tt I}}_{0},2)\circ s(\imath^{\text{\tt I}}_{1},1),C_{\text{\tt I}}(\imath^{\text{\tt I}}_{2})\bigr{)}.

Here we define an equivalence relation on terms of sort Stated\text{\tt State}^{d} which is a bisimulation:

Definition 6.12

We say that term t1t_{1} is nominally equivalent to term t2t_{2}, in which case we write t1𝐧t2t_{1}\equiv_{\mathbf{n}}t_{2}, if and only if there exists a bijection α:𝑛𝑜𝑚(t1)𝑛𝑜𝑚(t2)\alpha:\mathit{nom}(t_{1})\rightarrow\mathit{nom}(t_{2}) such that α^(t1)=t2\hat{\alpha}(t_{1})=t_{2}.

Lemma 6.13

Nominal equivalence is an equivalence relation. When restricted to Stated\text{\tt State}^{d} terms satisfying the freshness condition (Definition 6.1), it is also a bisimulation on Σ,𝒟𝐝𝐦𝐥(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}(Q), for all QQ.

We leave an easy proof of Lemma 6.13 to the reader. The restriction to terms satisfying the freshness condition is necessary for nominal equivalence being a bisimulation, as demonstrated below:

Example 6.14

Let I be a nominal sort. Let r:IFactr:\text{\tt I}\rightarrow\text{\tt Fact}, s:IIFacts:\text{\tt I}\;\text{\tt I}\rightarrow\text{\tt Fact}. Define F:=r(ı1I)r(ı2I)F:=r(\imath^{\text{\tt I}}_{1})\circ r(\imath^{\text{\tt I}}_{2}). Consider a DML query Q:=[CI(x)]𝐧[r(y)]?.({x=y}s(x,y)).Q:=\nabla[C_{\text{\tt I}}(x)]_{\mathbf{n}}\circ[r(y)]_{?}\mathbin{.}(\{x=y\}\Rightarrow s(x,y)). Let t1:=IQ(F,CI(ı0I))t_{1}:=\mathrm{I}_{Q}(F,C_{\text{\tt I}}(\imath^{\text{\tt I}}_{0})) and t2:=IQ(F,CI(ı3I))t_{2}:=\mathrm{I}_{Q}(F,C_{\text{\tt I}}(\imath^{\text{\tt I}}_{3})). Term t1t_{1} does not satisfy the freshness condition (Definition 6.1). It is immediate that t1𝐧t2t_{1}\equiv_{\mathbf{n}}t_{2}, t1!t3t_{1}\rightarrow^{!}t_{3}, where t3:=𝔫(s(ı1I,1)F,CI(ı2I))t_{3}:=\mathfrak{n}(s(\imath^{\text{\tt I}}_{1},1)\circ F,C_{\text{\tt I}}(\imath^{\text{\tt I}}_{2})), but the only normal form of t2t_{2} is t4:=𝔣(F,CI(ı5I))t_{4}:=\mathfrak{f}(F,C_{\text{\tt I}}(\imath^{\text{\tt I}}_{5})) and t3𝐧t4t_{3}\not\equiv_{\mathbf{n}}t_{4}.

We now define a class of queries QQ for which Σ,𝒟𝐝𝐦𝐥(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}(Q) is confluent modulo nominal equivalence.

Definition 6.15

Let QQ be a DML query. We say that QQ has no deletion conflicts if and only if for each DML subquery P.R\nabla P\mathbin{.}R of QQ, and any subterm f:Factf:\text{\tt Fact} of RR (resp. PP) occurring inside [_]0[\_]_{0}, PP (resp. RR) has no subterm ff^{\prime} occurring inside [_]0[\_]_{0}, [_]![\_]_{!} or [_]?[\_]_{?} such that ff and ff^{\prime} are unifiable.

Definition 6.16

Let QQ be a DML query in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}. QQ is called deterministic if it has no deletion conflicts and all quantification patterns in QQ (including those inside subterms which are conditions) contain only single facts with unique matching property (but may contain any number of fresh facts).

In Example 4.18 we shown why multiple facts in patterns lead to non-confluence, and as a result, to non-deterministic evaluation of conditions (and queries). However, we have not previously considered deletion conflicts (as they are specific to DML queries). The following example shows why deletion conflicts can prevent confluence:

Example 6.17

Consider the following DML query:

Q:=[f(x)]0.(([f(1)]?.h(x))).Q:=\nabla[f(x)]_{0}\mathbin{.}\bigl{(}\mathit{\surd}\rhd(\nabla[f(1)]_{?}\mathbin{.}h(x))\bigr{)}. (30)

Observe that all quantification patterns in QQ consist of single facts, but QQ does have deletion conflicts (facts in both patterns are unifiable, and one of them is [_]0[\_]_{0}-pattern which has the second one in its scope). Since the first pattern ([f(x)]0[f(x)]_{0}) is semi-terminating but not terminating, to ensure that the DML query in its scope is success assured it is of the form _\mathit{\surd}\rhd\_.

Now, let F:=f(1)f(2)F:=f(1)\circ f(2). It is easy to see that executing QQ against FF removes from FF both ff-facts and either adds a single fact h(2)h(2) or nothing depending on whether pattern [f(x)]0[f(x)]_{0} first matches f(2)f(2) (which makes it possible for the subquery [f(1)]?.h(x)\nabla[f(1)]_{?}\mathbin{.}h(x) to succeed then and return h(2)h(2)) or f(1)f(1) (which causes all executions of subquery [f(1)]?.h(x)\nabla[f(1)]_{?}\mathbin{.}h(x) to fail).

The following theorem states that while evaluation of a deterministic DML query is not itself deterministic, but its results are.

Theorem 6.18

Let QQ be a deterministic query in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}. Then Σ,𝒟𝐝𝐦𝐥(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}(Q) is confluent up to a nominal equivalence. In particular, given ground multisets of facts FF, and of fresh facts F𝐧F_{\mathbf{n}}, there is a unique (up to nominal equivalence) term tt of the form 𝔫(G,G𝐧)\mathfrak{n}(G,G_{\mathbf{n}}) or 𝔣(F,G𝐧)\mathfrak{f}(F,G_{\mathbf{n}}) such that IQ(F,F𝐧)!t\mathrm{I}_{Q}(F,F_{\mathbf{n}})\rightarrow^{!}t.

Proof 6.19

The only significant difference between the proof of this theorem and Theorem 5.13 is the presence of deletions and fresh facts. The non-confluence introduced by fresh facts can be absorbed with nominal equivalence. Since QQ has no deletion conflicts, when a DML subquery P.R\nabla P\mathbin{.}R is executed, neither deletion of facts through PP influences execution of RR nor execution of RR decreases the pool of facts available for matching with PP. Moreover, if PP contains [f]0[f]_{0} for some fact ff, then ff is the only fact in PP, hence RR cannot fail, facts deleted through PP are never returned, and [f]0[f]_{0} behaves like [f]?[f]_{?}.

Let us finish this section with the following remark about expressibility of 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}:

Remark 6.20

A typical formalization of database updates is to use pairs of queries which define facts to be, respectively, deleted from, and added to the current database. This approach can be emulated in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}, using multiple DML queries executed in a sequence. First, let us extend the signature Σ\Sigma with function symbols fdf^{d} and faf^{a} for each fact constructor ff. Let QdQ_{d} and QaQ_{a} be queries in 𝒬Σ,𝒟𝐪𝐫𝐲\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{qry}} which return sets of facts to be deleted and added, respectively, to the database. We assume that QdQ_{d} and QaQ_{a} contain no subterms of the form fd(t)f^{d}(\vec{t}) or fa(t)f^{a}(\vec{t}). Let Q^d\hat{Q}_{d} and Q^a\hat{Q}_{a} be the same as QdQ_{d} and QaQ_{a}, respectively, except that all subqueries f(t)f(\vec{t}) of sort Fact are replaced, respectively, with fd(t)f^{d}(\vec{t}) and fa(t)f^{a}(\vec{t}). Then to update the database we execute the following DML queries (in this order):

Q^d,Q^a,[f1d(v1)]?.[f1(v1)]0.,,[fmd(vm)]?.[fm(vm)]0.,\displaystyle\hat{Q}_{d},\quad\hat{Q}_{a},\quad\nabla[f_{1}^{d}(\vec{v}^{1})]_{?}\mathbin{.}\nabla[f_{1}(\vec{v}^{1})]_{0}\mathbin{.}\mathit{\surd},\ldots,\nabla[f_{m}^{d}(\vec{v}^{m})]_{?}\mathbin{.}\nabla[f_{m}(\vec{v}^{m})]_{0}\mathbin{.}\mathit{\surd},
[f1d(v1)]0.,,[fmd(vm)]0.,[f1a(v1)]0.f1(v1),,[fma(vm)]0.fm(vm),\displaystyle\nabla[f_{1}^{d}(\vec{v}^{1})]_{0}\mathbin{.}\mathit{\surd},\ldots,\nabla[f_{m}^{d}(\vec{v}^{m})]_{0}\mathbin{.}\mathit{\surd},\quad\nabla[f_{1}^{a}(\vec{v}^{1})]_{0}\mathbin{.}f_{1}(\vec{v}^{1}),\ldots,\nabla[f_{m}^{a}(\vec{v}^{m})]_{0}\mathbin{.}f_{m}(\vec{v}^{m}),

where f1,,fmf_{1},\ldots,f_{m} are fact constructors occurring in QdQ_{d} and QaQ_{a}. Thus, because of Theorem 5.18 we can express any relational database update using multiple DML queries in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}.

7 Reachability analysis of data-centric business processes

In this section we demonstrate the application of 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}} in specification and analysis of data-centric business processes. First, we describe a general reachability and simulation framework, and then devote the rest of the section to an extended example specification.

So far, we have specified the rules for execution of a single DML expression. A business process, in general, executes a sequence of DML expressions according to some orchestration rules. A simple example of such rules which we use here, appropriate for a data driven process, is that if a DML expression can be executed successfully then it can be chosen (non-deterministically) as the next command to be executed. Usually (c.f., [40]) such data modifying operations are launched in response to some events, such as user actions which also provide input parameters for the command. In turn, their execution may trigger further events. Here, taking inspiration from [41, 42], we interpret some of the facts as triggering events, user input, and output events (we describe this in more detail later as a part of the example). This simplifies the simulation.

Thus, we specify a business process simply as a finite set Γ\Gamma of DML expressions in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}. During simulation, at each “business step” a DML expression is chosen non-deterministically and is executed. Unsuccessful execution simply leaves the database of facts unchanged. Alternatively, during reachability analysis which performs a breadth-first search through all possible evolutions of the process it is more efficient to make the system stuck on unsuccessful step. This prunes spurious branches in search tree.

More precisely, a set of DML expressions Γ\Gamma determines a rewriting system Σ,𝒟(Γ)\mathcal{R}_{\Sigma,\mathcal{D}}(\Gamma) defined to be the union of Σ,𝒟𝐝𝐦𝐥(Q)\mathcal{R}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}(Q)’s, QΓQ\in\Gamma, augmented with two rule schemas

λQnew:𝔫(F,F𝐧)IQ(F,F𝐧),λQfail:𝔣(F,F𝐧)IQ(F,F𝐧),\lambda^{\text{new}}_{Q}:\mathfrak{n}(F,F_{\mathbf{n}})\Rightarrow\mathrm{I}_{Q}(F,F_{\mathbf{n}}),\quad\lambda^{\text{fail}}_{Q}:\mathfrak{f}(F,F_{\mathbf{n}})\Rightarrow\mathrm{I}_{Q}(F,F_{\mathbf{n}}), (31)

for all QΓQ\in\Gamma.

Rule schema λQnew\lambda^{\text{new}}_{Q} chooses non-deterministically a new DML query for execution and rewrites into an initial state for this query if the execution of the previous one was successful. Similarily, λQfail\lambda^{\text{fail}}_{Q} chooses a new DML query if the execution of the previous one failed. It is important to know that the failure of a randomly chosen DML expression does not usually mean that the business process execution is faulty: Instead, it may simply mean that the given DML expression was not applicable at the moment. Since here the only way to know if the DML expression is applicable is to run it, the rule λQfail\lambda^{\text{fail}}_{Q} is necessary lest the simulation stops prematurely. On the other hand, unsuccessful executions do not change database state, and thus are spurious, adding no useful information. This is why, when doing reachability analysis which explores using breadth-first search all possible paths of execution (in contrast to simulations, where each simulation travels just a single execution path) it is better to drop the rule λQfail\lambda^{\text{fail}}_{Q}.

It is assumed that all DML queries QΓQ\in\Gamma are such that a successful execution of Q consumes and emits a special fact \sharp (of sort Fact) called a token. The token does not denote any real data, but rather facilitates a non-deterministic choice of user input. Say, if in the database there were facts f(a1),f(a2),f(a_{1}),f(a_{2}),\ldots, where a1,a2,a_{1},a_{2},\ldots are possible user inputs for some business step, then we can simulate user choice and execution of further action D(x)D(x) (based on this choice and expressed as DML query with free variable xx storing user’s decision) by using the DML expression of the form []0[f(x)]?.D(x)\nabla[\sharp]_{0}\circ[f(x)]_{?}\mathbin{.}D(x). If the query wouldn’t match and remove the token then the action D(x)D(x) would be executed for every possible user input. In the example described in this section instead of a constant token we use tokens :NatFact\sharp:Nat\rightarrow\text{\tt Fact} parametrized by a natural number. All DML queries consume a token with a non-zero parameter and emit a token with a parameter decreased by one. This permits limiting the number of “large business steps”, i.e., executions of DML queries, in the simulation or search procedure. Rewriting systems such as Maude permit limiting rewriting steps in the search procedure. However, execution of each DML query can take many rewriting steps, the number of which is not easy to estimate. Thus, it is not trivial to pass from the number of rewriting steps to the number of business steps (which are more natural in this context).

Given an initial database FF we start reachability analysis from term 𝔫(F(k),Cs1(ım1s1)Csn(ımnsn))\mathfrak{n}(F\circ\sharp(k),C_{s_{1}}(\imath_{m_{1}}^{s_{1}})\circ\cdots C_{s_{n}}(\imath_{m_{n}}^{s_{n}})), where kk is the maximal depth of search expressed in the number of business steps, s1,,sns_{1},\ldots,s_{n} are nominal sorts for which we need fresh values, and m1,,mnm_{1},\ldots,m_{n} are such that values ıpisi\imath_{p_{i}}^{s_{i}} for pimip_{i}\geq m_{i}, i{1,,n}i\in\{1,\ldots,n\}, do not occur in FF. In case of reachability analysis we search for the term of the form 𝔫(F,F𝐧)\mathfrak{n}(F,F_{\mathbf{n}}) where the database FF satisfies some condition (either desirable or undesirable one).

7.1 Example specification

We borrow an example from [28, Appendix C] to demonstrate specification of a business process as a set of DML expressions in 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}}. The example concerns the process of selecting and advertising restaurant offers of dinners by employees of mediating agency, and managing corresponding bookings. The lifecycles of two key artifact types — Offer and Booking — are presented as finite state machines in Figure 1. Each agent publishes exactly one restaurant offer — either the new one which just came or the one which was previously put on hold. The published offer is in the state available. Agent puts the offer he currently publishes on hold (state onHold) when picking up another offer for publication. Dashed arrows in Figure 1 indicate that entering a given state by an artifact may trigger state change in another artifact, e.g., there is a dashed arrow between the available state and the anonymous transition into onHold state (in a distinct artifact of type Offer). Available offer may get closed (state closed, or be picked up by a customer (transition newBooking to state beingBooked). The latter triggers creation of a new Booking artifact. Booking starts with a preliminary phase of drafting (state drafting) in which the customer chooses dinner hosts (transition addHosts). After draft submission (which changes the state to submitted) the agent computes price for the offer (transition determineProposal to state finalized) and the customer decides to either accept or reject the proposal transitioning, respectively, to the accepted or canceled state. The acceptance may in some cases go through toBeValidated state when additional validation is necessary.

onHoldavailableclosedbeingBookednewOfferresumecloseOffernewBookingOfferdraftingaddHostscanceledacceptedsubmittedfinalizedtoBeValidatedsubmitcancelconfirmrejectdetermineProposalaccept2accept1reject2Booking
Figure 1: Lifecycles of Offer and Booking artifacts presented as finite state machines [28, Figure 5] (see also [43, Figure 5])

States of offers and bookings are constants of sorts OState and BState, respectively, named like in Figure 1. We use the following nominal sorts for identifiers: Rest for restaurants; Person for customers, agents and hosts; Offer, Book and Url for offers, bookings, and url’s of finalized proposals, respectively. For facts we use the following constructors:

Rest:RestFact,Agent,Cust:PersonFact,\displaystyle\text{\tt Rest}:\text{\tt Rest}\rightarrow\text{\tt Fact},\quad\text{\tt Agent},\text{\tt Cust}:\text{\tt Person}\rightarrow\text{\tt Fact},
Offer:OfferOStateRestPersonFact,Book:BookBStateOfferPersonFact,\displaystyle\text{\tt Offer}:\text{\tt Offer}\;\text{\tt OState}\;\text{\tt Rest}\;\text{\tt Person}\rightarrow\text{\tt Fact},\quad\text{\tt Book}:\text{\tt Book}\;\text{\tt BState}\;\text{\tt Offer}\;\text{\tt Person}\rightarrow\text{\tt Fact},
Host:BookPersonFact,Prop:BookUrlFact.\displaystyle\text{\tt Host}:\text{\tt Book}\;\text{\tt Person}\rightarrow\text{\tt Fact},\quad\text{\tt Prop}:\text{\tt Book}\;\text{\tt Url}\rightarrow\text{\tt Fact}.

where facts Rest(r)\text{\tt Rest}(r), Agent(a)\text{\tt Agent}(a), Cust(c)\text{\tt Cust}(c) indicate that rr, aa, and cc are, respectively, identifiers of a registered restaurant, agent, and customer. Offer(o,s,r,a)\text{\tt Offer}(o,s,r,a) means that an offer oo in a state ss for a restaurant rr is managed by an agent aa. Book(b,s,o,c)\text{\tt Book}(b,s,o,c) means that booking bb in a state ss for a customer cc is related to an offer oo. A fact Host(b,p)\text{\tt Host}(b,p) indicates that a person pp is included as a host for booking bb. Finally, Prop(b,u)\text{\tt Prop}(b,u) indicates that finalized proposal for booking bb, with details and prices, is available at the url uu.

We now specify selected transitions from Figure 1 in detail. Transition newOffer responsible for creation of new offers is implemented with the following DML query:

[(s(n))]0[COffer(o)]𝐧[Agent(a)]?[Rest(r)]!.([Offer(o,beingBooked,r,a)]?.)(O(o,available,r,a)([O(o,available,r,a)]0.O(o,onHold,r,a))(n)).\nabla[\sharp(s(n))]_{0}\circ[C_{\text{\tt Offer}}(o)]_{\mathbf{n}}\circ[\text{\tt Agent}(a)]_{?}\circ[\text{\tt Rest}(r)]_{!}\mathbin{.}\bigl{(}\forall\;[\text{\tt Offer}(o^{\prime},\text{\tt beingBooked},r^{\prime},a)]_{?}\;.\;\bot\bigr{)}\\ \Rightarrow\bigl{(}O(o,\text{\tt available},r,a)\rhd\bigl{(}\nabla[O(o^{\prime},\text{\tt available},r^{\prime},a)]_{0}\mathbin{.}O(o^{\prime},\text{\tt onHold},r^{\prime},a)\bigr{)}\rhd\sharp(n)\bigr{)}.

Above, s:NatNats:\text{\tt Nat}\rightarrow\text{\tt Nat} denotes the successor function. We assume that each DML query is executed against a database in which there is exactly one token matching (s(n))\sharp(s(n)), i.e., a token holding a number greater than zero. Thus, in the above, we first choose a single fresh offer identifier, and, non-deterministically, a single registered agent and a single restaurant. The token is marked with [_]0[\_]_{0}, so it is removed from the database after matching (this guarantees that we choose no more than one agent, restaurant, and offer identifier). The query emits back a token with a number decreased by 1 ensuring the possibility (if this number is greater than zero) of executing a next query. The restaurant can be arbitrary (as long as it is registered in the system), however the agent must not manage an offer being booked, as described by the deterministic condition

[O(o,beingBooked,r,a)]?.\forall[O(o^{\prime},\text{\tt beingBooked},r^{\prime},a)]_{?}\;.\;\bot

inside the above DML query. If this condition is not satisfied, the quantifier step fails, the token is returned to the database and a new matching is tried. Since Agent(a)\text{\tt Agent}(a) is marked by [_]?[\_]_{?}, we do not try the same agent again. If the correct matching is found, a fact describing new offer is added to the database. We also change the state of any available offer managed by the agent of the new offer to onHold.

An offer which was put on hold, may be resumed by any agent who is not managing an offer which is currently being booked. Agent resuming an offer becomes the new manager of the offer:

[(s(n))Offer(o,onHold,r,a)]0[Agent(a)]?.([Offer(o,beingBooked,r,a)]?.)((n)Offer(o,available,r,a)([Offer(o,available,r,a)]0.Offer(o,onHold,r,a))).\nabla[\sharp(s(n))\circ\text{\tt Offer}(o,\text{\tt onHold},r,a)]_{0}\circ[\text{\tt Agent}(a^{\prime})]_{?}\mathbin{.}\bigl{(}\forall\;[\text{\tt Offer}(o^{\prime},\text{\tt beingBooked},r^{\prime},a^{\prime})]_{?}\;.\;\bot\bigr{)}\\ \Rightarrow\bigl{(}\sharp(n)\rhd\text{\tt Offer}(o,\text{\tt available},r,a^{\prime})\\ \rhd\bigl{(}\nabla[\text{\tt Offer}(o^{\prime},\text{\tt available},r^{\prime},a^{\prime})]_{0}\mathbin{.}\text{\tt Offer}(o^{\prime},\text{\tt onHold},r^{\prime},a^{\prime})\bigr{)}\bigr{)}.

With the newBooking transition some offer oo changes state from available to beingBooked). It also triggers creation of a new booking (with a fresh identifier) in the drafting state for the chosen offer oo on behalf of some registered customer:

[(s(n))Offer(o,available,r,a)]0[CBook(b)]𝐧[Cust(c)]!.(Offer(o,beingBooked,r,a)Book(b,drafting,o,c)(n)).\nabla[\sharp(s(n))\circ\text{\tt Offer}(o,\text{\tt available},r,a)]_{0}\circ[C_{\text{\tt Book}}(b)]_{\mathbf{n}}\circ[\text{\tt Cust}(c)]_{!}\\ \mathbin{.}\bigl{(}\text{\tt Offer}(o,\text{\tt beingBooked},r,a)\rhd\text{\tt Book}(b,\text{\tt drafting},o,c)\rhd\sharp(n)\bigr{)}.

The customer involved in booking can add dinner hosts one by one (see transition addHosts in Figure 1) as long as the booking is in the drafting stage. The added host can be either fresh or be present in the database as a host for another offer. We use separate DML queries for each of those cases. The first case (of a fresh host) is trivial:

[(s(n))]0[CPerson(h)]𝐧[Book(b,drafting,o,c)]!.((n)Host(b,h)).\nabla[\sharp(s(n))]_{0}\circ[C_{\text{\tt Person}}(h)]_{\mathbf{n}}\circ[\text{\tt Book}(b,\text{\tt drafting},o,c)]_{!}\mathbin{.}\bigl{(}\sharp(n)\rhd\text{\tt Host}(b,h)\bigr{)}.

In the second case we have to ensure that we are not adding the same person twice:

[(s(n))]0[Book(b,drafting,o,c)]![Host(b,h)]?.(([Host(b,h)]?.)((n)Host(b,h))).\nabla[\sharp(s(n))]_{0}\circ[\text{\tt Book}(b,\text{\tt drafting},o,c)]_{!}\circ[\text{\tt Host}(b^{\prime},h)]_{?}\mathbin{.}\bigl{(}(\forall\;[\text{\tt Host}(b,h)]_{?}\mathbin{.}\bot)\Rightarrow(\sharp(n)\rhd\text{\tt Host}(b,h))\bigr{)}.

The submit action simply changes the state of the booking from drafting to submitted. Then, if the customer’s customized booking is infeasible, it can be rejected (reject transition in Figure 1, the implementation of which we omit for brevity’s sake). Otherwise, the final proposal (which includes cost, etc.) to the customer who owns the booking is created (transition determineProposal in Figure 1). The preparation of the proposal is abstracted as (1) creating the fresh url to the proposal, and (2) removing information about hosts (which is available at the new url). As before, we pick the booking non-deterministically using the token and appropriate pattern:

[(s(n))Book(b,submitted,o,c)]0[CUrl(u)]𝐧.(Prop(b,u)Book(b,finalized,o,c)([Host(b,h)]0.)(n)).\nabla[\sharp(s(n))\circ\text{\tt Book}(b,\text{\tt submitted},o,c)]_{0}\circ[C_{\text{\tt Url}}(u)]_{\mathbf{n}}\mathbin{.}\\ \bigl{(}\text{\tt Prop}(b,u)\rhd\text{\tt Book}(b,\text{\tt finalized},o,c)\rhd\bigl{(}\nabla[\text{\tt Host}(b,h)]_{0}\mathbin{.}\mathit{\surd}\bigr{)}\rhd\sharp(n)\bigr{)}.

A finalized booking proposal for a restaurant rr can be accepted either immediately (with accept1\text{\tt accept}_{1}) or after an additional confirmation (with accept2\text{\tt accept}_{2}). The first case applies only to golden customers of a given restaurant rr, i.e., those who successfully booked a dinner in rr at least kk-times, for some fixed kk. Accepting a proposal changes the state of the offer to which the booking belongs to closed:

[(s(n))Book(b,finalized,o,c)Offer(o,beingBooked,r,a)]0[Cust(c)]?.([Offer(o1,closed,r,a1)B(b1,accepted,o1,c)Offer(ok,closed,r,ak)Book(bk,accepted,ok,c)]?.)(Book(b,accepted,o,c)Offer(o,closed,r,a)(n)).\nabla[\sharp(s(n))\circ\text{\tt Book}(b,\text{\tt finalized},o,c)\circ\text{\tt Offer}(o,\text{\tt beingBooked},r,a)]_{0}\circ[\text{\tt Cust}(c)]_{?}\mathbin{.}\\ \bigl{(}\exists\;[\text{\tt Offer}(o_{1},\text{\tt closed},r,a_{1})\circ B(b_{1},\text{\tt accepted},o_{1},c)\\ \circ\cdots\circ\text{\tt Offer}(o_{k},\text{\tt closed},r,a_{k})\circ\text{\tt Book}(b_{k},\text{\tt accepted},o_{k},c)]_{?}\mathbin{.}\top\bigr{)}\\ \Rightarrow\bigl{(}\text{\tt Book}(b,\text{\tt accepted},o,c)\rhd\text{\tt Offer}(o,\text{\tt closed},r,a)\rhd\sharp(n)\bigr{)}.
Remark 7.1

In our earlier work [15] we have used the almost same example (with minor modifications) to illustrate an alternative formalism (c.f. Section 1.1 in the current paper) where queries, also implemented in the rewriting system, but using meta-level features, are deterministic. This makes them behave like a classical queries, but because of determinism it is not possible to simulate user input directly. Instead, a separate mechanism had to be introduced to simulate non-deterministic input choice. Secondly, DML expressions in [15] did not add or, more importantly, delete facts from the database directly. Instead, they return pairs (which need to be specified in the DML expression itself) of (multi)sets of facts: those to be deleted and those to be added. However, in this particular example (and we believe it is typical) facts to be deleted are matched by parts of the patterns in the query. This (in [15]) led to code duplication, and suggested natural use of rewriting rules (which, of course, replaces matched subterms), and ultimately led to the formalism described in this paper.

Remark 7.2

We have implemented both the syntax and semantics of 𝒬Σ,𝒟Cnd\mathcal{Q}_{\Sigma,\mathcal{D}}^{\text{\tt Cnd}} and 𝒬Σ,𝒟Dml\mathcal{Q}_{\Sigma,\mathcal{D}}^{\text{\tt Dml}} in Maude [44]. The implementation is available on the project’s website [45]. To test the implementation we have used specification of the business process described above (also available from [45]). The specification compiles into a Maude’s system module which contains definitions of 196 operators, 285 equations and 83 rewriting rules (in actual implementation we used equations in place of deterministic rewriting rules).

Let us now describe a simple example of a reachability analysis with the specification just described. Let

initDB:=agent(a1)agent(a2)cust(c1)cust(c2)rest(r1)rest(r2)(7)\text{\tt initDB}:=\text{\tt agent}(a_{1})\circ\text{\tt agent}(a_{2})\circ\text{\tt cust}(c_{1})\circ\text{\tt cust}(c_{2})\circ\text{\tt rest}(r_{1})\circ\text{\tt rest}(r_{2})\circ\sharp(7)

be an initial database of facts. Let

initDBN:=C(ı0Offer)C(ı0Book)C(ı0Url)C(ı0Person)\text{\tt initDBN}:=C(\imath^{\text{\tt Offer}}_{0})\circ C(\imath^{\text{\tt Book}}_{0})\circ C(\imath^{\text{\tt Url}}_{0})\circ C(\imath^{\text{\tt Person}}_{0})

be an initial multiset of fresh facts. Finally, let

initState:=𝔫(initDB,initDBN)\text{\tt initState}:=\mathfrak{n}(\text{\tt initDB},\text{\tt initDBN})

be an initial state. Note that since the token is parametrized by 7, it follows that we can make at most seven successful business steps from this state. We are interested in checking if we can reach from initState (in no more than 7 business steps) the state in which there exists a closed offer (and accepted associated booking) from the database with no bookings and offers. Formally, we want to reach the state matching

𝔫(FOffer(o1,closed,r1,a1)Book(b1,accepted,o1,c1),F𝐧).\mathfrak{n}(F\circ\text{\tt Offer}(o_{1},\text{\tt closed},r_{1},a_{1})\circ\text{\tt Book}(b_{1},\text{\tt accepted},o_{1},c_{1}),F_{\mathbf{n}}).

Using our implementation [45] we can easily check that a matching state is indeed reachable in 6 business steps (Maude reported 525165 actual rewritings in 2528ms).

8 Conclusion

We have presented a multiset non-deterministic query and data manipulation language 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} based on conditional term rewriting. The intended application of this language is in specification, simulation and reachability analysis of data-centric business processes. However, the remarkable features of 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}}, particularly non-determinism and non-standard approach to variable binding, make it interesting on its own. We show that non-determinism of queries is useful for simulating user choices, but we also provide easily identifiable syntactic restrictions which ensure uniqueness of query results. Interestingly, this non-determinism leads to bisimulation-like definitions of logical equivalence between formulas. In the last section we demonstrated how sets of DML queries can be used to specify a business process and we provide a simple framework for simulation and testing.

𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} is a multiset query language. Most formal query languages, including relational calculus and algebra, are based on sets. One under-appreciated fact is that SQL is really a multiset query language, and for a very good reason — removing duplicates is expensive. While this was not our primary reason to use multisets, we believe that using multiset languages encourages query design which avoids unnecessary expensive operations, and takes the complexity of query execution into account better than set-based languages.

The fact that closed 𝒬Σ,𝒟\mathcal{Q}_{\Sigma,\mathcal{D}} formulas are compiled to rewriting systems permits their symbolic execution using narrowing [16]. We intend to explore this possibility in future research. This is also one of the reasons why it was important to limit the use of conditional rules as much as possible: many implementations of narrowing (see e.g., [46]) do not permit narrowing with conditions.

We have implemented 𝒬Σ,𝒟𝐜𝐧𝐝\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{cnd}}, 𝒬Σ,𝒟𝐝𝐦𝐥\mathcal{Q}_{\Sigma,\mathcal{D}}^{\mathbf{dml}} and a specification framework extending the one described at the beginning of Section 7 in Maude [46]. The implementation is available from [45]. It differs in non-essential way from the one described in the present paper, but the code is extensively documented.

Acknowledgements

The author is grateful to the anonymous reviewers for their helpful remarks

References

  • [1] Hull R. Artifact-Centric Business Process Models: Brief Survey of Research Results and Challenges. In: Meersman R, Tari Z (eds.), On the Move to Meaningful Internet Systems: OTM 2008. Springer Berlin Heidelberg, Berlin, Heidelberg, 2008 pp. 1152–1163. 10.1007/978-3-540-88873-4_17.
  • [2] Calvanese D, De Giacomo G, Montali M. Foundations of Data-aware Process Analysis: A Database Theory Perspective. In: Proceedings of the 32Nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS ’13. ACM, New York, NY, USA, 2013 pp. 1–12. 10.1145/2463664.2467796.
  • [3] van der Aalst WM. The application of Petri nets to workflow management. Journal of circuits, systems, and computers, 1998. 8(01):21–66. 10.1142/S0218126698000043.
  • [4] van der Aalst WM, Ter Hofstede AH. YAWL: yet another workflow language. Information systems, 2005. 30(4):245–275. 10.1016/j.is.2004.02.002.
  • [5] Rosa-Velardo F, de Frutos-Escrig D. Decidability and complexity of Petri nets with unordered data. Theoretical Computer Science, 2011. 412(34):4439–4451.
  • [6] Lasota S. Decidability border for Petri nets with data: WQO dichotomy conjecture. In: International Conference on Application and Theory of Petri Nets and Concurrency. Springer, 2016 pp. 20–36. doi:10.1007/978-3-319-39086-4_3.
  • [7] Montali M, Rivkin A. Model checking Petri nets with names using data-centric dynamic systems. Formal Aspects of Computing, 2016. 28(4):615–641. 10.1007/s00165-016-0370-6.
  • [8] Montali M, Rivkin A. DB-Nets: On the Marriage of Colored Petri Nets and Relational Databases. In: Transactions on Petri Nets and Other Models of Concurrency XII. Springer Berlin Heidelberg, Berlin, Heidelberg, 2017 pp. 91–118. 10.1007/978-3-662-55862-1_5.
  • [9] Montali M, Rivkin A. From DB-nets to Coloured Petri Nets with Priorities. In: International Conference on Applications and Theory of Petri Nets and Concurrency. Springer, 2019 pp. 449–469. doi:10.1007/978-3-030-21571-2_24.
  • [10] Meseguer J. Conditional rewriting logic as a unified model of concurrency. Theoretical computer science, 1992. 96(1):73–155. 10.1016/0304-3975(92)90182-F.
  • [11] Meseguer J, Rosu G. The rewriting logic semantics project. Theoretical Computer Science, 2007. 373(3):213 – 237. 10.1016/j.tcs.2006.12.018.
  • [12] Stehr MO, Meseguer J, Ölveczky PC. Rewriting Logic as a Unifying Framework for Petri Nets. In: Unifying Petri Nets: Advances in Petri Nets. Springer Berlin Heidelberg, Berlin, Heidelberg, 2001 pp. 250–303. 10.1007/3-540-45541-8_9.
  • [13] Padberg J, Schulz A. Model Checking Reconfigurable Petri Nets with Maude. In: Echahed R, Minas M (eds.), Graph Transformation. Springer International Publishing, Cham, 2016 pp. 54–70. 10.1007/978-3-319-40530-8_4.
  • [14] Kheldoun A, Barkaoui K, Ioualalen M. Formal verification of complex business processes based on high-level Petri nets. Information Sciences, 2017. 385:39–54. 10.1016/j.ins.2016.12.044.
  • [15] Zieliński B. A Query Language Based on Term Matching and Rewriting. Fundamenta Informaticae, 2019. 169:237–274. 10.3233/FI-2019-1845.
  • [16] Meseguer J, Thati P. Symbolic reachability analysis using narrowing and its application to verification of cryptographic protocols. Higher-Order and Symbolic Computation, 2007. 20(1-2):123–160. 10.1007/s10990-007-9000-6.
  • [17] Fay M. First-order unification in an equational theory. In: Proceedings of the 4th Workshop on Automated Deduction, Austin, Texas, 1979.
  • [18] Hullot JM. Canonical forms and unification. In: International Conference on Automated Deduction. Springer, 1980 pp. 318–334. doi:10.1007/3-540-10009-1_25.
  • [19] Alpuente M, Escobar S, Iborra J. Termination of narrowing revisited. Theoretical Computer Science, 2009. 410(46):4608–4625. doi:10.1016/j.tcs.2009.07.037.
  • [20] Zieliński B, Maślanka P. Relational Transition System in Maude. In: Beyond Databases, Architectures and Structures. Towards Efficient Solutions for Data Analysis and Knowledge Representation: 13th International Conference, BDAS 2017, Ustroń, Poland, May 30 - June 2, 2017, Proceedings. Springer International Publishing, Cham, 2017 pp. 497–511. 10.1007/978-3-319-58274-0_39.
  • [21] Roşu G, Ellison C, Schulte W. Matching Logic: An Alternative to Hoare/Floyd Logic. In: Algebraic Methodology and Software Technology. Springer Berlin Heidelberg, Berlin, Heidelberg, 2011 pp. 142–162. 10.1007/978-3-642-17796-5_9.
  • [22] Roşu G. Matching logic. arXiv:1705.06312, 2017.
  • [23] Stehr MO. CINNI-A Generic Calculus of Explicit Substitutions and its Application to λ\lambda-ς\varsigma-and π\pi-Calculi. Electronic Notes in Theoretical Computer Science, 2000. 36:70–92. 10.1016/S1571-0661(05)80125-2.
  • [24] Baader F. The description logic handbook: Theory, implementation and applications. Cambridge University Press, New York, NY, USA, 2003. ISBN:0-521-78176-0.
  • [25] De Giacomo G, De Masellis R, Rosati R. Verification of conjunctive artifact-centric services. International Journal of Cooperative Information Systems, 2012. 21(02):111–139. 10.1142/S0218843012500025.
  • [26] Hariri BB, Calvanese D, De Giacomo G, De Masellis R, Felli P. Foundations of relational artifacts verification. In: International Conference on Business Process Management. Springer, 2011 pp. 379–395. 10.1007/978-3-642-23059-2_28.
  • [27] Calvanese D, Montali M, Patrizi F, De Giacomo G. Description logic based dynamic systems: modeling, verification, and synthesis. In: Proceedings of the 24th International Conference on Artificial Intelligence. AAAI Press, 2015 pp. 4247–4253.
  • [28] Abdulla PA, Aiswarya C, Atig MF, Montali M, Rezine O. Recency-Bounded Verification of Dynamic Database-Driven Systems. In: Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS ’16. ACM, New York, NY, USA, 2016 pp. 195–210. 10.1145/2902251.2902300.
  • [29] Chen-Burger YH, Robertson D. Automating business modelling: a guide to using logic to represent informal methods and support reasoning. Springer Science & Business Media, 2006. 10.1007/b138799.
  • [30] Merouani H, Mokhati F, Seridi-Bouchelaghem H. Formalizing Artifact-Centric Business Processes - Towards a Conformance Testing Approach. In: Proceedings of the 16th International Conference on Enterprise Information Systems. 2014 pp. 368–374. 10.5220/0004951803680374.
  • [31] McCarthy J, Hayes PJ. Some philosophical problems from the standpoint of artificial intelligence. Readings in artificial intelligence, 1969. pp. 431–450.
  • [32] Deutsch A, Li Y, Vianu V. Verifas: a practical verifier for artifact systems. Proceedings of the VLDB Endowment, 2017. 11(3):283–296. doi:10.14778/3157794.3157798.
  • [33] Deutsch A, Li Y, Vianu V. Verification of hierarchical artifact systems. ACM Transactions on Database Systems (TODS), 2019. 44(3):1–68. doi:10.1145/3321487.
  • [34] Calvanese D, Ghilardi S, Gianola A, Montali M, Rivkin A. Formal modeling and SMT-based parameterized verification of data-aware BPMN. In: International Conference on Business Process Management. Springer, 2019 pp. 157–175. doi:10.1007/978-3-030-26619-6_12.
  • [35] Seco JC, Debois S, Hildebrandt T, Slaats T. RESEDA: Declaring live event-driven computations as REactive SEmi-structured DAta. In: 2018 IEEE 22nd International enterprise distributed object computing conference (EDOC). IEEE, 2018 pp. 75–84. doi:10.1109/EDOC.2018.00020.
  • [36] Meseguer J. Membership algebra as a logical framework for equational specification. In: Recent Trends in Algebraic Development Techniques. Springer Berlin Heidelberg, Berlin, Heidelberg, 1998 pp. 18–61. 10.1007/3-540-64299-4_26.
  • [37] Huet G. Confluent reductions: Abstract properties and applications to term rewriting systems: Abstract properties and applications to term rewriting systems. Journal of the ACM (JACM), 1980. 27(4):797–821.
  • [38] Thielscher M. Introduction to the Fluent Calculus. Electronic Transactions on Artificial Intelligence, 1998. 2(3-4):179–192.
  • [39] Ochremiak J. Nominal sets over algebraic atoms. In: International Conference on Relational and Algebraic Methods in Computer Science. Springer, 2014 pp. 429–445. 10.1007/978-3-319-06251-8_26.
  • [40] Hull R, Damaggio E, De Masellis R, Fournier F, Gupta M, Heath III FT, Hobson S, Linehan M, Maradugu S, Nigam A, et al. Business artifacts with guard-stage-milestone lifecycles: managing artifact interactions with conditions and events. In: Proceedings of the 5th ACM international conference on Distributed event-based system. ACM, 2011 pp. 51–62. 10.1145/2002259.2002270.
  • [41] Abiteboul S, Vianu V, Fordham B, Yesha Y. Relational Transducers for Electronic Commerce. In: Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS ’98. ACM, New York, NY, USA. ISBN 0-89791-996-3, 1998 pp. 179–187. 10.1145/275487.275507.
  • [42] Abiteboul S, Vianu V, Fordham B, Yesha Y. Relational transducers for electronic commerce. Journal of Computer and System Sciences, 2000. 61(2):236–269. doi:10.1006/jcss.2000.1708.
  • [43] Abdulla PA, Aiswarya C, Atig MF, Montali M, Rezine O. Recency-bounded verification of dynamic database-driven systems (extended version). arXiv preprint arXiv:1604.03413, 2016.
  • [44] Clavel M, Durán F, Eker S, Lincoln P, Martí-Oliet N, Meseguer J, Talcott C. The Maude 2.0 System. In: Nieuwenhuis R (ed.), Rewriting Techniques and Applications (RTA 2003), number 2706 in Lecture Notes in Computer Science. Springer-Verlag, 2003 pp. 76–87. 10.1007/3-540-44881-0_7.
  • [45] Zieliński B. Nondeterministic Rewriting Query Language (NDRQL). Project website,
    http://ki.wfi.uni.lodz.pl/ndrql/.
  • [46] Clavel M, Duràn F, Eker S, Escobar S, Lincoln P, Martì-Oliet N, Meseguer J, Talcott C. Maude Manual (Version 2.7.1), 2016.