\twoaddress

Department of Mathematics
University of California, Los Angeles, CA, USA Department of Mathematics, University of Athens, Greece3 \urladdrhttp://www.math.ucla.edu/\urltildeynm

Elementary recursive algorithms ${\,}^{\ast}$

Yiannis N. Moschovakis [email protected]

⁰⁰footnotetext: ^∗A preliminary version of the results in this article was included in an early draft of [aric2019] (ARIC) as an additional Chapter in Part 1. It was replaced by a brief summary in Section 2H of the final, published version of ARIC, because it is not directly relevant to the main aim of that monograph, which is to develop methods for deriving and justifying robust lower complexity bounds for mathematical problems. There are many references in the sequel to ARIC and to older work by several people, but I have included enough of the basic definitions and facts so that the statements and proofs of the new results in Sections 4 and 5 stand on their own.

They run our lives, if you believe the hype in the news, but there is no precise definition of algorithms which is generally accepted by the mathematicians, logicians and computer scientists who create and study them.¹¹1Using imprecise formulations of the Church-Turing Thesis and vague references to [church1935, church1936] and [turing1936], it is sometimes claimed naively that algorithms are Turing machines. This does not accord with the original formulations of the Church-Turing Thesis, cf. the discussion in Section 1.1 of [ynm2014] (which repeats points well-known and understood by those who have thought about this matter); and it is not a useful assumption for algebraic complexity theory, cf. page 2 of ARIC. My main aims here are (first) to discuss briefly and point to the few publications that try to deal with this foundational question and (primarily) to outline in Sections 4 and 5 simple proofs of two basic mathematical results about the elementary recursive algorithms from specified primitives expressed by recursive (McCarthy) programs.

1 What is an algorithm?

With the (standard, self-evident) notation of Sections 1D and LABEL:eqlogic of ARIC, we will focus on algorithms which compute partial functions and (decide partial) relations

f:A^{n}\rightharpoonup A_{s}\qquad(s\in\{{\texttt{ind}},{\texttt{boole}}\},A_{\texttt{ind}}=A,A_{\texttt{boole}}={\{{\textup{t\hskip-1.7ptt}},{\textup{ff}}\}})

(1-1)

from the finitely many primitives of a (partial, typically infinite) $\Phi$ -structure

{\textup{{A}}}=(A,\{\phi^{\textup{{A}}}\}_{\phi\in\Phi})\quad(\phi\in\Phi,\phi^{\textup{{A}}}:A^{\textup{arity}(\phi)}\rightharpoonup A_{{\textup{sort}}(\phi)}).

(1-2)

The most substantial part of this restriction is that it leaves out algorithms with side effects and interaction, cf. the footnote on page LABEL:elementary of ARIC and the relevant Section 3B in [ynmflr].

Equally important is the restriction to algorithms from specified primitives, especially as the usual formulations of the Church-Turing Thesis suggest that the primitives of a Turing machine are in some sense “absolutely computable” and need not be explicitly assumed or justified. We have noted in the Introduction to ARIC (and in several other places) why this is not a useful approach; but in trying to understand computability and the meaning of the Church-Turing Thesis, it is natural to ask whether there are absolutely computable primitives and what those might be. See Sections 2 and 8 of [ynm2014] for a discussion of the problem and references to relevant work, especially the eloquent analyses in [gandy1980] and [kripke2000].

There is also the restriction to first-order primitives, partial functions and relations. This is necessary for the development of a conventional theory of complexity, but recursion and computability from higher type primitives have been extensively studied: see [kleene1959], [ynmkechris-moschovakis] and [sacks1990] for the higher-type recursion which extends directly the first-order notion we have adopted in ARIC, and [longley-normann2015] for a near-complete exposition of the many and different approaches to the topic.²²2See also [ynmflr]—which is about recursion on structures with arbitrary monotone functionals for primitives—and the subsequent [ynmbotik] where the relevant notion of algorithm from higher-type primitives is modeled rigorously.

Once we focus on algorithms which compute partial functions and relations as in (1-1) from the primitives of a $\Phi$ -structure, the problem of modeling them by set-theoretic objects comes down basically to choosing between iterative algorithms specified by (classical) computation models as in Section LABEL:cmodels of ARIC and elementary recursive algorithms expressed directly by recursive (McCarthy) programs; at least this is my view, which I have explained and defended as best I can in Section 3 of [ynmsicily].

The first of these choices—that algorithms are iterative processes—is the standard view, explicitly or implicitly adopted (sometimes with additional restrictions) by most mathematicians and computer scientists, including Knuth in Section 1.1 of his classic [knuth1973]. More recently (and substantially more abstractly, on arbitrary structures), this standard view has been developed, advocated and defended by Gurevich and his collaborators, cf. [gurevich1995, gurevich2000] and [gurevich2008]; see also [zt2000] and [duzi2014].

I have made the second choice—that algorithms are directly expressed by systems of mutual recursive definitions—and I have developed and defended this view in several papers, including [ynmsicily]. I will not repeat these arguments here, except for the few remarks in the remainder of this Section about the role that iterative algorithms play in the theory of recursion and (especially) Proposition 4.6, which verifies that iterative algorithms are “faithfully coded” by the recursive algorithms they define, and so their theory is not “lost” when we take recursive algorithms to be the basic objects.

By the definitions in Section LABEL:synsem of ARIC (reviewed in Section 2 below), a recursive (McCarthy) $\Phi$ -program is a syntactic expression

E\equiv E_{0}{\sf~{}where~{}}\Big{\{}\textup{{p}}_{1}(\vec{\textup{{x}}}_{1})=E_{1},\ldots,\textup{{p}}_{K}(\vec{\textup{{x}}}_{K})=E_{K}\Big{\}},

(1-3)

where $E_{0},E_{1},\ldots,E_{k}$ are (pure, explicit) $\Phi\cup\{\textup{{p}}_{1},\ldots,\textup{{p}}_{K}\}$ -terms and for every $i=1,\ldots,K$ all the individual variables which occur in $E_{i}$ are included in the list $\vec{\textup{{x}}}_{i}$ of distinct individual variables; and an extended program is a pair

(E,\vec{\textup{{x}}})\equiv E(\vec{\textup{{x}}})\equiv E_{0}(\vec{\textup{{x}}}){\sf~{}where~{}}\Big{\{}\textup{{p}}_{1}(\vec{\textup{{x}}}_{1})=E_{1},\ldots,\textup{{p}}_{K}(\vec{\textup{{x}}}_{K})=E_{K}\Big{\}}

(1-4)

of a program $E$ and a list of distinct individual variables $\vec{\textup{{x}}}$ which includes all the individual variables which occur in $E_{0}$ .

Recursive algorithms

My understanding of the algorithm defined by $E(\vec{\textup{{x}}})$ in a $\Phi$ -structure A is that it calls for solving in A the system of recursive equations in the body of $E(\vec{\textup{{x}}})$ (within the braces $\{\cdots\}$ ) and then plugging the solutions in its head $E_{0}(\vec{\textup{{x}}})$ to compute $\textup{den}({\textup{{A}}},E(\vec{x}))$ , the value of the partial function computed by $E(\vec{\textup{{x}}})$ on the input $\vec{x}$ ; how we find the canonical (least) solutions of this system is not specified in this view by the algorithm from the primitives of A defined by $E(\vec{\textup{{x}}})$ .

This “lack of specificity” of elementary recursive algorithms is surely a weakness of the view I have adopted, especially as it leads to some difficult problems.

Implementations

To compute $\textup{den}({\textup{{A}}},E(\vec{x}))$ for $\vec{x}\in A^{n}$ , we might use the method outlined in the proof of the Fixed Point Lemma LABEL:lfp or the recursive machine defined in Section LABEL:cmodels of ARIC, or any one of several well known and much studied implementations of recursion. These are iterative algorithms defined on structures which are (typically) richer than A and they must satisfy additional properties relating them to $E(\vec{\textup{{x}}})$ —we would not call any iterative algorithm from any primitives which happens to compute the same partial function on $A$ as $E(\vec{\textup{{x}}})$ an implementation of it; so to specify the implementations of a recursive program is an important (and difficult) part of this approach to the foundations of the theory of algorithms, cf. [ynmsicily] and (especially) [ynmpaschalis2008] which proposes a precise definition and establishes some basic results about it.

On the other hand, the standard view has some problems of its own:

Recursion vs. iteration

Iterators are defined rigorously in ARIC and Theorem LABEL:itrecthmA gives a strong, precise version of the slogan

\textit{iteration can be reduced to recursion};

(1-5)

on every structure A, if $f:A^{n}\rightharpoonup A_{s}$ is computed by an A-explicit iterator, then $f$ is also computed by an A-recursive program. The converse of (1-5) is not true however: there are structures where some A-recursive relation is not tail recursive, i.e., it cannot be decided by an iterator which is explicit in A—it is necessary to add primitives and/or to enlarge the universe of A. Classical examples include [patterson-hewitt1970] (Theorem LABEL:pathewhm in ARIC), [lynch-blum1979] and (the most interesting) [tiuryn1989].³³3See also [jones1999, jones2001] and [bhaskar2017int, bhaskar2017ext]. The last Chapter LABEL:algebra of ARIC discusses a (basic) problem from algebraic complexity theory, a very rich and active research area in which the recursive representation of algorithms is essential.

Functional vs. imperative programming

It is sometimes argued that the main difference between recursion and iteration is a matter of “programming style”, at least for the structures which are mostly studied in complexity theory in which every recursive partial function is tail recursive, i.e., computed by some A-explicit iterator. This is not quite true: consider, for example the classical merge-sort algorithm (Section LABEL:algexamples of ARIC) which sorts strings from the primitives of the Lisp structure

{\textup{{L}}^{\ast}}=(L^{\ast},\textup{nil},{\textup{eq}}_{\textup{nil}},\textup{head},\textup{tail},\textup{cons})

over an ordered set $L$ , (LABEL:stringstr) in ARIC; it is certainly implemented by some ${\textup{{L}}^{\ast}}$ -explicit iterator (as is every recursive algorithm of ${\textup{{L}}^{\ast}}$ ), but which one of these is the merge-sort? It seems that we can understand this classic algorithm and reason about it better by looking at Proposition LABEL:mergelemma of ARIC rather than focussing on choosing some specific implementation of it.⁴⁴4See also Theorem LABEL:sortinglb of ARIC for a precise formulation and proof of the basic optimality property of the merge-sort.

Proofs of correctness

In Proposition LABEL:mergesortprop of ARIC, we claimed the correctness of the merge-sort—that it sorts—by just saying that

The sort function satisfies the equation …

whose proof was too trivial to deserve more than the single sentence

The validity of (LABEL:mergesort) is immediate, by induction on $|u|$ .

To prove the correctness of an iterator that “expresses the merge-sort”, you must first design a specific one and then explain how you can extract from all the “housekeeping” details necessary for the specification of iterators a proof that what is actually being computed is the sorting function; most likely you will trust that a formal version of (LABEL:mergesort) of ARIC is implemented correctly by some compiler or interpreter of whatever higher-level language you are using, as we did for the recursive machine.

Simply put, whether correct or not, the view that algorithms are faithfully expressed by systems of recursive equations typically separates proofs of their correctness and their basic complexity properties which involve only purely mathematical facts from the relevant subject and standard results of fixed-point-theory, from proofs of implementation correctness for programming languages which are ultimately necessary but have a very different flavor.

Defining mathematical objects in set theory

Finally, it may be useful to review here briefly what it means to define algorithms according to a general (and widely accepted, I think) naive view of what it means to define mathematical objects. This is discussed more fully in Section 3 of [ynmsicily].

One standard example is the “definition” (or “construction”) of real numbers using (canonically) convergent sequences of rational numbers: we set first

x\in\textup{Cauchy}(\mathbb{Q})\iff x:\mathbb{N}\to\mathbb{Q}\\ {~{}\&~{}}(\forall k)(\exists n)(\forall i,j)\Big{[}i,j\geq n\implies|x(i)-x(j)|<\frac{1}{k+1}\Big{]},

next we put for $x,y\in\textup{Cauchy}(\mathbb{Q})$

x\sim y\iff(\forall k)(\exists n)(\forall i>n)|x(i)-y(i)|<\frac{1}{k+1},

and finally we declare that $x,y\in\textup{Cauchy}(\mathbb{Q})$ represent the same real number if $x\sim y$ .

In general, a representation in set theory of a mathematical notion $P$ is a pair of a set (or class) $\mathcal{C}_{P}$ of set-theoretic objects which represent the objects that fall under $P$ and an equivalence condition $\sim_{P}$ on $\mathcal{C}_{P}$ which models the identity relation on them; and the representation is faithful—and can be viewed as a definition of $P$ in set theory—to the extent that the relations on $P$ -objects that we want to study are those which are equivalent to the $\sim_{P}$ -invariant properties of the objects in $\mathcal{C}_{P}$ .

Now, number theorists could not care less about such “definitions” of real numbers and they were happily investigating the existence and properties of rational, algebraic and transcedental numbers for more than two thousand years before they were given in the 1870s. Part of the reason for giving them was to argue for adopting set theory as a “foundation” (a “universal language”) for all of mathematics and to apply set theoretic methods to algebraic number theory;⁵⁵5Most spectacular of these was Cantor’s proof that transcedental numbers exist by a counting argument much simpler than Liouville’s original proof—the first “killer application” of set theory. but this is not the main point—some people would use category theory today and argue for its superiority over set theory as both a foundation and a tool for applications. The main point of looking for such “definitions” is to identify the fundamental, characteristic properties of a mathematical notion, which, to repeat, should be the $\sim_{P}$ -invariant properties of the set-theoretic objects that model the notion.

Another, fundamental (and much more sophisticated) example of this process of constructing faithful modelings of mathematical notions is the identification in [kolmogorov1933] of real-valued random variables with measurable functions $X:\Omega\to\mathbb{R}$ on a probability space, under various equivalence relations.\ynmequal, equal almost surely, or equal in distribution

2 Equational logic of partial terms with conditionals

With the definitions in Section 1A of ARIC, a partial function

f:X\rightharpoonup W

with input set $X$ and output set $W$ is an (ordinary) function $f:D_{f}\to W$ on some subset $D_{f}\subseteq X$ , the domain of convergence of $f$ . We write

f(x)\kern-1.99997pt\downarrow\kern 1.00006pt\iff\kern-6.49994pt_{\mathrm{df}}~{}x\in D_{f},\quad f(x)\kern-1.00006pt\uparrow\kern 0.50003pt\iff\kern-6.49994pt_{\mathrm{df}}~{}x\notin D_{f}\qquad(x\in X)

and most significantly, for $f,g:X\rightharpoonup W$ and $x\in X$ ,

f(x)=g(x)\iff\kern-6.49994pt_{\mathrm{df}}~{}\Big{(}f(x)\kern-1.99997pt\downarrow\kern 1.00006pt{~{}\&~{}}g(x)\kern-1.99997pt\downarrow\kern 1.00006pt{~{}\&~{}}f(x)=g(x)\Big{)}\text{ or }\Big{(}f(x)\kern-1.00006pt\uparrow\kern 0.50003pt{~{}\&~{}}g(x)\kern-1.00006pt\uparrow\kern 0.50003pt\Big{)}.

This relation between partial functions $f,g:X\rightharpoonup W$ with the same input and output sets is sometimes called Kleene’s strong equality between “partial terms”, but there is nothing partial about it: for any two $f,g:X\rightharpoonup W$ and any $x\in X$ , the proposition $f(x)=g(x)$ is either true or false.

A signature or vocabulary is a set $\Phi$ of function symbols, each with assigned $\textup{arity}(\phi)\in\mathbb{N}$ and ${\textup{sort}}(\phi)\in\{{\texttt{ind}},{\texttt{boole}}\}$ ; and as in (1-2), a (two-sorted, partial) $\Phi$ -structure is a pair

{\textup{{A}}}=(A,\{\phi^{\textup{{A}}}\}_{\phi\in\Phi})\quad(\phi\in\Phi,\phi^{\textup{{A}}}:A^{\textup{arity}(\phi)}\rightharpoonup A_{{\textup{sort}}(\phi)})

(2-1)

of a (typically infinite) universe $A$ and an interpretation of the function symbols which matches their formal arities and sorts, i.e., for each $\phi\in\Phi$ ,

\phi^{\textup{{A}}}:A^{n}\rightharpoonup A_{s}\qquad(n=\textup{arity}(\phi),s={\textup{sort}}(\phi),A_{\texttt{ind}}=A,A_{\texttt{boole}}=\{{\textup{t\hskip-1.7ptt}},{\textup{ff}}\}).

An expansion of a $\Phi$ -structure A in (2-1) is any $\Phi\cup\Psi$ -structure

({\textup{{A}}},\Psi^{\textup{{A}}})=(A,\{\phi^{\textup{{A}}}\}_{\phi\in\Phi},\{\psi^{\textup{{A}}}\}_{\psi\in\Psi})

(2-2)

on the same universe $A$ with $\Phi\cap\Psi=\emptyset$ .

Syntax

The (pure, explicit) $\Phi^{2}$ -terms are defined by the structural recursion

F:\equiv{\textup{t\hskip-1.7ptt}}\mid{\textup{ff}}\mid\textup{{v}}_{i}\mid\textup{{p}}^{s,n}(F_{1},\ldots,F_{n})\\ \mid\phi(F_{1},\ldots,F_{\textup{arity}(\phi)})\mid\textup{if }F_{0}~{}\textup{then }F_{1}~{}\textup{else }F_{2},

where $\textup{{v}}_{0},\textup{{v}}_{1},\ldots$ is a fixed list of individual variables; for each $s\in\{{\texttt{ind}},{\texttt{boole}}\}$ and each $n\in\mathbb{N}$ , $\textup{{p}}^{s,n}_{0},\textup{{p}}^{s,n}_{1},\ldots$ is a fixed list of (partial) function variables of sort $s$ and arity $n$ ; each term is assigned a sort in the natural way; and a Parsing Lemma (like Problem x1E.1 of ARIC) justifies the standard method of defining functions on these terms by structural recursion.⁶⁶6See Problem x2.1 for a detailed statement and proof of this. We just write $\phi$ for $\phi(~{})$ when $\textup{arity}(\phi)=0$ , $\textup{{p}}^{s,0}$ for $\textup{{p}}^{s,0}(~{})$ and we do not allow variables over the set $\mathbb{B}=\{{\textup{ff}},{\textup{t\hskip-1.7ptt}}\}$ of truth values, cf. footnote 9 on p. 37 of ARIC.

A $\Phi$ -term is a $\Phi^{2}$ -term which has no function variables and a $\Phi\cup\{\textup{{p}}_{1},\ldots,\textup{{p}}_{K}\}$ -term is a $\Phi^{2}$ -term whose function variables are all in the list $\textup{{p}}_{1},\ldots,\textup{{p}}_{K}$ ; these terms are naturally interpreted in expansions $({\textup{{A}}},p_{1},\ldots,p_{K})$ of $\Phi$ -structures which interpret each $\textup{{p}}_{i}$ by some $p_{i}$ of the correct sort and arity.

An extended $\Phi^{2}$ -term is a pair

(F,\vec{\textup{{x}}})\equiv F(\vec{\textup{{x}}})\equiv F(\textup{{x}}_{1},\ldots,\textup{{x}}_{n})

of a $\Phi^{2}$ -term $F$ and a finite list of distinct individual variables which includes all the individual variables that occur in $F$ . The notation provides a simple way to deal with substitutions,

F(E_{1},\ldots,E_{n}):\equiv F\{\textup{{x}}_{1}:\equiv E_{1},\ldots,\textup{{x}}_{n}:\equiv E_{n}\}\\ (F(\vec{\textup{{x}}})\text{ an extended term, $E_{1},\ldots,E_{n}$ terms of sort ${\texttt{ind}}$}).

Semantics

The denotation (or value) $\textup{den}(({\textup{{A}}},\vec{p}),F(\vec{x}))$ in a $\Phi$ -structure A of each extended $\Phi\cup\{\vec{\textup{{p}}}\}$ -term $F(\vec{\textup{{x}}})$ for given values $\vec{p},\vec{x}$ of its variables is defined by the usual (compositional) structural recursion on the definition of terms: skipping the $({\textup{{A}}},\vec{p})$ part (which remains constant),

	$\displaystyle\textup{den}({\textup{t\hskip-1.7ptt}}(\vec{x}))={\textup{t\hskip-1.7ptt}},~{}~{}\textup{den}({\textup{ff}}(\vec{x}))={\textup{ff}},~{}~{}\textup{den}(\textup{{x}}_{i}(\vec{x}))=x_{i},$
	$\displaystyle\textup{den}(\textup{{p}}_{i}(F_{1},\ldots,F_{n})(\vec{x}))=p_{i}(\textup{den}(F_{1}(\vec{x})),\ldots,\textup{den}(F_{n}(\vec{x}))),$
	$\displaystyle\textup{den}(\phi(F_{1},\ldots,F_{\textup{arity}(\phi)})(\vec{x}))=\phi^{\textup{{A}}}(\textup{den}(F_{1}(\vec{x})),\ldots,\textup{den}(F_{\textup{arity}(\phi)}(\vec{x}))),$
	$\displaystyle\textup{den}(\textup{if }F_{0}~{}\textup{then }F_{1}~{}\textup{else }F_{2}(\vec{x}))=\textup{if }\textup{den}(F_{0}(\vec{x}))~{}\textup{then }\textup{den}(F_{1}(\vec{x}))~{}\textup{else }\textup{den}(F_{2}(\vec{x}));$

and we will also use standard model-theoretic notation,

	$\displaystyle({\textup{{A}}},\vec{p})\models F(\vec{x})=G(\vec{x})$	$\displaystyle\iff\kern-6.49994pt_{\mathrm{df}}~{}\textup{den}(({\textup{{A}}},\vec{p}),F(\vec{x}))=\textup{den}(({\textup{{A}}},\vec{p}),G(\vec{x})),$
	$\displaystyle{\textup{{A}}}\models F(\vec{\textup{{x}}})=G(\vec{\textup{{x}}})$	$\displaystyle\iff\kern-6.49994pt_{\mathrm{df}}~{}(\forall\vec{p},\vec{x})\Big{(}({\textup{{A}}},\vec{p})\models F(\vec{x})=G(\vec{x}\Big{)}$
	$\displaystyle\models F(\vec{\textup{{x}}})=G(\vec{\textup{{x}}})$	$\displaystyle\iff\kern-6.49994pt_{\mathrm{df}}~{}(\forall{\textup{{A}}})\Big{(}{\textup{{A}}}\models F(\vec{\textup{{x}}})=G(\vec{\textup{{x}}})\Big{)}.$

Problems for Section 2

Problem x2.1.

A set $\mathcal{T}$ of pairs $(F,s)$ is closed under term formation if

({\textup{t\hskip-1.7ptt}},{\texttt{boole}}),({\textup{ff}},{\texttt{boole}})\in\mathcal{T},\text{ for all }i,(\textup{{v}}_{i},{\texttt{ind}})\in\mathcal{T},\\ \text{for all }\phi\in\Phi,\Big{(}(F_{1},{\texttt{ind}}),\ldots,(F_{\textup{arity}(\phi)},{\texttt{ind}})\in\mathcal{T}\hskip 85.35826pt\mbox{}\\ \mbox{}\hskip 85.35826pt\implies(\phi(F_{1},\ldots,F_{\textup{arity}(\phi)}),{\textup{sort}}(\phi))\in\mathcal{T}\Big{)},\\ \text{for all }\textup{{p}}^{s,n}_{i},\Big{(}(F_{1},{\texttt{ind}}),\ldots,(F_{n},{\texttt{ind}})\in\mathcal{T}\hskip 85.35826pt\mbox{}\\ \mbox{}\hskip 85.35826pt\implies(\textup{{p}}^{s,n}_{i}(F_{1},\ldots,F_{n}),s)\in\mathcal{T}\Big{)}\\ \text{and }\Big{(}(F_{1},{\texttt{boole}}),(F_{2},s),(F_{3},s)\in\mathcal{T}\implies(\textup{if }F_{1}~{}\textup{then }F_{2}~{}\textup{else }F_{3},s)\in\mathcal{T}\Big{)},

where we view ${\textup{t\hskip-1.7ptt}},{\textup{ff}}$ and $\textup{{v}}_{i}$ as strings of symbols of length $1$ ; and a string $F$ is a pure, explicit $\Phi^{2}$ -term of sort $s$ if the pair $(F,s)$ belongs to every set $\mathcal{T}$ which is closed under term formation.

Formulate a Parsing Lemma for pure, explicit $\Phi^{2}$ -terms (as in Problem x1E.1 of ARIC for terms without function variables) and give a complete proof of it using this precise definition.

3 Continuous, pure recursors

For any two sets $X,W$ , a (continuous, pure) recursor on $X$ to $W$ is a tuple

\alpha=(\alpha_{0},\ldots,\alpha_{K}):X\rightsquigarrow W,

(3-1)

such that for suitable sets $Y^{\alpha}_{1},W^{\alpha}_{1},\ldots,Y^{\alpha}_{K},W^{\alpha}_{K}$ ,⁷⁷7For the definitions of monotone and continuous functionals see Section 1A of ARIC.

	$\displaystyle\alpha_{0}:X\times(Y^{\alpha}_{1}\rightharpoonup W^{\alpha}_{1})\times\cdots\times(Y^{\alpha}_{K}\rightharpoonup W^{\alpha}_{K})$	$\displaystyle\rightharpoonup W,\text{ and}$
	$\displaystyle\alpha_{i}:(Y^{\alpha}_{1}\rightharpoonup W^{\alpha}_{1})\times\cdots\times(Y^{\alpha}_{K}\rightharpoonup W^{\alpha}_{K})$	$\displaystyle\rightharpoonup W^{\alpha}_{i}\quad(1\leq i\leq K)$

are continuous functionals. We allow $X={\bf I}=\{\emptyset\}$ (as on page LABEL:emptyproduct of ARIC), in which case, by our conventions, $\alpha_{0}:(Y^{\alpha}_{1}\rightharpoonup W^{\alpha}_{1})\times\cdots\times(Y^{\alpha}_{K}\rightharpoonup W^{\alpha}_{K})\rightharpoonup W$ .

The number $K$ is the dimension of $\alpha$ , $\alpha_{0}$ is its output or head functional, the poset $(Y_{1}\rightharpoonup W_{1})\times\cdots\times(Y_{K}\rightharpoonup W_{K})$ is its solution space, and we allow $K=0$ –in which case there is no solution space and $(\alpha_{0})$ is naturally identified with the partial function $\alpha_{0}:X\rightharpoonup W$ . With the notation of (LABEL:wheredef) of ARIC, $\alpha$ defines (or computes) the partial function $\overline{\alpha}:X\rightharpoonup W$ , where

\overline{\alpha}(x)=\alpha_{0}(x,\vec{p})\,\textup{where}\hskip-25.58336pt\raisebox{5.59721pt}{\rule{25.58336pt}{0.3pt}}\,\Big{\{}p_{1}(y_{1})=\alpha_{1}(y_{1},\vec{p}),\ldots,p_{K}(y_{K})=\alpha_{K}(y_{K},\vec{p})\Big{\}}.

(3-2)

We can summarize this situation succinctly by writing

\alpha(x)=(\alpha_{0},\ldots,\alpha_{K})(x)\\ =\alpha_{0}(x,\vec{p})\textit{ where }\Big{\{}p_{1}(y_{1})=\alpha_{1}(y_{1},\vec{p}),\ldots,p_{K}(y_{K})=\alpha_{K}(y_{K},\vec{p})\Big{\}},

(3-3)

where “where” is now an operation which assigns to every tuple of continuous functionals $(\alpha_{0},\ldots,\alpha_{K})$ (with suitable input and output sets) a recursor.

The recursor defined by an extended program

Every (deterministic) extended recursive $\Phi$ -program

E(\vec{\textup{{x}}})\equiv E_{0}(\vec{\textup{{x}}}){\sf~{}where~{}}\Big{\{}\textup{{p}}_{1}(\vec{\textup{{x}}}_{1})=E_{1},\ldots,\textup{{p}}_{K}(\vec{\textup{{x}}}_{K})=E_{K}\Big{\}}

(as in (1-4)) with dimension $K$ , arity $n$ and sort $s$ defines naturally on each $\Phi$ -structure A the continuous pure recursor on $A$ of sort $s$ and arity $n$

\mathfrak{r}({\textup{{A}}},E(\vec{\textup{{x}}}))=(\alpha_{0},\alpha_{1},\ldots,\alpha_{K}):A^{n}\rightsquigarrow A_{s},

(3-4)

where

\begin{array}[]{rcl}\alpha_{0}(\vec{x},\vec{p})&=&\textup{den}(({\textup{{A}}},\vec{p}),E_{0}(\vec{x})),\\ \alpha_{i}(\vec{x}_{i},\vec{p})&=&\textup{den}(({\textup{{A}}},\vec{p}),E_{i}(\vec{x}_{i}))\quad(1\leq i\leq K)\end{array}

(3-5)

or, with the notation in (3-3),

\mathfrak{r}({\textup{{A}}},E(\vec{x}))=\textup{den}(({\textup{{A}}},\vec{p}),E_{0}(\vec{x}))\\ \textit{ where }\Big{\{}p_{1}(\vec{x}_{1})=\textup{den}(({\textup{{A}}},\vec{p}),E_{1}(\vec{x}_{1})),\\ \ldots,p_{K}(\vec{x}_{K})=\textup{den}(({\textup{{A}}},\vec{p}),E_{K}(\vec{x}_{K}))\Big{\}}.

(3-6)

It is immediate from the semantics of recursive programs on page LABEL:progsemantics of ARIC that the recursor of an extended program computes its denotation,

\overline{\mathfrak{r}({\textup{{A}}},E(\vec{x}))}=\textup{den}({\textup{{A}}},E(\vec{x}))\quad(\vec{x}\in A^{n}).

(3-7)

However : $\mathfrak{r}({\textup{{A}}},E(\vec{\textup{{x}}}))$ does not typically model faithfully the algorithm expressed by $E(\vec{\textup{{x}}})$ on A, partly because it does not take into account the explicit steps which may be required to computes $\textup{den}({\textup{{A}}},E(\vec{x}))$ and (more importantly) because it does not record the dependence of that algorithm on the primitives of A. In the extreme case, if $E\equiv E_{0}{\sf~{}where~{}}\{~{}\}$ is a program with empty body (an explicit $\Phi$ -term), then

\mathfrak{r}({\textup{{A}}},E(\vec{\textup{{x}}}))=((\lambda\vec{x})\,\textup{den}({\textup{{A}}},E(\vec{x})))

is a trivial recursor of dimension $0$ , a partial function on $A$ —and it is the same for all explicit terms which define this partial function from any primitives, which is certainly not right. In the next Section 4 we will define the intension $\textup{int}({\textup{{A}}},E(\vec{\textup{{x}}}))$ of $E(\vec{\textup{{x}}})$ in A which (we will claim) models faithfully the algorithm from the primitives of A expressed by $E(\vec{\textup{{x}}})$ . As it turns out, however,

\textup{int}({\textup{{A}}},E(\vec{\textup{{x}}}))=\mathfrak{r}({\textup{{A}}},\textup{cf}(E(\vec{\textup{{x}}})))

for some extended program $\textup{cf}(E(\vec{\textup{{x}}}))$ which is canonically associated with $E(\vec{\textup{{x}}})$ , and so every recursive algorithm of a structure A is $\mathfrak{r}({\textup{{A}}},F(\vec{\textup{{x}}})))$ for some $F(\vec{\textup{{x}}})$ .

\ynm

About implementations (2)

Our understanding of the algorithm expressed by a recursor $\alpha$ in (3-3) is that it calls for solving the system of recursive equations in its body and then plugging the solutions in the head functional to compute the value $\overline{\alpha}(x)=\alpha_{0}(x,\overline{p}_{1},\ldots,\overline{p}_{K})$ . How we find the canonical solutions of this system is not specified by $\alpha$ : we might use the method outlined in the proof of the Fixed Point Lemma LABEL:lfp, or the recursive machine defined in Section LABEL:cmodels if $\alpha=\mathfrak{r}({\textup{{A}}},E(\vec{\textup{{x}}}))$ is the recursor of an extended program in a structure, or any one of several well-known and much studied implementations of recursion. What the implementations of pure recursors are is an important part of this approach to the foundations of the theory of algorithms, cf. [ynmsicily] and (especially) [ynmpaschalis2008] which includes a precise definition and some basic results. We will not go into this here, except for a few additional comments on page 4.

Strong recursor isomorphism

The solutions of the system of recursive equations in the body of a recursor $\alpha$ as in (3-3) do not depend on the order in which these equations are listed and the known methods for computing them also do not depend on that order in any important way; so it is natural to identify recursors which differ only in the order in which their bodies are enumerated, so that, for example

\alpha_{0}(x,p_{1},p_{2},p_{3})\textit{ where }\Big{\{}p_{1}(y_{1})=\alpha_{1}(y_{1},p_{1},p_{2},p_{3}),\\ \hskip 56.9055ptp_{2}(y_{2})=\alpha_{2}(y_{2},p_{1},p_{2},p_{3}),~{}~{}p_{3}(y_{3})=\alpha_{3}(y_{3},p_{1},p_{2},p_{3})\Big{\}}\\ =\alpha_{0}(x,p_{1},p_{2},p_{3})\textit{ where }\Big{\{}p_{3}(y_{3})=\alpha_{3}(y_{3},p_{1},p_{2},p_{3}),\\ \hskip 28.45274ptp_{1}(y_{1})=\alpha_{1}(y_{1},p_{1},p_{2},p_{3}),~{}~{}p_{2}(y_{2})=\alpha_{2}(y_{2},p_{1},p_{2},p_{3})\Big{\}}.

\ynm

Detail for the example We need $q_{1}=p_{3},q_{2}=p_{1},q_{3}=p_{2}$ , so take $\pi(1)=3,\pi(2)=1,\pi(3)=2$ , with the inverse being $\rho(3)=1,\rho(1)=2,\rho(2)=3$ . This then requires the equations

	$\displaystyle\beta_{0}(x,q_{1},q_{2},q_{3})$	$\displaystyle=\alpha_{0}(x,p_{1},p_{2},p_{3})=\alpha_{0}(x,q_{\rho(1)},q_{\rho(2)},q_{\rho(3)})$
	$\displaystyle\beta_{1}(y_{3},q_{1},q_{2},q_{3})$	$\displaystyle=\alpha_{3}(y_{3},p_{1},p_{2},p_{3})=\alpha_{3}(y_{3},q_{\rho(1)},q_{\rho(2)},q_{\rho(3)})$

For these we need to have

Y^{\beta}_{1}=Y^{\alpha}_{3}=Y^{\alpha}_{\pi(1)},Y^{\beta}_{2}=Y^{\alpha}_{1}=Y^{\alpha}_{\pi(2)},Y^{\beta}_{3}=Y^{\alpha}_{2}=Y^{\alpha}_{\pi(3)}

q_{i}:Y^{\beta}_{i}=Y^{\alpha}_{\pi(i)}\rightharpoonup W^{\alpha}_{\pi(i)}=W^{\beta}_{i},\text{ so }q_{\rho(i)}:Y^{\alpha}_{i}\rightharpoonup W^{\alpha}_{i}.

The precise definition of this equivalence relation is a bit messy in the general case: two recursors $\alpha,\beta:X\rightsquigarrow W$ on the same input and output sets are strongly isomorphic—or just equal—if they have the same dimension $K$ and there is a permutation $\pi:\{1,\ldots,K\}\mbox{\,$\rightarrowtail\kern-8.00003pt\rightarrow$\,}\{1,\ldots,K\}$ with inverse $\rho=\pi^{-1}$ such that

\begin{array}[]{c}Y^{\beta}_{i}=Y^{\alpha}_{\pi(i)},\quad W^{\beta}_{i}=W^{\alpha}_{\pi(i)}\quad(i=1,\ldots,K),\\ \beta_{0}(x,q_{1},\ldots,q_{K})=\alpha_{0}(x,q_{\rho(1)},\ldots,q_{\rho(K)}),\\ \beta_{i}(y_{\pi(i)},q_{1},\ldots,q_{K})=\alpha_{\pi(i)}(y_{\pi(i)},q_{\rho(1)},\ldots,q_{\rho(K)}),~{}(1\leq i\leq K).\end{array}

(3-8)

Directly from the definitions, strongly isomorphic recursors $\alpha,\beta:X\rightsquigarrow W$ compute the same partial function $\overline{\alpha}=\overline{\beta}:X\rightharpoonup W$ ; the idea, of course, is that strongly isomorphic recursors model the same algorithm, and so we will simply write $\alpha=\beta$ to indicate that $\alpha$ and $\beta$ are strongly isomorphic. This view is discussed in several of the articles cited above and we will not go into it here.⁸⁸8This finest relation of recursor equivalence was introduced (for monotone recursors) in [ynmbotik], and more general, weaker notions were considered in [ynmsicily, ynmeng] and in [ynmpaschalis2008].

For the recursors defined by recursive programs, this definition takes the following form, where for any $\Phi^{2}$ -term $F$ and sequences of distinct function and individual variables $\vec{\textup{{q}}},\vec{\textup{{p}}},\vec{\textup{{y}}},\vec{\textup{{x}}}$ (satisfying the obvious sort and arity conditions),

F\{\vec{\textup{{q}}}:\equiv\vec{\textup{{p}}},\vec{\textup{{y}}}:\equiv\vec{\textup{{x}}}\}\\ :\equiv\text{the result of replacing in $F$ every $\textup{{q}}_{i}$ by $\textup{{p}}_{i}$ and every $\textup{{y}}_{j}$ by $\textup{{x}}_{j}$}.

(3-9)

Lemma 3.1.

Two extended $\Phi$ -programs

\left\{\begin{array}[]{rcl}E(\vec{\textup{{x}}})&\equiv&E_{0}(\vec{\textup{{x}}}){\sf~{}where~{}}\Big{\{}\textup{{p}}_{1}(\vec{\textup{{x}}}_{1})=E_{1},\ldots,\textup{{p}}_{K}(\vec{\textup{{x}}}_{K})=E_{K}\Big{\}},\\ F(\vec{\textup{{x}}})&\equiv&F_{0}(\vec{\textup{{x}}}){\sf~{}where~{}}\Big{\{}\textup{{q}}_{1}(\vec{\textup{{y}}}_{1})=F_{1},\ldots,\textup{{q}}_{K^{\prime}}(\vec{\textup{{y}}}_{K^{\prime}})=E_{K^{\prime}}\Big{\}}\end{array}\right.

(3-10)

of the same sort and arity define the same recursor on a $\Phi$ -structure A exactly when they have the same number of parts $(K^{\prime}=K)$ and there is a permutation $\pi:\{0,\ldots,K\}\mbox{\,$\rightarrowtail\kern-8.00003pt\rightarrow$\,}\{0,\ldots,K\}$ with inverse $\rho=\pi^{-1}$ such that $\pi(0)=0$ and for all $\vec{p}$ and $\vec{x}$ ,

\begin{array}[]{rcl}({\textup{{A}}},\vec{p})&\models&E_{0}(\vec{x})=F_{0}\{\vec{\textup{{q}}}:\equiv\rho(\vec{\textup{{p}}})\}(\vec{x}),\\ ({\textup{{A}}},\vec{p})&\models&E_{i}(\vec{x}_{i})=F_{\pi(i)}\{\vec{\textup{{q}}}:\equiv\rho(\vec{\textup{{p}}}),\vec{\textup{{y}}}_{i}:\equiv\vec{\textup{{x}}}_{i}\}(\vec{x}_{i}),\quad(i=1,\ldots,K),\end{array}

(3-11)

where $\rho(\textup{{p}}_{1},\ldots,\textup{{p}}_{K})=(\textup{{p}}_{\rho(1)},\ldots,\textup{{p}}_{\rho(K)})$ .

{proofplain}

[The proof] is an exercise in managing messy notations and we leave it for Problem x3.1.

(Extended) program congruence

Two programs $E$ and $F$ are congruent if $F$ can be constructed from $E$ by an alphabetic change (renaming) of the bound individual and function variables in its parts (as in (3-9)) and a permutation of the equations in the body of $E$ . Congruence is obviously an equivalence relation on programs, congruent programs have the same free variables and we write

	$\displaystyle E\equiv_{c}F$	$\displaystyle\iff\kern-6.49994pt_{\mathrm{df}}~{}\text{ $E$ and $F$ are congruent},$
	$\displaystyle E(\vec{\textup{{x}}})\equiv_{c}F(\vec{\textup{{y}}})$	$\displaystyle\iff\kern-6.49994pt_{\mathrm{df}}~{}\vec{\textup{{x}}}\equiv\vec{\textup{{y}}}\text{ and }E\equiv_{c}F.$

By Lemma 3.1, congruent extended programs define equal recursors in every structure A, i.e., for every extended program $E(\vec{\textup{{x}}})$ and every program $F$ ,

E\equiv_{c}F\implies(\forall{\textup{{A}}})[\mathfrak{r}({\textup{{A}}},E(\vec{\textup{{x}}}))=\mathfrak{r}({\textup{{A}}},F(\vec{\textup{{x}}}))],

(3-12)

but the converse fails, cf. Problem x3.2.

More general recursors

The definition of recursors we gave is basically the one in [ynmbotik], except that we allowed there the parts $\alpha_{i}$ of $\alpha$ to be (monotone but) not continuous and to depend on the input set $X$ , i.e.,

\alpha=(\alpha_{0},\alpha_{1},\ldots,\alpha_{K}):X\rightsquigarrow W,

(

\star

)

such that for suitable sets $Y^{\alpha}_{1},W^{\alpha}_{1},\ldots,Y^{\alpha}_{K},W^{\alpha}_{K}$ ,

	$\displaystyle\alpha_{0}:(Y^{\alpha}_{1}\rightharpoonup W^{\alpha}_{1})\times\cdots\times(Y^{\alpha}_{K}\rightharpoonup W^{\alpha}_{K})\times X$	$\displaystyle\rightharpoonup W,\text{ and}$
	$\displaystyle\alpha_{i}:Y^{\alpha}_{i}\times(Y^{\alpha}_{1}\rightharpoonup W^{\alpha}_{1})\times\cdots\times(Y^{\alpha}_{K}\rightharpoonup W^{\alpha}_{K})\times X$	$\displaystyle\rightharpoonup W^{\alpha}_{i}\quad(1\leq i\leq K).$

Allowing the parts to be discontinuous is necessary for the theory of higher-type recursion which was the topic of [ynmflr, ynmbotik], but their dependence on the input set $X$ is not: the simpler, present notion is easier to work with and it includes the more general objects if we identify $\alpha$ in ( $\star$ ‣ 3) with

\alpha^{\prime}=(\alpha^{\prime}_{0},\alpha^{\prime}_{1},\ldots,\alpha^{\prime}_{K}):X\rightsquigarrow W

on the sets $Y^{\prime}_{i}=X\times Y_{i},W^{\prime}_{i}=W_{i}$ ( $1\leq i\leq K$ ) with

	$\displaystyle\alpha^{\prime}_{0}(x,p_{1},\ldots,p_{K})$	$\displaystyle=\alpha_{0}(\lambda y_{1}p(x,y_{1}),\ldots,\lambda y_{K}p(x,y_{K}),x),$
	$\displaystyle\alpha^{\prime}_{i}(x,y_{i},p_{1},\ldots,p_{K})$	$\displaystyle=\alpha_{i}(\lambda y_{1}p(x,y_{1}),\ldots,\lambda y_{K}p(x,y_{K}),x)\quad(1\leq i\leq K).$

Substantially more general notions of (monotone and continuous) recursors whose solution spaces are products of arbitrary complete posets were introduced in [ynmsicily, ynmeng] and (in greater detail) in [ynmpaschalis2008]; and it is routine to define nondeterministic algorithms along the same lines, using the fixpoint semantics of nondeterministic programs on page LABEL:ndfpsemantics of ARIC.

Problems for Section 3

Problem x3.1.

Prove Lemma 3.1. Hint: Check by a (routine) induction on terms, that for any two lists of distinct function variables $\vec{\textup{{p}}}\equiv\textup{{p}}_{1},\ldots,\textup{{p}}_{K}$ and $\vec{\textup{{q}}}\equiv\textup{{q}}_{1},\ldots,\textup{{q}}_{K}$ and individual variables $\vec{\textup{{y}}}\equiv\textup{{y}}_{1},\ldots,\textup{{y}}_{m}$ , $\vec{\textup{{x}}}\equiv\textup{{x}}_{1},\ldots,\textup{{x}}_{m}$ , and for every $\Phi\cup\{\vec{\textup{{q}}}\}$ -term $M$ whose free variables are all in the list $\vec{\textup{{y}}}$ ,

\text{if }\beta(\vec{y},\vec{q})=\textup{den}(({\textup{{A}}},\vec{q}),M(\vec{y})),\\ \text{then }\beta(\vec{x},\vec{p})=\textup{den}(({\textup{{A}}},\vec{p}),M\{\vec{\textup{{q}}}:\equiv\vec{\textup{{p}}},\vec{\textup{{y}}}:\equiv\vec{\textup{{x}}}\}(\vec{x})),

assuming, of course, that the sorts and arities of the function variables are such that the claim makes sense.

Problem x3.2.

Prove that for every explicit term $E$ of sort boole,

\models\textup{if }E~{}\textup{then }E~{}\textup{else }E=E

and use this to define two extended programs $E(\textup{{x}}),F(\textup{{x}})$ such that for all A, $\mathfrak{r}({\textup{{A}}},E(\vec{\textup{{x}}}))=\mathfrak{r}({\textup{{A}}},F(\vec{\textup{{x}}}))$ but $E\not\equiv_{c}F$ .

4 Canonical forms and intensions

In this section we will associate with each extended (recursive) $\Phi$ -program

E(\vec{\textup{{x}}})\equiv E_{0}(\vec{\textup{{x}}}){\sf~{}where~{}}\Big{\{}\textup{{p}}_{1}(\vec{\textup{{x}}}_{1})=E_{1},\ldots,\textup{{p}}_{K}(\vec{\textup{{x}}}_{K})=E_{K}\Big{\}}

(4-1)

a (unique up to congruence) canonical form $\textup{cf}(E(\vec{\textup{{x}}}))$ , so that the construction

({\textup{{A}}},E(\vec{\textup{{x}}}))\mapsto\mathfrak{r}({\textup{{A}}},\textup{cf}(E(\vec{\textup{{x}}})))

captures the algorithm expressed by $E(\vec{\textup{{x}}})$ in any $\Phi$ -structure A. This theory of canonical forms yields, in particular, a robust notion of the elementary (first-order) algorithms of a structure A.

Some more syntax

For the syntactic work we will do in this Section we need to enrich and simplify the notation of ARIC summarized in Section 2.

We introduce two function symbols $\textsf{cond}_{s}$ of arity $3$ and sort $s$ ( $\equiv{\texttt{ind}}$ or boole) and the abbreviations

\textsf{cond}_{s}(F_{0},F_{1},F_{2}):\equiv\textup{if }F_{0}~{}\textup{then }F_{1}~{}\textup{else }F_{2};

this is semantically inappropriate because the conditionals do not define (strict) partial functions, but it simplifies considerably the definition of pure, explicit $\Phi^{2}$ -terms on page 2 which now takes the form

F:\equiv{\textup{t\hskip-1.7ptt}}\mid{\textup{ff}}\mid\textup{{v}}_{i}\mid\textup{{c}}(F_{1},\ldots,F_{n}),

(Pure explicit terms)

where c is any $n$ -ary function symbol $\phi$ in the vocabulary $\Phi$ , or an $n$ -ary function variable $\textup{{p}}^{s,n}_{i}$ or $\textsf{cond}_{s}$ (and $n=3$ ).

Set representations

With each extended $\Phi$ -program $E(\vec{\textup{{x}}})$ as in (4-1), we associate its set representation

\textup{set}(E(\vec{\textup{{x}}}))=\Big{\{}E_{0}(\vec{\textup{{x}}}),~{}~{}\textup{{p}}_{1}(\vec{\textup{{x}}}_{1})=E_{1},\ldots,~{}~{}\textup{{p}}_{K}(\vec{\textup{{x}}}_{K})=E_{K}\Big{\}}.

(4-2)

Notice that $\textup{set}(E(\vec{\textup{{x}}}))$ determines $E_{0}(\vec{\textup{{x}}})$ , its only member which is not an equation, and by the definition of the congruence relation on page 3,

\textup{set}(E(\vec{\textup{{x}}}))=\textup{set}(F(\vec{\textup{{x}}}))\implies E\equiv_{c}F;

the converse implication fails, but the algorithm which computes canonical forms naturally operates on these set representations of extended terms, so having a notation for them simplifies greatly its specification.

Immediate and irreducible terms

A term $I$ is immediate if it is an individual variable or a direct call to a recursive variable applied to individual variables,\ynmChecked this out in detail.

I:\equiv\textup{{v}}_{i}\mid\textup{{p}}^{s,0}\mid\textup{{p}}^{n,s}_{i}(\textup{{u}}_{1},\ldots,\textup{{u}}_{n});

(Immediate terms)

and a term $F$ is irreducible if it is immediate, a Boolean constant, or of the form $\textup{{c}}(I_{1},\ldots,I_{l})$ with immediate $I_{1},\ldots,I_{n}$ ,⁹⁹9There are natural sort restrictions in these definitions and assignments of sorts to immediate and irreducible terms that we will suppress in the notation, cf. Problem x4.1.

F:\equiv I\mid{\textup{t\hskip-1.7ptt}}\mid{\textup{ff}}\mid\textup{{c}}(I_{1},\ldots,I_{n})

(Irreducible terms)

For example, $\textup{{u}},\textup{{p}}(\textup{{u}}_{1},\textup{{u}}_{2},\textup{{u}}_{1})$ are immediate, $\phi(\textup{{u}})$ and $\textup{{p}}(\textup{{u}},\textup{{q}}(\textup{{v}}),\textup{{r}}(\textup{{u}}))$ are irreducible but not immediate, and $\textup{{p}}(u,\phi(\textup{{v}}))$ is reducible.

Computationally, we think of immediate terms as “generalized variables” which can be accessed directly, like array entries $a[i],b[i,j,i]$ in some programming languages, and a term is irreducible if it can be computed by direct (not nested) calls to primitives or to recursive variables. Both of these properties of terms are (trivially) preserved by alphabetic change of variables (3-9),

\textit{if $F$ is immediate {(}irreducible{)}},\\ \textit{then $F\{\vec{\textup{{q}}}:\equiv\vec{\textup{{p}}},\vec{\textup{{y}}}:\equiv\vec{\textup{{x}}}\}$ is also immediate {(}irreducible{)}}.

(4-3)

Irreducible extended programs

An extended program

E(\vec{\textup{{x}}})\equiv E_{0}(\vec{\textup{{x}}}){\sf~{}where~{}}\Big{\{}\textup{{p}}_{1}(\vec{\textup{{x}}}_{1})=E_{1},\ldots,\textup{{p}}_{K}(\vec{\textup{{x}}}_{K})=E_{K}\Big{\}}

is irreducible if $E_{0},E_{1},\ldots,E_{K}$ are all irreducible terms. These are the extended programs which cannot be reduced by the basic process of reduction, to which we turn next.

Arrow reduction

We first define the simplest reduction relation

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F(\vec{\textup{{x}}}),\quad\vec{\textup{{x}}}\equiv(\textup{{x}}_{1},\ldots,\textup{{x}}_{n})

on extended programs with the same list of $n$ free variable, where $\textup{{p}},\textup{{q}}$ are function variables and $j$ is a number. The definition splits in two cases on the triple $(E,\textup{{p}},j)$ and it helps to call a function variable r fresh if it is none of the function variables $\textup{{p}}_{i}$ in the body of $E$ .

(1) The body case: one of the equations in the body of $E$ is

\textup{{p}}(\vec{\textup{{x}}}_{\textup{{p}}})=\textup{{c}}(G_{1},\ldots,G_{j-1},G_{j},G_{j+1},\ldots,G_{l})

and the term $G_{j}$ is not immediate. Put

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F(\vec{\textup{{x}}})\iff\kern-6.49994pt_{\mathrm{df}}~{}\text{$\textup{{q}}$ is fresh, $\textup{arity}(\textup{{q}})=\textup{arity}(\textup{{p}})$, ${\textup{sort}}(\textup{{q}})={\textup{sort}}(\textup{{p}})$, and}\\ \textup{set}(F(\vec{\textup{{x}}}))=\Big{(}\textup{set}(E(\vec{\textup{{x}}}))\setminus\Big{\{}\textup{{p}}(\vec{\textup{{x}}}_{\textup{{p}}})=\textup{{c}}(G_{1},\ldots,G_{j-1},G_{j},G_{j+1},\ldots,G_{l})\Big{\}}\Big{)}\\ \cup\Big{\{}\textup{{p}}(\vec{\textup{{x}}}_{\textup{{p}}})=\textup{{c}}(G_{1},\ldots,G_{j-1},\textup{{q}}(\vec{\textup{{x}}}_{\textup{{p}}}),G_{j+1},\ldots,G_{l}),\quad\textup{{q}}(\vec{\textup{{x}}}_{\textup{{p}}})=G_{j}\Big{\}}.

(2) The head case: p and q are both fresh, $\textup{arity}(\textup{{q}})=n$ , ${\textup{sort}}(\textup{{q}})={\textup{sort}}(E_{0})$ ,

E_{0}\equiv\textup{{c}}(G_{1},\ldots,G_{j-1},G_{j},G_{j+1},\ldots,G_{l})

and the term $G_{j}$ is not immediate. Put

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F(\vec{\textup{{x}}})\\ \iff\kern-6.49994pt_{\mathrm{df}}~{}\textup{set}(F(\vec{\textup{{x}}}))=\Big{(}\textup{set}(E(\vec{\textup{{x}}}))\setminus\Big{\{}\textup{{c}}(G_{1},\ldots,G_{j-1},G_{j},G_{j+1},\ldots,G_{l})\Big{\}}\Big{)}\\ \cup\Big{\{}\textup{{c}}(G_{1},\ldots,G_{j-1},\textup{{q}}(\vec{\textup{{x}}}),G_{j+1},\ldots,G_{l}),\quad\textup{{q}}(\vec{\textup{{x}}})=G_{j}\Big{\}}.

Notice that in the head case, the specific function variable p does not occur on the right-hand-side of the definition as it does in the body case, it is used only as a “marker” that this is the head case: we put

\textup{{p}}\sim_{E}\textup{{p}}^{\prime}\iff\kern-6.49994pt_{\mathrm{df}}~{}\text{either }\textup{{p}}\equiv\textup{{p}}^{\prime}\in\{\textup{{p}}_{1},\ldots,\textup{{p}}_{K}\}\quad\text{(the body case)}\\ \text{ or }\{\textup{{p}},\textup{{p}}^{\prime}\}\cap\{\textup{{p}}_{1},\ldots,\textup{{p}}_{K}\}=\emptyset\quad\text{(the head case)}.

(4-4)

\begin{array}[]{cc}\begin{array}[]{cclcl}E(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}\textup{{p}}(\vec{\textup{{x}}}_{\textup{{p}}})&=&\textup{{c}}(G_{1},G_{2},G_{3})\end{array}\\ &&\qquad\xrightarrow{(\textup{{p}},1,\textup{{q}})}\\ F(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}\textup{{p}}(\vec{\textup{{x}}}_{\textup{{p}}})&=&\textup{{c}}(\textup{{q}}(\vec{\textup{{x}}}_{\textup{{p}}}),G_{2},G_{3})\\ \textup{{q}}(\vec{\textup{{x}}}_{\textup{{p}}})&=&G_{1}\end{array}\\ &&\qquad\xrightarrow{(\textup{{p}},3,\textup{{q}}^{\prime})}\\ H(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}\textup{{p}}(\vec{\textup{{x}}}_{\textup{{p}}})&=&\textup{{c}}(\textup{{q}}(\vec{\textup{{x}}}_{\textup{{p}}}),G_{2},\textup{{q}}^{\prime}(\vec{\textup{{x}}}_{\textup{{p}}}))\\ \textup{{q}}^{\prime}(\vec{\textup{{x}}}_{\textup{{p}}})&=&G_{3}\\ \textup{{q}}(\vec{\textup{{x}}}_{\textup{{p}}})&=&G_{1}\end{array}\end{array}&\begin{array}[]{cclcl}E(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}\textup{{p}}(\vec{\textup{{x}}}_{\textup{{p}}})&=&\textup{{c}}(G_{1},G_{2},G_{3})\end{array}\\ &&\qquad\xrightarrow{(\textup{{p}},3,\textup{{q}}^{\prime})}\\ F^{\prime}(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}\textup{{p}}(\vec{\textup{{x}}}_{\textup{{p}}})&=&\textup{{c}}(G_{1},G_{2},\textup{{q}}^{\prime}(\vec{\textup{{x}}}_{\textup{{p}}}))\\ \textup{{q}}^{\prime}(\vec{\textup{{x}}}_{\textup{{p}}})&=&G_{3}\end{array}\\ &&\qquad\xrightarrow{(\textup{{p}},1,\textup{{q}})}\\ H(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}\textup{{p}}(\vec{\textup{{x}}}_{\textup{{p}}})&=&c(\textup{{q}}(\vec{\textup{{x}}}_{\textup{{p}}}),G_{2},\textup{{q}}^{\prime}(\vec{\textup{{x}}}_{\textup{{p}}}))\\ \textup{{q}}(\vec{\textup{{x}}}_{\textup{{p}}})&=&G_{1}\\ \textup{{q}}^{\prime}(\vec{\textup{{x}}}_{\textup{{p}}})&=&G_{3}\end{array}\end{array}\end{array}

Diagram 1: Amalgamation example, the body case.

Lemma 4.1.

(1) If $E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F(\vec{\textup{{x}}})$ and $E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F^{\prime}(\vec{\textup{{x}}})$ , then $F\equiv_{c}F^{\prime}$ .

(2) If $\textup{{q}}^{\prime}$ is fresh, $\textup{arity}(\textup{{q}}^{\prime})=\textup{arity}(\textup{{q}})$ and ${\textup{sort}}(\textup{{q}}^{\prime})={\textup{sort}}(\textup{{q}})$ , then

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F(\vec{\textup{{x}}})\implies E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}}^{\prime})}F\{\textup{{q}}:\equiv\textup{{q}}^{\prime}\}(\vec{\textup{{x}}}).

(4-5)

(3) If $\textup{{r}}^{\prime}$ is fresh and $\textup{{r}}^{\prime}\not\equiv\textup{{q}}$ , then

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F(\vec{\textup{{x}}})\implies E\{\textup{{r}}:\equiv\textup{{r}}^{\prime}\}(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F\{\textup{{r}}:\equiv\textup{{r}}^{\prime}\}(\vec{\textup{{x}}}).

(4-6)

(4) An extended program $E(\vec{\textup{{x}}})$ is irreducible if and only if there are no $\textup{{p}},j,\textup{{q}},F$ such that $E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F(\vec{\textup{{x}}})$ .

(5) For every extended program $E(\vec{\textup{{x}}})$ , every program $F$ , every triple $(\textup{{p}},j,\textup{{q}})$ and every structure A,

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F(\vec{\textup{{x}}})\implies\textup{den}({\textup{{A}}},E(\vec{x}))=\textup{den}({\textup{{A}}},F(\vec{x}))\quad(\vec{x}\in A^{n}).

Proof.

\ynm

If, for example, the body case applies and q is any variable with the correct arity and sort, we can take

F\equiv E_{0}{\sf~{}where~{}}\Big{\{}\textup{{p}}_{1}(\vec{\textup{{x}}}_{1})=E_{1},\ldots,\textup{{p}}_{j-1}(\textup{{x}}_{j-1}),\textup{{p}}_{j-1}(\textup{{x}}_{j-1}),\ldots,\\ \textup{{p}}_{K}(\vec{\textup{{x}}}_{K})=E_{K},\textup{{p}}(\vec{\textup{{x}}}_{\textup{{p}}})=\textup{{c}}(G_{1},\ldots,G_{j-1},\textup{{q}}(\vec{\textup{{x}}}_{\textup{{p}}}),G_{j+1},\ldots,G_{l}),~{}~{}\textup{{q}}(\vec{\textup{{x}}}_{\textup{{p}}})=G_{j}\Big{\}}.

(1) – (3) are trivial, once we get through the notation and (4) follows immediately by the definition. (5) is also quite easy, either by a direct, fixpoint argument or by applying the Head and Bekič-Scott rules of Theorem LABEL:recidentities of ARIC. ∎

\begin{array}[]{cc}\begin{array}[]{cclcl}E(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}&&\textup{{c}}(G_{1},G_{2},G_{3})\end{array}\\ &&\qquad\xrightarrow{(\textup{{p}},1,\textup{{q}})}\\ F(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}\textup{{c}}(\textup{{q}}(\vec{\textup{{x}}}),G_{2},G_{3})\\ \textup{{q}}(\vec{\textup{{x}}})=G_{1}\end{array}\\ &&\qquad\xrightarrow{(\textup{{p}},3,\textup{{q}}^{\prime})}\\ H(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}\textup{{c}}(\textup{{q}}(\vec{\textup{{x}}}),G_{2},\textup{{q}}^{\prime}(\vec{\textup{{x}}}))\\ \textup{{q}}^{\prime}(\vec{\textup{{x}}})=G_{3}\\ \textup{{q}}(\vec{\textup{{x}}})=G_{1}\end{array}\end{array}&\begin{array}[]{cclcl}E(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}\textup{{c}}(G_{1},G_{2},G_{3})\end{array}\\ &&\qquad\xrightarrow{(\textup{{p}},3,\textup{{q}}^{\prime})}\\ F^{\prime}(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}\textup{{c}}(G_{1},G_{2},\textup{{q}}^{\prime}(\vec{\textup{{x}}}))\\ \textup{{q}}^{\prime}(\vec{\textup{{x}}})=G_{3}\end{array}\\ &&\qquad\xrightarrow{(\textup{{p}},1,\textup{{q}})}\\ H(\vec{\textup{{x}}})&:&\begin{array}[]{rcl}c(\textup{{q}}(\vec{\textup{{x}}}),G_{2},\textup{{q}}^{\prime}(\vec{\textup{{x}}})\\ \textup{{q}}(\vec{\textup{{x}}})=G_{1}\\ \textup{{q}}^{\prime}(\vec{\textup{{x}}})=G_{3}\end{array}\end{array}\end{array}

Diagram 2: Amalgamation example, the head case.

The key property of the reduction calculus is the following, simple

Lemma 4.2 (Amalgamation).

If $(\textup{{p}}\not\sim_{E}\textup{{p}}^{\prime}\text{ or }j\neq j^{\prime}),~{}~{}\textup{{q}}\not\equiv\textup{{q}}^{\prime}$ ,

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F(\vec{\textup{{x}}})\text{ and }E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}}^{\prime},j^{\prime},\textup{{q}}^{\prime})}F^{\prime}(\vec{\textup{{x}}}),

(4-7)

then there is a program $H$ such that

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}}^{\prime},j^{\prime},\textup{{q}}^{\prime})}H(\vec{\textup{{x}}})\text{ and }E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}}^{\prime},j^{\prime},\textup{{q}}^{\prime})}F^{\prime}(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{p}})}H(\vec{\textup{{x}}}).

(4-8)

Proof.

This is obvious if $\textup{{p}}\not\sim_{E}\textup{{p}}^{\prime}$ , so that the two assumed reductions operate independently on different parts of $E$ and they give the same result if they are executed in either order. For the proof in the more interesting case when $\textup{{p}}\sim_{E}\textup{{p}}^{\prime}$ , we just put down as examples the relevant sequences of needed replacements in Diagram 1 for the body case when p is ternary, $j=1$ and $j^{\prime}=3$ , and in Diagram 2 for the head case when $j=1$ and $j^{\prime}=3$ . The general case is only notationally more complex. ∎

Extended program reduction

Using now arrow reduction, we define the one-step and full reduction relations on extended programs by

	$\displaystyle E(\vec{\textup{{x}}})\Rightarrow_{1}F(\vec{\textup{{x}}})\iff\kern-6.49994pt_{\mathrm{df}}~{}\text{ for some }\textup{{p}},j,\textup{{q}},E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F(\vec{\textup{{x}}}),$		(4-9)
	$\displaystyle E(\vec{\textup{{x}}})\Rightarrow F(\vec{\textup{{x}}})\iff\kern-6.49994pt_{\mathrm{df}}~{}E\equiv_{c}F\hskip 142.26378pt$		(4-10)
	$\displaystyle\hskip 28.45274pt\text{ or }(\exists F^{1},\ldots,F^{k})[E(\vec{\textup{{x}}})\Rightarrow_{1}F^{1}(\vec{\textup{{x}}})\Rightarrow_{1}\cdots\Rightarrow_{1}F^{k}(\vec{\textup{{x}}})\text{ and }F^{k}\equiv_{c}F].$

It is immediate from Lemma 4.1 that for every structure A, every extended program $E(\vec{\textup{{x}}})$ and every $F$ ,

E(\vec{\textup{{x}}})\Rightarrow F(\vec{\textup{{x}}})\implies\textup{den}({\textup{{A}}},E(\vec{x}))=\textup{den}({\textup{{A}}},F(\vec{x}))\quad(\vec{x}\in A^{n}),

(4-11)

but this is true even if we removed the all-important, non-immediacy restrictions in the definition of reduction, by Theorem LABEL:recidentities of ARIC. The claim is that

\textit{if }E(\vec{\textup{{x}}})\Rightarrow F(\vec{\textup{{x}}}),\\ \textit{then $E(\vec{\textup{{x}}})$ and $F(\vec{\textup{{x}}})$ express the same algorithm in every $\Phi$-structure},

but we will not get here into defending this naive view beyond what we said in Sections 1 and 3.

Caution. Reduction is a syntactic operation on extended recursive programs which models (very abstractly) partial compilation, bringing the mutual recursion expressed by the program to a useful form before the recursion is implemented without committing to any particular method of implementation. No real computation is done by it, and it does not embody any “optimization” of the algorithm expressed by an extended program, cf. Problem x4.4.

Size

Let ${\textup{size}}(E)$ be the number of occurrences of non-immediate proper subterms¹⁰¹⁰10See Problem x4.3 for a rigorous definition and proofs of the properties of ${\textup{size}}(E)$ . of some part of a program $E$ . For example, if

E\equiv\textup{{p}}(\textup{{x}},\boxed{\phi_{0}(\textup{{x}})}){\sf~{}where~{}}\\ \Big{\{}\textup{{p}}(\textup{{x}},\textup{{y}})=\textup{if }\boxed{\text{test}(\boxed{\phi_{0}(\textup{{x}})},\textup{{y}})}~{}\textup{then }\textup{{y}}~{}\textup{else }\boxed{\textup{{p}}(\textup{{x}},\boxed{\sigma(\textup{{x}},\textup{{y}})})}\Big{\}},

(4-12)

with $\Phi=\{\phi_{0},\textup{test},\sigma\}$ , then ${\textup{size}}(E)=5$ , because the proper subterms we count are the boxed $\phi_{0}(\textup{{x}})$ (twice), $\text{test}(\phi_{0}(\textup{{x}}),\textup{{y}})$ , $\sigma(\textup{{x}},\textup{{y}})$ , and $\textup{{p}}(\phi_{0}(\textup{{x}}),\sigma(\textup{{x}},\textup{{y}}))$ .

From the definition of reduction, easily, an extended program $E(\vec{\textup{{x}}})$ is irreducible exactly when ${\textup{size}}(E)=0$ and each one-step reduction lowers size by $1$ , so, trivially:

Lemma 4.3.

If $E(\vec{\textup{{x}}})\Rightarrow_{1}F^{1}(\vec{\textup{{x}}})\Rightarrow_{1}\cdots\Rightarrow_{1}F^{k}(\vec{\textup{{x}}})$ is a sequence of one-step reductions starting with $E(\vec{\textup{{x}}})$ , then $k\leq{\textup{size}}(E)$ ; and $F^{k}(\vec{\textup{{x}}})$ is irreducible if and only if $k={\textup{size}}(E)$ .

We illustrate the reduction process by constructing in Figure 3 a maximal reduction sequence starting with $E(\textup{{x}})$ with $E$ the program in (4-12).

\ynm

p(x, y) = if test(ϕ_0(x),y) then y else p(ϕ_0(x),σ(x,y))

\begin{array}[]{rcl}E(\textup{{x}})&:&\begin{array}[]{rcl}\textup{{p}}(\textup{{x}},\phi_{0}(\textup{{x}}))&&{\sf~{}where~{}}\{\\ \textup{{p}}(\textup{{x}},\textup{{y}})&=&\textup{if }\underline{\text{test}(\phi_{0}(\textup{{x}}),\textup{{y}})}~{}\textup{then }\textup{{y}}~{}\textup{else }\textup{{p}}(\textup{{x}},\sigma(\textup{{x}},\textup{{y}}))\}\end{array}\\ &&\qquad\xrightarrow{(\textup{{p}},1,\textup{{q}}_{1})~{}}\\ F^{1}(\textup{{x}})&:&\begin{array}[]{rcl}\textup{{p}}(\textup{{x}},\phi_{0}(\textup{{x}}))&&{\sf~{}where~{}}\{\\ \textup{{p}}(\textup{{x}},\textup{{y}})&=&\textup{if }\textup{{q}}_{1}(\textup{{x}},\textup{{y}})~{}\textup{then }\textup{{y}}~{}\textup{else }\textup{{p}}(\textup{{x}},\underline{\sigma(\textup{{x}},\textup{{y}})}),\\ \textup{{q}}_{1}(\textup{{x}},\textup{{y}})&=&\text{test}(\phi_{0}(\textup{{x}}),\textup{{y}})\}\end{array}\\ &&\qquad\xrightarrow{(\textup{{p}},3,\textup{{q}}_{2})~{}}\\ F^{2}(\textup{{x}})&:&\begin{array}[]{rcl}\textup{{p}}(\textup{{x}},\phi_{0}(\textup{{x}}))&&{\sf~{}where~{}}\{\\ \textup{{p}}(\textup{{x}},\textup{{y}})&=&\textup{if }\textup{{q}}_{1}(\textup{{x}},\textup{{y}})~{}\textup{then }\textup{{y}}~{}\textup{else }\textup{{q}}_{2}(\textup{{x}},\textup{{y}}),\\ \textup{{q}}_{2}(\textup{{x}},\textup{{y}})&=&\textup{{p}}(\underline{\phi_{0}(\textup{{x}})},\sigma(\textup{{x}},\textup{{y}})),\textup{{q}}_{1}(\textup{{x}},\textup{{y}})=\text{test}(\phi_{0}(\textup{{x}}),\textup{{y}})\}\end{array}\\ &&\qquad\xrightarrow{(\textup{{q}}_{2},1,q_{3})~{}}\\ F^{3}(\textup{{x}})&:&\begin{array}[]{rcl}\textup{{p}}(\textup{{x}},\phi_{0}(\textup{{x}}))&&{\sf~{}where~{}}\{\\ \textup{{p}}(\textup{{x}},\textup{{y}})&=&\textup{if }\textup{{q}}_{1}(\textup{{x}},\textup{{y}})~{}\textup{then }\textup{{y}}~{}\textup{else }\textup{{q}}_{2}(\textup{{x}},\textup{{y}}),\\ \textup{{q}}_{2}(\textup{{x}},\textup{{y}})&=&\textup{{p}}(\textup{{q}}_{3}(\textup{{x}},\textup{{y}}),\sigma(\textup{{x}},\textup{{y}})),\textup{{q}}_{3}(\textup{{x}},\textup{{y}})=\phi_{0}(\textup{{x}}),\\ \textup{{q}}_{1}(\textup{{x}},\textup{{y}})&=&\text{test}(\underline{\phi_{0}(\textup{{x}})},\textup{{y}})\}\end{array}\\ &&\qquad\xrightarrow{(\textup{{q}}_{1},1,\textup{{q}}_{4})~{}}\\ F^{4}(\textup{{x}})&:&\begin{array}[]{rcl}\textup{{p}}(\textup{{x}},\phi_{0}(\textup{{x}}))&&{\sf~{}where~{}}\{\\ \textup{{p}}(\textup{{x}},\textup{{y}})&=&\textup{if }\textup{{q}}_{1}(\textup{{x}},\textup{{y}})~{}\textup{then }\textup{{y}}~{}\textup{else }\textup{{q}}_{2}(\textup{{x}},\textup{{y}}),\\ \textup{{q}}_{2}(\textup{{x}},\textup{{y}})&=&\textup{{p}}(\textup{{q}}_{3}(\textup{{x}},\textup{{y}}),\underline{\sigma(\textup{{x}},\textup{{y}})}),\\ \textup{{q}}_{3}(\textup{{x}},\textup{{y}})&=&\phi_{0}(\textup{{x}}),\\ \textup{{q}}_{1}(\textup{{x}},\textup{{y}})&=&\text{test}(\textup{{q}}_{4}(\textup{{x}},\textup{{y}}),\textup{{y}}),\\ \textup{{q}}_{4}(\textup{{x}},\textup{{y}})&=&\phi_{0}(\textup{{x}})\}\end{array}\\ &&\qquad\xrightarrow{(\textup{{q}}_{2},2,\textup{{q}}_{5})~{}}\\ F^{5}(\textup{{x}})&:&\begin{array}[]{rcl}\textup{{p}}(\textup{{x}},\underline{\phi_{0}(\textup{{x}})})&&{\sf~{}where~{}}\{\\ \textup{{p}}(\textup{{x}},\textup{{y}})&=&\textup{if }\textup{{q}}_{1}(\textup{{x}},\textup{{y}})~{}\textup{then }\textup{{y}}~{}\textup{else }\textup{{q}}_{2}(\textup{{x}},\textup{{y}}),\\ \textup{{q}}_{2}(\textup{{x}},\textup{{y}})&=&\textup{{p}}(\textup{{q}}_{3}(\textup{{x}},\textup{{y}}),\textup{{q}}_{5}(\textup{{x}},\textup{{y}})),\\ \textup{{q}}_{5}(\textup{{x}},\textup{{y}})&=&\sigma(\textup{{x}},\textup{{y}}),\\ \textup{{q}}_{3}(\textup{{x}},\textup{{y}})&=&\phi_{0}(\textup{{x}}),~{}~{}\textup{{q}}_{1}(\textup{{x}},\textup{{y}})=\text{test}(\textup{{q}}_{4}(\textup{{x}},\textup{{y}}),\textup{{y}}),\\ \textup{{q}}_{4}(\textup{{x}},\textup{{y}})&=&\phi_{0}(\textup{{x}})\}\end{array}\\ &&\qquad\xrightarrow{(\textup{{r}},2,\textup{{q}}_{6})~{}}\\ F^{6}(\textup{{x}})&:&\begin{array}[]{rcl}\textup{{p}}(\textup{{x}},\textup{{q}}_{6}(x))&&{\sf~{}where~{}}\{\\ \textup{{p}}(\textup{{x}},\textup{{y}})&=&\textup{if }\textup{{q}}_{1}(\textup{{x}},\textup{{y}})~{}\textup{then }\textup{{y}}~{}\textup{else }\textup{{q}}_{2}(\textup{{x}},\textup{{y}}),\\ \textup{{q}}_{2}(\textup{{x}},\textup{{y}})&=&\textup{{p}}(\textup{{q}}_{3}(\textup{{x}},\textup{{y}}),\textup{{q}}_{5}(\textup{{x}},\textup{{y}})),\\ \textup{{q}}_{5}(\textup{{x}},\textup{{y}})&=&\sigma(\textup{{x}},\textup{{y}}),\\ \textup{{q}}_{3}(\textup{{x}},\textup{{y}})&=&\phi_{0}(\textup{{x}}),~{}~{}\textup{{q}}_{1}(\textup{{x}},\textup{{y}})=\text{test}(\textup{{q}}_{4}(\textup{{x}},\textup{{y}}),\textup{{y}},)\\ \textup{{q}}_{4}(\textup{{x}},\textup{{y}})&=&\phi_{0}(\textup{{x}}),\textup{{q}}_{6}(x)=\phi_{0}(\textup{{x}})\}\end{array}\end{array}

Diagram 3: A maximal reduction sequence for

E(\textup{{x}})

in (4-12).

Lemma 4.4.

For any extended program $E(\vec{\textup{{x}}})$ and irreducible $F^{1}(\vec{\textup{{x}}}),F^{2}(\vec{\textup{{x}}})$ ,

\Big{(}E(\vec{\textup{{x}}})\Rightarrow F^{1}(\vec{\textup{{x}}}){~{}\&~{}}E(\vec{\textup{{x}}})\Rightarrow F^{2}(\vec{\textup{{x}}})\Big{)}\implies F^{1}\equiv_{c}F^{2}.

{proofplain}

is by induction on ${\textup{size}}(E)$ .

Basis, ${\textup{size}}(E)\leq 1$ . If ${\textup{size}}(E)=0$ so that $E(\vec{\textup{{x}}})$ is irreducible, then the hypothesis gives immediately $E\equiv_{c}F^{1}$ and $E\equiv_{c}F^{2}$ , so $F^{1}\equiv_{c}F^{2}$ .

If ${\textup{size}}(E)=1$ , then there is exactly one part $E_{i}$ of $E$ which can activate a one-step reduction. In the head case, the definition gives

\textup{set}(F^{1}(\vec{\textup{{x}}}))=\Big{(}\textup{set}(E(\vec{\textup{{x}}}))\setminus\{\textup{{c}}(G_{1},\ldots,G_{j-1},G_{j},G_{j+1},\ldots,G_{l})\}\Big{)}\\ \cup\{\textup{{c}}(G_{1},\ldots,G_{j-1},\textup{{q}}(\vec{\textup{{x}}}),G_{j+1},\ldots,G_{l}),~{}~{}\textup{{q}}(\vec{\textup{{x}}})=G_{j}\}

with a fresh function variable q and the same equation for $\textup{set}(F^{2}(\vec{\textup{{x}}}))$ with a (possibly) different fresh function variable $\textup{{q}}^{\prime}$ ; but then $F^{1}\equiv_{c}F^{2}$ by (1) of Lemma 4.1. The same argument works in the body case.

Induction Step, $k={\textup{size}}(E)\geq 2$ . The hypothesis and Lemma 4.3 supply reduction sequences

E(\vec{\textup{{x}}})\Rightarrow_{1}F^{1,1}(\vec{\textup{{x}}})\Rightarrow_{1}\cdots\Rightarrow_{1}F^{1,k}(\vec{\textup{{x}}})\equiv F^{1}(\vec{\textup{{x}}}),\\ E(\vec{\textup{{x}}})\Rightarrow_{1}F^{2,1}(\vec{\textup{{x}}})\Rightarrow_{1}\cdots\Rightarrow_{1}F^{2,k}(\vec{\textup{{x}}})\equiv F^{2}(\vec{\textup{{x}}}),

(4-12)

so focussing on the first one-step reductions in the two hypotheses, there are triples $(\textup{{p}},j,\textup{{q}})$ and $(\textup{{p}}^{\prime},j^{\prime},\textup{{q}}^{\prime})$ such that

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F^{1,1}(\vec{\textup{{x}}})\text{ and }E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}}^{\prime},j^{\prime},\textup{{q}}^{\prime})}F^{2,1}(\vec{\textup{{x}}}).

(4-13)

We consider three cases on how this may arise:

Case 1: $(\textup{{p}}\not\sim_{E}\textup{{p}}^{\prime}\text{ or }j\neq j^{\prime})$ and $\textup{{q}}\not\equiv\textup{{q}}^{\prime}$ . The Amalgamation Lemma 4.2 applies in this case and supplies a program $H$ such that

F^{1,1}(\vec{\textup{{x}}})\Rightarrow_{1}H(\vec{\textup{{x}}})\text{ and }F^{2,1}(\vec{\textup{{x}}})\Rightarrow_{1}H(\vec{\textup{{x}}}).

(

\ast\ast

)

We fix a (maximal) reduction sequence

H(\vec{\textup{{x}}})\Rightarrow_{1}H^{3}(\vec{\textup{{x}}})\Rightarrow_{1}\cdots\Rightarrow_{1}H^{k}(\vec{\textup{{x}}})\text{ (with an irreducible $H^{k}(\vec{\textup{{x}}})$)}

(

\ast\ast\ast

)

and notice that from the reductions in the three starred displays,

F^{1,1}(\vec{\textup{{x}}})\Rightarrow F^{1}(\vec{\textup{{x}}}),~{}~{}F^{1,1}(\vec{\textup{{x}}})\Rightarrow H^{k}(\vec{\textup{{x}}}),\qquad F^{2,1}(\vec{\textup{{x}}})\Rightarrow F^{2}(\vec{\textup{{x}}}),~{}~{}F^{2,1}(\vec{\textup{{x}}})\Rightarrow H^{k}(\vec{\textup{{x}}});

but ${\textup{size}}(F^{1,1})={\textup{size}}(F^{2,1})=k-1$ , so the Induction Hypothesis applies and yields the required $F^{1}\equiv_{c}H^{k}\equiv_{c}F^{2}$ .

Case 2, $(\textup{{p}}\not\sim_{E}\textup{{p}}^{\prime}\text{ or }j\neq j^{\prime})$ but $\textup{{q}}\equiv\textup{{q}}^{\prime}$ , so the first one-step reductions supplied by the Hypothesis are

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F^{1,1}(\vec{\textup{{x}}})\text{ and }E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}}^{\prime},j^{\prime},\textup{{q}})}F^{2,1}(\vec{\textup{{x}}}).

If $\textup{{q}}^{\prime}$ is fresh but with the same arity and sort as q, then (2) of Lemma 4.1 gives

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}}^{\prime},j^{\prime},\textup{{q}}^{\prime})}F^{2,1}\{\textup{{q}}:\equiv\textup{{q}}^{\prime}\}(\vec{\textup{{x}}});

now (3) of Lemma 4.1 gives

F^{2,1}\{\textup{{q}}:\equiv\textup{{q}}^{\prime}\}(\vec{\textup{{x}}})\Rightarrow_{1}\cdots\Rightarrow_{1}F^{2,k}\{\textup{{q}}:\equiv\textup{{q}}^{\prime}\}(\vec{\textup{{x}}})

so that Case 1 applies and gives

F^{1}\equiv_{c}F^{2}\{\textup{{q}}:\equiv\textup{{q}}^{\prime}\};

but $F^{2}\{\textup{{q}}:\equiv\textup{{q}}^{\prime}\}\equiv_{c}F^{2}$ , which completes the argument in this case.

Case 3, $\textup{{p}}\sim\textup{{p}}^{\prime},j=j^{\prime}$ and $\textup{{q}}\not\equiv\textup{{q}}^{\prime}$ . The first one-step reductions supplied by the Hypothesis now are

E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}})}F^{1,1}(\vec{\textup{{x}}})\text{ and }E(\vec{\textup{{x}}})\xrightarrow{(\textup{{p}},j,\textup{{q}}^{\prime})}F^{2,1}(\vec{\textup{{x}}})

where q and $\textup{{q}}^{\prime}$ are distinct but have the same sort and arity; so

F^{1,1}\{\textup{{q}}:\equiv\textup{{r}}\}\equiv_{c}F^{2,1}\{\textup{{q}}^{\prime}:\equiv\textup{{r}}\}

with any fresh r; and then by the Induction Hypothesis and (3) of Lemma 4.1 as above,

F^{1,k}\equiv_{c}F^{1,k}\{\textup{{q}}:\equiv\textup{{r}}\}\equiv_{c}F^{2,k}\{\textup{{q}}^{\prime}:\equiv\textup{{r}}\}\equiv_{c}F^{2},

which is what we needed to prove.

Theorem 4.5 (Canonical forms).

Every extended program $E(\vec{\textup{{x}}})$ is reducible to a unique up to congruence irreducible extended program $\textup{cf}(E(\vec{\textup{{x}}}))$ , its canonical form.

In detail: with every extended program $E(\vec{\textup{{x}}})$ , we can associate an extended program $\textup{cf}(E(\vec{\textup{{x}}}))$ with the following properties: {ritemize}(3)

$\textup{cf}(E(\vec{\textup{{x}}}))$ is irreducible.

$E(\vec{\textup{{x}}})\Rightarrow\textup{cf}(E(\vec{\textup{{x}}}))$ .

If $F(\vec{\textup{{x}}})$ is irreducible and $E(\vec{\textup{{x}}})\Rightarrow F(\vec{\textup{{x}}})$ , then $F(\vec{\textup{{x}}})\equiv_{c}\textup{cf}(E(\vec{\textup{{x}}}))$ .

It follows that for all $E(\vec{\textup{{x}}}),F(\vec{\textup{{x}}})$ ,

E(\vec{\textup{{x}}})\Rightarrow F(\vec{\textup{{x}}})\implies\textup{cf}(E(\vec{\textup{{x}}}))\equiv_{c}\textup{cf}(F(\vec{\textup{{x}}})).

Proof.

If $E(\vec{\textup{{x}}})$ is irreducible, take $\textup{cf}(E(\vec{\textup{{x}}}))\equiv E(\vec{\textup{{x}}})$ , and if ${\textup{size}}(E)=k>0$ , fix a maximal sequence of one-step reductions

E(\vec{\textup{{x}}})\Rightarrow_{1}F^{1}(\vec{\textup{{x}}})\Rightarrow_{1}\cdots\Rightarrow_{1}F^{k}(\vec{\textup{{x}}})

and set $\textup{cf}(E(\vec{\textup{{x}}})):\equiv F^{k}(\vec{\textup{{x}}})$ ; now (1) and (2) are immediate and (3) follows from Lemma 4.4. The last claim holds because $F(\vec{\textup{{x}}})\Rightarrow\textup{cf}(F(\vec{\textup{{x}}}))$ , so

\textup{if }E(\vec{\textup{{x}}})\Rightarrow F(\vec{\textup{{x}}})\textup{ then }E(\vec{\textup{{x}}})\Rightarrow\textup{cf}(F(\vec{\textup{{x}}}))

and hence $\textup{cf}(E(\vec{\textup{{x}}}))\equiv_{c}\textup{cf}(F(\vec{\textup{{x}}}))$ by (3). ∎

It is clearly possible to assign to each extended program $E(\vec{\textup{{x}}})$ a specific canonical form, e.g., by choosing $\textup{cf}(E(\vec{\textup{{x}}}))$ to be the “lexicographically least” irreducible $F(\vec{\textup{{x}}})$ such that $E(\vec{\textup{{x}}})\Rightarrow F(\vec{\textup{{x}}})$ . This, however, is neither convenient nor useful, and it is best to think of $\textup{cf}(E(\vec{\textup{{x}}}))$ as denoting ambiguously any irreducible extended program such that $E(\vec{\textup{{x}}})\Rightarrow\textup{cf}(E(\vec{\textup{{x}}}))$ ; the theorem insures that any two such canonical forms of $E(\vec{\textup{{x}}})$ are congruent, and the construction in the proof gives a simple way to compute one of them.

Canonical forms for richer languages

A substantially more general version of the Canonical Form Theorem for functional structures was established in [ynmflr], and there are natural versions of it for many richer languages, including a suitable formulation of the typed $\lambda$ -calculus with recursion (PCF). The proof we gave here for McCarthy programs is considerably easier than these, more general results.

Intensions and elementary algorithms

The (referential) intension of an extended McCarthy program $E(\vec{\textup{{x}}})$ in a structure A is the recursor of its canonical form,

\textup{int}({\textup{{A}}},E(\vec{\textup{{x}}}))=_{\mathrm{df}}\mathfrak{r}({\textup{{A}}},\textup{cf}(E)(\vec{\textup{{x}}}));

(4-14)

it does not depend on the choice of $\textup{cf}(E)$ by (3-12), and it models the elementary algorithm expressed by $E(\vec{\textup{{x}}})$ in A.

\ynm

A (bad) discussion of intensional equivalence between programs, which I should at some point address.

An operation $\boldsymbol{\phi}({\textup{{A}}},E(\vec{\textup{{x}}}))$ on the extended programs of a structure A is an intensional invariant if it depends only on the intension of $E(\vec{\textup{{x}}})$ in A, i.e.,

\textup{int}({\textup{{A}}},E(\vec{\textup{{x}}}))=\textup{int}({\textup{{A}}},F(\vec{\textup{{x}}}))\implies\boldsymbol{\phi}({\textup{{A}}},E(\vec{\textup{{x}}}))=\boldsymbol{\phi}({\textup{{A}}},F(\vec{\textup{{x}}})).

(4-15)

We would expect that the most basic complexity measures on programs are intensional invariants, and they are, Problems LABEL:callsinvariant, LABEL:depthinvariant.

\ynm

In the structure $(A,\phi,\psi)$ where $\phi=\psi$ the terms $\phi(x)$ and $\psi(x)$ have the same intension, but do not have the same calls-complexity functions for $\phi$ because of the names; also, if $\phi(x)=x$ , then the total calls-complexity is not determined by the intension. Most likely I should avoid these picky issues here, it is not the place for them.

The recursor representation of an iterative algorithm

Using the precise definition of the classical notion of an iterator (sequential machine)

\mathfrak{i}=(\textup{input},S,\sigma,T,\textup{output}):X\rightsquigarrow W

in Section LABEL:cmodels of ARIC, let

{\textup{{A}}}_{\mathfrak{i}}=(A_{\mathfrak{i}},X,W,S,\textup{input},\sigma,T,\textup{output})

be the structure associated with $\mathfrak{i}$ , where

A_{\mathfrak{i}}=X\uplus W\uplus S

is the disjoint sum of the sets $X,W$ and $S$ , and let

E_{\mathfrak{i}}(\textup{{x}}):\equiv\textup{{q}}(\textup{input}(\textup{{x}})){\sf~{}where~{}}\{\textup{{q}}(\textup{{s}})=\textup{if }T(\textup{{s}})~{}\textup{then }\textup{output}(\textup{{s}})~{}\textup{else }\textup{{q}}(\sigma(\textup{{s}}))\}.

Proposition 4.6.

An iterator $\mathfrak{i}$ is determined by its intension

\textup{int}(\mathfrak{i})=\textup{int}({\textup{{A}}}_{\mathfrak{i}},E_{\mathfrak{i}}(\textup{{x}}))=\mathfrak{r}({\textup{{A}}}_{\mathfrak{i}},\textup{cf}(E_{\mathfrak{i}})(\textup{{x}})),

(4-16)

the recursive algorithm on ${\textup{{A}}}_{\mathfrak{i}}$ expressed by the extended tail recursive program $E_{\mathfrak{i}}(\vec{\textup{{x}}})$ associated with $\mathfrak{i}$ .

{proofplain}

is easy, if a bit technical, and we leave it for Problem x4.13^∗.

Theorem 4.2 in [ynmpaschalis2008] is a much stronger result along these lines, but it requires several definitions for its formulation and this simple Proposition expresses quite clearly the basic message: every property of an iterator $\mathfrak{i}$ can be expressed as a property of its associated recursor $\textup{int}(\mathfrak{i})$ , and so choosing recursive rather than iterative algorithms as the basic objects does not force us to miss any important facts about the simpler objects.

\ynm

About implementations (3)

An outline of a theory of implementations was sketched first in [ynmsicily] and then (in more detail) in [ynmpaschalis2008]. Briefly:

(1) A reducibility relation $\alpha\leq_{r}\beta$ on recursors $\alpha,\beta:X\rightsquigarrow W$ with the same input and output sets is defined, aiming to capture an exact notion of “simulation”.

(2) By definition, an implementation of a recursor $\alpha:X\rightsquigarrow W$ is any iterator $\mathfrak{i}:X\rightsquigarrow W$ such that $\alpha$ is reducible to the recursor associated with $\mathfrak{i}$ , $\alpha\leq_{r}\mathfrak{r}({\textup{{A}}}_{\mathfrak{i}},E_{\mathfrak{i}}(\textup{{x}}))$ .

By the main Theorem 5.3 in [ynmpaschalis2008],

the recursive machine $\mathfrak{i}({\textup{{A}}},E(\vec{\textup{{x}}}))$ associated with $E(\vec{\textup{{x}}})$ in A is an implementation of the recursor $\mathfrak{r}({\textup{{A}}},E(\vec{\textup{{x}}}))$ .

This provides some, initial evidence that this approach to a theory of implementations at least covers a most basic case, but not much more has been done on the question and we will not go into it here. What is worth emphasizing (again) is that the recursive machine $\mathfrak{i}({\textup{{A}}},E(\vec{\textup{{x}}}))$ lives in a structure ${\textup{{A}}}_{\mathfrak{i}}$ whose universe $A_{\mathfrak{i}}$ extends the universe $A$ of A and which has richer primitives. In other words, the usual assertion that recursion can be reduced to (or implemented by) iteration involves extending the universe of the structure and adding new primitives; and this is necessary by the results discussed in Section LABEL:recvstail and on page LABEL:recvstailII of ARIC. \ynmin [stolboushkin1983] and [tiuryn1989] cited on page LABEL:recistail.

Problems for Section 4

Problem x4.1.

A term $I$ is immediate of sort $s$ if $I\equiv\textup{{v}}_{i}$ and $s={\texttt{ind}}$ ; or $I\equiv\textup{{p}}^{s,0}$ ; or $I\equiv\textup{{p}}^{s,n}(\textup{{u}}_{1},\ldots,\textup{{u}}_{n})$ where each $\textup{{u}}_{i}$ is an individual variable. Write out a similar, full (recursive) definition of irreducible terms of sort $s$ .

\ynm

Problem x4.2.

Prove Lemma 4.1

Problem x4.3 (Size).

(1) Prove than there is exactly one function ${\textup{size}}(E)$ which assigns a number to every explicit term so that

\textit{if $E(\vec{\textup{{x}}})$ is irreducible, then ${\textup{size}}(E)=0$, and}\\ {\textup{size}}(\textup{{c}}(G_{1},\ldots,G_{l}))=\textstyle\sum\big{\{}{\textup{size}}(G_{i})+1\,\mid\,G_{i}\text{ is not immediate}\}.

(2) For a program $E$ as in (4-1), set

{\textup{size}}(E)=\textstyle\sum\big{\{}{\textup{size}}(E)_{i}\,\mid\,E_{i}\text{ is a part of }E\big{\}}

and check that

(1)

$E(\vec{\textup{{x}}})$ is irreducible if and only if ${\textup{size}}(E)=0$ , and
(2)

if $E(\vec{\textup{{x}}})\Rightarrow_{1}F(\vec{\textup{{x}}})$ , then ${\textup{size}}(F)={\textup{size}}(E)-1$ .

\ynm

This was with only the intuitive def. around. (1) If $E$ is irreducible, then every equation of $E$ is of the form

\textup{{p}}_{i}(\vec{\textup{{x}}}_{i})=\textup{{c}}(G_{1},\ldots,G_{l})

with immediate $G_{1},\ldots,G_{l}$ and no non-immediate term occurs as a proper subterm of a part of $E$ and ${\textup{size}}(E)=0$ ; and if $E$ is reducible, then in some equation of $E$ as above some $G_{i}$ is not immediate and is counted, so ${\textup{size}}(E)>0$ .

(2) If $\textup{{p}}_{i}(\vec{\textup{{x}}}_{i})=\textup{{c}}(G_{1},\ldots,G_{j-1},G_{j},G_{j+1},\ldots,G_{l})$ is an equation of $E$ , then the contribution it makes to the size of $E$ is

s=\sum_{1\leq i\leq l}{\textup{size}}(G_{i})+|\{i\,\mid\,G_{i}\text{ is not immediate}\}|.

If $G_{j}$ is not immediate, then the two equations in the program $E^{\prime}$ constructed by the one-step reduction contribute

s^{\prime}=\sum_{1\leq i\leq l,i\neq j}{\textup{size}}(G_{i})\\ +|\{i\,\mid\,i\neq j{~{}\&~{}}G_{i}\text{ is not immediate}\}|+{\textup{size}}(G_{j})=s-1

to the ${\textup{size}}(E^{\prime})$ .

Problem x4.4.

Notice that the irreducible extended program $F_{5}(\vec{\textup{{x}}})$ in Diagram 3 to which $E(\vec{\textup{{x}}})$ reduces calls for computing the value $\phi_{1}(x)$ twice in each “loop”. Define an extended program $F(\vec{\textup{{x}}})$ which computes the same partial function as $E(\vec{\textup{{x}}})$ but calls for computing $\phi_{1}(x)$ only once in each loop, and construct a complete reduction sequence for $E^{\ast}(\vec{\textup{{x}}})$ .

Problem x4.13^∗.

Prove Proposition 4.6. Hint: Use the reduction calculus to compute a canonical form of the program $E_{\mathfrak{i}}(\vec{\textup{{x}}})$ ,

\textup{cf}(E_{\mathfrak{i}}(\vec{\textup{{x}}}))\equiv_{c}\textup{{q}}(\textup{{p}}_{1}(\textup{{x}}))\textit{ where }\{\textup{{p}}_{1}(\textup{{x}})=\textup{input}(\textup{{x}}),\\ \textup{{p}}_{2}(\textup{{s}})=T(\textup{{s}}),~{}~{}\textup{{p}}_{3}(\textup{{s}})=\textup{output}(\textup{{s}}),~{}~{}\textup{{p}}_{4}(\textup{{s}})=\sigma(\textup{{s}}),~{}~{}\textup{{p}}_{5}(\textup{{s}})=\textup{{q}}(\textup{{p}}_{4}(\textup{{s}}))\\ \textup{{q}}(\textup{{s}})=\textup{if }\textup{{p}}_{2}(\textup{{s}})~{}\textup{then }\textup{{p}}_{3}(\textup{{s}})~{}\textup{else }\textup{{p}}_{5}(\textup{{s}})\},

so that $\textup{int}(\mathfrak{i}(\textup{{x}}))=\textup{int}({\textup{{A}}}_{\mathfrak{i}},E_{\mathfrak{i}}(\textup{{x}}))=\mathfrak{r}({\textup{{A}}}_{\mathfrak{i}},\textup{cf}(E_{\mathfrak{i}}(\textup{{x}})))=(\alpha_{0},\alpha_{1},\ldots,\alpha_{6})$ , where

\alpha_{i}:A_{\mathfrak{i}}\times(A_{\mathfrak{i}}\rightharpoonup A_{\mathfrak{i}})\times(A_{\mathfrak{i}}\rightharpoonup\mathbb{B})\times(A_{\mathfrak{i}}\rightharpoonup A_{\mathfrak{i}})^{4}\rightharpoonup A_{\mathfrak{i}}\quad(i=1,\ldots,6).

The result is practically trivial from this, except for the fact that we only know the canonical form up to congruence, so it is not immediate how to “pick out” from the functionals $\alpha_{i}$ the input, output, etc. of $\mathfrak{i}$ —and this is complicated by the fact that the sets $X,W$ and $S$ may overlap, in fact they may all be the same set. We need to use the details of the coding of many-sorted structures by single-sorted ones specified in Section 1D of ARIC.

5 Decidability of intensional program equivalence

\ynm

Why did I put this note here? The reduction to irreducible identities needs to be checked.

Two extended $\Phi$ -programs are intensionally equivalent on a $\Phi$ -structure A if they have equal intensions,

E(\vec{\textup{{x}}})=^{{\textup{{A}}}}_{\textup{int}}F(\vec{\textup{{x}}})\iff\kern-6.49994pt_{\mathrm{df}}~{}\textup{int}({\textup{{A}}},E(\vec{\textup{{x}}}))=\textup{int}({\textup{{A}}},F(\vec{\textup{{x}}}));

(5-1)

they are (globally) intensionally equivalent if they are intensionally equivalent on every infinite A,

E(\vec{\textup{{x}}})=_{\textup{int}}F(\vec{\textup{{x}}})\iff\kern-6.49994pt_{\mathrm{df}}~{}(\forall\text{ infinite }{\textup{{A}}})[\textup{int}({\textup{{A}}},E(\vec{\textup{{x}}}))=\textup{int}({\textup{{A}}},F(\vec{\textup{{x}}}))].

(5-2)

The main result of this section is

Theorem 5.1.

For every infinite $\Phi$ -structure A, the relation of intensional equivalence on A between extended $\Phi$ -programs is decidable.¹¹¹¹11Except for a reference to a published result in Problem x5.46^∗, I will not discuss in this paper the question of complexity of intensional equivalence on recursive programs, primarily because I do not know anything non-trivial about it.

We will also prove that global intensional equivalence between $\Phi$ -programs is essentially equivalent with congruence, and so it is decidable, Theorem 5.14.

Plan for the proof

To decide if $E(\vec{\textup{{x}}})$ and $F(\vec{\textup{{x}}})$ are intensionally equivalent on A, we compute their canonical forms, check that these have the same number $K+1$ of irreducible parts $E_{i},F_{j}$ and then (to apply Lemma 3.1) check if there is a permutation $\pi$ of $\{0,\ldots,K\}$ with inverse $\rho$ such that $\pi(0)=0$ and

\begin{array}[]{rcll}{\textup{{A}}}&\models&E_{0}(\vec{\textup{{x}}})&=F_{0}\{\vec{\textup{{q}}}:\equiv\rho(\vec{\textup{{p}}})\}(\vec{\textup{{x}}}),\\ {\textup{{A}}}&\models&E_{i}(\vec{\textup{{x}}}_{i})&=F_{\pi(i)}\{\vec{\textup{{q}}}:\equiv\rho(\vec{\textup{{p}}}),\vec{\textup{{y}}_{i}}_{\pi(i)}:\equiv\vec{\textup{{x}}}_{i}\},\quad(i=1,\ldots,K).\end{array}

These identities are all irreducible by (4-3), so the problem comes down to deciding the validity of irreducible term identities in A. The proof involves associating with each infinite $\Phi$ -structure A a finite list of “conditions” about its (finitely many) primitives—its dictionary—which codifies all the information needed to decide whether an arbitrary irreducible identity holds in A; it involves some fussy details, but fundamentally it is an elementary exercise in the equational logic of partial terms with conditionals. In classical terminology, it is (basically) a Finite Basis Theorem for the theory of irreducible identities which hold in an arbitrary, infinite (partial) $\Phi$ -structure.¹²¹²12The decidability of intensional equivalence is considerably simpler for total structures A, cf. Lemma 5.9 and substantially more difficult for the FLR-structures of [ynmflr] which allow monotone, discontinuous functionals among their primitives; see [ynmfrege]—and a correction which fills a gap in the proof of this result in [ynmfrege] which is posted (along with the paper) on ynm’s homepage. It is an open—and, I think, interesting—question for the acyclic recursive algorithms of [ynmlcms], where intensional synonymy models synonymy (or faithful translation) for fragments of natural language.

The satisfaction relation

We will also need in this Section the classical characterization of valid identities in terms of the satisfaction relation.

An assignment $\sigma$ in a $\Phi$ -structure A associates with each variable an object of the proper kind, i.e.,

\sigma(\textup{{v}}_{i})\in A,\quad\sigma(\textup{{p}}^{s,n}_{i}):A^{n}\rightharpoonup A_{s},

so that, for example, the boolean function variables $\textup{{p}}^{{\texttt{boole}},0}_{i}$ are assigned nullary partial functions $p:{\bf I}\rightharpoonup\mathbb{B}$ , essentially $\kern-1.00006pt\uparrow\kern 0.50003pt$ , tt or ff. For each $\Phi^{2}$ -term $E$ , we define

\sigma({\textup{{A}}},E)=_{\mathrm{df}}\text{ the value (perhaps $\kern-1.00006pt\uparrow\kern 0.50003pt$) of $E$ in ${\textup{{A}}}$ under $\sigma$}

by the usual compositional recursion and we set

{\textup{{A}}},\sigma\models E=F\iff\kern-6.49994pt_{\mathrm{df}}~{}\sigma({\textup{{A}}},E)=\sigma({\textup{{A}}},F).

If the variables of $E$ are among $\textup{{p}}_{1},\ldots,\textup{{p}}_{K},\textup{{x}}_{1},\ldots,\textup{{x}}_{n}$ , then

\sigma({\textup{{A}}},E)=\textup{den}(({\textup{{A}}},\sigma(\textup{{p}}_{1}),\ldots,\sigma(\textup{{p}}_{K})),\sigma(\textup{{x}}_{1}),\ldots,\sigma(\textup{{x}}_{n})),

and so by the notation on p. 2, for any two extended $\Phi^{2}$ -terms,

{\textup{{A}}}\models E(\vec{\textup{{x}}})=F(\vec{\textup{{x}}})\iff(\text{for all }\sigma)[{\textup{{A}}},\sigma\models E=F].

Preliminary constructions and notation

We fix for this Section an infinite $\Phi$ -structure ${\textup{{A}}}=(A,\Phi)$ (with finite $\Phi$ , as always) and we assume without loss of generality that among the symbols in $\Phi$ are $\psi_{\textup{t\hskip-1.7ptt}},\psi_{\textup{ff}},\textup{id}$ such that

\psi_{\textup{t\hskip-1.7ptt}}^{\textup{{A}}}={\textup{t\hskip-1.7ptt}},\quad\psi_{\textup{ff}}^{\textup{{A}}}={\textup{ff}},\quad\textup{id}^{\textup{{A}}}(t)=t.

We will often define assignments partially, only on the variables which occur in a term $E$ and specifying only finitely many values $\sigma(\textup{{p}}^{s,n}_{i})(\vec{x})$ of the function variables—those needed to compute $\sigma({\textup{{A}}},E)$ —perhaps to come back later and extend $\sigma$ when we view $E$ as a subterm of some $F$ .

Using the assumption that $A$ is infinite, we fix pairwise disjoint, infinite sets

A^{\texttt{ind}},~{}~{}A^{{\texttt{ind}},n}_{i}\subset A

whose union is co-infinite in $A$ and we fix an assignment $\ddot{\sigma}$ on the individual and function variables of sort ind using injections

\ddot{\sigma}:\{\textup{{v}}_{0},\textup{{v}}_{1},\ldots\}\rightarrowtail A^{\texttt{ind}},\quad\ddot{\sigma}(\textup{{p}}^{{\texttt{ind}},n}_{i}):A^{n}\rightarrowtail A^{{\texttt{ind}},n}_{i}.

We do not define at this point any values $\ddot{\sigma}(\textup{{p}}^{{\texttt{boole}},n}_{i})(x_{1},\ldots,x_{n})$ for function variables of sort boole.

A term $E$ is pure, algebraic if it is of sort ind and none of ${\textup{t\hskip-1.7ptt}},{\textup{ff}}$ , the conditional or any symbol from $\Phi$ occurs in it, i.e.,

E:\equiv\textup{{v}}_{i}\mid\textup{{p}}^{{\texttt{ind}},0}_{i}\mid\textup{{p}}^{{\texttt{ind}},n}_{i}(E_{1},\ldots,E_{n}).

(Pure, algebraic terms)

Lemma 5.2.

If $E,F$ are pure, algebraic terms, then

{\textup{{A}}},\ddot{\sigma}\models E=F\iff E\equiv F.

{proofplain}

of this (classical) result is by an easy induction on $\textup{length}(E)$ , using the Parsing Lemma in Problem x2.1.

Forms of irreducible identities

To organize the proof of Theorem 5.1, we will appeal to the (trivial) fact that every irreducible term is in exactly one of the following forms:

(1)

tt, ff, or an individual variable v.
(2)

Function variable application, $\textup{{p}}(z_{1},\ldots,z_{n})$ , where the terms $z_{1},\ldots,z_{n}$ are immediate of sort ind and $n\geq 0$ ; p can be of either sort, ind or boole.
(3)

Conditional, $\textup{if }z_{1}~{}\textup{then }z_{2}~{}\textup{else }z_{3}$ , with immediate $z_{1}$ of sort boole and immediate $z_{2},z_{3}$ of the same sort, ind or boole.
(4)

Primitive application, $\phi(z_{1},\ldots,z_{k})$ , $k\geq 0$ , with immediate $z_{1},\ldots,z_{k}$ , all of sort ind and $\phi$ of either sort.

It follows that every irreducible identity falls in one of the ten, pairwise exclusive forms [i-j] with $1\leq i\leq j\leq 4$ depending on the forms of its two sides. In dealing with these cases in the following Lemmas, we will use repeatedly the (trivial) fact that

if $u,v$ are immediate of sort boole and $u\not\equiv v$ , then we can extend $\ddot{\sigma}$ to the function variables which occur in $u$ and $v$ so that $\ddot{\sigma}(u)$ and $\ddot{\sigma}(v)$ take any pre-assigned truth values (or diverge).

Lemma 5.3 ([2-2]).

For any two irreducible terms in form (2),

{\textup{{A}}}\models\textup{{p}}(z_{1},\ldots,z_{n})=\textup{{q}}(w_{1},\ldots,w_{m})\iff\textup{{p}}(z_{1},\ldots,z_{n})\equiv\textup{{q}}(w_{1},\ldots,w_{m}).

Proof.

Let $E\equiv\textup{{p}}(z_{1},\ldots,z_{n})$ and $F\equiv\textup{{q}}(w_{1},\ldots,w_{m})$ , assume ${\textup{{A}}}\models E=F$ and consider two cases:

Case 1, ${\textup{sort}}(\textup{{p}})={\textup{sort}}(\textup{{q}})={\texttt{ind}}$ ; now $E$ and $F$ are pure algebraic and so by Lemma 5.2:

{\textup{{A}}}\models E=F\implies{\textup{{A}}},\ddot{\sigma}\models E=F\implies E\equiv F.

Case 2, ${\textup{sort}}(\textup{{p}})={\textup{sort}}(\textup{{q}})={\texttt{boole}}$ . If $\textup{{p}}\not\equiv\textup{{q}}$ , extend $\ddot{\sigma}$ by setting

\ddot{\sigma}(\textup{{p}})(x_{1},\ldots,x_{n})={\textup{t\hskip-1.7ptt}},\quad\ddot{\sigma}(\textup{{q}})(y_{1},\ldots,y_{m})={\textup{ff}}

and check that ${\textup{{A}}},\ddot{\sigma}\not\models E=F$ ; so $\textup{{p}}\equiv\textup{{q}}$ , hence $n=m$ , and it is enough to prove that $z_{1}\equiv w_{1},\ldots,z_{n}\equiv w_{n}$ . If not, then for some $i$ , $\ddot{\sigma}(z_{i})\neq\ddot{\sigma}(w_{i})$ by Lemma 5.2; and then we can choose some $a\in A$ and extend $\ddot{\sigma}$ by setting

\ddot{\sigma}(\textup{{p}})(x_{1},\ldots,x_{n})=\textup{if }(x_{i}=a)~{}\textup{then }{\textup{t\hskip-1.7ptt}}~{}\textup{else }{\textup{ff}},

so that $\ddot{\sigma}(\textup{{p}}(z_{1},\ldots,z_{n}))\neq\ddot{\sigma}(\textup{{p}}(w_{1},\ldots,w_{n}))$ , which contradicts the hypothesis. ∎

Lemma 5.4 ([3-3]).

If $z_{1},\ldots,w_{3}$ are all immediate, then

{\textup{{A}}}\models\textup{if }z_{1}~{}\textup{then }z_{2}~{}\textup{else }z_{3}=\textup{if }w_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3},\\ \iff z_{1}\equiv w_{1},z_{2}\equiv w_{2},z_{3}\equiv w_{3}.

Proof.

Assume ${\textup{{A}}}\models\textup{if }z_{1}~{}\textup{then }z_{2}~{}\textup{else }z_{3}=\textup{if }w_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3}$ , note that $z_{1},w_{1}$ must be of sort boole and consider cases:

Case (a): $z_{2},z_{3},w_{2},w_{3}$ are of sort ind. If $z_{1}\not\equiv w_{1}$ , extend $\ddot{\sigma}$ so that

\ddot{\sigma}(z_{1})={\textup{t\hskip-1.7ptt}},\quad\ddot{\sigma}(w_{1})\kern-1.00006pt\uparrow\kern 0.50003pt

which gives $\ddot{\sigma}(\textup{if }z_{1}~{}\textup{then }z_{2}~{}\textup{else }z_{3})=\ddot{\sigma}(z_{2})\text{ while }\ddot{\sigma}(\textup{if }w_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3})\kern-1.00006pt\uparrow\kern 0.50003pt$ , hence $z_{1}\equiv w_{1}$ . Setting $\ddot{\sigma}(z_{1})=\ddot{\sigma}(w_{1})={\textup{t\hskip-1.7ptt}}$ gives $z_{2}\equiv w_{2}$ and setting $\ddot{\sigma}(z_{1})={\textup{ff}}$ gives $z_{3}\equiv w_{3}$ .

Case (b): all $z_{1},z_{2},z_{3},w_{1},w_{2},w_{3}$ are of sort boole. Assume first, towards a contradiction that $z_{1}\not\equiv w_{1}$ , and set

\ddot{\sigma}(z_{1})={\textup{t\hskip-1.7ptt}},\quad\ddot{\sigma}(w_{1})\kern-1.00006pt\uparrow\kern 0.50003pt.

If $z_{2}\not\equiv w_{1}$ , we can further set $\ddot{\sigma}(z_{2})={\textup{t\hskip-1.7ptt}}$ (even if $z_{2}\equiv z_{1}$ ) and get a contradiction, so $z_{2}\equiv w_{1}$ . The same argument, with $\ddot{\sigma}(z_{1})={\textup{ff}}$ this time gives $z_{3}\equiv w_{1}$ ; and the symmetric argument gives $w_{2}\equiv w_{3}\equiv z_{1}$ , so what we have now is

{\textup{{A}}}\models\textup{if }z_{1}~{}\textup{then }w_{1}~{}\textup{else }w_{1}=\textup{if }w_{1}~{}\textup{then }z_{1}~{}\textup{else }z_{1}

with the added hypothesis that $z_{1}\not\equiv w_{1}$ ; but then we can set

\ddot{\sigma}(z_{1})={\textup{t\hskip-1.7ptt}},\quad\ddot{\sigma}(w_{1})={\textup{ff}},

which gives

\ddot{\sigma}(\textup{if }z_{1}~{}\textup{then }w_{1}~{}\textup{else }w_{1})=\ddot{\sigma}(w_{1})={\textup{ff}},\quad\ddot{\sigma}(\textup{if }w_{1}~{}\textup{then }z_{1}~{}\textup{else }z_{1})=\ddot{\sigma}(z_{1})={\textup{t\hskip-1.7ptt}},

which violates the hypothesis; so $z_{1}\equiv w_{1}$ , and the hypothesis in Case (b) takes the form

{\textup{{A}}}\models\textup{if }z_{1}~{}\textup{then }z_{2}~{}\textup{else }z_{3}=\textup{if }z_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3}.

Now assume towards a contradiction that $z_{2}\not\equiv w_{2}$ ; so one of them is different from $z_{1}$ , suppose it is $w_{2}$ ; so we now get a contradiction by setting $\ddot{\sigma}(z_{1})=\ddot{\sigma}(z_{2})={\textup{t\hskip-1.7ptt}}$ and $\ddot{\sigma}(w_{2})={\textup{ff}}$ .

At this point we know that $z_{1}\equiv w_{1}$ and $z_{2}\equiv w_{2}$ , and the last bit that $z_{3}\equiv w_{3}$ is simpler. ∎

Lemma 5.5 ([2-3]).

If $z_{1},\ldots,z_{m},w_{1},w_{2},w_{3}$ are all immediate, then

{\textup{{A}}}\models\textup{{p}}(z_{1},\ldots,z_{m})=\textup{if }w_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3}\\ \iff\textup{{p}}(z_{1},\ldots,z_{m})\equiv w_{1}\equiv w_{2}\equiv w_{3}.

Proof.

Notice first that if p is of sort ind, then

{\textup{{A}}},\sigma\not\models\textup{{p}}(z_{1},\ldots,z_{m})=\textup{if }w_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3}

for any $\sigma$ which converges on $\textup{{p}}(z_{1},\ldots,z_{m})$ but such that $\sigma(w_{1})\kern-1.00006pt\uparrow\kern 0.50003pt$ .

If p is of sort boole and $\textup{{p}}(z_{1},\ldots,z_{m})\not\equiv w_{1}$ , then we can get a $\sigma$ such that $\sigma(\textup{{p}}(z_{1},\ldots,z_{m}))\kern-1.99997pt\downarrow\kern 1.00006pt$ but $\sigma(w_{1})\kern-1.00006pt\uparrow\kern 0.50003pt$ so the hypothesis fails again. It follows that $\textup{{p}}(z_{1},\ldots,z_{m})\equiv w_{1}$ and the hypothesis takes the form

{\textup{{A}}}\models w_{1}=\textup{if }w_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3}.

If $w_{2}\not\equiv w_{1}$ , we get a contradiction by setting $\sigma(w_{1})={\textup{t\hskip-1.7ptt}},\sigma(w_{2})\kern-1.00006pt\uparrow\kern 0.50003pt$ , and similarly if $w_{1}\not\equiv w_{3}$ . ∎

Lemma 5.6 ([3-4]).

If $z_{1},\ldots,z_{m},w_{1},w_{2},w_{3}$ are all immediate, then

{\textup{{A}}}\not\models\phi(z_{1},\ldots,z_{m})=\textup{if }w_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3}.

Proof.

Notice that $w_{1}\not\equiv z_{i}$ , since ${\textup{sort}}(w_{1})={\texttt{boole}}$ while ${\textup{sort}}(z_{i})={\texttt{ind}}$ and consider three cases:

If $\phi^{\textup{{A}}}(\ddot{\sigma}(z_{1}),\ldots,\ddot{\sigma}(z_{m}))\kern-1.99997pt\downarrow\kern 1.00006pt$ , extend $\ddot{\sigma}$ by setting $\ddot{\sigma}(w_{1})\kern-1.00006pt\uparrow\kern 0.50003pt$ so

{\textup{{A}}},\ddot{\sigma}\not\models\phi(z_{1},\ldots,z_{m})=\textup{if }w_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3}.

(

\ast

)

If $\phi^{\textup{{A}}}(\ddot{\sigma}(z_{1}),\ldots,\ddot{\sigma}(z_{m}))\kern-1.00006pt\uparrow\kern 0.50003pt$ and $\phi$ is of sort ind, then setting now $\ddot{\sigma}(w_{1})={\textup{t\hskip-1.7ptt}}$ gives ( $\ast$ ‣ 5) again, since $\ddot{\sigma}(w_{2})\kern-1.99997pt\downarrow\kern 1.00006pt$ .

If $\phi^{\textup{{A}}}(\ddot{\sigma}(z_{1}),\ldots,\ddot{\sigma}(z_{m}))\kern-1.00006pt\uparrow\kern 0.50003pt$ and $\phi$ is of sort boole, then we can set $\ddot{\sigma}(w_{1})=\ddot{\sigma}(w_{2})={\textup{t\hskip-1.7ptt}}$ which again gives ( $\ast$ ‣ 5). ∎

Lemma 5.7.

If $z_{1},\ldots,z_{k},w_{1},\ldots,w_{m}$ are all immediate, then

\text{if }{\textup{{A}}}\models\phi(z_{1},\ldots,z_{k})=\textup{{p}}(w_{1},\ldots,w_{m}),\\ \text{then $\textup{{p}}(w_{1},\ldots,w_{m})\equiv z_{j}$ for some $j$}\\ \text{and so }{\textup{{A}}}\models\phi(z_{1},\ldots,z_{k})=z_{j}=\textup{id}(z_{j}).

In particular, with $k>0$ and $m=0$ ,

{\textup{{A}}}\models\phi(z_{1},\ldots,z_{k})=\textup{{p}}\implies\textup{{p}}\equiv z_{j}\text{ for some }j,

and with $k=0$ , ${\textup{{A}}}\not\models\phi=\textup{{p}}(z_{1},\ldots,z_{m})$ .

Proof.

Assume the hypothesis ${\textup{{A}}}\models\phi(z_{1},\ldots,z_{k})=\textup{{p}}(w_{1},\ldots,w_{m})$ and also that $\textup{{p}}(w_{1},\ldots,w_{m})\not\equiv z_{j}$ for any $j$ .

(a) ${\textup{sort}}(\phi)={\texttt{ind}}$ ; because if ${\textup{sort}}(\phi)={\texttt{boole}}$ , then

\text{either }\phi^{\textup{{A}}}(\ddot{\sigma}(z_{1}),\ldots,\ddot{\sigma}(z_{k})))\kern-1.99997pt\downarrow\kern 1.00006pt\text{ and we can set }\ddot{\sigma}(\textup{{p}})(\ddot{\sigma}(z_{1}),\ldots,\ddot{\sigma}(z_{k}))\kern-1.00006pt\uparrow\kern 0.50003pt\\ \text{or }\phi^{\textup{{A}}}(\ddot{\sigma}(z_{1}),\ldots,\ddot{\sigma}(z_{k})))\kern-1.00006pt\uparrow\kern 0.50003pt\text{ and we can set }\ddot{\sigma}(\textup{{p}})(\ddot{\sigma}(z_{1}),\ldots,\ddot{\sigma}(z_{k}))\kern-1.99997pt\downarrow\kern 1.00006pt.

( $\ast$ ) Call $E$ relevant if $E\equiv z_{i}$ or $E\equiv w_{i}$ for some $i$ , set $\sigma(\textup{{x}})=\ddot{\sigma}(\textup{{x}})$ for every individual variable which occurs in a relevant term, and for a function variable $\textup{{q}}^{{\texttt{ind}},n}$ of sort ind and arity $n$ , set

\sigma(\textup{{q}}^{{\texttt{ind}},n})(x_{1},\ldots,x_{n})=v\iff\kern-6.49994pt_{\mathrm{df}}~{}\text{ there is a relevant }F\equiv\textup{{q}}^{{\texttt{ind}},n}(\textup{{x}}_{1},\ldots,\textup{{x}}_{n})\\ \text{such that }x_{1}=\ddot{\sigma}(\textup{{x}}_{1}),\ldots,x_{n}=\ddot{\sigma}(\textup{{x}}_{n}),{~{}\&~{}}\ddot{\sigma}(\textup{{q}}^{{\texttt{ind}},n})(x_{1},\ldots,x_{n})=v.

(b) For each pure, algebraic term $E$ , $\sigma(E)$ is defined exactly when $E$ is relevant and then $\sigma(E)=\ddot{\sigma}(E)$ . This is because by the definition, if $\sigma(E)$ is defined, then $\sigma(E)=\ddot{\sigma}(E)=\ddot{\sigma}(F)$ for a relevant $F$ and hence $E\equiv F$ by Lemma 5.2.

It follows that $\sigma(\textup{{p}}(w_{1},\ldots,w_{m}))$ is not defined by the stipulation ( $\ast$ ), since $\textup{{p}}(w_{1},\ldots,w_{m})\not\equiv z_{i}$ by the hypothesis and each $w_{j}$ is a proper subterm of $\textup{{p}}(w_{1},\ldots,w_{m})$ , so it cannot be equal to it; and then we reach a contradiction by setting $\sigma(\textup{{p}})$ so that {endproofeqnarray*} σ(p)(¨σ(w_1),…,¨σ(w_m))≠ϕ^A(¨σ(z_1),…,¨σ(z_k). ∎

Lemma 5.8.

If there is an algorithm which decides the validity in A of identities in form [4-4].

{\textup{{A}}}\models\phi(z_{1},\ldots,z_{n})=\psi(z_{n+1},\ldots,z_{l})\quad(\phi,\psi\in\Phi,z_{1},\ldots,z_{l}\textup{ immediate}),

then the relation of intensional equivalence on A between extended $\Phi$ -programs is decidable.

Proof.

\ynm

It is useful to assume here that there are nullary constants $\phi_{\textup{t\hskip-1.7ptt}},\phi_{\textup{ff}}$ of sort boole and a unary function symbol id of sort id in $\Phi$ such that

\phi_{\textup{t\hskip-1.7ptt}}^{\textup{{A}}}={\textup{t\hskip-1.7ptt}},\quad\phi_{\textup{ff}}^{\textup{{A}}}={\textup{ff}}\text{~{}~{} and ~{}~{} }\textup{id}^{\textup{{A}}}(x)=x,

(5-3)

by adding them to $\Phi$ if necessary.

Forms [1-1], [1-2] and [1-3] are trivial, cf. Problem x5.1.

Form [2-2] is decided by Lemma 5.3 and forms [2-3], [3-3] and [3-4] are decided by Lemmas 5.5, 5.4 and 5.6.

Identities in forms [4-1] (equivalent to [1-4]) and [4-2] (equivalent to [2-4]) are either

\phi(z_{1},\ldots,z_{n})={\textup{t\hskip-1.7ptt}}=\psi_{\textup{t\hskip-1.7ptt}}\text{ and }\phi(z_{1},\ldots,z_{n})={\textup{ff}}=\psi_{\textup{ff}}

which are in form [4-4] with our assumption that $\psi_{\textup{t\hskip-1.7ptt}},\psi_{\textup{ff}}\in\Phi$ , or by Lemma 5.7 and our assumption that $\textup{id}\in\Phi$ , they are equivalent in A to identities

\phi(z_{1},\ldots,z_{n})=z_{j}=\textup{id}(z_{j})

for some $j$ (that we can compute), cf. Problem x5.11. ∎

At this point there is a fork in the road: it is very easy to finish the proof of Theorem 5.1 for total structures, while the general case of arbitrary partial structures involves some technical difficulties. We do the simple thing first.

Lemma 5.9.

Suppose A is a total, infinite, $\Phi$ -structure, $\phi,\psi\in\Phi$ and $z_{1},\ldots,z_{l}$ are immediate terms.

(1) If ${\textup{{A}}}\models\phi(z_{1},\ldots,z_{n})=\psi(z_{n+1},\ldots,z_{l})$ , then for all $i=1,\ldots,n$ ,

\text{if $z_{i}$ is not an individual variable, then there is a $j$ such that }z_{i}\equiv z_{n+j}.

(2) There is a sequence $\textup{{x}}_{1},\ldots,\textup{{x}}_{l}$ of (not necessarily distinct) individual variables such that

(\forall i,j)\Big{(}\textup{{x}}_{i}\equiv\textup{{x}}_{j}\iff z_{i}\equiv z_{j}\Big{)},

and for any such sequence

{\textup{{A}}}\models\phi(z_{1},\ldots,z_{n})=\psi(z_{n+1},\ldots,z_{l})\\ \iff{\textup{{A}}}\models\phi(\textup{{x}}_{1},\ldots,\textup{{x}}_{n})=\psi(\textup{{x}}_{n+1},\ldots,\textup{{x}}_{l}).

(5-4)

Proof.

(1) Suppose towards a contradiction that

{\textup{{A}}}\models\phi(z_{1},\ldots,z_{n})=\psi(z_{n+1},\ldots,z_{l})

but $z_{i}\equiv\textup{{p}}(\textup{{x}}_{1},\ldots,\textup{{x}}_{k})$ is not identical with any $z_{n+j}$ and define the assignment $\sigma$ which agrees with $\ddot{\sigma}$ on all individual and function variables except that

\sigma(\textup{{p}})(\ddot{\sigma}(\textup{{x}}_{1}),\ldots,\ddot{\sigma}(\textup{{x}}_{k}))\kern-1.00006pt\uparrow\kern 0.50003pt;

this gives $\phi^{\textup{{A}}}(\sigma(z_{1}),\ldots,\sigma(z_{n}))\kern-1.00006pt\uparrow\kern 0.50003pt$ and $\phi^{\textup{{A}}}(\sigma_{n+1},\ldots\sigma(z_{l}))\kern-1.99997pt\downarrow\kern 1.00006pt$ , which contradicts the hypothesis.

(2) To construct a sequence $\textup{{x}}_{1},\ldots,\textup{{x}}_{l}$ of individual variables with the required property by induction on $i$ : pick a fresh $\textup{{x}}_{i}$ if $z_{i}\not\equiv z_{j}$ for every $j<i$ and otherwise let $\textup{{x}}_{i}:\equiv\textup{{x}}_{j}\text{ for the least (and hence every) $j<i$ such that $z_{i}\equiv z_{j}$}$ .

Suppose $\textup{{x}}_{1},\ldots,\textup{{x}}_{l}$ are such that $\textup{{x}}_{i}\equiv\textup{{x}}_{j}\iff z_{i}\equiv z_{j}$ , assume that

{\textup{{A}}}\models\phi(z_{1},\ldots,z_{n})=\psi(z_{n+1},\ldots,z_{l})

and for any assignment $\sigma$ define $\tau$ so that

(\forall i=1,\ldots,k)[\tau(z_{i})=\sigma(\textup{{x}}_{i})],

which is possible by the assumption relating $\textup{{x}}_{i}$ and $z_{i}$ . The hypothesis gives

\phi^{\textup{{A}}}(\tau(z_{1}),\ldots,\tau(z_{n}))=\psi^{\textup{{A}}}(\tau(z_{n+1}),\ldots,\tau(z_{l})),

which then implies

\phi^{\textup{{A}}}(\sigma(\textup{{x}}_{1}),\ldots,\sigma(\textup{{x}}_{n}))=\psi^{\textup{{A}}}(\sigma(\textup{{x}}_{n+1}),\ldots,\sigma(\textup{{x}}_{l})),

and since $\sigma$ was arbitrary we get the required

{\textup{{A}}}\models\phi(\textup{{x}}_{1},\ldots,\textup{{x}}_{n})=\psi(\textup{{x}}_{n+1},\ldots,\textup{{x}}_{l}).

The converse is proved similarly. ∎

The point of the Lemma is that, for given $\phi,\psi$ , there are infinitely many identities which may or may not satisfy the left-hand-side of (5-4), because there are infinitely many immediate terms, e.g., $z_{1}$ could be any one of

\textup{{p}}_{1}^{1}(\textup{{v}}_{1}),\textup{{p}}^{2}_{1}(\textup{{v}}_{1},\textup{{v}}_{1}),\textup{{p}}^{3}_{1}(\textup{{v}}_{1},\textup{{v}}_{1},\textup{{v}}_{1}),\ldots;

while the right-hand-side of (5-4) involves only finitely many identities (up to alphabetic change), which we can exploit as follows:

An individual bare identity in $\Phi$ is any identity in form [4-4]

\theta:\phi(\textup{{x}}_{1},\ldots,\textup{{x}}_{k})=\psi(\textup{{x}}_{k+1},\ldots,\textup{{x}}_{l}),

(

\ast

)

where the (not necessarily distinct) individual variables $\textup{{x}}_{1},\ldots,\textup{{x}}_{l}$ are chosen from the first $l$ entries in a fixed list $\textup{{v}}_{0},\textup{{v}}_{1},\ldots$ of all individual variables; and the individual dictionary of a total structure A is the set

D({\textup{{A}}})=\{\theta\,\mid\,\theta\text{ is an individual bare identity and }{\textup{{A}}}\models\theta\}.

(Ind-Dictionary)

This is a finite set (because $\Phi$ is finite) and by (2) of Lemma 5.9, for any identity in form [4-4] we can construct an individual bare identity such that

{\textup{{A}}}\models\phi(z_{1},\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l})\\ \iff\phi(\textup{{x}}_{1},\ldots,\textup{{x}}_{k})=\psi(\textup{{x}}_{k+1},\ldots,\textup{{x}}_{l})\in D({\textup{{A}}});

(5-5)

and then Lemma 5.8 then gives immediately

Theorem 5.10.

For every total, infinite $\Phi$ -structure A, the relation of intensional equivalence on A between extended $\Phi$ -programs is decidable.

The general case

We now turn to the proof of Theorem 5.1 for an arbitrary infinite partial $\Phi$ -structure A which is similar in structure but requires some additional arguments. It will be useful to keep in mind the following example of an irreducible identity which illustrates the changes that we will need to make to the argument:

\phi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{v}}})=\psi(\textup{{p}}({\textup{{v}}}),{\textup{{u}}},{\textup{{v}}}).

(5-6)

An individual variable u is placed in an irreducible identity

\phi(z_{1},\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l})

(5-7)

if ${\textup{{u}}}\equiv z_{i}$ for some $i$ ; and an assignment $\sigma$ is injective if it assigns distinct values $\sigma({\textup{{u}}})\neq\sigma({\textup{{v}}})$ in $A$ to distinct, placed individual variables ${\textup{{u}}}\not\equiv{\textup{{v}}}$ . We write

{\textup{{A}}}\models_{\textup{inj}}\phi(z_{1},\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l})\\ \iff\kern-6.49994pt_{\mathrm{df}}~{}\text{ for every injective }\sigma,~{}{\textup{{A}}},\sigma\models\phi(z_{1},\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l}).

(5-8)

Lemma 5.11.

With every irreducible identity (5-7), we can associate a sequence ${\textup{{w}}}_{1},\ldots,{\textup{{w}}}_{l}$ of (not necessarily distinct) individual and nullary function variables of sort ind so that for every (infinite) $\Phi$ -structure A,

{\textup{{A}}}\models_{\textup{inj}}\phi(z_{1},\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l})\\ \iff{\textup{{A}}}\models_{\textup{inj}}\phi({\textup{{w}}}_{1},\ldots,{\textup{{w}}}_{k})=\psi({\textup{{w}}}_{k+1},\ldots,{\textup{{w}}}_{l}).

(5-9)

Proof.

Call $i$ new (in (5-7)) if there is no $j<i$ such that $z_{i}\equiv z_{j}$ , and set (by induction on $i$ ):

(a1)

${\textup{{w}}}_{i}:\equiv z_{i}$ , if $i$ is new and $z_{i}$ is an individual or nullary function variable;
(a2)

${\textup{{w}}}_{i}:\equiv$ some fresh nullary function variable (distinct from every $z_{j}$ and from every ${\textup{{w}}}_{j}$ with $j<i$ ), if $i$ is new and $z_{i}\equiv\textup{{p}}({\textup{{u}}}_{1},\ldots,{\textup{{u}}}_{n})$ with $n>0$ ;
(a3)

${\textup{{w}}}_{i}:\equiv{\textup{{w}}}_{j}$ for the least (and hence every) $j<i$ such that $z_{i}\equiv z_{j}$ , if $i$ is not new.

Notice that directly from the definition,

{\textup{{w}}}_{i}\equiv{\textup{{w}}}_{j}\iff z_{i}\equiv z_{j}\quad(1\leq i,j\leq l),

(5-10)

cf. Problem x5.4.

To prove first the direction $(\Longrightarrow)$ in (5-9), assume that

{\textup{{A}}}\models_{\textup{inj}}\phi(z_{1},\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l}),

let $\tau$ be any injective assignment, and define $\sigma$ by setting first

\sigma(z_{i}):=\tau({\textup{{w}}}_{i}),\text{ if $z_{i}$ is an individual or nullary function variable}.

Since $\tau$ is injective, this already insures that, however we extend it, $\sigma$ will be an injective assignment.

Next, if u is an individual variable which occurs in some $z_{i}\equiv\textup{{p}}({\textup{{u}}}_{1},\ldots,{\textup{{u}}}_{n})$ and is not placed, set $\sigma({\textup{{u}}})=\overline{\textup{{u}}}$ to a fresh value in $A$ , so that

{\textup{{u}}}\not\equiv{\textup{{v}}}\implies\overline{\textup{{u}}}\neq\overline{\textup{{v}}}.

At this point we have defined $\sigma$ on all the individual and nullary function variables which occur in (5-9) and it assigns distinct values to distinct individual variables. To define it on $n$ -ary function variables with $n>0$ which occur in (5-9), set

\sigma(\textup{{p}})(\overline{\textup{{u}}}_{1},\ldots,\overline{\textup{{u}}}_{n})=\tau({\textup{{w}}}_{i}),\text{ if }\textup{{p}}({\textup{{u}}}_{1},\ldots,{\textup{{u}}}_{n})\equiv z_{i};

which is a good definition, because if it happens that $\overline{\textup{{u}}}_{1}=\overline{\textup{{v}}}_{1},\ldots,\overline{\textup{{u}}}_{n}=\overline{\textup{{v}}}_{n}$ for some variables ${\textup{{v}}}_{1},\ldots,{\textup{{v}}}_{n}$ such that $\textup{{p}}({\textup{{v}}}_{1},\ldots,{\textup{{v}}}_{n})\equiv z_{j}$ for some $j\neq i$ , then ${\textup{{u}}}_{1}\equiv{\textup{{v}}}_{1},\ldots,{\textup{{u}}}_{n}\equiv{\textup{{v}}}_{n}$ , hence $z_{i}\equiv z_{j}$ and $\tau({\textup{{w}}}_{i})=\tau({\textup{{w}}}_{j})$ by (5-10).

The hypothesis now gives us that ${\textup{{A}}},\sigma\models\phi(z_{1},\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l})$ , and we verify ${\textup{{A}}},\tau\models\phi({\textup{{w}}}_{1},\ldots,{\textup{{w}}}_{k})=\psi({\textup{{w}}}_{k+1},\ldots,{\textup{{w}}}_{l})$ by a direct computation:

\tau({\textup{{A}}},\phi({\textup{{w}}}_{1},\ldots,{\textup{{w}}}_{k}))=\phi^{\textup{{A}}}(\tau({\textup{{w}}}_{1}),\ldots,\tau({\textup{{w}}}_{k}))\\ =\phi^{\textup{{A}}}(\sigma(z_{1}),\ldots,\sigma(z_{k}))\quad(\text{by the construction})\\ =\psi^{\textup{{A}}}(\sigma(z_{k+1}),\ldots,\sigma(z_{l}))\quad(\text{by the hypothesis})\\ n=\psi^{\textup{{A}}}(\tau({\textup{{w}}}_{k+1}),\ldots,\tau({\textup{{w}}}_{l}))\quad(\text{by the construction})\\ =\tau({\textup{{A}}},\psi({\textup{{w}}}_{k+1},\ldots,{\textup{{w}}}_{l})).

The converse direction $(\Longleftarrow)$ of (5-9) is proved similarly and we leave it for Problem x5.8. ∎

The lemma reduces the problem of the validity in A of any identity in (5-7) to the validity in A of a single identity from a finite list and that would give us the decision method we want by the same argument we used for total structures by appealing to Lemma 5.9—except for the annoying subscript ${}_{\textup{inj}}$ , which we must deal with next.

Consider the example in (5-6), for which the construction in the Lemma verified that

{\textup{{A}}}\models_{\textup{inj}}\phi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{v}}})=\psi(\textup{{p}}({\textup{{v}}}),{\textup{{u}}},{\textup{{v}}})\iff{\textup{{A}}}\models_{\textup{inj}}\phi(\textup{{q}}_{1},{\textup{{u}}},{\textup{{v}}})=\psi(\textup{{q}}_{2},{\textup{{u}}},{\textup{{v}}}),

an equivalence which may fail without the subscript ${}_{\textup{inj}}$ by Problem x5.7. Now

{\textup{{A}}}\models\phi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{v}}})=\psi(\textup{{p}}({\textup{{v}}}),{\textup{{u}}},{\textup{{v}}})\implies{\textup{{A}}}\models\phi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{u}}})=\psi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{u}}});

the construction in the Lemma associates the identity $\phi(\textup{{q}},{\textup{{u}}},{\textup{{u}}})=\psi(\textup{{q}},{\textup{{u}}},{\textup{{u}}})$ with $\phi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{u}}})=\psi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{u}}})$ ; and it is quite easy to check that

{\textup{{A}}}\models\phi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{v}}})=\psi(\textup{{p}}({\textup{{v}}}),{\textup{{u}}},{\textup{{v}}})\\ \iff{\textup{{A}}}\models_{\textup{inj}}\phi(\textup{{q}}_{1},{\textup{{u}}},{\textup{{v}}})=\psi(\textup{{q}}_{2},{\textup{{u}}},{\textup{{v}}}){~{}\&~{}}{\textup{{A}}}\models_{\textup{inj}}\phi(\textup{{q}},{\textup{{u}}},{\textup{{u}}})=\psi(\textup{{q}},{\textup{{u}}},{\textup{{u}}}),

(5-11)

cf. Problem x5.9. This reduces the validity of $\phi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{v}}})=\psi(\textup{{p}}({\textup{{v}}}),{\textup{{u}}},{\textup{{v}}})$ in A to the injective validity in A of two identities which are effectively constructed from $\phi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{v}}})=\psi(\textup{{p}}({\textup{{v}}}),{\textup{{u}}},{\textup{{v}}})$ ; and the natural extension of this reduction holds for arbitrary irreducible identities which involve the primitives in $\Phi$ :

Lemma 5.12.

Let $\theta\equiv\phi(z_{1}.\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l})$ be any irreducible identity with $\phi,\psi\in\Phi$ , let $({\textup{{u}}}_{1},\ldots,{\textup{{u}}}_{m})$ be an enumeration of the individual variables which are placed in $\theta$ , let ${\mathcal{E}}$ be the set of equivalence relations on the set $\{{\textup{{u}}}_{1},\ldots,{\textup{{u}}}_{m}\}$ and for any $\sim\,\in{\mathcal{E}}$ , let

\widetilde{\textup{{u}}}_{i}:\equiv{\textup{{u}}}_{j}\text{ for the least $j$ such that ${\textup{{u}}}_{i}\sim{\textup{{u}}}_{j}$}\quad(i=1,\ldots,m).

Then, for any infinite $\Phi$ -structure A,

{\textup{{A}}}\models\theta\iff\mbox{\hbox to0.0pt{$\bigwedge$\hss}\hskip 3.0pt$\bigwedge$\,}_{\sim\,\in\,{\mathcal{E}}}{\textup{{A}}}\models_{\textup{inj}}\theta\{{\textup{{u}}}_{i}:\equiv\widetilde{\textup{{u}}}_{i}\,\mid\,i=1,\ldots,m\},

(5-12)

where $\theta\{{\textup{{u}}}_{i}:\equiv\widetilde{\textup{{u}}}_{i}\,\mid\,i=1,\ldots,m\}$ is the result of replacing in $\theta$ each ${\textup{{u}}}_{i}$ by its representative $\widetilde{\textup{{u}}}_{i}$ in the equivalence relation $\sim$ .

Proof.

The direction $(\Longrightarrow)$ of (5-12) is trivial: because

{\textup{{A}}}\models\theta\implies(\forall\sim\,\in{\mathcal{E}})[{\textup{{A}}}\models\theta\{{\textup{{u}}}_{i}:\equiv\widetilde{\textup{{u}}}_{i}\,\mid\,i=1,\ldots,m\}]\\ \implies(\forall\sim\,\in{\mathcal{E}})[{\textup{{A}}}\models_{\textup{inj}}\theta\{{\textup{{u}}}_{i}:\equiv\widetilde{\textup{{u}}}_{i}\,\mid\,i=1,\ldots,m\}].

For the converse, assume the right-hand-side of (5-12), fix an assignment $\sigma$ and set

{\textup{{u}}}_{i}\sim{\textup{{u}}}_{j}\iff\sigma({\textup{{u}}}_{i})=\sigma({\textup{{u}}}_{j})\qquad\text{(notice that $\sim$ depends on $\sigma$)};

$\sigma$ is injective for the identity $\theta\{{\textup{{u}}}_{i}:\equiv\widetilde{\textup{{u}}}_{i}\,\mid\,i=1,\ldots,m\}$ whose placed individual variables are exactly the representatives $\{\widetilde{\textup{{u}}}_{i}\,\mid\,i=1,\ldots,m\}$ of the placed individual variables of $\theta$ , and so the hypothesis gives

{\textup{{A}}},\sigma\models\theta\{{\textup{{u}}}_{i}:\equiv\widetilde{\textup{{u}}}_{i}\,\mid\,i=1,\ldots,m\}.

To infer that ${\textup{{A}}},\sigma\models\theta$ , check first that for every immediate term $z$ ,

\sigma(z)=\sigma(z\{{\textup{{u}}}_{i}:\equiv\widetilde{\textup{{u}}}_{i}\,\mid\,i=1,\ldots,m\})=\sigma(\widetilde{z});

this holds by the definition of $\sim$ if $z\equiv{\textup{{u}}}_{i}$ for some $i$ and then trivially in every other case. Finally, compute:

{\textup{{A}}},\sigma\models\theta\{{\textup{{u}}}_{i}:\equiv\widetilde{\textup{{u}}}_{i}\,\mid\,i=1,\ldots,m\}\\ \implies\phi^{\textup{{A}}}(\sigma(\widetilde{z}_{1}),\ldots,\sigma(\widetilde{z}_{k}))=\psi^{\textup{{A}}}(\sigma(\widetilde{z}_{k+1}),\ldots,\sigma(\widetilde{z}_{l}))\\ \implies\phi^{\textup{{A}}}(\sigma(z_{1}),\ldots,\sigma(z_{k}))=\psi^{\textup{{A}}}(\sigma(z_{k+1}),\ldots,\sigma(z_{l}))\\ \hskip 56.9055pt\implies\sigma({\textup{{A}}},\phi(z_{1},\ldots,z_{k}))=\sigma({\textup{{A}}},\psi(z_{k+1},\ldots,z_{l}))\implies{\textup{{A}}},\sigma\models\theta.\hskip 17.07182pt\qed

\noqed

∎

A bare $\Phi$ -identity is an identity of the form

\phi({\textup{{w}}}_{1},\ldots,{\textup{{w}}}_{k})=\psi({\textup{{w}}}_{k+1},\ldots,{\textup{{w}}}_{l}),

(5-13)

where each ${\textup{{w}}}_{i}$ is an individual or nullary function variable of sort ind chosen from the first $2l$ entries in a fixed list

\textup{{v}}_{0},\textup{{p}}^{{\texttt{ind}},0}_{0},\textup{{v}}_{1},\textup{{p}}^{{\texttt{ind}},0}_{1},\ldots

of all such variables; and the dictionary of A is the set

D({\textup{{A}}})=\{\theta\,\mid\,\theta\text{ is a bare identity and }{\textup{{A}}}\models_{\textup{inj}}\theta\}.

(Dictionary)

The bare $\Phi$ -identities are alphabetic variants of the identities constructed in Lemma 5.11 and there are only finitely many of them, cf. Problem x5.10.

\ynm

We understand (5-13) to include (by convention) identities of the form

\phi(x_{1},\ldots,x_{k})=\psi,\quad\phi=\psi

when the arity of one or both of $\phi,\psi$ is $0$ . We will not deal separately with these degenerate cases, since the arguments we will give can be trivially adjusted to apply to them also.

Corollary 5.13.

With every irreducible identity of the form

\theta\equiv\phi(z_{1},\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l}),

we can associate a finite sequence $\theta_{1},\ldots,\theta_{n}$ of bare identities so that for every $\Phi$ -structure A,

{\textup{{A}}}\models\theta\iff{\textup{{A}}}\models_{\textup{inj}}\theta_{1}{~{}\&~{}}\cdots{~{}\&~{}}{\textup{{A}}}\models_{\textup{inj}}\theta_{n}\iff\theta_{1},\ldots,\theta_{n}\in D({\textup{{A}}}).

(5-14)

Proof.

Let $\sim_{1},\ldots,\sim_{n}$ be an enumeration of the equivalence relations on the set $\{{\textup{{u}}}_{1},\ldots,{\textup{{u}}}_{m}\}$ of the placed individual variables, let, for each $i$ , $\theta^{\prime}_{i}$ be the identity associated with $\theta\{{\textup{{u}}}_{i}:\equiv\widetilde{\textup{{u}}}_{i}\,\mid\,i=1,\ldots,m\}$ when $\sim\,=\,\sim_{i}$ by the construction in Lemma 5.11, and let $\theta_{i}$ be the “alphabetically least” variant of $\theta^{\prime}_{i}$ in which only the allowed variables occur. ∎

For example, to be ridiculously formal, the bare identities associated with Example (5-6) ny (5-11) are

\phi(\textup{{p}}^{{\texttt{ind}},0}_{0},\textup{{v}}_{0},\textup{{v}}_{1})=\psi(\textup{{p}}^{{\texttt{ind}},0}_{1},\textup{{v}}_{0},\textup{{v}}_{1})\text{ and }\phi(\textup{{p}}^{{\texttt{ind}},0}_{0},\textup{{v}}_{0},\textup{{v}}_{0})=\psi(\textup{{p}}^{{\texttt{ind}},0}_{0},\textup{{v}}_{0},\textup{{v}}_{0}).

{proofplain}

[Proof of Theorem 5.1] is now immediate by appealing to Lemma 5.8.

\ynm

Following the plan for the proof on page 5, we reduce the problem of intensional equivalence between two extended $\Phi$ -programs $E(\vec{\textup{{x}}})$ and $F(\vec{\textup{{x}}})$ to deciding the validity in A of irreducible term identities, and these fall in one of ten cases i-j with $1\leq i\leq j\leq 4$ on the same page depending of the form of their two sides, for example

\textup{if }w_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3}=\phi(z_{1},\ldots,z_{m}).

Case (3-4)

Case (1-1) is trivial, as no identity holds between distinct terms which are immediate, or tt or ff, and Lemma 5.3 and the first three parts of Lemma LABEL:irrthm decide the validity of the identity if $1\leq i\leq j\leq 3$ or $i=3,j=4$ —for example, the identity in case (3-4) never holds by Part (3) of Lemma LABEL:irrthm.

Cases (1-4) and (2-4), $w=\phi(z_{1},\ldots,z_{k})$ where $w$ is tt, ff or an irreducible (perhaps immediate) term of the form $p(w_{1},\ldots,w_{m})$ or $p$ . Now part (4) of Lemma LABEL:irrthm reduces this identity to

\phi(z_{1},\ldots,z_{k})=z_{j}

for some $j$ (that we can compute), and the same holds in the simpler case when $w$ is an individual or nullary function variable, cf. Problem x5.11. We assume that the identity map $\textup{id}(x)=x$ on $A$ is among the primitives—or we just add it, which will just make the decidability result stronger—and then this identity takes the form

\phi(z_{1},\ldots,z_{k})=\textup{id}(z_{i})

and is covered by case (4-4) below. The same reduction to (4-4) works for tt and ff, we just assume that these boolean constants have names in $\Phi$ .

Case (4-4) requires us to decide whether an irreducible identity of the form

\phi(z_{1},\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l})

holds in A and it is the only case which depends on the primitives of A. For this we will use the dictionary of A, the set

D({\textup{{A}}})=\{\theta\,\mid\,\theta\text{ is a bare identity and }{\textup{{A}}}\models_{\textup{inj}}\theta\}.

(Dictionary)

This is a finite set by Problem x5.10, and by Corollary 5.13, we can construct from $\phi(z_{1},\ldots,z_{k})$ and $\psi(z_{k+1},\ldots,z_{l})$ bare identities $\theta_{1},\ldots,\theta_{n}$ such that

{\textup{{A}}}\models\phi(z_{1},\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l})\\ \hskip 28.45274pt\iff{\textup{{A}}}\models_{\textup{inj}}\theta_{1}{~{}\&~{}}\dots~{}\&~{}{\textup{{A}}}\models_{\textup{inj}}\theta_{n}\iff\theta_{1},\ldots,\theta_{n}\in D({\textup{{A}}}).\hskip 18.49411pt\qed

\noqed

Global intensional equivalence

We might guess that irreducible programs are globally intensionally equivalent only if they are congruent, but this is spoiled by the trivial,

\models E=\textup{if }E~{}\textup{then }E~{}\textup{else }E\quad(\text{with $E$ explicit of boolean sort})

(5-15)

already noticed in Problem x3.2. So we need to adjust:

An irreducible program $E$ is proper if none of its parts is an immediate term of the form p or $\textup{{p}}(\textup{{u}}_{1},\ldots,\textup{{u}}_{n})$ and boolean sort.

Theorem 5.14.

(1) Every extended irreducible $\Phi$ -program is globally intensionally equivalent to a proper one.

(2) Two extended, proper, irreducible $\Phi$ -programs are globally intensionally equivalent if and only if they are congruent.

(3) The relation of global intensional equivalence between extended $\Phi$ -programs is decidable.

Proof.

(1) Replace every part $E_{i}\equiv\underline{\textup{{p}}^{{\texttt{boole}},n}({\textup{{w}}}_{1},\ldots,{\textup{{w}}}_{n})}$ of the program which is immediate and of Boolean sort by $\underline{\textup{if }E_{i}~{}\textup{then }E_{i}~{}\textup{else }E_{i}}$ .

(2) We need to show that if $E(\vec{\textup{{x}}})$ and $F(\vec{\textup{{x}}})$ are proper, extended irreducible $\Phi$ -programs, then

\Big{(}E(\vec{\textup{{x}}})=^{{\textup{{A}}}}_{\textup{int}}F(\vec{\textup{{x}}})\text{ for every infinite $\Phi$-structure ${\textup{{A}}}$}\Big{)}\implies E\equiv_{c}F;

and the idea is to use the hypothesis on a single, suitably “free” total structure A and then apply Lemma 5.9.

We fix an infinite (countable) set $A$ .

Step 1. For each $\phi\in\Phi$ of sort ind and arity $n\geq 0$ (if there are any such), fix a set $R_{\phi}\subset A$ and an injection $\phi^{\textup{{A}}}:A^{n}\rightarrowtail R_{\phi}$ , so that

\phi\not\equiv\psi\implies R_{\phi}\cap R_{\psi}=\emptyset\text{ and $\bigcup\{R_{\phi}\,\mid\,\phi\in\Phi\}$ is co-infinite}.

If $\textup{arity}(\phi)=0$ , then $R_{\phi}=\{\overline{\phi}\}$ is a singleton.

It is easy—if a bit tedious—to check that for all immediate terms $z_{1},\ldots,z_{k}$ , $w_{1},\ldots,w_{n}$ and however we complete the definition of A, if $\phi$ and $\psi$ are of sort ind, then

{\textup{{A}}}\models\phi(z_{1},\ldots,z_{k})=\psi(w_{1},\ldots,w_{n})\implies\phi(z_{1},\ldots,z_{k})\equiv\psi(w_{1},\ldots,w_{n}),

(5-16)

cf. Problem x5.12.

Step 2. For any two $\phi,\psi\in\Phi$ of sort boole and arities $k>0,n>0$ and for every equivalence relation $\sim$ on the set $\{1,\ldots,k,k+1,\ldots,k+n\}$ , we choose a fresh tuple $a_{1},\ldots,a_{k},a_{k+1},\ldots,a_{k+n}$ of elements in $A$ such that

a_{i}=a_{j}\iff i\sim j

and set $\phi^{\textup{{A}}}(a_{1},\ldots,a_{k})={\textup{t\hskip-1.7ptt}},\quad\psi^{\textup{{A}}}(a_{k+1},\ldots,a_{k+n})={\textup{ff}}$ . We do this by enumerating all triples $(\phi,\psi,\sim)$ where $\phi,\psi$ are of boolean sort and arities $k>0,n>0$ and $\sim$ is an equivalence relation on $\{1,\ldots,k+n\}$ , and then successively fixing a fresh tuple $a_{1},\ldots,a_{k+n}$ for each triple with the required property—which we can do since, at any stage, we have used up only finitely many members of $A$ . This insures (5-16) for $\phi,\psi$ of sort boole and non-zero arities no matter how we complete the definitions of $\phi^{\textup{{A}}}$ .

Step 3. If $\gamma\in\Phi$ is of sort boole and arity $0$ (a boolean constant), we set $\gamma^{\textup{{A}}}\kern-1.00006pt\uparrow\kern 0.50003pt$ .

At this point we have completed the definition of a structure A and the hypothesis gives us that

{\textup{{A}}}\models E_{i}=F_{i}\quad(i=0,\ldots K).

Lemma 1. For $i=0,\ldots,K$ , $E_{i}\equiv F_{i}$ , unless both $E_{i}$ and $F_{i}$ are distinct boolean constants in $\Phi$ .

{proofplain}

[Proof] is easy (if tedious) by appealing to Lemmas 5.2 and 5.5 and we leave it for Problem x5.13.

Lemma 2. For each boolean constant $\gamma\in\Phi$ ,

\Big{|}\{i\leq K\,\mid\,E_{i}\equiv\gamma\}\Big{|}=\Big{|}\{j\leq K\,\mid\,F_{j}\equiv\gamma\}\Big{|}.

Proof.

Fix $\gamma$ and define the structure ${\textup{{B}}}={\textup{{B}}}^{\gamma}$ exactly as we defined A in Steps 1 and 2 above but replacing Step 3 by the following:

Step $3^{\gamma}$ . Set $\gamma^{\textup{{B}}}={\textup{t\hskip-1.7ptt}}$ and for every other boolean constant $\delta\in\Phi$ , set $\delta^{\textup{{B}}}\kern-1.00006pt\uparrow\kern 0.50003pt$ .

The hypothesis gives us a permutation $\pi:\{0,\ldots,K\}\to\{0,\ldots,K\}$ such that $\pi(0)=0$ and

{\textup{{B}}}\models E_{i}=F_{\pi(i)},\quad i\leq K.

By a slight modification of the detailed case analysis in the proof of Lemma 1 (in Cases [2-5], [3-5] and [4-5]),

\text{if $E_{i}$ is not a propositional constant in $\Phi$, or ${\textup{t\hskip-1.7ptt}}$ or ${\textup{ff}}$,}\\ \text{then $F_{\pi(i)}$ is not a propositional constant in $\Phi$, or ${\textup{t\hskip-1.7ptt}}$ or ${\textup{ff}}$,}

and directly from the definition of B,

\text{if $E_{i}$ is a propositional constant in $\Phi$ other than $\gamma$,}\\ \text{then $F_{\pi(i)}$ is also a propositional constant in $\Phi$ other than $\gamma$.}

So $\pi$ pairs the parts of $E$ which are tt or $\gamma$ with the parts of $F$ which are tt or $\gamma$ , hence

\Big{|}\{i\leq K\,\mid\,E_{i}\equiv\gamma\text{ or }E_{i}\equiv{\textup{t\hskip-1.7ptt}}\}\Big{|}=\Big{|}\{j\leq K\,\mid\,F_{j}\equiv\gamma\text{ or }F_{j}\equiv{\textup{t\hskip-1.7ptt}}\}\Big{|};

but $\Big{|}\{i\leq K\,\mid\,E_{i}\equiv{\textup{t\hskip-1.7ptt}}\}\Big{|}=\Big{|}\{j\leq K\,\mid\,F_{j}\equiv{\textup{t\hskip-1.7ptt}}\}\Big{|}$ by Lemma 1, and so the last displayed identity implies the claim in the Lemma. ∎

Lemma 3. $E_{0}\equiv F_{0}$ .

Proof.

This is true by Lemma 1, if $E_{0}$ is not a boolean constant in $\Phi$ .

If $E_{0}\equiv\gamma$ , then the construction in Lemma 2 gives $F_{0}\equiv{\textup{t\hskip-1.7ptt}}\vee F_{0}\equiv\gamma$ ; and the same construction starting with $\gamma^{\textup{{B}}}={\textup{ff}}$ gives $F_{0}\equiv{\textup{ff}}\vee F_{0}\equiv\gamma$ , so that

(F_{0}\equiv{\textup{t\hskip-1.7ptt}}\vee F_{0}\equiv\gamma){~{}\&~{}}(F_{0}\equiv{\textup{ff}}\vee F_{0}\equiv\gamma),

which implies that $F_{0}\equiv\gamma$ . ∎

To complete the proof that $E\equiv_{c}F$ , we need to construct a permutation $\rho:\{0,\ldots,K\}\mbox{\,$\rightarrowtail\kern-8.00003pt\rightarrow$\,}\{0,\ldots,K\}$ such that $\rho(0)=0$ and

E_{i}\equiv F_{\rho(i)}\quad(i=1,\ldots,K).

(5-17)

We start by setting $\rho(i)=i$ if $E_{i}$ is not a boolean constant in $\Phi$ or $i=0$ , which insures (5-17) for such $i$ by Lemmas 1 and 3. Next, for each boolean constant $\gamma\in\Phi$ , we appeal to Lemma 2 to extend $\rho$ in any way so that

\rho:\{i\leq K\,\mid\,E_{i}\equiv\gamma\}\mbox{\,$\rightarrowtail\kern-8.00003pt\rightarrow$\,}\{j\leq K\,\mid\,F_{j}\equiv\gamma\}

and establishes that $E\equiv_{c}F$ .

Part (2) follows almost immediately from (1) and we leave it for Problem x5.15. ∎

Together with the reduction calculus in Section 4, Theorem 5.14 suggests an obvious way to axiomatize the relation of global intensional equivalence between programs, but we will not go into it here.

Problems for Section 5

Problem x5.1.

Give a decision procedure for ${\textup{{A}}}\models E=F$ in forms [1-1], [1-2], [1-3] and arbitrary A.

\ynm

Problem x5.2 ([Lemma LABEL:irrthm, (3)).

Prove that if $z_{1},\ldots,z_{m}$ and $w_{1}$ , $w_{2}$ , $w_{3}$ are immediate, $\phi\in\Phi$ and A is any infinite $\Phi$ -structure, then

{\textup{{A}}}\not\models\phi(z_{1},\ldots,z_{m})=\textup{if }w_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3}.

(

\ast

)

Problem x5.3.

True or false: for any infinite $A$ and any immediate terms $z_{1},\ldots,z_{m},w_{1},w_{2},w_{3}$ with ${\textup{sort}}(w_{2})={\textup{sort}}(w_{3})=s\in\{{\texttt{ind}},{\texttt{boole}}\}$ , we can define a partial function $\phi:A^{m}\rightharpoonup A_{s}$ such that

(A,\phi)\models\phi(z_{1},\ldots,z_{m})=\textup{if }~{}w_{1}~{}\textup{then }w_{2}~{}\textup{else }w_{3}.

Problem x5.4.

Prove (5-10), that with the definitions in the proof of Lemma 5.11,

{\textup{{w}}}_{i}\equiv{\textup{{w}}}_{j}\iff z_{i}\equiv z_{j}\quad(1\leq i,j\leq l).

(

\ast

)

Hint: Use induction on $i$ .

Problem x5.5.

Prove that if $E$ and $F$ are distinct immediate terms or tt or ff, then ${\textup{{A}}}\not\models E=F$ , for any infinite A.

Problem x5.6.

Suppose A is a total $\Phi$ -structure and $\phi,\psi\in\Phi$ . Find a sequence of (not necessarily distinct) individual variables ${\textup{{w}}}_{1},\ldots,{\textup{{w}}}_{6}$ such that

{\textup{{A}}}\models\phi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{v}}})=\psi(\textup{{p}}({\textup{{v}}}),{\textup{{u}}},{\textup{{v}}})\iff{\textup{{A}}}\models\phi({\textup{{w}}}_{1},{\textup{{w}}}_{2},{\textup{{w}}}_{3})=\psi({\textup{{w}}}_{4},{\textup{{w}}}_{5},{\textup{{w}}}_{6}).

Problem x5.7.

Give an example where

{\textup{{A}}}\models\phi(\textup{{p}}({\textup{{u}}}),{\textup{{u}}},{\textup{{v}}})=\psi(\textup{{p}}({\textup{{v}}}),{\textup{{u}}},{\textup{{v}}})\text{ but }{\textup{{A}}}\not\models\phi(\textup{{q}}_{1},{\textup{{u}}},{\textup{{v}}})=\psi(\textup{{q}}_{2},{\textup{{u}}},{\textup{{v}}}).

Hint: Set $\phi^{\textup{{A}}}(x,s,t)=\psi^{\textup{{A}}}(y,s,t)=f(s,t)$ , with a partial function $f(s,t)$ which converges only when $s=t$ .

Problem x5.8.

Prove the direction ( $\Longleftarrow$ ) of (5-9), i.e.,

{\textup{{A}}}\models_{\textup{inj}}\phi({\textup{{w}}}_{1},\ldots,{\textup{{w}}}_{k})=\psi({\textup{{w}}}_{k+1},\ldots,{\textup{{w}}}_{l})\\ \implies{\textup{{A}}}\models_{\textup{inj}}\phi(z_{1},\ldots,z_{k})=\psi(z_{k+1},\ldots,z_{l})

with the choice of variables ${\textup{{w}}}_{1},\ldots,{\textup{{w}}}_{l}$ made in the proof of Lemma 5.11.

Problem x5.9.

Prove (5-11).

Problem x5.10.

Prove that the number of bare $\Phi$ -identities as in (5-13) is bounded above by $2^{2\textup{arity}(\Phi)}|\Phi|^{2}$ , where $\textup{arity}(\Phi)$ is the largest arity of any $\phi\in\Phi$ and $|\Phi|$ is its cardinality.

Problem x5.11.

Prove that if $\phi(z_{1},\ldots,z_{k})$ is irreducible, w is an individual variable and ${\textup{{A}}}\models\textup{{w}}=\phi(z_{1},\ldots,z_{k})$ for some structure A, then every $z_{i}$ is an individual variable and w is one of them.

Problem x5.12.

Prove (5-16)

{\textup{{A}}}\models\phi(z_{1},\ldots,z_{k})=\psi(w_{1},\ldots,w_{n})\implies\phi(z_{1},\ldots,z_{k})\equiv\psi(w_{1},\ldots,w_{n}),

in Step 1 of the proof of Theorem 5.14.

Problem x5.13.

Prove Lemma 1 in the proof of Theorem 5.14.

Problem x5.14.

Prove Case (2-2) of Theorem 5.14, that for immediate terms $z_{1},\ldots,z_{l}$ ,

\models p(z_{1},\ldots,z_{k})=q(z_{k+1},\ldots,z_{l})\implies p(z_{1},\ldots,z_{k})\equiv q(z_{k+1},\ldots,z_{l}).

Problem x5.15.

Derive (2) of Theorem 5.14 from (1).

Propositional programs

An explicit term $E$ is propositional if it has no individual variables, no symbols from $\Phi$ and only boolean, nullary function variables,

P:\equiv{\textup{t\hskip-1.7ptt}}\mid{\textup{ff}}\mid\textup{{p}}^{{\texttt{boole}},0}_{i}\mid\textup{if }~{}P_{1}~{}\textup{then }P_{2}~{}\textup{else }P_{3};

(Prop terms)

and a program is propositional if all its parts are, e.g.,

\text{Liar}:\equiv\textup{{p}}{\sf~{}where~{}}\{\textup{{p}}=\textup{if }\textup{{p}}~{}\textup{then }{\textup{ff}}~{}\textup{else }{\textup{t\hskip-1.7ptt}}\},\\ \text{Truthteller}:\equiv\textup{{p}}{\sf~{}where~{}}\{\textup{{p}}=\textup{if }\textup{{p}}~{}\textup{then }{\textup{t\hskip-1.7ptt}}~{}\textup{else }{\textup{ff}}\}.

These are significant for the project of using referential intensions to model meanings initiated in [ynmfrege, ynmlcms] which is far from our topic, but they also bear on the complexity problem for intensional equivalence.

For our purposes here, a finite graph with $n>0$ nodes is a binary relation $E$ on the set $\{0,\ldots,n-1\}$ , the edge relation on the set of nodes of the graph $(\{0,\ldots,n-1\},E)$ in more standard terminology.¹³¹³13To the best of my knowledge (and that of Wikipedia), it is still open as I write this whether the graph isomorphism problem is co-NP.

Problem x5.46^∗ ([ynmfrege], for a related language).

Prove that the problem of intensional equivalence between propositional programs is at least as hard as the problem of isomorphism between graphs. Hint: You need to associate with each graph $E$ of size $n$ an irreducible propositional program $\widetilde{E}$ which codes $E$ , so that

E\text{ is isomorphic with }F\iff\widetilde{E}=_{\textup{int}}\widetilde{F}\iff\widetilde{E}\equiv_{c}\widetilde{F};

and the trick is to use propositional variables $\textup{{p}}_{i}$ , one for each $i<n$ and $\textup{{p}}_{ij}$ , one for each pair $(i,j)$ with $i,j<n$ .

References

[1] \bibfitembhaskar2017int \guyS.SiddharthBhaskar 20170 \guysmagicSiddharth Bhaskar A difference in complexity between recursion and tail recursion, Theory of Computing Systems, vol.\weaktie60\yearmagic(2017), pp.\weaktie299–313. \TheSortKeyIsbhaskar siddharth 2017 difference in complexity between recursion and tail recursion
[2] \bibritembhaskar2017ext \guyS.SiddharthBhaskar 20180 \guysmagic\bysame Recursion versus tail recursion over $\overline{\mathbb{F}}p$ , Journal of Logical and Algebraic Methods in Programming\yearmagic,(2018), pp.\weaktie68–90. \TheSortKeyIsbhaskar siddharth 2018 recursion versus tail recursion over fp
[3] \bibfitemchurch1935 \guyA.AlonzoChurch 19350 \guysmagicAlonzo Church An unsolvable problem in elementary number theory, Bulletin of the American Mathematical Society, vol.\weaktie41\yearmagic(1935), pp.\weaktie332–333, This is an abstract of [church1936]. \TheSortKeyIschurch alonzo 1935 unsolvable problem in elementary number theory
[4] \bibritemchurch1936 \guyA.AlonzoChurch 19360 \guysmagic\bysame An unsolvable problem in elementary number theory, American Journal of Mathematics\yearmagic,(1936), pp.\weaktie345–363, An abstract of this paper was published in [church1935]. \TheSortKeyIschurch alonzo 1936a unsolvable problem in elementary number theory
[5] \bibfitemgurevich2008 \guyN.NachumDershowitz and \guyY.YuriGurevich 20080 \guysmagicNachum Dershowitz \biband Yuri Gurevich A natural axiomatization of computability and proof of Church’s Thesis, \bslname, vol.\weaktie14\yearmagic(2008), pp.\weaktie299–350. \TheSortKeyIsdershowitz nachum gurevich yuri 2008 natural axiomatization of computability and proof of churchs thesis
[6] \bibfitemduzi2014 \guyM.M.Duží 20140 \guysmagicM. Duží A procedural interpretation of the Church-Turing thesis, Church’s Thesis: Logic, Mind and Nature (Adam Olszewski, Bartosz Brozek, \biband Piotr Urbanczyk, editors), Copernicus Center Press, Krakow 2013\yearmagic,2014. \TheSortKeyIsduv z i m 2014 procedural interpretation of the church turing thesis
[7] \bibfitemgandy1980 \guyR.RobinGandy 19800 \guysmagicRobin Gandy Church’s Thesis and principles for mechanisms, The Kleene Symposium (J. Barwise, H. J. Keisler, \biband K. Kunen, editors), North Holland Publishing Co\yearmagic,1980, pp.\weaktie123–148. \TheSortKeyIsgandy robin 1980 churchs thesis and principles for mechanisms
[8] \bibfitemgurevich1995 \guyY.YuriGurevich 19950 \guysmagicYuri Gurevich Evolving algebras 1993: Lipari guide, Specification and validation methods (E. Börger, editor), Oxford University Press\yearmagic,1995, pp.\weaktie9–36. \TheSortKeyIsgurevich yuri 1995 evolving algebras 1993 lipari guide
[9] \bibritemgurevich2000 \guyY.YuriGurevich 20000 \guysmagic\bysame Sequential abstract state machines capture sequential algorithms, ACM Transactions on Computational Logic, vol.\weaktie1\yearmagic(2000), pp.\weaktie77–111. \TheSortKeyIsgurevich yuri 2000 sequential abstract state machines capture sequential algorithms
[10] \bibfitemjones1999 \guyN. D.Neil D.Jones 19990 \guysmagicNeil D. Jones LOGSPACE and PTIME characterized by programming languages, Theoretical Computer Science\yearmagic,(1999), pp.\weaktie151–174. \TheSortKeyIsjones neil d 1999 logspace and ptime characterized by programming languages
[11] \bibritemjones2001 \guyN. D.Neil D.Jones 20010 \guysmagic\bysame The expressive power of higher-order types or, life without CONS, Journal of Functional Programming, vol.\weaktie11\yearmagic(2001), pp.\weaktie55–94. \TheSortKeyIsjones neil d 2001 expressive power of higher order types or life without cons
[12] \bibfitemynmkechris-moschovakis \guyA. S.A. S.Kechris and \guyY. N.Y. N.Moschovakis 19770 \guysmagicA. S. Kechris \biband Y. N. Moschovakis Recursion in higher types, Handbook of mathematical logic (J. Barwise, editor), Studies in Logic, No. 90, North Holland, Amsterdam\yearmagic,1977, pp.\weaktie681–737. \TheSortKeyIskechris a s moschovakis y n 1977 recursion in higher types
[13] \bibfitemkleene1959 \guyS. C.Stephen C.Kleene 19590 \guysmagicStephen C. Kleene Recursive functionals and quantifiers of finite types I, Transactions of the American Mathematical Society, vol.\weaktie91\yearmagic(1959), pp.\weaktie1–52. \TheSortKeyIskleene stephen c 1959 recursive functionals and quantifiers of finite types i
[14] \bibfitemknuth1973 \guyD. E.D. E.Knuth 19730 \guysmagicD. E. Knuth The Art of Computer Programming, Volume 1. Fundamental Algorithms, second ed., Addison-Wesley\yearmagic,1973. \TheSortKeyIsknuth d e 1973 the art of computer programming volume 1 fundamental algorithms
[15] \bibfitemkolmogorov1933 \guyA. N.A. N.Kolmogorov 19330 \guysmagicA. N. Kolmogorov Foundations of the theory of probability, Chelsea Publishing Co., New York\yearmagic,1933, English 1956 translation of the 1933 German monograph, edited by Nathan Morrison and with an added bibliography by A. T. Bharucha-Reid. \TheSortKeyIskolmogorov a n 1933 foundations of the theory of probability
[16] \bibfitemkripke2000 \guyS. A.Saul A.Kripke 20000 \guysmagicSaul A. Kripke From the Church-Turing Thesis to the First-Order Algorithm Theorem, Proceedings of the 15th Annual IEEE Symposium on Logic in Computer Science (Washington, DC, USA), LICS ’00, IEEE Computer Society\yearmagic,2000, The reference is to an abstract. A video of a talk by Saul Kripke at The 21st International Workshop on the History and Philosophy of Science with the same title is posted at http://www.youtube.com/watch?v=D9SP5wj882w, and this is my only knowledge of this article. \TheSortKeyIskripke saul a 2000 from the church turing thesis to the first order algorithm theorem
[17] \bibfitemlongley-normann2015 \guyJ.JohnLongley and \guyD.DagNormann 20150 \guysmagicJohn Longley \biband Dag Normann Higher-order computability, Springer\yearmagic,2015. \TheSortKeyIslongley john normann dag 2015 higher order computability
[18] \bibfitemlynch-blum1979 \guyN. A.Nancy A.Lynch and \guyE. K.Edward K.Blum 19790 \guysmagicNancy A. Lynch \biband Edward K. Blum A difference in expressive power between flowcharts and recursion schemes, Mathematical Systems Theory\yearmagic,(1979), pp.\weaktie205–211. \TheSortKeyIslynch nancy a blum edward k 1979 difference in expressive power between flowcharts and recursion schemes
[19] \bibfitemynmflr \guyY. N.Yiannis N.Moschovakis 1989a0 \guysmagicYiannis N. Moschovakis The formal language of recursion, \jslname, vol.\weaktie54\yearmagic(1989), pp.\weaktie1216–1252, posted in ynm’s homepage. \TheSortKeyIsmoschovakis yiannis n 1989 formal language of recursion
[20] \bibritemynmbotik \guyY. N.Yiannis N.Moschovakis 1989b0 \guysmagic\bysame A mathematical modeling of pure, recursive algorithms, Logic at Botik ’89 (A. R. Meyer \biband M. A. Taitslin, editors), vol. 363, Springer-Verlag, Berlin\yearmagic,1989, Lecture Notes in Computer Science, posted in ynm’s homepage, pp.\weaktie208–229. \TheSortKeyIsmoschovakis yiannis n 1989 mathematical modeling of pure recursive algorithms
[21] \bibritemynmfrege \guyY. N.Yiannis N.Moschovakis 19940 \guysmagic\bysame Sense and denotation as algorithm and value, Logic colloquium ’90 (J. Väänänen \biband J. Oikkonen, editors), Lecture Notes in Logic, vol. 2, Association for Symbolic Logic\yearmagic,1994, posted in ynm’s homepage, together with a corrected and expanded proof of the main Theorem 4.1 of this paper, pp.\weaktie210–249. \TheSortKeyIsmoschovakis yiannis n 1994 sense and denotation as algorithm and value
[22] \bibritemynmsicily \guyY. N.Yiannis N.Moschovakis 19980 \guysmagic\bysame On founding the theory of algorithms, Truth in mathematics (H. G. Dales \biband G. Oliveri, editors), Clarendon Press, Oxford\yearmagic,1998, posted in ynm’s homepage, pp.\weaktie71–104. \TheSortKeyIsmoschovakis yiannis n 1998 on founding the theory of algorithms
[23] \bibritemynmeng \guyY. N.Yiannis N.Moschovakis 20010 \guysmagic\bysame What is an algorithm?, Mathematics unlimited – 2001 and beyond (B. Engquist \biband W. Schmid, editors), Springer\yearmagic,2001, posted in ynm’s homepage, pp.\weaktie929–936. \TheSortKeyIsmoschovakis yiannis n 2001 what is an algorithm
[24] \bibritemynmlcms \guyY. N.Yiannis N.Moschovakis 20060 \guysmagic\bysame A logical calculus of meaning and synonymy, Linguistics and Philosophy, vol.\weaktie29\yearmagic(2006), pp.\weaktie27–89, posted in ynm’s homepage. \TheSortKeyIsmoschovakis yiannis n 2006 logical calculus of meaning and synonymy
[25] \bibritemynm2014 \guyY. N.Yiannis N.Moschovakis 20140 \guysmagic\bysame On the Church-Turing Thesis and relative recursion, Logic and Science Facing the New Technologies (Peter Schroeder-Heister, Gerhard Heinzmann, Wilfrid Hodges, \biband Pierre Edouard Bour, editors), College Publications\yearmagic,2014, Logic, Methodology and Philosophy of Science, Proceedings of the 14th International Congress (Nancy), posted in ynm’s homepage, pp.\weaktie179–200. \TheSortKeyIsmoschovakis yiannis n 2014 on the church turing thesis and relative recursion
[26] \bibritemaric2019 \guyY. N.Yiannis N.Moschovakis 20190 \guysmagic\bysame Abstract recursion and intrinsic complexity, ASL Lecture Notes in Logic, vol. 48, Cambridge University Press\yearmagic,2019, posted in ynm’s homepage. \TheSortKeyIsmoschovakis yiannis n 2019 abstract recursion and intrinsic complexity
[27] \bibfitemynmpaschalis2008 \guyY. N.Yiannis N.Moschovakis and \guyV.VasilisPaschalis 20080 \guysmagicYiannis N. Moschovakis \biband Vasilis Paschalis Elementary algorithms and their implementations, New computational paradigms (S. B. Cooper, Benedikt Lowe, \biband Andrea Sorbi, editors), Springer\yearmagic,2008, posted in ynm’s homepage, pp.\weaktie81—118. \TheSortKeyIsmoschovakis yiannis n paschalis vasilis 2008 elementary algorithms and their implementations
[28] \bibfitempatterson-hewitt1970 \guyM. S.Michael S.Patterson and \guyC. E.Carl E.Hewitt 19700 \guysmagicMichael S. Patterson \biband Carl E. Hewitt Comparative schematology, MIT AI Lab publication posted at http://hdl.handle.net/1721.1/6291 with the date 1978\yearmagic,1970. \TheSortKeyIspatterson michael s hewitt carl e 1970 comparative schematology
[29] \bibfitemsacks1990 \guyG. E.Gerald E.Sacks 19900 \guysmagicGerald E. Sacks Higher recursion theory, Perspectives in Mathematical Logic, Springer\yearmagic,1990. \TheSortKeyIssacks gerald e 1990 higher recursion theory
[30] \bibfitemtiuryn1989 \guyJ.JerzyTiuryn 19890 \guysmagicJerzy Tiuryn A simplified proof of DDL $<$ DL, Information and Computation, vol.\weaktie82\yearmagic(1989), pp.\weaktie1–12. \TheSortKeyIstiuryn jerzy 1989 simplified proof of ddl dl
[31] \bibfitemzt2000 \guyJ.J.V.Tucker and \guyJ.J.I.Zucker 20000 \guysmagicJ.V. Tucker \biband J.I. Zucker Computable functions and semicomputable sets on many-sorted algebras, Handbook of Logic in Computer Science (S. Abramsky, D.M. Gabbay, \biband T.S.E. Maibaum, editors), vol. 5, Oxford University Press\yearmagic,2000, pp.\weaktie317–523. \TheSortKeyIstucker jv zucker ji 2000 computable functions and semicomputable sets on many sorted algebras
[32] \bibfitemturing1936 \guyA. M.Alan M.Turing 19360 \guysmagicAlan M. Turing On computable numbers with an application to the Entscheidungsproblem, Proceedings of the London Mathematical Society, vol.\weaktie42\yearmagic(1936), pp.\weaktie230–265, A correction, ibid. volume 43 (1937), pp. 544–546. \TheSortKeyIsturing alan m 1936 on computable numbers with an application to the entscheidungsproblem
[33]

Elementary recursive algorithms∗{\,}^{\ast}

1 What is an algorithm?

Recursive algorithms

Implementations

Recursion vs. iteration

Functional vs. imperative programming

Proofs of correctness

Defining mathematical objects in set theory

2 Equational logic of partial terms with conditionals

Syntax

Semantics

Problems for Section 2

Problem x2.1.

3 Continuous, pure recursors

The recursor defined by an extended program

About implementations (2)

Strong recursor isomorphism

Lemma 3.1.

(Extended) program congruence

More general recursors

Problems for Section 3

Problem x3.1.

Problem x3.2.

4 Canonical forms and intensions

Some more syntax

Set representations

Immediate and irreducible terms

Irreducible extended programs

Arrow reduction

Lemma 4.1.

Proof.

Lemma 4.2 (Amalgamation).

Proof.

Extended program reduction

Size

Lemma 4.3.

Lemma 4.4.

Theorem 4.5 (Canonical forms).

Proof.

Canonical forms for richer languages

Intensions and elementary algorithms

The recursor representation of an iterative algorithm

Proposition 4.6.

About implementations (3)

Problems for Section 4

Problem x4.1.

Problem x4.2.

Problem x4.3 (Size).

Problem x4.4.

Problem x4.13∗.

5 Decidability of intensional program equivalence

Theorem 5.1.

Plan for the proof

The satisfaction relation

Preliminary constructions and notation

Lemma 5.2.

Forms of irreducible identities

Lemma 5.3 ([2-2]).

Proof.

Lemma 5.4 ([3-3]).

Proof.

Lemma 5.5 ([2-3]).

Proof.

Lemma 5.6 ([3-4]).

Proof.

Lemma 5.7.

Proof.

Lemma 5.8.

Proof.

Lemma 5.9.

Proof.

Theorem 5.10.

The general case

Lemma 5.11.

Proof.

Lemma 5.12.

Proof.

Corollary 5.13.

Proof.

Global intensional equivalence

Elementary recursive algorithms ${\,}^{\ast}$

Problem x4.13^∗.

Problem x5.46^∗ ([ynmfrege], for a related language).