
A Historical Account of My Early Research Interests
(This work has been partially supported by GNCS-INdAM, Italy.)

Alberto Pettorossi, DICII, University of Rome Tor Vergata, Rome, Italy, and IASI-CNR, Rome, Italy. [email protected]
Abstract

This paper presents a brief account of some of my early research interests. This historical account starts from my laurea thesis on Signal Theory and my master thesis on Computation Theory. It recalls some results in Combinatory Logic and Term Rewriting Systems. Some other results concern Program Transformation, Parallel Computation, Theory of Concurrency, and Proof of Program Properties. My early research activity was done mainly in cooperation with Andrzej Skowron, Anna Labella, and Maurizio Proietti.

1 From Signal Theory to Combinatory Logic

Since my childhood I have very much liked Arithmetic and Mathematics. Formal reasoning always attracted my spirit, and I have always felt a special interest in numbers and geometrical patterns. Maybe this was due to the fact that I thought that Mathematics is a way of establishing ‘truth beyond any doubt’. As Plato says: ‘Truth becomes manifest in the mathematical process’ (Phaedo). (The actual word used by Plato for ‘mathematical process’ comes from λογίζομαι, which means: I compute, I deduce.) During my high school years I attended the Classical Lyceum. Perhaps the Scientific Lyceum would have been a better school for me to attend, but it was located too far away from my home town.

At the age of nineteen, I began my university studies in Rome as a student of Engineering. I was in doubt whether or not to enrol as a Mathematics student, but eventually I followed my father’s suggestion to study Engineering because, as he said: “If you study Mathematics, you will have no other choice in life than to become a teacher.” My thesis work was in Telecommunication and, in particular, I studied the problem of how to pre-distort an electric signal which encodes a sequence of symbols, each one being 0 or 1, via a sequence of impulses. The pre-distortion of the electric signal should minimize the effect of a Gaussian white noise (which would require a reduction of bandwidth) and the interference between symbols (which would require an increase of bandwidth). A theoretical solution to this problem is not easy to find. Thus, it was suggested that I look for a practical solution via a numerical simulation of the transmission channel and the construction of the so-called eye pattern [46]. In the numerical simulation, which uses the Fast Fourier Transform algorithm, one could easily modify the various parameters of the pre-distortion for minimizing the errors in the output sequence of 0’s and 1’s. The thesis work was done under the patient guidance of my supervisors Professors Bruno Peroni and Paolo Mandarini.

After getting the laurea degree, during 1972 I attended at Rome University a course in Engineering of Control and Computation Systems. During that year I read the book entitled Mathematical Theory of Computation written by Professor Zohar Manna (1939–2018) (at that time that book was nothing more than a thick technical report of Stanford University, California). I wrote my master thesis on the “Automatic Derivation of Control Flow Graphs of Fortran Programs”, under the guidance of Professor Vincenzo Falzone and Professor Paolo Ercoli [47]. In particular, I wrote a Fortran program which derives the control flow graphs of Fortran programs. That program ran on a UNIVAC 1108 computer with the EXEC 8 Operating System. The main memory had 128k words. The program I wrote was a bit naive, but at that time I was not familiar with efficient parsing techniques. I also studied various kinds of program schemas and, in particular, those introduced by Lavrov [38], Yanov [75], and Martynuk [40]. Having constructed the control flow graph of a given Fortran program, one could transform that program into an equivalent one with better computational properties (such as smaller time or space complexity) by applying a set of schema transformations [33] which are guaranteed to preserve semantic equivalence. Schema transformations are part of the research area in which I remained interested for some years afterwards.

During that period, which overlapped with my military service in the Italian Air Force, I also read a book on Combinatory Logic (actually, not the entire book) by J. R. Hindley, B. Lercher and J. P. Seldin [28, 29]. I read the Italian edition of the book, which corrected some inaccuracies of the previous English edition (as Roger Hindley himself told me later). Under the guidance of Professor Giorgio Ausiello and with the great help of my colleague Carlo Batini, I studied various properties of subbases in Weak Combinatory Logic (WCL) [4].

WCL is an applicative system whose terms, called combinators, can be defined as follows: (i) K and S are atomic terms, and (ii) if t_1 and t_2 are terms, then (t_1 t_2) is a term. When parentheses are missing, left associativity is assumed. A notion of reduction, denoted >, is introduced as follows: for all terms x, y, z, Sxyz > xz(yz) and Kxy > x. Thus, for instance, SKKS > KS(KS) > S. WCL is a Turing complete system, as every partial recursive function can be represented as a combinator in WCL. A subbase in WCL is a set of terms which can be constructed starting from a fixed set of (possibly non-atomic) combinators. For instance, the subbase {B}, where B is a combinator defined by the following reduction: Bxyz > x(yz), is made out of all terms which are constructed by B’s (and parentheses) only. These terms are called B-combinators. One can show that B can be expressed in the subbase {S, K} by S(KS)K. Indeed, S(KS)Kxyz >* x(yz), where >* denotes the reflexive, transitive closure of >. The various subbases provide a way of partitioning the set of computable functions into various sets, according to the features of the combinators in the subbases. This should be contrasted with other stratifications of the set of computable functions one could define, among them the stratifications based on complexity classes or on the Chomsky hierarchy [31] with the type i (for i = 0, 1, 2, 3) classes of languages.

Among other subbases, we studied the subbase {B} and we showed how to construct the shortest B-combinator for constructing bracketed terms out of sequences of atomic subterms. For instance, B(B(BB)B)(BB) is the shortest B-combinator X such that: X x_1 x_2 x_3 x_4 x_5 x_6 >* x_1(x_2(x_3 x_4))(x_5 x_6).

During 1975, while attending in Rome the conference on λ-calculus and Computer Science Theory, where our results on subbases were presented [4], I heard from Professor Henk Barendregt of an open problem concerning the existence of a combinator made of S’s (and parentheses) only, having no weak normal form. A combinator T is said to be in weak normal form if no combinator T′ exists such that T > T′. A combinator X is said to have a weak normal form if there exists a combinator T such that X >* T and T is in weak normal form.

It was not hard to show that one such combinator is SAA(SAA), where A denotes ((SS)S). I sent the result to Henk Barendregt (by surface mail, of course). Some years later I was happy to see that an exercise about that problem and its solution was included in Barendregt’s book on λ-calculus [3, page 162].
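To make the reduction mechanics concrete, here is a minimal Haskell sketch (ours, not from the original papers) of leftmost-outermost weak reduction for S- and K-terms; it can be used to check, within a bounded number of steps, that SAA(SAA) reaches no weak normal form.

data Term = S | K | App Term Term deriving (Eq, Show)

-- One step of leftmost-outermost weak reduction, if a redex is available.
step :: Term -> Maybe Term
step (App (App (App S x) y) z) = Just (App (App x z) (App y z))  -- Sxyz > xz(yz)
step (App (App K x) _)         = Just x                          -- Kxy  > x
step (App t u) = case step t of
  Just t' -> Just (App t' u)
  Nothing -> App t <$> step u
step _ = Nothing

-- Just t if a weak normal form t is reached within n steps, Nothing otherwise.
reduceWithin :: Int -> Term -> Maybe Term
reduceWithin 0 _ = Nothing
reduceWithin n t = maybe (Just t) (reduceWithin (n - 1)) (step t)

aTerm, saaSaa :: Term
aTerm  = App (App S S) S                                 -- A = (SS)S
saaSaa = App (App (App S aTerm) aTerm)
             (App (App S aTerm) aTerm)                   -- SAA(SAA)

-- reduceWithin 20 saaSaa == Nothing: no weak normal form is reached.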

2 Finite and Infinite Computations

While studying Combinatory Logic, I became interested in terms viewed as trees and as tree transformers. Indeed, combinators can be considered both as trees and as tree transformers at the same time. This area was also related to the research on Term Rewriting Systems, which was going to be one of my interests in the following years. The search for a non-terminating combinator stimulated my studies on infinite, non-terminating computations.

In 1979 I introduced a hierarchy of infinite computations within WCL (and other Turing complete systems) which is related to the Chomsky hierarchy of languages [49]. That definition uses the notion of a sampling function s, that is, a total function from the set of natural numbers to {true, false}, which from an infinite sequence σ = ⟨w_0, w_1, w_2, …⟩ of finite words constructed by an infinite computation, selects an infinite subsequence σ_s whose words are the elements of a (finite or infinite) language L_s. We state that L_s =def {w_j | j ≥ 0 ∧ w_j occurs in σ ∧ s(j) = true}. Let us assume that L_s is generated by a grammar G_s. In this case we say that also the subsequence σ_s is generated by the grammar G_s. Given a sequence σ, by varying the sampling function s we have different languages L_s and different generating grammars G_s. For i = 0, 1, 2, 3, we say that the infinite computation which generates σ is of type i if there exists a sampling function s selecting a subsequence σ_s generated by a grammar of type i, and no sampling function s′ exists such that the subsequence selected by s′ is generated by a grammar of type i+1.

For instance, let us consider the following program P:

w = “a”;   while true do   print w;   w = “b” w “c”;   π_0   od

where a, b, c are characters, w is a string of characters, and π_0 is a terminating program fragment associated with a type 0 language L_0, such that: (i) L_0 is not of type 1, (ii) π_0 does not modify w, (iii) at each loop body execution, π_0 prints only one word of L_0, and (iv) for every word v ∈ L_0 there is exactly one body execution in which π_0 prints v. We have that P evokes an infinite computation of type 2, as the grammar with axiom S and productions S → a | b S c is a type 2 (context-free) grammar.
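As an aside, the words printed by the statement print w of P (ignoring, for simplicity, the words printed by π_0) can be modelled as an infinite list, and a sampling function as a predicate on indices; a small Haskell sketch of ours:

-- The words printed by P: w0 = "a", w(n+1) = "b" ++ wn ++ "c".
sigma :: [String]
sigma = iterate (\w -> "b" ++ w ++ "c") "a"

-- A sampling function s selects the subsequence of words at the indices where s holds.
sample :: (Int -> Bool) -> [a] -> [a]
sample s ws = [w | (j, w) <- zip [0 ..] ws, s j]

-- The sampled words form the type 2 language { b^n a c^n | n >= 0 }:
-- take 4 (sample (const True) sigma) == ["a","bac","bbacc","bbbaccc"]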

When I first presented this hierarchy definition at a conference, I met my dear colleague Philippe Flajolet (1948–2011) and he said to me: “I have already studied these topics [25]. You should look at the immune sets.” That remark motivated my first encounter with Rogers’ book on recursion theory [67], where immune sets are defined and analyzed. Then also Professor Maurice Nivat (1937–2017) came to me and said: “It is a nice piece of work, … but you should rewrite the paper in a better way!”. I was very glad that Nivat showed interest in my work. He was right in asking me to rewrite it and improve it. Unfortunately, I did not follow his suggestion. Not even when, a few years later, Professor Tony Hoare told me: “I like writing and rewriting my papers.”

Looking for terms with infinite behaviour in WCL, in 1980 I wrote a paper on the automatic construction of combinators having no normal form by using the so-called accumulation method and the pattern matching and hereditary embedding method [50]. The solutions of some equations between terms would guarantee the existence of the combinators with the desired properties.

On the other side of the camp, that is, considering finite behaviours, many people at that time were studying properties of Term Rewriting Systems (TRSs) which would guarantee termination. Among them were Nachum Dershowitz, Samuel Kamin, Jean-Jacques Lévy, and David Plaisted. In 1981 I wrote a paper introducing the non-ascending property [51]. In that paper I related the various techniques which had been proposed, including recursive path orderings, simplification orderings, and bounded lexicographic orderings. I thank Nachum for pointing out to me some errors in that paper and, in particular, a missing left-linearity hypothesis about the TRS under consideration [21]. A TRS is said to be left linear if the variable occurrences on the left-hand side of every rule are all distinct. For instance, f(x,y,z) → f(y,z,x) is a left linear rule, while f(x,y,x) → g(y,x) is not. During a conference coffee break, Jean-Jacques showed me a simple inductive proof of Fact 1 [51, pages 436–437] using bounded lexicographic orderings (actually, that proof is based on a non-predicative definition of the non-ascending rewriting rules).
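The left-linearity check is a purely syntactic test; here is a minimal Haskell sketch of ours, assuming first-order terms with string-named variables and function symbols.

import Data.List (nub)

data Term = Var String | Fun String [Term]

varsOf :: Term -> [String]
varsOf (Var x)    = [x]
varsOf (Fun _ ts) = concatMap varsOf ts

-- A rule lhs -> rhs is left linear iff no variable occurs twice in lhs.
leftLinear :: Term -> Bool
leftLinear lhs = let vs = varsOf lhs in vs == nub vs

-- leftLinear (Fun "f" [Var "x", Var "y", Var "z"]) == True
-- leftLinear (Fun "f" [Var "x", Var "y", Var "x"]) == False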

3 Program Transformation

During the years 1977–1981 I visited Edinburgh University, supported by the British Council and the Italian National Research Council. I did my Ph.D. thesis work on program transformation under the guidance of Professor Rod Burstall and, during Rod’s visit to Inria in Paris for some months, of Professor Robin Milner. I met Rod in person for the first time at the Artificial Intelligence Department, in Hope Park Square at Edinburgh. I addressed him by saying: “Professor Burstall, …”. I do not remember my subsequent words, but I do remember what he said to me in reply: “Alberto, this is the last time you call me ‘professor’. Please, call me Rod.” He introduced me to functional programming and he wrote ‘for me’, as he said, a compiler for a new functional language, called NPL [10], which he was developing at that time. The language NPL later evolved into Hope [12]. While at Edinburgh, I wrote a paper [48] on the automatic annotation of functional programs for improving memory utilization. Functions could destroy the values of their arguments whenever they were no longer needed for subsequent computations. I apologize for not having Rod as co-author of that paper.

My Ph.D. thesis work was mainly on program transformation, starting from the seminal paper by Rod and John Darlington [11]. Some time before, Rod had received a letter from Professor Edsger W. Dijkstra (1930–2002) proposing the following ‘exercise’ in program transformation: the derivation of an iterative program for the fusc function [23, pages 215–216, 230–232]:

fusc(0) = 0        fusc(1) = 1

fusc(2n) = fusc(n)        fusc(2n+1) = fusc(n+1) + fusc(n)    for n ≥ 0

In one of my scientific conversations with Rod, he told me about his research interests and he also mentioned the above exercise. The difficult part of the exercise was how to motivate the ‘invention’ of the new function definitions to be introduced during program transformation in the so-called eureka steps [11].

To do the same exercise, Bauer and Wössner [5, page 288] use an embedding into a linear combination, that is, they define the function F(n,a,b) =def a·fusc(n) + b·fusc(n+1). Using that function, they are able to derive for fusc a program that is linear recursive and also tail recursive. Then, from that program they easily derive an iterative program. But where does the function F come from? I wanted to do the exercise using the unfolding/folding rules only [11] and, at the same time, I wanted to give a somewhat mechanizable account of the definition of the new functions to be introduced.

Now, the unfolding rule allows one to unroll (up to a specified depth) the recursive calls, thereby generating a directed acyclic graph of distinct calls. I called that graph the m-dag. The prefix m (short for minimal) tells us that in an m-dag identical function calls are denoted by a single node. Then, I used the so-called tupling strategy, which allows one to define new functions as the result of tupling together function calls which share common subcalls, that is, calls which have common descendants in the m-dag. Note that checking this sharing property requires syntactic operations only on the m-dags. By using the tupling strategy, looking at the m-dag for fusc, we introduce the tuple function t(n) =def ⟨fusc(n), fusc(n+1)⟩ and we get the following recursive equations for fusc:

fusc(n) = u  where ⟨u,v⟩ = t(n)    for n ≥ 0

t(0) =  {by unfolding}  = ⟨fusc(0), fusc(1)⟩ =  {by unfolding}  = ⟨0, 1⟩

t(2n) =  {by unfolding}  = ⟨fusc(2n), fusc(2n+1)⟩ =  {by unfolding}  =
= ⟨fusc(n), fusc(n+1)+fusc(n)⟩ =  {by where abstraction [11]}  =
= ⟨u, u+v⟩ where ⟨u,v⟩ = ⟨fusc(n), fusc(n+1)⟩ =  {by folding}  =
= ⟨u, u+v⟩ where ⟨u,v⟩ = t(n)    for n > 0

t(2n+1) = ⟨u+v, v⟩ where ⟨u,v⟩ = t(n)    for n ≥ 0    (by a derivation similar to that of t(2n))
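The derivation can be checked by executing it; here is a minimal Haskell sketch of ours of the original fusc and of the tupled function t:

fusc :: Integer -> Integer
fusc 0 = 0
fusc 1 = 1
fusc n | even n    = fusc (n `div` 2)                          -- fusc(2n)   = fusc(n)
       | otherwise = fusc (n `div` 2 + 1) + fusc (n `div` 2)   -- fusc(2n+1) = fusc(n+1)+fusc(n)

-- t n = (fusc n, fusc (n+1)), following the derived equations.
t :: Integer -> (Integer, Integer)
t 0 = (0, 1)
t n | even n    = let (u, v) = t (n `div` 2) in (u, u + v)
    | otherwise = let (u, v) = t (n `div` 2) in (u + v, v)

fusc' :: Integer -> Integer
fusc' n = fst (t n)   -- agrees with fusc, but is linear recursive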

Now a last step is needed to get the iterative program desired by Dijkstra’s exercise.

I used the following schema equivalence (like the ones in [70]), stating that t(n), defined by the non-tail-recursive equations:

t(0) = a

t(2n) = b(t(n))    for n > 0

t(2n+1) = c(t(n))    for n ≥ 0

is equal to the value of res returned by the following program, where B[ℓ..0] stores the binary expansion of m, the most significant bit being at position ℓ (obviously, B[ℓ..0] can be computed by performing O(log m) successive integer divisions by 2):

res = a;   p = ℓ;   while p ≥ 0 do  if B[p] = 0 then res = b(res) else res = c(res);  p = p−1  od

By using this schema equivalence, we derive from the above linear, non-tail-recursive program for fusc the following iterative program:

{ n ≥ 0  ∧  n = Σ_{p=0}^{ℓ} B[p]·2^p }

⟨u,v⟩ = ⟨0,1⟩;   p = ℓ;

while p ≥ 0 do  if B[p] = 0 then v = u+v else u = u+v;  p = p−1  od

{ ⟨u,v⟩ = t(n)  ∧  u = fusc(n) }

Note that we do not need to state the somewhat intricate invariant of the while-loop for showing the correctness of the derived iterative program, as Dijkstra’s methodology for program construction would have required us to do. The derived program, which is correct by construction, uses an O(log n) number of operations for computing fusc(n), as does Dijkstra’s program reported in [23, pages 215–216]. (In order to get exactly Dijkstra’s program, one should perform a generalization step as indicated in [53].) We have only to show by induction, once and for all, the validity of the schema equivalence we have used.
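Here is a minimal Haskell sketch of the derived iterative program, scanning the binary expansion most-significant-bit first (this rendering is ours, not from [53]):

-- Binary expansion of n, most significant bit first (B[ℓ..0] in the text).
bits :: Integer -> [Integer]
bits 0 = [0]
bits n = go n [] where
  go 0 acc = acc
  go m acc = go (m `div` 2) (m `mod` 2 : acc)

-- Scan the bits, updating the pair <u,v>: bit 0 acts as b, bit 1 as c.
fuscIter :: Integer -> Integer
fuscIter n = fst (foldl stepBit (0, 1) (bits n)) where
  stepBit (u, v) 0 = (u, u + v)    -- t(2n)   = <u, u+v>
  stepBit (u, v) _ = (u + v, v)    -- t(2n+1) = <u+v, v>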

Having derived an iterative program for the fusc function, I faced the problem of deriving by transformation an iterative program, such as the one suggested in [42], which computes the Fibonacci function fib(n) using an O(log n) number of arithmetic operations. Here is the definition of the Fibonacci function:

fib(0) = 0        fib(1) = 1

fib(n+2) = fib(n+1) + fib(n)    for n ≥ 0    (†1)

By using the tupling strategy, the function g(n) =def ⟨fib(n), fib(n−1)⟩ is introduced and the following program is derived:

fib(0) = 0        fib(1) = 1

fib(n+2) = u  where ⟨u,v⟩ = g(n+2)    for n ≥ 0

g(1) = ⟨1, 0⟩

g(n+2) = ⟨u+v, u⟩ where ⟨u,v⟩ = g(n+1)    for n ≥ 0

The iterative program for fib can be obtained by applying the following schema equivalence, stating that g(n) defined by the equations:

g(0) = a        g(n+1) = b(g(n))    for n ≥ 0

is equal to the value of res returned by the program:

res = a;   while n > 0 do  res = b(res);  n = n−1  od

Thus, we get:

{ n ≥ 0 }

if n = 0 then u = 0 else
if n = 1 then u = 1 else
begin  p = n−1;  ⟨u,v⟩ = ⟨1,0⟩;  while p > 0 do  ⟨u,v⟩ = ⟨u+v, u⟩;  p = p−1  od  end

{ u = fib(n) }

This program has a linear time complexity, in the sense that it computes the result by a linear number of additions. In order to get a program which requires an O(log n) number of arithmetic operations when computing fib(n), we should invent the multiplication operation, which is not present in Equation (†1). From that equation, by unfolding, we have:

fib(n+2) = fib(n+1) + fib(n) =  {by unfolding fib(n+1)}  =

= 2·fib(n) + fib(n−1) =  {by unfolding fib(n)}  =

= 3·fib(n−1) + 2·fib(n−2)    (†2)

The unfolding process may continue for some more steps, but we stop here. We will not discuss here the important issue of how many unfolding steps should be performed when deriving programs by transformation. Let us simply note that more unfoldings may exhibit more patterns of function calls, from which more efficient functions can be derived.

In our case the invention of the multiplication operation is reduced to three generalization steps [56]. First, we generalize the initial values 0 and 1 of the function fib to two variables a_0 and a_1, respectively. (This kind of generalization step is usually done when mechanically proving theorems about functions [8].) By promoting those new variables to arguments, we get the following new function G:

G(a_0, a_1, 0) = a_0        G(a_0, a_1, 1) = a_1

G(a_0, a_1, n+2) = G(a_0, a_1, n+1) + G(a_0, a_1, n)    for n ≥ 0    (†3)

This function G satisfies the following equation, which is derived from Equation (†3) as Equation (†2) has been derived from (†1):

G(a_0, a_1, n+2) = 3·G(a_0, a_1, n−1) + 2·G(a_0, a_1, n−2)    (†4)

The second generalization consists in generalizing the coefficients 2 and 3 to two functions p(n) and q(n), respectively (and thus multiplication is introduced). By this generalization we establish a correspondence between the value of the coefficients and the number of unfoldings performed. We can then derive the explicit definitions of the functions p(n) and q(n) as shown in [56], and we get that p(n) = G(1,0,n) and q(n) = G(0,1,n).

The third and final generalization consists in generalizing the argument n+2 on the left-hand side of Equation (†4) to n+k and promoting the new variable k to an argument of a new function defined as follows: F(a_0, a_1, n, k) =def G(a_0, a_1, n+k).

From the equations defining F(a_0, a_1, n, k) we get (the details are in [56, pages 184–185]):

G(a_0, a_1, n+k) = G(0,1,k)·G(a_0, a_1, n+1) + G(1,0,k)·G(a_0, a_1, n)

Then, by taking n = k and n = k+1, we also get:

G(a_0, a_1, 2k) = G(0,1,k)·G(a_0, a_1, k+1) + G(1,0,k)·G(a_0, a_1, k)    for k > 0

G(a_0, a_1, 2k+1) = G(0,1,k)·G(a_0, a_1, k+2) + G(1,0,k)·G(a_0, a_1, k+1)    for k ≥ 0

Eventually, by tupling together the function calls which share the same subcalls, we get the following program, which computes fib(n) by performing an O(log n) number of arithmetic operations only, as desired. For all k ≥ 0, the function r(k) is the pair ⟨G(1,0,k), G(0,1,k)⟩.

fib(0) = 0        fib(1) = 1

fib(n+2) = u+v  where ⟨u,v⟩ = r(n+1)    for n ≥ 0

r(0) = ⟨1, 0⟩

r(2k) = ⟨u²+v², 2uv+v²⟩ where ⟨u,v⟩ = r(k)    for k > 0

r(2k+1) = ⟨2uv+v², (u+v)²+v²⟩ where ⟨u,v⟩ = r(k)    for k ≥ 0

We leave it to the reader to derive the iterative program that can be obtained from this program by a simple schema equivalence. One can say that the program we have derived is even better than the program based on 2×2 matrix multiplications [42], because it tuples together two values only, not four, as required by the use of the 2×2 matrices. Note that our derivation of the program does not rely on any knowledge of matrix theory.
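For the record, here is the derived logarithmic program as a Haskell sketch of ours:

-- r k = <G(1,0,k), G(0,1,k)>, computed with O(log k) arithmetic operations.
r :: Integer -> (Integer, Integer)
r 0 = (1, 0)
r k | even k    = let (u, v) = r (k `div` 2) in (u*u + v*v, 2*u*v + v*v)
    | otherwise = let (u, v) = r (k `div` 2) in (2*u*v + v*v, (u+v)^2 + v*v)

fibLog :: Integer -> Integer
fibLog 0 = 0
fibLog 1 = 1
fibLog n = let (u, v) = r (n - 1) in u + v   -- fib(n+2) = u+v where <u,v> = r(n+1)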

In the paper with Rod Burstall [56] there is a generalization of the derivation of the logarithmic-time program for fib to the case of any linear recurrence relation over any semiring structure. What remains to be done? One may want to derive a constant-time program for evaluating any linear recurrence relation over a semiring. This would require the introduction of the exponentiation operation. Recall that fib(n) = (A^n − B^n)/√5, where A = (1+√5)/2 and B = (1−√5)/2.

From September 1977 to June 1978, I visited the School of Computer and Information Science at Syracuse University, N.Y., USA. I attended courses taught by Professors Alan Robinson, John Reynolds, Lockwood Morris, and Robert Kowalski (at that time a visiting professor from Imperial College, London, UK). It was a splendid occasion for deepening my knowledge of many aspects of Computer Science under such illustrious teachers.

In Syracuse I had the opportunity of reading more carefully some parts of the book Introduction to Automata Theory, Languages, and Computation by Hopcroft and Ullman [31] and the book Introduction to Mathematical Logic by Mendelson [41]. Professor Kowalski exposed me for the first time to various topics of Artificial Intelligence, and I read the preliminary draft of his beautiful book Logic for Problem Solving [36]. I remember the stress put by Kowalski on Keith Clark’s negation-as-failure semantics for logic programs [13]. This Computational Logic area was going to become my main research area in the years to come, through my cooperation with Maurizio Proietti on Logic Program Transformation.

4 The Tupling Strategy and the List Introduction Strategy

The results of the use of tupling and generalization during program transformation were presented in a paper at the 1984 ACM Symposium on Lisp and Functional Programming, Austin, Texas, USA [53]. While I was giving a seminar on those results at the University of Warsaw (Poland), Professor Helena Rasiowa (who, together with Roman Sikorski, gave in 1950 a first algebraic proof of Gödel’s Completeness Theorem for first-order predicate calculus), who was in the audience, kindly said to me at the end: “Your paper is a collection of examples!”. I was not surprised by that remark, but I was happy to have, among the examples, a simple derivation of an iterative program for computing the moves of the Towers of Hanoi problem. That task was considered to be very challenging by some authors (see, for instance, [27, page 285]), and the derivation I proposed is also easily mechanizable.

The following Hanoi function h(n,A,B,C) computes the shortest sequence of moves in the free monoid {AB, BC, CA, BA, CB, AC}* to move n (≥ 0) disks from peg A to peg B using peg C as an extra peg. A move of a disk from peg X to peg Y is denoted by XY, for any distinct X, Y in {A, B, C}. Every disk is of a different size and over any disk only smaller disks can be placed. ε denotes the empty sequence of moves, and :: denotes the concatenation of sequences of moves.

h(0,A,B,C) = ε

h(n+1,A,B,C) = h(n,A,C,B) :: AB :: h(n,C,B,A)    for n ≥ 0    (†5)

In order to get an iterative program for computing h(n,A,B,C), we first unfold h(n,A,C,B) and h(n,C,B,A) in (†5) and then we tuple together in the new function t(n−1) the calls h(n−1,A,B,C), h(n−1,B,C,A), and h(n−1,C,A,B), which share common subcalls (see Figure 1). The order of the components in the tuple is insignificant. Details are in [54].

[Figure 1 depicts the upper portion of the call graph m-dag: the root h(n+1,A,B,C) requires the calls h(n,A,C,B) and h(n,C,B,A); these in turn require the calls h(n−1,B,C,A), h(n−1,A,B,C), and h(n−1,C,A,B), which form the tuple t(n−1); three levels below the root, the calls h(n−3,B,C,A), h(n−3,A,B,C), and h(n−3,C,A,B) form the tuple t(n−3).]

Figure 1: An upper portion of the call graph m-dag of the Hanoi function h(n+1,A,B,C). An edge from an upper node to a lower node denotes that the upper call requires the lower call. Dashed lines denote tuples.

We get:

h(0,A,B,C) = ε        h(1,A,B,C) = AB

h(n+2,A,B,C) = u :: AC :: v :: AB :: w :: CB :: u   where ⟨u,v,w⟩ = t(n)    for n ≥ 0

t(0) = ⟨ε, ε, ε⟩        t(1) = ⟨AB, BC, CA⟩

t(n+2) = ⟨ u :: AC :: v :: AB :: w :: CB :: u,
           v :: BA :: w :: BC :: u :: AC :: v,
           w :: CB :: u :: CA :: v :: BA :: w ⟩   where ⟨u,v,w⟩ = t(n)    for n ≥ 0
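The tupled program runs as given; a minimal Haskell sketch of ours, with moves as strings and :: rendered as list concatenation:

type Moves = [String]

-- t n = <h(n,A,B,C), h(n,B,C,A), h(n,C,A,B)>, as in Figure 1.
t3 :: Integer -> (Moves, Moves, Moves)
t3 0 = ([], [], [])
t3 1 = (["AB"], ["BC"], ["CA"])
t3 n = ( u ++ ["AC"] ++ v ++ ["AB"] ++ w ++ ["CB"] ++ u
       , v ++ ["BA"] ++ w ++ ["BC"] ++ u ++ ["AC"] ++ v
       , w ++ ["CB"] ++ u ++ ["CA"] ++ v ++ ["BA"] ++ w )
  where (u, v, w) = t3 (n - 2)

hanoi :: Integer -> Moves        -- h(n,A,B,C)
hanoi 0 = []
hanoi 1 = ["AB"]
hanoi n = u ++ ["AC"] ++ v ++ ["AB"] ++ w ++ ["CB"] ++ u
  where (u, v, w) = t3 (n - 2)

-- hanoi 3 == ["AB","AC","BC","AB","CA","CB","AB"]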

Then, we can apply the schema equivalence stating that g(n) defined by the equations:

g(0) = a        g(1) = b        g(n+2) = c(g(n))    for n ≥ 0

is equal to the value of res returned by the program:

if even(n) then res = a else res = b;

while n > 1 do  res = c(res);  n = n−2  od

We get the following program, where for k = 1, 2, 3, Tk denotes the k-th component of the triple T:

{ n ≥ 0 }

if n = 0 then Hanoi = ε else
if n = 1 then Hanoi = AB else
begin  n = n−2;  if even(n) then T = ⟨ε, ε, ε⟩ else T = ⟨AB, BC, CA⟩;
    while n > 1 do  T = ⟨ T1 :: AC :: T2 :: AB :: T3 :: CB :: T1,
                          T2 :: BA :: T3 :: BC :: T1 :: AC :: T2,
                          T3 :: CB :: T1 :: CA :: T2 :: BA :: T3 ⟩;  n = n−2  od;
    Hanoi = T1 :: AC :: T2 :: AB :: T3 :: CB :: T1
end

{ Hanoi = h(n,A,B,C) }

The technique we have presented is based only on the tupling strategy and a simple schema equivalence. That technique is successful also for the many variants of the Towers of Hanoi problem that can be found in the literature (see, among others, [24]). A different derivation for computing the Hanoi function can be done by introducing, besides the tuple t(n), also the tuple t′(n−2) =def ⟨h(n−2,A,C,B), h(n−2,C,B,A), h(n−2,B,A,C)⟩, corresponding to the calls of h at level n−2 (not depicted in Figure 1). We leave this derivation to the reader.

In a later paper I addressed the problem of finding the m-th move of algorithms which compute sequences of moves, without computing any other move [55]. This problem arose as a generalization of the corresponding problem for the Towers of Hanoi. If the moves are computed by a function defined by a recurrence relation, then, under suitable hypotheses, it is indeed possible to compute the m-th move without computing any other move. For the case of the Hanoi function h(n,A,B,C), the length Lh(n) of the sequence of moves for n disks satisfies the following equations: Lh(0) = 0 and Lh(n+1) = 2·Lh(n) + 1 for n ≥ 0.

One can show [55] that the m-th move of h(n,A,B,C), for 1 ≤ m ≤ 2^n − 1 and n ≥ 0, can be computed using the deterministic finite automaton of Figure 2. We assume that M[ℓ..0] is the binary expansion of m, the most significant bit being at the leftmost position ℓ. Thus, m = Σ_{i=0}^{ℓ} M[i]·2^i, and m is not a power of 2 iff M[ℓ..0] ∉ 10*. Let trans(X,p) denote the state Y such that in the finite automaton of Figure 2 there is an arc from state X to state Y with label p.

i = ℓ;   state = AB;

while M[i..0] ∉ 10* do  begin  state = trans(state, M[i]);  i = i−1  end  od

The m-th move is the name of the final state, with B and C interchanged if an odd number of state transitions has been made.

[Figure 2 depicts a deterministic finite automaton with states AB, BC, and CA, and initial state AB. The transitions are: from AB to BC and from BC to AB with label 1; from BC to CA and from CA to BC with label 0; a self-loop on AB with label 0; and a self-loop on CA with label 1.]

Figure 2: The finite automaton for computing the m-th move in the sequence h(n,A,B,C) of moves for the Towers of Hanoi problem with n disks and pegs A, B, and C.

Suppose that we want to compute the 44-th move of h(6,A,B,C). The binary expansion of 44 is 101100. Starting from the left, we take the prefix 101 up to (and excluding) the suffix in 10* (in our case 100). We perform the transitions on the automaton of Figure 2 starting from state AB according to that prefix (from left to right) and we get to state CA. Since the length of the prefix is odd (it is indeed 3), the move to be computed is BA, that is, CA with B and C interchanged.
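The automaton of Figure 2 and the prefix computation are easy to render executably; here is a minimal Haskell sketch of ours (states and moves represented as the strings "AB", "BC", "CA"), which reproduces the example above:

-- Transition function of the automaton of Figure 2.
trans :: String -> Int -> String
trans "AB" 0 = "AB"
trans "AB" 1 = "BC"
trans "BC" 0 = "CA"
trans "BC" 1 = "AB"
trans "CA" 0 = "BC"
trans "CA" 1 = "CA"
trans s   _ = s      -- unreachable for well-formed inputs

-- Given the bits M[ℓ..0] of m (most significant first), drop the maximal
-- suffix in 10*, run the automaton on the remaining prefix, and swap B and C
-- if the prefix length is odd.
mthMove :: [Int] -> String
mthMove bitsM = swapIfOdd (length prefix) (foldl trans "AB" prefix)
  where
    prefix = reverse (drop 1 (dropWhile (== 0) (reverse bitsM)))
    swapIfOdd k = if odd k then map swapBC else id
    swapBC 'B' = 'C'
    swapBC 'C' = 'B'
    swapBC c   = c

-- mthMove [1,0,1,1,0,0] == "BA": the 44-th move of h(6,A,B,C).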

In a subsequent paper with Maurizio Proietti [58] we explored the idea of introducing lists, rather than arrays (indeed, tuples, being of fixed size, can be seen as arrays). Originally, this idea was suggested to me by Rod Burstall. Since every recursive function can be computed by using stacks (actually, two stacks are sufficient for computing any partial recursive function on natural numbers [31]), this technique seemed to me, at first, not very relevant in the practice of improving the time complexity of a program or avoiding inefficient recursions. We explored the use of this technique and, indeed, we managed to achieve good results. In particular, the list introduction strategy can be used when the recursive calls do not generate a sequence of cuts of constant size in the m-dag of the function calls, a situation which does not allow the use of the tupling strategy. A cut in an m-dag is a set C of nodes such that every path from the root to a leaf intersects C. In the case of the Hanoi function (see Figure 1) we have depicted the cuts associated with t(n−1) and t(n−3). Both of them are of size 3 and thus the tupling strategy (with three function calls) is successful. More details on cuts and their use for program transformation, also in relation with pebble games [45], can be found in my Ph.D. thesis [52].

We used the list introduction strategy for deriving a program for computing the binomial coefficients, which satisfy C(n+1,k+1) = C(n,k) + C(n,k+1), where C(n,k) denotes ‘n choose k’. In this case the sequence of cuts from the root to the leaves is of increasing size. Indeed, C(n+1,k+1) requires the computations of C(n,k) and C(n,k+1), which in turn require the computations of C(n−1,k−1), C(n−1,k), C(n−1,k+1), and so on. (Indeed, in the Pascal Triangle the basis has an increasing size when the height of the triangle increases.) Therefore, the tupling strategy cannot be used, and a list of unbounded size is introduced instead, as sketched below.
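A minimal Haskell sketch of that idea (ours, not the actual derivation in [58]): instead of a tuple of fixed size, carry an entire row of the Pascal Triangle as a list.

-- Each row of the Pascal Triangle corresponds to a cut in the m-dag of the calls of C.
nextRow :: [Integer] -> [Integer]
nextRow row = zipWith (+) (0 : row) (row ++ [0])

binom :: Int -> Int -> Integer
binom n k = (iterate nextRow [1] !! n) !! k   -- C(n,k); e.g. binom 4 2 == 6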

Now, in order to show the power of the list introduction strategy, let us consider the n-queens problem. Details are in [58]. An n×n board configuration Qs is represented by a list of pairs of the form [⟨R_1,C_1⟩, …, ⟨R_n,C_n⟩], where for i = 1,…,n, ⟨R_i,C_i⟩ denotes a queen placed in row R_i and column C_i. For i = 1,…,n, the values of R_i and C_i belong to the list [1,…,n].

We start from the following initial program Queens:

1.  queens(Ns,Qs) ← placequeens(Ns,Qs), safeboard(Qs)
2.  placequeens([],[]) ←
3.  placequeens(Ns,[Q|Qs]) ← select(Q,Ns,Ns1), placequeens(Ns1,Qs)
4.  safeboard([]) ←
5.  safeboard([Q|Qs]) ← safequeen(Q,Qs), safeboard(Qs)
6.  safequeen(Q,[]) ←
7.  safequeen(Q1,[Q2|Qs]) ← notattack(Q1,Q2), safequeen(Q1,Qs)

In order to place n queens we solve the goal queens([1,…,n],Qs). By clause 1 we have that placequeens([1,…,n],Qs) generates a board configuration Qs and safeboard(Qs) checks that in Qs no two queens lie on the same diagonal (either ‘up diagonal’ or ‘down diagonal’ in Dijkstra’s terminology [22]). We assume that notattack(Q1,Q2) holds iff queen position (or queen, for short) Q1, that is, ⟨R1,C1⟩, is not on the same diagonal as queen Q2. The tests that the queens are neither on the same row nor on the same column can be avoided by assuming that select(Q,Ns,Ns1) holds iff Ns is a list of distinct numbers in [1,…,n], Q is a queen ⟨R,C⟩ such that row R is the length of Ns and column C is a member of Ns, and Ns1 is the new list obtained from Ns by deleting the occurrence of C. The length of the list Ns decreases by one unit after each call of placequeens. In particular, we have that board configurations having k queens (with 1 ≤ k ≤ n) are of the form [⟨n,c_1⟩, ⟨n−1,c_2⟩, …, ⟨n−k+1,c_k⟩], where c_1, c_2, …, c_k are distinct numbers in [1,…,n].

Program Queens solves the problem using the generate-and-test approach, and it is not efficient. A more efficient program, using an accumulator that stores the diagonals which are not safe, has been proposed in [68, page 255]. Efficiency is increased because backtracking is reduced.

By applying the list introduction strategy (which also includes some generalization steps) one can derive the following program TransfQueens, whose behaviour is similar to that of the accumulator version. The various transformation steps are described in [58]. The higher efficiency of the final program is due to the fact that the test for a safe board configuration is ‘promoted’ into the process of generating new configurations, so that the number of generated unsafe board configurations is decreased (see the filter promotion technique [6, 16]).

 8.  queens([],[]) ←
 9.  queens(Ns,[Q|Qs]) ← select(Q,Ns,Ns1), genlist1(Ns1,Qs,[Q])
10.  genlist1([],[],Ps) ←
11.  genlist1(Ns,[Q1|Qs],[]) ← select(Q1,Ns,Ns1), genlist1(Ns1,Qs,[Q1])
12.  genlist1(Ns,[Q1|Qs],[P1|Ps]) ← select(Q1,Ns,Ns1), notattack(P1,Q1),
                                    genlist2(Ns1,Qs,[P1],Ps,Q1)
13.  genlist2(Ns1,Qs,Ps1,[],Q1) ← genlist1(Ns1,Qs,[Q1|Ps1])
14.  genlist2(Ns1,Qs,Ps1,[P2|Ps2],Q1) ← notattack(P2,Q1),
                                        genlist2(Ns1,Qs,[P2|Ps1],Ps2,Q1)

This program performs much less backtracking than the Queens program. (In some experiments we have done, for 10 queens TransfQueens runs about 70 times faster than Queens.) By clause 9, the first queen position Q is selected and genlist1 is called with its last argument storing the current board configuration, which is the list [Q]. When a new queen is placed at position Q1 and it is not attacked by the last queen placed, at position P1 (see clause 12), genlist2 checks whether or not Q1 is attacked by the queens already present in the current configuration, whose positions are stored in Ps (see clauses 13 and 14). If Q1 is not attacked, the configuration is updated (see the last argument [Q1|Ps1] of genlist1 in clause 13). Otherwise, if Q1 is attacked, by backtracking (see the atom select in clause 12) a different queen position is selected. If all positions for the new queen are under attack, then by backtracking (see the atoms select in clauses 9 and 11) the position of a previously placed queen, if there is one, is selected in a different way.

The explanation which we have just given of the derived program (clauses 8–14) may appear unclear to the non-expert reader, but one should note that it was not needed at all. Indeed, the correctness of the derived program is guaranteed by the correctness of the transformation rules, and the efficiency improvement is due to filter promotion.

5 The Lambda Abstraction Strategy

While studying the tupling strategy and analyzing its power, a sentence by John Darlington, with whom I shared the office in Edinburgh, often came to my mind: “After unfolding, having done some local improvements (such as the ones obtained by the where abstraction, as shown in Section 3 for the fusc function), you need to fold.” This need for folding [17] is an important requirement. Folding steps make the local improvements become global, so that they can be replicated at each level of recursion and thus become significant.

However, folding steps need matchings between expressions, and these matchings may sometimes be impossible. Generalization of constants to variables may allow matchings in some cases, but not always. In particular, when an expression should match one of its own subexpressions, generalization of constants to variables does not help. In those cases we have suggested constructing functions from expressions [63]. This is done by replacing the expression E[e], in which the subexpression e occurs, by the application (λx.E[x]) e. We call this technique the lambda abstraction strategy (or, as in other papers, higher-order abstraction).

Let us see how lambda abstraction works in the following two examples taken from [63]. The first example refers to the following program Reverse for reversing a list, where [ ], :, and @ denote the empty list, cons, and append on lists, respectively.

1.  rev([]) = []
2.  rev(a:ℓ) = rev(ℓ) @ [a]
3.  [] @ y = y
4.  (a:ℓ) @ y = a : (ℓ @ y)

We want to derive a tail recursive definition of rev. We need rev to be the top operator of the right-hand side of Eq. 2, that is, rev(ℓ) @ [a], and by induction we need that right-hand side to be rev(ℓ). There is a mismatch between the expression rev(ℓ) @ [a] and its subexpression rev(ℓ). Then we proceed as follows: (i) instead of rev(ℓ), we consider rev(ℓ) @ [], (ii) we generalize the constant [] to the variable x, thereby deriving rev(ℓ) @ x, and (iii) we abstract rev(ℓ) @ x with respect to x, thereby deriving the function λx. rev(ℓ) @ x.

The definition of the new function f(ℓ) =def λx. rev(ℓ) @ x is derived as follows.

5.  f([]) = λx. rev([]) @ x =  {by Eq. 1}  = λx. [] @ x =  {by Eq. 3}  = λx. x

6.  f(a:ℓ) = λx. rev(a:ℓ) @ x = λx. (rev(ℓ) @ [a]) @ x =  {by associativity of @}  =
    = λx. rev(ℓ) @ ([a] @ x) = λx. rev(ℓ) @ (a:x) =  {by folding}  = λx. (f(ℓ) (a:x))

We also have:

7.  rev(ℓ) = f(ℓ) []

The derived program (Eqs. 5–7) is more efficient than the program (Eqs. 1–4) because the expensive operation append has been replaced by the cheaper operation cons. Eqs. 5–7 are basically equivalent to the program proposed in [32], where a new representation for lists has to be invented.
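In Haskell the derived program can be written directly (a sketch of ours mirroring Eqs. 5–7):

-- f l is the function \x -> rev(l) @ x, built without ever calling append.
f :: [a] -> ([a] -> [a])
f []      = \x -> x               -- Eq. 5
f (a : l) = \x -> f l (a : x)     -- Eq. 6

rev :: [a] -> [a]
rev l = f l []                    -- Eq. 7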

Note that the mechanization of the transformation we have just presented requires the use of the associativity property of the append function. Thus, in general, it is important to have knowledge of the algebraic properties of the operations in use.

A second example refers to a problem proposed by Richard Bird [7]. Given a binary tree t, we want to construct an isomorphic binary tree t̃ such that: (i) t and t̃ have the same multiset of leaves, and (ii) the leaves of t̃, when read from left to right, are in ascending order. One should derive a program which constructs t̃ by making one traversal only of the tree t.

In order to solve this problem, Richard Bird uses the so-called locally recursive programs, whose semantics is quite complex and is based on the call-by-need mode of evaluation. By using the tupling and lambda abstraction strategies we will get the desired program with the following advantages over Bird's solution: (i) the use of call-by-value semantics, (ii) the absence of local recursion, (iii) the leaves are sorted on the fly, and (iv) the computation of the components of tuples is done only when they are required for later computations.

By tip(n) we denote a binary tree whose single leaf is the integer n, and by t1∧t2 we denote a binary tree with children t1 and t2. By hd and tl we denote, as usual, the head and tail functions on lists. Our initial program is as follows.

1.  TreeSort(t) = replace(t, sort(leaves(t)))     where:

(i) leaves(t) returns the list of the leaves of the tree t, (ii) sort(ℓ) rearranges the list ℓ in ascending order from left to right, and (iii) replace(t,ℓ) uses, in the left-to-right order, the elements of the list ℓ to replace, from left to right, the leaves of the tree t.

We assume that the length of ℓ is at least the number of leaves of t. For instance, we have: TreeSort((tip(1)∧tip(2))∧tip(1)) = (tip(1)∧tip(1))∧tip(2). Here are the definitions of the various functions required:

2.  leaves(tip(n)) = [n]

3.  leaves(t1∧t2) = leaves(t1) @ leaves(t2)

4.  replace(tip(n), ℓ) = tip(hd(ℓ))

5.  replace(t1∧t2, ℓ) = replace(t1, take(k,ℓ)) ∧ replace(t2, drop(k,ℓ))     where k = size(t1)

6.  take(n, ℓ) = if n=0 then [] else take(n−1, ℓ) @ [hd(drop(n−1, ℓ))]

7.  drop(n, ℓ) = if n=0 then ℓ else tl(drop(n−1, ℓ))

For instance, take(2,[a,b,c,d,e]) = [hd([a,b,c,d,e]), hd([b,c,d,e])] = [a,b] and

drop(2,[a,b,c,d,e]) = tl(tl([a,b,c,d,e])) = [c,d,e].

As usual, given a list ℓ, we denote by length(ℓ) the number of elements of ℓ. We assume that 0 ≤ k ≤ length(ℓ) holds when evaluating take(k,ℓ) and drop(k,ℓ). For every list ℓ and for every n with 0 ≤ n ≤ length(ℓ), we have ℓ = take(n,ℓ) @ drop(n,ℓ). The function size(t) returns the number of leaves of the tree t. We have:

8.  size(tip(n)) = 1

9.  size(t1∧t2) = size(t1) + size(t2).

Here is the definition of sort using the function merge of two ordered lists:

10.  sort(ℓ) = if ℓ=[] then [] else merge([hd(ℓ)], sort(tl(ℓ))).

11.  merge([], ℓ) = ℓ

12.  merge(ℓ, []) = ℓ

13.  merge(a:ℓ1, b:ℓ2) = if a≤b then a:merge(ℓ1, b:ℓ2) else b:merge(a:ℓ1, ℓ2)

Unfortunately, TreeSort(t) traverses the tree t twice: a first visit is for collecting the leaves, and a second visit is for replacing them in ascending order.
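Before starting the derivation, here is a compact Haskell rendering of this initial, two-traversal program (a sketch of ours; we use the Prelude functions take and drop instead of Eqs. 6 and 7):

    data Tree = Tip Int | Tree :^: Tree deriving Show

    treeSort :: Tree -> Tree                      -- Eq. 1
    treeSort t = replace t (sort (leaves t))

    leaves :: Tree -> [Int]                       -- Eqs. 2-3
    leaves (Tip n)     = [n]
    leaves (t1 :^: t2) = leaves t1 ++ leaves t2

    size :: Tree -> Int                           -- Eqs. 8-9
    size (Tip _)     = 1
    size (t1 :^: t2) = size t1 + size t2

    replace :: Tree -> [Int] -> Tree              -- Eqs. 4-5
    replace (Tip _)     l = Tip (head l)
    replace (t1 :^: t2) l = replace t1 (take k l) :^: replace t2 (drop k l)
      where k = size t1

    sort :: [Int] -> [Int]                        -- Eq. 10
    sort []    = []
    sort (a:l) = merge [a] (sort l)

    merge :: [Int] -> [Int] -> [Int]              -- Eqs. 11-13
    merge [] l = l
    merge l [] = l
    merge (a:l1) (b:l2) | a <= b    = a : merge l1 (b:l2)
                        | otherwise = b : merge (a:l1) l2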

Now, let us start off the derivation of the one-traversal algorithm by getting the inductive definition of TreeSort(t). From Eq. 1 we get:

14.  replace(tip(n), sort(leaves(tip(n)))) = replace(tip(n), sort([n])) = tip(n)

15.  replace(t1∧t2, sort(leaves(t1∧t2))) = replace(t1, take(size(t1),ℓ)) ∧

     ∧ replace(t2, drop(size(t1),ℓ))     where ℓ = sort(leaves(t1∧t2))

Now no folding step can be performed, because in replace(t1, take(size(t1),ℓ)) the subexpression take(size(t1),ℓ) does not match sort(leaves(t1)), and similarly for the subtree t2 instead of t1. By lambda abstraction we generalize the mismatching subexpression to the list variable z, and we introduce the function λz. replace(t,z), whose definition is as follows (the details are in [63]):

16.  λz. replace(tip(n), z) = λz. tip(hd(z))

17.  λz. replace(t1∧t2, z) = λz. (((λy. replace(t1,y)) take(k,z)) ∧

     ∧ ((λy. replace(t2,y)) drop(k,z)))     where k = size(t1)

The functions λz. replace(t,z) and sort(leaves(t)) visit the same tree t. We apply the tupling strategy and we define the function:

T(t) =def ⟨λz. replace(t,z), sort(leaves(t))⟩

whose explicit definition is:

18.  T(tip(n)) = ⟨λz. tip(hd(z)), [n]⟩

19.  T(t1∧t2) = ⟨λz. ((a1 take(size(t1),z)) ∧ (a2 drop(size(t1),z))), merge(b1,b2)⟩

     where ⟨a1,b1⟩ = T(t1) and ⟨a2,b2⟩ = T(t2)

Now T(t1), take(size(t1),z), and drop(size(t1),z) visit the same tree t1. We apply the tupling strategy and we introduce the new function:

U(t,y) =def ⟨λz. replace(t,z), sort(leaves(t)), take(size(t),y), drop(size(t),y)⟩

We get the following explicit definition for U(tip(n),y):

U(tip(n),y) = ⟨λz. tip(hd(z)), [n], [hd(y)], tl(y)⟩

However, when looking for the explicit definition of U(t1∧t2,y), we again get a subexpression mismatch (see [63]), and we again use lambda abstraction, this time on the last two components of the 4-tuple U(t,y). Thus, we introduce the following function:

V(t) =def ⟨λz. replace(t,z), sort(leaves(t)), λz. take(size(t),z), λz. drop(size(t),z)⟩

whose explicit definition is:

20.  V(tip(n)) = ⟨λz. tip(hd(z)), [n], λz. [hd(z)], λz. tl(z)⟩

21.  V(t1∧t2) = ⟨λz. ((a1 (c1 z)) ∧ (a2 (d1 z))), merge(b1,b2), λz. ((c1 z) @ (c2 (d1 z))), λz. (d2 (d1 z))⟩

     where ⟨a1,b1,c1,d1⟩ = V(t1) and ⟨a2,b2,c2,d2⟩ = V(t2)

We get the following program such that, for all trees t, NewTreeSort(t) = TreeSort(t) (see Eq. 1):

22.  NewTreeSort(t) = (a2 b2)     where ⟨a2,b2⟩ = T(t)

18.  T(tip(n)) = ⟨λz. tip(hd(z)), [n]⟩

23.  T(t1∧t2) = ⟨λz. ((a1 (c1 z)) ∧ (a2 (d1 z))), merge(b1,b2)⟩

     where ⟨a1,b1,c1,d1⟩ = V(t1) and ⟨a2,b2⟩ = T(t2)

together with Eqs. 20 and 21 for the function V(t).

A further improvement of this program can be made by avoiding the append function @ occurring in Eq. 21. One can use the same lambda abstraction technique shown in the Reverse example at the beginning of this section. We consider a variant V*(t) of the function V(t) whose third component is the abstraction λz x. take(size(t),z) @ x, instead of λz. take(size(t),z). The function T*(t) is like T(t), but uses V*(t) instead of V(t). We get the following final program such that, for all trees t, NewTreeSort*(t) = TreeSort(t):

22*.  NewTreeSort*(t) = (a2 b2)     where ⟨a2,b2⟩ = T*(t)

18*.  T*(tip(n)) = ⟨λz. tip(hd(z)), [n]⟩

23*.  T*(t1∧t2) = ⟨λz. ((a1 (c1(z,[]))) ∧ (a2 (d1 z))), merge(b1,b2)⟩

      where ⟨a1,b1,c1,d1⟩ = V*(t1) and ⟨a2,b2⟩ = T*(t2)

20*.  V*(tip(n)) = ⟨λz. tip(hd(z)), [n], λz x. hd(z):x, λz. tl(z)⟩

21*.  V*(t1∧t2) = ⟨λz. ((a1 (c1(z,[]))) ∧ (a2 (d1 z))), merge(b1,b2),
      λz x. (c1(z, c2((d1 z), x))), λz. (d2 (d1 z))⟩

      where ⟨a1,b1,c1,d1⟩ = V*(t1) and ⟨a2,b2,c2,d2⟩ = V*(t2)
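For the interested reader, here is a Haskell sketch of this final program (our rendering, reusing the Tree type and the merge function of the sketch above; Haskell is call-by-need rather than call-by-value, but the structure of the equations is the same, with c1 and c2 in curried form):

    -- Eqs. 20*-21*: the 4-tuple V*(t); the third component is the
    -- lambda-abstracted take, of type [Int] -> [Int] -> [Int].
    vstar :: Tree -> ([Int] -> Tree, [Int], [Int] -> [Int] -> [Int], [Int] -> [Int])
    vstar (Tip n) = (\z -> Tip (head z), [n], \z x -> head z : x, tail)
    vstar (t1 :^: t2) =
      ( \z -> a1 (c1 z []) :^: a2 (d1 z)
      , merge b1 b2
      , \z x -> c1 z (c2 (d1 z) x)
      , \z -> d2 (d1 z) )
      where (a1, b1, c1, d1) = vstar t1
            (a2, b2, c2, d2) = vstar t2

    -- Eqs. 18* and 23*: the pair T*(t).
    tstar :: Tree -> ([Int] -> Tree, [Int])
    tstar (Tip n)     = (\z -> Tip (head z), [n])
    tstar (t1 :^: t2) = (\z -> a1 (c1 z []) :^: a2 (d1 z), merge b1 b2)
      where (a1, b1, c1, d1) = vstar t1
            (a2, b2)         = tstar t2

    -- Eq. 22*.
    newTreeSort :: Tree -> Tree
    newTreeSort t = a2 b2  where (a2, b2) = tstar t

For instance, newTreeSort ((Tip 1 :^: Tip 2) :^: Tip 1) evaluates to (Tip 1 :^: Tip 1) :^: Tip 2, as in the example above.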

Computer experiments performed at the time of writing the paper [63], from which we take this example, show that the computation of the final function NewTreeSort*(t) is faster than that of the initial function TreeSort(t) for trees whose size is greater than about 30. For trees of smaller size, the overhead of dealing with functions is not compensated by the fact that the input tree is visited only once.

Note also that, since lambda expressions do not have free variables, we can operate on them by using pairs of bound variables and function bodies, instead of the more expensive closures. Thus, for instance, λz. expr can be represented by the pair ⟨z, expr⟩.

Some years later, Maurizio Proietti and I studied the application of the lambda abstraction strategy in the area of logic programming. Just as in functional programs we have lambda expressions denoting functions, in logic programming we should have terms denoting goals, and thus goals should be allowed to occur as arguments of predicates. To allow goals as arguments, we proposed a novel logic language, we defined its semantics, and we provided for it a set of unfold/fold transformation rules, together with some goal replacement rules, such as the one stating the equivalence of the goal g ∧ true with the goal g [57, 64]. Those rules have been proved correct.

Here is an example of an efficiency improvement obtained by program transformation in this novel language. This transformation has not been mechanized, but we believe that it would not be hard to do so. Details can be found in [64, Section 7.1]. Let us consider a program which, given a binary tree (either l(N) or t(L,N,R)), (i) flips all its left and right subtrees, and (ii) checks, in a subsequent traversal of the tree, whether or not all labels are natural numbers.

1.  flipcheck(X,Y) ← flip(X,Y), check(Y)

2.  flip(l(N), l(N)) ←

3.  flip(t(L,N,R), t(FR,N,FL)) ← flip(L,FL), flip(R,FR)

4.  check(l(N)) ← nat(N)

5.  check(t(L,N,R)) ← nat(N), check(L), check(R)

6.  nat(0) ←

7.  nat(s(N)) ← nat(N)

We derived the following program, which traverses the input tree only once and uses the continuation-passing style:

8.   flipcheck(X,Y) ← newp(X,Y,G,true,G)

9.   newp(l(N), l(N), G, C, D) ← eq_c(G, nat_c(N,C), D)

10.  newp(t(L,N,R), t(FR,N,FL), G, C, D) ←
         newp(L, FL, U, C, newp(R, FR, V, U, eq_c(G, nat_c(N,V), D)))

11.  nat_c(0,C) ← C

12.  nat_c(s(N),C) ← nat_c(N,C)

For the predicate eq_c we assume that:  ⊢ ∀ (eq_c(X,Y,C) ↔ ((X=Y) ∧ C)).
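As a rough functional analogue of what this derived program achieves (a sketch of ours, not a translation of the clauses above), the goal continuations can be seen as a conjunction of pending checks threaded through a single traversal:

    data BTree = Leaf Int | Node BTree Int BTree deriving Show

    -- flipCheck flips all subtrees and, in the same traversal, verifies
    -- that every label is a natural number (here: a non-negative Int).
    flipCheck :: BTree -> Maybe BTree
    flipCheck x = if ok then Just y else Nothing
      where (y, ok) = go x True

    -- The Boolean argument plays the role of the pending checks C of newp.
    go :: BTree -> Bool -> (BTree, Bool)
    go (Leaf n)     c = (Leaf n, c && n >= 0)
    go (Node l n r) c = let (fl, c1) = go l c
                            (fr, c2) = go r c1
                        in  (Node fr n fl, c2 && n >= 0)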

6 Communications and Parallelism

While at Edinburgh I had the privilege of attending a course on the Calculus of Communicating Systems (CCS) given by Professor Robin Milner (1934–2010) [43]. I remember the day when Robin Milner and Gordon Plotkin decided the name to be given to this new calculus. As I was told, they first decided that the name should be of three letters only! I appreciated the beauty of the calculus, which resembles a development of the lambda calculus. The application of a function λx. e[x] to an argument a can, indeed, be understood as a communication which takes place between (i) the 'function agent' and (ii) the 'argument agent' through the 'port' named λ. After their communication, which is called a handshaking, the agents continue their respective activities, namely: (i) the function agent evaluates e[a], that is, the body e[x] of the function where the variable x has been bound to the value a, and (ii) the argument agent does nothing, that is, it becomes the null agent (indeed, for the rest of the computation, the argument has nothing left to do).

At about the same time, Professor Tony Hoare in Oxford was developing his calculus of Communicating Sequential Processes (CSP) [30]. I remember a visit that Tony Hoare made to Robin Milner at Edinburgh and the stimulating seminar Hoare gave on CSP on that occasion.

In subsequent years, I thought of exploring the power of communications and parallelism in functional programming, also because the various components of the tuples introduced by the tupling strategy can be computed in parallel. These components can be considered as independent agents which may synchronize at the end of their computations. During those years, the notion of communicating agents was emerging quite significantly in various programming paradigms.

Andrzej Skowron and I did some work in this area and we proposed (some variants of) a functional language with communications [61, 62]. Each function call is assumed to be an agent, that is, a triple of the form ⟨x,m⟩::expr, where x is its name, m is its message, that is, its local information, and expr is its expression, that is, the task it has to perform. The operational semantics of the language is based on the conditional rewriting of sets (or multisets) of agents, similarly to what is done in coordination languages (see, for instance, [26]).

As an example of a functional program with communications which we proposed, let us consider the following program for computing the familiar Fibonacci function.

The variable x ranges over agent names, which are strings generated by the following grammar: x ::= ε | x.0 | x.1. The left and right son-calls of the agent whose name is x have names x.0 and x.1, respectively. By default, the name of the agent of the initial function call is the empty string ε.

In our example, the variables ms and ms1 range over the three message constants: R (for ready), R1 (for ready1), and W (for wait). Agents with messages R and R1 may make rewritings, while agents with message W cannot (see Rules 1–4 below). The variables n and val range over integers, and the variable exp ranges over integer expressions.

1.  {⟨x,ms⟩::fib(0)}  ⇒  {⟨x,ms⟩::0}          if ms=R or ms=R1

2.  {⟨x,ms⟩::fib(1)}  ⇒  {⟨x,ms⟩::1}          if ms=R or ms=R1

3.  {⟨x,R⟩::fib(n+2)}  ⇒  {⟨x,R⟩::+(x.0,x.1), ⟨x.0,R⟩::fib(n+1), ⟨x.1,R1⟩::fib(n)}          if n≥0

4.  {⟨x,R1⟩::fib(n+2)}  ⇒  {⟨x,R1⟩::+(x.0,x.1), ⟨x.0,W⟩::fib(n+1), ⟨x.1,R⟩::fib(n)}          if n≥0

5.  {⟨x.0,ms⟩::val, ⟨x,ms1⟩::+(x.0,exp)}  ⇒  {⟨x,ms1⟩::+(val,exp)}

6.  {⟨x.1,ms⟩::val, ⟨x,ms1⟩::+(exp,x.1)}  ⇒  {⟨x.1,ms⟩::val, ⟨x,ms1⟩::+(exp,val)}

7.  {⟨x.0.1,R1⟩::val, ⟨x.1.0,W⟩::exp}  ⇒  {⟨x.0.1,R1⟩::val, ⟨x.1.0,R⟩::val}

Rules 1 and 2 are the expected ones for computing fib(0) and fib(1). The recursive call of fib(n+2) has two variants (see Rules 3 and 4), so as to be able to evaluate the call of agent x.0.1 in a different way from that of agent x.1.0. The expression +(x.0,x.1) has the effect that, once the values of the son-calls have been evaluated and sent to the father-call according to Rules 5 and 6, the father-call silently performs the sum of the values it has received. Rule 7 sends the value computed by agent x.0.1 to agent x.1.0. This communication is correct and improves efficiency. Indeed, in our program the value of fib(n−1), which is needed for computing both fib(n+1) and fib(n), is computed once only. Note, in fact, that one of the two agents which would have to compute fib(n−1) has the message W and cannot make further rewritings.

We have also considered the problem of how to modify the rules of a program, so as to improve its efficiency, when new facts about the functions to be evaluated become known. In the case of the Fibonacci function, one such fact may be the equality of the expressions to be computed by the agents x.0.0 and x.1.

Note that the above Rules 1–7 do not perform the on-the-fly garbage collection of the agents, because right-sons are not erased. To overcome this problem one may use more complex messages [62], so that every agent knows which agents are waiting to receive the value it computes. If there are none, the agent may be erased once it has sent its value to the father-call.

Note also that, if instead of Rule 6 we use the simpler rule:

6̃.  {⟨x.1,ms⟩::val, ⟨x,ms1⟩::+(exp,x.1)}  ⇒  {⟨x,ms1⟩::+(exp,val)}

then deadlock may be generated. We have also proposed a modal logic for proving the correctness of our functional programs with agents and communications [60] and, in particular, the absence of deadlock. Unfortunately, no implementation of our language proposal and of its modal logic has been done.

Concerning a more theoretical study of parallelism and communications, Anna Labella and I considered categorical models for calculi with handshaking communications, both in the case of CCS [43] and in that of CSP [30]. We were inspired by the definition of the cartesian closed categories which provide models of the lambda calculus.

We followed an approach different from Winskel's [74]. We did not give an a priori definition of a categorical structure, in which the embedding of the algebraic models of CCS or CSP might not be completely satisfactory. We started, instead, from the algebraic models, based on labelled trees of various kinds, and we defined suitable categories of labelled trees where one can interpret all the basic operations of CCS and CSP. In a sense, we followed the approach presented many years earlier by Rod Burstall for the description of flowchart programs [9]. The details of our categorical constructions can be found in [34, 37].

In some of our models we used enriched categories [35]. An enriched category is a category where the sets of morphisms associated with the pairs of objects are replaced by objects from a fixed monoidal category. For lack of space we will not enter into the details here.

7 Transformation and Verification in Logic Programming

While studying at Edinburgh, I thought of applying the transformation methodology to CCS agents. I remember talking to Robin Milner about this idea. He did not show much interest, maybe because for him it was more important first to acquire a good understanding of the equivalences between terms in the CCS calculus, before applying them to the transformation of agents, which, of course, should be equivalence preserving.

Then, I thought of applying program transformation to the area of logic programming, which I first studied during my Ph.D. research at Edinburgh. At that time William Clocksin and Chris Mellish were writing their popular book on Prolog [14]. I remember reading some parts of a draft of the book. I also had the chance of looking at David Warren's report on how to compile logic programs [71], and I read his paper comparing the Prolog implementation with Lisp [73] as well as the later report on the Warren Abstract Machine [72]. From those days I still remember David's kindness, his cooperation with Fernando Pereira, and his love for plants and flowers.

A few years later, when back in Italy, I was introduced by Anna Labella to her former student Maurizio Proietti who, not long before, had graduated in Mathematics at Rome University ‘La Sapienza’, defending a thesis on Category Theory. I spoke to Maurizio and I introduced him to logic programming [39]. I also encouraged him to work in the field of logic program transformation. He kindly accepted. The basis of his work was a paper by Hisao Tamaki and Taisuke Sato [69] that soon afterwards became the standard reference for logic program transformation.

That was the beginning of Maurizio's cooperation with me. He was first funded by a research grant from the private company Enidata (Rome) and soon afterwards he became a researcher of the Italian National Research Council in Rome. We first considered some techniques for finding the eureka predicates, that is, the predicate definitions to be introduced during program transformation [65].

Besides the definition introduction, unfolding, and folding rules, we have used for our transformations a rule called Generalization + Equality Introduction (see also [8] for a similar rule used when proving theorems about functional programs). By this rule, a clause of the form H ← A1, …, An is generalized to the clause H ← GenA1, …, GenAn, X1=t1, …, Xr=tr, where (GenA1, …, GenAn)ϑ = (A1, …, An), ϑ being the substitution {X1/t1, …, Xr/tr}. For instance, the clause H ← p(f(X)) can be generalized to H ← p(Y), Y=f(X), with ϑ = {Y/f(X)}.

We have also introduced: (i) the class of non-ascending programs, where, among other properties, each variable should occur in an atom at most once, (ii) the synchronized descent rule (SDR) for driving the unfolding steps by selecting the atoms to be unfolded, and (iii) the loop absorption strategy for the synthesis of the eureka predicates. We have also characterized classes of programs in which that strategy is guaranteed to be successful.

Let us see a simple example of application of the loop absorption strategy. Here is a program, called Comsub, for computing common subsequences of lists.

1.  comsub(X,Y,Z) ← sub(X,Y), sub(X,Z)

2.  sub([],X) ←

3.  sub([A|X],[A|Y]) ← sub(X,Y)

4.  sub(X,[A|Y]) ← sub(X,Y)

where sub(X,Y) holds iff X is a sublist of Y. The order of the elements should be preserved, but the elements of X need not be consecutive in Y. For instance, [1,2] is a sublist of [1,3,2,3], while [2,1] is not. We want to derive a program in which the double visit of the list X in clause 1 is avoided.

First, we make the given program non-ascending by replacing clause 3 with the following clause:

3.1  sub([A|X],[A1|Y]) ← A=A1, sub(X,Y)

Let Comsub1 be the set {1, 2, 3.1, 4} of clauses. In Figure 3 we have depicted an upper portion of the unfolding tree for Comsub1.

[Figure 3 is a diagram of the unfolding tree for Comsub1. Its nodes are the following clauses:

1.   comsub(X,Y,Z) ← sub(X,Y), sub(X,Z)
5.   comsub([],Y,Z) ← sub([],Z)
6.   comsub([A|X],[A1|Y],Z) ← A=A1, sub(X,Y), sub([A|X],Z)
7.   comsub(X,[A|Y],Z) ← sub(X,Y), sub(X,Z)
8.   comsub([],Y,Z) ←
9.   comsub([],Y,[A|Z]) ← sub([],Z)
10.  comsub([A|X],[A1|Y],[A2|Z]) ← A=A1, sub(X,Y), A=A2, sub(X,Z)
11.  comsub([A|X],[A1|Y],[B|Z]) ← A=A1, sub(X,Y), sub([A|X],Z)

Clause 1 is the root, with children 5, 6, and 7; clause 5 has children 8 and 9; clause 6 has children 10 and 11. The dashed up-arrows of the figure mark the loops ⟨1,7⟩, ⟨1,10⟩, ⟨5,9⟩, and ⟨6,11⟩.]

Figure 3: An upper portion of the unfolding tree for Comsub1.

In that figure the atoms which are unfolded are underlined. Solid down-arrows denote unfolding, and dashed up-arrows denote loops, which suggest the definition clauses needed for folding, as we will explain. In clause 6 we unfold the atom sub([A|X],Z), which is selected by the SDR rule. Indeed, by the synchronized descent rule, in clause 6 we have to unfold that atom, because in its ancestor-clause 1 we have unfolded the other atom sub(X,Y) occurring in the body of that ancestor. Unfolding is stopped when the recursively defined atoms in the body of a leaf-clause, say L, are subsumed by the body of an ancestor-clause, say A. In this case we say that a loop of the form ⟨A,L⟩ has been detected. Details can be found in [65].

According to the loop absorption strategy, for each detected loop ⟨A,L⟩ we introduce a new definition clause D so that the bodies of both clauses A and L can be folded using D. The loops ⟨1,10⟩ and ⟨1,7⟩ do not need a new definition because we already have clause 1 defining comsub. The loops ⟨5,9⟩ and ⟨6,11⟩ require the following two new predicate definitions:

newsub(Z) ← sub([],Z)          for loop ⟨5,9⟩

newcomsub(A,X,Y,Z) ← sub(X,Y), sub([A|X],Z)          for loop ⟨6,11⟩

By performing the unfolding and folding steps which correspond to the subtrees rooted in clauses 1, 5, and 6 of Figure 3, we get the explicit definitions of the predicates newsub and newcomsub.

Eventually, by simplifying the equalities, we get the following program:

5.   comsub([],Y,Z) ←          (*)

6.   comsub([],Y,[A|Z]) ← newsub(Z)

7.   comsub([A|X],[A|Y],Z) ← newcomsub(A,X,Y,Z)          (*)

8.   comsub(X,[A|Y],Z) ← comsub(X,Y,Z)          (*)

9.   newcomsub(A,X,Y,[A|Z]) ← comsub(X,Y,Z)          (*)

10.  newcomsub(A,X,Y,[B|Z]) ← newcomsub(A,X,Y,Z)          (*)

11.  newsub(Z) ←

12.  newsub([A|Z]) ← newsub(Z)

Now clause 6 is subsumed by clause 5 and can be erased. Then also clauses 11 and 12 can be erased, and the final program is made out of the marked clauses 5 and 7–10 only. This final program is equal to the one derived by Tamaki and Sato [69]. Note that our derivation does not rely on human intuition and can easily be mechanized. The computation of all solutions of the goal comsub(X,Y,Z), where X is a free variable and Y and Z are ground lists of 10 elements, is about 6 times faster when using the final program instead of the initial one [65]. A development of the technique we have illustrated here can be found in [66].
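As a quick functional check, here is a Haskell transliteration of the final program (a sketch of ours): comsub y z enumerates the common sublists X, with repetitions corresponding to distinct derivations.

    -- Clauses 5, 7, and 8: each generator mirrors the head of a clause;
    -- a pattern that fails to match simply contributes no solutions.
    comsub :: Eq a => [a] -> [a] -> [[a]]
    comsub y z =
         [[]]                                                -- clause 5
      ++ [ a:x | (a:y') <- [y], x <- newcomsub a y' z ]      -- clause 7
      ++ [ x   | (_:y') <- [y], x <- comsub y' z ]           -- clause 8

    -- Clauses 9 and 10.
    newcomsub :: Eq a => a -> [a] -> [a] -> [[a]]
    newcomsub a y z =
         [ x | (b:z') <- [z], b == a, x <- comsub y z' ]     -- clause 9
      ++ [ x | (_:z') <- [z], x <- newcomsub a y z' ]        -- clause 10

For instance, comsub [1,2] [1,3,2,3] produces, among others, the common sublist [1,2].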

The following example, taken from a paper of ours [59] written some years later in honor of Professor Robert Kowalski, shows an application of the program transformation methodology to the case where clauses may have negated atoms in their bodies. For that kind of logic programs, called locally stratified logic programs, we have also provided the transformation rules that can be applied, and we have shown that they are correct, in the sense that they preserve the perfect model semantics. The details of the rules and the definition of the perfect model semantics can be found in [59].

Let us consider the following program, CFParser, for deriving a word generated by a given context-free grammar over the alphabet {a,b}:

1.   derive([],[]) ←

2.   derive([A|S],[A|W]) ← terminal(A), derive(S,W)

3.   derive([A|S],W) ← nonterminal(A), production(A,B), append(B,S,T), derive(T,W)

4.   nonterminal(s) ←                    5.   nonterminal(x) ←

6.   terminal(a) ←                       7.   terminal(b) ←

8.   production(s,[a,x,b]) ←             9.   production(x,[]) ←

10.  production(x,[a,x]) ←               11.  production(x,[a,b,x]) ←

12.  word([]) ←                          13.  word([A|W]) ← terminal(A), word(W)

14.  append([],Ys,Ys) ←                  15.  append([A|Xs],Ys,[A|Zs]) ← append(Xs,Ys,Zs)

The relation derive([s],W) holds iff the word W can be derived from the start symbol s using the following productions (see clauses 8–11):

s → a x b          x → ε | a x | a b x

The nonterminal symbols are s and x (see clauses 4 and 5), the terminal symbols are a and b (see clauses 6 and 7), words in {a,b}* are represented as lists of a's and b's, and the empty word ε is represented as the empty list [].

The relation derive(L,W) holds iff L is a sequence of terminal or nonterminal symbols from which the word W can be derived by using the productions.

We would like to derive an efficient program for an initial goal G of the form:

word(W), ¬derive([s],W)

which holds in the perfect model of the program CFParser iff W is a word which is not derivable from s by using the given context-free grammar. We perform our two-step program derivation presented in [59, Section 2.3]. In the first step, from goal G we derive the following two clauses:

16.  g(W) ← word(W), ¬new1(W)

17.  new1(W) ← derive([s],W)

In the second step, we apply the unfold-definition-folding strategy presented in [66]. We will not recall here the formal definition of this strategy. It is enough to say that it is similar to the loop absorption strategy we have seen in action in the derivation above, starting from the Comsub1 program.

For our CFParser program, at the end of the second step, we get:

g([]) ←                       g([a|A]) ← new2(A)            g([b|A]) ← new3(A)

new2([]) ←                    new2([a|A]) ← new4(A)         new2([b|A]) ← new5(A)

new3([]) ←                    new3([a|A]) ← new3(A)         new3([b|A]) ← new3(A)

new4([]) ←                    new4([a|A]) ← new4(A)         new4([b|A]) ← new6(A)

new5([a|A]) ← new3(A)         new5([b|A]) ← new3(A)

new6([a|A]) ← new4(A)         new6([b|A]) ← new5(A)

This program corresponds to the deterministic finite automaton of Figure 4: (i) each predicate of the derived program is a state, (ii) g is the initial state, (iii) a state p is final iff it has a clause of the form p([]) ←, and (iv) a clause of the form p([ℓ|A]) ← q(A) denotes a transition with label ℓ from p to q. Note that the derivation of a final program that corresponds to a finite automaton has been possible because the given context-free grammar in fact generates a regular language.
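Read functionally, the derived program is the transition function of that automaton. Here is a Haskell sketch of ours, where each predicate becomes a Boolean test on the remaining input (defined on words over {a,b} only):

    g, new2, new3, new4, new5, new6 :: String -> Bool
    g []         = True                -- a clause p([]) <- makes p final
    g ('a':w)    = new2 w
    g ('b':w)    = new3 w
    new2 []      = True
    new2 ('a':w) = new4 w
    new2 ('b':w) = new5 w
    new3 []      = True
    new3 (_:w)   = new3 w              -- same successor for a and b
    new4 []      = True
    new4 ('a':w) = new4 w
    new4 ('b':w) = new6 w
    new5 []      = False               -- no clause new5([]) <-
    new5 (_:w)   = new3 w
    new6 []      = False               -- no clause new6([]) <-
    new6 ('a':w) = new4 w
    new6 ('b':w) = new5 w

For instance, g "ab" is False, since ab is generated by the grammar, while g "ba" is True.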

[Figure 4 is a state diagram of this automaton; its states are g, new2, …, new6, and its transitions can be read off the clauses of the derived program above. The final states are g, new2, new3, and new4.]

Figure 4: The finite automaton which accepts the words which are not generated from s by the productions s → a x b and x → ε | a x | a b x. State g is the initial state and the final states have double circles.

Finally, we present an example of how to use the transformation methodology for the verification of program properties. This example is the so-called Yale Shooting Problem, which is often used in temporal reasoning. This problem can be described and formalized as follows.

We have a person and a gun. Three events are possible: (e1) a load event, when the gun is loaded, (e2) a shoot event, when the gun shoots, and (e3) a wait event, when nothing happens (see clauses 1–3 below). A situation is (the result of) a sequence of events. A sequence is represented as a list. We assume that, as time progresses, the list grows 'to the left', that is, given the current list S of events, when a new event E occurs, the new list of events is [E|S]. In any situation, at least one of the following three facts holds: (f1) the person is alive, (f2) the person is dead, and (f3) the gun is loaded (see clauses 4–6 below).

We also assume the following hypotheses (see clauses 7–11 and note the presence of a negated atom in clause 11): (s1) In the initial situation, denoted by the empty list, the person is alive. (s2) After a load event the gun is loaded. (s3) If the gun is loaded, then after a shoot event the person is dead. (s4) If the gun is loaded, then it is abnormal that after a shoot event the person is alive. (s5) Inertia Axiom: If a fact F holds in a situation S and it is not abnormal that F holds after the event E following S, then F holds also after the event E.

The following locally stratified program YSP formalizes the above statements. A similar formalization is in a paper by Apt and Bezem [2].

1.   event(load) ←               2.   event(shoot) ←              3.   event(wait) ←

4.   fact(alive) ←               5.   fact(dead) ←                6.   fact(loaded) ←

7.   holds(alive,[]) ←           8.   holds(loaded,[load|S]) ←

9.   holds(dead,[shoot|S]) ← holds(loaded,S)

10.  ab(alive,shoot,S) ← holds(loaded,S)

11.  holds(F,[E|S]) ← fact(F), event(E), holds(F,S), ¬ab(F,E,S)

12.  append([],Ys,Ys) ←          13.  append([A|Xs],Ys,[A|Zs]) ← append(Xs,Ys,Zs)
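Since the program is locally stratified (each call inspects a shorter list), its perfect model can be computed on ground queries by a simple recursive evaluator; here is a Haskell sketch of ours:

    data Fact  = Alive | Dead | Loaded deriving Eq
    data Event = Load | Shoot | Wait   deriving Eq

    -- holds f s: is holds(f,s) true in the perfect model? The clauses for
    -- holds are disjuncts; fact(F) and event(E) hold by construction of
    -- the types, and the negation in clause 11 is safe because ab only
    -- inspects the shorter list s.
    holds :: Fact -> [Event] -> Bool
    holds f []    = f == Alive                            -- clause 7
    holds f (e:s) =
         (f == Loaded && e == Load)                       -- clause 8
      || (f == Dead   && e == Shoot && holds Loaded s)    -- clause 9
      || (holds f s && not (ab f e s))                    -- clause 11 (inertia)

    ab :: Fact -> Event -> [Event] -> Bool
    ab Alive Shoot s = holds Loaded s                     -- clause 10
    ab _     _     _ = False

For instance, holds Dead [Shoot, Wait, Load] evaluates to True.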

By applying SLDNF-resolution [39], Apt and Bezem showed that holds(dead,[shoot,wait,load]) is true in the perfect model of program YSP. Now we consider a property Γ which cannot be shown by SLDNF-resolution (see [59]):

Γ  ≡  ∀S (holds(dead,S)  →  ∃S0,S1,S2,S′ (append(S2,[shoot|S1],S′), append(S′,[load|S0],S)))

Property Γ\Gamma means that the fact that the person is dead in the current situation SS implies that in the past there was a load event followed, possibly not immediately, by a shoot event. Thus, since time progresses ‘to the left’, SS is a list of events of the form: [,𝑠ℎ𝑜𝑜𝑡,,𝑙𝑜𝑎𝑑,][\ldots,\mathit{shoot},\ldots,\mathit{load},\ldots].

In the first step of our two-step verification method (see [59, Section 2.3]), we apply the Lloyd-Topor transformation [39, page 113] starting from the statement g ← Γ (where g is a new predicate name), and we derive the following clauses:

14.  g ← ¬new1

15.  new1 ← holds(dead,S), ¬new2(S)

16.  new2(S) ← append(S2,[shoot|S1],S′), append(S′,[load|S0],S)

At the end of the second step, after a few iterations of the unfold-definition-folding strategy and after the deletion of all definitions of predicates which are not required by g, we are left with the single clause g ←. Details can be found in [59].

Since g holds in the (perfect model of the) final program, we have that property Γ holds in the (perfect model of the) final program. Thus, Γ holds also in the initial program made out of clauses 1–13.

Much more recently, we have explored some verification techniques based on the transformation of constrained Horn clauses, also in the case of imperative and functional programs [20] and in the case of business processes (see, for instance, [18]). This recent work has been done in cooperation with Emanuele De Angelis and Fabio Fioravanti. They also have been working, and still work, on the implementation and development of an automatic transformation and verification tool [19], which was originally set up by Ornella Aioni and Maurizio Proietti.

8 Future Developments

Reviewing my research activity while writing this paper, I realized that many topics and issues would need a more accurate analysis and study. It would be difficult to list them all, but I have been encouraged to mention at least some of them. I hope that researchers in the field may find these suggestions useful and of some interest.

Concerning the theory of combinators and WCL presented in Section 1, one should note that the combinator X ≡ B(B(BB)B)(BB) we have presented contains parentheses, and one could consider constructing a B-combinator, call it B̃, which places those parentheses in a sequence of seven B's, so that B̃BBBBBBB >* B(B(BB)B)(BB). A routine construction, following [4], shows that B̃ is, in fact, B(B(B(BB)B)B)(BB). The relation between the combinators X and B̃ could be for the reader a stimulus for studying the process of placing parentheses in a list of variables, that is, the process of constructing a binary tree from the list of its leaves.
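The reduction claimed above is easy to check mechanically. Here is a tiny Haskell normalizer for B-terms (a sketch of ours):

    -- Terms built from the combinator B only; :@ denotes application.
    data Term = B | Term :@ Term deriving (Eq, Show)
    infixl 9 :@

    -- One leftmost-outermost step of the rule B x y z > x (y z), if any.
    step :: Term -> Maybe Term
    step (B :@ x :@ y :@ z) = Just (x :@ (y :@ z))
    step (t :@ u) = case step t of
                      Just t' -> Just (t' :@ u)
                      Nothing -> fmap (t :@) (step u)
    step B = Nothing

    normalize :: Term -> Term
    normalize t = maybe t normalize (step t)

    -- X = B(B(BB)B)(BB) and bTilde = B(B(B(BB)B)B)(BB):
    x, bTilde :: Term
    x      = B :@ (B :@ (B :@ B) :@ B) :@ (B :@ B)
    bTilde = B :@ (B :@ (B :@ (B :@ B) :@ B) :@ B) :@ (B :@ B)

Evaluating normalize (foldl (:@) bTilde (replicate 7 B)) == x then confirms that bTilde applied to seven B's reduces to X.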

One can start by considering, first, the use of regular combinators only. A combinator X is said to be regular if its reduction is of the form X x1 … xn > x1 t2 … tm, where t2, …, tm are terms made out of x2, …, xn only. A particular regular combinator for placing parentheses is, indeed, B. Similarly, one could study the permutative and duplicative properties of the regular combinators C (defined by Cxyz > xzy) and W (defined by Wxy > xyy) and of other regular (or non-regular) combinators. This study will improve on the results reported in the classical book by Curry and Feys [15, Chapter 5].

For Section 2 one could develop the techniques presented in [50]. Those developments can be useful in the area of Term Rewriting Systems for constructing terms with infinite behaviour.

For the issues considered in Section 3 on Program Transformation, it would be important to investigate how to invent the multiplication operation when, in the initial program version, we have at our disposal only the addition operation. Generalizations of various kinds can be suggested, as we have done in this paper, but an interesting technique would be one based on the idea of deriving multiplication as the iteration of additions. Then, in an analogous way, exponentiation can be invented as the iteration of multiplications, thus allowing us to derive even more efficient programs. The idea of iteration can hopefully be generated by mechanically analyzing the m-dags constructed by unfolding and looking for repeated patterns.
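Here is a minimal sketch of that idea (ours), with iteration made explicit as a higher-order function:

    -- n-fold iteration of a function f.
    iter :: Int -> (a -> a) -> a -> a
    iter 0 _ x = x
    iter n f x = f (iter (n-1) f x)

    -- Multiplication invented as iterated addition, and then
    -- exponentiation as iterated multiplication.
    mul, expo :: Int -> Int -> Int
    mul  m n = iter n (+ m) 0      -- m*n  =  m + (m + ... (m + 0))
    expo m n = iter n (* m) 1      -- m^n  =  m * (m * ... (m * 1))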

For Sections 4 and 5, it could be important to mechanize the techniques we have presented there, and in particular those for finding the suitable tuples of functions and the suitable lambda abstractions via the analysis of: (i) cuts and pebble games in the m-dags, and (ii) subexpression mismatches, respectively.

For Section 6, one could provide an implementation of the functional language with communications we have proposed, so that one can execute programs written in that language. One may also: (i) automate the process of adding communications to functional programs for improving their efficiency, by making use of the properties of the functions to be evaluated, and (ii) automate the reasoning on the modal theories presented in [60], in which one can prove the correctness of those communications. Thus, one would have a machine-checked proof of correctness of the communications which have been added.

For Section 7, a possible project is to construct a transformation system for logic programs with goals as arguments in which: (i) one can run the programs according to the operational semantics we have defined in our paper [57], and (ii) one can apply the various transformation rules (definition introduction, unfolding, folding, goal replacement) we have listed in that paper.

9 Acknowledgements

My gratitude goes to the various people who taught me Computer Science and, among them, Paolo Ercoli, Rod Burstall, Robin Milner, Gordon Plotkin, John Reynolds, Alan Robinson, and Leslie Valiant. I am also grateful to my colleagues and students. In particular, I would like to thank Anna Labella, Andrzej Skowron, Maurizio Proietti, Valerio Senni, Sophie Renault, Fabio Fioravanti, and Emanuele De Angelis. A very special thank-you goes to Maurizio, who for many years has been 'so devoted to me, so patient, so zealous', as John Henry Newman said of his friend Ambrose St. John [44, page 190]. Maurizio has been for me an invaluable source of inspiration and strength.

Many thanks to Andrei Nemytykh for inviting me to write this paper and for his comments, and to Maurizio Proietti and Laurent Fribourg for their friendly help and encouragement.

References

  • [2] K. R. Apt & M. Bezem (1991): Acyclic Programs. New Generation Computing 9, pp. 335–363, 10.1007/BF03037168.
  • [3] H. P. Barendregt (1984): The Lambda Calculus, its Syntax and Semantics. North-Holland, Amsterdam, 10.2307/2274112.
  • [4] C. Batini & A. Pettorossi (1975): On Subrecursiveness in Weak Combinatory Logic. In: Proceedings of the Symposium $\lambda$-Calculus and Computer Science Theory, Lecture Notes in Computer Science 37, Springer-Verlag, pp. 297–311, 10.1007/BFb0029533.
  • [5] F. L. Bauer & H. Wössner (1982): Algorithmic Language and Program Development. Springer-Verlag, 10.1007/978-3-642-61807-9.
  • [6] R. S. Bird (1984): The Promotion and Accumulation Strategies in Transformational Programming. ACM Toplas 6(4), pp. 487–504, 10.1145/1780.1781.
  • [7] R. S. Bird (1984): Using Circular Programs to Eliminate Multiple Traversal of Data. Acta Informatica 21, pp. 239–250, 10.1007/BF00264249.
  • [8] R. S. Boyer & J. S. Moore (1975): Proving Theorems About Lisp Functions. Journal of the ACM 22(1), pp. 129–144, 10.1145/321864.321875.
  • [9] R. Burstall (1972): An Algebraic Description of Programs with Assertions, Verification and Simulation. In: Proc. ACM Conference on Proving Assertions about Programs, ACM, New York, NY, USA, pp. 7–14, 10.1145/800235.807068.
  • [10] R. M. Burstall (1977): Design Considerations for a Functional Programming Language. In: Proc. Infotech State of the Art Conference “The Software Revolution”, Copenhagen, Denmark, pp. 47–57.
  • [11] R. M. Burstall & J. Darlington (1977): A Transformation System for Developing Recursive Programs. Journal of the ACM 24(1), pp. 44–67, 10.1145/321992.321996.
  • [12] R. M. Burstall, D.B. MacQueen & G. H. Sannella (1980): Hope: An Experimental Applicative Language. In: Conference Record of the 1980 LISP Conference, Stanford University, Stanford, Ca, USA, pp. 136–143, 10.1145/800087.802799.
  • [13] K. L. Clark (1978): Negation as Failure. In H. Gallaire & J. Minker, editors: Logic and Data Bases, Plenum Press, New York, pp. 293–322, 10.1007/978-1-4684-3384-5_11.
  • [14] W. F. Clocksin & C. S. Mellish (1984): Programming in Prolog, Second edition. Springer-Verlag, New York, 10.1007/978-3-642-96873-0.
  • [15] H. B. Curry & R. Feys (1974): Combinatory Logic. North-Holland.
  • [16] J. Darlington (1978): A Synthesis of Several Sorting Algorithms. Acta Informatica 11, pp. 1–30, 10.1007/BF00264597.
  • [17] J. Darlington (1981): An Experimental Program Transformation System. Artificial Intelligence 16, pp. 1–46, 10.1016/0004-3702(81)90014-X.
  • [18] E. De Angelis, F. Fioravanti, M. C. Meo, A. Pettorossi & M. Proietti (2017): Verification of Time-Aware Business Processes using Constrained Horn Clauses. In: Proceedings of the 26th International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2016), Lecture Notes in Computer Science 10184, Springer, pp. 38–55, 10.1007/978-3-319-63139-4_3.
  • [19] E. De Angelis, F. Fioravanti, A. Pettorossi & M. Proietti (2014): VeriMAP: A Tool for Verifying Programs through Transformations. In: Proc. 20th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS ’14, Lecture Notes in Computer Science 8413, Springer, pp. 568–574, 10.1007/978-3-642-54862-8_47. Available at: http://www.map.uniroma2.it/VeriMAP.
  • [20] E. De Angelis, F. Fioravanti, A. Pettorossi & M. Proietti (2017): Semantics-based generation of verification conditions via program specialization. Science of Computer Programming 147, pp. 78–108, 10.1016/j.scico.2016.11.002. Selected and Extended papers from the Int. Symp. on Principles and Practice of Declarative Programming 2015.
  • [21] N. Dershowitz (1987): Termination of rewriting. Journal of Symbolic Computation 3(1-2), pp. 69–116, 10.1016/S0747-7171(87)80022-6.
  • [22] E. W. Dijkstra (1971): A Short Introduction to the Art of Programming. Technical Report, EWD 316.
  • [23] E. W. Dijkstra (1982): Selected Writing on Computing: A Personal Perspective. Springer-Verlag, New York, Heidelberg, Berlin, 10.1007/978-1-4612-5695-3.
  • [24] M. C. Er (1983): An iterative solution to the generalized Towers of Hanoi problems. BIT 23, pp. 295–302, 10.1007/BF01934458.
  • [25] P. Flajolet & J.-M. Steyaert (1974): On Sets Having Only Hard Subsets. In J. Loeckx, editor: 2nd ICALP, Automata, Languages and Programming, Lecture Notes in Computer Science 14, Springer, pp. 446–457, 10.1007/978-3-662-21545-6_34.
  • [26] D. Gelernter & N. Carriero (1992): Coordination Languages and their Significance. Communications of the ACM 35(2), pp. 97–107, 10.1145/129630.129635.
  • [27] P. J. Hayes (1977): A note on the Towers of Hanoi problem. The Computer Journal 20(3), pp. 282–285, 10.1093/comjnl/20.3.282.
  • [28] J. R. Hindley, B. Lercher & J. P. Seldin (1975): Introduzione alla Logica Combinatoria. Serie di Logica Matematica, Boringhieri. (In Italian).
  • [29] J. R. Hindley & J. P. Seldin (1986): Introduction to Combinators and λ\lambda-Calculus. London Mathematical Society, Cambridge University Press, 10.1007/BF00047236.
  • [30] C.A.R. Hoare (1978): Communicating Sequential Processes. CACM 21(8), pp. 666–677, 10.1145/359576.359585.
  • [31] J. E. Hopcroft & J. D. Ullman (1979): Introduction to Automata Theory, Languages and Computation. Addison-Wesley, 10.1145/568438.568455.
  • [32] R. J. M. Hughes (1986): A novel representation of lists and its application to the function “reverse”. Info. Proc. Lett. 22, pp. 141–144, 10.1016/0020-0190(86)90059-1.
  • [33] V. E. Itkin & Z. Zwienogrodsky (1971): On equivalence of program schemata. Journ. Comp. Syst. Sci. 5.
  • [34] S. Kasangian, A. Labella & A. Pettorossi (1990): Observers, experiments, and agents: A comprehensive approach to parallelism. In I. Guessarian, editor: Semantics of Systems of Concurrent Processes. LITP Spring School, Lecture Notes in Computer Science 469, Springer-Verlag, pp. 375–406, 10.1007/3-540-53479-2_16.
  • [35] G. M. Kelly (1982): Basic Concepts of Enriched Category Theory. Cambridge University Press, Cambridge.
  • [36] R. A. Kowalski (1979): Logic for Problem Solving. North Holland. Available at: www.doc.ic.ac.uk/~rak/papers/LogicForProblemSolving.pdf.
  • [37] A. Labella & A. Pettorossi (1985): Categorical Models for Handshaking Communications. Fundamenta Informaticae. Series IV. VIII(3-4), pp. 322–357.
  • [38] S. S. Lavrov (1961): Economy of memory in closed operator schemes. U.S.S.R. Computat. Math. and Math. Physics, pp. 810–828.
  • [39] J. W. Lloyd (1987): Foundations of Logic Programming. Springer-Verlag, Berlin, 10.1007/978-3-642-83189-8. Second Edition.
  • [40] V. V. Martynuk (1965): On the analysis of control-flow graphs for a program scheme. Journ. Comp. Math. and Math. Phys. 5,2.
  • [41] E. Mendelson (1987): Introduction to Mathematical Logic. Wadsworth & Brooks/Cole Advanced Books & Software, Monterey, California, USA, 10.2307/2274877. Third Edition.
  • [42] J. Miller & S. Brown (1966): An algorithm for evaluation of remote terms in a linear recurrence sequence. The Computer Journal 9, pp. 188–190, 10.1093/comjnl/9.2.188.
  • [43] R. Milner (1989): Communication and Concurrency. Prentice Hall, 10.5555/534666.
  • [44] J. H. Newman (2001): Apologia Pro Vita Sua. Maisie Ward (ed.), Sheed and Ward, London.
  • [45] M. S. Paterson & C. E. Hewitt (1970): Comparative Schematology. In: Project MAC Conference on Concurrent Systems and Parallel Computation, Woods Hole, Mass., USA, pp. 119–127. Available at: https://dl.acm.org/doi/pdf/10.1145/1344551.1344563.
  • [46] A. Pettorossi (1971): Ottimizzazione di un Collegamento per Trasmissione di Dati Mediante Simulazione Numerica. Laurea Thesis (in Italian). University of Rome, Italy. Available on request to the author.
  • [47] A. Pettorossi (1972): Automatic Derivation of Control Flow Graphs of Fortran Programs. Master Thesis (in Italian). Original title: “Generazione Automatica del Grafo di Flusso del Controllo per un Programma di Calcolo Scritto in Fortran”. University of Rome, Italy. Available on request to the author.
  • [48] A. Pettorossi (1978): Improving memory utilization in transforming programs. In: Proc. Mathematical Foundations of Computer Science 1978, Zakopane (Poland), Lecture Notes in Computer Science 64, Springer-Verlag, pp. 416–425, 10.1007/3-540-08921-7_89.
  • [49] A. Pettorossi (1979): On the Definition of Hierarchies of Infinite Sequential Computations. In Lothar Budach, editor: Fundamentals of Computation Theory, FCT’79, Akademie-Verlag, Berlin, pp. 335–341. Available on request to the author.
  • [50] A. Pettorossi (1980): Synthesis of Subtree Rewriting Systems Behaviour by Solving Equations. In: Proc. 5ème Colloque de Lille (France) on “Les Arbres en Algèbre et en Programmation”, U.E.R. I.E.E.A. BP 36, Université de Lille I, 59655 Villeneuve d’Ascq Cedex, France, pp. 63–74. Available on request to the author.
  • [51] A. Pettorossi (1981): Comparing and Putting Together Recursive Path Orderings, Simplification Orderings, and Non-Ascending Property for Termination Proofs of Term Rewriting Systems. In: Proc. ICALP 1981, Haifa (Israel), Lecture Notes in Computer Science 115, Springer-Verlag, pp. 432–447, 10.1007/3-540-10843-2_35.
  • [52] A. Pettorossi (1984): Methodologies for Transformations and Memoing in Applicative Languages. Ph.D. thesis, Edinburgh University, Edinburgh, Scotland. Available at: https://era.ed.ac.uk/handle/1842/15643.
  • [53] A. Pettorossi (1984): A Powerful Strategy for Deriving Efficient Programs by Transformation. In: ACM Symposium on Lisp and Functional Programming, ACM Press, pp. 273–281, 10.1145/800055.802044.
  • [54] A. Pettorossi (1985): Towers of Hanoi Problems: Deriving Iterative Solutions by Program Transformation. BIT 25, pp. 327–334, 10.1007/BF01934378.
  • [55] A. Pettorossi (1987): Derivation of Efficient Programs For Computing Sequences of Actions. Theoretical Computer Science 53, pp. 151–167, 10.1016/0304-3975(87)90030-2.
  • [56] A. Pettorossi & R. M. Burstall (1982): Deriving Very Efficient Algorithms for Evaluating Linear Recurrence Relations Using the Program Transformation Technique. Acta Informatica 18, pp. 181–206, 10.1007/BF00264438.
  • [57] A. Pettorossi & M. Proietti (2000): Transformation Rules for Logic Programs with Goals as Arguments. In A. Bossi, editor: Proceedings of the 9th International Workshop on Logic-based Program Synthesis and Transformation (LOPSTR ’99), Venezia, Italy, Lecture Notes in Computer Science 1817, Springer-Verlag, Berlin, pp. 177–196, 10.1007/10720327_11.
  • [58] A. Pettorossi & M. Proietti (2002): The List Introduction Strategy for the Derivation of Logic Programs. Formal Aspects of Computing 13(3-5), pp. 233–251, 10.1007/s001650200011.
  • [59] A. Pettorossi & M. Proietti (2002): Program Derivation = Rules + Strategies. In A. Kakas & F. Sadri, editors: Computational Logic: Logic Programming and Beyond (Essays in honour of Bob Kowalski, Part I), Lecture Notes in Computer Science 2407, Springer-Verlag, pp. 273–309, 10.1007/3-540-45628-7_12.
  • [60] A. Pettorossi & A. Skowron (1983): Complete Modal Theories for Verifying Communicating Agents Behaviour in Recursive Equations Programs. Internal Report CSR-128-83, University of Edinburgh, Edinburgh, Scotland. Available on request to the authors.
  • [61] A. Pettorossi & A. Skowron (1985): A methodology for improving parallel programs by adding communications. In: Computation Theory, SCT 1984, Lecture Notes in Computer Science 208, Springer-Verlag, Berlin, pp. 228–250, 10.1007/3-540-16066-3_20.
  • [62] A. Pettorossi & A. Skowron (1985): A System for Developing Distributed Communicating Programs. In M. Feilmeier, G. Joubert & U. Schendel, editors: International Conference ‘Parallel Computing 85’, North-Holland, pp. 241–246. Available on request to the authors.
  • [63] A. Pettorossi & A. Skowron (1989): The Lambda Abstraction Strategy for Program Derivation. Fundamenta Informaticae XII(4), pp. 541–561. Available on request to the authors.
  • [64] Alberto Pettorossi & Maurizio Proietti (2004): Transformations of Logic Programs with Goals as Arguments. Theory Pract. Log. Program. 4(4), pp. 495–537, 10.1017/S147106840400198X.
  • [65] M. Proietti & A. Pettorossi (1990): Synthesis of Eureka Predicates for Developing Logic Programs. In N. D. Jones, editor: Third European Symposium on Programming, ESOP ’90, Lecture Notes in Computer Science 432, Springer-Verlag, pp. 306–325, 10.1007/3-540-52592-0_71.
  • [66] M. Proietti & A. Pettorossi (1991): Unfolding-Definition-Folding, in this Order, for Avoiding Unnecessary Variables in Logic Programs. In J. Małuszyński & M. Wirsing, editors: Third International Symposium on Programming Language Implementation and Logic Programming, PLILP ’91, Lecture Notes in Computer Science 528, Springer-Verlag, pp. 347–358, 10.1007/3-540-54444-5_111.
  • [67] H. Rogers (1967): Theory of Recursive Functions and Effective Computability. McGraw-Hill.
  • [68] L. S. Sterling & E. Shapiro (1994): The Art of Prolog. The MIT Press, Cambridge, Massachusetts. Second Edition.
  • [69] H. Tamaki & T. Sato (1984): Unfold/Fold Transformation of Logic Programs. In S.-Å. Tärnlund, editor: Proceedings of the Second International Conference on Logic Programming, ICLP ’84, Uppsala University, Uppsala, Sweden, pp. 127–138.
  • [70] S. A. Walker & H. R. Strong (1973): Characterization of Flowchartable Recursions. Journal of Computer and System Sciences 7(4), pp. 404–447, 10.1016/S0022-0000(73)80032-7.
  • [71] D. H. D. Warren (1977): Implementing Prolog – Compiling Predicate Logic Programs. Research Report 39 & 40, Department of Artificial Intelligence, University of Edinburgh, Scotland.
  • [72] D. H. D. Warren (1983): An Abstract Prolog Instruction Set. Technical Report 309, SRI International.
  • [73] D. H. D. Warren, L. M. Pereira & F. Pereira (1977): Prolog - the language and its implementation compared with Lisp. SIGART Newsl. 64, pp. 109–115, 10.1145/872736.806939.
  • [74] G. Winskel (1984): Synchronization Trees. Theoretical Computer Science 34(1-2), pp. 33–82, 10.1016/0304-3975(84)90112-9.
  • [75] Y. I. Yanov (1960): The Logical Schemes of Algorithms. Problems of Cybernetics 1, pp. 82–140. English translation.