¹¹institutetext:

Compiling Turing Machines into
Storage Modification Machines

J.-M. Chauvet

Abstract

It is well known that Schönhage’s Storage Modification Machines (SMM) can simulate Turing Machines (TM) since Schönhage’s original proof of the Turing completeness of the eponymous machines. We propose a simple transformation of TM into SMM, setting the base for a straightforward TM-to-SMM compiler.

1 Introduction

It is well known that Schönhage’s Storage Modification Machines [4] can simulate Turing Machines (TM) [5]. Simulation of small universal TM, or other simple universal models such as Post’s tag systems [1] and the cellular automaton Rule 110 [6], is by now a standard way to prove that a large number of alternate models of computation, including a variety of physically-inspired systems, are computationally universal. In the following, we consider a slightly revised version of Schönhage’s Storage Modification Machine (SMM) and offer a simple construction to simulate the behavior of a TM. This construction sets the base for a generic TM-to-SMM compiler.

2 Images of Computation: Turing Machines and Storage Modification Machines

Turing Machines, masquerading as read-write-head-and-tape mechanic contraptions, are in fact quite mathematical, theoretical models of computation. They are best viewed as abstract machines (metaphorically) manipulating symbols on a strip of tape according to a table of rules. In a simplified design, the infinite tape is divided into contiguous cells, each cell having a right- and a left-neighbor, which may or not exhibit a symbol (from a given, usually finite, alphabet). The head when positioned over a cell can read or write this cell’s symbol. Finally the machine is characterized by a state, also from a given, usually finite, state-set.

An elementary execution step of such a TM is summarized as follows:

1.

Read the current cell’s symbol $\sigma$ (or blank, if no symbol is present).

According then to the combination of current state and symbol in the transition table, $(S,\sigma)$ :

2.

Write back a symbol $\sigma^{\prime}$ on the current cell.
3.

Move head to one of the two neighbors of the current cell.
4.

Change internal state to $S^{\prime}$ .

The behavior of this simple TM is entirely specified by its transition table which states, for each relevant $(S,\sigma)$ combination, the data required to perform steps 2, 3 and 4 above.

In our construction, this data is simply a string which concatenates the symbol $\sigma^{\prime}$ , the constant $L$ (left) or $R$ (right) to indicate which contiguous neighbor to target in step 3, and finally the new state $S^{\prime}$ of the TM.

Running a TM involves (i) positioning the head on a tape with some initial symbols in some cells, (ii) setting some state as the initial one, and (iii) start consecutively executing the above steps. By convention when no combination of state and symbol is found in the transition table, the machine halts¹¹1The whole point of Turing in his original 1936 paper was to provide a mathematical description of a very simple device capable of arbitrary computations, and prove properties of computation in general – and in particular, the uncomputability of the Entscheidungsproblem (decision problem) of predicting whether a so specified TM would halt or move forever, reading and writing symbols..

Storage Modification Machines presented here are from a variant in [2] where they are used to implement population protocol models. Like a TM, a SMM represents a single computing agent. It is equipped with memory and a processing unit. Its memory stores a finite directed graph of equal out-degree nodes [4], with a distinguished node called the center. (Edges of this graph are also called pointers.) Edges leading out of each node are uniquely labelled by distinct directions, drawn from a finite set $D$ .

Any string $x\in D^{*}$ refers to the node $p(x)$ reached from the center by following the sequence of directions labelled by $x$ . (Note that this is somehow similar to moving the head in a TM.) In the variant used here, nodes may have different out-degrees, and we set $p(x)=\emptyset$ when $x$ is not a valid path in the graph.

SMM are furthermore characterized by a program, or control list, which is a finite list of consecutively numbered instructions (reminiscent of the transition tables in TM). The restricted instruction set is as follows:

•

new label creates a new labelled node and makes it the center, setting all its outgoing edges to the previous center.
•

set xd to y where $x,y$ are paths in $D^{*}$ and $d\in D$ is a direction, redirects the $d$ edge of $p(x)$ to point to $p(y)$ .
•

center x where $x$ is a path, moves the center to $p(x)$ .
•

if x y then ln where $x,y$ are paths and ln a line number, jumps to line ln if $p(x)=p(y)$ and skips to the next line if not. Line numbers can be absolute, ln, or relative to the current line number, +ln or -ln.
•

stop message halts the SMM, printing out message.

The similarities highlighted above between TM and SMM are our guidelines to the specification of a generic transformation of the TM transition table to appropriate sequences of SMM instructions.

3 A TM-to-SMM Compiler

More specifically, elaborating on these similarities: each TM tape cell and each position of the head over it are represented by a node in the compiled SMM, with one dedicated direction, say f, pointing from head to cell and cell to head.

The symbol on the tape cell is simply the binary representation of its index in the alphabet, using $n$ specialized directions in the tape node, where $n$ is the integer immediately superior to the base-2 log of the alphabet size. The state of the TM is encoded in a head node in the same way, the binary representation of its index in the state-set using $m$ specialized directions (which may be in common with the former $n$ ones, without confusion as a node is either a tape or a head).

A special direction in both type of nodes, say o, always points to a fixed initial node, the Origin. The center of the SMM is set to the current position of the TM head, i.e. the SMM head node linked to the current tape cell tape node.

By convention, the binary representation bits in nodes are $0$ when their direction points to self, and $1$ when their direction points to the fixed Origin.

Finally tape nodes are doubly-linked using two specialized directions, e and w (for east and west); their corresponding head nodes (in the f direction) are doubly-linked in the same way. The compiled SMM has then $4+max(n,m)$ directions.

The TM compiler is built as a sequence of code generation/decoration passes:

•

State Selection. Build a decision tree based on the state binary representation over the $m$ bit directions, using SMM controls if <B> o to test each <B> bit direction.
•

Symbol Detection. At each leaf of the previous tree, build a decision tree based on the symbol binary representation over the $n$ bit directions. At each leaf of each of these new trees, add the control list implementing the $(\sigma,S)$ transition found in the TM table: binary encode the new symbol on the current tape node; move to the e or w head node as indicated in the transition (possibly creating new SMM nodes in the process, if they do not exist, hence simulating an infinite tape); finally encode the new state in this head cell, and make it the center. This reproduces the steps 2-4 of the TM.
•

Prologue. Prefix the control list resulting from the previous passes with a specific control list setting up the initial tape and symbols, the initial head position and the initial state.

The compiled SMM is ran stepwise: the initial prologue to set up the TM as a first mandatory step, followed by a sequence of calls to the above transition control list. Each call to the SMM execution step implements a full transition in the TM execution.

3.1 SMM simulation of the TM simulation of the Collatz $3x+1$ function

We exercize the TM-to-SMM compiler on the compact 3-4 TM, simulating the Collatz $3x+1$ function, given in [3] and defined as:

$(S,\sigma)$	b(lank)	0	1	2
$A$	bLC	0RA	0RB	1RA
$B$	2LC	1RB	2RA	2RB
$C$	bRA	0LC	1LC	2LC

where $R$ , $L$ represent right and left, captured as e, w directions in the compiled SMM. This TM operates on integers represented in base 3: the symbol alphabet is $\{b,0,1,2\}$ and the state set is $\{A,B,C\}$ . The initial tape is the base-3 representation of $x_{0}$ , and the initial state is set to $A$ . Note that this TM never halts as it ends up – if the Collatz conjecture is true – looping over the $\{1,4,2\}$ cycle.

The following page displays twelve steps of the compiled SMM execution, starting from the initial value $x_{0}=19$ . On the right-hand side, top to bottom, the SMM growth is graphed with the center highlighted in gray. (The o direction and all binary representations of symbols and states in the SMM nodes are omitted for clarity.) On the left-hand side a more conventional TM graphical representation of the SMM computation is presented; symbols and states are shown and color-coded, with the position of the head. The initial tape cell is circled twice.

The repeated execution of the SMM control list results in values which are read (in base 3) when the TM state is $C$ and the head is at its leftmost position, over a blank, b, tape cell. Note that the compact Collatz 3-4 TM computes the only odd values in the Collatz series starting at $x_{0}$ , skipping over all the intermediate divisions by $2$ . Here in the trace displayed next page:

T	Base 3	Decimal
0	201	19
7	1002	29
…	…	…

Of course the delay between successive printings of the iterated values on the tape lengthens as the tape is progressively expanded right, forcing a longer trip of the head back to its leftmost position in state $C$ .

This compiled SMM has 6 directions, 2 of which are used for the binary representation of states and symbols. At any given time it counts $2n+1$ nodes, where $n$ is the length of the tape. The prologue control list is 43 lines long; the transition control list is 398 lines long (including compiler-generated comments).

(Listings for the compiled SMM program, and other Turing Machines simulations is on-line at https://github.com/CRTandKDU/SMM.)

Refer to caption — Figure 1: Twelve steps of the SMM compiled from the compact Collatz 3-4 TM.

4 Conclusion

Schönhage’s Storage Modification Machines can generally simulate Turing Machines. This paper provides an alternate construction of such SMM simulations which finds its use in a Turing Machine to SMM compiler . Related questions on the minimal (space) complexity SMM required for simulation of complex Turing Machines may then be addressed by looking into optimizing this simple TM-to-SMM compiler.

References

[1] John Cocke and Marvin Minsky. Universality of Tag Systems with P = 2. J. ACM, 11(1):15–20, January 1964.
[2] Rachid Guerraoui and Eric Ruppert. Names Trump Malice: Tiny Mobile Agents Can Tolerate Byzantine Failures. In Proceedings of the 36th Internatilonal Collogquium on Automata, Languages and Programming: Part II, ICALP ’09, page 484–495, Berlin, Heidelberg, 2009. Springer-Verlag.
[3] Pascal Michel. Simulation of the Collatz 3x+1 function by Turing machines, 2014. https://arxiv.org/abs/1409.7322, last accessed on 9/25/2021.
[4] A. Schönhage. Storage Modification Machines. SIAM Journal on Computing, 9(3):490–508, 1980.
[5] A. M. Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society. Second Series, 42:230–265, 1936. This is the paper that introduced what is now called the Universal Turing Machine.
[6] Stephen Wolfram. A New Kind of Science. Wolfram Media, 2002.

Compiling Turing Machines into Storage Modification Machines