Weak Greibach Normal Form for Hyperedge Replacement Grammars

Tikhon Pshenitsyn Department of Mathematical Logic and Theory of Algorithms
Faculty of Mathematics and Mechanics
Lomonosov Moscow State University
GSP-1, Leninskie Gory, Moscow, 119991, Russian FederationThe study was funded by RFBR, project number 20-01-00670. [email protected]

Abstract

It is known that hyperedge replacement grammars are similar to string context-free grammars in the sense of definitions and properties. Therefore, we expect that there is a generalization of the well-known Greibach normal form from string grammars to hypergraph grammars. Such generalized normal forms are presented in several papers; however, they do not cover a large class of hypergraph languages (e.g. languages consisting of star graphs). In this paper, we introduce a weak Greibach normal form, whose definition corresponds to the lexicalized normal form for string grammars, and prove that every context-free hypergraph language (with nonsubstantial exceptions) can be generated by a grammar in this normal form. The proof presented in this paper generalizes a corresponding one for string grammars with a few more technicalities.

1 Introduction

Extensions of formal grammars and languages from strings to graphs are considered in a wide variety of works. The resulting formalisms called graph grammars generate graph languages by means of productions: each production is of the form $A\to G$ ; it allows one to replace a part of a graph labeled by $A$ with the graph $G$ if certain conditions are satisfied. An overview on graph grammars can be found in “Handbook on Graph Grammars and Computing by Graph Transformation” [12].

In this paper, we focus on a particular approach called hyperedge replacement grammar (HRG). An overview on HRG can be found in the book [8] or in a chapter of the handbook mentioned above [4]. In brief, a hyperedge replacement grammar contains productions that allow one to replace an edge with a certain label by a hypergraph; the rest is similar to string context-free grammar definitions. HRGs appeared in the seventies, and they became popular from theoretical and practical points of view; particularly, they can be used in machine translation or programming (see, for instance, [10]). Many structural properties of HRGs have been studied. Nicely, many theorems regarding context-free string grammars (CFGs) can be extended to HRGs in a natural way; besides, it is often the case that proofs for CFGs can be generalized to HRGs so there is no need to invent anything new.

For context-free grammars, there is a so-called Greibach normal form, which is helpful in a number of investigations, e.g. to connect CFGs with categorial grammars (see [3]). Each production in a grammar in the Greibach normal form has to begin with a terminal symbol and proceed with at most two nonterminal ones. This definition can be weakened: one says that a grammar is in the weak Greibach normal form (it is often called lexicalized) if there is exactly one terminal symbol in the right-hand side of each production. Obviously, the same definition can be introduced for graphs. Namely, an HRG is said to be in the weak Greibach normal form if for each production $A\to H$ the hypergraph $H$ has exactly one edge labeled by a terminal symbol.

The main objective of this paper is to present a precise class of graph languages that can be generated by HRGs in the weak Greibach normal form. In order to do this we generalize the proof of existence of the Greibach normal form for context-free grammars taken from [2] in a straightforward way. However, several differences exist. Firstly, there is a nonsubstantial class of hypergraph context-free languages that cannot be generated by grammars in the weak Greibach normal form (a problem arises because of isolated nodes). Secondly, there is a trick in the proof for CFGs that exploits the string nature of grammars, so it is hard to generalize it to hypergraphs. This led us to a large and technically heavy proof. Finally, there are bad news regarding algorithmical complexity: conversion of an HRG into an equivalent one in the weak Greibach normal form cannot be done even in exponential time.

In Sect. 2 we compare our contribution with other studies related to the Greibach normal form for graph grammars. In Sect. 3 formal definitions related to hypergraphs and HRGs are presented. In Sect. 4 the weak Greibach normal form is introduced, related issues are discussed. Sect. 5 is devoted to the main theorem. In Sect. 6 we discuss algorithmic complexity of the normalization procedure. In Sect. 7 we conclude.

2 Related Work

There are several works devoted to the Greibach normal form for graph grammars. The paper of Joost Engelfriet [5] establishes that HRGs that produce languages of bounded degree are equivalent to apex HRGs; this result generalizes the double Greibach normal form for CFGs. To recall, a language $L$ is of bounded degree if for some $M$ all nodes of all hypergraphs in $L$ have degree not exceeding $M$ . Obviously, there are substantive examples of languages of unbounded degree: for instance, the language of star graphs, which have one node and arbitrarily many edges outgoing from it. Besides, the property of being apex is stronger than the weak Greibach normal form we are interested in.

In the paper of Christina Jansen et al. [9] the local Greibach normal form is presented. The authors prove that data structure grammars (it is a specific kind of grammars that generate so-called heap configurations) can be transformed into grammars in the local Greibach normal form; after the proof the authors point out that the normalization can be generalized to HRGs of bounded degree. However, the authors also note that their procedure being algorithmically efficient cannot be generalized to all HRGs.

The weak Greibach normal form in the sense we are interested in is introduced for another type of graph grammars. Namely, in [6] it is proved that each B-eNCE grammar (here we do not consider definitions regarding this formalism) is equivalent to a B-eNCE grammar in the weak Greibach normal form. In [7] it is shown that B-edNCE grammars and HRGs have the same recognizing power in some sense (namely, since B-edNCE grammars produce usual graphs with labeled nodes and edges while HRGs produce hypergraphs with labeled edges, the result is established w.r.t. two translations, from graphs to hypergraphs and vice versa). However, it seems to be impossible that these two results can be combined in order to obtain the normal form for HRGs.

Therefore, to our best knowledge, the question of whether each HRG is equivalent to an HRG in the weak Greibach normal form has hitherto remained open. We answer it in this paper.

3 Preliminaries

This section is concerned with definitions related to hypergraphs. All of them are taken from [4]. Note that we use a slightly different notation from that in the handbook mentioned above.

3.1 Hypergraphs, Sub-hypergraphs

$\mathbb{N}$ includes $0$ . The set of integers from $1$ to $n$ is denoted by $[1,n]$ .

It is convenient for us to use the following notation: if $\{i_{1},\dots,i_{k}\}$ is an indexed set of integers such that for $m<n\;i_{m}<i_{n}$ holds, then it is called index-ordered and the set is denoted as $\{i_{1},\dots,i_{k}\}_{IO}$ .
The set $\Sigma^{*}$ is the set of all strings over the alphabet $\Sigma$ including the empty string $\varepsilon$ . The length $|w|$ of the word $w$ is the number of positions in $w$ . The $i$ -th symbol in $w$ is denote by $w(i)$ ( $1\leq i\leq|w|$ ). $\Sigma^{+}$ denotes the set of all nonempty strings. The set $\Sigma^{\circledast}$ is the set of all strings consisting of distinct symbols. The set of all symbols contained in the word $w$ is denoted by $[w]$ . If $f:\Sigma\to\Delta$ is a function from one set to another, then it is naturally extended as a function $f:\Sigma^{*}\to\Delta^{*}$ ( $f(\sigma_{1}\dots\sigma_{k})=f(\sigma_{1})\dots f(\sigma_{k})$ ).

Let $C$ be some fixed set of labels, for which the function $type:C\to\mathbb{N}$ is considered.

Definition 1.

A hypergraph over $C$ is a tuple $G=\langle V,E,att,lab,ext\rangle$ where $V$ is the set of nodes, $E$ is the set of hyperedges, $att:E\to V^{\circledast}$ assigns an ordered set of attachment nodes to each edge, $lab:E\to C$ labels each hyperedge by some element of $C$ in such a way that $type(lab(e))=|att(e)|$ whenever $e\in E$ , and $ext\in V^{\circledast}$ is an ordered set of external nodes.

Components of a hypergraph $G$ are denoted by $V_{G},E_{G},att_{G},lab_{G},ext_{G}$ .

In the remainder of the paper, hypergraphs are simply called graphs, and hyperedges are simply called edges. The set of all graphs with labels from $C$ is denoted by $\mathcal{H}(C)$ . In this work, figures contain graph drawings and sketches. In drawings, nodes are depicted by black dots, edges are denotes as labeled boxes, $att$ is represented with numbered lines (called “tentacles”), external nodes are depicted by numbers in brackets. If an edge has type 2, it is depicted by an arrow. If we are not interested in a specific form of a graph but we want to have a closer look at some its part, we depict the whole graph as an area but draw its part of interest in detail.

Example 1.

This is a graph:

If $G$ is a graph, and $e\in E_{G}$ is labeled by $a$ , then $G$ can be denoted by $G(e:a)$ .

Definition 2.

The function $type$ (or $type_{G}$ to be exact) returns the number of nodes attached to some edge in a graph $G$ : $type_{G}(e):=|att_{G}(e)|$ . If $G$ is a graph, then $type(G):=|ext_{G}|$ .

Definition 3.

A sub-hypergraph (or just subgraph) $H$ of a hypergraph $G$ is a hypergraph such that $V_{H}\subseteq V_{G}$ , $E_{H}\subseteq E_{G}$ , and for all $e\in E_{H}$ $att_{H}(e)=att_{G}(e)$ , $lab_{H}(e)=lab_{G}(e)$ .

Definition 4.

If $H=\langle\{v_{i}\}_{i=1}^{n},\{e_{0}\},att,lab,v_{1}\dots v_{n}\rangle$ , $att(e_{0})=v_{1}\dots v_{n}$ and $lab(e_{0})=a$ , then $H$ is called a handle. It is denoted by $\circledcirc(a)$ .

Definition 5.

An isomorphism between graphs $G$ and $H$ is a pair of bijective functions $\mathcal{E}:E_{G}\to E_{H}$ , $\mathcal{V}:V_{G}\to V_{H}$ such that $att_{H}\circ\mathcal{E}=\mathcal{V}\circ att_{G}$ , $lab_{G}=lab_{H}\circ\mathcal{E}$ , $\mathcal{V}(ext_{G})=ext_{H}$ . In this work, we do not distinguish between isomorphic graphs.

3.2 Replacement

This procedure is defined in [4]. In short, the replacement of an edge $e_{0}$ in $G$ with a graph $H$ can be done if $type(e_{0})=type(H)$ as follows:

1.

Remove $e_{0}$ ;
2.

Insert an isomorphic copy of $H$ (namely, $H$ and $G$ have to consist of disjoint sets of nodes and edges);
3.

For each $i$ , fuse the $i$ -th external node of $H$ with the $i$ -th attachement node of $e_{0}$ .

To be more precise, the set of edges in the resulting graph is $(E_{G}\setminus\{e_{0}\})\cup E_{H}$ , and the set of nodes is $V_{G}\cup(V_{H}\setminus ext_{H})$ . The result is denoted by $G[H/e_{0}]$ .

3.3 Hyperedge Replacement Grammars

Definition 6.

A hyperedge replacement grammar (HRG) is a tuple $HGr=\langle N,\Sigma,P,S\rangle$ , where $N$ is a finite alphabet of nonterminal symbols, $\Sigma$ is a finite alphabet of terminal symbols ( $N\cap\Sigma=\emptyset$ ), $P$ is a set of productions, and $S\in N$ . Each production is of the form $A\to H$ where $A\in N$ , $H\in\mathcal{H}(N\cup\Sigma)$ and $type(A)=type(H)$ . For $\pi=A\to H$ we denote $lhs(\pi)=A,rhs(\pi)=H$ .

Edges labeled by terminal (nonterminal) symbols are called terminal edges (nonterminal edges resp.). If a graph contains terminal edges only, it is called terminal.

If $G$ is a graph, $e_{0}\in E_{G}$ , $lab(e_{0})=A$ and $\pi=A\to H\in P$ , then $G$ directly derives $G[H/e_{0}]$ (we write $G\Rightarrow G[H/e_{0}]$ or $G\underset{\pi}{\Rightarrow}G[H/e_{0}]$ ). The transitive reflexive closure of $\Rightarrow$ is denoted by $\overset{\ast}{\Rightarrow}$ . If $G\overset{\ast}{\Rightarrow}H$ , then $G$ is said to derive $H$ . The corresponding sequence of production applications is called a derivation.

Definition 7.

The language generated by an HRG $HGr=\langle N,\Sigma,P,S\rangle$ is the set of graphs $H\in\mathcal{H}(\Sigma)$ such that $\circledcirc(S)\overset{\ast}{\Rightarrow}H$ . It is denoted by $L(HGr)$ .

A graph language $L$ is said to be a hypergraph context-free language (HCFL) if it is generated by some HRG.

Two grammars are said to be equivalent if they generate the same language.

Further we simply write $A\overset{\ast}{\Rightarrow}G$ instead of $\circledcirc(A)\overset{\ast}{\Rightarrow}G$ .

4 Weak Greibach Normal Form

In this section we introduce the formal definition of the normal form we are interested in, and establish a simple property of languages generated by grammars in this normal form.

Definition 8.

An HRG $HGr$ is in the weak Greibach normal form (WGNF) if there is exactly one terminal edge in the right-hand side of each production. Formally, $\forall(X\to H)\in P_{HGr}$ $\exists!e_{0}\in E_{H}:lab_{H}(e_{0})\in\Sigma_{HGr}$ .

Example 2.

The grammar $HGr_{1}=\langle\{S\},\{a\},P_{1},S\rangle$ is in the WGNF where $P_{1}$ contains the following rules ( $type(S)=type(a)=1$ ):

This grammar generates the language of star 1-edged $a$ -labeled hypergraphs, which is unbounded.

Remark 1.

If $\langle N,\Sigma,P,S\rangle$ is in the WGNF and $X\overset{k}{\Rightarrow}G$ for some $X\in N$ , then $G$ has exactly $k$ terminal edges (each production adds exactly one terminal edge).

Note that not each hypergraph context-free language can be generated by some HRG in the WGNF. This follows from

Example 3.

Consider the HRG $HGr_{2}=\langle\{T\},\{b\},P_{2},T\rangle$ where $P_{2}$ contains the following rules
( $type(T)=type(b)=0$ ):

The first production just adds an isolated node; thus this grammar produces graphs that have exactly one edge labeled by $b$ and arbitrarily many isolated nodes. If there is an equivalent $HGr^{\prime}=\langle N,\{b\},P^{\prime},S^{\prime}\rangle$ in the WGNF, then each right-hand side of each production in $P^{\prime}$ contains exactly one terminal edge. Note that if $S^{\prime}\overset{k}{\Rightarrow}G,G\in\mathcal{H}(\{b\})$ for some $k$ in $HGr^{\prime}$ , then $G$ has $k$ terminal edges (Remark 1); hence $k$ has to be equal to $1$ and therefore $S^{\prime}\to G\in P^{\prime}$ . However, there are infinitely many graphs in $L(HGr)$ while $|P^{\prime}|<\infty$ . ∎

It is seen that the reason of this problem is in isolated nodes. In the string case if a language contains the empty word, then it cannot be generated by a grammar in the WGNF due to obvious reasons: each production adds at least one terminal symbol so it is impossible to produce the empty word. In the case of graphs, we also have to prohibit somehow undesired languages with “too many” isolated nodes occuring in graphs.

Below we describe the precise class of languages generated by grammars in the weak Greibach normal form. We denote by $esize(G)$ the number of edges in $G$ , and by $isize(G)$ the number of isolated nodes in $G$ .

Definition 9.

An HCFL $L$ is said to be isolated-node bounded (ibHCFL) if there is a constant $M$ such that, for each $G\in L$ , $isize(G)<M*esize(G)$ . An HRG $HGr$ is called isolated-node bounded if it generates an ibHCFL.

Theorem 1.

Each HRG $HGr=\langle N,\Sigma,P,S\rangle$ in the WGNF generates an ibHCFL.

Proof.

Let $M=\max_{A\to H\in P}\{isize(H)\}+1$ . Define $esize_{T}(G)$ as the number of terminal edges in $G$ . We prove by induction on $k$ that for each graph $G$ such that $X\overset{k}{\Rightarrow}G,X\in N$ $isize(G)<M*esize_{T}(G)$ . Remark 1 implies that $esize_{T}(G)=k$ .

Basis. If $k=0$ , then $G=\circledcirc(X)$ , and the statement is trivial ( $0<M$ ).

Step. Let $X\overset{k-1}{\Rightarrow}H\Rightarrow G$ . By the induction hypothesis, $isize(H)<M*esize_{T}(H)$ . The number of isolated nodes appeared at the last step does not exceed $M$ by the definition of $M$ . Then $isize(G)<M*esize_{T}(H)+M=M*esize_{T}(G)$ .

Note finally that $esize_{T}(G)\leq esize(G)$ . ∎

The other direction of this statement is of central interest in this paper. The next section is devoted to it.

5 Transformation of HRGs Generating IBHCFLs into HRGs in the Weak Greibach Normal Form

The main theorem we are going to prove is the following:

Theorem 2.

Each ibHCFL $L$ can be generated by an HRG in the WGNF.

In other words, this theorem states that each context-free hypergraph language satisfying the isolated-node boundedness property can be generated by a grammar in the weak Greibach normal form.

Structurally, the proof of this theorem is based on the corresponding one for string context-free grammars. Let us recall the mains steps of the latter in brief (see details in [2]).

The input of the algorithm is a context-free grammar $CFG$ that does not generate the empty word (this property is related to isolated-node boundedness). The desired output is an equivalent grammar in the weak Greibach normal form (i.e. in which each production is of the form $A\to aA_{1}\dots A_{n}$ where $a$ is terminal while $A_{1},\dots,A_{n}$ are nonterminal).

1.

Useless rules and symbols, $\varepsilon$ -rules (i.e. rules of the form $A\to\varepsilon$ ) and chain rules (i.e. rules of the form $A\to B$ for $A,B$ being nonterminal) are eliminated.
2.

It is shown how to eliminate recursive $A$ -productions, i.e. productions of the form $A\to A\alpha$ for some fixed nonterminal symbol $A$ . The trick is to move $A$ from the left side of $A\alpha$ to the right side. It is done as follows: if $A\to A\alpha_{1}|\dots|A\alpha_{m}|\beta_{1}|\dots|\beta_{p}$ are all the $A$ -productions (here $|$ means enumeration of productions), then they are replaced by the productions $A\to\beta_{1}A^{\prime}|\dots|\beta_{p}A^{\prime}|\beta_{1}|\dots|\beta_{p}$ and $A^{\prime}\to\alpha_{1}|\dots|\alpha_{m}|\alpha_{1}A^{\prime}|\dots|\alpha_{m}A^{\prime}$ .
3.

Left recursion is completely eliminated: nonterminals are numbered as $A_{1},\dots,A_{n}$ and then a procedure involving application of step 2 is done such that its result is a grammar where each production is either of the form $A_{i}\to A_{j}\alpha$ for $i<j$ or of the form $A_{i}\to b\alpha$ for $b$ being terminal.
4.

By taking compositions of the above productions and adding new nonterminal symbols the grammar is normalized. The resulting grammar includes rules of the form $A\to b\alpha$ ( $b$ is terminal, $\alpha$ is a string of nonterminals) only.

Our goal is to recreate this proof for HRGs. Note that steps 2-4 of the above plan actively exploit string nature of CFGs: transformations of productions are based on movements of symbols from the leftmost position to the rightmost one. Graphs in general do not have leftmost positions unlike strings, so we are going to distinguish arbitrary edges in graphs so they play the role of “the leftmost symbol”. It is done by means of $\delta$ (see the proof below).

In the proof we use the following

Lemma 1.

If $\pi=A\to G(e_{0}:B)$ is a production of a grammar $HGr=\langle N,\Sigma,P,S\rangle$ for $A,B$ being nonterminal, and $B\to H_{1},\dots,B\to H_{k}$ are all the productions in $HGr$ with $B$ in the left-hand side, then replacing $\pi$ by productions $A\to G[H_{1}/e_{0}],\dots,A\to G[H_{k}/e_{0}]$ does not change the language generated.

It directly follows from the well-known context-freeness lemma.

5.1 Eliminating Useless, Edgeless and Chain Productions

Our first objective is to eliminate useless productions, productions with no edges in the right-hand side and productions with one nonterminal edge only in the right-hand side. It appears that only this step requires isolated-node boundedness. In the below transformations we provide theoretical reasonings that they can be done irregarding their algorithmic realization; however, the latter can be done similarly to the string case (see [2]).

Definition 10.

A nonterminal symbol $A$ in a grammar $HGr$ is useless if there is no derivation of the form $S\overset{\ast}{\Rightarrow}H(e:A)\overset{\ast}{\Rightarrow}G$ in this grammar where $G$ is terminal.

Transformation 1 (eliminating useless symbols).

Input: an HRG $HGr$ .
Output: an equivalent HRG $HGr^{\prime}$ without useless nonterminal symbols.

It suffices to remove all the useless symbols and all the productions containing them. This does not affect the language generated.

Definition 11.

A graph is called edgeless if it does not contain edges. A production $A\to G$ is called edgeless if $G$ is edgeless.

Transformation 2 (eliminating edgeless productions).

Input: an HRG $HGr$ that generates an ibHCFL without useless symbols.
Output: an HRG $HGr^{\prime}$ without edgeless productions such that $L(HGr^{\prime})=L(HGr)$ .

Method.
Let $HGr=\langle N,\Sigma,P,S\rangle$ . Let $Null=\{(A;H)\mid E_{H}=\emptyset,\>A\in N,\>A\overset{\ast}{\Rightarrow}H\}$ . This set is finite, because otherwise there is a symbol $A_{0}$ for which arbitrarily large edgeless graphs $H$ exist such that $(A_{0};H)\in Null$ ; this contradicts the fact that $L(HGr)\in\mathrm{ibHCFL}$ (it is important that $A_{0}$ is not useless).

Let $B\to G\in P$ and $E_{G}=\{e_{1},\dots,e_{n}\}$ . Let $P_{1}$ contain the rules of the form $B\to G[H_{1}/e_{1}]\dots[H_{n}/e_{n}]$ where $H_{i}=\circledcirc(lab(e_{i}))$ for at least one $i$ (then replacement of $e_{i}$ by $H_{i}$ changes nothing) and $(lab(e_{i});H_{i})$ belongs to $Null$ otherwise. It is argued that $HGr^{\prime}=\langle N,\Sigma,P_{1},S\rangle$ does not have productions with edgeless right-hand sides (this follows from the construction) and that $L(HGr^{\prime})=L(HGr)$ . ∎

Definition 12.

A production $A\to G$ is called a chain production if $E_{G}=\{e\}$ and $lab_{G}(e)$ is nonterminal.

Example 4.

The first production in the grammar $HGr_{2}$ is chain.

Transformation 3 (eliminating chain productions).

Input: an HRG $HGr$ that generates an ibHCFL without useless symbols.
Output: an HRG $HGr^{\prime}$ without chain productions such that $L(HGr^{\prime})=L(HGr)$ .

Method.
Let $HGr=\langle N,\Sigma,P,S\rangle$ . Consider the set $Chain=\{(A;H)\mid E_{H}=\{e_{0}\},lab_{H}(e_{0})=B,\>A,B\in N,\>A\overset{\ast}{\Rightarrow}H\}$ . Note that $Chain$ is finite (again, otherwise one can derive a graph with arbitrarily many isolated nodes having a fixed number of edges). Let $HGr^{\prime}=\langle N,\Sigma,P^{\prime\prime},S\rangle$ where $P^{\prime}=P\setminus\{A\to H\mid(A;H)\in Chain\}$ and $P^{\prime\prime}=P^{\prime}\cup\{A\to G\mid\exists H:\>(A;H)\in Chain,H\to G\in P^{\prime}\}$ . Thus we removed all the productions having a graph with one edge in the right-hand side. It can be easily shown that $L(HGr^{\prime})=L(HGr)$ . ∎

Remark 2.

If $HGr$ in Transformation 3 does not have edgeless productions, then $HGr^{\prime}$ does not have them either. Transformations 2 and 3 applied to a grammar without useless symbols transform them into a grammar without useless symbols too.

Example 5.

The grammar $HGr_{2}$ from Example 3 cannot be turned into an equivalent one without edgeless and chain productions.

The above procedures complete the first step of normalization.

5.2 Defining and Eliminating Recursive Productions

Now we start proving Theorem 2. Let $HGr=\langle N,\Sigma,P,S\rangle$ be a grammar that generates $L$ . Applying Transformations 1, 2, 3, we can assume that $HGr$ does not have useless symbols, edgeless productions and chain productions. Thus, the right-hand side of each production has at least two edges or one terminal edge.

Let us carefully examine what happens in the string case at Step 2. Consider the following

Example 6.

Let $A\to Ac|Ad|Be$ be all the $A$ -productions in a string context-free grammar. After Step 2 there are the following ones: $A\to BeA^{\prime}|Be,\;A^{\prime}\to c|d|cA^{\prime}|dA^{\prime}$ . Below we show how a derivation in the old grammar is remodeled in the new one:

Old:	$A\Rightarrow Ac\Rightarrow Adc\Rightarrow Addc\Rightarrow Beddc$ .
New:	$A\Rightarrow BeA^{\prime}\Rightarrow BedA^{\prime}\Rightarrow BeddA^{\prime}\Rightarrow Beddc$ .

The underlying idea is to invert the derivation: in the new derivation one starts with $Be$ and applies new productions in reverse order w.r.t. to the old derivation.

The same idea is used in the graph case. The difficulty is that we do not have the leftmost symbol in productions. However, it suffices to distinguish an arbitrary edge in the right-hand side of each production to play the role of the leftmost symbol. We do this by means of the function $\delta$ .

Definition 13.

Let us fix an arbitrary function $\delta$ which acts on $P$ in such a way that $\delta(A\to H)$ belongs to $E_{H}$ . We denote $\mu(\pi)=lab(\delta(\pi))$ .

Now $\mu(\pi)$ plays the role of the “leftmost” symbol of a production $\pi$ . Then we define recursive productions as expected.

Definition 14.

A production $\pi$ is recursive if $lhs(\pi)=\mu(\pi)$ . A production $\pi$ is an $A$ -production if $lhs(\pi)=A$ .

Definition 15.

A derivation $A\Rightarrow H_{1}\Rightarrow\dots\Rightarrow H_{k}$ is called a $\delta$ -derivation if each of its productions is applied to the edge that is $\delta$ of the previous production. Formally, if $A\underset{\pi_{1}}{\Rightarrow}H_{1}\underset{\pi_{2}}{\Rightarrow}\dots\underset{\pi_{k}}{\Rightarrow}H_{k}$ , then $\pi_{i}$ has to be appied to $\delta(\pi_{i-1})$ (in the subgraph that appears after the $(i-1)$ -th step). We say that the final label of such a derivation is $\mu(\pi_{k})$ . Note that $\mu(\pi_{i})$ has to be nonterminal for $i<k$ (but $\mu(\pi_{k})$ can be terminal).

Below we study how to eliminate all the recursive $A$ -productions for some fixed $A\in N$ . In the string case the idea is quite simple: one moves $A$ from the beginning of the string to its end in order to invert a derivation process. Here we also exploit the idea of inverting the derivation but, of course, we have to take into account more complex and general graph structures.

Let $A\underset{\pi_{1}}{\Rightarrow}H_{1}\underset{\pi_{2}}{\Rightarrow}\dots\underset{\pi_{k}}{\Rightarrow}H_{k}$ be a $\delta$ -derivation such that $lhs(\pi_{i})=A$ for $i=1,\dots,k$ and $\mu(\pi_{k})\neq A$ (compare this with the old derivation in Example 6). We provide the following intuition behind the below construction. Imagine that one has a hole in his sweater (this metaphor is related to that in [5]), and he sews it up starting from edges (i.e. he applies $\pi_{1},\dots,\pi_{k-1}$ ) and finishing by sewing on a patch in the center of the hole (i.e. he applies $\pi_{k}$ ). In the new derivation one starts with the patch (that is, the right-hand side of $\pi_{k}$ ), then he sews up right-hand sides of $\pi_{k-1},\dots,\pi_{2}$ in this order (thus the patch grows and becomes bigger) and finishing by connecting it with edges of the hole w.r.t. the production $\pi_{1}$ (see also Fig. 5 which provides a sketch of this process).

The major problem of this idea is the following: if we want to invert a derivation such as on Figure 5(a) and to obtain a new one, such as one drawn on Figure 5(b), we have to carefully control external nodes. Namely, in the old derivation external nodes of $H$ are predefined by $R_{1}$ while in the new derivation $R_{1}$ appears at the very last step. In order to place external nodes correctly, we introduce complex nonterminal symbols that describe how external nodes of a graph on the current derivation step correspond to external nodes of a graph on the next and on the last step. This description is done by means of partial functions.

Now we proceed with formal realization of this idea. Let $t=type(A)$ . Let $\rho_{1}=A\to R_{1},\dots,\rho_{K}=A\to R_{K}$ be all the recursive $A$ -productions in $P$ and let $\gamma_{1}=A\to G_{1},\dots,\gamma_{L}=A\to G_{L}$ be the remaining $A$ -productions in $P$ . Our goal is to construct a grammar equivalent to $HGr$ without recursive $A$ -productions. Firstly, we simply remove them: $P_{1}:=P\setminus\{\rho_{1},\dots,\rho_{K}\}$ . In order to compensate for the lack of these rules we add new ones accordingly to the procedure below.

Definition 16.

$f:X\to Y$ is a partial function if $f$ is a function on some subset $X^{\prime}$ of $X$ . We denote the domain $X^{\prime}$ of $f$ by $Dom(f)$ , and the range of $f$ by $Ran(f)$ .

Definition 17.

$f:X\to Y$ is a partial bijection if $f$ is a partial function such that $f|_{Dom(f)}$ is a bijection.

Definition 18.

If $f:X\to Y$ , $g:Y\to Z$ are partial functions, then $g\circ f:X\to Z$ is a partial function defined on $Dom(f)\cap f^{-1}(Dom(g))$ that acts on this set as a usual composition.

(a) The production

\gamma

(b) The production

\rho

Figure 1: The productions

\gamma

and

\rho

Let $\gamma=\gamma_{i}$ for some $i$ and let $\rho=\rho_{j}$ for some $j$ . We define $e_{0}:=\delta(\rho)$ , $\widetilde{e}:=\delta(\gamma)$ , $G:=rhs(\gamma)$ and $R:=rhs(\rho)$ . Illustrations of these productions are presented in Fig. 1. Firstly, we extend $N$ by new nonterminal symbols of the form $(A,f,g)$ where $f,g:[1,t]\to[1,t]$ are two partial bijections; the resulting set is denoted by $N^{\prime}$ . Then we add to $P_{1}$ rules of the following three forms:
Type I. Recall the metaphor about a sweater and a patch. If one imagines an $A$ -labeled edge at the beginning of a derivation as a hole in a sweater, then productions of type I are designed to add a patch $G$ at the very beginning of sewing up (= of a derivation). Partial functions are used to describe what external nodes of $G$ (edges of a patch) are equal to attachment nodes of the $A$ -labelled edge (edges of a hole).
Let $f,g:[1,t]\to[1,t]$ be two partial bijections. Informally, $f$ specifies what nodes of $ext_{G}$ are external on the second step of the inverted derivation (see again Fig. 5(b)); $g$ specifies what external nodes of the second step graph are external at the last step. Let $Dom(g\circ f)=\{p_{1},\dots,p_{k}\}_{IO}$ . We set

•

$V^{\prime}:=V_{G}\cup\{u_{1},\dots,u_{t-k}\}$ for $u_{1},\dots,u_{t-k}$ being new nodes;
•

$E^{\prime}:=E_{G}\cup\{e^{\prime}\}$ where $e^{\prime}$ is new;
•

For $e\in E_{G}$ we set $att^{\prime}(e):=att_{G}(e)$ and $lab^{\prime}(e)=lab_{G}(e)$ ;
•

For $e^{\prime}$ we set $att^{\prime}(e^{\prime}):=ext_{G}u_{1}\dots u_{t-k}$ ;
•

$lab^{\prime}(e^{\prime}):=(A,f,g)$ ;
•

Let $[1,t]\setminus Ran(g\circ f)=\{j_{1},\dots,j_{t-k}\}_{IO}$ (note that $g\circ f$ is bijective so $Dom(g\circ f)$ and $Ran(g\circ f)$ are of the same size). Then we set $ext^{\prime}(i):=u_{r}$ , if $i=j_{r},\;r\in\{1,\dots,t-k\}$ or $ext^{\prime}(i):=ext_{G}(p_{r})$ for $i=i_{r}=g(f(p_{r})),\;r\in\{1,\dots,k\}$ .

We are ready to announce that $G^{\prime}=\langle V^{\prime},E^{\prime},att^{\prime},lab^{\prime},ext^{\prime}\rangle$ . Finally, we introduce the following rule:

{\nu_{1,\gamma,f,g}:=A\to G^{\prime}}.

We define $\delta(\nu_{1,\gamma,f,g})=\delta(\gamma)$ (note that $\gamma$ is nonrecursive so is $\nu_{1,\gamma,f,g}$ ).

Remark 3.

Type of $(A,f,g)$ equals $type((A,f,g))=t+t-k=2\cdot type(A)-|Dom(g\circ f)|$ .

Figure 2 illustrates this type of productions.

Figure 2: The production

\nu_{1,\gamma,f,g}

Type II. Production of type II are used at the last step of a derivation such a on Fig. 5(b); returning to the metaphor they allow one to finish sewing up and to finally connect a patch (that has grown by applying productions of types I and III) with the edges of a hole.
We recall that $\rho=A\to R$ such that $\delta(\rho)=e_{0}\in E_{R}$ and $lab(e_{0})=A$ . Let $f_{0}:[1,t]\to[1,t]$ be a function which is defined by the following relation: $f_{0}(i)=j\Leftrightarrow att_{R}(e_{0})(i)=ext_{R}(j)$ . Obviously, $f_{0}$ is a partial bijection. We set

•

$V^{\prime}:=V_{R}$ ;
•

$E^{\prime}:=E_{G}\setminus\{e_{0}\}$ ;
•

$att^{\prime}=att_{R}|_{E^{\prime}}$ , $lab^{\prime}=lab_{R}|_{E^{\prime}}$
•

Let $k$ be equal to $|Dom(f_{0})|$ and let $[1,t]\setminus Ran(f_{0})=\{j_{1},\dots,j_{t-k}\}_{IO}$ . Then $ext^{\prime}=att_{R}(e_{0})ext_{R}(j_{1})\allowbreak\dots ext_{R}(j_{t-k})$ .

We set $R^{\prime}=\langle V^{\prime},E^{\prime},att^{\prime},lab^{\prime},ext^{\prime}\rangle$ . Finally, we introduce ${\nu_{2,\rho}:=(A,f_{0},id)\to R^{\prime}}$ and set ${\delta(\nu_{2,\rho}):=e}$ for some chosen $e\in E_{R^{\prime}}$ (note that there is one since $|E_{G}|\geq 2$ ). Here $id:[1,t]\to[1,t]$ is the identity function. See Figure 3.

Remark 4.

$type(R^{\prime})=2t-k=2\cdot type(A)-|Dom(f_{0})|=2\cdot type(A)-|Dom(id\circ f_{0})|=type((A,f_{0},id))$ . Thus the production is defined correctly.

Figure 3: The production

\nu_{2,\rho}

Type III. Productions of type III serve to make a “sewing up step” from inside to outside. $f,g$ that specify relations between external nodes on the current step, on the next step and on the final step are changed by related functions $f^{\prime},g^{\prime}$ .
Let $f,g,f^{\prime},g^{\prime}:[1,t]\to[1,t]$ be partial bijections. The whole procedure described below can be done if $g^{\prime}\circ f^{\prime}=g$ .

We set

•

$V^{\prime}:=V_{R}\cup\{u_{1},\dots,u_{t-k^{\prime}}\}$ such that $u_{1},\dots,u_{t-k^{\prime}}$ are new nodes ( $k^{\prime}$ is defined below);
•

$E^{\prime}:=E_{R}\setminus\{e_{0}\}\cup\{e^{\prime}\}$ where $e_{0}=\delta(\rho)$ and $e^{\prime}$ is new;
•

For $e\in E_{R}\setminus\{e_{0}\}$ we set $att^{\prime}(e):=att_{R}(e)$ and $lab^{\prime}(e)=lab_{R}(e)$ ;
•

For $e^{\prime}$ we set $att^{\prime}(e^{\prime}):=ext_{R}u_{1}\dots u_{t-k^{\prime}}$ ;
•

$lab^{\prime}(e^{\prime}):=(A,f^{\prime},g^{\prime})$ ;
•
Let
- –
  
  $M_{1}:=Dom(g)\setminus Ran(f)=\{s_{1},\dots,s_{p}\}_{IO}$ (nodes of the current step that will be external at the last step but that were not taken into account at the previous step);
- –
  
  $M_{2}:=[1,t]\setminus Ran(g\circ f)=\{j_{1},\dots,j_{t-k}\}_{IO}$ ;
- –
  
  $g(s_{i})=j_{l_{i}}$ ;
- –
  
  Let $M_{2}\setminus g(M_{1})=\{j_{x_{1}},\dots,j_{x_{t-k-p}}\}$ where $\{x_{1},\dots,x_{t-k-p}\}$ is index-ordered; here we note that $g(M_{1})=Ran(g)\setminus Ran(g\circ f)$ , thus $g(M_{1})\subseteq M_{2}$ .
Then we set $k^{\prime}=k+p$ and
- –
  
  $ext^{\prime}(i):=att_{R}(e_{0})(i)$ for $i=1,\dots,t$ ;
- –
  
  $ext^{\prime}(t+l_{i})=ext_{R}(s_{i})$ for $i=1,\dots,p$ ;
- –
  
  $ext^{\prime}(t+x_{i}):=u_{i}$ for $i=1,\dots,t-k^{\prime}$ .

We set $R^{\prime}=\langle V^{\prime},E^{\prime},att^{\prime},lab^{\prime},ext^{\prime}\rangle$ . Then ${\nu_{3,\rho,f,g,f^{\prime},g^{\prime}}:=(A,f,g)\to R^{\prime}}$ . We put ${\delta(\nu_{3,\rho,f,g,f^{\prime},g^{\prime}})=e}$ for some $e\in E_{R}\setminus\{e_{0}\}$ . The production is illustrated on Figure 4.

Remark 5.

Again, we prove well-definedness of this production:

•

Firstly, we check that $type(R^{\prime})=type((A,f,g))$ : $|ext^{\prime}|=t+p+t-k^{\prime}=2t+p-k-p=2\cdot type(A)-|Ran(g\circ f)|=type((A,f,g))$ ;
•

Secondly, we have to check that $|att(e^{\prime})|=type((a,f^{\prime},g^{\prime}))$ : $|att(e^{\prime})|=t+t-k^{\prime}=t+|M_{2}\setminus g(M_{1})|$ . Note that $M_{2}\setminus g(M_{1})=\left([1,t]\setminus Ran(g\circ f)\right)\setminus\left(Ran(g)\setminus Ran(g\circ f)\right)=[1,t]\setminus(Ran(g)\cup Ran(g\circ f))=[1,t]\setminus Ran(g)$ . This yields $|att(e^{\prime})|=2t-|Ran(g)|=2t-|Ran(g^{\prime}\circ f^{\prime})|=type((A,f^{\prime},g^{\prime}))$ .

Figure 4: The production

\nu_{3,\rho,f,g,f^{\prime},g^{\prime}}

. Here

R^{\ominus}

denotes a graph

R

without

\delta(\rho)

We say that $P_{2}$ is obtained from $P_{1}$ by adding all possible productions of all the above types. Obviously, their number is finite (each production is defined by at most four partial functions from $[1,t]$ to $[1,t]$ ).

Lemma 2.

The grammar $HGr_{1}=\langle N^{\prime},\Sigma,P_{2},S\rangle$ is equivalent to $HGr=\langle N,\Sigma,P,S\rangle$ .

Proof.

Informally, it suffices to notice that the design of new productions allows one to invert productions in the way which is shown on Figure 5. Example 7 provides an illustration of the formal proof, which is given below.

Firstly, we check that $L(HGr)\subseteq L(HGr_{1})$ . In order to do that, we show how to model a branch of a derivation that consists of rules of the form $\rho_{i}$ and finishes with a rule $\gamma_{j}$ by means of $HGr_{1}$ . Let $A\underset{\rho_{i_{1}}}{\Rightarrow}H_{1}\underset{\rho_{i_{2}}}{\Rightarrow}\dots\underset{\rho_{i_{m}}}{\Rightarrow}H_{m}\underset{\gamma_{j}}{\Rightarrow}H$ be a $\delta$ -derivation in $HGr$ where $H_{k}=H_{k-1}[R_{k}/\delta(\rho_{i_{k-1}})],1\leq k<m$ and $H=H_{m}[G/\delta(\rho_{i_{m}})]$ . It is convenient to put $H_{0}:=\circledcirc(A)$ and say that $\delta(\rho_{i_{0}}):=e_{0}$ where $E_{H_{0}}=\{e_{0}\}$ . For the sake of simplyfing notations we write $\rho_{1},\dots,\rho_{m}$ instead of $\rho_{i_{1}},\dots,\rho_{i_{m}}$ and $\gamma$ instead of $\gamma_{j}$ .

Let $G_{k}$ be defined inductively as follows: $G_{m+1}=G$ ; $G_{k}=R_{k}[G_{k+1}/\delta(\rho_{k})]$ . Note that

\begin{array}[]{lcl}H&=&H_{m}[G/\delta(\rho_{m})]=\\ &=&H_{m-1}[R_{m}/\delta(\rho_{m-1})][G/\delta(\rho_{m})]=\\ &\dots&\\ &=&R_{1}[R_{2}/\delta(\rho_{1})]\dots[R_{m}/\delta(\rho_{m-1})][G_{m+1}/\delta(\rho_{m})]=\\ &=&R_{1}[R_{2}/\delta(\rho_{1})]\dots[G_{m}/\delta(\rho_{m-1})]=\\ &\dots&\\ &=&R_{1}[G_{2}/\delta(\rho_{1})]=\\ &=&G_{1}.\\ \end{array}

To prove the claimed inclusion, it suffices to show then that $A\underset{HGr_{1}}{\Rightarrow}G_{1}$ .

Observe that $H=H_{i-1}[R_{i}[G_{i+1}/\delta(\rho_{i})]/\delta(\rho_{i-1})],\quad i=1,\dots,m$ . Then $G_{i+1}$ can be considered as a subgraph of $G_{i}:=R_{i}[G_{i+1}/\delta(\rho_{i})]$ , which is a subgraph of $H$ . Let $f_{i}$ be a partial function defined by the following correspondence: $f_{i}(j)=k\Leftrightarrow ext_{G_{i+1}}(j)=ext_{G_{i}}(k)$ . Similarly we define $g_{i}$ : $g_{i}(j)=k\Leftrightarrow ext_{G_{i}}(j)=ext_{H}(k)$ . It follows from these definitions that $g_{i}\circ f_{i}=g_{i-1}$ .

Let $\sigma_{i}=A\to G_{i+1}$ and $G^{\prime}_{i}:=rhs(\nu_{1,\sigma_{i},f_{i},g_{i}})$ . We prove by the reverse induction on $p=m,\dots,1$ that then $A\underset{HGr_{1}}{\Rightarrow}G^{\prime}_{p}$ .

Basis. Since $\sigma_{m}=A\to G=\gamma$ is not a recursive $A$ -production, we added $\nu_{1,\sigma_{m},f_{m},g_{m}}$ in our grammar. This completes the basis case.

Step. We assume by the induction hypothesis that $A\underset{HGr_{1}}{\Rightarrow}G^{\prime}_{p}$ and our aim is to prove the same for $p-1$ . $G^{\prime}_{p}$ contains an edge labeled by $(A,f_{p},g_{p})$ . Then a type III rule is applied to this edge: $\nu_{3,\rho_{p},f_{p},g_{p},f_{p-1},g_{p-1}}$ . The design of such rules allows us to perform exactly the substitution of $G_{p+1}$ into $R_{p}$ .

(a)

(b)

Figure 5: Illustration of the proof of Lemma 2. External nodes are not depicted here.

Therefore, $A\underset{HGr_{1}}{\Rightarrow}G^{\prime}_{1}$ . Note that the right-hand side of this derivation contains an edge labeled by $(A,f_{1},g_{1})$ . Since $G_{1}=H$ , we see that $g_{1}$ is the identity function $id$ . Thus we can apply the type II rule of the form $\nu_{2,\rho_{1}}$ whose left-hand side equals $(A,f_{1},id)=(A,f_{1},g_{1})$ . This rule is also designed in such a way that its application is equivalent to substitution of $G_{2}$ into $R_{1}$ . This means that $A\underset{HGr_{1}}{\Rightarrow}G^{\prime}_{1}\Rightarrow R_{1}[G_{2}/\delta(\rho_{1})]=G_{1}$ , as required.

Now, if there is an arbitrary derivation where the rule $\rho_{i}$ is applied ( $1\leq i\leq K$ ), then we find the largest $\delta$ -derivation within the derivation that includes this occurence $\rho_{i}$ and remodel it as described above. Thus, the above reasonings complete the proof for one of the inclusions.

To prove the reverse inclusion, it suffices to show how to transform a branch of a derivation containing nonterminals of the form $(A,f,g)$ into a derivation in $HGr$ . Let

\begin{split}A\underset{\nu_{1,\gamma,f_{m},g_{m}}}{\Rightarrow}G_{m+1}\underset{\nu_{3,\rho_{m},f_{m},g_{m},f_{m-1},g_{m-1}}}{\Rightarrow}G_{m}\underset{\nu_{3,\rho_{m-1},f_{m-1},g_{m-1},f_{m-2},g_{m-2}}}{\Rightarrow}\dots\underset{\nu_{3,\rho_{2},f_{2},g_{2},f_{1},g_{1}}}{\Rightarrow}G_{2}\underset{\nu_{2,\rho_{m}}}{\Rightarrow}G_{1}\end{split}

be such a branch (it has to be of a similar form). Then we can consider a derivation in $HGr$ of the form $A\underset{\rho_{1}}{\Rightarrow}H_{1}\underset{\rho_{2}}{\Rightarrow}\dots\underset{\rho_{m}}{\Rightarrow}H_{m}\underset{\gamma}{\Rightarrow}H$ ; the first part of the proof of this theorem allows us to conclude that $H=G_{1}$ . Note that functions $f_{i}$ and $g_{i}$ in the first derivation can be arbitrary (they only have to satisfy the condition of type III rules); however, it is not hard to show that they are the same as the functions $f_{i}$ and $g_{i}$ that are built on the basis of the second derivation.

This finishes the proof. ∎

There are two important observations regarding the procedure and the lemma above:

1.

After such a procedure there are no recursive $A$ -productions;
2.

For each nonterminal $(A,f,g)$ there is no rule $\pi$ such that $\mu(\pi)=(A,f,g)$ .

Note that $HGr_{1}$ does not preserve all the properties that $HGr$ had. Namely, rules of Type II can contain only one edge. However, our further actions do not affect type II and type III rules so this does not matter. Besides, useless symbols can appear, which also does not bother us (we deleted them only to perform Transformations 2 and 3). This finishes the second step.

Example 7.

Let a grammar contain two $A$ -productions: the first one is recursive and the second one is not recursive. They are depicted below; $\delta$ of each production is represented by the red color.

In this grammar the following $\delta$ -derivation is possible:

In the new grammar this derivation turns into the following one:
$A\Rightarrow$ $\Rightarrow$ $\Rightarrow$
$\Rightarrow$ $\Rightarrow G$
Here $f_{3}(1)=2,f_{3}(2)=3;f_{1}=f_{2}=f_{3}=g_{2};g_{3}(1)=3;g_{1}=id$ are all the involved partial functions; it is easy to check that they actually satisfy conditions from definitions. Note that if $n$ is the total number of derivation steps, then $f_{i}$ specifies a relation between external nodes on the $(n-i)$ -th and on the $(n-i+1)$ -th steps and it is defined by a corresponding production in the old derivation. In this example, $f_{i}$ are all defined by the production $\rho$ , thus they are the same. $g_{i}$ defines a relation between external nodes on the $(n-i+1)$ -th step and external nodes at the end of a derivation (on the $n$ -th step). As expected, $g_{1}$ is the identity function since $n=4$ and $(n-1+1)=n$ , so $g_{1}$ specifies a relation between external nodes on the last and on the last step.

5.3 Eliminating Recursion and Convertion into the WGNF

The rest of the steps are similar to those in the string case. Let us start with $HGr$ ; let $N=\{A_{1},\dots,A_{m}\}$ . Then the following transformation is done.

Transformation 4 (eliminating recursion).

Input: an HRG $HGr$ as at the beginning of the proof (without useless symbols or edgeless or chain productions).
Output: an equivalent HRG $HGr^{\prime}$ with the same properties such that for each $\pi\in P$ with $lhs(\pi)=A_{i}$ either $\mu(\pi)=A_{j}$ and $i<j$ or $\mu(\pi)\in\Sigma$ .

Method.

1.

Set $i=1$ .
2.

Eliminate all the recursive $A_{i}$ -productions according to Lemma 2. Now it is argued that if $\pi$ belongs to the set of productions and $lhs(\pi)=A_{i}$ , then $\mu(\pi)=A_{k}$ for $k>i$ or $\mu(\pi)$ is terminal.
3.

If $i=m$ , then a desired grammar is obtained. Otherwise, set $i=i+1$ and $j=1$ .
4.

Let $A_{j}\to H_{1},\dots,A_{j}\to H_{h}$ be all the $A_{j}$ productions. Let $\pi=A_{i}\to G(e_{0}:A_{j})$ be a production with $\delta(\pi)=e_{0}$ . Then we can replace $\pi$ by the rules $A_{i}\to G[H_{1}/e_{0}],\dots,A_{i}\to G[H_{h}/e_{0}]$ without changing the generated language (see Lemma 1). For these rules we set $\delta(A_{i}\to G[H_{k}/e_{0}])=\delta(A_{j}\to H_{k}),\;k=1,\dots,h$ . It will now be the case (due to the previous steps of this transformation) that for each $A_{j}$ -production $\pi$ $\mu(\pi)$ equals either a terminal or $A_{k}$ for $k>j$ . Thus all the $A_{i}$ -productions will have this property after such replacement.
5.

If $j=i-1$ , go to step 2; otherwise set $j=j+1$ and go to step 4.

∎

Let $HGr^{\prime}=\langle N^{\prime},\Sigma,P^{\prime},S\rangle$ be a grammar that is obtained from $HGr$ after Transformation 4. Let $|N^{\prime}|=M$ (note that $M$ is really large). We already know that $HGr^{\prime}$ does not have recursions, i.e. there are no $\delta$ -derivations $A\underset{\pi_{1}}{\Rightarrow}H_{1}\underset{\pi_{2}}{\Rightarrow}\dots\underset{\pi_{k}}{\Rightarrow}H_{k}$ such that $\mu(\pi_{k})=A$ . Indeed, due to the second observation after Lemma 2: nonterminals of the form $(B,f,g)$ cannot participate in recursive $\delta$ -derivations since they do not appear as values of $\mu$ . For the remaining nonterminals this is a consequence of the result of Transformation 4. Thus we can put a linear order on $N^{\prime}$ in such a way that for $A\underset{\pi}{\Rightarrow}G(e_{0}:B)\in P^{\prime}$ , if $\delta(\pi)=e_{0}$ , then either $B$ is terminal or it is nonterminal and $A<B$ .

We define the final set of productions $\overline{P}$ as follows: $A\to G$ belongs to $\overline{P}$ if there is a $\delta$ -derivation $A\underset{\pi_{1}}{\Rightarrow}H_{1}\underset{\pi_{2}}{\Rightarrow}\dots\underset{\pi_{k}}{\Rightarrow}H_{k}=G$ of the length $1\leq k\leq M$ such that $\mu(\pi_{k})$ is terminal. Obviously, the last requirement yields that $G$ has at least one terminal edge.

Lemma 3.

$\overline{HGr}=\langle N^{\prime},\Sigma,\overline{P},S\rangle$ is equivalent to $HGr^{\prime}$ .

Proof.

Clearly, $L(\overline{HGr})\subseteq L({HGr}^{\prime})$ , because productions in $\overline{HGr}$ are composed of that in ${HGr}^{\prime}$ . The reverse inclusion is proved by induction on the length $n$ of derivation in $HGr^{\prime}$ in more general form: $\langle N^{\prime},\Sigma,\overline{P},X\rangle$ is equivalent to $\langle N^{\prime},\Sigma,P^{\prime},X\rangle$ for each $X\in N^{\prime}$ .
Basis. If $X\to G\in P^{\prime}$ then it is also in $\overline{P}$ since $G$ is terminal.
Step. Let $X\to H(e_{0}:Y)\overset{n-1}{\Rightarrow}G$ (where $e_{0}=\delta(X\to H(e_{0}:Y))$ ) be a derivation of a terminal graph $G$ . This derivation contains a $\delta$ -derivation $X\underset{\pi_{1}}{\Rightarrow}H_{1}=H(e_{0}:Y)\underset{\pi_{2}}{\Rightarrow}H_{2}\underset{\pi_{3}}{\Rightarrow}H_{3}\dots\underset{\pi_{k}}{\Rightarrow}H_{k}$ and let $\mu(\pi_{i})=X_{i}$ (we also set $X_{0}=X$ ). Then $X_{0}<X_{1}<\dots<X_{k}$ , $X_{k}$ is terminal and all $X_{i}$ are different; thus $k\leq M$ and $X\to H_{k}\in\overline{P}$ . Then the source derivation is rebuilt as follows: we apply $X\to H_{k}$ to $X$ and then do the same with nonterminal edges in $H_{k}$ using the induction hypothesis. This completes the proof. ∎

Thus $L(\overline{HGr})=L$ . Now we have to overcome the following problem: the weak Greibach normal form requires right-hand sides of productions to have exactly one terminal edge while we can guarantee for $\overline{HGr}$ that there is at least one terminal edge. This problem is easily fixed: for each $a\in\Sigma$ we introduce a new nonterminal symbol $T_{a}$ and we add rules of the form $T_{a}\to\circledcirc(a)$ . Now, if $A\to G\in\overline{P}$ has more than one terminal symbol in the right-hand side, then we choose one of them and replace the other ones by corresponding nonterminals: $a:=T_{a}$ . After this step the modified grammar $\overline{HGr}$ still generates $L$ , and it is in the weak Greibach normal form.

6 Algorithmic Complexity

In the string case polynomial-time algorithms of convertion into the Greibach normal form are known; one is interested whether they exist in the hypergraph case.

It can be seen that the proof we have presented is horrible from the point of view of algorithmic complexity. The main problem is in the second step: the amount of partial bijections from the set $[1,t]$ to itself is equal to $(C_{t}^{0})^{2}0!+(C_{t}^{1})^{2}1!+(C_{t}^{2})^{2}2!+\dots+(C_{t}^{t})^{2}t!$ , which grows even faster then any exponential function. It is less obvious but another “heavy” part of the algorithm is the first step — eliminating chain productions. One would hope that presence of these slow parts is the problem of our proof but this is not the case.

Example 8.

Consider a grammar $HGr_{3}=\langle\{S\},\{a\},P_{3},S\rangle$ where $P$ has 3 productions ( $type(S)=n$ , $type(a)=n$ ):

1.

$S\to\langle\{v_{1},\dots,v_{n}\},\{e_{0}\},att,lab,v_{1}v_{2}\dots v_{n}\rangle$ , $lab(e_{0})=S$ , $att(e_{0})=v_{2}v_{1}v_{3}v_{4}\dots v_{n}$ ;
2.

$S\to\langle\{v_{1},\dots,v_{n}\},\{e_{0}\},att,lab,v_{1}v_{2}\dots v_{n}\rangle$ , $lab(e_{0})=S$ , $att(e_{0})=v_{2}v_{3}v_{4}\dots v_{n}v_{1}$ ;
3.

$S\to\circledcirc(a)$ .

Obviously, this grammar has a size $O(n)$ . It generates graphs with one $a$ -labeled edge such that attachment nodes of this edge are obtained from external nodes of the graph by a permutation composed of the transposition $(12)$ and of the cycle $(12\dots n)$ . Since these permutations are generators of $S_{n}$ , the language $L(HGr_{3})$ contains $n!$ hypergraphs. All of them have exactly one hyperedge, so for all $H\in L(HGr_{3})$ a production $S\to H$ has to belong to a grammar in the WGNF; therefore, even the number of productions grows faster than any exponent apart from the algorithm involved.

This yields the following

Proposition 1.

There is no exponential-time algorithm that takes an HRG generating an $ibHCFL$ and returns an equivalent HRG in the WGNF.

However, if we consider only HRGs of order $r$ for some fixed $r$ (i.e. in which types of symbols are not greater than $r$ ), then the above example ceases to be a problem as well as the first and the second steps in our proof since the number of partial bijections from $[1;r]$ to itself (and, in particular, the size of $S_{r}$ ) is constant when $r$ is fixed.

7 Conclusion

Our desire to introduce the weak Greibach normal form appeared due to another research about hypergraph basic categorial grammars (we presented a paper [11] about them at ICGT-2020); in that work we introduce a new approach to describing graph languages and prove that each grammar in this new formalism can be transformed into an equivalent HRG in the WGNF and vice versa. Theorems 1 and 2 show that ibHCFL is exactly the class of languages generated by grammars in the weak Greibach normal form; thus we also answer a question about the recognizing power of hypergraph basic categorial grammars — they also generate exactly ibHCFL. Besides, we suppose that the issue of the WGNF is a fundamental theoretical question, and we have presented an answer in line with those in [5, 9]. Note that the result gives a natural grammar characterization of the language class ibHCFL.
Acknowledgments. I thank my scientific advisor prof. Mati Pentus and anonymous reviewers for their valuable advices.

References

[1]
[2] Alfred V. Aho & Jeffrey D. Ullman (1972): The theory of parsing, translation, and compiling. 1: Parsing. Prentice-Hall. Available at https://www.worldcat.org/oclc/310805937.
[3] Y. Bar-Hillel, C. Caifman & E. Shamir (1960): On Categorial and Phrase-structure Grammars. Weizmann Science Press.
[4] Frank Drewes, Hans-Jörg Kreowski & Annegret Habel (1997): Hyperedge Replacement, Graph Grammars. In Grzegorz Rozenberg, editor: Handbook of Graph Grammars and Computing by Graph Transformations, Volume 1: Foundations, World Scientific, pp. 95–162, 10.1142/9789812384720_0002.
[5] Joost Engelfriet (1992): A Greibach Normal Form for Context-free Graph Grammars. In Werner Kuich, editor: Automata, Languages and Programming, 19th International Colloquium, ICALP92, Vienna, Austria, July 13-17, 1992, Proceedings, Lecture Notes in Computer Science 623, Springer, pp. 138–149, 10.1007/3-540-55719-9_70.
[6] Joost Engelfriet, George Leih & Emo Welzl (1990): Boundary Graph Grammars with Dynamic Edge Relabeling. J. Comput. Syst. Sci. 40(3), pp. 307–345, 10.1016/0022-0000(90)90002-3.
[7] Joost Engelfriet & Grzegorz Rozenberg (1990): A Comparison of Boundary Graph Grammars and Context-Free Hypergraph Grammars. Inf. Comput. 84(2), pp. 163–206, 10.1016/0890-5401(90)90038-J.
[8] Annegret Habel (1992): Hyperedge Replacement: Grammars and Languages. Lecture Notes in Computer Science 643, Springer, 10.1007/BFb0013875.
[9] Christina Jansen, Jonathan Heinen, Joost-Pieter Katoen & Thomas Noll (2011): A Local Greibach Normal Form for Hyperedge Replacement Grammars. In Adrian-Horia Dediu, Shunsuke Inenaga & Carlos Martín-Vide, editors: Language and Automata Theory and Applications - 5th International Conference, LATA 2011, Tarragona, Spain, May 26-31, 2011. Proceedings, Lecture Notes in Computer Science 6638, Springer, pp. 323–335, 10.1007/978-3-642-21254-3_25.
[10] Bevan K. Jones, Jacob Andreas, Daniel Bauer, Karl Moritz Hermann & Kevin Knight (2012): Semantics-Based Machine Translation with Hyperedge Replacement Grammars. In Martin Kay & Christian Boitet, editors: COLING 2012, 24th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, 8-15 December 2012, Mumbai, India, Indian Institute of Technology Bombay, pp. 1359–1376. Available at https://www.aclweb.org/anthology/C12-1083/.
[11] Tikhon Pshenitsyn (2020): Hypergraph Basic Categorial Grammars. In Fabio Gadducci & Timo Kehrer, editors: Graph Transformation - 13th International Conference, ICGT 2020, Held as Part of STAF 2020, Online, June 25-26, 2020, Proceedings, Lecture Notes in Computer Science 12150, Springer, pp. 146–162, 10.1007/978-3-030-51372-6_9.
[12] Grzegorz Rozenberg, editor (1997): Handbook of Graph Grammars and Computing by Graph Transformations, Volume 1: Foundations. World Scientific, 10.1142/3303.