This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Weak Greibach Normal Form for Hyperedge Replacement Grammars

Tikhon Pshenitsyn Department of Mathematical Logic and Theory of Algorithms
Faculty of Mathematics and Mechanics
Lomonosov Moscow State University
GSP-1, Leninskie Gory, Moscow, 119991, Russian FederationThe study was funded by RFBR, project number 20-01-00670. [email protected]
Abstract

It is known that hyperedge replacement grammars are similar to string context-free grammars in the sense of definitions and properties. Therefore, we expect that there is a generalization of the well-known Greibach normal form from string grammars to hypergraph grammars. Such generalized normal forms are presented in several papers; however, they do not cover a large class of hypergraph languages (e.g. languages consisting of star graphs). In this paper, we introduce a weak Greibach normal form, whose definition corresponds to the lexicalized normal form for string grammars, and prove that every context-free hypergraph language (with nonsubstantial exceptions) can be generated by a grammar in this normal form. The proof presented in this paper generalizes a corresponding one for string grammars with a few more technicalities.

1 Introduction

Extensions of formal grammars and languages from strings to graphs are considered in a wide variety of works. The resulting formalisms called graph grammars generate graph languages by means of productions: each production is of the form AGA\to G; it allows one to replace a part of a graph labeled by AA with the graph GG if certain conditions are satisfied. An overview on graph grammars can be found in “Handbook on Graph Grammars and Computing by Graph Transformation” [12].

In this paper, we focus on a particular approach called hyperedge replacement grammar (HRG). An overview on HRG can be found in the book [8] or in a chapter of the handbook mentioned above [4]. In brief, a hyperedge replacement grammar contains productions that allow one to replace an edge with a certain label by a hypergraph; the rest is similar to string context-free grammar definitions. HRGs appeared in the seventies, and they became popular from theoretical and practical points of view; particularly, they can be used in machine translation or programming (see, for instance, [10]). Many structural properties of HRGs have been studied. Nicely, many theorems regarding context-free string grammars (CFGs) can be extended to HRGs in a natural way; besides, it is often the case that proofs for CFGs can be generalized to HRGs so there is no need to invent anything new.

For context-free grammars, there is a so-called Greibach normal form, which is helpful in a number of investigations, e.g. to connect CFGs with categorial grammars (see [3]). Each production in a grammar in the Greibach normal form has to begin with a terminal symbol and proceed with at most two nonterminal ones. This definition can be weakened: one says that a grammar is in the weak Greibach normal form (it is often called lexicalized) if there is exactly one terminal symbol in the right-hand side of each production. Obviously, the same definition can be introduced for graphs. Namely, an HRG is said to be in the weak Greibach normal form if for each production AHA\to H the hypergraph HH has exactly one edge labeled by a terminal symbol.

The main objective of this paper is to present a precise class of graph languages that can be generated by HRGs in the weak Greibach normal form. In order to do this we generalize the proof of existence of the Greibach normal form for context-free grammars taken from [2] in a straightforward way. However, several differences exist. Firstly, there is a nonsubstantial class of hypergraph context-free languages that cannot be generated by grammars in the weak Greibach normal form (a problem arises because of isolated nodes). Secondly, there is a trick in the proof for CFGs that exploits the string nature of grammars, so it is hard to generalize it to hypergraphs. This led us to a large and technically heavy proof. Finally, there are bad news regarding algorithmical complexity: conversion of an HRG into an equivalent one in the weak Greibach normal form cannot be done even in exponential time.

In Sect. 2 we compare our contribution with other studies related to the Greibach normal form for graph grammars. In Sect. 3 formal definitions related to hypergraphs and HRGs are presented. In Sect. 4 the weak Greibach normal form is introduced, related issues are discussed. Sect. 5 is devoted to the main theorem. In Sect. 6 we discuss algorithmic complexity of the normalization procedure. In Sect. 7 we conclude.

2 Related Work

There are several works devoted to the Greibach normal form for graph grammars. The paper of Joost Engelfriet [5] establishes that HRGs that produce languages of bounded degree are equivalent to apex HRGs; this result generalizes the double Greibach normal form for CFGs. To recall, a language LL is of bounded degree if for some MM all nodes of all hypergraphs in LL have degree not exceeding MM. Obviously, there are substantive examples of languages of unbounded degree: for instance, the language of star graphs, which have one node and arbitrarily many edges outgoing from it. Besides, the property of being apex is stronger than the weak Greibach normal form we are interested in.

In the paper of Christina Jansen et al. [9] the local Greibach normal form is presented. The authors prove that data structure grammars (it is a specific kind of grammars that generate so-called heap configurations) can be transformed into grammars in the local Greibach normal form; after the proof the authors point out that the normalization can be generalized to HRGs of bounded degree. However, the authors also note that their procedure being algorithmically efficient cannot be generalized to all HRGs.

The weak Greibach normal form in the sense we are interested in is introduced for another type of graph grammars. Namely, in [6] it is proved that each B-eNCE grammar (here we do not consider definitions regarding this formalism) is equivalent to a B-eNCE grammar in the weak Greibach normal form. In [7] it is shown that B-edNCE grammars and HRGs have the same recognizing power in some sense (namely, since B-edNCE grammars produce usual graphs with labeled nodes and edges while HRGs produce hypergraphs with labeled edges, the result is established w.r.t. two translations, from graphs to hypergraphs and vice versa). However, it seems to be impossible that these two results can be combined in order to obtain the normal form for HRGs.

Therefore, to our best knowledge, the question of whether each HRG is equivalent to an HRG in the weak Greibach normal form has hitherto remained open. We answer it in this paper.

3 Preliminaries

This section is concerned with definitions related to hypergraphs. All of them are taken from [4]. Note that we use a slightly different notation from that in the handbook mentioned above.

3.1 Hypergraphs, Sub-hypergraphs

\mathbb{N} includes 0. The set of integers from 11 to nn is denoted by [1,n][1,n].

It is convenient for us to use the following notation: if {i1,,ik}\{i_{1},\dots,i_{k}\} is an indexed set of integers such that for m<nim<inm<n\;i_{m}<i_{n} holds, then it is called index-ordered and the set is denoted as {i1,,ik}IO\{i_{1},\dots,i_{k}\}_{IO}.
The set Σ\Sigma^{*} is the set of all strings over the alphabet Σ\Sigma including the empty string ε\varepsilon. The length |w||w| of the word ww is the number of positions in ww. The ii-th symbol in ww is denote by w(i)w(i) (1i|w|1\leq i\leq|w|). Σ+\Sigma^{+} denotes the set of all nonempty strings. The set Σ\Sigma^{\circledast} is the set of all strings consisting of distinct symbols. The set of all symbols contained in the word ww is denoted by [w][w]. If f:ΣΔf:\Sigma\to\Delta is a function from one set to another, then it is naturally extended as a function f:ΣΔf:\Sigma^{*}\to\Delta^{*} (f(σ1σk)=f(σ1)f(σk)f(\sigma_{1}\dots\sigma_{k})=f(\sigma_{1})\dots f(\sigma_{k})).

Let CC be some fixed set of labels, for which the function type:Ctype:C\to\mathbb{N} is considered.

Definition 1.

A hypergraph over CC is a tuple G=V,E,att,lab,extG=\langle V,E,att,lab,ext\rangle where VV is the set of nodes, EE is the set of hyperedges, att:EVatt:E\to V^{\circledast} assigns an ordered set of attachment nodes to each edge, lab:EClab:E\to C labels each hyperedge by some element of CC in such a way that type(lab(e))=|att(e)|type(lab(e))=|att(e)| whenever eEe\in E, and extVext\in V^{\circledast} is an ordered set of external nodes.

Components of a hypergraph GG are denoted by VG,EG,attG,labG,extGV_{G},E_{G},att_{G},lab_{G},ext_{G}.

In the remainder of the paper, hypergraphs are simply called graphs, and hyperedges are simply called edges. The set of all graphs with labels from CC is denoted by (C)\mathcal{H}(C). In this work, figures contain graph drawings and sketches. In drawings, nodes are depicted by black dots, edges are denotes as labeled boxes, attatt is represented with numbered lines (called “tentacles”), external nodes are depicted by numbers in brackets. If an edge has type 2, it is depicted by an arrow. If we are not interested in a specific form of a graph but we want to have a closer look at some its part, we depict the whole graph as an area but draw its part of interest in detail.

Example 1.

This is a graph:

(1)(1)ss(3)(3)(2)(2)pppp123

If GG is a graph, and eEGe\in E_{G} is labeled by aa, then GG can be denoted by G(e:a)G(e:a).

Definition 2.

The function typetype (or typeGtype_{G} to be exact) returns the number of nodes attached to some edge in a graph GG: typeG(e):=|attG(e)|type_{G}(e):=|att_{G}(e)|. If GG is a graph, then type(G):=|extG|type(G):=|ext_{G}|.

Definition 3.

A sub-hypergraph (or just subgraph) HH of a hypergraph GG is a hypergraph such that VHVGV_{H}\subseteq V_{G}, EHEGE_{H}\subseteq E_{G}, and for all eEHe\in E_{H} attH(e)=attG(e)att_{H}(e)=att_{G}(e), labH(e)=labG(e)lab_{H}(e)=lab_{G}(e).

Definition 4.

If H={vi}i=1n,{e0},att,lab,v1vnH=\langle\{v_{i}\}_{i=1}^{n},\{e_{0}\},att,lab,v_{1}\dots v_{n}\rangle, att(e0)=v1vnatt(e_{0})=v_{1}\dots v_{n} and lab(e0)=alab(e_{0})=a, then HH is called a handle. It is denoted by (a)\circledcirc(a).

Definition 5.

An isomorphism between graphs GG and HH is a pair of bijective functions :EGEH\mathcal{E}:E_{G}\to E_{H}, 𝒱:VGVH\mathcal{V}:V_{G}\to V_{H} such that attH=𝒱attGatt_{H}\circ\mathcal{E}=\mathcal{V}\circ att_{G}, labG=labHlab_{G}=lab_{H}\circ\mathcal{E}, 𝒱(extG)=extH\mathcal{V}(ext_{G})=ext_{H}. In this work, we do not distinguish between isomorphic graphs.

3.2 Replacement

This procedure is defined in [4]. In short, the replacement of an edge e0e_{0} in GG with a graph HH can be done if type(e0)=type(H)type(e_{0})=type(H) as follows:

  1. 1.

    Remove e0e_{0};

  2. 2.

    Insert an isomorphic copy of HH (namely, HH and GG have to consist of disjoint sets of nodes and edges);

  3. 3.

    For each ii, fuse the ii-th external node of HH with the ii-th attachement node of e0e_{0}.

To be more precise, the set of edges in the resulting graph is (EG{e0})EH(E_{G}\setminus\{e_{0}\})\cup E_{H}, and the set of nodes is VG(VHextH)V_{G}\cup(V_{H}\setminus ext_{H}). The result is denoted by G[H/e0]G[H/e_{0}].

3.3 Hyperedge Replacement Grammars

Definition 6.

A hyperedge replacement grammar (HRG) is a tuple HGr=N,Σ,P,SHGr=\langle N,\Sigma,P,S\rangle, where NN is a finite alphabet of nonterminal symbols, Σ\Sigma is a finite alphabet of terminal symbols (NΣ=N\cap\Sigma=\emptyset), PP is a set of productions, and SNS\in N. Each production is of the form AHA\to H where ANA\in N, H(NΣ)H\in\mathcal{H}(N\cup\Sigma) and type(A)=type(H)type(A)=type(H). For π=AH\pi=A\to H we denote lhs(π)=A,rhs(π)=Hlhs(\pi)=A,rhs(\pi)=H.

Edges labeled by terminal (nonterminal) symbols are called terminal edges (nonterminal edges resp.). If a graph contains terminal edges only, it is called terminal.

If GG is a graph, e0EGe_{0}\in E_{G}, lab(e0)=Alab(e_{0})=A and π=AHP\pi=A\to H\in P, then GG directly derives G[H/e0]G[H/e_{0}] (we write GG[H/e0]G\Rightarrow G[H/e_{0}] or G𝜋G[H/e0]G\underset{\pi}{\Rightarrow}G[H/e_{0}]). The transitive reflexive closure of \Rightarrow is denoted by \overset{\ast}{\Rightarrow}. If GHG\overset{\ast}{\Rightarrow}H, then GG is said to derive HH. The corresponding sequence of production applications is called a derivation.

Definition 7.

The language generated by an HRG HGr=N,Σ,P,SHGr=\langle N,\Sigma,P,S\rangle is the set of graphs H(Σ)H\in\mathcal{H}(\Sigma) such that (S)H\circledcirc(S)\overset{\ast}{\Rightarrow}H. It is denoted by L(HGr)L(HGr).

A graph language LL is said to be a hypergraph context-free language (HCFL) if it is generated by some HRG.

Two grammars are said to be equivalent if they generate the same language.

Further we simply write AGA\overset{\ast}{\Rightarrow}G instead of (A)G\circledcirc(A)\overset{\ast}{\Rightarrow}G.

4 Weak Greibach Normal Form

In this section we introduce the formal definition of the normal form we are interested in, and establish a simple property of languages generated by grammars in this normal form.

Definition 8.

An HRG HGrHGr is in the weak Greibach normal form (WGNF) if there is exactly one terminal edge in the right-hand side of each production. Formally, (XH)PHGr\forall(X\to H)\in P_{HGr} !e0EH:labH(e0)ΣHGr\exists!e_{0}\in E_{H}:lab_{H}(e_{0})\in\Sigma_{HGr}.

Example 2.

The grammar HGr1={S},{a},P1,SHGr_{1}=\langle\{S\},\{a\},P_{1},S\rangle is in the WGNF where P1P_{1} contains the following rules (type(S)=type(a)=1type(S)=type(a)=1):

SS\to(1)(1)SS1aa1SS\to(1)(1)aa1

This grammar generates the language of star 1-edged aa-labeled hypergraphs, which is unbounded.

Remark 1.

If N,Σ,P,S\langle N,\Sigma,P,S\rangle is in the WGNF and X𝑘GX\overset{k}{\Rightarrow}G for some XNX\in N, then GG has exactly kk terminal edges (each production adds exactly one terminal edge).

Note that not each hypergraph context-free language can be generated by some HRG in the WGNF. This follows from

Example 3.

Consider the HRG HGr2={T},{b},P2,THGr_{2}=\langle\{T\},\{b\},P_{2},T\rangle where P2P_{2} contains the following rules
(type(T)=type(b)=0type(T)=type(b)=0):

TT\toTTTT\tobb

The first production just adds an isolated node; thus this grammar produces graphs that have exactly one edge labeled by bb and arbitrarily many isolated nodes. If there is an equivalent HGr=N,{b},P,SHGr^{\prime}=\langle N,\{b\},P^{\prime},S^{\prime}\rangle in the WGNF, then each right-hand side of each production in PP^{\prime} contains exactly one terminal edge. Note that if S𝑘G,G({b})S^{\prime}\overset{k}{\Rightarrow}G,G\in\mathcal{H}(\{b\}) for some kk in HGrHGr^{\prime}, then GG has kk terminal edges (Remark 1); hence kk has to be equal to 11 and therefore SGPS^{\prime}\to G\in P^{\prime}. However, there are infinitely many graphs in L(HGr)L(HGr) while |P|<|P^{\prime}|<\infty. ∎

It is seen that the reason of this problem is in isolated nodes. In the string case if a language contains the empty word, then it cannot be generated by a grammar in the WGNF due to obvious reasons: each production adds at least one terminal symbol so it is impossible to produce the empty word. In the case of graphs, we also have to prohibit somehow undesired languages with “too many” isolated nodes occuring in graphs.

Below we describe the precise class of languages generated by grammars in the weak Greibach normal form. We denote by esize(G)esize(G) the number of edges in GG, and by isize(G)isize(G) the number of isolated nodes in GG.

Definition 9.

An HCFL LL is said to be isolated-node bounded (ibHCFL) if there is a constant MM such that, for each GLG\in L, isize(G)<Mesize(G)isize(G)<M*esize(G). An HRG HGrHGr is called isolated-node bounded if it generates an ibHCFL.

Theorem 1.

Each HRG HGr=N,Σ,P,SHGr=\langle N,\Sigma,P,S\rangle in the WGNF generates an ibHCFL.

Proof.

Let M=maxAHP{isize(H)}+1M=\max_{A\to H\in P}\{isize(H)\}+1. Define esizeT(G)esize_{T}(G) as the number of terminal edges in GG. We prove by induction on kk that for each graph GG such that X𝑘G,XNX\overset{k}{\Rightarrow}G,X\in N isize(G)<MesizeT(G)isize(G)<M*esize_{T}(G). Remark 1 implies that esizeT(G)=kesize_{T}(G)=k.

Basis. If k=0k=0, then G=(X)G=\circledcirc(X), and the statement is trivial (0<M0<M).

Step. Let Xk1HGX\overset{k-1}{\Rightarrow}H\Rightarrow G. By the induction hypothesis, isize(H)<MesizeT(H)isize(H)<M*esize_{T}(H). The number of isolated nodes appeared at the last step does not exceed MM by the definition of MM. Then isize(G)<MesizeT(H)+M=MesizeT(G)isize(G)<M*esize_{T}(H)+M=M*esize_{T}(G).

Note finally that esizeT(G)esize(G)esize_{T}(G)\leq esize(G). ∎

The other direction of this statement is of central interest in this paper. The next section is devoted to it.

5 Transformation of HRGs Generating IBHCFLs into HRGs in the Weak Greibach Normal Form

The main theorem we are going to prove is the following:

Theorem 2.

Each ibHCFL LL can be generated by an HRG in the WGNF.

In other words, this theorem states that each context-free hypergraph language satisfying the isolated-node boundedness property can be generated by a grammar in the weak Greibach normal form.

Structurally, the proof of this theorem is based on the corresponding one for string context-free grammars. Let us recall the mains steps of the latter in brief (see details in [2]).

The input of the algorithm is a context-free grammar CFGCFG that does not generate the empty word (this property is related to isolated-node boundedness). The desired output is an equivalent grammar in the weak Greibach normal form (i.e. in which each production is of the form AaA1AnA\to aA_{1}\dots A_{n} where aa is terminal while A1,,AnA_{1},\dots,A_{n} are nonterminal).

  1. 1.

    Useless rules and symbols, ε\varepsilon-rules (i.e. rules of the form AεA\to\varepsilon) and chain rules (i.e. rules of the form ABA\to B for A,BA,B being nonterminal) are eliminated.

  2. 2.

    It is shown how to eliminate recursive AA-productions, i.e. productions of the form AAαA\to A\alpha for some fixed nonterminal symbol AA. The trick is to move AA from the left side of AαA\alpha to the right side. It is done as follows: if AAα1||Aαm|β1||βpA\to A\alpha_{1}|\dots|A\alpha_{m}|\beta_{1}|\dots|\beta_{p} are all the AA-productions (here || means enumeration of productions), then they are replaced by the productions Aβ1A||βpA|β1||βpA\to\beta_{1}A^{\prime}|\dots|\beta_{p}A^{\prime}|\beta_{1}|\dots|\beta_{p} and Aα1||αm|α1A||αmAA^{\prime}\to\alpha_{1}|\dots|\alpha_{m}|\alpha_{1}A^{\prime}|\dots|\alpha_{m}A^{\prime}.

  3. 3.

    Left recursion is completely eliminated: nonterminals are numbered as A1,,AnA_{1},\dots,A_{n} and then a procedure involving application of step 2 is done such that its result is a grammar where each production is either of the form AiAjαA_{i}\to A_{j}\alpha for i<ji<j or of the form AibαA_{i}\to b\alpha for bb being terminal.

  4. 4.

    By taking compositions of the above productions and adding new nonterminal symbols the grammar is normalized. The resulting grammar includes rules of the form AbαA\to b\alpha (bb is terminal, α\alpha is a string of nonterminals) only.

Our goal is to recreate this proof for HRGs. Note that steps 2-4 of the above plan actively exploit string nature of CFGs: transformations of productions are based on movements of symbols from the leftmost position to the rightmost one. Graphs in general do not have leftmost positions unlike strings, so we are going to distinguish arbitrary edges in graphs so they play the role of “the leftmost symbol”. It is done by means of δ\delta (see the proof below).

In the proof we use the following

Lemma 1.

If π=AG(e0:B)\pi=A\to G(e_{0}:B) is a production of a grammar HGr=N,Σ,P,SHGr=\langle N,\Sigma,P,S\rangle for A,BA,B being nonterminal, and BH1,,BHkB\to H_{1},\dots,B\to H_{k} are all the productions in HGrHGr with BB in the left-hand side, then replacing π\pi by productions AG[H1/e0],,AG[Hk/e0]A\to G[H_{1}/e_{0}],\dots,A\to G[H_{k}/e_{0}] does not change the language generated.

It directly follows from the well-known context-freeness lemma.

5.1 Eliminating Useless, Edgeless and Chain Productions

Our first objective is to eliminate useless productions, productions with no edges in the right-hand side and productions with one nonterminal edge only in the right-hand side. It appears that only this step requires isolated-node boundedness. In the below transformations we provide theoretical reasonings that they can be done irregarding their algorithmic realization; however, the latter can be done similarly to the string case (see [2]).

Definition 10.

A nonterminal symbol AA in a grammar HGrHGr is useless if there is no derivation of the form SH(e:A)GS\overset{\ast}{\Rightarrow}H(e:A)\overset{\ast}{\Rightarrow}G in this grammar where GG is terminal.

Transformation 1 (eliminating useless symbols).

Input: an HRG HGrHGr.
Output: an equivalent HRG HGrHGr^{\prime} without useless nonterminal symbols.

It suffices to remove all the useless symbols and all the productions containing them. This does not affect the language generated.

Definition 11.

A graph is called edgeless if it does not contain edges. A production AGA\to G is called edgeless if GG is edgeless.

Transformation 2 (eliminating edgeless productions).

Input: an HRG HGrHGr that generates an ibHCFL without useless symbols.
Output: an HRG HGrHGr^{\prime} without edgeless productions such that L(HGr)=L(HGr)L(HGr^{\prime})=L(HGr).

Method.
Let HGr=N,Σ,P,SHGr=\langle N,\Sigma,P,S\rangle. Let Null={(A;H)EH=,AN,AH}Null=\{(A;H)\mid E_{H}=\emptyset,\>A\in N,\>A\overset{\ast}{\Rightarrow}H\}. This set is finite, because otherwise there is a symbol A0A_{0} for which arbitrarily large edgeless graphs HH exist such that (A0;H)Null(A_{0};H)\in Null; this contradicts the fact that L(HGr)ibHCFLL(HGr)\in\mathrm{ibHCFL} (it is important that A0A_{0} is not useless).

Let BGPB\to G\in P and EG={e1,,en}E_{G}=\{e_{1},\dots,e_{n}\}. Let P1P_{1} contain the rules of the form BG[H1/e1][Hn/en]B\to G[H_{1}/e_{1}]\dots[H_{n}/e_{n}] where Hi=(lab(ei))H_{i}=\circledcirc(lab(e_{i})) for at least one ii (then replacement of eie_{i} by HiH_{i} changes nothing) and (lab(ei);Hi)(lab(e_{i});H_{i}) belongs to NullNull otherwise. It is argued that HGr=N,Σ,P1,SHGr^{\prime}=\langle N,\Sigma,P_{1},S\rangle does not have productions with edgeless right-hand sides (this follows from the construction) and that L(HGr)=L(HGr)L(HGr^{\prime})=L(HGr). ∎

Definition 12.

A production AGA\to G is called a chain production if EG={e}E_{G}=\{e\} and labG(e)lab_{G}(e) is nonterminal.

Example 4.

The first production in the grammar HGr2HGr_{2} is chain.

Transformation 3 (eliminating chain productions).

Input: an HRG HGrHGr that generates an ibHCFL without useless symbols.
Output: an HRG HGrHGr^{\prime} without chain productions such that L(HGr)=L(HGr)L(HGr^{\prime})=L(HGr).

Method.
Let HGr=N,Σ,P,SHGr=\langle N,\Sigma,P,S\rangle. Consider the set Chain={(A;H)EH={e0},labH(e0)=B,A,BN,AH}Chain=\{(A;H)\mid E_{H}=\{e_{0}\},lab_{H}(e_{0})=B,\>A,B\in N,\>A\overset{\ast}{\Rightarrow}H\}. Note that ChainChain is finite (again, otherwise one can derive a graph with arbitrarily many isolated nodes having a fixed number of edges). Let HGr=N,Σ,P′′,SHGr^{\prime}=\langle N,\Sigma,P^{\prime\prime},S\rangle where P=P{AH(A;H)Chain}P^{\prime}=P\setminus\{A\to H\mid(A;H)\in Chain\} and P′′=P{AGH:(A;H)Chain,HGP}P^{\prime\prime}=P^{\prime}\cup\{A\to G\mid\exists H:\>(A;H)\in Chain,H\to G\in P^{\prime}\}. Thus we removed all the productions having a graph with one edge in the right-hand side. It can be easily shown that L(HGr)=L(HGr)L(HGr^{\prime})=L(HGr). ∎

Remark 2.

If HGrHGr in Transformation 3 does not have edgeless productions, then HGrHGr^{\prime} does not have them either. Transformations 2 and 3 applied to a grammar without useless symbols transform them into a grammar without useless symbols too.

Example 5.

The grammar HGr2HGr_{2} from Example 3 cannot be turned into an equivalent one without edgeless and chain productions.

The above procedures complete the first step of normalization.

5.2 Defining and Eliminating Recursive Productions

Now we start proving Theorem 2. Let HGr=N,Σ,P,SHGr=\langle N,\Sigma,P,S\rangle be a grammar that generates LL. Applying Transformations 1, 2, 3, we can assume that HGrHGr does not have useless symbols, edgeless productions and chain productions. Thus, the right-hand side of each production has at least two edges or one terminal edge.

Let us carefully examine what happens in the string case at Step 2. Consider the following

Example 6.

Let AAc|Ad|BeA\to Ac|Ad|Be be all the AA-productions in a string context-free grammar. After Step 2 there are the following ones: ABeA|Be,Ac|d|cA|dAA\to BeA^{\prime}|Be,\;A^{\prime}\to c|d|cA^{\prime}|dA^{\prime}. Below we show how a derivation in the old grammar is remodeled in the new one:

Old: AAcAdcAddcBeddcA\Rightarrow Ac\Rightarrow Adc\Rightarrow Addc\Rightarrow Beddc.
New: ABeABedABeddABeddcA\Rightarrow BeA^{\prime}\Rightarrow BedA^{\prime}\Rightarrow BeddA^{\prime}\Rightarrow Beddc.

The underlying idea is to invert the derivation: in the new derivation one starts with BeBe and applies new productions in reverse order w.r.t. to the old derivation.

The same idea is used in the graph case. The difficulty is that we do not have the leftmost symbol in productions. However, it suffices to distinguish an arbitrary edge in the right-hand side of each production to play the role of the leftmost symbol. We do this by means of the function δ\delta.

Definition 13.

Let us fix an arbitrary function δ\delta which acts on PP in such a way that δ(AH)\delta(A\to H) belongs to EHE_{H}. We denote μ(π)=lab(δ(π))\mu(\pi)=lab(\delta(\pi)).

Now μ(π)\mu(\pi) plays the role of the “leftmost” symbol of a production π\pi. Then we define recursive productions as expected.

Definition 14.

A production π\pi is recursive if lhs(π)=μ(π)lhs(\pi)=\mu(\pi). A production π\pi is an AA-production if lhs(π)=Alhs(\pi)=A.

Definition 15.

A derivation AH1HkA\Rightarrow H_{1}\Rightarrow\dots\Rightarrow H_{k} is called a δ\delta-derivation if each of its productions is applied to the edge that is δ\delta of the previous production. Formally, if Aπ1H1π2πkHkA\underset{\pi_{1}}{\Rightarrow}H_{1}\underset{\pi_{2}}{\Rightarrow}\dots\underset{\pi_{k}}{\Rightarrow}H_{k}, then πi\pi_{i} has to be appied to δ(πi1)\delta(\pi_{i-1}) (in the subgraph that appears after the (i1)(i-1)-th step). We say that the final label of such a derivation is μ(πk)\mu(\pi_{k}). Note that μ(πi)\mu(\pi_{i}) has to be nonterminal for i<ki<k (but μ(πk)\mu(\pi_{k}) can be terminal).

Below we study how to eliminate all the recursive AA-productions for some fixed ANA\in N. In the string case the idea is quite simple: one moves AA from the beginning of the string to its end in order to invert a derivation process. Here we also exploit the idea of inverting the derivation but, of course, we have to take into account more complex and general graph structures.

Let Aπ1H1π2πkHkA\underset{\pi_{1}}{\Rightarrow}H_{1}\underset{\pi_{2}}{\Rightarrow}\dots\underset{\pi_{k}}{\Rightarrow}H_{k} be a δ\delta-derivation such that lhs(πi)=Alhs(\pi_{i})=A for i=1,,ki=1,\dots,k and μ(πk)A\mu(\pi_{k})\neq A (compare this with the old derivation in Example 6). We provide the following intuition behind the below construction. Imagine that one has a hole in his sweater (this metaphor is related to that in [5]), and he sews it up starting from edges (i.e. he applies π1,,πk1\pi_{1},\dots,\pi_{k-1}) and finishing by sewing on a patch in the center of the hole (i.e. he applies πk\pi_{k}). In the new derivation one starts with the patch (that is, the right-hand side of πk\pi_{k}), then he sews up right-hand sides of πk1,,π2\pi_{k-1},\dots,\pi_{2} in this order (thus the patch grows and becomes bigger) and finishing by connecting it with edges of the hole w.r.t. the production π1\pi_{1} (see also Fig. 5 which provides a sketch of this process).

The major problem of this idea is the following: if we want to invert a derivation such as on Figure 5(a) and to obtain a new one, such as one drawn on Figure 5(b), we have to carefully control external nodes. Namely, in the old derivation external nodes of HH are predefined by R1R_{1} while in the new derivation R1R_{1} appears at the very last step. In order to place external nodes correctly, we introduce complex nonterminal symbols that describe how external nodes of a graph on the current derivation step correspond to external nodes of a graph on the next and on the last step. This description is done by means of partial functions.

Now we proceed with formal realization of this idea. Let t=type(A)t=type(A). Let ρ1=AR1,,ρK=ARK\rho_{1}=A\to R_{1},\dots,\rho_{K}=A\to R_{K} be all the recursive AA-productions in PP and let γ1=AG1,,γL=AGL\gamma_{1}=A\to G_{1},\dots,\gamma_{L}=A\to G_{L} be the remaining AA-productions in PP. Our goal is to construct a grammar equivalent to HGrHGr without recursive AA-productions. Firstly, we simply remove them: P1:=P{ρ1,,ρK}P_{1}:=P\setminus\{\rho_{1},\dots,\rho_{K}\}. In order to compensate for the lack of these rules we add new ones accordingly to the procedure below.

Definition 16.

f:XYf:X\to Y is a partial function if ff is a function on some subset XX^{\prime} of XX. We denote the domain XX^{\prime} of ff by Dom(f)Dom(f), and the range of ff by Ran(f)Ran(f).

Definition 17.

f:XYf:X\to Y is a partial bijection if ff is a partial function such that f|Dom(f)f|_{Dom(f)} is a bijection.

Definition 18.

If f:XYf:X\to Y, g:YZg:Y\to Z are partial functions, then gf:XZg\circ f:X\to Z is a partial function defined on Dom(f)f1(Dom(g))Dom(f)\cap f^{-1}(Dom(g)) that acts on this set as a usual composition.

AA\toGG(1)(1)(t)(t)
(a) The production γ\gamma.
AA\toRRAA(1)(1)(t)(t)12tt
(b) The production ρ\rho.
Figure 1: The productions γ\gamma and ρ\rho.

Let γ=γi\gamma=\gamma_{i} for some ii and let ρ=ρj\rho=\rho_{j} for some jj. We define e0:=δ(ρ)e_{0}:=\delta(\rho), e~:=δ(γ)\widetilde{e}:=\delta(\gamma), G:=rhs(γ)G:=rhs(\gamma) and R:=rhs(ρ)R:=rhs(\rho). Illustrations of these productions are presented in Fig. 1. Firstly, we extend NN by new nonterminal symbols of the form (A,f,g)(A,f,g) where f,g:[1,t][1,t]f,g:[1,t]\to[1,t] are two partial bijections; the resulting set is denoted by NN^{\prime}. Then we add to P1P_{1} rules of the following three forms:
Type I. Recall the metaphor about a sweater and a patch. If one imagines an AA-labeled edge at the beginning of a derivation as a hole in a sweater, then productions of type I are designed to add a patch GG at the very beginning of sewing up (= of a derivation). Partial functions are used to describe what external nodes of GG (edges of a patch) are equal to attachment nodes of the AA-labelled edge (edges of a hole).
Let f,g:[1,t][1,t]f,g:[1,t]\to[1,t] be two partial bijections. Informally, ff specifies what nodes of extGext_{G} are external on the second step of the inverted derivation (see again Fig. 5(b)); gg specifies what external nodes of the second step graph are external at the last step. Let Dom(gf)={p1,,pk}IODom(g\circ f)=\{p_{1},\dots,p_{k}\}_{IO}. We set

  • V:=VG{u1,,utk}V^{\prime}:=V_{G}\cup\{u_{1},\dots,u_{t-k}\} for u1,,utku_{1},\dots,u_{t-k} being new nodes;

  • E:=EG{e}E^{\prime}:=E_{G}\cup\{e^{\prime}\} where ee^{\prime} is new;

  • For eEGe\in E_{G} we set att(e):=attG(e)att^{\prime}(e):=att_{G}(e) and lab(e)=labG(e)lab^{\prime}(e)=lab_{G}(e);

  • For ee^{\prime} we set att(e):=extGu1utkatt^{\prime}(e^{\prime}):=ext_{G}u_{1}\dots u_{t-k};

  • lab(e):=(A,f,g)lab^{\prime}(e^{\prime}):=(A,f,g);

  • Let [1,t]Ran(gf)={j1,,jtk}IO[1,t]\setminus Ran(g\circ f)=\{j_{1},\dots,j_{t-k}\}_{IO} (note that gfg\circ f is bijective so Dom(gf)Dom(g\circ f) and Ran(gf)Ran(g\circ f) are of the same size). Then we set ext(i):=urext^{\prime}(i):=u_{r}, if i=jr,r{1,,tk}i=j_{r},\;r\in\{1,\dots,t-k\} or ext(i):=extG(pr)ext^{\prime}(i):=ext_{G}(p_{r}) for i=ir=g(f(pr)),r{1,,k}i=i_{r}=g(f(p_{r})),\;r\in\{1,\dots,k\}.

We are ready to announce that G=V,E,att,lab,extG^{\prime}=\langle V^{\prime},E^{\prime},att^{\prime},lab^{\prime},ext^{\prime}\rangle. Finally, we introduce the following rule:

ν1,γ,f,g:=AG.{\nu_{1,\gamma,f,g}:=A\to G^{\prime}}.

We define δ(ν1,γ,f,g)=δ(γ)\delta(\nu_{1,\gamma,f,g})=\delta(\gamma) (note that γ\gamma is nonrecursive so is ν1,γ,f,g\nu_{1,\gamma,f,g}).

Remark 3.

Type of (A,f,g)(A,f,g) equals type((A,f,g))=t+tk=2type(A)|Dom(gf)|type((A,f,g))=t+t-k=2\cdot type(A)-|Dom(g\circ f)|.

Figure 2 illustrates this type of productions.

AA\toGG(A,f,g)(A,f,g)12\dotsp1p_{1}(i1)(i_{1})\dotspkp_{k}(ik)(i_{k})\dotsttt+1t+1\dotst+(tk)t+(t-k)\;(j1)(j_{1})(jtk)(j_{t-k})
Figure 2: The production ν1,γ,f,g\nu_{1,\gamma,f,g}.

Type II. Production of type II are used at the last step of a derivation such a on Fig. 5(b); returning to the metaphor they allow one to finish sewing up and to finally connect a patch (that has grown by applying productions of types I and III) with the edges of a hole.
We recall that ρ=AR\rho=A\to R such that δ(ρ)=e0ER\delta(\rho)=e_{0}\in E_{R} and lab(e0)=Alab(e_{0})=A. Let f0:[1,t][1,t]f_{0}:[1,t]\to[1,t] be a function which is defined by the following relation: f0(i)=jattR(e0)(i)=extR(j)f_{0}(i)=j\Leftrightarrow att_{R}(e_{0})(i)=ext_{R}(j). Obviously, f0f_{0} is a partial bijection. We set

  • V:=VRV^{\prime}:=V_{R};

  • E:=EG{e0}E^{\prime}:=E_{G}\setminus\{e_{0}\};

  • att=attR|Eatt^{\prime}=att_{R}|_{E^{\prime}}, lab=labR|Elab^{\prime}=lab_{R}|_{E^{\prime}}

  • Let kk be equal to |Dom(f0)||Dom(f_{0})| and let [1,t]Ran(f0)={j1,,jtk}IO[1,t]\setminus Ran(f_{0})=\{j_{1},\dots,j_{t-k}\}_{IO}. Then ext=attR(e0)extR(j1)extR(jtk)ext^{\prime}=att_{R}(e_{0})ext_{R}(j_{1})\allowbreak\dots ext_{R}(j_{t-k}).

We set R=V,E,att,lab,extR^{\prime}=\langle V^{\prime},E^{\prime},att^{\prime},lab^{\prime},ext^{\prime}\rangle. Finally, we introduce ν2,ρ:=(A,f0,id)R{\nu_{2,\rho}:=(A,f_{0},id)\to R^{\prime}} and set δ(ν2,ρ):=e{\delta(\nu_{2,\rho}):=e} for some chosen eERe\in E_{R^{\prime}} (note that there is one since |EG|2|E_{G}|\geq 2). Here id:[1,t][1,t]id:[1,t]\to[1,t] is the identity function. See Figure 3.

Remark 4.

type(R)=2tk=2type(A)|Dom(f0)|=2type(A)|Dom(idf0)|=type((A,f0,id))type(R^{\prime})=2t-k=2\cdot type(A)-|Dom(f_{0})|=2\cdot type(A)-|Dom(id\circ f_{0})|=type((A,f_{0},id)). Thus the production is defined correctly.

(A,f0,id)(A,f_{0},id)\toRR^{\prime}(t+1)(t+1)(2tk)(2t-k)(1)(1)(2)(2)\dots(t)(t)
Figure 3: The production ν2,ρ\nu_{2,\rho}.

Type III. Productions of type III serve to make a “sewing up step” from inside to outside. f,gf,g that specify relations between external nodes on the current step, on the next step and on the final step are changed by related functions f,gf^{\prime},g^{\prime}.
Let f,g,f,g:[1,t][1,t]f,g,f^{\prime},g^{\prime}:[1,t]\to[1,t] be partial bijections. The whole procedure described below can be done if gf=gg^{\prime}\circ f^{\prime}=g.

We set

  • V:=VR{u1,,utk}V^{\prime}:=V_{R}\cup\{u_{1},\dots,u_{t-k^{\prime}}\} such that u1,,utku_{1},\dots,u_{t-k^{\prime}} are new nodes (kk^{\prime} is defined below);

  • E:=ER{e0}{e}E^{\prime}:=E_{R}\setminus\{e_{0}\}\cup\{e^{\prime}\} where e0=δ(ρ)e_{0}=\delta(\rho) and ee^{\prime} is new;

  • For eER{e0}e\in E_{R}\setminus\{e_{0}\} we set att(e):=attR(e)att^{\prime}(e):=att_{R}(e) and lab(e)=labR(e)lab^{\prime}(e)=lab_{R}(e);

  • For ee^{\prime} we set att(e):=extRu1utkatt^{\prime}(e^{\prime}):=ext_{R}u_{1}\dots u_{t-k^{\prime}};

  • lab(e):=(A,f,g)lab^{\prime}(e^{\prime}):=(A,f^{\prime},g^{\prime});

  • Let

    • M1:=Dom(g)Ran(f)={s1,,sp}IOM_{1}:=Dom(g)\setminus Ran(f)=\{s_{1},\dots,s_{p}\}_{IO} (nodes of the current step that will be external at the last step but that were not taken into account at the previous step);

    • M2:=[1,t]Ran(gf)={j1,,jtk}IOM_{2}:=[1,t]\setminus Ran(g\circ f)=\{j_{1},\dots,j_{t-k}\}_{IO};

    • g(si)=jlig(s_{i})=j_{l_{i}};

    • Let M2g(M1)={jx1,,jxtkp}M_{2}\setminus g(M_{1})=\{j_{x_{1}},\dots,j_{x_{t-k-p}}\} where {x1,,xtkp}\{x_{1},\dots,x_{t-k-p}\} is index-ordered; here we note that g(M1)=Ran(g)Ran(gf)g(M_{1})=Ran(g)\setminus Ran(g\circ f), thus g(M1)M2g(M_{1})\subseteq M_{2}.

    Then we set k=k+pk^{\prime}=k+p and

    • ext(i):=attR(e0)(i)ext^{\prime}(i):=att_{R}(e_{0})(i) for i=1,,ti=1,\dots,t;

    • ext(t+li)=extR(si)ext^{\prime}(t+l_{i})=ext_{R}(s_{i}) for i=1,,pi=1,\dots,p;

    • ext(t+xi):=uiext^{\prime}(t+x_{i}):=u_{i} for i=1,,tki=1,\dots,t-k^{\prime}.

We set R=V,E,att,lab,extR^{\prime}=\langle V^{\prime},E^{\prime},att^{\prime},lab^{\prime},ext^{\prime}\rangle. Then ν3,ρ,f,g,f,g:=(A,f,g)R{\nu_{3,\rho,f,g,f^{\prime},g^{\prime}}:=(A,f,g)\to R^{\prime}}. We put δ(ν3,ρ,f,g,f,g)=e{\delta(\nu_{3,\rho,f,g,f^{\prime},g^{\prime}})=e} for some eER{e0}e\in E_{R}\setminus\{e_{0}\}. The production is illustrated on Figure 4.

Remark 5.

Again, we prove well-definedness of this production:

  • Firstly, we check that type(R)=type((A,f,g))type(R^{\prime})=type((A,f,g)): |ext|=t+p+tk=2t+pkp=2type(A)|Ran(gf)|=type((A,f,g))|ext^{\prime}|=t+p+t-k^{\prime}=2t+p-k-p=2\cdot type(A)-|Ran(g\circ f)|=type((A,f,g));

  • Secondly, we have to check that |att(e)|=type((a,f,g))|att(e^{\prime})|=type((a,f^{\prime},g^{\prime})): |att(e)|=t+tk=t+|M2g(M1)||att(e^{\prime})|=t+t-k^{\prime}=t+|M_{2}\setminus g(M_{1})|. Note that M2g(M1)=([1,t]Ran(gf))(Ran(g)Ran(gf))=[1,t](Ran(g)Ran(gf))=[1,t]Ran(g)M_{2}\setminus g(M_{1})=\left([1,t]\setminus Ran(g\circ f)\right)\setminus\left(Ran(g)\setminus Ran(g\circ f)\right)=[1,t]\setminus(Ran(g)\cup Ran(g\circ f))=[1,t]\setminus Ran(g). This yields |att(e)|=2t|Ran(g)|=2t|Ran(gf)|=type((A,f,g))|att(e^{\prime})|=2t-|Ran(g)|=2t-|Ran(g^{\prime}\circ f^{\prime})|=type((A,f^{\prime},g^{\prime})).

(A,f,g)(A,f,g)\to(1)(1)(2)(2)\dots(t)(t)(A,f,g)(A,f^{\prime},g^{\prime})12\dotss1s_{1}(t+l1)(t+l_{1})\dotssps_{p}(t+lp)(t+l_{p})\dotsttt+1t+1\dotst+(tk)t+(t-k^{\prime})\;(t+x1)(t+x_{1})(t+xtk)(t+x_{t-k^{\prime}})RR^{\ominus}
Figure 4: The production ν3,ρ,f,g,f,g\nu_{3,\rho,f,g,f^{\prime},g^{\prime}}. Here RR^{\ominus} denotes a graph RR without δ(ρ)\delta(\rho).

We say that P2P_{2} is obtained from P1P_{1} by adding all possible productions of all the above types. Obviously, their number is finite (each production is defined by at most four partial functions from [1,t][1,t] to [1,t][1,t]).

Lemma 2.

The grammar HGr1=N,Σ,P2,SHGr_{1}=\langle N^{\prime},\Sigma,P_{2},S\rangle is equivalent to HGr=N,Σ,P,SHGr=\langle N,\Sigma,P,S\rangle.

Proof.

Informally, it suffices to notice that the design of new productions allows one to invert productions in the way which is shown on Figure 5. Example 7 provides an illustration of the formal proof, which is given below.

Firstly, we check that L(HGr)L(HGr1)L(HGr)\subseteq L(HGr_{1}). In order to do that, we show how to model a branch of a derivation that consists of rules of the form ρi\rho_{i} and finishes with a rule γj\gamma_{j} by means of HGr1HGr_{1}. Let Aρi1H1ρi2ρimHmγjHA\underset{\rho_{i_{1}}}{\Rightarrow}H_{1}\underset{\rho_{i_{2}}}{\Rightarrow}\dots\underset{\rho_{i_{m}}}{\Rightarrow}H_{m}\underset{\gamma_{j}}{\Rightarrow}H be a δ\delta-derivation in HGrHGr where Hk=Hk1[Rk/δ(ρik1)],1k<mH_{k}=H_{k-1}[R_{k}/\delta(\rho_{i_{k-1}})],1\leq k<m and H=Hm[G/δ(ρim)]H=H_{m}[G/\delta(\rho_{i_{m}})]. It is convenient to put H0:=(A)H_{0}:=\circledcirc(A) and say that δ(ρi0):=e0\delta(\rho_{i_{0}}):=e_{0} where EH0={e0}E_{H_{0}}=\{e_{0}\}. For the sake of simplyfing notations we write ρ1,,ρm\rho_{1},\dots,\rho_{m} instead of ρi1,,ρim\rho_{i_{1}},\dots,\rho_{i_{m}} and γ\gamma instead of γj\gamma_{j}.

Let GkG_{k} be defined inductively as follows: Gm+1=GG_{m+1}=G; Gk=Rk[Gk+1/δ(ρk)]G_{k}=R_{k}[G_{k+1}/\delta(\rho_{k})]. Note that

H=Hm[G/δ(ρm)]==Hm1[Rm/δ(ρm1)][G/δ(ρm)]==R1[R2/δ(ρ1)][Rm/δ(ρm1)][Gm+1/δ(ρm)]==R1[R2/δ(ρ1)][Gm/δ(ρm1)]==R1[G2/δ(ρ1)]==G1.\begin{array}[]{lcl}H&=&H_{m}[G/\delta(\rho_{m})]=\\ &=&H_{m-1}[R_{m}/\delta(\rho_{m-1})][G/\delta(\rho_{m})]=\\ &\dots&\\ &=&R_{1}[R_{2}/\delta(\rho_{1})]\dots[R_{m}/\delta(\rho_{m-1})][G_{m+1}/\delta(\rho_{m})]=\\ &=&R_{1}[R_{2}/\delta(\rho_{1})]\dots[G_{m}/\delta(\rho_{m-1})]=\\ &\dots&\\ &=&R_{1}[G_{2}/\delta(\rho_{1})]=\\ &=&G_{1}.\\ \end{array}

To prove the claimed inclusion, it suffices to show then that AHGr1G1A\underset{HGr_{1}}{\Rightarrow}G_{1}.

Observe that H=Hi1[Ri[Gi+1/δ(ρi)]/δ(ρi1)],i=1,,mH=H_{i-1}[R_{i}[G_{i+1}/\delta(\rho_{i})]/\delta(\rho_{i-1})],\quad i=1,\dots,m. Then Gi+1G_{i+1} can be considered as a subgraph of Gi:=Ri[Gi+1/δ(ρi)]G_{i}:=R_{i}[G_{i+1}/\delta(\rho_{i})], which is a subgraph of HH. Let fif_{i} be a partial function defined by the following correspondence: fi(j)=kextGi+1(j)=extGi(k)f_{i}(j)=k\Leftrightarrow ext_{G_{i+1}}(j)=ext_{G_{i}}(k). Similarly we define gig_{i}: gi(j)=kextGi(j)=extH(k)g_{i}(j)=k\Leftrightarrow ext_{G_{i}}(j)=ext_{H}(k). It follows from these definitions that gifi=gi1g_{i}\circ f_{i}=g_{i-1}.

Let σi=AGi+1\sigma_{i}=A\to G_{i+1} and Gi:=rhs(ν1,σi,fi,gi)G^{\prime}_{i}:=rhs(\nu_{1,\sigma_{i},f_{i},g_{i}}). We prove by the reverse induction on p=m,,1p=m,\dots,1 that then AHGr1GpA\underset{HGr_{1}}{\Rightarrow}G^{\prime}_{p}.

Basis. Since σm=AG=γ\sigma_{m}=A\to G=\gamma is not a recursive AA-production, we added ν1,σm,fm,gm\nu_{1,\sigma_{m},f_{m},g_{m}} in our grammar. This completes the basis case.

Step. We assume by the induction hypothesis that AHGr1GpA\underset{HGr_{1}}{\Rightarrow}G^{\prime}_{p} and our aim is to prove the same for p1p-1. GpG^{\prime}_{p} contains an edge labeled by (A,fp,gp)(A,f_{p},g_{p}). Then a type III rule is applied to this edge: ν3,ρp,fp,gp,fp1,gp1\nu_{3,\rho_{p},f_{p},g_{p},f_{p-1},g_{p-1}}. The design of such rules allows us to perform exactly the substitution of Gp+1G_{p+1} into RpR_{p}.

AA\RightarrowR1R_{1}AA\RightarrowR1R_{1}^{\ominus}R2R_{2}AA\overset{\ast}{\Rightarrow}R1R_{1}^{\ominus}R2R_{2}^{\ominus}\dotsRmR_{m}^{\ominus}GG=H=H
(a)
AA\RightarrowGG(A,fm,gm)(A,f_{m},g_{m})\RightarrowRmR_{m}^{\ominus}GG(A,fm1,gm1)(A,f_{m-1},g_{m-1})\overset{\ast}{\Rightarrow}R2R_{2}^{\ominus}\dotsRmR_{m}^{\ominus}GG(A,f1,id)(A,f_{1},id)H\Rightarrow H
(b)
Figure 5: Illustration of the proof of Lemma 2. External nodes are not depicted here.

Therefore, AHGr1G1A\underset{HGr_{1}}{\Rightarrow}G^{\prime}_{1}. Note that the right-hand side of this derivation contains an edge labeled by (A,f1,g1)(A,f_{1},g_{1}). Since G1=HG_{1}=H, we see that g1g_{1} is the identity function idid. Thus we can apply the type II rule of the form ν2,ρ1\nu_{2,\rho_{1}} whose left-hand side equals (A,f1,id)=(A,f1,g1)(A,f_{1},id)=(A,f_{1},g_{1}). This rule is also designed in such a way that its application is equivalent to substitution of G2G_{2} into R1R_{1}. This means that AHGr1G1R1[G2/δ(ρ1)]=G1A\underset{HGr_{1}}{\Rightarrow}G^{\prime}_{1}\Rightarrow R_{1}[G_{2}/\delta(\rho_{1})]=G_{1}, as required.

Now, if there is an arbitrary derivation where the rule ρi\rho_{i} is applied (1iK1\leq i\leq K), then we find the largest δ\delta-derivation within the derivation that includes this occurence ρi\rho_{i} and remodel it as described above. Thus, the above reasonings complete the proof for one of the inclusions.

To prove the reverse inclusion, it suffices to show how to transform a branch of a derivation containing nonterminals of the form (A,f,g)(A,f,g) into a derivation in HGrHGr. Let

Aν1,γ,fm,gmGm+1ν3,ρm,fm,gm,fm1,gm1Gmν3,ρm1,fm1,gm1,fm2,gm2ν3,ρ2,f2,g2,f1,g1G2ν2,ρmG1\begin{split}A\underset{\nu_{1,\gamma,f_{m},g_{m}}}{\Rightarrow}G_{m+1}\underset{\nu_{3,\rho_{m},f_{m},g_{m},f_{m-1},g_{m-1}}}{\Rightarrow}G_{m}\underset{\nu_{3,\rho_{m-1},f_{m-1},g_{m-1},f_{m-2},g_{m-2}}}{\Rightarrow}\dots\underset{\nu_{3,\rho_{2},f_{2},g_{2},f_{1},g_{1}}}{\Rightarrow}G_{2}\underset{\nu_{2,\rho_{m}}}{\Rightarrow}G_{1}\end{split}

be such a branch (it has to be of a similar form). Then we can consider a derivation in HGrHGr of the form Aρ1H1ρ2ρmHm𝛾HA\underset{\rho_{1}}{\Rightarrow}H_{1}\underset{\rho_{2}}{\Rightarrow}\dots\underset{\rho_{m}}{\Rightarrow}H_{m}\underset{\gamma}{\Rightarrow}H; the first part of the proof of this theorem allows us to conclude that H=G1H=G_{1}. Note that functions fif_{i} and gig_{i} in the first derivation can be arbitrary (they only have to satisfy the condition of type III rules); however, it is not hard to show that they are the same as the functions fif_{i} and gig_{i} that are built on the basis of the second derivation.

This finishes the proof. ∎

There are two important observations regarding the procedure and the lemma above:

  1. 1.

    After such a procedure there are no recursive AA-productions;

  2. 2.

    For each nonterminal (A,f,g)(A,f,g) there is no rule π\pi such that μ(π)=(A,f,g)\mu(\pi)=(A,f,g).

Note that HGr1HGr_{1} does not preserve all the properties that HGrHGr had. Namely, rules of Type II can contain only one edge. However, our further actions do not affect type II and type III rules so this does not matter. Besides, useless symbols can appear, which also does not bother us (we deleted them only to perform Transformations 2 and 3). This finishes the second step.

Example 7.

Let a grammar contain two AA-productions: the first one is recursive and the second one is not recursive. They are depicted below; δ\delta of each production is represented by the red color.
ρ=A\rho=A\to(1)(1)AA(2)(2)(3)(3)BB312γ=A\gamma=A\to(1)(1)(2)(2)(3)(3)CCDDEE
In this grammar the following δ\delta-derivation is possible:
AA\Rightarrow(1)(1)AA(2)(2)(3)(3)BB312\Rightarrow(1)(1)AA(3)(3)(2)(2)BBBB231\Rightarrow(1)(1)AA(2)(2)(3)(3)BBBBBB123\Rightarrow
\Rightarrow(1)(1)(2)(2)(3)(3)BBBBBBCCDDEE=G=G
In the new grammar this derivation turns into the following one:
AA\Rightarrow CCDDEE(A,f3,g3)(A,f_{3},g_{3})132(1)(1)(2)(2)(3)(3)456 \Rightarrow (3)(3)CCDDEEBB(A,f2,g2)(A,f_{2},g_{2})213(1)(1)(2)(2)45 \Rightarrow
\Rightarrow (3)(3)(2)(2)CCDDEEBBBB(A,f1,g1)(A,f_{1},g_{1})321(1)(1)4 G\Rightarrow G
Here f3(1)=2,f3(2)=3;f1=f2=f3=g2;g3(1)=3;g1=idf_{3}(1)=2,f_{3}(2)=3;f_{1}=f_{2}=f_{3}=g_{2};g_{3}(1)=3;g_{1}=id are all the involved partial functions; it is easy to check that they actually satisfy conditions from definitions. Note that if nn is the total number of derivation steps, then fif_{i} specifies a relation between external nodes on the (ni)(n-i)-th and on the (ni+1)(n-i+1)-th steps and it is defined by a corresponding production in the old derivation. In this example, fif_{i} are all defined by the production ρ\rho, thus they are the same. gig_{i} defines a relation between external nodes on the (ni+1)(n-i+1)-th step and external nodes at the end of a derivation (on the nn-th step). As expected, g1g_{1} is the identity function since n=4n=4 and (n1+1)=n(n-1+1)=n, so g1g_{1} specifies a relation between external nodes on the last and on the last step.

5.3 Eliminating Recursion and Convertion into the WGNF

The rest of the steps are similar to those in the string case. Let us start with HGrHGr; let N={A1,,Am}N=\{A_{1},\dots,A_{m}\}. Then the following transformation is done.

Transformation 4 (eliminating recursion).

Input: an HRG HGrHGr as at the beginning of the proof (without useless symbols or edgeless or chain productions).
Output: an equivalent HRG HGrHGr^{\prime} with the same properties such that for each πP\pi\in P with lhs(π)=Ailhs(\pi)=A_{i} either μ(π)=Aj\mu(\pi)=A_{j} and i<ji<j or μ(π)Σ\mu(\pi)\in\Sigma.

Method.

  1. 1.

    Set i=1i=1.

  2. 2.

    Eliminate all the recursive AiA_{i}-productions according to Lemma 2. Now it is argued that if π\pi belongs to the set of productions and lhs(π)=Ailhs(\pi)=A_{i}, then μ(π)=Ak\mu(\pi)=A_{k} for k>ik>i or μ(π)\mu(\pi) is terminal.

  3. 3.

    If i=mi=m, then a desired grammar is obtained. Otherwise, set i=i+1i=i+1 and j=1j=1.

  4. 4.

    Let AjH1,,AjHhA_{j}\to H_{1},\dots,A_{j}\to H_{h} be all the AjA_{j} productions. Let π=AiG(e0:Aj)\pi=A_{i}\to G(e_{0}:A_{j}) be a production with δ(π)=e0\delta(\pi)=e_{0}. Then we can replace π\pi by the rules AiG[H1/e0],,AiG[Hh/e0]A_{i}\to G[H_{1}/e_{0}],\dots,A_{i}\to G[H_{h}/e_{0}] without changing the generated language (see Lemma 1). For these rules we set δ(AiG[Hk/e0])=δ(AjHk),k=1,,h\delta(A_{i}\to G[H_{k}/e_{0}])=\delta(A_{j}\to H_{k}),\;k=1,\dots,h. It will now be the case (due to the previous steps of this transformation) that for each AjA_{j}-production π\pi μ(π)\mu(\pi) equals either a terminal or AkA_{k} for k>jk>j. Thus all the AiA_{i}-productions will have this property after such replacement.

  5. 5.

    If j=i1j=i-1, go to step 2; otherwise set j=j+1j=j+1 and go to step 4.

Let HGr=N,Σ,P,SHGr^{\prime}=\langle N^{\prime},\Sigma,P^{\prime},S\rangle be a grammar that is obtained from HGrHGr after Transformation 4. Let |N|=M|N^{\prime}|=M (note that MM is really large). We already know that HGrHGr^{\prime} does not have recursions, i.e. there are no δ\delta-derivations Aπ1H1π2πkHkA\underset{\pi_{1}}{\Rightarrow}H_{1}\underset{\pi_{2}}{\Rightarrow}\dots\underset{\pi_{k}}{\Rightarrow}H_{k} such that μ(πk)=A\mu(\pi_{k})=A. Indeed, due to the second observation after Lemma 2: nonterminals of the form (B,f,g)(B,f,g) cannot participate in recursive δ\delta-derivations since they do not appear as values of μ\mu. For the remaining nonterminals this is a consequence of the result of Transformation 4. Thus we can put a linear order on NN^{\prime} in such a way that for A𝜋G(e0:B)PA\underset{\pi}{\Rightarrow}G(e_{0}:B)\in P^{\prime}, if δ(π)=e0\delta(\pi)=e_{0}, then either BB is terminal or it is nonterminal and A<BA<B.

We define the final set of productions P¯\overline{P} as follows: AGA\to G belongs to P¯\overline{P} if there is a δ\delta-derivation Aπ1H1π2πkHk=GA\underset{\pi_{1}}{\Rightarrow}H_{1}\underset{\pi_{2}}{\Rightarrow}\dots\underset{\pi_{k}}{\Rightarrow}H_{k}=G of the length 1kM1\leq k\leq M such that μ(πk)\mu(\pi_{k}) is terminal. Obviously, the last requirement yields that GG has at least one terminal edge.

Lemma 3.

HGr¯=N,Σ,P¯,S\overline{HGr}=\langle N^{\prime},\Sigma,\overline{P},S\rangle is equivalent to HGrHGr^{\prime}.

Proof.

Clearly, L(HGr¯)L(HGr)L(\overline{HGr})\subseteq L({HGr}^{\prime}), because productions in HGr¯\overline{HGr} are composed of that in HGr{HGr}^{\prime}. The reverse inclusion is proved by induction on the length nn of derivation in HGrHGr^{\prime} in more general form: N,Σ,P¯,X\langle N^{\prime},\Sigma,\overline{P},X\rangle is equivalent to N,Σ,P,X\langle N^{\prime},\Sigma,P^{\prime},X\rangle for each XNX\in N^{\prime}.
Basis. If XGPX\to G\in P^{\prime} then it is also in P¯\overline{P} since GG is terminal.
Step. Let XH(e0:Y)n1GX\to H(e_{0}:Y)\overset{n-1}{\Rightarrow}G (where e0=δ(XH(e0:Y))e_{0}=\delta(X\to H(e_{0}:Y))) be a derivation of a terminal graph GG. This derivation contains a δ\delta-derivation Xπ1H1=H(e0:Y)π2H2π3H3πkHkX\underset{\pi_{1}}{\Rightarrow}H_{1}=H(e_{0}:Y)\underset{\pi_{2}}{\Rightarrow}H_{2}\underset{\pi_{3}}{\Rightarrow}H_{3}\dots\underset{\pi_{k}}{\Rightarrow}H_{k} and let μ(πi)=Xi\mu(\pi_{i})=X_{i} (we also set X0=XX_{0}=X). Then X0<X1<<XkX_{0}<X_{1}<\dots<X_{k}, XkX_{k} is terminal and all XiX_{i} are different; thus kMk\leq M and XHkP¯X\to H_{k}\in\overline{P}. Then the source derivation is rebuilt as follows: we apply XHkX\to H_{k} to XX and then do the same with nonterminal edges in HkH_{k} using the induction hypothesis. This completes the proof. ∎

Thus L(HGr¯)=LL(\overline{HGr})=L. Now we have to overcome the following problem: the weak Greibach normal form requires right-hand sides of productions to have exactly one terminal edge while we can guarantee for HGr¯\overline{HGr} that there is at least one terminal edge. This problem is easily fixed: for each aΣa\in\Sigma we introduce a new nonterminal symbol TaT_{a} and we add rules of the form Ta(a)T_{a}\to\circledcirc(a). Now, if AGP¯A\to G\in\overline{P} has more than one terminal symbol in the right-hand side, then we choose one of them and replace the other ones by corresponding nonterminals: a:=Taa:=T_{a}. After this step the modified grammar HGr¯\overline{HGr} still generates LL, and it is in the weak Greibach normal form.

6 Algorithmic Complexity

In the string case polynomial-time algorithms of convertion into the Greibach normal form are known; one is interested whether they exist in the hypergraph case.

It can be seen that the proof we have presented is horrible from the point of view of algorithmic complexity. The main problem is in the second step: the amount of partial bijections from the set [1,t][1,t] to itself is equal to (Ct0)20!+(Ct1)21!+(Ct2)22!++(Ctt)2t!(C_{t}^{0})^{2}0!+(C_{t}^{1})^{2}1!+(C_{t}^{2})^{2}2!+\dots+(C_{t}^{t})^{2}t!, which grows even faster then any exponential function. It is less obvious but another “heavy” part of the algorithm is the first step — eliminating chain productions. One would hope that presence of these slow parts is the problem of our proof but this is not the case.

Example 8.

Consider a grammar HGr3={S},{a},P3,SHGr_{3}=\langle\{S\},\{a\},P_{3},S\rangle where PP has 3 productions (type(S)=ntype(S)=n, type(a)=ntype(a)=n):

  1. 1.

    S{v1,,vn},{e0},att,lab,v1v2vnS\to\langle\{v_{1},\dots,v_{n}\},\{e_{0}\},att,lab,v_{1}v_{2}\dots v_{n}\rangle, lab(e0)=Slab(e_{0})=S, att(e0)=v2v1v3v4vnatt(e_{0})=v_{2}v_{1}v_{3}v_{4}\dots v_{n};

  2. 2.

    S{v1,,vn},{e0},att,lab,v1v2vnS\to\langle\{v_{1},\dots,v_{n}\},\{e_{0}\},att,lab,v_{1}v_{2}\dots v_{n}\rangle, lab(e0)=Slab(e_{0})=S, att(e0)=v2v3v4vnv1att(e_{0})=v_{2}v_{3}v_{4}\dots v_{n}v_{1};

  3. 3.

    S(a)S\to\circledcirc(a).

Obviously, this grammar has a size O(n)O(n). It generates graphs with one aa-labeled edge such that attachment nodes of this edge are obtained from external nodes of the graph by a permutation composed of the transposition (12)(12) and of the cycle (12n)(12\dots n). Since these permutations are generators of SnS_{n}, the language L(HGr3)L(HGr_{3}) contains n!n! hypergraphs. All of them have exactly one hyperedge, so for all HL(HGr3)H\in L(HGr_{3}) a production SHS\to H has to belong to a grammar in the WGNF; therefore, even the number of productions grows faster than any exponent apart from the algorithm involved.

This yields the following

Proposition 1.

There is no exponential-time algorithm that takes an HRG generating an ibHCFLibHCFL and returns an equivalent HRG in the WGNF.

However, if we consider only HRGs of order rr for some fixed rr (i.e. in which types of symbols are not greater than rr), then the above example ceases to be a problem as well as the first and the second steps in our proof since the number of partial bijections from [1;r][1;r] to itself (and, in particular, the size of SrS_{r}) is constant when rr is fixed.

7 Conclusion

Our desire to introduce the weak Greibach normal form appeared due to another research about hypergraph basic categorial grammars (we presented a paper [11] about them at ICGT-2020); in that work we introduce a new approach to describing graph languages and prove that each grammar in this new formalism can be transformed into an equivalent HRG in the WGNF and vice versa. Theorems 1 and 2 show that ibHCFL is exactly the class of languages generated by grammars in the weak Greibach normal form; thus we also answer a question about the recognizing power of hypergraph basic categorial grammars — they also generate exactly ibHCFL. Besides, we suppose that the issue of the WGNF is a fundamental theoretical question, and we have presented an answer in line with those in [5, 9]. Note that the result gives a natural grammar characterization of the language class ibHCFL.
Acknowledgments. I thank my scientific advisor prof. Mati Pentus and anonymous reviewers for their valuable advices.

References

  • [1]
  • [2] Alfred V. Aho & Jeffrey D. Ullman (1972): The theory of parsing, translation, and compiling. 1: Parsing. Prentice-Hall. Available at https://www.worldcat.org/oclc/310805937.
  • [3] Y. Bar-Hillel, C. Caifman & E. Shamir (1960): On Categorial and Phrase-structure Grammars. Weizmann Science Press.
  • [4] Frank Drewes, Hans-Jörg Kreowski & Annegret Habel (1997): Hyperedge Replacement, Graph Grammars. In Grzegorz Rozenberg, editor: Handbook of Graph Grammars and Computing by Graph Transformations, Volume 1: Foundations, World Scientific, pp. 95–162, 10.1142/9789812384720_0002.
  • [5] Joost Engelfriet (1992): A Greibach Normal Form for Context-free Graph Grammars. In Werner Kuich, editor: Automata, Languages and Programming, 19th International Colloquium, ICALP92, Vienna, Austria, July 13-17, 1992, Proceedings, Lecture Notes in Computer Science 623, Springer, pp. 138–149, 10.1007/3-540-55719-9_70.
  • [6] Joost Engelfriet, George Leih & Emo Welzl (1990): Boundary Graph Grammars with Dynamic Edge Relabeling. J. Comput. Syst. Sci. 40(3), pp. 307–345, 10.1016/0022-0000(90)90002-3.
  • [7] Joost Engelfriet & Grzegorz Rozenberg (1990): A Comparison of Boundary Graph Grammars and Context-Free Hypergraph Grammars. Inf. Comput. 84(2), pp. 163–206, 10.1016/0890-5401(90)90038-J.
  • [8] Annegret Habel (1992): Hyperedge Replacement: Grammars and Languages. Lecture Notes in Computer Science 643, Springer, 10.1007/BFb0013875.
  • [9] Christina Jansen, Jonathan Heinen, Joost-Pieter Katoen & Thomas Noll (2011): A Local Greibach Normal Form for Hyperedge Replacement Grammars. In Adrian-Horia Dediu, Shunsuke Inenaga & Carlos Martín-Vide, editors: Language and Automata Theory and Applications - 5th International Conference, LATA 2011, Tarragona, Spain, May 26-31, 2011. Proceedings, Lecture Notes in Computer Science 6638, Springer, pp. 323–335, 10.1007/978-3-642-21254-3_25.
  • [10] Bevan K. Jones, Jacob Andreas, Daniel Bauer, Karl Moritz Hermann & Kevin Knight (2012): Semantics-Based Machine Translation with Hyperedge Replacement Grammars. In Martin Kay & Christian Boitet, editors: COLING 2012, 24th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, 8-15 December 2012, Mumbai, India, Indian Institute of Technology Bombay, pp. 1359–1376. Available at https://www.aclweb.org/anthology/C12-1083/.
  • [11] Tikhon Pshenitsyn (2020): Hypergraph Basic Categorial Grammars. In Fabio Gadducci & Timo Kehrer, editors: Graph Transformation - 13th International Conference, ICGT 2020, Held as Part of STAF 2020, Online, June 25-26, 2020, Proceedings, Lecture Notes in Computer Science 12150, Springer, pp. 146–162, 10.1007/978-3-030-51372-6_9.
  • [12] Grzegorz Rozenberg, editor (1997): Handbook of Graph Grammars and Computing by Graph Transformations, Volume 1: Foundations. World Scientific, 10.1142/3303.