This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Old and New Minimalism: a Hopf algebra comparison

Matilde Marcolli, Robert C. Berwick, Noam Chomsky Departments of Mathematics and of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA [email protected] Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge MA 02141, USA [email protected] Department of Linguistics, University of Arizona, Tucson, AZ 85721, USA [email protected] [email protected]
(Date: 2023)
Abstract.

In this paper we compare some old formulations of Minimalism, in particular Stabler’s computational minimalism, and Chomsky’s new formulation of Merge and Minimalism, from the point of view of their mathematical description in terms of Hopf algebras. We show that the newer formulation has a clear advantage purely in terms of the underlying mathematical structure. More precisely, in the case of Stabler’s computational minimalism, External Merge can be described in terms of a partially defined operated algebra with binary operation, while Internal Merge determines a system of right-ideal coideals of the Loday-Ronco Hopf algebra and corresponding right-module coalgebra quotients. This mathematical structure shows that Internal and External Merge have significantly different roles in the old formulations of Minimalism, and they are more difficult to reconcile as facets of a single algebraic operation, as desirable linguistically. On the other hand, we show that the newer formulation of Minimalism naturally carries a Hopf algebra structure where Internal and External Merge directly arise from the same operation. We also compare, at the level of algebraic properties, the externalization model of the new Minimalism with proposals for assignments of planar embeddings based on heads of trees.

1. Introduction

The Minimalist Program of generative linguistics, introduced by Chomsky in the ’90s, [8], underwent a significant simplifying reformulation in the more recent work [9], [10], [11], [12] (see also [4], [5], [31]). A recent overview of the current formulation of the fundamental Merge operation of the Minimalist Model is presented in the lectures [13].

In our paper [37] we showed that the new formulation of Merge has a very natural mathematical description in terms of magmas and Hopf algebras. This mathematical formulation makes it possible to derive several desirable linguistic properties of Merge directly as consequences of the mathematical setting. We also showed in [37] that the mathematical formulation of Merge has the same structure as the mathematical theory underlying fundamental interactions in physics, such as the renormalization process of quantum field theory and the recursive solution of equations of motion via combinatorial Dyson-Schwinger equations. An analogous recursive generative process of hierarchies of graphs plays a crucial role in both cases, as Feynman graphs in the physical case and as syntactic trees in the linguistic case.

In the present companion paper, we show how this same mathematical formalism based on Hopf algebras can be used to compare the new formulation of Merge with older forms of the Minimalist Model. The advantages of the new formulation can be stated directly in linguistic terms, as discussed in [10], [11], [13], for instance. What we argue here is that one can also see the advantage of the New Minimalism directly in terms of the underlying mathematical structure.

More precisely, we first consider the case of Stabler’s “Computational Minimalism”, [42]. This formulation is unsatisfactory from the linguistic perspective, and superseded by Chomsky’s more recent formulation of Minimalism. We take this as an example because among the older versions of Minimalism it is one that tends to be more widely known to mathematicians (as well as to theoretical computer scientists), through its relation to formal languages. Indeed, mathematicians familiar with the theory of formal languages are usually aware of the fact that Stabler’s formulation of Minimalism is describable in terms of a class of minimalist grammars (MG), which are equivalent to multiple context free grammars, a class of context-sensitive formal languages that include all the context-free and regular languages, as well as other classes such as the tree-adjoining grammars, see [44]. The MG grammars can also be characterized in terms of linear context free rewrite systems (LCFRS), [39]. However, as shown by Berwick in [3], this equivalence between Stabler’s computational minimalism and multiple context free grammars hides an important difference in terms of “succintness gap”. Namely, computational minimalism is exponentially more succinct than otherwise equivalent multiple context free grammars, (see [3] and [42]). As shown in [3], a similar gap exists between transformational generative grammar and generalized phrase structure grammar. Another problem with thinking in terms of formal languages is that they are designed to describe languages as strings (oriented sets) produced as ordered sequences of transitions in an automaton that computes the language. This time-ordered description of languages hides its more intrinsic and fundamental description in terms of structures (binary rooted trees without an assigned planar embedding, in mathematical terms). Indeed the current form of Minimalism not only to provides a more efficient encoding of the generative process of syntax, but it also proposes a model where the core computational structure of syntax is entirely based on structures rather than on linear order. The latter (equivalently, planar embeddings of trees) is superimposed to this core computational structure, in a later externalization phase. Thus, the MG grammars description of Stabler’s minimalism, comfortingly familiar as it may appear to the mathematically minded, is in fact deceiving, since formal languages only deal with properties of terminal strings, not with structure, while Minimalism is primarily concerned with structure and does not need to incorporate linear order in its computational core mechanism.

The goal of the present paper is to elaborate on this difference. We also want to stress the point that, while formal languages have long been considered the mathematical theory of choice for application to generative linguistics, it is in fact neither the only one nor the most appropriate. As we argued in [37], the algebraic formalism of Hopf algebras is a more suitable mathematical tool for theoretical linguistics, and a useful way to move beyond the traditional thinking in terms of formal languages. Thus, in the present paper, we will use this Hopf algebras viewpoint to make comparisons between older forms of minimalism like Stabler’s and the current form based on free symmetric Merge, and to show the advantages of the latter from this algebraic perspective.

We analyze Stabler’s formulation of Minimalism from the point of view of Hopf algebra structures, which we have shown in [37] to be an appropriate algebraic language for the formulation of Merge in Minimalism. We show here that, in the older setting of Stabler’s Computational Minimalism, Internal and External Merge correspond to very different types of mathematical structures. External Merge is expressible in terms of the notion of “operated algebra”, while Internal Merge can be described in terms of right-ideal coideals in a Hopf algebra, and corresponding quotient right-module coalgebras. While these are interesting mathematical structures, the very different form of Internal and External Merge makes it difficult to reconcile them as two forms of a single underlying basic Merge structure. This is unsatisfactory, because linguistically one expects Internal and External merge to be manifestations of the same fundamental computational principle.

We then show that the mathematical formulation of the New Minimalism that we introduced in [37] completely bypasses this problem, by directly presenting a unified framework for both Internal and External Merge.

We also further discuss our proposed model for externalization of [37], by comparing it with proposed alternative models based on possible constructions of planarizations of trees, independent of syntactic parameters. We outline some difficulties with such constructions, at the level of the underlying algebraic structure.

2. Stabler’s Minimalism and the Loday–Ronco Hopf algebra

In this section we first recall some mathematical structures, in particular the Loday–Ronco Hopf algebra of planar binary rooted trees, and the explicit form of product and coproduct. In §2.1 we recall the basic definition and properties of Hopf algebras and in §2.2 we introduce the specific case of binary planar rooted trees, with the product and coproduct described in §2.4.

It is important to point out here that one way in which older formulation of Minimalism differ from the more recent formulation is in the fact of considering planar trees, namely rooted trees endowed with a particular choice of a planar embedding. Fixing the embedding is equivalent to fixing an ordering of the leaves of the tree, which corresponds linguistically to considering a linear order on sentences. While this is assumed in older versions of Minimalism, the newer version eliminates the assignment of planar structures at the level of the fundamental computational mechanism of Merge, relegating the linear ordering to a later externalization procedure that interfaces the fundamental computational mechanism of Merge (where ordering is not assumed) with the sensory-motor system, where ordering is imposed in the externalization of language into speech, sign, writing. The difference between imposing a priori a choice of planar embeddings on trees or not leads to significantly different mathematical formulations of the resulting Hopf algebras and Merge operations.

Another very significant difference between older versions of Minimalism and the newer formulation is the calculus of labels of syntactic features associated to trees, and how such labels determine and are determined by the transformation implemented by Internal and External Merge. We introduce labeling in §2.5. For a discussion of labels and projection in the different versions of Minimalism see [9].

We show in §2.6 how in older formulations of Minimalism such as Stabler’s, the conditions on planarity and labels impose conditions on the applicability of Merge operations, for both External and Internal Merge, and that this makes the operations only partially defined on specific domains, reflecting conditions on matching/related labels and on projection. We discuss the specific case of External Merge in §2.7 and of Internal Merge in §2.8.

In §2.9 and §2.10 we describe the mathematical structure underlying the formulation of Internal Merge in the older versions of Minimalism. In §2.9 we show that the issue of labels and domains requires a modification of the Loday–Ronco coproduct, while in §2.10 we show that a similar modification is required for the product, and that the problem of heads and projection results in the fact that the domain of Internal Merge is a right ideal but not a left ideal with respect to the product, which is a coalgebra with resoect to the modified coproduct.

Moreover, in §2.11 we show that the presence of domains that make Merge partially defined, due to labeling conditions, creates an additional mathematical complication related to the iterated application of Merge, as composition of partially defined operations.

We observe in §2.12 that a significant difference with respect to the Hopf algebra formulation of the new Minimalism we described in [37] arises in comparison with the analogous structure in fundamental physics. In the case of the old Minimalism discussed here the Internal Merge has an intrinsically asymmetric structure, because of constraints coming from labeling and projection and from the imposition of working with trees with planar structures. Because of this, instead of finding Hopf ideals as in the setting of theoretical physics, one can only obtain right-ideal coideals, which correspond to a much weaker form of “generalized quotients” in Hopf algebra theory.

We then discuss in more detail the mathematical structure of the old formulation of External Merge in §2.13 and §2.14. We show that these exhibit a very different mathematical structure with respect to Internal Merge. It can be described in terms of the notion of “operated algebra”, where again one needs to extend the notion to a partially defined version because of the conditions on domains coming from the labels and projection problem.

2.1. Preliminaries on Hopf algebras of rooted trees

We begin by reviewing the main constructions of Hopf algebras of rooted trees, focusing on the Loday–Ronco Hopf algebra, which is based on planar binary rooted trees.

Associative algebras can be regarded as the natural algebraic structure that describes linear strings of letters in a given alphabet with the operation of concatenation of words. On the other hand, in Linguistics the emphasis is put preferably on the syntactic trees that provide the parsing of sentences, rather than on the strings of words produced by a given grammar. When dealing with trees instead of strings as the main objects, the appropriate algebraic formalism is no longer associative algebras, but Hopf algebras and more generally operads. These are mathematical structures essentially designed for the parsing of hierarchical compositional rules, by encoding (in a coproduct operation) all the possible “correct parsings”, that is, all available decompositions of an element into its possible building blocks.

We recall here some general facts about Hopf algebras that we will be using in the following.

A Hopf algebra {\mathcal{H}} is a vector space over a field 𝕂{\mathbb{K}}, endowed with

  • a multiplication m:𝕂m:{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\to{\mathcal{H}};

  • a unit u:𝕂u:{\mathbb{K}}\to{\mathcal{H}};

  • a comultiplication Δ:𝕂\Delta:{\mathcal{H}}\to{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}};

  • a counit ϵ:𝕂\epsilon:{\mathcal{H}}\to{\mathbb{K}};

  • an antipode S:S:{\mathcal{H}}\to{\mathcal{H}}

which satisfy the following properties. The multiplication operation is associative, namely the following diagram commutes

𝕂𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces}mid\scriptstyle{m\otimes id}idm\scriptstyle{id\otimes m}𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}m\scriptstyle{m}𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}m\scriptstyle{m}\textstyle{\mathcal{H}}

and the unit and multiplication are related by the commutative diagram

𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}m\scriptstyle{m}𝕂𝕂\textstyle{{\mathbb{K}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces}uid\scriptstyle{u\otimes id}𝕂𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathbb{K}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces}idu\scriptstyle{id\otimes u}\textstyle{\mathcal{H}}

where the two unmarked downward arrows are the identifications induced by scalar multiplication. The coproduct is coassociative, namely the following diagram commutes

𝕂𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}}𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}Δid\scriptstyle{\Delta\otimes id}𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}idΔ\scriptstyle{id\otimes\Delta}\textstyle{{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces}Δ\scriptstyle{\Delta}Δ\scriptstyle{\Delta}

and the comultiplication and the counit are related by the commutative diagram

𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces}ϵid\scriptstyle{\epsilon\otimes id}idϵ\scriptstyle{id\otimes\epsilon}𝕂𝕂\textstyle{{\mathbb{K}}\otimes_{\mathbb{K}}{\mathcal{H}}}𝕂𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathbb{K}}}\textstyle{{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces}Δ\scriptstyle{\Delta}

where the two unmarked upward arrows are the identifications given, respectively, by x1𝕂xx\mapsto 1_{\mathbb{K}}\otimes x and xx1𝕂x\mapsto x\otimes 1_{\mathbb{K}}, with 1𝕂1_{\mathbb{K}} the unit of the field 𝕂{\mathbb{K}}. Moreover, Δ\Delta and ϵ\epsilon are algebra homomorphisms, and mm and uu are coalgebra homomorphisms. The antipode S:S:{\mathcal{H}}\to{\mathcal{H}} is a linear map such that the following diagram also commutes

𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}m\scriptstyle{m}\textstyle{\mathcal{H}}𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}m\scriptstyle{m}𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}idS\scriptstyle{id\otimes S}\textstyle{{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces}Δ\scriptstyle{\Delta}uϵ\scriptstyle{u\circ\epsilon}Δ\scriptstyle{\Delta}𝕂\textstyle{{\mathcal{H}}\otimes_{\mathbb{K}}{\mathcal{H}}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}Sid\scriptstyle{S\otimes id}

In any graded bialgebra it is possible to construct an antipode map inductively by

(2.1) S(X)=XS(X)X′′,S(X)=-X-\sum S(X^{\prime})X^{\prime\prime},

for any element XX of the bialgebra, with coproduct Δ(X)=X1+1X+XX′′\Delta(X)=X\otimes 1+1\otimes X+\sum X^{\prime}\otimes X^{\prime\prime}, where the XX^{\prime} and X′′X^{\prime\prime} are terms of lower degree. Thus, in the following we will focus only on the bialgebra structure and not discuss the antipode map explicitly.

Heuristically, what these properties of a Hopf algebra describe can be summarized as follows. As we will see in the specific example of trees below, as a vector space {\mathcal{H}} consists of formal linear combinations of a specific class of objects (planar binary rooted trees, Feynman graphs, etc.). The operations will be defined on these generators and extended by linearity. The multiplication structure is a combination operation and the coproduct structure is a decomposition operation that lists all the possible different decompositions (parsings). The antipode is like a group inverse and it establishes the compatibility between multiplication and comultiplication, unit and counit.

2.2. Binary rooted trees

A rooted tree TT is a finite graph whose geometric realization is simply connected (no loops), defined by a set of vertices VV, with a distinguished element vrVv_{r}\in V, the root vertex, and a set of edges EE, which we can assume oriented with the uniquely defined orientation away from the root. The source and target maps s,t:EVs,t:E\to V assign to each edge of the tree its source and target vertices. The leaves of the tree are the univalent vertices. They are also the sinks (no outgoing edges). A rooted tree is planar if it is endowed with an embedding of its geometric realization in the plane. Assigning planar embedding is equivalent to assigning a linear ordering of the leaves. The tree is vertex-decorated if there is a map LV:VDVL_{V}:V\to D_{V} to a (finite) set. It is edge-decorated if similarly there is a map LE:EDEL_{E}:E\to D_{E} to a finite set of possible edge decorations. We will assume the trees are vertex-decorated, and we will simply refer to them as decorated trees.

An admissible cut CC on a rooted tree TT is an operation that

  1. (1)

    selects a number of edges of TT with the property that every oriented path in TT from the root to one of the leaves contains at most one of the selected edges

  2. (2)

    removes the selected edges.

The result of an admissible cut is a disjoint union of a tree ρC(T)\rho_{C}(T) that contains the root vertex and a forest πC(T)\pi_{C}(T), that is, a disjoint union πC(T)=iTi\pi_{C}(T)=\cup_{i}T_{i} of planar trees, where each TiT_{i} has a unique source vertex (no incoming edges), which we select as root vertex of TiT_{i},

(2.2) C(T)=ρC(T)πC(T).C(T)=\rho_{C}(T)\cup\pi_{C}(T).

An elementary admissible cut (or simply elementary cut) is a cut CC consisting of a single edge.

2.3. The Loday–Ronco Hopf algebra of binary rooted trees

We note right away that this algebraic structure is different from that we considered in [37]. Since older forms of Minimalism, such as Stabler’s version that we discuss here, incorporate linear ordering, it is necessary to work with binary rooted trees with an assigned choice of planar embedding (linear ordering of the leaves). This has immediate consequences on the type of algebraic structure that we need to work with, as the possible forms of product and coproduct operations are affected by the presence of this linear ordering. Another difference with respect to our setting in [37] is that in the new Minimalism, Merge acts on workspaces (a Hopf algebra of binary forests with no planar embeddings), while in the old Stabler Minimalism, Merge acts on (a Hopf algebra of) planar binary rooted trees.

Consider the vector space 𝒱k{\mathcal{V}}_{k} spanned by the planar binary rooted trees TT with kk internal vertices (equivalently, with k+1k+1 leaves). It has dimension

dim𝒱k=(#DV)k(2k)!k!(k+1)!,\dim{\mathcal{V}}_{k}=(\#D_{V})^{k}\frac{(2k)!}{k!(k+1)!},

where #DV\#D_{V} is the cardinality of the set DVD_{V} of possible vertex labels. Let 𝒱=k0𝒱k{\mathcal{V}}=\oplus_{k\geq 0}{\mathcal{V}}_{k}, with 𝒱0={\mathcal{V}}_{0}={\mathbb{Q}}. For a given label dDVd\in D_{V} the grafting operator d\wedge_{d} is defined as

(2.3) d:𝒱𝒱𝒱,T1T2T=T1dT2,\wedge_{d}:{\mathcal{V}}\otimes{\mathcal{V}}\to{\mathcal{V}},\ \ \ T_{1}\otimes T_{2}\mapsto T=T_{1}\wedge_{d}T_{2}\,,

with d:𝒱k𝒱𝒱k+1\wedge_{d}:{\mathcal{V}}_{k}\otimes{\mathcal{V}}_{\ell}\to{\mathcal{V}}_{k+\ell-1}, that attaches the two roots vr1v_{r_{1}} of T1T_{1} and vr2v_{r_{2}} of T2T_{2} to a single root vertex vv labelled by dDVd\in D_{V}.

One also introduces the following associative concatenation operations on planar binary rooted trees: given SS and TT, the tree S\TS\backslash T (SS under TT) is obtained by grafting the root of TT to the rightmost leaf of SS, while T/ST/S (SS over TT) is the tree obtained by grafting the root of TT to the leftmost leaf of SS. The grafting operation (2.3) is related to these concatenations by

(2.4) T1dT2=T1/S\T2,T_{1}\wedge_{d}T_{2}=T_{1}/S\backslash T_{2},

where SS is the planar binary tree with a single vertex decorated by dDVd\in D_{V}. Each planar binary rooted tree is described as a grafting T=TdTrT=T_{\ell}\wedge_{d}T_{r}, of the trees stemming to the left and right of the root vertex. We remove the explicit mention of the vertex decorations when not needed.

The Loday–Ronco Hopf algebra LR{\mathcal{H}}_{LR} of planar binary rooted trees is obtained from the vector space 𝒱{\mathcal{V}} by defining a multiplication and a comultiplication inductively by degrees. For trees T=TTrT=T_{\ell}\wedge T_{r} and T=TTrT^{\prime}=T^{\prime}_{\ell}\wedge T^{\prime}_{r} the product defined in [35] can be built inductively using the property

TT=T(TrT)+(TT)Tr,T\star T^{\prime}=T_{\ell}\wedge(T_{r}\star T^{\prime})+(T\star T^{\prime}_{\ell})\wedge T^{\prime}_{r},

with the tree consisting of a single root vertex \bullet as the unit. The coproduct similarly can be built from lower degree terms by the property

Δ(T)=j,k(T,jTr,k)(T,njTr,mk)+T\Delta(T)=\sum_{j,k}(T_{\ell,j}\star T_{r,k})\otimes(T^{\prime}_{\ell,n-j}\wedge T^{\prime}_{r,m-k})+T\otimes\bullet

where T=TTrT=T_{\ell}\wedge T_{r} and Δ(T)=jT,jT,nj\Delta(T_{\ell})=\sum_{j}T_{\ell,j}\otimes T^{\prime}_{\ell,n-j} and Δ(Tr)=kTr,kTr,mk\Delta(T_{r})=\sum_{k}T_{r,k}\otimes T^{\prime}_{r,m-k}, for T𝒱nT_{\ell}\in{\mathcal{V}}_{n} and Tr𝒱mT_{r}\in{\mathcal{V}}_{m}.

In [35] this Hopf algebra LR{\mathcal{H}}_{LR} of planar binary rooted trees is described in terms of the Hopf algebra structure on the group algebra [S]=n[Sn]{\mathbb{Q}}[S_{\infty}]=\oplus_{n}{\mathbb{Q}}[S_{n}] of the symmetric group. Namely, the inclusion of the Hopf algebra of non-commutative symmetric functions in the Malvenuto-Reutenauer Hopf algebra of permutations factors through the Loday-Ronco Hopf algebra of planar binary rooted trees, see also [1]. In [7] a version of the Loday-Ronco Hopf algebra of planar binary rooted trees was used for the renormalization of massless quantum electrodynamics and and explicit isomorphisms between the Loday-Ronco Hopf algebra of planar binary rooted trees and the (noncommutative) Connes–Kreimer Hopf algebra of renormalization were constructed in [1], [25], [20].

2.4. Graphical form of coproduct and product

It is shown in [1] that the coproduct and product of the Loday-Ronco Hopf algebra of planar binary rooted trees can be conveniently visualized in the following way.

Given a tree TT, one subdivides it into two parts by cutting along the path from one of the leaves to the root (illustration from [1])

[Uncaptioned image]

and the coproduct then takes the form of a sum over all such decompositions

(2.5) Δ(T)=TT′′\Delta(T)=\sum T^{\prime}\otimes T^{\prime\prime}

where T,T′′T^{\prime},T^{\prime\prime} are, respectively the parts of TT to the left and right of the path from a leaf to the root and all choices of leaves are summed over.

In order to form the product T1T2T_{1}\star T_{2}, suppose that T1T_{1} has n1+1n_{1}+1 leaves and T2T_{2} has n2+1n_{2}+1 leaves. Consider again subdivisions of the tree T1T_{1} obtained by cutting along paths from the leaves to the root, including trees consisting of just the path itself, with the cuts chosen so as to obtain n2+1n_{2}+1 subtrees of T1T_{1} (illustration from [1] in a case with n1=5n_{1}=5, n2=3n_{2}=3)

[Uncaptioned image][Uncaptioned image]

We write 𝒱(k){\mathcal{V}}^{(k)} for the {\mathbb{Q}}-vector space spanned by the planar binary rooted trees with kk leaves, and we write the usual operadic composition operation of grafting roots to leaves as above in the form

(2.6) γ:𝒱(k0)××𝒱(kn)×𝒱(n+1)𝒱(k0++kn)\gamma:{\mathcal{V}}^{(k_{0})}\times\cdots\times{\mathcal{V}}^{(k_{n})}\times{\mathcal{V}}^{(n+1)}\to{\mathcal{V}}^{(k_{0}+\cdots+k_{n})}

where γ(T0,,Tn;T)\gamma(T_{0},\ldots,T_{n};T) is the tree obtained by grafting the trees Ti𝒱(ki)T_{i}\in{\mathcal{V}}^{(k_{i})} in collection (T0,,Tn)(T_{0},\ldots,T_{n}) to the tree T𝒱(n+1)T\in{\mathcal{V}}^{(n+1)}, by attaching the root of TiT_{i} to the ii-th leaf of TT, as illustrated above.

The product in LR{\mathcal{H}}_{LR} is then written in the form

(2.7) TT=(T0,,Tn)γ(T0,,Tn;T)T\star T^{\prime}=\sum_{(T_{0},\ldots,T_{n})}\gamma(T_{0},\ldots,T_{n};T^{\prime})

with n+1n+1 the number of leaves of TT^{\prime}, and the sum over subdivisions into subtrees extracted according to the rule described above.

2.5. Labelled trees

The previous subsections only dealt with general mathematical formalism about planar binary rooted trees. We now consider more specifically the case of Stabler’s Computational Minimalism.

In Stabler’s formulation, one considers planar binary rooted trees such as the following example (from [42]).

[Uncaptioned image]

where the ordering of the leaves determines and is determined by the planar embedding of the tree, and the labels at the inner vertices serve the purpose of pointing toward the head.

More precisely, all internal vertices and the root vertex are labelled by symbols in the set {>,<}\{>,<\}. The purpose of the labels >> and << is to identify where the head of the tree is: in the example above the head is the leaf vertex number 88. In addition to these labels, one also considers a finite set of syntactic features X{N,V,A,P,C,T,D,}X\in\{N,V,A,P,C,T,D,\ldots\} and “selector” features denoted by the symbol σX\sigma X for a head selecting a phrase XPXP. More generally we can have labels that are strings (ordered finite sets) α=X0X1Xr\alpha=X_{0}X_{1}\cdots X_{r} of syntactic features as above. We also consider labels given by letters ω\omega and ω¯\bar{\omega} that stand for “licensor” and “licensee”. (Note: in [42] the notation =X=X is used for feature selector instead of our σX\sigma X, and the notation ±X\pm X is used for licensor/licensee pairs, but we prefer to avoid using mathematical notation with a meaning different from its generally accepted one, to avoid confusion, hence we will use the letters σ\sigma and ω,ω¯\omega,\bar{\omega} instead.)

We write the result of external merge applied to constituents α\alpha and β\beta as in [42], as the labelled planar binary rooted tree

\Tree

[ .<< α\alpha β\beta ]                rather than \Tree[ .α\alpha α\alpha β\beta ]

This means that we consider as generators of the Loday–Ronco Hopf algebras the binary trees with labels that include the syntactic features, as well as symbols << and >>, as in [42], used to point towards the head when merge is applied.

By (2.4) it is clear that the head of the tree T1>T2T_{1}\wedge_{>}T_{2} is the head of T2T_{2} and the head of the tree T1<T2T_{1}\wedge_{<}T_{2} is the head of T1T_{1}.

Recall also that a maximal projection in TT is a subtree of TT that is not a proper subtree of any larger subtree with the same head. As pointed out in [42], in the example illustrated above, the leaves {2,3,4}\{2,3,4\} determine a subtree with head vertex the leaf numbered 33, but any larger subtree in TT would have a different head, hence this subtree is a maximal projection. So is the subtree determined by the leaves {5,6}\{5,6\}, for instance.

We will write ling{\mathcal{H}}_{ling} for the Loday-Ronco Hopf algebra LR{\mathcal{H}}_{LR} with the choice of labels described above. We will write 𝒱ling{\mathcal{V}}_{ling} for the underlying {\mathbb{Q}}-vector space spanned by the binary trees with the labeling described above.

As we pointed out already, the choice of working with planar binary rooted trees corresponds to an assigned linear ordering of its leaves, which in turn is the linear ordering of the resulting sentence when spoken or read. Thus, by working with planar trees we are assuming that linear ordering is determined for the result of Merge. This is the same assumption made in [42]. This is the crucial point in the comparison with the new Minimalism, and we will discuss it more explicitly in Section 3 below.

2.6. Old formulation of External and Internal Merge

We now discuss the external and internal merge operations in the old formulations of minimalism. We first describe the operations at the level of the underlying combinatorial tree, and then we give the more precise constraints on when the operations can be applied, based on the labels of the trees involved.

Thus, the way we should view external and internal merge is as partially defined operations

:𝒱ling𝒱ling𝒱ling{\mathcal{E}}:{\mathcal{V}}_{ling}\otimes{\mathcal{V}}_{ling}\to{\mathcal{V}}_{ling}
:𝒱ling𝒱ling{\mathcal{I}}:{\mathcal{V}}_{ling}\to{\mathcal{V}}_{ling}

where we assume that the operations defined on generators are extended by (bi)linearity, so that they are defined on linear subspaces

Dom()𝒱ling𝒱ling{\rm Dom}({\mathcal{E}})\subset{\mathcal{V}}_{ling}\otimes{\mathcal{V}}_{ling}
Dom()𝒱ling{\rm Dom}({\mathcal{I}})\subset{\mathcal{V}}_{ling}

that we will describe more precisely below.

At the level of the underlying combinatorial trees external and internal merge are simply defined as the following operations.

The combinatorial structure of the external merge is given by

(2.8) (T1T2)={T2T1=T2T1otherwise,{\mathcal{E}}(T_{1}\otimes T_{2})=\left\{\begin{array}[]{ll}\bullet\wedge T_{2}&T_{1}=\bullet\\[8.53581pt] T_{2}\wedge T_{1}&\text{otherwise,}\end{array}\right.

where \bullet denotes the tree consisting of a single vertex, and the \wedge operation is the grafting operation d\wedge_{d} defined as in (2.3), where the specific label d{>,<}d\in\{>,<\} according to a rule that will be discussed more in detail in §2.7 below.

The combinatorial structure of the internal merge is given by

(2.9) (T)=πC(T)ρC(T),{\mathcal{I}}(T)=\pi_{C}(T)\wedge\rho_{C}(T),

where CC is an elementary admissible cut of TT with ρC(T)\rho_{C}(T) the remaining pruned tree that contains the root of TT and πC(T)\pi_{C}(T) the part that is severed by the cut (which in the case of an elementary cut is itself a tree rather than a forest). Again \wedge is as in (2.3), with a label that will be made more precise below, and the choice of the elementary cut will also depend on conditions on tree labels, see §2.8.

We now look more precisely at the structure of external and internal merge and at their domains of definition. It should be noted how working with planar structures and with conditions of matching/related labels makes the underlying mathematical operations more difficult to describe, as they are only defined on specific domains and with additional conditions.

2.7. Old form of External Merge

The more precise definition of the External Merge in the older forms of Miimalism is given as follows (where we are adapting the description of [42] to our terminology and notation).

As in [42], we use the notation T[α]T[\alpha] for a tree where the head is labelled by an ordered set of syntactic features starting with α\alpha. The label α\alpha consists of a string of syntactic features of the form

α=X0X1Xr or α=σX0X1Xr,\alpha=X_{0}X_{1}\cdots X_{r}\ \ \ \text{ or }\ \ \ \alpha=\sigma X_{0}X_{1}\cdots X_{r},

with σ\sigma the selector symbol.

We introduce the notation α^\hat{\alpha} for the string obtained from α\alpha after removing the first feature (other than the selection symbol). Namely, for α=X0X1Xr\alpha=X_{0}X_{1}\cdots X_{r} or α=σX0X1Xr\alpha=\sigma X_{0}X_{1}\cdots X_{r}, we have α^=X1Xr\hat{\alpha}=X_{1}\cdots X_{r}.

The external merge T=(T1[σα],T2[α])T={\mathcal{E}}(T_{1}[\sigma\alpha],T_{2}[\alpha]) of two trees T1[σα]T_{1}[\sigma\alpha] and T2[α]T_{2}[\alpha] is given by

(2.10) (T1[σα],T2[α])={T1[σα^]<T2[α^]|T1|=1T2[α^]>T1[σα^]|T1|>1.{\mathcal{E}}(T_{1}[\sigma\alpha],T_{2}[\alpha])=\left\{\begin{array}[]{ll}T_{1}[\widehat{\sigma\alpha}]\wedge_{<}T_{2}[\hat{\alpha}]&|T_{1}|=1\\ T_{2}[\hat{\alpha}]\wedge_{>}T_{1}[\widehat{\sigma\alpha}]&|T_{1}|>1.\end{array}\right.

This clearly agrees with (2.8) at the level of the underlying combinatorial trees. The main difference with (2.8) is that here the operation can be performed only if the heads are decorated with syntactic features σα\sigma\alpha and α\alpha, respectively. Thus the domain of definition Dom()𝒱ling𝒱ling{\rm Dom}({\mathcal{E}})\subset{\mathcal{V}}_{ling}\otimes{\mathcal{V}}_{ling} of the external merge operation is the vector space on the set of generators

(2.11) Dom()=span{(T1[β],T2[α])|β=σα}.{\rm Dom}({\mathcal{E}})={\rm span}_{\mathbb{Q}}\{(T_{1}[\beta],T_{2}[\alpha])\,|\,\beta=\sigma\alpha\}.

2.8. Old version of Internal Merge

In the older formulations of Minimalism, one considers a second Merge operation, considered as distinct from External Merge, namely the Internal Merge. We again adapt to our notation and terminology the formulation given in [42]. We use the same notation as above for T[α]T[\alpha] with α=X0X1Xr\alpha=X_{0}X_{1}\ldots X_{r} or α=σX0X1Xr\alpha=\sigma X_{0}X_{1}\ldots X_{r} a string of syntactic features possibly starting with a selector, and we write α^=X1Xr\hat{\alpha}=X_{1}\ldots X_{r}. The internal merge is again modeled on the basic grafting operation of planar rooted binary trees d\wedge_{d}, but this time its input is a single tree T[α]T[\alpha].

Given a planar binary rooted tree TT containing a subtree T1T_{1}, and given another binary rooted tree T2T_{2}, we denote as in [42] by T{T1T2}T\{T_{1}\to T_{2}\} the planar binary rooted tree obtained by removing the subtree T1T_{1} from TT and replacing it with T2T_{2}. In particular, we write T{T1}T\{T_{1}\to\emptyset\} for the tree obtained by removing the subtree T1T_{1} from TT. Note that, in order to ensure that the result of this operation is still a tree, we need to assume that if an internal vertex of TT belongs to the subtree T1T_{1}, then all oriented paths in TT from this vertex to a leaf also belong to T1T_{1}, where paths in the tree are oriented from root to leaves. We only consider subtrees that have this property. When it is necessary to stress this fact, we will refer to such subtrees as complete subtrees. Note that these are exactly the subtrees that can be obtained by applying a single elementary admissible cut CC to the tree TT. The tree T{T1}T\{T_{1}\to\emptyset\} is then the same as the tree ρC(T)\rho_{C}(T).

Consider a tree T[α]T[\alpha] where α=X0Xr\alpha=X_{0}\cdots X_{r} or α=σX0Xr\alpha=\sigma X_{0}\cdots X_{r} or α=ωX0Xr\alpha=\omega X_{0}\cdots X_{r} or α=ω¯X0Xr\alpha=\bar{\omega}X_{0}\cdots X_{r}.

The domain Dom()𝒱ling{\rm Dom}({\mathcal{I}})\subset{\mathcal{V}}_{ling} of the internal merge is given by the subspace

(2.12) Dom()=span{T[α]|T1[β]T[α],with β=ω¯X0β^,α=ωX0α^},{\rm Dom}({\mathcal{I}})={\rm span}_{\mathbb{Q}}\{T[\alpha]\,|\,\exists T_{1}[\beta]\subset T[\alpha],\text{with }\beta=\bar{\omega}X_{0}\hat{\beta},\,\alpha=\omega X_{0}\hat{\alpha}\}\,,

where T1TT_{1}\subset T is a subtree (in the sense specified above), and

(2.13) (T[α])=T1M[β^]>T{T1[β]M}=πC(T)>ρC(T),{\mathcal{I}}(T[\alpha])=T_{1}^{M}[\hat{\beta}]\wedge_{>}T\{T_{1}[\beta]^{M}\to\emptyset\}=\pi_{C}(T)\wedge_{>}\rho_{C}(T),

where CC is the elementary admissible cut specified by the subtree T1MT_{1}^{M} (the maximal projection of the head of T1T_{1}) and the condition (given by the matching labels ωX0\omega X_{0} and ω¯X0\bar{\omega}X_{0}) that T[α]Dom()T[\alpha]\in{\rm Dom}({\mathcal{I}}). The head of πC(T)>ρC(T)\pi_{C}(T)\wedge_{>}\rho_{C}(T) gets the label α^\hat{\alpha}.

Stabler’s External and Internal Merge, that we recalled here in (2.10) and (2.13), have issues at the linguistics level, such as problems with potentially unlabelable exocentric constructions. For example (as observed by Riny Huijbregts) Internal Merge as in (2.13) gives {XP,YP}\{XP,YP\} results, and labeling cannot be assigned, unless a further application of Internal Merge takes XPXP or YPYP in a criterial position, where structural agreement, SPEC-Head agreement, can take place between specifier and head. In [42], two further variants of External and Internal Merge are introduced in order to deal with “persistent features”. This results in what is referred to in [42] as “conflated minimalist grammars” (CMGs). These additional forms of External and Internal Merge, however, do not solve the problem mentioned above and further complicate the structure, so we will focus here on analyzing the algebraic structure determined by just the forms (2.10) and (2.13) of Stabler’s External and Internal Merge.

2.9. Old version of Internal Merge and the coalgebra structure

As we have discussed above, the internal merge is only partially defined on 𝒱ling{\mathcal{V}}_{ling} because it requires the existence of matching conditions on the label, which determine the domain Dom()𝒱ling{\rm Dom}({\mathcal{I}})\subset{\mathcal{V}}_{ling}. We now discuss now how the domain Dom(){\rm Dom}({\mathcal{I}}) and the internal merge operation {\mathcal{I}} behave with respect to the coproduct of the Hopf algebra ling{\mathcal{H}}_{ling}.

Consider the coproduct Δ(T)\Delta(T) of a tree TDom()T\in{\rm Dom}({\mathcal{I}}). This is given by the sum of decompositions Δ(T)=TT′′\Delta(T)=\sum T^{\prime}\otimes T^{\prime\prime} as illustrated in (2.5). In each of these decompositions one side will contain the head of the tree TT (or possibly both sides if the leaf used to cut the tree happens to be also the head).

If both the head of TT and the head of πC(T)\pi_{C}(T) are contained in the same side of the partition, then that side still belongs to Dom(){\rm Dom}({\mathcal{I}}). Thus, these pieces of the coproduct are in Dom()ling+lingDom(){\rm Dom}({\mathcal{I}})\otimes{\mathcal{H}}_{ling}+{\mathcal{H}}_{ling}\otimes{\rm Dom}({\mathcal{I}}). The remaining terms, with the heads of TT of πC(T)\pi_{C}(T) on different sides of the decomposition only belong in general to lingling{\mathcal{H}}_{ling}\otimes{\mathcal{H}}_{ling} and fall outside of the domain of the internal merge.

This suggests a modification of the coproduct on ling{\mathcal{H}}_{ling} with respect to the usual Loday-Ronco coproduct (2.5). For a generator TT that is not in Dom(){\rm Dom}({\mathcal{I}}) we just set πC(T)=T\pi_{C}(T)=T, that is, no cut is performed.

For TDom()T\in{\rm Dom}({\mathcal{I}}), let h(T)h(T) and h(πC(T))h(\pi_{C}(T)) be the leaves of TT that are the head of TT and the head of πC(T)\pi_{C}(T), respectively. If TT is in Dom(){\rm Dom}({\mathcal{I}}) then CC is, as above, the elementary admissible cut determined by the label conditions. Let 𝒫(T){\mathcal{P}}_{\mathcal{I}}(T) denote the set of bipartitions

(2.14) 𝒫(T)={T=(T,T′′)|(h(T)T and h(πC(T))T) or (h(T)T′′ and h(πC(T))T′′)}.{\mathcal{P}}_{\mathcal{I}}(T)=\{T=(T^{\prime},T^{\prime\prime})\,|\,(h(T)\in T^{\prime}\text{ and }h(\pi_{C}(T))\in T^{\prime})\text{ or }(h(T)\in T^{\prime\prime}\text{ and }h(\pi_{C}(T))\in T^{\prime\prime})\}\,.

For trees outside the domain Dom(){\rm Dom}({\mathcal{I}}) the set 𝒫(T){\mathcal{P}}_{\mathcal{I}}(T) just counts all bipartitions as we have set πC(T)=T\pi_{C}(T)=T.

We then define the modified coproduct

(2.15) Δ(T):=(T,T′′)𝒫(T)TT′′,\Delta_{\mathcal{I}}(T):=\sum_{(T^{\prime},T^{\prime\prime})\in{\mathcal{P}}_{\mathcal{I}}(T)}T^{\prime}\otimes T^{\prime\prime},

which is the same as (2.5) outside of Dom(){\rm Dom}({\mathcal{I}}).

With this modified coproduct Dom(){\rm Dom}({\mathcal{I}}) is a coideal of the coalgebra ling{\mathcal{H}}_{ling}, namely

(2.16) Δ(Dom())Dom()ling+lingDom().\Delta_{\mathcal{I}}({\rm Dom}({\mathcal{I}}))\subset{\rm Dom}({\mathcal{I}})\otimes{\mathcal{H}}_{ling}+{\mathcal{H}}_{ling}\otimes{\rm Dom}({\mathcal{I}})\,.

2.10. Old version of Internal Merge and the algebra structure

We now discuss the behavior of internal merge with respect to the product structure of ling{\mathcal{H}}_{ling}. In this case, arguing in a similar way as for the coproduct above, there is a modification \star_{\mathcal{I}} of the original product of (LR,)({\mathcal{H}}_{LR},\star) that remains the same outside of Dom(){\rm Dom}({\mathcal{I}}) and takes into account the additional data in Dom(){\rm Dom}({\mathcal{I}}) and that has the effect of making Dom(){\rm Dom}({\mathcal{I}}) into a right ideal with respect to the algebra (ling,)({\mathcal{H}}_{ling},\star_{\mathcal{I}}). However, Dom(){\rm Dom}({\mathcal{I}}) is not a left ideal.

We define the modified product \star_{\mathcal{I}} on ling{\mathcal{H}}_{ling} in the following way. Let h(T)h(T) denote the head of TT. Given trees T,TT,T^{\prime} where TT^{\prime} has n+1n+1 leaves, consider the set 𝒫(T,T){\mathcal{P}}_{\mathcal{I}}(T,T^{\prime}) given by decompositions (T0,,Tn)(T_{0},\ldots,T_{n}) of TT as above with

(2.17) 𝒫(T,T):={(T0,,Tn)|h(T) and h(πC(T))Th(T)}{\mathcal{P}}_{\mathcal{I}}(T,T^{\prime}):=\{(T_{0},\ldots,T_{n})\,|\,h(T)\text{ and }h(\pi_{C}(T))\in T_{h(T^{\prime})}\}\,

Namely those decompositions (T0,,Tn)(T_{0},\ldots,T_{n}) with the property that the head of TT (and the head of πC(T)\pi_{C}(T) in the case where TDom()T\in{\rm Dom}({\mathcal{I}})) lies in the component Th(T)T_{h(T^{\prime})} that is grafted to the head of TT^{\prime}. In the case of TDom()T\notin{\rm Dom}({\mathcal{I}}) the condition reduces to just h(T)Th(T)h(T)\in T_{h(T^{\prime})}. Note that it is always possible to have such decompositions, as some of the components TiT_{i} can always be taken to be copies of just the path from a leaf to the root, so that the relevant piece of the decomposition can be placed at the h(T)h(T^{\prime})-position.

We then take the product \star_{\mathcal{I}} on ling{\mathcal{H}}_{ling} of the form

(2.18) TT=(T0,,Tn)𝒫(T,T)γ(T0,,Tn;T).T\star_{\mathcal{I}}T^{\prime}=\sum_{(T_{0},\ldots,T_{n})\in{\mathcal{P}}_{\mathcal{I}}(T,T^{\prime})}\gamma(T_{0},\ldots,T_{n};T^{\prime})\,.

The head of each γ(T0,,Tn;T)\gamma(T_{0},\ldots,T_{n};T^{\prime}) is then the same as the head of TT, which we write in shorthand as h(TT)=h(T)h(T\star_{\mathcal{I}}T^{\prime})=h(T). Moreover, by construction the component Th(T)T_{h(T^{\prime})} is in Dom(){\rm Dom}({\mathcal{I}}) when TDom()T\in{\rm Dom}({\mathcal{I}}), so that TTT\star_{\mathcal{I}}T^{\prime} is itself in Dom(){\rm Dom}({\mathcal{I}}). This shows that Dom(){\rm Dom}({\mathcal{I}}) is a right ideal with respect to the algebra (ling,)({\mathcal{H}}_{ling},\star_{\mathcal{I}}),

Dom()lingDom().{\rm Dom}({\mathcal{I}})\,\star_{\mathcal{I}}\,{\mathcal{H}}_{ling}\subset{\rm Dom}({\mathcal{I}})\,.

It is, however, not a left ideal.

For TDom()T\in{\rm Dom}({\mathcal{I}}) we then have

(2.19) (TT)=(T0,,Tn)𝒫(T,T)πC(Th(T))>γ(T0,,ρC(Th(T)),,Tn;T).{\mathcal{I}}(T\star_{\mathcal{I}}T^{\prime})=\sum_{(T_{0},\ldots,T_{n})\in{\mathcal{P}}_{\mathcal{I}}(T,T^{\prime})}\pi_{C}(T_{h(T^{\prime})})\wedge_{>}\gamma(T_{0},\ldots,\rho_{C}(T_{h(T^{\prime})}),\ldots,T_{n};T^{\prime})\,.

Combining the behavior with respect to the product with the previous observation about the coproduct we conclude that the internal merge {\mathcal{I}} defines a right (ling,)({\mathcal{H}}_{ling},\star_{\mathcal{I}})-module given by the cosets

:=Dom()\ling{\mathcal{M}}_{\mathcal{I}}:={\rm Dom}({\mathcal{I}})\backslash{\mathcal{H}}_{ling}

where {\mathcal{M}}_{\mathcal{I}} is also a coalgebra with the coproduct induced by (ling,Δ)({\mathcal{H}}_{ling},\Delta_{\mathcal{I}}).

2.11. Iterated Internal Merge

In fact, the construction above can be extended to a nested family of right-ideal coideals, given by the domains of the iterations of the internal merge, 𝒟N+1𝒟N{\mathcal{D}}_{N+1}\subset{\mathcal{D}}_{N} with

𝒟N:=Dom(N).{\mathcal{D}}_{N}:={\rm Dom}({\mathcal{I}}^{N})\,.

When we consider repeated application of NN internal merge operations, starting from a given tree T[α]T[\alpha], where α\alpha is a finite list of features each of which can be either XiX_{i} or ωXi\omega X_{i} of ω¯Xi\bar{\omega}X_{i} as above, we need to assume the following conditions

  1. (1)

    There are NN subtrees T1,,TNT_{1},\ldots,T_{N} in TT, each of which is a complete subtree, in the sense specified above.

  2. (2)

    Let T1M,,TNMT_{1}^{M},\ldots,T_{N}^{M} be the maximal projections of the subtrees, in the sense recalled above. These are also complete subtrees of TT.

  3. (3)

    The subtrees TiMT_{i}^{M} are disjoint.

In addition to these combinatorial conditions, we have a matching labels condition that specifies the domain of the composite operation. Let T[α]T[\alpha] be the given tree as a labelled tree, and let T1[β(1)],,TN[β(N)]T_{1}[\beta^{(1)}],\ldots,T_{N}[\beta^{(N)}] be the labelled subtrees, with the conditions listed above. The domain of the composite internal merge is then given by

(2.20) Dom(N)={T[α]|T1[β(1)],,TN[β(N)] with {(1), (2), (3) are satisfiedβ0(1)=ω¯X0,,β0(N)=ω¯XN1α=ωX0ωX1ωXN1}}.{\rm Dom}({\mathcal{I}}^{N})=\left\{T[\alpha]\,\big{|}\,\exists T_{1}[\beta^{(1)}],\ldots,T_{N}[\beta^{(N)}]\,\text{ with }\left\{\begin{array}[]{l}\text{(1), (2), (3) are satisfied}\\ \beta^{(1)}_{0}=\bar{\omega}X_{0},\ldots,\beta^{(N)}_{0}=\bar{\omega}X_{N-1}\\ \alpha=\omega X_{0}\omega X_{1}\cdots\omega X_{N-1}\cdots\end{array}\right\}\right\}.

For a forest F=T1TF=T_{1}\cdots T_{\ell}, we define a grafting operation \wedge^{\ell} that consists of the repeated application of the grafting \wedge to the trees in FF,

(2.21) F=T1T2T.\bigwedge^{\ell}F=T_{1}\wedge T_{2}\wedge\cdots\wedge T_{\ell}.

The conditions (1), (2), and (3) above ensure that the choice of the subtrees T1M,,TNMT_{1}^{M},\ldots,T_{N}^{M} corresponds to an admissible cut CC of the tree TT, with the number of cut branches #C=N\#C=N The planar binary rooted tree obtained by this operation is then

(2.22) #C(T[X])=1+#C(πC(T)[𝕐^]ρC(T)[X^N]),{\mathcal{I}}^{\#C}(T[X])=\bigwedge^{1+\#C}\left(\pi_{C}(T)[\hat{\mathbb{Y}}]\,\,\rho_{C}(T)[\hat{X}^{N}]\right),

where we use the notation πC(T)[𝕐^]\pi_{C}(T)[\hat{\mathbb{Y}}] for the forest πC(T)=TNMT1M\pi_{C}(T)=T_{N}^{M}\cdots T_{1}^{M}, where the label [𝕐^][\hat{\mathbb{Y}}] means

πC(T)[𝕐^]=TNM[β^(N)]T1M[β^(1)]\pi_{C}(T)[\hat{\mathbb{Y}}]=T_{N}^{M}[\hat{\beta}^{(N)}]\cdots T_{1}^{M}[\hat{\beta}^{(1)}]

and the label [α^N][\hat{\alpha}^{N}] of the tree ρC(T)\rho_{C}(T) stands for what remains of the original label XX after the initial terms ωX0ωX1ωXN1\omega X_{0}\omega X_{1}\cdots\omega X_{N-1} are removed.

Each 𝒟N+1\𝒟N{\mathcal{D}}_{N+1}\backslash{\mathcal{D}}_{N} determines a coideal in the coalgebra 𝒟N+1\ling{\mathcal{D}}_{N+1}\backslash{\mathcal{H}}_{ling} and this gives a projective system of right-module coalgebras

N:=Dom(N)\ling.{\mathcal{M}}_{{\mathcal{I}}^{N}}:={\rm Dom}({\mathcal{I}}^{N})\backslash{\mathcal{H}}_{ling}\,.

2.12. Coideals, recursive structures, and symmetries

In the Minimalist Model of syntax, the Merge operator is the main mechanism for the construction of recursive and hierarchical structures. So indeed it is not surprising that the Hopf-algebraic formalism occurs naturally in this context, as it is known to describe similar hierarchical structures (such as Feynman graphs) in fundamental physics. In the applications of Hopf algebras to physics, especially to renormalization in perturbative quantum field theories, Hopf ideals are closely related to the recursive implementation of symmetries (Ward identities) and the recursive construction of solutions to equations of motion (Dyson-Schwinger equations), [19], [45].

We have discussed in our previous paper [37] how the algebraic formulation of the new Minimalism closely resembles the mathematical structure of Dyson-Schwinger equations in physics. However, the analogous mathematical structure describing Merge in the old forms of Minimalism differs significantly from that basic fundamental formulation we have in the case of the new Minimalism.

The main difference, as shown above, lies in the intrinsic asymmetry of the old form of the Merge operation, which only gives rise to a weaker structure than Hopf ideals, namely the right-ideal coideals discussed in §2.10. Thus, instead of a quotient Hopf algebra as in the case of implementation of symmetries in quantum field theory, one only obtains a quotient right-module coalgebra.

Such objects, quotient right-module coalgebras, sometimes referred to as “generalized quotients” of Hopf algebras, do provide a suitable notion of quotients in the case of noncommutative Hopf algebras, [38], [43]. They are also studied in the context of the theory of Hopf–Galois extensions, designed to provide good geometric analogs in noncommutative geometry of principal bundles and torsors in ordinary geometry, see for instance [41]. However, the type of structure involved, in order to accommodate the requirement of working with planar trees, becomes significantly more complicated than in the case of free symmetric Merge of the new Minimalism.

We can think of the right-module coalgebras N{\mathcal{M}}_{{\mathcal{I}}^{N}} as a family of geometric spaces (in a noncommutative sense) that implement the recursive structures of syntax generated by the Merge operation, in a sense similar to how quotients by Hopf ideals in physics recursively implement the gauge symmetries or the recursively constructed solutions to the equations of motion. This, however, is a significantly weaker (and at the same time significantly more complicated) algebraic structure than the one we see occurring in the newer version of Minimalism, as shown in [37] and discussed in §3 below.

2.13. Partial operated algebra

We now discuss more in detail the mathematical structure of the old formulation of External Merge, and we show how it creates an independent structure of a very different nature from Internal Merge discussed in §2.10. These very different mathematical formulations of Internal and External Merge are another significant drawback of the older Minimalism in comparison with the newer.

We can consider the data (ling,,Dom(),)({\mathcal{H}}_{ling},\star_{\mathcal{I}},{\rm Dom}({\mathcal{I}}),{\mathcal{I}}) discussed above as a generalization of the notion of operated algebra (see [24], [46]) where one considers data (𝒜,,F)({\mathcal{A}},\star,F) of an algebra together with a linear operator F:𝒜𝒜F:{\mathcal{A}}\to{\mathcal{A}}. Here we allow for the case where the linear operator F:Dom(F)𝒜F:{\rm Dom}(F)\to{\mathcal{A}} is only defined on a smaller domain Dom(F)𝒜{\rm Dom}(F)\subset{\mathcal{A}} which is a right ideal of 𝒜{\mathcal{A}}. We can refer to this structure (ling,,Dom(),)({\mathcal{H}}_{ling},\star_{\mathcal{I}},{\rm Dom}({\mathcal{I}}),{\mathcal{I}}) as a partial operated algebra.

The setting of operated algebras was introduced by Rota [40] as a way of formalizing various instances of linear operators F:𝒜𝒜F:{\mathcal{A}}\to{\mathcal{A}} on an algebra that satisfy polynomial constraints. The simplest example of such polynomial constraints is the identity F(ab)=F(a)F(b)F(a\star b)=F(a)\star F(b) that makes FF an actual algebra homomorphism. Another example is the Leibniz rule constraint F(ab)=aF(b)+F(a)bF(a\star b)=a\star F(b)+F(a)\star b that makes FF a derivation. Other interesting polynomial constraints are Rota–Baxter relations, F(a)F(b)=F(aF(b))+F(F(a)b)+λF(ab)F(a)\star F(b)=F(a\star F(b))+F(F(a)\star b)+\lambda F(a\star b), which depending on the value of the parameter λ\lambda can represent various types of operations such as integration by parts, or extraction of the polar part of a Laurent series, and play an important role in the Hopf algebra formulation of renormalization in physics.

In the case of the internal merge we considered above, not only the linear operator is only partially defined as :Dom()ling{\mathcal{I}}:{\rm Dom}({\mathcal{I}})\to{\mathcal{H}}_{ling}, but the identity (2.19) that expresses the internal merge of a product is not directly of a simple poynomial form as in Rota’s operated algebra program. However, it appears to be an interesting question whether the type of relation (2.19) can be accommodated in terms of the approach to Rota’s program via term rewriting systems (in the sense of [2]) developed in [22], [24].

2.14. Old version of External merge and the Hopf algebra structure

We now focus on the binary operation {\mathcal{E}} of External Merge, defined on the subdomain Dom()lingling{\rm Dom}({\mathcal{E}})\subset{\mathcal{H}}_{ling}\otimes{\mathcal{H}}_{ling} described in (2.11). The structure of external merge is simpler than the internal merge discussed above, and closely related to the usual Hochschild cocycle that defines the grafting operations in the Hopf-algebraic construction of Dyson-Schwinger equations in physics.

The “operated algebra” viewpoint we discussed briefly above is especially useful in analyzing the external merge, along the lines of extending the notion of operated algebra from unitary to binary operations discussed in [46].

We recall from [46] the notion of operated algebra based on binary operations. These structure is called a Ω\vee_{\Omega}-algebra in [46]. In our notation we use \wedge instead of \vee for the grafting of binary trees used in the merge operation, following the convention of writing syntactic parsing trees with the root at the top and the leaves at the bottom. So we are going to refer to this structure as Ω\wedge_{\Omega}-algebra.

A Ω\wedge_{\Omega}-algebra is an algebra (𝒜,)({\mathcal{A}},\star) together with a family of binary operations {α}αΩ\{\wedge_{\alpha}\}_{\alpha\in\Omega} satisfying the identity

(2.23) ab=a1α(a2b)+(ab1)αb2,a\star b=a_{1}\wedge_{\alpha}(a_{2}\star b)+(a\star b_{1})\wedge_{\alpha}b_{2}\,,

for a=a1αa2a=a_{1}\wedge_{\alpha}a_{2} and b=b1αb2b=b_{1}\wedge_{\alpha}b_{2}. A Ω\wedge_{\Omega}-bialgebra (or Hopf algebra) (,Δ,,Ω)({\mathcal{H}},\Delta,\star,\wedge_{\Omega}) is a bialgebra (or Hopf algebra) that is also a Ω\wedge_{\Omega}-algebra. It is a cocycle Ω\wedge_{\Omega}-Hopf algebra if the binary operations Ω\wedge_{\Omega} also satisfy the cocycle identity

(2.24) Δ(aαb)=(aαb)1+(α)τ(Δ(a)Δ(b)),\Delta(a\wedge_{\alpha}b)=(a\wedge_{\alpha}b)\otimes 1+(\star\otimes\wedge_{\alpha})\circ\tau(\Delta(a)\otimes\Delta(b))\,,

where τ:\tau:{\mathcal{H}}\otimes{\mathcal{H}}\otimes{\mathcal{H}}\otimes{\mathcal{H}}\to{\mathcal{H}}\otimes{\mathcal{H}}\otimes{\mathcal{H}}\otimes{\mathcal{H}} is the permutation that exchanges the two middle factors.

It is shown in [46] that the Loday-Ronco Hopf algebra LR{\mathcal{H}}_{LR} of planar binary rooted trees with the binary operations α\wedge_{\alpha} that graft two trees T1,T2T_{1},T_{2} as the left and right subtrees at a new binary root with label α\alpha is a cocyle Ω\wedge_{\Omega}-Hopf algebra. The range of α\wedge_{\alpha} is a coideal in LR{\mathcal{H}}_{LR}. It is also shown in [46] that the Ω\wedge_{\Omega}-Hopf algebra LR{\mathcal{H}}_{LR} is the free cocycle Ω\wedge_{\Omega}-Hopf algebra, that is, the initial object in the category of cocycle Ω\wedge_{\Omega}-Hopf algebras.

The external merge operation is modelled on the grafting operations α\wedge_{\alpha} of the Loday-Ronco Hopf algebra, with labels α{<,>}\alpha\in\{<,>\}. The main differences are that the external merge inverts the order when the first tree is nontrivial,

(T1,T2)=T2>T1 for T1{\mathcal{E}}(T_{1},T_{2})=T_{2}\wedge_{>}T_{1}\ \ \ \text{ for }T_{1}\neq\bullet
(,T2)=<T2{\mathcal{E}}(\bullet,T_{2})=\bullet\wedge_{<}T_{2}

and the fact that the operation is only defined on a domain Dom()lingling{\rm Dom}({\mathcal{E}})\subset{\mathcal{H}}_{ling}\otimes{\mathcal{H}}_{ling} of pairs of trees (T1[β],T2[α])(T_{1}[\beta],T_{2}[\alpha]) with matching labels β=σα\beta=\sigma\alpha.

Both the α\wedge_{\alpha}-algebra identity (2.23) and the cocycle identity (2.24) are still satisfied, whenever all the terms involved are in the domain of the merge operator. This leads naturally to consider external merge as a structure of partially defined cocycle Ω\wedge_{\Omega}-Hopf algebra.

Namely, a partially defined cocycle Ω\wedge_{\Omega}-bialgebra (or Hopf algebra) is a bialgebra (or Hopf algebra) (,Δ,)({\mathcal{H}},\Delta,\star) together with a family {α}αΩ\{\wedge_{\alpha}\}_{\alpha\in\Omega} of binary operations defined on linear subspaces Dom(α){\rm Dom}(\wedge_{\alpha})\subset{\mathcal{H}}\otimes{\mathcal{H}},

α:Dom(α)\wedge_{\alpha}:{\rm Dom}(\wedge_{\alpha})\hookrightarrow{\mathcal{H}}\otimes{\mathcal{H}}\to{\mathcal{H}}

that satisfy (2.23) and (2.24), whenever all the terms are in Dom(α){\rm Dom}(\wedge_{\alpha}).

The Hopf algebra ling{\mathcal{H}}_{ling} with the external merge operator {\mathcal{E}} is then a partially defined cocycle Ω\wedge_{\Omega}-Hopf algebra.

3. New Minimalism, Merge, magmas and Hopf algebras

The new version of Minimalism, as presented in [10], [11], [13], and in our mathematical formulation in [37], avoids all the complications arising in the previous formulation, both coming from the presence of planar structures and from the need for labeling and projection. By delegating these aspects to a later externalization procedure, the core computational mechanism of Merge becomes very transparent and simple from the mathematical perspective and no longer leads to very different structures underlying Internal and External Merge. The formulation is also no longer plagued by the problem of partially defined operations and corresponding domains.

The whole mathematical structure of Merge in the new version of Minimalism is discussed in detail in our paper [37], including a proposed model for externalization. We summarize it here for direct comparison with the old Minimalism discussed in the previous section.

3.1. The core computational structure

In the new version of Minimalism, one starts with a core computational structure, which is the recursive construction of binary rooted trees, where trees are now just abstract trees, not endowed with planar structure. This structure is described mathematically in the following way.

The set 𝔗{\mathfrak{T}} of finite binary rooted trees without planar structure is obtained as the free non-associative commutative magma whose elements are the balanced bracketed expressions in a single variable xx, with the binary operation (binary set formation)

(3.1) (α,β)𝔐(α,β)={α,β}(\alpha,\beta)\mapsto{\mathfrak{M}}(\alpha,\beta)=\{\alpha,\beta\}\,

where α,β\alpha,\beta are two such balanced bracketed expressions.

This description of the generative process of binary rooted trees is immediate from the identification of these trees with balanced bracketed expressions in a single variable xx, such as

{{x{xx}}x}\Tree[[x[xx]]x].\{\{x\{xx\}\}x\}\longleftrightarrow\Tree[[x[xx]]x]\ \ \,.

Under this identification, the binary operation of the magma takes two rooted binary trees and attaches the roots to a new common root

(3.2) (T,T)𝔐(T,T)=\Tree[TT].(T,T^{\prime})\mapsto{\mathfrak{M}}(T,T^{\prime})=\Tree[TT^{\prime}]\,\,\,.

If one takes the {\mathbb{Q}}-vector space 𝒱(𝔗){\mathcal{V}}({\mathfrak{T}}) spanned by the set 𝔗{\mathfrak{T}}, the magma operation 𝔐{\mathfrak{M}} on 𝔗{\mathfrak{T}} induces on 𝒱(𝔗){\mathcal{V}}({\mathfrak{T}}) the structure of an algebra. More precisely, 𝒱(𝔗){\mathcal{V}}({\mathfrak{T}}) is the free commutative non-associative algebra generated by a single variable xx, or equivalently the free algebra over the quadratic operad freely generated by the single commutative binary operation 𝔐{\mathfrak{M}} (see [27]).

One can assign a grading (a weight, measuring the size) to the binary rooted trees by the number of leaves, =#L(T)\ell=\#L(T), so that one can decompose the vector space 𝒱(𝔗)=𝒱(𝔗){\mathcal{V}}({\mathfrak{T}})=\oplus_{\ell}{\mathcal{V}}({\mathfrak{T}})_{\ell} into its graded components (the sub-vector-spaces spanned by the trees of weight \ell).

As discussed in [37], this is a very natural and simple mathematical object, and the generative process for the binary rooted trees can then be seen as the simplest and most fundamental possible case of recursive solution of a fixed point problem, of the kind that is known in physics as Dyson–Schwinger equation.

Indeed, one can view the recursive construction of binary rooted trees through repeated application of the Merge operation (3.1) as the recursive construction of a solution to the fixed point equation

(3.3) X=𝔐(X,X),X={\mathfrak{M}}(X,X)\,,

where X=XX=\sum_{\ell}X_{\ell} is a formal infinite sum of variables XX_{\ell} in 𝒱(𝔗){\mathcal{V}}({\mathfrak{T}})_{\ell}, with \ell the grading by number of leaves. The problem (3.3) can be solved recursively by degrees, with

Xn=𝔐(X,X)n=j=1n1𝔐(Xj,Xnj),X_{n}={\mathfrak{M}}(X,X)_{n}=\sum_{j=1}^{n-1}{\mathfrak{M}}(X_{j},X_{n-j})\,,

with solution X1=xX_{1}=x, X2={xx}X_{2}=\{xx\}, X3={x{xx}}+{{xx}x}=2{x{xx}}X_{3}=\{x\{xx\}\}+\{\{xx\}x\}=2\{x\{xx\}\}, X4=2{x{x{xx}}}+{{xx}{xx}}X_{4}=2\{x\{x\{xx\}\}\}+\{\{xx\}\{xx\}\}, and so on. The coefficients with which the trees occur in these solutions count the different possible choices of planar embeddings.

The equation (3.3) is the simplest case (with the simplest quadratic polynomial) of the more general form of Dyson–Schwinger equations for rooted trees X=𝔅(P(X))X={\mathfrak{B}}(P(X)) where P(X)P(X) is a polynomial in the formal variable XX and 𝔅{\mathfrak{B}} (usually called B+B^{+} in the physics literature) is the operation that takes a collection of rooted trees T1,T2,,TnT_{1},T_{2},\ldots,T_{n}, seen as the forest F=T1T2TnF=T_{1}\sqcup T_{2}\sqcup\ldots\sqcup T_{n}, which appears as a monomial in P(X)P(X), and constructs a new tree by attaching all the roots of the TiT_{i}’s to a single new root (an nn-ary Merge),

(3.4) 𝔅:T1T2Tn\Tree[T1T2Tn].{\mathfrak{B}}:T_{1}\sqcup T_{2}\sqcup\ldots\sqcup T_{n}\mapsto\Tree[T_{1}T_{2}\cdots T_{n}]\,.

For a detailed discussion of the role of this operator in physics see, for instance, [19], [45]. The case where P(X)P(X) has a single quadratic term gives the binary Merge operation in its core generative process.

3.2. Syntactic objects

The core computational structure of §3.1 introduces Merge in the most basic form of binary set formation (3.1). This can then be extended to the generative process that gives rise to syntactic objects.

One starts with an assigned set 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0} of lexical items and syntactic features such as N,V,AN,V,A, P,C,T,D,P,C,T,D,\ldots

The set 𝒮𝒪{\mathcal{S}}{\mathcal{O}} of syntactic objects is then identified with the set

(3.5) 𝒮𝒪𝔗𝒮𝒪0{\mathcal{S}}{\mathcal{O}}\simeq{\mathfrak{T}}_{{\mathcal{S}}{\mathcal{O}}_{0}}

of finite binary rooted trees with no assigned planar embedding, and with leaves labelled by elements of 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0}. Just as the set of binary rooted trees with no labeling of leaves has a magma structure with the binary set formation operation 𝔐{\mathfrak{M}} of (3.1), the set of syntactic objects also has a magma structure, namely it is the free, non-associative, commutative magma over the set 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0},

(3.6) 𝒮𝒪=Magmana,c(𝒮𝒪0,𝔐),{\mathcal{S}}{\mathcal{O}}={\rm Magma}_{na,c}({\mathcal{S}}{\mathcal{O}}_{0},{\mathfrak{M}})\,,

with the binary Merge operation 𝔐{\mathfrak{M}} defined as in (3.2) for pairs of trees with labelled leaves.

Note that the description as elements of the magma Magmana,c(𝒮𝒪0,𝔐){\rm Magma}_{na,c}({\mathcal{S}}{\mathcal{O}}_{0},{\mathfrak{M}}) is what is usually referred in the linguistics setting (see [10], [11]) as the description in terms of sets (in fact multisets) rather than trees, since one expresses the elements of the magma in terms of (multi)sets with brackets corresponding to the Merge operations, rather than representing them in tree form, such as

{a,{{b,c},d}}\Tree[a[[bc]d]],\{a,\{\{b,c\},d\}\}\leftrightarrow\Tree[a[[bc]d]]\,,

where the tree on the right should not be considered as planar. As sets

𝔗𝒮𝒪0Magmana,c(𝒮𝒪0,𝔐){\mathfrak{T}}_{{\mathcal{S}}{\mathcal{O}}_{0}}\simeq{\rm Magma}_{na,c}({\mathcal{S}}{\mathcal{O}}_{0},{\mathfrak{M}})

are in bijection, hence both descriptions are equivalent, but the description as magma is preferred in linguistics as it emphasizes the generative process. The description in terms of magma elements, as in the left-hand-side of the example above, also avoids confusion as to whether the trees have a planar embedding or not: working with (multi)sets rather than with lists clearly means that no planarity is assumed. We will adopt here the tree notation, just because certain mathematical operation we will be using have a simpler and more immediately visualizable description in terms of trees rather than in terms of the corresponding multisets.

In both the core magma (𝔗,𝔐)({\mathfrak{T}},{\mathfrak{M}}) and in the magma (3.6) of syntactic objects, one can introduce a multiplicative unit 11 satisfying 𝔐(T,1)=T=𝔐(1,T){\mathfrak{M}}(T,1)=T={\mathfrak{M}}(1,T) for all trees, by formally adding a trivial (empty) tree.

3.3. Workspaces

In the new formulation of Minimalism, the Merge operation acts on workspaces, consisting of material (lexical items and syntactic objects) available for computation. The Merge operation updates the workspace for the next step of structure formation. This notion of workspace is formalized in [14]. Mathematically, as discussed in [37], workspaces are just finite disjoint unions of binary rooted trees (that is, forests) with leaves labelled by 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0}. Equivalently, workspaces are multisets of syntactic objects.

Thus, the set of workspaces can be identified with the set 𝔉𝒮𝒪0{\mathfrak{F}}_{{\mathcal{S}}{\mathcal{O}}_{0}} of binary rooted forests with no assigned planar structure (disjoint unions of binary rooted trees with no assigned planar structure) with leaf labels in 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0}.

Given a workspace F𝔉𝒮𝒪0F\in{\mathfrak{F}}_{{\mathcal{S}}{\mathcal{O}}_{0}}, the material in FF that is accessible for computation consist of the lexical items and all the trees that were obtained through previous applications of Merge. This inductive definition can be rephrased more directly, by defining the set of accessible terms of FF.

For a single binary rooted tree TT with no assigned planar structure, the set Acc(T)Acc(T) of accessible terms of TT is the set of all subtrees TvTT_{v}\subset T given by all the descendants of a given non-root vertex of TT. Indeed, these subtrees TvT_{v} are exactly all the intermediate trees obtained in an iterative construction of TT starting from the lexical items or features at the leaves, by repeated application of the Merge operation (3.2). We also write Acc(T)Acc^{\prime}(T) for the set obtained by adding to Acc(T)Acc(T) a copy of TT itself, so that

Acc(T)={Tv|vVint(T)} and Acc(T)={Tv|vV(T)},Acc^{\prime}(T)=\{T_{v}\,|\,v\in V_{int}(T)\}\ \ \ \text{ and }\ \ \ Acc^{\prime}(T)=\{T_{v}\,|\,v\in V(T)\}\,,

where Vint(T)V_{int}(T) is the set of non-root vertices and V(T)V(T) is the set of all vertices of TT including the root.

For a workspace given by a forest F=aTa𝔉𝒮𝒪0F=\sqcup_{a\in{\mathcal{I}}}T_{a}\in{\mathfrak{F}}_{{\mathcal{S}}{\mathcal{O}}_{0}}, with TaT_{a} the component trees and {\mathcal{I}} a finite indexing set, the set of accessible terms of the workspace is given by

(3.7) Acc(F)=aAcc(Ta),Acc(F)=\bigsqcup_{a\in{\mathcal{I}}}Acc^{\prime}(T_{a})\,,

namely it consists of the syntactic objects, i.e. the connected components TaT_{a} of FF, together with all the accessible terms αAcc(Ta)\alpha\in Acc(T_{a}) of each component TaT_{a}.

3.4. Merge acting on workspaces

The action of Merge on workspaces is then defined in the new version of Minimalism as a transformation that produce a new workspace from a given one, given a choice of a pair of syntactic objects.

In S,S𝔗𝒮𝒪0S,S^{\prime}\in{\mathfrak{T}}_{{\mathcal{S}}{\mathcal{O}}_{0}} are two given syntactic objects, the Merge operator 𝔐S,S{\mathfrak{M}}_{S,S^{\prime}} acts on a given workspace FF in the following way. For each of the two arguments of the binary operator 𝔐S,S{\mathfrak{M}}_{S,S^{\prime}} the set Acc(F)Acc(F) of accessible terms is searched for a term matching SS, respectively SS^{\prime}. If matching terms are not found the workspace F=aTaF=\sqcup_{a}T_{a} remains the same. If they are found, say Ti,viST_{i,v_{i}}\simeq S and Tj,vjST_{j,v_{j}}\simeq S^{\prime}, then the new resulting workspace F=𝔐S,S(F)F^{\prime}={\mathfrak{M}}_{S,S^{\prime}}(F) is of the form

(3.8) F=𝔐(Ti,vi,Tj,vj)Ti/Ti,viTj/Tj,vjai,jTa,F^{\prime}={\mathfrak{M}}(T_{i,v_{i}},T_{j,v_{j}})\sqcup T_{i}/T_{i,v_{i}}\sqcup T_{j}/T_{j,v_{j}}\sqcup\bigsqcup_{a\neq i,j}T_{a}\,,

where the quotients Ti/Ti,viT_{i}/T_{i,v_{i}} and Tj/Tj,vjT_{j}/T_{j,v_{j}} perform the cancellation of the deeper copies of the accessible terms Ti,viT_{i,v_{i}} and Tj,vjT_{j,v_{j}} in the new workspace.

The quotient T/TvT/T_{v} of a tree TT by an accessible term TvTT_{v}\subset T is defined here in the following way, see [37]. One first removes the subtree TvT_{v} from TT, leaving the complement TTvT\smallsetminus T_{v}. There is then a unique maximal binary rooted tree that can be obtained from this complement TTvT\smallsetminus T_{v} by contracting edges, and this resulting binary rooted tree is what we call T/TvT/T_{v}. When vv is a non-root vertex of TT, the quotient T/TvT/T_{v} is equivalently described as obtained by contracting all of TvT_{v} to its root vertex, and also contracting the edge above the root vertex of TvT_{v} and the other edge out of the vertex above the root of TvT_{v}. In the case where Tv=TT_{v}=T we have T/Tv=1T/T_{v}=1. We also set T/1=TT/1=T.

In this new version of Minimalism, the Hopf algebra structure can be seen as underlying the action of Merge described as in (3.8). Indeed, the fundamental property of the coproduct in a Hopf algebra is providing the list of all the possible ways of decomposing an object (the term of the algebra the coproduct is applied to) into a pair of a subobject and the associated quotient object. In other words, in the case of a tree TT one can write a coproduct in the form

(3.9) Δ(T)=vFv¯T/Fv¯,\Delta(T)=\sum_{v}F_{\underline{v}}\otimes T/F_{\underline{v}}\,,

where we include the special cases T1T\otimes 1 and 1T1\otimes T, and where, for v¯={vi}i=1k\underline{v}=\{v_{i}\}_{i=1}^{k} we write Fv¯F_{\underline{v}} for the forest consisting of a union of disjoint subtrees TviTT_{v_{i}}\subset T. This can be equivalently described as a forest obtained from an admissible cut of the tree TT. We can view the first terms

(3.10) Δ(2)(T)=vTvT/Tv,\Delta_{(2)}(T)=\sum_{v}T_{v}\otimes T/T_{v}\,,

of the coproduct (meaning those where the subforest has a single component) as a way of compiling a list of all the accessible terms of TT with the corresponding cancellation of the deeper copy. In these terms, the action (3.8) can be described as taking, for each of the two arguments of the binary operation 𝔐S,S{\mathfrak{M}}_{S,S^{\prime}} the coproduct Δ(F)\Delta(F) over the entire workspace, which extracts all the accessible terms, searching among them for matching copies of SS and SS^{\prime}, merging them if found, and keeping the other terms of the coproduct that perform the cancellation of the deeper copies.

The {\mathbb{Q}}-vector space 𝒱(𝔉𝒮𝒪0){\mathcal{V}}({\mathfrak{F}}_{{\mathcal{S}}{\mathcal{O}}_{0}}) spanned by the workspaces, with the operations of disjoint union \sqcup as product and the coproduct (3.9) extended from trees to forests by Δ(F)=aΔ(Ta)\Delta(F)=\sqcup_{a}\Delta(T_{a}) for F=aTaF=\sqcup_{a}T_{a}, has the structure of an associative, commutative, coassociative, non-cocommutative bialgebra (𝒱(𝔉𝒮𝒪0),,Δ)({\mathcal{V}}({\mathfrak{F}}_{{\mathcal{S}}{\mathcal{O}}_{0}}),\sqcup,\Delta). The vector space 𝒱(𝔉𝒮𝒪0)=𝒱(𝔉𝒮𝒪0){\mathcal{V}}({\mathfrak{F}}_{{\mathcal{S}}{\mathcal{O}}_{0}})=\oplus_{\ell}{\mathcal{V}}({\mathfrak{F}}_{{\mathcal{S}}{\mathcal{O}}_{0}})_{\ell} has a grading by number of leaves, compatible with the operations, so that a coproduct making 𝒱(𝔉𝒮𝒪0){\mathcal{V}}({\mathfrak{F}}_{{\mathcal{S}}{\mathcal{O}}_{0}}) a Hopf algebra can be constructed inductively by degrees.

The action of Merge on workspaces described above is formulated in [37] explicitly in terms of the product and coproduct structure of (𝒱(𝔉𝒮𝒪0),,Δ)({\mathcal{V}}({\mathfrak{F}}_{{\mathcal{S}}{\mathcal{O}}_{0}}),\sqcup,\Delta), as

(3.11) 𝔐S,S=(𝔅id)δS,SΔ,{\mathfrak{M}}_{S,S^{\prime}}=\sqcup\circ({\mathfrak{B}}\otimes{\rm id})\circ\delta_{S,S^{\prime}}\circ\Delta\,,

where the coproduct Δ\Delta produces the list of accessible terms and corresponding cancellations, and the operator δS,S\delta_{S,S^{\prime}} identifies the matching copies. These are then fed into Merge by the operation 𝔅id{\mathfrak{B}}\otimes{\rm id} which at the same times produces the new merged tree through the grafting operation 𝔅{\mathfrak{B}} of (3.4) and performs the cancellation of the deeper copies by keeping the corresponding quotient terms produced by the coproduct in the new workspace. The resuling new workspace is then produced by taking all these components and the new component created by Merge together through the product operation \sqcup. A more detailed discussion of this operation is given in our companion paper [37].

3.5. Internal and External Merge

The action of Merge on workspaces defined by (3.8) and (3.11) contains other forms of Merge in addition to External and Internal Merge, such Sideward and Countercyclic Merge (see §2.3 of [37]). These are not desirable in linguistic terms, and one expects that a mechanism of Minimal Search would extract the terms corresponding to External and Internal Merge as the “least effort” contributions. Minimal Search refers to the extraction of matching terms from the list of the accessible terms produced by the coproduct, where the search in this list is done according to an appropriate economy principle.

With the formulation (3.11) of the action of Merge on workspaces, it is shown in [37] that the usual form of Minimal Search can be implemented in this Hopf algebra setting by selecting the leading order part of the coproduct with respect to a grading that weights the accessible terms TvTT_{v}\subset T by the distance of the vertex vv from the root vertex of TT, so that searching among terms deeper into the tree becomes less efficient, with respect to this cost function, than searching near the top of the tree, and taking the leading order term with respect to this grading has exactly the same effect as implementing Minimal Search in the way usually described in the linguistics literature (see [10], [11], [13]).

More precisely, one can introduce degrees in the coproduct by taking

(3.12) Δ(ϵ,η):𝒱(𝔗𝒮𝒪0)𝒱(𝔗𝒮𝒪0)[ϵ]𝒱(𝔗𝒮𝒪0)[η],\Delta^{(\epsilon,\eta)}:{\mathcal{V}}({\mathfrak{T}}_{{\mathcal{S}}{\mathcal{O}}_{0}})\to{\mathcal{V}}({\mathfrak{T}}_{{\mathcal{S}}{\mathcal{O}}_{0}})[\epsilon]\otimes_{\mathbb{Q}}{\mathcal{V}}({\mathfrak{T}}_{{\mathcal{S}}{\mathcal{O}}_{0}})[\eta]\,,
Δ(ϵ,η)(T)=v¯ϵdv¯Fv¯ηdv¯(T/Fv¯),\Delta^{(\epsilon,\eta)}(T)=\sum_{\underline{v}}\epsilon^{d_{\underline{v}}}\,F_{\underline{v}}\otimes\eta^{d_{\underline{v}}}(T/F_{\underline{v}})\,,

where for v¯={v1,,vn}\underline{v}=\{v_{1},\ldots,v_{n}\} a set of vertices viVint(T)v_{i}\in V_{int}(T), we set dv¯=dv1++dvnd_{\underline{v}}=d_{v_{1}}+\cdots+d_{v_{n}}, with dvd_{v} the distance of a vertex vv to the root of TT. In the Merge action one correspondingly obtains

(3.13) 𝔐S,Sϵ=(𝔐ϵid)δS,SΔ(ϵ,ϵ1){\mathfrak{M}}^{\epsilon}_{S,S^{\prime}}=\sqcup\circ({\mathfrak{M}}^{\epsilon}\otimes{\rm id})\circ\delta_{S,S^{\prime}}\circ\Delta^{(\epsilon,\epsilon^{-1})}

with Δ(ϵ,ϵ1)\Delta^{(\epsilon,\epsilon^{-1})} as in (3.12), and with

(3.14) 𝔐ϵ:𝒱(𝔗𝒮𝒪0)[ϵ,ϵ1]𝒱(𝔗𝒮𝒪0)[ϵ,ϵ1]𝒱(𝔗𝒮𝒪0)[ϵ,ϵ1]𝔐ϵ(ϵdα,ϵβ)=ϵ|d+|𝔐(α,β).\begin{array}[]{c}{\mathfrak{M}}^{\epsilon}:{\mathcal{V}}({\mathfrak{T}}_{{\mathcal{S}}{\mathcal{O}}_{0}})[\epsilon,\epsilon^{-1}]\otimes_{\mathbb{Q}}{\mathcal{V}}({\mathfrak{T}}_{{\mathcal{S}}{\mathcal{O}}_{0}})[\epsilon,\epsilon^{-1}]\to{\mathcal{V}}({\mathfrak{T}}_{{\mathcal{S}}{\mathcal{O}}_{0}})[\epsilon,\epsilon^{-1}]\\[8.53581pt] {\mathfrak{M}}^{\epsilon}(\epsilon^{d}\alpha,\epsilon^{\ell}\beta)=\epsilon^{|d+\ell|}\,{\mathfrak{M}}(\alpha,\beta)\,.\end{array}

It is then shown in [37] that taking the leading term (namely the only nonzero term in the limit ϵ0\epsilon\to 0) of arbitrary compositions of operators 𝔐S,Sϵ{\mathfrak{M}}^{\epsilon}_{S,S^{\prime}}, one finds only (compositions of) Internal and External Merge, while all the remaining forms of Merge (Sideward, Countercyclic) are of lower order and disappear in the ϵ0\epsilon\to 0 limit.

4. Externalization and planarization

In the new formulation of Minimalism, as we discussed above, Merge happens in the free symmetric form described by the free commutative non-associative magma Magmana,c(𝒮𝒪0,𝔐){\rm Magma}_{na,c}({\mathcal{S}}{\mathcal{O}}_{0},{\mathfrak{M}}) of (3.6) that constructively defines syntactic objects, which are binary rooted trees with no assignment of planar structure.

In this formulation of Minimalism, the assignment of planar structure to trees happens after the action of Merge has taken place, in a further process of externalization.

This is in contrast with older versions of Minimalism (including the case of Stabler’s formulation that we discussed in the previous sections), where Merge is applied directly on planar trees.

There are suggestions, such as Richard Kayne’s LCA (Linear Correspondence Axiom) ([29], [30]), that propose the replacement of externalization with a more implicit (and unique) choice of planar embeddings for trees. Irrespective of the tenability of this proposal on linguistic ground, we discuss in this section some more basic difficulties in its implementation, that arise at the formal algebraic level.

First a comment about terminology: in the linguistics literature it is customary to use the term linearization for the choice of a linear ordering for the leaves of a binary rooted tree. Since this creates a conflict of terminology with the more common mathematical use of the word “linearization”, and the choice of a linear ordering of the leaves is equivalent to the choice of a planar embedding of the tree, we will adopt here the terminology planarization (of trees) instead of linearization (of the set of leaves). Thus, we will refer to the LCA proposal as a “planarization” rather than as a “linearization algorithm” as usually described. We trust this will not cause confusion with the readers.

4.1. Commutative and non-commutative magmas

A first important observation is that the Merge operation can happen before (as in the new Minimalism) or after the assignment of planar structure to trees (as in the old Minimalism), but not both consistently. What we mean by this is the following simple mathematical observation.

Just as we consider the free commutative non-associative magma Magmana,c(𝒮𝒪0,𝔐){\rm Magma}_{na,c}({\mathcal{S}}{\mathcal{O}}_{0},{\mathfrak{M}}) of the new Minimalism, we can similarly consider the free non-commutative non-associative magma

𝒮𝒪nc:=Magmana,nc(𝒮𝒪0,𝔐nc){\mathcal{S}}{\mathcal{O}}^{nc}:={\rm Magma}_{na,nc}({\mathcal{S}}{\mathcal{O}}_{0},{\mathfrak{M}}^{nc})

over the same set 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0}. The elements of this magma 𝒮𝒪nc{\mathcal{S}}{\mathcal{O}}^{nc} are the planar binary rooted trees with leaves labelled by the set 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0}. We write the elements of 𝒮𝒪nc{\mathcal{S}}{\mathcal{O}}^{nc} as TπT^{\pi}, where TT is an abstract (non-planarly embedded) binary rooted tree and π\pi is a planar embedding of TT. The non-commutative non-associative magma operation is given by

𝔐nc(T1π1,T2π2)=\Tree[T1π1T2π2]=:Tπ,{\mathfrak{M}}^{nc}(T^{\pi_{1}}_{1},T^{\pi_{2}}_{2})=\Tree[T_{1}^{\pi_{1}}T_{2}^{\pi_{2}}]=:T^{\pi}\,,

where now the trees T1π1T^{\pi_{1}}_{1} and T2π2T^{\pi_{2}}_{2} are planar and the above tree TπT^{\pi} is assigned the planar embedding π\pi where T1π1T_{1}^{\pi_{1}} is to the left of T2π2T^{\pi_{2}}_{2}. In particular now 𝔐nc(T1π1,T2π2)𝔐nc(T2π2,T1π1){\mathfrak{M}}^{nc}(T_{1}^{\pi_{1}},T^{\pi_{2}}_{2})\neq{\mathfrak{M}}^{nc}(T^{\pi_{2}}_{2},T^{\pi_{1}}_{1}), unlike the case of the commutative 𝔐{\mathfrak{M}} of 𝒮𝒪{\mathcal{S}}{\mathcal{O}}.

There is a morphism of magmas 𝒮𝒪nc𝒮𝒪{\mathcal{S}}{\mathcal{O}}^{nc}\to{\mathcal{S}}{\mathcal{O}} which simply forgets the planar structure of trees, so that TπTT^{\pi}\mapsto T. It is well defined as a morphism of magmas since the two different planar trees 𝔐nc(T1π1,T2π2){\mathfrak{M}}^{nc}(T^{\pi_{1}}_{1},T^{\pi_{2}}_{2}) and 𝔐nc(T2π2,T1π1){\mathfrak{M}}^{nc}(T^{\pi_{2}}_{2},T^{\pi_{1}}_{1}) have the same underlying abstract (non-planar) tree 𝔐(T1,T2){\mathfrak{M}}(T_{1},T_{2}).

On the other hand, there is no morphism of magmas that goes in the opposite direction, from 𝒮𝒪{\mathcal{S}}{\mathcal{O}} to 𝒮𝒪nc{\mathcal{S}}{\mathcal{O}}^{nc}. Indeed, if such morphism existed, then its image would necessarily be a commutative sub-magma of 𝒮𝒪nc{\mathcal{S}}{\mathcal{O}}^{nc}, but 𝒮𝒪nc{\mathcal{S}}{\mathcal{O}}^{nc} does not contain any nontrivial commutative sub-magma. This can be seen easily as, if a tree TπT^{\pi} is contained in a commutative sub-magma of 𝒮𝒪nc{\mathcal{S}}{\mathcal{O}}^{nc}, then 𝔐nc(Tπ,Tπ){\mathfrak{M}}^{nc}(T^{\pi},T^{\pi}) also is, but 𝔐nc(𝔐nc(Tπ,Tπ),Tπ))𝔐nc(Tπ,𝔐nc(Tπ,Tπ)){\mathfrak{M}}^{nc}({\mathfrak{M}}^{nc}(T^{\pi},T^{\pi}),T^{\pi}))\neq{\mathfrak{M}}^{nc}(T^{\pi},{\mathfrak{M}}^{nc}(T^{\pi},T^{\pi})) contradicting the fact that the sub-magma is commutative.

This fact has the immediate consequence that we stated above, namely that if a free symmetric Merge takes place before any assignment of planar structure to trees, then it cannot also consistently apply after a choice of planarization. In other words, if Σ\Sigma is any choice of a section of the projection 𝒮𝒪nc𝒮𝒪{\mathcal{S}}{\mathcal{O}}^{nc}\to{\mathcal{S}}{\mathcal{O}} (that is, an assignment of planarization) then one cannot have compatible Merge operations satisfying 𝔐nc(Σ(T1),Σ(T2))=Σ(𝔐(T1,T2)){\mathfrak{M}}^{nc}(\Sigma(T_{1}),\Sigma(T_{2}))=\Sigma({\mathfrak{M}}(T_{1},T_{2})). Merge can act on abstract trees as in the new Minimalism or on planar trees as in the old Minimalism, but these two view are mutually exclusive.

Descriptions of Merge at the level of planar trees, as in the old Minimalism, involve a partially defined operations with restrictions on domains, based on some label assignments. We have discussed this explicitly in the case of Stabler’s formulation but analogous conditions exist in other formulations of the old Minimalism. In view of the considerations above, this can be read as evidence of the fact that one is trying to describe at the level of planar trees a Merge that is really taking place at the underlying level of non-planar abstract trees, correcting for the incompatibility described above by restricting the domain of applicability. In particular, this means that the significant complications in the underlying algebraic structure of External and Internal Merge that we have observed in the case of Stabler’s formulation, similarly apply to any other formulation that places Merge after the planar embedding of trees.

This issue does not arise within the formulation of the new Minimalism, since the planarization that happens in externalization is simply a non-canonical (meaning dependent on syntactic parameters) choice of a section Σ\Sigma of the projection 𝒮𝒪nc𝒮𝒪{\mathcal{S}}{\mathcal{O}}^{nc}\to{\mathcal{S}}{\mathcal{O}}. The only requirement on Σ\Sigma is to be compatible with syntactic parameters, not to be a morphism with respect to the magma operations 𝔐{\mathfrak{M}} and 𝔐nc{\mathfrak{M}}^{nc}, since in this model Merge only acts before externalization as 𝔐{\mathfrak{M}} (not after externalization as 𝔐nc{\mathfrak{M}}^{nc}). In our previous paper [37] we describe externalization in the form of a correspondence. This represents the two step procedure that first choses a non-canonical section Σ\Sigma and then quotients the image by eliminating those planar trees obtained in this way that are not compatible with further (language specific) syntactic constraints. Merge is not applied anywhere in this externalization process, which happens after the results of Merge have been computed at the level of trees without planar structure.

4.2. Planarization versus Externalization

Proposals such as Kayne’s LCA planarization of trees, [29], [30] (see also Chapter 7 of [28] and [34] for a short summary), suggest the replacement of externalization with a different way of constructing planarization. This relies on the idea that the abstract trees are endowed with additional data (related to heads, maximal projection, and c-command relation) that permits a canonical choice of planar structure. We discuss two different mathematical difficulties inherent in this proposal.

First let’s assume that indeed additional data on the abstract syntactic tree suffice to endow them with a unique canonical choice of linearization. (We will discuss the difficulties with this assumptions below.) In this case, we would have a map, which we call ΣLCA\Sigma^{LCA} that assigns to an abstract binary rooted tree TT a corresponding, uniquely defined non-planar tree ΣLCA(T)=TπLCA\Sigma^{LCA}(T)=T^{\pi_{LCA}}, with πLCA\pi_{LCA} the linear ordering constructed by the LCA algorithm.

By our previous observations on the morphisms of magmas, we will necessarily have in general that

𝔐nc(ΣLCA(T1),ΣLCA(T2))ΣLCA(𝔐(T1,T2)),{\mathfrak{M}}^{nc}(\Sigma^{LCA}(T_{1}),\Sigma^{LCA}(T_{2}))\neq\Sigma^{LCA}({\mathfrak{M}}(T_{1},T_{2}))\,,

so that ΣLCA\Sigma^{LCA} cannot be compatible with asymmetric Merge of planar trees.

The only way to make it compatible with Merge would be to define Merge on the image of ΣLCA\Sigma^{LCA} not as the asymmetric Merge of planar trees but as

𝔐LCA(ΣLCA(T1),ΣLCA(T2)):=ΣLCA(𝔐(T1,T2)).{\mathfrak{M}}^{LCA}(\Sigma^{LCA}(T_{1}),\Sigma^{LCA}(T_{2})):=\Sigma^{LCA}({\mathfrak{M}}(T_{1},T_{2}))\,.

This, however, simply creates an isomorphic copy of the commutative magma Magmana,c(𝒮𝒪0,𝔐){\rm Magma}_{na,c}({\mathcal{S}}{\mathcal{O}}_{0},{\mathfrak{M}}) (by a choice of a particular representative in each equivalence class of the projection 𝒮𝒪nc𝒮𝒪{\mathcal{S}}{\mathcal{O}}^{nc}\to{\mathcal{S}}{\mathcal{O}}). This would imply that application of the planarization ΣLCA\Sigma^{LCA} has no effect, in terms of the properties of Merge, with respect to working directly with the free symmetric Merge on abstract non-planar trees. The alternative is, as in externalization, to not require any compatibility between ΣLCA\Sigma^{LCA} and Merge: in other word, even if LCA replaces externalization, Merge is still only happening in the form of free symmetric Merge, at the level of the abstract non-planar trees, and not after planarization.

We now look more specifically at the proposals for how the planarization ΣLCA\Sigma^{LCA} should be obtained, to highlight a different kind of difficulty.

In an (abstract) binary rooted tree TT, two vertices v1,v2v_{1},v_{2} are sisters if there is a vertex vv of TT connected to both v1v_{1} and v2v_{2}. A vertex vv of TT dominates another vertex ww if vv is on the unique path from the root of TT to ww. A vertex vv in TT c-commands another vertex ww if neither dominates the other and the lowest vertex that dominates vv also dominates ww. A vertex vv asymmetrically c-commands a vertex ww if vv c-commands ww and v,wv,w are not sisters. Asymmetric c-command defines a partial ordering of the leaves of TT. In order to extend this partial ordering relation, instead of using directly the asymmetric c-command relation to define the order structure, one uses maximal projections. Namely, one requires that a leaf \ell precedes another leaf \ell^{\prime} if and only if either \ell asymmetrically c-commands \ell^{\prime} or a maximal projection dominating \ell c-commands \ell^{\prime}.

Even with this extension using maximal projections, there are issues in making this a total ordering, as discussed for instance in Chapter 7 of [28] and in [34].

Since we have been discussing in the previous sections the Stabler formalism, it is easy to see this same issue in terms of the labels >> and << assigned to the internal vertices of (planar) trees in Stabler’s formulation. The difference is that in Stabler one starts with a tree that already has a planar assignment and uses heads, maximal projection, and c-command to obtain a new planar projection. In that case, as we discussed already, the labels >> and << are assigned to the root and the non-leaf vertices of a planar tree by pointing, at each vertex vv, in the direction of the branch where the head of the tree TvT_{v} resides. If this assignment of labels is well defined, then the new planar structure of the tree can be obtained simply by flipping subtrees about their root vertex every time the vertex is labelled <<, until all the resulting vertices become labelled by >>.

The implicit assumption that makes this possible is that every subtree TvT_{v} of a given tree TT has a head, that is, a marked leaf. This assumption on heads is necessary both for the labeling by >> and << in Stabler’s formalism and for the use of maximal projections for the definition of ordering in LCA, since a maximal projection is a subtree TvT_{v} of TT that is not strictly contained in any larger TwT_{w} with the same head.

In the case of the labels >> and <<, the problem of what label is assigned to the new root vertex, when External or Internal Merge is performed, requires restrictions on the domain of applicability of these Merge operations, depending on conditions on the labels at the heads of the trees used in the Merge operation (which makes them partially defined operations). It also requires other further steps, such as allowing Merge results to be unlabeled during derivation and labelled at the end so that successive-cyclic raising can remove the elements responsible for un-labelability. (This refers to the same linguistic problem with Stabler’s formalism that we discussed above, at the end of §2.8, mentioned to us by Riny Huijbregts).

In the case of LCA, the problem again arises from the fact that a result of Merge need not have the heads of the two merged trees in an asymmetric c-command relation. This implies that, even with the introduction of heads and maximal projections, the assumption that all trees have a head marked leaf cannot always be satisfied, hence one cannot obtain a total ordering of the leaves (a unique choice of planar embedding).

Partial corrections to this problem in LCA are suggested using movement (see [28] p.230-231), which is similar in nature to the point mentioned above regarding un-labelable structures in Stabler’s formalism, or by introducing “null heads” in the structure, or by morphological reanalysis that hides certain items from LCA. In any case, the fundamental difficulty in constructing a planarization algorithm based on heads and maximal projections can be summarized as follows.

We define a head function on an (abstract) binary rooted tree TT as a function hT:Vo(T)L(T)h_{T}:V^{o}(T)\to L(T) from the set Vo(T)V^{o}(T) of non-leaf vertices of TT to the set L(T)L(T) of leaves of TT, with the property that if TvTwT_{v}\subseteq T_{w} and hT(w)L(Tv)L(Tw)h_{T}(w)\in L(T_{v})\subseteq L(T_{w}), then hT(w)=hT(v)h_{T}(w)=h_{T}(v). We write h(T)h(T) for the value of hTh_{T} at the root of TT. This general definition is designed to abstract the properties of the head in the syntactic sense.

Consider then pairs (T,hT)(T,h_{T}) and (T,hT)(T^{\prime},h_{T^{\prime}}) of trees with given head functions. There are exactly two choices of a head function on the Merge 𝔐(T,T){\mathfrak{M}}(T,T^{\prime}), corresponding to whether h(T)h(T) or h(T)h(T^{\prime}) is equal to h(𝔐(T,T))h({\mathfrak{M}}(T,T^{\prime})). Since the trees T,TT,T^{\prime} are not planar and 𝔐{\mathfrak{M}} is the symmetric Merge, there is no consistent way of making one rather than the other choice of h𝔐(T,T)h_{{\mathfrak{M}}(T,T^{\prime})}, at each application of 𝔐{\mathfrak{M}}. This implies that, on a given binary rooted tree TT there are 2#Vo(T)2^{\#V^{o}(T)} possible head functions. We can think of any such choice as the assignment, at each vertex vVo(T)v\in V^{o}(T) of a marking to either one or the other of the two edges exiting vv in the direction away from the root.

By thinking of hTh_{T} as an assignment of a marking to one of the two edges below each vertex, the head function hTh_{T} determines a planar embedding of TT by putting under each vertex the marked edge to the left. Thus, the problem of constructing planar embeddings of abstract binary rooted trees can be transformed into the problem of constructing head functions.

The LCA algorithm aims at obtaining a special assignment ThTT\mapsto h_{T} of head functions hTh_{T} to abstract binary rooted trees TT that is somehow determined uniquely by the properties of the labeling set 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0} of the leaves of the trees. Let us denote by λ()\lambda(\ell) the label in 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0} assigned to the leaf L(T)\ell\in L(T).

This is where the main difficulty arises. For instance, if the labeling set 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0} happens to be a totally ordered set (which is not a realistic linguistic assumption), as long at two trees (T,hT)(T,h_{T}) and (T,hT)(T^{\prime},h_{T^{\prime}}) have head functions with labels λ(h(T))λ(h(T))\lambda(h(T))\neq\lambda(h(T^{\prime})), there is always a preferred choice of head function on 𝔐(T,T){\mathfrak{M}}(T,T^{\prime}), which is the one in which the two subtrees TT and TT^{\prime} are ordered according to the ordering of the labels λ(h(T))\lambda(h(T)) and λ(h(T))\lambda(h(T^{\prime})) in 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0}. However, this excludes the case where the leaves h(T)h(T) and h(T)h(T^{\prime}) may have the same label in 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0}. So even under the unrealistically strong assumption that labels are taken from a totally ordered set, a planarization algorithm based on the construction of a head function cannot be defined on all the syntactic objects in 𝒮𝒪{\mathcal{S}}{\mathcal{O}}.

At the level of the underlying algebraic structure, the issue with planarization is therefore twofold. It does not provide an alternative to Merge acting on the non-planar trees, for the reasons mentioned above on maps of magmas. At the same time, a choice of planar embedding that is independent of syntactic parameters and is based only on heads of trees would require a canonical construction of head functions from properties of the labeling set 𝒮𝒪0{\mathcal{S}}{\mathcal{O}}_{0}, but this cannot be done consistently on the entire set of syntactic objects 𝒮𝒪{\mathcal{S}}{\mathcal{O}} produced by the free symmetric Merge 𝔐{\mathfrak{M}}.

5. Conclusions

We have seen from this comparative analysis of the algebraic structures underlying one of the older versions of Minimalism (Stabler’s Computational Minimalism) and Chomsky’s newer version of Minimalism that the new version has a simpler mathematical structure with a unifying description of Internal and External Merge, and with a core generative process that reflects the most fundamental magma of binary set formation, generating the binary rooted trees without planar structure.

The more complicated mathematical structure of Stabler’s Minimalism is caused by several factors. The intrinsic asymmetry of the Internal Merge is due to working with planar binary rooted trees. Abandoning the idea that planar embeddings should be part of the core computational structure of Minimalism is justified linguistically by the relevance of structures (abstract binary rooted trees) rather than strings (linearly ordered sets of leaves, or equivalently planar embeddings of trees) in syntactic parsing, see [18]. Thus, working with abstract trees without planar embeddings is one of the simplifying factors of the new Minimalism. The other main issue that complicates the mathematical structure of the older versions of Minimalism is the very different nature of the Internal and External Merge operations: in the case of Stabler’s Minimalism analyzed here these two forms of Merge relate to two very different algebraic objects (operated algebras and right-ideal coideals) hence they cannot be reconciled as coming from the same operation, while in the new Minimalism both Internal and External Merge are cases of the same operation, and both arise as the leading terms with respect to the appropriate formulation of Minimal Search. Finally, another main issue that makes the mathematical structure of older versions of Minimalism significantly more complicated is the presence of conditions on labels that need to be matched for Internal and External Merge to be applicable, related to the problem of projections discussed in [9]. Mathematically this makes all operations only partially defined on particular domains where conditions on labels are met. As we discussed, this creates problems with having to work with partially defined versions of various algebraic structures and it significantly complicated iterations of the Merge operations, where the conditions on domains compound. Since the conditions on labels are absent from the fundamental structure of the new Minimalism, this problem of dealing with partially defined operations also disappears, leading to another simplification at the level of the mathematical structures involved.

Within the new formulation of Minimalism, the planar structure of trees is introduced as a later step of externalization, not at the level of the Merge action. The choice of planar structure in externalization is done through a non-canonical (that is, dependent on syntactic parameters) section of the projection from planar to abstract trees. We show that proposed construction of a unique canonical choice of planar embeddings, based on heads of trees, maximal projections, and c-command, run into difficulties at the level of the underlying algebraic structure.

Acknowledgments

We thank Riny Huijbregts for helpful comments and feedback, and Richard Kayne for useful discussions. The first author acknowledges support from NSF grant DMS-2104330 and FQXi grants FQXi-RFP-1 804 and FQXi-RFP-CPW-2014, SVCF grant 2020-224047, and support from the Center for Evolutionary Science at Caltech.


References

  • [1] M. Aguiar, F. Sottile, Structure of the Loday–Ronco Hopf algebra of trees, Journal of Algebra, Vol.295 (2006) 473–511.
  • [2] F. Baader, T. Nipkow, Term Rewriting and All That, Cambridge University Press, 1998.
  • [3] R.C. Berwick, Mind the gap, in “50 Years Later: Reflections on Chomsky’s Aspects”, pp. 1–13, MITWPL, 2015.
  • [4] R.C. Berwick, N. Chomsky, Why only us? MIT Press, 2017.
  • [5] R.C. Berwick, S. Epstein, On the Convergence of ‘Minimalist’ Syntax and Categorial Grammar, in “Algebraic Methods in Language Processing 1995”, pp. 143—148, Universiteit Twente, 1995.
  • [6] A. Borowiec, W.A. Dudek, S. Duplij, Basic concepts of ternary Hopf algebras, arXiv:0306208, Journal of Kharkov National University, ser. Nuclei, Particles and Fields, v. 529, N 3(15) (2001) pp. 21–29.
  • [7] Ch. Brouder, A. Frabetti, QED Hopf algebras on planar binary trees. J. Algebra 267 (2003) N.1, 298–322.
  • [8] N. Chomsky, The Minimalist Program, MIT Press, 1995.
  • [9] N. Chomsky, Problems of projection, Lingua, Vol.130 (2013) 33-49.
  • [10] N. Chomsky, Some Puzzling Foundational Issues: The Reading Program, Catalan Journal of Linguistics Special Issue (2019) 263–285.
  • [11] N. Chomsky, The UCLA lectures, 2019. https://ling.auf.net/lingbuzz/005485
  • [12] N. Chomsky, Simplicity and the form of grammars, Journal of Language Modelling, Vol.9 (2021) N.1, 5–15.
  • [13] N. Chomsky, Working toward the Strong Interpretation of SMT, 2023 Theoretical Linguistics at Keio-EMU Linguistics as Scientific Inquiry Lecture Series,
    https://www.youtube.com/playlist?list=PLWXQYx-RCmeP7B2UtIA8OJsvAF-xvjDuZ
  • [14] C. Collins, E. Stabler, A Formalization of Minimalist Syntax, Syntax 19 (2016) N.1, 43–78.
  • [15] A. Connes, D. Kreimer, Hopf algebras, Renormalization and Noncommutative geometry, Comm. Math. Phys 199 (1998) 203–242
  • [16] A. Connes, M. Marcolli, Noncommutative Geometry, Quantum Fields, and Motives, Colloquium Publications, Vol.55, American Mathematical Society, 2008.
  • [17] C. Delaney, M. Marcolli, Dyson-Schwinger equations in the theory of computation, in “Feynman amplitudes, periods and motives”, pp. 79–107, Contemp. Math. 648, Amer. Math. Soc., 2015.
  • [18] M.B.H. Everaert, M.A.C. Huybregts, N. Chomsky, R.C. Berwick, J.J. Bolhuis, Structures, Not Strings: Linguistics as Part of the Cognitive Sciences, Trends in Cognitive Sciences, Vol.19 (2015) N.12, 729–743.
  • [19] L. Foissy, Classification of systems of Dyson–Schwinger equations in the Hopf algebra of decorated rooted trees, Advances in Math. 224 (2010) 2094–2150.
  • [20] L. Foissy, Les algèbres de Hopf des arbres enracinés, I, Bull. Sci. Math. 126 (2002) 193–239.
  • [21] S. Fong, J. Ginsburg, On constraining Free Merge, The 43rd Meeting of the Kansai Linguistics Society. Konan University: Kobe, Japan, 2018.
  • [22] X. Gao, L. Guo, H. Zhang, Rota’s program on algebraic operators, rewriting systems and Gröbner-Shirshov bases, arXiv:2108.11823.
  • [23] L. Gerritzen, R. Holtkamp, Hopf co-addition for free magma algebras and the non-associative Hausdorff series, J. of Algebra, Vol. 265 (2003) 264–284.
  • [24] L. Guo, Operated semigroups, Motzkin paths and rooted trees, J. Algebraic Combin. 29 (2009), 35–62.
  • [25] R. Holtkamp, Comparison of Hopf algebras on trees, Arch. Math. (Basel) 80 (4) (2003) 368–383.
  • [26] R. Holtkamp, Rooted trees appearing in products and co-products, in “Combinatorics and physics”, 153–169, Contemp. Math., 539, Amer. Math. Soc., 2011.
  • [27] R. Holtkamp, A pseudo-analyzer approach to formal group laws not of operad type, J. Algebra, Vol. 237 (2001) 382–405.
  • [28] N. Hornstein, J. Nunes, K.K. Grohmann, Understanding Minimalism, Cambridge University Press, 2005.
  • [29] R.S. Kayne, The Asymmetry of Syntax, MIT Press, 1994.
  • [30] R.S. Kayne, Temporal/Linear order, antisymmetry and externalization, Research in Generative Grammar 45 (2023) N.2. 1–22.
  • [31] M. Komachi, H. Kitahara, A. Uchibori, K. Takita, Generative procedure revisited. Reports of the Keio Institute of Cultural and Linguistic Studies, 50 (2019) 269–283.
  • [32] D. Kreimer, On the Hopf algebra structure of perturbative quantum field theories, Adv. Theor. Math. Phys. 2 (1998) N.2, 303–334.
  • [33] D. Kreimer, W.D. van Suijlekom, Recursive relations in the core Hopf algebra, Nuclear Phys. B 820 (2009), no. 3, 682–693.
  • [34] N. LaCara, The Linear Correspondence Axiom, LIN331 Syntactic Theory, course at the University of Toronto, 26 July 2018.
  • [35] J.L. Loday, M. Ronco, Hopf algebra of the planar binary trees, Adv. Math. 139 (1998) N.2, 293–309.
  • [36] J.L. Loday, M. Ronco, Order structure on the algebra of permutations and of planar binary trees, J. Alg. Combin. Vol. 15 (2002) N.3, 253–270.
  • [37] M. Marcolli, N. Chomsky, R.C. Berwick, Mathematical structure of syntactic Merge, preprint 2023.
  • [38] A. Masuoka, Quotient Theory of Hopf algebras, in (J. Bergen, S. Montgomery, Eds.), “Advances in Hopf Algebras”, Marcel Dekker, 1994, pp. 107–133.
  • [39] J. Michaelis, Transforming linear context free rewriting systems into minimalist grammars, in “Logical Aspects of Computational Linguistics (NY, 2001)” (P. de Groote, G. Morrill, and C. Retoré, Eds.), Lecture Notes in Artificial Intelligence, Vol. 2099, Springer, pp. 228–244.
  • [40] G.C. Rota, Baxter operators, an introduction, in (Joseph P.S.Kung, Ed.), “Gian-Carlo Rota on Combinatorics, Introductory papers and commentaries”, Birkhäuser, 1995, pp. 504–512.
  • [41] P. Schauenburg, H.J. Schneider, On generalized Hopf Galois extensions, Journal of Pure and Applied Algebra, Vol.202 (2005) 168–194.
  • [42] E.P. Stabler, Computational perspectives on minimalism, in “Oxford Handbook of Linguistic Minimalism” (C. Boeckx, ed.), Oxford University Press, 2010, 616–641.
  • [43] M. Takeuchi, Quotient spaces for Hopf algebras, Comm. Algebra 22 (1994), N.7, 2503–2523.
  • [44] K. Vijay-Shanker, D. Weir, The equivalence of four extensions of context free grammar formalisms, Mathematical Systems Theory, 27 (1994) 511–545.
  • [45] K. Yeats, Rearranging Dyson-Schwinger Equations, Memoirs of the American Mathematical Society, 211, American Mathematical Society, 2011.
  • [46] Y. Zhang, X. Gao, Hopf algebras of planar binary trees: an operated algebra approach, Journal of Algebraic Combinatorics, 51 (2020) 567–588.