This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

The palindromization map

Dominique Perrin and Christophe Reutenauer
Abstract

The palindromization map has been defined initially by Aldo de Luca in the context of Sturmian words. It was extended to the free group of rank 22 by Kassel and the second author. We extend their construction to arbitrary alphabets. We also investigate the suffix automaton and compact suffix automaton of the words obtained by palindromization.

1 Introduction

The iterated palindromic closure is an injective map mapping arbitrary words to palindromes. It has been introduced by Aldo de Luca in [9]. This map is used to define a representation of Sturmian words by means of a directive word and is related to a transformation introduced by Rauzy (see [20, 3]). The iterated palindromic closure has been shown in [18] to be extendable, in the case of two letters, to a map (not anymore injective) from the free group into itself and to have many interesting properties.

In this article, we study the extension of the iterated palindrome closure to a map from the free group on more than two letters to itself. We show that some of the features appearing with two letters remain valid while some others do not hold anymore. In particular, we show that the map is continuous for the profinite topology (Proposition 4.1). We were not able to characterise the kernel of the map as it is done in [18], where it is related to the braid group. We also discuss the relation with noncommutative cohomology evidenced in [18] but we show that, on more than two letters, the cocycle corresponding to the iterated palindromization map is not trivial.

In Section 5, we describe the suffix automaton of a word of the form Pal(w)\operatorname{Pal}(w), that is the minimal automaton of the set of suffixes of this word. We extend to arbitrary alphabets results concerning the suffix automaton; the corresponding results for binary alphabets are from [13].

In Section 6, we develop study compact automata. These automata have already been studied in the case of suffix automata (see the chapter by Maxime Crochemore in [19]), but they do not seem to have been considered before in the general case of automata. We fill this gap and present a direct definition of a minimal compact automaton, which is shown to be unique (Corollary 6.6), together with two other results related to the notion of reduction of automaton (Propositions 6.5 and 6.7).

This will apply in Section 7 to the construction of the minimal compact suffix automaton of Pal(u)\operatorname{Pal}(u). In that section, we construct directly the minimal compact automaton of the set of suffixes of Pal(w)\operatorname{Pal}(w). The construction extends the known construction in the binary alphabet case due to Epifanio, Mignosi, Shallit and Venturini [13] (see also [7]). It consists in computing the automaton for Pal(ua)\operatorname{Pal}(ua) from the automaton of Pal(u)\operatorname{Pal}(u), by adjoining one state, and several transitions from the first automaton to this state. From this, we deduce the exact form of the automaton (Theorem 7.1) and several of its properties (Corollaries 7.6 and 7.7)

Acknowledgements

We thank Maxime Crochemore for his advice about compact automata.

2 The palindromization map

We denote by AA^{*} the free monoid on the alphabet AA and by 11 the empty word. We denote by w~=ana2a1\tilde{w}=a_{n}\cdots a_{2}a_{1} the reversal of the word w=a1a2anw=a_{1}a_{2}\cdots a_{n} with aiAa_{i}\in A. The word ww is a palindrome if w~=w\tilde{w}=w.

We denote by FG(A)FG(A) the free group on AA. If ww is in F(A)F(A) and aa in AA, we denote by |w|a|w|_{a} the number of occurrences of aa in ww, where one counts with -1 the occurrences of a1a^{-1}; this is well defined and does not depend on the chosen expression for ww; we call it the aa-degree of ww. Moreover, define |w|=aA|w|a|w|=\sum_{a\in A}|w|_{a}, the algebraic length of ww. In particular, if wAw\in A^{*}, then |w||w| is the length of ww.

The reversal of the element w=a1a2anw=a_{1}a_{2}\cdots a_{n} with aiAA1a_{i}\in A\cup A^{-1} is the element w~=ana2a1\tilde{w}=a_{n}\cdots a_{2}a_{1}. This does not depend on the chosen expression for ww. The map ww~w\mapsto\tilde{w} is an antimorphism, that is, it satisfies uv~=v~u~\widetilde{uv}=\tilde{v}\tilde{u}. We also say that an element ww of FG(A)FG(A) is a palindrome if w~=w\tilde{w}=w.

Let us define two morphisms L:uLuL\colon u\mapsto L_{u} and R:uRuR\colon u\mapsto R_{u} from FG(A)FG(A) into its group of automorphisms as follows. For a,bAa,b\in A, we set

La(b)={aif b=aabotherwiseL_{a}(b)=\begin{cases}a&\mbox{if $b=a$}\\ ab&\mbox{otherwise}\end{cases}

The map LaL_{a} is an automorphism of FG(A)FG(A) since its inverse is the map

La1(b)={aif b=aa1botherwiseL_{a}^{-1}(b)=\begin{cases}a&\mbox{if $b=a$}\\ a^{-1}b&\mbox{otherwise}\end{cases}

Symmetrically, for a,bAa,b\in A, we set Ra(b)=La(b)~R_{a}(b)=\widetilde{L_{a}(b)}.

Thus, for example, La(ab1)=aLa(b)1=ab1a1L_{a}(ab^{-1})=aL_{a}(b)^{-1}=ab^{-1}a^{-1} and Ra(ab1)=aRa(b)1=aa1b1=b1R_{a}(ab^{-1})=aR_{a}(b)^{-1}=aa^{-1}b^{-1}=b^{-1}.

Note that L,RL,R are related by two identities. The first one is

aRa(u)=La(u)aaR_{a}(u)=L_{a}(u)a (2.1)

for every aAA1a\in A\cup A^{-1} and uFG(A)u\in FG(A). Indeed, this is true when uu is a letter; by rewriting equivalently Ra(u)=a1La(u)aR_{a}(u)=a^{-1}L_{a}(u)a, this equality follows since both sides are the image of uu under an automorphism of FG(A)FG(A).

The second one is

Ru(v)=Lu(v~)~R_{u}(v)=\widetilde{L_{u}(\tilde{v})} (2.2)

for every u,vFG(A)u,v\in FG(A). Indeed, this is true when uu is a letter and the general case follows similarly.

Every word wAw\in A^{*} is a prefix of some palindrome since ww~w\tilde{w} is always a palindrome. Thus, there exists a palindrome of shortest length which has ww as a prefix. Actually, this palindrome is unique. It is called the palindromic closure of ww and denoted w(+)w^{(+)}. One has w(+)=yzy~w^{(+})=yz\tilde{y} where zz is the longest palindrome suffix of w=yzw=yz (for these results, due to Aldo de Luca, see for example [21] Proposition 12.1.1).

For example, (abaa)(+)=abaaba(abaa)^{(+)}=abaaba, since the longest palindromic suffix of abaaabaa is z=aaz=aa, and y=aby=ab.

Let Pal\operatorname{Pal} be the unique map from AA^{*} to AA^{*} such that Pal(1)=1\operatorname{Pal}(1)=1, and for wAw\in A^{*} and aAa\in A

Pal(wa)=(Pal(w)a)(+).\operatorname{Pal}(wa)=(\operatorname{Pal}(w)a)^{(+)}.

For a word ww, Pal(w)\operatorname{Pal}(w) is called the iterated palindromic closure of ww. The iterated palindromic closure has been introduced by Aldo de Luca [9] who has shown that it is injective (see for example [21] p. 102, where the injectivity is proved by an algorithm).

For example Pal(aba)=abaaba\operatorname{Pal}(aba)=abaaba, since Pal(a)=a\operatorname{Pal}(a)=a, Pal(ab)=(ab)(+)=aba\operatorname{Pal}(ab)=(ab)^{(+)}=aba, and finally Pal(aba)=(abaa)(+)=abaaba\operatorname{Pal}(aba)=(abaa)^{(+)}=abaaba.

The mapping Pal\operatorname{Pal} may be extended to infinite words, since when uu is a proper prefix of vv, Pal(u)\operatorname{Pal}(u) is a proper prefix of Pal(v)\operatorname{Pal}(v).

An important property of Pal\operatorname{Pal} is the following functional equation, known as Justin’s Formula [17]. For all u,vAu,v\in A^{*},

Pal(uv)=Pal(u)Ru(Pal(v)).\operatorname{Pal}(uv)=\operatorname{Pal}(u)R_{u}(\operatorname{Pal}(v)). (2.3)

A dual form of (2.3) (which is the one actually given in [17, Lemma 2.1]) is

Pal(uv)=Lu(Pal(v))Pal(u).\operatorname{Pal}(uv)=L_{u}(\operatorname{Pal}(v))\operatorname{Pal}(u). (2.4)

Indeed, assuming (2.4), we have using (2.2),

Pal(uv)\displaystyle\operatorname{Pal}(uv) =\displaystyle= Pal(uv)~=Pal(u)~Lu(Pal(v))~\displaystyle\widetilde{\operatorname{Pal}(uv)}=\widetilde{\operatorname{Pal}(u)}\widetilde{L_{u}(\operatorname{Pal}(v))} (2.5)
=\displaystyle= Pal(u)Ru(Pal(v)~)=Pal(u)Ru(Pal(v)).\displaystyle\operatorname{Pal}(u)R_{u}(\widetilde{\operatorname{Pal}(v)})=\operatorname{Pal}(u)R_{u}(\operatorname{Pal}(v)). (2.6)

Hence (2.4) implies (2.3). The reverse implication is true, too, as is similarly verified.

We want to prove the following result, extending the construction of [18] to arbitrary alphabets.

Theorem 2.1

There exists a unique extension of Pal\operatorname{Pal} to a map from FG(A)FG(A) to itself fixing every aAA1a\in A\cup A^{-1} and satisfying (2.3) for every u,vFG(A)u,v\in FG(A).

We will still denote by Pal\operatorname{Pal} the extension of the iterated palindromic closure to the free group and call it the palindromization map. We will see below that the extension also satisfies (2.4), for any u,vFG(A)u,v\in FG(A).

The statement follows directly from the following property.

Proposition 2.2

Let α:uFG(A)αuAut(FG(A))\alpha:u\in FG(A)\mapsto\alpha_{u}\in\operatorname{Aut}(FG(A)) be a morphism from the free group on AA to the group of its automorphisms, such that αa(a)=a\alpha_{a}(a)=a for every aAa\in A. There exists a unique map f:FG(A)FG(A)f:FG(A)\to FG(A) such that

  • (i)

    f(a)=af(a)=a for every aAA1a\in A\cup A^{-1}.

  • (ii)
    f(uv)=f(u)αu(f(v))f(uv)=f(u)\alpha_{u}(f(v)) (2.7)

    for every u,vFG(A)u,v\in FG(A).

Proof.

We prove first uniqueness of ff. Equation (2.7) with u=v=1u=v=1 implies that f(1)=1f(1)=1.

Next, the same equation implies

f(au)=aαa(f(u)).f(au)=a\alpha_{a}(f(u)).

for each uFG(A)u\in FG(A) and each aAA1a\in A\cup A^{-1}. Thus there is a unique ff such that for each reduced word auau with aAA1a\in A\cup A^{-1} one has f(au)=aαa(f(u))f(au)=a\alpha_{a}(f(u)). Indeed, this is true when u=1u=1 and follows easily using induction on the length of auau. Thus there is at most one map ff satisfying conditions (i) and (ii).

To prove the existence, let us prove that, for this map ff, Equation (2.7) holds for every vFG(A)v\in FG(A) by induction on the length l(u)l(u) of the reduced word uu. It holds for |u|=0|u|=0. Next, let uFG(A)u\in FG(A) be of positive length. Set u=uau=u^{\prime}a with aAA1a\in A\cup A^{-1}, in reduced form. Then l(u)=l(u)1l(u^{\prime})=l(u)-1. Since uv=u(av)uv=u^{\prime}(av), the induction hypothesis implies

f(uv)\displaystyle f(uv) =\displaystyle= f(u)αu(f(av)).\displaystyle f(u^{\prime})\alpha_{u^{\prime}}(f(av)).

Applying again the induction hypothesis, we obtain

f(u)=f(ua)=f(u)αu(a).f(u)=f(u^{\prime}a)=f(u^{\prime})\alpha_{u^{\prime}}(a).

- Assume first that avav is reduced. Then, by definition of ff, we have f(av)=aαa(f(v))f(av)=a\alpha_{a}(f(v)). Thus

f(uv)\displaystyle f(uv) =\displaystyle= f(u)αu(aαa(f(v)))\displaystyle f(u^{\prime})\alpha_{u^{\prime}}(a\alpha_{a}(f(v)))
=\displaystyle= f(u)αu(a)αu(f(v)).\displaystyle f(u^{\prime})\alpha_{u^{\prime}}(a)\alpha_{u}(f(v)).

Thus we obtain

f(uv)\displaystyle f(uv) =\displaystyle= f(u)αu(f(v)).\displaystyle f(u)\alpha_{u}(f(v)).

- Assume now that avav is not reduced; set v=a1vv=a^{-1}v^{\prime} in reduced form. Then, by definition of ff, we have f(v)=f(a1v)=a1αa1(f(v))f(v)=f(a^{-1}v^{\prime})=a^{-1}\alpha_{a^{-1}}(f(v^{\prime})).

Thus, since αa(a1)=a1\alpha_{a}(a^{-1})=a^{-1}, we have

f(u)αu(f(v))\displaystyle f(u)\alpha_{u}(f(v)) =\displaystyle= f(u)αu(a)αua(a1αa1(f(v))\displaystyle f(u^{\prime})\alpha_{u^{\prime}}(a)\alpha_{u^{\prime}a}(a^{-1}\alpha_{a^{-1}}(f(v^{\prime}))
=\displaystyle= f(u)αu(a)αu(a1)αu(f(v))\displaystyle f(u^{\prime})\alpha_{u^{\prime}}(a)\alpha_{u^{\prime}}(a^{-1})\alpha_{u^{\prime}}(f(v^{\prime}))
=\displaystyle= f(u)αu(f(v))\displaystyle f(u^{\prime})\alpha_{u^{\prime}}(f(v^{\prime}))

which by induction hypothesis is equal to f(uv)=f(uv)f(u^{\prime}v^{\prime})=f(uv). ∎

We now verify the following property of the palindromization map.

Proposition 2.3

For every wFG(A)w\in FG(A), Pal(w)\operatorname{Pal}(w) is a palindrome.

Proof.

For every uFG(A)u\in FG(A) and aAA1a\in A\cup A^{-1}, we have

aRa(u~)=Ra(u)~a.aR_{a}(\tilde{u})=\widetilde{R_{a}(u)}a. (2.8)

Indeed, aRa(u~)=La(u~)a=Ra(u)~aaR_{a}(\tilde{u})=L_{a}(\tilde{u})a=\widetilde{R_{a}(u)}a by (2.1) and (2.2). It follows from (2.8) that for every aAA1a\in A\cup A^{-1} and every palindrome uFG(A)u\in FG(A), aRa(u)aR_{a}(u) is a palindrome.

Let us now show by induction on the length of the reduced word representing wFG(A)w\in FG(A) that Pal(w)\operatorname{Pal}(w) is a palindrome. It is true if w=1w=1. Next, set w=auw=au in reduced form with aAA1a\in A\cup A^{-1} and uFG(A)u\in FG(A). We have by (2.3), Pal(w)=aRa(Pal(u))\operatorname{Pal}(w)=aR_{a}(\operatorname{Pal}(u)). By induction hypothesis, Pal(u)\operatorname{Pal}(u) is a palindrome and it follows from the previous observation that Pal(w)\operatorname{Pal}(w) is a palindrome, too. ∎

It follows from Proposition 2.3, using the same argument as in (2.6), that the map Pal\operatorname{Pal} satisfies also (2.4) for every u,vFG(A)u,v\in FG(A).

As an example, we have Pal(b1)=b1Pal(b^{-1})=b^{-1} and Pal(ab1)=Pal(a)Ra(Pal(b1))=aRa(b1)=a(ba)1=b1\operatorname{Pal}(ab^{-1})=\operatorname{Pal}(a)R_{a}(Pal(b^{-1}))=aR_{a}(b^{-1})=a(ba)^{-1}=b^{-1}. This shows that the extension of Pal\operatorname{Pal} to FG(A)FG(A) is not injective. In the case of a binary alphabet, one can characterize the kernel of Pal\operatorname{Pal} as follows.

Let B3B_{3} be the braid group on three strands defined as

B3=σ1,σ2σ1σ2σ1=σ2σ1σ2.B_{3}=\langle\sigma_{1},\sigma_{2}\mid\sigma_{1}\sigma_{2}\sigma_{1}=\sigma_{2}\sigma_{1}\sigma_{2}\rangle.

Let β:FG(a,b)B3\beta:FG(a,b)\to B_{3} be the morphism β:aσ1,bσ2\beta:a\mapsto\sigma_{1},b\mapsto\sigma_{2}. For any u,vFG(a,b)u,v\in FG(a,b), one has Pal(u)=Pal(v)\operatorname{Pal}(u)=\operatorname{Pal}(v) if and only if β(u1v)σ1σ21σ11\beta(u^{-1}v)\in\langle\sigma_{1}\sigma_{2}^{-1}\sigma_{1}^{-1}\rangle (see [18, Proposition 5.2]). No such characterization is known on more than two letters.

3 Semidirect products, cocycles and sequential functions

We discuss now several interpretations of Justin’s Formula.

Semidirect products

Observe first that Equation (2.7) (and thus also Equation (2.3)) can be expressed in terms of semidirect products. Indeed, consider the semidirect product FG(A)αFG(A)FG(A)*_{\alpha}FG(A) of FG(A)FG(A) with itself corresponding to the morphism α\alpha from FG(A)FG(A) into Aut(FG(A))\operatorname{Aut}(FG(A)). By definition, it is the set of pairs (u,v)FG(A)×FG(A)(u,v)\in FG(A)\times FG(A) with the product

(u,v)(r,s)=(uαv(r),vs)(u,v)(r,s)=(u\alpha_{v}(r),vs)

Equation (2.7) expresses the fact that δ:w(f(w),w)\delta:w\mapsto(f(w),w) is a morphism from FG(A)FG(A) to FG(A)αFG(a)FG(A)*_{\alpha}FG(a). Indeed, assuming (2.7), we have for every u,vFG(A)u,v\in FG(A),

δ(u)δ(v)=(f(u),u)(f(v),v)=(f(u)αu(f(v)),uv)=f(uv),uv)=δ(uv).\delta(u)\delta(v)=(f(u),u)(f(v),v)=(f(u)\alpha_{u}(f(v)),uv)=f(uv),uv)=\delta(uv).

This proves the following statement.

Proposition 3.1

The map u(Pal(u),u)u\mapsto(\operatorname{Pal}(u),u) is the unique morphism from FG(A)FG(A) to FG(A)RFG(A)FG(A)*_{R}FG(A) sending every aAa\in A to (a,a)(a,a).

Cocycles

Justin’s Formula is also related to the notion of nonabelian group cohomology, as pointed out in [18]. A function ff, from FG(A)FG(A) to itself, is a 1-cocycle, with respect to a group morphism α:uαu\alpha:u\mapsto\alpha_{u} from FG(A)FG(A) to Aut(FG(A))\operatorname{Aut}(FG(A)), if (2.7) holds for all u,vFG(A)u,v\in FG(A). Thus Pal\operatorname{Pal}, as a function from FG(A)FG(A) to itself, is a 1-cocycle with respect to the morphism RR. Such a 1-cocycle is trivial if there is an element xFG(A)x\in FG(A) such that

f(u)=x1αu(x)f(u)=x^{-1}\alpha_{u}(x) (3.1)

for every uFG(A)u\in FG(A). When AA has two elements, one has Pal(u)=(ab)1Ru(ab)\operatorname{Pal}(u)=(ab)^{-1}R_{u}(ab) by [18, Equation (3.1)]. Thus the 1-cocycle Pal\operatorname{Pal} is trivial. This is not the case on more than two letters. Indeed, suppose that xFG(A)x\in FG(A) is such that Pal(u)=x1Ru(x)\operatorname{Pal}(u)=x^{-1}R_{u}(x) for all uFG(A)u\in FG(A). One has then xa=Ra(x)xa=R_{a}(x) for every aAa\in A and thus, by taking the aa-degree: |x|a+1=|x||x|_{a}+1=|x|. This implies, by summing over all aAa\in A, |x|(Card(A)1)=Card(A)|x|(\operatorname{Card}(A)-1)=\operatorname{Card}(A), which is impossible for Card(A)3\operatorname{Card}(A)\geq 3 since |x||x| is an integer.

Sequential functions

Equation (2.3) can also be seen as expressing that, as a function from AA^{*} to AA^{*}, the map Pal\operatorname{Pal} is a sequential function, that is, a function computed by a sequential transducer. Let us recall the definition of a sequential transducer on a set QQ of states. Let

(q,a)Q×AqaQ(q,a)\in Q\times A\mapsto q\cdot a\in Q

be a map, called the transition function. This map extends to a right action of AA^{*} on QQ by q(ua)=(qu)aq\cdot(ua)=(q\cdot u)\cdot a for uAu\in A^{*} and aAa\in A. In addition, let

(q,a)Q×AqaA(q,a)\in Q\times A\mapsto q*a\in A^{*}

be a map called the output function. This map extends to a map from Q×AQ\times A^{*} to AA^{*} by

q(ua)=(qu)((qu)a).q*(ua)=(q*u)((q\cdot u)*a).

Given a sequential transducer on QQ defined by the maps (q,a)qa(q,a)\mapsto q\cdot a and (q,a)qa(q,a)\mapsto q*a and given an initial state iQi\in Q, the function f:AAf:A^{*}\to A^{*} defined by the transducer is

f(w)=iwf(w)=i*w
Proposition 3.2

The function Pal\operatorname{Pal} is defined by the transducer on the set of states Aut(FG(A))\operatorname{Aut}(FG(A)) with transition and output functions

Rua=Rua and Rua=Ru(a)R_{u}\cdot a=R_{ua}\mbox{ and }R_{u}*a=R_{u}(a)

respectively, and with initial state i=R1i=R_{1}.

Proof.

We prove by induction on the length of wAw\in A^{*} that Pal(w)=iw\operatorname{Pal}(w)=i*w. It is true for w=1w=1 and next, assuming that Pal(u)=iu\operatorname{Pal}(u)=i*u, we obtain for every aAa\in A,

i(ua)=(iu)((iu)a)=Pal(u)Ru(a)=Pal(ua).i*(ua)=(i*u)((i\cdot u)*a)=\operatorname{Pal}(u)R_{u}(a)=\operatorname{Pal}(ua).

4 Uniform continuity of Pal\operatorname{Pal} for the profinite distance

The profinite topology on the free group FG(A)FG(A) is the topology generated by the inverse images of subsets of a finite group FF by a morphism φ:FG(A)F\varphi:FG(A)\to F. Equivalently, it is the coarsest topology such that every morphism to a finite discrete group is continuous. This topology on the free group was introduced by Hall (see [16]). It is a particular case of a more general notion which extends to varieties of groups and also of semigroups (see [22] and [1]).

The following is proved in [18] for the case of a binary alphabet.

Proposition 4.1

The map Pal:FG(A)FG(A)\operatorname{Pal}:FG(A)\to FG(A) is continuous for the profinite topology.

Actually we shall prove a slightly more general result. Following [1] p. 57, we define the profinite distance dd on FG(A)FG(A) by the formula

d(u,v)=1r(u,v)d(u,v)=\frac{1}{r(u,v)}

where u,vu,v are distinct, and where r(u,v)r(u,v) is the minimal cardinality of a group GG such that, for some morphism φ:FG(A)G\varphi:FG(A)\to G, φ(u)φ(v)\varphi(u)\neq\varphi(v). Since FG(A)FG(A) is residually finite, r(u,v)r(u,v) is a finite integer for each pair u,vu,v of distinct elements in FG(A)FG(A). It is easy to prove that r(u,w)min(r(u,v),r(v,w))r(u,w)\geq\min(r(u,v),r(v,w)), hence

d(u,w)max(d(u,v),d(v,w)),d(u,w)\leq\max(d(u,v),d(v,w)),

which means that dd is an ultrametric distance.

The topology of FG(A)FG(A) induced by dd is precisely the profinite topology.

Proposition 4.1 follows from the next result, since each uniformly continuous function is continuous.

Proposition 4.2

The map Pal:FG(A)FG(A)\operatorname{Pal}:FG(A)\to FG(A) is uniformly continuous for the profinite distance.

We need in the proof the following characterizations of uniformly continuous functions.

1. A function P:FG(A)FG(A)P:FG(A)\to FG(A) is uniformly continuous if and only if for any morphism φ:FG(A)G\varphi:FG(A)\to G, the function φP:FG(A)G\varphi\circ P:FG(A)\to G is uniformly continuous, where GG has the discrete distance.

2. A function P:FG(A)GP^{\prime}:FG(A)\to G, with GG a finite group with discrete distance, is uniformly continuous if and only if it factorizes as P=hψP^{\prime}=h\circ\psi, where ψ:FG(A)S\psi:FG(A)\to S is a morphism into a finite group, and h:SGh:S\to G is some function.

Proof.

We apply the first criterion above to the function P=PalP=\operatorname{Pal}. Thus let φ:FG(A)G\varphi:FG(A)\to G be a group morphism into a finite group GG.

We define a right action of FG(A)FG(A) on the finite set Q=G×Hom(FG(A),G)Q=G\times\operatorname{Hom}(FG(A),G) by

(g,π)w=(gπ(Pal(w)),πRw).(g,\pi)\cdot w=(g\pi(\operatorname{Pal}(w)),\pi\circ R_{w}). (4.1)

It is indeed a right action, since the associativity follows from

((g,π)u)v\displaystyle((g,\pi)\cdot u)\cdot v =\displaystyle= (gπ(Pal(u)),πRu)v\displaystyle(g\pi(\operatorname{Pal}(u)),\pi\circ R_{u})\cdot v
=\displaystyle= (gπ(Pal(u))πRu(Pal(v)),πRuRv)\displaystyle(g\pi(\operatorname{Pal}(u))\pi\circ R_{u}(\operatorname{Pal}(v)),\pi\circ R_{u}\circ R_{v})
=\displaystyle= (gπ(Pal(u)Ru(Pal(v))),πRuv)\displaystyle(g\pi(\operatorname{Pal}(u)R_{u}(\operatorname{Pal}(v))),\pi\circ R_{uv})
=\displaystyle= (gπ(Pal(uv)),πRuv)=(g,π)(uv).\displaystyle(g\pi(\operatorname{Pal}(uv)),\pi\circ R_{uv})=(g,\pi)\cdot(uv).

It follows from Equation (4.1) that for every wFG(A)w\in FG(A), one has

(1G,φ)w=(φ(Pal(w)),φRw).(1_{G},\varphi)\cdot w=(\varphi(\operatorname{Pal}(w)),\varphi\circ R_{w}).

In order to apply the second criterion, we define now a morphism ψ:FG(A)S\psi:FG(A)\to S, where SS is the symmetric group on QQ: for any ww in FG(A)FG(A), ψ(w)\psi(w) is the permutation qqwq\mapsto q\cdot w. Then ψ\psi is a morphism, because of the right action defined above, letting SS act on the right on QQ. Note that SS is a finite group, since QQ is finite.

Define a function h:SGh:S\to G; it sends each element σS\sigma\in S onto the first component gg of the pair (g,π)=(1G,φ)σ(g,\pi)=(1_{G},\varphi)\sigma. Thus we have hψ(w)=h\circ\psi(w)= first component of (1G,φ)w(1_{G},\varphi)\cdot w, which is equal to φPal(w)\varphi\circ\operatorname{Pal}(w); thus hψ=φPalh\circ\psi=\varphi\circ\operatorname{Pal}, which allows to conclude, according to the second criterion, that φPal\varphi\circ\operatorname{Pal} is uniformly continuous. With the first criterion, we see that Pal\operatorname{Pal} is uniformly continuous. ∎

Since AA^{*} is dense in FG(A)FG(A) for the profinite topology (see [1] or [2] for example), we deduce from Proposition 4.1 the following statement.

Corollary 4.3

The map Pal:FG(A)FG(A)\operatorname{Pal}:FG(A)\to FG(A) is the unique continuous extension to FG(A)FG(A) of the iterated palindromic closure of AA^{*}.

Though Pal\operatorname{Pal} is continuous for the profinite topology, it is not continuous for the pro-pp topology on the free group F2F_{2} of rank 22, where pp is a prime number. Recall that the pro-pp topology is the coarsest topology such that every group homomorphism from F2F_{2} into a finite pp-group is continuous (see [18, Remark 6.3]).

5 Suffix automaton

The minimal automaton of the set of suffixes of a word ww is called the suffix automaton of ww. These automata has been extensively studied (see [19, Chapter 2]) A striking property (originally due to [4]) is that its number of states is at most 2|w|12|w|-1.

Let uu be a word over an arbitrary alphabet AA. We denote by 𝒮(u)\mathcal{S}(u) the suffix automaton of Pal(u)\operatorname{Pal}(u).

Part (i) of the following result is not new, but we give a proof for sake of completeness.

Theorem 5.1

The automaton 𝒮(u)\mathcal{S}(u) has the following properties:

  1. (i)

    It has |Pal(u)|+1|\operatorname{Pal}(u)|+1 states, which may be naturally identified with the prefixes of Pal(u)\operatorname{Pal}(u).

  2. (ii)

    Its terminal states are the palindromic prefixes of Pal(u)\operatorname{Pal}(u).

This result, for a binary alphabet, is due to  [13] ((i) in the binary case also follows from Theorem 1 in [24], which characterizes the binary words whose suffix automaton has |w|+1|w|+1 states); see also [13], [12], [7]. Moreover, for general alphabets, Part (i) of the theorem is a consequence of the remark in Section 5 in Fici’s article [14]. Note that that characterizations of the words ww such that the suffix automaton of ww has |w|+1|w|+1 states have been given by Fici [14] and Richomme [23].

A factor ww of a word uu (resp. an infinite word ss) is left-special if there are at least two letters aa such that awaw is a factor of uu (resp. ss).

The following is from [10, Proposition 5] (see also [11, Proposition 1.5.11]). Recall that for any infinite word xx, the infinite word Pal(x)\operatorname{Pal}(x) is well-defined.

Proposition 5.2

If s=Pal(x)s=\operatorname{Pal}(x) for some infinite word xx, the left-special factors of ss are the prefixes of ss.

We will use the following consequence.

Corollary 5.3

For any (finite) word uu, the left special factors of Pal(u)\operatorname{Pal}(u) are prefixes of Pal(u)\operatorname{Pal}(u).

Proof.

Set s=Pal(uω)s=\operatorname{Pal}(u^{\omega}). Let pp be a left-special factor of Pal(u)\operatorname{Pal}(u). Since Pal(u)\operatorname{Pal}(u) is a prefix of ss, the word pp is also left-special with respect to ss. By Proposition 5.2, pp is a prefix of ss and thus of Pal(u)\operatorname{Pal}(u). ∎

Note that not all prefixes of Pal(u)\operatorname{Pal}(u) need to be left-special. For example, if u=abu=ab, the factor abab is not a left-special factor of Pal(u)=aba\operatorname{Pal}(u)=aba.

Given a language LL, a residual of LL is set of the form u1L={vAuvL}u^{-1}L=\{v\in A^{*}\mid uv\in L\}. It is well-known that the minimal automaton of a language LL, denoted 𝒜(L)\mathcal{A}(L), has the set of nonempty residuals of LL as set of states.

Proof of Theorem 5.1. Let 11 be the initial state of 𝒮(u)\mathcal{S}(u). Let PP be the set of prefixes of Pal(u)\operatorname{Pal}(u). The map α:p1p\alpha\colon p\mapsto 1\cdot p is injective. Indeed, let p,pPp,p^{\prime}\in P be such that 1p=1p1\cdot p=1\cdot p^{\prime}. Assuming that |p||p||p|\leq|p^{\prime}|, let rr be such that p=prp^{\prime}=pr. Then 1pr=1p=1p1\cdot pr=1\cdot p^{\prime}=1\cdot p and thus (1p)r=1p(1\cdot p)\cdot r=1\cdot p. Since the language recognized by 𝒮(u)\mathcal{S}(u) is finite, the graph of 𝒮(u)\mathcal{S}(u) is acyclic, which forces r=1r=1. Thus p=pp=p^{\prime}.

Let us show now that α\alpha is surjective. Let qq be a state of the automaton 𝒮(u)\mathcal{S}(u). Let ww be a word such that 1w=q1\cdot w=q. Since the automaton is co-accessible, there is some word ss such that 1ws1\cdot ws is a terminal state, and thus wsws is a suffix of Pal(u)\operatorname{Pal}(u). Hence the word ww is a factor of Pal(u)\operatorname{Pal}(u). Let pp be the shortest prefix of Pal(u)\operatorname{Pal}(u) such that pwpw is a prefix of Pal(u)\operatorname{Pal}(u). Let us show by induction on the length of pp^{\prime} that for every suffix pp^{\prime} of pp, one has 1pw=q1\cdot p^{\prime}w=q. It is true if p=1p^{\prime}=1. Otherwise, set p=ap′′p^{\prime}=ap^{\prime\prime}. We have 1p′′w=q1\cdot p^{\prime\prime}w=q by induction hypothesis.

Let ss be an arbitrary word such that wsws belongs to the set SS of suffixes of Pal(u)\operatorname{Pal}(u). Since 1p′′w=1w1\cdot p^{\prime\prime}w=1\cdot w, and since the automaton 𝒮(u)\mathcal{S}(u) is the minimal automaton of SS, so that (p′′w)1S=w1S(p^{\prime\prime}w)^{-1}S=w^{-1}S, and sw1Ss\in w^{-1}S, we have p′′wsSp^{\prime\prime}ws\in S. We cannot have p′′ws=Pal(u)p^{\prime\prime}ws=\operatorname{Pal}(u), since otherwise p′′wp^{\prime\prime}w is a prefix of Pal(u)\operatorname{Pal}(u) with p′′p^{\prime\prime} shorter than pp.

Thus there is some bAb\in A such that bp′′wsSbp^{\prime\prime}ws\in S . Since ap′′w=pwap^{\prime\prime}w=p^{\prime}w is a factor of Pal(u)\operatorname{Pal}(u) and since, by Corollary 5.3, p′′wp^{\prime\prime}w is not left-special (because p′′wp^{\prime\prime}w is not a prefix of Pal(u)\operatorname{Pal}(u) for the same reason as above), we have a=ba=b. This shows that wsSpwsSws\in S\Rightarrow p^{\prime}ws\in S. The converse is also true and thus that w1S=(pw)1Sw^{-1}S=(p^{\prime}w)^{-1}S, that is 1pw=1w=q1\cdot p^{\prime}w=1\cdot w=q, since the automaton is minimal.

We conclude that 1pw=q1\cdot pw=q. This shows that α\alpha is surjective and proves property (i). Next, if pp is a prefix of Pal(u)\operatorname{Pal}(u) which is also a suffix, then pp is a palindrome and thus property (ii) is true.         

Note that the automaton 𝒮(u)\mathcal{S}(u) has the additional property

  1. (iii)

    The label of an edge depends only on its end.

Actually, this property holds for any suffix automaton, as is well-known. Indeed, if p,qp,q are two states of the suffix automaton of a word ww such that pa=qb=rp\cdot a=q\cdot b=r, let u,v,tu,v,t be such that 1uparts1\stackrel{{\scriptstyle u}}{{\rightarrow}}p\stackrel{{\scriptstyle a}}{{\rightarrow}}r\stackrel{{\scriptstyle t}}{{\rightarrow}}s and 1vqbrts1\stackrel{{\scriptstyle v}}{{\rightarrow}}q\stackrel{{\scriptstyle b}}{{\rightarrow}}r\stackrel{{\scriptstyle t}}{{\rightarrow}}s with ss a terminal state. Then uatuat and vbtvbt are suffixes of ww, which implies a=ba=b.

Example 5.4

Consider u=abcu=abc. We have Pal(u)=abacaba\operatorname{Pal}(u)=abacaba and the automaton 𝒮(abc)\mathcal{S}(abc) is represented in Figure 1.

011223344556677aabbaaccaabbaabbcccc
Figure 1: The automaton 𝒮(abc)\mathcal{S}(abc).

6 Compact automata

We explore the notion of compact automaton in which the edges can be labeled by nonempty words instead of letters. This version of automata appears, in the case of compact suffix automata, in [6] or [8]. It is also presented in the chapter by Maxime Crochemore in [19]. In particular, the construction of a minimal compact suffix automaton is described (see also [5]). We will show here that it is possible to define in complete generality a minimal compact automaton for every language.

A compact automaton 𝒜=(Q,E,I,T)\mathcal{A}=(Q,E,I,T) is given by a set of states QQ, a set of edges EQ×A+×QE\subset Q\times A^{+}\times Q, a set initial states IQI\subset Q and a set of terminal states TQT\subset Q. A path p0u0p1u1p2un1pnp_{0}\stackrel{{\scriptstyle u_{0}}}{{\rightarrow}}p_{1}\stackrel{{\scriptstyle u_{1}}}{{\rightarrow}}p_{2}\ldots\stackrel{{\scriptstyle u_{n-1}}}{{\rightarrow}}p_{n} is a sequence of consecutive edges. Its label is u0u1un1u_{0}u_{1}\cdots u_{n-1}. The language recognized by 𝒜\mathcal{A}, denoted L(𝒜)L(\mathcal{A}), is the set of labels od successful paths, that is paths from II to TT.

An ordinary automaton is clearly a particular case of a compact automaton.

A compact automaton is deterministic if Card(I)=1\operatorname{Card}(I)=1 and if for every state pp, the labels of the edges starting at pp begin with distinct letters.

Again, an ordinary deterministic automaton is deterministic as a compact automaton.

Example 6.1

The compact automaton of Figure 2 is deterministic. Its initial state (indicated with an incoming arrow) is 0 and 22 (with a double circle) is the unique terminal state.

01122aaaaaababa
Figure 2: A deterministic compact automaton.

The set of special states of a compact automaton 𝒜=(Q,E,I,T)\mathcal{A}=(Q,E,I,T) is the set QsQ_{s} of states qq which either belong to ITI\cup T or such that there are edges going out of qq with labels beginning with distinct letters.

Let p,qp,q be special states. A path pwqp\stackrel{{\scriptstyle w}}{{\rightarrow}}q is special if the only special states on the path are its origin and its end.

A reduction from a deterministic compact automaton 𝒜=(Q,E,i,T)\mathcal{A}=(Q,E,i,T) onto a deterministic compact automaton 𝒜=(Q,E,i,T)\mathcal{A}^{\prime}=(Q^{\prime},E^{\prime},i^{\prime},T^{\prime}) is a map φ\varphi from QsQ_{s} onto QsQ^{\prime}_{s} such that

  1. 1.

    φ(i)=i\varphi(i)=i^{\prime},

  2. 2.

    φ(p)T\varphi(p)\in T^{\prime} if and only if pTp\in T,

  3. 3.

    for every p,qQsp,q\in Q_{s}, there is a special path pwqp\stackrel{{\scriptstyle w}}{{\rightarrow}}q in 𝒜\mathcal{A} if and only there is a special path φ(p)wφ(q)\varphi(p)\stackrel{{\scriptstyle w}}{{\rightarrow}}\varphi(q) in 𝒜\mathcal{A}^{\prime}.

An automaton is trim if every state is on some successful path.

Proposition 6.2

If 𝒜,𝒜\mathcal{A},\mathcal{A}^{\prime} are deterministic compact automata and if φ:𝒜𝒜\varphi\colon\mathcal{A}\to\mathcal{A}^{\prime} is a reduction, then L(𝒜)=L(𝒜)L(\mathcal{A})=L(\mathcal{A}^{\prime}).

Proof.

If ww is in L(𝒜)L(\mathcal{A}), there is a path iwti\stackrel{{\scriptstyle w}}{{\rightarrow}}t with tTt\in T. Let w=w0w1wnw=w_{0}w_{1}\cdots w_{n} be the factorisation of ww such that the path has the form q0w0q1w1qnwnqn+1q_{0}\stackrel{{\scriptstyle w_{0}}}{{\rightarrow}}q_{1}\stackrel{{\scriptstyle w_{1}}}{{\rightarrow}}\cdots q_{n}\stackrel{{\scriptstyle w_{n}}}{{\rightarrow}}q_{n+1} with each path qiwiqi+1q_{i}\stackrel{{\scriptstyle w_{i}}}{{\rightarrow}}q_{i+1} being special and where q0=i,qn+1=tq_{0}=i,q_{n+1}=t. Since φ\varphi is a reduction, there is for each ii with 0in0\leq i\leq n, a special path φ(qi)wiφ(qi+1)\varphi(q_{i})\stackrel{{\scriptstyle w_{i}}}{{\rightarrow}}\varphi(q_{i+1}). Thus there is in 𝒜\mathcal{A}^{\prime} a path i=φ(i)wφ(t)Ti^{\prime}=\varphi(i)\stackrel{{\scriptstyle w}}{{\rightarrow}}\varphi(t)\in T^{\prime}, which implies that ww is in L(𝒜)L(\mathcal{A}^{\prime}).

Conversely, let wL(𝒜)w\in L(\mathcal{A}^{\prime}). Then there exists a path iwti^{\prime}\stackrel{{\scriptstyle w}}{{\rightarrow}}t^{\prime} with tTt^{\prime}\in T^{\prime}. We may decompose this path as i=q0w0q1w1qnwnqn+1=ti^{\prime}=q^{\prime}_{0}\stackrel{{\scriptstyle w_{0}}}{{\rightarrow}}q^{\prime}_{1}\stackrel{{\scriptstyle w_{1}}}{{\rightarrow}}\cdots q^{\prime}_{n}\stackrel{{\scriptstyle w_{n}}}{{\rightarrow}}q^{\prime}_{n+1}=t^{\prime}, where each path qiwiqi+1q^{\prime}_{i}\stackrel{{\scriptstyle w_{i}}}{{\rightarrow}}q^{\prime}_{i+1} is special. By surjectivity of φ\varphi, we have qi=φ(qi)q^{\prime}_{i}=\varphi(q_{i}), q0=iq_{0}=i by Condition 1 in the definition of reduction, and qn+1=tTq_{n+1}=t\in T by Condition 2. Next, there is in 𝒜\mathcal{A} a special path qiwiqi+1q_{i}\stackrel{{\scriptstyle w_{i}}}{{\rightarrow}}q_{i+1} by Condition 3. Thus there is in 𝒜\mathcal{A} a path iwti\stackrel{{\scriptstyle w}}{{\rightarrow}}t and consequently wL(𝒜)w\in L(\mathcal{A}). ∎

Given a language LAL\subset A^{*}, a nonempty residual u1Lu^{-1}L is called special if either

  1. 1.

    u=1u=1, or

  2. 2.

    uLu\in L, or

  3. 3.

    there are two v,wu1Lv,w\in u^{-1}L which begin by different letters.

The minimal compact automaton of a language LL, denoted 𝒜c(L)\mathcal{A}_{c}(L) is the following compact automaton. The set of states is the set of special residuals of LL. The initial state is LL and the terminal states are the u1Lu^{-1}L such that uu is in LL. The edges are the (p,v,q)(p,v,q) such that p=u1Lp=u^{-1}L, q=(uv)1Lq=(uv)^{-1}L and there is no factorization v=vv′′v=v^{\prime}v^{\prime\prime} with v,v′′v^{\prime},v^{\prime\prime} nonempty such that (uv)1L(uv^{\prime})^{-1}L is a special residual. By definition, 𝒜c(L)\mathcal{A}_{c}(L) is deterministic and all its states are special; in particular, all its special paths are edges.

Example 6.3

The compact automaton of Figure 2 is the minimal compact automaton of the language {aaa,aba}\{aaa,aba\}.

Example 6.4

The compact automaton of Figure 3 is deterministic. Its initial state is indicated by an incoming arrow, and all states are terminal.

aababacabacabababacabacabacabacaba
Figure 3: A deterministic compact automaton.

This automaton is the minimal compact automaton of the set of suffixes of the word abacaba=Pal(abc)abacaba=\operatorname{Pal}(abc).

Proposition 6.5

For every trim deterministic compact automaton 𝒜\mathcal{A}, there is a unique reduction from 𝒜\mathcal{A} onto the minimal compact automaton of L(𝒜)L(\mathcal{A}).

Proof.

Let 𝒜=(Q,i,T)\mathcal{A}=(Q,i,T) and L=L(𝒜)L=L(\mathcal{A}). Set 𝒜c(L)=(R,j,S)\mathcal{A}_{c}(L)=(R,j,S). We define a mapping φ:QsR\varphi\colon Q_{s}\to R as follows. First φ(i)=j\varphi(i)=j, so that Condition 1 in the definition of reduction is satisfied. Next, for pQsp\in Q_{s} let uu be such that iupi\stackrel{{\scriptstyle u}}{{\rightarrow}}p. We set φ(p)=u1L\varphi(p)=u^{-1}L. The map is well-defined because if iupi\stackrel{{\scriptstyle u}}{{\rightarrow}}p and iupi\stackrel{{\scriptstyle u^{\prime}}}{{\rightarrow}}p, then u1L=u1Lu^{-1}L=u^{\prime-1}L. If pp is in TT, then uu is in LL and thus φ(p)\varphi(p) is in SS; conversely, if φ(p)S\varphi(p)\in S, then uLu\in L, hence pTp\in T, and Condition 2 is satisfied.

If there are two edges pavqp\stackrel{{\scriptstyle av}}{{\rightarrow}}q and pavqp\stackrel{{\scriptstyle a^{\prime}v^{\prime}}}{{\rightarrow}}q^{\prime} in 𝒜\mathcal{A} with aaa\neq a^{\prime}, let w,ww,w^{\prime} be such that qwtq\stackrel{{\scriptstyle w}}{{\rightarrow}}t, qwtq^{\prime}\stackrel{{\scriptstyle w^{\prime}}}{{\rightarrow}}t^{\prime} and t,tTt,t^{\prime}\in T. Then avw,avwu1Lavw,a^{\prime}v^{\prime}w^{\prime}\in u^{-1}L and thus φ(p)\varphi(p) is a special residual. This shows that φ\varphi maps QsQ_{s} into RsR_{s}.

The mapping φ\varphi is surjective because for each u1Lu^{-1}L in RR, the state pQp\in Q such that iupi\stackrel{{\scriptstyle u}}{{\rightarrow}}p is special.

We verify Condition 3, that is, pwqp\stackrel{{\scriptstyle w}}{{\rightarrow}}q is a special path of 𝒜\mathcal{A} if and only if φ(p)wφ(q)\varphi(p)\stackrel{{\scriptstyle w}}{{\rightarrow}}\varphi(q) is an edge of 𝒜c(L)\mathcal{A}_{c}(L). Indeed, if pwqp\stackrel{{\scriptstyle w}}{{\rightarrow}}q is a special path, let iupi\stackrel{{\scriptstyle u}}{{\rightarrow}}p be a path in 𝒜\mathcal{A}. Then u1Lw(uw)1Lu^{-1}L\stackrel{{\scriptstyle w}}{{\rightarrow}}(uw)^{-1}L is an edge of 𝒜c\mathcal{A}_{c} since otherwise the path pwqp\stackrel{{\scriptstyle w}}{{\rightarrow}}q would not be special. Conversely, if φ(p)wφ(q)\varphi(p)\stackrel{{\scriptstyle w}}{{\rightarrow}}\varphi(q) is a special path of 𝒜c(L)\mathcal{A}_{c}(L), then it is an edge; let uu be such that there is a path iupi\stackrel{{\scriptstyle u}}{{\rightarrow}}p if 𝒜\mathcal{A}; then, by definition of edges in 𝒜c(L)\mathcal{A}_{c}(L), φ(p)=u1L\varphi(p)=u^{-1}L and φ(q)=(uw)1L\varphi(q)=(uw)^{-1}L; thus there is a path iuwqi\stackrel{{\scriptstyle uw}}{{\rightarrow}}q and finally, since the automaton is deterministic, a path pwqp\stackrel{{\scriptstyle w}}{{\rightarrow}}q; it must be special, since u1Lw(uw)1Lu^{-1}L\stackrel{{\scriptstyle w}}{{\rightarrow}}(uw)^{-1}L is an edge of 𝒜c\mathcal{A}_{c}.

Thus φ\varphi is a reduction for 𝒜\mathcal{A} onto 𝒜c(L)\mathcal{A}_{c}(L).

We prove now uniqueness: let ψ\psi be some reduction from 𝒜\mathcal{A} onto 𝒜c\mathcal{A}_{c}. Let q=iuq=i\cdot u (that is, the unique state qq such that there is a path from ii to qq with label uu). Then (Q,q,T)(Q,q,T) recognizes u1Lu^{-1}L. If ψ(q)=k\psi(q)=k, then it is easily verified that ψ\psi is a reduction from (Q,q,T)(Q,q,T) onto (R,k,S)(R,k,S). Hence by Proposition 6.2, these two automata recognize the same language. But, since the states of 𝒜c(u)\mathcal{A}_{c}(u) are distinct residuals, u1Lu^{-1}L is the unique state of 𝒜c\mathcal{A}_{c} such that (R,u1L,S)(R,u^{-1}L,S) recognizes u1Lu^{-1}L. Thus we must have k=u1Lk=u^{-1}L. ∎

Since all the states of the compact automaton 𝒜c(L)\mathcal{A}_{c}(L) are special, its number of states is at most the number of special states of any compact automaton 𝒜\mathcal{A} recognizing LL. We have therefore the following statement which justifies the name of minimal compact automaton for 𝒜c(L)\mathcal{A}_{c}(L).

Corollary 6.6

The compact automaton 𝒜c(L)\mathcal{A}_{c}(L) is, for every recognizable language LL, the unique compact automaton with the minimal number of states which recognizes LL.

Let 𝒜=(Q,i,T)\mathcal{A}=(Q,i,T) be a trim compact deterministic automaton. Let qQq\in Q be a non-special state. Let qvrq\stackrel{{\scriptstyle v}}{{\rightarrow}}r be the unique edge going out of qq. Then qrq\neq r: indeed, if q=rq=r, then qq is not co-accessible, since qq is not terminal (being not special), and there is no other outgoing edge than the loop qvqq\stackrel{{\scriptstyle v}}{{\rightarrow}}q; this contradicts that 𝒜\mathcal{A} is trim. Since ii is special, we have also qiq\neq i. Consider the compact automaton 𝒜=(Q{q},i,T)\mathcal{A}^{\prime}=(Q\setminus\{q\},i,T) with set of edges

  1. (i)

    the edges (p,w,r)(p,w,r) of 𝒜\mathcal{A} with p,rqp,r\neq q,

  2. (ii)

    the edges (p,uv,r)(p,uv,r) for every edge (p,u,q)(p,u,q) of 𝒜\mathcal{A}

The identity map of QsQ_{s} is a reduction from 𝒜\mathcal{A} onto 𝒜\mathcal{A}^{\prime}, called an elementary reduction.

pppp^{\prime}qqrr\rightarrowpppp^{\prime}rruuuu^{\prime}vvuvuvuvu^{\prime}v
Figure 4: An elementary reduction.
Proposition 6.7

The minimal compact automaton 𝒜c(L)\mathcal{A}_{c}(L) is obtained from 𝒜(L)\mathcal{A}(L) by a sequence of elementary reductions.

Proof.

Consider the deterministic compact automata recognizing LL, having the following property, denoted by (R): for each state qq, reachable from the initial state by a path labelled uu, define the (well defined) mapping φ𝒜\varphi_{\mathcal{A}} from the set of states into the set of residuals of LL by qu1Lq\mapsto u^{-1}L; then this mapping is injective.

The minimal automaton 𝒜(L)\mathcal{A}(L) has property (R)(R). We claim that property (R) is preserved by each elementary reduction. If an automaton has property (R) and has only special states, then it must be the minimal compact automaton. This proves the proposition.

We prove the claim. Let 𝒜\mathcal{A} and 𝒜\mathcal{A}^{\prime} be as above. Then, as is easily verified, one has φ𝒜=φ𝒜|(Q{q})\varphi_{\mathcal{A}^{\prime}}=\varphi_{\mathcal{A}}|(Q\setminus\{q\}), and more precisely, if pqp\neq q is reachable by uu from ii in 𝒜\mathcal{A}, then it is reachable from ii by uu in 𝒜\mathcal{A}^{\prime} and therefore φ𝒜(p)=u1L=φ𝒜(p)\varphi_{\mathcal{A}}(p)=u^{-1}L=\varphi_{\mathcal{A}^{\prime}}(p). ∎

Example 6.8

Let 𝒜\mathcal{A} be the deterministic automaton represented in Figure 1.

The special states are 0,1,3,70,1,3,7. There is a reduction from this automaton to the compact automaton of Figure 3.

Two elementary reductions, suppressing 5,65,6 give the automaton of Figure 7.

01122334477aabbaaccabaababbcccc
Figure 5: Suppresion of 5,65,6.

The suppression of 44 gives then the compact automaton of Figure 6.

011223377aabbaacabacababbcabacabacabacaba
Figure 6: Suppresion of 44.

Finally, the suppression of 22 gives the minimal compact automaton of Figure 3.

7 Direct construction of the compact suffix automaton of Pal(u)\operatorname{Pal}(u)

In Theorem 5.1, we have given some properties of the minimal automaton 𝒮(u)\mathcal{S}(u) of the set of suffixes of Pal(u)\operatorname{Pal}(u). By Proposition 6.7, we know how to transform this automaton into the minimal compact automaton of this set, which we denote 𝒮c(u)\mathcal{S}_{c}(u).

In the present section, we construct directly this compact automaton. One reason to proceed directly from uu to Sc(u)S_{c}(u) is that the number of states of 𝒮c(u)\mathcal{S}_{c}(u) is 1+1+ the length of uu (by Lemma 7.2 below), while the number of states of 𝒮(u)\mathcal{S}(u) (which is 1+1+ the length of Pal(u)Pal(u) by Theorem 5.1) can be exponential in |u||u| (for example, if u=(ab)nu=(ab)^{n}, the length of Pal(u)Pal(u) is F2n+32F_{2n+3}-2, which grows exponentially with nn).

Theorem 7.1

The automaton 𝒮c(u)\mathcal{S}_{c}(u) is completely characterized as follows: the states are the prefixes of uu, all terminal, and 11 is the initial state. For each factorization u=xyazu=xyaz, where aa is a letter and x,y,zx,y,z are words, with yy aa-free, there is a transition xxyax\to xya, labelled Pal(xy)1Pal(xya)\operatorname{Pal}(xy)^{-1}\operatorname{Pal}(xya).

Recall from Section 6 that the states of the automaton 𝒮c(u)\mathcal{S}_{c}(u) are the special residuals of LL, the set of suffixes of Pal(u)\operatorname{Pal}(u). By Theorem 5.1, the nonempty residuals of LL are the p1Lp^{-1}L where pp is a prefix of Pal(u)\operatorname{Pal}(u). Clearly, the mapping pp1Lp\mapsto p^{-1}L is a bijection from the set of prefixes of Pal(u)\operatorname{Pal}(u) onto the set of nonempty residuals of LL.

Lemma 7.2

The set of states 𝒮c(u)\mathcal{S}_{c}(u) is naturally in bijection with the set of palindromic prefixes of Pal(u)\operatorname{Pal}(u). This set is naturally in bijection with the set of prefixes of uu; with this identification, the initial state is 11 and all states are terminal.

Proof.

The second bijection maps a prefix pp of uu onto the prefix Pal(p)\operatorname{Pal}(p) of Pal(u)\operatorname{Pal}(u) (see [17] p. 209).

Let pp be a prefix of Pal(u)\operatorname{Pal}(u). According to the definition of special residuals in Section 6, p1Lp^{-1}L is special if and only if (i) either p=1p=1, or (ii) pLp\in L, or (iii) if there are two words in p1Lp^{-1}L beginning by different letters.

Let p1Lp^{-1}L be a special residual of LL. We show that in all three cases, pp is a palindromic prefix of Pal(u)\operatorname{Pal}(u).

In case (i), p=1p=1 which is clearly a palindromic prefix of Pal(u)\operatorname{Pal}(u). In case (ii), pp is a suffix of Pal(u)\operatorname{Pal}(u), and being also a prefix, it is a palindromic prefix of Pal(u)\operatorname{Pal}(u). In case (iii), let s,ts,t be the two words, with s=ass=as^{\prime}, t=btt=bt^{\prime}, for distinct letters a,ba,b; then pas,pbtpas^{\prime},pbt^{\prime} are in LL, hence pp is a right special factor of Pal(u)\operatorname{Pal}(u); then p~\tilde{p} is a left special factor of Pal(u)\operatorname{Pal}(u), thus by Corollary 5.3, p~\tilde{p} is a prefix of Pal(u)\operatorname{Pal}(u) and therefore pp is a palindromic prefix of Pal(u)\operatorname{Pal}(u).

Conversely, if pp is a palindromic prefix of Pal(u)\operatorname{Pal}(u), then pp is also a suffix of Pal(u)\operatorname{Pal}(u), hence pLp\in L and p1Lp^{-1}L is special residual of LL, by case (ii). ∎

We use another formula of Justin, see [17] p. 209. Let xAx\in A. If uu is xx-free then Pal(ux)=Pal(u)xPal(u)Pal(ux)=Pal(u)xPal(u). If on the other hand xx occurs in uu, write u=u1xu2u=u_{1}xu_{2} with u2u_{2} xx-free. Then

Pal(ux)=Pal(u)Pal(u1)1Pal(u).\operatorname{Pal}(ux)=\operatorname{Pal}(u)\operatorname{Pal}(u_{1})^{-1}\operatorname{Pal}(u). (7.1)

The recursive definition of 𝒮c(u)\mathcal{S}_{c}(u) is is explained in the following result.

Proposition 7.3

Let uA,xAu\in A^{*},x\in A. Define u=hu2u=hu_{2}, where u2u_{2} is the longest xx-free suffix of uu. The automaton 𝒮c(u)\mathcal{S}_{c}(u) having as set of states the set of prefixes of uu, as stated in Lemma 7.2, construct an automaton 𝒮\mathcal{S} as follows:

  • add to 𝒮c(u)\mathcal{S}_{c}(u) the new state uxux, which is terminal;

  • for each prefix pp of u2u_{2}, add an edge from the state hphp of 𝒮c(u)\mathcal{S}_{c}(u) to the new state uxux, labelled Pal(u)1Pal(ux)\operatorname{Pal}(u)^{-1}\operatorname{Pal}(ux).

    Then 𝒮=𝒮c(u)\mathcal{S}=\mathcal{S}_{c}(u).

Note that Pal(u)\operatorname{Pal}(u) is a prefix of Pal(ux)\operatorname{Pal}(ux), so that Pal(u)1Pal(ux)A\operatorname{Pal}(u)^{-1}\operatorname{Pal}(ux)\in A^{*}. Moreover, hphp is a prefix of uu, hence is a state of 𝒮c(x)\mathcal{S}_{c}(x).

11aaabababcabcabcaabcaabacabaabacabaabacabaabacabaabacabaabacaba
Figure 7: From 𝒮c(abc)\mathcal{S}_{c}(abc) to 𝒮c(abca)\mathcal{S}_{c}(abca)

Figure 7 illustrates the construction in Proposition 7.3: the construction from 𝒮c(abc)\mathcal{S}_{c}(abc) in Figure 3 to 𝒮c(abca)\mathcal{S}_{c}(abca); here, u=abcu=abc, h=a,u2=bch=a,u_{2}=bc and only the new edges are drawn. Note that Pal(abc)1Pal(abca)=abacaba\operatorname{Pal}(abc)^{-1}\operatorname{Pal}(abca)=abacaba.

Lemma 7.4

Each word recognized by the automaton 𝒮\mathcal{S} of Proposition 7.3 is a suffix of Pal(ux)Pal(ux).

Proof.

Let ww be a word recognized by 𝒮\mathcal{S}, that is the label of some path in 𝒮\mathcal{S}. If this path does not end at uxux, then by the construction, it is a path in 𝒮c(u)\mathcal{S}_{c}(u); hence ww is a suffix of Pal(u)\operatorname{Pal}(u), and since Pal(u)\operatorname{Pal}(u) is a suffix of Pal(ux)\operatorname{Pal}(ux) (the former is a prefix of the latter, and both words are palindromes), ww is a suffix of Pal(ux)\operatorname{Pal}(ux). If this path ends at uxux, then its last edge is one of the new edges; hence w=sPal(u)1Pal(ux)w=s\operatorname{Pal}(u)^{-1}\operatorname{Pal}(ux), where ss is a suffix of Pal(u)\operatorname{Pal}(u). Suppose first that uu is xx-free; then by Justin’s result recalled above, Pal(ux)=Pal(u)xPal(u)\operatorname{Pal}(ux)=\operatorname{Pal}(u)x\operatorname{Pal}(u) and w=sxPal(u)w=sx\operatorname{Pal}(u) is a suffix of Pal(u)xPal(u)=Pal(ux)\operatorname{Pal}(u)x\operatorname{Pal}(u)=Pal(ux). Suppose now that uu is not xx-free; then u=u1xu2u=u_{1}xu_{2}, u2u_{2} is xx-free and Pal(ux)=Pal(u)Pal(u1)1Pal(u)\operatorname{Pal}(ux)=\operatorname{Pal}(u)\operatorname{Pal}(u_{1})^{-1}\operatorname{Pal}(u); moreover, w=sPal(u)1Pal(u)Pal(u1)1Pal(u)=sPal(u1)1Pal(u)w=s\operatorname{Pal}(u)^{-1}\operatorname{Pal}(u)\operatorname{Pal}(u_{1})^{-1}\operatorname{Pal}(u)=s\operatorname{Pal}(u_{1})^{-1}\operatorname{Pal}(u); since ss is a suffix of Pal(u)\operatorname{Pal}(u) and since Pal(u1)1Pal(u)\operatorname{Pal}(u_{1})^{-1}\operatorname{Pal}(u) is in AA^{*}, we see that ww is a suffix of Pal(u)Pal(u1)1Pal(u)=Pal(ux)\operatorname{Pal}(u)\operatorname{Pal}(u_{1})^{-1}\operatorname{Pal}(u)=\operatorname{Pal}(ux).

Lemma 7.5

For every prefix of pp of uu, 𝒮c(p)\mathcal{S}_{c}(p) is obtained from 𝒮c(u)\mathcal{S}_{c}(u) by keeping in the latter only the states which are prefixes of pp.

The number of paths in 𝒮c(u)\mathcal{S}_{c}(u) from the initial state to the state uu is equal to |Pal(u)||Pal(u)||\operatorname{Pal}(u)|-|\operatorname{Pal}(u^{-})| if uu is nonempty (where xx^{-} denotes the word xx with the last letter removed), and it is 11 if uu is empty.

Proof.

We prove the lemma by induction on the length of uu. For u=1u=1, it is immediate.

Suppose now that uAu\in A^{*} and xAx\in A. We prove the assertions for uxux, admitting them for shorter words.

Take the notation 𝒮\mathcal{S} and u=hu2u=hu_{2} in Proposition 7.3. We know by Lemma 7.4 that each word recognized by 𝒮\mathcal{S} is recognized by 𝒮c(ux)\mathcal{S}_{c}(ux). We prove now the converse, by a counting argument, using the induction hypothesis. Let nwn_{w} denote the number of words recognized by 𝒮c(w)\mathcal{S}_{c}(w). Then nwn_{w} is equal to the number of suffixes of Pal(w)\operatorname{Pal}(w), hence is equal to to 1+1+ the length of Pal(w)\operatorname{Pal}(w). Hence nuxnu=|Pal(ux)||Pal(u)|n_{ux}-n_{u}=|\operatorname{Pal}(ux)|-|\operatorname{Pal}(u)|. Now let nn be the number of words recognized by 𝒮\mathcal{S}. By construction of the automaton 𝒮\mathcal{S}, each word recognized by it, and not recognized by 𝒮c(u)\mathcal{S}_{c}(u), is of the form w=sPal(u)1Pal(ux)w=s\operatorname{Pal}(u)^{-1}\operatorname{Pal}(ux), where ss is the label of some path in 𝒮c(u)\mathcal{S}_{c}(u) which starts at 11 and ends at hphp for some prefix pp of u2u_{2}. By induction, the number of such words ss is equal to |Pal(hp)||Pal((hp))||\operatorname{Pal}(hp)|-|\operatorname{Pal}((hp)^{-})| if hphp is nonempty, and 1 if hphp is empty. Since the corresponding sum is telescoping, it follows that the number nnun-n_{u} of possible words ss is equal to |Pal(u)||Pal(h)||\operatorname{Pal}(u)|-|\operatorname{Pal}(h^{-})| if hh is nonempty, and to 1+|Pal(u)|1+|\operatorname{Pal}(u)| if hh is empty. If uu is not xx-free, then h=u1xh=u_{1}x is nonempty, h=u1h^{-}=u_{1} and Pal(ux)=Pal(u)Pal(u1)1Pal(u)\operatorname{Pal}(ux)=\operatorname{Pal}(u)\operatorname{Pal}(u_{1})^{-1}\operatorname{Pal}(u), so that nuxnu=|Pal(ux)||Pal(u)|=|Pal(u1)1Pal(u)|=|Pal(u)||u1|=|Pal(u)||h|=nnun_{ux}-n_{u}=|\operatorname{Pal}(ux)|-|\operatorname{Pal}(u)|=|\operatorname{Pal}(u_{1})^{-1}\operatorname{Pal}(u)|=|\operatorname{Pal}(u)|-|u_{1}|=|\operatorname{Pal}(u)|-|h^{-}|=n-n_{u}. If uu is xx-free, then h=1h=1, Pal(ux)=Pal(u)xPal(u)\operatorname{Pal}(ux)=\operatorname{Pal}(u)x\operatorname{Pal}(u), and nuxnu=|Pal(ux)||Pal(u)|=|xPal(u)|=1+|Pal(u)|=nnun_{ux}-n_{u}=|\operatorname{Pal}(ux)|-|\operatorname{Pal}(u)|=|x\operatorname{Pal}(u)|=1+|\operatorname{Pal}(u)|=n-n_{u}. Thus in both cases, nux=nn_{ux}=n, which implies that 𝒮c(ux)\mathcal{S}_{c}(ux) and 𝒮\mathcal{S} both recognize the language of suffixes of Pal(ux)\operatorname{Pal}(ux); since both automata have the same number of states and the first is minimal, they are isomorphic; but since both have a unique longest path of the same length, they are equal.

The two assertions of the lemma now clearly follow for uxux. ∎

Proof of Proposition 7.3. It follows from the proof of Lemma 7.5.         

Proof Theorem 7.1. The theorem follows by a straightforward induction from Proposition 7.3.         

Theorem 7.1 has the following corollary, which could also be proved using (iv) in Section 5 and Proposition 6.7.

Corollary 7.6

In the graph 𝒮c(u)\mathcal{S}_{c}(u), the label of an edge depends only on the final state of the edge.

Proof.

Indeed, the label of each transition vwv\to w is Pal(w)1Pal(w)\operatorname{Pal}(w^{-})^{-1}\operatorname{Pal}(w). ∎

Corollary 7.7

Let uAu\in A^{*}, of length nn, and for any aAa\in A, denote by pa(u)p_{a}(u) the position of the rightmost occurrence of aa in the word uu, with pa(u)=0p_{a}(u)=0 when uu is aa-free. Then the number of states in 𝒮c(u)\mathcal{S}_{c}(u) is n+1n+1 and the number of transitions is aApa(u)\sum_{a\in A}p_{a}(u).

Proof.

By Theorem 7.1, the number of states is the number of prefixes of uu, thus it is n+1n+1. The number tt of transitions is equal to t=aAtat=\sum_{a\in A}t_{a}, where tat_{a} is the number of factorizations u=xyazu=xyaz, x,y,zAx,y,z\in A^{*}, yy aa-free. We have ta=0t_{a}=0 if uu is aa-free.

Suppose that uu contains aa. Denote Ia=AaAI_{a}=A^{*}aA^{*}, the set of words containing letter aa; clearly, each word vv in IaI_{a} has a unique factorization v=yazv=yaz, y,zAy,z\in A^{*}, yy aa-free. Hence tat_{a} is equal to the number of factorizations u=xvu=xv,

()xA,vIa.(*)\,\,x\in A^{*},v\in I_{a}.

If we factorize u=u1au2u=u_{1}au_{2}, where u2u_{2} is the longest aa-free suffix of uu, then |u1|+1=pa(u)|u_{1}|+1=p_{a}(u), and a factorization u=xvu=xv satisfies ()(*) if and only if xx is a prefix of u1u_{1}. Hence the number of such factorizations is |u1|+1=pa(u)|u_{1}|+1=p_{a}(u). ∎

For a binary alphabet, Theorem 7.1 was obtained, in an another but equivalent form, by Epifanio, Mignosi, Shallit and Venturini [13] (see also [7], especially Figure 1). Similarly for Corollary 7.7, see [13] Proposition 1.

8 Further comments

Following [13], we obtain for any word uu on any alphabet, a directed graph that counts from 0 to n=|Pal(u)|n=|\operatorname{Pal}(u)| in the following sense: replace in 𝒮c(u)\mathcal{S}_{c}(u) each label by its length; then one obtains a directed graph, with the initial vertex 11, such that for each k=0,,nk=0,\ldots,n, there is a unique path, starting from the initial vertex, whose label is kk (here, labels of paths here additive). This follows clearly since there is a unique suffix of Pal(u)\operatorname{Pal}(u) of each length k=0,,nk=0,\ldots,n. For uu on a binary alphabet, Epifanio et al. call this graph a Sturmian graph.

As open problem, we mention that Theorem 7.3 has certainly an interpretation as a factorization result: each suffix of Pal(u)\operatorname{Pal}(u) as a certain factorization as a product of the words which are the label of the edges of the automaton; these words are all of the form Pal(p)Pal(p)\operatorname{Pal}(p^{-})\operatorname{Pal}(p), pp a proper prefix of uu. In the binary alphabet case, this is known: the factorization is related to the Ostrowski lazy factorization (see [12]), and to the factorization theorem of Anna Frid [15] Corollary 1, which following [7] Section 9, may be stated as follows: for uu on a binary alphabet, of length nn, each suffix ss of Pal(u)\operatorname{Pal}(u), of length \ell, has a unique factorization 1inL0d1L1d2Ln1dn\prod_{1\leq i\leq n}L_{0}^{d_{1}}L_{1}^{d_{2}}\cdots L_{n-1}^{d_{n}}, where Li=Pal(pi)1Pal(pi+1)L_{i}=\operatorname{Pal}(p_{i}^{-})^{-1}\operatorname{Pal}(p_{i+1}), pip_{i} the prefix of length ii of uu, and where =1indiqi1\ell=\sum_{1\leq i\leq n}d_{i}q_{i-1} is the lazy Ostrowski representation of \ell, with qj=|Lj|q_{j}=|L_{j}|.

References

  • [1] Jorge Almeida. Profinite semigroups and applications. In Structural theory of automata, semigroups, and universal algebra, volume 207 of NATO Sci. Ser. II Math. Phys. Chem., pages 1–45. Springer, Dordrecht, 2005. Notes taken by Alfredo Costa.
  • [2] Jorge Almeida, Afredo Costa, Revekka Kyriakoglou, and Dominique Perrin. Profinite Semigroups and Symbolic Dynamics. Springer verlag, 2020.
  • [3] Pierre Arnoux and Gérard Rauzy. Représentation géométrique de suites de complexité 2n+12n+1. Bull. Soc. Math. France, 119(2):199–215, 1991.
  • [4] A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M. T. Chen, and J. Seiferas. The smallest automaton recognizing the subwords of a text. Theoret. Comput. Sci., 40(1):31–55, 1985. Special issue: Eleventh international colloquium on automata, languages and programming (Antwerp, 1984).
  • [5] Anselm Blumer, J. Blumer, David Haussler, Ross M. McConnell, and Andrzej Ehrenfeucht. Complete inverted files for efficient text retrieval and analysis. J. ACM, 34(3):578–595, 1987.
  • [6] Anselm Blumer, Andrzej Ehrenfeucht, and David Haussler. Average sizes of suffix trees and dawgs. Discret. Appl. Math., 24(1-3):37–45, 1989.
  • [7] Y. Bugeaud and C. Reutenauer. On the conjugates of Christoffel words, 2022.
  • [8] Maxime Crochemore and Renaud Vérin. Direct construction of compact directed acyclic word graphs. In Alberto Apostolico and Jotun Hein, editors, Combinatorial Pattern Matching, 8th Annual Symposium, CPM 97, Aarhus, Denmark, June 30 - July 2, 1997, Proceedings, volume 1264 of Lecture Notes in Computer Science, pages 116–129. Springer, 1997.
  • [9] Aldo de Luca. Sturmian words: structure, combinatorics, and their arithmetics. Theoret. Comput. Sci., 183(1):45 – 82, 1997.
  • [10] Xavier Droubay, Jacques Justin, and Giuseppe Pirillo. Episturmian words and some constructions of de Luca and Rauzy. Theoret. Comput. Sci., 255(1-2):539–553, 2001.
  • [11] Fabien Durand and Dominique Perrin. Dimension Groups and Dynamical Systems. Cambridge University Press, 2022.
  • [12] Chiara Epifanio, Christiane Frougny, Alessandra Gabriele, Filippo Mignosi, and Jeffrey O. Shallit. Sturmian graphs and integer representations over numeration systems. Discret. Appl. Math., 160(4-5):536–547, 2012.
  • [13] Chiara Epifanio, Filippo Mignosi, Jeffrey O. Shallit, and Ilaria Venturini. On sturmian graphs. Discret. Appl. Math., 155(8):1014–1030, 2007.
  • [14] Gabriele Fici. Special factors and the combinatorics of suffix and factor automata. Theoret. Comput. Sci., 412(29):3604–3615, 2011.
  • [15] Anna E. Frid. Sturmian numeration systems and decompositions to palindromes. European J. Combin., 71:202–212, 2018.
  • [16] Marshall Hall, Jr. A topology for free groups and related groups. Ann. of Math. (2), 52:127–139, 1950.
  • [17] Jacques Justin. Episturmian morphisms and a Galois theorem on continued fractions. ITA, 39(1):207–215, 2005.
  • [18] Christian Kassel and Christophe Reutenauer. A palindromization map for the free group. Theor. Comput. Sci., 409(3):461–470, 2008.
  • [19] M. Lothaire. Applied Combinatorics on Words. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2005.
  • [20] Gérard Rauzy. Mots infinis en arithmétique. In Maurice Nivat and Dominique Perrin, editors, Automata on Infinite Words, volume 192 of Lecture Notes in Computer Science, pages 165–171. Springer-verlag, 1984.
  • [21] Christophe Reutenauer. From Christoffel Words to Markoff Numbers. Oxford University Press, 2019.
  • [22] Luis Ribes and Pavel Zalesskii. Profinite groups, volume 40 of Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics]. Springer-Verlag, Berlin, second edition, 2010.
  • [23] Gwenaël Richomme. A characterization of infinite LSP words. In Developments in language theory, volume 10396 of Lecture Notes in Comput. Sci., pages 320–331. Springer, Cham, 2017.
  • [24] Marinella Sciortino and Luca Q. Zamboni. Suffix automata and standard sturmian words. In Tero Harju, Juhani Karhumäki, and Arto Lepistö, editors, Developments in Language Theory, 11th International Conference, DLT 2007, Turku, Finland, July 3-6, 2007, Proceedings, volume 4588 of Lecture Notes in Computer Science, pages 382–398. Springer, 2007.