The palindromization map
Abstract
The palindromization map has been defined initially by Aldo de Luca in the context of Sturmian words. It was extended to the free group of rank by Kassel and the second author. We extend their construction to arbitrary alphabets. We also investigate the suffix automaton and compact suffix automaton of the words obtained by palindromization.
1 Introduction
The iterated palindromic closure is an injective map mapping arbitrary words to palindromes. It has been introduced by Aldo de Luca in [9]. This map is used to define a representation of Sturmian words by means of a directive word and is related to a transformation introduced by Rauzy (see [20, 3]). The iterated palindromic closure has been shown in [18] to be extendable, in the case of two letters, to a map (not anymore injective) from the free group into itself and to have many interesting properties.
In this article, we study the extension of the iterated palindrome closure to a map from the free group on more than two letters to itself. We show that some of the features appearing with two letters remain valid while some others do not hold anymore. In particular, we show that the map is continuous for the profinite topology (Proposition 4.1). We were not able to characterise the kernel of the map as it is done in [18], where it is related to the braid group. We also discuss the relation with noncommutative cohomology evidenced in [18] but we show that, on more than two letters, the cocycle corresponding to the iterated palindromization map is not trivial.
In Section 5, we describe the suffix automaton of a word of the form , that is the minimal automaton of the set of suffixes of this word. We extend to arbitrary alphabets results concerning the suffix automaton; the corresponding results for binary alphabets are from [13].
In Section 6, we develop study compact automata. These automata have already been studied in the case of suffix automata (see the chapter by Maxime Crochemore in [19]), but they do not seem to have been considered before in the general case of automata. We fill this gap and present a direct definition of a minimal compact automaton, which is shown to be unique (Corollary 6.6), together with two other results related to the notion of reduction of automaton (Propositions 6.5 and 6.7).
This will apply in Section 7 to the construction of the minimal compact suffix automaton of . In that section, we construct directly the minimal compact automaton of the set of suffixes of . The construction extends the known construction in the binary alphabet case due to Epifanio, Mignosi, Shallit and Venturini [13] (see also [7]). It consists in computing the automaton for from the automaton of , by adjoining one state, and several transitions from the first automaton to this state. From this, we deduce the exact form of the automaton (Theorem 7.1) and several of its properties (Corollaries 7.6 and 7.7)
Acknowledgements
We thank Maxime Crochemore for his advice about compact automata.
2 The palindromization map
We denote by the free monoid on the alphabet and by the empty word. We denote by the reversal of the word with . The word is a palindrome if .
We denote by the free group on . If is in and in , we denote by the number of occurrences of in , where one counts with -1 the occurrences of ; this is well defined and does not depend on the chosen expression for ; we call it the -degree of . Moreover, define , the algebraic length of . In particular, if , then is the length of .
The reversal of the element with is the element . This does not depend on the chosen expression for . The map is an antimorphism, that is, it satisfies . We also say that an element of is a palindrome if .
Let us define two morphisms and from into its group of automorphisms as follows. For , we set
The map is an automorphism of since its inverse is the map
Symmetrically, for , we set .
Thus, for example, and .
Note that are related by two identities. The first one is
(2.1) |
for every and . Indeed, this is true when is a letter; by rewriting equivalently , this equality follows since both sides are the image of under an automorphism of .
The second one is
(2.2) |
for every . Indeed, this is true when is a letter and the general case follows similarly.
Every word is a prefix of some palindrome since is always a palindrome. Thus, there exists a palindrome of shortest length which has as a prefix. Actually, this palindrome is unique. It is called the palindromic closure of and denoted . One has where is the longest palindrome suffix of (for these results, due to Aldo de Luca, see for example [21] Proposition 12.1.1).
For example, , since the longest palindromic suffix of is , and .
Let be the unique map from to such that , and for and
For a word , is called the iterated palindromic closure of . The iterated palindromic closure has been introduced by Aldo de Luca [9] who has shown that it is injective (see for example [21] p. 102, where the injectivity is proved by an algorithm).
For example , since , , and finally .
The mapping may be extended to infinite words, since when is a proper prefix of , is a proper prefix of .
An important property of is the following functional equation, known as Justin’s Formula [17]. For all ,
(2.3) |
A dual form of (2.3) (which is the one actually given in [17, Lemma 2.1]) is
(2.4) |
Indeed, assuming (2.4), we have using (2.2),
(2.5) | |||||
(2.6) |
Hence (2.4) implies (2.3). The reverse implication is true, too, as is similarly verified.
We want to prove the following result, extending the construction of [18] to arbitrary alphabets.
Theorem 2.1
There exists a unique extension of to a map from to itself fixing every and satisfying (2.3) for every .
We will still denote by the extension of the iterated palindromic closure to the free group and call it the palindromization map. We will see below that the extension also satisfies (2.4), for any .
The statement follows directly from the following property.
Proposition 2.2
Let be a morphism from the free group on to the group of its automorphisms, such that for every . There exists a unique map such that
-
(i)
for every .
-
(ii)
(2.7) for every .
Proof.
We prove first uniqueness of . Equation (2.7) with implies that .
Next, the same equation implies
for each and each . Thus there is a unique such that for each reduced word with one has . Indeed, this is true when and follows easily using induction on the length of . Thus there is at most one map satisfying conditions (i) and (ii).
To prove the existence, let us prove that, for this map , Equation (2.7) holds for every by induction on the length of the reduced word . It holds for . Next, let be of positive length. Set with , in reduced form. Then . Since , the induction hypothesis implies
Applying again the induction hypothesis, we obtain
- Assume first that is reduced. Then, by definition of , we have . Thus
Thus we obtain
- Assume now that is not reduced; set in reduced form. Then, by definition of , we have .
Thus, since , we have
which by induction hypothesis is equal to . ∎
We now verify the following property of the palindromization map.
Proposition 2.3
For every , is a palindrome.
Proof.
For every and , we have
(2.8) |
Indeed, by (2.1) and (2.2). It follows from (2.8) that for every and every palindrome , is a palindrome.
Let us now show by induction on the length of the reduced word representing that is a palindrome. It is true if . Next, set in reduced form with and . We have by (2.3), . By induction hypothesis, is a palindrome and it follows from the previous observation that is a palindrome, too. ∎
It follows from Proposition 2.3, using the same argument as in (2.6), that the map satisfies also (2.4) for every .
As an example, we have and . This shows that the extension of to is not injective. In the case of a binary alphabet, one can characterize the kernel of as follows.
Let be the braid group on three strands defined as
Let be the morphism . For any , one has if and only if (see [18, Proposition 5.2]). No such characterization is known on more than two letters.
3 Semidirect products, cocycles and sequential functions
We discuss now several interpretations of Justin’s Formula.
Semidirect products
Observe first that Equation (2.7) (and thus also Equation (2.3)) can be expressed in terms of semidirect products. Indeed, consider the semidirect product of with itself corresponding to the morphism from into . By definition, it is the set of pairs with the product
Equation (2.7) expresses the fact that is a morphism from to . Indeed, assuming (2.7), we have for every ,
This proves the following statement.
Proposition 3.1
The map is the unique morphism from to sending every to .
Cocycles
Justin’s Formula is also related to the notion of nonabelian group cohomology, as pointed out in [18]. A function , from to itself, is a 1-cocycle, with respect to a group morphism from to , if (2.7) holds for all . Thus , as a function from to itself, is a 1-cocycle with respect to the morphism . Such a 1-cocycle is trivial if there is an element such that
(3.1) |
for every . When has two elements, one has by [18, Equation (3.1)]. Thus the 1-cocycle is trivial. This is not the case on more than two letters. Indeed, suppose that is such that for all . One has then for every and thus, by taking the -degree: . This implies, by summing over all , , which is impossible for since is an integer.
Sequential functions
Equation (2.3) can also be seen as expressing that, as a function from to , the map is a sequential function, that is, a function computed by a sequential transducer. Let us recall the definition of a sequential transducer on a set of states. Let
be a map, called the transition function. This map extends to a right action of on by for and . In addition, let
be a map called the output function. This map extends to a map from to by
Given a sequential transducer on defined by the maps and and given an initial state , the function defined by the transducer is
Proposition 3.2
The function is defined by the transducer on the set of states with transition and output functions
respectively, and with initial state .
Proof.
We prove by induction on the length of that . It is true for and next, assuming that , we obtain for every ,
∎
4 Uniform continuity of for the profinite distance
The profinite topology on the free group is the topology generated by the inverse images of subsets of a finite group by a morphism . Equivalently, it is the coarsest topology such that every morphism to a finite discrete group is continuous. This topology on the free group was introduced by Hall (see [16]). It is a particular case of a more general notion which extends to varieties of groups and also of semigroups (see [22] and [1]).
The following is proved in [18] for the case of a binary alphabet.
Proposition 4.1
The map is continuous for the profinite topology.
Actually we shall prove a slightly more general result. Following [1] p. 57, we define the profinite distance on by the formula
where are distinct, and where is the minimal cardinality of a group such that, for some morphism , . Since is residually finite, is a finite integer for each pair of distinct elements in . It is easy to prove that , hence
which means that is an ultrametric distance.
The topology of induced by is precisely the profinite topology.
Proposition 4.1 follows from the next result, since each uniformly continuous function is continuous.
Proposition 4.2
The map is uniformly continuous for the profinite distance.
We need in the proof the following characterizations of uniformly continuous functions.
1. A function is uniformly continuous if and only if for any morphism , the function is uniformly continuous, where has the discrete distance.
2. A function , with a finite group with discrete distance, is uniformly continuous if and only if it factorizes as , where is a morphism into a finite group, and is some function.
Proof.
We apply the first criterion above to the function . Thus let be a group morphism into a finite group .
We define a right action of on the finite set by
(4.1) |
It is indeed a right action, since the associativity follows from
It follows from Equation (4.1) that for every , one has
In order to apply the second criterion, we define now a morphism , where is the symmetric group on : for any in , is the permutation . Then is a morphism, because of the right action defined above, letting act on the right on . Note that is a finite group, since is finite.
Define a function ; it sends each element onto the first component of the pair . Thus we have first component of , which is equal to ; thus , which allows to conclude, according to the second criterion, that is uniformly continuous. With the first criterion, we see that is uniformly continuous. ∎
Since is dense in for the profinite topology (see [1] or [2] for example), we deduce from Proposition 4.1 the following statement.
Corollary 4.3
The map is the unique continuous extension to of the iterated palindromic closure of .
Though is continuous for the profinite topology, it is not continuous for the pro- topology on the free group of rank , where is a prime number. Recall that the pro- topology is the coarsest topology such that every group homomorphism from into a finite -group is continuous (see [18, Remark 6.3]).
5 Suffix automaton
The minimal automaton of the set of suffixes of a word is called the suffix automaton of . These automata has been extensively studied (see [19, Chapter 2]) A striking property (originally due to [4]) is that its number of states is at most .
Let be a word over an arbitrary alphabet . We denote by the suffix automaton of .
Part (i) of the following result is not new, but we give a proof for sake of completeness.
Theorem 5.1
The automaton has the following properties:
-
(i)
It has states, which may be naturally identified with the prefixes of .
-
(ii)
Its terminal states are the palindromic prefixes of .
This result, for a binary alphabet, is due to [13] ((i) in the binary case also follows from Theorem 1 in [24], which characterizes the binary words whose suffix automaton has states); see also [13], [12], [7]. Moreover, for general alphabets, Part (i) of the theorem is a consequence of the remark in Section 5 in Fici’s article [14]. Note that that characterizations of the words such that the suffix automaton of has states have been given by Fici [14] and Richomme [23].
A factor of a word (resp. an infinite word ) is left-special if there are at least two letters such that is a factor of (resp. ).
The following is from [10, Proposition 5] (see also [11, Proposition 1.5.11]). Recall that for any infinite word , the infinite word is well-defined.
Proposition 5.2
If for some infinite word , the left-special factors of are the prefixes of .
We will use the following consequence.
Corollary 5.3
For any (finite) word , the left special factors of are prefixes of .
Proof.
Set . Let be a left-special factor of . Since is a prefix of , the word is also left-special with respect to . By Proposition 5.2, is a prefix of and thus of . ∎
Note that not all prefixes of need to be left-special. For example, if , the factor is not a left-special factor of .
Given a language , a residual of is set of the form . It is well-known that the minimal automaton of a language , denoted , has the set of nonempty residuals of as set of states.
Proof of Theorem 5.1. Let be the initial state of . Let be the set of prefixes of . The map is injective. Indeed, let be such that . Assuming that , let be such that . Then and thus . Since the language recognized by is finite, the graph of is acyclic, which forces . Thus .
Let us show now that is surjective. Let be a state of the automaton . Let be a word such that . Since the automaton is co-accessible, there is some word such that is a terminal state, and thus is a suffix of . Hence the word is a factor of . Let be the shortest prefix of such that is a prefix of . Let us show by induction on the length of that for every suffix of , one has . It is true if . Otherwise, set . We have by induction hypothesis.
Let be an arbitrary word such that belongs to the set of suffixes of . Since , and since the automaton is the minimal automaton of , so that , and , we have . We cannot have , since otherwise is a prefix of with shorter than .
Thus there is some such that . Since is a factor of and since, by Corollary 5.3, is not left-special (because is not a prefix of for the same reason as above), we have . This shows that . The converse is also true and thus that , that is , since the automaton is minimal.
We conclude that . This shows that is surjective and proves property (i). Next, if is a prefix of which is also a suffix, then is a palindrome and thus property (ii) is true.
Note that the automaton has the additional property
-
(iii)
The label of an edge depends only on its end.
Actually, this property holds for any suffix automaton, as is well-known. Indeed, if are two states of the suffix automaton of a word such that , let be such that and with a terminal state. Then and are suffixes of , which implies .
Example 5.4
Consider . We have and the automaton is represented in Figure 1.
6 Compact automata
We explore the notion of compact automaton in which the edges can be labeled by nonempty words instead of letters. This version of automata appears, in the case of compact suffix automata, in [6] or [8]. It is also presented in the chapter by Maxime Crochemore in [19]. In particular, the construction of a minimal compact suffix automaton is described (see also [5]). We will show here that it is possible to define in complete generality a minimal compact automaton for every language.
A compact automaton is given by a set of states , a set of edges , a set initial states and a set of terminal states . A path is a sequence of consecutive edges. Its label is . The language recognized by , denoted , is the set of labels od successful paths, that is paths from to .
An ordinary automaton is clearly a particular case of a compact automaton.
A compact automaton is deterministic if and if for every state , the labels of the edges starting at begin with distinct letters.
Again, an ordinary deterministic automaton is deterministic as a compact automaton.
Example 6.1
The compact automaton of Figure 2 is deterministic. Its initial state (indicated with an incoming arrow) is and (with a double circle) is the unique terminal state.
The set of special states of a compact automaton is the set of states which either belong to or such that there are edges going out of with labels beginning with distinct letters.
Let be special states. A path is special if the only special states on the path are its origin and its end.
A reduction from a deterministic compact automaton onto a deterministic compact automaton is a map from onto such that
-
1.
,
-
2.
if and only if ,
-
3.
for every , there is a special path in if and only there is a special path in .
An automaton is trim if every state is on some successful path.
Proposition 6.2
If are deterministic compact automata and if is a reduction, then .
Proof.
If is in , there is a path with . Let be the factorisation of such that the path has the form with each path being special and where . Since is a reduction, there is for each with , a special path . Thus there is in a path , which implies that is in .
Conversely, let . Then there exists a path with . We may decompose this path as , where each path is special. By surjectivity of , we have , by Condition 1 in the definition of reduction, and by Condition 2. Next, there is in a special path by Condition 3. Thus there is in a path and consequently . ∎
Given a language , a nonempty residual is called special if either
-
1.
, or
-
2.
, or
-
3.
there are two which begin by different letters.
The minimal compact automaton of a language , denoted is the following compact automaton. The set of states is the set of special residuals of . The initial state is and the terminal states are the such that is in . The edges are the such that , and there is no factorization with nonempty such that is a special residual. By definition, is deterministic and all its states are special; in particular, all its special paths are edges.
Example 6.3
The compact automaton of Figure 2 is the minimal compact automaton of the language .
Example 6.4
The compact automaton of Figure 3 is deterministic. Its initial state is indicated by an incoming arrow, and all states are terminal.
This automaton is the minimal compact automaton of the set of suffixes of the word .
Proposition 6.5
For every trim deterministic compact automaton , there is a unique reduction from onto the minimal compact automaton of .
Proof.
Let and . Set . We define a mapping as follows. First , so that Condition 1 in the definition of reduction is satisfied. Next, for let be such that . We set . The map is well-defined because if and , then . If is in , then is in and thus is in ; conversely, if , then , hence , and Condition 2 is satisfied.
If there are two edges and in with , let be such that , and . Then and thus is a special residual. This shows that maps into .
The mapping is surjective because for each in , the state such that is special.
We verify Condition 3, that is, is a special path of if and only if is an edge of . Indeed, if is a special path, let be a path in . Then is an edge of since otherwise the path would not be special. Conversely, if is a special path of , then it is an edge; let be such that there is a path if ; then, by definition of edges in , and ; thus there is a path and finally, since the automaton is deterministic, a path ; it must be special, since is an edge of .
Thus is a reduction for onto .
We prove now uniqueness: let be some reduction from onto . Let (that is, the unique state such that there is a path from to with label ). Then recognizes . If , then it is easily verified that is a reduction from onto . Hence by Proposition 6.2, these two automata recognize the same language. But, since the states of are distinct residuals, is the unique state of such that recognizes . Thus we must have . ∎
Since all the states of the compact automaton are special, its number of states is at most the number of special states of any compact automaton recognizing . We have therefore the following statement which justifies the name of minimal compact automaton for .
Corollary 6.6
The compact automaton is, for every recognizable language , the unique compact automaton with the minimal number of states which recognizes .
Let be a trim compact deterministic automaton. Let be a non-special state. Let be the unique edge going out of . Then : indeed, if , then is not co-accessible, since is not terminal (being not special), and there is no other outgoing edge than the loop ; this contradicts that is trim. Since is special, we have also . Consider the compact automaton with set of edges
-
(i)
the edges of with ,
-
(ii)
the edges for every edge of
The identity map of is a reduction from onto , called an elementary reduction.
Proposition 6.7
The minimal compact automaton is obtained from by a sequence of elementary reductions.
Proof.
Consider the deterministic compact automata recognizing , having the following property, denoted by (R): for each state , reachable from the initial state by a path labelled , define the (well defined) mapping from the set of states into the set of residuals of by ; then this mapping is injective.
The minimal automaton has property . We claim that property (R) is preserved by each elementary reduction. If an automaton has property (R) and has only special states, then it must be the minimal compact automaton. This proves the proposition.
We prove the claim. Let and be as above. Then, as is easily verified, one has , and more precisely, if is reachable by from in , then it is reachable from by in and therefore . ∎
Example 6.8
Let be the deterministic automaton represented in Figure 1.
The special states are . There is a reduction from this automaton to the compact automaton of Figure 3.
Two elementary reductions, suppressing give the automaton of Figure 7.
The suppression of gives then the compact automaton of Figure 6.
Finally, the suppression of gives the minimal compact automaton of Figure 3.
7 Direct construction of the compact suffix automaton of
In Theorem 5.1, we have given some properties of the minimal automaton of the set of suffixes of . By Proposition 6.7, we know how to transform this automaton into the minimal compact automaton of this set, which we denote .
In the present section, we construct directly this compact automaton. One reason to proceed directly from to is that the number of states of is the length of (by Lemma 7.2 below), while the number of states of (which is the length of by Theorem 5.1) can be exponential in (for example, if , the length of is , which grows exponentially with ).
Theorem 7.1
The automaton is completely characterized as follows: the states are the prefixes of , all terminal, and is the initial state. For each factorization , where is a letter and are words, with -free, there is a transition , labelled .
Recall from Section 6 that the states of the automaton are the special residuals of , the set of suffixes of . By Theorem 5.1, the nonempty residuals of are the where is a prefix of . Clearly, the mapping is a bijection from the set of prefixes of onto the set of nonempty residuals of .
Lemma 7.2
The set of states is naturally in bijection with the set of palindromic prefixes of . This set is naturally in bijection with the set of prefixes of ; with this identification, the initial state is and all states are terminal.
Proof.
The second bijection maps a prefix of onto the prefix of (see [17] p. 209).
Let be a prefix of . According to the definition of special residuals in Section 6, is special if and only if (i) either , or (ii) , or (iii) if there are two words in beginning by different letters.
Let be a special residual of . We show that in all three cases, is a palindromic prefix of .
In case (i), which is clearly a palindromic prefix of . In case (ii), is a suffix of , and being also a prefix, it is a palindromic prefix of . In case (iii), let be the two words, with , , for distinct letters ; then are in , hence is a right special factor of ; then is a left special factor of , thus by Corollary 5.3, is a prefix of and therefore is a palindromic prefix of .
Conversely, if is a palindromic prefix of , then is also a suffix of , hence and is special residual of , by case (ii). ∎
We use another formula of Justin, see [17] p. 209. Let . If is -free then . If on the other hand occurs in , write with -free. Then
(7.1) |
The recursive definition of is is explained in the following result.
Proposition 7.3
Let . Define , where is the longest -free suffix of . The automaton having as set of states the set of prefixes of , as stated in Lemma 7.2, construct an automaton as follows:
-
•
add to the new state , which is terminal;
-
•
for each prefix of , add an edge from the state of to the new state , labelled .
Then .
Note that is a prefix of , so that . Moreover, is a prefix of , hence is a state of .
Figure 7 illustrates the construction in Proposition 7.3: the construction from in Figure 3 to ; here, , and only the new edges are drawn. Note that .
Lemma 7.4
Each word recognized by the automaton of Proposition 7.3 is a suffix of .
Proof.
Let be a word recognized by , that is the label of some path in . If this path does not end at , then by the construction, it is a path in ; hence is a suffix of , and since is a suffix of (the former is a prefix of the latter, and both words are palindromes), is a suffix of . If this path ends at , then its last edge is one of the new edges; hence , where is a suffix of . Suppose first that is -free; then by Justin’s result recalled above, and is a suffix of . Suppose now that is not -free; then , is -free and ; moreover, ; since is a suffix of and since is in , we see that is a suffix of .
∎
Lemma 7.5
For every prefix of of , is obtained from by keeping in the latter only the states which are prefixes of .
The number of paths in from the initial state to the state is equal to if is nonempty (where denotes the word with the last letter removed), and it is if is empty.
Proof.
We prove the lemma by induction on the length of . For , it is immediate.
Suppose now that and . We prove the assertions for , admitting them for shorter words.
Take the notation and in Proposition 7.3. We know by Lemma 7.4 that each word recognized by is recognized by . We prove now the converse, by a counting argument, using the induction hypothesis. Let denote the number of words recognized by . Then is equal to the number of suffixes of , hence is equal to to the length of . Hence . Now let be the number of words recognized by . By construction of the automaton , each word recognized by it, and not recognized by , is of the form , where is the label of some path in which starts at and ends at for some prefix of . By induction, the number of such words is equal to if is nonempty, and 1 if is empty. Since the corresponding sum is telescoping, it follows that the number of possible words is equal to if is nonempty, and to if is empty. If is not -free, then is nonempty, and , so that . If is -free, then , , and . Thus in both cases, , which implies that and both recognize the language of suffixes of ; since both automata have the same number of states and the first is minimal, they are isomorphic; but since both have a unique longest path of the same length, they are equal.
The two assertions of the lemma now clearly follow for . ∎
Theorem 7.1 has the following corollary, which could also be proved using (iv) in Section 5 and Proposition 6.7.
Corollary 7.6
In the graph , the label of an edge depends only on the final state of the edge.
Proof.
Indeed, the label of each transition is . ∎
Corollary 7.7
Let , of length , and for any , denote by the position of the rightmost occurrence of in the word , with when is -free. Then the number of states in is and the number of transitions is .
Proof.
By Theorem 7.1, the number of states is the number of prefixes of , thus it is . The number of transitions is equal to , where is the number of factorizations , , -free. We have if is -free.
Suppose that contains . Denote , the set of words containing letter ; clearly, each word in has a unique factorization , , -free. Hence is equal to the number of factorizations ,
If we factorize , where is the longest -free suffix of , then , and a factorization satisfies if and only if is a prefix of . Hence the number of such factorizations is . ∎
8 Further comments
Following [13], we obtain for any word on any alphabet, a directed graph that counts from 0 to in the following sense: replace in each label by its length; then one obtains a directed graph, with the initial vertex , such that for each , there is a unique path, starting from the initial vertex, whose label is (here, labels of paths here additive). This follows clearly since there is a unique suffix of of each length . For on a binary alphabet, Epifanio et al. call this graph a Sturmian graph.
As open problem, we mention that Theorem 7.3 has certainly an interpretation as a factorization result: each suffix of as a certain factorization as a product of the words which are the label of the edges of the automaton; these words are all of the form , a proper prefix of . In the binary alphabet case, this is known: the factorization is related to the Ostrowski lazy factorization (see [12]), and to the factorization theorem of Anna Frid [15] Corollary 1, which following [7] Section 9, may be stated as follows: for on a binary alphabet, of length , each suffix of , of length , has a unique factorization , where , the prefix of length of , and where is the lazy Ostrowski representation of , with .
References
- [1] Jorge Almeida. Profinite semigroups and applications. In Structural theory of automata, semigroups, and universal algebra, volume 207 of NATO Sci. Ser. II Math. Phys. Chem., pages 1–45. Springer, Dordrecht, 2005. Notes taken by Alfredo Costa.
- [2] Jorge Almeida, Afredo Costa, Revekka Kyriakoglou, and Dominique Perrin. Profinite Semigroups and Symbolic Dynamics. Springer verlag, 2020.
- [3] Pierre Arnoux and Gérard Rauzy. Représentation géométrique de suites de complexité . Bull. Soc. Math. France, 119(2):199–215, 1991.
- [4] A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M. T. Chen, and J. Seiferas. The smallest automaton recognizing the subwords of a text. Theoret. Comput. Sci., 40(1):31–55, 1985. Special issue: Eleventh international colloquium on automata, languages and programming (Antwerp, 1984).
- [5] Anselm Blumer, J. Blumer, David Haussler, Ross M. McConnell, and Andrzej Ehrenfeucht. Complete inverted files for efficient text retrieval and analysis. J. ACM, 34(3):578–595, 1987.
- [6] Anselm Blumer, Andrzej Ehrenfeucht, and David Haussler. Average sizes of suffix trees and dawgs. Discret. Appl. Math., 24(1-3):37–45, 1989.
- [7] Y. Bugeaud and C. Reutenauer. On the conjugates of Christoffel words, 2022.
- [8] Maxime Crochemore and Renaud Vérin. Direct construction of compact directed acyclic word graphs. In Alberto Apostolico and Jotun Hein, editors, Combinatorial Pattern Matching, 8th Annual Symposium, CPM 97, Aarhus, Denmark, June 30 - July 2, 1997, Proceedings, volume 1264 of Lecture Notes in Computer Science, pages 116–129. Springer, 1997.
- [9] Aldo de Luca. Sturmian words: structure, combinatorics, and their arithmetics. Theoret. Comput. Sci., 183(1):45 – 82, 1997.
- [10] Xavier Droubay, Jacques Justin, and Giuseppe Pirillo. Episturmian words and some constructions of de Luca and Rauzy. Theoret. Comput. Sci., 255(1-2):539–553, 2001.
- [11] Fabien Durand and Dominique Perrin. Dimension Groups and Dynamical Systems. Cambridge University Press, 2022.
- [12] Chiara Epifanio, Christiane Frougny, Alessandra Gabriele, Filippo Mignosi, and Jeffrey O. Shallit. Sturmian graphs and integer representations over numeration systems. Discret. Appl. Math., 160(4-5):536–547, 2012.
- [13] Chiara Epifanio, Filippo Mignosi, Jeffrey O. Shallit, and Ilaria Venturini. On sturmian graphs. Discret. Appl. Math., 155(8):1014–1030, 2007.
- [14] Gabriele Fici. Special factors and the combinatorics of suffix and factor automata. Theoret. Comput. Sci., 412(29):3604–3615, 2011.
- [15] Anna E. Frid. Sturmian numeration systems and decompositions to palindromes. European J. Combin., 71:202–212, 2018.
- [16] Marshall Hall, Jr. A topology for free groups and related groups. Ann. of Math. (2), 52:127–139, 1950.
- [17] Jacques Justin. Episturmian morphisms and a Galois theorem on continued fractions. ITA, 39(1):207–215, 2005.
- [18] Christian Kassel and Christophe Reutenauer. A palindromization map for the free group. Theor. Comput. Sci., 409(3):461–470, 2008.
- [19] M. Lothaire. Applied Combinatorics on Words. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2005.
- [20] Gérard Rauzy. Mots infinis en arithmétique. In Maurice Nivat and Dominique Perrin, editors, Automata on Infinite Words, volume 192 of Lecture Notes in Computer Science, pages 165–171. Springer-verlag, 1984.
- [21] Christophe Reutenauer. From Christoffel Words to Markoff Numbers. Oxford University Press, 2019.
- [22] Luis Ribes and Pavel Zalesskii. Profinite groups, volume 40 of Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics]. Springer-Verlag, Berlin, second edition, 2010.
- [23] Gwenaël Richomme. A characterization of infinite LSP words. In Developments in language theory, volume 10396 of Lecture Notes in Comput. Sci., pages 320–331. Springer, Cham, 2017.
- [24] Marinella Sciortino and Luca Q. Zamboni. Suffix automata and standard sturmian words. In Tero Harju, Juhani Karhumäki, and Arto Lepistö, editors, Developments in Language Theory, 11th International Conference, DLT 2007, Turku, Finland, July 3-6, 2007, Proceedings, volume 4588 of Lecture Notes in Computer Science, pages 382–398. Springer, 2007.