Université du Québec à Montréal,
CP 8888 Succ. Centre-ville, Montréal (QC) Canada H3C 3P8
11email: [email protected]
On the number of -powers in a finite word
Abstract
This note is an attempt to attack a conjecture of Fraenkel and Simpson stated in 1998 concerning the number of distinct squares in a finite word. By counting the number of (right-)special factors, we give an upper bound of the number of -powers in a finite word for any integer . By -power, we mean a word of the form .
1 Introduction and notation
Given a finite word, the problem of counting the number of distinct squares was introduced by Fraenkel and Simpson. In [4] they conjectured that the number of distinct squares in a finite word is bounded by its length and they proved that this number is bounded by . After that Ilie [5] strengthened this bound to ; Lam [6] improved this result to ; Deza, Franek and Thierry [2] achieved a bound of ; Thierry [7] refined this bound to . A basic fact about the square-counting problem is that no more than two squares can have their last occurrence starting at the same position, this fact is proved in [4] using the three squares lemma of Crochemore and Rytter [1]. After that, the ideas of improving the bound of distinct squares in a finite word are about counting the number of positions at which there exist two different squares having their last occurrence starting. In this article, instead of studying the same motif, we propose to consider the number of special factors. The idea is from the fact that (almost) every occurrent factor can be associated with a (right-)special factor. Even though we can not achieve an injection from the set of squares to the set of special factors in the finite word. This correspondence leads an upper bound of the number of -powers in the given word. The main result of this note is announced as follows:
Theorem 1
Let be an integer larger than 2. For any finite word , let denote the number of its distinct non-empty factors of the form , let denote the length of and let denote the number of distinct letters in . We then have
2 Preliminaries
Let us recall the basic terminology about words. By word we mean a finite concatenation of symbols , with . The length of , denoted , is and we say that the symbol is at position . The set is called the alphabet of and its elements are called letters. Let denote the cardinality of . A word of length is called the empty word and it is denoted by . For any word , we have . Let be two different words, we say is shorter (resp. longer) than if (resp. ).
A word is called a factor of if for some words . When (resp. ) is called a prefix (resp. suffix) of . The set of all factors (resp. prefixes, resp. suffixes) of is denoted by (resp. , resp. ). For any integer satisfying , let denote the number of distinct factors of length in .
A factor of is called right-special if there exist two different letters such that and are both factors of . are called right-extensions of .
For any natural number , the -th power of a finite word is denoted by and consists of the concatenation of copies of .
A finite word is said to be primitive if it is not a power of another word, that is if implies . Let denote the set of all primitive factors of .
A -power is a word satisfying for a certain and for a . Let denote the number of its distinct non-empty -powers. For a given word and two positive integers satisfying , let us define to be the prefix of of length . Now we can define the rational power a word: let be a finite word and let be a positive rational number, is well defined only if , and in this case, there is a couple of non-negative integers satisfying , we define to be . For a given word and a given integer , we say that is of period if there exists a word of length such that .
Here we recall a basic lemma concerning the repetitions:
Lemma 2 (Fine and Wilf [3])
Let be a word having and for periods. If then is also a period of .
3 Number of right-special factors
Let be a finite word. In this section, we consider the word obtained by concatenating a special letter at the end of , with the condition that .
For any , let us define , and similarly, let us define . There are the following facts between and :
1) For any factor , ;
2) If but , .
For any factor , let us define . For any integer satisfying , let us define to be the shortest suffix of containing as a prefix.
Example 3
Let us consider the following word
For ,, thus are all well-defined, we have , and . For , we then have , thus only is well-defined, we have .
Lemma 4
Let be a finite word and let , if , then, for any integer satisfying , the factor is a right-special factor in .
Proof
Let and let , with for all satisfying and . As , there exists an integer satisfying such that . For any integer satisfying , occurs at least twice in as respectively a prefix and a suffix of . If we suppose that , then
Now let us prove that is a right-special factor of . As , is well-defined and it is a factor of . We claim that . In fact, if , then ; if ,
in this case if , then , this contradicts the maximality of . As a consequence, in both cases, thus
Hence, is a right-special factor in .
Example 5
Let us consider the word given in the previous example:
is to be
For ,, thus are right-special factors in . In fact, , and are all factors of for . For , . is also a right-special factor of because . However, it is not counted in the lemma, because .
Lemma 6
For any couple of different primitive factors satisfying , we have
.
Proof
If there exist two different primitive factors and two integers such that then with respectively a prefix of and . From the hypothesis that and Lemma 2, there exists a factor such that are both a power of , this fact contradicts the primitivities of .
Corollary 7
Let be a finite word and let denote the number of right-special factors in , then we have
4 Proof of the main theorem
Proof ( of Theorem 1)
Let be an integer larger than , we have
where represents the largest integer smaller than or equal to . On the other hand, for any primitive satisfying , we have . Hence,
where denote the number of right-special factors in .
Now let us give an upper bound of . First, we claim that if is a right-special factor, then . Otherwise, is a suffix of and it does not have any right-extensions. Now, for any right-special factor of of length , it has at least two right-extensions, and the suffix of of length is not right-special. Thus, if we let denote the number of right-special factors of of length , we then have , where is the number of distinct factors of length in defined in the section Preliminaries. Hence,
Thus,
5 Concluding Remarks
The result obtained in the main theorem is not sharp. Let us consider the word , it is easy to check that while the bound given in Theorem 1 is . The author believes that the problem is from the way we count the number of right-special factors. From Lemme 4 we prove that for a given primitive , all are right-special if . However, in Lemma 6 we count just the for . In fact, we can only prove that the words of the form are pairwisely different when . Meanwhile, we can have two different primitives such that for some positive integer . Thus, further work to to be done to investigate that
If the previous inequality holds, we may expect to prove
References
- [1] Crochemore, M., Rytter, W.: Squares, cubes, and time-space efficient string searching. Algorithmica 13, 405–425 (1995)
- [2] Deza, A., Franek, F., Thierry, A.: How many double squares can a string contain? Discrete Applied Mathematics 180, 52–69 (2015)
- [3] Fine, N.J., Wilf, H.S.: Uniqueness theorems for periodic functions. Proceedings of the American Mathematical Society 16(1), 109–114 (1965)
- [4] Fraenkel, A.S., Simpson, J.: How many squares can a string contain? J. Comb. Theory, Ser. A 82(1), 112–120 (1998)
- [5] Ilie, L.: A note on the number of distinct squares in a word. In: Brlek, S., Reutenauer, C. (eds.) Proc. Words2005, 5-th International Conference on Words. vol. 36, pp. 289–294. Publications du LaCIM, Montreal, Canada (13–17 Sep 2005)
- [6] Lam, N.H.: On the number of squares in a string. AdvOL-Report 2 (2013)
- [7] Thierry, A.: A proof that a word of length n has less than 1.5n distinct squares (2020)