Breaking a Barrier in Constructing Compact Indexes for Parameterized Pattern Matching
Abstract
A parameterized string (p-string) is a string over an alphabet $\Sigma_s \cup \Sigma_p$, where $\Sigma_s$ and $\Sigma_p$ are disjoint alphabets for static symbols (s-symbols) and for parameter symbols (p-symbols), respectively. Two p-strings $x$ and $y$ are said to parameterized match (p-match) if and only if $x$ can be transformed into $y$ by applying a bijection on $\Sigma_p$ to every occurrence of p-symbols in $x$. The indexing problem for p-matching is to preprocess a p-string $T$ of length $n$ so that we can efficiently find the occurrences of substrings of $T$ that p-match with a given pattern. Extending the Burrows-Wheeler Transform (BWT) based index for exact string pattern matching, Ganguly et al. [SODA 2017] proposed the first compact index (named pBWT) for p-matching, and posed an open problem on how to construct it in compact space, i.e., in $O(n \log \sigma)$ bits of space, where $\sigma$ is the alphabet size. Hashimoto et al. [SPIRE 2022] partially solved this problem by showing how to construct some components of pBWTs for $T$ in an online manner while reading the symbols of $T$ from right to left, with a time complexity that contains a multiplicative factor of the number of distinct p-symbols. In this paper, we improve the time complexity by removing this multiplicative factor. We remark that removing it is of great interest because this has not been achieved for over a decade in the construction of related data structures, such as parameterized suffix arrays, even in the offline setting. We also show that our data structure can support backward search, a core procedure of BWT-based indexes, at any stage of the online construction, making it the first compact index for p-matching that can be constructed in compact space and even in an online manner.
1 Introduction
A parameterized string (p-string) is a string over an alphabet $\Sigma_s \cup \Sigma_p$, where $\Sigma_s$ and $\Sigma_p$ are disjoint alphabets for static symbols (s-symbols) and for parameter symbols (p-symbols), respectively. Two p-strings $x$ and $y$ are said to parameterized match (p-match) if and only if $x$ can be transformed into $y$ by applying a bijection on $\Sigma_p$ to every occurrence of p-symbols in $x$. P-matching was introduced by Baker aiming at software maintenance and plagiarism detection [1, 2, 3], and has been extensively studied over the last decades (see a recent survey [21] and references therein).
The indexing problem for p-matching is to preprocess a p-string $T$ of length $n$ so that we can efficiently find the occurrences of substrings of $T$ that p-match with a given pattern. Extending indexes for exact string pattern matching, several data structures have been proposed that can be used as indexes for p-matching, e.g., parameterized suffix trees [1, 19, 2, 3], parameterized suffix arrays [7, 17, 4, 11], parameterized suffix trays [13], parameterized DAWGs [23], parameterized position heaps [8, 10, 12] and the parameterized Burrows-Wheeler Transform (pBWT) [14, 18, 15].
Among these indexes, pBWTs are the most space-economical, consuming $O(n \log \sigma)$ bits [14, 18], where $\sigma$ is the alphabet size and the version of [18] is a simplification of the original one of [14]. Let $\sigma_s$ and $\sigma_p$ respectively be the numbers of distinct s-symbols and p-symbols that appear in $T$. The pBWT of $T$ can be constructed via the parameterized suffix tree of $T$, for which $O(n \log \sigma)$-time or randomized $O(n)$-time construction algorithms are known [19, 6, 20], but the intermediate memory footprint of $\Theta(n \log n)$ bits could be intolerable when $n$ is significantly larger than $\sigma$. Hashimoto et al. [16] showed how to construct some components of the pBWT of [18] for $T$ in an online manner while reading the symbols of $T$ from right to left, with a time complexity that contains a multiplicative factor of $\sigma_p$.
In this paper, we improve the time complexity of [16] by removing the multiplicative factor of $\sigma_p$. Removing this factor is of great interest because it has not been achieved for over a decade in the construction of related data structures, such as parameterized suffix arrays, even in an offline setting [11]. We also show that our data structure can support backward search, a core procedure of BWT-based indexes, at any stage of the online construction, making it the first compact index for p-matching that can be constructed in compact space and even in an online manner. This is not likely to be achievable with the previous work [16] due to the lack of support for 2D range counting queries in the data structure it uses.
Our computational assumptions are as follows:
- We assume a standard Word-RAM model with word size $\Omega(\log n)$.
- Each symbol in $\Sigma_s \cup \Sigma_p$ is represented by $O(\log n)$ bits, i.e., it fits in a constant number of machine words.
- We can distinguish symbols in $\Sigma_s$ and $\Sigma_p$ in $O(1)$ time, e.g., by having some flag bits or thresholds separating them from each other.
- The order on s-symbols is determined in $O(1)$ time based on their bit representations.
An index of a p-string $T$ for p-matching is to support, given a pattern $P$, the counting queries that compute the number of occurrences of substrings of $T$ that p-match with $P$, and the locating queries that compute those occurrences. Our main result is:
Theorem 1.
For a p-string of length over an alphabet , an index of for p-matching can be constructed online in time and bits of space, where is the number of distinct p-symbols used in the p-string. At any stage of the online construction, it can support the counting queries in time and the locating queries in time, where is the pattern length and is the number of occurrences to be reported.
2 Preliminaries
2.1 Basic notations and tools
An integer interval $\{i, i+1, \dots, j\}$ is denoted by $[i..j]$, where $[i..j]$ represents the empty interval if $i > j$.
Let $\Sigma$ be an ordered finite alphabet. An element of $\Sigma^*$ is called a string over $\Sigma$. The length of a string $w$ is denoted by $|w|$. The empty string $\varepsilon$ is the string of length 0, that is, $|\varepsilon| = 0$. Let $\Sigma^+ = \Sigma^* \setminus \{\varepsilon\}$ and $\Sigma^k = \{w \in \Sigma^* \mid |w| = k\}$ for any non-negative integer $k$. The concatenation of two strings $x$ and $y$ is denoted by $x \cdot y$ or simply $xy$. When a string $w$ is represented by the concatenation of strings $x$, $y$ and $z$ (i.e., $w = xyz$), then $x$, $y$ and $z$ are called a prefix, substring, and suffix of $w$, respectively. A substring $x$ of $w$ is called proper if $x \neq w$.
The $i$-th symbol of a string $w$ is denoted by $w[i]$ for $1 \le i \le |w|$, and the substring of $w$ that begins at position $i$ and ends at position $j$ is denoted by $w[i..j]$ for $1 \le i \le j \le |w|$, i.e., $w[i..j] = w[i] w[i+1] \cdots w[j]$. For convenience, let $w[i..j] = \varepsilon$ if $i > j$; further let $w[..j]$ and $w[i..]$ denote abbreviations for the prefix $w[1..j]$ of length $j$ and the suffix $w[i..|w|]$ starting at position $i$, respectively. For two strings $x$ and $y$, let $\mathsf{lcp}(x, y)$ denote the length of the longest common prefix of $x$ and $y$. We consider the lexicographic order over $\Sigma^*$ obtained by extending the strict total order $<$ defined on $\Sigma$: $x$ is lexicographically smaller than $y$ (denoted as $x \prec y$) if and only if either $x$ is a proper prefix of $y$ or $x[\mathsf{lcp}(x,y)+1] < y[\mathsf{lcp}(x,y)+1]$ holds.
For any string $w$, character $c$, and position $i$, $\mathsf{rank}_c(w, i)$ returns the number of occurrences of $c$ in $w[..i]$ and $\mathsf{select}_c(w, i)$ returns the position of the $i$-th occurrence of $c$ in $w$. For positions $i \le j$, a range minimum query $\mathsf{RmQ}(w, i, j)$ asks for $\min\{w[k] \mid i \le k \le j\}$. We also consider find previous/next queries $\mathsf{FP}_\pi(w, i)$ and $\mathsf{FN}_\pi(w, i)$, where $\pi$ is a predicate either in the form of "$= c$" (equal to $c$), "$< c$" (less than $c$) or "$\ge c$" (larger than or equal to $c$): $\mathsf{FP}_\pi(w, i)$ returns the largest position $j \le i$ at which $w[j]$ satisfies the predicate $\pi$. Symmetrically, $\mathsf{FN}_\pi(w, i)$ returns the smallest position $j \ge i$ at which $w[j]$ satisfies the predicate $\pi$. For example, with the integer string $w = (2, 1, 3, 1, 2)$, $\mathsf{rank}_1(w, 4) = 2$, $\mathsf{select}_1(w, 2) = 4$, $\mathsf{RmQ}(w, 2, 5) = 1$, $\mathsf{FP}_{\ge 2}(w, 4) = 3$ and $\mathsf{FN}_{=2}(w, 2) = 5$.
If the answer to a $\mathsf{select}$, $\mathsf{FP}$ or $\mathsf{FN}$ query does not exist, it is just ignored. To handle this case of non-existence, we use such queries in an expression together with $\max$ or $\min$: for example, $\max(\mathsf{FP}_\pi(w, i), 0)$ returns $0$ if $\mathsf{FP}_\pi(w, i)$ does not exist.
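To fix the semantics, the following is a minimal, naive Python sketch of these queries on a plain list (0-indexed here, whereas the text uses 1-based positions); the efficient dynamic counterparts are provided by Lemmas 2 and 3 below. The function names are ours, not part of any library.

```python
# Naive reference implementations of rank, select, RmQ and find previous/next
# queries on a plain Python list; linear-time, for illustration only.
def rank(w, c, i):                   # number of occurrences of c in w[0..i]
    return w[:i + 1].count(c)

def select(w, c, j):                 # position of the j-th occurrence of c (j >= 1), or None
    hits = [p for p, x in enumerate(w) if x == c]
    return hits[j - 1] if 0 < j <= len(hits) else None

def rmq(w, i, j):                    # minimum value in w[i..j]
    return min(w[i:j + 1])

def find_prev(w, pred, i):           # largest p <= i with pred(w[p]), or None
    return next((p for p in range(i, -1, -1) if pred(w[p])), None)

def find_next(w, pred, i):           # smallest p >= i with pred(w[p]), or None
    return next((p for p in range(i, len(w)) if pred(w[p])), None)

w = [2, 1, 3, 1, 2]
assert rank(w, 1, 3) == 2 and select(w, 1, 2) == 3
assert rmq(w, 1, 4) == 1
assert find_prev(w, lambda x: x >= 2, 3) == 2
assert find_next(w, lambda x: x == 2, 1) == 4
```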
Dynamic strings should support insertion/deletion of a symbol to/from any position as well as fast random access. We use the following result:
Lemma 2 ([22]).
A dynamic string of length $n$ over an alphabet of size $\sigma$ can be implemented while supporting random access, insertion, deletion, $\mathsf{rank}$ and $\mathsf{select}$ queries in $n \log \sigma + o(n \log \sigma)$ bits of space and $O(\log n / \log\log n)$ query and update times.
Dynamic binary strings equipped with $\mathsf{rank}$ and $\mathsf{select}$ queries can be used as a building block for the dynamic wavelet matrix [5] of a string over an alphabet $[0..\sigma)$ to support queries beyond $\mathsf{rank}$ and $\mathsf{select}$. The idea is that each of the other queries can be simulated by performing one of the building-block queries on every level of the wavelet matrix, which has $O(\log \sigma)$ levels, cf. [24, Section 6.2].
Lemma 3.
A dynamic string of length $n$ over an alphabet $[0..\sigma)$ with $\sigma \le n$ can be implemented while supporting random access, insertion, deletion, $\mathsf{rank}$, $\mathsf{select}$, $\mathsf{RmQ}$, $\mathsf{FP}$ and $\mathsf{FN}$ queries in $n \log \sigma + o(n \log \sigma)$ bits of space and $O(\log \sigma \log n / \log\log n)$ query and update times.
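To illustrate the level-by-level simulation behind Lemma 3, here is a toy static wavelet matrix over integers in $[0..2^{\mathrm{bits}})$ supporting $\mathsf{rank}$; plain Python lists stand in for the dynamic bit vectors, so each per-level binary rank is linear-time here instead of the efficient dynamic implementation. This is a sketch of the general idea, not of the exact data structure used in the paper.

```python
# Toy static wavelet matrix: one bit vector per level (most significant bit first);
# a symbol query is answered by one binary rank per level.
class WaveletMatrix:
    def __init__(self, seq, bits):
        self.bits = bits
        self.levels = []                          # (bit vector, #zeros) per level
        cur = list(seq)
        for lv in range(bits - 1, -1, -1):
            bv = [(x >> lv) & 1 for x in cur]
            zeros = [x for x in cur if not (x >> lv) & 1]
            ones = [x for x in cur if (x >> lv) & 1]
            self.levels.append((bv, len(zeros)))
            cur = zeros + ones                    # stable reordering for the next level

    @staticmethod
    def _rank_bit(bv, b, i):                      # occurrences of bit b in bv[:i]
        return sum(1 for x in bv[:i] if x == b)

    def rank(self, c, i):                         # occurrences of symbol c in seq[:i]
        lo = 0
        for lv, (bv, z) in enumerate(self.levels):
            b = (c >> (self.bits - 1 - lv)) & 1
            if b == 0:
                lo, i = self._rank_bit(bv, 0, lo), self._rank_bit(bv, 0, i)
            else:
                lo, i = z + self._rank_bit(bv, 1, lo), z + self._rank_bit(bv, 1, i)
        return i - lo

wm = WaveletMatrix([5, 1, 3, 1, 5, 0], bits=3)
assert wm.rank(1, 4) == 2 and wm.rank(5, 6) == 2
```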
2.2 Parameterized strings
Let $\Sigma_s$ and $\Sigma_p$ denote two disjoint sets of symbols. We call a symbol in $\Sigma_s$ a static symbol (s-symbol) and a symbol in $\Sigma_p$ a parameter symbol (p-symbol). A parameterized string (p-string) is a string over $\Sigma_s \cup \Sigma_p$. Let $\infty$ represent a symbol that is larger than any integer, and let $\mathcal{N}_\infty$ be the set of positive integers extended with $\infty$. We assume that $\Sigma_s \cap \mathcal{N}_\infty = \emptyset$ and that $\Sigma_s \cup \mathcal{N}_\infty$ is an ordered alphabet such that all s-symbols are smaller than any element in $\mathcal{N}_\infty$. Let $\$$ be the smallest s-symbol, which will be used as an end-marker of p-strings. For any p-string $w$, the p-encoded string $\mathsf{pe}(w)$ of $w$, an equivalent of which was also proposed in [18], is the string in $(\Sigma_s \cup \mathcal{N}_\infty)^{|w|}$ such that
$$\mathsf{pe}(w)[i] = \begin{cases} w[i] & \text{if } w[i] \in \Sigma_s,\\ \infty & \text{if } w[i] \in \Sigma_p \text{ and } w[i] \text{ does not occur in } w[..i-1],\\ i - j & \text{otherwise,} \end{cases}$$
where $j$ is the largest position in $[1..i-1]$ with $w[j] = w[i]$. In other words, p-encoding transforms an occurrence of a p-symbol into the distance to the previous occurrence of the same p-symbol, or into $\infty$ if it is the leftmost occurrence. By definition, p-encoding is prefix-consistent, i.e., $\mathsf{pe}(w) = \mathsf{pe}(wc)[..|w|]$ for any symbol $c$. On the other hand, for a p-symbol $x \in \Sigma_p$, $\mathsf{pe}(w)$ and $\mathsf{pe}(xw)[2..]$ differ if and only if $x$ occurs in $w$. If it is the case, the leftmost occurrence $i$ of $x$ in $w$ is the unique position at which they differ, i.e., $\mathsf{pe}(w)[i] = \infty$ and $\mathsf{pe}(xw)[i+1] = i$.
For any p-string $w$, let $\pi(w)$ denote the number of distinct p-symbols in $w$, i.e., $\pi(w) = |\{w[i] \mid 1 \le i \le |w|, w[i] \in \Sigma_p\}|$; equivalently, $\pi(w)$ is the number of $\infty$'s in $\mathsf{pe}(w)$. We define a function $\mathsf{fe}$ that maps a p-string $w$ over $\Sigma_s \cup \Sigma_p$ to an element in $\Sigma_s \cup \mathcal{N}_\infty$ such that
$$\mathsf{fe}(w) = \begin{cases} w[1] & \text{if } w[1] \in \Sigma_s,\\ r & \text{otherwise,} \end{cases}$$
where $r \in \mathcal{N}_\infty$. In the second case, $r$ represents the rank of the p-symbol $w[1]$ when ordering all p-symbols by their leftmost occurrences in $w[2..]$, considering the rank of p-symbols not occurring in $w[2..]$ to be $\infty$. If $w[1]$ occurs in $w[2..]$, it holds that $\mathsf{fe}(w) = \pi(w[2..j])$, where $j$ is the leftmost position in $[2..|w|]$ with $w[j] = w[1]$. For convenience, let $\mathsf{fe}(\varepsilon) = \$$.
For two p-strings $x$ and $y$, $\mathsf{lcp}_\infty(x, y)$ denotes the number of $\infty$'s in the longest common prefix of $\mathsf{pe}(x)$ and $\mathsf{pe}(y)$.
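The following naive Python sketch makes the encodings concrete. The s-alphabet SIGMA_S, the function names pe, fe and lcp_inf, and the boundary value fe('') = '$' are our own assumptions for illustration; p-symbols are simply the characters not in SIGMA_S. For instance, with 'x' and 'y' as p-symbols, pe('xaxay') evaluates to [INF, 'a', 2, 'a', INF].

```python
SIGMA_S = set("ab$")          # hypothetical s-symbols; '$' plays the role of the end-marker
INF = float("inf")            # stands for the symbol "infinity"

def pe(w):
    """p-encoding: s-symbols stay; a p-symbol becomes the distance to its previous
    occurrence, or infinity at its leftmost occurrence."""
    last, out = {}, []
    for i, c in enumerate(w):
        if c in SIGMA_S:
            out.append(c)
        else:
            out.append(i - last[c] if c in last else INF)
            last[c] = i
    return out

def fe(w):
    """w[0] itself if it is an s-symbol; otherwise the rank of w[0] among p-symbols
    ordered by their leftmost occurrence in w[1:], or infinity if it does not occur there."""
    if not w:
        return "$"
    if w[0] in SIGMA_S:
        return w[0]
    order = []
    for c in w[1:]:
        if c not in SIGMA_S and c not in order:
            order.append(c)
    return order.index(w[0]) + 1 if w[0] in order else INF

def lcp_inf(x, y):
    """Number of infinity symbols in the longest common prefix of pe(x) and pe(y)."""
    count = 0
    for a, b in zip(pe(x), pe(y)):
        if a != b:
            break
        if a == INF:
            count += 1
    return count
```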
The following lemma states basic but important properties of $\mathsf{pe}$ and $\mathsf{fe}$, which immediately follow from the definitions (see Fig. 1 for an illustration).
Lemma 4.
For any p-strings and over with , and , Table 1 shows a complete list of cases for , and the lexicographic order between and , where a case starting with A and B is under the condition that at least one of and is in , and respectively none of and is in .
cases | additional conditions | lexicographic order | ||
---|---|---|---|---|
(A1) | iff | |||
(A2) | ||||
(B1) | ||||
(B2) | and | |||
(B3) | and | |||
(B4) |
Figure 1: Illustration of the cases of Lemma 4.
By Lemma 4, we have the following corollaries:
Corollary 5.
For any p-strings and , .
Corollary 6.
For any p-strings and with , if and only if .
Corollary 7.
For any p-strings and with , it holds that , where and .
1 | 12 | 0 | |||||
2 | 11 | 0 | 1 | ||||
3 | 7 | 0 | 2 | ||||
4 | 3 | 2 | 2 | ||||
5 | 10 | 0 | 2 | 1 | |||
6 | 6 | 1 | 3 | 2 | |||
7 | 2 | 2 | 3 | 2 | |||
8 | 9 | 1 | 2 | 2 | |||
9 | 5 | 2 | 3 | 3 | |||
10 | 1 | 3 | 3 | ||||
11 | 8 | 2 | 2 | ||||
12 | 4 | 2 | 3 |
Let $T$ be a p-string of length $n$ that has the smallest s-symbol $\$$ as its end-marker, i.e., $T[n] = \$$ and $\$$ does not appear anywhere else in $T$. The suffix rank function $\mathsf{pISA}$ for $T$ maps a position $i \in [1..n]$ to the lexicographic rank of $\mathsf{pe}(T[i..])$ in $\{\mathsf{pe}(T[j..]) \mid j \in [1..n]\}$. Its inverse function $\mathsf{pSA}$ returns the starting position $\mathsf{pSA}[k]$ of the lexicographically $k$-th p-encoded suffix of $T$.
The main components of the parameterized Burrows-Wheeler Transform (pBWT) of $T$ are $F$ and $L$. $F$ (resp. $L$) is defined to be the string of length $n$ such that $F[k] = \mathsf{fe}(T[\mathsf{pSA}[k]..])$ (resp. $L[k] = \mathsf{fe}(T[\mathsf{pSA}[k]-1..])$), where we assume that $T[0..] = T[n..]$, i.e., the symbol preceding $T[1]$ is the end-marker $\$$. (Previous studies [14, 18, 16] define pBWTs based on sorted cyclic rotations, but our suffix-based definition is more suitable for online construction because it prevents unnecessary updates on $F$ and $L$.) Since $\{T[\mathsf{pSA}[k]-1..] \mid k \in [1..n]\}$ is equivalent to the set of all non-empty suffixes of $T$ under this convention, $L$ is a permutation of $F$.
The so-called LF-mapping $\mathsf{LF}$ maps a position $k$ to $\mathsf{pISA}(\mathsf{pSA}[k] - 1)$ if $\mathsf{pSA}[k] \neq 1$, and otherwise to $\mathsf{pISA}(n)$. By definition and Corollary 6, we have:
Corollary 8.
For any p-string $T$ and any integers $k, k'$ with $1 \le k < k' \le n$, $\mathsf{LF}(k) < \mathsf{LF}(k')$ if $L[k] = L[k']$.
Thanks to Corollary 8, it holds that $\mathsf{LF}(k) = \mathsf{select}_c(F, \mathsf{rank}_c(L, k))$, where $c = L[k]$. The inverse function $\mathsf{FL}$ of $\mathsf{LF}$ can be computed by $\mathsf{FL}(k) = \mathsf{select}_c(L, \mathsf{rank}_c(F, k))$, where $c = F[k]$.
Let $\mathsf{LCP}$ be the string of length $n$ such that $\mathsf{LCP}[1] = 0$ and $\mathsf{LCP}[k] = \mathsf{lcp}_\infty(T[\mathsf{pSA}[k-1]..], T[\mathsf{pSA}[k]..])$ for any $k \in [2..n]$. An example of all explained arrays is given in Table 2.
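As a concrete reference (and as a brute-force oracle against which the efficient online construction can be tested), the following sketch builds $\mathsf{pSA}$, $\mathsf{pISA}$, $F$, $L$, $\mathsf{LCP}$ and the LF-mapping by explicitly sorting the p-encoded suffixes. It reuses pe, fe and lcp_inf from the sketch in Subsection 2.2 and uses 0-based positions and ranks.

```python
def sort_key(encoded):
    # s-symbols are smaller than any element of N_infinity
    return [(0, c) if isinstance(c, str) else (1, c) for c in encoded]

def build_pbwt_arrays(T):
    assert T.endswith("$") and T.count("$") == 1
    n = len(T)
    pSA = sorted(range(n), key=lambda i: sort_key(pe(T[i:])))
    pISA = [0] * n
    for k, i in enumerate(pSA):
        pISA[i] = k
    F = [fe(T[i:]) for i in pSA]
    # the symbol "preceding" position 0 is taken to be the end-marker suffix
    L = [fe(T[i - 1:]) if i > 0 else fe(T[n - 1:]) for i in pSA]
    LCP = [0] + [lcp_inf(T[pSA[k - 1]:], T[pSA[k]:]) for k in range(1, n)]
    LF = [pISA[i - 1] if i > 0 else pISA[n - 1] for i in pSA]
    return pSA, pISA, F, L, LCP, LF
```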
3 Online construction algorithm
For online construction of our index for p-matching, we consider maintaining dynamic data structures for $F$, $L$ and $\mathsf{LCP}$ while prepending a symbol to the current p-string $T$. The details of the data structures will be presented in Subsection 3.3. In what follows, we focus on a single step of updating the data structures from those for $T$ to those for $cT$ for some symbol $c$ in $\Sigma_s \cup \Sigma_p$. Note that $F$, $L$ and $\mathsf{LCP}$ are strongly related to the sorted p-encoded suffixes of a p-string, and $\mathsf{pe}(cT)$ is the only p-encoded suffix of $cT$ that was not among the p-encoded suffixes of $T$. In order to deal with this newly emerging suffix, we compute the lexicographic rank $k$ of $\mathsf{pe}(cT)$ among the non-empty p-encoded suffixes of $cT$. Then $F$ and $L$ for $cT$ can be obtained by replacing the entry of $L$ at the rank of $\mathsf{pe}(T)$ (which was $\$$) by $\mathsf{fe}(cT)$ and inserting $\mathsf{fe}(cT)$ and $\$$ into the $k$-th position of $F$ and $L$, respectively; a reference sketch of this single step is given below. In Subsection 3.1, we propose our algorithm to compute $k$. For updating $\mathsf{LCP}$, we have to compute the $\mathsf{lcp}_\infty$-values of $\mathsf{pe}(cT)$ with its lexicographically adjacent p-encoded suffixes, which will be treated in Subsection 3.2.
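The sketch below mirrors this single step naively: it determines the rank $k$ of the new p-encoded suffix by brute force, replaces the $\$$-entry of $L$, inserts the new entries of $F$ and $L$, and recomputes the two $\mathsf{LCP}$ entries adjacent to the new row. It reuses the helper functions defined in the previous sketches and is meant only as an executable restatement of the update, not as the efficient algorithm.

```python
import bisect

def prepend_step(T, c):
    """One online step, computed naively: arrays of T -> arrays of c + T."""
    newT = c + T
    pSA, pISA, F, L, LCP, LF = build_pbwt_arrays(T)
    keys = [sort_key(pe(T[i:])) for i in pSA]          # sorted p-encoded suffixes of T
    k = bisect.bisect_left(keys, sort_key(pe(newT)))   # 0-based rank of the new suffix
    L[pISA[0]] = fe(newT)        # the old suffix T is now preceded by c, not by the end-marker
    F.insert(k, fe(newT))
    L.insert(k, "$")
    left = lcp_inf(T[pSA[k - 1]:], newT) if k > 0 else 0
    LCP.insert(k, left)
    if k < len(pSA):             # fix the entry of the suffix that now follows the new one
        LCP[k + 1] = lcp_inf(newT, T[pSA[k]:])
    return F, L, LCP, k
```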
3.1 How to compute $k$
Unlike the existing work [16], which computes $k$ by counting the number of p-encoded suffixes that are lexicographically smaller than $\mathsf{pe}(cT)$, we obtain $k$ indirectly by computing the rank of a lexicographically closest (smaller or larger) p-encoded suffix to $\mathsf{pe}(cT)$. The lexicographically smaller (resp. larger) closest element to $\mathsf{pe}(cT)$ among the p-encoded suffixes of $T$ is called the p-pred (resp. p-succ) of $\mathsf{pe}(cT)$, and its rank is denoted by $k_{\mathrm{pred}}$ (resp. $k_{\mathrm{succ}}$). Then it holds that $k = k_{\mathrm{pred}} + 1 = k_{\mathrm{succ}}$.
We start with the easy case that the prepended symbol is an s-symbol.
Lemma 9.
Let $cT$ be a p-string with $c \in \Sigma_s$. If $j = \mathsf{FP}_{=c}(L, \mathsf{pISA}(1))$ exists, the rank of the p-pred of $\mathsf{pe}(cT)$ is $\mathsf{LF}(j)$. Otherwise, the rank of the p-pred of $\mathsf{pe}(cT)$ is $\mathsf{FP}_{=c'}(F, n)$, where $c'$ is the largest s-symbol that appears in $T$ and is smaller than $c$.
Proof.
By Case (A2) of Lemma 4 (cf. Table 1), the lexicographic order of the p-encoded suffixes starting with $c$ does not change when their first characters, which are all equal to $c$, are removed. If $j = \mathsf{FP}_{=c}(L, \mathsf{pISA}(1))$ exists, $T[\mathsf{pSA}[j]..]$ is the lexicographically smaller closest p-encoded suffix to $\mathsf{pe}(T)$ that is preceded by $c$. Hence, $\mathsf{pe}(T[\mathsf{pSA}[j]-1..]) = c \cdot \mathsf{pe}(T[\mathsf{pSA}[j]..])$ is the p-pred of $\mathsf{pe}(cT)$, which means that its rank is $\mathsf{LF}(j)$.
If $\mathsf{FP}_{=c}(L, \mathsf{pISA}(1))$ does not exist, it implies that $\mathsf{pe}(cT)$ is the lexicographically smallest p-encoded suffix that starts with $c$. Since $\mathsf{pe}(cT)$ then comes lexicographically right after the p-encoded suffixes starting with an s-symbol smaller than $c$, the rank of its p-pred is the last occurrence of $c'$ in $F$, that is, $\mathsf{FP}_{=c'}(F, n)$. ∎
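Under the reconstruction of Lemma 9 given above, the recipe can be stated operationally as follows (a brute-force sketch reusing the arrays of the previous sketches; find-previous queries are replaced by explicit scans). It returns the 0-based rank of $\mathsf{pe}(cT)$ among the p-encoded suffixes of $cT$.

```python
def rank_after_prepending_s_symbol(T, c):
    assert c in SIGMA_S and c != "$"
    pSA, pISA, F, L, LCP, LF = build_pbwt_arrays(T)
    # find the previous occurrence of c in L at or before the rank of pe(T)
    j = next((p for p in range(pISA[0], -1, -1) if L[p] == c), None)
    if j is not None:
        k_pred = LF[j]                                          # rank of the p-pred
    else:
        cp = max(s for s in set(T) if s in SIGMA_S and s < c)   # '$' always qualifies
        k_pred = max(p for p in range(len(F)) if F[p] == cp)    # last occurrence of cp in F
    return k_pred + 1
```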
In the rest of this subsection, we consider the case where $c$ is a p-symbol. If $T$ contains no p-symbol, it is clear that $\mathsf{pe}(cT)$ is lexicographically the largest p-encoded suffix of $cT$, i.e., $k = n + 1$. Hence, in what follows, we assume that there is a p-symbol in $T$.
Since the p-pred and the p-succ of $\mathsf{pe}(cT)$ are the p-encoded suffixes of $T$ having the largest $\mathsf{lcp}_\infty$-values with $\mathsf{pe}(cT)$, we search for such p-encoded suffixes using the following lemmas, which leverage the information stored in $\mathsf{LCP}$.
Lemma 10.
Given two positions $k$ and $k'$ with $k < k'$, $\mathsf{lcp}_\infty(T[\mathsf{pSA}[k]..], T[\mathsf{pSA}[k']..]) = \min\{\mathsf{LCP}[j] \mid k < j \le k'\}$.
Proof.
It is not difficult to see that $\mathsf{lcp}_\infty(x, z) = \min(\mathsf{lcp}_\infty(x, y), \mathsf{lcp}_\infty(y, z))$ for any p-strings $x$, $y$ and $z$ with $\mathsf{pe}(x) \preceq \mathsf{pe}(y) \preceq \mathsf{pe}(z)$. Since $\mathsf{LCP}$ holds the $\mathsf{lcp}_\infty$-values of lexicographically adjacent p-encoded suffixes, we get the claim by applying the previous argument successively. ∎
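Lemma 10 can be phrased (and sanity-checked) in a few lines on top of the brute-force arrays: the $\mathsf{lcp}_\infty$-value of two suffixes of ranks $k < k'$ is the minimum of $\mathsf{LCP}$ over $(k..k']$.

```python
def lcp_inf_between_ranks(T, k, k2):
    pSA, pISA, F, L, LCP, LF = build_pbwt_arrays(T)
    assert 0 <= k < k2 < len(T)
    by_rmq = min(LCP[k + 1:k2 + 1])                       # RmQ over LCP[k+1..k2]
    assert by_rmq == lcp_inf(T[pSA[k]:], T[pSA[k2]:])     # Lemma 10
    return by_rmq
```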
Lemma 11.
Given a rank $k$ and a threshold $h$, Algorithm 1 correctly computes the maximal interval $[l..r]$ containing $k$ such that $\mathsf{LCP}[j] \ge h$ for any $j \in [l+1..r]$, i.e., the ranks of the p-encoded suffixes whose $\mathsf{lcp}_\infty$-value with the p-encoded suffix of rank $k$ is at least $h$.
Lemma 12.
Algorithm 2 correctly returns .
Proof.
Let for any , and for any . Also let . Although Algorithm 2 does not intend to compute the exact value of , it checks if falls in in decreasing order of starting from . One of the necessary conditions to have is that , or equivalently . Line 2 computes the maximal interval that represents the ranks of the p-encoded suffixes having an -value larger than or equal to . The basic idea is to find a p-encoded suffix in that comes closest to when extended by adding its preceding symbol. When comes down to the point with , is returned in one of the if-then-blocks at Lines 2, 2, 2 and 2.
If for an integer , there are two possible scenarios:
1. and either or is , and
2. and both and are at least .
The former p-encoded suffix is processed in one of the if-then-blocks at Lines 2 and 2 when , while the latter at Lines 2, 2, 2 and 2 when . Note that the former is never farther from than the latter because the lexicographic order between and is determined by and at in the former case, while it is by and something smaller than in the latter case. Since Algorithm 2 processes the former case first, it guarantees that the algorithm finds the closer one first.
The case with is treated differently than other cases in the if-then-block at Line 2 since is the unique position where turns into . For a p-encoded suffix , having is necessary and sufficient for its extended suffix to have an -value larger than with . By Corollary 6, p-encoded suffixes satisfying this condition must preserve their lexicographic order after extension, and hence, it is enough to search for the closest one ( or ) to and compute the rank of its extended suffix by . If Lines 2 and 2 fail to return a value, it means that . The if-block at Line 2 checks if there exists a p-encoded suffix that satisfies the former condition to be . It is enough to find one with for because it is necessary and sufficient to have and . Note that there could be two or more p-encoded suffixes that satisfy the condition and their lexicographic order may change by extension. In the then-block at Line 2, the algorithm computes the rank of the lexicographically smallest p-encoded suffix that has an -value larger than with , which is the p-succ of in this case.
The case with is processed in the else-block at Line 2. Here, it is good to keep in mind that when we enter this else-block, or holds for any proper suffix of , since otherwise must have been reported in a previous round of the foreach loop.
When the if-condition at Line 2 holds, is the lexicographically smaller closest p-encoded suffix to such that , or equivalently . For any p-encoded suffix in its extended suffix is lexicographically smaller than due to Corollary 7, and never closer to than . At Line 2, the algorithm computes the maximal interval such that every p-encoded suffix in shares the common prefix of length with . Since any has an -value smaller than with , it follows from Lemma 4 that . Also, for any , the aforementioned precondition to enter the else-block at Line 2 leads to or by Lemma 4. So far we have confirmed that the -value between and the p-pred of is less than , which implies that the p-pred is the largest p-encoded suffix that is prefixed by . If computed at Line 2 is in , is prefixed by and the p-pred is the largest p-encoded suffix that is prefixed by , which can be computed by because contains exactly ’s. If , the p-pred is the largest p-encoded suffix that is prefixed by , which is .
When the if-condition at Line 2 holds, is the lexicographically larger closest p-encoded suffix to such that , or equivalently . For any p-encoded suffix in its extended suffix is lexicographically smaller than due to Corollary 7, and never closer to than . At Line 2, the algorithm computes the maximal interval such that every p-encoded suffix in shares the common prefix of length with . Since any has an -value smaller than with , it follows from Lemma 4 that . Also, for any , the aforementioned precondition to enter the else-block at Line 2 leads to by Lemma 4. So far we have confirmed that the -value between and the p-succ of is less than , which implies that the p-succ is the smallest p-encoded suffix that is prefixed by . If computed at Line 2 is in , the p-succ is the smallest p-encoded suffix that is prefixed by , which is . If , the p-succ is the smallest p-encoded suffix that is prefixed by , which can be computed by because contains exactly ’s.
When we enter the if-then-block at Line 2, it is guaranteed that . In order to check if there exists a p-encoded suffix that satisfies the former condition to be , the algorithm computes . If , is the lexicographically largest p-encoded suffix that satisfies the condition, and by Corollary 6, its extended suffix must be the largest one to have . Therefore, is the p-pred of , and . ∎
3.2 How to maintain
Suppose that we have $F$, $L$, $\mathsf{LCP}$ and the rank $k$ of $\mathsf{pe}(cT)$. We show how to compute the $\mathsf{lcp}_\infty$-values of $\mathsf{pe}(cT)$ with its p-pred and its p-succ in order to maintain $\mathsf{LCP}$.
3.3 Dynamic data structures and analysis
We consider constructing $F$, $L$ and $\mathsf{LCP}$ of a p-string $T$ of length $n$ over an alphabet $\Sigma_s \cup \Sigma_p$ online. Let $\sigma_s$ and $\sigma_p$ respectively be the numbers of distinct s-symbols and p-symbols that appear in $T$.
First we show the data structures needed to implement the algorithm presented in the previous subsections. For $F$ we maintain a dynamic string of Lemma 2 supporting random access, insertion, $\mathsf{rank}$ and $\mathsf{select}$ queries in $O(\log n / \log\log n)$ time and $n \log \sigma + o(n \log \sigma)$ bits of space. For $\mathsf{LCP}$, whose values are upper bounded by $\sigma_p$, we maintain a dynamic string of Lemma 3 to support random access, insertion, $\mathsf{RmQ}$, $\mathsf{FP}$ and $\mathsf{FN}$ queries in $O(\log \sigma_p \log n / \log\log n)$ time and $n \log \sigma_p + o(n \log \sigma_p)$ bits of space.
If we built a dynamic string of Lemma 3 for $L$ directly, the query time would be $O(\log \sigma \log n / \log\log n)$. Since our algorithm does not use $\mathsf{RmQ}$, $\mathsf{FP}$ and $\mathsf{FN}$ queries with inequality predicates for s-symbols, we can reduce the query time to $O(\log \sigma_p \log n / \log\log n)$ as follows. We represent $L$ with one level of a wavelet tree, where a bit vector partitions the alphabet into $\Sigma_s$ and $\mathcal{N}_\infty$ and thus has pointers to two strings storing, respectively, the subsequence of $L$ over $\Sigma_s$ and that over $\mathcal{N}_\infty$. We represent the former and the latter by the data structures described in Lemmas 2 and 3, respectively, since we only need the aforementioned queries such as $\mathsf{RmQ}$ on the latter. Then, queries on $L$ can be answered in $O(\log \sigma_p \log n / \log\log n)$ time using $n \log \sigma + o(n \log \sigma)$ bits of space. A sketch of this two-part representation is given below.
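A sketch of this two-part representation follows: a bit vector records, for each position of $L$, whether the entry lies in $\Sigma_s$ or in $\mathcal{N}_\infty$, and the two subsequences are stored separately (in the efficient version, by the structures of Lemmas 2 and 3, respectively). Here plain lists are used, and only access and a range-minimum over the $\mathcal{N}_\infty$-entries are shown.

```python
class SplitL:
    def __init__(self, L):
        # entries that are not strings are the N_infinity entries (numbers or INF)
        self.is_p = [not isinstance(x, str) for x in L]
        self.s_part = [x for x in L if isinstance(x, str)]
        self.p_part = [x for x in L if not isinstance(x, str)]

    def access(self, i):
        r = sum(self.is_p[:i + 1])                 # N_infinity entries among L[0..i]
        return self.p_part[r - 1] if self.is_p[i] else self.s_part[i - r]

    def rmq_p(self, i, j):
        # range minimum over the N_infinity entries of L[i..j]; s-entries are skipped
        ri = sum(self.is_p[:i])
        rj = sum(self.is_p[:j + 1])
        return min(self.p_part[ri:rj]) if rj > ri else None
```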
In addition to these dynamic strings for $F$, $L$ and $\mathsf{LCP}$, we consider another dynamic string $Y$. Let $Y$ be the string obtained by extracting the leftmost occurrence of every p-symbol in $T$, arranged in the order of those leftmost occurrences. Thus, $|Y| = \sigma_p$. A dynamic string of Lemma 2 for $Y$ enables us to compute $\mathsf{fe}(xT)$ for a p-symbol $x$ by $\mathsf{select}_x(Y, 1)$ (interpreting a non-existing answer as $\infty$) in $O(\log \sigma_p / \log\log \sigma_p)$ time and $\sigma_p \log \sigma_p + o(\sigma_p \log \sigma_p)$ bits of space.
We also maintain a fusion tree of [25], in $O(\sigma_s \log n)$ bits, for the set of s-symbols used in the current $T$, which enables us to compute the s-symbol $c'$ of Lemma 9 in $O(\log_w \sigma_s)$ time.
We are now ready to prove the following lemma.
Lemma 13.
, and for a p-string of length over an alphabet can be constructed online in time and bits of space, where is the number of distinct p-symbols used in the p-string.
Proof.
We maintain the dynamic data structures described in this subsection, within the stated space bounds, while prepending a symbol to the current p-string. For a single step of updating $T$ to $cT$ with $c \in \Sigma_s \cup \Sigma_p$, we compute the rank $k$ of $\mathsf{pe}(cT)$ as described in Subsection 3.1, and we obtain $F$ and $L$ for $cT$ by replacing the entry of $L$ at the rank of $\mathsf{pe}(T)$ by $\mathsf{fe}(cT)$ and inserting $\mathsf{fe}(cT)$ and $\$$ into the $k$-th position of $F$ and $L$, respectively. $\mathsf{LCP}$ is updated as described in Subsection 3.2.
If , the computation of based on Lemma 9 requires a constant number of queries. If , Algorithm 2 computes invoking queries, where and . The value can be seen as a potential held by the current string , which upper bounds the number of queries. The number of queries in a single step can be in the worst case when and are close to and respectively , but this will reduce the potential for later steps, which allows us to give an amortized analysis. Since a single step increases the potential at most 1 by Corollary 5, the total number of queries can be bounded by .
Since we invoke queries that take time each, the overall time complexity is . ∎
4 Extendable compact index for p-matching
In this section, we show that $F$, $L$ and $\mathsf{LCP}$ can serve as an index for p-matching.
First we show that we can support backward search, a core procedure of BWT-based indexes, with the data structures for $F$, $L$ and $\mathsf{LCP}$ described in Subsection 3.3. For any p-string $P$, let the $P$-interval be the maximal interval $[l..r]$ such that $\mathsf{pe}(T[\mathsf{pSA}[k]..])$ is prefixed by $\mathsf{pe}(P)$ for any $k \in [l..r]$. We show the next lemma for a single step of backward search, which computes the $cP$-interval from the $P$-interval.
Lemma 14.
Suppose that we have the data structures for $F$, $L$ and $\mathsf{LCP}$ described in Subsection 3.3. Given the $P$-interval and a symbol $c \in \Sigma_s \cup \Sigma_p$, we can compute the $cP$-interval with a constant number of queries on these data structures.
Proof.
We show that we can compute the $cP$-interval $[l'..r']$ from the $P$-interval $[l..r]$ using a constant number of queries supported on $F$, $L$ and $\mathsf{LCP}$, considering the following three cases.
- When $c$ is in $\Sigma_s$: A p-encoded suffix is prefixed by $\mathsf{pe}(cP) = c \cdot \mathsf{pe}(P)$ if and only if it starts with $c$ and removing its first character yields a p-encoded suffix prefixed by $\mathsf{pe}(P)$. In other words, $\mathsf{LF}(k)$ is in the $cP$-interval if and only if $k$ is in the $P$-interval and $L[k] = c$. Then it holds that $l' = \mathsf{LF}(\mathsf{FN}_{=c}(L, l))$ and $r' = \mathsf{LF}(\mathsf{FP}_{=c}(L, r))$ due to Corollary 6.
- When $c$ is a p-symbol that appears in $P$: Similarly to the previous case, $\mathsf{LF}(k)$ is in the $cP$-interval if and only if $k$ is in the $P$-interval and $L[k] = \mathsf{fe}(cP)$. Then it holds that $l' = \mathsf{LF}(\mathsf{FN}_{=e}(L, l))$ and $r' = \mathsf{LF}(\mathsf{FP}_{=e}(L, r))$ with $e = \mathsf{fe}(cP)$ due to Corollary 6.
- When $c$ is a p-symbol that does not appear in $P$: Let $\pi_P$ denote the number of distinct p-symbols in $P$. For $k$ in the $P$-interval, $L[k] \in \mathcal{N}_\infty$ and $L[k] > \pi_P$ are necessary and sufficient conditions for $\mathsf{LF}(k)$ to be in the $cP$-interval, so we can compute $r' - l' + 1$, the width of the $cP$-interval, by counting the number of positions $k \in [l..r]$ such that $L[k] \in \mathcal{N}_\infty$ with $L[k] > \pi_P$. This can be done with 2D range counting queries, which can also be supported with the wavelet tree of Lemma 3. Note, however, that $l'$ is not necessarily $\mathsf{LF}(\mathsf{FN}_{\ge \pi_P + 1}(L, l))$ because p-encoded suffixes with such $L$-values in $[l..r]$ do not necessarily preserve their lexicographic order when they are extended by one symbol to the left, which makes it non-straightforward to identify the position $l'$.
To tackle this problem, we consider and , and show that , where is the number of positions such that with . Observe that and by definition, and that if and only if and (see Fig. 2 for an illustration). Also, it holds that for any and with and because , and they fall into Case (B4) of Lemma 4. Similarly for any and with and , we have . Hence, holds.
This concludes the proof. ∎
Figure 2: Illustration for the case analysis in the proof of Lemma 14.
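For completeness, the $P$-interval itself can be characterized (and the backward search of Lemma 14 tested against it) with a brute-force counter that directly uses the definition: the ranks whose p-encoded suffixes start with $\mathsf{pe}(P)$, which form a contiguous interval. The sketch reuses the helpers defined earlier.

```python
def p_interval(T, P):
    pSA, pISA, F, L, LCP, LF = build_pbwt_arrays(T)
    target = pe(P)
    ranks = [k for k, i in enumerate(pSA) if pe(T[i:])[:len(P)] == target]
    return (min(ranks), max(ranks)) if ranks else None    # contiguous since suffixes are sorted

def count_p_matches(T, P):
    iv = p_interval(T, P)
    return 0 if iv is None else iv[1] - iv[0] + 1
```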
We are now ready to prove the main theorem:
Proof of Theorem 1.
If we only need counting queries, Lemmas 13 and 14 suffice: while we build $F$, $L$ and $\mathsf{LCP}$ online, we can compute the $P$-interval for a given pattern $P$ of length $m$ by applying Lemma 14 successively $m$ times, spending a constant number of queries per character of $P$.
Since $\{\mathsf{pSA}[k] \mid k \text{ is in the } P\text{-interval}\}$ is the set of occurrences of $P$ (in the sense of p-matching) in $T$, we consider how to access $\mathsf{pSA}[k]$ efficiently to support locating queries. As is common in BWT-based indexes, we employ a sampling technique (e.g., see [9]): for evenly spaced text positions we store their ranks so that, by applying the LF/FL-mapping to $k$ successively at most a number of times bounded by the sampling rate, we hit one of the sampled text positions. A minor remark is that since our online construction proceeds from right to left, it is convenient to start sampling from the right end of $T$ and store the distance to the right end instead of the text position counted from the left end of $T$.
During the online construction of the data structures for $F$, $L$ and $\mathsf{LCP}$, we additionally maintain a dynamic bit vector of length $n$ that marks the sampled positions and a dynamic integer string that stores the sampled values, both implemented with the dynamic strings of Lemma 2. In order to support the LF/FL-mapping efficiently, we also maintain an auxiliary dynamic string of Lemma 2. With the additional space usage, we can access $\mathsf{pSA}[k]$ by applying the LF/FL-mapping at most as many times as the sampling rate. This leads to the claimed time bound for locating queries. ∎
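The locating procedure can be summarized by the following sketch, which samples text positions from the right end at a hypothetical rate SAMPLE (position 0 is always sampled so that the walk never wraps around) and recovers each occurrence by walking with the LF-mapping until a sampled rank is reached. It reuses the brute-force helpers from the earlier sketches.

```python
SAMPLE = 4                      # hypothetical sampling rate

def locate(T, P):
    pSA, pISA, F, L, LCP, LF = build_pbwt_arrays(T)
    n = len(T)
    sampled = {pISA[i]: i for i in range(n) if i == 0 or (n - 1 - i) % SAMPLE == 0}
    iv = p_interval(T, P)
    if iv is None:
        return []
    occ = []
    for k in range(iv[0], iv[1] + 1):
        steps, kk = 0, k
        while kk not in sampled:
            kk = LF[kk]          # move to the suffix starting one position to the left
            steps += 1
        occ.append(sampled[kk] + steps)   # undo the leftward moves to recover pSA[k]
    return sorted(occ)
```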
Acknowledgements
This work was supported by JSPS KAKENHI Grant Numbers 19K20213 and 22K11907 (TI).
References
- [1] Brenda S. Baker. A theory of parameterized pattern matching: algorithms and applications. In S. Rao Kosaraju, David S. Johnson, and Alok Aggarwal, editors, Proc. 25th Annual ACM Symposium on Theory of Computing (STOC), pages 71–80. ACM, 1993.
- [2] Brenda S. Baker. Parameterized pattern matching: Algorithms and applications. Journal of Computer and System Sciences, 52(1):28–42, 1996.
- [3] Brenda S. Baker. Parameterized duplication in strings: Algorithms and an application to software maintenance. SIAM J. Comput., 26(5):1343–1362, 1997.
- [4] Richard Beal and Donald A. Adjeroh. p-suffix sorting as arithmetic coding. J. Discrete Algorithms, 16:151–169, 2012.
- [5] Francisco Claude, Gonzalo Navarro, and Alberto Ordóñez Pereira. The wavelet matrix: An efficient wavelet tree for large alphabets. Inf. Syst., 47:15–32, 2015.
- [6] Richard Cole and Ramesh Hariharan. Faster suffix tree construction with missing suffix links. SIAM J. Comput., 33(1):26–42, 2003.
- [7] Satoshi Deguchi, Fumihito Higashijima, Hideo Bannai, Shunsuke Inenaga, and Masayuki Takeda. Parameterized suffix arrays for binary strings. In Proc. Prague Stringology Conference (PSC) 2008, pages 84–94, 2008.
- [8] Diptarama, Takashi Katsura, Yuhei Otomo, Kazuyuki Narisawa, and Ayumi Shinohara. Position heaps for parameterized strings. In Juha Kärkkäinen, Jakub Radoszewski, and Wojciech Rytter, editors, Proc. 28th Annual Symposium on Combinatorial Pattern Matching (CPM) 2017, volume 78 of LIPIcs, pages 8:1–8:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.
- [9] Paolo Ferragina and Giovanni Manzini. Opportunistic data structures with applications. In FOCS, pages 390–398, 2000.
- [10] Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Right-to-left online construction of parameterized position heaps. In Jan Holub and Jan Zdárek, editors, Proc. Prague Stringology Conference (PSC) 2018, pages 91–102. Czech Technical University in Prague, Faculty of Information Technology, Department of Theoretical Computer Science, 2018.
- [11] Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Direct linear time construction of parameterized suffix and LCP arrays for constant alphabets. In Nieves R. Brisaboa and Simon J. Puglisi, editors, Proc. 26th International Symposium on String Processing and Information Retrieval (SPIRE) 2019, volume 11811 of Lecture Notes in Computer Science, pages 382–391. Springer, 2019.
- [12] Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. The parameterized position heap of a trie. In Pinar Heggernes, editor, Proc. 11th International Conference on Algorithms and Complexity (CIAC) 2019, volume 11485 of Lecture Notes in Computer Science, pages 237–248. Springer, 2019.
- [13] Noriki Fujisato, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. The parameterized suffix tray. In Tiziana Calamoneri and Federico Corò, editors, Proc. 12th International Conference on Algorithms and Complexity (CIAC) 2021, volume 12701 of Lecture Notes in Computer Science, pages 258–270. Springer, 2021.
- [14] Arnab Ganguly, Rahul Shah, and Sharma V. Thankachan. pBWT: Achieving succinct data structures for parameterized pattern matching and related problems. In Proc. 28th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) 2017, pages 397–407, 2017. doi:10.1137/1.9781611974782.25.
- [15] Arnab Ganguly, Rahul Shah, and Sharma V. Thankachan. Fully functional parameterized suffix trees in compact space. In Mikolaj Bojanczyk, Emanuela Merelli, and David P. Woodruff, editors, Proc. 49th International Colloquium on Automata, Languages, and Programming, (ICALP) 2022, volume 229 of LIPIcs, pages 65:1–65:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022.
- [16] Daiki Hashimoto, Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, and Ayumi Shinohara. Computing the parameterized Burrows-Wheeler transform online. In Diego Arroyuelo and Barbara Poblete, editors, Proc. 29th International Symposium on String Processing and Information Retrieval (SPIRE) 2022, volume 13617 of Lecture Notes in Computer Science, pages 70–85. Springer, 2022. doi:10.1007/978-3-031-20643-6_6.
- [17] Tomohiro I, Satoshi Deguchi, Hideo Bannai, Shunsuke Inenaga, and Masayuki Takeda. Lightweight parameterized suffix array construction. In Proc. 20th International Workshop on Combinatorial Algorithms (IWOCA) 2009, pages 312–323, 2009.
- [18] Sung-Hwan Kim and Hwan-Gue Cho. Simpler FM-index for parameterized string matching. Inf. Process. Lett., 165:106026, 2021. doi:10.1016/j.ipl.2020.106026.
- [19] S. Rao Kosaraju. Faster algorithms for the construction of parameterized suffix trees (preliminary version). In Proc. 36th Annual Symposium on Foundations of Computer Science (FOCS), pages 631–637. IEEE Computer Society, 1995.
- [20] Taehyung Lee, Joong Chae Na, and Kunsoo Park. On-line construction of parameterized suffix trees for large alphabets. Inf. Process. Lett., 111(5):201–207, 2011.
- [21] Juan Mendivelso, Sharma V. Thankachan, and Yoan J. Pinzón. A brief history of parameterized matching problems. Discret. Appl. Math., 274:103–115, 2020. doi:10.1016/j.dam.2018.07.017.
- [22] J. Ian Munro and Yakov Nekrich. Compressed data structures for dynamic sequences. In Proc. 23rd Annual European Symposium on Algorithms (ESA) 2015, pages 891–902, 2015.
- [23] Katsuhito Nakashima, Noriki Fujisato, Diptarama Hendrian, Yuto Nakashima, Ryo Yoshinaka, Shunsuke Inenaga, Hideo Bannai, Ayumi Shinohara, and Masayuki Takeda. Parameterized dawgs: Efficient constructions and bidirectional pattern searches. Theor. Comput. Sci., 933:21–42, 2022. doi:10.1016/j.tcs.2022.09.008.
- [24] Gonzalo Navarro. Wavelet trees for all. J. Discrete Algorithms, 25:2–20, 2014. doi:10.1016/j.jda.2013.07.004.
- [25] Mihai Patrascu and Mikkel Thorup. Dynamic integer sets with optimal rank, select, and predecessor search. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014, pages 166–175, 2014. doi:10.1109/FOCS.2014.26.