This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Longest (Sub-)Periodic Subsequence

Hideo Bannai    Tomohiro I    Dominik Köppl
Abstract

We present an algorithm computing the longest periodic subsequence of a string of length nn in 𝒪(n7)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{7}) time with 𝒪(n4)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{4}) words of space. We obtain improvements when restricting the exponents or extending the search allowing the reported subsequence to be subperiodic down to 𝒪(n3)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{3}) time and 𝒪(n2)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}) words of space.

1 Introduction

A natural extension of the analysis of regularities such as squares or palindromes perceived as substrings of a given text is the study of the same type of regularities when considering subsequences. In this line of research, given a text of length nn, Kosowski [17] proposed an algorithm running in 𝒪(n2)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}) time using 𝒪(n)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n) words of space to find the longest subsequence that is a square. Inoue et al. [13] generalized this setting to consider the longest such subsequence common of two texts TT and SS of length nn, and gave an algorithm computing this sequence in 𝒪(n6)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{6}) time using 𝒪(n4)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{4}) space, also providing improvements in case that the number of matching characters pairs T[i]=S[j]T[i]=S[j] is rather small. Recently, Inoue et al. [14] provided similar improvements for the longest square subsequence of a single string. Here, we consider the problem for a single text, but allow the subsequence to have different exponents. In detail, we want to find the longest subsequence that is (sub-)periodic.

A non-exhaustive list of related problems are finding the longest palindromic subsequence [5, 12], absent subsequences [16], longest increasing and decreasing subsequences [19, 7], maximal common subsequences [18, 6], the longest run subsequence [20], the longest Lyndon subsequence [2], longest rollercoasters [9, 3, 8], and computing subsequence automaton [4, 1]. Our techniques rely on finding longest common subsequences, which is conceived as a well-studied problem (see [10, 11, 15] and the references therein).

2 Preliminaries

Let \mathbb{N} denote all natural numbers 1,2,1,2,\ldots, and +\mathbb{Q}^{+} the set of all rational numbers greater than or equal to 1. We distinguish integer intervals 1,,n=[1..n]1,\ldots,n=[1..n] and intervals of rational numbers [1/2,3/4][1/2,3/4].

Let Σ\Sigma denote a totally ordered set of symbols called the alphabet. An element of Σ\Sigma^{*} is called a string. Given a string SΣS\in\Sigma^{*}, we denote its length with |S||S| and its ii-th symbol with S[i]S[i] for i[1..|S|]i\in[1..|S|]. Further, we write S[i..j]=S[i]S[j]S[i..j]=S[i]\cdots S[j]. A factorization T=F1FzT=F_{1}\cdots F_{z} is a partitioning of TT into substrings F1,,FzF_{1},\ldots,F_{z}. A subsequence of a string SS with length \ell is a string S[i1]S[i]S[i_{1}]\cdots S[i_{\ell}] with i1<<ii_{1}<\ldots<i_{\ell}.

Given a string SS, we can write SS in the form S=UxUS=U^{x}U^{\prime} with UU^{\prime} being either empty or a proper prefix of UU. Then |U||U| is called a period of SS, and x+|U|/|U|+x+|U^{\prime}|/|U|\in\mathbb{Q}^{+} is called its exponent with respect to the period |U||U|. For the largest possible such exponent xx, SS is called periodic if x2x\geq 2, or sub-periodic if x(1,2)x\in(1,2). For instance, the unary string T=aaT=\texttt{a}\cdots\texttt{a} has the minimum period 11 with exponent |T||T|, or more generally, period p[1..|T|]p\in[1..|T|] with exponent |T|/p|T|/p.

Further, for two strings YY and ZZ, let LCSY,Z(y,z){\textrm{LCS}}_{Y,Z}({y,z}) denote the longest common subsequence of Y[1..y]Y[1..y] and Z[1..z]Z[1..z]. We omit the strings in subscript if they are clear from the context. Also, we allow us to write LCS for an arbitrary number of strings; the number of strings is reflected by the arguments given to LCS. It is known that we can answer the longest common subsequence LCSX1,,Xk(x1,,xk){\textrm{LCS}}_{X_{1},\ldots,X_{k}}({x_{1},\ldots,x_{k}}) of kk strings X1[1..x1],,Xk[1..xk]X_{1}[1..x_{1}],\ldots,X_{k}[1..x_{k}] in constant time by building a table of size 𝒪(nk)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{k}) and filling its entries in 𝒪(knk)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(kn^{k}) time via dynamic programming.

Structure of the Paper

In what follows, we first show an algorithm (Sect. 3) computing the longest subsequence that is periodic or sub-periodic. Subsequently, we refine this algorithm to omit the sub-periodic subsequences by allowing more time and space in Sect. 4. Table 1 gives an overview of our obtained results. We observe that, if we are only interested in the longest subsequence having an exponent >1>1, we obtain a algorithm faster than those finding subsequences with more restricted exponents.

Table 1: Space and time complexities for finding periodic subsequences of specific exponents. ϵ\epsilon is a rational number with 0<ϵ<10<\epsilon<1. The exponent column means that a subsequence is considered only if it has at least one exponent within the domain given in the column.

exponent time space solution 22\mathbb{N} 𝒪(n2)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}) 𝒪(n)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n) [17] +ϵ\mathbb{N}+\epsilon 𝒪(n3)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{3}) 𝒪(n2)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}) Thm. 3.2 +((2,3][4,))\mathbb{Q}^{+}\cap((2,3]\cup[4,\infty)) 𝒪(n5)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{5}) 𝒪(n3)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{3}) Thm. 4.2 (3+ϵ)(3+\epsilon)\mathbb{N} 𝒪(n7)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{7}) 𝒪(n4)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{4}) Thm. 4.3

A key observation is that a longest (sub-)periodic subsequence SS is maximal, meaning that no occurrence T[i1]T[i2]T[i|S|]T[i_{1}]T[i_{2}]\cdots T[i_{|S|}] of SS in TT can be extended with a character in T[1..i11]T[1..i_{1}-1] or T[i|S|+1..]T[i_{|S|}+1..] to form a longer subsequence without breaking the periodicity.

3 Longest (Sub-)periodic Subsequence

We start with the search for the longest subsequence that is periodic or subperiodic, meaning that one of its exponents is in +=(1,2)(2,3)(3,4)\mathbb{Q}^{+}\setminus\mathbb{N}=(1,2)\cup(2,3)\cup(3,4)\cdots. The idea is to compute every possible factorization of T=YZT=YZ into two factors, and try to (a) prolong each square UUUU to UcUUcU for UcUc being a subsequence in YY, or (b) extend UUUU^{\prime} to UcUUcU^{\prime} with UU^{\prime} being a proper prefix of UU found in ZZ and UcUc a subsequence found in YY. For a fixed partition T=YZT=YZ, we define

D2[y,z]:=max{2LCSY,Z(y1,z)+1 if LCSY,Z(y1,z1)>0,D2[y1,z]+1 if D2[y1,z]>0,D2[y,z1],0,\displaystyle D_{2}[y,z]:=\max\begin{cases}2\cdot{\textrm{LCS}}_{Y,Z}({y-1,z})+1&\text{~{}if~{}}{\textrm{LCS}}_{Y,Z}({y-1,z-1})>0,\\ D_{2}[y-1,z]+1&\text{~{}if~{}}D_{2}[y-1,z]>0,\\ D_{2}[y,z-1],\\ 0,\end{cases} (1)

for y[1..|Y|]y\in[1..|Y|] and z[1..|Z|]z\in[1..|Z|] to be the longest subsequence of TT having an exponent in +\mathbb{Q}^{+}\setminus\mathbb{N} of the form UUUU^{\prime} with UU^{\prime} being a non-empty common subsequence of Y[1..y]Y[1..y] and Z[1..z]Z[1..z] and UU a subsequence of Y[1..y]Y[1..y], where D2[,0]:=D2[0,]:=0D_{2}[\cdot,0]:=D_{2}[0,\cdot]:=0. Note that we can add D2[y1,z1]+1 if D2[y1,z1]>0D_{2}[y-1,z-1]+1\text{~{}if~{}}D_{2}[y-1,z-1]>0 as another selectable value for the maximum determining D2[y,z]D_{2}[y,z], but this does not change the maximum, since the value of D2[y1,z1]+1D_{2}[y-1,z-1]+1 is used as an option for the maximum determining D2[y,z1]D_{2}[y,z-1], which is already an option for D2[y,z]D_{2}[y,z]. D2D_{2} has the following property:

ZZ YY a b c a a
Z[1]=Z[1]= a // aba abca abcaa abcaaa
Z[2]=Z[2]= c // aba abca acaac acaaac
Z[3]=Z[3]= a // aba abca acaac acaaaca
Figure 1: D2D_{2} of Sect. 3 for the text T=abcaaacaT=\texttt{abcaaaca} with the factorization T=YZT=YZ with Y=abcaaY=\texttt{abcaa} and Z=acaZ=\texttt{aca}.
Lemma 3.1.

D2[y,z]D_{2}[y,z] is the length of the longest subsequence of TT of the form UUUU^{\prime} with UU^{\prime} being a proper prefix of UU and a common subsequence of Y[1..y]Y[1..y] and Z[1..z]Z[1..z], and UU being a subsequence of Y[1..y]Y[1..y].

Proof.

Let UUUU^{\prime} be the longest subsequence having an exponent in +\mathbb{Q}^{+}\setminus\mathbb{N} in TT with Y[y1]Y[y|U|]Z[z1]Z[z|U|]=UUY[y_{1}]\cdots Y[y_{|U|}]Z[z_{1}]\cdots Z[z_{|U^{\prime}|}]=UU^{\prime}. We partition T=YZT=YZ such that UU and UU^{\prime} are subsequences in YY and ZZ, respectively. Since UU^{\prime} is a proper prefix of UU, LCSY,Z(y|U|,z|U|)|U|{\textrm{LCS}}_{Y,Z}({y_{|U^{\prime}|},z_{|U^{\prime}|}})\geq|U^{\prime}|. If LCSY,Z(y|U|,z|U|)>|U|{\textrm{LCS}}_{Y,Z}({y_{|U^{\prime}|},z_{|U^{\prime}|}})>|U^{\prime}|, then there is another common subsequence VV^{\prime} of YY and ZZ, and VU[|U|..]V^{\prime}U[|U^{\prime}|..] is a longer subsequence of YY than UU, so VU[|U|..]UV^{\prime}U[|U^{\prime}|..]U^{\prime} is a longer subsequence having an exponent in +\mathbb{Q}^{+}\setminus\mathbb{N} than UUUU^{\prime}, a contradiction. Therefore, LCSY,Z(y|U|,z|U|)=|U|{\textrm{LCS}}_{Y,Z}({y_{|U^{\prime}|},z_{|U^{\prime}|}})=|U^{\prime}|. Obviously, Y[y|U|..]=U[y|U|..]Y[y_{|U^{\prime}|}..]=U[y_{|U^{\prime}|}..], since otherwise we could extend UU further.

Finally, we show that the sequence Y[y1]Y[y|U|]Z[z1]Z[z|U|]Y[y_{1}]\cdots Y[y_{|U|}]Z[z_{1}]\cdots Z[z_{|U^{\prime}|}] is considered in the construction of D2D_{2}. Since y|U|+1y|U|+1|Y|y_{|U^{\prime}|}+1\leq y_{|U^{\prime}|+1}\leq|Y| (otherwise UU^{\prime} cannot be a proper prefix of UU), D2[y|U|+1,z|U|]2LCSY,Z(y|U|,z|U|)+1D_{2}[y_{|U^{\prime}|}+1,z_{|U^{\prime}|}]\geq 2\cdot{\textrm{LCS}}_{Y,Z}({y_{|U^{\prime}|},z_{|U^{\prime}|}})+1, and equality results from the fact that we otherwise have found a longer subsequence VVVV^{\prime} in Y[1..y|U|+1]Z[1..z|U|]Y[1..y_{|U^{\prime}|}+1]Z[1..z_{|U^{\prime}|}] that we can extend to a subsequence of YZYZ longer than UUUU^{\prime} by applying the second option in Eq. 1. The third option of Eq. 1 fills up D2D_{2} such that we obtain the length of UUUU^{\prime} in D2[|Y|,|Z|]D_{2}[|Y|,|Z|]. ∎

Theorem 3.2.

We can find the longest subsequence having an exponent in +\mathbb{Q}^{+}\setminus\mathbb{N} in 𝒪(n3)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{3}) time using 𝒪(n2)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}) space.

Proof.

For each of the nn different partitions T=YZT=YZ, we precompute a table answering LCSY,Z(y,z){\textrm{LCS}}_{Y,Z}({y,z}) in constant time. This table needs 𝒪(n2)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}) space and can be constructed in 𝒪(n2)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}) time. Next, we create a table D2D_{2} of size 𝒪(n2)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}), and fill each of its cells by Eq. 1 in constant time thanks to the precomputation step. In total, we need 𝒪(n2)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}) time per partition T=YZT=YZ, and therefore 𝒪(n3)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{3}) time for the entire computation. ∎

In the following we want to omit subsequences having exponents only in [1,2)[1,2).

4 Longest Periodic Subsequence

We now extend our ideas of the previous section to omit the sub-periodic subsequences, such that our algorithm always reports a subsequence with an exponent of at least 22. Our main idea is to generalize the factorization of TT from 22 to kk factors. Computing the LCS of these kk factors, we obtain all longest periodic subsequences having an exponent of length kk\ell with \ell\in\mathbb{N}. For k=2k=2, we can find all square subsequences, i.e., the longest common subsequence with an exponent of 2x2x for xx\in\mathbb{N}, similar to [17]. Like in Sect. 3, we can support exponent values +\ell\in\mathbb{Q}^{+} by stopping matching characters in the last factor. In general, the number of factors k=3k=3 is a good value if the exponent is not in (3,4)(3,4). With an exponent x(3,4)x\in(3,4), each factor starts capturing at the root of the repetition, i.e., if we match a subsequence S=UxUS=U^{\left\lfloor x\right\rfloor}U^{\prime}, then we start capturing UU in all factors simultaneously. However, the last factor has to capture |UU||UU^{\prime}| characters if x(3,4)x\in(3,4). Hence, we need to split this last factor such that we have four factors, i.e., we need k=4k=4 for x(3,4)x\in(3,4).

However, k=3k=3 suffices for x(2,3][4,)x\in(2,3]\cup[4,\infty). To see that, we let the first k1k-1 factor capture the subsequence Ux/k1U^{\left\lfloor x/k-1\right\rfloor}. The last factor captures UyU^{y} with y=xx/(k1)(k1)y=x-\left\lfloor x/(k-1)\right\rfloor(k-1), which works if yx/(k1)y\leq\left\lfloor x/(k-1)\right\rfloor, i.e., x3x/2x\leq 3\left\lfloor x/2\right\rfloor, which holds for x4x\geq 4. For x(2,3]x\in(2,3], each of the first factors captures UU, while the last factor captures UyU^{y} with y1y\leq 1.

4.1 Three Factors

There are (n2)=𝒪(n2){n\choose 2}=\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}) possible factorizations of the form T=XYZT=XYZ. Let us fix one such factorization T=XYZT=XYZ. For this factorization, we define the three-dimensional table D3[1..|X|,1..|Y|,1..|Z|]D_{3}[1..|X|,1..|Y|,1..|Z|] by

D3[x,y,z]:=max{3LCSX,Y,Z(x1,y1,z)+δxyD3[x1,y1,z]+δxyD3[x1,y,z]D3[x,y1,z]D3[x,y,z1]D_{3}[x,y,z]:=\max\begin{cases}3\cdot{\textrm{LCS}}_{X,Y,Z}({x-1,y-1,z})+\delta_{xy}\\ D_{3}[x-1,y-1,z]+\delta_{xy}\\ D_{3}[x-1,y,z]\\ D_{3}[x,y-1,z]\\ D_{3}[x,y,z-1]\\ \end{cases}

where δxy:=2\delta_{xy}:=2 if X[x]=Y[y]X[x]=Y[y] and 0 otherwise. Compared to D2D_{2}, the number options for a cell value have increased. The first option takes the longest common subsequence UU of X[1..x1]X[1..x-1], Y[1..y1]Y[1..y-1] and Z[1..z]Z[1..z], which induces the subsequence U3U^{3} of TT, and tries to prolong U3U^{3} to a subsequence (Uc)2U(Uc)^{2}U^{\prime} of TT with UU^{\prime} being a prefix of UU and c=X[x]=Y[y]c=X[x]=Y[y] (or gives U3U^{3} if X[x]Y[y]X[x]\not=Y[y]). The second option performs an extension of U2UU^{2}U^{\prime} to (Uc)2U(Uc)^{2}U^{\prime} with c=X[x]=Y[y]c=X[x]=Y[y]. The last options are similar to the standard LCS computation by copying a previously computed result on a mismatch of X[x]X[x] and Y[y]Y[y].

Lemma 4.1.

D3[x,y,z]D_{3}[x,y,z] is the longest subsequence U2UU^{2}U^{\prime} of TT with UU^{\prime} (possibly empty) prefix UU such that UU is a common subsequence of X[1..x]X[1..x], and Y[1..y]Y[1..y], and UU^{\prime} is a subsequence of Z[1..z]Z[1..z].

Proof.

The longest periodic subsequence with an exponent in 33\mathbb{N} is given by LCSX,Y,Z(|X|,|Y|,|Z|){\textrm{LCS}}_{X,Y,Z}({|X|,|Y|,|Z|}). In what follows, we show that we can also find the longest periodic subsequence with an exponent not divisible by three, i.e., the exponent is in I:=+((2,3][4,))3I:=\mathbb{Q}^{+}\cap((2,3]\cup[4,\infty))\setminus 3\mathbb{N}. This longest periodic subsequence has the form U2U=X[x1]X[x|U|]Y[y1]Y[y|U|]Z[1]Z[z|U|]U^{2}U^{\prime}=X[x_{1}]\cdots X[x_{|U|}]Y[y_{1}]\cdots Y[y_{|U|}]Z[1]\cdots Z[z_{|U^{\prime}|}] with UU^{\prime} being a (not necessarily proper) prefix of UU (see the aforementioned discussion of why this subsequence has to admit this form). Then UU^{\prime} is a common subsequence of XX, YY, and ZZ with LCSX,Y,Z(x|U|,y|U|,z|U|)|U|{\textrm{LCS}}_{X,Y,Z}({x_{|U^{\prime}|},y_{|U^{\prime}|},z_{|U^{\prime}|}})\geq|U^{\prime}|. Similar to the proof of Lemma 3.1, we can argue that LCSX,Y,Z(x|U|,y|U|,z|U|)=|U|{\textrm{LCS}}_{X,Y,Z}({x_{|U^{\prime}|},y_{|U^{\prime}|},z_{|U^{\prime}|}})=|U^{\prime}|, otherwise TT has a longer periodic subsequence with an exponent in II. Hence, the longest periodic subsequence with an exponent in II has the form V2UV^{2}U^{\prime} with UU^{\prime} being a prefix of VV. But then V[|U|..]V[|U^{\prime}|..] is the longest common subsequence of X[x|U|..]X[x_{|U^{\prime}|}..] and Y[y|U|..]Y[y_{|U^{\prime}|}..], computed in D3D_{3}. ∎

Theorem 4.2.

We can find the longest periodic subsequence with an exponent in +((2,3][4,))\mathbb{Q}^{+}\cap((2,3]\cup[4,\infty)) in 𝒪(n5)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{5}) time using 𝒪(n3)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{3}) space.

Proof.

For the claimed time and space complexities, let us fix a factorization T=XYZT=XYZ. We first pre-process LCS with a three-dimensional table taking 𝒪(n3)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{3}) space and 𝒪(n3)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{3}) time such that we can answer an LCS query in 𝒪(1)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(1) time. The table D3D_{3} also takes 𝒪(n3)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{3}) space, and each cell can be filled in constant time thanks to the pre-processing step. Finally, we compute the maximum value of D3D_{3} for each factorization T=XYZT=XYZ, which are 𝒪(n2)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}) many. Hence, we fill D3D_{3} 𝒪(n2)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{2}) times, which gives us the total time of 𝒪(n5)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{5}). ∎

4.2 Four Factors

Finally, we consider a factorization of size four to capture exponents in (3,4)(3,4). We have (n3)=𝒪(n3){n\choose 3}=\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{3}) possibilities to factorize TT into four factors W,X,Y,ZW,X,Y,Z. Let us fix a factorization T=WXYZT=WXYZ. We fill the 4-dimensional table D4[1..|W|,1..|X|,1..|Y|,1..|Z|]D_{4}[1..|W|,1..|X|,1..|Y|,1..|Z|] as follows:

D4[w,x,y,z]:=max{4LCSW,X,Y,Z(w1,x1,y1,z)+δwxyif δxyz>0D4[w1,x1,y1,z]+δwxyD4[w1,x,y,z]D4[w,x1,y,z]D4[w,x,y1,z]D4[w,x,y,z1]\displaystyle D_{4}[w,x,y,z]:=\max\begin{cases}4\cdot{\textrm{LCS}}_{W,X,Y,Z}({w-1,x-1,y-1,z})+\delta_{wxy}&\text{if~{}}\delta_{xyz}>0\\ D_{4}[w-1,x-1,y-1,z]+\delta_{wxy}\\ D_{4}[w-1,x,y,z]\\ D_{4}[w,x-1,y,z]\\ D_{4}[w,x,y-1,z]\\ D_{4}[w,x,y,z-1]\end{cases}

where δwxy:=3\delta_{wxy}:=3 if W[w]=X[x]=Y[y]W[w]=X[x]=Y[y] and 0 otherwise.

Theorem 4.3.

We can compute the longest periodic subsequence with an exponent x(3x,4x)+\in\bigcup_{x\in\mathbb{N}}(3x,4x)\cap\mathbb{Q}^{+} in 𝒪(n7)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{7}) time using 𝒪(n4)\mathop{}\mathopen{}\mathcal{O}\mathopen{}(n^{4}) space.

Proof.

We can show analogously to Lemma 4.1 that D4[w,x,y,z]D_{4}[w,x,y,z] stores the length of the longest string U3UU^{3}U^{\prime} for which UU^{\prime} is both a proper prefix of UU and a subsequence of Z[1..z]Z[1..z], while UU is a common subsequence of the three strings W[1..w]W[1..w], X[1..x]X[1..x], and Y[1..y]Y[1..y]. The rest can be proven analogously to Thm. 4.2 by adding an additional dimension. ∎

5 Open Problems

We are unaware of polynomial-time algorithms computing several other types of regularities when considering subsequences. For instance, we are not aware of an algorithm computing the longest sub-periodic subsequence. For that, we would need an efficient algorithm computing the longest (common) subsequence without a border. Then we could use the algorithm of Sect. 3 and compute this subsequence without a border instead of the (plain) LCS. Other problems are finding the longest (common) subsequence that is primitive (exponent in (1,)(1,\infty)\setminus\mathbb{N}), or the longest (common) subsequence that is non-primitive (exponent in {1}\mathbb{N}\setminus\{1\}).

References

  • Baeza-Yates [1991] R. A. Baeza-Yates. Searching subsequences. Theor. Comput. Sci., 78(2):363–376, 1991.
  • Bannai et al. [2022] H. Bannai, T. I, T. Kociumaka, D. Köppl, and S. J. Puglisi. Computing longest (common) Lyndon subsequences. CoRR, abs/2201.06773, 2022.
  • Biedl et al. [2019] T. C. Biedl, A. Biniaz, R. Cummings, A. Lubiw, F. Manea, D. Nowotka, and J. O. Shallit. Rollercoasters: Long sequences without short runs. SIAM J. Discret. Math., 33(2):845–861, 2019.
  • Bille et al. [2017] P. Bille, I. L. Gørtz, and F. R. Skjoldjensen. Subsequence automata with default transitions. J. Discrete Algorithms, 44:48–55, 2017.
  • Chowdhury et al. [2014] S. R. Chowdhury, M. M. Hasan, S. Iqbal, and M. S. Rahman. Computing a longest common palindromic subsequence. Fundam. Informaticae, 129(4):329–340, 2014.
  • Conte et al. [2019] A. Conte, R. Grossi, G. Punzi, and T. Uno. Polynomial-delay enumeration of maximal common subsequences. In Proc. SPIRE, volume 11811 of LNCS, pages 189–202, 2019.
  • Duraj [2020] L. Duraj. A sub-quadratic algorithm for the longest common increasing subsequence problem. In Proc. STACS, volume 154 of LIPIcs, pages 41:1–41:18, 2020.
  • Fujita et al. [2021] K. Fujita, Y. Nakashima, S. Inenaga, H. Bannai, and M. Takeda. Longest common rollercoasters. In Proc. SPIRE, volume 12944 of LNCS, pages 21–32, 2021.
  • Gawrychowski et al. [2019] P. Gawrychowski, F. Manea, and R. Serafin. Fast and longest rollercoasters. In Proc. STACS, volume 126 of LIPIcs, pages 30:1–30:17, 2019.
  • Hirschberg [1975] D. S. Hirschberg. A linear space algorithm for computing maximal common subsequences. Commun. ACM, 18(6):341–343, 1975.
  • Hirschberg [1977] D. S. Hirschberg. Algorithms for the longest common subsequence problem. J. ACM, 24(4):664–675, 1977.
  • Inenaga and Hyyrö [2018] S. Inenaga and H. Hyyrö. A hardness result and new algorithm for the longest common palindromic subsequence problem. Inf. Process. Lett., 129:11–15, 2018.
  • Inoue et al. [2018] T. Inoue, S. Inenaga, H. Hyyrö, H. Bannai, and M. Takeda. Computing longest common square subsequences. In Proc. CPM, volume 105 of LIPIcs, pages 15:1–15:13, 2018.
  • Inoue et al. [2020] T. Inoue, S. Inenaga, and H. Bannai. Longest square subsequence problem revisited. In Proc. SPIRE, volume 12303 of LNCS, pages 147–154, 2020.
  • Kiyomi et al. [2021] M. Kiyomi, T. Horiyama, and Y. Otachi. Longest common subsequence in sublinear space. Inf. Process. Lett., 168:106084, 2021.
  • Kosche et al. [2021] M. Kosche, T. Koß, F. Manea, and S. Siemer. Absent subsequences in words. In Proc. RP, volume 13035 of LNCS, pages 115–131, 2021.
  • Kosowski [2004] A. Kosowski. An efficient algorithm for the longest tandem scattered subsequence problem. In Proc. SPIRE, volume 3246 of LNCS, pages 93–100, 2004.
  • Sakai [2019] Y. Sakai. Maximal common subsequence algorithms. Theor. Comput. Sci., 793:132–139, 2019.
  • Schensted [1961] C. Schensted. Longest increasing and decreasing subsequences. Canadian Journal of Mathematics, 13:179–191, 1961.
  • Schrinner et al. [2020] S. Schrinner, M. Goel, M. Wulfert, P. Spohr, K. Schneeberger, and G. W. Klau. The longest run subsequence problem. In Proc. WABI, volume 172 of LIPIcs, pages 6:1–6:13, 2020.