Lattice walks ending on a coordinate hyperplane avoiding backtracking and repeats
Abstract.
We work with lattice walks in using step set that finish with . We further impose conditions of avoiding backtracking (i.e. ) and avoiding consecutive steps (i.e. ) each possibly combined with restricting to the half-space . We find in all cases the generating functions for such walks are algebraic and give explicit formulas for them. We also find polynomial recurrences for their coefficients. From the generating functions we find the asymptotic enumeration of each family of walks considered. The enumeration in special cases includes central binomial coefficients and Catalan numbers as well as relations to enumeration of another family of walks previously studied for which we provide a bijection.
Key words and phrases:
Lattice walks, WZ theory, generating functions, formal languages, central binomial coefficients, Catalan numbers2010 Mathematics Subject Classification:
Primary 05A15; Secondary 05A16, 33F10, 68Q451. Introduction
Central binomial coefficients and Catalan numbers are two fundamental sequences in enumerative combinatorics as well as in other areas of mathematics. Both these numbers can be thought of as enumerating walks in with steps from that end at the origin. The Catalan number model has the additional restriction the walks must stay in the nonnegative integers and correspond to the height of usual Dyck paths. We generalize this situation by looking at walks in with steps from . The condition that the walks must end on the hyperplane defined by is imposed. For the generalized Catalan situation we also require the walks be confined to the halfspace with . We then look at such walks avoiding the backtracking pattern and the repeat pattern for . Here a walk is a sequence of vectors from and contains the pattern if and only if for . That is, our pattern avoidance means that no contiguous subsequence of a walk is equal to the pattern. When there are no nonempty walks ending at the origin avoiding the backtracking pattern. While for avoiding the repeat pattern there are the two walks of length namely and only the first of which is relevant the Catalan situation. However, for there are many such walks. Our aim is enumerating these walks.
Walks in integer lattices are a topic of interest in computer science, mathematics, physics, and statistics. Lattice path combinatorics can be approached from various perspectives (see [Kra15] for survey with focus on enumeration). Our focus is enumeration (i.e. finding recurrences, generating functions, and formulas). One such enumeration problem involving avoidance of backtracking we solve gives an interpretation of sequence A085363 in the OEIS [OEI] in terms of 2D lattices walks. This sequence was an original motivation, but we found the techniques readily generalized to arbitrary dimension as well as avoidance of consecutive steps. In addition to combinatorial arguments, we make use of formal language theory and its connection to generating functions. We also use automated methods for dealing with hypergeometric terms and finding polynomial recurrences (see [PWZ96] for more about these types of techniques).
Lattice path enumeration has a long and rich history, and recent attention has been given to avoiding patterns as we consider here. We saw for Dyck paths (i.e. ) that avoiding consecutive steps or backtracking is not interesting. However, the consecutive step patterns and are special cases of runs and . Study of Dyck paths avoiding runs of given lengths (as well as peaks and valleys avoiding certain heights) is conducted in [EZ] with automated computational methods and in [BDB] with formal grammar techniques. Avoiding a pattern in lattice paths was considered in [ABBG20] and later for multiple patterns in [AB20, ABR20]. These latter works make use of the so called vectorial kernel method which modifies the kernel method to allow for pattern avoidance. While the vectorial kernel method is an efficient and flexible way to access the generating functions of our walks, we want to focus in this work on binomial expressions for the coefficients; so, for our set of jumps we present a way to do so via an alternative approach based on context-free grammars and ad-hoc recurrences.
The remainder of the paper is organized as follows. In Section 2 we formally define the languages of walks we are interested in and show that their generating functions are algebraic (and hence the sequences of coefficients satisfy a recurrence with polynomial coefficients). Formulas and recurrences are found in Section 3, then generating functions and asymptotics are given in Section 4. Lastly, Section 5 concludes by looking at some open problems and related work. Also, there is the Appendix which provides Maple code used to prove some results from earlier in the article.
Acknowledgments
The author wishes to thank anonymous referees for their careful reading as well as for several thoughtful and helpful comments which have improved this paper.
2. The setting
For we consider lattice walks in that use steps from which start at the origin and end with . Let be the language over the alphabet consisting of such walks. Evidently any such walk must consist of steps for some . Let be the number of such walks with steps.
We are further interested in subsets of these walks with additional conditions. We consider the language of walks which avoid backtracking. That is, walks ending with with the additional constraint that a step cannot be directly followed by for any . We let be the number of walks avoiding backtracking with steps. Similarly, we let denote the language of walks avoiding consecutive repeats. That is, walks avoiding a step directly followed by for any . We let denote the number of walks avoiding consecutive repeats with steps.
Now let be the language consisting of walks starting at the origin for which at all times and finish with . We similarly let and denote the sublanguages which avoid backtracking and consecutive equal steps respectively. Again we let the sequences , , and count the number of walks of length in each of the languages , , and . Letting denote the language of walks avoiding the backtracking pattern and denote the language of walks avoiding consecutive repeats we have
as expressions of our languages of interest. It is worth noting that both and are regular languages.
Remark 2.1.
Here are a few remarks about notation (some of which has already been used in the Section 1). We will use to denote vectors (i.e. elements of our alphabet) and to denote words (i.e. sequences of vectors). For example, and is a word over . We may also use or to denote a element of ending in or respectively for We will also use the power notation to denote repeated instances of a vector or word. So if then . We may also put words to a power. For example, if and , then
In enumeration it is often advantageous and interesting to work with generating functions. We will consider the following generating functions
which are the ordinary generating functions for our sequences. Also for any , , and we let be the sublanguage of walks which start with . Furthermore, we let
be the generating function for these walks with a given first step.
We now review some facts on power series which can be found in [Sta80]. A formal power series is algebraic if it satisfies a nontrivial polynomial equation. A holonomic sequence is a sequence that satisfies a recurrence relation with polynomial coefficients. That is, a sequence such that
for some and polynomials not all of which are equal to zero. The sequence of coefficients of an algebraic formal power series is holonomic.
From a generating function one can find the asymptotics of its sequence of coefficients. Methods for obtaining these asymptotic expressions are well established and can be found in the text [FS09]. We write to mean that
which is the standard notation. Consider a generating function viewed as a complex analytic function and let be its unique singularity closest to the origin. Assume approaches as for some constant where is neither nor a negative integer. This is the situation that will be relevant to use, and under these assumptions
where denotes the Gamma function.
Let us now show using formal language and generating function theory that each of our generating functions are algebraic from which it follows each of the sequences is holonomic. Our goal will be to further understand these generating functions and find the recurrence relations that the sequences obey. The Chomsky–Schützenberger enumeration theorem [CS63] (see [KS86, Pan05] for proofs) says that the length generating function of an unambigous context free language is algebraic. A deterministic context free language is a language recognized by a deterministic push down automata (DPDA). Ginsburg and Greibach [GG66] have shown that a deterministic context free language is an unambiguous context free language and that when such a language is intersected with a regular language it remains a deterministic context free language.
Theorem 2.2.
Let be one of our languages, then is a deterministic context free language. Hence, is algebraic and its sequence of coefficients is holomonic.
Proof.
It suffices to prove that each of and is recognized by a DPDA. Both of the languages and consisting of walks which respectively avoid the backtracking pattern as well as which avoid the repeat pattern (but can end anywhere) are regular. Since , , , and it will follow that each of these will be deterministic context free languages. By the Chomsky–Schützenberger enumeration theorem all the generating functions will be algebraic. Therefore each of the sequences must be holonomic. DPDAs recognizing the languages and are shown in Figure 1 and Figure 2 respectively. ∎
Remark 2.3.
Theorem 2.2 essentially follows from the definition of our languages. Its purpose is to make rigorous the fact that generating functions are algebraic and that their sequences of coefficients are holonomic. It is worth noting that there are other techniques that can also be used to obtain the algebraicity of the generating functions (e.g. the vectorial kernel method). We also note the method above the flexible to other situations (e.g. intersecting with a different regular language coming from a pattern or patterns).
Now that we know our generating functions are algebraic we turn our attention to describing these generating functions and their coefficient as explicitly as possible. The values of and can be found without difficulty. We have that
(1) |
since any walk in is
where for coordinates in form any binary string and the last coordinate makes a balanced binary string. Hence, satisfies the recurrence
(2) |
of the form guaranteed from being holonomic. It follows that
(3) |
is the expression for the generating function. By similar arguments we find that
(4) |
where is the th Catalan number. Also,
(5) |
and
(6) |
give the recurrence and generating function for
3. Formulas and recurrences
In this section we find formulas for each of the sequences , , , and consisting of a sum of hypergeometric terms and as an evaluation of a hypergeometric function. Zeilberger’s algorithm [Zei90, Zei91] is then applied to obtain a recurrence.
3.1. Ending on a hyperplane
For a given integer , an integer composition of is an ordered sequence of positive integers which sum to . The notion of an integer composition will be essential to the proofs of the next two theorems. We will write to denote that is an integer composition of . The length of an integer composition is the number of elements in the sequence and is denoted by . For example, and are two distinct integer compositions of both of length . A known fact about integer compositions which we will use is that there are integer compositions of with length equal to .
Theorem 3.1.
For any and we have
which satisfies the recurrence
for with and
Proof.
Recall that counts the number of walks of length which avoid backtracking and end with . Consider the projection of such a walk onto the -axis. This projection can be encoded by a word with up steps and down steps corresponding to if step in the walk had the last coordinate or respectively. Let us assume the word starts with a . The case that starts with a is completely analogous. The lengths of the runs of ’s and ’s will give two integer compositions and respectively where when . The vector corresponding to each and has the last coordinate determined and then has either or choices for which it could have came from making the prefix of the first coordinates of the vector. Looking at only the ’s, we see that the with only choices are those in positions . Looking at only the ’s, we see that the ’s with only choices are those in positions . Thus we find that
Once we have expressed in this way as a sum over hypergeometric terms we may find the hypergeometric evaluation and recurrence using automated methods. In the appendix we show how this computation can be performed in Maple.
∎
Example 3.2.
For and along with the two integer compositions and we have lattice walks in the plane contributing to the count of . Projecting onto the -axis all such walks will be either
or
Here the superscripts indicate how many choices we have for each step as a -dimensional walk. We let
Accounting for the symmetry of reflection over the -axis and the -axis (i.e. exchanging and or exchanging and ) there are walks each starting with which are shown both as words and graphed in the plane in Figure 3.
Theorem 3.3.
For any and we have
which satisfies the recurrence
for with and
Proof.
Recall that counts the number of walks of length which avoid consecutive repeated steps and end with . Proceeding similarly to the proof of Theorem 3.1, we consider the projection of such a walk onto the -axis. Again this projection can be encoded by a word with up steps and down steps meaning the corresponding step in the walk had the last coordinate or respectively. Let us assume the word starts with a since the case that starts with a is completely analogous. The lengths of the runs of ’s and ’s will give two integer compositions and respectively where when . The vector corresponding to each and has the last coordinate determined and then has either or choices for which it could have came from making the prefix of the first coordinates. Looking at only the ’s, we see that the with choices are those in positions . Looking at only the ’s, we see that the ’s with choices are those in positions . So, it follows that
Notice the only difference with Theorem 3.1 is that exponents of and are changed. Once we have expressed in this way as a sum over hypergeometric terms we may find the hypergeometric evaluation and recurrence using automated methods. In the appendix we show how this computation can be performed in Maple. ∎
3.2. Ending on a hyperplane and restricted to a halfspace
Theorem 3.4.
For any and we have
which satisfies the recurrence
for with and .
Proof.
Recall that counts the number of walks of length which avoid backtracking and end with while staying the half space defined by . In the last coordinate we must have a Dyck path. We partition Dyck paths of semilength by number of peaks. The Narayana number
gives the number of Dyck paths of semilength with peaks (and hence valleys). For the first coordinates with have choices for each step except for the steps directly after a peak or valley for which we only have choices. Once we have expressed in this way as a sum over hypergeometric terms we may find the hypergeometric evaluation and recurrence using automated methods. In the appendix we show how this computation can be performed in Maple. ∎
Theorem 3.5.
For any and we have
which satisfies the recurrence
for with and .
Proof.
Recall that counts the number of walks of length which avoid the repeat pattern and end with while staying the half space defined by . In the last coordinate we must have a Dyck path. We again partition Dyck paths of semilength by number of peaks given by Narayana numbers
for each . For the first coordinates with have for only the steps directly after a peak or valley as well as for the first step. At all other steps we have choices for the first coordinates. Once we have expressed in this way as a sum over hypergeometric terms we may find the hypergeometric evaluation and recurrence using automated methods. In the appendix we show how this computation can be performed in Maple. ∎
Example 3.6.
For and consider walks contributing to projecting onto the Dyck path where and are up and down steps respectively and superscripts indicated the number of choices for each step. There are such paths because this Dyck path has peaks and valley. We let
and accounting for symmetry we may assume is the first step. This results in walks shown in Figure 4.
4. Generating functions and asymptotics
In this section we give a formula for each of the generating functions , , , and .
Lemma 4.1.
If and , then .
Proof.
Let be defined by
for each . It is clear is an involution, and hence a bijection. We have that if and only if . This is because if and only if and because the sum of the last coordinate is preserved. Now given any we can obtain any other by for any some sequence (i.e. the coordinates where and differ). It follows that . The lemma follows since
as all nonempty walks must start with some while for . ∎
Lemma 4.2.
If and , then .
Proof.
The proof is very similar to the proof of Lemma 4.1. The only essential difference is that a walk in can begin with any and hence we are also allowed the use the similarly defined . ∎
Lemma 4.3.
For any we have that and .
Proof.
Using Theorem 3.1, Theorem 3.3, Theorem 3.4, and Theorem 3.5 we see that satisfies the same recurrence as while satisfies the same recurrence as . It remains only to check initial conditions. We find that
and also
so the lemma is proven for and . In a similar way
and
which completes the proof of the lemma. ∎
Theorem 4.4.
For any we have
while for we have
and .
Proof.
For the result is easy as there are no nonempty walks avoiding backtracking ending at the origin while and are the only walks avoiding consecutive steps ending at the origin with only the former confined to the nonnegative integers. To establish the generating function identity for with we split the walk where it first returns to . We note any walk in is:
-
(i)
empty,
-
(ii)
or of the form such that and where is nonempty and does not begin with ,
-
(iii)
or of the form such that with and where does not begin with .
Making use of Lemma 4.1 it follows that satisfies
which is equivalent to quadratic equation
that we can solve. For using the quadratic formula and making the correct choice of sign we have
which completes the proof for after adding to each side. By Lemma 4.3 it follows that
which can be used to conclude the theorem for .
Now consider let’s consider walks in which we again split according to where they first return to . Such a walk must be:
-
(i)
empty,
-
(ii)
of the form such that , is nonempty where is the last coordinate of the first step of the walk, and which does not start with
-
(iii)
of the form such that with and which does not start with .
Letting while still letting we have that
where we have made use of Lemma 4.2. After rearranging we find
then solving for and substituting the formula we have previously found for we have
which derives the formula for . Last it remains to demonstrate the formula for and this can be readily done using Lemma 4.3. ∎
Corollary 4.5.
For any , we have
as asymptotics for our sequences.
Proof.
By Lemma 4.3 we need only compute the asymptotics for and . For we have the generating function
by Theorem 4.4 which has as its singularity closest to the origin. As we have that the generating function approaches
as a complex analytic function with . It then follows that
which simplifies to the desired formula since .
Now for we have the generating function
by Theorem 4.4 which also has as its singularity closest to the origin. As we have that the generating function approaches
as a complex analytic function with . It then follows that
which simplifies to the desired formula since . ∎
5. Conclusion
We now conclude with some possible directions for future work and discussion of related work.
5.1. Intersections of hyperplanes
Let us look at a natural generalization of the languages and . For We can consider the language of walks in with steps from which end with . So, we have that . We also take the language which is contained in with the additional condition that at all times during the walk. Similarly it is the case that . In Theorem 2.2 we saw that both and are an unambiguous CFLs which in turn guaranteed the sequences under consideration earlier were holonomic and their generating functions were algebraic. It turns out neither nor is a CFL for .
Proposition 5.1.
For any the languages and are not a CFLs.
Proof.
It will suffice to show that neither nor is a CFL. We will use the pumping lemma and the choice of string pumped will readily generalize to and for . Assume that either or is a CFL with pumping length . There is the walk
which cannot be pumped. Indeed any consecutive substring of length at most will contain one coordinate with all entries equal. Hence, after pumping this substring the walk will not end at the origin. For other values of and the above example can be used to choose the coordinates in positions and while the remaining coordinates can be filled as needed to make a valid walk. ∎
Letting denote the number of walks of length in we can see that
which satisfies that recurrence
so the sequence turns out to still be holomonic. A similar computation can be performed for with Catalan numbers in place of the central binomial coefficients. We can then look at avoiding patterns like backtracking or consecutive steps. For these languages we do not have a guarantee that the corresponding sequences are holomonic.
Question 5.2.
What can be said about enumeration for the analogous languages ending on an intersection of hyperplanes?
5.2. A Motzkin generalization and more patterns
A natural extension would be to look at other sets of steps. We outline one possibility related to Motzkin numbers. The Motzkin numbers enumerate walks in with steps from that end at the origin and are restricted to the nonnegative integers. Continuing in the spirit of what we have done, one could consider walks in with step set that end with along with combining half-space restrictions as well as pattern avoidance. Bu [Bu21] has used dynamic programming to attack enumeration of restricted Motzkin paths in a manner similar to the previously mentioned work on Dyck paths by Ekhad and Zeilberger [EZ]. As previously mentioned these works look at other patterns (e.g. longer runs of consecutive steps). One could look at the pattern of a longer run with either our current step set or another step set. Theorem 2.2 can be easily adapted to give an algebraic generating function for other patterns since the language of walks avoiding a pattern is regular.
Question 5.3.
What can be said about enumeration for the analogous languages using the Motzkin-like alphabet ?
5.3. Other related work
We finish by mentioning some other related work. The case of Theorem 3.1 agrees with A082298 in the OEIS [OEI] and provides a proof of a conjecture observed by David Scambler. For Theorem 3.4 and Theorem 3.5 correspond to A086871 and A082298 respectively in the OEIS [OEI]. Furthermore, when Theorem 3.4 is twice the formula in [Cok03, Theorem 2.2(b)] which enumerates lattice walks from to using steps which never go above the line . These walks are also enumerated by Woan in [Wj01] and are A059231 in the OEIS [OEI].
Let be the sublanguage of walks starting with the step . Also, let be the language of walks from to for any using steps from which never go below the -axis. The walks in are equinumerous with the walks enumerated by Cocker and by Woan via exchanging and . Since our walks have a symmetry with orbit size for all nonempty walks by the action of reflecting over the -axis, there is a bijection between and . Let us now define a map . For start by considering the decomposition into maximal runs of consecutive steps
for with which is unique and well-defined. We then set
where
for each .
Proposition 5.4.
The map is a bijection.
Proof.
If has steps, then is a path from to . It is enough to show is injective since we know the number of walks in with steps is the same as the number of paths in from to . Consider with
as decompositions into maximal runs. Assume the , then immediately we have because the paths and must have the same number of steps. Next it must be the case that since . The first steps of and are and so we have . This establishes the base case for induction. Assume that and for some . We will show this implies that and . The th steps of and are and . This means that and the -coordinate of and match. Since the walks avoid backtracking and we have decomposed them into maximal runs there is only one choice for the -coordinate once the -coordinate is fixed. As we have that . The proposition then follows by induction. ∎
Example 5.5.
References
- [AB20] Andrei Asinowski and Cyril Banderier. On Lattice Paths with Marked Patterns: Generating Functions and Multivariate Gaussian Distribution. In Michael Drmota and Clemens Heuberger, editors, 31st International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2020), volume 159 of Leibniz International Proceedings in Informatics (LIPIcs), pages 1:1–1:16, Dagstuhl, Germany, 2020. Schloss Dagstuhl–Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.AofA.2020.1.
- [ABBG20] Andrei Asinowski, Axel Bacher, Cyril Banderier, and Bernhard Gittenberger. Analytic combinatorics of lattice paths with forbidden patterns, the vectorial kernel method, and generating functions for pushdown automata. Algorithmica, 82(3):386–428, 2020. doi:10.1007/s00453-019-00623-3.
- [ABR20] Andrei Asinowski, Cyril Banderier, and Valerie Roitner. Generating functions for lattice paths with several forbidden patterns. Sém. Lothar. Combin., 84B:Art. 95, 12, 2020.
- [BDB] AJ Bu and Robert Dougherty-Bliss. Enumerating restricted Dyck paths with context-free grammars. arXiv:2009.09061. URL: https://arxiv.org/abs/2009.09061.
- [Bu21] AJ Bu. Automated counting of restricted Motzkin paths. Enumer. Combin. Appl., 1(2):Article #S2R12, 2021.
- [Cok03] Curtis Coker. Enumerating a class of lattice paths. Discrete Math., 271(1-3):13–28, 2003. doi:10.1016/S0012-365X(03)00037-2.
- [CS63] N. Chomsky and M. P. Schützenberger. The algebraic theory of context-free languages. In Computer programming and formal systems, pages 118–161. North-Holland, Amsterdam, 1963.
- [EZ] Shalosh B. Ekhad and Doron Zeilberger. Automatic counting of restricted Dyck paths via (numeric and symbolic) dynamic programming. arXiv:2006.01961. URL: https://arxiv.org/abs/2006.01961.
- [FS09] Philippe Flajolet and Robert Sedgewick. Analytic combinatorics. Cambridge University Press, Cambridge, 2009. doi:10.1017/CBO9780511801655.
- [GG66] Seymour Ginsburg and Sheila Greibach. Deterministic context free languages. Information and Control, 9(6):620 – 648, 1966. doi:https://doi.org/10.1016/S0019-9958(66)80019-0.
- [Kra15] Christian Krattenthaler. Lattice path enumeration. In Handbook of enumerative combinatorics, Discrete Math. Appl. (Boca Raton), pages 589–678. CRC Press, Boca Raton, FL, 2015.
- [KS86] Werner Kuich and Arto Salomaa. Semirings, automata, languages, volume 5 of EATCS Monographs on Theoretical Computer Science. Springer-Verlag, Berlin, 1986. doi:10.1007/978-3-642-69959-7.
- [Map] Maplesoft, a division of Waterloo Maple Inc.. Maple. URL: https://www.maplesoft.com/.
- [OEI] OEIS Foundation Inc. The On-Line Encyclopedia of Integer Sequences. URL: https://oeis.org.
- [Pan05] Alois Panholzer. Gröbner bases and the defining polynomial of a context-free grammar generating function. J. Autom. Lang. Comb., 10(1):79–97, 2005.
- [PWZ96] Marko Petkovšek, Herbert S. Wilf, and Doron Zeilberger. . With foreword by Donald E. Knuth. Wellesley, MA: A. K. Peters, 1996.
- [Sta80] R. P. Stanley. Differentiably finite power series. European J. Combin., 1(2):175–188, 1980. doi:10.1016/S0195-6698(80)80051-5.
- [Wj01] Woan Wen-jin. Diagonal lattice paths. Congr. Numerantium, 151:173–178, 2001.
- [Zei90] Doron Zeilberger. A fast algorithm for proving terminating hypergeometric identities. Discrete Math., 80(2):207–211, 1990. doi:10.1016/0012-365X(90)90120-7.
- [Zei91] Doron Zeilberger. The method of creative telescoping. J. Symbolic Comput., 11(3):195–204, 1991. doi:10.1016/S0747-7171(08)80044-2.
Appendix: Proofs with Maple code
Here in the Appendix we show how the functions sumrecursion and sumtohyper from the sumtools package in Maple [Map] can be used to compute the recurrences and hypergeometric evaluations in this paper. We first illustrate how to compute the recurrence relations in Theorem 3.1, Theorem 3.3, Theorem 3.4, and Theorem 3.5 using Maple to execute to Zeilberger’s algorithm [Zei90, Zei91]. Running the following in Maple
will return is the relevant parts of Theorem 3.1, Theorem 3.3, Theorem 3.4, and Theorem 3.5 giving the recurrences. Also running the following in Maple
verifies the hypergeometric evaluations.