This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Hitting times for second-order random walks

Dario Fasino, Arianna Tonetto
Department of Mathematics, Computer Science and Physics
University of Udine
33100, Udine (Italy)
[email protected] &Francesco Tudisco
School of Mathematics
Gran Sasso Science Institute
67100, L’Aquila (Italy)
[email protected]
Abstract

A second-order random walk on a graph or network is a random walk where transition probabilities depend not only on the present node but also on the previous one. A notable example is the non-backtracking random walk, where the walker is not allowed to revisit a node in one step. Second-order random walks can model physical diffusion phenomena in a more realistic way than traditional random walks and have been very successfully used in various network mining and machine learning settings. However, numerous questions are still open for this type of stochastic processes. In this work we extend well-known results concerning mean hitting and return times of standard random walks to the second-order case. In particular, we provide simple formulas that allow us to compute these numbers by solving suitable systems of linear equations. Moreover, by introducing the “pullback” first-order stochastic process of a second-order random walk, we provide second-order versions of the renowned Kac’s and random target lemmas.

Keywords

Second-order random walks; non-backtracking walks; hitting times; return times.

1 Introduction

Random walks on graphs are a key concept in network theory used to model various dynamics such as user navigation and epidemic spreading [29], to quantify node centrality [30, 31], or to reveal communities and core-periphery structures [9, 10, 33, 37]. The standard random walk on a graph considers a hypothetical walker that starting from node ii moves to a new node of the graph choosing it uniformly at random among the neighbors of ii. This defines a memoryless stochastic process which corresponds to a Markov chain on the nodes. However, the process described by a standard graph random walk is often localized and may fail to capture the underlying physical model as well as to fully exploit the global graph structure. For this reason, there is a growing interest in recent years towards so-called higher-order stochastic processes whose evolution may depend on past states, including stochastic processes modeled by transition probability tensors [5, 12] as well as non-local diffusion mappings [7, 11] and non-backtracking random walks [13].

The latter is one of the best known examples of second-order stochastic processes, where node transitions depend on both the current and the previous state spaces as the process is not allowed to backtrack, i.e. to immediately revisit the previous space in the next step. Non-backtracking random walks are often more appropriate than classical random walks to model real-world diffusive phenomena on networks where a message-passing or disease-spreading analogy is relevant. In particular, non-backtracking walks can avoid unwanted localization effects on the leading eigenvector of the adjacency matrix of a graph [17, 20, 24] and their mixing rate, in many cases, is faster than the one of classical random walks [27, 6]. Furthermore, many computations with non-backtracking walks have negligible overheads compared to standard approaches [1, 14].

The non-backtracking paradigm is at the basis of several second-order stochastic processes appearing in recent graph-theoretic, network mining and machine learning literature. For example, the mixing time of a clique-based version of the non-backtracking random walk on regular graphs is studied in [6]. Similarity measures based on second-order random walks are used in [38] to measure node proximity and improve classical walk-based link-prediction and clustering algorithms. Eigenvalues and eigenvectors of non-backtracking matrices have been shown to provide remarkable performances in community detection and in various centrality measurements [1, 20, 35]. The graph-embedding generated by the popular node2vec algorithm [15] is based on a flexible second-order random walk depending on two parameters: one used to control the likelihood of backtracking (i.e. immediately revisiting a node in the walk), the other used to bias the navigation towards or away from the immediately preceding nodes.

In this work we are concerned with hitting and return times of second-order random walks on graphs with possibly countably many vertices. Roughly speaking, a hitting time is the first time at which a stochastic process reaches a given subset of its state space, while a return time is the first time the process gets back to a given starting state. Hitting and return times are well understood for Markov chains, which include as a particular case the stochastic process that describes a standard random walk on a graph. In this case, the state space is given by the set of nodes and explicit formulas are available to compute mean hitting and return times by means of suitable systems linear equations which, for finite and connected graphs, have unique solutions. Moreover, these quantities have numerous applications to, for example, quantify node distances and similarities. For this reason, they are an important tool to evaluate network cohesion and node centrality and are at the basis of popular node classification and link prediction algorithms [22, 28].

In this paper, we extend several results concerning mean hitting and return times of standard first-order stochastic processes to the second-order setting. In particular, we show that also mean hitting times of second-order random walks can be obtained from a set of linear equations, which has potential applications in higher-order classification and link prediction methods [3, 4, 36]. Moreover, we prove that mean return times of second-order random walks coincide with standard return times of suitably defined random walks in the original graph. In particular, mean return times for non-backtracking random walks coincide with the usual mean return times in all finite undirected graphs.

The paper is organized as follows. In Section 2 and 3 we present the notations used in this work and some basic results about random walks and second-order random walks, respectively. In Section 4 we introduce mean hitting times and return times for second-order random walks and we present first results concerning possible ways to compute them numerically via the solution of systems of linear equations. Our main results are in Section 5. There, we analyze second-order mean hitting and return times under what we call “equilibrium assumption”, corresponding to the condition that the second-order random walk is at the steady state. In particular, we derive an algebraic relationship between the hitting time matrix for a second-order random walk and the one corresponding to a standard random walk on the directed line graph of the original graph. Moreover, we show second-order analogues of the Kac’s and the random target lemmas for standard Markov chains. Finally, in Section 6, we illustrate some of the results of this paper on some real-world examples.

2 Notations and basic results

An oriented graph (or network) is a pair 𝒢=(V,E)\mathcal{G}=(V,E), where VV is the vertex (or node) set and EE is the set of oriented edges, i.e., ordered pairs of nodes. We say that (i,j)E(i,j)\in E is an edge going from ii to jj. From here onward, letters i,j,ki,j,k denote generic nodes of 𝒢\mathcal{G}, while e,fe,f are used to denote generic edges of 𝒢\mathcal{G}. Furthermore,

  • if e=(i,j)Ee=(i,j)\in E then we set 𝚂(e)=i{\tt S}(e)=i and 𝚃(e)=j{\tt T}(e)=j, the source and terminal node of the edge ee;

  • for iVi\in V we denote by i={eE:𝚃(e)=i}\mathcal{I}_{i}=\{e\in E:{\tt T}(e)=i\} and 𝒪i={eE:𝚂(e)=i}\mathcal{O}_{i}=\{e\in E:{\tt S}(e)=i\} the in-neighborhood and out-neighborhood of ii, respectively. The in-degree and the out-degree of node ii are di=|𝒪i|d^{-}_{i}=|\mathcal{O}_{i}| and di+=|i|d^{+}_{i}=|\mathcal{I}_{i}|, respectively.

In the present work, 𝒢\mathcal{G} can have a countably infinite number of nodes. However, we will assume that 𝒢\mathcal{G} is locally finite, that is, VV is possibly infinite but each node has finite in- and out-degree.

A graph 𝒢=(V,E)\mathcal{G}=(V,E) can be completely described by its (possibly infinite dimensional) adjacency matrix A=(Aij)A=(A_{ij}), defined as

Aij={1 if (i,j)E;0 otherwise.A_{ij}=\begin{cases}1&\hbox{ if }(i,j)\in E;\\ 0&\hbox{ otherwise.}\end{cases}

If the adjacency matrix is symmetric then 𝒢\mathcal{G} is called undirected. Hence, in an undirected graph each edge (i,j)(i,j) has its own reciprocal edge (j,i)(j,i). In that case we have di=di+d^{-}_{i}=d^{+}_{i} and this common value is the degree of node ii, denoted as did_{i}.

A walk of length mm on 𝒢\mathcal{G} is a sequence v0,v1,,vmVv_{0},v_{1},\ldots,v_{m}\in V of nodes such that (vi1,vi)E(v_{i-1},v_{i})\in E for i=1,,mi=1,\ldots,m. If for all i,jVi,j\in V there exists a walk from ii to jj, then the graph 𝒢\mathcal{G} is called strongly connected.

Finally, the symbols \mathbb{P} and 𝔼\mathbb{E} denote the probability and the expectation, respectively. All vectors used in the text are column vectors, with possibly infinitely many entries. In particular, the symbol 𝟙\mathbbx{1} denotes the column vector with all entries equal to 1, the size of which should be clear from the context.

2.1 Random walks

In this section, we recall from [18, 32] some basic concepts and results in Markov chain theory. A random walk on 𝒢\mathcal{G} is a sequence of nodes v0,v1,,vk,v_{0},v_{1},\ldots,v_{k},\ldots where vk+1v_{k+1} is chosen at random between the out-neighbours of vkv_{k}, according to specified probabilities pi,j:=(vk+1=j|vk=i)p_{i,j}:=\mathbb{P}(v_{k+1}=j|v_{k}=i) such that pi,j=0p_{i,j}=0 if (i,j)E(i,j)\notin E and j𝒪ipi,j=1\sum_{j\in\mathcal{O}_{i}}p_{i,j}=1 for every iVi\in V. The last condition implies that di+1d^{+}_{i}\geq 1 for every iVi\in V, which will be assumed henceforth. A standard choice for pi,jp_{i,j} is pi,j=1/di+p_{i,j}=1/d^{+}_{i}, for j𝒪ij\in\mathcal{O}_{i}. In this case the random walk is called uniform. Our subsequent discussion is not restricted to this case.

A random walk can be viewed as a Markov chain {Yn,n}\{Y_{n},n\in\mathbb{N}\} with state space VV and transition matrix PP with entries Pij=pi,jP_{ij}=p_{i,j}. Clearly, PP is a row-stochastic matrix. The random variable YnY_{n} represents the state of the walker at time nn.

For a fixed subset SVS\subset V we call hitting time to SS the random variable TST_{S} defined as TS=min{n:YnS}T_{S}=\min\{n\in\mathbb{N}:Y_{n}\in S\}. In other words, TST_{S} is the first time the random walker reaches the subset SS, starting from a given initial state Y0Y_{0}. Based on TST_{S}, we can define the hitting probability φiS\varphi_{i\to S} from ii to SS as:

φiS=(TS<+|Y0=i),\varphi_{i\to S}=\mathbb{P}(T_{S}<+\infty|Y_{0}=i), (1)

i.e., the probability that, starting from node ii, the random walker ever hits SS. When φiS=1\varphi_{i\to S}=1, it is interesting to compute the average time taken by the random walker to reach SS, starting from ii. This quantity is given by

τiS=𝔼(TS|Y0=i)\tau_{i\to S}=\mathbb{E}(T_{S}|Y_{0}=i)

and is called the mean hitting time from ii to SS. It is well-known that the quantities φiS\varphi_{i\to S} and τiS\tau_{i\to S} can be obtained from the solution of certain linear systems, as shown by the next two theorems, whose proofs can be found in, e.g., [32].

Theorem 2.1.

For a graph 𝒢=(V,E)\mathcal{G}=(V,E) and for a subset SVS\subset V, the hitting probabilities {φiS:iV}\{\varphi_{i\to S}:i\in V\} are the minimal non-negative solution of the following linear system:

{φiS=1if iSφiS=jVPi,jφjSif iS.\begin{cases}\varphi_{i\to S}=1&\hbox{if $i\in S$}\\ \varphi_{i\to S}=\sum_{j\in V}P_{i,j}\varphi_{j\to S}&\hbox{if $i\notin S$.}\end{cases} (2)

In the previous theorem, minimality of the solution means that if {xi:iV}\{x_{i}:i\in V\} is another non-negative solution of (2) then φiSxi\varphi_{i\to S}\leq x_{i} for all iVi\in V.

Theorem 2.2.

For a graph 𝒢=(V,E)\mathcal{G}=(V,E) and for a subset SVS\subset V, suppose that φiS=1\varphi_{i\to S}=1 for all iVi\in V. Then the mean hitting times {τiS:iV}\{\tau_{i\to S}:i\in V\} are the minimal non-negative solution of the following linear system:

{τiS=0if iSτiS=1+jVPi,jτjSif iS.\begin{cases}\tau_{i\to S}=0&\hbox{if }i\in S\\ \tau_{i\to S}=1+\sum_{j\in V}P_{i,j}\tau_{j\to S}&\hbox{if }i\notin S.\end{cases} (3)

The case of finite graphs is the most usual in applications. In this case, a well-known result shows that for strongly connected graphs (3) has a unique solution, see e.g., Chap. VI of [18]. Precisely:

Theorem 2.3.

Let 𝒢=(V,E)\mathcal{G}=(V,E) be finite and strongly connected. For any given subset SVS\subset V, all the hitting probabilities to SS are equal to 11 and the linear system (3) has a unique solution.

It is interesting to note that Equation (3) can be written in compact matrix-vector form as follows. Let tSt^{S} and rSr^{S} be the vectors entrywise defined as follows:

tiS={0iSτiSiSriS={τiS+iS0iS,t^{S}_{i}=\begin{cases}0&i\in S\\ \tau_{i\to S}&i\notin S\end{cases}\qquad r^{S}_{i}=\begin{cases}\tau^{+}_{i\to S}&i\in S\\ 0&i\notin S,\end{cases} (4)

where the numbers τiS+\tau^{+}_{i\to S} are defined as

τiS+=1+jVPi,jτjS,(iS).\tau^{+}_{i\to S}=1+\sum_{j\in V}P_{i,j}\tau_{j\to S},\qquad(i\in S)\,. (5)

Then, the vector tSt^{S} solves the singular equation

(IP)x=𝟙𝕣𝕊.(I-P)x=\mathbbx{1}-r^{S}. (6)

In particular, among the infinitely many solutions xx of this equation, the vector tSt^{S} is the one characterized by the condition tiS=0t^{S}_{i}=0 for iSi\in S. The quantities (5) are called mean return times and represent the average number of steps required to return to SS after leaving it from iSi\in S. Indeed, if we consider the random variable TS+=min{n1:YnS}T^{+}_{S}=\min\{n\geq 1:Y_{n}\in S\}, then it holds τiS+=𝔼(TS+|Y0=i)\tau^{+}_{i\to S}=\mathbb{E}(T^{+}_{S}|Y_{0}=i). The special case when SS is a singleton deserves a special attention, and we adopt the simplified notation τi=𝔼(Ti+|Y0=i)\tau_{i}=\mathbb{E}(T^{+}_{i}|Y_{0}=i). This quantity is called mean return time to ii.

When 𝒢\mathcal{G} is finite and strongly connected, by means of the Perron–Frobenius theorem (see, e.g., [25]) we can conclude that the random walk {Yn,n}\{Y_{n},n\in\mathbb{N}\} admits a unique positive invariant distribution, i.e., a vector π>0\pi>0 such that πT𝟙=𝟙\pi^{T}\mathbbx{1}=1 and πTP=πT\pi^{T}P=\pi^{T}. More generally, when VV is countable, the same conclusion holds true if and only if the associated Markov chain is irreducible and positive recurrent, that is, τi\tau_{i} is finite for at least one iVi\in V, see e.g., Thm. 21.12 of [21] or Thm. 1.7.7 of [32]. The latter property also implies that all hitting probabilities (1) are equal to 11 and all hitting times are finite.

With the help of the invariant distribution we can obtain an explicit formula for the mean return time to a set SVS\subset V,

τS\displaystyle\tau_{S} =𝔼(TS+|Y0S)\displaystyle=\mathbb{E}(T^{+}_{S}|Y_{0}\in S)
=iS(Y0=i|Y0S)𝔼(TS+|Y0=i)=1jSπjiSπiτiS+.\displaystyle=\sum_{i\in S}\mathbb{P}(Y_{0}=i|Y_{0}\in S)\mathbb{E}(T^{+}_{S}|Y_{0}=i)=\frac{1}{\sum_{j\in S}\pi_{j}}\sum_{i\in S}\pi_{i}\tau^{+}_{i\to S}.

This number quantifies the average number of steps taken by a random walker initially placed in SS to hit again SS, given that its probability of being in node ii is proportional to πi\pi_{i}. Indeed, the number πi/jSπj\pi_{i}/\sum_{j\in S}\pi_{j} gives the conditional probability that the random walker is in ii given that it is in SS. The average return time τS\tau_{S} can be computed by means of Kac’s lemma, see e.g., [16] or [21, Lemma 21.13]:

Lemma 2.4.

Let π\pi be the stationary density of an irreducible Markov chain on the state space VV. Then for any SVS\subset V,

τS=1iSπi.\tau_{S}=\frac{1}{\sum_{i\in S}\pi_{i}}.

In particular, τi=1/πi\tau_{i}=1/\pi_{i} for every iVi\in V.

If 𝒢\mathcal{G} is finite then the mean hitting times fulfill the so-called random target lemma, see e.g., Lemma 10.1 of [21].

Lemma 2.5.

Let PP be irreducible, and let π\pi be its stationary density. Then, there exists a constant κ\kappa such that

jVπjτij=κ,\sum_{j\in V}\pi_{j}\tau_{i\to j}=\kappa,

independently on iVi\in V.

The number κ=κ(P)\kappa=\kappa(P) appearing in the preceding Lemma is the renowned Kemeny’s constant of PP. This constant quantifies the expected number of steps to get from node ii to a node jj, selected randomly according to the stationary density.

Introducing the matrix T=(τij)T=(\tau_{i\to j}), whose ijij-th entry is the mean hitting time from ii to jj, the claim of Lemma 2.5 can be compactly expressed via the identity Tπ=κ𝟙T\pi=\kappa\mathbbx{1}. Since the jj-th column of TT fulfills equation (6) with S={j}S=\{j\}, using Lemma 2.4 it is not difficult to verify that this matrix solves the equation

(IP)T=𝟙𝟙𝕋Diag(π)𝟙,(I-P)T=\mathbbx{1}\mathbbx{1}^{T}-\mathrm{Diag}(\pi)^{-1}, (7)

where Diag(π)\mathrm{Diag}(\pi) is the diagonal matrix with diagonal entries given by the entries of the vector π\pi. Actually, the matrix TT is the unique solution of (7) with zero diagonal entries.

We conclude this review on random walks with a technical lemma of independent interest and that will be useful for later. This result shows that, apart of a constant term, the vector tSt^{S} defined in (4) can be expressed by a convex linear combination of the vectors {ti,iS}\{t^{i},i\in S\}. Hence, mean access times to a subset can be computed from mean access times to the single elements of the set.

Lemma 2.6.

For any fixed subset SVS\subset V there exist positive coefficients β\beta and {αi}\{\alpha_{i}\}, for iSi\in S, such that iSαi=1\sum_{i\in S}\alpha_{i}=1, and

tS=iSαitiβ𝟙.t^{S}=\sum_{i\in S}\alpha_{i}t^{i}-\beta\mathbbx{1}.

In particular, τiS=jSαjτijβ\tau_{i\to S}=\sum_{j\in S}\alpha_{j}\tau_{i\to j}-\beta for every iVi\in V.

Proof.

Consider the vector a=(αi)iVa=(\alpha_{i})_{i\in V} with entries

αi={0iSπiτiS+iS.\alpha_{i}=\begin{cases}0&i\notin S\\ \pi_{i}\tau^{+}_{i\to S}&i\in S.\end{cases} (8)

By (6) we have

0=πT(IP)tS=πT(𝟙𝕣𝕊)=𝟙𝕚𝕊π𝕚τ𝕚𝕊+.0=\pi^{T}(I-P)t^{S}=\pi^{T}(\mathbbx{1}-r^{S})=1-\sum_{i\in S}\pi_{i}\tau^{+}_{i\to S}.

Hence iαi=iSπiτiS+=1\sum_{i}\alpha_{i}=\sum_{i\in S}\pi_{i}\tau^{+}_{i\to S}=1. Moreover, from (7) we derive

(IP)Ta=(𝟙𝟙𝕋Diag(π)𝟙)𝕒=𝟙(𝟙𝕋𝕒)𝕣𝕊=𝟙𝕣𝕊.(I-P)Ta=(\mathbbx{1}\mathbbx{1}^{T}-\mathrm{Diag}(\pi)^{-1})a=\mathbbx{1}(\mathbbx{1}^{T}a)-r^{S}=\mathbbx{1}-r^{S}.

This proves that the difference TatSTa-t^{S} belongs to the kernel of IPI-P, that is, iSαititS=β𝟙\sum_{i\in S}\alpha_{i}t^{i}-t^{S}=\beta\mathbbx{1} for some β\beta\in\mathbb{R}. Considering the kk-th entry of this identity for kSk\in S, we have

β=iSαiτkitkS=iSαiτki>0,\beta=\sum_{i\in S}\alpha_{i}\tau_{k\to i}-t^{S}_{k}=\sum_{i\in S}\alpha_{i}\tau_{k\to i}>0,

and the proof is complete. ∎

3 Second-order random walks

A second-order random walk on a graph 𝒢=(V,E)\mathcal{G}=(V,E) is a walk v0,v1,,vk,v_{0},v_{1},\ldots,v_{k},\ldots where the probability of choosing vk+1v_{k+1} depends not only on vkv_{k} but also on vk1v_{k-1}. More precisely, we describe this random walk by means of a stochastic process {Xn,n}\{X_{n},n\in\mathbb{N}\}, where XnVX_{n}\in V represents the node where the walker is located at time nn. Hence, there exists constant probabilities pi,j,kp_{i,j,k} such that, for n0n\geq 0

(Xn+2=k|Xn+1=j,Xn=i)=pi,j,k.\mathbb{P}(X_{n+2}=k|X_{n+1}=j,X_{n}=i)=p_{i,j,k}. (9)

We additionally assume that there are specific probabilities pi,jp^{\prime}_{i,j} for the first transition, that is,

(X1=j|X0=i)=pi,j(i,j)E.\mathbb{P}(X_{1}=j|X_{0}=i)=p^{\prime}_{i,j}\qquad(i,j)\in E. (10)

A second-order random walk is not a Markov chain because it does not fulfill the Markov condition, as we need to remember the previous step in order to take the next step. However, it can be turned into a Markov chain by changing the state space from the vertices of the graph to the directed edges. More precisely, let 𝒢^=(V^,E^)\widehat{\mathcal{G}}=(\widehat{V},\widehat{E}) be the (directed) line graph of 𝒢=(V,E)\mathcal{G}=(V,E), where V^=E\widehat{V}=E and (e,f)E^(e,f)\in\widehat{E} if and only if 𝚃(e)=𝚂(f){\tt T}(e)={\tt S}(f). Then, the second-order random walk can be seen as a (first-order) walk on 𝒢^\widehat{\mathcal{G}}, that is, a sequence of directed edges e0,e1,,ek,e_{0},e_{1},\ldots,e_{k},\ldots where v0=𝚂(e0)v_{0}={\tt S}(e_{0}) and vi=𝚂(ei)=𝚃(ei1)v_{i}={\tt S}(e_{i})={\tt T}(e_{i-1}), for n1n\geq 1. However, we must keep in mind that a walk of length mm in 𝒢\mathcal{G} corresponds to a walk of length m1m-1 in 𝒢^\widehat{\mathcal{G}}.

For n0n\geq 0 consider the random variable WnEW_{n}\in E defined as

Wn=(i,j)Xn+1=j,Xn=i.W_{n}=(i,j)\ \Longleftrightarrow\ X_{n+1}=j,\ X_{n}=i. (11)

On the basis of (9) and (10), the sequence {Wn}\{W_{n}\} is a Markov chain with initial distribution (W0=(i,j))=(X0=i)pi,j\mathbb{P}(W_{0}=(i,j))=\mathbb{P}(X_{0}=i)p^{\prime}_{i,j} and transition matrix P^\widehat{P} given by P^(i,j)(j,k)=pi,j,k\widehat{P}_{(i,j)(j,k)}=p_{i,j,k}. Hence, if (e,f)E^(e,f)\in\widehat{E} then P^ef0\widehat{P}_{ef}\geq 0 while P^ef=0\widehat{P}_{ef}=0 if (e,f)E^(e,f)\notin\widehat{E}. Note that we allow the case P^ef=0\widehat{P}_{ef}=0 for some pair (e,f)E^(e,f)\in\widehat{E}, which may occur in some special circumstances, notably the non-backtracking random walk on 𝒢\mathcal{G}, see Example 3.2 below.

Example 3.1.

The uniform random walk on 𝒢^\widehat{\mathcal{G}} is the Markov chain governed by the transition matrix P^\widehat{P},

P^ef={1/d𝚂(f)+𝚂(f)=𝚃(e)0otherwise.\widehat{P}_{ef}=\begin{cases}1/d^{+}_{{\tt S}(f)}&{\tt S}(f)={\tt T}(e)\\ 0&\hbox{otherwise.}\end{cases}

As P^ef\widehat{P}_{ef} do not depend on 𝚂(e){\tt S}(e), the corresponding stochastic process on 𝒢\mathcal{G} is the uniform random walk where Pij=(Xn+1=j|Xn=i)=1/di+P_{ij}=\mathbb{P}(X_{n+1}=j|X_{n}=i)=1/d^{+}_{i}. If 𝒢\mathcal{G} is undirected, then P^\widehat{P} is bi-stochastic.

Example 3.2.

The Hashimoto graph of 𝒢=(V,E)\mathcal{G}=(V,E) is the (directed) graph =(V,E)\mathcal{H}=(V_{\mathcal{H}},E_{\mathcal{H}}) where V=EV_{\mathcal{H}}=E and (e,f)E(e,f)\in E_{\mathcal{H}} if and only if both 𝚃(e)=𝚂(f){\tt T}(e)={\tt S}(f) and 𝚃(f)𝚂(e){\tt T}(f)\neq{\tt S}(e). A random walk on the Hashimoto graph is obviously related to a non-backtracking random walk on 𝒢\mathcal{G}. We can look at a random walk on \mathcal{H} as a weighted random walk on the directed line graph 𝒢^=(V^,E^)\widehat{\mathcal{G}}=(\widehat{V},\widehat{E}) by applying a null weight on the edges (e,f)E^(e,f)\in\widehat{E} such that 𝚃(f)=𝚂(e){\tt T}(f)={\tt S}(e). The corresponding transition matrix P^\widehat{P} is defined as follows. If e=(i,j)e=(i,j) and AA is the adjacency matrix of 𝒢\mathcal{G}, then

P^ef={1/(dj+Aji)𝚂(f)=j and i𝚃(f)0otherwise.\widehat{P}_{ef}=\begin{cases}1/(d^{+}_{j}-A_{ji})&{\tt S}(f)=j\hbox{ and }i\neq{\tt T}(f)\\ 0&\hbox{otherwise.}\end{cases} (12)

A directed edge e=(i,j)Ee=(i,j)\in E such that 𝒪j=(j,i)\mathcal{O}_{j}=(j,i) is called dangling. The stochasticity condition fEP^ef=1\sum_{f\in E}\widehat{P}_{ef}=1 implies that no edge in 𝒢\mathcal{G} is dangling, otherwise the out-degree of the corresponding node in \mathcal{H} would be zero. Hence, the non-backtracking random walk on 𝒢\mathcal{G} is well defined if (and only if) di+1d^{+}_{i}\geq 1 for iVi\in V and there are no dangling edges. Finally, we recall from e.g., [19] that, if 𝒢\mathcal{G} is undirected then P^\widehat{P} is bi-stochastic.

Example 3.3.

Let P^(1)\widehat{P}^{(1)} and P^(0)\widehat{P}^{(0)} be the stochastic matrices defined in the Example 3.1 and Example 3.2, respectively. For any α(0,1)\alpha\in(0,1) the matrix P^(α)=αP^(1)+(1α)P^(0)\widehat{P}^{(\alpha)}=\alpha\widehat{P}^{(1)}+(1-\alpha)\widehat{P}^{(0)} is stochastic (bi-stochastic, if 𝒢\mathcal{G} is undirected). The corresponding Markov chain describes random walks in 𝒢\mathcal{G} where backtracking steps are not completely eliminated, but rather the probability of performing one such step is downweighted by a factor depending on α\alpha. Thus, the corresponding second-order process can be called a backtrack-downweighted random walk. Combinatorial aspects of this kind of walks are addressed in, e.g., [2].

Remark 3.4.

The Markov chain {Wn}\{W_{n}\} defined by (11) may not be irreducible even if 𝒢\mathcal{G} is strongly connected. For example, if 𝒢\mathcal{G} is an undirected graph consisting of a cycle, then the Hashimoto graph of 𝒢\mathcal{G} is not connected, and the matrix P^\widehat{P} built as in Example 3.2 is reducible. However, the matrix P^(α)\widehat{P}^{(\alpha)} defined in Example 3.3 is irreducible in the sole assumption that 𝒢\mathcal{G} is strongly connected.

4 Second-order mean hitting times

In this section, we introduce the concept of mean hitting time for the second-order random walk {Xn,n}\{X_{n},n\in\mathbb{N}\} defined by (9) and (10) and we give results analogous to Theorems 2.1 and 2.2 for it. For each node kVk\in V, let T~k\widetilde{T}_{k} be the random variable counting the first time the process arrives at node kk,

T~k=min{n0:Xn=k}.\widetilde{T}_{k}=\min\{n\geq 0:X_{n}=k\}.

The aim of this section is to compute the corresponding mean second-order hitting time τ~ik=𝔼(T~k|X0=i)\widetilde{\tau}_{i\to k}=\mathbb{E}(\widetilde{T}_{k}|X_{0}=i), i.e., how many steps are required by a second-order random walk on average to reach kk, starting from ii. To this goal, we first compute the mean second-order hitting times to a node kk, given both the first and the second node,

τ~i,jk=𝔼(T~k|X1=j,X0=i),\widetilde{\tau}_{i,j\to k}=\mathbb{E}(\widetilde{T}_{k}|X_{1}=j,X_{0}=i)\,, (13)

and then we use these values to compute τ~ik\widetilde{\tau}_{i\to k} by means of the formula

τ~ik\displaystyle\widetilde{\tau}_{i\to k} =𝔼(T~k|X0=i)\displaystyle=\mathbb{E}(\widetilde{T}_{k}|X_{0}=i)
=jV(X1=j|X0=i)𝔼(T~k|X1=j,X0=i)=jVpijτ~i,jk.\displaystyle=\sum_{j\in V}\mathbb{P}(X_{1}=j|X_{0}=i)\mathbb{E}(\widetilde{T}_{k}|X_{1}=j,X_{0}=i)=\sum_{j\in V}p^{\prime}_{ij}\widetilde{\tau}_{i,j\to k}. (14)

The above formula shows that, unlike mean hitting times for Markov chains, second-order mean hitting times depend on the choice of the first transition probabilities, as defined in (10). In the sequel, we will see that there is a quite natural choice for these probabilities. On the other hand, the finiteness of the numbers τ~i,jk\widetilde{\tau}_{i,j\to k}, which is essential for the finiteness of second-order hitting times, is not influenced by the values of the probabilities pijp^{\prime}_{ij}. Indeed, in an infinite network it can happen that (T~k=+|X1=j,X0=i)>0\mathbb{P}(\widetilde{T}_{k}=+\infty|X_{1}=j,X_{0}=i)>0 for some (i,j)E(i,j)\in E and kVk\in V. For example, this happens when P^\widehat{P} is the transition matrix of the uniform random walk on 𝒢^\widehat{\mathcal{G}} and this chain is transient. In this case, τ~i,jk=+\widetilde{\tau}_{i,j\to k}=+\infty because τ~i,jk\widetilde{\tau}_{i,j\to k} is the expectation of a random variable that is equal to ++\infty with positive probability. For this reason, we first compute φi,jk=(T~k<+|X1=j,X0=i)\varphi_{i,j\to k}=\mathbb{P}(\widetilde{T}_{k}<+\infty|X_{1}=j,X_{0}=i), the probability that, starting with X0=iX_{0}=i and X1=jX_{1}=j, the random walker reaches kk in finite time. For these quantities, the following result holds:

Theorem 4.1.

For all kVk\in V, {φi,jk:(i,j)E}\{{\varphi}_{i,j\to k}:(i,j)\in E\} is the minimal non-negative solution of the following linear system:

{φi,jk=1if i=k or j=kφi,jk=Vpi,j,φj,kotherwise.\begin{cases}{\varphi}_{i,j\to k}=1&\hbox{if }i=k\text{ or }j=k\\ {\varphi}_{i,j\to k}=\sum_{\ell\in V}p_{i,j,\ell}{\varphi}_{j,\ell\to k}&\hbox{otherwise.}\end{cases} (15)
Proof.

We first show that {φi,jk:(i,j)E}\{{\varphi}_{i,j\to k}:(i,j)\in E\} solves the linear system (15). Let us consider an edge (i,j)E(i,j)\in E. If i=ki=k then X0=kX_{0}=k, so T~k=0\widetilde{T}_{k}=0; if j=kj=k then X1=kX_{1}=k and T~k=1\widetilde{T}_{k}=1. In both cases it immediately follows that φi,jk=(T~k<+|X1=j,X0=i)=1{\varphi}_{i,j\to k}=\mathbb{P}(\widetilde{T}_{k}<+\infty|X_{1}=j,X_{0}=i)=1. In the other cases, we have

φi,jk\displaystyle{\varphi}_{i,j\to k} =(T~k<+|X1=j,X0=i)\displaystyle=\mathbb{P}(\widetilde{T}_{k}<+\infty|X_{1}=j,X_{0}=i)
=V(X2=|X1=j,X0=i)(T~k<+|X2=,X1=j,X0=i)\displaystyle=\sum_{\ell\in V}\mathbb{P}(X_{2}=\ell|X_{1}=j,X_{0}=i)\mathbb{P}(\widetilde{T}_{k}<+\infty|X_{2}=\ell,X_{1}=j,X_{0}=i)
=Vpi,j,φj,k,\displaystyle=\sum_{\ell\in V}p_{i,j,\ell}{\varphi}_{j,\ell\to k},

that is, (15). Now let {xi,j:(i,j)E}\{x_{i,j}:(i,j)\in E\} be another non-negative solution of (15) for a fixed kk. We show that xi,jφi,jkx_{i,j}\geq{\varphi}_{i,j\to k} for all (i,j)E(i,j)\in E. Indeed, if i=ki=k or j=kj=k then xi,j=φi,jkx_{i,j}={\varphi}_{i,j\to k} and the claim follows. Otherwise, for i,jki,j\neq k we have xi,j=pi,j,k+kpi,j,xj,lx_{i,j}=p_{i,j,k}+\sum_{\ell\neq k}p_{i,j,\ell}x_{j,l}. Let 0=i\ell_{0}=i and 1=j\ell_{1}=j. For any n¯2\bar{n}\geq 2 we have

x0,1\displaystyle x_{\ell_{0},\ell_{1}} =p0,1,k+2kp0,1,2x1,2\displaystyle=p_{\ell_{0},\ell_{1},k}+\sum_{\ell_{2}\neq k}p_{\ell_{0},\ell_{1},\ell_{2}}x_{\ell_{1},\ell_{2}}
=p0,1,k+2kp0,1,2[p1,2,k+3kp1,2,3x2,3]\displaystyle=p_{\ell_{0},\ell_{1},k}+\sum_{\ell_{2}\neq k}p_{\ell_{0},\ell_{1},\ell_{2}}\bigg{[}p_{\ell_{1},\ell_{2},k}+\sum_{\ell_{3}\neq k}p_{\ell_{1},\ell_{2},\ell_{3}}x_{\ell_{2},\ell_{3}}\Big{]}
=\displaystyle=\ldots
=n=1n¯[2,,nkt=2npt2,t1,n]pn1,n,k+2,,n¯+1kt=2n¯+1pt2,t1,txt1t\displaystyle=\sum_{n=1}^{\bar{n}}\Bigg{[}\sum_{\ell_{2},\ldots,\ell_{n}\neq k}\prod_{t=2}^{n}p_{\ell_{t-2},\ell_{t-1},\ell_{n}}\Bigg{]}p_{\ell_{n-1},\ell_{n},k}+\sum_{\ell_{2},\ldots,\ell_{\bar{n}+1}\neq k}\prod_{t=2}^{\bar{n}+1}p_{\ell_{t-2},\ell_{t-1},\ell_{t}}x_{\ell_{t-1}\ell_{t}}
n=1n¯[2knkt=2npt2,t1,n]pn1,n,k,\displaystyle\geq\sum_{n=1}^{\bar{n}}\Bigg{[}\sum_{\ell_{2}\neq k}\cdots\sum_{\ell_{n}\neq k}\prod_{t=2}^{n}p_{\ell_{t-2},\ell_{t-1},\ell_{n}}\Bigg{]}p_{\ell_{n-1},\ell_{n},k},

where we used the inequality xt1t0x_{\ell_{t-1}\ell_{t}}\geq 0 and the empty sum (i.e., the square bracket in the n=1n=1 case) must be considered equal to 11. Observe that for n=1,2,3,n=1,2,3,\ldots

[2knkt=2npt2,t1,n]pn1,n,k=(T~k=n+1|X1=j,X0=i).\Bigg{[}\sum_{\ell_{2}\neq k}\cdots\sum_{\ell_{n}\neq k}\prod_{t=2}^{n}p_{\ell_{t-2},\ell_{t-1},\ell_{n}}\Bigg{]}p_{\ell_{n-1},\ell_{n},k}=\mathbb{P}(\widetilde{T}_{k}=n+1|X_{1}=j,X_{0}=i).

Hence we have xi,j(T~kn¯+1|X1=j,X0=i)x_{i,j}\geq\mathbb{P}(\widetilde{T}_{k}\leq\bar{n}+1|X_{1}=j,X_{0}=i). Finally,

xi,j\displaystyle x_{i,j} limn¯+(T~kn¯+1|X1=j,X0=i)\displaystyle\geq\lim_{\bar{n}\to+\infty}\mathbb{P}(\widetilde{T}_{k}\leq\bar{n}+1|X_{1}=j,X_{0}=i)
=(T~k<+|X1=j,X0=i)=φi,jk\displaystyle=\mathbb{P}(\widetilde{T}_{k}<+\infty|X_{1}=j,X_{0}=i)={\varphi}_{i,j\to k}

and the claim follows. ∎

On the basis of the preceding theorem we can compute the probability that the second-order process {Xn}\{X_{n}\} hits a node kVk\in V in a finite number of steps.

Corollary 4.2.

For any fixed kVk\in V define φik=(T~k<+|X0=i){\varphi}_{i\to k}=\mathbb{P}(\widetilde{T}_{k}<+\infty|X_{0}=i). Then, the vector (φik)iV({\varphi}_{i\to k})_{i\in V} is the vector (φi,jk)(i,j)E({\varphi}_{i,j\to k})_{(i,j)\in E} premultiplied by the matrix M|V|×|E|M\in\mathbb{R}^{|V|\times|E|} given by

Mie={pi,je=(i,j)E0else.M_{ie}=\begin{cases}p^{\prime}_{i,j}&e=(i,j)\in E\\ 0&\hbox{else.}\end{cases} (16)
Proof.

By elementary considerations, we have

φik\displaystyle{\varphi}_{i\to k} =(T~k<+|X0=i)\displaystyle=\mathbb{P}(\widetilde{T}_{k}<+\infty|X_{0}=i)
=jV(X1=j|X0=i)(T~k<+|X1=j,X0=i)=jVpi,jφi,jk.\displaystyle=\sum_{j\in V}\mathbb{P}(X_{1}=j|X_{0}=i)\mathbb{P}(\widetilde{T}_{k}<+\infty|X_{1}=j,X_{0}=i)=\sum_{j\in V}p^{\prime}_{i,j}{\varphi}_{i,j\to k}.

In a similar way, we can compute the second-order mean hitting times, as shown in the next theorem:

Theorem 4.3.

In the previous notation, if φi,jk=1{\varphi}_{i,j\to k}=1 for each (i,j)E(i,j)\in E, then {τ~i,jk:(i,j)E}\{\widetilde{\tau}_{i,j\to k}:(i,j)\in E\} is the minimal non-negative solution of the following linear system:

{τ~i,jk=0if i=kτ~i,jk=1if j=k and ikτ~i,jk=1+Vpi,j,τ~j,kif ik and jk.\begin{cases}\widetilde{\tau}_{i,j\to k}=0&\hbox{if }i=k\\ \widetilde{\tau}_{i,j\to k}=1&\hbox{if }j=k\hbox{ and }i\neq k\\ \widetilde{\tau}_{i,j\to k}=1+\sum_{\ell\in V}p_{i,j,\ell}\widetilde{\tau}_{j,\ell\to k}&\hbox{if }i\neq k\hbox{ and }j\neq k.\end{cases} (17)
Proof.

We first show that {τ~i,jk:(i,j)E}\{\widetilde{\tau}_{i,j\to k}:(i,j)\in E\} is a solution of the linear system (17). Let (i,j)E(i,j)\in E. If i=ki=k then (T~k=0|X1=j,X0=i)=1\mathbb{P}(\widetilde{T}_{k}=0|X_{1}=j,X_{0}=i)=1 and τ~i,jk=0\widetilde{\tau}_{i,j\to k}=0. If j=kj=k then (T~k=1|X1=j,X0=i)=1\mathbb{P}(\widetilde{T}_{k}=1|X_{1}=j,X_{0}=i)=1 and τ~i,jk=1\widetilde{\tau}_{i,j\to k}=1. Otherwise, if both iki\neq k and jkj\neq k then, by using the definition of expectation we obtain

τ~i,jk=𝔼(T~k|X1=j,X0=i)=2(T~k=2|X1=j,X0=i)+n3n(T~k=n|X1=j,X0=i).\begin{split}\widetilde{\tau}_{i,j\to k}&=\mathbb{E}(\widetilde{T}_{k}|X_{1}=j,X_{0}=i)\\ &=2\mathbb{P}(\widetilde{T}_{k}=2|X_{1}=j,X_{0}=i)+\sum_{n\geq 3}n\mathbb{P}(\widetilde{T}_{k}=n|X_{1}=j,X_{0}=i).\end{split}

Now we observe that (T~k=2|X1=j,X0=i)=(X2=k|X1=j,X0=i)=pi,j,k\mathbb{P}(\widetilde{T}_{k}=2|X_{1}=j,X_{0}=i)=\mathbb{P}(X_{2}=k|X_{1}=j,X_{0}=i)=p_{i,j,k}. Then τ~i,jk\widetilde{\tau}_{i,j\to k} is given by

τ~i,jk=\displaystyle\widetilde{\tau}_{i,j\to k}=  2pi,j,k+n3nk(X2=|X1=j,X0=i)(T~k=n|X2=,X1=j)\displaystyle\,2p_{i,j,k}+\sum_{n\geq 3}n\sum_{\ell\neq k}\mathbb{P}(X_{2}=\ell|X_{1}=j,X_{0}=i)\mathbb{P}(\widetilde{T}_{k}=n|X_{2}=\ell,X_{1}=j)
=\displaystyle=  2pi,j,k+kpi,j,n3n(T~k=n|X2=,X1=j)\displaystyle\,2p_{i,j,k}+\sum_{\ell\neq k}p_{i,j,\ell}\sum_{n\geq 3}n\,\mathbb{P}(\widetilde{T}_{k}=n|X_{2}=\ell,X_{1}=j)
=\displaystyle=  2pi,j,k+kpi,j,m2(m+1)(T~k=m|X1=,X0=j).\displaystyle\,2p_{i,j,k}+\sum_{\ell\neq k}p_{i,j,\ell}\sum_{m\geq 2}(m+1)\mathbb{P}(\widetilde{T}_{k}=m|X_{1}=\ell,X_{0}=j).

Finally, since m2m(T~k=m|X1=,X0=j)=τ~j,k\sum_{m\geq 2}m\mathbb{P}(\widetilde{T}_{k}=m|X_{1}=\ell,X_{0}=j)=\widetilde{\tau}_{j,\ell\to k} and m2(T~k=m|X1=,X0=j)=1\sum_{m\geq 2}\mathbb{P}(\widetilde{T}_{k}=m|X_{1}=\ell,X_{0}=j)=1 for k\ell\neq k, we have

τ~i,jk\displaystyle\widetilde{\tau}_{i,j\to k} =2pi,j,k+kpi,j,(τ~j,k+1)\displaystyle=2p_{i,j,k}+\sum_{\ell\neq k}p_{i,j,\ell}(\widetilde{\tau}_{j,\ell\to k}+1)
=2pi,j,k+kpi,j,τ~j,k+kpi,j,\displaystyle=2p_{i,j,k}+\sum_{\ell\neq k}p_{i,j,\ell}\widetilde{\tau}_{j,\ell\to k}+\sum_{\ell\neq k}p_{i,j,\ell}
=1+pi,j,k+kpi,j,τ~j,k=1+Vpi,j,τ~j,k,\displaystyle=1+p_{i,j,k}+\sum_{\ell\neq k}p_{i,j,\ell}\widetilde{\tau}_{j,\ell\to k}=1+\sum_{\ell\in V}p_{i,j,\ell}\widetilde{\tau}_{j,\ell\to k},

where in the last passages we have used the fact that Vpi,j,=1\sum_{\ell\in V}p_{i,j,\ell}=1.

Now let {yi,j:(i,j)E}\{y_{i,j}:(i,j)\in E\} be another non-negative solution of (17). We show that yi,jτ~i,jky_{i,j}\geq\widetilde{\tau}_{i,j\to k} for all (i,j)E(i,j)\in E. If i=ki=k or j=kj=k then yi,j=τ~i,jky_{i,j}=\widetilde{\tau}_{i,j\to k} and the claim follows.

For i,jki,j\neq k we have yi,j=1+pi,j,k+kpi,j,yj,ly_{i,j}=1+p_{i,j,k}+\sum_{\ell\neq k}p_{i,j,\ell}y_{j,l}. Now, let 0=i\ell_{0}=i and 1=j\ell_{1}=j. For any n¯2\bar{n}\geq 2 we have

y0,1\displaystyle y_{\ell_{0},\ell_{1}} =1+p0,1,k+2kp0,1,2y1,2\displaystyle=1+p_{\ell_{0},\ell_{1},k}+\sum_{\ell_{2}\neq k}p_{\ell_{0},\ell_{1},\ell_{2}}y_{\ell_{1},\ell_{2}}
=1+p0,1,k+2kp0,1,2[1+p1,2,k+3kp1,2,3y2,3]\displaystyle=1+p_{\ell_{0},\ell_{1},k}+\sum_{\ell_{2}\neq k}p_{\ell_{0},\ell_{1},\ell_{2}}\bigg{[}1+p_{\ell_{1},\ell_{2},k}+\sum_{\ell_{3}\neq k}p_{\ell_{1},\ell_{2},\ell_{3}}y_{\ell_{2},\ell_{3}}\Big{]}
=\displaystyle=\ldots
=n=1n¯[2,,nkt=2npt2,t1,n](1+pn1,n,k)+2,,n¯+1kt=2n¯+1pt2,t1,tyt1,t\displaystyle=\sum_{n=1}^{\bar{n}}\Bigg{[}\sum_{\ell_{2},\ldots,\ell_{n}\neq k}\prod_{t=2}^{n}p_{\ell_{t-2},\ell_{t-1},\ell_{n}}\Bigg{]}\big{(}1+p_{\ell_{n-1},\ell_{n},k}\big{)}+\sum_{\ell_{2},\ldots,\ell_{\bar{n}+1}\neq k}\prod_{t=2}^{\bar{n}+1}p_{\ell_{t-2},\ell_{t-1},\ell_{t}}y_{\ell_{t-1},\ell_{t}}
n=1n¯[2,,nkt=2npt2,t1,n](1+pn1,n,k),\displaystyle\geq\sum_{n=1}^{\bar{n}}\Bigg{[}\sum_{\ell_{2},\ldots,\ell_{n}\neq k}\prod_{t=2}^{n}p_{\ell_{t-2},\ell_{t-1},\ell_{n}}\Bigg{]}\big{(}1+p_{\ell_{n-1},\ell_{n},k}\big{)},

where the empty sum (i.e., the square bracket when n=1n=1) must be considered equal to 11 and the last inequality holds since yt1,t0y_{\ell_{t-1},\ell_{t}}\geq 0. Rearranging terms,

y0,1\displaystyle y_{\ell_{0},\ell_{1}} 1+2p0,1,2+n=3n¯2kn1knt=2npt2,t1,n\displaystyle\geq 1+\sum_{\ell_{2}}p_{\ell_{0},\ell_{1},\ell_{2}}+\sum_{n=3}^{\bar{n}}\sum_{\ell_{2}\neq k}\cdots\sum_{\ell_{n-1}\neq k}\sum_{\ell_{n}}\prod_{t=2}^{n}p_{\ell_{t-2},\ell_{t-1},\ell_{n}}
=n=1n¯(T~kn|X1=j,X0=i)\displaystyle=\sum_{n=1}^{\bar{n}}\mathbb{P}(\widetilde{T}_{k}\geq n|X_{1}=j,X_{0}=i)
n=1n¯n(T~k=n|X1=j,X0=i).\displaystyle\geq\sum_{n=1}^{\bar{n}}n\,\mathbb{P}(\widetilde{T}_{k}=n|X_{1}=j,X_{0}=i).

Finally, passing to the limit n¯\bar{n}\to\infty we obtain

yi,j\displaystyle y_{i,j} n=1n(T~k=n|X1=j,X0=i)=𝔼(T~k|X1=j,X0=i)=τ^i,jk,\displaystyle\geq\sum_{n=1}^{\infty}n\,\mathbb{P}(\widetilde{T}_{k}=n|X_{1}=j,X_{0}=i)=\mathbb{E}(\widetilde{T}_{k}|X_{1}=j,X_{0}=i)=\hat{\tau}_{i,j\to k},

and the proof is complete. ∎

Analogously to the argument of Corollary 4.2, we obtain the following characterization for the second-order hitting times directly from Equation (14).

Corollary 4.4.

For any fixed kVk\in V, the vector (τ~ik)iV(\widetilde{\tau}_{i\to k})_{i\in V} coincides with the vector (τ~i,jk)(i,j)E(\widetilde{\tau}_{i,j\to k})_{(i,j)\in E} pre-multiplied by the matrix MM in (16).

4.1 An alternative approach

The linear system in Theorem 4.3 allows us to compute second-order mean hitting times by means of the transition probabilities of the stochastic process {Xn,n}\{X_{n},n\in\mathbb{N}\}. The same quantities can be obtained using the fact that to any second-order random walk {Xn}\{X_{n}\} on 𝒢\mathcal{G}, corresponds a suitable random walk {Wn}\{W_{n}\} in 𝒢^\widehat{\mathcal{G}}, as in (11). The two walks have the same transition probability, but the second is shorter than the first by one step. By correcting this discrepancy, we can obtain mean hitting times for {Xn}\{X_{n}\} from those for {Wn}\{W_{n}\}.

To this end, for any fixed subset SES\subset E, consider the random variables

T^S=min{n0:WnS},T^S+=min{n1:WnS},\widehat{T}_{S}=\min\{n\geq 0:W_{n}\in S\},\qquad\widehat{T}^{+}_{S}=\min\{n\geq 1:W_{n}\in S\},

and the corresponding expectations

τ^eS=𝔼(TS|W0=e),τ^eS+=𝔼(TS+|W0=e).\widehat{\tau}_{e\to S}=\mathbb{E}(T_{S}|W_{0}=e),\qquad\widehat{\tau}^{+}_{e\to S}=\mathbb{E}(T^{+}_{S}|W_{0}=e). (18)

These quantities can be computed by means of the techniques recalled in Section 2.1, since {Wn}\{W_{n}\} is a classical Markov chain. The theorem below shows how mean hitting (18) for {Wn}\{W_{n}\} relate to the solution of the linear system in (17). It follows that the problem of computing second-order mean hitting times in 𝒢\mathcal{G} can be expressed as the problem of computing traditional mean hitting times in 𝒢^\widehat{\mathcal{G}}.

Let us consider a node kVk\in V. By Theorem 2.2, {τ^ek:eE}\{\hat{\tau}_{e\to\mathcal{I}_{k}}:e\in E\} is the minimal non-negative solution of the following linear system:

{τ^ek=0if ekτ^ek=1+fEP^efτ^fkif ek.\begin{cases}\hat{\tau}_{e\to\mathcal{I}_{k}}=0&\hbox{if }e\in\mathcal{I}_{k}\\ \hat{\tau}_{e\to\mathcal{I}_{k}}=1+\sum_{f\in E}\widehat{P}_{ef}\hat{\tau}_{f\to\mathcal{I}_{k}}&\hbox{if $e\notin\mathcal{I}_{k}$.}\end{cases} (19)
Theorem 4.5.

In the previous notation, it holds

τ~i,jk={0if i=kτ^(i,j)k+1if ik.\widetilde{\tau}_{i,j\to k}=\begin{cases}0&\hbox{if }i=k\\ \hat{\tau}_{(i,j)\to\mathcal{I}_{k}}+1&\hbox{if }i\neq k.\end{cases}
Proof.

We show that each non-negative solution of (19) corresponds to a non-negative solution of (17) and vice-versa. Let {zi,j:(i,j)E}\{z_{i,j}:(i,j)\in E\} be a non-negative solution of (19). For all (i,j)E(i,j)\in E, we define yi,jy_{i,j} as

yi,j={0if i=kzi,j+1if ik.y_{i,j}=\begin{cases}0&\hbox{if }i=k\\ z_{i,j}+1&\hbox{if }i\neq k.\end{cases}

Then {yi,j:(i,j)E}\{y_{i,j}:(i,j)\in E\} is a non-negative solution of (17). In fact, if j=kj=k but iki\neq k then yi,j=1y_{i,j}=1. On the other hand, when iki\neq k and jkj\neq k we have

yi,j=1+zi,j\displaystyle y_{i,j}=1+z_{i,j} =1+(1+VP^(i,j)(j,)zj,)\displaystyle=1+\Bigl{(}1+\sum_{\ell\in V}\widehat{P}_{(i,j)(j,\ell)}z_{j,\ell}\Bigr{)}
=1+VP^(i,j)(j,)(1+z,m)=1+Vpi,j,yj,.\displaystyle=1+\sum_{\ell\in V}\widehat{P}_{(i,j)(j,\ell)}(1+z_{\ell,m})=1+\sum_{\ell\in V}p_{i,j,\ell}\,y_{j,\ell}.

In the previous passages we used the fact that P^\widehat{P} is stochastic and P^(i,j)(,m)=0\widehat{P}_{(i,j)(\ell,m)}=0 if j\ell\neq j. Conversely, if {yi,j:(i,j)E}\{y_{i,j}:(i,j)\in E\} is a non-negative solution of (17), then yi,j1y_{i,j}\geq 1 if iki\neq k. Thus, if we put, for all (i,j)E(i,j)\in E,

zi,j={0if i=kyi,j1if ik,z_{i,j}=\begin{cases}0&\hbox{if }i=k\\ y_{i,j}-1&\hbox{if }i\neq k,\end{cases}

then {zi,j:(i,j)E}\{z_{i,j}:(i,j)\in E\} is a non-negative solution of (19). The thesis follows by considering the minimal non-negative solution of both systems. ∎

4.2 Second-order mean return times

In this section, we formulate the concept of return time for second-order random walks and we give an explicit formula for their computation in arbitrary networks, whenever the numbers τ~i,jk\widetilde{\tau}_{i,j\to k} can be computed from (17). For a set SVS\subset V, consider the random variable

T~S+=min{n1:XnS}.\widetilde{T}^{+}_{S}=\min\{n\geq 1:X_{n}\in S\}.

We are interested in computing the second-order mean return time to SS, namely, τ~S=𝔼(T~S+|X0S)\widetilde{\tau}_{S}=\mathbb{E}(\widetilde{T}^{+}_{S}|X_{0}\in S). This quantity represents the average number of steps that are required to reach again SS, starting from a node in SS and moving through the vertices of 𝒢\mathcal{G} according to a second-order random walk. For simplicity, in this section we limit ourselves to the single-node case S={i}S=\{i\}. We will go back to the general case in Section 5.1. Firstly, we observe that

τ~i=𝔼(T~i+|X0=i)=jV(X1=j|X0=i)𝔼(T~i+|X1=j,X0=i).\widetilde{\tau}_{i}=\mathbb{E}(\widetilde{T}^{+}_{i}|X_{0}=i)=\sum_{j\in V}\mathbb{P}(X_{1}=j|X_{0}=i)\mathbb{E}(\widetilde{T}^{+}_{i}|X_{1}=j,X_{0}=i).

Introducing the auxiliary notation τ~i,j=𝔼(T~i+|X1=j,X0=i)\widetilde{\tau}_{i,j}=\mathbb{E}(\widetilde{T}^{+}_{i}|X_{1}=j,X_{0}=i), we obtain

τ~i=jpi,jτ~i,j.\widetilde{\tau}_{i}=\sum_{j}p^{\prime}_{i,j}\widetilde{\tau}_{i,j}. (20)

The intuition behind this formula is that the random walker chooses at first an out-neighbour jj of ii at random and then follows a second-order random walk that ends with node ii. In fact, τ~i,j\widetilde{\tau}_{i,j} can be easily computed by using the vector {τ~i,jk:(i,j)E}\{\widetilde{\tau}_{i,j\to k}:(i,j)\in E\} obtained from (17). Indeed, it holds

τ~i,j\displaystyle\widetilde{\tau}_{i,j} =kV(X2=k|X1=j,X0=i)𝔼(T~i+|X2=k,X1=j,X0=i)\displaystyle=\sum_{k\in V}\mathbb{P}(X_{2}=k|X_{1}=j,X_{0}=i)\mathbb{E}(\widetilde{T}^{+}_{i}|X_{2}=k,X_{1}=j,X_{0}=i)
=kVpi,j,k(1+𝔼(T~i|X1=k,X0=j))\displaystyle=\sum_{k\in V}p_{i,j,k}\bigl{(}1+\mathbb{E}(\widetilde{T}_{i}|X_{1}=k,X_{0}=j)\bigr{)}
=1+kVpi,j,kτ~j,ki.\displaystyle=1+\sum_{k\in V}p_{i,j,k}\widetilde{\tau}_{j,k\to i}.

Placing the formula above into (20), yields

τ~i=1+j,kVpi,jpi,j,kτ~j,ki.\widetilde{\tau}_{i}=1+\sum_{j,k\in V}p^{\prime}_{i,j}p_{i,j,k}\widetilde{\tau}_{j,k\to i}.

Apparently, this formula does not admit a simple matrix-vector form as in Corollary 4.4. In the sequel, we propose to adopt a specific choice for the numbers pi,jp^{\prime}_{i,j} that allows us to obtain a result for any SVS\subset V analogous to Lemma 2.4.

5 The pullback of a second-order random walk

From now on, we suppose that the Markov chain associated to P^\widehat{P} is ergodic, so that there exists a (unique) vector π^>0\widehat{\pi}>0 with π^TP^=π^T\widehat{\pi}^{T}\widehat{P}=\widehat{\pi}^{T}. In this case, we can consider the additional assumption that the random walk {Wn}\{W_{n}\} on 𝒢^\widehat{\mathcal{G}} is at equilibrium, i.e., (Wn=e)=π^e\mathbb{P}(W_{n}=e)=\widehat{\pi}_{e}, for n0n\geq 0. Indeed, if the Markov chain {Wn}\{W_{n}\} is at the stationary state, then, owing to the correspondence between the events Xn+1=iX_{n+1}=i and WniW_{n}\in\mathcal{I}_{i} for n0n\geq 0, the analysis of the process {Xn}\{X_{n}\} is simplified and reveals new properties.

For every eEe\in E, define

λe=π^ef:𝚃(f)=𝚃(e)π^f.\lambda_{e}=\frac{\widehat{\pi}_{e}}{\sum_{f:{\tt T}(f)={\tt T}(e)}\widehat{\pi}_{f}}. (21)

The numbers λe\lambda_{e} appearing in (21) have a simple probabilistic interpretation, namely λe=(Wn=e|Wn𝚃(e))\lambda_{e}=\mathbb{P}(W_{n}=e|W_{n}\in\mathcal{I}_{{\tt T}(e)}) is the conditional probability that the random walker in 𝒢^\widehat{\mathcal{G}} is in ee, given that they are also in 𝚃(e)\mathcal{I}_{{\tt T}(e)}. Moreover, let L|V|×|E|L\in\mathbb{R}^{|V|\times|E|} be the “lifting” matrix defined as follows:

Lie={λe𝚃(e)=i0else,L_{ie}=\begin{cases}\lambda_{e}&{\tt T}(e)=i\\ 0&\hbox{else,}\end{cases} (22)

Note that the scalars λe\lambda_{e} fulfill the condition

eiλe=1,i=1,,n.\sum_{e\in\mathcal{I}_{i}}\lambda_{e}=1,\qquad i=1,\ldots,n.

This condition implies that LL is stochastic, L𝟙=𝟙L\mathbbx{1}=\mathbbx{1}. If pp is a probability vector on the nodes of 𝒢\mathcal{G}, then p^T=pTL\widehat{p}^{T}=p^{T}L is a probability vector defined on the edges, and fulfills the identity pi=eip^ep_{i}=\sum_{e\in\mathcal{I}_{i}}\widehat{p}_{e}. Furthermore, let R|E|×|V|R\in\mathbb{R}^{|E|\times|V|} be the “restriction” matrix

Rei={1if 𝚃(e)=i0else.R_{ei}=\begin{cases}1&\hbox{if }{\tt T}(e)=i\\ 0&\hbox{else.}\end{cases} (23)

Given any probability distribution p^\widehat{p} on the edges of 𝒢\mathcal{G}, the product pT=p^TRp^{T}=\widehat{p}^{T}R defines a probability distribution on the nodes such that pi=eip^ep_{i}=\sum_{e\in\mathcal{I}_{i}}\widehat{p}_{e}. We note in passing that the product LRLR is the identity in n\mathbb{R}^{n}.

Theorem 5.1.

Let P=LP^Rn×nP=L\widehat{P}R\in\mathbb{R}^{n\times n},

Pij=eifjλeP^ef.P_{ij}=\sum_{e\in\mathcal{I}_{i}}\sum_{f\in\mathcal{I}_{j}}\lambda_{e}\widehat{P}_{ef}. (24)

Then, PP is an irreducible stochastic matrix. Assuming that the Markov chain {Wn}\{W_{n}\} is at equilibrium with stationary density π^\widehat{\pi}, then also {Xn}\{X_{n}\} is at equilibrium with stationary density πT=π^TR\pi^{T}=\widehat{\pi}^{T}R and, for n1n\geq 1,

Pij=(Xn+1=j|Xn=i).P_{ij}=\mathbb{P}(X_{n+1}=j|X_{n}=i).

Moreover, the initial transition probabilities (10) are

pi,j=π^(i,j)f𝒪iπ^f,(i,j)E,p^{\prime}_{i,j}=\frac{\widehat{\pi}_{(i,j)}}{\sum_{f\in\mathcal{O}_{i}}\widehat{\pi}_{f}},\qquad(i,j)\in E, (25)

and the vector πT=π^TR\pi^{T}=\widehat{\pi}^{T}R is the (unique) invariant vector of the Markov chain associated to PP.

Proof.

First, note that the matrix PP is non-negative and stochastic. To prove that PP is irreducible, let i,jVi,j\in V be arbitrary. Owing to the irreducibility of P^\widehat{P}, for any eie\in\mathcal{I}_{i} and fjf\in\mathcal{I}_{j} there is a path in 𝒢^\widehat{\mathcal{G}} from ee to ff. Let e=e0,e1,,em=fe=e_{0},e_{1},\ldots,e_{m}=f be such path. Then, the sequence 𝚃(e0),𝚃(e1),,𝚃(em){\tt T}(e_{0}),{\tt T}(e_{1}),\ldots,{\tt T}(e_{m}) is a path in 𝒢\mathcal{G} from ii to jj.

Thus, owing to the correspondence between Xn+1=iX_{n+1}=i and WniW_{n}\in\mathcal{I}_{i}, we have

(Xn+1=j|Xn=i)\displaystyle\mathbb{P}(X_{n+1}=j|X_{n}=i) =(Wnj|Wn1i)\displaystyle=\mathbb{P}(W_{n}\in\mathcal{I}_{j}|W_{n-1}\in\mathcal{I}_{i})
=eifj(Wn=f|Wn1=e)(Wn1=e|Wn1i)\displaystyle=\sum_{e\in\mathcal{I}_{i}}\sum_{f\in\mathcal{I}_{j}}\mathbb{P}(W_{n}=f|W_{n-1}=e)\mathbb{P}(W_{n-1}=e|W_{n-1}\in\mathcal{I}_{i})
=eifjP^efλe=Pij\displaystyle=\sum_{e\in\mathcal{I}_{i}}\sum_{f\in\mathcal{I}_{j}}\widehat{P}_{ef}\lambda_{e}=P_{ij}

for any n1n\geq 1. Moreover, consider the vector πT=π^TR\pi^{T}=\widehat{\pi}^{T}R. In detail,

πi=eiπ^e>0,i=1,,n.\pi_{i}=\sum_{e\in\mathcal{I}_{i}}\widehat{\pi}_{e}>0,\qquad i=1,\ldots,n. (26)

To prove that π\pi is an invariant vector of PP, first observe that π^TRL=π^\widehat{\pi}^{T}RL=\widehat{\pi}. Indeed, owing to (21), for any fixed eEe\in E we obtain

(π^TRL)e=λef:𝚃(f)=𝚃(e)π^f=π^e.(\widehat{\pi}^{T}RL)_{e}=\lambda_{e}\sum_{f:{\tt T}(f)={\tt T}(e)}\widehat{\pi}_{f}=\widehat{\pi}_{e}.

Moreover, πT𝟙=π^𝕋𝟙=π^𝕋𝟙=𝟙\pi^{T}\mathbbx{1}=\widehat{\pi}^{T}R\mathbbx{1}=\widehat{\pi}^{T}\mathbbx{1}=1, eventually yielding

πTP=π^TRLP^R=π^TP^R=π^TR=πT.\pi^{T}P=\widehat{\pi}^{T}RL\widehat{P}R=\widehat{\pi}^{T}\widehat{P}R=\widehat{\pi}^{T}R=\pi^{T}.

To complete the proof it is sufficient to observe that, if {Wn}\{W_{n}\} is at equilibrium then also {Xn}\{X_{n}\} is at equilibrium and

π^(i,j)=(W0=(i,j))\displaystyle\widehat{\pi}_{(i,j)}=\mathbb{P}(W_{0}=(i,j)) =(X1=j,X0=i)\displaystyle=\mathbb{P}(X_{1}=j,X_{0}=i)
=(X1=j|X0=i)(X0=i)=pi,jfiπ^f.\displaystyle=\mathbb{P}(X_{1}=j|X_{0}=i)\mathbb{P}(X_{0}=i)=p^{\prime}_{i,j}\sum_{f\in\mathcal{I}_{i}}\widehat{\pi}_{f}.

Thus, pi,j=π^(i,j)/fiπ^fp^{\prime}_{i,j}=\widehat{\pi}_{(i,j)}/\sum_{f\in\mathcal{I}_{i}}\widehat{\pi}_{f}. Finally, as fiπ^f=f𝒪iπ^f\sum_{f\in\mathcal{I}_{i}}\widehat{\pi}_{f}=\sum_{f\in\mathcal{O}_{i}}\widehat{\pi}_{f}, we obtain (25), which concludes the proof. ∎

We call the matrix PP in (24) the pullback of P^\widehat{P} on 𝒢\mathcal{G}. Note that these two matrices correspond to two different stochastic processes, as briefly pointed out in Remark 5.2 below. Moreover, as we may expect, we remark that the pullback operation is not injective, as different stochastic processes on 𝒢^\widehat{\mathcal{G}} may have the same pullback in 𝒢\mathcal{G}. This is shown by the Example 5.3 below, where the classic and non-backtracking processes are shown to have the same pullback.

Remark 5.2.

As shown in Theorem 5.1, the entries of the pullback matrix PP correspond to first-order transition probabilities of the stochastic process {Xn}\{X_{n}\}. However, this does not mean that {Xn}\{X_{n}\} is a Markov chain, which would be true if (and only if) the sequence {Xn}\{X_{n}\} were to satisfy the Markov condition (Xn+1=k|Xn=j)=(Xn+1=k|Xn=j,Xn1=i)\mathbb{P}(X_{n+1}=k|X_{n}=j)=\mathbb{P}(X_{n+1}=k|X_{n}=j,X_{n-1}=i), for every (i,j),(j,k)E(i,j),(j,k)\in E, at least under the considered assumptions. On the other hand, Equation (24) yields

Pij=(h,i)Eλ(h,i)ph,i,j,P_{ij}=\sum_{(h,i)\in E}\lambda_{(h,i)}p_{h,i,j},

that is, PijP_{ij} is a convex combinations of the numbers ph,i,jp_{h,i,j}. Consequently, the Markov condition holds true if and only if the second-order transition probabilities (9) are independent of the oldest state, in which case {Xn}\{X_{n}\} is trivially a Markov chain.

Actually, the construction of PP from P^\widehat{P} is reminiscent of a so-called lumping of {Wn}\{W_{n}\}, see e.g., [34] and [18, Sec. 6.3-4]. Lumping methods are known to allow the construction of Markov chains with a reduced number of states from certain finite Markov chains with a larger states set. The corresponding transition matrices are related by an identity similar to the one defining PP from P^\widehat{P} in Theorem 5.1. However, the process {Xn}\{X_{n}\} is not a lumping of {Wn}\{W_{n}\} since it is not a Markov chain, even when at equilibrium. This is the reason why second-order mean hitting times differ ostensibly from mean hitting times corresponding to PP, as shown in various numerical experiments in Section 6.

Example 5.3.

Let 𝒢\mathcal{G} be undirected and (strongly) connected, with adjacency matrix AA, and let 𝒢^\widehat{\mathcal{G}} be its line graph. Consider the random walk on 𝒢^\widehat{\mathcal{G}} defined as in Example 3.1, with transition matrix P^\widehat{P}. It is not hard to verify that the pullback of P^\widehat{P} on 𝒢\mathcal{G} coincides with the transition matrix of the standard random walk on 𝒢\mathcal{G}. The same conclusion is true also for the matrix P^\widehat{P} considered in Example 3.2. Indeed, if Aij=0A_{ij}=0 then

(LP^R)ij=eifjλeP^ef=0(L\widehat{P}R)_{ij}=\sum_{e\in\mathcal{I}_{i}}\sum_{f\in\mathcal{I}_{j}}\lambda_{e}\widehat{P}_{ef}=0

because the condition 𝚂(f)=𝚃(e){\tt S}(f)={\tt T}(e) is always false. On the other hand, if Aij=1A_{ij}=1 then

(LP^R)ij\displaystyle(L\widehat{P}R)_{ij} =1eifjλeP^ef\displaystyle=\frac{1}{\sum_{e\in\mathcal{I}_{i}}}\sum_{f\in\mathcal{I}_{j}}\lambda_{e}\widehat{P}_{ef}
=eifjλed𝚂(f)+1\displaystyle=\sum_{e\in\mathcal{I}_{i}}\sum_{f\in\mathcal{I}_{j}}\frac{\lambda_{e}}{d^{+}_{{\tt S}(f)}-1}
=(|i|1)1/didi+1=1di,\displaystyle=(|\mathcal{I}_{i}|-1)\frac{1/d^{-}_{i}}{d^{+}_{i}-1}=\frac{1}{d_{i}},

since |i|=di=di+=di|\mathcal{I}_{i}|=d^{-}_{i}=d^{+}_{i}=d_{i}. By linearity, also the pullback of the backtrack-downweighted random walk in Example 3.3 coincides with the standard random walk on 𝒢\mathcal{G}.

5.1 Hitting and return times at equilibrium

Due to Theorem 5.1, it is convenient to assume that the initial probabilities of the second-order stochastic process under consideration (10) are equal to λ(i,j)\lambda_{(i,j)}. Hence, from here onward we consider the “equilibrium assumption” (X0=i)=πi\mathbb{P}(X_{0}=i)=\pi_{i} together with (25). With this assumption, the presentation of the mean hitting and return times introduced in the previous section can be carried out with greater clarity and simplicity. To begin with, we derive an algebraic expression for the second-order hitting times matrix T~=(τ~ij)\widetilde{T}=(\widetilde{\tau}_{i\to j}) of a very different nature from that at the basis of Equation (14).

Theorem 5.4.

Let T~=(τ~ij)i,jV\widetilde{T}=(\widetilde{\tau}_{i\to j})_{i,j\in V} and T^=(τ^ef)e,fE\widehat{T}=(\widehat{\tau}_{e\to f})_{e,f\in E}. There exists a vector a=(αe)eEa=(\alpha_{e})_{e\in E} and a vector b=(βi)iVb=(\beta_{i})_{i\in V} such that

T~=LT^Diag(a)R𝟙𝕓𝕋,\widetilde{T}=L\widehat{T}\,\mathrm{Diag}(a)R-\mathbbx{1}b^{T},

where LL is as in (22) and RR is as in (23).

Proof.

Introduce the matrix C|E|×|V|C\in\mathbb{R}^{|E|\times|V|} such that Cei=τ^eiC_{ei}=\widehat{\tau}_{e\to\mathcal{I}_{i}} is defined as in Section 4.1. From the identity

τ~ij=𝔼(T~j|X0=i)\displaystyle\widetilde{\tau}_{i\to j}=\mathbb{E}(\widetilde{T}_{j}|X_{0}=i) =𝔼(T^j|W0i)\displaystyle=\mathbb{E}(\widehat{T}_{\mathcal{I}_{j}}|W_{0}\in\mathcal{I}_{i})
=ei(W0=e|W0i)𝔼(T^j|W0=e)=eiλeτ^ej\displaystyle=\sum_{e\in\mathcal{I}_{i}}\mathbb{P}(W_{0}=e|W_{0}\in\mathcal{I}_{i})\,\mathbb{E}(\widehat{T}_{\mathcal{I}_{j}}|W_{0}=e)=\sum_{e\in\mathcal{I}_{i}}\lambda_{e}\widehat{\tau}_{e\to\mathcal{I}_{j}}

it follows that T~=LC\widetilde{T}=LC. Using the technical Lemma 2.6, for every eEe\in E and iVi\in V there exist coefficients αe\alpha_{e} and βi\beta_{i} such that τ^ei=fiαfτ^efβi\widehat{\tau}_{e\to\mathcal{I}_{i}}=\sum_{f\in\mathcal{I}_{i}}\alpha_{f}\widehat{\tau}_{e\to f}-\beta_{i}. The latter identity can be expressed in matrix terms as

C=T^Diag(a)R𝟙𝕓𝕋.C=\widehat{T}\,\mathrm{Diag}(a)R-\mathbbx{1}b^{T}.

Finally, as L𝟙=𝟙L\mathbbx{1}=\mathbbx{1}, we conclude. ∎

The previous theorem yields a formula for the matrix T~\widetilde{T} that is amenable to numerical computations, once the matrix T^\widehat{T} has been computed by standard methods via, e.g., (7). The next result proves that second-order mean return times defined in Section 4.2 coincide with the usual return times of the random walk associated with the pullback of P^\widehat{P} on 𝒢\mathcal{G}.

Theorem 5.5.

Let π\pi be the stationary density of the pullback of P^\widehat{P} on 𝒢\mathcal{G}. For every SVS\subset V it holds τ~S=1/iSπi\widetilde{\tau}_{S}=1/\sum_{i\in S}\pi_{i}. In particular, τ~i=1/πi\widetilde{\tau}_{i}=1/\pi_{i} for iVi\in V.

Proof.

Let S^=iSi\widehat{S}=\cup_{i\in S}\mathcal{I}_{i}. Then, τ~S\widetilde{\tau}_{S} can be reformulated as the mean return time in S^\widehat{S} of the chain {Wn}\{W_{n}\}. From Lemma 2.4 and (26), we get

τ~S=𝔼(T^S^+|W0S^)=1iSeiπ^e=1iSπi,\widetilde{\tau}_{S}=\mathbb{E}(\widehat{T}^{+}_{\widehat{S}}|W_{0}\in\widehat{S})=\frac{1}{\sum_{i\in S}\sum_{e\in\mathcal{I}_{i}}\widehat{\pi}_{e}}=\frac{1}{\sum_{i\in S}\pi_{i}},

and we have the claim. ∎

Example 5.6.

Let 𝒢=(V,E)\mathcal{G}=(V,E) be finite and undirected. For α[0,1]\alpha\in[0,1], the stationary density π^\widehat{\pi} of the matrix P^(α)\widehat{P}^{(\alpha)} in Example 3.3 is a constant vector, π^e=1/|E|\widehat{\pi}_{e}=1/|E|, as the transition matrix is bi-stochastic. Consequently, the second-order mean return times coincides with the mean return times of the uniform random walk on 𝒢\mathcal{G}, namely, τ~i=(jdj)/di\widetilde{\tau}_{i}=(\sum_{j}d_{j})/d_{i}, independently of α\alpha and the topology of 𝒢\mathcal{G}. Moreover, in this example Equation (25) becomes λ(i,j)=1/di\lambda_{(i,j)}=1/d_{i} being 𝒢\mathcal{G} undirected, i.e., the first step in the second-order process coincides with one step of a standard uniform random walk on 𝒢\mathcal{G}.

Theorem 5.5 yields a second-order analogous to Kac’s lemma, see Lemma 2.4. For a family of networks that exhibit some specific symmetries, Lemma 2.5 can also be extended to second-order random walks, as shown in the sequel.

The coefficients αe\alpha_{e} appearing in the proof of Theorem 5.4 come from Lemma 2.6 and, according to (8), can be expressed as αe=π^eτ^ei+\alpha_{e}=\widehat{\pi}_{e}\widehat{\tau}^{+}_{e\to\mathcal{I}_{i}} where i=𝚃(e)i={\tt T}(e) and τ^ei+\widehat{\tau}^{+}_{e\to\mathcal{I}_{i}} represents the mean return time to i\mathcal{I}_{i} of {Wn}\{W_{n}\}, after leaving it from eie\in\mathcal{I}_{i}, as defined in (18). Suppose that these return times depend on ii but not on eie\in\mathcal{I}_{i}. This assumption is clearly verified if for every e,fie,f\in\mathcal{I}_{i} there is an automorphism of 𝒢^\widehat{\mathcal{G}} that exchanges ee and ff, i.e., the nodes ee and ff in 𝒢^\widehat{\mathcal{G}} can be exchanged without altering the topology of 𝒢^\widehat{\mathcal{G}}. The latter condition is satisfied if, for example, {Wn}\{W_{n}\} is a uniform random walk on a bipartite regular graph, that is, an undirected, bipartite graph whose adjacency matrix has the form

(O𝟙𝟙𝕋𝟙𝟙𝕋O),\begin{pmatrix}O&\mathbbx{1}\mathbbx{1}^{T}\\ \mathbbx{1}\mathbbx{1}^{T}&O\end{pmatrix},

in which the diagonal blocks can have different sizes.

For notational simplicity, denote by ρi=τ^ei+\rho_{i}=\widehat{\tau}^{+}_{e\to\mathcal{I}_{i}} that common value. Then, from Theorem 5.1 we have

1=eiαe=eiπ^eτ^ei+=ρieiπ^e=ρiπi.1=\sum_{e\in\mathcal{I}_{i}}\alpha_{e}=\sum_{e\in\mathcal{I}_{i}}\widehat{\pi}_{e}\widehat{\tau}^{+}_{e\to\mathcal{I}_{i}}=\rho_{i}\sum_{e\in\mathcal{I}_{i}}\widehat{\pi}_{e}=\rho_{i}\pi_{i}.

Thus ρi=1/πi\rho_{i}=1/\pi_{i}. Moreover, for eie\in\mathcal{I}_{i} we obtain

αe=π^eρi=π^e/πi.\alpha_{e}=\widehat{\pi}_{e}\rho_{i}=\widehat{\pi}_{e}/\pi_{i}.

Hence, in the notation of Theorem 5.4, we have that Diag(a)Rπ=π^\mathrm{Diag}(a)R\pi=\widehat{\pi}, and thus

T~π=LT^Diag(a)Rπ𝟙𝕓𝕋π=𝕃𝕋^π^𝟙𝕓𝕋π,\widetilde{T}\pi=L\widehat{T}\,\mathrm{Diag}(a)R\pi-\mathbbx{1}b^{T}\pi=L\widehat{T}\widehat{\pi}-\mathbbx{1}b^{T}\pi,

where b=(βi)iVb=(\beta_{i})_{i\in V} is the coefficient vector defined in Theorem 5.4. By Lemma 2.5, we have T^π^=κ𝟙\widehat{T}\widehat{\pi}=\kappa\mathbbx{1} where κ\kappa is the Kemeny’s constant of T^\widehat{T}, and we eventually obtain:

T~π=κL𝟙𝟙𝕓𝕋π=𝟙(κ𝕓𝕋π).\widetilde{T}\pi=\kappa L\mathbbx{1}-\mathbbx{1}b^{T}\pi=\mathbbx{1}(\kappa-b^{T}\pi).

Combining all the observations above, we obtain the following second-order equivalent of the random target lemma:

Corollary 5.7.

Let π\pi be the stationary density given in Theorem 5.1. In the equilibrium assumption discussed at the beginning of Section 5.1 and using the previous notation, we have that, if for every iVi\in V the return time τ^ei+\widehat{\tau}^{+}_{e\to\mathcal{I}_{i}} depends on ii but not on eie\in\mathcal{I}_{i}, then

jVπjτ~ij=κ~,\sum_{j\in V}\pi_{j}\widetilde{\tau}_{i\to j}=\widetilde{\kappa},

independently on ii, where κ~=κbTπ\widetilde{\kappa}=\kappa-b^{T}\pi.

6 Numerical experiments on real-world networks

In this section, we present numerical results on the computation of second-order mean hitting times on some real-world networks. We consider three undirected and unweighted networks: Guppy [8], that represents the social interactions of a population of free-ranging guppies (Poecilia reticulata), Dolphins [23], that represents the social structure of a group of dolphins, and Householder93 [26], that represents a collaboration network of a group of mathematicians. Since these graphs contain dangling edges, we remove in sequence all the nodes of degree 11 until each vertex has degree greater or equal to 22. Table 1 shows the dimension of these networks before and after removing such leaf nodes. Our experiments illustrate various statistics on the second-order hitting times τ^ij\widehat{\tau}_{i\to j} for non-backtracking random walks in these graphs. Note that hitting times computed with the pullback transition matrix coincide with the hitting times τij\tau_{i\to j} for the standard random walk, see Example 5.6.

before the removal after the removal
nodes edges diameter nodes edges diameter
Guppy 99 726 6 98 725 5
Dolphins 62 159 8 53 150 7
Householder93 104 211 7 73 180 5
Table 1: Dimension of the networks used in our experiments before and after the removal of leaf nodes.

In Figure 1 we scatter plot the average non-backtracking mean hitting times m^j=iVτ^ij/|V|\widehat{m}_{j}=\sum_{i\in V}\widehat{\tau}_{i\to j}/|V| against the average traditional mean hitting times mj=iVτij/|V|m_{j}=\sum_{i\in V}\tau_{i\to j}/|V|. The dotted red line is the bisector of the first quadrant. We observe that, in all these networks, the ratio between mjm_{j} and m^j\widehat{m}_{j} is always smaller than 11. Intuitively, this means that, on average, non-backtracking random walks are shorter than classical random walks.

Refer to caption

Figure 1: Scatter plot of the average non-backtracking mean hitting times against the average traditional mean hitting times for the networks Guppy, Dolphins and Householder93. The dotted lines represent the bisectors.

Figure 2 displays the value of the non-backtracking mean access time

ai=jVπjτ^ija_{i}=\sum_{j\in V}\pi_{j}\widehat{\tau}_{i\to j} (27)

with respect to the node index iVi\in V. Recall that these values for the standard random walks are all equal to the Kemeny constant of the graph, see Lemma 2.5. Remarkably, the values for non-backtracking walks have a very narrow range, as shown by the small variation of the values in the yy-axes. This experiment shows that in these graphs the claim of Corollary 4.4 is approximately valid, even though the assumptions about the structure of the graph are not satisfied.

Refer to caption

Figure 2: Plot of the non-backtracking mean access times (27) for the networks Guppy, Dolphins and Householder93.

Finally, we computed mean hitting times for the backtrack-downweighted random walks introduced in Example 3.3, with varying α[0,1]\alpha\in[0,1]. The corresponding stochastic processes interpolate between the standard random walk (α=1\alpha=1) and the non-backtracking variant (α=0\alpha=0). Nevertheless, the pullback matrix does not depend on α\alpha, as noted in Example 5.3. Let τij(α)\tau^{(\alpha)}_{i\to j} denote second-order mean hitting times associated with the transition matrix P^(α)\widehat{P}^{(\alpha)}. To better illustrate the effect of penalizing backtracking steps, we measure the mean hitting times ratio

rj(α)=iVτij(α)iVτij(1).r^{(\alpha)}_{j}=\frac{\sum_{i\in V}\tau^{(\alpha)}_{i\to j}}{\sum_{i\in V}\tau^{(1)}_{i\to j}}. (28)

The curves in Figure 3 illustrate the behavior of the average value (magenta line), along with the maximum and the minimum value (black lines) of these ratios. The values for α=0\alpha=0 correspond to the results shown in Figure 1.

Refer to caption

Figure 3: Behavior of the mean hitting time ratios (28) for the backtrack-penalized random walks. The solid lines represent maximum, average, and minimum values of the mean hitting time ratios, respectively, as α\alpha varies in [0,1][0,1].

7 Conclusions

Second-order random walks, which include non-backtracking random walks as a particular case, provide stochastic models for navigation and diffusion processes on networks that can offer improved modeling capacity with respect to classical random-walks, and thus allow us to better capture real-world dynamics on networks. In this paper, we propose a very general framework to properly define mean hitting times and mean return times for second-order random walks for directed graphs with finite or countably many nodes. Our approach is based on the correspondence between the second-order random walk and a standard random walk obtained by first lifting the second-random walk to the directed line graph and then pulling back onto the original graph. The “pullback” Markov chain obtained this way has the same stationary density as the initial second-order random process and allow us to transfer several well-known results for standard random walks to the second-order case. For example, we prove that second-order mean return times coincide with the mean return times of the corresponding pullback Markov chain and we show how to extend the Kac’s and random target lemmas to second-order random walks.

These results prompt us to further investigations concerning computational issues as well as theoretical properties of second-order random walks. For instance, it is certainly interesting to devise more efficient numerical methods to compute second-order hitting times than the one based on Corollary 4.4 and Theorem 5.4. Furthermore, numerical experiments show that non-backtracking hitting times are consistently smaller than their corresponding classical, backtracking versions. In fact, reducing the probability of backtracking results in a reduction of the mean hitting times. However, a formal explanation of this fact, at the best of our knowledge, is still missing.

Acknowledgement

The work of Dario Fasino and Arianna Tonetto has been carried out in the framework of the departmental research project ‘ICON: Innovative Combinatorial Optimization in Networks’, Department of Mathematics, Computer Science and Physics (PRID 2017), University of Udine, Italy. The work of Dario Fasino and Francesco Tudisco has been partly supported by Istituto Nazionale di Alta Matematica, INdAM-GNCS, Italy.

References

  • [1] F. Arrigo, P. Grindrod, D. J. Higham, and V. Noferini. Non-backtracking walk centrality for directed networks. Journal of Complex Networks, 6:54–78, 2018.
  • [2] F. Arrigo, D. J. Higham, and V. Noferini. A theory for backtrack-downweighted walks. SIAM J. Matrix Anal. Appl., 42:1229–1247, 2021.
  • [3] F. Arrigo, D. J. Higham, and F. Tudisco. A framework for second-order eigenvector centralities and clustering coefficients. Proceedings of the Royal Society A, 476(2236):20190724, 2020.
  • [4] A. R. Benson, R. Abebe, M. T. Schaub, A. Jadbabaie, and J. Kleinberg. Simplicial closure and higher-order link prediction. Proceedings of the National Academy of Sciences, 115(48):E11221–E11230, 2018.
  • [5] A. R. Benson, D. F. Gleich, and L.-H. Lim. The spacey random walk: A stochastic process for higher-order data. SIAM Review, 59(2):321–345, 2017.
  • [6] S. M. Cioabă and P. Xu. Mixing rates of random walks with little backtracking. In SCHOLAR—a Scientific Celebration Highlighting Open Lines of Arithmetic Research, volume 655 of Contemp. Math., pages 27–58. Amer. Math. Soc., Providence, RI, 2015.
  • [7] S. Cipolla, F. Durastante, and F. Tudisco. Nonlocal PageRank. ESAIM Mathematical Modelling and Numerical Analysis, 2020.
  • [8] D. P. Croft, J. Krause, and R. James. Social networks in the guppy (poecilia reticulata). Proceedings of the Royal Society B: Biological Sciences, 271, 2005.
  • [9] M. Cucuringu, P. Rombach, S. Lee, and M. Porter. Detection of core-periphery structure in networks using spectral methods and geodesic paths. European Journal of Applied Mathematics, 27:846–887, 2016.
  • [10] F. Della Rossa, F. Dercole, and C. Piccardi. Profiling core-periphery network structure by random walkers. Scientific reports, 3:1–8, 2013.
  • [11] E. Estrada, J.-C. Delvenne, N. Hatano, J. L. Mateos, R. Metzler, A. P. Riascos, and M. T. Schaub. Random multi-hopper model: super-fast random walks on graphs. Journal of Complex Networks, 6(3):382–403, 2017.
  • [12] D. Fasino and F. Tudisco. Ergodicity coefficients for higher-order stochastic processes. SIAM J. Mathematics of Data Science, 2:740–769, 2020.
  • [13] R. Fitzner and R. Hofstad. Non-backtracking random walk. Journal of Statistical Physics, 150:264–284, 2012.
  • [14] P. Grindrod, D. J. Higham, and V. Noferini. The deformed graph Laplacian and its applications to network centrality analysis. Journal of Statistical Physics, 39:310–341, 2018.
  • [15] A. Grover and J. Leskovec. Node2vec: Scalable feature learning for networks. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 855–864, 2016.
  • [16] M. Kac. On the notion of recurrence in discrete stochastic processes. Bulletin of the American Mathematical Society, 53(10):1002–1010, 1947.
  • [17] T. Kawamoto. Localized eigenvectors of the non-backtracking matrix. Journal of Statistical Mechanics: Theory and Experiment, 2016(2):023404, 2016.
  • [18] J. G. Kemeny and J. L. Snell. Finite Markov chains. Undergraduate Texts in Mathematics. Springer-Verlag, New York-Heidelberg, 1976.
  • [19] M. Kempton. Non-backtracking random walks and a weighted Ihara’s theorem. Open Journal of Discrete Mathematics, 6:207–226, 2016.
  • [20] F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborová, and P. Zhang. Spectral redemption in clustering sparse networks. Proceedings of the National Academy of Sciences, 110(52):20935–20940, 2013.
  • [21] D. A. Levin, Y. Peres, and E. L. Wilmer. Markov Chains and Mixing Times. American Mathematical Society, Providence, RI, 2009.
  • [22] D. Liben-Nowell and J. Kleinberg. The link-prediction problem for social networks. Journal of the American society for information science and technology, 58(7):1019–1031, 2007.
  • [23] D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson. The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behavioral Ecology and Sociobiology, 54(4):396–405, 2003.
  • [24] T. Martin, X. Zhang, and M. E. Newman. Localization and centrality in networks. Physical Review E, 90(5):052808, 2014.
  • [25] C. D. Meyer. Matrix Analysis and Applied Linear Algebra, volume 71. SIAM, 2000.
  • [26] C. Moler. Lake Arrowhead Coauthor Graph. http://blogs.mathworks.com/cleve/2013/06/10/lake-arrowhead-coauthor-graph/, 2013.
  • [27] A. N., I. Benjamini, E. Lubetzky, and S. Sodin. Non-backtracking random walks mix faster. Communications in Contemporary Mathematics, 9:585–603, 2007.
  • [28] B. Nadler, N. Srebro, and X. Zhou. Semi-supervised learning with the graph Laplacian: The limit of infinite unlabelled data. Advances in Neural Information Processing Systems, 22:1330–1338, 2009.
  • [29] M. Newman. Networks: An Introduction. Oxford University Press, 2010.
  • [30] M. E. Newman. A measure of betweenness centrality based on random walks. Social Networks, 27(1):39–54, 2005.
  • [31] J. D. Noh and H. Rieger. Random walks on complex networks. Physical Review Letters, 92(11):118701, 2004.
  • [32] J. R. Norris. Markov Chains. Number 2 in Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 1998.
  • [33] P. Rombach, M. A. Porter, J. H. Fowler, and P. J. Mucha. Core-periphery structure in networks (revisited). SIAM Review, 59(3):619–646, 2017.
  • [34] G. Rubino and B. Sericola. A finite characterization of weak lumpable Markov processes. I. The discrete time case. Stochastic Process. Appl., 38(2):195–204, 1991.
  • [35] L. Torres, K. S. Chan, H. Tong, and T. Eliassi-Rad. Nonbacktracking eigenvalues under node removal: X-centrality and targeted immunization. SIAM Journal on Mathematics of Data Science, 3(2):656–675, 2021.
  • [36] F. Tudisco, A. R. Benson, and K. Prokopchik. Nonlinear higher-order label spreading. In Proceedings of the Web Conference 2021, pages 2402–2413, 2021.
  • [37] F. Tudisco and D. J. Higham. A nonlinear spectral method for core-periphery detection in networks. SIAM J. Mathematics of Data Science, 1:269–292, 2019.
  • [38] Y. Wu, Y. Bian, and X. Zhang. Remember where you came from: On the second-order random walk based proximity measures. Proc. VLDB Endow., 10(1):13–24, 2016.