
On the Houdré-Tetali conjecture about an isoperimetric constant of graphs

Lap Chi Lau$^{1}$, Dante Tjowasi$^{1}$

$^{1}$Cheriton School of Computer Science, University of Waterloo. Supported by NSERC Discovery Grant.

Houdré and Tetali defined a class of isoperimetric constants $\varphi_p$ of graphs for $0 \leq p \leq 1$, and conjectured a Cheeger-type inequality for $\varphi_{1/2}$ of the form

$$\lambda_2 \lesssim \varphi_{1/2} \lesssim \sqrt{\lambda_2},$$

where $\lambda_2$ is the second smallest eigenvalue of the normalized Laplacian matrix. If true, the conjecture would be a strengthening of the hard direction of the classical Cheeger's inequality. Morris and Peres proved Houdré and Tetali's conjecture up to an additional log factor, using techniques from evolving sets. We present the following related results on this conjecture.

1. We provide a family of counterexamples to the conjecture of Houdré and Tetali, showing that the logarithmic factor is needed.

2. We match Morris and Peres's bound using standard spectral arguments.

3. We prove that Houdré and Tetali's conjecture is true for any constant $p$ strictly bigger than $\frac{1}{2}$, which is also a strengthening of the hard direction of Cheeger's inequality.

Furthermore, our results can be extended to directed graphs using Chung’s definition of eigenvalues for directed graphs.

1 Introduction

Motivated by Talagrand’s isoperimetric inequality for the hypercubes [Tal93] (see Section 1.2), Houdré and Tetali [HT04] extended Talagrand’s isoperimetric constant to general Markov chains and also to different exponents.

Definition 1.1 (Isoperimetric Constants for Markov Chains [HT04]).

Let $(V, P, \pi)$ be an irreducible Markov chain with vertex set $V$, transition matrix $P \in \mathbb{R}^{|V| \times |V|}$ and stationary distribution $\pi : V \to \mathbb{R}_{\geq 0}$. For any $p \in (0, 1]$, define the isoperimetric constant as

$$\varphi_p(P) := \min_{S \subset V : \pi(S) \leq \frac{1}{2}} \varphi_p(S) := \min_{S \subset V : \pi(S) \leq \frac{1}{2}} \frac{\sum_{v \in S} \pi(v) \cdot \big(\sum_{u \in \overline{S}} P(v, u)\big)^p}{\pi(S)}.$$

Let $\partial S := \{v \in S \mid \sum_{u \in \overline{S}} P(v, u) > 0\}$ be the inner vertex boundary of $S$. Then, for $p = 0$,

$$\varphi_0(P) := \min_{S \subset V : \pi(S) \leq \frac{1}{2}} \varphi_0(S) := \min_{S \subset V : \pi(S) \leq \frac{1}{2}} \frac{\pi(\partial S)}{\pi(S)}.$$

Given an undirected graph or a directed graph $G = (V, E)$, let $P_G$ be the transition matrix of the natural random walk, where $P_G(v, u) = 1/\deg(v)$ in the undirected case and $P_G(v, u) = 1/\deg^{\mathrm{out}}(v)$ in the directed case. Then the isoperimetric constant $\varphi_p(G)$ for the graph $G$ is defined as $\varphi_p(P_G)$.
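The following is a minimal brute-force sketch of Definition 1.1 (our addition, not code from the paper): it evaluates $\varphi_p(P)$ by enumerating all subsets $S$ with $\pi(S) \leq \frac{1}{2}$, which is only feasible for very small chains but is handy for sanity checks.

```python
import itertools
import numpy as np

def phi_p(P, pi, p):
    """Brute-force phi_p(P): minimize over S with pi(S) <= 1/2 the quantity
    sum_{v in S} pi(v) * P(v, S-bar)^p / pi(S), as in Definition 1.1."""
    n = len(pi)
    best = np.inf
    for r in range(1, n):
        for S in itertools.combinations(range(n), r):
            S = list(S)
            if pi[S].sum() > 0.5 + 1e-12:
                continue
            comp = [v for v in range(n) if v not in S]
            num = sum(pi[v] * P[v, comp].sum() ** p for v in S)
            best = min(best, num / pi[S].sum())
    return best

# Example: natural random walk on the 4-cycle (uniform stationary distribution).
A = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], float)
P = A / A.sum(axis=1, keepdims=True)
pi = np.full(4, 0.25)
print(phi_p(P, pi, 1.0), phi_p(P, pi, 0.5))   # phi_p increases as p decreases
```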

When $p = 1$, this is known as the Cheeger isoperimetric constant of a Markov chain (see e.g. [LP17]) or the edge conductance/expansion of a graph (see Section 2 for the definition). When $p = 0$, this measures the vertex expansion of an undirected or directed graph. Talagrand studied the case $p = \frac{1}{2}$ and proved a lower bound on $\varphi_{1/2}$ for Boolean hypercubes. One can view $\varphi_{1/2}$ as a quantity that interpolates between edge conductance and vertex expansion, since it follows from the Cauchy-Schwarz inequality that $\varphi_{1/2}(G)^2 \leq \varphi_1(G) \cdot \varphi_0(G)$, and Talagrand used his lower bound to recover Margulis' theorem about edge conductance and vertex expansion on hypercubes (see Section 1.2).

For an undirected graph $G$, one can use the second smallest eigenvalue $\lambda_2$ of the matrix $I - P_G$ to give upper and lower bounds on $\varphi_1(G)$. The classical Cheeger's inequality is

$$\lambda_2(I - P_G) \lesssim \varphi_1(G) \lesssim \sqrt{\lambda_2(I - P_G)}.$$

Houdré and Tetali conjectured that the same relations hold for $\varphi_{1/2}(G)$ as well when the Markov chain is reversible.

Conjecture 1.2 ([HT04]).

Let $(V, P, \pi)$ be an irreducible and reversible Markov chain. Then

$$\lambda_2(I - P) \lesssim \varphi_{1/2}(P) \lesssim \sqrt{\lambda_2(I - P)}.$$

It is clear from the definition that $\varphi_p(G)$ increases as $p$ decreases, and thus $\lambda_2 \lesssim \varphi_1(G) \leq \varphi_p(G)$ for all $p < 1$. Therefore, the Houdré-Tetali conjecture is a strengthening of the hard direction of Cheeger's inequality. It predicts that when the hard direction of Cheeger's inequality is tight for a graph $G$, i.e. $\varphi_1(G) \asymp \sqrt{\lambda_2}$, then the graph must satisfy $\varphi_1(G) \asymp \varphi_{1/2}(G)$. In other words, when $\varphi_1(G) \ll \varphi_{1/2}(G)$, such as on the hypercube (see Remark 1.7) or on dumbbell graphs, the hard direction of Cheeger's inequality cannot be tight. So, the conjecture can be viewed as an improved Cheeger's inequality in the spirit of [KLL+13, KLL17], and this is a main motivation of this work.

Morris and Peres came close to proving the conjecture with an extra logarithmic factor.

Theorem 1.3 ([MP05]).

Let $(V, P, \pi)$ be an irreducible and reversible Markov chain. Suppose that $P(v, v) \geq \frac{1}{2}$ for all $v \in V$. Then

$$\lambda_2(I - P) \gtrsim \frac{\big(\varphi_{1/2}(P)\big)^2}{\log\big(1/\varphi_{1/2}(P)\big)}.$$

Their proof is based on techniques from evolving sets. They lower bounded the "boundary gauge" $\Psi$ using $\varphi_{1/2}$, and also upper bounded the mixing rate using $\Psi$, so that they could relate $\lambda_2$ and $\varphi_{1/2}$.

1.1 Our Results

We found counterexamples to the Houdré-Tetali conjecture, showing that the extra logarithmic factor is needed.

Theorem 1.4.

There are irreducible and reversible Markov chains $(V, P, \pi)$ with

$$\lambda_2(I - P) \lesssim \frac{\log |V|}{|V|^2} \quad \textrm{and} \quad \varphi_{1/2}(P) \gtrsim \frac{\log |V|}{|V|} \quad \implies \quad \lambda_2(I - P) \lesssim \frac{\big(\varphi_{1/2}(P)\big)^2}{\log\big(1/\varphi_{1/2}(P)\big)}.$$

The counterexample is simple to describe: a weighted undirected graph with vertex set $[n]$ and edge weight $P(i, j)$ inversely proportional to $\min\{|i - j|, n - |i - j|\}^3$. See Section 4 for details.

On the positive side, we match the result of Morris and Peres using standard spectral arguments. We show that the simple sweep-cut algorithm can be used to output a set $S$ with $\varphi_{1/2}(S)$ satisfying the guarantee in Theorem 1.3, without the self-loop assumption. See Section 3.1.

Perhaps more interestingly, the same arguments can be used to prove that the Houdré-Tetali conjecture is true if we replace $\frac{1}{2}$ by any constant $p > \frac{1}{2}$.

Theorem 1.5.

Let $(V, P, \pi)$ be an irreducible and reversible Markov chain. For any $p \in (\frac{1}{2}, 1]$,

$$\big(\varphi_p(P)\big)^2 \leq \frac{4}{2p - 1} \cdot \lambda_2(I - P).$$

Similar to the discussion after Conjecture 1.2, this shows that tight examples of the hard direction of Cheeger's inequality must satisfy $\varphi_{\frac{1}{2}+\epsilon}(G) \lesssim \sqrt{\frac{1}{\epsilon}} \cdot \varphi_1(G)$ for any $\epsilon \in (0, \frac{1}{2})$. Also, this provides an improved analysis of Cheeger's inequality: if $\varphi_1(G) \ll \sqrt{2p-1} \cdot \varphi_p(G)$ then $\varphi_1(G) \ll \sqrt{\lambda_2}$. So this result has similar consequences as if Houdré and Tetali's conjecture were true.

Finally, we observe that the same statement as in Theorem 1.5 can also be proved for non-reversible Markov chains, by replacing $\lambda_2(I - P)$ with the eigenvalue defined for directed graphs by Chung [Chu05]. See Theorem 3.2.

Remark 1.6 ($\varphi_p$ for $p < \frac{1}{2}$).

For $p < \frac{1}{2}$, a simple argument shows that an inequality of the form in Theorem 1.5 cannot hold. To see this, consider the transformation $P \rightarrow (1 - \delta) I + \delta P$ for some parameter $0 < \delta < 1$ (equivalent to adding a large self-loop when $\delta$ is small). This scales $\varphi_p$ by a factor $\delta^p$, while the second eigenvalue scales by a factor of $\delta$, and so the ratio $\varphi_p(G) / \sqrt{\lambda_2(I - P_G)}$ scales by $\delta^{p - \frac{1}{2}} \to \infty$ as $\delta \to 0$. When $p = \frac{1}{2}$, adding self-loops does not change the ratio, so $\frac{1}{2}$ is the first exponent where such an inequality makes sense.
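A quick numerical illustration of this scaling (our addition, reusing `phi_p` and the 4-cycle `P`, `pi` from the sketch after Definition 1.1): under the lazy transformation $P \rightarrow (1-\delta)I + \delta P$, the ratios below come out close to $\delta$ and $\delta^p$ respectively.

```python
delta = 0.1
P_lazy = (1 - delta) * np.eye(4) + delta * P
# This P is symmetric, so eigvalsh applies.
lam2 = np.sort(np.linalg.eigvalsh(np.eye(4) - P))[1]
lam2_lazy = np.sort(np.linalg.eigvalsh(np.eye(4) - P_lazy))[1]
print(lam2_lazy / lam2)                               # ~ delta
for p in (0.25, 0.5, 0.75):
    print(p, phi_p(P_lazy, pi, p) / phi_p(P, pi, p))  # ~ delta**p
```

In particular, for $p < \frac{1}{2}$ the ratio $\varphi_p / \sqrt{\lambda_2}$ blows up as $\delta \to 0$, exactly as the remark predicts.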

1.2 Previous Work on Boolean Hypercubes

The isoperimetric constant $\varphi_{1/2}$ was initially studied by Talagrand on the Boolean hypercube. Let $\{0,1\}^n$ be the $n$-dimensional hypercube. For a point $x \in \{0,1\}^n$, let $x^{\oplus i}$ be the point obtained by flipping the $i$-th bit of $x$. For a subset $S \subset \{0,1\}^n$, if $x \notin S$ define $h_S(x) = 0$, and otherwise if $x \in S$ define

$$h_S(x) := \big|\{i \in [n] \mid x^{\oplus i} \notin S\}\big|,$$

so that $\sum_x h_S(x)$ is the size of the edge boundary of $S$. Let $\mu$ be the uniform distribution on $\{0,1\}^n$ and $\mu(S) := \sum_{x \in S} \mu(x)$. The classical Poincaré inequality states that, for any $S \subset \{0,1\}^n$,

$$\mu(S) \cdot (1 - \mu(S)) \lesssim \mathbb{E}_{x \sim \mu}\big[h_S(x)\big]. \quad (1.1)$$

Talagrand [Tal93] proved a strengthening of the Poincaré inequality: for any $S \subset \{0,1\}^n$,

$$\mu(S) \cdot (1 - \mu(S)) \lesssim \mathbb{E}_{x \sim \mu}\big[\sqrt{h_S(x)}\big]. \quad (1.2)$$

The quantity $\mathbb{E}\sqrt{h_S}$ is always smaller than $\mathbb{E} h_S$ and can be seen as a different measure of the boundary information of $S$. Let $\partial S := \{x \mid h_S(x) > 0\}$ be the vertex boundary of $S$. By the Cauchy-Schwarz inequality, Talagrand's theorem implies Margulis' theorem [Mar77] that

$$\mu(S)^2 \cdot (1 - \mu(S))^2 \lesssim \mathbb{E}_{x \sim \mu}\big[h_S(x)\big] \cdot \mu(\partial S),$$

which was an original motivation for Talagrand to consider the quantity $\mathbb{E}\sqrt{h_S}$. More recently, both Margulis' and Talagrand's theorems inspired analogs for directed graphs developed in [CS16, KMS18], leading to major progress in analyzing sublinear-time algorithms for testing monotone functions. See also [EG22, EKLM22] for a proof of a conjecture of Talagrand that further sharpens these inequalities.

The following remark clarifies the connection between $\varphi_p$ and the quantities appearing in Poincaré's inequality and Talagrand's inequality.

Remark 1.7 ($\varphi_p$ for Hypercubes).

For the $n$-dimensional Boolean hypercube $Q_n$, the stationary distribution $\pi$ is simply the uniform distribution $\mu$. Note that the numerator in $\varphi_1(Q_n)$ is exactly $\frac{1}{n} \mathbb{E}_{x \sim \mu}[h_S(x)]$, and the Poincaré inequality translates to $\varphi_1(Q_n) \gtrsim \frac{1}{n}$. Similarly, the numerator of $\varphi_{1/2}(Q_n)$ is exactly $\frac{1}{\sqrt{n}} \mathbb{E}_{x \sim \mu}[\sqrt{h_S(x)}]$, and Talagrand's inequality translates to $\varphi_{1/2}(Q_n) \gtrsim \frac{1}{\sqrt{n}}$.
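A small check of this remark (our addition) on the dictator cut $S = \{x : x_1 = 0\}$ of $Q_n$: every $x \in S$ has $h_S(x) = 1$, so $\varphi_1(S) = \frac{1}{n}$ and $\varphi_{1/2}(S) = \frac{1}{\sqrt{n}}$, matching the two translations above.

```python
import itertools
import math

n = 4
V = list(itertools.product([0, 1], repeat=n))
Sset = {x for x in V if x[0] == 0}           # dictator cut, mu(S) = 1/2
def h(x):                                     # h_S(x) = |{i : x^{flip i} not in S}|
    return sum((x[:i] + (1 - x[i],) + x[i+1:]) not in Sset for i in range(n))
mu = 1 / len(V)                               # uniform distribution
num1  = sum(mu * (h(x) / n)          for x in Sset)   # numerator of phi_1
num12 = sum(mu * math.sqrt(h(x) / n) for x in Sset)   # numerator of phi_{1/2}
muS = len(Sset) / len(V)
print(num1 / muS, 1 / n)                      # phi_1(S) = 1/n
print(num12 / muS, 1 / math.sqrt(n))          # phi_{1/2}(S) = 1/sqrt(n)
```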

Finally, we note that a parameter similar to $\varphi_p$, called $h_f^p$, was also studied in [EG22].

2 Preliminaries

Given two functions $f, g$, we use $f \lesssim g$ to denote the existence of a positive constant $c > 0$ such that $f \leq c \cdot g$ always holds. We use $f \asymp g$ to denote $f \lesssim g$ and $g \lesssim f$. For a positive integer $k$, we use $[k]$ to denote the set $\{1, 2, \dots, k\}$. For a function $f : X \rightarrow \mathbb{R}$, $\operatorname{supp}(f)$ denotes the subset of the domain on which $f$ is nonzero. The function $\log x$ refers to the base-$e$ logarithm.

Undirected Graphs: Let $G = (V, E)$ be an undirected graph and $w : E \to \mathbb{R}_{\geq 0}$ a weight function on the edges. The weighted degree of a vertex $v$ is defined as $\deg_w(v) := \sum_{e : e \ni v} w(e)$. Let $S \subset V$ be a nonempty subset of vertices. The edge boundary of $S$ is defined as $\delta(S) := \{e \in E \mid e \cap S \neq \emptyset \textrm{ and } e \cap \overline{S} \neq \emptyset\}$, and $w(\delta(S))$ denotes the total edge weight of $\delta(S)$. The volume of $S$ is defined as $\operatorname{vol}_w(S) := \sum_{v \in S} \deg_w(v)$. The edge conductance of $S$ and of $G$ are defined as

$$\phi(S) := \frac{w\big(\delta(S)\big)}{\operatorname{vol}_w(S)} \quad \textrm{and} \quad \phi(G) := \min_{S : \operatorname{vol}_w(S) \leq \operatorname{vol}_w(V)/2} \phi(S).$$

In an undirected graph, the ordinary random walk has transition matrix $P$ with $P(u, v) = w(uv)/\deg_w(u)$ for every $u, v \in V$. If the graph is connected, then the stationary distribution $\pi$ is unique, with $\pi(u) = \deg_w(u) / \sum_{v \in V} \deg_w(v)$ for every $u \in V$. It is thus straightforward to check that $\phi(S) = \varphi_1(S)$ and $\phi(G) = \varphi_1(G)$, i.e. the two definitions coincide.

Directed Graphs: Let $G = (V, E)$ be a directed graph and $w : E \to \mathbb{R}_{\geq 0}$ a weight function on the edges. The weighted indegree of a vertex $v$ is defined as $d_w^{\rm in}(v) := \sum_{u : \overrightarrow{uv} \in E} w(\overrightarrow{uv})$ and the weighted outdegree of $v$ is defined as $d_w^{\rm out}(v) := \sum_{u : \overrightarrow{vu} \in E} w(\overrightarrow{vu})$. In a directed graph, the ordinary random walk has transition matrix $P$ with $P(u, v) = w(\overrightarrow{uv})/d_w^{\rm out}(u)$. The stationary distribution $\pi$ has no easy description, but it is unique as long as the directed graph is strongly connected. There are different notions of edge conductance for directed graphs. In analyzing random walks, the standard definition is exactly $\varphi_1(G)$ as described in Definition 1.1, and this quantity is closely related to the mixing time of random walks; see e.g. [LP17, Chu05, MP05]. In analyzing graph partitioning, there is a definition that extends the edge conductance above to directed graphs, which will not be used in this paper; see e.g. [Yos16, LTW23].

Spectral Graph Theory: Given an undirected graph $G = (V, E)$ with a weight function $w : E \to \mathbb{R}_{\geq 0}$, its adjacency matrix $A = A(G)$ is a $|V| \times |V|$ matrix where the $(u, v)$-th entry is $w(uv)$. The Laplacian matrix is defined as $L := D - A$, where $D := \operatorname{diag}(\{\deg_w(v)\}_{v \in V})$ is the diagonal degree matrix. The normalized adjacency matrix is defined as $\mathcal{A} := D^{-1/2} A D^{-1/2}$, and the normalized Laplacian matrix is defined as $\mathcal{L} := I - \mathcal{A}$. Let $\lambda_1(\mathcal{L}) \leq \lambda_2(\mathcal{L}) \leq \cdots \leq \lambda_n(\mathcal{L})$ be the eigenvalues of $\mathcal{L}$. It is known that $\lambda_1(\mathcal{L}) = 0$ with eigenvector $D^{1/2}\vec{1}$, and

$$\lambda_2(\mathcal{L}) = \min_{g \perp D^{1/2}\vec{1}} \frac{g^T \mathcal{L} g}{g^T g} = \min_{f \perp D\vec{1}} \frac{f^T L f}{f^T D f} = \min_{f \perp D\vec{1}} \frac{\sum_{uv \in E} w(uv) \cdot (f(u) - f(v))^2}{\sum_v \deg_w(v) \cdot f(v)^2}.$$

Cheeger’s inequality [Che70, AM85, Alo86] is a fundamental result in spectral graph theory that connects edge conductance of an undirected graph G=(V,E)G=(V,E) to the second smallest eigenvalue of its normalized Laplacian matrix:

$$\frac{\lambda_2}{2} \leq \phi(G) \leq \sqrt{2\lambda_2}.$$

The random walk transition matrix $P$ is similar to the normalized adjacency matrix $\mathcal{A}$, and the matrix $I - P$ is similar to the normalized Laplacian matrix $\mathcal{L}$. In particular, $I - P$ enjoys the same spectral properties as $\mathcal{L}$, with real eigenvalues and a quadratic-form characterization of $\lambda_2$ as above; see Lemma 3.1.

Chung [Chu05] defined the Laplacian matrix of a directed graph and used it to prove an analog of Cheeger’s inequality. These will be stated in Section 3.2.

3 Positive Results

To prove Theorem 1.5, we follow standard spectral arguments used in proving Cheeger-type inequalities, in the style of Trevisan. First, we start with the second eigenvector $f_2 : V \to \mathbb{R}$ and truncate it so that the $\pi$-weight of its support is at most half while preserving its Rayleigh quotient. The proof of the following lemma is standard and we defer it to the end of this section.

Lemma 3.1.

Let $(V, P, \pi)$ be an irreducible and reversible Markov chain. Let $f_2$ be an eigenvector associated to the second smallest eigenvalue of the matrix $I - P$, with $\pi(\{v \mid f_2(v) > 0\}) \leq \frac{1}{2}$. Define the truncated vector $f$ such that $f(v) := \max\{f_2(v), 0\}$ for all $v \in V$. Then

$$\lambda_2(I - P) \geq \frac{\sum_{f(i) \geq f(j)} \pi(i) \cdot P(i, j) \cdot (f(i) - f(j))^2}{\sum_{i \in V} \pi(i) f(i)^2}.$$

Then the plan is to prove that one of the level sets has small isoperimetric constant. We index the vertices by $[n]$ and order them so that $f(i) \leq f(j)$ for $i \leq j$. For any $t \geq 0$, define the level set $S_t := \{i \in [n] \mid f(i)^2 > t\}$. By the construction of $f$, it holds that $\pi(S_t) \leq \frac{1}{2}$ for any $t \geq 0$, and so $\varphi_p(P) \leq \min_{t \geq 0} \varphi_p(S_t)$. For convenience, we rescale $f$ so that $\max_i f(i) = 1$.

To prove that one of the $S_t$ has small isoperimetric constant, we choose a uniformly random $t \in [0, 1]$ and consider $\varphi_p(S_t)$. We will bound the expected value of the numerator of $\varphi_p(S_t)$ and of the denominator of $\varphi_p(S_t)$, and conclude that there exists a $t$ with $\varphi_p(S_t)$ at most the ratio of the expected values, i.e. $\min_{t \geq 0} \varphi_p(S_t) \leq \mathbb{E}_t[\textrm{numerator of } \varphi_p(S_t)] \,/\, \mathbb{E}_t[\textrm{denominator of } \varphi_p(S_t)]$.
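The argument is constructive, and the sweep-cut procedure it analyzes can be written down directly. The following is a short sketch (our rendering of the argument, not code from the paper) for a reversible chain given by $P$ and $\pi$: compute the second eigenvector of $I - P$ via the symmetrized matrix, truncate at zero, and scan the level sets.

```python
import numpy as np

def sweep_cut(P, pi, p):
    """Return the minimum of phi_p(S_t) over level sets S_t of the truncated
    second eigenvector of I - P (assumes the chain is reversible)."""
    n = len(pi)
    s = np.sqrt(pi)
    # Pi^{1/2} P Pi^{-1/2} is symmetric for reversible chains and shares
    # the eigenvalues of P; symmetrize to guard against round-off.
    M = np.eye(n) - (s[:, None] * P) / s[None, :]
    M = (M + M.T) / 2
    _, U = np.linalg.eigh(M)
    f = U[:, 1] / s                      # second eigenvector of I - P
    if pi[f > 0].sum() > 0.5:            # make the positive side pi-light
        f = -f
    f = np.maximum(f, 0.0)               # truncation, as in Lemma 3.1
    best, S = np.inf, []
    for v in np.argsort(-f):             # level sets are prefixes of this order
        if f[v] <= 0:
            break
        S.append(v)
        comp = [u for u in range(n) if u not in S]
        num = sum(pi[u] * P[u, comp].sum() ** p for u in S)
        best = min(best, num / pi[S].sum())
    return best
```

By the calculation that follows, the returned value is at most $2\sqrt{\lambda_2/(2p-1)}$ for any $p > \frac{1}{2}$.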

The expected value of the denominator is easy to analyze. Since we choose $t$ uniformly at random, each vertex $i$ is included in $S_t$ with probability $f(i)^2$, and thus

$$\mathbb{E}_t[\pi(S_t)] = \sum_{i \in V} \pi(i) \cdot f(i)^2.$$

The rest of the proof is to analyze the expected value of the numerator $\sum_{i \in V} \pi(i) \cdot P(i, \overline{S_t})^p$. For a vertex $i$, if the random threshold $t$ is between $f(j)^2$ and $f(j-1)^2$ with $f(j) > f(j-1)$, then $P(i, \overline{S_t}) = P(i, [j-1])$, and so

$$\begin{aligned} \mathbb{E}_t\big[P(i, \overline{S_t})^p\big] &= \sum_{j=1}^{i} \big(f(j)^2 - f(j-1)^2\big) \cdot P(i, [j-1])^p \\ &= \sum_{j=1}^{i} \big(f(j)^2 - f(j-1)^2\big) \cdot \sum_{l=1}^{j-1} \big(P(i, [l])^p - P(i, [l-1])^p\big) \\ &= \sum_{l=1}^{i-1} \big(f(i)^2 - f(l)^2\big) \cdot \big(P(i, [l])^p - P(i, [l-1])^p\big), \end{aligned}$$

where the second equality is by writing a telescoping sum and the third equality is by a change of the order of summation. So, the expected numerator $\mathbb{E}_t[\sum_{i=1}^n \pi(i) \cdot P(i, \overline{S_t})^p]$ is

$$\begin{aligned} &\sum_{i=1}^{n} \sum_{j=1}^{i-1} \pi(i) \cdot \big(f(i)^2 - f(j)^2\big) \cdot \big(P(i, [j])^p - P(i, [j-1])^p\big) \\ &\leq \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{i-1} \pi(i) \cdot (f(i) - f(j))^2 \cdot P(i, j)} \cdot \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{i-1} \pi(i) \cdot (f(i) + f(j))^2 \cdot \frac{\big(P(i, [j])^p - P(i, [j-1])^p\big)^2}{P(i, j)}} \\ &\leq \sqrt{\lambda_2 \cdot \sum_{i=1}^{n} \pi(i) \cdot f(i)^2} \cdot \sqrt{\sum_{i=1}^{n} 4 \cdot \pi(i) \cdot f(i)^2 \cdot \sum_{j=1}^{i-1} \frac{\big(P(i, [j])^p - P(i, [j-1])^p\big)^2}{P(i, [j]) - P(i, [j-1])}}, \end{aligned}$$

where the first inequality is by Cauchy-Schwarz, and the second inequality is by Lemma 3.1 and the facts that $(f(i) + f(j))^2 \leq 4 f(i)^2$ and $P(i, j) = P(i, [j]) - P(i, [j-1])$.

To upper bound the inner sum of the second term, we denote $a_j := P(i, [j])$, and it suffices to upper bound a sum of the form $\sum_{j=1}^{n} (a_j^p - a_{j-1}^p)^2 / (a_j - a_{j-1})$ with $a_0 = 0$ and $a_n \leq 1$, with a bound independent of $n$. Let $C(n, a)$ denote the supremum of the sum when $a_n = a$. Note that $C(n, a) = a^{2p-1} \cdot C(n, 1)$ by a simple scaling argument. Let $(a_i)_{i=1}^{n}$ be an optimal sequence that achieves the supremum of $C(n, 1)$. Then,

$$C(n, 1) = C(n-1, a_{n-1}) + \frac{(1 - a_{n-1}^p)^2}{1 - a_{n-1}} = a_{n-1}^{2p-1} \cdot C(n-1, 1) + \frac{(1 - a_{n-1}^p)^2}{1 - a_{n-1}} \leq a_{n-1}^{2p-1} \cdot C(n, 1) + \frac{(1 - a_{n-1}^p)^2}{1 - a_{n-1}}.$$

It follows that

$$C(n, 1) \leq \sup_{a \in [0,1]} \frac{(1 - a^p)^2}{(1 - a)(1 - a^{2p-1})} \leq \sup_{a \in [0,1]} \frac{(1 - a^p)^2}{(2p - 1)(1 - a)^2} \leq \frac{1}{2p - 1},$$

where the second inequality is by the mean value theorem, which gives $1 - a^{2p-1} \geq (2p-1)(1 - a)$, and the last inequality is because $a \in [0,1]$ and $p \in (\frac{1}{2}, 1]$. Clearly, $C(n, 1) \geq C(n, a_n)$ for any $a_n \in [0,1]$, and so the inner sum of the second term in the expected numerator is at most $\frac{1}{2p-1}$. Putting everything together, this completes the proof of Theorem 1.5, as

$$\varphi_p(P) \leq \min_{t > 0} \varphi_p(S_t) \leq \frac{\mathbb{E}_t[\sum_{i=1}^n \pi(i) \cdot P(i, \overline{S_t})^p]}{\mathbb{E}_t[\pi(S_t)]} \leq 2\sqrt{\lambda_2} \cdot \sqrt{\frac{1}{2p-1}},$$

which implies that $\big(\varphi_p(P)\big)^2 \leq \frac{4}{2p-1} \cdot \lambda_2$.
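The supremum bound used in the last step is easy to probe numerically. The following check (our addition) evaluates the ratio on a fine grid and confirms it stays below $\frac{1}{2p-1}$ for several values of $p$.

```python
import numpy as np

a = np.linspace(1e-6, 1 - 1e-6, 100_000)
for p in (0.6, 0.75, 0.9, 1.0):
    ratio = (1 - a**p) ** 2 / ((1 - a) * (1 - a**(2 * p - 1)))
    print(p, ratio.max(), "<=", 1 / (2 * p - 1))
```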

3.1 Recovering Morris and Peres’s Result

To recover Theorem 1.3, we follow the same arguments but add a truncation step, so that the sequence $a_i$ above will start with $a_0 \approx \varphi_{1/2}(P)$. In this subsection, we plug in $p = \frac{1}{2}$. As above, the main work is to upper bound the expected value of the numerator. Recall that

$$\mathbb{E}_t\Big[\sqrt{P(i, \overline{S_t})}\Big] = \sum_{j=1}^{i-1} \big(f(i)^2 - f(j)^2\big) \cdot \Big(\sqrt{P(i, [j])} - \sqrt{P(i, [j-1])}\Big).$$

Let $l_i$ be the index such that $\sqrt{P(i, [l_i])} \leq \frac{1}{2}\varphi_{1/2}(P)$ but $\sqrt{P(i, [l_i + 1])} > \frac{1}{2}\varphi_{1/2}(P)$. Then, we can upper bound the right hand side by

$$\mathbb{E}_t\Big[\sqrt{P(i, \overline{S_t})}\Big] \leq \frac{1}{2}\varphi_{1/2}(P) \cdot f(i)^2 + \big(f(i)^2 - f(l_i+1)^2\big)\Big(\sqrt{P(i, [l_i+1])} - \frac{1}{2}\varphi_{1/2}(P)\Big) + \sum_{j=l_i+2}^{i-1} \big(f(i)^2 - f(j)^2\big)\Big(\sqrt{P(i, [j])} - \sqrt{P(i, [j-1])}\Big).$$

To shorten the expression, let us use the notations

$$a_{i,0} = \frac{1}{2}\varphi_{1/2}(P) \quad \textrm{and} \quad a_{i,j} = \sqrt{P(i, [l_i + j])}.$$

Summing over $i$ and using these notations, the expected numerator is

$$\mathbb{E}_t\Big[\sum_{i=1}^{n} \pi(i) \cdot \sqrt{P(i, \overline{S_t})}\Big] \leq \frac{1}{2}\varphi_{1/2}(P) \cdot \sum_{i=1}^{n} \pi(i) \cdot f(i)^2 + \underbrace{\sum_{i=1}^{n} \pi(i) \cdot \sum_{j=1}^{i-l_i-1} \big(f(i)^2 - f(l_i+j)^2\big) \cdot (a_{i,j} - a_{i,j-1})}_{(*)}.$$

Applying Cauchy-Schwarz as before gives

$$\begin{aligned} (*) &\leq \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{i-l_i-1} \pi(i) \cdot (f(i) - f(l_i+j))^2 \cdot P(i, l_i+j)} \cdot \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{i-l_i-1} \pi(i) \cdot (f(i) + f(l_i+j))^2 \cdot \frac{\big(a_{i,j} - a_{i,j-1}\big)^2}{P(i, l_i+j)}} \\ &\leq \sqrt{\lambda_2 \cdot \sum_{i=1}^{n} \pi(i) \cdot f(i)^2} \cdot \sqrt{\sum_{i=1}^{n} 4 \cdot \pi(i) \cdot f(i)^2 \cdot \underbrace{\sum_{j=1}^{i-1} \frac{a_{i,j} - a_{i,j-1}}{a_{i,j} + a_{i,j-1}}}_{(**)}}, \end{aligned}$$

where the second inequality uses Lemma 3.1 and $P(i, l_i+j) = a_{i,j}^2 - a_{i,j-1}^2$.

To upper bound $(**)$, fix $i$ and let $b_j := a_{i,j}$, so that $\frac{1}{2}\varphi_{1/2}(P) = b_0 \leq b_1 \leq \ldots \leq b_m \leq 1 =: b_{m+1}$. It then suffices to upper bound the function

$$f : (b_0, b_1, \ldots, b_m) \to \frac{b_1 - b_0}{b_1 + b_0} + \frac{b_2 - b_1}{b_2 + b_1} + \cdots + \frac{b_m - b_{m-1}}{b_m + b_{m-1}} + \frac{1 - b_m}{1 + b_m}.$$

The partial derivative of $f$ is

$$\frac{\partial f}{\partial b_i} = \frac{2 b_{i-1}}{(b_i + b_{i-1})^2} - \frac{2 b_{i+1}}{(b_{i+1} + b_i)^2} = \frac{2 (b_{i+1} - b_{i-1})(b_{i-1} b_{i+1} - b_i^2)}{(b_i + b_{i-1})^2 (b_{i+1} + b_i)^2}.$$

Since $b_{i+1} - b_{i-1} > 0$ by definition, the function increases in $b_i$ up until $b_i^2 = b_{i-1} b_{i+1}$ and then decreases. So, the maximum is attained when $b_i = (b_0)^{\frac{m+1-i}{m+1}}$ with $b_0 = \frac{1}{2}\varphi_{1/2}(P)$ and $b_{m+1} = 1$, in which case the sum is

$$\sum_{i=1}^{m+1} \frac{b_i - b_{i-1}}{b_i + b_{i-1}} = \sum_{i=1}^{m+1} \frac{1 - \frac{b_{i-1}}{b_i}}{1 + \frac{b_{i-1}}{b_i}} = \sum_{i=1}^{m+1} \frac{1 - b_0^{\frac{1}{m+1}}}{1 + b_0^{\frac{1}{m+1}}} = (m+1) \cdot \frac{1 - b_0^{\frac{1}{m+1}}}{1 + b_0^{\frac{1}{m+1}}}.$$

For $b_0 \in [0,1]$, this value increases as $m$ increases, and so the sum is upper bounded by

$$(**) \leq \lim_{x \to \infty} x \cdot \frac{1 - b_0^{\frac{1}{x}}}{1 + b_0^{\frac{1}{x}}} = \lim_{x \to \infty} x \cdot \frac{1 - b_0^{\frac{1}{x}}}{2} = \lim_{y \to 0} \frac{1 - b_0^y}{2y} = \lim_{y \to 0} \frac{-b_0^y \log b_0}{2} = \frac{1}{2}\log\frac{1}{b_0} = \frac{1}{2}\log\frac{2}{\varphi_{1/2}(P)},$$

where the third-to-last equality is by L'Hôpital's rule. Plugging this bound on $(**)$ back into $(*)$, the expected numerator is

$$\mathbb{E}_t\Big[\sum_{i=1}^{n} \pi(i) \cdot \sqrt{P(i, \overline{S_t})}\Big] \leq \frac{1}{2}\varphi_{1/2}(P) \cdot \sum_{i=1}^{n} \pi(i) \cdot f(i)^2 + \sqrt{\lambda_2 \cdot \sum_{i=1}^{n} \pi(i) \cdot f(i)^2} \cdot \sqrt{\sum_{i=1}^{n} 2\log\frac{2}{\varphi_{1/2}(P)} \cdot \pi(i) \cdot f(i)^2}.$$

As before, the expected denominator is $\mathbb{E}_t[\pi(S_t)] = \sum_{i \in V} \pi(i) \cdot f(i)^2$. Putting everything together,

$$\varphi_{1/2}(P) \leq \min_{t > 0} \varphi_{1/2}(S_t) \leq \frac{\mathbb{E}_t\Big[\sum_{i=1}^n \pi(i) \cdot \sqrt{P(i, \overline{S_t})}\Big]}{\mathbb{E}_t[\pi(S_t)]} \leq \frac{1}{2}\varphi_{1/2}(P) + \sqrt{2\log\frac{2}{\varphi_{1/2}(P)} \cdot \lambda_2}.$$

Rearranging recovers Theorem 1.3.
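The limit used to bound $(**)$ can also be checked numerically (our addition): the quantity $(m+1)\frac{1 - b_0^{1/(m+1)}}{1 + b_0^{1/(m+1)}}$ increases with $m$ and approaches $\frac{1}{2}\log\frac{1}{b_0}$.

```python
import numpy as np

b0 = 0.01                                  # plays the role of phi_{1/2}(P)/2
for m in (1, 10, 100, 10_000, 10**6):
    r = b0 ** (1 / (m + 1))
    print(m, (m + 1) * (1 - r) / (1 + r))  # increasing in m
print("limit:", 0.5 * np.log(1 / b0))      # matches the displayed limit
```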

3.2 Non-Reversible Markov Chains

Houdré and Tetali only formulated Conjecture 1.2 for reversible Markov chains, for which the eigenvalues of $I - P$ are real. For non-reversible Markov chains, we observe that Chung's definition of eigenvalues for directed graphs [Chu05] can be used to obtain the same results as in Theorem 1.3 and Theorem 1.5.

Given a directed graph $G = (V, E)$ with a weight function $w : E \to \mathbb{R}_{\geq 0}$, let $P_G$ be the transition matrix of the ordinary random walk on $G$, with $P_G(u, v) = w(uv) / \sum_{v' \in V} w(uv')$ for each edge $uv \in E$. Suppose $G$ is strongly connected; then there is a unique stationary distribution $\pi : V \to \mathbb{R}_+$ such that $\pi^T P = \pi^T$. Let $\Pi = \operatorname{diag}(\pi)$. Chung defined the Laplacian of the directed graph $G$ as

$$\vec{L}_G := I - \frac{1}{2}\Big(\Pi^{\frac{1}{2}} P \Pi^{-\frac{1}{2}} + \Pi^{-\frac{1}{2}} P^T \Pi^{\frac{1}{2}}\Big).$$

Since $\vec{L}_G$ is a real symmetric matrix, its eigenvalues are real. Let $\lambda_2$ be the second smallest eigenvalue of $\vec{L}_G$. Chung [Chu05] proved an analog of Cheeger's inequality:

$$\frac{1}{2}\varphi_1(G)^2 \leq \lambda_2(\vec{L}_G) \leq 2\varphi_1(G).$$
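Chung's construction is straightforward to carry out in code. The following sketch (our addition, following the definition quoted above) builds $\vec{L}_G$ from the transition matrix of a strongly connected chain and returns $\lambda_2(\vec{L}_G)$.

```python
import numpy as np

def chung_lambda2(P):
    """lambda_2 of Chung's directed Laplacian for a strongly connected P."""
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    w, V = np.linalg.eig(P.T)
    pi = np.real(V[:, np.argmin(np.abs(w - 1))])
    pi = pi / pi.sum()
    s = np.sqrt(pi)
    A = s[:, None] * P / s[None, :]        # Pi^{1/2} P Pi^{-1/2}
    L = np.eye(len(pi)) - (A + A.T) / 2
    return np.sort(np.linalg.eigvalsh(L))[1]

# Example: a non-reversible walk on a directed 3-cycle with a chord.
P = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.0]])
print(chung_lambda2(P))
```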

We observe that $\lambda_2(\vec{L}_G)$ can be used to extend our results to non-reversible Markov chains.

Theorem 3.2.

Let $(V, P, \pi)$ be an irreducible Markov chain. For any $p \in (\frac{1}{2}, 1]$,

$$\big(\varphi_p(P)\big)^2 \leq \frac{4}{2p-1} \cdot \lambda_2(\vec{L}_G).$$

For $p = \frac{1}{2}$,

$$\lambda_2(\vec{L}_G) \gtrsim \frac{\big(\varphi_{1/2}(P)\big)^2}{\log\big(1/\varphi_{1/2}(P)\big)}.$$

Note that the main proofs of Theorem 1.5 and Theorem 1.3 (i.e. computing the expected numerator) did not require the Markov chain to be reversible. The reversibility assumption was only used in characterizing the second eigenvalue in Lemma 3.1. The following is an analog of Lemma 3.1 for non-reversible Markov chains, using Chung's definition of the second eigenvalue of directed graphs.

Lemma 3.3.

Let $(V, P, \pi)$ be an irreducible Markov chain. Let $v_2$ be an eigenvector associated to the second smallest eigenvalue of the matrix $\vec{L}_G$. Define the reweighted eigenvector $f_2 := \Pi^{-\frac{1}{2}} v_2$, with $\pi(\{v : f_2(v) \geq 0\}) \leq \frac{1}{2}$. Define the truncated vector $f := \max(f_2, 0)$. Then

$$\lambda_2(\vec{L}_G) \geq \frac{\sum_{u,v \in V : f(u) \geq f(v)} \pi(u) \cdot P(u, v) \cdot (f(u) - f(v))^2}{\sum_{v \in V} \pi(v) f(v)^2}.$$

With this lemma, we can follow the proofs of Theorem 1.5 and Theorem 1.3 verbatim as above, by defining the level sets $S_t$ using the truncated vector $f$ and computing the expected numerator and denominator as before.

This concludes the proof of Theorem 3.2. We will prove Lemma 3.3 in the next subsection.

3.3 Proofs of Auxiliary Lemmas

In this subsection, we prove Lemma 3.1 and Lemma 3.3. The proofs are standard, but we include them for completeness.

Proof of Lemma 3.1.

For $f, g : V \to \mathbb{R}$, define $\langle f, g \rangle_\pi := \sum_{i \in V} \pi(i) \cdot f(i) \cdot g(i)$. By the definition of the second eigenvector, $\langle (I - P) f_2, f_2 \rangle_\pi = \lambda_2 \langle f_2, f_2 \rangle_\pi$.

For $f := \max(f_2, 0)$, note that $P f \geq P f_2$, as

$$(P f)(i) = \sum_{j \in V} P(i, j) f(j) \geq \sum_{j \in V} P(i, j) f_2(j) = (P f_2)(i),$$

and thus

$$\langle P f, f \rangle_\pi = \sum_{i \in V : f_2(i) \geq 0} \pi(i) \cdot (P f)(i) \cdot f(i) \geq \sum_{i \in V : f_2(i) \geq 0} \pi(i) \cdot (P f_2)(i) \cdot f_2(i) = (1 - \lambda_2) \langle f, f \rangle_\pi,$$

where the last equality uses that $P f_2 = (1 - \lambda_2) f_2$. It follows that

$$\lambda_2 \geq \frac{\langle (I - P) f, f \rangle_\pi}{\langle f, f \rangle_\pi}.$$

The denominator is the same as the denominator in the statement. It remains to check that the numerator is also the same as the numerator in the statement. By direct calculation,

$$\begin{aligned} \langle (I - P) f, f \rangle_\pi &= \sum_{i \in V} \pi(i) f(i)^2 - \sum_{i \in V} \pi(i) \sum_{j \in V} P(i, j) f(j) f(i) \\ &= \sum_{i \in V} \sum_{j \in V} \pi(i) P(i, j) f(i)^2 - \sum_{i \in V} \sum_{j \in V} \pi(i) P(i, j) f(i) f(j) \\ &= \sum_{i \in V} \sum_{j \in V} \pi(i) P(i, j) \Big(\frac{1}{2}\big(f(i)^2 + f(j)^2\big) - f(i) f(j)\Big) \\ &= \sum_{i > j} \pi(i) P(i, j) \big(f(i) - f(j)\big)^2, \end{aligned}$$

where the second equality uses $\sum_{j \in V} P(i, j) = 1$, and the third equality uses reversibility, which gives $\pi(i) P(i, j) = \pi(j) P(j, i)$ for all $i, j \in V$ and hence $\sum_{i,j} \pi(i) P(i, j) f(i)^2 = \sum_{i,j} \pi(i) P(i, j) f(j)^2$. ∎

To prove Lemma 3.3, we will use the following facts about $\lambda_2(\vec{L}_G)$ from [Chu05].

Lemma 3.4 ([Chu05]).

Let $G = (V, E)$ be a strongly connected directed graph and $\pi$ its stationary distribution. The second smallest eigenvalue $\lambda_2$ of the directed Laplacian $\vec{L}_G$ satisfies

$$\lambda_2 = \inf_{f \perp \pi} \frac{\sum_{u,v \in V} \pi(u) \cdot P(u, v) \cdot |f(u) - f(v)|^2}{\sum_{v \in V} \pi(v) \cdot |f(v)|^2}.$$

Suppose $v_2$ is an eigenvector of $\vec{L}_G$ associated with eigenvalue $\lambda_2$. Then, for the reweighted eigenvector $f_2 := \Pi^{-\frac{1}{2}} v_2$ and all $u \in V$,

$$\lambda_2 \cdot f_2(u) \cdot \pi(u) = \frac{1}{2} \sum_{v} \big(f_2(u) - f_2(v)\big) \cdot \big(\pi(u) P(u, v) + \pi(v) P(v, u)\big).$$
Proof of Lemma 3.3.

We claim that the truncated vector $f := \max\{f_2, 0\}$ satisfies

$$\lambda_2 \cdot f(u) \cdot \pi(u) \geq \frac{1}{2} \sum_{v} \big(f(u) - f(v)\big) \cdot \big(\pi(u) P(u, v) + \pi(v) P(v, u)\big)$$

for all $u \in V$. Indeed, for $u$ such that $f(u) > 0$,

$$\begin{aligned} \lambda_2 \cdot f(u) \cdot \pi(u) &= \lambda_2 \cdot f_2(u) \cdot \pi(u) \\ &= \frac{1}{2} \sum_{v \in V} \big(f_2(u) - f_2(v)\big) \cdot \big(\pi(u) P(u, v) + \pi(v) P(v, u)\big) \\ &\geq \frac{1}{2} \sum_{v \in V} \big(f(u) - f(v)\big) \cdot \big(\pi(u) P(u, v) + \pi(v) P(v, u)\big), \end{aligned}$$

where the second equality is by the fact above, and the last inequality is by $f_2(u) - f_2(v) \geq f(u) - f(v)$ for all $u, v \in V$, due to truncation. For $u$ such that $f(u) = 0$, the inequality holds trivially because

$$\lambda_2 \cdot f(u) \cdot \pi(u) = 0 \geq \frac{1}{2} \sum_{v} \big(-f(v)\big) \cdot \big(\pi(u) P(u, v) + \pi(v) P(v, u)\big),$$

as $f(v) \geq 0$ for all $v$ by truncation. Thus the claim follows. Multiplying both sides of the claim by $f(u)$ and then summing over all $u$ gives

$$\begin{aligned} \lambda_2 \cdot \sum_{u \in V} f(u)^2 \pi(u) &\geq \frac{1}{2} \sum_{u \in V} f(u) \sum_{v \in V} \big(f(u) - f(v)\big) \cdot \big(\pi(u) P(u, v) + \pi(v) P(v, u)\big) \\ &= \frac{1}{2} \sum_{u \in V} \sum_{v \in V} \pi(u) \cdot P(u, v) \cdot \big(f(u)^2 + f(v)^2 - 2 f(u) f(v)\big) \\ &= \frac{1}{2} \sum_{u \in V} \sum_{v \in V} \pi(u) \cdot P(u, v) \cdot \big(f(u) - f(v)\big)^2. \end{aligned}$$

This is equivalent to the statement, where the sum is over pairs with $f(u) \geq f(v)$. ∎

4 Counterexamples

In this section, we prove Theorem 1.4 by constructing a family of counterexamples and bounding their second eigenvalues and their $\varphi_{1/2}$ values. The construction is simple.

Definition 4.1 (Counterexamples).

Let $G_n$ be a graph with vertex set $[n]$. For each $i, j \in [n]$ with $i \neq j$, the edge weight is

$$P(i, j) = \frac{1}{C \big(\min\{|i - j|, n - |i - j|\}\big)^3},$$

where $C = \sum_{j \neq i} 1/\min\{|i - j|, n - |i - j|\}^3$ (the same for every $i$) is the normalizing constant that makes the graph $1$-regular.
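The construction translates directly into code. The following sketch (our addition) builds the transition matrix of $G_n$ and confirms that it is $1$-regular.

```python
import numpy as np

def counterexample(n):
    """Transition matrix of G_n from Definition 4.1."""
    i = np.arange(n)
    d = np.abs(i[:, None] - i[None, :]).astype(float)
    d = np.minimum(d, n - d)               # cyclic distance min{|i-j|, n-|i-j|}
    np.fill_diagonal(d, np.inf)            # no self-loops; 1/inf**3 = 0
    P = 1.0 / d**3
    return P / P[0].sum()                  # divide by C so each row sums to 1

P = counterexample(101)
print(P.sum(axis=1))                       # all ones: the graph is 1-regular
```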

We will prove the two claims in Theorem 1.4, about the second smallest eigenvalue and about the $\varphi_{1/2}(G)$ value. First, we analyze the second smallest eigenvalue, using the fact that $I - P$ is a circulant matrix by construction.

Lemma 4.2.

For $G_n$ in Definition 4.1, the second smallest eigenvalue of $I - P$ satisfies

$$\lambda_2(I - P) \lesssim \frac{\log n}{n^2}.$$
Proof.

By our construction, the graph $G_n$ is cyclic, in the sense that $P(i, j) = P((i + k) \bmod n, (j + k) \bmod n)$ for all $i, j, k \in [n]$. So the matrix $I - P$ is a circulant matrix of the form

$$I - P = \begin{pmatrix} a_0 & a_1 & a_2 & \dots & a_{n-1} \\ a_{n-1} & a_0 & \cdots & \cdots & a_{n-2} \\ a_{n-2} & a_{n-1} & \ddots & \ddots & a_{n-3} \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ a_1 & a_2 & \dots & \dots & a_0 \end{pmatrix},$$

where $a_0 = 1$ and $a_j = -P(1, j+1)$ for all $j \in [n-1]$. It is well known that an $n \times n$ circulant matrix with first row $a \in \mathbb{R}^n$ has eigenvalues and corresponding eigenvectors

$$\Big\{\sum_{i=0}^{n-1} a_i \omega_k^i\Big\}_{k=0}^{n-1} \quad \textrm{and} \quad \Big\{\big(1, \omega_k, \ldots, \omega_k^{n-1}\big)^T\Big\}_{k=0}^{n-1},$$

where $\omega_k := e^{\frac{2\pi k \imath}{n}}$ are the $n$-th roots of unity for $0 \leq k \leq n-1$ (and $\imath$ denotes the imaginary unit).

So, the second smallest eigenvalue $\lambda_2$ of $I - P$ corresponds to the first $n$-th root of unity $\omega := \omega_1 = e^{\frac{2\pi\imath}{n}}$, and

$$\lambda_2 = \sum_{i=0}^{n-1} a_i \cdot \omega^i = \sum_{i=2}^{n} P(1, i) - \sum_{i=2}^{\lfloor n/2 \rfloor + 1} P(1, i) \cdot \omega^{i-1} - \sum_{i=\lfloor n/2 \rfloor + 2}^{n} P(1, i) \cdot \omega^{i-1}.$$

We consider two cases, depending on whether $n$ is odd or even. When $n = 2k+1$ is odd, note that by definition $P(1, i) = P(1, 2k+3-i)$ for $2 \leq i \leq k+1$, and so we can pair up the terms in the above equation to get

$$\lambda_2 = \sum_{i=2}^{k+1} P(1, i) \Big(2 - \omega^{i-1} - \frac{1}{\omega^{i-1}}\Big) = -\sum_{i=2}^{k+1} P(1, i) \bigg(\omega^{\frac{i-1}{2}} - \frac{1}{\omega^{\frac{i-1}{2}}}\bigg)^2.$$

Using $\omega^k = e^{\frac{2\pi k \imath}{n}} = \cos\frac{2k\pi}{n} + \imath \sin\frac{2k\pi}{n}$, it follows that

$$\lambda_2 = -\sum_{i=2}^{k+1} P(1, i) \Big(\omega^{\frac{i-1}{2}} - \overline{\omega}^{\frac{i-1}{2}}\Big)^2 = -\sum_{i=2}^{k+1} P(1, i) \Big(2\imath \sin\frac{(i-1)\pi}{n}\Big)^2 = 4 \sum_{i=2}^{k+1} P(1, i) \Big(\sin\frac{(i-1)\pi}{n}\Big)^2.$$

Finally, we use the facts that $\sin x < x$ and that $P(1, i) = \frac{1}{C \cdot \min\{|i-1|, n-|i-1|\}^3} < \frac{1}{(i-1)^3}$ for $2 \leq i \leq k+1$ (as $C \geq 1$) to conclude that

$$\lambda_2 < 4 \sum_{i=2}^{k+1} \frac{1}{(i-1)^3} \Big(\frac{(i-1)\pi}{n}\Big)^2 = 4 \sum_{i=1}^{k} \frac{1}{i} \cdot \frac{\pi^2}{n^2} \lesssim \frac{\log n}{n^2}.$$

When $n = 2k$ is even, the proof follows along the same lines, but we need to remove the term $i = \frac{n}{2} + 1$ from the sum because $\omega^{n/2} = -1$. However, it only contributes a term of $O(\frac{1}{n^3})$ to the sum, which is negligible. ∎
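Since $I - P$ is circulant, its full spectrum is the discrete Fourier transform of its first row, so Lemma 4.2 can be verified numerically (our addition, reusing `counterexample` from the sketch above):

```python
import numpy as np

for n in (101, 401, 1601):
    P = counterexample(n)
    first_row = np.eye(n)[0] - P[0]
    eigs = np.real(np.fft.fft(first_row))  # eigenvalues of the circulant I - P
    lam2 = np.sort(eigs)[1]
    print(n, lam2, np.log(n) / n**2)       # lam2 stays of order log(n)/n^2
```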

It remains to prove that $\varphi_{1/2}(P) \gtrsim \frac{\log n}{n}$. As this graph is symmetric, our intuition is that $\varphi_{1/2}(P)$ attains its minimum at the set $S = \{1, \ldots, \frac{n}{2}\}$. In this case, for each vertex $i \in S$,

$$\sqrt{P(i, \overline{S})} \gtrsim \sqrt{\sum_{j=i}^{i+\frac{n}{2}} \frac{1}{j^3}} \approx \frac{1}{i} - \frac{1}{i + \frac{n}{2}} \geq \frac{1}{2i},$$

which implies that

$$\varphi_{1/2}(S) \gtrsim \sum_{i=1}^{\frac{n}{2}} \frac{1}{n} \sqrt{P(i, \overline{S})} \gtrsim \frac{1}{n} \sum_{i=1}^{\frac{n}{2}} \frac{1}{i} \gtrsim \frac{\log n}{n}.$$
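This heuristic is also easy to confirm numerically (our addition, again reusing `counterexample`): the contiguous half cut has $\varphi_{1/2}$ of order $\frac{\log n}{n}$.

```python
import numpy as np

for n in (100, 400, 1600):
    P = counterexample(n)
    S, comp = np.arange(n // 2), np.arange(n // 2, n)
    num = np.sqrt(P[np.ix_(S, comp)].sum(axis=1)).sum() / n   # pi(v) = 1/n
    phi_half = num / (len(S) / n)
    print(n, phi_half, np.log(n) / n)      # same order of magnitude
```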

Our plan was to prove that $S$ indeed attains the minimum, but we do not have such a proof. Instead, we work with a slightly different lower bound, which satisfies a concavity property that allows us to argue that sets of consecutive vertices attain the minimum. It turns out that the proof is a bit long, and we present it in the next subsection.

4.1 Proof of φ12\varphi_{\frac{1}{2}} Lower Bound

First, we set up some notation for the proof. Let us partition the vertex set of $G_n$ into two sets $A$ and $B := V \setminus A$ with $|A| \leq |B|$. As the graph $G_n$ is cyclic, we can arrange the vertices $V = [n]$ in a clockwise manner, and without loss of generality we assume $1 \in A$ and $n \in B$. Let us divide the vertices of $A$ and $B$ into contiguous sets $A_1, B_1, A_2, B_2, \ldots, A_k, B_k$ in the cyclic representation, and denote their sizes by $a_i := |A_i|$ and $b_i := |B_i|$ for $1 \leq i \leq k$. More explicitly, for $1 \leq i \leq k$, the vertices in $A_i$ and $B_i$ are

$$A_i = \Big\{\sum_{j=1}^{i-1} a_j + \sum_{j=1}^{i-1} b_j + 1, \ldots, \sum_{j=1}^{i} a_j + \sum_{j=1}^{i-1} b_j\Big\}, \quad B_i = \Big\{\sum_{j=1}^{i} a_j + \sum_{j=1}^{i-1} b_j + 1, \ldots, \sum_{j=1}^{i} a_j + \sum_{j=1}^{i} b_j\Big\}.$$

For two disjoint subsets $S, T \subset V$, define $f(S, T) := \sum_{u \in S} \sqrt{P(u, T)}$. Note that $\varphi_{1/2}(A) = \frac{f(A, B)}{|A|}$, so our goal is to lower bound

$$f(A, B) = \sum_{i=1}^{k} f(A_i, B).$$

For two sets $S, T \in \{A_i\}_{i=1}^k \cup \{B_i\}_{i=1}^k$, define the contiguous block $[S, T]$ to be the block of sets from $S$ clockwise up until $T$, possibly going around. For example, $[B_k, A_2] := B_k \cup A_1 \cup B_1 \cup A_2$, and note that $[S, T] \neq [T, S]$, since the sets are counted clockwise.

With the notation set up, we start with a lower bound on $f(A_i, B)$ by a natural function, the logarithm of the sizes of contiguous blocks, which is the "slightly different lower bound" that we mentioned before this subsection.

Lemma 4.3.

For $1 \leq i \leq k$,

$$\sqrt{2C} \cdot f(A_i, B) \geq \sum_{j=1}^{k} \Big(\log\big(|[A_i, A_j]| + 1\big) + \log\big(|[B_i, B_{j-1}]| + 1\big) - \log\big(|[B_i, A_j]| + 1\big) - \log\big(|[A_i, B_{j-1}]| + 1\big)\Big),$$

where $C$ is the normalizing constant in Definition 4.1 and $|[S, T]|$ denotes the number of vertices in the block $[S, T]$.

Proof.

We prove the statement for $f(A_1, B)$. By the definitions of $f$, $A_i$, and $B_i$ stated above,

$$\sqrt{2C} \cdot f(A_1, B) = \sqrt{2C} \cdot \sum_{i \in A_1} \sqrt{\sum_{l=1}^{k} P(i, B_l)} = \sqrt{2C} \cdot \sum_{i=1}^{a_1} \sqrt{\sum_{l=1}^{k} \sum_{j=1}^{b_l} P\bigg(i, \sum_{m=1}^{l} a_m + \sum_{m=1}^{l-1} b_m + j\bigg)}.$$

By the definition of $P$ in Definition 4.1, $P(i, j) \geq \frac{1}{C |i - j|^3}$, and so

$$\sqrt{2C} \cdot f(A_1, B) \geq \sqrt{2} \cdot \sum_{i=1}^{a_1} \sqrt{\sum_{l=1}^{k} \sum_{j=1}^{b_l} \bigg(\sum_{m=1}^{l} a_m + \sum_{m=1}^{l-1} b_m + j - i\bigg)^{-3}} = \sqrt{2} \cdot \sum_{i=0}^{a_1-1} \sqrt{\sum_{l=1}^{k} \sum_{j=1}^{b_l} \bigg(\sum_{m=2}^{l} a_m + \sum_{m=1}^{l-1} b_m + j + i\bigg)^{-3}}.$$

We lower bound the inner sum by an integral, so that

$$\sqrt{2C} \cdot f(A_1, B) \geq \sqrt{2} \cdot \sum_{i=0}^{a_1-1} \sqrt{\sum_{l=1}^{k} \int_{1}^{b_l+1} \bigg(\sum_{m=2}^{l} a_m + \sum_{m=1}^{l-1} b_m + x + i\bigg)^{-3} dx} = \sum_{i=0}^{a_1-1} \Bigg(\sum_{l=1}^{k} \underbrace{\bigg(\sum_{m=2}^{l} a_m + \sum_{m=1}^{l-1} b_m + i + 1\bigg)^{-2}}_{\alpha_l} - \underbrace{\bigg(\sum_{m=2}^{l} a_m + \sum_{m=1}^{l} b_m + i + 1\bigg)^{-2}}_{\beta_l}\Bigg)^{1/2}.$$

Now we use the following simple inequality about decreasing sequences.

Claim 4.4.

Let $(\alpha_i)_{i=1}^k, (\beta_i)_{i=1}^k$ be positive real numbers such that $\alpha_1 \geq \beta_1 \geq \alpha_2 \geq \beta_2 \geq \dots \geq \alpha_k \geq \beta_k \geq 0$. Then

$$\sum_{i=1}^{k} \big(\alpha_i^2 - \beta_i^2\big) \geq \Big(\sum_{i=1}^{k} \big(\alpha_i - \beta_i\big)\Big)^2.$$
Proof.

The proof is by induction on $k$. For $k = 1$, the claim is clear, as $\alpha_1^2 - \beta_1^2 \geq (\alpha_1 - \beta_1)^2$. Suppose that the claim is true for $k$. Let $A = \sum_{i=2}^{k+1} (\alpha_i^2 - \beta_i^2)$ and $B = \sum_{i=2}^{k+1} (\alpha_i - \beta_i)$. For the induction step, we need to show that $\alpha_1^2 - \beta_1^2 + A \geq (\alpha_1 - \beta_1 + B)^2$. Since $A \geq B^2$ by induction, it suffices to show that $\alpha_1^2 - \beta_1^2 \geq (\alpha_1 - \beta_1)(\alpha_1 - \beta_1 + 2B)$, which is equivalent to $\beta_1 \geq B$. It follows from the decreasing property of the sequence that $B \leq \alpha_2 \leq \beta_1$, verifying the induction step. ∎

The $\sqrt{\alpha_l}$ and $\sqrt{\beta_l}$ in the right hand side above satisfy the assumptions of the claim, and thus

$$\sqrt{2C} \cdot f(A_1, B) \geq \sum_{l=1}^{k} \sum_{i=0}^{a_1-1} \Bigg(\bigg(\sum_{m=2}^{l} a_m + \sum_{m=1}^{l-1} b_m + i + 1\bigg)^{-1} - \bigg(\sum_{m=2}^{l} a_m + \sum_{m=1}^{l} b_m + i + 1\bigg)^{-1}\Bigg).$$

We again lower bound the inner sum by an integral, so that $\sqrt{2C} \cdot f(A_1, B)$ is at least

$$\begin{aligned} &\sum_{l=1}^{k} \int_{0}^{a_1} \Bigg(\bigg(\sum_{m=2}^{l} a_m + \sum_{m=1}^{l-1} b_m + x + 1\bigg)^{-1} - \bigg(\sum_{m=2}^{l} a_m + \sum_{m=1}^{l} b_m + x + 1\bigg)^{-1}\Bigg) dx \\ &= \sum_{l=1}^{k} \Bigg(\log\bigg(\sum_{m=1}^{l} a_m + \sum_{m=1}^{l-1} b_m + 1\bigg) - \log\bigg(\sum_{m=2}^{l} a_m + \sum_{m=1}^{l-1} b_m + 1\bigg) - \log\bigg(\sum_{m=1}^{l} a_m + \sum_{m=1}^{l} b_m + 1\bigg) + \log\bigg(\sum_{m=2}^{l} a_m + \sum_{m=1}^{l} b_m + 1\bigg)\Bigg) \\ &= \sum_{l=1}^{k} \Big(\log\big(|[A_1, A_l]| + 1\big) - \log\big(|[B_1, A_l]| + 1\big) - \log\big(|[A_1, B_l]| + 1\big) + \log\big(|[B_1, B_l]| + 1\big)\Big), \end{aligned}$$

using the definitions, e.g. $|[A_1, B_l]| = \sum_{j=1}^{l} (a_j + b_j)$. ∎

Next, we are going to sum up the lower bounds in Lemma 4.3 to obtain a lower bound on $f(A, B)$. To write the sum nicely, we use a simple observation on the signs of the logarithms in our sum. Let us call a contiguous block $[S, T]$ odd if it consists of an odd number of sets from $\{A_i\}_{i=1}^k \cup \{B_i\}_{i=1}^k$, and even otherwise. Note that the odd blocks are exactly those whose first and last sets come from the same partition $A$ or $B$, e.g. $[A_i, A_j]$ and $[B_i, B_j]$. With this definition, the lower bound on $f(A, B)$ can be written as follows.

Lemma 4.5.

Using the definitions and notations in this subsection,

$$\sqrt{2C} \cdot f(A, B) \geq \sum_{S \neq T : [S,T] \textrm{ odd}} \log\big(|[S, T]| + 1\big) - \sum_{S \neq T : [S,T] \textrm{ even}} \log\big(|[S, T]| + 1\big) - (k-1) \log(n+1),$$

where the sums are over $S, T \in \{A_i\}_{i=1}^k \cup \{B_i\}_{i=1}^k$.

Proof.

We sum the inequalities in Lemma 4.3 over $1 \leq i \leq k$. On the right hand side of the inequality in Lemma 4.3, all contiguous blocks starting from $A_i$ or $B_i$ appear in the sum, with the odd blocks positive and the even blocks negative. Thus, summing over all $i$, every contiguous block is counted once, as it is uniquely determined by its starting and ending sets, except for the whole cycle, which appears once on the right hand side for every $i$ with a negative sign. ∎

To prove a lower bound on the right hand side of Lemma 4.5, the idea is to use the following concavity property.

Lemma 4.6.

For $k \geq 2$, consider the function

$$h : (a_1, b_1, \dots, a_k, b_k) \rightarrow \sum_{S \neq T : [S,T] \textrm{ odd}} \log\big(|[S, T]| + 1\big) - \sum_{S \neq T : [S,T] \textrm{ even}} \log\big(|[S, T]| + 1\big) - (k-1) \log(n+1),$$

where the sums are over $S, T \in \{A_i\}_{i=1}^k \cup \{B_i\}_{i=1}^k$, and so $|[S, T]|$ depends on $a_1, b_1, \ldots, a_k, b_k$. Then the function

$$g : x \rightarrow h(x, b_1, s - x, b_2, \dots, a_k, b_k),$$

obtained by fixing non-negative integers $b_1, b_2, a_3, b_3, \dots, a_k, b_k$ as the sizes of the other sets and $s$ as the sum $a_1 + a_2$, is concave on $x \in [0, s]$.

Proof.

To prove concavity, we use the second derivative test: $g$ is concave if the second derivative $g''$ is non-positive. We write $g$ as $g_0 + g_1(x) + g_2(x)$, where $g_1(x)$ consists of all the log terms which contain $A_1$ but not $A_2$, and similarly $g_2(x)$ consists of all the log terms which contain $A_2$ but not $A_1$. The remaining terms are in $g_0$, which contain either both of $A_1$ and $A_2$ or neither. Note that these terms are independent of $x$, because if a block $[S, T]$ contains both $A_1$ and $A_2$ then its size $|[S, T]|$ is unchanged when we vary $x$, so these terms can be ignored when we compute derivatives.

Let us focus on $g_1(x)$ first. The blocks that contain $A_1$ but not $A_2$ must be of the form $[S, A_1]$ or $[S, B_1]$ for some set $S$. Let $\sigma([S, T])$ denote the parity of the block $[S, T]$. Note that the parities of $[S, A_1]$ and $[S, B_1]$ are different, and so

$$\begin{aligned} g_1''(x) &= \sum_{S} (-1)^{\sigma([S, A_1]) + 1} \Big(\big(\log(|[S, A_1]| + 1)\big)'' - \big(\log(|[S, B_1]| + 1)\big)''\Big) \\ &= \sum_{S} (-1)^{\sigma([S, A_1]) + 1} \Big(\big(\log(|[S, B_k]| + x + 1)\big)'' - \big(\log(|[S, B_k]| + x + b_1 + 1)\big)''\Big) \\ &= \sum_{S} (-1)^{\sigma([S, A_1])} \Big(\big(|[S, B_k]| + x + 1\big)^{-2} - \big(|[S, B_k]| + x + b_1 + 1\big)^{-2}\Big) \\ &= \sum_{S} (-1)^{\sigma([S, A_1])} \Big(b_1 \cdot \big(|[S, B_k]| + x + 1\big)^{-2} \cdot \big(|[S, B_k]| + x + b_1 + 1\big)^{-1} + b_1 \cdot \big(|[S, B_k]| + x + 1\big)^{-1} \cdot \big(|[S, B_k]| + x + b_1 + 1\big)^{-2}\Big), \end{aligned}$$

where the sum is over $S \in \{A_i\}_{i=1}^k \cup \{B_i\}_{i=1}^k$. In the special case when $S = A_1$, we abuse our notation and let $|[A_1, B_k]| = 0$ in this proof; all other cases are unchanged.

When $b_1 = 0$, the sum equals zero and we are done, so assume $b_1 \neq 0$. To see that $g_1''(x)$ is negative, we pair up the terms with $S = B_i$ and $S = A_{i+1}$, with indices taken modulo $k$, so that

$$\begin{aligned} \frac{1}{b_1} \cdot g_1''(x) = \sum_{i=1}^{k} \Big[ &\big(|[B_i, B_k]| + x + 1\big)^{-2} \cdot \big(|[B_i, B_k]| + x + b_1 + 1\big)^{-1} \\ &+ \big(|[B_i, B_k]| + x + 1\big)^{-1} \cdot \big(|[B_i, B_k]| + x + b_1 + 1\big)^{-2} \\ &- \big(|[A_{i+1}, B_k]| + x + 1\big)^{-2} \cdot \big(|[A_{i+1}, B_k]| + x + b_1 + 1\big)^{-1} \\ &- \big(|[A_{i+1}, B_k]| + x + 1\big)^{-1} \cdot \big(|[A_{i+1}, B_k]| + x + b_1 + 1\big)^{-2} \Big] < 0, \end{aligned}$$

where the inequality holds because $|[A_{i+1}, B_k]| < |[B_i, B_k]|$, and so each summand is negative (recall the special case that $|[A_1, B_k]| = 0$ in this proof).

The function $g_2(x)$ is handled analogously, in view of the symmetry of the second derivative of the logarithm. This proves that $g$ is concave. ∎

With the concavity property, we can apply a simple "swapping/merging" argument to reduce to the case when there is only one contiguous set, i.e. $k = 1$, and then finish the proof.

By concavity, the function $g(x)$ attains its minimum at one of the endpoints, and so

$$h(a_1, \dots, b_k) \geq \min\big\{h(0, b_1, a_1 + a_2, b_2, \dots, a_k, b_k),\ h(a_1 + a_2, b_1, 0, b_2, \dots, a_k, b_k)\big\}.$$

The next observation is that, when one set has size zero, we can merge the two adjacent sets from the same partition into one. More formally, suppose $b_1 = 0$ without loss of generality; we claim that

$$h(a_1, 0, a_2, b_2, \dots, a_k, b_k) = h(a_1 + a_2, b_2, \dots, a_k, b_k).$$

To see this, note that $|[S, A_1]|$ and $|[S, B_1]|$ have the same values but appear with different signs, so the terms involving them cancel each other out, and similarly the terms involving $|[B_1, S]|$ and $|[A_2, S]|$ cancel each other out. Therefore, in $h(a_1, 0, a_2, b_2, \ldots, a_k, b_k)$, there are no remaining terms ending with $A_1$ or $B_1$ and no remaining terms beginning with $B_1$ or $A_2$, and all the remaining terms are in one-to-one correspondence with the terms in $h(a_1 + a_2, b_2, \ldots, a_k, b_k)$.

This reduces $k$ by one. Repeating the same argument until $k = 1$, we see that

$$\sqrt{2C} \cdot f(A, B) \geq h(|A|, n - |A|) = \log(|A| + 1) + \log(n - |A| + 1) - \log(n + 1),$$

and thus

$$\varphi_{1/2}(G) = \min_{A : |A| \leq \frac{n}{2}} \frac{f(A, B)}{|A|} \gtrsim \min_{l : l \leq \frac{n}{2}} \frac{h(l, n - l)}{l} = \min_{l : l \leq \frac{n}{2}} \frac{\log(l + 1) + \log(n - l + 1) - \log(n + 1)}{l},$$

where we used that $C$ is upper bounded by an absolute constant.

It remains to lower bound the right hand side. Since $l \leq \frac{n}{2}$, it follows that $\log(n - l + 1) - \log(n + 1) \geq \log((n+2)/2) - \log(n+1) \geq -\log 2$, and so

$$\frac{\log(l+1) + \log(n-l+1) - \log(n+1)}{l} \geq \frac{\log\frac{l+1}{2}}{l} \gtrsim \frac{\log\frac{l+1}{2}}{\frac{l+1}{2}} \geq \frac{\log n}{n},$$

where the last inequality is because $\frac{\log n}{n}$ is decreasing for $n \geq 3$, while for $1 \leq l \leq 4$ the last inequality clearly holds when $n$ is large enough. This concludes the proof of Theorem 1.4.

Concluding Remarks and Open Questions

We believe that the same analysis of $\varphi_p(G)$ can be extended to other generalizations of Cheeger's inequality in [LGT12, KLL+13], and also to the directed edge conductance using the recent notions of reweighted eigenvalues in [OTZ22, KLT22, LTW23, LTW24]. We leave it as an open question to find a counterexample where the transition matrix is the simple random walk matrix of a graph.

Acknowledgements

We thank Christian Houdré and Prasad Tetali for their encouragement and the anonymous reviewers for their helpful comments.

References

  • [Alo86] Noga Alon. Eigenvalues and expanders. Combinatorica, 6:83–96, 1986.
  • [AM85] Noga Alon and Vitali Milman. $\lambda_1$, isoperimetric inequalities for graphs, and superconcentrators. Journal of Combinatorial Theory, Series B, 38(1):73–88, 1985.
  • [Che70] Jeff Cheeger. A lower bound for the smallest eigenvalue of the Laplacian. In Problems in Analysis, pages 195–199. Princeton University Press, 1970.
  • [Chu05] Fan Chung. Laplacians and the Cheeger inequality for directed graphs. Annals of Combinatorics, 9:1–19, 2005.
  • [CS16] Deeparnab Chakrabarty and C. Seshadhri. An $o(n)$ monotonicity tester for Boolean functions over the hypercube. SIAM Journal on Computing, 45(2), 2016.
  • [EG22] Ronen Eldan and Renan Gross. Concentration on the Boolean hypercube via pathwise stochastic analysis. Inventiones Mathematicae, 230:935–994, 2022.
  • [EKLM22] Ronen Eldan, Guy Kindler, Noam Lifshitz, and Dor Minzer. Isoperimetric inequalities made simpler. Technical report, arXiv preprint 2204.06686, 2022.
  • [HT04] Christian Houdré and Prasad Tetali. Isoperimetric invariants for product Markov chains and graph products. Combinatorica, 24:359–388, 2004.
  • [KLL+13] Tsz Chiu Kwok, Lap Chi Lau, Yin Tat Lee, Shayan Oveis Gharan, and Luca Trevisan. Improved Cheeger's inequality: analysis of spectral partitioning algorithms through higher order spectral gap. In Proceedings of the 45th Annual ACM Symposium on Theory of Computing (STOC), pages 11–20, 2013.
  • [KLL17] Tsz Chiu Kwok, Lap Chi Lau, and Yin Tat Lee. Improved Cheeger's inequality and analysis of local graph partitioning using vertex expansion and expansion profile. SIAM Journal on Computing, 46(3):890–910, 2017.
  • [KLT22] Tsz Chiu Kwok, Lap Chi Lau, and Kam Chuen Tung. Cheeger inequalities for vertex expansion and reweighted eigenvalues. In Proceedings of the 62nd IEEE Annual Symposium on Foundations of Computer Science (FOCS), pages 366–377, 2022.
  • [KMS18] Subhash Khot, Dor Minzer, and Muli Safra. On monotonicity testing and Boolean isoperimetric-type theorems. SIAM Journal on Computing, 47(6), 2018.
  • [LGT12] James R. Lee, Shayan Oveis Gharan, and Luca Trevisan. Multi-way spectral partitioning and higher-order Cheeger inequalities. In Proceedings of the 44th Annual ACM Symposium on Theory of Computing (STOC), pages 1117–1130, 2012.
  • [LP17] David Levin and Yuval Peres. Markov Chains and Mixing Times, volume 107. American Mathematical Society, 2017.
  • [LTW23] Lap Chi Lau, Kam Chuen Tung, and Robert Wang. Cheeger inequalities for directed graphs and hypergraphs using reweighted eigenvalues. In Proceedings of the 55th Annual ACM Symposium on Theory of Computing (STOC), pages 1834–1847, 2023.
  • [LTW24] Lap Chi Lau, Kam Chuen Tung, and Robert Wang. Fast algorithms for directed graph partitioning using flows and reweighted eigenvalues. In Proceedings of the 2024 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 591–624, 2024.
  • [Mar77] Grigory Margulis. Probabilistic characteristics of graphs with large connectivity. Problemy Peredachi Informatsii, 10(2):101–108, 1977.
  • [MP05] Ben Morris and Yuval Peres. Evolving sets, mixing and heat kernel bounds. Probability Theory and Related Fields, 133:245–266, 2005.
  • [OTZ22] Sam Olesker-Taylor and Luca Zanetti. Geometric bounds on the fastest mixing Markov chain. In Proceedings of the 13th Innovations in Theoretical Computer Science Conference (ITCS), 2022.
  • [Tal93] Michel Talagrand. Isoperimetry, logarithmic Sobolev inequalities on the discrete cube, and Margulis' graph connectivity theorem. Geometric and Functional Analysis, 3(3):295–314, 1993.
  • [Yos16] Yuichi Yoshida. Nonlinear Laplacian for digraphs and its applications to network analysis. In Proceedings of the 9th ACM International Conference on Web Search and Data Mining (WSDM), pages 483–492, 2016.