This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Short-step Methods Are Not Strongly Polynomial-Time

Manru Zong The Hong Kong Polytechnic University. E–mail: [email protected]    Yin Tat Lee University of Washington. E–mail: [email protected]    Man-Chung Yue The Hong Kong Polytechnic University. E–mail: [email protected]
Abstract

Short-step methods are an important class of algorithms for solving convex constrained optimization problems. In this short paper, we show that under very mild assumptions on the self-concordant barrier and the width of the 2\ell_{2}-neighbourhood, any short-step interior-point method is not strongly polynomial-time.

1 Introduction

An algorithm for solving linear programming problems is said to be strongly polynomial-time if the required number of arithmetic operations is a polynomial in the numbers of variables and constraints, independent of the bit-length for encoding the problem instance. A major open problem in optimization and computer science is whether there exists a strongly polynomial-time algorithm for solving linear programming problems.

This paper is concerned with interior-point methods [17], which are an important class of algorithms for solving convex constrained optimization problems and enjoy great success in both theory and practice [9, 13, 17, 19]. In particular, we focus on the sub-class of path-following interior-point methods [17]. These methods are built upon the concept of self-concordant barriers [17]. A self-concordant barrier allows us to define a curve, called the central path, leading from a certain center of the feasible region to an optimal solution. Roughly speaking, the principle of path-following interior-point methods is to approximately trace the central path to move towards the optimal solution. The geometry of the central path therefore plays an important role in the studies of path-following interior-point methods.

By constructing a family of linear programming problems with ill-behaved central paths, it has been shown in [7] that the number of iterations of any path-following interior-point method is lower bounded by Ω(m/(logm)3)\Omega(\sqrt{m/(\log m)^{3}}), where mm is the number of constraints. Based on another family of linear programming problems, the recent paper [3] improved the lower bound to Ω(2m)\Omega(2^{m}), see also [4]. This lower bound applies to and thus denies the strong polynomiality of a large class of path-following interior-point methods, including the short-step methods associated with the logarithmic barrier [10, 15], the long-step methods [11] and the predictor-corrector methods [14]. More instances of ill-behaved central paths are presented in [6, 8].

One way to generalize the short-step methods is to replace the logarithmic barrier by an arbitrary self-concordant barrier [17], such as the volumetric barrier [20], universal barrier [13, 17], entropic barrier [5] and Lee-Sidford barrier [12]. Such an idea has led to improved algorithms for solving linear programming problems, see [12, 20] for example.

In view of the above-mentioned results, it is interesting to ask whether the short-step method associated with a general self-concordant barrier is strongly polynomial-time. The purpose of this short paper is to settle this question negatively, see Corollary 1. Our study is also motivated by the paper [2], which proved that for a family of linear programming problems, slightly different from the one in [3], a certain limit of the central path associated with the entropic barrier coincides with the central path associated with the logarithmic barrier. This suggests that the exponential lower bound in [3] for short-step methods associated with the logarithmic barrier could possibly be generalized to short-step methods associated with a general self-concordant barrier. Our main result Corollary 1 partly fills this gap.

Finally, we should also point out that some sub-classes of linear programming problems do admit strongly polynomial-time algorithms [1, 18, 22].

2 Preliminaries

Consider the linear programming problem

minimizecxsubject toAxb,\begin{array}[]{c@{\quad}l}\mbox{minimize}&c^{\top}x\\ \vskip 3.0pt plus 1.0pt minus 1.0pt\cr\mbox{subject to}&Ax\leq b,\end{array} (P)

where cnc\in\mathbb{R}^{n}, bmb\in\mathbb{R}^{m} and Am×nA\in\mathbb{R}^{m\times n}. Denote the feasible region of problem (P) and its interior by \mathcal{F} and \mathcal{F}^{\circ}, respectively. We assume that \mathcal{F} is bounded and that \mathcal{F}^{\circ} is non-empty. For simplicity, we denote the optimality gap of a feasible point xx\in\mathcal{F} by

𝗀𝖺𝗉(x)=cxminxcx.\mathsf{gap}(x)=c^{\top}x-\min_{x^{\prime}\in\mathcal{F}}c^{\top}x^{\prime}.

The notion of self-concordant barriers plays vital role in the study of interior-point methods. Let 𝒦n\mathcal{K}\subseteq\mathbb{R}^{n} be a proper convex domain, i.e., a convex set with non-empty interior and containing no 1-dimensional affine subspace. A function ϕ:int(𝒦)\phi:\mathrm{int}(\mathcal{K})\to\mathbb{R} is said to be a barrier on 𝒦\mathcal{K} if ϕ(x)+asx𝒦\phi(x)\to+\infty\quad\text{as}\quad x\to\partial\mathcal{K}. A three times continuously differentiable convex function ϕ\phi is said to be self-concordant on 𝒦\mathcal{K} if for any xint(𝒦)x\in\mathrm{int}(\mathcal{K}) and hnh\in\mathbb{R}^{n},

|D3ϕ(x)[h,h,h]|2(D2ϕ(x)[h,h])32.\left|\mathrm{D}^{3}\phi(x)[h,h,h]\right|\leq 2\left(\mathrm{D}^{2}\phi(x)[h,h]\right)^{\frac{3}{2}}.

If ϕ\phi additionally satisfies that for any xint(𝒦)x\in\mathrm{int}(\mathcal{K}) and hnh\in\mathbb{R}^{n},

|Dϕ(x)[h]|(νD2ϕ(x)[h,h])12,\left|\mathrm{D}\phi(x)[h]\right|\leq\left(\nu\,\mathrm{D}^{2}\phi(x)[h,h]\right)^{\frac{1}{2}},

then ϕ\phi is said to be ν\nu-self-concordant on 𝒦\mathcal{K}. This paper focuses on the proper convex domain 𝒦=\mathcal{K}=\mathcal{F}, which is a polytope defined by the linear inequalities AxbAx\leq b. A standard self-concordant barrier on \mathcal{F} is the logarithmic barrier

ϕln(x)=i=1mln(bAx)i.\phi_{\ln}(x)=-\sum_{i=1}^{m}\ln(b-Ax)_{i}.

Given a self-concordant barrier ϕ\phi on \mathcal{F}, we consider the problem

minimizecx+μϕ(x)subject toAx<b.\begin{array}[]{c@{\quad}l}\mbox{minimize}&c^{\top}x+\mu\,\phi(x)\\ \vskip 3.0pt plus 1.0pt minus 1.0pt\cr\mbox{subject to}&Ax<b.\end{array} (Pμ)

From [16, Section 5.3.4], for any μ>0\mu>0, there exists a unique minimizer xϕ(μ)x^{\phi}(\mu) to problem (Pμ). We call xϕ(μ)x^{\phi}(\mu) the μ\mu-analytic center of problem (P) associated with the self-concordant barrier ϕ\phi. The central path of problem (P) associated with ϕ\phi is then defined as (the image of) the curve μxϕ(μ)\mu\mapsto x^{\phi}(\mu) for μ>0\mu>0. By [16, Theorem 5.3.10], xϕ(μ)x^{\phi}(\mu) converges to an optimal solution to problem (P) as μ0\mu\to 0.

By [16, Theorem 5.1.6], 2ϕ(x)\nabla^{2}\phi(x) is positive definite for any xx\in\mathcal{F}^{\circ}. This allows us to define a norm

hx=h2ϕ(x)h,hn,\|h\|_{x}=\sqrt{h^{\top}\nabla^{2}\phi(x)h},\quad h\in\mathbb{R}^{n},

and its dual norm

hx=h(2ϕ(x))1h,hn.\|h\|_{x}^{*}=\sqrt{h^{\top}\left(\nabla^{2}\phi(x)\right)^{-1}h},\quad h\in\mathbb{R}^{n}.

For any θ(0,1)\theta\in(0,1) and μ>0\mu>0, the 2\ell_{2}-neighbourhood of problem (P) is defined as

𝒩θϕ(μ)={x:c+μϕ(x)xθμ}.\mathcal{N}^{\phi}_{\theta}(\mu)=\left\{x\in\mathcal{F}^{\circ}:\left\|c+\mu\,\nabla\phi(x)\right\|_{x}^{*}\leq\theta\mu\right\}.

Note that c+μϕ(x)c+\mu\nabla\phi(x) is the gradient of the objective function of problem (Pμ). The 2\ell_{2}-neighbourhood is important and a natural choice for the design of path-following interior-point methods since the Newton’s method applied to problem (Pμ) converges quadratically to the optimal solution xϕ(μ)x^{\phi}(\mu) [16, Theorem 5.2.2]. We write

𝒩θϕ=μ>0𝒩θϕ(μ),\mathcal{N}^{\phi}_{\theta}=\bigcup_{\mu>0}\mathcal{N}^{\phi}_{\theta}(\mu),

which is also called the 2\ell_{2}-neighbourhood of problem (P). By the optimality condition of problem (Pμ), we can see that xϕ(μ)𝒩θϕ(μ)x^{\phi}(\mu)\in\mathcal{N}^{\phi}_{\theta}(\mu) for any θ(0,1)\theta\in(0,1) and μ>0\mu>0.

A short-step method associated with ϕ\phi is defined as an algorithm that generates a sequence of iterates {xk}k0\{x^{k}\}_{k\geq 0} such that the polygonal (i.e., continuous piecewise linear) curve formed using the sequence {xk}k0\{x^{k}\}_{k\geq 0} is contained in the 2\ell_{2}-neighbourhood 𝒩θϕ\mathcal{N}^{\phi}_{\theta} of problem (P) for some θ(0,1)\theta\in(0,1), where kk is the iteration counter, or more precisely,

k0[xk,xk+1]𝒩θϕ.\bigcup_{k\geq 0}[x^{k},x^{k+1}]\subseteq\mathcal{N}^{\phi}_{\theta}.

Another popular choice of the neighbourhood for path-following algorithms for solving linear programming problems is the so-called wide neighbourhood [21]:

𝒲θ(μ)={x:\displaystyle\mathcal{W}_{\theta}(\mu)=\Big{\{}x\in\mathcal{F}^{\circ}: y+m such that Ay=c,y(bAx)=mμ,\displaystyle\,\exists\,y\in\mathbb{R}_{+}^{m}\text{ such that }A^{\top}y=-c,\ y^{\top}(b-Ax)=m\mu,
 and yi(bAx)i(1θ)μ,i=1,,m}.\displaystyle\,\text{ and }y_{i}\,(b-Ax)_{i}\geq(1-\theta)\mu,\ i=1,\dots,m\Big{\}}.

Similarly to the 2\ell_{2}-neighbourhood, we write

𝒲θ=μ>0𝒲θ(μ).\mathcal{W}_{\theta}=\bigcup_{\mu>0}\mathcal{W}_{\theta}(\mu).

3 Main Results

Our non-strong polynomiality result is based on the following family of linear programs introduced in [3]:

minimizex1subject tox1t2,x2t,x2j+1tx2j1,j=1,,r1,x2j+1tx2j,j=1,,r1,x2j+2t11/2j(x2j1+x2j),j=1,,r1,x2r1,x2r0,\begin{array}[]{c@{\quad}l}\mbox{minimize}&x_{1}\\ \vskip 3.0pt plus 1.0pt minus 1.0pt\cr\mbox{subject to}&x_{1}\leq t^{2},\\ &x_{2}\leq t,\\ &x_{2j+1}\leq t\,x_{2j-1},\quad j=1,\dots,r-1,\\ &x_{2j+1}\leq t\,x_{2j},\quad j=1,\dots,r-1,\\ &x_{2j+2}\leq t^{1-1/2^{j}}(x_{2j-1}+x_{2j}),\quad j=1,\dots,r-1,\\ &x_{2r-1},\,x_{2r}\geq 0,\end{array} LWr(t){\textbf{LW}}_{r}(t)

where t>1t>1 is a real number and r1r\geq 1 is an integer. The notation LWr(t){\textbf{LW}}_{r}(t) follows from [3] and signifies that the central path of this linear program is long and winding.

To distinguish the properties of problem LWr(t){\textbf{LW}}_{r}(t) from those of a general linear program, we introduce an extra subscript tt to the notations. For instances, the feasible region, the μ\mu-analytic center associated with a self-concordant barrier ϕ\phi and the wide neighbourhood of problem LWr(t){\textbf{LW}}_{r}(t) are denoted as t\mathcal{F}_{t}, xtϕ(μ)x^{\phi}_{t}(\mu) and 𝒲θ,t\mathcal{W}_{\theta,t}, respectively.

We can now present the main results of this paper, whose proofs are deferred to Section 4. The main contribution of this paper is the non-strong polynomiality of short-step methods.

Corollary 1 (Non-strong Polynomiality of Short-step Methods).

Consider LWr(t){\textbf{LW}}_{r}(t) for a sufficiently large t>1t>1. Let θ(0,12)\theta\in(0,\tfrac{1}{2}), ϕ\phi be a ν\nu-self-concordant barrier on t\mathcal{F}_{t} with ν\nu independent of tt and {xk}k0\{x^{k}\}_{k\geq 0} be a sequence of iterates generated by a short-step method associated with ϕ\phi. Suppose that

𝗀𝖺𝗉(x0)320rνtand𝗀𝖺𝗉(xK)1180r.\mathsf{gap}(x^{0})\geq 320r\nu\sqrt{t}\quad\text{and}\quad\mathsf{gap}(x^{K})\leq\frac{1}{180r}.

Then, K2r3K\geq 2^{r-3}.

We should emphasize that although the self-concordance parameter ν\nu in Corollary 1 is assumed to be independent of tt, it could possibly depend on the numbers of variables and constraints of problem LWr(t){\textbf{LW}}_{r}(t). This subsumes almost all known barriers. Furthermore, the constants in Corollary 1 are simplified for readability. They are not optimal and can be made slightly better (see Corollary 2). Finally, we note that the requirement on the initial optimality gap is only very mild. Indeed, it is customary for interior-point methods to start at the \infty-analytic center xϕ()x^{\phi}(\infty). Using Theorem 2, we can check that 𝗀𝖺𝗉(xϕ())=Ω(tν)\mathsf{gap}(x^{\phi}(\infty))=\Omega(\tfrac{t}{\nu}), see also the proof of Lemma 2.

The cruxes of the proof of Corollary 1 are the following theorems. The first shows a certain equivalence among the central paths of all self-concordant barriers. Geometrically speaking, it guarantees that when restricted to a constant-cost slice, all central paths are approximately equally close to (or away from) the boundary of the feasible region.

Theorem 1 (Equivalence of Central Paths).

Consider the linear program (P). Let ϕ\phi and ψ\psi be νϕ\nu_{\phi}- and νψ\nu_{\psi}-self-concordant barriers on the feasible region \mathcal{F}. Let μ,η>0\mu,\eta>0 be such that cxϕ(μ)=cxψ(η)c^{\top}x^{\phi}(\mu)=c^{\top}x^{\psi}(\eta). Then, for any i=1,,mi=1,\dots,m,

(1+νϕ+2νϕ)1(bAxϕ(μ))i(bAxψ(η))i(1+νψ+2νψ).(1+\nu_{\phi}+2\sqrt{\nu_{\phi}})^{-1}\leq\frac{(b-Ax^{\phi}(\mu))_{i}}{(b-Ax^{\psi}(\eta))_{i}}\leq(1+\nu_{\psi}+2\sqrt{\nu_{\psi}}).

The second one bounds the optimality gap of analytic centers.

Theorem 2 (Optimality Gap of Analytic Center).

Consider the linear program (P). Let ϕ\phi be a ν\nu-self-concordant barrier on the feasible region \mathcal{F}. Then, for any μ>0\mu>0,

min{μ2,ρc22ν+4ν}𝗀𝖺𝗉(xϕ(μ))μν,\min\left\{\frac{\mu}{2},\frac{\rho\|c\|_{2}}{2\nu+4\sqrt{\nu}}\right\}\leq\mathsf{gap}(x^{\phi}(\mu))\leq\mu\nu,

where ρ>0\rho>0 is the radius of the largest ball contained in \mathcal{F}.

It should be pointed out that the upper bound in Theorem 2 is not new and can be found in [16, Theorem 5.3.10]. The novelty of the theorem lies in the lower bound.

4 Proofs

In this section, we prove the main results. Throughout this section, given a vector x¯n\bar{x}\in\mathbb{R}^{n} and a positive definite matrix QQ, we let (Q,x¯)={xn:(xx¯)Q(xx¯)1}\mathcal{E}(Q,\bar{x})=\{x\in\mathbb{R}^{n}:(x-\bar{x})^{\top}Q(x-\bar{x})\leq 1\}. Also, for any ν>0\nu>0, we let Cν=ν+2νC_{\nu}=\nu+2\sqrt{\nu}.

4.1 Proof of Theorem 1

We first collect a simple lemma about linear optimization over ellipsoids, whose proof uses only elementary optimality arguments and is thus omitted.

Lemma 1.

Let x¯n\bar{x}\in\mathbb{R}^{n}, ana\in\mathbb{R}^{n} and Qn×nQ\in\mathbb{R}^{n\times n} be a positive definite matrix. Then,

maxx(Q,x¯)ax=ax¯+aQ1aandminx(Q,x¯)ax=ax¯aQ1a.\max_{x\in\mathcal{E}(Q,\bar{x})}a^{\top}x=a^{\top}\bar{x}+\sqrt{a^{\top}Q^{-1}a}\quad\text{and}\quad\min_{x\in\mathcal{E}(Q,\bar{x})}a^{\top}x=a^{\top}\bar{x}-\sqrt{a^{\top}Q^{-1}a}.

We can now prove Theorem 1.

Proof of Theorem 1.

Let ={xn:cx=cxϕ(μ)}\mathcal{H}=\{x\in\mathbb{R}^{n}:c^{\top}x=c^{\top}x^{\phi}(\mu)\}. By [16, Theorem 5.1.5], we have

(2ϕ(xϕ(μ)),xϕ(μ))and(2ψ(xψ(η)),xψ(η)).\mathcal{E}(\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))\subseteq\mathcal{F}\quad\text{and}\quad\mathcal{E}(\nabla^{2}\psi(x^{\psi}(\eta)),x^{\psi}(\eta))\subseteq\mathcal{F}.

Combining [16, Theorem 5.3.8] with these inclusions yields

(2ϕ(xϕ(μ)),xϕ(μ))(Cνϕ22ϕ(xϕ(μ)),xϕ(μ)),\displaystyle\mathcal{E}(\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))\cap\mathcal{H}\subseteq\mathcal{F}\cap\mathcal{H}\subseteq\mathcal{E}(C_{\nu_{\phi}}^{-2}\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))\cap\mathcal{H},

and

(2ψ(xψ(η)),xψ(η))(Cνψ22ψ(xψ(η)),xψ(η)),\displaystyle\mathcal{E}(\nabla^{2}\psi(x^{\psi}(\eta)),x^{\psi}(\eta))\cap\mathcal{H}\subseteq\mathcal{F}\cap\mathcal{H}\subseteq\mathcal{E}(C_{\nu_{\psi}}^{-2}\nabla^{2}\psi(x^{\psi}(\eta)),x^{\psi}(\eta))\cap\mathcal{H},

which implies, respectively,

xψ(η)(Cνϕ22ϕ(xϕ(μ)),xϕ(μ))andxϕ(μ)(Cνψ22ψ(xψ(η)),xψ(η)).x^{\psi}(\eta)\in\mathcal{E}(C_{\nu_{\phi}}^{-2}\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))\cap\mathcal{H}\quad\text{and}\quad x^{\phi}(\mu)\in\mathcal{E}(C_{\nu_{\psi}}^{-2}\nabla^{2}\psi(x^{\psi}(\eta)),x^{\psi}(\eta))\cap\mathcal{H}.

Using Lemma 1, for any i=1,,mi=1,\dots,m,

maxx(Cνϕ22ϕ(xϕ(μ)),xϕ(μ))(bAx)i=(bAxϕ(μ))i+Cνϕai(2ϕ(xϕ(μ)))1ai,\max_{x^{\prime}\in\mathcal{E}(C_{\nu_{\phi}}^{-2}\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))}(b-Ax^{\prime})_{i}=(b-Ax^{\phi}(\mu))_{i}+C_{\nu_{\phi}}\sqrt{a_{i}^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}a_{i}}, (1)

and

minx(Cνϕ22ϕ(xϕ(μ)),xϕ(μ))(bAx)i=(bAxϕ(μ))iCνϕai(2ϕ(xϕ(μ)))1ai,\min_{x^{\prime}\in\mathcal{E}(C_{\nu_{\phi}}^{-2}\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))}(b-Ax^{\prime})_{i}=(b-Ax^{\phi}(\mu))_{i}-C_{\nu_{\phi}}\sqrt{a_{i}^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}a_{i}}, (2)

where aia_{i}^{\top} is the ii-row of AA. Also, since AxbAx^{\prime}\leq b for any x(2ϕ(xϕ(μ)),xϕ(μ))x^{\prime}\in\mathcal{E}(\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))\subseteq\mathcal{F}, Lemma 1 implies that

0minx(2ϕ(xϕ(μ)),xϕ(μ))(bAx)i=(bAxϕ(μ))iai(2ϕ(xϕ(μ)))1ai.0\leq\min_{x^{\prime}\in\mathcal{E}(\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))}(b-Ax^{\prime})_{i}=(b-Ax^{\phi}(\mu))_{i}-\sqrt{a_{i}^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}a_{i}}.

It follows from (1), (2) and xψ(η)(Cνϕ22ϕ(xϕ(μ)),xϕ(μ))x^{\psi}(\eta)\in\mathcal{E}(C_{\nu_{\phi}}^{-2}\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu)) that

|(bAxϕ(μ))i(bAxψ(η))i|Cνϕai(2ϕ(xϕ(μ)))1aiCνϕ(bAxϕ(μ))i.\displaystyle\left|(b-Ax^{\phi}(\mu))_{i}-(b-Ax^{\psi}(\eta))_{i}\right|\leq C_{\nu_{\phi}}\sqrt{a_{i}^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}a_{i}}\leq C_{\nu_{\phi}}(b-Ax^{\phi}(\mu))_{i}.

Using the same arguments, we also obtain

|(bAxϕ(μ))i(bAxψ(η))i|Cνψai(2ψ(xψ(η)))1aiCνψ(bAxψ(η))i.\displaystyle\left|(b-Ax^{\phi}(\mu))_{i}-(b-Ax^{\psi}(\eta))_{i}\right|\leq C_{\nu_{\psi}}\sqrt{a_{i}^{\top}\left(\nabla^{2}\psi(x^{\psi}(\eta))\right)^{-1}a_{i}}\leq C_{\nu_{\psi}}(b-Ax^{\psi}(\eta))_{i}.

Thus, for any i=1,,mi=1,\dots,m,

(bAxψ(η))i(1+Cνϕ)(bAxϕ(μ))iand(bAxϕ(μ))i(1+Cνψ)(bAxψ(η))i,\displaystyle(b-Ax^{\psi}(\eta))_{i}\leq(1+C_{\nu_{\phi}})(b-Ax^{\phi}(\mu))_{i}\quad\text{and}\quad(b-Ax^{\phi}(\mu))_{i}\leq(1+C_{\nu_{\psi}})(b-Ax^{\psi}(\eta))_{i},

which completes the proof. ∎

4.2 Proof of Theorem 2

We next prove Theorem 2.

Proof of Theorem 2.

The upper bound follows directly from [16, Theorem 5.3.10]. Therefore, we prove only the lower bound. Two cases are discussed separately:

ϕ(xϕ(μ)))(2ϕ(xϕ(μ)))1ϕ(xϕ(μ)))14\displaystyle\,\nabla\phi(x^{\phi}(\mu)))^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}\nabla\phi(x^{\phi}(\mu)))\leq\frac{1}{4}
and ϕ(xϕ(μ)))(2ϕ(xϕ(μ)))1ϕ(xϕ(μ)))>14.\displaystyle\,\nabla\phi(x^{\phi}(\mu)))^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}\nabla\phi(x^{\phi}(\mu)))>\frac{1}{4}.

We first consider the case of

ϕ(xϕ(μ)))(2ϕ(xϕ(μ)))1ϕ(xϕ(μ)))14.\nabla\phi(x^{\phi}(\mu)))^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}\nabla\phi(x^{\phi}(\mu)))\leq\frac{1}{4}.

By [16, Lemma 5.1.5 and Theorem 5.2.1], we have

xϕ(μ)xϕ()xϕ()21+23xϕ(μ)xϕ()xϕ()(14)2114=112,\displaystyle\frac{\|x^{\phi}(\mu)-x^{\phi}(\infty)\|_{x^{\phi}(\infty)}^{2}}{1+\frac{2}{3}\|x^{\phi}(\mu)-x^{\phi}(\infty)\|_{x^{\phi}(\infty)}}\leq\frac{(\tfrac{1}{4})^{2}}{1-\tfrac{1}{4}}=\frac{1}{12},

where xϕ()x^{\phi}(\infty) is the \infty-analytic center, i.e., the unique optimal solution to the problem

minimizeϕ(x)subject toAx<b.\begin{array}[]{c@{\quad}l}\mbox{minimize}&\phi(x)\\ \vskip 3.0pt plus 1.0pt minus 1.0pt\cr\mbox{subject to}&Ax<b.\end{array}

Solving the quadratic inequality, we get

xϕ(μ)xϕ()xϕ()12.\|x^{\phi}(\mu)-x^{\phi}(\infty)\|_{x^{\phi}(\infty)}\leq\frac{1}{2}. (3)

Using Lemma 1 and inequality (3), we have that

𝗀𝖺𝗉(xϕ(μ))minx(162ϕ(xϕ()),xϕ())𝗀𝖺𝗉(x)=𝗀𝖺𝗉(xϕ())12c(2ϕ(xϕ()))1c.\begin{split}\mathsf{gap}(x^{\phi}(\mu))&\geq\min_{x\in\mathcal{E}(16\nabla^{2}\phi(x^{\phi}(\infty)),x^{\phi}(\infty))}\mathsf{gap}(x)\\ &=\mathsf{gap}(x^{\phi}(\infty))-\frac{1}{2}\sqrt{c^{\top}\left(\nabla^{2}\phi(x^{\phi}(\infty))\right)^{-1}c}.\end{split} (4)

Next, since (2ϕ(xϕ()),xϕ())\mathcal{E}(\nabla^{2}\phi(x^{\phi}(\infty)),x^{\phi}(\infty))\subseteq\mathcal{F}, Lemma 1 implies

0minx(2ϕ(xϕ()),xϕ())𝗀𝖺𝗉(x)=𝗀𝖺𝗉(xϕ())c(2ϕ(xϕ()))1c.0\leq\min_{x\in\mathcal{E}(\nabla^{2}\phi(x^{\phi}(\infty)),x^{\phi}(\infty))}\mathsf{gap}(x)\\ =\mathsf{gap}(x^{\phi}(\infty))-\sqrt{c^{\top}\left(\nabla^{2}\phi(x^{\phi}(\infty))\right)^{-1}c}. (5)

Combining the inequalities (4) and (5), we obtain

𝗀𝖺𝗉(xϕ(μ))12c(2ϕ(xϕ()))1c.\mathsf{gap}(x^{\phi}(\mu))\geq\frac{1}{2}\sqrt{c^{\top}\left(\nabla^{2}\phi(x^{\phi}(\infty))\right)^{-1}c}. (6)

On the other hand, by supposition, the feasible region \mathcal{F} contains a ball of radius ρ\rho. By [16, Theorem 5.3.9], it is contained in the ellipsoid (Cν22ϕ(xϕ()),xϕ())\mathcal{E}(C_{\nu}^{-2}\nabla^{2}\phi(x^{\phi}(\infty)),x^{\phi}(\infty)). Hence,

(ρ2I,xϕ())(Cν22ϕ(xϕ()),xϕ()),\mathcal{E}(\rho^{-2}I,x^{\phi}(\infty))\subseteq\mathcal{E}(C_{\nu}^{-2}\nabla^{2}\phi(x^{\phi}(\infty)),x^{\phi}(\infty)),

where the left-hand side is a ball with center xϕ()x^{\phi}(\infty) of radius ρ\rho. Using Lemma 1, we get

𝗀𝖺𝗉(xϕ())+ρc2=maxx(ρ2I,xϕ())𝗀𝖺𝗉(x)maxx(Cν22ϕ(xϕ()),xϕ())𝗀𝖺𝗉(x)\displaystyle\,\mathsf{gap}(x^{\phi}(\infty))+\rho\|c\|_{2}=\max_{x\in\mathcal{E}(\rho^{-2}I,x^{\phi}(\infty))}\mathsf{gap}(x)\leq\max_{x\in\mathcal{E}(C_{\nu}^{-2}\nabla^{2}\phi(x^{\phi}(\infty)),x^{\phi}(\infty))}\mathsf{gap}(x)
=\displaystyle= 𝗀𝖺𝗉(xϕ())+(ν+2ν)c(2ϕ(xϕ()))1c,\displaystyle\,\mathsf{gap}(x^{\phi}(\infty))+(\nu+2\sqrt{\nu})\sqrt{c^{\top}\left(\nabla^{2}\phi(x^{\phi}(\infty))\right)^{-1}c},

which, upon substitution into (6), yields

𝗀𝖺𝗉(xϕ(μ))ρc22(ν+2ν).\mathsf{gap}(x^{\phi}(\mu))\geq\frac{\rho\|c\|_{2}}{2(\nu+2\sqrt{\nu})}.

Then, we consider the case of

ϕ(xϕ(μ)))2ϕ(xϕ(μ))ϕ(xϕ(μ)))>14.\nabla\phi(x^{\phi}(\mu)))^{\top}\nabla^{2}\phi(x^{\phi}(\mu))\nabla\phi(x^{\phi}(\mu)))>\frac{1}{4}.

Using the optimality condition of problem (Pμ), we get c+μϕ(xϕ(μ))=0c+\mu\nabla\phi(x^{\phi}(\mu))=0. Therefore,

c(2ϕ(xϕ(μ)))1c>14μ2.c^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}c>\frac{1}{4}\mu^{2}. (7)

Since (2ϕ(xϕ(μ)),xϕ(μ))\mathcal{E}(\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))\subseteq\mathcal{F}, by Lemma 1,

0minx(2ϕ(xϕ(μ)),xϕ(μ))=𝗀𝖺𝗉(xϕ(μ))c(2ϕ(xϕ(μ)))1c,0\leq\min_{x\in\mathcal{E}(\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))}=\mathsf{gap}(x^{\phi}(\mu))-\sqrt{c^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}c},

which, together with (7), yields

𝗀𝖺𝗉(xϕ(μ))>μ2.\mathsf{gap}(x^{\phi}(\mu))>\frac{\mu}{2}.

This completes the proof. ∎

4.3 Some Extra Tools

The following proposition compares the slack of the μ\mu-analytic center with that of any point in its 2\ell_{2}-neighbourhood.

Proposition 1.

Consider the linear program (P). Let θ(0,(693)/10)\theta\in(0,(\sqrt{69}-3)/10), μ>0\mu>0, ϕ\phi be a self-concordant function on \mathcal{F} and x𝒩θϕ(μ)x\in\mathcal{N}^{\phi}_{\theta}(\mu). Then, for any i=1,,mi=1,\dots,m,

(1βθ)(bAxϕ(μ))i(bAx)i(1+βθ)(bAxϕ(μ))i,(1-\beta_{\theta})(b-Ax^{\phi}(\mu))_{i}\leq(b-Ax)_{i}\leq(1+\beta_{\theta})(b-Ax^{\phi}(\mu))_{i},

where βθ(0,1)\beta_{\theta}\in(0,1) is a constant depending only on θ\theta defined in (8).

Proof.

Let x𝒩θϕ(μ)x\in\mathcal{N}^{\phi}_{\theta}(\mu). By [16, Lemma 5.1.5 and Theorem 5.2.1] and the definition of 2\ell_{2}-neighbourhood,

xxϕ(μ)xϕ(μ)21+23xxϕ(μ)xϕ(μ)θ21θ.\displaystyle\frac{\|x-x^{\phi}(\mu)\|_{x^{\phi}(\mu)}^{2}}{1+\frac{2}{3}\|x-x^{\phi}(\mu)\|_{x^{\phi}(\mu)}}\leq\frac{\theta^{2}}{1-\theta}.

Solving the quadratic inequality yields xxϕ(μ)xϕ(μ)βθ\|x-x^{\phi}(\mu)\|_{x^{\phi}(\mu)}\leq\beta_{\theta}, where

βθ=13(θ21θ+θ4(1θ)2+9θ21θ).\beta_{\theta}=\frac{1}{3}\left(\frac{\theta^{2}}{1-\theta}+\sqrt{\frac{\theta^{4}}{(1-\theta)^{2}}+\frac{9\theta^{2}}{1-\theta}}\right). (8)

We note that βθ<1\beta_{\theta}<1 whenever θ(0,(693)/10)\theta\in(0,(\sqrt{69}-3)/10). Using Lemma 1, we get

maxx(βθ22ϕ(xϕ(μ)),xϕ(μ))(bAx)i=(bAxϕ(μ))i+βθai(2ϕ(xϕ(μ)))1ai,\max_{x^{\prime}\in\mathcal{E}(\beta_{\theta}^{-2}\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))}(b-Ax^{\prime})_{i}=(b-Ax^{\phi}(\mu))_{i}+\beta_{\theta}\sqrt{a_{i}^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}a_{i}},

and

minx(βθ22ϕ(xϕ(μ)),xϕ(μ))(bAx)i=(bAxϕ(μ))iβθai(2ϕ(xϕ(μ)))1ai0,\min_{x^{\prime}\in\mathcal{E}(\beta_{\theta}^{-2}\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))}(b-Ax^{\prime})_{i}=(b-Ax^{\phi}(\mu))_{i}-\beta_{\theta}\sqrt{a_{i}^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}a_{i}}\geq 0,

where the inequality follows from that AxbAx^{\prime}\leq b for any x(βθ22ϕ(xϕ(μ)),xϕ(μ))x^{\prime}\in\mathcal{E}(\beta_{\theta}^{-2}\nabla^{2}\phi(x^{\phi}(\mu)),x^{\phi}(\mu))\subseteq\mathcal{F}. Therefore, we have

|(bAx)i(bAxϕ(μ))i|βθai(2ϕ(xϕ(μ)))1aiβθ(bAxϕ(μ))i.\left|(b-Ax)_{i}-(b-Ax^{\phi}(\mu))_{i}\right|\leq\beta_{\theta}\sqrt{a_{i}^{\top}\left(\nabla^{2}\phi(x^{\phi}(\mu))\right)^{-1}a_{i}}\leq\beta_{\theta}(b-Ax^{\phi}(\mu))_{i}.

This completes the proof. ∎

The next proposition compares the optimality gap of the μ\mu-analytic center with that of any point in its 2\ell_{2}-neighbourhood.

Proposition 2.

Consider the linear program (P). Let θ(0,(693)/10)\theta\in(0,(\sqrt{69}-3)/10), μ>0\mu>0, ϕ\phi be a self-concordant function on \mathcal{F} and x𝒩θϕ(μ)x\in\mathcal{N}^{\phi}_{\theta}(\mu). Then,

(1βθ)𝗀𝖺𝗉(xϕ(μ))𝗀𝖺𝗉(x)(1+βθ)𝗀𝖺𝗉(xϕ(μ)).(1-\beta_{\theta})\mathsf{gap}(x^{\phi}(\mu))\leq\mathsf{gap}(x)\leq(1+\beta_{\theta})\mathsf{gap}(x^{\phi}(\mu)).

The proof of Proposition 2 uses the same arguments as in the proof of Proposition 1 and is therefore omitted.

We will need to compare the μ\mu-analytic center associated with ϕ\phi and the η\eta-analytic center associated with the logarithmic barrier ϕln\phi_{\ln} along the constant-cost slices of t\mathcal{F}_{t}.

Lemma 2.

Consider the linear program LWr(t){\textbf{LW}}_{r}(t). For any self-concordant barrier ϕ\phi on t\mathcal{F}_{t} and μ>0\mu>0 satisfying

𝗀𝖺𝗉(xϕ(μ))<t4(3r+1)+83r+1,\mathsf{gap}(x^{\phi}(\mu))<\frac{t}{4(3r+1)+8\sqrt{3r+1}},

there exists a unique η(μ)>0\eta(\mu)>0 such that

η(μ)2𝗀𝖺𝗉(xϕ(μ))=𝗀𝖺𝗉(xϕln(η(μ)))(3r+1)η(μ).\frac{\eta(\mu)}{2}\leq\mathsf{gap}(x^{\phi}(\mu))=\mathsf{gap}(x^{\phi_{\ln}}(\eta(\mu)))\leq(3r+1)\eta(\mu).
Proof.

We first prove the existence of η(μ)\eta(\mu) satisfying 𝗀𝖺𝗉(xϕ(μ))=𝗀𝖺𝗉(xϕln(η(μ)))\mathsf{gap}(x^{\phi}(\mu))=\mathsf{gap}(x^{\phi_{\ln}}(\eta(\mu))). Since μ𝗀𝖺𝗉(xϕ(μ))\mu\mapsto\mathsf{gap}(x^{\phi}(\mu)) and η𝗀𝖺𝗉(xϕln(η(μ)))\eta\mapsto\mathsf{gap}(x^{\phi_{\ln}}(\eta(\mu))) are both continuous and strictly increasing functions attaining the minimum value 0 at 0, it suffices to show that for any μ>0\mu>0 with

𝗀𝖺𝗉(xϕ(μ))<t4(3r+1)+83r+1,\mathsf{gap}(x^{\phi}(\mu))<\frac{t}{4(3r+1)+8\sqrt{3r+1}},

there exists η>0\eta>0 such that 𝗀𝖺𝗉(xϕ(μ))𝗀𝖺𝗉(xϕln(η))\mathsf{gap}(x^{\phi}(\mu))\leq\mathsf{gap}(x^{\phi_{\ln}}(\eta)). To show this, we note that the feasible region t\mathcal{F}_{t} contains a hypercube of side length tt, which in turn contains a ball of radius t2\frac{t}{2}. Also, the cost vector of problem LWr(t){\textbf{LW}}_{r}(t) is ct=(1,0,,0)c_{t}=(1,0,\dots,0)^{\top} and hence ct2=1\|c_{t}\|_{2}=1. Furthermore, it is well-known that the self-concordance parameter of the logarithmic barrier ϕln\phi_{\ln} on a polytope is precisely the number of linear inequalities defining the polytope, see [16, Section 5.3] for example. Therefore, ϕln\phi_{\ln} is (3r+1)(3r+1)-self-concordant on t\mathcal{F}_{t}. Theorem 2 then implies that for a large enough η>0\eta>0,

𝗀𝖺𝗉(xϕ(μ))<min{t4(3r+1)+83r+1,η2}𝗀𝖺𝗉(xϕln(η)).\displaystyle\mathsf{gap}(x^{\phi}(\mu))<\min\left\{\frac{t}{4(3r+1)+8\sqrt{3r+1}},\frac{\eta}{2}\right\}\leq\mathsf{gap}(x^{\phi_{\ln}}(\eta)).

Therefore, there exists a unique η(μ)>0\eta(\mu)>0 with 𝗀𝖺𝗉(xϕ(μ))=𝗀𝖺𝗉(xϕln(η(μ)))\mathsf{gap}(x^{\phi}(\mu))=\mathsf{gap}(x^{\phi_{\ln}}(\eta(\mu))).

It remains to prove the inequalities for 𝗀𝖺𝗉(xϕ(μ))=𝗀𝖺𝗉(xϕln(η(μ)))\mathsf{gap}(x^{\phi}(\mu))=\mathsf{gap}(x^{\phi_{\ln}}(\eta(\mu))). By Theorem 2,

min{t4(3r+1)+83r+1,η(μ)2}𝗀𝖺𝗉(xϕln(η(μ)))η(μ)(3r+1).\min\left\{\frac{t}{4(3r+1)+8\sqrt{3r+1}},\frac{\eta(\mu)}{2}\right\}\leq\mathsf{gap}(x^{\phi_{\ln}}(\eta(\mu)))\leq\eta(\mu)(3r+1).

Since

𝗀𝖺𝗉(xϕ(μ))=𝗀𝖺𝗉(xϕln(η(μ)))<t4(3r+1)+83r+1,\mathsf{gap}(x^{\phi}(\mu))=\mathsf{gap}(x^{\phi_{\ln}}(\eta(\mu)))<\frac{t}{4(3r+1)+8\sqrt{3r+1}},

we get

η(μ)2𝗀𝖺𝗉(xϕ(μ))=𝗀𝖺𝗉(xϕln(η(μ)))η(μ)(3r+1).\frac{\eta(\mu)}{2}\leq\mathsf{gap}(x^{\phi}(\mu))=\mathsf{gap}(x^{\phi_{\ln}}(\eta(\mu)))\leq\eta(\mu)(3r+1).

This completes the proof. ∎

We will also need the following remarkable result from [3, Theorem 29].

Theorem 3.

Let ω(0,1)\omega\in(0,1). Consider the linear program LWr(t){\textbf{LW}}_{r}(t) for

t>(2(5r1)(10r1)4((10r2)!)81ω)2r+2.t>\left(\frac{2(5r-1)(10r-1)^{4}((10r-2)!)^{8}}{1-\omega}\right)^{2^{r+2}}.

Suppose that

[x0,x1][x1,x2][xK1,xK]𝒲ω,t,[x^{0},x^{1}]\cup[x^{1},x^{2}]\cup\cdots\cup[x^{K-1},x^{K}]\subseteq\mathcal{W}_{\omega,t},

with x0𝒲ω,t(η0)x^{0}\in\mathcal{W}_{\omega,t}(\eta_{0}) and xK𝒲ω,t(ηK)x^{K}\in\mathcal{W}_{\omega,t}(\eta_{K}) for some η0t\eta_{0}\geq\sqrt{t} and ηK1\eta_{K}\leq 1. Then, K2r3K\geq 2^{r-3}.

Some remarks are in order. First, the original statement of [3, Theorem 29] concerns iterates of a longer vector consisting of primal, dual and slack variables. But Theorem 3 concerns only the primal variables and hence is seemingly stronger than [3, Theorem 29], because, for a polygonal curve in N\mathbb{R}^{N} with KK pieces, its projection to a lower dimensional space could possibly have less than KK pieces in general. However, upon a close inspection of the proof of [3, Theorem 29] and [3, Section 6.2], we find that Theorem 3 holds for the linear program LWr(t){\textbf{LW}}_{r}(t). Using the language of [3], the reason is that for the tropical central path of the linear program LWr(t){\textbf{LW}}_{r}(t), even the projection onto the primal variables is already a polygonal curve with a huge number of pieces. Second, in the original statement of [3, Theorem 29], the requirement on η0\eta_{0} is that η0t2\eta_{0}\geq t^{2}, and the conclusion is that K2r1K\geq 2^{r-1}. By considering a shorter part of the tropical central path of LWr(t){\textbf{LW}}_{r}(t), we find that replacing the requirement on η0\eta_{0} with the weaker bound of η0t\eta_{0}\geq\sqrt{t} leads the weaker conclusion of K2r3K\geq 2^{r-3}, see [3, Section 4.3].

4.4 Proof of Corollary 1

We now have enough tools at our disposal to prove Corollary 1. In fact, we will prove the following version with slightly better constants, from which Corollary 1 follows.

Corollary 2 (Precise Version of Corollary 1).

Consider LWr(t){\textbf{LW}}_{r}(t) for a sufficiently large t>1t>1. Let θ(0,(693)/10)\theta\in(0,(\sqrt{69}-3)/10), ϕ\phi be a ν\nu-self-concordant barrier on t\mathcal{F}_{t} with ν\nu independent of tt and {xk}k0\{x^{k}\}_{k\geq 0} be a sequence of iterates generated by a short-step method associated with ϕ\phi. Suppose that

𝗀𝖺𝗉(x0)(1+βθ)(3r+1)(1+Cν)t1βθand𝗀𝖺𝗉(xK)1βθ2(1+βθ)(1+C3r+1),\mathsf{gap}(x^{0})\geq\frac{(1+\beta_{\theta})(3r+1)(1+C_{\nu})\sqrt{t}}{1-\beta_{\theta}}\quad\text{and}\quad\mathsf{gap}(x^{K})\leq\frac{1-\beta_{\theta}}{2(1+\beta_{\theta})(1+C_{3r+1})},

where βθ(0,1)\beta_{\theta}\in(0,1) is a constant depending only on θ\theta defined in (8). Then, K2r3K\geq 2^{r-3}.

Proof.

Let

ω=1(1βθ)(1+βθ)(1+Cν)(1+C3r+1)\omega=1-\frac{(1-\beta_{\theta})}{(1+\beta_{\theta})(1+C_{\nu})(1+C_{3r+1})}

and

t>(2(5r1)(10r1)4((10r2)!)81ω)2r+2.t>\left(\frac{2(5r-1)(10r-1)^{4}((10r-2)!)^{8}}{1-\omega}\right)^{2^{r+2}}.

Note that ω(0,1)\omega\in(0,1).

Reduction

Without loss of generality, we can assume that for any x[x0,x1][xK1,xK]x\in[x^{0},x^{1}]\cup\cdots\cup[x^{K-1},x^{K}],

𝗀𝖺𝗉(x0)=(1+βθ)(3r+1)(1+Cν)t1βθ𝗀𝖺𝗉(x)=1βθ2(1+βθ)(1+C3r+1)=𝗀𝖺𝗉(xK).\mathsf{gap}(x^{0})=\frac{(1+\beta_{\theta})(3r+1)(1+C_{\nu})\sqrt{t}}{1-\beta_{\theta}}\geq\mathsf{gap}(x)=\frac{1-\beta_{\theta}}{2(1+\beta_{\theta})(1+C_{3r+1})}=\mathsf{gap}(x^{K}). (9)

Indeed, if this is not the case, because of the continuity of 𝗀𝖺𝗉\mathsf{gap} and the assumption that

𝗀𝖺𝗉(x0)(1+βθ)(3r+1)(1+Cν)t1βθand𝗀𝖺𝗉(xK)1βθ2(1+βθ)(1+C3r+1),\mathsf{gap}(x^{0})\geq\frac{(1+\beta_{\theta})(3r+1)(1+C_{\nu})\sqrt{t}}{1-\beta_{\theta}}\quad\text{and}\quad\mathsf{gap}(x^{K})\leq\frac{1-\beta_{\theta}}{2(1+\beta_{\theta})(1+C_{3r+1})},

we can choose a sub-curve (connected subset) of [x0,x1][xK1,xK][x^{0},x^{1}]\cup\cdots\cup[x^{K-1},x^{K}] that satisfies all the assumptions of Corollary 2 as well as assumption (9).

Our goal is to show that [x0,x1][xK1,xK]𝒲ω,t[x^{0},x^{1}]\cup\cdots\cup[x^{K-1},x^{K}]\subseteq\mathcal{W}_{\omega,t} and that x0𝒲ω,t(η0)x^{0}\in\mathcal{W}_{\omega,t}(\eta_{0}) and xK𝒲ω,t(ηK)x^{K}\in\mathcal{W}_{\omega,t}(\eta_{K}), for some η0t\eta_{0}\geq\sqrt{t} and ηK1\eta_{K}\leq 1. If this can be proved, the desired conclusion would then follow from Theorem 3.

Proving [x0,x1][xK1,xK]𝒲ω,t[x^{0},x^{1}]\cup\cdots\cup[x^{K-1},x^{K}]\subseteq\mathcal{W}_{\omega,t}

Let x[x0,x1][xK1,xK]x\in[x^{0},x^{1}]\cup\cdots\cup[x^{K-1},x^{K}]. Then, x𝒩θ,tϕ(μ)x\in\mathcal{N}^{\phi}_{\theta,t}(\mu) for some μ>0\mu>0. By assumption (9) and Proposition 2, we have

𝗀𝖺𝗉(xϕ(μ))(1+βθ)2(3r+1)(1+Cν)t1βθ<t4(3r+1)+83r+1.\mathsf{gap}(x^{\phi}(\mu))\leq\frac{(1+\beta_{\theta})^{2}(3r+1)(1+C_{\nu})\sqrt{t}}{1-\beta_{\theta}}<\frac{t}{4(3r+1)+8\sqrt{3r+1}}.

Using Lemma 2, there exists a unique η(μ)>0\eta({\mu})>0 such that

ctxtϕ(μ)=ctxtϕln(η(μ)).c_{t}^{\top}x_{t}^{\phi}(\mu)=c_{t}^{\top}x_{t}^{\phi_{\ln}}(\eta({\mu})).

By Proposition 1 and Theorem 1, for any i=1,,3r+1i=1,\dots,3r+1,

(btAtx)i(btAtxtϕln(η(μ)))i=(btAtx)i(btAtxtϕ(μ))i(btAtxtϕ(μ))i(btAtxtϕln(η(μ)))i(1βθ)(1+Cν),\begin{split}\frac{(b_{t}-A_{t}x)_{i}}{(b_{t}-A_{t}x^{\phi_{\ln}}_{t}(\eta(\mu)))_{i}}=\frac{(b_{t}-A_{t}x)_{i}}{(b_{t}-A_{t}x^{\phi}_{t}(\mu))_{i}}\cdot\frac{(b_{t}-A_{t}x^{\phi}_{t}(\mu))_{i}}{(b_{t}-A_{t}x^{\phi_{\ln}}_{t}(\eta(\mu)))_{i}}\geq\frac{(1-\beta_{\theta})}{(1+C_{\nu})},\end{split} (10)

From the definition of the logarithmic barrier ϕln\phi_{\ln} and the optimality conditions of problem (Pμ), there exists y+3r+1y\in\mathbb{R}_{+}^{3r+1} such that Aty=ctA_{t}^{\top}y=-c_{t} and

yi(btAtxtϕln(η(μ)))i=η(μ),i=1,,3r+1.y_{i}(b_{t}-A_{t}x^{\phi_{\ln}}_{t}(\eta(\mu)))_{i}=\eta(\mu),\quad i=1,\dots,3r+1.

Averaging these 3r+13r+1 equalities and then using Proposition 1, Theorem 1 and the fact that ϕln\phi_{\ln} is (3r+1)(3r+1)-self-concordant on t\mathcal{F}_{t}, we get

η(μ)=13r+1i=13r+1yi(btAtxtϕln(η(μ)))i(1+C3r+1)13r+1i=13r+1yi(btAtxtϕ(μ))i(1+C3r+1)1(1+βθ)1y(btAtx)3r+1.\begin{split}\eta(\mu)&=\frac{1}{3r+1}\sum_{i=1}^{3r+1}y_{i}(b_{t}-A_{t}x^{\phi_{\ln}}_{t}(\eta(\mu)))_{i}\\ &\geq\frac{(1+C_{3r+1})^{-1}}{3r+1}\sum_{i=1}^{3r+1}y_{i}(b_{t}-A_{t}x^{\phi}_{t}(\mu))_{i}\\ &\geq(1+C_{3r+1})^{-1}(1+\beta_{\theta})^{-1}\frac{y^{\top}(b_{t}-A_{t}x)}{3r+1}.\end{split} (11)

Hence, using (10) and (11), for any i=1,,3r+1i=1,\dots,3r+1,

yi(btAtx)i\displaystyle y_{i}(b_{t}-A_{t}x)_{i} (1βθ)(1+Cν)yi(btAtxtϕln(η(μ))i=(1βθ)(1+Cν)η(μ)\displaystyle\geq\frac{(1-\beta_{\theta})}{(1+C_{\nu})}\cdot y_{i}(b_{t}-A_{t}x^{\phi_{\ln}}_{t}(\eta(\mu))_{i}=\frac{(1-\beta_{\theta})}{(1+C_{\nu})}\cdot\eta(\mu) (12)
(1βθ)(1+βθ)(1+Cν)(1+C3r+1)y(btAtx)3r+1,\displaystyle\geq\frac{(1-\beta_{\theta})}{(1+\beta_{\theta})(1+C_{\nu})(1+C_{3r+1})}\cdot\frac{y^{\top}(b_{t}-A_{t}x)}{3r+1},

which implies that x𝒲ω,t(ηx)x\in\mathcal{W}_{\omega,t}(\eta_{x}) with

ηx=y(btAtx)3r+1.\eta_{x}=\frac{y^{\top}(b_{t}-A_{t}x)}{3r+1}. (13)

Proving x0𝒲ω,t(η0)x^{0}\in\mathcal{W}_{\omega,t}(\eta_{0}) and xK𝒲ω,t(ηK)x^{K}\in\mathcal{W}_{\omega,t}(\eta_{K}) for some η0t\eta_{0}\geq\sqrt{t} and ηK1\eta_{K}\leq 1

Using definition (13), inequality (11), Lemma 2 and Proposition 2, we get

ηxK\displaystyle\eta_{x^{K}} =y(btAtxK)3r+1(1+βθ)(1+C3r+1)η(μK)2(1+βθ)(1+C3r+1)𝗀𝖺𝗉(xϕ(μK))\displaystyle=\frac{y^{\top}(b_{t}-A_{t}x^{K})}{3r+1}\leq(1+\beta_{\theta})(1+C_{3r+1})\eta(\mu_{K})\leq 2(1+\beta_{\theta})(1+C_{3r+1})\mathsf{gap}(x^{\phi}(\mu_{K}))
2(1+βθ)(1+C3r+1)1βθ𝗀𝖺𝗉(xK)1,\displaystyle\leq\frac{2(1+\beta_{\theta})(1+C_{3r+1})}{1-\beta_{\theta}}\mathsf{gap}(x^{K})\ \leq 1,

where xK𝒩θ,tϕ(μK)x^{K}\in\mathcal{N}^{\phi}_{\theta,t}(\mu_{K}). Similarly, using definition (13), inequality (12), Lemma 2 and Proposition 2, we get

ηx0\displaystyle\eta_{x^{0}} =y(btAtx0)3r+11βθ1+Cνη(μ0)1βθ(3r+1)(1+Cν)𝗀𝖺𝗉(xϕ(μ0))\displaystyle=\frac{y^{\top}(b_{t}-A_{t}x^{0})}{3r+1}\geq\frac{1-\beta_{\theta}}{1+C_{\nu}}\eta(\mu^{0})\geq\frac{1-\beta_{\theta}}{(3r+1)(1+C_{\nu})}\mathsf{gap}(x^{\phi}(\mu_{0}))
1βθ(1+βθ)(3r+1)(1+Cν)𝗀𝖺𝗉(x0)t,\displaystyle\geq\frac{1-\beta_{\theta}}{(1+\beta_{\theta})(3r+1)(1+C_{\nu})}\mathsf{gap}(x^{0})\geq\sqrt{t},

where x0𝒩θ,tϕ(μ0)x^{0}\in\mathcal{N}^{\phi}_{\theta,t}(\mu_{0}) Taking η0=ηx0\eta_{0}=\eta_{x^{0}} and ηK=ηx0\eta_{K}=\eta_{x^{0}}, the proof is completed. ∎

Acknowledgments

The authors thank Defeng Sun, Kim-Chuan Toh and Stephen Wright for their helpful discussions.

References

  • [1] I. Adler and S. Cosares. A strongly polynomial algorithm for a special class of linear programs. Operations Research, 39(6):955–960, 1991.
  • [2] X. Allamigeon, A. Aznag, S. Gaubert, and Y. Hamdi. The tropicalization of the entropic barrier. arXiv preprint arXiv:2010.10205, 2020.
  • [3] X. Allamigeon, P. Benchimol, S. Gaubert, and M. Joswig. Log-barrier interior point methods are not strongly polynomial. SIAM Journal on Applied Algebra and Geometry, 2(1):140–178, 2018.
  • [4] X. Allamigeon, P. Benchimol, S. Gaubert, and M. Joswig. What tropical geometry tells us about the complexity of linear programming. SIAM Review, 63(1):123–164, 2021.
  • [5] S. Bubeck and R. Eldan. The entropic barrier: a simple and optimal universal self-concordant barrier. In Conference on Learning Theory, pages 279–279, 2015.
  • [6] A. Deza, T. Terlaky, and Y. Zinchenko. Polytopes and arrangements: Diameter and curvature. Operations Research Letters, 36(2):215–222, 2008.
  • [7] A. Deza, T. Terlaky, and Y. Zinchenko. Central path curvature and iteration-complexity for redundant Klee-Minty cubes. In Advances in Applied Mathematics and Global Optimization, pages 223–256. Springer, 2009.
  • [8] J. C. Gilbert, C. C. Gonzaga, and E. Karas. Examples of ill-behaved central paths in convex optimization. Mathematical Programming, 103(1):63–94, 2004.
  • [9] J. Gondzio. Interior point methods 25 years later. European Journal of Operational Research, 218(3):587–601, 2012.
  • [10] M. Kojima, S. Mizuno, and A. Yoshise. A polynomial-time algorithm for a class of linear complementarity problems. Mathematical Programming, 44(1):1–26, 1989.
  • [11] M. Kojima, S. Mizuno, and A. Yoshise. A primal-dual interior point algorithm for linear programming. In Progress in Mathematical Programming, pages 29–47. Springer, 1989.
  • [12] Y. T. Lee and A. Sidford. Solving linear programs with O~(rank)\tilde{O}(\sqrt{\text{rank}}) linear system solves. arXiv preprint arXiv:1910.08033, 2019.
  • [13] Y. T. Lee and M.-C. Yue. Universal barrier is nn-self-concordant. Mathematics of Operations Research, 46(3):1129–1148, 2021.
  • [14] S. Mizuno, M. J. Todd, and Y. Ye. On adaptive-step primal-dual interior-point algorithms for linear programming. Mathematics of Operations Research, 18(4):964–981, 1993.
  • [15] R. Monteiro and I. Adler. Interior path following primal-dual algorithms. Part I: Linear programming. Mathematical Programming, 44(1):27–41, 1989.
  • [16] Y. Nesterov. Lectures on Convex Optimization. Springer, 2018.
  • [17] Y. Nesterov and A. Nemirovskii. Interior-point Polynomial Algorithms in Convex Programming. SIAM, 1994.
  • [18] É. Tardos. A strongly polynomial algorithm to solve combinatorial linear programs. Operations Research, 34(2):250–256, 1986.
  • [19] K.-C. Toh, M. J. Todd, and R. H. Tütüncü. SDPT3 – A MATLAB software package for semidefinite programming, Version 1.3. Optimization Methods and Software, 11(1-4):545–581, 1999.
  • [20] P. M. Vaidya. A new algorithm for minimizing convex functions over convex sets. In 30th Annual Symposium on Foundations of Computer Science, pages 338–343. IEEE Computer Society, 1989.
  • [21] S. J. Wright. Primal-dual Interior-point Methods. SIAM, 1997.
  • [22] Y. Ye. The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Mathematics of Operations Research, 36(4):593–603, 2011.