Convex Minimization with Integer Minima in Time

A preliminary version of this paper appeared at the 35th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2024).
Given a convex function on with an integer minimizer, we show how to find an exact minimizer of using calls to a separation oracle and time. The previous best polynomial time algorithm for this problem given in [Jiang, SODA 2021, JACM 2022] achieves oracle complexity. However, the overall runtime of Jiang’s algorithm is at least , due to expensive sub-routines such as the Lenstra-Lenstra-Lovász (LLL) algorithm [Lenstra, Lenstra, Lovász, Math. Ann. 1982] and random walk based cutting plane method [Bertsimas, Vempala, JACM 2004]. Our significant speedup is obtained by a nontrivial combination of a faster version of the LLL algorithm due to [Neumaier, Stehlé, ISSAC 2016] that gives similar guarantees, the volumetric center cutting plane method (CPM) by [Vaidya, FOCS 1989] and its fast implementation given in [Jiang, Lee, Song, Wong, STOC 2020].
For the special case of submodular function minimization (SFM), our result implies a strongly polynomial time algorithm for this problem using calls to an evaluation oracle and additional arithmetic operations. Both the oracle complexity and the number of arithmetic operations of our more general algorithm are better than those of the previous best-known runtime algorithms for this specific problem given in [Lee, Sidford, Wong, FOCS 2015] and [Dadush, Végh, Zambelli, SODA 2018, MOR 2021].
1 Introduction
The problem of minimizing a convex function over , assuming access to a separation oracle , has gained considerable interest since the seminal work of Grötschel, Lovász, and Schrijver [22, 24]. The separation oracle is such that when queried with a point , it returns “YES” if minimizes , or else a hyperplane that separates from the minimizers of . One popular and successful approach for this problem is the cutting plane method, which dates back to the center of gravity method, independently discovered by Levin [39] and Newman [48] in the 1960s. Since then, cutting plane methods have undergone substantial developments and improvements over the past six decades in terms of their oracle complexity (the oracle complexity of an algorithm is the number of oracle calls made by the algorithm in the worst case) and runtime efficiency [58, 72, 37, 65, 49, 67, 4, 6, 44]. In particular, the current fastest cutting plane method is due to Jiang, Lee, Song, and Wong [34].
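To fix ideas, the following minimal sketch (in Python, with our own hypothetical names) shows one common way such an oracle is realized when a subgradient of the function is available; the paper itself treats the oracle abstractly.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Cut:
    """Separating halfspace {x : a^T x <= b} returned by the oracle.
    Every minimizer of f lies in this halfspace, while the queried point
    is (weakly) cut off."""
    a: np.ndarray
    b: float

def subgradient_separation_oracle(subgrad, x):
    """Separation oracle built from a subgradient oracle of a convex f.

    Returns None ("YES") if x minimizes f; otherwise a Cut: for any
    minimizer x*, convexity gives g^T (x* - x) <= f(x*) - f(x) <= 0,
    i.e. g^T x* <= g^T x, where g is a subgradient of f at x."""
    g = np.asarray(subgrad(x), dtype=float)
    if np.allclose(g, 0.0):
        return None            # x is a minimizer of f
    return Cut(a=g, b=float(g @ np.asarray(x, dtype=float)))
```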
Despite outstanding progress on the cutting plane method, direct application of these methods for minimizing a convex function on via a separation oracle typically results in weakly-polynomial time algorithms, with the oracle complexity and runtime depending logarithmically on the accuracy parameter and the “size” of the function (depending on the specific problem setting, the “size” of the function can be its range, Lipschitzness, condition number, length of binary representation, etc.).
A fundamental but extremely challenging question is to design a strongly-polynomial time algorithm that efficiently computes an exact minimizer of on , with its oracle complexity, number of arithmetic operations, and bit size all being polynomial only in the dimension of the problem and not dependent on the “size” of the function . For example, it remains a major open problem to design a strongly-polynomial time algorithm for solving linear programs (LPs). This problem is widely known as Smale’s 9th problem in his list of eighteen unsolved mathematical problems for the 21st century [59]. In particular, recent breakthroughs on LP solvers are all weakly-polynomial time algorithms, e.g., [42, 43, 11, 5, 36]. Despite this obstacle, strongly-polynomial time algorithms for LPs are known under additional structures, such as LPs with at most two non-zero entries per row [47, 1, 12] or per column [68, 53] in the constraint matrix, LPs with bounded entries in the constraint matrix [64, 70, 14], and LPs with 0-1 optimal solutions [9, 10].
For minimizing a general convex function with access to a separation oracle, it is impossible to design a strongly-polynomial time algorithm unless the function satisfies additional combinatorial properties. One natural and general assumption, satisfied by many fundamental discrete optimization problems such as submodular function minimization (SFM), maximum flow, maximum matching, and shortest path, is that the convex function has an integer minimizer. Under the integer minimizer assumption, Jiang [31, 32], improving upon the classical work of Grötschel, Lovász, and Schrijver [22, 24], recently gave a strongly-polynomial time algorithm for finding an exact minimizer of using calls to , where is the -norm of the integer minimizer. (In fact, Jiang [31, 32] gave a family of algorithms for this problem via an algorithmic reduction to the shortest vector problem. For instance, he gave an algorithm that achieves a nearly-optimal oracle complexity of but using exponential time.)
In fact, Jiang’s general result implies significant improvement to the oracle complexity of strongly-polynomial time algorithms even for the special problem of SFM (see Subsection 1.2 for more details). Despite its favorable oracle complexity, Jiang’s algorithm actually requires additional arithmetic operations to implement. This additional part of the runtime is prohibitively large for its application to SFM; other state-of-the-art strongly-polynomial time algorithms for SFM [44, 34, 17] have worse oracle complexity but use much fewer arithmetic operations.
Unlike the special case of SFM, for the more general problem of minimizing a convex function with an integer minimizer, the arithmetic operations in Jiang’s algorithm is the state-of-the-art for strongly-polynomial time algorithms with near-optimal oracle complexity. The lack of a fast algorithm for the general problem, as well as the tradeoff between smaller oracle complexity and fewer additional arithmetic operations for state-of-the-art SFM algorithms, naturally leads to the following question:
Can we minimize a convex function with integer minimizers using a separation oracle in a way that is both oracle-efficient and requires much fewer arithmetic operations?
In this paper, we provide an affirmative answer to the above question. Specifically, we give an algorithm that minimizes a convex function with an integer minimizer using only separation oracle calls and an additional arithmetic operations. When applied to SFM, our result implies an algorithm that makes evaluation oracle queries and uses an additional arithmetic operations. This improves, for both the oracle complexity and the additional arithmetic operations, the state-of-the-art strongly-polynomial time SFM algorithms in [44, 17]. Compared with Jiang’s algorithm [32], despite having a slightly bigger oracle complexity, the additional number of arithmetic operations used by our algorithm is significantly lower by a factor.
1.1 Our Result
The main result of this paper is a much more efficient algorithm for minimizing convex functions with integer minimizers:
Theorem 1.1 (Main result, informal version of Theorem 5.4).
Given a separation oracle for a convex function on , if the set of minimizers of is contained in a box of radius and all extreme points of are integer vectors, then there exists a randomized algorithm that outputs an integer minimizer of with high probability using
-
•
queries to , and
-
•
additional arithmetic operations.
The strong assumption that all extreme points of are integer vectors guarantees that the algorithm outputs an integer minimizer of . In fact, this assumption is necessary for the algorithm to efficiently compute an integer minimizer (see Remark 1.4 in [32] for an example). Without such an assumption, the algorithm can still output a minimizer, though not necessarily an integer one, with the same guarantee as in Theorem 1.1 (see Remark 1.5 in [32]).
Previously, under the same assumptions as in Theorem 1.1, Jiang [32] gave an algorithm that computes an integer minimizer of using separation oracle calls, which is better than the oracle complexity in Theorem 1.1. However, his algorithm uses an enormous number of additional arithmetic operations because of the following reasons: to obtain a subquadratic oracle complexity, [32] resorts to the random walk based cutting plane method by Bertsimas and Vempala [6] to work directly with the polytope formed by all the separating hyperplanes output by . Unfortunately, the sampling approach in [6] requires arithmetic operations to perform one iteration of the cutting plane method. In addition, to guarantee that the sampling can be performed efficiently, [32] has to maintain two “sandwiching” ellipsoids, contained in and containing the polytope respectively, which requires additional computational cost to preserve after reducing the dimension of the problem.
Another computational bottleneck in [32] is that the algorithm approximates the length of the shortest vector after every iteration of the cutting plane method. To do so, [32] leverages the sieve algorithm in [2] that computes a -approximation to the shortest vector using arithmetic operations. Compounding with the overall iterations of the algorithm, this step of approximating the shortest vector takes arithmetic operations in total.
To get around the aforementioned computational bottlenecks, we utilize an intricate combination of a computationally more efficient cutting plane method based on the volumetric center of due to Vaidya [67] and a faster version of the LLL algorithm given in [51]. Instead of working directly with the polytope and performing volume reduction in a step-by-step manner, we run Vaidya’s cutting plane method [67] in blocks of consecutive steps, so that the volume decreases by a constant factor per step only in an amortized sense within each block. In particular, recent work by Jiang, Lee, Song and Wong [34] shows that steps of Vaidya’s method can be implemented using a total of arithmetic operations. Running Vaidya’s method in blocks also enables us to compute only approximate shortest vectors in total, while still being able to avoid the appearance of extremely short vectors. Harnessing the faster implementation of the LLL algorithm in [51] allows us to compute the approximate shortest vectors using a total of arithmetic operations.
1.2 Applications to Submodular Function Minimization
Submodular function minimization (SFM) has been one of the cornerstone problems in combinatorial optimization since the seminal work of Edmonds [19]. Many classic combinatorial problems can be abstracted as optimization over a submodular function, such as graph cut function, set coverage function and economic utility function. For more comprehensive reviews of SFM, we refer readers to [46, 30].
Papers | Year | Runtime | Remarks | General?
---|---|---|---|---
[22, 24] | 1981, 88 | [46] | first strongly polynomial |
[57] | 2000 | | first combinatorial strongly polynomial |
[27] | 2000 | | first combinatorial strongly polynomial |
[20] | 2000 | | |
[29] | 2002 | | |
[71] | 2003 | | |
[52] | 2007 | | |
[28] | 2009 | | |
[44] | 2015 | | previous best runtime |
[44] | 2015 | | exponential time |
[17] | 2018 | | previous best runtime |
[31] | 2021 | | best oracle complexity |
[31] | 2021 | | exponential time |
This paper | 2023 | | best runtime |

Table 1: Strongly-polynomial time algorithms for submodular function minimization (SFM).
We briefly recall the standard setting of SFM: we are given a submodular function over an -element ground set where the function is accessed via an evaluation oracle, which returns the value for a query set using time . The goal is to find the minimizer of using queries to the evaluation oracle and additional arithmetic operations.
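As a concrete (and purely illustrative) instance of this setting, the cut function of a weighted undirected graph is a standard example of a submodular function; the sketch below implements its evaluation oracle and brute-force checks submodularity on a tiny graph.

```python
import itertools

def cut_value(edges, S):
    """Evaluation oracle for the cut function of a weighted undirected graph.

    edges: list of (u, v, weight); S: a subset of the vertex set.
    f(S) = total weight of edges with exactly one endpoint in S,
    a standard example of a submodular function."""
    return sum(w for (u, v, w) in edges if (u in S) != (v in S))

# Brute-force check of submodularity, f(S) + f(T) >= f(S | T) + f(S & T),
# over all pairs of subsets of a 4-vertex graph.
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0), (0, 3, 3.0)]
ground = [0, 1, 2, 3]
subsets = [set(c) for r in range(5) for c in itertools.combinations(ground, r)]
for S in subsets:
    for T in subsets:
        assert cut_value(edges, S) + cut_value(edges, T) \
            >= cut_value(edges, S | T) + cut_value(edges, S & T) - 1e-9
```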
An SFM algorithm is called strongly-polynomial time if its runtime depends polynomially only on and the dimension , but not on the range of the function . In their seminal work, Grötschel, Lovász, and Schrijver [23, 24] gave the first strongly-polynomial time algorithm for SFM based on the ellipsoid method. Since then, there has been a long history of efforts in designing faster strongly-polynomial time algorithms for SFM (see Table 1).
The state-of-the-art strongly-polynomial time algorithms for SFM, in terms of the number of arithmetic operations used, were given by Lee, Sidford, and Wong [44] and Dadush, Végh, and Zambelli [17]. Both algorithms have runtime . The oracle complexity of these algorithms was later improved to sub-cubic in terms of by Jiang [31, 32], where he gave a strongly-polynomial time algorithm with runtime alongside an exponential time algorithm with a nearly-quadratic oracle complexity. It remains a major open problem in the area of SFM whether there exists a strongly-polynomial time algorithm with truly sub-cubic, i.e. for some absolute constant , oracle complexity. When the problem has certain structures, such as the minimizer being -sparse, Graur, Jiang and Sidford [21] have developed an algorithm with queries. SFM has also been heavily studied in the parallel setting, see e.g., [8].
Unfortunately, while the algorithm in [32] answers major open questions in [44] and makes significant progress on the oracle complexity for SFM, the number of additional arithmetic operations used by Jiang’s algorithm is a factor of larger than the algorithms in [44, 17]. In this paper, we complement Jiang’s algorithm by giving the following strongly-polynomial time SFM algorithm that improves both the oracle complexity and the number of arithmetic operations used by the algorithms in [44, 17].
Corollary 1.2 (Submodular function minimization).
Given an evaluation oracle for a submodular function defined over an -element ground set, there exists a randomized strongly-polynomial time algorithm that computes an exact minimizer of with high probability using
-
•
queries to , and
-
•
additional arithmetic operations.
Corollary 1.2 almost immediately follows from Theorem 1.1 together with the standard fact that a separation oracle for the Lovász extension of the submodular function can be implemented by making queries to the evaluation oracle of and additional arithmetic operations (e.g., Theorem 6.4 in [32]). The proof of Corollary 1.2 is essentially identical to the proof of Theorem 1.7 in [32], and is therefore omitted.
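A hedged sketch of this standard reduction, via Edmonds' greedy rule: sorting the coordinates of the query point and taking marginal gains of the corresponding prefix sets yields a subgradient of the Lovász extension. The names below are ours; n + 1 evaluation queries (including the empty set) and sorting-time arithmetic suffice.

```python
import numpy as np

def lovasz_subgradient(f_eval, x):
    """Subgradient of the Lovász extension of a submodular f at x.

    f_eval(S) evaluates f on a subset S of {0, ..., n-1}.  Edmonds' greedy
    rule: sort the coordinates of x in decreasing order and assign to each
    index the marginal gain of adding it to the growing prefix set."""
    n = len(x)
    order = np.argsort(-np.asarray(x, dtype=float))
    g = np.zeros(n)
    S, prev = set(), f_eval(set())        # start from f(emptyset)
    for i in order:
        S = S | {int(i)}
        cur = f_eval(S)
        g[i] = cur - prev                 # marginal gain of element i
        prev = cur
    return g

def lovasz_separation_oracle(f_eval, x):
    """Separation oracle for the Lovász extension: either certify x as a
    minimizer, or return the halfspace {z : g^T z <= g^T x} that contains
    every minimizer (g is a subgradient at x)."""
    g = lovasz_subgradient(f_eval, x)
    if np.allclose(g, 0.0):
        return None
    return g, float(g @ np.asarray(x, dtype=float))
```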
Compared to the previous best runtime algorithms in [44] and [17], our algorithm improves their oracle complexity from to while also improving the number of arithmetic operations from to . We highlight that in [44], the authors manage to achieve an oracle complexity, but at the expense of an exponential runtime. It is also important to note that our improvements are achieved via a much more general algorithm, whereas the algorithms in [44, 17] work specifically for SFM.
In comparison to the current best oracle complexity algorithm in [32], our algorithm has a slightly worse oracle complexity, but we significantly improve the additional arithmetic operations in his algorithm down to .
Finally, note that the current best implementation of a separation oracle for the Lovász extension requires queries to , and the current fastest cutting plane method requires arithmetic operations per step. So for any cutting plane algorithm for SFM that uses iterations, the best runtime we can currently hope for with such a method is using state-of-the-art techniques. Our algorithm, in fact, matches such a runtime bound. In particular, we use iterations of the cutting plane method with a total runtime of . So in some sense, our result matches what can be achieved using the current best known algorithmic techniques for cutting plane methods. In contrast, the algorithm in [32] does not have such a feature, with the additional arithmetic operations being much larger than its oracle complexity.
2 Technique Overview
In this section, we provide a brief overview of the techniques in prior work and in our work. In Section 2.1, we review the methods in [32]. In Section 2.2, we present a preliminary overview of our approach. In Section 2.3, we summarize current progress on the problem of minimizing convex functions with integer minimizers and discuss potential future directions.
2.1 An Overview of Previous Work
Before discussing our main technical insights, we first review the approach of [32] that achieves subquadratic oracle calls for minimizing convex functions with integer minimizers. The key ingredient in Jiang’s algorithm is an interplay between the polytope, formed by the separating hyperplanes returned by the separation oracle, which contains the integer minimizer, and a lattice that captures the structures of integer points.
In particular, in each iteration of the cutting plane method, Jiang’s algorithm examines the length of the (approximate) shortest vector in the lattice under the -norm, where is the covariance matrix of the polytope . The length of this vector under the -norm provides a measure of the width of the outer ellipsoid induced by . Thus, when this norm is small, it implies that the outer ellipsoid has a small width and we can safely proceed by reducing dimension and projecting the lattice onto the orthogonal complement of the shortest vector. On the other hand, if the shortest vector still has a relatively large norm, then the polytope can be further refined by using the cutting plane method. To achieve the best possible oracle complexity, the algorithm in [32] performs the width measurement in a step-by-step fashion, meaning that it actively checks the length of the shortest vector after every single cutting plane step. This ensures that the algorithm can enter the dimension reduction phase as soon as a short lattice vector can be obtained, which happens when the volume of is small enough.
Unfortunately, such a step-by-step strategy and careful width measurement come at a price – the approximate shortest vector subprocedure will be called in every iteration of the cutting plane method. Moreover, one needs to use an expensive cutting plane method that guarantees the volume shrinks in every step, such as the random walk based center of gravity method [6].
2.2 Our Approach: Cutting Plane in Blocks and Lazy Width Measurement
Now we discuss our approach for overcoming the major computational bottleneck of [32], which is the step-by-step measurement of width mentioned above. For simplicity, we assume the integer minimizer (i.e. ) in the subsequent discussion. By delaying the width measurement, i.e., finding an approximate shortest vector only after a block of cutting plane steps, we can utilize the much more efficient cutting plane method due to Vaidya [67]. However, this comes at the cost of a possibly much larger number of oracle calls, since the length of the shortest vector in the lattice might become very small when we actually perform the measurement, which will lead to a large increase in the volume of after dimension reduction. Nevertheless, we show that by carefully balancing the block size of cutting plane steps and the loss incurred during dimension reduction due to short lattice vectors, we can still achieve an oracle complexity.
To explain our approach, we first provide a brief introduction to Vaidya’s cutting plane method. Unlike classical cutting plane methods, such as the ellipsoid method and the center of gravity method, where each step is guaranteed to shrink the volume, Vaidya’s algorithm and analysis rely on controlling the volume of the Dikin ellipsoid induced by the log barrier on polytope . The algorithm iteratively finds the point whose corresponding Dikin ellipsoid has the largest volume, which is called the volumetric center. Subsequently, the algorithm uses a separation oracle on the volumetric center . If is a minimizer, then we are done; otherwise, the algorithm computes the leverage score of each constraint with respect to the Hessian matrix, which measures the relative importance of each constraint. If all constraints are relatively important, then a new constraint is added based on the separating hyperplane returned by the separation oracle at . If one of the constraints has leverage score smaller than some tiny constant, then it is dropped to ensure that the polytope always has at most constraints. Due to the increase in the volume of the polytope when constraints are dropped, the volume of only shrinks in an amortized sense. In particular, only after steps, the volume of is guaranteed to decrease by a multiplicative factor of from the initial volume. It is crucial to note that no guarantee on the volume shrinkage of is known if iterations of Vaidya’s cutting plane method are run.
Our idea is then to view the steps as a “block”, meaning that each time we measure the shortest vector and realize that it still has a large norm, we execute Vaidya’s cutting plane steps for steps and then re-examine the norm. This naturally induces a strategy that measures the width of the ellipsoid in a lazy fashion. The volume decrease of via this strategy is similar to that when running the cutting plane method in [6] for steps. Therefore, the only issue is the significant decrease in the norm of the shortest vector during a block of cutting plane steps, which can be as small as . Fortunately, this shrinkage of the norm is acceptable and it incurs only an additional factor in the number of oracle calls.
One major advantage of using Vaidya’s cutting plane method is that the iterations can be performed very efficiently. Using the fast implementation in [34], a block of cutting plane steps can be done in time. Lazy width measurement also reduces the total number of approximate shortest vectors we need to compute from to , which opens the door to using the faster approximate shortest vector algorithm of [51].
Although the efficiency issue of the cutting plane step and width measurement is addressed, we still need to carefully control the complexity of the dimension reduction step. This step can be particularly expensive due to the loss of structural information of the polytope after collapsing it onto a proper subspace on which the algorithm recurs, as discussed in [32]. In the standard setting of Vaidya’s cutting plane method, the polytope evolves in a “slow-changing” manner that makes it easy to maintain and update the volumetric center and its corresponding Hessian matrix. However, after reducing a dimension, we no longer have such a “slow-changing” property, and thus the new volumetric center and Hessian matrix have to be computed from scratch.
A natural idea would be to formulate the problem of recomputing the volumetric center and Hessian as a convex optimization task constrained to the polytope we have access to. On the surface, this problem can be solved efficiently using the same cutting plane procedure. The caveat is that evaluating the volumetric function or its gradient requires time (here is the exponent of matrix multiplication [18, 69, 40]; currently, ). As the cutting plane method evolves for iterations, this will lead to a total of arithmetic operations. To circumvent this issue, we start from a simpler convex body containing whose volumetric center can be easily computed. Specifically, we choose the hyperrectangle centered at the center of the outer ellipsoid that contains . We show that this method only affects the number of oracle calls by a factor of , despite causing the volume to blow up by a factor of . A similar approach is also taken in [32] to ease the computation of the centroid and covariance matrix of . However, his method requires iteratively refining the hyperrectangle using constraints of until it coincides with , at which point the algorithm relearns the collapsed polytope . This approach can take as many as steps (without calling the separation oracle) with operations per step, and is mainly for achieving the best possible oracle complexity. Our approach is arguably simpler and much more efficient.
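A minimal sketch of this initialization step, under the assumption that the outer ellipsoid is given as E(c, A) = {x : (x - c)^T A (x - c) <= 1} and that the hyperrectangle is aligned with its principal axes (the paper's exact choice of hyperrectangle may differ):

```python
import numpy as np

def enclosing_box(c, A):
    """A hyperrectangle (2n linear constraints) containing the ellipsoid
    E(c, A) = {x : (x - c)^T A (x - c) <= 1}.

    Writing A = Q diag(lam) Q^T, the ellipsoid has semi-axis lam_i^{-1/2}
    along the i-th column q_i of Q, so the 2n constraints
    |q_i^T (x - c)| <= lam_i^{-1/2} contain E.  Returned as (C, d) with
    C x <= d.  By symmetry its volumetric center is simply c, so the
    cutting plane recursion can be restarted without extra work."""
    lam, Q = np.linalg.eigh(A)
    half_widths = 1.0 / np.sqrt(lam)
    C = np.vstack([Q.T, -Q.T])                      # rows q_i^T and -q_i^T
    d = np.concatenate([Q.T @ c + half_widths,
                        -(Q.T @ c) + half_widths])
    return C, d
```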
Another challenge is to prove that using volumetric centers and Dikin ellipsoids [15] is sufficient for the algorithm to make progress. In Jiang’s algorithm, the use of the centroid and covariance matrix makes the analysis straightforward due to the standard nice geometric property that the polytope is sandwiched between ellipsoids induced by the covariance matrix at the centroid. However, in our analysis, we use the volumetric center and Dikin ellipsoid of the log barrier, which requires a different approach. Past works that exclusively analyze the performance of the cutting plane method based on this approach [67] take a function-value point of view, measuring the progress of the algorithm via the change of the volumetric function. Instead, our analysis extracts key geometric information of Vaidya’s cutting plane method, showing that the volumetric center and Dikin ellipsoid evolve in a similar spirit as the centroid and covariance matrix.
In particular, we prove a structural result for using Dikin ellipsoids [15] to sandwich an -dimensional polytope defined by constraints. Let us parametrize where and , and define the Hessian of the log-barrier as , where is the diagonal matrix with -th diagonal being . The volumetric function is defined as and we let denote the minimizer of , i.e., is the volumetric center of . The key structural result we prove is that , where is the ellipsoid centered at and defined by . While it is standard for to be sandwiched by Dikin ellipsoids induced by a self-concordant barrier function and centered at the minimizer of that function, we note that is not the minimizer of the log-barrier function (but rather the minimizer of ), while the Dikin ellipsoid is defined w.r.t. the Hessian of the log-barrier. In fact, for the Dikin ellipsoid centered at the analytic center denoted by (minimizer of the log-barrier function) and the induced Hessian at , we have
On the other hand, to make progress in Vaidya’s cutting plane method, we have to work with the Dikin ellipsoid centered and induced by instead of . We first note that the scaled volumetric function indeed defines a self-concordant barrier function and we thus have
For the left containment, we can safely replace since for any , it is true that . For the right containment, we utilize the fact that and are close spectral approximations of each other; thus we have , and we conclude
We believe this sandwiching condition might be of independent interest for other geometric applications of Vaidya’s method.
To carry out the analysis, we use the potential function developed in [32], which captures the volume of the polytope and the density of the lattice simultaneously. We show that the block of cutting plane steps also leads to a rapid decrease of the potential. Unlike [32], the block cutting plane steps and Dikin ellipsoids cause extra losses in the volume of after dimension reduction, due to the possible appearance of very short vectors. We observe that such blowups are always at most for each dimension reduction step. Thus, the potential increases by at most in total, and we need to use a total of oracle calls to counter this increment. To summarize, by trading for a slightly-worse number of oracle calls, we make more room for the algorithm to gain extra speedup through leverage score maintenance, faster approximate shortest vector computation, and crude estimation of the convex body after reducing dimension.
2.3 Discussion
A natural question that arises from our result is whether it is possible to obtain a strongly-polynomial time algorithm with quadratic or subquadratic oracle complexity, matching the one in [32], while achieving the same improved runtime as ours. Our algorithm has only one factor in the oracle complexity, but we contend that this is inherent for Vaidya’s approach since it can only guarantee a volume decrease after cutting plane steps. Additionally, in the dimension reduction phase, we use a hyperrectangle to approximate the convex body, which introduces a volume increase of . To resolve this issue, an algorithm that approximately computes the volumetric center in time would be necessary. However, designing such an algorithm is an interesting and nontrivial data structure task, as discussed in the preceding subsection, since it requires maintaining and updating the log determinant of a Gram matrix under slow diagonal changes.
To achieve a subquadratic oracle complexity, a crucial ingredient in [32] is an approximate shortest vector algorithm with a sub-exponential approximation factor, first given in [2]. The main reason for the sub-exponential approximation ratio is a block-reduction scheme introduced in [56] that computes a more general notion of reduced lattice basis. Recent improvements on basis reduction algorithms make use of this block-reduction idea to obtain more refined recursive structures. Therefore, it is of interest to design approximate shortest vector algorithms using arithmetic operations while achieving a sub-exponential approximation factor.
3 Preliminary
In Section 3.1, we provide several basic notations, definitions and facts. In Section 3.2, we discuss the LLL algorithm and the shortest vector problem. In Section 3.3, we define and state several basic tools in convex geometry. In Section 3.4, we provide some related definitions about lattice projection. In Section 3.5, we present the slicing lemma. In Section 3.6, we state a dimension reduction lemma.
3.1 Notations and Basic Facts
Basic Notations.
For an integer , we use to denote the set . For any function , we use to denote .
Matrices and Vectors.
For a vector , we use to denote its norm. For a vector , we use to denote its transpose. For a matrix , we use to denote its transpose. We use to denote a length- vector where all the entries are zeros. We use to denote a length- vector where all the entries are ones.
We say a square matrix is PSD () if for all vectors , . For a square matrix , we use to denote the determinant of matrix . For a square and invertible matrix , we use to denote the inverse of matrix .
For a PSD matrix , we define the induced matrix norm for any vector as .
Ellipsoid.
Given a point and a PSD matrix , we use to denote the (not necessarily full-rank) ellipsoid given by
where is the subspace spanned by eigenvectors corresponding to nonzero eigenvalues of . When the ellipsoid is centered at , we use the short-hand notation to denote .
Lattices.
Let be a set of linearly independent vectors; we use
to denote the lattice generated by , and is the rank of the lattice. If , then the lattice is full rank. A basis of is a set of linearly independent vectors that generates under integer combinations. Bases of are equivalent under unimodular transforms. We use to denote the norm of the shortest nonzero vector in and to denote the induced -norm of the shortest nonzero vector in .
Given a basis , the parallelotope associated to it is the polytope . The determinant of is the volume of , which is independent of the choice of basis.
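As a small illustration of these notions (not from the paper), the determinant of a lattice can be computed from any basis and is invariant under unimodular changes of basis:

```python
import numpy as np

def lattice_det(B):
    """Determinant (covolume) of the lattice generated by the rows of B:
    the volume of the fundamental parallelotope, sqrt(det(B B^T)).
    Also valid for lattices that are not full rank in the ambient space."""
    return float(np.sqrt(np.linalg.det(B @ B.T)))

# Two bases of the same lattice differ by a unimodular matrix U
# (integer entries, det U = +-1), so the determinant is basis-independent.
B = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 1.0]])
U = np.array([[1, 1],
              [0, 1]])                      # unimodular: det U = 1
assert abs(lattice_det(B) - lattice_det(U @ B)) < 1e-9
```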
The dual lattice of is defined as follows:
Definition 3.1 (Dual lattice).
Given a lattice , the dual lattice is the set of all vectors such that for all .
For more background on lattices, we refer the reader to the lecture notes by Rothvoss [55].
Leverage Score.
We define the leverage score, which is a standard concept in numerical linear algebra [13, 7, 60, 61, 34, 62, 16]. We remark that the leverage score has multiple equivalent definitions; here we just present one of them.
Definition 3.2 (Leverage score).
Given a matrix , we define the leverage score for matrix to be , i.e.,
Note that is the -th row of .
We state a useful fact here.
Fact 3.3 (folklore).
Given a matrix , we have the following identity of its leverage score:
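Definition 3.2 admits several equivalent forms; the one used in the sketch below is the diagonal of the orthogonal projection onto the column span. A standard identity of this kind (presumably the one intended in Fact 3.3) is that the leverage scores lie in [0, 1] and sum to the rank of the matrix:

```python
import numpy as np

def leverage_scores(A):
    """Leverage scores sigma_i(A) = a_i^T (A^T A)^+ a_i, i.e. the diagonal
    of the orthogonal projection onto the column span of A."""
    G_pinv = np.linalg.pinv(A.T @ A)
    return np.einsum('ij,jk,ik->i', A, G_pinv, A)

# Each score lies in [0, 1], and the scores sum to rank(A).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
s = leverage_scores(A)
assert np.all(s > -1e-9) and np.all(s < 1 + 1e-9)
assert abs(s.sum() - np.linalg.matrix_rank(A)) < 1e-6
```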
3.2 LLL Algorithm for Shortest Vector Problem
Given a lattice and a corresponding basis , it is natural to seek an algorithm that finds the vector with norm , or at least approximately finds it. The famous Lenstra-Lenstra-Lovász algorithm serves such a purpose:
Theorem 3.4 (LLL algorithm, [41]).
Let be a basis for lattice and be a PSD matrix that is full rank on . Let such that for any . Then there exists an algorithm that outputs in arithmetic operations a nonzero vector such that
Moreover, the integers occurring in the algorithm have bit sizes at most .
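For intuition, here is a compact textbook implementation of LLL reduction in the Euclidean norm (our own notation; the version of Theorem 3.4 with a PSD matrix norm reduces to this one by replacing the basis matrix B with BL, where the PSD matrix factors as LL^T). This is the classical algorithm rather than the faster variant of [51] used later.

```python
import numpy as np

def lll_reduce(B, delta=0.75):
    """Textbook Lenstra-Lenstra-Lovász reduction of the lattice basis given
    by the rows of B (Euclidean norm).  The first row of the output is a
    2^{O(n)}-approximate shortest nonzero lattice vector."""
    B = np.array(B, dtype=float)
    n = B.shape[0]

    def gram_schmidt(B):
        """Gram-Schmidt orthogonalization B* and coefficients mu."""
        Bstar, mu = np.zeros_like(B), np.zeros((n, n))
        for i in range(n):
            Bstar[i] = B[i]
            for j in range(i):
                mu[i, j] = (B[i] @ Bstar[j]) / (Bstar[j] @ Bstar[j])
                Bstar[i] -= mu[i, j] * Bstar[j]
        return Bstar, mu

    k = 1
    while k < n:
        # Size-reduce b_k against b_{k-1}, ..., b_0.
        for j in range(k - 1, -1, -1):
            _, mu = gram_schmidt(B)
            q = int(round(mu[k, j]))
            if q != 0:
                B[k] -= q * B[j]
        Bstar, mu = gram_schmidt(B)
        # Lovász condition between consecutive vectors k-1 and k.
        if Bstar[k] @ Bstar[k] >= (delta - mu[k, k - 1] ** 2) * (Bstar[k - 1] @ Bstar[k - 1]):
            k += 1
        else:
            B[[k - 1, k]] = B[[k, k - 1]]   # swap and backtrack
            k = max(k - 1, 1)
    return B
```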
Recently, [51] improved the runtime of the LLL algorithm by leveraging the block reduction technique introduced in [56]. This is a key component in our time algorithm.
Theorem 3.5 (Theorem 2 of [51]).
Let be a basis for lattice and be a PSD matrix that is full rank on . Let such that for any . Then there exists an algorithm that outputs in arithmetic operations a nonzero vector such that
Moreover, the integers occurring in the algorithm have bit sizes at most .
3.3 Convex Geometry
In this section, we define notions about the centroid and covariance. The work [32] heavily exploits the structure of these objects to design a subquadratic oracle complexity algorithm for minimizing convex functions. Our approach also relates to these notions.
For a convex body , we use to denote its volume, i.e., . We first define the centroid of convex body,
Definition 3.6.
Let be a convex body. Let denote the uniform measure on convex body . We define the centroid of as
Equivalently, we can write as
Then, we define the covariance of convex body,
Definition 3.7 (Covariance of convex body, ).
Let be a convex body. We define the covariance matrix of under uniform measure as
It is well-known that any isotropic convex body (a convex body is isotropic if the uniform distribution over the body has zero mean and identity covariance matrix) is enclosed by two balls.
Lemma 3.8 (Ellipsoidal approximation of convex body, [38]).
Let be an isotropic convex body in . Then,
where is the unit Euclidean ball in .
If is non-isotropic, then
3.4 Lattice Projection
We collect some standard facts about lattice projection that are directly implied by the Gram-Schmidt process. We use to denote the orthogonal projection onto the subspace .
Fact 3.9 (Lattice projection).
Let be a full rank lattice in and be a linear subspace such that . Then
Fact 3.10 (Dual of lattice projection).
Let be a full rank lattice in and be a linear subspace such that . Then
3.5 Slicing Lemma
3.6 Dimension Reduction Lemma
The key building block of both [31] and our algorithm is a lattice-dimension reduction step. The following result from [31] shows that all integral points are preserved after a dimension reduction step.
Lemma 3.12 (Lemma 3.1 of [31]).
Given an affine subspace , where is a subspace of and is some fixed point, and an ellipsoid that has full rank on , suppose we are given a vector with ; then there exists a hyperplane such that .
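To illustrate the statement in the full-dimensional case (our own notation and variable names), the following sketch recovers the hyperplane once the ellipsoid is thin along an integer direction:

```python
import numpy as np

def hyperplane_of_integral_points(c, A_inv, v):
    """Dimension-reduction illustration: if the ellipsoid
    E = {x : (x - c)^T A (x - c) <= 1} has width less than 1 along an
    integer direction v, then v^T x takes at most one integer value on E,
    so every integral point of E lies on one hyperplane {x : v^T x = t}.

    c: center; A_inv: the inverse of A; v: integer vector.  Returns t, or
    None if the width is not small enough (or no integral value exists)."""
    half_width = np.sqrt(v @ A_inv @ v)       # max of |v^T (x - c)| over E
    if 2.0 * half_width >= 1.0:
        return None                            # lemma does not apply
    lo, hi = v @ c - half_width, v @ c + half_width
    t = np.ceil(lo)                            # smallest integer >= lo
    return float(t) if t <= hi else None       # unique integer value, if any
```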
4 Cutting Plane Method
In Section 4.1, we introduce the definition of log-barrier and volumetric center. In Section 4.2, we introduce leverage score and related notations. In Section 4.3, we present the convergence lemma. In Section 4.4, we present the sandwiching lemma. In Section 4.5, we show the closeness between Hessian of log-barrier and covariance matrix. In Section 4.6, we show the stability of approximate center. In Section 4.7, we show the closeness between approximate center and true center. In Section 4.8, we present a novel slicing lemma. In Section 4.9, we present the dynamic leverage score maintenance data structure. In Section 4.10, we present our main lemma.
4.1 Log-barrier and Volumetric Center
We start by defining the log-barrier, which has been widely used in convex optimization [54, 50, 33, 26, 35, 45, 63, 25].
Definition 4.1 (Log-barrier).
Let and . Let denote the -th row of . The log-barrier is defined as
for .
Let be the bounded full-dimensional polytope where , and .
Definition 4.2 (Hessian and volumetric).
Given and . Let be defined as
where denotes the -th row of .
Let (volumetric) function be as
is the Hessian of the logarithmic barrier function and is positive definite for all in the interior of .
Definition 4.3 (Volumetric center).
We define the volumetric center of as
Observe that is a convex function; hence one can run a Newton-type algorithm to approximate very fast.
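The following sketch (our notation, for a polytope of the form {x : Ax >= b} with a strictly feasible starting point) computes the quantities of Definitions 4.1–4.3 and runs a crude first-order scheme toward the volumetric center; Vaidya's method instead uses damped Newton steps together with careful maintenance of these quantities.

```python
import numpy as np

def log_barrier_pieces(A, b, x):
    """Log barrier F(x) = -sum_i log(a_i^T x - b_i) for {x : Ax >= b},
    its Hessian H(x) = sum_i a_i a_i^T / s_i(x)^2, and the volumetric
    function V(x) = (1/2) log det H(x), at an interior point x."""
    s = A @ x - b                           # slacks, must be positive
    F = -np.sum(np.log(s))
    As = A / s[:, None]                     # rows a_i / s_i(x)
    H = As.T @ As
    V = 0.5 * np.linalg.slogdet(H)[1]
    return F, H, V

def approx_volumetric_center(A, b, x0, steps=200, lr=0.05):
    """Crude iterative sketch for argmin_x V(x), using the identity
    grad V(x) = -sum_i sigma_i(x) a_i / s_i(x), where sigma_i(x) is the
    leverage score of row i of diag(1/s(x)) A.  x0 must be strictly
    feasible.  For illustration only."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        s = A @ x - b
        As = A / s[:, None]
        H = As.T @ As
        sigma = np.einsum('ij,jk,ik->i', As, np.linalg.inv(H), As)
        grad = -(As * sigma[:, None]).sum(axis=0)
        step = lr * np.linalg.solve(H, -grad)   # H-preconditioned descent
        if np.all(A @ (x + step) - b > 0):      # stay strictly inside
            x = x + step
        else:
            lr *= 0.5                           # back off near the boundary
    return x
```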
4.2 Leverage Score of Log Hessian
We start with some definitions.
Definition 4.4.
We define to be the -th leverage score of matrix as
Using , we can write the gradient of in a convenient form:
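Concretely, writing $s_i(x) = a_i^\top x - b_i$ for the slacks of the polytope $\{x : Ax \ge b\}$ (our notation), the standard expressions from Vaidya's analysis are

```latex
\sigma_i(x) \;=\; \frac{a_i^\top H(x)^{-1} a_i}{(a_i^\top x - b_i)^2},
\qquad
\nabla V(x) \;=\; -\sum_{i=1}^{m} \sigma_i(x)\,\frac{a_i}{a_i^\top x - b_i}.
```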
It is well-known that is a good approximation of :
Lemma 4.6 (Lemma 3 of [67]).
For any , we have
Finally, we define which quantifies:
Definition 4.7.
Let be the largest number such that
The following lemma provides an upper bound on :
Lemma 4.8 (Lemma 4 of [67]).
For any , we have
Further, we have
4.3 Volume Shrinking
The following result (Lemma 4.9) bounds the progress of adding or deleting a plane of Vaidya’s CPM.
Lemma 4.9.
Let and be some constants and let denote the value of at the beginning of -th iteration. Then at the beginning of each iteration there exists an satisfying the condition
Furthermore, the following statements hold:
-
•
If at -th iteration then
-
•
Otherwise, we have
Next, in Lemma 4.10, we show that after iterations, the volume of the resulting convex body is only a fraction of that of the original convex body.
Lemma 4.10.
Let be a convex body with non-empty interior. Suppose we run CuttingPlaneMethod for iterations, then we obtain a convex body such that
Proof.
Let denote the volume of the polytope at the beginning of -th iteration. Note that . Using Lemma 4.9 we shall obtain an upper bound on and show that after iterations the volume decreases by a factor of . First, we show that
Since is bounded, the number of bounding planes is at least , and to start with this number is exactly . Thus by the -th iteration the case of adding a plane must have occurred at least as often as the case of deleting a plane; otherwise the number of planes would have fallen below . So by the -th iteration adding a plane must happen at least times and removing a plane must happen at most times. Hence
where the last step follows from .
Set , we have
(1) |
We note that by Lemma 4.11, the polytope contains , so its volume can be lower bounded by the volume of . Therefore,
(2) |
To obtain an upper bound on , we note that if is the point that maximizes the logarithmic barrier over , then
Then from the relationship between determinants and volume it follows that
(3) |
where the last step follows from definitions of and (see Definition 4.2).
Since (by Fact 3.3), the case of deleting a plane is forced to happen at an iteration if the number of planes defining is greater than and hence never exceeds .
Let us bound the difference between and :
where the first step is by Eq. (4.3), the second step is by Eq. (1), the third step is by Eq. (4.3), and the last step is by .
Exponentiating both sides, we obtain
Thus, we complete the proof. ∎
4.4 Sandwiching Lemma
We derive some sandwiching conditions regarding the ellipsoid induced by the Hessian of log barrier.
Lemma 4.11 (Sandwich convex body via log Hessian).
Let be a polytope where and . Let be defined as Definition 4.2. Then for any , we have that
and the following upper bound
Proof.
Let us first consider the ellipsoid contained in . By definition, we have that
without loss of generality assume , then means
which means that for any , it holds that
Since , we must have that for all . Hence, the square condition only requires that , so if , then clearly . Otherwise due to the absolute value constraint, it must be the case that . This concludes the proof of .
For the ellipsoid that contains , we note that the volumetric function scaled by is also a self-concordant barrier with complexity parameter [67, 66, 3], and scaling does not change the minimizer; therefore we have
recall that (due to Lemma 4.6 and Definition 4.7) and (See Lemma 4.8), we conclude that
thus, and we conclude the desired result. ∎
4.5 Closeness of Log Hessian and Covariance
Note that Lemma 3.8 and Lemma 4.11 together imply the spectral approximation between the matrix (see Definition 4.2) and the covariance matrix (see Definition 3.7).
Lemma 4.12 (Closeness of log Hessian and covariance).
Let be a polytope for and . Then
Proof.
The proof relies on two sandwiching lemmas.
By Lemma 3.8, we have that
by Lemma 4.11,
We thus have
Without loss of generality, let’s prove one side containment. Given
we note that re-centering and making their centers the same does not change the containment relation, therefore, we have the following:
this directly implies the Loewner ordering
Following a similar approach, we can show that
this completes the proof. ∎
4.6 Stability of Approximate Center
In this section, we prove another useful lemma that concerns properties of the Hessian matrix of the log barrier. It compares the Hessian evaluated at the volumetric center and an approximate center.
Before proceeding to the second lemma, we define some notions.
Definition 4.13.
Let be a polytope with constraints of dimension . Let . Define to be the region
Recall that the CPM of [67] maintains an approximate volumetric center that is updated via Newton’s method. By performing Newton’s method, it can be guaranteed that the function value at the approximate center is not too far away from that at the volumetric center. The next lemma shows that the approximate center is also in the region for proper .
Lemma 4.14 (Lemma 10 of [67]).
Let be defined as in Definition 4.2. Let be defined as in Definition 4.5. Let be a small constant and let be a point in such that . Then we have
-
•
.
-
•
.
-
•
.
In the following Lemma, we show the stability of approximate center.
Lemma 4.15 (Stability of approximate center).
Let be a convex polytope with being its volumetric center. Let and be defined as in Definition 4.2. Let be a small constant. Let be a point such that , then we have
Proof.
By Lemma 4.14, we know that , which means that for any , we have that
Recall we define the (Definition 4.2) matrix as
which means that for different arguments, the only part that differs is the coefficients . If we can approximate the coefficients well, then we can show and are spectrally close. Without loss of generality we prove the upper bound; the lower bound is similar:
where the first step follows from the definition of , the second step follows from , the third step follows from , the fourth step follows from the definition of , and the last step follows from when . This completes our proof. ∎
4.7 Closeness of Approximate and True Center
The goal of this section is to prove Lemma 4.16, which states that under the induced- norm, the approximate center and the volumetric center are at most apart. We will later show that this discrepancy is in fact acceptable for our algorithm to make progress.
Lemma 4.16 (Closeness of approximate and true center in terms of norm).
Let be a convex polytope with being its volumetric center and let . Let and be defined as in Definition 4.2. Let be a point such that , then we have
Proof.
Throughout the proof, we set to .
By Lemma 4.14, we have that , which means
This means that if ,
On the other hand if ,
We thus have shown that . We proceed to measure the squared- norm of :
This completes the proof of the lemma. ∎
4.8 High Dimensional Slicing Lemma
We present a novel high dimensional slicing lemma that uses the Hessian of the log barrier, instead of the covariance matrix as in [31]. It incurs an extra term, but as we will see later, this does not affect the total number of oracle calls too much.
Lemma 4.17 (High dimensional slicing lemma, volumetric version).
Let be a convex body, let denote the Hessian matrix of log barrier of at its volumetric center. Let be a full-rank lattice in . Let be an -dimensional subspace of such that . Then we have
where is the dual lattice and is the shortest nonzero vector in under the norm .
4.9 Faster Cutting Plane via Leverage Score Maintenance
Vaidya’s vanilla CPM algorithm requires computing all leverage scores in each iteration, which would require time to form a projection matrix. [34] shows that by carefully designing data structures for leverage score maintenance, each iteration can be improved to amortized time. The following lemma states that to correct the extra error induced by using such data structures, it is sufficient to run for an extra step. This provides the guarantee needed for [34], and we leverage it to speed up our algorithm.
Lemma 4.18 (Approximating function value via approximate leverage score, Lemma A.3 of [34]).
If the following conditions hold
-
•
The Vaidya’s Newton step [67] uses exact leverage score and runs in iterations to obtain an approximate point such that
-
•
The closeness between true leverage score and approximate leverage score,
Then, running Vaidya’s Newton step [67] with approximate leverage score for iterations, we can obtain a such that
To obtain various guarantees, we need to find an with for a small constant . Note that whenever the smallest leverage score is at least , we only need to run an extra iterations of Newton’s step. In the other case, one can show that the old point is still a good starting point for Newton’s step; therefore running an extra iterations suffices.
We state the data structure of [34] for completeness.
Lemma 4.19 (Theorem 5.1 of [34]).
Given an initial matrix with , initial weight , there is a randomized data structure that approximately maintains the leverage score
where is the diagonal matrix that puts on its diagonal. The data structure uses space and supports the following operations:
-
•
Init: The data structure initializes in time.
-
•
Update: The data structure updates by
-
–
If , we insert a row with weight into such that
Suppose currently has rows already; then we append to the -th row of and append to the -th row of . In this case, we ignore the and from the input parameters.
-
–
If , let , let denote the -th row of . We delete the -th row from such that
In this case we ignore the from input parameters of update function.
-
–
If , we update the weight vector from to such that
Here denote the from input of update function, and denote the weight we stored from last iteration. In this case, we ignore the from input parameters of update function.
This step takes amortized time.
-
–
-
•
Query: The data structure outputs an approximate leverage score such that
this step takes time. Here is some fixed constant.
In our application, we will invoke the data structure in the following fashion: each time the data structure is initialized, it will be updated and queried for a consecutive sequence of steps, which takes a total of time. We will then perform such a sequence of operations for times, which leads to a total of time.
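For concreteness, the interface of this data structure can be mimicked by a naive reference implementation that recomputes exact leverage scores on every query; this is only a functional stand-in (our own sketch), and it does not achieve the amortized bounds of [34].

```python
import numpy as np

class NaiveLeverageScoreMaintainer:
    """Functional stand-in for the leverage-score maintenance interface:
    init / insert / delete / reweight / query.  Every query recomputes the
    exact leverage scores of sqrt(W) A from scratch."""

    def __init__(self, A, w):
        self.A = np.array(A, dtype=float)   # current constraint matrix
        self.w = np.array(w, dtype=float)   # current positive weights

    def insert_row(self, a, w_new):
        self.A = np.vstack([self.A, a])
        self.w = np.append(self.w, w_new)

    def delete_row(self, i):
        self.A = np.delete(self.A, i, axis=0)
        self.w = np.delete(self.w, i)

    def update_weights(self, w):
        self.w = np.array(w, dtype=float)

    def query(self):
        WA = self.A * np.sqrt(self.w)[:, None]
        P = WA @ np.linalg.pinv(WA.T @ WA) @ WA.T
        return np.diag(P).copy()            # exact leverage scores
```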
4.10 Main Lemma
The meta lemma of this section states that if we invoke oracle calls to add planes for Vaidya’s CPM, we end up with a polytope whose volume is only a fraction of the volume of the original polytope.
Lemma 4.20.
Given a separation oracle for a convex function defined on . Let be defined as in Definition 4.3. Let be defined as in Definition 4.2. Given a polytope with constraints that contains a minimizer of , and an error parameter , there exists a cutting plane method with that uses at most calls to and an extra arithmetic operations to output a polytope with at most constraints, an approximate volumetric center of and the Hessian matrix of the log barrier of such that the following holds:
-
•
Part 1. and is the intersection of with hyperplanes output by .
-
•
Part 2. .
-
•
Part 3. .
-
•
Part 4. .
-
•
Part 5. .
Proof.
We prove the lemma item by item.
For Part 1, it is implied by the original Vaidya’s algorithm as in [67].
Part 2 is due to the sandwiching lemma for polytope as in Lemma 4.11.
Part 3 is owing to Lemma 4.10.
For Part 4, we prove it in Lemma 4.16.
For Part 5, we show it in Lemma 4.15.
Regarding the runtime, we will use the data structure of Lemma 4.19 for a consecutive sequence of operations, which takes a total of arithmetic operations. Note that in each iteration the query guarantees that the approximate leverage score satisfies ; by Lemma 4.18, we only need to run extra iterations of Newton’s step to obtain an approximate volumetric center with the desired guarantee. Thus, the overall arithmetic operation count is . ∎
5 Efficient Minimization via Fast Cutting and Lazy Width Measurement
In Section 5.1, we present our main algorithm, Algorithm 1. In Section 5.2, we present the correctness of our algorithm. In Section 5.3, we show the oracle complexity of our algorithm. In Section 5.4, we show the overall running time of our algorithm. In Section 5.5, we summarize our main result.
5.1 Our Algorithm
Our algorithm is a mixture of the efficient cutting plane method of [34], the fast shortest vector algorithm of [51], and a novel adaptation and simplification of [32]. The algorithm maintains an affine subspace , a lattice and a polytope . The algorithm then proceeds as follows: it computes the approximate shortest vector on with respect to the norm, where is the Hessian of the log barrier function at an approximate volumetric center. If the norm of the shortest vector is relatively large, the algorithm performs a sequence of CuttingPlaneMethod for rounds.
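At a high level, the control flow can be sketched as follows (a schematic Python skeleton with hypothetical helper callables; it is not the paper's pseudocode and omits all quantitative parameters).

```python
def minimize_with_integer_minimizer(separation_oracle, cutting_plane_block,
                                    approx_shortest_vector, dimension_reduce,
                                    initial_state, threshold):
    """Control-flow skeleton of the algorithm.  The callables stand for the
    subroutines described in the text:

      cutting_plane_block(state, oracle): one block of Vaidya steps,
          shrinking the volume of the maintained polytope;
      approx_shortest_vector(state): (v, length) for the maintained lattice,
          measured in the norm induced by the Hessian at the current center;
      dimension_reduce(state, v): restrict to the hyperplane containing all
          remaining integral candidates, project the lattice, and restart
          from an enclosing hyperrectangle.

    `state` bundles the affine subspace, the lattice, and the polytope, and
    exposes .dimension and .minimizer."""
    state = initial_state
    while state.dimension > 0:
        v, length = approx_shortest_vector(state)    # lazy width measurement
        if length > threshold:
            state = cutting_plane_block(state, separation_oracle)
        else:
            state = dimension_reduce(state, v)
    return state.minimizer
```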
5.2 Correctness of Our Algorithm
Lemma 5.1 (Formal version of Theorem 1.1, output guarantee part).
Given a separation oracle for a convex function defined on and a -approximation algorithm ApproxShortestVector for the shortest vector problem which takes arithmetic operations. Suppose the set of minimizers of is contained in a box of radius and satisfies that all extreme points of are integral; then Algorithm 1 finds an integral minimizer of .
Proof.
Recall that the lattice projection and the dual of the lattice projection are defined as in Fact 3.9 and Fact 3.10.
We start by showing that the lattice is the orthogonal projection of onto the subspace via induction. In the beginning of each iteration we have and , where is a translation of passing through the origin. In the beginning of the algorithm, and , so . Note that the lattice and subspace are only updated in the dimension reduction step of Algorithm 1. For the inductive step, let denote the lattice at the dimension reduction step and , and we prove for .
to see the third equality, we note that is the orthogonal subspace of and . As initially , we can inductively show that at time , the subspace is the orthogonal complement to , where is the (approximate) shortest vector we use in iteration . As , it is a subspace orthogonal to . Projecting this subspace onto makes it orthogonal to . Thus, first projecting onto and then projecting onto is equivalent to first projecting onto and then projecting onto . The last equality follows since is a subspace of . This completes the proof of the lattice property.
Now we are ready to show that Algorithm 1 indeed finds the integral minimizer. Assuming has a unique minimizer , we note that CuttingPlaneMethod preserves , so it suffices to show that the dimension reduction step also preserves . We show that in fact, the dimension reduction step preserves all integral points in .
Lemma 4.11 gives the following sandwiching condition:
Recall that we set to be a -spectral approximation to :
We know that
this means that . Consequently, we have
(4) |
Let , by the sandwiching condition, we also have and subsequently
(5) |
We thus have shown that
Now we proceed to show that each dimension reduction iteration preserves all integral points in . We have
Since is satisfied in a dimension reduction step, we can invoke Lemma 3.12, which states that all integral points in lie on the hyperplane given by
Thus, we have and this finishes the proof of the lemma. ∎
5.3 Oracle Complexity
We prove the oracle complexity of Algorithm 1.
Lemma 5.2 (Oracle complexity part of Theorem 5.4).
Given a separation oracle for a convex function on such that the set of minimizers of is contained in a box of radius and all extreme points of are integral, there exists a randomized algorithm (Algorithm 1) that outputs an integral minimizer of with at most calls to with high probability.
Proof.
We consider the potential function
In the beginning, . A sequence of calls to CuttingPlaneMethod reduces the volume by a factor of ; consequently, the potential decreases additively by .
Without loss of generality, let us assume we have a maximal sequence of dimension reduction steps at .
Note that the potential at the beginning of this maximal sequence of dimension reduction iteration is
Between and , we note that is designed in the following fashion: first compute , then consider its outer ellipsoid ; note that this blows up the volume by a factor of . Finally, set to be the hyperrectangle containing , which is . This again blows up the volume by a factor of .
In the beginning of , we have and . By the preceding discussion, we have that
for some absolute constant ; similarly, we can upper bound the volume of as follows:
Recursively applying this inequality, we conclude
The potential after this sequence of dimension reduction iterations is
where the third step is by taking the dual lattice projection (Fact 3.10), the fourth step is due to Fact 3.9, and the last step is by the volume shrinking after cutting.
Since is a translation of the subspace , we can apply Lemma 4.17 by taking to obtain
(6) |
It remains to provide a lower bound on .
As CuttingPlaneMethod is used in iteration , we have
for the output vector , and that since a cutting plane iteration does not change the lattice.
Since the ApproxShortestVector procedure is -approximation and is a -spectral approximation to , this implies that
(7) |
This shows that after a sequence of dimension reduction iterations, the potential increases additively by at most . As there are at most such iterations, the total amount of potential increase is at most .
Finally we note that whenever the potential becomes smaller than , Minkowski’s first theorem shows the existence of a nonzero vector with . This implies the -approximation algorithm ApproxShortestVector for the shortest vector problem will find a nonzero vector with . So the next iteration where we run the LLL algorithm will always reduce the dimension.
Therefore, the sequence of CuttingPlaneMethod will be run at most
times.
Since each run of CuttingPlaneMethod uses oracle calls, this also gives the total number of oracle calls
This finishes the proof of the lemma.
∎
5.4 Runtime Analysis
The goal of this section is to prove Lemma 5.3.
Lemma 5.3 (Runtime part of Theorem 5.4).
Given a separation oracle for a convex function on such that the set of minimizers of is contained in a box of radius and all extreme points of are integral, there exists a randomized algorithm (Algorithm 1) that outputs an integral minimizer of , and uses arithmetic operations.
Proof.
As we use -ApproximateShortestVector with , the total number of oracle calls is .
The dimension reduction step occurs at most times. Each dimension reduction step takes at most arithmetic operations, which amounts to arithmetic operations in total.
The CuttingPlaneMethod is called with , so each call uses arithmetic operations. As such calls happen at most times, the total arithmetic operations for CuttingPlaneMethod is at most .
Regarding the number of calls to FasterShortestVector, we note that there are at most dimension reduction steps, and at most calls to the sequence of CPM. Thus, the total number of calls to FasterShortestVector can be upper bounded by , yielding a total of arithmetic operations. ∎
5.5 Main Result
The goal of this section is to prove Theorem 5.4.
Theorem 5.4 (Main result, formal version of Theorem 1.1).
Given a separation oracle for a convex function on such that the set of minimizers of is contained in a box of radius and all extreme points of are integral, there exists a randomized algorithm (Algorithm 1) that outputs an integral minimizer of with high probability, and uses
-
•
calls to .
-
•
additional arithmetic operations.
Acknowledgement
We would like to thank Jonathan Kelner for many helpful discussions and for suggesting the title of the paper. Lichen Zhang is supported in part by NSF grant No. 1955217 and NSF grant No. 2022448.
References
- AC [91] Ilan Adler and Steven Cosares. A strongly polynomial algorithm for a special class of linear programs. Operations Research, 39(6):955–960, 1991.
- AKS [01] Miklós Ajtai, Ravi Kumar, and D. Sivakumar. A sieve algorithm for the shortest lattice vector problem. In Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing, STOC ’01, 2001.
- Ans [97] Kurt M. Anstreicher. Volumetric path following algorithms for linear programming. Math. Program., 1997.
- AV [95] David S. Atkinson and Pravin M. Vaidya. A cutting plane algorithm for convex programming that uses analytic centers. Math. Program., 1995.
- BLSS [20] Jan van den Brand, Yin Tat Lee, Aaron Sidford, and Zhao Song. Solving tall dense linear programs in nearly linear time. In STOC, 2020.
- BV [04] Dimitris Bertsimas and Santosh Vempala. Solving convex programs by random walks. J. ACM, 2004.
- BWZ [16] Christos Boutsidis, David P Woodruff, and Peilin Zhong. Optimal principal component analysis in distributed and streaming models. In Proceedings of the forty-eighth annual ACM symposium on Theory of Computing, pages 236–249, 2016.
- CGJS [23] Deeparnab Chakrabarty, Andrei Graur, Haotian Jiang, and Aaron Sidford. Parallel submodular function minimization. In NeurIPS 2023, 2023.
- Chu [12] Sergei Chubanov. A strongly polynomial algorithm for linear systems having a binary solution. Mathematical programming, 134(2):533–570, 2012.
- Chu [15] Sergei Chubanov. A polynomial algorithm for linear optimization which is strongly polynomial under certain conditions on optimal solutions, 2015.
- CLS [19] Michael B Cohen, Yin Tat Lee, and Zhao Song. Solving linear programs in the current matrix multiplication time. In Proceedings of the 51st Annual ACM Symposium on Theory of Computing (STOC), 2019.
- CM [91] Edith Cohen and Nimrod Megiddo. Improved algorithms for linear inequalities with two variables per inequality. In Proceedings of the twenty-third annual ACM symposium on Theory of Computing, pages 145–155, 1991.
- CW [13] Kenneth L Clarkson and David P Woodruff. Low rank approximation and regression in input sparsity time. In Proceedings of the forty-fifth annual ACM symposium on Theory of Computing (STOC), pages 81–90, 2013.
- DHNV [20] Daniel Dadush, Sophie Huiberts, Bento Natura, and László A Végh. A scaling-invariant algorithm for linear programming whose running time depends only on the constraint matrix. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 761–774, 2020.
- Dik [67] I. I. Dikin. Iterative solution of problems of linear and quadratic programming. In Doklady Akademii Nauk, volume 174, pages 747–748. Russian Academy of Sciences, 1967.
- DSW [22] Yichuan Deng, Zhao Song, and Omri Weinstein. Discrepancy minimization in input-sparsity time. arXiv preprint arXiv:2210.12468, 2022.
- DVZ [21] Dan Dadush, László A Végh, and Giacomo Zambelli. Geometric rescaling algorithms for submodular function minimization. Mathematics of Operations Research, 46(3):1081–1108, 2021.
- DWZ [23] Ran Duan, Hongxun Wu, and Renfei Zhou. Faster matrix multiplication via asymmetric hashing. In FOCS, 2023.
- Edm [03] Jack Edmonds. Submodular functions, matroids, and certain polyhedra. In Combinatorial Optimization—Eureka, You Shrink! Papers Dedicated to Jack Edmonds 5th International Workshop Aussois, France, March 5–9, 2001 Revised Papers, pages 11–26. Springer, 2003.
- FI [03] Lisa Fleischer and Satoru Iwata. A push-relabel framework for submodular function minimization and applications to parametric optimization. Discrete Applied Mathematics, 131(2):311–322, 2003.
- GJS [23] Andrei Graur, Haotian Jiang, and Aaron Sidford. Sparse submodular function minimization. In FOCS 2023. IEEE, 2023.
- GLS [81] Martin Grötschel, László Lovász, and Alexander Schrijver. The ellipsoid method and its consequences in combinatorial optimization. Combinatorica, 1(2):169–197, 1981.
- GLS [84] Martin Grötschel, László Lovász, and Alexander Schrijver. Geometric methods in combinatorial optimization. In Progress in combinatorial optimization, pages 167–183. Elsevier, 1984.
- GLS [88] Martin Grötschel, László Lovász, and Alexander Schrijver. Geometric algorithms and combinatorial optimization. Springer, 1988.
- GSZ [23] Yuzhou Gu, Zhao Song, and Lichen Zhang. Faster algorithms for structured linear and kernel support vector machines, 2023.
- HJS+ [22] Baihe Huang, Shunhua Jiang, Zhao Song, Runzhou Tao, and Ruizhe Zhang. Solving SDP faster: A robust IPM framework and efficient implementation. In FOCS, 2022.
- IFF [01] Satoru Iwata, Lisa Fleischer, and Satoru Fujishige. A combinatorial strongly polynomial algorithm for minimizing submodular functions. Journal of the ACM (JACM), 48(4):761–777, 2001.
- IO [09] Satoru Iwata and James B Orlin. A simple combinatorial algorithm for submodular function minimization. In Proceedings of the twentieth annual ACM-SIAM symposium on Discrete algorithms, pages 1230–1237. SIAM, 2009.
- Iwa [03] Satoru Iwata. A faster scaling algorithm for minimizing submodular functions. SIAM Journal on Computing, 32(4):833–840, 2003.
- Iwa [08] Satoru Iwata. Submodular function minimization. Math. Program., 2008.
- Jia [21] Haotian Jiang. Minimizing convex functions with integral minimizers. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms, 2021.
- Jia [22] Haotian Jiang. Minimizing convex functions with rational minimizers. Journal of the ACM (JACM), 2022.
- JKL+ [20] Haotian Jiang, Tarun Kathuria, Yin Tat Lee, Swati Padmanabhan, and Zhao Song. A faster interior point method for semidefinite programming. In FOCS, 2020.
- JLSW [20] Haotian Jiang, Yin Tat Lee, Zhao Song, and Sam Chiu-wai Wong. An improved cutting plane method for convex optimization, convex-concave games, and its applications. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 944–953, 2020.
- JNW [22] Shunhua Jiang, Bento Natura, and Omri Weinstein. A faster interior-point method for sum-of-squares optimization. 49th International Colloquium on Automata, Languages, and Programming (ICALP), 2022.
- JSWZ [21] Shunhua Jiang, Zhao Song, Omri Weinstein, and Hengjie Zhang. Faster dynamic matrix inverse for faster lps. In STOC, 2021.
- Kha [80] L.G. Khachiyan. Polynomial algorithms in linear programming. USSR Computational Mathematics and Mathematical Physics, 1980.
- KLS [95] Ravi Kannan, László Lovász, and Miklós Simonovits. Isoperimetric problems for convex bodies and a localization lemma. Discrete & Computational Geometry, 13(3):541–559, 1995.
- Lev [65] Anatoly Yur’evich Levin. An algorithm for minimizing convex functions. In Doklady Akademii Nauk, volume 160, pages 1244–1247. Russian Academy of Sciences, 1965.
- LG [24] François Le Gall. Faster rectangular matrix multiplication by combination loss analysis. In SODA, 2024.
- LLL [82] Arjen K Lenstra, Hendrik Willem Lenstra, and László Lovász. Factoring polynomials with rational coefficients. Mathematische Annalen, 1982.
- LS [14] Yin Tat Lee and Aaron Sidford. Path finding methods for linear programming: Solving linear programs in iterations and faster algorithms for maximum flow. In 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, pages 424–433. IEEE, 2014.
- LS [19] Yin Tat Lee and Aaron Sidford. Solving linear programs with sqrt (rank) linear system solves. arXiv preprint arXiv:1910.08033, 2019.
- LSW [15] Yin Tat Lee, Aaron Sidford, and Sam Chiu-wai Wong. A faster cutting plane method and its implications for combinatorial and convex optimization. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, pages 1049–1065. IEEE, 2015.
- LSZ+ [23] S Cliff Liu, Zhao Song, Hengjie Zhang, Lichen Zhang, and Tianyi Zhou. Space-efficient interior point method, with applications to linear programming and maximum weight bipartite matching. In ICALP, 2023.
- McC [05] S Thomas McCormick. Submodular function minimization. Discrete Optimization, 12:321–391, 2005.
- Meg [83] Nimrod Megiddo. Towards a genuinely polynomial algorithm for linear programming. SIAM Journal on Computing, 12(2):347–353, 1983.
- New [65] Donald J Newman. Location of the maximum on unimodal surfaces. Journal of the ACM (JACM), 12(3):395–398, 1965.
- NN [89] Yu. E. Nesterov and A. S. Nemirovski. Self-concordant Functions and Polynomial-time Methods in Convex Programming. USSR Academy of Sciences, Central Economic & Mathematic Institute, 1989.
- NN [94] Yurii Nesterov and Arkadi Nemirovski. Interior-point polynomial algorithms in convex programming, volume 13. Siam, 1994.
- NS [16] Arnold Neumaier and Damien Stehlé. Faster LLL-type reduction of lattice bases. In Proceedings of the ACM on International Symposium on Symbolic and Algebraic Computation (ISSAC), pages 373–380, 2016.
- Orl [09] James B Orlin. A faster strongly polynomial time algorithm for submodular function minimization. Mathematical Programming, 118(2):237–251, 2009.
- OV [20] Neil Olver and László A Végh. A simpler and faster strongly polynomial algorithm for generalized flow maximization. Journal of the ACM (JACM), 67(2):1–26, 2020.
- Ren [88] James Renegar. A polynomial-time algorithm, based on Newton’s method, for linear programming. Math. Program., 1988.
- Rot [16] Thomas Rothvoss. Integer optimization and lattices. University of Washington, Spring, 2016.
- Sch [87] C.P. Schnorr. A hierarchy of polynomial time lattice basis reduction algorithms. Theor. Comput. Sci., 1987.
- Sch [00] Alexander Schrijver. A combinatorial algorithm minimizing submodular functions in strongly polynomial time. Journal of Combinatorial Theory, Series B, 80(2):346–355, 2000.
- Sho [77] Naum Shor. Cut-off method with space extension in convex programming problems. Cybernetics and systems analysis, 1977.
- Sma [98] Steve Smale. Mathematical problems for the next century. The mathematical intelligencer, 20(2):7–15, 1998.
- SWZ [17] Zhao Song, David P Woodruff, and Peilin Zhong. Low rank approximation with entrywise ℓ1-norm error. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 688–701, 2017.
- SWZ [19] Zhao Song, David P Woodruff, and Peilin Zhong. Relative error tensor low rank approximation. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2772–2789. SIAM, 2019.
- SYYZ [22] Zhao Song, Xin Yang, Yuanyuan Yang, and Tianyi Zhou. Faster algorithm for structured John ellipsoid computation. arXiv preprint arXiv:2211.14407, 2022.
- SYZ [23] Zhao Song, Mingquan Ye, and Lichen Zhang. Streaming semidefinite programs: passes, small space and fast runtime. arXiv preprint arXiv:2309.05135, 2023.
- Tar [86] Éva Tardos. A strongly polynomial algorithm to solve combinatorial linear programs. Operations Research, 34(2):250–256, 1986.
- TKE [88] S. P. Tarasov, L. G. Khachiyan, and I. I. Èrlikh. The method of inscribed ellipsoids. Dokl. Akad. Nauk SSSR, 1988.
- VA [93] Pravin M. Vaidya and David S. Atkinson. A Technique for Bounding the Number of Iterations in Path Following Algorithms, pages 462–489. World Scientific Publishing Company, 1993.
- Vai [89] P.M. Vaidya. A new algorithm for minimizing convex functions over convex sets. In 30th Annual Symposium on Foundations of Computer Science, pages 338–343, 1989.
- Vég [14] László A Végh. A strongly polynomial algorithm for generalized flow maximization. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 644–653, 2014.
- VWXXZ [24] Virginia Vassilevska Williams, Yinzhan Xu, Zixuan Xu, and Renfei Zhou. New bounds for matrix multiplication: from alpha to omega. In SODA, 2024.
- VY [96] Stephen A Vavasis and Yinyu Ye. A primal-dual interior point method whose running time depends only on the constraint matrix. Mathematical Programming, 74(1):79–120, 1996.
- Vyg [03] Jens Vygen. A note on Schrijver’s submodular function minimization algorithm. Journal of Combinatorial Theory, Series B, 88(2):399–402, 2003.
- YN [76] David Yudin and Arkadii Nemirovski. Evaluation of the information complexity of mathematical programming problems. Ekonomika i Matematicheskie Metody, 1976.