
First-Order Bilevel Topology Optimization for Fast Mechanical Design

Zherong Pan, Xifeng Gao, and Kui Wu
Authors are with the Lightspeed & Quantum Studio, Tencent-America. {zrpan,xifgao,[email protected]}
Abstract

Topology Optimization (TO), which maximizes structural robustness under material weight constraints, is becoming an essential step in the automatic design of mechanical parts. However, existing TO algorithms rely on Finite Element Analysis (FEA), which requires massive computational resources. We present a novel TO algorithm that incurs a much lower iterative cost. Unlike conventional methods that require exact inversions of large FEA system matrices at every iteration, we reformulate the problem as a bilevel optimization that can be solved using a first-order algorithm and only inverts the system matrix approximately. As a result, our method incurs a low iterative cost, and users can preview the TO results interactively for fast design updates. Theoretical convergence analysis and numerical experiments are conducted to verify the effectiveness of our method. We further discuss extensions to use high-performance preconditioners and fine-grained parallelism on the Graphics Processing Unit (GPU).

Index Terms:
Topology Optimization, Bilevel Optimization, Low-Cost Robotics, Mechanisms and Design

I Introduction

Several emerging techniques over the past decade are contributing to the new trend of lightweight, low-cost mechanical designs. Commodity-level additive manufacturing has been made possible by 3D printers and user-friendly modeling software, allowing a novice to prototype devices, consumer products, and low-end robots. While the appearance of a mechanical part is intuitive to perceive and modify, other essential properties, such as structural robustness, fragility, vibration frequencies, and manufacturing cost, are relatively hard to either visualize or manage. These properties are closely related to the ultimate performance criteria, including flexibility, robotic maneuverability, and safety. For example, the mass and inertia of a collaborative robot should be carefully controlled to avoid harming humans in the event of a collision. For manipulators on spaceships, the length of each part should be optimized to maximize the end-effector reachability under strict weight limits. Therefore, many mechanical designers rely on TO methods to search for the optimal structure of mechanical parts by tuning their center-of-mass [1], inertia [2], infill pattern [3], or material distribution [4].

Figure 1: Given the same amount of computational time (200 s) on a grid resolution of $128\times 256$, our method using approximate inverses (a) finds complex mechanical designs that resist the external load (red force pointing downwards), while the prior method [5] (b) is still far from convergence.

Unfortunately, current TO algorithms [6] pose a major computational bottleneck. Indeed, TO algorithms rely on FEA to predict the relationship between the extrinsic status (boundary loads) and intrinsic status (stress, strain, and energy densities) of a mechanical part. Such predictions are made by solving systems of (non-)linear equations during each iteration of TO. The design space and resulting system size are often gigantic when designing fine-grained material shapes. For example, the $128\times 256$ design space illustrated in Figure 1 induces $10^{4}$ decision variables. Prior works make full use of high-end GPUs [7] or many-core CPUs [8, 9] to solve such problems. Even after such aggressive optimization, the total time spent on FEA is typically at the level of hours. As a result, mechanical engineers cannot modify or preview their designs without waiting for tens of minutes or even hours, which significantly slows down the loops of design refinement.

Main Results: We propose a low-cost algorithm that solves a large subclass of TO problems known as Solid Isotropic Material with Penalization (SIMP) [10]. The SIMP model is widely used to search for fine-grained geometric shapes by optimizing the infill levels. Our main observation is that, in conventional TO solvers, the computational resources are dominated by solving the system of equations to machine precision. We argue that pursuing such precision for intermediary TO iterations is unnecessary, as one only expects the system to be exactly solved at final convergence. Therefore, we propose to maintain an approximate solution that is refined over iterations. We show that such a TO solver corresponds to a bilevel reformulation of the SIMP problems (Section III). We further propose a first-order algorithm to solve these bilevel problems with guaranteed convergence to a first-order critical point (Section IV) of moderate accuracy. Compared with conventional TO algorithms [6], our method incurs a much lower iterative cost of $\mathcal{O}(E\log E)$ ($E$ is the number of decision variables), allowing mechanical designers to interactively preview the TO results and accelerate the loops of design refinement. We then discuss several useful extensions, including different preconditioning schemes for refining the approximate solutions (Section IV-A) and a GPU-based implementation that achieves an asymptotic iterative cost of $\mathcal{O}(\log E)$ (Section IV-C). We highlight the effectiveness of our method through a suite of 2D benchmark problems from [11, 12, 13] (Section V), where our iterative cost can be as low as 11-40 ms when handling a resolution of up to 110k grid cells.

II Related Work

We review representative TO formulations, numerical solvers for TO problems, and bilevel optimizations.

TO Formulations & Applications: Generally speaking, TO involves all mathematical programming problems that use geometric shapes as decision variables. A TO problem has three essential components: a geometric representation, a governing system of equations, and design objectives/constraints. The SIMP model [10] uses a density-based representation where the geometric shape is discretized using FEA and the decision variables are the infill levels of discrete elements. There are other popular representations. For example, truss optimization [14] uses decision variables to model the existence of discrete beams. Infill pattern optimization [3] searches for the local arrangements of shape interiors. Compared with the SIMP model, these formulations typically induce far fewer decision variables due to the simplicity of shapes or locality.

In terms of the governing system of equations, the SIMP model assumes linearized elastic models, which means that the extrinsic and intrinsic properties of a mechanical part are linearly related. Linear models make fairly accurate predictions under the small-deformation assumption, which is the case in most mechanical design problems with rigid parts. For large deformations, there are extensions to the SIMP model that use nonlinear elastic models [15, 16] or even elastoplastic models [17]. However, the use of these nonlinear, non-convex models induces optimization problems of a more complex class than the SIMP model [18], to which our analysis does not apply. In a similar fashion to SIMP, truss optimization [19] has small- and large-deformation variants. We speculate that similar ideas and analysis apply to truss optimization problems under small deformations.

TO formulations deviate from each other mostly in their design objectives and constraints. The SIMP model minimizes the total potential energy under a single external load and a total-volume constraint. The standard SIMP model does not have constraints to limit deformations, but there are extensions that include strain/stress constraints [20]. Unfortunately, stress-constrained formulations also induce a more complex problem class, to which our analysis does not apply. Yet another extension to the SIMP model [21, 22, 23] is to consider multiple or uncertain external loads. Such extensions are favorable because a mechanical system can undergo a variety of unknown extrinsic situations. We speculate that it is possible to adapt our method to these cases by sampling the uncertain situations and using our method as the underlying solver for each sampled scenario.

Numerical TO Solvers: Three categories of numerical approaches have been extensively used: truncated gradient methods, general-purpose methods, and hierarchical methods. The reference implementation [10] of the SIMP solver uses the heuristic, truncated gradient method [24]. This method first computes the exact gradient and then updates the design along the negative gradient direction using heuristic step sizes. Although the convergence of these heuristic schemes is hard to analyze, they perform reasonably well in practice and are thus widely used, e.g. in [12]. General-purpose methods are off-the-shelf optimization algorithms (e.g., sequential quadratic programming (SQP), the interior point method [25], and the augmented Lagrangian method (ALM) [26]). The generality of these solvers makes it easier to experiment with different TO formulations. Without problem-specific optimization, however, they typically scale poorly to the high-dimensional decision spaces of practical problems. In particular, the (Globally Convergent) Method of Moving Asymptotes (GCMMA) [27] is a special form of SQP that can handle a large set of box constraints, making it particularly suitable for SIMP-type problems. Finally, solving the problem hierarchically is an attractive way to tackle high-dimensional problems. [28] used a quadtree to focus computational resources on the material boundaries. The multigrid method [12, 29] constructs a hierarchy of grids to accelerate FEA system solves.

Bilevel Optimization [18]: When a constraint of an optimization problem (the high-level problem) involves another optimization (the low-level problem), the problem is considered bilevel. In our formulation, the high-level problem optimizes the material infill levels, while the low-level problem performs the FEA. Bilevel problems frequently arise in robotics for time optimality [30], trajectory search [31], and task-and-motion planning [32]. As a special case considered in our method, if the low-level problem is strictly convex, then its solution is unique and the low-level problem can be replaced by a well-defined solution mapping function. This treatment essentially reduces the bilevel problem to a single-level optimization. Otherwise, the low-level problem can be non-convex and the solution map multi-valued, and an additional mechanism is needed to compare and select low-level solutions (e.g., a maximum- or minimum-value function can be used, and the associated problem is NP-hard [33]). It is noteworthy that oftentimes an arbitrary solution of the low-level problem suffices, in which case a penalty method [34] can be used to approximate one of the low-level solutions. Bilevel formulations have been used in the TO community for years, but prior works resort to off-the-shelf solvers [35] or machine learning [36] to (approximately) solve the underlying optimization problem, while we utilize the special structure of the SIMP model to design our efficient, first-order method.

III Problem Formulation

In this section, we introduce the SIMP model using the notation of [12] (see Table II in our appendix for a list of symbols). This model uses FEA to discretize a mechanical part governed by the linear elastic constitutive law. If the material is undergoing an internal displacement $u(x):\Omega\to\mathbb{R}^{2}$, where $\Omega$ is the material domain, then the internal potential energy is accumulated according to the constitutive law as follows:

P[u(\bullet),v(\bullet)]=\int_{\Omega}\frac{1}{2}u(x)^{T}k_{e}(v(x),x)u(x)\,dx,

where $P[\bullet,\bullet]$ is the potential energy functional and $k_{e}$ is the spatially varying stiffness tensor parameterized by the infinite-dimensional decision variable $v(x):\Omega\to\mathbb{R}$ that determines the infill levels. FEA discretizes the material domain $\Omega$ into a set of $E$ elements, each spanning the sub-domain $\Omega_{e}$. The elements are connected by a set of $N$ nodes. By restricting $P[\bullet]$ to a certain finite-dimensional functional space (the shape space), the potential energy can be written as the following sum over elements:

P(u,v)=\frac{1}{2}u^{T}\sum_{e=1}^{E}K_{e}(v)u\triangleq\frac{1}{2}u^{T}K(v)u,
K_{e}(v)\triangleq\int_{\Omega_{e}}k_{e}(v,x)\,dx.

Here $K_{e}$ is the stiffness matrix of the $e$th element and $K$ is the global stiffness matrix assembled from the $K_{e}$. With a slight abuse of notation, we denote by $P(\bullet,\bullet)$ a finite-dimensional potential energy function, by $u\in\mathbb{R}^{2N}$ without brackets a vector of 2D nodal displacements ($u\in\mathbb{R}^{3N}$ in 3D), and by $v\in\mathbb{R}^{E}$ without brackets a vector of elementwise infill levels. The SIMP model is compatible with all kinds of FEA discretization schemes; the case of a regular grid is illustrated in Figure 2.
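For concreteness, the following is a minimal Python/NumPy sketch of this assembly on a regular nelx-by-nely grid, assuming the standard bilinear-quad element stiffness matrix from the 99-line SIMP code [10] and the power-law parameterization introduced below; all names are our own illustration rather than the paper's implementation.

```python
import numpy as np
from scipy.sparse import coo_matrix

def element_stiffness(nu=0.3):
    # 8x8 stiffness matrix K_e of a bilinear quadrilateral element with unit
    # Young's modulus and Poisson ratio nu, as in the 99-line SIMP code [10].
    k = np.array([1/2 - nu/6, 1/8 + nu/8, -1/4 - nu/12, -1/8 + 3*nu/8,
                  -1/4 + nu/12, -1/8 - nu/8, nu/6, 1/8 - 3*nu/8])
    idx = np.array([[0, 1, 2, 3, 4, 5, 6, 7],
                    [1, 0, 7, 6, 5, 4, 3, 2],
                    [2, 7, 0, 5, 6, 3, 4, 1],
                    [3, 6, 5, 0, 7, 2, 1, 4],
                    [4, 5, 6, 7, 0, 1, 2, 3],
                    [5, 4, 3, 2, 1, 0, 7, 6],
                    [6, 3, 4, 1, 2, 7, 0, 5],
                    [7, 2, 1, 4, 3, 6, 5, 0]])
    return k[idx] / (1 - nu**2)

def element_dofs(ex, ey, nely):
    # The 8 global DOF indices of element (ex, ey): two DOFs per corner node,
    # with nodes numbered column-by-column as in [10].
    n1 = (nely + 1) * ex + ey        # upper-left node
    n2 = (nely + 1) * (ex + 1) + ey  # upper-right node
    return np.array([2*n1, 2*n1+1, 2*n2, 2*n2+1,
                     2*n2+2, 2*n2+3, 2*n1+2, 2*n1+3])

def assemble_K(v, nelx, nely, eta=3.0, KE=None):
    # K(v) = sum_e v_e^eta * K_e (power-law SIMP), with v indexed e = ex*nely + ey.
    KE = element_stiffness() if KE is None else KE
    ndof = 2 * (nelx + 1) * (nely + 1)
    rows, cols, vals = [], [], []
    for ex in range(nelx):
        for ey in range(nely):
            edof = element_dofs(ex, ey, nely)
            rows.append(np.repeat(edof, 8))
            cols.append(np.tile(edof, 8))
            vals.append((v[ex * nely + ey] ** eta) * KE.ravel())
    return coo_matrix((np.concatenate(vals),
                       (np.concatenate(rows), np.concatenate(cols))),
                      shape=(ndof, ndof)).tocsr()
```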

Figure 2: Key steps of the SIMP model and the design loop: define the SIMP problem with the external load (red) and fixture (brown); compute the displacement $u$ using FEA (a); perform sensitivity analysis to compute $\nabla l$ (b); update the infill levels (c); repeat (b,c,d) until convergence; deploy on a real robot (e).

Suppose an external force $f$ is exerted on the mechanical part; then the total (internal+external) potential energy is:

P_{f}(u,v)=\frac{1}{2}u^{T}K(v)u-f^{T}u.

The SIMP model works with arbitrary force setups, but external forces are only applied to the boundary in practice. For example, if only the $i$th boundary node is under a force $f_{i}$, then we have $f=e_{i}f_{i}$ with $e_{i}$ being the corresponding unit vector. The equilibrium state is derived by minimizing $P_{f}$ with respect to $u$, giving $u_{f}=K^{-1}(v)f$ (Figure 2a). As the name suggests, the two variables $u_{f},f$ are linearly related under the linear elastic constitutive law. The goal of TO is to minimize the induced internal potential energy due to $f$:

\underset{v\in X}{\text{argmin}}\;l(v)\triangleq\frac{1}{2}u_{f}^{T}(v)K(v)u_{f}(v)=\frac{1}{2}f^{T}K^{-1}(v)f,\qquad(1)

where $X$ is a compact, convex set bounding $v$ away from potentially singular configurations. Several widely used stiffness matrix parameterizations of $K(v)$ include the linear law $K(v)=\sum_{e=1}^{E}v_{e}K_{e}$ and the power law $K(v)=\sum_{e=1}^{E}v_{e}^{\eta}K_{e}$, where $\eta>0$ is some constant. In both cases, we need to choose $X$ such that the spectrum of $K$ is bounded from both sides:

\mathbb{0}<\underline{\rho}\mathbb{1}\leq\rho(K(v))\leq\bar{\rho}\mathbb{1},\qquad(2)

for any $v\in X$, where $\rho(\bullet)$ denotes the eigenvalues of a matrix. A typical choice of $X$ to this end is:

X=\{v\;|\;\mathbb{1}\geq v\geq\underline{v}\mathbb{1}>\mathbb{0}\;\land\;\mathbb{1}^{T}v\leq\bar{v}\},

where $v_{e}$ is the infill level of the $e$th element, $\underline{v}$ is the minimal elementwise infill level, $\bar{v}$ is the total infill level, and $\mathbb{1},\mathbb{0}$ are the all-one and all-zero vectors, respectively. In this work, we use the following more general assumption on the stiffness matrix parameterization and the shape of $X$:

Assumption III.1.

$X$ is a compact polytopic subset of $\mathbb{R}^{E}$, on which Equation 2 holds and $K(v)$ is smooth.

Remark III.2.

The compactness of $X$ is obvious. $X$ is also polytopic because we only have linear constraints, which is essential when we measure optimality using the relative projection error (see Section IV for details). Assumption III.1 is more general than the standard SIMP model. Indeed, $K(v)$ can be parameterized using any smooth activation function $\mathcal{A}_{e}(v)$ for the $e$th element as $K(v)=\sum_{e=1}^{E}\mathcal{A}_{e}(v)K_{e}$, which obviously satisfies Assumption III.1 as long as $\mathcal{A}_{e}(v)$ is bounded from both above and below on $X$, away from zero. Our assumption also allows a variety of topology modifications and constraints, some of which are illustrated here. Symmetry-Constraint: We can plug in a left-right symmetric mapping matrix $\mathbb{S}$ and define $K(v)=K(\mathbb{S}v_{s})$, where $v_{s}$ contains the infill levels for the left half of the material block. Component-Cost-Constraint: A mechanical part can be divided into different sub-components and the amount of material can be assigned for each component as follows:

X\triangleq\{v\;|\;v=({v^{1}}^{T},{v^{2}}^{T},\cdots,{v^{\#}}^{T})^{T}\;\land\;\mathbb{1}^{T}v^{i}\leq\bar{v}^{i}\},

where $\#$ is the number of sub-components, $v^{i}$ is the sub-vector of $v$ consisting of the decision variables for the $i$th component, and $\bar{v}^{i}$ is the total allowed amount of material for that sub-component. Material-Filtering: [37] proposed to control the thickness of materials by filtering the infill levels using some convolutional kernel denoted as $\mathbb{C}$. As long as the convolution operator $\mathbb{C}(\bullet)$ is smooth, Assumption III.1 holds by plugging in $K(v)=\sum_{e=1}^{E}\mathcal{A}_{e}(\mathbb{C}(v))K_{e}$.
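As a concrete instance of such a filter, below is a minimal sketch of a separable Gaussian kernel $\mathbb{C}$ applied to the infill grid; the kernel width sigma is our own assumption (the experiments in Section V use a $7\times 7$ kernel, i.e., radius 3).

```python
import numpy as np

def gaussian_filter_separable(v_grid, radius=3, sigma=1.5):
    # Material filter C(v): a (2*radius+1)^2 Gaussian kernel applied as two
    # 1D passes (separability). v_grid is the infill field of shape (nely, nelx).
    x = np.arange(-radius, radius + 1)
    w = np.exp(-0.5 * (x / sigma) ** 2)
    w /= w.sum()  # weights sum to 1, so the output stays inside [v_min, 1]
    # Pad with edge values so the filtered field keeps its range near boundaries.
    padded = np.pad(v_grid, radius, mode='edge')
    # Horizontal pass, then vertical pass.
    tmp = np.apply_along_axis(lambda r: np.convolve(r, w, mode='valid'), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, w, mode='valid'), 0, tmp)
```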

All conventional methods for solving Equation 1, such as (GC)MMA [38, 27], SQP [5, 25], and [12], involve computing the exact gradient via sensitivity analysis (Figure 2b):

\nabla l=\frac{1}{2}\frac{\partial u_{f}^{T}}{\partial v}Ku_{f}+\frac{1}{2}u_{f}^{T}\frac{\partial K}{\partial v}u_{f}+\frac{1}{2}u_{f}^{T}K\frac{\partial u_{f}}{\partial v}=-\frac{1}{2}u_{f}^{T}\frac{\partial K}{\partial v}u_{f},

where we have used the derivative of the matrix inverse. The major bottleneck lies in the computation of $u_{f}$, which involves an exact matrix inversion. Here we assume that $\frac{\partial K}{\partial v}$ is a third-order tensor of size $|u|\times|u|\times|v|$ and that its left- or right-multiplication means contraction along the first and second dimensions. Note that the convergence of GCMMA and SQP solvers relies on line-search schemes that involve the computation of objective function values, which in turn requires matrix inversion. Practical methods [12] and [10] avoid line-search schemes by using heuristic step sizes. Although these heuristic rules work well in practice, their theoretical convergence is difficult to establish.
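For the power law, this gradient reduces to an elementwise expression; below is a sketch consistent with the assembly helpers above (again our own naming, not the authors' code).

```python
def grad_l(u, v, nelx, nely, eta=3.0, KE=None):
    # Sensitivity analysis (Figure 2b) for K(v) = sum_e v_e^eta K_e: the e-th
    # entry is -1/2 u^T (dK/dv_e) u = -(eta/2) v_e^(eta-1) u_e^T K_e u_e,
    # where u_e gathers the 8 nodal DOFs of element e.
    KE = element_stiffness() if KE is None else KE
    g = np.empty(nelx * nely)
    for ex in range(nelx):
        for ey in range(nely):
            e = ex * nely + ey
            ue = u[element_dofs(ex, ey, nely)]
            g[e] = -0.5 * eta * v[e] ** (eta - 1) * (ue @ KE @ ue)
    return g
```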

IV Bilevel Topology Optimization

In this section, we first propose our core framework to solve SIMP problems that allows inexact matrix inversions. We then discuss extensions to accelerate convergence via preconditioning (Section IV-A) and fast projection operators (Section IV-B). Finally, we briefly discuss parallel algorithm implementations (Section IV-C). Inspired by the recent advent of first-order bilevel optimization algorithms [39, 40, 41], we propose to reformulate Equation 1 as the following bilevel optimization:

\underset{v\in X}{\text{argmin}}\;f(u,v)\triangleq\frac{1}{2}u_{f}^{T}K(v)u_{f}\qquad(3)
\text{s.t.}\;u_{f}\in\underset{u}{\text{argmin}}\;P_{f}(u,v).

The low-level part of Equation 3 is a least-squares problem in $u$ and, since $K(v)$ is always positive definite when $v\in X$, the low-level solution is unique. Plugging the low-level solution into the high-level objective function recovers Equation 1. The first-order bilevel optimization solves Equation 3 by time-integrating the discrete dynamic system described in Algorithm 1, and we denote this algorithm as First-Order Bilevel Topology Optimization (FBTO). Line 2 of Algorithm 1 is a single damped Jacobi iteration for refining the low-level solution using an adaptive step size $\beta_{k}$. But instead of performing multiple Jacobi iterations until convergence, we go ahead and use the inexact result after one iteration for sensitivity analysis. Finally, we use projected gradient descent to update the infill levels with an adaptive step size $\alpha_{k}$ (Line 3).

Algorithm 1 FBTO
1: for $k\leftarrow 1,2,\cdots$ do
2:     $u_{k+1}\leftarrow u_{k}-\beta_{k}(K(v_{k})u_{k}-f)$
3:     $v_{k+1}\leftarrow\text{Proj}_{X}(v_{k}+\frac{\alpha_{k}}{2}u_{k}^{T}\frac{\partial K}{\partial v_{k}}u_{k})$

Here $\text{Proj}_{X}(\bullet)$ is the projection onto the convex set $X$ under the Euclidean distance. This scheme has a linear iterative cost of $\mathcal{O}(E)$, as compared with conventional methods that have superlinear iterative costs due to sparse matrix inversions (some operations of Algorithm 1 have a complexity of $\mathcal{O}(N)$, but we know that $E=\Theta(N)$).
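In code, Algorithm 1 is only a few lines. The sketch below reuses the helpers above and a projection proj_X (sketched in Section IV-B); the step-size constants are illustrative placeholders that must satisfy Assumptions IV.2 and IV.4 in practice.

```python
def fbto(v0, f, nelx, nely, v_bar, alpha0=0.001, beta=0.5, m=0.75, iters=5000):
    # First-Order Bilevel Topology Optimization (Algorithm 1). beta must obey
    # Assumption IV.2 (0 < beta < 2*rho_min/rho_max^2); alpha_k = alpha0 * k^-m.
    u = np.zeros_like(f)
    v = v0.copy()
    for k in range(1, iters + 1):
        K = assemble_K(v, nelx, nely)
        u = u - beta * (K @ u - f)           # Line 2: one inexact low-level step
        g = grad_l(u, v, nelx, nely)         # approximate gradient (Remark IV.1)
        v = proj_X(v - alpha0 * k**(-m) * g, v_bar=v_bar)  # Line 3: projected step
    return v, u
```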

Remark IV.1.

The succinct form of our approximate gradient ($-u_{k}^{T}(\partial K/\partial v_{k})u_{k}/2$ in Line 3) makes use of the special structure of the SIMP model, which is not possible for more general bilevel problems. If a general-purpose first-order bilevel solver were used, e.g. [39] and follow-up works, one would first compute/store $\nabla_{uv}P_{f}$ and then approximate $\nabla_{uu}P_{f}^{-1}\nabla_{uv}P_{f}$ via sampling. Although such stochastic approximation schemes do not require exact matrix inversions, they induce an increasing number of samples as the number of iterations grows. Instead, we make use of the cancellation between $K(v)$ in the high-level objective and $K^{-1}(v)$ in the low-level objective to avoid matrix inversion or its approximations, which allows our algorithm to scale to high dimensions without increasing the sample complexity.

As our main result, we show that Algorithm 1 is convergent under the following choices of parameters.

Assumption IV.2.

We choose a constant $\beta_{k}$ and some constant $p>0$ satisfying the following condition:

0<\beta_{k}<\frac{2\underline{\rho}}{\bar{\rho}^{2}}\;\land\;p>\frac{1-2\beta_{k}\underline{\rho}+\bar{\rho}^{2}\beta_{k}^{2}}{2\beta_{k}\underline{\rho}-\bar{\rho}^{2}\beta_{k}^{2}}.\qquad(4)
Assumption IV.3.

Define $\Delta_{k}\triangleq\|u_{k}-u_{f}(v_{k})\|^{2}$ and suppose $\Delta_{1}\leq U^{2}$; we choose the following uniformly bounded sequence $\{\Gamma_{k}\}$ with constants $\bar{\Gamma},\bar{\Theta}$:

\bar{\Theta}\triangleq\frac{p+1}{p}(1-2\beta_{k}\underline{\rho}+\beta_{k}^{2}\bar{\rho}^{2})\;\land\;\Gamma_{k}\leq\bar{\Gamma}\leq\frac{U^{2}(1-\bar{\Theta})}{(U+U_{f})^{4}},

where $U_{f}$ is the finite upper bound of $u_{f}(v)$ on $X$.

Assumption IV.4.

We choose $\alpha_{k}$ as follows:

\alpha_{k}=\frac{1}{k^{m}}\sqrt{\frac{\bar{\Gamma}}{L_{u}^{2}L_{\nabla K}^{2}}\frac{4}{p+1}},

for positive constants $m,p$, where $L_{u}$ is the L-constant of $u_{f}$ and $L_{\nabla K}$ is the upper bound of the Frobenius norm of $\partial K(v)/\partial v$ when $v\in X$.

Assumption IV.5.

We choose positive constants $m,n$ satisfying the following condition:

\max(1-2m+mn,\;2-2m-mn,\;2-3m)<0\;\land\;m<1.
Theorem IV.6.

Suppose $\{v_{k}\}$ is the sequence generated by Algorithm 1 under Assumptions III.1, IV.2, IV.3, IV.4, and IV.5. For any $\epsilon>0$, there exists an iteration number $k$ such that $\|\nabla_{X}l(v_{k})\|\leq\epsilon$, where $\nabla_{X}l(v_{k})$ is the negative gradient projected into the tangent cone $\mathcal{T}_{X}$ of $X$, defined as:

\nabla_{X}l(v_{k})=\underset{d\in\mathcal{T}_{X}}{\text{argmin}}\;\|d+\nabla l(v_{k})\|.

We prove Theorem IV.6 in Section VIII, where we further show that the high-level optimality error scales as $\mathcal{O}(k^{m-1})$, so that $m$ should be as small as possible for the best convergence rate. Direct calculation leads to the constraint $m>3/4$ from the first two conditions of Assumption IV.5, so the convergence rate can be arbitrarily close to $\mathcal{O}(k^{-1/4})$ as measured by the following relative projection error [42]:

\Delta_{k}^{v}\triangleq\frac{1}{\alpha_{k}^{2}}\|\text{Proj}_{X}(v_{k}-\alpha_{k}\nabla l(v_{k}))-v_{k}\|^{2}.

Although our convergence analysis uses a similar technique to that of the two-timescale method [40], our result is single-timescale. Indeed, the low-level step size $\beta_{k}$ can be constant, and users only need to tune a decaying step size $\alpha_{k}$. We will further show that certain versions of our algorithm allow a large choice of $\beta_{k}=1$ (see Section IX-E). In addition, our formula for choosing $\alpha_{k}$ does not rely on the maximal number of iterations of the algorithm.

Algorithm 2 PFBTO
1: for $k\leftarrow 1,2,\cdots$ do
2:     $\delta_{k}\leftarrow M^{-1}(v_{k})(K(v_{k})u_{k}-f)$
3:     $u_{k+1}\leftarrow u_{k}-\beta_{k}K(v_{k})M^{-1}(v_{k})\delta_{k}$
4:     $v_{k+1}\leftarrow\text{Proj}_{X}(v_{k}+\frac{\alpha_{k}}{2}u_{k}^{T}\frac{\partial K}{\partial v_{k}}u_{k})$
Algorithm 3 CPFBTO
1: for $k\leftarrow 1,2,\cdots$ do
2:     $u_{k+1}\leftarrow u_{k}-\beta_{k}M^{-1}(v_{k})(K(v_{k})u_{k}-f)$
3:     $v_{k+1}\leftarrow\text{Proj}_{X}(v_{k}+\frac{\alpha_{k}}{2}u_{k}^{T}\frac{\partial K}{\partial v_{k}}u_{k})$

IV-A Preconditioning

Our Algorithm 1 uses steepest gradient descent to solve the linear system in the low-level problem, which is known to have a slow convergence rate of $(1-1/\kappa)^{2k}/(1+1/\kappa)^{2k}$ [43], with $\kappa$ being the condition number of the linear system matrix. A well-developed technique to boost the convergence rate is preconditioning [44], i.e., pre-multiplying by a symmetric positive-definite matrix $M(v)$ that approximates $K(v)$ and whose inverse can be computed at low cost. Preconditioning is a widely used technique in the TO community [12, 45] to accelerate the convergence of iterative linear solvers in computing $u_{f}(v)=K^{-1}(v)f$. In this section, we show that FBTO can be extended to this setting by pre-multiplying by $M(v)$ in the low-level problem, and we name Algorithm 2 Pre-conditioned FBTO or PFBTO. Algorithm 2 solves the following bilevel program, different from Equation 3:

\underset{v\in X}{\text{argmin}}\;f(u,v)\triangleq\frac{1}{2}u_{f}^{T}K(v)u_{f}\qquad(5)
\text{s.t.}\;u_{f}\in\underset{u}{\text{argmin}}\;\|K(v)u-f\|^{2},

where the only difference is that the low-level system matrix is squared to $K^{2}(v)$. Since $K(v)$ is non-singular, the two problems have the same solution set. Equation 5 allows us to multiply by $M^{-1}(v)$ twice in Algorithm 2 to get the symmetric form $K(v)M^{-2}(v)K(v)$. To show the convergence of Algorithm 2, we only need the additional assumption that the spectrum of $M(v)$ is uniformly bounded:

Assumption IV.7.

For any $v\in X$, we have:

\mathbb{0}<\underline{\rho}_{M}\mathbb{1}\leq\rho(M(v))\leq\bar{\rho}_{M}\mathbb{1}.

Assumption IV.7 holds for all symmetric positive-definite preconditioners. The convergence can then be proved following the same reasoning as Theorem IV.6, with a minor modification summarized in Section VIII-E. Our analysis sheds light on the convergent behavior of prior works using highly efficient preconditioners such as the geometric multigrid [12, 45] and conjugate gradient methods [46]. We further extend these prior works by enabling convergent algorithms with lightweight preconditioners such as Jacobi/Gauss-Seidel iterations and approximate inverse schemes, to name just a few. The practical performance of Algorithm 2 highly depends on the design and implementation of the specific preconditioner.
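For instance, a Jacobi preconditioner $M(v)=\text{diag}(K(v))$ is one such lightweight choice, and it satisfies Assumption IV.7 whenever Equation 2 holds; a minimal sketch:

```python
from scipy.sparse import diags

def jacobi_precond_inv(K):
    # M(v) = diag(K(v)): symmetric positive-definite with spectrum bounded on X
    # whenever K(v)'s is, so Assumption IV.7 holds. Returns M^{-1} as a sparse
    # operator to be applied in Algorithm 2.
    return diags(1.0 / K.diagonal())
```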

On the downside of Algorithm 2, the system matrix in the low-level problem of Equation 5 is squared, and so is the condition number. Since the convergence speed of the low-level problem is $(1-1/\kappa)^{2k}/(1+1/\kappa)^{2k}$, squaring the system matrix can significantly slow down the convergence, counteracting the acceleration brought by preconditioning. This is because we need to derive a symmetric operator of the form $K(v)M^{-2}(v)K(v)$. As an important special case, however, a single application of $M^{-1}(v)$ suffices if $M(v)$ commutes with $K(v)$, leading to the Commutable PFBTO or CPFBTO Algorithm 3. A useful commuting preconditioner is the Arnoldi process used by the GMRES solver [47], which is defined as:

M^{-1}(v)b\triangleq\sum_{i=0}^{D}c_{i}^{*}K^{i}(v)b\qquad(6)
c_{i}^{*}\triangleq\underset{c_{i}}{\text{argmin}}\;\|b-\sum_{i=1}^{D+1}c_{i-1}K^{i}(v)b\|^{2},

where $D$ is the size of the Krylov subspace. Since it shares the same eigenvectors, $M(v)$ clearly commutes with $K(v)$. This is a standard technique used as the inner loop of the GMRES solver, where the least-squares problem in Equation 6 is solved via the Arnoldi iteration. The Arnoldi process can be efficiently updated through iterations if $K(v)$ is fixed, which is not the case in our problem. Therefore, we choose the more numerically stable Householder QR factorization to solve the least-squares problem, and we name Equation 6 the Krylov preconditioner.
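Below is a minimal sketch of applying this Krylov preconditioner; for brevity it solves the small $(D+1)$-variable least-squares problem with a dense solver instead of the Householder QR factorization used in the paper, which does not change the result.

```python
def krylov_apply(K, b, D=20):
    # M^{-1}(v) b = sum_{i=0}^{D} c_i* K^i b, with c* minimizing
    # ||b - sum_{i=1}^{D+1} c_{i-1} K^i b||^2 (Equation 6).
    cols = []
    w = b
    for _ in range(D + 1):          # Krylov basis [K b, K^2 b, ..., K^{D+1} b]
        w = K @ w
        cols.append(w)
    A = np.stack(cols, axis=1)      # |u| x (D+1)
    c, *_ = np.linalg.lstsq(A, b, rcond=None)  # c_i* = argmin ||b - A c||
    # Shift the basis down by one power of K: [b, K b, ..., K^D b].
    basis = [b] + cols[:-1]
    return sum(ci * wi for ci, wi in zip(c, basis))
```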

To show the convergence of Algorithm 3, we use a slightly different analysis summarized in the following assumptions and theorem:

Assumption IV.8.

We choose a constant $\beta_{k}$ and some constant $p>0$ satisfying the following condition:

0<\beta_{k}<\frac{2\underline{\rho}\underline{\rho}_{M}^{2}}{\bar{\rho}^{2}\bar{\rho}_{M}}\;\land\;p>\frac{1-2\beta_{k}\frac{\underline{\rho}}{\bar{\rho}_{M}}+\beta_{k}^{2}\frac{\bar{\rho}^{2}}{\underline{\rho}_{M}^{2}}}{2\beta_{k}\frac{\underline{\rho}}{\bar{\rho}_{M}}-\beta_{k}^{2}\frac{\bar{\rho}^{2}}{\underline{\rho}_{M}^{2}}}.\qquad(7)
Assumption IV.9.

Define $\Xi_{k}\triangleq\|K(v_{k})u_{k}-u_{f}\|^{2}$ and suppose $\Xi_{1}\leq U^{2}$; we choose the following uniformly bounded sequence $\{\Gamma_{k}\}$ with constants $\bar{\Gamma},\bar{\Theta}$:

\Theta_{k}\triangleq\frac{p+1}{p}(1-2\beta_{k}\frac{\underline{\rho}}{\bar{\rho}_{M}}+\beta_{k}^{2}\frac{\bar{\rho}^{2}}{\underline{\rho}_{M}})\;\land\;\Gamma_{k}\leq\bar{\Gamma}\leq\frac{U^{2}(1-\bar{\Theta})}{(U/\underline{\rho}+U_{f})^{6}},

where $U_{f}$ is the finite upper bound of $u_{f}(v)$ on $X$.

Assumption IV.10.

We choose $\alpha_{k}$ as follows:

\alpha_{k}=\frac{1}{k^{m}}\sqrt{\frac{\bar{\Gamma}}{L_{K}^{2}L_{\nabla K}^{2}}\frac{4}{p+1}},

for some positive constants $m,p$, where $L_{K}$ is the L-coefficient of $K(v)$'s Frobenius norm on $X$.

Theorem IV.11.

Suppose $M(v)$ commutes with $K(v)$ and $\{v_{k}\}$ is the sequence generated by Algorithm 3 under Assumptions III.1, IV.8, IV.9, IV.10, IV.5, and IV.7. For any $\epsilon>0$, there exists an iteration number $k$ such that $\|\nabla_{X}l(v_{k})\|\leq\epsilon$, where $\nabla_{X}l(v_{k})$ is the negative gradient projected into the tangent cone of $X$.

The proof of Theorem IV.11 is provided in Section VIII-E, which allows the choice of a large, constant step size $\beta_{k}=1$ when the Krylov preconditioner is used.
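Putting the pieces together, the following sketch assembles Algorithm 3 with the Krylov preconditioner from the helpers sketched throughout this section (including proj_X from Section IV-B); the default parameters mirror the experimental choices in Section V but remain illustrative.

```python
def cpfbto(v0, f, nelx, nely, v_bar, alpha0=0.25, beta=1.0, m=0.75, D=20, iters=5000):
    # Commutable Preconditioned FBTO (Algorithm 3) with the Krylov
    # preconditioner of Equation 6, for which beta = 1 is admissible.
    u = np.zeros_like(f)
    v = v0.copy()
    for k in range(1, iters + 1):
        K = assemble_K(v, nelx, nely)
        u = u - beta * krylov_apply(K, K @ u - f, D)  # preconditioned low-level step
        g = grad_l(u, v, nelx, nely)
        v = proj_X(v - alpha0 * k**(-m) * g, v_bar=v_bar)
    return v, u
```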

IV-B Efficient Implementation

We present some practical strategies to further improve the computational efficiency of Algorithms 1, 2, and 3 that are compatible with our theoretical analysis. First, the total volume constraint is almost always active, as noted in [38]. Therefore, it is useful to maintain the total volume constraint when computing the approximate gradient. If we define the approximate gradient as $\tilde{\nabla}l\triangleq-\frac{1}{2}u_{k}^{T}(\partial K/\partial v_{k})u_{k}$ and consider the total volume constraint $\mathbb{1}^{T}v_{k}=\bar{v}$, then a projected gradient can be computed by solving:

\tilde{\nabla}_{P}l\triangleq\underset{\mathbb{1}^{T}\tilde{\nabla}_{P}l=0}{\text{argmin}}\;\|\tilde{\nabla}_{P}l-\tilde{\nabla}l\|^{2},

with the analytic solution being $\tilde{\nabla}_{P}l=\tilde{\nabla}l-\mathbb{1}\mathbb{1}^{T}\tilde{\nabla}l/E$. In our experiments, using $\tilde{\nabla}_{P}l$ in place of $\tilde{\nabla}l$ in the last line of the FBTO algorithms can boost the convergence rate at an early stage of optimization.
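In code, this projection is a one-liner (a sketch; g stands for the approximate gradient $\tilde{\nabla}l$):

```python
def project_sum_zero(g):
    # Analytic solution above: g_P = g - 1 (1^T g) / E, i.e. subtract the mean,
    # so the projected step preserves the total volume 1^T v = v_bar.
    return g - g.mean()
```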

Second, the implementation of the projection operator $\text{Proj}_{X}(\bullet)$ can be costly in high-dimensional cases. However, our convex set $X$ typically takes a special form that consists of mostly bound constraints $\mathbb{1}\geq v\geq\underline{v}\mathbb{1}$ with only one summation constraint $\mathbb{1}^{T}v\leq\bar{v}$. Such a special convex set is known as a simplex, and a special $\mathcal{O}(E\log E)$ algorithm exists for implementing $\text{Proj}_{X}(\bullet)$, as proposed in [48]. Although the original algorithm [48] only considers the equality constraint $\mathbb{1}^{T}v=1$ and single-sided bounds $v\geq 0$, a similar technique can be adapted to handle our two-sided bounds and inequality summation constraint, and we provide its derivation for completeness. We begin by checking whether the equality constraint is active. We can immediately return if $\mathbb{1}^{T}\text{Proj}_{\{\mathbb{1}\geq v\geq\underline{v}\mathbb{1}\}}(v_{k})\leq\bar{v}$. Otherwise, the projection operator amounts to solving the following quadratic program:

\underset{\mathbb{1}\geq v\geq\underline{v}\mathbb{1}}{\text{argmin}}\;\frac{1}{2}\|v-v_{k}\|^{2}\quad\text{s.t.}\;\mathbb{1}^{T}v=\bar{v},

whose Lagrangian $\mathcal{L}$ and first-order optimality conditions are:

\mathcal{L}=\frac{1}{2}\|v-v_{k}\|^{2}+\lambda_{\mathbb{1}}^{T}(\mathbb{1}-v)+\lambda_{\underline{v}}^{T}(v-\underline{v}\mathbb{1})+\lambda(\mathbb{1}^{T}v-\bar{v})
v=v_{k}+\lambda_{\mathbb{1}}-\lambda_{\underline{v}}-\lambda\mathbb{1}\;\land\;\mathbb{0}\leq\lambda_{\mathbb{1}}\perp\mathbb{1}-v\geq\mathbb{0}\;\land\;\mathbb{0}\leq\lambda_{\underline{v}}\perp v-\underline{v}\mathbb{1}\geq\mathbb{0},

where $\lambda_{\mathbb{1}},\lambda_{\underline{v}},\lambda$ are Lagrange multipliers. We can conclude that $v=\text{Clamp}(v_{k}-\lambda\mathbb{1},\underline{v}\mathbb{1},\mathbb{1})$, and the following equality holds due to the constraint $\mathbb{1}^{T}v=\bar{v}$ being active:

\bar{v}=\mathbb{1}^{T}\text{Clamp}(v_{k}-\lambda\mathbb{1},\underline{v}\mathbb{1},\mathbb{1}),\qquad(8)

which is a piecewise linear equation having at most $2E$ pieces. There are $E$ left-nodes of the form $v_{k}-\underline{v}\mathbb{1}$ and $E$ right-nodes of the form $v_{k}-\mathbb{1}$, separating the different linear pieces. The piecewise linear equation can be solved for $\lambda$, and thus $v$, by first sorting the $2E$ nodes at a cost of $\mathcal{O}(E\log E)$ and then examining each piece for the solution. We summarize this process in Algorithm 4, where we maintain the sorted end points of the line segments via running sums.

Algorithm 4 Project $v_{k}$ onto $X$ (SLOPE is the slope of each line segment, RHS is the right-hand side of Equation 8)
1: if $\mathbb{1}^{T}\text{Proj}_{\{\mathbb{1}\geq v\geq\underline{v}\mathbb{1}\}}(v_{k})\leq\bar{v}$ then
2:     Return $\text{Proj}_{\{\mathbb{1}\geq v\geq\underline{v}\mathbb{1}\}}(v_{k})$
3: $\text{ARRAY}\leftarrow\emptyset$
4: for $v_{k}^{e}\in v_{k}$ do
5:     $\text{ARRAY}\leftarrow\text{ARRAY}\cup\{v_{k}^{e}-\underline{v},\;v_{k}^{e}-1\}$
6: Sort ARRAY from low to high ▷ $\mathcal{O}(E\log E)$
7: $\text{SLOPE}\leftarrow 0$, $\text{RHS}\leftarrow E$, $\text{NODES}\leftarrow\emptyset$, $\lambda_{\text{last}}^{*}\leftarrow 0$
8: for $\lambda^{*}\in\text{ARRAY}$ do ▷ $\mathcal{O}(E)$
9:     $\text{RHS}\leftarrow\text{RHS}+\text{SLOPE}\,(\lambda^{*}-\lambda_{\text{last}}^{*})$
10:     if $\lambda^{*}$ is of the form $v_{k}^{e}-1$ then
11:         $\text{SLOPE}\leftarrow\text{SLOPE}-1$
12:     else $\text{SLOPE}\leftarrow\text{SLOPE}+1$ ▷ $\lambda^{*}$ is of the form $v_{k}^{e}-\underline{v}$
13:     $\text{NODES}\leftarrow\text{NODES}\cup\{\langle\lambda^{*},\text{RHS}\rangle\}$, $\lambda_{\text{last}}^{*}\leftarrow\lambda^{*}$
14: for consecutive $\text{NODE},\text{NODE}^{\prime}\in\text{NODES}$ do ▷ $\mathcal{O}(E)$
15:     if segment $\langle\text{NODE},\text{NODE}^{\prime}\rangle$ passes through $\bar{v}$ then
16:         Solve for $\lambda$ and return $\text{Clamp}(v_{k}-\lambda\mathbb{1},\underline{v}\mathbb{1},\mathbb{1})$
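The following is a compact sketch of Algorithm 4. For clarity, it re-evaluates the clamp sum on each candidate segment, giving an $\mathcal{O}(E^{2})$ worst case; the running-sum bookkeeping of Algorithm 4 restores the $\mathcal{O}(E\log E)$ cost.

```python
def proj_X(vk, v_bar=None, v_min=0.1):
    # Euclidean projection onto X = {v_min*1 <= v <= 1, 1^T v <= v_bar}.
    box = np.clip(vk, v_min, 1.0)
    if v_bar is None or box.sum() <= v_bar:
        return box                          # Lines 1-2: volume budget inactive
    def total(lam):
        # 1^T Clamp(vk - lam, v_min, 1): the right-hand side of Equation 8.
        return np.clip(vk - lam, v_min, 1.0).sum()
    # 2E breakpoints separating the linear pieces of total(.), sorted low to high.
    bps = np.sort(np.concatenate([vk - 1.0, vk - v_min]))
    lo = 0
    for hi in range(len(bps)):              # find the segment crossing v_bar
        if total(bps[hi]) <= v_bar:
            break
        lo = hi
    # Linear interpolation on the crossing segment [bps[lo], bps[hi]].
    tlo, thi = total(bps[lo]), total(bps[hi])
    lam = bps[lo] if tlo == thi else \
        bps[lo] + (tlo - v_bar) * (bps[hi] - bps[lo]) / (tlo - thi)
    return np.clip(vk - lam, v_min, 1.0)
```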
Figure 3: A gallery of benchmark problems selected from [11, 12] and 2D versions of problems in [13], which are solved using Algorithm 3.

IV-C Fine-Grained Parallelism

Figure 4: The acceleration rate of a low-end mobile GPU (1280 cores) versus a CPU (6 cores) implementation under different resolutions of discretization. We achieve a maximal speedup of 16 times.

Prior work [12] proposes to accelerate the TO solver on the GPU, but it requires a complex GPU multigrid implementation that is also costly to compute per iteration. In comparison, our Algorithms 1, 2, and 3 can make full use of many-core hardware with slight modifications. In this section, we discuss the necessary modifications for a GPU implementation, assuming the availability of basic parallel scan, reduction, inner-product, and radix sort operations [49]. A bottleneck in Algorithms 1, 2, and 3 is the matrix-vector product $K(v)u$, whose implementation depends on the type of FEA discretization. If the discretization uses a regular pattern, such as the regular grid in Figure 2, then the element-to-node mapping is known and can be hard-coded, so the computation for each node is independent and costs $\mathcal{O}(1)$ if $E$ thread blocks are available. For irregular discretizations, we have to store the sparse matrix explicitly and suggest using a parallel sparse matrix-vector product routine [50]. To implement the Krylov preconditioner, we perform an in-place QR factorization to solve for $c_{i}^{*}$. The in-place QR factorization involves computing inner products $D^{2}$ times and then solving the upper-triangular system. Altogether, the least-squares solve in Equation 6 costs $\mathcal{O}(D^{2}\log E+D^{2})$ when $E$ threads are available. Here we use a serial implementation of the upper-triangular solve, which is not a performance penalty when $D\ll E$. Finally, Algorithm 4 involves one radix sort that costs $\mathcal{O}(\log E)$ and three for-loops; the first and last loops have no data dependency and cost $\mathcal{O}(1)$. The second loop accumulates two variables (SLOPE, RHS), which can be accomplished by a parallel scan taking $\mathcal{O}(\log E)$. In summary, the cost of each iteration of Algorithms 1, 2, and 3 can be reduced to $\mathcal{O}(D^{2}\log E)$ when $\mathcal{O}(E)$ threads are available. We profile the parallel acceleration rate in Figure 4.
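As an illustration of the regular-grid case, the following sketch evaluates $K(v)u$ matrix-free in vectorized NumPy; the gather/apply stage maps to one GPU thread per element, while the scatter-add (emulated here with np.add.at) requires atomics or graph coloring on a GPU.

```python
def matvec_regular_grid(u, v, nelx, nely, eta=3.0, KE=None):
    # Matrix-free K(v)u: gather the 8 DOFs of every element, apply v_e^eta * K_e,
    # and scatter-add the element contributions back to the global vector.
    KE = element_stiffness() if KE is None else KE
    ex, ey = np.meshgrid(np.arange(nelx), np.arange(nely), indexing='ij')
    n1 = (nely + 1) * ex + ey
    n2 = (nely + 1) * (ex + 1) + ey
    edof = np.stack([2*n1, 2*n1+1, 2*n2, 2*n2+1,
                     2*n2+2, 2*n2+3, 2*n1+2, 2*n1+3], axis=-1).reshape(-1, 8)
    fe = (v ** eta)[:, None] * (u[edof] @ KE)  # KE symmetric, so u_e @ KE = KE @ u_e
    out = np.zeros_like(u)
    np.add.at(out, edof, fe)                   # scatter-add (atomics on GPU)
    return out
```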

Figure 5: For the benchmark problem of Figure 1, we run the bilevel optimization (Algorithm 3) for 5000 iterations and the single-level optimization (PGD) for 200 iterations. (a): We profile our low- and high-level error evolution, plotted against the number of iterations. (b): We compare the convergence history of the true energy function value $l(v_{k})$ of both algorithms, plotted against computational time. Our method not only converges faster but also converges to a better solution.
Figure 6: For the benchmark problem in Figure 3e, we compare the convergence history of three choices of $\alpha_{0}$. A larger $\alpha_{0}$ leads to more fluctuations at an early stage, but all three optimizations ultimately converge. When $\alpha_{0}>10^{-2}$, we observe divergence.
Figure 7: We compare the convergence history of Algorithm 1 and Algorithm 3 under different $D$, in terms of the exact energy value plotted against computational time. We need to use an extremely small $\alpha_{0}=0.001$ for Algorithm 1 to converge. For Algorithm 3 with $D\geq 5$, a much larger $\alpha_{0}=0.25$ can be used, although there are some initial fluctuations when $D=5$. Algorithm 3 with $D=20$ exhibits the highest overall convergence speed.
Figure 8: We run our method for 5000 iterations and plot the final exact energy value against $\alpha_{0}$, where $\log(l(v_{k}))>9$ indicates divergence. When $D\geq 15$, our method is convergent under all the step sizes. Therefore, we choose $D=20$ in all examples and only leave $\alpha_{0}$ to be tuned by users.

V Experiments

In this section, we compare our most efficient Algorithm 3 with other methods that provide convergence guarantees. We implement our method in native C++ on a laptop machine with a 2.5 GHz 6-core Intel i7 CPU and a 1.48 GHz Nvidia GTX 1060 mobile GPU with 1280 cores. Our implementation makes full use of the CPU cores via OpenMP and of the GPU via CUDA, and we implement the regular grid FEA discretization. In all the examples, we apply a $7\times 7$ separable Gaussian kernel $\mathbb{C}$ as our material filter (see Remark III.2), which can be implemented efficiently on the GPU as a product of multiple linear kernels. We found that a $7\times 7$ kernel achieves the best balance between material thickness and boundary sharpness. Unless otherwise stated, we choose Algorithm 3 with $\underline{v}=0.1$, $\eta=3$, $\alpha_{k}=\alpha_{0}k^{-3/4}$, $\beta_{k}=1$, and $D=20$ for our Krylov preconditioner in all the experiments, where $\alpha_{0}$ varies across different experiments. We justify these parameter choices in Section IX-E. For the three algorithms in Section IV, we terminate when the relative change of the infill levels over two iterations ($\|v_{k+1}-v_{k}\|_{\infty}$) is smaller than $10^{-4}$ and the absolute error of the linear system solve ($\|K(v_{k})u_{k}-f\|_{\infty}$) is smaller than $10^{-2}$, which can be efficiently checked on the GPU by using a reduction operator that costs $\mathcal{O}(\log E)$. The two conditions measure the optimality of the high- and low-level problems, respectively. The latter is otherwise known as the tracking error, which measures how closely the low-level solution $u_{k}$ follows its optimum as the high-level solution $v_{k}$ changes. We choose 7 benchmark problems from [11, 12, 13] to profile our method, as illustrated in Figure 3 and summarized in Table I. Our iterative cost is at most 40 ms when handling a grid resolution of $192\times 576$, which allows mechanical designers to quickly preview the results.

Benchmark  | Res.    | Frac. | #Iter. | Time (Total) | Time (Iter.) | Mem.
Figure 1   | 128×256 | 0.4   | 49600  | 694          | 0.014        | 8
Figure 3a  | 192×576 | 0.3   | 14600  | 584          | 0.040        | 24
Figure 3b  | 192×576 | 0.3   | 8900   | 356          | 0.040        | 24
Figure 3c  | 192×384 | 0.3   | 37400  | 1159         | 0.031        | 18
Figure 3d  | 160×240 | 0.4   | 20400  | 346          | 0.017        | 8
Figure 3e  | 160×160 | 0.5   | 13300  | 146          | 0.011        | 4
Figure 3f  | 128×384 | 0.5   | 9300   | 195          | 0.021        | 10
Figure 3g  | 128×256 | 0.3   | 19400  | 252          | 0.013        | 8
TABLE I: Statistics of benchmark problems: benchmark index, grid resolution, volume fraction ($\bar{v}$), iterations until convergence, total computational time (s), iterative cost (s), and GPU memory cost (MB).

Convergence Analysis

We highlight the benefit of using approximate matrix inversions by comparing our method with Projected Gradient Descent (PGD). Using a single-level formulation, PGD differs from Algorithm 3 by using exact matrix inversions, i.e., replacing Line 2 with $u_{k+1}\leftarrow K^{-1}(v_{k})f$. We profile the empirical convergence rate of the two formulations in Figure 5 for solving the benchmark problem in Figure 1. The comparisons on other benchmark problems are given in Section VII. Exact, sparse matrix factorization prevents PGD from being implemented on the GPU efficiently, while our method allows fine-grained parallelism and is much more GPU-friendly. For comparison, we implement the other steps of PGD on the GPU while reverting to the CPU for the sparse matrix factorization. As shown in Figure 5a, the high- and low-level errors in our method suffer from some local fluctuations, but they are both converging overall. We further plot the evolution of the exact energy value $l(v_{k})$. After about 70 s of computation, our method converges to a better optimum, while PGD is still far from convergence, as shown in Figure 5b. We further notice that our method can significantly increase the objective function value at an early stage of optimization. This is because first-order methods rely on adding proximal regularization terms to the objective function to form a Lyapunov function (see e.g. [51]). Although our analysis is not based on Lyapunov candidates, we speculate that a similar argument to [51] applies.

Parameter Sensitivity

When setting up the SIMP problem, the user needs to determine the material model $K_{e}$ and the external load profile $f$. We use a linear elastic material parameterized by Young's modulus and Poisson's ratio. We always set Young's modulus to 1 and normalize the external force $f$, since these two parameters only scale the objective function without changing the solutions. When solving the SIMP problem, our algorithm relies on two critical parameters: the initial high-level step size $\alpha_{0}$ and the Krylov subspace size $D$ for preconditioning. We profile the sensitivity with respect to $\alpha_{0}$ in Figure 6 for the benchmark problem in Figure 3e. We find our method convergent over a wide range of $\alpha_{0}$, although an overly large $\alpha_{0}$ can lead to excessive initial fluctuation. In practice, the matrix-vector product is the costliest step and takes up 96% of the computational time, so the iterative cost is almost linear in $D$. As illustrated in Figure 7, Algorithm 1 requires an extremely small $\alpha_{0}$ and more iterations to converge, while Algorithm 3, even with a small $D$, can effectively reduce the low-level error and allow a wide range of choices for $\alpha_{0}$, leading to a faster overall convergence speed. We profile the sensitivity to the subspace size $D$ in Figure 8, and we observe convergent behavior for all $\alpha_{0}$ when $D\geq 15$, so we choose $D=20$ in all examples and only leave $\alpha_{0}$ to be tuned by users.

Figure 9: We compare SQP and FBTO on the benchmark problem (Figure 3c) using a resolution of $64\times 128$. While SQP ultimately converges to a better solution, the two solutions are visually similar. Our method finds an approximate solution much more quickly, allowing designers to preview interactively.

Comparison with SQP

SQP is an off-the-shelf algorithm with a global convergence guarantee [5] and has been used for topology optimization. We conduct comparisons with SQP on the benchmark problem in Figure 3c. We use the commercial SQP solver [52], where we avoid computing the exact Hessian matrix of $l(v)$ by using an L-BFGS approximation, and we avoid factorizing the approximate Hessian matrix by using a small number of conjugate gradient iterations. The comparisons are conducted at a resolution of $64\times 128$, which is one third that of Figure 3c. Using an exact Hessian factorization or a larger resolution in SQP would use up the memory of our desktop machine. As shown in Figure 9, our method converges to an approximate solution much more quickly than SQP, allowing users to preview the results interactively. Such performance of our method is consistent with many other first-order methods, e.g., the Alternating Direction Method of Multipliers (ADMM), which is faster on large-scale problems and converges to a solution of moderate accuracy, while SQP takes longer to converge to a more accurate solution.

Figure 10: We run ALM on the benchmark problem (Figure 3c) at a resolution of $32\times 64$. ALM only exhibits a useful solution after 12600 s (3.5 hr) and converges in 22680 s (6.3 hr).

Comparison with ALM

ALM formulates both $u$ and $v$ as decision variables and treats the FEA as additional hard constraints:

\underset{v\in X,u}{\text{argmin}}\;\frac{1}{2}u^{T}K(v)u\quad\text{s.t.}\;K(v)u=f.

By turning the hard constraints into soft penalty functions via the augmented Lagrangian penalty term, ALM stands out as another solver for TO that allows inexact matrix inversions. Although we are unaware of ALM being used for TO problems, it has been proposed for solving general bilevel programs in [34]. We compare our method with the widely used ALM solver [53]. As its major drawback, ALM needs to square the FEA system matrix $K(v)$, which essentially squares its condition number and slows down the convergence. We have observed this phenomenon in Figure 10, where we run ALM at a resolution of $32\times 64$, which is one-sixth that of Figure 3c. Even at such a coarse resolution, ALM cannot present meaningful results until after 12600 s (3.5 hr), which is much longer than either FBTO or SQP. Our Algorithm 2 suffers from the same problem of squaring the system matrix. Therefore, we suggest using Algorithm 2 only when one has a highly efficient, non-commuting preconditioner such as a geometric multigrid.
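To see where the squaring enters, consider the gradient of the augmented Lagrangian with respect to $u$; the sketch below is for intuition only (lam is the multiplier estimate and mu the penalty weight, both updated by an ALM outer loop that we omit).

```python
def alm_u_gradient(K, u, lam, f, mu):
    # d/du [ 1/2 u^T K u + lam^T (K u - f) + mu/2 ||K u - f||^2 ]
    #   = K u + K lam + mu * K (K u - f).
    # The penalty term applies K twice: the K^2(v) that squares the condition
    # number, as discussed above.
    r = K @ u - f
    return K @ u + K @ lam + mu * (K @ r)
```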

VI Conclusion & Limitation

We present a novel approach to solve a large subclass of TO problems known as the SIMP model. Unlike classical optimization solvers that perform the exact sensitivity analysis during each iteration, we propose to use an inexact sensitivity analysis that is refined over iterations. We show that this method corresponds to a first-order algorithm for a bilevel reformulation of TO problems. Inspired by the recent analytical result [39], we show that this approach has guaranteed convergence to a first-order critical point of the original problem. Our new approach leads to a low iterative cost and allows interactive result previews for mechanical designers. We finally discuss extensions to use preconditioners and fine-grained GPU parallelism. Experiments on several 2D benchmarks from [11, 12, 13] highlight the computational advantage of our method.

An inherent limitation of our method is the requirement to estimate the step size parameters $\alpha_{k},\beta_{k}$. We have shown that $\beta_{k}$ can be fixed and that our results are not sensitive to $\alpha_{0}$, but users still need to give a rough estimate of $\alpha_{0}$, which can affect the practical convergence speed. Further, our first-order solver is based on plain gradient descent steps, for which several acceleration schemes are available, e.g. Nesterov [54] and Anderson [55] accelerations. The analysis of these techniques is left as future work. Finally, when additional constraints are incorporated into the polytopic set $X$, or when non-polytopic constraints are considered, our analysis needs to be adjusted and the efficiency of Algorithm 4 might be lost.

References

  • [1] Jun Wu, Lou Kramer and Rüdiger Westermann “Shape interior modeling and mass property optimization using ray-reps” In Computers & Graphics 58 Elsevier, 2016, pp. 66–72
  • [2] Moritz Bächer, Emily Whiting, Bernd Bickel and Olga Sorkine-Hornung “Spin-It: Optimizing Moment of Inertia for Spinnable Objects” In ACM Trans. Graph. 33.4 New York, NY, USA: Association for Computing Machinery, 2014 DOI: 10.1145/2601097.2601157
  • [3] Jun Wu, Anders Clausen and Ole Sigmund “Minimum compliance topology optimization of shell–infill composites for additive manufacturing” In Computer Methods in Applied Mechanics and Engineering 326, 2017, pp. 358–375 DOI: https://doi.org/10.1016/j.cma.2017.08.018
  • [4] Thanh T. Banh and Dongkyu Lee “Topology Optimization of Multi-Directional Variable Thickness Thin Plate with Multiple Materials” In Struct. Multidiscip. Optim. 59.5 Berlin, Heidelberg: Springer-Verlag, 2019, pp. 1503–1520 DOI: 10.1007/s00158-018-2143-8
  • [5] Susana Rojas-Labanda and Mathias Stolpe “An efficient second-order SQP method for structural topology optimization” In Structural and Multidisciplinary Optimization 53.6 Springer, 2016, pp. 1315–1333
  • [6] Ole Sigmund and Kurt Maute “Topology optimization approaches” In Structural and Multidisciplinary Optimization 48.6 Springer, 2013, pp. 1031–1055
  • [7] Jesús Martínez-Frutos, Pedro J. Martínez-Castejón and David Herrero-Pérez “Efficient topology optimization using GPU computing with multilevel granularity” In Advances in Engineering Software 106, 2017, pp. 47–62 DOI: https://doi.org/10.1016/j.advengsoft.2017.01.009
  • [8] A Mahdavi, R Balaji, M Frecker and Eric M Mockensturm “Topology optimization of 2D continua for minimum compliance using parallel computing” In Structural and Multidisciplinary Optimization 32.2 Springer, 2006, pp. 121–132
  • [9] Niels Aage, Erik Andreassen and Boyan Stefanov Lazarov “Topology optimization using PETSc: An easy-to-use, fully parallel, open source topology optimization framework” In Structural and Multidisciplinary Optimization 51.3 Springer, 2015, pp. 565–572
  • [10] Ole Sigmund “A 99 line topology optimization code written in Matlab” In Structural and multidisciplinary optimization 21.2 Springer, 2001, pp. 120–127
  • [11] S Ivvan Valdez et al. “Topology optimization benchmarks in 2D: Results for minimum compliance and minimum volume in planar stress problems” In Archives of Computational Methods in Engineering 24.4 Springer, 2017, pp. 803–839
  • [12] Jun Wu, Christian Dick and Rüdiger Westermann “A System for High-Resolution Topology Optimization” In IEEE Transactions on Visualization and Computer Graphics 22.3, 2016, pp. 1195–1208 DOI: 10.1109/TVCG.2015.2502588
  • [13] Sandilya Kambampati, Carolina Jauregui, Ken Museth and H Alicia Kim “Large-scale level set topology optimization for elasticity and heat conduction” In Structural and Multidisciplinary Optimization 61.1 Springer, 2020, pp. 19–38
  • [14] Herbert Martins Gomes “Truss optimization with dynamic constraints using a particle swarm algorithm” In Expert Systems with Applications 38.1 Elsevier, 2011, pp. 957–968
  • [15] Thomas Buhl, Claus BW Pedersen and Ole Sigmund “Stiffness design of geometrically nonlinear structures using topology optimization” In Structural and Multidisciplinary Optimization 19.2 Springer, 2000, pp. 93–104
  • [16] Tyler E Bruns and Daniel A Tortorelli “Topology optimization of non-linear elastic structures and compliant mechanisms” In Computer methods in applied mechanics and engineering 190.26-27 Elsevier, 2001, pp. 3443–3459
  • [17] Kurt Maute, Stefan Schwarz and Ekkehard Ramm “Adaptive topology optimization of elastoplastic structures” In Structural optimization 15.2 Springer, 1998, pp. 81–91
  • [18] Benoît Colson, Patrice Marcotte and Gilles Savard “Bilevel programming: A survey” In 4or 3.2 Springer, 2005, pp. 87–107
  • [19] Krister Svanberg “Optimization of geometry in truss design” In Computer Methods in Applied Mechanics and Engineering 28.1, 1981, pp. 63–80 DOI: https://doi.org/10.1016/0045-7825(81)90027-X
  • [20] Erik Holmberg, Bo Torstenfelt and Anders Klarbring “Stress constrained topology optimization” In Structural and Multidisciplinary Optimization 48.1 Springer, 2013, pp. 33–47
  • [21] Abderrahmane Habbal, Joakim Petersson and Mikael Thellner “Multidisciplinary topology optimization solved as a Nash game” In International Journal for Numerical Methods in Engineering 61.7 Wiley Online Library, 2004, pp. 949–963
  • [22] Erik Holmberg, Carl-Johan Thore and Anders Klarbring “Game theory approach to robust topology optimization with uncertain loading” In Structural and Multidisciplinary Optimization 55.4 Springer, 2017, pp. 1383–1397
  • [23] Peter D Dunning, H Alicia Kim and Glen Mullineux “Introducing loading uncertainty in topology optimization” In AIAA journal 49.4, 2011, pp. 760–768
  • [24] Martin P Bendsøe and Ole Sigmund “Optimization of structural topology, shape, and material” Springer, 1995
  • [25] Susana Rojas-Labanda and Mathias Stolpe “Benchmarking optimization solvers for structural topology optimization” In Structural and Multidisciplinary Optimization 52.3 Springer, 2015, pp. 527–547
  • [26] Gustavo Assis Silva and André Teófilo Beck “Reliability-based topology optimization of continuum structures subject to local stress constraints” In Structural and Multidisciplinary Optimization 57.6 Springer, 2018, pp. 2339–2355
  • [27] Krister Svanberg “The method of moving asymptotes—a new method for structural optimization” In International Journal for Numerical Methods in Engineering 24.2, 1987, pp. 359–373 DOI: https://doi.org/10.1002/nme.1620240207
  • [28] H Nguyen-Xuan “A polytree-based adaptive polygonal finite element method for topology optimization” In International Journal for Numerical Methods in Engineering 110.10 Wiley Online Library, 2017, pp. 972–1000
  • [29] Oded Amir, Niels Aage and Boyan S Lazarov “On multigrid-CG for efficient topology optimization” In Structural and Multidisciplinary Optimization 49.5 Springer, 2014, pp. 815–829
  • [30] Gao Tang, Weidong Sun and Kris Hauser “Enhancing bilevel optimization for uav time-optimal trajectory using a duality gap approach” In 2020 IEEE International Conference on Robotics and Automation (ICRA), 2020, pp. 2515–2521 IEEE
  • [31] Benoit Landry, Zachary Manchester and Marco Pavone “A Differentiable Augmented Lagrangian Method for Bilevel Nonlinear Optimization” In Proceedings of Robotics: Science and Systems, 2019 DOI: 10.15607/RSS.2019.XV.012
  • [32] Marc Toussaint “Logic-geometric programming: An optimization-based approach to combined task and motion planning” In Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015
  • [33] Jonathan F Bard “Some properties of the bilevel programming problem” In Journal of optimization theory and applications 68.2 Springer, 1991, pp. 371–378
  • [34] Akshay Mehra and Jihun Hamm “Penalty method for inversion-free deep bilevel optimization” In Asian Conference on Machine Learning, 2021, pp. 347–362 PMLR
  • [35] Michal Kočvara “Topology optimization with displacement constraints: a bilevel programming approach” In Structural optimization 14.4 Springer, 1997, pp. 256–263
  • [36] Jonas Zehnder, Yue Li, Stelian Coros and Bernhard Thomaszewski “NTopo: Mesh-free Topology Optimization using Implicit Neural Representations” In Advances in Neural Information Processing Systems 34 Curran Associates, Inc., 2021
  • [37] James K Guest, Jean H Prévost and Ted Belytschko “Achieving minimum length scale in topology optimization using nodal design variables and projection functions” In International journal for numerical methods in engineering 61.2 Wiley Online Library, 2004, pp. 238–254
  • [38] Krister Svanberg “MMA and GCMMA, versions September 2007” In Optimization and Systems Theory 104 KTH Stockholm, Sweden, 2007
  • [39] Saeed Ghadimi and Mengdi Wang “Approximation methods for bilevel programming” In arXiv preprint arXiv:1802.02246, 2018
  • [40] Mingyi Hong, Hoi-To Wai, Zhaoran Wang and Zhuoran Yang “A two-timescale framework for bilevel optimization: Complexity analysis and application to actor-critic” In arXiv preprint arXiv:2007.05170, 2020
  • [41] Prashant Khanduri et al. “A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum” In Advances in neural information processing systems 34, 2021
  • [42] Paul H Calamai and Jorge J Moré “Projected gradient methods for linearly constrained problems” In Mathematical programming 39.1 Springer, 1987, pp. 93–116
  • [43] David G Luenberger and Yinyu Ye “Linear and nonlinear programming” Springer, 1984
  • [44] Michele Benzi “Preconditioning techniques for large linear systems: a survey” In Journal of computational Physics 182.2 Elsevier, 2002, pp. 418–477
  • [45] Haixiang Liu et al. “Narrow-Band Topology Optimization on a Sparsely Populated Grid” In ACM Trans. Graph. 37.6 New York, NY, USA: Association for Computing Machinery, 2018 DOI: 10.1145/3272127.3275012
  • [46] Thomas Borrvall and Joakim Petersson “Large-scale topology optimization in 3D using parallel computing” In Computer methods in applied mechanics and engineering 190.46-47 Elsevier, 2001, pp. 6201–6229
  • [47] Youcef Saad and Martin H Schultz “GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems” In SIAM Journal on scientific and statistical computing 7.3 SIAM, 1986, pp. 856–869
  • [48] Laurent Condat “Fast projection onto the simplex and the l1-ball” In Mathematical Programming 158.1 Springer, 2016, pp. 575–585
  • [49] Jason Sanders and Edward Kandrot “CUDA by example: an introduction to general-purpose GPU programming” Addison-Wesley Professional, 2010
  • [50] Nathan Bell and Michael Garland “Efficient sparse matrix-vector multiplication on CUDA”, 2008
  • [51] Tianyi Chen, Yuejiao Sun and Wotao Yin “Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems” In Advances in Neural Information Processing Systems 34, 2021
  • [52] Richard A Waltz and Jorge Nocedal “KNITRO 2.0 User’s Manual” Ziena Optimization, Inc., available online at http://www.ziena.com, 2004, pp. 33–34
  • [53] Robert J Vanderbei “LOQO user’s manual—version 3.10” In Optimization methods and software 11.1-4 Taylor & Francis, 1999, pp. 485–514
  • [54] Amir Beck and Marc Teboulle “A fast iterative shrinkage-thresholding algorithm for linear inverse problems” In SIAM journal on imaging sciences 2.1 SIAM, 2009, pp. 183–202
  • [55] Junzi Zhang, Brendan O’Donoghue and Stephen Boyd “Globally convergent type-I Anderson acceleration for nonsmooth fixed-point iterations” In SIAM Journal on Optimization 30.4 SIAM, 2020, pp. 3170–3197
TABLE II: Table of Symbols
Variable: Definition
$x\in\Omega$: a point in the material domain
$u$: displacement field
$P$: internal potential energy
$k_e$: stiffness tensor field
$v,v_e$: infill level, infill level of the $e$th element
$u_k,v_k,\delta_k$: solutions at the $k$th iteration
$\underline{v},\bar{v}$: infill level bounds
$\Omega_e$: sub-domain of the $e$th element
$E,N$: number of elements, nodes
$K,K_e$: stiffness matrix, stiffness of the $e$th element
$f$: external force field
$P_f$: total energy
$e_i$: unit vector at the $i$th element
$u_f$: displacement caused by force $f$
$l$: induced internal potential energy
$\eta$: power law coefficient
$\rho(\bullet)$: eigenvalue function
$\underline{\rho},\bar{\rho}$: eigenvalue bounds
$\mathbb{1},\mathbb{0}$: all-one, all-zero vectors
$\mathcal{A}_e$: activation function of the $e$th element
$\mathbb{S}$: symmetry mapping matrix
$v_s$: infill levels of the left-half material block
$\#$: number of sub-components
$\bar{v}^i$: total infill level of the $i$th component
$\mathbb{C}$: material filter operator
$\alpha_k,\beta_k$: high-level and low-level step sizes
$\Delta_k,\Delta_k^v$: low-level and high-level error metrics
$\Xi_k$: low-level error metric used to analyze Algorithm 3
$U$: upper bound of $\sqrt{\Delta_1}$
$\Gamma_k,\Theta_k$: coefficients of the low-level error
$\bar{\Gamma},\bar{\Theta}$: upper bounds of $\Gamma_k,\Theta_k$
$p,m,n$: algorithmic constants
$L_u,L_K,L_{\nabla K},L_{\nabla l}$: L-constants of $u_f,K,\nabla K,\nabla l$
$C,C_v$: coefficients of the reduction rate of $\Delta_k^v$
$\mathcal{T}_X$: tangent cone of $X$
$d_k,\tau_k$: feasible direction in $\mathcal{T}_X$ and step size
$\underline{l}$: lower bound of $l$
$U_f$: uniform upper bound of $\|u_f\|$
$\nabla_X$: projected gradient into the normal cone
$\epsilon$: error tolerance of the gradient norm
$\text{Proj}_X(\bullet)$: projection operator onto $X$
$\kappa$: condition number
$M$: preconditioner
$\underline{\rho}_M,\bar{\rho}_M$: eigenvalue bounds of the preconditioner
$c_i$: coefficients of Krylov vectors
$D$: dimension of the Krylov subspace
$\tilde{\nabla}l$: approximate gradient
$\tilde{\nabla}_Pl$: mean-projected approximate gradient
$\lambda_{\mathbb{1}},\lambda_{\underline{v}}$: Lagrange multipliers for the projection problem
Figure 11: We summarize the comparative convergence history of PGD and FBTO for all the benchmarks (panels (a)-(g), corresponding to Figure 3a-g). PGD and FBTO are implemented on the GPU, with the matrix factorization of PGD performed on the CPU. All other parameters are the same.

VII Additional Experiments

We compare the performance of FBTO and PGD on all the benchmarks in Figure 3; the results are summarized in Figure 11. Note that PGD uses an exact low-level solver, so it requires fewer iterations than FBTO, while FBTO uses more iterations at a much lower per-iteration cost. For fairness, we compare them based on computational time. PGD outperforms or performs similarly to FBTO on two of the problems (Figure 11d and 11f), where PGD converges rapidly in $5$-$10$ iterations while FBTO requires thousands of iterations. In all other cases, FBTO converges faster to an approximate solution.

VIII Convergence Analysis of Algorithm 1

We provide the proof of Theorem IV.6. We start from some immediate observations on the low- and high-level objective functions. Then, we prove the rule of error propagation for the low-level optimality, i.e., the low-level difference between $u_k$ and its optimal solution $u_f(v_k)$. Next, we choose parameters that let this difference diminish as the number of iterations increases. Finally, we focus on the high-level objective function and show that the difference between $v_k$ and a local optimum of $l(v)$ also diminishes. Our analysis resembles recent analyses of two-timescale bilevel optimization, but we use problem-specific treatment to handle our novel gradient estimation for the high-level problem without matrix inversion.
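To make the objects of the analysis concrete, the following is a minimal Python sketch of the alternating first-order iteration on a toy problem: the low-level step is the gradient step $u_{k+1}=u_k-\beta_k(K(v_k)u_k-f)$ that is substituted in Equation 10 below, and the high-level step is the projected step with the approximate gradient $\tilde{\nabla}l=-\frac{1}{2}u_k^T\frac{\partial{K}}{\partial{v_k}}u_k$ (Table II), whose step length is bounded in Equation 11. The spring-chain FEA, the SIMP-style power law $k_e=v_e^\eta$, and all numerical constants are illustrative assumptions for this sketch, not our benchmark setup:

import numpy as np

# Toy "spring chain": E springs in series, node 0 grounded, unit tip load.
E, eta = 16, 3.0
v_lo, v_hi, budget = 0.01, 1.0, 0.5 * E      # box bounds and total infill level

def assemble_K(v):
    k = v ** eta                              # SIMP-style power law k_e = v_e^eta
    K = np.zeros((E, E))
    for e in range(E):                        # spring e couples nodes e-1 and e
        K[e, e] += k[e]
        if e + 1 < E:
            K[e, e] += k[e + 1]
            K[e, e + 1] = K[e + 1, e] = -k[e + 1]
    return K

def proj_X(v):
    # Proj_X onto {v_lo <= v <= v_hi, sum(v) = budget}: bisect the multiplier
    # of the sum constraint (lambda_1 in Table II); the box part is a clamp.
    lo, hi = -v_hi, v_hi
    for _ in range(60):
        lam = 0.5 * (lo + hi)
        if np.clip(v - lam, v_lo, v_hi).sum() > budget:
            lo = lam
        else:
            hi = lam
    return np.clip(v - 0.5 * (lo + hi), v_lo, v_hi)

f = np.zeros(E); f[-1] = 1.0                  # external load at the tip
v, u = np.full(E, budget / E), np.zeros(E)
beta = 0.9 / np.linalg.norm(assemble_K(np.full(E, v_hi)), 2)   # beta_k < 2/rho_bar
for k in range(1, 5001):
    K = assemble_K(v)
    u = u - beta * (K @ u - f)                          # inexact low-level step
    stretch = np.diff(u, prepend=0.0)                   # per-spring elongation
    grad = -0.5 * eta * v ** (eta - 1.0) * stretch**2   # -1/2 u^T dK/dv u
    v = proj_X(v - grad / k)                            # alpha_k = O(k^-m), m = 1

The single inexpensive $u$-update per iteration is what replaces the exact solve of $K(v_k)u=f$ that an exact low-level method such as PGD would perform.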

VIII-A Low-Level Error Propagation

If Assumption III.1 holds, then the low-level objective function is $\underline{\rho}$-strongly convex and $\bar{\rho}$-smooth in $u$ for any $v$ (i.e., $\bar{\rho}$ is the L-constant of $\nabla_uP_f$), so that:

P_f(u_{k+1},v_k) \leq P_f(u_k,v_k) + \left<u_{k+1}-u_k,\nabla_uP_f(u_k,v_k)\right> + \frac{\bar{\rho}}{2}\|u_{k+1}-u_k\|^2 \leq P_f(u_k,v_k) + \left[\frac{\bar{\rho}}{2}-\frac{1}{\beta_k}\right]\|u_{k+1}-u_k\|^2.   (9)

Similarly, we can immediately estimate the approximation error of the low-level problem due to an update on $u$:

Lemma VIII.1.

Under Assumption III.1, the following relationship holds for all $k\geq 1$:

\Delta_{k+1} \leq \frac{p+1}{p}(1-2\beta_k\underline{\rho}+\beta_k^2\bar{\rho}^2)\Delta_k + L_u^2L_{\nabla K}^2\alpha_k^2\frac{p+1}{4}\|u_k\|^4.
Proof.

The following result holds by expanding the square, substituting the update $u_{k+1}=u_k-\beta_k(K(v_k)u_k-f)$, and using the bounded spectrum of $K(v)$ (Equation 2):

\|u_{k+1}-u_f(v_k)\|^2
= \|u_{k+1}-u_k\|^2 + \|u_k-u_f(v_k)\|^2 + 2\left<u_{k+1}-u_k,u_k-u_f(v_k)\right>
= \|u_{k+1}-u_k\|^2 + \|u_k-u_f(v_k)\|^2 - 2\beta_k\left<K(v_k)u_k-f,u_k-u_f(v_k)\right>
\leq \|u_{k+1}-u_k\|^2 + (1-2\beta_k\underline{\rho})\|u_k-u_f(v_k)\|^2
\leq \beta_k^2\bar{\rho}^2\|u_k-u_f(v_k)\|^2 + (1-2\beta_k\underline{\rho})\|u_k-u_f(v_k)\|^2.   (10)

The optimal solution to the low-level problem is $u_f(v)$, a smooth function defined on a compact domain, so every derivative of $u_f(v)$ is bounded and L-continuous. We thus have the following estimate of the change to $u_f$ due to an update on $v$:

\|u_{k+1}-u_f(v_k)\|\|u_f(v_k)-u_f(v_{k+1})\|
\leq \frac{1}{2p}\|u_{k+1}-u_f(v_k)\|^2 + \frac{p}{2}\|u_f(v_k)-u_f(v_{k+1})\|^2
\leq \frac{1}{2p}\|u_{k+1}-u_f(v_k)\|^2 + \frac{pL_u^2}{2}\|v_k-v_{k+1}\|^2
\leq \frac{1}{2p}\|u_{k+1}-u_f(v_k)\|^2 + \frac{pL_u^2\alpha_k^2}{8}\|u_k^T\frac{\partial{K}}{\partial{v_k}}u_k\|^2,   (11)

where we have used Young’s inequality and the contractive property of the $\text{Proj}_X(\bullet)$ operator. Here we denote by $L_u$ the L-constant of $u_f$. By the triangle inequality, we further have:

\|u_{k+1}-u_f(v_{k+1})\|^2
\leq \|u_{k+1}-u_f(v_k)\|^2 + \|u_f(v_k)-u_f(v_{k+1})\|^2 + 2\|u_{k+1}-u_f(v_k)\|\|u_f(v_k)-u_f(v_{k+1})\|
\leq \frac{p+1}{p}\|u_{k+1}-u_f(v_k)\|^2 + \|u_f(v_k)-u_f(v_{k+1})\|^2 + \frac{pL_u^2\alpha_k^2}{4}\|u_k^T\frac{\partial{K}}{\partial{v_k}}u_k\|^2
\leq \frac{p+1}{p}\|u_{k+1}-u_f(v_k)\|^2 + L_u^2\|v_k-v_{k+1}\|^2 + \frac{pL_u^2\alpha_k^2}{4}\|u_k^T\frac{\partial{K}}{\partial{v_k}}u_k\|^2
\leq \frac{p+1}{p}\|u_{k+1}-u_f(v_k)\|^2 + L_u^2\alpha_k^2\frac{p+1}{4}\|u_k^T\frac{\partial{K}}{\partial{v_k}}u_k\|^2
\leq \frac{p+1}{p}(1-2\beta_k\underline{\rho}+\beta_k^2\bar{\rho}^2)\|u_k-u_f(v_k)\|^2 + L_u^2\alpha_k^2\frac{p+1}{4}\|u_k^T\frac{\partial{K}}{\partial{v_k}}u_k\|^2
\leq \frac{p+1}{p}(1-2\beta_k\underline{\rho}+\beta_k^2\bar{\rho}^2)\|u_k-u_f(v_k)\|^2 + L_u^2L_{\nabla K}^2\alpha_k^2\frac{p+1}{4}\|u_k\|^4,   (12)

where we have used Equation 10 and Equation 11. ∎

The result of Lemma VIII.1 is a recurrence relation on the low-level optimality error $\Delta_k$, which will be used to prove low-level convergence via recursive expansion.

VIII-B Low-Level Convergence

We use the following shorthand notation for the result in Lemma VIII.1:

\Delta_{k+1} \leq \Theta_k\Delta_k + \Gamma_k\|u_k\|^4
\Theta_k \triangleq \frac{p+1}{p}(1-2\beta_k\underline{\rho}+\beta_k^2\bar{\rho}^2) \quad \Gamma_k \triangleq L_u^2L_{\nabla K}^2\alpha_k^2\frac{p+1}{4}.   (13)

By taking Assumption IV.2 and a direct calculation, we can ensure that $\Theta_k\leq\bar{\Theta}<1$ (i.e., the first term is contractive). To bound the growth of the second term above, we show by induction that both $\Delta_k$ and $u_k$ can be uniformly bounded for all $k$ via a sufficiently small, constant $\Gamma_k\leq\bar{\Gamma}$.

Lemma VIII.2.

Taking Assumptions III.1, IV.2, and IV.3, we have $\Delta_k\leq U^2$ and $\|u_k\|\leq U_f+U$ for all $k\geq 1$, where $U_f$ is the uniform upper bound for $\|u_f(v)\|$.

Proof.

First, since $v\in X$ and $X$ is compact, we have $\|u_f(v)\|\leq U_f<\infty$. Next, we prove $\Delta_k\leq U^2$ by induction. We already have $\Delta_1\leq U^2$. Now suppose $\Delta_k\leq U^2$; then Equation 13 and our assumption on $\bar{\Gamma}$ immediately lead to:

\Delta_{k+1} \leq \bar{\Theta}U^2 + \bar{\Gamma}(\|u_k-u_f(v_k)\|+\|u_f(v_k)\|)^4 \leq \bar{\Theta}U^2 + \bar{\Gamma}(U+U_f)^4 \leq U^2.

Finally, we have $\|u_k\|\leq\|u_k-u_f(v_k)\|+\|u_f(v_k)\|\leq U+U_f$ and our lemma follows. ∎

The shrinking coefficient $\bar{\Theta}$ and the uniform boundedness of $\Delta_k,u_k$ allow us to establish low-level convergence with the appropriate choice of $\alpha_k=\mathcal{O}(k^{-m})$ with $m\geq 1$.

Theorem VIII.3.

Taking Assumptions III.1, IV.2, IV.3, and IV.4, we can upper bound $\Delta_{k+1}$ as:

\Delta_{k+1} \leq \bar{\Theta}^{k-1}\Delta_1 + (U+U_f)^4\Gamma_1\sum_{i=0}^{k-1}\frac{\bar{\Theta}^i}{(k-i)^{2m}} = \mathcal{O}(k^{1-2m}).
Proof.

Recursively expanding Equation 13, we have:

\Delta_{k+1} \leq \bar{\Theta}\Delta_k + \Gamma_k\|u_k\|^4
\leq \bar{\Theta}(\bar{\Theta}\Delta_{k-1}+\Gamma_{k-1}\|u_{k-1}\|^4) + \Gamma_k\|u_k\|^4
= \bar{\Theta}^2\Delta_{k-1} + \bar{\Theta}\Gamma_{k-1}\|u_{k-1}\|^4 + \Gamma_k\|u_k\|^4
\leq \bar{\Theta}^2(\bar{\Theta}\Delta_{k-2}+\Gamma_{k-2}\|u_{k-2}\|^4) + \bar{\Theta}\Gamma_{k-1}\|u_{k-1}\|^4 + \Gamma_k\|u_k\|^4
\cdots
\leq \bar{\Theta}^k\Delta_1 + (U+U_f)^4\sum_{i=0}^{k-1}\bar{\Theta}^i\Gamma_{k-i}
\leq \bar{\Theta}^k\Delta_1 + (U+U_f)^4\Gamma_1\sum_{i=0}^{k-1}\frac{\bar{\Theta}^i}{(k-i)^{2m}}
\leq \bar{\Theta}^k\Delta_1 + (U+U_f)^4\Gamma_1\left[\sum_{i=0}^{\lceil\frac{k-1}{2}\rceil}\frac{\bar{\Theta}^i}{(k-i)^{2m}} + \sum_{i=\lceil\frac{k+1}{2}\rceil}^{k-1}\frac{\bar{\Theta}^i}{(k-i)^{2m}}\right]
\leq \bar{\Theta}^k\Delta_1 + (U+U_f)^4\Gamma_1\left[\frac{\lceil\frac{k+1}{2}\rceil}{(k-\lceil\frac{k-1}{2}\rceil)^{2m}} + \frac{\bar{\Theta}^{\lceil\frac{k}{2}\rceil}}{1-\bar{\Theta}}\right] = \mathcal{O}(k^{1-2m}),

where we have used our choice of αk\alpha_{k} and Lemma VIII.2. ∎
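To see the claimed rate concretely, the recursion of Equation 13 can be iterated numerically. A minimal Python sketch follows; the constants $\bar{\Theta}$, $\Gamma_1$, and $m$ are illustrative placeholders, with $\|u_k\|^4$ absorbed into $\Gamma_1$ via Lemma VIII.2:

import numpy as np

# Iterate Delta_{k+1} <= Theta_bar * Delta_k + Gamma_1 * k^(-2m), i.e.,
# Equation 13 with alpha_k = O(k^-m); all constants here are illustrative.
Theta_bar, Gamma_1, m = 0.9, 1.0, 1.5
K_max = 100000
Delta = np.zeros(K_max + 1)
Delta[1] = 1.0                                # Delta_1 <= U^2
for k in range(1, K_max):
    Delta[k + 1] = Theta_bar * Delta[k] + Gamma_1 * float(k) ** (-2.0 * m)

# Delta_{k+1} / k^(1-2m) remains bounded, matching the O(k^(1-2m)) rate.
for k in [10, 100, 1000, 10000, K_max - 1]:
    print(k, Delta[k + 1] / float(k) ** (1.0 - 2.0 * m))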

Theorem VIII.3 is pivotal: it allows us to choose $m$ to tune the convergence speed of the low-level problem, which is then used to establish the convergence of the high-level problem.

VIII-C High-Level Error Propagation

The high-level error propagation is similar to the low-level analysis, owing to the L-continuity of all derivatives of $l(v)$ on the compact domain $X$. The following result reveals the rule of error propagation over a single iteration:

Lemma VIII.4.

Under Assumptions III.1, IV.2, IV.3, and IV.4, the following high-level error propagation rule holds for all $k\geq 1$ and $q>0$:

l(v_{k+1}) - l(v_k) \leq -\frac{1}{\alpha_k}\|v_{k+1}-v_k\|^2 + \frac{L_{\nabla K}^2\alpha_k^2(L_{\nabla l}q+1)}{8q}(U+U_f)^4 + \frac{L_{\nabla K}^2q}{8}\Delta_k(U+2U_f)^2.
Proof.

By the smoothness of $l(v)$, the compactness of $X$, and the obtuse angle criterion, we have:

l(v_{k+1}) - l(v_k) \leq \left<v_{k+1}-v_k,\nabla l(v_k)\right> + \frac{L_{\nabla l}}{2}\|v_{k+1}-v_k\|^2
\leq \left<v_{k+1}-v_k,-\frac{1}{2}u_f^T(v_k)\frac{\partial{K}}{\partial{v_k}}u_f(v_k)\right> + \frac{L_{\nabla l}L_{\nabla K}^2\alpha_k^2}{8}\|u_k\|^4
= \left<v_{k+1}-v_k,-\frac{1}{2}u_k^T\frac{\partial{K}}{\partial{v_k}}u_k\right> + \frac{L_{\nabla l}L_{\nabla K}^2\alpha_k^2}{8}\|u_k\|^4 + \left<v_{k+1}-v_k,\frac{1}{2}u_k^T\frac{\partial{K}}{\partial{v_k}}u_k-\frac{1}{2}u_f^T(v_k)\frac{\partial{K}}{\partial{v_k}}u_f(v_k)\right>
\leq -\frac{1}{\alpha_k}\|v_{k+1}-v_k\|^2 + \frac{L_{\nabla l}L_{\nabla K}^2\alpha_k^2}{8}\|u_k\|^4 + \left<v_{k+1}-v_k,\frac{1}{2}u_k^T\frac{\partial{K}}{\partial{v_k}}u_k-\frac{1}{2}u_f^T(v_k)\frac{\partial{K}}{\partial{v_k}}u_f(v_k)\right>,   (14)

where $L_{\nabla l}$ is the L-constant of $\nabla l$ on $X$. The last term above can be bounded by applying Young’s inequality:

\left<v_{k+1}-v_k,\frac{1}{2}u_k^T\frac{\partial{K}}{\partial{v_k}}u_k-\frac{1}{2}u_f^T(v_k)\frac{\partial{K}}{\partial{v_k}}u_f(v_k)\right>
\leq \frac{1}{2q}\|v_{k+1}-v_k\|^2 + \frac{q}{2}\|\frac{1}{2}u_k^T\frac{\partial{K}}{\partial{v_k}}u_k-\frac{1}{2}u_f^T(v_k)\frac{\partial{K}}{\partial{v_k}}u_f(v_k)\|^2
\leq \frac{L_{\nabla K}^2\alpha_k^2}{8q}\|u_k\|^4 + \frac{q}{2}\|\frac{1}{2}u_k^T\frac{\partial{K}}{\partial{v_k}}u_k-\frac{1}{2}u_f^T(v_k)\frac{\partial{K}}{\partial{v_k}}u_f(v_k)\|^2
= \frac{L_{\nabla K}^2\alpha_k^2}{8q}\|u_k\|^4 + \frac{q}{2}\|\frac{1}{2}(u_k-u_f(v_k))^T\frac{\partial{K}}{\partial{v_k}}(u_k+u_f(v_k))\|^2
\leq \frac{L_{\nabla K}^2\alpha_k^2}{8q}\|u_k\|^4 + \frac{L_{\nabla K}^2q}{8}\Delta_k\|u_k+u_f(v_k)\|^2.   (15)

The lemma follows by combining Equation 14, Equation 15, and Lemma VIII.2. ∎

VIII-D High-Level Convergence

We first show that $\Delta_k^v$ is diminishing via the following lemma:

Lemma VIII.5.

Under Assumptions III.1, IV.2, IV.3, and IV.4, we have the following bound on the accumulated high-level error $\Delta_k^v$:

\frac{1}{2}\sum_{i=1}^k\alpha_i\Delta_i^v \leq l(v_1) - \underline{l} + \frac{L_{\nabla K}^2}{8}(U+U_f)^4\sum_{i=1}^k\alpha_i^{2-n}(L_{\nabla l}\alpha_i^n+1) + \frac{L_{\nabla K}^2}{8}(U+2U_f)^2\sum_{i=1}^k\alpha_i^n\Delta_i + \frac{L_{\nabla K}^2}{4}(U+2U_f)^2\sum_{i=1}^k\alpha_i\Delta_i,

where $\underline{l}$ is the lower bound of $l$ on $X$.

Proof.

By choosing $q=\alpha_k^n$ for some constant $n$ and summing up the recursive rule of Lemma VIII.4, we have the following result:

l(v_{k+1}) - l(v_1) \leq -\sum_{i=1}^k\frac{1}{\alpha_i}\|v_{i+1}-v_i\|^2 + \frac{L_{\nabla K}^2}{8}(U+U_f)^4\sum_{i=1}^k\alpha_i^{2-n}(L_{\nabla l}\alpha_i^n+1) + \frac{L_{\nabla K}^2}{8}(U+2U_f)^2\sum_{i=1}^k\alpha_i^n\Delta_i.   (16)

Equation 16 immediately leads to the $\mathcal{O}(k^{-1})$ convergence of $\|v_{i+1}-v_i\|^2/\alpha_i$. However, the update from $v_i$ to $v_{i+1}$ uses the approximate gradient. To show the convergence of $l(v)$, we need to consider an update using the exact gradient. This can be achieved by combining Equation 16 and the gradient error estimation in Equation 15:

\Delta_k^v \leq \frac{1}{\alpha_k^2}\|v_{k+1}-v_k\|^2 + \frac{1}{\alpha_k^2}\|\frac{\alpha_k}{2}u_f(v_k)^T\frac{\partial{K}}{\partial{v_k}}u_f(v_k)-\frac{\alpha_k}{2}u_k^T\frac{\partial{K}}{\partial{v_k}}u_k\|^2 + \frac{2}{\alpha_k^2}\|v_{k+1}-v_k\|\|\frac{\alpha_k}{2}u_f(v_k)^T\frac{\partial{K}}{\partial{v_k}}u_f(v_k)-\frac{\alpha_k}{2}u_k^T\frac{\partial{K}}{\partial{v_k}}u_k\|
\leq \frac{2}{\alpha_k^2}\|v_{k+1}-v_k\|^2 + \frac{2}{\alpha_k^2}\|\frac{\alpha_k}{2}u_f(v_k)^T\frac{\partial{K}}{\partial{v_k}}u_f(v_k)-\frac{\alpha_k}{2}u_k^T\frac{\partial{K}}{\partial{v_k}}u_k\|^2
\leq \frac{2}{\alpha_k^2}\|v_{k+1}-v_k\|^2 + \frac{L_{\nabla K}^2}{2}\Delta_k(U+2U_f)^2.   (17)

We can prove our lemma by combining Lemma VIII.4, Equation 16, and Equation 17. ∎

The last three terms on the right-hand side of Lemma VIII.5 are power series whose summands scale as $\mathcal{O}(k^{-2m+mn})$, $\mathcal{O}(k^{1-2m-mn})$, and $\mathcal{O}(k^{1-3m})$, respectively. Therefore, for the three summations to be upper bounded for arbitrary $k$, we need the first condition in Assumption IV.5. The following corollary is immediate:

Corollary VIII.6.

Under Assumptions III.1, IV.2, IV.3, IV.4, and IV.5, we have $\min_{i=1,\cdots,k}\alpha_i\Delta_i^v\leq Ck^{-1}$ and there are infinitely many $k$ such that $\Delta_k^v\leq C_vk^{m-1}$ for some constants $C,C_v$ independent of $k$.

Proof.

By Lemma VIII.5 and Assumption IV.5, we have $\sum_{i=1}^k\alpha_i\Delta_i^v\leq C$ for some constant $C$. Now suppose only finitely many $k$ satisfy $\Delta_k^v\leq C_vk^{m-1}$; then, for sufficiently large $k\geq K_0$, we have:

\sum_{i=K_0}^k\alpha_i\Delta_i^v \geq \sqrt{\frac{\bar{\Gamma}}{L_u^2L_{\nabla K}^2}\frac{4}{p+1}}\,C_v\sum_{i=K_0}^k\frac{1}{i},

which is divergent, leading to a contradiction. ∎

Our remaining argument is similar to the standard convergence proof of PGD [42], with minor modifications to account for our approximate gradient:

Proof of Theorem IV.6.

We define $\tilde{v}_{k+1}$ as:

\tilde{v}_{k+1} \triangleq \text{Proj}_X(v_k-\alpha_k\nabla l(v_k)).

Note that $\tilde{v}_{k+1}$ is derived from $v_k$ using the true gradient, while $v_{k+1}$ is derived using our approximate gradient. Due to the polytopic shape of $X$ in Assumption III.1, for $v_k\in X$, we can always choose a feasible direction $d_k$ from $\mathcal{T}_X(v_k)$ with $\|d_k\|\leq 1$ such that:

\|\nabla_Xl(v_k)\| \leq -\left<\nabla l(v_k),d_k\right> + \frac{\epsilon}{4}.

We further have that the following inequality holds for any $z\in X$, due to the obtuse angle criterion and the convexity of $X$:

\left<\alpha_k\nabla l(v_k),\tilde{v}_{k+1}-z\right>
\leq \left<\alpha_k\nabla l(v_k),\tilde{v}_{k+1}-z\right> + \left<v_k-\alpha_k\nabla l(v_k)-\tilde{v}_{k+1},\tilde{v}_{k+1}-z\right>
= \left<v_k-\tilde{v}_{k+1},\tilde{v}_{k+1}-z\right> \leq \|v_k-\tilde{v}_{k+1}\|\|\tilde{v}_{k+1}-z\|.

Applying Corollary VIII.6, we can choose sufficiently large $k$ such that:

-\left<\nabla l(\tilde{v}_{k+1}),\frac{\tau_kd_k}{\tau_k\|d_k\|}\right> \leq \frac{1}{\alpha_k}\|v_k-\tilde{v}_{k+1}\|
\leq \frac{1}{\alpha_k}\|v_k-v_{k+1}\| + \frac{1}{\alpha_k}\|v_{k+1}-\tilde{v}_{k+1}\|
= \sqrt{\Delta_k^v} + \frac{1}{\alpha_k}\|v_{k+1}-\tilde{v}_{k+1}\|
\leq \frac{\epsilon}{4} + \sqrt{\frac{L_{\nabla K}^2}{4}\Delta_k\|u_k+u_f(v_k)\|^2} = \frac{\epsilon}{4} + \mathcal{O}(k^{(1-2m)/2}).

Here the first inequality holds by choosing a sufficiently small, $k$-dependent $\tau_k$ such that $z=\tilde{v}_{k+1}+\tau_kd_k\in X$ and using the fact that $\|d_k\|\leq 1$. The third inequality holds by choosing sufficiently large $k$ and applying Corollary VIII.6. The last equality holds by the contractive property of the projection operator, Theorem VIII.3, and Lemma VIII.4. ∎

VIII-E Convergence Analysis of Algorithm 2

Informally, we establish convergence for the preconditioned Algorithm 2 by modifying Equation 10 as follows:

\|u_{k+1}-u_f(v_k)\|^2
= \|u_{k+1}-u_k\|^2 + \|u_k-u_f(v_k)\|^2 + 2\left<u_{k+1}-u_k,u_k-u_f(v_k)\right>
= \|u_{k+1}-u_k\|^2 + \|u_k-u_f(v_k)\|^2 - 2\beta_k\left<K(v_k)M^{-2}(v_k)(K(v_k)u_k-f),u_k-u_f(v_k)\right>
\leq \|u_{k+1}-u_k\|^2 + (1-2\beta_k\frac{\underline{\rho}^2}{\bar{\rho}_M^2})\|u_k-u_f(v_k)\|^2
\leq \beta_k^2\frac{\bar{\rho}^4}{\underline{\rho}_M^4}\|u_k-u_f(v_k)\|^2 + (1-2\beta_k\frac{\underline{\rho}^2}{\bar{\rho}_M^2})\|u_k-u_f(v_k)\|^2.

Assumption IV.2 must also be modified to account for the spectrum of $M(v)$. All the other steps are identical to those of Theorem IV.6.
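For concreteness, the following is a minimal Python sketch of this modified low-level step, assuming the illustrative Jacobi-style choice $M^2=\text{diag}(K)$; any positive definite $M(v)$ with bounded spectrum fits the analysis above:

import numpy as np

def preconditioned_low_level_step(K, u, f, beta):
    # u_{k+1} = u_k - beta_k M^{-2} (K(v_k) u_k - f); with M^2 = diag(K)
    # this is a Jacobi-scaled gradient step, and diag(K) > 0 since K is SPD.
    return u - beta * (K @ u - f) / np.diag(K)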

IX Convergence Analysis of Algorithm 3

The main difference in the analysis of Algorithm 3 lies in the use of a different low-level error metric, defined as $\Xi_k\triangleq\|K(v_k)u_k-f\|^2$. Unlike $\Delta_k$, which requires an exact matrix inversion to evaluate, $\Xi_k$ can be computed at a rather low cost. We will further show that using $\Xi_k$ for the analysis leads to a much larger, constant choice of the low-level step size, $\beta_k=1$. To prove Theorem IV.11, we follow similar steps as Theorem IV.6 and only list the necessary changes in this section.
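The cost gap between the two metrics is immediate in code; a minimal sketch follows, where the dense solve stands in for the exact FEA inversion:

import numpy as np

def Xi(K, u, f):
    # Xi_k = ||K(v_k) u_k - f||^2: one matrix-vector product, O(nnz(K)).
    r = K @ u - f
    return r @ r

def Delta(K, u, f):
    # Delta_k = ||u_k - u_f(v_k)||^2: needs the exact solve K u_f = f.
    d = u - np.linalg.solve(K, f)
    return d @ d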

IX-A Low-Level Error Propagation

Lemma IX.1.

Assuming $M(v)$ commutes with $K(v)$ and Assumption III.1, the following relationship holds for all $k\geq 1$:

\Xi_{k+1} \leq \frac{p+1}{p}(1-2\beta_k\frac{\underline{\rho}}{\bar{\rho}_M}+\beta_k^2\frac{\bar{\rho}^2}{\underline{\rho}_M^2})\Xi_k + L_K^2L_{\nabla K}^2\alpha_k^2\frac{p+1}{4}\|u_k\|^6,

where we define $L_K$ via $\|K(v_{k+1})-K(v_k)\|_F\leq L_K\|v_{k+1}-v_k\|$ for any $v_k,v_{k+1}\in X$.

Proof.

The following result holds by expanding the square, substituting the update $u_{k+1}=u_k-\beta_kM^{-1}(v_k)(K(v_k)u_k-f)$, and using the bounded spectra of $K(v)$ and $M(v)$ (Equation 2):

\|K(v_k)u_{k+1}-f\|^2
= \|K(v_k)(u_{k+1}-u_k)\|^2 + \|K(v_k)u_k-f\|^2 + 2\left<K(v_k)(u_{k+1}-u_k),K(v_k)u_k-f\right>
= \|K(v_k)(u_{k+1}-u_k)\|^2 + \|K(v_k)u_k-f\|^2 - 2\beta_k\left<K(v_k)M^{-1}(v_k)(K(v_k)u_k-f),K(v_k)u_k-f\right>
\leq \|K(v_k)(u_{k+1}-u_k)\|^2 + (1-2\beta_k\frac{\underline{\rho}}{\bar{\rho}_M})\|K(v_k)u_k-f\|^2
\leq (1-2\beta_k\frac{\underline{\rho}}{\bar{\rho}_M}+\beta_k^2\frac{\bar{\rho}^2}{\underline{\rho}_M^2})\|K(v_k)u_k-f\|^2.   (18)

Next, we estimate the effect of an update on $v_k$ via Young’s inequality:

\|(K(v_{k+1})-K(v_k))u_{k+1}\|\|K(v_k)u_{k+1}-f\|
\leq \frac{1}{2p}\|K(v_k)u_{k+1}-f\|^2 + \frac{p}{2}\|(K(v_{k+1})-K(v_k))u_{k+1}\|^2
\leq \frac{1}{2p}\|K(v_k)u_{k+1}-f\|^2 + \frac{pL_K^2\alpha_k^2}{8}\|u_k^T\frac{\partial{K}}{\partial{v_k}}u_k\|^2\|u_{k+1}\|^2.

Putting the above two equations together, we have:

\|K(v_{k+1})u_{k+1}-f\|^2
\leq \|(K(v_{k+1})-K(v_k))u_{k+1}\|^2 + \|K(v_k)u_{k+1}-f\|^2 + 2\|(K(v_{k+1})-K(v_k))u_{k+1}\|\|K(v_k)u_{k+1}-f\|
\leq \frac{p+1}{p}(1-2\beta_k\frac{\underline{\rho}}{\bar{\rho}_M}+\beta_k^2\frac{\bar{\rho}^2}{\underline{\rho}_M^2})\|K(v_k)u_k-f\|^2 + L_K^2\alpha_k^2\frac{p+1}{4}\|u_k^T\frac{\partial{K}}{\partial{v_k}}u_k\|^2\|u_{k+1}\|^2,   (19)

which is a recursive relationship to be used for proving the low-level convergence. ∎

IX-B Low-Level Convergence

We use the following shorthand notation for the result in Lemma IX.1:

\Xi_{k+1} \leq \Theta_k\Xi_k + \Gamma_k\|u_k\|^6
\Theta_k \triangleq \frac{p+1}{p}(1-2\beta_k\frac{\underline{\rho}}{\bar{\rho}_M}+\beta_k^2\frac{\bar{\rho}^2}{\underline{\rho}_M^2}) \quad \Gamma_k \triangleq L_K^2L_{\nabla K}^2\alpha_k^2\frac{p+1}{4}.   (20)

By taking Assumption IV.8 and a direct calculation, we can ensure that $\Theta_k\leq\bar{\Theta}<1$ (i.e., the first term is contractive). To bound the growth of the second term above, we show by induction that both $\Xi_k$ and $u_k$ can be uniformly bounded for all $k$ via a sufficiently small, constant $\Gamma_k\leq\bar{\Gamma}$.

Lemma IX.2.

Assuming $M(v)$ commutes with $K(v)$ and Assumptions III.1, IV.8, and IV.9, we have $\Xi_k\leq U^2$ and $\|u_k\|\leq U_f+U$ for all $k\geq 1$, where $U_f$ is the uniform upper bound for $\|u_f(v)\|$.

Proof.

First, since $v\in X$ and $X$ is compact, we have $\|u_f(v)\|\leq U_f<\infty$. Next, we prove $\Xi_k\leq U^2$ by induction. We already have $\Xi_1\leq U^2$. Now suppose $\Xi_k\leq U^2$; then Equation 20 and our assumption on $\bar{\Gamma}$ immediately lead to:

\Xi_{k+1} \leq \bar{\Theta}U^2 + \bar{\Gamma}(\|K^{-1}(v_k)(K(v_k)u_k-f)\|+\|u_f(v_k)\|)^6 \leq \bar{\Theta}U^2 + \bar{\Gamma}(U/\underline{\rho}+U_f)^6 \leq U^2.

Finally, we have $\|u_k\|\leq\|u_k-u_f(v_k)\|+\|u_f(v_k)\|\leq U+U_f$ and our lemma follows. ∎

The shrinking coefficient $\bar{\Theta}$ and the uniform boundedness of $\Xi_k,u_k$ allow us to establish low-level convergence.

Theorem IX.3.

Taking Assumptions III.1, IV.8, IV.9, and IV.10, we can upper bound $\Xi_{k+1}$ as:

\Xi_{k+1} \leq \bar{\Theta}^{k-1}\Xi_1 + (U+U_f)^6\Gamma_1\sum_{i=0}^{k-1}\frac{\bar{\Theta}^i}{(k-i)^{2m}} = \mathcal{O}(k^{1-2m}).
Proof.

Recursively expanding Equation 20, we have:

\Xi_{k+1} \leq \bar{\Theta}^k\Xi_1 + (U+U_f)^6\sum_{i=0}^{k-1}\bar{\Theta}^i\Gamma_{k-i}
\leq \bar{\Theta}^k\Xi_1 + (U+U_f)^6\Gamma_1\left[\frac{\lceil\frac{k+1}{2}\rceil}{(k-\lceil\frac{k-1}{2}\rceil)^{2m}} + \frac{\bar{\Theta}^{\lceil\frac{k}{2}\rceil}}{1-\bar{\Theta}}\right] = \mathcal{O}(k^{1-2m}),

where we have used our choice of $\alpha_k$, Lemma IX.2, and a similar argument as in Theorem VIII.3. ∎

Theorem IX.3 allows us to choose $m$ to tune the convergence speed of the low-level problem, which is then used to establish the convergence of the high-level problem.

IX-C High-Level Error Propagation

We first establish the high-level rule of error propagation over a single iteration:

Lemma IX.4.

Assuming $M(v)$ commutes with $K(v)$ and Assumptions III.1, IV.2, IV.3, and IV.4, the following high-level error propagation rule holds for all $k\geq 1$ and $q>0$:

l(v_{k+1}) - l(v_k) \leq -\frac{1}{\alpha_k}\|v_{k+1}-v_k\|^2 + \frac{L_{\nabla K}^2\alpha_k^2(L_{\nabla l}q+1)}{8q}(U+U_f)^4 + \frac{L_{\nabla K}^2q}{8\underline{\rho}}\Xi_k(U+2U_f)^2.
Proof.

Equation 14 holds by the same argument as in Lemma VIII.4. For the last term in Equation 14, we have:

\left<v_{k+1}-v_k,\frac{1}{2}u_k^T\frac{\partial{K}}{\partial{v_k}}u_k-\frac{1}{2}u_f^T(v_k)\frac{\partial{K}}{\partial{v_k}}u_f(v_k)\right>
\leq \frac{L_{\nabla K}^2\alpha_k^2}{8q}\|u_k\|^4 + \frac{q}{2}\|\frac{1}{2}(u_k-u_f(v_k))^T\frac{\partial{K}}{\partial{v_k}}(u_k+u_f(v_k))\|^2
\leq \frac{L_{\nabla K}^2\alpha_k^2}{8q}\|u_k\|^4 + \frac{L_{\nabla K}^2q}{8\underline{\rho}}\Xi_k\|u_k+u_f(v_k)\|^2.   (21)

The lemma follows by combining Equation 14, Equation 21, and Lemma IX.2. ∎

IX-D High-Level Convergence

We first show that $\Delta_k^v$ is diminishing via the following lemma:

Lemma IX.5.

Assuming $M(v)$ commutes with $K(v)$ and Assumptions III.1, IV.8, IV.9, and IV.10, we have the following bound on the accumulated high-level error $\Delta_k^v$:

\frac{1}{2}\sum_{i=1}^k\alpha_i\Delta_i^v \leq l(v_1) - \underline{l} + \frac{L_{\nabla K}^2}{8}(U+U_f)^4\sum_{i=1}^k\alpha_i^{2-n}(L_{\nabla l}\alpha_i^n+1) + \frac{L_{\nabla K}^2}{8\underline{\rho}}(U+2U_f)^2\sum_{i=1}^k\alpha_i^n\Xi_i + \frac{L_{\nabla K}^2}{4\underline{\rho}}(U+2U_f)^2\sum_{i=1}^k\alpha_i\Xi_i,

where $\underline{l}$ is the lower bound of $l$ on $X$.

Proof.

By choosing $q=\alpha_k^n$ for some constant $n$ and summing up the recursive rule of Lemma IX.4, we have the following result:

l(v_{k+1}) - l(v_1) \leq -\sum_{i=1}^k\frac{1}{\alpha_i}\|v_{i+1}-v_i\|^2 + \frac{L_{\nabla K}^2}{8}(U+U_f)^4\sum_{i=1}^k\alpha_i^{2-n}(L_{\nabla l}\alpha_i^n+1) + \frac{L_{\nabla K}^2}{8\underline{\rho}}(U+2U_f)^2\sum_{i=1}^k\alpha_i^n\Xi_i.   (22)

We further estimate the update using the true gradient:

\Delta_k^v \leq \frac{2}{\alpha_k^2}\|v_{k+1}-v_k\|^2 + \frac{L_{\nabla K}^2}{2\underline{\rho}}\Xi_k(U+2U_f)^2.   (23)

We can prove our lemma by combining Lemma IX.4, Equation 22, and Equation 23. ∎

The following corollary can be proved by the same argument as Corollary VIII.6:

Corollary IX.6.

Assuming $M(v)$ commutes with $K(v)$ and Assumptions III.1, IV.8, IV.9, IV.10, and IV.5, we have $\min_{i=1,\cdots,k}\alpha_i\Delta_i^v\leq Ck^{-1}$ and there are infinitely many $k$ such that $\Delta_k^v\leq C_vk^{m-1}$ for some constants $C,C_v$ independent of $k$.

Finally, we list the changes necessary to prove Theorem IV.11:

Proof of Theorem IV.11.

We choose $\tilde{v}_{k+1},d_k,\tau_k$ as in the proof of Theorem IV.6. Applying Corollary IX.6, we can choose sufficiently large $k$ such that:

-\left<\nabla l(\tilde{v}_{k+1}),\frac{\tau_kd_k}{\tau_k\|d_k\|}\right> \leq \frac{1}{\alpha_k}\|v_k-\tilde{v}_{k+1}\|
\leq \frac{1}{\alpha_k}\|v_k-v_{k+1}\| + \frac{1}{\alpha_k}\|v_{k+1}-\tilde{v}_{k+1}\|
= \sqrt{\Delta_k^v} + \frac{1}{\alpha_k}\|v_{k+1}-\tilde{v}_{k+1}\|
\leq \frac{\epsilon}{4} + \sqrt{\frac{L_{\nabla K}^2}{4\underline{\rho}}\Xi_k\|u_k+u_f(v_k)\|^2} = \frac{\epsilon}{4} + \mathcal{O}(k^{(1-2m)/2}).

The last inequality holds by the contractive property of the projection operator, Theorem IX.3, and Lemma IX.4. ∎

IX-E Parameter Choices

We argue that $\beta_k=1$ is a valid choice in Algorithm 3 if a Krylov preconditioner is used. Note that choosing $\beta_k=1$ is generally incompatible with Assumption IV.8, because we make minimal assumptions on the preconditioner $M(v)$, requiring only that it commutes with $K(v)$ and is positive definite with a bounded spectrum. However, many practical preconditioners provide much stronger guarantees, making $\beta_k=1$ a valid choice. To see this, observe that the purpose of choosing $\beta_k$ according to Assumption IV.8 is to establish the following contractive property in Equation 18:

\|K(v_k)u_{k+1}-f\|^2 \leq \bar{\Theta}\|K(v_k)u_k-f\|^2.   (24)

Any preconditioner can be used as long as it retains this property with $\beta_k=1$. The following result shows that the Krylov preconditioner does:

Lemma IX.7.

Suppose $M(v)$ is the Krylov preconditioner with positive $D$ and $\beta_k=1$; then there exists some constant $\bar{\Theta}<1$ such that Equation 24 holds in Algorithm 3 for any $k$.

Proof.

We define $b\triangleq K(v_k)u_k-f$ and, by the definition of $M(v)$, we have:

\|K(v_k)u_{k+1}-f\|^2 = \|K(v_k)(u_k-M^{-1}(v_k)b)-f\|^2
= \|b-K(v_k)M^{-1}(v_k)b\|^2 = \|b-\sum_{i=1}^{D+1}c_i^*K^i(v_k)b\|^2
\leq \|(I-\frac{1}{\bar{\rho}}K(v_k))b\|^2 \leq (1-\frac{\underline{\rho}}{\bar{\rho}})\|b\|^2 \triangleq \bar{\Theta}\|K(v_k)u_k-f\|^2,

where the first inequality holds by the definition of cic_{i}^{*}. ∎
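Lemma IX.7 is easy to verify numerically; a minimal Python sketch follows, where the random SPD matrix, its spectrum $[0.5,2]$, and the choice $D=3$ are illustrative:

import numpy as np

def krylov_Minv(K, b, D):
    # Least-squares Krylov preconditioner: pick c minimizing
    # ||b - sum_{i=1}^{D+1} c_i K^i b||, then M^{-1} b = sum_i c_i K^{i-1} b.
    vecs = [b]
    for _ in range(D + 1):
        vecs.append(K @ vecs[-1])                 # b, K b, ..., K^{D+1} b
    A = np.stack(vecs[1:], axis=1)                # columns K^1 b, ..., K^{D+1} b
    c, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.stack(vecs[:-1], axis=1) @ c

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(50, 50)))
K = Q @ np.diag(rng.uniform(0.5, 2.0, 50)) @ Q.T  # eigenvalues in [0.5, 2]
u, f = rng.normal(size=50), rng.normal(size=50)
b = K @ u - f
u_next = u - krylov_Minv(K, b, D=3)               # one step with beta_k = 1
# Check Equation 24 with Theta_bar = 1 - rho_lo / rho_hi = 0.75:
print(np.sum((K @ u_next - f) ** 2) <= 0.75 * (b @ b))   # prints True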

Unfortunately, it is very difficult to establish this property theoretically for other practical preconditioners, such as incomplete LU and multigrid, although it is almost always observed in practice.