This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Kernel Multigrid on Manifolds

Thomas Hangelbroek 111Thomas Hangelbroek’s research was supported by Grant DMS-2010051 from the National Science Foundation. [email protected] University of Hawai‘i at Mānoa Christian Rieger [email protected] Philipps-Universität Marburg
Abstract

Kernel methods for solving partial differential equations on surfaces have the advantage that those methods work intrinsically on the surface and yield high approximation rates if the solution to the partial differential equation is smooth enough. Localized Lagrange bases have proven to alleviate the computational complexity of usual kernel methods to some extent. The efficient numerical solution of the resulting linear systems of equations has not been addressed in the literature so far. In this article we apply the framework of geometric multigrid method with a τ2\tau\geq 2-cycle to this particular setting. Moreover, we show that the resulting linear algebra can be made more efficiently by using the Lagrange function decay again. The convergence rates are obtained by a rigorous analysis. The presented version of a multigrid method provably works on quasi-uniform point clouds on the surface and hence does not require a grid-structure. Moreover, we can show that the computational cost to solve the linear system scales log-linear in the degrees of freedom.

keywords:
Geometric multigrid , partial differential equations on manifolds , kernel-based Galerkin methods , localized Lagrange basis
MSC:
65F10 , 65Y20 , 65M12 , 65M15 , 65M55 , 65M60

1 Introduction

The numerical solution to partial differential equations, often time dependent, on curved geometries is crucial to many real-world applications. Of the many available numerical methods (see for instance [1] for an overview using finite elements, or [2] [3] for alternative mesh-free methods), we focus on mesh-free kernel-based Galerkin methods (see also [4], [5], [6]), which have a number of merits, including delivering high approximation orders for smooth data, providing smooth solutions, and working coordinate-free and without the need for rigid underlying geometric structures like meshes and regular grids.

Let 𝕄\mathbb{M} denote in the following a compact, dd-dimensional Riemannian manifold without boundary. We will consider, as a spatial operator, a generic second order linear elliptic differential operator \mathcal{L} with trivial nullspace. Such operators occur for instance in the numerical solution of the heat equation using implicit time-stepping methods, see e.g. [7]. In conventional kernel-based Galerkin methods, a grid Ξ𝕄\Xi\subset\mathbb{M} is considered and a stiffness matrix 𝐀Ξ\boldsymbol{\mathbf{A}}_{\Xi}, which represents \mathcal{L} on a finite dimensional kernel space expressed endowed with some fixed basis, is assembled. This leads to the equation

𝐀Ξ𝐮=𝐛,\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{u}}=\boldsymbol{\mathbf{b}}, (1)

(see (18) for a precise definition).

In this paper we address the problem of reducing the computational costs of using kernels without spoiling their analytic advantages. Here, we mostly follow [7, 8] and use the kernel-based Lagrange basis. This particular basis, though not locally supported, has very strong decay properties which allows to localize computations. The almost-local support already alleviates the problem of densely populated matrices as usually encountered in kernel methods. Concretely, the full system in (1) can be well-approximated by a sparse matrix using the decay of the local Lagrange basis. Despite this, the condition number cond2(𝐀Ξ)N2/d\operatorname{cond}_{2}(\boldsymbol{\mathbf{A}}_{\Xi})\sim N^{2/d} grows with the problem size N:=|Ξ|N:=|\Xi|. Thus, even if compressed, (1) poses a computational challenge, and, because of its conditioning, iterative methods (without preconditioning) cannot be expected to work well, since the iterations needed to ensure a prescribed accuracy may grow with the number of degrees of freedom.

To this end, we introduce and analyze a mesh-free multigrid algorithm. Specifically, we adapt the standard WW-cycle multigrid method to (localized) kernels and provide a rigorous analysis in this setting. Although multigrid methods recently have gained attention in the mesh-free community, see [9] and [10], the rigorous analysis we provide has been missing in the kernel-based context.

The main novelty of this paper is the following:

  1. 1.

    We prove this mesh-free algorithm is a contraction, with norm independent of grid size (see Theorem 1). By standard techniques, it follows that numerical solutions within given tolerance can be obtained with a fixed, with respect to grid size, number of iterations (see (41) and adjacent discussion).

  2. 2.

    The method we present is stable under perturbations of the stiffness, restriction and prolongation matrices. Thus small errors, which may come from sparsifying these matrices (as well as from quadrature, round-off or modification of the kernel), also yield contractions (see Theorem 2) and therefore do not hinder performance of the algorithm.

  3. 3.

    By compressing the stiffness, restriction and prolongation matrices, which have rapid off-diagonal decay, we obtain a solution method enjoying nearly linear complexity.

This leads to an approximate solution to the system (1) which requires only 𝒪(Nlog(N)d)\mathcal{O}\bigl{(}N\log(N)^{d}\bigr{)} floating point operations per iteration and where again the multigrid matrix has a norm bound less than unity (see Theorem 2), for a total operation count of

𝒪(Nlog(N)dlog(ϵmax)),\mathcal{O}\left(N\log(N)^{d}\log\left({\epsilon_{\max}}\right)\right),

where ϵmax\epsilon_{\max} is the user-prescribed tolerance. The approximate solution 𝐮\boldsymbol{\mathbf{u}}^{\star} to (1) has an (additive) error bound of the form 𝐮𝐮2ϵmax+ϵtr\|\boldsymbol{\mathbf{u}}-\boldsymbol{\mathbf{u}}^{\star}\|_{\ell_{2}}\leq\epsilon_{\max}+\epsilon_{\mathrm{tr}}, where the error due to truncating matrices is ϵtr=𝒪(NJ)\epsilon_{\mathrm{tr}}=\mathcal{O}(N^{-J}). Here J>0J>0 is a user-determined constant which depends linearly on the sparsity of truncated matrices in the algorithm. This is explained in Remark 4.

Remark 1.

Before proceeding, we make the following comments:

  • 1.

    We do not attempt to modify the underlying framework of the multigrid method, as described, for instance, in [11, 12, 13], but show that it can be successfully applied to kernel methods. This is far from being obvious. We follow conceptually the book [12] where also the function space view on multigrid methods is used.

  • 2.

    The fact that \mathcal{L} is injective is a convenient simplification we assume throughout the article. An investigation of operators with a non-trivial nullspace will be considered in a forthcoming work.

  • 3.

    Throughout this article, we will assume to have access to a sequence of quasi-uniform and nested point clouds Ξ0Ξ1ΞL𝕄\Xi_{0}\subset\Xi_{1}\subset\dots\Xi_{L}\subset\mathbb{M} on 𝕄\mathbb{M}, and a corresponding highly localized Lagrange basis generated by a kernel ϕ:𝕄×𝕄\phi:\mathbb{M}\times\mathbb{M}\to\mathbb{R}. Both, the construction of point clouds on manifolds and the computation of the Lagrange basis are independent of the partial differential equation and can hence be pre-computed and even be stored.

  • 4.

    Although numerical integration is necessary to implement the kernel-based Galerkin method, our results hold independently of the choice of quadrature method. (This is treated, for instance, in [7, Section 4].) We therefore assume for the remainder of this paper these steps to be solved. In particular, we assume to have access to the stiffness matrices 𝐀Ξ\boldsymbol{\mathbf{A}}_{\Xi}, as considered in [14] or [15].

  • 5.

    An alternative approach to treating (1) would be to apply a suitable preconditioner. To be effective in this context, such methods would also have to be adapted to the kernel situation and the analysis including the compression argument would have to be carefully carried over as well. Furthermore, many successful preconditioners for finite elements, like [16], typically involve concepts from multigrid methods such as hierarchy of approximation spaces.

  • 6.

    We discuss in this article the multigrid method only from the perspective as standalone solver. We point out that the multigrid method itself provides an attractive preconditioner. Most often, the multigrid preconditioner is combined with a flexible GMRES (FGMRES) iteration method, see e.g. [17] and references therein. This is also implemented in many software libraries such as PETSc (see [18, page 92f.]) just to name one prominent example. We will discuss this in more detail when we present numerical results on this method in an upcoming work.

Outline of the paper

The remainder of this article is organized as follows. In section 2, we introduce the basic notation of second order elliptic equations on manifolds, and their solution via kernel-based Galerkin approximation. In this section we demonstrate the approximation property in the kernel context, which, along with the smoothing property, provides the analytic backbone for the success of the multigrid method. Section 3 introduces the phenomenon of rapidly decaying Lagrange-type bases, which holds for certain kernels – using such bases permits stiffness matrices with rapid off-diagonal decay, among other things. Of special importance is the diagonal behavior of the stiffness matrix given in Lemma 5, which is a novel contribution of this paper, and which is the analytic result necessary to prove the smoothing property. In section 4 we discuss the smoothing property of damped Jacobi iterations for kernel-based methods both in the case of symmetric and non-symmetric differential operators. In section 5 we introduce kernel-based restriction and prolongation operators, the standard two-grid method and then the WW-cycle. In this section, we prove Theorem 1, a consequence of which is the bound (41), which shows the (poly)-logarithmic complexity of our proposed method. Section 6 treats the error resulting from small perturbations of the stiffness matrix, as well as the prolongation and restriction matrices. The main result in this section, Theorem 2, demonstrates how such errors affect the multigrid approximation error. Section 7 investigates the computationally efficient truncated multigrid method as an application of the previous section.

2 Problem Set Up

2.1 Manifold

Consider a compact dd-dimensional Riemannian manifold without boundary 𝕄\mathbb{M}. Here we list some useful tools and their properties which hold in this setting. We direct the reader to [19, 20] for relevant background. Results about covariant derivatives and Sobolev spaces can be found in [21].

The tangent bundle is T𝕄T\mathbb{M} and the cotangent bundle is T𝕄T^{*}\mathbb{M}. We denote by Trk𝕄T^{k}_{r}\mathbb{M} the vector bundle of tensors with contravariant order kk and covariant order rr; we say type (k,r)(k,r) for short. Thus T𝕄=T01𝕄T\mathbb{M}=T^{1}_{0}\mathbb{M} and T𝕄=T10𝕄T^{*}\mathbb{M}=T^{0}_{1}\mathbb{M}. We will denote the fiber at x𝕄x\in\mathbb{M} by Trk𝕄xT^{k}_{r}\mathbb{M}_{x}. The space of tensor fields of type (k,r)(k,r) (known also as sections; i.e., maps 𝐒:𝕄Trk𝕄\mathbf{S}:\mathbb{M}\to T_{r}^{k}\mathbb{M} with 𝐒(x)Trk𝕄x\mathbf{S}(x)\in T_{r}^{k}\mathbb{M}_{x} for every x𝕄x\in\mathbb{M}) is denoted 𝒯rk𝕄\mathcal{T}_{r}^{k}\mathbb{M}.

In this article, we are concerned primarily with covariant tensors (i.e., tensors of type (0,k)(0,k)), so we use the short hand notation Tk𝕄=Tk0𝕄T_{k}\mathbb{M}=T_{k}^{0}\mathbb{M} and 𝒯k𝕄=𝒯k0𝕄\mathcal{T}_{k}\mathbb{M}=\mathcal{T}_{k}^{0}\mathbb{M}.

For a chart (U,ϕ)(U,\phi) for 𝕄\mathbb{M} from which we get the usual vector fields xj\frac{\partial}{\partial x^{j}} and forms dxjdx^{j} (1jd1\leq j\leq d), which act as local bases for T𝕄T\mathbb{M} and T𝕄T^{*}\mathbb{M} over UU. These can be used to generate bases for tensor fields. In particular, for given covariant rank kk and 𝐢=(i1,,ik){1,,d}k\boldsymbol{\mathbf{i}}=(i_{1},\dots,i_{k})\in\{1,\dots,d\}^{k} we have basis element dx𝐢:=dxi1dxikdx^{\boldsymbol{\mathbf{i}}}:=dx^{i_{1}}\cdots dx^{i_{k}} . This allows us to write 𝐒𝒯k𝕄\mathbf{S}\in\mathcal{T}_{k}\mathbb{M} in coordinates as 𝐒(x)=𝐢{1,,d}k(S(x))𝐢dx𝐢\mathbf{S}(x)=\sum_{\boldsymbol{\mathbf{i}}\in\{1,\dots,d\}^{k}}(S(x))_{\boldsymbol{\mathbf{i}}}dx^{\boldsymbol{\mathbf{i}}}.

Because 𝕄\mathbb{M} is a Riemannian manifold, at each x𝕄x\in\mathbb{M}, T𝕄xT\mathbb{M}_{x} has an inner product ,x\langle\cdot,\cdot\rangle_{x} and induced norm x\|\cdot\|_{x}. This means that there is by g𝒯2𝕄g\in\mathcal{T}_{2}\mathbb{M} (the metric tensor), so that for tangent vectors in V,WT𝕄xV,W\in T\mathbb{M}_{x} we have g(x)(V,W)=V,Wxg(x)(V,W)=\langle V,W\rangle_{x}. The inner product extends to the dual: for cotangent vectors μ,νT𝕄x\mu,\nu\in T^{*}\mathbb{M}_{x} we have μ,νx=μjνkgj,k\langle\mu,\nu\rangle_{x}=\sum\mu_{j}\nu_{k}g^{j,k}, where gj,kgk,=δj,\sum g_{j,k}g^{k,\ell}=\delta_{j,\ell}. From this, it naturally extends to tensors. For 𝐓,𝐒Tk𝕄x\mathbf{T},\mathbf{S}\in{T}_{k}\mathbb{M}_{x}, written in coordinates as 𝐓=𝐣{1,,d}kT𝐣dxj1dxjk\mathbf{T}=\sum_{\boldsymbol{\mathbf{j}}\in\{1,\dots,d\}^{k}}T_{\boldsymbol{\mathbf{j}}}\,d{x^{j_{1}}}\dots d{x^{j_{k}}} and 𝐒=𝐢{1,,d}kS𝐢dxi1dxik\mathbf{S}=\sum_{\boldsymbol{\mathbf{i}}\in\{1,\dots,d\}^{k}}S_{\boldsymbol{\mathbf{i}}}\,d{x^{i_{1}}}\dots d{x^{i_{k}}}, we have

𝐓,𝐒x=𝐢,𝐣{1,,d}kgi1j1(x)gik,jk(x)S𝐢T𝐣.\langle\mathbf{T},\mathbf{S}\rangle_{x}=\sum_{\boldsymbol{\mathbf{i}},\boldsymbol{\mathbf{j}}\in\{1,\dots,d\}^{k}}g^{i_{1}j_{1}}(x)\dots g^{i_{k},j_{k}}(x)S_{\boldsymbol{\mathbf{i}}}T_{\boldsymbol{\mathbf{j}}}.

We denote the Riemannian distance on 𝕄\mathbb{M} by dist:𝕄×𝕄[0,)\mathrm{dist}:\mathbb{M}\times\mathbb{M}\to[0,\infty); it is given by the formula dist(x,y)=inf{abγ(t)γ(t)dtγ(a)=x,γ(b)=y}\mathrm{dist}(x,y)=\inf\{\int_{a}^{b}\|\gamma^{\prime}(t)\|_{\gamma(t)}\mathrm{d}t\mid\gamma(a)=x,\ \gamma(b)=y\} where the infimum is taken over piecewise smooth curves connecting xx and yy. The Riemannian metric gives rise to a volume form dμ=det(g(x))dx\mathrm{d}\mu=\sqrt{\det(g(x))}\mathrm{d}x. By compactness, there exist constants 0<α𝕄β𝕄0<\alpha_{\mathbb{M}}\leq\beta_{\mathbb{M}} so that

α𝕄rdμ(B(x,r))β𝕄rd\alpha_{\mathbb{M}}r^{d}\leq\mu(B(x,r))\leq\beta_{\mathbb{M}}r^{d}

or any ball B(x,r):={y𝕄dist(x,y)<r}B(x,r):=\{y\in\mathbb{M}\mid\mathrm{dist}(x,y)<r\} centered at x𝕄x\in\mathbb{M} and having radius 0<r<diam(𝕄)0<r<\mathrm{diam}(\mathbb{M}).

For a finite subset Ξ𝕄\Xi\subset\mathbb{M}, we can define the following useful quantities: the separation distance, qq of Ξ\Xi and the fill distance, hh, of Ξ\Xi in 𝕄\mathbb{M}. They are given by

q:=12minζΞdist(ζ,Ξ{ζ})andh:=h(Ξ,𝕄):=supx𝕄dist(x,Ξ).q:=\frac{1}{2}\min_{\zeta\in\Xi}\mathrm{dist}(\zeta,\Xi\setminus\{\zeta\})\quad\text{and}\quad h:=h(\Xi,\mathbb{M}):=\sup_{x\in\mathbb{M}}\mathrm{dist}(x,\Xi).

The finite sets considered throughout this paper will be quasiuniform, with mesh ratio ρ:=h/q\rho:=h/q bounded by a fixed constant. For this reason, quantities which are controlled above or below by a power of qq can be likewise controlled by a power of hh – in short, whenever possible, we express estimates in terms of the fill distance hh, allowing constants to depend on ρ\rho. For instance, the cardinality |Ξ||\Xi| is bounded above and below by

μ(𝕄)β𝕄hd|Ξ|μ(𝕄)α𝕄qd.\frac{\mu(\mathbb{M})}{\beta_{\mathbb{M}}}h^{-d}\leq|\Xi|\leq\frac{\mu(\mathbb{M})}{\alpha_{\mathbb{M}}}q^{-d}. (2)

Similarly, if f:[0,)f:[0,\infty)\to\mathbb{R} is continuous then for any x𝕄x\in\mathbb{M},

ζΞf(dist(ζ,x))maxtqf(t)+β𝕄α𝕄n=1(n+2)dmaxnqt(n+1)qf(t).\sum_{\zeta\in\Xi}f\bigl{(}\mathrm{dist}(\zeta,x)\bigr{)}\leq\max_{t\leq q}f(t)+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\sum_{n=1}^{\infty}(n+2)^{d}\max_{nq\leq t\leq(n+1)q}f(t). (3)

Moreover, we use the following notation: A={f:A}\mathbb{R}^{A}=\{f:A\to\mathbb{R}\} denotes the functions from the set AA to \mathbb{R}. Of course, if AA is a discrete finite set, we can identity A|A|\mathbb{R}^{A}\cong\mathbb{R}^{|A|}.

2.2 Sobolev spaces

The covariant derivative \nabla maps tensor fields of type (r,s)(r,s) to fields of type (r,s+1)(r,s+1). Its adjoint (with respect to the L2L_{2} inner products on the space of sections of Trs(𝕄)T_{r}^{s}(\mathbb{M})) is denoted \nabla^{*}. For functions, this is fairly elementary. The covariant derivative of a (scalar) function f:𝕄f:\mathbb{M}\to\mathbb{R} equals its exterior derivative; in coordinates, we have

f=j=1dfxjdxj=df.\nabla f=\sum_{j=1}^{d}\frac{\partial f}{\partial x^{j}}dx^{j}=df.

For a 1-form ω=j=1dωjdxj\omega=\sum_{j=1}^{d}\omega_{j}dx^{j}, we have

ω(x)=1detg(x)j=1dk=1dxk(detg(x)gjk(x)ωj(x)).\nabla^{*}\omega(x)=-\frac{1}{\sqrt{\det g(x)}}\sum_{j=1}^{d}\sum_{k=1}^{d}\frac{\partial}{\partial x_{k}}\left(\sqrt{\det g(x)}g^{jk}(x)\omega_{j}(x)\right).

A direct calculation shows 𝕄f(x)ω(x)dμ(x)=𝕄ω(x),f(x)xdμ(x)\int_{\mathbb{M}}f(x)\nabla^{*}\omega(x)\mathrm{d}\mu(x)=\int_{\mathbb{M}}\langle\omega(x),\nabla f(x)\rangle_{x}\mathrm{d}\mu(x).

For Ω𝕄\Omega\subset\mathbb{M} Sobolev space Wpk(Ω)W_{p}^{k}(\Omega) is defined to be the set of functions f:Ωf:\Omega\to\mathbb{R} which satisfy

fWpk(Ω)p==0kΩfxpdμ(x)<,\|f\|_{W_{p}^{k}(\Omega)}^{p}=\sum_{\ell=0}^{k}\int_{\Omega}\|\nabla^{\ell}f\|_{x}^{p}\mathrm{d}\mu(x)<\infty,

where the \ell-th order covariant derivatives can be found in [21, Section 2.2].

For a scalar function f:𝕄f:\mathbb{M}\to\mathbb{R}, the Laplace-Beltrami operator is given in local coordinates as Δf=j=1dk=1d1|detg|xj(|detg|gjkxkf)\Delta f=\sum_{j=1}^{d}\sum_{k=1}^{d}\frac{1}{\sqrt{|\det g|}}\frac{\partial}{\partial x^{j}}\bigl{(}\sqrt{|\det g|}g^{jk}\frac{\partial}{\partial x^{k}}f\bigr{)}. Thus for scalar functions Δf=f\Delta f=-\nabla^{*}\nabla f. For any integer kk\in\mathbb{N} and p(1,)p\in(1,\infty), the Bessel-potential norm (1Δ)k/2fLp(𝕄)\|(1-\Delta)^{k/2}f\|_{L_{p}(\mathbb{M})} is equivalent to fWpk(𝕄)\|f\|_{W_{p}^{k}(\mathbb{M})}, as demonstrated in [22, Theorem 4(ii)] (although when k=1k=1 and p=2p=2, the two norms are equal; this can be observed directly).

Lemma 3.2 from [21] applies, so there are uniform constants Γ1,Γ2\Gamma_{1},\Gamma_{2} and r𝕄>0r_{\mathbb{M}}>0 so that the family of exponential maps {Expx:B(0,r𝕄)𝕄x𝕄}\{\mathrm{Exp}_{x}:B(0,r_{\mathbb{M}})\to\mathbb{M}\mid x\in\mathbb{M}\} (which are diffeomorphisms taking 0 to xx) provides local metric equivalences: for any open set ΩB(0,r𝕄)d\Omega\subset B(0,r_{\mathbb{M}})\subset\mathbb{R}^{d}, we have

Γ1uExpxWpj(Ω)uWpj(Expx(Ω))Γ2uExpxWpj(Ω).\Gamma_{1}\|u\circ\mathrm{Exp}_{x}\|_{W_{p}^{j}(\Omega)}\leq\|u\|_{W_{p}^{j}(\mathrm{Exp}_{x}(\Omega))}\leq\Gamma_{2}\|u\circ\mathrm{Exp}_{x}\|_{W_{p}^{j}(\Omega)}. (4)

This shows that Wpk(𝕄)W_{p}^{k}(\mathbb{M}) can be endowed with equivalent norms using a partition of unity (φj)jN(\varphi_{j})_{j\leq N} subordinate to a cover {𝒪j}jN\{\mathcal{O}_{j}\}_{j\leq N} with associated charts ψj:𝒪jd\psi_{j}:\mathcal{O}_{j}\to\mathbb{R}^{d} to obtain uWpk(𝕄)pj=1N(φju)ψj1Wpk(d)p\|u\|_{W_{p}^{k}(\mathbb{M})}^{p}\sim\sum_{j=1}^{N}\|(\varphi_{j}u)\circ\psi_{j}^{-1}\|_{W_{p}^{k}(\mathbb{R}^{d})}^{p}. Here constants of equivalence depend on the partition of unity and charts.

A useful result in this setting, which we will use explicitly in this article, but is also behind a number of background results in section 2.5, is the following zeros estimate [23, Corollary A.13 ], which holds for Sobolev spaces: If uW2m(𝕄)u\in W_{2}^{m}(\mathbb{M}) satisfies u|=X0u\left|{}_{X}\right.=0, then for any kmk\leq m we have

uW2k(𝕄)CzeroshmkuW2m(𝕄).\|u\|_{W_{2}^{k}(\mathbb{M})}\leq C_{\mathrm{zeros}}h^{m-k}\|u\|_{W_{2}^{m}(\mathbb{M})}. (5)

Here CzerosC_{\mathrm{zeros}} depends on mm and 𝕄\mathbb{M}. The result can also be obtained on bounded, Lipschitz domain Ω𝕄\Omega\subset\mathbb{M} that satisfies a uniform cone condition, with the cone having radius RΩr𝕄/3R_{\Omega}\leq r_{\mathbb{M}}/3, although the constant CzerosC_{\mathrm{zeros}} will depend on the aperture of the cone in that case. See [23, Theorem A.11] for a precise statement and definitions of the involved quantities.

2.3 Elliptic operator

We write the operator \mathcal{L} in divergence form:

=a2()+a1+a0.\mathcal{L}=\nabla^{*}\textsc{a}_{2}^{\flat}(\nabla\cdot)+\textsc{a}_{1}\nabla+\textsc{a}_{0}.

Here a0\textsc{a}_{0} is a smooth function, a1\textsc{a}_{1} is a smooth tensor field of type (1,0)(1,0) and a2\textsc{a}_{2} is a smooth tensor field of type (2,0)(2,0) which generates the field a2\textsc{a}_{2}^{\flat} of type (1,1). In coordinates, a2\textsc{a}_{2} has the form j=1dk=1dajkxjxk\sum_{j=1}^{d}\sum_{k=1}^{d}a^{jk}\frac{\partial}{\partial x_{j}}\frac{\partial}{\partial x_{k}}, and a2\textsc{a}_{2}^{\flat} is j=1dk=1d\tensoradkjxkxj\sum_{j=1}^{d}\sum_{k=1}^{d}\tensor{a}{{}_{k}^{j}}dx_{k}\frac{\partial}{\partial x_{j}} with \tensora=kj=1dgkaj\tensor{a}{{}_{k}^{j}}=\sum_{\ell=1}^{d}g_{k\ell}a^{\ell j}. Here we have used the index lowering operator which ensures, for any μ,νT1𝕄x\mu,\nu\in T_{1}\mathbb{M}_{x}, that a2(μ,),νx=a2(μ,ν)\langle\textsc{a}_{2}^{\flat}(\mu,\cdot),\nu\rangle_{x}=\textsc{a}_{2}(\mu,\nu). This follows by a direct calculation from the above expressions in coordinates (index lowering and the operator are discussed in [19, p. 341] or [20, p. 27]).

Furthermore, we require c0>0c_{0}>0 to be a constant so that

a2(x)(v,v)c0v,vxanda0+12a1c0.\textsc{a}_{2}(x)(v,v)\geq c_{0}\langle v,v\rangle_{x}\quad\text{and}\quad\textsc{a}_{0}+\frac{1}{2}\nabla^{*}\textsc{a}_{1}\geq c_{0}. (6)

As a basic example, we consider a2(x)=,x\textsc{a}_{2}(x)=\langle\cdot,\cdot\rangle_{x}, a1=0\textsc{a}_{1}=0 and a0=1\textsc{a}_{0}=1; then (6) holds with c0=1c_{0}=1. In this case, ajk=gjka^{jk}=g^{jk}, \tensora=kjδj,k\tensor{a}{{}_{k}^{j}}=\delta_{j,k} and =1+=1Δ\mathcal{L}=1+\nabla^{*}\nabla=1-\Delta.

With the identity 𝕄u(a1u)=12𝕄(a1(u2))=12𝕄(a1)u2\int_{\mathbb{M}}u(\textsc{a}_{1}\nabla u)=\frac{1}{2}\int_{\mathbb{M}}(\textsc{a}_{1}\nabla(u^{2}))=\frac{1}{2}\int_{\mathbb{M}}(\nabla^{*}\textsc{a}_{1})u^{2}, the second part of (6) ensures that

𝕄u(a1u+a0u)c0uL2(𝕄)2.\int_{\mathbb{M}}u(\textsc{a}_{1}\nabla u+\textsc{a}_{0}u)\geq c_{0}\|u\|_{L_{2}(\mathbb{M})}^{2}.

We have also that 𝕄u(x)(a2u)(x)dμ(x)=𝕄a2u,uxdμ(x)\int_{\mathbb{M}}u(x)\nabla^{*}(\textsc{a}_{2}^{\flat}\nabla u)(x)\mathrm{d}\mu(x)=\int_{\mathbb{M}}\langle\textsc{a}_{2}^{\flat}\nabla u,\nabla u\rangle_{x}\mathrm{d}\mu(x). The definition of a2\textsc{a}_{2}^{\flat} ensures that this equals 𝕄a2(x)(u(x),u(x))dμ(x)\int_{\mathbb{M}}\textsc{a}_{2}(x)\bigl{(}\nabla u(x),\nabla u(x)\bigr{)}\mathrm{d}\mu(x), and the first part of (6) then ensures that

𝕄(a2u)(x)dμ(x)𝕄ux2dμ(x).\int_{\mathbb{M}}\nabla^{*}(\textsc{a}_{2}^{\flat}\nabla u)(x)\mathrm{d}\mu(x)\geq\int_{\mathbb{M}}\|\nabla u\|_{x}^{2}\mathrm{d}\mu(x).

Thus, (6) guarantees that the bilinear form a(u,v):=𝕄vua(u,v):=\int_{\mathbb{M}}v\mathcal{L}u, defined initially for smooth functions, is bounded on W21(𝕄)W_{2}^{1}(\mathbb{M}) and is coercive. Thus, we have that the energy quasi-norm [u]2:=a(u,u)[u]^{2}_{\mathcal{L}}:=a(u,u) satisfies the metric equivalence

c0uW21(𝕄)2[u]2=a(u,u)CuW21(𝕄)2.\displaystyle c_{0}\|u\|_{W_{2}^{1}(\mathbb{M})}^{2}\leq[u]^{2}_{\mathcal{L}}=a(u,u)\leq C\|u\|_{W_{2}^{1}(\mathbb{M})}^{2}. (7)

(if aa is symmetric, this is a norm, and we write u=[u]\|u\|_{\mathcal{L}}=[u]_{\mathcal{L}}.)

Please note that this excludes differential operators with a null space at the moment. Having in mind the time dependent problems, this assumption is justified. The technical more challenging analysis for those more general operators is left to future research.

2.4 Galerkin methods

We fix fL2(𝕄)f\in L_{2}(\mathbb{M}) and consider uW21(𝕄)u\in W^{1}_{2}(\mathbb{M}) as solution to

a(u,v):=𝕄vu=(f,v)L2(𝕄)=F(v)for all vW21(𝕄).\displaystyle a(u,v):=\int_{\mathbb{M}}v\mathcal{L}u=(f,v)_{L_{2}(\mathbb{M})}=F(v)\quad\text{for all }v\in W^{1}_{2}(\mathbb{M}). (8)

Regularity estimates ([24, Chapter 5.11, Theorem 11.1]) yield that uW22(𝕄)u\in W^{2}_{2}(\mathbb{M}) and uW22(𝕄)CfL2(𝕄)\|u\|_{W^{2}_{2}(\mathbb{M})}\leq C\|f\|_{L_{2}(\mathbb{M})}. Consider now a family of finite dimensional subspaces (Vh)(V_{h}) with VhW21(𝕄)V_{h}\subset W_{2}^{1}(\mathbb{M}) associated to a parameter222In the sequel, these will be kernel spaces generated by a subset Ξ𝕄\Xi\subset\mathbb{M}, and h=h(Ξ,𝕄)h=h(\Xi,\mathbb{M}) will be the fill distance. For now, we consider a more abstract setting, with h>0h>0 only playing a role in establishing the approximation property distW21(g,Vh)ChgW22(𝕄)\mathrm{dist}_{W_{2}^{1}}(g,V_{h})\leq Ch\|g\|_{W_{2}^{2}(\mathbb{M})} below. h>0h>0. Define PVh:W21(𝕄)VhP_{V_{h}}:W^{1}_{2}(\mathbb{M})\to V_{h} so that

a(PVhu,v)=a(u,v)for all vVh.\displaystyle a(P_{V_{h}}u,v)=a(u,v)\quad\text{for all }v\in V_{h}.

Because a(uPVhu,PVhuv)=0a\left(u-P_{V_{h}}u,P_{V_{h}}u-v\right)=0, the classical Céa Lemma holds:

c1uPVhuW21(𝕄)2\displaystyle c_{1}\left\|u-P_{V_{h}}u\right\|^{2}_{W^{1}_{2}(\mathbb{M})} a(uPVhu,uPVhu)=a(uPVhu,uv)\displaystyle\leq a\left(u-P_{V_{h}}u,u-P_{V_{h}}u\right)={\color[rgb]{0,0,0}a\left(u-P_{V_{h}}u,u-v\right)}
c2uPVhuW21(𝕄)uvW21(𝕄) for all vVh.\displaystyle\leq c_{2}\left\|u-P_{V_{h}}u\right\|_{W^{1}_{2}(\mathbb{M})}\left\|u-v\right\|_{W^{1}_{2}(\mathbb{M})}\quad\text{ for all }v\in V_{h}.

And thus we get

uPVhuW21(𝕄)c2c1distW21(𝕄)(u,Vh).\displaystyle\left\|u-P_{V_{h}}u\right\|_{W^{1}_{2}(\mathbb{M})}\leq\frac{c_{2}}{c_{1}}\mathrm{dist}_{{W^{1}_{2}(\mathbb{M})}}\left(u,V_{h}\right). (9)

This can be improved by a Nitsche-type argument to get the following result, whose proof can be found in many textbooks on numerical methods for partial differential equations.

Lemma 1.

Suppose the family (Vh)(V_{h}) has the property that for all u~W22(𝕄)\tilde{u}\in W_{2}^{2}(\mathbb{M}), the distance in W21(𝕄)W_{2}^{1}(\mathbb{M}) from VhV_{h} satisfies distW21(𝕄)(u~,Vh)Chu~W22(𝕄)\mathrm{dist}_{\|\cdot\|_{W^{1}_{2}(\mathbb{M})}}\left(\tilde{u},V_{h}\right)\leq Ch\left\|\tilde{u}\right\|_{W^{2}_{2}(\mathbb{M})}. Then for any uW22(𝕄)u\in W_{2}^{2}(\mathbb{M}),

uPVhuL2(𝕄)ChdistW21(𝕄)(u,Vh).\displaystyle\left\|u-P_{V_{h}}u\right\|_{L_{2}(\mathbb{M})}\leq Ch\mathrm{dist}_{\|\cdot\|_{W^{1}_{2}(\mathbb{M})}}\left(u,V_{h}\right). (10)

We point out, that Lemma 1 does not depend on the choice of a specific basis in the finite dimensional space VhV_{h}.

2.5 Kernel approximation

We consider a continuous function ϕ:𝕄×𝕄\phi:\mathbb{M}\times\mathbb{M}\to\mathbb{R}, the kernel, which satisfies a number of analytic properties which we explain in this section.

Most important is that ϕ\phi is conditionally positive definite with respect to some (possibly trivial) finite dimensional subspace333generally a space spanned by some eigenfunctions of the Laplace-Beltrami operator ΠC(𝕄)\Pi\subset C^{\infty}(\mathbb{M}). Conditional positive definiteness with respect to Π\Pi means that for any Ξ𝕄\Xi\subset\mathbb{M}, the collocation matrix (ϕ(ξ,ζ))ξ,ζ\bigl{(}\phi(\xi,\zeta)\bigr{)}_{\xi,\zeta} is positive definite on the vector space {aΞ(pΠ)ξΞaξp(ξ)=0}\{a\in\mathbb{R}^{\Xi}\mid(\forall p\in\Pi)\,\sum_{\xi\in\Xi}a_{\xi}p(\xi)=0\}. As a result, if Ξ\Xi separates elements of Π\Pi, then the space

VΞ:={ξΞa(ξ)ϕ(,ξ)(pΠ)ξΞaξp(ξ)=0}+Π\displaystyle V_{\Xi}:=\left\{\sum_{\xi\in\Xi}a(\xi)\phi(\cdot,\xi)\mid(\forall p\in\Pi)\,\sum_{\xi\in\Xi}a_{\xi}p(\xi)=0\right\}+\Pi (11)

has dimension Ξ\Xi. In case Π={0}\Pi=\{0\}, the kernel is positive definite, and the collocation matrix is strictly positive definite on Ξ\mathbb{R}^{\Xi}, and VΞ=spanξΞϕ(,ξ)V_{\Xi}=\mathrm{span}_{\xi\in\Xi}\phi(\cdot,\xi).

For a conditionally positive definite kernel, there is an associated reproducing kernel semi-Hilbert space 𝒩(ϕ)C(𝕄)\mathcal{N}(\phi)\subset C(\mathbb{M}) with the property that Π=null(𝒩(ϕ))\Pi=\mathrm{null}(\|\cdot\|_{\mathcal{N}(\phi)}) and that for any aΞa\in\mathbb{R}^{\Xi} for which ξΞaξδξΠ\sum_{\xi\in\Xi}a_{\xi}\delta_{\xi}\perp\Pi, the following identity

ξΞaξf(ξ)=f,ξΞaξϕ(,ξ)𝒩(ϕ)\sum_{\xi\in\Xi}a_{\xi}f(\xi)=\langle f,\sum_{\xi\in\Xi}a_{\xi}\phi(\cdot,\xi)\rangle_{\mathcal{N}(\phi)}

holds for all f𝒩(ϕ)f\in\mathcal{N}(\phi). It follows that if Ξ\Xi separates elements of Π\Pi, interpolation with VΞV_{\Xi} is well defined on Ξ\Xi and the projection IΞ:𝒩(ϕ)VΞI_{\Xi}:\mathcal{N}(\phi)\to V_{\Xi} is orthogonal with respect to the semi-norm on 𝒩(ϕ)\mathcal{N}(\phi). Of special interest are (conditionally) positive definite kernels that have Sobolev native spaces.

Lemma 2.

If ϕ\phi is conditionally positive definite with respect to Π\Pi and satisfies the equivalence 𝒩(ϕ)/ΠW2m(𝕄)/Π\mathcal{N}(\phi)/\Pi\cong W_{2}^{m}(\mathbb{M})/\Pi. then there is a constant CC so that for any integers k,jk,j with 0kjm0\leq k\leq j\leq m and any uW2j(𝕄)u\in W_{2}^{j}(\mathbb{M}),

distW2k(𝕄)(u,VΞ)ChjkuW2j(𝕄).\mathrm{dist}_{W_{2}^{k}(\mathbb{M})}(u,V_{\Xi})\leq Ch^{j-k}\|u\|_{W_{2}^{j}(\mathbb{M})}.
Proof.

The case k=jk=j is trivial; it follows by considering 0VΞ0\in V_{\Xi}.

For j=mj=m and 0k<m0\leq k<m, the zeros estimate [23, Corollary A.13] ensures that

IΞuuW2k(𝕄)ChmkIΞuuW2m(𝕄)\|I_{\Xi}u-u\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|I_{\Xi}u-u\|_{W_{2}^{m}(\mathbb{M})}

(because the interpolation error IΞffI_{\Xi}f-f vanishes on Ξ\Xi). The hypothesis then gives, IΞuuW2k(𝕄)ChmkIΞuu𝒩(ϕ)\|I_{\Xi}u-u\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|I_{\Xi}u-u\|_{\mathcal{N}(\phi)}, with a suitably enlarged constant. Because IΞI_{\Xi} is an orthogonal projector, IΞuu𝒩(ϕ)u𝒩(ϕ)\|I_{\Xi}u-u\|_{\mathcal{N}(\phi)}\leq\|u\|_{\mathcal{N}}(\phi), so we have (again enlarging the constant) that

IΞuuW2k(𝕄)ChmkuW2m(𝕄).\|I_{\Xi}u-u\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|u\|_{W_{2}^{m}(\mathbb{M})}.

For 0k<j<m0\leq k<j<m, we use the fact that W2j(𝕄)W_{2}^{j}(\mathbb{M}) is the real interpolation space [W2k(𝕄),W2m(𝕄)]jkmk,2[W_{2}^{k}(\mathbb{M}),W_{2}^{m}(\mathbb{M})]_{\frac{j-k}{m-k},2}. This is [22, Theorem 5]. For u~W2j(𝕄)\tilde{u}\in W_{2}^{j}(\mathbb{M}), this means that the KK-functional

K(u~,t)=infgW2m(𝕄)u~gW2k(𝕄)+tgW2m(𝕄)K(\tilde{u},t)=\inf_{g\in W_{2}^{m}(\mathbb{M})}\|\tilde{u}-g\|_{W_{2}^{k}(\mathbb{M})}+t\|g\|_{W_{2}^{m}(\mathbb{M})}

satisfies the condition 0(tjkmkK(u~,t))2dtt<\int_{0}^{\infty}(t^{-\frac{j-k}{m-k}}K(\tilde{u},t))^{2}\frac{\mathrm{d}t}{t}<\infty. Since K(u~,t)K(\tilde{u},t) is continuous and monotone, we have that ttjkmkK(u~,t)t\mapsto t^{-\frac{j-k}{m-k}}K(\tilde{u},t) is bounded on (0,)(0,\infty). Thus for t=hmkt=h^{m-k}, there exists gW2m(𝕄)g\in W_{2}^{m}(\mathbb{M}) so that

u~gW2k(𝕄)\displaystyle\|\tilde{u}-g\|_{W_{2}^{k}(\mathbb{M})} Chjku~W2j(𝕄)\displaystyle\leq Ch^{j-k}\|\tilde{u}\|_{W_{2}^{j}(\mathbb{M})} and gW2m(𝕄)\displaystyle\|g\|_{W_{2}^{m}(\mathbb{M})} Chjmu~W2j(𝕄).\displaystyle\leq Ch^{j-m}\|\tilde{u}\|_{W_{2}^{j}(\mathbb{M})}.

The above estimate gives IΞggW2k(𝕄)ChmkgW2m(𝕄)\|I_{\Xi}g-g\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|g\|_{W_{2}^{m}(\mathbb{M})}. This implies that u~IΞgW2k(𝕄)Chjku~W2j(𝕄)\|\tilde{u}-I_{\Xi}g\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{j-k}\|\tilde{u}\|_{W_{2}^{j}(\mathbb{M})} as desired. ∎

As a consequence, the kernel Galerkin solution uΞVΞu_{\Xi}\in V_{\Xi} to u=f\mathcal{L}u=f with fW2j+1(𝕄)f\in W_{2}^{j+1}(\mathbb{M}) satisfies uuΞW21(𝕄)Chj1fW2j+1(𝕄)\|u-u_{\Xi}\|_{W_{2}^{1}(\mathbb{M})}\leq Ch^{j-1}\|f\|_{W_{2}^{j+1}(\mathbb{M})}. More importantly, for our purposes, the hypothesis of Lemma 1 is satisfied by the space VΞV_{\Xi}. Indeed, by (10), we have the following approximation property:

uPΞuL2(𝕄)ChuW21(𝕄)\|u-P_{\Xi}u\|_{L_{2}(\mathbb{M})}\leq Ch\|u\|_{W_{2}^{1}(\mathbb{M})} (12)

which, together with a smoothing property, forms the backbone of the convergence theory for the multigrid method.

3 The Lagrange basis and stiffness matrix

For a kernel ϕ\phi and a set Ξ\Xi which separates points of Π\Pi, the Lagrange basis (χξ)ξΞ(\chi_{\xi})_{\xi\in\Xi} for VΞV_{\Xi} satisfies χξ(ζ)=δξ,ζ\chi_{\xi}(\zeta)=\delta_{\xi,\zeta} for each ξ,ζΞ\xi,\zeta\in\Xi.

A natural consequence of 𝒩(ϕ)W2m(𝕄)\mathcal{N}(\phi)\cong W_{2}^{m}(\mathbb{M}) is that there exists a constant CC so that for any Ξ𝕄\Xi\subset\mathbb{M}, the bound χξW2m(𝕄)Cqd2m\|\chi_{\xi}\|_{W_{2}^{m}(\mathbb{M})}\leq Cq^{\frac{d}{2}-m} holds. Especially on Riemannian manifolds without boundary, much stronger statement can be proven, see [21, 25, 23] or [14] on 𝕊2\mathbb{S}^{2}. We are, however, not convinced that the list of settings where stronger results hold is complete. In order to allow our results to be applied in future settings where those statement will be shown, we will formulate those stronger statements as assumptions or building blocks.

A stronger result is the following:

Assumption 1.

We assume that there is m>d/2+1m>d/2+1 so that 𝒩(ϕ)W2m(𝕄)\mathcal{N}(\phi)\cong W_{2}^{m}(\mathbb{M}), and, furthermore, there exist constants ν>0\nu>0 and CENC_{\mathrm{EN}} so that for R>0R>0

χξW2m(𝕄B(ξ,R))CENqd2meνRh.\|\chi_{\xi}\|_{W_{2}^{m}(\mathbb{M}\setminus B(\xi,R))}\leq C_{\mathrm{EN}}q^{\frac{d}{2}-m}e^{-\nu\frac{R}{h}}.

This gives rise to a number of analytic properties, some of which we present here (there are many more, see [25] and [15] for a detailed discussions). For the following estimates, the constants of equivalence depend on CENC_{\mathrm{EN}}, ν\nu, and ρ\rho.

Pointwise decay: there exist constants CC and ν>0\nu>0 so that for any Ξ𝕄\Xi\subset\mathbb{M}, the estimate

|χξ(x)|CPWeνdist(x,ξ)h|\chi_{\xi}(x)|\leq C_{\mathrm{PW}}e^{-\nu\frac{\mathrm{dist}(x,\xi)}{h}} (13)

holds. Here CPWρmd/2CENC_{\mathrm{PW}}\leq\rho^{m-d/2}C_{\mathrm{EN}}, where we recall that the mesh ratio is ρ=h/q\rho=h/q.

Hölder continuity: Of later importance, we mention the following condition, which follows from [23, Corollary A.15]. For any ϵ<md/2\epsilon<m-d/2, the Lagrange function is ϵ\epsilon Hölder continuous, and satisfies the bound

|χξ(x)χξ(y)|CHo¨lder(dist(x,y))ϵhϵ.|\chi_{\xi}(x)-\chi_{\xi}(y)|\leq C_{\mathrm{H\ddot{o}lder}}\bigl{(}\mathrm{dist}(x,y)\bigr{)}^{\epsilon}h^{-\epsilon}. (14)

Although we make explicit the dependence on ρ\rho here, for the remainder of this article, we assume that constants that follow depend on ρ\rho.

Riesz property: there exist constants 0<C1C2<0<C_{1}\leq C_{2}<\infty so that for any Ξ𝕄\Xi\subset\mathbb{M}, and aΞa\in\mathbb{R}^{\Xi}, we have

C1qd/2a2(Ξ)ξΞaξχξL2(𝕄)C2qd/2a2(Ξ).C_{1}q^{d/2}\|a\|_{\ell_{2}(\Xi)}\leq\|\sum_{\xi\in\Xi}a_{\xi}\chi_{\xi}\|_{L_{2}(\mathbb{M})}\leq C_{2}q^{d/2}\|a\|_{\ell_{2}(\Xi)}. (15)

Bernstein inequalities: There is a constant CBernsteinC_{\mathrm{Bernstein}} so that for 0km0\leq k\leq m,

ξΞaξχξW2k(𝕄)CBernsteinhd/2ka2(Ξ).\|\sum_{\xi\in\Xi}a_{\xi}\chi_{\xi}\|_{W_{2}^{k}(\mathbb{M})}\leq C_{\mathrm{Bernstein}}h^{d/2-k}\|a\|_{\ell_{2}(\Xi)}. (16)

3.1 The stiffness matrix

We now discuss the stiffness matrix and some of its properties. Most of these have appeared in [26], with earlier versions for the sphere appearing in [21] and [8].

A consequence of the results of this section is that the problem of calculating the Galerkin solution to u=f\mathcal{L}u=f from VΞV_{\Xi} involves treating a problem whose condition number grows like 𝒪(h2)\mathcal{O}(h^{-2}) – this is the fundamental issue that the multigrid method seeks to overcome.

The analysis map for (χξ)ξΞ\bigl{(}\chi_{\xi}\bigr{)}_{\xi\in\Xi_{\ell}} with respect to the bilinear form aa defined in (8) is

σΞ:(W21(𝕄),a(,))(Ξ,(,)2):v(a(v,χξ))ξΞT.\displaystyle\sigma_{\Xi}:\left(W^{1}_{2}(\mathbb{M}),a(\cdot,\cdot)\right)\to\left(\mathbb{R}^{\Xi},(\cdot,\cdot)_{2}\right):\quad v\mapsto\bigl{(}a(v,\chi_{\xi})\bigr{)}_{\xi\in\Xi}^{T}.

The analysis map is a surjection.

The synthesis map is

σΞ:(Ξ,(,)2)(W21(𝕄),a(,)):𝐰ξΞwξχξ.\displaystyle\sigma^{\ast}_{\Xi}:\left(\mathbb{R}^{\Xi},(\cdot,\cdot)_{2}\right)\to\left(W^{1}_{2}(\mathbb{M}),a(\cdot,\cdot)\right):\quad\boldsymbol{\mathbf{w}}\mapsto\sum_{\xi\in\Xi}w_{\xi}\chi_{\xi}.

The range of the synthesis map is clearly VΞV_{\Xi}; in other words, it is the natural isomorphism between Euclidean space and the finite dimensional kernel space; indeed, (15) shows that it is bounded above and below between L2(𝕄)L_{2}(\mathbb{M}) and 2(Ξ)\ell_{2}(\Xi). By abusing notation slightly, we write (σΞ)1:VΞΞ\bigl{(}\sigma^{\ast}_{\Xi}\bigr{)}^{-1}:V_{\Xi}\to\mathbb{R}^{\Xi}. This permits a direct matrix representation of linear operators on VΞV_{\Xi} via conjugation: S𝐒:=(σΞ)1SσΞΞ×ΞS\mapsto\mathbf{S}:=(\sigma_{\Xi}^{*})^{-1}S\sigma_{\Xi}^{*}\in\mathbb{R}^{\Xi\times\Xi}. Furthermore, by the Riesz property (15), we have

C1C2𝐒2(Ξ)2(Ξ)SL2(𝕄)L2(𝕄)C2C1𝐒2(Ξ)2(Ξ).\displaystyle\frac{C_{1}}{C_{2}}\|\mathbf{S}\|_{\ell_{2}(\Xi)\to\ell_{2}(\Xi)}\leq\|S\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq\frac{C_{2}}{C_{1}}\|\mathbf{S}\|_{\ell_{2}(\Xi)\to\ell_{2}(\Xi)}. (17)

A simple calculation shows that a(σΞ(𝐰),v)=𝐰,σΞ(v)2a\left(\sigma^{\ast}_{\Xi}(\boldsymbol{\mathbf{w}}),v\right)=\left\langle\boldsymbol{\mathbf{w}},\sigma_{\Xi}(v)\right\rangle_{2}, so σΞ\sigma_{\Xi}^{\ast} is the aa-adjoint of σΞ\sigma_{\Xi}. Of course, when aa is symmetric, we also have σ(v),𝐰2=a(v,σ(𝐰))\left\langle\sigma_{\ell}(v),\boldsymbol{\mathbf{w}}\right\rangle_{2}=a\left(v,\sigma^{\ast}_{\ell}(\boldsymbol{\mathbf{w}})\right).

The stiffness matrix is defined as

𝐀Ξ:=(Aξ,η)ξ,ηΞ:=(a(χξ,χζ))ξ,ζΞ.\displaystyle\boldsymbol{\mathbf{A}}_{\Xi}:=(A_{\xi,\eta})_{\xi,\eta\in\Xi}:=\Bigl{(}a\bigl{(}\chi_{\xi},\chi_{\zeta}\bigr{)}\Bigr{)}_{\xi,\zeta\in\Xi}. (18)

It represents the operator \mathcal{L} on the finite dimensional space VΞV_{\Xi}. Using the analysis and synthesis maps, 𝐀Ξ=σΞσΞ:(Ξ,(,)2)(Ξ,(,)2):𝐜(a(χξ,χζ))ξ,ζΞ𝐜\boldsymbol{\mathbf{A}}_{\Xi}=\sigma_{\Xi}\circ\sigma^{\ast}_{\Xi}:\left(\mathbb{R}^{\Xi},(\cdot,\cdot)_{2}\right)\to\left(\mathbb{R}^{\Xi},(\cdot,\cdot)_{2}\right):\boldsymbol{\mathbf{c}}\mapsto\left(a(\chi_{\xi},\chi_{\zeta})\right)_{\xi,\zeta\in\Xi}\boldsymbol{\mathbf{c}}, and we have that the Galerkin projector PΞ:W21(𝕄)VΞP_{\Xi}:W_{2}^{1}(\mathbb{M})\to V_{\Xi} can be expressed as PΞ=σΞ(σΞσΞ)1σΞP_{\Xi}=\sigma^{\ast}_{\Xi}\left(\sigma_{\Xi}\circ\sigma^{\ast}_{\Xi}\right)^{-1}\sigma_{\Xi}.

Lemma 3.

There is a constant CstiffC_{\mathrm{stiff}} so that the entries of the stiffness matrix satisfy

|Aξ,η|Cstiffhd2eν2dist(ξ,η)h.|A_{\xi,\eta}|\leq C_{\mathrm{stiff}}h^{d-2}e^{-\frac{\nu}{2}\frac{\mathrm{dist}(\xi,\eta)}{h}}.
Proof.

For a tensor field F𝒯1𝕄F\in\mathcal{T}_{1}\mathbb{M}, we write |F|:𝕄:xF(x)x|F|:\mathbb{M}\to\mathbb{R}:x\mapsto\|F(x)\|_{x}. Thus using |χξ|(x)=χξ(x)x|\nabla\chi_{\xi}|(x)=\|\nabla\chi_{\xi}(x)\|_{x}, we can bound the integral

|a(χξ,χη)|\displaystyle|a(\chi_{\xi},\chi_{\eta})| a2|χξ|,|χη|L2(𝕄)+a1|χξ|,|χη|L2(𝕄)\displaystyle\leq\|\textsc{a}_{2}\|_{\infty}\langle|\nabla\chi_{\xi}|,|\nabla\chi_{\eta}|\rangle_{L_{2}(\mathbb{M})}+\|\textsc{a}_{1}\|_{\infty}\langle|\nabla\chi_{\xi}|,|\chi_{\eta}|\rangle_{L_{2}(\mathbb{M})}
+a0|χξ|,|χη|L2(𝕄).\displaystyle+\|\textsc{a}_{0}\|_{\infty}\langle|\chi_{\xi}|,|\chi_{\eta}|\rangle_{L_{2}(\mathbb{M})}.

Decompose each inner product using the half spaces H+:={xdist(x,ξ)<dist(x,η)}H_{+}:=\{x\mid\mathrm{dist}(x,\xi)<\mathrm{dist}(x,\eta)\} and H=𝕄H+H_{-}=\mathbb{M}\setminus H_{+}, noting H+𝕄B(η,R)H_{+}\subset\mathbb{M}\setminus B(\eta,R) and H𝕄B(ξ,R)H_{-}\subset\mathbb{M}\setminus B(\xi,R), with R=dist(ξ,η)/2R=\mathrm{dist}(\xi,\eta)/2. By applying Cauchy-Schwarz to each integral gives, after combining terms,

|Aξ,η|C(χξW21(𝕄)χηW21(𝕄B(η,R))+χξW21(𝕄B(ξ,R))χηW21(𝕄))|A_{\xi,\eta}|\leq C_{\mathcal{L}}(\|\chi_{\xi}\|_{W_{2}^{1}(\mathbb{M})}\|\chi_{\eta}\|_{W_{2}^{1}(\mathbb{M}\setminus B(\eta,R))}+\|\chi_{\xi}\|_{W_{2}^{1}(\mathbb{M}\setminus B(\xi,R))}\|\chi_{\eta}\|_{W_{2}^{1}(\mathbb{M})})

for some constant CC_{\mathcal{L}} depending on the coefficients of \mathcal{L}.

We have χξW21(𝕄)CBernsteinhd/21\|\chi_{\xi}\|_{W_{2}^{1}(\mathbb{M})}\leq C_{\mathrm{Bernstein}}h^{d/2-1} by the Bernstein inequality (16) (with a similar estimate for χη\chi_{\eta}). The zeros estimate for complements of balls, [23, Corollary A.17], applied to χξ\chi_{\xi} gives

χξW21(𝕄B(ξ,R))Czeroshm1χξW2m(𝕄B(ξ,R))\|\chi_{\xi}\|_{W_{2}^{1}(\mathbb{M}\setminus B(\xi,R))}\leq C_{\mathrm{zeros}}h^{m-1}\|\chi_{\xi}\|_{W_{2}^{m}(\mathbb{M}\setminus B(\xi,R))}

(with a similar estimate for χη\chi_{\eta}). Thus we have

|Aξ,η|\displaystyle|A_{\xi,\eta}| \displaystyle\leq CCBernsteinCzeroshd/2+m2(χξW2m(𝕄B(ξ,R))+χηW2m(𝕄B(η,R)))\displaystyle C_{\mathcal{L}}C_{\mathrm{Bernstein}}C_{\mathrm{zeros}}h^{d/2+m-2}(\|\chi_{\xi}\|_{W_{2}^{m}(\mathbb{M}\setminus B(\xi,R))}+\|\chi_{\eta}\|_{W_{2}^{m}(\mathbb{M}\setminus B(\eta,R))})
\displaystyle\leq 2CCBernsteinCzerosCENhd/2+m2qd/2meνR/h.\displaystyle 2C_{\mathcal{L}}C_{\mathrm{Bernstein}}C_{\mathrm{zeros}}C_{\mathrm{EN}}h^{d/2+m-2}q^{d/2-m}e^{-\nu R/h}.

The lemma follows with Cstiff=2CCBernsteinCzerosCENρmd/2C_{\mathrm{stiff}}=2C_{\mathcal{L}}C_{\mathrm{Bernstein}}C_{\mathrm{zeros}}C_{\mathrm{EN}}\rho^{m-d/2}. ∎

By considering row and column sums, we have that

𝐀Ξ22CAhd2\displaystyle\left\|\boldsymbol{\mathbf{A}}_{\Xi}\right\|_{2\to 2}\leq C_{A}h^{d-2} (19)

holds with CA=Cstiff(1+n=1(n+2)deν2ρn)C_{A}=C_{\mathrm{stiff}}(1+\sum_{n=1}^{\infty}(n+2)^{d}e^{-\frac{\nu}{2\rho}n}).

Lemma 4.

Under Assumptions 1, for Ξ𝕄\Xi\subset\mathbb{M}, the stiffness matrix satisfies

𝐀Ξ122CHo¨lderhd\|\boldsymbol{\mathbf{A}}_{\Xi}^{-1}\|_{\ell_{2}\to\ell_{2}}\leq C_{\mathrm{H\ddot{o}lder}}h^{-d}

with a constant CHo¨lderC_{\mathrm{H\ddot{o}lder}} which is independent of Ξ\Xi.

Proof.

Coercivity ensures |a(ξΞvξχξ,ξΞvξχξ)|c0ξΞvξχξW21(𝕄)2|a(\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi},\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi})|\geq c_{0}\left\|\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi}\right\|^{2}_{W_{2}^{1}(\mathbb{M})}, and the metric equivalence vW21(𝕄)2=(1Δ)1/2vL2(𝕄)\|v\|_{W_{2}^{1}(\mathbb{M})}^{2}=\|(1-\Delta)^{1/2}v\|_{L_{2}(\mathbb{M})} gives

𝐯𝐀Ξ𝐯c0𝐯𝐋𝐯,where𝐋:=((1Δ)1/2χξ,(1Δ)1/2χη)ξ,ηΞ\displaystyle\boldsymbol{\mathbf{v}}\cdot\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}}\geq c_{0}\boldsymbol{\mathbf{v}}\cdot\boldsymbol{\mathbf{L}}\boldsymbol{\mathbf{v}},\quad\text{where}\quad\boldsymbol{\mathbf{L}}:=\left(\langle(1-\Delta)^{1/2}\chi_{\xi},(1-\Delta)^{1/2}\chi_{\eta}\rangle\right)_{\xi,\eta\in\Xi}

is the stiffness matrix for the self-adjoint operator 1Δ1-\Delta. Because the spectrum of 1Δ1-\Delta is bounded below, i.e., σspec(1Δ)[1,)\sigma_{\mathrm{spec}}(1-\Delta)\subset[1,\infty), we conclude that 𝐯𝐋𝐯ξΞvξχξL2(𝕄)2C1hd𝐯2(Ξ)2\boldsymbol{\mathbf{v}}\cdot\boldsymbol{\mathbf{L}}\boldsymbol{\mathbf{v}}\geq\left\|\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi}\right\|^{2}_{L_{2}(\mathbb{M})}\geq C_{1}h^{d}\left\|\boldsymbol{\mathbf{v}}\right\|^{2}_{\ell_{2}(\Xi)} by the Riesz property (15). Hence, overall we get

𝐀Ξ𝐯21𝐯2|𝐯𝐀Ξ𝐯|chd𝐯2\displaystyle\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}\geq\frac{1}{\|\boldsymbol{\mathbf{v}}\|_{\ell_{2}}}|\boldsymbol{\mathbf{v}}\cdot\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}}|\geq ch^{d}\left\|\boldsymbol{\mathbf{v}}\right\|_{\ell_{2}}

with c=c0C1c=c_{0}C_{1}, so 𝐀Ξ1221c0C1hd\|\boldsymbol{\mathbf{A}}_{\Xi}^{-1}\|_{\ell_{2}\to\ell_{2}}\leq\frac{1}{c_{0}C_{1}}h^{-d}. ∎

Consequently, the 2\ell_{2} condition number of 𝐀Ξ\boldsymbol{\mathbf{A}}_{\Xi} is bounded by a multiple of h2h^{-2}. Please note that the bounds in Lemma 3 and Lemma 4 do not assume the stiffness matrix to be symmetric, only these bounds are in some sense symmetric. Moreover, (19) is obtained by Riesz-Thorin interpolation and bounds on the row and column sums. Thus this bound also does not require the stiffness matrix to be symmetric.

3.2 The diagonal of the stiffness matrix

As a counterpart to the off-diagonal decay given in Lemma 3, we can give the following lower bounds on the diagonal entries.

Lemma 5.

For an elliptic operator \mathcal{L} and a kernel satisfying Assumption 1, for a mesh ratio ρ\rho, there is Cdiag>0C_{\mathrm{diag}}>0 so that for any Ξ𝕄\Xi\subset\mathbb{M} with h/q<ρh/q<\rho, and any ξΞ\xi\in\Xi, we have

Aξ,ξ=a(χξ,χξ)Cdiaghd2.A_{\xi,\xi}=a(\chi_{\xi},\chi_{\xi})\geq C_{\mathrm{diag}}h^{d-2}.
Proof.

By coercivity of aa, it suffices to prove that χξL2(B(ξ,h))2hd2.\|\nabla\chi_{\xi}\|_{L_{2}(B(\xi,h))}^{2}\gtrsim h^{d-2}.

We begin by establishing a Poincaré-type inequality which is valid for smooth Lagrange functions. To this end, consider f:B(0,r𝕄)f:B(0,\mathrm{r}_{\mathbb{M}})\to\mathbb{R} obtained by the change of variable f(x)=χξ(expξ(x))f(x)=\chi_{\xi}(\exp_{\xi}(x)). Note that for BB(ξ,r𝕄)B\subset B(\xi,\mathrm{r}_{\mathbb{M}}), we have for any 0km0\leq k\leq m, that

fW2k(B)1Γ1χξW2k(𝕄)1Γ1(CZhmk)(CENρmd/2hd/2m)=C1hd/2k.\|f\|_{W_{2}^{k}(B)}\leq\frac{1}{\Gamma_{1}}\|\chi_{\xi}\|_{W_{2}^{k}(\mathbb{M})}\leq\frac{1}{\Gamma_{1}}(CZh^{m-k})(C_{\mathrm{EN}}\rho^{m-d/2}h^{d/2-m})=C_{1}h^{d/2-k}.

by the zeros estimate, with C1:=1Γ1ρmd/2CENCzerosC_{1}:=\frac{1}{\Gamma_{1}}\rho^{m-d/2}C_{\mathrm{EN}}C_{\mathrm{zeros}}, where the constants stem from (4). Now let r:=dist(ξ,Ξ{ξ})r:=\mathrm{dist}(\xi,\Xi\setminus\{\xi\}), and define F:B(0,1)F:B(0,1)\to\mathbb{R} by F:=f(r)F:=f(r\cdot). Then, by a change of variable,

FW2m(B(0,1))2=k=0m|F|W2k(B(0,1))2=k=0mr2kd|f|W2k(B(0,r))2C2\|F\|_{W_{2}^{m}(B(0,1))}^{2}=\sum_{k=0}^{m}|F|_{W_{2}^{k}(B(0,1))}^{2}=\sum_{k=0}^{m}r^{2k-d}|f|_{W_{2}^{k}(B(0,r))}^{2}\leq C_{2}

with C2:=ρdC1C_{2}:=\rho^{d}C_{1}, since h/ρqrhh/\rho\leq q\leq r\leq h.

Because of the imbedding W2m(B(0,1))C(B(0,1)¯)W_{2}^{m}(B(0,1))\subset C(\overline{B(0,1)}) (which holds since m>d/2m>d/2), the set

K:={GW2m(B(0,1))GW2m(B(0,1))C2,G(0)=1,G(e1)=0}K:=\left\{G\in W_{2}^{m}(B(0,1))\mid\|G\|_{W_{2}^{m}(B(0,1))}\leq C_{2},\,G(0)=1,G(e_{1})=0\right\}

is well defined, closed and convex, hence weakly compact, by Banach-Alaoglu.

The natural imbedding ι:W2m(B(0,1))W21(B(0,1))\iota:W_{2}^{m}(B(0,1))\to W_{2}^{1}(B(0,1)), is compact. We wish to show that ι(K)\iota(K) is a compact set.

Because ι\iota is a continuous linear map, it is continuous between the weak topologies of W2m(B(0,1))W_{2}^{m}(B(0,1)) and W21(B(0,1))W_{2}^{1}(B(0,1)). Thus ι(K)\iota(K) is weakly compact in W21(B(0,1))W_{2}^{1}(B(0,1)), and thus norm closed. Finally, because ι(K)\iota(K) is complete and totally bounded, it is a compact subset in the norm topology of W21(B(0,1))W_{2}^{1}(B(0,1)).

Consider the (possibly zero) constant cc defined by

c:=minGKGL2(B(0,1))GL2(B(0,1)).c:=\min_{G\in K}\frac{\|\nabla G\|_{L_{2}(B(0,1))}}{\|G\|_{L_{2}(B(0,1))}}.

The map I:W21(B(0,1)):GGL2GL2I:W_{2}^{1}(B(0,1))\to\mathbb{R}:G\mapsto\frac{\|\nabla G\|_{L_{2}}}{\|G\|_{L_{2}}} is continuous on the complement of 0W21(B(0,1))0\in W_{2}^{1}(B(0,1)) (as quotient of two continuous functions that do not vanish). In particular, it is continuous and non-vanishing on ι(K)\iota(K), so c=minGι(K)I(G)>0c=\min_{G\in\iota(K)}I(G)>0. Indeed, I(G)>0I(G)>0 for all Gι(K)G\in\iota(K), since G(0)=1G(0)=1, G(e1)=0G(e_{1})=0 and GC1(B)G\in C^{1}(B).

Note that in the above minimization problem, the condition G(e1)=0G(e_{1})=0 could be replaced by any other point on the unit sphere without changing the value of cc. By rotation invariance of the W2m(d)W_{2}^{m}(\mathbb{R}^{d}) norm, it follows that FL2cFL2\|\nabla F\|_{L_{2}}\geq c\|F\|_{L_{2}}. Finally, employing the change of variables rd2FL2(B(0,1))2=fL2(B(0,r))2r^{d-2}\|\nabla F\|_{L_{2}(B(0,1))}^{2}=\|\nabla f\|_{L_{2}(B(0,r))}^{2} and FL2(B(0,1))2=rdfL2(B(0,r))2\|F\|_{L_{2}(B(0,1))}^{2}=r^{d}\|f\|_{L_{2}(B(0,r))}^{2}, we have

χξL2(B(ξ,r))2\displaystyle\|\nabla\chi_{\xi}\|_{L_{2}(B(\xi,r))}^{2} \displaystyle\geq Γ12rd2FL2(B(0,1))2\displaystyle{\Gamma_{1}^{2}}r^{d-2}\|\nabla F\|_{L_{2}(B(0,1))}^{2}
\displaystyle\geq c2Γ12rd2FL2(B(0,1))2\displaystyle c^{2}{\Gamma_{1}^{2}}r^{d-2}\|F\|_{L_{2}(B(0,1))}^{2}
\displaystyle\geq c2Γ12ρ2h2χξL2(B(ξ,r))2.\displaystyle c^{2}\Gamma_{1}^{2}\rho^{-2}h^{-2}\|\chi_{\xi}\|_{L_{2}(B(\xi,r))}^{2}.

The last line follows because χξ\chi_{\xi} is Hölder continuous, so stays close to 11 near ξ\xi. Specifically, by (14), for κ:=(2CHo¨lder)1/ϵ\kappa:=\left(2C_{\mathrm{H\ddot{o}lder}}\right)^{-1/\epsilon} we have χξ(x)>12\chi_{\xi}(x)>\frac{1}{2} for all xB(ξ,κh)x\in B(\xi,\kappa h). Thus

χξL2(B(ξ,r))2B(ξ,κh)|χξ(x)|2dx14α𝕄(κh)d.\|\chi_{\xi}\|_{L_{2}(B(\xi,r))}^{2}\geq\int_{B(\xi,\kappa h)}|\chi_{\xi}(x)|^{2}\mathrm{d}x\geq\frac{1}{4}\alpha_{\mathbb{M}}(\kappa h)^{d}.

The lemma follows with constant Cdiag=14c2Γ12ρ2α𝕄κdC_{\mathrm{diag}}=\frac{1}{4}c^{2}\Gamma_{1}^{2}\rho^{-2}\alpha_{\mathbb{M}}\kappa^{d}. ∎

This brings us to the lower bound for diagonal entries of the stiffness matrix. Define the diagonal of 𝐀Ξ\boldsymbol{\mathbf{A}}_{\Xi} as 𝐁Ξ:=diag(𝐀Ξ),\boldsymbol{\mathbf{B}}_{\Xi}:=\mathrm{diag}(\boldsymbol{\mathbf{A}}_{\Xi}), and note that by Lemma 3 and Lemma 5,

κ(𝐁Ξ)=maxξAξ,ξminξAξ,ξCstiffCdiag\kappa(\boldsymbol{\mathbf{B}}_{\Xi})=\frac{\max_{\xi}A_{\xi,\xi}}{\min_{\xi}A_{\xi,\xi}}\leq\frac{C_{\mathrm{stiff}}}{C_{\mathrm{diag}}}

is bounded above by a constant which depends only on the mesh ratio ρ\rho (and not on Ξ\Xi).

This permits us to find suitable damping constants 0<θ<10<\theta<1 so that 𝐁Ξ\boldsymbol{\mathbf{B}}_{\Xi} dominates θ𝐀Ξ\theta\boldsymbol{\mathbf{A}}_{\Xi}. This drives the success of the damped Jacobi method considered in the next section.

Lemma 6.

For an elliptic operator \mathcal{L}, a kernel ϕ\phi satisfying Assumption 1, and mesh ratio ρ\rho, there is θ(0,1)\theta\in(0,1) so that for any point set Ξ𝕄\Xi\subset\mathbb{M}, θ𝐀Ξ𝐯,𝐯𝐁Ξ𝐯,𝐯\theta\langle\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle\leq\langle\boldsymbol{\mathbf{B}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle for all 𝐯Ξ\boldsymbol{\mathbf{v}}\in\mathbb{R}^{\Xi}.

Proof.

By (19), 𝐀Ξ𝐯,𝐯CAhd2v2\langle\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle\leq C_{A}h^{d-2}\|v\|^{2}, while Cdiaghd2v2𝐁Ξ𝐯,𝐱C_{\mathrm{diag}}h^{d-2}\|v\|^{2}\leq\langle\boldsymbol{\mathbf{B}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{x}}\rangle follows from Lemma 5. Thus the lemma holds for any θ\theta in the interval (0,Cdiag/CA](0,{C_{\mathrm{diag}}}/{C_{A}}]. ∎

4 The smoothing property

In this section we define and study the smoothing operator used in the multigrid method. We focus on the damped Jacobi method for the linear system 𝐀Ξ𝐮Ξ=𝐛\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{u}}^{\star}_{\Xi}=\boldsymbol{\mathbf{b}}, where 𝐀Ξ|Ξ|×|Ξ|\boldsymbol{\mathbf{A}}_{\Xi}\in\mathbb{R}^{|\Xi|\times|\Xi|} and 𝐛|Ξ|\boldsymbol{\mathbf{b}}\in\mathbb{R}^{|\Xi|}. For a fixed damping parameter 0<θ<10<\theta<1, 𝐮(j)\boldsymbol{\mathbf{u}}^{(j)} is approximately computed via the iteration:

𝐮(j+1)=θ𝐁Ξ1𝐛+(idθ𝐁Ξ1𝐀Ξ)𝐮(j),j0,\boldsymbol{\mathbf{u}}^{(j+1)}=\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{b}}+(\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi})\boldsymbol{\mathbf{u}}^{(j)},j\geq 0,

with starting value 𝐮(0)|Ξ|\boldsymbol{\mathbf{u}}^{(0)}\in\mathbb{R}^{|\Xi|}. Define the affine map governing a single iteration as

JΞ:Ξ×|Ξ||Ξ|,JΞ(𝐱,𝐛)=(idθ𝐁Ξ1𝐀Ξ)𝐮+θ𝐁Ξ1𝐛.\displaystyle{J}_{\Xi}:\mathbb{R}^{\Xi}\times\mathbb{R}^{|\Xi|}\to\mathbb{R}^{|\Xi|},\quad{J}_{\Xi}(\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{b}})=\left(\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}\right)\boldsymbol{\mathbf{u}}+\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{b}}.

This shows that the iteration converges if and only if iteration matrix

𝐒Ξ:=idθ𝐁Ξ1𝐀Ξ,\displaystyle\boldsymbol{\mathbf{S}}_{\Xi}:=\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}, (20)

called the smoothing matrix, is a contraction.

In the context of kernel approximation, the corresponding operator, the smoothing operator, 𝒮Ξ:VΞVΞ\mathcal{S}_{\Xi}:V_{\Xi}\to V_{\Xi} is defined as 𝒮ΞσΞ:=σΞ𝐒Ξ.\mathcal{S}_{\Xi}\sigma_{\Xi}^{*}:=\sigma_{\Xi}^{*}\boldsymbol{\mathbf{S}}_{\Xi}. The success of the multigrid method relies on a smoothing property, which for our purposes states that iterating 𝒮Ξ\mathcal{S}_{\Xi} is eventually contracting: 𝒮ΞνL2(𝕄)Ch1o(ν)\|\mathcal{S}_{\Xi}^{\nu}\|_{L_{2}(\mathbb{M})\to\mathcal{L}}\leq Ch^{-1}o(\nu) as ν\nu\to\infty. This smoothing property is demonstrated in section 4.2. Many of the results in the forthcoming section are formulated both in a matrix including 𝐒Ξ\boldsymbol{\mathbf{S}}_{\Xi} form and an operator form 𝒮Ξ\mathcal{S}_{\Xi}. This resembles a bit a change of basis transformation.

4.1 L2L_{2} stability of the smoothing operator

At this point, we are in a position to show that that iterating this operator is stable on L2(𝕄)L_{2}(\mathbb{M}). To help analyze the matrix 𝐒Ξ=idθ𝐁Ξ1𝐀Ξ\boldsymbol{\mathbf{S}}_{\Xi}=\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}, we introduce the inner product

𝐮,𝐯𝐁Ξ:=𝐁Ξ1/2𝐮,𝐁Ξ1/2𝐯2(Ξ).\langle\boldsymbol{\mathbf{u}},\boldsymbol{\mathbf{v}}\rangle_{\boldsymbol{\mathbf{B}}_{\Xi}}:=\langle\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}\boldsymbol{\mathbf{u}},\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}\boldsymbol{\mathbf{v}}\rangle_{\ell_{2}(\mathbb{R}^{\Xi})}.

Since 𝐁Ξ\boldsymbol{\mathbf{B}}_{\Xi} is diagonal, and its diagonal entries are χξ,χξhd2\langle\chi_{\xi},\chi_{\xi}\rangle_{\mathcal{L}}\sim h^{d-2}, we have the norm equivalence 𝐌𝐁Ξ𝐁Ξ𝐌22\|\boldsymbol{\mathbf{M}}\|_{\boldsymbol{\mathbf{B}}_{\Xi}\to\boldsymbol{\mathbf{B}}_{\Xi}}\sim\|\boldsymbol{\mathbf{M}}\|_{2\to 2}. Specifically, we have

1C𝐁𝐌22𝐌𝐁𝐁C𝐁𝐌22\frac{1}{C_{\boldsymbol{\mathbf{B}}}}\|\boldsymbol{\mathbf{M}}\|_{2\to 2}\leq\|\boldsymbol{\mathbf{M}}\|_{\boldsymbol{\mathbf{B}}\to\boldsymbol{\mathbf{B}}}\leq C_{\boldsymbol{\mathbf{B}}}\|\boldsymbol{\mathbf{M}}\|_{2\to 2} (21)

with constant of equivalence C𝐁=κ(𝐁Ξ1/2)=maxa(χξ,χξ)mina(χξ,χξ)Cstiff/CdiagC_{\boldsymbol{\mathbf{B}}}=\kappa(\boldsymbol{\mathbf{B}}_{\Xi}^{1/2})=\frac{\max\sqrt{a(\chi_{\xi},\chi_{\xi})}}{\min\sqrt{a(\chi_{\xi},\chi_{\xi})}}\leq\sqrt{C_{\mathrm{stiff}}/C_{\mathrm{diag}}}.

Lemma 7.

For the damping parameter θ\theta in Lemma 6 and 𝐒Ξ\boldsymbol{\mathbf{S}}_{\Xi} as defined in (20), there is C>0C>0 so that for all ν\nu\in\mathbb{N}, 𝐮Ξ\boldsymbol{\mathbf{u}}\in\mathbb{R}^{\Xi} and u=σΞ𝐮VΞu=\sigma_{\Xi}^{*}\boldsymbol{\mathbf{u}}\in V_{\Xi},

𝐒Ξν22κ(𝐁Ξ1/2),and𝒮ΞνuL2(𝕄)C2C1κ(𝐁Ξ1/2)uL2(𝕄).\|\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\|_{2\to 2}\leq\kappa(\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}),\quad\text{and}\quad\|\mathcal{S}_{\Xi}^{\nu}u\|_{L_{2}(\mathbb{M})}\leq\frac{C_{2}}{C_{1}}\kappa(\boldsymbol{\mathbf{B}}_{\Xi}^{1/2})\|u\|_{{\color[rgb]{0,0,0}L_{2}(\mathbb{M})}}.

We note that this holds even when aa is non-symmetric.

Proof.

The matrix 𝐁Ξ1/2(idθ𝐁Ξ1𝐀Ξ)𝐁Ξ1/2\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}(\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi})\boldsymbol{\mathbf{B}}_{\Xi}^{-1/2} has the same spectrum as the matrix idθ𝐁Ξ1𝐀Ξ\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}. Furthermore, because 0𝐁Ξθ𝐀Ξ𝐱,𝐱𝐁Ξ𝐱,𝐱0\leq\langle\boldsymbol{\mathbf{B}}_{\Xi}-\theta\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle\leq\langle\boldsymbol{\mathbf{B}}_{\Xi}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle, the spectral radius of 𝐒Ξ\boldsymbol{\mathbf{S}}_{\Xi} is no greater than 11. Thus

𝐒Ξ𝐁Ξ𝐁Ξ=𝐁Ξ1/2(idθ𝐁Ξ1𝐀Ξ)𝐁Ξ1/2221.\|\boldsymbol{\mathbf{S}}_{\Xi}\|_{\boldsymbol{\mathbf{B}}_{\Xi}\to\boldsymbol{\mathbf{B}}_{\Xi}}=\|\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}(\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi})\boldsymbol{\mathbf{B}}_{\Xi}^{-1/2}\|_{2\to 2}\leq 1.

It follows that 𝐒Ξν𝐁Ξ𝐁Ξ1\|\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\|_{\boldsymbol{\mathbf{B}}_{\Xi}\to\boldsymbol{\mathbf{B}}_{\Xi}}\leq 1 as well, and the matrix norm equivalence (21) guarantees that 𝐒Ξν22κ(𝐁Ξ1/2)\|\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\|_{2\to 2}\leq\kappa(\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}). The second inequality follows from the Riesz property (15). ∎

4.2 Smoothing properties

For v=σΞ𝐯VΞv=\sigma_{\Xi}^{*}\boldsymbol{\mathbf{v}}\in V_{\Xi}, we have [v]=a(v,v)=𝐀Ξ𝐯,𝐯[v]_{\mathcal{L}}=\sqrt{a(v,v)}=\sqrt{\langle\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle}. By applying Cauchy-Schwarz, the Riesz property and Lemma 7, the chain of inequalities

[𝒮Ξνv]𝐀Ξ𝐒Ξν𝐯21/2𝐒Ξν𝐯21/2Chd/4𝐀Ξ𝐒Ξν𝐯21/2vL21/2[\mathcal{S}_{\Xi}^{\nu}v]_{\mathcal{L}}\leq\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}^{1/2}\|\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}^{1/2}\leq Ch^{-d/4}\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}^{1/2}\|v\|_{L_{2}}^{1/2}

holds. In the case that aa is symmetric, we have the following smoothness property.

Lemma 8.

For a given ρ>0\rho>0 there is a constant CC so that for any θ\theta chosen as in Lemma 6, 𝐒Ξ\boldsymbol{\mathbf{S}}_{\Xi} as defined in (20), and Ξ𝕄\Xi\subset\mathbb{M} with mesh ratio ρ(Ξ)ρ\rho(\Xi)\leq\rho, the damped Jacobi iteration has smoothing operator 𝒮ΞL(VΞ)\mathcal{S}_{\Xi}\in L(V_{\Xi}) which satisfies

𝒮ΞνvCθ1/21ν+1h1vL2.\|\mathcal{S}_{\Xi}^{\nu}v\|_{\mathcal{L}}\leq{C}{\theta}^{-1/2}\sqrt{\frac{1}{\nu+1}}h^{-1}\|v\|_{L_{2}}.
Proof.

This is a result of [13, Theorem 7.9], which shows that 𝐀Ξ𝐒Ξν𝐯2Chd2θ(ν+1)𝐯2\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}\leq\frac{Ch^{d-2}}{\theta(\nu+1)}\|\boldsymbol{\mathbf{v}}\|_{\ell_{2}}. It follows that 𝐀Ξ𝐒Ξν𝐯2Chd/22θ(ν+1)vL2\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}\leq\frac{Ch^{d/2-2}}{\theta(\nu+1)}\|{v}\|_{L_{2}}, and the result holds by the above discussion. ∎

When aa is not symmetric, we have the smoothness property.

Lemma 9.

Let 𝒮Ξ:VΞVΞ:v=vξχξ(𝐒Ξ𝐯)ξχξ\mathcal{S}_{\Xi}:V_{\Xi}\to V_{\Xi}:v=\sum v_{\xi}\chi_{\xi}\mapsto\sum(\boldsymbol{\mathbf{S}}_{\Xi}\boldsymbol{\mathbf{v}})_{\xi}\chi_{\xi}. For θ\theta as in Lemma 6 and 𝐒Ξ\boldsymbol{\mathbf{S}}_{\Xi} as defined in (20) we have

[𝒮Ξνv]Cν4h1vL2.[\mathcal{S}_{\Xi}^{\nu}v]_{\mathcal{L}}\leq\frac{C}{\sqrt[4]{\nu}}h^{-1}\|v\|_{L_{2}}.
Proof.

This follows from [13, Theorem 7.17], and by techniques of the proof of Lemma 8. ∎

5 The direct kernel multigrid method

We are now in a position to consider the multigrid method applied to the kernel based Galerkin method we have described in the previous sections.

5.1 Setup: Grid transfer

We consider a nested sequence of point sets

Ξ0Ξ1ΞΞ+1ΞL𝕄\displaystyle\Xi_{0}\subset\Xi_{1}\subset\dots\Xi_{\ell}\subset\Xi_{\ell+1}\subset\dots\subset\Xi_{L}\subset\mathbb{M}

and associated kernel spaces VΞV_{\Xi_{\ell}} as described in section 2.5. We denote the Lagrange basis for each such space (χξ())ξΞ(\chi^{(\ell)}_{\xi})_{\xi\in\Xi_{\ell}}, and with it the accompanying analysis map σ:=σΞ\sigma_{\ell}:=\sigma_{\Xi_{\ell}}, synthesis map σ:=σΞ\sigma_{\ell}^{*}:=\sigma_{\Xi_{\ell}}^{*} and stiffness matrix 𝐀:=𝐀Ξ\boldsymbol{\mathbf{A}}_{\ell}:=\boldsymbol{\mathbf{A}}_{\Xi_{\ell}}. Moreover, we assume that there are constants 0<γ1γ2<10<\gamma_{1}\leq\gamma_{2}<1 and ρ1\rho\geq 1 such that

qΞhΞρqΞ,andγ1hΞhΞ+1γ2hΞ.\displaystyle q_{\Xi_{\ell}}\leq h_{\Xi_{\ell}}\leq\rho q_{\Xi_{\ell}},\quad\text{and}\quad\gamma_{1}h_{\Xi_{\ell}}\leq h_{\Xi_{\ell+1}}\leq\gamma_{2}h_{\Xi_{\ell}}. (22)

Note at this point that ρ\rho is a universal constant. Hence ρ\rho does not depend on \ell and thus the constants in deriving the smoothing property do not depend on \ell. Thus, we obtain nhΞdn_{\ell}\sim h^{-d}_{\Xi_{\ell}}, for constants see (2). We will assume that LL is the largest index that we will consider.

In this section, we discuss grid transfer: specifically, the operators and matrices which provide communication between finite dimensional kernel spaces. These include natural prolongation and restriction maps and their corresponding matrices. We show how these can be used to relate Galerkin projectors PΞP_{\Xi_{\ell}} and stiffness matrices 𝐀\boldsymbol{\mathbf{A_{\ell}}}.

Prolongation and restriction

Denote the Lagrange basis of VΞV_{\Xi_{\ell}} by (χξ())(\chi_{\xi}^{(\ell)}), and note that by containment VΞ1VΞV_{\Xi_{\ell-1}}\subset V_{\Xi_{\ell}}, it follows that χξ(1)=ηΞβξ,ηχη()\chi^{(\ell-1)}_{\xi}=\sum_{\eta\in\Xi_{\ell}}\beta_{\xi,\eta}\chi^{(\ell)}_{\eta} holds for some matrix of coefficients βξ,\beta_{\xi,\ell}. Furthermore, from the Lagrange property, the identity χξ(1)(η)=ζΞχξ(1)(ζ)χζ()(η)\chi^{(\ell-1)}_{\xi}(\eta)=\sum_{\zeta\in\Xi_{\ell}}\chi^{(\ell-1)}_{\xi}(\zeta)\chi^{(\ell)}_{\zeta}(\eta) holds for any ηΞ1\eta\in\Xi_{\ell-1}. By uniqueness, we deduce that βξ,η=χξ(1)(η)\beta_{\xi,\eta}=\chi^{(\ell-1)}_{\xi}(\eta), and we have

ξΞ1cξχξ(1)=ξΞ1cξηΞχξ(1)(η)χη()=ηΞξΞ1cξχξ(1)(η)χη().\displaystyle\sum_{\xi\in\Xi_{\ell-1}}c_{\xi}\chi^{(\ell-1)}_{\xi}=\sum_{\xi\in\Xi_{\ell-1}}c_{\xi}\sum_{\eta\in\Xi_{\ell}}\chi^{(\ell-1)}_{\xi}(\eta)\chi^{(\ell)}_{\eta}=\sum_{\eta\in\Xi_{\ell}}\sum_{\xi\in\Xi_{\ell-1}}c_{\xi}\chi^{(\ell-1)}_{\xi}(\eta)\chi^{(\ell)}_{\eta}.

This yields that the natural injection 1():VΞ1VΞ\mathcal{I}_{\ell-1}^{(\ell)}:V_{\Xi_{\ell-1}}\to V_{\Xi_{\ell}}, called the prolongation map, which is described by the rectangular matrix

𝐩:=(χξ(1)(η))ξΞ1,ηΞ=(σ)1σ1n×n1.\boldsymbol{\mathbf{p}}_{\ell}:=\left(\chi^{(\ell-1)}_{\xi}(\eta)\right)_{\xi\in\Xi_{\ell-1},\eta\in\Xi_{\ell}}=(\sigma_{\ell}^{*})^{-1}\sigma_{\ell-1}^{*}\ \in\mathbb{R}^{n_{\ell}\times n_{\ell-1}}.

It is worth noting that 1σ1=σ1\mathcal{I}_{\ell-1}^{\ell}\sigma_{\ell-1}^{*}=\sigma_{\ell-1}^{*}, so we have the identity

σ𝐩=σ1.\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}=\sigma_{\ell-1}^{*}. (23)

The corresponding restriction map (1):VΞVΞ1\mathcal{I}_{\ell}^{(\ell-1)}:V_{\Xi_{\ell}}\to V_{\Xi_{\ell-1}} is described by the transposed matrix 𝐫=(𝐩)T\boldsymbol{\mathbf{r}}_{\ell}=\left(\boldsymbol{\mathbf{p}}_{\ell}\right)^{T}. In other words, it is defined as (1)σ=σ1(𝐩)T\mathcal{I}_{\ell}^{(\ell-1)}\sigma_{\ell}^{*}=\sigma_{\ell-1}^{*}\left(\boldsymbol{\mathbf{p}}_{\ell}\right)^{T}.

Note that we can use 𝐫=(𝐩)T\boldsymbol{\mathbf{r}}_{\ell}=\left(\boldsymbol{\mathbf{p}}_{\ell}\right)^{T} to relate analysis maps at different levels, since we can take the aa-adjoint of both sides of the equation (23) to obtain the following useful identity:

σ1=(𝐩)Tσ.\sigma_{\ell-1}=\left(\boldsymbol{\mathbf{p}}_{\ell}\right)^{T}\sigma_{\ell}. (24)

Moreover, the prolongation is both bounded from above and below. This is a kernel based analogue for [13, Eq. (64)].

Lemma 10.

Using the notation from above, there is a constant Cpro1C_{pro}\geq 1 depending on γ\gamma, ρ\rho, 𝕄\mathbb{M} and the constants in Assumption 2 so that

1\displaystyle 1 𝐩2(Ξ1)2(Ξ)Cpro\displaystyle\leq\left\|\boldsymbol{\mathbf{p}}_{\ell}\right\|_{\ell_{2}\left(\Xi_{\ell-1}\right)\to\ell_{2}\left(\Xi_{\ell}\right)}\leq C_{pro} (25)

holds for all 1\ell\geq 1.

Proof.

We begin by estimating the 1(Ξ1)1(Ξ)\ell_{1}(\Xi_{\ell-1})\to\ell_{1}(\Xi_{\ell}) and (Ξ1)(Ξ)\ell_{\infty}(\Xi_{\ell-1})\to\ell_{\infty}(\Xi_{\ell}) norms of 𝐩\boldsymbol{\mathbf{p}}_{\ell} by taking column and row sums, respectively. These estimates can be made almost simultaneously, because the (ξ,η)(\xi,\eta) entry of 𝐩\boldsymbol{\mathbf{p}}_{\ell} satisfies the bound |χξ(1)(η)|CPWexp(νdist(ξ,η)h1)|\chi^{(\ell-1)}_{\xi}(\eta)|\leq C_{\mathrm{PW}}\exp(-\nu\frac{\mathrm{dist}(\xi,\eta)}{h_{\ell-1}}) by (13).

Let A:=n=1(n+2)dexp(νρn)A:=\sum_{n=1}^{\infty}(n+2)^{d}\exp(-\frac{\nu}{\rho}n) and B:=n=1(n+2)dexp(νγρn)B:=\sum_{n=1}^{\infty}(n+2)^{d}\exp(-\frac{\nu\gamma}{\rho}n), and note that both numbers depend on ρ,γ\rho,\gamma and the exponential decay rate ν\nu from (13). Applying (3), gives

ξΞ1|χξ(1)(η)|CPW(1+β𝕄α𝕄A)\sum_{\xi\in\Xi_{\ell-1}}|\chi^{(\ell-1)}_{\xi}(\eta)|\leq C_{\mathrm{PW}}\left(1+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}A\right)

and

ηΞ|χξ(1)(η)|CPW(1+β𝕄α𝕄B).\sum_{\eta\in\Xi_{\ell}}|\chi^{(\ell-1)}_{\xi}(\eta)|\leq C_{\mathrm{PW}}\left(1+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}B\right).

Finally, interpolation gives the upper bound

Cpro:=CPW(1+β𝕄α𝕄B)(1+β𝕄α𝕄A).C_{pro}:=C_{\mathrm{PW}}\sqrt{(1+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}B)(1+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}A)}.

By using the definition Υ:=ΞΞ1\Upsilon_{\ell}:=\Xi_{\ell}\setminus\Xi_{\ell-1} we write 𝐏a=(χξ(η))ξ,ηΞ1\boldsymbol{\mathbf{P}}_{a}=(\chi_{\xi}(\eta))_{\xi,\eta\in\Xi_{\ell-1}} and 𝐏b=(χξ(ζ))ξΞ1,ζΥ\boldsymbol{\mathbf{P}}_{b}=(\chi_{\xi}(\zeta))_{\xi\in\Xi_{\ell-1},\zeta\in\Upsilon_{\ell}}. Thus, we have 𝐩𝐜22=𝐜22+𝐏b𝐜~22𝐜22\left\|\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{c}}\right\|^{2}_{2}=\|\boldsymbol{\mathbf{c}}\|^{2}_{2}+\|\boldsymbol{\mathbf{P}}_{b}\tilde{\boldsymbol{\mathbf{c}}}\|^{2}_{2}\geq\|\boldsymbol{\mathbf{c}}\|^{2}_{2}, which gives the lower bound. ∎

5.2 multigrid iteration – two level case

We now describe the multigrid algorithm, which is a composition of smoothing operators, restriction, coarse grid correction, prolongation, and then smoothing.

We begin by considering the solution of 𝐀𝐮=𝐛\boldsymbol{\mathbf{A}}_{\ell}\boldsymbol{\mathbf{u}}_{\ell}^{\star}=\boldsymbol{\mathbf{b}}_{\ell}, where 𝐀\boldsymbol{\mathbf{A}}_{\ell} is the stiffness matrix associated to VΞV_{\Xi_{\ell}}, and where 𝐛=σ(u)\boldsymbol{\mathbf{b}}_{\ell}=\sigma_{\ell}(u_{\ell}^{\star}), is the data obtained from the Galerkin solution u=σ(𝐮)VΞu_{\ell}^{\star}=\sigma^{\ast}_{\ell}(\boldsymbol{\mathbf{u}}_{\ell}^{\star})\in V_{\Xi_{\ell}}. Naturally, uu_{\ell}^{\star} is unknown (its coefficients are the solution of the above problem), but we can compute the data σu\sigma_{\ell}{u}_{\ell}^{\star} via

𝐛\displaystyle\boldsymbol{\mathbf{b}}_{\ell} =σ(u)\displaystyle=\sigma_{\ell}(u_{\ell}^{\star})
=(a(u,χξ1()),,a(u,χξn()))T\displaystyle=\left(a(u_{\ell}^{\star},\chi^{(\ell)}_{\xi_{1}}),\dots,a(u_{\ell}^{\star},\chi^{(\ell)}_{\xi_{n_{\ell}}})\right)^{T}
=(f,χξ1()L2(𝕄),,f,χξn()L2(𝕄))T.\displaystyle=\left(\langle f,\chi^{(\ell)}_{\xi_{1}}\rangle_{L_{2}(\mathbb{M})},\dots,\langle f,\chi^{(\ell)}_{\xi_{n_{\ell}}}\rangle_{L_{2}(\mathbb{M})}\right)^{T}\!\!.

In other words, it is obtained from the right hand side ff.

Algorithm 1 TGM\mathrm{TGM}_{\ell}
Two-grid method with post-smoothing in vectorial form, see [13, Eq. (20)]
0:  𝐮oldn\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}\in\mathbb{R}^{n_{\ell}}, right-hand-sides 𝐛=σun\boldsymbol{\mathbf{b}}_{\ell}=\sigma_{\ell}u^{\star}_{\ell}\in\mathbb{R}^{n_{\ell}}
0:  new approximation n𝐮newTGM(𝐮old,𝐛)\mathbb{R}^{n_{\ell}}\ni\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}\leftarrow\operatorname{TGM}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})
  if =0\ell=0 then
     𝐮0new𝐀01𝐛0\boldsymbol{\mathbf{u}}^{\text{new}}_{0}\leftarrow\boldsymbol{\mathbf{A}}^{-1}_{0}\boldsymbol{\mathbf{b}}_{0} {for coarsest grid use direct solver}
  else
     𝐮Jν1(𝐮old,𝐛)\boldsymbol{\mathbf{u}}_{\ell}\leftarrow{J}_{\ell}^{\nu_{1}}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell}) {ν1\nu_{1} steps pre-smoothing}
     𝐝1𝐫(𝐛𝐀𝐮)\boldsymbol{\mathbf{d}}_{\ell-1}\leftarrow\boldsymbol{\mathbf{r}}_{\ell}\left(\boldsymbol{\mathbf{b}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell}\boldsymbol{\mathbf{u}}_{\ell}\right) {restrict residual to coarser grid}
     𝐞~1𝐀11𝐝1\tilde{\boldsymbol{\mathbf{e}}}_{\ell-1}\leftarrow\boldsymbol{\mathbf{A}}^{-1}_{\ell-1}\boldsymbol{\mathbf{d}}_{\ell-1} {solve coarse grid problem}
     𝐮𝐮+𝐩𝐞~1\boldsymbol{\mathbf{u}}_{\ell}\leftarrow\boldsymbol{\mathbf{u}}_{\ell}+\boldsymbol{\mathbf{p}}_{\ell}\tilde{\boldsymbol{\mathbf{e}}}_{\ell-1} {update with coarse grid correction}
     𝐮newJν2(𝐮,𝐛)\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}\leftarrow{J}_{\ell}^{\nu_{2}}(\boldsymbol{\mathbf{u}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell}) {ν2\nu_{2} steps post-smoothing}
  end if
  return  𝐮new\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}

The output 𝐮new=TGM(𝐮old,𝐛)\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}=\operatorname{TGM}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell}) of the two-level multigrid algorithm with initial input 𝐮old\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell} is given by the rather complicated formula

𝐮new=Jν2(Jν1(𝐮old,𝐛)+𝐩(𝐀1)1𝐫(𝐛𝐀Jν1(𝐮old,𝐛)),𝐛).\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}={J}^{\nu_{2}}_{\ell}\left({J}^{\nu_{1}}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})+\boldsymbol{\mathbf{p}}_{\ell}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\left(\boldsymbol{\mathbf{b}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell}{J}^{\nu_{1}}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})\right),\boldsymbol{\mathbf{b}}_{\ell}\right). (26)

Because 𝐮TGM(𝐮,𝐛)\boldsymbol{\mathbf{u}}\mapsto\mathrm{TGM}_{\ell}(\boldsymbol{\mathbf{u}},\boldsymbol{\mathbf{b}}) is consistent (with 𝐀𝐮=𝐛\boldsymbol{\mathbf{A}}_{\ell}\boldsymbol{\mathbf{u}}=\boldsymbol{\mathbf{b}}_{\ell}) the corresponding iteration matrix is

𝐂TG:=𝐒ν2(id𝐩(𝐀1)1𝐫𝐀)𝐒ν1,\displaystyle\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}:=\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{2}}\left(\mathrm{id}-\boldsymbol{\mathbf{p}}_{\ell}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}\right)\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{1}}, (27)

(this is calculated in [13, Eq. 48] as well as in [11, Lemma 11.11]). Thus, the error can be expressed as 𝐮new𝐮=TGM(𝐮old)𝐮=𝐂TG(𝐮old𝐮)\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}-\boldsymbol{\mathbf{u}}_{\ell}^{\star}=\operatorname{TGM}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell})-\boldsymbol{\mathbf{u}}^{\star}_{\ell}=\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}-\boldsymbol{\mathbf{u}}^{\star}_{\ell}).

The corresponding operator on VΞV_{\Xi_{\ell}} is obtained by conjugating with σ\sigma_{\ell}^{*}. This gives the error operator for the two level method 𝒞TGσ:=σ𝐂TG\mathcal{C}_{\text{TG}_{\ell}}\sigma_{\ell}^{*}:=\sigma_{\ell}^{*}\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}.

It is worth noting that by (23) we have the equality σ𝐩(𝐀1)1𝐫𝐀=σ1(𝐀1)1𝐫𝐀\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}=\sigma_{\ell-1}^{*}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}. Using the identity 𝐀=σσ\boldsymbol{\mathbf{A}}_{\ell}=\sigma_{\ell}\sigma_{\ell}^{*} followed by (24) and the identity PΞ1=σ1(𝐀1)1σ1P_{\Xi_{\ell-1}}=\sigma^{\ast}_{\ell-1}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\sigma_{\ell-1} gives

σ𝐩(𝐀1)1𝐫𝐀=PΞ1σ.\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}=P_{\Xi_{\ell-1}}\sigma_{\ell}^{*}. (28)

It follows that 𝒞TG=𝒮ν2(idVΞ1PΞ1)𝒮ν1\mathcal{C}_{\text{TG}_{\ell}}=\mathcal{S}_{\ell}^{\nu_{2}}(\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}.

We are now in a position to show that the two level method is a contraction for sufficiently large values of ν1\nu_{1}.

Proposition 1.

There is a constant CC so that for all \ell, 𝒞TG\mathcal{C}_{\text{TG}_{\ell}}satisfies the bound

𝒞TGL2(𝕄)L2(𝕄)CProp.1g(ν1)=CProp.1{(2ν1+1)12,symmetric ;ν114,non-symmetric .\displaystyle\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq C_{\text{Prop.\ref{prop:boundtwogridop}}}g(\nu_{1})=C_{\text{Prop.\ref{prop:boundtwogridop}}}\begin{cases}(2\nu_{1}+1)^{-\frac{1}{2}},&\text{symmetric };\\ \nu_{1}^{-\frac{1}{4}},&\text{non-symmetric }.\end{cases}
Proof.

The following equality

𝒞TGL2(𝕄)L2(𝕄)=𝒮ν2(idVΞ1PΞ1)𝒮ν1L2(𝕄)L2(𝕄)\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}=\left\|\mathcal{S}_{\ell}^{\nu_{2}}(\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}

holds, so Lemma 7 ensures

𝒞TGL2(𝕄)L2(𝕄)C(idVΞ1PΞ1)𝒮ν1L2(𝕄)L2(𝕄).\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq C\left\|(\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}.

By Lemma 1, idVΞ1PΞ1W21(𝕄)L2(𝕄)Ch1\left\|\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}}\right\|_{W_{2}^{1}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq Ch_{\ell-1} holds, so it follows that

(idVΞ1PΞ1)𝒮ν1L2(𝕄)L2(𝕄)Ch1𝒮ν1L2(𝕄)W21(𝕄).\left\|(\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq Ch_{\ell-1}\|\mathcal{S}_{\ell}^{\nu_{1}}\|_{L_{2}(\mathbb{M})\to W_{2}^{1}(\mathbb{M})}.

By coercivity, this gives 𝒞TGL2(𝕄)L2(𝕄)Ch1𝒮ν1vL2(𝕄)\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq Ch_{\ell-1}\|\mathcal{S}_{\ell}^{\nu_{1}}v\|_{L_{2}(\mathbb{M})\to\mathcal{L}}. Finally, the result follows by applying the smoothing property: Lemma 8 in the symmetric case and Lemma 9 in the non-symmetric case. ∎

Corollary 1.

Let θ\theta be as in Lemma 6 and let 𝐮oldn\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}\in\mathbb{R}^{n_{\ell}} be an initial guess, uold=σ(𝐮old)u^{\text{old}}_{\ell}=\sigma_{\ell}^{*}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}), 𝐮=TGM(𝐮old)\boldsymbol{\mathbf{u}}_{\ell}=\operatorname{TGM}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}), and u=σ𝐮u_{\ell}=\sigma_{\ell}^{*}\boldsymbol{\mathbf{u}}_{\ell}. If aa is symmetric, there is a constant CC independent of \ell and θ\theta so that

unewuL2(𝕄)Cθ1/212ν1+1uolduL2(𝕄).\|u^{\text{new}}_{\ell}-u_{\ell}^{\star}\|_{L_{2}(\mathbb{M})}\leq C\theta^{-1/2}\frac{1}{\sqrt{2\nu_{1}+1}}\|u^{\text{old}}_{\ell}-u_{\ell}^{\star}\|_{L_{2}(\mathbb{M})}.

If aa is not symmetric, there is a constant CC independent of \ell and θ\theta so that

unewuL2(𝕄)Cθ1/21ν14uolduL2(𝕄).\|u^{\text{new}}_{\ell}-u_{\ell}^{\star}\|_{L_{2}(\mathbb{M})}\leq C\theta^{-1/2}\frac{1}{\sqrt[4]{\nu_{1}}}\|u^{\text{old}}_{\ell}-u_{\ell}^{\star}\|_{L_{2}(\mathbb{M})}.
Proof.

This follows by applying Proposition 1 to uolduu^{\text{old}}_{\ell}-u_{\ell}^{\star}. ∎

5.3 multigrid with τ\tau-cycle

In the two-grid method, the computational bottleneck remains the solution on the coarse grid. Thus, there have been many approaches to recursively apply the multigrid philosophy in order to use a direct solver only on the coarsest grid. A flexible algorithm is the so-called τ\tau-cycle. Here τ=1\tau=1 stands for the VV-cycle in multigrid methods and τ=2\tau=2 gives the WW-cycle.

Our results hold for τ2\tau\geq 2.

Algorithm 2 MGM(τ)\operatorname{MGM}^{(\tau)}_{\ell}
Multigrid method with τ\tau cycle, see [13, Eq. (31)]
0:  approximation: 𝐮oldn\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}\in\mathbb{R}^{n_{\ell}}, right-hand-side 𝐛n\boldsymbol{\mathbf{b}}_{\ell}\in\mathbb{R}^{n_{\ell}}, θ\theta
0:  new approximation n𝐮newMGM(τ)(𝐮old,𝐛)\mathbb{R}^{n_{\ell}}\ni\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}\leftarrow\operatorname{MGM}^{(\tau)}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})
  if =0\ell=0 then
     𝐮0new𝐀01𝐛0\boldsymbol{\mathbf{u}}^{\text{new}}_{0}\leftarrow\boldsymbol{\mathbf{A}}^{-1}_{0}\boldsymbol{\mathbf{b}}_{0} {for coarsest grid use direct solver}
  else
     𝐮Jν1(𝐮old,𝐛)\boldsymbol{\mathbf{u}}_{\ell}\leftarrow{J}_{\ell}^{\nu_{1}}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell}) {ν1\nu_{1} steps pre-smoothing}
     𝐝1𝐫(𝐛𝐀𝐮)\boldsymbol{\mathbf{d}}_{\ell-1}\leftarrow\boldsymbol{\mathbf{r}}_{\ell}\left(\boldsymbol{\mathbf{b}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell}\boldsymbol{\mathbf{u}}_{\ell}\right) {restrict residual to coarser grid}
     𝐞1(0)0\boldsymbol{\mathbf{e}}^{(0)}_{\ell-1}\leftarrow 0 {initialize}
     for i=0i=0 to τ\tau do
        𝐞1(i)MGM1(τ)(𝐞1i1,𝐝1)\boldsymbol{\mathbf{e}}^{(i)}_{\ell-1}\leftarrow\operatorname{MGM}^{(\tau)}_{\ell-1}(\boldsymbol{\mathbf{e}}^{i-1}_{\ell-1},\boldsymbol{\mathbf{d}}_{\ell-1}) {recursive call on Ξ1\Xi_{\ell-1}}
     end for
     𝐮𝐮+𝐩𝐞1(τ)\boldsymbol{\mathbf{u}}_{\ell}\leftarrow\boldsymbol{\mathbf{u}}_{\ell}+\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{e}}^{(\tau)}_{\ell-1} {update with coarse grid correction}
     𝐮newJν2(𝐮,𝐛)\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}\leftarrow J_{\ell}^{\nu_{2}}(\boldsymbol{\mathbf{u}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell}) {ν2\nu_{2} steps post-smoothing}
  end if
  return  𝐮new\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}

Before proving our main theorem, we need a statement from elementary real analysis.

Lemma 11.

For any real numbers α,β,γ,τ\alpha,\beta,\gamma,\tau which satisfy 0<γ<10<\gamma<1, τ2\tau\geq 2, β>1/τ\beta>1/\tau and α<min{τ1τ(βτ)1τ1,τ1τγ}\alpha<\min\left\{\frac{\tau-1}{\tau}(\beta\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}, if the sequence (xn)n0(x_{n})_{n\in\mathbb{N}_{0}} satisfies the conditions

x0=0,xn+1α+β(xn)τ\displaystyle x_{0}=0,\quad x_{n+1}\leq\alpha+\beta\bigl{(}x_{n}\bigr{)}^{\tau}

then xnγx_{n}\leq\gamma for all n0n\geq 0.

Proof.

This follows by elementary calculations, as in [17, Lemma 6.15]. ∎

Using [13, Theorem 7.1], we obtain for the iteration matrix of the Algorithm 2 the recursive (in the level) form

𝐂={𝟎,=0𝐂TG+𝐒ν2𝐩(𝐂1)τ𝐀11𝐫𝐀𝐒ν1,.\displaystyle\boldsymbol{\mathbf{C}}_{\ell}=\begin{cases}\boldsymbol{\mathbf{0}},&\ell=0\\ \boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}+\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{2}}\boldsymbol{\mathbf{p}}_{\ell}(\boldsymbol{\mathbf{C}}_{\ell-1})^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{1}},&\ell\in\mathbb{N}.\end{cases}

Again, we define the corresponding operator via 𝒞:=σ𝐂(σ)1\mathcal{C}_{\ell}:=\sigma_{\ell}^{*}\boldsymbol{\mathbf{C}}_{\ell}(\sigma_{\ell}^{*})^{-1}.

Theorem 1.

For every γ(0,1)\gamma\in(0,1), there is a ν:=argminν{ν:CProp.1g(ν)min{τ1τ(βThm.1τ)1τ1,τ1τγ}}\nu^{\star}:=\arg\min_{\nu\in\mathbb{N}}\{\nu\in\mathbb{N}\ :\ C_{\text{Prop.\ref{prop:boundtwogridop}}}g(\nu)\leq\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm.\ref{thm:main}}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}\}. For all ν1ν\nu_{1}\geq\nu^{\star} we have

𝒞L2(𝕄)L2(𝕄)γ.\displaystyle\left\|\mathcal{C}_{\ell}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq\gamma. (29)
Proof.

Here, we follow basically [13, Proofof Theorem 7.20]. Let 𝐯n\boldsymbol{\mathbf{v}}\in\mathbb{R}^{n_{\ell}} arbitrary. For v=σ𝐯v=\sigma^{*}_{\ell}\boldsymbol{\mathbf{v}}, we obtain for \ell\in\mathbb{N},

𝒞L2(𝕄)L2(𝕄)\displaystyle\left\|\mathcal{C}_{\ell}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})} \displaystyle\leq 𝒞TGL2(𝕄)L2(𝕄)\displaystyle\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}
+𝒮ν2σ𝐩(𝐂1)τ𝐀11𝐫𝐀(σ)1𝒮ν1L2(𝕄)L2(𝕄)\displaystyle+\left\|\mathcal{S}_{\ell}^{\nu_{2}}\sigma^{*}_{\ell}\boldsymbol{\mathbf{p}}_{\ell}(\boldsymbol{\mathbf{C}}_{\ell-1})^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma^{*}_{\ell})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}
\displaystyle\leq 𝒞TGL2(𝕄)L2(𝕄)\displaystyle\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}
+Cσ𝐩(𝐂1)τ𝐀11𝐫𝐀(σ)1𝒮ν1L2(𝕄)\displaystyle+C\left\|\sigma^{*}_{\ell}\boldsymbol{\mathbf{p}}_{\ell}(\boldsymbol{\mathbf{C}}_{\ell-1})^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma^{*}_{\ell})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}\right\|_{L_{2}(\mathbb{M})}

by Proposition 1 and Lemma 7. We treat the second term with (23) and (28) to obtain

σ𝐩(𝐂1)τ𝐀11𝐫𝐀(σ)1𝒮ν1=𝒞1τPΞ1𝒮ν1.\sigma^{*}_{\ell}\boldsymbol{\mathbf{p}}_{\ell}(\boldsymbol{\mathbf{C}}_{\ell-1})^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma^{*}_{\ell})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}=\mathcal{C}_{\ell-1}^{\tau}P_{\Xi_{\ell-1}}\mathcal{S}_{\ell}^{\nu_{1}}.

This leaves

𝒞vL2(𝕄)\displaystyle\left\|\mathcal{C}_{\ell}v\right\|_{L_{2}(\mathbb{M})} 𝒞TGL2(𝕄)L2(𝕄)\displaystyle\leq\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}
+C𝒞1τL2(𝕄)L2(𝕄)PΞ1𝒮ν1L2(𝕄)L2(𝕄).\displaystyle\phantom{\leq}+C\left\|\mathcal{C}_{\ell-1}^{\tau}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\left\|P_{\Xi_{\ell-1}}\mathcal{S}_{\ell}^{\nu_{1}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}.

The last factor can be bounded by writing PΞ1=id(idPΞ1)P_{\Xi_{\ell-1}}=\mathrm{id}-(\mathrm{id}-P_{\Xi_{\ell-1}}) followed by the triangle inequality. Lemma 7 bounds 𝒮ν1L2(𝕄)L2(𝕄)\|\mathcal{S}_{\ell}^{\nu_{1}}\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}, while Proposition 1 (with ν2=0\nu_{2}=0) bounds (idPΞ1)𝒮ν1L2(𝕄)L2(𝕄)\|(\mathrm{id}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}. Thus, we end up with a bound

𝒞L2(𝕄)L2(𝕄)𝒞TGL2(𝕄)L2(𝕄)+C𝒞1L2L2τ,\displaystyle\left\|\mathcal{C}_{\ell}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}+C\left\|\mathcal{C}_{\ell-1}\right\|^{\tau}_{L_{2}\to L_{2}},

which has the form required by Lemma 11, with x=𝒞L2L2x_{\ell}=\left\|\mathcal{C}_{\ell}\right\|_{L_{2}\to L_{2}}, βThm.1:=C\beta_{\text{Thm.\ref{thm:main}}}:=C and α=𝒞TGL2(𝕄)L2(𝕄)\alpha=\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}. The condition ν1ν\nu_{1}\geq\nu^{\star} ensures the bound αmin{τ1τ(βThm.1τ)1τ1,τ1τγ}\alpha\leq\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm.\ref{thm:main}}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}. Thus

𝒞L2(𝕄)L2(𝕄)γ\displaystyle\left\|\mathcal{C}_{\ell}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq\gamma

holds by Lemma 11, and the theorem follows. ∎

Remark 2.

At the finest level, the kernel-based Galerkin problem 𝐀L𝐱=𝐛L\boldsymbol{\mathbf{A}}_{L}\boldsymbol{\mathbf{x}}=\boldsymbol{\mathbf{b}}_{L}, can be solved stably to any precision ϵmax\epsilon_{\max}, by iterating the contraction matrix 𝐂Lτ\boldsymbol{\mathbf{C}}_{L}{\tau} . Select γ<1\gamma<1 and fix ν1\nu_{1} so that Theorem 1 holds. Letting 𝐮(k+1)=MGML(τ)(𝐮(k),𝐛)\boldsymbol{\mathbf{u}}^{(k+1)}=\operatorname{MGM}^{(\tau)}_{L}(\boldsymbol{\mathbf{u}}^{(k)},\boldsymbol{\mathbf{b}}_{\ell}) gives 𝐮𝐮(k)2γk𝐮𝐮(0)2\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\ell_{2}}\leq\gamma^{k}\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(0)}\|_{\ell_{2}}. If kk is the least integer satisfying γk𝐮𝐮(0)<ϵmax\gamma^{k}\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(0)}\|<\epsilon_{\max}, then k1logγlog(ϵmax𝐮𝐮(0))k\sim\frac{1}{\log\gamma}\log\left(\frac{\epsilon_{\max}}{\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(0)}\|}\right).

We note that 𝐮𝐮(k)𝐀LσL𝐮σL𝐮(k)W21CBernsteinhd/21𝐮𝐮(k)2,\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}\sim\|\sigma_{L}^{*}\boldsymbol{\mathbf{u}}^{*}-\sigma_{L}^{*}\boldsymbol{\mathbf{u}}^{(k)}\|_{W_{2}^{1}}\leq C_{\mathrm{Bernstein}}h^{d/2-1}\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\ell_{2}}, and since d2d\geq 2, achieving 𝐮𝐮(k)𝐀L<ϵmax\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}<\epsilon_{\max} also requires only a fixed number of iterations. This shows (41).

6 The perturbed multigrid method

In this section, we consider a modified problem

𝐀ˇL𝐮ˇL=𝐛L,\boldsymbol{\mathbf{\check{A}}}_{L}\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}=\boldsymbol{\mathbf{b}}_{L},

where 𝐀ˇL\boldsymbol{\mathbf{\check{A}}}_{L} is close to 𝐀L\boldsymbol{\mathbf{A}}_{L}. The perturbed multigrid method will produce an approximate solution 𝐮ˇL(k)\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L} to 𝐮L\boldsymbol{\mathbf{u}}^{\star}_{L} which satisfies 𝐮ˇL(k)𝐮L𝐮ˇL(k)𝐮ˇL+𝐀ˇL1𝐀L1𝐛L\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|\leq\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}\|+\|\boldsymbol{\mathbf{\check{A}}}_{L}^{-1}-\boldsymbol{\mathbf{A}}_{L}^{-1}\|\|\boldsymbol{\mathbf{b}}_{L}\|. Thus, for the true solution uu to (8) and the Galerkin solution uL=PΞLu=σL𝐮Lu^{\star}_{L}=P_{\Xi_{L}}u=\sigma_{L}^{*}\boldsymbol{\mathbf{u}}^{\star}_{L}, we have

σ𝐮ˇL(k)uW2k(𝕄)(1PΞL)uW2k(𝕄)+ChLd/2k𝐮ˇL(k)𝐮ˇL2.\|\sigma^{*}\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-u\|_{W_{2}^{k}(\mathbb{M})}\leq\|(1-P_{\Xi_{L}})u\|_{W_{2}^{k}(\mathbb{M})}+Ch_{L}^{d/2-k}\|\boldsymbol{\mathbf{\check{u}}}_{L}^{(k)}-\boldsymbol{\mathbf{\check{u}}}_{L}^{\star}\|_{\ell_{2}}.

which can be made as close to (1PΞL)uL2\|(1-P_{\Xi_{L}})u\|_{L_{2}} as desired by controlling the perturbation 𝐀ˇL𝐀L22\|\boldsymbol{\mathbf{\check{A}}}_{L}-\boldsymbol{\mathbf{A}}_{L}\|_{2\to 2} and the error from the multigrid approximation 𝐮ˇL(k)𝐮ˇL\|\boldsymbol{\mathbf{\check{u}}}_{L}^{(k)}-\boldsymbol{\mathbf{\check{u}}}_{L}^{\star}\|.

Such systems may occur for a number of reasons: using localized Lagrange basis functions (as in [8]), truncating a series expansion of the kernel (as in [26]), or by using quadrature to approximate the stiffness matrix (both [8] and [26]). In the next section, we will apply this by truncating the original stiffness matrix to employ only banded matrices and thereby enjoy a computational speed up.

Perturbed multigrid method

The perturbed multigrid method replaces matrices 𝐀\boldsymbol{\mathbf{A}}_{\ell}, 𝐩\boldsymbol{\mathbf{p}}_{\ell} and 𝐫\boldsymbol{\mathbf{r}}_{\ell} appearing in Algorithms 1 and 2 with matrices 𝐀ˇ\boldsymbol{\mathbf{\check{A}}}_{\ell}, 𝐩ˇ\check{\boldsymbol{\mathbf{p}}}_{\ell} and 𝐫ˇ\check{\boldsymbol{\mathbf{r}}}_{\ell}. We assume that for each \ell there exists 0<ϵ0<\epsilon_{\ell} so that

𝐀ˇ𝐀22,𝐩ˇ𝐩22,𝐫ˇ𝐫22ϵ.\|\boldsymbol{\mathbf{\check{A}}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell}\|_{\ell_{2}\to\ell_{2}},\|\check{\boldsymbol{\mathbf{p}}}_{\ell}-\boldsymbol{\mathbf{p}}_{\ell}\|_{\ell_{2}\to\ell_{2}},\|\check{\boldsymbol{\mathbf{r}}}_{\ell}-\boldsymbol{\mathbf{r}}_{\ell}\|_{\ell_{2}\to\ell_{2}}\leq\epsilon_{\ell}.

In this set up ϵ\epsilon_{\ell} may change per level.444Which could be the case, e.g., if 𝐀ˇ\boldsymbol{\mathbf{\check{A}}}_{\ell} involved a Ξ\Xi_{\ell} dependent quadrature scheme, or was obtained by bandlimiting (as we will do in the next section) We assume ϵ\epsilon_{\ell} is small enough that

𝐀ˇ222CAhd2,𝐩ˇ22<2Cpro and 𝐫ˇ22<2Cpro.\|\boldsymbol{\mathbf{\check{A}}}_{\ell}\|_{2\to 2}\leq 2C_{A}h_{\ell}^{d-2},\qquad\|\check{\boldsymbol{\mathbf{p}}}_{\ell}\|_{2\to 2}<2C_{pro}\quad\text{ and }\quad\|\check{\boldsymbol{\mathbf{r}}}_{\ell}\|_{2\to 2}<2C_{pro}.

It then follows from standard arguments that

𝐀ˇ1222CHo¨lderhdand𝐀ˇ1𝐀122(CHo¨lder)2h2dϵ.\|\boldsymbol{\mathbf{\check{A}}}_{\ell}^{-1}\|_{2\to 2}\leq 2C_{\mathrm{H\ddot{o}lder}}h_{\ell}^{-d}\quad\text{and}\quad\|\boldsymbol{\mathbf{\check{A}}}_{\ell}^{-1}-\boldsymbol{\mathbf{A}}_{\ell}^{-1}\|_{2}\leq 2(C_{\mathrm{H\ddot{o}lder}})^{2}h_{\ell}^{-2d}\epsilon_{\ell}.

Because of the entry-wise error |(𝐀ˇ)ξ,ξ(𝐀)ξ,ξ|ϵ|(\boldsymbol{\mathbf{\check{A}}}_{\ell})_{\xi,\xi}-(\boldsymbol{\mathbf{A}}_{\ell})_{\xi,\xi}|\leq\epsilon_{\ell}, we also have that the diagonal matrix 𝐁ˇ=diag(𝐀ˇ)\boldsymbol{\mathbf{\check{B}}}_{\ell}=\operatorname{diag}(\boldsymbol{\mathbf{\check{A}}}_{\ell}) satisfies

κ(𝐁ˇ)1+ϵ1ϵκ(𝐁)2CstiffCdiag.\kappa(\boldsymbol{\mathbf{\check{B}}}_{\ell})\leq\frac{1+\epsilon_{\ell}}{1-\epsilon_{\ell}}\kappa(\boldsymbol{\mathbf{B}}_{\ell})\leq 2\frac{C_{\mathrm{stiff}}}{C_{\mathrm{diag}}}.

Therefore, there is θ\theta so that for all \ell and all 𝐱n\boldsymbol{\mathbf{x}}\in\mathbb{R}^{n_{\ell}}, θ𝐀ˇ𝐱,𝐱𝐁ˇ𝐱,𝐱\theta\langle\boldsymbol{\mathbf{\check{A}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle\leq\langle\boldsymbol{\mathbf{\check{B}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle holds. This permits us to consider the Jacobi iteration applied to the perturbed linear system 𝐀ˇ𝐱=𝐛,\boldsymbol{\mathbf{\check{A}}}_{\ell}\boldsymbol{\mathbf{x}}_{\ell}=\boldsymbol{\mathbf{b}}, which yields Jˇ:n×nn\check{J}_{\ell}:\mathbb{R}^{n_{\ell}}\times\mathbb{R}^{n_{\ell}}\to\mathbb{R}^{n_{\ell}} defined by

Jˇ(𝐱,𝐛)=𝐒ˇ𝐱+θ𝐁ˇ1𝐛,where𝐒ˇ:=idθ𝐁ˇ1𝐀ˇ.\displaystyle\check{J}_{\ell}(\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{b}})=\boldsymbol{\mathbf{\check{S}}}_{\ell}\boldsymbol{\mathbf{x}}+\theta\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}\boldsymbol{\mathbf{b}},\quad\text{where}\quad\boldsymbol{\mathbf{\check{S}}}_{\ell}:=\mathrm{id}-\theta\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}\boldsymbol{\mathbf{\check{A}}}_{\ell}. (30)

Since 𝐒𝐒ˇ=θ(𝐁ˇ1(𝐀ˇ𝐀)+(𝐁ˇ1𝐁1)𝐀)\boldsymbol{\mathbf{S}}_{\ell}-\boldsymbol{\mathbf{\check{S}}}_{\ell}=\theta\Bigl{(}\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}(\boldsymbol{\mathbf{\check{A}}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell})+(\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}-\boldsymbol{\mathbf{B}}_{\ell}^{-1})\boldsymbol{\mathbf{A}}_{\ell}\Bigr{)}, we can estimate the error between smoothing matrices as

𝐒𝐒ˇ223θCA(Cdiag)2h2dϵ.\|\boldsymbol{\mathbf{S}}_{\ell}-\boldsymbol{\mathbf{\check{S}}}_{\ell}\|_{2\to 2}\leq\frac{3\theta C_{A}}{(C_{\mathrm{diag}})^{2}}h_{\ell}^{2-d}\epsilon_{\ell}. (31)

Because θ𝐀ˇ𝐱,𝐱𝐁ˇ𝐱,𝐱\theta\langle\boldsymbol{\mathbf{\check{A}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle\leq\langle\boldsymbol{\mathbf{\check{B}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle it follows that for all nn,

𝐒ˇn222CstiffCdiag.\|\boldsymbol{\mathbf{\check{S}}}_{\ell}^{n}\|_{2\to 2}\leq\sqrt{2\frac{C_{\mathrm{stiff}}}{C_{\mathrm{diag}}}}. (32)

This also yields the following Lemma.

Lemma 12.

For MM\in\mathbb{N}, we get the bound

𝐒M𝐒ˇM222MCstiffCdiag𝐒𝐒ˇ;rΞ22.\displaystyle\left\|\boldsymbol{\mathbf{S}}_{\ell}^{M}-\boldsymbol{\mathbf{\check{S}}}_{\ell}^{M}\right\|_{2\to 2}\leq 2M\frac{C_{\mathrm{stiff}}}{C_{\mathrm{diag}}}\left\|\boldsymbol{\mathbf{S}}_{\ell}-\check{\boldsymbol{\mathbf{S}}}_{\ell;r_{\Xi}}\right\|_{2\to 2}.
Proof.

By telescoping, we have 𝐒M𝐒ˇM=j=0M1𝐒M1j(𝐒𝐒ˇ)𝐒ˇj\boldsymbol{\mathbf{S}}_{\ell}^{M}-\boldsymbol{\mathbf{\check{S}}}_{\ell}^{M}=\sum_{j=0}^{M-1}\boldsymbol{\mathbf{S}}_{\ell}^{M-1-j}\left(\boldsymbol{\mathbf{S}}_{\ell}-\boldsymbol{\mathbf{\check{S}}}_{\ell}\right)\boldsymbol{\mathbf{\check{S}}}_{\ell}^{j}. The inequality

𝐒M𝐒ˇM𝐒𝐒ˇm=0M1𝐒Ξm22𝐒ˇM1m22.\displaystyle\|\boldsymbol{\mathbf{S}}_{\ell}^{M}-\boldsymbol{\mathbf{\check{S}}}_{\ell}^{M}\|\leq\|\boldsymbol{\mathbf{S}}_{\ell}-\boldsymbol{\mathbf{\check{S}}}_{\ell}\|\sum_{m=0}^{M-1}\left\|\boldsymbol{\mathbf{S}}_{\Xi}^{m}\right\|_{2\to 2}\left\|\boldsymbol{\mathbf{\check{S}}}_{\ell}^{M-1-m}\right\|_{2\to 2}.

follows from norm properties, and the result follows by applying (32) and Lemma 7. ∎

This lemma can be combined with the estimate (31) to obtain

𝐒M𝐒ˇM226θCstiffCA(Cdiag)3Mh2dϵ.\displaystyle\left\|\boldsymbol{\mathbf{S}}_{\ell}^{M}-\boldsymbol{\mathbf{\check{S}}}_{\ell}^{M}\right\|_{2\to 2}\leq 6\theta\frac{C_{\mathrm{stiff}}C_{A}}{(C_{\mathrm{diag}})^{3}}Mh_{\ell}^{2-d}\epsilon_{\ell}. (33)

Perturbed two grid method

Now, we consider the perturbed version of the two step algorithm. We aim to apply the two-grid method with only truncated matrices to the problem

𝐀ˇ𝐮ˇ=𝐛=σu.\displaystyle\boldsymbol{\mathbf{\check{A}}}_{\ell}\check{\boldsymbol{\mathbf{u}}}^{\star}_{\ell}=\boldsymbol{\mathbf{b}}_{\ell}=\sigma_{\ell}u^{\star}_{\ell}. (34)

Applying the two-grid method to (34) gives 𝐮ˇ𝐮ˇnew=𝐂ˇTG(𝐮ˇ𝐮ˇold),\check{\boldsymbol{\mathbf{u}}}^{\star}_{\ell}-\check{\boldsymbol{\mathbf{u}}}^{\text{new}}_{\ell}=\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\left(\check{\boldsymbol{\mathbf{u}}}^{\star}_{\ell}-\check{\boldsymbol{\mathbf{u}}}^{\text{old}}_{\ell}\right), where the two grid iteration matrix is

𝐂ˇTG:=𝐒ˇν2(id𝐩ˇ(𝐀ˇ1)1𝐫ˇ𝐀ˇ)𝐒ˇν1.\displaystyle\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}:=\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{2}}\left(\mathrm{id}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\left(\boldsymbol{\mathbf{\check{A}}}_{\ell-1}\right)^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\right)\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{1}}. (35)
Lemma 13.

If ϵhd+2\epsilon_{\ell}\leq h_{\ell}^{d+2} holds for all L\ell\leq L, then

𝐂TG𝐂ˇTG22C(ν2+ν1+h14)h1(d+2)ϵ1.\displaystyle\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}-\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\|_{2\to 2}\leq C(\nu_{2}+\nu_{1}+h_{\ell-1}^{-4})h_{\ell-1}^{-(d+2)}\epsilon_{\ell-1}.
Remark 3.

A basic idea, used throughout this section, is the following result: If max(Mj,Mˇj)Cj\max(\|M_{j}\|,\|\check{M}_{j}\|)\leq C_{j}, then

j=1nMjj=1nMˇj(j=1nCj)(j=1nMjMˇjCj).\left\|\prod_{j=1}^{n}M_{j}-\prod_{j=1}^{n}\check{M}_{j}\right\|\leq\left(\prod_{j=1}^{n}C_{j}\right)\left(\sum_{j=1}^{n}\frac{\|M_{j}-\check{M}_{j}\|}{C_{j}}\right).
Proof of Lemma 13.

Consider

E1\displaystyle E_{1} :=\displaystyle:= (id𝐩ˇ𝐀ˇ11𝐫ˇ𝐀ˇ)(id𝐩𝐀11𝐫𝐀)\displaystyle\left(\mathrm{id}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\right)-\left(\mathrm{id}-\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}\right)
=\displaystyle= 𝐩𝐀11𝐫𝐀𝐩ˇ𝐀ˇ11𝐫ˇ𝐀ˇ.\displaystyle\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}.

By Remark 3, we have

E122\displaystyle\left\|E_{1}\right\|_{2\to 2} Ch1dhd2(2ϵ2Cpro+2(CHo¨lder)2ϵ1h12d2CHo¨lderh1d+ϵ2CAhd2)\displaystyle\leq Ch_{\ell-1}^{-d}h_{\ell}^{d-2}\left(2\frac{\epsilon_{\ell}}{2C_{pro}}+\frac{2(C_{\mathrm{H\ddot{o}lder}})^{2}\epsilon_{\ell-1}h_{\ell-1}^{-2d}}{2C_{\mathrm{H\ddot{o}lder}}h_{\ell-1}^{-d}}+\frac{\epsilon_{\ell}}{2C_{A}h_{\ell}^{d-2}}\right)
Ch1d2ϵ1.\displaystyle\leq Ch_{\ell-1}^{-d-2}\epsilon_{\ell-1}.

Now, we consider the difference

E2\displaystyle E_{2} :=𝐂TG𝐂ˇTG\displaystyle:=\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}-\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}
=𝐒ν2(id𝐩𝐀11𝐫𝐀)𝐒ν1𝐒ˇν2(id𝐩ˇ𝐀ˇ11𝐫ˇ𝐀ˇ)𝐒ˇν1.\displaystyle=\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{2}}\left(\mathrm{id}-\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}\right)\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{1}}-\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{2}}\left(\mathrm{id}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\right)\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{1}}.

Because id𝐩ˇ𝐀ˇ11𝐫ˇ𝐀ˇCh2\|\mathrm{id}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\|\leq Ch_{\ell}^{-2} and id𝐩ˇ𝐀ˇ11𝐫ˇ𝐀ˇCh2\|\mathrm{id}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\|\leq Ch_{\ell}^{-2}, the lemma follows by using Remark 3 with (31), (33), Lemma 12, and the above estimate of E1\|E_{1}\|. ∎

Perturbed τ\tau-cycle

As in the two-step case we consider the multigrid method also for the truncated system in (34), i.e., 𝐀ˇ𝐮ˇ=𝐛=σu\boldsymbol{\mathbf{\check{A}}}_{\ell}\check{\boldsymbol{\mathbf{u}}}^{\star}_{\ell}=\boldsymbol{\mathbf{b}}_{\ell}=\sigma_{\ell}u^{\star}_{\ell}. The multigrid iteration matrix is

𝐂ˇ={𝟎,=0𝐂ˇTG+𝐒ˇν2𝐩ˇ𝐂ˇ1τ𝐀ˇ11𝐫ˇ𝐀ˇ𝐒ˇν1,1L.\displaystyle\boldsymbol{\mathbf{\check{C}}}_{\ell}=\begin{cases}\boldsymbol{\mathbf{0}},&\ell=0\\ \boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}+\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{2}}\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{1}},&1\leq\ell\leq L.\end{cases}

From this we define the operator 𝒞ˇ:=σ𝐂ˇ(σ)1\mathcal{\check{C}}_{\ell}:=\sigma_{\ell}^{*}\boldsymbol{\mathbf{\check{C}}}_{\ell}(\sigma_{\ell}^{*})^{-1}.

Theorem 2.

For any 0<γ<10<\gamma<1, there exist constants C1C_{1}, C2C_{2} and C4C_{4} and ν\nu^{\star}\in\mathbb{N} such that CProp.1g(ν)C4min{τ1τ(βThm.1τ)1τ1,τ1τγ}C_{\text{Prop.\ref{prop:boundtwogridop}}}g(\nu^{\star})\leq C_{4}\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm.\ref{thm:main}}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}. For ν1ν\nu_{1}\geq\nu^{\star} choose ϵ\epsilon_{\ell} such that ϵh(d+2)(h4+ν1+ν2)<CLem.131min(C1,γC2)\epsilon_{\ell}h_{\ell}^{-(d+2)}(h_{\ell}^{-4}+\nu_{1}+\nu_{2})<C^{-1}_{\text{Lem.\ref{lem:2step_pert}}}\min(C_{1},\gamma C_{2}) for 0L0\leq\ell\leq L for all \ell and if h0(CLem.131min(C1,γC2))1/4h_{0}\leq\left(C^{-1}_{\text{Lem.\ref{lem:2step_pert}}}\min(C_{1},\gamma C_{2})\right)^{-1/4}, then

𝒞ˇL2(𝕄)L2(𝕄)γ.\displaystyle\left\|\mathcal{\check{C}}_{\ell}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq\gamma.
Proof.

As in the proof of Theorem 1, we make the estimate

𝒞ˇ𝒞ˇTG+𝒮ˇν2σ𝐩ˇ𝐂ˇ1τ𝐀ˇ11𝐫ˇ𝐀ˇ(σ)1𝒮ˇν1.\|\mathcal{\check{C}}_{\ell}\|\leq\|\mathcal{\check{C}}_{\text{TG}_{\ell}}\|+\|\mathcal{\check{S}}_{\ell}^{\nu_{2}}\sigma_{\ell}^{*}\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{\check{S}}_{\ell}^{\nu_{1}}\|.

Then (32) ensures that σ𝐩ˇ(σ1)1𝒞ˇ1τσ1𝐀ˇ11𝐫ˇ𝐀ˇ(σ)1𝒮ˇν1\|\sigma_{\ell}^{*}\check{\boldsymbol{\mathbf{p}}}_{\ell}(\sigma_{\ell-1}^{*})^{-1}\mathcal{\check{C}}_{\ell-1}^{\tau}\sigma_{\ell-1}^{*}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{\check{S}}_{\ell}^{\nu_{1}}\| controls the second expression. Considering the difference

E:=σ𝐩ˇ𝐂ˇ1τ𝐀ˇ11𝐫ˇ𝐀ˇ(σ)1𝒮ˇν1σ𝐩𝐂ˇ1τ𝐀11𝐫𝐀(σ)1𝒮ν1,E:=\sigma_{\ell}^{*}\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{\check{S}}_{\ell}^{\nu_{1}}-\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{S}_{\ell}^{\nu_{1}},

Remark 3 gives

EL2L2C(h12+ν1)h1dϵ1𝐂ˇ1τ22.\|E\|_{L_{2}\to L_{2}}\leq C(h_{\ell-1}^{-2}+\nu_{1})h_{\ell-1}^{-d}\epsilon_{\ell-1}\|\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\|_{\ell_{2}\to\ell_{2}}.

Using the Riesz property, this gives

𝒞ˇ\displaystyle\|\mathcal{\check{C}}_{\ell}\| \displaystyle\leq 𝒞ˇTG\displaystyle\|\mathcal{\check{C}}_{\text{TG}_{\ell}}\|
+C(h2+ν1)hdϵ𝒞ˇ1τ+σ𝐩𝐂ˇ1τ𝐀11𝐫𝐀(σ)1𝒮ν1).\displaystyle+C\left(h^{-2}+\nu_{1})h^{-d}\epsilon\|\mathcal{\check{C}}_{\ell-1}^{\tau}\|+\left\|\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}\right\|\right).

As in the proof of Theorem 1, the last normed expression can be bounded as

σ𝐩𝐂ˇ1τ𝐀11𝐫𝐀(σ)1𝒮ν1𝒞ˇ1τPΞ1𝒮ν1C𝒞ˇ1τ.\left\|\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}\right\|\leq\|\mathcal{\check{C}}_{\ell-1}^{\tau}\|\|P_{\Xi_{\ell-1}}\mathcal{S}_{\ell}^{\nu_{1}}\|\leq C\|\mathcal{\check{C}}_{\ell-1}^{\tau}\|.

Because (h2+ν1)hdϵ(h_{\ell}^{-2}+\nu_{1})h_{\ell}^{-d}\epsilon_{\ell} is bounded by assumption, it follows that

𝒞ˇ𝒞ˇTG+C3𝒞ˇ1τ.\|\mathcal{\check{C}}_{\ell}\|\leq\|\mathcal{\check{C}}_{\text{TG}_{\ell}}\|+C_{3}\|\mathcal{\check{C}}_{\ell-1}\|^{\tau}.

As by assumption ϵhd+2\epsilon_{\ell}\leq h_{\ell}^{d+2} (no constants due to hh0h\leq h_{0}) holds for all L\ell\leq L, we obtain by Lemma 13

𝐂ˇTG22\displaystyle\|\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\|_{2\to 2} 𝐂TG𝐂ˇTG22+𝐂TG22\displaystyle\leq\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}-\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\|_{2\to 2}+\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}\|_{2\to 2}
CLem.13(ν2+ν1+h14)h1(d+2)ϵ1+𝐂TG22\displaystyle\leq C_{\text{Lem.\ref{lem:2step_pert}}}(\nu_{2}+\nu_{1}+h_{\ell-1}^{-4})h_{\ell-1}^{-(d+2)}\epsilon_{\ell-1}+\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}\|_{2\to 2}
min(C1,γC2)+𝐂TG22.\displaystyle\leq\min(C_{1},\gamma C_{2})+\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}\|_{2\to 2}.

We use Theorem 1 and choose a natural number ν1\nu^{\star}_{1} large enough such that the inequality CProp.1g(ν1)C4min{τ1τ(βThm.1τ)1τ1,τ1τγ}C_{\text{Prop.\ref{prop:boundtwogridop}}}g(\nu^{\star}_{1})\leq C_{4}\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm.\ref{thm:main}}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\} is satisfied. Thus, we obtain

𝐂ˇTG22min(C1,γC2)+C4γ.\displaystyle\|\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\|_{2\to 2}\leq\min(C_{1},\gamma C_{2})+C_{4}\gamma.

We have

C1+C4γτ1τ(C3τ)1τ1andC2+C4τ1τ.\displaystyle C_{1}+C_{4}\gamma\leq\frac{\tau-1}{\tau}(C_{3}\tau)^{-\frac{1}{\tau-1}}\quad\text{and}\quad C_{2}+C_{4}\leq\frac{\tau-1}{\tau}.

Thus, we define

β:=C3andα:=maxL𝒞ˇTGmin{τ1τ(βτ)1τ1,τ1τγ}.\beta:=C_{3}\quad\text{and}\quad{\alpha}:=\max_{\ell\leq L}\|\mathcal{\check{C}}_{\text{TG}_{\ell}}\|\leq\min\left\{\frac{\tau-1}{\tau}(\beta\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}.\qquad

Hence Lemma 11 applies and the result follows. ∎

7 Truncated multigrid method

In this section we consider truncating the stiffness, prolongation and restriction matrices in order to improve the computational complexity of the method. Each such matrix has stationary, exponential off-diagonal decay, so by retaining the (ξ,η)(\xi,\eta) entry when dist(ξ,η)Kh|logh|\mathrm{dist}(\xi,\eta)\leq Kh_{\ell}|\log h_{\ell}|, and setting the rest to zero, guarantees a small perturbation error (on the order of 𝒪(hJ)\mathcal{O}(h_{\ell}^{J}), where JKJ\propto K). This is made precise in Lemma 15 and Lemma 16 below, with the aid of the following lemma.

Lemma 14.

Suppose Ξ𝕄\Xi\subset\mathbb{M}, c>0c>0, and r2q(Ξ)r\geq 2q(\Xi). Then for any η𝕄\eta\in\mathbb{M}, we have

ξΞdist(ξ,η)>recdist(ξ,η)β𝕄α𝕄(rq)decr(j=0(j+2)decjq).\sum_{\begin{subarray}{c}\xi\in\Xi\\ \mathrm{dist}(\xi,\eta)>r\end{subarray}}e^{-c\ \mathrm{dist}(\xi,\eta)}\leq\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\left(\frac{r}{q}\right)^{d}e^{-c{r}}\left(\sum_{j=0}^{\infty}(j+2)^{d}e^{-cjq}\right).
Proof.

The underlying set can be decomposed as {ξΞdist(ξ,η)r}=j=0𝒜j\{\xi\in\Xi\mid\mathrm{dist}(\xi,\eta)\geq r\}=\bigcup_{j=0}^{\infty}\mathcal{A}_{j}, where 𝒜j={ξΞr+jqdist(ξ,η)<r+(j+1)q}\mathcal{A}_{j}=\{\xi\in\Xi\mid r+jq\leq\mathrm{dist}(\xi,\eta)<r+(j+1)q\} has cardinality |𝒜j|α𝕄β𝕄(rq+(j+2))d|\mathcal{A}_{j}|\leq\frac{\alpha_{\mathbb{M}}}{\beta_{\mathbb{M}}}\bigl{(}\frac{r}{q}+(j+2)\bigr{)}^{d}. It follows that

ξΞdist(ξ,η)>recdist(ξ,η)β𝕄α𝕄ecrj=0(rq+(j+2))decjq\sum_{\begin{subarray}{c}\xi\in\Xi\\ \mathrm{dist}(\xi,\eta)>r\end{subarray}}e^{-c\ \mathrm{dist}(\xi,\eta)}\leq\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}e^{-cr}\sum_{j=0}^{\infty}\left(\frac{r}{q}+(j+2)\right)^{d}e^{-cjq}

and the lemma follows from the fact that rq+j+2rq(j+2)\frac{r}{q}+j+2\leq\frac{r}{q}(j+2) for all j0j\geq 0. ∎

Truncated stiffness matrix

The exponential decay in Lemma 3 motivates the truncation of the stiffness matrix, see e.g. [8, Eq. (8.1)] We define for positive KK, the truncation parameter rΞ:=Kh|log(h)|r_{\Xi}:=Kh|\log(h)|

𝐀ˇΞ;rΞ|Ξ|×|Ξ|with(𝐀ˇΞ;rΞ)ξ,η:={Aξ,η=a(χξ,χη),dist(ξ,η)rΞ,0,dist(ξ,η)>rΞ.\displaystyle\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}\in\mathbb{R}^{|\Xi|\times|\Xi|}\quad\text{with}\quad(\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}})_{\xi,\eta}:=\begin{cases}A_{\xi,\eta}=a(\chi_{\xi},\chi_{\eta}),&\mathrm{dist}(\xi,\eta)\leq r_{\Xi},\\ 0,&\mathrm{dist}(\xi,\eta)>r_{\Xi}.\end{cases} (36)

We note that 𝐀ˇΞ;rΞ\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}} is symmetric if 𝐀Ξ\boldsymbol{\mathbf{A}}_{\Xi} is symmetric. By construction and quasi-uniformity, we obtain |{ξB(η,rΞ)Ξ}|ρdhΞ,𝕄dβ𝕄α𝕄(rΞ+q)d2β𝕄α𝕄ρd|Klogh|d.|\{\xi\in B(\eta,r_{\Xi})\cap\Xi\}|\leq\rho^{d}h^{-d}_{\Xi,\mathbb{M}}\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}(r_{\Xi}+q)^{d}\leq 2\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\rho^{d}|K\log h|^{d}. By hdC|Ξ|h^{-d}\leq C|\Xi|, this yields

|{ξΞ(𝐀ˇΞ;rΞ)ξ,η0}2β𝕄α𝕄ρdddKd(log|Ξ|)d|\{\xi\in\Xi\,\mid\,(\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}})_{\xi,\eta}\neq 0\}\leq 2\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\rho^{d}d^{d}K^{d}(\log|\Xi|)^{d}.

In particular, we obtain

FLOPS(𝐱𝐀ˇΞ;rΞ𝐱)CcompKdlog(|Ξ|)d|Ξ|\displaystyle\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}\boldsymbol{\mathbf{x}})\leq C_{comp}K^{d}\log(|\Xi|)^{d}|\Xi| (37)

for the number of operations for a matrix vector multiplication with the truncated stiffness matrix, with Ccomp=2β𝕄α𝕄ρdddC_{comp}=2\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\rho^{d}d^{d}.

Lemma 15.

With the global parameter Ctr:=β𝕄α𝕄n=1(n+2)deνρnC_{\text{tr}}:=\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\sum_{n=1}^{\infty}\left(n+2\right)^{d}e^{-\frac{\nu}{\rho}n}, the estimate

𝐀Ξ𝐀ˇΞ;rΞ22CtrCstiff(Kρ|log(h)|)dhν2K2\displaystyle\|\boldsymbol{\mathbf{A}}_{\Xi}-\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}\|_{2\to 2}\leq C_{\text{tr}}C_{\text{stiff}}(K\rho|\log(h)|)^{d}h^{\frac{\nu}{2}K-2} (38)

holds.

Proof.

The proof for the first statement is essentially given in [8, Prop 8.1]. Using Lemma 3, we observe, by symmetry, that 𝐀Ξ𝐀ˇΞ;rΞpp\|\boldsymbol{\mathbf{A}}_{\Xi}-\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}\|_{p\to p} is controlled by the maximum of the 1\ell_{1} and \ell_{\infty} matrix norms, which can be controlled by row and column sums. This leads to off-diagonal sums maxηΞξΞB(η,rΞ)|(𝐀Ξ)ξ,η|\max_{\eta\in\Xi}\sum_{\xi\in\Xi\cap B^{\complement}(\eta,r_{\Xi})}\left|(\boldsymbol{\mathbf{A}}_{\Xi})_{\xi,\eta}\right| and maxξΞηΞB(η,rΞ)|(𝐀Ξ)ξ,η|.\max_{\xi\in\Xi}\sum_{\eta\in\Xi\cap B^{\complement}(\eta,r_{\Xi})}\left|(\boldsymbol{\mathbf{A}}_{\Xi})_{\xi,\eta}\right|. Applying Lemma 14 with r=rΞ=Kh|logh|r=r_{\Xi}=Kh|\log h| and c=ν2hc=\frac{\nu}{2h} yields

𝐀Ξ𝐀ˇΞ;rΞpp\displaystyle\|\boldsymbol{\mathbf{A}}_{\Xi}-\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}\|_{p\to p} Cstiffh2maxξΞηΞB(ξ,rΞ)eν2dist(ξ,η)h\displaystyle\leq C_{\text{stiff}}h^{-2}\max_{\xi\in\Xi}\sum_{\eta\in\Xi\cap B^{\complement}(\xi,r_{\Xi})}e^{-\frac{\nu}{2}\frac{\mathrm{dist}(\xi,\eta)}{h}}
Cstiffβ𝕄α𝕄(Kh|logh|q)dhKν22(j=0(j+2)deν2hjq).\displaystyle\leq C_{\text{stiff}}\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\left(\frac{Kh|\log h|}{q}\right)^{d}h^{\frac{K\nu}{2}-2}\left(\sum_{j=0}^{\infty}(j+2)^{d}e^{-\frac{\nu}{2h}jq}\right).

Truncated prolongation and restriction matrices

We introduce truncated prolongation matrices 𝐩ˇ;rn×n1\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\in\mathbb{R}^{n_{\ell}\times n_{\ell-1}}, with

(𝐩ˇ;r)ξΞ1,ηΞ:={(𝐩)ξ,η=χξ(1)(η),dist(ξ,η)r,0,dist(ξ,η)>r,\displaystyle(\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}})_{\xi\in\Xi_{\ell-1},\eta\in\Xi_{\ell}}:=\begin{cases}(\boldsymbol{\mathbf{p}}_{\ell})_{\xi,\eta}=\chi^{(\ell-1)}_{\xi}(\eta),&\mathrm{dist}(\xi,\eta)\leq r_{\ell},\\ 0,&\mathrm{dist}(\xi,\eta)>r_{\ell},\end{cases}

where we use the notation r:=rΞ1r_{\ell}:=r_{\Xi_{\ell-1}}. Likewise, we define 𝐫ˇ;r:=(𝐩ˇ;r)T\check{\boldsymbol{\mathbf{r}}}_{\ell;r_{\ell}}:=\left(\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\right)^{T}. For the numerical costs, we obtain

max{FLOPS(𝐱𝐩ˇ;r𝐱),FLOPS(𝐱𝐫ˇ;r𝐱)}=𝒪(Kdlog(|Ξ|)d|Ξ|),\displaystyle\max\left\{\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}}),\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{r}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}})\right\}=\mathcal{O}\left(K^{d}\log(|\Xi_{\ell}|)^{d}|\Xi_{\ell}|\right), (39)

where we use that |Ξ|hd=γdh1d|Ξ1|γd|Ξ1||\Xi_{\ell}|\sim h_{\ell}^{-d}=\gamma^{d}h^{d}_{\ell-1}\sim|\Xi_{\ell-1}|\sim\gamma^{d}|\Xi_{\ell-1}| due to (22).

Lemma 16.

We have

𝐩𝐩ˇ;r2(Ξ1)2(Ξ)CPWCtrunc(K|log(h)|)dhν2K.\displaystyle\left\|\boldsymbol{\mathbf{p}}_{\ell}-\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\right\|_{\ell_{2}(\Xi_{\ell-1})\to\ell_{2}(\Xi_{\ell})}\leq C_{\mathrm{PW}}C_{\text{trunc}}(K|\log(h_{\ell})|)^{d}h^{\frac{\nu}{2}K}_{\ell}.
Proof.

We proceed as in the proof of (38). We can estimate row and column sums of 𝐄:=𝐩𝐩ˇ;r\boldsymbol{\mathbf{E}}:=\boldsymbol{\mathbf{p}}_{\ell}-\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}} by (13), obtaining

𝐄CPWξΞ1B(η,r)eνdist(η,ξ)h1,\left\|\boldsymbol{\mathbf{E}}\right\|_{\infty\to\infty}\leq C_{\mathrm{PW}}\sum_{\xi\in\Xi_{\ell-1}\cap B^{\complement}(\eta,r_{\ell})}e^{-\nu\frac{\mathrm{dist}(\eta,\xi)}{h_{\ell-1}}},

so, by Lemma 14 with r=r=Kh|logh|r=r_{\ell}=Kh_{\ell}|\log h_{\ell}| and c=νh1c=\frac{\nu}{h_{\ell-1}}, we have

𝐄CPWβ𝕄α𝕄(Kρ|logh1|)d(h1)Kν(j=0(j+2)deνjρ).\left\|\boldsymbol{\mathbf{E}}\right\|_{\infty\to\infty}\leq C_{\mathrm{PW}}\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\left(K\rho|\log h_{\ell-1}|\right)^{d}(h_{\ell-1})^{K\nu}\left(\sum_{j=0}^{\infty}(j+2)^{d}e^{-\frac{\nu j}{\rho}}\right).

Likewise, 𝐄11CPWηΞB(ξ,r)eνdist(η,ξ)h1\left\|\boldsymbol{\mathbf{E}}\right\|_{1\to 1}\leq C_{\mathrm{PW}}\sum_{\eta\in\Xi_{\ell}\cap B^{\complement}(\xi,r_{\ell})}e^{-\nu\frac{\mathrm{dist}(\eta,\xi)}{h_{\ell-1}}}. Lemma 14 yields this time with r=r=Kh|logh|r=r_{\ell}=Kh_{\ell}|\log h_{\ell}| and c=νh1c=\frac{\nu}{h_{\ell-1}}, the estimate

𝐄11CPWβ𝕄α𝕄(Kh1q|logh1|)d(h1)Kν(j=0(j+2)deνjqh1).\left\|\boldsymbol{\mathbf{E}}\right\|_{1\to 1}\leq C_{\mathrm{PW}}\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\left(K\frac{h_{\ell-1}}{q_{\ell}}|\log h_{\ell-1}|\right)^{d}(h_{\ell-1})^{K\nu}\left(\sum_{j=0}^{\infty}(j+2)^{d}e^{-\frac{\nu jq_{\ell}}{h_{\ell-1}}}\right).

Thus, we get

max{𝐄,𝐄11}\displaystyle\max\left\{\left\|\boldsymbol{\mathbf{E}}\right\|_{\infty\to\infty},\left\|\boldsymbol{\mathbf{E}}\right\|_{1\to 1}\right\} CPWCtrunc(Kργ|log(h1)|)dh1ν2K.\displaystyle\leq C_{\mathrm{PW}}C_{\text{trunc}}(K\rho\gamma|\log(h_{\ell-1})|)^{d}h_{\ell-1}^{\frac{\nu}{2}K}.

Interpolation finishes the proof. ∎

Truncated τ\tau-cycle

We now consider the multigrid method using truncated versions of the stiffness, prolongation and restriction matrices. We denote this by MGMTRUNC(τ)\operatorname{MGMTRUNC}^{(\tau)}_{\ell}, and use it to solve (34) with 𝐀ˇ=𝐀ˇ;r\boldsymbol{\mathbf{\check{A}}}_{\ell}=\check{\boldsymbol{\mathbf{A}}}_{\ell;r_{\ell}}. Lemmas 15 and 16 show that conditions for Theorem 2 are satisfied when KK is chosen sufficiently large.

Theorem 3.

If τγd<1\tau\gamma^{d}<1, we obtain

FLOPS(𝐱MGMTRUNC(τ)(𝐱,𝐛))=𝒪(Nlog(N)d).\displaystyle\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\operatorname{MGMTRUNC}^{(\tau)}_{\ell}(\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{b}}_{\ell}))=\mathcal{O}(N_{\ell}\log(N_{\ell})^{d}). (40)
Proof.

Define the floating point operation count for the truncated multigrid
method by 𝙼:=FLOPS(𝐱MGMTRUNC(τ)(𝐱,𝐛)){\tt M}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\operatorname{MGMTRUNC}^{(\tau)}_{\ell}(\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{b}}_{\ell})).

By estimates (37) and (39), the quantities 𝙿:=FLOPS(𝐱𝐩ˇ;r𝐱){\tt P}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}}), 𝚁:=FLOPS(𝐱𝐫ˇ;r𝐱){\tt R}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{r}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}}), and 𝙰:=FLOPS(𝐱𝐀ˇ;r𝐱){\tt A}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{A}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}}) are each bounded by CKdlog(|Ξ|)d|Ξ|CK^{d}\log(|\Xi_{\ell}|)^{d}|\Xi_{\ell}|. Because each Jacobi iteration involves multiplication by a matrix with the same number of nonzero entries, we note that

𝚂:=FLOPS(xJ(x,b))CKdlog(|Ξ|)d|Ξ|{\tt S}_{\ell}:=\operatorname{FLOPS}\bigl{(}x\mapsto J(x,b)\bigr{)}\leq CK^{d}\log(|\Xi_{\ell}|)^{d}|\Xi_{\ell}|

as well. From this, we have the recursive formula

𝙼=𝙿+𝚁+𝙰+(ν1+ν2)𝚂+τ𝙼1.{\tt M}_{\ell}={\tt P}_{\ell}+{\tt R}_{\ell}+{\tt A}_{\ell}+(\nu_{1}+\nu_{2}){\tt S}_{\ell}+\tau{\tt M}_{\ell-1}.

Applying (37) and (39) gives 𝙼CKd(ν1+ν2+3)(|logh|dhd)+τ𝙼1.{\tt M}_{\ell}\leq CK^{d}(\nu_{1}+\nu_{2}+3)\left(|\log h_{\ell}|^{d}h^{-d}_{\ell}\right)+\tau{\tt M}_{\ell-1}.

By setting w=(|logh|/h)dw_{\ell}=\left(|\log h_{\ell}|/h_{\ell}\right)^{d} and C~:=CKd(ν1+ν2+3)\tilde{C}:=CK^{d}(\nu_{1}+\nu_{2}+3), we have the recurrence

𝙼C~w+τ𝙼1𝙼C~k=01wkτk+τ𝙼0{\tt M}_{\ell}\leq\tilde{C}w_{\ell}+\tau{\tt M}_{\ell-1}\quad\longrightarrow\quad{\tt M}_{\ell}\leq\tilde{C}\sum_{k=0}^{\ell-1}w_{\ell-k}\tau^{k}+\tau^{\ell}{\tt M}_{0}

Note that w(|logh0|+|logγ|)dγdh0dw_{\ell}\leq(|\log h_{0}|+\ell|\log\gamma|)^{d}\gamma^{-d\ell}h_{0}^{-d}, since hγh0h_{\ell}\leq\gamma^{\ell}h_{0}. By Hölder’s inequality, we have (|logh0|+|logγ|)d2d1d(|logh0|d+|logγ|d)(|\log h_{0}|+\ell|\log\gamma|)^{d}\leq 2^{\frac{d-1}{d}}(|\log h_{0}|^{d}+\ell|\log\gamma|^{d}), which provides the estimate wk2d1dh0dγdγdk(|logh0|d+(1k/)|logγ|d)w_{\ell-k}\leq 2^{\frac{d-1}{d}}h_{0}^{-d}\gamma^{-d\ell}\gamma^{dk}(|\log h_{0}|^{d}+\ell(1-k/\ell)|\log\gamma|^{d}).

Applying this to the above estimate for 𝙼{\tt M}_{\ell} gives,

𝙼\displaystyle{\tt M}_{\ell} C~h0dγd(|logh0|dk=01(τγd)k+|logγ|ddk=01(1k/)d(τγd)k)\displaystyle\leq\tilde{C}h_{0}^{-d}\gamma^{-d\ell}\left(|\log h_{0}|^{d}\sum_{k=0}^{\ell-1}(\tau\gamma^{d})^{k}+|\log\gamma|^{d}\ell^{d}\sum_{k=0}^{\ell-1}(1-k/\ell)^{d}(\tau\gamma^{d})^{k}\right)
+τ𝙼0\displaystyle\phantom{\leq}+\tau^{\ell}{\tt M}_{0}
C~h0dγd|logh0+logγ|d+τ𝙼0C~hd(|logh|)d+τ𝙼0.\displaystyle\leq\tilde{C}h_{0}^{-d}\gamma^{-d\ell}\left|\log h_{0}+\log\gamma^{\ell}\right|^{d}+\tau^{\ell}{\tt M}_{0}\leq\tilde{C}h^{-d}(|\log h|)^{d}+\tau^{\ell}{\tt M}_{0}.

The result follows by taking (γh0)dChd(\gamma^{\ell}h_{0})^{-d}\leq Ch^{-d} and (|logh0|+|logγ|)|logh|(|\log h_{0}|+\ell|\log\gamma|)\sim|\log h|. ∎

Remark 4.

The kernel-based Galerkin problem 𝐀ˇL;rL𝐮ˇL=𝐛L\check{\boldsymbol{\mathbf{A}}}_{L;r_{L}}\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}=\boldsymbol{\mathbf{b}}_{L}, can be solved stably to any precision ϵmax\epsilon_{\max}, by iterating MGMTRUNCL(τ)(𝐮ˇL,𝐛L)\operatorname{MGMTRUNC}^{(\tau)}_{L}(\check{\boldsymbol{\mathbf{u}}}_{L},\boldsymbol{\mathbf{b}}_{L}), i.e., the truncated multigrid with τ2\tau\geq 2 cycle. Select γ<1\gamma<1 and fix ν1\nu_{1} so that Theorem 2 holds. Let 𝐮ˇL(k+1)=MGMTRUNCL(τ)(𝐮ˇL(k),𝐛L)\check{\boldsymbol{\mathbf{u}}}_{L}^{(k+1)}=\operatorname{MGMTRUNC}^{(\tau)}_{L}(\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)},\boldsymbol{\mathbf{b}}_{L}). If kk is the least integer satisfying γk𝐮ˇL𝐮ˇL(0)2<ϵmax\gamma^{k}\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(0)}\|_{\ell_{2}}<\epsilon_{\max}, then

k1logγlog(ϵmax𝐮ˇL𝐮ˇL(0)2).k\sim\frac{1}{\log\gamma}\log\left(\frac{\epsilon_{\max}}{\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(0)}\|_{\ell_{2}}}\right).

Due to Theorem 3, we obtain an overall complexity of

𝒪(1logγlog(ϵmax𝐮ˇL𝐮ˇL(0))NLlog(NL)d).\mathcal{O}\left(\frac{1}{\log\gamma}\log\left(\frac{\epsilon_{\max}}{\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(0)}\|}\right)N_{L}\log(N_{L})^{d}\right).

We note that 𝐮ˇL𝐮ˇL(k)𝐀LσL(𝐮ˇL𝐮ˇL(k))W21CBernsteinhd/21𝐮ˇL𝐮ˇL(k)2,\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}\sim\|\sigma_{L}^{*}\left(\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\right)\|_{W_{2}^{1}}\leq C_{\mathrm{Bernstein}}h^{d/2-1}\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\|_{\ell_{2}}, and since d2d\geq 2, achieving 𝐮ˇL𝐮ˇL(k)𝐀L<ϵmax\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}<\epsilon_{\max} also requires only a fixed number of iterations. This is repeated below in statement (41).

Indeed, using kk steps of the Conjugate Gradient method on the original system (1), would give error 𝐮¯L(k)𝐮L𝐀L(CNL2/d1CNL2/d+1)k𝐮L𝐮(0)𝐀L\|\bar{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|_{\boldsymbol{\mathbf{A}}_{L}}\leq\left(\frac{CN^{2/d}_{L}-1}{CN^{2/d}_{L}+1}\right)^{k}\|\boldsymbol{\mathbf{u}}^{\star}_{L}-\boldsymbol{\mathbf{u}}^{(0)}\|_{\boldsymbol{\mathbf{A}}_{L}}. Thus to ensure a tolerance of 𝐮¯L(k)𝐮L𝐀Lϵmax\|\bar{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|_{\boldsymbol{\mathbf{A}}_{L}}\leq\epsilon_{\max}, one would need

k1log(1C~NL2/d)log(ϵmax𝐮¯L(0)𝐮L)𝒪(NL2/dlog(ϵmax𝐮¯L(0)𝐮L))k\sim\frac{1}{\log(1-\tilde{C}N^{-2/d}_{L})}\log\left(\frac{\epsilon_{\max}}{\|\bar{\boldsymbol{\mathbf{u}}}^{(0)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|}\right)\sim\mathcal{O}\left(N^{2/d}_{L}\log\left(\frac{\epsilon_{\max}}{\|\bar{\boldsymbol{\mathbf{u}}}^{(0)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|}\right)\right)

steps, where we use (CNL2/d1CNL2/d+1)(1C~N2/d)\left(\frac{CN^{2/d}_{L}-1}{CN^{2/d}_{L}+1}\right)\sim\left(1-\tilde{C}N^{-2/d}\right).

In contrast to this, the multigrid WW-cycle requires only

k|log(ϵmax𝐮ˇL(0)𝐮ˇL)|k\sim\left|\log\left(\frac{\epsilon_{\max}}{\|\check{\boldsymbol{\mathbf{u}}}^{(0)}_{L}-\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}\|}\right)\right| (41)

iterations to achieve error 𝐮ˇL(k)𝐮ˇL𝐀Lϵmax\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}\|_{\boldsymbol{\mathbf{A}}_{L}}\leq\epsilon_{\max}. In fact, it reaches 𝐮ˇL(k)𝐮ˇL2ϵmax\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}\|_{\ell_{2}}\leq\epsilon_{\max}, which is a stronger constraint, within kk iterations. In particular, the number of iterations is independent of the size NLN_{L} of the problem.

Acknowledegment

The authors wish to thank the anonymous reviewers for their careful reading and helpful comments.

References

  • [1] G. Dziuk, C. M. Elliott, Finite element methods for surface PDEs, Acta Numer. 22 (2013) 289–396.
  • [2] V. Shankar, G. B. Wright, A. Narayan, A robust hyperviscosity formulation for stable RBF-FD discretizations of advection-diffusion-reaction equations on manifolds, SIAM J. Sci. Comput. 42 (4) (2020) A2371–A2401.
  • [3] Q. T. Le Gia, I. H. Sloan, H. Wendland, Multiscale RBF collocation for solving PDEs on spheres, Numer. Math. 121 (1) (2012) 99–125.
  • [4] Q. Le Gia, Galerkin approximation for elliptic PDEs on spheres, Journal of Approximation Theory 130 (2) (2004) 125–149.
  • [5] H. Wendland, J. Künemund, Solving partial differential equations on (evolving) surfaces with radial basis functions, Advances in Computational Mathematics 46 (4) (2020) 64.
  • [6] Z. Sun, L. Ling, A kernel-based meshless conservative Galerkin method for solving Hamiltonian wave equations, SIAM Journal on Scientific Computing 44 (4) (2022) A2789–A2807.
  • [7] J. Künemund, F. J. Narcowich, J. D. Ward, H. Wendland, A high-order meshless Galerkin method for semilinear parabolic equations on spheres, Numer. Math. 142 (2) (2019) 383–419.
  • [8] F. J. Narcowich, S. T. Rowe, J. D. Ward, A novel Galerkin method for solving PDEs on the sphere using highly localized kernel bases, Math. Comp. 86 (303) (2017) 197–231.
  • [9] A. Katz, A. Jameson, Multicloud: multigrid convergence with a meshless operator, J. Comput. Phys. 228 (14) (2009) 5237–5250.
  • [10] G. B. Wright, A. Jones, V. Shankar, Mgm: A meshfree geometric multilevel method for systems arising from elliptic equations on point cloud surfaces, SIAM Journal on Scientific Computing 45 (2) (2023) A312–A337.
  • [11] W. Hackbusch, Iterative solution of large sparse systems of equations, Vol. 95, Springer, 1994.
  • [12] S. C. Brenner, L. R. Scott, The Mathematical Theory of Finite Element Methods, Springer New York, NY, 2007, edition Number 3.
  • [13] A. Reusken, Introduction to multigrid methods for elliptic boundary value problems, (2008). Technical report available at {https://www.igpm.rwth-aachen.de/Download/reports/reusken/ARpaper53.pdf}
  • [14] E. Fuselier, T. Hangelbroek, F. J. Narcowich, J. D. Ward, G. B. Wright, Localized bases for kernel spaces on the unit sphere, SIAM J. Numer. Anal. 51 (5) (2013) 2538–2562.
  • [15] T. Hangelbroek, F. J. Narcowich, C. Rieger, J. D. Ward, Direct and inverse results on bounded domains for meshless methods via localized bases on manifolds, in: J. Dick, F. Kuo, H. Woźniakowski (Eds.), Contemporary computational mathematics—a celebration of the 80th birthday of Ian Sloan. Vol. 1, 2, Springer, Cham, 2018, pp. 517–543.
  • [16] J. H. Bramble, J. E. Pasciak, J. Xu, Parallel multilevel preconditioners, Mathematics of Computation 55 (191) (1990) 1–22.
  • [17] V. John, Multigrid methods, Tech. rep., (2014). Lecture notes available at {https://www.wias-berlin.de/people/john/LEHRE/MULTIGRID/multigrid.pdf}
  • [18] S. Balay, S. Abhyankar, M. F. Adams, S. Benson, J. Brown, P. Brune, K. Buschelman, E. Constantinescu, L. Dalcin, A. Dener, V. Eijkhout, J. Faibussowitsch, W. D. Gropp, V. Hapla, T. Isaac, P. Jolivet, D. Karpeev, D. Kaushik, M. G. Knepley, F. Kong, S. Kruger, D. A. May, L. C. McInnes, R. T. Mills, L. Mitchell, T. Munson, J. E. Roman, K. Rupp, P. Sanan, J. Sarich, B. F. Smith, S. Zampini, H. Zhang, H. Zhang, J. Zhang, PETSc/TAO users manual, Tech. Rep. ANL-21/39 - Revision 3.21, Argonne National Laboratory (2024).
  • [19] R. Abraham, J. E. Marsden, T. Ratiu, Manifolds, tensor analysis, and applications, 2nd Edition, Vol. 75 of Applied Mathematical Sciences, Springer-Verlag, New York, 1988.
  • [20] J. M. Lee, Introduction to Riemannian manifolds, 2nd Edition, Vol. 176 of Graduate Texts in Mathematics, Springer, Cham, 2018.
  • [21] T. Hangelbroek, F. J. Narcowich, J. D. Ward, Kernel approximation on manifolds I: bounding the Lebesgue constant, SIAM J. Math. Anal. 42 (4) (2010) 1732–1760.
  • [22] H. Triebel, Spaces of Besov-Hardy-Sobolev type on complete Riemannian manifolds, Ark. Mat. 24 (2) (1986) 299–337.
  • [23] T. Hangelbroek, F. J. Narcowich, J. D. Ward, Polyharmonic and related kernels on manifolds: interpolation and approximation, Found. Comput. Math. 12 (5) (2012) 625–670.
  • [24] M. E. Taylor, Partial differential equations. I, Vol. 115 of Applied Mathematical Sciences, Springer-Verlag, New York, 1996, basic theory.
  • [25] T. Hangelbroek, F. J. Narcowich, X. Sun, J. D. Ward, Kernel approximation on manifolds II: the LL_{\infty} norm of the L2L_{2} projector, SIAM J. Math. Anal. 43 (2) (2011) 662–684.
  • [26] P. Collins, Kernel-based Galerkin methods on compact manifolds without boundary, with an emphasis on SO(3), Ph.D. thesis, University of Hawai‘i at Mānoa, (2021). Thesis available at {https://scholarspace.manoa.hawaii.edu/items/936c034f-5039-4ef0-8f03-03f21d71184b}