Kernel Multigrid on Manifolds

Thomas Hangelbroek ¹¹1Thomas Hangelbroek’s research was supported by Grant DMS-2010051 from the National Science Foundation. [email protected] University of Hawai‘i at Mānoa Christian Rieger [email protected] Philipps-Universität Marburg

Abstract

Kernel methods for solving partial differential equations on surfaces have the advantage that those methods work intrinsically on the surface and yield high approximation rates if the solution to the partial differential equation is smooth enough. Localized Lagrange bases have proven to alleviate the computational complexity of usual kernel methods to some extent. The efficient numerical solution of the resulting linear systems of equations has not been addressed in the literature so far. In this article we apply the framework of geometric multigrid method with a $\tau\geq 2$ -cycle to this particular setting. Moreover, we show that the resulting linear algebra can be made more efficiently by using the Lagrange function decay again. The convergence rates are obtained by a rigorous analysis. The presented version of a multigrid method provably works on quasi-uniform point clouds on the surface and hence does not require a grid-structure. Moreover, we can show that the computational cost to solve the linear system scales log-linear in the degrees of freedom.

keywords:

Geometric multigrid , partial differential equations on manifolds , kernel-based Galerkin methods , localized Lagrange basis

MSC:

65F10 , 65Y20 , 65M12 , 65M15 , 65M55 , 65M60

1 Introduction

The numerical solution to partial differential equations, often time dependent, on curved geometries is crucial to many real-world applications. Of the many available numerical methods (see for instance [1] for an overview using finite elements, or [2] [3] for alternative mesh-free methods), we focus on mesh-free kernel-based Galerkin methods (see also [4], [5], [6]), which have a number of merits, including delivering high approximation orders for smooth data, providing smooth solutions, and working coordinate-free and without the need for rigid underlying geometric structures like meshes and regular grids.

Let $\mathbb{M}$ denote in the following a compact, $d$ -dimensional Riemannian manifold without boundary. We will consider, as a spatial operator, a generic second order linear elliptic differential operator $\mathcal{L}$ with trivial nullspace. Such operators occur for instance in the numerical solution of the heat equation using implicit time-stepping methods, see e.g. [7]. In conventional kernel-based Galerkin methods, a grid $\Xi\subset\mathbb{M}$ is considered and a stiffness matrix $\boldsymbol{\mathbf{A}}_{\Xi}$ , which represents $\mathcal{L}$ on a finite dimensional kernel space expressed endowed with some fixed basis, is assembled. This leads to the equation

\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{u}}=\boldsymbol{\mathbf{b}},

(1)

(see (18) for a precise definition).

In this paper we address the problem of reducing the computational costs of using kernels without spoiling their analytic advantages. Here, we mostly follow [7, 8] and use the kernel-based Lagrange basis. This particular basis, though not locally supported, has very strong decay properties which allows to localize computations. The almost-local support already alleviates the problem of densely populated matrices as usually encountered in kernel methods. Concretely, the full system in (1) can be well-approximated by a sparse matrix using the decay of the local Lagrange basis. Despite this, the condition number $\operatorname{cond}_{2}(\boldsymbol{\mathbf{A}}_{\Xi})\sim N^{2/d}$ grows with the problem size $N:=|\Xi|$ . Thus, even if compressed, (1) poses a computational challenge, and, because of its conditioning, iterative methods (without preconditioning) cannot be expected to work well, since the iterations needed to ensure a prescribed accuracy may grow with the number of degrees of freedom.

To this end, we introduce and analyze a mesh-free multigrid algorithm. Specifically, we adapt the standard $W$ -cycle multigrid method to (localized) kernels and provide a rigorous analysis in this setting. Although multigrid methods recently have gained attention in the mesh-free community, see [9] and [10], the rigorous analysis we provide has been missing in the kernel-based context.

The main novelty of this paper is the following:

1.

We prove this mesh-free algorithm is a contraction, with norm independent of grid size (see Theorem 1). By standard techniques, it follows that numerical solutions within given tolerance can be obtained with a fixed, with respect to grid size, number of iterations (see (41) and adjacent discussion).
2.

The method we present is stable under perturbations of the stiffness, restriction and prolongation matrices. Thus small errors, which may come from sparsifying these matrices (as well as from quadrature, round-off or modification of the kernel), also yield contractions (see Theorem 2) and therefore do not hinder performance of the algorithm.
3.

By compressing the stiffness, restriction and prolongation matrices, which have rapid off-diagonal decay, we obtain a solution method enjoying nearly linear complexity.

This leads to an approximate solution to the system (1) which requires only $\mathcal{O}\bigl{(}N\log(N)^{d}\bigr{)}$ floating point operations per iteration and where again the multigrid matrix has a norm bound less than unity (see Theorem 2), for a total operation count of

\mathcal{O}\left(N\log(N)^{d}\log\left({\epsilon_{\max}}\right)\right),

where $\epsilon_{\max}$ is the user-prescribed tolerance. The approximate solution $\boldsymbol{\mathbf{u}}^{\star}$ to (1) has an (additive) error bound of the form $\|\boldsymbol{\mathbf{u}}-\boldsymbol{\mathbf{u}}^{\star}\|_{\ell_{2}}\leq\epsilon_{\max}+\epsilon_{\mathrm{tr}}$ , where the error due to truncating matrices is $\epsilon_{\mathrm{tr}}=\mathcal{O}(N^{-J})$ . Here $J>0$ is a user-determined constant which depends linearly on the sparsity of truncated matrices in the algorithm. This is explained in Remark 4.

Remark 1.

Before proceeding, we make the following comments:

1.

We do not attempt to modify the underlying framework of the multigrid method, as described, for instance, in [11, 12, 13], but show that it can be successfully applied to kernel methods. This is far from being obvious. We follow conceptually the book [12] where also the function space view on multigrid methods is used.
2.

The fact that $\mathcal{L}$ is injective is a convenient simplification we assume throughout the article. An investigation of operators with a non-trivial nullspace will be considered in a forthcoming work.
3.

Throughout this article, we will assume to have access to a sequence of quasi-uniform and nested point clouds $\Xi_{0}\subset\Xi_{1}\subset\dots\Xi_{L}\subset\mathbb{M}$ on $\mathbb{M}$ , and a corresponding highly localized Lagrange basis generated by a kernel $\phi:\mathbb{M}\times\mathbb{M}\to\mathbb{R}$ . Both, the construction of point clouds on manifolds and the computation of the Lagrange basis are independent of the partial differential equation and can hence be pre-computed and even be stored.
4.

Although numerical integration is necessary to implement the kernel-based Galerkin method, our results hold independently of the choice of quadrature method. (This is treated, for instance, in [7, Section 4].) We therefore assume for the remainder of this paper these steps to be solved. In particular, we assume to have access to the stiffness matrices $\boldsymbol{\mathbf{A}}_{\Xi}$ , as considered in [14] or [15].
5.

An alternative approach to treating (1) would be to apply a suitable preconditioner. To be effective in this context, such methods would also have to be adapted to the kernel situation and the analysis including the compression argument would have to be carefully carried over as well. Furthermore, many successful preconditioners for finite elements, like [16], typically involve concepts from multigrid methods such as hierarchy of approximation spaces.
6.

We discuss in this article the multigrid method only from the perspective as standalone solver. We point out that the multigrid method itself provides an attractive preconditioner. Most often, the multigrid preconditioner is combined with a flexible GMRES (FGMRES) iteration method, see e.g. [17] and references therein. This is also implemented in many software libraries such as PETSc (see [18, page 92f.]) just to name one prominent example. We will discuss this in more detail when we present numerical results on this method in an upcoming work.

Outline of the paper

The remainder of this article is organized as follows. In section 2, we introduce the basic notation of second order elliptic equations on manifolds, and their solution via kernel-based Galerkin approximation. In this section we demonstrate the approximation property in the kernel context, which, along with the smoothing property, provides the analytic backbone for the success of the multigrid method. Section 3 introduces the phenomenon of rapidly decaying Lagrange-type bases, which holds for certain kernels – using such bases permits stiffness matrices with rapid off-diagonal decay, among other things. Of special importance is the diagonal behavior of the stiffness matrix given in Lemma 5, which is a novel contribution of this paper, and which is the analytic result necessary to prove the smoothing property. In section 4 we discuss the smoothing property of damped Jacobi iterations for kernel-based methods both in the case of symmetric and non-symmetric differential operators. In section 5 we introduce kernel-based restriction and prolongation operators, the standard two-grid method and then the $W$ -cycle. In this section, we prove Theorem 1, a consequence of which is the bound (41), which shows the (poly)-logarithmic complexity of our proposed method. Section 6 treats the error resulting from small perturbations of the stiffness matrix, as well as the prolongation and restriction matrices. The main result in this section, Theorem 2, demonstrates how such errors affect the multigrid approximation error. Section 7 investigates the computationally efficient truncated multigrid method as an application of the previous section.

2 Problem Set Up

2.1 Manifold

Consider a compact $d$ -dimensional Riemannian manifold without boundary $\mathbb{M}$ . Here we list some useful tools and their properties which hold in this setting. We direct the reader to [19, 20] for relevant background. Results about covariant derivatives and Sobolev spaces can be found in [21].

The tangent bundle is $T\mathbb{M}$ and the cotangent bundle is $T^{*}\mathbb{M}$ . We denote by $T^{k}_{r}\mathbb{M}$ the vector bundle of tensors with contravariant order $k$ and covariant order $r$ ; we say type $(k,r)$ for short. Thus $T\mathbb{M}=T^{1}_{0}\mathbb{M}$ and $T^{*}\mathbb{M}=T^{0}_{1}\mathbb{M}$ . We will denote the fiber at $x\in\mathbb{M}$ by $T^{k}_{r}\mathbb{M}_{x}$ . The space of tensor fields of type $(k,r)$ (known also as sections; i.e., maps $\mathbf{S}:\mathbb{M}\to T_{r}^{k}\mathbb{M}$ with $\mathbf{S}(x)\in T_{r}^{k}\mathbb{M}_{x}$ for every $x\in\mathbb{M}$ ) is denoted $\mathcal{T}_{r}^{k}\mathbb{M}$ .

In this article, we are concerned primarily with covariant tensors (i.e., tensors of type $(0,k)$ ), so we use the short hand notation $T_{k}\mathbb{M}=T_{k}^{0}\mathbb{M}$ and $\mathcal{T}_{k}\mathbb{M}=\mathcal{T}_{k}^{0}\mathbb{M}$ .

For a chart $(U,\phi)$ for $\mathbb{M}$ from which we get the usual vector fields $\frac{\partial}{\partial x^{j}}$ and forms $dx^{j}$ ( $1\leq j\leq d$ ), which act as local bases for $T\mathbb{M}$ and $T^{*}\mathbb{M}$ over $U$ . These can be used to generate bases for tensor fields. In particular, for given covariant rank $k$ and $\boldsymbol{\mathbf{i}}=(i_{1},\dots,i_{k})\in\{1,\dots,d\}^{k}$ we have basis element $dx^{\boldsymbol{\mathbf{i}}}:=dx^{i_{1}}\cdots dx^{i_{k}}$ . This allows us to write $\mathbf{S}\in\mathcal{T}_{k}\mathbb{M}$ in coordinates as $\mathbf{S}(x)=\sum_{\boldsymbol{\mathbf{i}}\in\{1,\dots,d\}^{k}}(S(x))_{\boldsymbol{\mathbf{i}}}dx^{\boldsymbol{\mathbf{i}}}$ .

Because $\mathbb{M}$ is a Riemannian manifold, at each $x\in\mathbb{M}$ , $T\mathbb{M}_{x}$ has an inner product $\langle\cdot,\cdot\rangle_{x}$ and induced norm $\|\cdot\|_{x}$ . This means that there is by $g\in\mathcal{T}_{2}\mathbb{M}$ (the metric tensor), so that for tangent vectors in $V,W\in T\mathbb{M}_{x}$ we have $g(x)(V,W)=\langle V,W\rangle_{x}$ . The inner product extends to the dual: for cotangent vectors $\mu,\nu\in T^{*}\mathbb{M}_{x}$ we have $\langle\mu,\nu\rangle_{x}=\sum\mu_{j}\nu_{k}g^{j,k}$ , where $\sum g_{j,k}g^{k,\ell}=\delta_{j,\ell}$ . From this, it naturally extends to tensors. For $\mathbf{T},\mathbf{S}\in{T}_{k}\mathbb{M}_{x}$ , written in coordinates as $\mathbf{T}=\sum_{\boldsymbol{\mathbf{j}}\in\{1,\dots,d\}^{k}}T_{\boldsymbol{\mathbf{j}}}\,d{x^{j_{1}}}\dots d{x^{j_{k}}}$ and $\mathbf{S}=\sum_{\boldsymbol{\mathbf{i}}\in\{1,\dots,d\}^{k}}S_{\boldsymbol{\mathbf{i}}}\,d{x^{i_{1}}}\dots d{x^{i_{k}}}$ , we have

\langle\mathbf{T},\mathbf{S}\rangle_{x}=\sum_{\boldsymbol{\mathbf{i}},\boldsymbol{\mathbf{j}}\in\{1,\dots,d\}^{k}}g^{i_{1}j_{1}}(x)\dots g^{i_{k},j_{k}}(x)S_{\boldsymbol{\mathbf{i}}}T_{\boldsymbol{\mathbf{j}}}.

We denote the Riemannian distance on $\mathbb{M}$ by $\mathrm{dist}:\mathbb{M}\times\mathbb{M}\to[0,\infty)$ ; it is given by the formula $\mathrm{dist}(x,y)=\inf\{\int_{a}^{b}\|\gamma^{\prime}(t)\|_{\gamma(t)}\mathrm{d}t\mid\gamma(a)=x,\ \gamma(b)=y\}$ where the infimum is taken over piecewise smooth curves connecting $x$ and $y$ . The Riemannian metric gives rise to a volume form $\mathrm{d}\mu=\sqrt{\det(g(x))}\mathrm{d}x$ . By compactness, there exist constants $0<\alpha_{\mathbb{M}}\leq\beta_{\mathbb{M}}$ so that

\alpha_{\mathbb{M}}r^{d}\leq\mu(B(x,r))\leq\beta_{\mathbb{M}}r^{d}

or any ball $B(x,r):=\{y\in\mathbb{M}\mid\mathrm{dist}(x,y)<r\}$ centered at $x\in\mathbb{M}$ and having radius $0<r<\mathrm{diam}(\mathbb{M})$ .

For a finite subset $\Xi\subset\mathbb{M}$ , we can define the following useful quantities: the separation distance, $q$ of $\Xi$ and the fill distance, $h$ , of $\Xi$ in $\mathbb{M}$ . They are given by

q:=\frac{1}{2}\min_{\zeta\in\Xi}\mathrm{dist}(\zeta,\Xi\setminus\{\zeta\})\quad\text{and}\quad h:=h(\Xi,\mathbb{M}):=\sup_{x\in\mathbb{M}}\mathrm{dist}(x,\Xi).

The finite sets considered throughout this paper will be quasiuniform, with mesh ratio $\rho:=h/q$ bounded by a fixed constant. For this reason, quantities which are controlled above or below by a power of $q$ can be likewise controlled by a power of $h$ – in short, whenever possible, we express estimates in terms of the fill distance $h$ , allowing constants to depend on $\rho$ . For instance, the cardinality $|\Xi|$ is bounded above and below by

\frac{\mu(\mathbb{M})}{\beta_{\mathbb{M}}}h^{-d}\leq|\Xi|\leq\frac{\mu(\mathbb{M})}{\alpha_{\mathbb{M}}}q^{-d}.

(2)

Similarly, if $f:[0,\infty)\to\mathbb{R}$ is continuous then for any $x\in\mathbb{M}$ ,

\sum_{\zeta\in\Xi}f\bigl{(}\mathrm{dist}(\zeta,x)\bigr{)}\leq\max_{t\leq q}f(t)+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\sum_{n=1}^{\infty}(n+2)^{d}\max_{nq\leq t\leq(n+1)q}f(t).

(3)

Moreover, we use the following notation: $\mathbb{R}^{A}=\{f:A\to\mathbb{R}\}$ denotes the functions from the set $A$ to $\mathbb{R}$ . Of course, if $A$ is a discrete finite set, we can identity $\mathbb{R}^{A}\cong\mathbb{R}^{|A|}$ .

2.2 Sobolev spaces

The covariant derivative $\nabla$ maps tensor fields of type $(r,s)$ to fields of type $(r,s+1)$ . Its adjoint (with respect to the $L_{2}$ inner products on the space of sections of $T_{r}^{s}(\mathbb{M})$ ) is denoted $\nabla^{*}$ . For functions, this is fairly elementary. The covariant derivative of a (scalar) function $f:\mathbb{M}\to\mathbb{R}$ equals its exterior derivative; in coordinates, we have

\nabla f=\sum_{j=1}^{d}\frac{\partial f}{\partial x^{j}}dx^{j}=df.

For a 1-form $\omega=\sum_{j=1}^{d}\omega_{j}dx^{j}$ , we have

\nabla^{*}\omega(x)=-\frac{1}{\sqrt{\det g(x)}}\sum_{j=1}^{d}\sum_{k=1}^{d}\frac{\partial}{\partial x_{k}}\left(\sqrt{\det g(x)}g^{jk}(x)\omega_{j}(x)\right).

A direct calculation shows $\int_{\mathbb{M}}f(x)\nabla^{*}\omega(x)\mathrm{d}\mu(x)=\int_{\mathbb{M}}\langle\omega(x),\nabla f(x)\rangle_{x}\mathrm{d}\mu(x)$ .

For $\Omega\subset\mathbb{M}$ Sobolev space $W_{p}^{k}(\Omega)$ is defined to be the set of functions $f:\Omega\to\mathbb{R}$ which satisfy

\|f\|_{W_{p}^{k}(\Omega)}^{p}=\sum_{\ell=0}^{k}\int_{\Omega}\|\nabla^{\ell}f\|_{x}^{p}\mathrm{d}\mu(x)<\infty,

where the $\ell$ -th order covariant derivatives can be found in [21, Section 2.2].

For a scalar function $f:\mathbb{M}\to\mathbb{R}$ , the Laplace-Beltrami operator is given in local coordinates as $\Delta f=\sum_{j=1}^{d}\sum_{k=1}^{d}\frac{1}{\sqrt{|\det g|}}\frac{\partial}{\partial x^{j}}\bigl{(}\sqrt{|\det g|}g^{jk}\frac{\partial}{\partial x^{k}}f\bigr{)}$ . Thus for scalar functions $\Delta f=-\nabla^{*}\nabla f$ . For any integer $k\in\mathbb{N}$ and $p\in(1,\infty)$ , the Bessel-potential norm $\|(1-\Delta)^{k/2}f\|_{L_{p}(\mathbb{M})}$ is equivalent to $\|f\|_{W_{p}^{k}(\mathbb{M})}$ , as demonstrated in [22, Theorem 4(ii)] (although when $k=1$ and $p=2$ , the two norms are equal; this can be observed directly).

Lemma 3.2 from [21] applies, so there are uniform constants $\Gamma_{1},\Gamma_{2}$ and $r_{\mathbb{M}}>0$ so that the family of exponential maps $\{\mathrm{Exp}_{x}:B(0,r_{\mathbb{M}})\to\mathbb{M}\mid x\in\mathbb{M}\}$ (which are diffeomorphisms taking $0$ to $x$ ) provides local metric equivalences: for any open set $\Omega\subset B(0,r_{\mathbb{M}})\subset\mathbb{R}^{d}$ , we have

\Gamma_{1}\|u\circ\mathrm{Exp}_{x}\|_{W_{p}^{j}(\Omega)}\leq\|u\|_{W_{p}^{j}(\mathrm{Exp}_{x}(\Omega))}\leq\Gamma_{2}\|u\circ\mathrm{Exp}_{x}\|_{W_{p}^{j}(\Omega)}.

(4)

This shows that $W_{p}^{k}(\mathbb{M})$ can be endowed with equivalent norms using a partition of unity $(\varphi_{j})_{j\leq N}$ subordinate to a cover $\{\mathcal{O}_{j}\}_{j\leq N}$ with associated charts $\psi_{j}:\mathcal{O}_{j}\to\mathbb{R}^{d}$ to obtain $\|u\|_{W_{p}^{k}(\mathbb{M})}^{p}\sim\sum_{j=1}^{N}\|(\varphi_{j}u)\circ\psi_{j}^{-1}\|_{W_{p}^{k}(\mathbb{R}^{d})}^{p}$ . Here constants of equivalence depend on the partition of unity and charts.

A useful result in this setting, which we will use explicitly in this article, but is also behind a number of background results in section 2.5, is the following zeros estimate [23, Corollary A.13 ], which holds for Sobolev spaces: If $u\in W_{2}^{m}(\mathbb{M})$ satisfies $u\left|{}_{X}\right.=0$ , then for any $k\leq m$ we have

\|u\|_{W_{2}^{k}(\mathbb{M})}\leq C_{\mathrm{zeros}}h^{m-k}\|u\|_{W_{2}^{m}(\mathbb{M})}.

(5)

Here $C_{\mathrm{zeros}}$ depends on $m$ and $\mathbb{M}$ . The result can also be obtained on bounded, Lipschitz domain $\Omega\subset\mathbb{M}$ that satisfies a uniform cone condition, with the cone having radius $R_{\Omega}\leq r_{\mathbb{M}}/3$ , although the constant $C_{\mathrm{zeros}}$ will depend on the aperture of the cone in that case. See [23, Theorem A.11] for a precise statement and definitions of the involved quantities.

2.3 Elliptic operator

We write the operator $\mathcal{L}$ in divergence form:

\mathcal{L}=\nabla^{*}\textsc{a}_{2}^{\flat}(\nabla\cdot)+\textsc{a}_{1}\nabla+\textsc{a}_{0}.

Here $\textsc{a}_{0}$ is a smooth function, $\textsc{a}_{1}$ is a smooth tensor field of type $(1,0)$ and $\textsc{a}_{2}$ is a smooth tensor field of type $(2,0)$ which generates the field $\textsc{a}_{2}^{\flat}$ of type (1,1). In coordinates, $\textsc{a}_{2}$ has the form $\sum_{j=1}^{d}\sum_{k=1}^{d}a^{jk}\frac{\partial}{\partial x_{j}}\frac{\partial}{\partial x_{k}}$ , and $\textsc{a}_{2}^{\flat}$ is $\sum_{j=1}^{d}\sum_{k=1}^{d}\tensor{a}{{}_{k}^{j}}dx_{k}\frac{\partial}{\partial x_{j}}$ with $\tensor{a}{{}_{k}^{j}}=\sum_{\ell=1}^{d}g_{k\ell}a^{\ell j}$ . Here we have used the index lowering operator ^♭ which ensures, for any $\mu,\nu\in T_{1}\mathbb{M}_{x}$ , that $\langle\textsc{a}_{2}^{\flat}(\mu,\cdot),\nu\rangle_{x}=\textsc{a}_{2}(\mu,\nu)$ . This follows by a direct calculation from the above expressions in coordinates (index lowering and the ^♭ operator are discussed in [19, p. 341] or [20, p. 27]).

Furthermore, we require $c_{0}>0$ to be a constant so that

\textsc{a}_{2}(x)(v,v)\geq c_{0}\langle v,v\rangle_{x}\quad\text{and}\quad\textsc{a}_{0}+\frac{1}{2}\nabla^{*}\textsc{a}_{1}\geq c_{0}.

(6)

As a basic example, we consider $\textsc{a}_{2}(x)=\langle\cdot,\cdot\rangle_{x}$ , $\textsc{a}_{1}=0$ and $\textsc{a}_{0}=1$ ; then (6) holds with $c_{0}=1$ . In this case, $a^{jk}=g^{jk}$ , $\tensor{a}{{}_{k}^{j}}=\delta_{j,k}$ and $\mathcal{L}=1+\nabla^{*}\nabla=1-\Delta$ .

With the identity $\int_{\mathbb{M}}u(\textsc{a}_{1}\nabla u)=\frac{1}{2}\int_{\mathbb{M}}(\textsc{a}_{1}\nabla(u^{2}))=\frac{1}{2}\int_{\mathbb{M}}(\nabla^{*}\textsc{a}_{1})u^{2}$ , the second part of (6) ensures that

\int_{\mathbb{M}}u(\textsc{a}_{1}\nabla u+\textsc{a}_{0}u)\geq c_{0}\|u\|_{L_{2}(\mathbb{M})}^{2}.

We have also that $\int_{\mathbb{M}}u(x)\nabla^{*}(\textsc{a}_{2}^{\flat}\nabla u)(x)\mathrm{d}\mu(x)=\int_{\mathbb{M}}\langle\textsc{a}_{2}^{\flat}\nabla u,\nabla u\rangle_{x}\mathrm{d}\mu(x)$ . The definition of $\textsc{a}_{2}^{\flat}$ ensures that this equals $\int_{\mathbb{M}}\textsc{a}_{2}(x)\bigl{(}\nabla u(x),\nabla u(x)\bigr{)}\mathrm{d}\mu(x)$ , and the first part of (6) then ensures that

\int_{\mathbb{M}}\nabla^{*}(\textsc{a}_{2}^{\flat}\nabla u)(x)\mathrm{d}\mu(x)\geq\int_{\mathbb{M}}\|\nabla u\|_{x}^{2}\mathrm{d}\mu(x).

Thus, (6) guarantees that the bilinear form $a(u,v):=\int_{\mathbb{M}}v\mathcal{L}u$ , defined initially for smooth functions, is bounded on $W_{2}^{1}(\mathbb{M})$ and is coercive. Thus, we have that the energy quasi-norm $[u]^{2}_{\mathcal{L}}:=a(u,u)$ satisfies the metric equivalence

\displaystyle c_{0}\|u\|_{W_{2}^{1}(\mathbb{M})}^{2}\leq[u]^{2}_{\mathcal{L}}=a(u,u)\leq C\|u\|_{W_{2}^{1}(\mathbb{M})}^{2}.

(7)

(if $a$ is symmetric, this is a norm, and we write $\|u\|_{\mathcal{L}}=[u]_{\mathcal{L}}$ .)

Please note that this excludes differential operators with a null space at the moment. Having in mind the time dependent problems, this assumption is justified. The technical more challenging analysis for those more general operators is left to future research.

2.4 Galerkin methods

We fix $f\in L_{2}(\mathbb{M})$ and consider $u\in W^{1}_{2}(\mathbb{M})$ as solution to

\displaystyle a(u,v):=\int_{\mathbb{M}}v\mathcal{L}u=(f,v)_{L_{2}(\mathbb{M})}=F(v)\quad\text{for all }v\in W^{1}_{2}(\mathbb{M}).

(8)

Regularity estimates ([24, Chapter 5.11, Theorem 11.1]) yield that $u\in W^{2}_{2}(\mathbb{M})$ and $\|u\|_{W^{2}_{2}(\mathbb{M})}\leq C\|f\|_{L_{2}(\mathbb{M})}$ . Consider now a family of finite dimensional subspaces $(V_{h})$ with $V_{h}\subset W_{2}^{1}(\mathbb{M})$ associated to a parameter²²2In the sequel, these will be kernel spaces generated by a subset $\Xi\subset\mathbb{M}$ , and $h=h(\Xi,\mathbb{M})$ will be the fill distance. For now, we consider a more abstract setting, with $h>0$ only playing a role in establishing the approximation property $\mathrm{dist}_{W_{2}^{1}}(g,V_{h})\leq Ch\|g\|_{W_{2}^{2}(\mathbb{M})}$ below. $h>0$ . Define $P_{V_{h}}:W^{1}_{2}(\mathbb{M})\to V_{h}$ so that

\displaystyle a(P_{V_{h}}u,v)=a(u,v)\quad\text{for all }v\in V_{h}.

Because $a\left(u-P_{V_{h}}u,P_{V_{h}}u-v\right)=0$ , the classical Céa Lemma holds:

	$\displaystyle c_{1}\left\\|u-P_{V_{h}}u\right\\|^{2}_{W^{1}_{2}(\mathbb{M})}$	$\displaystyle\leq a\left(u-P_{V_{h}}u,u-P_{V_{h}}u\right)={\color[rgb]{0,0,0}a\left(u-P_{V_{h}}u,u-v\right)}$
		$\displaystyle\leq c_{2}\left\\|u-P_{V_{h}}u\right\\|_{W^{1}_{2}(\mathbb{M})}\left\\|u-v\right\\|_{W^{1}_{2}(\mathbb{M})}\quad\text{ for all }v\in V_{h}.$

And thus we get

\displaystyle\left\|u-P_{V_{h}}u\right\|_{W^{1}_{2}(\mathbb{M})}\leq\frac{c_{2}}{c_{1}}\mathrm{dist}_{{W^{1}_{2}(\mathbb{M})}}\left(u,V_{h}\right).

(9)

This can be improved by a Nitsche-type argument to get the following result, whose proof can be found in many textbooks on numerical methods for partial differential equations.

Lemma 1.

Suppose the family $(V_{h})$ has the property that for all $\tilde{u}\in W_{2}^{2}(\mathbb{M})$ , the distance in $W_{2}^{1}(\mathbb{M})$ from $V_{h}$ satisfies $\mathrm{dist}_{\|\cdot\|_{W^{1}_{2}(\mathbb{M})}}\left(\tilde{u},V_{h}\right)\leq Ch\left\|\tilde{u}\right\|_{W^{2}_{2}(\mathbb{M})}$ . Then for any $u\in W_{2}^{2}(\mathbb{M})$ ,

\displaystyle\left\|u-P_{V_{h}}u\right\|_{L_{2}(\mathbb{M})}\leq Ch\mathrm{dist}_{\|\cdot\|_{W^{1}_{2}(\mathbb{M})}}\left(u,V_{h}\right).

(10)

We point out, that Lemma 1 does not depend on the choice of a specific basis in the finite dimensional space $V_{h}$ .

2.5 Kernel approximation

We consider a continuous function $\phi:\mathbb{M}\times\mathbb{M}\to\mathbb{R}$ , the kernel, which satisfies a number of analytic properties which we explain in this section.

Most important is that $\phi$ is conditionally positive definite with respect to some (possibly trivial) finite dimensional subspace³³3generally a space spanned by some eigenfunctions of the Laplace-Beltrami operator $\Pi\subset C^{\infty}(\mathbb{M})$ . Conditional positive definiteness with respect to $\Pi$ means that for any $\Xi\subset\mathbb{M}$ , the collocation matrix $\bigl{(}\phi(\xi,\zeta)\bigr{)}_{\xi,\zeta}$ is positive definite on the vector space $\{a\in\mathbb{R}^{\Xi}\mid(\forall p\in\Pi)\,\sum_{\xi\in\Xi}a_{\xi}p(\xi)=0\}$ . As a result, if $\Xi$ separates elements of $\Pi$ , then the space

\displaystyle V_{\Xi}:=\left\{\sum_{\xi\in\Xi}a(\xi)\phi(\cdot,\xi)\mid(\forall p\in\Pi)\,\sum_{\xi\in\Xi}a_{\xi}p(\xi)=0\right\}+\Pi

(11)

has dimension $\Xi$ . In case $\Pi=\{0\}$ , the kernel is positive definite, and the collocation matrix is strictly positive definite on $\mathbb{R}^{\Xi}$ , and $V_{\Xi}=\mathrm{span}_{\xi\in\Xi}\phi(\cdot,\xi)$ .

For a conditionally positive definite kernel, there is an associated reproducing kernel semi-Hilbert space $\mathcal{N}(\phi)\subset C(\mathbb{M})$ with the property that $\Pi=\mathrm{null}(\|\cdot\|_{\mathcal{N}(\phi)})$ and that for any $a\in\mathbb{R}^{\Xi}$ for which $\sum_{\xi\in\Xi}a_{\xi}\delta_{\xi}\perp\Pi$ , the following identity

\sum_{\xi\in\Xi}a_{\xi}f(\xi)=\langle f,\sum_{\xi\in\Xi}a_{\xi}\phi(\cdot,\xi)\rangle_{\mathcal{N}(\phi)}

holds for all $f\in\mathcal{N}(\phi)$ . It follows that if $\Xi$ separates elements of $\Pi$ , interpolation with $V_{\Xi}$ is well defined on $\Xi$ and the projection $I_{\Xi}:\mathcal{N}(\phi)\to V_{\Xi}$ is orthogonal with respect to the semi-norm on $\mathcal{N}(\phi)$ . Of special interest are (conditionally) positive definite kernels that have Sobolev native spaces.

Lemma 2.

If $\phi$ is conditionally positive definite with respect to $\Pi$ and satisfies the equivalence $\mathcal{N}(\phi)/\Pi\cong W_{2}^{m}(\mathbb{M})/\Pi$ . then there is a constant $C$ so that for any integers $k,j$ with $0\leq k\leq j\leq m$ and any $u\in W_{2}^{j}(\mathbb{M})$ ,

\mathrm{dist}_{W_{2}^{k}(\mathbb{M})}(u,V_{\Xi})\leq Ch^{j-k}\|u\|_{W_{2}^{j}(\mathbb{M})}.

Proof.

The case $k=j$ is trivial; it follows by considering $0\in V_{\Xi}$ .

For $j=m$ and $0\leq k<m$ , the zeros estimate [23, Corollary A.13] ensures that

\|I_{\Xi}u-u\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|I_{\Xi}u-u\|_{W_{2}^{m}(\mathbb{M})}

(because the interpolation error $I_{\Xi}f-f$ vanishes on $\Xi$ ). The hypothesis then gives, $\|I_{\Xi}u-u\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|I_{\Xi}u-u\|_{\mathcal{N}(\phi)}$ , with a suitably enlarged constant. Because $I_{\Xi}$ is an orthogonal projector, $\|I_{\Xi}u-u\|_{\mathcal{N}(\phi)}\leq\|u\|_{\mathcal{N}}(\phi)$ , so we have (again enlarging the constant) that

\|I_{\Xi}u-u\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|u\|_{W_{2}^{m}(\mathbb{M})}.

For $0\leq k<j<m$ , we use the fact that $W_{2}^{j}(\mathbb{M})$ is the real interpolation space $[W_{2}^{k}(\mathbb{M}),W_{2}^{m}(\mathbb{M})]_{\frac{j-k}{m-k},2}$ . This is [22, Theorem 5]. For $\tilde{u}\in W_{2}^{j}(\mathbb{M})$ , this means that the $K$ -functional

K(\tilde{u},t)=\inf_{g\in W_{2}^{m}(\mathbb{M})}\|\tilde{u}-g\|_{W_{2}^{k}(\mathbb{M})}+t\|g\|_{W_{2}^{m}(\mathbb{M})}

satisfies the condition $\int_{0}^{\infty}(t^{-\frac{j-k}{m-k}}K(\tilde{u},t))^{2}\frac{\mathrm{d}t}{t}<\infty$ . Since $K(\tilde{u},t)$ is continuous and monotone, we have that $t\mapsto t^{-\frac{j-k}{m-k}}K(\tilde{u},t)$ is bounded on $(0,\infty)$ . Thus for $t=h^{m-k}$ , there exists $g\in W_{2}^{m}(\mathbb{M})$ so that

\displaystyle\|\tilde{u}-g\|_{W_{2}^{k}(\mathbb{M})}

\displaystyle\leq Ch^{j-k}\|\tilde{u}\|_{W_{2}^{j}(\mathbb{M})}

and

\displaystyle\|g\|_{W_{2}^{m}(\mathbb{M})}

\displaystyle\leq Ch^{j-m}\|\tilde{u}\|_{W_{2}^{j}(\mathbb{M})}.

The above estimate gives $\|I_{\Xi}g-g\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{m-k}\|g\|_{W_{2}^{m}(\mathbb{M})}$ . This implies that $\|\tilde{u}-I_{\Xi}g\|_{W_{2}^{k}(\mathbb{M})}\leq Ch^{j-k}\|\tilde{u}\|_{W_{2}^{j}(\mathbb{M})}$ as desired. ∎

As a consequence, the kernel Galerkin solution $u_{\Xi}\in V_{\Xi}$ to $\mathcal{L}u=f$ with $f\in W_{2}^{j+1}(\mathbb{M})$ satisfies $\|u-u_{\Xi}\|_{W_{2}^{1}(\mathbb{M})}\leq Ch^{j-1}\|f\|_{W_{2}^{j+1}(\mathbb{M})}$ . More importantly, for our purposes, the hypothesis of Lemma 1 is satisfied by the space $V_{\Xi}$ . Indeed, by (10), we have the following approximation property:

\|u-P_{\Xi}u\|_{L_{2}(\mathbb{M})}\leq Ch\|u\|_{W_{2}^{1}(\mathbb{M})}

(12)

which, together with a smoothing property, forms the backbone of the convergence theory for the multigrid method.

3 The Lagrange basis and stiffness matrix

For a kernel $\phi$ and a set $\Xi$ which separates points of $\Pi$ , the Lagrange basis $(\chi_{\xi})_{\xi\in\Xi}$ for $V_{\Xi}$ satisfies $\chi_{\xi}(\zeta)=\delta_{\xi,\zeta}$ for each $\xi,\zeta\in\Xi$ .

A natural consequence of $\mathcal{N}(\phi)\cong W_{2}^{m}(\mathbb{M})$ is that there exists a constant $C$ so that for any $\Xi\subset\mathbb{M}$ , the bound $\|\chi_{\xi}\|_{W_{2}^{m}(\mathbb{M})}\leq Cq^{\frac{d}{2}-m}$ holds. Especially on Riemannian manifolds without boundary, much stronger statement can be proven, see [21, 25, 23] or [14] on $\mathbb{S}^{2}$ . We are, however, not convinced that the list of settings where stronger results hold is complete. In order to allow our results to be applied in future settings where those statement will be shown, we will formulate those stronger statements as assumptions or building blocks.

A stronger result is the following:

Assumption 1.

We assume that there is $m>d/2+1$ so that $\mathcal{N}(\phi)\cong W_{2}^{m}(\mathbb{M})$ , and, furthermore, there exist constants $\nu>0$ and $C_{\mathrm{EN}}$ so that for $R>0$

\|\chi_{\xi}\|_{W_{2}^{m}(\mathbb{M}\setminus B(\xi,R))}\leq C_{\mathrm{EN}}q^{\frac{d}{2}-m}e^{-\nu\frac{R}{h}}.

This gives rise to a number of analytic properties, some of which we present here (there are many more, see [25] and [15] for a detailed discussions). For the following estimates, the constants of equivalence depend on $C_{\mathrm{EN}}$ , $\nu$ , and $\rho$ .

Pointwise decay: there exist constants $C$ and $\nu>0$ so that for any $\Xi\subset\mathbb{M}$ , the estimate

|\chi_{\xi}(x)|\leq C_{\mathrm{PW}}e^{-\nu\frac{\mathrm{dist}(x,\xi)}{h}}

(13)

holds. Here $C_{\mathrm{PW}}\leq\rho^{m-d/2}C_{\mathrm{EN}}$ , where we recall that the mesh ratio is $\rho=h/q$ .

Hölder continuity: Of later importance, we mention the following condition, which follows from [23, Corollary A.15]. For any $\epsilon<m-d/2$ , the Lagrange function is $\epsilon$ Hölder continuous, and satisfies the bound

|\chi_{\xi}(x)-\chi_{\xi}(y)|\leq C_{\mathrm{H\ddot{o}lder}}\bigl{(}\mathrm{dist}(x,y)\bigr{)}^{\epsilon}h^{-\epsilon}.

(14)

Although we make explicit the dependence on $\rho$ here, for the remainder of this article, we assume that constants that follow depend on $\rho$ .

Riesz property: there exist constants $0<C_{1}\leq C_{2}<\infty$ so that for any $\Xi\subset\mathbb{M}$ , and $a\in\mathbb{R}^{\Xi}$ , we have

C_{1}q^{d/2}\|a\|_{\ell_{2}(\Xi)}\leq\|\sum_{\xi\in\Xi}a_{\xi}\chi_{\xi}\|_{L_{2}(\mathbb{M})}\leq C_{2}q^{d/2}\|a\|_{\ell_{2}(\Xi)}.

(15)

Bernstein inequalities: There is a constant $C_{\mathrm{Bernstein}}$ so that for $0\leq k\leq m$ ,

\|\sum_{\xi\in\Xi}a_{\xi}\chi_{\xi}\|_{W_{2}^{k}(\mathbb{M})}\leq C_{\mathrm{Bernstein}}h^{d/2-k}\|a\|_{\ell_{2}(\Xi)}.

(16)

3.1 The stiffness matrix

We now discuss the stiffness matrix and some of its properties. Most of these have appeared in [26], with earlier versions for the sphere appearing in [21] and [8].

A consequence of the results of this section is that the problem of calculating the Galerkin solution to $\mathcal{L}u=f$ from $V_{\Xi}$ involves treating a problem whose condition number grows like $\mathcal{O}(h^{-2})$ – this is the fundamental issue that the multigrid method seeks to overcome.

The analysis map for $\bigl{(}\chi_{\xi}\bigr{)}_{\xi\in\Xi_{\ell}}$ with respect to the bilinear form $a$ defined in (8) is

\displaystyle\sigma_{\Xi}:\left(W^{1}_{2}(\mathbb{M}),a(\cdot,\cdot)\right)\to\left(\mathbb{R}^{\Xi},(\cdot,\cdot)_{2}\right):\quad v\mapsto\bigl{(}a(v,\chi_{\xi})\bigr{)}_{\xi\in\Xi}^{T}.

The analysis map is a surjection.

The synthesis map is

\displaystyle\sigma^{\ast}_{\Xi}:\left(\mathbb{R}^{\Xi},(\cdot,\cdot)_{2}\right)\to\left(W^{1}_{2}(\mathbb{M}),a(\cdot,\cdot)\right):\quad\boldsymbol{\mathbf{w}}\mapsto\sum_{\xi\in\Xi}w_{\xi}\chi_{\xi}.

The range of the synthesis map is clearly $V_{\Xi}$ ; in other words, it is the natural isomorphism between Euclidean space and the finite dimensional kernel space; indeed, (15) shows that it is bounded above and below between $L_{2}(\mathbb{M})$ and $\ell_{2}(\Xi)$ . By abusing notation slightly, we write $\bigl{(}\sigma^{\ast}_{\Xi}\bigr{)}^{-1}:V_{\Xi}\to\mathbb{R}^{\Xi}$ . This permits a direct matrix representation of linear operators on $V_{\Xi}$ via conjugation: $S\mapsto\mathbf{S}:=(\sigma_{\Xi}^{*})^{-1}S\sigma_{\Xi}^{*}\in\mathbb{R}^{\Xi\times\Xi}$ . Furthermore, by the Riesz property (15), we have

\displaystyle\frac{C_{1}}{C_{2}}\|\mathbf{S}\|_{\ell_{2}(\Xi)\to\ell_{2}(\Xi)}\leq\|S\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq\frac{C_{2}}{C_{1}}\|\mathbf{S}\|_{\ell_{2}(\Xi)\to\ell_{2}(\Xi)}.

(17)

A simple calculation shows that $a\left(\sigma^{\ast}_{\Xi}(\boldsymbol{\mathbf{w}}),v\right)=\left\langle\boldsymbol{\mathbf{w}},\sigma_{\Xi}(v)\right\rangle_{2}$ , so $\sigma_{\Xi}^{\ast}$ is the $a$ -adjoint of $\sigma_{\Xi}$ . Of course, when $a$ is symmetric, we also have $\left\langle\sigma_{\ell}(v),\boldsymbol{\mathbf{w}}\right\rangle_{2}=a\left(v,\sigma^{\ast}_{\ell}(\boldsymbol{\mathbf{w}})\right)$ .

The stiffness matrix is defined as

\displaystyle\boldsymbol{\mathbf{A}}_{\Xi}:=(A_{\xi,\eta})_{\xi,\eta\in\Xi}:=\Bigl{(}a\bigl{(}\chi_{\xi},\chi_{\zeta}\bigr{)}\Bigr{)}_{\xi,\zeta\in\Xi}.

(18)

It represents the operator $\mathcal{L}$ on the finite dimensional space $V_{\Xi}$ . Using the analysis and synthesis maps, $\boldsymbol{\mathbf{A}}_{\Xi}=\sigma_{\Xi}\circ\sigma^{\ast}_{\Xi}:\left(\mathbb{R}^{\Xi},(\cdot,\cdot)_{2}\right)\to\left(\mathbb{R}^{\Xi},(\cdot,\cdot)_{2}\right):\boldsymbol{\mathbf{c}}\mapsto\left(a(\chi_{\xi},\chi_{\zeta})\right)_{\xi,\zeta\in\Xi}\boldsymbol{\mathbf{c}}$ , and we have that the Galerkin projector $P_{\Xi}:W_{2}^{1}(\mathbb{M})\to V_{\Xi}$ can be expressed as $P_{\Xi}=\sigma^{\ast}_{\Xi}\left(\sigma_{\Xi}\circ\sigma^{\ast}_{\Xi}\right)^{-1}\sigma_{\Xi}$ .

Lemma 3.

There is a constant $C_{\mathrm{stiff}}$ so that the entries of the stiffness matrix satisfy

|A_{\xi,\eta}|\leq C_{\mathrm{stiff}}h^{d-2}e^{-\frac{\nu}{2}\frac{\mathrm{dist}(\xi,\eta)}{h}}.

Proof.

For a tensor field $F\in\mathcal{T}_{1}\mathbb{M}$ , we write $|F|:\mathbb{M}\to\mathbb{R}:x\mapsto\|F(x)\|_{x}$ . Thus using $|\nabla\chi_{\xi}|(x)=\|\nabla\chi_{\xi}(x)\|_{x}$ , we can bound the integral

	$\displaystyle\|a(\chi_{\xi},\chi_{\eta})\|$	$\displaystyle\leq\\|\textsc{a}_{2}\\|_{\infty}\langle\|\nabla\chi_{\xi}\|,\|\nabla\chi_{\eta}\|\rangle_{L_{2}(\mathbb{M})}+\\|\textsc{a}_{1}\\|_{\infty}\langle\|\nabla\chi_{\xi}\|,\|\chi_{\eta}\|\rangle_{L_{2}(\mathbb{M})}$
		$\displaystyle+\\|\textsc{a}_{0}\\|_{\infty}\langle\|\chi_{\xi}\|,\|\chi_{\eta}\|\rangle_{L_{2}(\mathbb{M})}.$

Decompose each inner product using the half spaces $H_{+}:=\{x\mid\mathrm{dist}(x,\xi)<\mathrm{dist}(x,\eta)\}$ and $H_{-}=\mathbb{M}\setminus H_{+}$ , noting $H_{+}\subset\mathbb{M}\setminus B(\eta,R)$ and $H_{-}\subset\mathbb{M}\setminus B(\xi,R)$ , with $R=\mathrm{dist}(\xi,\eta)/2$ . By applying Cauchy-Schwarz to each integral gives, after combining terms,

|A_{\xi,\eta}|\leq C_{\mathcal{L}}(\|\chi_{\xi}\|_{W_{2}^{1}(\mathbb{M})}\|\chi_{\eta}\|_{W_{2}^{1}(\mathbb{M}\setminus B(\eta,R))}+\|\chi_{\xi}\|_{W_{2}^{1}(\mathbb{M}\setminus B(\xi,R))}\|\chi_{\eta}\|_{W_{2}^{1}(\mathbb{M})})

for some constant $C_{\mathcal{L}}$ depending on the coefficients of $\mathcal{L}$ .

We have $\|\chi_{\xi}\|_{W_{2}^{1}(\mathbb{M})}\leq C_{\mathrm{Bernstein}}h^{d/2-1}$ by the Bernstein inequality (16) (with a similar estimate for $\chi_{\eta}$ ). The zeros estimate for complements of balls, [23, Corollary A.17], applied to $\chi_{\xi}$ gives

\|\chi_{\xi}\|_{W_{2}^{1}(\mathbb{M}\setminus B(\xi,R))}\leq C_{\mathrm{zeros}}h^{m-1}\|\chi_{\xi}\|_{W_{2}^{m}(\mathbb{M}\setminus B(\xi,R))}

(with a similar estimate for $\chi_{\eta}$ ). Thus we have

	$\displaystyle\|A_{\xi,\eta}\|$	$\displaystyle\leq$	$\displaystyle C_{\mathcal{L}}C_{\mathrm{Bernstein}}C_{\mathrm{zeros}}h^{d/2+m-2}(\\|\chi_{\xi}\\|_{W_{2}^{m}(\mathbb{M}\setminus B(\xi,R))}+\\|\chi_{\eta}\\|_{W_{2}^{m}(\mathbb{M}\setminus B(\eta,R))})$
		$\displaystyle\leq$	$\displaystyle 2C_{\mathcal{L}}C_{\mathrm{Bernstein}}C_{\mathrm{zeros}}C_{\mathrm{EN}}h^{d/2+m-2}q^{d/2-m}e^{-\nu R/h}.$

The lemma follows with $C_{\mathrm{stiff}}=2C_{\mathcal{L}}C_{\mathrm{Bernstein}}C_{\mathrm{zeros}}C_{\mathrm{EN}}\rho^{m-d/2}$ . ∎

By considering row and column sums, we have that

\displaystyle\left\|\boldsymbol{\mathbf{A}}_{\Xi}\right\|_{2\to 2}\leq C_{A}h^{d-2}

(19)

holds with $C_{A}=C_{\mathrm{stiff}}(1+\sum_{n=1}^{\infty}(n+2)^{d}e^{-\frac{\nu}{2\rho}n})$ .

Lemma 4.

Under Assumptions 1, for $\Xi\subset\mathbb{M}$ , the stiffness matrix satisfies

\|\boldsymbol{\mathbf{A}}_{\Xi}^{-1}\|_{\ell_{2}\to\ell_{2}}\leq C_{\mathrm{H\ddot{o}lder}}h^{-d}

with a constant $C_{\mathrm{H\ddot{o}lder}}$ which is independent of $\Xi$ .

Proof.

Coercivity ensures $|a(\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi},\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi})|\geq c_{0}\left\|\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi}\right\|^{2}_{W_{2}^{1}(\mathbb{M})}$ , and the metric equivalence $\|v\|_{W_{2}^{1}(\mathbb{M})}^{2}=\|(1-\Delta)^{1/2}v\|_{L_{2}(\mathbb{M})}$ gives

\displaystyle\boldsymbol{\mathbf{v}}\cdot\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}}\geq c_{0}\boldsymbol{\mathbf{v}}\cdot\boldsymbol{\mathbf{L}}\boldsymbol{\mathbf{v}},\quad\text{where}\quad\boldsymbol{\mathbf{L}}:=\left(\langle(1-\Delta)^{1/2}\chi_{\xi},(1-\Delta)^{1/2}\chi_{\eta}\rangle\right)_{\xi,\eta\in\Xi}

is the stiffness matrix for the self-adjoint operator $1-\Delta$ . Because the spectrum of $1-\Delta$ is bounded below, i.e., $\sigma_{\mathrm{spec}}(1-\Delta)\subset[1,\infty)$ , we conclude that $\boldsymbol{\mathbf{v}}\cdot\boldsymbol{\mathbf{L}}\boldsymbol{\mathbf{v}}\geq\left\|\sum_{\xi\in\Xi}v_{\xi}\chi_{\xi}\right\|^{2}_{L_{2}(\mathbb{M})}\geq C_{1}h^{d}\left\|\boldsymbol{\mathbf{v}}\right\|^{2}_{\ell_{2}(\Xi)}$ by the Riesz property (15). Hence, overall we get

\displaystyle\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}\geq\frac{1}{\|\boldsymbol{\mathbf{v}}\|_{\ell_{2}}}|\boldsymbol{\mathbf{v}}\cdot\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}}|\geq ch^{d}\left\|\boldsymbol{\mathbf{v}}\right\|_{\ell_{2}}

with $c=c_{0}C_{1}$ , so $\|\boldsymbol{\mathbf{A}}_{\Xi}^{-1}\|_{\ell_{2}\to\ell_{2}}\leq\frac{1}{c_{0}C_{1}}h^{-d}$ . ∎

Consequently, the $\ell_{2}$ condition number of $\boldsymbol{\mathbf{A}}_{\Xi}$ is bounded by a multiple of $h^{-2}$ . Please note that the bounds in Lemma 3 and Lemma 4 do not assume the stiffness matrix to be symmetric, only these bounds are in some sense symmetric. Moreover, (19) is obtained by Riesz-Thorin interpolation and bounds on the row and column sums. Thus this bound also does not require the stiffness matrix to be symmetric.

3.2 The diagonal of the stiffness matrix

As a counterpart to the off-diagonal decay given in Lemma 3, we can give the following lower bounds on the diagonal entries.

Lemma 5.

For an elliptic operator $\mathcal{L}$ and a kernel satisfying Assumption 1, for a mesh ratio $\rho$ , there is $C_{\mathrm{diag}}>0$ so that for any $\Xi\subset\mathbb{M}$ with $h/q<\rho$ , and any $\xi\in\Xi$ , we have

A_{\xi,\xi}=a(\chi_{\xi},\chi_{\xi})\geq C_{\mathrm{diag}}h^{d-2}.

Proof.

By coercivity of $a$ , it suffices to prove that $\|\nabla\chi_{\xi}\|_{L_{2}(B(\xi,h))}^{2}\gtrsim h^{d-2}.$

We begin by establishing a Poincaré-type inequality which is valid for smooth Lagrange functions. To this end, consider $f:B(0,\mathrm{r}_{\mathbb{M}})\to\mathbb{R}$ obtained by the change of variable $f(x)=\chi_{\xi}(\exp_{\xi}(x))$ . Note that for $B\subset B(\xi,\mathrm{r}_{\mathbb{M}})$ , we have for any $0\leq k\leq m$ , that

\|f\|_{W_{2}^{k}(B)}\leq\frac{1}{\Gamma_{1}}\|\chi_{\xi}\|_{W_{2}^{k}(\mathbb{M})}\leq\frac{1}{\Gamma_{1}}(CZh^{m-k})(C_{\mathrm{EN}}\rho^{m-d/2}h^{d/2-m})=C_{1}h^{d/2-k}.

by the zeros estimate, with $C_{1}:=\frac{1}{\Gamma_{1}}\rho^{m-d/2}C_{\mathrm{EN}}C_{\mathrm{zeros}}$ , where the constants stem from (4). Now let $r:=\mathrm{dist}(\xi,\Xi\setminus\{\xi\})$ , and define $F:B(0,1)\to\mathbb{R}$ by $F:=f(r\cdot)$ . Then, by a change of variable,

\|F\|_{W_{2}^{m}(B(0,1))}^{2}=\sum_{k=0}^{m}|F|_{W_{2}^{k}(B(0,1))}^{2}=\sum_{k=0}^{m}r^{2k-d}|f|_{W_{2}^{k}(B(0,r))}^{2}\leq C_{2}

with $C_{2}:=\rho^{d}C_{1}$ , since $h/\rho\leq q\leq r\leq h$ .

Because of the imbedding $W_{2}^{m}(B(0,1))\subset C(\overline{B(0,1)})$ (which holds since $m>d/2$ ), the set

K:=\left\{G\in W_{2}^{m}(B(0,1))\mid\|G\|_{W_{2}^{m}(B(0,1))}\leq C_{2},\,G(0)=1,G(e_{1})=0\right\}

is well defined, closed and convex, hence weakly compact, by Banach-Alaoglu.

The natural imbedding $\iota:W_{2}^{m}(B(0,1))\to W_{2}^{1}(B(0,1))$ , is compact. We wish to show that $\iota(K)$ is a compact set.

Because $\iota$ is a continuous linear map, it is continuous between the weak topologies of $W_{2}^{m}(B(0,1))$ and $W_{2}^{1}(B(0,1))$ . Thus $\iota(K)$ is weakly compact in $W_{2}^{1}(B(0,1))$ , and thus norm closed. Finally, because $\iota(K)$ is complete and totally bounded, it is a compact subset in the norm topology of $W_{2}^{1}(B(0,1))$ .

Consider the (possibly zero) constant $c$ defined by

c:=\min_{G\in K}\frac{\|\nabla G\|_{L_{2}(B(0,1))}}{\|G\|_{L_{2}(B(0,1))}}.

The map $I:W_{2}^{1}(B(0,1))\to\mathbb{R}:G\mapsto\frac{\|\nabla G\|_{L_{2}}}{\|G\|_{L_{2}}}$ is continuous on the complement of $0\in W_{2}^{1}(B(0,1))$ (as quotient of two continuous functions that do not vanish). In particular, it is continuous and non-vanishing on $\iota(K)$ , so $c=\min_{G\in\iota(K)}I(G)>0$ . Indeed, $I(G)>0$ for all $G\in\iota(K)$ , since $G(0)=1$ , $G(e_{1})=0$ and $G\in C^{1}(B)$ .

Note that in the above minimization problem, the condition $G(e_{1})=0$ could be replaced by any other point on the unit sphere without changing the value of $c$ . By rotation invariance of the $W_{2}^{m}(\mathbb{R}^{d})$ norm, it follows that $\|\nabla F\|_{L_{2}}\geq c\|F\|_{L_{2}}$ . Finally, employing the change of variables $r^{d-2}\|\nabla F\|_{L_{2}(B(0,1))}^{2}=\|\nabla f\|_{L_{2}(B(0,r))}^{2}$ and $\|F\|_{L_{2}(B(0,1))}^{2}=r^{d}\|f\|_{L_{2}(B(0,r))}^{2}$ , we have

$\displaystyle\\|\nabla\chi_{\xi}\\|_{L_{2}(B(\xi,r))}^{2}$	$\displaystyle\geq$	$\displaystyle{\Gamma_{1}^{2}}r^{d-2}\\|\nabla F\\|_{L_{2}(B(0,1))}^{2}$
	$\displaystyle\geq$	$\displaystyle c^{2}{\Gamma_{1}^{2}}r^{d-2}\\|F\\|_{L_{2}(B(0,1))}^{2}$
	$\displaystyle\geq$	$\displaystyle c^{2}\Gamma_{1}^{2}\rho^{-2}h^{-2}\\|\chi_{\xi}\\|_{L_{2}(B(\xi,r))}^{2}.$

The last line follows because $\chi_{\xi}$ is Hölder continuous, so stays close to $1$ near $\xi$ . Specifically, by (14), for $\kappa:=\left(2C_{\mathrm{H\ddot{o}lder}}\right)^{-1/\epsilon}$ we have $\chi_{\xi}(x)>\frac{1}{2}$ for all $x\in B(\xi,\kappa h)$ . Thus

\|\chi_{\xi}\|_{L_{2}(B(\xi,r))}^{2}\geq\int_{B(\xi,\kappa h)}|\chi_{\xi}(x)|^{2}\mathrm{d}x\geq\frac{1}{4}\alpha_{\mathbb{M}}(\kappa h)^{d}.

The lemma follows with constant $C_{\mathrm{diag}}=\frac{1}{4}c^{2}\Gamma_{1}^{2}\rho^{-2}\alpha_{\mathbb{M}}\kappa^{d}$ . ∎

This brings us to the lower bound for diagonal entries of the stiffness matrix. Define the diagonal of $\boldsymbol{\mathbf{A}}_{\Xi}$ as $\boldsymbol{\mathbf{B}}_{\Xi}:=\mathrm{diag}(\boldsymbol{\mathbf{A}}_{\Xi}),$ and note that by Lemma 3 and Lemma 5,

\kappa(\boldsymbol{\mathbf{B}}_{\Xi})=\frac{\max_{\xi}A_{\xi,\xi}}{\min_{\xi}A_{\xi,\xi}}\leq\frac{C_{\mathrm{stiff}}}{C_{\mathrm{diag}}}

is bounded above by a constant which depends only on the mesh ratio $\rho$ (and not on $\Xi$ ).

This permits us to find suitable damping constants $0<\theta<1$ so that $\boldsymbol{\mathbf{B}}_{\Xi}$ dominates $\theta\boldsymbol{\mathbf{A}}_{\Xi}$ . This drives the success of the damped Jacobi method considered in the next section.

Lemma 6.

For an elliptic operator $\mathcal{L}$ , a kernel $\phi$ satisfying Assumption 1, and mesh ratio $\rho$ , there is $\theta\in(0,1)$ so that for any point set $\Xi\subset\mathbb{M}$ , $\theta\langle\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle\leq\langle\boldsymbol{\mathbf{B}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle$ for all $\boldsymbol{\mathbf{v}}\in\mathbb{R}^{\Xi}$ .

Proof.

By (19), $\langle\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle\leq C_{A}h^{d-2}\|v\|^{2}$ , while $C_{\mathrm{diag}}h^{d-2}\|v\|^{2}\leq\langle\boldsymbol{\mathbf{B}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{x}}\rangle$ follows from Lemma 5. Thus the lemma holds for any $\theta$ in the interval $(0,{C_{\mathrm{diag}}}/{C_{A}}]$ . ∎

4 The smoothing property

In this section we define and study the smoothing operator used in the multigrid method. We focus on the damped Jacobi method for the linear system $\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{u}}^{\star}_{\Xi}=\boldsymbol{\mathbf{b}}$ , where $\boldsymbol{\mathbf{A}}_{\Xi}\in\mathbb{R}^{|\Xi|\times|\Xi|}$ and $\boldsymbol{\mathbf{b}}\in\mathbb{R}^{|\Xi|}$ . For a fixed damping parameter $0<\theta<1$ , $\boldsymbol{\mathbf{u}}^{(j)}$ is approximately computed via the iteration:

\boldsymbol{\mathbf{u}}^{(j+1)}=\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{b}}+(\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi})\boldsymbol{\mathbf{u}}^{(j)},j\geq 0,

with starting value $\boldsymbol{\mathbf{u}}^{(0)}\in\mathbb{R}^{|\Xi|}$ . Define the affine map governing a single iteration as

\displaystyle{J}_{\Xi}:\mathbb{R}^{\Xi}\times\mathbb{R}^{|\Xi|}\to\mathbb{R}^{|\Xi|},\quad{J}_{\Xi}(\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{b}})=\left(\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}\right)\boldsymbol{\mathbf{u}}+\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{b}}.

This shows that the iteration converges if and only if iteration matrix

\displaystyle\boldsymbol{\mathbf{S}}_{\Xi}:=\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi},

(20)

called the smoothing matrix, is a contraction.

In the context of kernel approximation, the corresponding operator, the smoothing operator, $\mathcal{S}_{\Xi}:V_{\Xi}\to V_{\Xi}$ is defined as $\mathcal{S}_{\Xi}\sigma_{\Xi}^{*}:=\sigma_{\Xi}^{*}\boldsymbol{\mathbf{S}}_{\Xi}.$ The success of the multigrid method relies on a smoothing property, which for our purposes states that iterating $\mathcal{S}_{\Xi}$ is eventually contracting: $\|\mathcal{S}_{\Xi}^{\nu}\|_{L_{2}(\mathbb{M})\to\mathcal{L}}\leq Ch^{-1}o(\nu)$ as $\nu\to\infty$ . This smoothing property is demonstrated in section 4.2. Many of the results in the forthcoming section are formulated both in a matrix including $\boldsymbol{\mathbf{S}}_{\Xi}$ form and an operator form $\mathcal{S}_{\Xi}$ . This resembles a bit a change of basis transformation.

4.1 $L_{2}$ stability of the smoothing operator

At this point, we are in a position to show that that iterating this operator is stable on $L_{2}(\mathbb{M})$ . To help analyze the matrix $\boldsymbol{\mathbf{S}}_{\Xi}=\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}$ , we introduce the inner product

\langle\boldsymbol{\mathbf{u}},\boldsymbol{\mathbf{v}}\rangle_{\boldsymbol{\mathbf{B}}_{\Xi}}:=\langle\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}\boldsymbol{\mathbf{u}},\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}\boldsymbol{\mathbf{v}}\rangle_{\ell_{2}(\mathbb{R}^{\Xi})}.

Since $\boldsymbol{\mathbf{B}}_{\Xi}$ is diagonal, and its diagonal entries are $\langle\chi_{\xi},\chi_{\xi}\rangle_{\mathcal{L}}\sim h^{d-2}$ , we have the norm equivalence $\|\boldsymbol{\mathbf{M}}\|_{\boldsymbol{\mathbf{B}}_{\Xi}\to\boldsymbol{\mathbf{B}}_{\Xi}}\sim\|\boldsymbol{\mathbf{M}}\|_{2\to 2}$ . Specifically, we have

\frac{1}{C_{\boldsymbol{\mathbf{B}}}}\|\boldsymbol{\mathbf{M}}\|_{2\to 2}\leq\|\boldsymbol{\mathbf{M}}\|_{\boldsymbol{\mathbf{B}}\to\boldsymbol{\mathbf{B}}}\leq C_{\boldsymbol{\mathbf{B}}}\|\boldsymbol{\mathbf{M}}\|_{2\to 2}

(21)

with constant of equivalence $C_{\boldsymbol{\mathbf{B}}}=\kappa(\boldsymbol{\mathbf{B}}_{\Xi}^{1/2})=\frac{\max\sqrt{a(\chi_{\xi},\chi_{\xi})}}{\min\sqrt{a(\chi_{\xi},\chi_{\xi})}}\leq\sqrt{C_{\mathrm{stiff}}/C_{\mathrm{diag}}}$ .

Lemma 7.

For the damping parameter $\theta$ in Lemma 6 and $\boldsymbol{\mathbf{S}}_{\Xi}$ as defined in (20), there is $C>0$ so that for all $\nu\in\mathbb{N}$ , $\boldsymbol{\mathbf{u}}\in\mathbb{R}^{\Xi}$ and $u=\sigma_{\Xi}^{*}\boldsymbol{\mathbf{u}}\in V_{\Xi}$ ,

\|\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\|_{2\to 2}\leq\kappa(\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}),\quad\text{and}\quad\|\mathcal{S}_{\Xi}^{\nu}u\|_{L_{2}(\mathbb{M})}\leq\frac{C_{2}}{C_{1}}\kappa(\boldsymbol{\mathbf{B}}_{\Xi}^{1/2})\|u\|_{{\color[rgb]{0,0,0}L_{2}(\mathbb{M})}}.

We note that this holds even when $a$ is non-symmetric.

Proof.

The matrix $\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}(\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi})\boldsymbol{\mathbf{B}}_{\Xi}^{-1/2}$ has the same spectrum as the matrix $\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi}$ . Furthermore, because $0\leq\langle\boldsymbol{\mathbf{B}}_{\Xi}-\theta\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle\leq\langle\boldsymbol{\mathbf{B}}_{\Xi}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle$ , the spectral radius of $\boldsymbol{\mathbf{S}}_{\Xi}$ is no greater than $1$ . Thus

\|\boldsymbol{\mathbf{S}}_{\Xi}\|_{\boldsymbol{\mathbf{B}}_{\Xi}\to\boldsymbol{\mathbf{B}}_{\Xi}}=\|\boldsymbol{\mathbf{B}}_{\Xi}^{1/2}(\mathrm{id}-\theta\boldsymbol{\mathbf{B}}_{\Xi}^{-1}\boldsymbol{\mathbf{A}}_{\Xi})\boldsymbol{\mathbf{B}}_{\Xi}^{-1/2}\|_{2\to 2}\leq 1.

It follows that $\|\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\|_{\boldsymbol{\mathbf{B}}_{\Xi}\to\boldsymbol{\mathbf{B}}_{\Xi}}\leq 1$ as well, and the matrix norm equivalence (21) guarantees that $\|\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\|_{2\to 2}\leq\kappa(\boldsymbol{\mathbf{B}}_{\Xi}^{1/2})$ . The second inequality follows from the Riesz property (15). ∎

4.2 Smoothing properties

For $v=\sigma_{\Xi}^{*}\boldsymbol{\mathbf{v}}\in V_{\Xi}$ , we have $[v]_{\mathcal{L}}=\sqrt{a(v,v)}=\sqrt{\langle\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{v}},\boldsymbol{\mathbf{v}}\rangle}$ . By applying Cauchy-Schwarz, the Riesz property and Lemma 7, the chain of inequalities

[\mathcal{S}_{\Xi}^{\nu}v]_{\mathcal{L}}\leq\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}^{1/2}\|\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}^{1/2}\leq Ch^{-d/4}\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}^{1/2}\|v\|_{L_{2}}^{1/2}

holds. In the case that $a$ is symmetric, we have the following smoothness property.

Lemma 8.

For a given $\rho>0$ there is a constant $C$ so that for any $\theta$ chosen as in Lemma 6, $\boldsymbol{\mathbf{S}}_{\Xi}$ as defined in (20), and $\Xi\subset\mathbb{M}$ with mesh ratio $\rho(\Xi)\leq\rho$ , the damped Jacobi iteration has smoothing operator $\mathcal{S}_{\Xi}\in L(V_{\Xi})$ which satisfies

\|\mathcal{S}_{\Xi}^{\nu}v\|_{\mathcal{L}}\leq{C}{\theta}^{-1/2}\sqrt{\frac{1}{\nu+1}}h^{-1}\|v\|_{L_{2}}.

Proof.

This is a result of [13, Theorem 7.9], which shows that $\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}\leq\frac{Ch^{d-2}}{\theta(\nu+1)}\|\boldsymbol{\mathbf{v}}\|_{\ell_{2}}$ . It follows that $\|\boldsymbol{\mathbf{A}}_{\Xi}\boldsymbol{\mathbf{S}}_{\Xi}^{\nu}\boldsymbol{\mathbf{v}}\|_{\ell_{2}}\leq\frac{Ch^{d/2-2}}{\theta(\nu+1)}\|{v}\|_{L_{2}}$ , and the result holds by the above discussion. ∎

When $a$ is not symmetric, we have the smoothness property.

Lemma 9.

Let $\mathcal{S}_{\Xi}:V_{\Xi}\to V_{\Xi}:v=\sum v_{\xi}\chi_{\xi}\mapsto\sum(\boldsymbol{\mathbf{S}}_{\Xi}\boldsymbol{\mathbf{v}})_{\xi}\chi_{\xi}$ . For $\theta$ as in Lemma 6 and $\boldsymbol{\mathbf{S}}_{\Xi}$ as defined in (20) we have

[\mathcal{S}_{\Xi}^{\nu}v]_{\mathcal{L}}\leq\frac{C}{\sqrt[4]{\nu}}h^{-1}\|v\|_{L_{2}}.

Proof.

This follows from [13, Theorem 7.17], and by techniques of the proof of Lemma 8. ∎

5 The direct kernel multigrid method

We are now in a position to consider the multigrid method applied to the kernel based Galerkin method we have described in the previous sections.

5.1 Setup: Grid transfer

We consider a nested sequence of point sets

\displaystyle\Xi_{0}\subset\Xi_{1}\subset\dots\Xi_{\ell}\subset\Xi_{\ell+1}\subset\dots\subset\Xi_{L}\subset\mathbb{M}

and associated kernel spaces $V_{\Xi_{\ell}}$ as described in section 2.5. We denote the Lagrange basis for each such space $(\chi^{(\ell)}_{\xi})_{\xi\in\Xi_{\ell}}$ , and with it the accompanying analysis map $\sigma_{\ell}:=\sigma_{\Xi_{\ell}}$ , synthesis map $\sigma_{\ell}^{*}:=\sigma_{\Xi_{\ell}}^{*}$ and stiffness matrix $\boldsymbol{\mathbf{A}}_{\ell}:=\boldsymbol{\mathbf{A}}_{\Xi_{\ell}}$ . Moreover, we assume that there are constants $0<\gamma_{1}\leq\gamma_{2}<1$ and $\rho\geq 1$ such that

\displaystyle q_{\Xi_{\ell}}\leq h_{\Xi_{\ell}}\leq\rho q_{\Xi_{\ell}},\quad\text{and}\quad\gamma_{1}h_{\Xi_{\ell}}\leq h_{\Xi_{\ell+1}}\leq\gamma_{2}h_{\Xi_{\ell}}.

(22)

Note at this point that $\rho$ is a universal constant. Hence $\rho$ does not depend on $\ell$ and thus the constants in deriving the smoothing property do not depend on $\ell$ . Thus, we obtain $n_{\ell}\sim h^{-d}_{\Xi_{\ell}}$ , for constants see (2). We will assume that $L$ is the largest index that we will consider.

In this section, we discuss grid transfer: specifically, the operators and matrices which provide communication between finite dimensional kernel spaces. These include natural prolongation and restriction maps and their corresponding matrices. We show how these can be used to relate Galerkin projectors $P_{\Xi_{\ell}}$ and stiffness matrices $\boldsymbol{\mathbf{A_{\ell}}}$ .

Prolongation and restriction

Denote the Lagrange basis of $V_{\Xi_{\ell}}$ by $(\chi_{\xi}^{(\ell)})$ , and note that by containment $V_{\Xi_{\ell-1}}\subset V_{\Xi_{\ell}}$ , it follows that $\chi^{(\ell-1)}_{\xi}=\sum_{\eta\in\Xi_{\ell}}\beta_{\xi,\eta}\chi^{(\ell)}_{\eta}$ holds for some matrix of coefficients $\beta_{\xi,\ell}$ . Furthermore, from the Lagrange property, the identity $\chi^{(\ell-1)}_{\xi}(\eta)=\sum_{\zeta\in\Xi_{\ell}}\chi^{(\ell-1)}_{\xi}(\zeta)\chi^{(\ell)}_{\zeta}(\eta)$ holds for any $\eta\in\Xi_{\ell-1}$ . By uniqueness, we deduce that $\beta_{\xi,\eta}=\chi^{(\ell-1)}_{\xi}(\eta)$ , and we have

\displaystyle\sum_{\xi\in\Xi_{\ell-1}}c_{\xi}\chi^{(\ell-1)}_{\xi}=\sum_{\xi\in\Xi_{\ell-1}}c_{\xi}\sum_{\eta\in\Xi_{\ell}}\chi^{(\ell-1)}_{\xi}(\eta)\chi^{(\ell)}_{\eta}=\sum_{\eta\in\Xi_{\ell}}\sum_{\xi\in\Xi_{\ell-1}}c_{\xi}\chi^{(\ell-1)}_{\xi}(\eta)\chi^{(\ell)}_{\eta}.

This yields that the natural injection $\mathcal{I}_{\ell-1}^{(\ell)}:V_{\Xi_{\ell-1}}\to V_{\Xi_{\ell}}$ , called the prolongation map, which is described by the rectangular matrix

\boldsymbol{\mathbf{p}}_{\ell}:=\left(\chi^{(\ell-1)}_{\xi}(\eta)\right)_{\xi\in\Xi_{\ell-1},\eta\in\Xi_{\ell}}=(\sigma_{\ell}^{*})^{-1}\sigma_{\ell-1}^{*}\ \in\mathbb{R}^{n_{\ell}\times n_{\ell-1}}.

It is worth noting that $\mathcal{I}_{\ell-1}^{\ell}\sigma_{\ell-1}^{*}=\sigma_{\ell-1}^{*}$ , so we have the identity

\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}=\sigma_{\ell-1}^{*}.

(23)

The corresponding restriction map $\mathcal{I}_{\ell}^{(\ell-1)}:V_{\Xi_{\ell}}\to V_{\Xi_{\ell-1}}$ is described by the transposed matrix $\boldsymbol{\mathbf{r}}_{\ell}=\left(\boldsymbol{\mathbf{p}}_{\ell}\right)^{T}$ . In other words, it is defined as $\mathcal{I}_{\ell}^{(\ell-1)}\sigma_{\ell}^{*}=\sigma_{\ell-1}^{*}\left(\boldsymbol{\mathbf{p}}_{\ell}\right)^{T}$ .

Note that we can use $\boldsymbol{\mathbf{r}}_{\ell}=\left(\boldsymbol{\mathbf{p}}_{\ell}\right)^{T}$ to relate analysis maps at different levels, since we can take the $a$ -adjoint of both sides of the equation (23) to obtain the following useful identity:

\sigma_{\ell-1}=\left(\boldsymbol{\mathbf{p}}_{\ell}\right)^{T}\sigma_{\ell}.

(24)

Moreover, the prolongation is both bounded from above and below. This is a kernel based analogue for [13, Eq. (64)].

Lemma 10.

Using the notation from above, there is a constant $C_{pro}\geq 1$ depending on $\gamma$ , $\rho$ , $\mathbb{M}$ and the constants in Assumption 2 so that

\displaystyle 1

\displaystyle\leq\left\|\boldsymbol{\mathbf{p}}_{\ell}\right\|_{\ell_{2}\left(\Xi_{\ell-1}\right)\to\ell_{2}\left(\Xi_{\ell}\right)}\leq C_{pro}

(25)

holds for all $\ell\geq 1$ .

Proof.

We begin by estimating the $\ell_{1}(\Xi_{\ell-1})\to\ell_{1}(\Xi_{\ell})$ and $\ell_{\infty}(\Xi_{\ell-1})\to\ell_{\infty}(\Xi_{\ell})$ norms of $\boldsymbol{\mathbf{p}}_{\ell}$ by taking column and row sums, respectively. These estimates can be made almost simultaneously, because the $(\xi,\eta)$ entry of $\boldsymbol{\mathbf{p}}_{\ell}$ satisfies the bound $|\chi^{(\ell-1)}_{\xi}(\eta)|\leq C_{\mathrm{PW}}\exp(-\nu\frac{\mathrm{dist}(\xi,\eta)}{h_{\ell-1}})$ by (13).

Let $A:=\sum_{n=1}^{\infty}(n+2)^{d}\exp(-\frac{\nu}{\rho}n)$ and $B:=\sum_{n=1}^{\infty}(n+2)^{d}\exp(-\frac{\nu\gamma}{\rho}n)$ , and note that both numbers depend on $\rho,\gamma$ and the exponential decay rate $\nu$ from (13). Applying (3), gives

\sum_{\xi\in\Xi_{\ell-1}}|\chi^{(\ell-1)}_{\xi}(\eta)|\leq C_{\mathrm{PW}}\left(1+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}A\right)

and

\sum_{\eta\in\Xi_{\ell}}|\chi^{(\ell-1)}_{\xi}(\eta)|\leq C_{\mathrm{PW}}\left(1+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}B\right).

Finally, interpolation gives the upper bound

C_{pro}:=C_{\mathrm{PW}}\sqrt{(1+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}B)(1+\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}A)}.

By using the definition $\Upsilon_{\ell}:=\Xi_{\ell}\setminus\Xi_{\ell-1}$ we write $\boldsymbol{\mathbf{P}}_{a}=(\chi_{\xi}(\eta))_{\xi,\eta\in\Xi_{\ell-1}}$ and $\boldsymbol{\mathbf{P}}_{b}=(\chi_{\xi}(\zeta))_{\xi\in\Xi_{\ell-1},\zeta\in\Upsilon_{\ell}}$ . Thus, we have $\left\|\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{c}}\right\|^{2}_{2}=\|\boldsymbol{\mathbf{c}}\|^{2}_{2}+\|\boldsymbol{\mathbf{P}}_{b}\tilde{\boldsymbol{\mathbf{c}}}\|^{2}_{2}\geq\|\boldsymbol{\mathbf{c}}\|^{2}_{2}$ , which gives the lower bound. ∎

5.2 multigrid iteration – two level case

We now describe the multigrid algorithm, which is a composition of smoothing operators, restriction, coarse grid correction, prolongation, and then smoothing.

We begin by considering the solution of $\boldsymbol{\mathbf{A}}_{\ell}\boldsymbol{\mathbf{u}}_{\ell}^{\star}=\boldsymbol{\mathbf{b}}_{\ell}$ , where $\boldsymbol{\mathbf{A}}_{\ell}$ is the stiffness matrix associated to $V_{\Xi_{\ell}}$ , and where $\boldsymbol{\mathbf{b}}_{\ell}=\sigma_{\ell}(u_{\ell}^{\star})$ , is the data obtained from the Galerkin solution $u_{\ell}^{\star}=\sigma^{\ast}_{\ell}(\boldsymbol{\mathbf{u}}_{\ell}^{\star})\in V_{\Xi_{\ell}}$ . Naturally, $u_{\ell}^{\star}$ is unknown (its coefficients are the solution of the above problem), but we can compute the data $\sigma_{\ell}{u}_{\ell}^{\star}$ via

	$\displaystyle\boldsymbol{\mathbf{b}}_{\ell}$	$\displaystyle=\sigma_{\ell}(u_{\ell}^{\star})$
		$\displaystyle=\left(a(u_{\ell}^{\star},\chi^{(\ell)}_{\xi_{1}}),\dots,a(u_{\ell}^{\star},\chi^{(\ell)}_{\xi_{n_{\ell}}})\right)^{T}$
		$\displaystyle=\left(\langle f,\chi^{(\ell)}_{\xi_{1}}\rangle_{L_{2}(\mathbb{M})},\dots,\langle f,\chi^{(\ell)}_{\xi_{n_{\ell}}}\rangle_{L_{2}(\mathbb{M})}\right)^{T}\!\!.$

In other words, it is obtained from the right hand side $f$ .

Algorithm 1

\mathrm{TGM}_{\ell}

Two-grid method with post-smoothing in vectorial form, see [13, Eq. (20)]

\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}\in\mathbb{R}^{n_{\ell}}

, right-hand-sides

\boldsymbol{\mathbf{b}}_{\ell}=\sigma_{\ell}u^{\star}_{\ell}\in\mathbb{R}^{n_{\ell}}

0: new approximation

\mathbb{R}^{n_{\ell}}\ni\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}\leftarrow\operatorname{TGM}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})

\ell=0

then

\boldsymbol{\mathbf{u}}^{\text{new}}_{0}\leftarrow\boldsymbol{\mathbf{A}}^{-1}_{0}\boldsymbol{\mathbf{b}}_{0}

{for coarsest grid use direct solver}

else

\boldsymbol{\mathbf{u}}_{\ell}\leftarrow{J}_{\ell}^{\nu_{1}}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})

{

\nu_{1}

steps pre-smoothing}

\boldsymbol{\mathbf{d}}_{\ell-1}\leftarrow\boldsymbol{\mathbf{r}}_{\ell}\left(\boldsymbol{\mathbf{b}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell}\boldsymbol{\mathbf{u}}_{\ell}\right)

{restrict residual to coarser grid}

\tilde{\boldsymbol{\mathbf{e}}}_{\ell-1}\leftarrow\boldsymbol{\mathbf{A}}^{-1}_{\ell-1}\boldsymbol{\mathbf{d}}_{\ell-1}

{solve coarse grid problem}

\boldsymbol{\mathbf{u}}_{\ell}\leftarrow\boldsymbol{\mathbf{u}}_{\ell}+\boldsymbol{\mathbf{p}}_{\ell}\tilde{\boldsymbol{\mathbf{e}}}_{\ell-1}

{update with coarse grid correction}

\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}\leftarrow{J}_{\ell}^{\nu_{2}}(\boldsymbol{\mathbf{u}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})

{

\nu_{2}

steps post-smoothing}

end if

return

\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}

The output $\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}=\operatorname{TGM}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})$ of the two-level multigrid algorithm with initial input $\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}$ is given by the rather complicated formula

\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}={J}^{\nu_{2}}_{\ell}\left({J}^{\nu_{1}}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})+\boldsymbol{\mathbf{p}}_{\ell}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\left(\boldsymbol{\mathbf{b}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell}{J}^{\nu_{1}}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})\right),\boldsymbol{\mathbf{b}}_{\ell}\right).

(26)

Because $\boldsymbol{\mathbf{u}}\mapsto\mathrm{TGM}_{\ell}(\boldsymbol{\mathbf{u}},\boldsymbol{\mathbf{b}})$ is consistent (with $\boldsymbol{\mathbf{A}}_{\ell}\boldsymbol{\mathbf{u}}=\boldsymbol{\mathbf{b}}_{\ell}$ ) the corresponding iteration matrix is

\displaystyle\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}:=\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{2}}\left(\mathrm{id}-\boldsymbol{\mathbf{p}}_{\ell}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}\right)\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{1}},

(27)

(this is calculated in [13, Eq. 48] as well as in [11, Lemma 11.11]). Thus, the error can be expressed as $\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}-\boldsymbol{\mathbf{u}}_{\ell}^{\star}=\operatorname{TGM}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell})-\boldsymbol{\mathbf{u}}^{\star}_{\ell}=\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}-\boldsymbol{\mathbf{u}}^{\star}_{\ell})$ .

The corresponding operator on $V_{\Xi_{\ell}}$ is obtained by conjugating with $\sigma_{\ell}^{*}$ . This gives the error operator for the two level method $\mathcal{C}_{\text{TG}_{\ell}}\sigma_{\ell}^{*}:=\sigma_{\ell}^{*}\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}$ .

It is worth noting that by (23) we have the equality $\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}=\sigma_{\ell-1}^{*}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}$ . Using the identity $\boldsymbol{\mathbf{A}}_{\ell}=\sigma_{\ell}\sigma_{\ell}^{*}$ followed by (24) and the identity $P_{\Xi_{\ell-1}}=\sigma^{\ast}_{\ell-1}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\sigma_{\ell-1}$ gives

\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}\left(\boldsymbol{\mathbf{A}}_{\ell-1}\right)^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}=P_{\Xi_{\ell-1}}\sigma_{\ell}^{*}.

(28)

It follows that $\mathcal{C}_{\text{TG}_{\ell}}=\mathcal{S}_{\ell}^{\nu_{2}}(\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}$ .

We are now in a position to show that the two level method is a contraction for sufficiently large values of $\nu_{1}$ .

Proposition 1.

There is a constant $C$ so that for all $\ell$ , $\mathcal{C}_{\text{TG}_{\ell}}$ satisfies the bound

\displaystyle\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq C_{\text{Prop.\ref{prop:boundtwogridop}}}g(\nu_{1})=C_{\text{Prop.\ref{prop:boundtwogridop}}}\begin{cases}(2\nu_{1}+1)^{-\frac{1}{2}},&\text{symmetric };\\ \nu_{1}^{-\frac{1}{4}},&\text{non-symmetric }.\end{cases}

Proof.

The following equality

\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}=\left\|\mathcal{S}_{\ell}^{\nu_{2}}(\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}

holds, so Lemma 7 ensures

\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq C\left\|(\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}.

By Lemma 1, $\left\|\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}}\right\|_{W_{2}^{1}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq Ch_{\ell-1}$ holds, so it follows that

\left\|(\mathrm{id}_{V_{\Xi_{\ell-1}}}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq Ch_{\ell-1}\|\mathcal{S}_{\ell}^{\nu_{1}}\|_{L_{2}(\mathbb{M})\to W_{2}^{1}(\mathbb{M})}.

By coercivity, this gives $\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq Ch_{\ell-1}\|\mathcal{S}_{\ell}^{\nu_{1}}v\|_{L_{2}(\mathbb{M})\to\mathcal{L}}$ . Finally, the result follows by applying the smoothing property: Lemma 8 in the symmetric case and Lemma 9 in the non-symmetric case. ∎

Corollary 1.

Let $\theta$ be as in Lemma 6 and let $\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}\in\mathbb{R}^{n_{\ell}}$ be an initial guess, $u^{\text{old}}_{\ell}=\sigma_{\ell}^{*}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell})$ , $\boldsymbol{\mathbf{u}}_{\ell}=\operatorname{TGM}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell})$ , and $u_{\ell}=\sigma_{\ell}^{*}\boldsymbol{\mathbf{u}}_{\ell}$ . If $a$ is symmetric, there is a constant $C$ independent of $\ell$ and $\theta$ so that

\|u^{\text{new}}_{\ell}-u_{\ell}^{\star}\|_{L_{2}(\mathbb{M})}\leq C\theta^{-1/2}\frac{1}{\sqrt{2\nu_{1}+1}}\|u^{\text{old}}_{\ell}-u_{\ell}^{\star}\|_{L_{2}(\mathbb{M})}.

If $a$ is not symmetric, there is a constant $C$ independent of $\ell$ and $\theta$ so that

\|u^{\text{new}}_{\ell}-u_{\ell}^{\star}\|_{L_{2}(\mathbb{M})}\leq C\theta^{-1/2}\frac{1}{\sqrt[4]{\nu_{1}}}\|u^{\text{old}}_{\ell}-u_{\ell}^{\star}\|_{L_{2}(\mathbb{M})}.

Proof.

This follows by applying Proposition 1 to $u^{\text{old}}_{\ell}-u_{\ell}^{\star}$ . ∎

5.3 multigrid with $\tau$ -cycle

In the two-grid method, the computational bottleneck remains the solution on the coarse grid. Thus, there have been many approaches to recursively apply the multigrid philosophy in order to use a direct solver only on the coarsest grid. A flexible algorithm is the so-called $\tau$ -cycle. Here $\tau=1$ stands for the $V$ -cycle in multigrid methods and $\tau=2$ gives the $W$ -cycle.

Our results hold for $\tau\geq 2$ .

Algorithm 2

\operatorname{MGM}^{(\tau)}_{\ell}

Multigrid method with

\tau

cycle, see [13, Eq. (31)]

0: approximation:

\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell}\in\mathbb{R}^{n_{\ell}}

, right-hand-side

\boldsymbol{\mathbf{b}}_{\ell}\in\mathbb{R}^{n_{\ell}}

\theta

0: new approximation

\mathbb{R}^{n_{\ell}}\ni\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}\leftarrow\operatorname{MGM}^{(\tau)}_{\ell}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})

\ell=0

then

\boldsymbol{\mathbf{u}}^{\text{new}}_{0}\leftarrow\boldsymbol{\mathbf{A}}^{-1}_{0}\boldsymbol{\mathbf{b}}_{0}

{for coarsest grid use direct solver}

else

\boldsymbol{\mathbf{u}}_{\ell}\leftarrow{J}_{\ell}^{\nu_{1}}(\boldsymbol{\mathbf{u}}^{\text{old}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})

{

\nu_{1}

steps pre-smoothing}

\boldsymbol{\mathbf{d}}_{\ell-1}\leftarrow\boldsymbol{\mathbf{r}}_{\ell}\left(\boldsymbol{\mathbf{b}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell}\boldsymbol{\mathbf{u}}_{\ell}\right)

{restrict residual to coarser grid}

\boldsymbol{\mathbf{e}}^{(0)}_{\ell-1}\leftarrow 0

{initialize}

for

i=0

\tau

\boldsymbol{\mathbf{e}}^{(i)}_{\ell-1}\leftarrow\operatorname{MGM}^{(\tau)}_{\ell-1}(\boldsymbol{\mathbf{e}}^{i-1}_{\ell-1},\boldsymbol{\mathbf{d}}_{\ell-1})

{recursive call on

\Xi_{\ell-1}

}

end for

\boldsymbol{\mathbf{u}}_{\ell}\leftarrow\boldsymbol{\mathbf{u}}_{\ell}+\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{e}}^{(\tau)}_{\ell-1}

{update with coarse grid correction}

\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}\leftarrow J_{\ell}^{\nu_{2}}(\boldsymbol{\mathbf{u}}_{\ell},\boldsymbol{\mathbf{b}}_{\ell})

{

\nu_{2}

steps post-smoothing}

end if

return

\boldsymbol{\mathbf{u}}^{\text{new}}_{\ell}

Before proving our main theorem, we need a statement from elementary real analysis.

Lemma 11.

For any real numbers $\alpha,\beta,\gamma,\tau$ which satisfy $0<\gamma<1$ , $\tau\geq 2$ , $\beta>1/\tau$ and $\alpha<\min\left\{\frac{\tau-1}{\tau}(\beta\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}$ , if the sequence $(x_{n})_{n\in\mathbb{N}_{0}}$ satisfies the conditions

\displaystyle x_{0}=0,\quad x_{n+1}\leq\alpha+\beta\bigl{(}x_{n}\bigr{)}^{\tau}

then $x_{n}\leq\gamma$ for all $n\geq 0$ .

Proof.

This follows by elementary calculations, as in [17, Lemma 6.15]. ∎

Using [13, Theorem 7.1], we obtain for the iteration matrix of the Algorithm 2 the recursive (in the level) form

\displaystyle\boldsymbol{\mathbf{C}}_{\ell}=\begin{cases}\boldsymbol{\mathbf{0}},&\ell=0\\ \boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}+\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{2}}\boldsymbol{\mathbf{p}}_{\ell}(\boldsymbol{\mathbf{C}}_{\ell-1})^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{1}},&\ell\in\mathbb{N}.\end{cases}

Again, we define the corresponding operator via $\mathcal{C}_{\ell}:=\sigma_{\ell}^{*}\boldsymbol{\mathbf{C}}_{\ell}(\sigma_{\ell}^{*})^{-1}$ .

Theorem 1.

For every $\gamma\in(0,1)$ , there is a $\nu^{\star}:=\arg\min_{\nu\in\mathbb{N}}\{\nu\in\mathbb{N}\ :\ C_{\text{Prop.\ref{prop:boundtwogridop}}}g(\nu)\leq\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm.\ref{thm:main}}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}\}$ . For all $\nu_{1}\geq\nu^{\star}$ we have

\displaystyle\left\|\mathcal{C}_{\ell}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq\gamma.

(29)

Proof.

Here, we follow basically [13, Proofof Theorem 7.20]. Let $\boldsymbol{\mathbf{v}}\in\mathbb{R}^{n_{\ell}}$ arbitrary. For $v=\sigma^{*}_{\ell}\boldsymbol{\mathbf{v}}$ , we obtain for $\ell\in\mathbb{N}$ ,

$\displaystyle\left\\|\mathcal{C}_{\ell}\right\\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$	$\displaystyle\leq$	$\displaystyle\left\\|\mathcal{C}_{\text{TG}_{\ell}}\right\\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$
		$\displaystyle+\left\\|\mathcal{S}_{\ell}^{\nu_{2}}\sigma^{}_{\ell}\boldsymbol{\mathbf{p}}_{\ell}(\boldsymbol{\mathbf{C}}_{\ell-1})^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma^{}_{\ell})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}\right\\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$
	$\displaystyle\leq$	$\displaystyle\left\\|\mathcal{C}_{\text{TG}_{\ell}}\right\\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$
		$\displaystyle+C\left\\|\sigma^{}_{\ell}\boldsymbol{\mathbf{p}}_{\ell}(\boldsymbol{\mathbf{C}}_{\ell-1})^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma^{}_{\ell})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}\right\\|_{L_{2}(\mathbb{M})}$

by Proposition 1 and Lemma 7. We treat the second term with (23) and (28) to obtain

\sigma^{*}_{\ell}\boldsymbol{\mathbf{p}}_{\ell}(\boldsymbol{\mathbf{C}}_{\ell-1})^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma^{*}_{\ell})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}=\mathcal{C}_{\ell-1}^{\tau}P_{\Xi_{\ell-1}}\mathcal{S}_{\ell}^{\nu_{1}}.

This leaves

	$\displaystyle\left\\|\mathcal{C}_{\ell}v\right\\|_{L_{2}(\mathbb{M})}$	$\displaystyle\leq\left\\|\mathcal{C}_{\text{TG}_{\ell}}\right\\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$
		$\displaystyle\phantom{\leq}+C\left\\|\mathcal{C}_{\ell-1}^{\tau}\right\\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\left\\|P_{\Xi_{\ell-1}}\mathcal{S}_{\ell}^{\nu_{1}}\right\\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}.$

The last factor can be bounded by writing $P_{\Xi_{\ell-1}}=\mathrm{id}-(\mathrm{id}-P_{\Xi_{\ell-1}})$ followed by the triangle inequality. Lemma 7 bounds $\|\mathcal{S}_{\ell}^{\nu_{1}}\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$ , while Proposition 1 (with $\nu_{2}=0$ ) bounds $\|(\mathrm{id}-P_{\Xi_{\ell-1}})\mathcal{S}_{\ell}^{\nu_{1}}\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$ . Thus, we end up with a bound

\displaystyle\left\|\mathcal{C}_{\ell}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}+C\left\|\mathcal{C}_{\ell-1}\right\|^{\tau}_{L_{2}\to L_{2}},

which has the form required by Lemma 11, with $x_{\ell}=\left\|\mathcal{C}_{\ell}\right\|_{L_{2}\to L_{2}}$ , $\beta_{\text{Thm.\ref{thm:main}}}:=C$ and $\alpha=\left\|\mathcal{C}_{\text{TG}_{\ell}}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$ . The condition $\nu_{1}\geq\nu^{\star}$ ensures the bound $\alpha\leq\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm.\ref{thm:main}}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}$ . Thus

\displaystyle\left\|\mathcal{C}_{\ell}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq\gamma

holds by Lemma 11, and the theorem follows. ∎

Remark 2.

At the finest level, the kernel-based Galerkin problem $\boldsymbol{\mathbf{A}}_{L}\boldsymbol{\mathbf{x}}=\boldsymbol{\mathbf{b}}_{L}$ , can be solved stably to any precision $\epsilon_{\max}$ , by iterating the contraction matrix $\boldsymbol{\mathbf{C}}_{L}{\tau}$ . Select $\gamma<1$ and fix $\nu_{1}$ so that Theorem 1 holds. Letting $\boldsymbol{\mathbf{u}}^{(k+1)}=\operatorname{MGM}^{(\tau)}_{L}(\boldsymbol{\mathbf{u}}^{(k)},\boldsymbol{\mathbf{b}}_{\ell})$ gives $\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\ell_{2}}\leq\gamma^{k}\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(0)}\|_{\ell_{2}}$ . If $k$ is the least integer satisfying $\gamma^{k}\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(0)}\|<\epsilon_{\max}$ , then $k\sim\frac{1}{\log\gamma}\log\left(\frac{\epsilon_{\max}}{\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(0)}\|}\right)$ .

We note that $\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}\sim\|\sigma_{L}^{*}\boldsymbol{\mathbf{u}}^{*}-\sigma_{L}^{*}\boldsymbol{\mathbf{u}}^{(k)}\|_{W_{2}^{1}}\leq C_{\mathrm{Bernstein}}h^{d/2-1}\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\ell_{2}},$ and since $d\geq 2$ , achieving $\|\boldsymbol{\mathbf{u}}^{*}-\boldsymbol{\mathbf{u}}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}<\epsilon_{\max}$ also requires only a fixed number of iterations. This shows (41).

6 The perturbed multigrid method

In this section, we consider a modified problem

\boldsymbol{\mathbf{\check{A}}}_{L}\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}=\boldsymbol{\mathbf{b}}_{L},

where $\boldsymbol{\mathbf{\check{A}}}_{L}$ is close to $\boldsymbol{\mathbf{A}}_{L}$ . The perturbed multigrid method will produce an approximate solution $\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}$ to $\boldsymbol{\mathbf{u}}^{\star}_{L}$ which satisfies $\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|\leq\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}\|+\|\boldsymbol{\mathbf{\check{A}}}_{L}^{-1}-\boldsymbol{\mathbf{A}}_{L}^{-1}\|\|\boldsymbol{\mathbf{b}}_{L}\|$ . Thus, for the true solution $u$ to (8) and the Galerkin solution $u^{\star}_{L}=P_{\Xi_{L}}u=\sigma_{L}^{*}\boldsymbol{\mathbf{u}}^{\star}_{L}$ , we have

\|\sigma^{*}\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-u\|_{W_{2}^{k}(\mathbb{M})}\leq\|(1-P_{\Xi_{L}})u\|_{W_{2}^{k}(\mathbb{M})}+Ch_{L}^{d/2-k}\|\boldsymbol{\mathbf{\check{u}}}_{L}^{(k)}-\boldsymbol{\mathbf{\check{u}}}_{L}^{\star}\|_{\ell_{2}}.

which can be made as close to $\|(1-P_{\Xi_{L}})u\|_{L_{2}}$ as desired by controlling the perturbation $\|\boldsymbol{\mathbf{\check{A}}}_{L}-\boldsymbol{\mathbf{A}}_{L}\|_{2\to 2}$ and the error from the multigrid approximation $\|\boldsymbol{\mathbf{\check{u}}}_{L}^{(k)}-\boldsymbol{\mathbf{\check{u}}}_{L}^{\star}\|$ .

Such systems may occur for a number of reasons: using localized Lagrange basis functions (as in [8]), truncating a series expansion of the kernel (as in [26]), or by using quadrature to approximate the stiffness matrix (both [8] and [26]). In the next section, we will apply this by truncating the original stiffness matrix to employ only banded matrices and thereby enjoy a computational speed up.

Perturbed multigrid method

The perturbed multigrid method replaces matrices $\boldsymbol{\mathbf{A}}_{\ell}$ , $\boldsymbol{\mathbf{p}}_{\ell}$ and $\boldsymbol{\mathbf{r}}_{\ell}$ appearing in Algorithms 1 and 2 with matrices $\boldsymbol{\mathbf{\check{A}}}_{\ell}$ , $\check{\boldsymbol{\mathbf{p}}}_{\ell}$ and $\check{\boldsymbol{\mathbf{r}}}_{\ell}$ . We assume that for each $\ell$ there exists $0<\epsilon_{\ell}$ so that

\|\boldsymbol{\mathbf{\check{A}}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell}\|_{\ell_{2}\to\ell_{2}},\|\check{\boldsymbol{\mathbf{p}}}_{\ell}-\boldsymbol{\mathbf{p}}_{\ell}\|_{\ell_{2}\to\ell_{2}},\|\check{\boldsymbol{\mathbf{r}}}_{\ell}-\boldsymbol{\mathbf{r}}_{\ell}\|_{\ell_{2}\to\ell_{2}}\leq\epsilon_{\ell}.

In this set up $\epsilon_{\ell}$ may change per level.⁴⁴4Which could be the case, e.g., if $\boldsymbol{\mathbf{\check{A}}}_{\ell}$ involved a $\Xi_{\ell}$ dependent quadrature scheme, or was obtained by bandlimiting (as we will do in the next section) We assume $\epsilon_{\ell}$ is small enough that

\|\boldsymbol{\mathbf{\check{A}}}_{\ell}\|_{2\to 2}\leq 2C_{A}h_{\ell}^{d-2},\qquad\|\check{\boldsymbol{\mathbf{p}}}_{\ell}\|_{2\to 2}<2C_{pro}\quad\text{ and }\quad\|\check{\boldsymbol{\mathbf{r}}}_{\ell}\|_{2\to 2}<2C_{pro}.

It then follows from standard arguments that

\|\boldsymbol{\mathbf{\check{A}}}_{\ell}^{-1}\|_{2\to 2}\leq 2C_{\mathrm{H\ddot{o}lder}}h_{\ell}^{-d}\quad\text{and}\quad\|\boldsymbol{\mathbf{\check{A}}}_{\ell}^{-1}-\boldsymbol{\mathbf{A}}_{\ell}^{-1}\|_{2}\leq 2(C_{\mathrm{H\ddot{o}lder}})^{2}h_{\ell}^{-2d}\epsilon_{\ell}.

Because of the entry-wise error $|(\boldsymbol{\mathbf{\check{A}}}_{\ell})_{\xi,\xi}-(\boldsymbol{\mathbf{A}}_{\ell})_{\xi,\xi}|\leq\epsilon_{\ell}$ , we also have that the diagonal matrix $\boldsymbol{\mathbf{\check{B}}}_{\ell}=\operatorname{diag}(\boldsymbol{\mathbf{\check{A}}}_{\ell})$ satisfies

\kappa(\boldsymbol{\mathbf{\check{B}}}_{\ell})\leq\frac{1+\epsilon_{\ell}}{1-\epsilon_{\ell}}\kappa(\boldsymbol{\mathbf{B}}_{\ell})\leq 2\frac{C_{\mathrm{stiff}}}{C_{\mathrm{diag}}}.

Therefore, there is $\theta$ so that for all $\ell$ and all $\boldsymbol{\mathbf{x}}\in\mathbb{R}^{n_{\ell}}$ , $\theta\langle\boldsymbol{\mathbf{\check{A}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle\leq\langle\boldsymbol{\mathbf{\check{B}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle$ holds. This permits us to consider the Jacobi iteration applied to the perturbed linear system $\boldsymbol{\mathbf{\check{A}}}_{\ell}\boldsymbol{\mathbf{x}}_{\ell}=\boldsymbol{\mathbf{b}},$ which yields $\check{J}_{\ell}:\mathbb{R}^{n_{\ell}}\times\mathbb{R}^{n_{\ell}}\to\mathbb{R}^{n_{\ell}}$ defined by

\displaystyle\check{J}_{\ell}(\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{b}})=\boldsymbol{\mathbf{\check{S}}}_{\ell}\boldsymbol{\mathbf{x}}+\theta\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}\boldsymbol{\mathbf{b}},\quad\text{where}\quad\boldsymbol{\mathbf{\check{S}}}_{\ell}:=\mathrm{id}-\theta\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}\boldsymbol{\mathbf{\check{A}}}_{\ell}.

(30)

Since $\boldsymbol{\mathbf{S}}_{\ell}-\boldsymbol{\mathbf{\check{S}}}_{\ell}=\theta\Bigl{(}\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}(\boldsymbol{\mathbf{\check{A}}}_{\ell}-\boldsymbol{\mathbf{A}}_{\ell})+(\boldsymbol{\mathbf{\check{B}}}_{\ell}^{-1}-\boldsymbol{\mathbf{B}}_{\ell}^{-1})\boldsymbol{\mathbf{A}}_{\ell}\Bigr{)}$ , we can estimate the error between smoothing matrices as

\|\boldsymbol{\mathbf{S}}_{\ell}-\boldsymbol{\mathbf{\check{S}}}_{\ell}\|_{2\to 2}\leq\frac{3\theta C_{A}}{(C_{\mathrm{diag}})^{2}}h_{\ell}^{2-d}\epsilon_{\ell}.

(31)

Because $\theta\langle\boldsymbol{\mathbf{\check{A}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle\leq\langle\boldsymbol{\mathbf{\check{B}}}_{\ell}\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{x}}\rangle$ it follows that for all $n$ ,

\|\boldsymbol{\mathbf{\check{S}}}_{\ell}^{n}\|_{2\to 2}\leq\sqrt{2\frac{C_{\mathrm{stiff}}}{C_{\mathrm{diag}}}}.

(32)

This also yields the following Lemma.

Lemma 12.

For $M\in\mathbb{N}$ , we get the bound

\displaystyle\left\|\boldsymbol{\mathbf{S}}_{\ell}^{M}-\boldsymbol{\mathbf{\check{S}}}_{\ell}^{M}\right\|_{2\to 2}\leq 2M\frac{C_{\mathrm{stiff}}}{C_{\mathrm{diag}}}\left\|\boldsymbol{\mathbf{S}}_{\ell}-\check{\boldsymbol{\mathbf{S}}}_{\ell;r_{\Xi}}\right\|_{2\to 2}.

Proof.

By telescoping, we have $\boldsymbol{\mathbf{S}}_{\ell}^{M}-\boldsymbol{\mathbf{\check{S}}}_{\ell}^{M}=\sum_{j=0}^{M-1}\boldsymbol{\mathbf{S}}_{\ell}^{M-1-j}\left(\boldsymbol{\mathbf{S}}_{\ell}-\boldsymbol{\mathbf{\check{S}}}_{\ell}\right)\boldsymbol{\mathbf{\check{S}}}_{\ell}^{j}$ . The inequality

\displaystyle\|\boldsymbol{\mathbf{S}}_{\ell}^{M}-\boldsymbol{\mathbf{\check{S}}}_{\ell}^{M}\|\leq\|\boldsymbol{\mathbf{S}}_{\ell}-\boldsymbol{\mathbf{\check{S}}}_{\ell}\|\sum_{m=0}^{M-1}\left\|\boldsymbol{\mathbf{S}}_{\Xi}^{m}\right\|_{2\to 2}\left\|\boldsymbol{\mathbf{\check{S}}}_{\ell}^{M-1-m}\right\|_{2\to 2}.

follows from norm properties, and the result follows by applying (32) and Lemma 7. ∎

This lemma can be combined with the estimate (31) to obtain

\displaystyle\left\|\boldsymbol{\mathbf{S}}_{\ell}^{M}-\boldsymbol{\mathbf{\check{S}}}_{\ell}^{M}\right\|_{2\to 2}\leq 6\theta\frac{C_{\mathrm{stiff}}C_{A}}{(C_{\mathrm{diag}})^{3}}Mh_{\ell}^{2-d}\epsilon_{\ell}.

(33)

Perturbed two grid method

Now, we consider the perturbed version of the two step algorithm. We aim to apply the two-grid method with only truncated matrices to the problem

\displaystyle\boldsymbol{\mathbf{\check{A}}}_{\ell}\check{\boldsymbol{\mathbf{u}}}^{\star}_{\ell}=\boldsymbol{\mathbf{b}}_{\ell}=\sigma_{\ell}u^{\star}_{\ell}.

(34)

Applying the two-grid method to (34) gives $\check{\boldsymbol{\mathbf{u}}}^{\star}_{\ell}-\check{\boldsymbol{\mathbf{u}}}^{\text{new}}_{\ell}=\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\left(\check{\boldsymbol{\mathbf{u}}}^{\star}_{\ell}-\check{\boldsymbol{\mathbf{u}}}^{\text{old}}_{\ell}\right),$ where the two grid iteration matrix is

\displaystyle\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}:=\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{2}}\left(\mathrm{id}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\left(\boldsymbol{\mathbf{\check{A}}}_{\ell-1}\right)^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\right)\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{1}}.

(35)

Lemma 13.

If $\epsilon_{\ell}\leq h_{\ell}^{d+2}$ holds for all $\ell\leq L$ , then

\displaystyle\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}-\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\|_{2\to 2}\leq C(\nu_{2}+\nu_{1}+h_{\ell-1}^{-4})h_{\ell-1}^{-(d+2)}\epsilon_{\ell-1}.

Remark 3.

A basic idea, used throughout this section, is the following result: If $\max(\|M_{j}\|,\|\check{M}_{j}\|)\leq C_{j}$ , then

\left\|\prod_{j=1}^{n}M_{j}-\prod_{j=1}^{n}\check{M}_{j}\right\|\leq\left(\prod_{j=1}^{n}C_{j}\right)\left(\sum_{j=1}^{n}\frac{\|M_{j}-\check{M}_{j}\|}{C_{j}}\right).

Proof of Lemma 13.

Consider

	$\displaystyle E_{1}$	$\displaystyle:=$	$\displaystyle\left(\mathrm{id}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\right)-\left(\mathrm{id}-\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}\right)$
		$\displaystyle=$	$\displaystyle\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}.$

By Remark 3, we have

	$\displaystyle\left\\|E_{1}\right\\|_{2\to 2}$	$\displaystyle\leq Ch_{\ell-1}^{-d}h_{\ell}^{d-2}\left(2\frac{\epsilon_{\ell}}{2C_{pro}}+\frac{2(C_{\mathrm{H\ddot{o}lder}})^{2}\epsilon_{\ell-1}h_{\ell-1}^{-2d}}{2C_{\mathrm{H\ddot{o}lder}}h_{\ell-1}^{-d}}+\frac{\epsilon_{\ell}}{2C_{A}h_{\ell}^{d-2}}\right)$
		$\displaystyle\leq Ch_{\ell-1}^{-d-2}\epsilon_{\ell-1}.$

Now, we consider the difference

	$\displaystyle E_{2}$	$\displaystyle:=\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}-\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}$
		$\displaystyle=\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{2}}\left(\mathrm{id}-\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}\right)\boldsymbol{\mathbf{S}}_{\ell}^{\nu_{1}}-\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{2}}\left(\mathrm{id}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\right)\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{1}}.$

Because $\|\mathrm{id}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\|\leq Ch_{\ell}^{-2}$ and $\|\mathrm{id}-\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\|\leq Ch_{\ell}^{-2}$ , the lemma follows by using Remark 3 with (31), (33), Lemma 12, and the above estimate of $\|E_{1}\|$ . ∎

Perturbed $\tau$ -cycle

As in the two-step case we consider the multigrid method also for the truncated system in (34), i.e., $\boldsymbol{\mathbf{\check{A}}}_{\ell}\check{\boldsymbol{\mathbf{u}}}^{\star}_{\ell}=\boldsymbol{\mathbf{b}}_{\ell}=\sigma_{\ell}u^{\star}_{\ell}$ . The multigrid iteration matrix is

\displaystyle\boldsymbol{\mathbf{\check{C}}}_{\ell}=\begin{cases}\boldsymbol{\mathbf{0}},&\ell=0\\ \boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}+\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{2}}\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}\boldsymbol{\mathbf{\check{S}}}_{\ell}^{\nu_{1}},&1\leq\ell\leq L.\end{cases}

From this we define the operator $\mathcal{\check{C}}_{\ell}:=\sigma_{\ell}^{*}\boldsymbol{\mathbf{\check{C}}}_{\ell}(\sigma_{\ell}^{*})^{-1}$ .

Theorem 2.

For any $0<\gamma<1$ , there exist constants $C_{1}$ , $C_{2}$ and $C_{4}$ and $\nu^{\star}\in\mathbb{N}$ such that $C_{\text{Prop.\ref{prop:boundtwogridop}}}g(\nu^{\star})\leq C_{4}\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm.\ref{thm:main}}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}$ . For $\nu_{1}\geq\nu^{\star}$ choose $\epsilon_{\ell}$ such that $\epsilon_{\ell}h_{\ell}^{-(d+2)}(h_{\ell}^{-4}+\nu_{1}+\nu_{2})<C^{-1}_{\text{Lem.\ref{lem:2step_pert}}}\min(C_{1},\gamma C_{2})$ for $0\leq\ell\leq L$ for all $\ell$ and if $h_{0}\leq\left(C^{-1}_{\text{Lem.\ref{lem:2step_pert}}}\min(C_{1},\gamma C_{2})\right)^{-1/4}$ , then

\displaystyle\left\|\mathcal{\check{C}}_{\ell}\right\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}\leq\gamma.

Proof.

As in the proof of Theorem 1, we make the estimate

\|\mathcal{\check{C}}_{\ell}\|\leq\|\mathcal{\check{C}}_{\text{TG}_{\ell}}\|+\|\mathcal{\check{S}}_{\ell}^{\nu_{2}}\sigma_{\ell}^{*}\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{\check{S}}_{\ell}^{\nu_{1}}\|.

Then (32) ensures that $\|\sigma_{\ell}^{*}\check{\boldsymbol{\mathbf{p}}}_{\ell}(\sigma_{\ell-1}^{*})^{-1}\mathcal{\check{C}}_{\ell-1}^{\tau}\sigma_{\ell-1}^{*}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{\check{S}}_{\ell}^{\nu_{1}}\|$ controls the second expression. Considering the difference

E:=\sigma_{\ell}^{*}\check{\boldsymbol{\mathbf{p}}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{\check{A}}}_{\ell-1}^{-1}\check{\boldsymbol{\mathbf{r}}}_{\ell}\boldsymbol{\mathbf{\check{A}}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{\check{S}}_{\ell}^{\nu_{1}}-\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{S}_{\ell}^{\nu_{1}},

Remark 3 gives

\|E\|_{L_{2}\to L_{2}}\leq C(h_{\ell-1}^{-2}+\nu_{1})h_{\ell-1}^{-d}\epsilon_{\ell-1}\|\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\|_{\ell_{2}\to\ell_{2}}.

Using the Riesz property, this gives

	$\displaystyle\\|\mathcal{\check{C}}_{\ell}\\|$	$\displaystyle\leq$	$\displaystyle\\|\mathcal{\check{C}}_{\text{TG}_{\ell}}\\|$
			$\displaystyle+C\left(h^{-2}+\nu_{1})h^{-d}\epsilon\\|\mathcal{\check{C}}_{\ell-1}^{\tau}\\|+\left\\|\sigma_{\ell}^{}\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma_{\ell}^{})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}\right\\|\right).$

As in the proof of Theorem 1, the last normed expression can be bounded as

\left\|\sigma_{\ell}^{*}\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma_{\ell}^{*})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}\right\|\leq\|\mathcal{\check{C}}_{\ell-1}^{\tau}\|\|P_{\Xi_{\ell-1}}\mathcal{S}_{\ell}^{\nu_{1}}\|\leq C\|\mathcal{\check{C}}_{\ell-1}^{\tau}\|.

Because $(h_{\ell}^{-2}+\nu_{1})h_{\ell}^{-d}\epsilon_{\ell}$ is bounded by assumption, it follows that

\|\mathcal{\check{C}}_{\ell}\|\leq\|\mathcal{\check{C}}_{\text{TG}_{\ell}}\|+C_{3}\|\mathcal{\check{C}}_{\ell-1}\|^{\tau}.

As by assumption $\epsilon_{\ell}\leq h_{\ell}^{d+2}$ (no constants due to $h\leq h_{0}$ ) holds for all $\ell\leq L$ , we obtain by Lemma 13

	$\displaystyle\\|\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\\|_{2\to 2}$	$\displaystyle\leq\\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}-\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\\|_{2\to 2}+\\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}\\|_{2\to 2}$
		$\displaystyle\leq C_{\text{Lem.\ref{lem:2step_pert}}}(\nu_{2}+\nu_{1}+h_{\ell-1}^{-4})h_{\ell-1}^{-(d+2)}\epsilon_{\ell-1}+\\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}\\|_{2\to 2}$
		$\displaystyle\leq\min(C_{1},\gamma C_{2})+\\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}\\|_{2\to 2}.$

We use Theorem 1 and choose a natural number $\nu^{\star}_{1}$ large enough such that the inequality $C_{\text{Prop.\ref{prop:boundtwogridop}}}g(\nu^{\star}_{1})\leq C_{4}\min\left\{\frac{\tau-1}{\tau}(\beta_{\text{Thm.\ref{thm:main}}}\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}$ is satisfied. Thus, we obtain

\displaystyle\|\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\|_{2\to 2}\leq\min(C_{1},\gamma C_{2})+C_{4}\gamma.

We have

\displaystyle C_{1}+C_{4}\gamma\leq\frac{\tau-1}{\tau}(C_{3}\tau)^{-\frac{1}{\tau-1}}\quad\text{and}\quad C_{2}+C_{4}\leq\frac{\tau-1}{\tau}.

Thus, we define

\beta:=C_{3}\quad\text{and}\quad{\alpha}:=\max_{\ell\leq L}\|\mathcal{\check{C}}_{\text{TG}_{\ell}}\|\leq\min\left\{\frac{\tau-1}{\tau}(\beta\tau)^{-\frac{1}{\tau-1}},\frac{\tau-1}{\tau}\gamma\right\}.\qquad

Hence Lemma 11 applies and the result follows. ∎

7 Truncated multigrid method

In this section we consider truncating the stiffness, prolongation and restriction matrices in order to improve the computational complexity of the method. Each such matrix has stationary, exponential off-diagonal decay, so by retaining the $(\xi,\eta)$ entry when $\mathrm{dist}(\xi,\eta)\leq Kh_{\ell}|\log h_{\ell}|$ , and setting the rest to zero, guarantees a small perturbation error (on the order of $\mathcal{O}(h_{\ell}^{J})$ , where $J\propto K$ ). This is made precise in Lemma 15 and Lemma 16 below, with the aid of the following lemma.

Lemma 14.

Suppose $\Xi\subset\mathbb{M}$ , $c>0$ , and $r\geq 2q(\Xi)$ . Then for any $\eta\in\mathbb{M}$ , we have

\sum_{\begin{subarray}{c}\xi\in\Xi\\ \mathrm{dist}(\xi,\eta)>r\end{subarray}}e^{-c\ \mathrm{dist}(\xi,\eta)}\leq\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\left(\frac{r}{q}\right)^{d}e^{-c{r}}\left(\sum_{j=0}^{\infty}(j+2)^{d}e^{-cjq}\right).

Proof.

The underlying set can be decomposed as $\{\xi\in\Xi\mid\mathrm{dist}(\xi,\eta)\geq r\}=\bigcup_{j=0}^{\infty}\mathcal{A}_{j}$ , where $\mathcal{A}_{j}=\{\xi\in\Xi\mid r+jq\leq\mathrm{dist}(\xi,\eta)<r+(j+1)q\}$ has cardinality $|\mathcal{A}_{j}|\leq\frac{\alpha_{\mathbb{M}}}{\beta_{\mathbb{M}}}\bigl{(}\frac{r}{q}+(j+2)\bigr{)}^{d}$ . It follows that

\sum_{\begin{subarray}{c}\xi\in\Xi\\ \mathrm{dist}(\xi,\eta)>r\end{subarray}}e^{-c\ \mathrm{dist}(\xi,\eta)}\leq\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}e^{-cr}\sum_{j=0}^{\infty}\left(\frac{r}{q}+(j+2)\right)^{d}e^{-cjq}

and the lemma follows from the fact that $\frac{r}{q}+j+2\leq\frac{r}{q}(j+2)$ for all $j\geq 0$ . ∎

Truncated stiffness matrix

The exponential decay in Lemma 3 motivates the truncation of the stiffness matrix, see e.g. [8, Eq. (8.1)] We define for positive $K$ , the truncation parameter $r_{\Xi}:=Kh|\log(h)|$

\displaystyle\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}\in\mathbb{R}^{|\Xi|\times|\Xi|}\quad\text{with}\quad(\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}})_{\xi,\eta}:=\begin{cases}A_{\xi,\eta}=a(\chi_{\xi},\chi_{\eta}),&\mathrm{dist}(\xi,\eta)\leq r_{\Xi},\\ 0,&\mathrm{dist}(\xi,\eta)>r_{\Xi}.\end{cases}

(36)

We note that $\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}$ is symmetric if $\boldsymbol{\mathbf{A}}_{\Xi}$ is symmetric. By construction and quasi-uniformity, we obtain $|\{\xi\in B(\eta,r_{\Xi})\cap\Xi\}|\leq\rho^{d}h^{-d}_{\Xi,\mathbb{M}}\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}(r_{\Xi}+q)^{d}\leq 2\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\rho^{d}|K\log h|^{d}.$ By $h^{-d}\leq C|\Xi|$ , this yields

$|\{\xi\in\Xi\,\mid\,(\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}})_{\xi,\eta}\neq 0\}\leq 2\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\rho^{d}d^{d}K^{d}(\log|\Xi|)^{d}$ .

In particular, we obtain

\displaystyle\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}\boldsymbol{\mathbf{x}})\leq C_{comp}K^{d}\log(|\Xi|)^{d}|\Xi|

(37)

for the number of operations for a matrix vector multiplication with the truncated stiffness matrix, with $C_{comp}=2\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\rho^{d}d^{d}$ .

Lemma 15.

With the global parameter $C_{\text{tr}}:=\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\sum_{n=1}^{\infty}\left(n+2\right)^{d}e^{-\frac{\nu}{\rho}n}$ , the estimate

\displaystyle\|\boldsymbol{\mathbf{A}}_{\Xi}-\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}\|_{2\to 2}\leq C_{\text{tr}}C_{\text{stiff}}(K\rho|\log(h)|)^{d}h^{\frac{\nu}{2}K-2}

(38)

holds.

Proof.

The proof for the first statement is essentially given in [8, Prop 8.1]. Using Lemma 3, we observe, by symmetry, that $\|\boldsymbol{\mathbf{A}}_{\Xi}-\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}\|_{p\to p}$ is controlled by the maximum of the $\ell_{1}$ and $\ell_{\infty}$ matrix norms, which can be controlled by row and column sums. This leads to off-diagonal sums $\max_{\eta\in\Xi}\sum_{\xi\in\Xi\cap B^{\complement}(\eta,r_{\Xi})}\left|(\boldsymbol{\mathbf{A}}_{\Xi})_{\xi,\eta}\right|$ and $\max_{\xi\in\Xi}\sum_{\eta\in\Xi\cap B^{\complement}(\eta,r_{\Xi})}\left|(\boldsymbol{\mathbf{A}}_{\Xi})_{\xi,\eta}\right|.$ Applying Lemma 14 with $r=r_{\Xi}=Kh|\log h|$ and $c=\frac{\nu}{2h}$ yields

	$\displaystyle\\|\boldsymbol{\mathbf{A}}_{\Xi}-\check{\boldsymbol{\mathbf{A}}}_{\Xi;r_{\Xi}}\\|_{p\to p}$	$\displaystyle\leq C_{\text{stiff}}h^{-2}\max_{\xi\in\Xi}\sum_{\eta\in\Xi\cap B^{\complement}(\xi,r_{\Xi})}e^{-\frac{\nu}{2}\frac{\mathrm{dist}(\xi,\eta)}{h}}$
		$\displaystyle\leq C_{\text{stiff}}\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\left(\frac{Kh\|\log h\|}{q}\right)^{d}h^{\frac{K\nu}{2}-2}\left(\sum_{j=0}^{\infty}(j+2)^{d}e^{-\frac{\nu}{2h}jq}\right).$

∎

Truncated prolongation and restriction matrices

We introduce truncated prolongation matrices $\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\in\mathbb{R}^{n_{\ell}\times n_{\ell-1}}$ , with

\displaystyle(\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}})_{\xi\in\Xi_{\ell-1},\eta\in\Xi_{\ell}}:=\begin{cases}(\boldsymbol{\mathbf{p}}_{\ell})_{\xi,\eta}=\chi^{(\ell-1)}_{\xi}(\eta),&\mathrm{dist}(\xi,\eta)\leq r_{\ell},\\ 0,&\mathrm{dist}(\xi,\eta)>r_{\ell},\end{cases}

where we use the notation $r_{\ell}:=r_{\Xi_{\ell-1}}$ . Likewise, we define $\check{\boldsymbol{\mathbf{r}}}_{\ell;r_{\ell}}:=\left(\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\right)^{T}$ . For the numerical costs, we obtain

\displaystyle\max\left\{\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}}),\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{r}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}})\right\}=\mathcal{O}\left(K^{d}\log(|\Xi_{\ell}|)^{d}|\Xi_{\ell}|\right),

(39)

where we use that $|\Xi_{\ell}|\sim h_{\ell}^{-d}=\gamma^{d}h^{d}_{\ell-1}\sim|\Xi_{\ell-1}|\sim\gamma^{d}|\Xi_{\ell-1}|$ due to (22).

Lemma 16.

We have

\displaystyle\left\|\boldsymbol{\mathbf{p}}_{\ell}-\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\right\|_{\ell_{2}(\Xi_{\ell-1})\to\ell_{2}(\Xi_{\ell})}\leq C_{\mathrm{PW}}C_{\text{trunc}}(K|\log(h_{\ell})|)^{d}h^{\frac{\nu}{2}K}_{\ell}.

Proof.

We proceed as in the proof of (38). We can estimate row and column sums of $\boldsymbol{\mathbf{E}}:=\boldsymbol{\mathbf{p}}_{\ell}-\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}$ by (13), obtaining

\left\|\boldsymbol{\mathbf{E}}\right\|_{\infty\to\infty}\leq C_{\mathrm{PW}}\sum_{\xi\in\Xi_{\ell-1}\cap B^{\complement}(\eta,r_{\ell})}e^{-\nu\frac{\mathrm{dist}(\eta,\xi)}{h_{\ell-1}}},

so, by Lemma 14 with $r=r_{\ell}=Kh_{\ell}|\log h_{\ell}|$ and $c=\frac{\nu}{h_{\ell-1}}$ , we have

\left\|\boldsymbol{\mathbf{E}}\right\|_{\infty\to\infty}\leq C_{\mathrm{PW}}\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\left(K\rho|\log h_{\ell-1}|\right)^{d}(h_{\ell-1})^{K\nu}\left(\sum_{j=0}^{\infty}(j+2)^{d}e^{-\frac{\nu j}{\rho}}\right).

Likewise, $\left\|\boldsymbol{\mathbf{E}}\right\|_{1\to 1}\leq C_{\mathrm{PW}}\sum_{\eta\in\Xi_{\ell}\cap B^{\complement}(\xi,r_{\ell})}e^{-\nu\frac{\mathrm{dist}(\eta,\xi)}{h_{\ell-1}}}$ . Lemma 14 yields this time with $r=r_{\ell}=Kh_{\ell}|\log h_{\ell}|$ and $c=\frac{\nu}{h_{\ell-1}}$ , the estimate

\left\|\boldsymbol{\mathbf{E}}\right\|_{1\to 1}\leq C_{\mathrm{PW}}\frac{\beta_{\mathbb{M}}}{\alpha_{\mathbb{M}}}\left(K\frac{h_{\ell-1}}{q_{\ell}}|\log h_{\ell-1}|\right)^{d}(h_{\ell-1})^{K\nu}\left(\sum_{j=0}^{\infty}(j+2)^{d}e^{-\frac{\nu jq_{\ell}}{h_{\ell-1}}}\right).

Thus, we get

\displaystyle\max\left\{\left\|\boldsymbol{\mathbf{E}}\right\|_{\infty\to\infty},\left\|\boldsymbol{\mathbf{E}}\right\|_{1\to 1}\right\}

\displaystyle\leq C_{\mathrm{PW}}C_{\text{trunc}}(K\rho\gamma|\log(h_{\ell-1})|)^{d}h_{\ell-1}^{\frac{\nu}{2}K}.

Interpolation finishes the proof. ∎

Truncated $\tau$ -cycle

We now consider the multigrid method using truncated versions of the stiffness, prolongation and restriction matrices. We denote this by $\operatorname{MGMTRUNC}^{(\tau)}_{\ell}$ , and use it to solve (34) with $\boldsymbol{\mathbf{\check{A}}}_{\ell}=\check{\boldsymbol{\mathbf{A}}}_{\ell;r_{\ell}}$ . Lemmas 15 and 16 show that conditions for Theorem 2 are satisfied when $K$ is chosen sufficiently large.

Theorem 3.

If $\tau\gamma^{d}<1$ , we obtain

\displaystyle\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\operatorname{MGMTRUNC}^{(\tau)}_{\ell}(\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{b}}_{\ell}))=\mathcal{O}(N_{\ell}\log(N_{\ell})^{d}).

(40)

Proof.

Define the floating point operation count for the truncated multigrid
method by ${\tt M}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\operatorname{MGMTRUNC}^{(\tau)}_{\ell}(\boldsymbol{\mathbf{x}},\boldsymbol{\mathbf{b}}_{\ell}))$ .

By estimates (37) and (39), the quantities ${\tt P}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{p}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}})$ , ${\tt R}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{r}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}})$ , and ${\tt A}_{\ell}:=\operatorname{FLOPS}(\boldsymbol{\mathbf{x}}\mapsto\check{\boldsymbol{\mathbf{A}}}_{\ell;r_{\ell}}\boldsymbol{\mathbf{x}})$ are each bounded by $CK^{d}\log(|\Xi_{\ell}|)^{d}|\Xi_{\ell}|$ . Because each Jacobi iteration involves multiplication by a matrix with the same number of nonzero entries, we note that

{\tt S}_{\ell}:=\operatorname{FLOPS}\bigl{(}x\mapsto J(x,b)\bigr{)}\leq CK^{d}\log(|\Xi_{\ell}|)^{d}|\Xi_{\ell}|

as well. From this, we have the recursive formula

{\tt M}_{\ell}={\tt P}_{\ell}+{\tt R}_{\ell}+{\tt A}_{\ell}+(\nu_{1}+\nu_{2}){\tt S}_{\ell}+\tau{\tt M}_{\ell-1}.

Applying (37) and (39) gives ${\tt M}_{\ell}\leq CK^{d}(\nu_{1}+\nu_{2}+3)\left(|\log h_{\ell}|^{d}h^{-d}_{\ell}\right)+\tau{\tt M}_{\ell-1}.$

By setting $w_{\ell}=\left(|\log h_{\ell}|/h_{\ell}\right)^{d}$ and $\tilde{C}:=CK^{d}(\nu_{1}+\nu_{2}+3)$ , we have the recurrence

{\tt M}_{\ell}\leq\tilde{C}w_{\ell}+\tau{\tt M}_{\ell-1}\quad\longrightarrow\quad{\tt M}_{\ell}\leq\tilde{C}\sum_{k=0}^{\ell-1}w_{\ell-k}\tau^{k}+\tau^{\ell}{\tt M}_{0}

Note that $w_{\ell}\leq(|\log h_{0}|+\ell|\log\gamma|)^{d}\gamma^{-d\ell}h_{0}^{-d}$ , since $h_{\ell}\leq\gamma^{\ell}h_{0}$ . By Hölder’s inequality, we have $(|\log h_{0}|+\ell|\log\gamma|)^{d}\leq 2^{\frac{d-1}{d}}(|\log h_{0}|^{d}+\ell|\log\gamma|^{d})$ , which provides the estimate $w_{\ell-k}\leq 2^{\frac{d-1}{d}}h_{0}^{-d}\gamma^{-d\ell}\gamma^{dk}(|\log h_{0}|^{d}+\ell(1-k/\ell)|\log\gamma|^{d})$ .

Applying this to the above estimate for ${\tt M}_{\ell}$ gives,

	$\displaystyle{\tt M}_{\ell}$	$\displaystyle\leq\tilde{C}h_{0}^{-d}\gamma^{-d\ell}\left(\|\log h_{0}\|^{d}\sum_{k=0}^{\ell-1}(\tau\gamma^{d})^{k}+\|\log\gamma\|^{d}\ell^{d}\sum_{k=0}^{\ell-1}(1-k/\ell)^{d}(\tau\gamma^{d})^{k}\right)$
		$\displaystyle\phantom{\leq}+\tau^{\ell}{\tt M}_{0}$
		$\displaystyle\leq\tilde{C}h_{0}^{-d}\gamma^{-d\ell}\left\|\log h_{0}+\log\gamma^{\ell}\right\|^{d}+\tau^{\ell}{\tt M}_{0}\leq\tilde{C}h^{-d}(\|\log h\|)^{d}+\tau^{\ell}{\tt M}_{0}.$

The result follows by taking $(\gamma^{\ell}h_{0})^{-d}\leq Ch^{-d}$ and $(|\log h_{0}|+\ell|\log\gamma|)\sim|\log h|$ . ∎

Remark 4.

The kernel-based Galerkin problem $\check{\boldsymbol{\mathbf{A}}}_{L;r_{L}}\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}=\boldsymbol{\mathbf{b}}_{L}$ , can be solved stably to any precision $\epsilon_{\max}$ , by iterating $\operatorname{MGMTRUNC}^{(\tau)}_{L}(\check{\boldsymbol{\mathbf{u}}}_{L},\boldsymbol{\mathbf{b}}_{L})$ , i.e., the truncated multigrid with $\tau\geq 2$ cycle. Select $\gamma<1$ and fix $\nu_{1}$ so that Theorem 2 holds. Let $\check{\boldsymbol{\mathbf{u}}}_{L}^{(k+1)}=\operatorname{MGMTRUNC}^{(\tau)}_{L}(\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)},\boldsymbol{\mathbf{b}}_{L})$ . If $k$ is the least integer satisfying $\gamma^{k}\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(0)}\|_{\ell_{2}}<\epsilon_{\max}$ , then

k\sim\frac{1}{\log\gamma}\log\left(\frac{\epsilon_{\max}}{\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(0)}\|_{\ell_{2}}}\right).

Due to Theorem 3, we obtain an overall complexity of

\mathcal{O}\left(\frac{1}{\log\gamma}\log\left(\frac{\epsilon_{\max}}{\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(0)}\|}\right)N_{L}\log(N_{L})^{d}\right).

We note that $\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}\sim\|\sigma_{L}^{*}\left(\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\right)\|_{W_{2}^{1}}\leq C_{\mathrm{Bernstein}}h^{d/2-1}\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\|_{\ell_{2}},$ and since $d\geq 2$ , achieving $\|\check{\boldsymbol{\mathbf{u}}}_{L}^{\star}-\check{\boldsymbol{\mathbf{u}}}_{L}^{(k)}\|_{\boldsymbol{\mathbf{A}}_{L}}<\epsilon_{\max}$ also requires only a fixed number of iterations. This is repeated below in statement (41).

Indeed, using $k$ steps of the Conjugate Gradient method on the original system (1), would give error $\|\bar{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|_{\boldsymbol{\mathbf{A}}_{L}}\leq\left(\frac{CN^{2/d}_{L}-1}{CN^{2/d}_{L}+1}\right)^{k}\|\boldsymbol{\mathbf{u}}^{\star}_{L}-\boldsymbol{\mathbf{u}}^{(0)}\|_{\boldsymbol{\mathbf{A}}_{L}}$ . Thus to ensure a tolerance of $\|\bar{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|_{\boldsymbol{\mathbf{A}}_{L}}\leq\epsilon_{\max}$ , one would need

k\sim\frac{1}{\log(1-\tilde{C}N^{-2/d}_{L})}\log\left(\frac{\epsilon_{\max}}{\|\bar{\boldsymbol{\mathbf{u}}}^{(0)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|}\right)\sim\mathcal{O}\left(N^{2/d}_{L}\log\left(\frac{\epsilon_{\max}}{\|\bar{\boldsymbol{\mathbf{u}}}^{(0)}_{L}-\boldsymbol{\mathbf{u}}^{\star}_{L}\|}\right)\right)

steps, where we use $\left(\frac{CN^{2/d}_{L}-1}{CN^{2/d}_{L}+1}\right)\sim\left(1-\tilde{C}N^{-2/d}\right)$ .

In contrast to this, the multigrid $W$ -cycle requires only

k\sim\left|\log\left(\frac{\epsilon_{\max}}{\|\check{\boldsymbol{\mathbf{u}}}^{(0)}_{L}-\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}\|}\right)\right|

(41)

iterations to achieve error $\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}\|_{\boldsymbol{\mathbf{A}}_{L}}\leq\epsilon_{\max}$ . In fact, it reaches $\|\check{\boldsymbol{\mathbf{u}}}^{(k)}_{L}-\check{\boldsymbol{\mathbf{u}}}^{\star}_{L}\|_{\ell_{2}}\leq\epsilon_{\max}$ , which is a stronger constraint, within $k$ iterations. In particular, the number of iterations is independent of the size $N_{L}$ of the problem.

Acknowledegment

The authors wish to thank the anonymous reviewers for their careful reading and helpful comments.

References

[1] G. Dziuk, C. M. Elliott, Finite element methods for surface PDEs, Acta Numer. 22 (2013) 289–396.
[2] V. Shankar, G. B. Wright, A. Narayan, A robust hyperviscosity formulation for stable RBF-FD discretizations of advection-diffusion-reaction equations on manifolds, SIAM J. Sci. Comput. 42 (4) (2020) A2371–A2401.
[3] Q. T. Le Gia, I. H. Sloan, H. Wendland, Multiscale RBF collocation for solving PDEs on spheres, Numer. Math. 121 (1) (2012) 99–125.
[4] Q. Le Gia, Galerkin approximation for elliptic PDEs on spheres, Journal of Approximation Theory 130 (2) (2004) 125–149.
[5] H. Wendland, J. Künemund, Solving partial differential equations on (evolving) surfaces with radial basis functions, Advances in Computational Mathematics 46 (4) (2020) 64.
[6] Z. Sun, L. Ling, A kernel-based meshless conservative Galerkin method for solving Hamiltonian wave equations, SIAM Journal on Scientific Computing 44 (4) (2022) A2789–A2807.
[7] J. Künemund, F. J. Narcowich, J. D. Ward, H. Wendland, A high-order meshless Galerkin method for semilinear parabolic equations on spheres, Numer. Math. 142 (2) (2019) 383–419.
[8] F. J. Narcowich, S. T. Rowe, J. D. Ward, A novel Galerkin method for solving PDEs on the sphere using highly localized kernel bases, Math. Comp. 86 (303) (2017) 197–231.
[9] A. Katz, A. Jameson, Multicloud: multigrid convergence with a meshless operator, J. Comput. Phys. 228 (14) (2009) 5237–5250.
[10] G. B. Wright, A. Jones, V. Shankar, Mgm: A meshfree geometric multilevel method for systems arising from elliptic equations on point cloud surfaces, SIAM Journal on Scientific Computing 45 (2) (2023) A312–A337.
[11] W. Hackbusch, Iterative solution of large sparse systems of equations, Vol. 95, Springer, 1994.
[12] S. C. Brenner, L. R. Scott, The Mathematical Theory of Finite Element Methods, Springer New York, NY, 2007, edition Number 3.
[13] A. Reusken, Introduction to multigrid methods for elliptic boundary value problems, (2008). Technical report available at {https://www.igpm.rwth-aachen.de/Download/reports/reusken/ARpaper53.pdf}
[14] E. Fuselier, T. Hangelbroek, F. J. Narcowich, J. D. Ward, G. B. Wright, Localized bases for kernel spaces on the unit sphere, SIAM J. Numer. Anal. 51 (5) (2013) 2538–2562.
[15] T. Hangelbroek, F. J. Narcowich, C. Rieger, J. D. Ward, Direct and inverse results on bounded domains for meshless methods via localized bases on manifolds, in: J. Dick, F. Kuo, H. Woźniakowski (Eds.), Contemporary computational mathematics—a celebration of the 80th birthday of Ian Sloan. Vol. 1, 2, Springer, Cham, 2018, pp. 517–543.
[16] J. H. Bramble, J. E. Pasciak, J. Xu, Parallel multilevel preconditioners, Mathematics of Computation 55 (191) (1990) 1–22.
[17] V. John, Multigrid methods, Tech. rep., (2014). Lecture notes available at {https://www.wias-berlin.de/people/john/LEHRE/MULTIGRID/multigrid.pdf}
[18] S. Balay, S. Abhyankar, M. F. Adams, S. Benson, J. Brown, P. Brune, K. Buschelman, E. Constantinescu, L. Dalcin, A. Dener, V. Eijkhout, J. Faibussowitsch, W. D. Gropp, V. Hapla, T. Isaac, P. Jolivet, D. Karpeev, D. Kaushik, M. G. Knepley, F. Kong, S. Kruger, D. A. May, L. C. McInnes, R. T. Mills, L. Mitchell, T. Munson, J. E. Roman, K. Rupp, P. Sanan, J. Sarich, B. F. Smith, S. Zampini, H. Zhang, H. Zhang, J. Zhang, PETSc/TAO users manual, Tech. Rep. ANL-21/39 - Revision 3.21, Argonne National Laboratory (2024).
[19] R. Abraham, J. E. Marsden, T. Ratiu, Manifolds, tensor analysis, and applications, 2nd Edition, Vol. 75 of Applied Mathematical Sciences, Springer-Verlag, New York, 1988.
[20] J. M. Lee, Introduction to Riemannian manifolds, 2nd Edition, Vol. 176 of Graduate Texts in Mathematics, Springer, Cham, 2018.
[21] T. Hangelbroek, F. J. Narcowich, J. D. Ward, Kernel approximation on manifolds I: bounding the Lebesgue constant, SIAM J. Math. Anal. 42 (4) (2010) 1732–1760.
[22] H. Triebel, Spaces of Besov-Hardy-Sobolev type on complete Riemannian manifolds, Ark. Mat. 24 (2) (1986) 299–337.
[23] T. Hangelbroek, F. J. Narcowich, J. D. Ward, Polyharmonic and related kernels on manifolds: interpolation and approximation, Found. Comput. Math. 12 (5) (2012) 625–670.
[24] M. E. Taylor, Partial differential equations. I, Vol. 115 of Applied Mathematical Sciences, Springer-Verlag, New York, 1996, basic theory.
[25] T. Hangelbroek, F. J. Narcowich, X. Sun, J. D. Ward, Kernel approximation on manifolds II: the $L_{\infty}$ norm of the $L_{2}$ projector, SIAM J. Math. Anal. 43 (2) (2011) 662–684.
[26] P. Collins, Kernel-based Galerkin methods on compact manifolds without boundary, with an emphasis on SO(3), Ph.D. thesis, University of Hawai‘i at Mānoa, (2021). Thesis available at {https://scholarspace.manoa.hawaii.edu/items/936c034f-5039-4ef0-8f03-03f21d71184b}

$\displaystyle\\|\nabla\chi_{\xi}\\|_{L_{2}(B(\xi,r))}^{2}$	$\displaystyle\geq$	$\displaystyle{\Gamma_{1}^{2}}r^{d-2}\\|\nabla F\\|_{L_{2}(B(0,1))}^{2}$
	$\displaystyle\geq$	$\displaystyle c^{2}{\Gamma_{1}^{2}}r^{d-2}\\|F\\|_{L_{2}(B(0,1))}^{2}$
	$\displaystyle\geq$	$\displaystyle c^{2}\Gamma_{1}^{2}\rho^{-2}h^{-2}\\|\chi_{\xi}\\|_{L_{2}(B(\xi,r))}^{2}.$

$\displaystyle\left\\|\mathcal{C}_{\ell}\right\\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$	$\displaystyle\leq$	$\displaystyle\left\\|\mathcal{C}_{\text{TG}_{\ell}}\right\\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$
		$\displaystyle+\left\\|\mathcal{S}_{\ell}^{\nu_{2}}\sigma^{}_{\ell}\boldsymbol{\mathbf{p}}_{\ell}(\boldsymbol{\mathbf{C}}_{\ell-1})^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma^{}_{\ell})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}\right\\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$
	$\displaystyle\leq$	$\displaystyle\left\\|\mathcal{C}_{\text{TG}_{\ell}}\right\\|_{L_{2}(\mathbb{M})\to L_{2}(\mathbb{M})}$
		$\displaystyle+C\left\\|\sigma^{}_{\ell}\boldsymbol{\mathbf{p}}_{\ell}(\boldsymbol{\mathbf{C}}_{\ell-1})^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma^{}_{\ell})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}\right\\|_{L_{2}(\mathbb{M})}$

	$\displaystyle\\|\mathcal{\check{C}}_{\ell}\\|$	$\displaystyle\leq$	$\displaystyle\\|\mathcal{\check{C}}_{\text{TG}_{\ell}}\\|$
			$\displaystyle+C\left(h^{-2}+\nu_{1})h^{-d}\epsilon\\|\mathcal{\check{C}}_{\ell-1}^{\tau}\\|+\left\\|\sigma_{\ell}^{}\boldsymbol{\mathbf{p}}_{\ell}\boldsymbol{\mathbf{\check{C}}}_{\ell-1}^{\tau}\boldsymbol{\mathbf{A}}_{\ell-1}^{-1}\boldsymbol{\mathbf{r}}_{\ell}\boldsymbol{\mathbf{A}}_{\ell}(\sigma_{\ell}^{})^{-1}\mathcal{S}_{\ell}^{\nu_{1}}\right\\|\right).$

	$\displaystyle\\|\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\\|_{2\to 2}$	$\displaystyle\leq\\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}-\boldsymbol{\mathbf{\check{C}}}_{\text{TG}_{\ell}}\\|_{2\to 2}+\\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}\\|_{2\to 2}$
		$\displaystyle\leq C_{\text{Lem.\ref{lem:2step_pert}}}(\nu_{2}+\nu_{1}+h_{\ell-1}^{-4})h_{\ell-1}^{-(d+2)}\epsilon_{\ell-1}+\\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}\\|_{2\to 2}$
		$\displaystyle\leq\min(C_{1},\gamma C_{2})+\\|\boldsymbol{\mathbf{C}}_{\text{TG}_{\ell}}\\|_{2\to 2}.$

Kernel Multigrid on Manifolds

Abstract

keywords:

MSC:

1 Introduction

Remark 1.

Outline of the paper

2 Problem Set Up

2.1 Manifold

2.2 Sobolev spaces

2.3 Elliptic operator

2.4 Galerkin methods

Lemma 1.

2.5 Kernel approximation

Lemma 2.

Proof.

3 The Lagrange basis and stiffness matrix

Assumption 1.

3.1 The stiffness matrix

Lemma 3.

Proof.

Lemma 4.

Proof.

3.2 The diagonal of the stiffness matrix

Lemma 5.

Proof.

Lemma 6.

Proof.

4 The smoothing property

4.1 L2L_{2} stability of the smoothing operator

Lemma 7.

Proof.

4.2 Smoothing properties

Lemma 8.

Proof.

Lemma 9.

Proof.

5 The direct kernel multigrid method

5.1 Setup: Grid transfer

Prolongation and restriction

Lemma 10.

Proof.

5.2 multigrid iteration – two level case

Proposition 1.

Proof.

Corollary 1.

Proof.

5.3 multigrid with τ\tau-cycle

Lemma 11.

Proof.

Theorem 1.

Proof.

Remark 2.

6 The perturbed multigrid method

Perturbed multigrid method

Lemma 12.

Proof.

Perturbed two grid method

Lemma 13.

Remark 3.

Proof of Lemma 13.

Perturbed τ\tau-cycle

Theorem 2.

Proof.

7 Truncated multigrid method

Lemma 14.

Proof.

Truncated stiffness matrix

Lemma 15.

Proof.

Truncated prolongation and restriction matrices

Lemma 16.

Proof.

Truncated τ\tau-cycle

Theorem 3.

Proof.

Remark 4.

Acknowledegment

References

4.1 $L_{2}$ stability of the smoothing operator

5.3 multigrid with $\tau$ -cycle

Perturbed $\tau$ -cycle

Truncated $\tau$ -cycle