Kernel Multigrid on Manifolds
Abstract
Kernel methods for solving partial differential equations on surfaces have the advantage that those methods work intrinsically on the surface and yield high approximation rates if the solution to the partial differential equation is smooth enough. Localized Lagrange bases have proven to alleviate the computational complexity of usual kernel methods to some extent. The efficient numerical solution of the resulting linear systems of equations has not been addressed in the literature so far. In this article we apply the framework of geometric multigrid method with a -cycle to this particular setting. Moreover, we show that the resulting linear algebra can be made more efficiently by using the Lagrange function decay again. The convergence rates are obtained by a rigorous analysis. The presented version of a multigrid method provably works on quasi-uniform point clouds on the surface and hence does not require a grid-structure. Moreover, we can show that the computational cost to solve the linear system scales log-linear in the degrees of freedom.
keywords:
Geometric multigrid , partial differential equations on manifolds , kernel-based Galerkin methods , localized Lagrange basisMSC:
65F10 , 65Y20 , 65M12 , 65M15 , 65M55 , 65M601 Introduction
The numerical solution to partial differential equations, often time dependent, on curved geometries is crucial to many real-world applications. Of the many available numerical methods (see for instance [1] for an overview using finite elements, or [2] [3] for alternative mesh-free methods), we focus on mesh-free kernel-based Galerkin methods (see also [4], [5], [6]), which have a number of merits, including delivering high approximation orders for smooth data, providing smooth solutions, and working coordinate-free and without the need for rigid underlying geometric structures like meshes and regular grids.
Let denote in the following a compact, -dimensional Riemannian manifold without boundary. We will consider, as a spatial operator, a generic second order linear elliptic differential operator with trivial nullspace. Such operators occur for instance in the numerical solution of the heat equation using implicit time-stepping methods, see e.g. [7]. In conventional kernel-based Galerkin methods, a grid is considered and a stiffness matrix , which represents on a finite dimensional kernel space expressed endowed with some fixed basis, is assembled. This leads to the equation
(1) |
(see (18) for a precise definition).
In this paper we address the problem of reducing the computational costs of using kernels without spoiling their analytic advantages. Here, we mostly follow [7, 8] and use the kernel-based Lagrange basis. This particular basis, though not locally supported, has very strong decay properties which allows to localize computations. The almost-local support already alleviates the problem of densely populated matrices as usually encountered in kernel methods. Concretely, the full system in (1) can be well-approximated by a sparse matrix using the decay of the local Lagrange basis. Despite this, the condition number grows with the problem size . Thus, even if compressed, (1) poses a computational challenge, and, because of its conditioning, iterative methods (without preconditioning) cannot be expected to work well, since the iterations needed to ensure a prescribed accuracy may grow with the number of degrees of freedom.
To this end, we introduce and analyze a mesh-free multigrid algorithm. Specifically, we adapt the standard -cycle multigrid method to (localized) kernels and provide a rigorous analysis in this setting. Although multigrid methods recently have gained attention in the mesh-free community, see [9] and [10], the rigorous analysis we provide has been missing in the kernel-based context.
The main novelty of this paper is the following:
-
1.
We prove this mesh-free algorithm is a contraction, with norm independent of grid size (see Theorem 1). By standard techniques, it follows that numerical solutions within given tolerance can be obtained with a fixed, with respect to grid size, number of iterations (see (41) and adjacent discussion).
-
2.
The method we present is stable under perturbations of the stiffness, restriction and prolongation matrices. Thus small errors, which may come from sparsifying these matrices (as well as from quadrature, round-off or modification of the kernel), also yield contractions (see Theorem 2) and therefore do not hinder performance of the algorithm.
-
3.
By compressing the stiffness, restriction and prolongation matrices, which have rapid off-diagonal decay, we obtain a solution method enjoying nearly linear complexity.
This leads to an approximate solution to the system (1) which requires only floating point operations per iteration and where again the multigrid matrix has a norm bound less than unity (see Theorem 2), for a total operation count of
where is the user-prescribed tolerance. The approximate solution to (1) has an (additive) error bound of the form , where the error due to truncating matrices is . Here is a user-determined constant which depends linearly on the sparsity of truncated matrices in the algorithm. This is explained in Remark 4.
Remark 1.
Before proceeding, we make the following comments:
-
1.
We do not attempt to modify the underlying framework of the multigrid method, as described, for instance, in [11, 12, 13], but show that it can be successfully applied to kernel methods. This is far from being obvious. We follow conceptually the book [12] where also the function space view on multigrid methods is used.
-
2.
The fact that is injective is a convenient simplification we assume throughout the article. An investigation of operators with a non-trivial nullspace will be considered in a forthcoming work.
-
3.
Throughout this article, we will assume to have access to a sequence of quasi-uniform and nested point clouds on , and a corresponding highly localized Lagrange basis generated by a kernel . Both, the construction of point clouds on manifolds and the computation of the Lagrange basis are independent of the partial differential equation and can hence be pre-computed and even be stored.
-
4.
Although numerical integration is necessary to implement the kernel-based Galerkin method, our results hold independently of the choice of quadrature method. (This is treated, for instance, in [7, Section 4].) We therefore assume for the remainder of this paper these steps to be solved. In particular, we assume to have access to the stiffness matrices , as considered in [14] or [15].
-
5.
An alternative approach to treating (1) would be to apply a suitable preconditioner. To be effective in this context, such methods would also have to be adapted to the kernel situation and the analysis including the compression argument would have to be carefully carried over as well. Furthermore, many successful preconditioners for finite elements, like [16], typically involve concepts from multigrid methods such as hierarchy of approximation spaces.
-
6.
We discuss in this article the multigrid method only from the perspective as standalone solver. We point out that the multigrid method itself provides an attractive preconditioner. Most often, the multigrid preconditioner is combined with a flexible GMRES (FGMRES) iteration method, see e.g. [17] and references therein. This is also implemented in many software libraries such as PETSc (see [18, page 92f.]) just to name one prominent example. We will discuss this in more detail when we present numerical results on this method in an upcoming work.
Outline of the paper
The remainder of this article is organized as follows. In section 2, we introduce the basic notation of second order elliptic equations on manifolds, and their solution via kernel-based Galerkin approximation. In this section we demonstrate the approximation property in the kernel context, which, along with the smoothing property, provides the analytic backbone for the success of the multigrid method. Section 3 introduces the phenomenon of rapidly decaying Lagrange-type bases, which holds for certain kernels – using such bases permits stiffness matrices with rapid off-diagonal decay, among other things. Of special importance is the diagonal behavior of the stiffness matrix given in Lemma 5, which is a novel contribution of this paper, and which is the analytic result necessary to prove the smoothing property. In section 4 we discuss the smoothing property of damped Jacobi iterations for kernel-based methods both in the case of symmetric and non-symmetric differential operators. In section 5 we introduce kernel-based restriction and prolongation operators, the standard two-grid method and then the -cycle. In this section, we prove Theorem 1, a consequence of which is the bound (41), which shows the (poly)-logarithmic complexity of our proposed method. Section 6 treats the error resulting from small perturbations of the stiffness matrix, as well as the prolongation and restriction matrices. The main result in this section, Theorem 2, demonstrates how such errors affect the multigrid approximation error. Section 7 investigates the computationally efficient truncated multigrid method as an application of the previous section.
2 Problem Set Up
2.1 Manifold
Consider a compact -dimensional Riemannian manifold without boundary . Here we list some useful tools and their properties which hold in this setting. We direct the reader to [19, 20] for relevant background. Results about covariant derivatives and Sobolev spaces can be found in [21].
The tangent bundle is and the cotangent bundle is . We denote by the vector bundle of tensors with contravariant order and covariant order ; we say type for short. Thus and . We will denote the fiber at by . The space of tensor fields of type (known also as sections; i.e., maps with for every ) is denoted .
In this article, we are concerned primarily with covariant tensors (i.e., tensors of type ), so we use the short hand notation and .
For a chart for from which we get the usual vector fields and forms (), which act as local bases for and over . These can be used to generate bases for tensor fields. In particular, for given covariant rank and we have basis element . This allows us to write in coordinates as .
Because is a Riemannian manifold, at each , has an inner product and induced norm . This means that there is by (the metric tensor), so that for tangent vectors in we have . The inner product extends to the dual: for cotangent vectors we have , where . From this, it naturally extends to tensors. For , written in coordinates as and , we have
We denote the Riemannian distance on by ; it is given by the formula where the infimum is taken over piecewise smooth curves connecting and . The Riemannian metric gives rise to a volume form . By compactness, there exist constants so that
or any ball centered at and having radius .
For a finite subset , we can define the following useful quantities: the separation distance, of and the fill distance, , of in . They are given by
The finite sets considered throughout this paper will be quasiuniform, with mesh ratio bounded by a fixed constant. For this reason, quantities which are controlled above or below by a power of can be likewise controlled by a power of – in short, whenever possible, we express estimates in terms of the fill distance , allowing constants to depend on . For instance, the cardinality is bounded above and below by
(2) |
Similarly, if is continuous then for any ,
(3) |
Moreover, we use the following notation: denotes the functions from the set to . Of course, if is a discrete finite set, we can identity .
2.2 Sobolev spaces
The covariant derivative maps tensor fields of type to fields of type . Its adjoint (with respect to the inner products on the space of sections of ) is denoted . For functions, this is fairly elementary. The covariant derivative of a (scalar) function equals its exterior derivative; in coordinates, we have
For a 1-form , we have
A direct calculation shows .
For Sobolev space is defined to be the set of functions which satisfy
where the -th order covariant derivatives can be found in [21, Section 2.2].
For a scalar function , the Laplace-Beltrami operator is given in local coordinates as . Thus for scalar functions . For any integer and , the Bessel-potential norm is equivalent to , as demonstrated in [22, Theorem 4(ii)] (although when and , the two norms are equal; this can be observed directly).
Lemma 3.2 from [21] applies, so there are uniform constants and so that the family of exponential maps (which are diffeomorphisms taking to ) provides local metric equivalences: for any open set , we have
(4) |
This shows that can be endowed with equivalent norms using a partition of unity subordinate to a cover with associated charts to obtain . Here constants of equivalence depend on the partition of unity and charts.
A useful result in this setting, which we will use explicitly in this article, but is also behind a number of background results in section 2.5, is the following zeros estimate [23, Corollary A.13 ], which holds for Sobolev spaces: If satisfies , then for any we have
(5) |
Here depends on and . The result can also be obtained on bounded, Lipschitz domain that satisfies a uniform cone condition, with the cone having radius , although the constant will depend on the aperture of the cone in that case. See [23, Theorem A.11] for a precise statement and definitions of the involved quantities.
2.3 Elliptic operator
We write the operator in divergence form:
Here is a smooth function, is a smooth tensor field of type and is a smooth tensor field of type which generates the field of type (1,1). In coordinates, has the form , and is with . Here we have used the index lowering operator ♭ which ensures, for any , that . This follows by a direct calculation from the above expressions in coordinates (index lowering and the ♭ operator are discussed in [19, p. 341] or [20, p. 27]).
Furthermore, we require to be a constant so that
(6) |
As a basic example, we consider , and ; then (6) holds with . In this case, , and .
With the identity , the second part of (6) ensures that
We have also that . The definition of ensures that this equals , and the first part of (6) then ensures that
Thus, (6) guarantees that the bilinear form , defined initially for smooth functions, is bounded on and is coercive. Thus, we have that the energy quasi-norm satisfies the metric equivalence
(7) |
(if is symmetric, this is a norm, and we write .)
Please note that this excludes differential operators with a null space at the moment. Having in mind the time dependent problems, this assumption is justified. The technical more challenging analysis for those more general operators is left to future research.
2.4 Galerkin methods
We fix and consider as solution to
(8) |
Regularity estimates ([24, Chapter 5.11, Theorem 11.1]) yield that and . Consider now a family of finite dimensional subspaces with associated to a parameter222In the sequel, these will be kernel spaces generated by a subset , and will be the fill distance. For now, we consider a more abstract setting, with only playing a role in establishing the approximation property below. . Define so that
Because , the classical Céa Lemma holds:
And thus we get
(9) |
This can be improved by a Nitsche-type argument to get the following result, whose proof can be found in many textbooks on numerical methods for partial differential equations.
Lemma 1.
Suppose the family has the property that for all , the distance in from satisfies . Then for any ,
(10) |
We point out, that Lemma 1 does not depend on the choice of a specific basis in the finite dimensional space .
2.5 Kernel approximation
We consider a continuous function , the kernel, which satisfies a number of analytic properties which we explain in this section.
Most important is that is conditionally positive definite with respect to some (possibly trivial) finite dimensional subspace333generally a space spanned by some eigenfunctions of the Laplace-Beltrami operator . Conditional positive definiteness with respect to means that for any , the collocation matrix is positive definite on the vector space . As a result, if separates elements of , then the space
(11) |
has dimension . In case , the kernel is positive definite, and the collocation matrix is strictly positive definite on , and .
For a conditionally positive definite kernel, there is an associated reproducing kernel semi-Hilbert space with the property that and that for any for which , the following identity
holds for all . It follows that if separates elements of , interpolation with is well defined on and the projection is orthogonal with respect to the semi-norm on . Of special interest are (conditionally) positive definite kernels that have Sobolev native spaces.
Lemma 2.
If is conditionally positive definite with respect to and satisfies the equivalence . then there is a constant so that for any integers with and any ,
Proof.
The case is trivial; it follows by considering .
For and , the zeros estimate [23, Corollary A.13] ensures that
(because the interpolation error vanishes on ). The hypothesis then gives, , with a suitably enlarged constant. Because is an orthogonal projector, , so we have (again enlarging the constant) that
For , we use the fact that is the real interpolation space . This is [22, Theorem 5]. For , this means that the -functional
satisfies the condition . Since is continuous and monotone, we have that is bounded on . Thus for , there exists so that
and |
The above estimate gives . This implies that as desired. ∎
As a consequence, the kernel Galerkin solution to with satisfies . More importantly, for our purposes, the hypothesis of Lemma 1 is satisfied by the space . Indeed, by (10), we have the following approximation property:
(12) |
which, together with a smoothing property, forms the backbone of the convergence theory for the multigrid method.
3 The Lagrange basis and stiffness matrix
For a kernel and a set which separates points of , the Lagrange basis for satisfies for each .
A natural consequence of is that there exists a constant so that for any , the bound holds. Especially on Riemannian manifolds without boundary, much stronger statement can be proven, see [21, 25, 23] or [14] on . We are, however, not convinced that the list of settings where stronger results hold is complete. In order to allow our results to be applied in future settings where those statement will be shown, we will formulate those stronger statements as assumptions or building blocks.
A stronger result is the following:
Assumption 1.
We assume that there is so that , and, furthermore, there exist constants and so that for
This gives rise to a number of analytic properties, some of which we present here (there are many more, see [25] and [15] for a detailed discussions). For the following estimates, the constants of equivalence depend on , , and .
Pointwise decay: there exist constants and so that for any , the estimate
(13) |
holds. Here , where we recall that the mesh ratio is .
Hölder continuity: Of later importance, we mention the following condition, which follows from [23, Corollary A.15]. For any , the Lagrange function is Hölder continuous, and satisfies the bound
(14) |
Although we make explicit the dependence on here, for the remainder of this article, we assume that constants that follow depend on .
Riesz property: there exist constants so that for any , and , we have
(15) |
Bernstein inequalities: There is a constant so that for ,
(16) |
3.1 The stiffness matrix
We now discuss the stiffness matrix and some of its properties. Most of these have appeared in [26], with earlier versions for the sphere appearing in [21] and [8].
A consequence of the results of this section is that the problem of calculating the Galerkin solution to from involves treating a problem whose condition number grows like – this is the fundamental issue that the multigrid method seeks to overcome.
The analysis map for with respect to the bilinear form defined in (8) is
The analysis map is a surjection.
The synthesis map is
The range of the synthesis map is clearly ; in other words, it is the natural isomorphism between Euclidean space and the finite dimensional kernel space; indeed, (15) shows that it is bounded above and below between and . By abusing notation slightly, we write . This permits a direct matrix representation of linear operators on via conjugation: . Furthermore, by the Riesz property (15), we have
(17) |
A simple calculation shows that , so is the -adjoint of . Of course, when is symmetric, we also have .
The stiffness matrix is defined as
(18) |
It represents the operator on the finite dimensional space . Using the analysis and synthesis maps, , and we have that the Galerkin projector can be expressed as .
Lemma 3.
There is a constant so that the entries of the stiffness matrix satisfy
Proof.
For a tensor field , we write . Thus using , we can bound the integral
Decompose each inner product using the half spaces and , noting and , with . By applying Cauchy-Schwarz to each integral gives, after combining terms,
for some constant depending on the coefficients of .
By considering row and column sums, we have that
(19) |
holds with .
Lemma 4.
Under Assumptions 1, for , the stiffness matrix satisfies
with a constant which is independent of .
Proof.
Coercivity ensures , and the metric equivalence gives
is the stiffness matrix for the self-adjoint operator . Because the spectrum of is bounded below, i.e., , we conclude that by the Riesz property (15). Hence, overall we get
with , so . ∎
Consequently, the condition number of is bounded by a multiple of . Please note that the bounds in Lemma 3 and Lemma 4 do not assume the stiffness matrix to be symmetric, only these bounds are in some sense symmetric. Moreover, (19) is obtained by Riesz-Thorin interpolation and bounds on the row and column sums. Thus this bound also does not require the stiffness matrix to be symmetric.
3.2 The diagonal of the stiffness matrix
As a counterpart to the off-diagonal decay given in Lemma 3, we can give the following lower bounds on the diagonal entries.
Lemma 5.
For an elliptic operator and a kernel satisfying Assumption 1, for a mesh ratio , there is so that for any with , and any , we have
Proof.
By coercivity of , it suffices to prove that
We begin by establishing a Poincaré-type inequality which is valid for smooth Lagrange functions. To this end, consider obtained by the change of variable . Note that for , we have for any , that
by the zeros estimate, with , where the constants stem from (4). Now let , and define by . Then, by a change of variable,
with , since .
Because of the imbedding (which holds since ), the set
is well defined, closed and convex, hence weakly compact, by Banach-Alaoglu.
The natural imbedding , is compact. We wish to show that is a compact set.
Because is a continuous linear map, it is continuous between the weak topologies of and . Thus is weakly compact in , and thus norm closed. Finally, because is complete and totally bounded, it is a compact subset in the norm topology of .
Consider the (possibly zero) constant defined by
The map is continuous on the complement of (as quotient of two continuous functions that do not vanish). In particular, it is continuous and non-vanishing on , so . Indeed, for all , since , and .
Note that in the above minimization problem, the condition could be replaced by any other point on the unit sphere without changing the value of . By rotation invariance of the norm, it follows that . Finally, employing the change of variables and , we have
The last line follows because is Hölder continuous, so stays close to near . Specifically, by (14), for we have for all . Thus
The lemma follows with constant . ∎
This brings us to the lower bound for diagonal entries of the stiffness matrix. Define the diagonal of as and note that by Lemma 3 and Lemma 5,
is bounded above by a constant which depends only on the mesh ratio (and not on ).
This permits us to find suitable damping constants so that dominates . This drives the success of the damped Jacobi method considered in the next section.
Lemma 6.
For an elliptic operator , a kernel satisfying Assumption 1, and mesh ratio , there is so that for any point set , for all .
4 The smoothing property
In this section we define and study the smoothing operator used in the multigrid method. We focus on the damped Jacobi method for the linear system , where and . For a fixed damping parameter , is approximately computed via the iteration:
with starting value . Define the affine map governing a single iteration as
This shows that the iteration converges if and only if iteration matrix
(20) |
called the smoothing matrix, is a contraction.
In the context of kernel approximation, the corresponding operator, the smoothing operator, is defined as The success of the multigrid method relies on a smoothing property, which for our purposes states that iterating is eventually contracting: as . This smoothing property is demonstrated in section 4.2. Many of the results in the forthcoming section are formulated both in a matrix including form and an operator form . This resembles a bit a change of basis transformation.
4.1 stability of the smoothing operator
At this point, we are in a position to show that that iterating this operator is stable on . To help analyze the matrix , we introduce the inner product
Since is diagonal, and its diagonal entries are , we have the norm equivalence . Specifically, we have
(21) |
with constant of equivalence .
Lemma 7.
We note that this holds even when is non-symmetric.
4.2 Smoothing properties
For , we have . By applying Cauchy-Schwarz, the Riesz property and Lemma 7, the chain of inequalities
holds. In the case that is symmetric, we have the following smoothness property.
Lemma 8.
Proof.
This is a result of [13, Theorem 7.9], which shows that . It follows that , and the result holds by the above discussion. ∎
When is not symmetric, we have the smoothness property.
5 The direct kernel multigrid method
We are now in a position to consider the multigrid method applied to the kernel based Galerkin method we have described in the previous sections.
5.1 Setup: Grid transfer
We consider a nested sequence of point sets
and associated kernel spaces as described in section 2.5. We denote the Lagrange basis for each such space , and with it the accompanying analysis map , synthesis map and stiffness matrix . Moreover, we assume that there are constants and such that
(22) |
Note at this point that is a universal constant. Hence does not depend on and thus the constants in deriving the smoothing property do not depend on . Thus, we obtain , for constants see (2). We will assume that is the largest index that we will consider.
In this section, we discuss grid transfer: specifically, the operators and matrices which provide communication between finite dimensional kernel spaces. These include natural prolongation and restriction maps and their corresponding matrices. We show how these can be used to relate Galerkin projectors and stiffness matrices .
Prolongation and restriction
Denote the Lagrange basis of by , and note that by containment , it follows that holds for some matrix of coefficients . Furthermore, from the Lagrange property, the identity holds for any . By uniqueness, we deduce that , and we have
This yields that the natural injection , called the prolongation map, which is described by the rectangular matrix
It is worth noting that , so we have the identity
(23) |
The corresponding restriction map is described by the transposed matrix . In other words, it is defined as .
Note that we can use to relate analysis maps at different levels, since we can take the -adjoint of both sides of the equation (23) to obtain the following useful identity:
(24) |
Moreover, the prolongation is both bounded from above and below. This is a kernel based analogue for [13, Eq. (64)].
Lemma 10.
Using the notation from above, there is a constant depending on , , and the constants in Assumption 2 so that
(25) |
holds for all .
Proof.
We begin by estimating the and norms of by taking column and row sums, respectively. These estimates can be made almost simultaneously, because the entry of satisfies the bound by (13).
Let and , and note that both numbers depend on and the exponential decay rate from (13). Applying (3), gives
and
Finally, interpolation gives the upper bound
By using the definition we write and . Thus, we have , which gives the lower bound. ∎
5.2 multigrid iteration – two level case
We now describe the multigrid algorithm, which is a composition of smoothing operators, restriction, coarse grid correction, prolongation, and then smoothing.
We begin by considering the solution of , where is the stiffness matrix associated to , and where , is the data obtained from the Galerkin solution . Naturally, is unknown (its coefficients are the solution of the above problem), but we can compute the data via
In other words, it is obtained from the right hand side .
Two-grid method with post-smoothing in vectorial form, see [13, Eq. (20)]
The output of the two-level multigrid algorithm with initial input is given by the rather complicated formula
(26) |
Because is consistent (with ) the corresponding iteration matrix is
(27) |
(this is calculated in [13, Eq. 48] as well as in [11, Lemma 11.11]). Thus, the error can be expressed as .
The corresponding operator on is obtained by conjugating with . This gives the error operator for the two level method .
It is worth noting that by (23) we have the equality . Using the identity followed by (24) and the identity gives
(28) |
It follows that .
We are now in a position to show that the two level method is a contraction for sufficiently large values of .
Proposition 1.
There is a constant so that for all , satisfies the bound
Proof.
Corollary 1.
Let be as in Lemma 6 and let be an initial guess, , , and . If is symmetric, there is a constant independent of and so that
If is not symmetric, there is a constant independent of and so that
Proof.
This follows by applying Proposition 1 to . ∎
5.3 multigrid with -cycle
In the two-grid method, the computational bottleneck remains the solution on the coarse grid. Thus, there have been many approaches to recursively apply the multigrid philosophy in order to use a direct solver only on the coarsest grid. A flexible algorithm is the so-called -cycle. Here stands for the -cycle in multigrid methods and gives the -cycle.
Our results hold for .
Multigrid method with cycle, see [13, Eq. (31)]
Before proving our main theorem, we need a statement from elementary real analysis.
Lemma 11.
For any real numbers which satisfy , , and , if the sequence satisfies the conditions
then for all .
Proof.
This follows by elementary calculations, as in [17, Lemma 6.15]. ∎
Using [13, Theorem 7.1], we obtain for the iteration matrix of the Algorithm 2 the recursive (in the level) form
Again, we define the corresponding operator via .
Theorem 1.
For every , there is a . For all we have
(29) |
Proof.
Here, we follow basically [13, Proofof Theorem 7.20]. Let arbitrary. For , we obtain for ,
by Proposition 1 and Lemma 7. We treat the second term with (23) and (28) to obtain
This leaves
The last factor can be bounded by writing followed by the triangle inequality. Lemma 7 bounds , while Proposition 1 (with ) bounds . Thus, we end up with a bound
which has the form required by Lemma 11, with , and . The condition ensures the bound . Thus
holds by Lemma 11, and the theorem follows. ∎
Remark 2.
At the finest level, the kernel-based Galerkin problem , can be solved stably to any precision , by iterating the contraction matrix . Select and fix so that Theorem 1 holds. Letting gives . If is the least integer satisfying , then .
We note that and since , achieving also requires only a fixed number of iterations. This shows (41).
6 The perturbed multigrid method
In this section, we consider a modified problem
where is close to . The perturbed multigrid method will produce an approximate solution to which satisfies . Thus, for the true solution to (8) and the Galerkin solution , we have
which can be made as close to as desired by controlling the perturbation and the error from the multigrid approximation .
Such systems may occur for a number of reasons: using localized Lagrange basis functions (as in [8]), truncating a series expansion of the kernel (as in [26]), or by using quadrature to approximate the stiffness matrix (both [8] and [26]). In the next section, we will apply this by truncating the original stiffness matrix to employ only banded matrices and thereby enjoy a computational speed up.
Perturbed multigrid method
The perturbed multigrid method replaces matrices , and appearing in Algorithms 1 and 2 with matrices , and . We assume that for each there exists so that
In this set up may change per level.444Which could be the case, e.g., if involved a dependent quadrature scheme, or was obtained by bandlimiting (as we will do in the next section) We assume is small enough that
It then follows from standard arguments that
Because of the entry-wise error , we also have that the diagonal matrix satisfies
Therefore, there is so that for all and all , holds. This permits us to consider the Jacobi iteration applied to the perturbed linear system which yields defined by
(30) |
Since , we can estimate the error between smoothing matrices as
(31) |
Because it follows that for all ,
(32) |
This also yields the following Lemma.
Lemma 12.
For , we get the bound
Proof.
This lemma can be combined with the estimate (31) to obtain
(33) |
Perturbed two grid method
Now, we consider the perturbed version of the two step algorithm. We aim to apply the two-grid method with only truncated matrices to the problem
(34) |
Applying the two-grid method to (34) gives where the two grid iteration matrix is
(35) |
Lemma 13.
If holds for all , then
Remark 3.
A basic idea, used throughout this section, is the following result: If , then
Perturbed -cycle
As in the two-step case we consider the multigrid method also for the truncated system in (34), i.e., . The multigrid iteration matrix is
From this we define the operator .
Theorem 2.
For any , there exist constants , and and such that . For choose such that for for all and if , then
Proof.
As in the proof of Theorem 1, we make the estimate
Then (32) ensures that controls the second expression. Considering the difference
Remark 3 gives
Using the Riesz property, this gives
As in the proof of Theorem 1, the last normed expression can be bounded as
Because is bounded by assumption, it follows that
7 Truncated multigrid method
In this section we consider truncating the stiffness, prolongation and restriction matrices in order to improve the computational complexity of the method. Each such matrix has stationary, exponential off-diagonal decay, so by retaining the entry when , and setting the rest to zero, guarantees a small perturbation error (on the order of , where ). This is made precise in Lemma 15 and Lemma 16 below, with the aid of the following lemma.
Lemma 14.
Suppose , , and . Then for any , we have
Proof.
The underlying set can be decomposed as , where has cardinality . It follows that
and the lemma follows from the fact that for all . ∎
Truncated stiffness matrix
The exponential decay in Lemma 3 motivates the truncation of the stiffness matrix, see e.g. [8, Eq. (8.1)] We define for positive , the truncation parameter
(36) |
We note that is symmetric if is symmetric. By construction and quasi-uniformity, we obtain By , this yields
.
In particular, we obtain
(37) |
for the number of operations for a matrix vector multiplication with the truncated stiffness matrix, with .
Lemma 15.
With the global parameter , the estimate
(38) |
holds.
Truncated prolongation and restriction matrices
We introduce truncated prolongation matrices , with
where we use the notation . Likewise, we define . For the numerical costs, we obtain
(39) |
where we use that due to (22).
Lemma 16.
We have
Truncated -cycle
We now consider the multigrid method using truncated versions of the stiffness, prolongation and restriction matrices. We denote this by , and use it to solve (34) with . Lemmas 15 and 16 show that conditions for Theorem 2 are satisfied when is chosen sufficiently large.
Theorem 3.
If , we obtain
(40) |
Proof.
Define the floating point operation count for the truncated
multigrid
method by
.
By estimates (37) and (39), the quantities , , and are each bounded by . Because each Jacobi iteration involves multiplication by a matrix with the same number of nonzero entries, we note that
as well. From this, we have the recursive formula
By setting and , we have the recurrence
Note that , since . By Hölder’s inequality, we have , which provides the estimate .
Applying this to the above estimate for gives,
The result follows by taking and . ∎
Remark 4.
The kernel-based Galerkin problem , can be solved stably to any precision , by iterating , i.e., the truncated multigrid with cycle. Select and fix so that Theorem 2 holds. Let . If is the least integer satisfying , then
Due to Theorem 3, we obtain an overall complexity of
We note that and since , achieving also requires only a fixed number of iterations. This is repeated below in statement (41).
Indeed, using steps of the Conjugate Gradient method on the original system (1), would give error . Thus to ensure a tolerance of , one would need
steps, where we use .
In contrast to this, the multigrid -cycle requires only
(41) |
iterations to achieve error . In fact, it reaches , which is a stronger constraint, within iterations. In particular, the number of iterations is independent of the size of the problem.
Acknowledegment
The authors wish to thank the anonymous reviewers for their careful reading and helpful comments.
References
- [1] G. Dziuk, C. M. Elliott, Finite element methods for surface PDEs, Acta Numer. 22 (2013) 289–396.
- [2] V. Shankar, G. B. Wright, A. Narayan, A robust hyperviscosity formulation for stable RBF-FD discretizations of advection-diffusion-reaction equations on manifolds, SIAM J. Sci. Comput. 42 (4) (2020) A2371–A2401.
- [3] Q. T. Le Gia, I. H. Sloan, H. Wendland, Multiscale RBF collocation for solving PDEs on spheres, Numer. Math. 121 (1) (2012) 99–125.
- [4] Q. Le Gia, Galerkin approximation for elliptic PDEs on spheres, Journal of Approximation Theory 130 (2) (2004) 125–149.
- [5] H. Wendland, J. Künemund, Solving partial differential equations on (evolving) surfaces with radial basis functions, Advances in Computational Mathematics 46 (4) (2020) 64.
- [6] Z. Sun, L. Ling, A kernel-based meshless conservative Galerkin method for solving Hamiltonian wave equations, SIAM Journal on Scientific Computing 44 (4) (2022) A2789–A2807.
- [7] J. Künemund, F. J. Narcowich, J. D. Ward, H. Wendland, A high-order meshless Galerkin method for semilinear parabolic equations on spheres, Numer. Math. 142 (2) (2019) 383–419.
- [8] F. J. Narcowich, S. T. Rowe, J. D. Ward, A novel Galerkin method for solving PDEs on the sphere using highly localized kernel bases, Math. Comp. 86 (303) (2017) 197–231.
- [9] A. Katz, A. Jameson, Multicloud: multigrid convergence with a meshless operator, J. Comput. Phys. 228 (14) (2009) 5237–5250.
- [10] G. B. Wright, A. Jones, V. Shankar, Mgm: A meshfree geometric multilevel method for systems arising from elliptic equations on point cloud surfaces, SIAM Journal on Scientific Computing 45 (2) (2023) A312–A337.
- [11] W. Hackbusch, Iterative solution of large sparse systems of equations, Vol. 95, Springer, 1994.
- [12] S. C. Brenner, L. R. Scott, The Mathematical Theory of Finite Element Methods, Springer New York, NY, 2007, edition Number 3.
- [13] A. Reusken, Introduction to multigrid methods for elliptic boundary value problems, (2008). Technical report available at {https://www.igpm.rwth-aachen.de/Download/reports/reusken/ARpaper53.pdf}
- [14] E. Fuselier, T. Hangelbroek, F. J. Narcowich, J. D. Ward, G. B. Wright, Localized bases for kernel spaces on the unit sphere, SIAM J. Numer. Anal. 51 (5) (2013) 2538–2562.
- [15] T. Hangelbroek, F. J. Narcowich, C. Rieger, J. D. Ward, Direct and inverse results on bounded domains for meshless methods via localized bases on manifolds, in: J. Dick, F. Kuo, H. Woźniakowski (Eds.), Contemporary computational mathematics—a celebration of the 80th birthday of Ian Sloan. Vol. 1, 2, Springer, Cham, 2018, pp. 517–543.
- [16] J. H. Bramble, J. E. Pasciak, J. Xu, Parallel multilevel preconditioners, Mathematics of Computation 55 (191) (1990) 1–22.
- [17] V. John, Multigrid methods, Tech. rep., (2014). Lecture notes available at {https://www.wias-berlin.de/people/john/LEHRE/MULTIGRID/multigrid.pdf}
- [18] S. Balay, S. Abhyankar, M. F. Adams, S. Benson, J. Brown, P. Brune, K. Buschelman, E. Constantinescu, L. Dalcin, A. Dener, V. Eijkhout, J. Faibussowitsch, W. D. Gropp, V. Hapla, T. Isaac, P. Jolivet, D. Karpeev, D. Kaushik, M. G. Knepley, F. Kong, S. Kruger, D. A. May, L. C. McInnes, R. T. Mills, L. Mitchell, T. Munson, J. E. Roman, K. Rupp, P. Sanan, J. Sarich, B. F. Smith, S. Zampini, H. Zhang, H. Zhang, J. Zhang, PETSc/TAO users manual, Tech. Rep. ANL-21/39 - Revision 3.21, Argonne National Laboratory (2024).
- [19] R. Abraham, J. E. Marsden, T. Ratiu, Manifolds, tensor analysis, and applications, 2nd Edition, Vol. 75 of Applied Mathematical Sciences, Springer-Verlag, New York, 1988.
- [20] J. M. Lee, Introduction to Riemannian manifolds, 2nd Edition, Vol. 176 of Graduate Texts in Mathematics, Springer, Cham, 2018.
- [21] T. Hangelbroek, F. J. Narcowich, J. D. Ward, Kernel approximation on manifolds I: bounding the Lebesgue constant, SIAM J. Math. Anal. 42 (4) (2010) 1732–1760.
- [22] H. Triebel, Spaces of Besov-Hardy-Sobolev type on complete Riemannian manifolds, Ark. Mat. 24 (2) (1986) 299–337.
- [23] T. Hangelbroek, F. J. Narcowich, J. D. Ward, Polyharmonic and related kernels on manifolds: interpolation and approximation, Found. Comput. Math. 12 (5) (2012) 625–670.
- [24] M. E. Taylor, Partial differential equations. I, Vol. 115 of Applied Mathematical Sciences, Springer-Verlag, New York, 1996, basic theory.
- [25] T. Hangelbroek, F. J. Narcowich, X. Sun, J. D. Ward, Kernel approximation on manifolds II: the norm of the projector, SIAM J. Math. Anal. 43 (2) (2011) 662–684.
- [26] P. Collins, Kernel-based Galerkin methods on compact manifolds without boundary, with an emphasis on SO(3), Ph.D. thesis, University of Hawai‘i at Mānoa, (2021). Thesis available at {https://scholarspace.manoa.hawaii.edu/items/936c034f-5039-4ef0-8f03-03f21d71184b}