This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Domain Decomposition Framework for Maxwell Finite Element Solvers and Application to PIC

Zane D. Crawford,  O. H. Ramachandran,  Scott O’Connor,  Daniel L. Dault,  John Luginsland,  and B. Shanker,  Z. D. Crawford, O. H. Ramachandran, S. O’Connor, B. Shanker are with the Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, 48824.
D. Dault is with AFRL, Dayton, OH 4543
J. Luginsland is with AFRL/Air Force Office of Scientific Research, Arlington, VA 22201. E-mail: [email protected]
Abstract

The most popular methods for self-consistent simulation of fields interacting with charged species is using finite difference time domain (FDTD) methods together with Newton’s laws of motion to evolve locations and velocities of particles. Despite their popularity, the limitation of FDTD particle in cell (EM-FDTDPIC) methods are well known. To address these, there has been significant interest over the past decade in exploring alternatives. In the past few years, the advances in electromagnetic finite element methods for particle in cell (EM-FEMPIC) has advanced by leaps and bounds. The mathematics necessary for implicit FEM methods that are unconditionally stable and charge conserving are now well understood. Some of these advances are more recent. The next bottleneck necessary to make EM-FEMPIC competitive with FDTD based scheme is overcoming computational cost. Our approach to resolving this challenge is develop two different finite element tearing and integration approaches, and using these to create domain decomposition schemes for EM-FEMPIC. Details of the proposed methodology are presented as well as a number of results that demonstrates charge conservation as well as amelioration of costs for a number of problems.

Index Terms:
particle-in-cell methods, charge conservation, finite element method, domain decomposition, finite element tearing and integration

I Introduction

The design of novel accelerator and vacuum devices require computational tools capable of simulating self-consistent interactions between plasma and electromagnetic fields [1, 2, 3, 4]. This is commonly done by evolving a particle distribution in time due to the Lorentz force defined by an electric field and magnetic flux density; this includes both external/impressed and those generated by the acceleration of charged species. The predominant approach is the finite difference time domain based particle in cell method (EM-FDTDPIC) [5]. This method has seen decades of use and advancement, primarily due its ease of use and efficiency, as the fields can be determined from a simple recurrence formula rather than matrix inversion. Furthermore, parallel implementations and fast techniques have enabled EM-FDTDPIC to simulate large problems [6, 7, 8, 9]. However, there has been interest in exploring alternatives to EM-FDTDPIC. To a large extent, this is motivated by the higher order fidelity that one can obtain in terms of both geometry as well as physics.

In response to this need, the past decade has seen extensive effort to create an electromagnetic particle in cell method based around the finite element method (EM-FEMPIC); see [10, 11] and papers therein. It has been shown that by choosing proper basis sets for the fields and currents, it is possible to obtain charge conserving EM-FEMPIC schemes through exact current mapping [12, 13, 11]. Unfortunately, these methods use explicit time stepping with first order basis functions to represent both the fields, paticles, currents, and time. More recent work has utilized higher order spatial basis better represent fields, but the error in representation is still constrained by the representation for time [14]; note, the moniker “structure-preserving” is used when referring to a method that uses Whitney basis/ensures that the de-Rham relations hold. The downside of using explicit time stepping is that it is conditionally stable, and the time step to ensure stability is determined by the smallest feature size. Obviously, this is a challenge when analyzing plasma in geometrically complex in structures.

More recent work provided a general framework for charge conservation to be satisfied, irrespective of time stepping scheme [15]. A practical demonstration of satisfaction of conservation laws for implicit unconditionally stable finite element scheme for Maxwell equation was shown for a set of test problems in [16]. It was also shown in this paper, that the proposed methodology was valid for a wave equation FEM solver as well. Unfortunately, it is well known that there exists a null space that grows as tϕ(𝐫)t\gradient\phi(\mathbf{r}) for the wave equation solvers and a null space that varies as ϕ(𝐫)\gradient\phi(\mathbf{r}) for implicit Maxwell solvers, where ϕ(𝐫)\phi(\mathbf{r}) is a scalar field. While it may be possible to control the magnitude of the null space excited in Maxwell solvers, the one for the wave equation unfortunately rears its head with time. The development of a discrete Coulomb gauge in [17] shows how one may obtain a method that satisfies Gauss’ laws for both wave equation and Maxwell equation solvers. Despite this progress, a fundamental bottleneck remains. All EM-FEMPIC methods are more expensive than EM-FDTDPIC methods.

The reason for the efficiency of FDTD schemes fairly simple; there is no inversion thanks to the structure of the method [18]. Unfortunately, FEM methods typically involve inversion of matrices. As is well known, a naive direct inversion would cost 𝒪(Ndof3)\mathcal{O}(N_{dof}^{3}) per time step where NdofN_{dof} is the number of degrees of freedom. This can be reduced to 𝒪(NdoflogNdof)\mathcal{O}(N_{dof}\log N_{dof}) if a multifrontal sparse solver is used [19, 20] with 𝒪(Ndofνpolylog(Ndof))\mathcal{O}(N_{dof}^{\nu}polylog(N_{dof})) for factorization where ν<2\nu<2 [21]. Alternatively, one typically uses an iterative solver whose cost scales as 𝒪(NiterNdof)\mathcal{O}(N_{iter}N_{dof}), where NiterN_{iter} is the number of iterations which depends on the error threshold. As a result, one takes recourse to preconditioners. But it has been shown that the most computationally effective approach has been to use domain decomposition [22, 23]. A bulk of the effort has been applying this technique to amortize the cost of wave equations and largely in the Fourier domain [24, 25, 26, 27], with some scant papers in the time domain [28, 29]. The primary benefit of this approach is to reduce the complexity of the problem by effectively reducing the size of NdofN_{dof} that needs to be inverted.

The domain decomposition approach in this work is the finite element tearing and interconnecting (FETI) framework. Here, the domain is partitioned into multiple regions/subdomains. For each subdomain, the unknowns are divided into interior and boundary quantities. The quantities on the boundary are shared between multiple subdomains, and are related via appropriate transmission conditions. Via manipulations, one only solves for unknowns on the boundary of the subdomains which is far fewer that the overall number of unknowns. All interior unknowns can be related to the ones on the boundary. Note, this differs from the domain decomposition approaches currently used for FDTD, which takes a Schwarz approach which uses overlapping subdomains, requiring iteration of each subdomain to get a converged solution[30].

Thus, the main contributions of this paper are as follows: (a) we derive two FETI schemes for time domain Maxwell equation solvers, (b) these methods are then used within an unconditionally stable EM-FEMPIC framework using an exact current mapping scheme, and (c) we present a number of results that demonstrate the efficacy of the proposed technique.

The rest of this paper is organized as follows. Section II will define the problem setup. Section II-A will briefly review particle mapping approach used followed by the field discretization in space and time in Section II-B. Section III will define the two domain decomposition formulations. Results using both of these methods will be presented in Section IV followed by conclusions and future work in Section V.

II Preliminaries

Let a region Ω3\Omega\in\mathbb{R}^{3} be bounded by surface Ω\partial\Omega and contain at least one charged species. The charges are represented by a phase space distribution function (PSDF) that is subject to the Maxwell-Vlasov equation

tf(t,𝐫,𝐯)+𝐯f(t,𝐫,𝐯)+\displaystyle\partial_{t}f(t,\mathbf{r},\mathbf{v})+\mathbf{v}\cdot\nabla f(t,\mathbf{r},\mathbf{v})+ (1)
qm[𝐄(t,𝐫)+𝐯×𝐁(t,𝐫)]vf(t,𝐫,𝐯)=0.\displaystyle\frac{q}{m}[\mathbf{E}(t,\mathbf{r})+\mathbf{v}\times\mathbf{B}(t,\mathbf{r})]\cdot\nabla_{v}f(t,\mathbf{r},\mathbf{v})=0.

The electric field 𝐄(t,𝐫)\mathbf{E}(t,\mathbf{r}) and magnetic flux density 𝐁(t,𝐫)\mathbf{B}(t,\mathbf{r}) are obtained through the mixed finite element method, which discretizes Faraday’s law and Ampere’s law. The PSDF is evolved through the PIC cycle, which consists of:

  1. 1.

    Gathering the particles onto the mesh to be mapped into the field solve,

  2. 2.

    Solving for the fields due to the current from the particle motion,

  3. 3.

    Interpolating the fields at the particle positions,

  4. 4.

    Pushing the particles to new positions.

The following subsections will describe how Maxwell’s equations are discretized in space and time, and how the particles are mapped into the field solver.

II-A Current Mapping

The charge and current density is defined using the PSDF conventional definition, ρ(t,𝐫)=qΩf(t,𝐫,𝐯)𝑑𝐯\rho(t,\mathbf{r})=q\int_{\Omega}f(t,\mathbf{r},\mathbf{v})d\mathbf{v} and 𝐉ρ(t,𝐫)=qΩ𝐯f(t,𝐫,𝐯)𝑑𝐯\mathbf{J}_{\rho}(t,\mathbf{r})=q\int_{\Omega}\mathbf{v}f(t,\mathbf{r},\mathbf{v})d\mathbf{v}. The PSDF is approximated by using NpN_{p} samples, or particles, with shape S(𝐫)S(\mathbf{r}) such that the charge and current density are defined as ρ(t,𝐫)=qp=1NpS(𝐫𝐫p(t))\rho(t,\mathbf{r})=q\sum_{p=1}^{N_{p}}S(\mathbf{r}-\mathbf{r}_{p}(t)) and 𝐉ρ(t,𝐫)=qp=1Np𝐯(t)S(𝐫𝐫p(t))\mathbf{J}_{\rho}(t,\mathbf{r})=q\sum_{p=1}^{N_{p}}\mathbf{v}(t)S(\mathbf{r}-\mathbf{r}_{p}(t)) where 𝐫p(t)\mathbf{r}_{p}(t) and 𝐯p(t)\mathbf{v}_{p}(t) are the position and velocity of particle pp. In this work, the shape functions are delta functions, though other shape functions are also valid and can be used to improve noise quality. The proper function to measure the charge density is the 0-form and the 1-form for the current density. Different schemes to evolve Newton’s equations can be used to preserve certain quantities like phase. In this work, to maintain accuracy of the time evolution at larger time step sizes, the Adams-Bashforth scheme is used. Lastly, the time integral of the current density

𝐆(t,𝐫)=0t𝑑τ𝐉ρ(τ,𝐫)\mathbf{G}(t,\mathbf{r})=\int_{0}^{t}d\tau\mathbf{J}_{\rho}(\tau,\mathbf{r}) (2)

is used to allow the particle current to evolve consistently with the fields. Otherwise, there is a disconnect between the continuity equation and Gauss’ law.

II-B Field discretization

II-B1 Spatial Discretization

The FEM formulation used in this work is the mixed finite element method. The basis functions used are the 1-form 𝐖(1)(𝐫)H(curl;Ω)\mathbf{W}^{(1)}(\mathbf{r})\in H(curl;\Omega) which allows for tangential continuity of fields, and the 2-form 𝐖(2)(𝐫)H(div;Ω)\mathbf{W}^{(2)}(\mathbf{r})\in H(div;\Omega) which allows for normal continuity of fluxes. The reconstructed fields used to accelerate the particles are defined as 𝐄(t,𝐫)=i=1Neei(t)𝐖(1)(𝐫)\mathbf{E}(t,\mathbf{r})=\sum_{i=1}^{N_{e}}e_{i}(t)\mathbf{W}^{(1)}(\mathbf{r}) and 𝐁(t,𝐫)=i=1Nfbi(t)𝐖(2)(𝐫)\mathbf{B}(t,\mathbf{r})=\sum_{i=1}^{N_{f}}b_{i}(t)\mathbf{W}^{(2)}(\mathbf{r}) where NeN_{e} are the number of edges and NfN_{f} are the number of faces. Testing Faraday’s law with the 2-form and Ampere’s law with the 1-form leads to the semidiscrete Maxwell system

[[μ1]00[ϵ0]]A¯¯1[tB¯tE¯]+[0[Mc]c2[Mc]Tc[I]]A¯¯0[B¯E¯]=[0L¯ε]F¯¯\displaystyle\begin{split}\underbrace{\matrixquantity[[\star_{\mu^{-1}}]&0\\ 0&[\star_{\epsilon_{0}}]]}_{\bar{\bar{A}}_{1}}&\matrixquantity[\partial_{t}\bar{B}\\ \partial_{t}\bar{E}]\\ &+\underbrace{\matrixquantity[0\hfil&[M_{c}]\\ -c^{2}[M_{c}]^{T}&c[\star_{I}]]}_{\bar{\bar{A}}_{0}}\matrixquantity[\bar{B}\\ \bar{E}]=\underbrace{\matrixquantity[0\\ \frac{\bar{L}}{\varepsilon}]}_{\bar{\bar{F}}}\end{split} (3)

where the degree of freedom vectors E¯=[e1(t),e2(t),,eNe(t)]\bar{E}=[e_{1}(t),e_{2}(t),\dots,e_{N_{e}}(t)], B¯=[b1(t),b2(t),,bNf(t)]\bar{B}=[b_{1}(t),b_{2}(t),\dots,b_{N_{f}}(t)], and L¯=[l1(t),l2(t),lNe(t)]\bar{L}=[l_{1}(t),l_{2}(t),...l_{N_{e}}(t)] with li(t)=𝐖i(1)(𝐫),𝐉s(t,𝐫)t𝐆(t,𝐫)l_{i}(t)=\langle\mathbf{W}_{i}^{(1)}(\mathbf{r}),\mathbf{J}_{s}(t,\mathbf{r})-\partial_{t}\mathbf{G}(t,\mathbf{r})\rangle. The surface currents 𝐉s(t,𝐫)\mathbf{J}_{s}(t,\mathbf{r}) exist on Ω\partial\Omega due to a Neumann or Robin boundary condition. The integrated particle current is chosen to satisfy charge conservation for arbitrary time marching schemes.

II-B2 Temporal Discretization

This work utilizes the Newmark-β\beta time marching scheme to evolve the fields in time with parameters γ\gamma and β\beta, which creates an unconditionally stable time marching scheme when γ=.5\gamma=.5 and β=.25\beta=.25. Other choices can lead to different stability conditions. The parameter choices effect the definition of a temporal basis function N(t)N(t) that is tested by a function W(t)W(t) [31]. When (3) is discretized with Newmark-β\beta with γ=0.5\gamma=0.5 and β=0.25\beta=0.25, the fully discretized system becomes

(0.5A¯¯1+0.25ΔtA¯¯0)X¯n+1.5ΔtA¯¯0X¯n+(0.5A¯¯1+.25ΔtA¯¯0)X¯n1+0.5ΔtG~n+1+0.5ΔtG~n1+0.25ΔtF~n+10.5ΔtF~n+0.25F~n1=0.\begin{split}&(0.5\bar{\bar{A}}_{1}+0.25\Delta_{t}\bar{\bar{A}}_{0})\bar{X}^{n+1}-.5\Delta_{t}\bar{\bar{A}}_{0}\bar{X}^{n}\\ &+(0.5\bar{\bar{A}}_{1}+.25\Delta_{t}\bar{\bar{A}}_{0})\bar{X}^{n-1}+0.5\Delta_{t}\tilde{G}^{n+1}+0.5\Delta_{t}\tilde{G}^{n-1}\\ &+0.25\Delta_{t}\tilde{F}^{n+1}-0.5\Delta_{t}\tilde{F}^{n}+0.25\tilde{F}^{n-1}=0.\end{split} (4)

Here, X¯m=[B¯mE¯m]T\bar{X}^{m}=[\bar{B}^{m}\;\bar{E}^{m}]^{T}, G~m=[0ε1G¯]m,T\tilde{G}^{m}=[0\;\varepsilon^{-1}\bar{G}]^{m,T}, and F~m=[0ε1J¯sm]\tilde{F}^{m}=[0\;\varepsilon^{-1}\bar{J}^{m}_{s}] at time step mm. The boundary current J¯sm\bar{J}^{m}_{s} is tangential to a surface. As it has no normal component, it does not contribute to Gauss’ law and therefore does violate charge conservation.

III Domain Decomposition

Consider the region Ω\Omega depicted in Fig. 1 that is divided into NvN_{v} non-overlapping subdomains where subdomain Ωi\Omega_{i} and Ωj\Omega_{j} are separated by boundary Γij\Gamma_{ij}. The junction of more than two volumes is a ”corner” denoted by Γc\Gamma_{c}. In each subdomain Ωi\Omega_{i}, (4) is defined and is assumed to be a self contained “primal” problem that has fictitious excitations from ”dual” unknowns. The dual unknowns, the Lagrange multipliers denoted by Λ\Lambda, effect either Neumann or Robin boundary conditions such that the equivalence theorem can be applied for each Ωi\Omega_{i} with respect to Ω\Omega using the external currents on Ωi\partial\Omega_{i}, the fictitious boundary current from Λ\Lambda, and the particle current in Ωi\Omega_{i}. By solving for the dual unknowns and a small subset of the primal unknowns, the overall computational cost of solving the original monolithic system in (4) is reduced.

Refer to caption
Figure 1: Region Ω\Omega partitioned into NvN_{v} subdomains.

III-A MFEM-DP 1

The first formulation presented uses a Neumann-like boundary condition to enforce the continuity conditions of the fields at the interior boundaries,

n^×𝐄i(t,𝐫)=n^×𝐄j(t,𝐫)𝐫Γij\hat{n}\crossproduct\mathbf{E}_{i}(t,\mathbf{r})=\hat{n}\crossproduct\mathbf{E}_{j}(t,\mathbf{r})\;\;\mathbf{r}\in\Gamma_{ij} (5a)
n^𝐁i(t,𝐫)=n^𝐁j(t,𝐫)𝐫Γij.\hat{n}\cdot\mathbf{B}_{i}(t,\mathbf{r})=\hat{n}\cdot\mathbf{B}_{j}(t,\mathbf{r})\;\;\mathbf{r}\in\Gamma_{ij}. (5b)

This is accomplished by ensuring 𝐱in=𝐱jn|Γij\mathbf{x}^{n}_{i}=\mathbf{x}^{n}_{j}|_{\Gamma_{ij}} and defining

n^×𝐇i(t,𝐫)=Λ(t,𝐫)𝐫Γij.\hat{n}\crossproduct\mathbf{H}_{i}(t,\mathbf{r})=\Lambda(t,\mathbf{r})\;\;\mathbf{r}\in\Gamma_{ij}. (6)

The system of equations to be solved in each sub-region

(0.5A¯¯1(i)+0.25ΔtA¯¯0(i))X¯(i),n+1=(i)Λ(i)\left(0.5\bar{\bar{A}}^{(i)}_{1}+0.25\Delta_{t}\bar{\bar{A}}^{(i)}_{0}\right)\bar{X}^{(i),n+1}=\mathcal{L}^{(i)}-\Lambda^{(i)} (7)

where

(i)=0.5ΔtA¯¯0X¯(i),n(0.5A¯¯1(i)+.25ΔtA¯¯0(i))X¯(i),n10.5ΔtG~(i),n+10.5ΔtG~(i),n10.25ΔtF~(i),n+1+0.5ΔtF~(i),n0.25F~(i),n1.\begin{split}&\mathcal{L}^{(i)}=0.5\Delta_{t}\bar{\bar{A}}_{0}\bar{X}^{(i),n}\\ &-\left(0.5\bar{\bar{A}}^{(i)}_{1}+.25\Delta_{t}\bar{\bar{A}}^{(i)}_{0}\right)\bar{X}^{(i),n-1}-0.5\Delta_{t}\tilde{G}^{(i),n+1}\\ &-0.5\Delta_{t}\tilde{G}^{(i),n-1}-0.25\Delta_{t}\tilde{F}^{(i),n+1}+0.5\Delta_{t}\tilde{F}^{(i),n}\\ &-0.25\tilde{F}^{(i),n-1}.\end{split} (8)

It is noted that the time history of the Lagrange multiplier is not explicitly stored, unlike the other currents in the system. Let the unknowns defined on Γc\Gamma_{c} be denoted by the subscript cc and all other unknowns in Ωi\Omega_{i} denoted by subscript rr, such that (7) can be written as

[A¯¯rr(i)A¯¯rc(i)A¯¯cr(i)A¯¯cc(i)][X¯r(i),n+1X¯c(i),n+1]=[r(i)c(i)][Λ¯r(i)Λ¯c(i)]\begin{bmatrix}\bar{\bar{A}}^{(i)}_{rr}&\bar{\bar{A}}^{(i)}_{rc}\\ \bar{\bar{A}}^{(i)}_{cr}&\bar{\bar{A}}^{(i)}_{cc}\end{bmatrix}\begin{bmatrix}\bar{X}^{(i),n+1}_{r}\\ \bar{X}^{(i),n+1}_{c}\end{bmatrix}=\begin{bmatrix}\mathcal{L}^{(i)}_{r}\\ \mathcal{L}^{(i)}_{c}\end{bmatrix}-\begin{bmatrix}\bar{\Lambda}^{(i)}_{r}\\ \bar{\Lambda}^{(i)}_{c}\end{bmatrix} (9)

where A¯¯=.5A¯¯1+.25ΔtA¯¯0\bar{\bar{A}}=.5\bar{\bar{A}}_{1}+.25\Delta_{t}\bar{\bar{A}}_{0}. The first set of equations can be rewritten as

X¯r(i),n+1=A¯¯rr(i),1(r(i)A¯¯rc(i),1X¯c(i),n+1Λ¯r(i))\bar{X}^{(i),n+1}_{r}=\bar{\bar{A}}^{(i),-1}_{rr}(\mathcal{L}^{(i)}_{r}-\bar{\bar{A}}^{(i),-1}_{rc}\bar{X}^{(i),n+1}_{c}-\bar{\Lambda}^{(i)}_{r}) (10)

which can be substituted into the second set of equations

(A¯¯cc(i)A¯¯cr(i),1A¯¯rr(i),1A¯¯rc(i))X¯(i),n+1c=(i)cA¯¯cr(i)A¯¯rr(i),1(r(i)Λ¯r(i))Λ¯c(i).\begin{split}\Big{(}\bar{\bar{A}}^{(i)}_{cc}-\bar{\bar{A}}^{(i),-1}_{cr}&\bar{\bar{A}}^{(i),-1}_{rr}\bar{\bar{A}}^{(i)}_{rc}\Big{)}\bar{X}^{(i),n+1}_{c}=\mathcal{L}^{(i)}_{c}\\ &-\bar{\bar{A}}^{(i)}_{cr}\bar{\bar{A}}^{(i),-1}_{rr}\left(\mathcal{L}^{(i)}_{r}-\bar{\Lambda}^{(i)}_{r}\right)-\bar{\Lambda}^{(i)}_{c}.\end{split} (11)

Define a signed Boolean matrix B¯¯r(i)\bar{\bar{B}}_{r}^{(i)} that maps unknowns in Ωi\Omega_{i}, excluding those on Γc\Gamma_{c}, to a set of unique global unknowns defined on Γij\cup\Gamma_{ij}. The definition of B¯¯r(i)\bar{\bar{B}}_{r}^{(i)} is such that i=1NvB¯¯r(i)x¯(i)=0\sum_{i=1}^{N_{v}}\bar{\bar{B}}_{r}^{(i)}\bar{x}^{(i)}=0, which enforces (5a) and (5b). Furthermore, the transpose B¯¯r(i),T\bar{\bar{B}}_{r}^{(i),T} forms a map from the global set of boundary unknowns to those of a particular volume. A similar unsigned Boolean matrix can be defined for unknowns that lie on Γc\Gamma_{c} within Ωi\Omega_{i}, B¯¯c\bar{\bar{B}}_{c}, which maps to a set of unique global unknowns. Using B¯¯r\bar{\bar{B}}_{r} on (10) and summing over all volumes results in

K¯¯rrΛr=¯rK¯¯rcX¯c\bar{\bar{K}}_{rr}\Lambda_{r}=\bar{\mathcal{L}}_{r}-\bar{\bar{K}}_{rc}\bar{X}_{c} (12)

where

K¯¯rr=i=1NvB¯¯rA¯¯rr(i)B¯¯r(i),T\bar{\bar{K}}_{rr}=\sum_{i=1}^{N_{v}}\bar{\bar{B}}_{r}\bar{\bar{A}}^{(i)}_{rr}\bar{\bar{B}}_{r}^{(i),T} (13a)
K¯¯rr=i=1NvB¯¯rA¯¯rr(i)A¯¯rc(i)B¯¯c(i),T\bar{\bar{K}}_{rr}=\sum_{i=1}^{N_{v}}\bar{\bar{B}}_{r}\bar{\bar{A}}^{(i)}_{rr}\bar{\bar{A}}^{(i)}_{rc}\bar{\bar{B}}_{c}^{(i),T} (13b)
¯r=i=1NvB¯¯r(i)A¯¯rr(i),1r(i)\bar{\mathcal{L}}_{r}=\sum_{i=1}^{N_{v}}\bar{\bar{B}}_{r}^{(i)}\bar{\bar{A}}^{(i),-1}_{rr}\mathcal{L}^{(i)}_{r} (13c)

and X¯c\bar{X}_{c} the degree of freedom vector for corner unknowns.

The boundary condition (5a) is enforced at the corners of Ωi\Omega_{i} as B¯¯c(i)\bar{\bar{B}}_{c}^{(i)} is defined so that i=1NvB¯¯c(i),Tλc(i)=0\sum_{i=1}^{N_{v}}\bar{\bar{B}}_{c}^{(i),T}\lambda^{(i)}_{c}=0. Using that definition, (11) becomes

K¯¯ccX¯cn+1=¯c+K¯¯crΛr.\bar{\bar{K}}_{cc}\bar{X}^{n+1}_{c}=\bar{\mathcal{L}}_{c}+\bar{\bar{K}}_{cr}\Lambda_{r}. (14)

where

K¯¯cc=i=1NvB¯¯c(i)(A¯¯cc(i)A¯¯cr(i),1A¯¯rr(i),1A¯¯rc(i),1)B¯¯c(i),T\bar{\bar{K}}_{cc}=\sum_{i=1}^{N_{v}}\bar{\bar{B}}_{c}^{(i)}(\bar{\bar{A}}^{(i)}_{cc}-\bar{\bar{A}}^{(i),-1}_{cr}\bar{\bar{A}}^{(i),-1}_{rr}\bar{\bar{A}}^{(i),-1}_{rc})\bar{\bar{B}}_{c}^{(i),T} (15a)
K¯¯cr=i=1NvB¯¯c(i)A¯¯cr(i)A¯¯rr(i),1B¯¯r(i),T\bar{\bar{K}}_{cr}=\sum_{i=1}^{N_{v}}\bar{\bar{B}}_{c}^{(i)}\bar{\bar{A}}^{(i)}_{cr}\bar{\bar{A}}^{(i),-1}_{rr}\bar{\bar{B}}_{r}^{(i),T} (15b)
¯c=i=1NvB¯¯c(i)(c(i)A¯¯cr(i)A¯¯rr(i),1r(i)).\bar{\mathcal{L}}_{c}=\sum_{i=1}^{N_{v}}\bar{\bar{B}}_{c}^{(i)}(\mathcal{L}^{(i)}_{c}-\bar{\bar{A}}^{(i)}_{cr}\bar{\bar{A}}^{(i),-1}_{rr}\mathcal{L}^{(i)}_{r}). (15c)

As in the vector wave equation domain decomposition formulations, K¯¯cc\bar{\bar{K}}_{cc} is sparse and generally much smaller than K¯¯rr\bar{\bar{K}}_{rr}. Therefore, if solving the boundary problem directly, it is possible to define

X¯cn+1=K¯¯cc1(¯c+K¯¯crΛr)\bar{X}^{n+1}_{c}=\bar{\bar{K}}^{-1}_{cc}\left(\bar{\mathcal{L}}_{c}+\bar{\bar{K}}_{cr}\Lambda_{r}\right) (16)

and then use substitution into (12) to yield

𝒦¯¯rrΛr=¯rK¯¯rcK¯¯cc1¯c\bar{\bar{\mathcal{K}}}_{rr}\Lambda_{r}=\bar{\mathcal{L}}_{r}-\bar{\bar{K}}_{rc}\bar{\bar{K}}_{cc}^{-1}\bar{\mathcal{L}}_{c} (17)

where 𝒦¯¯rr=(K¯¯rr+K¯¯rcK¯¯cc1K¯¯cr)\bar{\bar{\mathcal{K}}}_{rr}=\left(\bar{\bar{K}}_{rr}+\bar{\bar{K}}_{rc}\bar{\bar{K}}_{cc}^{-1}\bar{\bar{K}}_{cr}\right). If using an iterative solver, (12) and (14) can be solved as a coupled system instead.

III-B MFETI-DP2

In the previous formulation, a unique Lagrange multiplier was defined for each edge and face on Γ\Gamma. This formulation enforces field continuity though a Neumann-like boundary condition. A more robust enforcement of field continuity is to use a Robin boundary condition [jin2015finite]. The boundaries between Ωi\Omega_{i} and Ωj\Omega_{j}, Γij\Gamma_{ij} and Γji\Gamma_{ji}, are distinct and have a unique Lagrange multiplier defined for each subdomain. Instead of (6), the boundary condition on the Lagrange multipliers becomes

Λ(i)(t,𝐫)=n^i×𝐇(t,𝐫)η1n^i×n^i×𝐄(t,𝐫)Γij.\Lambda^{(i)}(t,\mathbf{r})=\hat{n}_{i}\crossproduct\mathbf{H}(t,\mathbf{r})-\eta^{-1}\hat{n}_{i}\crossproduct\hat{n}_{i}\crossproduct\mathbf{E}(t,\mathbf{r})\in\Gamma_{ij}. (18)

The edges on Γij\Gamma_{ij}, as well as the interior unknowns and unknowns on Ωi\partial\Omega_{i} are included in a vector X¯I(i)\bar{X}^{(i)}_{I}. The Lagrange multipliers for the corner edges remain the same, with a unique Lagrange multiplier for each feature. The faces on an Γij\Gamma_{ij} are grouped together with the corner edges, which we will denote as X¯N\bar{X}_{N}. The system of equations solved in each subdomain is

[A¯¯II(i)A¯¯IN(i)A¯¯NI(i)A¯¯NN(i)][X¯I(i),n+1X¯N(i),n+1]=[I(i)N(i)][.5ΔtΛ¯I(i),n+1Λ¯N(i)]Ωi\begin{bmatrix}\bar{\bar{A}}^{(i)}_{II}&\bar{\bar{A}}^{(i)}_{IN}\\ \bar{\bar{A}}^{(i)}_{NI}&\bar{\bar{A}}^{(i)}_{NN}\end{bmatrix}\begin{bmatrix}\bar{X}^{(i),n+1}_{I}\\ \bar{X}^{(i),n+1}_{N}\end{bmatrix}=\begin{bmatrix}\mathcal{R}^{(i)}_{I}\\ \mathcal{R}^{(i)}_{N}\end{bmatrix}-\begin{bmatrix}.5\Delta_{t}\bar{\Lambda}^{(i),n+1}_{I}\\ \bar{\Lambda}^{(i)}_{N}\end{bmatrix}\in\Omega_{i} (19)

where given

=.5ΔtA¯¯0X¯n(.5A¯¯1+.25ΔtA¯¯0)X¯n1.5ΔtG~n+1.5ΔtG~n1.25ΔtF~n+1+.5ΔtF~n.25F~n1.5ΔtΛn+.25Λn1,\begin{split}&\mathcal{R}=.5\Delta_{t}\bar{\bar{A}}_{0}\bar{X}^{n}\\ &-(.5\bar{\bar{A}}_{1}+.25\Delta_{t}\bar{\bar{A}}_{0})\bar{X}^{n-1}-.5\Delta_{t}\tilde{G}^{n+1}-.5\Delta_{t}\tilde{G}^{n-1}\\ &-.25\Delta_{t}\tilde{F}^{n+1}+.5\Delta_{t}\tilde{F}^{n}-.25\tilde{F}^{n-1}-.5\Delta_{t}\Lambda^{n}+.25\Lambda^{n-1},\end{split} (20)

I(i)\mathcal{R}^{(i)}_{I} contains \mathcal{R} associated with interior unknowns, external unknowns, and unknowns of edges on Γij\Gamma_{ij} and N(i)\mathcal{R}^{(i)}_{N} contain the remaining unknowns in subdomain Ωi\Omega_{i}. To derive the set of equations to solve for the dual unknowns, first consider the summation of (18) for two volumes

Λ(i)(t,𝐫)+Λ(j)(t,𝐫)+(Yi+Yj)n^i×n^i×𝐄(t,𝐫).\Lambda^{(i)}(t,\mathbf{r})+\Lambda^{(j)}(t,\mathbf{r})+(Y_{i}+Y_{j})\hat{n}_{i}\crossproduct\hat{n}_{i}\crossproduct\mathbf{E}(t,\mathbf{r}). (21)

Define an unsigned Boolean matrix such that B¯¯I(i)X¯I=e¯I(i)\bar{\bar{B}}_{I}^{(i)}\bar{X}_{I}=\bar{e}_{I}^{(i)}, the unknowns for the electric field on Γij\Gamma_{ij}. Also, define the map from the boundary edges on Γij\Gamma_{ij} to its counterpart on Γji\Gamma_{ji}, T¯ij\bar{T}_{ij}. Testing (21) with curl-conforming basis functions 𝐖e\mathbf{W}^{e}, generalizing the case to any number of subdomains, and using B¯¯I(i)\bar{\bar{B}}_{I}^{(i)} and T¯¯ji\bar{\bar{T}}_{ji} restrict and map the unknowns of Ωj\Omega_{j} to Ωi\Omega_{i} yields

Λ(i)+jσ(i)T¯¯ij(Λ(j)(Yi+Yj)M¯¯T(j)B¯¯I(j)X¯I(j))=0\Lambda^{(i)}+\sum_{j\in\sigma(i)}\bar{\bar{T}}_{ij}(\Lambda^{(j)}-(Y_{i}+Y_{j})\bar{\bar{M}}_{T}^{(j)}\bar{\bar{B}}^{(j)}_{I}\bar{X}^{(j)}_{I})=0 (22)

where σ(i)\sigma(i) are all subdomains Ωj\Omega_{j} which share a boundary with Ωi\Omega_{i} and the matrix M¯¯T(i)n^×𝐖e,n^×𝐖e|Γij\bar{\bar{M}}_{T}^{(i)}\langle\hat{n}\times\mathbf{W}^{e},\hat{n}\times\mathbf{W}^{e}\rangle|_{\Gamma_{ij}}.

Rewrite the first set of equations in (19) as

X¯I(i),n+1=A¯¯II(i),1(I(i).5ΔtΛ¯I(i),n+1A¯¯IN(i)X¯N(i),n+1)\bar{X}^{(i),n+1}_{I}=\bar{\bar{A}}^{(i),-1}_{II}(\mathcal{R}^{(i)}_{I}-.5\Delta_{t}\bar{\Lambda}^{(i),n+1}_{I}-\bar{\bar{A}}^{(i)}_{IN}\bar{X}^{(i),n+1}_{N}) (23)

Now, (23) can be plugged into (22) to yield

Λ(i)+jσ(i)Tij(I¯¯(j)(Yi+Yj)M¯¯T(j)B¯¯I(j)A¯¯II(j),1B¯¯I(j),T)Λ(j)=jσ(i)TijB¯¯I(j)A¯¯II(j),1(A¯¯IN(j)B¯¯N(j)X¯NI(j)).\begin{split}\Lambda^{(i)}\!\!&+\!\!\!\!\!\sum_{j\in\sigma(i)}\!\!T_{ij}(\bar{\bar{I}}^{(j)}-(Y_{i}+Y_{j})\bar{\bar{M}}_{T}^{(j)}\bar{\bar{B}}_{I}^{(j)}\bar{\bar{A}}_{II}^{(j),-1}\bar{\bar{B}}_{I}^{(j),T})\Lambda^{(j)}\\ &=\sum_{j\in\sigma(i)}T_{ij}\bar{\bar{B}}_{I}^{(j)}\bar{\bar{A}}_{II}^{(j),-1}(\bar{\bar{A}}_{IN}^{(j)}\bar{\bar{B}}_{N}^{(j)}\bar{X}_{N}-\mathcal{R}^{(j)}_{I}).\end{split} (24)

In order to complete the definition of the equation to solve for ΛI\Lambda_{I} over Γij\cup\Gamma_{ij}, we first define the equation for X¯N\bar{X}_{N}, the unknowns associated with faces on Γij\Gamma_{ij} and edges on Γc\Gamma_{c}.

The derivation of the equation to solve for X¯N\bar{X}_{N} follows a similar path as X¯c\bar{X}_{c} from MFET-DP1. First, substitute (23) into the second set of equations in (22) to yield

(A¯¯NN(i)A¯¯NI(i),1A¯¯II(i),1A¯¯IN(i))X¯(i),n+1N=(i)NA¯¯NI(i)A¯¯II(i),1(I(i)Λ¯I(i))Λ¯N(i).\begin{split}\Big{(}\bar{\bar{A}}^{(i)}_{NN}-\bar{\bar{A}}^{(i),-1}_{NI}&\bar{\bar{A}}^{(i),-1}_{II}\bar{\bar{A}}^{(i)}_{IN}\Big{)}\bar{X}^{(i),n+1}_{N}=\mathcal{R}^{(i)}_{N}\\ &-\bar{\bar{A}}^{(i)}_{NI}\bar{\bar{A}}^{(i),-1}_{II}\left(\mathcal{R}^{(i)}_{I}-\bar{\Lambda}^{(i)}_{I}\right)-\bar{\Lambda}^{(i)}_{N}.\end{split} (25)

The Boolean matrix B¯¯N(i)\bar{\bar{B}}_{N}^{(i)} is defined as

B¯¯N(i)=[B¯¯r,f(i)00B¯¯c(i)]\bar{\bar{B}}_{N}^{(i)}=\begin{bmatrix}\bar{\bar{B}}_{r,f}^{(i)}&0\\ 0&\bar{\bar{B}}_{c}^{(i)}\end{bmatrix} (26)

to enforce (5b) for the faces on Ωi\Omega_{i} and (5a) edges on corners where B¯¯r,f(i)\bar{\bar{B}}_{r,f}^{(i)} is the portion of B¯¯r(i)\bar{\bar{B}}_{r}^{(i)} that acts on the face degrees of freedom. Using a similar process as (14), the equation for

K¯¯NNX¯Nn+1=¯N+K¯¯NIΛI.\bar{\bar{K}}_{NN}\bar{X}^{n+1}_{N}=\bar{\mathcal{R}}_{N}+\bar{\bar{K}}_{NI}\Lambda_{I}. (27)

where

K¯¯NN=i=1NvB¯¯N(i)(A¯¯NN(i)A¯¯NI(i),A¯¯II(i),1A¯¯IN(i),1)B¯¯N(i),T\bar{\bar{K}}_{NN}=\sum_{i=1}^{N_{v}}\bar{\bar{B}}_{N}^{(i)}(\bar{\bar{A}}^{(i)}_{NN}-\bar{\bar{A}}^{(i),}_{NI}\bar{\bar{A}}^{(i),-1}_{II}\bar{\bar{A}}^{(i),-1}_{IN})\bar{\bar{B}}_{N}^{(i),T} (28a)
K¯¯NI=i=1NvB¯¯N(i)A¯¯NI(i)A¯¯II(i),1B¯¯I(i),TB¯¯L(i),T\bar{\bar{K}}_{NI}=\sum_{i=1}^{N_{v}}\bar{\bar{B}}_{N}^{(i)}\bar{\bar{A}}^{(i)}_{NI}\bar{\bar{A}}^{(i),-1}_{II}\bar{\bar{B}}_{I}^{(i),T}\bar{\bar{B}}_{L}^{(i),T} (28b)
¯N=i=1NvB¯¯N(i)(N(i)A¯¯NI(i)A¯¯II(i),1I(i)).\bar{\mathcal{R}}_{N}=\sum_{i=1}^{N_{v}}\bar{\bar{B}}_{N}^{(i)}(\mathcal{R}^{(i)}_{N}-\bar{\bar{A}}^{(i)}_{NI}\bar{\bar{A}}^{(i),-1}_{II}\mathcal{R}^{(i)}_{I}). (28c)

The global system of equations for the edge Lagrange multipliers Define a Boolean matrix B¯¯L(i)\bar{\bar{B}}_{L}^{(i)} that maps the local Lagrange multipliers to a global vector. The Boolean matrix is applied to (24) and discretized in time to define the equation

K¯¯IIΛI=¯IK¯¯INN¯Nn+1\bar{\bar{K}}_{II}\Lambda_{I}=\bar{\mathcal{R}}_{I}-\bar{\bar{K}}_{IN}\bar{N}^{n+1}_{N} (29)

where

K¯¯II=I¯+i=1NvB¯L(i)jσ(i)T¯¯ij(I¯¯(j)M¯T(j)B¯¯I(j)A¯¯II(j),1)B¯¯I(j),TB¯L(j),T\begin{split}\bar{\bar{K}}_{II}&=\underline{I}+\sum_{i=1}^{N_{v}}\underline{B}^{(i)}_{L}\\ &\sum_{j\in\sigma(i)}\bar{\bar{T}}_{ij}(\bar{\bar{I}}^{(j)}-\underline{M}_{T}^{(j)}\bar{\bar{B}}_{I}^{(j)}\bar{\bar{A}}_{II}^{(j),-1})\bar{\bar{B}}_{I}^{(j),T}\underline{B}_{L}^{(j),T}\end{split} (30a)
K¯¯IN=i=1NvB¯L(i)jσ(i)T¯jiM¯T(j)B¯I(j)A¯¯II(j),1A¯¯IN(j)B¯N(j),T\bar{\bar{K}}_{IN}=-\sum_{i=1}^{N_{v}}\underline{B}_{L}^{(i)}\sum_{j\in\sigma(i)}\underline{T}_{ji}\underline{M}_{T}^{(j)}\underline{B}_{I}^{(j)}\bar{\bar{A}}_{II}^{(j),-1}\bar{\bar{A}}^{(j)}_{IN}\underline{B}_{N}^{(j),T} (30b)
¯I=i=1NvB¯L(j)jσ(i)T¯jiM¯¯T(j)B¯I(j)A¯¯II(j),1I(i))\bar{\mathcal{R}}_{I}=-\sum_{i=1}^{N_{v}}\underline{B}_{L}^{(j)}\sum_{j\in\sigma(i)}\underline{T}_{ji}\bar{\bar{M}}^{(j)}_{T}\underline{B}^{(j)}_{I}\bar{\bar{A}}_{II}^{(j),-1}\mathcal{R}^{(i)}_{I}) (30c)

It can be solved as a coupled system with (27) or in a Schur complement system similar to (17) using

𝒦¯¯IIΛI=¯IK¯¯INK¯¯NN1¯N.\bar{\bar{\mathcal{K}}}_{II}\Lambda_{I}=\bar{\mathcal{L}}_{I}-\bar{\bar{K}}_{IN}\bar{\bar{K}}_{NN}^{-1}\bar{\mathcal{L}}_{N}. (31)

where 𝒦¯¯II=(K¯¯IIK¯¯INK¯¯NN1K¯¯NI)\bar{\bar{\mathcal{K}}}_{II}=\left(\bar{\bar{K}}_{II}-\bar{\bar{K}}_{IN}\bar{\bar{K}}_{NN}^{-1}\bar{\bar{K}}_{NI}\right). However, K¯¯NN\bar{\bar{K}}_{NN} is much larger than its MFETI-DP1 counterpart K¯¯cc\bar{\bar{K}}_{cc} and will take more time and memory to compute and store its inverse.

III-C Discussion

The proposed domain decomposition formulations reduces the cost of solving for the fields by defining a set of equations on fictitious interior boundaries of the true problem. The definitions of these equations uses small matrix inverses, and therefore are generally not sparse. Furthermore, those matrix inverses have to be stored in memory. At the very least, one matrix inverse for each subdomain is needed, making the memory storage more than solving for the original system. However, it is expected that for direct inverses, inverting either the coupled equations of (12) and (14) or (29) and (27) would be more efficient than solving the original problem. If storing the inverse of the matrix associated with Γc\Gamma_{c} is possible, (17) and (31) are even smaller in size. The following discussion uses a sample problem to further probe properties of MFETI-DP1 and MFETI-DP2. In particular, we examine the performance of the method with a sample geometry with several configurations of subdivision.

TABLE I: Average number of unknowns per volume and ratio of interior to boundary unknowns for different decompositions of Ωrect\Omega_{rect}.
MFETI-DP1 MFETI-DP2
NvN_{v} avg unk. α\alpha avg unk. α\alpha
2 1815 N/A 1755 6.75
5 776 1.71 705 1.00
10 404 0.60 351 0.34
25 170 0.15 133 0.08
50 89 0.05 60 0.03

Define Ωrect\Omega_{rect} as a rectangular region in free space with dimension .25 m ×\times .2 m ×\times .3 m discretized with average edge length of 5.5 cm. The region is the split into subdomains in several configurations listed in Table I so that MFETI-DP1 and MFETI-DP2 can be applied, where α\alpha is the ratio of interior unknowns to the number of boundary unknowns. This table is provided to provide the connection between the number of subdomains and α\alpha for the test example, as α\alpha will be more extensible to other geometries than a set number of subdomains. Figure 2 shows the condition number of the matrix formed by the coupled equations (12) and (14) for MFETI-DP1, (29) and (27) for MFETI-DP2, as well as the Schur complement forms 𝒦¯¯rr\bar{\bar{\mathcal{K}}}_{rr} and 𝒦¯¯II\bar{\bar{\mathcal{K}}}_{II} with different numbers of subdivisions compared to the condition number of of the monolithic system matrix A¯¯\bar{\bar{A}}. Solving the coupled equations results in condition number that are generally at or worse than the monolithic solve. Using the Schur complement results in condition numbers comparable to the monolithic solve for MFETI-DP1 and several orders of magnitude smaller for MFETI-DP2.

Typically, MFETI-DP1 and MFETI-DP2 when used with iterative solvers converge faster than the monolithic solve. Consider a domain Ωrect\Omega_{rect} is illuminated by an incident field defined by

E¯(r¯,t)=y^cos(2πf0τ)e(τ8σ)2/2σ2(V/m).\bar{E}(\bar{r},t)=\hat{y}\cos(2\pi f_{0}\tau)e^{-(\tau-8\sigma)^{2}/2\sigma^{2}}\,\text{({V/m})}. (32)

The bandwidth of the modulated Gaussian is σ=3(πfbw\sigma=3(\pi f_{bw}, where the bandwidth fbw=fmaxf0f_{bw}=f_{max}-f_{0} with maximum frequency fmax=500f_{max}=500MHz and center frequency f0=300f_{0}=300MHz. The time step size is 0.3330.333 ns which is ten times larger than what could be taken by a similar FDTD solver or leapfrog method for MFEM. Figure 3 shows the iteration counts for the coupled solves for MFETI-DP1 and MFETI-DP2 compared to the monolithic solve using GMRES with a tolerance of 10610^{-6} and restarted after 20 iterations when Ωrect\Omega_{rect} is illuminated by an incident field. With a diagonal preconditioner, both domain decomposition approaches perform better than the monolithic solve, with MFETI-DP2 performing best overall.

Next, as alluded to earlier, the principal advantage of Newmark-β\beta schemes is the unconditional stability. As a result, it is important to verify that the both MFETI-DP1 and MFETI-DP2 retain this feature. To ascertain this, we examine the eigenspectra of these formulations to understand the stability of the time marching scheme. For the time marching scheme used in this work to be stable, the matrix used to evolve the system in time must have eigenvalues Re(λ)>=0\real(\lambda)>=0 for the system to be unconditionally stable. Figure 4 shows that for solving both the coupled system and using the Schur complement approach, the domain decomposition methods will be stable. From the insert in Figure 4, the effect of using the schur complement shifts the eigenvalues of 𝒦¯¯rr\bar{\bar{\mathcal{K}}}_{rr} and 𝒦¯¯II\bar{\bar{\mathcal{K}}}_{II} away from the imaginary axis, though MFETI-DP2 is shifted further away than MFETI-DP1.

Refer to caption
Figure 2: Condition number of system matrix.
Refer to caption
Figure 3: Convergence of history of GMRES for monolithic, MFETI-DP1, and MFETI-DP2.
Refer to caption
(a) MFETI-DP matrix eigenvalues.
Refer to caption
(b) MFETI-DP matrix eigenvalues using Schur complement.
Figure 4: System matrix eigenvalues.

IV Results

In this section, we present numerical results that show how the proposed algorithms perform compared to a non-DDM approach. Additionally, we will demonstrate that as the fields are almost identical to machine precision, the overall trajectory of the particles are the identical to machine precision as well, i.e., the approach does not induce additional error.

IV-A Field Comparison

The first example shows performance of the methods and difference in the fields. In this example, a 0.25m×0.2m×1m0.25m\times 0.2m\times 1m box is illuminated an incident field defined by

E¯(r¯,t)=y^cos(2πf0τ)e(τ8σ)2/2σ2(V/m).\bar{E}(\bar{r},t)=\hat{y}\cos(2\pi f_{0}\tau)e^{-(\tau-8\sigma)^{2}/2\sigma^{2}}\,\text{({V/m})}. (33)

The bandwidth of the modulated Gaussian is σ=3(πfbw\sigma=3(\pi f_{bw}, where the bandwidth fbw=fmaxf0f_{bw}=f_{max}-f_{0} with maximum frequency fmax=595f_{max}=595MHz and center frequency f0=295f_{0}=295MHz. The time step size δt=(150f0)1\delta_{t}=(150f_{0})^{-1} and the experiment is run for 1500 time steps. The boundary conditions on the external boundaries are robin boundary conditions. Three approaches are compared: monolithic (no DDM), MFETI-DP1, and MFETI-DP2 using three levels of discretization with average edge lengths of h1=5.20h_{1}=5.20cm, h2=3.63h_{2}=3.63cm, and h3=2.59h_{3}=2.59cm. The DDM meshes were split using METIS, such that no volumes have a regular shape. Figure 5, shows that the fields obtained through domain decomposition for edge lengths h1h_{1}, h2h_{2}, and h3h_{3} respectively. Regardless of the number of subdivisions, the methods remain accurate to near machine precision with respect to the monolithic solve. This implies that using either domain decomposition formulation would cause the same acceleration on particles as the fields that are similar to each other within machine precision.

The next result demonstrates the effect of MFETI-DP on simulation time. As seen in Figure 6, both formulations are faster than the original monolithic solve for any number of subdivisions. However, there is an optimal ratio of interior volume unknowns to boundary unknowns for the fastest runtime. For each case, there is a crossover point at which further subdividing Ω\Omega incurs a cost in solving the boundary problem that is greater than the savings by reducing the unknowns in each volume.

Refer to caption
(a) Geometry with average edge length h1h_{1}.
Refer to caption
(b) Geometry with average edge length h2h_{2}.
Refer to caption
(c) Geometry with average edge length h3h_{3}.
Figure 5: Relative error in electric field for MFETI-DP.
Refer to caption
(a) Geometry with average edge length h1h_{1}.
Refer to caption
(b) Geometry with average edge length h2h_{2}.
Refer to caption
(c) Geometry with average edge length h3h_{3}.
Figure 6: Timing data for MFETI-DP.
TABLE II: Ratio of interior to boundary unknowns for different decompositions for particle beam test.
MFETI-DP1 MFETI-DP2
NvN_{v} α\alpha runtime (s) α\alpha runtime (s)
5 3.000 571.4 1.893 694.1
10 0.955 614.8 0.652 711.9
25 0.257 466.7 0.181 488.6
50 0.094 562.5 0.077 507.6

IV-B Particle Beam

The next two examples demonstrate MFETI-DP1 and MFETI-DP2 with particles. In these experiments, the fields that are obtained through domain decomposition are used to push the particles, which are not subject to any decomposition scheme. The first example is a particle beam traveling through a cylindrical PEC cavity. This case is a standard test to show charge conservation as errors readily appear as striations in the particle distribution. Furthermore, approximate solutions can be used to further verify the result. The first test is done on a cavity that is 10 cm in length and 0.02 cm in radius disctretized with an average edge length of 5 mm resulting in roughly 21,000 unknowns. The ten particles at each time step enter at the bottom face of the cavity with a radius of 8 mm. The beam voltage and current are 500 V and 50 mA respectively, causing the an initial velocity 1.326×1041.326\times 10^{4} km/s. The geometry was subdivided in four configurations, the details of which are listed in Table II. The experiment was run for 1000 time steps at a time step size of 1 ns, which for the monolithic case took 1,342 seconds. As shown in figure 7, all the simulations maintained charge conservation over the course of the run. There is no notable difference in the simulations, regardless of the number of subdivisions used. Seven particles were tracked over the course of the run to track the particle trajectories with the domain decomposition. Because the fields are virtually equivalent, the particle trajectories are consistent across the different geometries. Figure 8 shows the error in the tracked particles’ path with rep sect to the path in the monolithic solve.

Refer to caption
(a) MFETI-DP1
Refer to caption
(b) MFETI-DP2
Figure 7: Error in Gauss’ law for particle beam.
Refer to caption
(a) Difference in particle path for MFETI-DP1.
Refer to caption
(b) Difference in particle path for MFETI-DP2.
Figure 8: Error in tracked particle positions.

The next result was a particle beam in a larger PEC cavity. In this case, the PEC cylinder was 40 cm in length and 10 cm in radius. The geometry has an average edge length of 7.38 mm and a total of 218,059 unknowns. The beam voltage and current are 3.2 kV and 50 mA respectively, resulting in an initial velocity of 3.35×1043.35\times 10^{4} km/s. The experiment was run for both MFETI-DP1 and MFETI-DP2 using the iterative solver GMRES using a tolerance of 10610^{-6}. The particle distribution is compared to a well-validated axialsymmetric EM-FDTDPIC code, XOOPIC [32] and shows good agreement in Figure 9.

Refer to caption
(a) XOOPIC vs MFETI-DP1.
Refer to caption
(b) XOOPIC vs MFETI-DP2.
Figure 9: Particle beam expansion for enlarged PEC cavity.

IV-C Plasma Ball

The last example presented is the case of an adiabatic expansion of a plasma ball using MFETI-DP2. This example takes a considerable number or degrees of freedom to resolve the debye length and allow the particles to expand for an extended period of time. However, this can be made more tractable by using either domain decomposition schemes and unconditionally stable time marching to evolve the system with larger time steps than a similar EM-FDTDPIC formulation would allow. Additionally, this experiment has analytic solutions [33] that can be used for validation. A charge neutral, Gaussian distribution of 24000 particles is set at the center of a spherical region 24 cm in diameter. The Sr+Sr^{+} ions are at 1K and the electrons are at 100K with a density of 5×1085\times 10^{8} particles per cubic meter. The outer boundary has an impedance boundary condition and the boundary is sufficiently far from the particles that none leave during the simulation. The average edge length is 1 cm with a total of 681,214 unknowns. Figure 10 shows good agreement of the particle distribution with the analytic result at 150 ns, 300 ns, and 800 ns.

Refer to caption
Refer to caption
Refer to caption
Figure 10: Adiabatic plasma ball expansion after 150, 300, and 800 ns.

V Summary

In this work, we presented two formulations for domain decomposition using non-overlapping subdomains for the mixed finite element method. The methods were used to create a MFETI-PIC scheme which allows for computing the fields due to particle motion for larger problems. For sufficiently large problems, this results in time savings without loss of accuracy. All the results shown were computed on a single node; however, the method can be made parallel to take full advantage of high performance computing resources. Doing so suggest creating a domain decomposition framework for particle mapping and pushing consistent with the field solver.

Future work will focus on creating a qausi-helmholtz decomposition for the FETI-framework. This ensures that the nullspace of the field solver does not corrupt the satisfaction of charge conservation. This is a key issue for using the time domain vector wave equation approach which has a nullspace that grows linearly in time. The combination of domain decomposition and qausi-Helmholtz decomposition will allow for solving even larger problems as the number of degrees of freedom is reduced to those associated with edges, rather than have both edges and faces as presented here.

VI acknowledgments

This work was sponsored by the US Air Force Research Laboratory under contracts FA8650-19-F-1747 and FA8650-20-C-1132. This work was also supported by SMART Scholarship program. We thank the MSU Foundation for support through the Strategic Partnership Grant during early portion of this work. This work was also supported by the Department of Energy Computational Science Graduate Fellowship under grant DE-FG02-97ER25308. The authors would also like to thank the HPCC Facility, Michigan State University, East Lansing, MI, USA.

References

  • [1] Y.-M. Shin, J.-X. Wang, L. R. Barnett, and N. C. Luhmann, “Particle-in-cell simulation analysis of a multicavity w-band sheet beam klystron,” IEEE transactions on electron devices, vol. 58, no. 1, pp. 251–258, 2010.
  • [2] R. Abrams, B. Levush, A. Mondelli, and R. Parker, “Vacuum electronics for the 21st century,” IEEE Microwave magazine, vol. 2, no. 3, pp. 61–72, 2001.
  • [3] S. F. Martins, R. Fonseca, W. Lu, W. B. Mori, and L. Silva, “Exploring laser-wakefield-accelerator regimes for near-term lasers using particle-in-cell simulation in lorentz-boosted frames,” Nature Physics, vol. 6, no. 4, pp. 311–316, 2010.
  • [4] V. A. Dolgashev and S. G. Tantawi, “Simulations of currents in x-band accelerator structures using 2d and 3d particle-in-cell code,” in PACS2001. Proceedings of the 2001 Particle Accelerator Conference (Cat. No. 01CH37268), vol. 5.   IEEE, 2001, pp. 3807–3809.
  • [5] C. K. Birdsall and A. B. Langdon, Plasma physics via computer simulation.   CRC press, 2018.
  • [6] G. Sasser, J. Blahovec, L. Bowers, J. Luginsland, J. Watrous, and S. Colella, “Current emission, resistive losses, and other challenging problems in the simulation of high power microwave components,” in 30th Plasmadynamic and Lasers Conference, 1999, p. 3730.
  • [7] J.-L. Vay, D. Grote, R. Cohen, and A. Friedman, “Novel methods in the particle-in-cell accelerator code-framework warp,” Computational Science & Discovery, vol. 5, no. 1, p. 014019, 2012.
  • [8] R. A. Fonseca, L. O. Silva, F. S. Tsung, V. K. Decyk, W. Lu, C. Ren, W. B. Mori, S. Deng, S. Lee, T. Katsouleas et al., “Osiris: A three-dimensional, fully relativistic particle in cell code for modeling plasma based accelerators,” in International Conference on Computational Science.   Springer, 2002, pp. 342–351.
  • [9] J. Qiang, R. D. Ryne, S. Habib, and V. Decyk, “An object-oriented parallel particle-in-cell code for beam dynamics simulation in linear accelerators,” Journal of Computational Physics, vol. 163, no. 2, pp. 434–451, 2000.
  • [10] C. S. Meierbachtol, A. D. Greenwood, J. P. Verboncoeur, and B. Shanker, “Conformal electromagnetic particle in cell: A review,” IEEE Transactions on Plasma Science, vol. 43, no. 11, pp. 3778–3793, 2015.
  • [11] M. C. Pinto, S. Jund, S. Salmon, and E. Sonnendrücker, “Charge-conserving fem–pic schemes on general grids,” Comptes Rendus Mecanique, vol. 342, no. 10-11, pp. 570–582, 2014.
  • [12] H. Moon, F. L. Teixeira, and Y. A. Omelchenko, “Exact charge-conserving scatter–gather algorithm for particle-in-cell simulations on unstructured grids: A geometric perspective,” Computer Physics Communications, vol. 194, pp. 43–53, 2015.
  • [13] S. O’Connor, Z. Crawford, J. Verboncoeur, J. Lugisland, and B. Shanker, “A set of benchmark tests for validation of 3d particle in cell methods,” arXiv preprint arXiv:2101.09299, 2021.
  • [14] A. S. Glasser and H. Qin, “The geometric theory of charge conservation in particle-in-cell simulations,” arXiv preprint arXiv:1910.12395, 2019.
  • [15] Z. D. Crawford, S. O’Connor, J. Luginsland, and B. Shanker, “Rubrics for charge conserving current mapping in finite element particle in cell methods,” arXiv preprint arXiv:2101.12128, 2021.
  • [16] S. O’Connor, Z. D. Crawford, O. Ramachandran, J. Luginsland, and B. Shanker, “Time integrator agnostic charge conserving finite element pic,” arXiv preprint arXiv:2102.06248, 2021.
  • [17] ——, “Quasi-helmholtz decomposition, gauss’ laws and charge conservation for finite element particle-in-cell,” arXiv preprint arXiv:2103.06737, 2021.
  • [18] E. Tonti, “Finite formulation of the electromagnetic field,” Progress in electromagnetics research, vol. 32, pp. 1–44, 2001.
  • [19] I. S. Duff and J. K. Reid, “The multifrontal solution of indefinite sparse symmetric linear,” ACM Transactions on Mathematical Software (TOMS), vol. 9, no. 3, pp. 302–325, 1983.
  • [20] Y. Liu, P. Ghysels, L. Claus, and X. S. Li, “Sparse approximate multifrontal factorization with butterfly compression for high-frequency wave equations,” SIAM Journal on Scientific Computing, vol. 43, no. 5, pp. S367–S391, 2021.
  • [21] X. Sherry Li and P. Ghysels. Direct sparse linear solvers, preconditioners. Argonne Training Program on Extreme-Scale Computing 2020. [Online]. Available: https://www.extremecomputingtraining.anl.gov/files/2020/08/ATPESC-2020-Track-5-Talk-4-LiAndGhysels-DirectSolvers.pdf
  • [22] C. Farhat and F.-X. Roux, “A method of finite element tearing and interconnecting and its parallel solution algorithm,” International journal for numerical methods in engineering, vol. 32, no. 6, pp. 1205–1227, 1991.
  • [23] C. Farhat, M. Lesoinne, P. LeTallec, K. Pierson, and D. Rixen, “Feti-dp: a dual–primal unified feti method—part i: A faster alternative to the two-level feti method,” International journal for numerical methods in engineering, vol. 50, no. 7, pp. 1523–1544, 2001.
  • [24] Y. Li and J.-M. Jin, “A vector dual-primal finite element tearing and interconnecting method for solving 3-d large-scale electromagnetic problems,” IEEE Transactions on Antennas and Propagation, vol. 54, no. 10, pp. 3000–3009, 2006.
  • [25] Y.-J. Li and J.-M. Jin, “Implementation of the second-order abc in the feti-dpem method for 3d em problems,” IEEE Transactions on Antennas and Propagation, vol. 56, no. 8, pp. 2765–2769, 2008.
  • [26] M.-F. Xue and J.-M. Jin, “Nonconformal feti-dp methods for large-scale electromagnetic simulation,” IEEE transactions on antennas and propagation, vol. 60, no. 9, pp. 4291–4305, 2012.
  • [27] C. Wolfe, U. Navsariwala, and S. D. Gedney, “A parallel finite-element tearing and interconnecting algorithm for solution of the vector wave equation with pml absorbing medium,” IEEE Transactions on Antennas and Propagation, vol. 48, no. 2, pp. 278–284, 2000.
  • [28] A. Akbarzadeh-Sharbaf, V. Mohtashami, and D. D. Giannacopoulos, “An unconditionally stable and energy-preserving domain-decomposition method for transient modeling of large-scale electromagnetic problems,” IEEE Transactions on Antennas and Propagation, vol. 67, no. 11, pp. 6989–7000, 2019.
  • [29] U. D. Navsariwala and S. D. Gedney, “An efficient implementation of the finite-element time-domain algorithm on parallel computers using a finite-element tearing and interconnecting algorithm,” Microwave and Optical Technology Letters, vol. 16, no. 4, pp. 204–208, 1997.
  • [30] Z.-H. Lai, J.-F. Kiang, and R. Mittra, “A domain decomposition finite difference time domain (fdtd) method for scattering problem from very large rough surfaces,” IEEE Transactions on Antennas and Propagation, vol. 63, no. 10, pp. 4468–4476, 2015.
  • [31] O. C. Zienkiewicz, “A new look at the newmark, houbolt and other time stepping formulas. a weighted residual approach,” Earthquake Engineering & Structural Dynamics, vol. 5, no. 4, pp. 413–418, 1977.
  • [32] J. P. Verboncoeur, A. B. Langdon, and N. Gladd, “An object-oriented electromagnetic pic code,” Computer Physics Communications, vol. 87, no. 1-2, pp. 199–211, 1995.
  • [33] V. Kovalev and V. Y. Bychenkov, “Analytic solutions to the vlasov equations for expanding plasmas,” Physical review letters, vol. 90, no. 18, p. 185004, 2003.