This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Verification of a Fully Implicit Particle-in-Cell Method for the vv_{\parallel} Formalism of Electromagnetic Gyrokinetics in the XGC Code

Benjamin J. Sturdevant [email protected] S. Ku L. Chacón Y. Chen D. Hatch M. D. J. Cole A. Y. Sharma M. F. Adams C. S. Chang S. E. Parker R. Hager Princeton Plasma Physics Laboratory, P.O. Box 451, Princeton, NJ 08543, USA Los Alamos National Laboratory, P.O. Box 1663, Los Alamos, NM 87545, USA Lawrence Berkeley National Laboratory, 1 Cyclotron Rd., Berkeley, CA 94720, USA University of Colorado at Boulder, Department of Physics, 390 UCB, Boulder, CO 80309, USA Institute for Fusion Studies, University of Texas at Austin, 1 University Station, C1500, Austin, TX 78712
Abstract

A fully implicit particle-in-cell method for handling the vv_{\parallel}-formalism of electromagnetic gyrokinetics has been implemented in XGC. By choosing the vv_{\parallel} formalism, we avoid introducing the non-physical skin terms in Ampère’s law, which are responsible for the well-known “cancellation problem” in the pp_{\parallel}-formalism. The vv_{\parallel}-formalism, however, is known to suffer from a numerical instability when explicit time integration schemes are used due to the appearance of a time derivative in the particle equations of motion from the inductive component of the electric field. Here, using the conventional δf\delta f scheme, we demonstrate that our implicitly discretized algorithm can provide numerically stable simulation results with accurate dispersive properties. We verify the algorithm using a test case for shear Alfvén wave propagation in addition to a case demonstrating the ITG-KBM transition. The ITG-KBM transition case is compared to results obtained from other δf\delta f gyrokinetic codes/schemes, whose verification has already been archived in the literature.

keywords:
Electromagnetic Gyrokinetics, Implicit Methods, Particle-in-cell, XGC
journal: Physics of Plasmas

1 Introduction

Gyrokinetic particle-in-cell codes are used extensively to study instabilities and microturbulence in magnetically confined fusion devices. Electrostatic gyrokinetic particle simulations of ion temperature gradient (ITG) driven microturbulence have been accessible now for a few decades with the adiabatic electron model [1, 2, 3, 4]. More challenging is the kinetic treatment of electrons due to their small mass and the presence of additional high-frequency modes [5]. Specialized numerical techniques are generally used to mitigate these issues [6, 7, 8, 9, 10, 11, 12]. There are important physics effects in magnetically confined fusion devices that, in addition to kinetic electrons, require the inclusion of electromagnetic perturbations. Electromagnetic capabilities have been implemented in a number of particle and continuum gyrokinetic codes including GEM [9, 10, 11], GTC [4, 13, 14, 15], EUTERPE [16, 17], ORB5 [18, 19, 20], GENE [21, 22, 23], Gkeyll [24, 25], GYRO [26, 27], GKV [28, 29], and GKNET [30, 31]. However, application to long-wavelength MHD-type modes is often limited due to numerical difficulties.

For particle codes in particular, handling the parallel vector potential AA_{\parallel} at high β\beta (ratio of plasma to magnetic pressure), long perpendicular wavelength regimes remains a challenge. It is common to use the parallel canonical momentum pp_{\parallel} as a coordinate, which leads to the appearance of a large skin current term in Ampère’s law from the zero-order distribution [9, 32]. To avoid severe accuracy problems, this skin current term needs to cancel with the corresponding part of the current deposited from the marker particles. The difference in numerical representations between the current appearing in Ampère’s law and the current coming from marker particles makes this exact cancellation particularly difficult to achieve. When the parallel component of velocity vv_{\parallel} is instead used as a coordinate, the original form of Ampère’s law can be used without the appearance of skin current terms. Hence, the vv_{\parallel} formalism avoids the cancellation problem altogether. The difficulty with this approach, however, is the appearance of a time derivative in the particle equations of motion from the inductive component of the electric field perturbation, At\frac{\partial A_{\parallel}}{\partial t}. Explicit time integration methods are generally unstable with the inclusion of this term [33].

Kinetic electron capabilities in XGC [6, 7, 34] are already well established and have been used in large scale physics studies of the tokamak edge [35, 36, 37, 38]. There have been recent efforts to extend this capability to include electromagnetic perturbations. Besides the approach described in this paper, an explicit method using the mixed variables/pullback transformation scheme [39, 40, 17] has been recently implemented [41]. In addition, an implicit kinetic-fluid hybrid model has been previously explored as an inexpensive alternative to electron particles [42]. A key goal of these efforts is to develop the capability of simulating microturbulence consistently with MHD-type electromagnetic modes in the full volume of magnetically confined fusion devices from the magnetic axis to the first wall. Such a capability has not been previously demonstrated. However, we mention that there have been notable steps taken toward this end. For example, ORB5 has recently achieved simulations capturing the self-consistent interactions between Alfvén modes and ITG turbulence using a global gyrokinetic model in a simple core geometry [43]. Beyond MHD, an electromagnetic version of XGC will be useful for the study of microtearing modes and electromagnetic effects on electrostatic modes, e.g., finite-beta stabilization. A full-volume electromagnetic capability would represent an important step forward both for XGC and for gyrokinetic particle simulations in general.

Here, we have applied the techniques developed in Refs. [44, 45, 46, 47, 48] in the 6-dimensional (6D) particle-in-cell context to enable a fully implicit particle-in-cell method for handling the vv_{\parallel}-formalism of electromagnetic gyrokinetics in the 5D particle-in-cell code XGC. Besides the work we document in this paper on XGC, we mention recent efforts in developing an implicit gyrokinetic electromagnetic scheme in the mixed particle-in-cell/particle-in-Fourier TRIMEG-GKX code for applications to energetic particle physics [49]. Implicit time discretization is effective for eliminating the stability issues originating from the inductive component of the electric field, and by using the vv_{\parallel}-formalism, we avoid the cancellation problem in Ampère’s law. The implicit approach, however, requires the solution of a large system of nonlinear equations at each time step, for which we employ Anderson mixing [50] to a preconditioned Picard iteration scheme. The focus of this paper is to present the 5D equations and algorithms, and to verify the scheme implemented in XGC for two test cases. The first demonstrates shear Alfvén wave (SAW) propagation in cylindrical geometry in the long perpendicular wavelength regime, and the second demonstrates the transition from the ITG instability to the kinetic ballooning mode (KBM) in toroidal geometry as β\beta is increased past a critical value. These test cases were motivated by ones previously considered in [51] for the GTS code [52, 53]. By comparing to an analytical dispersion relation, the implicit scheme is shown to accurately reproduce the dispersion properties of the SAW in regimes inaccessible with the pp_{\parallel}-formalism. In addition, the implicit scheme shows good agreement with the GEM and GENE codes and the explicit electromagnetic implementation in XGC [41] for the ITG-KBM transition case. The results presented in this paper strengthen our confidence in the ability of the implicit scheme to accurately solve the electromagnetic gyrokinetic equations.

The remainder of the paper is organized as follows. In Sec. 2, we present the equations implemented in XGC based on the vv_{\parallel}-formalism of electromagnetic gyrokinetics. In Sec. 3, we give an overview of the implicit algorithm, including the discretization of the equations and our approach to solve the resulting system of equations at each timestep. In Sec. 4, we describe the two test cases used to verify our implementation and present the results. In Sec. 5, we briefly discuss the performance of the implicit algorithm. Finally, we give conclusions in Sec. 6.

2 Gyrokineitc Electromagnetic Model in the vv_{\parallel} Formalism

The vv_{\parallel}-formalism of the electromagnetic gyrokinetic equations requires the parallel component of the perturbed vector potential AA_{\parallel}, which is obtained from the following form of Ampère’s law:

2A=μ0s=i,eδjs,\displaystyle-\nabla_{\perp}^{2}A_{\parallel}=\mu_{0}\sum_{s=i,e}\langle\delta j_{\parallel s}\rangle, (1)

where μ0\mu_{0} is the permeability of free space, and \langle\cdot\rangle represents the gyroaveraging operator. Throughout, it is understood that gyroaveraging is only performed for the ions, as the electrons are treated as drift-kinetic. This simplification can be easily removed. The perturbed current contribution from each species on the right hand side is deposited from the marker particles using the δf\delta f weights [54, 55, 56, 57]. The background current is balanced out by the equilibrium magnetic field and plasma. The gyrokinetic Poisson equation is used to solve for the electrostatic potential ϕ\phi and is given by:

min0iB2ϕ=s=i,eqsδns,\displaystyle-\nabla\cdot\frac{m_{i}n_{0i}}{B^{2}}\nabla_{\perp}\phi=\sum_{s=i,e}q_{s}\langle\delta n_{s}\rangle, (2)

where mim_{i} is the ion mass, n0in_{0i} is the background ion density, BB is the background magnetic field strength, and qsq_{s} is the charge for species ss. The short wavelength correction term often used in XGC [7] is turned off in this initial study for simplicity, which has been justified in the benchmarking paper Ref. [41]. Again, the perturbed number densities δns\delta n_{s} on the right hand side come from marker particles, and gyroaveraging is assumed only for the ions. From ϕ\phi and AA_{\parallel}, we can compute the perturbed electric and magnetic fields as

δ𝐄\displaystyle\delta{\bf E} =ϕAt𝐛^\displaystyle=-\nabla\phi-\frac{\partial A_{\parallel}}{\partial t}\hat{{\bf b}} (3)
δ𝐁\displaystyle\delta{\bf B} =A×𝐛^,\displaystyle=\nabla A_{\parallel}\times\hat{{\bf b}}, (4)

where 𝐛^\hat{{\bf b}} is the direction of the background magnetic field. Here, we note the presence of the time derivative in the second term of the electric field, which leads to numerical instabilities when explicit time integration schemes are used. Here, we do not compute δ𝐁\delta{\bf B} from the curl of A𝐛^A_{\parallel}\hat{{\bf b}} as is done, for example, in Ref. [58]. The difference represents a higher order corrction in the gyokinetic ordering, which we have neglected in the present study. The particle equations of motion for species s=i,es=i,e can be written as :

𝐗˙\displaystyle\dot{{\bf X}} =v𝐁B+(δ𝐄μqsB)×𝐛^B\displaystyle=v_{\parallel}\frac{{\bf B}^{*}}{B_{\parallel}^{*}}+\left(\langle\delta{\bf E}\rangle-\frac{\mu}{q_{s}}\nabla B\right)\times\frac{\hat{{\bf b}}}{B_{\parallel}^{*}} (5)
v˙\displaystyle\dot{v}_{\parallel} =qsms𝐁B(δ𝐄μqsB),\displaystyle=\frac{q_{s}}{m_{s}}\frac{{\bf B}^{*}}{B_{\parallel}^{*}}\cdot\left(\langle\delta{\bf E}\rangle-\frac{\mu}{q_{s}}\nabla B\right), (6)

where 𝐗{\bf X} is the gyrocenter position vector, vv_{\parallel} is the parallel component of velocity, μ\mu is the magnetic moment, msm_{s} is the mass of species ss. We note that, in this form of the gyrocenter equations of motion, the curvature drift is hidden in the first term 𝐁/B{\bf B}^{*}/B_{\parallel}^{*}, with

𝐁\displaystyle{\bf B}^{*} =𝐁+δ𝐁+msqsv×𝐛^\displaystyle={\bf B}+\langle\delta{\bf B}\rangle+\frac{m_{s}}{q_{s}}v_{\parallel}\nabla\times\hat{{\bf b}} (7)
B\displaystyle B_{\parallel}^{*} =𝐛^𝐁.\displaystyle=\hat{{\bf b}}\cdot{\bf B}^{*}. (8)

For the purposes of this report, we use the conventional δf\delta f method, splitting the distribution functions for each species into background and perturbed parts as fs=f0s+δfsf_{s}=f_{0s}+\delta f_{s}. This splitting leads to an equation for the evolution of particles weights as :

dwdt=(1w)δf˙s/f0s,\displaystyle\frac{dw}{dt}=(1-w)\delta\dot{f}_{s}/f_{0s}, (9)

where

δf˙s=𝐗˙1f0sv˙1f0sv.\displaystyle\delta\dot{f}_{s}=-\dot{{\bf X}}_{1}\cdot\nabla f_{0s}-\dot{v}_{\parallel 1}\frac{\partial f_{0s}}{\partial v_{\parallel}}. (10)

Here, the subscript 11 indicates that only perturbed quantities are kept from Eqs.(5)–(6). In particular,

𝐗˙1\displaystyle\dot{{\bf X}}_{1} =vδ𝐁B+δ𝐄×𝐛^B\displaystyle=v_{\parallel}\frac{\langle\delta{\bf B}\rangle}{B_{\parallel}^{*}}+\langle\delta{\bf E}\rangle\times\frac{\hat{{\bf b}}}{B_{\parallel}^{*}} (11)
v˙1\displaystyle\dot{v}_{\parallel 1} =qsms(𝐁Bδ𝐄μqsδ𝐁BB).\displaystyle=\frac{q_{s}}{m_{s}}\cdot\left(\frac{{\bf B}^{*}}{B_{\parallel}^{*}}\cdot\langle\delta{\bf E}\rangle-\frac{\mu}{q_{s}}\frac{\langle\delta{\bf B}\rangle}{B_{\parallel}^{*}}\cdot\nabla B\right). (12)

We note that BB^{*}_{\parallel} does not contain any perturbed quantities, since δ𝐁\langle\delta{\bf B}\rangle computed from Eq.(4) is perpendicular to 𝐛^\hat{{\bf b}}. If δ𝐁\delta{\bf B} were computed using the full curl of A𝐛^A_{\parallel}\hat{{\bf b}}, rather than from Eq.(4), a contribution from AA_{\parallel} would appear in BB_{\parallel}^{*} as in Ref. [58].

Finally, we mention that Eq.(10) is only consistent with the gyrokinetic Vlasov equation when f0sf_{0s} is an exact equilibrium solution of the gyrokinetic Vlasov equation. For a local Maxwellian f0sf_{0s}, contributions due to the grad-B and curvature drifts drivers can be missed in the weight equation using Eq.(10). These contributions are commonly ignored in conventional δf\delta f codes and are ignored in the code used in this report for cross-verification in Sec. 4.

3 Implicit Algorithm

In this section, we describe the implicit time discretization scheme applied to the system in Sec. 2, as well as the iterative scheme and preconditioner used in solving the resulting nonlinear system of equations. The discretization scheme and iterative solution method are based on the work in [44, 45, 46, 47, 48].

3.1 Time Discretization

In our scheme, electrons are subcycled [12, 59, 44, 60], meaning they are advanced using several small time steps over the interval nΔtt(n+1)Δtn\Delta t\leq t\leq(n+1)\Delta t. Hence, we need to define the perturbed electric and magnetic fields over the continuous time interval between steps nn and n+1n+1. Following [44, 45, 46, 47, 48], we take the electric field to be constant in time over the subcycling interval as

δ𝐄(t)=(ϕn+1+ϕn2)2Δt(An+1/2A~n)𝐛^fornΔtt(n+1)Δt,\displaystyle\delta{\bf E}(t)=-\nabla\left(\frac{\phi^{n+1}+\phi^{n}}{2}\right)-\frac{2}{\Delta t}\left(A_{\parallel}^{n+1/2}-\tilde{A}_{\parallel}^{n}\right)\hat{{\bf b}}\ \ \ \mathrm{for}\ \ n\Delta t\leq t\leq(n+1)\Delta t, (13)

where A~n\tilde{A}_{\parallel}^{n} is defined recursively in time by

A~n=2An1/2A~n1,\displaystyle\tilde{A}_{\parallel}^{n}=2A_{\parallel}^{n-1/2}-\tilde{A}_{\parallel}^{n-1}, (14)

and is initialized by solving Eq.(1) at timestep 0. Consistent with the time derivative of the parallel vector potential being constant within the interval, we take δ𝐁\delta{\bf B} to vary linearly in time as [46, 47, 48]:

δ𝐁(t)=[(1tnΔtΔt/2)A~n+(tnΔtΔt/2)An+1/2]×𝐛^.\displaystyle\delta{\bf B}(t)=\left[\left(1-\frac{t-n\Delta t}{\Delta t/2}\right)\nabla\tilde{A}_{\parallel}^{n}+\left(\frac{t-n\Delta t}{\Delta t/2}\right)\nabla A_{\parallel}^{n+1/2}\right]\times\hat{{\bf b}}. (15)

The gyrocenter positions of marker particles are described using a cylindrical coordinate system (R,Z,φ)(R,Z,\varphi), where RR is the major radius, ZZ is the distance along the cylindrical axis, and φ\varphi is the toroidal angle. Together with vv_{\parallel} and the δf\delta f particle weight ww, there are five evolving variables for each marker particle. These can be written in vector form as 𝜻=[R,Z,φ,v,w]T\boldsymbol{\zeta}=[R,Z,\varphi,v_{\parallel},w]^{T}, and marker particle evolution can then be expressed as

𝜻˙=𝐆(𝜻,t),\displaystyle\dot{\boldsymbol{\zeta}}={\bf G}(\boldsymbol{\zeta},t), (16)

where

𝐆(𝜻,t)=[(vBR+FZBφBFφBZB)/B(vBZ+FφBRBFRBφB)/B(vBφ+FRBZBFZBRB)/RBqsms(BRFR+BZFZ+BφFφ)/B(1w)(𝐗˙1ΨΨlnf0s+v˙1vlnf0s)].{\bf G}(\boldsymbol{\zeta},t)=\left[\begin{array}[]{c}\left(v_{\parallel}B^{*}_{R}+F_{Z}\frac{B_{\varphi}}{B}-F_{\varphi}\frac{B_{Z}}{B}\right)/B^{*}_{\parallel}\\ \left(v_{\parallel}B^{*}_{Z}+F_{\varphi}\frac{B_{R}}{B}-F_{R}\frac{B_{\varphi}}{B}\right)/B^{*}_{\parallel}\\ \left(v_{\parallel}B^{*}_{\varphi}+F_{R}\frac{B_{Z}}{B}-F_{Z}\frac{B_{R}}{B}\right)/RB^{*}_{\parallel}\\ \frac{q_{s}}{m_{s}}\left(B^{*}_{R}F_{R}+B^{*}_{Z}F_{Z}+B^{*}_{\varphi}F_{\varphi}\right)/B^{*}_{\parallel}\\ -(1-w)\left(\dot{{\bf X}}_{1}\cdot\nabla\Psi\frac{\partial}{\partial\Psi}\ln{f_{0s}}+\dot{v}_{\parallel 1}\frac{\partial}{\partial v_{\parallel}}\ln{f_{0s}}\right)\end{array}\right]. (17)

In this expression, we have defined the vector quantity 𝐅=δ𝐄μqsB{\bf F}=\langle\delta{\bf E}\rangle-\frac{\mu}{q_{s}}\nabla B and have assumed spatial variations in f0sf_{0s} to be along the poloidal flux coordinate Ψ\Psi. Given the discrete-time potentials ϕn\phi^{n}, ϕn+1\phi^{n+1}, A~n\tilde{A}_{\parallel}^{n}, and An+1/2A_{\parallel}^{n+1/2} defined on the mesh, Eq.(13) and Eq.(15) together with the particle interpolation methods allow for the evaluation of the right hand side vector 𝐆(𝜻,t){\bf G}(\boldsymbol{\zeta},t) at all particle locations 𝜻\boldsymbol{\zeta} within the mesh and all times within the interval nΔtt(n+1)Δtn\Delta t\leq t\leq(n+1)\Delta t. This allows considerable freedom in choosing an ODE integration method to advance Eq.(16) in time from step nn to n+1n+1. Here, we choose a standard fourth order Runge-Kutta method (RK4) for both electrons and ions. For subcycled electrons, we divide the interval into an integer number of sub-intervals and apply RK4 successively over the sub-intervals to advance from step nn to n+1n+1. Adaptive integration schemes such as the one considered in Ref. [44] are well-suited for subcycling and will be explored in the future. Ions are not subcycled and use field values at the center and ends of the time interval to compute the RK4 stages.

3.2 Residual Evaluation

By choosing an implicit time discretization, the particles and fields are interdependent at each timestep. Particles are pushed using fields that depend on ϕn+1\phi^{n+1} and An+1/2A_{\parallel}^{n+1/2}, yet these potentials are determined from the particle states over the interval nΔtt(n+1)Δtn\Delta t\leq t\leq(n+1)\Delta t. In particular, the right hand side of Ampère’s law is computed from the time averaged current depositions over the subcycled timesteps between nn and n+1n+1, and the right hand side of the gyrokinetic Poisson equation is computed from the number density deposition at n+1n+1. We require a self-consistent state between the particles and fields at each timestep, which can be expressed in terms of low-dimensional (compared to the degrees of freedom in the particle system) residuals by using the spatially discretized versions of Eqs.(1)–(2). We define the residuals by

RA(ϕn+1,An+1/2)\displaystyle R_{A}(\phi^{n+1},A_{\parallel}^{n+1/2}) 2An+1/2μ0s=i,eδjsn+1/2\displaystyle\equiv-\nabla_{\perp}^{2}A_{\parallel}^{n+1/2}-\mu_{0}\sum_{s=i,e}\langle\delta j_{\parallel s}\rangle^{n+1/2}
Rϕ(ϕn+1,An+1/2)\displaystyle R_{\phi}(\phi^{n+1},A_{\parallel}^{n+1/2}) min0iB2ϕn+1s=i,eqsδnsn+1.\displaystyle\equiv-\nabla\cdot\frac{m_{i}n_{0i}}{B^{2}}\nabla_{\perp}\phi^{n+1}-\sum_{s=i,e}q_{s}\langle\delta n_{s}\rangle^{n+1}. (18)

For given ϕn+1\phi^{n+1} and An+1/2A_{\parallel}^{n+1/2}, evaluation of the residuals involves: pushing the marker particles from step nn to n+1n+1, as described in the previous subsection; depositing δjsn+1/2\langle\delta j_{\parallel s}\rangle^{n+1/2} and δnsn+1\langle\delta n_{s}\rangle^{n+1} to the mesh; and evaluating the elliptic operators in Ampère’s law and the gyrokinetic Poisson equation. Self-consistency is expressed as

RA(ϕn+1,An+1/2)\displaystyle R_{A}(\phi^{n+1},A_{\parallel}^{n+1/2}) =0\displaystyle=0
Rϕ(ϕn+1,An+1/2)\displaystyle R_{\phi}(\phi^{n+1},A_{\parallel}^{n+1/2}) =0,\displaystyle=0, (19)

which represents a nonlinear system of equations for ϕn+1\phi^{n+1} and An+1/2A_{\parallel}^{n+1/2}. We note that both particle and field quantities are coupled in Eq.(19). The moments δjsn+1/2\langle\delta j_{\parallel s}\rangle^{n+1/2} and δnsn+1\langle\delta n_{s}\rangle^{n+1} are implicitly functions of ϕn+1\phi^{n+1} and An+1/2A_{\parallel}^{n+1/2}, where the dependence comes in through the particle equations of motion [44]. Furthermore, given ϕn+1\phi^{n+1} and An+1/2A_{\parallel}^{n+1/2} that satisfy Eq.(19), the consistent particle states follow directly by pushing the particles with the resulting fields.

3.3 Iterative Solution Method

Our iterative solution method involves wrapping Anderson mixing [50] around a preconditioned Picard iteration scheme. Here, we give a brief overview of the preconditioned Picard iteration scheme. We define a state vector for the potentials 𝐱=[ϕn+1,An+1/2]T{\bf x}=[\phi^{n+1},A_{\parallel}^{n+1/2}]^{T} and a residual vector 𝐫=[Rϕ,RA]T{\bf r}=[R_{\phi},R_{A}]^{T}. In this notation, Eq.(19) can be stated as 𝐫(𝐱)=0{\bf r}({\bf x})=0, and the preconditioned Picard iteration scheme can be written as

𝒫Δ𝐱k\displaystyle{\cal P}\Delta{\bf x}^{k} =𝐫(𝐱k)\displaystyle={\bf r}({\bf x}^{k}) (20)
𝐱k+1\displaystyle{\bf x}^{k+1} =𝐱k+Δ𝐱k,\displaystyle={\bf x}^{k}+\Delta{\bf x}^{k},

where 𝒫{\cal P} is an invertible operator whose inverse maps the residual vector to a correction vector Δ𝐱\Delta{\bf x} and kk is the iteration index. We refer to the operator 𝒫{\cal P} as the preconditioner and note that the consistency of the iterative method does not depend on the particular form of 𝒫{\cal P}. In other words, any invertible preconditioner used in this scheme will produce a correct solution such that 𝐫(𝐱)=0{\bf r}({\bf x})=0 provided the scheme converges. The choice of preconditioner will, however, determine if (and at what rate) the scheme converges. As mentioned in the previous subsection, each evaluation of 𝐫{\bf r} on the right hand side of Eq.(20) involves a particle push and deposition to compute updated velocity moments. One preconditioned Picard iteration from kk to k+1k+1 therefore follows these steps

  1. 1.

    Push particles using the fields computed from the potentials in 𝐱k{\bf x}^{k}.

  2. 2.

    Deposit δjs\langle\delta j_{\parallel s}\rangle and δns\langle\delta n_{s}\rangle to compute the residuals in 𝐫k{\bf r}^{k}.

  3. 3.

    Solve 𝒫Δ𝐱k=𝐫(𝐱k){\cal P}\Delta{\bf x}^{k}={\bf r}({\bf x}^{k}) to get the corrections to the potentials Δ𝐱k\Delta{\bf x}^{k}.

  4. 4.

    Update the potentials by 𝐱k+1=𝐱k+Δ𝐱k{\bf x}^{k+1}={\bf x}^{k}+\Delta{\bf x}^{k}.

3.4 Fluid-Based Preconditioner

Similarly to what has been reported elsewhere for 6D electromagnetic PIC [46, 47, 48], we choose a preconditioner based on a simplified set of implicitly discretized fluid equations that are linearized with respect to changes in ϕn+1\phi^{n+1} and An+1/2A_{\parallel}^{n+1/2}. Our starting point is to consider the continuity and momentum equations for electrons, keeping only the terms relevant to dynamics parallel to the background magnetic field. This greatly simplifies the equations we will need to solve when applying 𝒫1{\cal P}^{-1} to the residuals, yet still captures the terms responsible for the fastest timescales in the system. Since ions evolve on a much slower timescale than electrons, we neglect them entirely in the preconditioner. The starting electron fluid equations are

tδne\displaystyle\frac{\partial}{\partial t}\delta n_{e} 1eBB1δje=0\displaystyle-\frac{1}{e}B\nabla_{\parallel}B^{-1}\delta j_{\parallel e}=0 (21)
tδje\displaystyle\frac{\partial}{\partial t}\delta j_{\parallel e} +e2n0me(ϕ+tA)eTemeBB1δne=0.\displaystyle+\frac{e^{2}n_{0}}{m_{e}}\left(\nabla_{\parallel}\phi+\frac{\partial}{\partial t}A_{\parallel}\right)-\frac{eT_{e}}{m_{e}}B\nabla_{\parallel}B^{-1}\delta n_{e}=0. (22)

Next, we consider an implicit discretization of these equations which defines quantities at discrete time locations consistent with the quantities in the PIC system. We have

δnen+1Δt\displaystyle\frac{\delta n_{e}^{n+1}}{\Delta t} 1eBB1δjen+1/2=Fcn\displaystyle-\frac{1}{e}B\nabla_{\parallel}B^{-1}\delta j_{\parallel e}^{n+1/2}=F_{c}^{n} (23)
2Δtδjen+1/2\displaystyle\frac{2}{\Delta t}\delta j_{\parallel e}^{n+1/2} +e2n0me(12ϕn+1+2ΔtAn+1/2)eTemeBB112δnen+1=FMn,\displaystyle+\frac{e^{2}n_{0}}{m_{e}}\left(\frac{1}{2}\nabla_{\parallel}\phi^{n+1}+\frac{2}{\Delta t}A_{\parallel}^{n+1/2}\right)-\frac{eT_{e}}{m_{e}}B\nabla_{\parallel}B^{-1}\frac{1}{2}\delta n_{e}^{n+1}=F_{M}^{n}, (24)

where the right hand sides of these equations can be determined from information known at the previous timestep. Finally, if we consider the linear responses of the fluid moments due to small perturbations in ϕn+1\phi^{n+1} and An+1/2A_{\parallel}^{n+1/2}, we have :

Δne\displaystyle\Delta n_{e} ΔteBB1Δje=0\displaystyle-\frac{\Delta t}{e}B\nabla_{\parallel}B^{-1}\Delta j_{\parallel e}=0 (25)
Δje\displaystyle\Delta j_{\parallel e} +e2n0me(Δt4Δϕ+ΔA)Δt4eTemeBB1Δne=0.\displaystyle+\frac{e^{2}n_{0}}{m_{e}}\left(\frac{\Delta t}{4}\nabla_{\parallel}\Delta\phi+\Delta A_{\parallel}\right)-\frac{\Delta t}{4}\frac{eT_{e}}{m_{e}}B\nabla_{\parallel}B^{-1}\Delta n_{e}=0.

In our notation, δ\delta refers to perturbations from background quantities in the system, whereas Δ\Delta refers to perturbations due to small changes in the discrete-time potentials ϕn+1\phi^{n+1} and An+1/2A_{\parallel}^{n+1/2}. A 4 ×\times 4 block matrix 𝔸{\mathbb{A}} can be written using Eq.(25) together with Ampère’s law and the gyrokinetic Poisson equation, which describes the linear response of the coupled moment-potential system. We write

𝔸=[min0iB20e𝕄0020μ0𝕄00𝕀ΔteBB1Δt4e2n0mee2n0me𝕀Δt4eTemeBB1𝕀],{\mathbb{A}}=\left[\begin{array}[]{cc|cc}-\nabla\cdot\frac{m_{i}n_{0i}}{B^{2}}\nabla_{\perp}&0&e{\mathbb{M}}&0\\ 0&-\nabla_{\perp}^{2}&0&-\mu_{0}{\mathbb{M}}\\ \hline\cr 0&0&{\mathbb{I}}&-\frac{\Delta t}{e}B\nabla_{\parallel}B^{-1}\\ \frac{\Delta t}{4}\frac{e^{2}n_{0}}{m_{e}}\nabla_{\parallel}&\frac{e^{2}n_{0}}{m_{e}}{\mathbb{I}}&-\frac{\Delta t}{4}\frac{eT_{e}}{m_{e}}B\nabla_{\parallel}B^{-1}&{\mathbb{I}}\end{array}\right], (26)

where the operators in each block are N×NN\times N matrices with NN the number of mesh vertices, 𝕀{\mathbb{I}} is the N×NN\times N identity matrix, and 𝕄{\mathbb{M}} is a mass matrix for the finite element discretization used in XGC over the poloidal planes. We recall that in step 4 from the previous subsection only the potentials are updated at the end of each iteration. Equation (26) represents an augmented system with equations involving the additional unknowns Δne\Delta n_{e} and Δje\Delta j_{\parallel e}. These additional unknowns can provide an intuitive way of expressing the system, where the top two rows of 𝔸{\mathbb{A}} model the responses of the potentials to changes in the moments, and the bottom two rows model the responses of the moments to changes in the potentials. However, the actual solutions obtained for Δne\Delta n_{e} and Δje\Delta j_{\parallel e} do not play a role in the update.

We use the block matrix system to solve for Δ𝐱k\Delta{\bf x}^{k} in Eq.(20) in two steps:

  1. 1.

    Solve 𝔸𝐲=𝕎𝐫k{\mathbb{A}}{\bf y}={\mathbb{W}}{\bf r}^{k}, where 𝐲=[Δϕ,ΔA,Δne,Δje]T{\bf y}=[\Delta\phi,\Delta A_{\parallel},\Delta n_{e},\Delta j_{\parallel e}]^{T} and

    𝕎=[𝕀00𝕀0000].{\mathbb{W}}=\left[\begin{array}[]{cc}{\mathbb{I}}&0\\ 0&{\mathbb{I}}\\ 0&0\\ 0&0\end{array}\right].
  2. 2.

    Restrict the solution to the potential correction vector Δ𝐱k\Delta{\bf x}^{k} with Δ𝐱k=𝕎T𝐲\Delta{\bf x}^{k}={\mathbb{W}}^{T}{\bf y}.

The block matrix is therefore related to the preconditioner operator in Eq.(20) by 𝒫1=𝕎T𝔸1𝕎{\cal P}^{-1}={\mathbb{W}}^{T}{\mathbb{A}}^{-1}{\mathbb{W}}. In our implementation, we use the suite of solvers available in PETSc [61, 62, 63] to invert 𝔸{\mathbb{A}}. Some initial convergence results for the iterative scheme applied to the preconditioned implicit PIC equations are given in Sec. 5. However, detailed studies of convergence rates across different parameter regimes, modifications to the fluid-based preconditioner for improving the convergence rates, and methods for efficiently inverting 𝔸{\mathbb{A}} will be the subject of a separate paper.

4 Verification Tests

Our verification tests are based on cases previously considered in [51]. In this section, we describe the problem set up and present results for two test cases. The first is shear Alfvén wave (SAW) propagation in a periodic cylindrical system and the second is a toroidal case demonstrating the transition from an ion temperature gradient (ITG) instability to the kinetic ballooning mode (KBM) as β\beta is increased past a critical transition value. The SAW results are compared to an analytical dispersion relation and the ITG-KBM transition results are compared to results obtained from the GEM and GENE codes, in addition to XGC results using an explicit timestepping method with the mixed variables/pullback transformation scheme [41].

4.1 Magnetic Geometry Model

For the toroidal system, we again consider the cylindrical coordinate system (R,Z,φ)(R,Z,\varphi) from Sec. 3.1. In addition, we can define a simple toroidal coordinate system by expressing RR and ZZ in terms of the RR coordinate at the magnetic axis R0R_{0}, a minor radius coordinate rr, and a poloidal angle coordinate θ\theta as

R\displaystyle R =R0+rcosθ\displaystyle=R_{0}+r\cos{\theta} (27)
Z\displaystyle Z =rsinθ.\displaystyle=r\sin{\theta}. (28)

The background magnetic field model used is from [64] and can be expressed in the cylindrical coordinates as

𝐁=φ^×ΨR+B0R0Rφ^,\displaystyle{\bf B}=\frac{\hat{\varphi}\times\nabla\Psi}{R}+\frac{B_{0}R_{0}}{R}\hat{\varphi}, (29)

where Ψ\Psi is a poloidal flux function. An analytical form is taken for Ψ\Psi depending on rr only as

Ψ(r)=B00rrdrq(r)1(r/R0)2,\displaystyle\Psi(r)=B_{0}\int_{0}^{r}\frac{r^{\prime}dr^{\prime}}{q(r^{\prime})\sqrt{1-\left(r^{\prime}/R_{0}\right)^{2}}}, (30)

where q(r)q(r) is the safety factor. Hence, the magnetic equilibrium is determined once we specify the parameters B0B_{0} and R0R_{0} and the function q(r)q(r). We take q(r)q(r) to be a quadratic polynomial in r/ar/a, where aa is the minor radius

q(r)=q0+q1(ra)+q2(ra)2.\displaystyle q(r)=q_{0}+q_{1}\left(\frac{r}{a}\right)+q_{2}\left(\frac{r}{a}\right)^{2}. (31)

Hence q(r)q(r) is determined by the parameters q0q_{0}, q1q_{1}, and q2q_{2} in addition to the minor radius aa. To adapt this model to the periodic cylindrical system used in the SAW benchmark, Eq.(29) is modified by replacing RR in the denominator of both terms with the constant R0R_{0}. The toroidal angle φ\varphi then serves as an axial coordinate and (R,Z)(R,Z) are coordinates within planes perpendicular to the cylindrical axis. The distance along the cylindrical axis is parameterized by R0φR_{0}\varphi, and the system remains 2π2\pi-periodic in φ\varphi. In this way, we can interpret the cylindrical system as a “straightened” toroidal system.

4.2 Model for Density and Temperature Profiles

To describe density and temperature profiles, we first define a normalized logarithmic gradient. Letting AA represent either density or temperature, the normalized logarithmic gradient of AA is defined as

κA=R0ddrln(A),\displaystyle\kappa_{A}=-R_{0}\frac{d}{dr}\ln{(A)}, (32)

and the profile of AA is then

A(r)=A(r0)exp[r0rκA(r)R0𝑑r],\displaystyle A(r)=A(r_{0})\exp{\left[-\int_{r_{0}}^{r}\frac{\kappa_{A}(r^{\prime})}{R_{0}}dr^{\prime}\right]}, (33)

where r0r_{0} is a reference value of rr. We choose an analytical form for κA\kappa_{A} as

κA(r)=κA(r0)exp[(rr0wAa)6].\displaystyle\kappa_{A}(r)=\kappa_{A}(r_{0})\exp{\left[-\left(\frac{r-r_{0}}{w_{A}a}\right)^{6}\right]}. (34)

Hence profiles are specified by the parameters r0r_{0}, R0R_{0}, aa, wAw_{A}, κA(r0)\kappa_{A}(r_{0}), and A(r0)A(r_{0}).

4.3 Shear Alfvén Wave

We simulate SAW propagation in a periodic cylindrical system with uniform temperature, density, and safety factor profiles. For simplicity, ions are modeled by a uniform background density n0n_{0} in addition to the ion polarization density in the gyrokinetic Poisson equation Eq. (2). Electrons are kinetic. Considering only the polarization response for the ions is sufficient for providing a minimal model supporting Alfvén wave propagation, and similar models have been considered in the past, for example in Refs. [65, 66, 67]. Physically, the ion polarization response provides a cross-field current to counter the parallel electron current in a bid to maintain quasi-neutrality. In this benchmark, deuterium ions are considered along with electrons at twice the realistic mass, giving a mass ratio of mime=1836\frac{m_{i}}{m_{e}}=1836. The relevant fixed parameters for this test case are given in Table 1.

R0R_{0} aa B0B_{0} q0q_{0} T(r0)T(r_{0})
50 m 0.5 m 0.228 T 2.0 2.0 keV
Table 1: Parameters for shear Alfvén wave test case

In Fig.(1), we scan over βe=μ0n0TeB2\beta_{e}=\frac{\mu_{0}n_{0}T_{e}}{B^{2}}, which we vary by choosing different values of density ranging from 6.25×1017m36.25\times 10^{17}\ \mathrm{m}^{-3} to 3.0×1019m33.0\times 10^{19}\ \mathrm{m}^{-3}. Simulations are initialized with a sinusoidal electron density perturbation of the form:

δne=A(r)ei(nφ+mθ)+c.c.,\displaystyle\delta n_{e}=A(r)e^{i(n\varphi+m\theta)}+\mathrm{c.c.}, (35)

where AA is a radial envelope function and c.c. denotes the complex conjugate. We note that βe\beta_{e} scans of SAW propagation have long been used to verify electromagnetic gyrokinetic PIC codes. For example, [68] and [10] consider similar tests in shearless slab geometry. To shed light on potential cancellation issues, we consider a long wavelength large β\beta regime. We initialize our simulations taking (n,m)=(2,2)(n,m)=(2,2). Furthermore, a Fourier filter is used to keep only the n=2n=2 mode. A timestep size of Δt=5.6×103R0/cs\Delta t=5.6\times 10^{-3}R_{0}/c_{s} is used, where csc_{s} is the sound speed given by cs=Te/mi=3.1×105m/sc_{s}=\sqrt{T_{e}/m_{i}}=3.1\times 10^{5}\ \mathrm{m/s}.

The mesh in XGC consists of a number of identical RZR-Z planes equally spaced in φ\varphi, where within each RZR-Z plane, an unstructured triangular mesh is employed. For this case, 9038 mesh vertices are used within each RZR-Z plane, corresponding to a spatial resolution of ΔRΔZ9.3×103m\Delta R\approx\Delta Z\approx 9.3\times 10^{-3}\ \mathrm{m}, which is roughly 0.30.3 times the size of the ion-sound gyroradius ρs=cs/Ωi=2.8×102m\rho_{s}=c_{s}/\Omega_{i}=2.8\times 10^{-2}\ \mathrm{m}. Between the RZR-Z planes, a resolution of R0Δφ19.5mR_{0}\Delta\varphi\approx 19.5\ \mathrm{m} is used. Finally, we take approximately 50 particles per mesh vertex. We compare the measured real frequencies from the simulations to a simple analytic dispersion relation. In B, we present a dispersion analysis for the SAW resulting in

ω=±csβek.\displaystyle\omega=\pm\frac{c_{s}}{\sqrt{\beta_{e}}}k_{\parallel}. (36)

We note that several simplifying assumptions have been made in the derivation of Eq.(36). For example, we have neglected kinetic effects, effects due to nonuniformities in 𝐁{\bf B}, and finite kk_{\perp} effects. For r/R01r/R_{0}\ll 1, with the magnetic field described in Eq.(29) and Eq.(30) and perturbations of the form Eq.(35), we have k(nm/q)/R0k_{\parallel}\approx(n-m/q)/R_{0}. Figure (1) shows excellent agreement when comparing simulation results to this approximate dispersion relation, demonstrating the absence of cancellation errors in our implementation.

Refer to caption
Figure 1: Real frequencies vs. βe\beta_{e} for the SAW test case. Results from simulations are compared to an analytical dispersion relation.

In addition to comparing real frequencies, we also show the time history of the real and imaginary parts of the (n,m)=(2,2)(n,m)=(2,2) mode amplitude of AA_{\parallel} and ϕ\phi for the simulation at βe=3.85%\beta_{e}=3.85\% in Fig.(2). A 9090^{\circ} phase shift is observed between Re(ϕ)\mathrm{Re}(\phi) and Im(A)\mathrm{Im}(A_{\parallel}) and between Im(ϕ)\mathrm{Im}(\phi) and Re(A)\mathrm{Re}(A_{\parallel}). This phase relation follows from Eq.(1), Eq.(2), and the electron continuity equation as is shown in A.

Refer to caption
Figure 2: Time histories of the real and imaginary parts of the complex amplitudes of AA_{\parallel} and ϕ\phi for the (n,m)=(2,2)(n,m)=(2,2) mode at βe=3.85%\beta_{e}=3.85\%.

4.4 ITG-KBM Transition Benchmark

Next, we consider a cyclone-like [69] test case in toroidal geometry similar to the case considered in [70], demonstrating the ITG-KBM transition. ITG represents ion temperature-gradient driven modes, and KBM represents kinetic ballooning modes. In this study, we compare results from our fully implicit electromagenetic algorithm in XGC to results obtained from GEM, GENE, and explicit XGC. The GEM and GENE codes are both global δf\delta f gyrokinetic codes using field-line-following coordinates. GEM is a particle-in-cell code, and GENE is Eulerian using a spectral method in the binormal direction. The electromagnetic capability in GEM is based on the pp_{\parallel}-formalism, and a modified mass matrix is used in Ampère’s law to mitigate accuracy problems at high β\beta from the cancellation problem [11]. In GENE, vv_{\parallel} is used as a coordinate for electromagnetic simulations, and a transformation of the perturbed distribution function involving AA_{\parallel} is performed to eliminate the time derivative in the inductive component of the electric field [22]. GEM and GENE both use forms of the polarization density in the gyrokinetic Poisson equation that are valid for arbitrary wavelengths. In this study, on the other hand, the implicit and explicit electromagnetic XGC have the short wavelength Padé correction term turned off, as shown in Eq.(2). This may cause some discrepancy for short wavelength modes, which has been shown, however, to be insignificant in the recent n=19n=19 benchmarking exercise between GENE and explicit electromagnetic XGC in the Görler benchmarking plasma [41]. Our benchmarking is at the toroidal mode number n=6n=6. More detailed descriptions of GEM and GENE can be found in the literature including Refs. [23, 22, 71, 21, 72] for GENE and Refs. [9, 10, 11] for GEM. For further comparison, we mention that XGC uses unstructured triangular meshes in cylindrical coordinates. In addition, the typical mode of operation in XGC is the total-ff method described in [6, 7]. However, in this paper XGC is run in standard δf\delta f mode for a proper comparison to GEM and GENE.

In this benchmark problem, we take a reduced magnetic field strength compared to the case considered in [70]. The motivation for this choice is to reduce the resolution requirements, allowing this benchmark problem to be run at a lower computational cost. In [70], the size parameter ρ\rho^{*}, defined as the ratio of ion gyroradius to the minor radius, was approximately 1/1801/180; here, ρ1/50\rho^{*}\approx 1/50. Since the perpendicular resolution requirement is set by the size of the ion gyroradius, the number of mesh nodes required for a well-resolved simulation of this benchmark problem is roughly a factor of (180/50)213.0(180/50)^{2}\approx 13.0 smaller than that required for the case considered in [70]. Both ions and electrons are treated kinetically. We take hydrogen ions and electrons with realistic mass, giving a mass ratio of mime=1836\frac{m_{i}}{m_{e}}=1836, and we take TTi=TeT\equiv T_{i}=T_{e}. Furthermore, only the n=6n=6 mode is kept in the simulations. We note that the ITG-KBM transition case in [70] used n=19n=19. Fixed parameters for defining the magnetic geometry and profiles are given in Table 2.

R0R_{0} aa r0r_{0} B0B_{0} q0q_{0} q1q_{1} q2q_{2} T(r0)T(r_{0}) κT(r0)\kappa_{T}(r_{0}) wT(r0)w_{T}(r_{0}) κn(r0)\kappa_{n}(r_{0}) wn(r0)w_{n}(r_{0})
2.8 m 1.0 m 0.5 m 0.236 T 0.86 -0.16 2.52 2.14 keV 6.92 0.25 2.22 0.25
Table 2: Parameters for ITG-KBM test case

In Fig.(3), we show the density and temperature profiles normalized by their values at r0r_{0} on the left and the normalized logarithmic gradients of density and temperature on the right. Compared to the profiles given in Figure 2 of [70], the values of κT\kappa_{T} and κn\kappa_{n} are nearly identical at r0r_{0}; however, the κ\kappa profiles considered here are much flatter in the center and nearly zero at the ends. This results in flat density and temperature profiles near the magnetic axis and the outer radial boundary. More localized gradients are used in this benchmark problem since this can help reduce effects at the boundaries of the simulation domain, which can be more significant in problems with larger ρ\rho^{*}.

Refer to caption
Refer to caption
Figure 3: Normalized density and temperature profiles (left) and normalized logarithmic gradients (right)

In Fig.(4), the safety factor and magnetic shear profiles are shown. Note that these are identical to the ones used in [70].

Refer to caption
Figure 4: Safety factor profile (red) and magnetic shear s^=rqq\hat{s}=\frac{rq^{\prime}}{q} profile (blue)

We again scan over βe\beta_{e} by varying the density value of density at r0r_{0} from 6.5×1016m36.5\times 10^{16}\ \mathrm{m}^{-3} to 1.625×1018m31.625\times 10^{18}\ \mathrm{m}^{-3}. In Fig.(5), we compare measured real frequencies and growth rates obtained from the implicit version of XGC to those obtained in GEM, GENE, and explicit XGC. The real frequencies are given in table 3 and the growth rates in table 4 for the cases that were simulated by all four codes/algorithms. Good agreement is observed with each code/algorithm finding the critical value of βe\beta_{e} to be between 0.65%0.65\% and 0.75%0.75\%. Given the significant differences that exist between the algorithms and formulations used, we consider the overall agreement in real frequencies and growth rates for this benchmark case to be very good. We note that all four comparisons found a collisionless trapped electron mode (CTEM) to be present at βe=0.65%\beta_{e}=0.65\%, characterized by a change in the direction of propagation from the ion diamagnetic direction to the electron diamagnetic direction. We note that there is some disagreement between GENE and the other codes in the real frequency for the CTEM case. A possible cause for this disagreement may be the form of the curvature drift that was used in GENE for this benchmarking exercise. GENE was run using the form of the curvature drift given in C, which is different than what was used in XGC and GEM. Both versions of XGC and GEM used the 𝐁{\bf B}^{*} term in Eq.(5), which implicitly contains the curvature drift. When a true MHD equilibrium is considered, both forms of the curvature drift should be equivalent to first order in the gyrokinetic ordering. However, MHD equilibrium is not ensured in our simplified analytic models of the magnetic field and profiles as in the Görler benchmarking case [70]. Since the physics of the CTEM mode depends strongly on the magnetic curvature, it may be more sensitive to differences in this term than the ITG or KBM modes.

Refer to caption
Refer to caption
Figure 5: Real frequencies (left) and growth rates (right) obtained from simulations across the various codes/implementations vs. βe\beta_{e}.
βe\beta_{e} in %     GEM     GENE XGC-Explicit XGC-Implicit
0.05 0.121 0.118 0.128 0.119
0.50 0.123 0.120 0.125 0.118
0.65 -0.053 -0.153 -0.051 -0.050
0.75 0.526 0.532 0.489 0.464
1.00 0.420 0.443 0.405 0.382
1.25 0.360 0.392 0.354 0.327
Table 3: Real frequencies in units of cs/LTic_{s}/L_{T_{i}} from the plot in Fig.(5)
βe\beta_{e} in %     GEM     GENE XGC-Explicit XGC-Implicit
0.05 0.052 0.057 0.049 0.047
0.50 0.027 0.024 0.029 0.031
0.65 0.027 0.030 0.032 0.028
0.75 0.055 0.042 0.047 0.042
1.00 0.137 0.144 0.125 0.138
1.25 0.194 0.201 0.173 0.192
Table 4: Growth rates in units of cs/LTic_{s}/L_{T_{i}} from the plot in Fig.(5)

Finally, we show the developed mode structures for ϕ\phi and AA_{\parallel} in Fig.(6) for the βe=0.05%\beta_{e}=0.05\% ITG case and in Fig.(7) for the βe=1.25%\beta_{e}=1.25\% KBM case. There is good overall qualitative agreement in the mode structures between the various codes/algorithms for both cases. We note that the electrostatic potentials produced from GEM and GENE feature fine-scale radial structures near rational surfaces. The two algorithms in XGC, on the other hand, produce smoother electrostatic potentials in the radial direction. This may be due to the long-wavelength form of the ion polarization density used in Eq.(2) for the XGC algorithms.

Refer to caption
Figure 6: Mode structures for βe=0.05%\beta_{e}=0.05\%
Refer to caption
Figure 7: Mode structures for βe=1.25%\beta_{e}=1.25\%

The numerical resolutions used by each code/algorithm to obtain these results are the following. For the fully implicit version of XGC, a timestep size of Δt=4.0×102LTics\Delta t=4.0\times 10^{-2}\ \frac{L_{T_{i}}}{c_{s}} was used, where LTi=R0κTi(r0)L_{T_{i}}=\frac{R_{0}}{\kappa_{T_{i}}(r_{0})}. Here, 28439 mesh vertices were used in the RZR\!\!-\!\!Z planes, corresponding to a spatial resolution of approximately 0.5ρi0.5\ \rho_{i}. Along the toroidal direction, 8 RZR\!\!-\!\!Z planes were taken within a wedge spanning 1/61/6 of a full torus, and 50 particles per mesh vertex are used. The explicit version of XGC used the same mesh and toroidal resolution as was used in the implicit version. The timestep size was Δt=1.6×103LTics\Delta t=1.6\times 10^{-3}\ \frac{L_{T_{i}}}{c_{s}} and 25 particles per mesh vertex were used. GEM used a radial resolution of 0.31ρi0.31\ \rho_{i} and resolution in the binormal direction of 0.29ρi0.29\ \rho_{i}. Along the field line, 64 grid points were used spanning π<θπ-\pi<\theta\leq\pi. The timestep size used in GEM was Δt=5.0×103LTics\Delta t=5.0\times 10^{-3}\ \frac{L_{T_{i}}}{c_{s}} and 16 marker ions and 32 marker electrons per cell were used. Finally, GENE used a radial resolution of 0.19ρi0.19\ \rho_{i} and used a single mode in the binormal direction corresponding to a toroidal mode number of n=6n=6. Along the field line, 32 grid points were used spanning π<θπ-\pi<\theta\leq\pi. In velocity space, 48 grid points were used for vv_{\parallel} and 24 grid points were used for μ\mu. The timestep size used in GENE was Δt=8.8×103LTics\Delta t=8.8\times 10^{-3}\ \frac{L_{T_{i}}}{c_{s}}.

Each code used homogeneous Dirichlet boundary conditions when solving for ϕ\phi and AA_{\parallel}. However, there was some variation in the locations of the radial boundaries. The implicit version of XGC took only an outer radial boundary at r/a=0.9r/a=0.9. Since XGC uses cylindrical coordinates, it is able to include the magnetic axis within the simulation domain. The explicit version of XGC took an outer radial boundary at r/a=0.98r/a=0.98 and an inner radial boundary at r/a=0.07r/a=0.07. In addition, filtering was applied after the field solves to set the potentials outside of the region 0.24r/a0.830.24\leq r/a\leq 0.83 to zero. GEM took an outer radial boundary at r/a=0.9r/a=0.9 and an inner radial boundary at r/a=0.1r/a=0.1. GENE took an outer radial boundary at r/a=0.975r/a=0.975 and an inner radial boundary at r/a=0.025r/a=0.025. Both versions of XGC and GEM set particle weights to zero when particles exited the simulation domain, and GENE used a buffer region outside the simulation domain with a Krook collision operator designed to damp the perturbed distribution function towards zero.

For completeness, we include here some timing data from each code/algorithm in simulating the βe=1.25%\beta_{e}=1.25\% case. Since the runs using the explicit version of XGC in the benchmarking study were carried out without consideration of performance, we use a more practical timestep size here for a fairer comparison. All simulations were run using 16 Haswell processor nodes on the Cori supercomputer at NESRC. Both versions of XGC and GENE ran using MPI parallelization with a total of 1024 MPI processes. GEM ran using a hybrid MPI/OpenMP parallelization with 128 MPI processes and 8 OpenMP threads per MPI process. We choose our metric to be the wallclock time required to simulate a physical time unit of LTi/csL_{T_{i}}/c_{s}. GENE required a wallclock time of 2.4 seconds, GEM required 75.0 seconds, the implicit version of XGC required 110.3 seconds, and the explicit version of XGC required 176.6 seconds. We note that in collecting this specific timing data, the explicit version of XGC used a more practical timestep size of 1.0×102LTics1.0\times 10^{-2}\frac{L_{T_{i}}}{c_{s}} and 50 particles per mesh vertex, in an attempt to improve the unnecessarily small timestep size used in the physics benchmarking exercise in the explicit version of XGC and to make the particle number the same. All other parameters are the same as those used in the benchmarking exercise presented above in this section. We caution here that the timing data comes from a single common simulation scenario in which parameters were chosen to provide a low cost physics benchmarking problem without considering computational algorithm optimization, number of subcycling steps (4 steps used in this study is at the low end), optimal hardware conditions, programing language, and practical problem size per each code. For example, optimization of the production XGC has been known to scale well for big physics problems on extreme scale computers that are equipped with high-performance accelerators. In addition, optimal choices of parameters for balancing both accuacy and efficiency are yet to be determined for the different versions of XGC. Hence, the performance numbers provided here should not be used to draw any conclusions about the efficiency of the codes/algorithms in simulating different realistic problems targeted by each codes.

5 Performance Discussion for the Fully Implicit Scheme

While a detailed study of performance is beyond the scope of this paper, for completeness, we include some general discussion and outlook regarding the performance of the fully implicit scheme in addition to some initial convergence results for the SAW benchmark problem. As is the case for any simulation method, assessing performance is a complex issue requiring systematic studies over broad ranges of both physical and numerical parameters on multiple leadership class computer architectures. Here, we highlight some of the key factors contributing to the performance of the fully implicit method. We provide a rough estimate for the cost of the implicit scheme in simulating a reference unit of time Δtref\Delta t_{\mathrm{ref}} as follows. We assume that the process cost CPpush\mathrm{CP}_{\mathrm{push}} of pushing particles over one timestep interval Δt\Delta t is proportional to the total number of marker particles, i.e.

CPpush=CpNppmNm,\displaystyle\mathrm{CP}_{\mathrm{push}}=C_{p}N_{ppm}N_{m}, (37)

where NmN_{m} represents the mesh degrees of freedom, NppmN_{ppm} is the number of particles per mesh degree of freedom, and CpC_{p} is the cost to push a single particle a time unit of Δt\Delta t. Note that we assume subcycling is accounted for in the factor CpC_{p}, i.e., for a fixed sub-timestep size δt\delta t, CpC_{p} will be proportional to the ratio of Δt/δt\Delta t/\delta t. Next, we denote the cost for applying the preconditioner, i.e. solving the linear system of equations defined by the matrix in Eq.(26), by CPpc\mathrm{CP}_{\mathrm{pc}} and the number of residual evaluations required per timestep for convergence of the iterative scheme by NREN_{\mathrm{RE}}. For the Picard scheme with Anderson mixing considered in this paper, only one residual evaluation is required per iteration, meaning NREN_{\mathrm{RE}} corresponds to the number of iterations required for convergence. Our estimate for the total cost of simulating a reference unit of time is

CPtotal=ΔtrefΔtNRE(CPpush+CPpc).\displaystyle\mathrm{CP}_{\mathrm{total}}=\frac{\Delta t_{\mathrm{ref}}}{\Delta t}N_{\mathrm{RE}}\left(\mathrm{CP}_{\mathrm{push}}+\mathrm{CP}_{\mathrm{pc}}\right). (38)

Since the cost is proportional to the number of residual evaluations NREN_{\mathrm{RE}}, an efficient iterative method is essential to the overall performance of the scheme. The efficiency of the iterative method relies heavily on the quality of the preconditioner. We desire a preconditioner that can effectively remove stiffness in the problem for the variety of simulation scenarios that may be encountered in practice. In addition, the cost of applying the preconditioner should not be the dominant factor in the overall cost, i.e. CPpc<CPpush\mathrm{CP}_{\mathrm{pc}}<\mathrm{CP}_{\mathrm{push}}. For simulations requiring a large number of mesh degrees of freedom, a scalable 𝒪(Nm){\cal O}(N_{m}) solver for handling the linear system defined by the matrix Eq.(26) becomes essential. Designing scalable solvers in this context is more challenging than for typical gyrokinetic field solves, which are decoupled in the parallel direction. The system defining the fluid-based preconditioner, on the other hand, is fully three dimensional. To address this issue, we are currently exploring the use of a Schur complement formulation of the block matrix system of Eq.(26) which is amenable to multigrid methods. This work will be reported in a future detailed performance assessment.

As an initial test of the performance of the iterative method, we show in Table 5 the number of iterations (residual evaluations) required for the SAW test case to achieve a relative tolerance of 10510^{-5} in the preconditioned residual norm over a range of the physical parameter βe\beta_{e} and the numerical parameter vteΔtR0Δφ\frac{v_{te}\Delta t}{R_{0}\Delta\varphi}, where vte=Te/mev_{te}=\sqrt{T_{e}/m_{e}} is the electron thermal velocity. This numerical parameter is related to the number of cell crossings for thermal electrons near the magnetic axis within the timestep size Δt\Delta t. Our experience suggests that this is the most relevant parameter in determining the convergence rate of our iterative method. Our convergence criteria is

Δϕ2+vAΔA2<ϵrel(ϕ2+vAA2),\displaystyle\left\lVert\Delta\phi\right\rVert_{2}+v_{A}\left\lVert\Delta A_{\parallel}\right\rVert_{2}<\epsilon_{\mathrm{rel}}\left(\left\lVert\phi\right\rVert_{2}+v_{A}\left\lVert A_{\parallel}\right\rVert_{2}\right), (39)

where Δϕ\Delta\phi and ΔA\Delta A_{\parallel} are the corrections to the potentials coming from the solution of the linear system defined by Eq.(26), ϕ\phi and AA_{\parallel} are the values of the potentials at the current iteration, and ϵrel\epsilon_{\mathrm{rel}} is the relative tolerance. The reported iteration counts represent averages over 15 timesteps. We consider three values of βe\beta_{e} for the SAW test and vary the timestep size to set the parameter vteΔtR0Δφ\frac{v_{te}\Delta t}{R_{0}\Delta\varphi}. Subcycling is adjusted in each case to keep vteδtR0Δφ\frac{v_{te}\delta t}{R_{0}\Delta\varphi} at the fixed value of 0.1250.125. We mention that a small subcycling timestep compared to the cell crossing time seems to be a necessary requirement for convergence. In the recent explicit XGC code, we subcycle the electrons 12 timesteps per ion timestep. This should be considered anyway in order for marker electrons to accurately sample the perturbed fields along their trajectories. The results from Table 5 show that convergence can be achieved within a reasonable number (<10<10) of residual evaluations for a wide range of βe\beta_{e} and with the number of thermal electron cell crossings of order unity. When the number of electron cell crossings is pushed beyond order unity, the iterative method generally fails to converge unless further corrections are included in the preconditioner. These will be reported in future studies.

# iterations
βe\beta_{e} in % vteΔtR0Δφ\frac{v_{te}\Delta t}{R_{0}\Delta\varphi} = 0.25 vteΔtR0Δφ\frac{v_{te}\Delta t}{R_{0}\Delta\varphi} = 1.00 vteΔtR0Δφ\frac{v_{te}\Delta t}{R_{0}\Delta\varphi} = 4.00
0.5 3.9 5.0 5.5
7.7 5.0 6.2 7.1
23.1 5.9 7.0 8.3
Table 5: Iterations required to achieve a relative tolerance of 10510^{-5} in preconditioned residual norm vs βe\beta_{e} and thermal electron cell crossings per timestep.

In general, the cost per timestep with an implicit method will be greater than with an explicit method. The typical motivation for using an implicit method is to relax constraints on the timestep size required for numerical stability when explicit methods are applied. In our case, however, an implicit method allows us to use a form of the electromagnetic gyrokinetic equations that avoids the Ampère cancellation problem. For problems with βe>me/mi\beta_{e}>m_{e}/m_{i}, we have vte>vAv_{te}>v_{A}, where vA=B/μ0min0iv_{A}=B/\sqrt{\mu_{0}m_{i}n_{0i}} is the Alfvén velocity. Since the convergence of our current implementation seems to require vteΔtR0Δφ\frac{v_{te}\Delta t}{R_{0}\Delta\varphi} to be order unity, a large increase in the timestep size compared to an explicit method might not be feasible in this regime. The advantage instead resides in a greater robustness to the cancellation problem, and in possible efficiency gains from relaxed perpendicular spatial resolution requirements (i.e., resolving the electron skin depth is no longer needed for numerical reasons). In low density/low-β\beta regimes, however, there may be a significant timestep size advantage when using an implicit scheme. In this case, it is the timescale of the SAW which sets the constraint on the timestep size for explicit methods. We explore this regime in Table 6, where we fix vteΔtR0Δφ=0.25\frac{v_{te}\Delta t}{R_{0}\Delta\varphi}=0.25 and lower the density to increase the Alfvén velocity (decrease βe\beta_{e}). Again, the iteration counts represent averages over 15 timesteps. It is demonstrated that the number of iterations required to achieve a relative tolerance of 10510^{-5} does not increase significantly with vAv_{A}, even for large values of the Courant number due to the Alfvèn wave: vAΔtR0Δφ\frac{v_{A}\Delta t}{R_{0}\Delta\varphi}.

βe\beta_{e} in % vAΔtR0Δφ\frac{v_{A}\Delta t}{R_{0}\Delta\varphi} # iterations
2.8×1022.8\times 10^{-2} 0.1 3.3
2.8×1042.8\times 10^{-4} 1.0 4.0
2.8×1062.8\times 10^{-6} 10.0 4.0
Table 6: Iterations required to achieve a relative tolerance of 10510^{-5} in preconditioned residual norm for fixed vteΔtR0Δφ\frac{v_{te}\Delta t}{R_{0}\Delta\varphi} and increasing Alfvén velocity (decreasing βe\beta_{e}).

Although we have considered only the SAW benchmark problem for the convergence results in this section, in practice we have found the convergence behavior to be qualitatively similar for the ITG-KBM benchmarking problem. The inclusion of kinetic ions and background gradients in the system seems to have little effect on convergence, which is mainly determined by the fastest timescales in the problem.

6 Conclusions

In this paper, we have verified a fully implicit δf\delta f implementation of the vv_{\parallel}-formalism of electromagnetic gyrokinetics in the XGC code with two test cases - shear Alfvén wave (SAW) propagation in cylindrical geometry and the ITG-KBM transition in toroidal geometry. The vv_{\parallel}-formalism uses the original form of Ampère’s law and is therefore free from the “cancellation problem” that plagues codes based on the pp_{\parallel}-formalism. In addition, a fully implicit time discretization scheme can provide stable simulations for long wavelengths and high β\beta using the vv_{\parallel}-formalism despite the appearance of a time derivative in the definition of the electric field. For the SAW propagation test, the real frequencies measured from XGC simulations using the implicit electromagnetic implementation match well with an analytical dispersion relation, demonstrating the lack of a cancellation problem. The elimination of the cancellation problem with this method can be a significant advantage for long wavelength, high-β\beta regimes. We note that not all formulations avoiding the pp_{\parallel}-formalism eliminate the cancellation issue. For instance, the mixed variables/pullback transformation scheme [39, 40, 17] does not use the pp_{\parallel}-formalism, but it still has a cancellation problem. For the ITG-KBM case, we have compared the real frequencies and growth rates obtained using the implicit electromagnetic implementation against the GEM and GENE codes in addition to an explicit electromagnetic implementation in XGC based on the mixed variables/pullback transformation scheme [41]. All codes give good agreement for this case and predict the transition to occur for βe\beta_{e} between 0.65%0.65\% and 0.75%0.75\%. In addition, all codes predict a collisionless trapped electron mode to be dominant for βe=0.65%\beta_{e}=0.65\%. These results strengthen our confidence in the ability of the implicit scheme to accurately solve the electromagnetic gyrokinetic equations.

Although this paper has focused on linear verification studies for the implicit electromagnetic algorithm, verification of the fully implicit scheme for nonlinear electromagnetic plasma dynamics is an important future objective. For conventional δf\delta f turbulence simulations in the core, where perturbation sizes are small compared to the background, electron physics parallel to the background magnetic field accounts for the main source of stiffness in the implicit equations. Since this difficulty is already present in the linear regime, we do not anticipate additional difficulties arising in the nonlinear regime. For simulations in which large perturbation sizes may develop, however, certain modifications might be necessary in the preconditioner to avoid degradation of convergence rates. For example, the background density and temperature may need to be updated periodically in Eq.(26) so that the preconditioner is guaranteed to be linearized against the previously updated values, or perhaps smaller timestep sizes will be required to improve the conditioning of the implicit equations. We emphasize, however, that poor approximations to the plasma dynamics in the preconditioner will only affect the performance of the scheme by yielding slower convergence rates. This will not affect the accuracy of the discrete formulation. Future work will report on further details regarding the optimization of the implicit algorithm and its performance in both linear and nonlinear regimes.

Appendix A Theoretical relationship between AA_{\parallel} and ϕ\phi

Here we derive a relationship between the complex mode amplitudes of AA_{\parallel} and ϕ\phi for the shear Alfvén wave model considered in Sec. 4. We begin by taking the partial derivative with respect to time of Eq.(2), assuming spatially uniform background quantities and no time dependence in the ion gyrocenter density. We have

min0iB22ϕteδnet\displaystyle-\frac{m_{i}n_{0i}}{B^{2}}\nabla_{\perp}^{2}\frac{\partial\phi}{\partial t}\approx-e\frac{\partial\delta n_{e}}{\partial t} (40)

Next, we take the gradient in the parallel direction of Eq.(1). Assuming parallel and perpendicular derivative operators commute and no contribution to the current from ion gyrocenters, we have

1μ02Aδje.\displaystyle-\frac{1}{\mu_{0}}\nabla_{\perp}^{2}\nabla_{\parallel}A_{\parallel}\approx\nabla_{\parallel}\delta j_{\parallel e}. (41)

If we assume the main contribution in the electron continuity equation to be from the parallel electron current, we have

tδne1eδje0.\displaystyle\frac{\partial}{\partial t}\delta n_{e}-\frac{1}{e}\nabla_{\parallel}\delta j_{\parallel e}\approx 0. (42)

Adding together Eq.(40) and Eq.(41), it follows from Eq.(42) that

ϕt+B02μ0min0iA0.\displaystyle\frac{\partial\phi}{\partial t}+\frac{B_{0}^{2}}{\mu_{0}m_{i}n_{0i}}\nabla_{\parallel}A_{\parallel}\approx 0. (43)

Finally, we take a Fourier mode ansatz along the parallel direction with wave number kk_{\parallel}. The complex mode amplitudes then follow

ϕ~t+ikvA2A~0,\displaystyle\frac{\partial\tilde{\phi}}{\partial t}+ik_{\parallel}v_{A}^{2}\tilde{A_{\parallel}}\approx 0, (44)

which can be written as

tRe(ϕ~)kvA2Im(A~)\displaystyle\frac{\partial}{\partial t}\mathrm{Re}(\tilde{\phi})-k_{\parallel}v_{A}^{2}\mathrm{Im}(\tilde{A_{\parallel}}) 0\displaystyle\approx 0 (45)
tIm(ϕ~)+kvA2Re(A~)\displaystyle\frac{\partial}{\partial t}\mathrm{Im}(\tilde{\phi})+k_{\parallel}v_{A}^{2}\mathrm{Re}(\tilde{A_{\parallel}}) 0\displaystyle\approx 0 (46)

by separating real and imaginary parts.

Appendix B Simple dispersion relation for the shear Alfvén wave

A simple dispersion relation for the shear Alfvén wave can be derived using Eq.(43), Ampère’s law Eq.(1), and the electron momentum equation Eq.(22) in the limit Te0T_{e}\rightarrow 0. The electron momentum equation and Ampère’s law can be combined to give

(1mee2n0μ02)At+ϕ0.\displaystyle\left(1-\frac{m_{e}}{e^{2}n_{0}\mu_{0}}\nabla_{\perp}^{2}\right)\frac{\partial A_{\parallel}}{\partial t}+\nabla_{\parallel}\phi\approx 0. (47)

For long perpendicular wavelengths compared to the electron skin depth de=me/e2n0μ0d_{e}=\sqrt{m_{e}/e^{2}n_{0}\mu_{0}}, the perpendicular Laplacian term can be ignored giving

Atϕ.\displaystyle\frac{\partial A_{\parallel}}{\partial t}\approx-\nabla_{\parallel}\phi. (48)

Taking a partial derivative with respect to time of Eq.(43) and using Eq.(48) gives

2ϕt2B02μ0min0i2ϕ0.\displaystyle\frac{\partial^{2}\phi}{\partial t^{2}}-\frac{B_{0}^{2}}{\mu_{0}m_{i}n_{0i}}\nabla_{\parallel}^{2}\phi\approx 0. (49)

Finally, taking a plane wave ansatz gives

ω2=B02μ0min0ik2=cs2k2βe.\displaystyle\omega^{2}=\frac{B_{0}^{2}}{\mu_{0}m_{i}n_{0i}}k_{\parallel}^{2}=\frac{c_{s}^{2}k_{\parallel}^{2}}{\beta_{e}}. (50)

Appendix C Form of curvature drift used in GENE

Here, we derive the form of the curvature drift that was used in GENE for the ITG-KBM benchmarking problem. We begin with a standard form of the curvature drift given by

𝐯c=msqsBv2𝐛^×(𝐛^)𝐛^.\displaystyle{\bf v}_{c}=\frac{m_{s}}{q_{s}B}v_{\parallel}^{2}\hat{{\bf b}}\times\left(\hat{{\bf b}}\cdot\nabla\right)\hat{{\bf b}}. (51)

Using standard vector calculus identities, we can write the second factor in the cross product as

(𝐛^)𝐛^=×𝐁B×𝐛^+BB.\displaystyle\left(\hat{{\bf b}}\cdot\nabla\right)\hat{{\bf b}}=\frac{\nabla\times{\bf B}}{B}\times\hat{{\bf b}}+\frac{\nabla_{\perp}B}{B}. (52)

From Ampère’s law, it follows that

𝐯c=msqsBv2𝐛^×(μ0𝐣×𝐁B2+BB).\displaystyle{\bf v}_{c}=\frac{m_{s}}{q_{s}B}v_{\parallel}^{2}\hat{{\bf b}}\times\left(\mu_{0}\frac{{\bf j}\times{\bf B}}{B^{2}}+\frac{\nabla B}{B}\right). (53)

Finally, assuming an MHD equilibrium allows us to express the curvature drift in terms of the pressure gradient as

𝐯c=msqsBv2𝐛^×(μ0PB2+BB),\displaystyle{\bf v}_{c}=\frac{m_{s}}{q_{s}B}v_{\parallel}^{2}\hat{{\bf b}}\times\left(\mu_{0}\frac{\nabla P}{B^{2}}+\frac{\nabla B}{B}\right), (54)

which is the form used in GENE for the ITG-KBM benchmarking problem considered in this paper. We note that the full form of Eq.(54) is used in this study. In particular, we do not set P\nabla P to zero as is commonly done under the assumption of a low-pressure plasma [10, 72].

Acknowledgments

This research was supported jointly by the SciDAC-4 project High-fidelity Boundary Plasma Simulation (HBPS) funded by U.S. DOE Office of Science, Office of Advanced Scientific Computing Research and Office of Fusion Energy Sciences, under Contract No. DE-AC02- 09CH11466; and the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-05CH11231.

Data Availability

Data included in the tables and figures of this paper will be made openly available in the PPPL Theory Department ARK, Ref. [73]. Additional data supporting the findings of this study are available from the corresponding author upon reasonable request.

References

  • [1] S. E. Parker, W. W. Lee, R. Santoro, Phys. Rev. Lett. 71 (1993) 2042. doi:10.1103/PhysRevLett.71.2042.
  • [2] A. Dimits, T. Williams, J. Byers, B. Cohen, Phys. Rev. Lett. 77 (1996) 71. doi:10.1103/PhysRevLett.77.71.
  • [3] R. Sydora, V. Decyk, J. Dawson, Plasma Phys. Controlled Fusion 38 (1996) A281. doi:10.1088/0741-3335/38/12A/021.
  • [4] Z. Lin, T. S. Hahm, W. W. Lee, W. M. Tang, R. B. White, Science 281 (1998) 1835. doi:10.1126/science.281.5384.1835.
  • [5] W. W. Lee, J. Comput. Phys. 72 (1987) 243–269. doi:10.1016/0021-9991(87)90080-5.
  • [6] S. Ku, R. Hager, C. S. Chang, J. M. Kwon, S. E. Parker, J. Comput. Phys. 315 (2016) 467–475. doi:10.1016/j.jcp.2016.03.062.
  • [7] S. Ku, C. S. Chang, R. Hager, R. M. Churchill, G. R. Tynan, I. Cziegler, M. Greenwald, J. W. Hughes, S. E. Parker, M. F. Adams, E. D’Azevedo, P. Worley, Phys. Plasmas 25 (2018) 056107. doi:10.1063/1.5020792.
  • [8] I. Manuilskiy, W. W. Lee, Phys. Plasmas 7 (5) (2000) 1381. doi:10.1063/1.873955.
  • [9] Y. Chen, S. E. Parker, Phys. Plasmas 8 (2001) 2095. doi:10.1063/1.1351828.
  • [10] Y. Chen, S. E. Parker, J. Comput. Phys. 189 (2003) 463–475. doi:10.1016/S0021-9991(03)00228-6.
  • [11] Y. Chen, S. E. Parker, J. Comput. Phys. 220 (2007) 839–855. doi:10.1016/j.jcp.2006.05.028.
  • [12] J. C. Adams, A. G. Serveniere, A. B. Langdon, J. Comput. Phys. 47 (1982) 229–244. doi:10.1016/0021-9991(82)90076-6.
  • [13] J. Bao, D. Liu, Z. Lin, Phys. Plasmas 24 (2017) 102516. doi:10.1063/1.4995455.
  • [14] J. Bao, Z. Lin, Z. X. Lu, Phys. Plasmas 25 (2018) 022515. doi:10.1063/1.5016432.
  • [15] G. Dong, J. J. Bao, A. Bhattacharjee, Z. Lin, Phys. Plasmas 26 (2019) 010701. doi:10.1063/1.5066583.
  • [16] V. Kornilov, R. Kleiber, R. Hatzky, L. Villard, G. Jost, Phys. Plasmas 11 (2004) 3196. doi:10.1063/1.1737393.
  • [17] R. Kleiber, R. Hatzky, A. Konies, A. Mishchenko, E. Sonnendrucker, Phys. Plasmas 23 (2016) 032501. doi:10.1063/1.4942788.
  • [18] S. Jolliet, A. Bottino, P. Angelino, R. Hatzky, T. Tran, B. McMillan, O. Sauter, K. Appert, Y. Idomura, L. Villard, Comput. Phys. Comm. 177 (2007) 409–425. doi:10.1016/j.cpc.2007.04.006.
  • [19] A. Bottino, T. Vernay, B. D. Scott, S. Brunner, R. Hatzky, S. Jolliet, B. F. McMillan, T. M. Tran, L. Villard, Plasma Phys. Controlled Fusion 53 (2011) 124027. doi:10.1088/0741-3335/53/12/124027.
  • [20] E. Lanti, N. Ohana, N. Tronko, T. Hayward-Schneider, A. Bottino, B. F. McMillan, A. Mishchenko, A. Scheinberg, A. Biancalani, P. Angelino, S. Brunner, J. Dominski, P. Donnel, C. Gheller, R. Hatzky, A. Jocksch, S. Jolliet, Z. X. Lu, J. P. Martin Collar, I. Novikau, E. Sonnendrücker, T. Vernay, L. Villard, Comput. Phys. Comm. 251 (2020) 107072. doi:10.1016/j.cpc.2019.107072.
  • [21] F. Jenko, W. Dorland, M. Kotschenreuther, B. N. Rogers, Phys. Plasmas 7 (2000) 1904–1910. doi:10.1063/1.874014.
  • [22] T. Gorler, X. Lapillonne, S. Brunner, T. Dannert, F. Jenko, F. Merz, D. Told, J. Comput. Phys. 230 (2011) 7053. doi:10.1016/j.jcp.2011.05.034.
  • [23] T. Gorler, X. Lapillonne, S. Brunner, T. Dannert, F. Jenko, S. K. Aghdam, P. Marcus, B. F. McMillan, F. Merz, O. Sauter, D. Told, L. Villard, Phys. Plasmas 18 (2011) 056103. doi:10.1063/1.3567484.
  • [24] N. R. Mandell, A. Hakim, G. W. Hammett, M. Francisquez, J. Plasma Phys. 86 (2020) 905860109. doi:10.1017/S0022377820000070.
  • [25] A. Hakim, N. R. Mandell, T. N. Bernard, M. Francisquez, G. W. Hammett, E. L. Shi, Phys. Plasmas 27 (2020) 042304. doi:10.1063/1.5141157.
  • [26] J. Candy, R. E. Waltz, J. Comput. Phys. 186 (2003) 545. doi:10.1016/S0021-9991(03)00079-2.
  • [27] J. Candy, Phys. Plasmas 12 (2005) 072307. doi:10.1063/1.1954123.
  • [28] T.-H. Watanabe, H. Sugama, Nucl. Fusion 46 (2006) 24. doi:10.1088/0029-5515/46/1/003.
  • [29] S. Maeyama, A. Ishizawa, T.-H. Watanabe, M. Nakata, N. Miyato, M. Yagi, Y. Idomura, Phys. Plasmas 21 (2014) 052301. doi:10.1063/1.4873379.
  • [30] K. Obrejan, K. Imadera, J. Li, Y. Kishimoto, Comput. Phys. Comm. 216 (2017) 8–17. doi:10.1016/j.cpc.2017.02.010.
  • [31] A. Ishizawa, K. Imadera, Y. Nakamura, Y. Kishimoto, Phys. Plasmas 26 (2019) 082301. doi:10.1063/1.5100308.
  • [32] J. C. Cummings, Gyrokinetic simulation of finite-beta and self-sheared-flow effects on pressure-gradient instabilities, Ph.D. thesis, Princeton University, 1994.
  • [33] J. V. W. Reynders, Gyrokinetic simulation of finite-beta plasmas on parallel architectures, Ph.D. thesis, Princeton University, 1993.
  • [34] C. S. Chang, S. Ku, P. H. Diamond, Z. Lin, S. E. Parker, T. S. Hahm, N. Samatova, Phys. Plasmas 16 (2009) 056108. doi:10.1063/1.3099329.
  • [35] C. S. Chang, S. Ku, A. Loarte, V. Parail, F. Kochl, M. Romanelli, R. Maingi, J. W. Ahn, T. Gray, J. Hughes, B. LaBombard, T. Leonard, M. Makowski, J. Terry, Nucl. Fusion 57 (2017) 116023. doi:10.1088/1741-4326/aa7efb.
  • [36] C. S. Chang, S. Ku, G. R. Tynan, R. Hager, R. M. Churchill, I. Cziegler, M. Greenwald, A. E. Hubbard, J. W. Hughes, Phys. Rev. Lett. 118 (2017) 175001. doi:10.1103/PhysRevLett.118.175001.
  • [37] R. Hager, J. Dominski, C. S. Chang, Cross-verification of neoclassical transport solutions from XGCa against NEO, Phys. Plasmas 26 (104502). doi:10.1063/1.5121308.
  • [38] C. S. Chang, S. Ku, R. Hager, R. M. Churchill, J. Hughes, F. Köchl, A. Loarte, V. Parail, R. Pitts, Constructing a new predictive scaling formula for iter’s divertor heat-load width informed by a simulation-anchored machine learning, Phys. Plasmas 28 (022501). doi:10.1063/5.0027637.
  • [39] A. Mishchenko, M. D. J. Cole, R. Kleiber, A. Konies, Phys. Plasmas 21 (2014) 052113. doi:10.1063/1.4880560.
  • [40] A. Mishchenko, A. Konies, R. Kleiber, M. D. J. Cole, Phys. Plasmas 21 (2014) 092110. doi:10.1063/1.4895501.
  • [41] M. D. J. Cole, A. Mishchenko, A. Bottino, C. S. Chang, Phys. Plasmas 28 (2021) 034501. doi:10.1063/5.0030937.
  • [42] R. Hager, J. Lang, C. S. Chang, S. Ku, Y. Chen, S. E. Parker, M. F. Adams, Phys. Plasmas 24 (2017) 054508. doi:10.1063/1.4983320.
  • [43] A. Biancalani, A. Bottino, A. Di Siena, Ö. Gürcan, T. Hayward-Schneider, F. Jenko, P. Lauber, A. Mishchenko, P. Morel, I. Novikau, F. Vannini, L. Villard, A. Zocco, Plasma Phys. Controlled FusionIn press. doi:https://doi.org/10.1088/1361-6587/abf256.
  • [44] G. Chen, L. Chacón, D. C. Barnes, J. Comput. Phys. 230 (2011) 7018–7036. doi:10.1016/j.jcp.2011.05.031.
  • [45] L. Chacón, G. Chen, D. C. Barnes, J. Comput. Phys. 233 (2013) 1–9. doi:10.1016/j.jcp.2012.07.042.
  • [46] G. Chen, L. Chacón, Comput. Phys. Comm. 185 (2014) 2391–2402. doi:10.1016/j.cpc.2014.05.010.
  • [47] G. Chen, L. Chacón, C. A. Leibs, D. A. Knoll, W. Taitano, J. Comput. Phys. 258 (2014) 555–567. doi:10.1016/j.jcp.2013.10.052.
  • [48] G. Chen, L. Chacón, Comput. Phys. Comm. 197 (2015) 73–87. doi:10.1016/j.cpc.2015.08.008.
  • [49] Z. X. Lu, G. Meng, M. Hoelzl, P. Lauber, J. Comput. Phys. (2021) 110384 In press. doi:10.1016/j.jcp.2021.110384.
  • [50] D. G. Anderson, J. Assoc. Comput. Mach. 12 (1965) 547–560.
  • [51] C. Ma, W. Wang, E. Startsev, S. Ethier, Bull. Am. Phys. Soc. 60, , Abstract NP11.00056. [link].
    URL http://meetings.aps.org/link/BAPS.2018.DPP.NP11.56
  • [52] W. X. Wang, G. Rewoldt, W. M. Tang, F. L. Hinton, J. Manickam, L. E. Zakharov, R. B. White, S. Kaye, Phys. Plasmas 13 (2006) 082501. doi:10.1063/1.2244532.
  • [53] W. X. Wang, P. H. Diamond, T. S. Hahm, S. Ethier, G. Rewoldt, W. M. Tang, Phys. Plasmas 17 (2010) 072511. doi:10.1063/1.3459096.
  • [54] S. E. Parker, W. W. Lee, Phys. Fluids B 5 (1993) 77. doi:10.1063/1.860870.
  • [55] A. M. Dimits, W. W. Lee, J. Comput. Phys. 107 (1993) 309. doi:10.1006/jcph.1993.1146.
  • [56] G. Hu, J. A. Krommes, Phys. Plasmas 1 (1994) 863. doi:10.1063/1.870745.
  • [57] A. Y. Aydemir, Phys. Plasmas 1 (1994) 822. doi:10.1063/1.870740.
  • [58] R. Hatzky, R. Kleiber, A. Könies, A. Mishchenko, M. Borchardt, A. Bottino, E. Sonnendrücker, J. Plasma Phys. 85 (2019) 905850112. doi:10.1017/S0022377819000096.
  • [59] B. I. Cohen, Orbit averaging and subcycling in particle simulation of plasmas 311–333, in Multiple Time Scales, edited by J. U. Brackbill and B. I. Cohen ( Academic Press, 1985).
  • [60] B. Sturdevant, S. E. Parker, Y. Chen, B. B. Hause, J. Comput. Phys. 316 (2016) 519–533. doi:10.1016/j.jcp.2016.04.036.
  • [61] S. Balay, S. Abhyankar, M. F. Adams, J. Brown, P. Brune, K. Buschelman, L. Dalcin, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. Knepley, L. C. McInnes, K. Rupp, B. F. Smith, S. Zampini, H. Zhang, H. Zhang, PETSc users manual, Tech. Rep. ANL-95/11 - Revision 3.7, Argonne National Laboratory (2016).
  • [62] S. Balay, W. D. Gropp, L. C. McInnes, B. F. Smith, Efficient management of parallelism in object oriented numerical software libraries, in: E. Arge, A. M. Bruaset, H. P. Langtangen (Eds.), Modern Software Tools in Scientific Computing, Birkhäuser Press, 1997, pp. 163–202.
  • [63] X. S. Li, J. W. Demmel, SuperLU DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems, ACM Trans. Mathematical Software 29 (2) (2003) 110–140.
  • [64] X. Lapillonne, S. Brunner, T. Dannert, S. Jolliet, A. Marinoni, L. Villard, T. Gorler, F. Jenko, F. Merz, Phys. Plasmas 16 (2009) 032308. doi:10.1063/1.3096710.
  • [65] C. K. Goertz, R. W. Boswell, J. Geophys. Res. 84 (1979) 7239. doi:10.1029/JA084iA12p07239.
  • [66] C. H. Hui, C. E. Seyler, J. Geophys. Res. 97 (1992) 3953. doi:10.1029/91JA03101.
  • [67] K. Nishioka, T. H. Watanabe, M. Shinya, J. Comput. Phys. 432 (2021) 110167. doi:10.1016/j.jcp.2021.110167.
  • [68] R. Hatzky, A. Könies, A. Mishchenko, J. Comput. Phys. 251 (2007) 568–590. doi:10.1016/j.jcp.2006.12.019.
  • [69] A. M. Dimits, G. Bateman, M. A. Beer, B. I. Cohen, W. Dorland, G. W. Hammett, C. Kim, J. E. Kinsey, M. Kotschenreuther, A. H. Kritz, L. L. Lao, J. Mandrekas, W. M. Nevins, S. E. Parker, A. J. Redd, D. E. Shumaker, R. Sydora, J. Weiland, Phys. Plasmas 7 (2000) 969. doi:10.1063/1.873896.
  • [70] T. Gorler, N. Tronko, W. A. Hornsby, A. Bottino, R. Kleiber, C. Norscini, V. Grandgirard, F. Jenko, E. Sonnendrucker, Phys. Plasmas 23 (2016) 072503. doi:10.1063/1.4954915.
  • [71] T. Görler, Multiscale effects in plasma microturbulence, Ph.D. thesis, Universität Ulm, 2009.
  • [72] X. Lapillonne, Microturbulence in electron internal transport barriers in the TCV tokamak and global effects, Ph.D. thesis, École Polytechnique Fédérale de Lausanne, 2010.
  • [73] B. Sturdevant, S. Ku, L. Chacón, Y. Chen, D. Hatch, M. D. J. Cole, A. Y. Sharma, M. F. Adams, C. S. Chang, S. E. Parker, R. Hager, Data from tables and figures in ‘verification of a fully implicit particle-in-cell method for the vv_{\parallel} formalism of electromagnetic gyrokinetics in the xgc code’, PPPL Theory Department ARK.
    URL http://arks.princeton.edu/ark:/88435/dsp015425kd34n