
A structure preserving numerical scheme for Fokker-Planck equations of structured neural networks with learning rules

Qing He (Department of Mathematics, Southern Methodist University, Dallas, TX 75275, USA; [email protected]), Jingwei Hu (Department of Applied Mathematics, University of Washington, Seattle, WA 98195, USA; [email protected]), and Zhennan Zhou (Beijing International Center for Mathematical Research, Peking University, Beijing, China; [email protected])
Abstract

In this work, we are concerned with a Fokker-Planck equation related to the nonlinear noisy leaky integrate-and-fire model for biological neural networks which are structured by the synaptic weights and equipped with the Hebbian learning rule. The equation contains a small parameter $\varepsilon$ separating the time scales of the learning and reacting behavior of the neural system, and an asymptotic limit model can be derived by letting $\varepsilon\to 0$, where the microscopic quasi-static states and the macroscopic evolution equation are coupled through the total firing rate. To handle the endowed flux-shift structure and the multi-scale dynamics in a unified framework, we propose a numerical scheme for this equation that is mass conservative, unconditionally positivity preserving, and asymptotic preserving. We provide extensive numerical tests to verify the schemes' properties, carry out a set of numerical experiments to investigate the model's learning ability, and explore the solution's behavior when the neural network is excitatory.

Key words: Fokker-Planck equation, Nonlinear Noisy Leaky Integrate-and-Fire model, neuron network, structure preserving scheme, asymptotic preserving, multiscale model.

AMS subject classifications: 35M13, 65M08, 92B20.

1 Introduction

In biological neural networks, neurons transmit information in two ways: one is conveying rapid and transient electrical signals on their membranes, and the other is sending neurotransmitters at a synapse. The learning and memory of the network are based on the connection weights of the synapses among the neurons [1]. How a neural network with a given structure reacts to the environment, and how the neural network's infrastructure is modified by the environment in the long run, are two of the essential topics in neuroscience.

Mathematically, there have been plenty of models describing neural networks' behavior at all scales and sizes. The most successful single-neuron model is undoubtedly the Hodgkin-Huxley model [2] and its simplification, the FitzHugh-Nagumo model [3]. However, when modeling large-scale biological neural networks, these models are far too complicated and impractical. Therefore, scientists and mathematicians usually adopt the integrate-and-fire model [4] when modeling systems that consist of a large number of neurons [5, 6, 7, 8, 9, 10]. One of the most widely used versions of the integrate-and-fire model is the nonlinear noisy leaky integrate-and-fire (NNLIF) model. In this model, when a firing event does not occur, the membrane potential $V(t)$ of a neuron is influenced by a relaxation to a resting potential $V_{I}$ and an incoming synaptic current $\mathcal{I}(t)$ from other neurons and the background. The current $\mathcal{I}(t)$ is approximately decomposed into a drift term and a Brownian motion term. The SDE, in the simplest form, is given by

\mathrm{d}V=-\left(V-V_{I}\right)\mathrm{d}t+\mu\,\mathrm{d}t+\sigma\,\mathrm{d}B_{t}, (1.1)

where the parameters $\mu$ and $\sigma$ are determined by the synaptic current $\mathcal{I}(t)$. The distinguishing feature of this model is the incorporation of the firing event: when a neuron reaches a firing potential $V_{F}$, it discharges itself and the potential jumps to a resetting potential $V_{R}$ immediately [11, 12]. Here we assume $V_{R}<V_{F}$, and the firing event is expressed by

V\left(t^{-}\right)=V_{F},\quad V\left(t^{+}\right)=V_{R}. (1.2)

Equations (1.1) and (1.2) constitute the stochastic process for the NNLIF model, which is an SDE coupled with a renewal process. By Itô's formula, one can derive the time evolution of the density function $p(v,t)$ [13, 14], which represents the probability of finding a neuron at voltage $v$ at time $t$:

\begin{cases}\displaystyle\frac{\partial p}{\partial t}+\frac{\partial}{\partial v}(hp)-a\frac{\partial^{2}p}{\partial v^{2}}=N(t)\delta(v-V_{R}),\quad t\in(0,\infty),\ v\in(-\infty,V_{F}),\\ \displaystyle p(v,0)=p_{0}(v),\quad p(-\infty,t)=p(V_{F},t)=0,\end{cases} (1.3)

where

N(t)=-a\frac{\partial p}{\partial v}(V_{F},t) (1.4)

stands for the firing rate of the network, $a={\sigma^{2}}/{2}$ stands for the amplitude of the noise, $\delta(\cdot)$ stands for the Dirac delta function, and

h=h(v,N(t))=-v+I+wN(t) (1.5)

stands for the rate of rise of the neurons' potential, consisting of a decay term $-v$, an external input function $I$ and a group reaction term $wN(t)$. The coefficient $w$ in (1.5) can be either positive or negative. In the simplest scenario, $w$ is a constant. We say that the neural network is excitatory when $w>0$, and inhibitory when $w<0$.

When it comes to the learning behavior of biological neural networks, the most seminal work is the Hebbian rule [15], stated as the motto "neurons that fire together wire together". A very simple mathematical setup of this rule is as follows: assume that the strength of the weight $w_{ij}$ between two neurons $i$ and $j$ increases when the two neurons have high activity simultaneously. For $M$ neurons in interaction, the classical Hebbian rule relates the weights to the activity $N_{i}$ of neuron $i$ as

\frac{\mathrm{d}}{\mathrm{d}t}w_{ij}=k_{ij}N_{i}N_{j}-w_{ij},\quad 1\leq i,j\leq M.
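As a quick side illustration (not part of the model in [16]), the Hebbian ODE above is easy to integrate with forward Euler: for constant activities, the decay term makes each weight relax toward $k_{ij}N_{i}N_{j}$. All values below (the step size `dt`, the activities `N`, the gains `K`) are illustrative choices.

```python
import numpy as np

def hebbian_step(W, K, N, dt):
    """One forward-Euler step of dw_ij/dt = k_ij * N_i * N_j - w_ij."""
    return W + dt * (K * np.outer(N, N) - W)

# Three neurons with constant activities; the weights converge to k_ij N_i N_j.
K = np.ones((3, 3))
N = np.array([1.0, 2.0, 0.5])
W = np.zeros((3, 3))
for _ in range(2000):
    W = hebbian_step(W, K, N, dt=0.01)
```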

In [16], the authors designed a model that integrates the NNLIF model and the Hebbian rule stated above. They considered a heterogeneous population of homogeneous neural networks structured by their synaptic weights $w\in(-\infty,+\infty)$, the negative sign standing for inhibitory neurons and the positive sign standing for excitatory neurons. They introduced a learning rule in order to modulate the distribution of synaptic weights $H$ and allow the network to recognize some given input signals $I$ by choosing an appropriate heterogeneous synaptic weight distribution $H$ adapted to the signal $I$. They assumed that the subnetworks interact only via the total firing rate $N$, with synaptic weights described by a single variable $w$, and gave the following interpretation: all the subnetworks parametrized by $w$ may modulate their intrinsic synaptic weight $w$ with respect to a function $\Phi$ which depends on the intrinsic activity $N(w)$ of the network parametrized by $w$ and on the total activity $\overline{N}$ of the network. The proposed generalization of the Hebbian rule then consists in choosing

\Phi(N(w),\overline{N})=\overline{N}N(w)K(w),\quad\overline{N}=\int_{-\infty}^{\infty}N(w)\,\mathrm{d}w, (1.6)

where $K(\cdot)$ represents the learning strength of the subnetwork with synaptic weight $w$. With this choice of learning rule, they obtained the following equation

\frac{\partial p}{\partial t}+\frac{\partial}{\partial v}[(-v+I(w)+w\sigma(\overline{N}(t)))p]+\varepsilon\frac{\partial}{\partial w}[(\Phi-w)p]-a\frac{\partial^{2}p}{\partial v^{2}}=N(w,t)\delta\left(v-V_{R}\right), (1.7)

where $\varepsilon$ stands for a time scale ratio which takes into account that learning is slower than the typical voltage activity of the network.

If we want to study the learning behavior of the model, we may further perform a time rescaling $t\to t/\varepsilon$ in (1.7) to arrive at

\left\{\begin{aligned}&\frac{\partial p}{\partial t}+\frac{\partial}{\partial w}\bigl[(\overline{N}(t)N(w,t)K(w)-w)p\bigr]\\ &\qquad=\frac{1}{\varepsilon}\Bigl\{a\frac{\partial^{2}p}{\partial v^{2}}-\frac{\partial}{\partial v}\bigl[(-v+I(w)+w\sigma(\overline{N}(t)))p\bigr]+N(w,t)\delta(v-V_{R})\Bigr\}\\ &\qquad\qquad\text{for }(v,w,t)\in(-\infty,V_{F})\times(-\infty,+\infty)\times(0,T),\\ &N(w,t)=-a\frac{\partial p}{\partial v}(V_{F}-0,w,t)\geq 0,\quad\overline{N}(t)=\int_{-\infty}^{\infty}N(w,t)\,\mathrm{d}w,\end{aligned}\right. (1.8)

where $\frac{\partial p}{\partial v}(V_{F}-0,w,t)$ denotes the left derivative in the $v$ direction, and the corresponding initial and boundary conditions are

p(V_{F},w,t)=0,\quad p(-\infty,w,t)=0,\quad p(v,\pm\infty,t)=0,\quad p(v,w,0)=p^{0}(v,w). (1.9)

Here $p(v,w,t)$ at a fixed time $t$ describes the probability density of finding a neuron with synaptic weight $w$ and voltage $v$; hence the probability density of finding a single neuron with synaptic weight $w$ is defined as

H(w,t)=\int_{-\infty}^{V_{F}}p(v,w,t)\,\mathrm{d}v, (1.10)

and the normalization condition of probability ensures that $\int_{-\infty}^{+\infty}H(w,t)\,\mathrm{d}w=1$ holds for all $t\geq 0$. $N(w,t)$ describes the firing rate of the neurons with synaptic weight $w$ at time $t$, and $\overline{N}(t)$ describes the entire network's firing rate at time $t$, summing over all the synaptic weights. $a$ is a positive noise coefficient. The model is a multi-scale transport equation, where the $v$-wise transport represents the integrate-and-fire dynamics of the neural network, and the $w$-wise transport represents the network's learning behavior.

Systematic understanding of the model (1.8) is largely lacking, although some of its properties were proved in [16]. Particularly, the numerical studies are far from sufficient for understanding the possible uses and limitations of this model. In this work, we further explore this model from a numerical perspective. Specifically, we aim to design a structure-preserving numerical scheme for (1.8), and further explore this model with extensive numerical experiments to investigate possible solution structures and interpretations.

In our work, we propose a numerical scheme with positivity preserving and asymptotic preserving properties. Two key ingredients make our scheme meet the desired properties. The first one is the Scharfetter-Gummel symmetric reconstruction [17] of the $v$-wise convection and diffusion terms in (1.8). The second one is that we treat the flux-shift term $N\delta(v-V_{R})$ in (1.8) implicitly. With the symmetric reconstruction and this implicit treatment, we can prove that implementing our scheme amounts to inverting a so-called M-matrix at each time step, and the positivity preserving property of the scheme is independent of the ratio between $\Delta t$ and $\Delta v$ or $\varepsilon$. The only grid ratio requirement is that $\Delta t/\Delta w$ has to satisfy the $w$-wise CFL condition, which is a natural requirement. Furthermore, when letting $\varepsilon\to 0$ without adjusting $\Delta t$, we obtain a numerical scheme consistent with the asymptotic behavior of the model, consisting of a $w$-wise convection equation for $H(w,t)$ defined in (1.10) and a $v$-wise quasi-steady equation for $p(v,w,t)$ with given $H(w,t)$.

Numerical methods for Fokker-Planck type equations are abundant in the literature. In recent years many works have been proposed to preserve the structure of the solution, such as mass conservation, positivity, entropy decay, and steady state preservation. This often relies on a delicate choice of numerical fluxes and suitable time discretization. Without being exhaustive, we mention a few relevant works. Regarding spatial discretization, there are mainly two kinds of methods: one is based on formulating the equation in transport form and using upwind fluxes [18]; this was later extended to general nonlinear aggregation and diffusion equations [19]. The other is based on formulating the equation in diffusion form (i.e., Scharfetter-Gummel form). Depending on how the fluxes are chosen, it bifurcates into several variants, for example, the Chang-Cooper scheme [20] and its generalization [21], the symmetrized finite difference scheme [17], and more recently the high order finite difference scheme [22]. Regarding time discretization, high order explicit or semi-implicit schemes (cf. [23]) can be used. However, we mention that to get positivity preserving implicit schemes for Fokker-Planck type equations, one is basically limited to first order schemes [24, 25, 26, 27]. The only exception we are aware of is the second order exponential method in [28], which is applicable to the linear Fokker-Planck equation and hard to generalize to more general cases. For the above reason, we limit ourselves to a first order method in this paper.

In the numerical experiment part, we verify the scheme's order of accuracy (first order in $w,t$, second order in $v$) and asymptotic preserving properties, test the model's recognition properties presented in [16], and further explore the model's behavior when the solution $p(v,w,t)$ is supported in the region $w>0$. The numerical results suggest that this model shows great potential in distinguishing a general input by producing distinct and asymmetric responses to orthogonal basis functions. When $w>0$, the solution $p(v,w,t)$ may either develop into a steady state supported in a region $w\in[0,A]$ for some positive $A$, or expand its support rapidly towards $w\to\infty$.

The rest of the paper is organized as follows. In Section 2, we present our schemes in detail, including the equations and their implementations; in Section 3, we give a detailed analysis of our schemes’ properties, including conservation, positivity preserving properties and asymptotic preserving properties; and in Section 4, we present the numerical results including the verification of the schemes’ accuracy, asymptotic preserving properties, numerical exploration of the model’s learning and recognition ability and the behavior of the model with positive synaptic weights.

2 The Schemes

Our work aims to provide a numerical scheme for (1.8). In this section, we will give a detailed description of the schemes we designed.

This section is organized as follows: in Section 2.1 and Section 2.2, we introduce the grid point settings and the numerical scheme for (1.8), including a $v$-wise semi-implicit ($v$-SI) implementation and a $v$-wise fully-implicit ($v$-FI) implementation; and in Section 2.3, we introduce the matrix form of the $v$-SI and $v$-FI schemes, from which we introduce a useful matrix $\boldsymbol{M}_{j}^{\overline{N}}$ to formulate further analysis.

2.1 Grid point settings and discretizations

Our scheme is a grid-based method, and this subsection gives the basic grid point settings of our scheme.

We assume that $p(v,w,t)$ vanishes fast enough as $v\rightarrow-\infty$ or $w\rightarrow\pm\infty$, so that we may truncate the domain to the bounded region

(v,w,t)\in[V_{\min},V_{F}]\times[W_{\min},W_{\max}]\times[0,T_{\max}],

and suppose that $p(v,w,t)$ is negligible outside this region. Then we divide the intervals $[V_{\min},V_{F}]$, $[W_{\min},W_{\max}]$, $[0,T_{\max}]$ into $n_{v},n_{w},n_{t}$ equal sub-intervals of size

\Delta v=\frac{V_{F}-V_{\min}}{n_{v}},\quad\Delta w=\frac{W_{\max}-W_{\min}}{n_{w}},\quad\Delta t=\frac{T_{\max}}{n_{t}}, (2.1)

respectively, so that the grid points are assigned as follows

\begin{cases}v_{i}=V_{\min}+i\Delta v,&i=0,1,2,\cdots,n_{v},\\ w_{j}=W_{\min}+j\Delta w,&j=0,1,2,\cdots,n_{w},\\ t^{m}=m\Delta t,&m=0,1,2,\cdots,n_{t},\end{cases} (2.2)

then $p_{i,j}^{m}$ represents the numerical approximation of $p\left(v_{i},w_{j},t^{m}\right)$. We always assume that there is an index $r$ satisfying

V_{R}=v_{r}, (2.3)

which naturally treats the derivative's discontinuity at $v=V_{R}$. It can also be seen from (2.1) and (2.2) that $v_{n_{v}}=V_{F}$, so the two points related to the flux shift are both aligned with grid points.

For the discretization of $N=-a\,\partial p/\partial v|_{v=V_{F}}$, we approximate the derivative with the first-order finite difference

N_{j}^{m}=-a\frac{p_{n_{v},j}^{m}-p_{n_{v}-1,j}^{m}}{\Delta v}=a\frac{p_{n_{v}-1,j}^{m}}{\Delta v} (2.4)

(here we have applied the boundary condition $p_{n_{v},j}^{m}=0$); for the discretization of $\overline{N}(t)=\int_{-\infty}^{\infty}N(w,t)\,\mathrm{d}w$, we apply the simplest rectangular rule of numerical integration

\overline{N}^{m}=\Delta w\sum_{j=0}^{n_{w}}N_{j}^{m}, (2.5)

which can also be interpreted as a trapezoidal rule of numerical integration, since the boundary values $N_{0}^{m}$ and $N_{n_{w}}^{m}$ are supposed to be negligible (zero); furthermore, we also denote the discrete representation of $H(w,t)=\int_{-\infty}^{V_{F}}p(v,w,t)\,\mathrm{d}v$ as

H_{j}^{m}=\Delta v\sum_{i=1}^{n_{v}}p_{i,j}^{m}. (2.6)
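In code, the discretizations (2.4)-(2.6) are a few array operations. The sketch below is a minimal illustration, assuming a NumPy array `p` of shape `(nv+1, nw+1)` that stores $p_{i,j}^{m}$ at a single time step with the boundary row $p_{n_{v},j}^{m}=0$ already enforced; the grid parameters are arbitrary test values.

```python
import numpy as np

def discrete_rates(p, a, dv, dw):
    """Firing rates (2.4), total rate (2.5), and weight marginal (2.6)."""
    N = a * p[-2, :] / dv           # N_j^m = a p_{nv-1,j}^m / dv        (2.4)
    Nbar = dw * N.sum()             # rectangular rule in w              (2.5)
    H = dv * p[1:, :].sum(axis=0)   # H_j^m = dv sum_{i=1}^{nv} p_{i,j}  (2.6)
    return N, Nbar, H

nv, nw, a, dv, dw = 4, 3, 0.5, 0.25, 0.5
p = np.ones((nv + 1, nw + 1))
p[-1, :] = 0.0                      # boundary condition p(V_F, w, t) = 0
N, Nbar, H = discrete_rates(p, a, dv, dw)
```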

2.2 The numerical scheme for the Fokker-Planck equation (1.8)

We design a scheme with a finite volume construction: for each $i=0,1,\cdots,n_{v}-1$, $j=0,1,\cdots,n_{w}$, $m=0,1,\cdots,n_{t}-1$, we have the difference equation

\frac{p_{i,j}^{m+1}-p_{i,j}^{m}}{\Delta t}+\frac{\Phi_{i,j+1/2}^{m}-\Phi_{i,j-1/2}^{m}}{\Delta w}=\frac{1}{\varepsilon}\frac{F_{i+1/2,j}^{m}-F_{i-1/2,j}^{m}}{\Delta v}, (2.7)

or, in an equivalent splitting form

\frac{p_{i,j}^{*m}-p_{i,j}^{m}}{\Delta t}+\frac{\Phi_{i,j+1/2}^{m}-\Phi_{i,j-1/2}^{m}}{\Delta w}=0, (2.8)
\frac{p_{i,j}^{m+1}-p_{i,j}^{*m}}{\Delta t}=\frac{1}{\varepsilon}\frac{F_{i+1/2,j}^{m}-F_{i-1/2,j}^{m}}{\Delta v}. (2.9)

This splitting form is introduced only for convenience when discussing the scheme's properties in Section 3.3. Our scheme is not based on the idea of operator splitting.

For the $w$-wise transport, we take the following explicit flux construction adapted from Godunov's method (see, for example, (13.24) in [29]):

\Phi_{i,j+1/2}^{m}=\begin{cases}\begin{cases}\min\left\{\Phi_{i,j}^{m},\Phi_{i,j+1}^{m}\right\}&p_{i,j}^{m}\leq p_{i,j+1}^{m}\\ \max\left\{\Phi_{i,j}^{m},\Phi_{i,j+1}^{m}\right\}&p_{i,j}^{m}>p_{i,j+1}^{m}\end{cases}&j=0,\cdots,n_{w}-1\\ 0&j=-1,n_{w}\end{cases} (2.10)

where

\Phi_{i,j}^{m}=(\overline{N}^{m}N_{j}^{m}K(w_{j})-w_{j})p_{i,j}^{m}\quad\text{for }j=0,\cdots,n_{w}. (2.11)
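For one fixed index $i$, the flux construction (2.10)-(2.11) can be transcribed directly. The sketch below is an illustration with made-up data: `p_row` holds $p_{i,\cdot}^{m}$, and the returned array holds $\Phi_{i,-1/2}^{m},\cdots,\Phi_{i,n_{w}+1/2}^{m}$ with the zero boundary values built in.

```python
import numpy as np

def godunov_w_flux(p_row, N, Nbar, K, w):
    """Godunov-type fluxes Phi_{i,j+1/2} of (2.10)-(2.11) for one row i."""
    phi = (Nbar * N * K - w) * p_row    # pointwise flux (2.11)
    Phi = np.zeros(len(p_row) + 1)      # Phi at j = -1/2, ..., nw+1/2; boundaries stay 0
    for j in range(len(p_row) - 1):
        if p_row[j] <= p_row[j + 1]:
            Phi[j + 1] = min(phi[j], phi[j + 1])
        else:
            Phi[j + 1] = max(phi[j], phi[j + 1])
    return Phi

p_row = np.array([0.2, 0.5, 0.3])
Phi = godunov_w_flux(p_row, N=np.ones(3), Nbar=1.0, K=np.ones(3),
                     w=np.array([-1.0, 0.0, 1.0]))
```

Because the $w$-update only uses differences of these fluxes, their telescoping sum vanishes exactly, which is the discrete counterpart of mass conservation in $w$.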

For the $v$-wise transport, we need to treat the operator partially implicitly in an appropriate way such that the scheme's stability is limited neither by the grid ratio $\Delta t/(\Delta v)^{2}$ nor by the stiffness introduced by the smallness of $\varepsilon$. We inherit the idea from [30], which implements a Scharfetter-Gummel symmetric reconstruction for the convection-diffusion operator and imposes a flux shift from the boundary $V_{F}$ back to $V_{R}$, but we treat the flux-shift operator $N\delta(v-V_{R})$ implicitly:

F_{i+1/2,j}^{m}=\begin{cases}\displaystyle a\frac{M_{i+1/2,j}^{\overline{N}^{\alpha}}}{\Delta v}\left(\frac{p_{i+1,j}^{m+1}}{M_{i+1,j}^{\overline{N}^{\alpha}}}-\frac{p_{i,j}^{m+1}}{M_{i,j}^{\overline{N}^{\alpha}}}\right)+N_{j}^{m+1}\eta(v_{i+1/2}-V_{R})&i=0,1,\cdots,n_{v}-2,\\ 0&i=-1,n_{v}-1,\end{cases} (2.12)

where $\eta(\cdot)$ is the Heaviside function ($\eta(x)=1$ when $x\geq 0$ and $\eta(x)=0$ when $x<0$), and

M_{i,j}^{\overline{N}^{\alpha}}=\mathrm{e}^{-\frac{\left[-v_{i}+I(w_{j})+\sigma\left(\overline{N}^{\alpha}\right)\right]^{2}}{2a}},\quad M_{i+1/2,j}^{\overline{N}^{\alpha}}=\frac{2M_{i,j}^{\overline{N}^{\alpha}}M_{i+1,j}^{\overline{N}^{\alpha}}}{M_{i,j}^{\overline{N}^{\alpha}}+M_{i+1,j}^{\overline{N}^{\alpha}}}. (2.13)
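The weights (2.13) are cheap to precompute for each column $j$. The following sketch builds the nodal weights and their harmonic means; the values for $I(w_{j})$, $\sigma(\overline{N}^{\alpha})$, $a$, and the $v$-grid are illustrative.

```python
import numpy as np

def sg_weights(v_nodes, I_j, sigma_Nbar, a):
    """Nodal Maxwellian weights and harmonic means M_{i+1/2,j} of (2.13)."""
    M = np.exp(-((-v_nodes + I_j + sigma_Nbar) ** 2) / (2.0 * a))
    M_half = 2.0 * M[:-1] * M[1:] / (M[:-1] + M[1:])   # harmonic mean
    return M, M_half

v_nodes = np.linspace(-2.0, 1.0, 7)
M, M_half = sg_weights(v_nodes, I_j=0.3, sigma_Nbar=0.1, a=0.5)
```

Since the harmonic mean of two positive numbers lies between them, the reconstructed flux (2.12) always divides by strictly positive quantities.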

Now to complete the scheme, the only thing that remains is to specify the expression of $\overline{N}^{\alpha}$. As we shall see, the choice of $\overline{N}^{\alpha}$ will highly affect the implementations and properties of the scheme. Here we propose two choices of $\overline{N}^{\alpha}$: either $\overline{N}^{m}$ or $\overline{N}^{m+1}$. We call the scheme (2.7)-(2.13) with

\overline{N}^{\alpha}=\overline{N}^{m}

the $v$-wise semi-implicit ($v$-SI) scheme, and the scheme (2.7)-(2.13) with

\overline{N}^{\alpha}=\overline{N}^{m+1}

the $v$-wise fully-implicit ($v$-FI) scheme. The advantage of the $v$-SI scheme is that its implementation is free of nonlinear solvers, and thus it is the main focus of this work. The $v$-FI scheme is mainly introduced for comparison.

To avoid ambiguity, when discussing the $v$-SI scheme, we add a superscript "SI" to all the variables, placed first and separated from the other superscripts by a semicolon:

p^{m}_{i,j},\ N^{m}_{j},\ \overline{N}^{m},\ H^{m}_{j}\quad\text{are written as}\quad p^{\text{SI};m}_{i,j},\ N^{\text{SI};m}_{j},\ \overline{N}^{\text{SI};m},\ H^{\text{SI};m}_{j}\quad\text{for the }v\text{-SI scheme};

and a similar set of notations will be introduced for the $v$-FI scheme:

p^{m}_{i,j},\ N^{m}_{j},\ \overline{N}^{m},\ H^{m}_{j}\quad\text{are written as}\quad p^{\text{FI};m}_{i,j},\ N^{\text{FI};m}_{j},\ \overline{N}^{\text{FI};m},\ H^{\text{FI};m}_{j}\quad\text{for the }v\text{-FI scheme}.

We still use $p^{m}_{i,j},N^{m}_{j},\overline{N}^{m},H^{m}_{j}$ when discussing the properties that the $v$-SI scheme and the $v$-FI scheme have in common (like the conservation and positivity preserving properties).

It is worth noting that, in (2.12), the flux-shift part is $N_{j}^{m+1}$ for both the $v$-SI and $v$-FI constructions of the scheme. This is an apparently small but important difference from the scheme proposed in [30]. This difference will be proved to make the scheme (unconditionally) positivity preserving for all $\varepsilon$ and $\Delta v$.

2.3 The matrix problem involved in the schemes and the iterative methods for the $v$-FI scheme

In this subsection, we introduce the matrix equation one will face when implementing the proposed schemes. To begin with, we define the class of matrices which will frequently appear throughout our work:

Definition 1

For a given $\overline{N}\in\mathbb{R}$ and $j=0,1,\cdots,n_{w}$, we define the matrix $\boldsymbol{M}_{j}^{\overline{N}}$ as an $n_{v}\times n_{v}$ matrix such that

\left(\boldsymbol{M}_{j}^{\overline{N}}\right)_{kl}=\begin{cases}-M_{k-3/2,j}^{\overline{N}}/M_{k-2,j}^{\overline{N}}&l=k-1\\ -M_{k-1/2,j}^{\overline{N}}/M_{k,j}^{\overline{N}}&l=k+1\\ \left(M_{k-3/2,j}^{\overline{N}}+M_{k-1/2,j}^{\overline{N}}\right)/M_{k-1,j}^{\overline{N}}&l=k\ \text{and}\ l\neq 1,n_{v}\\ M_{n_{v}-3/2,j}^{\overline{N}}/M_{n_{v}-1,j}^{\overline{N}}+1&l=k=n_{v}\\ M_{1/2,j}^{\overline{N}}/M_{0,j}^{\overline{N}}&l=k=1\\ -1&k=r+1\ \text{and}\ l=n_{v}\\ 0&\text{otherwise}\end{cases} (2.14)

where

M_{i,j}^{\overline{N}}=\mathrm{e}^{-\frac{[-v_{i}+I(w_{j})+\sigma(\overline{N})]^{2}}{2a}}\quad\text{and}\quad M_{i+1/2,j}^{\overline{N}}=\frac{2M_{i,j}^{\overline{N}}M_{i+1,j}^{\overline{N}}}{M_{i,j}^{\overline{N}}+M_{i+1,j}^{\overline{N}}}. (2.15)

$\boldsymbol{M}_{j}^{\overline{N}}$ is tridiagonal except for one entry at $k=r+1$, $l=n_{v}$, and its properties will be further discussed in Section 3.2. For now, having defined $\boldsymbol{M}_{j}^{\overline{N}}$, we can write the matrix form of the proposed schemes.
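As a sanity check on Definition 1, one can assemble $\boldsymbol{M}_{j}^{\overline{N}}$ on a small grid and verify the structure exploited later: every column sums to zero (which underlies mass conservation) and the sign pattern is that of an M-matrix (positive diagonal, non-positive off-diagonal entries). The sketch below uses 0-based indexing (matrix row $k-1$, column $l-1$) and arbitrary positive weight data; `Mh[i]` stores $M_{i+1/2,j}^{\overline{N}}$.

```python
import numpy as np

def assemble_M(M, Mh, r):
    """Assemble the nv x nv matrix of (2.14) from nodal weights M[0..nv-1],
    half weights Mh[0..nv-2], and the 0-based reset index r (V_R = v_r)."""
    nv = len(M)
    A = np.zeros((nv, nv))
    for i in range(nv):
        if i > 0:
            A[i, i - 1] = -Mh[i - 1] / M[i - 1]        # l = k - 1
        if i < nv - 1:
            A[i, i + 1] = -Mh[i] / M[i + 1]            # l = k + 1
        if i == 0:
            A[i, i] = Mh[0] / M[0]                     # l = k = 1
        elif i == nv - 1:
            A[i, i] = Mh[nv - 2] / M[nv - 1] + 1.0     # l = k = nv
        else:
            A[i, i] = (Mh[i - 1] + Mh[i]) / M[i]
    A[r, nv - 1] += -1.0                               # flux shift back to v_r = V_R
    return A

v_nodes = np.linspace(-1.5, 0.75, 10)                  # illustrative grid
M = np.exp(-((-v_nodes + 0.2) ** 2) / (2.0 * 0.5))     # weights as in (2.15)
Mh = 2.0 * M[:-1] * M[1:] / (M[:-1] + M[1:])
A = assemble_M(M, Mh, r=4)
```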

For the $v$-SI scheme, assume that we have already solved $p_{i,j}^{\text{SI};*m}$ from the explicit convection step (2.8). Then one easily checks (the only notable detail being that $N_{j}^{m+1}$ in (2.12) should be replaced with $a\,p_{n_{v}-1,j}^{m+1}/\Delta v$ according to (2.4)) that solving the $v$-wise transport step (2.9) means solving the system

\left(\varepsilon\boldsymbol{I}+\frac{a\Delta t}{(\Delta v)^{2}}\boldsymbol{M}_{j}^{\overline{N}^{\text{SI};m}}\right)\boldsymbol{p}_{:,j}^{\text{SI};m+1}=\varepsilon\boldsymbol{p}_{:,j}^{\text{SI};*m} (2.16)

for each $j=0,1,\cdots,n_{w}$, where

\boldsymbol{p}_{:,j}^{\text{SI};m+1}=\left(p_{0,j}^{\text{SI};m+1},p_{1,j}^{\text{SI};m+1},\cdots,p_{n_{v}-1,j}^{\text{SI};m+1}\right)^{\top},\quad\boldsymbol{p}_{:,j}^{\text{SI};*m}=\left(p_{0,j}^{\text{SI};*m},p_{1,j}^{\text{SI};*m},\cdots,p_{n_{v}-1,j}^{\text{SI};*m}\right)^{\top},

$\boldsymbol{I}$ is the $n_{v}\times n_{v}$ identity matrix, and $\boldsymbol{M}_{j}^{\overline{N}^{\text{SI};m}}$ follows the definition given in (2.14). (2.16) is a linear equation for $\boldsymbol{p}_{:,j}^{\text{SI};m+1}$.

For the $v$-FI scheme, solving the $v$-wise transport equation means solving the system

\left(\varepsilon\boldsymbol{I}+\frac{a\Delta t}{(\Delta v)^{2}}\boldsymbol{M}_{j}^{\overline{N}^{\text{FI};m+1}}\right)\boldsymbol{p}_{:,j}^{\text{FI};m+1}=\varepsilon\boldsymbol{p}_{:,j}^{\text{FI};*m} (2.17)

for each $j=0,1,\cdots,n_{w}$, where

\boldsymbol{p}_{:,j}^{\text{FI};m+1}=\left(p_{0,j}^{\text{FI};m+1},p_{1,j}^{\text{FI};m+1},\cdots,p_{n_{v}-1,j}^{\text{FI};m+1}\right)^{\top},\quad\boldsymbol{p}_{:,j}^{\text{FI};*m}=\left(p_{0,j}^{\text{FI};*m},p_{1,j}^{\text{FI};*m},\cdots,p_{n_{v}-1,j}^{\text{FI};*m}\right)^{\top}.

(2.17) is a nonlinear system, since the matrix $\boldsymbol{M}_{j}^{\overline{N}^{\text{FI};m+1}}$ depends directly on the unsolved variable $\overline{N}^{\text{FI};m+1}$. Therefore, we have to solve such a system with a nonlinear solver.

We propose a fixed-point iteration method as follows. We set the initial guess as $p_{i,j}^{(0)}=p_{i,j}^{\text{FI};*m}$, hence $\overline{N}^{(0)}=\overline{N}^{m}$; then for $k=0,1,2,\cdots$, let

\begin{cases}\displaystyle\left(\varepsilon\boldsymbol{I}+\frac{a\Delta t}{(\Delta v)^{2}}\boldsymbol{M}_{j}^{\overline{N}^{(k)}}\right)\boldsymbol{p}_{:,j}^{(k+1)}=\varepsilon\boldsymbol{p}_{:,j}^{\text{FI};*m}\quad\text{for}\ j=0,1,\cdots,n_{w},\\[10.0pt] \displaystyle\overline{N}^{(k+1)}=\Delta w\sum_{j=0}^{n_{w}}N_{j}^{(k+1)}=\Delta w\sum_{j=0}^{n_{w}}a\frac{p_{n_{v}-1,j}^{(k+1)}}{\Delta v},\end{cases} (2.18)

where

\boldsymbol{p}_{:,j}^{(k+1)}=\left(p_{0,j}^{(k+1)},p_{1,j}^{(k+1)},\cdots,p_{n_{v}-1,j}^{(k+1)}\right)^{\top}.

We iterate $K$ rounds to satisfy the stopping criterion $|\overline{N}^{(K+1)}-\overline{N}^{(K)}|\leq\epsilon$ for some small enough $\epsilon$, and then take

\boldsymbol{p}_{:,j}^{\text{FI};m+1}=\boldsymbol{p}_{:,j}^{(K)},\quad N_{j}^{\text{FI};m+1}=N_{j}^{(K)},\quad\overline{N}^{\text{FI};m+1}=\overline{N}^{(K)}.
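The loop (2.18) can be sketched as follows. This is a simplified stand-in rather than the paper's full implementation: it assumes $\sigma(\overline{N})=\overline{N}$ and a constant $I(w_{j})=0$ as illustrative choices, uses a tiny random stand-in for $\boldsymbol{p}^{\text{FI};*m}$, and assembles $\boldsymbol{M}_{j}^{\overline{N}^{(k)}}$ per Definition 1 inside the iteration; only the structure (assemble with the previous $\overline{N}^{(k)}$, solve the linear systems, update $\overline{N}^{(k+1)}$, stop when it stabilizes) mirrors the scheme.

```python
import numpy as np

def assemble_M(v_nodes, I_j, sigma_Nbar, a, r):
    """Matrix of Definition 1, (2.14)-(2.15), with 0-based indexing."""
    M = np.exp(-((-v_nodes + I_j + sigma_Nbar) ** 2) / (2.0 * a))
    Mh = 2.0 * M[:-1] * M[1:] / (M[:-1] + M[1:])
    nv = len(M)
    A = np.zeros((nv, nv))
    for i in range(nv):
        if i > 0:
            A[i, i - 1] = -Mh[i - 1] / M[i - 1]
        if i < nv - 1:
            A[i, i + 1] = -Mh[i] / M[i + 1]
        A[i, i] = (Mh[0] / M[0] if i == 0 else
                   Mh[nv - 2] / M[nv - 1] + 1.0 if i == nv - 1 else
                   (Mh[i - 1] + Mh[i]) / M[i])
    A[r, nv - 1] += -1.0
    return A

nv, nw, a, dv, dw, dt, eps, r = 8, 2, 0.5, 0.25, 0.5, 0.1, 1e-3, 3
v_nodes = -1.0 + dv * np.arange(nv)                        # v_0, ..., v_{nv-1}
p_star = np.abs(np.random.default_rng(0).normal(size=(nv, nw + 1)))  # stand-in p^{FI;*m}
Nbar = dw * np.sum(a * p_star[-1, :] / dv)                 # initial guess N^(0)
for _ in range(200):                                       # fixed-point iteration (2.18)
    p_new = np.empty_like(p_star)
    for j in range(nw + 1):
        A = eps * np.eye(nv) + (a * dt / dv**2) * assemble_M(v_nodes, 0.0, Nbar, a, r)
        p_new[:, j] = np.linalg.solve(A, eps * p_star[:, j])
    Nbar_new = dw * np.sum(a * p_new[-1, :] / dv)
    if abs(Nbar_new - Nbar) <= 1e-12:                      # stopping criterion
        Nbar = Nbar_new
        break
    Nbar = Nbar_new
```

Because each column of the assembled matrix sums to zero and $\varepsilon\boldsymbol{I}$ makes it strictly column diagonally dominant, every solve conserves the column mass and returns a non-negative vector, independently of $\Delta t/(\Delta v)^{2}$ and $\varepsilon$.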

3 The Properties of the Scheme

Having proposed the scheme in Section 2.2, we discuss how it preserves the properties of the original model.

We show that both the $v$-SI and the $v$-FI schemes proposed in Section 2.2 are mass-conservative and positivity-preserving as long as the $w$-wise CFL condition is satisfied, no matter how small $\varepsilon$ and $\Delta v$ are. As for the AP property, the $v$-FI scheme and the $v$-SI scheme preserve the asymptotic limit in slightly different forms.

3.1 Conservative properties

Invariance of the total mass is one of the most basic properties of our model (1.8), and our scheme satisfies the conservation property naturally thanks to the finite volume construction.

Theorem 1

The schemes (2.7)-(2.13) satisfy

\sum_{i=0}^{n_{v}}\sum_{j=0}^{n_{w}}p_{i,j}^{m+1}=\sum_{i=0}^{n_{v}}\sum_{j=0}^{n_{w}}p_{i,j}^{m}. (3.1)

Proof: Summing (2.7) over $i$ from 0 to $n_{v}-1$ and over $j$ from 0 to $n_{w}$, and using (2.10) and (2.12), we have

\begin{aligned}\sum_{i=0}^{n_{v}-1}\sum_{j=0}^{n_{w}}p_{i,j}^{m+1}&=\sum_{i=0}^{n_{v}-1}\sum_{j=0}^{n_{w}}p_{i,j}^{m}+\frac{\Delta t}{\varepsilon\Delta v}\sum_{i=0}^{n_{v}-1}\sum_{j=0}^{n_{w}}\bigl(F^{m}_{i+1/2,j}-F^{m}_{i-1/2,j}\bigr)-\frac{\Delta t}{\Delta w}\sum_{i=0}^{n_{v}-1}\sum_{j=0}^{n_{w}}\bigl(\Phi^{m}_{i,j+1/2}-\Phi^{m}_{i,j-1/2}\bigr)\\ &=\sum_{i=0}^{n_{v}-1}\sum_{j=0}^{n_{w}}p_{i,j}^{m}+\frac{\Delta t}{\varepsilon\Delta v}\sum_{j=0}^{n_{w}}\bigl(F^{m}_{n_{v}-1/2,j}-F^{m}_{-1/2,j}\bigr)-\frac{\Delta t}{\Delta w}\sum_{i=0}^{n_{v}-1}\bigl(\Phi^{m}_{i,n_{w}+1/2}-\Phi^{m}_{i,-1/2}\bigr)\\ &=\sum_{i=0}^{n_{v}-1}\sum_{j=0}^{n_{w}}p_{i,j}^{m}+\frac{\Delta t}{\varepsilon\Delta v}\sum_{j=0}^{n_{w}}(0-0)-\frac{\Delta t}{\Delta w}\sum_{i=0}^{n_{v}-1}(0-0)=\sum_{i=0}^{n_{v}-1}\sum_{j=0}^{n_{w}}p_{i,j}^{m},\end{aligned} (3.2)

and because of the boundary condition $p_{n_{v},j}^{m}=0$, we can change "$\sum_{i=0}^{n_{v}-1}$" on both ends of (3.2) into "$\sum_{i=0}^{n_{v}}$", which gives us (3.1). $\square$

3.2 Properties of the matrix $\boldsymbol{M}_{j}^{\overline{N}}$

Before discussing the properties of our schemes, we first investigate the matrix $\boldsymbol{M}_{j}^{\overline{N}}$ defined in (2.14) and (2.15). This matrix has a one-dimensional kernel spanned by a strictly positive vector, a property which will help us show the positivity of the schemes.

The main result in this section is

Lemma 1

For all $\overline{N}\geq 0$, the matrix $\boldsymbol{M}_{j}^{\overline{N}}$ defined in (2.14) and (2.15) has a unique (up to a positive multiple) right eigenvector $\boldsymbol{q}=(q_{0},\cdots,q_{n_{v}-1})^{\top}$ for the eigenvalue 0, and this eigenvector is strictly positive.

Proof: We want to study the solution of

\boldsymbol{M}_{j}^{\overline{N}}\boldsymbol{q}=\boldsymbol{0}. (3.3)

We left-multiply both sides of (3.3) by the $n_{v}\times n_{v}$ bidiagonal matrix

\boldsymbol{L}=\begin{bmatrix}1&&&&\\ 1&1&&&\\ &1&1&&\\ &&\ddots&\ddots&\\ &&&1&1\end{bmatrix},\quad\text{or entry-by-entry:}\quad L_{kl}=\begin{cases}1&k=l+1\ \text{or}\ k=l\\ 0&\text{otherwise}\end{cases}

obtaining $\left(\boldsymbol{L}\boldsymbol{M}_{j}^{\overline{N}}\right)\boldsymbol{q}=\boldsymbol{0}$ (note that $\boldsymbol{L}$ is non-singular, so $\boldsymbol{L}\boldsymbol{M}_{j}^{\overline{N}}$ has the same kernel as $\boldsymbol{M}_{j}^{\overline{N}}$), which can be rearranged into the following system:

{qi=Mi,jN¯Mi,j+1N¯qi+1i=r1,r2,,0qi=Mi,jN¯Mi,j+1N¯qi+1+Mi,jN¯Mi,j+1/2N¯qn1i=n3,n4,,rqi=(Mi,n2N¯Mi,n1N¯+Mi,n2N¯Mi,n3/2N¯)qn1i=n2\begin{cases}\displaystyle q_{i}=\frac{M_{i,j}^{\overline{N}}}{M_{i,j+1}^{\overline{N}}}q_{i+1}&i=r-1,r-2,\cdots,0\\[15.0pt] \displaystyle q_{i}=\frac{M_{i,j}^{\overline{N}}}{M_{i,j+1}^{\overline{N}}}q_{i+1}+\frac{M_{i,j}^{\overline{N}}}{M_{i,j+1/2}^{\overline{N}}}q_{n-1}&i=n-3,n-4,\cdots,r\\[15.0pt] \displaystyle q_{i}=\left(\frac{M_{i,n-2}^{\overline{N}}}{M_{i,n-1}^{\overline{N}}}+\frac{M_{i,n-2}^{\overline{N}}}{M_{i,n-3/2}^{\overline{N}}}\right)q_{n-1}&i=n-2\end{cases} (3.4)

One can derive from (3.4) recursively (in reverse order, from i=n2i=n-2 down to i=0i=0) that qn2,qn3,,q0q_{n-2},q_{n-3},\cdots,q_{0} can all be written as positive multiples of qn1q_{n-1}, with expressions involving only entries of 𝑴jN¯\boldsymbol{M}_{j}^{\overline{N}}. This implies the strict positivity and the uniqueness (up to a positive multiple) of 𝒒\boldsymbol{q}. \square

3.3 Positivity preserving properties

Solutions being non-negative is another basic property of model (1.8); hence we also expect our scheme to be positivity preserving. In Theorem 2.2 of [30], it is proved that the scheme’s positivity preserving property relies on the grid ratio Δt/(Δv)2\Delta t/(\Delta v)^{2}. In this section, we show that the positivity preserving property of our scheme relies on neither Δt/(Δv)2\Delta t/(\Delta v)^{2} nor ε\varepsilon; this improvement comes from our implicit treatment of the flux-shift operator described in Section 2.

We start with the general matrix form of our problem (both for the vv-SI scheme and for the vv-FI scheme):

(ε𝑰+aΔt(Δv)2𝑴jN¯α)𝒑:,jm+1=ε𝒑:,jm,\left(\varepsilon\boldsymbol{I}+\frac{a\Delta t}{(\Delta v)^{2}}\boldsymbol{M}_{j}^{\overline{N}^{\alpha}}\right)\boldsymbol{p}_{:,j}^{m+1}=\varepsilon\boldsymbol{p}_{:,j}^{*m}, (3.5)

where

𝒑:,jm+1\displaystyle\begin{array}[]{l}\boldsymbol{p}_{:,j}^{m+1}\\ \end{array} =(p0,jm+1,p1,jm+1,,pnv1,jm+1),\displaystyle=\left(p_{0,j}^{m+1},p_{1,j}^{m+1},\cdots,p_{n_{v}-1,j}^{m+1}\right)^{\top},
𝒑:,jm\displaystyle\boldsymbol{p}_{:,j}^{*m} =(p0,jm,p1,jm,,pnv1,jm).\displaystyle=\left(p_{0,j}^{*m},p_{1,j}^{*m},\cdots,p_{n_{v}-1,j}^{*m}\right)^{\top}.

A crucial property of system (3.5) is that the to-be-inverted matrix ε𝑰+aΔt(Δv)2𝑴jN¯α\varepsilon\boldsymbol{I}+\frac{a\Delta t}{(\Delta v)^{2}}\boldsymbol{M}_{j}^{\overline{N}^{\alpha}} is a so-called M-matrix, an important object in the theory of positive operators. In [31], the authors provided forty equivalent statements that ensure a Z-matrix (a matrix with all diagonal entries non-negative and all off-diagonal entries non-positive) is an M-matrix. We pick out statements F16 and K33 from that article, forming the following lemma:

Lemma 2 (Equivalence of statements F16 and K33 of Theorem 1 in [31])

A semi-positive Z-matrix is an M-matrix, and hence monotone. Specifically, if an n×nn\times n matrix 𝐌\boldsymbol{M} satisfies

{𝑴i,j0j=i𝑴i,j0ji\begin{cases}\boldsymbol{M}_{i,j}\geq 0&j=i\\ \boldsymbol{M}_{i,j}\leq 0&j\neq i\end{cases}

and there exists a strictly positive (every entry is positive) n×1n\times 1 vector 𝐯>0\boldsymbol{v}^{*}>0 such that 𝐌𝐯>0\boldsymbol{Mv}^{*}>0, then 𝐌\boldsymbol{M} is a non-singular M-matrix, and hence 𝐌𝐯0\boldsymbol{Mv}\geq 0 implies 𝐯0\boldsymbol{v}\geq 0. Here

- 𝐯>0\boldsymbol{v}>0 means that all the entries of 𝐯\boldsymbol{v} are positive, and

- 𝐯0\boldsymbol{v}\geq 0 means that all the entries of 𝐯\boldsymbol{v} are non-negative.
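A minimal numerical illustration of Lemma 2, using a generic tridiagonal Z-matrix rather than the paper's matrix: exhibiting a strictly positive certificate vector with a strictly positive image shows the matrix is a non-singular M-matrix, so its inverse is entrywise non-negative.

```python
import numpy as np

# Illustration of Lemma 2 on a generic tridiagonal Z-matrix (not the
# paper's matrix): M has non-negative diagonal, non-positive off-diagonal.
n = 6
M = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# A strictly positive v* with M v* > 0 certifies that M is a non-singular
# M-matrix; v*_i = sin(pi*(i+1)/(n+1)) is a positive eigenvector of M
# with a positive eigenvalue, so it serves as such a certificate.
v_star = np.sin(np.pi * np.arange(1, n + 1) / (n + 1))
assert (v_star > 0).all() and (M @ v_star > 0).all()

# Monotonicity (M x >= 0 implies x >= 0) is equivalent to inv(M) >= 0.
assert (np.linalg.inv(M) >= -1e-12).all()
```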

With Lemma 2, we can further prove the following lemma:

Lemma 3

For (3.5), 𝐩:,jm0\boldsymbol{p}_{:,j}^{*m}\geq 0 implies 𝐩:,jm+10\boldsymbol{p}_{:,j}^{m+1}\geq 0.

Proof: First, it is straightforward to check from the definitions that ε𝑰+aΔt(Δv)2𝑴jN¯α\varepsilon\boldsymbol{I}+\frac{a\Delta t}{(\Delta v)^{2}}\boldsymbol{M}_{j}^{\overline{N}^{\alpha}} is a Z-matrix. Second, we notice that

(ε𝑰+aΔt(Δv)2𝑴jN¯α)𝑸:,jN¯α=ε𝑸:,jN¯α>0\left(\varepsilon\boldsymbol{I}+\frac{a\Delta t}{(\Delta v)^{2}}\boldsymbol{M}_{j}^{\overline{N}^{\alpha}}\right)\boldsymbol{Q}_{:,j}^{\overline{N}^{\alpha}}=\varepsilon{\boldsymbol{Q}_{:,j}^{\overline{N}^{\alpha}}}>0

since 𝑸i,jN¯α{\boldsymbol{Q}_{i,j}^{\overline{N}^{\alpha}}} is in the kernel of 𝑴jN¯α\boldsymbol{M}_{j}^{\overline{N}^{\alpha}}. According to Lemma 2 (taking 𝒗\boldsymbol{v}^{*} as 𝑸jN¯α\boldsymbol{Q}_{j}^{\overline{N}^{\alpha}}), the matrix ε𝑰+aΔt(Δv)2𝑴jN¯α\varepsilon\boldsymbol{I}+\frac{a\Delta t}{(\Delta v)^{2}}\boldsymbol{M}_{j}^{\overline{N}^{\alpha}} is a non-singular M-matrix, and hence 𝒑jm0\boldsymbol{p}_{j}^{*m}\geq 0 implies 𝒑jm+10\boldsymbol{p}_{j}^{m+1}\geq 0. \square
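The mechanism of Lemma 3 can be sketched with a stand-in matrix: the path-graph Laplacian below is a singular M-matrix whose kernel is spanned by the strictly positive vector of ones, playing the role of the scheme's matrix and its positive kernel vector; solving the analogue of (3.5) then preserves non-negativity for any ratio of the coefficients.

```python
import numpy as np

# Stand-in for (3.5): the path-graph Laplacian L is a singular M-matrix
# whose kernel is spanned by the strictly positive vector of ones (it
# plays the role of M_j^Nbar and its positive kernel vector q).
n, eps, c = 8, 1e-6, 1e3       # c stands for a*dt/(dv)^2; c/eps is huge
L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1.0      # zero row sums: L @ ones = 0

# (eps*I + c*L) is a Z-matrix and maps ones to eps*ones > 0, hence it is
# a non-singular M-matrix: non-negative data yield a non-negative solve.
A = eps * np.eye(n) + c * L
rng = np.random.default_rng(2)
p_star = rng.random(n)                     # non-negative "old" data
p_new = np.linalg.solve(A, eps * p_star)
assert (p_new >= -1e-12).all()             # positivity for any c/eps
```

Note that the conclusion is independent of how large `c/eps` is, mirroring the claim that the scheme's positivity does not depend on the stiff ratio.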

Applying Lemma 3 and (2.8), we can state the following sufficient condition for the positivity preserving property of the proposed schemes.

Theorem 2

(Positivity Preserving Properties)  pi,jm+10,p_{i,j}^{m+1}\geq 0, holds for all i=0,,nvi=0,\cdots,n_{v} and j=0,,nwj=0,\cdots,n_{w} as long as

pi,jmΔtΔw(Φi,j+1/2mΦi,j1/2m)0p_{i,j}^{m}-\frac{\Delta t}{\Delta w}\left(\Phi_{i,j+1/2}^{m}-\Phi_{i,j-1/2}^{m}\right)\geq 0 (3.6)

holds for all i=0,,nv,j=0,,nwi=0,\cdots,n_{v},j=0,\cdots,n_{w}.

(3.6) tells us that as long as the grid ratio Δt/Δw\Delta t/\Delta w is not too large (the most basic requirement for almost all numerical schemes for hyperbolic conservation laws, also known as the CFL condition), the numerical solution remains non-negative. This property is independent of the grid ratio Δt/(Δv)2\Delta t/(\Delta v)^{2} and the scaling parameter ε\varepsilon, which means we can take arbitrarily small Δv\Delta v and ε\varepsilon without worrying about violating the positivity of the solutions.
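The role of the CFL condition can be illustrated with a first-order explicit upwind step for a scalar conservation law standing in for the ww-convection part (the velocity field below is hypothetical): when max|b|Δt/Δw1\max|b|\,\Delta t/\Delta w\leq 1, each new cell value is a non-negative combination of old values, so non-negative data stay non-negative.

```python
import numpy as np

# Explicit first-order upwind step for dH/dt + d(b(w)H)/dw = 0, a stand-in
# for the w-convection part; b(w) is a hypothetical velocity field. With
# CFL = max|b|*dt/dw <= 1 each new cell value is a non-negative
# combination of old values, hence positivity is preserved.
nw, dw = 100, 0.02
w = -1.0 + dw * np.arange(nw)
b = -w                                   # toy drift, mimicking the -w term
dt = 0.9 * dw / np.abs(b).max()          # respect the CFL condition

H = np.maximum(0.0, 1.0 - (w + 0.5) ** 2 / 0.09)   # non-negative bump
bp, bm = np.maximum(b, 0.0), np.minimum(b, 0.0)
flux = np.zeros(nw + 1)                  # zero-flux boundaries
flux[1:-1] = bp[:-1] * H[:-1] + bm[1:] * H[1:]     # upwind interface flux
H_new = H - dt / dw * (flux[1:] - flux[:-1])

assert (H_new >= -1e-14).all()           # positivity under the CFL bound
assert np.isclose(H_new.sum(), H.sum())  # mass conserved as well
```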

In particular, since we can take ε\varepsilon arbitrarily small for a fixed Δt\Delta t, it becomes possible to further analyze the asymptotic behavior of the scheme as ε0\varepsilon\to 0.

3.4 Asymptotic preserving properties

We first discuss the asymptotic properties of the original continuous model (1.8).

Integrating the first equation of (1.8) over v\displaystyle v from \displaystyle-\infty to VF\displaystyle V_{F} and applying the boundary condition, we get

Ht+w[(N¯(t)N(w,t)K(w)w)H]=0,\frac{\partial H}{\partial t}+\frac{\partial}{\partial w}[(\overline{N}(t)N(w,t)K(w)-w)H]=0, (3.7)

where H=H(w,t)H=H(w,t) is defined in (1.10). When taking ε0\displaystyle\varepsilon\rightarrow 0 in (1.8), we have

{a2pv2v[(v+I(w)+wσ(N¯(t)))p]+N(w,t)δ(vVR)=0,N(w,t):=apv(VF0,w,t),N¯(t)=N(w,t)dw.\begin{cases}\displaystyle a\frac{\partial^{2}p}{\partial v^{2}}-\frac{\partial}{\partial v}[(-v+I(w)+w\sigma(\overline{N}(t)))p]+N(w,t)\delta(v-V_{R})=0,&\\ {\displaystyle N(w,t):=-a\frac{\partial p}{\partial v}(V_{F}-0,w,t),\ \ \overline{N}(t)=\int_{-\infty}^{\infty}N(w,t)\mathrm{d}w.}&\end{cases} (3.8)

According to Theorem 3.1 in [16], the solution of equation (3.8) is guaranteed to exist and be unique when H(w,t)\displaystyle H(w,t) is given. Equations (1.10), (3.7) and (3.8) give the asymptotic limit equation of (1.8) as ε0\displaystyle\varepsilon\rightarrow 0, where (3.7) can be viewed as the macroscopic equation governing the slow dynamics while (3.8) gives the local microscopic equilibrium of the fast dynamics. We now consider our scheme with only time t\displaystyle t discretized:

{pn+1pnΔt+w[(N¯nNn(w)K(w)w)pn]=1ε{a2pn+1v2v[(v+I(w)+wσ(N¯α))pn+1]+Nn+1(w)δ(vVR)}for (v,w,t)(,VF)×(,+)×(0,T)Nn+1(w):=apv(VF0,w,t),N¯n+1=Nn+1(w)dw;\begin{cases}\begin{aligned} \frac{p^{n+1}-p^{n}}{\Delta t}+\frac{\partial}{\partial w}[(\overline{N}^{n}N^{n}(w)K(w)-w)&p^{n}]\\ =\frac{1}{\varepsilon}\Bigl{\{}a\frac{\partial^{2}p^{n+1}}{\partial v^{2}}-\frac{\partial}{\partial v}[(-v+I(w)+&w\sigma(\overline{N}^{\alpha}))p^{n+1}]+N^{n+1}(w)\delta(v-V_{R})\Bigr{\}}\\ \text{for }(v,w,t)&\in(-\infty,V_{F})\times(-\infty,+\infty)\times(0,T)\end{aligned}\\ {\displaystyle N^{n+1}(w):=-a\frac{\partial p}{\partial v}(V_{F}-0,w,t),\ \ \overline{N}^{n+1}=\int_{-\infty}^{\infty}N^{n+1}(w)\mathrm{d}w;}\end{cases} (3.9)

where α\displaystyle\alpha is n\displaystyle n and n+1\displaystyle n+1 for the v\displaystyle v-SI scheme and v\displaystyle v-FI scheme, respectively.

For the v\displaystyle v-FI scheme, integrating the first equation in (3.9) and applying the boundary conditions, we have

Hn+1HnΔt+w[(N¯nNn(w)K(w)w)Hn]=0\frac{H^{n+1}-H^{n}}{\Delta t}+\frac{\partial}{\partial w}\left[(\overline{N}^{n}N^{n}(w)K(w)-w)H^{n}\right]=0 (3.10)

which is a discretization of (3.7); and when ε0\displaystyle\varepsilon\rightarrow 0, (3.9) becomes

{a2pn+1v2v[(v+I(w)+wσ(N¯n+1))pn+1]+Nn+1(w)δ(vVR)=0Nn+1(w):=apv(VF0,w,t),N¯n+1=Nn+1(w)dw,\begin{cases}\displaystyle a\frac{\partial^{2}p^{n+1}}{\partial v^{2}}-\frac{\partial}{\partial v}[(-v+I(w)+w\sigma(\overline{N}^{n+1}))p^{n+1}]+N^{n+1}(w)\delta(v-V_{R})=0&\\ {\displaystyle N^{n+1}(w):=-a\frac{\partial p}{\partial v}(V_{F}-0,w,t),\ \ \overline{N}^{n+1}=\int_{-\infty}^{\infty}N^{n+1}(w)\mathrm{d}w,}&\end{cases} (3.11)

which coincides exactly with (3.8).

For the v\displaystyle v-SI scheme, integrating the first equation in (3.9) and applying the boundary conditions, we obtain (3.10) again, while taking ε0\displaystyle\varepsilon\rightarrow 0 in (3.9) yields

{a2pn+1v2v[(v+I(w)+wσ(N¯n))pn+1]+Nn+1(w)δ(vVR)=0Nn+1(w):=apv(VF0,w,t),N¯n+1=Nn+1(w)dw.\begin{cases}\displaystyle a\frac{\partial^{2}p^{n+1}}{\partial v^{2}}-\frac{\partial}{\partial v}[(-v+I(w)+w\sigma(\overline{N}^{n}))p^{n+1}]+N^{n+1}(w)\delta(v-V_{R})=0&\\ {\displaystyle N^{n+1}(w):=-a\frac{\partial p}{\partial v}(V_{F}-0,w,t),\ \ \overline{N}^{n+1}=\int_{-\infty}^{\infty}N^{n+1}(w)\mathrm{d}w.}&\end{cases} (3.12)

We remark that the local equilibrium (3.12) can be viewed as a linearization of (3.11) in which the nonlinear coefficient σ(N¯n+1)\sigma(\overline{N}^{n+1}) is replaced by σ(N¯n)\sigma(\overline{N}^{n}).

The schemes (3.10) and (3.12) can be viewed as the limiting scheme, as ε0\varepsilon\rightarrow 0, of the following time-rescaled equation with a “delayed” coefficient:

{pt+w[(N¯(t)N(w,t)K(w)w)p]=1ε{a2pv2v[(v+I(w)+wσ(N¯(tΔt)))p]+N(w,t)δ(vVR)}for (v,w,t)(,VF)×(,+)×(0,T)N(w):=apv(VF0,w,t),N¯=N(w)dw.\begin{cases}\begin{aligned} \frac{\partial p}{\partial t}+\frac{\partial}{\partial w}[(\overline{N}(t)N(w,t)K(w)&-w)p]\\ =\frac{1}{\varepsilon}\Bigl{\{}a\frac{\partial^{2}p}{\partial v^{2}}-\frac{\partial}{\partial v}[(-v+&I(w)+w\sigma(\overline{N}(t-\Delta t)))p]+N(w,t)\delta(v-V_{R})\Bigr{\}}\\ \text{for }(v,w,t)&\in(-\infty,V_{F})\times(-\infty,+\infty)\times(0,T)\end{aligned}\\ {\displaystyle N(w):=-a\frac{\partial p}{\partial v}(V_{F}-0,w,t),\ \ \overline{N}=\int_{-\infty}^{\infty}N(w)\mathrm{d}w.}\end{cases} (3.13)

Therefore, the v\displaystyle v-SI scheme is asymptotic preserving up to a Δt\displaystyle\Delta t time delay. In other words, as ε0\varepsilon\rightarrow 0, it remains a consistent first-order-in-time discretization of the limit of the original model.

4 Numerical Results

This section presents the numerical results obtained from our schemes, involving both tests of the schemes’ performance and explorations of the model using our scheme. In Section 4.1, we numerically test the order of accuracy of our schemes; in Section 4.2, we numerically validate the asymptotic preserving properties of our schemes; in Section 4.3, we test the model’s learning and discrimination abilities; in Section 4.4, we focus on exploring the model’s behavior when the neural system is excitatory.

4.1 Order of Convergence

In this part, we test the order of accuracy of the vv-SI scheme. Since the exact solution is unavailable, we estimate the order of the error by

Oh,Lp=log2ωhωh2pωh2ωh4p,O_{h,L^{p}}=\log_{2}\frac{\left\|\omega_{h}-\omega_{\frac{h}{2}}\right\|_{p}}{\left\|\omega_{\frac{h}{2}}-\omega_{\frac{h}{4}}\right\|_{p}},

where ωh\omega_{h} is the numerical solution with step length hh. The quantity Oh,LpO_{h,L^{p}} above approximates the order of accuracy. Errors in both the L1L^{1} and L2L^{2} norms are examined.
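For concreteness, the order estimate above can be computed as follows (the difference norms are synthetic placeholders for an ideal second-order method):

```python
import math

# Observed order O_{h,Lp} = log2(||w_h - w_{h/2}|| / ||w_{h/2} - w_{h/4}||):
# for a p-th order method successive differences shrink by 2^p, so the
# estimate recovers p. The values below are synthetic.
def observed_order(diff_h, diff_h2):
    """diff_h = ||w_h - w_{h/2}||, diff_h2 = ||w_{h/2} - w_{h/4}||."""
    return math.log2(diff_h / diff_h2)

diffs = [1.6e-2, 4.0e-3, 1.0e-3]   # norms at h, h/2, h/4
orders = [observed_order(a, b) for a, b in zip(diffs, diffs[1:])]
assert all(abs(o - 2.0) < 1e-12 for o in orders)   # recovers order 2
```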

We choose VF=2,VR=1,Vmin=4,a=1,ε=0.5,Wmin=1.1,Wmax=0.1,σ(N¯)=N¯V_{F}=2,V_{R}=1,V_{\min}=-4,a=1,\varepsilon=0.5,W_{\min}=-1.1,W_{\max}=0.1,\sigma(\overline{N})=\overline{N}, I(w)0I(w)\equiv 0 and

pinit={sin2(πv)sin2(πw)1<w<0 and 1<v<10otherwisep_{\text{init}}=\begin{cases}\displaystyle\sin^{2}(\pi v)\sin^{2}(\pi w)&-1<w<0\text{ and }-1<v<1\\ 0&\text{otherwise}\end{cases}

throughout this section, and we test the accuracy for both Tmax=0.1T_{\max}=0.1 and Tmax=2.5T_{\max}=2.5. We test the numerical results in the vv, ww and tt directions by fixing two of Δt,Δw\Delta t,\Delta w and Δv\Delta v while refining the third. The results are shown in Table 1, Table 2 and Table 3, respectively.

Δv\displaystyle\Delta v pΔvpΔv/21\displaystyle\|p_{\Delta v}-p_{\Delta v/2}\|_{1} OΔv,L1\displaystyle O_{\Delta v,L^{1}} pΔvpΔv/22\displaystyle\|p_{\Delta v}-p_{\Delta v/2}\|_{2} OΔv,L2\displaystyle O_{\Delta v,L^{2}}
2.0e-1 1.1337e-03 2.0818 3.5479e-03 2.0675
1.0e-1 2.6781e-04 2.0122 8.4643e-04 2.0080
5.0e-2 6.6390e-05 1.9340 2.1044e-04 1.8739
2.5e-2 1.7374e-05 - 5.7417e-05 -
Δv\displaystyle\Delta v pΔvpΔv/21\displaystyle\|p_{\Delta v}-p_{\Delta v/2}\|_{1} OΔv,L1\displaystyle O_{\Delta v,L^{1}} pΔvpΔv/22\displaystyle\|p_{\Delta v}-p_{\Delta v/2}\|_{2} OΔv,L2\displaystyle O_{\Delta v,L^{2}}
2.0e-1 1.4571e-01 1.8905 2.8929e-01 1.7984
1.0e-1 3.9299e-02 1.7441 8.3168e-02 1.7309
5.0e-2 1.1731e-02 1.7872 2.5056e-02 1.8471
2.5e-2 3.3990e-03 - 6.9641e-03 -
Table 1: vv-wise order of accuracy, Δt=103,Δw=102\displaystyle\Delta t=10^{-3},\ \Delta w=10^{-2} are fixed. Upper table: Tmax=0.1T_{\max}=0.1; Lower table: Tmax=2.5T_{\max}=2.5.
Δw\displaystyle\Delta w pΔwpΔw/21\displaystyle\|p_{\Delta w}-p_{\Delta w/2}\|_{1} OΔw,L1\displaystyle O_{\Delta w,L^{1}} pΔwpΔw/22\displaystyle\|p_{\Delta w}-p_{\Delta w/2}\|_{2} OΔw,L2\displaystyle O_{\Delta w,L^{2}}
4.0e-2 4.2858e-03 0.9550 9.9587e-03 0.9543
2.0e-2 2.2108e-03 1.0038 5.1394e-03 1.0030
1.0e-2 1.1025e-03 0.9849 2.5643e-03 0.9801
0.5e-2 5.5703e-04 - 1.3000e-03 -
Δw\displaystyle\Delta w pΔwpΔw/21\displaystyle\|p_{\Delta w}-p_{\Delta w/2}\|_{1} OΔw,L1\displaystyle O_{\Delta w,L^{1}} pΔwpΔw/22\displaystyle\|p_{\Delta w}-p_{\Delta w/2}\|_{2} OΔw,L2\displaystyle O_{\Delta w,L^{2}}
4.0e-2 4.5348e-01 1.0102 3.9149e-01 0.7353
2.0e-2 2.2514e-01 1.0095 2.3516e-01 0.7557
1.0e-2 1.1183e-01 0.9717 1.3928e-01 0.6145
0.5e-2 5.7022e-02 - 9.0971e-02 -
Table 2: ww-wise order of accuracy, Δt=103,Δv=101\Delta t=10^{-3},\ \Delta v=10^{-1} are fixed. Upper table: Tmax=0.1T_{\max}=0.1; Lower table: Tmax=2.5T_{\max}=2.5.
Δt\displaystyle\Delta t pΔtpΔt/21\displaystyle\|p_{\Delta t}-p_{\Delta t/2}\|_{1} OΔt,L1\displaystyle O_{\Delta t,L^{1}} pΔtpΔt/22\displaystyle\|p_{\Delta t}-p_{\Delta t/2}\|_{2} OΔt,L2\displaystyle O_{\Delta t,L^{2}}
2.0e-3 3.8523e-04 0.9730 1.2219e-03 0.9647
1.0e-3 1.9625e-04 0.9686 6.2608e-04 0.9626
5.0e-4 1.0028e-04 1.0093 3.2125e-04 1.0089
2.5e-4 4.9819e-05 - 1.5963e-04 -
Δt\displaystyle\Delta t pΔtpΔt/21\displaystyle\|p_{\Delta t}-p_{\Delta t/2}\|_{1} OΔt,L1\displaystyle O_{\Delta t,L^{1}} pΔtpΔt/22\displaystyle\|p_{\Delta t}-p_{\Delta t/2}\|_{2} OΔt,L2\displaystyle O_{\Delta t,L^{2}}
2.0e-3 4.9612e-03 0.9473 4.5521e-03 0.9676
1.0e-3 2.5729e-03 0.9884 2.3278e-03 0.9954
5.0e-4 1.2969e-03 0.9781 1.1677e-03 0.9895
2.5e-4 6.5836e-04 - 5.8810e-04 -
Table 3: tt-wise order of accuracy, Δv=101,Δw=102\Delta v=10^{-1},\ \Delta w=10^{-2} are fixed. Upper table: Tmax=0.1T_{\max}=0.1; Lower table: Tmax=2.5T_{\max}=2.5.

It can be seen that when Tmax=0.1T_{\max}=0.1, the scheme shows the expected second-order convergence rate in the vv direction, and first-order convergence in the ww and tt directions. However, when Tmax=2.5T_{\max}=2.5, the vv- and ww-wise orders of accuracy decrease somewhat. This may be caused by the ww-wise nonlinearity of equation (1.8). Indeed, by T=2.5T=2.5 the solution has already developed a discontinuous pattern (Figure 1).

Refer to caption
Figure 1: When t=2.5t=2.5, there is a discontinuity in the numerical solution. Left: p(v,w,t=2.5)p(v,w,t=2.5), with the marked data showing that small variance along the ww direction causes a big variance of pp; Right: N(w,t=2.5)N(w,t=2.5).

4.2 Asymptotic behaviors of the vv-SI and vv-FI schemes

In this subsection, we numerically validate the asymptotic preserving properties of both the vv-SI scheme and the vv-FI scheme by verifying that, with fixed grid sizes, the numerical solution converges to the vv-wise quasi-steady state at the discrete level as ε0\varepsilon\to 0.

We first introduce a discretized version of the time-independent quasi-steady state equations (1.10) and (3.8) for a given H=H(w)H=H(w); this will serve as a reference solution. We denote the solution of the continuous time-independent quasi-steady state equations by PH(v,w)P^{H}(v,w). Since (3.8) means that PH(v,w)P^{H}(v,w) should nullify the stiff operator in (1.8), our discretized version can likewise be defined to nullify the stiff terms (the order-1/ε1/\varepsilon terms) in the main equation (2.7) of our scheme. For a given discretized grid function HjH_{j} such that Δwj=0nwHj=1\Delta w\sum_{j=0}^{n_{w}}H_{j}=1, the vv-wise quasi-steady state at the discrete level Pi,jHP_{i,j}^{H} and the corresponding NjH,N¯HN_{j}^{H},\overline{N}^{H} are defined to satisfy the following equations

Δvi=0nv1Pi,jH=Hjforj=0,1,,nw,\Delta v\sum_{i=0}^{n_{v}-1}P_{i,j}^{H}=H_{j}\ \ \ \text{for}\ j=0,1,\cdots,n_{w}, (4.1)
Fi+1/2,jHFi1/2,jH=0F_{i+1/2,j}^{H}-F_{i-1/2,j}^{H}=0 (4.2)
Fi+1/2,jH={aMi+1/2,jN¯HΔv(Pi+1,jHMi+1,jN¯HPi,jHMi,jN¯H)+NjHη(vi+1/2VR)i=0,1,,nv20i=1,nv1F_{i+1/2,j}^{H}=\begin{cases}a\frac{M_{i+1/2,j}^{\overline{N}^{H}}}{\Delta v}\left(\frac{P_{i+1,j}^{H}}{M_{i+1,j}^{\overline{N}^{H}}}-\frac{P_{i,j}^{H}}{M_{i,j}^{\overline{N}^{H}}}\right)+N_{j}^{H}\eta(v_{i+1/2}-V_{R})&i=0,1,\cdots,n_{v}-2\\ 0&i=-1,n_{v}-1\end{cases}\quad (4.3)
NjH=aPnv1,jH/Δv,N¯H=Δwj=0nwNjHN_{j}^{H}=aP_{n_{v}-1,j}^{H}/\Delta v,\ \overline{N}^{H}=\Delta w\sum_{j=0}^{n_{w}}N_{j}^{H} (4.4)

To generate this discretized vv-wise quasi-steady state for a given HjH_{j} with Δwj=0nwHj=1\Delta w\sum_{j=0}^{n_{w}}H_{j}=1, one may solve the system

{𝑴jN¯(k)𝑷:,j(k+1)=𝟎andΔvi=0nv1Pi,j(k+1)=Hjforj=0,1,,nwN¯(k+1)=Δwj=0nwNj(k+1)=Δwj=0nwaPnv1,j(k+1)Δv\begin{cases}{\displaystyle\boldsymbol{M}_{j}^{\overline{N}^{(k)}}\boldsymbol{P}_{:,j}^{(k+1)}=\boldsymbol{0}\ \text{and}\ \Delta v\sum_{i=0}^{n_{v}-1}{P}_{i,j}^{(k+1)}=H_{j}\ \text{for}\ j=0,1,\cdots,n_{w}}&\\ {\displaystyle\overline{N}^{(k+1)}=\Delta w\sum_{j=0}^{n_{w}}N_{j}^{(k+1)}=\Delta w\sum_{j=0}^{n_{w}}\frac{aP_{n_{v}-1,j}^{(k+1)}}{\Delta v}}&\end{cases} (4.5)

iteratively from some initial guess N¯(0){\overline{N}}^{(0)}, and take Pi,jH=Pi,j(K),NjH=Nj(K),N¯H=N¯(K)P_{i,j}^{H}=P_{i,j}^{(K)},N_{j}^{H}=N_{j}^{(K)},{\overline{N}}^{H}={\overline{N}}^{(K)} for a sufficiently large KK satisfying a stopping criterion such as |N¯(K+1)N¯(K)|ϵ|{\overline{N}}^{(K+1)}-{\overline{N}}^{(K)}|\leq\epsilon for a prescribed tolerance ϵ\epsilon.
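A skeleton of this fixed-point loop, with a toy contraction standing in for the kernel solve of (4.5) (the update map and its fixed point below are hypothetical):

```python
# Skeleton of the fixed-point loop behind (4.5): "update" abstracts one
# sweep of solving M_j^Nbar P = 0 under the mass constraint and reading
# off the new Nbar (hypothetical here). The stopping rule mirrors
# |Nbar^(K+1) - Nbar^(K)| <= eps in the text.
def quasi_steady_fixed_point(update, nbar0, tol=1e-10, max_iter=1000):
    nbar = nbar0
    for _ in range(max_iter):
        nbar_new = update(nbar)
        if abs(nbar_new - nbar) <= tol:
            return nbar_new
        nbar = nbar_new
    raise RuntimeError("fixed-point iteration did not converge")

# Toy stand-in update: a contraction with fixed point Nbar* = 1.
nbar_star = quasi_steady_fixed_point(lambda n: 0.5 * n + 0.5, nbar0=0.0)
assert abs(nbar_star - 1.0) < 1e-9
```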

The numerical experiments are conducted as follows:

  1.

    Fix all the scheme parameters except ε\varepsilon (these include all the grid lengths, initial/boundary values and coefficients; in particular, Δt\Delta t is fixed), and solve the scheme. The solution for a fixed ε\varepsilon is denoted as pεp_{\varepsilon}.

  2.

    Having solved for pεp_{\varepsilon}, find HεH_{\varepsilon} by (2.6), and then the corresponding discretized vv-wise quasi-steady state PHεP^{H_{\varepsilon}} by (4.1)–(4.4), which can be obtained numerically via the iterations described in (4.5);

  3.

    Plot pεmPHεm\|p_{\varepsilon}^{m}-P^{H_{\varepsilon}^{m}}\| versus tmt^{m} for each ε\varepsilon (the norm is taken at a single time layer, chosen as the L1L^{1} norm in our numerical tests). The AP property means that after a short initial period (whose length decreases with ε\varepsilon), the difference between the solution pεmp_{\varepsilon}^{m} and the corresponding discretized vv-wise quasi-steady state PHεmP^{H_{\varepsilon}^{m}} drops to a small value that vanishes as ε0\varepsilon\to 0, i.e.

    pεmPHεmε00.\|p_{\varepsilon}^{m}-P^{H_{\varepsilon}^{m}}\|\xrightarrow{\varepsilon\rightarrow 0}0. (4.6)
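The single-time-layer L1L^{1} norm used here is the discrete quantity ΔvΔwi,j|pi,jPi,j|\Delta v\Delta w\sum_{i,j}|p_{i,j}-P_{i,j}|; a minimal sketch with placeholder grid data:

```python
import numpy as np

# Single-time-layer L1 difference: ||p - P||_1 = dv*dw*sum_{i,j}|p - P|.
# The grids below are placeholder data, not scheme output.
def l1_diff(p, P, dv, dw):
    return dv * dw * np.abs(p - P).sum()

rng = np.random.default_rng(1)
p = rng.random((40, 110))
P = p + 1e-3 * rng.standard_normal(p.shape)   # nearby "quasi-steady state"
assert l1_diff(p, p, 0.1, 0.01) == 0.0        # identical layers: zero norm
assert l1_diff(p, P, 0.1, 0.01) < 1e-2        # nearby layers: small norm
```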

In our simulations, we choose VF=2,VR=1,Vmin=4,a=1,Wmin=1.1,Wmax=0.1,Tmax=0.3,σ(N¯)=N¯,Δt=5×104,Δv=0.1,Δw=0.01V_{F}=2,V_{R}=1,V_{\min}=-4,a=1,W_{\min}=-1.1,W_{\max}=0.1,T_{\max}=0.3,\sigma(\overline{N})=\overline{N},\Delta t=5\times 10^{-4},\Delta v=0.1,\Delta w=0.01 ,

pinit={sin2(πv)sin2(πw)1<w<0 and 1<v<10otherwisep_{\text{init}}=\begin{cases}\displaystyle\sin^{2}(\pi v)\sin^{2}(\pi w)&-1<w<0\text{ and }-1<v<1\\ 0&\text{otherwise}\end{cases} (4.7)

and

I(w)=12e(10w+5)2I(w)=\frac{1}{2}\mathrm{e}^{-(10w+5)^{2}} (4.8)

We perform the aforementioned AP test for both the vv-FI scheme and the vv-SI scheme, choosing ε\varepsilon ranging from 10110^{-1} to 10710^{-7}, and the results are shown in Figure 2 and Figure 3, respectively.

Refer to caption
Figure 2: AP property of the vv-FI scheme. pεFI;mPHεFI;m\|p_{\varepsilon}^{\text{FI};m}-P^{H_{\varepsilon}^{\text{FI};m}}\| is plotted over t[0,T]t\in[0,T], discretized with the time step Δt=5×104\Delta t=5\times 10^{-4}. The discretized quasi-steady state PHεFI;mP^{H_{\varepsilon}^{\text{FI};m}} is obtained by the iterations described in (4.5). It can be roughly seen that when ε\varepsilon is small, pεFI;mPHεFI;m=o(ε)\|p_{\varepsilon}^{\text{FI};m}-P^{H_{\varepsilon}^{\text{FI};m}}\|=o(\varepsilon) after a few time steps.
Refer to caption
Figure 3: AP property of the vv-SI scheme. pεSI;mPHεSI;m\|p_{\varepsilon}^{\text{SI};m}-P^{H_{\varepsilon}^{\text{SI};m}}\| is plotted over t[0,T]t\in[0,T], discretized with the grid size Δt=5×104\Delta t=5\times 10^{-4} (left) and 5×1035\times 10^{-3} (right). The discretized quasi-steady state PHεSI;mP^{H_{\varepsilon}^{\text{SI};m}} is obtained by the iterations described in (4.5). Comparing to the vv-FI scheme, the property pεSI;mPHεSI;m\|p_{\varepsilon}^{\text{SI};m}-P^{H_{\varepsilon}^{\text{SI};m}}\| is of order ε\varepsilon when ε\varepsilon is comparable to or greater than Δt\Delta t. When εΔt\varepsilon\ll\Delta t (which is the case when ε=105,106,107\varepsilon=10^{-5},10^{-6},10^{-7} in the figure, represented by the green, cyan and blue curves, respectively), pεSI;mPHεSI;m\|p_{\varepsilon}^{\text{SI};m}-P^{H_{\varepsilon}^{\text{SI};m}}\| becomes roughly of order of Δt\Delta t, so smaller ε\varepsilon does not make this difference uniformly smaller as it does for the vv-FI scheme.

From Figure 2, we can see that for the vv-FI scheme, the difference between pεmp_{\varepsilon}^{m} and PHεmP^{H_{\varepsilon}^{m}} decreases as ε0\varepsilon\to 0; the figure even suggests pεmPHεm=o(ε)\|p_{\varepsilon}^{m}-P^{H_{\varepsilon}^{m}}\|=o(\varepsilon).

From Figure 3, we can see that for the vv-SI scheme, the difference between pεmp_{\varepsilon}^{m} and PHεmP^{H_{\varepsilon}^{m}} also decreases as ε0\varepsilon\to 0. However, this decreasing trend halts at ε=105\varepsilon=10^{-5} for Δt=5×104\Delta t=5\times 10^{-4} and at ε=104\varepsilon=10^{-4} for Δt=5×103\Delta t=5\times 10^{-3}, and further decreasing ε\varepsilon does not uniformly draw pεmp_{\varepsilon}^{m} and PHεmP^{H_{\varepsilon}^{m}} closer. This can be explained by the time-delay effect of the vv-SI scheme presented in Section 3.4: the delay introduces an o(Δt)o(\Delta t) difference between pεmp_{\varepsilon}^{m} and PHεmP^{H_{\varepsilon}^{m}} that does not decrease further once ε\varepsilon becomes small compared to Δt\Delta t. We may say that pεmPHεm=o(Δt)\|p_{\varepsilon}^{m}-P^{H_{\varepsilon}^{m}}\|=o(\Delta t) when εΔt\varepsilon\ll\Delta t and pεmPHεm=o(ε)\|p_{\varepsilon}^{m}-P^{H_{\varepsilon}^{m}}\|=o(\varepsilon) when εΔt\varepsilon\gg\Delta t for the vv-SI scheme.

4.3 Learning and reacting

From this subsection on, we explore model (1.8) with our scheme. We always use the vv-SI scheme in the rest of this section.

In this subsection, we test the model’s learning and discrimination abilities. We let the model perform a learning-testing task originally proposed in [16], consisting of a learning phase and a testing phase:

Learning Phase A heterogeneous input I(w)I(w) is presented to the system while the learning process is active. For the inhibitory weights, the learning rule is determined by N(w)N¯-N(w)\overline{N}, obtained by taking K(w)=1K(w)=-1 for w0w\leq 0. After some time, the synaptic weight distribution H(w,t)H(w,t) converges to an equilibrium distribution HI(w)H_{I}^{*}(w), which depends on II.

Testing Phase A new input J(w)J(w) is presented to the system. This time, the neural system reacts to JJ but does not learn it; in this stage, there is only vv-wise transport and no ww-wise convection. To achieve this, we run the vv-wise quasi-steady state solver (4.5) with II replaced by the testing signal JJ, while keeping the HI(w)H_{I}^{*}(w) obtained in the Learning Phase.

The pattern recognition property of the system is that it can detect whether the new input J(w)J(w) is the same one presented during the learning phase, i.e. I(w)I(w): indeed, in this case, NI,J(w)=w𝟏[A,0)N_{I,J}^{*}(w)=w\boldsymbol{1}[-A,0) has a very specific shape that does not depend on the original input II learned in the learning phase.

For the learning phase, we implement the vv-SI scheme; for the testing phase, we implement the vv-wise quasi-steady state solver (4.5), replacing the input function I(w)I(w) with the testing input function J(w)J(w).

In our numerical experiments, the learning and testing input functions are chosen from the set

{ψi(10w+5)+1|i=0,1,,4}\{\psi_{i}(10w+5)+1|i=0,1,\cdots,4\} (4.9)

where ψi(x)\psi_{i}(x) stands for the ii-th normalized Hermite function defined recursively as

ψ0(y)π14e12y2;ψ1(y)π142ye12y2ψn+1(y)=2n+1yψn(y)nn+1ψn1(y).\begin{array}[]{c}\displaystyle\psi_{0}(y)\equiv\pi^{-\frac{1}{4}}\mathrm{e}^{-\frac{1}{2}y^{2}};\quad\psi_{1}(y)\equiv\pi^{-\frac{1}{4}}\sqrt{2}y\mathrm{e}^{-\frac{1}{2}y^{2}}\\ \displaystyle\psi_{n+1}(y)=\sqrt{\frac{2}{n+1}}y\psi_{n}(y)-\sqrt{\frac{n}{n+1}}\psi_{n-1}(y).\end{array} (4.10)
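The recursion (4.10) can be implemented directly; the sketch below builds ψ0,,ψ4\psi_{0},\cdots,\psi_{4} on a grid and checks orthonormality with a simple Riemann sum:

```python
import numpy as np

# Normalized Hermite functions via the three-term recursion (4.10),
# checked for orthonormality with a simple Riemann sum on a wide grid
# (the functions decay like exp(-y^2/2), so truncation error is tiny).
def hermite_functions(n_max, y):
    psi = [np.pi ** -0.25 * np.exp(-0.5 * y ** 2)]
    if n_max >= 1:
        psi.append(np.pi ** -0.25 * np.sqrt(2.0) * y * np.exp(-0.5 * y ** 2))
    for n in range(1, n_max):
        psi.append(np.sqrt(2.0 / (n + 1)) * y * psi[n]
                   - np.sqrt(n / (n + 1)) * psi[n - 1])
    return np.array(psi)

y = np.linspace(-10.0, 10.0, 4001)
psi = hermite_functions(4, y)          # psi_0, ..., psi_4 on the grid
G = (y[1] - y[0]) * psi @ psi.T        # Gram matrix of inner products
assert np.allclose(G, np.eye(5), atol=1e-6)   # orthonormal family
```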

This set of functions is well known to form an orthonormal basis of the Hilbert space L2(,+)L^{2}(-\infty,+\infty). We take the argument 10w+510w+5 so that the input functions are centered at w=1/2w=-1/2 and almost entirely supported in our truncated solving region; the extra “+1+1” term widens the support of the solution so that it can react to input signals over a wider range of ww. We choose VF=2,VR=1,Vmin=4,a=1,ε=0.1,Wmin=1.1,Wmax=0.1,Tmax=5,σ(N¯)=N¯,Δt=0.005,Δv=0.1,Δw=0.01V_{F}=2,V_{R}=1,V_{\min}=-4,a=1,\varepsilon=0.1,W_{\min}=-1.1,W_{\max}=0.1,T_{\max}=5,\sigma(\overline{N})=\overline{N},\Delta t=0.005,\Delta v=0.1,\Delta w=0.01, and pinitp_{\text{init}} is given as (4.7).

Refer to caption
Figure 4: The candidate input functions ψi(10w+5)+1\psi_{i}(10w+5)+1 where i=0,1,2,3,4i=0,1,2,3,4 and ψi\psi_{i} stands for the ii-th normalized Hermite function.
Refer to caption
Figure 5: Test of the neural network’s recognition property. The pattern recognition task described at the beginning of Subsection 4.3 is performed for all possible pairs of input functions given in (4.9). For each i,j=0,1,,4i,j=0,1,\cdots,4, the subfigure in the (i+1)(i+1)-th row, (j+1)(j+1)-th column gives the final firing pattern NI,J(w)N_{I,J}^{*}(w) after learning the signal I(w)=ψi(10w+5)+1I(w)=\psi_{i}(10w+5)+1 and reacting to the signal J(w)=ψj(10w+5)+1J(w)=\psi_{j}(10w+5)+1. In each subfigure, the horizontal coordinate represents ww, and the vertical axis represents N(w)N(w) at the end of the task. It can be seen that when the learned input I(w)I(w) and the testing input J(w)J(w) are the same (the diagonal entries in the figure), the firing pattern has the nice triangular shape NI,J(w)=w𝟏[A,0)N_{I,J}^{*}(w)=w\boldsymbol{1}[-A,0); but when I(w)I(w) and J(w)J(w) are different, the firing pattern is not regular-shaped.

For each i,j=0,1,,4i,j=0,1,\cdots,4, we let the model perform the aforementioned task with I(w)=ψi(10w+5)+1I(w)=\psi_{i}(10w+5)+1 and J(w)=ψj(10w+5)+1J(w)=\psi_{j}(10w+5)+1, and the corresponding network activity N(w)N(w) is shown in the (i+1)(i+1)-th row, (j+1)(j+1)-th column of Figure 5. It can be seen that when the learned input I(w)I(w) and the testing input J(w)J(w) are the same (the diagonal entries in the figure), the firing pattern has the nice triangular shape NI,J(w)=w𝟏[A,0)N_{I,J}^{*}(w)=w\boldsymbol{1}[-A,0); but when I(w)I(w) and J(w)J(w) are different, the firing pattern is not in a regular shape. It is worth noting that in [16], the authors provided a primitive numerical result for the same learning-testing task; the network’s firing pattern after reacting to the same learned input signal, shown in Figure 3 of that article, turned out to be quite zigzag. In comparison, our numerical method produces an almost-perfect triangular-shaped pattern after the model learns and reacts to the same input (see the subfigures on the diagonal of Figure 5).

4.4 Positive synaptic weights

This subsection focuses on exploring the model’s behavior when the neural system is excitatory, i.e. the solution is supported in the region with w0w\geq 0. In [16], the authors’ analytic work focuses mainly on inhibitory networks; for the excitatory neural system, they showed that, under some choices of parameters, a steady-state solution may fail to exist. We further explore with our numerical scheme the behavior of the solution when the synaptic weights are positive.

Our numerical exploration shows that in the excitatory case, depending on the parameter settings, the network can either develop to a steady state as it does in the inhibitory case, or exhibit a trend of accelerating expansion that has no inhibitory counterpart.

4.4.1 Long-time steady solution

We first test some scenarios in which the solutions behave similarly to the case of negative synaptic weights ww.

We choose I1I\equiv 1, a=1a=1, ε=0.2\varepsilon=0.2, σ(N¯)=N¯/(1+N¯)\sigma(\overline{N})=\overline{N}/(1+\overline{N}), Δt=5×103,Δv=0.1,Δw=0.01\Delta t=5\times 10^{-3},\Delta v=0.1,\Delta w=0.01, Tmax=5T_{\max}=5, K(w)1K(w)\equiv 1 and the initial condition given as (4.7). The solution finally develops into a steady state with N(w)N^{*}(w) forming a right-triangular pattern supported in a single interval w[0,A]w\in[0,A] for some A>0A>0, just as it did with negative synaptic weights. This case corresponds to the correct biological picture.

Refer to caption
Figure 6: Solution with positive synaptic weights. With I1I\equiv 1, a=1a=1, ε=0.2\varepsilon=0.2, initial condition given as (4.7), σ(N¯)=N¯/(1+N¯)\sigma(\overline{N})=\overline{N}/(1+\overline{N}), Δt=5×103,Δv=0.1,Δw=0.01\Delta t=5\times 10^{-3},\Delta v=0.1,\Delta w=0.01, Tmax=5T_{\max}=5, K(w)1K(w)\equiv 1. The solution finally develops into a steady state with N(w)N^{*}(w) forming the right-triangular pattern, just as it did with negative synaptic weights.


Indeed, as long as it can develop to a steady state, the excitatory neural network also has the recognition ability shown in Subsection 4.3. To show this, we perform a learning-testing task similar to that of the last subsection. The parameter settings are the same as in the last subsection, except that the initial condition pinitp_{\text{init}} is supported in the region with w0w\geq 0, K(w)1K(w)\equiv 1, σ(N¯)=N¯/(1+N¯)\sigma(\overline{N})=\overline{N}/(1+\overline{N}), and the candidate input functions become

ψi(10w5)+1i=0,1,,4\psi_{i}(10w-5)+1\quad i=0,1,\cdots,4 (4.11)

so that the inputs are centered at +1/2+1/2 instead of 1/2-1/2 as in the inhibitory case. The results are given in Figure 7; the arrangement of the subfigures is the same as that of Figure 5. It can be seen that the N(w)N(w)-versus-ww subfigures on the diagonal have the perfect right-triangular shape, mirroring the inhibitory results in Figure 5.

Figure 7: Test of the recognition property of excitatory neural networks. The pattern recognition task described at the beginning of subsection 4.3 is performed for all possible pairs of the input functions given in (4.11). For each i,j=0,1,,4i,j=0,1,\cdots,4, the subfigure at the (i+1)(i+1)-th row, (j+1)(j+1)-th column gives the final firing pattern NI,J(w)N_{I,J}^{*}(w) after learning the signal I(w)=ψi(10w5)+1I(w)=\psi_{i}(10w-5)+1 and reacting to the signal J(w)=ψj(10w5)+1J(w)=\psi_{j}(10w-5)+1. For each subfigure, the horizontal axis represents ww, and the vertical axis represents N(w)N(w) at the end of the task. It can be seen that, when the learned input I(w)I(w) and the testing input J(w)J(w) are the same (the diagonal entries in the figure), the firing pattern has the nice triangular shape NI,J(w)=w𝟏(0,A)N_{I,J}^{*}(w)=w\boldsymbol{1}(0,A); but when I(w)I(w) and J(w)J(w) differ, the firing pattern is not regular-shaped.


4.4.2 Unsteady solution

Figure 8: Solution with positive synaptic weight. With I(w)1I(w)\equiv 1, a=1a=1, ε=0.2\varepsilon=0.2, initial condition given by (4.7), σ(N¯)=3N¯/(1+N¯)\sigma(\overline{N})=3\overline{N}/(1+\overline{N}), Δt=5×103,Δv=0.1,Δw=0.01\Delta t=5\times 10^{-3},\Delta v=0.1,\Delta w=0.01, Tmax=5T_{\max}=5, K(w)1K(w)\equiv 1. Left: in the unsteady scenario, N¯\overline{N} grows at an accelerating rate. Right: the solution propagates toward the region with larger ww; the ww-wise right boundary of the support of NN grows increasingly fast.

Now we turn our attention to a different case, in which the solution does not converge to a steady state. In [16], it was shown that even if the firing function σ\sigma is chosen to be bounded,

σ(N¯)=kN¯1+N¯,\sigma(\overline{N})=k\frac{\overline{N}}{1+\overline{N}}, (4.12)

a steady state may still fail to exist in some scenarios. We further explore the solution's behavior in this case.
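As a quick illustration (a minimal sketch, not part of the scheme), one can check numerically that the firing function (4.12) with k=3k=3 is strictly increasing and bounded above by kk, so the absence of a steady state below is not caused by unboundedness of σ\sigma itself.

```python
import numpy as np

# A minimal sketch: the bounded firing function (4.12) with k = 3, the
# value used in this experiment. It saturates at k, so boundedness of
# sigma alone does not rule out the unbounded growth of Nbar observed here.
k = 3.0
def sigma(Nbar):
    return k * Nbar / (1.0 + Nbar)

Nbar_grid = np.linspace(0.0, 100.0, 1001)
vals = sigma(Nbar_grid)
assert np.all(np.diff(vals) > 0)   # strictly increasing in Nbar
assert np.all(vals < k)            # bounded above by k
print(vals[-1])                    # approaches k = 3 as Nbar grows
```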

We choose I(w)1I(w)\equiv 1, k=3k=3, and keep the remaining parameters the same as in the first numerical experiment in sub-subsection 4.4.1. The solution propagates toward the region with larger ww: the ww-wise right boundary of the support of NN grows increasingly fast (Figure 8, right), and hence N¯\overline{N} grows increasingly large (Figure 8, left). It should be noted that the result shown in Figure 8 does not necessarily imply a blow-up solution, since it is unclear whether N¯(t)\overline{N}(t) grows to \infty in finite time.

Even if the solution does not blow up, it is not practically meaningful for the network's synaptic weights to grow increasingly fast toward \infty. The solution's fast-growing behavior is therefore a non-physical effect of our model.

5 Conclusion

In this work, we have proposed a numerical scheme for the Fokker-Planck equation of structured neural networks (1.8), which is shown to be mass conservative, unconditionally positivity preserving, and asymptotic preserving.

The key to the success of our scheme is the implicit treatment of the flux-shift operators in (1.8), which endows the scheme with an improved positivity-preserving property: no vv-wise grid-ratio constraint is needed to guarantee positivity, a constraint that would otherwise be a major bottleneck for efficient simulation.

As for numerical exploration, we have conducted learning-testing experiments whose numerical solutions not only match the theoretical results but also demonstrate the recognition ability of the model and provide insights for potential uses and future studies. The numerical tests are also extended to excitatory neural networks. In addition, we have used numerical experiments to explore the model's behavior when a steady-state solution does not exist; these experiments capture partial long-time trends of the solution but remain preliminary. Further experiments could explore, for example, the model's behavior when the input signal II becomes time-dependent, or alternative learning rules that prevent N¯(t)\overline{N}(t) from growing to infinity in the excitatory case.

Acknowledgement

J. Hu’s research was partially supported by NSF CAREER grant DMS-1654152. Z. Zhou is supported by the National Key R&D Program of China, Project Number 2020YFA0712000 and NSFC grant No. 11801016, No. 12031013.

References

  • [1] Stephen J Martin, Paul D Grimwood, and Richard GM Morris. Synaptic plasticity and memory: an evaluation of the hypothesis. Annual review of neuroscience, 23(1):649–711, 2000.
  • [2] Alan L Hodgkin and Andrew F Huxley. A quantitative description of membrane current and its application to conduction and excitation in nerve. The Journal of physiology, 117(4):500, 1952.
  • [3] Richard FitzHugh. Impulses and physiological states in theoretical models of nerve membrane. Biophysical Journal, 1(6):445–466, 1961.
  • [4] Louis Lapicque. Recherches quantitatives sur l’excitation électrique des nerfs traitée comme une polarisation. J. Physiol. Pathol. Gen, 9(1):620–635, 1907.
  • [5] María J Cáceres, José A Carrillo, and Benoît Perthame. Analysis of nonlinear noisy integrate & fire neuron models: blow-up and steady states. The Journal of Mathematical Neuroscience, 1(1):1–33, 2011.
  • [6] José A Carrillo, María d M González, Maria P Gualdani, and Maria E Schonbek. Classical solutions for a nonlinear Fokker-Planck equation arising in computational neuroscience. Communications in Partial Differential Equations, 38(3):385–409, 2013.
  • [7] María J. Cáceres, Pierre Roux, Delphine Salort, and Ricarda Schneider. Global-in-time classical solutions and qualitative properties for the NNLIF neuron model with synaptic delay. Communications in Partial Differential Equations, 2018.
  • [8] Nicolas Brunel. Dynamics of sparsely connected networks of excitatory and inhibitory spiking neurons. Journal of computational neuroscience, 8(3):183–208, 2000.
  • [9] Grégory Dumont and Pierre Gabriel. The mean-field equation of a leaky integrate-and-fire neural network: measure solutions and steady states. Nonlinearity, 33, 10 2017.
  • [10] Nicolas Brunel and Vincent Hakim. Fast global oscillations in networks of integrate-and-fire neurons with low firing rates. Neural computation, 11(7):1621–1671, 1999.
  • [11] François Delarue, James Inglis, Sylvain Rubenthaler, and Etienne Tanré. Global solvability of a networked integrate-and-fire model of McKean-Vlasov type. The Annals of Applied Probability, 25(4):2096–2133, 2015.
  • [12] François Delarue, James Inglis, Sylvain Rubenthaler, and Etienne Tanré. Particle systems with a singular mean-field self-excitation. application to neuronal networks. Stochastic Processes and their Applications, 125(6):2451–2492, 2015.
  • [13] Jian-guo Liu, Ziheng Wang, Yuan Zhang, and Zhennan Zhou. Rigorous justification of the Fokker-Planck equations of neural networks based on an iteration perspective. arXiv preprint arXiv:2005.08285, 2020.
  • [14] Jian-Guo Liu, Ziheng Wang, Yantong Xie, Yuan Zhang, and Zhennan Zhou. Investigating the integrate and fire model as the limit of a random discharge model: a stochastic analysis perspective. Mathematical Neuroscience and Applications, 1, 2021.
  • [15] Donald Olding Hebb. The organization of behavior: A neuropsychological approach. John Wiley & Sons, 1949.
  • [16] Benoît Perthame, Delphine Salort, and Gilles Wainrib. Distributed synaptic weights in a LIF neural network and learning rules. Physica D: Nonlinear Phenomena, 353-354:20–30, Sep 2017.
  • [17] Shi Jin and Bokai Yan. A class of asymptotic-preserving schemes for the Fokker-Planck-Landau equation. Journal of Computational Physics, 230:6420–6437, 2011.
  • [18] Marianne Bessemoulin-Chatard and Francis Filbet. A finite volume scheme for nonlinear degenerate parabolic equations. SIAM Journal on Scientific Computing, 34(5):B559–B583, 2012.
  • [19] José A Carrillo, Alina Chertock, and Yanghong Huang. A finite-volume method for nonlinear nonlocal equations with a gradient flow structure. Communications in Computational Physics, 17(1):233–258, 2015.
  • [20] JS Chang and G Cooper. A practical difference scheme for Fokker-Planck equations. Journal of Computational Physics, 6(1):1–16, 1970.
  • [21] Lorenzo Pareschi and Mattia Zanella. Structure preserving schemes for nonlinear Fokker-Planck equations and applications. Journal of Scientific Computing, 74(3):1575–1600, 2018.
  • [22] Jingwei Hu and Xiangxiong Zhang. Positivity-preserving and energy-dissipative finite difference schemes for the Fokker-Planck and Keller-Segel equations. arXiv preprint arXiv:2103.16790, 2021.
  • [23] Sebastiano Boscarino, Francis Filbet, and Giovanni Russo. High order semi-implicit schemes for time dependent partial differential equations. Journal of Scientific Computing, 68(3):975–1001, 2016.
  • [24] Jian-Guo Liu, Li Wang, and Zhennan Zhou. Positivity-preserving and asymptotic preserving method for 2D Keller-Segel equations. Mathematics of Computation, 87(311):1165–1189, 2018.
  • [25] Jingwei Hu and Xiaodong Huang. A fully discrete positivity-preserving and energy-dissipative finite difference scheme for Poisson-Nernst-Planck equations. Numerische Mathematik, 145(1):77–115, 2020.
  • [26] Rafael Bailo, Jose A Carrillo, and Jingwei Hu. Fully discrete positivity-preserving and energy-dissipating schemes for aggregation-diffusion equations with a gradient flow structure. Commun. Math. Sci., 18:1259–1303, 2020.
  • [27] Yong Zhang, Yu Zhao, and Zhennan Zhou. A unified structure preserving scheme for a multispecies model with a gradient flow structure and nonlocal interactions via singular kernels. SIAM Journal on Scientific Computing, 43(3):B539–B569, 2021.
  • [28] Jingwei Hu and Ruiwen Shu. A second-order asymptotic-preserving and positivity-preserving exponential runge–kutta method for a class of stiff kinetic equations. Multiscale Modeling & Simulation, 17(4):1123–1146, 2019.
  • [29] Randall J. LeVeque. Numerical methods for conservation laws (2. ed.). Lectures in mathematics. Birkhäuser, 1992.
  • [30] Jingwei Hu, Jian-Guo Liu, Yantong Xie, and Zhennan Zhou. A structure preserving numerical scheme for Fokker-Planck equations of neuron networks: numerical analysis and exploration. Journal of Computational Physics, 433:110195, 2021.
  • [31] Robert J Plemmons. M-matrix characterizations. i—nonsingular m-matrices. Linear Algebra and its Applications, 18(2):175–188, 1977.