Discontinuous Galerkin Methods for an Elliptic Optimal Control Problem with a General State Equation and Pointwise State Constraints

Sijing Liu, Zhiyu Tan and Yi Zhang Sijing Liu, Department of Mathematics
University Of Connecticut
Storrs, CT
USA [email protected] Zhiyu Tan, Center for Computation

\&

Technology
Louisiana State University
Baton Rouge, LA
USA [email protected] Yi Zhang, Department of Mathematics
University Of North Carolina Greensboro
Greensboro, NC
USA [email protected]

Abstract.

We investigate discontinuous Galerkin methods for an elliptic optimal control problem with a general state equation and pointwise state constraints on general polygonal domains. We show that discontinuous Galerkin methods for general second-order elliptic boundary value problems can be used to solve the elliptic optimal control problems with pointwise state constraints. We establish concrete error estimates and numerical experiments are shown to support the theoretical results.

Key words and phrases:

elliptic distributed optimal control problems, general state equations, pointwise state constraints, discontinuous Galerkin methods

1991 Mathematics Subject Classification:

49J20, 49M41, 65N30, 65K15

1. Introduction

Let $\Omega$ be a polygonal domain in $\mathbb{R}^{2}$ , $y_{d}\in L_{2}(\Omega)$ , $\beta$ be a positive constant and $g\in H^{4}(\Omega)$ . The elliptic optimal control problem is to find

(1.1)

(\bar{y},\bar{u})=\operatorname*{argmin}_{(y,u)\in\mathbb{K}_{g}}\left[\frac{1}{2}\|y-y_{d}\|^{2}_{L_{2}(\Omega)}+\frac{\beta}{2}\|u\|^{2}_{L_{2}(\Omega)}\right],

where $(y,u)$ belongs to $\mathbb{K}_{g}\subset{H^{1}(\Omega)}\times{L_{2}(\Omega)}$ if and only if

(1.2)		$\displaystyle a(y,v)$	$\displaystyle=\int_{\Omega}uv\ dx\quad\forall v\in H^{1}_{0}(\Omega),$
(1.2)		$\displaystyle y$	$\displaystyle=g\quad\text{on}\quad\partial\Omega,$

and the pointwise state constraint

(1.3)

y\leq\psi\quad\mbox{a.e. in}\ \ \Omega,

where the function

(1.4)

\psi\in W^{3,p}\ \text{with}\ p>2\ \text{and}\ \psi>g\ \text{on}\ \partial\Omega.

The bilinear form $a(\cdot,\cdot)$ is defined as,

(1.5)

a(y,v)=\int_{\Omega}\nabla y\cdot\nabla v\ dx+\int_{\Omega}(\bm{\zeta}\cdot\nabla y)v\ dx+\int_{\Omega}\gamma yv\ dx,

where the vector field $\bm{\zeta}\in[W^{1,\infty}(\Omega)]^{2}$ and the function $\gamma\in W^{1}_{\infty}(\Omega)$ is nonnegative. If $\bm{\zeta}\neq 0$ , then the constraint (1.5) is the weak form of a general second order PDE with an advective/convective term. We assume

(1.6)

\gamma-\frac{1}{2}\nabla\cdot\bm{\zeta}\geq\gamma_{0}>0

such that the problem (1.2) is well-posed.

Here and throughout the paper we will follow the standard notation for differential operators, function spaces and norms that can be found for example in [28, 17].

We define the subspace $\mathring{E}(\Delta;L_{2}(\Omega))$ of $H^{1}_{0}(\Omega)$ as

(1.7)

\mathring{E}(\Delta;L_{2}(\Omega))=\{y\in H^{1}_{0}(\Omega):\mathcal{L}y\in{L_{2}(\Omega)}\},

where $\mathcal{L}y=-\Delta y+\bm{\zeta}\cdot\nabla y+\gamma y$ . We also denote

(1.8)

E(\Delta;L_{2}(\Omega))=\{y\in H^{1}(\Omega):\mathcal{L}y\in{L_{2}(\Omega)}\}.

Due to the elliptic regularity [29, 41], $\mathring{E}(\Delta;L_{2}(\Omega))$ is a subspace of $H^{1+\alpha}(\Omega)\cap H^{2}_{loc}(\Omega)\cap H^{1}_{0}(\Omega)$ for some $\alpha\in(\frac{1}{2},1]$ , where $\alpha=1$ if $\Omega$ is convex, and

(1.9)

\|z\|_{H^{1+\alpha}(\Omega)}\leq C_{\Omega}\|\mathcal{L}z\|_{L_{2}(\Omega)}\quad\forall z\in\mathring{E}(\Delta;L_{2}(\Omega)).

It follows from (1.9) and the Sobolev inequality [1, Theorem 4.12] that $g+\mathring{E}(\Delta;L_{2}(\Omega))\subset C(\bar{\Omega})$ . Therefore we can reformulate (1.1)-(1.3) as the following,

(1.10)

\bar{y}=\operatorname*{argmin}_{y\in K_{g}}\left[\frac{1}{2}\|y-y_{d}\|^{2}_{L_{2}(\Omega)}+\frac{\beta}{2}\|\mathcal{L}y\|^{2}_{L_{2}(\Omega)}\right],

where

(1.11)

K_{g}=\{y\in g+\mathring{E}(\Delta;L_{2}(\Omega)):y\leq\psi\ \ \mbox{in}\ \ \Omega\}.

Optimal control problems with pointwise state constraints are more difficult to analyze due to the low regularity of the Lagrange multiplier. In [18, 24], the authors proved that the Lagrange multiplier $\mu$ is a nonnegative Borel measure and $\mu\in H^{-1}(\Omega)$ at the same time. By using this regularity result, the pointwise state constraints can then be handled.

In the case $\bm{\zeta}=\mathbf{0}$ and $\gamma=0$ , the distributed optimal control problem with pointwise state constraints (1.1)-(1.3) is investigated in [20, 24, 40, 39, 43, 26, 35, 37] using $P_{1}$ finite element methods. Several extensions [16, 14] to the new approach in [20] have been established. Other methods are proposed for (1.1)-(1.3), for example, $C^{0}$ interior penalty methods [11, 22], Morley finite element methods [13], and virtual element methods [21]. Fast solvers for (1.1)-(1.3) are also studied in [15, 10]. We refer to [9] for a more detailed survey about finite element methods for (1.1)-(1.3). Overall, as pointed out in [9], optimal control problems with pointwise state constraints can be analyzed using known finite element methods for fourth order boundary value problems. This crucial observation opens doors to many possible numerical methods for optimal control problems with pointwise state constraints.

Recently, discontinuous Galerkin (DG) methods for (1.1)-(1.3) with $\bm{\zeta}=\mathbf{0}$ and $\gamma=0$ were proposed and analyzed in [12] where a new interior maximum estimate [19] was utilized. For the general case where $\bm{\zeta}\neq\mathbf{0}$ and $\gamma\neq 0$ , a continuous $P_{1}$ finite element method was proposed and analyzed in [14]. The goal of this paper is to extend the results in [12, 14] to an optimal control problem with a general state equation (1.1)-(1.3). The reason for using a DG method is to enable a fast solution of the discrete problem by the primal-dual active algorithm that converges superlinearly (see Remark 3.2). Since the bilinear form (1.3) is nonsymmetric, we must employ an adjoint consistent [3] method (3.12) in order to obtain estimates involving $R_{h}$ and $E_{h}$ which are key ingredients in the convergence analysis.

The rest of the paper is organized as follows. In Section 2, we gather some known regularity results for the continuous problem (1.1)-(1.3). These results are useful in the convergence analysis. In Section 3, we propose mixed discontinuous Galerkin methods to solve the problem (1.1)-(1.3) and establish some important properties of the discrete problem. Three crucial operators $E_{h}$ , $R_{h}$ and $\mathfrak{C}_{h}$ are defined and analyzed in Section 4. Concrete error estimates are established in Section 5. Numerical results are provided in Section 6 and we end with some concluding remarks in Section 7. A description of the primal-dual active set algorithm is given in Appendix A and the proofs of Lemma 3.3 and Lemma 4.3 are provided in Appendix B.

Throughout this paper, we use $C$ (with or without subscripts) to denote a generic positive constant that is independent of any mesh parameters. Also to avoid the proliferation of constants, we use the notation $A\lesssim B$ (or $A\gtrsim B$ ) to represent $A\leq\text{(constant)}B$ . The notation $A\approx B$ is equivalent to $A\lesssim B$ and $B\lesssim A$ .

2. The Continuous Problem

Let $\bar{z}=\bar{y}-g$ . Then the problem (1.10)-(1.11) is equivalent to find

(2.1)

\bar{z}=\operatorname*{argmin}_{z\in\widetilde{K}}\left[\frac{1}{2}\|z-(y_{d}-g)\|^{2}_{L_{2}(\Omega)}+\frac{\beta}{2}\|\mathcal{L}(z+g)\|^{2}_{L_{2}(\Omega)}\right],

where

(2.2)

\widetilde{K}=\{z\in\mathring{E}(\Delta;L_{2}(\Omega)):z\leq\psi-g\ \ \mbox{in}\ \ \Omega\}.

By the classical theory of variation of calculus, it is well-known that there exists a unique solution $\bar{z}$ to (2.1). Consequently, the problem (1.10) has a unique solution and $\bar{y}$ can be characterized by

(2.3)

(\bar{y}-y_{d},y-\bar{y})_{L_{2}(\Omega)}+\beta(\mathcal{L}\bar{y},\mathcal{L}(y-\bar{y}))_{L_{2}(\Omega)}\geq 0\quad\forall\ y\in K_{g}.

Interior regularity of $\bar{y}$

By the interior regularity results for fourth order variational inequalities [32, 33], we have $\bar{z}\in H^{3}_{loc}(\Omega)\cap W^{2,\infty}_{loc}(\Omega)$ . Since $g\in H^{4}(\Omega)$ , which is a subspace of $W^{2,\infty}(\Omega)$ by the Sobolev inequality, we conclude

(2.4)

\bar{y}\in H^{3}_{loc}(\Omega)\cap W^{2,\infty}_{loc}(\Omega).

A proof can be found in [14].

Lagrange multiplier $\mu$

Taking $y=-\phi+\bar{y}\in K_{g}$ in (2.3) where $\phi$ is a nonnegative function in $C^{\infty}_{0}(\Omega)$ . Thus we have

(2.5)

\int_{\Omega}\Big{[}(\bar{y}-y_{d})\phi+\beta(\mathcal{L}\bar{y})(\mathcal{L}\phi)\Big{]}dx\leq 0.

It follows from [44, Section 13, Theorem 25] or [45, Theorem 2.14] that

(2.6)

\int_{\Omega}\Big{[}(\bar{y}-y_{d})z+\beta(\mathcal{L}\bar{y})(\mathcal{L}z)\Big{]}dx=\int_{\Omega}z\ d\mu\quad\forall z\in\mathring{E}(\Delta;L_{2}(\Omega)),

where

(2.7)

\mu\ \text{is a non-positive regular Borel measure}.

Furthermore, it can be proved [18, 24] that

(2.8)

\mu\in H^{-1}(\Omega),

and we have the following complimentarity condition

(2.9)

\int_{\Omega}(\psi-\bar{y})\ \!d\mu=0.

Regularity of $\bar{u}$ and global regularity of $\bar{y}$

It can be proved [11, 20] that $\bar{u}\in H^{1}_{0}(\Omega)$ . According to (1.9), we have

(2.10)

\bar{y}\in H^{1+\alpha}(\Omega),

where $\alpha\in(\frac{1}{2},1]$ and

(2.11)

\|\bar{y}\|_{H^{1+\alpha}(\Omega)}\leq C_{\Omega}\|\mathcal{L}\bar{y}\|_{L_{2}(\Omega)}.

3. The Discrete Problem

Let $\mathcal{T}_{h}$ be a shape regular simplicial triangulation of $\Omega$ . The diameter of $T\in\mathcal{T}_{h}$ is denoted by $h_{T}$ and $h=\max_{T\in\mathcal{T}_{h}}h_{T}$ is the mesh diameter. Let $\mathcal{E}_{h}=\mathcal{E}^{b}_{h}\cup\mathcal{E}^{i}_{h}$ where $\mathcal{E}^{i}_{h}$ (resp. $\mathcal{E}^{b}_{h}$ ) represents the set of interior edges (resp. boundary edges).

We further decompose the boundary edges $\mathcal{E}^{b}_{h}$ into the inflow part $\mathcal{E}^{b,-}_{h}$ and the outflow part $\mathcal{E}^{b,+}_{h}$ which are defined as follows,

(3.1)		$\displaystyle\mathcal{E}^{b,-}_{h}$	$\displaystyle=\{e\in\mathcal{E}^{b}_{h}:e\subset\{x\in\partial\Omega:\bm{\zeta}(x)\cdot\mathbf{n}(x)<0\}\},$
(3.2)		$\displaystyle\mathcal{E}^{b,+}_{h}$	$\displaystyle=\mathcal{E}^{b}_{h}\setminus\mathcal{E}^{b,-}_{h}.$

For an edge $e\in\mathcal{E}^{i}_{h}$ , let $h_{e}$ be the length of $e$ . For each edge we associate a fixed unit normal $\mathbf{n}$ . We denote by $T^{+}$ the element for which $\mathbf{n}$ is the outward normal, and $T^{-}$ the element for which $-\mathbf{n}$ is the outward normal. We define the discontinuous finite element space $V_{h}$ as

(3.3)

V_{h}=\{v\in{L_{2}(\Omega)}:v|_{T}\in\mathbb{P}_{1}(T)\quad\forall\ T\in\mathcal{T}_{h}\}.

For $v\in V_{h}$ on an edge $e$ , we define

(3.4)

v^{+}=v|_{T^{+}}\quad\text{and}\quad v^{-}=v|_{T^{-}}.

We define the jump and average for $v\in V_{h}$ on an edge $e$ as follows,

(3.5)

[v]=v^{+}-v^{-},\quad\{v\}=\frac{v^{+}+v^{-}}{2}.

For $e\in\mathcal{E}_{h}^{b}$ with $e\in\partial T$ , we let

(3.6)

[v]=\{v\}=v|_{T}.

We also denote

(3.7)

(w,v)_{e}:=\int_{e}wv\ \!ds\quad\text{and}\quad(w,v)_{T}:=\int_{T}wv\ \!dx.

3.1. Mixed discontinuous Galerkin methods

We define the piecewise $H^{s}$ space with $s>\frac{3}{2}$ as

(3.8)

H^{s}(\Omega;\mathcal{T}_{h})=\{v\in{L_{2}(\Omega)}:v|_{T}\in H^{s}(T)\quad\forall\ T\in\mathcal{T}_{h}\}.

Define $\mathcal{L}_{h,g}:H^{s}(\Omega;\mathcal{T}_{h})\rightarrow V_{h}$ as

(3.9)		$\displaystyle(\mathcal{L}_{h,g}w,v)_{L_{2}(\Omega)}$	$\displaystyle=a_{h}(w,v)+\sum_{e\in\mathcal{E}_{h}^{b}}(g,\mathbf{n}\cdot\nabla v-\frac{\sigma}{h_{e}}v)_{e}$
(3.9)			$\displaystyle\hskip 14.22636pt+\sum_{e\in\mathcal{E}_{h}^{b,-}}(\mathbf{n}\cdot\bm{\zeta}g,v)_{e}\quad\forall v\in V_{h},$

where

(3.10)

a_{h}(w,v)=a_{h}^{\text{sip}}(w,v)+a^{\text{ar}}_{h}(w,v)\quad\forall w,v\in V_{h}.

Here

(3.11)		$\displaystyle a^{\text{sip}}_{h}(w,v)=$	$\displaystyle\sum_{T\in\mathcal{T}_{h}}(\nabla w,\nabla v)_{T}-\sum_{e\in\mathcal{E}_{h}}(\{\mathbf{n}\cdot\nabla w\},[v])_{e}-\sum_{e\in\mathcal{E}_{h}}(\{\mathbf{n}\cdot\nabla v\},[w])_{e}$
(3.11)			$\displaystyle+\sigma\sum_{e\in\mathcal{E}_{h}}h_{e}^{-1}([w],[v])_{e}$

is the bilinear form of the symmetric interior penalty (SIP) method with sufficiently large penalty parameter $\sigma$ and

(3.12)

\displaystyle a^{\text{ar}}_{h}(w,v)=\sum_{T\in\mathcal{T}_{h}}(\bm{\zeta}\cdot\nabla w+\gamma w,v)_{T}-\sum_{e\in\mathcal{E}^{i}_{h}\cup\mathcal{E}^{b,-}_{h}}(\mathbf{n}\cdot\bm{\zeta}[w],\{v\})_{e}.

is the unstabilized DG scheme for advection and reaction terms (cf. [23] and [30, Section 2.2]).

Remark 3.1.

We do not consider convection-dominated case in this paper. Therefore the bilinear form $a^{\text{ar}}_{h}(\cdot,\cdot)$ does not contain any stabilization terms. However, if one considers convection-dominated case, the following well-known [38, 4] upwind scheme can be utilized,

(3.13)

a^{\text{ar}}_{h}(w,v)=\sum_{T\in\mathcal{T}_{h}}(\bm{\zeta}\cdot\nabla w+\gamma w,v)_{T}-\sum_{e\in\mathcal{E}^{i}_{h}}(\mathbf{n}\cdot\bm{\zeta}[w],v^{\text{up}})_{e}-\sum_{e\in\mathcal{E}^{b,-}_{h}}(\mathbf{n}\cdot\bm{\zeta}\ \!w,v)_{e},

where the upwind value $v^{\text{up}}$ of a function on an interior edge $e\in\mathcal{E}_{h}^{i}$ is defined as

(3.14)

v^{\text{up}}=\left\{\begin{aligned} v^{+}\quad\text{if}\quad\bm{\zeta}\cdot\mathbf{n}\geq 0,\\ v^{-}\quad\text{if}\quad\bm{\zeta}\cdot\mathbf{n}<0.\end{aligned}\right.

Then the discrete problem for (1.10) is to find

(3.15)

\bar{y}_{h}=\operatorname*{argmin}_{y_{h}\in K_{h}}\left[\frac{1}{2}(y_{h}-y_{d},y_{h}-y_{d})_{L_{2}(\Omega)}+\frac{\beta}{2}(\mathcal{L}_{h,g}y_{h},\mathcal{L}_{h,g}y_{h})_{L_{2}(\Omega)}\right],

where

(3.16)

K_{h}=\{y\in V_{h}:y_{T}(p)\leq\psi(p)\ \ \mbox{for all}\ \ p\in\mathcal{V}_{T}\ \text{and all}\ T\in\mathcal{T}_{h}\}.

Here $\mathcal{V}_{T}$ is the set of vertices of $T\in\mathcal{T}_{h}$ . Note that we impose the Dirichlet boundary condition $y=g$ weakly through $\mathcal{L}_{h,g}$ .

Remark 3.2.

The discrete problem (3.15)-(3.16) can be solved by a primal-dual active set method (see Appendix A). Let $\mathbf{M}_{h}$ denote the mass matrix represent the bilinear form $(\cdot,\cdot)_{L_{2}(\Omega)}$ with respect to the natural discontinuous nodal basis in $V_{h}$ . Note that the computation of $\mathcal{L}_{h,g}$ involves $\mathbf{M}_{h}^{-1}$ . In contrast to [20, 16, 14], the matrix $\mathbf{M}_{h}$ is block diagonal hence $\mathbf{M}_{h}^{-1}$ can be obtained easily.

3.2. Properties of $a_{h}(\cdot,\cdot)$

Let $D$ be a subdomain of $\Omega$ and $\mathcal{T}_{h}(D)$ be a collection of all the elements with a nonempty intersection with $D$ . Define a mesh-dependent norm on $V_{h}$ ,

(3.17)

\left|\mkern-2.0mu\left|\mkern-2.0mu\left|v\right|\mkern-2.0mu\right|\mkern-2.0mu\right|_{h,D}^{2}=\sum_{T\in\mathcal{T}_{h}(D)}\left[\|\nabla v\|^{2}_{L_{2}(T)}+\sum_{e\in\partial T}\frac{\sigma}{h_{e}}\|[v]\|^{2}_{L_{2}(e)}+\sum_{e\in\partial T}\frac{h_{e}}{\sigma}\|\{\mathbf{n}\cdot\nabla v\}\|^{2}_{L_{2}(e)}\right].

We use $\left|\mkern-2.0mu\left|\mkern-2.0mu\left|v\right|\mkern-2.0mu\right|\mkern-2.0mu\right|_{h}$ to denote the norm $\left|\mkern-2.0mu\left|\mkern-2.0mu\left|v\right|\mkern-2.0mu\right|\mkern-2.0mu\right|_{h,\Omega}$ if there is no ambiguity. The following lemma establishes the continuity and coercivity of $a_{h}(\cdot,\cdot)$ , which is standard for the discontinuous Galerkin methods [3, 42, 17]. The proof is provided in Appendix B.

Lemma 3.3.

We have

(3.18)		$\displaystyle a_{h}(w,v)$	$\displaystyle\leq C\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|w\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|v\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}\quad$	$\displaystyle\forall w,v\in E(\Delta;L_{2}(\Omega))+V_{h},$
(3.19)		$\displaystyle a_{h}(v,v)$	$\displaystyle\geq C\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|v\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|^{2}_{h}\quad$	$\displaystyle\forall v\in V_{h},$

for large enough $\sigma$ .

3.3. Properties of $\mathcal{L}_{h,g}$

By the definition of $\mathcal{L}_{h,g}$ and integration by parts, we have for any $w\in g+\mathring{E}(\Delta;L_{2}(\Omega))$

(3.20)		$\displaystyle(\mathcal{L}_{h,g}w,v)_{L_{2}(\Omega)}$	$\displaystyle=a_{h}(w,v)_{L_{2}(\Omega)}+\sum_{e\in\mathcal{E}_{h}^{b}}(g,\mathbf{n}\cdot\nabla v-\frac{\sigma}{h_{e}}v)_{e}+\sum_{e\in\mathcal{E}_{h}^{b,-}}(\mathbf{n}\cdot\bm{\zeta}g,v)_{e}$
(3.20)			$\displaystyle=(\mathcal{L}w,v)_{L_{2}(\Omega)}\quad\forall v\in V_{h}.$

which gives the consistency of $\mathcal{L}_{h,g}$ . Moreover, we have

(3.21)

\mathcal{L}_{h,g}w=Q_{h}\mathcal{L}w\quad\forall w\in g+\mathring{E}(\Delta;L_{2}(\Omega)),

where $Q_{h}$ is the orthogonal projection from ${L_{2}(\Omega)}$ onto $V_{h}$ .

3.4. Discrete variational inequalities

Define $\mathcal{L}_{h}:H^{s}(\Omega;\mathcal{T}_{h})\rightarrow V_{h}$ as

(3.22)

(\mathcal{L}_{h}w,v)_{L_{2}(\Omega)}=a_{h}(w,v)_{L_{2}(\Omega)}\quad\forall v\in V_{h}.

Note that we have the following,

(3.23)

\mathcal{L}_{h,g}w_{1}-\mathcal{L}_{h,g}w_{2}=\mathcal{L}_{h}(w_{1}-w_{2})\quad\forall w_{1},w_{2}\in H^{s}(\Omega;\mathcal{T}_{h}).

In particular, we have

(3.24)

\mathcal{L}_{h}w=\mathcal{L}_{h,0}w=Q_{h}\mathcal{L}w\quad\forall w\in\mathring{E}(\Delta;L_{2}(\Omega)).

Since $K_{h}$ is a nonempty closed convex set and the objective function in (3.15) is strongly convex, the discrete problem (3.15)-(3.16) has a unique solution $\bar{y}_{h}\in V_{h}$ , which can be characterized by

(3.25)

(\bar{y}_{h}-y_{d},y_{h}-\bar{y}_{h})_{L_{2}(\Omega)}+\beta(\mathcal{L}_{h,g}\bar{y}_{h},\mathcal{L}_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}\geq 0\quad\forall\ y_{h}\in K_{h}.

4. Preliminary Estimates

In this section, we establish some preliminary estimates for the convergence analysis. We consider both quasi-uniform meshes and graded meshes [7, 5, 2] around reentrant corners.

4.1. Graded Meshes

For a nonconvex domain with reentrant corners, it is well-known that the solution to the state equation (1.2) does not belong to $H^{2}(\Omega)$ in general (see (2.10)). To overcome this lack of regularity, we can use a triangulation $\mathcal{T}_{h}$ with the following properties. Let $\omega_{1},\omega_{2},\ldots,\omega_{L}$ be the interior angles at the corners $c_{1},c_{2},\ldots,c_{L}$ of the bounded polygonal domain $\Omega$ and $c_{T}$ be the center of $T\in\mathcal{T}_{h}$ . There exists constants $C_{1}$ and $C_{2}$ such that

(4.1)

C_{1}h_{T}\leq\Phi_{\mu}(T)h\leq C_{2}h_{T},\quad\forall\ T\in\mathcal{T}_{h},

where $\Phi_{\mu}(T)=\Pi_{l=1}^{L}|c_{l}-c_{T}|^{1-\mu_{l}}$ . Here the grading parameters $\mu_{1},\mu_{2},\ldots,\mu_{L}$ are chosen as,

(4.2)		$\displaystyle\mu_{l}=1,$	$\displaystyle\quad\omega_{l}<\pi,$
(4.2)		$\displaystyle\frac{1}{2}<\mu_{l}<\frac{\pi}{\omega_{l}},$	$\displaystyle\quad\omega_{l}>\pi.$

The construction of graded meshes that satisfy (4.1) can be found in [7, 5, 2].

4.2. Preliminary inequalities

The following standard inequalities [3, 42, 12] are needed. Assume $D$ is a subdomain of $\Omega$ such that $D\Subset\Omega$ , i.e., the closure of $D$ is a compact set of $\Omega$ . Note that $\mathcal{T}_{h}$ is quasi-uniform around $D$ for graded meshes. Then a standard inverse estimate implies

(4.3)

\displaystyle\left|\mkern-2.0mu\left|\mkern-2.0mu\left|v_{h}\right|\mkern-2.0mu\right|\mkern-2.0mu\right|_{h,D}

\displaystyle\leq Ch^{-1}\|v_{h}\|_{L_{2}(\Omega)}\quad\forall v_{h}\in V_{h}.

We also have the following discrete Sobolev inequality [12]

(4.4)

\|y_{h}\|_{L_{\infty}(\Omega)}\leq C(1+|\ln h|)^{\frac{1}{2}}\left|\mkern-2.0mu\left|\mkern-2.0mu\left|y_{h}\right|\mkern-2.0mu\right|\mkern-2.0mu\right|_{h}\quad\forall y_{h}\in V_{h}.

For $T\in\mathcal{T}_{h}$ and $v\in H^{1+s}(T)$ where $s\in(\frac{1}{2},1]$ , the following trace inequalities with scaling is standard (cf. [31, Lemma 7.2] and [27, Proposition 3.1]),

(4.5)		$\displaystyle\\|v\\|_{L_{2}(\partial T)}$	$\displaystyle\leq C(h_{T}^{-\frac{1}{2}}\\|v\\|_{L_{2}(T)}+h_{T}^{s-\frac{1}{2}}\|v\|_{H^{s}(T)}),$
(4.6)		$\displaystyle\\|\nabla v\\|_{L_{2}(\partial T)}$	$\displaystyle\leq C(h_{T}^{-\frac{1}{2}}\\|\nabla v\\|_{L_{2}(T)}+h_{T}^{s-\frac{1}{2}}\|\nabla v\|_{H^{s}(T)}).$

The following discrete Poincaŕe inequality for DG functions [8, 4, 25] is valid for all $v\in V_{h}$ ,

(4.7)

\|v\|^{2}_{L_{2}(\Omega)}\leq C\left(\sum_{T\in\mathcal{T}_{h}}\|\nabla v\|^{2}_{L_{2}(T)}+\sum_{e\in\mathcal{E}_{h}}\frac{1}{h_{e}}\|[v]\|^{2}_{L_{2}(e)}\right).

4.3. Interpolation operator $I_{h}$

Let $V_{h}^{c}$ be the conforming $P_{1}$ finite element space associated with $\mathcal{T}_{h}$ . We use the usual continuous nodal interpolant $I_{h}:E(\Delta;L_{2}(\Omega))\rightarrow V^{c}_{h}$ (which belongs to $V_{h}$ ) [3, 42] such that the following holds.

Lemma 4.1.

We have

(4.8)

\|z-I_{h}z\|_{L_{2}(\Omega)}+h\left|\mkern-2.0mu\left|\mkern-2.0mu\left|z-I_{h}z\right|\mkern-2.0mu\right|\mkern-2.0mu\right|_{h}\lesssim h^{1+\tau}\|\mathcal{L}z\|_{L_{2}(\Omega)}\quad\forall z\in E(\Delta;L_{2}(\Omega)).

For quasi-uniform or graded meshes, we also have

(4.9)

\|z-I_{h}z\|_{L^{\infty}(\Omega)}\leq Ch^{\tau}\|\mathcal{L}z\|_{L_{2}(\Omega)}\quad\forall z\in E(\Delta;L_{2}(\Omega)).

Here $\tau$ is defined by

(4.10)

\tau=\left\{\begin{array}[]{cl}\alpha&\mbox{if}\ \mathcal{T}_{h}\ \mbox{is quasi-uniform,}\\ \\ 1&\mbox{if}\ \mathcal{T}_{h}\ \mbox{is graded around the reentrant corners,}\end{array}\right.

where $\alpha\in(\frac{1}{2},1]$ is the index of elliptic regularity in (1.9).

The following lemma is useful in the convergence analysis.

Lemma 4.2.

Let $\phi$ be a $C^{\infty}$ function with compact support in $\Omega$ . We have

(4.11)

\|\mathcal{L}_{h}(I_{h}\phi)\|_{L_{2}(\Omega)}\lesssim\|\phi\|_{H^{2}(\Omega)}.

Proof.

Let $D\Subset\Omega$ be an open neighborhood of the support of $\phi$ and $v_{h}\in V_{h}$ , we have, by (3.22), (4.3), (3.18) and (4.8),

	$\displaystyle(\mathcal{L}_{h}(\phi-I_{h}\phi),v_{h})_{L_{2}(\Omega)}$	$\displaystyle=a_{h}(\phi-I_{h}\phi,v_{h})$
		$\displaystyle\lesssim\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|\phi-I_{h}\phi\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|v_{h}\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h,D}$
		$\displaystyle\lesssim h\|\phi\|_{H^{2}(\Omega)}\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|v_{h}\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h,D}$
		$\displaystyle\lesssim\|\phi\|_{H^{2}(\Omega)}\\|v_{h}\\|_{L_{2}(\Omega)},$

and therefore,

(4.12)

\|\mathcal{L}_{h}(\phi-I_{h}\phi)\|_{L_{2}(\Omega)}\lesssim|\phi|_{H^{2}(\Omega)}.

It follows from (3.24) and (4.12) that

(4.13)	$\displaystyle\\|\mathcal{L}_{h}(I_{h}\phi)\\|_{L_{2}(\Omega)}$	$\displaystyle\leq\\|\mathcal{L}_{h}(\phi-I_{h}\phi)\\|_{L_{2}(\Omega)}+\\|\mathcal{L}_{h}\phi\\|_{L_{2}(\Omega)}$
		$\displaystyle\lesssim\|\phi\|_{H^{2}(\Omega)}+\\|Q_{h}\mathcal{L}\phi\\|_{L_{2}(\Omega)}$
		$\displaystyle\lesssim\|\phi\|_{H^{2}(\Omega)}+\\|\mathcal{L}\phi\\|_{L_{2}(\Omega)}$
		$\displaystyle\lesssim\\|\phi\\|_{H^{2}(\Omega)}.$

∎

4.4. The Ritz Projection Operator $R_{h}$

We define an operator $R_{h}:E(\Delta;L_{2}(\Omega))\rightarrow V_{h}$ as the following,

(4.14)

a_{h}(R_{h}w,v)=a_{h}(w,v)\quad\forall v\in V_{h}.

It follows from (3.9), (3.21) and (4.14) that

(4.15)

\mathcal{L}_{h,g}(R_{h}w)=\mathcal{L}_{h,g}w=Q_{h}(\mathcal{L}w)\quad\forall w\in g+\mathring{E}(\Delta;L_{2}(\Omega)).

Moreover, we have, by (3.22),

(4.16)

\mathcal{L}_{h}R_{h}w=\mathcal{L}_{h}w\quad\forall w\in E(\Delta;L_{2}(\Omega)).

Lemma 4.3.

We have the following error estimates for $R_{h}$ ,

(4.17)		$\displaystyle\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|w-R_{h}w\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}$	$\displaystyle\leq Ch^{\tau}\\|\mathcal{L}w\\|_{L_{2}(\Omega)}\quad\forall w\in E(\Delta;L_{2}(\Omega)),$
(4.18)		$\displaystyle\\|w-R_{h}w\\|_{L_{2}(\Omega)}$	$\displaystyle\leq Ch^{2\tau}\\|\mathcal{L}w\\|_{L_{2}(\Omega)}\quad\forall w\in E(\Delta;L_{2}(\Omega)).$

Proof.

The proof can be found in Appendix B. ∎

We have the following interior error estimate [12, 19, 25] for $D\Subset\Omega$ ,

(4.19)

\|\bar{y}-R_{h}\bar{y}\|_{L_{\infty}(D)}\leq C((1+|\ln h|)h^{2}+h^{2\tau}).

It follows from (4.4) and (4.17) that

(4.20)

\|\bar{y}-R_{h}\bar{y}\|_{{L_{\infty}(\Omega)}}\leq C((1+|\ln h|)h^{\tau}.

4.5. The Smoothing Operator $E_{h}$

The operator $E_{h}:V_{h}\rightarrow\mathring{E}(\Delta;L_{2}(\Omega))$ is defined by

(4.21)

\mathcal{L}E_{h}v_{h}=\mathcal{L}_{h}v_{h}\quad\forall v_{h}\in V_{h}.

In particular, for all $w\in V_{h}$ ,

(4.22)

\displaystyle a_{h}(E_{h}v_{h},w)=(\mathcal{L}E_{h}v_{h},w)_{L_{2}(\Omega)}=(\mathcal{L}_{h}v_{h},w)_{L_{2}(\Omega)}=a_{h}(v_{h},w).

Note that (4.22) and (4.14) imply

(4.23)

v_{h}=R_{h}(E_{h}v_{h})\quad\forall v_{h}\in V_{h}.

Lemma 4.4.

We have the following estimates for all $v_{h}\in V_{h}$

(4.24)	$\displaystyle\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|E_{h}v_{h}-v_{h}\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}$	$\displaystyle\lesssim h^{\tau}\\|\mathcal{L}_{h}v_{h}\\|_{L_{2}(\Omega)},$
(4.25)	$\displaystyle\\|E_{h}v_{h}-v_{h}\\|_{L_{2}(\Omega)}$	$\displaystyle\lesssim h^{2\tau}\\|\mathcal{L}_{h}v_{h}\\|_{L_{2}(\Omega)},$
(4.26)	$\displaystyle\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|E_{h}v_{h}-v_{h}\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h,G(\mathscr{A})}$	$\displaystyle\lesssim h\\|\mathcal{L}_{h}v_{h}\\|_{L_{2}(\Omega)},$

where $G(\mathscr{A})$ is an open neighborhood of the active set $\mathscr{A}=\{x\in\Omega:\bar{y}(x)=\psi(x)\}$ such that the closure of $G(\mathscr{A})$ is a compact subset of $\Omega$ .

Proof.

The estimates follow from (4.23) and (4.17)-(4.19). Details can be found in [12]. ∎

4.6. The Connection Operator $\mathfrak{C}_{h}$

We need a connection operator $\mathfrak{C}_{h}:V_{h}\rightarrow\mathring{V}_{h}^{c}=V_{h}\cap H^{1}_{0}(\Omega)$ defined as follows,

(4.27)

(\mathfrak{C}_{h}y_{h})(p)=\frac{1}{|\mathcal{T}_{h}(p)|}\sum_{T\in\mathcal{T}_{h}(p)}(y_{h}|_{T})(p)

where $p$ is any node of the $P_{1}$ finite element space interior to $\Omega$ and $\mathcal{T}_{h}(p)$ is the set of triangles in $\mathcal{T}_{h}$ that share the node $p$ . The operator $\mathfrak{C}_{h}$ has the following properties by (3.16) and (4.27),

(4.28)		$\displaystyle\mathfrak{C}_{h}v_{h}$	$\displaystyle=v_{h}\quad\forall v_{h}\in\mathring{V}_{h}^{c},$
(4.29)		$\displaystyle\mathfrak{C}_{h}y_{h}$	$\displaystyle\in K_{h}\quad\forall y_{h}\in K_{h}.$

For any subdomain $D$ of $\Omega$ , we have,

(4.30)

\displaystyle h^{-2}\|y_{h}-\mathfrak{C}_{h}y_{h}\|^{2}_{L_{2}(D)}+\sum_{T\in\mathcal{T}_{h}}|y_{h}-\mathfrak{C}_{h}y_{h}|^{2}_{H^{1}(T)}

\displaystyle\leq C\sum_{T\in\mathcal{T}^{\ast}_{h}(D)}\sum_{e\in\partial T}|e|^{-1}\|[y_{h}]\|_{L_{2}(e)}^{2},

where $T\in\mathcal{T}_{h}$ belongs to $\mathcal{T}^{\ast}_{h}(D)$ if and only if $S_{T}\cap D=\emptyset$ . Here $S_{T}$ (the star of $T$ ) is the union of all the triangles in $\mathcal{T}_{h}$ that share a common vertex with $T$ . We also have (cf. [12])

(4.31)

\|\mathfrak{C}_{h}y_{h}\|_{L_{\infty}(T)}\leq C\|y_{h}\|_{L_{\infty}(S_{T})}\quad\forall T\in\mathcal{T}_{h}.

It follows from (4.26) and (3.17) that,

(4.32)

\sum_{T\in\mathcal{T}^{\ast}_{h}(D)}\sum_{e\in\partial T}|e|^{-1}\|[y_{h}]\|_{L_{2}(e)}^{2}\leq Ch^{2}\|\mathcal{L}_{h}y_{h}\|^{2}_{L_{2}(\Omega)}\quad\quad\forall y_{h}\in V_{h}.

Remark 4.5.

Due to the facts that $V_{h}\not\subset H^{1}_{0}(\Omega)$ and $\mathring{V}_{h}^{c}\subset H^{1}_{0}(\Omega)$ , the operator $\mathfrak{C}_{h}$ is utilized to connect the discrete problem with the continuous problem.

5. Convergence Analysis

In this section, we derive error estimates of the discontinuous Galerkin methods (3.15). We define a mesh-dependent norm

(5.1)

\|v\|_{h}^{2}=(v,v)_{L_{2}(\Omega)}+\beta(\mathcal{L}_{h}v,\mathcal{L}_{h}v)_{L_{2}(\Omega)}.

5.1. An abstract error estimate

Let $\bar{y}_{h}\in K_{h}$ be the solution of (3.15). Given any $y_{h}\in K_{h}$ , we have the following by (3.25) and (3.23),

(5.2)	$\displaystyle\\|y_{h}-\bar{y}_{h}\\|_{h}^{2}$	$\displaystyle=(y_{h}-\bar{y}_{h},y_{h}-\bar{y}_{h})_{L_{2}(\Omega)}+\beta(\mathcal{L}_{h}(y_{h}-\bar{y}_{h}),\mathcal{L}_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}$
		$\displaystyle=(y_{h}-\bar{y},y_{h}-\bar{y}_{h})_{L_{2}(\Omega)}+\beta(\mathcal{L}_{h}(y_{h}-\bar{y}),\mathcal{L}_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}$
		$\displaystyle\hskip 8.5359pt+(\bar{y}-y_{d},y_{h}-\bar{y}_{h})_{L_{2}(\Omega)}+\beta(\mathcal{L}_{h,g}\bar{y},\mathcal{L}_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}$
		$\displaystyle\hskip 8.5359pt-(\bar{y}_{h}-y_{d},y_{h}-\bar{y}_{h})_{L_{2}(\Omega)}-\beta(\mathcal{L}_{h,g}\bar{y}_{h},\mathcal{L}_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}$
		$\displaystyle\leq\\|y_{h}-\bar{y}\\|_{h}\\|y_{h}-\bar{y}_{h}\\|_{h}+(\bar{y}-y_{d},y_{h}-\bar{y_{h}})_{L_{2}(\Omega)}$
		$\displaystyle\hskip 8.5359pt+\beta(\mathcal{L}_{h,g}\bar{y},\mathcal{L}_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}.$

Notice that $E_{h}(y_{h}-\bar{y}_{h})\in\mathring{E}(\Delta;L_{2}(\Omega))$ , we then obtain, by (2.6), (4.21) and (3.21),

(5.3)			$\displaystyle(\bar{y}-y_{d},y_{h}-\bar{y}_{h})_{L_{2}(\Omega)}+\beta(\mathcal{L}_{h,g}\bar{y},\mathcal{L}_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}$
			$\displaystyle\hskip 8.5359pt=(\bar{y}-y_{d},(y_{h}-\bar{y}_{h})-E_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}$
			$\displaystyle\hskip 17.07182pt+(\bar{y}-y_{d},E_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}+\beta(\mathcal{L}\bar{y},\mathcal{L}E_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}$
			$\displaystyle\hskip 8.5359pt=(\bar{y}-y_{d},(y_{h}-\bar{y}_{h})-E_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}+\int_{\Omega}E_{h}(y_{h}-\bar{y}_{h})\ \!d\mu.$

It follows from (4.25) that

(5.4)

(\bar{y}-y_{d},(y_{h}-\bar{y}_{h})-E_{h}(y_{h}-\bar{y}_{h}))_{L_{2}(\Omega)}\lesssim h^{2\tau}\|\mathcal{L}_{h}(y_{h}-\bar{y}_{h})\|_{L_{2}(\Omega)}\lesssim h^{2\tau}\|y_{h}-\bar{y}_{h}\|_{h}.

For the last term in (5.3), we have the following lemma.

Lemma 5.1.

We have, for any $y_{h},\bar{y}_{h}\in K_{h}$ ,

(5.5)

\int_{\Omega}E_{h}(y_{h}-\bar{y}_{h})\ \!d\mu\leq C\left(h\|\mathcal{L}_{h}(y_{h}-\bar{y}_{h})\|_{L_{2}(\Omega)}+h^{2}+\|I_{h}(\bar{y}-\mathfrak{C}_{h}y_{h})\|_{L_{\infty}(\mathscr{A})}\right),

where $C$ is independent of $h$ and $\mathscr{A}$ is the active set.

Proof.

We give a sketch of the proof. The details can be found in [12, Theorem 4.1]. We have, by (3.16) and (2.7),

(5.6)	$\displaystyle\int_{\Omega}E_{h}(y_{h}-\bar{y}_{h})\ \!d\mu$	$\displaystyle=\int_{\Omega}[E_{h}(y_{h}-\bar{y}_{h})-\mathfrak{C}_{h}(y_{h}-\bar{y}_{h})]\ \!d\mu+\int_{\Omega}I_{h}(\psi-\mathfrak{C}_{h}\bar{y}_{h})\ \!d\mu$
		$\displaystyle\hskip 5.69046pt+\int_{\Omega}[I_{h}\mathfrak{C}_{h}(\bar{y}_{h}-y_{h})-\mathfrak{C}_{h}(\bar{y}_{h}-y_{h})]\ \!d\mu+\int_{\Omega}I_{h}(\bar{y}-\psi)\ \!d\mu$
		$\displaystyle\hskip 5.69046pt+\int_{\Omega}I_{h}(\mathfrak{C}_{h}y_{h}-\bar{y})\ \!d\mu$
		$\displaystyle=T_{1}+T_{2}+T_{3}+T_{4}+T_{5}.$

We can bound $T_{2}$ , $T_{4}$ and $T_{5}$ as follows. It follows from (2.7), (3.16) and (4.29) that

(5.7)

T_{2}=\int_{\Omega}I_{h}(\psi-\mathfrak{C}_{h}\bar{y}_{h})\ \!d\mu\leq 0.

We also have

(5.8)		$\displaystyle T_{4}=\int_{\Omega}I_{h}(\bar{y}-\psi)\ \!d\mu$	$\displaystyle=\int_{\Omega}[I_{h}(\bar{y}-\psi)-(\bar{y}-\psi)]\ \!d\mu$
(5.8)			$\displaystyle\leq\\|I_{h}(\bar{y}-\psi)-(\bar{y}-\psi)\\|_{L_{\infty}(\mathscr{A})}\|\mu(\Omega)\|\leq Ch^{2},$

and

(5.9)

T_{5}=\int_{\Omega}I_{h}(\mathfrak{C}_{h}y_{h}-\bar{y})\ \!d\mu\leq\|I_{h}(\mathfrak{C}_{h}y_{h}-\bar{y})\|_{L_{\infty}(\mathscr{A})}|\mu(\Omega)|

by (1.4), (2.7), (2.9) and (2.4). For $T_{1}$ , it follows from (2.8), (4.26), (4.30) and (4.32) that

(5.10)

T_{1}=\int_{\Omega}[E_{h}(y_{h}-\bar{y}_{h})-\mathfrak{C}_{h}(y_{h}-\bar{y}_{h})]\ \!d\mu\leq Ch\|\mathcal{L}_{h}(y_{h}-\bar{y}_{h})\|_{L_{2}(\Omega)}.

Finally, we obtain

(5.11)

T_{3}=\int_{\Omega}[I_{h}\mathfrak{C}_{h}(\bar{y}_{h}-y_{h})-\mathfrak{C}_{h}(\bar{y}_{h}-y_{h})]\ \!d\mu\leq Ch\|\mathcal{L}_{h}(y_{h}-\bar{y}_{h})\|_{L_{2}(\Omega)}

by (2.8), (4.30), (4.32) and (4.8). ∎

Remark 5.2.

The key ingredients in the proof of Lemma 5.1 is the regularity result (2.8) and the connection operator $\mathfrak{C}_{h}$ which is previously used in the analysis of nonconforming methods (cf. [8]).

From (5.2),(5.4) and Lemma 5.1, we have the following abstract error estimate,

(5.12)

\|\bar{y}-\bar{y}_{h}\|_{h}\leq C\left(h+\inf_{y_{h}\in K_{h}}\left[\|y_{h}-\bar{y}\|_{h}+\|I_{h}(\bar{y}-\mathfrak{C}_{h}y_{h})\|^{\frac{1}{2}}_{L_{\infty}(\mathscr{A})}\right]\right).

5.2. Concrete error estimates

Lemma 5.3.

For sufficiently small $h$ , there exists $y^{\ast}_{h}\in K_{h}$ such that

(5.13)

\|y^{\ast}_{h}-\bar{y}\|_{h}+\|I_{h}(\bar{y}-\mathfrak{C}_{h}y^{\ast}_{h})\|^{\frac{1}{2}}_{L_{\infty}(\mathscr{A})}\leq C((1+|\ln h|)^{\frac{1}{2}}h+h^{\tau}),

where $\tau$ is defined in (4.10).

Proof.

We give a sketch of the proof. The details can be found in [12, Lemma 5.1]. The crucial task here is to construct a suitable $y_{h}^{\ast}\in K_{h}$ to bound the infimum in (5.12). Let $G(\mathscr{A})$ be an open neighborhood of the active set $\mathscr{A}$ and $\epsilon_{h}=\|R_{h}\bar{y}-\bar{y}\|_{L_{\infty}(G(\mathscr{A}))}$ . We then define $y^{\ast}_{h}\in V_{h}$ as the following,

(5.14)

y^{\ast}_{h}=R_{h}\bar{y}-\epsilon_{h}I_{h}\phi,

where $\phi\in C^{\infty}$ is a nonnegative function with compact support in $\Omega$ such that $\phi=1$ on $G(\mathscr{A})$ . We can show that $y^{\ast}_{h}\in K_{h}$ . Hence, it follows from Lemma 4.2, (4.18) and (4.16) that

(5.15)		$\displaystyle\\|\bar{y}-y^{\ast}_{h}\\|_{L_{2}(\Omega)}$	$\displaystyle\leq C\epsilon_{h},$
(5.16)		$\displaystyle\\|\mathcal{L}_{h}(\bar{y}-y^{\ast}_{h})\\|_{L_{2}(\Omega)}$	$\displaystyle\leq C\epsilon_{h}.$

Therefore, we obtain, by (5.1) and (4.19),

(5.17)

\|\bar{y}-y^{\ast}_{h}\|_{h}\leq C((1+|\ln h|)h^{2}+h^{2\tau}).

At last, it follows from (4.31), (4.28), (4.19) and (4.8) that,

(5.18)

\|I_{h}(\bar{y}-\mathfrak{C}_{h}y^{\ast}_{h})\|^{\frac{1}{2}}_{L_{\infty}(\mathscr{A})}\leq C((1+|\ln h|)^{\frac{1}{2}}h+h^{\tau}).

∎

The following theorem (cf. [12]) provide concrete error estimates for the discontinuous Galerkin method (3.15).

Theorem 5.4.

Let $\bar{y}_{h}\in K_{h}$ be the solution of (3.15) and $\bar{u}_{h}=\mathcal{L}_{h,g}\bar{y}_{h}$ . We have

\displaystyle\|\bar{u}-\bar{u}_{h}\|_{L_{2}(\Omega)}+\|\bar{y}-\bar{y}_{h}\|_{L_{2}(\Omega)}+\left|\mkern-2.0mu\left|\mkern-2.0mu\left|\bar{y}-\bar{y}_{h}\right|\mkern-2.0mu\right|\mkern-2.0mu\right|_{h}+\|\bar{y}-\bar{y}_{h}\|_{L_{\infty}(\Omega)}\leq C((1+|\ln h|)^{\frac{1}{2}}h+h^{\tau}),

where $\tau$ is defined in (4.10).

6. Numerical Results

In this section we report the numerical results from two examples. The discrete problem (3.15) is solved by a primal-dual active set algorithm [6, 36]. For all examples, we take the penalty parameter $\sigma=6$ and the regularization parameter $\beta=1$ . For simplicity, we also take $\gamma=1$ . We utilized the MATLAB\C $++$ toolbox FELICITY [46] in our computation.

Example 6.1 (Square Domain [14]).

For this example, we take $\Omega=[-4,4]^{2}$ , $g=0$ , $\bm{\zeta}=[1,\ 0]^{t}$ and $\psi=|x|^{2}-1$ in (3.15). We consider the function $y_{d}$ defined as follows,

y_{d}=\left\{\begin{array}[]{cc}\mathcal{L}^{t}\mathcal{L}\bar{y}+\bar{y}&|x|>1\\ \mathcal{L}^{t}\mathcal{L}\bar{y}+\bar{y}+2&|x|\leq 1\end{array}\right..

The function $\bar{y}$ is given by

\bar{y}=\left\{\begin{array}[]{cc}|x|^{2}-1&|x|\leq 1\\ v(|x|)+(1-\phi(|x|))w(x)&1\leq|x|\leq 3\\ w(x)&|x|\geq 3\end{array}\right.,

where

	$\displaystyle v(\|x\|)=(\|x\|^{2}-1)(1-\frac{\|x\|-1}{2})^{4}+\frac{1}{4}(\|x\|-1)^{2}(\|x\|-3)^{4},$
	$\displaystyle\phi(\|x\|)=(1+4\frac{\|x\|-1}{2}+10(\frac{\|x\|-1}{2})^{2}+20(\frac{\|x\|-1}{2})^{3})(1-\frac{\|x\|-1}{2})^{4},$
	$\displaystyle w(x)=2\sin(\frac{\pi}{8}(x_{1}+4))^{3}\sin(\frac{\pi}{8}(x_{2}+4))^{3}.$

By construction, the function $\bar{y}$ is the exact solution and $\bar{y}\leq\psi$ . The active set is the unit disk $\{x:|x|<1\}$ . The convergence rates on uniform meshes are reported in Table 6. As we can see, the convergence rate is around $O(h^{2})$ (in average) for the $L_{2}$ and $L_{\infty}$ error of the state and $O(h^{\frac{3}{2}})$ for the $L_{2}$ error of the control. These are better than the estimates in Theorem 5.4 and consistent with the fact $\bar{y}\in H^{\frac{7}{2}-\varepsilon}(\Omega)$ . We also observe $O(h)$ convergence of the state in $\|\cdot\|_{h}$ which is consistent with Theorem 5.4. See Figure 1 for the optimal state, control and active set at level 6.

Table 1. Convergence results for Example 6.1 on uniform meshes with

\bm{\zeta}=[1\ 0]^{t}

$k$	$\\|\bar{y}-y_{h}\\|_{L_{2}(\Omega)}$	Order	$\\|\bar{y}-y_{h}\\|_{h}$	Order	$\\|\bar{u}-u_{h}\\|_{L_{2}(\Omega)}$	Order	$\\|\bar{y}-y_{h}\\|_{L^{\infty}(\Omega)}$	Order
$1$	3.00e+01	-	3.14e+01	-	6.23e+01	-	8.45e+00	-
$2$	1.02e+01	1.55	9.33e+00	1.75	2.66e+01	1.23	4.68e+00	0.85
$3$	2.70e+00	1.92	3.34e+00	1.48	9.33e+00	1.51	9.50e-01	2.30
$4$	7.45e-01	1.86	1.19e+00	1.48	3.54e+00	1.40	2.80e-01	1.76
$5$	1.40e-01	2.41	4.30e-01	1.47	1.39e+00	1.35	5.03e-02	2.48
$6$	6.74e-02	1.05	1.95e-01	1.14	5.59e-01	1.31	2.11e-02	1.25
$7$	3.76e-02	0.84	9.58e-02	1.02	2.08e-01	1.43	1.28e-02	0.72
$8$	7.18e-03	2.39	4.31e-02	1.15	6.75e-02	1.62	2.56e-03	2.32

Example 6.2 (L-shaped Domain [14]).

For this example, we take $\Omega=[-8,8]^{2}\setminus[0,8]\times[-8,0]$ , $g=10$ and $\bm{\zeta}=[2,\ 1]^{t}$ in (3.15). This example is a modification of Example 6.1. The functions $y_{d}$ and $\psi$ are shifted using the point $x_{\ast}=(-4,4)$ . After that, a singular function $10\psi_{s}$ is added to the functions $y_{d}$ and $\psi$ . By construction, the exact solution is $\tilde{y}=\bar{y}(x-x_{\ast})+10\psi_{s}$ where $\bar{y}$ is defined in Example 6.1. Here the function $\psi_{s}$ is defined by

(6.1)		$\displaystyle\mathcal{L}\psi_{s}$	$\displaystyle=0\quad\text{in}\ \Omega,$
(6.1)		$\displaystyle\psi_{s}$	$\displaystyle=1\quad\text{on}\ \partial\Omega.$

Tables 6 and 6 contain the convergence rates of the discontinuous Galerkin methods (3.15) on uniform meshes and graded meshes (see Figure 2). As we can see in Table 6, the convergence for the state in $\|\cdot\|_{h}$ and $L_{\infty}$ norms is approaching $O(h^{\frac{2}{3}})$ . This coincides with the theoretical results with $\tau=\alpha=\frac{2}{3}-\varepsilon$ . The convergence of the state in $L_{2}$ norm is close to $O(h^{2})$ and the convergence of the control in $L_{2}$ norm is approaching $O(h^{\frac{3}{2}})$ . These are better than the estimates in Theorem 5.4 and consistent with the fact that $\bar{y}\in H^{\frac{7}{2}-\varepsilon}(\Omega)$ and $\bar{u}\in H^{\frac{3}{2}-\varepsilon}(\Omega)$ . We also observe clear improvements of the convergence rates for the state in $\|\cdot\|_{h}$ and $L_{\infty}$ norms in Table 6. This also coincides with Theorem 5.4 with $\tau=1$ .

Table 2. Convergence results for Example 6.2 on uniform meshes

$k$	$\\|\bar{y}-y_{h}\\|_{L_{2}(\Omega)}$	Order	$\\|\bar{y}-y_{h}\\|_{h}$	Order	$\\|\bar{u}-u_{h}\\|_{L_{2}(\Omega)}$	Order	$\\|\bar{y}-y_{h}\\|_{L^{\infty}(\Omega)}$	Order
$1$	3.30e+01	-	2.79e+01	-	6.79e+01	-	7.03e+00	-
$2$	1.97e+01	0.75	2.39e+01	0.22	3.10e+01	1.13	6.73e+00	0.06
$3$	7.34e+00	1.42	1.52e+01	0.65	9.37e+00	1.73	4.90e+00	0.46
$4$	2.02e+00	1.86	8.82e+00	0.79	3.31e+00	1.50	3.08e+00	0.67
$5$	5.09e-01	1.99	4.66e+00	0.92	1.27e+00	1.39	1.88e+00	0.71
$6$	1.38e-01	1.88	2.44e+00	0.93	4.99e-01	1.35	1.15e+00	0.71
$7$	4.76e-02	1.54	1.33e+00	0.87	1.81e-01	1.47	7.15e-01	0.69
$8$	1.10e-02	2.12	7.64e-01	0.80	5.82e-02	1.63	4.47e-01	0.68

Table 3. Convergence results for Example 6.2 on graded meshes

$k$	$\\|\bar{y}-y_{h}\\|_{L_{2}(\Omega)}$	Order	$\\|\bar{y}-y_{h}\\|_{h}$	Order	$\\|\bar{u}-u_{h}\\|_{L_{2}(\Omega)}$	Order	$\\|\bar{y}-y_{h}\\|_{L^{\infty}(\Omega)}$	Order
$1$	2.13e+01	-	2.94e+01	-	3.09e+01	-	6.78e+00	-
$2$	1.14e+01	0.91	2.66e+01	0.15	2.49e+01	0.31	5.07e+00	0.42
$3$	4.18e+00	1.44	1.78e+01	0.58	4.66e+00	2.42	2.55e+00	0.99
$4$	1.04e+00	2.01	9.67e+00	0.88	1.77e+00	1.40	9.81e-01	1.38
$5$	2.52e-01	2.04	4.55e+00	1.09	8.51e-01	1.05	3.15e-01	1.64
$6$	6.75e-02	1.90	2.03e+00	1.16	3.74e-01	1.19	1.50e-01	1.07
$7$	3.24e-02	1.06	9.06e-01	1.17	1.66e-01	1.17	5.93e-02	1.34
$8$	3.99e-03	3.02	4.14e-01	1.13	5.49e-02	1.59	2.80e-02	1.08

7. Concluding Remark

We propose and analyze discontinuous Galerkin methods to solve an optimal control problem with a general state equation and pointwise state constraints on general polygonal domains. Concrete error estimates are established and numerical results are provided to support the theoretical results. We do not consider convection-dominated case in this paper, hence the constants throughout this paper might depend on $\bm{\zeta}$ and $\gamma$ . However, we would like to point out that the potential of our methods is to solve optimal control problems governed by convection-dominated equations with pointwise state constraints. There are some previous work [38, 34] concerning the optimal control problems governed by convection-dominated problems without state constraints. As pointed out in [38], the weak treatment of the Dirichlet boundary conditions (as we did in (3.9)) are crucial for optimal control problems governed by convection-dominated equations. However, rigorous analysis of the convection-dominated case seems nontrivial. We will investigate this in future work.

Appendix A Primal-dual active set algorithm

Now we rewrite (3.15) in matrix and vector form. Let $\mathbf{M}_{h}$ (resp., $\mathbf{A}_{h}$ ) denote the mass (resp., stiffness) matrix represent the bilinear form $(\cdot,\cdot)_{L_{2}(\Omega)}$ (resp., $a_{h}(\cdot,\cdot)$ ) with respect to the natural discontinuous nodal basis in $V_{h}$ . Assume $\mathbf{L}_{h}$ represents $\mathcal{L}_{h,g}$ and $\mathbf{g}$ represents the boundary integral term in (3.9). It follows from (3.9) that

(A.1)

\mathbf{v}^{t}\mathbf{M}_{h}\mathbf{L}_{h}(\mathbf{u})=\mathbf{v}^{t}\mathbf{A}_{h}\mathbf{u}+\mathbf{v}^{t}\mathbf{g}\quad\forall\mathbf{u},\mathbf{v}\in\mathbb{R}^{n_{h}},

where $n_{h}=\dim V_{h}$ . This leads to the relation

(A.2)

\mathbf{L}_{h}(\mathbf{u})=\mathbf{M}_{h}^{-1}\mathbf{A}_{h}\mathbf{u}+\mathbf{M}_{h}^{-1}\mathbf{g}\quad\forall\mathbf{u}\in\mathbb{R}^{n_{h}},

and

(A.3)

(\mathbf{L}_{h}(\mathbf{u}))^{t}=\mathbf{u}^{t}\mathbf{A}^{t}_{h}\mathbf{M}_{h}^{-1}+\mathbf{g}^{t}\mathbf{M}_{h}^{-1}\quad\forall\mathbf{u}\in\mathbb{R}^{n_{h}}.

Thus (3.15) can be rewritten as, by (A.2) and (A.3),

(A.4)			$\displaystyle\operatorname*{argmin}_{\mathbf{y}_{h}\leq\bm{\psi}}\frac{1}{2}(\mathbf{y}_{h}-\mathbf{y}_{d})^{t}\mathbf{M}_{h}(\mathbf{y}_{h}-\mathbf{y}_{d})+\frac{\beta}{2}(\mathbf{L}_{h}(\mathbf{y}_{h}))^{t}\mathbf{M}_{h}\mathbf{L}_{h}(\mathbf{y}_{h})$
			$\displaystyle=\operatorname*{argmin}_{\mathbf{y}_{h}\leq\bm{\psi}}\frac{1}{2}\mathbf{y}_{h}^{t}\mathbf{M}_{h}\mathbf{y}_{h}-\mathbf{y}_{h}^{t}(\mathbf{M}_{h}\mathbf{y}_{d})$
			$\displaystyle\hskip 28.45274pt+\frac{\beta}{2}(\mathbf{y}_{h}^{t}\mathbf{A}^{t}_{h}\mathbf{M}_{h}^{-1}+\mathbf{g}^{t}\mathbf{M}_{h}^{-1})\mathbf{M}_{h}(\mathbf{M}_{h}^{-1}\mathbf{A}_{h}\mathbf{y}_{h}+\mathbf{M}_{h}^{-1}\mathbf{g})$
			$\displaystyle=\operatorname*{argmin}_{\mathbf{y}_{h}\leq\bm{\psi}}\frac{1}{2}\mathbf{y}_{h}^{t}\left[\beta\mathbf{A}^{t}_{h}\mathbf{M}_{h}^{-1}\mathbf{A}_{h}+\mathbf{M}_{h}\right]\mathbf{y}_{h}-\mathbf{y}_{h}^{t}(-\beta\mathbf{A}^{t}_{h}\mathbf{M}_{h}^{-1}\mathbf{g}+\mathbf{M}_{h}\mathbf{y}_{d}).$

Denote $\mathbf{B}_{h}=\beta\mathbf{A}^{t}_{h}\mathbf{M}_{h}^{-1}\mathbf{A}_{h}+\mathbf{M}_{h}$ and $\tilde{\mathbf{y}}_{d}=-\beta\mathbf{A}^{t}_{h}\mathbf{M}_{h}^{-1}\mathbf{g}+\mathbf{M}_{h}\mathbf{y}_{d}$ . Let $\mathfrak{n}=\{1,2,\ldots,n_{h}\}$ , the primal-dual active set method for (A.4) is the following.

•

Given an initial guess $(\mathbf{y}_{0},\bm{\lambda}_{0})$ where $\bm{\lambda}_{0}\geq 0$ , we define

	$\displaystyle\mathcal{A}_{0}$	$\displaystyle=$	$\displaystyle\{j\in\mathfrak{n}:\bm{\lambda}_{0}+c(\mathbf{y}_{0}(j)-\bm{\psi}(j))>0\},$
	$\displaystyle\mathcal{I}_{0}$	$\displaystyle=$	$\displaystyle\{j\in\mathfrak{n}:\bm{\lambda}_{0}+c(\mathbf{y}_{0}(j)-\bm{\psi}(j))\leq 0\}=\mathfrak{n}\setminus\mathcal{A}_{0}.$

•

For $k\geq 1$ we recursively solve the system

(A.5)	$\displaystyle\mathbf{B}_{h}\mathbf{y}_{k}+\bm{\lambda}_{k}$	$\displaystyle=$	$\displaystyle\tilde{\mathbf{y}}_{d},$
(A.6)	$\displaystyle\mathbf{y}_{k}$	$\displaystyle=$	$\displaystyle\bm{\psi}\ \ \ \mbox{on}\ \ \ \mathcal{A}_{k-1},$
(A.7)	$\displaystyle\bm{\lambda}_{k}$	$\displaystyle=$	$\displaystyle 0\ \ \ \ \mbox{on}\ \ \ \mathcal{I}_{k-1}.$

•

Then update the active set and inactive set by

	$\displaystyle\mathcal{A}_{k}$	$\displaystyle=$	$\displaystyle\{j\in\mathfrak{n}:\bm{\lambda}_{k}+c(\mathbf{y}_{k}(j)-\bm{\psi}(j))>0\},$
	$\displaystyle\mathcal{I}_{k}$	$\displaystyle=$	$\displaystyle\{j\in\mathfrak{n}:\bm{\lambda}_{k}+c(\mathbf{y}_{k}(j)-\bm{\psi}(j))\leq 0\}=\mathfrak{n}\setminus\mathcal{A}_{k}.$

The unique solution of (A.5)-(A.7) is determined by

	$\displaystyle\mathbf{y}_{k}({\scriptscriptstyle\mathcal{A}_{k-1}})$	$\displaystyle=$	$\displaystyle\bm{\psi}({\scriptscriptstyle\mathcal{A}_{k-1}}),$
	$\displaystyle\bm{\lambda}_{k}({\scriptscriptstyle\mathcal{I}_{k-1}})$	$\displaystyle=$	$\displaystyle 0,$
(A.8)		$\displaystyle\mathbf{B}_{h}({\scriptscriptstyle\mathcal{I}_{k-1},\mathcal{I}_{k-1}})\mathbf{y}_{k}({\scriptscriptstyle\mathcal{I}_{k-1}})$	$\displaystyle=$	$\displaystyle\tilde{\mathbf{y}}_{d}({\scriptscriptstyle\mathcal{I}_{k-1}})-\mathbf{B}_{h}({\scriptscriptstyle\mathcal{I}_{k-1},\mathcal{A}_{k-1}})\bm{\psi}({\scriptscriptstyle\mathcal{A}_{k-1}}),$
	$\displaystyle\bm{\lambda}_{k}({\scriptscriptstyle\mathcal{A}_{k-1}})$	$\displaystyle=$	$\displaystyle\tilde{\mathbf{y}}_{d}({\scriptscriptstyle\mathcal{A}_{k-1}})-(\mathbf{B}_{h}\mathbf{y}_{k})({\scriptscriptstyle\mathcal{A}_{k-1}}).$

Remark A.1.

Here we follow the MATLAB convention that the vector $\mathbf{y}_{k}({\scriptscriptstyle\mathcal{A}_{k-1}})$ is the subvector of $\mathbf{y}_{k}$ generated by the components of $\mathbf{y}_{k}$ corresponding to the index set ${\scriptscriptstyle\mathcal{A}_{k-1}}$ , the matrix $\mathbf{B}_{h}({\scriptscriptstyle\mathcal{I}_{k-1},\mathcal{I}_{k-1}})$ is the submatrix of $\mathbf{B}_{h}$ generated by the rows and columns of $\mathbf{B}_{h}$ corresponding to the index set ${\scriptscriptstyle\mathcal{I}_{k-1}}$ , etc.

Appendix B Proofs of Lemma 3.3 and Lemma 4.3

Proof of Lemma 3.3.

It is well-known that [3, 42, 17]

(B.1)		$\displaystyle a^{\text{sip}}_{h}(w,v)$	$\displaystyle\leq C\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|w\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|v\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}\quad$	$\displaystyle\forall w,v\in E(\Delta;L_{2}(\Omega))+V_{h},$
(B.2)		$\displaystyle a^{\text{sip}}_{h}(v,v)$	$\displaystyle\geq C\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|v\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|^{2}_{h}\quad$	$\displaystyle\forall v\in V_{h}.$

For the advection-reaction term, we have, for all $w,v\in E(\Delta;L_{2}(\Omega))+V_{h}$ ,

	$\displaystyle a^{\text{ar}}_{h}(w,v)$	$\displaystyle=\sum_{T\in\mathcal{T}_{h}}(\bm{\zeta}\cdot\nabla w+\gamma w,v)_{T}-\sum_{e\in\mathcal{E}^{i}_{h}\cup\mathcal{E}^{b,-}_{h}}(\mathbf{n}\cdot\bm{\zeta}[w],\{v\})_{e}$
		$\displaystyle\lesssim\left(\sum_{T\in\mathcal{T}_{h}}\\|\nabla w\\|^{2}_{L_{2}(T)}\right)^{\frac{1}{2}}\\|v\\|_{L_{2}(\Omega)}+\\|w\\|_{L_{2}(\Omega)}\\|v\\|_{L_{2}(\Omega)}$
		$\displaystyle\hskip 8.5359pt+\left(\sum_{e\in\mathcal{E}^{i}_{h}\cup\mathcal{E}^{b,-}_{h}}\frac{\sigma}{h_{e}}\\|[w]\\|^{2}_{L_{2}(e)}\right)^{\frac{1}{2}}\left(\sum_{e\in\mathcal{E}^{i}_{h}\cup\mathcal{E}^{b,-}_{h}}\frac{h_{e}}{\sigma}\\|\{v\}\\|^{2}_{L_{2}(e)}\right)^{\frac{1}{2}}$
		$\displaystyle\lesssim\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|w\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|v\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h},$

where we use $\bm{\zeta}\in[W^{1,\infty}(\Omega)]^{2}$ , $\gamma\in W^{1}_{\infty}(\Omega)$ , (4.5) and (4.7). Furthermore, upon integration by parts, we have, for all $v\in V_{h}$ ,

	$\displaystyle a^{\text{ar}}_{h}(v,v)$	$\displaystyle=\sum_{T\in\mathcal{T}_{h}}(\bm{\zeta}\cdot\nabla v+\gamma v,v)_{T}-\sum_{e\in\mathcal{E}^{i}_{h}\cup\mathcal{E}^{b,-}_{h}}(\mathbf{n}\cdot\bm{\zeta}[v],\{v\})_{e}$
		$\displaystyle=\sum_{T\in\mathcal{T}_{h}}((\gamma-\frac{1}{2}\nabla\cdot\bm{\zeta})v,v)_{T}+\sum_{T\in\mathcal{T}_{h}}\int_{\partial T}\frac{1}{2}(\bm{\zeta}\cdot\mathbf{n})v^{2}\ \!ds$
		$\displaystyle\hskip 14.22636pt-\sum_{e\in\mathcal{E}^{i}_{h}\cup\mathcal{E}^{b,-}_{h}}\bm{\zeta}\cdot\mathbf{n}[v]\{v\}\ \!ds$
		$\displaystyle=\sum_{T\in\mathcal{T}_{h}}((\gamma-\frac{1}{2}\nabla\cdot\bm{\zeta})v,v)_{T}+\int_{\partial\Omega}\frac{1}{2}\|\bm{\zeta}\cdot\mathbf{n}\|v^{2}\ \!ds.$

By the assumption (1.6), we immediately have $a^{\text{ar}}_{h}(v,v)\geq 0$ . This finishes the proof. ∎

Proof of Lemma 4.3.

It follows from (3.18), (3.19), (4.14) and (4.8) that

(B.3)	$\displaystyle\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|I_{h}w-R_{h}w\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|^{2}_{h}$	$\displaystyle\leq Ca_{h}(I_{h}w-R_{h}w,I_{h}w-R_{h}w)$
		$\displaystyle=Ca_{h}(I_{h}w-w,I_{h}w-R_{h}w)$
		$\displaystyle\leq Ch^{\tau}\\|\mathcal{L}w\\|_{L_{2}(\Omega)}\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|I_{h}w-R_{h}w\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h},$

which implies

(B.4)

\left|\mkern-2.0mu\left|\mkern-2.0mu\left|I_{h}w-R_{h}w\right|\mkern-2.0mu\right|\mkern-2.0mu\right|_{h}\leq Ch^{\tau}\|\mathcal{L}w\|_{L_{2}(\Omega)}.

Hence we have (4.17) by triangle inequality. The estimate (4.18) is established by a duality argument. Let $\phi\in H^{1}_{0}(\Omega)$ be defined by

(B.5)		$\displaystyle-\Delta\phi-\bm{\zeta}\cdot\nabla\phi+(\gamma-\nabla\cdot\bm{\zeta})\phi$	$\displaystyle=w-R_{h}w\quad$		$\displaystyle\text{in}\quad\Omega,$
(B.5)		$\displaystyle\phi$	$\displaystyle=0\quad$		$\displaystyle\text{on}\quad\partial\Omega.$

The weak form of the dual problem (B.5) is to find $\phi\in H^{1}_{0}(\Omega)$ such that

(B.6)

a(v,\phi)=(v,w-R_{h}w)\quad\forall v\in H^{1}_{0}(\Omega).

By elliptic regularity (1.9), we have

(B.7)

\|\phi\|_{H^{1+\alpha}(\Omega)}\leq C_{\Omega}\|w-R_{h}w\|_{L_{2}(\Omega)}.

It follows from (B.5) that,

(B.8)			$\displaystyle\\|w-R_{h}w\\|^{2}_{L_{2}(\Omega)}=(-\Delta\phi-\bm{\zeta}\cdot\nabla\phi+(\gamma-\nabla\cdot\bm{\zeta})\phi,w-R_{h}w)_{L_{2}(\Omega)}$
(B.8)			$\displaystyle\quad=(-\Delta\phi,w-R_{h}w)_{L_{2}(\Omega)}+\sum_{T\in\mathcal{T}_{h}}(-\bm{\zeta}\cdot\nabla\phi+(\gamma-\nabla\cdot\bm{\zeta})\phi,w-R_{h}w)_{T}.$

By the adjoint consistency of the SIP method, we have

(B.9)

(-\Delta\phi,w-R_{h}w)_{L_{2}(\Omega)}=a_{h}^{sip}(w-R_{h}w,\phi).

It follows from integration by parts that

(B.10)		$\displaystyle\sum_{T\in\mathcal{T}_{h}}(-\bm{\zeta}\cdot\nabla\phi+(\gamma-\nabla\cdot\bm{\zeta})\phi,w-R_{h}w)_{T}$
	$\displaystyle=$	$\displaystyle\sum_{T\in\mathcal{T}_{h}}(\bm{\zeta}\cdot\nabla(w-R_{h}w),\phi)_{T}+(\gamma(w-R_{h}w),\phi)_{T}$
		$\displaystyle\hskip 14.22636pt-\sum_{T\in\mathcal{T}_{h}}\int_{\partial T}(\bm{\zeta}\cdot\mathbf{n})(w-R_{h}w)\phi\ \!ds.$

The last term can be rewritten as the following [3, 30],

(B.11)		$\displaystyle\sum_{T\in\mathcal{T}_{h}}\int_{\partial T}(\bm{\zeta}\cdot\mathbf{n})(w-R_{h}w)\phi\ \!ds$
	$\displaystyle=$	$\displaystyle\sum_{e\in\mathcal{E}_{h}^{i}}\int_{e}\bm{\zeta}\cdot\mathbf{n}[(w-R_{h}w)\phi]\ \!ds+\sum_{e\in\mathcal{E}_{h}^{b}}\int_{e}\bm{\zeta}\cdot\mathbf{n}(w-R_{h}w)\phi\ \!ds$
	$\displaystyle=$	$\displaystyle\sum_{e\in\mathcal{E}_{h}^{i}}\int_{e}\bm{\zeta}\cdot\mathbf{n}[w-R_{h}w]\{\phi\}\ \!ds+\sum_{e\in\mathcal{E}_{h}^{i}}\int_{e}\bm{\zeta}\cdot\mathbf{n}\{w-R_{h}w\}[\phi]\ \!ds$
		$\displaystyle\hskip 14.22636pt+\sum_{e\in\mathcal{E}_{h}^{b}}\int_{e}\bm{\zeta}\cdot\mathbf{n}(w-R_{h}w)\phi\ \!ds.$

It then follows from $[\phi]=0$ on internal edges and $\phi=0$ on $\partial\Omega$ that

(B.12)

\sum_{T\in\mathcal{T}_{h}}\int_{\partial T}(\bm{\zeta}\cdot\mathbf{n})(w-R_{h}w)\phi\ \!ds=\sum_{e\in\mathcal{E}_{h}^{i}\cup\mathcal{E}_{h}^{b,-}}\int_{e}\bm{\zeta}\cdot\mathbf{n}[(w-R_{h}w)]\{\phi\}\ \!ds.

According to (B.10)-(B.12), we conclude

(B.13)

\sum_{T\in\mathcal{T}_{h}}(-\bm{\zeta}\cdot\nabla\phi+(\gamma-\nabla\cdot\bm{\zeta})\phi,w-R_{h}w)_{T}=a_{h}^{ar}(w-R_{h}w,\phi),

which implies the following together with (B.9),

(B.14)

\|w-R_{h}w\|^{2}_{L_{2}(\Omega)}=a_{h}(w-R_{h}w,\phi).

Therefore, it follows from (B.14), (4.14), (4.8), (3.18) and (B.7) that

(B.15)	$\displaystyle\\|w-R_{h}w\\|^{2}_{L_{2}(\Omega)}$	$\displaystyle=a_{h}(w-R_{h}w,\phi)$
		$\displaystyle=a_{h}(w-R_{h}w,\phi-I_{h}\phi)\leq C\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|w-R_{h}w\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|\phi-I_{h}\phi\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}$
		$\displaystyle\leq Ch^{\tau}\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|w-R_{h}w\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}\\|w-R_{h}w\\|_{L_{2}(\Omega)}.$

We then obtain the estimate (4.18) combining (4.17) and (B.15).

∎

Acknowledgement

The authors would like to thank Prof. Susanne C. Brenner and Prof. Li-Yeng Sung for the suggestion and discussion regarding this project. The work of the third author was partially supported by the National Science Foundation under grant DMS-2111004.

References

[1] R. A. Adams and J. J. F. Fournier. Sobolev Spaces, volume 140. Elsevier, 2003.
[2] T. Apel, A.-M. Sändig, and J. R. Whiteman. Graded mesh refinement and error estimates for finite element solutions of elliptic boundary value problems in non-smooth domains. Mathematical methods in the Applied Sciences, 19(1):63–85, 1996.
[3] D. N. Arnold, F. Brezzi, B. Cockburn, and L. Marini. Unified analysis of discontinuous galerkin methods for elliptic problems. SIAM Journal on Numerical Analysis, 39(5):1749–1779, 2002.
[4] B. Ayuso and L. D. Marini. Discontinuous galerkin methods for advection-diffusion-reaction problems. SIAM Journal on Numerical Analysis, 47(2):1391–1420, 2009.
[5] I. Babuška. Finite element method for domains with corners. Computing, 6(3):264–273, 1970.
[6] M. Bergounioux and K. Kunisch. Primal-dual strategy for state-constrained optimal control problems. Computational Optimization and Applications, 22(2):193–224, 2002.
[7] J. J. Brannick, H. Li, and L. T. Zikatanov. Uniform convergence of the multigrid V-cycle on graded meshes for corner singularities. Numerical Linear Algebra with Applications, 15(2-3):291–306, 2008.
[8] S. C. Brenner. Poincaré–Friedrichs inequalities for piecewise ${H}^{1}$ functions. SIAM Journal on Numerical Analysis, 41(1):306–324, 2003.
[9] S. C. Brenner. Finite element methods for elliptic distributed optimal control problems with pointwise state constraints (survey). Advances in Mathematical Sciences, pages 3–16, 2020.
[10] S. C. Brenner, C. B. Davis, and L.-Y. Sung. Additive Schwarz preconditioners for a state constrained elliptic distributed optimal control problem discretized by a partition of unity method. In Domain Decomposition Methods in Science and Engineering XXV 25, pages 100–107. Springer, 2020.
[11] S. C. Brenner, J. Gedicke, and L.-Y. Sung. $C^{0}$ interior penalty methods for an elliptic distributed optimal control problem on nonconvex polygonal domains with pointwise state constraints. SIAM Journal on Numerical Analysis, 56(3):1758–1785, 2018.
[12] S. C. Brenner, J. Gedicke, and L.-Y. Sung. A symmetric interior penalty method for an elliptic distributed optimal control problem with pointwise state constraints. Computational Methods in Applied Mathematics, 2023.
[13] S. C. Brenner, T. Gudi, K. Porwal, and L.-Y. Sung. A Morley finite element method for an elliptic distributed optimal control problem with pointwise state and control constraints. ESAIM: Control, Optimisation and Calculus of Variations, 24(3):1181–1206, 2018.
[14] S. C. Brenner, S. Liu, and L.-Y. Sung. A ${P}_{1}$ finite element method for a distributed elliptic optimal control problem with a general state equation and pointwise state constraints. Computational Methods in Applied Mathematics, 21(4):777–790, 2021.
[15] S. C. Brenner, S. Liu, and L.-Y. Sung. Multigrid methods for an elliptic optimal control problem with pointwise state constraints. Results in Applied Mathematics, 17:100356, 2023.
[16] S. C. Brenner, M. Oh, and L.-Y. Sung. ${P}_{1}$ finite element methods for an elliptic state-constrained distributed optimal control problem with Neumann boundary conditions. Results in Applied Mathematics, 8:100090, 2020.
[17] S. C. Brenner and L. R. Scott. The Mathematical Theory of Finite Element Methods, volume 15. Springer Science & Business Media, 2008.
[18] S. C. Brenner and L.-Y. Sung. A new convergence analysis of finite element methods for elliptic distributed optimal control problems with pointwise state constraints. SIAM Journal on Control and Optimization, 55(4):2289–2304, 2017.
[19] S. C. Brenner and L.-Y. Sung. An interior maximum norm error estimate for the symmetric interior penalty method on planar polygonal domains. Computational Methods in Applied Mathematics, 2022.
[20] S. C. Brenner, L.-Y. Sung, and J. Gedicke. $P_{1}$ finite element methods for an elliptic optimal control problem with pointwise state constraints. IMA Journal of Numerical Analysis, 11 2018.
[21] S. C. Brenner, L.-Y. Sung, and Z. Tan. A ${C}^{1}$ virtual element method for an elliptic distributed optimal control problem with pointwise state constraints. Mathematical Models and Methods in Applied Sciences, 31(14):2887–2906, 2021.
[22] S. C. Brenner, L.-Y. Sung, and Y. Zhang. A quadratic $C^{0}$ interior penalty method for an elliptic optimal control problem with state constraints. In Recent Developments in Discontinuous Galerkin Finite Element Methods for Partial Differential Equations, pages 97–132. Springer, 2014.
[23] F. Brezzi, L. D. Marini, and E. Süli. Discontinuous galerkin methods for first-order hyperbolic problems. Mathematical models and methods in applied sciences, 14(12):1893–1903, 2004.
[24] E. Casas, M. Mateos, and B. Vexler. New regularity results and improved error estimates for optimal control problems with state constraints. ESAIM: Control, Optimisation and Calculus of Variations, 20(3):803–822, 2014.
[25] Z. Chen and H. Chen. Pointwise error estimates of discontinuous Galerkin methods with penalty for second-order elliptic problems. SIAM Journal on Numerical Analysis, 42(3):1146–1166, 2004.
[26] S. Cherednichenko and A. Rösch. Error estimates for the discretization of elliptic control problems with pointwise control and state constraints. Computational Optimization & Applications, 44(1), 2009.
[27] P. Ciarlet. Analysis of the Scott–Zhang interpolation in the fractional order sobolev spaces. Journal of Numerical Mathematics, 21(3):173–180, 2013.
[28] P. G. Ciarlet. The Finite Element Method for Elliptic Problems, volume 19. 1978.
[29] M. Dauge. Elliptic boundary value problems on corner domains. Lecture Notes in Mathematics, 1341:1, 1988.
[30] D. A. Di Pietro and A. Ern. Mathematical Aspects of Discontinuous Galerkin Methods, volume 69. Springer Science & Business Media, 2011.
[31] A. Ern and J.-L. Guermond. Finite element quasi-interpolation and best approximation. ESAIM: Mathematical Modelling and Numerical Analysis, 51(4):1367–1385, 2017.
[32] J. Frehse. Zum differenzierbarkeitsproblem bei variationsungleichungen höherer ordnung. In Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg, volume 36, pages 140–149. Springer, 1971.
[33] J. Frehse. On the regularity of the solution of the biharmonic variational inequality. Manuscripta Mathematica, 9(1):91–103, 1973.
[34] M. Heinkenschloss and D. Leykekhman. Local error estimates for SUPG solutions of advection-dominated elliptic linear-quadratic optimal control problems. SIAM Journal on Numerical Analysis, 47(6):4607–4638, 2010.
[35] M. Hintermüller and M. Hinze. Moreau–Yosida regularization in state constrained elliptic control problems: Error estimates and parameter adjustment. SIAM Journal on Numerical Analysis, 47(3):1666–1683, 2009.
[36] M. Hintermüller, K. Ito, and K. Kunisch. The primal-dual active set strategy as a semismooth Newton method. SIAM Journal on Optimization, 13(3):865–888, 2002.
[37] M. Hintermüller, A. Schiela, and W. Wollner. The length of the primal-dual path in Moreau–Yosida-based path-following methods for state constrained optimal control. SIAM journal on optimization, 24(1):108–126, 2014.
[38] D. Leykekhman and M. Heinkenschloss. Local error analysis of discontinuous galerkin methods for advection-dominated elliptic linear-quadratic optimal control problems. SIAM Journal on Numerical Analysis, 50(4):2012–2038, 2012.
[39] W. Liu, W. Gong, and N. Yan. A new finite element approximation of a state-constrained optimal control problem. Journal of Computational Mathematics, pages 97–114, 2009.
[40] C. Meyer. Error estimates for the finite-element approximation of an elliptic control problem with pointwise state and control constraints. Control and Cybernetics, 37:51–83, 2008.
[41] S. Nazarov and B. A. Plamenevsky. Elliptic Problems in Domains with Piecewise Smooth Boundaries, volume 13. Walter de Gruyter, 2011.
[42] B. Rivière. Discontinuous Galerkin Methods for Solving Elliptic and Parabolic Equations: Theory and Implementation. SIAM, 2008.
[43] A. Rösch and D. Wachsmuth. A-posteriori error estimates for optimal control problems with state and control constraints. Numerische Mathematik, 120(4):733–762, 2012.
[44] H. L. Royden and P. Fitzpatrick. Real Analysis, volume 32. Macmillan New York, 1988.
[45] W. Rudin. Real and Complex Analysis. Tata McGraw-Hill Education, 2006.
[46] S. W. Walker. FELICITY: A MATLAB/C++ toolbox for developing finite element methods and simulation modeling. SIAM Journal on Scientific Computing, 40(2):C234–C257, 2018.

(4.5)		$\displaystyle\\|v\\|_{L_{2}(\partial T)}$	$\displaystyle\leq C(h_{T}^{-\frac{1}{2}}\\|v\\|_{L_{2}(T)}+h_{T}^{s-\frac{1}{2}}\|v\|_{H^{s}(T)}),$
(4.6)		$\displaystyle\\|\nabla v\\|_{L_{2}(\partial T)}$	$\displaystyle\leq C(h_{T}^{-\frac{1}{2}}\\|\nabla v\\|_{L_{2}(T)}+h_{T}^{s-\frac{1}{2}}\|\nabla v\|_{H^{s}(T)}).$

	$\displaystyle(\mathcal{L}_{h}(\phi-I_{h}\phi),v_{h})_{L_{2}(\Omega)}$	$\displaystyle=a_{h}(\phi-I_{h}\phi,v_{h})$
		$\displaystyle\lesssim\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|\phi-I_{h}\phi\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|v_{h}\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h,D}$
		$\displaystyle\lesssim h\|\phi\|_{H^{2}(\Omega)}\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|v_{h}\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h,D}$
		$\displaystyle\lesssim\|\phi\|_{H^{2}(\Omega)}\\|v_{h}\\|_{L_{2}(\Omega)},$

(4.13)	$\displaystyle\\|\mathcal{L}_{h}(I_{h}\phi)\\|_{L_{2}(\Omega)}$	$\displaystyle\leq\\|\mathcal{L}_{h}(\phi-I_{h}\phi)\\|_{L_{2}(\Omega)}+\\|\mathcal{L}_{h}\phi\\|_{L_{2}(\Omega)}$
		$\displaystyle\lesssim\|\phi\|_{H^{2}(\Omega)}+\\|Q_{h}\mathcal{L}\phi\\|_{L_{2}(\Omega)}$
		$\displaystyle\lesssim\|\phi\|_{H^{2}(\Omega)}+\\|\mathcal{L}\phi\\|_{L_{2}(\Omega)}$
		$\displaystyle\lesssim\\|\phi\\|_{H^{2}(\Omega)}.$

(4.17)		$\displaystyle\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|w-R_{h}w\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}$	$\displaystyle\leq Ch^{\tau}\\|\mathcal{L}w\\|_{L_{2}(\Omega)}\quad\forall w\in E(\Delta;L_{2}(\Omega)),$
(4.18)		$\displaystyle\\|w-R_{h}w\\|_{L_{2}(\Omega)}$	$\displaystyle\leq Ch^{2\tau}\\|\mathcal{L}w\\|_{L_{2}(\Omega)}\quad\forall w\in E(\Delta;L_{2}(\Omega)).$

(4.24)	$\displaystyle\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|E_{h}v_{h}-v_{h}\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h}$	$\displaystyle\lesssim h^{\tau}\\|\mathcal{L}_{h}v_{h}\\|_{L_{2}(\Omega)},$
(4.25)	$\displaystyle\\|E_{h}v_{h}-v_{h}\\|_{L_{2}(\Omega)}$	$\displaystyle\lesssim h^{2\tau}\\|\mathcal{L}_{h}v_{h}\\|_{L_{2}(\Omega)},$
(4.26)	$\displaystyle\left\|\mkern-2.0mu\left\|\mkern-2.0mu\left\|E_{h}v_{h}-v_{h}\right\|\mkern-2.0mu\right\|\mkern-2.0mu\right\|_{h,G(\mathscr{A})}$	$\displaystyle\lesssim h\\|\mathcal{L}_{h}v_{h}\\|_{L_{2}(\Omega)},$

Discontinuous Galerkin Methods for an Elliptic Optimal Control Problem with a General State Equation and Pointwise State Constraints

Abstract.

Key words and phrases:

1991 Mathematics Subject Classification:

1. Introduction

2. The Continuous Problem

3. The Discrete Problem

3.1. Mixed discontinuous Galerkin methods

Remark 3.1.

Remark 3.2.

3.2. Properties of ah​(⋅,⋅)a_{h}(\cdot,\cdot)

Lemma 3.3.

3.3. Properties of ℒh,g\mathcal{L}_{h,g}

3.4. Discrete variational inequalities

4. Preliminary Estimates

4.1. Graded Meshes

4.2. Preliminary inequalities

4.3. Interpolation operator IhI_{h}

Lemma 4.1.

Lemma 4.2.

Proof.

4.4. The Ritz Projection Operator RhR_{h}

Lemma 4.3.

Proof.

4.5. The Smoothing Operator EhE_{h}

Lemma 4.4.

Proof.

4.6. The Connection Operator ℭh\mathfrak{C}_{h}

Remark 4.5.

5. Convergence Analysis

5.1. An abstract error estimate

Lemma 5.1.

Proof.

Remark 5.2.

5.2. Concrete error estimates

Lemma 5.3.

Proof.

Theorem 5.4.

6. Numerical Results

Example 6.1 (Square Domain [14]).

Example 6.2 (L-shaped Domain [14]).

7. Concluding Remark

Appendix A Primal-dual active set algorithm

Remark A.1.

Appendix B Proofs of Lemma 3.3 and Lemma 4.3

Proof of Lemma 3.3.

Proof of Lemma 4.3.

Acknowledgement

References

3.2. Properties of $a_{h}(\cdot,\cdot)$

3.3. Properties of $\mathcal{L}_{h,g}$

4.3. Interpolation operator $I_{h}$

4.4. The Ritz Projection Operator $R_{h}$

4.5. The Smoothing Operator $E_{h}$

4.6. The Connection Operator $\mathfrak{C}_{h}$