This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Fast MATLAB evaluation of nonlinear energies
using FEM in 2D and 3D: Nodal elements

Alexej Moskovka Department of Mathematics, Faculty of Applied Sciences, University of West Bohemia, Technická 8, 30100 Plzeň, Czechia Jan Valdman Institute of Information Theory and Automation, Czech Academy of Sciences, Pod vodárenskou věží 4, 18200 Praha 8, Czechia Department of Computer Science, Faculty of Science, University of South Bohemia, Branišovská 1760, 37005 České Budějovice, Czechia
Abstract

Nonlinear energy functionals appearing in the calculus of variations can be discretized by the finite element (FE) method and formulated as a sum of energy contributions from local elements. A fast evaluation of energy functionals containing the first order gradient terms is a central part of this contribution. We describe a vectorized implementation using the simplest linear nodal (P1) elements in which all energy contributions are evaluated all at once without the loop over triangular or tetrahedral elements. Furthermore, in connection to the first-order optimization methods, the discrete gradient of energy functional is assembled in a way that the gradient components are evaluated over all degrees of freedom all at once. The key ingredient is the vectorization of exact or approximate energy gradients over nodal patches. It leads to a time-efficient implementation at higher memory-cost. Provided codes in MATLAB related to 2D/3D hyperelasticity and 2D p-Laplacian problem are available for download and structured in a way it can be easily extended to other types of vector or scalar forms of energies.

1 Introduction

Given a domain Ωdim\Omega\in\mathbb{R}^{dim}, where dim{1,2,3}dim\in\{1,2,3\} is the space-dimension, we consider a minimization problem

J(u)=minvVJ(v),J(\textbf{u})=\min_{\textbf{v}\in V}J(\textbf{v}), (1)

where VV is a space of trial functions and J:VJ:V\rightarrow\mathbb{R} represents an energy functional in the form

J(v)=Jgrad(v)+Jlin(v),J(\textbf{v})=J_{grad}(\textbf{v})+J_{lin}(\textbf{v}), (2)

where Jgrad(v)J_{grad}(\textbf{v}) denotes its first-gradient part and Jlin(v)J_{lin}(\textbf{v}) its linear part. Examples of such minimization problems are numerous and their study is a general subject of the Calculus of variations. There are models with higher order derivatives (such as plate problems with the second derivative in their formulation) available but not considered in this contribution.

As the main example we recall a class of vector nonlinear elasticity problems represented by minimizations of energies of hyperelastic materials [10, 12]. The trial space is chosen as

V=WD1,p(Ω,dim)V=W^{1,p}_{D}(\Omega,\mathbb{R}^{dim})

i.e., the (vector) Sobolev space of Lp(Ω)L^{p}(\Omega) integrable functions with the first weak derivative being also Lp(Ω)L^{p}(\Omega) integrable and satisfying (in the sense of traces) Dirichlet boundary conditions v(x)=uD(x)\textbf{v}(\textbf{x})=\textbf{u}_{D}(\textbf{x}) at the domain boundary xΩ\textbf{x}\in\partial\Omega for a prescribed function uD:Ωdim\textbf{u}_{D}:\partial\Omega\rightarrow\mathbb{R}^{dim}. A  primal variable is the deformation mapping vV\textbf{v}\in V describing the relocation of any point xΩ\textbf{x}\in\Omega during the deformation process. Then the gradient deformation tensor FLp(Ω,dim×dim)\textbf{F}\in L^{p}(\Omega,\mathbb{R}^{dim\times dim}) is defined as

F(v)=v=[v(1)x1v(1)xdimv(dim)x1v(dim)xdim].\textbf{F}(\textbf{v})=\nabla\textbf{v}=\begin{bmatrix}\frac{\partial v^{(1)}}{\partial x_{1}}&\dots&\frac{\partial v^{(1)}}{\partial x_{dim}}\\ \vdots&&\vdots\\ \frac{\partial v^{(dim)}}{\partial x_{1}}&\dots&\frac{\partial v^{(dim)}}{\partial x_{dim}}\\ \end{bmatrix}. (3)

The first-gradient and the linear parts of the energy functional (2) read

Jgrad(v)=ΩW(F(v(x)))dx,Jlin(v)=Ωf(x)v(x)dx,J_{grad}(\textbf{v})=\int\limits_{\Omega}W\big{(}\textbf{F}(\textbf{v}(\textbf{x}))\big{)}\,\mathrm{d}\textbf{x},\qquad J_{lin}(\textbf{v})=-\int\limits_{\Omega}\textbf{f}(\textbf{x})\cdot\textbf{v}(\textbf{x})\,\mathrm{d}\textbf{x}\,,

where W:dim×dimW:\mathbb{R}^{dim\times dim}\rightarrow\mathbb{R} defines a strain-energy density function and f:Ωdim\textbf{f}:\Omega\rightarrow\mathbb{R}^{dim} a loading functional. We assume the compressible Neo-Hookean density

W(F)=C1(I1(F)dim2log(detF))+D1(detF1)2,W(\textbf{F})=C_{1}\big{(}I_{1}(\textbf{F})-dim-2\log(\det\textbf{F})\big{)}+D_{1}(\det\textbf{F}-1)^{2}, (4)

where I1(F)=|F|2I_{1}(\textbf{F})=|\textbf{F}|^{2} uses the Frobenius norm |||\cdot|, and det()\det(\cdot) is the matrix determinant operator. An extension to other gradient densities as the St. Venant Kirchhoff is possible.

As the second example we recall a scalar p-Laplacian problem [7] with the energy functional defined as

J(v)=1pΩ|v(x)|pdxΩf(x)v(x)dx,J(v)=\frac{1}{p}\int\limits_{\Omega}|\nabla v(\textbf{x})|^{p}\,\mathrm{d}\textbf{x}-\int\limits_{\Omega}f(\textbf{x})\,v(\textbf{x})\,\mathrm{d}\textbf{x}\,, (5)

where V=WD1,p(Ω,)V=W^{1,p}_{D}(\Omega,\mathbb{R}). The functional J(v)J(v) is then known to be strictly convex in VV for p(1,)p\in(1,\infty) and it has therefore a unique minimizer u(x)Vu(\textbf{x})\in V.

The main motivation of this contribution is to describe how nonlinear energy functionals can be efficiently and automatically evaluated by the finite element method (FE). We provide vectorization concepts and MATLAB implementation for

  • evaluations of the value J(v)J(\textbf{v}),

  • evaluations of the gradient vector J(v)\nabla J(\textbf{v})

expressed for a trial function vV\textbf{v}\in V. These two objects can be passed to an external optimization method of the first order to find a minima of the energy uV\textbf{u}\in V and the corresponding minimal energy value J(u)J(\textbf{u}). We utilize the trust-region optimization method [4] available in the MATLAB Optimization Toolbox for benchmarking. Our implementation is built on the top of our own vectorized codes [3, 6, 15] developed primarily for assemblies of finite element matrices. There are no explicit loops over mesh elements in evaluations of J(v)J(\textbf{v}) and J(v)\nabla J(\textbf{v}), all necessary data such as gradients of basis functions and energy densities are computed all at once. It leads to a significant computational speedup, but also it is memory intensive. Practically, the user specifies the form of the energy J(v)J(\textbf{v}) and the corresponding gradient vector J(v)\nabla J(\textbf{v}) is evaluated approximately by a central difference scheme. Alternatively, if the user is willing to apply some differential calculus to a particular form of the energy, then the exact gradient can be assembled. This exact gradient approach is not versatile, but leads to the further performance speedup in our tests. We set up a set of six benchmarks as a base for future tests and improvements:

  • The mesh and the nodal patches data are preprocessed in Benchmark 1.

  • Evaluations of J(v)J(\textbf{v}) and J(v)\nabla J(\textbf{v}) for a given vV\textbf{v}\in V are provided in Benchmarks 2 and 3.

  • The full minimizations of 2D/3D hyperelasticity and 2D p-Laplacian energies are shown in Benchmarks 4, 5 and 6.

Authors are not aware of any similar MATLAB implementation. This is also our first attempt in this direction apart from our own contribution [13] focusing on p-Laplacian energy minimization. There is a growing number of MATLAB vectorized implementations of the second order linear partial differential equations eg. [9, 17] or particular nonlinear partial differential equations [6, 16].

The paper is structured in the following way: Section 2 summarizes useful notation and Section 3 basics of FEM. Section 4 includes implementation of two structures: mesh and (nodal) patches. Section 5 is focused on implementation of energy evaluation and Section 6 on implementation of the gradient of energy. The final Section 7 reports on the solutions of particular minimization problems.

2 Notation

Index mappings in the construction of FEM:

ILG:I_{LG}:\mathbb{N}\rightarrow\mathbb{N} - (local to global) mapping which for a local basis function on an element returns the index of the corresponding global basis function

IDN:I_{DN}:\mathbb{N}\rightarrow\mathbb{N} - (degree to node) mapping which for the nn-th degree of freedom returns the index ii of the corresponding node, see (22)

Domain triangulations are described by the following parameters:

𝒯,𝒩\mathcal{T},\mathcal{N} - a set of elements, a set of nodes

𝒩\mathcal{M}\subset\mathcal{N} - a set of (at least partially) free nodes

𝒯i,𝒯(n)\mathcal{T}^{i},\mathcal{T}^{(n)} - the ii-th nodal patch, the (IDN(n))\big{(}I_{DN}(n)\big{)}-th nodal patch

|𝒯|,|𝒩|,||,|𝒯i|,|𝒯(n)||\mathcal{T}|,|\mathcal{N}|,|\mathcal{M}|,|\mathcal{T}^{i}|,|\mathcal{T}^{(n)}| - the sizes of sets 𝒯,𝒩,,𝒯i,𝒯(n)\mathcal{T},\mathcal{N},\mathcal{M},\mathcal{T}^{i},\mathcal{T}^{(n)}

Both scalar and vector problems are treated together as a vector problem with dd components, where d=1d=1 for scalar problems, d=dimd=dim for vector problems and dimdim is the space dimension. The following indices are frequently used:

ii - the index of a node (i{1,,nb}i\in\{1,\ldots,n_{b}\}, also i{1,,|𝒩|}i\in\{1,\ldots,|\mathcal{N}|\} or i{1,,||}i\in\{1,\ldots,|\mathcal{M}|\})

jj - the index of a vector component (j{1,,d}j\in\{1,\ldots,d\})

kk - the index of an element (k{1,,|𝒯|}k\in\{1,\ldots,|\mathcal{T}|\}), also (k{1,,|𝒯i|k\in\{1,\ldots,|\mathcal{T}^{i}|) or (k{1,,|𝒯(n)|k\in\{1,\ldots,|\mathcal{T}^{(n)}|)

\ell - the index of a local basis function ({1,,dim+1}\ell\in\{1,\ldots,dim+1\})

mm - the index of a spatial component (m{1,,dim}m\in\{1,\ldots,dim\})

nn - the index of a global degree of freedom (n{1,,d|𝒩|}n\in\{1,\ldots,d|\mathcal{N}|\}) or an active degree of freedom (n{1,,d||}n\in\{1,\ldots,d|\mathcal{M}|\})

rr - the index of a global patches matrix row (r{1,,||}r\in\{1,\ldots,|\mathcal{M}|\})

Nodal basis functions are used in several ways:

φi(x),𝝋i(x)\varphi_{i}(\textbf{x}),\boldsymbol{\varphi}_{i}(\textbf{x}) - a scalar global nodal basis function, a vector global nodal basis function

𝝋k,(x)\boldsymbol{\varphi}_{k,\ell}(\textbf{x}) - the \ell-th local basis function on the kk-th element

A trial vector function is addressed in several ways:

v(x),v(j)(x)\textbf{v}(\textbf{x}),\,v^{(j)}(\textbf{x}) - a trial vector function and its jj-th component

V - a matrix of the coefficients of v(x)\textbf{v}(\textbf{x}) in the nodal finite element basis

vi,v(j),vi(j)\textbf{v}_{i}\,,\,\textbf{v}^{(j)}\,,\,v^{(j)}_{i} - the ii-th row of V, the jj-th column of V, the (i,j) element of V

v - a vector reshaped from V

v^\hat{\textbf{v}} - the restriction of v to free nodes

Given matrices A,Bp×q,\textbf{A},\textbf{B}\in\mathbb{R}^{p\times q}, the following operators are used:

tr(A)\operatorname{tr}(\textbf{A}) - the trace of matrix (for p=qp=q)

det(A)\det(\textbf{A}) - the determinant of matrix (for p=qp=q)

AB\textbf{A}\odot\textbf{B} - the elementwise (Hadamard) product defined as a matrix Cp×q\textbf{C}\in\mathbb{R}^{p\times q}, where ci,j=ai,jbi,jc_{i,j}=a_{i,j}\,b_{i,j}

A:B\textbf{A}:\textbf{B} - the scalar product defined as A:B=i=1pj=1qai,jbi,j=i=1pj=1qAB\textbf{A}:\textbf{B}=\sum\limits_{i=1}^{p}\sum\limits_{j=1}^{q}a_{i,j}b_{i,j}=\sum\limits_{i=1}^{p}\sum\limits_{j=1}^{q}\textbf{A}\odot\textbf{B}

3 Finite element discretization

The finite element method [5] is applied for the discretization of (1). We assume a trial function and a trial space of the form

v(x)=(v(1)(x),,v(d)(x)),V=V(1)××V(d)\textbf{v}(\textbf{x})=(v^{(1)}(\textbf{x}),\ldots,v^{(d)}(\textbf{x})),\qquad V=V^{(1)}\times\ldots\times V^{(d)}

and approximate v(x)V\textbf{v}(\textbf{x})\in V in the finite-dimensional subspace

Vh=Vh(1)××Vh(d)V,V_{h}=V_{h}^{(1)}\times\ldots\times V_{h}^{(d)}\subset V,

where Vh(1)==Vh(d):=VhsV_{h}^{(1)}=\ldots=V_{h}^{(d)}:=V_{h}^{s} and the scalar basis space VhsV_{h}^{s} is generated from the scalar basis functions

φi(x)Vhs,i{1,,nb},\varphi_{i}(\textbf{x})\in V_{h}^{s},\qquad i\in\{1,\ldots,n_{b}\},

where nbn_{b} denotes their number. Hence, any component of v(x)\textbf{v}(\textbf{x}) is given by a linear combination

v(j)(x)=i=1nbvi(j)φi(x),xΩ,j{1,,d}.v^{(j)}(\textbf{x})=\sum_{i=1}^{n_{b}}v^{(j)}_{i}\,\varphi_{i}(\textbf{x})\,,\qquad\textbf{x}\in\Omega,\quad j\in\{1,\ldots,d\}. (6)

The coefficients vi(j)v^{(j)}_{i} from (6) are assembled in a matrix Vnb×d\textbf{V}\in\mathbb{R}^{n_{b}\times d} given as

V=(v1(1)v1(d)vnb(1)vnb(d))=(v1vnb)=(v(1)v(d))\textbf{V}=\begin{pmatrix}v_{1}^{(1)}&\ldots&v_{1}^{(d)}\\ \vdots&&\vdots\\ \vdots&&\vdots\\ v_{n_{b}}^{(1)}&\ldots&v_{n_{b}}^{(d)}\end{pmatrix}=\begin{pmatrix}\textbf{v}_{1}\\ \vdots\\ \vdots\\ \textbf{v}_{n_{b}}\end{pmatrix}=\begin{pmatrix}\textbf{v}^{(1)}\dots\textbf{v}^{(d)}\\ \end{pmatrix} (7)

and latter two equivalent forms assume a row vector vi=(vi(1),,vi(d)),i{1,,nb}\textbf{v}_{i}=(v_{i}^{(1)},\ldots,v_{i}^{(d)})\,,\,i\in\{1,\ldots,n_{b}\}, and a column vector v(j)=(v1(j),,vnb(j))T,j{1,,d}\textbf{v}^{(j)}=(v_{1}^{(j)},\ldots,v_{n_{b}}^{(j)})^{T},\,j\in\{1,\ldots,d\}. Using a row vector basis function

𝝋i(x)=(φi(x),,φi(x))d times,i{1,,nb}\boldsymbol{\varphi}_{i}(\textbf{x})=\underbrace{(\varphi_{i}(\textbf{x}),\ldots,\varphi_{i}(\textbf{x}))}_{d\,-\mbox{ times}},\qquad i\in\{1,\ldots,n_{b}\}

one can rewrite (6) in a compact way for all components j{1,,d}j\in\{1,\ldots,d\} as

v(x)=i=1nbvi𝝋i(x),xΩ,\textbf{v}(\textbf{x})=\sum_{i=1}^{n_{b}}\textbf{v}_{i}\odot\boldsymbol{\varphi}_{i}(\textbf{x})\,,\qquad x\in\Omega\,, (8)

where the symbol \odot represents an elementwise (or Hadamard) multiplication (here multiplication of components with the same index jj). Formula (8) is the key tool for a vectorized implementation, since MATLAB provides the elementwise multiplication.

The domain Ω\Omega is then approximated by its triangulation 𝒯\mathcal{T} into closed elements in the sense of Ciarlet [5]. The simplest possible elements are considered, i.e., triangles for dim=2dim=2 and tetrahedra for dim=3dim=3. The elements are geometrically specified by their nodes (or vertices) belonging to the set of nodes 𝒩\mathcal{N}. The nodes are also clustered into elements edges (for dim2dim\geq 2) and faces (for dim=3dim=3). The numbers of elements and nodes are denoted as |𝒯||\mathcal{T}| and |𝒩||\mathcal{N}|.

Given a node Ni𝒩,i{1,,|𝒩|}N_{i}\in\mathcal{N},i\in\{1,\ldots,|\mathcal{N}|\}, we define a nodal patch 𝒯i\mathcal{T}^{i} by

𝒯i={T𝒯:NiT}\mathcal{T}^{i}=\{T\in\mathcal{T}:N_{i}\in T\}

and the number of its elements by |𝒯i||\mathcal{T}^{i}|. The nodal patch 𝒯i\mathcal{T}^{i} consists of elements denoted as Tki,k{1,,|𝒯i|}T_{k}^{i}\,,\,k\in\{1,\ldots,|\mathcal{T}^{i}|\}, which are adjacent to the node NiN_{i}. The same nodal patch 𝒯i\mathcal{T}^{i} can be alternatively denoted by 𝒯(n)\mathcal{T}^{(n)}, where i=IDN(n)i=I_{DN}(n) and nn is the index of one of the corresponding degrees of freedom.

We consider only the case where Vhs=P1(𝒯)V_{h}^{s}=P^{1}(\mathcal{T}) is the space of nodal, elementwise linear and globally continuous scalar basis functions. Then the number of basis functions is equal to the number of nodes

nb=|𝒩|,n_{b}=|\mathcal{N}|\,,

but some coefficients of the trial function v(x)\textbf{v}(\textbf{x}) are known due to the Dirichlet boundary conditions.

Example 1

One regular triangulation 𝒯\mathcal{T} of an L-shape domain Ω\Omega is shown in Fig. 1. The triangulation is specified by |𝒯|=24|\mathcal{T}|=24 and |𝒩|=21|\mathcal{N}|=21. The graph of the global scalar basis function φ10(x)\varphi_{10}(\textbf{x}) is displayed. The function has a hexagonal pyramid shape and the support on the nodal patch

𝒯10={T1,T2,T7,T8,T19,T20}.\mathcal{T}^{10}=\{T_{1},T_{2},T_{7},T_{8},T_{19},T_{20}\}.

The restrictions of φ10(x)\varphi_{10}(\textbf{x}) to its six supporting triangles in 𝒯10\mathcal{T}^{10} are given as linear functions with values 1 at the node N10N_{10} and values 0 at the two remaining nodes.

Refer to caption
Figure 1: Example of a scalar nodal basis function for dim=2dim=2.

Additionally, for the scalar case (d=1d=1) the node N10N_{10} has only one corresponding degree of freedom with index 1010 and, therefore, 𝒯(10)=𝒯10\mathcal{T}^{(10)}=\mathcal{T}^{10} as long as IDN(10)=10I_{DN}(10)=10. For the vector case (d=2d=2) the same node N10N_{10} has two corresponding degrees of freedom with indices 19,2019,20, hence, 𝒯(19)=𝒯(20)=𝒯10\mathcal{T}^{(19)}=\mathcal{T}^{(20)}=\mathcal{T}^{10} as long as IDN(19)=IDN(20)=10I_{DN}(19)=I_{DN}(20)=10. This alternative notation is essential in Section 6.

3.1 The first-gradient energy term Jgrad(v)J_{grad}(\textbf{v})

Since the first gradient of any scalar basis function φi(x)P1(𝒯),i{1,,|𝒩|}\varphi_{i}(\textbf{x})\in P^{1}(\mathcal{T}),\,i\in\{1,\ldots,|\mathcal{N}|\}, is a piecewise constant function on each element, the gradient part of the discrete energy can be written as a sum over the elements

Jgrad(v)=ΩW(F(v(x)))dx=k=1|𝒯|TkW(v(x))dx=k=1|𝒯||Tk|W(v(x)|Tk),J_{grad}(\textbf{v})=\int\limits_{\Omega}W\big{(}\textbf{F}(\textbf{v}(\textbf{x}))\big{)}\,\mathrm{d}\textbf{x}=\sum_{k=1}^{|\mathcal{T}|}\int_{T_{k}}W\big{(}\nabla\textbf{v}(\textbf{x})\big{)}\,\mathrm{d}\textbf{x}\,=\sum_{k=1}^{|\mathcal{T}|}|T_{k}|\,W\big{(}\nabla\textbf{v}(\textbf{x})\big{|}_{{T_{k}}}\big{)}\,, (9)

and |Tk||T_{k}| denotes the size of the element Tk,k{1,,|𝒯|}T_{k},\,k\in\{1,\ldots,|\mathcal{T}|\} (equal to its length in 1D, its area in 2D or its volume in 3D). In order to evaluate (9), we need to assemble the gradient v(x)|Tk\textbf{v}(\textbf{x})|_{T_{k}} on every element. To do it, we define a (local-global) mapping ILG:×I_{LG}:\mathbb{N}\times\mathbb{N}\rightarrow\mathbb{N} which for the kk-th element and its \ell-th node returns the global index ii of this node. Then a local basis function is given as

𝝋k,(x)=𝝋i|Tk(x),where i=ILG(k,)\boldsymbol{\varphi}_{k,\ell}(x)=\boldsymbol{\varphi}_{i}\big{|}_{T_{k}}(x),\qquad\mbox{where }i=I_{LG}(k,\ell) (10)

for k{1,,|𝒯|},{1,,dim+1}k\in\{1,\ldots,|\mathcal{T}|\},\,\ell\in\{1,\ldots,dim+1\}. Hence, any partial derivative of v(x)\textbf{v}(\textbf{x}) with respect to xmx_{m} reads

v(x)xm|Tk==1dim+1vk,𝝋k,(x)xm,m{1,,dim},\frac{\partial\textbf{v}(\textbf{x})}{\partial x_{m}}\bigg{|}_{T_{k}}=\sum_{\ell=1}^{dim+1}\textbf{v}_{k,\ell}\odot\frac{\partial\boldsymbol{\varphi}_{k,\ell}(\textbf{x})}{\partial x_{m}}\,,\qquad m\in\{1,\ldots,dim\}\,, (11)

where vk,=vILG(k,)\textbf{v}_{k,\ell}=\textbf{v}_{I_{LG}(k,\ell)} represents the values of v in the \ell-th node of the kk-th element.

3.2 The linear energy term JlinJ_{lin}

Furthermore, if f(x)Vh\textbf{f}(\textbf{x})\in V_{h}, then the linear term of the energy (2) rewrites as

Jlin(v)=Ωf(x)v(x)dx=j=1dΩf(j)(x)v(j)(x)dx==j=1dΩ(i1=1|𝒩|fi1(j)φi1(x)i2=1|𝒩|vi2(j)φi2(x))dx==j=1di1=1|𝒩|i2=1|𝒩|fi1(j)vi2(j)Ω(φi1(x)φi2(x))dx.\begin{split}J_{lin}(\textbf{v})=&\int_{\Omega}\textbf{f}(\textbf{x})\cdot\textbf{v}(\textbf{x})\,\mathrm{d}\textbf{x}=\sum_{j=1}^{d}\int_{\Omega}f^{(j)}(\textbf{x})v^{(j)}(\textbf{x})\,\mathrm{d}\textbf{x}=\\ &=\sum_{j=1}^{d}\int_{\Omega}\bigg{(}\sum_{{i}_{1}=1}^{|\mathcal{N}|}f_{i_{1}}^{(j)}\,\varphi_{i_{1}}(\textbf{x})\sum_{i_{2}=1}^{|\mathcal{N}|}v_{i_{2}}^{(j)}\,\varphi_{i_{2}}(\textbf{x})\bigg{)}\,\mathrm{d}\textbf{x}=\\ &=\sum_{j=1}^{d}\sum_{i_{1}=1}^{|\mathcal{N}|}\sum_{i_{2}=1}^{|\mathcal{N}|}f_{i_{1}}^{(j)}v_{i_{2}}^{(j)}\int_{\Omega}\Big{(}\varphi_{i_{1}}(\textbf{x})\,\varphi_{i_{2}}(\textbf{x})\Big{)}\,\mathrm{d}\textbf{x}\,.\end{split} (12)

All integral terms in the formula above can be assembled in a sparse and symmetric mass matrix M|𝒩|×|𝒩|\textbf{M}\in\mathbb{R}^{|\mathcal{N}|\times|\mathcal{N}|} with entries

Mi1,i2=Ωφi1(x)φi2(x)dx,i1,i2{1,2,,|𝒩|}.\textbf{M}_{i_{1},i_{2}}=\int_{\Omega}\varphi_{i_{1}}(\textbf{x})\,\varphi_{i_{2}}(\textbf{x})\,\mathrm{d}\textbf{x},\qquad i_{1},i_{2}\in\{1,2,\ldots,|\mathcal{N}|\}. (13)

Then we can define vectors b(j)=Mf(j)|𝒩|,\textbf{b}^{(j)}=\textbf{M}\,\textbf{f}^{(j)}\in\mathbb{R}^{|\mathcal{N}|}, where f(j)=(f1(j),,f|𝒩|(j))T|𝒩|,\textbf{f}^{(j)}=(f_{1}^{(j)},\ldots,f_{|\mathcal{N}|}^{(j)})^{T}\in\mathbb{R}^{|\mathcal{N}|}, j{1,,d}j\in\{1,\ldots,d\} and it is easy to check the exact formula

Jlin(v)=Ωf(x)v(x)dx=j=1db(j)v(j)J_{lin}(\textbf{v})=\int_{\Omega}\textbf{f}(\textbf{x})\cdot\textbf{v}(\textbf{x})\,\mathrm{d}\textbf{x}=\sum_{j=1}^{d}\textbf{b}^{(j)}\cdot\textbf{v}^{(j)} (14)

which allows us to represent the linear part of the discrete energy J(v)J(\textbf{v}) as a linear function.

4 Implementation: Mesh and nodal patches

We describe the typical properties of meshes and nodal patches needed in our computational techniques. Considered finite element meshes consist of triangles (in 2D) and tetrahedra (in 3D), however detailed explanations of Sections 4, 5, 6 address 3D version only.

Refer to caption
Figure 2: Example: A tetrahedral mesh of a 3D bar domain with red boundary nodes.

4.1 The ’mesh’ structure

The topology and important attributes of the computational domain are stored in the structure-type data object ’mesh’. For the example of a tetrahedral mesh displayed in Fig. 2, a vector energy model and full Dirichlet boundary conditions (specified in all three directions and indicated by the nodes in red circles) it provides the following information:

mesh =  struct with fields:                dim: 3              level: 1                 nn: 729                 ne: 1920        elems2nodes: [1920×4 double]        nodes2coord: [729×3 double]            volumes: [1920×1 double]               dphi: {[1920×4 double] [1920×4 double] [1920×4 double]}     nodesDirichlet: [18×1 double]         nodesMinim: [711×1 double]      dofsDirichlet: [54×1 double]          dofsMinim: [2133×1 double] Parameter dim represents the domain dimension (here dim=3dim=3) of the problem and level the level of a uniform refinement. Higher levels of refinement lead to finer uniformly refined meshes with more elements. The numbers of mesh nodes and elements are provided in nn and ne. Mesh nodes belonging to each tetrahedron are collected in a matrix ’elems2nodes’ and the Cartesian coordinates of mesh nodes in a matrix ’nodes2coord’.

Once the last two matrices are given, the codes of [15] generate thevolumes’ of all tetrahedra together with the restrictions of the partial derivatives (gradients) of all P1-basis functions to tetrahedra stored in a cell ’dphi’ whose components are matrices corresponding to the partial derivates with respect to every xm,m{1,,dim}x_{m}\,,\,m\in\{1,\ldots,dim\}.

The indices of Dirichlet boundary nodes are stored in a vector ’nodesDirichlet’ and the remaining free nodes in a vector ’nodesMinim’. Vectors ’dofsDirichlet’ and ’dofsMinimdenote the indices of the degrees of freedom corresponding to Dirichlet and free nodes.

Remark 1

Vectors ’dofsDirichlet’ and ’dofsMinim’ are equal to vectors ’nodesDirichlet’ and ’nodesMinimin case of a scalar energy formulation.

4.2 The ’patches’ structure

This object is only relevant if the knowledge of gradient J(v)\nabla J(\textbf{v}) is required, e.g. as an input of the trust-region method of Section 7. Then, the additional structure-type data object ’patches’ is constructed for the evaluation of its gradient part given by (9).

Remark 2

Only the components of J(v)\nabla J(\textbf{v}) corresponding to free nodes are evaluated in our implementation, and the remaining components belonging to the full Dirichlet boundary conditions are omitted. Thus, the nodes with at least one free degree of freedom also belong to the set of free nodes and the gradient is evaluated in all of their components and finally restricted to free degrees of freedom.

We denote by \mathcal{M} a set of all free nodes and by |||\mathcal{M}| their number. Then, the node index i,i{1,,||}i,i\in\{1,\ldots,|\mathcal{M}|\}, goes exclusively through the free nodes. A nodal patch 𝒯i,i{1,,||}\mathcal{T}^{i}\,,\,i\in\{1,\ldots,|\mathcal{M}|\} is implemented as a vector of elements indices ’elems_i\boldsymbol{i}’, a vector of their volumes ’volumes_i\boldsymbol{i}’, a matrix of the corresponding elements nodes stored as ’elems2nodes_i\boldsymbol{i}’, values of the gradients of local basis functions stores as a cell ’dphi_i\boldsymbol{i}’ with matrices components dphi_i{1}\boldsymbol{i}\{1\}, \ldots, dphi_i{dim}\boldsymbol{i}\{dim\}. All these matrices and vectors are of size |𝒯i|×(dim+1)|\mathcal{T}^{i}|\,\times\,(dim+1) and |𝒯i|× 1|\mathcal{T}^{i}|\,\times\,1, respectively.

The data of all nodal patches 𝒯i,i{1,,||}\mathcal{T}^{i}\,,\,i\in\{1,\ldots,|\mathcal{M}|\} are then collected in the corresponding long global matrices or vectors with the number of rows equal to

𝒯=i=1|||𝒯i|.\|\mathcal{T}\|=\sum\limits_{i=1}^{|\mathcal{M}|}|\mathcal{T}^{i}|\,.

For i{1,,||}i\in\{1,\ldots,|\mathcal{M}|\} we define the indices

pi=r=1i|𝒯r|,p_{i}=\sum\limits_{r=1}^{i}|\mathcal{T}^{r}|, (15)

and additionally p0=0p_{0}=0. Then the submatrix or subvector extracted from rows (pi1+1),,pi(p_{i-1}+1),\ldots,p_{i} of the global matrices or vector above corresponds to the ii-th nodal patch. It is shown schematically in Fig. 3.

𝒯|M|\mathcal{T}^{|M|}\vdots𝒯1\mathcal{T}^{1}𝒯\|\mathcal{T}\| rows|𝒯|M|||\mathcal{T}^{|M|}| rows|𝒯1||\mathcal{T}^{1}| rowsp|M|p_{|M|}p|M|1+1p_{|M|-1}+1p1p_{1}11
Figure 3: All data from nodal patches are stored in long matrices or vectors.

Thus we obtain the global vectors ’elems’, ’volumes’ of size 𝒯× 1\|\mathcal{T}\|\,\times\,1 and the global matrices ’elems2nodes’, dphi{1}\{1\}, \ldots, dphi{dim}\{dim\}, of size 𝒯×(dim+1)\|\mathcal{T}\|\,\times\,(dim+1).

Remark 3

For the gradient evaluation of Section 6 we will need to extend a set of pi,i{0,,||}p_{i}\,,\,i\in\{0,\ldots,|\mathcal{M}|\}, indices up to {0,,d||}\{0,\ldots,d|\mathcal{M}|\}. Thus, we additionally define

pn=pn||+𝒯,n{||+1,,2||}pn=pn2||+2𝒯,n{2||+1,,3||}.\begin{split}p_{n}&=p_{n\,-\,|\mathcal{M}|}\,+\;\,\|\mathcal{T}\|\,,\qquad n\in\{\;\,|\mathcal{M}|+1,\ldots,2|\mathcal{M}|\}\\ p_{n}&=p_{n-2|\mathcal{M}|}+2\|\mathcal{T}\|\,,\qquad n\in\{2|\mathcal{M}|+1,\ldots,3|\mathcal{M}|\}\,.\end{split} (16)

In order to maintain the right ordering of the local basis functions within each nodal patch, an additional logical-type matrix of zeros and ones ’logical’ is provided. If the nn-th row corresponds to the ii-th patch, then logical(n,)=1(n,\ell)=1 means that elems2nodes(n,)=i(n,\ell)=i. Therefore, in every row of ’logical’ matrix the value ’11’ has exactly one single occurrence.

Below we provide an example of thepatches’ structure along with the same ’mesh’ one introduced in Subsection 4.1 and corresponding to the domain in Fig. 2.

patches =  struct with fields:        lengths: [711×1 double]          elems: [7584×1 double]        volumes: [7584×1 double]    elems2nodes: [7584×4 double]           dphi: {[7584×4 double]  [7584×4 double]  [7584×4 double]}        logical: [7584×4 logical] The first vector ’lengths’ is of size ||×1|\mathcal{M}|\times 1 with entries lengths(i)=|𝒯i|,i{1,,||}\textbf{lengths}(i)=|\mathcal{T}^{i}|\,,\,i\in\{1,\ldots,|\mathcal{M}|\}. Here, its size is equal to 711=72918711=729-18, where |𝒩|=729|\mathcal{N}|=729 is the number of mesh nodes and 1818 is the number of nodes with full Dirichlet boundary conditions.

Benchmark 1

The script benchmark1_start.m generates a sequence of the structures mesh and patches corresponding to each of the uniform mesh refinements. Table 1 provides the assembly times and the memory requirements of both objects.

mesh level number of nodes |𝒩||\mathcal{N}| number of elems. |𝒯||\mathcal{T}| number of free dofs mesh setup time [s] patches setup time [s] mesh memory size [MB] patches memory size [MB]
1 729 1920 2133 0.01 0.01 0.30 1.13
2 4025 15360 11925 0.02 0.02 2.32 9.07
3 26001 122880 77517 0.06 0.13 18.17 72.73
4 185249 983040 554013 0.51 0.98 144.07 582.53
5 1395009 7864320 4178493 5.56 10.62 1147.67 4663.18
Table 1: Benchmark 1 - setup times and memory consumption in 3D.

Note that the most memory consuming part of both structures is given by substructures ’dphi’ containing the values of the precomputed gradients of basis functions.

5 Implementation: energy evaluation

The following matrix-vector transformation is frequently used: matrices B,V|𝒩|×d\textbf{B},\textbf{V}\in\mathbb{R}^{|\mathcal{N}|\times d} are stretched to the isomorphic vectors

b,vd|𝒩|:bn=Bi,j,vn=Vi,j,n{1,,d|𝒩|},\begin{split}\textbf{b},\textbf{v}\in\mathbb{R}^{d|\mathcal{N}|}:\qquad\textbf{b}_{n}=\textbf{B}_{i,j},\quad\textbf{v}_{n}=\textbf{V}_{i,j},\qquad n\in\{1,\ldots,d|\mathcal{N}|\},\end{split} (17)

where i=(n1)/d+1i=(n-1)/d\,+1 and j=(n1)%d+1j=(n-1)\%d\,+1. Here, // symbol is the integer division operator and %\% is the modulo operator. Put simply, for any i{1,,|𝒩|}i\in\{1,\ldots,|\mathcal{N}|\} the elements of v with indices d(i1)+1,,d(i1)+dd(i-1)+1,\ldots,d(i-1)+d corresponds to the values of the trial function v(x)\textbf{v}(\textbf{x}) in the ii-th node in the directions 1,,d1,\ldots,d.

5.1 The linear energy term JlinJ_{lin}

The linear part of the energy (14) rewrites equivalently as

Jlin(v)=Ωf(x)v(x)dx=B:V=bv,J_{lin}(\textbf{v})=\int_{\Omega}\textbf{f}(\textbf{x})\cdot\textbf{v}(\textbf{x})\,\mathrm{d}\textbf{x}=\textbf{B}:\textbf{V}=\textbf{b}\cdot\textbf{v}, (18)

where :: denotes the scalar product of matrices.

5.2 The first-gradient energy term JgradJ_{grad}

The gradient part of the discrete energy (9) is given as a sum of the energy contributions from every element Tk,k{1,,|𝒯|}T_{k},\,k\in\{1,\ldots,|\mathcal{T}|\} and its evaluation in MATLAB is performed effectively by using operations with vectors and matrices only. The energy evaluation for a trial vector vd|𝒩|\textbf{v}\in\mathbb{R}^{d|\mathcal{N}|} is performed by the main function:

{listing}
1function [e, densities] = energy(v,mesh,params)
2% components of deformation
3v_cell = createCellFromVector(v,mesh.dim);
4
5% components of deformation on elements
6v_elems = CellAtMatrixOfIndices(v_cell,mesh.elems2nodes);
7
8% deformation gradients on elements
9F_elems = evaluate_F(mesh,v_elems);
10
11% gradient densities on elements
12densities.Gradient = densityGradientVector_3D(F_elems,params);
13
14% total gradient energy
15e = sum(mesh.volumes.*densities.Gradient);
16end

The structure ’mesh’ is described in Section 5 and theparams’ contains material parameters apart from some other parameters (e.g. visualization parameters). The code above is vectorized and generates the objects:

A cell ’v_elems’ containing matrices v_elems{1}\{1\}, v_elems{2}\{2\}, v_elems{3}\{3\} of size |𝒯|× 4|\mathcal{T}|\,\times\,4 providing the restrictions of nodal deformations to all elements.

A cell ’F_elems’ of size dim×dimdim\times dim storing the deformation gradients (see (3)) in all elements. In particular, F_elems{d}{m}\{d\}\{m\} is then a vector of size |𝒯|×1|\mathcal{T}|\times 1 evaluating the partial derivatives of the dd-th component of deformation with respect to the mm-th variable in all elements.

A vector ’densities.Gradient’ of size |𝒯|×1|\mathcal{T}|\times 1 containing gradient densities in all elements.

The energy e is given as a sum of the gradient energy contributions over the elements (the gradient part Jgrad(v)J_{grad}(\textbf{v}) given by (9)) subtracted by the linear energy term Jlin(v)J_{lin}(\textbf{v}) (given by (14)).

The Neo-Hookean density function W=W(F)W=W(F) from (4) is implemented as

{listing}
1function densities=densityGradientVector_3D(F,params)
2% determinant term
3DET = F{1,1}.*F{2,2}.*F{3,3} + F{1,3}.*F{2,1}.*F{3,2} +
4 F{1,2}.*F{2,3}.*F{3,1} - F{1,3}.*F{2,2}.*F{3,1} -
5 F{1,2}.*F{2,1}.*F{3,3} - F{1,1}.*F{2,3}.*F{3,2};
6
7% I1 term (Frobenius norm squared)
8I1 = F{1,1}.^2 + F{1,2}.^2 + F{1,3}.^2 +
9 F{2,1}.^2 + F{2,2}.^2 + F{2,3}.^2 +
10 F{3,1}.^2 + F{3,2}.^2 + F{3,3}.^2;
11
12% gradient densities
13densities = params.C1*(I1-3-2*log(DET)) + params.D1*(DET-1).^2;
14end
Benchmark 2

Assume a bar domain Ω=(0,lx)×(ly2,ly2)×(lz2,lz2)\Omega=(0,l_{x})\times(-\frac{l_{y}}{2}\,,\,\frac{l_{y}}{2})\times(-\frac{l_{z}}{2}\,,\,\frac{l_{z}}{2}), where lx=0.4,ly=lz=0.01l_{x}=0.4,l_{y}=l_{z}=0.01, specified by the material parameters E=2108E=2\cdot 10^{8} (Young’s modulus) and ν=0.3\nu=0.3 (Poisson’s ratio) and deformed by the prescribed deformation v(x)=v(x,y,z)\textbf{v}(\textbf{x})=\textbf{v}(x,y,z) given by

v(1)(x,y,z)=x,v(2)(x,y,z)=cos(αxlx)y+sin(αxlx)z,v(3)(x,y,z)=sin(αxlx)y+cos(αxlx)z,\begin{split}v^{(1)}(x,y,z)&=\;\;\;x\,,\\ v^{(2)}(x,y,z)&=\;\;\;\cos(\alpha\frac{x}{l_{x}})\,y+\sin(\alpha\frac{x}{l_{x}})\,z\,,\\ v^{(3)}(x,y,z)&=-\sin(\alpha\frac{x}{l_{x}})\,y+\cos(\alpha\frac{x}{l_{x}})\,z\,,\end{split} (19)

or, equivalently, using matrix operations

(v(1)(x,y,z)v(2)(x,y,z)v(3)(x,y,z))=(1000cos(αxlx)sin(αxlx)0sin(αxlx)cos(αxlx))(xyz).\begin{pmatrix}v^{(1)}(x,y,z)\\ v^{(2)}(x,y,z)\\ v^{(3)}(x,y,z)\end{pmatrix}=\begin{pmatrix}1&0&0\\ 0&\cos(\alpha\frac{x}{l_{x}})&\sin(\alpha\frac{x}{l_{x}})\\ 0&-\sin(\alpha\frac{x}{l_{x}})&\cos(\alpha\frac{x}{l_{x}})\end{pmatrix}\begin{pmatrix}x\\ y\\ z\end{pmatrix}\,. (20)

Here, α=2π\alpha=2\pi means that the right Dirichlet wall is twisted once around the x-axes (Fig. 4).

Refer to caption
Figure 4: A bar domain twisted by the prescribed deformation (19).

Constants C1,D1C_{1},D_{1} are transformed according to the formulas C1=μ2,D1=K2,C_{1}=\frac{\mu}{2},D_{1}=\frac{K}{2}, where μ=E2(1+ν)\mu=\frac{E}{2(1+\nu)} is the shear modulus and K=E3(12ν)K=\frac{E}{3(1-2\nu)} is the bulk modulus. The exact evaluation shows that the corresponding gradient energy using the Neo-Hookean density function (4) reads

Jgrad(v)=α2C1lylz(ly2+lz2)12lx6.326670.J_{grad}(\textbf{v})=\frac{\alpha^{2}\,C_{1}\,l_{y}\,l_{z}\,(l_{y}^{2}+l_{z}^{2})}{12\,l_{x}}\approx 6.326670.

The script benchmark2_start.m evaluates an approximation of Jgrad(v)J_{grad}(\textbf{v}) on the sequence of the uniform mesh refinements defined in Benchmark 1. In order to provide higher accuracy of the evaluation times, the energies are recomputed 10 times. Table 2 provides evaluation times and values of the energy approximations (times for setting up the ’mesh’ structure are not included).

mesh level number of free dofs evaluation (10x) of Jgrad(v)J_{grad}(\textbf{v}): time [s] value of Jgrad(v)J_{grad}(\textbf{v})
1 2133 0.01 12.6623
2 11925 0.02 7.9083
3 77517 0.08 6.7219
4 554013 0.76 6.4255
5 4178493 12.82 6.3514
Table 2: Benchmark 2 - energy evaluation times in 3D.

6 Implementation: energy gradient evaluation

Evaluation of the full gradient J(v),\nabla J(\textbf{v}), where v=(v1,,vd|𝒩|)d|𝒩|\textbf{v}=(v_{1},\dots,v_{d|\mathcal{N}|})\in\mathbb{R}^{d|\mathcal{N}|}, requires in general computation of all partial derivatives J(v)vn\frac{\partial J(\textbf{v})}{\partial v_{n}}, n{1,,d|𝒩|}n\in\{1,\ldots,d|\mathcal{N}|\}. Some minimization methods (e.g. trust-region mentioned applied in Section 7) require the knowledge of gradient, but only restricted to free degrees of freedom.

The gradient Jlin(v)\nabla J_{lin}(\textbf{v}) easily reads using (18)

Jlin(v)=b,\nabla J_{lin}(\textbf{v})=\textbf{b}\,, (21)

where b is of size d|𝒩|d|\mathcal{N}|, given by (17) and its restriction to free degrees of freedom is then trivial.

The evaluation of Jgrad(v)\nabla J_{grad}(\textbf{v}) is technically more involved. Firstly, it is evaluated with respect to Remark 2 for all degrees of freedom belonging to all (at least partially) free nodes. Secondly, it is restricted to free degrees of freedom only. In order to determine to which node the nn-th active degree of freedom belongs we define the index mapping

IDN:{1,,d||}{1,,||},IDN(n)=(n1)/d+1I_{DN}:\{1,\ldots,d|\mathcal{M}|\}\rightarrow\{1,\ldots,|\mathcal{M}|\},\qquad I_{DN}(n)=(n-1)/d+1 (22)

which for the nn-th active degree of freedom returns the corresponding ii-th free node (here // is an integer division operator). By

𝒯(n),n{1,,d||},\mathcal{T}^{(n)}\,,\qquad n\in\{1,\ldots,d|\mathcal{M}|\}\,,

we denote the set of elements adjacent to node NIDN(n)N_{I_{DN}(n)} belonging to the nn-th active degree of freedom and by |𝒯(n)||\mathcal{T}^{(n)}| their number. The gradient of the nonlinear part Jgrad(v)J_{grad}(\textbf{v}) can be computed in two different ways:

  1. 1.

    numerically, where the partial derivatives are computed approximately by using a difference scheme.

  2. 2.

    exactly by taking the explicit partial derivatives.

Deriving the exact partial derivatives can be demanding and it depends on the particular problem. On the contrary, the numerical approach is more general and is feasible regardless of the complexity of the function representing the corresponding discrete energy. Hence, we first describe the numerical approach by using the central difference scheme and then explain the gradient evaluation by deriving the explicit form of the partial derivatives.

6.1 Numerical approach to evaluate Jgrad(v)\nabla J_{grad}(\textbf{v})

By using the central difference scheme, one can write

vnJgrad(v)Jgrad(v+εen)Jgrad(vεen)2ε,\frac{\partial}{\partial v_{n}}J_{grad}(\textbf{v})\approx\frac{J_{grad}(\textbf{v}+\varepsilon\textbf{e}_{n})-J_{grad}(\textbf{v}-\varepsilon\textbf{e}_{n})}{2\varepsilon}\,, (23)

where en\textbf{e}_{n} is the nn-th canonical vector in d||\mathbb{R}^{d|\mathcal{M}|} and ε\varepsilon is a small positive number. Both summands in the numerator above can be directly evaluated by taking the energy evaluation procedure introduced in the previous subsection as

Jgrad(v+εen)Jgrad(vεen)==k=1|𝒯|TkW((v+εen))dxk=1|𝒯|TkW((vεen))dx.\begin{split}&J_{grad}(\textbf{v}+\varepsilon\textbf{e}_{n})-J_{grad}(\textbf{v}-\varepsilon\textbf{e}_{n})=\\ &=\sum\limits_{k=1}^{|\mathcal{T}|}\int_{T_{k}}W(\nabla(\textbf{v}+\varepsilon\textbf{e}_{n}))\,\mathrm{d}\textbf{x}-\sum\limits_{k=1}^{|\mathcal{T}|}\int_{T_{k}}W(\nabla(\textbf{v}-\varepsilon\textbf{e}_{n}))\,\mathrm{d}\textbf{x}\,.\end{split}

However, this approach is ineffective as long as the step of the central difference scheme ε\varepsilon occurs only in a few summands of the sums representing Jgrad(v+εen)J_{grad}(\textbf{v}+\varepsilon\textbf{e}_{n}) and Jgrad(vεen)J_{grad}(\textbf{v}-\varepsilon\textbf{e}_{n}) given by (9), while the remaining summands are the same and therefore vanish. Hence, we can further simplify

Jgrad(v+εen)Jgrad(vεen)==k=1|𝒯(n)|Tk(n)W((v+εen))dxk=1|𝒯(n)|Tk(n)W((vεen))dx==k=1|𝒯(n)||Tk(n)|W((v+εen)|Tk(n))k=1|𝒯(n)||Tk(n)|W((vεen)|Tk(n)).\begin{split}&J_{grad}(\textbf{v}+\varepsilon\textbf{e}_{n})-J_{grad}(\textbf{v}-\varepsilon\textbf{e}_{n})=\\ &=\sum\limits_{k=1}^{|\mathcal{T}^{(n)}|}\int_{T_{k}^{(n)}}W(\nabla(\textbf{v}+\varepsilon\textbf{e}_{n}))\,\mathrm{d}\textbf{x}-\sum\limits_{k=1}^{|\mathcal{T}^{(n)}|}\int_{T_{k}^{(n)}}W(\nabla(\textbf{v}-\varepsilon\textbf{e}_{n}))\,\mathrm{d}\textbf{x}=\\ &=\sum\limits_{k=1}^{|\mathcal{T}^{(n)}|}|T_{k}^{(n)}|W\big{(}\nabla(\textbf{v}+\varepsilon\textbf{e}_{n})\big{|}_{T_{k}^{(n)}}\big{)}-\sum\limits_{k=1}^{|\mathcal{T}^{(n)}|}|T_{k}^{(n)}|W\big{(}\nabla(\textbf{v}-\varepsilon\textbf{e}_{n})\big{|}_{T_{k}^{(n)}}\big{)}\,.\end{split} (24)

By using the substitutions

Jgrad(n),(v)=|Tk(n)|W((vεen)|Tk(n)),Jgrad(n),+(v)=|Tk(n)|W((v+εen)|Tk(n)),J_{grad}^{(n),-}(\textbf{v})=|T_{k}^{(n)}|W\big{(}\nabla(\textbf{v}-\varepsilon\textbf{e}_{n})\big{|}_{T_{k}^{(n)}}\big{)}\,,\quad J_{grad}^{(n),+}(\textbf{v})=|T_{k}^{(n)}|W\big{(}\nabla(\textbf{v}+\varepsilon\textbf{e}_{n})\big{|}_{T_{k}^{(n)}}\big{)}\,, (25)

one can rewrite (24)

Jgrad(v+εen)Jgrad(vεen)=k=1|𝒯(n)|Jgrad(n),+(v)k=1|𝒯(n)|Jgrad(n),(v)J_{grad}(\textbf{v}+\varepsilon\textbf{e}_{n})-J_{grad}(\textbf{v}-\varepsilon\textbf{e}_{n})=\sum\limits_{k=1}^{|\mathcal{T}^{(n)}|}J_{grad}^{(n),+}(\textbf{v})-\sum\limits_{k=1}^{|\mathcal{T}^{(n)}|}J_{grad}^{(n),-}(\textbf{v}) (26)

and evaluate the whole Jgrad(v)\nabla J_{grad}(\textbf{v}) via the simple for-loop over its components.

Remark 4

The energy evaluation procedure has to be called in every loop over the vector components n{1,,d||}n\in\{1,\ldots,d|\mathcal{M}|\} and it turned out to cause multiple self built-in times that slowed down performance. To avoid that, the original energy evaluation procedure is modified so that the multiple input vectors can be processed simultaneously and the energy evaluation procedure be called only once.

The outer gradient evaluation procedure is simple:

{listing}
1eps = 1e-8; % finite difference step size
2
3% local patch energies for +eps and -eps
4es_minsplus = energies(v,[-eps eps],mesh,patches,params,indx);
5
6% the gradient vector by the central difference scheme
7g = diff(es_minsplus,1,2)/eps/2;

Here ’es_minsplus’ is a matrix of size 3𝒯× 23\|\mathcal{T}\|\,\times\,2 with the components given by (25):

es_minsplus(n,1)=Jgrad(n),(v)(n,1)=J_{grad}^{(n),-}(\textbf{v}) ,   es_minsplus(n,2)=Jgrad(n),+(v)(n,2)=J_{grad}^{(n),+}(\textbf{v}) .

The gradient vector g is then assembled by using the central difference scheme (23) with a constant central difference step size ε\varepsilon for all n{1,,d||}n\in\{1,\ldots,d|\mathcal{M}|\}. Obviously, a generalization of the above code to higher accuracy difference schemes is possible.

We recall a cell ’v_elems’ containing matrices

v_elems{1}\{1\}, v_elems{2}\{2\}, v_elems{3}\{3\}  of size |𝒯|×4|\mathcal{T}|\times 4

is assembled in the energy evaluation procedure energy from Subsection 5.2. For the evaluation of (23) by using (26) we need to assemble a cell-structure ’v_patches’ containing matrices

v_patches{1}\{1\}, v_patches{2}\{2\}, v_patches{3}\{3\}  of size 𝒯× 4\|\mathcal{T}\|\,\times\,4

that provide the restrictions of nodal deformations to all patches. In fact, the structure ’v_patches’ copies parts of ’v_elems’ to the particular positions.

The extended procedure energies evaluates all ε\varepsilon-perturbed values of energies: {listing}

1function e = energies(v,eps,mesh,patches,params,indx)
2% components of deformation
3v_cell = createCellFromVector(v,mesh.dim);
4% components of deformation on elements
5v_elems = CellAtMatrixOfIndices(v_cell,mesh.elems2nodes);
6% deformation gradients on elements
7F_elems = evaluate_F(mesh,v_elems);
8
9% deformations on patches
10v_patches = CellAtMatrixOfIndices(v_cell,patches.elems2nodes);
11% deformation gradients on patches
12F_patches = CellAtMatrixOfIndices(F_elems,patches.elems);
13
14v_patches_eps = cell(3,1);
15e = zeros(3*numel(mesh.nodesMinim),size(eps,2));
16for comp=1:size(eps,2) % loop over epsilon perturbations
17 % perturbations of deformations on patches
18 v_patches_eps{1} = v_patches{1} + eps(comp)*patches.logical;
19 v_patches_eps{2} = v_patches{2} + eps(comp)*patches.logical;
20 v_patches_eps{3} = v_patches{3} + eps(comp)*patches.logical;
21
22 % deformation gradients of perturbation
23 GG = evaluate_GG(patches,v_patches_eps,F_patches);
24
25 % densities
26 densities_patches = densityGradientVector_3D(GG,params);
27 % energies
28 e_patches = [patches.volumes; patches.volumes; patches.volumes].*densities_patches;
29 % final energy values on patches
30 csep = cumsum(e_patches);
31 e([1:3:end 2:3:end 3:3:end],comp) = [csep(indx(1)); diff(csep(indx))];
32end
33e = e(mesh.dofsMinim_local,:); % dropping out the Dirichlet dofs
34end

The cells ’v_elems’ and ’F_elems’ are the same as in the first-gradient energy evaluation procedure from Subsection 5.2. The cell ’v_patches_eps’ contains matrices

v_patches_eps{1}\{1\}, v_patches_eps{2}\{2\}, v_patches_eps{3}\{3\}  of size 𝒯× 4\|\mathcal{T}\|\,\times\,4

that provide the restrictions of nodal deformations to all patches, but here these deformations are perturbed by the value of the central difference step ε\varepsilon. This cell is used for the construction of the key cell ’GG’ of size dim×dimdim\times dim containing vectors

GG{j,m}\{j,m\}   of length dim𝒯dim\,\|\mathcal{T}\|

storing the partial derivatives of deformations of the jj-th component with respect to the mm-th variable. The structure of ’GG’ is in details displayed in Fig. 5.

𝖥{3,1}\mathsf{F}\{3,1\} 𝖥{3,2}\mathsf{F}\{3,2\} 𝖥{3,3}\mathsf{F}\{3,3\} 𝖥{2,1}\mathsf{F}\{2,1\} 𝖥{2,2}\mathsf{F}\{2,2\} 𝖥{2,3}\mathsf{F}\{2,3\} 𝖦{1,1}\mathsf{G}\{1,1\} 𝖦{1,2}\mathsf{G}\{1,2\} 𝖦{1,3}\mathsf{G}\{1,3\} 𝖥{3,1}\mathsf{F}\{3,1\} 𝖥{3,2}\mathsf{F}\{3,2\} 𝖥{3,3}\mathsf{F}\{3,3\} 𝖦{2,1}\mathsf{G}\{2,1\} 𝖦{2,2}\mathsf{G}\{2,2\} 𝖦{2,3}\mathsf{G}\{2,3\} 𝖥{1,1}\mathsf{F}\{1,1\} 𝖥{1,2}\mathsf{F}\{1,2\} 𝖥{1,3}\mathsf{F}\{1,3\} 𝖦{3,1}\mathsf{G}\{3,1\} 𝖦{3,2}\mathsf{G}\{3,2\} 𝖦{3,3}\mathsf{G}\{3,3\} 𝖥{2,1}\mathsf{F}\{2,1\} 𝖥{2,2}\mathsf{F}\{2,2\} 𝖥{2,3}\mathsf{F}\{2,3\} 𝖥{1,1}\mathsf{F}\{1,1\} 𝖥{1,2}\mathsf{F}\{1,2\} 𝖥{1,3}\mathsf{F}\{1,3\} the 1st layer of vector indices {1,,𝒯\{1,\ldots,\|\mathcal{T}\|} the 2nd layer of vector indices {𝒯+1,,2𝒯\{\|\mathcal{T}\|+1,\ldots,2\|\mathcal{T}\|} the 3rd layer of vector indices {2𝒯+1,,3𝒯\{2\|\mathcal{T}\|+1,\ldots,3\|\mathcal{T}\|} 3𝒯3\|\mathcal{T}\|
Figure 5: The ’GG’ structure.

All three layers of vector indices have the same size of 𝒯\|\mathcal{T}\| entries and the following meaning:

  • -

    the 1st layer corresponds to the ε\varepsilon-perturbation of the component v(1)\textbf{v}^{(1)},

  • -

    the 2nd layer corresponds to the ε\varepsilon-perturbation of the component v(2)\textbf{v}^{(2)},

  • -

    the 3rd layer corresponds to the ε\varepsilon-perturbation of the component v(3)\textbf{v}^{(3)}.

The cell ’G’ has the same size as ’F’, but contains deformation gradient matrices corresponding to the deformations perturbed by ε\varepsilon. The most important feature is using the values of the precomputed input cell ’F’ for the efficient construction of the ’GG’ cell. Note that if the numeric difference step ε\varepsilon is added or subtracted from the first component of displacements, the deformation gradients of the rest two components remain the same and their values are already stored in the ’F’ cell.
A vector ’csep’ of length d𝒯d\|\mathcal{T}\| contains the cumulative sums of ’e_patches’ with elements

csep(n)=r=1ne_patches(r).\mbox{{csep}}(n)=\sum\limits_{r=1}^{n}\mbox{{e\_patches}}(r)\,.

An output matrix e of size 3||×23|\mathcal{M}|\times 2 contains all energy contributions and for any comp{1,2}comp\in\{1,2\}

e(n,comp)=r=pn1+1pne_patches(r)=csep(pn)csep(pn1).\mbox{{e}}(n,comp)=\sum\limits_{r=p_{n-1}+1}^{p_{n}}\mbox{{e\_patches}}(r)=\textbf{csep}(p_{n})-\textbf{csep}(p_{n}-1)\,. (27)

Note that vector ’e_patches’ changes its value inside the loop over the components compcomp. Therefore, the whole vector e is evaluated at once using the Matlab difference function diff with the input vector ’indx’ of length 3||3|\mathcal{M}|, where indx(n)=pn,n{1,,d||}(n)=p_{n}\,,n\in\{1,\ldots,d|\mathcal{M}|\}, with pnp_{n} defined in (15) and additionally in (16).

The procedure for the construction of ’GG’ is listed below:

{listing}
1function GG = evaluate_GG(patches,v_patches,F)
2% deformation gradients on patches
3G = evaluate_F(patches,v_patches);
4
5% deformation gradient on patches and on 3 components
6GG = cell(3,3);
7GG{1,1} = [G{1,1}; F{1,1}; F{1,1}];
8GG{1,2} = [G{1,2}; F{1,2}; F{1,2}];
9GG{1,3} = [G{1,3}; F{1,3}; F{1,3}];
10GG{2,1} = [F{2,1}; G{2,1}; F{2,1}];
11GG{2,2} = [F{2,2}; G{2,2}; F{2,2}];
12GG{2,3} = [F{2,3}; G{2,3}; F{2,3}];
13GG{3,1} = [F{3,1}; F{3,1}; G{3,1}];
14GG{3,2} = [F{3,2}; F{3,2}; G{3,2}];
15GG{3,3} = [F{3,3}; F{3,3}; G{3,3}];
16end

6.2 Exact approach to evaluate Jgrad(v)\nabla J_{grad}(\textbf{v})

Evaluation of the exact J(v)\nabla J(\textbf{v}) requires an explicit deriving of every J(v)vn,n{1,,d||}\frac{\partial J(\textbf{v})}{\partial v_{n}}\,,n\in\{1,\ldots,d|\mathcal{M}|\}. The gradient of the linear part is trivial and is given by (18). Using (9) one can write

Jgrad(v)vn=k=1|𝒯||Tk|vnW(F(v)|Tk).\frac{\partial J_{grad}(\textbf{v})}{\partial v_{n}}=\sum_{k=1}^{|\mathcal{T}|}|T_{k}|\,\frac{\partial}{\partial v_{n}}W\big{(}\textbf{F}(\textbf{v})|_{{T_{k}}}\big{)}\,. (28)

Note that the only elements whose energy contributions depend on vnv_{n} are those belonging to the ii-th patch, where i=IDN(n)i=I_{DN}(n). Therefore, we can simplify the equation above as

Jgrad(v)vn=k=1|𝒯(n)||Tk|vnW(F(v)|Tk).\frac{\partial J_{grad}(\textbf{v})}{\partial v_{n}}=\sum_{k=1}^{|\mathcal{T}^{(n)}|}|T_{k}|\,\frac{\partial}{\partial v_{n}}W\big{(}\textbf{F}(\textbf{v})|_{T_{k}}\big{)}\,. (29)

By using the chain rule one can write

vnW(F(v))=j=1dm=1dimWFj,m(F(v))Fj,mvn(v),\frac{\partial}{\partial v_{n}}W\big{(}\textbf{F}(\textbf{v})\big{)}=\sum\limits_{j=1}^{d}\sum\limits_{m=1}^{dim}\frac{\partial W}{\partial\textbf{F}_{j,m}}\big{(}\textbf{F}(\textbf{v})\big{)}\frac{\partial\textbf{F}_{j,m}}{\partial v_{n}}(\textbf{v})\,, (30)

where assuming the Neo-Hookean density from (4)

W(F)Fj,m=C1(I1(F)Fj,m2det(F)det(F)Fj,m)+2D1(det(F)1)det(F)Fj,m.\begin{split}\frac{\partial W(\textbf{F})}{\partial\textbf{F}_{j,m}}=C_{1}\Big{(}\frac{\partial\textbf{I}_{1}(\textbf{F})}{\partial\textbf{F}_{j,m}}-\frac{2}{\det(\textbf{F})}\frac{\partial\det(\textbf{F})}{\partial\textbf{F}_{j,m}}\Big{)}+2D_{1}\big{(}\det(\textbf{F})-1\big{)}\frac{\partial\det(\textbf{F})}{\partial\textbf{F}_{j,m}}\,.\end{split} (31)

Using the row or the column expansion rule for calculating the determinant of a 3×33\times 3 matrix, one can express

det(F)Fj,m=(1)j+mdet(Fj,msub),\frac{\partial\det(\textbf{F})}{\partial\textbf{F}_{j,m}}=(-1)^{j+m}\det(\textbf{F}_{j,m}^{sub})\,, (32)

where Fj,msub\textbf{F}_{j,m}^{sub} is the 2×22\times 2 submatrix of F given by dropping the jj-th row and the mm-th column of F. Futhermore, I1Fj,m\frac{\partial\textbf{I}_{1}}{\partial\textbf{F}_{j,m}} can be also simplified by

I1(F)Fj,m=2Fj,m.\frac{\partial\textbf{I}_{1}(\textbf{F})}{\partial\textbf{F}_{j,m}}=2\textbf{F}_{j,m}\,. (33)

From Subsection 6.1 we already know how to assemble v|Tk\textbf{v}|_{T_{k}} and F(v)|Tk\textbf{F}(\textbf{v})|_{T_{k}}. Here, in addition, we need to express I1(F)F,det(F)F\frac{\partial\textbf{I}_{1}(\textbf{F})}{\partial\textbf{F}}\,,\frac{\partial\det(\textbf{F})}{\partial\textbf{F}} and Fvn\frac{\partial\textbf{F}}{\partial v_{n}}.

Similarly to the numeric gradient evaluation, the exact gradient evaluation procedure uses the same precomputed structures ’v_cell’, ’v_elems’ and ’F_elems’.

{listing}
1% components of deformation
2v_cell = createCellFromVector(v,mesh.dim);
3% components of deformation on elements
4v_elems = CellAtMatrixOfIndices(v_cell,mesh.elems2nodes);
5% deformation gradients on elements
6F_elems = evaluate_F(mesh,v_elems);

The main procedure for the exact gradient evaluation is provided below:

{listing}
1% deformations on patches
2v_patches = CellAtMatrixOfIndices(v_cell,patches.elems2nodes);
3% deformation gradients on patches
4F_patches = CellAtMatrixOfIndices(F_elems,patches.elems);
5
6% deformation gradients of perturbation
7GG = evaluate_GG(patches,v_patches,F_patches);
8
9% densities
10D_densities_patches = density_D_GradientVector_3D(GG,D_F,params);
11% energies
12D_e_patches = [patches.volumes; patches.volumes; patches.volumes].*D_densities_patches;
13% final energy values on patches
14csDep = cumsum(D_e_patches);
15g = zeros(3*numel(nodesMinim),1);
16g([1:3:end 2:3:end 3:3:end]) = [csDep(indx(1)); diff(csDep(indx))];
17g = g(dofsMinim_local);

Here, ’D_densities_patches’ is a vector of size 3𝒯×13\|\mathcal{T}\|\times 1 containing all W(F(v))vn\frac{\partial W(\textbf{F}(\textbf{v}))}{\partial v_{n}} for every n{1,,d||}n\in\{1,\ldots,d|\mathcal{M}|\} given by (30). Note that the function density_D_GradientVector_3D uses the input 3×33\times 3 cell structure ’D_F_patches’, where D_F_patches{j,m}(n)=Fj,mvn\{j,m\}(n)=\frac{\partial\textbf{F}_{j,m}}{\partial v_{n}}.

Benchmark 3

The script benchmark3.m evaluates approximations of Jgrad(v)\nabla J_{grad}(\textbf{v}) on the sequence of the uniform mesh refinements defined in Benchmark 2. In order to provide higher accuracy of evaluation times, the energies are recomputed 10 times. Table 3 provide evaluation times of the gradient approximations both in the exact and numerical case (the times for setting up ’mesh’ and ’patches’ structures are not included).

mesh level number of free dofs evaluation (10x) of exact J(v)\nabla J(\textbf{v}): time [s] evaluation (10x) of num. J(v)\nabla J(\textbf{v}): time [s]
1 2133 0.12 0.08
2 11925 0.14 0.15
3 77517 1.18 1.51
4 554013 12.47 15.64
5 4178493 487.34 533.24
Table 3: Benchmark 3 - energy gradient evaluation times in 3D.

7 Application to practical energy minimizations

For a practical energy minimization, the trust-region method [4] available in the MATLAB Optimization Toolbox is utilized. Standard stopping criteria (the first-order optimality, tolerance on the argument and tolerance on the function) equal to 10610^{-6} are considered in all benchmarks. The method also allows to specify the sparsity pattern of the Hessian matrix 2J(v)\nabla^{2}J(v) in free degrees of freedom, i.e., only positions (indices) of nonzero entries. The sparsity pattern is directly given by a finite element discretization.

Refer to caption
Figure 6: The sparsity pattern of the 3D bar domain based on nodes connectivity (left) and its extension to the vector problem (right).
Example 2

The sparsity pattern related to the 3D bar domain of Fig. 2 is displayed in Fig. 6 (left). It corresponds to the FEM mesh not taking the Dirichlet boundary conditions into account. Therefore, it is of size 729×729729\times 729. In practical computations, it is first extended from a scalar to a vector problem and then restricted to themesh.dofsMinim’ indices (right). Since there are 18 Dirichlet nodes fixed in all three directions, the corresponding number of rows and columns of the right hessian sparsity pattern is (72918)3=2133(729-18)*3=2133.

7.1 Benchmark 4: time-dependent hyperelasticity in 3D

We consider the elastic bar specified in Benchmark 2 with a time-dependent nonhomogeneous Dirichlet boundary conditions on the right wall (x=lxx=l_{x}). The torsion of the right wall is described by the formula (19) for x=lx=0.4x=l_{x}=0.4, where the rotation angle is assumed to be a time dependent function α=α(t)=αmaxt/T\alpha=\alpha(t)=\alpha_{max}t/T for a sequence of integer discrete times t{1,2,,T}t\in\{1,2,\dots,T\}. Here we assume the final time T=24T=24 and the maximal angle of rotation αmax=8π\alpha_{max}=8\pi ensuring four full rotations of the right Dirichlet wall around the x-axis. Altogether we solve a sequence of TT minimization problems (1) with the Neo-Hook energy density (4) and no loading (f=0\textbf{f}=0).

The trust-region method accepts an initial deformation approximation. For the first discrete time we run iterations from the identity v(x)=x,xΩv(x)=x,x\in\Omega and for the next discrete time the minimizer of the previous discrete time problem serves as its initial approximation. We output the found energy minimizer for each minimization problem with some minimizers shown in form of the deformed meshes in Fig. 7. Additionally, we also provide the minimization time, the number of the trust-region iterations and the value of the corresponding minimal energy.

The overall performance is explained in Table 4 for separate computations on tetrahedral meshes of levels 1, 2, 3 applying the numerical gradient evaluation of Subsection 6.1 only with the choice ε=106\varepsilon=10^{-6} . We notice that the minimizations of finer meshes require only slightly higher number of iterations, which is acceptable. Comparison of the values Jgrad(u)J_{grad}(\textbf{u}) in each line of Table 4 indicates convergence in space of the energy minimizers.

level 1, 2133 free dofs level 2, 11925 free dofs level 3, 77517 free dofs
step time [s] iters Jgrad(u)J_{grad}(\textbf{u}) time [s] iters Jgrad(u)J_{grad}(\textbf{u}) time [s] iters Jgrad(u)J_{grad}(\textbf{u})
t=3 6.65 40 3.1177 66.23 47 1.8244 2148.62 93 1.4631
t=6 10.42 60 12.4423 117.74 76 7.2962 1404.68 68 5.8533
t=9 10.33 60 27.8993 125.76 85 16.4079 1441.84 73 13.1735
t=12 18.43 104 49.5506 102.11 68 29.1664 1277.04 65 23.4286
t=15 15.51 81 77.3859 127.17 87 45.5677 1130.02 58 36.6236
t=18 18.31 97 111.3369 110.65 75 65.5470 1540.90 76 52.7641
t=21 15.74 82 151.4946 74.47 51 89.1672 5523.64 290 71.7883
t=24 16.37 91 197.7577 91.79 63 116.3261 2660.44 131 93.7238
Table 4: Benchmark 4 - performance of hyperelasticity minimizations in 3D.
Refer to caption
(a) t=3.t=3.
Refer to caption
(b) t=6t=6.
Refer to caption
(c) t=9.t=9.
Refer to caption
(d) t=12.t=12.
Refer to caption
(e) t=15.t=15.
Refer to caption
(f) t=18s.t=18s.
Refer to caption
(g) t=21.t=21.
Refer to caption
(h) t=24.t=24.
Figure 7: Benchmark 4 - deformation of the tetrahedal mesh subjected to the right time-rotating Dirichlet plane with the underlying Neo-Hookean density.

7.2 Benchmarks 5 and 6: hyperelasticity and p-Laplacian in 2D

Although our exposition is mainly focused on implementation details in 3D, a reduction to 2D (dim=2dim=2) is straightforward.

As the first example we consider a square domain with lengths lx=ly=2l_{x}=l_{y}=2 perforated by a hole with radius r=1/3r=1/3 (Fig. 9 left) and whose center is located at the center of the square, subjected to zero Dirichlet boundary condition on the bottom and left edges, a constant loading f(x)=(3.5107,3.5107)\textbf{f}(\textbf{x})=(-3.5\cdot 10^{7},-3.5\cdot 10^{7}) and the elastic parameters specified in Benchmark 2. We assume the choice ε=106\varepsilon=10^{-6} in the evaluation of the numerical gradient. The resulting deformation and the deformation gradient densities are depicted in Fig. 9 (right).

Refer to caption
Refer to caption
Figure 8: Benchmark 5 - a triangulation of the square domain perforated by a hole (left) and its deformation with the underlying Neo-Hookean density (right).
Refer to caption
Refer to caption
Figure 9: Benchmark 6 - a triangulation of the L-shape domain (left) and the solution p-Laplacian for the power p=3p=3 and a constant loading f(x)=10f(\textbf{x})=-10 (right).

As the second example we minimize the p-Laplacian energy functional (5) over the L-shape domain (Fig. 9 left). The constant loading f=10f=-10 is assumed together with zero Dirichlet boundary conditions on the full domain boundary. For the numerical gradient we assume ε=105\varepsilon=10^{-5}. The solution is displayed in Fig. 9 (right). Table 6 depicts performance of all options for the power p=3p=3 and confirms faster evaluation times in comparison to our former contribution [13]. This is mainly due to the improved vectorization concepts here, due to the faster CPU and finally due to updates of the trust-region in-built implementation in the latest version of Matlab.

exact gradient numerical gradient
level free dofs time [s] iters J(u)J(\textbf{u}) time [s] iters J(u)J(\textbf{u})
1 278 0.36 4 24.5745 0.13 4 24.5745
2 1102 0.15 4 24.3558 0.16 4 24.3558
3 4382 0.54 4 24.2957 0.54 4 24.2957
4 17470 2.35 5 24.2799 2.65 5 24.2799
5 69758 11.27 5 24.2758 12.39 5 24.2758
6 278782 83.27 6 24.2748 88.74 6 24.2748
Table 5: Benchmark 5 - performance of hyperelasticity minimizations in 2D. True values of the energies J(u)J(\textbf{u}) are multiplied by the scale factor 10710^{7}.
exact gradient numerical gradient
level free dofs time [s] iters J(u)J(\textbf{u}) time [s] iters J(u)J(\textbf{u})
1 33 0.02 8 -7.5353 0.03 8 -7.5353
2 161 0.05 11 -7.9729 0.12 15 -7.9729
3 705 0.11 11 -8.1039 0.19 11 -8.1039
4 2945 0.30 11 -8.1445 0.56 12 -8.1445
5 12033 1.50 11 -8.1578 2.08 12 -8.1578
6 48641 6.30 12 -8.1625 9.48 12 -8.1625
7 195585 48.61 13 -8.1642 60.62 12 -8.1642
8 784385 617.92 22 -8.1649 672.04 16 -8.1649
Table 6: Benchmark 6 - performance of p-Laplacian minimizations for p=3p=3 in 2D.

7.3 Implementation remarks and outlooks

Assembly times in all benchmarks were obtained on a MacBook Air (M1 processor, 2020) with 16 GB memory running MATLAB R2021a. Our implementation is available at

https://www.mathworks.com/matlabcentral/fileexchange/97889

for download and testing. It is based on own codes of [3, 6, 15] used primarily for assemblies of finite element matrices. It also utilizes the function mcolon from the reservoir simulator [11]. The names of mesh attributes were initially motivated by codes of [1] and further modified.

The 3D cuboid mesh is generated by the code of [6] and 2D meshes (the square with the hole and the L-shape) by our own code. Vectorized evaluations of gradients of basis functions are taken from [3, 15]. The general structure of the code was initially taken from [13] and extended to the implementation of hyperelasticity in 2D and 3D. Many parts of this code (mainly related to the gradient evaluation) have been significantly improved since then.

The code is designed in a modular way allowing to add different scalar (e.g. Ginzburg-Landau [1]) and vector (e.g. topology optimization with elastoplasticity [2], shape memory alloys [8]) problems involving the first gradient energy terms.

Acknowledgement

A. Moskovka was supported by the Strategy 21 of the CAS, program 23: City as a Laboratory of Change; Historical Heritage and Place for Safe and Quality Life and by the R&\&D project 8J21AT001 Model Reduction and Optimal Control in Thermomechanics. J. Valdman announces the support of the Czech Science Foundation (GACR) through the grant GF19-29646L Large Strain Challenges in Materials Science.

References

  • [1] J. Alberty, C. Carstensen, S. Funken: Remarks around 50 lines of Matlab: short finite element implementation, Numerical Algorithms 20, 1999, 117-137.
  • [2] S. Almi, U. Stefanelli: Topology Optimization for Incremental Elastoplasticity: A Phase-Field Approach, SIAM Journal on Control and Optimization, 2021, Vol. 59, No. 1 : pp. 339-364.
  • [3] I. Anjam, J. Valdman: Fast MATLAB assembly of FEM matrices in 2D and 3D: edge elements, Applied Mathematics and Computation, 2015, 267, 252–263.
  • [4] A.R. Conn, N.I.M. Gould, P.L. Toint: Trust-Region Methods. SIAM, Philadelphia, 2000.
  • [5] P.G. Ciarlet: The Finite Element Method for Elliptic Problems, SIAM, Philadelphia, 2002.
  • [6] M. Čermák, S. Sysala, J. Valdman: Efficient and flexible MATLAB implementation of 2D and 3D elastoplastic problems, Applied Mathematics and Computation, 2019, 355, 595-614.
  • [7] P. Drábek, J. Milota: Methods of Nonlinear Analysis: Applications to Differential Equations (second edition), Birkhauser, 2013.
  • [8] M. Frost, B. Benešová, P. Sedlák: A microscopically motivated constitutive model for shape memory alloys: formulation, analysis and computations, Math. Mech. Solids, 21(3): 358–382, 2016.
  • [9] J. Koko: Fast MATLAB assembly of FEM matrices in 2D and 3D using cell-array approach, International Journal of Modeling, Simulation, and Scientific Computing Vol. 7, No. 3 (2016).
  • [10] M. Kružík, T. Roubíček: Mathematical Methods in Continuum Mechanics of Solids, Springer, 2019.
  • [11] K. A. Lie: An introduction to reservoir simulation using MATLAB: User guide for the Matlab Reservoir Simulation Toolbox (MRST), Technical report, SINTEF ICT, December 2016. (http://www.sintef.no/projectweb/mrst/).
  • [12] J. E. Marsden, T. J.R. Hughes: Mathematical foundations of elasticity, Dover Publications, 1994.
  • [13] C. Matonoha, A. Moskovka, J. Valdman: Minimization of p-Laplacian via the Finite Element Method in MATLAB, I. Lirkov and S. Margenov (eds): LSSC 2021, LNCS 13127, pp. 496–503 (2022).
  • [14] MATLAB documentation to Minimization with Gradient and Hessian Sparsity Pattern, https://www.mathworks.com/help/optim/ug/minimization-with-gradient-and-hessian-sparsity-pattern.html.
  • [15] T. Rahman, J. Valdman: Fast MATLAB assembly of FEM matrices in 2D and 3D: nodal elements, Applied Mathematics and Computation, 2013, 219, 7151-7158.
  • [16] G. K. Rose: Computational Methods for Nonlinear Systems Analysis With Applications in Mathematics and Engineering (2017), Doctor of Philosophy (PhD), Dissertation, Mechanical &\& Aerospace Engineering, Old Dominion University.
  • [17] T. Weinberg, B. Sousedík: Fast implementation of mixed RT0 finite elements in MATLAB, SIAM Undergraduate Research Online, volume 12, 2018.