
Classifications of Single-input Lower Triangular Forms

Duan Zhang and Ying Sun

This work has been submitted for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. D. Zhang is with the College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China (e-mail: [email protected]). Y. Sun is with the School of Civil Engineering and Architecture, Zhejiang Sci-Tech University, Hangzhou, 310018, China (e-mail: [email protected]).

Abstract

The purposes of this paper are to classify lower triangular forms and to determine under what conditions a nonlinear system is equivalent to a specific type of lower triangular forms. According to the least multi-indices and the greatest essential multi-index sets, which are introduced as new notions and can be obtained from the system equations, two classification schemes of lower triangular forms are constructed. It is verified that the type that a given lower triangular form belongs to is invariant under any lower triangular coordinate transformation. Therefore, although a nonlinear system equivalent to a lower triangular form is also equivalent to many other appropriate lower triangular forms, there is only one type that the system can be transformed into. Each of the two classifications induces a classification of all the systems that are equivalent to lower triangular forms. A new method for transforming a nonlinear system into a lower triangular form, if it is possible, is provided to find what type the system belongs to. Additionally, by using the differential geometric control theory, several necessary and sufficient conditions under which a nonlinear system is locally feedback equivalent to a given type of lower triangular form are established. An example is given to illustrate how to determine which type of lower triangular form a given nonlinear system is equivalent to without performing an equivalent transformation.

Index Terms

Classification, feedback equivalence, lower triangular form, multi-index.

1 Introduction


Since nonlinear phenomena are widely present in nature and many industrial processes, the study of nonlinear control systems is of obvious practical value [1, 2, 3]. Lower triangular forms are a class of nonlinear systems attracting considerable attention. For example, backstepping, a powerful control strategy for lower triangular systems, has been developed based on the cascade structures of these systems [2, 4, 5, 6, 7, 8, 9]. Many exciting results have been obtained for some special classes of lower triangular forms, such as strict feedback forms [10, 11, 12, 13, 5, 14, 15] and $p$-normal forms [16, 17, 18, 19, 20, 21]. Motivated by these works, we address two problems in this paper. The first is how to classify lower triangular forms in a way that supports the design of control laws for these systems. The second is whether and how a nonlinear system can be equivalently transformed into a given type of lower triangular form.

Before discussing the classification scheme, we first review the related research on lower triangular forms. A nonlinear system is called a lower triangular form [4] if it takes the form

$$
\begin{aligned}
\dot{x}_{1} &= f_{1}(x_{1},x_{2}) \\
&\;\;\vdots \\
\dot{x}_{n-1} &= f_{n-1}(x_{1},\dots,x_{n}) \\
\dot{x}_{n} &= f_{n}(x_{1},\dots,x_{n})+g_{n}(x_{1},\dots,x_{n})\,v
\end{aligned}
\tag{1}
$$

where $x=(x_{1},\dots,x_{n})$ is the state vector, $v$ is the scalar input, $g_{n}$ is a smooth function with $g_{n}(0)\neq 0$, and $f_{i}$, $i=1,\dots,n$, are smooth functions such that $\partial f_{j}/\partial x_{j+1}\not\equiv 0$, $j=1,\dots,n-1$, holds in a neighborhood of the origin. A lower triangular form is said to be a $p$-normal form [22, 23] if it also has the special form

$$
\begin{aligned}
\dot{x}_{1} &= \psi_{1,p_{1}}(x_{1},x_{2})\,x_{2}^{p_{1}}+\sum_{j=0}^{p_{1}-1}\psi_{1,j}(x_{1})\,x_{2}^{j} \\
&\;\;\vdots \\
\dot{x}_{n-1} &= \psi_{n-1,p_{n-1}}(x_{1},\dots,x_{n})\,x_{n}^{p_{n-1}}+\sum_{j=0}^{p_{n-1}-1}\psi_{n-1,j}(x_{1},\dots,x_{n-1})\,x_{n}^{j} \\
\dot{x}_{n} &= f_{n}(x_{1},\dots,x_{n})+g_{n}(x_{1},\dots,x_{n})\,v
\end{aligned}
\tag{2}
$$

where $p_{i}$, $i=1,\dots,n-1$, are positive integers, and $\psi_{i,j}$, $i=1,\dots,n-1$ and $j=p_{i},\dots,1$, are smooth functions with

$$
\psi_{i,j}(0)\begin{cases}\neq 0, & j=p_{i}\\ =0, & j\neq p_{i}.\end{cases}
$$

When $p_{1}=\dots=p_{n-1}=1$, (2) becomes a strict feedback form, which has been verified to be feedback equivalent to the controllable canonical form. The first studies of $p$-normal forms were reported by Lin and Qian. From 2000 to 2006, they conducted a series of systematic studies of controller design for $p$-normal forms to meet various control objectives, including global stabilization [16, 17, 18], adaptive control [19], output tracking [20], and output feedback stabilization [21]. Subsequent studies of these systems have addressed finite-time control [24, 25, 26], $H_{\infty}$ control [27], state-constrained control [28], global stabilization using multiple Lyapunov functions [29], nonsingular prescribed-time stabilization [30], and tracking control [31].

Since many lower triangular forms are neither $p$-normal forms nor strict feedback forms, how to classify lower triangular forms is a problem worthy of study. As far as we know, there has been no report on this problem. The two classification schemes proposed in Section III are expected to be helpful in analyzing the behavior of lower triangular forms. The first classification scheme is directly inspired by $p$-normal forms. Let us denote the right-hand side of the $i$th equation of (2) by $\varphi_{i}(x)$ for $i=1,\dots,n-1$. Every $p$-normal form has the property that $\partial^{j}\varphi_{i}/\partial x_{i+1}^{j}(0)=0$, $j=1,\dots,p_{i}-1$, and $\partial^{p_{i}}\varphi_{i}/\partial x_{i+1}^{p_{i}}(0)\neq 0$. In this paper, we say that $(0,\dots,0,\underbrace{p_{i}}_{(i+1)\text{th}})$ is the least $(i+1)$-multi-index of $\varphi_{i}(x)$ (see Definition 6). These multi-indices play a key role in the controllers designed for $p$-normal forms [16, 17, 18, 19, 20, 21, 24, 25, 26, 27, 28, 29, 30, 31]. This motivates us to classify (1) by the least $(i+1)$-multi-indices of $f_{i}(x)$ for $i=1,\dots,n-1$. Moreover, we will see that the least $(i+1)$-multi-index of $f_{i}(x)$ is invariant under a class of coordinate transformations called lower triangular coordinate transformations. The other way presented in this paper to classify lower triangular forms is based on another new notion called the greatest essential $(i+1)$-multi-index set of $f_{i}$ (see Definition 3.5 and 3.12). Since the least $(i+1)$-multi-index of $f_{i}$ belongs to the set, this classification is a refinement of the first one. It will be verified that the set is finite and invariant under any lower triangular coordinate transformation. Also, two algorithms for determining those sets from (1) are given in Section III.
It is reasonable to expect those multi-indices to play a pivotal part in controllers for lower triangular forms, since the terms corresponding to the least $(i+1)$-multi-index of $f_{i}$ and to the elements of the greatest essential $(i+1)$-multi-index set of $f_{i}$ can be regarded as the invariant "control" terms for the equation $\dot{x}_{i}=f_{i}(x_{1},\dots,x_{i+1})$ given in (1) (see Remark 3.39).

Since a classification of lower triangular forms induces a classification of all the systems that are feedback equivalent to lower triangular forms, the next problem naturally considered in this paper is whether a given nonlinear system is equivalent to a specific type of lower triangular form via a state feedback and a change of coordinates. This problem concerns feedback equivalence between different systems. In recent decades, a series of original results have been achieved on the issue of feedback equivalence. In 1973, Krener provided several necessary and sufficient conditions under which an affine nonlinear system is equivalent to another affine system or a linear system via a local coordinate transformation [32]. In 1978, taking invariants under feedback into consideration, Brockett proposed a necessary and sufficient condition for a nonlinear system to be equivalent to a controllable linear system via a local coordinate transformation $x=T(\xi)$ and a state feedback of the form $u=\alpha_{u}(\xi)+\beta_{u}v$, where $x$ and $\xi$ are two state vectors, $\alpha_{u}(\xi)$ is a smooth function, and $\beta_{u}$ is a real number [33]. In the 1980s, the problem of exact linearization with a feedback of the form $u=\alpha_{u}(\xi)+\beta_{u}(\xi)v$, where $\beta_{u}(\xi)$ is a function satisfying $\beta_{u}(\xi)\neq 0$, was solved in [34, 35, 36]. The multi-input exact feedback linearization problem was solved in [37]. In 2003, Cheng and Lin [22] presented a necessary and sufficient condition under which a nonlinear system is feedback equivalent to a $p$-normal form via a coordinate transformation and a state feedback of the form $u=\alpha_{u}(\xi)+\beta_{u}v$, and also designed an algorithm to find the appropriate coordinate transformations and feedback control laws.
Later that year, Respondek [23] solved the $p$-normalization problem using a state feedback of the form $u=\alpha_{u}(\xi)+\beta_{u}(\xi)v$ and pointed out that $p$-normal forms of the form (2) are all locally equivalent to their special cases with $\psi_{i,p_{i}}(x)=1$ for $i=1,\dots,n-1$.

Two methods are provided in Section IV to determine whether a nonlinear system is feedback equivalent to a given type of lower triangular form. One way to solve the problem is to transform the system into a lower triangular system, from which one can then determine the least $(i+1)$-multi-index and the greatest essential $(i+1)$-multi-index set of the right-hand side of its $i$th equation. A new necessary and sufficient condition for a single-input nonlinear system to be equivalent to a lower triangular form is given to simplify this transformation. Since it may be quite difficult to find an appropriate change of coordinates that transforms a system into a lower triangular form, we also seek a method for determining the type without carrying out an equivalent transformation. Theorem 4.58, Theorem 4.61, Corollary 4.65, and Corollary 4.66 allow us to determine whether a nonlinear system is equivalent to a specific type of lower triangular form by computing Lie brackets.

The rest of this paper is organized as follows. Section II will describe in detail the problem of how to classify single-input lower triangular forms and the problem of whether a system is equivalent to a specific type of lower triangular form. Section III gives two ways to solve the former, and Section IV discusses the latter. We conclude the paper in Section V.

2 Problem Formulations

To begin with, we clarify that throughout this paper all the definitions and statements are local, although generalization to the global setting is also possible. In other words, we always operate in sufficiently small neighborhoods of the origin. To classify lower triangular forms, we pay special attention to a class of coordinate transformations defined as follows.

Definition 1

A local coordinate transformation $y=U(x)$ is said to be lower triangular if it takes the form

$$
\begin{aligned}
y_{1} &= U_{1}(x_{1}) \\
&\;\;\vdots \\
y_{n} &= U_{n}(x_{1},\dots,x_{n}).
\end{aligned}
\tag{3}
$$

Lemma 1

Let $y=U(x)$ be a coordinate transformation. System (1), rewritten in $y$-coordinates, still takes a lower triangular form if and only if the coordinate transformation is of the form (3). Moreover, the inverse transformation of (3) is also a local lower triangular coordinate transformation.

The classifications we investigate here should guarantee that the type a lower triangular form belongs to is unchanged under any lower triangular coordinate transformation.

Some clarifications about the classifications of lower triangular forms are in order. First, the rules we design to classify lower triangular forms are independent of $f_{n}(x)$ and $g_{n}(x)$ in (1) because they can be changed by the input $v$. Suppose that $f'_{n}(x)$ and $g'_{n}(x)$ are two given smooth functions with $g'_{n}(0)\neq 0$. Take $v=\left(f'_{n}(x)-f_{n}(x)\right)/g_{n}(x)+\left(g'_{n}(x)/g_{n}(x)\right)v'$ in an appropriate neighborhood of the origin; then the last equation of (1) becomes $\dot{x}_{n}=f'_{n}(x)+g'_{n}(x)v'$. Second, in some of the literature, such as [25, 26, 30], the parameters $p_{i}$, $i=1,\dots,n-1$, in (2) are allowed to be positive fractions. Since $x_{i+1}^{p_{i}}$ is not smooth at the origin when $p_{i}$ is not a nonnegative integer, we only consider the case in which $p_{i}$, $i=1,\dots,n-1$, are all positive integers. Last, a smooth nonaffine system

$$
\begin{aligned}
\dot{x}_{1} &= f_{1}(x_{1},x_{2}) \\
&\;\;\vdots \\
\dot{x}_{n-1} &= f_{n-1}(x_{1},\dots,x_{n}) \\
\dot{x}_{n} &= f_{n}(x_{1},\dots,x_{n},v),
\end{aligned}
$$

can be equivalently transformed into an affine system by adding a new coordinate variable $x_{n+1}=v$. In fact, the system can be rewritten as

$$
\begin{aligned}
\dot{x}_{1} &= f_{1}(x_{1},x_{2}) \\
&\;\;\vdots \\
\dot{x}_{n} &= f_{n}(x_{1},\dots,x_{n+1}) \\
\dot{x}_{n+1} &= \dot{v}.
\end{aligned}
$$

Thus, a classification of affine lower triangular forms can be naturally extended to nonaffine lower triangular forms, and we only examine affine systems here.

Assuming the problem of how to classify lower triangular forms has been solved, let us consider a single-input nonlinear system

$$
\begin{aligned}
\dot{\xi}_{1} &= F_{1}(\xi)+G_{1}(\xi)\,u \\
&\;\;\vdots \\
\dot{\xi}_{n} &= F_{n}(\xi)+G_{n}(\xi)\,u
\end{aligned}
\tag{4}
$$

where $\xi=(\xi_{1},\dots,\xi_{n})\in\mathbb{R}^{n}$ is the system state, $u\in\mathbb{R}$ is the control input, $F_{i}(\xi)$, $i=1,\dots,n$, are smooth functions with $F_{i}(0)=0$, and $G_{i}(\xi)$, $i=1,\dots,n$, are all smooth functions such that there exists an integer $j\in\{1,\dots,n\}$ satisfying $G_{j}(0)\neq 0$. The next problem we address in this paper is whether (4) is locally equivalent to a given type of lower triangular form via a state feedback and a change of coordinates. The state feedback considered here is of the form

$$
u=\alpha_{u}(\xi)+\beta_{u}(\xi)\,v \tag{5}
$$

where $\alpha_{u}(\xi)$ and $\beta_{u}(\xi)$ are smooth functions with $\beta_{u}(0)\neq 0$, and the change of coordinates can be expressed as

$$
x=T(\xi)=\left(T_{1}(\xi),\dots,T_{n}(\xi)\right) \tag{6}
$$

where $T:\mathbb{R}^{n}\to\mathbb{R}^{n}$ is a smooth invertible mapping with $T(0)=0$.

3 Classifications of Lower Triangular Forms

The problem we are concerned with in this section is how to classify lower triangular forms. Let us start with the following two definitions.

Definition 2

An $m$-dimensional multi-index or $m$-multi-index is an ordered $m$-tuple

$$
\alpha=(\alpha_{1},\dots,\alpha_{m}) \tag{7}
$$

where $m$ is an integer satisfying $1\leq m\leq n$ and $\alpha_{i}$, $i=1,\dots,m$, are all nonnegative integers [38]; (7) is called a proper $k$-multi-index if $\alpha_{k}\geq 1$ and $\alpha_{k+1}=\dots=\alpha_{m}=0$ hold for some $k$ with $1\leq k\leq m$; (7) is said to be a proper $0$-multi-index if $\alpha_{i}=0$ for all $i=1,\dots,m$, and we may simply write $\alpha=0$ in this case.

Definition 3

Let $\alpha$ and $\beta$ be multi-indices. We write $\alpha=\beta$ if and only if they are both proper $k$-multi-indices with $k\geq 0$ and $\alpha_{i}=\beta_{i}$ holds for every $i=1,\dots,k$ when $k>0$ [38].

Remark 1

Every proper $k$-multi-index can be regarded as an $i$-multi-index with $i\geq k$.

Taking $\alpha$ as an $m$-dimensional multi-index, for ease of notation, we write

$$
x^{\alpha}=x_{1}^{\alpha_{1}}\dots x_{m}^{\alpha_{m}}
$$

and

$$
\frac{\partial^{\alpha}}{\partial x^{\alpha}}=\frac{\partial^{|\alpha|}}{\partial x_{1}^{\alpha_{1}}\dots\partial x_{m}^{\alpha_{m}}}
$$

where $|\alpha|=\alpha_{1}+\dots+\alpha_{m}$. Moreover, if $p(x_{1},\dots,x_{m})$ is a function and $|\alpha|=0$, we define $\partial^{|\alpha|}p/\partial x_{1}^{\alpha_{1}}\dots\partial x_{m}^{\alpha_{m}}=p(x_{1},\dots,x_{m})$ [38].

Definition 4

Let $p(x_{1},\dots,x_{m})$ be a smooth function (or a holomorphic function) and let $\alpha$ be a multi-index with $|\alpha|>0$. We say that $\alpha$ is a multi-index of $p(x_{1},\dots,x_{m})$ (with respect to the coordinates $x_{1},\dots,x_{m}$) if $\partial^{\alpha}p/\partial x^{\alpha}(0)\neq 0$ holds.

Remark 2

$0$ is a multi-index of $p$ if and only if $p(0)\neq 0$.

Remark 3

In most cases, we consider the function $p(x_{1},\dots,x_{m})$ to be real-valued and smooth. This function is allowed to be complex-valued and holomorphic only for discussing invariant multi-indices in Subsection B. For the same reason, the lower triangular coordinate transformation $y=U(x)$ of Definition 1 can be smooth or biholomorphic.

Proposition 1

Suppose $p(x_{1},\dots,x_{m})$ is a smooth (or a holomorphic) function and $\alpha$ is a multi-index of $p(x_{1},\dots,x_{m})$. Then $p$ can be expressed as

$$
p(x_{1},\dots,x_{m})=c_{\alpha}x^{\alpha}+\bar{p}(x_{1},\dots,x_{m}).
$$

In the above equation, $c_{\alpha}=\partial^{\alpha}p/\partial x^{\alpha}(0)$ is a nonzero coefficient and $\bar{p}(x_{1},\dots,x_{m})$ is a function with $\partial^{\alpha}\bar{p}/\partial x^{\alpha}(0)=0$.

For convenience, let us denote the set of all the proper $k$-multi-indices of $p(x_{1},\dots,x_{m})$ by ${\cal I}_{k}(p)$ for $k=0,\dots,m$ and write ${\cal I}(p)=\bigcup_{k=0}^{m}{\cal I}_{k}(p)$ throughout this paper.

The rest of this section is divided into three subsections. Subsection A discusses several properties of multi-indices. Subsection B investigates the invariant multi-indices of a function under lower triangular coordinate transformations. In Subsection C, we propose two classification schemes of lower triangular forms.

3.1 The Least Multi-index and Essential Multi-indices of Functions

In this subsection, we investigate which multi-indices of a function may be more vital by exploring the relations between multi-indices. The following definition presents one of the ways to compare two multi-indices.

Definition 5

Let $\alpha$ and $\beta$ be a proper $k_{\alpha}$-multi-index and a proper $k_{\beta}$-multi-index, respectively, and let $m=\max(k_{\alpha},k_{\beta})+1$. We say that $\alpha$ is less than $\beta$ in lexicographical order, denoted by $\alpha\lessdot\beta$, if there exists an integer $i\in\{1,\dots,m\}$ such that $\alpha_{i}<\beta_{i}$ and $\alpha_{j}=\beta_{j}$ for all $j=1,\dots,i-1$.

Example 1

As defined above, we have $(2,3,9)\lessdot(2,5,1)$ and $(0,3)\lessdot(1,0,1)$.
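The comparison in Definition 5 can be sketched in a few lines of Python. This is only an illustrative sketch: the names `proper_order`, `pad`, and `lex_less` are hypothetical helpers introduced here, and multi-indices are assumed to be given as tuples of nonnegative integers.

```python
def proper_order(alpha):
    """Return k such that alpha is a proper k-multi-index (0 for the zero index)."""
    k = 0
    for position, entry in enumerate(alpha, start=1):
        if entry > 0:
            k = position
    return k


def pad(alpha, m):
    """View alpha as an m-tuple; only trailing zeros are added or dropped."""
    head = tuple(alpha)[:m]
    return head + (0,) * (m - len(head))


def lex_less(alpha, beta):
    """Lexicographical comparison of Definition 5 (assumed tuple encoding)."""
    m = max(proper_order(alpha), proper_order(beta)) + 1
    # Python compares tuples entry by entry from the left, which is exactly
    # "alpha_i < beta_i at the first position where the two differ".
    return pad(alpha, m) < pad(beta, m)


print(lex_less((2, 3, 9), (2, 5, 1)))  # first pair of Example 1
print(lex_less((0, 3), (1, 0, 1)))     # second pair of Example 1
```

Padding both tuples to the common length $m$ means that indices written with different numbers of trailing zeros are compared consistently.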

Definition 6

Let $I$ be a set whose members are all proper $i$-multi-indices. $\alpha\in I$ is said to be the least $i$-multi-index of $I$ if $\alpha\lessdot\beta$ holds for any $\beta\in I$ different from $\alpha$. Further, let $p(x_{1},\dots,x_{m})$ be a smooth function (or a holomorphic function). We also call the least $i$-multi-index of ${\cal I}_{i}(p)$ the least $i$-multi-index of $p$, written as ${\cal L}_{i}(p)$.

Remark 4

${\cal L}_{0}(p)=0$ if and only if $p(0)\neq 0$.

Example 2

Consider the following lower triangular form

$$
\begin{aligned}
\dot{x}_{1} &= x_{1}x_{2}^{3}+x_{1}^{2}x_{2}-x_{1}=f_{1}(x_{1},x_{2}) \\
\dot{x}_{2} &= x_{2}^{2}x_{3}+x_{1}x_{3}=f_{2}(x_{1},x_{2},x_{3}) \\
\dot{x}_{3} &= x_{3}+u.
\end{aligned}
$$

We have ${\cal L}_{2}(f_{1})=(1,3)$ and ${\cal L}_{3}(f_{2})=(0,2,1)$.
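For polynomial right-hand sides such as those in Example 2, the least multi-indices can be read off mechanically. The sketch below assumes a polynomial is stored as a dictionary from exponent tuples to coefficients (a representation chosen here for illustration, not taken from the paper); for a polynomial, $\partial^{\alpha}p/\partial x^{\alpha}(0)=\alpha!\,c_{\alpha}$, so $\alpha$ is a multi-index of $p$ exactly when its coefficient is nonzero. The helper names are hypothetical.

```python
def proper_order(alpha):
    """Return k such that alpha is a proper k-multi-index (0 for the zero index)."""
    k = 0
    for position, entry in enumerate(alpha, start=1):
        if entry > 0:
            k = position
    return k


def multi_indices(poly):
    """Exponent tuples with nonzero coefficient, i.e. the multi-indices of poly."""
    return {tuple(alpha) for alpha, coeff in poly.items() if coeff != 0}


def least_multi_index(poly, i):
    """Least proper i-multi-index of poly in lexicographical order, or None."""
    candidates = [alpha[:i] for alpha in multi_indices(poly)
                  if proper_order(alpha) == i]
    # All candidates have length i, so tuple comparison is lexicographical.
    return min(candidates) if candidates else None


# f1 = x1*x2^3 + x1^2*x2 - x1 and f2 = x2^2*x3 + x1*x3 from Example 2
F1 = {(1, 3): 1, (2, 1): 1, (1, 0): -1}
F2 = {(0, 2, 1): 1, (1, 0, 1): 1}

print(least_multi_index(F1, 2))
print(least_multi_index(F2, 3))
```

For non-polynomial smooth functions the same idea would apply to Taylor coefficients at the origin, which this sketch does not attempt to compute.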

Definition 7

Let $\alpha$ and $\beta$ be multi-indices. If there exists a lower triangular coordinate transformation $y=U(x)$ such that

$$
x^{\alpha}=c_{\beta}y^{\beta}+h(y)
$$

where $c_{\beta}\neq 0$ and the function $h(y)$ satisfies $\partial^{\beta}h/\partial y^{\beta}(0)=0$, then we say that $\beta$ is generated by $\alpha$, denoted by $\alpha\preceq\beta$. If $\alpha\preceq\beta$ and $\alpha\neq\beta$, we write $\alpha\prec\beta$.

Remark 5

An arbitrary proper $i$-multi-index ($i>0$) can be generated by the proper $i$-multi-index $(0,\dots,0,1)$. The $0$-multi-index $0$ can only generate itself and can only be generated by itself.

Example 3

Let $\alpha=(1,2,1)$, and select the following lower triangular coordinate transformation

$$
\begin{aligned}
y_{1} &= x_{1} \\
y_{2} &= x_{2}+x_{1}^{2} \\
y_{3} &= x_{3}+x_{1},
\end{aligned}
$$

whose inverse transformation can be expressed as

$$
\begin{aligned}
x_{1} &= y_{1} \\
x_{2} &= y_{2}-y_{1}^{2} \\
x_{3} &= y_{3}-y_{1}.
\end{aligned}
$$

Substituting the above equations into $x^{\alpha}$ yields

$$
\begin{aligned}
x^{\alpha} &= x_{1}x_{2}^{2}x_{3} \\
&= y_{1}y_{2}^{2}y_{3}-y_{1}^{2}y_{2}^{2}-2y_{1}^{3}y_{2}y_{3}+2y_{1}^{4}y_{2}+y_{1}^{5}y_{3}-y_{1}^{6};
\end{aligned}
$$

that is, $\alpha$ can generate at least the following six 3-multi-indices: $(1,2,1)$, $(2,2,0)$, $(3,1,1)$, $(4,1,0)$, $(5,0,1)$, and $(6,0,0)$.
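The expansion in Example 3 can be reproduced with a small polynomial routine. The following is a sketch under the assumption that polynomials in $(y_{1},y_{2},y_{3})$ are stored as dictionaries from exponent tuples to integer coefficients; `poly_mul`, `poly_pow`, and `substitute_monomial` are hypothetical helper names introduced for this illustration.

```python
from collections import defaultdict


def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for ea, ca in p.items():
        for eb, cb in q.items():
            out[tuple(a + b for a, b in zip(ea, eb))] += ca * cb
    return {e: c for e, c in out.items() if c != 0}


def poly_pow(p, n, nvars):
    """Raise polynomial p to the nonnegative integer power n."""
    result = {(0,) * nvars: 1}
    for _ in range(n):
        result = poly_mul(result, p)
    return result


def substitute_monomial(alpha, images):
    """Expand x^alpha after replacing each x_k by the polynomial images[k-1]."""
    nvars = len(images)
    result = {(0,) * nvars: 1}
    for exponent, image in zip(alpha, images):
        result = poly_mul(result, poly_pow(image, exponent, nvars))
    return result


# Inverse transformation of Example 3: x1 = y1, x2 = y2 - y1^2, x3 = y3 - y1
X1 = {(1, 0, 0): 1}
X2 = {(0, 1, 0): 1, (2, 0, 0): -1}
X3 = {(0, 0, 1): 1, (1, 0, 0): -1}

expansion = substitute_monomial((1, 2, 1), [X1, X2, X3])
for exponents in sorted(expansion):
    print(exponents, expansion[exponents])
```

The keys of `expansion` are the multi-indices of $x^{\alpha}$ with respect to the $y$-coordinates, so they list the multi-indices generated by $\alpha$ under this particular transformation.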

Proposition 2

Let $\alpha$ and $\beta$ be a proper $m_{\alpha}$-multi-index and a proper $m_{\beta}$-multi-index, respectively. If $m_{\alpha}<m_{\beta}$, then $\alpha$ cannot generate $\beta$.

Theorem 1

Let $\alpha$ and $\beta$ be a proper $m_{\alpha}$-multi-index and a proper $m_{\beta}$-multi-index, respectively, satisfying $m_{\alpha}\geq m_{\beta}>0$ and $\alpha\neq\beta$. Then $\alpha\prec\beta$ if and only if for all $i=1,\dots,m_{\alpha}$ we have

$$
\sum_{j=1}^{i}\alpha_{j}\leq\sum_{j=1}^{i}\beta_{j}. \tag{8}
$$
Proof 3.2.

The necessity is obvious, so let us verify the sufficiency. We first consider the case of $m_{\alpha}=1$. It is clear that $m_{\beta}=1$ in this case. From $\alpha\neq\beta$ and (8), $0<\alpha_{1}<\beta_{1}$ holds. Let $h(x_{1})=x_{1}^{\alpha_{1}}$ and let $y_{1}$ be a new coordinate satisfying

$$
x_{1}=y_{1}+y_{1}^{\beta_{1}-\alpha_{1}+1}.
$$

Since substituting the above equation into $h(x_{1})$ yields

$$
h(x_{1})=h_{y}(y_{1})=\sum_{i=0}^{\alpha_{1}}\binom{\alpha_{1}}{i}y_{1}^{\alpha_{1}-i}\,y_{1}^{i(\beta_{1}-\alpha_{1}+1)},
$$

we have

$$
\frac{\partial^{\beta_{1}}h_{y}}{\partial y_{1}^{\beta_{1}}}(0)=\alpha_{1}\cdot\beta_{1}!\neq 0,
$$

and then $\alpha\prec\beta$ holds for this case.
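As a concrete check of this base case, the following worked instance uses the hypothetical values $\alpha_{1}=2$ and $\beta_{1}=5$, chosen here purely for illustration:

```latex
\begin{aligned}
x_{1} &= y_{1}+y_{1}^{\beta_{1}-\alpha_{1}+1}=y_{1}+y_{1}^{4},\\
h_{y}(y_{1}) &= \left(y_{1}+y_{1}^{4}\right)^{2}=y_{1}^{2}+2y_{1}^{5}+y_{1}^{8},\\
\frac{\partial^{5}h_{y}}{\partial y_{1}^{5}}(0) &= 2\cdot 5! = 240 \neq 0 .
\end{aligned}
```

The $i$th term of the binomial sum has exponent $\alpha_{1}+i(\beta_{1}-\alpha_{1})$, which equals $\beta_{1}$ only for $i=1$; this is why the coefficient of $y_{1}^{\beta_{1}}$ is $\binom{\alpha_{1}}{1}=\alpha_{1}$.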

Suppose that, for an integer $k>0$ and all $m_{\alpha}=1,\dots,k$, $\alpha\prec\beta$ holds when (8) is satisfied. We now prove that (8) still implies $\alpha\prec\beta$ when $\alpha$ is a proper $(k+1)$-multi-index. To this end, let us consider the two cases discussed below. For the case $\alpha_{k+1}<\beta_{k+1}$, one can construct a family of new coordinates $y_{1},\dots,y_{k}$ satisfying

$$
\begin{aligned}
x_{1} &= U_{1}(y_{1}) \\
&\;\;\vdots \\
x_{k} &= U_{k}(y_{1},\dots,y_{k})
\end{aligned}
$$

and $x_{1}^{\alpha_{1}}\dots x_{k}^{\alpha_{k}}=c_{(\beta_{1},\dots,\beta_{k})}y_{1}^{\beta_{1}}\dots y_{k}^{\beta_{k}}+s(y)$, where the coefficient $c_{(\beta_{1},\dots,\beta_{k})}\neq 0$ and $\partial^{(\beta_{1},\dots,\beta_{k})}s/\partial y^{(\beta_{1},\dots,\beta_{k})}(0)=0$. If we choose the next coordinate $y_{k+1}$ satisfying

$$
x_{k+1}=y_{k+1}+y_{k+1}^{\beta_{k+1}-\alpha_{k+1}+1}
$$

then

$$
x^{\alpha}=\left(c_{(\beta_{1},\dots,\beta_{k})}y_{1}^{\beta_{1}}\dots y_{k}^{\beta_{k}}+s(y)\right)\cdot\sum_{i=0}^{\alpha_{k+1}}\binom{\alpha_{k+1}}{i}y_{k+1}^{\alpha_{k+1}-i}\,y_{k+1}^{i(\beta_{k+1}-\alpha_{k+1}+1)} \tag{9}
$$

is obtained. There is a term

$$
c_{(\beta_{1},\dots,\beta_{k})}\cdot\alpha_{k+1}\cdot y_{1}^{\beta_{1}}\dots y_{k+1}^{\beta_{k+1}}
$$

on the right-hand side of (9), which implies $\alpha\prec\beta$. The other case is $\alpha_{k+1}\geq\beta_{k+1}$. Taking a family of new coordinates $\bar{x}_{1},\dots,\bar{x}_{k+1}$ satisfying

$$
\begin{aligned}
x_{1} &= V_{1}(\bar{x}_{1})=\bar{x}_{1} \\
&\;\;\vdots \\
x_{k} &= V_{k}(\bar{x}_{k})=\bar{x}_{k} \\
x_{k+1} &= V_{k+1}(\bar{x})=\bar{x}_{k}+\bar{x}_{k+1},
\end{aligned}
$$

denoted by $(x_{1},\dots,x_{k+1})=V(\bar{x}_{1},\dots,\bar{x}_{k+1})$, we compute the function $h_{1}(\bar{x})=x^{\alpha}$:

$$
h_{1}(\bar{x})=\bar{x}_{1}^{\alpha_{1}}\dots\bar{x}_{k-1}^{\alpha_{k-1}}\sum_{i=0}^{\alpha_{k+1}}\binom{\alpha_{k+1}}{i}\bar{x}_{k}^{\alpha_{k}+i}\,\bar{x}_{k+1}^{\alpha_{k+1}-i}.
$$

The right-hand side of the above equation includes the term

$$
\binom{\alpha_{k+1}}{\alpha_{k+1}-\beta_{k+1}}\bar{x}_{1}^{\alpha_{1}}\dots\bar{x}_{k-1}^{\alpha_{k-1}}\bar{x}_{k}^{(\alpha_{k}+\alpha_{k+1}-\beta_{k+1})}\bar{x}_{k+1}^{\beta_{k+1}}. \tag{10}
$$

Let us denote the multi-index of this term by $\gamma=(\gamma_{1},\dots,\gamma_{k+1})=(\alpha_{1},\dots,\alpha_{k-1},\alpha_{k}+\alpha_{k+1}-\beta_{k+1},\beta_{k+1})$; it is obvious that $\alpha\preceq\gamma$. Additionally, for any $l=1,\dots,k$, the inequality $\sum_{j=1}^{l}\gamma_{j}\leq\sum_{j=1}^{l}\beta_{j}$ holds, and then $(\gamma_{1},\dots,\gamma_{k})\prec(\beta_{1},\dots,\beta_{k})$ is obtained. This means that we can find a new family of coordinates $(y_{1},\dots,y_{k+1})$ satisfying

$$
\begin{aligned}
\bar{x}_{1} &= W_{1}(y_{1}) \\
&\;\;\vdots \\
\bar{x}_{k} &= W_{k}(y_{1},\dots,y_{k}) \\
\bar{x}_{k+1} &= W_{k+1}(y_{k+1})=y_{k+1}
\end{aligned}
$$

such that $\beta$ is a multi-index of $h_{2}(y)=\bar{x}^{\gamma}=(W_{1}(y),\dots,W_{k+1}(y))^{\gamma}=W(y)^{\gamma}$ with respect to the $y$-coordinates; that is, $\gamma\prec\beta$. To prove $\alpha\prec\beta$, let $\delta\neq\gamma$ be another multi-index of $h_{1}(\bar{x})$ with respect to $\bar{x}$. Since $\delta_{k+1}\neq\beta_{k+1}$, $\beta$ is not a multi-index of $h_{3}(y)=\bar{x}^{\delta}=W(y)^{\delta}$ with respect to the $y$-coordinates. Thus, $\beta$ must be a multi-index of $h_{4}(y)=x^{\alpha}=(V(W(y)))^{\alpha}$ with respect to the $y$-coordinates.

Therefore, $\alpha\prec\beta$ holds when we have (8).
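The partial-sum test of Theorem 1 is easy to mechanize. The sketch below combines it with Proposition 2 (an index of lower proper order cannot generate one of higher order) and Remark 5 (the zero index only generates itself); the function names are hypothetical and multi-indices are assumed to be tuples of nonnegative integers.

```python
from itertools import accumulate


def normalize(alpha):
    """Drop trailing zeros so a proper k-multi-index becomes a k-tuple."""
    alpha = tuple(alpha)
    while alpha and alpha[-1] == 0:
        alpha = alpha[:-1]
    return alpha


def generates(alpha, beta):
    """Test whether alpha generates beta using the criterion of Theorem 1."""
    a, b = normalize(alpha), normalize(beta)
    if not a or not b:        # Remark 5: 0 generates, and is generated by, only 0
        return a == b
    if len(a) < len(b):       # Proposition 2
        return False
    b = b + (0,) * (len(a) - len(b))
    # Inequality (8): every partial sum of alpha is at most that of beta.
    return all(sa <= sb for sa, sb in zip(accumulate(a), accumulate(b)))


alpha = (1, 2, 1)
generated = [(1, 2, 1), (2, 2, 0), (3, 1, 1), (4, 1, 0), (5, 0, 1), (6, 0, 0)]
print(all(generates(alpha, beta) for beta in generated))
print(generates((1, 3), (2, 1)), generates((2, 1), (1, 3)))
```

The first check runs the criterion on the six multi-indices found in Example 3; the second pair, $(1,3)$ and $(2,1)$, is a case where neither index generates the other.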

From the above theorem, the following two corollaries are immediate consequences.

Corollary 3.3.

Let $I$ be a set of multi-indices, and take any $\alpha,\beta,\gamma\in I$. Then the relation $\preceq$ has the following properties:

(i) $\alpha\preceq\alpha$;

(ii) both $\alpha\preceq\beta$ and $\beta\preceq\alpha$ imply $\alpha=\beta$;

(iii) both $\alpha\preceq\beta$ and $\beta\preceq\gamma$ imply $\alpha\preceq\gamma$.

That is, $\preceq$ is a partial order on the ground set $I$.

Corollary 3.4.

Let $\alpha$ and $\beta$ be a proper $m_{\alpha}$-multi-index and a proper $m_{\beta}$-multi-index, respectively, with $m_{\alpha}\geq m_{\beta}>0$. Then $\alpha\nprec\beta$ if and only if for some $i\in\{1,\dots,m_{\alpha}\}$ the inequality

$$
\sum_{j=1}^{i}\alpha_{j}>\sum_{j=1}^{i}\beta_{j}
$$

holds. In the case of $m_{\alpha}=m_{\beta}$, $\alpha\npreceq\beta$ and $\beta\npreceq\alpha$ are both true if and only if there exist two integers $i_{1},i_{2}\in\{1,\dots,m_{\alpha}\}$ such that the following inequalities hold:

$$
\sum_{j=1}^{i_{1}}\alpha_{j}<\sum_{j=1}^{i_{1}}\beta_{j},\qquad\sum_{j=1}^{i_{2}}\alpha_{j}>\sum_{j=1}^{i_{2}}\beta_{j}.
$$
Definition 3.5.

Let $I$ be a set of multi-indices and $\alpha\in I$ a proper $i$-multi-index. $\alpha$ is said to be a weakly essential $i$-multi-index of $I$ if there is no other proper $i$-multi-index of $I$ that can generate $\alpha$. If $\alpha'\nprec\alpha$ holds for any $\alpha'\in I$, we say that $\alpha$ is an essential $i$-multi-index of $I$. Let $p(x_{1},\dots,x_{m})$ be a smooth function (or a holomorphic function) and $\beta$ a proper $i$-multi-index of $p$. $\beta$ is said to be a weakly essential $i$-multi-index of $p$ if $\beta$ is a weakly essential $i$-multi-index of ${\cal I}(p)$. Moreover, if $\beta$ is an essential $i$-multi-index of ${\cal I}(p)$, we say that $\beta$ is an essential $i$-multi-index of $p$.
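The difference between weakly essential and essential multi-indices can be illustrated on a small finite set. The sketch below assumes that "generates" is decided by the partial-sum criterion of Theorem 1 together with Proposition 2 and Remark 5; the set `I` is a hypothetical example that does not come from the paper, and all function names are introduced here for illustration.

```python
from itertools import accumulate


def normalize(alpha):
    """Drop trailing zeros so a proper k-multi-index becomes a k-tuple."""
    alpha = tuple(alpha)
    while alpha and alpha[-1] == 0:
        alpha = alpha[:-1]
    return alpha


def generates(alpha, beta):
    """Partial-sum test of Theorem 1, with Proposition 2 and Remark 5."""
    a, b = normalize(alpha), normalize(beta)
    if not a or not b:
        return a == b
    if len(a) < len(b):
        return False
    b = b + (0,) * (len(a) - len(b))
    return all(sa <= sb for sa, sb in zip(accumulate(a), accumulate(b)))


def strictly_generates(alpha, beta):
    """alpha generates beta and the two indices are different."""
    return generates(alpha, beta) and normalize(alpha) != normalize(beta)


def weakly_essential(index_set, i):
    """Proper i-multi-indices not generated by another proper i-multi-index."""
    same = {normalize(a) for a in index_set if len(normalize(a)) == i}
    return {a for a in same if not any(strictly_generates(b, a) for b in same)}


def essential(index_set, i):
    """Proper i-multi-indices not strictly generated by any member of the set."""
    members = {normalize(a) for a in index_set}
    same = {a for a in members if len(a) == i}
    return {a for a in same if not any(strictly_generates(b, a) for b in members)}


I = {(1, 1), (2, 1), (0, 3), (0, 0, 1)}  # hypothetical set of multi-indices
print(weakly_essential(I, 2))
print(essential(I, 2))
print(essential(I, 3))
```

In this hypothetical set, a proper 3-multi-index is allowed to rule out proper 2-multi-indices in the essential test but not in the weakly essential test, which is the only place where the two functions differ.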

Lemma 3.6.

Let $p(x_{1},\dots,x_{m})$ be a smooth function (or a holomorphic function) and $x=V(y)$ a lower triangular coordinate transformation. For an $m$-multi-index $\alpha=(\alpha_{1},\dots,\alpha_{m})\neq 0$ and the function $q(y)=p(V(y))$, we have

$$
\frac{\partial^{\alpha}q}{\partial y^{\alpha}}=\sum_{\beta\preceq\alpha}\;\sum_{\alpha=\sum_{k,i}\gamma^{\beta,k,i}}\left(\frac{\partial^{\beta}q}{\partial x^{\beta}}\prod_{k=1}^{m}\prod_{i=1}^{\beta_{k}}\frac{\partial^{\gamma^{\beta,k,i}}x_{k}}{\partial y^{\gamma^{\beta,k,i}}}\right) \tag{11}
$$

where $\beta=(\beta_{1},\dots,\beta_{m})$ and every $\gamma^{\beta,k,i}$ is a $k$-multi-index.

Proof 3.7.

Let $\alpha=(0,\dots,0,\alpha_{i},0,\dots,0)=(0,\dots,0,1,0,\dots,0)$ where $1\leq i\leq m$. Then

\[
\frac{\partial^{\alpha}q}{\partial y^{\alpha}}=\frac{\partial q}{\partial y_{i}}=\frac{\partial q}{\partial x_{i}}\frac{\partial x_{i}}{\partial y_{i}}+\dots+\frac{\partial q}{\partial x_{m}}\frac{\partial x_{m}}{\partial y_{i}}.
\]

The equation above implies that (11) is satisfied in this case.

Assume (11) holds for a nonzero multi-index $\alpha=(\alpha_{1},\dots,\alpha_{m})=(0,\dots,0,\alpha_{j},\dots,\alpha_{m})$, where $1\leq j\leq m$ and $\alpha_{j}\geq 0$. Let $\alpha^{\prime}=(0,\dots,0,\alpha_{j}+1,\alpha_{j+1},\dots,\alpha_{m})$. For all $\beta$ satisfying $\beta\preceq\alpha$, $\beta\preceq\alpha^{\prime}$ can be deduced by using Theorem 1. We now focus on the case of $\bar{\beta}\preceq\alpha^{\prime}$ but $\bar{\beta}\npreceq\alpha$. There exists an integer $k\in\{j,\dots,m\}$ such that $\sum_{i=1}^{k}\bar{\beta}_{i}>\sum_{i=1}^{k}\alpha_{i}$ and $\sum_{i=1}^{l}\bar{\beta}_{i}\leq\sum_{i=1}^{l}\alpha_{i}$ for all $l=1,\dots,k-1$. Comparing $\alpha^{\prime}$ to $\alpha$, the relation

\[
(0,\dots,0,\bar{\beta}_{j},\dots,\bar{\beta}_{k-1},\bar{\beta}_{k}-1,\bar{\beta}_{k+1},\dots,\bar{\beta}_{m})\preceq\alpha
\]

must hold in this case. Then the direct calculation presented in (12) shows that (11) holds for $\alpha^{\prime}$.

\[
\begin{aligned}
\frac{\partial^{\alpha^{\prime}}q}{\partial y^{\alpha^{\prime}}}
&=\frac{\partial^{\alpha^{\prime}}q}{\partial y_{m}^{\alpha_{m}}\cdots\partial y_{j+1}^{\alpha_{j+1}}\partial y_{j}^{\alpha_{j}+1}}
=\partial\left(\sum_{\beta\preceq\alpha}\;\sum_{\alpha=\sum_{k,i}\gamma^{\beta,k,i}}\left(\frac{\partial^{\beta}q}{\partial x_{m}^{\beta_{m}}\cdots\partial x_{j}^{\beta_{j}}}\prod_{k=j}^{m}\prod_{i=1}^{\beta_{k}}\frac{\partial^{\gamma^{\beta,k,i}}x_{k}}{\partial y_{k}^{\gamma_{k}^{\beta,k,i}}\cdots\partial y_{j}^{\gamma_{j}^{\beta,k,i}}}\right)\right)\bigg/\partial y_{j}\\
&=\sum_{\beta\preceq\alpha}\;\sum_{\alpha=\sum_{k,i}\gamma^{\beta,k,i}}\left(\left(\sum_{l=j}^{m}\frac{\partial^{\beta}q}{\partial x_{m}^{\beta_{m}}\cdots\partial x_{l}^{\beta_{l}+1}\cdots\partial x_{j}^{\beta_{j}}}\frac{\partial x_{l}}{\partial y_{j}}\right)\prod_{k=j}^{m}\prod_{i=1}^{\beta_{k}}\frac{\partial^{\gamma^{\beta,k,i}}x_{k}}{\partial y_{k}^{\gamma_{k}^{\beta,k,i}}\cdots\partial y_{j}^{\gamma_{j}^{\beta,k,i}}}\right.\\
&\qquad\left.+\frac{\partial^{\beta}q}{\partial x_{m}^{\beta_{m}}\cdots\partial x_{j}^{\beta_{j}}}\cdot\partial\left(\prod_{k=j}^{m}\prod_{i=1}^{\beta_{k}}\frac{\partial^{\gamma^{\beta,k,i}}x_{k}}{\partial y_{k}^{\gamma_{k}^{\beta,k,i}}\cdots\partial y_{j}^{\gamma_{j}^{\beta,k,i}}}\right)\bigg/\partial y_{j}\right)\\
&=\sum_{\beta^{\prime}\preceq\alpha^{\prime}}\;\sum_{\alpha^{\prime}=\sum_{k,i}\lambda^{\beta^{\prime},k,i}}\left(\frac{\partial^{\beta^{\prime}}q}{\partial x_{m}^{\beta_{m}^{\prime}}\cdots\partial x_{j}^{\beta_{j}^{\prime}}}\prod_{k=j}^{m}\prod_{i=1}^{\beta_{k}^{\prime}}\frac{\partial^{\lambda^{\beta^{\prime},k,i}}x_{k}}{\partial y_{k}^{\lambda_{k}^{\beta^{\prime},k,i}}\cdots\partial y_{j}^{\lambda_{j}^{\beta^{\prime},k,i}}}\right)
\end{aligned}
\tag{12}
\]

This proves (11).

Proposition 3.8.

$\alpha$ is a weakly essential $i$-multi-index of a smooth function (or a holomorphic function) $p(x_{1},\dots,x_{m})$ if and only if $\alpha$ is still a weakly essential $i$-multi-index of the function $q(y)=p(V(y))$, where $x=V(y)$ is a coordinate transformation taking the form

\[
\begin{aligned}
&x_{1}=V_{1}(y_{1}),\ \dots,\ x_{i}=V_{i}(y_{1},\dots,y_{i}),\\
&x_{i+1}=V_{i+1}(y_{i+1})=y_{i+1},\ \dots,\ x_{m}=V_{m}(y_{m})=y_{m}.
\end{aligned}
\tag{13}
\]
Proof 3.9.

Necessity. Since the necessity is obvious when $i=0$, we only consider the case of $i\geq 1$. Let $\beta$ be a multi-index satisfying $\beta\prec\alpha$. If $\beta$ is a proper $i$-multi-index, we obtain $\partial^{\beta}q/\partial x^{\beta}(0)=0$ since $\beta$ is not a multi-index of $p(x)$. Now consider $\beta$ as a proper $i^{\prime}$-multi-index with $i<i^{\prime}\leq m$. Owing to $\partial x_{i^{\prime}}/\partial y_{k}=0$ for any $k=1,\dots,i$, any term in $\partial^{\alpha}q/\partial y^{\alpha}$ which has the multiplier $\partial^{\beta}q/\partial x^{\beta}$ is equal to 0. Thus, the only term in $\partial^{\alpha}q/\partial y^{\alpha}$ that is not equal to 0 at the origin is

\[
\frac{\partial^{\alpha}q}{\partial x^{\alpha}}\left(\frac{\partial x_{1}}{\partial y_{1}}\right)^{\alpha_{1}}\cdots\left(\frac{\partial x_{i}}{\partial y_{i}}\right)^{\alpha_{i}};
\]

that is, $\partial^{\alpha}q/\partial y^{\alpha}(0)\neq 0$. Similarly, we can obtain $\partial^{\gamma}q/\partial y^{\gamma}(0)=0$ for an arbitrary $i$-multi-index $\gamma\prec\alpha$. Therefore $\alpha$ is a weakly essential $i$-multi-index of $q(y)$.

To prove the sufficiency, it is enough to note that the inverse transformation of $V$ is of the form

\[
\begin{aligned}
&y_{1}=U_{1}(x_{1}),\ \dots,\ y_{i}=U_{i}(x_{1},\dots,x_{i}),\\
&y_{i+1}=U_{i+1}(x_{i+1})=x_{i+1},\ \dots,\ y_{m}=U_{m}(x_{m})=x_{m}
\end{aligned}
\]

and to repeat the proof of the necessity.

Furthermore, the following proposition can be verified in a way similar to the proof of the above proposition.

Proposition 3.10.

$\alpha$ is an essential multi-index of a smooth function (or a holomorphic function) $p(x_{1},\dots,x_{m})$ if and only if $\alpha$ is still an essential multi-index of the function $q(y)=p(V(y))$, where $x=V(y)$ is a lower triangular coordinate transformation.

Definition 3.11.

Let $I$ be a set of $i$-multi-indices and $I^{\prime}$ a subset of $I$. $I^{\prime}$ is said to be the greatest weakly essential $i$-multi-index set of $I$ if $I^{\prime}$ consists of all the weakly essential $i$-multi-indices of $I$. Let $p(x_{1},\dots,x_{m})$ be a smooth function (or a holomorphic function) and $I_{p}$ a subset of ${\cal I}_{i}(p)$. $I_{p}$ is said to be the greatest weakly essential $i$-multi-index set of $p$, denoted by ${\cal W}_{i}(p)$, if it is the greatest weakly essential $i$-multi-index set of ${\cal I}_{i}(p)$. We also write ${\cal W}(p)=\bigcup_{i=0}^{m}{\cal W}_{i}(p)$.

Definition 3.12.

Let $I$ be a set of multi-indices. $I^{\prime}$ is said to be the greatest essential $i$-multi-index set of $I$ if $I^{\prime}$ consists of all the essential $i$-multi-indices of $I$. A set is said to be the greatest essential $i$-multi-index set of $p(x_{1},\dots,x_{m})$, denoted by ${\cal E}_{i}(p)$, if it consists of all the essential $i$-multi-indices of ${\cal I}(p)$. We also define ${\cal E}(p)=\bigcup_{i=0}^{m}{\cal E}_{i}(p)$ and call ${\cal E}(p)$ the greatest essential multi-index set of $p$.

Exploiting Definition 3.11, Definition 3.12, Proposition 3.8, and Proposition 3.10, we obtain the following two theorems.

Theorem 3.13.

Let $p(x_{1},\dots,x_{m})$ be a smooth function (or a holomorphic function), $x=V(y)$ a change of coordinates taking the form (13), and $q(y_{1},\dots,y_{m})=p(V_{1}(y_{1}),\dots,V_{m}(y_{1},\dots,y_{m}))$. Then ${\cal W}_{i}(p)={\cal W}_{i}(q)$.

Theorem 3.14.

Let $p(x_{1},\dots,x_{m})$ be a smooth function (or a holomorphic function), $x=V(y)$ a lower triangular coordinate transformation, and $q(y_{1},\dots,y_{m})=p(V_{1}(y_{1}),\dots,V_{m}(y_{1},\dots,y_{m}))$. Then ${\cal E}(p)={\cal E}(q)$ and ${\cal E}_{i}(p)={\cal E}_{i}(q)$ for $i=0,\dots,m$.

Proposition 3.15.

Let $I$ be a set of proper $i$-multi-indices such that, for any two different elements $\alpha,\beta\in I$, both $\alpha\nprec\beta$ and $\beta\nprec\alpha$ are satisfied. Then $I$ is a finite set.

Proof 3.16.

When $i=0$, $I$ is obviously finite. Assuming $i=1$ and $\alpha=(\alpha_{1})\in I$, $\alpha$ must be the only element of $I$ because, for any $\beta=(\beta_{1})$ different from $\alpha$, $\beta_{1}<\alpha_{1}$ means $\beta\prec\alpha$ and $\alpha_{1}<\beta_{1}$ means $\alpha\prec\beta$.

We now show that if $I$ is finite whenever it is a set of $i$-multi-indices with $i=1,\dots,j$, then $I$ remains finite when it is a set of $(j+1)$-multi-indices. Suppose $\alpha=(\alpha_{1},\dots,\alpha_{j+1})$ is a given proper $(j+1)$-multi-index of $I$. For any $\beta=(\beta_{1},\dots,\beta_{j+1})\in I$, there are four possible relations between $(\alpha_{1},\dots,\alpha_{j})$ and $(\beta_{1},\dots,\beta_{j})$: $(\alpha_{1},\dots,\alpha_{j})=(\beta_{1},\dots,\beta_{j})$; $(\alpha_{1},\dots,\alpha_{j})\prec(\beta_{1},\dots,\beta_{j})$; $(\beta_{1},\dots,\beta_{j})\prec(\alpha_{1},\dots,\alpha_{j})$; and neither $(\alpha_{1},\dots,\alpha_{j})\preceq(\beta_{1},\dots,\beta_{j})$ nor $(\beta_{1},\dots,\beta_{j})\preceq(\alpha_{1},\dots,\alpha_{j})$. We will verify that the subset consisting of all the multi-indices falling into each case is finite. In the first case, $\beta_{j+1}=\alpha_{j+1}$ must hold to meet both $\alpha\nprec\beta$ and $\beta\nprec\alpha$; that is, $\alpha$ is the only multi-index suitable for this case. In the second case, $(\alpha_{1},\dots,\alpha_{j})\prec(\beta_{1},\dots,\beta_{j})$ means that $\beta\nprec\alpha$ has already been satisfied, and we have to choose $\beta$ such that $\sum_{k=1}^{j+1}\beta_{k}<\sum_{k=1}^{j+1}\alpha_{k}$. For a given $\alpha$, the above inequality implies that the choices of $\beta$ are finite in number. Let us discuss the third case. The number of all the proper $j$-multi-indices $(\beta_{1},\dots,\beta_{j})$ satisfying $(\beta_{1},\dots,\beta_{j})\prec(\alpha_{1},\dots,\alpha_{j})$ is finite. Furthermore, for a fixed $(\beta_{1},\dots,\beta_{j})$, there is at most one element $\beta^{\prime}\in I$ satisfying $\beta_{l}^{\prime}=\beta_{l}$ for $l=1,\dots,j$. So the elements of $I$ that fall into the third case are also finite in number. In the last case, the two proper $j$-multi-indices $(\alpha_{1},\dots,\alpha_{j})$ and $(\beta_{1},\dots,\beta_{j})$ cannot generate each other. For a given $(\alpha_{1},\dots,\alpha_{j})$, all the proper $j$-multi-indices that can be selected as $(\beta_{1},\dots,\beta_{j})$ have been assumed to be finite in number. Note that, for a fixed $(\beta_{1},\dots,\beta_{j})$, at most one proper $(j+1)$-multi-index taking the form $(\beta_{1},\dots,\beta_{j},\beta_{j+1})$ can belong to $I$. Then, the possible proper $(j+1)$-multi-indices that can be chosen as $\beta$ in this case are finite in number. In summary, the set $I$ is finite.

The following theorem can be obtained directly from the above proposition.

Theorem 3.17.

Suppose $p(x_{1},\dots,x_{m})$ is a smooth function (or a holomorphic function). Then, for $i=0,\dots,m$, ${\cal W}_{i}(p)$, ${\cal E}_{i}(p)$, ${\cal W}(p)$, and ${\cal E}(p)$ are all finite sets.

Let $I$ be a set of multi-indices. We write the set that consists of all the multi-indices generated by the elements of $I$ as ${\cal G}(I)$, and write the subset that consists of all the proper $i$-multi-indices of ${\cal G}(I)$ as ${\cal G}_{i}(I)$.

Theorem 3.18.

Let $I$ be a set of multi-indices and $W$ a set of weakly essential $i$-multi-indices of $I$. Suppose $\alpha\in I\setminus{\cal G}_{i}(W)$ is a proper $i$-multi-index and there exists an integer $l\in\{1,\dots,i\}$ such that

(i) $\sum_{j=1}^{l}\alpha_{j}\leq\sum_{j=1}^{l}\beta_{j}$ holds for every proper $i$-multi-index $\beta\in I\setminus\left({\cal G}_{i}(W)\cup\{\alpha\}\right)$,

(ii) $\alpha\lessdot\beta$ is satisfied when $\sum_{j=1}^{l}\alpha_{j}=\sum_{j=1}^{l}\beta_{j}$.

Then $\alpha$ must be a weakly essential $i$-multi-index of $I$. Additionally, if $p(x_{1},\dots,x_{m})$ is a smooth function (or a holomorphic function) and the aforementioned set is $I={\cal I}(p)$, then $\alpha\in{\cal W}_{i}(p)$.

Proof 3.19.

Since $\beta\nprec\alpha$ for every $\beta$ considered in (i), and $\alpha^{\prime}\nprec\alpha$ for any $\alpha^{\prime}\in{\cal G}_{i}(W)$, $\alpha$ cannot be generated by another proper $i$-multi-index of $I$. Thus, $\alpha$ is a weakly essential $i$-multi-index of $I$.

Corollary 3.20.

Let $I$ be a set of multi-indices and $W$ a set of weakly essential $i$-multi-indices of $I$. If $\alpha$ is the least $i$-multi-index of $I\setminus{\cal G}_{i}(W)$, then $\alpha$ must be a weakly essential $i$-multi-index of $I$. Additionally, if $p(x_{1},\dots,x_{m})$ is a smooth function (or a holomorphic function) and $I$ is exactly ${\cal I}(p)$, then $\alpha\in{\cal W}_{i}(p)$.

By using Corollary 3.20 and Theorem 3.18, the following two algorithms are provided to find the greatest weakly essential ii-multi-index set of a set of proper ii-multi-indices.

Algorithm 1

Let $I_{i}$ be a set of proper $i$-multi-indices. Determine $W_{i}$, the greatest weakly essential $i$-multi-index set of $I_{i}$:

Step 1) Set $W_{i}=\emptyset$.

Step 2) If $I_{i}\setminus{\cal G}_{i}(W_{i})=\emptyset$, then the algorithm terminates; otherwise find the least $i$-multi-index of $I_{i}\setminus{\cal G}_{i}(W_{i})$, denote it by $\alpha$, set $W_{i}=W_{i}\cup\{\alpha\}$, and go to Step 2).

Algorithm 2

Let $I_{i}$ be a set of proper $i$-multi-indices. Determine $W_{i}$, the greatest weakly essential $i$-multi-index set of $I_{i}$:

Step 1) Set $W_{i}=\emptyset$.

Step 2) If $I_{i}\setminus{\cal G}_{i}(W_{i})=\emptyset$, then the algorithm terminates; otherwise, for every $k=1,\dots,i$, find the least multi-index of the set

\[
\left\{\alpha\;\middle|\;\alpha\in I_{i}\setminus{\cal G}_{i}(W_{i})\ \wedge\ \sum_{j=1}^{k}\alpha_{j}=\min_{\alpha^{\prime}\in I_{i}\setminus{\cal G}_{i}(W_{i})}\sum_{j=1}^{k}\alpha^{\prime}_{j}\right\},
\]

denote it by $\alpha^{k}$, set $W_{i}=W_{i}\cup\{\alpha^{1},\dots,\alpha^{i}\}$, and go to Step 2).

Remark 3.21.

For a function $p(x_{1},\dots,x_{m})$, the above two algorithms provide methods to obtain ${\cal W}_{i}(p)$ from ${\cal I}_{i}(p)$.
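Algorithm 1 can be sketched for a finite input set as follows. This is only an illustration under stated assumptions: all inputs are proper $i$-multi-indices of the same length, generation between them is tested by comparing partial sums, and the paper's "least $i$-multi-index" is replaced by a stand-in, namely the element whose tuple of partial sums is lexicographically smallest. Any element chosen this way cannot be generated by another remaining element, which is the property Step 2 relies on; the ordering actually used in the paper is defined elsewhere and may differ.

```python
from itertools import accumulate


def prefix_sums(alpha):
    """Tuple of partial sums (alpha_1, alpha_1 + alpha_2, ...)."""
    return tuple(accumulate(alpha))


def generates(alpha, beta):
    """Assumed same-length test for 'alpha generates beta'."""
    return all(a <= b for a, b in zip(prefix_sums(alpha), prefix_sums(beta)))


def algorithm_1(I_i):
    """Weakly essential elements of a finite set of proper i-multi-indices."""
    remaining = {tuple(a) for a in I_i}   # plays the role of I_i \ G_i(W_i)
    W_i = set()                           # Step 1
    while remaining:                      # Step 2: stop once nothing is left
        # Stand-in for "the least i-multi-index" (an assumption of this sketch).
        alpha = min(remaining, key=prefix_sums)
        W_i.add(alpha)
        # Drop everything alpha generates, including alpha itself.
        remaining = {b for b in remaining if not generates(alpha, b)}
    return W_i


print(algorithm_1([(1, 2), (2, 1), (0, 3), (3, 3), (1, 1)]))
```

For this input the loop runs twice and returns the two elements $(0,3)$ and $(1,1)$; every other element is generated by $(0,3)$.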

It is clear that ${\cal E}_{i}(p)\subseteq{\cal W}_{i}(p)$ for a function $p(x_{1},\dots,x_{m})$. In this paper, we pay special attention to ${\cal E}_{m}(p)$. Making use of Proposition 2, we get the following theorem.

Theorem 3.22.

Suppose $p(x_{1},\dots,x_{m})$ is a smooth function (or a holomorphic function). Then ${\cal E}_{m}(p)={\cal W}_{m}(p)$.

Proposition 3.23.

Let $p(x_{1},\dots,x_{m})$ be a smooth function (or a holomorphic function). Then we have

\[
{\cal E}_{i}(p)={\cal W}_{i}(p)\bigg\backslash{\cal G}_{i}\left(\bigcup_{j=i+1}^{m}{\cal W}_{j}(p)\right),
\]

and

\[
{\cal E}(p)=\bigcup_{j=0}^{m}{\cal E}_{j}(p)={\cal W}(p)\bigg\backslash\bigcup_{j=0}^{m}\left({\cal G}\left({\cal W}_{j}(p)\right)\backslash{\cal W}_{j}(p)\right).
\]
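The first formula of Proposition 3.23 can be exercised on a small finite set. The sketch below is a hedged illustration: it assumes that a proper $i$-multi-index is a tuple whose last nonzero entry is in position $i$, and that "$\alpha$ generates $\beta$" means $\beta$ is not longer than $\alpha$ and the partial sums of $\alpha$ do not exceed those of $\beta$. Both conventions are inferred from how the relation is used in this section, and the helper names are hypothetical.

```python
from itertools import accumulate


def proper_length(alpha):
    """Position (1-based) of the last nonzero entry; 0 for the zero multi-index."""
    for position in range(len(alpha), 0, -1):
        if alpha[position - 1] != 0:
            return position
    return 0


def generates(alpha, beta):
    """Assumed test for 'alpha generates beta' (see the lead-in)."""
    if proper_length(beta) > proper_length(alpha):
        return False
    return all(a <= b for a, b in zip(accumulate(alpha), accumulate(beta)))


def weakly_essential_layers(index_set, m):
    """W_i for i = 0..m: minimal elements inside each layer of proper i-multi-indices."""
    layers = {}
    for i in range(m + 1):
        layer = [a for a in index_set if proper_length(a) == i]
        layers[i] = {a for a in layer
                     if not any(b != a and generates(b, a) for b in layer)}
    return layers


def essential_layers(index_set, m):
    """E_i = W_i minus everything generated by some W_j with j > i."""
    W = weakly_essential_layers(index_set, m)
    E = {}
    for i in range(m + 1):
        longer = set().union(*(W[j] for j in range(i + 1, m + 1)))
        E[i] = {a for a in W[i] if not any(generates(b, a) for b in longer)}
    return E


I = {(0, 0), (3, 0), (1, 2), (2, 2)}
print(essential_layers(I, 2))
```

Here $(3,0)$ survives in ${\cal W}_{1}$ but is removed from ${\cal E}_{1}$ because $(1,2)\in{\cal W}_{2}$ generates it, and $(2,2)$ is already dropped from ${\cal W}_{2}$.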

3.2 Invariant Multi-indices of Functions

In this subsection, we consider the following question: given a multi-index $\alpha$ of a function $p(x_{1},\dots,x_{m})$, does there exist a lower triangular coordinate transformation $x=V(y)$ such that $\alpha$ is not a multi-index of $p(V(y_{1},\dots,y_{m}))$?

Definition 3.24.

Let $p(x_{1},\dots,x_{m})$ be a smooth function (or a holomorphic function). A proper $i$-multi-index $\alpha$ of $p$, with $i\in\{0,\dots,m\}$, is said to be invariant if, for every lower triangular coordinate transformation $x=(x_{1},\dots,x_{m})=V(y)=(V_{1}(y_{1}),\dots,V_{m}(y_{1},\dots,y_{m}))$, $\alpha$ is still a proper $i$-multi-index of the function $q(y_{1},\dots,y_{m})=p(V_{1}(y_{1}),\dots,V_{m}(y_{1},\dots,y_{m}))$.

Proposition 3.10 implies the following proposition.

Proposition 3.25.

All the essential multi-indices of the function $p(x_{1},\dots,x_{m})$ are invariant.

Now we only need to consider, for $\alpha\in{\cal I}(p)\backslash{\cal E}(p)$, whether there exists a lower triangular coordinate transformation $x=V(y)$ such that $\alpha$ is not a multi-index of $q(y)=p(V(y))$. The next example illustrates that such a lower triangular coordinate transformation may not exist when we restrict ourselves to real-valued coordinate transformations.

Example 3.26.

Consider the function

\[
p_{1}(x_{1},x_{2})=x_{1}x_{2}^{2}-x_{1}^{3}.
\]

$p_{1}(x_{1},x_{2})$ has the multi-indices $(1,2)$ and $(3,0)$. $(1,2)$ is the least $2$-multi-index of $p_{1}$ and can generate $(3,0)$. Select a lower triangular coordinate transformation $y=U(x)$ as

\[
\begin{aligned}
y_{1}&=x_{1}\\
y_{2}&=-x_{1}+x_{2},
\end{aligned}
\]

the inverse transformation of which, denoted by $x=V(y)$, is

\[
\begin{aligned}
x_{1}&=y_{1}\\
x_{2}&=y_{1}+y_{2}.
\end{aligned}
\]

We rewrite $p_{1}$ in $y$-coordinates:

\[
p_{1}(V(y_{1},y_{2}))=y_{1}y_{2}^{2}+2y_{1}^{2}y_{2}.
\]

$(3,0)$ is not a multi-index of $p_{1}(V(y))$. Now consider another function

\[
p_{2}(x_{1},x_{2})=x_{1}x_{2}^{2}+x_{1}^{3}.
\]

Choose a lower triangular coordinate transformation

\[
\begin{aligned}
x_{1}&=d_{11}y_{1}+r_{1}(y_{1})\\
x_{2}&=d_{21}y_{1}+d_{22}y_{2}+r_{2}(y_{1},y_{2})
\end{aligned}
\tag{14}
\]

where $d_{11},d_{21},d_{22}$ are parameters with $d_{11},d_{22}\neq 0$ and $r_{1},r_{2}$ are smooth functions with $\partial r_{1}/\partial y_{1}(0)=0$, $\partial r_{2}/\partial y_{1}(0)=0$, and $\partial r_{2}/\partial y_{2}(0)=0$. In $y$-coordinates, we have

\[
\begin{aligned}
p_{2}(V(y_{1},y_{2}))={}&d_{11}d_{22}^{2}y_{1}y_{2}^{2}+2d_{11}d_{21}d_{22}y_{1}^{2}y_{2}\\
&+d_{11}(d_{11}^{2}+d_{21}^{2})y_{1}^{3}+\dots\;,
\end{aligned}
\tag{15}
\]

where only the cubic terms of $p_{2}(V(y_{1},y_{2}))$ are presented. Since (14) is an arbitrary smooth lower triangular coordinate transformation, and since $d_{11}\neq 0$ and $d_{11}^{2}+d_{21}^{2}>0$ for real $d_{11}$ and $d_{21}$, it is impossible to find a real-valued smooth lower triangular coordinate transformation such that $d_{11}(d_{11}^{2}+d_{21}^{2})=0$. In order to eliminate the multi-index $(3,0)$ from the right-hand side of (15), we have to take $d_{11}$ and $d_{21}$ as complex numbers.
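The computations in Example 3.26 can be checked with a few lines of polynomial arithmetic. The sketch below stores a polynomial as a dictionary from exponent tuples to coefficients and substitutes the linear part of a lower triangular transformation. The representation, the helper names, and the particular complex choice $d_{11}=d_{22}=1$, $d_{21}=\mathrm{i}$ with $r_{1}=r_{2}=0$ are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def poly_add(p, q):
    """Sum of two polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for poly in (p, q):
        for exps, c in poly.items():
            out[exps] += c
    return {e: c for e, c in out.items() if c != 0}


def poly_mul(p, q):
    """Product of two polynomials in the same variables."""
    out = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}


def substitute(p, images):
    """Compose p(x) with x_k = images[k](y); all images use the same y variables."""
    n = len(images)
    result = {}
    for exps, c in p.items():
        term = {(0,) * n: c}
        for k, power in enumerate(exps):
            for _ in range(power):
                term = poly_mul(term, images[k])
        result = poly_add(result, term)
    return result


p1 = {(1, 2): 1, (3, 0): -1}   # x1*x2^2 - x1^3
p2 = {(1, 2): 1, (3, 0): 1}    # x1*x2^2 + x1^3
x1 = {(1, 0): 1}               # x1 = y1

# Real transformation x2 = y1 + y2: removes y1^3 from p1 but not from p2.
q1 = substitute(p1, [x1, {(1, 0): 1, (0, 1): 1}])
q2_real = substitute(p2, [x1, {(1, 0): 1, (0, 1): 1}])

# Complex transformation x2 = i*y1 + y2: removes y1^3 from p2.
q2_complex = substitute(p2, [x1, {(1, 0): 1j, (0, 1): 1}])

print(q1)          # {(2, 1): 2, (1, 2): 1}
print((3, 0) in q2_real, (3, 0) in q2_complex)  # True False
```

With the real choice the coefficient of $y_{1}^{3}$ in $p_{2}(V(y))$ is $d_{11}(d_{11}^{2}+d_{21}^{2})=2$, while the complex choice makes it vanish, in line with (15).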

The above example prompts us to use complex-valued lower triangular coordinate transformations.

Theorem 3.27.

Let $p(x_{1},\dots,x_{m})$ be a smooth function (or a holomorphic function). A multi-index of $p$ is invariant under any biholomorphic lower triangular coordinate transformation if and only if it belongs to ${\cal E}(p)$.

Proof 3.28.

We only prove that, for $h=1,\dots,m$ and a proper $h$-multi-index $\alpha\in{\cal I}(p)\backslash{\cal E}(p)$, there exists a biholomorphic lower triangular coordinate transformation $x=V(y)$ such that $\alpha\notin{\cal I}(q)$ for $q(y)=p(V(y))$.

Let $r$ be a positive integer and $\alpha^{1},\dots,\alpha^{r}\in{\cal I}(p)$ be all the multi-indices each of which can generate $\alpha$ and is different from $\alpha$. Let

\[
\lambda^{i}=(0,\dots,0,1)
\]

be a proper $i$-multi-index for $i=1,\dots,m$. Choose a biholomorphic lower triangular coordinate transformation $x=V(y)$ as

\[
\begin{aligned}
x_{1}&=V_{1}(y_{1})=c_{\lambda^{1}}y_{1}+\sum_{j=1}^{j_{1}}c_{\beta^{1,j}}y^{\beta^{1,j}}\\
&\;\;\vdots\\
x_{m}&=V_{m}(y_{1},\dots,y_{m})=c_{\lambda^{m}}y_{m}+\sum_{j=1}^{j_{m}}c_{\beta^{m,j}}y^{\beta^{m,j}}
\end{aligned}
\tag{16}
\]

where $c_{\lambda^{1}},\dots,c_{\lambda^{m}}\neq 0$ are given real numbers, and $c_{\beta^{i,j}}$, $i=1,\dots,m$, $j=1,\dots,j_{i}$, with $j_{i}\geq 0$, are undetermined complex-valued coefficients. The multi-indices

\[
\begin{aligned}
&\beta^{1,1},\dots,\beta^{1,j_{1}},\\
&\quad\vdots\\
&\beta^{m,1},\dots,\beta^{m,j_{m}},
\end{aligned}
\tag{17}
\]

introduced in (16) satisfy three conditions:

  1. $\beta^{i,j}$, $i=1,\dots,m$, $j=1,\dots,j_{i}$, with $j_{i}\geq 0$, are $i$-multi-indices with $\beta^{i,j}\neq 0$ and $\beta^{i,j}\neq\lambda^{i}$.

  2. There exist at least one multi-index $\alpha^{k}=(\alpha_{1}^{k},\dots,\alpha_{m}^{k})$ with $k\in\{1,\dots,r\}$ and a family of multi-indices $\gamma^{i,j}$ ($i=1,\dots,m$ and $j=1,\dots,\alpha^{k}_{i}$) selected from $\lambda^{i},\beta^{i,1},\dots,\beta^{i,j_{i}}$ such that

    \[
    \alpha=\sum_{i=1}^{m}\sum_{j=1}^{\alpha_{i}^{k}}\gamma^{i,j}=\sum_{i^{\prime}=1}^{m}n^{i^{\prime},0}\lambda^{i^{\prime}}+\sum_{i^{\prime}=1}^{m}\sum_{j^{\prime}=1}^{j_{i^{\prime}}}n^{i^{\prime},j^{\prime}}\beta^{i^{\prime},j^{\prime}}
    \tag{18}
    \]

    where all the $n^{i^{\prime},0}$ are nonnegative integers, all the $n^{i^{\prime},j^{\prime}}$ ($j^{\prime}\geq 1$) are positive integers, and $\sum_{j^{\prime}=0}^{j_{i^{\prime}}}n^{i^{\prime},j^{\prime}}=\alpha^{k}_{i^{\prime}}$.

  3. If any multi-index listed in (17) is removed, (18) is not satisfied for all $\alpha^{1},\dots,\alpha^{r}$.

The existence of (17) is guaranteed by $\alpha^{k}\prec\alpha$ for $k=1,\dots,r$. Without loss of generality, assume that $\alpha^{1},\dots,\alpha^{s}$ with $1\leq s\leq r$ satisfy (18). $p$ can be expressed as

\[
p(x_{1},\dots,x_{m})=c_{\alpha}x^{\alpha}+c_{\alpha^{1}}x^{\alpha^{1}}+\dots+c_{\alpha^{s}}x^{\alpha^{s}}+p^{\prime}(x_{1},\dots,x_{m})
\]

where $c_{\alpha},c_{\alpha^{1}},\dots,c_{\alpha^{s}}$ are nonzero coefficients, and $\alpha,\alpha^{1},\dots,\alpha^{s}$ are not multi-indices of the function $p^{\prime}(x_{1},\dots,x_{m})$. We also assume that for a fixed $\alpha^{k}$ there are $t_{k}\geq 1$ different families of integers $n^{1,0},\dots,n^{1,j_{1}},\dots,n^{m,0},\dots,n^{m,j_{m}}$ satisfying (18). Substituting (16) into $p(x_{1},\dots,x_{m})$ and taking account of the requirement that $\alpha$ should not be a multi-index of $q(y_{1},\dots,y_{m})=p(V(y))$ yields

\[
c_{\alpha}\prod_{i=1}^{h}c_{\lambda^{i}}^{\alpha_{i}}+\sum_{k=1}^{s}\sum_{l=1}^{t_{k}}\left(c_{\alpha^{k}}\prod_{i=1}^{m}\prod_{j=1}^{\alpha_{i}^{k}}c_{\chi^{i,j,k,l}}\right)=0
\tag{19}
\]

where every $\chi^{i,j,k,l}$ is selected from $\lambda^{i},\beta^{i,1},\dots,\beta^{i,j_{i}}$, correspondingly $c_{\chi^{i,j,k,l}}$ is selected from $c_{\lambda^{i}},c_{\beta^{i,1}},\dots,c_{\beta^{i,j_{i}}}$, and $\sum_{i,j}\chi^{i,j,k^{\prime},l^{\prime}}=\alpha$ holds for any fixed pair of the numbers $k^{\prime}$ and $l^{\prime}$. From condition 3), all the undetermined coefficients $c_{\beta^{i,j}}$, $i=1,\dots,m$ and $j=1,\dots,j_{i}$, are factors of every term in the left-hand side of (19) except $c_{\alpha}\prod_{i=1}^{h}c_{\lambda^{i}}^{\alpha_{i}}$.

It remains to verify that there exist $c_{\beta^{i,j}}$, $i=1,\dots,m$ and $j=1,\dots,j_{i}$, such that (19) holds. For convenience, rename $\beta^{1,1},\dots,\beta^{1,j_{1}},\dots,\beta^{m,1},\dots,\beta^{m,j_{m}}$ to $\beta^{1},\dots,\beta^{j_{1}+\dots+j_{m}}$, and rename $c_{\beta^{1,1}},\dots,c_{\beta^{1,j_{1}}},\dots,c_{\beta^{m,1}},\dots,c_{\beta^{m,j_{m}}}$ to $c_{\beta^{1}},\dots,c_{\beta^{j_{1}+\dots+j_{m}}}$. Let us regard the left-hand side of (19) as a polynomial in the indeterminate $c_{\beta^{1}}$, denote the polynomial by $P_{1}(c_{\beta^{1}})$, and assume that the degree of $P_{1}(c_{\beta^{1}})$ is $e_{1}$. Then, the polynomial can be rewritten in the form

\[
P_{1}(c_{\beta^{1}})=P_{2}c_{\beta^{1}}^{e_{1}}+R_{1}
\]

where $P_{2}$ and $R_{1}$ are functions satisfying $\partial P_{2}/\partial c_{\beta^{1}}=0$ and $\partial^{e_{1}}R_{1}/\partial c_{\beta^{1}}^{e_{1}}=0$. $P_{2}$ can also be regarded as a polynomial in the indeterminate $c_{\beta^{2}}$. Let us, in general, consider $P_{k}$ ($k=1,\dots,j_{1}+\dots+j_{m}$) as a polynomial in the indeterminate $c_{\beta^{k}}$ and suppose the degree of $P_{k}(c_{\beta^{k}})$ is $e_{k}$ ($e_{k}\geq 1$). Then we have

\[
P_{k}(c_{\beta^{k}})=P_{k+1}c_{\beta^{k}}^{e_{k}}+R_{k}
\tag{20}
\]

where $P_{k+1}$ is a function satisfying $\partial P_{k+1}/\partial c_{\beta^{\bar{k}}}=0$ for $\bar{k}=1,\dots,k$, and $R_{k}$ is a function satisfying $\partial^{e_{k}}R_{k}/\partial c_{\beta^{k}}^{e_{k}}=0$, $\partial R_{k}/\partial c_{\beta^{\hat{k}}}=0$ for $\hat{k}=1,\dots,k-1$, and $c_{\beta^{k}}$ is a factor of every term in $R_{k}$. It is clear that $P_{k+1}$ can be regarded as a polynomial in the indeterminate $c_{\beta^{k+1}}$ if $k+1\leq j_{1}+\dots+j_{m}$ is satisfied. Since any two of the multi-indices $\alpha^{1},\dots,\alpha^{s}$ are different from each other, we know that $P_{j_{1}+\dots+j_{m}+1}$ must be a nonzero constant. Setting $P_{j_{1}+\dots+j_{m}}=r_{j_{1}+\dots+j_{m}}\neq R_{j_{1}+\dots+j_{m}}(0)=0$ where $r_{j_{1}+\dots+j_{m}}$ is a constant, (20) has at least one nonzero solution for $c_{\beta^{j_{1}+\dots+j_{m}}}$. When $c_{\beta^{j_{1}+\dots+j_{m}}},\dots,c_{\beta^{k+1}}$ have been determined for $k=j_{1}+\dots+j_{m}-1,\dots,2$, we set $P_{k}=r_{k}\neq R_{k}(0,c_{\beta^{k+1}},\dots,c_{\beta^{j_{1}+\dots+j_{m}}})=0$ where $r_{k}$ is a constant, and then we can find a nonzero $c_{\beta^{k}}$ satisfying (20). We finally solve (19) for a nonzero $c_{\beta^{1}}$. Therefore, an appropriate lower triangular coordinate transformation $x=V(y)$ such that $\alpha\notin{\cal I}(q)$ is obtained.

3.3 Classifications of Lower Triangular Forms

Having finished the previous discussions about the invariant multi-indices of functions, let us investigate what properties of lower triangular forms are invariant under lower triangular coordinate transformations.

Definition 3.29.

Let $\alpha=(\alpha_{1},\dots,\alpha_{j})$ and $\beta=(\beta_{1},\dots,\beta_{j})$ be multi-indices. We write $\alpha\leq\beta$ if $\alpha_{i}\leq\beta_{i}$ holds for all $i=1,\dots,j$, and write $\alpha<\beta$ if $\alpha\leq\beta$ and $\alpha\neq\beta$ [38].

Remark 3.30.

Suppose $\alpha$ and $\beta$ are proper $j$-multi-indices. Then $\alpha\leq\beta$ implies $\alpha\preceq\beta$.
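Remark 3.30 can be checked by brute force on small multi-indices. The sketch below assumes, as an inference from the partial-sum inequalities used earlier in this section, that for two proper $j$-multi-indices $\alpha\preceq\beta$ holds exactly when every partial sum of $\alpha$ is at most the corresponding partial sum of $\beta$. It also shows that the converse of the remark fails. The function names are hypothetical.

```python
from itertools import accumulate, product


def componentwise_leq(alpha, beta):
    """alpha <= beta in the sense of Definition 3.29."""
    return all(a <= b for a, b in zip(alpha, beta))


def generates(alpha, beta):
    """Assumed same-length test for 'alpha generates beta' via partial sums."""
    return all(a <= b for a, b in zip(accumulate(alpha), accumulate(beta)))


def remark_holds(j, bound):
    """Check 'alpha <= beta implies alpha generates beta' for all proper
    j-multi-indices with entries in 0..bound."""
    proper = [t for t in product(range(bound + 1), repeat=j) if t[-1] != 0]
    return all(generates(a, b)
               for a in proper for b in proper if componentwise_leq(a, b))


print(remark_holds(3, 3))
# The converse fails: (0, 2) generates (1, 1) without being componentwise smaller.
print(generates((0, 2), (1, 1)), componentwise_leq((0, 2), (1, 1)))
```

The counterexample matches the intuition that $x_{2}^{2}$ produces a $y_{1}y_{2}$ term under $x_{2}=y_{1}+y_{2}$.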

Proposition 3.31.

Let $p(x_{1},\dots,x_{m})$ and $q(x_{1},\dots,x_{m-1})$ be smooth functions with $p(0)=0$ and $q(0)\neq 0$. Then ${\cal E}_{m}(p)={\cal E}_{m}(p\cdot q)$.

Proof 3.32.

Let $q_{1}(x_{1},\dots,x_{m-1})=q(x_{1},\dots,x_{m-1})-q(0)$. The following equation is the well-known Leibniz formula [38]

\[
\frac{\partial^{\alpha}(p\cdot q_{1})}{\partial x^{\alpha}}=\sum_{\beta\leq\alpha}\left(\frac{\prod_{i=1}^{m}\alpha_{i}!}{\prod_{i=1}^{m}\left(\beta_{i}!\,(\alpha_{i}-\beta_{i})!\right)}\frac{\partial^{\beta}p}{\partial x^{\beta}}\frac{\partial^{\alpha-\beta}q_{1}}{\partial x^{\alpha-\beta}}\right)
\tag{21}
\]

where $\alpha$ is a proper $m$-multi-index. Assuming $\gamma\in{\cal E}_{m}(p)$ and $\alpha\preceq\gamma$, (21) yields $\partial^{\alpha}(p\cdot q_{1})/\partial x^{\alpha}(0)=0$; that is, $\alpha$ is not a multi-index of $p\cdot q_{1}$. On the other hand, ${\cal E}_{m}(q(0)\cdot p)={\cal E}_{m}(p)$. Thus we have ${\cal E}_{m}(p)\subseteq{\cal E}_{m}(p\cdot q)$. Now let $\alpha\notin{\cal E}_{m}(p)$ be a proper $m$-multi-index satisfying both $\alpha\npreceq\gamma$ and $\gamma\npreceq\alpha$ for all $\gamma\in{\cal E}_{m}(p)$, and let $\beta\leq\alpha$ be a multi-index. Since $\partial^{\beta}p/\partial x^{\beta}(0)=0$ holds if $\beta$ is a proper $m$-multi-index and $\partial^{\alpha-\beta}q_{1}/\partial x^{\alpha-\beta}(0)=0$ holds if $\beta$ is not a proper $m$-multi-index, we have $\partial^{\alpha}(p\cdot q_{1})/\partial x^{\alpha}(0)=0$, which implies ${\cal E}_{m}(p\cdot q)\subseteq{\cal E}_{m}(p)$. In conclusion, ${\cal E}_{m}(p)={\cal E}_{m}(p\cdot q)$.

Theorem 3.33.

Suppose $y=U(x)$ is a lower triangular coordinate transformation, and rewrite (1) in $y$-coordinates as follows

\[
\begin{aligned}
\dot{y}_{1}&=\bar{f}_{1}(y_{1},y_{2})\\
&\;\;\vdots\\
\dot{y}_{n-1}&=\bar{f}_{n-1}(y_{1},\dots,y_{n})\\
\dot{y}_{n}&=\bar{f}_{n}(y_{1},\dots,y_{n})+\bar{g}_{n}(y_{1},\dots,y_{n})v\;.
\end{aligned}
\tag{22}
\]

Then ${\cal E}_{i+1}(f_{i})={\cal E}_{i+1}(\bar{f}_{i})$ holds for any $i=1,\dots,n-1$.

Proof 3.34.

Let us compute $\bar{f}_{i}(y_{1},\dots,y_{i+1})$ in $x$-coordinates:

\[
\bar{f}_{i}(y_{1},\dots,y_{i+1})=\frac{\partial U_{i}}{\partial x_{i}}f_{i}(x_{1},\dots,x_{i+1})+\sum_{k=1}^{i-1}\frac{\partial U_{i}}{\partial x_{k}}f_{k}(x_{1},\dots,x_{k+1})=\bar{f}_{i}(U(x_{1},\dots,x_{i+1}))\;.
\]

Thanks to the above proposition, we have ${\cal E}_{i+1}(\partial U_{i}/\partial x_{i}\cdot f_{i})={\cal E}_{i+1}(f_{i})$. In addition, ${\cal E}_{i+1}(\partial U_{i}/\partial x_{k}\cdot f_{k})=\emptyset$ is satisfied for any $k=1,\dots,i-1$. Therefore ${\cal E}_{i+1}(\bar{f}_{i}(U(x_{1},\dots,x_{i+1})))={\cal E}_{i+1}(f_{i})$ holds. Using Theorem 3.14, we conclude ${\cal E}_{i+1}(\bar{f}_{i}(y_{1},\dots,y_{i+1}))={\cal E}_{i+1}(f_{i})$.

Corollary 3.35.

Suppose $y=U(x)$ is a lower triangular coordinate transformation. Rewriting (1) in $y$-coordinates yields (22). Then ${\cal L}_{i+1}(f_{i})={\cal L}_{i+1}(\bar{f}_{i})$ for $i=1,\dots,n-1$.

This corollary leads to a way to classify lower triangular forms.

Definition 3.36.

All the lower triangular forms taking the form (1) and satisfying ${\cal L}_{i+1}(f_{i})=\alpha^{i}$ for $i=1,\dots,n-1$ are grouped under a specific type, denoted by $[\alpha^{1},\dots,\alpha^{n-1}]$. An arbitrary element of $[\alpha^{1},\dots,\alpha^{n-1}]$ can be expressed as

\[
\begin{aligned}
\dot{x}_{1}&=c_{1}x^{\alpha^{1}}+\hat{f}_{1}(x_{1},x_{2})\\
&\;\;\vdots\\
\dot{x}_{n-1}&=c_{n-1}x^{\alpha^{n-1}}+\hat{f}_{n-1}(x_{1},\dots,x_{n})\\
\dot{x}_{n}&=f_{n}(x_{1},\dots,x_{n})+g_{n}(x_{1},\dots,x_{n})v
\end{aligned}
\tag{23}
\]

where, for any $i=1,\dots,n-1$, $\hat{f}_{i}$ is a smooth function vanishing at the origin and $\alpha^{i}\lessdot\beta$ is satisfied for every $(i+1)$-multi-index $\beta$ of $\hat{f}_{i}$.

Remark 3.37.

System (2) is of type

[(0,p1),(0,0,p2),,(0,,0,pn1)].[(0,p_{1}),(0,0,p_{2}),\dots,(0,\dots,0,p_{n-1})]\;.

Theorem 3.33 results in another way to classify lower triangular forms.

Definition 3.38.

All the systems taking the form (1) and having the same $\mathcal{E}_{i+1}(f_i)$ for $i=1,\dots,n-1$ can be expressed as

\[
\begin{aligned}
\dot{x}_1 &= f_1(x_1,x_2) = \sum_{\alpha\in\mathcal{E}_2(f_1)} c_1^{\alpha} x^{\alpha} + \tilde{f}_1(x_1,x_2)\\
&\;\;\vdots\\
\dot{x}_{n-1} &= f_{n-1}(x_1,\dots,x_n) = \sum_{\alpha\in\mathcal{E}_n(f_{n-1})} c_{n-1}^{\alpha} x^{\alpha} + \tilde{f}_{n-1}(x_1,\dots,x_n)\\
\dot{x}_n &= f_n(x_1,\dots,x_n) + g_n(x_1,\dots,x_n)v
\end{aligned}
\tag{24}
\]

where every $\tilde{f}_i$, $i=1,\dots,n-1$, is a smooth function with $\tilde{f}_i(0)=0$ and every proper $(i+1)$-multi-index of $\tilde{f}_i$ can be generated by some element of $\mathcal{E}_{i+1}(f_i)$. For the sake of convenience, we say that (24) is of type $[\![\mathcal{E}_2(f_1),\dots,\mathcal{E}_n(f_{n-1})]\!]$.

Remark 3.39.

Apart from the invariance of $\mathcal{L}_{i+1}(f_i)$ and $\mathcal{E}_{i+1}(f_i)$ under lower triangular coordinate transformations, there is another reason why we consider the two classifications given in Definitions 3.36 and 3.38 helpful. For a lower triangular form (1), $x_{i+1}$ can be regarded, to some extent, as a control input of $\dot{x}_i=f_i(x_1,\dots,x_{i+1})$, for example when a feedback controller for (1) is designed by backstepping. So we may also regard $(x_1,\dots,x_{i+1})^{\alpha}$, where $\alpha=\mathcal{L}_{i+1}(f_i)$ or $\alpha\in\mathcal{E}_{i+1}(f_i)$, as one of the "control" terms of $\dot{x}_i=f_i(x_1,\dots,x_{i+1})$. From the literature, such as [18, 19, 24, 29], we know that, at least for several types of lower triangular forms, there are control strategies that can be applied to the entire type to meet certain control objectives, no matter what $\hat{f}_i$ and $\tilde{f}_i$ are. Of course, for many other types of lower triangular forms, a control strategy may be effective only when $\hat{f}_i$ and $\tilde{f}_i$ satisfy certain conditions; see, for example, [16, 17, 20, 21, 25, 27, 28, 30, 31]. We look forward to more research on control algorithms for (23) and (24).

Remark 3.40.

If the proper (i+1)(i+1)-multi-index (0,,0,1)(0,\dots,0,1) belongs to i+1(fi){\cal E}_{i+1}(f_{i}) then it is the only element of i+1(fi){\cal E}_{i+1}(f_{i}). In addition, the proper (i+1)(i+1)-multi-index (0,,0,k)(0,\dots,0,k) with k1k\geq 1 can generate any proper (i+1)(i+1)-multi-index α\alpha satisfying |α|k\left|\alpha\right|\geq k. So there are at most a finite number of proper (i+1)(i+1)-multi-indices that can not be generated by i+1(fi){\cal E}_{i+1}(f_{i}) when (0,,0,k)i+1(fi)(0,\dots,0,k)\in{\cal E}_{i+1}(f_{i}); see the following proposition.

Proposition 3.41.

Let $I$ be a set of proper $i$-multi-indices and let $\mathcal{A}_i$ denote the set of all proper $i$-multi-indices. Then $\mathcal{A}_i\setminus\mathcal{G}_i(I)$ is finite if and only if there exists a positive integer $k$ for which the proper $i$-multi-index $\lambda^{i,k}=(0,\dots,0,k)$ belongs to $I$.

Proof 3.42.

The sufficiency is obvious, so we only prove the necessity. Suppose that $\mathcal{A}_i\setminus\mathcal{G}_i(I)$ is finite and assume, to the contrary, that $\lambda^{i,k}\notin I$ for all positive integers $k$. Let $\alpha$ be an arbitrary element of $\mathcal{E}_i(I)$. With the assumption in mind, $\lambda^{i,k}\notin\mathcal{G}_i(\{\alpha\})$ for all $k>0$, because the only proper $i$-multi-indices that can generate $\lambda^{i,k}$ are $\lambda^{i,k'}$, $k'=1,\dots,k$. It follows that $\lambda^{i,k}\notin\mathcal{G}_i(\mathcal{E}_i(I))=\mathcal{G}_i(I)$ for all $k>0$. This means that $\mathcal{A}_i\setminus\mathcal{G}_i(I)$ is infinite. This contradiction completes the proof.

Example 3.43.

Consider the following lower triangular form.

x˙1=sinx23+x1x2\displaystyle{{\dot{x}}_{1}}=\sin x_{2}^{3}+{x_{1}}{x_{2}}
x˙2=x3x23+x3x13+x2\displaystyle{{\dot{x}}_{2}}={x_{3}}x_{2}^{3}+{x_{3}}x_{1}^{3}+{x_{2}}
x˙3=x43x3+x4x1+x3\displaystyle{{\dot{x}}_{3}}=x_{4}^{3}{x_{3}}+{x_{4}}{x_{1}}+{x_{3}}
x˙4=x4+v\displaystyle{{\dot{x}}_{4}}={x_{4}}+v

Let us focus on the functions expressed by the right-hand sides of the first three equations of the above system. From the least multi-indices of those functions, this system is of type

[(0,3),(0,3,1),(0,0,1,3)],[(0,3),(0,3,1),(0,0,1,3)],

and, after having computed essential multi-indices of those functions, we know that the system is also of type

[[{(0,3),(1,1)},{(0,3,1)},{(0,0,1,3),(1,0,0,1)}]].\left[\kern-1.49994pt\left[{\{(0,3),(1,1)\},\{(0,3,1)\},\{(0,0,1,3),(1,0,0,1)\}}\right]\kern-1.49994pt\right].

4 Feedback Equivalence

In this section, we address, by two methods, the problem of whether a nonlinear system is feedback equivalent to a given type of lower triangular form. The first method determines which types the system belongs to by transforming the system into a lower triangular form, if this is possible. The second method solves the problem by calculating a series of Lie brackets.

4.1 Transforming into Lower Triangular Forms

Using the notation of differential geometry, we write the drift vector field and the input vector field of (4) as

F=F1ξ1++Fnξn,F={F_{1}}\frac{\partial}{{\partial{\xi_{1}}}}+\dots+{F_{n}}\frac{\partial}{{\partial{\xi_{n}}}},

and

G=G1ξ1++Gnξn,G={G_{1}}\frac{\partial}{{\partial{\xi_{1}}}}+\dots+{G_{n}}\frac{\partial}{{\partial{\xi_{n}}}},

respectively. Similarly, the drift vector field and the input vector field of (1) can be denoted by

f=f1x1++fnxn,f={f_{1}}\frac{\partial}{{\partial{x_{1}}}}+\dots+{f_{n}}\frac{\partial}{{\partial{x_{n}}}},

and

g=gnxn.g={g_{n}}\frac{\partial}{{\partial{x_{n}}}}.

Let $X$ and $Y$ be two $n$-dimensional vector fields defined on a neighborhood of the origin; $\mathrm{ad}_X Y=[X,Y]$ denotes the Lie bracket of $X$ and $Y$. Further, let $h(\xi_1,\dots,\xi_n)$ be a smooth function; then $X(h)=\sum_{i=1}^{n}X_i\cdot\partial h/\partial\xi_i$.
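For concreteness, the Lie bracket can be written out in coordinates. The following is only a sketch, assuming the usual convention $[X,Y]=\frac{\partial Y}{\partial\xi}X-\frac{\partial X}{\partial\xi}Y$; this convention agrees with the component formula for $\mathrm{ad}_{g^i}f$ used in the proof of Theorem 4.44.

\[
[X,Y]=\sum_{k=1}^{n}\left(\sum_{j=1}^{n}\left(\frac{\partial Y_k}{\partial\xi_j}X_j-\frac{\partial X_k}{\partial\xi_j}Y_j\right)\right)\frac{\partial}{\partial\xi_k}.
\]

In particular, for a constant coordinate field $X=\partial/\partial\xi_j$ the bracket reduces to the componentwise partial derivative, $[\partial/\partial\xi_j,Y]=\sum_{k=1}^{n}(\partial Y_k/\partial\xi_j)\,\partial/\partial\xi_k$.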

Although a necessary and sufficient condition under which a nonlinear system is equivalent to a lower triangular form has already been given in [4], we present here a new condition that may be easier to check and may simplify the implementation of the equivalent transformation.

Theorem 4.44.

Let $D^{n+1}=\mathrm{span}\{0\}$ and $D^{n}=\hat{D}^{n}=\mathrm{span}\{G\}$. System (4) is locally equivalent to (1) via a feedback (5) and a change of coordinates (6) if and only if, for every $i=n-1,\dots,1$, (4) satisfies the following condition: suppose $D^{k}$, $k=n+1,\dots,i+1$, and $\hat{D}^{l}$, $l=n,\dots,i+1$, have already been defined; take a vector field $G^{i+1}\in D^{i+1}\setminus D^{i+2}$ and set $\hat{D}^{i}=\mathrm{span}\{\mathrm{ad}_{G^{i+1}}F,\hat{D}^{i+1}\}$; then there exists an $(n-i+1)$-dimensional involutive distribution $D^{i}$ in a neighborhood of the origin such that $D^{i}=\hat{D}^{i}$ in an open subset of $\mathbb{R}^{n}$ whose closure is a neighborhood of the origin.

Proof 4.45.

Note that, for any smooth vector fields X=i=1nXi/xiX=\textstyle\sum_{i=1}^{n}X_{i}{\partial}/{\partial x_{i}} and Y=i=1nYi/xiY=\textstyle\sum_{i=1}^{n}Y_{i}{\partial}/{\partial x_{i}}, we have

S([X,Y])=[S(X),S(Y)]S_{*}([X,Y])=[S_{*}(X),S_{*}(Y)] (25)

where $S:\mathbb{R}^n\to\mathbb{R}^n$, $x\mapsto\xi$, is a change of coordinates and $S_*$ is the so-called differential of $S$, or the pushforward induced by $S$ [39]. By (25), the necessity is clear because (1) satisfies the condition given in the theorem. Let us verify that the condition is sufficient. Since $D^{n}\subset\dots\subset D^{1}$, we can find a change of coordinates $x=T(\xi)$ such that $D^{i}=\mathrm{span}\{\partial/\partial x_i,\dots,\partial/\partial x_n\}$ [39]. Let $g^{i}=T_*(G^{i})$ and $f=T_*(F)$. Since $g^{i}\in D^{i}$ for $i=n,\dots,1$, it can be expressed as $g^{i}=\sum_{k=i}^{n}g_k^{i}\,\partial/\partial x_k$. The vector field $\mathrm{ad}_{g^{i}}f$ is computed as follows.

adgif=\displaystyle{\rm{a}}{{\rm{d}}_{{g^{i}}}}f= k=1i1(j=infkxjgji)xk\displaystyle\sum\limits_{k=1}^{i-1}{\left({\sum\limits_{j=i}^{n}{\frac{{\partial{f_{k}}}}{{\partial{x_{j}}}}g_{j}^{i}}}\right)\frac{\partial}{{\partial{x_{k}}}}}
+k=in(j=infkxjgjij=1ngkixjfj)xk\displaystyle+\sum\limits_{k=i}^{n}{\left({\sum\limits_{j=i}^{n}{\frac{{\partial{f_{k}}}}{{\partial{x_{j}}}}g_{j}^{i}-}\sum\limits_{j=1}^{n}{\frac{{\partial g_{k}^{i}}}{{\partial{x_{j}}}}{f_{j}}}}\right)}\frac{\partial}{{\partial{x_{k}}}}

Then, adgifDi1Di\mathrm{ad}_{g^{i}}f\in{D^{i-1}}\setminus{D^{i}}, i=n,,2i=n,\dots,2, result in

fi1xi0,fjxi0,j=1,,i2\frac{{\partial{f_{i-1}}}}{{\partial{x_{i}}}}\not\equiv 0,\;\frac{{\partial{f_{j}}}}{{\partial{x_{i}}}}\equiv 0,j=1,\dots,i-2

in a neighborhood of the origin. Thus (4) in xx-coordinates is of the form (1).

Remark 4.46.

It is also clear that if a nonlinear system satisfies the condition given in Theorem 4.44 then the system can be transformed into a lower triangular form only via a change of coordinates.

Remark 4.47.

Taking Gn=GG^{n}=G and Gi=adGi+1FG^{i}=\mathrm{ad}_{G^{i+1}}F for i=n1,,2i=n-1,\dots,2, the condition introduced in the above theorem is the same as the condition presented in [4]. By choosing appropriate GiG^{i}, the calculations of the Lie brackets and design of equivalent transformation can be simplified.

The next example shows how to transform a system into its equivalent lower triangular form by using Theorem 4.44 and determine the types the system belongs to.

Example 4.48.

Let us consider a nonlinear system expressed by (26)

ξ˙1=\displaystyle\dot{\xi}_{1}= ξ1ξ3+(ξ1ξ3)(ξ2ξ32+ξ4)+(ξ1ξ3+ξ4)(ξ2ξ32+ξ4)+(ξ2ξ32+ξ4)3\displaystyle\xi_{1}-\xi_{3}+\left(\xi_{1}-\xi_{3}\right)\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)+\left(\xi_{1}-\xi_{3}+\xi_{4}\right)\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)+\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)^{3} (26)
ξ˙2=\displaystyle\dot{\xi}_{2}= ξ2ξ32+2ξ3(ξ1ξ3+(ξ1ξ3+ξ4)(ξ2ξ32+ξ4))+ξ3+ξ4\displaystyle\xi_{2}-\xi_{3}^{2}+2\xi_{3}\left(\xi_{1}-\xi_{3}+\left(\xi_{1}-\xi_{3}+\xi_{4}\right)\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)\right)+\xi_{3}+\xi_{4}
+(ξ1ξ3)(ξ2ξ32+ξ4)+(ξ2ξ32+ξ4)3((ξ1ξ3+ξ4)2+1)u\displaystyle+\left(\xi_{1}-\xi_{3}\right)\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)+\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)^{3}-\left(\left(\xi_{1}-\xi_{3}+\xi_{4}\right)^{2}+1\right)u
ξ˙3=\displaystyle\dot{\xi}_{3}= ξ1ξ3+(ξ1ξ3+ξ4)(ξ2ξ32+ξ4)\displaystyle\xi_{1}-\xi_{3}+\left(\xi_{1}-\xi_{3}+\xi_{4}\right)\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)
ξ˙4=\displaystyle\dot{\xi}_{4}= (ξ1+ξ3(ξ2ξ32+ξ4)2)(ξ2ξ32+ξ4)+((ξ1ξ3+ξ4)2+1)u\displaystyle\left(-\xi_{1}+\xi_{3}-\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)^{2}\right)\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)+\left(\left(\xi_{1}-\xi_{3}+\xi_{4}\right)^{2}+1\right)u

and denote the drift vector field and input vector field of the system by F(ξ)F(\xi) and G(ξ)G(\xi). Select a nonsingular vector field G4(ξ)D4=span{G(ξ)}G^{4}(\xi)\in D^{4}={\rm{span}}\{G(\xi)\} as

G4(ξ)=G(ξ)/((ξ1ξ3+ξ4)2+1)=ξ2+ξ4G^{4}(\xi)=G(\xi)\bigg{/}\left(\left(\xi_{1}-\xi_{3}+\xi_{4}\right)^{2}+1\right)=-\frac{\partial}{\partial\xi_{2}}+\frac{\partial}{\partial\xi_{4}}

and calculate the Lie bracket of G4G^{4} and FF

adG4F=\displaystyle\mathrm{ad}_{G^{4}}F= (ξ2ξ32+ξ4)ξ1+2ξ3(ξ2ξ32+ξ4)ξ2\displaystyle\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)\frac{\partial}{\partial\xi_{1}}+2\xi_{3}\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)\frac{\partial}{\partial\xi_{2}}
+(ξ2ξ32+ξ4)ξ3.\displaystyle+\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)\frac{\partial}{\partial\xi_{3}}.

Noting the form of the right-hand side of the above equation, we select

G3(ξ)=ξ1+2ξ3ξ2+ξ3.G^{3}(\xi)=\frac{\partial}{\partial\xi_{1}}+2\xi_{3}\frac{\partial}{\partial\xi_{2}}+\frac{\partial}{\partial\xi_{3}}\;.

Thanks to the choice of $G^{3}$, $\operatorname{ad}_{G^{3}}F$ has such a simple form that we immediately take

G2(ξ)=adG3F=ξ2.G^{2}(\xi)=\operatorname{ad}_{G^{3}}F=\frac{\partial}{\partial\xi_{2}}.

After finishing the computation of $\operatorname{ad}_{G^{2}}F$, as shown in (27),

\[
\begin{aligned}
\mathrm{ad}_{G^{2}}F ={}& \left(2\xi_1-2\xi_3+\xi_4+3\left(\xi_2-\xi_3^2+\xi_4\right)^2\right)\frac{\partial}{\partial\xi_1}\\
&+\left(\xi_1+2\xi_3\left(\xi_1-\xi_3+\xi_4\right)-\xi_3+3\left(\xi_2-\xi_3^2+\xi_4\right)^2+1\right)\frac{\partial}{\partial\xi_2}\\
&+\left(\xi_1-\xi_3+\xi_4\right)\frac{\partial}{\partial\xi_3}+\left(-\xi_1+\xi_3-3\left(\xi_2-\xi_3^2+\xi_4\right)^2\right)\frac{\partial}{\partial\xi_4}\\
={}& \left(\xi_1-\xi_3+3\left(\xi_2-\xi_3^2+\xi_4\right)^2\right)\frac{\partial}{\partial\xi_1}+G^{2}+\left(\xi_1-\xi_3+\xi_4\right)G^{3}\\
&+\left(-\xi_1+\xi_3-3\left(\xi_2-\xi_3^2+\xi_4\right)^2\right)G^{4}
\end{aligned}
\tag{27}
\]

we take

G1(ξ)=ξ1.G^{1}(\xi)=\frac{\partial}{\partial\xi_{1}}.

It is easy to verify that Di=span{Gi,,G4}D^{i}={\rm{span}}\{G^{i},\dots,G^{4}\}, i=1,,4i=1,\dots,4, are 5i5-i dimensional involutive distributions satisfying Di=span{adGi+1F,D^i+1}D^{i}={\mathrm{span}}\{\mathrm{ad}_{G^{i+1}}F,\hat{D}^{i+1}\} in an open set whose closure is a neighborhood of the origin. The Frobenius theorem [39] guarantees that there exists a change of coordinates x=T(ξ)x=T(\xi) such that

Gj(Ti)0,fori=j\displaystyle G^{j}(T_{i})\neq 0,\;{\rm{for}}\;i=j
Gj(Ti)=0,fori<j\displaystyle G^{j}(T_{i})=0,\;{\rm{for}}\;i<j

where i=1,,4i=1,\dots,4. Solving the above equations, we obtain a change of coordinates

x1\displaystyle x_{1} =ξ1ξ3\displaystyle=\xi_{1}-\xi_{3}
x2\displaystyle x_{2} =ξ2+ξ4ξ32\displaystyle=\xi_{2}+\xi_{4}-\xi_{3}^{2}
x3\displaystyle x_{3} =ξ3\displaystyle=\xi_{3}
x4\displaystyle x_{4} =ξ4\displaystyle=\xi_{4}

and, in xx-coordinates, (26) can be rewritten as

x˙1\displaystyle\dot{x}_{1} =x23+x1x2\displaystyle=x_{2}^{3}+x_{1}x_{2} (28)
x˙2\displaystyle\dot{x}_{2} =x3+x2\displaystyle=x_{3}+x_{2}
x˙3\displaystyle\dot{x}_{3} =x4x2+x2x1+x1\displaystyle=x_{4}x_{2}+x_{2}x_{1}+x_{1}
x˙4\displaystyle\dot{x}_{4} =x23x2x1+(1+(x1+x4)2)u.\displaystyle=-x_{2}^{3}-x_{2}x_{1}+\left(1+(x_{1}+x_{4})^{2}\right)u.

Examining the right-hand sides of the first three equations of (28), this system is of type [(0,3),(0,0,1),(0,1,0,1)][(0,3),(0,0,1),(0,1,0,1)], and is also of type [[{(0,3),(1,1)},{(0,0,1)},{(0,1,0,1)}]]\left[\kern-1.49994pt\left[{\{(0,3),(1,1)\},\{(0,0,1)\},\{(0,1,0,1)\}}\right]\kern-1.49994pt\right].
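The computations of Example 4.48 can be spot-checked numerically. The sketch below is not part of the original derivation: it samples random points, pushes the right-hand side of (26) through the Jacobian of the change of coordinates, and compares the result with (28). It also checks the triangular conditions $G^{j}(T_i)=0$ for $i<j$ and $G^{j}(T_j)\neq 0$. All function names, the sampling range, and the tolerance are illustrative assumptions.

```python
import random

def rhs_26(xi, u):
    """Right-hand side of (26) at state xi with input u."""
    x1, x2, x3, x4 = xi
    a = x1 - x3                  # xi_1 - xi_3
    b = x1 - x3 + x4             # xi_1 - xi_3 + xi_4
    w = x2 - x3**2 + x4          # xi_2 - xi_3^2 + xi_4
    g = b**2 + 1                 # input gain
    d3 = a + b * w
    d1 = a + a * w + b * w + w**3
    d2 = x2 - x3**2 + 2 * x3 * d3 + x3 + x4 + a * w + w**3 - g * u
    d4 = (-x1 + x3 - w**2) * w + g * u
    return (d1, d2, d3, d4)

def T(xi):
    """Change of coordinates x = T(xi) obtained in Example 4.48."""
    x1, x2, x3, x4 = xi
    return (x1 - x3, x2 + x4 - x3**2, x3, x4)

def apply_jacobian_of_T(xi, vec):
    """Jacobian of T at xi applied to vec; entry i-1 is the Lie derivative vec(T_i)."""
    v1, v2, v3, v4 = vec
    return (v1 - v3, v2 + v4 - 2 * xi[2] * v3, v3, v4)

def rhs_28(x, u):
    """Right-hand side of (28)."""
    x1, x2, x3, x4 = x
    return (x2**3 + x1 * x2,
            x3 + x2,
            x4 * x2 + x2 * x1 + x1,
            -x2**3 - x2 * x1 + (1 + (x1 + x4)**2) * u)

def G_fields(xi):
    """The vector fields G^1, ..., G^4 selected in Example 4.48."""
    return {1: (1, 0, 0, 0),
            2: (0, 1, 0, 0),
            3: (1, 2 * xi[2], 1, 0),
            4: (0, -1, 0, 1)}

def check(trials=200, seed=0, tol=1e-9):
    rng = random.Random(seed)
    worst, triangular = 0.0, True
    for _ in range(trials):
        xi = tuple(rng.uniform(-1, 1) for _ in range(4))
        u = rng.uniform(-1, 1)
        lhs = apply_jacobian_of_T(xi, rhs_26(xi, u))   # dx/dt via the chain rule
        rhs = rhs_28(T(xi), u)                         # (28) evaluated at x = T(xi)
        worst = max(worst, max(abs(p - q) for p, q in zip(lhs, rhs)))
        for j, Gj in G_fields(xi).items():
            lie = apply_jacobian_of_T(xi, Gj)          # lie[i-1] = G^j(T_i)
            if abs(lie[j - 1]) <= tol or any(abs(lie[i]) > tol for i in range(j - 1)):
                triangular = False
    return worst, triangular

worst_error, triangular_ok = check()
```

A numerical check of this kind does not replace the symbolic computation; it only guards against transcription slips in the long expressions of (26).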

4.2 Conditions for a System to be Equivalent to a Given Type of Lower Triangular Form

In this subsection, we investigate conditions under which a nonlinear system can be judged equivalent to a specific type of lower triangular form without performing an equivalent transformation. Let us start with the following definition.

Definition 4.49.

Let $\alpha$ be a multi-index and $\beta$ a proper $m_\beta$-multi-index with $1\leq m_\beta\leq n$. We say that $\alpha$ is left equal to $\beta$, denoted by $\alpha=_l\beta$, if $\alpha_i=\beta_i$ for all $i=1,2,\dots,m_\beta$; $\alpha$ is left less than $\beta$, denoted by $\alpha<_l\beta$, if $\alpha_i\leq\beta_i$ holds for all $i=1,\dots,m_\beta$ and there exists at least one $j\in\{1,\dots,m_\beta\}$ such that $\alpha_j<\beta_j$. By convention, $0$ is the only multi-index left equal to $0$, and no multi-index is left less than $0$. Moreover, if $\alpha<_l\beta$ or $\alpha=_l\beta$, we write $\alpha\leq_l\beta$.

Example 4.50.

According to the definition above, we have (1,1,2)=l(1,1)(1,1,2)=_{l}(1,1) and (1,1,1,1)<l(1,1,2)(1,1,1,1)<_{l}(1,1,2).
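The relations of Definition 4.49 are easy to mechanize. The following minimal sketch represents multi-indices as tuples; the function names are hypothetical, and two conventions are assumptions rather than statements from the paper: $\beta$ is passed without trailing zeros, so that its length equals $m_\beta$, and a shorter $\alpha$ is padded with zeros.

```python
def _pad(alpha, m):
    """Pad alpha with zeros up to length m (an assumed convention)."""
    return tuple(alpha) + (0,) * max(0, m - len(alpha))

def left_equal(alpha, beta):
    """alpha =_l beta: the first m_beta entries of alpha coincide with beta."""
    if not any(beta):                      # beta = 0: only 0 is left equal to 0
        return not any(alpha)
    m = len(beta)
    a = _pad(alpha, m)
    return all(a[i] == beta[i] for i in range(m))

def left_less(alpha, beta):
    """alpha <_l beta: entrywise <= on the first m_beta entries, strict somewhere."""
    if not any(beta):                      # nothing is left less than 0
        return False
    m = len(beta)
    a = _pad(alpha, m)
    return (all(a[i] <= beta[i] for i in range(m))
            and any(a[i] < beta[i] for i in range(m)))

def left_leq(alpha, beta):
    """alpha <=_l beta."""
    return left_less(alpha, beta) or left_equal(alpha, beta)

# The two relations stated in Example 4.50.
example_equal = left_equal((1, 1, 2), (1, 1))
example_less = left_less((1, 1, 1, 1), (1, 1, 2))
```

Note that entries of $\alpha$ beyond position $m_\beta$ play no role in either relation, which is why $(1,1,2)=_l(1,1)$ holds.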

From the above definition, it is trivial to verify the following lemma.

Lemma 4.51.

Let $p(x_1,\dots,x_n)$ be a smooth function and $\alpha=(\alpha_1,\dots,\alpha_m)$ a proper $m$-multi-index with $m\leq n$, and suppose that no multi-index of $p(x_1,\dots,x_n)$ is left less than $\alpha$. Then no multi-index of $\partial p/\partial x_k$, $k\in\{m,\dots,n\}$, is left less than $\alpha'=(\alpha_1,\dots,\alpha_{m-1},\alpha_m-1)$, and $\alpha'$ is a multi-index of $\partial p/\partial x_k$ $(k\geq m)$ if and only if $\alpha$ is a multi-index of $p$ and $k=m$.

Using (21), we obtain the lemma as follows.

Lemma 4.52.

Let $p(x_1,\dots,x_n)$ and $q(x_1,\dots,x_n)$ be smooth functions with $p(0)=0$, and let $\alpha$ be a proper $i$-multi-index. Suppose $L_\alpha=\{\beta\mid\beta<_l\alpha\}$ and $L_\alpha\cap\mathcal{I}(p)=\emptyset$. Then (i) $L_\alpha\cap\mathcal{I}(p\cdot q)=\emptyset$; (ii) $\alpha\in\mathcal{I}(p\cdot q)$ if and only if $q(0)\neq 0$ and $\alpha\in\mathcal{I}(p)$; (iii) for some $\alpha'=_l\alpha$, $\alpha'\in\mathcal{I}(p\cdot q)$ if and only if there exists a multi-index of $q$ left less than the proper $i$-multi-index $(0,\dots,0,1)$ and there exists $\bar{\alpha}=_l\alpha$ satisfying $\bar{\alpha}\in\mathcal{I}(p)$.

Next, we present a differential geometric lemma that is useful for the further discussion in this subsection.

Lemma 4.53.

Y(ξ)Y(\xi) is a smooth vector field. There exists a change of coordinates x=T(ξ)x=T(\xi) such that YY, in xx-coordinates, can be expressed as

Y(x)\displaystyle Y(x) =i=1k1Yi(x1,,xk1)xi+\displaystyle=\sum\limits_{i=1}^{k-1}{Y_{i}(x_{1},\dots,x_{k-1})\frac{\partial}{\partial x_{i}}}+ (29)
Yk(x1,,xk)xk+i=k+1nYi(x1,,xn)xi\displaystyle{Y_{k}}({x_{1}},\dots,{x_{k}})\frac{\partial}{{\partial{x_{k}}}}+\sum\limits_{i=k+1}^{n}{{Y_{i}}({x_{1}},\dots,{x_{n}})\frac{\partial}{{\partial{x_{i}}}}}

if and only if there exist smooth vector fields X1(ξ),,Xn(ξ)X^{1}(\xi),\dots,X^{n}(\xi) such that Dl=span{Xl,,Xn}D^{l}={\rm{span}}\{X^{l},\dots,X^{n}\}, l=n,,1l=n,\dots,1, are nl+1n-l+1 dimensional involutive distributions and

[Xl,Y]{Dk+1k+1lnDkl=k.[{X^{l}},Y]\in\left\{{\begin{matrix}{{D^{k+1}}}&{k+1\leq l\leq n}\\ {{D^{k}}}&{l=k}\end{matrix}}\right..
Proof 4.54.

The necessity is clear, so we only prove the sufficiency. According to the Frobenius theorem, we can find a change of coordinates $x=T(\xi)$ such that $D^{l}=\mathrm{span}\{\partial/\partial x_l,\dots,\partial/\partial x_n\}$ and $X^{l}=\sum_{i=l}^{n}X_i^{l}(x)\,\partial/\partial x_i$ for $l=n,\dots,1$. Let $Y(x)=\sum_{i=1}^{n}Y_i(x)\,\partial/\partial x_i$. Noting that $[X^{l},Y]=\sum_{i=l}^{n}\left(X_i^{l}[\partial/\partial x_i,Y]-Y(X_i^{l})\,\partial/\partial x_i\right)\in D^{k+1}$ for $l=n,\dots,k+1$, we have $\partial Y_j/\partial x_l=0$ for all $j=1,\dots,k$ and $l=n,\dots,k+1$. Additionally, $[X^{k},Y]\in D^{k}$ implies that $\partial Y_j/\partial x_k=0$ for every $j=1,\dots,k-1$. Thus (29) holds.

Let XX be a vector field, Y1,,YmY^{1},\dots,Y^{m} a family of vector fields, and α=(α1,,αk)\alpha=(\alpha_{1},\dots,\alpha_{k}) a kk-multi-index. We denote, for i=1,,mi=1,\dots,m and an integer j0j\geq 0, adYi0X=X\mathrm{ad}_{Y^{i}}^{0}X=X, adYi1X=adYiX\mathrm{ad}_{Y^{i}}^{1}X=\mathrm{ad}_{Y^{i}}X, adYij+1X=adYiadYijX\mathrm{ad}_{Y^{i}}^{j+1}X=\mathrm{ad}_{Y^{i}}\mathrm{ad}_{Y^{i}}^{j}X, and adYαX=[Yα,X]=adY1α1adYkαkX\mathrm{ad}_{Y}^{\alpha}X=[{Y^{\alpha}},X]=\mathrm{ad}_{Y^{1}}^{\alpha_{1}}\dots\mathrm{ad}_{Y^{k}}^{\alpha_{k}}X. Now we are ready to state several properties of lower triangular forms.
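To make the iterated-bracket notation concrete, here is a short worked instance written out directly from the definition above, with $k=2$ chosen for illustration:

\[
\mathrm{ad}_Y^{(1,2)}X=\mathrm{ad}_{Y^1}\,\mathrm{ad}_{Y^2}^{2}X=[Y^1,[Y^2,[Y^2,X]]],
\qquad
\mathrm{ad}_Y^{(0,3)}X=\mathrm{ad}_{Y^2}^{3}X=[Y^2,[Y^2,[Y^2,X]]].
\]

The brackets with $Y^k$ are applied first and those with $Y^1$ last, and a zero entry $\alpha_j=0$ contributes no bracket because $\mathrm{ad}_{Y^j}^{0}X=X$.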

Proposition 4.55.

System (1) is of type [[2(f1),,n(fn1)]][\kern-1.49994pt[{\cal E}_{2}(f_{1}),\dots,{\cal E}_{n}(f_{n-1})]\kern-1.49994pt]. Let

Di=span{xi,,xn},i=n,,1,{D}^{i}={\rm{span}}\left\{\frac{\partial}{\partial x_{i}},\dots,\frac{\partial}{\partial x_{n}}\right\},i=n,\dots,1,

and

Yi+1=k=i+1nYki+1(x)xk\displaystyle{Y^{i+1}}=\sum\limits_{k=i+1}^{n}{Y_{k}^{i+1}(x)\frac{\partial}{\partial{x_{k}}}} (30)
Yj=k=ji1Ykj(x1,,xi1)xk+Yij(x1,,xi)xi\displaystyle{Y^{j}}=\sum\limits_{k=j}^{i-1}{Y_{k}^{j}({x_{1}},\dots,{x_{i-1}})\frac{\partial}{\partial{x_{k}}}}+Y_{i}^{j}({x_{1}},\dots,{x_{i}})\frac{\partial}{\partial{x_{i}}}
+k=i+1nYkj(x)xk,j=1,,i,\displaystyle\qquad+\sum\limits_{k=i+1}^{n}{Y_{k}^{j}(x)\frac{\partial}{\partial{x_{k}}}},\;j=1,\dots,i,

where Yi+1i+1(0)0{Y_{i+1}^{i+1}(0)\neq 0} and Yjj(0)0{Y_{j}^{j}(0)\neq 0}. Then ϵi+1(fi)\epsilon\in{\cal E}_{i+1}(f_{i}) if and only if

adYϵf(0)Di+1(0)\mathrm{ad}_{Y}^{\epsilon}f(0)\notin{D}^{i+1}(0) (31)

and

adYαf(0)Di+1(0)\mathrm{ad}_{Y}^{\alpha}f(0)\in{{D}^{i+1}}(0) (32)

for every proper $(i+1)$-multi-index $\alpha\prec\epsilon$. In addition, a proper $(i+1)$-multi-index $\zeta$ and all the $(i+1)$-multi-indices that can generate $\zeta$ do not belong to $\mathcal{I}_{i+1}(f_i)$ if and only if

adYαf(0)Di+1(0)\mathrm{ad}_{Y}^{\alpha}f(0)\in{{D}^{i+1}}(0) (33)

for every proper (i+1)(i+1) multi-index αζ\alpha\preceq\zeta.

Proof 4.56.

We first calculate adYθf{\rm{ad}}_{Y}^{\theta}f, where θ=(θ1,,θi+1)\theta=(\theta_{1},\dots,\theta_{i+1}) is a proper (i+1)(i+1)-multi-index, step by step. Let Xi+1,θi+1=f{X^{i+1,\theta_{i+1}}}=f. Compute the following Lie brackets

Xi+1,θi+11=[Yi+1,Xi+1,θi+1]=j=inXji+1,θi+11(x)xj\displaystyle{X^{i+1,\theta_{i+1}-1}}=[{Y^{i+1}},{X^{i+1,\theta_{i+1}}}]=\sum_{j=i}^{n}X_{j}^{i+1,\theta_{i+1}-1}(x)\frac{\partial}{{\partial{x_{j}}}}
\displaystyle\quad\vdots
Xi+1,0=[Yi+1,Xi+1,1]=j=inXji+1,0(x)xj\displaystyle{X^{i+1,0}}=[{Y^{i+1}},{X^{i+1,1}}]=\sum_{j=i}^{n}X_{j}^{i+1,0}(x)\frac{\partial}{{\partial{x_{j}}}}

where Xji+1,k(x)X_{j}^{i+1,k}(x), j=i,,nj=i,\dots,n and k=θi+11,,0k=\theta_{i+1}-1,\dots,0, are all smooth functions, especially

Xii+1,θi+11(x)=Xii+1,θi+1xi+1Yi+1i+1=fixi+1Yi+1i+1\displaystyle X_{i}^{i+1,\theta_{i+1}-1}(x)=\frac{{\partial X_{i}^{i+1,\theta_{i+1}}}}{{\partial{x_{i+1}}}}Y_{i+1}^{i+1}=\frac{{\partial f_{i}}}{{\partial{x_{i+1}}}}Y_{i+1}^{i+1} (34)
Xii+1,θi+12(x)=j=i+1nXii+1,θi+11xjYji+1\displaystyle X_{i}^{i+1,\theta_{i+1}-2}(x)=\sum_{j=i+1}^{n}\frac{{\partial X_{i}^{i+1,\theta_{i+1}-1}}}{{\partial{x_{j}}}}Y_{j}^{i+1}
\displaystyle\quad\vdots
Xii+1,0(x)=j=i+1nXii+1,1xjYji+1.\displaystyle X_{i}^{i+1,0}(x)=\sum_{j=i+1}^{n}\frac{{\partial X_{i}^{i+1,1}}}{{\partial{x_{j}}}}Y_{j}^{i+1}.

Let Xl,θl=Xl+1,0{X^{l,\theta_{l}}}={X^{l+1,0}} for l=i,,1l=i,\dots,1. Proceeding in the same manner, one can calculate

Xl,θl1=[Yl,Xl,θl]=j=inXjl,θl1(x)xj\displaystyle{X^{l,\theta_{l}-1}}=[{Y^{l}},{X^{l,\theta_{l}}}]=\sum_{j=i}^{n}X_{j}^{l,\theta_{l}-1}(x)\frac{\partial}{{\partial{x_{j}}}}
\displaystyle\quad\vdots
Xl,0=[Yl,Xl,1]=j=inXjl,0(x)xj\displaystyle{X^{l,0}}=[{Y^{l}},{X^{l,1}}]=\sum_{j=i}^{n}X_{j}^{l,0}(x)\frac{\partial}{{\partial{x_{j}}}}

where Xjl,k(x)X_{j}^{l,k}(x), l=i,,1l=i,\dots,1, j=i,,nj=i,\dots,n, and k=θl1,,0k=\theta_{l}-1,\dots,0, are all smooth functions with

Xil,θl1(x)=YilxiXil,θl+j=lnXil,θlxjYjl\displaystyle X_{i}^{l,\theta_{l}-1}(x)=-\frac{{\partial Y_{i}^{l}}}{{\partial{x_{i}}}}{X_{i}^{l,\theta_{l}}}+\sum_{j=l}^{n}{\frac{{\partial{X_{i}^{l,\theta_{l}}}}}{{\partial{x_{j}}}}Y_{j}^{l}} (35)
\displaystyle\quad\vdots
Xil,0(x)=YilxiXil,1+j=lnXil,1xjYjl.\displaystyle X_{i}^{l,0}(x)=-\frac{{\partial Y_{i}^{l}}}{{\partial{x_{i}}}}{X_{i}^{l,1}}+\sum_{j=l}^{n}{\frac{{\partial{X_{i}^{l,1}}}}{{\partial{x_{j}}}}Y_{j}^{l}}\;.

Assuming that $\theta=(\theta_1,\dots,\theta_{i+1})$ is a multi-index belonging to $\mathcal{E}_{i+1}(f_i)$, we now prove $\mathrm{ad}_Y^{\theta}f(0)\notin D^{i+1}(0)$. For the sake of convenience, we denote $\theta^{l,k}=(\theta_1,\dots,\theta_{l-1},k)$ for $l=i+1,\dots,1$ and $k=\theta_l,\dots,0$, so that $\theta^{l,0}=\theta^{l-1,\theta_{l-1}}$ for $l=i+1,\dots,2$.

It is clear that $\theta^{i+1,\theta_{i+1}-1}$ is a multi-index of $\partial X_i^{i+1,\theta_{i+1}}/\partial x_{i+1}$. Let $\beta$ be an $(i+1)$-multi-index with $\beta<_l\theta^{i+1,\theta_{i+1}-1}$. Then we can assert that $\beta\notin\mathcal{I}(\partial X_i^{i+1,\theta_{i+1}}/\partial x_{i+1})$, because otherwise one would have $(\beta_1,\dots,\beta_i,\beta_{i+1}+1)\in\mathcal{I}_{i+1}(f_i)$ and $(\beta_1,\dots,\beta_i,\beta_{i+1}+1)\prec\theta$, which contradicts $\theta\in\mathcal{E}_{i+1}(f_i)$. From the first equation of (34) and Lemma 4.52, $\theta^{i+1,\theta_{i+1}-1}\in\mathcal{I}(X_i^{i+1,\theta_{i+1}-1}(x))$ holds and no element of $\mathcal{I}(X_i^{i+1,\theta_{i+1}-1}(x))$ is left less than $\theta^{i+1,\theta_{i+1}-1}$. Suppose that, for some $k\in\{\theta_{i+1}-1,\dots,1\}$, $X_i^{i+1,k}(x)$ satisfies $\theta^{i+1,k}\in\mathcal{I}(X_i^{i+1,k}(x))$ and $\beta\notin\mathcal{I}(X_i^{i+1,k}(x))$ for all $\beta<_l\theta^{i+1,k}$. It follows from (34), Lemma 4.51, and Lemma 4.52 that $\theta^{i+1,k-1}\in\mathcal{I}(X_i^{i+1,k-1}(x))$ and $\beta\notin\mathcal{I}(X_i^{i+1,k-1}(x))$ for all $\beta<_l\theta^{i+1,k-1}$.

Consider $X_i^{l,k}(x)$ for $l\in\{i,\dots,1\}$ and $k\in\{\theta_l-1,\dots,0\}$ given by (35). Assume that $\theta^{l,k+1}\in\mathcal{I}(X_i^{l,k+1}(x))$ and $\beta\notin\mathcal{I}(X_i^{l,k+1}(x))$ for all $\beta<_l\theta^{l,k+1}$. Taking account of $\theta^{l,k}<_l\theta^{l,k+1}$ and Lemma 4.52, $\beta\not\leq_l\theta^{l,k}$ holds for every $\beta\in\mathcal{I}(\partial Y_i^{l}/\partial x_i\cdot X_i^{l,k+1})$. Using Lemmas 4.51 and 4.52, we have $\theta^{l,k}\in\mathcal{I}(\partial X_i^{l,k+1}/\partial x_l\cdot Y_l^{l})$, $\theta^{l,k}\notin\mathcal{I}(\partial X_i^{l,k+1}/\partial x_j\cdot Y_j^{l})$ for $j=l+1,\dots,n$, and $\beta\notin\mathcal{I}(\partial X_i^{l,k+1}/\partial x_{j'}\cdot Y_{j'}^{l})$ for every $\beta<_l\theta^{l,k}$ and $j'=l,\dots,n$. This means that $\theta^{l,k}\in\mathcal{I}(X_i^{l,k}(x))$ and $\beta\notin\mathcal{I}(X_i^{l,k}(x))$ for all $\beta<_l\theta^{l,k}$. In particular, $\theta^{1,0}=(0)\in\mathcal{I}(X_i^{1,0}(x))$, which implies $X_i^{1,0}(0)\neq 0$ and $\mathrm{ad}_Y^{\theta}f(0)\notin D^{i+1}(0)$.

In a similar way, we can prove that if $\theta$ is a proper $(i+1)$-multi-index such that $\beta\notin\mathcal{I}_{i+1}(f_i)$ holds for every $\beta\preceq\theta$, then $\gamma\notin\mathcal{I}(X_i^{l,k}(x))$ for every $\gamma\leq_l\theta^{l,k}=(\theta_1,\dots,\theta_{l-1},k)$, where $l=i+1,\dots,1$ and $k=\theta_l-1,\dots,0$. Hence $\mathrm{ad}_Y^{\theta}f(0)\in D^{i+1}(0)$.

Consider $\mathrm{ad}_Y^{\epsilon}f$ and $\mathrm{ad}_Y^{\alpha}f$ with $\epsilon\in\mathcal{E}_{i+1}(f_i)$ and $\alpha\prec\epsilon$. Directly from the previous discussion, one obtains (31) and (32). We now prove that (31) and (32) imply $\epsilon\in\mathcal{E}_{i+1}(f_i)$. Suppose that a proper $(i+1)$-multi-index $\alpha$ appearing in (32) satisfies $\alpha\in\mathcal{I}_{i+1}(f_i)$. Then there exists a proper $(i+1)$-multi-index $\alpha'\preceq\alpha$ with $\alpha'\in\mathcal{E}_{i+1}(f_i)$. Since it has been proved that (31) holds for every element of $\mathcal{E}_{i+1}(f_i)$, we obtain $\mathrm{ad}_Y^{\alpha'}f(0)\notin D^{i+1}(0)$, which contradicts (32). Therefore, $\alpha\notin\mathcal{I}_{i+1}(f_i)$ must be true. We next consider the $(i+1)$-multi-index $\epsilon$. Because no $\alpha\prec\epsilon$ belongs to $\mathcal{I}_{i+1}(f_i)$, it is clear that $\epsilon\in\mathcal{I}_{i+1}(f_i)$ implies $\epsilon\in\mathcal{E}_{i+1}(f_i)$. If $\epsilon\notin\mathcal{E}_{i+1}(f_i)$ were true then, with $\alpha\notin\mathcal{I}_{i+1}(f_i)$ for every $\alpha\prec\epsilon$ in mind, $\mathrm{ad}_Y^{\epsilon}f(0)\in D^{i+1}(0)$ would hold, which contradicts (31). Thus, we conclude that $\epsilon\in\mathcal{E}_{i+1}(f_i)$.

It has been proved that (33) holds when αi+1(fi)\alpha\notin{\cal I}_{i+1}(f_{i}) for all αζ\alpha\preceq\zeta. On the other hand, the existence of some αζ\alpha\preceq\zeta satisfying αi+1(fi)\alpha\in{\cal I}_{i+1}(f_{i}), together with (33), obviously contradicts (31). Hence, (33) implies αi+1(fi)\alpha\notin{\cal I}_{i+1}(f_{i}) for all αζ\alpha\preceq\zeta.
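As a small illustration of Proposition 4.55 (a sketch added for orientation, not part of the original argument), take system (28) with $i=1$ and choose the coordinate vector fields $Y^1=\partial/\partial x_1$ and $Y^2=\partial/\partial x_2$, which satisfy (30). For such constant fields, $\mathrm{ad}_{Y^j}X$ is the componentwise partial derivative $\partial X/\partial x_j$. With $f$ the drift vector field of (28), this gives

\[
\begin{aligned}
\mathrm{ad}_Y^{(0,3)}f&=\frac{\partial^3 f}{\partial x_2^3}=6\frac{\partial}{\partial x_1}-6\frac{\partial}{\partial x_4},\\
\mathrm{ad}_Y^{(1,1)}f&=\frac{\partial^2 f}{\partial x_1\partial x_2}=\frac{\partial}{\partial x_1}+\frac{\partial}{\partial x_3}-\frac{\partial}{\partial x_4},\\
\mathrm{ad}_Y^{(0,1)}f(0)&=\frac{\partial}{\partial x_2},\qquad \mathrm{ad}_Y^{(0,2)}f(0)=0.
\end{aligned}
\]

The first two vectors have a nonzero $\partial/\partial x_1$ component and therefore do not lie in $D^{2}(0)=\mathrm{span}\{\partial/\partial x_2,\partial/\partial x_3,\partial/\partial x_4\}$, while the last two do. This is consistent with $\mathcal{E}_2(f_1)=\{(0,3),(1,1)\}$ reported in Example 4.48, under the assumption that $(0,1)$ and $(0,2)$ are the proper $2$-multi-indices preceding $(0,3)$ and that $(0,1)$ is the one preceding $(1,1)$.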

Example 4.57.

This counterexample shows that the above proposition is not valid if one relaxes (30) to $Y^{j}=\sum_{k=j}^{n}Y_k^{j}(x)\,\partial/\partial x_k$. Consider the following system:

\[
\begin{aligned}
\dot{x}_1&=x_1^3x_2\\
\dot{x}_2&=x_3\\
\dot{x}_3&=x_2+v.
\end{aligned}
\]

Let $Y^3=\partial/\partial x_3$, $Y^2=\partial/\partial x_2$, and $Y^1=(1+x_3)\,\partial/\partial x_1+(x_2-x_1)\,\partial/\partial x_3$. Here $Y^1$ does not satisfy (30), because its $\partial/\partial x_1$ coefficient depends on $x_3$. One obtains

\[
\mathrm{ad}_{Y^1}\mathrm{ad}_{Y^2}f=\left(3x_1^2\left(x_3+1\right)-1\right)\frac{\partial}{\partial x_1}+x_1^3\frac{\partial}{\partial x_3}
\]

and hence $\mathrm{ad}_{Y^1}\mathrm{ad}_{Y^2}f(0)\notin\mathrm{span}\{\partial/\partial x_2,\partial/\partial x_3\}$. However, $\mathcal{L}_2(x_1^3x_2)=(3,1)$, whereas the bracket above corresponds to the multi-index $(1,1)$.
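The bracket in Example 4.57 can be reproduced with exact polynomial arithmetic. The sketch below stores a polynomial in three variables as a dictionary from exponent tuples to integer coefficients; the helper names are hypothetical, and the convention $[X,Y]_k=\sum_j\left(X_j\,\partial Y_k/\partial x_j-Y_j\,\partial X_k/\partial x_j\right)$ is assumed, matching the component formula in the proof of Theorem 4.44.

```python
from collections import defaultdict

N = 3  # number of state variables in Example 4.57

def padd(p, q, s=1):
    """Return p + s*q, dropping zero coefficients."""
    r = defaultdict(int)
    for e, c in p.items():
        r[e] += c
    for e, c in q.items():
        r[e] += s * c
    return {e: c for e, c in r.items() if c != 0}

def pmul(p, q):
    """Return the product p*q."""
    r = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            r[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return {e: c for e, c in r.items() if c != 0}

def pdiff(p, j):
    """Return the partial derivative of p with respect to variable j (0-based)."""
    return {e[:j] + (e[j] - 1,) + e[j + 1:]: c * e[j]
            for e, c in p.items() if e[j] > 0}

def lie_bracket(X, Y):
    """[X, Y] for vector fields given as lists of N polynomials."""
    out = []
    for k in range(N):
        comp = {}
        for j in range(N):
            comp = padd(comp, pmul(X[j], pdiff(Y[k], j)))
            comp = padd(comp, pmul(Y[j], pdiff(X[k], j)), s=-1)
        out.append(comp)
    return out

def at_origin(X):
    """Evaluate a polynomial vector field at x = 0."""
    return tuple(comp.get((0,) * N, 0) for comp in X)

ONE = {(0, 0, 0): 1}
f = [{(3, 1, 0): 1}, {(0, 0, 1): 1}, {(0, 1, 0): 1}]       # drift: (x1^3 x2, x3, x2)
Y2 = [{}, ONE, {}]                                         # d/dx2
Y1 = [{(0, 0, 0): 1, (0, 0, 1): 1}, {},                    # (1 + x3) d/dx1
      {(0, 1, 0): 1, (1, 0, 0): -1}]                       # + (x2 - x1) d/dx3

result = lie_bracket(Y1, lie_bracket(Y2, f))               # ad_{Y1} ad_{Y2} f
value_at_origin = at_origin(result)
```

Here `result` encodes $(3x_1^2+3x_1^2x_3-1)\,\partial/\partial x_1+x_1^3\,\partial/\partial x_3$, and `value_at_origin` has a nonzero first entry, so the bracket at the origin leaves $\mathrm{span}\{\partial/\partial x_2,\partial/\partial x_3\}$, as the example states.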

Combining Proposition 4.55, Lemma 4.53, and (25), we readily obtain the next two theorems.

Theorem 4.58.

System (4) is locally equivalent to a lower triangular form satisfying $\mathcal{L}_{i+1}(f_i)=\alpha^{i}$ $(i=1,\dots,n-1)$, namely (23), via a feedback (5) and a change of coordinates (6) if and only if the following conditions are satisfied:

(i) System (4) is locally feedback equivalent to (1).

(ii) Suppose DlD^{l}, l=1,,n+1l=1,\dots,n+1, are nl+1n-l+1 dimensional involutive distributions defined in Theorem 4.44 and XlX^{l}, l=1,,nl=1,\dots,n, are smooth vector fields satisfying XlDlX^{l}\in D^{l} and Xl(0)Dl+1(0)X^{l}(0)\not\in D^{l+1}(0). Let Yi=(Yi,1,,Yi,i+1)Y^{i}=(Y^{i,1},\dots,Y^{i,i+1}), i=1,,n1i=1,\dots,n-1, be tuples of smooth vector fields satisfying Yi,jDjY^{i,j}\in D^{j}, Yi,j(0)Dj+1(0)Y^{i,j}(0)\not\in D^{j+1}(0), and

[Xl,Yi,j]{Di+1i+1lnDil=i[{X^{l}},Y^{i,j}]\in\left\{{\begin{matrix}D^{i+1}&i+1\leq l\leq n\\ D^{i}&l=i\end{matrix}}\right.

for j=1,,i+1j=1,\dots,i+1. Then adYiαiF(0)Di+1(0){\mathrm{ad}_{Y^{i}}^{\alpha^{i}}F}(0)\notin{D^{i+1}}(0) and adYiαF(0)Di+1(0){\mathrm{ad}_{Y^{i}}^{\alpha}F}(0)\in{D^{i+1}}(0) for every proper (i+1)(i+1)-multi-index ααi\alpha\lessdot\alpha^{i}.

Remark 4.59.

The vector fields $X^{1},\dots,X^{n}$ given in the above theorem obviously satisfy $\mathrm{span}\{X^{l},\dots,X^{n}\}=D^{l}$ in a neighborhood of the origin. It is not difficult to find $X^{1},\dots,X^{n}$ when $D^{l}$, $l=1,\dots,n$, are known.

Remark 4.60.

The necessary and sufficient condition introduced in the above theorem for a nonlinear system to be equivalent to a $p$-normal form is consistent with the condition given in [23] if we take $Y^{n-1,n}=G$ and $Y^{i,i+1}=\mathrm{ad}_{Y^{i+1,i+2}}^{p_{i+1}}F$ for $i=n-2,\dots,1$.

Theorem 4.61.

System (4) is locally equivalent to a lower triangular form taking the form (24) via a feedback (5) and a change of coordinates (6) if and only if the following conditions are satisfied:

(i) System (4) is locally feedback equivalent to (1).

(ii) Suppose $D^{l}$, $l=1,\dots,n+1$, are $(n-l+1)$-dimensional involutive distributions defined in Theorem 4.44 and $X^{l}$, $l=1,\dots,n$, are smooth vector fields satisfying $X^{l}\in D^{l}$ and $X^{l}(0)\notin D^{l+1}(0)$. Let $Y^{i}=(Y^{i,1},\dots,Y^{i,i+1})$, $i=1,\dots,n-1$, be tuples of smooth vector fields satisfying $Y^{i,j}\in D^{j}$, $Y^{i,j}(0)\notin D^{j+1}(0)$, and

\[
[X^{l},Y^{i,j}]\in\begin{cases}D^{i+1}, & i+1\leq l\leq n\\ D^{i}, & l=i\end{cases}
\]

for $j=1,\dots,i+1$. Then for every $\epsilon\in\mathcal{E}_{i+1}(f_{i})$ and every $\zeta\in\mathcal{A}_{i+1}\setminus\mathcal{G}_{i+1}(\mathcal{E}_{i+1}(f_{i}))$, where $\mathcal{A}_{i+1}$ is the set consisting of all the proper $(i+1)$-multi-indices, the relations $\mathrm{ad}_{Y^{i}}^{\epsilon}F(0)\notin D^{i+1}(0)$ and $\mathrm{ad}_{Y^{i}}^{\zeta}F(0)\in D^{i+1}(0)$ hold.

Remark 4.62.

When the proper $(i+1)$-multi-index $(0,\dots,0,k)\in\mathcal{E}_{i+1}(f_{i})$ for some positive integer $k$, we know from Proposition 3.41 that $\mathcal{A}_{i+1}\setminus\mathcal{G}_{i+1}(\mathcal{E}_{i+1}(f_{i}))$ is finite. To check condition (ii) in the above theorem, we then only need to calculate Lie brackets a finite number of times. But when the proper $(i+1)$-multi-index $(0,\dots,0,k)\notin\mathcal{E}_{i+1}(f_{i})$ for every positive integer $k$, the set $\mathcal{A}_{i+1}\setminus\mathcal{G}_{i+1}(\mathcal{E}_{i+1}(f_{i}))$ is infinite. Although this means that infinitely many Lie brackets may have to be computed in this case, this is acceptable because one may also have to check infinitely many $(i+1)$-multi-indices of $f_{i}$ to find $\mathcal{E}_{i+1}(f_{i})$.

Remark 4.63.

We now consider how to obtain the tuples $Y^{i}$ required in Theorems 4.58 and 4.61. $Y^{i,i+1}$ can be selected as any smooth vector field belonging to $D^{i+1}$ such that $Y^{i,i+1}(0)\notin D^{i+2}(0)$. For $j=1,\dots,i$, let $Y^{i,j}=\sum_{k=j}^{n}h^{i,j}_{k}(\xi)X^{k}$, where $h^{i,j}_{k}(\xi)$ are undetermined smooth functions. Note that

\[
[X^{l_{1}},X^{l_{2}}]=a^{l_{1},l_{2}}_{l'}(\xi)X^{l'}+\dots+a^{l_{1},l_{2}}_{n}(\xi)X^{n}\tag{36}
\]

where $l_{1}$ and $l_{2}$ are integers belonging to $\{1,\dots,n\}$, $l'=\min(l_{1},l_{2})$, and $a^{l_{1},l_{2}}_{l'}(\xi),\dots,a^{l_{1},l_{2}}_{n}(\xi)$ are smooth functions satisfying $a^{l_{1},l_{2}}_{k}(\xi)=-a^{l_{2},l_{1}}_{k}(\xi)$ and $a^{l,l}_{k}(\xi)=0$. Let us calculate the following Lie bracket, for $l=i+1,\dots,n$,

\[
\begin{aligned}
[X^{l},Y^{i,j}]={}&\sum_{k=j}^{i}[X^{l},h^{i,j}_{k}X^{k}]+\sum_{k=i+1}^{n}[X^{l},h^{i,j}_{k}X^{k}]\\
={}&\sum_{k=j}^{i}\left(X^{l}(h^{i,j}_{k})X^{k}+h^{i,j}_{k}[X^{l},X^{k}]\right)+\sum_{k=i+1}^{n}[X^{l},h^{i,j}_{k}X^{k}]\\
={}&\sum_{k=j}^{i}\left(X^{l}(h^{i,j}_{k})X^{k}+h^{i,j}_{k}\sum_{\hat{k}=k}^{i}a^{l,k}_{\hat{k}}X^{\hat{k}}\right)\\
&+\sum_{k=j}^{i}\left(h^{i,j}_{k}\sum_{\hat{k}=i+1}^{n}a^{l,k}_{\hat{k}}X^{\hat{k}}\right)+\sum_{k=i+1}^{n}[X^{l},h^{i,j}_{k}X^{k}]\\
={}&\sum_{k=j}^{i}\left(X^{l}(h^{i,j}_{k})+\sum_{k'=j}^{k}h^{i,j}_{k'}a^{l,k'}_{k}\right)X^{k}\\
&+\sum_{k=j}^{i}\left(h^{i,j}_{k}\sum_{\hat{k}=i+1}^{n}a^{l,k}_{\hat{k}}X^{\hat{k}}\right)+\sum_{k=i+1}^{n}[X^{l},h^{i,j}_{k}X^{k}]\in D^{i+1}.
\end{aligned}
\]

Similarly, for $l=i$ we have

\[
\begin{aligned}
[X^{i},Y^{i,j}]={}&\sum_{k=j}^{i-1}\left(X^{l}(h^{i,j}_{k})+\sum_{k'=j}^{k}h^{i,j}_{k'}a^{l,k'}_{k}\right)X^{k}\\
&+\sum_{k=j}^{i-1}\left(h^{i,j}_{k}\sum_{\hat{k}=i}^{n}a^{l,k}_{\hat{k}}X^{\hat{k}}\right)+\sum_{k=i}^{n}[X^{l},h^{i,j}_{k}X^{k}]\in D^{i}.
\end{aligned}
\]

Thus, $h^{i,j}_{k}(\xi)$, $k=j,\dots,i$, can be determined from the equations

\[
X^{l}(h^{i,j}_{k})+\sum_{k'=j}^{k}h^{i,j}_{k'}a^{l,k'}_{k}=0
\]

where $k=j,\dots,i$ when $l=i+1,\dots,n$, and $k=j,\dots,i-1$ when $l=i$. The existence of a solution of these equations is guaranteed by Proposition 4.55 and Lemma 4.53. Additionally, $h^{i,j}_{k}(\xi)$, $k=i+1,\dots,n$, can be chosen to be arbitrary smooth functions.
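As a hedged illustration that is not part of the original derivation, consider the special case in which the frame consists of coordinate vector fields, $X^{k}=\partial/\partial\xi_{k}$. This is an assumption made only to show what the determining equations look like in the simplest setting: all structure functions in (36) then vanish and the equations become plain partial-derivative constraints.

```latex
% Assumption: X^k = \partial/\partial\xi_k, hence a^{l_1,l_2}_k \equiv 0 in (36).
\[
\begin{aligned}
\frac{\partial h^{i,j}_{k}}{\partial\xi_{l}} &= 0, && l=i+1,\dots,n,\quad k=j,\dots,i,\\
\frac{\partial h^{i,j}_{k}}{\partial\xi_{i}} &= 0, && k=j,\dots,i-1.
\end{aligned}
\]
% Consequence: for k < i, h^{i,j}_k depends only on \xi_1,\dots,\xi_{i-1};
% for k = i, h^{i,j}_i depends only on \xi_1,\dots,\xi_i.
```

In this special case any smooth functions with the stated dependence solve the equations, so a solution trivially exists. For a general frame the coupling terms involving the structure functions have to be taken into account.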

To determine whether a nonlinear system can be transformed into a specific type of lower triangular form by using the previous two theorems, appropriate vector fields $Y^{i,j}$, $i=1,\dots,n-1$ and $j=1,\dots,i+1$, are required. Finding them is not an easy task, partly because there are so many of them. The next two corollaries are reduced versions of Theorems 4.58 and 4.61, respectively. The following lemma can be proved in a way similar to the proof of Lemma 4.53.

Lemma 4.64.

Suppose $\{X^{1}(\xi),\dots,X^{n}(\xi)\}$ and $\{Y^{1}(\xi),\dots,Y^{n}(\xi)\}$ are two sets of nonsingular vector fields such that $D^{k}=\mathrm{span}\{X^{k},\dots,X^{n}\}=\mathrm{span}\{Y^{k},\dots,Y^{n}\}$, $k=n,\dots,1$, are $(n-k+1)$-dimensional involutive distributions. Then there exists a change of coordinates $x=T(\xi)$ such that in $x$-coordinates

\[
Y^{k}(x)=\sum_{i=k}^{n}Y_{i}^{k}(x_{1},\dots,x_{i})\frac{\partial}{\partial x_{i}}
\]

if and only if the relation

\[
[X^{i},Y^{j}]\in D^{i}
\]

holds for any pair of $i,j=1,\dots,n$ satisfying $i>j$.
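The bracket test of the lemma can be illustrated on the vector fields of the earlier counterexample. The sketch below is an illustration under assumptions rather than part of the original text: it takes $n=3$ and $X^{i}=\partial/\partial x_{i}$, and relabels the counterexample fields with superscripts. For coordinate fields, $[\partial/\partial x_{i},Y]$ is obtained by differentiating the coefficients of $Y$ with respect to $x_{i}$.

```latex
% Assumed data: X^i = \partial/\partial x_i, D^k = span{\partial/\partial x_k, ..., \partial/\partial x_3},
% Y^1 = (1+x_3)\,\partial/\partial x_1 + (x_2-x_1)\,\partial/\partial x_3,
% Y^2 = \partial/\partial x_2, Y^3 = \partial/\partial x_3.
\[
\begin{aligned}
[X^{2},Y^{1}] &= \frac{\partial}{\partial x_{3}} \in D^{2},\\
[X^{3},Y^{2}] &= 0 \in D^{3},\\
[X^{3},Y^{1}] &= \frac{\partial}{\partial x_{1}} \notin D^{3}.
\end{aligned}
\]
```

Near the origin these fields span the same distributions $D^{k}$ as the coordinate fields, but the pair $(i,j)=(3,1)$ violates the bracket condition, so by the lemma $Y^{1}$ cannot be brought to the stated triangular form by a change of coordinates.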

The next two corollaries follow at once from the previous two theorems and the above lemma.

Corollary 4.65.

System (4) is locally equivalent to a lower triangular form satisfying $\mathcal{L}_{i+1}(f_{i})=\alpha^{i}$ $(i=1,\dots,n-1)$, namely (LABEL:eq_lea_sys), via a feedback (5) and a change of coordinates (6) if and only if the following conditions are satisfied:

(i) System (4) is locally feedback equivalent to (1).

(ii) Suppose $D^{l}$, $l=1,\dots,n+1$, are $(n-l+1)$-dimensional involutive distributions defined in Theorem 4.44 and $X^{l}$, $l=1,\dots,n$, are smooth vector fields satisfying $X^{l}\in D^{l}$ and $X^{l}(0)\notin D^{l+1}(0)$. Let $Y=(Y^{1},\dots,Y^{n})$ be a tuple of smooth vector fields such that $D^{k}=\mathrm{span}\{Y^{k},\dots,Y^{n}\}$ for $k=n,\dots,1$ and $[X^{l},Y^{k}]\in D^{l}$ for all $l>k$. Then $\mathrm{ad}_{Y}^{\alpha^{i}}F(0)\notin D^{i+1}(0)$ and $\mathrm{ad}_{Y}^{\alpha}F(0)\in D^{i+1}(0)$ for every proper $(i+1)$-multi-index $\alpha\lessdot\alpha^{i}$.

Corollary 4.66.

System (4) is locally equivalent to (24) via a feedback (5) and a change of coordinates (6) if and only if the following conditions are satisfied:

(i) System (4) is locally feedback equivalent to (1).

(ii) Suppose $D^{l}$, $l=1,\dots,n+1$, are $(n-l+1)$-dimensional involutive distributions defined in Theorem 4.44 and $X^{l}$, $l=1,\dots,n$, are smooth vector fields satisfying $X^{l}\in D^{l}$ and $X^{l}(0)\notin D^{l+1}(0)$. Let $Y=(Y^{1},\dots,Y^{n})$ be a tuple of smooth vector fields such that $D^{k}=\mathrm{span}\{Y^{k},\dots,Y^{n}\}$ for $k=n,\dots,1$ and $[X^{l},Y^{k}]\in D^{l}$ for all $l>k$. Then for every $\epsilon\in\mathcal{E}_{i+1}(f_{i})$ and every $\zeta\in\mathcal{A}_{i+1}\setminus\mathcal{G}_{i+1}(\mathcal{E}_{i+1}(f_{i}))$, where $\mathcal{A}_{i+1}$ is the set consisting of all the proper $(i+1)$-multi-indices, the relations $\mathrm{ad}_{Y}^{\epsilon}F(0)\notin D^{i+1}(0)$ and $\mathrm{ad}_{Y}^{\zeta}F(0)\in D^{i+1}(0)$ hold.

Remark 4.67.

The tuple $Y$ mentioned in the previous two corollaries can be found by a method similar to that of Remark 4.63. $Y^{n}$ can be selected as a smooth vector field belonging to $D^{n}$ with $Y^{n}(0)\neq 0$. For $j=1,\dots,n-1$, let $Y^{j}=\sum_{k=j}^{n}h_{k}^{j}(\xi)X^{k}$. Since $[X^{l},Y^{j}]\in D^{l}$ for $l=j+1,\dots,n$, the functions $h_{k}^{j}(\xi)$, $k=j,\dots,n-1$, can be obtained from the solution of the equations

\[
X^{l}(h_{k}^{j})+\sum_{k'=j}^{k}h_{k'}^{j}a^{l,k'}_{k}=0
\]

where every function $a^{l,k'}_{k}$ is defined by (36). $h_{n}^{j}(\xi)$ can be any smooth function.

Example 4.68.

Consider the system given by (26). By using the above corollary, we now show how to determine which type the system belongs to without transforming it into a lower triangular form. Since it has been verified in Example 4.48 that this system satisfies condition (i) of Corollary 4.66, it is necessary to find four nonsingular vector fields $Y^{4},\dots,Y^{1}$ such that, for $l=1,\dots,4$ and $k=1,\dots,l-1$, $[X^{l},Y^{k}]\in D^{l}$, where $D^{l}$ and $X^{l}=G^{l}$ are given in Example 4.48. By using the method proposed in Remark 4.67, let us take

\[
\begin{aligned}
Y^{4}&=-\frac{\partial}{\partial\xi_{2}}+\frac{\partial}{\partial\xi_{4}},\\
Y^{3}&=\frac{\partial}{\partial\xi_{1}}+\left(2\xi_{3}-\xi_{4}\right)\frac{\partial}{\partial\xi_{2}}+\frac{\partial}{\partial\xi_{3}}+\xi_{4}\frac{\partial}{\partial\xi_{4}},\\
Y^{2}&=\xi_{3}\frac{\partial}{\partial\xi_{1}}+\left(2\xi_{3}^{2}-\xi_{4}+1\right)\frac{\partial}{\partial\xi_{2}}+\xi_{3}\frac{\partial}{\partial\xi_{3}}+\xi_{4}\frac{\partial}{\partial\xi_{4}},
\end{aligned}
\]

and

\[
Y^{1}=\frac{\partial}{\partial\xi_{1}}-\xi_{4}\frac{\partial}{\partial\xi_{2}}+\xi_{4}\frac{\partial}{\partial\xi_{4}}.
\]

After computing several Lie brackets, it is straightforward to see that $[X^{l},Y^{k}]\in D^{l}$ for $l=1,\dots,4$ and $k=1,\dots,l-1$; that is, the condition (ii) introduced in Corollary 4.66 is also satisfied. To simplify the notation, let (26) be of a type $[\![E^{1},E^{2},E^{3}]\!]$, where $E^{i}$, $i=1,2,3$, are sets of proper $(i+1)$-multi-indices. To determine $E^{3}$, we first compute the following Lie bracket

\[
\begin{aligned}
\mathrm{ad}_{Y^{4}}F={}&\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)\frac{\partial}{\partial\xi_{1}}+2\xi_{3}\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)\frac{\partial}{\partial\xi_{2}}\\
&+\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)\frac{\partial}{\partial\xi_{3}}.
\end{aligned}
\]

Since $\mathrm{ad}_{Y^{4}}F(0)=0$, we have $(0,0,0,1)\notin E^{3}$. After further computations, we obtain

\[
\begin{aligned}
\mathrm{ad}_{Y^{2}}\mathrm{ad}_{Y^{4}}F={}&\left(-\xi_{2}+\xi_{3}^{2}-\xi_{4}+1\right)\frac{\partial}{\partial\xi_{1}}+2\xi_{3}\left(-\xi_{2}+\xi_{3}^{2}-\xi_{4}+1\right)\frac{\partial}{\partial\xi_{2}}\\
&+\left(-\xi_{2}+\xi_{3}^{2}-\xi_{4}+1\right)\frac{\partial}{\partial\xi_{3}}
\end{aligned}
\]

and

\[
\mathrm{ad}_{Y^{1}}\mathrm{ad}_{Y^{4}}F=\mathrm{ad}_{Y^{3}}\mathrm{ad}_{Y^{4}}F=\mathrm{ad}_{Y^{4}}\mathrm{ad}_{Y^{4}}F=0.
\]

Seeing that $\mathrm{ad}_{Y^{2}}\mathrm{ad}_{Y^{4}}F(0)\notin D^{4}(0)$, it is clear that $(0,1,0,1)\in E^{3}$. Let $\alpha$ be a proper $4$-multi-index such that $\left|\alpha\right|>2$. Noting that $\mathrm{ad}_{Y}^{\alpha}F\neq 0$ implies $(0,1,0,1)\prec\alpha$, $E^{3}=\{(0,1,0,1)\}$ holds. According to

\[
\begin{aligned}
\mathrm{ad}_{Y^{3}}F={}&\frac{\partial}{\partial\xi_{2}}+\xi_{4}\left(\xi_{2}-\xi_{3}^{2}+\xi_{4}\right)\left(\frac{\partial}{\partial\xi_{1}}+2\xi_{3}\frac{\partial}{\partial\xi_{2}}+\frac{\partial}{\partial\xi_{3}}\right)\\
&+\left(\xi_{1}\xi_{2}-\xi_{1}\xi_{3}^{2}+\xi_{1}\xi_{4}+\xi_{2}^{3}-3\xi_{2}^{2}\xi_{3}^{2}+3\xi_{2}^{2}\xi_{4}+3\xi_{2}\xi_{3}^{4}\right.\\
&\left.\quad-6\xi_{2}\xi_{3}^{2}\xi_{4}-\xi_{2}\xi_{3}\right)\left(-\frac{\partial}{\partial\xi_{2}}+\frac{\partial}{\partial\xi_{4}}\right)
\end{aligned}
\]

and $\mathrm{ad}_{Y^{3}}F(0)\notin D^{3}(0)$, $(0,0,1)$ must be the only element of $E^{2}$. Then, let us focus on $E^{1}$. Since the expression of $\mathrm{ad}_{Y^{2}}F$ is rather complicated, only its value at the origin is shown here:

\[
\mathrm{ad}_{Y^{2}}F(0)=\frac{\partial}{\partial\xi_{2}}.
\]

Noting that $\mathrm{ad}_{Y^{2}}F(0)\in D^{2}(0)$, we conclude that $(0,1)\notin E^{1}$. We also compute the following vector fields at the origin

\[
\begin{aligned}
\mathrm{ad}_{Y^{2}}^{2}F(0)&=0,\\
\mathrm{ad}_{Y^{2}}^{3}F(0)&=6\frac{\partial}{\partial\xi_{1}}+6\frac{\partial}{\partial\xi_{2}}-6\frac{\partial}{\partial\xi_{4}},\\
\mathrm{ad}_{Y^{1}}\mathrm{ad}_{Y^{2}}F(0)&=\frac{\partial}{\partial\xi_{1}}+\frac{\partial}{\partial\xi_{2}}-\frac{\partial}{\partial\xi_{4}}.
\end{aligned}
\]

Neither $\mathrm{ad}_{Y^{2}}^{3}F(0)$ nor $\mathrm{ad}_{Y^{1}}\mathrm{ad}_{Y^{2}}F(0)$ belongs to $D^{2}(0)$. Hence, $(0,3),(1,1)\in E^{1}$. Since $\mathrm{ad}_{Y}^{\alpha}F(0)=0$ holds for all $\alpha\in\mathcal{A}_{2}\setminus\mathcal{G}_{2}(\{(0,3),(1,1)\})$, no other proper 2-multi-index belongs to $E^{1}$. This allows us to conclude that $E^{1}=\{(0,3),(1,1)\}$. Comparing this example with Example 4.48, the type of (26) determined by using Corollary 4.66 is the same as the type judged from the equivalent lower triangular form of (26).
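The second-order brackets used to determine $E^{3}$ in this example can be re-derived from the stated expression of $\mathrm{ad}_{Y^{4}}F$ alone, without access to system (26). The sketch below is not part of the original paper: it assumes the convention $[X,Y]_{i}=X(Y_{i})-Y(X_{i})$ with $\mathrm{ad}_{Y}W=[Y,W]$, represents polynomials in $\xi_{1},\dots,\xi_{4}$ as dictionaries from exponent tuples to integer coefficients, and uses hypothetical helper names (`padd`, `pmul`, `pdiff`, `apply_field`, `bracket`).

```python
# Polynomials in xi_1..xi_4 are dicts {exponent tuple: integer coefficient}.

def padd(p, q, s=1):
    """Return p + s*q with zero terms removed."""
    r = dict(p)
    for e, c in q.items():
        r[e] = r.get(e, 0) + s * c
    return {e: c for e, c in r.items() if c != 0}

def pmul(p, q):
    """Return the product p*q."""
    r = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            r[e] = r.get(e, 0) + c1 * c2
    return {e: c for e, c in r.items() if c != 0}

def pdiff(p, k):
    """Return the partial derivative of p with respect to variable k (0-based)."""
    return {e[:k] + (e[k] - 1,) + e[k + 1:]: c * e[k]
            for e, c in p.items() if e[k] > 0}

def apply_field(X, p):
    """Directional derivative X(p) = sum_k X_k * dp/dxi_k."""
    r = {}
    for k, Xk in enumerate(X):
        r = padd(r, pmul(Xk, pdiff(p, k)))
    return r

def bracket(X, Y):
    """Lie bracket [X, Y]; its i-th component is X(Y_i) - Y(X_i) (assumed convention)."""
    return [padd(apply_field(X, Yi), apply_field(Y, Xi), -1)
            for Xi, Yi in zip(X, Y)]

Z = (0, 0, 0, 0)
xi3, xi4 = {(0, 0, 1, 0): 1}, {(0, 0, 0, 1): 1}
g = {(0, 1, 0, 0): 1, (0, 0, 2, 0): -1, (0, 0, 0, 1): 1}   # xi2 - xi3^2 + xi4
V = [{Z: 1}, {(0, 0, 1, 0): 2}, {Z: 1}, {}]                # d/dxi1 + 2 xi3 d/dxi2 + d/dxi3
W = [pmul(g, Vi) for Vi in V]                              # ad_{Y^4} F as stated in the example

Y4 = [{}, {Z: -1}, {}, {Z: 1}]
Y3 = [{Z: 1}, {(0, 0, 1, 0): 2, (0, 0, 0, 1): -1}, {Z: 1}, xi4]
Y2 = [xi3, {(0, 0, 2, 0): 2, (0, 0, 0, 1): -1, Z: 1}, xi3, xi4]
Y1 = [{Z: 1}, {(0, 0, 0, 1): -1}, {}, xi4]

second = {name: bracket(Y, W)
          for name, Y in [("Y1", Y1), ("Y2", Y2), ("Y3", Y3), ("Y4", Y4)]}
# Stated value of ad_{Y^2} ad_{Y^4} F: (1 - g) times the field V.
expected_Y2 = [pmul(padd({Z: 1}, g, -1), Vi) for Vi in V]
origin_Y2 = [p.get(Z, 0) for p in second["Y2"]]
print(second["Y2"] == expected_Y2, origin_Y2)
```

Under these assumptions the computed $\mathrm{ad}_{Y^{2}}\mathrm{ad}_{Y^{4}}F$ coincides with the expression given in the example, and the brackets with $Y^{1}$, $Y^{3}$, and $Y^{4}$ vanish identically. Whether the value at the origin lies outside $D^{4}(0)$ still has to be judged against the distributions of Example 4.48, which are not reproduced in this sketch.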

5 Conclusion

We have developed a framework to analyze the multi-indices of the functions given by the right-hand sides of the system equations of lower triangular forms. This leads to two classification schemes of lower triangular forms. To expand the application of those two classifications, the problem of whether a nonlinear system is equivalent to a specific type of lower triangular form has also been solved in this paper.

References

  • [1] A. Isidori, Nonlinear control systems, 3rd ed.   New York, NY, USA: Springer, 1995.
  • [2] R. Sepulchre, M. Jankovic, and P. V. Kokotovic, Constructive nonlinear control.   London, U.K.: Springer, 2012.
  • [3] S. Sastry, Nonlinear systems: analysis, stability, and control.   New York, USA: Springer, 1999.
  • [4] S. Čelikovskỳ and H. Nijmeijer, “Equivalence of nonlinear systems to triangular form: the singular case,” Systems & control letters, vol. 27, no. 3, pp. 135–144, 1996.
  • [5] X. Zhang and Y. Lin, “A new approach to global asymptotic tracking for a class of low-triangular nonlinear systems via output feedback,” IEEE transactions on automatic control, vol. 57, no. 12, pp. 3192–3196, 2012.
  • [6] R. Ma and J. Zhao, “Backstepping design for global stabilization of switched nonlinear systems in lower triangular form under arbitrary switchings,” Automatica, vol. 46, no. 11, pp. 1819–1823, 2010.
  • [7] Y. Su and J. Huang, “Cooperative global robust output regulation for nonlinear uncertain multi-agent systems in lower triangular form,” IEEE Transactions on Automatic Control, vol. 60, no. 9, pp. 2378–2389, 2015.
  • [8] B. Wang, H. Ji, and J. Zhu, “Robust control design of a class of nonlinear systems in polynomial lower-triangular form,” International Journal of Control, Automation and Systems, vol. 7, no. 1, pp. 41–48, 2009.
  • [9] F. Fotiadis and G. A. Rovithakis, “Prescribed performance control for discontinuous output reference tracking,” IEEE Transactions on Automatic Control, vol. 66, no. 9, pp. 4409–4416, 2020.
  • [10] T. Zhang, S. S. Ge, and C. C. Hang, “Adaptive neural network control for strict-feedback nonlinear systems using backstepping design,” Automatica, vol. 36, no. 12, pp. 1835–1846, 2000.
  • [11] X. Tang, G. Tao, and S. M. Joshi, “Adaptive actuator failure compensation for parametric strict feedback systems and an aircraft application,” Automatica, vol. 39, no. 11, pp. 1975–1982, 2003.
  • [12] C. P. Bechlioulis and G. A. Rovithakis, “Adaptive control with guaranteed transient and steady state tracking error bounds for strict feedback systems,” Automatica, vol. 45, no. 2, pp. 532–538, 2009.
  • [13] B. Chen, X. Liu, K. Liu, and C. Lin, “Direct adaptive fuzzy control of nonlinear strict-feedback systems,” Automatica, vol. 45, no. 6, pp. 1530–1535, 2009.
  • [14] D. Zhai, L. An, J. Dong, and Q. Zhang, “Output feedback adaptive sensor failure compensation for a class of parametric strict feedback systems,” Automatica, vol. 97, pp. 48–57, 2018.
  • [15] J. Zhang and G. Yang, “Low-complexity tracking control of strict-feedback systems with unknown control directions,” IEEE Transactions on Automatic Control, vol. 64, no. 12, pp. 5175–5182, 2019.
  • [16] W. Lin and C. Qian, “Adding one power integrator: a tool for global stabilization of high-order lower-triangular systems,” Systems & Control Letters, vol. 39, no. 5, pp. 339–351, 2000.
  • [17] C. Qian and W. Lin, “Non-lipschitz continuous stabilizers for nonlinear systems with uncontrollable unstable linearization,” Systems & Control Letters, vol. 42, no. 3, pp. 185–200, 2001.
  • [18] ——, “A continuous feedback approach to global strong stabilization of nonlinear systems,” IEEE Transactions on Automatic Control, vol. 46, no. 7, pp. 1061–1079, 2001.
  • [19] W. Lin and C. Qian, “Adaptive control of nonlinearly parameterized systems: a nonsmooth feedback framework,” IEEE Transactions on Automatic control, vol. 47, no. 5, pp. 757–774, 2002.
  • [20] C. Qian and W. Lin, “Practical output tracking of nonlinear systems with uncontrollable unstable linearization,” IEEE Transactions on Automatic Control, vol. 47, no. 1, pp. 21–36, 2002.
  • [21] ——, “Recursive observer design, homogeneous approximation, and nonsmooth output feedback stabilization of nonlinear systems,” IEEE Transactions on Automatic Control, vol. 51, no. 9, pp. 1457–1471, 2006.
  • [22] D. Cheng and W. Lin, “On p-normal forms of nonlinear systems,” IEEE Transactions on Automatic control, vol. 48, no. 7, pp. 1242–1248, 2003.
  • [23] W. Respondek, “Transforming a single-input system to a p-normal form via feedback,” in 42nd IEEE International Conference on Decision and Control (IEEE Cat. No. 03CH37475), vol. 2.   IEEE, 2003, pp. 1574–1579.
  • [24] Y. Hong, J. Wang, and D. Cheng, “Adaptive finite-time control of nonlinear systems with parametric uncertainty,” IEEE Transactions on Automatic control, vol. 51, no. 5, pp. 858–862, 2006.
  • [25] Z. Sun, L. Xue, and K. Zhang, “A new approach to finite-time adaptive stabilization of high-order uncertain nonlinear system,” Automatica, vol. 58, pp. 60–66, 2015.
  • [26] C. Chen and Z. Sun, “A unified approach to finite-time stabilization of high-order nonlinear systems with an asymmetric output constraint,” Automatica, vol. 111, p. 108581, 2020.
  • [27] L. Long and J. Zhao, “$H_{\infty}$ control of switched nonlinear systems in p-normal form using multiple lyapunov functions,” IEEE Transactions on Automatic Control, vol. 57, no. 5, pp. 1285–1291, 2012.
  • [28] Q. Su, L. Long, and J. Zhao, “Stabilization of state-constrained switched nonlinear systems in p-normal form,” International Journal of Robust and Nonlinear Control, vol. 24, no. 10, pp. 1550–1562, 2014.
  • [29] L. Long and J. Zhao, “An integral-type multiple lyapunov functions approach for switched nonlinear systems,” IEEE Transactions on Automatic Control, vol. 61, no. 7, pp. 1979–1986, 2016.
  • [30] C. Ding, C. Shi, and Y. Chen, “Nonsingular prescribed-time stabilization of a class of uncertain nonlinear systems: A novel coordinate mapping method,” International Journal of Robust and Nonlinear Control, vol. 30, no. 9, pp. 3566–3581, 2020.
  • [31] C. Ding and R. Wei, “Low-complexity tracking control for p-normal form systems using a novel nussbaum function,” IEEE Transactions on Automatic Control, vol. 67, no. 5, pp. 2640–2647, 2021.
  • [32] A. J. Krener, “On the equivalence of control systems and the linearization of nonlinear systems,” SIAM Journal on Control, vol. 11, no. 4, pp. 670–676, 1973.
  • [33] R. W. Brockett, “Feedback invariants for nonlinear systems,” IFAC Proceedings Volumes, vol. 11, no. 1, pp. 1115–1120, 1978.
  • [34] B. Jacubczyk and W. Respondek, “On linearization of control systems,” Bul. L’acad Pol. Sciense, vol. 28, no. 9-10, pp. 517–522, 1980.
  • [35] R. Su, “On the linear equivalents of nonlinear systems,” Systems & Control Letters, vol. 2, no. 1, pp. 48–52, 1982.
  • [36] L. Hunt, R. Su, and G. Meyer, “Global transformations of nonlinear systems,” IEEE Transactions on automatic control, vol. 28, no. 1, pp. 24–31, 1983.
  • [37] A. Isidori, A. Krener, C. Gori-Giorgi, and S. Monaco, “Nonlinear decoupling via feedback: a differential geometric approach,” IEEE transactions on automatic control, vol. 26, no. 2, pp. 331–345, 1981.
  • [38] W. Rudin, Functional analysis, 2nd ed.   New York, NY, USA: McGraw-Hill, 1991.
  • [39] S. Lang, Differential and Riemannian manifolds, 3rd ed.   New York, NY, USA: Springer, 2012.