This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

An Efficient Egocentric Regulator for Continuous Targeting Problems of the Underactuated Quadrotor

Ziying Lin, Wei Dong, Sensen Liu, Xinjun Sheng, and Xiangyang Zhu The authors are with the State Key Laboratory of Mechanical System and Vibration, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]).Manuscript received xxx xx, xxxx; revised xxxx xx, xxxx.
Abstract

Flying robots such as the quadrotor could provide an efficient approach for medical treatment or sensor placing of wild animals. In these applications, continuously targeting the moving animal is a crucial requirement. Due to the underactuated characteristics of the quadrotor and the coupled kinematics with the animal, nonlinear optimal tracking approaches, other than smooth feedback control, are required. However, with severe nonlinearities, it would be time-consuming to evaluate control inputs, and real-time tracking may not be achieved with generic optimizers onboard. To tackle this problem, a novel efficient egocentric regulation approach with high computational efficiency is proposed in this paper. Specifically, it directly formulates the optimal tracking problem in an egocentric manner regarding the quadrotor’s body coordinates. Meanwhile, the nonlinearities of the system are peeled off through a mapping of the feedback states as well as control inputs, between the inertial and body coordinates. In this way, the proposed efficient egocentric regulator only requires solving a quadratic performance objective with linear constraints and then generate control inputs analytically. Comparative simulations and mimic biological experiment are carried out to verify the effectiveness and computational efficiency. Results demonstrate that the proposed control approach presents the highest and stablest computational efficiency than generic optimizers on different platforms. Particularly, on a commonly utilized onboard computer, our method can compute the control action in approximately 0.3 ms, which is on the order of 350 times faster than that of generic nonlinear optimizers, establishing a control frequency around 3000 Hz.

Index Terms:
Continuous targeting, efficient egocentric regulator, underactuated quadrotor

I Introduction

In the past few years, the quadrotor has demonstrated impressive capabilities for biological researches [1, 2]. A recent work also realized the plants’ health management via autonomous sensor placement [3]. Motivated by those works, the quadrotor equipped with devices such as an anesthesia gun might be able to further apply for wildlife’s health management. With autonomous anesthetizing or sensor placing, it may significantly reduce the danger or labor of human beings when coping with agile and aggressive animals [4].

Different from the case of stationary plants as mentioned in Ref. [3], the moving animal should be continuously targeted before anesthetizing. For such a targeting task, it requires accurate control simultaneously on the position and attitude of the underactuated quadrotor in a continuous-time horizon. Therefore, smooth feedback control [5, 6, 7] is not applicable for such a targeting task, and an effective optimal tracking approach should be formulated. Considering the quadrotor’s underactuated characteristics and its coupled kinematics with the moving animal, optimal control of this continuous targeting problem involves severe nonlinearities. Solving such a nonlinear optimal control problem is commonly time consuming. Therefore computational efficiency is a crucial issue regarding the processing capability of onboard computers commonly adopted by quadrotors.

In previous researches, mainly two similar types of optimal problems are extensively studied, i.e., state-to-state maneuver problem [8, 9, 10, 11] and optimal tracking multiple fixed waypoints problem [12, 13, 14, 15, 16, 17]. For the state-to-state maneuver problem, the computational efficiency is guaranteed since the original problem can be linearizable or the optimal trajectory can be generated offline. For example, in Refs. [8, 9], computationally efficient trajectory generation algorithms, as well as subspace stabilization tracking method, are also proposed to maneuver the quadrotor to a time-varying target point in 6-dimensional state space. As the reported problems can be formulated with quadratic or linear optimizations, the optimal trajectories can be generated fairly fast. In Refs. [10, 11], time-optimal trajectory generation algorithms are designed for landing a quadrotor onto a moving platform. Although the problem is not formulated as a linear-quadratic problem, it assumes that the platform motion can be represented by a function of time. Therefore, the optimal trajectory can be generated offline without the requirement of computational efficiency. On the other hand, for the optimal tracking of multiple fixed waypoints problem [12, 13, 14, 15, 16, 17], it is a fully actuated problem that mainly concerns situations in which an invariable terminal target point or a set of fixed waypoints are given. Since the waypoints can be defined a priori, the optimal trajectory can be either generated online or offline, and the computational efficiency is not the primary concern. Instead, the trajectory tracking performance is more crucial [16, 17].

Given that the targeted animal is moving randomly during the whole targeting process, the waypoints cannot be obtained a priori in this underactuated targeting problem, as the fully actuated multiple fixed waypoints problem does. Meanwhile, the real-time tracking problem in this work cannot be linearized directly ,and the animal motion cannot be obtained a priori, as the state-to-state maneuver problem does. This is because three variables, i.e., the pitch angle, the translational position and vertical position of the quadrotor, are controlled simultaneously with two control inputs, i.e., thrust and pitch to ensure the targeting. Meanwhile, the vertical targeting performance objective is determined by the pitch angle, the translational as well as vertical position simultaneously, whereas the pitch angle and the translational position are coupled with each other again. Worse still is that the coupled kinematics with the moving animal further intensifies the nonlinearities of this optimal control problem. The aforesaid coupled nonlinearities and underactuated characteristics exist in the performance objective and the nonlinear constraints, making such a targeting optimal problem unlinearizable. Hence, it is inherently different from the aforementioned linearizable optimization problems such as those investigated in [8, 9]. Therefore, the optimal tracking methods proposed in these two types of problems are all unapplicable to solve this targeting problem, considering the heavy nonlinearities and computational efficiency simultaneously. Meanwhile, according to our outdoor experiments, generic optimizers also cannot directly solve this problem with sufficient computational efficiency.

To tackle this problem, an efficient egocentric regulator is proposed in this paper. Firstly, regarding the quadrotor’s body coordinates, a virtual mathematical model is established in an egocentric manner. In such a way, the decoupling of the pitch angle with the translational position is achieved and the fully actuated system is then obtained. Based on this transformed model, the nonlinearities of the system are peeled off. Then, it only needs to solve a linear quadratic optimization problem and then generate the virtual control inputs analytically. The peeled-off nonlinear part is compensated through a mapping of the feedback states and the virtual inputs between the virtual body and inertial coordinates. This kind of mapping can directly obtain the analytical real control input, achieving high computational efficiency. Mimic biological simulations and experiments are performed to testify the excellent targeting performance and the computational efficiency. Results demonstrate that the proposed control approach presents the highest and stablest computational efficiency than generic optimizers on different platforms. Particularly, it can compute the control action on the order of 350 times faster than generic optimizers on the onboard computer. In such a case, a control frequency of thousands of hertz can be achieved by the proposed approach on a commonly utilized onboard computer.

The main contributions of this work are: 1) To continuously target a moving animal, an efficient egocentric regulator is first proposed, which can present the highest and stablest computational efficiency than generic optimizers on different platforms. Particularly, on the onboard computer, our method can provide high computational efficiency on the order of 350 times higher than that of generic optimizers; 2) Extensive mimic biological simulations and experiments are first carried out, and the results demonstrate the excellent targeting performance and real-time computational efficiency of the proposed strategy.

The remainder of this paper is organized as follows. Section II presents the system dynamics and the problem formulation. Section III proposes the efficient egocentric regulator. Comparative simulations are implemented in Section IV ,while outdoor experiments are conducted in Section V. Section VI concludes the paper.

II System Dynamics And Problem Formulation

We first present the general dynamic model of the quadrotor in this section. On this basis, the targeting problem is formulated.

II-A Dynamics of quadrotor

The coordinates and free body diagram of the quadrotor are shown in Fig. 1.

Refer to caption
Figure 1: A free body diagram of the quadrotor. 𝒆\boldsymbol{e}^{\mathcal{I}} is the inertial coordinates and 𝒆\boldsymbol{e}^{\mathcal{B}} is the body fixed coordinates

On this basis, the dynamics of the quadrotor is investigated as follows. First, the force and moment generated by four individual rotors are commonly expressed as

{f=Fi(i=1,,4),τx=(F4F2)Lτz=M1M2+M3M4,τy=(F3F1)L\left\{\begin{array}[]{ll}f=\sum F_{i}\quad(i=1,\ldots,4),&\tau_{x}=\left(F_{4}-F_{2}\right)\cdot L\\ \tau_{z}=M_{1}-M_{2}+M_{3}-M_{4},&\tau_{y}=\left(F_{3}-F_{1}\right)\cdot L\end{array}\right. (1)

where LL is the length from the rotor to the center of the mass of the quadrotor, and FiF_{i} and MiM_{i} are the thrust and torque generated by rotor i(i{1,2,3,4})i(i\in\{1,2,3,4\}).

In view of Eq.(1), considering air drag effects, the dynamics of the quadrotor with respect to the inertial coordinates 𝒆\boldsymbol{e}^{\mathcal{I}} can be expressed as [18, 19, 20]

{𝒑˙=𝒗𝒗˙=fm𝓡𝒆3g𝒆3𝑪𝒗𝛀˙=𝑾𝝎𝝎˙=𝑱1𝝉\left\{\begin{array}[]{ll}\dot{\boldsymbol{p}}^{\mathcal{I}}=\boldsymbol{v}^{\mathcal{I}}&\dot{\boldsymbol{v}}^{\mathcal{I}}=\frac{f}{m}_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}\boldsymbol{e}_{3}^{\mathcal{B}}-g\boldsymbol{e}_{3}^{\mathcal{I}}-\boldsymbol{C}\boldsymbol{v}^{\mathcal{I}}\\ \dot{\boldsymbol{\Omega}}=\boldsymbol{W}\cdot\boldsymbol{\omega}^{\mathcal{B}}&\dot{\boldsymbol{\omega}^{\mathcal{B}}}=\boldsymbol{J}^{-1}\cdot\boldsymbol{\tau}\end{array}\right. (2)

where 𝒑=xq𝒆1+yq𝒆2+zq𝒆3\boldsymbol{p}^{\mathcal{I}}=x_{q}\boldsymbol{e}_{1}^{\mathcal{I}}+y_{q}\boldsymbol{e}_{2}^{\mathcal{I}}+z_{q}\boldsymbol{e}_{3}^{\mathcal{I}} denotes the position of the quadrotor in inertial coordinates 𝒆\boldsymbol{e}^{\mathcal{I}}; 𝒗=vqx𝒆1+vqy𝒆2+vqz𝒆3\boldsymbol{v}^{\mathcal{I}}=v_{qx}\boldsymbol{e}_{1}^{\mathcal{I}}+v_{qy}\boldsymbol{e}_{2}^{\mathcal{I}}+v_{qz}\boldsymbol{e}_{3}^{\mathcal{I}} is the velocity of the quadrotor in 𝒆\boldsymbol{e}^{\mathcal{I}}; mm is the mass of the quadrotor; gg is the gravity constant; 𝑪=diag{c1,c2,c3}\boldsymbol{C}=diag\left\{c_{1},c_{2},c_{3}\right\} is the constant matrix to estimate the aerial drag effects in 𝒆\boldsymbol{e}^{\mathcal{I}}; 𝛀\boldsymbol{\Omega} is the attitude, which contains three components: pitch θ,\theta, roll ϕ,\phi, and yaw ψ\psi; matrix 𝑾\boldsymbol{W} establishes relationship between attitude velocity 𝛀˙\dot{\boldsymbol{\Omega}} and angular velocity 𝝎\boldsymbol{\omega}; 𝑱\boldsymbol{J} is the moment of inertia; 𝝉=[τx,τy,τz]T\boldsymbol{\tau}=\left[\tau_{x},\tau_{y},\tau_{z}\right]^{T};𝓡{}_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}} is the transformation matrix from 𝒆\boldsymbol{e}^{\mathcal{B}} to 𝒆\boldsymbol{e}^{\mathcal{I}}, which is expressed as

𝑹=[CψCθCψSθSϕSψCϕCψSθCϕ+SψSϕSψCθSψSθSϕ+CψCϕSψSθCϕCψSϕSθCθSϕCθCϕ]_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{R}=\left[\begin{array}[]{ccc}C_{\psi}C_{\theta}&C_{\psi}S_{\theta}S_{\phi}-S_{\psi}C_{\phi}&C_{\psi}S_{\theta}C_{\phi}+S_{\psi}S_{\phi}\\ S_{\psi}C_{\theta}&S_{\psi}S_{\theta}S_{\phi}+C_{\psi}C_{\phi}&S_{\psi}S_{\theta}C_{\phi}-C_{\psi}S_{\phi}\\ -S_{\theta}&C_{\theta}S_{\phi}&C_{\theta}C_{\phi}\end{array}\right] (3)

in which Sx=sin(x) and Cx=cos(x)S_{x}=\sin(x)\text{ and }C_{x}=\cos(x).

II-B Problem Formulation

To clarify the formulation of the optimal control strategy designed in this work, three assumptions are imposed as follows.

Assumption 1. The pitch angle θ\theta and the roll angle ϕ\phi of the quadrotor can be set directly without dynamics and delay such that they can be regarded as the control inputs.

Assumption 2. Considering that the yaw angle ψ\psi of the quadrotor can be controlled separately without underactuated characteristics, it can be approximated as zero, i.e., ψ=0\psi=0.

Assumption 3. The animal’s velocity is regarded as a constant in each controller updating timestep.

In view of the developed model, the control objective can be described as follows. An optimal control input of the quadrotor is required to design in such a way the quadrotor can continuously target the animal and keep in a safe distance. The targeting problem is illustrated in Fig. 2, where 𝒑t=xt𝒆1+yt𝒆2+zt𝒆3\boldsymbol{p}_{t}^{\mathcal{I}}=x_{t}\boldsymbol{e}_{1}^{\mathcal{I}}+y_{t}\boldsymbol{e}_{2}^{\mathcal{I}}+z_{t}\boldsymbol{e}_{3}^{\mathcal{I}} denotes the position of the desired-point in inertial coordinate 𝒆\boldsymbol{e}^{\mathcal{I}}.

Combining with Assumption 1, we can define the state vector 𝒙\boldsymbol{x} and the control input 𝒖\boldsymbol{u} of the system as:

𝒙=[(𝒑𝒑t)T,(𝒗𝒗t)T,(𝒗)T]T,𝒖=[fm,θ,ϕ]T\displaystyle\boldsymbol{x}=[\left(\boldsymbol{p}^{\mathcal{I}}-\boldsymbol{p}_{t}^{\mathcal{I}}\right)^{T},\left(\boldsymbol{v}^{\mathcal{I}}-\boldsymbol{v}_{t}^{\mathcal{I}}\right)^{T},\left(\boldsymbol{v}^{\mathcal{I}}\right)^{T}]^{T},\boldsymbol{u}=[\frac{f}{m},\theta,\phi]^{T} (4)

where 𝒗t=vtx𝒆1+vty𝒆2+vtz𝒆3\boldsymbol{v}_{t}^{\mathcal{I}}=v_{tx}\boldsymbol{e}_{1}^{\mathcal{I}}+v_{ty}\boldsymbol{e}_{2}^{\mathcal{I}}+v_{tz}\boldsymbol{e}_{3}^{\mathcal{I}} denotes the velocity of the desired-point in inertial coordinate 𝒆\boldsymbol{e}^{\mathcal{I}}. The quadrotor’s velocity 𝒗\boldsymbol{v}^{\mathcal{I}} is considered as separate states. It is separately introduced to reflect the air drag effect. In this way, the static case and the uniform motion case, where the animal and the quadrotor are both relatively stationary, can be distinguished.

Based on Assumption 2, Assumption 3 and Eq. (2), the dynamics of the targeting process can then be reformulated as follows:

Refer to caption
Figure 2: A schematic diagram for the targeting problem. Desired-point is the desired hit point of the anethesia gun in the animal; targeted plane denotes the plane which is parallel to the 𝒆2𝒆3\boldsymbol{e}_{2}^{\mathcal{I}}\boldsymbol{e}_{3}^{\mathcal{I}} plane while passing through the desired-point; targeted-point is the intersection of the anesthesia gun’s extension line and the targeted plane.
𝒙˙(t)=𝒉[𝒙(t),𝒖(t)]=[𝒙46T,𝒂T,𝒂T]T\displaystyle\boldsymbol{\dot{x}}(t)=\boldsymbol{h}\left[\boldsymbol{x}(t),\boldsymbol{u}(t)\right]=[{\boldsymbol{x}_{4-6}}^{T},\boldsymbol{a}^{T},\boldsymbol{a}^{T}]^{T} (5)

where 𝒙i\boldsymbol{x}_{i} denotes the ii-th component of 𝒙\boldsymbol{x}; 𝒂=[𝒖1C𝒖3S𝒖2c1𝒙7,𝒖1S𝒖3c2𝒙8,𝒖1C𝒖3C𝒖2c3𝒙9g]T\boldsymbol{a}=[\boldsymbol{u}_{1}C_{\boldsymbol{u}_{3}}S_{\boldsymbol{u}_{2}}-c_{1}\boldsymbol{x}_{7},-\boldsymbol{u}_{1}S_{\boldsymbol{u}_{3}}-c_{2}\boldsymbol{x}_{8},\boldsymbol{u}_{1}C_{\boldsymbol{u}_{3}}C_{\boldsymbol{u}_{2}}-c_{3}\boldsymbol{x}_{9}-g]^{T}, where 𝒖i\boldsymbol{u}_{i} denotes ii-th component of 𝒖\boldsymbol{u}.

To facilitate the optimal control, Define dxxtxqd_{x}\triangleq x_{t}-x_{q} as the distance from the center of mass of the quadrotor to the targeted plane, and define dyytyqd_{y}\triangleq y_{t}-y_{q}, dz(dxtanθ(zqzt))d_{z}\triangleq(d_{x}\tan\theta-(z_{q}-z_{t})) as the distance along 𝒆2,𝒆3\boldsymbol{e}_{2}^{\mathcal{I}},\boldsymbol{e}_{3}^{\mathcal{I}} between the targeted-point and the desired-point on the targeted plane. For this targeting task, the control objective is to design an optimal control input 𝒖\boldsymbol{u} such that dxd_{x} is as close to the safe distance (3 meters in this paper) as possible. At the same time, dy,dzd_{y},d_{z} and control input should be minimized within a finite time horizon t[0,tf]t\in[0,t_{f}]. In this way, the considered optimal control problem of the targeting problem can be formulated as the follows:

min\displaystyle\min 𝒥=0tfG[𝒙(t),𝒖(t),t]dt\displaystyle\mathcal{J}=\int_{0}^{{{t}_{f}}}G[\boldsymbol{x}(t),\boldsymbol{u}(t),t]\text{d}t (6)
s.t. 𝒙˙(t)=𝒉[𝒙(t),𝒖(t)],𝒙(0)=𝝃𝟎\displaystyle\boldsymbol{\dot{x}}(t)=\boldsymbol{h}[\boldsymbol{x}(t),\boldsymbol{u}(t)],\boldsymbol{x}(0)=\boldsymbol{{\xi}_{0}}

where the terminal time tft_{f} is fixed; G[𝒙(t),𝒖(t),t]=12𝒖T𝒖+k1(dx3)2+k2dy2+k3dz2\displaystyle G[\boldsymbol{x}(t),\boldsymbol{u}(t),t]={\frac{1}{2}\boldsymbol{u}^{T}\boldsymbol{u}+k_{1}{{\left(d_{x}-3\right)}^{2}}+k_{2}d_{y}^{2}+k_{3}d_{z}^{2}}; 𝝃𝟎9\boldsymbol{{\xi}_{0}}\in\mathbb{R}^{9} is the initial value of the state vector, which is known a priori; ki,i=13k_{i},i=1\ldots 3 are the positive weights.

From (6), it can be inferred that the pitch angle 𝒖2\boldsymbol{u}_{2} is coupled with the translational position 𝒙1\boldsymbol{x}_{1} in the performance objective 𝒥\mathcal{J}. Additionally, there exists severe nonlinear characteristics in this optimal control problem, owing to the coupled kinematics with the animal and the inherent nonlinearities of the quadrotor. Hence, this optimal control problem can neither be linearized nor be solved analytically. The general ways of solving a nonlinear optimal control problem is to solve it numerically with generic optimizers. Due to the coupled and nonlinear characteristics, directly solving this nonlinear optimal control problem with generic optimizers is time-consuming and cannot meet the real-time calculation requirements for real system.

III Efficient Egocentric Regulator

To tackle this issue, a novel method called Efficient Egocentric Regulator (EER) with high computational efficiency is proposed in this section, which peels off the nonlinearities from the optimal control problem (6), as shown in Fig. 3.

Refer to caption
Figure 3: The illustration of the EER. As shown in the right part, a virtual coordinate 𝒱\mathcal{V} parallel to the quadrotor’s instantaneous body coordinate 𝒆\boldsymbol{e}^{\mathcal{B}} is established while taking the desired-point as the origin. Based on this, a virtual system model is established, where the virtual state 𝒙e\boldsymbol{x}_{e} is obtained by mapping 𝒙\boldsymbol{x} from the inertial coordinate 𝒆\boldsymbol{e}^{\mathcal{I}} to the virtual coordinate. Then, a linear quadratic optimization problem is built up to generate the virtual optimal control input 𝒖e\boldsymbol{u}_{e} analytically. Finally, the real control input 𝒖\boldsymbol{u} can be derived by mapping 𝒖e\boldsymbol{u}_{e} back to 𝒆\boldsymbol{e}^{\mathcal{I}}.

Firstly, a virtual coordinate 𝒱\mathcal{V} parallel to the quadrotor’s instantaneous body coordinate {\mathcal{B}} is established while taking the desired-point as the origin. Then, the virtual position and the velocity of the quadrotor in virtual coordinate 𝒆𝒱\boldsymbol{e}^{\mathcal{V}} can be defined as 𝒑𝒱=xe𝒆1𝒱+ye𝒆2+ze𝒆3𝒱,𝒗𝒱=vxe𝒆1𝒱+vye𝒆2+vze𝒆3𝒱\boldsymbol{p}^{\mathcal{V}}=x_{e}\boldsymbol{e}_{1}^{\mathcal{V}}+y_{e}\boldsymbol{e}_{2}^{\mathcal{I}}+z_{e}\boldsymbol{e}_{3}^{\mathcal{V}},\boldsymbol{v}^{\mathcal{V}}=v_{xe}\boldsymbol{e}_{1}^{\mathcal{V}}+v_{ye}\boldsymbol{e}_{2}^{\mathcal{I}}+v_{ze}\boldsymbol{e}_{3}^{\mathcal{V}}. The virtual state 𝒙e\boldsymbol{x}_{e} and the virtual control input 𝒖e\boldsymbol{u}_{e} of the considered system in virtual coordinate can be defined as

𝒙e=[xe+r,ye,ze,vxe,vye,vze]T,𝒖e=[ue1,ue2,ue3]T\boldsymbol{x}_{e}=\left[x_{e}+r^{*},y_{e},z_{e},v_{xe},v_{ye},v_{ze}\right]^{T},\boldsymbol{u}_{e}=\left[u_{e1},u_{e2},u_{e3}\right]^{T} (7)

where rr^{*} is a constant, denoted as the instantaneously desired position of xex_{e}; ue1u_{e1}, ue2u_{e2} and ue3u_{e3} are the accelerations in virtual coordinate along 𝒆1𝒱\boldsymbol{e}_{1}^{\mathcal{V}}, 𝒆2𝒱\boldsymbol{e}_{2}^{\mathcal{V}} 𝒆3𝒱\boldsymbol{e}_{3}^{\mathcal{V}}.

Theoreom 1. The mapping of the feedback states and control inputs, between the inertial and virtual coordinates can be approximated as

𝒙e=[𝑰2×2(𝓡)T]𝒙16+[r,𝟎1×5T]T,𝒖e=𝓡T𝒂\displaystyle\boldsymbol{x}_{e}=[\boldsymbol{I}_{2\times 2}\otimes(_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}})^{T}]\boldsymbol{x}_{1-6}+[r^{*},\boldsymbol{0}_{1\times 5}^{T}]^{T},\boldsymbol{u}_{e}=_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}^{T}{\boldsymbol{a}} (8)

where 𝐱16\boldsymbol{x}_{1-6} denotes the first six components of 𝐱\boldsymbol{x}.

Proof.

It can be easily obtained that

𝒑𝒱=𝓡T𝒙13\displaystyle\boldsymbol{p}^{\mathcal{V}}=_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}^{T}\boldsymbol{x}_{1-3} (9)

where 𝒙13\boldsymbol{x}_{1-3} denotes the first three components of 𝒙\boldsymbol{x}. Taking the first and second time derivative of Eq. (LABEL:transform1) gives

𝒗𝒱=𝓡˙𝒙13+𝓡T𝒙˙13\displaystyle\boldsymbol{v}^{\mathcal{V}}=_{\mathcal{B}}^{\mathcal{I}}\dot{\boldsymbol{\mathcal{R}}}\boldsymbol{x}_{1-3}+_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}^{T}\dot{\boldsymbol{x}}_{1-3} (10)
𝒖e=𝓡¨T𝒙13+2𝓡˙T𝒙˙13+𝓡T𝒙¨13\displaystyle\boldsymbol{u}_{e}=_{\mathcal{B}}^{\mathcal{I}}\ddot{\boldsymbol{\mathcal{R}}}^{T}\boldsymbol{x}_{1-3}+2_{\mathcal{B}}^{\mathcal{I}}\dot{\boldsymbol{\mathcal{R}}}^{T}\dot{\boldsymbol{x}}_{1-3}+_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}^{T}\ddot{\boldsymbol{x}}_{1-3}

Furthermore, it can be assumed that 𝓡˙T𝟎3×3,𝓡¨T𝟎3×3{}_{\mathcal{B}}^{\mathcal{I}}\dot{\boldsymbol{\mathcal{R}}}^{T}\approx\boldsymbol{0}_{3\times 3},_{\mathcal{B}}^{\mathcal{I}}\ddot{\boldsymbol{\mathcal{R}}}^{T}\approx\boldsymbol{0}_{3\times 3} in each controller updating timestep. Substituting this assumption into Eq. (LABEL:transform1) and Eq. (10), one can then prove Theoreom 1. ∎

Theoreom 2. The optimal control input 𝐮\boldsymbol{u}^{*} given by the proposed EER in inertial coordinate is

𝒖\displaystyle\boldsymbol{u}^{*} =[𝒂+g𝒆3𝑪𝒙46,arctan(𝒂1+c1𝒙4𝒂3+c3𝒙6+g),\displaystyle=\left[\left\|{\boldsymbol{a}}^{*}+g\boldsymbol{e}_{3}^{\mathcal{I}}-\boldsymbol{C}\boldsymbol{x}_{4-6}\right\|,\arctan\left(\frac{{\boldsymbol{a}}_{1}^{*}+c_{1}{\boldsymbol{x}}_{4}}{{\boldsymbol{a}}_{3}^{*}+c_{3}{\boldsymbol{x}}_{6}+g}\right)\right., (11)
arcsin(𝒂2c2𝒙5𝒖1)]T\displaystyle\left.\arcsin\left(\frac{-{\boldsymbol{a}}_{2}^{*}-c_{2}{\boldsymbol{x}}_{5}}{\boldsymbol{u}_{1}^{*}}\right)\right]^{T}

where 𝐚=𝓡𝐊𝐱e\boldsymbol{a}^{*}=_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}\boldsymbol{K}\boldsymbol{x}_{e} is the optimal acceleration in inertial coordinate \mathcal{I}, in which 𝐊3×6\boldsymbol{K}\in\mathbb{R}^{3\times 6} is a control gain matrix.

Proof.

Firstly, based on the definition in Eq. (7), the virtual system model can be established in a egocentric manner as:

𝒙˙e=𝑨e𝒙e+𝑩e𝒖e\dot{\boldsymbol{x}}_{e}=\boldsymbol{A}_{e}\boldsymbol{x}_{e}+\boldsymbol{B}_{e}\boldsymbol{u}_{e} (12)

with 𝑨e=[𝟎3×3𝑰3×3𝟎3×3𝟎3×3]\boldsymbol{A}_{e}=\begin{bmatrix}\boldsymbol{0}_{3\times 3}&\boldsymbol{I}_{3\times 3}\\ \boldsymbol{0}_{3\times 3}&\boldsymbol{0}_{3\times 3}\end{bmatrix}, 𝑩e=[𝟎3×3,𝑰3×3]T\boldsymbol{B}_{e}=\left[\boldsymbol{0}_{3\times 3},\boldsymbol{I}_{3\times 3}\right]^{T}. Based on this transformed model, the pitch angle is decoupled with the translational position and a fully actuated linear system with three inputs and three outputs is obtained. In addition, the nonlinearities of the system are peeled off and the separated nonlinear part can be compensated through a mapping of the feedback states and control inputs, between the inertial and virtual coordinates.

Then, it only needs to solve a linear quadratic optimization problem. For the targeting task, the goal is to design an optimal controller which minimizes the sum of the targeting error and input energy. Based on the virtual coordinate, a quadratic optimal control problem can be formulated as

min\displaystyle\min 𝒥e=120𝒙eT𝑸1𝒙e+𝒖eT𝑸2𝒖edt\displaystyle\mathcal{J}_{e}=\frac{1}{2}\int_{0}^{\infty}\boldsymbol{x}_{e}^{T}\boldsymbol{Q}_{1}\boldsymbol{x}_{e}+\boldsymbol{u}_{e}^{T}\boldsymbol{Q}_{2}\boldsymbol{u}_{e}\mathrm{d}t (13)
s.t.\displaystyle s.t. 𝒙˙e=𝑨e𝒙e+𝑩e𝒖e,𝒙e(0)=𝒙e0\displaystyle\dot{\boldsymbol{x}}_{e}=\boldsymbol{A}_{e}\boldsymbol{x}_{e}+\boldsymbol{B}_{e}\boldsymbol{u}_{e},\boldsymbol{x}_{e}(0)=\boldsymbol{x}_{e0}

where 𝑸16×6\boldsymbol{Q}_{1}\in\mathbb{R}^{6\times 6} denotes semi-definite diagonal state weighting matrix; 𝑸23×3\boldsymbol{Q}_{2}\in\mathbb{R}^{3\times 3} is a positive definite diagonal control weighting matrix; 𝒙e09\boldsymbol{x}_{e0}\in\mathbb{R}^{9} is the initial state of 𝒙e\boldsymbol{x}_{e}. Then, by solving such a linear quadratic optimization problem, the optimal virtual control input 𝒖e\boldsymbol{u}_{e}^{*} can be obtained:

𝒖e=𝑲𝒙e(t)\displaystyle\boldsymbol{u}_{e}^{*}=\boldsymbol{K}\boldsymbol{x}_{e}(t) (14)

where 𝒙e\boldsymbol{x}_{e} can be obtained from the mapping given in Eq.(8); 𝑲=𝑸21𝑩eT𝑷\boldsymbol{K}=-\boldsymbol{Q}_{2}^{-1}\boldsymbol{B}_{e}^{T}\boldsymbol{P}, 𝑷6×6\boldsymbol{P}\in\mathbb{R}^{6\times 6} is a positive constant matrix found by solving the continuous time algebraic Riccati equation 𝑷𝑨e𝑨eT𝑷+𝑷𝑩e𝑸21𝑩eT𝑷𝑸1=0-\boldsymbol{P}\boldsymbol{A}_{e}-\boldsymbol{A}_{e}^{T}\boldsymbol{P}+\boldsymbol{PB}_{e}\boldsymbol{Q}_{2}^{-1}\boldsymbol{B}_{e}^{T}\boldsymbol{P}-\boldsymbol{Q}_{1}=0. As 𝑸1\boldsymbol{Q}_{1} and 𝑸2\boldsymbol{Q}_{2} are constant, 𝑷\boldsymbol{P} can be solved offline. In this work, to be consistent with the cost function given in Eq. (6), rr^{*} in 𝒙e\boldsymbol{x}_{e} is selected with r=3/cosθr^{*}=3/\cos\theta, and it can be assumed as a constant in each controller updating timestep considering high computational efficiency. In practice, it is also feasible to directly select r=3r^{*}=3.

After calculating the optimal virtual control input, combined with Theoreom 1, the desired acceleration in inertial coordinate \mathcal{I} can be obtained as

𝒂=𝓡𝒖e=𝓡𝑲𝒙e\displaystyle{\boldsymbol{a}}^{*}=_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}\boldsymbol{u}_{e}^{*}=_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}\boldsymbol{K}\boldsymbol{x}_{e} (15)

Therefore, according to the system dynamics in Eq. (5), the optimal control input 𝒖\boldsymbol{u}^{*} in inertial coordinate can be derived analytically as Theoreom 2. ∎

Corollary 1. When sinϕ0\sin\phi\approx 0, the control of the EER in 𝐞2\boldsymbol{e}_{2}^{\mathcal{I}} can be equivalent to a PD controller.

Proof.

According to the definition of 𝑸2,𝑩e\boldsymbol{Q}_{2},\boldsymbol{B}_{e}, and 𝑷\boldsymbol{P}, the control gain matrix 𝑲\boldsymbol{K} can be rewritten as

𝑲=𝑸21𝑩eT𝑷=[𝑲1,𝑲2]\displaystyle\boldsymbol{K}=-\boldsymbol{Q}_{2}^{-1}\boldsymbol{B}_{e}^{T}\boldsymbol{P}=\left[\boldsymbol{K}_{1},\boldsymbol{K}_{2}\right] (16)

where 𝑲13×3,𝑲23×3\boldsymbol{K}_{1}\in\mathbb{R}^{3\times 3},\boldsymbol{K}_{2}\in\mathbb{R}^{3\times 3} are two diagnoal matrixes.

Based on Eq. (8) and Eq. (15), one can obtain

𝒂\displaystyle{\boldsymbol{a}}^{*} =𝓡[𝑲1,𝑲2](𝑰2×2(𝓡)T𝒙16+[r,𝟎1×5T]T)\displaystyle=_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}\left[\boldsymbol{K}_{1},\boldsymbol{K}_{2}\right]\left(\boldsymbol{I}_{2\times 2}\otimes(_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}})^{T}\boldsymbol{x}_{1-6}+[r^{*},\boldsymbol{0}_{1\times 5}^{T}]^{T}\right) (17)
=[𝓡𝑲1𝓡T,𝓡𝑲2𝓡T]𝒙16+𝓡[𝑲1,1r,0,0]T\displaystyle=\left[{}_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}{\boldsymbol{K}_{1}}_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}^{T},_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}{\boldsymbol{K}_{2}}_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}^{T}\right]\boldsymbol{x}_{1-6}+_{\mathcal{B}}^{\mathcal{I}}\boldsymbol{\mathcal{R}}[\boldsymbol{K}_{1,1}r^{*},0,0]^{T}

where 𝑲1,1\boldsymbol{K}_{1,1} denotes the first diagonal element of 𝑲1\boldsymbol{K}_{1}. Given the near hovering state, i.e., sinϕ0\sin\phi\approx 0 and according to Eq. (3), the optimal accelaration 𝒂2\boldsymbol{a}_{2}^{*} along 𝒆2\boldsymbol{e}_{2}^{\mathcal{I}} can be given by

𝒂2\displaystyle{\boldsymbol{a}}_{2}^{*} =[0𝑲1,2Cϕ200𝑲2,2Cϕ20]𝒙16\displaystyle=\begin{bmatrix}0&\boldsymbol{K}_{1,2}C_{\phi}^{2}&0&0&\boldsymbol{K}_{2,2}C_{\phi}^{2}&0\end{bmatrix}\boldsymbol{x}_{1-6} (18)
kp𝒙2+kd𝒙4\displaystyle\triangleq k_{p}\boldsymbol{x}_{2}+k_{d}\boldsymbol{x}_{4}

where 𝑲1,2,𝑲2,2\boldsymbol{K}_{1,2},\boldsymbol{K}_{2,2} denote the second diagonal element of 𝑲1\boldsymbol{K}_{1} and 𝑲2\boldsymbol{K}_{2}, respectively; kp,kdk_{p},k_{d} are the control gains. Therefore, it can be inferred that the control of the EER in 𝒆2\boldsymbol{e}_{2}^{\mathcal{I}} can be equivalent to a PD controller. ∎

Remark 1. From corollary 1, it can be inferred that the position control along 𝐞2\boldsymbol{e}_{2}^{\mathcal{I}} can be independent of the proposed EER control and then simplified as an independent PD control. In this case, the state variable 𝐱𝐞\boldsymbol{x_{e}} can be reduced to 4 dimensions, and the dimensions of other parameters will also be reduced accordingly.

IV Numerical Evaluation

In this section, to verify the effectiveness of this work, numerical simulations are first carried out.

IV-A Generic optimizers

Generally, to directly tackle the optimal control problem in Eq. (6), there are two types of approaches, i.e., the indirect one and the direct one [21]. In this paper, for comparison, two representative methods are selected from each type, that is, the Gauss Pseudospectral Method[21, 22] and the Boundary Value Problem method [23].

IV-A1 Gauss pseudospectral method

Firstly, Legendre-Gauss (LG) points were selected as interpolation nodes and the values of the state and the control variables on LG points are regarded as unknown parameters. On this basis, the state and control trajectories are approximated using Lagrange interpolating polynomials and the system dynamics can be obtained in a discretized manner. Furthermore, the continuous cost function of Eq. (6) is approximated using a Gauss quadrature. Then, the optimal control problem can be transformed into a nonlinear programming problem (NLP). Due to the space limitation, the specific discretization process of the optimal trajectory control problem via GPM is omitted here. Detailed principles of GPM can be found in [21, 22]. The resulting transcribed NLP via GPM can be written as follows:

min\displaystyle\min\quad 𝒥=tf2𝝎T[12𝜼𝟏2+12𝜼𝟐2+k1(𝝃𝟏3𝟏N×1)2\displaystyle\mathcal{J}=\frac{t_{f}}{2}\boldsymbol{\omega}^{T}\left[\frac{1}{2}\boldsymbol{\eta_{1}}^{2}+\frac{1}{2}\boldsymbol{\eta_{2}}^{2}+k_{1}\left(-\boldsymbol{\xi_{1}}-3\cdot\boldsymbol{1}_{N\times 1}\right)^{2}\right. (19)
+k2𝝃𝟐2+k3((𝝃𝟏tan𝜼𝟑𝝃𝟑)2)]\displaystyle\quad\left.+k_{2}\boldsymbol{\xi_{2}}^{2}+k_{3}\left((-\boldsymbol{\xi_{1}}\tan\boldsymbol{\eta_{3}}-\boldsymbol{\xi_{3}})^{2}\right)\right]
s.t.\displaystyle s.t. 𝑫[𝝃𝟎,𝒊,𝝃𝒊T]Ttf2𝝃𝒊+𝟑=0,i=13\displaystyle\boldsymbol{D}\left[\boldsymbol{\xi_{0,i}},\boldsymbol{\xi_{i}}^{T}\right]^{T}-\frac{t_{f}}{2}\boldsymbol{\xi_{i+3}}=0,i=1\ldots 3
𝑫[𝝃𝟎,𝒊,𝝃𝒊T]Ttf2(𝜼𝟏C𝜼𝟑S𝜼𝟐c1𝝃𝟕)=0,i=4,7\displaystyle\boldsymbol{D}\left[\boldsymbol{\xi_{0,i}},\boldsymbol{\xi_{i}}^{T}\right]^{T}-\frac{t_{f}}{2}\left(\boldsymbol{\eta_{1}}C_{\boldsymbol{\eta_{3}}}S_{\boldsymbol{\eta_{2}}}-c_{1}\boldsymbol{\xi_{7}}\right)=0,i=4,7
𝑫[𝝃𝟎,𝒊,𝝃𝒊T]Ttf2(𝜼𝟏S𝜼𝟑c2𝝃𝟖)=0,i=5,8\displaystyle\boldsymbol{D}\left[\boldsymbol{\xi_{0,i}},\boldsymbol{\xi_{i}}^{T}\right]^{T}-\frac{t_{f}}{2}\left(-\boldsymbol{\eta_{1}}S_{\boldsymbol{\eta_{3}}}-c_{2}\boldsymbol{\xi_{8}}\right)=0,i=5,8
𝑫[𝝃𝟎,𝒊,𝝃𝒊T]Ttf2(𝜼𝟏C𝜼𝟑C𝜼𝟐g𝟏c3𝝃𝟗)=0,\displaystyle\boldsymbol{D}\left[\boldsymbol{\xi_{0,i}},\boldsymbol{\xi_{i}}^{T}\right]^{T}-\frac{t_{f}}{2}\left(\boldsymbol{\eta_{1}}C_{\boldsymbol{\eta_{3}}}C_{\boldsymbol{\eta_{2}}}-g\boldsymbol{1}-c_{3}\boldsymbol{\xi_{9}}\right)=0,
i=6,9\displaystyle i=6,9

where 𝝎\boldsymbol{\omega} is the Gauss weight vector; 𝜼jN,j=13\displaystyle\boldsymbol{\eta}_{j}\in\mathbb{R}^{N},j=1\ldots 3 is the discretized vector, composed of all values of 𝒖j\boldsymbol{u}_{j} at LG points; Similarly, 𝝃iN,i=1,,9\displaystyle\boldsymbol{\xi}_{i}\in\mathbb{R}^{N},i=1,\ldots,9, denotes the discretized vector of 𝒙i\boldsymbol{x}_{i}; 𝟏\boldsymbol{1} is an all-one vector; .2\boldsymbol{.}^{2} denotes element-wise square; 𝑫N×N+1\boldsymbol{D}\in\mathbb{R}^{N\times N+1} is the differential approximation matrix; 𝝃𝟎,𝒊\displaystyle\boldsymbol{\xi_{0,i}} denotes ii-th component of 𝝃𝟎\boldsymbol{\xi_{0}}; for simplicity, the multiplication of two vectors denotes the Hadamard product \odot, which is the element-wise product.

Thus, the transcribed problem is still a nonlinear programming problem and can be solved using any Non-linear Programming (NLP) solver. The calculated control input will then be sent to the quadrotor to drive the quadrotor to target the moving animal.

IV-A2 Boundary Value Problem

This method transforms the optimal problem into a boundary value problem via the calculus of variations. Then the boundary value problem can be solved (often numerically) for extremal trajectories.

However, due to the severe nonlinearities, this method cannot be directly applied to solve the original optimal control problem in Eq.(6). It should be further assumed that the system is in the near hovering state (θ=0,ϕ=0\theta=0,\phi=0), such that dz(𝒙1𝒖2𝒙3)d_{z}\approx(-\boldsymbol{x}_{1}\boldsymbol{u}_{2}-\boldsymbol{x}_{3}). In this case, the optimal problem can be reformulated as

min\displaystyle\min 𝒥=0tfG[𝒙(t),𝒖(t),t]dt\displaystyle\mathcal{J}^{{}^{\prime}}=\int_{0}^{{{t}_{f}}}G^{{}^{\prime}}[\boldsymbol{x}(t),\boldsymbol{u}(t),t]\text{d}t (20)
s.t. 𝒙˙(t)=𝒉[𝒙(t),𝒖(t)],𝒙(0)=𝝃𝟎\displaystyle\boldsymbol{\dot{x}}(t)=\boldsymbol{h}^{{}^{\prime}}[\boldsymbol{x}(t),\boldsymbol{u}(t)],\boldsymbol{x}(0)=\boldsymbol{{\xi}_{0}}

where G=12𝒖T𝒖+k1(dx+3)2+k2dy2+k3dz2\displaystyle G^{{}^{\prime}}={\frac{1}{2}\boldsymbol{u}^{T}\boldsymbol{u}+k_{1}{{\left(d_{x}+3\right)}^{2}}+k_{2}d_{y}^{2}+k_{3}d_{z}^{2}}; 𝒉=[𝒙46T,g𝒖2c1𝒙7,g𝒖3c2𝒙8,𝒖1c3𝒙9g,g𝒖2c1𝒙7,g𝒖3c2𝒙8,𝒖1c3𝒙9g]T\boldsymbol{h}^{{}^{\prime}}=[{\boldsymbol{x}_{4-6}}^{T},g{\boldsymbol{u}_{2}}-c_{1}\boldsymbol{x}_{7},-g\boldsymbol{u}_{3}-c_{2}\boldsymbol{x}_{8},\boldsymbol{u}_{1}-c_{3}\boldsymbol{x}_{9}-g,g{\boldsymbol{u}_{2}}-c_{1}\boldsymbol{x}_{7},-g\boldsymbol{u}_{3}-c_{2}\boldsymbol{x}_{8},\boldsymbol{u}_{1}-c_{3}\boldsymbol{x}_{9}-g]^{T}. Firstly, by applying the Lagrange multiplier method, the augmented performance objective can be constructed as

𝒥=0tfG[𝒙(t),𝒖(t),t]+𝝀T[𝒉(𝒙(t),𝒖(t))𝒙˙(t)]dt\displaystyle\mathcal{J}^{\prime}=\int_{0}^{{{t}_{f}}}G^{{}^{\prime}}[\boldsymbol{x}(t),\boldsymbol{u}(t),t]+\boldsymbol{\lambda}^{T}\left[\boldsymbol{h}^{{}^{\prime}}(\boldsymbol{x}(t),\boldsymbol{u}(t))-\dot{\boldsymbol{x}}(t)\right]\text{d}t (21)

where 𝝀9\boldsymbol{\lambda}\in\mathbb{R}^{9} is the costate vector. Then, via the calculus of variations, the optimal control input can be generated as

𝒖(t)\displaystyle\boldsymbol{u}^{*}(t) =𝑽(𝒙,𝝀)\displaystyle=\boldsymbol{V}(\boldsymbol{x},\boldsymbol{\lambda}) (22)
=[𝝀6𝝀9,(𝝀4+𝝀7)g2k3𝒙1𝒙32k3𝒙12+1,g(𝝀5+𝝀8)]T\displaystyle=\left[-\boldsymbol{\lambda}_{6}-\boldsymbol{\lambda}_{9},\frac{-(\boldsymbol{\lambda}_{4}+\boldsymbol{\lambda}_{7})g-2k_{3}\boldsymbol{x}_{1}\boldsymbol{x}_{3}}{2k_{3}\boldsymbol{x}_{1}^{2}+1},g(\boldsymbol{\lambda}_{5}+\boldsymbol{\lambda}_{8})\right]^{T}

in which 𝝀i\boldsymbol{\lambda}_{i} denotes ii-th component of 𝝀\boldsymbol{\lambda}; 𝝀\boldsymbol{\lambda} and 𝒙\boldsymbol{x} can be calculated by solving the following boundary value problem via well-developed numerical algorithms.

𝝀˙(t)=𝑺[𝒙(t),𝝀(t)],𝝀(tf)=𝟎9×1\displaystyle\dot{\boldsymbol{\lambda}}(t)=\boldsymbol{S}\left[\boldsymbol{x}(t),\boldsymbol{\lambda}(t)\right],\boldsymbol{\lambda}(t_{f})=\boldsymbol{0}_{9\times 1} (23)
𝒙˙(t)=𝒉[𝒙(t)𝝀(t)],𝒙(0)=𝝃𝟎\displaystyle\dot{\boldsymbol{x}}(t)=\boldsymbol{h}^{{}^{\prime}}\left[\boldsymbol{x}(t)\boldsymbol{\lambda}(t)\right],\boldsymbol{x}(0)=\boldsymbol{{\xi}_{0}}

where 𝑺=[2k1(𝒙1+3)+2k3dz𝒖2,2k2𝒙2,2k3dz,𝝀13T,\boldsymbol{S}=\left[-2k_{1}\left(\boldsymbol{x}_{1}+3\right)+2k_{3}d_{z}\boldsymbol{u}^{*}_{2},-2k_{2}\boldsymbol{x}_{2},2k_{3}d_{z},-\boldsymbol{\lambda}_{1-3}^{T}\right.,
(𝝀46T+𝝀79T)𝑪]T\left.(\boldsymbol{\lambda}_{4-6}^{T}+\boldsymbol{\lambda}_{7-9}^{T})\boldsymbol{C}\right]^{T}, 𝒖i\boldsymbol{u}^{*}_{i} is ii-th component of 𝒖\boldsymbol{u}^{*}.

IV-B Evaluation Environment Setup

The simulation is implemented on a desktop computer with an Intel Core i7-10700 CPU running at 2.9GHz, with 16GB of RAM. It runs on a Linux (Ubuntu 18.04) operating system and is constructed in a physically realistic environment.

The implemented structure of this simulation environment is illustrated in Fig. 4. The quadrotor and animal dynamics are implemented in the Gazebo environment. The numerical node is utilized to calculate the optimal control input via the received state in Python. The control node is implemented in C++ to calculate the optimal control input and then control the quadrotor at a frequency of 50 Hz. Both control node and numerical node are implemented within the Robot Operating System (ROS).

To further increase the computational efficiency of the generic optimizers, the following three acceleration techniques are adopted.

1) The optimal control computations of generic optimizers are separately running on GPM/BVP server, which is accelerated by Numba 0.51.2 of Python. In such a case, the numerical node of generic optimizers only serves to forward the message and is bridged with GPM/BVP server by TCP/IP.

2) The calculations of the generic optimizers are combined with warm-starting [24], i.e., initializing using the estimate from the previous calculation results.

3) In the following simulations and experiments, to reduce the computing complexity of the generic optimizers, the position of the quadrotor along 𝒆2\boldsymbol{e}_{2}^{\mathcal{I}} is separately controlled by PD controller as the proposed EER does.

Refer to caption
Figure 4: The schematic diagram of system structure in simulation

In all of the following simulation tests, the EER and generic optimizers are implemented as follows:

1) EER: with the trial and error approach, the two weighting matrices in the EER are selected as 𝑸1=diag{58,264,30,10},𝑸2=diag{40,30}\boldsymbol{Q}_{1}=\text{diag}\{58,264,30,10\},\boldsymbol{Q}_{2}=\text{diag}\{40,30\}. The position control gains along 𝒆2\boldsymbol{e}_{2}^{\mathcal{I}} are kp=2.0,kd=3.0k_{p}=2.0,k_{d}=3.0.

2) Generic optimizers: the terminal time is selected as tf=2t_{f}=2; the weighting gains are chosen as k1=k3=50k_{1}=k_{3}=50; the number of LG points in GPM is chosen as N=7N=7. The position control gains along 𝒆2\boldsymbol{e}_{2}^{\mathcal{I}} are the same as the EER. Boundary value problem in BVP and the NLP in GPM are solved with scipy.integrate.solve_\_bvp and scipy.minimize in Python 3.6, respectively.

IV-C Simulation Tests

In this section, two cases are carried out on the desktop computer to investigate the targeting performance and computational efficiency of the designed method.

Case 1: A first test is conducted, where the animal is moving along 𝒆1\boldsymbol{e}_{1}^{\mathcal{I}} at a constant speed, i.e., vtx=3m/sv_{tx}=3\,\text{m/s}, and the height of the targeted-point is 0.61m. At the initial moment, the animal is located at the origin and the quadrotor is hovering at [10,0,0.61]T[-10,0,0.61]^{T}. The comparative targeting results are depicted in Fig. 5, where TcT_{c} represents the computation time in each controller updating timestep. Because the positions along 𝒆2\boldsymbol{e}_{2}^{\mathcal{I}} of the compared methods are all controlled by the PD controller, there exists little difference in targeting performance between different methods in dyd_{y}. Therefore, dyd_{y} is omitted in the figure. The average computation times of EER, BVP and GPM are 0.07 ms, 3.88 ms, 8.40 ms, respectively. It can be seen that the computational efficiency of the EER is on the order of 120 times faster than GPM. The result can be interpreted as follows. For the generic optimizers, severe nonlinearities of the considered problem deteriorate the computational efficiency. In contrast, for the EER, the severe nonlinear characteristics is peeled off and only a quadratic optimal control problem is required to be solved. In such a case, the control input is generated analytically, which provides excellent profits for high computational efficiency.

Furthermore, it can also be noted that large overshooting and strong oscillation of zqz_{q} appears in generic optimizers. While with the EER, the height of the quadrotor zqz_{q} converges relatively smoothly and fast without overshooting. The reasons for this phenomenon lie in two aspects. Firstly, for the EER, thanks to its high computational efficiency, the control action can be updated in real-time and thus presents a smooth characteristic as shown in Fig. 5. In contrast, for generic optimizers, the long computation time leads to the trembling and discontinuity of the control inputs, deteriorating the targeting performance. Secondly, for the generic optimizers, according to Eq. (6), the vertical position of the quadrotor is coupled with the translational position as well as the pitch angle in the performance objective. Therefore, zqz_{q} cannot be constrained directly by the performance objective. On the contrary, the pitch angle and the translational position decoupled in the EER. Therefore, the vertical position of the quadrotor can be well constrained by the performance objective in a reasonable range. In this case, no oscillation, smaller overshooting as well as fast convergence appear in dx,dz,zqd_{x},d_{z},z_{q} when utilizing the proposed EER to target the animal.

Refer to caption
(a) Targeting error
Refer to caption
(b) Control inputs and cost time
Figure 5: Case 1 in simulation: the animal moves by constant speed
Refer to caption
(a) Targeting error
Refer to caption
(b) Control inputs and cost time
Figure 6: Case 2 in simulation: the animal moves by time-varying speed

Case 2: The second test is then carried out when the velocity of the animal is time-varying. Specifically, the animal’s speed is sinusoidal at a frequency of 0.5 Hz, ranging from 2.6 to 3. Similar to case 1, the quadrotor is hovering at [10,0,0.61]T[-10,0,0.61]^{T} and the animal is located at the origin initially. The targeting result is illustrated in Fig. 6. In this case, the average computation times of EER, BVP, and GPM are 0.1 ms, 3.95 ms, 8.57 ms, respectively. It can be seen that the computational efficiency of the EER is on the order of 85.7 times faster than GPM. It should be mentioned that the computation time of the same approach under the two cases is slightly different. This fluctuation in computation time is affected by some other parallel tasks executing on the desktop computer.

Additionally, in steady-state, the oscillations of zqz_{q} in the generic optimizers are larger than that in case 1. By comparison, for the EER, the vertical position of the quadrotor could also provide comparatively excellent performance. This result can be interpreted as follows. Thanks to the high computational efficiency of the proposed EER, it is reasonable to treat the animal’s velocity as a constant at each controller updating timestep in the system modeling. Therefore, the time-varying characteristic of the animal’s velocity has little effect on targeting performance in the EER. For the generic optimizers, due to the long computation time, assuming the animal velocity as a constant will lead to large modeling errors and worsen the targeting performance when the animal speed is time-varying.

V Experiment

In order to testify the feasibility of EER in practical application, real-time mimic biological experiments are carried out in this section.

V-A Experimental Setup

Refer to caption
Figure 7: Mimic biological experimental testbed

A mimic biological experimental test-bed is developed to verify the developed EER for the targeting task, as shown in Fig. 7. This test-bed consists of a quadrotor, a targeted animal and a Ground Control System (GCS). The airframe is a custom-developed quadrotor that weighs 1.98kg. Its attitude is controlled with an open-source controller Pixracer®, which mainly adopts the cascaded PID control technique. All control methods are implemented in an onboard computer, namely the Up-core®, and it communicates with the Pixracer® by MAVLink. The targeted animal is mounted on a small car, which is controlled by a remote controller.

The position and velocity of the quadrotor, as well as the targeted animal, is estimated by the Real-time Kinematic positioning(RTK). The RTK is a differential GNSS technique which provides centimeter accuracy in the vicinity of a base station. In addition, an LIDAR-Lite v3® is implemented to estimate the altitude of the quadrotor accurately. The quadrotor obtains the position of the targeted animal via a ROS bridge. The detailed implementations of the positioning system of the targeted animal are as follows: 1) The Ground Control System (GCS) first receives the messages from RTK, and then transmits them to CUAV V5+® by radio. 2) Then, the CUAV V5+® accepts the messages, and sends them to the NVIDIA® Jetson AGX Xavier by MAVROS. 3) AGX Xavier publishes the position and velocity of the targeted animal by ROS, and the ROS implemented on the Up-core of the quadrotor can subscribe to the messages.

In all of the following experiments, with a trial and error approach, the two weighting matrices regarding the developed EER are tuned as follows: 𝑸1=diag{116,441,87,18},𝑸2=diag{40,30}\boldsymbol{Q}_{1}=\text{diag}\{116,441,87,18\},\boldsymbol{Q}_{2}=\text{diag}\{40,30\}. The control gains of BVP and GPM are selected as k1=k3=2.5k_{1}=k_{3}=2.5. The selection of the other parameters in optimizations is consistent with the simulation.

V-B Experimental Results

Refer to caption
Figure 8: Case 1 in experiment: control inputs and computation time
Refer to caption
(a) Targeting error
Refer to caption
(b) Mean absolute error
Figure 9: Case 1 in experiment: targeting error

A first test is conducted, where the animal is located at the origin and the quadrotor is hovering at [3,0,0.7]T[-3,0,0.7]^{T} at the initial moment. The height of the desired point is 0.7 m and it then moves along 𝒆1\boldsymbol{e}_{1}^{\mathcal{I}}. Its speed increases from 0 m/sm/s to about 1.5 m/sm/s with nearly a uniform acceleration. The comparative targeting results are depicted in Fig. 8 and Fig. 9.

Fig. 8 depicts the control inputs and the computational time of the EER, BVP, and GPM. In this case, the average computation times of EER, BVP, and GPM are 0.27 ms, 38.12 ms, 108.87 ms, respectively. It can be seen that the EER can compute the control action on the order of 400 times faster than GPM on Up Core® at least. Particularly, EER can achieve a control frequency of thousands of hertz. Compared with the simulation, we can find that the computational efficiency of the EER is stabler than the generic optimizers on different platforms. To be specific, the computation time of the EER in the experiment is similar to that in simulation, while the computation time of the BVP and GPM is about tens or even hundreds of milliseconds slower than in simulation.

Fig. 9 illustrates the targeting error and the mean absolute error (MAE) of the three compared methods. Compared with the simulation, it can be inferred that in the experiment, the superiority of EER compared to the generic optimizers regarding targeting performance is more obvious. Specifically, the MAE of dzd_{z} with EER are smaller than those with the generic optimizers. In addition, regarding dx,dzd_{x},d_{z}, EER presents relatively smaller oscillation than generic optimizers, as shown in Fig. 9(a). This phenomenon can be interpreted as follows. Compared to the Gazebo environment, there exists inherent time delay and various unknown disturbances in the real system, and the processing capability of the onboard computer is not as good as that of the desktop computer in the simulation. Besides, the fluctuation of the animal’s velocity is larger than that of the simulation, leading to a larger modeling error in generic optimizers. In such a case, the computational efficiency in the real experiment is a more crucial issue than that in the simulation. However, due to severe nonlinearities of this optimal problem, generic optimizers cannot solve this problem with sufficient computational efficiency on the onboard computers, leading to a long time delay in the state feedback. Therefore, the targeting performance of the generic optimizers is worse than that of the proposed EER. The above results effectively demonstrate the significance of the computational efficiency for this targeting problem.

Refer to caption
Figure 10: Case 2 in experiment: control inputs and computation time
Refer to caption
Figure 11: Case 2 in experiment: targeting error

The second test is conducted, where the quadrotor is targeting a moving animal at a long distance. The animal is moving at speed close to 1.0 m/sm/s along 𝒆1\boldsymbol{e}_{1}^{\mathcal{I}} and is 9 mm away from the quadrotor. In the experiment, it can be observed that with the generic optimizers, the quadrotor flies very high and oscillates up and down, finally directly collides the animal or crash into the ground. Fig. 10 and Fig. 11 depict the comparative results, where the double slash denotes the crash time.

In Fig. 10, the control input and computation time of the proposed EER and generic optimizers are illustrated. The average computation time of EER, BVP, and GPM from 0 to 4.60 ss are 0.29 ms, 35.40 ms, 100.1 ms, respectively. It can be seen that the computational efficiency of the EER is on the order of 345 times faster than GPM. It can be seen that the generic optimizers fail around 4.6 s. This is because the tiger has speed at the initial moment and the initial error is large. Therefore, the targeting task in this case has higher real-time requirements. Due to the low computational efficiency of the generic optimizers, they cannot adjust the targeting error in time and leads to the crash.

Fig. 11 shows the compared targeting error. It should be stated that the targeting performance in dyd_{y} and dzd_{z} is mainly concerned, so the zoomed views of them are given particularly. Even in such a case, with the EER, the quadrotor quickly catches up with the animal with small oscillation and overshooting. Furthermore, after the system reaches a steady state, the proposed presents excellent targeting performance again. Particularly, the targeting error of dyd_{y} is between -0.2 and 0.1, and the targeting error of dzd_{z} is between -0.1 and 0.3, which can ensure the completion of the targeting task.

All the aforementioned simulation and experimental results demonstrate that the proposed EER has high computational efficiency either on a powerful desktop computer or an onboard computer and can achieve the best targeting performance than those of the generic optimizers.

VI Conclusion

In this paper, in order to continuously target a moving animal with a highly underactuated quadrotor, an efficient egocentric regulator is proposed. This strategy directly constructs the optimal tracking problem in an egocentric manner with regard to the quadrotor’s body coordinates and peels off the system nonlinearities through a mapping of the feedback states as well as control inputs, between the inertial and body coordinates. In this way, only a quadratic performance objective with linear constraint is required to be solved with an analytic evaluation. To verify the advantages and computational efficiency of the proposed strategy, mimic biological simulations and experiments are first carried out. Results demonstrate that the proposed EER can achieve faster convergence and smaller overshooting than those of the generic optimizers and its computational efficiency is highest and stablest than generic optimizers on different platforms. Particularly, on the onboard computer, the computation time of the EER is about 0.3 ms, which is on the order of 350 times faster than generic nonlinear optimizers and can accomplish a control frequency of thousands of hertz. In view of these convincing results, the development of this work is able to effectively enhance the computational efficiency for continuous targeting problems of quadrotors in practice.

References

  • [1] O. M. Cliff, D. L. Saunders, and R. Fitch, “Robotic ecology: Tracking small dynamic animals with an autonomous aerial vehicle,” Science Robotics, vol. 3, no. 23, p. eaat8409, 2018.
  • [2] K. Shah, G. Ballard, A. Schmidt, and M. Schwager, “Multidrone aerial surveys of penguin colonies in antarctica,” Science Robotics, vol. 5, p. eabc3000, 2020.
  • [3] A. Farinha, R. Zufferey, P. Zheng, S. F. Armanini, and M. Kovac, “Unmanned aerial sensor placement for cluttered environments,” IEEE Robotics and Automation Letters, vol. PP, no. 99, pp. 1–1, 2020.
  • [4] J. Dodd, “Zoo animal and wildlife immobilization and anesthesia,” Canadian Veterinary Journal, vol. 51, p. 622, 2010.
  • [5] K. Mohammadi, S. Sirouspour, and A. Grivani, “Control of multiple quad-copters with a cable-suspended payload subject to disturbances,” IEEE/ASME Transactions on Mechatronics, vol. 25, no. 4, pp. 1709–1718, 2020.
  • [6] V. P. Tran, F. Santoso, M. A. Garratt, and I. R. Petersen, “Adaptive second-order strictly negative imaginary controllers based on the interval type-2 fuzzy self-tuning systems for a hovering quadrotor with uncertainties,” IEEE/ASME Transactions on Mechatronics, vol. 25, no. 1, pp. 11–20, 2020.
  • [7] G. Chen, W. Dong, X. Sheng, X. Zhu, and H. Ding, “An active sense and avoid system for flying robots in dynamic environments,” IEEE/ASME Transactions on Mechatronics, vol. 26, no. 2, pp. 668–678, 2021.
  • [8] M. W. Mueller, M. Hehn, and R. D’Andrea, “A computationally efficient algorithm for state-to-state quadrocopter trajectory generation and feasibility verification,” in 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013, pp. 3480–3486.
  • [9] W. Dong, G. Y. Gu, Y. Ding, X. Zhu, and H. Ding, “Ball juggling with an under-actuated flying robot,” in IEEE/RSJ International Conference on Intelligent Robots & Systems, 2015, pp. 68–73.
  • [10] B. T. Hu and S. Mishra, “A time-optimal trajectory generation algorithm for quadrotor landing onto a moving platform,” in 2017 American Control Conference, 2017, pp. 4183–4188.
  • [11] B. Hu and S. Mishra, “Time-optimal trajectory generation for landing a quadrotor onto a moving platform,” IEEE/ASME Transactions on Mechatronics, pp. 1–1, 2019.
  • [12] C. Richter, A. Bry, and N. Roy, “Polynomial trajectory planning for quadrotor flight,” in International Conference on Robotics and Automation, 2013, pp. 69–88.
  • [13] G. M. Hoffmann, S. L. Waslander, and C. J. Tomlin, “Quadrotor helicopter trajectory tracking control,” in 2008 AIAA Guidance, Navigation and Control Conference and Exhibit, 2008, pp. 1–14.
  • [14] I. D. Cowling, O. A. Yakimenko, J. F. Whidborne, and A. K. Cooke, “A prototype of an autonomous controller for a quadrotor uav,” in Proceedings of the European Control Conference, 2007, pp. 1–8.
  • [15] Y. Bouktir, M. Haddad, and T. Chettibi, “Trajectory planning for a quadrotor helicopter,” 2008, p. 1258–1263.
  • [16] R. Sandeepkumar and R. Mohan, “Proceedings of the advances in robotics 2019,” in Flatness-based model predictive control of six degree of freedom fixed-wing UAV, 2019, pp. 1–6.
  • [17] M. Greeff and A. P. Schoellig, “Flatness-based model predictive control for quadrotor trajectory tracking,” in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018, pp. 6740–6745.
  • [18] Z. T. Dydek, A. M. Annaswamy, and E. Lavretsky, “Adaptive control of quadrotor uavs: A design trade study with flight evaluations,” IEEE Transactions on Control Systems Technology, vol. 21, no. 4, pp. 1400–1406, 2013.
  • [19] W. Dong, G. Y. Gu, X. Y. Zhu, and H. Ding, “High-performance trajectory tracking control of a quadrotor with disturbance observer,” Sensors and Actuators a-Physical, vol. 211, pp. 67–77, 2014.
  • [20] T. Luukkonen, “Modelling and control of quadcopter,” Independent research project in applied mathematics, Espoo, vol. 22, p. 22, 2011.
  • [21] D. A. Benson, G. T. Huntington, T. P. Thorvaldsen, and A. V. Rao, “Direct trajectory optimization and costate estimation via an orthogonal collocation method,” Journal of Guidance, Control, and Dynamics, vol. 29, no. 6, pp. 1435–1440, 2006.
  • [22] A. V. Rao, D. A. Benson, C. Darby, M. A. Patterson, C. Francolin, I. Sanders, and G. T. Huntington, “Algorithm 902: Gpops, a matlab software for solving multiple-phase optimal control problems using the gauss pseudospectral method,” ACM Transactions on Mathematical Software, vol. 37, no. 2, pp. 1–39, 2010.
  • [23] A. Rao, “A survey of numerical methods for optimal control,” Advances in the Astronautical Sciences, vol. 135, 2010.
  • [24] Y. Wang and S. Boyd, “Fast model predictive control using online optimization,” IEEE Transactions on Control Systems Technology, vol. 18, pp. 267–278, 2010.
[Uncaptioned image] Ziying Lin received the B.S. degree in mechanical engineering from Nanjing University of Science and Technology, Nanjing, China, in 2019. She is currently a Ph.D. candidate in the School of mechanical and engineering, Shanghai Jiao Tong University, China. Her research focuses on cooperation and control of unmanned system.
[Uncaptioned image] Wei Dong received the B.S. degree and Ph.D. degree in mechanical engineering from Shanghai Jiao Tong University, Shanghai, China, in 2009 and 2015, respectively. He is currently an associate professor in the School of Mechanical Engineering, Shanghai Jiao Tong University. His research interests include cooperation, perception and agile control of unmanned system.
[Uncaptioned image] Sensen Liu received the B.S. degree in mechanical engineering from Tongji University, Shanghai, China, in 2016. He is currently a Ph.D. candidate in the School of mechanical and engineering, Shanghai Jiao Tong University, China.His research focuses on aerial manipulation, gripper design, planning and control of unmanned aerial vehicles manipulation system.
[Uncaptioned image] Xinjun Sheng received the B.Sc., M.Sc.,and Ph.D. degrees in mechanical engineering from Shanghai Jiao Tong University, Shanghai, China, in 2000, 2003, and 2014, respectively. He is currently a Professor in the School of Mechanical Engineering, Shanghai Jiao Tong University. His current research interests include robotics, and bio-mechatronics. Dr. Sheng is a Member of the IEEERAS, the IEEEEMBS,and the IEEEIES.
[Uncaptioned image] Xiangyang Zhu received the B.S. degree from the Department of Automatic Control Engineering, Nanjing Institute of Technology, Nanjing, China, in 1985, the M.Phil. degree in instrumentation engineering and the Ph.D. degree in control engineering, both from Southeast Engineering, in 1989 and 1992, respectively. Since June 2002, he has been with the School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China, where he is currently a chair professor and the director of the Robotics Institute. His research interests include robotic manipulation planning, neuro-interfacing and neuro-prosthetics, and soft robotics. Dr. Zhu has received a number of awards including the National Science Fund for Distinguished Young Scholars from NSFC in 2005, and the Cheung Kong Distinguished Professorship from the Ministry of Education in 2007. He is serving on the editorial board of IEEE Transactions on Cybernetics and Journal of Bionic Engineering.