Model predictive control of agro-hydrological systems based on a two-layer neural network modeling framework

Zhiyinan Huang^a, Jinfeng Liu^a,, Biao Huang^a

^aDepartment of Chemical & Materials Engineering, University of Alberta,

Edmonton, AB T6G 1H9, Canada
Corresponding author: J. Liu. Tel: +1-780-492-1317. Email: [email protected].

Abstract

Water scarcity is an urgent issue to be resolved and improving irrigation water-use efficiency through closed-loop control is essential. The complex agro-hydrological system dynamics, however, often pose challenges in closed-loop control applications. In this work, we propose a two-layer neural network (NN) framework to approximate the dynamics of the agro-hydrological system. To minimize the prediction error, a linear bias correction is added to the proposed model. The model is employed by a model predictive controller with zone tracking (ZMPC), which aims to keep the root zone soil moisture in the target zone while minimizing the total amount of irrigation. The performance of the proposed approximation model framework is shown to be better compared to a benchmark long-short-term-memory (LSTM) model for both open-loop and closed-loop applications. Significant computational cost reduction of the ZMPC is achieved with the proposed framework. To handle the tracking offset caused by the plant-model-mismatch of the proposed NN framework, a shrinking target zone is proposed for the ZMPC. Different hyper-parameters of the shrinking zone in the presence of noise and weather disturbances are investigated, of which the control performance is compared to a ZMPC with a time-invariant target zone.

1 Introduction

Water scarcity is a rapidly escalating global issue due to various factors such as population growth and climate change. Approximately 70% of the freshwater is consumed by agriculture activities [1, 2]. Thus, improving the water-use efficiency in the agriculture industry is essential. Currently, practical irrigation policies are mostly open-loop, which are determined based on heuristic or empirical knowledge. Real-time feedback from the field is often not considered. This approach typically has low irrigation efficiency, as the applied irrigation amount can be imprecise, leading to over or insufficient irrigation.

Increasing attentions have been drawn to closed-loop irrigation over the past decade [3, 4]. Among various approaches, advanced control strategies are popular due to their ability of handling constraints and multiple objectives simultaneously [5, 6, 7, 8, 9, 10]. In [5, 6, 7], conventional model predictive control (MPC) is employed to optimize the soil-moisture dynamics in real-time in the presence of weather disturbances. Instead of tracking set-point, Mao et al. proposed an MPC controller that tracks a target zone (ZMPC), which provides more degrees of freedom in control actions [8]. Long-term irrigation scheduling that aims to optimize crop production over a longer time duration is investigated in [9, 10]. To be more specific, Nahar et al. proposed a hierarchical framework including a scheduler and a controller [9]. The scheduler has a larger sampling time and optimizes over the entire crop growth season. Some soil moisture reference is provided by the scheduler and is tracked by the controller over shorter horizons. Sahoo et al. proposed a knowledge-based scheduler that optimizes the irrigation amount and time explicitly and is shown to be effective under different scenarios [10].

One of the major challenges in applying advanced control to the agro-hydrological system is the high computational cost. The water dynamics are often described by the Richards equation, which is a nonlinear PDE developed based on first principles [11]. Spatial discretization is often required to simplify the Richards equation in applications [11], however, the resulting nonlinear ODE system is still difficult to solve and leads to states with higher dimensions.

Model reduction or approximation techniques are commonly used to reduce the computational complexity of high-dimensional nonlinear models. Model reduction is carried out based on existing models and aims to simplify it. One of the widely used approaches is to project the high dimension original state space onto a lower dimension subspace [12, 13], of which a representative method is the proper orthogonal decomposition (POD) [14]. However, the physical meanings of the original states are often not preserved by the projection approach, making it difficult to handle state constraints. Data-based model identification method is often used to approximate complex nonlinear models [8, 15, 16, 17] with no first-principle knowledge regarding the system required. The drawback, however, is that traditional model identification approaches require the model structure to be selected ahead of time, which may significantly affect the model performance. Furthermore, the computational cost of the identified model might still be significant depending on its structure. For example, in [8] a linear parameter-varying (LPV) model was identified for agro-hydrological dynamics approximation, where computational cost reduction was observed but still not sufficient for real-time online optimizations.

Model approximation based on neural network (NN) is a more generalized approach compared to traditional model identification. Motivated by the accuracy and computational efficiency of the NN model reduction observed in our previous work [18] and inspired by the structure of the LPV model developed in [8], we propose a two-layer NN framework to approximate the agro-hydrological dynamics in this work. The proposed framework is shown to have better open-loop prediction performance compared to a benchmark single long short term memory (LSTM) NN.

The developed two-layer NN model is employed in a ZMPC controller in the presence of noise and weather disturbances. The control objective is to minimize the amount of irrigation required while keeping the soil moisture content inside the target zone. In order to address the plant-model-mismatch, open-loop bias correction is added to the model, while a shrinking target zone is employed for ZMPC. Two bias correction models are investigated with different updating frequencies and the one with better open-loop prediction performance is used in ZMPC. The shrinking ZMPC target zone is designed such that the shape of the zone can be tuned by modifying the hyper-parameters, of which the effects on the control performance are investigated.

The paper is organized as follows. The agro-hydrological system of interest is introduced in Section 2. Details of the proposed two-layer NN framework are discussed in Section 3, of which the open-loop validation is provided in Section 4 with two plant-model-mismatch compensation strategies investigated in Section 5. The ZMPC controller employed is introduced in Section 6. The closed-loop simulation results are presented in Section 7. Finally, concluding remarks are provided in Section 8.

2 Preliminaries

A schematic of the agro-hydrological system of interest is shown in Figure 1 with grass being the crop of interest. The soil properties are assumed to be homogeneous horizontally and only the soil water dynamics on the vertical axis are considered. Irrigation is applied with a sprinkler and is assumed to reach the soil surface evenly. Precipitation and evapotranspiration are considered as known disturbances based on weather forecasting data. It is assumed that precipitation reaches the soil surface in identical ways as irrigation.

Refer to caption — Figure 1: A schematic diagram of the considered agro-hydrological system.

2.1 Soil Water Dynamics

Based on first principles, the soil water dynamics can be modeled using the 1-dimensional Richards equation with the crop and weather information taken into account [11]:

c(h)\frac{\partial h}{\partial t}=\frac{\partial}{\partial z}\bigg{[}K(h)\bigg{(}\frac{\partial h}{\partial z}+1\bigg{)}\bigg{]}-\alpha(h)\frac{K_{c}ET_{0}}{z_{r}}

(1)

where $h$ $[m]$ is the capillary potential, $c$ $[m^{-1}]$ denotes the soil capillary capacity, and $K$ $[m/s]$ is the soil hydraulic conductivity. The $-\alpha(h)\frac{K_{c}ET_{0}}{z_{r}}$ term captures the effect of weather and crop, where $\alpha(h)$ and $K_{c}$ are dimensionless factors, namely the water stress factor and the crop coefficient. $ET_{0}$ $[m]$ denotes the reference evapotranspiration and is assumed to be a known weather-dependent disturbance. $z_{r}=-0.13$ m represents the rooting depth of the crop and is assumed to be time-invariant in this work. Note that $z$ $[m]$ represents the vertical axis of the soil with upwards as the positive direction. $c(h)$ and $K(h)$ are modeled by the van Genuchten-Mualem soil hydraulic model [19]:

c(h)=(\theta_{s}-\theta_{r})\alpha n\bigg{(}1-\frac{1}{n}\bigg{)}(-\alpha h)^{n-1}[1+(-\alpha h)^{n}]^{\frac{1}{n}-2}

(2)

K(h)=K_{s}\Big{[}(1+(-\alpha h)^{n})^{-\big{(}1-\frac{1}{n}\big{)}}\Big{]}^{\frac{1}{2}}\Bigg{[}1-\bigg{[}1-\Big{[}(1+(-\alpha h)^{n})^{-\big{(}1-\frac{1}{n}\big{)}}\Big{]}^{\frac{n}{n-1}}\bigg{]}^{1-\frac{1}{n}}\Bigg{]}^{2}

(3)

where $K_{s}$ $[m/s]$ denotes the saturated hydraulic conductivity, $\theta_{s}$ $[m^{3}/m^{3}]$ and $\theta_{r}$ $[m^{3}/m^{3}]$ represent the saturated and the residual soil moisture, respectively. $\alpha$ $[m^{-1}]$ and $n$ are the van Genuchten-Mualem parameters related to the soil properties. Sandy loam soil is considered in this work and the corresponding parameters are adopted from [20] and are presented in Table 1.

Table 1: Soil parameters of sandy loamy soil

$K_{s}\,[m/s]$	1.23e-5
$\theta_{s}\,[m^{3}/m^{3}]$	0.41
$\theta_{r}\,[m^{3}/m^{3}]$	0.065
$\alpha\,[m^{-1}]$	7.5
$n$	1.89
$m$	0.47

2.2 Boundary Conditions

The boundary conditions employed in this work are presented in (4)–(5). The Neumann boundary condition is used at the soil surface as shown in (4), while free drainage is assumed for the bottom boundary as described in (5).

\left.\frac{\partial h(t)}{\partial z}\right|_{T}=-1-\frac{I(t)+P(t)}{K(h(t))}

(4)

\left.\frac{\partial h(t)}{\partial z}\right|_{B}=0

(5)

where $I(t)$ $[m/s]$ and $P(t)$ $[m/s]$ denote the rate of irrigation and precipitation at time $t$ . Note that interception caused by grass leaves is ignored in this work, only evapotranspiration, precipitation, and irrigation are considered.

2.3 Model Discretization

The system of equation (1) is discretized over the vertical spatial axis following the approach proposed in [11], which leads to an ODE system:

\dot{x}(t)=f(x(t),u(t))

(6)

where $x\in\mathbb{R}^{26}$ and $u\in\mathbb{R}^{1}$ are the state and input vectors respectively, while $f$ defines the nonlinear system dynamics. The total depth of soil considered is $0.5$ m and is evenly discretized into 26 nodes. The water dynamics at the center of each node are used to represent the dynamics of the entire node. In this work, irrigation rate $I(t)$ $[m/s]$ is the system input, and $h$ $[m]$ at the discretized nodes represents the system states. The system output $y\in\mathbb{R}^{1}$ is the volumetric water content $\theta$ $[m^{3}/m^{3}]$ at the rooting depth $z_{r}$ , which is an algebraic function of the state vector defined by the soil-water retention equation of van Genuchten (7). Eqn. (6) and Eqn. (7) are used to generate all training datasets.

\theta(h)=(\theta_{s}-\theta_{r})\bigg{[}\frac{1}{1+(-\alpha h)^{n}}\bigg{]}^{1-\frac{1}{n}}+\theta_{r}

(7)

3 Proposed two-layer neural network

To reduce the computational complexity of the first-principle model, a two-layer NN framework is proposed to approximate its nonlinear dynamics. In this section, details regarding the proposed framework will be discussed. Section 3.1 introduces the motivation and provides an overview of the proposed framework. Sections 3.2 and 3.3 discuss the data generation procedure for the individual layers of the framework respectively.

3.1 Overview

In this work, LSTM NNs will be used to approximate the original nonlinear system dynamics due to their ability of efficiently handle sequential data [21, 22]. The motivation for designing the proposed framework is that a single LSTM is not able to capture the complex nonlinear dynamics of the agro-hydrological system over the entire operating range of interest. It was noticed, however, within a smaller operating region, a single LSTM provides satisfying modeling performance. Detailed simulation results will be presented in Section 4.

A schematic diagram of the proposed framework is presented in Figure 2. The first layer contains $K$ sub-models, namely Model $1$ – Model $K$ , which will be referred to as $M_{1},M_{2},\cdots,M_{K}$ in the following text. The second layer consists of a single model (Model A, referred to as $M_{A}$ ). Each sub-model in the first layer is responsible for a sub-operating region. The union set of the sub-operating regions forms the entire operating region of interest. $M_{1}-M_{K}$ are LSTMs while $M_{A}$ is a fully connected NN. At the current time step $t$ , $M_{1}-M_{K}$ take the system input $u$ and output $y$ obtained from $p$ previous time steps as NN inputs:

NN_{in}(t)=\left[\begin{array}[]{cc}u(t-p+1),y(t-p+1)\\ u(t-p+2),y(t-p+2)\\ \cdots\\ u(t),y(t)\\ \end{array}\right],

and returns $K$ one-step-ahead predictions regarding the system output:

NN_{out_{1}}=[\bar{y}_{1}(t+1),\cdots,\bar{y}_{K}(t+1)]^{T}

Recall that each sub-model is only reliable in a sub-operating region. Inspired by Mao et al. [8], the proposed framework adopts the idea of the LPV models. The model performance can be improved by combining predictions made by multiple sub-models. The one-step-ahead predictions made by $M_{1}-M_{K}$ are fed to $M_{A}$ , where the final prediction of the system output at the next time step ( $NN_{out_{2}}=y_{pred}(t+1)$ ) is obtained.

Overall, at any time step $t$ , the proposed framework takes $NN_{in}(t)$ as input and returns the one-step-ahead prediction of the system output $y(t+1)$ , which can be expressed as a nonlinear function:

\hat{y}(t+1|t)=g(NN_{in}(t))

(8)

3.2 Design and training of first-layer NNs

Table 2: Operating ranges of the sub-models

	Operating range	# of LSTM layers
$M_{1}$	$y\in[0.12,0.27]$	1
$M_{2}$	$y\in[0.21,0.32]$	1
$M_{3}$	$y\in[0.29,0.40]$	2

As discussed in the previous section, the first layer of the proposed framework consists of $K$ LSTMs. In this work, $K=3$ is used. The operating regions in terms of the system output for each sub-model are summarized in Table 2. The number of LSTM layers employed in the models is also included. Each model consists of 1 or 2 LSTM layers, followed by a fully connected layer. Activation functions ‘sigmoid’ and ‘tanh’ are used. Both activation functions have smooth dynamics, where ‘tanh’ function provides steeper responses compare to ‘sigmoid’ function. Each sub-model is equivalent to a nonlinear function:

m_{i}:\mathbb{R}^{20\times 2}\rightarrow\mathbb{R}^{1},\quad i=1,2,3

where the model input is a matrix with $20$ rows and $2$ columns, and the output is a scalar. $20$ is the number of the past input-output data points ( $p=20$ ), while $2$ is the number of system input ( $n=1$ ) plus the number of system output( $l=1$ ). All models have a sampling time of 2 hours.

Three training datasets that vary in the corresponding operating region are generated for the sub-models respectively. Each dataset contains 30000 data points. Multi-level pseudorandom input signals are employed to generate all three datasets, which are fed into the discrete ODE system (6) and the algebraic equation (7) for open-loop simulations. A multi-level pseudorandom signal is similar to a pseudorandom binary signal, except that it has multiple levels, which helps to stimulate the nonlinear dynamics of the agro-hydrological system. More details regarding the design of multi-level pseudorandom inputs can be found in our previous work [18]. Figure 3 presents a segment of the training input signal and the corresponding system output for $M_{2}$ . Note that a random noise signal $\epsilon$ is added to the output trajectory to mimic real process operations, which is shown in Figure 3(b) in red. The max magnitude of $\epsilon$ is defined as follows:

\epsilon_{max}=0.1[y_{max}-y_{min}]

where $y_{max}$ and $y_{min}$ are the maximum and minimum value of the system output in the dataset.

All datasets need to be reorganized to match the input-output structure of the NNs. Scaling is also required such that the magnitudes of the scaled data are less than 1. All simulations are carried out in Python. TensorFlow [23] is used to train the NN models. Readers may refer to our previous work [18] for more details regarding data pre-processing and NN training.

3.3 Design and training of second-layer NN

The second layer of the proposed framework consists of a fully connected NN equivalent to the following nonlinear function:

m_{A}:\mathbb{R}^{3}\rightarrow\mathbb{R}^{1}

The inputs to model $M_{A}$ are the one-step-ahead predictions of the system output made by $M_{1}$ , $M_{2}$ and $M_{3}$ , while the output is the actual system output prediction. Two fully connected layers with the activation function ‘sigmoid’ are used in $M_{A}$ .

Under the objective of predicting the system output over the entire operating range of interest, the training dataset of $M_{A}$ covers the entire operating range. A section of the training dataset including 100000 points is presented in Figure 4, where the input signal and the corresponding output trajectories without and with white noise added are presented. Note that the training dataset of $M_{A}$ spans all the sub-operating regions with more aggressive dynamics. Instead of continuous irrigation with lower magnitudes, an impulse irrigation signal is used to provide a more aggressive output response that varies in a larger operating range, which is desired for $M_{A}$ training.

Similar to that discussed in the previous section, reconstruction and scaling are necessary for this dataset. Note that the inputs to $M_{A}$ are not the system input and output. The dataset is passed through the pre-identified sub-models after the data pre-processing. The predictions made by the sub-models are collected, which are the designed inputs for $M_{A}$ and are used to train $M_{A}$ .

4 Model Validation

The prediction performance of the proposed framework is validated in this section. The maximum number of prediction steps is $N=20$ for this work, which is also the control horizon of interest for following ZMPC applications. Section 4.1 presents the validation results of the sub-models. The proposed framework is then compared with a single LSTM in Section 4.2.

4.1 Validation of the first-layer sub-models

Table 3: NRMSE of multi-step-ahead-predictions of the sub-models

# of prediction steps	$M_{1}$	$M_{2}$	$M_{3}$
1	0.015	0.016	0.073
10	0.042	0.049	0.111
20	0.060	0.050	0.111

The multi-step-ahead prediction performances of $M_{1}-M_{3}$ are presented in this section. Validation datasets are obtained in similar manners as the training datasets with 20% white noise added to the system output. This helps to better test the prediction performance of the models. At any given time step, noise-treated validation data are fed to the model for one-step-ahead prediction. Starting from the second time instant, the prediction made by the model at the previous time instant will be used as the initial condition of the current time instant. This procedure is repeated 20 times, providing twenty-steps-ahead predictions.

The prediction performance is measured by the normalized root mean square error (NRMSE) of the multi-step-ahead predictions in the desired operating range of each model and is summarized in Table 3. The NRMSE is defined as follows:

NRMSE=(y_{max}-y_{min})^{-1}\sqrt{\frac{\sum_{i=1}^{t_{f}}(y_{act}(i)-y_{pred}(i))^{2}}{t_{f}}}

(9)

where $t_{f}$ is the total number of validation data points, $y_{act}$ and $y_{pred}$ are the true and predicted system outputs, respectively. Figures 5, 6, and 7 present the multi-step-ahead open-loop prediction performance of $M_{1}$ , $M_{2}$ , and $M_{3}$ respectively. The same line styles are used in Figures 5–7 for consistency.

All three models provide reasonable multi-step-ahead prediction without delays in the presence of noise. When the magnitude of the system output is low, small prediction offsets can be observed for $M_{1}$ in Figure 5. Figure 6 shows only some minor offsets exist at the local extremes for $M_{2}$ . For both $M_{1}$ and $M_{2}$ , the effect of noise seems to be minor. For $M_{3}$ , Figure 7 indicates that the noise has a stronger impact on the single-step-ahead prediction, which deviates and becomes insignificant as the number of prediction steps increases. It is also noticed that when the output converges to steady-states, the impact of the noise is the most significant. Similar to $M_{1}$ and $M_{2}$ , $M_{3}$ also has some minor offsets at the local extremes.

For different operating ranges, different system output dynamics are observed. When the soil moisture content and the irrigation rate are low, no obvious steady-state is observed (Figure 5). As the operating range shifts to the higher magnitude side, the soil moisture content tends to saturate to steady-states under continuous irrigation (Figure 7).

4.2 Validation of the proposed two-layer framework

The multi-step-ahead prediction performance of the proposed framework is investigated in this section. A single LSTM that aims to predict the agro-hydrological dynamics over the entire operating region is used as the benchmark approximation model. The benchmark LSTM has the same input ( $NN_{in}$ ) and output ( $NN_{out_{2}}$ ) structures as the proposed two-layer NN framework, which are discussed in Section 3.1. The single LSTM has 2 LSTM layers followed by a fully connected output layer. Both the LSTM layers use the activation function ‘sigmoid’ while the output layer uses the activation function ‘tanh’. Note that the dataset employed for training the benchmark LSTM is the same as that used to train the two-layer NN model.

The same validation dataset is used to compare the performance of the two approximation approaches, which spans the entire operating range of interest. The open-loop prediction trajectories of the system output based on the benchmark LSTM and the proposed framework are presented in Figures 8 and 9 respectively. The single-step-ahead, ten-step-ahead, and twenty-step-ahead predictions are presented in each figure with the same line styles as those employed in the previous section. The NRMSE values under the two cases and the percentage difference $\triangle$ between them are summarized in Table 4. The percentage difference is calculated with respect to the NRMSE of the benchmark LSTM:

\triangle=\frac{NRMSE_{LSTM}-NRMSE_{proposed}}{NRMSE_{LSTM}}

As presented in Figure 8, the LSTM can respond to aggressive dynamics efficiently without delays. However, significant offsets are observed at the local extremes with larger magnitudes. A section of the trajectories is amplified, which can be seen on the top-right corner of Figure 8. At both the local maximum and the minimum, significant offsets can be observed in the zoomed-in plot, of which the predicted local minimums are at the level of the true local maximum. Better prediction performance is observed in Figure 9. To be specific, the dynamics at the local minimums can be captured by the proposed two-layer framework much better compared to the LSTM. From the zoomed-in window on the top right, it can be observed that the predictions of the local minimum and maximum are more accurate compared to the LSTM. Table 4 indicates comparison results, showing that the NRMSE values of the proposed framework are significantly lower than those obtained with the LSTM for all different prediction steps. The improvement of the proposed framework is the most significant with one-step-ahead predictions and slowly reduces as the prediction steps increase.

Table 4: NRMSE of multi-step-ahead-predictions of the NN models

# of Pred. steps	Single LSTM	two-layer NN	% Diff. $\triangle$
1	0.076	0.044	42.5
10	0.096	0.082	14.5
20	0.100	0.093	7.14

It is noticed that the proposed framework cannot accurately approximate the local maximums with larger magnitudes. To account for this problem, we introduce a bias-and-scale-updated model to capture the plant-model-mismatch, of which the details are provided in the following section.

5 Plant-Model-Mismatch Correction

To capture the plant-model-mismatch, two correction approaches are investigated based on the proposed two-layer NN framework. The first approach is to add a simple bias term $b_{1}$ based on the prediction error:

y_{act}=y_{pred}+b_{1}

(10)

where $y_{act}$ and $y_{pred}$ represent the actual output measurement and the prediction made by the two-layer NN framework respectively.

The second approach is to fit a first order linear model that maps the predicted output to the actual measurement:

y_{act}=a\cdot y_{pred}+b_{2}

(11)

where $a$ and $b_{2}$ are parameters to be identified.

For both approaches, an updating frequency $f$ of the parameters needs to be determined first. In this work, we are interested in accurately predicting the system dynamics 20 steps into the future, thus the updating frequency should be a divisor of 20. To test the performance of the two approaches, the validation dataset used in the previous section is employed. The updating policies of the two approaches are presented in Algorithms 1 and 2, respectively. In both algorithms, $t$ represents the current time step, $i$ denotes the step of predictions made ahead of time $t$ , and $\eta(i)$ represents the prediction error at step $i$ .

At any given time instant $t$ , the parameters are first initialized with some initial guess. The initial condition $NN_{in}(t)$ is provided to the proposed framework for multi-step ahead predictions. In the next $N=20$ steps, predictions will be made with the correction model considered, which will be updated every $f$ time steps for $\frac{N}{f}$ times.

The updating policy of the first approach is to take the average of the accumulated prediction errors over the last $f$ prediction steps. For the second approach, a constrained optimization problem that finds the best-fit parameters for the linear model is solved. Note that the parameters are held in bounded compact sets, which help to limit the parameters in a reasonable range. To be more specific, $a\in[0.8,1.5]$ and $b_{2}\in[-0.2,0.3]$ are used. If a simple linear-regression function is employed, the resulting parameters might be impractical. The toolbox CasADi [24] is employed to solve the optimization problems in Python.

Algorithm 1 Parameter updating algorithm for the first correction approach

Set the number of prediction steps

N=20

Set the updating frequency

f=f

Initialize

b_{1}=0

At each time step of the dataset t, set

i=1

Provide the model with the initial condition

y_{pred}(0)=y_{act}(t)

while

i\leq N

Predict the future system output

y_{pred}(i)=NN_{out_{2}}(i)+b_{1}

Record the error with respect to the actual output value

\eta(i)\leftarrow(y_{act}(t+i)-y_{pred}(i))

f

is a divisor of

i

then

Update

b_{1}=\frac{1}{f}\sum_{j=i-f+1}^{i}\eta(j)

end if

Set

i=i+1

end while

Algorithm 2 Parameter updating algorithm for the second correction approach

a\in\mathbb{A}

b_{2}\in\mathbb{B}

Set the number of prediction steps

N=20

Set the updating frequency

f=f

Initialize

a=1

b_{2}=0

At each time step of the dataset t, set

i=1

Provide the model with the initial condition

y_{pred}(0)=y_{act}(t)

while

i\leq N

Predict the future system output

y_{pred}(i)=a\cdot NN_{out_{2}}(j)+b_{2}

f

is a divisor of

i

then

Update

a

and

b_{2}

by solving the optimization problem

a^{*},b_{2}^{*}=\min_{a,b_{2}}\;\sum_{j=i-f+1}^{i}(y_{act}(t+j)-y_{pred}(i))

end if

end while

Recall the prediction horizon of interest $N=20$ . Thus updating frequencies $f=1,2,5,10$ are tested and compared. The correction performance is measured by the average absolute prediction error $\upsilon_{mm}$ :

\upsilon_{mm}=\frac{1}{N\cdot t_{f}}\sum^{t_{f}}\sum^{N}|y_{act}-\hat{y}_{pred}|

where the same definitions are kept for all variables. The absolute prediction errors of the two correction approaches are summarized in Table 5. The absolute prediction error of just the proposed NN framework is as well included in Table 5 and is used as the comparison benchmark.

Table 5: Performance of the mismatch correction approach in terms of average absolute error

$f$	No corr.	Single bias	Linear Model
1	0.101	0.066	-
2	0.101	0.064	0.063
5	0.101	0.195	0.090
10	0.101	0.265	0.098

With $f=5,10$ , the single-bias correction approach leads to more significant errors, which is not applicable. The linear model correction shows very minor performance improvements. When $f=2$ , similar improvements in the prediction performance are observed in both approaches. The single bias correction performs similarly for $f=1$ and $f=2$ . For the linear model correction, $f=1$ is not applicable. Considering the computational complexity of the two approaches, the single-bias correction with $f=2$ is selected.

6 Controller Design

A ZMPC controller with a shrinking target zone is employed in this work and is presented in this section. The major target here is to provide sufficient irrigation such that the crop can live and grow healthily, which implies the soil moisture content at the crop root zone needs to be kept in a bounded range instead of one particular point. Compared to a conventional tracking MPC, ZMPC provides more degrees of freedom to the system and can handle economic objectives better, which fits the objective of this work.

In the basic design of a ZMPC controller, the target zone is kept time-invariant over the control horizon. In this work, we propose a ZMPC design with a time-variant target zone that shrinks over the control horizon to handle plant-model-mismatch. The proposed ZMPC formulation is presented as follows:


$\displaystyle\min_{u(t_{j}\|t_{i})\in S(\triangle)}\;$	$\displaystyle\sum_{j=i}^{i+N-1}\|\|\hat{y}(t_{j}\|t_{i})-y_{z}(t_{j}\|t_{i})\|\|^{2}_{Q}+\|\|u(t_{j}\|t_{i})\|\|^{2}_{R}$	(12a)
$\displaystyle\textmd{s.t.}\ :\;$	$\displaystyle\hat{y}(t_{j+1}\|t_{i})=g(NN_{in}(t_{j}\|t_{i})),\quad j=i,i+1,\cdots,i+N-1$	(12b)
	$\displaystyle\hat{y}(t_{i}\|t_{i})=y(t_{i})$	(12c)
	$\displaystyle u(t_{j}\|t_{i})\in\mathbb{U},\quad j=i,i+1,\cdots,i+N-1$	(12d)
	$\displaystyle\hat{y}(t_{j}\|t_{i})\in\mathbb{Y},\quad j=i,i+1,\cdots,i+N-1$	(12e)
	$\displaystyle y_{z}(t)\in\mathbb{Y}_{z}(t_{j}),\quad j=i,i+1,\cdots,i+N-1$	(12f)

where $u$ , $y$ are the system input and output vector, $y_{z}$ is the zone tracking slack variable. $t_{i}$ represents the current time step, $u(t_{j}|t_{i})$ and $\hat{y}(t_{j}|t_{i})$ denote the system input and output at future time $t_{j}$ predicted at the current time $t_{i}$ . (12a) is the objective function, where $Q$ and $R$ are diagonal weighting matrices for the zone tracking objective and the irrigation amount respectively. The control objective is to drive the system into a particular operating zone while minimizing the irrigation amount at the same time. (12b) represents the two-layer NN framework discussed in the previous section, which is used to predict the system dynamics under the selected input trajectory over the control horizon $N$ . (12c) defines the initial state $y(t_{i})$ at time $t_{i}$ . (12d) - (12f) are the state, output and the zone tracking constraints, where $\mathbb{U}$ , $\mathbb{Y}$ , and $\mathbb{Y}_{z}$ are compact sets. The time-variant target zone $\mathbb{Y}_{z}(t_{j})$ is a subset of $\mathbb{Y}$ and is defined as follow:


$\displaystyle\mathbb{Y}_{z}(t_{j})$	$\displaystyle=\{y(t)\|\underline{y}(t_{j})\leq y(t_{j})\leq\overline{y}(t_{j})\},\quad j=i,i+1,\cdots,i+N-1$	(13a)
$\displaystyle\underline{y}(t_{j})$	$\displaystyle=\min{\Big{[}\underline{y}(t_{i})\cdot e^{\mu\cdot\frac{t_{j}-t_{i}}{N}},\underline{Y}\Big{]}}$	(13b)
$\displaystyle\overline{y}(t_{j})$	$\displaystyle=\max{\Big{[}\overline{y}(t_{i})\cdot e^{-\mu\cdot\frac{t_{j}-t_{i}}{N}},\overline{Y}\Big{]}}$	(13c)

where $\underline{y}(t_{j})$ and $\overline{y}(t_{j})$ represent the lower and upper bound of the target zone at time step $t_{j}$ . The boundaries of the target zone varies over time exponentially from their initial value $\underline{y}(t_{i})$ and $\overline{y}(t_{i})$ until they are converged to the final value $\underline{Y}$ and $\overline{Y}$ , where $\underline{Y}\leq\overline{Y}$ . $\mu$ is a non-negative scalar that determines the shrinking speed of the boundaries. Larger $\mu$ leads to more aggressive shrinkage and vise versa. When $\mu=0$ , the target zone becomes time-invariant (i.e. $\underline{y}(t_{j})=\underline{y}(t_{i})$ , $\overline{y}(t_{j})=\overline{y}(t_{i})$ ). In this work, $\underline{y}(t_{i})=0.18$ and $\overline{y}(t_{i})=0.23$ are used.

Note that the magnitude of the system input $u$ is on the scale of $10^{-8}$ to $10^{-5}$ , which poses numerical issues for solving optimizations problems. Thus $u$ is scaled before feeding to the optimization solver. The scaled input $u_{scaled}$ takes values between $0$ and $1$ . Unless specified otherwise, $Q=4000$ , $R=100$ are employed.

$N=20$ is used in this work. The sample time $\Delta$ of the optimization is $2$ hours (i.e. $\Delta=7200$ s), meaning that at a given time instant $i$ , the agro-hydrological dynamics in the next $40$ hours will be optimized ( $u^{*}=[u^{*}(t_{i}|t_{i}),u^{*}(t_{i}+1|t_{i}),\cdots,u^{*}(t_{i}+N|t_{i})]$ ). Following the receding horizon control strategy, only the optimal input at the first step $u^{*}(t_{i}|t_{i})$ will be applied to the system. The optimization problem is solved repeatedly over the horizon of interest $N_{sim}=60$ , which optimizes the system dynamics over a $5$ days duration.

7 ZMPC Simulation Results

ZMPC simulation results under various operating conditions are presented in this section. The same initial condition is employed for all simulations for fair comparisons. To be more specific, $x_{0}=-0.2\;[m]$ is the initial state at the rooting depth and is employed for the discretized Richards equation, while the corresponding system output $y_{0}=0.266\;[m^{3}/m^{3}]$ is used for the NN models. First, in Section 7.1, the control performance of the proposed framework is compared with the discretized Richards equation and the single LSTM discussed in previous sections. Simulations are performed under the controller defined by (12) with a time-invariant target zone ( $\mu=0$ ) in the presence of process and/or measurement noises. Then the performance of the two-layer-NN-based ZMPC with a shrinking target zone in the presence of significant noise and weather disturbance is presented in Section 7.2. The effects of tuning parameters $\mu$ , $\underline{Y}$ , and $\overline{Y}$ in ZMPC performance are investigated. The basic ZMPC with a time-invariant target zone is employed as the benchmark controller for performance comparison.

7.1 Optimization Performance Validation

In this section the performance of the proposed framework in ZMPC applications is validated by comparing the simulation results with those obtained based on the discretized Richards equation and the single LSTM. Table 6 summarizes the average computational time for solving the optimization problem, the economic performance of ZMPC simulations, and the percentage difference in economic performance based on different models. The economic performance is measured by the accumulated irrigation amount $I_{T}$ in millimeters, which is calculated as follows:

I_{T}=\sum_{t=0}^{N_{sim}}u(t)\cdot\Delta\cdot 1000

where $u(t)$ is the irrigation rate at time $t$ with the unit $m/s$ . The percentage difference in the accumulated irrigation is calculated with respect to the result obtained based on the Richards equation. The optimal control strategy and the corresponding output trajectories obtained based on different models are presented in Figure 10. The results shown in Table 6 and Figure 10 are obtained in the presence of 2% process noise. Figure 11 presents the optimal system input and output trajectories obtained with 5% process and 5% measurement noise based on the single LSTM and the proposed framework. The colors and markers employed in Figure 11 are the same as those in Figure 10. Table 7 summarizes more ZMPC simulation results based the single LSTM and the proposed framework in the presence of process and measurement noise.

Table 6: ZMPC simulation results in the presence of 2% process noise.

Model	Avg. Time (s)	$I_{T}$ (mm)	% Diff. in $I_{T}$
Richards Eqn.	493	8.33	-
Single LSTM	12.5	11.8	41.4
two-layer NN	35.4	9.05	8.66

Table 7: ZMPC simulation results in the presence of process noise (P.N.) and measurement noise (M.N.).

Noise	Model	$I_{T}$ (mm)
2% M.N.	Single LSTM	11.6
2% M.N.	two-layer NN	8.74
5% P.N.	Single LSTM	12.0
5% P.N.	two-layer NN	9.54
5% M.N.	Single LSTM	12.0
5% M.N.	two-layer NN	9.91
2% P.N. & 2% M.N.	Single LSTM	11.7
2% P.N. & 2% M.N.	two-layer NN	9.46
5% P.N. & 5% M.N.	Single LSTM	12.1
5% P.N. & 5% M.N.	two-layer NN	11.2

As presented in Table 6, both the single LSTM and the proposed framework help to reduce the computational cost significantly. The computational cost reduction achieved by the single LSTM is more significant compare to the two-layer NN due to the simplicity of its structure. With only 8.66% deviation, the economic performance of controller using the two-layer NN is similar to that using the Richards equation, which is significantly better compared to that obtained with the single LSTM. The trajectories in Figure 10 indicate the same, as the optimal control law and the system response of the two-layer NN and the Richards equations are highly overlapped. Based on the Richards equation, no irrigation is applied for the first a few steps and starts to increase slowly over time. The two-layer NN leads to similar control strategy with the minor differences caused by the plant-model-mismatch. As for the single LSTM, the irrigation policy is less efficient, which is nonzero over the entire horizon. The output trajectories converge to the lower bound of the target zone for simulations based on the Richards equation and the proposed framework. The responses are reasonable as the economic objective is to reduce the amount of irrigation required. On the other hand, for the single-LSTM-based ZMPC, slower convergence and offsets with respect to the lower bound of the target zone is observed in the optimal output trajectory.

From Tables 6 and 7, it is observed that compared to process noise, measurement noise has stronger impacts on the simulation results. As shown in Table 7, except for the case that considered 5% process noise and 5% measurement noise, control with the two-layer NN has significantly better economic performances compared to that using the single LSTM. As shown in Figure 11, in the presence of 5% process noise and 5% measurement noise, the two-layer NN provides more aggressive control policy to capture the effect of the noise compare to the single LSTM. However, the impact of noise in this case is more significant compare to the control action due to the slow-responding nature of the agro-hydrological system, as the output trajectories obtained based on the two models are similar. To summarize, the proposed framework is shown to have better performances compared to the single LSTM in ZMPC applications. However, some minor violations of the target zone is observed at the end of the horizon of interest (Figure 11) due to the effect of noise, which is not ideal as low soil moisture may cause severe consequences in crop growing. Furthermore, precipitation is an essential disturbance in real-world applications and should be considered. In the following section, we investigate the performance of a shrinking target zone in capturing the effect of significant noise and weather disturbances.

7.2 Shrinking Zone v.s. Time-invariant Zone

In this section, the performance of the proposed shrinking-zone ZMPC and the ZMPC with time-invariant zone (referred to as the basic ZMPC in the following text) is compared in the presence of 5% measurement noise and weather disturbance. The controller is assumed to have information regarding the weather through weather forecasting with some uncertainties. In this work we consider 20% error in weather forecasting. Various parameter combinations are investigated. In Section 7.2.1, $\mu$ is kept constant and the effect of $\underline{Y}$ and $\overline{Y}$ discussed. The opposite is done in Section 7.2.2, where $\underline{Y}$ and $\overline{Y}$ are kept constant with varying $\mu$ .

7.2.1 Tuning $\underline{Y}$ and $\overline{Y}$

Table 8: ZMPC simulation results with different

\underline{Y}

and

\overline{Y}

Target Zone	$I_{T}$ (mm)	% Diff. - $I_{T}$	Zone Viol. ( $m^{3}/m^{3}$ )	% Diff. - Zone Viol.
Time Inv. Z.	6.16	-	0.007	-
$\underline{Y}=\overline{Y},\mu=1$	9.91	61.0	0.0005	93.1
$\underline{Y}<\overline{Y},\mu=1$	9.53	54.9	0.001	85.1

Figure 12 presents two sets of shrinking zones defined with different $\underline{Y}$ and $\overline{Y}$ . The darker zone bounded by dashed boundaries has $\underline{Y}=\overline{Y}=0.205$ , meaning that the target zone shrinks to the center point of the original zone. The lighter zone bounded by dashed-dotted boundaries terminates to a zone with $\underline{Y}=0.20$ , $\overline{Y}=0.21$ . For both controllers, $\mu=1$ is used. The ZMPC results are shown in Figure 13. The results obtained based on the basic ZMPC (i.e. $\underline{Y}=0.18$ , $\overline{Y}=0.23$ ) is employed as the performance benchmark. The results obtained based on the two shrinking zone design are presented with the same line style as those used in Figure 12. The economic and zone tracking performance of the ZMPC controllers are summarized in Table 8. The percentage difference of the total amount of irrigation and zone tracking performances are calculated based on the basic ZMPC controller.

From Figure 13, it is observed that with a time-invariant zone, the controller is able to capture the effect of the weather disturbance in general. When precipitation is foreseen, the irrigation amount is reduced. However, minor violation of the target zone at the lower bound is still observed due to the presence of noise. As presented in the figure, the usage of shrinking zone helps to avoid the violation. Instead of oscillating around the lower bound of the target zone, the output trajectories converge toward the center of the zone. This provides more robustness in the controller in the presence of significant noise. It is also noticed however, the employment of shrinking zones has significantly affect the total amount of irrigation. As summarized in Table 8, the two shrinking zones considered increase the accumulated irrigation amount by 61.0% and 54.9% respectively. Figure 13 and Table 9 indicate that the output responses based on the two shrinking zone designs are very similar, while the design that terminates to a zone performs better economically. Thus, in the following section we investigate the effect of $\mu$ with $\underline{Y}=0.20$ , $\overline{Y}=0.21$ .

7.2.2 Tuning $\mu$

Table 9: ZMPC simulation results with different

\mu

Target Zone	$I_{T}$ (mm)	% Diff. - $I_{T}$	Zone Viol. ( $m^{3}/m^{3}$ )	% Diff. - Zone Viol.
Time Inv. Z.	6.16	-	0.007	-
$\underline{Y}<\overline{Y},\mu=1$	9.53	54.9	0.001	85.1
$\underline{Y}<\overline{Y},\mu=0.7$	8.45	37.2	0.003	62.0
$\underline{Y}<\overline{Y},\mu=0.5$	7.46	21.2	0.005	40.4

Three shrinking zone designs with different $\mu$ are shown in Figure 14. $\mu=1$ provides the fastest shrinking speed and thus the smallest zone. The second largest zone bounded by dotted lines has $\mu=0.7$ , while the largest zone with solid boundaries has $\mu=0.5$ . The shrinking process becomes less aggressive as $\mu$ decrease. Figure 15 shows the optimal control strategies and the corresponding output responses of the three ZMPC controllers. The same line styles as those used in Figure 14 are used in Figure 15. The total irrigation consumption of the three controllers is summarized in Table 9, where the result obtained with the basic ZMPC is again presented as the performance benchmark for consistency.

From Figure 15, it is observed that as the magnitude of $\mu$ decrease, the output response shifts towards the lower bound of the original target zone. The target zone violation thus becomes more significant, which is consistent with the results presented in Table 9. It is also noticed that with reducing $\mu$ value, the total irrigation consumption reduces. In summary, in the presence of noise and disturbances, a trade off exists between the amount of irrigation applied and the zone-tracking performance. Depending on the priority of the control objectives, parameters can be tuned accordingly. If some slight zone violation is allowed, a time-invariant target zone or a shrinking zone with smaller $\mu$ and greater difference between $\underline{Y}$ and $\overline{Y}$ can be employed for better economic performance. Vice versa, a shrinking zone that shrinks more significantly over the control horizon (i.e. larger $\mu$ and narrower terminating zone) can be used if the zone-tracking objective is superior compared to the economic performance.

8 Conclusion

A two-layer NN framework is proposed in this work to approximate the agro-hydrological system of interest, which is shown to be superior to a single LSTM for both open-loop prediction and closed-loop control applications. Based on the two-layer NN model, a ZMPC strategy is proposed to maintain the soil moisture content at the root zone within a certain zone. To handle the effect of noise and weather disturbance, a target zone that shrinks along the control horizon is employed in the controller. The effect of different shrinking rates and widths of the terminal zone are discussed. Through simulations, it is noticed that steeper shrinking zones help to reduce the zone violation while significantly increasing the amount of irrigation required. Mild shrinkage in the target zone leads to lower economic costs but also minor zone violations. Depending on the priority of the control objective, the hyper-parameters can be tuned accordingly.

References

[1] D. Guan and K. Hubacek, “Assessment of regional trade and virtual water flows in China,” Ecological Economics, vol. 61, pp. 159–170, Feb. 2007.
[2] S. Mubako, S. Lahiri, and C. Lant, “Input–output analysis of virtual water transfers: Case study of California and Illinois,” Ecological Economics, vol. 93, pp. 230–238, Sept. 2013.
[3] Y. Kim, R. G. Evans, and W. M. Iversen, “Evaluation of Closed-Loop Site-Specific Irrigation with Wireless Sensor Network,” Journal of Irrigation and Drainage Engineering, vol. 135, pp. 25–31, Feb. 2009.
[4] M. S. Goodchild, K. D. Kühn, M. D. Jenkins, K. J. Burek, and A. J. Dutton, “A Method for Precision Closed-loop Irrigation Using a Modified PID Control Algorithm,” vol. 188, no. 5, p. 9, 2015.
[5] S. K. Saleem, D. K. Delgoda, S. K. Ooi, K. B. Dassanayake, L. Liu, M. N. Halgamuge, and H. Malano, “Model Predictive Control for Real-Time Irrigation Scheduling,” IFAC Proceedings Volumes, vol. 46, pp. 299–304, Aug. 2013.
[6] C. Lozoya, C. Mendoza, L. Mejía, J. Quintana, G. Mendoza, M. Bustillos, O. Arras, and L. Solís, “Model Predictive Control for Closed-Loop Irrigation,” IFAC Proceedings Volumes, vol. 47, no. 3, pp. 4429–4434, 2014.
[7] C. Guo and F. You, “A Data-Driven Real-Time Irrigation Control Method Based on Model Predictive Control,” in 2018 IEEE Conference on Decision and Control (CDC), (Miami Beach, FL), pp. 2599–2604, IEEE, Dec. 2018.
[8] Y. Mao, S. Liu, J. Nahar, J. Liu, and F. Ding, “Soil moisture regulation of agro-hydrological systems using zone model predictive control,” Computers and Electronics in Agriculture, vol. 154, pp. 239–247, Nov. 2018.
[9] J. Nahar, S. Liu, Y. Mao, J. Liu, and S. L. Shah, “Closed-Loop Scheduling and Control for Precision Irrigation,” Industrial & Engineering Chemistry Research, vol. 58, pp. 11485–11497, July 2019.
[10] S. R. Sahoo, B. T. Agyeman, S. Debnath, and J. Liu, “Knowledge-Based Optimal Irrigation Scheduling of Agro-Hydrological Systems,” Sustainability, vol. 14, p. 1304, Jan. 2022.
[11] L. A. Richards, “CAPILLARY CONDUCTION OF LIQUIDS THROUGH POROUS MEDIUMS,” Physics, vol. 1, pp. 318–333, Nov. 1931.
[12] S. Ibrir, “A projection-based algorithm for model-order reduction with H 2 performance: A convex-optimization setting,” Automatica, vol. 93, pp. 510–519, July 2018.
[13] Y. G. I. Acle, F. D. Freitas, N. Martins, and J. Rommes, “Parameter Preserving Model Order Reduction of Large Sparse Small-Signal Electromechanical Stability Power System Models,” IEEE Transactions on Power Systems, vol. 34, pp. 2814–2824, July 2019.
[14] V. B. Nguyen, S. B. Q. Tran, S. A. Khan, J. Rong, and J. Lou, “POD-DEIM model order reduction technique for model predictive control in continuous chemical processing,” Computers & Chemical Engineering, vol. 133, p. 106638, Feb. 2020.
[15] A. Zhang and J. Liu, “Economic MPC of Wastewater Treatment Plants Based on Model Reduction,” Processes, vol. 7, p. 682, Oct. 2019.
[16] J.-D. Diao, J. Guo, and C.-Y. Sun, “Event-triggered identification of FIR systems with binary-valued output observations,” Automatica, vol. 98, pp. 95–102, Dec. 2018.
[17] G. Liu, M.-l. Yu, L. Wang, Z.-y. Yin, J.-k. Liu, and Z.-r. Lu, “Rapid parameter identification of linear time-delay system from noisy frequency domain data,” Applied Mathematical Modelling, vol. 83, pp. 736–753, July 2020.
[18] Z. Huang, Q. Liu, J. Liu, and B. Huang, “A comparative study of model approximation methods applied to economic MPC,” p. accepted.
[19] M. T. van Genuchten, “A Closed-form Equation for Predicting the Hydraulic Conductivity of Unsaturated Soils,” Soil Science Society of America Journal, vol. 44, pp. 892–898, Sept. 1980.
[20] R. F. Carsel and R. S. Parrish, “Developing joint probability distributions of soil water retention characteristics,” Water Resources Research, vol. 24, pp. 755–769, May 1988.
[21] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[22] Z. C. Lipton, J. Berkowitz, and C. Elkan, “A Critical Review of Recurrent Neural Networks for Sequence Learning,” arXiv:1506.00019 [cs], Oct. 2015. arXiv: 1506.00019.
[23] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015.
[24] J. A. E. Andersson, J. Gillis, G. Horn, J. B. Rawlings, and M. Diehl, “CasADi – A software framework for nonlinear optimization and optimal control,” Mathematical Programming Computation, 2018.