
Zeroth-order optimisation on subsets of symmetric matrices with application to MPC tuning

Alejandro I. Maass Department of Electrical and Electronic Engineering, The University of Melbourne Chris Manzie Department of Electrical and Electronic Engineering, The University of Melbourne Iman Shames College of Engineering & Computer Science, The Australian National University Hayato Nakada Advanced Unit Management System Development Division, Toyota Motor Corporation, Japan
Abstract

This paper provides a zeroth-order optimisation framework for non-smooth and possibly non-convex cost functions with matrix parameters that are real and symmetric. We provide complexity bounds on the number of iterations required to ensure a given accuracy level for both the convex and non-convex cases. The derived complexity bounds for the convex case are less conservative than available bounds in the literature, since we exploit the symmetric structure of the underlying matrix space. Moreover, the non-convex complexity bounds are novel for the class of optimisation problems we consider. The utility of the framework is evident in the suite of applications that use symmetric matrices as tuning parameters. Of primary interest here is the challenge of tuning the gain matrices in model predictive controllers, a challenge known to inhibit industrial implementation of these architectures. To demonstrate the framework, we consider the problem of MIMO diesel air-path control, and implement the framework iteratively “in-the-loop” to reduce tracking error on the output channels. Both simulations and experimental results are included to illustrate the effectiveness of the proposed framework over different engine drive cycles.

1 Introduction

Optimisation with matrix-valued decision variables is a problem that appears in a wide variety of applications. In particular, it can be found in non-negative matrix factorization [19], low-rank matrix optimisation [45, 44], signal processing with independent component analysis [3], energy-efficient wireless systems [26], graph weight allocation [33, 6], and controller tuning, such as model predictive controllers [8], linear quadratic regulators [23], and PIDs for MIMO systems [9]. In most of the aforementioned applications, either explicit expressions of the cost function are not available, or derivatives are difficult or infeasible to obtain. Consequently, we restrict our attention to this class of problems, in which the cost function is treated as a black box. In this case, directional derivatives of the cost function can still be approximated from cost function values, for instance via finite differences. Of main interest in this paper is the challenge of tuning the gain matrices in model predictive controllers (MPC), which have the property of being symmetric and positive (semi) definite. Efficient calibration of MPCs is often a difficult task given the large number of tuning parameters and their non-intuitive correlation with the output response.

Existing literature regarding MPC tuning can be broadly divided into heuristic guidelines [12], analytical studies [36, 34, 5], and meta-heuristic algorithms [32, 43, 18, 38, 40, 8, 16, 30]. Both heuristic guidelines and analytical studies hold for specific scenarios and MPC formulations, thus trial-and-error approaches are unavoidably adopted in practice. Regarding meta-heuristic methods, the works [32, 38, 40, 16, 30] focus on set-point tuning based on step response characteristics (e.g. overshoot, settling and rise times, and steady-state errors). The authors in [43] deal with trajectory tuning of MPC in which the trajectories are generated by first-order transfer function models. General trajectory tuning is considered in [18] and [8] via particle swarm optimisation (PSO) and gradient-based algorithms, respectively. The number of particles in PSO algorithms depends on the specific problem and typically ranges between 20 and 50. This translates into running 20 to 50 closed-loop experiments per iteration, which is not practical for applications where the plant is in the loop. Gradient-based methods [8] and Bayesian optimisation (BO) methods [22, 37] can also be used as alternatives with fewer experiments per iteration, but existing results are presented only for vectors of parameters (i.e. diagonal tuning matrices). Lastly, inverse optimal control techniques such as [25] could be used for MPC tuning, see e.g. [28]. However, [28] does not provide any theoretical guarantees, and it relies on a good choice of initial weighting matrices and heuristic guidelines. We note that there is currently no general framework with convergence guarantees that can directly deal not only with diagonal matrices, but also with general subsets of symmetric matrices, in which the algorithm preserves the structure of the tuning matrix at every iteration.

In response to the above discussion, we propose a zeroth-order optimisation framework for non-smooth and possibly non-convex cost functions with matrix parameters that are real and symmetric. We adopt iterates that use zeroth-order matrix oracles to approximate the directional derivative in a randomly chosen direction. This random direction is taken from the Gaussian orthogonal ensemble (GOE), the ensemble of random symmetric matrices with independent normally distributed entries, see e.g. [11, 2, 39]. The algorithm we use is the natural extension of the zeroth-order random search algorithm from [27], tailored to deal with cost functions with matrix parameters. For this optimisation framework, we provide convergence guarantees in terms of complexity bounds for both convex and non-convex cost functions. In [27], the counterpart of our optimisation problem is studied, in which functions with vector parameters are used, and the zeroth-order oracle samples from the normal distribution. The authors in [27] present complexity bounds for both convex and non-convex functions. The bounds for the convex case can be applied to our matrix setup if one vectorises both the parameter and random direction matrices. However, vectorisation does not carry the matrix structure and, in fact, these bounds are significantly more conservative than the ones we propose. Moreover, the non-convex bounds proposed in [27] are only applicable to unconstrained problems and thus cannot be used in our constrained setting. Consequently, we develop novel bounds for the non-convex case, in which the optimisation parameters are constrained to subsets of symmetric matrices. The optimisation framework presented in this paper, when applied to the context of MPC tuning, offers a more general approach than the available literature, and it also comes with convergence guarantees. Particularly, it can deal with MPC tuning over trajectories in a general setting, as opposed to [32, 38, 40, 16, 43], and it provides MPC weighting matrices that satisfy the required constraints of being symmetric and positive (semi) definite, thus providing more degrees of freedom than the usual diagonal choices in [8, 18, 22, 37].

The applicability of the proposed approach is illustrated via both simulations and experiments on the problem of diesel air-path control. The MPC parameters are iteratively tuned within the closed-loop setting with the goal of improving the overall tracking performance over an engine drive cycle. For the simulation study, we compare the performance of our proposed framework with other gradient-free algorithms available in the literature, such as dividing rectangles (DIRECT), PSO, and BO. In the experimental testing, we tune MPC controllers on a diesel engine test bench over two segments of the new European driving cycle (NEDC) and one segment of the worldwide harmonised light-duty test cycle (WLTC).

A summary of our main contributions can be found below.

  1. We present theoretical guarantees for a general class of optimisation problems with matrix-valued decision variables on subsets of real symmetric matrices. The adopted iterates are tailored to this setting in the sense that, through projection, they produce matrices that belong to the appropriate space. When the cost function is convex, the derived complexity bounds are demonstrably less conservative than the ones in [27] (vector-valued decision variables), since we exploit the symmetric structure of the underlying matrix space. In the non-convex case, we derive new complexity bounds that were not previously available in the literature for this context.

  2. We illustrate the applicability of our framework by using it to tune the matrix parameters in MPC. This extends current literature such as [32, 38, 40, 16, 43, 30].

  3. We highlight the efficacy of the approach via a simulation study in which we compare it to other gradient-free algorithms available in the literature (DIRECT, PSO, and BO).

  4. We provide various experimental evaluations on a diesel engine test bench, showing significant improvement of MPC tracking performance with only a few iterations of the proposed optimisation approach.

Organisation: Section 2 introduces the optimisation framework. The complexity bounds are derived in Section 3. In Section 4, the proposed optimisation framework is applied to MPC tuning in the context of air-path control. In Section 5, we compare different gradient-free algorithms available in the literature to our framework. Experimental testing is presented in Section 6 for different engine drive cycles. Lastly, conclusions are drawn in Section 7.

Notation: Denote by \mathbb{R}^{m\times n} the set of real matrices of dimension m\times n, and by \mathbb{S}^{n} the set of real symmetric matrices of dimension n\times n. Let \mathbb{N}\triangleq\{1,2,\dots\} and \mathbb{N}_{0}\triangleq\mathbb{N}\cup\{0\}. Given a matrix M, M^{\top} denotes its transpose. We use \mbox{diag}\{M_{1},\dots,M_{n}\} to denote the standard block diagonal matrix. The identity and zero matrices of dimension m\times n are denoted by I_{m\times n} and 0_{m\times n}, respectively. \nabla f(X) denotes the matrix gradient of a scalar function f with matrix parameter X, see e.g. [4, 10]. For matrices M\in\mathbb{R}^{n\times n} and N\in\mathbb{R}^{n\times n}, the Frobenius inner product is defined as \left\langle M,N\right\rangle_{F}\triangleq\mbox{Tr}\{M^{\top}N\}, which induces the norm \left\|M\right\|_{F}\triangleq\sqrt{\left\langle M,M\right\rangle_{F}}. For a random variable x\in\mathbb{R}, we write x\sim\mathcal{N}(0,\sigma^{2}) to say that x is normally distributed with zero mean and variance \sigma^{2}.

2 Optimisation framework

We consider the following class of optimisation problems

\min_{X\in\mathbb{Q}^{n}}f(X),   (1)

where X is an n\times n matrix parameter, \mathbb{Q}^{n}\subset\mathbb{R}^{n\times n} is a closed convex set of admissible parameters, and f is a non-smooth and possibly non-convex cost function. We build upon the random search ideas of [27], but tailored to the matrix space \mathbb{Q}^{n}. That is, we are interested in solving (1) using iterates of the form

X_{k+1}=\pi_{\mathbb{Q}^{n}}\{X_{k}-h_{k}\mathcal{O}_{\mu}(X_{k},U_{k})\},   (2)

where \pi_{\mathbb{Q}^{n}} denotes the Euclidean projection onto the closed convex set \mathbb{Q}^{n}, h_{k} is a positive scalar known as the step size, and \mathcal{O}_{\mu} denotes the so-called zeroth-order random oracle, which is defined as

\mathcal{O}_{\mu}(X,U)\triangleq\frac{f(X+\mu U)-f(X)}{\mu}\,U,   (3)

where \mu>0 denotes the oracle's precision, and U is a random symmetric matrix that belongs to the Gaussian orthogonal ensemble (GOE) as per Definition 1 below.

Definition 1

The GOE, denoted by \mathbb{G}^{n}, is the set of random symmetric matrices U\in\mathbb{S}^{n} with independent entries such that [U]_{ii}\sim\mathcal{N}(0,1) and, for i<j, [U]_{ij}\sim\mathcal{N}(0,\frac{1}{2}) independent of [U]_{ii}, see e.g. [11, 2, 39].
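For concreteness, a draw from \mathbb{G}^{n} can be generated as in the following minimal sketch (ours, in NumPy; the function name sample_goe is illustrative):

```python
# A minimal sketch of sampling U from the GOE of Definition 1:
# diagonal entries ~ N(0,1), off-diagonal entries ~ N(0,1/2), U symmetric.
import numpy as np

def sample_goe(n, rng=np.random.default_rng()):
    U = np.triu(rng.normal(0.0, np.sqrt(0.5), size=(n, n)), k=1)  # strict upper triangle
    U = U + U.T                                                   # symmetrise: [U]_ij = [U]_ji
    U[np.diag_indices(n)] = rng.normal(0.0, 1.0, size=n)          # diagonal variance 1
    return U
```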

The overall method to solve (1) is summarised in the pseudo-code below, which we name the zeroth-order random matrix search (ZO-RMS) algorithm, where N denotes the number of iterations. Note that, per iteration, the oracle only requires the cost function value at two points instead of first-order or second-order derivative information, and it computes an estimate of the directional derivative in a randomly chosen direction.

Zeroth-Order Random Matrix Search (ZO-RMS)
  1: Choose X_{0}\in\mathbb{Q}^{n}, \mu>0 and \{h_{k}\}_{k\in\mathbb{N}_{0}}.
  2: for k=\{0,\dots,N\}, N\in\mathbb{N} do
  3:    Generate U_{k}\in\mathbb{G}^{n}
  4:    \mathcal{O}_{\mu}(X_{k},U_{k})\leftarrow\frac{1}{\mu}(f(X_{k}+\mu U_{k})-f(X_{k}))U_{k}
  5:    X_{k+1}\leftarrow\pi_{\mathbb{Q}^{n}}\{X_{k}-h_{k}\mathcal{O}_{\mu}(X_{k},U_{k})\}
  6: end for
  7: Return \hat{X}_{N}\triangleq\arg\min_{X}[f(X):X\in\{X_{0},\dots,X_{N}\}].
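A self-contained sketch of this pseudo-code in Python is given below; the cost f, the projection project onto \mathbb{Q}^{n}, and the step-size rule step are application-specific placeholders (our names, not part of the paper):

```python
# A minimal sketch of ZO-RMS. `f` maps a symmetric matrix to a scalar cost,
# `project` is the Euclidean projection onto Q^n, and `step(k)` returns h_k.
import numpy as np

def zo_rms(f, project, X0, mu, step, N, rng=np.random.default_rng()):
    n = X0.shape[0]
    X, fX = X0.copy(), f(X0)
    best_X, best_f = X.copy(), fX
    for k in range(N + 1):
        # Step 3: sample U_k from the GOE (diagonal var 1, off-diagonal var 1/2).
        U = np.triu(rng.normal(0.0, np.sqrt(0.5), (n, n)), 1)
        U = U + U.T
        U[np.diag_indices(n)] = rng.normal(0.0, 1.0, n)
        # Step 4: zeroth-order oracle (3), using one extra cost evaluation.
        O = (f(X + mu * U) - fX) / mu * U
        # Step 5: projected iterate (2).
        X = project(X - step(k) * O)
        fX = f(X)
        if fX < best_f:                 # Step 7: keep the best visited iterate
            best_X, best_f = X.copy(), fX
    return best_X, best_f
```

As in Section 5 further below, a diminishing step size such as step = lambda k: 5e-3 / np.sqrt(k + 1) can be used.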

We emphasise that the ZO-RMS algorithm is the natural extension of the random vector search algorithm proposed in [27], tailored to matrix decision variables. This extension is essential to our setting since there is currently no general framework, i.e. an adequate algorithm and convergence guarantees, for the constrained class of problems (1). In [27], the authors provide gradient-free algorithms to solve the optimisation problem \min_{x\in\bar{\mathbb{Q}}^{n}\subset\mathbb{R}^{n}}f(x) for both convex and non-convex cost functions, which is the counterpart of (1) for vector optimisation variables. A zeroth-order random oracle with a multivariate normally distributed direction u is used. In this paper, we extend the setting of [27] to optimisation problems over a matrix space using random search matrices U from the GOE, which are the natural counterpart of the normally distributed vector u. Note that the complexity bounds provided in [27] for the convex case are applicable to our setting if we consider x=\mbox{vec}\{X\} and u=\mbox{vec}\{U\}. However, the matrix structure of X and U is then immediately lost. Consequently, the first theoretical goal of this work is to obtain less conservative complexity bounds for the ZO-RMS algorithm when f in (1) is convex, by exploiting the structure of the underlying matrix space. With respect to the non-convex case, we note that the complexity bounds available in [27] are not applicable here since they consider an unconstrained non-convex problem in \mathbb{R}^{n}, as opposed to a constrained one such as (1). Therefore, our second theoretical goal is to develop new complexity bounds for the ZO-RMS algorithm when solving constrained non-convex optimisation problems in a matrix space, i.e. problem (1) with f non-convex.

As mentioned in the introduction, the optimisation framework presented here is general and fits many applications. Therefore, the theoretical results presented in the following section hold for any family of problems that fits the framework. Particularly, in this paper we illustrate how to apply this framework to tune MPC controllers, since it is an important application for which we can provide interesting contributions validated by simulations and experiments.

3 Complexity bounds

In this section, we study the performance of the ZO-RMS algorithm in terms of complexity bounds that guarantee a given level of accuracy for the algorithm. We provide complexity bounds for both the convex and non-convex cases.

Let us first introduce some important definitions. Definition 1 implies that the probability distribution \mathbb{P}(\mathrm{d}U) on the GOE \mathbb{G}^{n} is \mathbb{P}(\mathrm{d}U)=\frac{1}{\kappa}e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\mathrm{d}U, where \mathrm{d}U is the Lebesgue measure on \mathbb{S}^{n}\cong\mathbb{R}^{n(n+1)/2}, i.e. \mathrm{d}U\triangleq\prod_{i=1}^{n}\mathrm{d}U_{ii}\prod_{i<j}\mathrm{d}U_{ij}, and \kappa is the normalising constant \kappa\triangleq\int_{\mathbb{G}^{n}}e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\mathrm{d}U=2^{\frac{n}{2}}\pi^{\frac{n^{2}+n}{4}}. Therefore, we can define the expectation of a function q:\mathbb{G}^{n}\rightarrow\mathbb{R} as \mathbb{E}_{U}\{q(U)\}\triangleq\int_{\mathbb{G}^{n}}q(U)\mathbb{P}(\mathrm{d}U)=\frac{1}{\kappa}\int_{\mathbb{G}^{n}}q(U)e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\mathrm{d}U, and the moments m_{p}\triangleq\mathbb{E}_{U}\{\left\|U\right\|_{F}^{p}\}=\frac{1}{\kappa}\int_{\mathbb{G}^{n}}\left\|U\right\|_{F}^{p}e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\mathrm{d}U for p\in\mathbb{N}_{0}. We state the following lemma for some moments of interest.

Lemma 1

For every n\geq 1, we have that m_{1}\leq\sqrt{\frac{n^{2}+n}{2}}, m_{2}=\frac{n^{2}+n}{2}, and m_{4}=\frac{1}{4}(n^{4}+2n^{3}+5n^{2}+4n).

Proof: Define \psi(p)\triangleq\ln(m_{p}), and note that this function is convex in p. Let us write p=(1-\alpha)\times 0+\alpha\times 2 (i.e. \alpha=p/2). For p\in[0,2] we have \alpha\in[0,1]. Therefore, since \psi(p) is convex, we have that \psi(p)=\psi(\alpha\times 2+(1-\alpha)\times 0)\leq\alpha\psi(2)+(1-\alpha)\psi(0). Particularly, \psi(0)=\ln(m_{0})=0 and \psi(2)=\ln(m_{2}), then \psi(p)\leq\alpha\ln(m_{2})=\frac{p}{2}\ln(m_{2}), and thus m_{p}\leq m_{2}^{p/2}. Particularly, m_{1}\leq\sqrt{m_{2}}. On the other hand, from [14] we get that m_{2}=(n^{2}+n)/2 and m_{4}=\frac{1}{4}(n^{4}+2n^{3}+5n^{2}+4n), which concludes the proof. \blacksquare
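The closed-form moments in Lemma 1 can be sanity-checked numerically; the following Monte Carlo sketch (ours) compares empirical estimates of m_{2} and m_{4} with the stated expressions:

```python
# Monte Carlo check (ours) of the GOE Frobenius-norm moments in Lemma 1.
import numpy as np

def check_moments(n=4, samples=100_000, rng=np.random.default_rng(0)):
    fro2 = np.empty(samples)
    for s in range(samples):
        U = np.triu(rng.normal(0.0, np.sqrt(0.5), (n, n)), 1)
        U = U + U.T
        U[np.diag_indices(n)] = rng.normal(0.0, 1.0, n)
        fro2[s] = np.sum(U * U)                       # ||U||_F^2
    print("m2:", fro2.mean(), "vs", (n**2 + n) / 2)
    print("m4:", (fro2**2).mean(), "vs", (n**4 + 2*n**3 + 5*n**2 + 4*n) / 4)
```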

Our analysis relies on the following assumption.

Assumption 1
  (a) The set of admissible parameters is a subset of the set of real symmetric matrices of dimension n\times n, i.e. \mathbb{Q}^{n}\subset\mathbb{S}^{n}.

  (b) The function f:\mathbb{Q}^{n}\rightarrow\mathbb{R} is Lipschitz continuous, i.e. there exists L_{0}(f)>0 such that |f(X)-f(Y)|\leq L_{0}(f)\left\|X-Y\right\|_{F} holds for all X,Y\in\mathbb{Q}^{n}.

3.1 Convex case

Let X^{*}\in\mathbb{Q}^{n} be a stationary point of (1), and define f^{*}\triangleq f(X^{*}). In addition, define U_{0:k}\triangleq\mbox{diag}\{U_{0},\dots,U_{k}\}, a random matrix composed of the i.i.d. variables \{U_{k}\}_{k\in\mathbb{N}_{0}} associated with each iteration of the scheme.

We are now in a position to state the main result.

Theorem 1

Consider problem (1) with f convex. Suppose Assumption 1 holds, and let the sequence \{X_{k}\}_{k\in\{0,\dots,N\}} be generated by the ZO-RMS iterates (2). Then, for any N\geq 0,

\frac{1}{S_{N}}\sum_{k=0}^{N}h_{k}(\phi_{k}-f^{*}) \leq \mu L_{0}(f)\sqrt{\frac{n^{2}+n}{2}} + \frac{1}{S_{N}}\Big(\frac{1}{2}\left\|X_{0}-X^{*}\right\|_{F}^{2} + \frac{1}{8}(n^{4}+2n^{3}+5n^{2}+4n)L_{0}^{2}(f)\sum_{k=0}^{N}h_{k}^{2}\Big),   (4)

where S_{N}\triangleq\sum_{k=0}^{N}h_{k}, \phi_{k}\triangleq\mathbb{E}_{U_{0:k-1}}\{f(X_{k})\} for k\in\mathbb{N}, and \phi_{0}\triangleq f(X_{0}).

Proof: See Appendix B. \blacksquare

A corollary of Theorem 1 can be obtained, which provides expressions for \mu, h_{k}, and N that ensure a given level of accuracy for the ZO-RMS algorithm.

Corollary 1

Let \epsilon>0 be given. If \mu and h_{k} are chosen such that

\mu \leq \frac{\epsilon}{L_{0}(f)\sqrt{2(n^{2}+n)}},   (5a)
h_{k} = \frac{2\bar{r}}{L_{0}(f)\sqrt{n^{4}+2n^{3}+5n^{2}+4n}\sqrt{N+1}},   (5b)

for k\in\{0,\dots,N\}, then \mathbb{E}_{U_{0:N-1}}\{f(\hat{X}_{N})\}-f^{*}\leq\epsilon is guaranteed by the ZO-RMS algorithm in

N\geq\frac{L_{0}^{2}(f)\bar{r}^{2}}{\epsilon^{2}}(n^{4}+2n^{3}+5n^{2}+4n)   (6)

iterations, where \bar{r} is such that \left\|X_{0}-X^{*}\right\|_{F}\leq\bar{r}.

Proof: Note that

\mathbb{E}_{U_{0:N-1}}\{f(\hat{X}_{N})\}-f^{*} \leq \mathbb{E}_{U_{0:N-1}}\Big\{\tfrac{1}{S_{N}}\sum_{k=0}^{N}h_{k}(f(X_{k})-f^{*})\Big\} \stackrel{(4)}{\leq} \mu L_{0}(f)\sqrt{\tfrac{n^{2}+n}{2}}+\tfrac{1}{S_{N}}\Big(\tfrac{\bar{r}^{2}}{2}+\tfrac{1}{8}(n^{4}+2n^{3}+5n^{2}+4n)L_{0}^{2}(f)\sum_{k=0}^{N}h_{k}^{2}\Big).

Suppose the number of steps N is fixed. We can then choose \mu and h_{k} as in (5) and further obtain the bound

\mathbb{E}_{U_{0:N-1}}\{f(\hat{X}_{N})\}-f^{*}\leq\frac{\epsilon}{2}+\frac{\bar{r}^{2}}{S_{N}}=\frac{\epsilon}{2}+\frac{\bar{r}L_{0}(f)\sqrt{n^{4}+2n^{3}+5n^{2}+4n}}{2\sqrt{N+1}}.

Therefore, in order to satisfy \mathbb{E}_{U_{0:N-1}}\{f(\hat{X}_{N})\}-f^{*}\leq\epsilon, we need N\geq\frac{L_{0}^{2}(f)\bar{r}^{2}}{\epsilon^{2}}(n^{4}+2n^{3}+5n^{2}+4n)-1, concluding the proof. \blacksquare

The complexity bound (6) in Corollary 1 corresponds to the number of iterations for which the ZO-RMS algorithm guarantees a given accuracy \epsilon, provided the step size h_{k} and parameter \mu are chosen as per (5). Certainly, the expressions for h_{k}, \mu, and N depend on the Lipschitz constant L_{0}(f) and on \bar{r}, which may be hard to obtain explicitly depending on the application. However, these can be numerically bounded, as we explain in Section 6.1. Since the oracle (3) is random, the guarantees of the ZO-RMS algorithm hold in expectation. In essence, Corollary 1 provides sufficient conditions on the oracle's precision \mu and step size h_{k} such that we can get sufficiently close to the optimal cost f^{*} in N iterations, on average.

Figure 1: Comparison of the complexity bounds (6) and (7), see black and blue lines, respectively.

We now compare our complexity bound (6) to the bounds in [27], which are applicable to (1) (for convex f) after grouping the elements of the tuning matrix X into a single vector. Particularly, since X is symmetric, we group the lower triangular elements into a vector of size n(n+1)/2 (often called the half-vectorisation), so the results in [27] apply and the following complexity bound holds (cf. Theorem 6 in [27])

N_{[27]}\geq\frac{4L_{0}^{2}(f)\bar{r}^{2}}{\epsilon^{2}}(n(n+1)/2+4)^{2}.   (7)

Fig. 1 depicts (6) and (7) for different values of n. We can see that our bound (6) is less conservative than (7) for all n\in\mathbb{N}, and approaches (7) as n\rightarrow\infty. We emphasise that our optimisation framework is directly tailored to matrix parameters, and the reduction in conservatism with respect to a vector approach such as [27] follows from exploiting the special structure of the underlying matrix space, namely the symmetry, and from using iterates tailored to it. The reduction in conservatism is significant: for instance, for 3\times 3 matrices (i.e. n=3), N_{[27]}/N_{\text{Cor.1}}=2.08. This translates into a sufficient condition that guarantees the same level of accuracy in half the iterations.
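Since L_{0}(f), \bar{r}, and \epsilon enter (6) and (7) identically, the ratio between the two bounds depends only on n and can be reproduced with a few lines (ours):

```python
# Ratio between the vectorised bound (7) of [27] and the matrix bound (6);
# the common factor L_0^2(f) * r_bar^2 / eps^2 cancels.
def bound_ratio(n: int) -> float:
    ours = n**4 + 2*n**3 + 5*n**2 + 4*n          # polynomial factor in (6)
    vectorised = 4 * (n * (n + 1) / 2 + 4) ** 2  # polynomial factor in (7)
    return vectorised / ours

print(bound_ratio(3))    # ~2.08, as reported above for 3x3 matrices
print(bound_ratio(100))  # ~1.001, i.e. (6) approaches (7) as n grows
```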

3.2 Non-convex case

Compared to the convex optimisation problem considered in Section 3.1, the analysis for constrained non-convex problems is more challenging. Particularly, constrained convex problems typically focus on the optimality gap \mathbb{E}\{f(x)\}-f^{*} to measure the convergence rate (as we do in Corollary 1). On the other hand, Nesterov in [27] provided complexity bounds for unconstrained non-convex problems using zeroth-order iterates, where the optimisation variable is a vector in a subset of \mathbb{R}^{n}. In these unconstrained non-convex problems, the quantity \mathbb{E}\{\left\|\nabla f(x)\right\|^{2}\} is the typical measure of stationarity. We emphasise that the bounds for unconstrained non-convex problems in [27] are not applicable to our constrained problem (1), since we need to search over the matrix space \mathbb{Q}^{n} and use iterates of the form (2) that project onto \mathbb{Q}^{n}. For constrained non-convex problems like ours, a fitting alternative to \mathbb{E}\{\left\|\nabla f(x)\right\|^{2}\} is the so-called gradient mapping, see e.g. [21, 13], which is defined as follows

P_{\mathbb{Q}^{n}}(X,\mathcal{O}_{\mu},h)\triangleq\frac{1}{h}\Big[X-\pi_{\mathbb{Q}^{n}}\{X-h\mathcal{O}_{\mu}(X,U)\}\Big],

where \mathcal{O}_{\mu} is the random zeroth-order oracle in (3), and \pi_{\mathbb{Q}^{n}} denotes, as before, the Euclidean projection onto the closed convex set \mathbb{Q}^{n}. The natural interpretation of P_{\mathbb{Q}^{n}} is that it represents a projected gradient, which offers a feasible update from the previous point X. The main goal of this section is to provide complexity bounds for the ZO-RMS algorithm (2), in terms of bounding \mathbb{E}\{\left\|P_{\mathbb{Q}^{n}}(X,\mathcal{O}_{\mu},h)\right\|_{F}^{2}\}, when solving the constrained optimisation problem (1) with f non-convex.

Before stating our results, we impose an additional assumption.

Assumption 2

\mathbb{E}\{\left\|\mathcal{O}_{\mu}(X,U)-\nabla f_{\mu}(X)\right\|_{F}^{2}\}\leq\sigma^{2} for some \sigma>0, where f_{\mu} is the Gaussian approximation of f defined in (16).

Assumption 2 essentially bounds the variance of the random oracle \mathcal{O}_{\mu}. This assumption is often adopted in non-convex problems, see e.g. [13]. If constructing \sigma explicitly is of interest, it can be done as follows. We note that \mathcal{O}_{\mu} is an unbiased estimator of \nabla f_{\mu}(X) (cf. (18)), i.e. \mathbb{E}\{\mathcal{O}_{\mu}(X,U)\}=\nabla f_{\mu}(X). Then \mathbb{E}\{\left\|\mathcal{O}_{\mu}(X,U)-\nabla f_{\mu}(X)\right\|_{F}^{2}\}\leq\mathbb{E}\{\left\|\mathcal{O}_{\mu}(X,U)\right\|_{F}^{2}\}\leq(1/4)L_{0}^{2}(f)(n^{4}+2n^{3}+5n^{2}+4n)\triangleq\sigma^{2}, where the last inequality follows from Theorem 3.

The main result for the non-convex case is stated below. It can be seen as the non-convex counterpart of Theorem 1.

Theorem 2

Consider problem (1) with f non-convex. Suppose Assumptions 1 and 2 hold, and let the sequence \{X_{k}\}_{k\in\{0,\dots,N\}} be generated by the ZO-RMS iterates (2). Then, for any N\geq 0,

\frac{1}{S_{N}}\sum_{k=0}^{N}h_{k}\,\mathbb{E}_{U_{0:k}}\{\left\|P_{\mathbb{Q}^{n}}(X_{k},\mathcal{O}_{\mu}(X_{k},U_{k}),h_{k})\right\|_{F}^{2}\} \leq \frac{1}{S_{N}}\Big[f_{\mu}(X_{0})-f^{*}+C(\mu)\sum_{k=0}^{N}h_{k}^{2}\Big]+\sigma^{2},   (8)

where C(\mu)\triangleq\frac{L_{0}^{3}(f)}{4\mu}\sqrt{\frac{n^{2}+n}{2}}(n^{4}+2n^{3}+5n^{2}+4n).

Proof: See Appendix C. \blacksquare

The next corollary provides a choice of \mu, h_{k}, and N that ensures a given level of accuracy for the ZO-RMS algorithm up to an error of order \sigma^{2}.

Corollary 2

Let \delta>0 and \varepsilon>0 be given, and let h_{k}=h for all k\in\{0,\dots,N\}. If \mu and h are chosen as

\mu = \frac{\varepsilon}{L_{0}(f)[(n^{2}+n)/2]^{1/2}},   (9)
h = \left[\frac{8\varepsilon\bar{r}}{(N+1)L_{0}^{3}(f)(n^{2}+n)(n^{4}+2n^{3}+5n^{2}+4n)}\right]^{1/2},   (10)

then \mathbb{E}\{\left\|P_{\mathbb{Q}^{n}}(X_{D},\mathcal{O}_{\mu}(X_{D},U_{D}),h)\right\|^{2}\}\leq\delta+\sigma^{2} is guaranteed by the ZO-RMS algorithm in

N\geq\frac{L_{0}^{5}(f)\bar{r}(n^{2}+n)(n^{4}+2n^{3}+5n^{2}+4n)}{2\varepsilon\delta^{2}}   (11)

iterations, where

D\triangleq\arg\min_{k\in\{0,\dots,N\}}\left\|P_{\mathbb{Q}^{n}}(X_{k},\mathcal{O}_{\mu}(X_{k},U_{k}),h)\right\|_{F}.

Proof: Recall that f_{\mu} is a smooth approximation of f, and the gap of this approximation can be upper bounded as |f_{\mu}(X)-f(X)|\leq\mu L_{0}(f)\sqrt{\frac{n^{2}+n}{2}}, see Lemma 2-(III) in the appendix. Then, to bound this gap by \varepsilon we need to choose \mu as per (9). For this choice of \mu, which we denote \bar{\mu}, we have C(\bar{\mu})=\frac{L_{0}^{4}(f)}{4\varepsilon}\left(\frac{n^{2}+n}{2}\right)(n^{4}+2n^{3}+5n^{2}+4n). For a constant step size, the right-hand side of (8) becomes

\frac{f_{\bar{\mu}}(X_{0})-f^{*}}{(N+1)h}+C(\bar{\mu})h+\sigma^{2}\leq\frac{L_{0}(f)\bar{r}}{(N+1)h}+C(\bar{\mu})h+\sigma^{2}.

Let \rho(h)\triangleq\frac{L_{0}(f)\bar{r}}{(N+1)h}+C(\bar{\mu})h. Minimising \rho(h) over h leads to h^{*}=\sqrt{L_{0}(f)\bar{r}/[C(\bar{\mu})(N+1)]}, which corresponds to (10). For this choice of step size, we get \rho(h^{*})=\sqrt{4L_{0}(f)\bar{r}C(\bar{\mu})/(N+1)}. Note that \mathbb{E}\{\left\|P_{\mathbb{Q}^{n}}(X_{D},\mathcal{O}_{\mu}(X_{D},U_{D}),h)\right\|_{F}^{2}\}\leq\frac{1}{N+1}\sum_{k=0}^{N}\mathbb{E}_{U_{0:k}}\{\left\|P_{\mathbb{Q}^{n}}(X_{k},\mathcal{O}_{\mu}(X_{k},U_{k}),h)\right\|_{F}^{2}\}\leq\rho(h^{*})+\sigma^{2}. Then, we can guarantee that \rho(h^{*})\leq\delta in N\geq 4L_{0}(f)\bar{r}C(\bar{\mu})/\delta^{2} iterations, which corresponds to (11), completing the proof. \blacksquare

Our analysis shows that ZO-RMS for constrained non-convex problems can suffer an additional error of order \sigma^{2}, which does not appear in the convex case, see Corollary 1. This is consistent with the literature on constrained non-convex problems, where the same effect has been reported for different algorithms, see e.g. [21, 13, 20].

4 Application to MPC tuning: air-path control in diesel engines

In this section, we illustrate how to apply the proposed optimisation framework to tune MPC controllers in the context of diesel air-path control.

4.1 Diesel engine air-path model

A schematic representation of the diesel air-path is shown in Fig. 2. The dynamics of a diesel engine air-path are highly non-linear, see e.g. [41], and can be captured in the general form

x_{k+1} = \mathbf{f}(x_{k},u_{k},\theta),   (12a)
y_{k} = \mathbf{g}(x_{k},u_{k},\theta),   (12b)

where x_{k}\in\mathbb{R}^{n_{x}}, y_{k}\in\mathbb{R}^{n_{y}}, and u_{k}\in\mathbb{R}^{n_{u}} are the state, output, and input of the system, respectively, at time instant k\in\mathbb{N}_{0}. The engine operational space is typically parametrised by \theta\triangleq(\omega_{e},\dot{m}_{f})\in\Theta\subseteq\mathbb{R}^{2}, where \omega_{e} denotes the engine speed, \dot{m}_{f} denotes the fuel rate, and \Theta is the engine operating space defined as \Theta\triangleq\{\theta\in\mathbb{R}^{2}:\omega_{e,\min}\leq\omega_{e}\leq\omega_{e,\max},\ \dot{m}_{f,\min}\leq\dot{m}_{f}\leq\dot{m}_{f,\max}\} for some \omega_{e,\min},\omega_{e,\max},\dot{m}_{f,\min},\dot{m}_{f,\max}\in\mathbb{R}. Given these highly non-linear dynamics, a common approach to control-oriented modelling of the air-path is to generate linear models trimmed at various operating points \theta. Particularly, we follow [30] and use twelve models to approximate the operating space uniformly. That is, the engine operating space \Theta is divided into twelve regions as per Fig. 3, with a linear model representing the engine dynamics in each region. For commercial-in-confidence purposes, we only show normalised axes in every plot. We emphasise that these regions are chosen to provide adequate coverage over the range of operating points encountered along a drive cycle. The control-oriented model state consists of the intake manifold pressure p_{\text{im}} (also known as boost pressure), the exhaust manifold pressure p_{\text{em}}, the compressor flow W_{\text{comp}}, and the EGR rate y_{\text{EGR}}. The input consists of the throttle position u_{\text{thr}}, the EGR valve position u_{\text{EGR}}, and the VGT valve position u_{\text{VGT}}. Lastly, the measured output is (p_{\text{im}},y_{\text{EGR}}).

For a given operating point (\omega_{e}^{\sigma},\dot{m}_{f}^{\sigma}), \sigma\in\{1,\dots,12\}, the engine control unit (ECU) applies certain steady-state actuator values. We denote by \bar{u}^{\sigma}\in\mathbb{R}^{3}, \bar{x}^{\sigma}\in\mathbb{R}^{4}, and \bar{y}^{\sigma}\in\mathbb{R}^{2} the steady-state values of the input, state, and output, respectively, at each operating condition (\omega_{e}^{\sigma},\dot{m}_{f}^{\sigma}). Then, by following the system identification procedure detailed in [35], a linear representation of the non-linear diesel air-path (12) trimmed around the grid point (\omega_{e}^{\sigma},\dot{m}_{f}^{\sigma}) is given by

\tilde{x}_{k+1} = A^{\sigma}\tilde{x}_{k}+B^{\sigma}\tilde{u}_{k},   (13a)
\tilde{y}_{k} = C^{\sigma}\tilde{x}_{k}+D^{\sigma}\tilde{u}_{k},   (13b)

where \sigma\in\{1,\dots,12\}, \tilde{x}_{k}=x_{k}-\bar{x}^{\sigma}\in\mathbb{R}^{4} is the perturbed state around the corresponding steady state \bar{x}^{\sigma}, \tilde{u}_{k}=u_{k}-\bar{u}^{\sigma}\in\mathbb{R}^{3} is the perturbed input around the corresponding steady-state input \bar{u}^{\sigma}, and \tilde{y}_{k}=y_{k}-\bar{y}^{\sigma}\in\mathbb{R}^{2} is the perturbed output around the corresponding steady-state output \bar{y}^{\sigma}.

Figure 2: A diesel engine air-path schematic.
Figure 3: Engine operational space divisions and the corresponding linearisation points.

4.2 MPC formulation

For each operating point (\omega_{e}^{\sigma},\dot{m}_{f}^{\sigma}), \sigma\in\{1,\dots,12\}, and the corresponding model (13) associated with that region, we formulate an MPC with an augmented integrator (see e.g. [29, 42]). To this end, define the augmented state \mathbf{x}_{k}=(\tilde{x}_{k},e_{k}), where e_{k} is the integrator state with dynamics e_{k+1}=-C^{\sigma}\tilde{x}_{k}+e_{k}. Define the cost function as

J(\mathbf{x}_{k},\mathbf{u})\triangleq\mathbf{x}_{k+H}^{\top}P^{\sigma}\mathbf{x}_{k+H}+\sum_{i=0}^{H-1}\left(\mathbf{x}_{k+i}^{\top}Q^{\sigma}\mathbf{x}_{k+i}+\tilde{u}_{k+i}^{\top}R^{\sigma}\tilde{u}_{k+i}\right),

where H\in\mathbb{N} is the prediction horizon, \mathbf{u}\triangleq\{\tilde{u}_{k},\tilde{u}_{k+1},\dots,\tilde{u}_{k+H-1}\} is the sequence of control values applied over the horizon H, and P^{\sigma}\in\mathbb{S}^{6}, Q^{\sigma}\in\mathbb{S}^{6}, and R^{\sigma}\in\mathbb{S}^{3} are real symmetric matrices containing the tuning weights. Particularly, we further assume that R^{\sigma} is positive definite, and that P^{\sigma} and Q^{\sigma} are positive semidefinite.

The corresponding MPC problem is stated below,

minimise   J(\mathbf{x}_{k},\mathbf{u})

subject to   \mathbf{x}_{k+1}=\begin{bmatrix}A^{\sigma}&0\\ -C^{\sigma}&I\end{bmatrix}\mathbf{x}_{k}+\begin{bmatrix}B^{\sigma}\\ 0\end{bmatrix}\tilde{u}_{k},
             \tilde{x}(k)=x(k)-\bar{x}^{\sigma},
             S_{x}\mathbf{x}_{i}+S_{u}\tilde{u}_{i}\leq b_{i},\ \forall i\in\{0,\dots,H-1\},
             S_{H}\mathbf{x}_{H}\leq b_{H},

where S_{x}\in\mathbb{R}^{18\times 6}, S_{u}\in\mathbb{R}^{18\times 3}, b_{i}\in\mathbb{R}^{18}, S_{H}\in\mathbb{R}^{12\times 6}, and b_{H}\in\mathbb{R}^{12} are given by the relevant state and input constraints. The solution to the above optimisation problem yields the optimising control sequence \mathbf{u}^{*}=\{\tilde{u}^{*}_{k},\tilde{u}^{*}_{k+1},\dots,\tilde{u}^{*}_{k+H-1}\}. The first element of \mathbf{u}^{*} is applied to the system, and the whole process is repeated as k is incremented.
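For illustration, one receding-horizon step of the above problem can be sketched with CVXPY as follows; this is a sketch under the stated dimensions, all matrix arguments are placeholders to be supplied from (13) and the constraint data, and the per-stage bound b_{i} is taken constant for brevity:

```python
# A minimal sketch (ours) of one MPC solve for the augmented system above.
import numpy as np
import cvxpy as cp

def mpc_step(A_aug, B_aug, P, Q, R, x0, H, Sx, Su, b, SH, bH):
    nx, nu = B_aug.shape
    x = cp.Variable((nx, H + 1))
    u = cp.Variable((nu, H))
    cost = cp.quad_form(x[:, H], P)               # terminal cost (P assumed PSD)
    cons = [x[:, 0] == x0]
    for i in range(H):
        cost += cp.quad_form(x[:, i], Q) + cp.quad_form(u[:, i], R)
        cons += [x[:, i + 1] == A_aug @ x[:, i] + B_aug @ u[:, i],
                 Sx @ x[:, i] + Su @ u[:, i] <= b]
    cons += [SH @ x[:, H] <= bH]
    cp.Problem(cp.Minimize(cost), cons).solve()
    return u[:, 0].value                          # apply only the first control move
```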

For transient operations between operating points, we use a switching LTI-MPC architecture so that the controllers are switched based on the current operating condition. As in [31], the switching LTI-MPC strategy selects the MPC controller at the nearest operating point to the current operating condition.

Under the above scenario, our goal is to use the ZO-RMS algorithm to tune the MPC weighting matrices \{P^{\sigma},Q^{\sigma},R^{\sigma}\} in order to achieve a satisfactory tracking performance for a given prediction horizon H\in\mathbb{N}. Since there are twelve available controllers to tune, we could potentially use the ZO-RMS algorithm to tune all of them. However, since experiments are costly, we focus on tuning only the controllers that perform poorly under an initial choice of tuning parameters. That is, we utilise an initial calibration, denoted by \{P_{0}^{\sigma},Q_{0}^{\sigma},R_{0}^{\sigma}\} for every controller \sigma\in\{1,\dots,12\}, which we call the baseline controller. The choice of \{P_{0}^{\sigma},Q_{0}^{\sigma},R_{0}^{\sigma}\} may be based on model dynamics, experience, or ad-hoc guidelines as per [12]. Then, by inspecting a drive cycle response with the baseline controller, we detect the controllers with poor performance. Let us denote by \sigma^{*} a controller with unsatisfactory performance that we attempt to tune with the ZO-RMS algorithm. Therefore, in this context, we want to solve

\min_{P^{\sigma^{*}},\,Q^{\sigma^{*}}\in\mathbb{S}^{4}_{+},\,R^{\sigma^{*}}\in\mathbb{S}^{3}_{++}}\ g(y(P^{\sigma^{*}},Q^{\sigma^{*}},R^{\sigma^{*}})),   (14)

where g is the tracking performance of the MPC defined in (15) below, which depends on a closed-loop response of interest y, and \mathbb{S}^{4}_{+}\subset\mathbb{S}^{4} and \mathbb{S}^{3}_{++}\subset\mathbb{S}^{3} denote the sets of symmetric positive semi-definite and symmetric positive definite matrices, respectively.

As mentioned above, we use the tracking error as the measure of performance for the MPC controller, that is,

g(y(P^{\sigma^{*}},Q^{\sigma^{*}},R^{\sigma^{*}}))\triangleq\frac{1}{\sqrt{M}}\sqrt{\sum_{k=0}^{M-1}\left\|y_{k}(P^{\sigma^{*}},Q^{\sigma^{*}},R^{\sigma^{*}})-y_{k}^{\text{ref}}\right\|_{2}^{2}},   (15)

where M\in\mathbb{N} is the experiment length, y_{k}^{\text{ref}} is the vector containing the boost pressure and EGR rate references, and y_{k}(P^{\sigma^{*}},Q^{\sigma^{*}},R^{\sigma^{*}}) is the measured process output when the \sigma^{*}-th controller uses tuning parameters P^{\sigma^{*}},Q^{\sigma^{*}},R^{\sigma^{*}} and the remaining controllers use the baseline parameters \{P_{0},Q_{0},R_{0}\}. We emphasise that, for the simulations and experiments below, the signals in (15) are normalised so that they are weighted equally.
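A minimal sketch of (15) (ours; y and y_ref are (M, 2) arrays of normalised boost pressure and EGR rate samples) reads:

```python
# Tracking cost (15): RMS error over an M-sample closed-loop response.
import numpy as np

def tracking_cost(y: np.ndarray, y_ref: np.ndarray) -> float:
    M = y.shape[0]
    return float(np.sqrt(np.sum(np.linalg.norm(y - y_ref, axis=1) ** 2) / M))
```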

4.3 Implementing the ZO-RMS algorithm

We now show how to implement the ZO-RMS algorithm for the MPC tuning problem (14). First note that (14) fits the general optimisation problem (1) with X=\mbox{diag}\{P^{\sigma^{*}},Q^{\sigma^{*}},R^{\sigma^{*}}\}, f(X)\triangleq g(y(X)), and

\mathbb{Q}^{n}\triangleq\big\{X\in\mathbb{R}^{11\times 11}:X=\mbox{diag}\{P^{\sigma^{*}},Q^{\sigma^{*}},R^{\sigma^{*}}\},\ P^{\sigma^{*}},Q^{\sigma^{*}}\in\mathbb{S}^{4}_{+},\ R^{\sigma^{*}}\in\mathbb{S}^{3}_{++}\big\}.

Therefore, we can apply ZO-RMS to solve (14). The overall implementation of ZO-RMS in this context is depicted in Fig. 4. The ZO-RMS algorithm iteratively updates the MPC parameters \{P^{\sigma^{*}},Q^{\sigma^{*}},R^{\sigma^{*}}\} to minimise f(P^{\sigma^{*}},Q^{\sigma^{*}},R^{\sigma^{*}}), which is calculated from closed-loop response experiments carried out between the MPC and the diesel engine.

It remains to show how to compute the oracle and projection in (2). Particularly, the oracle in this context is computed as per (3) with U=\mbox{diag}\{U^{P},U^{Q},U^{R}\}, where U^{P},U^{Q}\in\mathbb{G}^{4} and U^{R}\in\mathbb{G}^{3} (see Definition 1). We emphasise that, at each iteration step, the oracle is computed once the entire closed-loop response is available. Particularly, two experiments are required to compute the oracle: one with parameters \{P_{k}^{\sigma^{*}}+\mu U_{k}^{P},Q_{k}^{\sigma^{*}}+\mu U_{k}^{Q},R_{k}^{\sigma^{*}}+\mu U_{k}^{R}\} and one with \{P_{k}^{\sigma^{*}},Q_{k}^{\sigma^{*}},R_{k}^{\sigma^{*}}\}. Once the two closed-loop responses are finished, ZO-RMS computes the oracle and the next update \{P_{k+1}^{\sigma^{*}},Q_{k+1}^{\sigma^{*}},R_{k+1}^{\sigma^{*}}\}. Two new closed-loop experiments are then performed with these new controller parameters, and the process continues iteratively in this fashion.

Since X is block diagonal, the projection onto \mathbb{Q}^{n} is computed as \pi_{\mathbb{Q}^{n}}\{X\}=\mbox{diag}\{\pi_{\mathbb{S}^{4}_{+}}\{P\},\pi_{\mathbb{S}^{4}_{+}}\{Q\},\pi_{\mathbb{S}^{3}_{++}}\{R\}\}, where \pi_{\mathbb{S}^{4}_{+}} and \pi_{\mathbb{S}^{3}_{++}} denote the Euclidean projections onto \mathbb{S}^{4}_{+} and \mathbb{S}^{3}_{++}, respectively. To compute them, we follow the well-known results in [15]. That is, let K=V\Lambda V^{\top} be the eigenvalue decomposition of a matrix K. Then, \pi_{\mathbb{S}_{+}}\{K\}\triangleq V\max\{0,\Lambda\}V^{\top} and \pi_{\mathbb{S}_{++}}\{K\}\triangleq V\max\{d,\Lambda\}V^{\top} for some small d>0, where the maximum is applied to each eigenvalue.
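These projections amount to clipping eigenvalues from below; a minimal sketch (ours) of both projections and of the block-diagonal composition reads:

```python
# Eigenvalue-clipping projections of [15]: pi_{S+} clips eigenvalues at 0,
# pi_{S++} clips at a small d > 0.
import numpy as np

def project_psd(K: np.ndarray, d: float = 0.0) -> np.ndarray:
    K = 0.5 * (K + K.T)                    # symmetrise against round-off
    lam, V = np.linalg.eigh(K)             # K = V diag(lam) V^T
    return (V * np.maximum(lam, d)) @ V.T  # V diag(max(lam, d)) V^T

def project_Qn(P, Q, R, d=1e-8):           # projection onto Q^n, blockwise
    return project_psd(P), project_psd(Q), project_psd(R, d)
```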

Figure 4: Overall MPC tuning scheme.
Figure 5: Block diagram of the experimental setting.

5 Simulation study

We now perform numerical simulations to show the advantage of our proposed MPC tuning framework with respect to other gradient-free algorithms available in the literature. Particularly, we compare our approach to the dividing rectangles (DIRECT) algorithm [17], particle swarm optimisation algorithms [18], and Bayesian optimisation algorithms [22, 37]. DIRECT is a sample-based global optimisation method for Lipschitz continuous functions defined over multidimensional domains. It partitions the space into hyperrectangles, each with a sampled point at its centre. The cost function is evaluated at these centre points, and the algorithm keeps sampling and dividing into rectangles until the iteration limit has been reached. PSO algorithms solve the problem by having a population of so-called particles that move along the search space according to update rules for their positions and velocities. The movement of the particles is guided by their own best known position as well as the entire swarm's best known position in the search space. At each iteration, the cost function is evaluated for every particle in the swarm. Lastly, BO seeks to identify the optimal tuning parameters by strategically exploring and exploiting the parameter space. Exploration aims to evaluate the objective at points in the decision space with the goal of improving the accuracy of a surrogate model of the objective function, while exploitation aims to use the surrogate model to identify decisions that reduce (or increase) the objective function [22].

Intuition suggests that methods such as DIRECT and PSO would perform many cost function evaluations in order to find the optimal value, since they rely on the number of rectangles/centre points (DIRECT) and particles (PSO). We demonstrate that this is indeed true for the MPC tuning problem described in Section 4.

We perform simulations using the tuning scheme from Fig. 4, in which the diesel engine block is simulated via a high-fidelity model, see e.g. [30]. This model is physics-based with the form (12), and its parameters have been obtained from system identification experiments conducted at Toyota's Higashi-Fuji Technical Centre. The specific equations and parameters are not included for commercial-in-confidence purposes. We focus on tuning controller \sigma^{*}=6 (see Fig. 3) over a segment of the NEDC called the urban drive cycle (UDC). We tune diagonal matrices \{P^{6},Q^{6},R^{6}\} with positive elements so that we can compare to vector-valued methods like DIRECT, PSO, and BO. We emphasise that another contribution of the proposed optimisation framework is that it allows the direct tuning of the weighting matrices so that they satisfy the required constraints of symmetry and positive (semi) definiteness. For these simulations, we do not include the augmented integrator states.

The cost function is the tracking error as per (15), and to assess the complexity of each algorithm we use the total number of cost function evaluations needed to achieve the optimal cost function value up to a certain tolerance. The initial calibration is P_{0}^{6}=Q_{0}^{6}=\mbox{diag}\{0.01,0,0.2,0.01\} and R_{0}^{6}=10^{-5}I_{3\times 3}, and the number of tuning parameters is eleven. The ZO-RMS algorithm uses step size h_{k}=5\times 10^{-3}/\sqrt{k+1} and oracle precision \mu=8\times 10^{-5} (L_{0}(f)=6, \epsilon=0.008, \bar{r}=2, n=11). For the PSO algorithm we picked 40 particles, as suggested by the literature [18]. For DIRECT, PSO, and BO, we use the decision variable range [0.01,5].

One realisation run of all four algorithms is summarised in Table 1, where f_opt denotes the optimal value of the cost function achieved by each algorithm, followed by the number of cost function evaluations and the total execution time of each algorithm. We can see that, in order to achieve a similar value of f_opt (up to a 0.0002 difference), the ZO-RMS algorithm performs 20 function evaluations, compared to 66 for DIRECT, 200 for PSO, and 28 for BO. We emphasise that each function evaluation in this context corresponds to a closed-loop experiment, and thus having to perform many of them is intractable in practice. We can see that BO requires only a few more cost function evaluations than ZO-RMS. It can also be observed that DIRECT, PSO, and BO take longer to execute than ZO-RMS.

Since PSO, BO, and ZO-RMS are stochastic, we also performed a Monte Carlo simulation of 100 realisations in order to compare the three of them more accurately. These results are listed in Table 2. We can see that all achieve a similar optimal value of f_opt on average, but it takes about 199 cost function evaluations on average for PSO to achieve this, versus only 22 for BO and 20 for ZO-RMS, which makes the latter two more efficient for applications in which the plant is in the loop. Overall, BO exhibits comparable performance to ZO-RMS in terms of cost function evaluations and execution time; however, as mentioned in the introduction, these methods only work for vector decision variables, and the PSD matrix structure of the MPC tuning matrices is not preserved by the algorithm. They also lack theoretical guarantees in this context.

Table 1: Comparison between available algorithms.
Algorithm | f_opt  | N° of cost fcn. evaluations | Execution time
DIRECT    | 0.0478 | 66                          | ~0.6 hours
PSO       | 0.0476 | 200                         | ~1.6 hours
BO        | 0.0477 | 28                          | ~0.5 hours
ZO-RMS    | 0.0476 | 20                          | ~0.3 hours
Table 2: Monte Carlo simulation (100 realisations).
Algorithm | E{f_opt} | Var{f_opt} | E{N° of cost fcn. evaluations}
PSO       | 0.0472   | 5×10^-8    | 199
BO        | 0.0477   | 7×10^-8    | 22
ZO-RMS    | 0.0496   | 9×10^-6    | 20

6 Experiments

The experimental testing of the proposed MPC tuning framework was performed at Toyota's Higashi-Fuji Technical Center in Susono, Japan. The test bench is equipped with a diesel engine, a transient dynamometer, and a dSPACE DS1006 processor board, which is in turn interfaced to the ECU. The block diagram in Fig. 5 depicts the overall experimental setting. The ECU logs sensor data from the engine and transmits the current state information to the controller. In addition, the ECU directly controls all engine sub-systems; however, the ECU commands for the three actuators (throttle, EGR valve, and VGT) can be overridden by the MPC commands through a virtual switch in the ControlDesk interface. The proposed tuning architecture is implemented iteratively in-the-loop as per Fig. 4, in which the diesel engine block is the real engine described above. Particularly, the switched MPC is implemented in real-time on the dSPACE board to generate the required closed-loop drive cycle responses, and the computations of the ZO-RMS algorithm, i.e. the oracle, projections, and next set of parameters, are performed in Matlab.

6.1 Choice of algorithm parameters

To run ZO-RMS we need to choose the parameters \mu and h_{k}. Corollary 1 provides a choice for these parameters such that a given level of accuracy is achieved by the algorithm in N iterations as per (6). However, this choice depends on the Lipschitz constant L_{0}(f) and the bound \bar{r}. These can be numerically bounded depending on the application. Since we apply the proposed optimisation framework to MPC tuning on a real engine, it is not possible to find an explicit expression for the cost function in terms of P,Q,R. Instead, a common approach is to adopt experimental optimisation methods such as the ones in [7, 1] to ensure that the Lipschitz condition is at least satisfied on the experimental data. Particularly, since the oracle \mathcal{O}_{\mu} uses values of the cost function at two different points, we can store these values and use a consistency-check algorithm to verify the Lipschitz condition (see e.g. [7, Section 4.3]). This could be done by performing high-fidelity simulations of the closed loop to compute the oracle at different points, and picking the largest constant that satisfies the Lipschitz condition as a first estimate. Later on, in the experiments, we can adjust this first estimate accordingly and verify whether the bound is satisfied for the collected experimental measurements. Similarly, \bar{r} can be estimated from the stored data of X_{0} and \hat{X}_{N} from multiple simulations so that \left\|X_{0}-X^{*}\right\|\leq\bar{r} holds over the experimental space. In this paper, we initially picked conservative estimates from simulations, and later refined them heuristically during experimental testing. Online estimation of the Lipschitz constant L_{0}(f) could improve the aforementioned methods and is considered a future direction.
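The consistency check described above can be sketched as follows (ours): given stored pairs (X_{i}, f(X_{i})), the largest observed difference quotient is a lower estimate that any valid L_{0}(f) must exceed:

```python
# Lipschitz consistency check on stored cost evaluations (a sketch, ours).
import numpy as np
from itertools import combinations

def lipschitz_lower_estimate(Xs, fs):
    """Xs: list of parameter matrices; fs: corresponding cost values."""
    L = 0.0
    for (Xi, fi), (Xj, fj) in combinations(zip(Xs, fs), 2):
        dX = np.linalg.norm(Xi - Xj, 'fro')
        if dX > 0:
            L = max(L, abs(fi - fj) / dX)
    return L   # any Lipschitz constant consistent with the data is >= L
```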

In what follows, we tune three of the twelve MPC controllers depicted in Fig. 3 over three different drive cycle segments. Particularly, we tune: controller \sigma^{*}=9 over the middle segment of the UDC (49–117 [s]), which we call UDC2; controller \sigma^{*}=5 over the first 100 seconds of the WLTC, which we call WLTC1; and controller \sigma^{*}=6 over the last segment of the UDC (117–195 [s]), which we call UDC3.

For all the experiments below we consider an initial calibration Q_{0}=\mbox{diag}\{\bar{Q}_{0},\bar{Q}_{e,0}\}, \bar{Q}_{0}=\mbox{diag}\{0.01,0.01,0.2,1\}, \bar{Q}_{e,0}=0.00001\times\mbox{diag}\{1,100\}, P_{0}=Q_{0}, R_{0}=I_{3\times 3} for all twelve controllers, where \bar{Q}_{0} and \bar{Q}_{e,0} relate to the engine state and the integrator state, respectively. In addition, all the drive cycle references are generated by the ECU based on a driver model, which is responsible for converting the drive cycle vehicle speed reference into reference operating points in terms of engine speed and load.

Figure 6: Engine response over the middle UDC segment using the initial parameters \{P_{0},Q_{0},R_{0}\} for all twelve controllers.
Figure 7: Engine response over the middle UDC segment using the tuned \{\hat{P}^{9},\hat{Q}^{9},\hat{R}^{9}\} for controller nine, and \{P_{0},Q_{0},R_{0}\} for the remaining eleven controllers.

6.2 Controller σ=9\sigma^{*}=9 over UDC2

The engine output response to the UDC2 segment with the baseline controller is given in Fig. 6. The grey area illustrates when controller \sigma^{*}=9 is active, and we consider its tracking performance unsatisfactory (we draw attention to this with the boxed area). We would like to improve the tracking performance of this controller by using ZO-RMS. The parameters used in this experiment are¹ \mu=0.003\times 10^{-5} and h_{k}=10^{-6}/\sqrt{k+1} (L_{0}(f)=2.37\times 10^{4}, \epsilon=0.01, \bar{r}=1.1, n=9).

¹We note that for this particular experiment we used a very conservative estimate of the Lipschitz constant, based on different experimental tests.

In this experiment, we consider Q^{9}=\mbox{diag}\{\bar{Q}^{9},\bar{Q}_{e}^{9}\} and use ZO-RMS to tune \{\bar{Q}^{9},\bar{Q}_{e}^{9},R^{9}\}, whilst P^{9} is constructed at each iteration by solving the discrete-time algebraic Riccati equation (DARE) [24], P^{9}=\texttt{dare}\Big(\begin{bmatrix}A^{9}&0\\ -C^{9}&I\end{bmatrix},\begin{bmatrix}B^{9}\\ 0\end{bmatrix},\begin{bmatrix}\bar{Q}^{9}&0\\ 0&\bar{Q}_{e}^{9}\end{bmatrix},R^{9}\Big). We emphasise that the proposed tuning framework provides enough flexibility to tune either a single matrix or all of the MPC matrices, depending on the particular application. For instance, for this drive cycle segment, tuning the integrator state matrix \bar{Q}_{e} helped significantly in improving tracking performance, but for the experiments further below it was not necessary to tune this matrix and it was thus fixed.
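Using SciPy's solve_discrete_are in place of Matlab's dare, this per-iteration construction can be sketched as (ours; matrix names are placeholders):

```python
# Per-iteration terminal weight: P solves the DARE for the augmented system.
import numpy as np
from scipy.linalg import solve_discrete_are, block_diag

def terminal_weight(A, B, C, Q_bar, Qe_bar, R):
    nx, ne = A.shape[0], C.shape[0]
    A_aug = np.block([[A, np.zeros((nx, ne))],
                      [-C, np.eye(ne)]])
    B_aug = np.vstack([B, np.zeros((ne, B.shape[1]))])
    return solve_discrete_are(A_aug, B_aug, block_diag(Q_bar, Qe_bar), R)
```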

The resulting optimal tuning matrices for N=11N=11 iterations are given by

\hat{\bar{Q}}^{9}=\begin{bmatrix}0.0611&0.0277&0.0742&0.1389\\ 0.0277&0.1393&-0.0554&0.0318\\ 0.0742&-0.0554&0.2177&0.0963\\ 0.1389&0.0318&0.0963&1.0701\end{bmatrix},
\hat{R}^{9}=\begin{bmatrix}1.8037&-0.0924&0.4081\\ -0.0924&1.5841&0.2252\\ 0.4081&0.2252&1.2879\end{bmatrix},
\hat{\bar{Q}}_{e}^{9}=\begin{bmatrix}0.0013&0.0014\\ 0.0014&0.0041\end{bmatrix}.

The engine output response over UDC2 using the above matrices is shown in Fig. 7. The tracking performance has significantly improved in the region where controller nine is acting. Particularly, the tuned matrices provide a performance improvement² of 16.2% with only eleven iterations.

²We compute the percentage of improvement with respect to the baseline controller as 100\times[f(P_{0},Q_{0},R_{0})-f(\hat{P}^{\sigma^{*}},\hat{Q}^{\sigma^{*}},\hat{R}^{\sigma^{*}})]/f(P_{0},Q_{0},R_{0}).

6.3 Controller $\sigma^{*}=5$ over WLTC1

The engine output response to the WLTC1 segment with the baseline controller is given in Fig. 8. The grey area illustrates when controller $\sigma^{*}=5$ is active, and we have boxed the areas with unsatisfactory performance that we would like to improve. The parameters used in this experiment are $\mu=2.5\times 10^{-6}$ and $h_{k}=10^{-6}/\sqrt{k+1}$ ($L_{0}(f)=3450$, $\epsilon=0.1$, $\bar{r}=0.1$, $n=7$). Similar to the previous experiment, the matrix $P^{5}$ is constructed by solving the corresponding DARE, and we consider $Q^{5}=\mbox{diag}\{\bar{Q}^{5},\bar{Q}_{e}^{5}\}$, but now the integrator state matrix $\bar{Q}_{e}^{5}$ is not tuned; it is kept equal to the initial value $\bar{Q}_{e,0}=0.00001\times\mbox{diag}\{1,100\}$.

Consequently, the tuning parameters are $\{\bar{Q}^{5},R^{5}\}$, and their optimal values for $N=11$ iterations are given by

$$\begin{aligned}
\hat{\bar{Q}}^{5}&=\begin{bmatrix}0.0101&-0.0078&-0.0213&0.0102\\ -0.0078&0.0273&0.0061&0.0170\\ -0.0213&0.0061&0.1760&0.0004\\ 0.0102&0.0170&0.0004&0.9967\end{bmatrix},\\
\hat{R}^{5}&=\begin{bmatrix}0.9992&-0.0066&-0.0017\\ -0.0066&0.9839&0.0016\\ -0.0017&0.0016&0.9806\end{bmatrix}.
\end{aligned}$$

The engine output response over WLTC1 using the above matrices is shown in Fig. 9. The tracking performance has improved in the region where controller five is acting. In particular, the tuned matrices provide a performance improvement of 22.67% with only eleven iterations, as illustrated by the boxed regions in Fig. 9. The tracking has clearly improved in the boxed areas, although in the grey area at 40–50 s in Figs. 8 and 9 the EGR tracking deteriorated slightly. However, we emphasise that the cost function being minimised is the overall tracking performance over every grey area where controller $\sigma^{*}=5$ is acting, and the overall tracking improvement of 22.67% is therefore considered satisfactory. Different cost functions or tuning approaches could potentially be used to tackle each region individually.

Figure 8: Engine response over the first WLTC segment using the initial parameters $\{P_{0},Q_{0},R_{0}\}$ for all twelve controllers.

Figure 9: Engine response over the first WLTC segment using the tuned $\{\hat{P}^{5},\hat{Q}^{5},\hat{R}^{5}\}$ for controller five, and $\{P_{0},Q_{0},R_{0}\}$ for the remaining eleven controllers.

6.4 Controller $\sigma^{*}=6$ over UDC3

The engine output response to the UDC3 segment with the baseline controller is given in Fig. 10. The grey area illustrates when controller $\sigma^{*}=6$ is active, and we have boxed the areas with unsatisfactory performance. The parameters used in this experiment are $\mu=2.5\times 10^{-6}$ and $h_{k}=10^{-5}/\sqrt{k+1}$ ($L_{0}(f)=2450$, $\epsilon=0.1$, $\bar{r}=1.2$, $n=11$). The integrator state matrix $\bar{Q}_{e}$ was not tuned but kept equal to the initial value $\bar{Q}_{e,0}=0.00001\times\mbox{diag}\{1,100\}$. In contrast to the previous two experiments, we do include $P^{6}$ in the tuning process for this experiment. Specifically, we construct it as $P^{6}=\mbox{diag}\{\bar{P}^{6},\bar{Q}_{e,0}\}$ and tune $\bar{P}^{6}$.

Consequently, the tuning parameters are $\{\bar{P}^{6},\bar{Q}^{6},R^{6}\}$, and their optimal values for $N=8$ iterations are given by

$$\begin{aligned}
\hat{\bar{P}}^{6}&=\begin{bmatrix}0.0502&-0.0117&-0.0398&0.0057\\ -0.0117&0.0102&0.0146&0.0118\\ -0.0398&0.0146&0.2003&-0.0426\\ 0.0057&0.0118&-0.0426&1.0022\end{bmatrix},\\
\hat{\bar{Q}}^{6}&=\begin{bmatrix}0.4699&0.0708&0.0192&0.0901\\ 0.0708&0.2329&-0.1597&-0.1684\\ 0.0192&-0.1597&0.3459&0.0016\\ 0.0901&-0.1684&0.0016&1.1911\end{bmatrix},\\
\hat{R}^{6}&=\begin{bmatrix}1.1227&-0.1419&0.0305\\ -0.1419&1.0621&0.0534\\ 0.0305&0.0534&1.2843\end{bmatrix}.
\end{aligned}$$

The engine output response over UDC3 using the above matrices is shown in Fig. 11. We can see that the tracking performance has improved in the region where controller six is acting. In particular, the tuned matrices provide a performance improvement of 7.73% with only eight iterations.

Figure 10: Engine response over the third UDC segment using the initial parameters $\{P_{0},Q_{0},R_{0}\}$ for all twelve controllers.

Figure 11: Engine response over the third UDC segment using the tuned $\{\hat{P}^{6},\hat{Q}^{6},\hat{R}^{6}\}$ for controller six, and $\{P_{0},Q_{0},R_{0}\}$ for the remaining eleven controllers.

The theoretical results in Section 3 are sufficient conditions on the step size $h_{k}$ and the oracle's precision $\mu$ that ensure a given level of accuracy within a fixed number of iterations. We note that the complexity bounds presented in this paper hold for any problem that fits the optimisation framework surrounding (1). Depending on the application, these bounds may be more or less conservative, as we observed in the air-path control case study. In fact, since these are sufficient conditions, the experimental performance of the algorithm showed tracking error improvement in only a few iterations. Nevertheless, the theoretical results serve as a design tool that guides the choice of algorithm parameters, as opposed to a heuristic choice.
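To make the roles of $h_{k}$ and $\mu$ concrete, the following is a minimal sketch of one possible implementation of the ZO-RMS iteration, following the update $X_{k+1}=\pi_{\mathbb{Q}^{n}}\{X_{k}-h_{k}\mathcal{O}_{\mu}(X_{k},U_{k})\}$ and the finite-difference oracle used in the appendices. The Gaussian sampler assumes the density proportional to $e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}$ on symmetric matrices, and `project` stands in for $\pi_{\mathbb{Q}^{n}}$ (e.g. the PSD projection sketched after (21) in Appendix B); in the case study, each cost evaluation corresponds to a closed-loop drive cycle run.

```python
import numpy as np

def sample_symmetric_gaussian(n, rng):
    # Symmetric U with density ~ exp(-||U||_F^2 / 2): averaging a square
    # Gaussian matrix with its transpose yields N(0,1) diagonal entries and
    # N(0,1/2) off-diagonal entries, matching
    # ||U||_F^2 = sum_i u_ii^2 + 2 sum_{i<j} u_ij^2.
    A = rng.standard_normal((n, n))
    return (A + A.T) / 2.0

def zo_rms(f, X0, step, mu, N, project, seed=0):
    """Sketch of ZO-RMS: a zeroth-order step followed by projection onto Q^n.

    f: black-box closed-loop cost; step(k): the schedule h_k, e.g.
    lambda k: 1e-6 / np.sqrt(k + 1); mu: oracle precision; project: pi_{Q^n}.
    """
    rng = np.random.default_rng(seed)
    X = X0.copy()
    for k in range(N + 1):
        U = sample_symmetric_gaussian(X.shape[0], rng)
        oracle = (f(X + mu * U) - f(X)) / mu * U   # O_mu(X_k, U_k)
        X = project(X - step(k) * oracle)          # X_{k+1}
    return X
```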

Remark 1

Overall, we observed that the proposed approach successfully improves tracking performance over three different drive cycle segments, i.e. different data sets, which sheds some light on the robustness of the method. A theoretical study of robustness guarantees is an interesting direction for future work.

7 Conclusion

This paper provided an optimisation framework with theoretical guarantees for the minimisation of non-smooth and possibly non-convex cost functions with matrix parameters. We applied the proposed algorithm to tune MPCs in the context of air-path control in diesel engines, and validated it through experimental testing. The algorithm improves performance within a few iterations, as demonstrated over different engine drive cycles. It therefore creates potential for the development of efficient tuning tools for advanced controllers (and potentially retuning online), even though the theoretical complexity bounds may be large depending on the application.

Future work includes exploring algorithms over other matrix manifolds with practical applications, testing the proposed algorithms on stochastic drive cycles, establishing robustness guarantees, and online estimation/adaptation of the Lipschitz constant $L_{0}(f)$.

Appendix A Technical results

Appendix A provides the preliminary results needed for Appendices B and C.

Consider a function $f:\mathbb{Q}^{n}\rightarrow\mathbb{R}$, and let us define the Gaussian approximation of $f$ as

$$f_{\mu}(X)\triangleq\frac{1}{\kappa}\int_{\mathbb{G}^{n}}f(X+\mu U)e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,\mathrm{d}U.\tag{16}$$
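Although the algorithm never evaluates (16) directly, the integral can be estimated by Monte Carlo, which is a useful way to build intuition for the smoothing. The sketch below assumes the Gaussian measure on symmetric matrices with density proportional to $e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}$ (unit-variance diagonal, variance-$1/2$ off-diagonal entries).

```python
import numpy as np

def f_mu_estimate(f, X, mu, samples=100_000, seed=0):
    """Monte Carlo estimate of the Gaussian approximation f_mu(X) in (16)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    total = 0.0
    for _ in range(samples):
        A = rng.standard_normal((n, n))
        U = (A + A.T) / 2.0      # symmetric Gaussian sample
        total += f(X + mu * U)
    return total / samples
```

For small $\mu$ the estimate should stay within $\mu L_{0}(f)\sqrt{(n^{2}+n)/2}$ of $f(X)$, consistent with (19) below.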

We first introduce the following definition.

Definition 2

Let $f$ be a convex function. A matrix $G$ is called a subgradient of the function $f$ at $X\in\mathbb{R}^{n\times n}$ if for any $Y\in\mathbb{R}^{n\times n}$ we have $f(Y)\geq f(X)+\langle G,Y-X\rangle_{F}$. The set of all subgradients of $f$ at $X\in\mathbb{R}^{n\times n}$ is called the subdifferential and is denoted by $\partial f(X)$.

The following lemma holds for $f_{\mu}$.

Lemma 2
  1. (I) If $f$ is convex and $G\in\partial f(X)$, then
     $$f_{\mu}(X)\geq\frac{1}{\kappa}\int_{\mathbb{G}^{n}}\big(f(X)+\mu\langle G,U\rangle_{F}\big)e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,\mathrm{d}U=f(X).\tag{17}$$
  2. (II) The gradient matrix $\nabla f_{\mu}$ satisfies
     $$\nabla f_{\mu}(X)=\frac{1}{\kappa}\int_{\mathbb{G}^{n}}\frac{f(X+\mu U)-f(X)}{\mu}e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,U\,\mathrm{d}U.\tag{18}$$
  3. (III) If $f$ is Lipschitz continuous as per Assumption 1, then
     $$|f_{\mu}(X)-f(X)|\leq\mu L_{0}(f)\sqrt{\tfrac{n^{2}+n}{2}},\quad X\in\mathbb{Q}^{n}.\tag{19}$$
  4. (IV) Under Assumption 1, $f_{\mu}$ has a Lipschitz continuous gradient, i.e. $|f_{\mu}(Y)-f_{\mu}(X)-\left\langle\nabla f_{\mu}(X),Y-X\right\rangle_{F}|\leq\frac{1}{2}L_{1}(f_{\mu})\left\|X-Y\right\|_{F}^{2}$ for all $X,Y\in\mathbb{Q}^{n}$, with $L_{1}(f_{\mu})=(2L_{0}(f)/\mu)\sqrt{(n^{2}+n)/2}$.

Proof:

  1. (I)

We prove this statement as follows:
$$\begin{aligned}
f_{\mu}(X)&\stackrel{(16)}{=}\frac{1}{\kappa}\int_{\mathbb{G}^{n}}f(X+\mu U)e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,\mathrm{d}U\\
&\stackrel{\text{Def.~2}}{\geq}\frac{1}{\kappa}\int_{\mathbb{G}^{n}}\big(f(X)+\langle G,\mu U\rangle_{F}\big)e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,\mathrm{d}U\\
&=f(X)+\frac{1}{\kappa}\int_{\mathbb{G}^{n}}\langle G,\mu U\rangle_{F}\,e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,\mathrm{d}U\\
&=f(X)+\mu\,\mbox{Tr}\left\{\frac{1}{\kappa}\int_{\mathbb{G}^{n}}G^{\top}U\,e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,\mathrm{d}U\right\}\\
&=f(X),
\end{aligned}$$
where the last equality follows from $\mathbb{E}_{U}\{U\}=0_{n\times n}$.

  2. (II)

Define $Y\triangleq X+\mu U$, and note that by the change-of-variable formula for multivariate integrals, $\mathrm{d}U=(1/\mu^{\bar{n}})\mathrm{d}Y$, where $\bar{n}=n(n+1)/2$. Consequently,
$$f_{\mu}(X)=\frac{1}{\mu^{\bar{n}}\kappa}\int_{\mathbb{G}^{n}}f(Y)e^{-\frac{1}{2\mu^{2}}\left\|Y-X\right\|_{F}^{2}}\,\mathrm{d}Y.$$

    Then,

$$\begin{aligned}
\nabla f_{\mu}(X)&=\frac{1}{\mu^{\bar{n}}\kappa}\int_{\mathbb{G}^{n}}f(Y)e^{-\frac{1}{2\mu^{2}}\left\|Y-X\right\|_{F}^{2}}\left(-\frac{1}{2\mu^{2}}\right)\frac{\partial\,\mbox{Tr}\left\{(Y-X)^{2}\right\}}{\partial X}\,\mathrm{d}Y\\
&=\frac{1}{\mu^{\bar{n}+2}\kappa}\int_{\mathbb{G}^{n}}f(Y)e^{-\frac{1}{2\mu^{2}}\left\|Y-X\right\|_{F}^{2}}(Y-X)\,\mathrm{d}Y\\
&=\frac{1}{\mu\kappa}\int_{\mathbb{G}^{n}}f(X+\mu U)e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,U\,\mathrm{d}U\\
&=\frac{1}{\kappa}\int_{\mathbb{G}^{n}}\frac{f(X+\mu U)-f(X)}{\mu}e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,U\,\mathrm{d}U,
\end{aligned}$$
where the last equality follows from $\mathbb{E}_{U}\{U\}=0_{n\times n}$.

  3. (III)

By definition of $f_{\mu}$ in (16) we have
$$f_{\mu}(X)-f(X)=\frac{1}{\kappa}\int_{\mathbb{G}^{n}}\big(f(X+\mu U)-f(X)\big)e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,\mathrm{d}U.$$
Then, by Lipschitz continuity of $f$ (cf. Assumption 1),
$$|f_{\mu}(X)-f(X)|\leq\mu L_{0}(f)\frac{1}{\kappa}\int_{\mathbb{G}^{n}}\left\|U\right\|_{F}e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,\mathrm{d}U\leq\mu L_{0}(f)\sqrt{\tfrac{n^{2}+n}{2}},$$

    where the last inequality follows from Lemma 1.

  4. (IV)

Indeed, for all $X,Y\in\mathbb{Q}^{n}$ we have from (18) and Assumption 1 that
$$\left\|\nabla f_{\mu}(X)-\nabla f_{\mu}(Y)\right\|_{F}\leq\frac{2L_{0}(f)}{\kappa\mu}\int_{\mathbb{G}^{n}}\left\|U\right\|_{F}e^{-\frac{1}{2}\left\|U\right\|_{F}^{2}}\,\mathrm{d}U,$$

    and the result follows from Lemma 1. \blacksquare

For the random oracle in (3), we have the following bound.

Theorem 3

If $f$ satisfies Assumption 1, then
$$\mathbb{E}_{U}\{\left\|\mathcal{O}_{\mu}(X,U)\right\|^{2}_{F}\}\leq\frac{1}{4}L_{0}^{2}(f)\big(n^{4}+2n^{3}+5n^{2}+4n\big).\tag{20}$$

Proof: Note that $\left\|\mathcal{O}_{\mu}(X,U)\right\|_{F}^{2}=\mbox{Tr}\left\{\mathcal{O}_{\mu}(X,U)^{\top}\mathcal{O}_{\mu}(X,U)\right\}=\frac{(f(X+\mu U)-f(X))^{2}}{\mu^{2}}\left\|U\right\|_{F}^{2}$. Then, $\mathbb{E}_{U}\{\left\|\mathcal{O}_{\mu}(X,U)\right\|^{2}_{F}\}\leq\frac{1}{\mu^{2}}\mathbb{E}_{U}\{L_{0}^{2}(f)\left\|\mu U\right\|_{F}^{2}\left\|U\right\|_{F}^{2}\}=L_{0}^{2}(f)\mathbb{E}_{U}\{\left\|U\right\|_{F}^{4}\}$. The proof thus follows from Lemma 1. $\blacksquare$
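The bound (20) can also be sanity-checked numerically. The sketch below estimates $\mathbb{E}_{U}\{\left\|\mathcal{O}_{\mu}(X,U)\right\|_{F}^{2}\}$ for the Lipschitz test cost $f(X)=\left\|X\right\|_{F}$ (for which $L_{0}(f)=1$) and compares it with the right-hand side of (20); this is an illustrative check under the same symmetric Gaussian sampling assumption as before, not part of the paper's experiments.

```python
import numpy as np

def oracle_second_moment(f, X, mu, samples=20_000, seed=0):
    """Monte Carlo estimate of E_U ||O_mu(X, U)||_F^2."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    acc = 0.0
    for _ in range(samples):
        A = rng.standard_normal((n, n))
        U = (A + A.T) / 2.0                      # symmetric Gaussian sample
        G = (f(X + mu * U) - f(X)) / mu * U      # O_mu(X, U)
        acc += np.sum(G * G)                     # ||O_mu(X, U)||_F^2
    return acc / samples

n, L0 = 3, 1.0
est = oracle_second_moment(lambda X: np.linalg.norm(X, "fro"), np.eye(n), mu=1e-3)
bound = 0.25 * L0**2 * (n**4 + 2 * n**3 + 5 * n**2 + 4 * n)
print(est <= bound)  # expected: True
```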

Appendix B Proof of Theorem 1

We first note that the projection $\pi_{\mathbb{Q}^{n}}$ in ZO-RMS satisfies
$$\left\|\pi_{\mathbb{Q}^{n}}\{X\}-Y\right\|_{F}\leq\left\|X-Y\right\|_{F},\tag{21}$$
for all $Y\in\mathbb{Q}^{n}$.
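In the MPC application, $\mathbb{Q}^{n}$ is a set of symmetric positive semidefinite matrices, for which the projection admits a closed form: the nearest symmetric PSD matrix in the Frobenius norm is obtained by symmetrising and clipping negative eigenvalues (cf. [15]). A minimal sketch, assuming $\mathbb{Q}^{n}$ is the PSD cone:

```python
import numpy as np

def project_psd(X):
    """Frobenius-norm projection onto the symmetric PSD cone (cf. [15])."""
    S = (X + X.T) / 2.0                      # symmetrise
    w, V = np.linalg.eigh(S)                 # eigendecomposition
    return (V * np.maximum(w, 0.0)) @ V.T    # clip negative eigenvalues
```

Since the PSD cone is closed and convex, this projection is non-expansive, which yields property (21).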

Now, let the point $X_{k}$ with $k\in\mathbb{N}$ be generated after $k$ iterations of ZO-RMS, and define $r_{k}\triangleq\left\|X_{k}-X^{*}\right\|_{F}$. Then,
$$\begin{aligned}
r_{k+1}^{2}&=\left\|\pi_{\mathbb{Q}^{n}}\{X_{k}-h_{k}\mathcal{O}_{\mu}(X_{k},U_{k})\}-X^{*}\right\|_{F}^{2}\\
&\stackrel{(21)}{\leq}\left\|X_{k}-h_{k}\mathcal{O}_{\mu}(X_{k},U_{k})-X^{*}\right\|_{F}^{2}\\
&=r_{k}^{2}-2h_{k}\left\langle X_{k}-X^{*},\mathcal{O}_{\mu}(X_{k},U_{k})\right\rangle_{F}+h_{k}^{2}\left\|\mathcal{O}_{\mu}(X_{k},U_{k})\right\|_{F}^{2}.
\end{aligned}$$

Then, taking the expectation with respect to $U_{k}$,

$$\begin{aligned}
\mathbb{E}_{U_{k}}\{r_{k+1}^{2}\}&\stackrel{(20)}{\leq}r_{k}^{2}-2h_{k}\mathbb{E}_{U_{k}}\{\left\langle X_{k}-X^{*},\mathcal{O}_{\mu}(X_{k},U_{k})\right\rangle_{F}\}+\tfrac{1}{4}h_{k}^{2}L_{0}^{2}(f)(n^{4}+2n^{3}+5n^{2}+4n)\\
&=r_{k}^{2}-2h_{k}\mbox{Tr}\left\{[X_{k}-X^{*}]^{\top}\mathbb{E}_{U_{k}}\{\mathcal{O}_{\mu}(X_{k},U_{k})\}\right\}+\tfrac{1}{4}h_{k}^{2}L_{0}^{2}(f)(n^{4}+2n^{3}+5n^{2}+4n).
\end{aligned}$$

From (18), we get $\mathbb{E}_{U_{k}}\{\mathcal{O}_{\mu}(X_{k},U_{k})\}=\nabla f_{\mu}(X_{k})$. Therefore,

$$\begin{aligned}
\mathbb{E}_{U_{k}}\{r_{k+1}^{2}\}&\leq r_{k}^{2}-2h_{k}\left\langle X_{k}-X^{*},\nabla f_{\mu}(X_{k})\right\rangle_{F}+\tfrac{1}{4}h_{k}^{2}L_{0}^{2}(f)(n^{4}+2n^{3}+5n^{2}+4n)\\
&\stackrel{(a)}{\leq}r_{k}^{2}-2h_{k}[f(X_{k})-f_{\mu}(X^{*})]+\tfrac{1}{4}h_{k}^{2}L_{0}^{2}(f)(n^{4}+2n^{3}+5n^{2}+4n),
\end{aligned}$$

where inequality $(a)$ is shown as follows:

$$f_{\mu}(X^{*})\stackrel{\text{Def.~2}}{\geq}f_{\mu}(X_{k})+\langle\nabla f_{\mu}(X_{k}),X^{*}-X_{k}\rangle_{F}\stackrel{(17)}{\geq}f(X_{k})-\langle X_{k}-X^{*},\nabla f_{\mu}(X_{k})\rangle_{F}.$$

Taking now the expectation in $U_{0:k-1}$, we obtain
$$\mathbb{E}_{U_{0:k}}\{r_{k+1}^{2}\}\leq\mathbb{E}_{U_{0:k-1}}\{r_{k}^{2}\}-2h_{k}(\phi_{k}-f_{\mu}(X^{*}))+\tfrac{1}{4}h_{k}^{2}L_{0}^{2}(f)(n^{4}+2n^{3}+5n^{2}+4n).$$

Moreover, note that $f_{\mu}(X^{*})\stackrel{(17)}{\geq}f(X^{*})$; thus (19) implies that $f_{\mu}(X^{*})\leq f^{*}+\mu L_{0}(f)\sqrt{(n^{2}+n)/2}$. Consequently,
$$\mathbb{E}_{U_{0:k}}\{r_{k+1}^{2}\}\leq\mathbb{E}_{U_{0:k-1}}\{r_{k}^{2}\}-2h_{k}(\phi_{k}-f^{*})+2h_{k}\mu L_{0}(f)\sqrt{\tfrac{n^{2}+n}{2}}+\tfrac{1}{4}h_{k}^{2}L_{0}^{2}(f)(n^{4}+2n^{3}+5n^{2}+4n).\tag{22}$$

We can now iterate (22) from $k=0$ to $k=N$ and get
$$\mathbb{E}_{U_{0:N}}\{r_{N+1}^{2}\}\leq r_{0}^{2}-2\sum_{k=0}^{N}h_{k}(\phi_{k}-f^{*})+2S_{N}\mu L_{0}(f)\sqrt{\tfrac{n^{2}+n}{2}}+\tfrac{1}{4}(n^{4}+2n^{3}+5n^{2}+4n)L_{0}^{2}(f)\sum_{k=0}^{N}h_{k}^{2}.\tag{23}$$
Since $\mathbb{E}_{U_{0:N}}\{r_{N+1}^{2}\}\geq 0$, from (23) we immediately get (4), completing the proof. $\blacksquare$

Appendix C Proof of Theorem 2

Define $g_{k}\triangleq P_{\mathbb{Q}^{n}}(X_{k},\mathcal{O}_{\mu}(X_{k},U_{k}),h_{k})$. From Lemma 2(IV), we have
$$f_{\mu}(X_{k+1})\leq f_{\mu}(X_{k})-h_{k}\left\langle\nabla f_{\mu}(X_{k}),g_{k}\right\rangle_{F}+\tfrac{h_{k}^{2}L_{1}(f_{\mu})}{2}\left\|g_{k}\right\|_{F}^{2}.$$

From the non-expansiveness of the projection, i.e. (21), and from the fact that $L_{1}(f_{\mu})=(2L_{0}(f)/\mu)\sqrt{(n^{2}+n)/2}$, we have
$$f_{\mu}(X_{k+1})\leq f_{\mu}(X_{k})-h_{k}\left\langle\nabla f_{\mu}(X_{k}),g_{k}\right\rangle_{F}+\tfrac{h_{k}^{2}L_{0}(f)}{\mu}\sqrt{\tfrac{n^{2}+n}{2}}\left\|\mathcal{O}_{\mu}(X_{k},U_{k})\right\|_{F}^{2}.$$

Define $\xi_{k}\triangleq\mathcal{O}_{\mu}(X_{k},U_{k})-\nabla f_{\mu}(X_{k})$; then $\left\langle\nabla f_{\mu}(X_{k}),g_{k}\right\rangle_{F}=\left\langle\mathcal{O}_{\mu}(X_{k},U_{k}),g_{k}\right\rangle_{F}-\left\langle\xi_{k},g_{k}\right\rangle_{F}$. Based on [13, Lemma 1] we have $\left\langle\mathcal{O}_{\mu}(X_{k},U_{k}),g_{k}\right\rangle_{F}\geq\left\|g_{k}\right\|_{F}^{2}+\frac{1}{h_{k}}[\textbf{h}(X_{k+1})-\textbf{h}(X_{k})]$, where $\textbf{h}(X)=0$ if $X\in\mathbb{Q}^{n}$ and $\infty$ otherwise. Since $X_{0}\in\mathbb{Q}^{n}$ and based on (2), we know that $\mathbf{h}(X_{k})=0$ for all $k\geq 0$. Therefore,
$$f_{\mu}(X_{k+1})\leq f_{\mu}(X_{k})+h_{k}\left\langle\xi_{k},g_{k}-P_{\mathbb{Q}^{n}}(X_{k},\nabla f_{\mu}(X_{k}),h_{k})\right\rangle_{F}+h_{k}\left\langle\xi_{k},P_{\mathbb{Q}^{n}}(X_{k},\nabla f_{\mu}(X_{k}),h_{k})\right\rangle_{F}-h_{k}\left\|g_{k}\right\|_{F}^{2}+\tfrac{h_{k}^{2}L_{0}(f)}{\mu}\sqrt{\tfrac{n^{2}+n}{2}}\left\|\mathcal{O}_{\mu}(X_{k},U_{k})\right\|_{F}^{2}.\tag{24}$$

From [13, Proposition 1], we can write $\left\langle\xi_{k},g_{k}-P_{\mathbb{Q}^{n}}(X_{k},\nabla f_{\mu}(X_{k}),h_{k})\right\rangle_{F}\leq\left\|\xi_{k}\right\|_{F}^{2}$. With this fact, we take the expectation in (24) and get
$$h_{k}\mathbb{E}_{U_{k}}\{\left\|g_{k}\right\|_{F}^{2}\}\leq f_{\mu}(X_{k})-\mathbb{E}_{U_{k}}\{f_{\mu}(X_{k+1})\}+h_{k}\sigma^{2}+h_{k}^{2}C(\mu),\tag{25}$$
where we also used $\mathbb{E}_{U_{k}}\{\xi_{k}\}=0$, Theorem 3 to compute $\mathbb{E}_{U_{k}}\{\left\|\mathcal{O}_{\mu}(X_{k},U_{k})\right\|_{F}^{2}\}$, and Assumption 2 to upper bound $\mathbb{E}_{U_{k}}\{\left\|\xi_{k}\right\|_{F}^{2}\}\leq\sigma^{2}$. Recall that $S_{N}\triangleq\sum_{k=0}^{N}h_{k}$ and that $f_{\mu}(X^{*})\geq f^{*}$. Then, taking the expectation in $U_{0:k}$ in (25) and summing, we obtain (8) as required. $\blacksquare$

Acknowledgment

The authors would like to thank the engineering staff at Toyota Higashi-Fuji Technical Centre, Susono, Japan, for their assistance in running the experiments included in this paper.

References

  • [1] Mohamed Osama Ahmed, Sharan Vaswani, and Mark Schmidt. Combining Bayesian optimization and Lipschitz optimization. Machine Learning, 109(1):79–102, 2020.
  • [2] Greg W Anderson, Alice Guionnet, and Ofer Zeitouni. An introduction to random matrices, volume 118. Cambridge university press, 2010.
  • [3] Matthew Anderson, Xi-Lin Li, Pedro Rodriguez, and Tülay Adali. An effective decoupling method for matrix optimization and its application to the ICA problem. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1885–1888. IEEE, 2012.
  • [4] M. Athans and F.C. Schweppe. Gradient matrices and matrix calculations. Technical report, M.I.T. Lincoln Lab., 1965.
  • [5] Peyman Bagheri and Ali Khaki-Sedigh. An analytical tuning approach to multivariable model predictive controllers. Journal of Process Control, 24(12):41–54, 2014.
  • [6] Alexander Bertrand and Marc Moonen. Topology-aware distributed adaptation of Laplacian weights for in-network averaging. In 21st European Signal Processing Conference (EUSIPCO 2013), pages 1–5. IEEE, 2013.
  • [7] Gene A Bunin and Grégory François. Lipschitz constants in experimental optimization. arXiv preprint arXiv:1603.07847, 2016.
  • [8] Gene A Bunin, Fernando Fraire Tirado, Grégory François, and Dominique Bonvin. Run-to-run MPC tuning via gradient descent. In Computer Aided Chemical Engineering, volume 30, pages 927–931. Elsevier, 2012.
  • [9] MM Chaikovskii and Igor' Borisovich Yadykin. Optimal tuning of PID controllers for MIMO bilinear plants. Automation and Remote Control, 70(1):118–132, 2009.
  • [10] Jon Dattorro. Convex optimization & Euclidean distance geometry. Meboo Publishing USA, 2011.
  • [11] Peter J Forrester. Log-gases and random matrices (LMS-34). Princeton University Press, 2010.
  • [12] Jorge L Garriga and Masoud Soroush. Model predictive control tuning methods: A review. Industrial & Engineering Chemistry Research, 49(8):3505–3515, 2010.
  • [13] Saeed Ghadimi, Guanghui Lan, and Hongchao Zhang. Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Mathematical Programming, 155(1-2):267–305, 2016.
  • [14] T. Hayakawa and Y. Kikuchi. The moments of a function of traces of a matrix with a multivariate symmetric normal distribution. South African Statistical Journal, 13(1):71–82, 1979.
  • [15] Nicholas J Higham. Computing a nearest symmetric positive semidefinite matrix. Linear algebra and its applications, 103:103–118, 1988.
  • [16] Alex S Ira, Chris Manzie, Iman Shames, Robert Chin, Dragan Nešić, Hayato Nakada, and Takeshi Sano. Tuning of multivariable model predictive controllers through expert bandit feedback. International Journal of Control, pages 1–9, 2020.
  • [17] Donald R Jones, Cary D Perttunen, and Bruce E Stuckman. Lipschitzian optimization without the Lipschitz constant. Journal of Optimization Theory and Applications, 79(1):157–181, 1993.
  • [18] Gesner A Nery Júnior, Márcio AF Martins, and Ricardo Kalid. A PSO-based optimal tuning strategy for constrained multivariable predictive controllers with model uncertainty. ISA transactions, 53(2):560–567, 2014.
  • [19] Chih-Jen Lin. Projected gradient methods for nonnegative matrix factorization. Neural computation, 19(10):2756–2779, 2007.
  • [20] Sijia Liu, Bhavya Kailkhura, Pin-Yu Chen, Paishun Ting, Shiyu Chang, and Lisa Amini. Zeroth-order stochastic variance reduction for nonconvex optimization. Advances in Neural Information Processing Systems, 31:3727–3737, 2018.
  • [21] Sijia Liu, Xingguo Li, Pin-Yu Chen, Jarvis Haupt, and Lisa Amini. Zeroth-order stochastic projected gradient descent for nonconvex optimization. In 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pages 1179–1183. IEEE, 2018.
  • [22] Qiugang Lu, Ranjeet Kumar, and Victor M Zavala. MPC controller tuning using Bayesian optimization techniques. arXiv preprint arXiv:2009.14175, 2020.
  • [23] Alonso Marco, Philipp Hennig, Jeannette Bohg, Stefan Schaal, and Sebastian Trimpe. Automatic LQR tuning based on Gaussian process global optimization. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pages 270–277. IEEE, 2016.
  • [24] D.Q. Mayne, J.B. Rawlings, C.V. Rao, and P.O.M. Scokaert. Constrained model predictive control: Stability and optimality. Automatica, 36:789–814, 2000.
  • [25] Marcel Menner, Peter Worsnop, and Melanie N Zeilinger. Constrained inverse optimal control with application to a human manipulation task. IEEE Transactions on Control Systems Technology, 2019.
  • [26] Panayotis Mertikopoulos, E Veronica Belmega, Romain Negrel, and Luca Sanguinetti. Distributed stochastic optimization via matrix exponential learning. IEEE Transactions on Signal Processing, 65(9):2277–2290, 2017.
  • [27] Yurii Nesterov and Vladimir Spokoiny. Random gradient-free minimization of convex functions. Foundations of Computational Mathematics, 17(2):527–566, 2017.
  • [28] Ahmed Ramadan, Jongeun Choi, Clark J. Radcliffe, John M. Popovich, and N. Peter Reeves. Inferring control intent during seated balance using inverse model predictive control. IEEE Robotics and Automation Letters, 4(2):224–230, 2019.
  • [29] James Blake Rawlings and David Q Mayne. Model predictive control: Theory and design. Nob Hill Pub., 2009.
  • [30] Gokul S Sankar, Rohan C Shekhar, Chris Manzie, Takeshi Sano, and Hayato Nakada. Fast calibration of a robust model predictive controller for diesel engine airpath. IEEE Transactions on Control Systems Technology, 2019.
  • [32] Jose Eduardo W Santos, Jorge Otávio Trierweiler, and Marcelo Farenzena. Robust tuning for classical MPC through the multi-scenarios approach. Industrial & Engineering Chemistry Research, 58(8):3146–3158, 2019.
  • [33] S Yusef Shafi, Murat Arcak, and Laurent El Ghaoui. Graph weight allocation to meet Laplacian spectral constraints. IEEE Transactions on Automatic Control, 57(7):1872–1877, 2011.
  • [34] Gaurang Shah and Sebastian Engell. Tuning MPC for desired closed-loop performance for mimo systems. In Proceedings of the 2011 American Control Conference, pages 4404–4409. IEEE, 2011.
  • [35] Rohan C Shekhar, Gokul S Sankar, Chris Manzie, and Hayato Nakada. Efficient calibration of real-time model-based controllers for diesel engines–Part I: Approach and drive cycle results. In 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pages 843–848. IEEE, 2017.
  • [36] Rahul Shridhar and Douglas J Cooper. A tuning strategy for unconstrained multivariable model predictive control. Industrial & engineering chemistry research, 37(10):4003–4016, 1998.
  • [37] Farshud Sorourifar, Georgios Makrygirgos, Ali Mesbah, and Joel A Paulson. A data-driven automatic tuning method for MPC under uncertainty using constrained Bayesian optimization. arXiv preprint arXiv:2011.11841, 2020.
  • [38] Ryohei Suzuki, Fukiko Kawai, Hideyuki Ito, Chikashi Nakazawa, Yoshikazu Fukuyama, and Eitaro Aiyoshi. Automatic tuning of model predictive control using particle swarm optimization. In 2007 IEEE Swarm Intelligence Symposium, pages 221–226. IEEE, 2007.
  • [39] C.A. Tracy and H. Widom. On orthogonal and symplectic matrix ensembles. Communications in Mathematical Physics, 177(3):727–754, 1996.
  • [40] JH Van der Lee, WY Svrcek, and BR Young. A tuning algorithm for model predictive controllers based on genetic algorithms and fuzzy decision making. ISA transactions, 47(1):53–59, 2008.
  • [41] Johan Wahlström and Lars Eriksson. Modelling diesel engines with a variable-geometry turbocharger and exhaust gas recirculation by optimization of model parameters for capturing non-linear system dynamics. Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering, 225(7):960–986, 2011.
  • [42] Liuping Wang. Model predictive control system design and implementation using MATLAB®. Springer Science & Business Media, 2009.
  • [43] André Shigueo Yamashita, Paulo Martin Alexandre, Antonio Carlos Zanin, and Darci Odloak. Reference trajectory tuning of model predictive control. Control Engineering Practice, 50:1–11, 2016.
  • [44] Tuo Zhao, Zhaoran Wang, and Han Liu. A nonconvex optimization framework for low rank matrix estimation. In Advances in Neural Information Processing Systems, pages 559–567, 2015.
  • [45] Zhihui Zhu, Qiuwei Li, Gongguo Tang, and Michael B Wakin. Global optimality in low-rank matrix optimization. IEEE Transactions on Signal Processing, 66(13):3614–3628, 2018.