This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Learning the Integral Quadratic Constraints on Plant-Model Mismatch

Wentao Tang This work is supported by the National Science Foundation (Award #2414369).Wentao Tang is an assistant professor with the Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC 27695, U.S.A. wentao_tang@ncsu.edu
Abstract

While a characterization of plant-model mismatch is necessary for robust control, the mismatch usually can not be described accurately due to the lack of knowledge about the plant model or the complexity of nonlinear plants. Hence, this paper considers this problem in a data-driven way, where the mismatch is captured by parametric forms of integral quadratic constraints (IQCs) and the parameters contained in the IQC equalities are learned from sampled trajectories from the plant. To this end, a one-class support vector machine (OC-SVM) formulation is proposed, and its generalization performance is analyzed based on the statistical learning theory. The proposed approach is demonstrated by a single-input-single-output time delay mismatch and a nonlinear two-phase reactor with a linear nominal model, showing accurate recovery of frequency-domain uncertainties.

I Introduction

Model-based control, with model predictive control [1] as a representative scheme, is known to be the mainstream control strategy in practice, especially for multivariable constrained control. Due to the inherent complexity of plant dynamics, the existence of disturbances and noises, as well as the expense of system identification, plant models are often far from precise, which leaves the identification of plant-model mismatch a major long-lasting issue [2, 3]. The problem of controller performance monitoring (or mismatch detection) [4, 5, 6, 7, 8] is closely related to the mismatch identification – typically, when the controller performance is found to be largely deteriorated, then mismatch needs to be characterized and compensated in the nominal model. Here, we focus on the identification problem.

From a pragmatic point of view, most model-based control schemes applied in industrial practice are linear model-based (see, e.g., [9, 10]), where transfer functions are used to represent input-output relations. Thus, we desire that plant-model mismatch is characterized in terms of the error on the nominal models in a Laplace domain. Indeed, if the plant and the nominal model can be both considered as transfer functions, then the mismatch can be identified using informative input signals with rich perturbations in the frequency range of interest [11, 12]. However, the actual plant dynamics may not be a linear one; arguably, if the plant was almost linear, then the identification of the plant model should have been accurate, making the mismatch detection and identification of less value. Thus, the plant-model mismatch is better described as one between a nonlinear plant and a linear nominal model. This can be addressed via information-theoretic [13], Gaussian process [14], or deep learning [15, 16] approaches. However, a nonlinearly described mismatch (e.g., a neural network) can be difficult to utilize in linear control, requiring significant changes or even completely replacement of the control algorithm.

Hence, we aim to characterize the underlying nonlinear plant-model mismatch “in a linear way”, so as to allow the controller design or tuning algorithm to remain in a linear framework. This is conceptually related to the classical problem of specifying the conditions for nonlinear systems to be robustly controlled by linear controllers. There exist a range of tools for this purpose, from specific to generic – absolute stability, Popov criteria, passivity, dissipativity, and integral quadratic constraint (IQC) [17]. Simply speaking, these conditions serve as “bounds” on the nonlinear non-idealities, giving rise to linear/quadratic inequalities to constrain the uncertainties and thus enabling the synthesis of robust control laws as linear ones [18, 19, 20]. In particular, the most generic characterization mentioned here, the IQC of a system, refers to the existence of dynamic multipliers on its input and output signals, such that the signal from the dynamic multipliers satisfy a quadratic dissipative inequality [21, 22, 23]. Hence, the problem of interest is how to learn the IQC on the plant-model mismatch from process input and output data.

The problem that we encounter here is similar to the ones pertaining to the learning of L2L_{2}-gain, passivity index, and dissipativity of an unknown (black-box) nonlinear system. For linear systems, these learning problems were explored based on Willems’ Fundamental Lemma [24, 25, 26]. In the author’s previous works [27, 28, 29], machine learning techniques are proposed for learning the dissipativity of nonlinear systems. A recent work from Bridgeman and her coworkers [30] gave preliminary analysis of the statistical learning theory underpinning dissipativity learning.

In this paper, we adopt the one-class support vector machine (OC-SVM) approach [27] for IQC learning and provide a generalization performance bound (similar to [30]). Specifically, in the IQC, the dynamic multiplier is fixed by choosing basis transfer functions (filters), so that the inequality specifying the IQC is parameterized by a symmetric matrix with a positive definite input diagonal block and a negative definite output diagonal block. The IQC is then expressed as linear inequality constraints involving such parameters and sampled trajectories of the plant as data. The learning of the parameters through OC-SVM yields the desired IQC characterization of the plant-model mismatch on the frequency domain.

The remaining paper is organized as follows. Preliminaries on nonlinear control theory is provided in §II. The proposed technique is then discussed in §III. A simple numerical example and a practical application are shown in §IV and §V, respectively111Codes at available at the author’s GitHub repository (https://github.com/WentaoTang-Pack/IQClearning). . Conclusions are given in §VI.

Notations. Upper case letters are used to represent matrices, transfer functions, or dynamical systems, and lower case letters are for scalars and column vectors. For a matrix An×nA\in\mathbb{R}^{n\times n} whose entries are written as aija_{ij} (1i,jn1\leq i,j\leq n), its trace is trA=i=1naii\mathrm{tr}\,A=\sum_{i=1}^{n}a_{ii} and its Frobenius norm is AF=[i=1nj=1n|aij|2]1/2\|A\|_{\mathrm{F}}=\left[\sum_{i=1}^{n}\sum_{j=1}^{n}|a_{ij}|^{2}\right]^{1/2}. The inner product between two n×nn\times n matrices AA and BB is A,B:=tr(AB)=i=1nj=1naijbij\langle A,B\rangle:=\mathrm{tr}(A^{\top}B)=\sum_{i=1}^{n}\sum_{j=1}^{n}a_{ij}b_{ij}. We denote by ABA\succeq B for two symmetric matrices if ABA-B is positive semidefinite. II represents the unit matrix of appropriate dimension, or a static system where the outputs is identical to the inputs. For a complex scalar (or vector, or matrix) aa (AA), we denote by aa^{\dagger} (AA^{\dagger}) its conjugate (or conjugate transpose). We use j=1j=\sqrt{-1}.

II Preliminaries

We consider an unknown plant dynamics (in the scope of this paper, the plant-model mismatch as a system itself) in the form of a nonlinear continuous-time system

Σ:{x˙(t)=f(x(t),u(t))y(t)=h(x(t),u(t))\Sigma:\begin{cases}\dot{x}(t)=f(x(t),u(t))\\ y(t)=h(x(t),u(t))\end{cases}

defined on t[0,+)t\in[0,+\infty), where x(t)nxx(t)\in\mathbb{R}^{n_{x}}, u(t)nuu(t)\in\mathbb{R}^{n_{u}}, and y(t)nyy(t)\in\mathbb{R}^{n_{y}} are the states, inputs, and outputs, respectively. ff and hh are Lipschitz. We denote the Laplacian transforms of the time-domain signals by replacing tt with ss without changing the function symbol, e.g., u(s)=0u(t)est𝑑tu(s)=\int_{0}^{\infty}u(t)e^{-st}dt. Consider a matrix of stable proper real rational transfer functions Ψnz×(ny+nu)\Psi\in\mathcal{RH}_{\infty}^{n_{z}\times(n_{y}+n_{u})}, called a dynamic multiplier, that act on inputs and outputs separately:

z(s)=Ψ(s)[y(s)u(s)]=[Ψy(s)00Ψu(s)][y(s)u(s)]=[zy(s)zu(s)].z(s)=\Psi(s)\begin{bmatrix}y(s)\\ u(s)\end{bmatrix}=\begin{bmatrix}\Psi_{y}(s)&0\\ 0&\Psi_{u}(s)\end{bmatrix}\begin{bmatrix}y(s)\\ u(s)\end{bmatrix}=\begin{bmatrix}z_{y}(s)\\ z_{u}(s)\end{bmatrix}.
Σ\SigmauuyyΨ(s)\Psi(s)zz
Figure 1: The plant and dynamic multiplier
Definition 1.

The system Σ\Sigma is said to be dissipative under the dynamic multiplier Ψ:(y,u)z\Psi:(y,u)\mapsto z with respect to the supply rate σ(z)=zMz\sigma(z)=z^{\top}Mz, if for any trajectory of the aggregated dynamics (Σ,Ψ)(\Sigma,\Psi) starting from the origin (x(t0)=0x(t_{0})=0, ξ(t0)=0\xi(t_{0})=0) and for any t1t0t_{1}\geq t_{0}, we have

t0t1z(t)Mz(t)𝑑t0\int_{t_{0}}^{t_{1}}z^{\top}(t)Mz(t)dt\geq 0 (1)

where MM is a symmetric real matrix.

When all the components of u()u(\cdot), y()y(\cdot) and z()z(\cdot) are L2L_{2}-signals (square integrable on [0,+)[0,+\infty)), a necessary condition for (1) is that it holds when t1t0+t_{1}-t_{0}\rightarrow+\infty. According to the Parseval’s identity, this implies the following inequality, which involves a corresponding quadratic form on frequency domain integrated throughout the imaginary axis, called an integral quadratic constraint (IQC):

+[y(jω)u(jω)]Π(jω)[y(jω)u(jω)]0,\int_{-\infty}^{+\infty}\begin{bmatrix}y^{\dagger}(j\omega)&u^{\dagger}(j\omega)\end{bmatrix}\Pi(j\omega)\begin{bmatrix}y(j\omega)\\ u(j\omega)\end{bmatrix}\geq 0, (2)

where Π(s)=Ψ(s)MΨ(s)\Pi(s)=\Psi^{\dagger}(s)M\Psi(s). However, (1) is defined for any T0T\geq 0. Hence, (1) is a stronger condition than (2). Therefore, (1) is also referred to as the hard IQC and (2) is called the corresponding soft IQC. (Ψ,M)(\Psi,M) is said to be a hard factorization of the IQC specified by Π\Pi [22].

Remark 1.

When the supply rate function takes a direct quadratic form of (y,u)(y,u), i.e., using a “trivial” dynamic multiplier Ψ=I\Psi=I:

σ(y,u)=[yu]M[yu]\sigma(y,u)=\begin{bmatrix}y^{\top}&u^{\top}\end{bmatrix}M\begin{bmatrix}y\\ u\end{bmatrix}

for some symmetric matrix MM, we say that the system is MM-dissipative. If M=[QSSR]M=\begin{bmatrix}Q&S\\ S^{\top}&R\end{bmatrix}, the system is (Q,S,R)(Q,S,R)-dissipative. In particular, the system is passive if nu=nyn_{u}=n_{y}, Q=R=0Q=R=0, and S=IS=I.

Remark 2.

The transfer function entries in Ψy(s)\Psi_{y}(s) and Ψu(s)\Psi_{u}(s) can be viewed as operators for feature extraction from signals. For example, for a SISO system, if Ψy=1\Psi_{y}=1, Ψu(s)=1/(1+τs)\Psi_{u}(s)=1/(1+\tau s) (τ>0\tau>0), Myy=Muu=0M_{yy}=M_{uu}=0, Myu=Muy=1/2M_{yu}=M_{uy}=1/2, then Σ\Sigma can be interpreted as a passive system with respect to the output and a first-order filtered input instead of the original input.

Similar to the approach of [31], one can establish the following conclusion that an IQC system conforming to Definition 1 (i.e., one dissipative under a dynamic multiplier Ψ\Psi) should possess a storage function of the internal states. The proof is identical to the one in [31] or [28], except that Σ\Sigma should now be trivially substituted with (Σ,Ψ)(\Sigma,\Psi).

Theorem 1.

If Σ\Sigma is dissipative under the dynamic multiplier Ψ\Psi with respect to the supply rate σ(z)\sigma(z) as specified in Definition 1, then there exists a positive semidefinite function V(x,ξ)V(x,\xi), defined for any (x,ξ)(x,\xi) that is reachable from the origin through some input trajectories u()u(\cdot) and satisfying V(0,0)=0V(0,0)=0, such that the dissipative inequality:

V(x(t1),ξ(t1))V(x(t0),ξ(t0))t0t1σ(z(t))𝑑tV(x(t_{1}),\xi(t_{1}))-V(x(t_{0}),\xi(t_{0}))\leq\int_{t_{0}}^{t_{1}}\sigma(z(t))dt

holds for any trajectory on any time interval [t0,t1][t_{0},t_{1}]. Such a function VV is called the storage function, and can be constructed according to

V(x,ξ)=inf(u(t),d(t)),t[0,T]x(0)=0,x(T)=x,ξ(0)=0,ξ(T)=ξ0Tσ(z(t))𝑑t.V(x,\xi)=\inf_{\begin{subarray}{c}(u(t),d(t)),\enskip t\in[0,T]\\ x(0)=0,\enskip x(T)=x,\enskip\xi(0)=0,\enskip\xi(T)=\xi\end{subarray}}\int_{0}^{T}\sigma(z(t))dt.

The knowledge of IQC on the uncertainty of a system facilitates the analysis of robust stability, robust performance, and the design of the desired robust controllers (see, e.g., [21, 32]). It should, however, be pointed out that without a full model (or even with a nonlinear model), it is generally difficult to determine the IQC. Thus, in the context of identifying plant-model mismatch, we consider the problem of learning (estimating, inferring) the IQC from data.

III Learning of IQC Parameters

III-A Problem Setting

Consider a plant Π\Pi, as an input-output map (uyu\mapsto y) whose dynamics is unknown and in general nonlinear. Its nominal plant is denoted as Π0\Pi_{0}, for which the output under input uu is denoted as y0y_{0}. The plant-model mismatch Δ=ΠΠ0\Delta=\Pi-\Pi_{0} has an output r=yy0r=y-y_{0}, called the residual. More generally, we may also consider multiplicative mismatch, i.e., Δ\Delta such that Π=(1+Δ)Π0\Pi=(1+\Delta)\Pi_{0}, namely yy0=Δy0y-y_{0}=\Delta y_{0}, or other types of mismatch. Hereforth in this section, we denote the input and output of Δ\Delta as vv and ww, respectively. We assume that the mismatch Δ:vw\Delta:v\mapsto w satisfies some IQC specified by (Ψ,M)(\Psi,M).

For simplicity, we may assume that the dynamic multiplier Ψ(s)\Psi(s), comprising of feature extraction operators, are given and fixed, and thus only the symmetric matrix MM is to be estimated. Specifically, some filters are adopted in the construction of Ψw\Psi_{w} and Ψv\Psi_{v} to transform each component of ww and vv:

z(s)=Ψ(s)[w(s)v(s)]=[Ψw(s)00Ψv(s)][w(s)v(s)]=[zw(s)zv(s)].z(s)=\Psi(s)\begin{bmatrix}w(s)\\ v(s)\end{bmatrix}=\begin{bmatrix}\Psi_{w}(s)&0\\ 0&\Psi_{v}(s)\end{bmatrix}\begin{bmatrix}w(s)\\ v(s)\end{bmatrix}=\begin{bmatrix}z_{w}(s)\\ z_{v}(s)\end{bmatrix}. (3)
Remark 3 (Choice of filters).

Without additional prior information, we may use a finite number of Müntz-Laguerre filters: φ1(s)=2Reb1s+b1\varphi_{1}(s)=\frac{\sqrt{2\mathrm{Re}\,b_{1}}}{s+b_{1}}, φk(s)=2Rebks+bkq=1k1sb¯ks+bk\varphi_{k}(s)=\frac{\sqrt{2\mathrm{Re}\,b_{k}}}{s+b_{k}}\prod_{q^{\prime}=1}^{k-1}\frac{s-\bar{b}_{k^{\prime}}}{s+b_{k^{\prime}}} (k2k\geq 2) with preassigned poles b1,b2,-b_{1},-b_{2},\dots whose real parts do not exceed ϵ-\epsilon for some ϵ>0\epsilon>0 and satisfying k=1Rebk1+|bk|2=\sum_{k=1}^{\infty}\frac{\mathrm{Re}\,b_{k}}{1+|b_{k}|^{2}}=\infty. They are known to form a uniformly bounded orthonormal basis of the Hardy space 2\mathcal{H}_{2} [33]. One may also consider the use of a combination of low-pass, high-pass, and band-pass filters to extract inputs on different frequency ranges. In the choice of filters, the assignment of poles are expected to be critical for accuracy.

Definition 2.

The matrix MM is called the dissipativity parameters.

For the estimation of the dissipativity parameters MM, we suppose that mm trajectories (v(i)(),w(i)())i=1m(v^{(i)}(\cdot),w^{(i)}(\cdot))_{i=1}^{m} are sampled independently, from a distribution of signals with random time durations t1t0t_{1}-t_{0} and input excitations. We assume that all such trajectories start from the origin. Formally, we denote this distribution as a measure \mathbb{P}. The goal is therefore to determine a valid choice of MM such that for all i=1,,mi=1,\dots,m, the following inequality holds approximately:

t0(i)t1(i)z(i)(t)Mz(i)(t)𝑑t0.\int_{t_{0}^{(i)}}^{t_{1}^{(i)}}z^{(i)\top}(t)Mz^{(i)}(t)dt\geq 0.

III-B OC-SVM for IQC Learning

Rewriting the inequality (1) as

M,0Tz(t)z(t)𝑑t=tr(M0Tz(t)z(t)𝑑t)0,\bigg{\langle}M,\int_{0}^{T}z(t)z^{\top}(t)dt\bigg{\rangle}=\mathrm{tr}\left(M\int_{0}^{T}z(t)z^{\top}(t)dt\right)\geq 0,

the goal is to find MM such that the above inequality approximately holds on the sampled trajectories.

Definition 3.

For a trajectory (v(),w())(v(\cdot),w(\cdot)), which determines a trajectory of z()z(\cdot) on [0,T][0,T] according to (3), its corresponding dual dissipativity parameters refers to

Γ=0Tz(t)z(t)𝑑t(0).\Gamma=\int_{0}^{T}z(t)z^{\top}(t)dt\enskip(\succeq 0). (4)

Hence, having calculated the dual dissipativity parameters of the sampled trajectories {Γ(i)}i=1m\{\Gamma^{(i)}\}_{i=1}^{m}, we seek MM such that M,Γ(i)0\langle M,\Gamma^{(i)}\rangle\geq 0 approximately. This problem is amenable to one-class support vector machine (OC-SVM) [34], where we maximize the “margin” of the inequality, i.e., a nonnegative value ρ>0\rho>0 such that M,Γiρ\langle M,\Gamma_{i}\rangle\geq\rho for all ii, but penalizing the norm of the “slope”, i.e., MF\|M\|_{\mathrm{F}}. Equivalently, the problem is to minimize MF2\|M\|_{\mathrm{F}}^{2} while rewarding ρ\rho. For a more flexible formulation, we allow the margin ρ\rho to be violated, while the violations by each sampled trajectory are to be penalized as a cost. In the following so-called “soft-SVM” formulation, ν(0,1)\nu\in(0,1) is the “softness” constant and ξi0\xi_{i}\geq 0 (1im1\leq i\leq m) are the margin violations.

minM,ρ,ξ\displaystyle\min_{M,\rho,\xi} 12MF2ρ+1νmi=1mξi\displaystyle\frac{1}{2}\|M\|_{\mathrm{F}}^{2}-\rho+\frac{1}{\nu m}\sum_{i=1}^{m}\xi_{i} (5)
s.t.\displaystyle\mathrm{s.t.} M,Γiρξi,ξi0(1im);\displaystyle\langle M,\Gamma_{i}\rangle\geq\rho-\xi_{i},\,\xi_{i}\geq 0\,(1\leq i\leq m);
M=[Mww00Mvv],MwwϵwI,MvvϵvI.\displaystyle M=\begin{bmatrix}M_{ww}&0\\ 0&M_{vv}\end{bmatrix},\,-M_{ww}\succeq\epsilon_{w}I,\,M_{vv}\succeq\epsilon_{v}I.

Here, instead of using the typical SVM formulation, we should impose additional constraints on the blocks of MM for our IQC learning problem. The last line of (5) guarantees that (i) the “output” of the mismatch system Δ\Delta (i.e., the output residual rr) contributes to a decrease in the storage function, so that the mismatch is a self-stabilized system222If MwwM_{ww} turns out to be a scalar, then the negative definiteness constraint can be simplified as Mww=1M_{ww}=-1. Similarly, if MvvM_{vv} is scalar, then only Mvv=1M_{vv}=1 is needed., (ii) the inputs can only lead to an increase in the storage, and that (iii) the absence of input-output bilinear terms in the supply rate for simplicity (in fact, they can always be relaxed by an arithmetic-geometric mean inequality).

Remark 4 (Soft OC-SVM).

The use of a soft OC-SVM that allows the nonnegative margin to be violated is justified by the following two considerations. First, we assumed that all such trajectories start from the origin (equilibrium point), which is dificult to guarantee or verify. Practically, the sampled trajectories have nonzero initial values in the storage function, which can cause violation of the dissipative inequality. Second, the system is assumed to be noiseless and disturbance-free, while actual plants are always perturbed.

Remark 5 (Sampling the trajectories).

The generation of informative sample trajectories is critical for learning the IQC. In [29] for dissipativity learning, it was proposed that the input excitations are created by randomly sampling the Fourier coefficients, and the trajectory duration is selected such that the output ranges over an interval of interest. In [30], a variety of sampling methods, including the Fourier coefficient [29], Legendre polynomial, and Wiener process, are experimented for passivity index learning. Since IQC is a frequency-domain characterization, in this work, it is recommended that the input signal v(t)v(t) be sampled to cover the frequency range of interest. An explicit approach is to let vv be sinusoidal waves of randomized frequencies, which is to be used in §IV and §V.

III-C Generalization Performance

The probabilistic guarantee on the generalization error of OC-SVM was given in Schölkopf et al. [34].

Theorem 2.

Suppose that the OC-SVM gives an optimal solution (M,ρ,ξi)(M^{\ast},\rho^{\ast},\xi_{i}) with ρ=ρξi>0\rho_{\ast}=\rho^{\ast}-\xi_{i}^{\ast}>0 for all 1im1\leq i\leq m. Then for any δ(0,1)\delta\in(0,1) and ϵ>0\epsilon>0, with probability 1δ1-\delta (over the choice of samples of size mm), we have

[M,Γ<ρϵ]2m(logκ(MF,ϵ)+2δ)\mathbb{P}\left[\langle M,\Gamma\rangle<\rho_{\ast}-\epsilon\right]\leq\frac{2}{m}\left(\lceil\log\kappa(\|M\|_{\mathrm{F}},\,\epsilon)\rceil+\frac{2}{\delta}\right)

where κ(α,ϵ,2m)\kappa(\alpha,\epsilon,2m) is the covering number of the model space {M,:MFα}\{\langle M,\cdot\rangle:\|M\|_{\mathrm{F}}\leq\alpha\} by balls of radius of ϵ\epsilon under the metric d(M1,M2)=supΓ1,,Γ2mmaxi=1,,2m|M1M2,Γi|d(M_{1},M_{2})=\sup_{\Gamma_{1},\dots,\Gamma_{2m}}\max_{i=1,\dots,2m}|\langle M_{1}-M_{2},\Gamma_{i}\rangle|.

In LoCicero [30], the conclusion was translated explicitly in terms of the dissipativity learning that has the same form as our IQC learning problem formulated in (5). Specifically, by the definition of covering number, it could be found that

[M,Γ<ρϵ]1m𝒪(log1δ+1ϵ2logm+1ϵ2log1ϵ).\mathbb{P}\left[\langle M,\Gamma\rangle<\rho_{\ast}-\epsilon\right]\leq\frac{1}{m}\mathcal{O}\left(\log\frac{1}{\delta}+\frac{1}{\epsilon^{2}}\log m+\frac{1}{\epsilon^{2}}\log\frac{1}{\epsilon}\right).

IV Numerical Example

Consider a simple case where the actual SISO system has an unknown delay not accounted for in the nominal model. The delay is θ[0,θ0]\theta\in[0,\theta_{0}] where the upper bound θ0\theta_{0} is known. As pointed out in [21], the plant-model mismatch Δ(s)=eθs1\Delta(s)=e^{-\theta s}-1 (as a multiplicative uncertainty) satisfies IQCs of the form

Π(jω)=[τ(jω)00τ(jω)(jω)]\Pi(j\omega)=\begin{bmatrix}-\tau(j\omega)&0\\ 0&\tau(j\omega)\ell(j\omega)\end{bmatrix} (6)

where τ(jω)0\tau(j\omega)\geq 0 is a rational weighting function and \ell is a real-valued rational function of ω\omega satisfying (jω)0(jω)\ell(j\omega)\geq\ell_{0}(j\omega), in which

0(jω)=maxθ[0,θ0]|ejωθ1|2={4sin2ωθ02,ω<π/θ04,ωπ/θ0.\ell_{0}(j\omega)=\max_{\theta\in[0,\theta_{0}]}|e^{-j\omega\theta}-1|^{2}=\begin{cases}4\sin^{2}\frac{\omega\theta_{0}}{2},&\omega<\pi/\theta_{0}\\ 4,&\omega\geq\pi/\theta_{0}\end{cases}.

Such majorization does exist, e.g., (ω)=(4ω4+50ω2)/(ω4+6.5ω2+50)\ell(\omega)=(4\omega^{4}+50\omega^{2})/(\omega^{4}+6.5\omega^{2}+50) in Megretski and Rantzer [21]. For simplicity, let τ1\tau\equiv 1 be fixed and ψ\psi be learned from data. It is of interest to examine whether the learned dynamic multiplier is indeed of the theoretical form above.

Without loss of generality, say θ0=1/2\theta_{0}=1/2. To generate the data, we let u(t)u(t) be a sinusoidal wave u(t)=Asinωtu(t)=A\sin\omega t. Here we choose A=1A=1 and log10ω\log_{10}\omega is sampled from a uniform distribution on [2,2][-2,2], in order to cover a sufficiently large range of frequencies. For IQC learning, m=500m=500 trajectories are collected. We choose three transfer functions: φ1(s)=1/(s+1)\varphi_{1}(s)=1/(s+1) (low-pass filter), φ2(s)=s/(s+1)\varphi_{2}(s)=s/(s+1) (high-pass filter), and φ3(s)=φ1(s)φ2(s)\varphi_{3}(s)=\varphi_{1}(s)\varphi_{2}(s) (band-pass filter) to extract the input features, i.e., let z1=yz_{1}=y and zk+1=φk(s)uz_{k+1}=\varphi_{k}(s)u (k=1,2,3k=1,2,3). We aim to obtain a matrix of the form M=diag(1,Mu)M=\mathrm{diag}(-1,M_{u}) where MuM_{u} is a positive semidefinite 3×33\times 3 matrix.

Refer to caption
Figure 2: Frequency in the learned IQC of the example system with a delay mismatch.

By solving the resulting optimization problem (5) in cvx (version 2.2 in Matlab R2024b), the MM matrix is found as

M=[0.02450.01200.02210.01203.96160.01580.02210.01580.0227]diag(0,4,0)M=\begin{bmatrix}0.0245&-0.0120&0.0221\\ -0.0120&3.9616&0.0158\\ 0.0221&0.0158&0.0227\\ \end{bmatrix}\approx\mathrm{diag}(0,4,0)

which gives (jω)4ω2/(1+ω2)\ell(j\omega)\approx 4\omega^{2}/(1+\omega^{2}). The comparison of the learned IQC is compared with the theoretical one as well as the lowest possible curve 0(jω)\ell_{0}(j\omega) in Fig. 2. Here the softness parameter ν=0.01\nu=0.01 is chosen, which results in a small violation of the OC-SVM margin i=1mξi/m=9.34×104\sum_{i=1}^{m}\xi_{i}/m=9.34\times 10^{-4}. When ν\nu is increased to 0.050.05, the average violation becomes 3.07×1023.07\times 10^{-2}, causing more high-frequency learning error. This indicates that OC-SVM is a suitable learning tool, at least when the dataset is clean and informative.

From the result, it is found that the learned IQC provides an approximately correct characterization of the underlying uncertainty, in the sense that except when ωπ/θ0\omega\gtrsim\pi/\theta_{0}, the IQC learned is an overestimation, and also upon ω0\omega\rightarrow 0 and ω\omega\rightarrow\infty, the limits are close to the actual values. On the other hand, since the learned IQC dominantly relies on the high-pass features of the input to provide the S-shaped curve, the pole that is assigned to φ2(s)\varphi_{2}(s), which is 11, misaligns with the half-rise frequency π/2θ0\pi/2\theta_{0} of 0(jω)\ell_{0}(j\omega). Also, the rise of 0\ell_{0} is steeper than a first-order high-pass filter. Hence, to attempt for a tighter estimation, we set φ1\varphi_{1} as the second Butterworth filter with cutoff frequency π/2θ0\pi/2\theta_{0}, and φ2\varphi_{2} as its high-pass counterpart. The resulting \ell is shown in Fig. 2 as well. As expected, the learned IQC becomes more accurate; however, this assumes prior knowledge on a better pole assignment and better filter choice.

Empirically, we found that the learning result is insensitive to the sampling strategy in this simple example. When sampling u(t)u(t) as random binary sequences (with a discretization time of 0.050.05) or as truncated Fourier series with 55 sinusoidal waves of uniformly distributed coefficients, the IQC learned remain identical to the ones shown above. Such robustness should be due to the a priori selection of a correct IQC structure (6) that relies on the user’s judgment.

V Application to a Chemical Process

Refer to caption
(a) Process schematic
Refer to caption
(b) Unit step response
Figure 3: The two-phase reactor.

Consider the two-phase reactor in [35]. An illustration of the process is provided in Fig. 3(a) and the underlying true model, which is nonlinear, was detailed in [29]. We focus on the relation between the vapor flow rate FVF_{\text{V}} as an input (uu) and the substrate concentration in the vapor phase yAy_{\text{A}} as an output (yy). The step response of this process is shown in Fig. 3(b). Suppose that from this step response, a simplistic engineer considers the delayed first-order transfer function Π0(s)=K0eθ0s/(τ0s+1)\Pi_{0}(s)=K_{0}e^{-\theta_{0}s}/(\tau_{0}s+1) as the linear nominal model, where K0=0.28K_{0}=0.28, τ0=4\tau_{0}=4, and θ0=12\theta_{0}=12. The step response of the nominal model is plotted in contrast to the actual step response in Fig. 3(b). We are thus interested in characterizing the nonlinear plant-model mismatch Δ=ΠΠ0\Delta=\Pi-\Pi_{0}.

We sample m=500m=500 trajectories from the unknown nonlinear dynamics under input excitations u(t)=Asinωtu(t)=A\sin\omega t for a duration of 4545 min, where log10ωτ0\log_{10}\omega\tau_{0} comes from a uniform distribution on [2,2][-2,2] and A=1/4A=1/4. Under these settings, the simulations are numerically stable. To learn the IQC, we choose the IQC structure as in (6) with τ(jω)1\tau(j\omega)\equiv 1 and M=diag(1,Mu)M=\mathrm{diag}(1,M_{u}). Thus, (jω)=Ψu(jω)MuΨu(jω)\ell(j\omega)=\Psi_{u}(j\omega)^{\dagger}M_{u}\Psi_{u}(j\omega). The filters in Ψu(s)\Psi_{u}(s) are decided in the following way: (i) 99 frequencies (ω1=101,ω2=103/4,,ω9=101\omega_{1}=10^{-1},\omega_{2}=10^{-3/4},\cdots,\omega_{9}=10^{1}) are first chosen; (ii) between each two frequencies ωk\omega_{k} and ωk+1\omega_{k+1}, let φk(s)=s/ωks/ωk+11s/ωk+1+1\varphi_{k}(s)=\frac{s/\omega_{k}}{s/\omega_{k}+1}\cdot\frac{1}{s/\omega_{k+1}+1}; and (iii) let φ0(s)=1s/ω1+1\varphi_{0}(s)=\frac{1}{s/\omega_{1}+1} and φ10(s)=s/ω9s/ω9+1\varphi_{10}(s)=\frac{s/\omega_{9}}{s/\omega_{9}+1}.

The resulting (ω)\ell(\omega) of the learned IQC under different SVM softness parameters ν\nu, as well as when the high-pass and low-pass filters are substituted by the Butterworth second-order ones, are shown in Fig. 4. Similar to as observed in the previous section, when using simple filters, the curve of (jω)\ell(j\omega) tend to be less steep, while Butterworth second-order filters better concentrate the frequency-domain uncertainties. With large ν\nu, the violation to the linear inequality constraints in the OC-SVM problem (5) can cause an erroneous high-frequency mismatch identification. It is therefore necessary to adopt a small enough ν\nu to recover the anticipated conclusion that the mismatch should not be significant at very low or very high frequencies.

Refer to caption
Figure 4: Frequency response (jω)\ell(j\omega) in the learned IQC for the two-phase reactor.

It thus appears from Fig. 4 that at a frequency ωτ00.3\omega\tau_{0}\approx 0.3, the mismatch peaks. For a verification that the mismatch is indeed most severe around this frequency, we compare the actual and nominal responses under u(t)=cosωtu(t)=\cos\omega t for ωτ0=0.03,0.3,3,30\omega\tau_{0}=0.03,0.3,3,30, shown in Fig. 5. One can intuitively observe here that while the nominal response is a delayed wave, the actual nonlinear dynamics does not even exhibit any oscillation for ωτ0=0.3\omega\tau_{0}=0.3. Therefore, we conclude that the proposed IQC learning approach indeed provides an accurate description of the plant-model mismatch on the frequency domain.

Refer to caption
Figure 5: Comparison of actual and nominal responses under different frequencies.

VI Conclusions

In this work, a practical and efficient-to-implement approach to characterize the mismatch between a nonlinear plant and its nominal model in the form of a parameterized IQC. A numerical example and a realistic chemical process application demonstrate its capacity to accurately recover the mismatch information on the frequency domain. The IQC learned allows the synthesis of robust model-based controllers. With increasing practice of machine learning in the context of data-driven control [36], it is envisioned that the proposed method can find its use in state-of-the-art control technology. For nonlinear MPC, possible extensions of the current approach to corresponding nonlinear model structures will be studied in the future research.

Acknowledgment

The author thanks Dr. Pierre Carrette, Advanced Process Control R&D Lead at Shell Global Solutions, whom the author worked for several years ago, for the exposure to plant-model mismatch detection and identification problems.

References

  • [1] J. B. Rawlings, D. Q. Mayne, and M. Diehl, Model predictive control: Theory, computation, and design. Nob Hill, 2nd ed., 2018.
  • [2] P. Van den Hof, “Closed-loop issues in system identification,” Ann. Rev. Control, vol. 22, pp. 173–186, 1998.
  • [3] A. S. Badwe, R. S. Patwardhan, S. L. Shah, S. C. Patwardhan, and R. D. Gudi, “Quantifying the impact of model-plant mismatch on controller performance,” J. Process Control, vol. 20, no. 4, pp. 408–425, 2010.
  • [4] S. J. Qin, “Control performance monitoring – a review and assessment,” Comput. Chem. Eng., vol. 23, no. 2, pp. 173–186, 1998.
  • [5] S. J. Qin and J. Yu, “Recent developments in multivariable controller performance monitoring,” J. Process Control, vol. 17, no. 3, pp. 221–227, 2007.
  • [6] B. Huang and R. Kadali, Dynamic modeling, predictive control and performance monitoring: A data-driven subspace approach. Springer, 2008.
  • [7] S. Kaw, A. K. Tangirala, and A. Karimi, “Improved methodology and set-point design for diagnosis of model-plant mismatch in control loops using plant-model ratio,” J. Process Control, vol. 24, no. 11, pp. 1720–1732, 2014.
  • [8] X. Gao, F. Yang, C. Shang, and D. Huang, “A review of control loop monitoring and diagnosis: Prospects of controller maintenance in big data era,” Chin. J. Chem. Eng., vol. 24, no. 8, pp. 952–962, 2016.
  • [9] D. M. Prett and M. Morari, The Shell process control workshop. Butterworth, 1987.
  • [10] W. Tang, P. Carrette, Y. Cai, J. M. Williamson, and P. Daoutidis, “Automatic decomposition of large-scale industrial processes for distributed MPC on the Shell-Yokogawa Platform for Advanced Control and Estimation (PACE),” Comput. Chem. Eng., vol. 178, p. 108382, 2023.
  • [11] A. S. Badwe, R. D. Gudi, R. S. Patwardhan, S. L. Shah, and S. C. Patwardhan, “Detection of model-plant mismatch in MPC applications,” J. Process Control, vol. 19, no. 8, pp. 1305–1313, 2009.
  • [12] D. Ling, Y. Zheng, H. Zhang, W. Yang, and B. Tao, “Detection of model-plant mismatch in closed-loop control system,” J. Process Control, vol. 57, pp. 66–79, 2017.
  • [13] Y. Chen and M. Ierapetritou, “A framework of hybrid model development with identification of plant-model mismatch,” AIChE J., vol. 66, no. 10, p. e16996, 2020.
  • [14] Q. Wu and W. Du, “Online detection of model-plant mismatch in closed-loop systems with gaussian processes,” IEEE Trans. Ind. Inform., vol. 18, no. 4, pp. 2213–2222, 2021.
  • [15] S. H. Son, J. W. Kim, T. H. Oh, D. H. Jeong, and J. M. Lee, “Learning of model-plant mismatch map via neural network modeling and its application to offset-free model predictive control,” J. Process Control, vol. 115, pp. 112–122, 2022.
  • [16] F. Moayedi, A. Chandrasekar, S. Rasmussen, S. Sarna, B. Corbett, and P. Mhaskar, “Physics-informed neural networks for process systems: Handling plant-model mismatch,” Ind. Eng. Chem. Res., vol. 63, no. 31, pp. 13650–13659, 2024.
  • [17] S. Sastry, Nonlinear systems: Analysis, stability, and control. Springer, 1999.
  • [18] K. Zhou and J. C. Doyle, Essentials of robust control. Prentice Hall, 1998.
  • [19] G. E. Dullerud and F. Paganini, A course in robust control theory: A convex approach. Springer, 2001.
  • [20] R. Lozano, B. Brogliato, O. Egeland, and B. Maschke, Dissipative systems analysis and control: Theory and applications. Springer, 2013.
  • [21] A. Megretski and A. Rantzer, “System analysis via integral quadratic constraints,” IEEE Trans. Autom. Control, vol. 42, no. 6, pp. 819–830, 1997.
  • [22] P. Seiler, “Stability analysis with dissipation inequalities and integral quadratic constraints,” IEEE Trans. Autom. Control, vol. 60, no. 6, pp. 1704–1709, 2014.
  • [23] J. Veenman, C. W. Scherer, and H. Köroğlu, “Robust stability and performance analysis based on integral quadratic constraints,” Eur. J. Control, vol. 31, pp. 1–32, 2016.
  • [24] B. Wahlberg, M. B. Syberg, and H. Hjalmarsson, “Non-parametric methods for L2L_{2}-gain estimation using iterative experiments,” Automatica, vol. 46, no. 8, pp. 1376–1381, 2010.
  • [25] J. Berberich and F. Allgöwer, “A trajectory-based framework for data-driven system analysis and control,” in Eur. Control Conf. (ECC), pp. 1365–1370, IEEE, 2020.
  • [26] A. Koch, J. Berberich, and F. Allgöwer, “Provably robust verification of dissipativity properties from data,” IEEE Trans. Autom. Control, vol. 67, no. 8, pp. 4248–4255, 2021.
  • [27] W. Tang and P. Daoutidis, “Input-output data-driven control through dissipativity learning,” in Am. Control Conf. (ACC), pp. 4217–4222, IEEE, 2019.
  • [28] W. Tang and P. Daoutidis, “Dissipativity learning control (DLC): A framework of input–output data-driven control,” Comput. Chem. Eng., vol. 130, p. 106576, 2019.
  • [29] W. Tang and P. Daoutidis, “Dissipativity learning control (DLC): theoretical foundations of input–output data-driven model-free control,” Syst. Control Lett., vol. 147, p. 104831, 2021.
  • [30] E. LoCicero, A. Penne, and L. Bridgeman, “Issues with input-space representation in nonlinear data-based dissipativity estimation,” arXiv preprint, 2024. arXiv:2411.13404.
  • [31] D. Hill and P. Moylan, “The stability of nonlinear dissipative systems,” IEEE Trans. Autom. Control, vol. 21, no. 5, pp. 708–711, 1976.
  • [32] J. Veenman and C. W. Scherer, “IQC-synthesis with general dynamic multipliers,” Int. J. Robust Nonlin. Control, vol. 24, no. 17, pp. 3027–3056, 2014.
  • [33] L. Knockaert, “On orthonormal Müntz-Laguerre filters,” IEEE Trans. Signal Process., vol. 49, no. 4, pp. 790–793, 2001.
  • [34] B. Schölkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson, “Estimating the support of a high-dimensional distribution,” Neur. Comput., vol. 13, no. 7, pp. 1443–1471, 2001.
  • [35] A. Kumar and P. Daoutidis, “Feedback control of nonlinear differential-algebraic-equation systems,” AIChE J., vol. 41, no. 3, pp. 619–636, 1995.
  • [36] W. Tang and P. Daoutidis, “Data-driven control: Overview and perspectives,” in Am. Control Conf. (ACC), pp. 1048–1064, IEEE, 2022.