
Model Predictive Control of Nonlinear Dynamics
Using Online Adaptive Koopman Operators

Daisuke Uchida ([email protected])    Karthik Duraisamy ([email protected])
Department of Aerospace Engineering, University of Michigan, MI 48109, USA
Abstract

This paper develops a methodology for adaptive data-driven Model Predictive Control (MPC) using Koopman operators. While MPC is ubiquitous in various fields of engineering, the controller performance can deteriorate if the modeling error between the control model and the true dynamics persists, which may often be the case with complex nonlinear dynamics. Adaptive MPC techniques learn models online such that the controller can compensate for the modeling error by incorporating newly available data. We utilize the Koopman operator framework to formulate an adaptive MPC technique that corrects for model discrepancies in a computationally efficient manner by virtue of convex optimization. With the use of neural networks to learn embedding spaces, Koopman operator models enable accurate dynamics modeling. Such complex model forms, however, often lead to unstable online learning. To this end, the proposed method utilizes the soft update of target networks, a technique used to stabilize model learning in Reinforcement Learning (RL). Also, we provide a discussion on which parameters should be updated online, based on a specific description of linear embedding models. Numerical simulations on a canonical nonlinear dynamical system show that the proposed method performs favorably compared to other data-driven MPC methods while achieving superior computational efficiency through the utilization of Koopman operators.

keywords:
Koopman Operator, Data-driven Control, Model Predictive Control

1 Introduction

Model Predictive Control (MPC) has emerged as a powerful framework for solving complex control challenges across diverse engineering applications, from process industries to robotics and autonomous systems. At the heart of MPC lies its ability to optimize control decisions based on predicted future system behavior. This predictive capability critically depends on the accuracy of the underlying control model. Yet for complex systems, the mathematical models used for control often deviate from the true system dynamics. To address this limitation, researchers have developed adaptive and data-driven MPC approaches that explicitly account for modeling uncertainties, leading to more robust control performance ([Adetola et al.(2009)Adetola, DeHaan, and Guay, Klenske et al.(2016)Klenske, Zeilinger, Schölkopf, and Hennig, Ostafew et al.(2016)Ostafew, Schoellig, and Barfoot, Ostafew et al.(2014)Ostafew, Schoellig, and Barfoot]). For instance, the online adaptation procedure can be formulated such that the residual dynamics, which is the difference between the control model and the true dynamics, is learned online by fitting non-parametric or parametric function approximators such as Gaussian processes ([Hewing et al.(2020)Hewing, Kabzan, and Zeilinger]) and random Fourier features ([Zhou and Tzoumas(2024)]).

Koopman operator theory has gained popularity in recent years for its utility as an alternative approach to describing nonlinear dynamics ([Brunton and Kutz(2019), Mauroy et al.(2020)Mauroy, Mezić, and Susuki, Pan and Duraisamy(2024)]). Specifically, nonlinear dynamics can be represented as linear ones in the embedded space of feature maps, and several computational methods have been developed to obtain finite-dimensional approximations of Koopman operators, which are then utilized for prediction and control. The Koopman operator framework offers a potentially powerful advantage: it enables the transformation of nonlinear dynamical systems into linear representations through data-driven methods. This linearization affords access to the extensive theoretical machinery and computational tools developed for linear control systems, while preserving the ability to handle underlying nonlinear dynamics ([Korda and Mezić(2018), Arbabi et al.(2018)Arbabi, Korda, and Mezić]). While Koopman operator-based models can be learned by linear regression types of methods such as Extended Dynamic Mode Decomposition (EDMD) ([Williams et al.(2015)Williams, Kevrekidis, and Rowley]), utilization of neural networks to learn (in contrast to prescribing from a dictionary) feature spaces has been shown to be promising for complex nonlinear dynamics and incorporated into data-driven Koopman operator-based control ([Han et al.(2022)Han, Euler-Rolle, and Katzschmann, Xiao et al.(2023)Xiao, Zhang, Xu, Liu, and Liu, Uchida and Duraisamy(2023), Pan and Duraisamy(2020)]).

While the use of Koopman operators enables expressive and flexible modeling for data-driven control, modeling errors may arise due to several possible factors, e.g., lack of data quantity/quality, inadequate model structures, etc. Whereas there are several methods that tackle this issue from a control-theoretic viewpoint ([Zhang et al.(2022)Zhang, Pan, Scattolini, Yu, and Xu, Son et al.(2020)Son, Narasingam, and Sang-Il Kwon, Uchida et al.(2021)Uchida, Yamashita, and Asama]), most of them are based on EDMD-type models. Also, while [Han et al.(2022)Han, Euler-Rolle, and Katzschmann] develops a model uncertainty-aware Koopman MPC with the use of probabilistic neural networks and [Uchida and Duraisamy(2023)] proposes a model refinement technique to handle the modeling error of neural network-based Koopman models in the context of control, online update methods for such control models remain relatively unexplored.

In this paper, we propose an online adaptation method for Koopman operator-based MPC to avoid performance deterioration due to the modeling error. Whereas [Zhang et al.(2019)Zhang, Rowley, Deem, and Cattafesta, Hemati et al.(2014)Hemati, Williams, and Rowley, Sinha et al.(2020)Sinha, Nandanoori, and Yeung, Alfatlawi and Srivastava(2020)] explore the idea of online adaptation in the context of Koopman operator-based computational modeling, they do not consider controller design problems. [Deem et al.(2020)Deem, Cattafesta, Hemati, Zhang, Rowley, and Mittal] presents an adaptive control of flow separation based on online dynamic mode decomposition, which is an EDMD-type model without nonlinear feature maps. [Singh et al.(2024)Singh, Sah, and Keshavan] also develops an adaptive Koopman MPC, in which the linear operator $[A\ B]$ in (6) is updated to $[A + \Delta A\ \ B + \Delta B]$ s.t. $\Delta A$ and $\Delta B$ are parameterized by additional neural networks, which are trained online w.r.t. an adaptation loss function. In contrast, the proposed method provides a more tractable yet effective online adaptation procedure. We use only a single loss function to train neural networks throughout offline and online model learning, which results in fewer hyperparameters and less complexity in model learning. At the same time, the proposed method enables flexible online model learning since it allows an adaptation of the feature maps in addition to the linear operator $[A\ B]$. Considering that model learning involving deep neural networks results in high-dimensional, non-convex optimizations and typically becomes unstable, we adopt the soft update of target networks ([Lillicrap et al.(2015)Lillicrap, Hunt, Pritzel, Heess, Erez, Tassa, Silver, and Wierstra]), a common stabilization technique in Reinforcement Learning (RL). Also, we provide a discussion on which parameters should be prioritized in the online update procedure, based on a specific formulation of linear embedding models in [Iacob et al.(2024)Iacob, Tóth, and Schoukens], to further improve the stability and robustness of online learning. An overview of the proposed method is shown in Fig. 1.

In Section 2, MPC with Koopman operator-based linear embedding models is presented. This is followed by the formulation of offline model learning based on the Koopman operator framework in Section 3. The proposed method is formalized in Section 4 and numerical evaluations are provided in Section 5.

Figure 1: Proposed adaptive Koopman MPC with soft update. See Algorithm 4 for details.

2 Problem Setup

2.1 Model Predictive Control

We consider the problem of designing controllers for the discrete-time dynamics:

x_{k+1} = f(x_k, u_k),  (1)

where $x_k \in \mathcal{X} \subseteq \mathbb{R}^n$ and $u_k \in \mathcal{U} \subseteq \mathbb{R}^p$ denote the state and the control input, respectively, and $f: \mathbb{R}^n \times \mathbb{R}^p \rightarrow \mathbb{R}^n$ is a possibly nonlinear mapping. The control objective is to minimize a quadratic cost $J$ at each time step, which is defined by

J := \sum_{k=0}^{H+1} \left\{(x_k - x^{\text{ref}}_k)^{\mathsf{T}} Q_{\text{state}} (x_k - x^{\text{ref}}_k) + u_k^{\mathsf{T}} R u_k\right\},  (2)

where $x_0$ denotes the state at the current time step, and $H$, $x^{\text{ref}}_k$, $Q_{\text{state}}$, and $R$ are a look-ahead horizon, a reference signal, and weight matrices w.r.t. the state and the control input, respectively.

It is assumed that while $f$ is unknown, we are given prior information about the dynamics in the form of a nominal model $x_{k+1} = f_{\text{known}}(x_k, u_k)$, so that the true system is decomposed as:

x_{k+1} = f_{\text{known}}(x_k, u_k) + r(x_k, u_k),  (3)

where $r(x_k, u_k) := f(x_k, u_k) - f_{\text{known}}(x_k, u_k)$ is the residual dynamics. This is a typical scenario in engineering applications such as robotics, and controller design under the unknown component $r$ is a problem that is actively being explored ([Zhou et al.(2023)Zhou, Song, and Tzoumas, Shi et al.(2019)Shi, Shi, O’Connell, Yu, Azizzadenesheli, Anandkumar, Yue, and Chung, Zhou and Tzoumas(2023)]).

In MPC, the controller determines the optimal control input by minimizing a predefined cost function subject to the system’s dynamic constraints as follows.

Problem 1

(Model Predictive Control with Quadratic Cost)
Given a current state $\xi_0$ of the control model, apply the first element $u_0$ of the solution to the problem:

\min_{u_0, u_1, \cdots, u_H} \sum_{k=0}^{H+1} \left\{(\xi_k - \xi^{\text{ref}}_k)^{\mathsf{T}} Q (\xi_k - \xi^{\text{ref}}_k) + u_k^{\mathsf{T}} R u_k\right\},  (4)
subject to: \xi_{k+1} = \mathcal{F}(\xi_k, u_k, k),  (5)

where a possibly time-varying mapping $\mathcal{F}: \mathbb{R}^N \times \mathbb{R}^p \times \mathbb{Z}_{\geq 0} \rightarrow \mathbb{R}^N$ denotes a control model with the state $\xi_k$ and the control input $u_k$, and $\xi^{\text{ref}}_k$ is a reference signal. The weight matrices $Q$ and $R$ are assumed to be positive definite.

The simplest choice of the control model $\mathcal{F}$ is the nominal model, i.e., $\xi_k \in \mathbb{R}^n$ ($N := n$), $\mathcal{F}(\xi_k, u_k, k) := f_{\text{known}}(\xi_k, u_k)$, $\xi^{\text{ref}}_k := x^{\text{ref}}_k$, and $Q := Q_{\text{state}}$, in which case we call Problem 1 nominal MPC. Since the cost function (4) is quadratic, Problem 1 becomes a convex optimization if the control model $\mathcal{F}$ is linear. Note that the controller performance may be degraded by the discrepancy between the control model $\mathcal{F}$ and the true dynamics $f$.

2.2 Linear Embedding Model

In the proposed method, we use a linear embedding model to derive the control model $\mathcal{F}$ for Problem 1. Given a state $x_k \in \mathcal{X}$ and a control input $u_k \in \mathcal{U}$ of the true dynamics (3), a linear embedding model outputs a predicted state $x^{\text{pred}}_{k+1}$ at the next time step s.t.

x^{\text{pred}}_{k+1} = C\left\{A g(x_k) + B u_k\right\}, i.e.,

g^+ = A g(x_k) + B u_k,  (6)
x^{\text{pred}}_{k+1} = C g^+,  (7)

where $g: \mathcal{X} \rightarrow \mathbb{R}^N$ is a vector-valued function whose components are called feature maps, and $A \in \mathbb{R}^{N \times N}$, $B \in \mathbb{R}^{N \times p}$, and $C \in \mathbb{R}^{n \times N}$ are matrices. The original state $x_k \in \mathbb{R}^n$ is first embedded into the space $\mathbb{R}^N$ through the feature maps $g$, and the model state $g(x_k)$ is advanced one step by the linear operator $[A\ B]$ to yield the prediction $g^+$ in (6). The decoder $C$ then projects $g^+$ back onto the original state space $\mathbb{R}^n$ to produce the state prediction $x^{\text{pred}}_{k+1}$ in (7). A sufficient condition for a linear embedding model to reconstruct the true dynamics is given by the following.
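As a concrete illustration, the one-step prediction (6), (7) takes only a few lines; the following NumPy sketch is ours (the function name is illustrative):

    import numpy as np

    def predict_next_state(x_k, u_k, g, A, B, C):
        """One-step prediction with the linear embedding model (6), (7):
        lift with g, advance linearly with [A B], decode with C."""
        g_plus = A @ g(x_k) + B @ u_k   # (6): latent one-step prediction g^+
        return C @ g_plus               # (7): project back to the state space

For instance, with trivial features $g(x) = x$ ($N = n$) and $C = I_n$, this reduces to an ordinary linear state-space prediction.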

Proposition 1

Consider a linear embedding model (6), (7). For a state-input triplet $(x_k, u_k, x_{k+1})$ of the true dynamics s.t. $x_k \in \mathcal{X}$, $u_k \in \mathcal{U}$, $x_{k+1} = f(x_k, u_k)$, the relation $x^{\text{pred}}_{k+1} = x_{k+1}$ holds if

g(x_{k+1}) = A g(x_k) + B u_k,  (8)
x_{k+1} = C g(x_{k+1}).  (9)
Proof 2.1.

This follows directly from the definitions.

A major advantage of using a linear embedding model is its utility as a linear control model in the embedded space. In Problem 1, consider a control model:

\xi_{k+1} = A\xi_k + Bu_k, \quad \xi_0 := g(x_0) \in \mathbb{R}^N,  (10)

with the reference signal and the weight matrix defined as $\xi^{\text{ref}}_k := g(x^{\text{ref}}_k)$ and $Q := C^{\mathsf{T}} Q_{\text{state}} C$, respectively. While the system (10) is no longer defined in the original state space $\mathcal{X} \subseteq \mathbb{R}^n$, an MPC solution in this setting still leads to an optimal control input w.r.t. $J$ in (2) under certain conditions. Specifically, if (8) and (9) hold for $\forall x_k \in \mathcal{X}$, $\forall u_k \in \mathcal{U}$, a solution to Problem 1 also minimizes the cost $J$ since

(\xi_k - \xi^{\text{ref}}_k)^{\mathsf{T}} Q (\xi_k - \xi^{\text{ref}}_k) = (x_k - x^{\text{ref}}_k)^{\mathsf{T}} Q_{\text{state}} (x_k - x^{\text{ref}}_k),  (11)

where $\xi_k = g(x_k)$ and $x_k = C g(x_k)$ are substituted.

This class of MPC is often referred to as Koopman MPC in the literature since the linear operator $[A\ B]$ in (6) can be considered a finite-dimensional approximation of a Koopman operator ([Korda and Mezić(2018)]), as described in the next section. MPC methods based on the Koopman operator formalism have shown promise via their computational efficiency and control performance in various applications. Specifically, Koopman MPC results in faster execution times than most nonlinear MPC methods since the optimization becomes convex, and the validity of the linear control model (10) may even be established when the true dynamics (3) is nonlinear, provided the model parameters are learned appropriately (see the next section for details). Also, a linear embedding model can be computed in a fully data-driven manner, i.e., its model parameters can be determined using only time-series data sampled from either the true dynamics or a simulator of a nominal model. In this paper, we refer to this type of controller design as Koopman MPC.

Problem 2.

(Koopman MPC)
Given a current state $x_0$ and a linear embedding model (6), (7), solve Problem 1 with the initial condition, the control model, and the weight matrix chosen as:

\xi_0 := g(x_0),  (12)
\xi_{k+1} = A\xi_k + Bu_k, and  (13)
Q := C^{\mathsf{T}} Q_{\text{state}} C.  (14)

In the proposed method, we employ the Koopman MPC as the baseline control strategy with the use of a nominal model $f_{\text{known}}$, and compensate for the modeling error using an online update while retaining the advantages of linear embedding models.
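For concreteness, the convex program (12)-(14) admits a compact implementation. The following is a minimal sketch using the cvxpy modeling library (our illustrative code, not the authors' implementation); the state cost is written as $(C\xi_k - x^{\text{ref}}_k)^{\mathsf{T}} Q_{\text{state}} (C\xi_k - x^{\text{ref}}_k)$, which by (11) equals the lifted cost with $Q = C^{\mathsf{T}} Q_{\text{state}} C$ in (14):

    import cvxpy as cp
    import numpy as np

    def linear_mpc_solver(g_x0, A, B, C, Q_state, R, x_ref, H=20):
        """Koopman MPC: minimize the quadratic cost (4) subject to the
        linear lifted dynamics (13), starting from xi_0 = g(x_0) in (12).
        Q_state (n x n) and R (p x p) are positive definite matrices;
        x_ref has shape (H + 2, n). Returns the first optimal input u_0."""
        N, p = B.shape
        xi = cp.Variable((H + 2, N))
        u = cp.Variable((H + 1, p))
        constraints = [xi[0] == g_x0]
        cost = 0
        for k in range(H + 1):
            constraints.append(xi[k + 1] == A @ xi[k] + B @ u[k])  # (13)
            cost += cp.quad_form(C @ xi[k] - x_ref[k], Q_state)
            cost += cp.quad_form(u[k], R)
        cost += cp.quad_form(C @ xi[H + 1] - x_ref[H + 1], Q_state)  # terminal term of (4)
        cp.Problem(cp.Minimize(cost), constraints).solve()
        return u.value[0]

Since the model is linear and the cost quadratic, the solve reduces to a quadratic program, which is the source of the execution-time advantage reported in Section 5.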

3 Offline Learning of Linear Embedding Models

The model parameters of the linear embedding model (6), (7) are the feature maps $g: \mathcal{X} \rightarrow \mathbb{R}^N$ and the matrices $A$, $B$, and $C$. In this section, we formulate the problem of learning these parameters in an offline manner, where it is assumed that only data samples generated by a nominal model $f_{\text{known}}$ are available and we do not have access to data from the true system (3).

3.1 Koopman Operators

Koopman operators characterize the time evolution of dynamical systems by acting as composition operators on function spaces, enabling the analysis of discrete-time dynamics through functional transformations. These operators share a fundamental mathematical connection with the linear embedding matrices $[A\ B]$, providing a theoretical foundation for transforming nonlinear dynamics into linear representations. For instance, given autonomous dynamics $x_{k+1} = f_a(x_k)$, $x_k \in \mathbb{R}^n$, and a function $g: \mathbb{R}^n \rightarrow \mathbb{R}$ s.t. $g \in \mathcal{G}$, where $\mathcal{G}$ is some function space, the Koopman operator $\mathcal{K}$ associated with this system is defined as $\mathcal{K}: \mathcal{G} \rightarrow \mathcal{G}: g \mapsto g \circ f_a$ on the assumption that $g \circ f_a \in \mathcal{G}$, $\forall g \in \mathcal{G}$. This corresponds to the time evolution of the dynamics $x_{k+1} = f_a(x_k)$ viewed through the function $g$ since

g(x_{k+1}) = g(f_a(x_k)) = (g \circ f_a)(x_k) \overset{\text{def.}}{=} (\mathcal{K}g)(x_k).  (15)

Note that $g$ corresponds to a feature map in the formulation of Section 2.2. As $\mathcal{K}$ is a composition operator, it is easily confirmed that Koopman operators are linear. A major difference between the two descriptions of the dynamics is that the time evolution in (15) is governed linearly by $\mathcal{K}$, whereas $f_a$ is possibly nonlinear, which naturally leads to the idea of deriving linear models using $\mathcal{K}$. Since Koopman operators are in general infinite-dimensional, being defined on function spaces, this is realized by a finite-dimensional restriction of $\mathcal{K}$ or an approximation thereof. An exact finite-dimensional restriction of the Koopman operator $\mathcal{K}$ exists if and only if we can find functions $g_i$ that span an invariant subspace, as shown in the following proposition.

Proposition 2.

Given $N$ functions $g_i \in \mathcal{G}$, $i = 1, \cdots, N$, there exists a matrix $K \in \mathbb{R}^{N \times N}$ s.t.

[\mathcal{K}g_1 \cdots \mathcal{K}g_N]^{\mathsf{T}} = K\,[g_1 \cdots g_N]^{\mathsf{T}},  (16)

if and only if $\text{span}(g_1, \cdots, g_N)$ is an invariant subspace under the action of $\mathcal{K}$, i.e.,
$\mathcal{K}g \in \text{span}(g_1, \cdots, g_N)$ for $\forall g \in \text{span}(g_1, \cdots, g_N)$.

Proof 3.1.

For instance, see [Uchida and Duraisamy(2023)].
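As a simple illustration of Proposition 2 (our example, not taken from the cited work): for the scalar linear dynamics $f_a(x) = \lambda x$, the monomials $g_1(x) = x$ and $g_2(x) = x^2$ span an invariant subspace, since

(\mathcal{K}g_1)(x) = g_1(\lambda x) = \lambda\, g_1(x), \qquad (\mathcal{K}g_2)(x) = (\lambda x)^2 = \lambda^2\, g_2(x), \qquad \text{so} \qquad K = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda^2 \end{bmatrix}.

By contrast, for $f_a(x) = \lambda x - x^3$, $(\mathcal{K}g_2)(x) = (\lambda x - x^3)^2$ contains monomials up to degree six, and each added monomial generates still higher degrees, so no finite monomial dictionary is invariant; only approximations of $\mathcal{K}$ are then available, as discussed next.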

In practice, finding an invariant subspace is not trivial, and a finite-dimensional approximation of $\mathcal{K}$ is instead computed from data samples. Extended Dynamic Mode Decomposition (EDMD) ([Williams et al.(2015)Williams, Kevrekidis, and Rowley]) is one such method, using linear regression with user-specified feature maps $g_i$. While EDMD enables simple and tractable model training, it may not be adequately expressive for complex nonlinear systems. On the other hand, joint learning of $g_i$ and $K$ can achieve more accurate models since $g_i$ is fitted to the training data along with the matrix $K$, and parameterizing $g_i$ by neural networks has been shown to be successful in a wide range of problems ([Lusch et al.(2018)Lusch, Kutz, and Brunton, Takeishi et al.(2017)Takeishi, Kawahara, and Yairi, Pan and Duraisamy(2020)]).

For general non-autonomous dynamics (3) with control inputs, corresponding Koopman operators are defined in a similar manner, but with a sequence of control inputs included ([Korda and Mezić(2018)]). For the space of input sequences $l(\mathcal{U}) := \{(u_0, u_1, \cdots) \mid u_k \in \mathcal{U}, \forall k\}$, consider a mapping $\hat{f}: \mathcal{X} \times l(\mathcal{U}) \rightarrow \mathcal{X} \times l(\mathcal{U}): (x, (u_0, u_1, \cdots)) \mapsto (f(x, u_0), (u_1, u_2, \cdots))$. Also, let $\hat{g}: \mathcal{X} \times l(\mathcal{U}) \rightarrow \mathbb{R}$ be a function from the extended space $\mathcal{X} \times l(\mathcal{U})$ to $\mathbb{R}$. Then, the Koopman operator associated with (3) is defined as a linear operator $\mathcal{K}: \hat{\mathcal{G}} \rightarrow \hat{\mathcal{G}}: \hat{g} \mapsto \hat{g} \circ \hat{f}$ s.t. $\hat{\mathcal{G}}$ is a function space to which $\hat{g}$ belongs, and the dynamics along a sequence $(u_k, u_{k+1}, \cdots)$ of control inputs is represented by

\hat{g}(x_{k+1}, (u_{k+1}, u_{k+2}, \cdots)) = (\hat{g} \circ \hat{f})(x_k, (u_k, u_{k+1}, \cdots)) = (\mathcal{K}\hat{g})(x_k, (u_k, u_{k+1}, \cdots)).  (17)

It can be easily verified that Proposition 2 also holds for the non-autonomous case. If we consider the following $N + p$ functions $\hat{g}_i$ of the specific forms:

[\hat{g}_1(x_k, (u_k, u_{k+1}, \cdots)) \cdots \hat{g}_{N+p}(x_k, (u_k, u_{k+1}, \cdots))]^{\mathsf{T}} = [g_1(x_k) \cdots g_N(x_k)\ u_k^{\mathsf{T}}]^{\mathsf{T}},  (18)

the first $N$ rows of (16) read $g(x_{k+1}) = A g(x_k) + B u_k$, where $[A\ B] \in \mathbb{R}^{N \times (N+p)}$ denotes the first $N$ rows of $K$. Therefore, the condition (8) is ensured by choosing $[A\ B]$ as a finite-dimensional Koopman operator acting on an invariant subspace. As in the autonomous setting, finding such a subspace is not trivial in practice, and a finite-dimensional approximation may be computed as $[A\ B]$ by either EDMD-type methods or joint learning of $g_i$ and $[A\ B]$.
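A minimal sketch of such an EDMD-type least-squares estimate of $[A\ B]$ with prescribed feature maps (our illustrative code):

    import numpy as np

    def edmd_with_inputs(X, U, Y, g):
        """Least-squares fit of g(y_i) ~ A g(x_i) + B u_i over the data.
        X, Y: (M, n) arrays of state pairs with y_i = f(x_i, u_i);
        U: (M, p) inputs; g: feature map returning a length-N vector."""
        GX = np.stack([g(x) for x in X], axis=1)          # (N, M) lifted states
        GY = np.stack([g(y) for y in Y], axis=1)          # (N, M) lifted next states
        Z = np.vstack([GX, U.T])                          # (N + p, M) regressors
        AB = np.linalg.lstsq(Z.T, GY.T, rcond=None)[0].T  # (N, N + p)
        N = GX.shape[0]
        return AB[:, :N], AB[:, N:]                       # A (N x N), B (N x p)

The joint-learning alternative, in which $g$ is also trained, is formalized next.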

3.2 Offline Learning Procedure

In the proposed method, we adopt the joint learning of the feature maps $g_i$ and the matrices $A$, $B$, and $C$ with the use of neural networks, which is formulated as follows.

Problem 3.

(Offline Learning Using Nominal Model)
Let $g(\cdot; \theta): \mathcal{X} \rightarrow \mathbb{R}^N$ be a neural network characterized by parameters $\theta$. Find $\theta$, $A \in \mathbb{R}^{N \times N}$, $B \in \mathbb{R}^{N \times p}$, and $C \in \mathbb{R}^{n \times N}$ that minimize the loss function:

\sum_i \left(\lambda_1 \left\|A g(x_i; \theta) + B u_i - g(y_i; \theta)\right\|_2^2 + \lambda_2 \left\|C(A g(x_i; \theta) + B u_i) - y_i\right\|_2^2\right),  (19)

where the data set is given in the form $\mathcal{D} := \{(x_i, u_i, y_i) \mid y_i = f_{\text{known}}(x_i, u_i)\}$ and $\lambda_1, \lambda_2 \in \mathbb{R}$ are hyperparameters.

The first and second terms of the loss function (19) are responsible for (approximately) ensuring the conditions (8) and (9), respectively.
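A minimal PyTorch sketch of the joint parameterization and the loss (19); the class name, network width, and dimensions are illustrative, not the authors' code:

    import torch
    import torch.nn as nn

    class LinearEmbedding(nn.Module):
        """Linear embedding model (6), (7) with g parameterized by an MLP
        and A, B, C trained jointly by minimizing (19)."""
        def __init__(self, n=4, p=1, N=6, hidden=64):
            super().__init__()
            self.g = nn.Sequential(nn.Linear(n, hidden), nn.Tanh(),
                                   nn.Linear(hidden, N))
            self.A = nn.Parameter(torch.eye(N))
            self.B = nn.Parameter(torch.zeros(N, p))
            self.C = nn.Parameter(torch.zeros(n, N))

        def loss(self, x, u, y, lam1=1.0, lam2=1.0):
            g_plus = self.g(x) @ self.A.T + u @ self.B.T      # A g(x) + B u
            latent = ((g_plus - self.g(y)) ** 2).sum(dim=1)   # first term of (19)
            recon = ((g_plus @ self.C.T - y) ** 2).sum(dim=1) # second term of (19)
            return (lam1 * latent + lam2 * recon).sum()

Here x, u, and y are batched tensors of shapes (M, n), (M, p), and (M, n), matching the triplets in $\mathcal{D}$.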

3.3 Data Generation Using MPC Simulations

In Problem 3, how the control inputs $u_i$ in the dataset $\mathcal{D}$ are generated determines the quality of the data and therefore has a significant influence on the learning results. A typical strategy is to sample both states and inputs from some distributions, assuming that $x_i, u_i$ are i.i.d. random variables, in which case the loss function (19) converges to a more general characteristic, such as an $L_2$ norm of the modeling error, as the number of data samples tends to infinity (e.g., [Uchida and Duraisamy(2023)]). However, it is challenging to sample the product space $\mathcal{X} \times \mathcal{U}$ adequately unless the dimensions of the state and the control input are sufficiently small, and this sampling strategy may not necessarily result in accurate and unbiased model predictions in practice.

As an alternative approach, we utilize MPC simulations of the nominal model $f_{\text{known}}$ to generate data samples. Specifically, the dataset $\mathcal{D}$ in Problem 3 consists of a collection of trajectories $(x_0, u_0, x_1, u_1, \cdots)$ s.t. $x_0$ is randomly sampled, $u_k$ is a solution of the nominal MPC (Problem 1 with $\mathcal{F}(\xi_k, u_k, k) := f_{\text{known}}(\xi_k, u_k)$), and $x_{k+1} = f_{\text{known}}(x_k, u_k)$ is the corresponding next state. The main intent is to selectively learn the regimes of the dynamics that are relevant in closed loop, so that the controller performance is improved. A similar approach is also employed in [Li et al.(2024)Li, Abuduweili, Sun, Chen, Zhao, and Liu].
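A minimal sketch of this data-generation step, assuming a nominal MPC solver mpc_solve(x, f_known) and an initial-state sampler (both names are ours, chosen for illustration):

    def collect_mpc_data(f_known, mpc_solve, sample_x0, n_traj=500, T=60):
        """Roll out nominal MPC on the nominal model and record the
        (x, u, y) triplets with y = f_known(x, u), as in Section 3.3."""
        data = []
        for _ in range(n_traj):
            x = sample_x0()                 # random initial condition
            for _ in range(T):
                u = mpc_solve(x, f_known)   # nominal MPC input
                y = f_known(x, u)           # next state under the nominal model
                data.append((x, u, y))
                x = y
        return data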

4 Adaptive Koopman MPC

While the Koopman MPC with a linear embedding model learned by Problem 3 is expected to perform well if the nominal model $f_{\text{known}}$ is sufficiently close to the true dynamics, updating the control model online can further improve the performance or foster robustness in case there is a large discrepancy between the nominal model and the true dynamics.

Let $\Theta := \{A, B, C, \theta\}$ denote the model parameters of a linear embedding model (6), (7). Also, we use the notation $\text{LinearMPCSolver}(g(x_k); \Theta)$ to denote a solution of the Koopman MPC (Problem 2) given a current state $x_k$. Assuming that a new data sample $x_{k+1}$ becomes available at time $k+1$ s.t. $x_{k+1} = f(x_k, u_k)$, where $u_k = \text{LinearMPCSolver}(g(x_k); \Theta)$, we add $x_{k+1}$ to a replay buffer, from which a data batch is sampled at each time step to update $\Theta$ in an online manner. As in Problem 3, the model parameters are updated online by minimizing (19).

It is, however, well known that using neural networks as function approximators often makes learning processes unstable due to the non-convexity and high dimensionality of the resulting optimization. To address this, the proposed method adopts the soft update of target networks ([Lillicrap et al.(2015)Lillicrap, Hunt, Pritzel, Heess, Erez, Tassa, Silver, and Wierstra]). A target network is paired with a main network, and its parameters are slowly moved towards those of the main network, which avoids abrupt changes in the outputs and enhances the stability of the learning process.

In the proposed method, we initialize both the main and the target networks (two linear embedding models with model parameters labeled $\Theta$ and $\Theta_{\text{target}}$, respectively) with the offline model from Problem 3. The target network is then updated at each time step by the interpolation $\Theta_{\text{target}} \leftarrow \tau\Theta + (1 - \tau)\Theta_{\text{target}}$, where $\tau$ is a hyperparameter that adjusts the smoothness of the update. The actual control input is computed by the Koopman MPC solver with the slowly changing model parameters $\Theta_{\text{target}}$, which suppresses undesirable behavior of the closed-loop dynamics caused by large fluctuations of the control model. The proposed Adaptive Koopman MPC with Soft Update is summarized in Algorithm 4.
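A minimal PyTorch sketch of the soft update, applied parameter-wise to two copies of the embedding model (here, two instances of the illustrative LinearEmbedding module from Section 3.2):

    import torch

    @torch.no_grad()
    def soft_update(target, main, tau=0.05):
        """Theta_target <- tau * Theta + (1 - tau) * Theta_target,
        covering A, B, C, and the feature-map weights theta."""
        for p_t, p in zip(target.parameters(), main.parameters()):
            p_t.mul_(1.0 - tau).add_(tau * p)

Small values of tau keep the control model slowly varying while the main network is free to fit the incoming data.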

Algorithm 4: Adaptive Koopman MPC with Soft Update

Require: prior model or simulator $f_{\text{known}}$ of the known dynamics

Step 1: Train an offline model.
  Simulate $f_{\text{known}}$ and collect data:
    $\mathcal{D}_{\text{known}} = \{(x, u, y) \mid y = f_{\text{known}}(x, u),\ u = \text{MPCSolver}(x; f_{\text{known}})\}$
  Solve Problem 3 to train a linear embedding model
    $x^{\text{pred}}_{k+1} = C_{\text{prior}}(A_{\text{prior}} g_{\text{prior}}(x_k; \theta_{\text{prior}}) + B_{\text{prior}} u_k)$ on the data set $\mathcal{D}_{\text{known}}$.

Step 2: Adaptive data-driven MPC.
  $(A, B, C, \theta) \leftarrow (A_{\text{prior}}, B_{\text{prior}}, C_{\text{prior}}, \theta_{\text{prior}})$   // main network parameters
  $(A_{\text{target}}, B_{\text{target}}, C_{\text{target}}, \theta_{\text{target}}) \leftarrow (A_{\text{prior}}, B_{\text{prior}}, C_{\text{prior}}, \theta_{\text{prior}})$   // target network parameters
  $(\Theta, \Theta_{\text{target}}) \leftarrow (\{A, B, C, \theta\}, \{A_{\text{target}}, B_{\text{target}}, C_{\text{target}}, \theta_{\text{target}}\})$
  $\mathcal{B} \leftarrow \emptyset$   // replay buffer
  $x_0 \leftarrow$ initial condition
  for $k = 0, 1, 2, \cdots$:
    $u_k \leftarrow \text{LinearMPCSolver}(g(x_k); \Theta_{\text{target}})$   // Koopman MPC
    $x_{k+1} \leftarrow f(x_k, u_k)$   // next state from the true environment
    $\mathcal{B} \leftarrow \mathcal{B} \cup \{(x_k, u_k, x_{k+1})\}$   // add new data to the replay buffer
    $\mathcal{D}_{\text{online}} \leftarrow \text{BatchSample}(\mathcal{B})$
    $\Theta \leftarrow \text{GradientDescent}(\mathcal{D}_{\text{online}})$   // minimize (19) on the batch
    $\Theta_{\text{target}} \leftarrow \tau\Theta + (1 - \tau)\Theta_{\text{target}}$   // soft update

4.1 Parameter Selection for Online Update

While the online update can be applied to all the parameters in $\Theta$, it is advisable to select only those that have dominant and essential effects on the control performance, to improve both the computational efficiency and the robustness of the algorithm. For instance, the number of parameters in $\Theta$ can become quite large for complex and/or large-scale dynamics, and updating all the parameters may impose an undesirable computational burden on the online update. Also, the stability of the online update can be further improved by excluding parameters that are sensitive in terms of model outputs but not meaningful in terms of control performance. To this end, we propose to exclude the matrix $A$ from the online-updated parameters if the online learning becomes unstable, which is suggested by the following property of linear embedding models.

Theorem 4.

([Iacob et al.(2024)Iacob, Tóth, and Schoukens])
Suppose that $\mathcal{X}$ and $\mathcal{U}$ are convex sets and $0 \in \mathcal{U}$. For a linear embedding model (6), (7) s.t. $g_i \in C^1$ for $\forall i$ and $\text{span}(g_1, \cdots, g_N)$ is an invariant subspace under the action of the Koopman operator associated with $f(\cdot, 0)$, the following holds:

g(x_{k+1}) = A g(x_k) + \underbrace{\int_0^1 \cfrac{\partial \mathcal{B}}{\partial u}(x_k, \lambda u_k)\, d\lambda}_{=: \hat{B}(x_k, u_k)}\, u_k,  (20)

where

\mathcal{B}(x, u) := \left\{\int_0^1 \cfrac{\partial g}{\partial x}\big(f(x, 0) + \lambda(f(x, u) - f(x, 0))\big)\, d\lambda\right\}(f(x, u) - f(x, 0)),  (21)

and $\mathcal{B}(x, u)$ is assumed to be differentiable in $u$.

Equation (20) implies that there exists a linear embedding model with no modeling error s.t. $A$ is a constant matrix while $B$ is given as the state- and input-dependent matrix $\hat{B}(x_k, u_k)$. Specifically, if we can find a finite-dimensional Koopman operator associated with the drift term of the dynamics, it is sufficient to update only $B$ appropriately at every time step to reconstruct the true dynamics. Whereas there is no guarantee in general that the learning results of Problem 3 or the online update satisfy the assumptions of Theorem 4, we heuristically find that excluding $A$ from the online-updated parameters while updating $B$ and $g_i$ online improves control performance and stabilizes learning in many cases.
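In implementation terms, this parameter selection amounts to excluding $A$ from the optimizer. A minimal PyTorch sketch, again using the illustrative LinearEmbedding module from Section 3.2:

    import torch

    model = LinearEmbedding()                  # main network (illustrative class)
    model.A.requires_grad_(False)              # fix A to its offline value
    online_params = [model.B] + list(model.g.parameters())  # update B and theta only
    optimizer = torch.optim.Adam(online_params, lr=1e-3)

When the decoder is analytical, as in the experiment of Section 5, $C$ is likewise excluded from the online update.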

5 Numerical Example

As a numerical example, we consider a cartpole system with cart mass $m_c$, pole mass $m_p$, and pole length $2l$, which is described by the following ordinary differential equations (ODEs) ([Yuan et al.(2022)Yuan, Hall, Zhou, Brunke, Greeff, Panerati, and Schoellig]):

\ddot{x}(t) = \cfrac{F(t) + m_p l\,(\dot{\theta}^2(t)\sin\theta(t) - \ddot{\theta}(t)\cos\theta(t))}{m_c + m_p},  (22)

\ddot{\theta}(t) = \cfrac{g\sin\theta(t) + \cos\theta(t)\left(\frac{-F(t) - m_p l \dot{\theta}^2(t)\sin\theta(t)}{m_c + m_p}\right)}{l\left(\frac{4}{3} - \frac{m_p\cos^2\theta(t)}{m_c + m_p}\right)},  (23)

where $x(t)$, $\theta(t)$, $F(t)$, and $g$ are the cart position, the pole angle, the force applied to the cart, and the acceleration due to gravity, respectively. The state of the system is taken to be $x_k := [x(k\Delta t)\ \dot{x}(k\Delta t)\ \theta(k\Delta t)\ \dot{\theta}(k\Delta t)]^{\mathsf{T}}$, i.e., $x(t)$, $\dot{x}(t)$, $\theta(t)$, and $\dot{\theta}(t)$ are sampled with a sampling period $\Delta t = 1/15$ [s]. Also, $F(t)$ is given by the control input $u_k$ determined by MPC s.t. $F(t) := u_k$ for $k\Delta t \leq t \leq (k+1)\Delta t$. The reference signal is set to $x^{\text{ref}}_k := 0$.
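For reference, a minimal sketch of simulating (22), (23) with a zero-order-hold force over one sampling period; the Euler sub-stepping is our simplification, whereas the experiments below use the safe-control-gym implementation:

    import numpy as np

    def cartpole_step(state, F, mc=1.0, mp=0.1, l=0.5, g=9.81,
                      dt=1.0 / 15.0, n_sub=10):
        """Advance [x, xdot, theta, thetadot] by one sampling period
        Delta t under constant force F, using explicit Euler sub-steps."""
        x, xdot, th, thdot = state
        h = dt / n_sub
        for _ in range(n_sub):
            s, c = np.sin(th), np.cos(th)
            # (23): pole angular acceleration
            thddot = (g * s + c * (-F - mp * l * thdot**2 * s) / (mc + mp)) \
                     / (l * (4.0 / 3.0 - mp * c**2 / (mc + mp)))
            # (22): cart acceleration
            xddot = (F + mp * l * (thdot**2 * s - thddot * c)) / (mc + mp)
            x, xdot = x + h * xdot, xdot + h * xddot
            th, thdot = th + h * thdot, thdot + h * thddot
        return np.array([x, xdot, th, thdot])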

It is assumed that we are given the ODEs (22), (23) with $(m_c, m_p, l) = (0.75, 0.075, 0.375)$ as a nominal model, whereas the true dynamics is governed by the same equations with different parameter values: $(m_c, m_p, l) = (1,\ 0.1,\ 0.5)$. In Step 1 of Algorithm 4, we collect data from the nominal model consisting of 500 trajectories, each with a length of 60 time steps, and train a linear embedding model with the feature maps given by $g(x_k; \theta) := [x_k^{\mathsf{T}}\ g_5(x_k; \theta)\ g_6(x_k; \theta)]^{\mathsf{T}}$, where $[g_5(\cdot; \theta)\ g_6(\cdot; \theta)]$ is a feed-forward neural network with three hidden layers of 64 neurons each. Note that including the state $x_k$ itself in the feature maps eliminates the decoding error and ensures the condition (9) via the analytical decoder $C := [I_n\ 0]$. Therefore, we set $\lambda_2 = 0$ in (19). Following the discussion in Section 4.1, we fix $A$ to the learning result of Problem 3 and update $g$ and $B$ online with $\tau := 0.05$.
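A minimal sketch of this identity-augmented feature map in PyTorch (our illustrative code; the layer sizes follow the text, while the Tanh activation is our assumption):

    import torch
    import torch.nn as nn

    class CartpoleFeatures(nn.Module):
        """g(x; theta) = [x^T, g5(x; theta), g6(x; theta)]^T with the
        analytical decoder C = [I_4  0], so the decoding term of (19)
        can be dropped (lambda_2 = 0)."""
        def __init__(self, n=4, hidden=64, n_learned=2):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n, hidden), nn.Tanh(),
                nn.Linear(hidden, hidden), nn.Tanh(),
                nn.Linear(hidden, hidden), nn.Tanh(),
                nn.Linear(hidden, n_learned))      # outputs [g5, g6]

        def forward(self, x):
            return torch.cat([x, self.net(x)], dim=-1)  # N = 6 features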

For comparison, we also test three controllers in addition to the proposed method. As a non-adaptive baseline, we consider the nominal MPC described in Section 2. As data-driven adaptive MPC baselines, the Gaussian Process MPC (GP-MPC) ([Hewing et al.(2020)Hewing, Kabzan, and Zeilinger]) and the MPC with Random Fourier Features (RFF-MPC) ([Zhou and Tzoumas(2024)]) are considered. GP-MPC and RFF-MPC learn the residual dynamics $r(x_k, u_k)$ in (3) online with sparse Gaussian processes and random Fourier features, respectively. For all the MPC methods, we set $H := 20$, $Q_{\text{state}} := \text{diag}(5, 0.1, 5, 0.1)$, and $R := 0.1$. For each controller, a simulation with the same setting is performed 10 times with randomly chosen initial conditions. The simulations are implemented with safe-control-gym ([Yuan et al.(2022)Yuan, Hall, Zhou, Brunke, Greeff, Panerati, and Schoellig]) on a system with an AMD Ryzen 7 7700X 8-core processor and 32 GB of memory.

Figure 2: Results of the cartpole system. (a) Sample trajectories; (b) average errors.
Figure 3: Average errors with various extents of discrepancy between the nominal model and the true dynamics. (a) 10% difference; (b) 20% difference; (c) 30% difference.

The results are shown in Fig. 2, where sample trajectories and the average error, defined by $\frac{1}{10}\sum_{i=1}^{10}\|x_{k,i} - x^{\text{ref}}_k\|_2$ ($x_{k,i}$: state of the $i$-th trajectory), are shown for each method in the left and right panels, respectively. Since the nominal MPC does not take the effect of the residual dynamics into account, it does not track the reference within the simulation window of six seconds. On the other hand, all the adaptive MPC methods successfully stabilize the state by the end of the simulations. The proposed adaptive Koopman MPC outperforms GP-MPC and RFF-MPC in terms of the average errors. Table 1 shows the average execution times of the simulation for the individual controllers. Since GP-MPC and RFF-MPC are adaptive methods based on nonlinear MPC, they take longer than the nominal MPC. On the other hand, the proposed adaptive Koopman MPC achieves an even shorter execution time than the nominal MPC thanks to the convexity of its formulation.

Finally, a sensitivity analysis w.r.t. the residual dynamics is performed, where we consider various extents of discrepancy between the nominal model and the true dynamics. Figure 3 shows the results, where the true dynamics parameters are varied by 10 to 30% w.r.t. the nominal model and the same experiment is performed for each case. Whereas the proposed method results in higher average errors after $t = 1.5$ [s] for relatively small residual dynamics (Figs. 3(a,b)), it outperforms the other controllers at the beginning of the simulation in all cases. Also, the proposed method shows the most robust performance across the given range of residual dynamics.

Method                       | Nominal MPC | GP-MPC | RFF-MPC | Proposed
Average execution time [s]   | 0.70        | 5.11   | 1.12    | 0.52

Table 1: Average execution times of the MPC simulation.

6 Conclusion

This work introduced an adaptive Model Predictive Control (MPC) framework that leverages Koopman operators for nonlinear dynamics. While MPC is widely adopted across diverse control applications, its performance can deteriorate when the control model fails to accurately capture the true system dynamics. To address this challenge, we developed an approach that combines the Koopman operator framework with a linear embedding model, enabling online parameter updates to compensate for model discrepancies. The offline training learns the features and operator matrices jointly, allowing for greater expressivity. By maintaining linearity in the control model while accommodating nonlinear dynamics, our method achieves both computational efficiency and adaptability. Online learning may become unstable if the model is parameterized by complex model forms such as neural networks, which result in a high-dimensional non-convex optimization. The proposed method uses the soft update of target networks so that abrupt changes in the model will be avoided and we can stabilize online updates. Also, we provide a discussion on which model parameters to prioritize for the online update based on a specific system description of linear embedding models in [Iacob et al.(2024)Iacob, Tóth, and Schoukens]. Experimental validation on a cartpole system demonstrates that our method achieves favorable control performance while requiring significantly lower computational resources compared to existing adaptive MPC approaches. Furthermore, the results reveal enhanced robustness to residual dynamics. While our initial results demonstrate promise and basic viability, expanding the validation to more complex industrial and robotic systems represents an important next step in establishing this framework’s broad applicability.

Acknowledgments

This work has benefited from several discussions and code-sharing with Hongyu Zhou and Vasileios Tzoumas, for which we are grateful. This work was funded by AFOSR grant FA9550-17-1-0195.

References

  • [Adetola et al.(2009)Adetola, DeHaan, and Guay] Veronica Adetola, Darryl DeHaan, and Martin Guay. Adaptive model predictive control for constrained nonlinear systems. Systems & Control Letters, 58(5):320–326, 2009.
  • [Alfatlawi and Srivastava(2020)] Mustaffa Alfatlawi and Vaibhav Srivastava. An incremental approach to online dynamic mode decomposition for time-varying systems with applications to EEG data modeling, 2020.
  • [Arbabi et al.(2018)Arbabi, Korda, and Mezić] Hassan Arbabi, Milan Korda, and Igor Mezić. A data-driven Koopman model predictive control framework for nonlinear partial differential equations. In 2018 IEEE Conference on Decision and Control (CDC), pages 6409–6414, 2018.
  • [Brunton and Kutz(2019)] Steven L. Brunton and J. Nathan Kutz. Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Cambridge University Press, 2019.
  • [Deem et al.(2020)Deem, Cattafesta, Hemati, Zhang, Rowley, and Mittal] Eric A. Deem, Louis N. Cattafesta, Maziar S. Hemati, Hao Zhang, Clarence Rowley, and Rajat Mittal. Adaptive separation control of a laminar boundary layer using online dynamic mode decomposition. Journal of Fluid Mechanics, 903:A21, 2020.
  • [Han et al.(2022)Han, Euler-Rolle, and Katzschmann] Minghao Han, Jacob Euler-Rolle, and Robert K. Katzschmann. DeSKO: Stability-assured robust control with a deep stochastic Koopman operator. In The Tenth International Conference on Learning Representations, ICLR, 2022.
  • [Hemati et al.(2014)Hemati, Williams, and Rowley] Maziar S. Hemati, Matthew O. Williams, and Clarence W. Rowley. Dynamic mode decomposition for large and streaming datasets. Physics of Fluids, 26(11):111701, 2014.
  • [Hewing et al.(2020)Hewing, Kabzan, and Zeilinger] Lukas Hewing, Juraj Kabzan, and Melanie N. Zeilinger. Cautious model predictive control using Gaussian process regression. IEEE Transactions on Control Systems Technology, 28(6):2736–2743, 2020.
  • [Iacob et al.(2024)Iacob, Tóth, and Schoukens] Lucian Cristian Iacob, Roland Tóth, and Maarten Schoukens. Koopman form of nonlinear systems with inputs. Automatica, 162:111525, 2024.
  • [Klenske et al.(2016)Klenske, Zeilinger, Schölkopf, and Hennig] Edgar D. Klenske, Melanie N. Zeilinger, Bernhard Schölkopf, and Philipp Hennig. Gaussian process-based predictive control for periodic error correction. IEEE Transactions on Control Systems Technology, 24(1):110–121, 2016.
  • [Korda and Mezić(2018)] Milan Korda and Igor Mezić. Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control. Automatica, 93:149–160, 2018.
  • [Li et al.(2024)Li, Abuduweili, Sun, Chen, Zhao, and Liu] Feihan Li, Abulikemu Abuduweili, Yifan Sun, Rui Chen, Weiye Zhao, and Changliu Liu. Continual Learning and Lifting of Koopman Dynamics for Linear Control of Legged Robots. arXiv e-prints, page arXiv:2411.14321, 2024.
  • [Lillicrap et al.(2015)Lillicrap, Hunt, Pritzel, Heess, Erez, Tassa, Silver, and Wierstra] Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement learning. arXiv e-prints, page arXiv:1509.02971, 2015.
  • [Lusch et al.(2018)Lusch, Kutz, and Brunton] Bethany Lusch, Nathan J. Kutz, and Steven L. Brunton. Deep learning for universal linear embeddings of nonlinear dynamics. Nature Communications, 9(1):4950, 2018.
  • [Mauroy et al.(2020)Mauroy, Mezić, and Susuki] Alexandre Mauroy, Igor Mezić, and Yoshihiko Susuki. The Koopman Operator in Systems and Control. Springer International Publishing, 2020. ISBN 978-3-030-35712-2.
  • [Ostafew et al.(2014)Ostafew, Schoellig, and Barfoot] Chris J. Ostafew, Angela P. Schoellig, and Timothy D. Barfoot. Learning-based nonlinear model predictive control to improve vision-based mobile robot path-tracking in challenging outdoor environments. In 2014 IEEE International Conference on Robotics and Automation (ICRA), pages 4029–4036, 2014.
  • [Ostafew et al.(2016)Ostafew, Schoellig, and Barfoot] Chris J. Ostafew, Angela P. Schoellig, and Timothy D. Barfoot. Robust constrained learning-based NMPC enabling reliable mobile robot path tracking. The International Journal of Robotics Research, 35(13):1547–1563, 2016.
  • [Pan and Duraisamy(2020)] Shaowu Pan and Karthik Duraisamy. Physics-informed probabilistic learning of linear embeddings of nonlinear dynamics with guaranteed stability. SIAM Journal on Applied Dynamical Systems, 19(1):480–509, 2020.
  • [Pan and Duraisamy(2024)] Shaowu Pan and Karthik Duraisamy. On the lifting and reconstruction of nonlinear systems with multiple invariant sets. Nonlinear Dynamics, 112(12):10157–10165, 2024.
  • [Shi et al.(2019)Shi, Shi, O’Connell, Yu, Azizzadenesheli, Anandkumar, Yue, and Chung] Guanya Shi, Xichen Shi, Michael O’Connell, Rose Yu, Kamyar Azizzadenesheli, Animashree Anandkumar, Yisong Yue, and Soon-Jo Chung. Neural lander: Stable drone landing control using learned dynamics. In 2019 International Conference on Robotics and Automation (ICRA), pages 9784–9790, 2019.
  • [Singh et al.(2024)Singh, Sah, and Keshavan] Rajpal Singh, Chandan Kumar Sah, and Jishnu Keshavan. Adaptive Koopman Embedding for Robust Control of Complex Nonlinear Dynamical Systems. arXiv e-prints, page arXiv:2405.09101, 2024.
  • [Sinha et al.(2020)Sinha, Nandanoori, and Yeung] Subhrajit Sinha, Sai Pushpak Nandanoori, and Enoch Yeung. Data driven online learning of power system dynamics. In 2020 IEEE Power & Energy Society General Meeting (PESGM), pages 1–5, 2020.
  • [Son et al.(2020)Son, Narasingam, and Sang-Il Kwon] Sang Hwan Son, Abhinav Narasingam, and Joseph Sang-Il Kwon. Handling plant-model mismatch in Koopman Lyapunov-based model predictive control via offset-free control framework. arXiv e-prints, page arXiv:2010.07239, 2020.
  • [Takeishi et al.(2017)Takeishi, Kawahara, and Yairi] Naoya Takeishi, Yoshinobu Kawahara, and Takehisa Yairi. Learning Koopman invariant subspaces for dynamic mode decomposition. Advances in Neural Information Processing Systems, 30:1130–1140, 2017.
  • [Uchida and Duraisamy(2023)] Daisuke Uchida and Karthik Duraisamy. Control-aware Learning of Koopman Embedding Models. In 2023 American Control Conference (ACC), pages 941–948, 2023.
  • [Uchida et al.(2021)Uchida, Yamashita, and Asama] Daisuke Uchida, Atsushi Yamashita, and Hajime Asama. Data-driven Koopman controller synthesis based on the extended 2\mathcal{H}_{2} norm characterization. IEEE Control Systems Letters, 5(5):1795–1800, 2021.
  • [Williams et al.(2015)Williams, Kevrekidis, and Rowley] Matthew O. Williams, Ioannis G. Kevrekidis, and Clarence W. Rowley. A data-driven approximation of the Koopman operator: Extending dynamic mode decomposition. Journal of Nonlinear Science, 25(6):1307–1346, 2015.
  • [Xiao et al.(2023)Xiao, Zhang, Xu, Liu, and Liu] Yongqian Xiao, Xinglong Zhang, Xin Xu, Xueqing Liu, and Jiahang Liu. Deep neural networks with Koopman operators for modeling and control of autonomous vehicles. IEEE Transactions on Intelligent Vehicles, 8(1):135–146, 2023.
  • [Yuan et al.(2022)Yuan, Hall, Zhou, Brunke, Greeff, Panerati, and Schoellig] Zhaocong Yuan, Adam W. Hall, Siqi Zhou, Lukas Brunke, Melissa Greeff, Jacopo Panerati, and Angela P. Schoellig. Safe-control-gym: A unified benchmark suite for safe learning-based control and reinforcement learning in robotics. IEEE Robotics and Automation Letters, 7(4):11142–11149, 2022.
  • [Zhang et al.(2019)Zhang, Rowley, Deem, and Cattafesta] Hao Zhang, Clarence W. Rowley, Eric A. Deem, and Louis N. Cattafesta. Online dynamic mode decomposition for time-varying systems. SIAM Journal on Applied Dynamical Systems, 18(3):1586–1609, 2019.
  • [Zhang et al.(2022)Zhang, Pan, Scattolini, Yu, and Xu] Xinglong Zhang, Wei Pan, Riccardo Scattolini, Shuyou Yu, and Xin Xu. Robust tube-based model predictive control with Koopman operators. Automatica, 137:110114, 2022.
  • [Zhou and Tzoumas(2023)] Hongyu Zhou and Vasileios Tzoumas. Safe control of partially-observed linear time-varying systems with minimal worst-case dynamic regret. In 2023 62nd IEEE Conference on Decision and Control (CDC), pages 8781–8787, 2023.
  • [Zhou and Tzoumas(2024)] Hongyu Zhou and Vasileios Tzoumas. Simultaneous System Identification and Model Predictive Control with No Dynamic Regret. arXiv e-prints, page arXiv:2407.04143, 2024.
  • [Zhou et al.(2023)Zhou, Song, and Tzoumas] Hongyu Zhou, Yichen Song, and Vasileios Tzoumas. Safe non-stochastic control of control-affine systems: An online convex optimization approach. IEEE Robotics and Automation Letters, 8(12):7873–7880, 2023.