Safety Filter Design for Neural Network Systems via
Convex Optimization
Abstract
With the increase in data availability, it has been widely demonstrated that neural networks (NNs) can precisely capture complex system dynamics in a data-driven manner. However, the architectural complexity and nonlinearity of NNs make it challenging to synthesize a provably safe controller. In this work, we propose a novel safety filter that relies on convex optimization to ensure safety for an NN system subject to additive disturbances that can capture modeling errors. Our approach leverages tools from NN verification to over-approximate the NN dynamics with a set of linear bounds, followed by an application of robust linear MPC to search for controllers that guarantee robust constraint satisfaction. We demonstrate the efficacy of the proposed framework numerically on a nonlinear pendulum system.
I Introduction
With the rapid development of machine learning infrastructure, neural networks (NNs) have been applied ubiquitously to the modeling of complex dynamical systems [1, 2]. Through a data collection and training procedure [3], NNs can capture accurate representations of the system dynamics even in challenging scenarios where high-speed aerodynamic effects [4, 5, 6] or contact-rich environments [7, 8] are present. Moreover, NNs can easily be updated online as more data is collected, making them suitable for online tasks or for modeling changing environments. For example, NN dynamical systems are widely used in model-based reinforcement learning [9] and learning-based adaptive control [6].
However, applying NN dynamics brings significant challenges in providing safety guarantees for the controlled system: the expressivity of NNs comes at the cost of high nonlinearity and large scale, since NNs are often overparameterized. The runtime assurance (RTA) mechanism [10] provides a practical and effective way to guarantee the safety of complex dynamical systems by designing a safety filter that focuses solely on enforcing safety constraints. Given a primary controller that aims to optimize performance, the safety filter monitors and modifies the output of the primary controller online so that only safe control inputs are applied. The safety filter allows the designs of the safety-based and performance-based controllers to be decoupled, and it has found wide application in safe learning-based control [11, 12, 13].
In this work, we focus on the design of a predictive safety filter (PSF) [14] for uncertain NN dynamics. The PSF essentially follows a model predictive control (MPC) formulation with the nonlinear dynamics and constraints encoded in an optimization problem. Different from MPC, the PSF is less complex to solve since it does not consider any performance objectives [14]. Compared with the alternative safety filter construction schemes through control barrier functions (CBF) [15, 16] or Hamilton-Jacobi (HJ) reachability analysis [17, 18], the PSF enjoys flexibility in handling dynamically changing NN models or model uncertainty bounds when updated online. We refer the interested readers to [14, Section 1 and 2] for a detailed discussion of the PSF, CBF, and HJ reachability-based safety filters.
Contributions: In this work, we consider uncertain NN dynamics subject to bounded additive disturbances, where the disturbances can encapsulate the errors between the learned NN dynamics and the true system. Despite being highly expressive, the considered uncertain NN dynamics requires solving a robust optimization problem involving NN dynamical constraints online in the PSF. To resolve this computational challenge, we propose to apply NN verification tools [19] to abstract the NN dynamics locally as a linear uncertain system, thereby reducing the original PSF problem into one that is amenable to robust linear MPC and convex optimization. In particular, we adapt the SLS (System Level Synthesis) MPC method [20] to solve the resulting robust MPC problem. A schematic of our pipeline is shown in Fig. 1. Soft constraints are used in robust linear MPC where slack variables denoting constraint violations are penalized. By applying a hierarchy of conservative function and model uncertainty approximations, we transform the original optimization problem into a convex one. A safety certificate for the uncertain NN dynamics over a finite horizon can then be provided when all slack variables are zero. Our contributions are summarized below.
1. Drawing on tools from NN verification and robust linear MPC, we propose a novel predictive safety filter for uncertain NN dynamics based on convex optimization. Importantly, the complexity of the convex optimization problem is independent of the NN size (i.e., the width and depth of the NN).
2. Our PSF provides a safety certificate for the uncertain NN dynamics over a finite horizon when a certain numerical criterion is met by the convex optimization solutions.
I-A Related works
The problem of ensuring the safety of learning-based systems has received significant interest, with a plethora of methods surveyed in [21]. Directly related to our work is the PSF developed in [14], which monitors and modifies a given control input by solving a predictive control problem online to guarantee the safety of the system. This formulation has been extended to the SLS setting [22] and applied to racing cars [12], and a soft-constrained variant has been proposed in [23] to handle unexpected disturbances to the states. The PSF that we propose differs from those in existing work in the following ways. First, we exploit the structure of the neural network to extract linear bounds on the NN outputs using NN verification tools [24, 19], simplifying the PSF formulation for NN dynamics. Second, our proposed pipeline circumvents the need to solve a robust non-convex optimization problem, as is typical for nonlinear variants of SLS [25], even with the consideration of additive disturbances within the uncertain dynamics. Unlike existing work on predictive control of NN dynamics [26, 27, 2], our work considers robust control of uncertain NN dynamics with a focus on obtaining formal safety guarantees.
Notation: $[n]$ denotes the set $\{1, \dots, n\}$. $\mathcal{B}_r(\bar{x})$ denotes the ball centered at $\bar{x}$ with radius $r$. We use $x_{0:T}$ to denote the sequence $\{x_0, \dots, x_T\}$. $\mathbf{0}$ denotes a vector of all zeros, whose dimension can be inferred from the context. For a vector $v$, $\mathrm{diag}(v)$ denotes a diagonal matrix with $v$ being the diagonal vector. For a sequence of matrices $\{M_i\}_{i=0}^{T}$, $\mathrm{blkdiag}(M_0, \dots, M_T)$ denotes a block-diagonal matrix whose diagonal blocks are $M_0, \dots, M_T$, arranged in that order. We represent a linear, causal operator $\mathbf{K}$ defined over a horizon $T$ by the block-lower-triangular matrix

$$\mathbf{K} = \begin{bmatrix} K^{0,0} & & & \\ K^{1,1} & K^{1,0} & & \\ \vdots & \ddots & \ddots & \\ K^{T,T} & \cdots & K^{T,1} & K^{T,0} \end{bmatrix}, \qquad (1)$$

where each $K^{i,j}$ is a matrix of compatible dimension. The set of such matrices is denoted by $\mathcal{L}_{TV}$, and the superscripts indicating the horizon and dimensions will be dropped when they are clear from the context.
II Problem Formulation
Consider the following discrete-time nonlinear system,

$$x_{t+1} = A x_t + B u_t + f(x_t, u_t) + w_t, \qquad (2)$$

where the vectors $x_t \in \mathbb{R}^{n_x}$ and $u_t \in \mathbb{R}^{n_u}$ are the state and control input. The vector $w_t \in \mathbb{R}^{n_x}$ denotes the additive disturbance, which can account for unknown effects from the environment or unmodeled dynamics. We assume that $w_t$ is norm-bounded, i.e., $w_t \in \mathcal{W} := \{w : \|w\|_\infty \leq \sigma_w\}$. The system consists of linear dynamics characterized by the matrices $A$ and $B$ and nonlinear dynamics $f(x_t, u_t)$. Specifically, the nonlinear dynamics $f$ are given by a NN with an arbitrary architecture. While we present our approach for the general dynamics in (2), the approach allows the matrices $A$ and $B$ to be zero and has the flexibility to account for time-varying dynamics $(A_t, B_t, f_t)$. The system (2) is required to satisfy the following constraints,

$$x_t \in \mathcal{X}, \quad u_t \in \mathcal{U}, \quad \forall t \geq 0, \qquad (3)$$

where $\mathcal{X}$ and $\mathcal{U}$ are polytopes. The state and input are considered safe if they satisfy constraints (3).
II-A Predictive safety filter
We assume that a primary controller $\pi$, which aims to complete a task or achieve high performance, is given for system (2). Following the runtime assurance scheme, the primary controller is not guaranteed to be safe since its design may be decoupled from the safety requirements. To ensure constraint satisfaction of the closed-loop system, we design a predictive safety filter that monitors and modifies the control input given by the primary controller online in a minimally invasive manner. This is achieved by solving the following robust optimization problem at each time step $t$,

$$\begin{aligned} \min_{u_{0:T-1}} \quad & \| u_0 - \pi(x(t)) \|_2^2 \\ \text{subject to} \quad & x_{k+1} = A x_k + B u_k + f(x_k, u_k) + w_k, \\ & x_{k+1} \in \mathcal{X}, \quad u_k \in \mathcal{U}, \quad \forall w_k \in \mathcal{W}, \\ & k = 0, \dots, T-1, \quad x_0 = x(t), \end{aligned} \qquad (4)$$

where $x_{0:T}$ and $u_{0:T-1}$ denote the predicted states and inputs over the horizon $T$, with $x_0 = x(t)$ representing the current state of the system. We refer to Problem (4) as the PSF problem and to the sequence $u^*_{0:T-1}$ as the optimal solution of Problem (4). When applied, the control inputs $u^*_{0:T-1}$ guarantee the safety of the system for the next $T$ steps. In practice, the PSF problem (4) is solved recursively at each time step $t$, and the first optimal control input $u^*_0$ is applied to the system, analogous to an MPC scheme.
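For intuition, the receding-horizon filtering logic can be sketched on a toy one-dimensional integrator, where the minimally invasive modification reduces to a projection onto the set of safe inputs. The system, constraint bounds, and function below are illustrative stand-ins, not the paper's NN setting:

```python
import numpy as np

# Toy 1-D integrator x_{k+1} = x_k + u_k with |x| <= 1 and |u| <= 0.5.
# With a one-step horizon, the safety filter reduces to projecting the
# primary controller's input onto the interval of inputs keeping x safe.
def safety_filter_1d(x, u_primary, u_max=0.5, x_max=1.0):
    lo = max(-u_max, -x_max - x)   # inputs keeping x_{k+1} >= -x_max
    hi = min(u_max, x_max - x)     # inputs keeping x_{k+1} <= +x_max
    if lo > hi:
        raise RuntimeError("infeasible: no admissible safe input")
    # minimally invasive modification: Euclidean projection of u_primary
    return float(np.clip(u_primary, lo, hi))

x = 0.8
print(safety_filter_1d(x, 0.1))  # already safe, left unchanged: 0.1
print(safety_filter_1d(x, 0.5))  # would violate |x| <= 1, clipped to 0.2
```

The filter leaves safe inputs untouched and only intervenes near the constraint boundary, mirroring the decoupling between performance and safety described above.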
II-B Challenges with the predictive safety filter
While the solution to Problem (4) is able to provide safety guarantees, solving Problem (4) is a challenging task. Some potential issues include
(i) it is well known in robust MPC that searching over open-loop control sequences can be overly conservative [28],
(ii) the presence of the NN dynamics makes solving the PSF computationally challenging,
(iii) the safety certificate of the solution is not available until convergence is reached, and
(iv) without the availability of a robust forward invariant set, attempting to solve the PSF may result in infeasibility.
To handle the aforementioned issues, we combine NN verification tools [19] and robust MPC [20]. Our solution consists of two steps. First, we generate local linear bounds on the NN dynamics using tools from NN verification. Next, we apply robust linear MPC to synthesize a state feedback control policy that guarantees robust constraint satisfaction for the system. This combined procedure provides a powerful simplification of the PSF problem and resolves issues (i) to (iii). To address issue (iv), we introduce soft constraints in our formulation, which provides formal safety guarantees for the system whenever the slack variables are zero. We describe these two steps in Sections III and IV, respectively. Section V demonstrates our method numerically, and Section VI concludes the paper.
Remark 1
To ensure the safety constraints are satisfied at all times or guarantee recursive feasibility of the PSF (4), a local forward invariant set for the nonlinear system (2) is required, which is generally challenging to find. In this work, we do not assume the availability of such a forward invariant set and use slack variables as numerical certificates of safety. We leave the synthesis of the forward invariant terminal set for NN dynamics and its integration into the PSF as part of our future work.
III Neural network verification bounds
In this section, we demonstrate how tools from NN verification can be utilized to over-approximate the NN dynamics with a linear time-varying (LTV) representation. This enables us to conservatively transform the PSF problem into a robust convex optimization problem, which is simpler to solve.
Given a bounded input set, the linear relaxation-based perturbation analysis (LiRPA) [19] is an efficient method to synthesize linear lower and upper bounds for the outputs of a NN with a general architecture. The bounds computed from this method are described in the following theorem.
Theorem 1
(rephrasing [24, Theorem 3.2]) Given a NN $f : \mathbb{R}^{n_z} \to \mathbb{R}^{n_f}$ and a bounded input set $\mathcal{Z} \subset \mathbb{R}^{n_z}$, there exist two explicit linear functions $\underline{f}(z) = \underline{A} z + \underline{b}$ and $\overline{f}(z) = \overline{A} z + \overline{b}$ such that for all $z \in \mathcal{Z}$, we have

$$\underline{A} z + \underline{b} \leq f(z) \leq \overline{A} z + \overline{b}, \qquad (5)$$

where the inequalities are applied component-wise.

The parameters $\underline{A}, \underline{b}, \overline{A}, \overline{b}$ are derived from the weights, biases, and activation functions of the NN. In this paper, we choose the input set $\mathcal{Z}$ to be polyhedral. The bounds (5) are computed using closed-form updates with a computational complexity polynomial in the number of neurons [24], which allows the method to scale well to deep networks. However, if the NN is deep or if the input domain is large, the computed bounds tend to be loose. Motivated by this observation, we propose to extract a set of local linear bounds along a reference trajectory. Specifically, at every time step $t$, we construct a reference trajectory given by the sequences of reference states $\bar{x}_{0:T}$ and control inputs $\bar{u}_{0:T-1}$, where
$$\bar{x}_{k+1} = A \bar{x}_k + B \bar{u}_k + f(\bar{x}_k, \bar{u}_k), \qquad \bar{x}_0 = x(t). \qquad (6)$$

The reference control inputs $\bar{u}_{0:T-1}$ can be obtained, e.g., by rolling out the nominal NN dynamics under the primary policy $\pi$. Denoting $\bar{z}_k = (\bar{x}_k, \bar{u}_k)$ and letting $r$ be the radius of the ball around $\bar{z}_k$, we apply Theorem 1 to obtain the following bounds for the NN dynamics along the reference trajectory,

$$\underline{A}_k z + \underline{b}_k \leq f(z) \leq \overline{A}_k z + \overline{b}_k, \quad \forall z \in \mathcal{B}_r(\bar{z}_k), \qquad (7)$$

for $k = 0, \dots, T-1$. In other words, the NN dynamics is over-approximated by a set of linear lower and upper bounds. The ball $\mathcal{B}_r(\bar{z}_k)$ is referred to as the trust region in which the bounds (7) are valid.
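As a simplified stand-in for the LiRPA machinery (which returns linear bounds with nonzero slopes), the sketch below propagates interval bounds through a small random ReLU network over a trust region. Interval bounds are a coarser special case of the linear bounds above, with zero slope; the network weights and dimensions here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

# A small random ReLU network f(z) = W2 relu(W1 z + b1) + b2 standing in
# for the learned residual dynamics.
W1, b1 = rng.normal(size=(8, 3)), rng.normal(size=8)
W2, b2 = rng.normal(size=(2, 8)), rng.normal(size=2)

def f(z):
    return W2 @ np.maximum(W1 @ z + b1, 0) + b2

def ibp_bounds(z_ref, r):
    """Interval bound propagation over the box ||z - z_ref||_inf <= r."""
    lo, hi = z_ref - r, z_ref + r
    # affine layer: split W into its positive and negative parts
    Wp, Wn = np.maximum(W1, 0), np.minimum(W1, 0)
    lo, hi = Wp @ lo + Wn @ hi + b1, Wp @ hi + Wn @ lo + b1
    lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)  # ReLU is monotone
    Wp, Wn = np.maximum(W2, 0), np.minimum(W2, 0)
    return Wp @ lo + Wn @ hi + b2, Wp @ hi + Wn @ lo + b2

z_ref, r = np.array([0.1, -0.2, 0.3]), 0.05
lo, hi = ibp_bounds(z_ref, r)
# every sample in the trust region respects the bounds componentwise
for _ in range(1000):
    z = z_ref + rng.uniform(-r, r, size=3)
    assert np.all(f(z) >= lo - 1e-9) and np.all(f(z) <= hi + 1e-9)
```

Tools such as auto_LiRPA tighten this considerably by keeping the linear dependence on $z$, which is what makes the small-radius trust regions in (7) effective.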
To reduce conservatism in the formulation of the filter within the robust MPC framework, we integrate the bounds into the linear dynamics of the system. Specifically, using the bounds in (7), we define the residual

$$\Delta_k(z) := f(z) - \hat{A}_k z - \hat{b}_k, \qquad (8)$$

where

$$\hat{A}_k = \tfrac{1}{2}\left(\underline{A}_k + \overline{A}_k\right), \qquad \hat{b}_k = \tfrac{1}{2}\left(\underline{b}_k + \overline{b}_k\right) \qquad (9)$$

denote the means of the linear bounds.
Corollary 1

For all $z \in \mathcal{B}_r(\bar{z}_k)$, the residual $\Delta_k(z)$ in (8) satisfies the component-wise bounds

$$-\tfrac{1}{2}\left[(\overline{A}_k - \underline{A}_k) z + \overline{b}_k - \underline{b}_k\right] \leq \Delta_k(z) \leq \tfrac{1}{2}\left[(\overline{A}_k - \underline{A}_k) z + \overline{b}_k - \underline{b}_k\right]. \qquad (10)$$

It is important to note that although the NN dynamics $f$ can attain large values within the trust region $\mathcal{B}_r(\bar{z}_k)$, the residual $\Delta_k$ tends to be small in magnitude and can be treated as a disturbance to the system. With the extraction of $\hat{A}_k$ and $\hat{b}_k$, we obtain an LTV reformulation of the PSF problem, referred to as the linear PSF problem,
$$\begin{aligned} \min_{u_{0:T-1}} \quad & \| u_0 - \pi(x(t)) \|_2^2 \\ \text{subject to} \quad & x_{k+1} = \tilde{A}_k x_k + \tilde{B}_k u_k + \tilde{c}_k + \Delta_k + w_k, \\ & x_{k+1} \in \mathcal{X}, \quad u_k \in \mathcal{U}, \quad (x_k, u_k) \in \mathcal{B}_r(\bar{z}_k), \\ & \forall \Delta_k \in \mathcal{D}_k, \quad \forall w_k \in \mathcal{W}, \quad x_0 = x(t), \end{aligned} \qquad (11)$$

where the means in (9) are merged into the linear dynamics of the system through the following time-varying system parameters¹: $\tilde{A}_k = A + \hat{A}^x_k$, $\tilde{B}_k = B + \hat{A}^u_k$, and $\tilde{c}_k = \hat{b}_k$, with $\hat{A}_k = [\hat{A}^x_k \;\; \hat{A}^u_k]$ partitioned according to the state and input dimensions. (¹When time-varying dynamics are considered in (2), we replace $A$ and $B$ by their time-varying counterparts in these definitions.) The uncertainty set $\mathcal{D}_k$ is given as

$$\mathcal{D}_k = \left\{ \Delta : -\tfrac{1}{2}\left[(\overline{A}_k - \underline{A}_k) z + \overline{b}_k - \underline{b}_k\right] \leq \Delta \leq \tfrac{1}{2}\left[(\overline{A}_k - \underline{A}_k) z + \overline{b}_k - \underline{b}_k\right], \; z \in \mathcal{B}_r(\bar{z}_k) \right\}, \qquad (12)$$

using the bounds obtained from Corollary 1. It immediately follows that the nominal dynamics, obtained by setting $\Delta_k = \mathbf{0}$ and $w_k = \mathbf{0}$, corresponds to one realization of the uncertain dynamics in (11).
A few remarks about the linear PSF problem are in order. First, the solution of the linear PSF problem depends on the centers $\bar{z}_k$ and the radius $r$ of the trust regions. The centers play an important role when the reference trajectory lies near or beyond the boundaries of the constraint set. In this case, the reference trajectory should be shifted towards the constraint set, which is done by adjusting the centers of the trust regions. Next, based on how the linear bounds in (12) are computed, there is a trade-off in the size of the radius $r$. A small radius ensures that the computed bounds are accurate, but it limits the range over which the centers can be updated at each iteration. On the other hand, a large radius provides more flexibility in updating the reference trajectory, but the bounds can be overly conservative. Lastly, any feasible solution to the linear PSF problem (11) is also feasible for the PSF problem (4).
IV Robust Linear MPC
Compared with the PSF problem (4), the linear PSF problem (11) involves only uncertain linear dynamics. However, solving it can still be challenging, and a conservative approach may fail to find a solution. Since optimizing over open-loop control sequences is conservative in robust MPC, we instead optimize over state-feedback controllers. To achieve this, we apply an extension of the SLS MPC algorithm in [20], which has been shown to be among the least conservative of existing robust linear MPC methods.
IV-A Overview of SLS MPC
In SLS MPC, we consider the following uncertain linear time-varying system,

$$x_{k+1} = A_k x_k + B_k u_k + \eta_k, \qquad (13)$$

where the time-varying matrices $A_k, B_k$ representing the nominal dynamics are known, and $\eta_k$ denotes the lumped uncertainty, which will be used to capture the effects of uncertainty in the dynamics.
Consider the dynamics (13) over a horizon $T$. We define the following variables, which are concatenations of the variables in (13) over the horizon $T$,

$$\mathbf{x} = \begin{bmatrix} x_0 \\ \vdots \\ x_T \end{bmatrix}, \quad \mathbf{u} = \begin{bmatrix} u_0 \\ \vdots \\ u_T \end{bmatrix}, \quad \boldsymbol{\eta} = \begin{bmatrix} x_0 \\ \eta_0 \\ \vdots \\ \eta_{T-1} \end{bmatrix}, \qquad (14)$$

and these concatenated system matrices,

$$\mathcal{A} = \mathrm{blkdiag}(A_0, \dots, A_{T-1}, 0), \quad \mathcal{B} = \mathrm{blkdiag}(B_0, \dots, B_{T-1}, 0).$$

We define $Z$ as the block-downshift operator, with the first block sub-diagonal filled with identity matrices and zeros everywhere else. The dynamics (13) over the horizon $T$ can then be compactly written as

$$\mathbf{x} = Z \mathcal{A} \mathbf{x} + Z \mathcal{B} \mathbf{u} + \boldsymbol{\eta}. \qquad (15)$$

Next, we consider an LTV state feedback controller $\mathbf{K} \in \mathcal{L}_{TV}$, compactly given as $\mathbf{u} = \mathbf{K} \mathbf{x}$, i.e., each $u_k$ is a linear function of $x_{0:k}$. Plugging $\mathbf{u} = \mathbf{K}\mathbf{x}$ into (15) gives the following system responses $\{\boldsymbol{\Phi}_x, \boldsymbol{\Phi}_u\}$, mapping $\boldsymbol{\eta}$ to the closed-loop states and inputs $\{\mathbf{x}, \mathbf{u}\}$,

$$\mathbf{x} = (I - Z(\mathcal{A} + \mathcal{B}\mathbf{K}))^{-1} \boldsymbol{\eta} =: \boldsymbol{\Phi}_x \boldsymbol{\eta}, \qquad \mathbf{u} = \mathbf{K} \boldsymbol{\Phi}_x \boldsymbol{\eta} =: \boldsymbol{\Phi}_u \boldsymbol{\eta}. \qquad (16)$$
The following theorem establishes the connection between the system responses $\{\boldsymbol{\Phi}_x, \boldsymbol{\Phi}_u\}$ and state feedback controllers.
Theorem 2

(see [20]) For the system (15) over the horizon $T$: (i) for any LTV state feedback controller $\mathbf{K} \in \mathcal{L}_{TV}$, the system responses $\{\boldsymbol{\Phi}_x, \boldsymbol{\Phi}_u\}$ defined in (16) lie in the affine subspace

$$\begin{bmatrix} I - Z\mathcal{A} & -Z\mathcal{B} \end{bmatrix} \begin{bmatrix} \boldsymbol{\Phi}_x \\ \boldsymbol{\Phi}_u \end{bmatrix} = I; \qquad (17)$$

(ii) conversely, for any block-lower-triangular matrices $\{\boldsymbol{\Phi}_x, \boldsymbol{\Phi}_u\}$ satisfying (17), the controller $\mathbf{K} = \boldsymbol{\Phi}_u \boldsymbol{\Phi}_x^{-1}$ achieves the system responses (16).

The system responses $\{\boldsymbol{\Phi}_x, \boldsymbol{\Phi}_u\}$ explicitly characterize the effects of the lumped uncertainty $\boldsymbol{\eta}$ on $\{\mathbf{x}, \mathbf{u}\}$. In previous work [20], a system subject to polytopic model uncertainty and additive disturbances is considered, i.e., (13) where the nominal parameters are perturbed by matrices belonging to a polytopic set. Interested readers are referred to [20] for more details on SLS MPC.
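The system-response parameterization can be sanity-checked numerically: for a random LTV system and any causal feedback gain, the closed-loop maps from the stacked disturbance to states and inputs satisfy the SLS affine constraint. A minimal numpy sketch with arbitrary dimensions and gains:

```python
import numpy as np

rng = np.random.default_rng(1)
T, n, m = 3, 2, 1      # horizon, state dim, input dim
N = T + 1              # number of time steps 0..T

# block-diagonal dynamics blkdiag(A_0, ..., A_{T-1}, 0) and the same for B
Ablk = np.zeros((N * n, N * n))
Bblk = np.zeros((N * n, N * m))
for t in range(T):
    Ablk[t*n:(t+1)*n, t*n:(t+1)*n] = 0.5 * rng.normal(size=(n, n))
    Bblk[t*n:(t+1)*n, t*m:(t+1)*m] = rng.normal(size=(n, m))

# block-downshift operator Z: identity on the first block sub-diagonal
Z = np.zeros((N * n, N * n))
for t in range(T):
    Z[(t+1)*n:(t+2)*n, t*n:(t+1)*n] = np.eye(n)

# an arbitrary causal (block-lower-triangular) state-feedback gain K
K = np.zeros((N * m, N * n))
for i in range(N):
    for j in range(i + 1):
        K[i*m:(i+1)*m, j*n:(j+1)*n] = rng.normal(size=(m, n))

# closed-loop system responses x = Phi_x eta, u = Phi_u eta under u = K x
Phi_x = np.linalg.inv(np.eye(N * n) - Z @ (Ablk + Bblk @ K))
Phi_u = K @ Phi_x

# the SLS affine constraint (I - Z A) Phi_x - Z B Phi_u = I holds exactly,
# and Phi_x is causal: its first block row is [I, 0, ..., 0]
lhs = (np.eye(N * n) - Z @ Ablk) @ Phi_x - Z @ Bblk @ Phi_u
assert np.allclose(lhs, np.eye(N * n))
assert np.allclose(Phi_x[:n, n:], 0.0)
```

The inverse always exists because $Z(\mathcal{A} + \mathcal{B}\mathbf{K})$ is strictly block-lower-triangular, so $I$ minus it is lower triangular with identity diagonal blocks.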
Despite its favorable conservatism reduction compared to existing robust MPC methods [20], applying SLS MPC directly to solve the linear PSF comes with two challenges: (i) the uncertainty set $\mathcal{D}_k$ is defined through affine inequalities, and converting it into a vertex representation, which is amenable to existing robust MPC methods [30, 31, 32, 33], requires a number of vertices that grows exponentially with the problem dimensions; (ii) applying SLS MPC requires merging the constant $\tilde{c}_k$ into the lumped uncertainty, which causes the bounds on the lumped uncertainty to be overly conservative. In the following subsections, we describe an extension of SLS MPC that addresses these two challenges.
IV-B Controller parameterization
For the uncertain linear dynamics

$$x_{k+1} = \tilde{A}_k x_k + \tilde{B}_k u_k + \tilde{c}_k + \Delta_k + w_k \qquad (18)$$

stated in (11), we define $\eta_k := \Delta_k + w_k$ as the lumped uncertainty. Instead of treating the constant $\tilde{c}_k$ as part of the lumped uncertainty in (13), we decompose (18) into a set of nominal and error dynamics.
First, in addition to (14), we concatenate these variables over the horizon $T$,

$$\tilde{\mathbf{c}} = \begin{bmatrix} 0 \\ \tilde{c}_0 \\ \vdots \\ \tilde{c}_{T-1} \end{bmatrix}, \quad \boldsymbol{\Delta} = \begin{bmatrix} \Delta_0 \\ \vdots \\ \Delta_{T-1} \end{bmatrix}, \quad \mathbf{w} = \begin{bmatrix} w_0 \\ \vdots \\ w_{T-1} \end{bmatrix}. \qquad (19)$$

We define the nominal and error states and control inputs through $x_k = x^{\mathrm{nom}}_k + e_k$ and $u_k = u^{\mathrm{nom}}_k + \tilde{u}_k$, with the nominal and error dynamics

$$x^{\mathrm{nom}}_{k+1} = \tilde{A}_k x^{\mathrm{nom}}_k + \tilde{B}_k u^{\mathrm{nom}}_k + \tilde{c}_k, \quad x^{\mathrm{nom}}_0 = x(t), \qquad (20a)$$
$$e_{k+1} = \tilde{A}_k e_k + \tilde{B}_k \tilde{u}_k + \eta_k, \quad e_0 = \mathbf{0}. \qquad (20b)$$

It is important to note that (20b) conforms with (13). An LTV state feedback controller $\tilde{\mathbf{u}} = \mathbf{K}\mathbf{e}$ is then applied to control the error states. The overall controller for (18) is given by $u_k = u^{\mathrm{nom}}_k + \tilde{u}_k$.
IV-C Lumped uncertainty over-approximation
For the lumped uncertainty $\eta_k = \Delta_k + w_k$, its dependence on the states, inputs, and disturbances complicates the design of the robust controller. As in SLS MPC, the approach is to over-approximate $\boldsymbol{\eta}$ by an independent, filtered virtual disturbance signal $\hat{\mathbf{w}}$, where $\boldsymbol{\eta} = \boldsymbol{\Sigma} \hat{\mathbf{w}}$. The matrix $\boldsymbol{\Sigma} \in \mathcal{L}_{TV}$ is a filter operating on the finite-horizon virtual disturbance signal $\hat{\mathbf{w}}$, with its diagonal blocks structured as diagonal matrices $\mathrm{diag}(\sigma_k)$ with $\sigma_k > 0$.

We define $\hat{\mathcal{W}} := \{ \hat{\mathbf{w}} : \|\hat{w}_k\|_\infty \leq 1, \; \forall k \}$ as the set of admissible virtual disturbances. Since the virtual disturbances $\hat{w}_k$ are unit norm-bounded, we tune the filter $\boldsymbol{\Sigma}$ to change the reachable set of $\boldsymbol{\Sigma}\hat{\mathbf{w}}$, defined as $\mathcal{R}(\boldsymbol{\Sigma}\hat{\mathbf{w}}) := \{ \boldsymbol{\Sigma}\hat{\mathbf{w}} : \hat{\mathbf{w}} \in \hat{\mathcal{W}} \}$.
Our goal is to find sufficient conditions on the control parameters $\{\boldsymbol{\Phi}_x, \boldsymbol{\Phi}_u\}$ and the filter $\boldsymbol{\Sigma}$ such that the reachable set of the lumped uncertainty $\boldsymbol{\eta}$, denoted by $\mathcal{R}(\boldsymbol{\eta})$, is a subset of the reachable set of $\boldsymbol{\Sigma}\hat{\mathbf{w}}$. The following proposition provides these sufficient conditions; the proof is given in Appendix -A.
Proposition 1

Let $e_j$ denote the $j$-th standard basis vector of the appropriate dimension. The following constraints:

$$\begin{bmatrix} I - Z\tilde{\mathcal{A}} & -Z\tilde{\mathcal{B}} \end{bmatrix} \begin{bmatrix} \boldsymbol{\Phi}_x \\ \boldsymbol{\Phi}_u \end{bmatrix} = \boldsymbol{\Sigma}, \qquad (21)$$

and

(22a)
(22b)
(22c)
(22d)

guarantee that $\mathcal{R}(\boldsymbol{\eta}) \subseteq \mathcal{R}(\boldsymbol{\Sigma}\hat{\mathbf{w}})$ holds, where $\tilde{\mathcal{A}} = \mathrm{blkdiag}(\tilde{A}_0, \dots, \tilde{A}_{T-1}, 0)$ and $\tilde{\mathcal{B}} = \mathrm{blkdiag}(\tilde{B}_0, \dots, \tilde{B}_{T-1}, 0)$.
IV-D Convex formulation of the linear PSF
With the constraints (21) and (22), for any realization of $\boldsymbol{\eta}$ there exists $\hat{\mathbf{w}} \in \hat{\mathcal{W}}$ such that $\boldsymbol{\eta} = \boldsymbol{\Sigma}\hat{\mathbf{w}}$. Therefore, we can represent $\boldsymbol{\eta}$ as $\boldsymbol{\Sigma}\hat{\mathbf{w}}$ and write the error dynamics (20b) as

$$\mathbf{e} = Z\tilde{\mathcal{A}}\mathbf{e} + Z\tilde{\mathcal{B}}\tilde{\mathbf{u}} + \boldsymbol{\Sigma}\hat{\mathbf{w}}. \qquad (23)$$

By [20, Corollary 1], constraint (21) parameterizes all system responses of system (23) under the controller $\mathbf{K}$. The closed-loop states and control inputs of system (18) are then given by

$$\mathbf{x} = \mathbf{x}^{\mathrm{nom}} + \boldsymbol{\Phi}_x \hat{\mathbf{w}}, \qquad \mathbf{u} = \mathbf{u}^{\mathrm{nom}} + \boldsymbol{\Phi}_u \hat{\mathbf{w}}. \qquad (24)$$

To guarantee robust constraint satisfaction under the controller, we tighten the constraints in the linear PSF (11). Consider the state constraint as an example. The constraints are represented by a polyhedral set, $\mathcal{X} = \{x : H_x x \leq h_x\}$, where $H_{x,j}$ and $h_{x,j}$ denote the $j$-th rows of $H_x$ and $h_x$, so that $(H_{x,j}, h_{x,j})$ denotes the $j$-th set of linear constraint parameters in $\mathcal{X}$. From (24), we have $x_k = x^{\mathrm{nom}}_k + (\boldsymbol{\Phi}_x \hat{\mathbf{w}})_k$. Then, the following constraints

$$H_{x,j}\, x^{\mathrm{nom}}_k + \max_{\hat{\mathbf{w}} \in \hat{\mathcal{W}}} H_{x,j} (\boldsymbol{\Phi}_x \hat{\mathbf{w}})_k \leq h_{x,j}, \quad \forall j, k, \qquad (25)$$
guarantee that all constraints in $\mathcal{X}$ are satisfied robustly. As discussed in Section II, the recursive feasibility of the linear PSF cannot be guaranteed without a robust forward invariant set. Therefore, in this work, we introduce soft constraints into (25),

$$H_{x,j}\, x^{\mathrm{nom}}_k + \max_{\hat{\mathbf{w}} \in \hat{\mathcal{W}}} H_{x,j} (\boldsymbol{\Phi}_x \hat{\mathbf{w}})_k \leq h_{x,j} + s_{k,j}, \quad s_{k,j} \geq 0. \qquad (26)$$

In the cost function, a large penalty on the slack variables $s_{k,j}$ is applied. In the case $s_{k,j} = 0$, we obtain a robust constraint satisfaction guarantee for the constraint with parameters $(H_{x,j}, h_{x,j})$. Similarly, the input constraints can be tightened as

$$H_{u,j}\, u^{\mathrm{nom}}_k + \max_{\hat{\mathbf{w}} \in \hat{\mathcal{W}}} H_{u,j} (\boldsymbol{\Phi}_u \hat{\mathbf{w}})_k \leq h_{u,j} + s^{u}_{k,j}, \quad s^{u}_{k,j} \geq 0, \qquad (27)$$

where the polyhedral description of $\mathcal{U}$ consists of linear inequalities with parameters $(H_{u,j}, h_{u,j})$, the $j$-th of which appears above. The trust region constraints $(x_k, u_k) \in \mathcal{B}_r(\bar{z}_k)$ can analogously be tightened as

$$H_{z,j} \begin{bmatrix} x^{\mathrm{nom}}_k \\ u^{\mathrm{nom}}_k \end{bmatrix} + \max_{\hat{\mathbf{w}} \in \hat{\mathcal{W}}} H_{z,j} \begin{bmatrix} (\boldsymbol{\Phi}_x \hat{\mathbf{w}})_k \\ (\boldsymbol{\Phi}_u \hat{\mathbf{w}})_k \end{bmatrix} \leq h_{z,j} + s^{z}_{k,j}, \quad s^{z}_{k,j} \geq 0, \qquad (28)$$

where $(H_{z,j}, h_{z,j})$ collects the state- and input-related constraint parameters in the polyhedral description of $\mathcal{B}_r(\bar{z}_k)$.
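Because the virtual disturbances are unit $\infty$-norm bounded, the inner maximization in the tightened constraints has a closed form via the dual norm: the worst case of a linear functional over an $\infty$-norm ball is its 1-norm times the radius. A numpy sketch with random stand-in data (the response matrix and polytope below are arbitrary, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
nx, nw, wbar = 2, 4, 0.1                               # dims and disturbance bound
H = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])   # polytope H x <= h
h = np.array([1.0, 1.0, 1.5])
x_nom = np.array([0.7, 0.3])                           # nominal state
Phi = rng.normal(size=(nx, nw))                        # disturbance-to-state response

# tightened constraint, row by row:
#   H_j x_nom + max_{||w||_inf <= wbar} H_j Phi w = H_j x_nom + ||H_j Phi||_1 wbar
tight = H @ x_nom + np.abs(H @ Phi).sum(axis=1) * wbar
slack = np.maximum(tight - h, 0.0)   # soft-constraint violation, if any

# the 1-norm term is exactly the worst case: the maximizer aligns its signs
w_star = wbar * np.sign(H[0] @ Phi)
assert np.isclose(H[0] @ (Phi @ w_star), np.abs(H[0] @ Phi).sum() * wbar)

# and no admissible disturbance can exceed the tightened value
for _ in range(1000):
    w = rng.uniform(-wbar, wbar, size=nw)
    assert np.all(H @ (x_nom + Phi @ w) <= tight + 1e-9)
```

This closed form is what makes the robust constraints linear in the decision variables and hence compatible with a quadratic program.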
Summarizing the results above, we propose a convex tightening of the linear PSF, which can be written as

$$\begin{aligned} \min \quad & \| u^{\mathrm{nom}}_0 - \pi(x(t)) \|_2^2 + \rho \sum_{k,j} \left( s_{k,j} + s^{u}_{k,j} + s^{z}_{k,j} \right) \\ \text{subject to} \quad & \text{nominal dynamics (20a), SLS constraints (21)--(22),} \\ & \text{soft-tightened constraints (26)--(28),} \end{aligned} \qquad (29)$$

where the penalty weights, collectively denoted by $\rho$, are chosen as large numbers. When a polyhedral robust forward invariant set is used as the terminal constraint, we can tighten it similarly to (26). All the constraints in Problem (29) are linear, making (29) a quadratic program. With the use of soft constraints, (29) is always feasible. If the slack variables in the solution are all zero, we obtain a certificate that the system is safe for the next $T$ steps under the state-feedback controller $u_k = u^{\mathrm{nom}}_k + \tilde{u}_k$ with $\tilde{\mathbf{u}} = \boldsymbol{\Phi}_u \boldsymbol{\Phi}_x^{-1} \mathbf{e}$.
IV-E Trust region update
Following the discussion of the trust regions in Section III, we describe a method to update the trust regions online. Starting with the reference trajectory given by the primary policy $\pi$, we propose to iteratively increase the radius $r$ and update the reference trajectory by applying the policy synthesized in Section IV. We then pick the reference trajectory that gives the smallest slack variables and apply the corresponding control inputs to the system. Our framework is summarized in Algorithm 1.
Input: Current state $x(t)$, horizon $T$, number of iterations $N_{\mathrm{iter}}$, initial reference trajectory $\bar{x}_{0:T}$, reference control inputs $\bar{u}_{0:T-1}$, initial trust region radius $r_0$
Output: Filtered control input $u_0$, safety certificate safe_cert
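The iterative scheme of Algorithm 1 can be sketched with hypothetical stand-in functions: `solve_convex_psf` plays the role of the soft-constrained convex program, returning a policy and its total slack, and `rollout` re-centers the reference trajectory under that policy. Both function names and the growth factor are assumptions for illustration, not the paper's API:

```python
def trust_region_filter(x0, ref, r0, n_iters, solve_convex_psf, rollout, growth=1.5):
    """Pick the reference/radius pair with the smallest total slack.

    solve_convex_psf(x0, ref, r) -> (policy, total_slack) stands in for the
    soft-constrained convex PSF; rollout(x0, policy) -> new_ref re-centers
    the trust regions along the synthesized closed-loop trajectory.
    """
    best_policy, best_slack = None, float("inf")
    r = r0
    for _ in range(n_iters):
        policy, slack = solve_convex_psf(x0, ref, r)
        if slack < best_slack:
            best_policy, best_slack = policy, slack
        ref = rollout(x0, policy)   # update the trust-region centers
        r *= growth                 # iteratively enlarge the radius
    # zero total slack doubles as the finite-horizon safety certificate
    return best_policy, best_slack == 0.0

# toy stubs: slack shrinks as the trust region grows
solve_stub = lambda x0, ref, r: (("policy", r), max(0.0, 1.0 - r))
rollout_stub = lambda x0, policy: "ref"
policy, safe_cert = trust_region_filter(0.0, "ref0", 0.5, 3, solve_stub, rollout_stub)
print(safe_cert)  # True: the third iterate (r = 1.125) attains zero slack
```

The loop always returns the best candidate seen, so the filter degrades gracefully when no iterate achieves zero slack.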
V Numerical Example
To verify the efficacy of the proposed solution, we test it on a NN proxy of a nonlinear pendulum system² (²Our code is publicly available at https://github.com/ShaoruChen/NN-System-PSF.). The pendulum has the following dynamics [34],
$$m l^2 \ddot{\theta} = m g l \sin\theta + u, \qquad (30)$$
where $\theta$ is the angle between the pendulum and the vertical, $m$ and $l$ are the mass and length of the pendulum, $g$ is the gravitational acceleration, and $u$ is the external torque acting on the pendulum. The state and control input are defined as $x = [\theta, \dot{\theta}]^\top$ and $u$. To obtain the linear dynamics in (2), the dynamics (30) are linearized about the origin and discretized with a sampling time of 0.05 s to obtain the following dynamics,
(31)
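The linearization and discretization step can be reproduced in a few lines of numpy; the unit mass and length and the forward-Euler scheme below are assumptions for illustration (the paper's parameter values and discretization method may differ):

```python
import numpy as np

m, l, g, dt = 1.0, 1.0, 9.81, 0.05   # assumed parameters and sampling time

def f_cont(x, u):
    """Continuous-time pendulum: theta'' = (g/l) sin(theta) + u / (m l^2)."""
    theta, dtheta = x
    return np.array([dtheta, g / l * np.sin(theta) + u / (m * l**2)])

# Jacobian-based linearization about the origin, then forward-Euler discretization
A_c = np.array([[0.0, 1.0], [g / l, 0.0]])
B_c = np.array([[0.0], [1.0 / (m * l**2)]])
A, B = np.eye(2) + dt * A_c, dt * B_c        # x_{t+1} = A x_t + B u_t

# the residual the NN has to learn is the gap between one nonlinear Euler
# step and its linearization; it vanishes at the origin and grows with |theta|
x, u = np.array([0.3, 0.0]), 0.0
residual = (x + dt * f_cont(x, u)) - (A @ x + B.flatten() * u)
print(residual)
```

Away from the origin the residual is nonzero, which is exactly the part of the dynamics delegated to the trained NN in the next paragraph.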
Next, we train a NN to approximate the residual dynamics that are not captured by the linear dynamics in (31). We first collect data by simulating the nonlinear dynamics (30) for a duration of 15 s. The NN is then trained through backpropagation [3], using the mean squared error between the predicted and true states as the loss function. The NN consists of 3 hidden layers with 64 neurons in each layer and uses the rectified linear unit (ReLU) activation function. Additive noise of bounded magnitude is injected into the states of the system. For the primary control policy, we consider an iterative linear quadratic regulator (iLQR) scheme [35] with the box-constrained heuristic [36], implemented with the mpc.pytorch library [37]. The box-constrained heuristic allows the system to adhere to the control constraints, but does not account for state constraints.
Four test cases are considered in our experiments. In each of these cases, the system is required to track a pair of reference angles sequentially, starting from an initial condition and across a duration of 2s. The initial conditions and reference angles are given in Table I.
| Test Case | Initial condition [deg; deg/s] | Reference angle 1 [deg] | Reference angle 2 [deg] |
| --- | --- | --- | --- |
| 1 | [57.3; -120.3] | 120 | -50 |
| 2 | [-85.9; -85.9] | -150 | 40 |
| 3 | [-85.9; -114.6] | -100 | -180 |
| 4 | [85.9; 57.3] | 100 | 180 |
We simulate each of these test cases under 4 control schemes: (i) a nominal iLQR framework, (ii) a soft-constrained iLQR framework (SC-iLQR), (iii) safe-filtered iLQR, where we apply the proposed safety filter to the nominal iLQR scheme, and (iv) safe-filtered SC-iLQR, where the safety filter is applied to SC-iLQR. For SC-iLQR, soft state constraints are incorporated into the cost function of the forward pass of the iLQR algorithm. Specifically, a penalty function is applied to each of the constraints, increasing the cost proportionally whenever the constraints are violated. In safe-filtered iLQR and SC-iLQR, the initial reference trajectories in Algorithm 1 are initialized by iLQR and SC-iLQR, respectively.
The state trajectories under these control schemes are plotted in Fig. 2, and the percentages of constraint violations are tabulated in Table II. These percentages are computed as the ratio of the number of points at which the states violate the constraints to the total number of points in the state trajectory. Since the nominal iLQR method does not account for state constraints, it results in the largest percentage of constraint violations. While soft-constrained iLQR reduces the level of constraint violation, there remain a number of instances where the constraints are violated, as depicted in Fig. 2. On the other hand, as shown in Table II, with the safety filter applied to iLQR and SC-iLQR, no constraint violations are observed in any of the test cases. To illustrate the safety certificate obtained, we plot the slack variables that characterize the trust region and the state and input constraints, together with the safety certificate, for the third test case under a given maximum noise level in Fig. 3. The combination of Table II and Fig. 3 indicates that, under the formulation in (29), the PSF may not return a numerical safety certificate even when the state trajectories are safe. Meanwhile, the proposed PSF effectively encourages the system to behave safely by minimizing conservative upper bounds on the constraint violation.
The statistics of the computation times of the control scheme (here, the iLQR scheme) and the safety filter are depicted in Fig. 4. While introducing soft constraints increases the run times of the iLQR scheme, it has a significant effect on the run times of the safety filter, as observed in the right panel of Fig. 4. With the soft constraints added to the iLQR scheme, the state trajectories stay closer to the boundaries of the constraint sets without activating the safety filter. This allows the filter to find a solution that satisfies the constraints in a smaller number of iterations, which reduces the computation time.
First maximum noise level:

| Method | Case 1 [%] | Case 2 [%] | Case 3 [%] | Case 4 [%] |
| --- | --- | --- | --- | --- |
| iLQR | 12.20 | 14.63 | 14.63 | 21.95 |
| SC-iLQR | 7.32 | 4.88 | 12.20 | 12.20 |
| Safe-filtered iLQR | 0.0 | 0.0 | 0.0 | 0.0 |
| Safe-filtered SC-iLQR | 0.0 | 0.0 | 0.0 | 0.0 |

Second maximum noise level:

| Method | Case 1 [%] | Case 2 [%] | Case 3 [%] | Case 4 [%] |
| --- | --- | --- | --- | --- |
| iLQR | 14.63 | 14.63 | 17.07 | 26.83 |
| SC-iLQR | 4.87 | 2.44 | 7.32 | 9.75 |
| Safe-filtered iLQR | 0.0 | 0.0 | 0.0 | 0.0 |
| Safe-filtered SC-iLQR | 0.0 | 0.0 | 0.0 | 0.0 |
VI Conclusion
We propose a convex optimization-based predictive safety filter for uncertain NN dynamical systems subject to additive disturbances. By utilizing tools from NN verification and robust linear MPC, our method requires solving online a soft-constrained convex program whose complexity is independent of the NN size. With our framework, formal safety guarantees, together with a robust state-feedback controller, are attained when the slack variables in the solution are all zero.
-A Proof of Proposition 1
Showing that $\mathcal{R}(\boldsymbol{\eta}) \subseteq \mathcal{R}(\boldsymbol{\Sigma}\hat{\mathbf{w}})$ is equivalent to showing that for every possible value of $\boldsymbol{\eta}$, there exists $\hat{\mathbf{w}} \in \hat{\mathcal{W}}$ such that $\boldsymbol{\eta} = \boldsymbol{\Sigma}\hat{\mathbf{w}}$. Following this perspective, the sufficient conditions in Proposition 1 can be derived inductively.
Initial case
At $k = 0$, we have $e_0 = \mathbf{0}$, and

(32)

with $\eta_0 = \Delta_0 + w_0$. Note that $\|w_0\|_\infty \leq \sigma_w$. For $\eta_0 = \mathrm{diag}(\sigma_0)\hat{w}_0$ to have a solution satisfying $\|\hat{w}_0\|_\infty \leq 1$ for all possible realizations of the lumped uncertainty $\eta_0$, it is both sufficient and necessary to have

(33)

hold robustly, where $e_j$ denotes the $j$-th standard basis vector. Using the bounds in (32) and the fact that $\|w_0\|_\infty \leq \sigma_w$, constraints (22a) and (22b) guarantee the robust feasibility of (33) for all possible values of $\eta_0$. For any realization of $\eta_0$, we denote the corresponding solution as $\hat{w}_0^*$, such that $\eta_0 = \mathrm{diag}(\sigma_0)\hat{w}_0^*$.
Induction step
For a given controller $\mathbf{K}$, consider an arbitrary realization of the lumped uncertainty $\boldsymbol{\eta}$. At time $k$, let $\boldsymbol{\eta}_{:k}$ denote the truncation of $\boldsymbol{\eta}$ consisting of its first $k$ block components, and let $\boldsymbol{\Sigma}_{:k}$ denote the truncation of $\boldsymbol{\Sigma}$ up to the $k$-th block row and column. The other truncated vectors and matrices are defined similarly. Assume there exists $\hat{\mathbf{w}}_{:k}$ with unit norm-bounded components such that $\boldsymbol{\eta}_{:k} = \boldsymbol{\Sigma}_{:k}\hat{\mathbf{w}}_{:k}$. Next, we show that under constraint (22) there exists $\hat{w}_k$ with $\|\hat{w}_k\|_\infty \leq 1$ such that

(34)

holds.
Since $\boldsymbol{\eta}_{:k} = \boldsymbol{\Sigma}_{:k}\hat{\mathbf{w}}_{:k}$, the error dynamics (20b) up to time $k$ can be written as

(35)
According to [20, Corollary 1], the affine constraint (21) parameterizes all closed-loop system responses under the controller $\mathbf{K}$, where the truncated system responses are defined correspondingly. It follows from (35) and the fact that $e_0 = \mathbf{0}$ that the state and control input at time $k$ under the policy are given by

(36)
Correspondingly, the uncertainty at time $k$ is bounded by

(37)

which is obtained by plugging (36) into the definition of $\eta_k$. Recall that $\|w_k\|_\infty \leq \sigma_w$. For

(38)

to have a solution $\hat{w}_k$ such that $\|\hat{w}_k\|_\infty \leq 1$, it is equivalent to require

(39)

to hold. Since $\eta_k$ is bounded by (37), constraints (22c) and (22d) are sufficient to guarantee that the inequalities (39) hold. We then denote the solution of (38) as $\hat{w}_k^*$ for the given realization of $\boldsymbol{\eta}$. We repeat this process until $k = T-1$ and in this way construct the virtual disturbance $\hat{\mathbf{w}}$ such that $\boldsymbol{\eta} = \boldsymbol{\Sigma}\hat{\mathbf{w}}$. Since the realization of $\boldsymbol{\eta}$ was chosen arbitrarily, this proves Proposition 1.
References
- [1] O. Ogunmolu, X. Gu, S. Jiang, and N. Gans, “Nonlinear systems identification using deep dynamic neural networks,” arXiv preprint arXiv:1610.01439, 2016.
- [2] K. Y. Chee, T. Z. Jiahao, and M. A. Hsieh, “KNODE-MPC: A knowledge-based data-driven predictive control framework for aerial robots,” IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 2819–2826, 2022.
- [3] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in PyTorch,” 2017.
- [4] S. Bansal, A. K. Akametalu, F. J. Jiang, F. Laine, and C. J. Tomlin, “Learning quadrotor dynamics using neural network for flight control,” in 2016 IEEE 55th Conference on Decision and Control (CDC), pp. 4653–4660, IEEE, 2016.
- [5] L. Bauersfeld, E. Kaufmann, P. Foehn, S. Sun, and D. Scaramuzza, “Neurobem: Hybrid aerodynamic quadrotor model,” arXiv preprint arXiv:2106.08015, 2021.
- [6] M. O’Connell, G. Shi, X. Shi, K. Azizzadenesheli, A. Anandkumar, Y. Yue, and S.-J. Chung, “Neural-fly enables rapid learning for agile flight in strong winds,” Science Robotics, vol. 7, no. 66, p. eabm6597, 2022.
- [7] Y. Yang, K. Caluwaerts, A. Iscen, T. Zhang, J. Tan, and V. Sindhwani, “Data efficient reinforcement learning for legged robots,” in Conference on Robot Learning, pp. 1–10, PMLR, 2020.
- [8] G. Williams, N. Wagener, B. Goldfain, P. Drews, J. M. Rehg, B. Boots, and E. A. Theodorou, “Information theoretic MPC for model-based reinforcement learning,” in 2017 IEEE Int. Conf. on Robotics and Automation (ICRA), pp. 1714–1721, IEEE, 2017.
- [9] A. Nagabandi, G. Kahn, R. S. Fearing, and S. Levine, “Neural network dynamics for model-based deep reinforcement learning with model-free fine-tuning,” in 2018 IEEE international conference on robotics and automation (ICRA), pp. 7559–7566, IEEE, 2018.
- [10] K. L. Hobbs, M. L. Mote, M. C. Abate, S. D. Coogan, and E. M. Feron, “Runtime assurance for safety-critical systems: An introduction to safety filtering approaches for complex control systems,” IEEE Control Systems Magazine, vol. 43, no. 2, pp. 28–65, 2023.
- [11] Y. Emam, G. Notomista, P. Glotfelter, Z. Kira, and M. Egerstedt, “Safe reinforcement learning using robust control barrier functions,” IEEE Robotics and Automation Letters, no. 99, pp. 1–8, 2022.
- [12] B. Tearle, K. P. Wabersich, A. Carron, and M. N. Zeilinger, “A predictive safety filter for learning-based racing control,” IEEE Robotics and Automation Letters, vol. 6, no. 4, pp. 7635–7642, 2021.
- [13] M. Alshiekh, R. Bloem, R. Ehlers, B. Könighofer, S. Niekum, and U. Topcu, “Safe reinforcement learning via shielding,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, 2018.
- [14] K. P. Wabersich and M. N. Zeilinger, “A predictive safety filter for learning-based control of constrained nonlinear dynamical systems,” Automatica, vol. 129, p. 109597, 2021.
- [15] A. D. Ames, X. Xu, J. W. Grizzle, and P. Tabuada, “Control barrier function based quadratic programs for safety critical systems,” IEEE Transactions on Automatic Control, vol. 62, no. 8, pp. 3861–3876, 2016.
- [16] H. Ma, B. Zhang, M. Tomizuka, and K. Sreenath, “Learning differentiable safety-critical control using control barrier functions for generalization to novel environments,” in 2022 European Control Conference (ECC), pp. 1301–1308, IEEE, 2022.
- [17] A. K. Akametalu, J. F. Fisac, J. H. Gillula, S. Kaynama, M. N. Zeilinger, and C. J. Tomlin, “Reachability-based safe learning with gaussian processes,” in 53rd IEEE Conference on Decision and Control, pp. 1424–1431, IEEE, 2014.
- [18] J. F. Fisac, A. K. Akametalu, M. N. Zeilinger, S. Kaynama, J. Gillula, and C. J. Tomlin, “A general safety framework for learning-based control in uncertain robotic systems,” IEEE Transactions on Automatic Control, vol. 64, no. 7, pp. 2737–2752, 2018.
- [19] K. Xu, Z. Shi, H. Zhang, Y. Wang, K.-W. Chang, M. Huang, B. Kailkhura, X. Lin, and C.-J. Hsieh, “Automatic perturbation analysis for scalable certified robustness and beyond,” Advances in Neural Information Processing Systems, vol. 33, pp. 1129–1141, 2020.
- [20] S. Chen, V. M. Preciado, M. Morari, and N. Matni, “Robust model predictive control with polytopic model uncertainty through system level synthesis,” arXiv preprint arXiv:2203.11375, 2022.
- [21] L. Brunke, M. Greeff, A. W. Hall, Z. Yuan, S. Zhou, J. Panerati, and A. P. Schoellig, “Safe learning in robotics: From learning-based control to safe reinforcement learning,” Annual Review of Control, Robotics, and Autonomous Systems, vol. 5, pp. 411–444, 2022.
- [22] A. P. Leeman, J. Köhler, S. Bennani, and M. N. Zeilinger, “Predictive safety filter using system level synthesis,” in Proceedings of The 5th Annual Learning for Dynamics and Control Conference, pp. 1180–1192, PMLR, 2023.
- [23] K. P. Wabersich and M. N. Zeilinger, “Predictive control barrier functions: Enhanced safety mechanisms for learning-based control,” IEEE Transactions on Automatic Control, 2022.
- [24] H. Zhang, T.-W. Weng, P.-Y. Chen, C.-J. Hsieh, and L. Daniel, “Efficient neural network robustness certification with general activation functions,” Advances in Neural Information Processing Systems, vol. 31, 2018.
- [25] A. P. Leeman, J. Köhler, A. Zanelli, S. Bennani, and M. N. Zeilinger, “Robust nonlinear optimal control via system level synthesis,” arXiv preprint arXiv:2301.04943, 2023.
- [26] A. Saviolo, G. Li, and G. Loianno, “Physics-inspired temporal learning of quadrotor dynamics for accurate model predictive trajectory tracking,” IEEE Robotics and Automation Letters, vol. 7, no. 4, pp. 10256–10263, 2022.
- [27] N. A. Spielberg, M. Brown, and J. C. Gerdes, “Neural network model predictive motion control applied to automated driving with unknown friction,” IEEE Transactions on Control Systems Technology, vol. 30, no. 5, pp. 1934–1945, 2021.
- [28] D. Q. Mayne, “Model predictive control: Recent developments and future promise,” Automatica, vol. 50, no. 12, pp. 2967–2986, 2014.
- [29] J. Anderson, J. C. Doyle, S. H. Low, and N. Matni, “System level synthesis,” Annual Reviews in Control, vol. 47, pp. 364–393, 2019.
- [30] W. Langson, I. Chryssochoos, S. Raković, and D. Q. Mayne, “Robust model predictive control using tubes,” Automatica, vol. 40, no. 1, pp. 125–133, 2004.
- [31] J. Köhler, E. Andina, R. Soloperto, M. A. Müller, and F. Allgöwer, “Linear robust adaptive model predictive control: Computational complexity and conservatism,” in 2019 IEEE 58th Conference on Decision and Control (CDC), pp. 1383–1388, IEEE, 2019.
- [32] J. Fleming, B. Kouvaritakis, and M. Cannon, “Robust tube MPC for linear systems with multiplicative uncertainty,” IEEE Transactions on Automatic Control, vol. 60, no. 4, pp. 1087–1092, 2014.
- [33] M. Bujarbaruah, U. Rosolia, Y. R. Stürz, X. Zhang, and F. Borrelli, “Robust MPC for LPV systems via a novel optimization-based constraint tightening,” Automatica, vol. 143, p. 110459, 2022.
- [34] G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, “OpenAI Gym,” arXiv preprint arXiv:1606.01540, 2016.
- [35] W. Li and E. Todorov, “Iterative linear quadratic regulator design for nonlinear biological movement systems,” in ICINCO (1), pp. 222–229, Citeseer, 2004.
- [36] Y. Tassa, N. Mansard, and E. Todorov, “Control-limited differential dynamic programming,” in 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 1168–1175, IEEE, 2014.
- [37] B. Amos, I. Jimenez, J. Sacks, B. Boots, and J. Z. Kolter, “Differentiable MPC for end-to-end planning and control,” Advances in Neural Information Processing Systems, vol. 31, 2018.