
^1 Coordinated Science Laboratory, University of Illinois Urbana-Champaign
{li213, bcyang2, dzhuang6, mitras}@illinois.edu
^2 Massachusetts Institute of Technology
[email protected]

Refining Perception Contracts: Case Studies in Vision-based Safe Auto-landing

Yangge Li^1 (0000-0003-4633-9408), Benjamin C. Yang^1 (0009-0007-4998-2702), Yixuan Jia^2 (0009-0007-5717-5222), Daniel Zhuang^1 (0009-0005-6338-5045), Sayan Mitra^1 (0000-0002-6672-8470)
Abstract

Perception contracts provide a method for evaluating the safety of control systems that use machine learning for perception. A perception contract is a specification for testing the ML components, and it gives a method for proving end-to-end system-level safety requirements. The feasibility of contract-based testing and assurance was established earlier in the context of straight lane keeping, a 3-dimensional system with relatively simple dynamics. This paper presents the analysis of two 6- and 12-dimensional flight control systems that use multi-stage, heterogeneous, ML-enabled perception. The paper advances the methodology by introducing an algorithm, DaRePC, for data and requirement guided refinement of perception contracts. The resulting analysis provides testable contracts which establish the state and environment conditions under which an aircraft can safely touch down on the runway and a drone can safely pass through a sequence of gates. It can also discover conditions (e.g., low-horizon sun) that can possibly violate the safety of the vision-based control system.

Keywords:
Learning-enabled control systems · Air vehicles · Assured autonomy · Testing.

1 Introduction

The successes of Machine Learning (ML) have led to the emergence of a family of systems that take autonomous control actions based on recognition and interpretation of signals. Vision-based lane keeping in cars and automated landing systems for aircraft are exemplars of this family. In this paper, we explore two flight control systems that use multi-stage, heterogeneous, ML-enabled perception: an automated landing system created in close collaboration with our industry partners and an autonomous racing controller built for a course in our lab.

Vision-based auto-landing system.

For autonomous landing, the availability of relatively inexpensive cameras and ML algorithms makes vision-based pose estimation an attractive alternative to expensive instrument landing systems. The perception pipeline of a prototype vision-based automated landing system (see Fig. 1) estimates the relative pose of the aircraft with respect to the touchdown area using a combination of deep neural networks and classical computer vision algorithms. The rest of the auto-landing system uses the estimated pose from the perception pipeline to guide and control the aircraft from 2300 meters out, along a reference trajectory with a $3^{\circ}$ slope, down to the touchdown area on the runway. The quality of the camera images depends on the intensity and direction of lighting, weather conditions, as well as the actual position of the aircraft, and, in turn, influences the performance of the perception-based pose estimation. From a testing and verification perspective, the problem is to identify a range of environmental conditions and a range of initial approaches from which the aircraft can achieve safe touchdown; safety being defined by a cone or a sequence of predicates in the state space (see Fig. 2).

Vision-based drone racing.

Drone racing systems are pushing the boundaries of perception and agile control [23]. Our second case study (DroneRace) is about autonomous control of a drone through a sequence of gates, under different visibility conditions, in a photorealistic simulator we built using Neural Radiance Fields [26]. The perception pipeline here estimates the position of the drone using a vision-based particle filter. Once again, the challenge is to discover the environment and approach conditions under which the maneuver can be performed reliably.

Figure 1: Left: Architecture of vision-based automated landing system (AutoLand). Middle: Camera view and keypoint detection variations with ambient and directed lighting. Right: Variation of simulation environment for DroneRace.

The problem is challenging because the uncertainties in the perception module are state and environment dependent; these uncertainties have to be quantified and then propagated through the control and dynamics of the system. The core technical challenge is that we do not have specifications for the DNN-based perception pipeline: what precise property should it satisfy for system-level safety? The recently introduced perception contracts approach [15, 2] addresses this problem: the contract can be created from ground truth-labeled data, and it serves as a specification for modular testing of the ML perception components. Perception contracts can also be used to prove end-to-end system-level safety requirements. A contract gives a bound on the error between the true pose $d$ and the estimated pose $\hat{d}$ as a function of the ground truth $d$, a bound that is supposed to hold even under environmental variations. In the previous works [15, 2], the range of environmental factors, and consequently, the perception contracts, were fixed, and the feasibility of such contract-based reasoning was demonstrated in low-dimensional and relatively simple applications, such as a lane-keeping system.

This work advances the perception contracts method in two important ways. First, we develop an algorithm called DaRePC that iteratively refines the contracts based on both the target system-level requirement and sampled data from the environment. To the best of our knowledge, DaRePC is the first approach that can automatically identify the environmental conditions under which vision-based flight control systems can be assured to satisfy system-level safety. We show that under certain convergence assumptions that can be empirically validated, DaRePC terminates with either a contract that proves the requirement or a counter-example. We discuss findings from applying DaRePC to AutoLand and DroneRace. These are challenging applications with high-dimensional (6d and 12d), nonlinear dynamics and heterogeneous perception pipelines involving deep neural networks and filtering modules. For both case studies, we are able to identify a set of environmental conditions under which the system is proven to satisfy the requirement. We show that DaRePC is able to identify interesting, non-convex sets of environmental conditions under which the system satisfies the requirement (Fig. 2). For the AutoLand case study, we further show that our algorithm can find counter-examples violating the requirement.

2 Related Works

Contracts and Assume-Guarantee.

A closely related approach has been developed by Păsăreanu, Mangal, Gopinath, et al. [18, 3, 19]. In [19], the authors generate weakest assumptions about the ML component from the system-level correctness requirements and the available information about the non-ML components. These weakest assumptions are analogous to perception contracts, in that they are useful for assume-guarantee style reasoning for proving system-level requirements. However, unlike PCs, they are completely agnostic of the actual perception components. The approach was used to analyze a TaxiNet-based lane following system [18]. Another important technical difference is that this framework uses probabilistic discrete state transition systems as the underlying mathematical model.

Our previous work [15, 2] develops the idea of perception contracts using the language of continuous state space models. Thus far, the applications studied in these threads were related to lane following by a single agent, a setting quite different from distributed formation control. The idea was recently extended to distributed systems to assure safety of vision-based swarm formation control [14]. In [25], a perception contract is developed and used to help design a controller for safely landing a drone on a landing pad.

Safety of automated landing.

Safety of a vision-based automated landing system is analyzed in [21]. This approach carefully exploits the geometry of perspective cameras to create a mathematical relationship between the aircraft pose and the perceived pose. This relation is then modeled as a neural network, which is composed with the controller (another neural network in this case) for end-to-end analysis. This approach gives strong guarantees at the expense of requiring expertise (in creating the perception model) and being tailored for a particular application. Shoukri et al. in [22] analyze a vision-based auto-landing system using falsification techniques. Their technique involves computing the range of tolerable error from the perception pipeline and finding states where this error bound is violated. However, the impact of environmental variations was not considered in either of these works.

Perception-based control.

In a series of recent works, Dean et al. studied robustness guarantees of perception-based control algorithms under simple perception noise models: in [4], the authors proposed a perception-based controller synthesis approach for linear systems and provided a theoretical analysis; in [5], they proposed robust barrier functions that provide an approach for synthesizing safety-critical controllers under state uncertainty. More complex perception modules have also been studied: in [8], the reachability of a closed-loop system with a neural network controller is studied.

VerifAI [7] provides a complete framework for analyzing autonomous systems with ML modules in the loop. It is used in [11] to analyze and redesign a Neural Network-Based Aircraft Taxiing System.

3 The Problem: Assuring Safety of Vision-based Systems

Notations.

For a function $f: X \rightarrow Y$, we extend its definition to sets as usual: $f(S) := \{f(x) \mid x \in S\}$ for any $S \subseteq X$. Also, for a function defined on sets $f: 2^X \rightarrow Y$, we write $f(\{x\})$ simply as $f(x)$ for $x \in X$.

A vision-based system or simply a system $S$ is described by a tuple $\langle X, X_0, Y, E, f, o\rangle$, where $X \subseteq \mathbb{R}^n$ is called the state space, $X_0 \subseteq X$ is the set of initial states, $Y \subseteq \mathbb{R}^m$ is called the observation space, $E$ is called an environment space, $f: X \times Y \rightarrow 2^X$ is called the dynamics, and $o: X \times E \rightarrow 2^Y$ is called the observer. Typically, $X$ and $f$ are available in analytical form and are amenable to formal analysis. In contrast, the environment $E$ and the observer $o$ are not, and we only have black-box access to $o$. For example, the perception pipelines implementing the observer for the AutoLand and the DroneRace systems (Sections 5-6) are well beyond the capabilities of NN verification tools.

An execution of a system $S$ in an environment $e \in E$ is a sequence of states $\alpha = x_0, x_1, \ldots$, such that $x_0 \in X_0$ and for each $t$, $x_{t+1} \in f(x_t, o(x_t, e))$. For an execution $\alpha$ and $t \geq 0$, we write the $t^{\mathit{th}}$ state as $\alpha(t)$. The reachable states of a system $S$ in an environment $e \in E$ is a sequence of sets of states $Reach_S = X_0, X_1, \ldots$, such that for each $t$, $X_t = \cup_\alpha \alpha(t)$ over all possible executions $\alpha$. For any $t \geq 0$, $Reach_S(t)$ denotes the set of states reachable at time $t$.

A requirement is given by a set $R \subseteq X$. An execution $\alpha$ in environment $e$ satisfies a requirement $R$ if for each $t \geq 0$, $\alpha(t) \in R$. The system $S$ satisfies $R$ over $X_0$ and $E$, which we write as $\langle o, X_0, E\rangle \models R$, if for every $e \in E$, every execution of $S$ with observation function $o$, starting from any state in $X_0$, satisfies $R$. We write $\langle o, X_1, E_1\rangle \not\models R$ for some $X_1 \subseteq X$ and $E_1 \subseteq E$ if there exist $x \in X_1$, $e \in E_1$, and $t \geq 0$ such that for the execution $\alpha$ starting from $x$ in $e$, $\alpha(t) \notin R$. We make the dependence on $o$, $X_0$, and $E$ explicit here because these are the parameters that will change in our discussion.
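To make these semantics concrete, here is a minimal sketch of generating an execution and checking requirement satisfaction. The one-dimensional system, the dynamics `f`, observer `o`, and requirement `R` below are toy stand-ins, not the paper's models:

```python
import numpy as np

def execute(x0, f, o, e, horizon):
    """Generate an execution x_0, x_1, ... with x_{t+1} = f(x_t, o(x_t, e))."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(horizon):
        xs.append(f(xs[-1], o(xs[-1], e)))
    return xs

def satisfies(execution, requirement):
    """An execution satisfies R if every state alpha(t) lies in R."""
    return all(requirement(x) for x in execution)

# Toy 1-d system: dynamics contract toward 0; the observer adds an
# environment-dependent perception error e to the true state.
f = lambda x, y: 0.5 * y
o = lambda x, e: x + e
R = lambda x: abs(x[0]) <= 1.0   # requirement: stay inside [-1, 1]

alpha = execute([0.8], f, o, e=0.0, horizon=20)
```

With a large enough perception error `e`, the same initial state produces an execution that leaves the requirement set, which is exactly the kind of environment-dependent failure the rest of the paper analyzes.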

Problem.

Given a system $S$, we are interested in finding either (a) a subset $E_0 \subseteq E$ of the environment such that $\langle o, X_0, E_0\rangle \models R$, or (b) a counterexample, that is, an environment set $E_c \subseteq E$ such that $\langle o, X_0, E_c\rangle \not\models R$.

4 Method: Perception Contracts

The system $S$ is a composition of the dynamics $f$ and the observer $o$, and we are interested in developing compositional or modular solutions to the above problem. The observer $o$ in a real system is implemented using multiple machine learning (ML) components, and it depends on the environment in complicated ways. For example, a camera-based pose estimation system of the type used in AutoLand uses camera images which depend on lighting direction, lens flare, albedo, and other factors, in addition to the actual physical position of the camera with respect to the target. Testing such ML-based observers is challenging because, in general, the observer has no specifications.

A natural way of tackling this problem is to create an over-approximation $M$ of the observer, through sampling or other means. The actual observer $o$ can be made to conform to this over-approximation $M$ arbitrarily precisely, simply by making $M$ large or conservative. However, such an over-approximation may not be useful for establishing system-level correctness. Thus, $M$ has to balance observer conformance and system correctness, which leads to the following notion of perception contract.

Definition 1

For a given observer $o: X \times E \rightarrow Y$, a requirement $R$, and sets $\tilde{X}_0 \subseteq \tilde{X} \subseteq X$, $\tilde{E} \subseteq E$, a $(\tilde{X}_0, \tilde{X}, \tilde{E})$-perception contract is a map $M: X \rightarrow 2^Y$ such that

  1. Conformance: $\forall x \in \tilde{X}$, $o(x, \tilde{E}) \subseteq M(x)$;

  2. Correctness: $\langle M, \tilde{X}_0, \tilde{E}\rangle \models R$.

Recall, $\langle M, \tilde{X}_0, \tilde{E}\rangle \models R$ means that the closed-loop system $S_M$, which is the system $S$ with the observer $o$ substituted by $M$, satisfies the requirement $R$. The following proposition states how perception contracts can be used for verifying actual perception-based systems.

Proposition 1

Suppose $\tilde{X} \subseteq X$ contains the reachable states of the system $S = \langle X, \tilde{X}_0, Y, \tilde{E}, f, o\rangle$. If $M$ is a $(\tilde{X}_0, \tilde{X}, \tilde{E})$-perception contract for $o$ for some requirement $R$, then $S$ satisfies $R$, i.e., $\langle o, \tilde{X}_0, \tilde{E}\rangle \models R$.

At a glance, Proposition 1 may seem circular: it requires us to first compute the reachable states $\tilde{X}$ of $S$ to get a perception contract $M$, which is then used to establish the requirement $R$ for $S$. However, as we shall see with the case studies, the initial $\tilde{X}$ can be a very crude over-approximation of the reachable states of $S$, and the perception contract can still be useful for proving much stronger requirements.

4.1 Learning Conformant Perception Contracts from Data

For this paper, we consider deterministic observers $\hat{h}: X \times E \rightarrow Y$. This covers a broad range of perception pipelines that use neural networks and traditional computer vision algorithms. Recall, a perception contract $M: \tilde{X} \rightarrow 2^Y$ gives a set $M(x)$ of perceived or observed values for any state $x \in \tilde{X}$. In this paper, we choose to represent this set $M(x)$ as a ball $B_{M_r(x)}(M_c(x)) \subseteq Y$ of radius $M_r(x)$ centered at $M_c(x)$.
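With this ball representation, checking whether a single observation conforms to the contract reduces to a norm test. A small illustration, in which the particular center and radius functions are hypothetical:

```python
import numpy as np

def in_contract(y_hat, x, M_c, M_r):
    """Conformance of one observation: is y_hat inside the ball
    B_{M_r(x)}(M_c(x)) in observation space?"""
    return np.linalg.norm(np.asarray(y_hat) - M_c(x)) <= M_r(x)

# Hypothetical contract: the center is the identity (perception is unbiased
# on average), and the radius grows with distance, here 5% of ||x||.
M_c = lambda x: np.asarray(x, dtype=float)
M_r = lambda x: 0.05 * np.linalg.norm(x)
```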

In real environments, observers are unlikely to be perfectly conformant to perception contracts. Instead, we can try to compute a contract $M$ that maximizes some measure of conformance of $\hat{h}$, such as $p = \mathbb{E}_{x,e\sim\mathcal{D}}[\mathbb{I}(\hat{h}(x,e) \in M(x))]$, where $\mathbb{I}$ is the indicator function and $\mathcal{D}$ is a distribution over $X$ and $E$. See the case studies and the discussions in Section 7 for more information on the choice of $\mathcal{D}$. Computing the conformance measure $p$ exactly is infeasible because we only have black-box access to $\hat{h}$ and $\mathcal{D}$. Instead, we can estimate an empirical conformance measure: given a set of states $\tilde{X}$ and a set of environments $\tilde{E}$ over which the perception contract $M$ is computed, we can sample according to $\mathcal{D}$ to get $L = \{\langle x_1, e_1, \hat{y}_1\rangle, \ldots, \langle x_n, e_n, \hat{y}_n\rangle\}$, where $x_i \in \tilde{X}$, $e_i \in \tilde{E}$, and $\hat{y}_i = \hat{h}(x_i, e_i)$. We define the empirical conformance measure of $M$ as the fraction of samples that satisfy $M$: $\hat{p}_n = \frac{1}{n}\sum_{i=1}^n \mathbb{I}(\hat{y}_i \in M(x_i))$. Using Hoeffding's inequality [13], we can bound the gap between the actual conformance measure $p$ and the empirical measure $\hat{p}_n$ as a function of the number of samples $n$.

Proposition 2

For any $\delta \in (0,1)$, with probability at least $1-\delta$, $p \geq \hat{p}_n - \sqrt{-\frac{\ln\delta}{2n}}$.

This gives us a way to choose the number of samples $n$ to bound the gap between the empirical and the actual conformance, with some probability (as decided by the choice of $\delta$).
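Inverting the bound of Proposition 2 for a desired gap $\epsilon$ and confidence $1-\delta$ gives the sample count used on Line 1 of Algorithm 1; a small helper sketching this arithmetic:

```python
import math

def samples_needed(epsilon, delta):
    """Smallest integer n with sqrt(-ln(delta) / (2n)) <= epsilon, i.e. the
    number of i.i.d. samples for which, with probability at least 1 - delta,
    the empirical conformance p_hat is within epsilon of the true p
    (Hoeffding / Proposition 2)."""
    return math.ceil(-math.log(delta) / (2 * epsilon ** 2))
```

For example, a gap of $\epsilon = 0.01$ at confidence $\delta = 0.01$ already requires on the order of $2 \times 10^4$ samples, which is why the case studies work with empirical conformance levels around 90%.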

Algorithm 1 $\textsc{LearnContract}(\tilde{X}, \tilde{E}, \hat{h}, pr, \epsilon, \delta)$
1: $n = -\frac{\ln\delta}{2\epsilon^2}$
2: $\{\langle x_i, e_i\rangle\}_{i=1}^n = \textsc{Sample}_{\mathcal{D}}(\tilde{X}, \tilde{E}, n)$
3: $\hat{y}_1, \ldots, \hat{y}_n = \hat{h}(x_1, e_1), \ldots, \hat{h}(x_n, e_n)$
4: $L_c = \{\langle x_1, \hat{y}_1\rangle, \ldots, \langle x_n, \hat{y}_n\rangle\}$
5: $M_c = \textsc{Regression}(L_c)$
6: $L_r = \{\langle x_1, |\hat{y}_1 - M_c(x_1)|\rangle, \ldots, \langle x_n, |\hat{y}_n - M_c(x_n)|\rangle\}$
7: $M_r = \textsc{QuantileRegression}(L_r, pr + \epsilon)$
8: return $\langle M_c, M_r\rangle$

The LearnContract algorithm takes as input a set of states $\tilde{X}$ and an environment set $\tilde{E}$ over which a perception contract is to be constructed for the actual perception function $\hat{h}$. It also takes the desired conformance measure $pr$, the tolerable gap $\epsilon$ (between the empirical and actual conformance), and the confidence parameter $\delta$. The algorithm first computes the required number of samples for which the gap $\epsilon$ holds with probability more than $1-\delta$ (based on Proposition 2). It then samples $n$ states over $\tilde{X}$ and $n$ environmental parameters in $\tilde{E}$, according to a distribution $\mathcal{D}$. The samples are then used to compute the center function $M_c$ using standard regression.

The training data for the radius function $M_r$ is obtained by computing the difference between the actual perceived value $\hat{h}(x_i, e_i)$ and the center predicted by $M_c(x_i)$. To make the actual conformance $p \geq pr$, by Proposition 2 it suffices to have empirical conformance $\hat{p} \geq pr + \epsilon$. Recall, the empirical conformance is the fraction of samples satisfying $|\hat{y}_i - M_c(x_i)| \leq M_r(x_i)$. $M_r$ is obtained by performing quantile regression with quantile value $pr + \epsilon$. Given $L_r = \{\langle x_1, r_1\rangle, \ldots, \langle x_n, r_n\rangle\}$ where $r_i = |\hat{y}_i - M_c(x_i)|$, QuantileRegression finds the function $M_r$ such that a $pr + \epsilon$ fraction of the data points satisfy $|\hat{y}_i - M_c(x_i)| \leq M_r(x_i)$, by solving the optimization problem [6]:

\min_{M_r} \Big\{ (1-\hat{p}) \sum_{r_i < M_r(x_i)} |r_i - M_r(x_i)| + \hat{p} \sum_{r_i \geq M_r(x_i)} |r_i - M_r(x_i)| \Big\}.

Therefore, given $pr$, $\epsilon$, and $\delta$, the algorithm finds an $M$ with the desired conformance.
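The two regression steps of LearnContract can be sketched as follows. This is a toy version on synthetic data: the center model is linear and the radius model is a constant, for which the quantile-regression step reduces exactly to taking the empirical $(pr+\epsilon)$-quantile of the residuals; the paper allows richer model classes for both:

```python
import numpy as np

def learn_contract(xs, ys, pr, eps):
    """Toy LearnContract (Algorithm 1, regression steps only):
    - center M_c: linear least-squares fit y ~ [x, 1];
    - radius M_r: constant, the (pr + eps)-quantile of residuals
      |y_i - M_c(x_i)|, i.e. the pinball-loss minimizer for a constant model."""
    X = np.column_stack([xs, np.ones(len(xs))])     # affine features
    w, *_ = np.linalg.lstsq(X, ys, rcond=None)
    M_c = lambda x: np.array([x, 1.0]) @ w
    residuals = np.abs(ys - X @ w)
    r = np.quantile(residuals, min(pr + eps, 1.0))  # constant radius
    M_r = lambda x: r
    return M_c, M_r

# Synthetic perception data: y = x plus bounded, state-independent noise.
rng = np.random.default_rng(0)
xs = rng.uniform(0.0, 10.0, size=2000)
ys = xs + rng.uniform(-0.5, 0.5, size=2000)
M_c, M_r = learn_contract(xs, ys, pr=0.9, eps=0.02)
covered = np.mean(np.abs(ys - np.array([M_c(x) for x in xs])) <= M_r(0))
```

By construction, roughly a $pr + \epsilon$ fraction of the training samples fall inside the learned ball, matching the empirical conformance measure defined above.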

Contracts with fixed training data.

Sometimes we only have access to a pre-collected training data set. LearnContract can be modified to accommodate this setup. Given the number of samples $n$ and $pr$, $\delta$, according to Proposition 2, we can get an upper bound on the gap $\epsilon$ and therefore a lower bound on the empirical conformance $\hat{p}$. We can then use quantile regression to find the contract with the right empirical conformance on the training data set.

LearnContract constructs candidate perception contracts that are conformant to a degree, over specified state ($\tilde{X}$) and environment ($\tilde{E}$) spaces. For certain environments $e \in \tilde{E}$, the system $S$ may not meet the requirements, and therefore, the most precise and conformant candidate contracts computed over such an environment $e$ will not be correct. In Section 4.2, we develop techniques for refining $\tilde{X}$ and $\tilde{E}$ to find contracts that are correct, or to find counter-examples. For the completeness of this search process, we make the following assumption about the convergence of LearnContract with respect to $\tilde{E}$.

Convergence of LearnContract.

Consider a monotonically decreasing sequence of sets $\{E_k\}_{k=1}^\infty$ whose limit is $\liminf_{k\rightarrow\infty} E_k = \{e_d\}$ for some environment $e_d \in \tilde{E}$. Let $\{M_k\}$ be the corresponding sequence of candidate contracts constructed by LearnContract for $\tilde{E} = E_k$, keeping all other parameters fixed. We say that the algorithm is convergent at $e_d$ if for all $x \in \tilde{X}$, $\liminf_{k\rightarrow\infty} M_k(x) = \hat{h}(x, e_d)$. Informally, as candidate contracts are computed for a sequence of environments converging to $e_d$, the contracts point-wise converge to $\hat{h}$. In Section 5, we will see examples providing empirical evidence for this convergence property.

4.2 Requirement Guided Refinement of Perception Contracts

We now present the Data and Requirement Guided Perception Contract (DaRePC) algorithm, which uses LearnContract to find perception contracts that are correct. Consider the system $\hat{S} = \langle\tilde{X}, X_0, Y, E_0, f, \hat{h}\rangle$ and the safety requirement $R$; we assume there exist a nominal environment $e_d \in E_0$ and a state $x_d \in X_0$ such that $\langle\hat{h}, x_d, e_d\rangle \models R$.

Algorithm 2 $\textsc{DaRePC}(X_0, E_0, e_d, R, \hat{h})$
1: Params: $\tilde{X}, pr, \epsilon, \delta$
2: $\tilde{E} = E_0$
3: $M_{\tilde{E}} = \textsc{LearnContract}(\tilde{X}, \tilde{E}, \hat{h}, pr, \epsilon, \delta)$
4: $queue.push(X_0)$
5: while $queue \neq \emptyset$ do
6:     $X_c = queue.pop()$
7:     if $\exists t$ s.t. $\textsc{Reach}(X_c, M_{\tilde{E}}, t) \cap R = \emptyset$ then return $\langle None, \tilde{E}, X_c\rangle$
8:     else if $\exists t$ s.t. $\textsc{Reach}(X_c, M_{\tilde{E}}, t) \nsubseteq R$ then
9:         $X_1, X_2 = \textsc{RefineState}(X_c)$
10:         $queue.push(X_1, X_2)$
11:         $\tilde{E} = \textsc{ShrinkEnv}(\tilde{E}, e_d, M_{\tilde{E}}, \hat{h})$
12:         $M_{\tilde{E}} = \textsc{LearnContract}(\tilde{X}, \tilde{E}, \hat{h}, pr, \epsilon, \delta)$
13: return $\langle M_{\tilde{E}}, \tilde{E}, None\rangle$

DaRePC takes as input an initial state set $X_0$, an environment set $E_0$, a requirement $R$, and the parameters for LearnContract. It outputs either an environment set $\tilde{E} \subseteq E_0$ with a candidate contract $M_{\tilde{E}}$ such that $\langle M_{\tilde{E}}, X_0, \tilde{E}\rangle \models R$ and the actual observer $\hat{h}$ conforms to $M_{\tilde{E}}$ over $\tilde{E}$, or a set of initial states $X_c \subseteq X_0$ and an environment set $\tilde{E} \subseteq E_0$ from which $R$ will be violated.

DaRePC checks whether $\textsc{Reach}(X_c, M_{\tilde{E}}, t)$ leaves $R$. If $\textsc{Reach}(X_c, M_{\tilde{E}}, t)$ completely leaves $R$ for some $t$, then no contract satisfying $R$ can be found on $X_c$, and $X_c$ and $\tilde{E}$ are returned as a counter-example (Line 7). When $\textsc{Reach}(X_c, M_{\tilde{E}}, t)$ partially exits $R$, refinement is performed by partitioning the initial states using RefineState and shrinking the environments around the nominal $e_d$ using ShrinkEnv (Line 8). If $\textsc{Reach}(X_c, M_{\tilde{E}}, t)$ never exits $R$, then $M_{\tilde{E}}$ is a valid perception contract satisfying the correctness requirement of Definition 1, and it is returned together with $\tilde{E}$.
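The control flow of Algorithm 2 can be sketched as follows. The subroutines and the toy one-dimensional instantiation are illustrative stand-ins for the actual reachability, refinement, and contract-learning machinery:

```python
from collections import deque

def darepc(X0, E0, learn, reach, in_R, refine, shrink, horizon):
    """Skeleton of the DaRePC loop (Algorithm 2). Subroutines are supplied
    by the caller:
      learn(E)        -> candidate contract over environment set E
      reach(Xc, M, t) -> over-approximate reachable set at time t
      in_R(S, t)      -> 'inside', 'outside', or 'partial' w.r.t. R
      refine(Xc)      -> pair of smaller initial sets covering Xc
      shrink(E)       -> environment set contracted toward the nominal e_d"""
    E, M = E0, learn(E0)
    queue = deque([X0])
    while queue:
        Xc = queue.popleft()
        verdicts = [in_R(reach(Xc, M, t), t) for t in range(horizon)]
        if 'outside' in verdicts:          # reach set fully exits R (Line 7)
            return None, E, Xc             # counter-example
        if 'partial' in verdicts:          # inconclusive: refine (Lines 8-12)
            queue.extend(refine(Xc))
            E = shrink(E)
            M = learn(E)
    return M, E, None                      # contract proving R (Line 13)

# Toy 1-d instantiation: contracting dynamics x' = x/2 with a perception
# error ball whose radius shrinks as E shrinks toward the nominal e_d = 0.
learn = lambda E: 0.2 * max(abs(E[0]), abs(E[1]))           # error radius
reach = lambda Xc, err, t: (Xc[0] * 0.5**t - err, Xc[1] * 0.5**t + err)
def in_R(S, t):                                             # R = [-1, 1]
    if S[1] < -1 or S[0] > 1:
        return 'outside'
    return 'inside' if S[0] >= -1 and S[1] <= 1 else 'partial'
refine = lambda Xc: ((Xc[0], sum(Xc) / 2), (sum(Xc) / 2, Xc[1]))
shrink = lambda E: (E[0] / 2, E[1] / 2)

M, E, cex = darepc((-0.9, 0.9), (-2.0, 2.0), learn, reach, in_R,
                   refine, shrink, horizon=10)
```

In this toy run, the loop refines the initial set twice and shrinks the environment twice before every partition's reach set stays inside $R$, returning a contract rather than a counter-example.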

We briefly discuss the subroutines used in DaRePC.

$\textsc{Reach}(X_c, M_{\tilde{E}}, t)$

computes an over-approximation of $Reach_{S_M}(t)$ of the system $S_M = \langle\tilde{X}, X_c, Y, \tilde{E}, f, M_{\tilde{E}}\rangle$. For showing completeness of the algorithm, we assume that Reach converges to a single execution as $X_c$ and $\tilde{E}$ converge to singleton sets. This assumption is satisfied by many reachability algorithms, especially simulation-based reachability algorithms [9, 24].

$\textsc{RefineState}(X_c)$

takes as input a set of states and partitions it into a finer set of states. With enough refinements, all partitions converge to arbitrarily small sets. There are many strategies for refining state partitions, such as bisecting hyperrectangular sets along coordinate axes.
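One such strategy, bisection of a hyperrectangular set along its widest coordinate axis, can be sketched as:

```python
import numpy as np

def refine_state(lo, hi):
    """Bisect the hyperrectangle [lo, hi] along its widest coordinate axis.
    Repeated calls drive every partition's diameter toward zero, which is the
    convergence property RefineState needs."""
    lo, hi = np.asarray(lo, dtype=float), np.asarray(hi, dtype=float)
    axis = int(np.argmax(hi - lo))
    mid = (lo[axis] + hi[axis]) / 2.0
    hi1, lo2 = hi.copy(), lo.copy()
    hi1[axis], lo2[axis] = mid, mid
    return (lo, hi1), (lo2, hi)

(b1_lo, b1_hi), (b2_lo, b2_hi) = refine_state([0.0, 0.0], [4.0, 1.0])
```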

$\textsc{ShrinkEnv}(\tilde{E}, e_d, M_{\tilde{E}}, \hat{h})$

takes as input a set of environmental parameters $\tilde{E}$, the nominal environmental parameter $e_d$, the contract $M_{\tilde{E}}$, and the actual perception function $\hat{h}$, and generates a new set of environmental parameters. The sequence of sets generated by repeatedly applying ShrinkEnv monotonically converges to the singleton set containing only the nominal environmental parameter $e_d$.
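A minimal sketch of one such ShrinkEnv, assuming a box-shaped environment set contracted toward $e_d$ by a fixed factor (the shrinking schedule is left open by the algorithm, so the factor here is an illustrative choice):

```python
import numpy as np

def shrink_env(E_lo, E_hi, e_d, factor=0.5):
    """Contract the environment box [E_lo, E_hi] toward the nominal
    parameter e_d (assumed to lie inside the box). Repeated application
    converges monotonically to the singleton {e_d}, as required."""
    E_lo, E_hi = np.asarray(E_lo, float), np.asarray(E_hi, float)
    e_d = np.asarray(e_d, float)
    return e_d + factor * (E_lo - e_d), e_d + factor * (E_hi - e_d)

# AutoLand-style environment box from Section 5: ambient light in [0.2, 1.2]
# and sun angle in [-0.1, 0.6] radians, with nominal e_d = (1.0, 0.0).
lo, hi = shrink_env([0.2, -0.1], [1.2, 0.6], [1.0, 0.0])
```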

4.3 Properties of DaRePC

Theorem 4.1

If $M_{\tilde{E}}$ is output from DaRePC, then $\langle M_{\tilde{E}}, X_0, \tilde{E}\rangle \models R$.

Theorem 4.1 implies that if the condition in Proposition 1 is satisfied, then $\langle\hat{h}, X_0, \tilde{E}\rangle \models R$.

From the assumptions on $M$ and the Reach function, we can show that as RefineState and ShrinkEnv are called repeatedly, the reachable states computed by Reach converge to a single trajectory under nominal environmental conditions. The property can be stated more formally as:

Proposition 3

Given $S = \langle\tilde{X}, X_0, Y, E_0, f, \hat{h}\rangle$, $\forall x_0 \in X_0$ and an execution $\alpha$ of $S$ in the nominal environment $e_d$, $\forall \epsilon > 0$, $\exists n$ such that $X_n$, generated by calling RefineState $n$ times on $X_0$, and $M_{E_n}$, obtained on the environment refined by ShrinkEnv $n$ times, satisfy that $\forall t$, $\textsc{Reach}(X_n, M_{E_n}, t) \subseteq B_\epsilon(\alpha(t))$.

We further define what it means for a system to satisfy or violate the requirement robustly.

Definition 2

$\hat{S} = \langle\tilde{X}, X_0, Y, E_0, f, \hat{h}\rangle$ satisfies $R$ robustly for $X_0$ and $\tilde{E}$ if $\forall x_0 \in X_0$, $\forall e \in \tilde{E}$, the execution $\alpha$ satisfies that $\forall t$, $\exists \epsilon > 0$ such that $B_\epsilon(\alpha(t)) \subseteq R$.

Similarly, $\hat{S}$ violates $R$ robustly if there exist an execution $\alpha$ starting from some $x_0 \in X_0$ under $e_d$ and a time $t$ with $\alpha(t) \notin R$ such that there exists $\epsilon > 0$ with $B_\epsilon(\alpha(t)) \cap R = \emptyset$.

With all these, we can show completeness of DaRePC.

Theorem 4.2 (Completeness)

If $M$ satisfies conformance and convergence, and $\hat{S} = \langle\tilde{X}, X_0, Y, E_0, f, \hat{h}\rangle$ satisfies or violates the requirement robustly, then Algorithm 2 always terminates; that is, the algorithm either finds $\tilde{E} \subseteq E_0$ and $M_{\tilde{E}}$ such that $\langle M_{\tilde{E}}, X_0, \tilde{E}\rangle \models R$, or finds a counter-example $X_c \subseteq X_0$ and $\tilde{E} \subseteq E_0$ with $e_d \in \tilde{E}$ such that $\langle\hat{h}, X_c, \tilde{E}\rangle \not\models R$.

5 Case Study: Vision-based Auto Landing

Our first case study is AutoLand, a vision-based auto-landing system in which an aircraft attempts to touch down on a runway using a vision-based pose estimator. Such systems are being contemplated for different types of aircraft landing at airports that do not have expensive Instrument Landing Systems (ILS). The system block diagram is shown in Fig. 1. The safety requirement $R$ defines a sequence of shrinking rectangular regions that the aircraft should remain in (see Fig. 2); otherwise, it has to trigger an expensive missed-approach protocol. For our analysis we built a detailed simulation of the aircraft dynamics; the observer or perception pipeline is the actual code used for vision-based pose estimation.

5.1 AutoLand control system with DNN-based pose estimator

The state space $X$ of the AutoLand system is defined by the state variables $\langle x, y, z, \psi, \theta, v\rangle$, where $x, y, z$ are position coordinates, $\psi$ is the yaw, $\theta$ is the pitch, and $v$ is the velocity of the aircraft. The perception pipeline implements the observer $\hat{h}$ that attempts to estimate the full state of the aircraft (relative to the touchdown point), except the velocity. That is, the observation space $Y$ is defined by $\langle\hat{x}, \hat{y}, \hat{z}, \hat{\psi}, \hat{\theta}\rangle$.

Dynamics.

The dynamics ($f$) combines the physics model of the aircraft with a controller that follows a nominal trajectory. A 6-dimensional nonlinear differential equation describes the motion of the aircraft with the yaw rate $\beta$, the pitch rate $\omega$, and the acceleration $a$ as the control inputs (details are given in the Appendix). The aircraft controller tries to follow a reference trajectory $\xi_r: \mathbb{Z}^{\geq 0} \rightarrow X$ approaching the runway. This reference approach trajectory starts at a position $2.3$ kilometers from the touchdown point and $120$ meters above ground, and has a $-3^{\circ}$ slope in the $z$ direction. The reference approach has a constant speed of $10\,m/s$.
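For concreteness, a reference approach of this shape can be sketched as follows. The form is assumed for illustration (touchdown point at the origin, approach along the $-x$ axis at roughly constant speed), not the paper's exact $\xi_r$:

```python
import math

def reference_trajectory(t):
    """Illustrative glide-slope reference: start 2300 m from the touchdown
    point, descend on a -3 degree slope at ~10 m/s. Note the consistency of
    the stated numbers: 2300 m * tan(3 deg) is roughly 120 m of altitude."""
    x0 = -2300.0                        # m before the touchdown point
    speed = 10.0                        # m/s along the approach
    slope = math.tan(math.radians(3.0))
    x = x0 + speed * t
    z = -x * slope                      # altitude above the runway
    return x, max(z, 0.0)

x, z = reference_trajectory(0.0)
```

At $t = 0$ the altitude is about 120.5 m, matching the stated starting height, and the reference reaches the touchdown point after 230 s.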

Vision-based observer.

The observations $\langle\hat{x}, \hat{y}, \hat{z}, \hat{\psi}, \hat{\theta}\rangle$ are computed using a realistic perception pipeline composed of three functions (Fig. 1): (1) the airplane's front camera generates an image, (2) a key-point detector detects specific key-points on the runway in that image using U-Net [20], and (3) a pose estimator uses the detected key-points and prior knowledge of the actual positions of those key-points on the ground to estimate the pose of the aircraft. This last calculation is implemented using the perspective-n-point algorithm [10]. Apart from the impact of the environment on the image, the observer is deterministic.

Environment.

For this paper, we study the effects of two dominant environment factors, namely the ambient light and the position of the sun, represented by a spotlight (we have also performed a similar analysis with fog and rain, but that discussion is omitted owing to limited space). (1) The ambient light intensity, in the range [0.2, 1.2], and (2) the spotlight's angle with respect to the runway, in the range [-0.1, 0.6] radians, define the environment E_{0}. These ranges are chosen to include a nominal environment e_{d}=\langle 1.0,0\rangle in which there exists an initial state x_{0}\in X_{0} from which the system is safe. Sample camera images from the same state under different environmental conditions are shown in Fig. 1.

5.2 Safety Analysis of AutoLand

We apply DaRePC to the {\sf AutoLand} system, with reachability analysis implemented using Verse [16]. We run the algorithm with the above environment range E_{0}, the requirement R, and two rectangular initial sets X_{01} and X_{02} with different ranges of initial positions (see Appendix 0.F).

Figure 2: Left: Safe environments found by DaRePC. Right: R (blue rectangles), computed Reach (red rectangles), and simulation trajectories of the vision-based landing system starting from X_{01}.

For X_{01}, the algorithm finds \tilde{E}\subset E_{0} and a corresponding perception contract M_{\tilde{E}} such that \langle M_{\tilde{E}},X_{01},\tilde{E}\rangle\models R after 12 rounds of refinement, with an empirical conformance measure of 92.4\%. The resulting \tilde{E} is visualized in Fig. 2. The white region represents the environmental parameters that are removed (not provably safe) during refinement. For the colored region, different colors represent the empirical conformance of \hat{h} to M over each environmental-parameter partition. We make the following observations from the analysis.

The computed perception contract M_{\tilde{E}} achieves the desired conformance.

To compute the perception contract, we choose the desired conformance pr=90\%, the gap between actual and empirical conformance \epsilon=1\%, and the confidence parameter \delta=0.1\%. From LearnContract, the empirical conformance measure should then be at least 91\% over approximately 34K samples. We round this up to 40K samples. The computed perception contract M_{\tilde{E}} achieves a 92.4\% empirical conformance measure over \tilde{E}.
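The quoted sample size follows from the standard Hoeffding bound n \geq \ln(1/\delta)/(2\epsilon^{2}); we assume this is the calculation behind LearnContract's sample requirement. A minimal sketch:

```python
import math

def min_samples(epsilon, delta):
    """Smallest n so that, by Hoeffding's inequality, the empirical
    conformance is within epsilon of the actual conformance with
    probability at least 1 - delta."""
    return math.ceil(math.log(1.0 / delta) / (2.0 * epsilon ** 2))

# AutoLand setting: epsilon = 1%, delta = 0.1% gives roughly 34.5K samples
# (the ~34K quoted above), and the required empirical conformance is
# pr + epsilon = 91%.
n = min_samples(epsilon=0.01, delta=0.001)
```

The same formula with \epsilon=2\%, \delta=1\% reproduces the 5756-sample figure used for the DroneRace contract in Section 6.2.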

Additionally, we run 30 random simulations from X_{01} under \tilde{E}. Of the resulting 30K data points, 29,590 observations were contained in the computed perception contract M, which gives an empirical conformance of 98.6\% for this different sampling distribution. This shows that M_{\tilde{E}} meets the desired conformance and can do even better under some realistic sampling distributions.

DaRePC captures interesting behavior.

The set of environmental parameters \tilde{E} found by DaRePC also reveals some interesting behavior of the perception pipeline. We can see from Fig. 2 that when the intensity of the ambient light is below 0.6 or above 1.15, the performance of the perception pipeline degrades, as many of the partitions in those regions are removed. However, there is a region [0.95,1.1]\times[0.25,0.4] where the ambient lighting is close to nominal but, surprisingly, the performance of the perception pipeline is still poor. We found that this happens because, in this environment, the glare from the spotlight (sun) makes the runway too bright for the perception pipeline to detect key-points. Conversely, in the region [0.4,0.6]\times[0.25,0.4], the relatively low ambient light is compensated by the sun, and therefore the perception pipeline performs reliably enough for landing.

Conformance, correctness and system-level safety.

Fig. 2 shows the system-level safety requirement R in blue and \textsc{Reach}(X_{01},M_{\tilde{E}},t) in red. The computed reachable set indeed falls inside R, which, by Proposition 1, implies that if the perception pipeline \hat{h} conforms to M_{\tilde{E}} then \langle\hat{h},X_{01},\tilde{E}\rangle\models R.

Further, to check \langle\hat{h},X_{01},\tilde{E}\rangle\models R, we run 30 simulations starting from states randomly sampled from X_{01} under environments randomly sampled from \tilde{E}. The results are shown in Fig. 2. All 30 simulations satisfy R, and only one of them leaves the computed reachable set. This happens because M_{\tilde{E}} does not have 100\% conformance, and therefore cannot capture all behaviors of the actual system.

Discovering bad environments; \langle\hat{h},X_{01},E_{0}\setminus\tilde{E}\rangle does not satisfy R.

We performed 10 random simulations from X_{01} under E_{0}\setminus\tilde{E}. Eight of the 10 violate R (red trajectories in Fig. 2). This suggests that the environments removed during refinement can indeed lead to real counter-examples violating R.

For X_{02}, DaRePC finds a set of states X_{c}\subseteq X_{02} and environments E_{c}\subseteq E_{0} such that \langle\hat{h},x_{c},e_{c}\rangle\not\models R for x_{c}\in X_{c} and e_{c}\in E_{c}. X_{c} and E_{c} are found by calling RefineState on X_{02} 5 times and ShrinkEnv on E_{0} 8 times. The computed sets E_{c} and X_{c} are shown in Fig. 3. We perform 30 simulations starting from randomly sampled x_{c}\in X_{c} under e_{c}\in E_{c}, with the results shown in Fig. 3. All the simulation trajectories clearly violate R, which provides evidence that the X_{c} and E_{c} we found are indeed counter-examples.

Figure 3: Left: Yellow region shows E_{c} found by DaRePC. Right: R (blue rectangles) and simulation trajectories (red traces) of the vision-based landing system starting from X_{c} (yellow rectangle) under E_{c}.

5.3 Convergence and conformance of Perception Contracts

In Section 4.1, to show termination of the algorithm, we assume that the computed perception contract is convergent, i.e., as candidate contracts are computed for a sequence of environments converging to e_{d}, the contracts also converge to \hat{h} (pointwise). To test this, we pre-sampled 80K data points and set pr=85\%, \delta=0.1\%. We reduce \tilde{E} for M_{\tilde{E}} by invoking the ShrinkEnv function 0, 3, and 6 times on the set of environments with pre-sampled data points. ShrinkEnv refines the environments by removing environmental parameters where the perceived states are far from the actual states, and removing the corresponding data points sampled under those environments. Fig. 4 shows how the output perception contract and its conformance measure are influenced by successive refinements. For the three environment ranges \tilde{E}_{0}, \tilde{E}_{3}, \tilde{E}_{6} after 0, 3, and 6 rounds of refinement, we observe that \hat{h}(x,e_{d})\in M_{\tilde{E}_{6}}(x)\subseteq M_{\tilde{E}_{3}}(x)\subseteq M_{\tilde{E}_{0}}(x), which provides evidence that the perception contract is convergent.
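The nesting check above can be sketched concretely if a contract is viewed as mapping a state to a box (one interval per observed dimension). The toy closures below stand in for the learned quantile-regression contracts; only the containment logic is the point.

```python
def contains(outer, inner):
    """True if every interval of `inner` lies inside the matching
    interval of `outer` (box containment)."""
    return all(o_lo <= i_lo and i_hi <= o_hi
               for (o_lo, o_hi), (i_lo, i_hi) in zip(outer, inner))

def check_convergence(contracts, h_nominal, states):
    """contracts: list of contracts ordered by refinement round, each
    mapping x -> list of (lo, hi) intervals. Checks that (a) the most
    refined contract still contains the nominal observation and (b) the
    contracts nest as the environment set shrinks."""
    for x in states:
        boxes = [m(x) for m in contracts]
        obs = h_nominal(x)
        if not all(lo <= y <= hi for y, (lo, hi) in zip(obs, boxes[-1])):
            return False
        if not all(contains(boxes[i], boxes[i + 1])
                   for i in range(len(boxes) - 1)):
            return False
    return True
```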

Refinement preserves conformance measure of perception contract.

As we reduce the range of environmental parameters with 0, 3, and 6 refinements, the training data set shrinks from 80000 to 71215 and then 62638 points, respectively. Therefore, according to Proposition 2, the empirical conformance measure should be at least 85.8\% in all three cases to achieve the 85\% actual conformance measure. In practice, we obtain 85.9\%, 87.5\%, and 89.1\% empirical conformance, respectively. From this, we can see that refinement preserves the conformance measure of the perception contract.

Figure 4: Visualizing the perception contract after different numbers of refinements, for different dimensions.

6 Case Study: Vision-based Drone Racing (DroneRace)

In the second case study, a custom-made quadrotor has to follow a path through a sequence of gates in our laboratory using a vision-based particle filter for pose estimation. The system architecture is shown in Fig. 6. The safety requirement R is a pipe-like region in space passing through the gates. Our analysis here is based on a simulator, mimicking the laboratory flying space, that we created for this work using Neural Radiance Fields (NeRF) [26]. This simulator enables us to render photo-realistic camera images from any position in the space and under different lighting and visibility (fog) conditions. The observer implementing the vision-based particle filter is the actual code that flies the drone.

6.1 System Description

The 12-dimensional state space X of the system is defined by the position x,y,z, the linear velocities vx,vy,vz, the roll, pitch, and yaw angles \phi,\theta,\psi, and the corresponding angular velocities \rho,\omega,\beta. The observer \hat{h} estimates the 3D position and the yaw angle. That is, the observation space Y is defined by \langle\hat{x},\hat{y},\hat{z},\hat{\psi}\rangle. This is a reasonable setup, as the roll and pitch angles can be measured directly using an onboard accelerometer, but the position and yaw angle have to be obtained through computation or expensive external devices such as GPS or Vicon.

Dynamics.

A standard 12-dimensional nonlinear model for the drone is used [12]. The inputs to the drone are the desired roll, pitch, and yaw angles, and the vertical thrust. The path for passing through the gates is generated by a planner, and the controller tries to follow this planned path. More details about the controller and the dynamics are provided in Appendix 0.H.

Observer.

The observations \langle\hat{x},\hat{y},\hat{z},\hat{\psi}\rangle are computed using a particle filter algorithm from [17]. This is a very different perception pipeline compared to {\sf AutoLand}. The probability distribution over the state is represented using a set of particles, from which the mean estimate is calculated. Roughly, the particle filter updates the particles in two steps: in the prediction step it uses the dynamics model to move the particles forward in time, and in the correction step it re-samples the particles, selecting those that are consistent with the actual measurement (images, in this case). The details of this algorithm are not important for our analysis. However, it is important to note that the prediction and correction steps do not have access to the environment parameters (e.g., lighting).

Besides the current image, the particle filter also uses history to aid estimation, and therefore the estimation accuracy may improve over time. The shape of the reference trajectory that the drone is following, especially rapid turns, can also influence performance. Therefore, for this specific application, we sample data along the reference trajectory for constructing the perception contract, which gives a distribution similar to the one encountered when the drone is following the reference trajectory.

Environment.

We study the impact of two environmental parameters: (1) fog, ranging from 0 (clear) to 1 (heavy fog), and (2) ambient light, ranging from -1 (dark) to 0 (bright). The nominal environment e_{d}=\langle 0,0\rangle is a scenario in which the drone is able to complete the path successfully. The effects of different environments on the same view are shown in Fig. 1.

6.2 Analysis of DroneRace

For the DroneRace system, the initial set X_{0} with x\in[-0.85,-0.83], y\in[-0.06,-0.04], z\in[-0.34,-0.32], \phi\in[-0.01,0.01], \theta\in[-0.01,0.01], \psi\in[0.60,0.62] defines an approach zone (all the other dimensions are set to 0). Because the perception pipeline for the DroneRace example is highly sensitive, we choose the desired actual conformance measure pr=70\%, the gap \epsilon=2\%, and the confidence parameter \delta=1\%. Therefore, the empirical conformance measure should be at least 72\% with at least 5756 training data points.

Figure 5: Left: Computed Reach (red rectangles) and simulation trajectories of DroneRace starting from X_{0}. Right: Smaller Reach (red rectangles) computed after shrinking the \tilde{E} identified by DaRePC.

Conformance, correctness and system-level safety.

For X_{0}, after 23 refinements of the environment set, DaRePC is able to show that X_{0} is safe. The environment set \tilde{E} that was proved safe is a ball-like region around the nominal environment e_{d}=\langle 0,0\rangle. The empirical conformance measure of the contract is 72.4\%, which is higher than the minimal empirical conformance needed to achieve the desired conformance with confidence \delta. The computed reachable set \textsc{Reach}(X_{0},M_{\tilde{E}},t) satisfies the requirement R (Fig. 5), that is, \langle M_{\tilde{E}},X_{0},\tilde{E}\rangle\models R.
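The overall refine-until-decided loop used in both case studies can be sketched with interval stand-ins for sets. This is a simplified, environment-only version of DaRePC (state refinement omitted); learn_contract, reach, and shrink_env are placeholders for the paper's subroutines, and the interval representation is our illustrative choice.

```python
def inside(box, region):       # box, region: (lo, hi) intervals
    return region[0] <= box[0] and box[1] <= region[1]

def disjoint(box, region):
    return box[1] < region[0] or region[1] < box[0]

def darepc(X0, E0, R, learn_contract, reach, shrink_env, max_iters=50):
    """Learn a contract for the current environment set, check the
    reach-tube against R, and either conclude or shrink the environments."""
    E = E0
    for _ in range(max_iters):
        M = learn_contract(X0, E)
        tube = reach(X0, M, E)                 # list of per-step boxes
        if all(inside(b, R) for b in tube):
            return ("safe", E, M)              # Reach subset of R: proved
        if any(disjoint(b, R) for b in tube):
            return ("counterexample", E, M)    # tube leaves R entirely
        E = shrink_env(E)                      # inconclusive: refine
    return ("unknown", E, None)
```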

\langle\hat{h},X_{0},\tilde{E}\rangle satisfies R.

We run 20 random simulated trajectories starting from randomly sampled x\in X_{0} and e\in\tilde{E}. The results are visualized by the blue trajectories in Fig. 5. All 20 simulations are covered by the computed Reach and therefore do not violate R. This provides evidence that \langle\hat{h},X_{0},\tilde{E}\rangle\models R. We also measure the empirical conformance of the perception contract on the states visited in these 20 random simulations; the contract achieves a 93.6\% empirical conformance measure on these states.

Discovering bad environments.

We run 20 additional simulations starting from randomly sampled x\in X_{0} under environments randomly sampled from outside of \tilde{E}. The results are shown in Fig. 7 in the appendix. Among the 20 simulations, the drone violates the requirement in 19. This provides evidence that the system can easily violate the requirement outside the environment set \tilde{E} that DaRePC identified.

7 Discussions

There are several limitations of the perception contract method developed in this paper, which also suggest directions for future exploration.

Distributional Shifts.

We discussed that DaRePC has nice conformance and termination properties provided LearnContract converges as the environment converges to the nominal conditions. This only makes sense in the limited situation where the underlying sampling distribution \mathcal{D} for constructing the contract M remains the same under deployed conditions. It will be important to develop techniques for detecting and adapting to distributional shifts in the perception data, in order to have guarantees that are robust to such changes [1].

Runtime monitoring and control.

The problem formulation in this paper is focused on verification, which in turn requires the identification of the environmental conditions under which system safety can be assured. Perception contracts can also be used for runtime monitoring, to detect the violation of certain properties. The key challenge here is to compute preimages of the contracts, that is, M^{-1}:Y\rightarrow X, that can be used for detecting violation, or imminent violation, of the requirements. Another related problem is the correction of such imminent violations by changing the control actions.

The paper presents the analysis of 6 and 12-dimensional flight control systems that use multi-stage, heterogeneous, ML-enabled perception. Both systems have high-dimensional nonlinear dynamics and realistic perception pipelines, including deep neural networks. We explore how variations in the environment impact system-level safety. We advance the perception contract method by introducing an algorithm for data and requirement guided refinement of perception contracts (DaRePC). DaRePC can either identify a set of states and environmental factors under which the requirement is satisfied, or find a counter-example. In the future, we plan to apply the algorithm to real hardware to test its performance.

References

  • [1] Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., Mané, D.: Concrete problems in ai safety (2016)
  • [2] Angello, A., Hsieh, C., Madhusudan, P., Mitra, S.: Perception contracts for safety of ml-enabled systems. In: Proc. of the ACM on Programming Languages (PACMPL), OOPSLA (2023)
  • [3] Calinescu, R., Imrie, C., Mangal, R., Rodrigues, G.N., Păsăreanu, C., Santana, M.A., Vázquez, G.: Discrete-event controller synthesis for autonomous systems with deep-learning perception components. arXiv:2202.03360 (2023)
  • [4] Dean, S., Matni, N., Recht, B., Ye, V.: Robust guarantees for perception-based control. In: Bayen, A.M., Jadbabaie, A., Pappas, G., Parrilo, P.A., Recht, B., Tomlin, C., Zeilinger, M. (eds.) Proceedings of the 2nd Conference on Learning for Dynamics and Control. Proceedings of Machine Learning Research, vol. 120, pp. 350–360. PMLR (10–11 Jun 2020), https://proceedings.mlr.press/v120/dean20a.html
  • [5] Dean, S., Taylor, A., Cosner, R., Recht, B., Ames, A.: Guaranteeing safety of learned perception modules via measurement-robust control barrier functions. In: Kober, J., Ramos, F., Tomlin, C. (eds.) Proceedings of the 2020 Conference on Robot Learning. Proceedings of Machine Learning Research, vol. 155, pp. 654–670. PMLR (16–18 Nov 2021), https://proceedings.mlr.press/v155/dean21a.html
  • [6] Dekkers, G.: Quantile regression: Theory and applications (2014)
  • [7] Dreossi, T., Fremont, D.J., Ghosh, S., Kim, E., Ravanbakhsh, H., Vazquez-Chanlatte, M., Seshia, S.A.: VerifAI: A toolkit for the formal design and analysis of artificial intelligence-based systems. In: Computer Aided Verification (CAV) (Jul 2019)
  • [8] Everett, M., Habibi, G., How, J.P.: Efficient reachability analysis of closed-loop systems with neural network controllers. In: 2021 IEEE International Conference on Robotics and Automation (ICRA). pp. 4384–4390 (2021). https://doi.org/10.1109/ICRA48506.2021.9561348
  • [9] Fan, C., Qi, B., Mitra, S., Viswanathan, M.: Dryvr: Data-driven verification and compositional reasoning for automotive systems. In: Majumdar, R., Kunčak, V. (eds.) Computer Aided Verification (CAV). pp. 441–461. Springer, Cham (2017)
  • [10] Fischler, M.A., Bolles, R.C.: Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (jun 1981). https://doi.org/10.1145/358669.358692, https://doi.org/10.1145/358669.358692
  • [11] Fremont, D.J., Chiu, J., Margineantu, D.D., Osipychev, D., Seshia, S.A.: Formal analysis and redesign of a neural network-based aircraft taxiing system with VerifAI. In: Computer Aided Verification (CAV) (Jul 2020)
  • [12] Herbert, S.L., Chen, M., Han, S., Bansal, S., Fisac, J.F., Tomlin, C.J.: Fastrack: A modular framework for fast and guaranteed safe motion planning. In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC). pp. 1517–1522 (2017). https://doi.org/10.1109/CDC.2017.8263867
  • [13] Hoeffding, W.: Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association 58(301), 13–30 (1963), http://www.jstor.org/stable/2282952
  • [14] Hsieh, C., Koh, Y., Li, Y., Mitra, S.: Assuring safety of vision-based swarm formation control (2023)
  • [15] Hsieh, C., Li, Y., Sun, D., Joshi, K., Misailovic, S., Mitra, S.: Verifying controllers with vision-based perception using safe approximate abstractions. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41(11), 4205–4216 (2022)
  • [16] Li, Y., Zhu, H., Braught, K., Shen, K., Mitra, S.: Verse: A python library for reasoning about multi-agent hybrid system scenarios. In: Enea, C., Lal, A. (eds.) Computer Aided Verification. pp. 351–364. Springer Nature Switzerland, Cham (2023)
  • [17] Maggio, D., Abate, M., Shi, J., Mario, C., Carlone, L.: Loc-nerf: Monte carlo localization using neural radiance fields (2022)
  • [18] Păsăreanu, C.S., Mangal, R., Gopinath, D., Getir Yaman, S., Imrie, C., Calinescu, R., Yu, H.: Closed-loop analysis of vision-based autonomous systems: A case study. In: Proc. 35th Int. Conf. Comput. Aided Verification. pp. 289–303 (2023)
  • [19] Păsăreanu, C., Mangal, R., Gopinath, D., Yu, H.: Assumption Generation for the Verification of Learning-Enabled Autonomous Systems. arXiv:2305.18372 (2023)
  • [20] Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. pp. 234–241. Springer International Publishing, Cham (2015)
  • [21] Santa Cruz, U., Shoukry, Y.: Nnlander-verif: A neural network formal verification framework for vision-based autonomous aircraft landing. In: Deshmukh, J.V., Havelund, K., Perez, I. (eds.) NASA Formal Methods. pp. 213–230. Springer International Publishing, Cham (2022)
  • [22] Shoouri, S., Jalili, S., Xu, J., Gallagher, I., Zhang, Y., Wilhelm, J., Jeannin, J.B., Ozay, N.: Falsification of a Vision-based Automatic Landing System. https://doi.org/10.2514/6.2021-0998, https://arc.aiaa.org/doi/abs/10.2514/6.2021-0998
  • [23] Song, Y., Romero, A., Müller, M., Koltun, V., Scaramuzza, D.: Reaching the limit in autonomous racing: Optimal control versus reinforcement learning. Science Robotics 8(82), eadg1462 (2023). https://doi.org/10.1126/scirobotics.adg1462, https://www.science.org/doi/abs/10.1126/scirobotics.adg1462
  • [24] Sun, D., Mitra, S.: NeuReach: Learning reachability functions from simulations. In: Fisman, D., Rosu, G. (eds.) Tools and Algorithms for the Construction and Analysis of Systems. pp. 322–337. Springer, Cham (2022)
  • [25] Sun, D., Yang, B.C., Mitra, S.: Learning-based perception contracts and applications (2023)
  • [26] Tancik, M., Weber, E., Ng, E., Li, R., Yi, B., Kerr, J., Wang, T., Kristoffersen, A., Austin, J., Salahi, K., Ahuja, A., McAllister, D., Kanazawa, A.: Nerfstudio: A modular framework for neural radiance field development. In: ACM SIGGRAPH 2023 Conference Proceedings. SIGGRAPH ’23 (2023)

Appendix 0.A Proof for Proposition 1

Proposition 1 Restated  Suppose \tilde{X}\subseteq X contains the reachable states of the system S=\langle X,\tilde{X}_{0},Y,\tilde{E},f,o\rangle. If M is a (\tilde{X}_{0},\tilde{X},\tilde{E})-perception contract for o for some requirement R, then S satisfies R, i.e., \langle o,\tilde{X}_{0},\tilde{E}\rangle\models R.

Proof

Consider any environment e\in\tilde{E} and any initial state x_{0}\in\tilde{X}_{0}. From correctness of M, it follows that \langle M,\tilde{X}_{0},\tilde{E}\rangle\models R. That is, for the system with M as observer, S_{M}=\langle X,\tilde{X}_{0},Y,\tilde{E},M\rangle, the reachable sets satisfy Reach_{S_{M}}(t)\subseteq R for all t and for each e\in\tilde{E}. Now consider any execution \alpha of S=\langle X,\tilde{X}_{0},Y,\tilde{E},f,o\rangle. We know \alpha(0)\in\tilde{X}_{0}\subseteq Reach_{S_{M}}(0). Since Reach_{S}(t)\subseteq\tilde{X}, from conformance of M we know that o(\alpha(t),e)\in M(\alpha(t)) for all t and all e\in\tilde{E}. Therefore, for \alpha(t)\in Reach_{S_{M}}(t), we have \alpha(t+1)=f(\alpha(t),o(\alpha(t),e))\in f(\alpha(t),M(\alpha(t)))\subseteq Reach_{S_{M}}(t+1). By induction, \alpha(t)\in Reach_{S_{M}}(t) for every execution \alpha and every t, which gives us \langle o,\tilde{X}_{0},\tilde{E}\rangle\models R.
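The inductive step can be checked numerically on a scalar toy system: if the contract's interval always contains the real observation, the real trajectory never escapes the contract-based reach tube. The dynamics f, observer o, and contract M below are illustrative, not the paper's models.

```python
def f(x, y):                 # dynamics driven by the observation y
    return 0.5 * x + 0.5 * y

def o(x, e):                 # real observer, perturbed by environment e
    return x + 0.1 * e

def M(lo, hi):               # contract: interval containing o(x, e)
    return (lo - 0.1, hi + 0.1)   # valid for x in [lo, hi], |e| <= 1

def reach_step(lo, hi):
    """Interval image of f over the box [lo, hi] x M(lo, hi); f is
    monotone in both arguments, so endpoints suffice."""
    ylo, yhi = M(lo, hi)
    return (0.5 * lo + 0.5 * ylo, 0.5 * hi + 0.5 * yhi)

# Simulate alpha and check alpha(t) in Reach_SM(t) at every step.
lo, hi, x, e = -1.0, 1.0, 0.3, 0.7
for _ in range(20):
    assert lo <= x <= hi     # the inductive invariant of the proof
    x = f(x, o(x, e))
    lo, hi = reach_step(lo, hi)
```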

Appendix 0.B Proof for Proposition 2

Proposition 2 Restated  For any \delta\in(0,1), with probability at least 1-\delta, we have that

p\geq\hat{p}-\sqrt{-\frac{\ln\delta}{2n}}
Proof

From Hoeffding’s inequality, we get

Pr\left(\hat{p}-p\geq\sqrt{-\frac{\ln\delta}{2n}}\right)\leq e^{-2n\left(\sqrt{-\frac{\ln\delta}{2n}}\right)^{2}}=e^{\ln\delta}=\delta

This implies that

Pr\left(\hat{p}-p\leq\sqrt{-\frac{\ln\delta}{2n}}\right)\geq 1-\delta

which proves the proposition.
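Instantiating the bound with the numbers from Section 5.3 recovers the empirical-conformance thresholds quoted there; the function name below is our own.

```python
import math

def conformance_lower_bound(p_hat, n, delta):
    """Proposition 2: with probability at least 1 - delta, the actual
    conformance p is at least p_hat - sqrt(-ln(delta) / (2n))."""
    return p_hat - math.sqrt(-math.log(delta) / (2 * n))

# Section 5.3 setting: delta = 0.1%, n = 62638 samples after 6 rounds of
# ShrinkEnv. To certify p >= 85%, the empirical conformance must exceed
# 85% by the Hoeffding margin, i.e. roughly 85.7-85.8%.
needed = 0.85 + math.sqrt(-math.log(0.001) / (2 * 62638))
```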

Appendix 0.C Proof for Proposition 3

Proposition 3 Restated  Given S=\langle X,X_{0},Y,E_{0},f,\hat{h}\rangle, for all x_{0}\in X_{0}, any execution \alpha of S in the nominal environment e_{d}, and all \epsilon>0, there exists n such that the X_{n} generated by calling RefineState n times on X_{0}, together with the M_{E_{n}} obtained on the environment refined by ShrinkEnv n times, satisfies \textsc{Reach}(X_{n},M_{E_{n}},t)\subseteq B_{\epsilon}(\alpha(t)) for all t.

Proof

By the definition of ShrinkEnv, the sequence of sets of environmental parameters it generates monotonically converges to the singleton set containing the nominal environmental parameter e_{d}. By the convergence of M, M converges to \hat{h} as the set of environmental parameters converges to a singleton set. In addition, by definition, the output of RefineState converges to a singleton set of states as it is repeatedly invoked. Therefore, by the assumption on Reach, when a partition X converges to some x\in X and \tilde{E}\rightarrow\{e_{d}\}, \textsc{Reach}(X,M_{\tilde{E}},t)\rightarrow\alpha(t), the trajectory starting from state x generated by \hat{S} under environmental parameter e_{d}.

Appendix 0.D Proof for Theorem 4.2

Theorem 4.2 Restated  If M satisfies conformance and convergence, and \hat{S}=\langle X,X_{0},Y,E_{0},f,\hat{h}\rangle satisfies or violates the requirement robustly, then Algorithm 2 always terminates; i.e., the algorithm either finds \tilde{E}\subseteq E_{0} and M_{\tilde{E}} such that \langle M_{\tilde{E}},X_{0},\tilde{E}\rangle\models R, or finds a counter-example X_{c}\subseteq X_{0} and \tilde{E}\subseteq E_{0} with e_{d}\in\tilde{E} such that \langle\hat{h},X_{c},\tilde{E}\rangle\not\models R.

Proof

To show completeness of the algorithm, we consider the following four cases:

1) When a counter-example exists, the algorithm can find it. If there exists a counter-example such that the trajectory \alpha_{c} starting from x_{c}\in X_{0} under the nominal environmental parameter e_{d} exits R at time t, then, by the robustness assumption on the system, there exists \epsilon>0 such that for all x\in B_{\epsilon}(\alpha_{c}(t)), x\notin R. Therefore, according to Proposition 3, with enough refinement we can get a set of states X_{c}\subset X_{0} with x_{c}\in X_{c} and \tilde{E}\subset E_{0} with e_{d}\in\tilde{E} such that \alpha_{c}(t)\in\textsc{Reach}(X_{c},M_{\tilde{E}},t)\subseteq B_{\epsilon}(\alpha_{c}(t)). In this case, \textsc{Reach}(X_{c},M_{\tilde{E}},t)\cap R=\emptyset, which captures the counter-example.

2) A counter-example returned is indeed a counter-example. A set of states X_{c} is returned as a counter-example when there exists t such that \textsc{Reach}(X_{c},M_{\tilde{E}},t)\cap R=\emptyset for some M_{\tilde{E}}. From conformance of M_{\tilde{E}}, we know \hat{h}(x,e)\in M(x) for all x and all e\in\tilde{E}. In this case, the trajectory \alpha starting from any x_{c}\in X_{c} under any environmental parameter e\in\tilde{E} satisfies \alpha(t)\in\textsc{Reach}(X_{c},M_{\tilde{E}},t), which is disjoint from R. The output of DaRePC in this case is indeed a counter-example.

3) When a counter-example does not exist, the algorithm can find an M_{\tilde{E}} satisfying R on X_{0}. This case can be shown similarly to case 1). Since we assume the system is robust, for every trajectory of the system that satisfies R there exists \epsilon>0 such that B_{\epsilon}(\alpha(t))\subset R for all t. As DaRePC repeatedly performs refinement, \textsc{Reach}(X,M,t) converges to a single execution. Therefore, according to Proposition 3, with enough refinement, for every safe trajectory we get \alpha(t)\in\textsc{Reach}(X,M,t)\subset B_{\epsilon}(\alpha(t))\subset R. Therefore, with enough refinement, we can always find an M_{\tilde{E}} that satisfies R when \hat{S} satisfies R on X_{0} under e_{d}.

4) An M_{\tilde{E}} returned by DaRePC always satisfies R on X_{0}. This follows directly from Theorem 4.1 and the conformance assumption.

Appendix 0.E Dynamics and Controller for AutoLand Case Study

The dynamics of the AutoLand case study are given by

\begin{split}&x_{t+1}=x_{t}+v_{t}\cos(\psi_{t})\cos(\theta_{t})\,\delta t\\ &y_{t+1}=y_{t}+v_{t}\sin(\psi_{t})\cos(\theta_{t})\,\delta t\\ &z_{t+1}=z_{t}+v_{t}\sin(\theta_{t})\,\delta t\\ &\psi_{t+1}=\psi_{t}+\beta\,\delta t\\ &\theta_{t+1}=\theta_{t}+\omega\,\delta t\\ &v_{t+1}=v_{t}+a\,\delta t\end{split}

\delta t is the discretization parameter. The control inputs are the yaw rate \beta, the pitch rate \omega, and the acceleration a.

The error between the reference point and the current state is then given by

\begin{split}&x_{e}=\cos(\psi)(x_{r}-x)+\sin(\psi)(y_{r}-y)\\ &y_{e}=-\sin(\psi)(x_{r}-x)+\cos(\psi)(y_{r}-y)\\ &z_{e}=z_{r}-z\\ &\psi_{e}=\psi_{r}-\psi\\ &\theta_{e}=\theta_{r}-\theta\\ &v_{e}=v_{r}-v\end{split}

The control inputs are then computed from these errors by a PD controller.

\begin{split}&a=0.005\left(\sqrt{(v_{r}\cos(\theta_{r})\cos(\psi_{r})+0.01x_{e})^{2}+(v_{r}\sin(\theta_{r})+0.01z_{e})^{2}}-v\right)\\ &\beta=\psi_{e}+v_{r}(0.01y_{e}+0.01\sin(\psi_{e}))\\ &\omega=(\theta_{r}-\theta)+0.001z_{e}\end{split}
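The model and controller above can be transcribed directly; the sketch below is our reading of the equations (in particular, we assume the z update is intended to include the velocity factor, z + v·sin(θ)·δt, and that the velocity update advances v rather than θ), so it is an illustrative reconstruction, not the authors' code.

```python
import math

def step(state, beta, omega, a, dt=0.1):
    """One discrete step of the AutoLand dynamics; state = (x, y, z,
    psi, theta, v), inputs = yaw rate, pitch rate, acceleration."""
    x, y, z, psi, theta, v = state
    return (x + v * math.cos(psi) * math.cos(theta) * dt,
            y + v * math.sin(psi) * math.cos(theta) * dt,
            z + v * math.sin(theta) * dt,   # assumed: velocity included
            psi + beta * dt,
            theta + omega * dt,
            v + a * dt)

def controller(state, ref):
    """PD tracking controller with the gains given in the equations."""
    x, y, z, psi, theta, v = state
    xr, yr, zr, psir, thetar, vr = ref
    xe = math.cos(psi) * (xr - x) + math.sin(psi) * (yr - y)
    ye = -math.sin(psi) * (xr - x) + math.cos(psi) * (yr - y)
    ze, psie = zr - z, psir - psi
    a = 0.005 * (math.hypot(vr * math.cos(thetar) * math.cos(psir) + 0.01 * xe,
                            vr * math.sin(thetar) + 0.01 * ze) - v)
    beta = psie + vr * (0.01 * ye + 0.01 * math.sin(psie))
    omega = (thetar - theta) + 0.001 * ze
    return beta, omega, a
```

With the state exactly on the reference, all three control inputs vanish, as expected for a tracking controller.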

Appendix 0.F Initial Sets for AutoLand Case Study

The initial sets X01X_{01}, X02X_{02} for AutoLand case study are

X_{01}=[-3020,-3010]\times[-5,5]\times[118,122]\times[-0.001,0.001]\times[-0.0534,-0.0514]\times[9.99,10.01]
X_{02}=[-3030,-3000]\times[-5,5]\times[100,140]\times[-0.001,0.001]\times[-0.0534,-0.0514]\times[9.99,10.01].

Appendix 0.G Block Diagram for DroneRace

The block diagram for the DroneRace case study is shown in Fig. 6.

Figure 6: Architecture of drone racing system (DroneRace).

Appendix 0.H Dynamics and Controller for DroneRace Case Study

We use a slightly modified version of the drone dynamics from [12], given by the following differential equations:

\begin{split}&\dot{x}=vx\cos(\psi)-vy\sin(\psi)\\ &\dot{vx}=g\tan(\phi)\\ &\dot{\phi}=-d_{1}\phi+\rho\\ &\dot{\rho}=-d_{0}\phi+n_{0}\,ax\\ &\dot{y}=vx\sin(\psi)+vy\cos(\psi)\\ &\dot{vy}=g\tan(\theta)\\ &\dot{\theta}=-d_{1}\theta+\omega\\ &\dot{\omega}=-d_{0}\theta+n_{0}\,ay\\ &\dot{z}=vz\\ &\dot{vz}=k_{T}F-g\\ &\dot{\psi}=\beta\\ &\dot{\beta}=n_{0}\,az\end{split}

where the parameters are g=9.81, d_{0}=10, d_{1}=8, n_{0}=10, k_{T}=0.91. Given the reference state provided by the reference trajectory, we can compute the tracking error between the reference state and the current state, err=[x_{e},vx_{e},\phi_{e},\rho_{e},y_{e},vy_{e},\theta_{e},\omega_{e},z_{e},vz_{e},\psi_{e},\beta_{e}]. The position error between the current state of the drone and the reference trajectory is given by

\begin{split}&\begin{bmatrix}x_{e}\\ y_{e}\\ z_{e}\end{bmatrix}=\begin{bmatrix}\cos(\psi)&-\sin(\psi)&0\\ \sin(\psi)&\cos(\psi)&0\\ 0&0&1\end{bmatrix}\begin{bmatrix}x_{r}-x\\ y_{r}-y\\ z_{r}-z\end{bmatrix}\end{split}

The tracking error for the other states is given by the difference between the reference state and the current state. The control signals ax, ay, kT, az can then be computed using the following feedback controller.

\begin{split}&\begin{bmatrix}ax\\ ay\\ kT\end{bmatrix}=\begin{bmatrix}3.16&4.52&4.25&1.36&0&0&0&0&0&0\\ 0&0&0&0&1&1.83&1.46&1.14&0&0\\ 0&0&0&0&0&0&0&0&1&1.79\end{bmatrix}err\\ &az=1.0\,\psi_{e}+1.0\,\beta_{e}\end{split}

Appendix 0.I Simulation Trajectories for eE~e\notin\tilde{E} for DroneRace

Figure 7: Simulation trajectories under environments e\in E_{0}\setminus\tilde{E} violate the requirement.