
Aalborg University, Aalborg, Denmark
{christianms,mzi}@cs.aau.dk

The Reachability Problem for
Neural-Network Control Systems

Christian Schilling (ORCID 0000-0003-3658-1065)
Martin Zimmermann (ORCID 0000-0002-8038-2453)
Abstract

A control system consists of a plant component and a controller which periodically computes a control input for the plant. We consider systems where the controller is implemented by a feedforward neural network with ReLU activations. The reachability problem asks, given a set of initial states, whether a set of target states can be reached. We show that this problem is undecidable even for trivial plants and fixed-depth neural networks with three inputs and outputs. We also show that the problem becomes semi-decidable when the plant as well as the input and target sets are given by automata over infinite words.

1 Introduction

Cyber-physical systems consist of digital (cyber) and physical components. A common instance of this paradigm is a control system, consisting of a physical plant and a controller whose purpose is to steer the plant to a desired behavior [7]. Control theory studies the automatic synthesis of such controllers, which, for nonlinear systems, is a difficult task. Machine learning has long been successfully applied to tackle this task, where the learned controller was first represented by a (shallow) feedforward neural network [19] and more recently by a deep neural network (DNN) [15]. We call a control system with a DNN controller a neural-network control system (NNCS). Due to their black-box nature, DNNs have raised concerns about their correctness and safety, in particular in terms of the worst-case behavior [28]. However, as they are deployed in safety-critical applications, proving machine-learned NNCS correct is of utmost importance, and considerable resources have been invested into their verification [16, 17].

In this paper, we are concerned with the fundamental problem of safety for NNCS: given a set of initial states and a set of bad states of the plant, does the controller prevent the plant from reaching a bad state when started in an initial state? Note that the failure of safety is captured by a reachability property: does there exist an initial state from which a bad state is reachable? Thus, in the following, we study the reachability problem for NNCS.

Recall that an NNCS is a combination of a DNN (the controller) and a plant. It is known that the reachability problem is already undecidable for sufficiently complex plants, even without any controller [13]. So the question becomes: is there a simple but expressive class of plants for which the reachability problem is decidable? Inspired by similar results for recurrent neural networks [27, 10], we show in Section 3 that the answer is negative: the reachability problem is undecidable even for trivial plants. Intuitively, a DNN can simulate one computational step of a two-counter machine. Thus, a recurrent neural network can simulate a two-counter machine. As a DNN controlling a plant is essentially recurrent (as it bases its control decisions on the current state of the plant), undecidability follows.

On the positive side, we show in Section 4 that the reachability problem is at least semi-decidable for plants whose behavior can be captured by automata over infinite words: Sälzer et al. showed that the behavior of DNNs can be captured by such automata [23]. Hence, relying on standard automata-theoretic constructions, the composition of a DNN and an automata-definable plant can also be captured by automata. The class of automata-definable plants includes, for instance, plants that are described by multi-mode linear maps. Such maps are able to express, for example, the dynamics of adaptive cruise controls [14].

1.1 Related work

Reachability in NNCS is generally challenging. Existing approaches typically combine techniques developed for dynamical systems (the plant) [2] and neural networks [16]. Tools such as CORA [1, 12], JuliaReach [5, 25], and NNV [18] compete in the ARCH-COMP friendly competition, and we refer to the report [17] for typical examples of NNCS.

Undecidability of questions about unbounded computations with piecewise-linear (PWL) functions is long known, e.g., periodicity in iterated 2D maps [22] or reachability for linear hybrid automata [9]. Similar results have been shown for DNNs. Siegelmann and Sontag showed undecidability for unbounded computations in DNNs with activations given by a PWL approximation of the sigmoid function (which effectively is the ReLU function truncated at $1$) [27]. Later, Hyötyniemi showed an encoding of two-counter machines in recurrent neural networks (RNNs) with ReLU activations [10]. While an RNN can conceptually be seen as the special case of an NNCS without a plant, the formalism differs. We thus consider our encoding of two-counter machines in NNCS of independent (yet mainly pedagogical) value. Cabessa showed an encoding of two-counter machines in a variant of RNNs with conditional weights called spike-timing dependent plasticity [6]. Recently, we also showed an encoding of two-counter machines in decision-tree control systems [26], where the DNN is replaced by a decision tree with simple conditions $x\leq c$ for some variable $x$ and constant $c$.

Katz et al. studied the problem of reachability in a ReLU DNN without iteration and showed that, given polyhedral (i.e., described by linear constraints) input and output sets, the reachability problem is NP-complete [11]. Sälzer and Lange later fixed some issues in the proof, mainly related to the effective representation of real numbers [24].

Sälzer et al. recently presented an encoding of a DNN in a weak Büchi automaton [23]. We build on this encoding for the analysis of semi-decidability.

2 Preliminaries

We start by formally introducing the type of DNN under study.

Definition 1 (Deep neural network)

A neuron is a function $\nu\colon\mathbb{R}^{m}\to\mathbb{R}$ with $\nu(\vec{x})=\sigma\left(\sum_{i=1}^{m}c_{i}x_{i}+b\right)$, where $m$ is the input dimension, the $c_{i}\in\mathbb{Q}$ are the weights, $b\in\mathbb{Q}$ is the bias, and $\sigma\colon\mathbb{R}\to\mathbb{R}$ is the activation function of $\nu$, which is either the identity function or the rectified linear unit (ReLU) $y\mapsto\max\{y,0\}$.

A layer is a sequence of neurons $(\nu_{1},\dots,\nu_{n})$, all of the same input dimension $m$, computing the function $\ell\colon\mathbb{R}^{m}\to\mathbb{R}^{n}$ given by $\ell(\vec{x})=(\nu_{1}(\vec{x}),\dots,\nu_{n}(\vec{x}))$. The dimensions $m$ and $n$ are the input resp. output dimension of the layer.

A deep neural network (DNN) $N$ is a sequence of layers $(\ell_{1},\dots,\ell_{k})$ such that the output dimension of $\ell_{i}$ is the input dimension of $\ell_{i+1}$ for all $i=1,\dots,k-1$. The last layer is the output layer and all other layers are called hidden layers. If $m$ is the input dimension of $\ell_{1}$ and $n$ is the output dimension of $\ell_{k}$, then the DNN computes the function $N\colon\mathbb{R}^{m}\to\mathbb{R}^{n}$ defined as

N(\vec{x}) = \ell_{k}(\ell_{k-1}(\dots\ell_{1}(\vec{x})\dots)).
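To make the definition concrete, the following is a minimal sketch in Python (using NumPy; the representation of a layer as a (weight matrix, bias, activation) triple is our own illustration, not notation from the paper) of the function computed by a DNN:

```python
import numpy as np

def relu(y):
    return np.maximum(y, 0.0)

def dnn_forward(layers, x):
    """Evaluate a DNN given as a list of (C, b, activation) triples, where C
    is the weight matrix of a layer, b its bias vector, and activation is
    either relu (hidden layers) or the identity (output layer)."""
    for C, b, activation in layers:
        x = activation(C @ x + b)
    return x

# Example: one hidden neuron computing ReLU(x1 - x2), then an identity output.
layers = [(np.array([[1.0, -1.0]]), np.array([0.0]), relu),
          (np.array([[1.0]]), np.array([0.0]), lambda y: y)]
print(dnn_forward(layers, np.array([3.0, 1.0])))  # -> [2.]
```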
Figure 1: Neural-network control system. Starting from the initial state $\vec{x}_{0}$, the controller $N$ computes $\vec{u}_{k}=N(\vec{x}_{k-1})$ and the plant $P$ computes $\vec{x}_{k}=P(\vec{x}_{k-1},\vec{u}_{k})$.

Next, we define control systems as depicted in Fig. 1. The system consists of a plant and a controller (here: a DNN) and acts in iterations: first, the controller computes a control input $\vec{u}$ for the plant based on the current state $\vec{x}$ of the plant. Next, from its inputs $\vec{x}$ and $\vec{u}$, the plant computes a new state. Then the process repeats. For the plant, we restrict ourselves to discrete time, i.e., we are only interested in its output and not its intermediate states. For now, we also abstract from the concrete type of plant and just view it as a general function.

Definition 2 (Neural-network control system)

A neural-network control system (NNCS) is a tuple $(P,N)$ with a plant $P\colon\mathbb{R}^{d+c}\to\mathbb{R}^{d}$ and a controller given by a DNN $N\colon\mathbb{R}^{d}\to\mathbb{R}^{c}$, i.e., $d$ is the dimension of the states of $P$ and $c$ is the dimension of the control vectors computed by $N$.

The semantics of an NNCS are given as a sequence of states $\vec{x}_{k}$ and control inputs $\vec{u}_{k}$, induced by some initial state $\vec{x}_{0}\in\mathbb{R}^{d}$ via

\vec{u}_{k} = N(\vec{x}_{k-1}),
\vec{x}_{k} = P(\vec{x}_{k-1},\vec{u}_{k}).

We introduce a shorthand to express one iteration of the control loop in Fig. 1, i.e., the composition of the DNN followed by the plant, to compute $\vec{x}_{k}$ from $\vec{x}_{k-1}$:

C_{P,N}(\vec{x}_{k-1}) = P(\vec{x}_{k-1},N(\vec{x}_{k-1}))
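As a sketch (assuming any plant P and controller N with matching dimensions), the closed loop unrolls as follows:

```python
def closed_loop_run(P, N, x0, steps):
    """Iterate C_{P,N}: the controller reads the current plant state and the
    plant computes the successor state from the state and the control input."""
    x, trace = x0, [x0]
    for _ in range(steps):
        u = N(x)      # u_k = N(x_{k-1})
        x = P(x, u)   # x_k = P(x_{k-1}, u_k)
        trace.append(x)
    return trace
```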

We will focus on sets of states represented by linear constraints. Given $a\in\mathbb{Q}^{n}$ and $b\in\mathbb{Q}$, the set $H_{a,b}=\{\vec{x}\in\mathbb{R}^{n}\mid\langle a,\vec{x}\rangle\leq b\}$ is a linear constraint, where $\langle\cdot,\cdot\rangle$ denotes the scalar product. A polyhedron is a finite intersection of linear constraints. Let $\mathcal{P}(n)$ denote the set of all polyhedra in $n$ dimensions.

We are now ready to define the reachability problem for NNCS.

Problem 1 (Reachability problem for NNCS)

Given a DNN $N\colon\mathbb{R}^{d}\to\mathbb{R}^{c}$, a plant $P\colon\mathbb{R}^{d+c}\to\mathbb{R}^{d}$, a polyhedron $X_{0}\in\mathcal{P}(d)$ of initial states, and a polyhedron $\varphi\in\mathcal{P}(d)$ of target states, does there exist an initial state $\vec{x}_{0}\in X_{0}$ and a $k\in\mathbb{N}$ such that $(C_{P,N})^{k}(\vec{x}_{0})\in\varphi$?

3 Undecidability

In this section, we prove that the NNCS reachability problem is undecidable. The proof is by a reduction from the halting problem for two-counter machines.

Formally, a two-counter machine $\mathcal{M}$ is a sequence

(0:\texttt{I}_{0})\,(1:\texttt{I}_{1})\cdots(k-2:\texttt{I}_{k-2})\,(k-1:\texttt{STOP}),

where the first element of a pair $(\ell:\texttt{I}_{\ell})$ is the line number and $\texttt{I}_{\ell}$ for $0\leq\ell<k-1$ is an instruction of the form

  • $\texttt{INC}(i)$ with $i\in\{0,1\}$,

  • $\texttt{DEC}(i)$ with $i\in\{0,1\}$, or

  • $\texttt{JZ}(i,\ell^{\prime})$ with $i\in\{0,1\}$ and $\ell^{\prime}\in\{0,\ldots,k-1\}$.

A configuration of $\mathcal{M}$ is of the form $(\ell,c_{0},c_{1})$ with $\ell\in\{0,\ldots,k-1\}$ (the current value of the program counter) and $c_{0},c_{1}\in\mathbb{N}$ (the current contents of the two counters). The initial configuration is $(0,0,0)$ and the unique successor configuration of a configuration $(\ell,c_{0},c_{1})$ is defined as follows:

  • If $\texttt{I}_{\ell}=\texttt{INC}(i)$, then the successor configuration is $(\ell+1,c_{0}^{\prime},c_{1}^{\prime})$ with $c_{i}^{\prime}=c_{i}+1$ and $c_{1-i}^{\prime}=c_{1-i}$.

  • If $\texttt{I}_{\ell}=\texttt{DEC}(i)$, then the successor configuration is $(\ell+1,c_{0}^{\prime},c_{1}^{\prime})$ with $c_{i}^{\prime}=\max\{c_{i}-1,0\}$ and $c_{1-i}^{\prime}=c_{1-i}$.

  • If $\texttt{I}_{\ell}=\texttt{JZ}(i,\ell^{\prime})$ and $c_{i}=0$, then the successor configuration is $(\ell^{\prime},c_{0},c_{1})$.

  • If $\texttt{I}_{\ell}=\texttt{JZ}(i,\ell^{\prime})$ and $c_{i}>0$, then the successor configuration is $(\ell+1,c_{0},c_{1})$.

  • If $\texttt{I}_{\ell}=\texttt{STOP}$, then $(\ell,c_{0},c_{1})$ has no successor configuration.

The unique run of $\mathcal{M}$ (starting in the initial configuration) is defined as the maximal sequence $\gamma_{0}\gamma_{1}\gamma_{2}\cdots$ of configurations $\gamma_{j}\in\mathbb{N}^{3}$ where $\gamma_{0}$ is the initial configuration and $\gamma_{j+1}$ is the successor configuration of $\gamma_{j}$, if $\gamma_{j}$ has a successor configuration. This run is either finite (line $k-1$ is reached) or infinite (line $k-1$ is never reached). In the former case, we say that $\mathcal{M}$ terminates. The halting problem for two-counter machines asks, given a two-counter machine $\mathcal{M}$, whether $\mathcal{M}$ terminates when started in the initial configuration.
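For intuition, the semantics can be executed directly; here is a sketch with a hypothetical tuple-based instruction format (e.g., ('INC', 0), ('JZ', 1, 5), ('STOP',)):

```python
def run_two_counter_machine(program, max_steps=10_000):
    """Compute a prefix of the unique run from the initial configuration
    (0, 0, 0); the run ends if and only if STOP is reached."""
    pc, c = 0, [0, 0]
    run = [(pc, c[0], c[1])]
    for _ in range(max_steps):
        instr = program[pc]
        if instr[0] == 'STOP':
            return run  # the machine terminates
        if instr[0] == 'INC':
            c[instr[1]] += 1
            pc += 1
        elif instr[0] == 'DEC':
            c[instr[1]] = max(c[instr[1]] - 1, 0)
            pc += 1
        elif instr[0] == 'JZ':
            i, target = instr[1], instr[2]
            pc = target if c[i] == 0 else pc + 1
        run.append((pc, c[0], c[1]))
    return run  # truncated; the machine may run forever
```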

Proposition 1 ([20])

The halting problem for two-counter machines is undecidable.

In the following, we show that the halting problem for two-counter machines can be reduced to the reachability problem for NNCS by simulating the semantics of a two-counter machine by an NNCS.

Theorem 3.1

The reachability problem for NNCS is undecidable.

Proof

Fix some two-counter machine $\mathcal{M}$ with $k$ instructions. We show how to construct a gadget for each instruction of $\mathcal{M}$ (except for the STOP instruction), which we then combine into a DNN simulating one configuration update of $\mathcal{M}$. Thus, the reachability problem for NNCS (which involves the iterated application of the DNN) then allows us to simulate the full run of $\mathcal{M}$.

Formally, the DNN implements a function from $\mathbb{R}^{3}$ to $\mathbb{R}^{3}$ with the following property: if the three inputs encode a non-stopping configuration of the two-counter machine, then the three outputs encode the successor configuration. Note that, since the weights and biases of the DNN we construct are integral, the outputs given integral inputs are also integral. In the following, we often implicitly assume that inputs are integral when we explain the intuition behind our construction.

Our construction of the DNN fits into the common architecture [8] in which all hidden neurons use ReLU activations and all output neurons use identity activations. We let the plant component simply turn the control input into the new state ($P(\vec{x},\vec{u})=\vec{u}$), as the DNN already simulates $\mathcal{M}$.

In some more detail, for every instruction $(\ell:\texttt{I}_{\ell})$ of $\mathcal{M}$, we construct one gadget simulating this instruction. All these gadgets will be executed in parallel in one iteration of the DNN, but only one of them (determined by the current value of the program counter) will actually perform a computation. The other gadgets just compute the identity function for each of their inputs. Thus, in the end we need to subtract $(k-2)\cdot v$ from each output $v$.

All gadgets have inputs named $\mathit{pc}$ (representing the current value of the program counter), and $c_{0}$ and $c_{1}$ (representing the current counter values), as well as three outputs named $\mathit{pc}^{\prime}$ (representing the value of the program counter of the successor configuration), and $c_{0}^{\prime}$ and $c_{1}^{\prime}$ (representing the counter values of the successor configuration). To simplify our construction, we use an additional gadget that conceptually checks whether the value of the program counter is equal to some fixed line number $\ell$. This gadget is shown in Fig. 2. The output $a_{\ell}$ of this gadget (which has only one input, $\mathit{pc}$) satisfies

a_{\ell}=\begin{cases}1&\mathit{pc}=\ell,\\0&\mathit{pc}\neq\ell.\end{cases}
Figure 2: Auxiliary gadget for instruction $\ell$. Here and in all later illustrations of DNNs, dots denote neurons, where filled dots use ReLU activations and empty dots can use either identity or ReLU activations (the choice is irrelevant since the value before the activation is nonnegative). Sometimes, as in this case, the empty dots are only present for a fully-connected architecture. Edge colors only serve the visual association with the weights. We omit weight $1$ and bias $0$ as well as connections with weight $0$.

The outputs $a_{\ell}$ of these auxiliary gadgets (we have one for each line number $\ell$) will be fed into the other gadgets simulating the instructions.
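For integral program counters, one standard ReLU realization of such an equality test uses two hidden neurons feeding one output neuron; the following sketch illustrates the idea (the concrete weights in Fig. 2 may differ):

```python
def relu(y):
    return max(y, 0.0)

def a(ell, pc):
    """Indicator gadget: for integral pc, returns 1 if pc == ell and 0
    otherwise (one possible realization, not necessarily the one in Fig. 2)."""
    return relu(1 - relu(pc - ell) - relu(ell - pc))

assert a(3, 3) == 1 and a(3, 2) == 0 and a(3, 7) == 0
```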

Next, we describe the instruction gadgets, where we restrict ourselves to the counter with index zero; the counter with index one is treated analogously. Fig. 3 shows these gadgets together with the possible outputs. It is easy to verify that each gadget performs the corresponding computation whenever the input $\mathit{pc}$ is equal to $\ell$, and the identity function otherwise. Let us stress that each gadget we construct depends both on the line number and the instruction.

(a) $\ell:\texttt{INC}(0)$ gadget (diagram with inputs $c_{0}$, $c_{1}$, $\mathit{pc}$, $a_{\ell}$ and outputs $c_{0}^{\prime}$, $c_{1}^{\prime}$, $\mathit{pc}^{\prime}$).

(b) $\ell:\texttt{INC}(0)$ gadget's possible output values:

Condition | $c_{0}^{\prime}$ | $c_{1}^{\prime}$ | $\mathit{pc}^{\prime}$
$\mathit{pc}=\ell$ | $c_{0}+1$ | $c_{1}$ | $\mathit{pc}+1$
$\mathit{pc}\neq\ell$ | $c_{0}$ | $c_{1}$ | $\mathit{pc}$

(c) $\ell:\texttt{DEC}(0)$ gadget (diagram).

(d) $\ell:\texttt{DEC}(0)$ gadget's possible output values:

Condition | $c_{0}^{\prime}$ | $c_{1}^{\prime}$ | $\mathit{pc}^{\prime}$
$\mathit{pc}=\ell$ | $\max\{c_{0}-1,0\}$ | $c_{1}$ | $\mathit{pc}+1$
$\mathit{pc}\neq\ell$ | $c_{0}$ | $c_{1}$ | $\mathit{pc}$

(e) $\ell:\texttt{JZ}(0,\ell^{\prime})$ gadget (diagram).

(f) $\ell:\texttt{JZ}(0,\ell^{\prime})$ gadget's possible output values:

Condition | $c_{0}^{\prime}$ | $c_{1}^{\prime}$ | $\mathit{pc}^{\prime}$
$\mathit{pc}=\ell$ and $c_{0}=0$ | $c_{0}$ | $c_{1}$ | $\ell^{\prime}$
$\mathit{pc}=\ell$ and $c_{0}\neq 0$ | $c_{0}$ | $c_{1}$ | $\mathit{pc}+1$
$\mathit{pc}\neq\ell$ | $c_{0}$ | $c_{1}$ | $\mathit{pc}$

Figure 3: The gadgets for the three instructions. See Fig. 2 for further explanations.

The final layout of the gadgets is shown in Fig. 4. Essentially, each auxiliary gadget is wired to the corresponding instruction gadget, and at the end we need to subtract the inputs $k-2$ times as described above. Note that there is no gadget for the STOP instruction (instruction $k-1$ in $\mathcal{M}$). When $\mathit{pc}$ is equal to $k-1$, then the DNN computes the identity function: first, all $a_{\ell}$ are equal to $0$; hence, each of the $k-1$ instruction gadgets $I_{\ell}$ computes the identity function; after the subtraction, we are indeed left with the identity.

Figure 4: Complete construction. Each box represents an auxiliary gadget $a_{\ell}$ resp. an instruction gadget $I_{\ell}$. Small dots denote junctions of connections and have no further semantics. The last layer is the output layer (with identity activations).

Finally, the initial input to the DNN is $\vec{x}_{0}=(0,0,0)$ (representing the initial configuration) and the target set is $\varphi=\{(k-1,c_{0},c_{1})\mid c_{0},c_{1}\geq 0\}$, where $k-1$ is the last instruction number (STOP) of $\mathcal{M}$. Clearly, $\mathcal{M}$ terminates if and only if the NNCS reaches a state satisfying $\varphi$ when started in $X_{0}=\{\vec{x}_{0}\}$. ∎

We note that the DNNs simulating two-counter machines are rather simple.

Corollary 1

The NNCS reachability problem remains undecidable for DNNs with integral weights, $3$ input and output dimensions, $6$ hidden layers, a singleton initial set, and a target set $o=v$ for some output neuron $o$ and constant $v\in\mathbb{N}$.

Figure 5: Construction with a single hidden layer. (a) Stacking of previous hidden layers (tracks $\ell_{1},\dots,\ell_{7}$ with inputs $c_{0}^{1},c_{1}^{1},\mathit{pc}^{1},\dots,i_{1}^{7},i_{2}^{7},i_{3}^{7}$ and outputs $o_{0}^{1},\dots,o_{2}^{7}$). (b) Modulo-$7$ counter (inputs $m_{i}^{1},\dots,m_{i}^{7}$ and outputs $m_{o}^{1},\dots,m_{o}^{7}$).

One may wonder whether the six hidden layers are necessary. In general, one cannot hope to obtain a small neural network when removing layers [4]. However, since we can iterate the NNCS, and the plant model is not interfering, we can reduce one iteration of a DNN $N$ with six hidden layers (constructed in the proof above) to seven iterations of a DNN $N^{\prime}$ with one hidden layer. (Typically, neural networks with only one hidden layer are not called deep.) Fig. 5 shows a sketch of the construction idea. Essentially, we take the hidden layers of $N$ and stack them as one wide hidden layer in $N^{\prime}$. (For instance, layer $\ell_{1}$ has width $k+1$.) We refer to each of these hidden layers as a track. For each track, we need to add input and output dimensions corresponding to the number of neurons in the respective previous and next hidden layers. The output of track $j$ is fed to track $j+1$ (and the output of the last track is fed to the first track).

When presented with an input vector $\vec{x}_{0}$ of appropriate size, the first track performs the computation of the first hidden layer and feeds its output to the second track, and so on. After seven iterations, the output of the last track will equal the output of the seven-layer DNN $N$ after the first iteration. This output is then used as the input of the first track again and the process continues.

Finally, we need to make sure that the other tracks do not accidentally produce an output that leads to a target state between multiples of seven iterations. In order to only consider outputs in every seventh iteration, we use the additional gadget shown in Fig. 5(b). This gadget has seven inputs and outputs and is to be stacked below the other DNN. When the initial input is $(1,0,0,0,0,0,0)$, the $1$ is propagated to the second index, and so on, until it arrives back at the first index after seven iterations.
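The gadget is simply a cyclic rotation of its seven values (realizable with weight-$1$ connections only), as the following sketch illustrates:

```python
def mod7_step(m):
    # one iteration: rotate the seven entries by one position
    return m[-1:] + m[:-1]

m = [1, 0, 0, 0, 0, 0, 0]
for _ in range(7):
    m = mod7_step(m)
assert m == [1, 0, 0, 0, 0, 0, 0]  # the 1 is back at the first index
```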

The target set $\varphi$ now simply needs to be extended to arbitrary values in the auxiliary dimensions, except for the entry $m_{i}^{1}$ (the first input of the modulo-$7$ counter), which has to equal one. Formally: $\varphi^{\prime}=(\varphi\land m_{i}^{1}\leq 1\land -m_{i}^{1}\leq -1)$.

In summary, by scaling the number of inputs and outputs with $\mathcal{M}$, we obtain a DNN with one (wide) hidden layer.

Corollary 2

The NNCS reachability problem remains undecidable for DNNs with integral weights, one hidden layer, a singleton initial set, and a target set $o=v$ for some output neuron $o$ and constant $v\in\mathbb{N}$.

4 Semi-decidability

In this section, we show that the NNCS reachability problem is semi-decidable for a particular class of plants. Indeed, from a single initial state $\vec{x}_{0}$, we can enumerate all states $(C_{P,N})^{k}(\vec{x}_{0})$ reachable in $k$ iterations and, for each of them, decide membership in the target polyhedron $\varphi$. However, since we allow for an initial set $X_{0}$ (which is in general uncountable), this algorithm is not effective.

The image of a polyhedron under a ReLU DNN is a (finite) union of polyhedra [21]. If we choose a class of plants with the same property, we obtain an effective algorithm again. In what follows, we show a more general result by using an automaton encoding of DNNs from [23]. This will allow us to more abstractly consider a class of plants that is definable in the same automaton formalism.

We slightly deviate from the original approach by Sälzer et al. [23] in that we use a more expressive automaton model, as we are (unlike them) not bothered with efficiency considerations (since our problem is undecidable).

Definition 3 (Büchi automaton)

A (nondeterministic) Büchi automaton (NBA) $\mathcal{A}=(Q,\Sigma,q_{0},\delta,F)$ consists of a finite set $Q$ of states, a finite alphabet $\Sigma$, an initial state $q_{0}\in Q$, a transition relation $\delta\subseteq Q\times\Sigma\times Q$, and a set of accepting states $F\subseteq Q$.

A run on an infinite word $w=a_{0}a_{1}\dots$ is an infinite sequence of states $q_{0},q_{1},\dots$ starting in the initial state and satisfying $(q_{i},a_{i},q_{i+1})\in\delta$ for all $i\geq 0$. A run is accepting if $q_{i}\in F$ for infinitely many $i$. The language of $\mathcal{A}$ is

L(\mathcal{A})=\{w\in\Sigma^{\omega}\mid \mathcal{A}\text{ has an accepting run on }w\}.

A language is $\omega$-regular if there exists an NBA that accepts it.

In the following, we recall an effective encoding of real numbers in NBA from [23]. Let $\Sigma=\{+,-,0,1,.\}$. A word $w=s\,a_{n}\dots a_{0}\,.\,b_{0}b_{1}\dots$ with $n\geq 0$, $s\in\{+,-\}$, and $a_{i},b_{i}\in\{0,1\}$ encodes the real value

\mathit{dec}(w)=(-1)^{\operatorname{sign}(s)}\cdot\left(\sum_{i=0}^{n}a_{i}\cdot 2^{i}+\sum_{i=0}^{\infty}b_{i}\cdot 2^{-(i+1)}\right)

where $\operatorname{sign}(s)=0$ if $s=+$ and $\operatorname{sign}(s)=1$ if $s=-$. As usual, the word encoding is not unique, but the decoding is [23, Page 5].
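As an illustration, the following sketch decodes a word with a finite fractional part (truncating the infinite tail; the argument format is our own):

```python
def dec(sign, int_bits, frac_bits):
    """Decode s a_n ... a_0 . b_0 b_1 ... with int_bits = [a_n, ..., a_0]
    and a finite prefix frac_bits = [b_0, b_1, ...] of the fractional part."""
    integer = sum(a * 2**i for i, a in enumerate(reversed(int_bits)))
    fraction = sum(b * 2**-(i + 1) for i, b in enumerate(frac_bits))
    value = integer + fraction
    return -value if sign == '-' else value

assert dec('+', [1, 0, 1], [0, 1]) == 5.25  # "+101.01(000...)" encodes 5.25
```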

Now, we switch to a word encoding of multiple numbers by using a product alphabet. A symbol over this product alphabet $\Sigma^{k}$ is a $k$-vector of symbols. A word over $\Sigma^{k}$ is well-formed if it is of the form

w=\begin{bmatrix}s_{1}\\ \vdots\\ s_{k}\end{bmatrix}\begin{bmatrix}a_{1,n}\\ \vdots\\ a_{k,n}\end{bmatrix}\cdots\begin{bmatrix}a_{1,0}\\ \vdots\\ a_{k,0}\end{bmatrix}\begin{bmatrix}.\\ \vdots\\ .\end{bmatrix}\begin{bmatrix}b_{1,0}\\ \vdots\\ b_{k,0}\end{bmatrix}\begin{bmatrix}b_{1,1}\\ \vdots\\ b_{k,1}\end{bmatrix}\cdots

where $s_{i}\in\{+,-\}$ and $a_{i,j},b_{i,h}\in\{0,1\}$ for $i=1,\dots,k$, $j=0,\dots,n$, and $h=0,1,\dots$. In other words, the signs and the point are aligned, which can be achieved by filling up with leading zeros. The language $\mathit{WF}_{k}$ of well-formed words is $\omega$-regular [23]. The selection of a single component $i\in\{1,\dots,k\}$ is obtained in the obvious way:

w_{i}=s_{i}\,a_{i,n}\dots a_{i,0}\,.\,b_{i,0}b_{i,1}\dots

If an NBA over $\Sigma^{k}$ accepts only well-formed words, then we can understand its language as a relation over $\mathbb{R}^{k}$. Furthermore, linear constraints are also $\omega$-regular [23]. Thus, as NBA are closed under intersection and union, (finite unions of) polyhedra are also $\omega$-regular. Finally, we can also use NBAs to encode functions $f\colon\mathbb{R}^{m}\to\mathbb{R}^{n}$ via their graphs, which are relations over $\mathbb{R}^{m+n}$.

Sälzer et al. showed that every function computable by a DNN can be represented by an NBA. (They actually proved the result for the more restrictive class of eventually-always weak NBA. But for us it is more prudent to consider the more general class of NBA.)

Proposition 2 (Theorem 1 in [23])

Let $N\colon\mathbb{R}^{d}\to\mathbb{R}^{c}$ be a DNN. There exists an NBA $\mathcal{A}_{N}$ over $\Sigma^{d+c}$ with

L(\mathcal{A}_{N})=\{w\in\mathit{WF}_{d+c}\mid N(\mathit{dec}(w_{1}),\dots,\mathit{dec}(w_{d}))=(\mathit{dec}(w_{d+1}),\dots,\mathit{dec}(w_{d+c}))\}.

For our application, we need to slightly modify the automaton $\mathcal{A}_{N}$ from Proposition 2 so that it also copies its input for further use. This modification can be implemented by replacing each transition label $(a_{1},\ldots,a_{n},a_{1}^{\prime},\dots,a_{m}^{\prime})$ by $(a_{1},\ldots,a_{n},a_{1},\ldots,a_{n},a_{1}^{\prime},\dots,a_{m}^{\prime})$.

Corollary 3

Let $N\colon\mathbb{R}^{d}\to\mathbb{R}^{c}$ be a DNN. There exists an NBA $\widehat{\mathcal{A}}_{N}$ over $\Sigma^{d+d+c}$ with

L(\widehat{\mathcal{A}}_{N})=\{w\in\mathit{WF}_{d+d+c}\mid(w_{1},\dots,w_{d})=(w_{d+1},\dots,w_{d+d})\text{ and }N(\mathit{dec}(w_{1}),\dots,\mathit{dec}(w_{d}))=(\mathit{dec}(w_{d+d+1}),\dots,\mathit{dec}(w_{d+d+c}))\}.

Thus, the semantics of DNNs can be captured by NBAs. This is in general not true for plants. Hence, in the following, we restrict ourselves to plants that can also be captured by NBAs.

Definition 4 ($\omega$-regular plant)

A plant $P\colon\mathbb{R}^{d+c}\to\mathbb{R}^{d}$ is $\omega$-regular if there exists an NBA $\mathcal{A}_{P}$ over $\Sigma^{d+c+d}$ such that

L(\mathcal{A}_{P})=\{w\in\mathit{WF}_{d+c+d}\mid P(\mathit{dec}(w_{1}),\dots,\mathit{dec}(w_{d+c}))=(\mathit{dec}(w_{d+c+1}),\dots,\mathit{dec}(w_{d+c+d}))\}.

Now, both the DNN and the plant are given by NBAs. Hence, we can apply standard automata-theoretic constructions to capture a bounded number of applications of the control loop by repeatedly composing the NBA for the DNN and the NBA for the plant. To this end, we introduce the (parametric) composition operator $\circ_{k}$ constructing from two NBAs $\mathcal{A}_{1}$ and $\mathcal{A}_{2}$, which accept the graphs of two functions $f_{1}\colon\mathbb{R}^{k_{1}}\to\mathbb{R}^{k}$ and $f_{2}\colon\mathbb{R}^{k}\to\mathbb{R}^{k_{2}}$, an NBA $\mathcal{A}_{1}\circ_{k}\mathcal{A}_{2}$ accepting the graph of $\vec{x}\mapsto f_{2}(f_{1}(\vec{x}))$.

Lemma 1 (Lemma 4 of [23])

Let $k,k_{1},k_{2}\geq 0$ and let $\mathcal{A}_{1}$ and $\mathcal{A}_{2}$ be two NBAs over $\Sigma^{k_{1}+k}$ and $\Sigma^{k+k_{2}}$, respectively. Then, there exists an NBA $\mathcal{A}_{1}\circ_{k}\mathcal{A}_{2}$ over $\Sigma^{k_{1}+k_{2}}$ accepting the language

\{(u_{1},\dots,u_{k_{1}},w_{k+1},\dots,w_{k+k_{2}})\mid\exists(v_{1},\dots,v_{k})\text{ s.t. }(u_{1},\dots,u_{k_{1}},v_{1},\dots,v_{k})\in L(\mathcal{A}_{1})\text{ and }(v_{1},\dots,v_{k},w_{k+1},\dots,w_{k+k_{2}})\in L(\mathcal{A}_{2})\}.
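A construction along these lines can be sketched as follows: synchronize the two automata on the shared $k$ middle tracks, project those tracks away, and combine the two Büchi conditions with the standard two-flag intersection trick. The named-tuple NBA representation (delta as a set of (state, symbol, state) triples, symbols as tuples of track letters) is an assumption of this sketch:

```python
from collections import namedtuple
from itertools import product

NBA = namedtuple('NBA', 'states init delta acc')

def compose(A1, A2, k1, k, k2):
    """Sketch of A1 o_k A2 from Lemma 1."""
    def flag_update(f, q1, q2):
        if f == 1 and q1 in A1.acc:
            return 2  # saw an accepting A1 state; now wait for A2
        if f == 2 and q2 in A2.acc:
            return 1  # saw an accepting A2 state; start over
        return f

    delta = set()
    for (q1, a, p1), (q2, b, p2) in product(A1.delta, A2.delta):
        if a[k1:] != b[:k]:       # the shared middle tracks must agree
            continue
        symbol = a[:k1] + b[k:]   # keep only the outer k1 + k2 tracks
        for f in (1, 2):
            delta.add(((q1, q2, f), symbol, (p1, p2, flag_update(f, q1, q2))))

    states = set(product(A1.states, A2.states, (1, 2)))
    acc = {(q1, q2, 1) for q1 in A1.acc for q2 in A2.states}
    return NBA(states, (A1.init, A2.init, 1), delta, acc)
```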

Now we are ready to prove our main result of this section: NNCS reachability restricted to $\omega$-regular plants is semi-decidable. Note that this is tight, as the problem is undecidable as shown in Theorem 3.1: the plant just returning its control input as output is $\omega$-regular.

Theorem 4.1

The NNCS reachability problem is semi-decidable when restricted to $\omega$-regular plants.

Proof

We are given a problem instance $(N,P,X_{0},\varphi)$ and need to semi-decide whether there exist a $k\geq 0$ and an $\vec{x}_{0}\in X_{0}$ such that $(C_{P,N})^{k}(\vec{x}_{0})\in\varphi$. Let $d$ be the dimension of the states of $P$ and $c$ be the dimension of the control vectors computed by $N$.

Let $\widehat{\mathcal{A}}_{N}$ (over $\Sigma^{d+d+c}$) and $\mathcal{A}_{P}$ (over $\Sigma^{d+c+d}$) be the NBAs as in Corollary 3 and Definition 4. Then, we define $I_{0}$ to be an NBA accepting the graph of the $d$-ary identity function

L(I_{0})=\{w\in\mathit{WF}_{d+d}\mid(w_{1},\dots,w_{d})=(w_{d+1},\dots,w_{d+d})\}

and, for $k\geq 1$, $I_{k}=I_{k-1}\circ_{d}(\widehat{\mathcal{A}}_{N}\circ_{d+c}\mathcal{A}_{P})$.

By construction, we have

(w_{1},\dots,w_{d},w_{d+1},\dots,w_{d+d})\in L(I_{k})

if and only if

(\mathit{dec}(w_{d+1}),\dots,\mathit{dec}(w_{d+d}))=(C_{P,N})^{k}(\mathit{dec}(w_{1}),\dots,\mathit{dec}(w_{d})).

There are NBAs $\mathcal{A}_{0}$ and $\mathcal{A}_{\varphi}$ accepting $X_{0}$ and $\varphi$, as they are polyhedra. Both these NBAs have alphabet $\Sigma^{d}$, while each $I_{k}$ has alphabet $\Sigma^{d+d}$, where the first $d$ components encode the inputs and the last $d$ components encode the outputs. Hence, to intersect them with $I_{k}$, we widen $\mathcal{A}_{0}$ and $\mathcal{A}_{\varphi}$ to NBAs $\widehat{\mathcal{A}}_{0}$ and $\widehat{\mathcal{A}}_{\varphi}$ (both over $\Sigma^{d+d}$) such that

  • $L(\widehat{\mathcal{A}}_{0})$ contains all words in $\mathit{WF}_{d+d}$ encoding a vector $(x_{1},\dots,x_{d+d})$ such that $(x_{1},\dots,x_{d})$ is in $X_{0}\subseteq\mathbb{R}^{d}$ and $(x_{d+1},\dots,x_{d+d})\in\mathbb{R}^{d}$ is arbitrary, and

  • $L(\widehat{\mathcal{A}}_{\varphi})$ contains all words in $\mathit{WF}_{d+d}$ encoding a vector $(x_{1},\dots,x_{d+d})$ such that $(x_{1},\dots,x_{d})\in\mathbb{R}^{d}$ is arbitrary and $(x_{d+1},\dots,x_{d+d})$ is in $\varphi\subseteq\mathbb{R}^{d}$.

Now, there exist an $\vec{x}_{0}\in X_{0}$ and a $k\geq 0$ such that $(C_{P,N})^{k}(\vec{x}_{0})\in\varphi$ if and only if the language of $\widehat{\mathcal{A}}_{0}\cap I_{k}\cap\widehat{\mathcal{A}}_{\varphi}$ is nonempty.

In summary, to semi-decide the NNCS reachability problem for $\omega$-regular plants, we iteratively construct $I_{k}$ for $k\geq 0$ and check $\widehat{\mathcal{A}}_{0}\cap I_{k}\cap\widehat{\mathcal{A}}_{\varphi}$ for nonemptiness. ∎
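A sketch of this procedure: NBA nonemptiness amounts to finding a "lasso", i.e., an accepting state that is reachable from the initial state and lies on a cycle. The helpers intersect (a standard NBA product) and step (one application of the composition from Lemma 1) are assumed given:

```python
def nonempty(nba):
    """NBA nonemptiness via lasso search (nba as in the sketch above)."""
    succ = {}
    for q, _, p in nba.delta:
        succ.setdefault(q, set()).add(p)

    def reach(source):
        # states reachable from 'source' in one or more steps
        seen, todo = set(), [source]
        while todo:
            for p in succ.get(todo.pop(), ()):
                if p not in seen:
                    seen.add(p)
                    todo.append(p)
        return seen

    from_init = reach(nba.init) | {nba.init}
    return any(f in from_init and f in reach(f) for f in nba.acc)

def semi_decide(A0, Aphi, I0, step, intersect):
    """Unfold the control loop iteration by iteration; this loop terminates
    if and only if the target set is reachable."""
    I, k = I0, 0
    while True:
        if nonempty(intersect(A0, I, Aphi)):
            return k           # target reachable in k iterations
        I, k = step(I), k + 1  # I_{k+1} = I_k o_d (A_N o_{d+c} A_P)
```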

Let us remark that the construction in Theorem 4.1 does not require the initial set $X_{0}$ and the target set $\varphi$ to be polyhedral. It is sufficient that they are $\omega$-regular to effectively decide (non)emptiness of the intersection. The class of $\omega$-regular languages is strictly more expressive than polyhedral sets. (For example, the set of natural numbers is $\omega$-regular in the encoding used here, but it is not a polyhedron.) Thus, our result is more general than the statement of Theorem 4.1.

4.1 Multi-mode linear plants

In this subsection, we give an example of a plant model that falls into the class of ω\omega-regular languages. Our example is inspired by linear hybrid automata [3], which are finite state machines with constant-term ordinary differential equations (ODEs) in the modes (states) and guard conditions on the transitions.

Hybrid automata have two sources of nondeterminism: an enabled transition need not be taken (may-semantics), and multiple transitions may be enabled at the same time. Because we have restricted ourselves to deterministic plants in this work, we need to introduce some restrictions. First, we assume a fixed rational control period, and a transition can only be taken at the end of such a period. Then, the solution of the ODEs is a linear map, which can be analytically computed, and our system becomes discrete-time. Second, we require that exactly one guard is enabled, i.e., in each mode, all guards are pairwise-disjoint and their union is the universe. To simplify the presentation, we do not include discrete updates with the transitions but note that these can easily be added. We call the resulting model a multi-mode linear map.

Definition 5 (Multi-mode linear map)

A multi-mode linear map is a tuple $\mathcal{H}=(M,E,d,c,F,G)$ consisting of a finite set $M\subseteq\mathbb{N}$ of modes, a set of edges $E\subseteq M\times M$, input and control dimensions $d$ and $c$, a flow function $F\colon M\to\mathbb{Q}^{d\times d}\times\mathbb{Q}^{d\times c}\times\mathbb{Q}^{d}$ (mapping a mode to two matrices and a vector), and a guard function $G\colon E\to\mathcal{FUP}(d+c)$ (where $\mathcal{FUP}$ denotes the set of finite unions of polyhedra), satisfying

  • if $(m,m^{\prime})\in E$ and $(m,m^{\prime\prime})\in E$ with $m^{\prime}\neq m^{\prime\prime}$, then $G(m,m^{\prime})\cap G(m,m^{\prime\prime})=\emptyset$, and

  • $\bigcup_{m^{\prime}\in M}G(m,m^{\prime})=\mathbb{R}^{d+c}$ for all $m\in M$.

The function $f_{\mathcal{H}}\colon M\times\mathbb{R}^{d+c}\to M\times\mathbb{R}^{d}$ computed by $\mathcal{H}$ is defined as

f_{\mathcal{H}}(m,x_{1},\dots,x_{d},u_{1},\dots,u_{c})=(m^{\prime},x_{1}^{\prime},\dots,x_{d}^{\prime})

where $F(m)=(A,B,c)$,

(x_{1}^{\prime},\dots,x_{d}^{\prime})=A\cdot(x_{1},\dots,x_{d})^{T}+B\cdot(u_{1},\dots,u_{c})^{T}+c,

and $m^{\prime}$ is the unique mode such that

(x_{1}^{\prime},\dots,x_{d}^{\prime},u_{1},\dots,u_{c})\in G(m,m^{\prime}).

Note that the first component of inputs for $f_{\mathcal{H}}$ is restricted to modes of $\mathcal{H}$, not arbitrary reals as stipulated by the definition of plants. However, this is not an issue as long as the initial input has a mode in the first component, as $f_{\mathcal{H}}$ also returns only outputs that have a mode in the first component.
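One step of a multi-mode linear map can be sketched as follows (the container H with fields flow, guard, and edges is a hypothetical representation):

```python
import numpy as np

def mm_step(H, m, x, u):
    """One step of a multi-mode linear map: apply the affine flow of mode m,
    then switch to the unique mode whose guard contains (x', u)."""
    A, B, c = H.flow[m]                  # F(m) = (A, B, c)
    x_next = A @ x + B @ u + c
    point = np.concatenate([x_next, u])
    # by disjointness and covering, exactly one guard matches
    (m_next,) = [mp for (mm, mp) in H.edges
                 if mm == m and H.guard[(mm, mp)](point)]
    return m_next, x_next
```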

Lemma 2

Multi-mode linear maps are $\omega$-regular plants.

Proof (Sketch)

The following operations can be implemented by NBAs [23]:

  • Multiplication of real inputs with constants in $\mathbb{Q}$ and addition of reals. These two operations allow us to compute the output $(x_{1}^{\prime},\dots,x_{d}^{\prime})$ from $(x_{1},\dots,x_{d})$ and $(u_{1},\dots,u_{c})$.

  • Checking membership of a vector of reals in a fixed polyhedron. This allows us to compute the next mode $m^{\prime}$ from the current mode $m$, the current state $(x_{1},\dots,x_{d})$, and the current input $(u_{1},\dots,u_{c})$, as $m^{\prime}$ is determined by the membership of $(x_{1}^{\prime},\dots,x_{d}^{\prime},u_{1},\dots,u_{c})$ in a finite union of polyhedra.

This allows us to build an NBA that accepts the graph of $f_{\mathcal{H}}$ for every given multi-mode linear map $\mathcal{H}$. ∎

Corollary 4

The NNCS reachability problem is semi-decidable when the plant is restricted to multi-mode linear maps.

5 Conclusion

In this paper, we studied the reachability problem for dynamical systems controlled by deep neural networks. We showed that, for the common ReLU activations, the problem is undecidable even when the plant is trivial and the network is restricted to integral weights and a singleton initial set; furthermore, we can either fix the input and output dimensions to $3$ and the number of hidden layers to $6$, or use a single hidden layer. We then turned to the question of when the problem can be semi-decided; here we extended a recent encoding of neural networks in Büchi automata and showed that $\omega$-regular plants as well as input and target sets are sufficient for a semi-decision procedure; as an example, we demonstrated that a model akin to linear hybrid automata is $\omega$-regular.

Acknowledgments

We thank the participants of AISoLA 2023 for suggesting to study the NNCS reachability problem with one hidden layer.

This research was partly supported by the Independent Research Fund Denmark under reference number 10.46540/3120-00041B, DIREC - Digital Research Centre Denmark under reference number 9142-0001B, and the Villum Investigator Grant S4OS under reference number 37819.

References

  • [1] Althoff, M.: An introduction to CORA 2015. In: ARCH. EPiC Series in Computing, vol. 34, pp. 120–151. EasyChair (2015), https://doi.org/10.29007/zbkv
  • [2] Althoff, M., Frehse, G., Girard, A.: Set propagation techniques for reachability analysis. Annu. Rev. Control. Robotics Auton. Syst. 4, 369–395 (2021), https://doi.org/10.1146/annurev-control-071420-081941
  • [3] Alur, R., Courcoubetis, C., Henzinger, T.A., Ho, P.: Hybrid automata: An algorithmic approach to the specification and verification of hybrid systems. In: Hybrid Systems. LNCS, vol. 736, pp. 209–229. Springer (1992), https://doi.org/10.1007/3-540-57318-6_30
  • [4] Arora, R., Basu, A., Mianjy, P., Mukherjee, A.: Understanding deep neural networks with rectified linear units. In: ICLR. OpenReview.net (2018), https://openreview.net/forum?id=B1J_rgWRW
  • [5] Bogomolov, S., Forets, M., Frehse, G., Potomkin, K., Schilling, C.: JuliaReach: a toolbox for set-based reachability. In: HSCC. pp. 39–44. ACM (2019), https://doi.org/10.1145/3302504.3311804
  • [6] Cabessa, J.: Turing complete neural computation based on synaptic plasticity. PLoS ONE 14(10), e0223451 (2019), https://doi.org/10.1371/journal.pone.0223451
  • [7] Doyle, J.C., Francis, B.A., Tannenbaum, A.R.: Feedback control theory. Dover Publications (2013)
  • [8] Goodfellow, I.J., Bengio, Y., Courville, A.C.: Deep Learning. Adaptive computation and machine learning, MIT Press (2016), http://www.deeplearningbook.org/
  • [9] Henzinger, T.A., Kopke, P.W., Puri, A., Varaiya, P.: What’s decidable about hybrid automata? J. Comput. Syst. Sci. 57(1), 94–124 (1998), https://doi.org/10.1006/jcss.1998.1581
  • [10] Hyötyniemi, H.: Turing machines are recurrent neural networks. STeP 96, 13–24 (1996)
  • [11] Katz, G., Barrett, C.W., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: An efficient SMT solver for verifying deep neural networks. In: CAV. LNCS, vol. 10426, pp. 97–117. Springer (2017), https://doi.org/10.1007/978-3-319-63387-9_5
  • [12] Kochdumper, N., Schilling, C., Althoff, M., Bak, S.: Open- and closed-loop neural network verification using polynomial zonotopes. In: NFM. LNCS, vol. 13903, pp. 16–36. Springer (2023), https://doi.org/10.1007/978-3-031-33170-1_2
  • [13] Koiran, P., Moore, C.: Closed-form analytic maps in one and two dimensions can simulate universal Turing machines. Theor. Comput. Sci. 210(1), 217–223 (1999), https://doi.org/10.1016/S0304-3975(98)00117-0
  • [14] Larsen, K.G., Mikucionis, M., Taankvist, J.H.: Safe and optimal adaptive cruise control. In: Correct System Design. LNCS, vol. 9360, pp. 260–277. Springer (2015), https://doi.org/10.1007/978-3-319-23506-6_17
  • [15] Le, D.M., Greene, M.L., Makumi, W.A., Dixon, W.E.: Real-time modular deep neural network-based adaptive control of nonlinear systems. IEEE Control. Syst. Lett. 6, 476–481 (2022), https://doi.org/10.1109/LCSYS.2021.3081361
  • [16] Liu, C., Arnon, T., Lazarus, C., Strong, C.A., Barrett, C.W., Kochenderfer, M.J.: Algorithms for verifying deep neural networks. Found. Trends Optim. 4(3-4), 244–404 (2021), https://doi.org/10.1561/2400000035
  • [17] Lopez, D.M., Althoff, M., Forets, M., Johnson, T.T., Ladner, T., Schilling, C.: ARCH-COMP23 category report: Artificial intelligence and neural network control systems (AINNCS) for continuous and hybrid systems plants. In: ARCH. EPiC Series in Computing, vol. 96. EasyChair (2023), https://doi.org/10.29007/x38n
  • [18] Lopez, D.M., Choi, S.W., Tran, H., Johnson, T.T.: NNV 2.0: The neural network verification tool. In: CAV. LNCS, vol. 13965, pp. 397–412. Springer (2023), https://doi.org/10.1007/978-3-031-37703-7_19
  • [19] Miller, W.T., Sutton, R.S., Werbos, P.J.: Neural networks for control. MIT press (1995)
  • [20] Minsky, M.L.: Computation: Finite and Infinite Machines. Prentice-Hall (1967)
  • [21] Montúfar, G., Pascanu, R., Cho, K., Bengio, Y.: On the number of linear regions of deep neural networks. In: NeurIPS. pp. 2924–2932 (2014), https://proceedings.neurips.cc/paper/2014/hash/109d2dd3608f669ca17920c511c2a41e-Abstract.html
  • [22] Moore, C.: Unpredictability and undecidability in dynamical systems. Phys. Rev. Lett. 64(20), 2354–2357 (1990), https://doi.org/10.1103/PhysRevLett.64.2354
  • [23] Sälzer, M., Alsmann, E., Bruse, F., Lange, M.: Verifying and interpreting neural networks using finite automata. CoRR abs/2211.01022 (2022), https://doi.org/10.48550/arXiv.2211.01022
  • [24] Sälzer, M., Lange, M.: Reachability in simple neural networks. Fundam. Informaticae 189(3-4), 241–259 (2022), https://doi.org/10.3233/FI-222160
  • [25] Schilling, C., Forets, M., Guadalupe, S.: Verification of neural-network control systems by integrating Taylor models and zonotopes. In: AAAI. pp. 8169–8177. AAAI Press (2022), https://doi.org/10.1609/aaai.v36i7.20790
  • [26] Schilling, C., Lukina, A., Demirović, E., Larsen, K.G.: Safety verification of decision-tree policies in continuous time. In: NeurIPS. vol. 36, pp. 14750–14769. Curran Associates, Inc. (2023), https://proceedings.neurips.cc//paper_files/paper/2023/hash/2f89a23a19d1617e7fb16d4f7a049ce2-Abstract-Conference.html
  • [27] Siegelmann, H.T., Sontag, E.D.: Turing computability with neural nets. Applied Mathematics Letters 4(6), 77–80 (1991), https://doi.org/10.1016/0893-9659(91)90080-F
  • [28] Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I.J., Fergus, R.: Intriguing properties of neural networks. In: ICLR (2014), http://arxiv.org/abs/1312.6199