This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Coordinated Defense Allocation in Reach-Avoid Scenarios with Efficient Online Optimization

Junwei Liu, Zikai Ouyang, Jiahui Yang, Hua Chen, Haibo Lu and Wei Zhang Junwei Liu and Hua Chen are with the School of System Design and Intelligent Manufacturing, Southern University of Science and Technology, Shenzhen 518055, China {liujw,chenh6}@sustech.edu.cnZikai Ouyang and Wei Zhang are with the School of System Design and Intelligent Manufacturing, Southern University of Science and Technology, Shenzhen 518055, China, and also with the Peng Cheng Laboratory, Shenzhen 518000, China [email protected], [email protected]Jiahui Yang is with the Department of Mechanical and Energy Engineering, Southern University of Science and Technology, Shenzhen 518055, China [email protected]Haibo Lu is with the Peng Cheng Laboratory, Shenzhen 518000, China [email protected]
Abstract

In this paper, we present a dual-layer online optimization strategy for defender robots operating in multiplayer reach-avoid games within general convex environments. Our goal is to intercept as many attacker robots as possible without prior knowledge of their strategies. To balance optimality and efficiency, our approach alternates between coordinating defender coalitions against individual attackers and allocating coalitions to attackers based on predicted single-attack coordination outcomes. We develop an online convex programming technique for single-attack defense coordination, which not only allows adaptability to joint states but also identifies the maximal region of initial joint states that guarantees successful attack interception. Our defense allocation algorithm utilizes a hierarchical iterative method to approximate integer linear programs with a monotonicity constraint, reducing computational burden while ensuring enhanced defense performance over time. Extensive simulations conducted in 2D and 3D environments validate the efficacy of our approach in comparison to state-of-the-art approaches, and show its applicability in wheeled mobile robots and quadcopters.

Index Terms:
Multirobot systems, coordination, task allocation, optimization and optimal control.

I Introduction

Effective task execution in many multirobot systems necessitates both cooperation among team members and competition against adversarial agents. Multiplayer adversarial games, characterized by conflicting objectives, serve as an ideal platform to investigate the delicate interplay between cooperation and competition. Techniques derived from multiplayer adversarial games have been applied to a wide range of real-world applications, including search and rescue [1], autonomous vehicles [2], security [3], among others. Nevertheless, solving multiplayer adversarial games optimally and efficiently remains a formidable challenge due to the high dimensionality of the solution space, the need for fast decision-making, and the uncertainty induced by unknown opposing strategies. Although deep multi-agent reinforcement learning [4, 5, 6] offers powerful tools, these approaches currently encounter difficulties in managing non-stationarity during training, restricting their applicability in real-world adversarial scenarios.

This paper delves into a class of multiplayer adversarial games, known as multiplayer reach-avoid games, wherein two teams strive to attack or defend a target set without awareness of their opponent’s strategies. The reach-avoid game exhibits close connections with other types of adversarial games, such as pursuit-evasion [7, 8, 9, 10, 11], capture-the-flag [12, 13, 14], and perimeter defense [15, 16, 17], among others. In a typical multiplayer reach-avoid game, the attacker team tries to reach the target set with as many robots as possible while avoiding capture, whereas the defender team seeks to prevent such incursions by capturing attackers outside the target set. In this work, we focus on the defensive perspective of multiplayer reach-avoid games within convex environments, proposing an effective strategy for the defender team to minimize the number of successful attackers entering the target set, irrespective of the attacker team’s decisions.

A conventional approach to devising defense strategies for multirobot systems employs the reach-avoid differential game framework. This framework casts the problem as a zero-sum game with two opposing players, aiming to determine whether the joint state can be guided to a target set without entering an avoid set [18], [19]. Unlike the traditional Hamilton-Jacobi reachability problem [20], reach-avoid differential games introduce additional challenges due to the state constraints imposed by the avoid set. Within the reach-avoid differential game framework, defense strategies can be generated by solving the Hamilton-Jacobi-Isaacs (HJI) equation or a related variant offline [21]. However, the HJI method and its numerical approximation [22], [23] struggle to decouple the inherent computational complexities, leading to poor scalability in large-scale multirobot systems.

To overcome these limitations, our work focuses on the decomposition method, which partitions the problem into two distinct yet interrelated subproblems: single-attack defense coordination and defense allocation. The former involves developing a coordination strategy for a coalition of defenders to prevent an individual attacker from reaching the target set. If successful, we say the defender coalition wins against the attacker. The latter, on the other hand, seeks to optimally assign defender coalitions to attackers by evaluating the predicted outcomes of selected single-attack coordination scenarios. A defense strategy can be obtained by solving these subproblems in an alternating manner. In contrast to the HJI method, the decomposition method offers a more tractable solution by leveraging problem structure and improving computational efficiency. We introduce online optimization techniques for both subproblems, ensuring the real-time adaptability and scalability of our approach for dynamic interactions.

As the foundation of our framework, we present an online convex programming-based defense coordination algorithm for any defender coalition against a single attacker. Our algorithm adapts to every joint state of the defender coalition and attacker, and identifies the maximum defense-winning region of initial joint states, regardless of any admissible strategy employed by the attacker. The underlying principle behind our approach leverages the concept of the safe-reachable set to devise objective functions. The safe-reachable set represents the collection of positions an attacker can reach without being captured by any defender given the current joint state. We then translate the single-attack defense coordination problem into an online robust maximization problem, where the robustness accounts for the worst-case scenario the attacker could employ. Solving this online robust maximization problem yields the corresponding defense coordination strategy.

Building upon the single-attack defense coordination results, we develop an allocation algorithm that ensures a monotonic improvement in defense performance. This algorithm incorporates predicted outcomes from single-attack defense coordination scenarios to formulate the problem as integer linear programs (ILPs) that rely on the joint state. We employ a heuristic method to tackle the ILPs and incorporate a monotonicity constraint. This constraint guarantees that the expected number of attackers unable to reach the target set is monotonically non-decreasing over time, facilitating a sustained improvement in defense performance as the game progresses. At the core of our algorithm is a hierarchical iterative method that approximately solves the ILPs, with each hierarchy defined by the concepts of active defense set and irreducibility to balance optimality and efficiency. By aggregating the suboptimal solutions obtained across all hierarchies, we derive a solution that substantially diminishes the number of selected pairs of defender coalitions and attackers within the ILPs, while preserving a certain degree of optimality.

I-A Related Work

In this subsection, we provide an overview of the current state-of-the-art in single-attack defense coordination and defense allocation, highlighting the prevailing issues and unresolved challenges within each domain.

One approach to single-attack defense coordination involves constructing the HJI solution analytically [24], [25]. Garcia et al. [24] present the derivation of the optimal defense strategy for two defenders against a single attacker in a 3D scenario with a point target set and zero capture radii. Another approach employs mixed pursuit-defense strategies, where some defenders are designated as pursuers focusing on directly capturing the attacker [26], [27]. In [27], it is shown that the attacker is guaranteed capture outside a circular target set within a bounded 2D convex domain for a set of initial joint states. This is achieved by assigning one defender to perform defense, while the remaining defenders implement the area-minimization strategy based on the attacker’s Voronoi cell, as developed in [28, 29].

Recently, a series of geometric-based works by Yan et al. has been proposed for deriving defense coordination strategies [30, 31, 32, 33]. Central to these approaches is the set that characterizes the attacker’s locations which can be safely reached through admissible strategies. This concept was first introduced in [34] as the “safe region” and later reformulated for cooperative pursuit in [35] as the “safe-reachable set”. We adopt the term safe-reachable set to emphasize the combination of safety and reachability. In cases where the capture radii are zero and the maximum speeds of defenders are all greater than that of the attacker, the safe-reachable set reduces to the Apollonius circle/sphere. Building on the Apollonius circle (resp. sphere), an analytical defense strategy is given for a set of initial joint states involving two defenders in a 2D (or 3D) convex domain with a line segment (resp. plane) shaped target set [30] (resp. [32]), ensuring the attacker cannot enter the target set. This defense coordination technique has been extended to accommodate an arbitrary number of defenders in both specific 2D and 3D scenarios [31], [33]. Notably, the approach presented in [33] is capable of handling cases involving non-zero capture radii.

Despite the progress made in single-attack defense coordination strategies, the existing approaches have mainly concentrated on specific cases with particular target sets or scenarios with zero capture radii. As a result, a comprehensive solution for general convex target sets in convex environments and non-zero capture radii remains an open challenge. Moreover, the defense strategies developed thus far are applicable only to a portion of the joint space, which may not adequately address real-world situations in which the attacker deviates from the optimal strategy. These limitations underscore the need for more adaptable defense coordination methods that can be applied to a broader range of scenarios.

In the context of defense allocation, the problem can be reformulated into joint state-dependent ILPs. Distinct from classical multirobot allocation problems [36, 37], the objective function here hinges on the estimated outcomes involving all defenders against a single attacker, which generally suffers from computational burdens when attempting a direct solution. In [38], the authors rely on the HJI method to exclusively examine pairwise interactions between defenders and attackers. While this approach enables polynomial-time solutions through the Ford-Fulkerson algorithm [39] or its improvement, the Hopcroft-Karp algorithm [40], it oversimplifies the problem by reducing the level of cooperation among defenders. The study presented in [41] concentrates on the assignment of performance improvement under the assumption that all attackers can be captured, which restricts the applicability of their approach. By the results of the geometric-based approach to single-attack defense coordination, the ILP in [31] is resolved exclusively at the initial time, with the resulting assignment maintained throughout the entire process. This approach can guarantee the capture of a certain number of attackers, whereas it lacks robustness in response to the dynamic interactions among agents. To overcome this limitation, the approach introduced in [31] is further refined in [33] by approximately solving the ILPs with a polynomial-time method. Nevertheless, this solution entails employing all possible combinations of single-attack defense coordination, which can be time-consuming in practical applications.

From the preceding discussion, it is evident that existing defense allocation methods fail to tackle the time-consuming issue associated with identifying the estimated outcomes of selected pairs of defender coalitions and individual attackers, particularly when the estimated outcomes cannot be readily provided offline. Moreover, when the optimal solution of the ILP remains unattainable due to efficiency constraints, current approaches fall short in offering guidance to enhance cooperative defense performance. Consequently, there is a demand for innovative solutions on defense allocation that balance optimality and computational efficiency.

I-B Contributions

Our contributions are summarized as fourfold:

  1. 1.

    Our defense strategy extends the applicability over previous methods to convex domains that contain a general convex target set in both 2D and 3D spaces, and it accommodates scenarios involving defenders with nonzero capture radii.

  2. 2.

    Our single-attack defense coordination algorithm, which is implemented through online convex programming, is designed to yield the maximum defense-winning region. The algorithm is applicable to all joint states, rather than being confined to a limited portion of the joint space as seen in existing strategies. This allows for a higher chance of winning when facing non-optimal attack strategies.

  3. 3.

    We propose a defense allocation algorithm that effectively reduces the computational load associated with determining the predicted outcomes of selected defender coalition-attacker pairs. This efficiency is a major advancement, as existing methods often struggle with time-consuming computations that hinder their practical use. Moreover, our algorithm is designed to incrementally improve defense performance, ensuring continual enhancement over time. This is particularly valuable in situations where the optimal allocation is unattainable due to efficiency constraints.

  4. 4.

    We provide a comprehensive evaluation of our algorithm through both simulations and real-world physics engine validations in 2D and 3D scenarios. This rigorous testing showcases the superiority of our algorithm in comparison to existing state-of-the-art approaches, demonstrating improvements in both optimality and efficiency.

I-C Organization and Notation

The remainder of this paper is organized as follows. In Section II, we formulate the problem of cooperative defense in multiplayer reach-avoid games and decompose it into two subproblems: single-attack defense coordination and defense allocation. In Section III, we present a defense strategy for a coalition against a single attacker. Section IV reformulates the defense allocation problem as an ILP and presents a heuristic method to approximate the solution. Finally, Section V presents the simulation results that benchmark the effectiveness of our approach.

We use the following notation throughout the paper. Let n\mathbb{R}^{n} be the nn-dimensional Euclidean space, and ||||||\cdot|| be the Euclidean norm. Given a vector vv, the normalizer 𝒩(v)\mathscr{N}(v) is defined as 𝒩(v)=vv\mathscr{N}(v)=\frac{v}{||v||} if v0v\neq 0 and 0 otherwise. In the scalar case, 𝒩\mathscr{N} is equivalent to the sign function. The indicator function of a set SS is denoted by 𝟏S()\mathbf{1}_{S}(\cdot) and defined as 𝟏S(x)=1\mathbf{1}_{S}(x)=1 if xSx\in S and 0 otherwise. The cardinality of a set SS is denoted by |S||S|. Given a set of vectors v1,,vnv_{1},\ldots,v_{n}, (v1,,vn)(v_{1},\ldots,v_{n}) is used to denote the column vector consisting of the entries of viv_{i} for i=1,,ni=1,\ldots,n in sequence. The Cartesian product of NN copies of a set SS is denoted by SNS^{N}.

II Problem Formulation

Refer to caption
Figure 1: A 2D instance of a multiplayer reach-avoid game.

Consider a multiplayer reach-avoid game involving a team of defender robots (called defenders) competing against a team of attacker robots (called attackers) within a confined domain containing a designated target set. The primary objective of the attacker team is to deploy as many robots as possible to reach the target set while avoiding capture by any defender, whereas the defender team strives to prevent the attackers from entering the target set. An illustrative depiction of this game is provided in Fig. 1. This paper primarily focuses on the defense perspective, wherein we aim to develop a strategy for the defending team to minimize the total number of attackers that successfully reach the target set, without prior knowledge of the attackers’ strategy.

We study scenarios where the domain and target set, denoted as 𝒟\mathcal{D} and 𝒢\mathcal{G} respectively, are bounded, closed, and convex. Concretely, we assume that 𝒟\mathcal{D} and 𝒢\mathcal{G} are defined by the zero sublevel sets of smooth convex functions dkd_{k} and glg_{l}, where kk and jj belong to the index sets D\mathcal{I}_{D} and G\mathcal{I}_{G}, respectively. Thus, these sets can be expressed as follows:

𝒟=\displaystyle\mathcal{D}= {qnmaxkDdk(q)0}\displaystyle\{q\in\mathbb{R}^{n}\mid\max_{k\in\mathcal{I}_{D}}d_{k}(q)\leq 0\} (1)
𝒢=\displaystyle\mathcal{G}= {qnmaxlGgl(q)0}.\displaystyle\{q\in\mathbb{R}^{n}\mid\max_{l\in\mathcal{I}_{G}}g_{l}(q)\leq 0\}. (2)

In this context, we model the robots as particles, with defenders indexed by 𝒩={1,,N}\mathcal{N}=\{1,\ldots,N\} and attackers indexed by ={1,,M}\mathcal{M}=\{1,\ldots,M\}. Besides, the robots are assumed to move within the domain 𝒟\mathcal{D} with single-integrator dynamics:

p˙id=\displaystyle\dot{p}^{d}_{i}= uid,i𝒩\displaystyle u^{d}_{i},~{}\forall i\in\mathcal{N} (3)
p˙ja=\displaystyle\dot{p}^{a}_{j}= uja,j\displaystyle u^{a}_{j},~{}\forall j\in\mathcal{M} (4)

where pidp^{d}_{i} and uidu^{d}_{i} (resp. pjap^{a}_{j} and ujau^{a}_{j}) represent the position and velocity of defender ii (resp. attacker jj).

We define x=(p1d,,pNd,p1a,,pMa)x=(p^{d}_{1},\ldots,p^{d}_{N},p^{a}_{1},\ldots,p^{a}_{M}) as the joint state of the system (3)-(4). Thus, the joint state satisfies the constraint:

x𝒟N+M.x\in\mathcal{D}^{N+M}. (5)

Additionally, we define the control inputs of defender ii and attacker jj as uidu^{d}_{i} and ujau^{a}_{j}, respectively. These control inputs are subject to the following constraints:

uid𝒰id=\displaystyle u^{d}_{i}\in\mathcal{U}^{d}_{i}= {vidnvidvi,maxd},i𝒩\displaystyle\{v^{d}_{i}\in\mathbb{R}^{n}\mid||v^{d}_{i}||\leq v^{d}_{i,\max}\},~{}\forall i\in\mathcal{N} (6)
uja𝒰ja=\displaystyle u^{a}_{j}\in\mathcal{U}^{a}_{j}= {vjanvjavj,maxa},j.\displaystyle\{v^{a}_{j}\in\mathbb{R}^{n}\mid||v^{a}_{j}||\leq v^{a}_{j,\max}\},~{}\forall j\in\mathcal{M}. (7)

Here, vi,maxdv^{d}_{i,\max} and vj,maxav^{a}_{j,\max} are the maximum speeds of defender ii and attacker jj, respectively. To prevent attackers from easily outrunning defenders, we impose a constraint on the maximum speed ratio, requiring that defenders possess equal or greater speed capabilities than attackers:

γij=vi,maxdvj,maxa1,(i,j)𝒩×\gamma_{ij}=\frac{v^{d}_{i,\max}}{v^{a}_{j,\max}}\geq 1,~{}\forall(i,j)\in\mathcal{N}\times\mathcal{M} (8)

where γij\gamma_{ij} denotes the ratio of defender ii’s maximum speed relative to that of attacker jj.

In the multiplayer reach-avoid game, interactions between defenders and attackers are determined by their relative positions. Specifically, attacker jj is said to be captured by defender ii if the distance between their positions is less than a positive constant rir_{i}:

pjapid<ri||p^{a}_{j}-p^{d}_{i}||<r_{i} (9)

where rir_{i} represents the iith capture radius. This leads to the capture region of defender ii defined as i={q𝒟qpid<ri}\mathcal{R}_{i}=\{q\in\mathcal{D}\mid||q-p^{d}_{i}||<r_{i}\}.

To simplify our analysis, we assume that attackers that have been captured or have reached the target set become stationary, while defenders can proceed with their tasks after capturing an attacker. This assumption allows us to focus on active attackers, which are defined as those attackers who have neither been captured nor reached the target set. The active attackers at the joint state xx are indexed by a(x)={jρj(xj)=1}\mathcal{M}_{a}(x)=\{j\in\mathcal{M}\mid\rho_{j}(x_{j})=1\}, where xj=(p1d,,pNd,pja)x_{j}=(p^{d}_{1},\ldots,p^{d}_{N},p^{a}_{j}) and ρj\rho_{j} is defined as:

ρj(xj)={1,attackerjis active0,otherwise.\rho_{j}(x_{j})=\begin{cases}1,&\text{attacker}~{}j~{}\text{is active}\\ 0,&\text{otherwise}.\end{cases} (10)

Keeping this in mind, we can modify the input constraint for attacker jj as follows:

uja𝒰~ja(xj)={{0},ρj(xj)=0𝒰ja,ρj(xj)=1,j.u^{a}_{j}\in\tilde{\mathcal{U}}^{a}_{j}(x_{j})=\begin{cases}\{0\},&\rho_{j}(x_{j})=0\\ \mathcal{U}^{a}_{j},&\rho_{j}(x_{j})=1,\end{cases}~{}\forall j\in\mathcal{M}. (11)

In this game, we adopt a state feedback information pattern wherein each team makes decisions according to the current joint state without accessing the control inputs of the opposing team. Under this information pattern, an admissible strategy for defender ii (resp. attacker jj) is defined as a mapping from the joint state xx to the input constraint set 𝒰id\mathcal{U}^{d}_{i} (resp. 𝒰~ja(xj)\tilde{\mathcal{U}}^{a}_{j}(x_{j})) such that uid=πid(x)u^{d}_{i}=\pi^{d}_{i}(x) (resp. uja=πja(x)u^{a}_{j}=\pi^{a}_{j}(x)). We define an admissible defense strategy as the collection of all defenders’ admissible strategies, denoted by πd=(π1d,,πNd)\pi^{d}=(\pi^{d}_{1},\ldots,\pi^{d}_{N}). Likewise, an admissible attack strategy, denoted by πa=(π1a,,πMa)\pi^{a}=(\pi^{a}_{1},\ldots,\pi^{a}_{M}), is defined as the collection of all attackers’ admissible strategies.

The outcome of the multiplayer reach-avoid game is evaluated in terms of a payoff function, which quantifies the total number of attackers reaching the target set up to the terminal time tft_{f}. We consider the terminal time tft_{f} to be free, meaning that the game continues until all attackers have been captured or have reached the target set. Given an initial joint state x0x_{0} and a pair of admissible defense and attack strategies πd\pi^{d} and πa\pi^{a}, the payoff function is formally defined as:

J(x0,πd,πa)=j𝟏𝒢(pja(tf)).J(x_{0},\pi^{d},\pi^{a})=\sum_{j\in\mathcal{M}}\mathbf{1}_{\mathcal{G}}(p^{a}_{j}(t_{f})). (12)

The goal of the defender team is to minimize the value of the payoff function JJ by choosing appropriate defense strategies, regardless of any admissible attack strategy.

Problem 1 (Cooperative Defense in Multiplayer Reach-Avoid Games)

Given a domain 𝒟\mathcal{D} and a target set 𝒢\mathcal{G} as defined by (1) and (2), along with an initial joint state x0x_{0} of the system (3)-(4), we seek to find an admissible defense strategy πd\pi^{d} that minimizes the payoff function JJ against any admissible attack strategy πa\pi^{a}. This corresponds to solving the following minimax optimal control problem:

minπdmaxπaJ(x0,πd,πa)s.t.x(0)=x0,(3),(4),(5),(6),(8),(11).\begin{split}&\min_{\pi^{d}}\max_{\pi^{a}}~{}J(x_{0},\pi^{d},\pi^{a})\\ &~{}\mathrm{s.t.}~{}~{}x(0)=x_{0},~{}\eqref{dyn:d},~{}\eqref{dyn:a},~{}\eqref{con:state},~{}\eqref{con:d_input},~{}\eqref{con:ratio},~{}\eqref{con:a_input'}.\end{split}

Solving Problem 1 poses significant challenges due to several factors, such as the high dimensionality of the joint state, the discreteness of the payoff function, and the presence of both cooperative and competitive interactions among agents. A widely used approach to tackling these challenges is to employ the differential game formulation, which involves solving the HJI equation or its invariant. However, as the dimension of the joint state increases, the computational complexity of this method becomes intractable, particularly in multi-agent scenarios.

To tackle the aforementioned challenges, we decompose Problem 1 into two interconnected subproblems:

  • Single-Attack Defense Coordination Problem: Given a coalition of defenders and a single attacker, design an admissible strategy for the coalition to prevent the attacker from reaching the target set, irrespective of any admissible strategy employed by the attacker. In this context, a coalition refers to a group of defenders collaboratively working to protect against a specific attacker.

  • Multi-Attack Defense Allocation Problem: Building upon the solution to the Single-Attack Defense Coordination Problem, design an allocation algorithm that assigns coalitions to active attackers, with the objective of minimizing the payoff function (12). Recall that, active attackers are those that have not been captured or have not yet reached the target set.

The Single-Attack Defense Coordination Problem is solved in a continuous-time manner, wherein the outcome of the multiplayer reach-avoid game is binary: the attacker either succeeds in reaching the target set or is captured outside the target set. We say that the coalition wins if the attacker can never enter the target set; otherwise, the coalition is considered defeated. Consequently, the problem of single-attack defense coordination is to find a winning strategy for the coalition against the attacker. Conversely, the Multi-Attack Defense Allocation Problem is handled in a discrete-time framework, where each time instant corresponds to an implementation of the allocation. We refer to these time instants as allocation times. At each allocation time, the allocation algorithm reevaluates the assignment of active attackers to coalitions based on the current joint state.

Integrating the solutions of these two subproblems results in a holistic solution to Problem 1. Given the current joint state, we initially identify the set of active attackers a\mathcal{M}_{a} by checking their relative distances to the defenders up to the last time instant. Subsequently, the solution to the Multi-Attack Defense Allocation Problem dictates the assignment of active attackers to coalitions. For each specified coalition-attacker pair, we then apply the solution to the Single-Attack Defense Coordination Problem to compute the control input for each defender in the corresponding coalition. This dual-layer scheme continues to update in real-time, effectively decomposing the high-dimensional, complex minimax optimal control problem into smaller, more tractable problems.

III Defense Coordination Against A Single Attacker

In this section, we present a defense strategy for a coalition in multiplayer reach-avoid games involving a single attacker. Our approach begins with a formal definition of the safe-reachable set and a derivation of its sublevel set representation. Leveraging the minimum squared distance between the safe-reachable set and the target set, we construct an objective function that transcribes the single-attack defense coordination problem into its online robust maximization. The solution to this robust maximization leads to an online convex programming-based defense strategy, which guarantees the coalition’s victory for a specific region of initial joint states. To increase the chances of winning against non-optimal attack strategies, we further propose a dual-mode switching defense strategy to fit with all initial joint states.

III-A Safe-Reachable Set

In the multiplayer reach-avoid game with a single attacker, the binary outcome of an initial joint state is determined by the reach-avoid set [18]. This set comprises joint states from which the attacker can reach the target set without being captured by any of the defenders. By solving an HJI equation numerically, the reach-avoid set can be represented as a sublevel set of the HJI solution [42]. However, the backward computation associated with this method suffers from the curse of dimensionality, hindering its application to multi-agent systems [23]. In contrast, the safe-reachable set seeks to collect the attacker’s reachable locations that remain capture-free given the current joint state [35]. This set can be computed using forward computation, which only concerns the attacker’s position state and does not involve solving for the target set reachability, making it more computationally efficient. We will demonstrate that under our problem settings, the binary outcome of an initial joint state can also be characterized by the safe-reachable set.

A formal definition of the safe-reachable set for a coalition-attacker pair is given below. Consider a coalition 𝒞\mathcal{C} and an attacker jj with dynamics described by (3) and (4), respectively, and denote their joint state as xj𝒞=(pid,i𝒞;pja)x^{\mathcal{C}}_{j}=(p^{d}_{i},i\in\mathcal{C};p^{a}_{j}). In light of (9), the capture-free condition for attacker jj can be expressed as follows:

maxi𝒞si(pid,pja)0\max_{i\in\mathcal{C}}s_{i}(p^{d}_{i},p^{a}_{j})\leq 0 (13)

where si(pid,pja)=ripidpjas_{i}(p^{d}_{i},p^{a}_{j})=r_{i}-||p^{d}_{i}-p^{a}_{j}||. To account for the dependence of agent trajectories on their initial states and strategies, we introduce the notations χja(;pja(0),πja)\chi^{a}_{j}(\cdot;p^{a}_{j}(0),\pi^{a}_{j}) and χid(;pid(0),πid)\chi^{d}_{i}(\cdot;p^{d}_{i}(0),\pi^{d}_{i}), which respectively denote the trajectory of attacker jj over time starting from pja(0)p^{a}_{j}(0) using the attack strategy πja\pi^{a}_{j}, and the trajectory of defender ii over time starting from pid(0)p^{d}_{i}(0) using the defense strategy πid\pi^{d}_{i}.

Definition 1 (Safe-Reachable Set)

Given a joint state xj𝒞x^{\mathcal{C}}_{j} of (𝒞,j)(\mathcal{C},j), the safe-reachable set (SRS) Ωj𝒞(xj𝒞)\Omega^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}) for attacker jj induced by the coalition 𝒞\mathcal{C} consists of all positions q𝒟q\in\mathcal{D} satisfying the following properties:

  1. 1.

    (Reachability) There exists a time instant τ\tau and an admissible attack strategy πja\pi^{a}_{j} over the interval [0,τ][0,\tau] such that the attacker jj can reach position qq at time τ\tau:

    q=χja(τ;pja,πja);q=\chi^{a}_{j}(\tau;p^{a}_{j},\pi^{a}_{j});
  2. 2.

    (Safety) For any admissible defense strategy πid\pi^{d}_{i} used by the defenders i𝒞i\in\mathcal{C}, the capture-free condition (13) holds for all t[0,τ]t\in[0,\tau]:

    maxi𝒞si(χid(t;pid,πid),χja(t;pja,πja))0,t[0,τ].\max_{i\in\mathcal{C}}s_{i}(\chi_{i}^{d}(t;p^{d}_{i},\pi^{d}_{i}),\chi^{a}_{j}(t;p^{a}_{j},\pi^{a}_{j}))\leq 0,~{}\forall t\in[0,\tau].

The SRS offers a predictive way to evaluate the capture-free condition (13) by incorporating the dynamic and input constraints of both opponents. When the SRS is nonempty, it implies that the attacker is currently situated within the SRS, and the capture-free condition is satisfied for the current joint state, as per Definition 1. Moreover, the size of the SRS can be used to anticipate the attacker’s ability to move freely without compromising safety. A larger SRS indicates that the attacker has more options for safe movement, regardless of any defense strategies that may be employed.

Similar to the reach-avoid set, the SRS admits a sublevel set representation. Consider a point qq inside the SRS that attacker jj can reach at time τ\tau. Simultaneously, defender ii can arrive at point qid=pid+vi,maxdqpidqpidτq^{d}_{i}=p^{d}_{i}+v^{d}_{i,\max}\frac{q-p^{d}_{i}}{||q-p^{d}_{i}||}\tau using the constant strategy vi,maxdqpidqpidv^{d}_{i,\max}\frac{q-p^{d}_{i}}{||q-p^{d}_{i}||}. According to Definition 1, qq adheres to the capture-free condition maxi𝒞si(qid,q)=maxi𝒞(ri|qpidvi,maxdτ|)0\max_{i\in\mathcal{C}}s_{i}(q^{d}_{i},q)=\max_{i\in\mathcal{C}}(r_{i}-\bigl{\lvert}||q-p^{d}_{i}||-v^{d}_{i,\max}\tau\bigr{\rvert})\leq 0. On the other hand, since attacker jj can safely reach qq, the time τ\tau must fulfill τ[τmin,qpidvi,maxd)\tau\in[\tau_{\min},\frac{||q-p^{d}_{i}||}{v^{d}_{i,\max}}), with τmin=qpjavj,maxa\tau_{\min}=\frac{||q-p^{a}_{j}||}{v^{a}_{j,\max}} representing the minimum time for attacker jj to reach qq. This leads to maxi𝒞(ri|qpidvi,maxdτmin|)=maxi𝒞(ri+γijqpjaqpid)0\max_{i\in\mathcal{C}}(r_{i}-\bigl{\lvert}||q-p^{d}_{i}||-v^{d}_{i,\max}\tau_{\min}\bigr{\rvert})=\max_{i\in\mathcal{C}}(r_{i}+\gamma_{ij}||q-p^{a}_{j}||-||q-p^{d}_{i}||)\leq 0, which can be rewritten as maxi𝒞cij(q,xij)0\max_{i\in\mathcal{C}}c_{ij}(q,x_{ij})\leq 0 with

cij(q,xij)=(γijqpja+ri)2qpid2c_{ij}(q,x_{ij})=(\gamma_{ij}||q-p^{a}_{j}||+r_{i})^{2}-||q-p^{d}_{i}||^{2} (14)

and xij=(pid,pja)x_{ij}=(p^{d}_{i},p^{a}_{j}). Moreover, satisfying inequality (14) for every i𝒞i\in\mathcal{C} is sufficient to ensure that qq lies within the SRS.

Proposition 1 (Sublevel Set Representation)

The SRS defined in Definition 1 is closed and convex, taking the form

Ωj𝒞(xj𝒞)=i𝒞{q𝒟cij(q,xij)0}\Omega^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j})=\bigcap_{i\in\mathcal{C}}\Big{\{}q\in\mathcal{D}\mid c_{ij}(q,x_{ij})\leq 0\Big{\}} (15)

where cijc_{ij} is given by equation (14).

Proof: We have shown that maxi𝒞cij(q,xij)0\max_{i\in\mathcal{C}}c_{ij}(q,x_{ij})\leq 0 for each qΩj𝒞(xj𝒞)q\in\Omega^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}), and the remaining is detailed in Appendix -A. \blacksquare

The SRS serves as a generalization of both the Voronoi diagram and Apollonius diagram by accommodating nonzero capture radii. As evident from equation (15), when all the capture radii equal zero, the SRS simplifies to the Voronoi cell if γij=1\gamma_{ij}=1 and to the Apollonius circle/sphere if γij>1\gamma_{ij}>1. Unlike the Voronoi diagram and Apollonius diagram, the SRS partitions the space based on the attacker’s capacity to move freely without being captured, factoring in the defenders’ capture radii, as depicted in Fig. 2. This distinctive characteristic of the SRS enables a more realistic representation of the interaction between the attacker and defenders, enhancing defense coordination with unpredictable attacker movements.

Refer to caption
Refer to caption
Figure 2: Visualization of the SRS under different joint states in a 2D domain. The blue disks represent the defenders’ capture regions, and the red region illustrates the SRS for a specific attacker. The parameters are chosen as γ1j=1\gamma_{1j}=1, γ2j=2\gamma_{2j}=2, r1=2r_{1}=2 and r2=0.5r_{2}=0.5. As displayed from left to right, the SRS varies with joint states, and its size decreases as the distances between the attacker and the defenders diminish while accounting for the capture radii of the defenders.

III-B Single-Attack Objective Function

Having the SRS concept, we next develop an objective function for the single-attack defense coordination problem. Given that the SRS pertains to the attacker’s forward reachability, it is natural to define an objective function that connects the SRS and the target set in terms of their minimum distance. With this in mind, we propose the minimum squared distance between the SRS and the target set as the objective function, which we refer to as the single-attack objective function:

Φj𝒞(xj𝒞)=minqΩj𝒞(xj𝒞)q~𝒢qq~2.\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j})=\min_{\genfrac{}{}{0.0pt}{2}{q\in\Omega^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j})}{\tilde{q}\in\mathcal{G}}}||q-\tilde{q}||^{2}. (16)

Since both the SRS and the target set are convex and admit sublevel set representations, according to Proposition 1 and the assumption on the target set, the value of Φj𝒞\Phi^{\mathcal{C}}_{j} can be determined by solving the following parametric convex program:

min(q,q~)2nqq~2s.t.cij(q,xij)0,i𝒞dk(q)0,kDgl(q~)0,lG\begin{split}&\min_{(q,\tilde{q})\in\mathbb{R}^{2n}}~{}~{}||q-\tilde{q}||^{2}\\ &~{}~{}~{}~{}\mathrm{s.t.}~{}~{}c_{ij}(q,x_{ij})\leq 0,~{}\forall i\in\mathcal{C}\\ &~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}d_{k}(q)\leq 0,~{}\forall k\in\mathcal{I}_{D}\\ &~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}g_{l}(\tilde{q})\leq 0,~{}\forall l\in\mathcal{I}_{G}\end{split} (17)

where the joint state xj𝒞x^{\mathcal{C}}_{j} serves as the parametric vector.

The single-attack objective function provides a quantitative measure for the binary outcome of the single-attacker multi-defender reach-avoid game. It is essential to note that the value of Φj𝒞\Phi^{\mathcal{C}}_{j} is always nonnegative for a nonempty SRS, and positive if the intersection between the SRS and the target set is empty. This suggests that retaining the positivity of Φj𝒞\Phi^{\mathcal{C}}_{j} is critical in preventing the attacker from reaching the target set. Consequently, a winning guarantee condition for the coalition can be derived in terms of the single-attack objective function.

Lemma 1 (Defense-Winning Condition)

For a given coalition-attacker pair (𝒞,j)(\mathcal{C},j) at the initial joint state, the coalition 𝒞\mathcal{C} is guaranteed to win against attacker jj if and only if there exists an admissible defense strategy such that, for any admissible attack strategy, the single-attack objective function remains positive throughout the duration:

Φj𝒞(xj𝒞(t))>0,t[0,tf).\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}(t))>0,~{}\forall t\in[0,t_{f}).

Proof: For sufficiency, if Φj𝒞(xj𝒞(t))\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}(t)) remains positive for all t[0,tf)t\in[0,t_{f}), then the SRS cannot intersect the target set during this time interval. Since the attacker’s position is always contained within the non-empty SRS, the attacker can never enter the target set. This implies that the coalition is guaranteed to win the game. For necessity, assume for the sake of contradiction that the coalition wins while Φj𝒞(xj𝒞(T))=0\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}(T))=0 for some T[0,tf)T\in[0,t_{f}). According to Definition 1, there is an admissible attack strategy starting from time TT that drives attacker jj to the target set within a finite time. This, however, contradicts the assumption that the coalition wins. \blacksquare

Remark 1 (Online Robust Maximization)

Lemma 1 confirms that if the strict zero superlevel set of Φj𝒞\Phi^{\mathcal{C}}_{j} is robustly positively invariant [43] under an admissible defense strategy, then the coalition can guarantee victory against any admissible attack strategy. The notion of robustness means that the coalition must maximize the single-attack objective function Φj𝒞\Phi^{\mathcal{C}}_{j} for the worst-case attack strategy. As a consequence, the optimal defense coordination problem can be converted into an online robust maximization of the single-attack objective function for the coalition at each joint state.

III-C Defense-Winning Strategy

We proceed to derive a defense-winning strategy for the coalition 𝒞\mathcal{C} by performing online robust maximization of the single-attack objective function Φj𝒞\Phi^{\mathcal{C}}_{j} when it has a positive value. In light of Pontryagin’s Maximum Principle for differential games (cf. Theorem 8.2 in [44]), robustly maximizing Φj𝒞\Phi^{\mathcal{C}}_{j} for the coalition 𝒞\mathcal{C} amounts to robustly maximizing its time derivative Φ˙j𝒞\dot{\Phi}^{\mathcal{C}}_{j} with respect to the inputs uidu^{d}_{i}, i𝒞i\in\mathcal{C}. Note that, the time derivative of Φj𝒞\Phi^{\mathcal{C}}_{j} along the joint state trajectory is given by

Φ˙j𝒞=i𝒞Φj𝒞piduid+Φj𝒞pjauja.\dot{\Phi}^{\mathcal{C}}_{j}=\sum_{i\in\mathcal{C}}\frac{\partial\Phi^{\mathcal{C}}_{j}}{\partial p^{d}_{i}}u^{d}_{i}+\frac{\partial\Phi^{\mathcal{C}}_{j}}{\partial p^{a}_{j}}u^{a}_{j}. (18)

To measure the effect of each agent’s input on Φ˙j𝒞\dot{\Phi}^{\mathcal{C}}_{j}, we need to determine the gradients of Φj𝒞\Phi^{\mathcal{C}}_{j} concerning the positions of all agents. Fortunately, by invoking the Karush–Kuhn–Tucker (KKT) conditions (cf. Theorem 12.1 in [44]), a linear correspondence can be made almost everywhere between the gradients of Φj𝒞\Phi^{\mathcal{C}}_{j} and those of cijc_{ij}, i𝒞i\in\mathcal{C} given in (14).

Proposition 2 (Gradient Correspondence)

Suppose the convex program (17) admits a unique solution for each joint state xj𝒞x^{\mathcal{C}}_{j} within a region 𝒟𝒟\mathcal{D}^{\prime}\subset\mathcal{D}. Then for almost every xj𝒞𝒟x^{\mathcal{C}}_{j}\in\mathcal{D}^{\prime}, there exist nonnegative constants λij\lambda_{ij}^{*}, i𝒞i\in\mathcal{C} such that

Φj𝒞pid=λijcijpid(ξj𝒞,xij),i𝒞Φj𝒞pja=i𝒞λijcijpja(ξj𝒞,xij)\begin{split}\frac{\partial\Phi^{\mathcal{C}}_{j}}{\partial p^{d}_{i}}=&\lambda_{ij}^{*}\frac{\partial c_{ij}}{\partial p^{d}_{i}}(\xi^{\mathcal{C}}_{j},x_{ij}),~{}~{}\forall i\in\mathcal{C}\\ \frac{\partial\Phi^{\mathcal{C}}_{j}}{\partial p^{a}_{j}}=&\sum_{i\in\mathcal{C}}\lambda_{ij}^{*}\frac{\partial c_{ij}}{\partial p^{a}_{j}}(\xi^{\mathcal{C}}_{j},x_{ij})\end{split} (19)

where ξj𝒞\xi^{\mathcal{C}}_{j} is the first nn-dimensional component of the unique solution to the convex program (17).

Proof: See Appendix -B. \blacksquare

Proposition 2 provides a tractable procedure for generating a defense strategy that approximately robustly maximizes Φj𝒞\Phi^{\mathcal{C}}_{j}, assuming that (20) holds almost everywhere. This procedure entails deriving the gradient of the functions cijc_{ij}, i𝒞i\in\mathcal{C} for each qΩj𝒞(xj𝒞)q\in\Omega^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}) and qpjaq\neq p^{a}_{j} as follows:

cijpid(q,xij)=2(qpid)T,i𝒞cijpja(q,xij)=2ζij(q,pja)(qpja)T\begin{split}\frac{\partial c_{ij}}{\partial p^{d}_{i}}(q,x_{ij})=&2(q-p^{d}_{i})^{T},~{}i\in\mathcal{C}\\ \frac{\partial c_{ij}}{\partial p^{a}_{j}}(q,x_{ij})=&-2\zeta_{ij}(q,p^{a}_{j})(q-p^{a}_{j})^{T}\end{split} (20)

where ζij(q,pja)=γij2+riγijqpja\zeta_{ij}(q,p^{a}_{j})=\gamma_{ij}^{2}+\frac{r_{i}\gamma_{ij}}{||q-p^{a}_{j}||}. By substituting (20) with q=ξj𝒞q=\xi^{\mathcal{C}}_{j} into (19) and noting that the parameter λij\lambda_{ij}^{*} is nonnegative, we can obtain an almost robust optimal defense strategy for defender ii in coalition 𝒞\mathcal{C} given by:

πid(xj𝒞)=vi,maxd𝒩(ξj𝒞pid),i𝒞\pi^{d}_{i}(x^{\mathcal{C}}_{j})=v^{d}_{i,\max}\mathscr{N}(\xi^{\mathcal{C}}_{j}-p^{d}_{i}),~{}i\in\mathcal{C} (21)

for xj𝒞𝒟x^{\mathcal{C}}_{j}\in\mathcal{D}^{\prime} such that the convex program (17) has a unique solution. Moreover, it can be shown that the uniqueness of the solution to (17) is guaranteed as long as Φj𝒞\Phi^{\mathcal{C}}_{j} remains positive.

Proposition 3 (Solution Uniqueness)

For each joint state xj𝒞x^{\mathcal{C}}_{j} within the region defined by

𝒟j𝒞={xj𝒞𝒟|𝒞|+1Φj𝒞(xj𝒞)>0}\mathcal{D}^{\mathcal{C}}_{j}=\Big{\{}x^{\mathcal{C}}_{j}\in\mathcal{D}^{|\mathcal{C}|+1}\mid\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j})>0\Big{\}}

the convex program (17) has a unique solution.

Proof: See Appendix -C. \blacksquare

We are now ready to present the main result of this section, which asserts that the almost robust optimal defense strategy given by (21) guarantees a win for the coalition once the joint state xj𝒞x^{\mathcal{C}}_{j} lies within the region 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j}.

Theorem 1 (Defense-Winning Strategy)

For any initial joint state within the region 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j}, the coalition 𝒞\mathcal{C} is ensured to win against attacker jj under the defense strategy (21), irrespective of any admissible attack strategies.

Proof: In light of Proposition 2, inserting (19) along with (20) and (21) into (18) reveals that for almost every xj𝒞𝒟j𝒞x^{\mathcal{C}}_{j}\in\mathcal{D}^{\mathcal{C}}_{j},

Φ˙j𝒞=i𝒞2λijvi,maxd(ξj𝒞pid)T𝒩(ξj𝒞pid)i𝒞2λijζij(ξj𝒞,pja)(ξj𝒞pja)Tujai𝒞2vi,maxdλij(ξj𝒞pidγijξj𝒞pjari)0\begin{split}\dot{\Phi}^{\mathcal{C}}_{j}=&\sum_{i\in\mathcal{C}}2\lambda_{ij}^{*}v^{d}_{i,\max}(\xi^{\mathcal{C}}_{j}-p^{d}_{i})^{T}\mathscr{N}(\xi^{\mathcal{C}}_{j}-p^{d}_{i})\\ &-\sum_{i\in\mathcal{C}}2\lambda_{ij}^{*}\zeta_{ij}(\xi^{\mathcal{C}}_{j},p^{a}_{j})(\xi^{\mathcal{C}}_{j}-p^{a}_{j})^{T}u^{a}_{j}\\ \geq&\sum_{i\in\mathcal{C}}2v^{d}_{i,\max}\lambda_{ij}^{*}(||\xi^{\mathcal{C}}_{j}-p^{d}_{i}||-\gamma_{ij}||\xi^{\mathcal{C}}_{j}-p^{a}_{j}||-r_{i})\\ \geq&0\end{split}

where the first and second inequalities hold due to ujavj,maxa||u^{a}_{j}||\leq v^{a}_{j,\max} and ξj𝒞Ωj𝒞(xj𝒞)\xi^{\mathcal{C}}_{j}\in\Omega^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}), respectively. As such, Φj𝒞(xj𝒞(t))\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}(t)) is non-decreasing over the time interval [0,tf)[0,t_{f}). This implies that Φj𝒞(xj𝒞(t))Φj𝒞(xj𝒞(0))>0\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}(t))\geq\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}(0))>0 for all t[0,tf)t\in[0,t_{f}), which in turn completes the proof, as established by Lemma 1. \blacksquare

Remark 2 (Maximum Defense-Winning Region)

The defense-winning region 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j} is maximum in the sense that 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j} contains all initial joint states for which the coalition 𝒞\mathcal{C} can assuredly win against attacker jj. This is because if an initial joint state xj𝒞(0)x^{\mathcal{C}}_{j}(0) is not in 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j}, then it will lead to Φj𝒞(xj𝒞(0))=0\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}(0))=0 according to the definition of 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j}. This implies that attacker jj can safely reach the target set as stated by Lemma 1. Therefore, the outcome of a single-attack multi-defender reach-avoid game can be fully determined by checking whether the current joint state lies within the defense-winning region 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j}.

III-D Dual-Mode Switching Defense Coordination

We have identified a region 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j} of joint states within which the almost robust optimal defense strategy (21) ensures victory for the coalition. This defense strategy relies on the solution of the convex program (17), which typically does not guarantee solution uniqueness for joint states outside 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j}. In real-world scenarios, as the attack strategy is unknown, it would be advantageous for the defense strategy to handle any joint state, not just those within the region 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j}. By doing so, the chances of winning for the coalition against non-optimal attack strategies are increased. In the remainder of this section, we focus on utilizing the concept of SRS to extend the defense strategy to cover the entire joint state space.

We start by proposing a novel objective function specifically designed for joint states outside the region 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j}. The single-attack objective function Φj𝒞\Phi^{\mathcal{C}}_{j} is not suitable for this purpose, as it cannot distinguish between joint states that are not in 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j}, resulting in zero values for all such joint states. Given that an intelligent attacker tends to reach the target set safely and rapidly, and the intersection between the SRS and the target set is nonempty for xj𝒞𝒟j𝒞x^{\mathcal{C}}_{j}\notin\mathcal{D}^{\mathcal{C}}_{j}, the objective function is defined as the minimum squared distance between the attacker’s position and the intersection between the SRS and the target set:

Φ¯j𝒞(xj𝒞)=minqΩj𝒞(xj𝒞)𝒢qpja2.\begin{split}\bar{\Phi}^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j})=\min_{q\in\Omega^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j})\cap\mathcal{G}}||q-p^{a}_{j}||^{2}.\end{split} (22)

The convexity of the SRS enables the determination of Φ¯j𝒞(xj𝒞)\bar{\Phi}^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}) through the solution of the parametric convex program:

minqnqpja2s.t.cij(q,xij)0,i𝒞gl(q)0,lG.\begin{split}&\min_{q\in\mathbb{R}^{n}}~{}~{}||q-p^{a}_{j}||^{2}\\ &~{}~{}\mathrm{s.t.}~{}~{}c_{ij}(q,x_{ij})\leq 0,~{}\forall i\in\mathcal{C}\\ &~{}~{}~{}~{}~{}~{}~{}~{}g_{l}(q)\leq 0,~{}\forall l\in\mathcal{I}_{G}.\end{split} (23)

The uniqueness of the solution to (23) is guaranteed since the projection of pjap^{a}_{j} onto the convex and closed set Ωj𝒞(xj𝒞)𝒢\Omega^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j})\cap\mathcal{G} is unique [45].

Data: Positions of defenders pidp^{d}_{i}, i𝒞i\in\mathcal{C} and attacker pjap^{a}_{j}
Result: Defender inputs uidu^{d}_{i}, i𝒞i\in\mathcal{C}
for each defender i𝒞i\in\mathcal{C} do
       Φj𝒞,ξj𝒞\Phi^{\mathcal{C}}_{j},\xi^{\mathcal{C}}_{j}\leftarrow Optimal value and solution of (17) with (𝒞,j)(\mathcal{C},j)
       if Φj𝒞>0\Phi^{\mathcal{C}}_{j}>0 then
             uidvi,maxd𝒩(ξj𝒞pid)u^{d}_{i}\leftarrow v^{d}_{i,\max}\mathscr{N}(\xi^{\mathcal{C}}_{j}-p^{d}_{i})
       else
             ξ¯j𝒞\bar{\xi}^{\mathcal{C}}_{j}\leftarrow Solution to (23) with (𝒞,j)(\mathcal{C},j)
             uidvi,maxd𝒩(ξ¯j𝒞pid)u^{d}_{i}\leftarrow v^{d}_{i,\max}\mathscr{N}(\bar{\xi}^{\mathcal{C}}_{j}-p^{d}_{i})
       end if
      
end for
Algorithm 1 Dual-Mode Switching Defense Coordination

The objective function Φ¯j𝒞\bar{\Phi}^{\mathcal{C}}_{j} is then employed to derive a defense strategy for which xj𝒞𝒟jCx^{\mathcal{C}}_{j}\notin\mathcal{D}^{C}_{j}. Based on the definition of Φ¯j𝒞\bar{\Phi}^{\mathcal{C}}_{j}, the goal of the coalition is to robustly maximize Φ¯j𝒞\bar{\Phi}^{\mathcal{C}}_{j} over time, which can be accomplished by robustly maximizing the time derivative of Φ¯j𝒞\bar{\Phi}^{\mathcal{C}}_{j}. To achieve this, we need to determine the gradient of Φ¯j𝒞\bar{\Phi}^{\mathcal{C}}_{j} with respect to the position of each defender in the coalition. Analogous to Proposition 2, it can be shown that for almost every xj𝒞𝒟j𝒞x^{\mathcal{C}}_{j}\notin\mathcal{D}^{\mathcal{C}}_{j}, there is a set of nonnegative constants μij\mu^{*}_{ij}, i𝒩i\in\mathcal{N}, such that

Φ¯j𝒞pid=μijcijpid(ξ¯j𝒞,xij),i𝒞Φ¯j𝒞pja=i𝒞μijcijpja(ξ¯j𝒞,xij)2(ξ¯j𝒞pja)T\begin{split}\frac{\partial\bar{\Phi}^{\mathcal{C}}_{j}}{\partial p^{d}_{i}}=&\mu^{*}_{ij}\frac{\partial c_{ij}}{\partial p^{d}_{i}}(\bar{\xi}^{\mathcal{C}}_{j},x_{ij}),~{}\forall i\in\mathcal{C}\\ \frac{\partial\bar{\Phi}^{\mathcal{C}}_{j}}{\partial p^{a}_{j}}=&\sum_{i\in\mathcal{C}}\mu^{*}_{ij}\frac{\partial c_{ij}}{\partial p^{a}_{j}}(\bar{\xi}^{\mathcal{C}}_{j},x_{ij})-2(\bar{\xi}^{\mathcal{C}}_{j}-p^{a}_{j})^{T}\end{split} (24)

where ξ¯j𝒞\bar{\xi}^{\mathcal{C}}_{j} is the unique solution to the convex program (23). The proof of (24), owing to its similarity to that of Proposition 2, can be found in Appendix -B. Considering the almost everywhere holding of (24), substituting (20) with q=ξ¯j𝒞q=\bar{\xi}^{\mathcal{C}}_{j} into (24) yields an almost robust optimal defense strategy for xj𝒞𝒟jCx^{\mathcal{C}}_{j}\notin\mathcal{D}^{C}_{j}:

πid(xj𝒞)=vi,maxd𝒩(ξ¯j𝒞pid),i𝒞.\pi^{d}_{i}(x^{\mathcal{C}}_{j})=v^{d}_{i,\max}\mathscr{N}(\bar{\xi}^{\mathcal{C}}_{j}-p^{d}_{i}),~{}\forall i\in\mathcal{C}. (25)

This defense strategy, together with the defense strategy (21) for xj𝒞𝒟jCx^{\mathcal{C}}_{j}\in\mathcal{D}^{C}_{j}, constitutes a dual-mode switching defense strategy applicable to the entire joint state space. The proposed algorithm for coordinating the defense against a single attacker is outlined in Algorithm 1.

Remark 3 (Almost-Optimal Waypoint for Defense Coordination)

Algorithm 1 suggests that the nn-dimensional vector defined by

ξ~j𝒞={ξj𝒞,xj𝒞𝒟j𝒞ξ¯j𝒞,otherwise\begin{split}\tilde{\xi}^{\mathcal{C}}_{j}=\begin{cases}\xi^{\mathcal{C}}_{j},&x^{\mathcal{C}}_{j}\in\mathcal{D}^{\mathcal{C}}_{j}\\ \bar{\xi}^{\mathcal{C}}_{j},&\mathrm{otherwise}\end{cases}\end{split} (26)

which is obtained by solving the convex program (17) or (23), serves as an almost-optimal waypoint for the coalition. This waypoint provides a reference point for the defenders in the coalition to coordinate their movements, depending on whether the joint state lies within the region 𝒟j𝒞\mathcal{D}^{\mathcal{C}}_{j} or not, as illustrated in Fig. 3. By following the almost-optimal waypoint, the coalition can effectively adapt their defense strategy to defend against a single attacker.

Refer to caption
Refer to caption
Figure 3: Defense coordination in both 2D and 3D cases. On the left, the almost-optimal waypoint ξj𝒞\xi^{\mathcal{C}}_{j} represents the point within the SRS Ωj𝒞\Omega^{\mathcal{C}}_{j} that is closest to the target set 𝒢\mathcal{G}. On the right, the almost-optimal waypoint ξ¯j𝒞\bar{\xi}^{\mathcal{C}}_{j} signifies the point within the intersection of the SRS Ωj𝒞\Omega^{\mathcal{C}}_{j} and the target set 𝒢\mathcal{G} that is closest to the attacker. Both almost-optimal waypoints offer intuitive reference points for the coalition to intercept the attacker in various scenarios.
Remark 4 (Individual Attack Strategy)

As a byproduct of our analysis, we can also derive a strategy for the single attacker. Contrary to the defenders’ goal, the attacker seeks to minimize Φj𝒞\Phi^{\mathcal{C}}_{j} robustly for xj𝒞𝒟j𝒞x^{\mathcal{C}}_{j}\in\mathcal{D}^{\mathcal{C}}_{j}, and Φ¯j𝒞\bar{\Phi}^{\mathcal{C}}_{j} robustly for xj𝒞𝒟j𝒞x^{\mathcal{C}}_{j}\notin\mathcal{D}^{\mathcal{C}}_{j}. Employing a similar approach to that used in the derivation of the defense strategy, we arrive at the following attack strategy:

πja(xj𝒞)=vj,maxa𝒩(ξ~j𝒞pja)\pi^{a}_{j}(x^{\mathcal{C}}_{j})=v^{a}_{j,\max}\mathscr{N}(\tilde{\xi}^{\mathcal{C}}_{j}-p^{a}_{j}) (27)

where ξ~j𝒞\tilde{\xi}^{\mathcal{C}}_{j}, defined as (26), also serves as an almost-optimal waypoint for the attacker. The detailed derivation is omitted in this paper, as our primary focus lies on the defense perspective.

IV Defense Allocation for Multiple Attackers

In this section, we delve into the dynamic problem of allocating coalitions to multiple attackers. Building upon our previous results on defense coordination for single-attack scenarios, we reformulate the defense allocation problem as a joint state-dependent integer linear program (ILP), but solving this ILP in real-time may become computationally infeasible as the agent size grows. To address this issue, we propose an alternative solution that integrates two key components: the Hierarchical Integer Linear Programming (HILP) method and the Monotonic Defense Enhancement Allocation (MDEA) algorithm. The HILP method iteratively constructs a hierarchy based on the concepts of active defense sets and irreducibility, leading to a computationally efficient suboptimal solution. The MDEA algorithm then incorporates this suboptimal solution while enforcing a monotonicity constraint, ensuring that the expected number of attackers that fail to reach the target set remains non-decreasing over time.

IV-A Integer Linear Program Formulation

Following the classical taxonomy of multirobot task allocation proposed by Gerkey et al. [36], the defense allocation problem can be categorized as a single-task robot, multirobot task, and instantaneous assignment problem. Under the current joint state, each defender can defend against at most one attacker at a time, each attacker can be simultaneously guarded by multiple defenders, and the assignment only concerns the present moment. To represent the assignment, a binary matrix called the assignment matrix 𝒜=(aij)N×M\mathcal{A}=(a_{ij})_{N\times M} is used, where the entry aij=1a_{ij}=1 if attacker jj is assigned to defender ii, and 0 otherwise. In addition, since all defenders are single-task, the assignment matrix must respect the conflict-free constraint, i.e., the row sum of the assignment matrix should not exceed 1.

To better model the effect of each coalition-attacker pair on the defense performance, we introduce an alternative matrix representation. Let 𝒞k\mathcal{C}_{k}, k=1,,N~k=1,\ldots,\tilde{N} be the list of all different coalitions in which N~=2N1\tilde{N}=2^{N}-1. We define the coalition assignment matrix to be Θ=(θkj)N~×M\Theta=(\theta_{kj})_{\tilde{N}\times M} such that

θkj{0,1},k=1,,N~,j=1,,M\theta_{kj}\in\{0,1\},~{}\forall k=1,\ldots,\tilde{N},~{}j=1,\ldots,M (28)

and θkj=1\theta_{kj}=1 if attacker jj is allocated to coalition 𝒞k\mathcal{C}_{k}. The coalition assignment matrix Θ\Theta inherently induces an assignment matrix 𝒜\mathcal{A} through the linear transformation:

𝒜=Θ\mathcal{A}=\mathcal{B}\Theta (29)

where =(bik)N×N~\mathcal{B}=(b_{ik})_{N\times\tilde{N}} denotes the incidence matrix representing the binary relationship between defenders and coalitions. Specifically, bik=1b_{ik}=1 if i𝒞ki\in\mathcal{C}_{k} and 0 otherwise. To maintain the single-task property, the conflict-free constraint is defined through the induced assignment matrix as

j=1Mk=1N~bikθkj1,i=1,,N.\sum_{j=1}^{M}\sum_{k=1}^{\tilde{N}}b_{ik}\theta_{kj}\leq 1,~{}\forall i=1,\ldots,N. (30)

Additionally, the redundancy-free constraint is imposed on Θ\Theta:

k=1N~θkj1,j=1,,M\sum_{k=1}^{\tilde{N}}\theta_{kj}\leq 1,~{}\forall j=1,\ldots,M (31)

ensuring that each attacker is assigned to at most one coalition.

We now present an objective function incorporating decision variables as elements of the coalition assignment matrix. Recalling Theorem 1, if Φj𝒞k>0\Phi^{\mathcal{C}_{k}}_{j}>0 at the joint state xj𝒞kx^{\mathcal{C}_{k}}_{j}, an admissible defense strategy for 𝒞k\mathcal{C}_{k} exists that prevents attacker jj from reaching the target set. In light of this, we designate (𝒞k,j)(\mathcal{C}_{k},j) as a feasible coalition-attacker pair at xj𝒞kx^{\mathcal{C}_{k}}_{j} if Φj𝒞k(xj𝒞k)>0\Phi^{\mathcal{C}_{k}}_{j}(x^{\mathcal{C}_{k}}_{j})>0. In line with the payoff function given in (12), the reward for assigning attacker jj to coalition 𝒞k\mathcal{C}_{k}, denoted by wkjw_{kj}, is set as 1 if (𝒞k,j)(\mathcal{C}_{k},j) is feasible and attacker jj is active, and 0 otherwise, i.e.,

wkj(xj)=ρj(xj)𝒩(Φj𝒞k(xj𝒞k))w_{kj}(x_{j})=\rho_{j}(x_{j})\mathscr{N}(\Phi^{\mathcal{C}_{k}}_{j}(x^{\mathcal{C}_{k}}_{j}))

with ρj\rho_{j} given in (10). Subsequently, the objective function, depending on the choice of a coalition assignment matrix Θ\Theta at a given joint state xx, is defined as the total sum of the rewards of all assigned coalition-attacker pairs:

Γ(Θ,x)=k=1N~j=1Mwkj(xj)θkj.\Gamma(\Theta,x)=\sum_{k=1}^{\tilde{N}}\sum_{j=1}^{M}w_{kj}(x_{j})\theta_{kj}. (32)

We refer to Γ\Gamma as the multi-attack objective function, which signifies the expected number of attackers that will be unable to access the target set from the current joint state. The reason behind this objective function is that it enables the assessment of defense performance in multi-attack scenarios.

Lemma 2 (Defense Performance Assessment)

Let πd\pi^{d} be an admissible defense strategy associated with time-dependent coalition assignment matrices Θ()\Theta(\cdot) that consistently satisfy constraints (28), (30), and (31). Suppose there exists a time instant t00t_{0}\geq 0, such that for all admissible attack strategies,

Γ(Θ(t),x(t))+Nc(t)Mc\Gamma(\Theta(t),x(t))+N_{c}(t)\geq M_{c} (33)

holds over the time interval [t0,tf)[t_{0},t_{f}), where Nc(t)N_{c}(t) is the number of captured attackers up to time tt, and McM_{c} is a positive integer. Then, the payoff function JJ under the defense strategy πd\pi^{d} is bounded by MMcM-M_{c}, i.e.,

J(x(0),πd,πa)MMcJ(x(0),\pi^{d},\pi^{a})\leq M-M_{c}

for any admissible attack strategy πa\pi^{a}.

Proof: Let Ne(t)N_{e}(t) denote the number of attackers that have entered the target set up to time tt. At time tt, each attacker must be either captured, active, or have entered the target set. Hence, we have M=Ne(t)+Ma(t)+Nc(t)M=N_{e}(t)+M_{a}(t)+N_{c}(t). Since the multi-attack objective function Γ\Gamma only concerns active attackers, it follows that Γ(Θ(t),x(t))Ma(t)\Gamma(\Theta(t),x(t))\leq M_{a}(t). Combining this with inequality (33) yields McΓ(Θ(t),x(t))+Nc(t)Ma(t)+Nc(t)M_{c}\leq\Gamma(\Theta(t),x(t))+N_{c}(t)\leq M_{a}(t)+N_{c}(t) for all t[t0,tf)t\in[t_{0},t_{f}), implying that Ne(t)MMcN_{e}(t)\leq M-M_{c} for all t[t0,tf)t\in[t_{0},t_{f}). Consequently, based on the definition of the payoff function, we obtain J(x(0),πd,πa)=Ne(tf)MMcJ(x(0),\pi^{d},\pi^{a})=N_{e}(t_{f})\leq M-M_{c}. \blacksquare

Remark 5 (Online Maximization and Monotonicity Constraint)

Lemma 2 implies that one way to minimize the payoff function JJ, against any admissible attack strategies, is to maximize the multi-attack objective function Γ\Gamma with respect to the coalition assignment matrix Θ\Theta at each joint state xx. This leads to the following integer linear program (ILP) for the multi-attack defense allocation problem along the joint state trajectory:

maxΘΓ(Θ,x)s.t.(28),(30),(31).\begin{split}&\max_{\Theta}~{}\Gamma(\Theta,x)\\ &~{}\mathrm{s.t.}~{}~{}\eqref{con:binary},\eqref{con:cf},\eqref{con:rf}.\end{split} (34)

Furthermore, to ensure overall defense performance, Lemma 2 highlights the significance of maintaining a robustly positively invariant superlevel set for the function Γ+Nc\Gamma+N_{c} with a level Mc>0M_{c}>0. Here, the value Γ(Θ(t),x(t))+Nc(t)\Gamma(\Theta(t),x(t))+N_{c}(t) represents the expected number of attackers that fail to reach the target set at time tt. A sufficient condition for preserving the superlevel set property is to impose the monotonicity constraint, given by:

Γ(Θ(tl),x(tl))+Nc(tl)Γ(Θ(tl1),x(tl1))+Nc(tl1),l=1,,L\begin{split}&\Gamma(\Theta(t_{l}),x(t_{l}))+N_{c}(t_{l})\\ \geq&\Gamma(\Theta(t_{l-1}),x(t_{l-1}))+N_{c}(t_{l-1}),~{}\forall l=1,\ldots,L\end{split} (35)

where tlt_{l}, l=0,,Ll=0,\ldots,L are the allocation times with t0=0t_{0}=0. It should be noted that the optimal solution to the ILP (34) automatically satisfies the monotonicity constraint (35).

Although solving the ILP (34) provides an optimal solution to the defense allocation problem, it entails a substantial computational burden, particularly as the number of agents increases. This is mainly due to the vast amount of possible coalition-attack pairs, which not only leads to a large-scale ILP but more importantly, requires considerable computational effort to determine the multi-attack objective function. In the case of NN defenders and MaM_{a} active attackers, there are N~Ma\tilde{N}M_{a} possible coalition-attacker pairs, implying that the same amount of convex programs of the form (17) need to be calculated to specify the corresponding rewards. To alleviate the computational burden, we propose a heuristic algorithm for the ILP (34) in Section IV-C to generate a suboptimal solution, which will then be fed into an allocation algorithm to ensure the monotonicity constraint (35) in Section IV-D. In particular, prior to detailing the heuristic algorithm, we will derive two critical components in the next section, which together construct a hierarchy that facilitates the generation of a suboptimal solution for the ILP (34).

IV-B Active-Defense Set and Irreducibility

The optimal solution to the ILP (34) tends to favor feasible coalition-attacker pairs comprising smaller coalitions. On one hand, the multi-attack objective function is influenced only by feasible coalition-attacker pairs, as the reward for an infeasible pair is zero. On the other hand, smaller coalitions are less likely to violate the conflict-free constraint (30), and thus more likely to contribute positively to the multi-attack objective function. As a result, the key to striking a balance between optimality and efficiency in solving the ILP (34) is to identify as many feasible sub-pairs of coalition-attacker pairs with small amounts of defenders as possible while keeping the computational cost low. Here, a sub-pair of a coalition-attacker pair refers to a subset of the coalition that is also paired with the same attacker. To this end, we will focus on seeking feasible sub-pairs of a coalition-attacker pair from the aspects of optimality and efficiency in the subsequent analysis.

As a point of departure, we revisit the convex program (17) subject to a feasible coalition-attacker pair (𝒞,j)(\mathcal{C},j). As shown in Theorem 1, the vector ξj𝒞\xi^{\mathcal{C}}_{j} extracted from the solution to (17) establishes an almost-optimal waypoint for the coalition 𝒞\mathcal{C}, which must lie on the boundary of the SRS Ωj𝒞\Omega^{\mathcal{C}}_{j}. Moreover, owing to the convexity of the SRS, some of the inequality constraints cij(q,xij)0c_{ij}(q,x_{ij})\leq 0, i𝒞i\in\mathcal{C} must be active at q=ξj𝒞q=\xi^{\mathcal{C}}_{j}. As such, we can define an index set of defenders through an active set in the convex program (17) that incorporates the almost-optimal waypoint ξj𝒞\xi^{\mathcal{C}}_{j}.

Definition 2 (Active-Defense Set)

Suppose the coalition-attacker pair (𝒞,j)(\mathcal{C},j) is feasible at the joint state xj𝒞x^{\mathcal{C}}_{j}. The active-defense set (ADS) Λj𝒞(xj𝒞)\Lambda^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}) for (𝒞,j)(\mathcal{C},j) is the active set of the inequality constraints cij(q,xij)0c_{ij}(q,x_{ij})\leq 0, i𝒞i\in\mathcal{C} in (17) at the almost-optimal waypoint ξj𝒞\xi^{\mathcal{C}}_{j}, i.e.,

Λj𝒞(xj𝒞)={i𝒞cij(ξj𝒞,xij)=0}.\Lambda^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j})=\left\{i\in\mathcal{C}\mid c_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})=0\right\}. (36)
Refer to caption
Figure 4: Geometric interpretation of the ADS in a 2D domain for the case of γij=1\gamma_{ij}=1. In this case, the ADS Λj𝒞\Lambda^{\mathcal{C}}_{j} is the set of defender indices such that the capture region for each defender iΛj𝒞i\in\Lambda^{\mathcal{C}}_{j} (illustrated as a blue disk) is tangent to the dashed circle of radius ξj𝒞pja||\xi^{\mathcal{C}}_{j}-p^{a}_{j}|| centered at the almost-optimal waypoint ξj𝒞\xi^{\mathcal{C}}_{j}.

The construction of the ADS does not necessitate the computation of any additional convex programs. This is because the feasibility of the coalition-attacker pair has already been verified using a convex program of the form (17), and the almost-optimal waypoint can be determined as a byproduct of this computation. It is worth noting that the ADS concept indeed provides a proximity-based method for obtaining a specific sub-pair, which is associated with both locations of the attacker and the target set, as illustrated in Fig. 4. Furthermore, it can be shown that such a sub-pair inherits the feasibility of the original coalition-attacker pair.

Proposition 4 (Feasibility)

If the coalition-attacker pair (𝒞,j)(\mathcal{C},j) is feasible at the joint state xj𝒞x^{\mathcal{C}}_{j}, then the sub-pair (Λj𝒞,j)(\Lambda^{\mathcal{C}}_{j},j) is also feasible at the corresponding joint state.

Proof: See Appendix -D. \blacksquare

We turn to look at how to generate feasible sub-pairs of a sufficiently small scale. While the ADS allows for feasible sub-pair generation without incurring additional computational effort, the resulting sub-pair size may not be minimum with respect to feasibility. In other words, the sub-pair may include redundant defenders that could be removed without affecting the feasibility of the sub-pair. To precisely quantify the redundancy, we introduce the following definition.

Definition 3 (Irreducibility)

A feasible coalition-attacker pair (𝒞,j)(\mathcal{C},j) is said to be irreducible at a joint state xj𝒞x^{\mathcal{C}}_{j} if there is no proper subset 𝒞\mathcal{C}^{\prime} of 𝒞\mathcal{C} such that (𝒞,j)(\mathcal{C}^{\prime},j) is feasible.

Given a feasible coalition-attacker pair (𝒞,j)(\mathcal{C},j), we can develop an iterative procedure to generate all irreducible sub-pairs. The process commences by evaluating size-1 subsets of 𝒞\mathcal{C}. If |𝒞|=1|\mathcal{C}|=1, (𝒞,j)(\mathcal{C},j) is inherently deemed irreducible and the process terminates. Otherwise, we check each size-1 subset 𝒞\mathcal{C}^{\prime} for irreducibility by confirming whether Φj𝒞>0\Phi^{\mathcal{C}^{\prime}}_{j}>0. Then, we remove the element of 𝒞\mathcal{C}^{\prime} from 𝒞\mathcal{C} for which (𝒞,j)(\mathcal{C}^{\prime},j) is irreducible, forming a new set 𝒞r(1)\mathcal{C}_{r}^{(1)}. If |𝒞r(1)|1|\mathcal{C}_{r}^{(1)}|\leq 1, all irreducible sub-pairs have been identified, and the process ends. In case further iterations are required, we repeat with all size-ll subsets of 𝒞r(l1)\mathcal{C}_{r}^{(l-1)} for l=2,,|𝒞|l=2,\ldots,|\mathcal{C}|, where 𝒞r(l)\mathcal{C}_{r}^{(l)} is analogously defined by removing elements from 𝒞r(l1)\mathcal{C}_{r}^{(l-1)}. The procedure continues until either |𝒞r(l)|l|\mathcal{C}_{r}^{(l)}|\leq l or |𝒞|=l|\mathcal{C}|=l at the iteration ll. Moreover, the number of iterations can be reduced by noting that the cardinality of the coalition for any irreducible sub-pair is upper bounded by the dimension of the domain.

Proposition 5 (Upper Bounded Cardinality)

In an nn-dimensional domain, if the coalition-attacker pair (𝒞,j)(\mathcal{C},j) is irreducible, then 𝒞\mathcal{C} has cardinality no greater than nn.

Proof: See Appendix -E. \blacksquare

Data: Positions pidnp^{d}_{i}\in\mathbb{R}^{n}, i𝒞i\in\mathcal{C} and pjanp^{a}_{j}\in\mathbb{R}^{n}
Result: The set consisting of irreducible sub-pairs 𝒫\mathcal{P}
Initialization: 𝒫\mathcal{P}\leftarrow\emptyset, 𝒞r𝒞\mathcal{C}_{r}\leftarrow\mathcal{C}
for i=1i=1 to nn do
       if |𝒞r|<i|\mathcal{C}_{r}|<i then
             break
       else if |𝒞|=i|\mathcal{C}|=i then
             Put (𝒞,j)(\mathcal{C},j) in 𝒫\mathcal{P}
             break
       else
             forall 𝒞𝒞r\mathcal{C}^{\prime}\subset\mathcal{C}_{r} with |𝒞|=i|\mathcal{C}^{\prime}|=i do
                   Φj𝒞\Phi^{\mathcal{C}^{\prime}}_{j}\leftarrow Optimal value of (17) s.t. (𝒞,j)(\mathcal{C}^{\prime},j)
                   if Φj𝒞>0\Phi^{\mathcal{C}^{\prime}}_{j}>0 then
                         Put (𝒞,j)(\mathcal{C}^{\prime},j) in 𝒫\mathcal{P}
                         Remove the elements of 𝒞\mathcal{C}^{\prime} from 𝒞r\mathcal{C}_{r}
                   end if
                  
             end forall
            
       end if
      
end for
Algorithm 2 Irreducible Sub-Pairs of (𝒞,j)(\mathcal{C},j)

We summarize our findings of identifying irreducible sub-pairs of a feasible coalition-attacker pair in Algorithm 2. It should be emphasized that when solving the ILP (34), we can focus on irreducible coalition-attacker pairs rather than all possible pairs. This is because replacing a coalition-attacker pair with one of its irreducible sub-pairs will not affect the optimal value of the multi-attack objective function. In general, Algorithm 2 can lower the number of convex programs needed to determine the multi-attack objective function. However, directly using the concept of irreducibility could lead to heavy computational loads. For instance, computing all irreducible sub-pairs in the worst-case scenario requires up to compute N~\tilde{N} convex programs for a coalition-attacker pair (𝒩,j)(\mathcal{N},j). To improve computational efficiency, a more effective approach would be to combine the concept of irreducibility with that of the ADS.

IV-C Hierarchical Integer Linear Programming

We now exhibit the integration of the ADS and irreducibility concepts to establish the first level of the hierarchy for the ILP (34). Our approach initiates by checking coalition-attack pairs of the form (𝒩,j)(\mathcal{N},j) and only focusing on feasible ones. The proximity-based sub-pair (Λj𝒩,j)(\Lambda^{\mathcal{N}}_{j},j) is then taken into account, where Λj𝒩\Lambda^{\mathcal{N}}_{j} is the ADS for the feasible pair (𝒩,j)(\mathcal{N},j). Finally, irreducible sub-pairs of (Λj𝒩,j)(\Lambda^{\mathcal{N}}_{j},j) are selected to refine the multi-attack objective function Γ\Gamma. As a result, we assign the highest priority to coalition-attacker pairs (𝒞,j)(\mathcal{C},j) that are irreducible and have 𝒞\mathcal{C} as a subset of Λj𝒩\Lambda^{\mathcal{N}}_{j}. Compared to directly prioritizing all irreducible sub-pairs of (𝒩,j)(\mathcal{N},j), this method reduces the potential computational challenges while offering a considerable amount of feasible sub-pairs.

After specifying the first hierarchy level, we can derive a suboptimal solution to the ILP (34). Let Ξ1\Xi_{1} denote the first priority set that contains all coalition-attacker pairs with the highest priority. In the case where Ξ1\Xi_{1} is empty, no feasible coalition-attacker pairs exist, and the ILP solution simply corresponds to a zero coalition assignment matrix. To confine the selection of coalition-attacker pairs solely to those within Ξ1\Xi_{1}, we impose the following constraint on the decision variables:

θkj=0,(𝒞k,j)Ξ1.\theta_{kj}=0,~{}\forall(\mathcal{C}_{k},j)\notin\Xi_{1}. (37)

By resolving the ILP (34) with the additional constraint (37), we attain a suboptimal solution denoted as Θ1\Theta_{1}, which provides a lower bound on the optimal value of the multi-attack objective function. In order to evaluate Θ1\Theta_{1}, we define two index sets 𝒩r(1)\mathcal{N}_{r}^{(1)} and r(1)\mathcal{M}_{r}^{(1)} in which 𝒩r(1)\mathcal{N}_{r}^{(1)} contains the indices of defenders that remain unassigned to any attacker based on Θ1\Theta_{1}, and jr(1)j\in\mathcal{M}_{r}^{(1)} if Θ1\Theta_{1} omits attacker jj and (𝒩,j)(\mathcal{N},j) is feasible. If either 𝒩r(1)\mathcal{N}_{r}^{(1)} or r(1)\mathcal{M}_{r}^{(1)} is empty, there are no more available coalition-attacker pairs and further steps are not needed.

The above procedure can be iteratively applied, if possible, to improve the suboptimal solution. At iteration l2l\geq 2, the ll-th priority set Ξl\Xi_{l} is formed by identifying all irreducible coalition-attacker pairs (𝒞,j)(\mathcal{C},j) for which 𝒞\mathcal{C} is contained in the ADS for a feasible (𝒩r(l1),j)(\mathcal{N}^{(l-1)}_{r},j) and jr(l1)j\in\mathcal{M}_{r}^{(l-1)}. The decision variables are then restricted to Ξl\Xi_{l} through the binary constraint:

θkj=0,(𝒞k,j)Ξl\begin{split}&\theta_{kj}=0,~{}\forall(\mathcal{C}_{k},j)\notin\Xi_{l}\end{split} (38)

resulting in a subproblem of the original ILP (34):

maxΓ(Θ,x)s.t.(28),(30),(31),(38).\begin{split}&\max~{}\Gamma(\Theta,x)\\ &~{}\mathrm{s.t.}~{}\eqref{con:binary},\eqref{con:cf},\eqref{con:rf},\eqref{con:binary_iter}.\end{split} (39)

Having a solution to the subproblem (39), represented by Θl\Theta_{l}, the suboptimal solution to the ILP (34) is updated by adding Θl\Theta_{l} to the previous suboptimal solution, leading to

ΘHILP=k=1lΘk.\Theta_{\text{HILP}}=\sum_{k=1}^{l}\Theta_{k}.

The procedure terminates at the llth iteration when Ξl\Xi_{l} is empty or one of the index sets 𝒩r(l)\mathcal{N}_{r}^{(l)} and r(l)\mathcal{M}_{r}^{(l)} is empty, where 𝒩r(l)\mathcal{N}_{r}^{(l)} and r(l)\mathcal{M}_{r}^{(l)} are obtained by excluding the indices related to Θ~l\tilde{\Theta}_{l} from 𝒩r(l1)\mathcal{N}_{r}^{(l-1)} and r(l1)\mathcal{M}_{r}^{(l-1)}, respectively.

Data: Positions pidp^{d}_{i}, i𝒩i\in\mathcal{N} and pjap^{a}_{j}, jaj\in\mathcal{M}_{a}
Result: Coalition assignment matrix ΘHILP\Theta_{\text{HILP}}
Initialize ΘHILP0N~×M\Theta_{\text{HILP}}\leftarrow 0_{\tilde{N}\times M}, 𝒩r𝒩\mathcal{N}_{r}\leftarrow\mathcal{N} ra\mathcal{M}_{r}\leftarrow\mathcal{M}_{a}
while 𝒩r\mathcal{N}_{r}\neq\emptyset and r\mathcal{M}_{r}\neq\emptyset do
       Ξ\Xi\leftarrow\emptyset
       forall jrj\in\mathcal{M}_{r} do
             Φj𝒩r\Phi^{\mathcal{N}_{r}}_{j}\leftarrow Optimal value of (17) s.t. (𝒩r,j)(\mathcal{N}_{r},j)
             if Φj𝒩r>0\Phi^{\mathcal{N}_{r}}_{j}>0 then
                   Λj𝒩r\Lambda^{\mathcal{N}_{r}}_{j}\leftarrow ADS for (𝒩r,j)(\mathcal{N}_{r},j)
                   𝒫\mathcal{P}\leftarrow Result of Algorithm 2 s.t. (Λj𝒩r,j)(\Lambda^{\mathcal{N}_{r}}_{j},j)
                   Put all elements of 𝒫\mathcal{P} in Ξ\Xi
            else
                   Remove jj from r\mathcal{M}_{r}
             end if
            
       end forall
      if Ξ=\Xi=\emptyset then
             break
       end if
      Θ\Theta\leftarrow A solution to the ILP (39) with (38) s.t. Ξ\Xi
       ΘHILPΘHILP+Θ\Theta_{\text{HILP}}\leftarrow\Theta_{\text{HILP}}+\Theta
       Remove all ii with (Θ)(i,:)01×M(\mathcal{B}\Theta)(i,:)\neq 0_{1\times M} from 𝒩r\mathcal{N}_{r}
       Remove all jj with Θ(:,j)0N~×1\Theta(:,j)\neq 0_{\tilde{N}\times 1} from r\mathcal{M}_{r}
end while
Algorithm 3 Hierarchical Integer Linear Programming

In summary, our approach to solving the ILP (34) is outlined in Algorithm 3, which strikes a balance between optimality and efficiency, as demonstrated by the subsequent proposition.

Proposition 6 (Optimality-Efficiency Tradeoff)

Algorithm 3 possesses two key properties:

1) (Optimality) If r\mathcal{M}_{r} is empty after the first iteration, then the suboptimal solution ΘHILP\Theta_{\text{HILP}} is an optimal solution to the ILP (34).

2) (Efficiency) If the cardinality of the ADS for each (𝒩r,j)(\mathcal{N}_{r},j) in each iteration is at most nn, then the total number of calculations for the convex program of the form (17) is less than 2n1Ma(1+Ma)2^{n-1}M_{a}(1+M_{a}).

Proof: See Appendix -F. \blacksquare

The assumptions made in Proposition 6 for Algorithm 3 are satisfied in many practical scenarios. For instance, if the ADSs Λj𝒩r\Lambda^{\mathcal{N}_{r}}_{j} for the coalition-attacker pair (𝒩,j)(\mathcal{N},j), jaj\in\mathcal{M}_{a} are mutually disjoint, then the set r\mathcal{M}_{r} becomes empty after the first iteration and the suboptimal solution Θ~\tilde{\Theta} is optimal for the ILP (34). Such a scenario commonly arises when the attackers are sparsely distributed relative to the defenders. Furthermore, empirical results indicate that the cardinality of the ADSs typically does not exceed the number of defenders. Compared to directly solving the ILP, where the number of calculations for the convex program of the form (17) grows exponentially with the number of defenders, the calculation times for our algorithm do not empirically depend on the number of defenders and are only quadratically dependent on the number of active attackers. Consequently, Algorithm 3 offers a viable alternative for tackling the ILP in large-scale scenarios with many defenders, especially when conventional approaches become computationally infeasible.

IV-D Monotonic Defense Enhancement Allocation

Once acquiring the suboptimal solution produced by Algorithm 3, we adjust it by incorporating the monotonicity constraint (35). Specifically, we update the coalition assignment matrix at each time step by equating it to the solution derived from Algorithm 3 if the monotonicity constraint (35) is met; otherwise, we retain the updated coalition assignment matrix from the previous step. Note that the number of active attackers may change between two consecutive time steps. As such, we need to set the columns of the previous coalition assignment matrix to zero for those corresponding to attackers captured at the current time step before executing the update. This approach offers the advantage of maintaining the validity of all inequalities in (34) without requiring additional computation of the ILP, thereby reducing the computational effort.

While the above approach effectively integrates the monotonicity constraint (35), it may give rise to undesired oscillatory behaviors in defense allocation. Such oscillations occur when a defender’s assignment repeatedly switches between two distinct attackers, potentially impairing the system’s performance. These oscillations stem from the fact that Algorithm 3 might produce non-unique solutions, leading to the equality in (35) holding. To tackle this issue, we implement a stricter version of the monotonicity constraint (35), updating the coalition assignment matrix only when the inequality constraint (35) is strictly fulfilled. This ensures that we stick to the previous coalition assignment matrix when the equality in (35) holds, thus avoiding oscillations while still enforcing the monotonicity constraint (35).

Data: Positions pidp^{d}_{i}, i𝒩i\in\mathcal{N} and pjap^{a}_{j}, jaj\in\mathcal{M}_{a}, solution of Algorithm 3 ΘHILP\Theta_{\text{HILP}}, previous coalition assignment matrix Θ~p\tilde{\Theta}^{p}, previous index set of active attackers ap\mathcal{M}_{a}^{p}
Result: Coalition assignment matrix ΘMDEA\Theta_{\text{MDEA}}
if a=\mathcal{M}_{a}=\emptyset then
       return ΘMDEA0N~×M\Theta_{\text{MDEA}}\leftarrow 0_{\tilde{N}\times M}
end if
Part I: Monotonicity constraint enforcement
Γ\Gamma\leftarrow Number of nonzero column vectors in ΘHILP\Theta_{\text{HILP}}
Γp\Gamma^{p}\leftarrow Number of nonzero column vectors in Θ~p\tilde{\Theta}^{p}
if Γ>Γp|ap|+|a|\Gamma>\Gamma^{p}-|\mathcal{M}_{a}^{p}|+|\mathcal{M}_{a}| then
       ΘMDEAΘHILP\Theta_{\text{MDEA}}\leftarrow\Theta_{\text{HILP}}
else
       if |ap||a||\mathcal{M}_{a}^{p}|\neq|\mathcal{M}_{a}| then
             forall japaj\in\mathcal{M}_{a}^{p}\setminus\mathcal{M}_{a} do
                   Θ~p(:,j)0N~×1\tilde{\Theta}^{p}(:,j)\leftarrow 0_{\tilde{N}\times 1}
             end forall
            
       end if
      ΘMDEAΘ~p\Theta_{\text{MDEA}}\leftarrow\tilde{\Theta}^{p}
end if
Θ~pΘMDEA\tilde{\Theta}^{p}\leftarrow\Theta_{\text{MDEA}}, apa\mathcal{M}_{a}^{p}\leftarrow\mathcal{M}_{a}
Part II: Greedy assignment of unassigned defenders
𝒜ΘMDEA\mathcal{A}\leftarrow\mathcal{B}\Theta_{\text{MDEA}}
Put all i𝒩i\in\mathcal{N} with 𝒜(i,:)=01×M\mathcal{A}(i,:)=0_{1\times M} in 𝒩r\mathcal{N}_{r}
Put all jaj\in\mathcal{M}_{a} with 𝒜(:,j)=0N×1\mathcal{A}(:,j)=0_{N\times 1} in r\mathcal{M}_{r}
if r\mathcal{M}_{r}\neq\emptyset and 𝒩r\mathcal{N}_{r}\neq\emptyset then
       forall i𝒩ri\in\mathcal{N}_{r} do
             jminargminjrpidpjaj_{\min}\leftarrow\arg\min_{j\in\mathcal{M}_{r}}||p^{d}_{i}-p^{a}_{j}||
             (i,jmin)(i,j_{\min})th element of 𝒜1\mathcal{A}\leftarrow 1
       end forall
      
end if
ΘMDEA𝒜\Theta_{\text{MDEA}}\leftarrow\mathcal{B}^{\dagger}\mathcal{A} with \mathcal{B}^{\dagger} as the pseudoinverse of \mathcal{B}
Algorithm 4 Monotonic Defense Enhancement Allocation

Our approach to the multi-attack defense allocation problem is summarized in Algorithm 4, which consists of two components. The first component imposes the monotonicity constraint (35) on the solution obtained from Algorithm 3. For the initial step, Θ~p\tilde{\Theta}^{p} and ap\mathcal{M}^{p}_{a} are respectively set to ΘHILP\Theta_{\text{HILP}} and ap=\mathcal{M}^{p}_{a}=\mathcal{M}. The second part uses a greedy algorithm that allocates each unassigned defender to the nearest unassigned active attacker. This is inspired by the observation that a closer the defender-attacker pair results in a reduced SRS. To achieve this, we first convert the coalition assignment matrix into the corresponding assignment matrix in terms of the linear equation (29). We then update the assignment matrix using the greedy algorithm and revert it to the coalition assignment matrix through the pseudoinverse of the incidence matrix. The greedy algorithm is computationally efficient and provides a valuable supplement to the first component of Algorithm 4.

We are now able to encode the solution to the multi-attack defense allocation problem into an admissible defense strategy for the multiplayer reach-avoid game. Given the output of Algorithm 4, ΘMDEA=(θkj)\Theta_{\text{MDEA}}=(\theta_{kj}), we define the assignment rule as follows:

σ(i)={j,i𝒞k&θkj=1null,otherwise\begin{split}\sigma(i)=\begin{cases}j,&i\in\mathcal{C}_{k}~{}\&~{}\theta_{kj}=1\\ \mathrm{null},&\text{otherwise}\end{cases}\end{split} (40)

where null\mathrm{null} denotes the case of an unassigned defender. For each active attacker jaj\in\mathcal{M}_{a}, the coalition against attacker jj induced by σ\sigma is given by:

𝒞~j={i𝒩σ(i)=j}.\tilde{\mathcal{C}}_{j}=\{i\in\mathcal{N}\mid\sigma(i)=j\}.

Combining Algorithm 1 with the assignment rule σ\sigma as described in (40), we obtain an admissible defense strategy of the form:

π~id(x)=vi,maxd𝒩(ξσ(i)pid),i𝒩\tilde{\pi}^{d}_{i}(x)=v^{d}_{i,\max}\mathscr{N}(\xi_{\sigma(i)}-p^{d}_{i}),~{}i\in\mathcal{N} (41)

where ξj=ξ~j𝒞~j\xi_{j}=\tilde{\xi}^{\tilde{\mathcal{C}}_{j}}_{j} is the almost-optimal waypoint defined in (26). If a defender is unassigned, its control input is set to zero for the sake of energy saving.

Theorem 2 (Performance-Guaranteed Defense Strategy)

For any initial joint state x0x_{0}, the payoff function JJ under the defense strategy π~d\tilde{\pi}^{d} given by (41), and any admissible attack strategy satisfies the following inequality:

J(x0,π~d,πa)MΓ(ΘMDEA0,x0)Nc(0)J(x_{0},\tilde{\pi}^{d},\pi^{a})\leq M-\Gamma(\Theta_{\text{MDEA}}^{0},x_{0})-N_{c}(0) (42)

where Γ\Gamma is the multi-attack objective function defined in (32), ΘMDEA0\Theta_{\text{MDEA}}^{0} is the initial solution of Algorithm 4, and Nc(0)N_{c}(0) is the number of attackers captured at the initial joint state. Moreover, the function Γ+Nc\Gamma+N_{c} is monotonically non-decreasing along the system trajectory.

Proof: The enforcement of the monotonicity constraint (35) in Algorithm 4 ensures that the sequence Γ(ΘMDEA(tl),x(tl))+Nc(tl)}l=0L\Gamma(\Theta_{\text{MDEA}}(t_{l}),x(t_{l}))+N_{c}(t_{l})\}_{l=0}^{L} is monotonically non-decreasing. Further, it is noted that Γ(ΘMDEA(t),x(t))+Nc(t)\Gamma(\Theta_{\text{MDEA}}(t),x(t))+N_{c}(t) remains constant within each time interval [tl1,tl)[t_{l-1},t_{l}), as the assignment of defenders to active attackers is updated only at the time instants tlt_{l}. As a result, Γ+Nc\Gamma+N_{c} is non-decreasing. This indicates that the inequality (33) holds with Mc=Γ(ΘMDEA0,x0)+Nc(0)M_{c}=\Gamma(\Theta_{\text{MDEA}}^{0},x_{0})+N_{c}(0), which in turn, according to Lemma 2, leads to the satisfaction of the inequality (42) for the payoff function. \blacksquare

Theorem 2 provides a solid assurance that our defense strategy (41) is capable of achieving a performance level no worse than the bound of the payoff function specified in (42) for any initial joint state. Theorem 2 also establishes that the function Γ+Nc\Gamma+N_{c}, representing the expected number of attackers that fail to reach the target set, is monotonically non-decreasing over time. This property suggests that the upper bound of the payoff function in (42) becomes progressively tighter as the game progresses, leading to a continuous improvement in the performance of the proposed strategy. In the next section, we will present empirical results to demonstrate performance enhancement in practical scenarios.

V Simulations

In this section, we evaluate the performance and efficiency of our proposed single-attack defense coordination and allocation algorithms through a series of simulated experiments. All algorithms are written in Python, with convex programs and ILP implemented using the CVXPY solver [46]. We conduct the experiments on an Ubuntu 20.04 operating system equipped with an i9-10900 5.2 GHz CPU and 32 GB RAM, featuring 10 physical cores and 20 threads. To demonstrate the effectiveness of our approach, we benchmark our algorithms against current state-of-the-art algorithms in both 2D and 3D scenarios. Initially, we examine the Dual-Mode Switching Defense Coordination (DMSDC) algorithm (Algorithm 1), which serves as the basis for subsequent experiments with the monotonic defense enhancement allocation (MDEA) algorithm (Algorithm 4). Furthermore, to demonstrate the applicability of our algorithms in practical settings, we extend our evaluation to a more realistic simulation environment utilizing Gazebo with the robot operating system (ROS) [47].

V-A Single-Attack Defense Coordination

We start with the DMSDC for the problem of single-attack defense coordination. Our approach is compared with two existing methods: the multi-agent pursuit-defense strategy presented in [27] for a 2D scene and the analytical HJI-based strategy introduced in [24] for a 3D scene. We conduct simulations in a 2D rectangular domain of [5,5]2[-5,5]^{2} with a unit circle as the target set, and a 3D box domain of [5,5]3[-5,5]^{3} with the origin as the target set. To examine the effectiveness of our approach in guaranteeing the coalition’s win under various initial joint states, we perform 2000 randomized simulations with different numbers of defenders and capture radii for both the 2D and 3D scenarios. To ensure a fair comparison, we adopt the attack strategy given in (27) across all methods and set the maximum speed ratios to 1. The simulation outcomes are classified into four categories based on whether the coalition wins the game:

  • True Positive: The percentage of cases where both methods successfully capture the attacker.

  • False Negative: The percentage of cases where our method succeeds while the comparison method fails to capture the attacker.

  • False Positive: The percentage of cases where our method fails while the comparison method succeeds in capturing the attacker.

  • True Negative: The percentage of cases where both methods fail in capturing the attacker.

Table I showcases the 2D simulation results, which highlight the superior performance of our proposed algorithm over the multi-agent pursuit-defense strategy. Notably, our algorithm captures the attacker in 7.1% to 22.4% more instances than the multi-agent pursuit-defense strategy, without any occurrences where our algorithm fails while the multi-agent pursuit-defense strategy prevails. Interestingly, as the capture radius increases, our algorithm records a higher percentage of false negatives. This observation can be attributed to the fact that the Voronoi diagram employed in the multi-agent strategy does not inherently account for nonzero capture radii, whereas our algorithm exploits the safe-reachable set of the attacker to devise a more effective defense coordination strategy. Moreover, we note that when the number of defenders increases from 2 to 3, both capture radius scenarios exhibit a higher percentage of false negatives. This result arises from the multi-agent pursuit-defense strategy utilizing only one defender for defense, with the remaining defenders acting as pursuers. In contrast, our algorithm capitalizes on the collective strength of multiple defenders to achieve overall coordination.

TABLE I: Simulation results of a 2D multi-defender single-attacker reach-avoid game.
Defender Capture True False False True
Number Radius Positive Negative Positive Negative
      2 0.5 25.6% 7.1% 0% 67.3%
      3 0.5 33.0% 10.0% 0% 57.0%
      2 3.0 33.9% 18.8% 0% 47.3%
      3 3.0 41.6% 22.4% 0% 36.0%

On the other hand, the 3D simulation results displayed in Table II indicate that our proposed algorithm outperforms the analytical HJI-based strategy. Specifically, our algorithm captures the attacker in up to 13.6% more instances than the analytical HJI-based strategy when the capture radius is 2, and there are no cases in which our algorithm fails while the analytical HJI-based strategy succeeds. It should be emphasized that the analytical HJI-based strategy is designed for cases with zero capture radii; consequently, the performance of our algorithm closely aligns with the analytical HJI-based strategy when the capture radius is near zero, which in turn demonstrates the optimality of our approach. Additionally, our algorithm exhibits better performance as the number of defenders increases, similar to our 2D comparison. This improvement is due to the analytical HJI-based strategy focusing solely on scenarios with one or two defenders, whereas our approach accommodates multiple defenders.

TABLE II: Simulation results of a 3D multi-defender single-attacker reach-avoid game.
Defender Capture True False False True
Number Radius Positive Negative Positive Negative
      2 0.1 66.5% 0.9% 0% 32.6%
      3 0.1 72.2% 1.6% 0% 26.2%
      2 2.0 84.3% 7.3% 0% 8.4%
      3 2.0 81.6% 13.6% 0% 5.2%

V-B Multiple-Attack Defense Allocation

Refer to caption
Refer to caption
Figure 5: Performance comparison between our MDEA algorithm and the analytical barrier approach in a 2D scenario. (a): Capture number comparison. (b): Check number comparison.
Refer to caption
Refer to caption
Figure 6: Performance comparison between our MDEA algorithm and the sequential matching approach in a 3D scenario. (a): Capture number comparison. (b): Average check number comparison.

We next examine the MDEA algorithm for the multi-attack defense allocation problem. We compare our approach to two existing algorithms: the analytical barrier introduced in [31] for a 2D scenario and the sequential matching presented in [33] for a 3D scenario. To conduct our experiments, we use five groups of agents, maintaining a 1:1 ratio between defenders and attackers. The number of agents in each group ranges from 10 to 50, and we perform simulations with 10 different initial joint states for each group. We employ the same domain as described in Section V-A, with variations in the target sets for the different scenarios. Specifically, the 2D scenario has a rectangular domain of [5,5]2[-5,5]^{2} with the line segment y=5y=5 as the target set, while the 3D scenario has a box domain of [5,5]3[-5,5]^{3} with the flat face z=5z=-5 as the target set.

During the simulation, we track two metrics to evaluate the effectiveness and computational efficiency of our proposed algorithm. The first metric is the capture number, representing the total number of attackers captured throughout the game, which is complementary to the payoff function. The second metric is the check number, denoting the count of convex programs of the form (17) computed within the multi-attack objective function defined in (32). This metric plays a crucial role in computational complexity, particularly when the agent size increases, as witnessed by our empirical results. In these cases, it can be seen that our single-attack defense coordination strategy (21) is the limiting case of that proposed in [31] when the capture radii tend to zero, and it includes the one proposed in [33] as a special case. To ensure a fair comparison with other algorithms, we employ a low capture radius of ri=0.05mr_{i}=0.05\text{m} for the 2D scenario and ri=0.2mr_{i}=0.2\text{m} for the 3D scenario. Similarly, we adopt the attack strategy given in (27) and set the maximum speed ratios to 1, as described in Section V-A.

We employ boxplots to display the results of our 2D simulation in Fig. 5, indicating that our MDEA algorithm outperforms the analytical barrier approach. Three colors are used to represent three cases: 1) MDEA applied at each allocation time; 2) analytical barrier, which performs allocation only at the initial joint state; 3) MDEA implemented solely at the initial joint state, akin to the analytical barrier. As illustrated in Fig. 5, MDEA yields a considerable improvement over the analytical barrier, with its median values consistently exceeding those of the analytical barrier. Furthermore, the capture number achieved by MDEA surpasses that attained at the initial joint state, demonstrating the defense performance improvement of our approach as proven in Theorem 2. In Fig. 5, it can be observed that the analytical barrier exhibits a constant check number for each case, equal to the dimension of the decision variables in the ILP (34). Conversely, the check number of MDEA at the initial joint state is significantly lower than that of the analytical barrier, highlighting the superior computational efficiency of our method.

In the 3D simulation, our MDEA algorithm also outperforms the sequential matching method, as depicted in Fig. 6. We employ the same three colors to represent three cases as in the 2D simulation, with the analytical barrier replaced by sequential matching. As illustrated in Fig. 6, MDEA surpasses the sequential matching with higher median values of capture number. Moreover, MDEA attains a final capture number that exceeds the expected capture number initially obtained by the algorithm, thereby demonstrating the effectiveness of our approach in enhancing defense performance. Besides, Fig. 6 reveals that the average check number achieved by MDEA is significantly lower than that of the sequential matching, where the average check number is calculated by averaging the check number of each step. This showcases the computational efficiency of our approach.

V-C Gazebo Simulators with ROS

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 7: 2D Gazebo simulation and visualization results for an indoor scenario with three defenders (blue) and five attackers (red). The green region denotes the target set, and the black numbers above the blue dots indicate the instantaneous assigned indices. (a) and (e) Initial configuration; (b) and (f) Configuration at 13s (optimal attack strategy case); (c) and (g) Final configuration (optimal attack strategy case); (d) and (h) Final configuration (straight-line attack strategy case).
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 8: 3D Gazebo simulation and visualization results for an outdoor scenario with five defenders (blue) and five attackers (red). The target set is represented by the cylindrical-shaped no-fly zone, as illustrated by the green cylinders in (e)-(h). The black numbers above the blue dots denote the instantaneous assigned indices. (a) and (e) Initial configuration; (b) and (f) Configuration at 5s (optimal attack strategy case); (c) and (g) Final configuration (optimal attack strategy case); (d) and (h) Final configuration (straight-line attack strategy case).

Finally, we evaluate the practical applicability of our proposed algorithm using the Gazebo simulators in conjunction with ROS. To address real-world scenarios, we incorporate a multi-agent collision avoidance approach based on the control barrier function, as presented by Wang et al. [48]. Additionally, we utilize the mapping provided in [49] to convert the velocity vectors generated by our algorithm into corresponding linear and angular velocity commands. We conduct experiments in two different environments: a 2D indoor setting and a 3D outdoor setting. The indoor environment is designed as a square space surrounded by walls, with a rectangular region adjacent to one of the walls serving as the target set. In this scenario, TurtleBot3 Burger wheeled mobile robots are deployed as agents, featuring maximum linear and angular velocities of 0.22m/s and 2.84rad/s, respectively. On the other hand, the outdoor environment is characterized by a box-shaped region elevated above the ground, with a cylindrical region designated as the target set to represent a no-fly zone in real-world scenarios. Here, Iris quadcopter robots are employed as agents. The computed velocity vector is utilized as a reference point to regulate the robot’s velocity and heading. The linear and angular velocity references are set to 0.8m/s and 0rad/s, respectively, enabling the Iris quadcopter robots to move swiftly towards the reference point while preserving safety. The safety radii for the TurtleBot3 Burger and Iris robots are 0.1m and 0.5m, respectively, while their capture radii are set to 0.6m and 1.5m, respectively. Both robotics platforms operate at a frequency of 50Hz, with a defense allocation frequency of 10Hz.

To test the effectiveness of our proposed algorithm, we use two different attack strategies and examine the defender team’s adaptability in response to these variations. The first strategy is the optimal attack strategy given in (27), where attacker jj moves towards the almost-optimal waypoint ξ~j𝒩\tilde{\xi}^{\mathcal{N}}_{j}, as specified in (26). The second strategy termed the straight-line approach, entails each attacker heading toward the nearest point on the target set, with the objective of minimizing the time required to reach the target set while disregarding the risk of potential capture by the defenders. The simulation process can be seen in the attached video, and illustrative snapshots are provided in Fig. 7 and Fig. 8. In the 2D indoor scenario, featuring a configuration of five attackers and three defenders, our proposed algorithm exhibits a high level of effectiveness, achieving a final capture number of four for the optimal attack strategy and five for the straight-line attack strategy. Likewise, in the 3D outdoor scenario, comprising five attackers and five defenders, the algorithm demonstrates its robustness by attaining a final capture number of four for the optimal attack strategy and five for the straight-line attack strategy. These results substantiate the effectiveness of our proposed algorithm in diverse environmental settings.

VI Conclusion

In this article, we presented a dual-layer online optimization strategy for defenders in multiplayer reach-avoid games, with the goal of minimizing the number of successful attackers entering a convex target set within a convex domain while facing unknown attack strategies. We devised a dual-mode switching defense algorithm for defender coalitions to counter individual attackers via online convex programming. This algorithm induces the maximum defense-winning region for initial joint states, sharpening the chances of winning while offering scalability. Additionally, we developed a monotonic defense enhancement allocation algorithm for determining coalition assignment matrices in integer linear programs (ILPs) by incorporating predicted outcomes from single-attack coordination. This algorithm employs a hierarchical iterative method to approximate ILPs and includes a monotonicity constraint, reducing the chosen coalition-attacker pairs while maintaining a non-decreasing expected number of unsuccessful attackers over time. We assessed our approach’s performance across diverse multi-robot systems and environments, demonstrating its superiority in terms of optimality and efficiency. Moreover, Gazebo simulations confirmed the real-world applicability and potential for practical robotics implementation. In future work, extending our approach to encompass non-convex environments and integrating more complex agent models could enhance the performance and applicability of our proposed framework. In addition, accounting for attackers with different priorities may better reflect real-world scenarios and further refine the defense strategies employed in our approach.

-A Proof of Proposition 1

Proof: To show the convexity of the SRS, we first exhibit that for each i𝒞i\in\mathcal{C}, cij(q,xij)c_{ij}(q,x_{ij}) is convex with respect to qq. Note that cijc_{ij} can be decomposed into a sum of three terms: cij(q,xij)=(γij21)q2+2riγijqpja+[2(pidγij2pja)Tq+γij2pja2pid2+ri2]c_{ij}(q,x_{ij})=(\gamma_{ij}^{2}-1)||q||^{2}+2r_{i}\gamma_{ij}||q-p^{a}_{j}||+[2(p^{d}_{i}-\gamma_{ij}^{2}p^{a}_{j})^{T}q+\gamma_{ij}^{2}||p^{a}_{j}||^{2}-||p^{d}_{i}||^{2}+r_{i}^{2}]. Owing to the maximum speed ratio constraint (8) and the positivity of rir_{i}, the coefficients of the first two terms are non-negative. Consequently, these two terms are convex, as the (squared) Euclidean norm is a convex function. The convexity of cij(q,xij)c_{ij}(q,x_{ij}) then follows from the fact that the last term is linear, and the sum of convex functions is itself a convex function.

It remains to establish that if q𝒟q\in\mathcal{D} satisfies the inequality maxi𝒞cij(q,xij)0\max_{i\in\mathcal{C}}c_{ij}(q,x_{ij})\leq 0, then qq belongs to the SRS. Given any such qq, consider the constant attack strategy πja=vj,maxa𝒩(qpja)\pi^{a}_{j}=v^{a}_{j,\max}\mathscr{N}(q-p^{a}_{j}) over the time interval [0,τ][0,\tau], where τ=qpjavj,maxa\tau=\frac{||q-p^{a}_{j}||}{v^{a}_{j,\max}}. The reachability property of the SRS holds by noting that attacker jj can reach qq from pjap^{a}_{j} along the line segment LL connecting pjap^{a}_{j} and qq using πja\pi^{a}_{j}. On the other hand, any point on LL can be expressed as qλ=(1λ)pja+λqq_{\lambda}=(1-\lambda)p^{a}_{j}+\lambda q with parameter λ[0,1]\lambda\in[0,1], which is reached by attacker jj at time λτ\lambda\tau using πja\pi^{a}_{j}. Under the same time, the closest point to qλq_{\lambda} that defender ii can reach from pidp^{d}_{i}, denoted by qi,λdq^{d*}_{i,\lambda}, must lie on the straight line that connects pidp^{d}_{i} and qλq_{\lambda}, as defender ii can move in any direction. This leads to qi,λd=pid+vi,maxdqλpidqλpidλτq^{d*}_{i,\lambda}=p^{d}_{i}+v^{d}_{i,\max}\frac{q_{\lambda}-p^{d}_{i}}{||q_{\lambda}-p^{d}_{i}||}\lambda\tau. Besides, the convexity of cij(q,xij)c_{ij}(q,x_{ij}) implies that cij(qλ,xij)(1λ)cij(pja,xij)+λcij(q,xij)0c_{ij}(q_{\lambda},x_{ij})\leq(1-\lambda)c_{ij}(p^{a}_{j},x_{ij})+\lambda c_{ij}(q,x_{ij})\leq 0, which in turn indicates that si(qi,λd,qλ)=ri|qλpidγijqλpja|=γijqλpja+riqλpid0s_{i}(q^{d*}_{i,\lambda},q_{\lambda})=r_{i}-\bigl{\lvert}||q_{\lambda}-p^{d}_{i}||-\gamma_{ij}||q_{\lambda}-p^{a}_{j}||\bigr{\rvert}=\gamma_{ij}||q_{\lambda}-p^{a}_{j}||+r_{i}-||q_{\lambda}-p^{d}_{i}||\leq 0. As a result, for any admissible defense strategy πid\pi^{d}_{i}, si(χid(λτ;pid,πid),qλ)si(qi,λd,qλ)0s_{i}(\chi^{d}_{i}(\lambda\tau;p^{d}_{i},\pi^{d}_{i}),q_{\lambda})\leq s_{i}(q^{d*}_{i,\lambda},q_{\lambda})\leq 0 for all λ[0,1]\lambda\in[0,1], yielding the safety property of the SRS. \blacksquare

-B Proofs of Proposition 2 and (24)

Proof: Consider the Lagrangian function of (17):

d(q^,λ^)=qq~2+i𝒞λijcij(q,xij)+kDλ¯kdk(q)+lGλ~lgl(q~)\begin{split}\mathcal{L}^{d}(\hat{q},\hat{\lambda})=&||q-\tilde{q}||^{2}+\sum_{i\in\mathcal{C}}\lambda_{ij}c_{ij}(q,x_{ij})\\ &+\sum_{k\in\mathcal{I}_{D}}\bar{\lambda}_{k}d_{k}(q)+\sum_{l\in\mathcal{I}_{G}}\tilde{\lambda}_{l}g_{l}(\tilde{q})\end{split}

where q^=(q,q~)\hat{q}=(q,\tilde{q}) and λ^=(λij,i𝒞;λ¯k,kD;λ~l,lG)\hat{\lambda}=(\lambda_{ij},i\in\mathcal{C};\bar{\lambda}_{k},k\in\mathcal{I}_{D};\tilde{\lambda}_{l},l\in\mathcal{I}_{G}) is the Lagrange multiplier. The KKT conditions asserts that for the optimal solution q^=(ξj𝒞,ξ~j𝒞)\hat{q}^{*}=(\xi^{\mathcal{C}}_{j},\tilde{\xi}^{\mathcal{C}}_{j}), there is a Lagrange multiplier λ^\hat{\lambda}^{*} with nonnegative components such that dq^(q^,λ^)=0\frac{\partial\mathcal{L}^{d}}{\partial\hat{q}}(\hat{q}^{*},\hat{\lambda}^{*})=0, i.e.,

2(ξj𝒞ξ~j𝒞)T+i𝒞λijcijq(ξj𝒞,xij)+kDλ¯kdkq(ξj𝒞)=0\displaystyle 2(\xi^{\mathcal{C}}_{j}-\tilde{\xi}^{\mathcal{C}}_{j})^{T}+\sum_{i\in\mathcal{C}}\lambda_{ij}^{*}\frac{\partial c_{ij}}{\partial q}(\xi^{\mathcal{C}}_{j},x_{ij})+\sum_{k\in\mathcal{I}_{D}}\bar{\lambda}_{k}^{*}\frac{\partial d_{k}}{\partial q}(\xi^{\mathcal{C}}_{j})=0 (43)
2(ξj𝒞ξ~j𝒞)TlGλ~lglq~(ξ~j𝒞)=0.\displaystyle 2(\xi^{\mathcal{C}}_{j}-\tilde{\xi}^{\mathcal{C}}_{j})^{T}-\sum_{l\in\mathcal{I}_{G}}\tilde{\lambda}_{l}^{*}\frac{\partial g_{l}}{\partial\tilde{q}}(\tilde{\xi}^{\mathcal{C}}_{j})=0. (44)

Additionally, for any interior point xj𝒞𝒟x^{\mathcal{C}}_{j}\in\mathcal{D}^{\prime} and any i𝒞i\in\mathcal{C}, one of the following three cases occurs: i) cij(ξj𝒞,xij)<0c_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})<0; ii) cij(ξj𝒞,xij)=0c_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})=0 and c˙ij(ξj𝒞,xij)=0\dot{c}_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})=0; iii) cij(ξj𝒞,xij)=0c_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})=0 and c˙ij(ξj𝒞,xij)0\dot{c}_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})\neq 0. In particular, since cijc_{ij} is continuously differentiable, the set of interior points for which the last case holds has measure zero, according to the results from real analysis [50]. Thus, the complementarity condition λijcij(ξj𝒞,xij)=0\lambda_{ij}^{*}c_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})=0 implies that for almost every xj𝒞𝒟x^{\mathcal{C}}_{j}\in\mathcal{D}^{\prime}, λijc˙ij(ξj𝒞,xij)=0\lambda_{ij}^{*}\dot{c}_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})=0, i.e.,

λij(cijq(ξj𝒞,xij)ξ˙j𝒞+cijpid(ξj𝒞,xij)uid+cijpja(ξj𝒞,xij)uja)=0,i𝒞.\begin{split}&\lambda_{ij}^{*}\left(\frac{\partial c_{ij}}{\partial q}(\xi^{\mathcal{C}}_{j},x_{ij})\dot{\xi}^{\mathcal{C}}_{j}+\frac{\partial c_{ij}}{\partial p^{d}_{i}}(\xi^{\mathcal{C}}_{j},x_{ij})u^{d}_{i}\right.\\ &\left.+\frac{\partial c_{ij}}{\partial p^{a}_{j}}(\xi^{\mathcal{C}}_{j},x_{ij})u^{a}_{j}\right)=0,~{}\forall i\in\mathcal{C}.\end{split} (45)

Similar derivations can be respectively employed to the complementarity conditions λ¯kdk(ξj𝒞)=0\bar{\lambda}_{k}^{*}d_{k}(\xi^{\mathcal{C}}_{j})=0, kDk\in\mathcal{I}_{D} and λ~lgl(ξ~j𝒞)=0\tilde{\lambda}_{l}^{*}g_{l}(\tilde{\xi}^{\mathcal{C}}_{j})=0, lGl\in\mathcal{I}_{G}, yielding that for almost every xj𝒞𝒟x^{\mathcal{C}}_{j}\in\mathcal{D}^{\prime},

λ¯kdkq(ξj𝒞)ξ˙j𝒞=0,kD\bar{\lambda}_{k}^{*}\frac{\partial d_{k}}{\partial q}(\xi^{\mathcal{C}}_{j})\dot{\xi}^{\mathcal{C}}_{j}=0,~{}\forall k\in\mathcal{I}_{D} (46)

and λ~lglq~(ξ~j𝒞)ξ~˙j𝒞=0\tilde{\lambda}_{l}^{*}\frac{\partial g_{l}}{\partial\tilde{q}}(\tilde{\xi}^{\mathcal{C}}_{j})\dot{\tilde{\xi}}^{\mathcal{C}}_{j}=0 for all lGl\in\mathcal{I}_{G}, where the latter equation together with (44) leads to

2(ξj𝒞ξ~j𝒞)Tξ~˙j𝒞=lGλ~lglq~(ξ~j𝒞)ξ~˙j𝒞=0.2(\xi^{\mathcal{C}}_{j}-\tilde{\xi}^{\mathcal{C}}_{j})^{T}\dot{\tilde{\xi}}^{\mathcal{C}}_{j}=\sum_{l\in\mathcal{I}_{G}}\tilde{\lambda}_{l}^{*}\frac{\partial g_{l}}{\partial\tilde{q}}(\tilde{\xi}^{\mathcal{C}}_{j})\dot{\tilde{\xi}}^{\mathcal{C}}_{j}=0. (47)

Therefore,

Φ˙j𝒞=def2(ξj𝒞ξ~j𝒞)T(ξj𝒞˙ξ~˙j𝒞)=(47)2(ξj𝒞ξ~j𝒞)Tξj𝒞˙=(43)i𝒞λijcijq(ξj𝒞,xij)ξ˙j𝒞kDλ¯kdkq(ξj𝒞)ξ˙j𝒞=(45)+(46)i𝒞λijcijpid(ξj𝒞,xij)uid+i𝒞λijcijpja(ξj𝒞,xij)uja\begin{split}\dot{\Phi}^{\mathcal{C}}_{j}\overset{\mathrm{def}}{=}&2(\xi^{\mathcal{C}}_{j}-\tilde{\xi}^{\mathcal{C}}_{j})^{T}(\dot{\xi^{\mathcal{C}}_{j}}-\dot{\tilde{\xi}}^{\mathcal{C}}_{j})\\ \overset{\eqref{eq:d5}}{=}&2(\xi^{\mathcal{C}}_{j}-\tilde{\xi}^{\mathcal{C}}_{j})^{T}\dot{\xi^{\mathcal{C}}_{j}}\\ \overset{\eqref{eq:d1}}{=}&-\sum_{i\in\mathcal{C}}\lambda_{ij}^{*}\frac{\partial c_{ij}}{\partial q}(\xi^{\mathcal{C}}_{j},x_{ij})\dot{\xi}^{\mathcal{C}}_{j}-\sum_{k\in\mathcal{I}_{D}}\bar{\lambda}_{k}^{*}\frac{\partial d_{k}}{\partial q}(\xi^{\mathcal{C}}_{j})\dot{\xi}^{\mathcal{C}}_{j}\\ \overset{\eqref{eq:d3}+\eqref{eq:d4}}{=}&\sum_{i\in\mathcal{C}}\lambda_{ij}^{*}\frac{\partial c_{ij}}{\partial p^{d}_{i}}(\xi^{\mathcal{C}}_{j},x_{ij})u^{d}_{i}+\sum_{i\in\mathcal{C}}\lambda_{ij}^{*}\frac{\partial c_{ij}}{\partial p^{a}_{j}}(\xi^{\mathcal{C}}_{j},x_{ij})u^{a}_{j}\end{split}

holds for almost every xj𝒞𝒟x^{\mathcal{C}}_{j}\in\mathcal{D}^{\prime}. This completes the proof of Proposition 2 by examining (18).

To verify the equality (24), we consider the Lagrangian function of (23):

a(q,μ^)=qpja2+i𝒞μijcij(q,xij)+lGμ~lgl(q)\mathcal{L}^{a}(q,\hat{\mu})=||q-p^{a}_{j}||^{2}+\sum_{i\in\mathcal{C}}\mu_{ij}c_{ij}(q,x_{ij})+\sum_{l\in\mathcal{I}_{G}}\tilde{\mu}_{l}g_{l}(q)

with the Lagrange multiplier μ^=(μij,i𝒞,μ~l,lG)\hat{\mu}=(\mu_{ij},i\in\mathcal{C},\tilde{\mu}_{l},l\in\mathcal{I}_{G}). Analogous to the proof of Proposition 2, the KKT conditions imply that there is a Lagrange multiplier μ^\hat{\mu}^{*} with nonnegative components such that

2(ξ¯j𝒞pja)T+i𝒞μijcijq(ξ¯j𝒞,xij)+lGμ~lglq(ξ¯j𝒞)=02(\bar{\xi}^{\mathcal{C}}_{j}-p^{a}_{j})^{T}+\sum_{i\in\mathcal{C}}\mu_{ij}^{*}\frac{\partial c_{ij}}{\partial q}(\bar{\xi}^{\mathcal{C}}_{j},x_{ij})+\sum_{l\in\mathcal{I}_{G}}\tilde{\mu}_{l}^{*}\frac{\partial g_{l}}{\partial q}(\bar{\xi}^{\mathcal{C}}_{j})=0 (48)

and for almost every xj𝒞𝒟j𝒞x^{\mathcal{C}}_{j}\notin\mathcal{D}^{\mathcal{C}}_{j},

μij(cijq(ξ¯j𝒞,xij)ξ¯˙j𝒞+cijpid(ξ¯j𝒞,xij)uid+cijpja(ξ¯j𝒞,xij)uja)=0,i𝒞\begin{split}&\mu_{ij}^{*}\left(\frac{\partial c_{ij}}{\partial q}(\bar{\xi}^{\mathcal{C}}_{j},x_{ij})\dot{\bar{\xi}}^{\mathcal{C}}_{j}+\frac{\partial c_{ij}}{\partial p^{d}_{i}}(\bar{\xi}^{\mathcal{C}}_{j},x_{ij})u^{d}_{i}\right.\\ &\left.+\frac{\partial c_{ij}}{\partial p^{a}_{j}}(\bar{\xi}^{\mathcal{C}}_{j},x_{ij})u^{a}_{j}\right)=0,~{}\forall i\in\mathcal{C}\end{split} (49)

and

μ~lglq(ξ¯j𝒞)ξ¯˙j𝒞=0,lG.\tilde{\mu}_{l}^{*}\frac{\partial g_{l}}{\partial q}(\bar{\xi}^{\mathcal{C}}_{j})\dot{\bar{\xi}}^{\mathcal{C}}_{j}=0,~{}\forall l\in\mathcal{I}_{G}. (50)

Consequently, the proof is done by noting that

Φ¯˙j𝒞+2(ξ¯j𝒞pja)Tuja=def2(ξ¯j𝒞pja)Tξ¯˙j𝒞=(48)i𝒞μijcijq(ξ¯j𝒞,xij)ξ¯˙j𝒞lGμ~lglq(ξ¯j𝒞)ξ¯˙j𝒞=(49)+(50)i𝒞μijcijpid(ξ¯j𝒞,xij)uid+i𝒞μijcijpja(ξ¯j𝒞,xij)uja.\begin{split}&\dot{\bar{\Phi}}^{\mathcal{C}}_{j}+2(\bar{\xi}^{\mathcal{C}}_{j}-p^{a}_{j})^{T}u^{a}_{j}\\ \overset{\mathrm{def}}{=}&2(\bar{\xi}^{\mathcal{C}}_{j}-p^{a}_{j})^{T}\dot{\bar{\xi}}^{\mathcal{C}}_{j}\\ \overset{\eqref{eq:a1}}{=}&-\sum_{i\in\mathcal{C}}\mu_{ij}^{*}\frac{\partial c_{ij}}{\partial q}(\bar{\xi}^{\mathcal{C}}_{j},x_{ij})\dot{\bar{\xi}}^{\mathcal{C}}_{j}-\sum_{l\in\mathcal{I}_{G}}\tilde{\mu}_{l}^{*}\frac{\partial g_{l}}{\partial q}(\bar{\xi}^{\mathcal{C}}_{j})\dot{\bar{\xi}}^{\mathcal{C}}_{j}\\ \overset{\eqref{eq:a2}+\eqref{eq:a3}}{=}&\sum_{i\in\mathcal{C}}\mu_{ij}^{*}\frac{\partial c_{ij}}{\partial p^{d}_{i}}(\bar{\xi}^{\mathcal{C}}_{j},x_{ij})u^{d}_{i}+\sum_{i\in\mathcal{C}}\mu_{ij}^{*}\frac{\partial c_{ij}}{\partial p^{a}_{j}}(\bar{\xi}^{\mathcal{C}}_{j},x_{ij})u^{a}_{j}.\end{split}

\blacksquare

-C Proof of Proposition 3

Proof: The proof is conducted by contradiction. Suppose there are q1,q2Ωj𝒞(xj𝒞)q_{1},q_{2}\in\Omega^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}) such that Ψ(q1)=Ψ(q2)\Psi(q_{1})=\Psi(q_{2}), where

Ψ(q)=minq~𝒢qq~2\Psi(q)=\min_{\tilde{q}\in\mathcal{G}}||q-\tilde{q}||^{2} (51)

is the minimum squared distance from qq to the target set. We claim that the line segment connecting q1q_{1} and q2q_{2}, denoted as qλ=λq1+(1λ)q2q_{\lambda}=\lambda q_{1}+(1-\lambda)q_{2} with parameter λ[0,1]\lambda\in[0,1], must lie on the zero level set defined by ckj(q,xkj)=0c_{kj}(q,x_{kj})=0 for some k𝒩k\in\mathcal{N}. First, note that the line segment qλq_{\lambda} must lie within the SRS due to its convexity. Next, by the Definition of Φj𝒞\Phi^{\mathcal{C}}_{j}, we know that Φj𝒞(xj𝒞)Ψ(q)\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j})\leq\Psi(q) for all qΩj𝒞(xj𝒞)q\in\Omega^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}), and thus Φj𝒞(xj𝒞)Ψ(qλ)\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j})\leq\Psi(q_{\lambda}) for all λ[0,1]\lambda\in[0,1]. Furthermore, since Ψ\Psi is convex, we have Ψ(qλ)λΨ(q1)+(1λ)Ψ(q2)=Φj𝒞(xj𝒞)\Psi(q_{\lambda})\leq\lambda\Psi(q_{1})+(1-\lambda)\Psi(q_{2})=\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}). Therefore, Ψ(qλ)=Φj𝒞(xj𝒞)\Psi(q_{\lambda})=\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}) for all λ[0,1]\lambda\in[0,1], indicating that the line segment qλq_{\lambda} is on the boundary of the SRS. Lastly, we verify the claim by noting that if maxi𝒩cij(qλ0,xij)<0\max_{i\in\mathcal{N}}c_{ij}(q_{\lambda_{0}},x_{ij})<0 for some λ0[0,1]\lambda_{0}\in[0,1], then the point qλ0q_{\lambda_{0}} lies strictly within the SRS, which would contradict the result that qλ0q_{\lambda_{0}} lies on the boundary of the SRS. However, the claim violates the fact that the zero level set defined by ckj(q,xkj)=0c_{kj}(q,x_{kj})=0 contains no line segments. This is because (q1q2)Tqckj=0(q_{1}-q_{2})^{T}\nabla_{q}c_{kj}=0 can never have infinite solutions, where qckj=2(γkj21)q+2rkγkjqpjaqpja+2(pkdγkj2pja)\nabla_{q}c_{kj}=2(\gamma_{kj}^{2}-1)q+2r_{k}\gamma_{kj}\frac{q-p^{a}_{j}}{||q-p^{a}_{j}||}+2(p^{d}_{k}-\gamma_{kj}^{2}p^{a}_{j}) is the normal vector to the zero level set of ckj(q,xkj)=0c_{kj}(q,x_{kj})=0. \blacksquare

-D Proof of Proposition 4

Proof: It suffices to show by contradiction that Φj𝒞(xj𝒞)=Φj𝒞(xj𝒞)\Phi^{\mathcal{C}^{\prime}}_{j}(x^{\mathcal{C}^{\prime}}_{j})=\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}) with 𝒞=Λj𝒞\mathcal{C}^{\prime}=\Lambda^{\mathcal{C}}_{j}, or equivalently Ψ(ξj𝒞)=Ψ(ξj𝒞)\Psi(\xi^{\mathcal{C}^{\prime}}_{j})=\Psi(\xi^{\mathcal{C}}_{j}) in which Ψ\Psi is defined in (51). Assume that Ψ(ξj𝒞)Ψ(ξj𝒞)\Psi(\xi^{\mathcal{C}^{\prime}}_{j})\neq\Psi(\xi^{\mathcal{C}}_{j}). In this case, we have Ψ(ξj𝒞)<Ψ(ξj𝒞)\Psi(\xi^{\mathcal{C}^{\prime}}_{j})<\Psi(\xi^{\mathcal{C}}_{j}) as ξj𝒞Ωj𝒞\xi^{\mathcal{C}}_{j}\in\Omega^{\mathcal{C}^{\prime}}_{j}. To reach a contradiction, we consider the line segment connecting ξj𝒞\xi^{\mathcal{C}^{\prime}}_{j} and ξj𝒞\xi^{\mathcal{C}}_{j}, which is parameterized by μ[0,1]\mu\in[0,1] as ξμ=μξj𝒞+(1μ)ξj𝒞\xi_{\mu}=\mu\xi^{\mathcal{C}^{\prime}}_{j}+(1-\mu)\xi^{\mathcal{C}}_{j}. Since 𝒞\mathcal{C}^{\prime} is the ADS for (𝒞,j)(\mathcal{C},j), it follows that cij(ξj𝒞,xij)<0c_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})<0 for all i𝒞𝒞i\in\mathcal{C}\setminus\mathcal{C}^{\prime}. Due to the continuity of cij(ξμ,xij)c_{ij}(\xi_{\mu},x_{ij}) with respect to μ\mu, there exists μi(0,1)\mu_{i}\in(0,1) for each i𝒞𝒞i\in\mathcal{C}\setminus\mathcal{C}^{\prime} such that cij(ξμ,xij)0c_{ij}(\xi_{\mu},x_{ij})\leq 0 for all μ[0,μi,]\mu\in[0,\mu_{i},]. Let μ¯\bar{\mu} be the minimum of all μi\mu_{i}. Then cij(ξμ¯,xij)0c_{ij}(\xi_{\bar{\mu}},x_{ij})\leq 0 for all i𝒞𝒞i\in\mathcal{C}\setminus\mathcal{C}^{\prime}. Moreover, since the line segment ξμ\xi_{\mu} lies within Ωj𝒞(xj𝒞)\Omega^{\mathcal{C}^{\prime}}_{j}(x^{\mathcal{C}^{\prime}}_{j}), we have cij(ξμ¯,xij)0c_{ij}(\xi_{\bar{\mu}},x_{ij})\leq 0 for all i𝒞i\in\mathcal{C}^{\prime}. As a result, ξμ¯Ωj𝒞\xi_{\bar{\mu}}\in\Omega^{\mathcal{C}}_{j}. However, it is noted that Ψ(ξμ¯)μ¯Ψ(ξj𝒞)+(1μ¯)Ψ(ξj𝒞)<Ψ(ξj𝒞)\Psi(\xi_{\bar{\mu}})\leq\bar{\mu}\Psi(\xi^{\mathcal{C}}_{j})+(1-\bar{\mu})\Psi(\xi^{\mathcal{C}^{\prime}}_{j})<\Psi(\xi^{\mathcal{C}}_{j}), which contradicts the fact that the minimum value of Ψ\Psi occurs at ξj𝒞\xi^{\mathcal{C}}_{j}. \blacksquare

-E Proof of Proposition 5

Proof: According to Proposition 4, it is sufficient to establish that for any feasible coalition-attacker pair (𝒞,j)(\mathcal{C},j) with |Λj𝒞|>n|\Lambda^{\mathcal{C}}_{j}|>n, there exists a subset 𝒞\mathcal{C}^{\prime} of Λj𝒞\Lambda^{\mathcal{C}}_{j} with |𝒞|n|\mathcal{C}^{\prime}|\leq n such that Φj𝒞(xj𝒞)=Φj𝒞(xj𝒞)\Phi^{\mathcal{C}^{\prime}}_{j}(x^{\mathcal{C}^{\prime}}_{j})=\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}). For each iΛj𝒞i\in\Lambda^{\mathcal{C}}_{j}, the almost-optimal waypoint ξj𝒞\xi^{\mathcal{C}}_{j} lies on the tangent of the zero level set cij(ξj𝒞,xij)=0c_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})=0, and thus satisfies the linear equation (qξj𝒞)Tqcij(ξj𝒞,xij)=0(q-\xi^{\mathcal{C}}_{j})^{T}\nabla_{q}c_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})=0, where qcij(ξj𝒞,xij)\nabla_{q}c_{ij}(\xi^{\mathcal{C}}_{j},x_{ij}) is the normal vector of cij(q,xij)=0c_{ij}(q,x_{ij})=0 at q=ξj𝒞q=\xi^{\mathcal{C}}_{j}. Since an nn-dimensional solution can be uniquely determined by at most nn independent linear equations, there is a subset 𝒞\mathcal{C}^{\prime} of Λj𝒞\Lambda^{\mathcal{C}}_{j} with |𝒞|n|\mathcal{C}^{\prime}|\leq n such that ξj𝒞\xi^{\mathcal{C}}_{j} is the unique solution to the linear equations (qξj𝒞)Tqcij(ξj𝒞,xij)=0(q-\xi^{\mathcal{C}}_{j})^{T}\nabla_{q}c_{ij}(\xi^{\mathcal{C}}_{j},x_{ij})=0 for all i𝒞i\in\mathcal{C}^{\prime}. Moreover, due to the sublevel set representation of Ωj𝒞\Omega^{\mathcal{C}}_{j}, 𝒞\mathcal{C}^{\prime} can be chosen in a way that UΩj𝒞=UΩj𝒞U\cap\Omega^{\mathcal{C}}_{j}=U\cap\Omega^{\mathcal{C}^{\prime}}_{j} for some neighborhood UU of ξj𝒞\xi^{\mathcal{C}}_{j}. Consequently, by the convexity of the SRS, we can conclude that Φj𝒞(xj𝒞)=Φj𝒞(xj𝒞)\Phi^{\mathcal{C}^{\prime}}_{j}(x^{\mathcal{C}^{\prime}}_{j})=\Phi^{\mathcal{C}}_{j}(x^{\mathcal{C}}_{j}). \blacksquare

-F Proof of Proposition 6

Proof: Optimality: Note that the maximum possible value for the multi-attack objective function Γ\Gamma is equal to the number of active attackers jj for which the coalition-attacker pair (𝒩,j)(\mathcal{N},j) is feasible. In view of Algorithm 3, an index jj is removed from r\mathcal{M}_{r} after the first iteration if and only if (𝒩,j)(\mathcal{N},j) is infeasible or attacker jj is included in the suboptimal solution Θ~1\tilde{\Theta}_{1}. Hence, the emptiness of r\mathcal{M}_{r} after the first iteration indicates that Θ~\tilde{\Theta} achieves the maximum possible value of Γ\Gamma, rendering it an optimal solution to the ILP (34). Efficiency: In each iteration of Algorithm 3, there are two steps that require the computation of the convex program of the form (17): determining the feasibility of the coalition-attacker pair (𝒩r,j)(\mathcal{N}_{r},j), and identifying all irreducible sub-pairs of (Λj𝒩r,j)(\Lambda^{\mathcal{N}_{r}}_{j},j). The former step requires only one calculation, while the latter step, performed using Algorithm 2, demands at most k=1n1(nk)=2n2\sum_{k=1}^{n-1}\binom{n}{k}=2^{n}-2 computations given that the cardinality of the ADS does not exceed nn. Thus, the total number of computations at this iteration for one attacker is at most 2n12^{n}-1. Moreover, the number of iterations is constrained by the number of active attackers MaM_{a}, since at least one attacker is removed from the set r\mathcal{M}_{r} after each iteration. The total number of calculations for all iterations is therefore at most k=1Ma(2n1)k\sum_{k=1}^{M_{a}}(2^{n}-1)k, which is less than 2n1Ma(1+Ma)2^{n-1}M_{a}(1+M_{a}). \blacksquare

References

  • [1] R. Vidal, O. Shakernia, H. J. Kim, D. H. Shim, and S. Sastry, “Probabilistic pursuit-evasion games: theory, implementation, and experimental evaluation,” IEEE transactions on robotics and automation, vol. 18, no. 5, pp. 662–669, 2002.
  • [2] S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin, “Hamilton-jacobi reachability: A brief overview and recent advances,” in 2017 IEEE 56th Annual Conference on Decision and Control (CDC), 2017, pp. 2242–2253.
  • [3] C. Robin and S. Lacroix, “Multi-robot target detection and tracking: taxonomy and survey,” Autonomous Robots, vol. 40, pp. 729–760, 2016.
  • [4] R. Lowe, Y. I. Wu, A. Tamar, J. Harb, O. Pieter Abbeel, and I. Mordatch, “Multi-agent actor-critic for mixed cooperative-competitive environments,” Advances in neural information processing systems, vol. 30, 2017.
  • [5] T. Rashid, M. Samvelyan, C. S. De Witt, G. Farquhar, J. Foerster, and S. Whiteson, “Monotonic value function factorisation for deep multi-agent reinforcement learning,” The Journal of Machine Learning Research, vol. 21, no. 1, pp. 7234–7284, 2020.
  • [6] K. Zhang, Z. Yang, and T. Başar, “Multi-agent reinforcement learning: A selective overview of theories and algorithms,” Handbook of reinforcement learning and control, pp. 321–384, 2021.
  • [7] T. H. Chung, G. A. Hollinger, and V. Isler, “Search and pursuit-evasion in mobile robotics,” Autonomous robots, vol. 31, no. 4, pp. 299–316, 2011.
  • [8] E. Bakolas and P. Tsiotras, “Relay pursuit of a maneuvering target using dynamic voronoi diagrams,” Automatica, vol. 48, no. 9, pp. 2213–2220, 2012.
  • [9] A. Alexopoulos, T. Schmidt, and E. Badreddin, “Cooperative pursue in pursuit-evasion games with unmanned aerial vehicles,” in 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).   IEEE, 2015, pp. 4538–4543.
  • [10] C. De Souza, R. Newbury, A. Cosgun, P. Castillo, B. Vidolov, and D. Kulić, “Decentralized multi-agent pursuit using deep reinforcement learning,” IEEE Robotics and Automation Letters, vol. 6, no. 3, pp. 4552–4559, 2021.
  • [11] R. Zhang, Q. Zong, X. Zhang, L. Dou, and B. Tian, “Game of drones: Multi-uav pursuit-evasion game with online motion planning by deep reinforcement learning,” IEEE Transactions on Neural Networks and Learning Systems, 2022.
  • [12] H. Huang, J. Ding, W. Zhang, and C. J. Tomlin, “A differential game approach to planning in adversarial scenarios: A case study on capture-the-flag,” in 2011 IEEE International Conference on Robotics and Automation.   IEEE, 2011, pp. 1451–1456.
  • [13] E. Garcia, D. W. Casbeer, and M. Pachter, “The capture-the-flag differential game,” in 2018 IEEE conference on decision and control (CDC).   IEEE, 2018, pp. 4167–4172.
  • [14] M. Jaderberg, W. M. Czarnecki, I. Dunning, L. Marris, G. Lever, A. G. Castaneda, C. Beattie, N. C. Rabinowitz, A. S. Morcos, A. Ruderman et al., “Human-level performance in 3d multiplayer games with population-based reinforcement learning,” Science, vol. 364, no. 6443, pp. 859–865, 2019.
  • [15] D. Shishika and V. Kumar, “Local-game decomposition for multiplayer perimeter-defense problem,” in 2018 IEEE conference on decision and control (CDC).   IEEE, 2018, pp. 2093–2100.
  • [16] L. Guerrero-Bonilla and D. V. Dimarogonas, “Perimeter surveillance based on set-invariance,” IEEE Robotics and Automation Letters, vol. 6, no. 1, pp. 9–16, 2020.
  • [17] S. Velhal, S. Sundaram, and N. Sundararajan, “A decentralized multirobot spatiotemporal multitask assignment approach for perimeter defense,” IEEE Transactions on Robotics, 2022.
  • [18] K. Margellos and J. Lygeros, “Hamilton–jacobi formulation for reach–avoid differential games,” IEEE Transactions on automatic control, vol. 56, no. 8, pp. 1849–1861, 2011.
  • [19] J. F. Fisac, M. Chen, C. J. Tomlin, and S. S. Sastry, “Reach-avoid problems with time-varying dynamics, targets and constraints,” in Proceedings of the 18th international conference on hybrid systems: computation and control, 2015, pp. 11–20.
  • [20] I. M. Mitchell, A. M. Bayen, and C. J. Tomlin, “A time-dependent hamilton-jacobi formulation of reachable sets for continuous dynamic games,” IEEE Transactions on automatic control, vol. 50, no. 7, pp. 947–957, 2005.
  • [21] T. Başar and G. J. Olsder, Dynamic noncooperative game theory.   SIAM, 1998.
  • [22] B. Landry, M. Chen, S. Hemley, and M. Pavone, “Reach-avoid problems via sum-or-squares optimization and dynamic programming,” in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).   IEEE, 2018, pp. 4325–4332.
  • [23] Z. Zhou, J. Ding, H. Huang, R. Takei, and C. Tomlin, “Efficient path planning algorithms in reach-avoid problems,” Automatica, vol. 89, pp. 28–36, 2018.
  • [24] E. Garcia, D. W. Casbeer, and M. Pachter, “Optimal strategies for a class of multi-player reach-avoid differential games in 3d space,” IEEE Robotics and Automation Letters, vol. 5, no. 3, pp. 4257–4264, 2020.
  • [25] H. Fu and H. H.-T. Liu, “Justification of the geometric solution of a target defense game with faster defenders and a convex target area using the hji equation,” Automatica, vol. 149, p. 110811, 2023.
  • [26] S. Pan, H. Huang, J. Ding, W. Zhang, C. J. Tomlin et al., “Pursuit, evasion and defense in the plane,” in 2012 American Control Conference (ACC).   IEEE, 2012, pp. 4167–4173.
  • [27] Z. Deng and Z. Kong, “Multi-agent cooperative pursuit-defense strategy against one single attacker,” IEEE Robotics and Automation Letters, vol. 5, no. 4, pp. 5772–5778, 2020.
  • [28] H. Huang, W. Zhang, J. Ding, D. M. Stipanović, and C. J. Tomlin, “Guaranteed decentralized pursuit-evasion in the plane with multiple pursuers,” in 2011 50th IEEE Conference on Decision and Control and European Control Conference.   IEEE, 2011, pp. 4835–4840.
  • [29] A. Pierson, Z. Wang, and M. Schwager, “Intercepting rogue robots: An algorithm for capturing multiple evaders with multiple pursuers,” IEEE Robotics and Automation Letters, vol. 2, no. 2, pp. 530–537, 2016.
  • [30] R. Yan, Z. Shi, and Y. Zhong, “Reach-avoid games with two defenders and one attacker: An analytical approach,” IEEE transactions on cybernetics, vol. 49, no. 3, pp. 1035–1046, 2018.
  • [31] ——, “Task assignment for multiplayer reach–avoid games in convex domains via analytical barriers,” IEEE Transactions on Robotics, vol. 36, no. 1, pp. 107–124, 2019.
  • [32] ——, “Guarding a subspace in high-dimensional space with two defenders and one attacker,” IEEE Transactions on Cybernetics, vol. 52, no. 5, pp. 3998–4011, 2020.
  • [33] R. Yan, X. Duan, Z. Shi, Y. Zhong, and F. Bullo, “Matching-based capture strategies for 3d heterogeneous multiplayer reach-avoid differential games,” Automatica, vol. 140, p. 110207, 2022.
  • [34] R. Isaacs, Differential games: a mathematical theory with applications to warfare and pursuit, control and optimization.   Courier Corporation, 1999.
  • [35] Z. Zhou, W. Zhang, J. Ding, H. Huang, D. M. Stipanović, and C. J. Tomlin, “Cooperative pursuit with voronoi partitions,” Automatica, vol. 72, pp. 64–72, 2016.
  • [36] B. P. Gerkey and M. J. Matarić, “A formal analysis and taxonomy of task allocation in multi-robot systems,” The International journal of robotics research, vol. 23, no. 9, pp. 939–954, 2004.
  • [37] G. A. Korsah, A. Stentz, and M. B. Dias, “A comprehensive taxonomy for multi-robot task allocation,” The International Journal of Robotics Research, vol. 32, no. 12, pp. 1495–1512, 2013.
  • [38] M. Chen, Z. Zhou, and C. J. Tomlin, “Multiplayer reach-avoid games via pairwise outcomes,” IEEE Transactions on Automatic Control, vol. 62, no. 3, pp. 1451–1457, 2017.
  • [39] L. R. Ford and D. R. Fulkerson, “Maximal flow through a network,” Canadian journal of Mathematics, vol. 8, pp. 399–404, 1956.
  • [40] J. E. Hopcroft and R. M. Karp, “An n^5/2 algorithm for maximum matchings in bipartite graphs,” SIAM Journal on computing, vol. 2, no. 4, pp. 225–231, 1973.
  • [41] E. Garcia, D. W. Casbeer, A. Von Moll, and M. Pachter, “Multiple pursuer multiple evader differential games,” IEEE Transactions on Automatic Control, vol. 66, no. 5, pp. 2345–2350, 2020.
  • [42] I. M. Mitchell et al., “A toolbox of level set methods,” UBC Department of Computer Science Technical Report TR-2007-11, p. 31, 2007.
  • [43] F. Blanchini, “Set invariance in control,” Automatica, vol. 35, no. 11, pp. 1747–1767, 1999.
  • [44] J. Nocedal and S. Wright, Numerical optimization.   Springer Science & Business Media, 2006.
  • [45] S. Boyd, S. P. Boyd, and L. Vandenberghe, Convex optimization.   Cambridge university press, 2004.
  • [46] S. Diamond and S. Boyd, “Cvxpy: A python-embedded modeling language for convex optimization,” The Journal of Machine Learning Research, vol. 17, no. 1, pp. 2909–2913, 2016.
  • [47] N. Koenig and A. Howard, “Design and use paradigms for gazebo, an open-source multi-robot simulator,” in 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)(IEEE Cat. No. 04CH37566), vol. 3.   IEEE, 2004, pp. 2149–2154.
  • [48] L. Wang, A. D. Ames, and M. Egerstedt, “Safety barrier certificates for collisions-free multirobot systems,” IEEE Transactions on Robotics, vol. 33, no. 3, pp. 661–674, 2017.
  • [49] S. G. Lee, Y. Diaz-Mercado, and M. Egerstedt, “Multirobot control using time-varying density functions,” IEEE Transactions on robotics, vol. 31, no. 2, pp. 489–493, 2015.
  • [50] H. L. Royden and P. Fitzpatrick, Real analysis.   Macmillan New York, 1988, vol. 32.