
Time-Independent Planning for Multiple Moving Agents

Keisuke Okumura, Yasumasa Tamura, Xavier Défago
Abstract

Typical Multi-agent Path Finding (MAPF) solvers assume that agents move synchronously, thus neglecting the reality gap in timing assumptions, e.g., delays caused by an imperfect execution of asynchronous moves. So far, two policies enforce a robust execution of MAPF plans taken as input: either by forcing agents to synchronize or by executing plans while preserving temporal dependencies. This paper proposes an alternative approach, called time-independent planning, which is both online and distributed. We represent reality as a transition system that changes configurations according to atomic actions of agents, and use it to generate a time-independent schedule. Empirical results in a simulated environment with stochastic delays of agents’ moves support the validity of our proposal.

Introduction

Multi-agent systems with physically moving agents are becoming gradually more common, e.g., automated warehouses (Wurman, D’Andrea, and Mountz 2008), traffic control (Dresner and Stone 2008), or self-driving cars. In such systems, agents must move smoothly without colliding. This is embodied by the problem of Multi-agent Path Finding (MAPF) (Stern 2019). Planning techniques for MAPF have been extensively studied in recent years.

The output of such planning is bound to be executed in real-world situations with agents (robots). Typical MAPF is defined in discrete time. Agents are assumed to do two kinds of atomic actions synchronously: move to a neighboring location or stay at their current location. Perfect executions for the planning are however difficult to ensure since timing assumptions are inherently uncertain in reality, due to the difficulty of: 1) accurately predicting the temporal behavior of many aspects of the system, e.g., kinematics, 2) anticipating external events such as faults and interference, and 3) ensuring a globally consistent and accurate notion of time in the face of clock shift and clock drift. Even worse, the potential of unexpected interference increases with the number of agents, hence the need to prepare for imperfect executions regarding the timing assumptions.

There are two intuitive ways to tackle imperfect executions of MAPF plans taken as input. The first, conservative idea is to forcibly synchronize agents’ moves, globally or locally. Most decentralized approaches to MAPF take this approach implicitly (Wiktor et al. 2014; Kim et al. 2015; Okumura et al. 2019; Wang and Rubenstein 2020). This policy degrades the performance of the entire system when unexpected delays occur, and lacks flexibility (Ma, Kumar, and Koenig 2017). The second policy makes agents preserve the temporal dependencies of the plan (Hönig et al. 2016; Ma, Kumar, and Koenig 2017; Hönig et al. 2019; Atzmon et al. 2020). Two types of temporal dependencies exist: 1) the order of events within a single agent, and 2) the order in which agents visit a given node. This policy is sound but still vulnerable to delays. Consider an extreme example where one agent moves very slowly or crashes: due to the second type of dependency, the nodes that this agent is scheduled to visit remain constrained for use by the other agents. Thus, the asynchrony of a single agent's movements affects the whole system.

We therefore propose an alternative approach, called time-independent planning, that aims at online and distributed execution focusing on agents’ moves. We represent the whole system as a transition system that changes configurations according to atomic actions of agents, namely: 1) request the next location ($\mathit{requesting}$), 2) move ($\mathit{extended}$), and 3) release the past location, or stay ($\mathit{contracted}$). In this time-independent model, no a priori knowledge of the timing of atomic actions is available, representing the non-deterministic behavior of the external environment. The challenge is to design algorithms tolerant of all possible sequences of actions.

(a) priority inheritance (PI)
(b) backtracking and PI again
(c) backtracking
(d) one timestep later
(e) as DFST
Figure 1: Example of PIBT (the panels are omitted in this conversion). Requests for the next timestep are depicted by dashed circles, determined greedily according to agents’ destinations (omitted here). Flows of priority inheritance and backtracking are drawn as single-line and double-line arrows, respectively. First, $a_7$ (blue agent) determines its next desired node, the current location of $a_6$. Priority inheritance then happens from $a_7$ to $a_6$, making $a_7$ wait for backtracking and $a_6$ start planning; $a_6$, $a_5$, and $a_4$ do the same. $a_3$ (red), however, is stuck, so $a_3$ sends invalid backtracking to $a_4$. $a_4$ tries to replan but is also stuck, hence $a_4$ sends invalid backtracking to $a_5$. $a_5$ replans successfully and performs another priority inheritance to $a_1$. Finally, $a_1$, $a_5$, $a_6$, and $a_7$ receive valid backtracking and start moving. Panel (e) shows the virtual depth-first search tree for this example, used to explain Causal-PIBT: $a_7$ is the root, solid arrows point from parent to child, and $a_2$ finds an empty node.

The main contributions of this paper are: 1) the formalization of the time-independent model and Causal-PIBT, a proposed time-independent planning algorithm with guaranteed reachability, i.e., all agents are ensured to reach their destinations within finite time. Causal-PIBT, as a proof of concept, extends a recently developed decoupled approach that solves MAPF iteratively, Priority Inheritance with Backtracking (PIBT) (Okumura et al. 2019). We also present how an offline MAPF plan can enhance Causal-PIBT. 2) Experimental results demonstrating the validity and robustness of the proposal through simulations with stochastic delays of agents’ moves, using MAPF-DP (MAPF with Delay Probabilities) (Ma, Kumar, and Koenig 2017).

The remainder of this paper consists of the following five sections: 1) preliminary including the formalization of MAPF and related work, 2) the time-independent model, 3) examples of time-independent planning, 4) empirical results of the proposals using MAPF-DP, and 5) conclusion and future directions.

Preliminaries

This section first defines MAPF. Then, we explain the MAPF variant emulating asynchrony of movements, called MAPF-DP, which we later use in experiments. We also explain two policies that execute MAPF plans and PIBT, the original form of Causal-PIBT.

MAPF

In an environment represented as a graph $G=(V,E)$, the MAPF problem is defined as follows. Let $\pi_i[t] \in V$ denote the location of agent $a_i$ at discrete time $t \in \mathbb{N}$. Given distinct initial locations $\pi_i[0] \in V$ and destinations $g_i \in V$ for each agent, assign to each agent a path $\pi_i = (\pi_i[0], \pi_i[1], \dots, \pi_i[T])$ such that $\pi_i[T] = g_i$, minimizing some objective function (see below).

At each timestep $t$, $a_i$ can move to an adjacent node, or stay at its current location, i.e., $\pi_i[t+1] \in \mathit{Neigh}(\pi_i[t]) \cup \{\pi_i[t]\}$, where $\mathit{Neigh}(v)$ is the set of nodes adjacent to $v \in V$. Agents must avoid two types of conflicts (Stern et al. 2019): 1) vertex conflict: $\pi_i[t] \neq \pi_j[t]$, and 2) swap conflict: $\pi_i[t] \neq \pi_j[t+1] \lor \pi_i[t+1] \neq \pi_j[t]$.

Two kinds of objective functions are commonly used to evaluate MAPF solutions: 1) sum of costs (SOC), where the cost of agent $a_i$ is the earliest timestep $T_i \leq T$ such that $\pi_i[T_i] = g_i, \dots, \pi_i[T] = g_i$; and 2) makespan, i.e., $T$. This paper focuses on the sum of costs (SOC).
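As a concrete reading of the SOC definition, the following sketch computes it from a list of paths; the function name and the representation of paths as lists of nodes indexed by timestep are ours, not from the paper.

```python
def sum_of_costs(paths, goals):
    """SOC: for each agent, the earliest timestep T_i such that the
    agent is at its goal from T_i through the end of the plan."""
    total = 0
    for path, goal in zip(paths, goals):
        assert path[-1] == goal, "a valid plan ends at the goal"
        t = len(path) - 1
        # walk back through the contiguous suffix spent at the goal
        while t > 0 and path[t - 1] == goal:
            t -= 1
        total += t
    return total
```

The makespan is simply `len(path) - 1` for any of the (equal-length) paths.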

MAPF-DP (with Delay Probabilities)

MAPF-DP (Ma, Kumar, and Koenig 2017) emulates imperfect executions of MAPF plans by introducing the possibility of unsuccessful moves. Time is still discrete. At each timestep, an agent $a_i$ can either stay in place or move to an adjacent node, with a probability $p_i$ that the move is unsuccessful. The definition of conflicts is more restrictive than in normal MAPF: 1) the vertex conflict is as defined in MAPF, and 2) the following conflict: $\pi_i[t+1] \neq \pi_j[t]$. The rationale is that, without the latter restriction, two agents might end up in the same node because one of them failed to move. Note that the following conflict subsumes the swap conflict.
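The stochastic delay model can be sketched as a one-timestep simulation; all names and the occupancy bookkeeping are our assumptions, and we assume the planned next nodes are pairwise distinct.

```python
import random

def mapf_dp_step(locations, next_nodes, delay_probs, rng=random):
    """One MAPF-DP timestep: agent i tries to advance to next_nodes[i]
    but stays put with probability delay_probs[i]. Moving only into
    nodes that were free *before* the step avoids vertex and following
    conflicts (sketch; assumes next_nodes are pairwise distinct)."""
    occupied_before = set(locations)
    new_locations = list(locations)
    for i, (cur, nxt) in enumerate(zip(locations, next_nodes)):
        if nxt == cur:
            continue  # planned wait
        if nxt in occupied_before:
            continue  # would be a vertex or following conflict
        if rng.random() < delay_probs[i]:
            continue  # stochastic delay: the move fails this step
        new_locations[i] = nxt
    return new_locations
```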

Execution Policies

Ma et al. (Ma, Kumar, and Koenig 2017) studied two robust execution policies for MAPF plans in the MAPF-DP setting. The first, called Fully Synchronized Policies (FSPs), synchronizes the movements of agents globally, i.e., $a_i$ waits to move to $\pi_i[t+1]$ ($\neq \pi_i[t]$) until all move actions of $\pi_j[t']$, $t' \leq t$, are completed. The second, called Minimal Communication Policies (MCPs), executes a plan while maintaining its temporal dependencies. There are two kinds of dependencies: 1) internal events, i.e., the action corresponding to $\pi_i[t]$ is executed prior to that of $\pi_i[t+1]$; and 2) node-related events, i.e., if $\pi_i[t] = \pi_j[t']$ and $t < t'$, the event of $\pi_i[t]$ is executed prior to that of $\pi_j[t']$. As long as the MAPF plan is valid, both policies make agents reach their destinations without conflicts, despite the delay probabilities.
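The two kinds of MCP dependencies can be extracted mechanically from a synchronous plan. A sketch under our own event encoding (an event is the pair (agent, timestep), i.e., the arrival at $\pi_i[t]$):

```python
def mcp_dependencies(paths):
    """Extract MCP temporal dependencies from a synchronous plan.
    Kind 1: (i, t) precedes (i, t+1)        -- internal order.
    Kind 2: if agents i and j occupy the same node at times t < t',
            (i, t) precedes (j, t')         -- node-related order."""
    deps = []
    # kind 1: internal events of each agent
    for i, path in enumerate(paths):
        for t in range(len(path) - 1):
            deps.append(((i, t), (i, t + 1)))
    # kind 2: visits to a shared node, ordered by planned time
    visits = {}  # node -> list of (t, agent)
    for i, path in enumerate(paths):
        for t, v in enumerate(path):
            visits.setdefault(v, []).append((t, i))
    for node, evs in visits.items():
        evs.sort()
        for (t1, i), (t2, j) in zip(evs, evs[1:]):
            if t1 < t2:
                deps.append(((i, t1), (j, t2)))
    return deps
```

During execution, an event may fire only after all of its predecessors have fired, whatever the actual delays.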

(figure omitted in this conversion; agents $a_1$ and $a_2$ on nodes $v_1, \dots, v_5$)

state  σ1  σ2  σ1  σ2  σ1  σ2  σ2  σ1  σ1  σ1  σ1  σ2  σ2  σ2  σ2  σ2
mode   c   c   r   r   e   c   r   c   r   e   c   e   c   r   e   c
head   ⊥   ⊥   v3  v3  v3  ⊥   v3  ⊥   v5  v5  ⊥   v3  ⊥   v4  v4  ⊥
tail   v1  v2  v1  v2  v1  v2  v2  v3  v3  v3  v5  v2  v3  v3  v3  v4

c: $\mathit{contracted}$, r: $\mathit{requesting}$, e: $\mathit{extended}$
Figure 2: Example of an execution of the model. $a_1$ goes from $v_1$ to $v_5$ and $a_2$ goes from $v_2$ to $v_4$; both pass through $v_3$. In the table, time progresses from left to right, and each column shows the state of the activated agent after its activation. There is one interaction: when both agents request $v_3$, $a_1$ wins and becomes $\mathit{extended}$ while $a_2$ goes back to $\mathit{contracted}$.

Priority Inheritance with Backtracking (PIBT)

PIBT (Okumura et al. 2019) repeats one-timestep planning until it terminates, aiming at solving MAPF iteratively (without delays). Unlike complete MAPF solvers, PIBT does not ensure that all agents are at their goals simultaneously; rather, it ensures reachability: all agents are ensured to reach their destinations eventually. Thus, an agent that reaches its goal may move away from it while other agents are still moving. Reachability plays a crucial role in situations where destinations are given continuously, such as lifelong MAPF (Ma et al. 2017), because all tasks (i.e., destinations) assigned to agents are ensured to be completed.

In PIBT, at each timestep, agents are given unique priorities and sequentially determine their next locations in decreasing order of priority, avoiding the nodes already requested by higher-priority agents. Priority inheritance, originally considered in resource scheduling problems (Sha, Rajkumar, and Lehoczky 1990), is introduced: when a low-priority agent $X$ obstructs the movement of a higher-priority agent $Y$, $X$ temporarily inherits the higher priority of $Y$. Priority inheritance can be applied iteratively. An agent giving its priority ($Y$) must wait for backtracking from the agent inheriting it ($X$). Backtracking has two outcomes: valid or invalid. A valid outcome occurs when $Y$ can successfully move to the location of $X$ in the next timestep. An invalid outcome occurs when $X$ is stuck, i.e., it can neither go anywhere nor stay at its current location without colliding, forcing $Y$ to replan its path. Fig. 1 shows an example of PIBT within one timestep.

With these protocols, the agent with the highest priority is ensured to move to an arbitrary neighbor node, provided the graph satisfies an adequate property, e.g., biconnectivity. Such an agent can therefore move toward its goal along a shortest path. To ensure reachability, PIBT uses dynamic priorities: the priority of an agent increases gradually at each timestep and drops upon reaching its goal, so that agents that have not yet reached their goals eventually obtain the highest priority.

Time-Independent Model

The time-independent model and the terminology used in this paper are partly inspired by the model for distributed algorithms with synchronous message passing (Tel 2000) and the Amoebot model (Derakhshandeh et al. 2014), an abstract computational model for programmable matter. Our model is however different in that agents are physically embodied and move on a graph, have destinations, and know their locations.

Components

The system consists of a set of agents $A = \{a_1, \dots, a_n\}$ and a graph $G = (V, E)$. We assume that each agent knows $G$ in order to plan its own path.

Configuration and State

The whole system is represented as a transition system driven by the atomic actions of agents. Each agent $a_i$ is itself a transition system whose state, denoted $\sigma_i$, consists of its internal variables, its current location, and its destination. A configuration $\gamma = (\sigma_1, \dots, \sigma_n)$ of the whole system at a given time consists of the states of all agents at that time. A change in the state of some agent causes a change of configuration of the system, e.g., $\gamma = (\dots, \sigma_{i-1}, \sigma_i, \sigma_{i+1}, \dots) \rightarrow \gamma' = (\dots, \sigma_{i-1}, \sigma_i', \sigma_{i+1}, \dots)$.

Mode

Each agent always occupies at least one node, and occupies two nodes while moving between nodes. We use two terms to represent an agent's location: $\mathit{tail}_i \in V$ and $\mathit{head}_i \in V \cup \{\bot\}$, where $\bot$ is void. They are associated with a mode $\mathit{mode}_i$, which is one of $\mathit{contracted}$, $\mathit{requesting}$, or $\mathit{extended}$.

  • $\mathit{contracted}$: $a_i$ stays at the single node $\mathit{tail}_i$, and $\mathit{head}_i = \bot$; the location of $a_i$ is $\mathit{tail}_i$.

  • $\mathit{requesting}$: $a_i$ attempts to move to $\mathit{head}_i \neq \bot$ while located at $\mathit{tail}_i$; the location of $a_i$ is $\mathit{tail}_i$.

  • $\mathit{extended}$: $a_i$ is moving from $\mathit{tail}_i$ to $\mathit{head}_i$; the locations of $a_i$ are both $\mathit{tail}_i$ and $\mathit{head}_i$.

Initially, all agents are $\mathit{contracted}$ and have distinct initial locations. $\mathit{head}_i$ (when $\neq \bot$) is always adjacent to $\mathit{tail}_i$.

Transition

Agents move by changing modes. These transitions are accompanied by changing 𝑡𝑎𝑖𝑙\mathit{tail} and ℎ𝑒𝑎𝑑\mathit{head} as described below.

  • from $\mathit{contracted}$, $a_i$ can become $\mathit{requesting}$ by setting $\mathit{head}_i$ to some $u \in \mathit{Neigh}(\mathit{tail}_i)$, in order to move to a neighbor node.

  • from $\mathit{requesting}$, $a_i$ can revert to $\mathit{contracted}$ by setting $\mathit{head}_i$ back to $\bot$.

  • from $\mathit{requesting}$, $a_i$ can become $\mathit{extended}$ when $\lnot\mathit{occupied}(\mathit{head}_i)$, where $\mathit{occupied}(v)$ holds when some agent $a_j$ has $\mathit{tail}_j = v$, or some agent $a_j$ in $\mathit{extended}$ has $\mathit{head}_j = v$.

  • from $\mathit{extended}$, $a_i$ can become $\mathit{contracted}$, meaning that the movement has finished. This comes with $\mathit{tail}_i \leftarrow \mathit{head}_i$, then $\mathit{head}_i \leftarrow \bot$.

Transitions are atomic; e.g., for an agent in $\mathit{contracted}$ to become $\mathit{extended}$ via $\mathit{requesting}$, at least two actions are required (see the example later). Other mode transitions are disallowed; e.g., an agent in $\mathit{contracted}$ cannot become $\mathit{extended}$ directly.
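The transition rules above can be sketched as a small state machine; the class and function names are ours, and `occupied` follows the definition given in the transition rules.

```python
BOT = None  # stands for the void value ⊥

class Agent:
    def __init__(self, v):
        self.mode, self.tail, self.head = "contracted", v, BOT

def occupied(v, agents):
    """v is occupied if it is some agent's tail, or the head of an
    agent already extended (i.e., physically moving into v)."""
    return any(a.tail == v or (a.mode == "extended" and a.head == v)
               for a in agents)

def request(agent, u):
    """contracted -> requesting (u must be a neighbor of the tail)."""
    assert agent.mode == "contracted"
    agent.head, agent.mode = u, "requesting"

def try_extend(agent, agents):
    """requesting -> extended, allowed only when the head is free."""
    if agent.mode == "requesting" and not occupied(agent.head, agents):
        agent.mode = "extended"
        return True
    return False

def finish_move(agent):
    """extended -> contracted: tail <- head, then head <- ⊥."""
    assert agent.mode == "extended"
    agent.tail, agent.head = agent.head, BOT
    agent.mode = "contracted"
```

For example, an agent requesting the tail of another agent stays blocked until that agent has both extended away and contracted at its new node.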

Conflict-freedom and Deadlock

Under the above transition rules, the model implicitly prohibits the vertex and following conflicts of MAPF. The model is, however, prone to deadlocks: a set of agents $\{a_k, \ldots, a_l\}$ is in a deadlock when all of them are $\mathit{requesting}$ and form a cycle $\mathit{head}_k = \mathit{tail}_{k+1}, \cdots, \mathit{head}_l = \mathit{tail}_k$.
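Such a deadlock can be detected by following head-to-tail links through requesting agents until a cycle is found; a sketch with our own bookkeeping (`tail_to_agent` maps each node to the agent whose tail occupies it):

```python
class Req:
    """Minimal stand-in for an agent's (tail, head, mode) state."""
    def __init__(self, tail, head, mode="requesting"):
        self.tail, self.head, self.mode = tail, head, mode

def in_deadlock(agent, tail_to_agent):
    """Follow head -> tail links through requesting agents; reaching an
    already-seen agent means a cycle of requests, i.e., a deadlock."""
    seen = set()
    a = agent
    while a is not None and a.mode == "requesting":
        if id(a) in seen:
            return True
        seen.add(id(a))
        a = tail_to_agent.get(a.head)  # who occupies the requested node?
    return False
```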

Activation

Agents perform a single atomic action as a result of being activated. The nature and outcome of the action depend on the state of the agent (e.g., mode transition, local variable update). Activations occur non-deterministically, and there is no synchronization between agents. For simplicity, we assume that at most one agent is activated at any time; in other words, the simultaneous activation of two agents $a_i$ and $a_j$ results in a sequence $(\dots, \sigma_i, \dots, \sigma_j, \dots) \rightarrow (\dots, \sigma_i', \dots, \sigma_j, \dots) \rightarrow (\dots, \sigma_i', \dots, \sigma_j', \dots)$. Activations are assumed to be fair in the sense that, in any sufficiently long period, every agent is activated at least once.

Interaction

An activation may affect not only the variables of the activated agent but also those of nearby agents. For instance, if two $\mathit{requesting}$ agents have the same $\mathit{head}$, one wins and becomes $\mathit{extended}$ whereas the other loses and becomes $\mathit{contracted}$, atomically. This type of activation is called an interaction: an activation in which the activated agent changes variables by referring to the variables of other agents. We say that the agents involved in the interaction, other than the activated agent $a_i$ itself, are interacted agents. Given an activated agent $a_i$ and an interacted agent $a_j$, the system transitions as $(\dots, \sigma_i, \dots, \sigma_j, \dots) \rightarrow (\dots, \sigma_i', \dots, \sigma_j', \dots)$, with the states of all other agents unchanged. Except for interactions, the configuration is changed by the state change of a single agent. We assume that interactions are implemented by communication between agents, but the detailed implementation is not relevant to this paper.

Termination

Assuming that each agent $a_i$ has its own destination $g_i \in V$, termination can be defined in two different ways.

  • Strong termination occurs when reaching a configuration such that $\mathit{mode}_i = \mathit{contracted} \land \mathit{tail}_i = g_i$ for every agent $a_i$.

  • Weak termination occurs when every agent has been, at least once, in a state where $\mathit{mode}_i = \mathit{contracted} \land \mathit{tail}_i = g_i$.

Strong termination corresponds to usual MAPF termination, whereas weak termination corresponds to the reachability property of PIBT. Strong termination implies weak termination.
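Both termination conditions are simple predicates over configurations; a minimal sketch (names ours, with weak termination tracked by flags that the execution loop is assumed to maintain):

```python
class State:
    """Minimal stand-in for an agent's (mode, tail) state."""
    def __init__(self, mode, tail):
        self.mode, self.tail = mode, tail

def strong_termination(agents, goals):
    """Every agent is contracted exactly on its goal (MAPF termination)."""
    return all(a.mode == "contracted" and a.tail == g
               for a, g in zip(agents, goals))

def weak_termination(visited_goal_flags):
    """Reachability: each agent has, at least once, been contracted on
    its goal; flags are assumed to be set by the execution loop."""
    return all(visited_goal_flags)
```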

In this paper, we refer to weak termination as reachability. Note that either deadlock-freedom or deadlock-recovery is required to ensure reachability.

Remarks

Figure 2 illustrates an execution in the time-independent model. Although the example focuses on (classical) MAPF, many other problem variants, e.g., iterative MAPF (Okumura et al. 2019), can be addressed simply by adapting termination and goal assignments.

Algorithm

This section presents two examples of time-independent planning: GREEDY and Causal-PIBT. We also present how to enhance Causal-PIBT with offline MAPF plans.

Greedy Approach

GREEDY performs only basic actions and can serve as a template for other time-independent planning algorithms. Its implementation for $a_i$ is as follows.

  • when $\mathit{contracted}$: choose the node in $\mathit{Neigh}(\mathit{tail}_i)$ nearest to $g_i$ as the new $\mathit{head}_i$, then become $\mathit{requesting}$.

  • when $\mathit{requesting}$: become $\mathit{extended}$ when $\mathit{head}_i$ is unoccupied; otherwise, do nothing.

  • when $\mathit{extended}$: become $\mathit{contracted}$.
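The three rules above can be sketched as a single activation function, with `neigh` and `dist` supplied by the caller (all names are ours). The second usage in the test reproduces the swap situation in which two adjacent agents block each other forever.

```python
class GreedyAgent:
    """Per-agent state for the GREEDY activation rule (sketch)."""
    def __init__(self, v, goal):
        self.tail, self.head, self.mode, self.goal = v, None, "contracted", goal

def activate(agent, agents, neigh, dist):
    """One GREEDY activation: contracted -> requesting (nearest neighbor
    to the goal), requesting -> extended (if the head is unoccupied),
    extended -> contracted."""
    if agent.mode == "contracted":
        agent.head = min(neigh(agent.tail),
                         key=lambda u: dist(u, agent.goal))
        agent.mode = "requesting"
    elif agent.mode == "requesting":
        blocked = any(a is not agent and
                      (a.tail == agent.head or
                       (a.mode == "extended" and a.head == agent.head))
                      for a in agents)
        if not blocked:
            agent.mode = "extended"
    else:  # extended: the move completes
        agent.tail, agent.head = agent.head, None
        agent.mode = "contracted"
```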

Obviously, GREEDY is prone to deadlocks: e.g., when two adjacent agents try to swap their locations, they block each other forever. Time-independent planning without deadlock-freedom or deadlock-recovery properties is impractical, motivating the next algorithm.

Causal-PIBT

The Causal-PIBT algorithm extends both PIBT and GREEDY. Whereas PIBT is timing-based, relying on synchronous moves, Causal-PIBT is event-based, relying on causal dependencies between agents’ actions.

Concept

Two intuitions from PIBT guide the design of a time-independent algorithm that ensures reachability: 1) build a depth-first search tree rooted at the agent with the highest priority, using the mechanism of priority inheritance and backtracking; 2) drop the priorities of agents that arrive at their goals so that agents not having reached their goals yet obtain higher priorities, with the result that all agents eventually reach their goals. We expand on the first part below.

The path adjustment of PIBT within one timestep can be seen as the construction of a (virtual) depth-first search tree consisting of agents and their dependencies. The tree is rooted at the first agent starting priority inheritance, i.e., the agent with locally highest priority, e.g., $a_7$ in Fig. 1. When an agent $a_j$ inherits a priority from another agent $a_i$, $a_j$ becomes a child of $a_i$ and $a_i$ becomes its parent; this virtual tree is shown in Fig. 1(e). Once an empty node adjacent to the tree is found, all agents on the path from the root to that empty node can move one step forward ($a_2, a_1, a_5, a_6$, and $a_7$). This enables the agent with the highest priority, which is always a root, to move from its current node to any arbitrary neighbor node. The invalid outcome of backtracking works like backtracking in a depth-first search, while the valid outcome signals the termination of the search.

Algorithm 1 Causal-PIBT
$\mathit{parent}_i \in A$: initially $a_i$; $\mathit{children}_i \subset A$: initially $\emptyset$
$\mathit{pori}_i$: original priority; $\mathit{ptmp}_i$: temporal priority, initially $\mathit{pori}_i$
$C_i \subseteq V$: candidate nodes, initially $\mathit{Neigh}(\mathit{tail}_i) \cup \{\mathit{tail}_i\}$
$S_i \subseteq V$: searched nodes, initially $\emptyset$
1: when $\mathit{mode}_i = \mathit{contracted}$
2:     if $C_i = \emptyset \land \mathit{parent}_i = a_i$ then
3:         ReleaseChildren, Reset
4:     end if
5:     PriorityInheritance
6:     if $C_i = \emptyset$ then
7:         let $a_j$ be $\mathit{parent}_i$
8:         if $\mathit{head}_j = \mathit{tail}_i$ then
9:             $S_j \leftarrow S_j \cup S_i$, $C_j \leftarrow C_j \setminus S_j$ ▷ with $a_j$
10:            $\mathit{head}_j \leftarrow \bot$, $\mathit{mode}_j \leftarrow \mathit{contracted}$
11:        end if
12:        return
13:    end if
14:    $u \leftarrow$ the nearest node to $g_i$ in $C_i$
15:    if $u = \mathit{tail}_i$ then
16:        ReleaseChildren, Reset
17:        return
18:    end if
19:    $C_i \leftarrow C_i \setminus \{u\}$, $S_i \leftarrow S_i \cup \{u, \mathit{tail}_i\}$
20:    $\mathit{head}_i \leftarrow u$, $\mathit{mode}_i \leftarrow \mathit{requesting}$
21: end when
22: when $\mathit{mode}_i = \mathit{requesting}$
23:    PriorityInheritance
24:    if $\mathit{parent}_i \neq a_i \land \mathit{head}_i \in S_{\mathit{parent}_i}$ then ▷ $\mathit{parent}_i$
25:        $\mathit{head}_i \leftarrow \bot$, $\mathit{mode}_i \leftarrow \mathit{contracted}$
26:        return
27:    end if
28:    if $\mathit{occupied}(\mathit{head}_i)$ then return
29:    $A' \leftarrow \{a_j \mid a_j \in A, \mathit{mode}_j = \mathit{requesting}, \mathit{head}_j = \mathit{head}_i\}$
30:    $a^\star \leftarrow \arg\max_{a_j \in A'} \mathit{ptmp}_j$
31:    for $a_j \in A' \setminus \{a^\star\}$ do ▷ agents in $A'$
32:        $\mathit{head}_j \leftarrow \bot$, $\mathit{mode}_j \leftarrow \mathit{contracted}$
33:    end for
34:    if $a^\star \neq a_i$ then return
35:    $\mathit{children}_{\mathit{parent}_i} \leftarrow \mathit{children}_{\mathit{parent}_i} \setminus \{a_i\}$ ▷ $\mathit{parent}_i$
36:    $\mathit{parent}_i \leftarrow a_i$
37:    ReleaseChildren
38:    $\mathit{mode}_i \leftarrow \mathit{extended}$
39: end when
40: when $\mathit{mode}_i = \mathit{extended}$
41:    $\mathit{tail}_i \leftarrow \mathit{head}_i$, $\mathit{head}_i \leftarrow \bot$, $\mathit{mode}_i \leftarrow \mathit{contracted}$
42:    update $\mathit{pori}_i$, Reset
43: end when

Description

Causal-PIBT essentially performs GREEDY, i.e., at each activation an agent $a_i$ tries to move towards its goal; in addition, Causal-PIBT uses priority inheritance. When $a_i$ is $\mathit{contracted}$ or $\mathit{requesting}$, $a_i$ potentially inherits the priority of another $\mathit{requesting}$ agent $a_j$ with $\mathit{head}_j = \mathit{tail}_i$. We now explain the details. The pseudocode is split into Algorithms 1 and 2. Interactions and the corresponding interacted agents are explicitly marked in comments. Algorithm 1 lists, for each mode, the procedure executed upon activation.

Variables: We first introduce the local variables of $a_i$.

  • $\mathit{parent}_i$ and $\mathit{children}_i$: maintain the tree structure. $a_i$ is a root when $\mathit{parent}_i = a_i$. The algorithm updates these variables so that $a_i = \mathit{parent}_j \Leftrightarrow a_j \in \mathit{children}_i$.

  • $C_i$ and $S_i$: used for searching unoccupied neighbor nodes. $C_i$ is the set of candidate nodes for the next location; $a_i$ in $\mathit{contracted}$ selects $\mathit{head}_i$ from $C_i$. $S_i$ is the set of nodes already searched by the tree to which $a_i$ belongs; it is propagated through the tree, and $C_i$ is kept disjoint from $S_i$.

  • $\mathit{pori}_i$ and $\mathit{ptmp}_i$: priorities, drawn from a totally ordered set. $\mathit{ptmp}_i$ is normally equal to $\mathit{pori}_i$ but is raised by priority inheritance; $\mathit{ptmp}_i \geq \mathit{pori}_i$ at any time, and only $\mathit{ptmp}_i$ is used in interactions.

Structure: The procedures in $\mathit{contracted}$ consist of:

  • Relaying release [Lines 2–4]. A stuck agent like $a_4$ in Fig. 1 is “released” when its parent ($a_5$) moves. If it has children ($a_3$), it relays the release to them, then initializes itself.

  • Priority inheritance [Line 5]. When $a_i$ is activated and there exists an agent $a_j$ with higher priority such that $\mathit{head}_j = \mathit{tail}_i$, priority inheritance happens from $a_j$ to $a_i$, e.g., from $a_7$ to $a_6$ in Fig. 1.

  • Backtracking [Lines 6–13]. When $a_i$ detects that it is stuck ($C_i = \emptyset$), it makes its parent $a_j$ $\mathit{contracted}$ while propagating the already searched nodes $S_i$, then stops. This procedure corresponds to the invalid case of backtracking in PIBT, e.g., from $a_3$ to $a_4$ in Fig. 1.

  • Prioritized planning [Lines 14–20]. Pick one node $u$ from $C_i$ as the next location, update the searched nodes, and become $\mathit{requesting}$; initialize when $u$ is $\mathit{tail}_i$.

The procedures in $\mathit{requesting}$ consist of:

  • Priority inheritance [Line 23]. The condition is the same as for priority inheritance in $\mathit{contracted}$. This implicitly contributes to detecting deadlocks.

  • Deadlock detection and resolution [Lines 24–27]. This part detects a cycle of requests, then goes back to $\mathit{contracted}$.

  • Winner determination [Lines 28–33]. When $\mathit{head}_i$ is unoccupied, there is a chance to move there. First, identify the agents requesting the same node, then select the one with highest priority as the winner; the losers go back to $\mathit{contracted}$. See “interaction” in Fig. 2.

  • Preparation for moves [Lines 34–38]. If $a_i$ itself is the winner, after isolating itself from the tree it belonged to, $a_i$ becomes $\mathit{extended}$.

The procedures in $\mathit{extended}$ update the priority and go back to $\mathit{contracted}$.

Subprocedures: We use three subprocedures, as shown in Algo. 2. PriorityInheritance first determines whether priority inheritance should occur [Lines 2–5], then updates the tree structure and inherits both the priority and the searched nodes from the new parent [Lines 6–10]. ReleaseChildren cuts off the relationship with the children of $a_i$. Reset initializes the agent's status.

Overview: Causal-PIBT constructs depth-first search trees, each rooted at an agent with locally highest priority, using $\mathit{parent}$ and $\mathit{children}$; these are updated through the priority-inheritance and backtracking mechanisms. All agents in the same tree have the same priority $\mathit{ptmp}$, equal to the priority $\mathit{pori}$ of the root agent. When a tree with higher priority comes in contact with a lower-priority tree (by some agent becoming $\mathit{requesting}$), the latter tree is decomposed and partly merged into the former (by PriorityInheritance). In the reverse case, the tree with lower priority simply waits for the release of the area. Backtracking occurs when a child has no candidate node to move to due to requests from agents with higher priorities [Lines 8–11 in Algo. 1].

Dynamic Priorities: We assume that $\mathit{pori}_i$ is unique among agents in any configuration and is updated upon reaching a goal to become lower than the priorities of all agents that have not yet reached their goals [Line 42 in Algo. 1]. This is realized by a prioritization scheme similar to that of PIBT.
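One concrete encoding of such dynamic priorities (our own, a sketch, not the paper's definition): the primary component counts steps since the agent last reached its goal, reset to zero on arrival, and a unique fractional tie-breaker keeps priorities distinct.

```python
def next_priority(elapsed, agent_id, n_agents, reached_goal):
    """PIBT-style dynamic priority: the integer part counts steps since
    the agent last reached its goal (reset on arrival), so agents that
    have not reached their goals eventually dominate; agent_id/n_agents
    (always < 1) breaks ties, keeping priorities unique."""
    elapsed = 0 if reached_goal else elapsed + 1
    return elapsed + agent_id / n_agents
```

Under fair activations, an agent that never reaches its goal accumulates the largest integer part and therefore eventually holds the highest priority.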

Properties

Next, we show two useful properties of Causal-PIBT: deadlock-recovery and reachability. We also note that each action of Causal-PIBT is computationally cheap enough to be treated as atomic.

Theorem 1 (deadlock-recovery).

Causal-PIBT ensures no deadlock situation can last forever.

Proof.

Assume that there is a deadlock. When an agent a_i changes from contracted to requesting, it updates S_i such that S_i includes head_i and tail_i [Line 19 in Algo. 1]. After finitely many activations, all agents involved in the deadlock must have the same priority ptmp due to priority inheritance. Every priority inheritance is accompanied by the propagation of S_i [Line 10 in Algo. 2]. When an agent in requesting detects its head in S_{parent_i}, it changes back to contracted [Lines 24–27 in Algo. 1]. These facts imply the statement. ∎

Once an agent a_i in a deadlock returns to contracted, a_i cannot request the same node as before due to the update of C_i [Line 19 in Algo. 1], preventing a recurrence of the deadlock situation, i.e., a livelock if repeated indefinitely.

Theorem 2 (reachability).

Causal-PIBT ensures reachability in biconnected graphs if |A| < |V|.

Proof sketch.

Let a_i be the agent with the highest priority pori. If a_i is contracted, this agent does not inherit any priority. Thus, a_i can become requesting such that head_i is any neighbor node of tail_i. If a_i is requesting, a_i eventually moves to head_i, owing to the construction of the depth-first search tree in a biconnected graph. These facts imply that a_i can move to an arbitrary neighbor node within finitely many activations. Hence, a_i moves along a shortest path to its goal. Due to the prioritization scheme, an agent that has not reached its goal eventually inherits the highest priority pori, and then starts moving along its shortest path to its goal. This ensures reachability. The detailed proof is found in the long version (Okumura, Tamura, and Défago 2020). ∎

Algorithm 2 Procedures of Causal-PIBT
1: procedure PriorityInheritance
2:     A″ ← { a_j | a_j ∈ A, mode_j = requesting, tail_i = head_j }
3:     if A″ = ∅ then return
4:     a_k ← argmax_{a_j ∈ A″} ptmp_j
5:     if ptmp_k ≤ ptmp_i then return
6:     ReleaseChildren
7:     children_{parent_i} ← children_{parent_i} \ {a_i}    ▷ parent_i
8:     parent_i ← a_k; children_k ← children_k ∪ {a_i}    ▷ a_k
9:     ptmp_i ← ptmp_k
10:     S_i ← S_k ∪ {head_i}; C_i ← Neigh(tail_i) ∪ {tail_i} \ S_i
11: end procedure
12: procedure ReleaseChildren
13:     for a_j ∈ children_i do parent_j ← a_j    ▷ a_j
14:     children_i ← ∅
15: end procedure
16: procedure Reset
17:     S_i ← ∅; C_i ← Neigh(tail_i) ∪ {tail_i}
18:     ptmp_i ← pori_i
19: end procedure
Proposition 1 (time complexity).

Assume that the maximum time required for one operation updating S_i or C_i (i.e., union, set minus) is α, and that required for one operation of Line 14 in Algo. 1 is β. Let Δ(G) denote the maximum degree of G. For each activation, the time complexity of Causal-PIBT is O(Δ(G) + α + β) in contracted, O(Δ(G) + α) in requesting, and O(1) in extended.

Proof sketch.

|children_i|, |A′| used in requesting, and |A″| used in PriorityInheritance are all at most Δ(G). Hence, the complexity of PriorityInheritance is O(Δ(G) + α), that of ReleaseChildren is O(Δ(G)), and that of Reset is O(1). The statements follow. ∎

Figure 3: The results in small benchmarks, shown as boxplots. The numbers of successful instances for GREEDY are 0, 5, 8, 11, 17, 26, 30, 36, 40, 51 for each p̄ (from 0 to 0.9) in the upper plot. GREEDY failed in all cases in the lower plot.

MAPF Plans as Hints

Although GREEDY and Causal-PIBT are designed for online situations, they can optionally use offline MAPF plans as hints while remaining online and distributed during execution. This can help relax congestion, because time-independent planning is shortsighted, i.e., it plans paths anticipating only a single step ahead. Finding optimal MAPF plans is NP-hard (Yu and LaValle 2013; Ma et al. 2016); however, powerful solvers have been developed (Stern 2019), and time-independent planning can use them. We describe how Causal-PIBT is enhanced by MAPF plans; note that GREEDY can be extended similarly. The intuition is to make agents follow the original plans whenever possible.

Given a MAPF plan π, assume that a_i knows π_i a priori. Before execution, a_i builds a new path π̃_i by removing from π_i every node π_i[t] such that π_i[t] = π_i[t−1], while keeping the order of nodes, since the action "stay" is meaningless in the time-independent model. During execution, a_i manages its own discrete time t_i internally (initially t_i = 0).
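This preprocessing step is a one-pass compression, sketched below (function name is ours):

```python
# Build the stay-free path pi_tilde: drop any node equal to its
# predecessor, preserving the order of the remaining nodes.

def remove_stays(path):
    out = [path[0]]
    for v in path[1:]:
        if v != out[-1]:          # skip "stay" actions
            out.append(v)
    return out
```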

The node selection phase [Line 14 in Algo. 1] is modified as follows. If t_i ≥ |π̃_i| − 1, i.e., a_i has finished π̃_i, then a_i follows the original policy and chooses a next node greedily from C_i. Otherwise, if tail_i = π̃_i[t_i] and π̃_i[t_i+1] ∈ C_i, i.e., a_i can follow π̃_i, then a_i selects π̃_i[t_i+1]. Or else, a_i selects the nearest node on the remainder of π̃_i, formally, argmin_{v ∈ C_i} { min_{u ∈ π̃_i[t_i+1:]} cost(v, u) }, where π̃_i[t:] = (π̃_i[t], π̃_i[t+1], …).
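The three cases of this selection rule can be sketched as one function, assuming a cost(v, u) giving shortest-path distance on G; names and the greedy tie-breaking are illustrative.

```python
# Sketch of the modified node-selection phase for Causal-PIBT+.
#   tail: current node; goal: agent's goal; t: internal time index t_i;
#   plan: stay-free path pi_tilde; candidates: C_i; cost: distance on G.

def select_node(tail, goal, t, plan, candidates, cost):
    if t >= len(plan) - 1:               # plan finished: original policy,
        return min(candidates, key=lambda v: cost(v, goal))  # greedy to goal
    if tail == plan[t] and plan[t + 1] in candidates:
        return plan[t + 1]               # follow the plan when possible
    rest = plan[t + 1:]                  # else: nearest node on the rest
    return min(candidates, key=lambda v: min(cost(v, u) for u in rest))
```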

t_i is updated when a_i in extended is activated. If head_i ∈ π̃_i[t_i+1:], then t_i becomes the corresponding timestep t′ such that π̃_i[t′] = head_i; otherwise, nothing is done.

Figure 4: Executions in random grids (random-32-32-10, 32×32): with 35 agents (varying p̄), and with p̄ = 0.5 (varying the number of agents).
Figure 5: Executions in large fields: den312d (65×81, containing a bottleneck) and random-64-64-20 (64×64), both with p̄ = 0.1.

Evaluation

The experiments aim at: 1) showing the robustness of time-independent planning, i.e., even though delays of agents are unpredictable, their trajectories remain efficient; and 2) verifying the usefulness of using MAPF plans as hints during execution.

We used the MAPF-DP problem because it can emulate imperfect executions. To adapt the time-independent model to MAPF-DP, rules of activation are required. We repeated the following two phases: 1) Each agent a_i in extended is activated with probability 1 − p_i. As a result, a_i successfully moves to head_i with probability 1 − p_i and becomes contracted. 2) Pick one agent in contracted or in requesting at random and activate it; repeat this until the configuration becomes stable, i.e., all agents in contracted and requesting do not change their states unless some agent in extended is activated. The rationale is that the time required to execute atomic actions other than agents' moves is much smaller than that of the moves. We regard a pair of these two phases as one timestep. Strong termination was used in order to compare with two existing approaches for MAPF-DP: FSPs (Fully Synchronized Policies) and MCPs (Minimum Communication Policies). The evaluation is based on the sum of costs (SOC) metric. The simulator was developed in C++ (https://github.com/Kei18/time-independent-planning), and all experiments were run on a laptop with an Intel Core i9 2.3 GHz CPU and 16 GB RAM. In all settings, we tried 100 repetitions.
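The two-phase timestep can be sketched as below; the Agent interface (a mode attribute and an activate() method returning whether the state changed) is an assumption of ours, standing in for the simulator's actual implementation.

```python
import random

# Sketch of one MAPF-DP timestep under the activation rules above.
#   agents: list of objects with .mode in {"contracted", "requesting",
#           "extended"} and .activate() -> bool (True iff state changed);
#   p: per-agent delay probabilities; rng: random source.

def timestep(agents, p, rng):
    # Phase 1: each extended agent finishes its move with prob. 1 - p_i
    for i, a in enumerate(agents):
        if a.mode == "extended" and rng.random() < 1 - p[i]:
            a.activate()                 # completes the move -> contracted
    # Phase 2: activate contracted/requesting agents in random order
    # until the configuration is stable (no further state changes)
    changed = True
    while changed:
        changed = False
        pool = [a for a in agents if a.mode in ("contracted", "requesting")]
        rng.shuffle(pool)
        for a in pool:
            if a.activate():
                changed = True
```

Phase 2 terminates exactly when the configuration is stable in the sense defined above, so a "timestep" always ends with every non-extended agent quiescent.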

Small Benchmarks

First, we tested time-independent planning in the two small benchmarks shown in Fig. 3. The delay probabilities p_i were chosen uniformly at random from [0, p̄], where p̄ is the upper bound of p_i. A higher p̄ means that agents delay more often; p̄ = 0 corresponds to perfect executions without delays. We manipulated p̄. GREEDY, Causal-PIBT, and the version enhanced with MAPF plans as hints (Causal-PIBT+) were evaluated. These three were regarded as failed after 10000 activations in total, which implies the occurrence of deadlocks or livelocks. FSPs and MCPs were also tested for comparison. To execute FSPs, MCPs, and Causal-PIBT+, valid MAPF plans minimizing SOC, assuming following conflicts, were computed by an adapted version of Conflict-based Search (CBS) (Sharon et al. 2015) prior to performing MAPF-DP.

The results are shown in Fig. 3. Although GREEDY failed in most cases due to deadlocks, Causal-PIBT(+) succeeded in all trials thanks to deadlock recovery and reachability. FSPs resulted in larger SOC compared with MCPs. The results of Causal-PIBT(+) were competitive with those of MCPs.

Random Grid

Next, we tested Causal-PIBT(+) using one scenario from the MAPF benchmarks (Stern et al. 2019), shown in Fig. 4. We manipulated two factors: 1) p̄, while fixing the number of agents (= 35); and 2) the number of agents, while fixing p̄ (= 0.5). When the number of agents increases, the probability that some agent delays also increases. We set sufficient upper bounds on activations. FSPs and MCPs were also tested. This time, an adapted version of Enhanced CBS (ECBS) (Barer et al. 2014), a bounded-suboptimal MAPF solver, was used to obtain valid MAPF-DP plans, with suboptimality 1.1.

Fig. 4 shows the results. The proposals succeeded in all cases even though the fields were not biconnected. The results demonstrate that time-independent planning yields robust executions while keeping SOC small in environments that are severe with respect to timing assumptions. More precisely, when p̄ or the number of agents was small, MCPs were efficient; as these numbers increase, time-independent planning gains the advantage.

Large Fields

Finally, we tested the proposals using large fields from the MAPF benchmarks, as shown in Fig. 5. We picked one scenario for each field and tested while changing the number of agents; p̄ was fixed to 0.1. Regular ECBS (suboptimality: 1.1) was used for Causal-PIBT+. Note that Causal-PIBT+ does not require MAPF plans that assume following conflicts.

The results (Fig. 5) demonstrate the advantage of using MAPF plans in Causal-PIBT. The proposed methods succeeded in all cases. We observed a very large SOC in map den312d due to the existence of a bottleneck; such a structure critically affects execution delays.

Conclusion

This paper studied online and distributed planning for multiple moving agents without timing assumptions. We abstracted the reality of execution as a transition system, then proposed time-independent planning, together with Causal-PIBT as an implementation that ensures reachability. Simulations in MAPF-DP demonstrate the robustness of time-independent planning and the usefulness of MAPF plans as hints.

Future directions include the following: 1) Develop time-independent planning with strong termination, e.g., by adapting Push and Swap (Luna and Bekris 2011) to our model. 2) Address communication between agents explicitly. In particular, this paper neglects delays caused by communication by treating interactions as a black box; the next big challenge lies there.

Acknowledgments

We thank the anonymous reviewers for their many insightful comments and suggestions. This work was partly supported by JSPS KAKENHI Grant Numbers 20J23011 and 20K11685, and Yoshida Scholarship Foundation.

References

  • Atzmon et al. (2020) Atzmon, D.; Stern, R.; Felner, A.; Wagner, G.; Barták, R.; and Zhou, N.-F. 2020. Robust multi-agent path finding and executing. Journal of Artificial Intelligence Research 67: 549–579.
  • Barer et al. (2014) Barer, M.; Sharon, G.; Stern, R.; and Felner, A. 2014. Suboptimal variants of the conflict-based search algorithm for the multi-agent pathfinding problem. In Proc. Annual Symp. on Combinatorial Search (SoCS), 19–27.
  • Derakhshandeh et al. (2014) Derakhshandeh, Z.; Dolev, S.; Gmyr, R.; Richa, A. W.; Scheideler, C.; and Strothmann, T. 2014. Brief announcement: amoebot–a new model for programmable matter. In Proc. Symp. on Parallelism in Algorithms and Architectures, 220–222. ACM.
  • Dresner and Stone (2008) Dresner, K.; and Stone, P. 2008. A multiagent approach to autonomous intersection management. Journal of artificial intelligence research 31: 591–656.
  • Hönig et al. (2019) Hönig, W.; Kiesel, S.; Tinka, A.; Durham, J. W.; and Ayanian, N. 2019. Persistent and robust execution of mapf schedules in warehouses. IEEE Robotics and Automation Letters 4(2): 1125–1131.
  • Hönig et al. (2016) Hönig, W.; Kumar, T. S.; Cohen, L.; Ma, H.; Xu, H.; Ayanian, N.; and Koenig, S. 2016. Multi-agent path finding with kinematic constraints. In Proc. Intl. Conf. on Automated Planning and Scheduling (ICAPS), 477–485.
  • Kim et al. (2015) Kim, K.; Campbell, J.; Duong, W.; Zhang, Y.; and Fainekos, G. 2015. DisCoF+: Asynchronous DisCoF with flexible decoupling for cooperative pathfinding in distributed systems. In Proc. IEEE Intl. Conf. on Automation Science and Engineering (CASE), 369–376.
  • Luna and Bekris (2011) Luna, R.; and Bekris, K. E. 2011. Push and swap: Fast cooperative path-finding with completeness guarantees. In Proc. Intl. Joint Conf. on Artificial Intelligence (IJCAI), 294–300.
  • Ma, Kumar, and Koenig (2017) Ma, H.; Kumar, T. S.; and Koenig, S. 2017. Multi-agent path finding with delay probabilities. In Proc. AAAI Conf. on Artificial Intelligence, 3605–3612.
  • Ma et al. (2017) Ma, H.; Li, J.; Kumar, T.; and Koenig, S. 2017. Lifelong multi-agent path finding for online pickup and delivery tasks. In Proc. Intl. Joint Conf. on Autonomous Agents & Multiagent Systems (AAMAS), 837–845.
  • Ma et al. (2016) Ma, H.; Tovey, C.; Sharon, G.; Kumar, T. S.; and Koenig, S. 2016. Multi-agent path finding with payload transfers and the package-exchange robot-routing problem. In Proc. AAAI Conf. on Artificial Intelligence, 3166–3173.
  • Okumura et al. (2019) Okumura, K.; Machida, M.; Défago, X.; and Tamura, Y. 2019. Priority Inheritance with Backtracking for Iterative Multi-agent Path Finding. In Proc. Intl. Joint Conf. on Artificial Intelligence (IJCAI), 535–542.
  • Okumura, Tamura, and Défago (2020) Okumura, K.; Tamura, Y.; and Défago, X. 2020. Time-Independent Planning for Multiple Moving Agents. CoRR abs/2005.13187. URL https://arxiv.org/abs/2005.13187.
  • Sha, Rajkumar, and Lehoczky (1990) Sha, L.; Rajkumar, R.; and Lehoczky, J. P. 1990. Priority inheritance protocols: An approach to real-time synchronization. IEEE Transactions on Computers 39(9): 1175–1185.
  • Sharon et al. (2015) Sharon, G.; Stern, R.; Felner, A.; and Sturtevant, N. R. 2015. Conflict-based search for optimal multi-agent pathfinding. Artificial Intelligence 219: 40–66.
  • Stern (2019) Stern, R. 2019. Multi-Agent Path Finding–An Overview. In Artificial Intelligence - 5th RAAI Summer School, volume 11866 of LNCS, 96–115. Springer.
  • Stern et al. (2019) Stern, R.; Sturtevant, N.; Felner, A.; Koenig, S.; Ma, H.; Walker, T.; Li, J.; Atzmon, D.; Cohen, L.; Kumar, T.; et al. 2019. Multi-Agent Pathfinding: Definitions, Variants, and Benchmarks. In Proc. Intl. Symp. on Combinatorial Search (SoCS), 151–159.
  • Tel (2000) Tel, G. 2000. Introduction to distributed algorithms. Cambridge university press.
  • Wang and Rubenstein (2020) Wang, H.; and Rubenstein, M. 2020. Walk, Stop, Count, and Swap: Decentralized Multi-Agent Path Finding With Theoretical Guarantees. IEEE Robotics and Automation Letters 5(2): 1119–1126.
  • Wiktor et al. (2014) Wiktor, A.; Scobee, D.; Messenger, S.; and Clark, C. 2014. Decentralized and complete multi-robot motion planning in confined spaces. In Proc. IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS), 1168–1175.
  • Wurman, D’Andrea, and Mountz (2008) Wurman, P. R.; D’Andrea, R.; and Mountz, M. 2008. Coordinating hundreds of cooperative, autonomous vehicles in warehouses. AI magazine 29(1): 9–20.
  • Yu and LaValle (2013) Yu, J.; and LaValle, S. M. 2013. Structure and Intractability of Optimal Multi-Robot Path Planning on Graphs. In Proc. AAAI Conf. on Artificial Intelligence.

Appendix: Proof of Reachability

In this section, we formally derive Theorem 2. We use V(G′) and E(G′) to denote the vertices and edges of a graph G′, to clarify which graph we refer to.

First, let us define a subgraph H of G, called a parent-child graph, to capture the relationship between agents in Causal-PIBT. Although H is formally defined for a configuration, we simply write H when we mention an invariant property of the parent-child graph.

Definition 1 (parent-child graph).

Given a configuration γ^k, a parent-child graph H^k is a subgraph of G with the following properties. Let B^k ⊆ A be the set of agents occupying only one node in γ^k, i.e., B^k = { a_j | a_j ∈ A, mode_j ≠ extended }. Then, H^k = (V(H^k), E(H^k)), where V(H^k) = { tail_j | a_j ∈ B^k } and E(H^k) = { (tail_j, tail_l) | a_j ∈ B^k, a_l ∈ children_j }.
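The construction, and the forest property established by the next lemma, can be checked mechanically; the sketch below assumes each agent exposes tail, mode, and children fields (names are ours), and tests acyclicity with a union-find pass.

```python
# Build the parent-child graph H for a configuration and check that it is
# a forest, i.e., an acyclic undirected graph (cf. Lemma 1).

def parent_child_graph(agents):
    nodes = {a.tail for a in agents if a.mode != "extended"}
    edges = {(a.tail, c.tail)
             for a in agents if a.mode != "extended"
             for c in a.children}
    return nodes, edges

def is_forest(nodes, edges):
    # union-find: an edge joining two nodes already in the same
    # component closes a cycle, so the graph would not be a forest
    parent = {v: v for v in nodes}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True
```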

The next lemma states that H transits while keeping its graph property of being a forest.

Lemma 1.

H is always a forest.

Proof.

Given the initial configuration γ^0, H^0 is obviously a forest. Assume that H^{k−1} is a forest, where k > 0, and assume further that a_i is activated and the configuration transits from γ^{k−1} to γ^k. We now show that H^k is also a forest. There are three kinds of causes for changing edges in H^k from H^{k−1}.

The first case is the procedure ReleaseChildren. This removes all edges in H^{k−1} from tail_i to its children's tails. This is a decomposition of a subtree in H^{k−1}; thus, H^k is still a forest.

The second case is the procedure PriorityInheritance. If there exists an agent a_k as a new parent, this procedure first makes tail_i isolated in H^{k−1}. Then, a_i updates its parent, i.e., a new edge between tail_i and tail_k is added to H^k. This update still keeps H^k a forest. Note that no cycles are constructed: agents in one component have the same ptmp due to priority inheritance, and priority inheritance occurs only between agents with different ptmp.

The third case is that a_i transits from requesting to extended. Before becoming extended, a_i makes tail_i isolated. This is a decomposition of a subtree in H^{k−1}; thus, H^k is kept a forest.

By induction, H is always a forest. ∎

We next introduce a dominant agent/tree. Intuitively, it is an agent or a tree with the highest priority in a configuration. Note that there exist configurations without a dominant agent/tree, e.g., when the agent with the highest pori is extended.

Definition 2 (dominant agent/tree).

Assume an agent a_d in contracted such that pori_d > pori_j for all a_j ∈ A \ {a_d}. a_d is a dominant agent from when it becomes requesting until it becomes extended. The component of the parent-child graph H containing tail_d is a dominant tree T_d.

Since H is a forest by Lemma 1, T_d is a tree. We next state the condition under which the dominant agent can move to its desired node, and then derive reachability.

Definition 3 (movable agent).

An agent a_m is said to be movable in a given configuration when mode_m = requesting and ¬occupied(head_m).

Lemma 2.

If there is a dominant tree T_d and a movable agent a_m such that tail_m ∈ V(T_d), then the dominant agent a_d eventually becomes extended without becoming contracted.

Proof.

Only the procedure PriorityInheritance adds new edges to H, as shown in the proof of Lemma 1. Thus, all agents in T_d have the same temporary priority, i.e., ptmp_d. This means that no agent outside of T_d can disrupt T_d.

Considering the construction of T_d, all agents on the path from tail_d to tail_m in T_d are requesting. The other agents in T_d are contracted and do nothing when activated. Moreover, when the agents in requesting in T_d excluding a_m are activated, they also do nothing. As a result, only the activation of a_m can update the states of agents in T_d.

Let a_{m′} be parent_m. Any agent a_i in requesting with head_i = head_m will be defeated, since ptmp_m = ptmp_d. As a result, when a_m is activated, it certainly becomes extended and departs from T_d; a_m then becomes contracted within finitely many activations. During this transition, any agent a_i in requesting with head_i = head_{m′} will be defeated, since ptmp_{m′} = ptmp_d. Now, head_{m′} is not occupied by any agent, and we have a new movable agent, namely a_{m′}. By the same argument, a_{m′} moves to its head within finitely many activations. By induction, a_d eventually becomes extended without becoming contracted. ∎

Lemma 3.

If G is biconnected and |A| < |V|, a dominant agent a_d eventually becomes extended without becoming contracted.

Proof.

First, we show that a_d does not become contracted. Assume by contradiction that a_d becomes contracted. Considering the process of the construction of the dominant tree T_d, all agents a_i in T_d excluding a_d are contracted with C_i = ∅. This implies that the nodes in V(T_d) \ {tail_d} have no neighbors outside V(T_d) in V(G). Since G is biconnected, even if tail_d is removed from G, there exists a path between any two nodes in V(G). Thus, V(T_d) must span G, i.e., |V| = |A|. This is a contradiction; hence, a_d never becomes contracted.

Since T_d tries to expand so as to span G, and |A| < |V|, we eventually have a movable agent in T_d. By Lemma 2, a_d eventually becomes extended. ∎

Theorem 2 (reachability).

Causal-PIBT ensures reachability in biconnected graphs if |A| < |V|.

Proof.

Let a_h be the agent with the highest pori. If a_h is contracted, this agent does not inherit any priority. Thus, a_h can become requesting such that head_h is any neighbor node of tail_h. If a_h is requesting, a_h is a dominant agent. By Lemma 3, a_h eventually moves to head_h. These facts imply that a_h can move to an arbitrary neighbor node within finitely many activations. Hence, a_h moves along a shortest path to its goal, and then drops its priority pori_h.

Due to the above mechanism, an agent that has not reached its goal eventually gets the highest pori, and then starts moving along its shortest path to its goal. This satisfies reachability. ∎

It is interesting to note that the condition required to ensure reachability for Causal-PIBT is more restrictive than for PIBT. This is due to the prohibition of rotations (cyclic conflicts) in the time-independent model.