
Proc. of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024), May 6 – 10, 2024, Auckland, New Zealand. N. Alechina, V. Dignum, M. Dastani, J.S. Sichman (eds.).

Cornell University, Ithaca, NY, USA

High-Level, Collaborative Task Planning Grammar and Execution for Heterogeneous Agents

Amy Fang [email protected]  and  Hadas Kress-Gazit [email protected]
Abstract.

We propose a new multi-agent task grammar to encode collaborative tasks for a team of heterogeneous agents that can have overlapping capabilities. The grammar allows users to specify the relationship between agents and parts of the task without providing explicit assignments or constraints on the number of agents required. We develop a method to automatically find a team of agents and synthesize correct-by-construction control with synchronization policies to satisfy the task. We demonstrate the scalability of our approach through simulation and compare our method to existing task grammars that encode multi-agent tasks.

Key words and phrases:
Formal methods, multiagent coordination, task planning, robotics

1. Introduction

Agents working together to achieve common goals have a variety of applications, such as warehouse automation or disaster response. Multi-agent tasks have been defined in different ways in the scheduling and planning literature. For example, in multi-agent task allocation  Jia and Meng (2013); Gerkey and Matarić (2004); Korsah et al. (2013) and coalition formation Xu et al. (2015); Li et al. (2009), each task is a single goal with an associated utility. Individual agents or agent teams then automatically assign themselves to a task based on some optimization metric. Swarm approaches Schmickl et al. (2006); Wang and Rubenstein (2020) consider emergent behavior of an agent collective as the task, for example, aggregation or shape formation.

Recently, formal methods, such as temporal logics for task specifications and correct-by-construction synthesis, have been used to solve different types of multi-agent planning tasks Ulusoy et al. (2012); Schillinger et al. (2018); Chen et al. (2021). Tasks written in temporal logic, such as Linear Temporal Logic (LTL), allow users to capture complex tasks with temporal constraints. Existing work has extended LTL Luo and Zavlanos (2022); Sahin et al. (2017) and Signal Temporal Logic Leahy et al. (2022) to encode tasks that require multiple agents.

In this paper, we consider tasks that a team of heterogeneous agents is required to collaboratively satisfy. For instance, consider a precision agriculture scenario in which a farm contains agents with different on-board sensors to monitor crop health. The user may want to take a moisture measurement in one region, and then take a soil sample of a different region. Depending on the agents’ sensors and sensing range, the agents may decide to collaborate to satisfy the task. For example, one agent may perform the entire task on its own if it has both a moisture sensor and an arm mechanism to pick up a soil sample and can move between the two regions. However, another possible solution is for two agents to team up so that one takes a moisture measurement and the other picks up the soil. Existing task grammars Luo and Zavlanos (2022); Sahin et al. (2017); Leahy et al. (2022) capture tasks such as the above by providing explicit constraints on the types or number of agents for each part of the task, i.e. the task must explicitly encode whether it should be one agent, two agents, or either of these options. In this paper, we create a task grammar and associated control synthesis that removes the need to a priori decide on the number of agents necessary to accomplish a task, allowing users to focus solely on the actions required to achieve the task (e.g. “take a moisture measurement and then pick up a soil sample, irrespective of which or how many agents perform which actions”).

Our task grammar has several unique aspects. First, this grammar enables the interleaving of agent actions, alleviating the need for explicit task decomposition in order to assign agents to parts of the task. Second, rather than providing explicit constraints on the types or number of agents for each part of the task, the task encodes, using the concept of bindings (inspired by Luo and Zavlanos (2022)), the overall relationship between agent assignments and team behavior; we can require certain parts of the task to be satisfied by the same agent without assigning the exact agent or type of agent a priori. Lastly, the grammar allows users to make the distinction between the requirements “for all agents” and “at least one agent”. Given these types of tasks, agents autonomously determine, based on their capabilities, which parts of the task they can and should do for the team to satisfy the task.

Tasks may require collaboration between different agents. Similar to Kloetzer and Belta (2010); Tumova and Dimarogonas (2016); Chen et al. (2011), to ensure the actions are performed in the correct order, our framework takes the corresponding synchronization constraints into account while synthesizing agent behavior; agents must wait to execute the actions together. In our approach, execution of the synchronous behavior for each agent is decentralized; agents carry out their plan and communicate with one another when synchronization is necessary.

Depending on the task and the available agents, there might be different teams (i.e., subsets of the agent set) that can carry out the task; our algorithm for assigning a team and synthesizing behavior for the agents finds the largest team of agents that satisfies the task. This means that the team may have redundancies, i.e. agents can be removed while still ensuring the overall task is satisfied. This is beneficial both for robustness and optimality; the user can choose a subset of the team (provided that all the required bindings are still assigned) to optimize different metrics, such as cost or overall number of agents.

Related work: One way to encode tasks is to first decompose them into independent sub-tasks and then allocate them to the agents. For example, Schillinger et al. (2018); Faruq et al. (2018) address finite-horizon tasks for multi-agent teams. The authors first automatically decompose a global automaton representing the task into independent sub-tasks. To synthesize control policies, the authors build product automata for each heterogeneous agent. Each automaton is then sequentially linked using switch transitions to reduce state-space explosion in synthesizing parallel plans. In our prior work Fang and Kress-Gazit (2022), we address infinite-horizon tasks that have already been decomposed into sub-tasks. Given a new task, we propose a decentralized framework for agents to automatically update their behavior based on the new task and their existing tasks, allowing agents to interleave the tasks.

The works discussed above make the critical assumption that tasks are independent, i.e. agents do not collaborate with one another. One approach to including collaborative actions is to explicitly encode the agent assignments in the tasks. To synthesize agent control for these types of tasks, in Tumova and Dimarogonas (2016), the authors construct a reduced product automaton in which the agents only synchronize when cooperative actions are required. The work in Kantaros and Zavlanos (2020) proposes a sampling-based method that approximates the product automaton of the team by building trees incrementally while maintaining probabilistic completeness. In this paper, we consider the more general setting in which agents may need to collaborate with each other, but are not given explicit task assignments a priori.

Rather than providing predetermined task assignments, another approach for defining collaborative tasks is to capture information about the number and type of agents needed for parts of the specification. For example, Sahin et al. (2017) imposes constraints on the number of agents necessary in regions using counting LTL. Leahy et al. (2022) uses Capability Temporal Logic to encode both the number and capabilities necessary in certain abstracted locations in the environment and then formulates the problem as a MILP to find an optimal teaming strategy. The authors of Luo and Zavlanos (2022) introduce the concept of induced propositions, where each atomic proposition not only encodes information about the number, type of agents, and target regions, but also has a connector that binds the truth of certain atomic propositions together. To synthesize behavior for the agents, they propose a hierarchical approach that first constructs the automaton representing the task and then decomposes the task into possible sub-tasks. The temporal order of these sub-tasks is captured using partially ordered sets and are used in the task allocation problem, which is formulated as a MILP.

Inspired by Luo and Zavlanos (2022) and the concept of induced propositions, we create a task grammar that includes information about how the atomic propositions are related to one another, which represents the overall relationship between agents and task requirements. Unlike Luo and Zavlanos (2022), which considers navigation tasks in which the same set of agents of a certain type may need to visit different regions, we generalize these tasks to any type of abstract action an agent may be able to perform. In addition, we relax a key assumption: we do not require each agent to be categorized as only one type. As a result, agents can have overlapping capabilities. To our knowledge, no other grammars have been proposed for these generalized types of multi-agent collaborative tasks.

Contributions: We propose a task description and control synthesis framework for heterogeneous agents to satisfy collaborative tasks. Specifically, we present a new, LTL-based task grammar for the formulation of collaborative tasks, and provide a framework to form a team of agents and synthesize control and synchronization policies to guarantee the team satisfies the task. We demonstrate our approach in simulated precision agriculture scenarios.

2. Preliminaries

2.1. Linear Temporal Logic

LTL formulas are defined over a set of atomic propositions $AP$, where $\pi\in AP$ are Boolean variables Emerson (1990). We abstract agent actions as atomic propositions. For example, $UV$ captures an agent taking a UV measurement.

Syntax: An LTL formula is defined as:

$$\varphi ::= \pi \ |\ \neg\varphi \ |\ \varphi\vee\varphi \ |\ \bigcirc\varphi \ |\ \varphi\ \mathcal{U}\ \varphi$$

where $\neg$ (“not”) and $\vee$ (“or”) are Boolean operators, and $\bigcirc$ (“next”) and $\mathcal{U}$ (“until”) are temporal operators. From these operators, we can define: conjunction $\varphi\wedge\varphi$, implication $\varphi\Rightarrow\varphi$, eventually $\Diamond\varphi=\text{True}\ \mathcal{U}\ \varphi$, and always $\Box\varphi=\neg\Diamond\neg\varphi$.

Semantics: The semantics of an LTL formula $\varphi$ are defined over an infinite trace $\sigma=\sigma(0)\sigma(1)\sigma(2)\ldots$, where $\sigma(i)$ is the set of true $AP$ at position $i$. We denote that $\sigma$ satisfies LTL formula $\varphi$ as $\sigma\models\varphi$.

Intuitively, $\Diamond\varphi$ is satisfied if there exists a $\sigma(i)$ in which $\varphi$ is true. $\Box\varphi$ is satisfied if $\varphi$ is true at every position in $\sigma$. To satisfy $\varphi_{1}\ \mathcal{U}\ \varphi_{2}$, $\varphi_{1}$ must remain true until $\varphi_{2}$ becomes true. See Emerson (1990) for the full semantics.
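To make the intuition concrete, the following is a minimal sketch (ours, not from the paper) that evaluates eventually, always, and until over a finite prefix of a trace, represented as a list of sets of true propositions; LTL semantics are over infinite traces, so this finite-prefix evaluation is for illustration only.

```python
def eventually(trace, phi):
    # <>phi: phi holds at some position of the (finite) prefix
    return any(phi(state) for state in trace)

def always(trace, phi):
    # []phi: phi holds at every position of the prefix
    return all(phi(state) for state in trace)

def until(trace, phi1, phi2):
    # phi1 U phi2: phi2 eventually holds, and phi1 holds at every earlier position
    for state in trace:
        if phi2(state):
            return True
        if not phi1(state):
            return False
    return False

# Example: an agent stays in region A until it takes a UV measurement.
trace = [{"regionA"}, {"regionA"}, {"UV"}]
print(until(trace, lambda s: "regionA" in s, lambda s: "UV" in s))  # True
```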

2.2. Büchi Automata

An LTL formula $\varphi$ can be translated into a Nondeterministic Büchi Automaton that accepts infinite traces if and only if they satisfy $\varphi$. A Büchi automaton is a tuple $\mathcal{B}=(Z,z_{0},\Sigma_{\mathcal{B}},\delta_{\mathcal{B}},F)$, where $Z$ is the set of states, $z_{0}\in Z$ is the initial state, $\Sigma_{\mathcal{B}}$ is the input alphabet, $\delta_{\mathcal{B}}\subseteq Z\times\Sigma_{\mathcal{B}}\times Z$ is the transition relation, and $F\subseteq Z$ is the set of accepting states. An infinite run of $\mathcal{B}$ over a word $w=w_{1}w_{2}w_{3}\ldots$, $w_{i}\in\Sigma_{\mathcal{B}}$, is an infinite sequence of states $z=z_{0}z_{1}z_{2}\ldots$ such that $(z_{i-1},w_{i},z_{i})\in\delta_{\mathcal{B}}$. A run is accepting if and only if $\text{Inf}(z)\cap F\neq\emptyset$, where $\text{Inf}(z)$ is the set of states that appear in $z$ infinitely often Baier and Katoen (2008).
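Büchi acceptance of an ultimately periodic ("lasso") word $w = \text{prefix}\cdot\text{cycle}^{\omega}$ can be checked by pumping the cycle until a state repeats and testing whether an accepting state is visited inside the recurring loop. The sketch below (our own encoding, restricted to a deterministic transition function for brevity; the paper's automata are nondeterministic) illustrates this.

```python
def accepts_lasso(delta, z0, accepting, prefix, cycle):
    # delta: dict mapping (state, letter) -> state (deterministic for simplicity)
    z = z0
    for a in prefix:
        z = delta[(z, a)]
    boundary, hits = [], []   # state at each cycle start; accepting state seen in that pass?
    while z not in boundary:
        boundary.append(z)
        hit = False
        for a in cycle:
            z = delta[(z, a)]
            hit = hit or (z in accepting)
        hits.append(hit)
    # the passes from the first repeated boundary state onward recur forever,
    # so acceptance = an accepting state is visited in one of those passes
    return any(hits[boundary.index(z):])
```

For instance, with an automaton for $\Diamond p$ (stay in $z_0$ on $q$, move to the accepting sink $z_1$ on $p$), the word $q\,p^{\omega}$ is accepted while $q^{\omega}$ is not.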

2.3. Agent Model

Following Fang and Kress-Gazit (2022), we create an abstract model for each agent based on its set of capabilities. A capability is a weighted transition system $\lambda=(S,s_{0},AP,\Delta,L,W)$, where $S$ is a finite set of states, $s_{0}\in S$ is the initial state, $AP$ is the set of atomic propositions, $\Delta\subseteq S\times S$ is a transition relation where for all $s\in S$ there exists $s'\in S$ such that $(s,s')\in\Delta$, $L:S\rightarrow 2^{AP}$ is the labeling function such that $L(s)$ is the set of propositions that are true in state $s$, and $W:\Delta\rightarrow\mathbb{R}_{\geq 0}$ is the cost function assigning a weight to each transition. Since we are considering a group of heterogeneous agents, agent $j$ has its own set of $k$ capabilities $\Lambda_{j}=\{\lambda_{1},\ldots,\lambda_{k}\}$.

An agent model $A_{j}$ is the product of its capabilities: $A_{j}=\lambda_{1}\times\ldots\times\lambda_{k}$ such that $A_{j}=(S,s_{0},AP_{j},\gamma,L,W)$, where $S=S_{1}\times\ldots\times S_{k}$ is the set of states, $s_{0}\in S$ is the initial state, $AP_{j}=\bigcup_{i=1}^{k}AP_{i}$ is the set of propositions, $\gamma\subseteq S\times S$ is the transition relation such that $(s,s')\in\gamma$, where $s=(s_{1},\ldots,s_{k})$ and $s'=(s'_{1},\ldots,s'_{k})$, if and only if $(s_{i},s'_{i})\in\Delta_{i}$ for all $i\in\{1,\ldots,k\}$, $L:S\rightarrow 2^{AP_{j}}$ is the labeling function where $L(s)=\bigcup_{i=1}^{k}L_{i}(s_{i})$, and $W:\gamma\rightarrow\mathbb{R}_{\geq 0}$ is the cost function that combines the costs of the capabilities. Fig. 1c depicts a snippet of an agent model where we treat the cost as additive. Fig. 1a represents the agent’s sensing area $\lambda_{\mathit{area}}$; the agent can orient its sensors to take measurements in different regions of a partitioned workspace (in this case, regions A and B). Fig. 1b represents the agent’s robot manipulator, which can pick up and drop off soil samples, as well as pull weeds.
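The product construction can be sketched as follows (a minimal illustration with our own data layout, for two capabilities; the paper leaves the cost combination generic, so we treat it as additive as in Fig. 1c).

```python
def product_model(c1, c2):
    """Product of two capabilities; states are pairs, labels are unioned,
    transition costs are added."""
    delta, labels, weights = set(), {}, {}
    for (a, a2) in c1["delta"]:
        for (b, b2) in c2["delta"]:
            t = ((a, b), (a2, b2))
            delta.add(t)
            weights[t] = c1["W"][(a, a2)] + c2["W"][(b, b2)]
    for a in c1["L"]:
        for b in c2["L"]:
            labels[(a, b)] = c1["L"][a] | c2["L"][b]
    return {"s0": (c1["s0"], c2["s0"]), "delta": delta, "L": labels, "W": weights}

# Hypothetical capabilities in the spirit of Fig. 1: a two-region sensing area
# and a simplified arm that can be idle or picking up.
area = {"s0": "A",
        "delta": {("A", "A"), ("A", "B"), ("B", "A"), ("B", "B")},
        "L": {"A": {"regionA"}, "B": {"regionB"}},
        "W": {("A", "A"): 0, ("A", "B"): 1, ("B", "A"): 1, ("B", "B"): 0}}
arm = {"s0": "idle",
       "delta": {("idle", "idle"), ("idle", "pickup"), ("pickup", "idle")},
       "L": {"idle": set(), "pickup": {"pickup"}},
       "W": {("idle", "idle"): 0, ("idle", "pickup"): 1, ("pickup", "idle"): 1}}
model = product_model(area, arm)
```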

Figure 1. Agent partial model: (a) $\lambda_{\mathit{area}}$ (b) $\lambda_{\mathit{arm}}$ (c) $A_{green}$

3. Task Grammar - LTLψ

We define the task grammar LTLψ that includes atomic propositions that abstract agent actions, logical and temporal operators, as in LTL, and bindings that connect actions to specific agents; any action labeled with the same binding must be satisfied by the same agent(s) (the actual value of the binding is not important). We define a task recursively over LTL and binding formulas.

$$\psi ::= \rho \ |\ \psi_{1}\vee\psi_{2} \ |\ \psi_{1}\wedge\psi_{2} \quad (1)$$
$$\varphi ::= \pi \ |\ \neg\varphi \ |\ \varphi\vee\varphi \ |\ \bigcirc\varphi \ |\ \varphi\ \mathcal{U}\ \varphi \quad (2)$$
$$\varphi^{\psi} ::= \varphi^{\psi} \ |\ \neg(\varphi^{\psi}) \ |\ \varphi_{1}^{\psi_{1}}\wedge\varphi_{2}^{\psi_{2}} \ |\ \varphi_{1}^{\psi_{1}}\vee\varphi_{2}^{\psi_{2}} \ |\ \bigcirc\varphi^{\psi} \ |\ \varphi_{1}^{\psi_{1}}\ \mathcal{U}\ \varphi_{2}^{\psi_{2}} \ |\ \Box\varphi^{\psi} \quad (3)$$

where $\psi$, the binding formula, is a Boolean formula excluding negation over $\rho\in AP_{\psi}$, and $\varphi$ is an LTL formula. An LTLψ formula consists of conjunction, disjunction, and temporal operators; we define eventually as $\Diamond\varphi^{\psi}=\text{True}\ \mathcal{U}\ \varphi^{\psi}$. An example of an LTLψ formula is shown in Eq. 4.

Semantics: The semantics of an LTLψ formula $\varphi^{\psi}$ are defined over $\sigma$ and $R$; $\sigma=\sigma_{1}\sigma_{2}\ldots\sigma_{n}$ is the team trace, where $\sigma_{j}$ is agent $j$’s trace and $\forall i,\ \sigma(i)=\sigma_{1}(i)\sigma_{2}(i)\ldots\sigma_{n}(i)$. $R=\{r_{1},r_{2},\ldots,r_{n}\}$ is the set of binding assignments, where $r_{j}\in R$ is the set of $AP_{\psi}$ that are assigned to agent $j$. Once a team is established, $R$ is constant, i.e. an agent’s binding assignment does not change throughout the task execution. For example, $r_{1}=\{2,3\},r_{2}=\{1\}$ denotes that agent 1 is assigned bindings 2 and 3, and agent 2 is assigned binding 1.

Given $n$ agents and a set of binding propositions $AP_{\psi}$, we define the function $\zeta:\psi\rightarrow 2^{2^{AP_{\psi}}}$ such that $\zeta(\psi)$ is the set of all possible combinations of $\rho$ that satisfy $\psi$. For example, $\zeta\bigl((1\vee 2)\wedge 3\bigr)=\{\{1,3\},\{2,3\},\{1,2,3\}\}$.
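Since $\psi$ is negation-free, $\zeta(\psi)$ can be computed by brute-force enumeration of the nonempty subsets of $AP_{\psi}$. A minimal sketch (our encoding: $\psi$ is given as a Python predicate over a set of bindings):

```python
from itertools import chain, combinations

def zeta(ap_psi, psi):
    """All nonempty subsets K of ap_psi for which the binding formula psi holds."""
    subsets = chain.from_iterable(
        combinations(sorted(ap_psi), k) for k in range(1, len(ap_psi) + 1))
    return {frozenset(K) for K in subsets if psi(set(K))}

# The paper's example: zeta((1 v 2) ^ 3) over AP_psi = {1, 2, 3}
result = zeta({1, 2, 3}, lambda K: (1 in K or 2 in K) and 3 in K)
# result == {{1,3}, {2,3}, {1,2,3}}
```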

The semantics of LTLψ are:

  • $(\sigma(i),R)\models\varphi^{\psi}$ iff $\exists K\in\zeta(\psi)$ s.t. ($K\subseteq\bigcup_{p=1}^{n}r_{p}$) and ($\forall j$ s.t. $K\cap r_{j}\neq\emptyset$, $\sigma_{j}(i)\models\varphi$)

  • $(\sigma(i),R)\models(\neg\varphi)^{\psi}$ iff $\exists K\in\zeta(\psi)$ s.t. ($K\subseteq\bigcup_{p=1}^{n}r_{p}$) and ($\forall j$ s.t. $K\cap r_{j}\neq\emptyset$, $\sigma_{j}(i)\not\models\varphi$)

  • $(\sigma(i),R)\models\neg(\varphi^{\psi})$ iff $\exists K\in\zeta(\psi)$ s.t. ($K\subseteq\bigcup_{p=1}^{n}r_{p}$) and ($\exists j$ s.t. $K\cap r_{j}\neq\emptyset$, $\sigma_{j}(i)\not\models\varphi$)

  • $(\sigma(i),R)\models\varphi_{1}^{\psi_{1}}\wedge\varphi_{2}^{\psi_{2}}$ iff $(\sigma(i),R)\models\varphi_{1}^{\psi_{1}}$ and $(\sigma(i),R)\models\varphi_{2}^{\psi_{2}}$

  • $(\sigma(i),R)\models\varphi_{1}^{\psi_{1}}\vee\varphi_{2}^{\psi_{2}}$ iff $(\sigma(i),R)\models\varphi_{1}^{\psi_{1}}$ or $(\sigma(i),R)\models\varphi_{2}^{\psi_{2}}$

  • $(\sigma(i),R)\models\bigcirc\varphi^{\psi}$ iff $(\sigma(i+1),R)\models\varphi^{\psi}$

  • $(\sigma(i),R)\models\varphi_{1}^{\psi_{1}}\ \mathcal{U}\ \varphi_{2}^{\psi_{2}}$ iff $\exists\ell\geq i$ s.t. $(\sigma(\ell),R)\models\varphi_{2}^{\psi_{2}}$ and $\forall i\leq k<\ell$, $(\sigma(k),R)\models\varphi_{1}^{\psi_{1}}$

  • $(\sigma(i),R)\models\Box\varphi^{\psi}$ iff $\forall\ell\geq i$, $(\sigma(\ell),R)\models\varphi^{\psi}$

Intuitively, the behavior of an agent team and their respective binding assignments satisfy $\varphi^{\psi}$ if there exists a possible binding assignment in $\zeta(\psi)$ in which all the bindings are assigned to (at least one) agent, and the behavior of every agent with a relevant binding assignment satisfies $\varphi$. An agent can be assigned more than one binding, and a binding can be assigned to more than one agent.
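The first semantic rule above can be sketched directly (our own encoding: $\zeta(\psi)$ is given as an explicit set of binding sets, $\varphi$ as a predicate over an agent's propositions at step $i$).

```python
def holds(agent_states, R, zeta_psi, phi):
    """(sigma(i), R) |= phi^psi: some satisfying binding set K is fully covered
    by the team's assignments, and every agent sharing a binding with K
    satisfies phi at this step.

    agent_states[j]: set of true propositions in agent j's trace at step i
    R[j]: set of bindings assigned to agent j
    """
    assigned = set().union(*R.values())
    for K in zeta_psi:
        if not K <= assigned:
            continue  # some binding in K is unassigned; try another K
        if all(phi(agent_states[j]) for j in R if K & R[j]):
            return True
    return False
```

For example, with $\zeta(\psi)=\{\{1,3\},\{2,3\},\{1,2,3\}\}$, $r_{1}=\{2,3\}$, $r_{2}=\{1\}$, and $\varphi$ checking for a $UV$ measurement, the formula holds exactly when both agents (each sharing a binding with the chosen $K$) are taking the measurement.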

Remark 1. For the sake of clarity in notation, $\neg\varphi^{\psi}$ is equivalent to $(\neg\varphi)^{\psi}$. For example, $\neg pickup^{1}\triangleq(\neg pickup)^{1}$.

Remark 2. Note the subtle but important difference between $(\neg\varphi)^{\psi}$ and $\neg(\varphi^{\psi})$. Informally, the former requires all agents with binding assignments that satisfy $\psi$ to satisfy $\neg\varphi$; the latter requires the formula $\varphi^{\psi}$ to be violated, meaning that at least one agent’s trace violates $\varphi$, i.e. satisfies $\neg\varphi$.

Remark 3. Unique to LTLψ is the ability to encode both constraints on all agents and constraints on at least one agent; “for all agents” is captured by $\varphi^{\psi}$, while “at least one agent” is encoded as $\neg((\neg\varphi)^{\psi})$, which captures “at least one agent assigned a binding in $K\in\zeta(\psi)$ satisfies $\varphi$”. This allows multiple agents to be assigned the same binding while only one of those agents is necessary to satisfy $\varphi$. This can be particularly useful in tasks with safety constraints; for example, we can write $\neg(\neg region_{A}^{1})\Rightarrow(region_{A}\wedge visual)^{2}$, which says “if any agent assigned binding 1 is in region A, all agents assigned binding 2 must take a picture of the region.”

Example. Let $AP_{\psi}=\{1,2,3\}$, $AP_{\varphi}=\{region_{A},region_{B},pickup,thermal,visual,moisture,UV\}$, and $\varphi^{\psi}=\varphi^{\psi}_{1}\wedge\varphi^{\psi}_{2}$, where

$$\varphi^{\psi}_{1}=\Diamond\bigl((region_{B}\wedge moisture\wedge UV)^{2\wedge 3}\wedge(region_{A}\wedge pickup)^{1}\bigr) \quad (4a)$$
$$\varphi^{\psi}_{2}=\neg pickup^{1}\ \mathcal{U}\ \bigl(region_{A}\wedge((thermal\vee visual)\wedge\neg(thermal\wedge visual))\bigr)^{2} \quad (4b)$$

$\varphi^{\psi}_{1}$ captures “Agent(s) assigned bindings 2 and 3 should take a moisture measurement and a UV measurement in region B at the same time that agent(s) assigned binding 1 pick up a soil sample in region A.” $\varphi^{\psi}_{2}$ captures “Before the soil sample can be picked up, agent(s) assigned binding 2 need to either take a thermal image or a visual image (but not both) of region A.”

Note that, since multiple bindings can be assigned to the same agent, an agent can be assigned both bindings 2 and 3, provided that it has the capabilities to satisfy the corresponding parts of the formula. In addition, depending on the final assignments, the agents may need to synchronize with one another to perform parts of the task. For example, agents assigned any subset of bindings $\{1,2,3\}$ need to synchronize their respective actions to satisfy $\varphi^{\psi}_{1}$.

4. Control Synthesis for LTLψ

Problem statement: Given $n$ heterogeneous agents $A=\{A_{1},\ldots,A_{n}\}$ and a task $\varphi^{\psi}$ in LTLψ, find a team of agents $\hat{A}\subseteq A$, their binding assignments $R_{\hat{A}}$, and synthesize behavior $\sigma_{j}$ for each agent such that $(\sigma(0),R_{\hat{A}})\models\varphi^{\psi}$. This behavior includes synchronization constraints for agents to satisfy the necessary collaborative actions. We assume that each agent is able to wait in any state (i.e. every state in the agent model has a self-transition).

Example. Consider a group of four agents $A=\{A_{green},A_{blue},A_{orange},A_{pink}\}$ in a precision agriculture environment composed of 5 regions, as illustrated in Fig. 2. $A_{orange}$ is a mobile robot manipulator, such as Harvest Automation’s HV-100, while the other agents are stationary with different onboard sensing capabilities. The set of all capabilities is $\Lambda=\{\lambda_{\mathit{area}\_j},\lambda_{motion},\lambda_{arm},\lambda_{UV},\lambda_{moisture},\lambda_{visual},\lambda_{thermal}\}$, where, for $j\in\{green,blue,pink\}$, $\lambda_{\mathit{area}\_j}$ is agent $j$’s sensing area model. The green agent can orient its arm to reach either region A or B. The blue agent can orient its sensors to see one of three regions, B, C, or D; in order to reorient its sensors from region B to D, its sensing range must first pass through region C. Similarly, the pink agent can orient its sensors to see region A, B, or C, and its sensing range must pass through region B to get from region A to C. The orange agent’s ability to move between adjacent regions is represented by the capability $\lambda_{motion}$; its sensing region is whichever region it is in. $AP_{arm}=\{\textit{pickup, dropoff, weed}\}$ is an abstraction of a robot manipulator that represents different actions the arm can perform, such as picking up soil samples or pulling weeds ($\lambda_{arm}$ has additional states; see Fig. 1b). $AP_{UV}$, $AP_{moisture}$, $AP_{visual}$, and $AP_{thermal}$ each contain a single proposition representing an agent’s ability to take UV measurements, soil moisture measurements, visual images, and thermal images, respectively. Each agent may have distinct cost functions corresponding to individual capabilities.

The agent capabilities and labels on the initial states are:
$\Lambda_{green}=\{\lambda_{\mathit{area}\_1},\lambda_{\mathit{arm}}\}$, $L(s_{0})=\{region_{B}\}$
$\Lambda_{blue}=\{\lambda_{\mathit{area}\_2},\lambda_{\mathit{moisture}},\lambda_{UV}\}$, $L(s_{0})=\{region_{D}\}$
$\Lambda_{orange}=\{\lambda_{\mathit{motion}},\lambda_{\mathit{moisture}},\lambda_{UV},\lambda_{arm}\}$, $L(s_{0})=\{region_{E}\}$
$\Lambda_{pink}=\{\lambda_{\mathit{area}\_4},\lambda_{\mathit{thermal}},\lambda_{\mathit{visual}},\lambda_{\mathit{moisture}},\lambda_{UV}\}$, $L(s_{0})=\{region_{C}\}$

The team receives the task $\varphi^{\psi}$ (Eq. 4) and must determine a teaming assignment and behavior to satisfy the task. During execution, the agents must also synchronize with each other when necessary.

Figure 2. Agriculture environment and initial agent states. The green, blue, and pink agents are stationary; the orientation of their sensors is indicated by the colored boxes.

5. Approach

To find a teaming assignment and synthesize the corresponding synchronization and control, we first automatically generate a Büchi automaton $\mathcal{B}$ for the task $\varphi^{\psi}$ (Sec. 5.1). Each agent $A_{j}$ then constructs a product automaton $\mathcal{G}_{j}=A_{j}\times\mathcal{B}$ (Sec. 5.2). For each binding $\rho\in AP_{\psi}$, it checks whether or not it can perform the task associated with that binding by finding a path to an accepting cycle in $\mathcal{G}_{j}$. Each agent creates a copy of the Büchi automaton $\mathcal{B}_{j}$, pruned to remove any unreachable transitions, and stores information about which combinations of binding assignments it can satisfy.

For parts of the task that require collaboration (e.g., when a transition calls for actions with bindings $\{1,2\}$ and $r_{green}=\{1,2\}$, $r_{blue}=\{2\}$), we need agents to synchronize. Thus, we synthesize behavior that allows for parallel execution while also guaranteeing that the team’s overall behavior satisfies the global specification.

To find a team of agents that can satisfy the task and their assignments, we need to guarantee that 1) every binding is assigned to at least one agent and 2) the agents synchronize for the collaborative portions of the task. To do so, we first run a depth-first search (DFS) to find a path through $\mathcal{B}$ to an accepting cycle in which there exists a team of agents such that, for every transition in the path, every proposition in $AP_{\psi}$ is assigned to at least one agent (Sec. 5.4). Each agent then synthesizes behavior to satisfy this path and communicates with other agents when synchronization is necessary.
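The search idea can be sketched in a greatly simplified form (ours, not the paper's full algorithm, which continues to an accepting cycle): a DFS over Büchi states for a path to an accepting state such that, on every edge, each binding in the edge label is claimed by at least one agent. Here `can_do` is an assumed per-agent feasibility oracle, which in the full approach would come from each agent's pruned automaton $\mathcal{B}_{j}$.

```python
def find_path(edges, z0, accepting, agents, can_do):
    """DFS for a path z0 -> ... -> accepting state where every binding on
    every traversed edge is covered by some agent.

    edges: dict state -> list of (required_bindings, next_state)
    can_do(agent, state, bindings, b): can `agent` take binding b on this edge?
    """
    def dfs(z, path, visited):
        if z in accepting:
            return path
        for bindings, z2 in edges.get(z, []):
            covered = all(any(can_do(a, z, bindings, b) for a in agents)
                          for b in bindings)
            if covered and z2 not in visited:
                found = dfs(z2, path + [(z, bindings, z2)], visited | {z2})
                if found is not None:
                    return found
        return None  # no coverable path from this state
    return dfs(z0, [], {z0})
```

With two hypothetical agents where green can take binding 1 and blue binding 2, a two-edge automaton requiring $\{1,2\}$ then $\{1\}$ yields a two-transition path to the accepting state.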

5.1. Büchi Automaton for an LTLψ Formula

When constructing a Büchi automaton for an LTLψ specification, we automatically rewrite the specification such that the binding propositions are only over individual atomic propositions $\pi\in AP_{\varphi}$ (i.e. the formula is composed of $\pi^{\rho}$). For instance, the formula $(\neg pickup\ \mathcal{U}\ region_{A})^{1\vee 2}$ is rewritten as $(\neg pickup^{1}\ \mathcal{U}\ region_{A}^{1})\vee(\neg pickup^{2}\ \mathcal{U}\ region_{A}^{2})$.
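The propositional core of this rewriting, distributing a binding formula down to individual bindings, can be sketched on a tiny expression tree (our own AST encoding; the full procedure also distributes bindings through the temporal operators, as in the example above).

```python
def rewrite(phi, psi):
    """Push a binding formula psi down to atomic bindings:
    phi^(1 v 2) -> phi^1 v phi^2, and phi^(1 ^ 2) -> phi^1 ^ phi^2.

    psi is either an int (atomic binding rho) or a nested tuple
    ("or", left, right) / ("and", left, right).
    """
    if isinstance(psi, int):
        return ("bind", phi, psi)  # phi^rho with an atomic binding
    op, left, right = psi
    return (op, rewrite(phi, left), rewrite(phi, right))
```

For instance, `rewrite("pickup", ("or", 1, 2))` yields the disjunction of `pickup` under binding 1 and under binding 2, mirroring the $(\cdot)^{1\vee 2}$ example.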

In our running example, we rewrite the formula in Eq. 4a as

$$\Diamond(region_{B}^{2}\wedge moisture^{2}\wedge UV^{2}\wedge region_{B}^{3}\wedge moisture^{3}\wedge UV^{3}\wedge region_{A}^{1}\wedge pickup^{1}) \quad (5)$$

Remark 4. In rewriting the specification, negation follows bindings in the order of operations. For example, $\neg pickup^{1\wedge 2}=\neg pickup^{1}\wedge\neg pickup^{2}$, and $\neg(pickup^{1\wedge 2})=\neg(pickup^{1}\wedge pickup^{2})=\neg(pickup^{1})\vee\neg(pickup^{2})$.

From $AP_{\varphi}$ and $AP_{\psi}$, we define the set of propositions $AP_{\varphi}^{\psi}$ such that $\forall\pi\in AP_{\varphi}$ and $\forall\rho\in AP_{\psi}$, $\pi^{\rho}\in AP_{\varphi}^{\psi}$. Given $AP_{\varphi}^{\psi}$, we automatically translate the specification into a Büchi automaton using Spot Duret-Lutz et al. (2016).

To facilitate control synthesis, we transform any transition in the Büchi automaton labeled with a disjunctive formula into disjunctive normal form (DNF). We then replace each transition labeled with a DNF formula containing $\ell$ conjunctive clauses with $\ell$ transitions between the same states, each labeled with a different conjunctive clause of the original label.

In general, when creating a Büchi automaton from an LTL formula $\varphi$, the labels $w\in\Sigma_{\mathcal{B}}$ are Boolean formulas over $AP_{\varphi}$, the atomic propositions that appear in $\varphi$, as seen in Fig. 3. In the following, for creating the product automaton, we use an equivalent representation, where $\Sigma_{\mathcal{B}}=2^{AP_{\varphi}^{\psi}}\times 2^{AP_{\varphi}^{\psi}}$ and $w=(\sigma_{T},\sigma_{F})\in\Sigma_{\mathcal{B}}$ contains the set of propositions that must be true, $\sigma_{T}$, and the set of propositions that must be false, $\sigma_{F}$, for the Boolean formula over a transition to evaluate to True. These sets are unique in our case since each transition is labeled with a conjunctive clause (i.e. no disjunction). Note that $\sigma_{T}\cap\sigma_{F}=\emptyset$ and $\sigma_{T}\cup\sigma_{F}\subseteq AP_{\varphi}^{\psi}$; propositions that do not appear in $w$ can have any truth value.
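The clause splitting and the $(\sigma_{T},\sigma_{F})$ representation can be sketched together (our own encoding: propositions are strings, a literal is a pair of a proposition and its polarity).

```python
def clause_to_sets(clause):
    """Convert a conjunctive clause (list of (proposition, polarity) literals)
    into the (sigma_T, sigma_F) label representation."""
    sigma_T = {p for p, positive in clause if positive}
    sigma_F = {p for p, positive in clause if not positive}
    assert not (sigma_T & sigma_F)  # a label cannot require a proposition both ways
    return (sigma_T, sigma_F)

def split_dnf_transition(z, dnf_clauses, z2):
    """Replace one DNF-labeled transition with one transition per clause,
    each between the same pair of Buchi states."""
    return [(z, clause_to_sets(clause), z2) for clause in dnf_clauses]
```

For example, a label with two clauses produces two parallel transitions, the first carrying $\sigma_{T}=\{UV^{2}\}$, $\sigma_{F}=\{pickup^{1}\}$.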

Given a Büchi automaton $\mathcal{B}$ for an LTLψ specification, we define the following functions:

Figure 3. $\mathcal{B}$ for $\varphi^{\psi}$ (Eq. 4). The purple transitions illustrate a possible accepting trace.

Definition 1 (Binding Function). $\mathfrak{B}:\Sigma_{\mathcal{B}}\rightarrow 2^{AP_{\psi}}$ such that for $\sigma=(\sigma_{T},\sigma_{F})\in\Sigma_{\mathcal{B}}$, $\mathfrak{B}(\sigma)=\{\rho\in AP_{\psi}\ |\ \exists\pi^{\rho}\in\sigma_{T}\cup\sigma_{F}\}\subseteq AP_{\psi}$. Intuitively, it is the set of bindings that appear in label $\sigma$ of a Büchi transition.

Definition 2 (Capability Function). $\mathfrak{C}:\Sigma_{\mathcal{B}}\times AP_{\psi}\rightarrow 2^{AP_{\varphi}}\times 2^{AP_{\varphi}}$ such that for $\sigma=(\sigma_{T},\sigma_{F})\in\Sigma_{\mathcal{B}}$ and $\rho\in AP_{\psi}$, $\mathfrak{C}(\sigma,\rho)=(C_{T},C_{F})$, where $C_{T}=\{\pi\in AP_{\varphi}\ |\ \pi^{\rho}\in\sigma_{T}\}$ and $C_{F}=\{\pi\in AP_{\varphi}\ |\ \pi^{\rho}\in\sigma_{F}\}$. Here, $C_{T}$ and $C_{F}$ are the sets of action propositions that are True/False and appear with binding $\rho$ in label $\sigma$ of a Büchi transition.
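Definitions 1 and 2 can be sketched as follows (our own encoding: a labeled proposition $\pi^{\rho}$ is a string such as `"pickup^1"`, and a label $\sigma$ is the pair $(\sigma_{T},\sigma_{F})$ of sets of such strings).

```python
def bindings(sigma):
    """Definition 1: the set of bindings appearing anywhere in label sigma."""
    sigma_T, sigma_F = sigma
    return {int(p.split("^")[1]) for p in sigma_T | sigma_F}

def capabilities(sigma, rho):
    """Definition 2: the action propositions that appear with binding rho in
    sigma and must be True (C_T) or False (C_F)."""
    sigma_T, sigma_F = sigma
    C_T = {p.split("^")[0] for p in sigma_T if int(p.split("^")[1]) == rho}
    C_F = {p.split("^")[0] for p in sigma_F if int(p.split("^")[1]) == rho}
    return (C_T, C_F)
```

For the label $\sigma=(\{region_{B}^{2},UV^{2},pickup^{1}\},\{pickup^{2}\})$, $\mathfrak{B}(\sigma)=\{1,2\}$ and $\mathfrak{C}(\sigma,2)=(\{region_{B},UV\},\{pickup\})$.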

5.2. Agent Behavior for an LTL$^{\psi}$ Specification

To synthesize behavior for an agent, we find an accepting trace in its product automaton $\mathcal{G}_{j}=A_{j}\times\mathcal{B}$, where $A_{j}=(S,s_{0},AP_{j},\gamma,L,W)$ is the agent model and $\mathcal{B}=(Z,z_{0},\Sigma_{\mathcal{B}},\delta_{\mathcal{B}},F)$ is the Büchi automaton.

Since the set of propositions of $A_{j}$ may not be equivalent to the set of propositions of $\mathcal{B}$, we borrow from the definition of the product automaton in Fang and Kress-Gazit (2022). We first define the following function:

Definition 3 (Binding Assignment Function). Let $q=(s,z)$, $q^{\prime}=(s^{\prime},z^{\prime})$, and $\sigma=(\sigma_{T},\sigma_{F})\in\Sigma_{\mathcal{B}}$. Then $\mathfrak{R}(q,\sigma,q^{\prime})=\{r\in 2^{AP_{\psi}}\setminus\emptyset\ |\ \forall\rho\in r,\ (C_{T},C_{F})=\mathfrak{C}(\sigma,\rho),\ \bigcup_{\rho\in r}C_{T}\subseteq L(s^{\prime})$ and $\bigcup_{\rho\in r}C_{F}\cap L(s^{\prime})=\emptyset\}$.

Intuitively, $\mathfrak{R}$ outputs all possible combinations of binding propositions that the agent can be assigned for a transition $(q,\sigma,q^{\prime})$. An agent can be assigned $\rho$ if and only if the agent's next state $s^{\prime}$ is labeled with all the action and motion propositions $\pi\in AP_{\varphi}$ that appear in $\sigma_{T}$ as $\pi^{\rho}$, and none of the propositions $\pi\in AP_{\varphi}$ that appear in $\sigma_{F}$ as $\pi^{\rho}$ are part of the state label (i.e., the agent is not performing that action). If a proposition $\pi^{\rho}$ is in $\sigma_{F}$ and $\pi$ is not in $AP_{j}$ (e.g., $scan^{1}\in\sigma_{F}$ and the agent does not have $\lambda_{scan}$), the agent may still be assigned $\rho$. Note that $r$ may include any binding propositions that are not in $\sigma$, since no actions are required by those bindings in that transition. For example, if $\sigma=(\{scan^{1}\},\{pickup^{2}\})$ and $AP_{\psi}=\{1,2,3\}$, then $\{3\}$ is in the set $\mathfrak{R}(q,\sigma,q^{\prime})$ for all $q,q^{\prime}$.
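Under the same $(\pi,\rho)$ encoding, $\mathfrak{R}$ can be sketched directly from Def. 3 (a hypothetical illustration; the next-state label $L(s^{\prime})$ is given as the set of the agent's True propositions):

```python
from itertools import chain, combinations

def R(sigma, next_label, AP_psi):
    """Binding assignment function (Def. 3 sketch): all nonempty sets of
    bindings the agent can take on a Buchi transition labeled sigma,
    given the label of the agent's next state."""
    sigma_T, sigma_F = sigma

    def can_take(rho):
        C_T = {pi for (pi, r) in sigma_T if r == rho}
        C_F = {pi for (pi, r) in sigma_F if r == rho}
        return C_T <= next_label and not (C_F & next_label)

    feasible = sorted(rho for rho in AP_psi if can_take(rho))
    subsets = chain.from_iterable(
        combinations(feasible, k) for k in range(1, len(feasible) + 1))
    return {frozenset(r) for r in subsets}

# sigma = (empty, {pickup^1, region_A^2}), as in the paper's running example
sigma = (frozenset(), frozenset({("pickup", 1), ("region_A", 2)}))
```

For a next state labeled $\{region_{A}\}$ this returns $\{\{1\},\{3\},\{1,3\}\}$, matching the example in Sec. 5.2.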

Given $A_{j}$ and $\mathcal{B}$, we define the product automaton $\mathcal{G}_{j}=A_{j}\times\mathcal{B}$:

Definition 4 (Product Automaton). The product automaton $\mathcal{G}_{j}=(Q,q_{0},AP_{j},\delta_{\mathcal{G}},L_{\mathcal{G}},W_{\mathcal{G}},F_{\mathcal{G}})$, where

  • $Q=S\times Z$ is a finite set of states

  • $q_{0}=(s_{0},z_{0})\in Q$ is the initial state

  • $\delta_{\mathcal{G}}\subseteq Q\times Q$ is the transition relation, where for $q=(s,z)$ and $q^{\prime}=(s^{\prime},z^{\prime})$, $(q,q^{\prime})\in\delta_{\mathcal{G}}$ if and only if $(s,s^{\prime})\in\gamma$ and $\exists\sigma\in\Sigma_{\mathcal{B}}$ such that $(z,\sigma,z^{\prime})\in\delta_{\mathcal{B}}$ and $\mathfrak{R}(q,\sigma,q^{\prime})\neq\emptyset$

  • $L_{\mathcal{G}}$ is the labeling function such that for $q=(s,z)$, $L_{\mathcal{G}}(q)=L(s)\subseteq AP_{j}$

  • $W_{\mathcal{G}}:\delta_{\mathcal{G}}\rightarrow\mathbb{R}_{\geq 0}$ is the cost function such that for $(q,q^{\prime})\in\delta_{\mathcal{G}}$, $q=(s,z)$, $q^{\prime}=(s^{\prime},z^{\prime})$, $W_{\mathcal{G}}((q,q^{\prime}))=W((s,s^{\prime}))$

  • $F_{\mathcal{G}}=S\times F$ is the set of accepting states
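A direct (if naive) construction of $\delta_{\mathcal{G}}$ follows the definition: iterate over agent transitions and Büchi edges, and keep the pairs that admit a nonempty binding assignment. The sketch below is hypothetical glue code, parameterized by a callable standing in for $\mathfrak{R}$:

```python
def product_transitions(gamma, L, delta_B, R):
    """Sketch of delta_G from Def. 4: ((s,z),(s',z')) is a product
    transition iff (s,s') is an agent transition and some Buchi edge
    (z,sigma,z') admits a nonempty binding assignment for label L(s').
    R(sigma, next_label) -> set of feasible binding assignments."""
    delta_G = set()
    for (s, s_next) in gamma:
        for (z, sigma, z_next) in delta_B:
            if R(sigma, L[s_next]):  # nonempty set of assignments
                delta_G.add(((s, z), (s_next, z_next)))
    return delta_G

# toy instance: the single Buchi edge requires region_A under binding 1
gamma = {("s1", "s1"), ("s1", "s2")}
L = {"s1": frozenset(), "s2": frozenset({"region_A"})}
delta_B = {("z0", "need_A", "z1")}
R = lambda sigma, label: {frozenset({1})} if "region_A" in label else set()
delta_G = product_transitions(gamma, L, delta_B, R)
```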

Figure 4. A small portion of $\mathcal{G}_{green}$.

Example. Fig. 4 depicts a small portion of $\mathcal{G}_{green}$. For the self-transition in $\mathcal{B}$ labeled with $\sigma=(\emptyset,\{pickup^{1},region_{A}^{2}\})$ (labeled as $e1$ in Fig. 3), and for states in $A_{green}$ where $L(s_{1})=\{region_{B}\}$, $L(s_{2})=\{region_{A}\}$, and $L(s_{3})=\{region_{A},pickup\}$, the possible binding assignments are $\mathfrak{R}((s_{1},1),\sigma,(s_{1},2))=2^{\{1,2,3\}}\setminus\emptyset$ and $\mathfrak{R}((s_{1},1),\sigma,(s_{2},2))=\{\{1\},\{3\},\{1,3\}\}$. When the agent is in $s_{3}$, it cannot be assigned binding 1 or 2, but since no propositions appear with binding 3 in $\sigma$, $\mathfrak{R}((s_{1},1),\sigma,(s_{3},2))=\{\{3\}\}$.

5.3. Finding Possible Individual Agent Bindings

To construct a team, we first reason about each agent and the sets of bindings it can perform. For example, for a formula $region_{A}^{1}\wedge region_{B}^{2}$, an agent may be assigned $r_{j}=\{1\}$ or $r_{j}=\{2\}$ but not $r_{j}=\{1,2\}$, since it cannot be in two regions at the same time.

To find the set of possible binding assignments $R_{j}\subseteq 2^{AP_{\psi}}$, we search for an accepting trace in $\mathcal{G}_{j}$ for every binding assignment $r_{j}\in 2^{AP_{\psi}}$. We start from the full set of bindings $r_{j}=AP_{\psi}$. Given an assignment $r_{j}$ to check, we find an accepting trace in $\mathcal{G}_{j}$ such that for all transitions $(q,q^{\prime})$ in the trace, $r_{j}\in\mathfrak{R}(q,\sigma,q^{\prime})$. This ensures that the agent can satisfy its binding assignment for the entirety of its execution (i.e., $r_{j}$ does not change). Since every subset of a binding assignment $r_{j}$ is itself a possible binding assignment, if the agent can be assigned all $m=|AP_{\psi}|$ bindings, then we know it can also be assigned every subset of those $m$ bindings. If not, we check the $\binom{m}{m-1}$ combinations, and continue iterating until we have determined the agent's ability to perform every combination of the $m$ bindings.
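This top-down pruning can be sketched as follows, with a callable feasible(r) standing in for the accepting-trace check in $\mathcal{G}_{j}$ (hypothetical code, not the paper's implementation):

```python
from itertools import combinations

def possible_assignments(AP_psi, feasible):
    """Enumerate possible binding assignments from largest to smallest.
    Because every subset of a feasible assignment is feasible, a set is
    only checked if no feasible superset has been found yet."""
    maximal = set()
    for k in range(len(AP_psi), 0, -1):
        for r in map(frozenset, combinations(sorted(AP_psi), k)):
            if any(r <= bigger for bigger in maximal):
                continue  # known feasible via a superset; skip the check
            if feasible(r):
                maximal.add(r)
    # close downward: every nonempty subset of a feasible set is feasible
    R_j = set()
    for r in maximal:
        for k in range(1, len(r) + 1):
            R_j.update(map(frozenset, combinations(sorted(r), k)))
    return R_j

# toy check: the agent can hold bindings 1 and 3 together, but never 2
R_j = possible_assignments({1, 2, 3}, lambda r: r <= {1, 3})
```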

Once an agent determines its possible binding assignments $R_{j}$, it creates the Büchi automaton $\mathcal{B}_{j}$ by removing any transition in $\mathcal{B}$ that cannot be traversed by any assignment in $R_{j}$. In our example (Fig. 3), each agent can be assigned at least one binding over every transition in $\mathcal{B}$. Thus, $\forall j\in\{green,blue,orange,pink\},\ \mathcal{B}_{j}=\mathcal{B}$.

5.4. Agent Team Assignment

A team of agents can perform the task if 1) all the bindings are assigned, with each agent maintaining the same binding assignment for the entirety of the task, and 2) the agents satisfy the synchronization requirements. For a viable team, the agents' control follows the same path in the Büchi automaton $\mathcal{B}$ to an accepting cycle. We perform DFS over $\mathcal{B}$ to find an accepting trace (Alg. 1), where each tuple in $stack$ contains the current edge $(z,\sigma,z^{\prime})$, the current team's binding assignments $R_{\hat{A}}$, and the path traversed so far, $\beta_{\hat{A}}$.

We initialize the team with all agents $A_{j}$ and all possible binding assignments $R_{j}$, and each path $\beta_{\hat{A}}$ starts from state $z_{0}$ of $\mathcal{B}$. When checking a transition $(z,\sigma,z^{\prime})$, we remove any agent $j$ if, $\forall((s,z),(s^{\prime},z^{\prime}))\in\delta_{\mathcal{G}_{j}}$, there are no possible binding assignments it can satisfy. This is done by checking each agent's pruned Büchi automaton $\mathcal{B}_{j}$ in update_team (Alg. 1). We want the agent's behavior to satisfy not only the current transition, but the entire path with a consistent binding assignment. Thus, we update the possible bindings in update_bindings (Alg. 1).

To guarantee the overall team behavior, we need to ensure agents are able to “wait in a state” before they synchronize, as they may reach states at different times. This means that each state in the trace must have a corresponding self-transition. Thus, for every $(z,\sigma,z^{\prime})$ that we add to the path in which $z\neq z^{\prime}$, the next edge to traverse must be a self-transition from $z^{\prime}$ to itself, and vice versa. In Alg. 1, we check whether the current transition is a self-loop and add subsequent transitions to the stack accordingly. If there is no self-transition on $z^{\prime}$ (i.e., $(z^{\prime},\sigma,z^{\prime})\notin\delta_{\mathcal{B}}$), then we do not consider $z^{\prime}$ to be valid and do not add it to the path.

Once we find a valid path to an accepting cycle, we parse it into $\beta$, the path without self-transitions, and $\delta_{self}$, which contains the corresponding self-transition for each state in the path. Fig. 3 shows a valid path in $\mathcal{B}$ for the example in Sec. 4 and the corresponding team assignment $\hat{A}=\{A_{green},A_{blue},A_{orange},A_{pink}\}$ with bindings $r_{green}=\{1\}$, $r_{blue}=\{3\}$, $r_{orange}=\{1\}$, $r_{pink}=\{2,3\}$. Note that we find a valid path rather than a globally optimal one. However, the algorithm is complete; it will find a feasible path if one exists.

Input: $A=\{A_{1},A_{2},...,A_{n}\}$, $R=\{R_{1},R_{2},...,R_{n}\}$, $\mathcal{B}$, $\{\mathcal{B}_{1},\mathcal{B}_{2},...,\mathcal{B}_{n}\}$
Output: $\beta$, $\delta_{self}$, $\hat{A}\subseteq A$, $R_{\hat{A}}$

1   $stack=\emptyset$, $visited=\emptyset$
2   for $e\in\{(z,\sigma,z^{\prime})\in\delta_{\mathcal{B}}\ |\ z=z_{0}\}$ do
3       $stack=stack\cup\{(e,R,[e])\}$
4   while $stack\neq\emptyset$ do
5       $((z,\sigma,z^{\prime}),R_{\hat{A}},\beta_{\hat{A}})=stack.\text{pop}()$
6       if $(z,\sigma,z^{\prime})\not\in visited$ then
7           $visited=visited\cup(z,\sigma,z^{\prime})$
8           $R_{\hat{A}}=\textsc{update\_team}((z,\sigma,z^{\prime}),\{\mathcal{B}_{1},...,\mathcal{B}_{n}\})$
9           for $R_{j}\in R_{\hat{A}}$ do
10              $R_{j}^{\prime}=\textsc{update\_bindings}(R_{j},(z,\sigma,z^{\prime}))$
11              if $R_{j}^{\prime}=\emptyset$ then
12                  $R_{\hat{A}}=R_{\hat{A}}\setminus R_{j}$
13              else
14                  $R_{\hat{A}}=(R_{\hat{A}}\setminus R_{j})\cup R_{j}^{\prime}$
15          if $\bigcup_{j}(R_{j}\in R_{\hat{A}})=AP_{\psi}$ then
16              if $z^{\prime}\in F$ then
17                  $\beta,\delta_{self}=\textsc{parse\_path}(\beta_{\hat{A}})$
18                  return $\beta,\delta_{self},R_{\hat{A}}$
19              $E=\{(z^{\prime},\sigma^{\prime},z^{\prime\prime})\in\delta_{\mathcal{B}}\}$
20              for $(z^{\prime},\sigma^{\prime},z^{\prime\prime})\in E$ do
21                  if ($z=z^{\prime}$ and $z^{\prime}\neq z^{\prime\prime}$) or ($z\neq z^{\prime}$ and $z^{\prime}=z^{\prime\prime}$) then
22                      $stack=stack\cup\{((z^{\prime},\sigma^{\prime},z^{\prime\prime}),R_{\hat{A}},[\beta_{\hat{A}}\ (z^{\prime},\sigma^{\prime},z^{\prime\prime})])\}$

Algorithm 1: Find Accepting Trace for Agent Team
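The search skeleton of Alg. 1 looks roughly like this in Python (a simplified sketch: a single update callable stands in for update_team and update_bindings, and the agents' product automata are abstracted away):

```python
def find_team_path(delta_B, z0, F, R_init, update, AP_psi):
    """DFS over Buchi edges (Alg. 1 skeleton).  Each stack entry holds
    the current edge, the surviving per-agent bindings, and the path so
    far.  Edges alternate between self-loops and 'moving' transitions so
    agents can wait in a state before synchronizing."""
    stack = [(e, R_init, [e]) for e in delta_B if e[0] == z0]
    visited = set()
    while stack:
        edge, R_hat, path = stack.pop()
        if edge in visited:
            continue
        visited.add(edge)
        R_hat = update(R_hat, edge)  # drop infeasible agents/bindings
        covered = set().union(*R_hat.values()) if R_hat else set()
        if covered != AP_psi:
            continue  # some binding can no longer be assigned
        z, _, z2 = edge
        if z2 in F:
            return path, R_hat  # reached an accepting state
        for nxt in delta_B:
            za, _, zb = nxt
            if za != z2:
                continue
            # after a self-loop take a moving edge, and vice versa
            if (z == z2 and z2 != zb) or (z != z2 and z2 == zb):
                stack.append((nxt, R_hat, path + [nxt]))
    return None

# toy Buchi structure: z0 -> z1 with waiting self-loops on both states
delta_B = {("z0", "t0", "z0"), ("z0", "a", "z1"), ("z1", "t1", "z1")}
result = find_team_path(delta_B, "z0", {"z1"},
                        {"green": {1}, "pink": {2, 3}},
                        lambda R_hat, edge: R_hat, {1, 2, 3})
```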

5.5. Synthesis and Execution of Control and Synchronization Policies

Given an accepting trace $\beta$ through $\mathcal{B}$ and the corresponding self-transitions $\delta_{self}$ that are valid for all agents in $R_{\hat{A}}$, we synthesize control and synchronization for each agent such that the overall team execution satisfies $\beta$ (Alg. 2). For each transition $(z,\sigma,z^{\prime})$ in $\beta$, we find $\overline{R}$, which contains the binding assignments of all agents that require synchronization at state $z^{\prime}$. Agent $j$ participates in the synchronization step if $r_{j}$ contains a binding $\rho$ that is required by $\sigma$ and agent $j$ is not the only agent assigned bindings from $\sigma$ (Alg. 2).

Subsequently, agent $j$ finds an accepting trace in $\mathcal{G}_{j}$ that reaches $z^{\prime}$ with minimum cost, following self-transitions stored in $\delta_{self}$ if necessary. As it executes this behavior, it communicates to the other agents the tuple $p$, which contains 1) its ID, 2) the state $z^{\prime}$ it is currently going to, and 3) whether it is ready for synchronization (Alg. 2). If no synchronization is required, the agent simply executes the behavior. Otherwise, to guarantee that the behavior does not violate the requirements of the task, the agent executes the synthesized behavior up until the penultimate state, $z_{wait}$.

When the agent reaches $z_{wait}$, it signals to the other agents that it is ready for synchronization. Since all agents know the overall teaming assignment, the agent waits in state $z_{wait}$ until it receives a signal that all other agents in $\overline{R}$ are ready (Alg. 2). These agents then move to the next state in the behavior simultaneously. Agent $j$ continues synthesizing behavior through $\beta$ until synchronization is necessary again, and this process repeats.

Input: $\mathcal{G}_{j}$, $r_{j}$, $R_{\hat{A}}$, $\beta$, $\delta_{self}$

1   for $(z,\sigma,z^{\prime})\in\beta$ do
2       $b_{j}=\textsc{find\_behavior}(\mathcal{G}_{j},r_{j},(z,\sigma,z^{\prime}),\delta_{self})$
3       $\overline{R}=\{r_{k}\in R_{\hat{A}}\ |\ r_{k}\cap\mathfrak{B}(\sigma)\neq\emptyset\}$
4       if $r_{j}\not\in\overline{R}$ or $\overline{R}=\{r_{j}\}$ then
5           $p=()$
6           $\textsc{execute}(b_{j},p)$
7       else
8           $p=(j,z^{\prime},0)$, $\ell=\text{length}(b_{j})$
9           $\textsc{execute}(b_{j}[1:\ell-1],p)$
10          $z_{wait}=b_{j}[\ell-1]$, $P=\{j\}$
11          while $\bigcup_{i\in P}(r_{i}\in\overline{R})\neq\mathfrak{B}(\sigma)$ do
12              $p=(j,z^{\prime},1)$
13              $\textsc{execute}(z_{wait},p)$
14              $P=\{j\}\cup\{k\ |\ (k,z^{\prime},1)\in\textsc{receive}()\}$
15          $\textsc{execute}(b_{j}[\ell])$

Algorithm 2: Synthesize an Agent's Behavior
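The synchronization bookkeeping in Alg. 2 reduces to two small set computations per transition, sketched below (hypothetical helpers; agent IDs and bindings are illustrative):

```python
def sync_group(R_hat, sigma_bindings):
    """R-bar: agents whose binding assignment intersects the bindings
    appearing on the current Buchi transition must synchronize there."""
    return {j for j, r in R_hat.items() if r & sigma_bindings}

def must_wait(j, R_hat, sigma_bindings):
    """Agent j waits at z_wait only if it is involved in the transition
    and it is not the only involved agent."""
    group = sync_group(R_hat, sigma_bindings)
    return j in group and group != {j}

# the paper's example team assignment
R_hat = {"green": {1}, "blue": {3}, "orange": {1}, "pink": {2, 3}}
```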

6. Results and Discussion

Fig. 5 shows the final step of the synchronized behavior of the agent team for the example in Section 4, where $\hat{A}=\{A_{green},A_{blue},A_{orange},A_{pink}\}$ with binding assignments $r_{green}=\{1\}$, $r_{blue}=\{3\}$, $r_{orange}=\{1\}$, $r_{pink}=\{2,3\}$. A simulation of the full behavior is shown in the accompanying video.

Optimizing teams: Our synthesis algorithm can be seen as a greatest-fixpoint computation: we start with the full set of agents and remove those that cannot contribute to the task. As a result, the team may have redundancies, i.e., agents that can be removed while still guaranteeing the overall task; this may be beneficial for robustness. Furthermore, we can choose a sub-team to optimize different metrics, as long as the agents' binding assignments still cover all the required bindings. For example, minimizing the number of bindings per agent could result in $\hat{A}=\{A_{green},A_{blue},A_{pink}\}$, $r_{green}=\{1\}$, $r_{blue}=\{3\}$, $r_{pink}=\{2\}$; minimizing the number of agents results in $\hat{A}=\{A_{green},A_{pink}\}$, $r_{green}=\{1\}$, $r_{pink}=\{2,3\}$.

To illustrate other possible metrics, we consider a set of 20 agents and create a team for the specification in Eq. 4. Their final binding assignments and costs are shown in Table 1. Minimizing cost results in the team $\hat{A}=\{A_{7},A_{11}\}$. Minimizing cost while requiring each binding to be assigned to two agents results in $\hat{A}=\{A_{4},A_{7},A_{11},A_{16}\}$.
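As a hypothetical illustration of such post-hoc team optimization, a brute-force selector over a few rows of Table 1 (agents 4, 7, 11, and 16) recovers both teams:

```python
from itertools import combinations

# (binding assignment, cost) for a few agents from Table 1
agents = {4: ({2, 3}, 1.3), 7: ({1}, 0.65), 11: ({2, 3}, 0.9),
          16: ({1}, 0.775)}

def cheapest_team(agents, bindings, copies=1):
    """Pick the minimum-cost sub-team covering every binding at least
    `copies` times (exhaustive search; fine for small instances)."""
    best = None
    ids = sorted(agents)
    for k in range(1, len(ids) + 1):
        for team in combinations(ids, k):
            covers = all(
                sum(b in agents[j][0] for j in team) >= copies
                for b in bindings)
            if covers:
                cost = sum(agents[j][1] for j in team)
                if best is None or cost < best[1]:
                    best = (set(team), cost)
    return best
```

With copies=1 this selects agents 7 and 11; with copies=2 it selects all four agents, matching the teams reported above.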

Figure 5. The final step in the synchronized behavior of the agent team with their corresponding actions.
Agent | $r_j$ | cost || Agent | $r_j$ | cost || Agent | $r_j$ | cost || Agent | $r_j$ | cost || Agent | $r_j$ | cost
1 | 1 | 1.2 || 5 | 3 | 2.75 || 9 | 3 | 2.6 || 13 | 3 | 2.0 || 17 | 2,3 | 3.275
2 | 3 | 1.0 || 6 | 1 | 0.95 || 10 | 1 | 2.8 || 14 | 1 | 1.2 || 18 | 3 | 2.55
3 | 1 | 1.2 || 7 | 1 | 0.65 || 11 | 2,3 | 0.9 || 15 | 3 | 1.1 || 19 | 1 | 1.9
4 | 2,3 | 1.3 || 8 | 1 | 1.0 || 12 | 2,3 | 1.825 || 16 | 1 | 0.775 || 20 | 2,3 | 2.35
Table 1. Example teaming assignment with 20 robots

Computational complexity: The control synthesis algorithm (Alg. 2) is agnostic to the number of agents, since each agent determines its own possible binding assignments and behavior. The team assignment (Alg. 1) is a DFS, and we store the agent team and their possible binding assignments as we build an accepting trace. Thus, it has both space and time complexity $O(|E|\cdot 2^{m}\cdot n)$, where $|E|$ is the number of edges in $\mathcal{B}$, $m$ is the number of bindings, and $n$ is the number of agents.

Fig. 6(a) shows the computation time of the synthesis framework (Sec. 5.2–5.4) for simulated agent teams in which we vary the number of agents from 3 to 20, running 30 simulations for each team size and randomizing the agents' capabilities. The task for each simulation is the example in Eq. 4. We also ran simulations in which we increase the number of bindings from 3 to 10 and randomize the capabilities of 4 agents (Fig. 6(b)). The variance in computation time is a result of the randomized agent capabilities, which affect the computation time of finding possible binding assignments (Sec. 5.3). All simulations ran on a 2.5 GHz quad-core Intel Core i7 CPU.

Figure 6. Computation time when increasing the number of agents (a) and the number of bindings (b). The error bars represent min/max values.

Task expressivity with respect to other approaches: We compare LTL$^{\psi}$ to other approaches that encode collaborative heterogeneous multi-agent tasks using temporal logic.

Standard LTL: One approach is to use LTL to express the task by enumerating all possible assignments in the specification. In our example, Eq. 4a would be rewritten as:

$\varphi^{\psi}_{1}=(\Diamond((region_{B}^{green}\wedge moisture^{green}\wedge UV^{green})\wedge(region_{A}^{blue}\wedge pickup^{blue})))\ \vee\ (\Diamond((region_{B}^{green}\wedge moisture^{green}\wedge UV^{green})\wedge(region_{A}^{orange}\wedge pickup^{orange})))\ \vee\ ...$

where each agent has its own unique set of $AP$, denoted here by each proposition's superscript. As a result, the size of the specification grows exponentially with the number of agents, since it must enumerate all possible agent assignments. Another drawback of using LTL for such tasks is that the specification is not generalizable to any number of agents; it must be rewritten whenever the set of agents changes.

LTL$^{\chi}$: In Luo and Zavlanos (2022), tasks are written in LTL$^{\chi}$, where a proposition $\pi_{i,j}^{k,\chi}$ is true if at least $i$ agents of type $j$ are in region $k$ with binding $\chi$. We can express $\varphi^{\psi}_{1}$ (Eq. 4a) of our example as $\Diamond(\pi_{1,mois}^{regionB,2}\wedge\pi_{1,UV}^{regionB,2}\wedge\pi_{1,mois}^{regionB,3}\wedge\pi_{1,UV}^{regionB,3}\wedge\pi_{1,arm}^{regionA,1})$. The truth value of $\pi_{i,j}^{k,\chi}$ does not depend on any particular action an agent might take. LTL$^{\chi}$ can be extended to action propositions, but since an agent can only be categorized as one type, each type of agent must have non-overlapping capabilities (here, we have written the LTL$^{\chi}$ formula such that each type of agent has only one capability). In addition, $\varphi^{\psi}_{2}$ (Eq. 4b) cannot be written in LTL$^{\chi}$ because the negation defined in our grammar cannot be expressed in LTL$^{\chi}$. On the other hand, the negative proposition $\neg\pi_{i,j}^{k,\chi}$ from Luo and Zavlanos (2022) is equivalent to “fewer than $i$ agents of type $j$ are in region $k$”, which our logic cannot encode.

Capability Temporal Logic (CaTL): Tasks in CaTL Leahy et al. (2022) are constructed over tasks $T=(d,\pi,cp_{T})$, where $d$ is a duration of time, $\pi$ is a region in $AP$, and each $(c_{i},m_{i})\in cp_{T}$ denotes that at least $m_{i}$ agents with capability $c_{i}$ are required. Similar to our grammar, CaTL allows agents to have multiple capabilities, but each task must specify the number of agents required. Since CaTL is an extension of Signal Temporal Logic, tasks can include timing requirements, which our logic cannot encode. However, CaTL does not include the concept of binding assignments; in our example $\varphi^{\psi}_{1}$ (Eq. 4a), CaTL cannot express that we require the same agent that took a UV measurement to also take a thermal image. Ignoring binding assignments and adding timing constraints, $\varphi^{\psi}_{1}$ (Eq. 4a) can be rewritten in CaTL as $\Diamond_{[0,10)}(T(0.1,region_{B},\{(moisture,2),(UV,2)\})\wedge T(0.5,region_{A},\{(arm,1)\}))$. Each capability in CaTL is represented as a sensor and therefore cannot capture more complex capabilities, such as a robot arm that can perform several different actions. In addition, because CaTL requires the formula to be in positive normal form (i.e., no negation), we cannot express $\varphi^{\psi}_{2}$ (Eq. 4b) in this grammar.

7. Conclusion

We define a new task grammar for heterogeneous teams of agents and develop a framework to automatically assign the task to a (sub)team of agents and synthesize correct-by-construction control policies to satisfy the task. We include synchronization constraints to guarantee that the agents perform the necessary collaborations.

In future work, we plan to demonstrate the approach on physical systems where we need to ensure that the continuous execution satisfies all the collaboration and safety constraints. In addition, we will explore different notions of optimality when finding a teaming plan, as well as increase the expressivity of the grammar by allowing reactive tasks where agents modify their behavior at runtime in response to environment events.

Acknowledgments

This work is supported by the National Defense Science & Engineering Graduate (NDSEG) Fellowship Program.

References

  • Baier and Katoen (2008) Christel Baier and Joost-Pieter Katoen. 2008. Principles of Model Checking. The MIT Press.
  • Chen et al. (2021) Ji Chen, Ruojia Sun, and Hadas Kress-Gazit. 2021. Distributed Control of Robotic Swarms from Reactive High-Level Specifications. In 2021 IEEE 17th International Conference on Automation Science and Engineering (CASE). 1247–1254. https://doi.org/10.1109/CASE49439.2021.9551578
  • Chen et al. (2011) Yushan Chen, Xu Chu Ding, and Calin Belta. 2011. Synthesis of distributed control and communication schemes from global LTL specifications. In 2011 50th IEEE Conference on Decision and Control and European Control Conference. 2718–2723. https://doi.org/10.1109/CDC.2011.6160740
  • Duret-Lutz et al. (2016) Alexandre Duret-Lutz, Alexandre Lewkowicz, Amaury Fauchille, Thibaud Michaud, Etienne Renault, and Laurent Xu. 2016. Spot 2.0 — a framework for LTL and ω\omega-automata manipulation. In Proceedings of the 14th International Symposium on Automated Technology for Verification and Analysis (ATVA’16) (Lecture Notes in Computer Science, Vol. 9938). Springer, 122–129. https://doi.org/10.1007/978-3-319-46520-3_8
  • Emerson (1990) E. Allen Emerson. 1990. Temporal and Modal Logic. In Formal Models and Semantics, JAN Van Leeuwen (Ed.). Elsevier, Amsterdam, 995–1072. https://doi.org/10.1016/B978-0-444-88074-1.50021-4
  • Fang and Kress-Gazit (2022) Amy Fang and Hadas Kress-Gazit. 2022. Automated Task Updates of Temporal Logic Specifications for Heterogeneous Robots. In 2022 International Conference on Robotics and Automation (ICRA). 4363–4369. https://doi.org/10.1109/ICRA46639.2022.9812045
  • Faruq et al. (2018) Fatma Faruq, David Parker, Bruno Lacerda, and Nick Hawes. 2018. Simultaneous Task Allocation and Planning Under Uncertainty. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 3559–3564. https://doi.org/10.1109/IROS.2018.8594404
  • Gerkey and Matarić (2004) Brian P. Gerkey and Maja J. Matarić. 2004. A Formal Analysis and Taxonomy of Task Allocation in Multi-Robot Systems. The International Journal of Robotics Research 23, 9 (2004), 939–954. https://doi.org/10.1177/0278364904045564
  • Jia and Meng (2013) Xiao Jia and Max Q.-H. Meng. 2013. A survey and analysis of task allocation algorithms in multi-robot systems. In 2013 IEEE International Conference on Robotics and Biomimetics (ROBIO). 2280–2285. https://doi.org/10.1109/ROBIO.2013.6739809
  • Kantaros and Zavlanos (2020) Yiannis Kantaros and Michael M Zavlanos. 2020. STyLuS*: A Temporal Logic Optimal Control Synthesis Algorithm for Large-Scale Multi-Robot Systems. The International Journal of Robotics Research 39, 7 (2020), 812–836. https://doi.org/10.1177/0278364920913922
  • Kloetzer and Belta (2010) Marius Kloetzer and Calin Belta. 2010. Automatic Deployment of Distributed Teams of Robots From Temporal Logic Motion Specifications. IEEE Transactions on Robotics 26, 1 (2010), 48–61. https://doi.org/10.1109/TRO.2009.2035776
  • Korsah et al. (2013) G. Ayorkor Korsah, Anthony Stentz, and M. Bernardine Dias. 2013. A comprehensive taxonomy for multi-robot task allocation. The International Journal of Robotics Research 32, 12 (2013), 1495–1512. https://doi.org/10.1177/0278364913496484
  • Leahy et al. (2022) Kevin Leahy, Zachary Serlin, Cristian-Ioan Vasile, Andrew Schoer, Austin M. Jones, Roberto Tron, and Calin Belta. 2022. Scalable and Robust Algorithms for Task-Based Coordination From High-Level Specifications (ScRATCHeS). IEEE Transactions on Robotics 38, 4 (2022), 2516–2535. https://doi.org/10.1109/TRO.2021.3130794
  • Li et al. (2009) Zhiyong Li, Bo Xu, Lei Yang, Jun Chen, and Kenli Li. 2009. Quantum Evolutionary Algorithm for Multi-Robot Coalition Formation. In Proceedings of the First ACM/SIGEVO Summit on Genetic and Evolutionary Computation (Shanghai, China) (GEC ’09). Association for Computing Machinery, New York, NY, USA, 295–302. https://doi.org/10.1145/1543834.1543874
  • Luo and Zavlanos (2022) Xusheng Luo and Michael M. Zavlanos. 2022. Temporal Logic Task Allocation in Heterogeneous Multirobot Systems. IEEE Transactions on Robotics (2022), 1–20. https://doi.org/10.1109/TRO.2022.3181948
  • Sahin et al. (2017) Yunus Emre Sahin, Petter Nilsson, and Necmiye Ozay. 2017. Synchronous and asynchronous multi-agent coordination with cLTL+ constraints. In 2017 IEEE 56th Annual Conference on Decision and Control (CDC). 335–342. https://doi.org/10.1109/CDC.2017.8263687
  • Schillinger et al. (2018) Philipp Schillinger, Mathias Bürger, and Dimos V. Dimarogonas. 2018. Simultaneous task allocation and planning for temporal logic goals in heterogeneous multi-robot systems. The International Journal of Robotics Research 37, 7 (2018), 818–838. https://doi.org/10.1177/0278364918774135
  • Schmickl et al. (2006) Thomas Schmickl, Christoph Möslinger, and Karl Crailsheim. 2006. Collective Perception in a Robot Swarm. 144–157. https://doi.org/10.1007/978-3-540-71541-2_10
  • Tumova and Dimarogonas (2016) Jana Tumova and Dimos V. Dimarogonas. 2016. Multi-agent planning under local LTL specifications and event-based synchronization. Automatica 70 (2016), 239–248. https://doi.org/10.1016/j.automatica.2016.04.006
  • Ulusoy et al. (2012) Alphan Ulusoy, Stephen L. Smith, Xu Chu Ding, and Calin A. Belta. 2012. Robust multi-robot optimal path planning with temporal logic constraints. 2012 IEEE International Conference on Robotics and Automation (2012), 4693–4698.
  • Wang and Rubenstein (2020) Hanlin Wang and Michael Rubenstein. 2020. Shape Formation in Homogeneous Swarms Using Local Task Swapping. IEEE Transactions on Robotics 36, 3 (2020), 597–612. https://doi.org/10.1109/TRO.2020.2967656
  • Xu et al. (2015) Bo Xu, Zhaofeng Yang, Yu Ge, and Zhiping Peng. 2015. Coalition Formation in Multi-agent Systems Based on Improved Particle Swarm Optimization Algorithm. International Journal of Hybrid Information Technology 8 (03 2015), 1–8. https://doi.org/10.14257/ijhit.2015.8.3.01