
Parameter Estimation in Epidemic Spread Networks Using Limited Measurements

Research supported in part by the National Science Foundation, grants NSF-CMMI 1635014 and NSF-ECCS 2032258.

Lintao Ye, Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA ([email protected])    Philip E. Paré, School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA ({philpare,sundara2}@purdue.edu)    Shreyas Sundaram, School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
Abstract

We study the problem of estimating the parameters (i.e., infection rate and recovery rate) governing the spread of epidemics in networks. Such parameters are typically estimated by measuring various characteristics (such as the number of infected and recovered individuals) of the infected populations over time. However, these measurements also incur certain costs, depending on the population being tested and the times at which the tests are administered. We thus formulate the epidemic parameter estimation problem as an optimization problem, where the goal is to either minimize the total cost spent on collecting measurements, or to optimize the parameter estimates while remaining within a measurement budget. We show that these problems are NP-hard to solve in general, and then propose approximation algorithms with performance guarantees. We validate our algorithms using numerical examples.

Keywords: Epidemic spread networks, Parameter estimation, Optimization algorithms

1 Introduction

Models of spreading processes over networks have been widely studied by researchers from different fields (e.g., [14, 23, 6, 3, 25, 20]). The case of epidemics spreading through networked populations has received a particularly significant amount of attention, especially in light of the ongoing COVID-19 pandemic (e.g., [20, 21]). A canonical example is the networked SIR model, where each node in the network represents a subpopulation or an individual, and can be in one of three states: susceptible (S), infected (I), or recovered (R) [18]. Two key parameters govern such models: the infection rate of a given node, and the recovery rate of that node. In the case of a novel virus, these parameters may not be known a priori, and must be identified or estimated from gathered data, including, for instance, the number of infected and recovered individuals in the network at certain points in time. For instance, in the COVID-19 pandemic, when collecting data on the number of infected or recovered individuals in the network, one possibility is to perform virus or antibody tests on the individuals, with each test incurring a cost. Therefore, in the problem of parameter estimation in epidemic spread networks, it is important and of practical interest to take the costs of collecting the data (i.e., measurements) into account, which leads to the problem formulations considered in this paper. The goal is to exactly identify (when possible) or estimate the parameters in the networked SIR model using a limited number of measurements. Specifically, we divide our analysis into two scenarios: 1) when the measurements (e.g., the number of infected individuals) can be collected exactly without error; and 2) when only stochastic measurements can be obtained.

Under the setting when exact measurements of the infected and recovered proportions of the population at certain nodes in the network can be obtained, we formulate the Parameter Identification Measurement Selection (PIMS) problem as minimizing the cost spent on collecting the measurements, while ensuring that the parameters of the SIR model can be uniquely identified (within a certain time interval in the epidemic dynamics). In settings where the measurements are stochastic (thereby precluding exact identification of the parameters), we formulate the Parameter Estimation Measurement Selection (PEMS) problem. The goal is to optimize certain estimation metrics based on the collected measurements, while satisfying the budget on collecting the measurements.

Related Work

The authors in [22, 34] studied the parameter estimation problem in epidemic spread networks using a Susceptible-Infected-Susceptible (SIS) model of epidemics. When exact measurements of the infected proportion of the population at each node of the network can be obtained, the authors proposed a necessary and sufficient condition on the set of the collected measurements such that the parameters of the SIS model (i.e., the infection rate and the recovery rate) can be uniquely identified. However, this condition does not pose any constraint on the number of measurements that can be collected.

In [24], the authors considered a measurement selection problem in the SIR model. Their goal is to perform a limited number of virus tests among the population such that the probability of undetected asymptomatic cases is minimized. The transmission of the disease in the SIR model considered in [24] is characterized by a Bernoulli random variable, which leads to a Hidden Markov Model for the SIR dynamics.

Finally, our work is also closely related to the sensor placement problem that has been studied for control systems (e.g., [19, 37, 36]), signal processing (e.g., [7, 35]), and machine learning (e.g., [17]). The goal of these problems is to optimize certain (problem-specific) performance metrics of the estimate based on the measurements of the placed sensors, while satisfying the sensor placement budget constraints.

Contributions

First, we show that the PIMS problem is NP-hard, which precludes polynomial-time algorithms that solve the PIMS problem optimally (if P \neq NP). By exploring structural properties of the PIMS problem, we provide a polynomial-time approximation algorithm that returns a solution within a certain approximation ratio of the optimal. The approximation ratio depends on the cost structure of the measurements and on the graph structure of the epidemic spread network. Next, we show that the PEMS problem is also NP-hard. In order to provide a polynomial-time approximation algorithm that solves the PEMS problem with performance guarantees, we first show that the PEMS problem can be transformed into the problem of maximizing a set function subject to a knapsack constraint. We then apply a greedy algorithm to the (transformed) PEMS problem, and provide approximation guarantees for the greedy algorithm. Our analysis of the greedy algorithm also generalizes results from the literature on maximizing a submodular set function under a knapsack constraint to nonsubmodular settings. We use numerical examples to validate the obtained performance bounds of the greedy algorithm, and show that the greedy algorithm performs well in practice.

Notation and Terminology

The sets of integers and real numbers are denoted as \mathbb{Z} and \mathbb{R}, respectively. For a set \mathcal{S}, let |\mathcal{S}| denote its cardinality. For any n\in\mathbb{Z}_{\geq 1}, let [n]\triangleq\{1,2,\dots,n\}. Let \mathbf{0}_{m\times n} denote a zero matrix of dimension m\times n; the subscript is dropped if the dimension can be inferred from the context. For a matrix P\in\mathbb{R}^{n\times n}, let P^{\top}, \operatorname{tr}(P) and \det(P) be its transpose, trace and determinant, respectively. The eigenvalues of P are ordered such that |\lambda_{1}(P)|\geq\cdots\geq|\lambda_{n}(P)|. Let P_{ij} (or (P)_{ij}) denote the element in the ith row and jth column of P, and let (P)_{i} denote the ith row of P. A positive semidefinite matrix P\in\mathbb{R}^{n\times n} is denoted by P\succeq\mathbf{0}.

2 Model of Epidemic Spread Network

Suppose a disease (or virus) is spreading over a directed graph \mathcal{G}=\{\mathcal{V},\mathcal{E}\}, where \mathcal{V}\triangleq[n] is the set of n nodes, and \mathcal{E} is the set of directed edges (and self loops) that captures the interactions among the nodes in \mathcal{V}. Here, each node i\in\mathcal{V} is considered to be a group (or population) of individuals (e.g., a city or a country). A directed edge from node i to node j, where i\neq j, is denoted by (i,j). For all i\in\mathcal{V}, denote \mathcal{N}_{i}\triangleq\{j:(j,i)\in\mathcal{E}\} and \bar{\mathcal{N}}_{i}\triangleq\{j:(j,i)\in\mathcal{E}\}\cup\{i\}. For all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 0}, we let s_{i}[k], x_{i}[k] and r_{i}[k] represent the proportions of the population of node i that are susceptible, infected and recovered at time k, respectively. To describe the dynamics of the spread of the disease in \mathcal{G}, we will use the following discrete-time SIR model (e.g., [11]):

s_{i}[k+1]=s_{i}[k]-hs_{i}[k]\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k], (1a)
x_{i}[k+1]=(1-h\delta)x_{i}[k]+hs_{i}[k]\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k], (1b)
r_{i}[k+1]=r_{i}[k]+h\delta x_{i}[k], (1c)

where \beta\in\mathbb{R}_{\geq 0} is the infection rate of the disease, \delta\in\mathbb{R}_{\geq 0} is the recovery rate of the disease, h\in\mathbb{R}_{\geq 0} is the sampling parameter, and a_{ij}\in\mathbb{R}_{\geq 0} is the weight associated with edge (j,i). Let A\in\mathbb{R}^{n\times n} be a weight matrix, where A_{ij}=a_{ij} for all i,j\in\mathcal{V} such that j\in\bar{\mathcal{N}}_{i}, and A_{ij}=0 otherwise. Denote s[k]\triangleq[s_{1}[k]\ \cdots\ s_{n}[k]]^{T}\in\mathbb{R}^{n}, x[k]\triangleq[x_{1}[k]\ \cdots\ x_{n}[k]]^{T}\in\mathbb{R}^{n}, and r[k]\triangleq[r_{1}[k]\ \cdots\ r_{n}[k]]^{T}\in\mathbb{R}^{n}, for all k\in\mathbb{Z}_{\geq 0}. Throughout this paper, we assume that the weight matrix A\in\mathbb{R}^{n\times n} and the sampling parameter h\in\mathbb{R}_{\geq 0} are given.
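To make the model concrete, the update in Eq. (1) can be simulated directly. The following Python sketch (illustrative only; the function name and interface are our own, not from the paper) iterates Eqs. (1a)-(1c) for a given weight matrix A, sampling parameter h, and parameters \beta and \delta.

```python
import numpy as np

def simulate_sir(A, h, beta, delta, s0, x0, r0, T):
    """Iterate the networked SIR dynamics of Eqs. (1a)-(1c) for T steps.

    A          : (n, n) weight matrix with A[i, j] = a_ij (zero if j is not
                 an in-neighbor of i)
    h          : sampling parameter; beta, delta: infection and recovery rates
    s0, x0, r0 : length-n arrays of initial proportions
    Returns s, x, r of shape (T + 1, n)."""
    n = len(s0)
    s = np.zeros((T + 1, n)); x = np.zeros((T + 1, n)); r = np.zeros((T + 1, n))
    s[0], x[0], r[0] = s0, x0, r0
    for k in range(T):
        infection = h * beta * s[k] * (A @ x[k])  # h * s_i[k] * beta * sum_j a_ij x_j[k]
        s[k + 1] = s[k] - infection                   # Eq. (1a)
        x[k + 1] = (1 - h * delta) * x[k] + infection # Eq. (1b)
        r[k + 1] = r[k] + h * delta * x[k]            # Eq. (1c)
    return s, x, r
```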

3 Preliminaries

We make the following assumptions on the initial conditions s[0], x[0] and r[0], and on the parameters of the SIR model in Eq. (1) (e.g., [22, 11]).

Assumption 3.1.

For all i\in\mathcal{V}, s_{i}[0]\in(0,1], x_{i}[0]\in[0,1), r_{i}[0]=0, and s_{i}[0]+x_{i}[0]=1.

Assumption 3.2.

Assume that h,\beta,\delta\in\mathbb{R}_{>0} with h\delta<1. For all i,j\in\mathcal{V} with (j,i)\in\mathcal{E} and i\neq j, assume that a_{ij}\in\mathbb{R}_{>0}. For all i\in\mathcal{V}, h\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}<1.

Definition 3.3.

Consider a directed graph \mathcal{G}=\{\mathcal{V},\mathcal{E}\} with \mathcal{V}=[n]. A directed path of length t from node i_{0} to node i_{t} in \mathcal{G} is a sequence of t directed edges (i_{0},i_{1}),\dots,(i_{t-1},i_{t}). For any distinct pair of nodes i,j\in\mathcal{V} such that there exists a directed path from i to j, the distance from node i to node j, denoted as d_{ij}, is defined as the shortest length over all such paths.

Definition 3.4.

Define \mathcal{S}_{I}\triangleq\{i:x_{i}[0]>0,i\in\mathcal{V}\} and \mathcal{S}_{H}\triangleq\{i:x_{i}[0]=0,i\in\mathcal{V}\}. For all i\in\mathcal{S}_{H}, define d_{i}\triangleq\min_{j\in\mathcal{S}_{I}}d_{ji}, where d_{i}\geq 1, and d_{i}\triangleq+\infty if there is no path from any j\in\mathcal{S}_{I} to i. For all i\in\mathcal{S}_{I}, define d_{i}\triangleq 0.

Using similar arguments to those in [11], one can show that s_{i}[k],x_{i}[k],r_{i}[k]\in[0,1] with s_{i}[k]+x_{i}[k]+r_{i}[k]=1 for all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 0} under Assumptions 3.1-3.2. Thus, given x_{i}[k] and r_{i}[k], we can obtain s_{i}[k]=1-x_{i}[k]-r_{i}[k] for all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 0}. We also have the following result that characterizes properties of x_{i}[k] and r_{i}[k] in the SIR model over \mathcal{G} given by Eq. (1); the proof can be found in Section 7.1 in the Appendix.

Lemma 3.5.

Consider a directed graph \mathcal{G}=\{\mathcal{V},\mathcal{E}\} with \mathcal{V}=[n] and the SIR dynamics given by Eq. (1). Suppose Assumptions 3.1-3.2 hold. Then, the following results hold for all i\in\mathcal{V}, where k\in\mathbb{Z}_{\geq 0}, and \mathcal{S}_{H} and d_{i} are defined in Definition 3.4.
(a) s_{i}[k]>0 for all k\geq 0.
(b) If d_{i}\neq+\infty, then x_{i}[k]=0 for all k<d_{i}, and x_{i}[k]\in(0,1) for all k\geq d_{i}. (Note that for the case when d_{i}=0, i.e., i\in\mathcal{S}_{I}, part (b) implies x_{i}[k]>0 for all k\geq 0.)
(c) If d_{i}\neq+\infty, then r_{i}[k]=0 for all k\leq d_{i}, and r_{i}[k]\in(0,1) for all k>d_{i}.
(d) If i\in\mathcal{S}_{H} with d_{i}=+\infty, then x_{i}[k]=0 and r_{i}[k]=0 for all k\geq 0.
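Since d_{i} in Definition 3.4 is a multi-source shortest-path distance from \mathcal{S}_{I}, it can be computed by breadth-first search (BFS is also the tool invoked for this purpose in Section 4.2). The Python sketch below is our own illustrative helper, not from the paper; combined with Lemma 3.5, its output tells us exactly which x_{i}[k] and r_{i}[k] are guaranteed to be zero.

```python
from collections import deque

def distances_from_infected(n, edges, S_I):
    """Multi-source BFS computing d_i from Definition 3.4 for every node.

    n     : number of nodes (labeled 0..n-1 here for convenience)
    edges : directed edges (j, i) meaning j -> i, as in the paper's notation
    S_I   : set of initially infected nodes (x_i[0] > 0)
    Returns d with d[i] = 0 for i in S_I, the shortest distance from S_I to i
    otherwise, and float('inf') if no node in S_I can reach i."""
    out_nbrs = [[] for _ in range(n)]
    for (j, i) in edges:
        if j != i:                     # self loops do not shorten any path
            out_nbrs[j].append(i)
    d = [float('inf')] * n
    queue = deque()
    for i in S_I:
        d[i] = 0
        queue.append(i)
    while queue:
        u = queue.popleft()
        for v in out_nbrs[u]:
            if d[v] == float('inf'):
                d[v] = d[u] + 1
                queue.append(v)
    return d  # by Lemma 3.5, x_i[k] = 0 for all k < d[i], r_i[k] = 0 for all k <= d[i]
```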

4 Measurement Selection Problem in Exact Measurement Setting

Throughout this section, we assume that the sets \mathcal{S}_{I},\mathcal{S}_{H}\subseteq\mathcal{V} defined in Definition 3.4 are known, i.e., we know the set of nodes in \mathcal{V} that have infected individuals initially.

4.1 Problem Formulation

Given exact measurements of x_{i}[k] and r_{i}[k] for a subset of nodes, our goal is to estimate (or uniquely identify, if possible) the unknown parameters \beta and \delta. Here, we consider the scenario where collecting the measurement of x_{i}[k] (resp., r_{i}[k]) at any node i\in\mathcal{V} and at any time step k\in\mathbb{Z}_{\geq 0} incurs a cost, denoted as c_{k,i}\in\mathbb{R}_{\geq 0} (resp., b_{k,i}\in\mathbb{R}_{\geq 0}). Moreover, we can only collect the measurements of x_{i}[k] and r_{i}[k] for k\in\{t_{1},t_{1}+1,\dots,t_{2}\}, where t_{1},t_{2}\in\mathbb{Z}_{\geq 0} are given with t_{2}>t_{1}. Noting that Lemma 3.5 provides a necessary and sufficient condition under which x_{i}[k]=0 (resp., r_{i}[k]=0) holds, one does not need to collect a measurement of x_{i}[k] (resp., r_{i}[k]) if Lemma 3.5 already implies x_{i}[k]=0 (resp., r_{i}[k]=0). Given time steps t_{1},t_{2}\in\mathbb{Z}_{\geq 0} with t_{2}>t_{1}, we define the set

\mathcal{I}_{t_{1}:t_{2}}\triangleq\{\lambda_{i}[k]:k\in\{t_{1},\dots,t_{2}\},i\in\mathcal{V},\lambda_{i}[k]>0,\lambda\in\{x,r\}\}, (2)

which represents the set of all candidate measurements from time step t_{1} to time step t_{2}. To proceed, we first use Eqs. (1b)-(1c) to obtain

\begin{bmatrix}x[t_{1}+1]-x[t_{1}]\\ \vdots\\ x[t_{2}]-x[t_{2}-1]\\ r[t_{1}+1]-r[t_{1}]\\ \vdots\\ r[t_{2}]-r[t_{2}-1]\end{bmatrix}=h\begin{bmatrix}\Phi_{t_{1}:t_{2}-1}^{x}\\ \Phi_{t_{1}:t_{2}-1}^{r}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, (3)

where \Phi_{t_{1}:t_{2}-1}^{x}\triangleq\begin{bmatrix}(\Phi_{t_{1}}^{x})^{T}&\cdots&(\Phi_{t_{2}-1}^{x})^{T}\end{bmatrix}^{T} with

\Phi_{k}^{x}\triangleq\begin{bmatrix}s_{1}[k]\sum_{j\in\bar{\mathcal{N}}_{1}}a_{1j}x_{j}[k]&-x_{1}[k]\\ \vdots&\vdots\\ s_{n}[k]\sum_{j\in\bar{\mathcal{N}}_{n}}a_{nj}x_{j}[k]&-x_{n}[k]\end{bmatrix},\ \forall k\in\{t_{1},\dots,t_{2}-1\}, (4)

and \Phi_{t_{1}:t_{2}-1}^{r}\triangleq\begin{bmatrix}(\Phi_{t_{1}}^{r})^{T}&\cdots&(\Phi_{t_{2}-1}^{r})^{T}\end{bmatrix}^{T} with

\Phi_{k}^{r}\triangleq\begin{bmatrix}0&x_{1}[k]\\ \vdots&\vdots\\ 0&x_{n}[k]\end{bmatrix},\ \forall k\in\{t_{1},\dots,t_{2}-1\}. (5)

We can then view Eq. (3) as a set of 2(t_{2}-t_{1})n equations in \beta and \delta. Noting that s_{i}[k] for all i\in\mathcal{V} can be obtained from s_{i}[k]=1-x_{i}[k]-r_{i}[k] as argued in Section 3, we see that the coefficients in the set of equations in \beta and \delta given by Eq. (3), i.e., the terms in Eq. (3) other than \beta and \delta, can be determined given that x[k] and r[k] are known for all k\in\{t_{1},\dots,t_{2}\}. Also note that given x[k] and r[k] for all k\in\{t_{1},\dots,t_{2}\}, we can uniquely identify \beta and \delta using Eq. (3) if and only if \operatorname{rank}(\begin{bmatrix}(\Phi_{t_{1}:t_{2}-1}^{x})^{T}&(\Phi_{t_{1}:t_{2}-1}^{r})^{T}\end{bmatrix})=2.
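As a sanity check of this identifiability condition, one can assemble the stacked coefficient matrix from Eqs. (4)-(5) and, when it has rank 2, recover \theta=[\beta\ \delta]^{T} by least squares. The Python sketch below is our own illustration (assuming the full trajectories x[k] and r[k] are available for k\in\{t_{1},\dots,t_{2}\}), not an algorithm from the paper.

```python
import numpy as np

def identify_beta_delta(A, h, x, r, t1, t2):
    """Stack the equations of Eq. (3) and recover theta = [beta, delta].

    x, r : arrays of shape (>= t2 + 1, n) holding the exact trajectories
           x[k], r[k] for k = t1, ..., t2 (s[k] = 1 - x[k] - r[k], Section 3)
    Raises if the stacked coefficient matrix does not have rank 2."""
    n = x.shape[1]
    rows, rhs = [], []
    for k in range(t1, t2):
        s_k = 1.0 - x[k] - r[k]
        for i in range(n):
            rows.append([s_k[i] * (A[i] @ x[k]), -x[k, i]])  # Phi_{k,i}^x, Eq. (8a)
            rhs.append(x[k + 1, i] - x[k, i])
            rows.append([0.0, x[k, i]])                      # Phi_{k,i}^r, Eq. (8b)
            rhs.append(r[k + 1, i] - r[k, i])
    Phi, y = np.array(rows), np.array(rhs) / h               # Eq. (3): Phi theta = y
    if np.linalg.matrix_rank(Phi) < 2:
        raise ValueError("beta and delta are not identifiable: rank(Phi) < 2")
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta  # with exact data the residual is zero and theta = [beta, delta]
```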

Next, let \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}} denote a measurement selection strategy, where \mathcal{I}_{t_{1}:t_{2}} is given by Eq. (2). We will then consider identifying \beta and \delta using the measurements contained in \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}. To illustrate our analysis, given any i\in\mathcal{V} and any k\in\{t_{1},\dots,t_{2}-1\}, we first consider the following equation from Eq. (3):

x_{i}[k+1]-x_{i}[k]=h\begin{bmatrix}s_{i}[k]\sum_{w\in\bar{\mathcal{N}}_{i}}a_{iw}x_{w}[k]&-x_{i}[k]\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, (6)

where s_{i}[k]=1-x_{i}[k]-r_{i}[k], and we index the equation in Eq. (3) corresponding to Eq. (6) as (k,i,x). Note that in order to use Eq. (6) in identifying \beta and \delta, one needs to determine the coefficients (i.e., the terms other than \beta and \delta) in the equation. To determine the coefficients in equation (k,i,x), one can use the measurements contained in \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, and use Lemma 3.5 to determine if x_{i}[k]=0 (resp., r_{i}[k]=0) holds. Supposing x_{i}[k+1]=0, we see from Lemma 3.5 and Eq. (1b) that x_{i}[k]=0 and s_{i}[k]\sum_{w\in\bar{\mathcal{N}}_{i}}a_{iw}x_{w}[k]=0, which makes equation (k,i,x) useless in identifying \beta and \delta. Thus, in order to use equation (k,i,x) in identifying \beta and \delta, we need x_{i}[k+1]\in\mathcal{I} with x_{i}[k+1]>0. Similarly, given any i\in\mathcal{V} and any k\in\{t_{1},\dots,t_{2}-1\}, we consider the following equation from Eq. (3):

r_{i}[k+1]-r_{i}[k]=h\begin{bmatrix}0&x_{i}[k]\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, (7)

where we index the above equation as (k,i,r). Supposing r_{i}[k+1]=0, we see from Lemma 3.5 and Eq. (1c) that r_{i}[k]=x_{i}[k]=0, which makes equation (k,i,r) useless in identifying \beta and \delta. Hence, in order to use equation (k,i,r) in identifying \beta and \delta, we need to have \{x_{i}[k],r_{i}[k+1]\}\subseteq\mathcal{I} with x_{i}[k]>0 and r_{i}[k+1]>0. More precisely, equation (k,i,r) can be used in identifying \beta and \delta if and only if \{x_{i}[k],r_{i}[k+1]\}\subseteq\mathcal{I}, and r_{i}[k]\in\mathcal{I} or r_{i}[k]=0 (from Lemma 3.5).

In general, let us denote the two coefficient matrices corresponding to equations (k,i,x) and (k,i,r) in Eq. (3), respectively, as

\Phi_{k,i}^{x}\triangleq\begin{bmatrix}s_{i}[k]\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k]&-x_{i}[k]\end{bmatrix}, (8a)
\Phi_{k,i}^{r}\triangleq\begin{bmatrix}0&x_{i}[k]\end{bmatrix}, (8b)

for all k\in\{t_{1},\dots,t_{2}-1\} and for all i\in\mathcal{V}. Moreover, given any measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, we let

\bar{\mathcal{I}}\triangleq\{(k,i,x):x_{i}[k+1]\in\mathcal{I},x_{i}[k]=0\}\cup\{(k,i,x):\{x_{i}[k+1],x_{i}[k]\}\subseteq\mathcal{I}\}\cup\{(k,i,r):\{r_{i}[k+1],x_{i}[k]\}\subseteq\mathcal{I},r_{i}[k]=0\}\cup\{(k,i,r):\{r_{i}[k+1],r_{i}[k],x_{i}[k]\}\subseteq\mathcal{I}\} (9)

be the set that contains the indices of the equations from Eq. (3) that can potentially be used in identifying \beta and \delta, based on the measurements contained in \mathcal{I}. In other words, the coefficients on the left-hand side of equation (k,i,x) (resp., (k,i,r)) can be determined using the measurements from \mathcal{I} and using Lemma 3.5, for all (k,i,x)\in\bar{\mathcal{I}} (resp., (k,i,r)\in\bar{\mathcal{I}}). Let us now consider the coefficient matrix \Phi_{k,i}^{x} (resp., \Phi_{k,i}^{r}) corresponding to (k,i,x)\in\bar{\mathcal{I}} (resp., (k,i,r)\in\bar{\mathcal{I}}). One can show that there may exist equations in \bar{\mathcal{I}} whose coefficients cannot be (directly) determined using the measurements contained in \mathcal{I} or using Lemma 3.5, where the undetermined coefficients come from the first element of \Phi_{k,i}^{x} given by Eq. (8a). Nevertheless, it is also possible that one can perform algebraic operations among the equations in \bar{\mathcal{I}} such that the undetermined coefficients cancel. Formally, we define the following.

Definition 4.1.

Consider a measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, where \mathcal{I}_{t_{1}:t_{2}} is given by Eq. (2). Stack the coefficient matrices \Phi_{k,i}^{x}\in\mathbb{R}^{1\times 2} for all (k,i,x)\in\bar{\mathcal{I}} and \Phi_{k,i}^{r}\in\mathbb{R}^{1\times 2} for all (k,i,r)\in\bar{\mathcal{I}} into a single matrix, where \Phi_{k,i}^{x} and \Phi_{k,i}^{r} are given by (8) and \bar{\mathcal{I}} is given by Eq. (9). The resulting matrix is denoted as \Phi(\mathcal{I})\in\mathbb{R}^{|\bar{\mathcal{I}}|\times 2}. Moreover, define \tilde{\Phi}(\mathcal{I}) to be the set that contains all the matrices \Phi\in\mathbb{R}^{2\times 2} such that (\Phi)_{1} and (\Phi)_{2} can be obtained via algebraic operations among the rows of \Phi(\mathcal{I}), and the elements of (\Phi)_{1} and (\Phi)_{2} can be fully determined using the measurements from \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}} and using Lemma 3.5.

In other words, \Phi\in\tilde{\Phi}(\mathcal{I}) corresponds to two equations (in \beta and \delta) obtained from Eq. (3) such that the coefficients on the right-hand side of the two equations can be determined using the measurements contained in \mathcal{I} and using Lemma 3.5 (if the coefficients contain x_{i}[k]=0 or r_{i}[k]=0). Moreover, one can show that the coefficients on the left-hand side of the two equations obtained from Eq. (3) corresponding to \Phi can also be determined using measurements from \mathcal{I} and using Lemma 3.5. Putting the above arguments together, we see that given a measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, \beta and \delta can be uniquely identified if and only if there exists \Phi\in\tilde{\Phi}(\mathcal{I}) such that \operatorname{rank}(\Phi)=2. Equivalently, denoting

r_{\max}(\mathcal{I})\triangleq\max_{\Phi\in\tilde{\Phi}(\mathcal{I})}\operatorname{rank}(\Phi), (10)

where r_{\max}(\mathcal{I})\triangleq 0 if \tilde{\Phi}(\mathcal{I})=\emptyset, we see that \beta and \delta can be uniquely identified using the measurements from \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}} if and only if r_{\max}(\mathcal{I})=2.

Remark 4.2.

Note that if a measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}} satisfies r_{\max}(\mathcal{I})=2, it follows from the above arguments that |\bar{\mathcal{I}}|\geq 2, i.e., \Phi(\mathcal{I})\in\mathbb{R}^{|\bar{\mathcal{I}}|\times 2} has at least two rows, where \bar{\mathcal{I}} is defined in Eq. (9).

Recall that collecting the measurement of x_{i}[k] (resp., r_{i}[k]) at any node i\in\mathcal{V} and at any time step k\in\mathbb{Z}_{\geq 0} incurs cost c_{k,i}\in\mathbb{R}_{\geq 0} (resp., b_{k,i}\in\mathbb{R}_{\geq 0}). Given any measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, we denote the cost associated with \mathcal{I} as

c(\mathcal{I})\triangleq\sum_{x_{i}[k]\in\mathcal{I}}c_{k,i}+\sum_{r_{i}[k]\in\mathcal{I}}b_{k,i}. (11)

We then define the Parameter Identification Measurement Selection (PIMS) problem in the exact measurement setting as follows.

Problem 4.3.

Consider a discrete-time SIR model given by Eq. (1) with a directed graph \mathcal{G}=\{\mathcal{V},\mathcal{E}\}, a weight matrix A\in\mathbb{R}^{n\times n}, a sampling parameter h\in\mathbb{R}_{\geq 0}, and sets \mathcal{S}_{I},\mathcal{S}_{H}\subseteq\mathcal{V} defined in Definition 3.4. Moreover, consider time steps t_{1},t_{2}\in\mathbb{Z}_{\geq 0} with t_{1}<t_{2}, and a cost c_{k,i}\in\mathbb{R}_{\geq 0} of measuring x_{i}[k] and a cost b_{k,i}\in\mathbb{R}_{\geq 0} of measuring r_{i}[k] for all i\in\mathcal{V} and for all k\in\{t_{1},\dots,t_{2}\}. The PIMS problem is to find \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}} that solves

\min_{\mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}} c(\mathcal{I}) \quad \text{s.t.} \quad r_{\max}(\mathcal{I})=2, (12)

where \mathcal{I}_{t_{1}:t_{2}} is defined in Eq. (2), c(\mathcal{I}) is defined in Eq. (11), and r_{\max}(\mathcal{I}) is defined in Eq. (10).

We have the following result; the proof is included in Section 7.2 in the Appendix.

Theorem 4.4.

The PIMS problem is NP-hard.

Theorem 4.4 indicates that there is no polynomial-time algorithm that solves all instances of the PIMS problem optimally (if P \neq NP). Moreover, we note from the formulation of the PIMS problem given by Problem 4.3 that for a measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, one needs to check whether \max_{\Phi\in\tilde{\Phi}(\mathcal{I})}\operatorname{rank}(\Phi)=2 holds before the corresponding measurements are collected. However, in general, it is not possible to calculate \operatorname{rank}(\Phi) when no measurements are collected. In order to bypass these issues, we will explore additional properties of the PIMS problem in the following.

4.2 Solving the PIMS Problem

We start with the following result.

Lemma 4.5.

Consider a discrete-time SIR model given by Eq. (1). Suppose Assumptions 3.1-3.2 hold. Then, the following results hold, where \Phi_{k_{1},i_{1}}^{x}\in\mathbb{R}^{1\times 2} and \Phi_{k_{2},i_{2}}^{r}\in\mathbb{R}^{1\times 2} are defined in (8), \mathcal{S}_{I}^{\prime}\triangleq\{i\in\mathcal{S}_{I}:a_{ii}>0\}, \mathcal{S}^{\prime}\triangleq\{i\in\mathcal{V}\setminus\mathcal{S}_{I}^{\prime}:\mathcal{N}_{i}\neq\emptyset,\min\{d_{j}:j\in\mathcal{N}_{i}\}\neq\infty\}, and \mathcal{S}_{I} and d_{i} are defined in Definition 3.4 for all i\in\mathcal{V}.
(a) For any i_{1}\in\mathcal{S}_{I}^{\prime} and for any i_{2}\in\mathcal{V} with d_{i_{2}}\neq\infty, \operatorname{rank}(\begin{bmatrix}(\Phi_{k_{1},i_{1}}^{x})^{T}&(\Phi_{k_{2},i_{2}}^{r})^{T}\end{bmatrix})=2 for all k_{1}\geq 0 and for all k_{2}\geq d_{i_{2}}, where k_{1},k_{2}\in\mathbb{Z}_{\geq 0}.
(b) For any i_{1}\in\mathcal{S}^{\prime} and for any i_{2}\in\mathcal{V} with d_{i_{2}}\neq\infty, \operatorname{rank}(\begin{bmatrix}(\Phi_{k_{1},i_{1}}^{x})^{T}&(\Phi_{k_{2},i_{2}}^{r})^{T}\end{bmatrix})=2 for all k_{1}\geq\min\{d_{j}:j\in\mathcal{N}_{i_{1}}\} and for all k_{2}\geq d_{i_{2}}, where k_{1},k_{2}\in\mathbb{Z}_{\geq 0}.

Proof.

From (8), we have

\begin{bmatrix}\Phi_{k_{1},i_{1}}^{x}\\ \Phi_{k_{2},i_{2}}^{r}\end{bmatrix}=\begin{bmatrix}s_{i_{1}}[k_{1}]\sum_{j\in\bar{\mathcal{N}}_{i_{1}}}a_{i_{1}j}x_{j}[k_{1}]&-x_{i_{1}}[k_{1}]\\ 0&x_{i_{2}}[k_{2}]\end{bmatrix}. (13)

To prove part (a), consider any i_{1}\in\mathcal{S}_{I}^{\prime} and any i_{2}\in\mathcal{V} with d_{i_{2}}\neq\infty, where we note that x_{i_{1}}[0]>0 and a_{i_{1}i_{1}}>0 from the definition of \mathcal{S}_{I}^{\prime}. We then see from Lemma 3.5(a)-(b) that s_{i_{1}}[k_{1}]>0 and x_{i_{1}}[k_{1}]>0 for all k_{1}\geq 0. It follows that s_{i_{1}}[k_{1}]\sum_{j\in\bar{\mathcal{N}}_{i_{1}}}a_{i_{1}j}x_{j}[k_{1}]>0 for all k_{1}\geq 0. Also, we obtain from Lemma 3.5(b) that x_{i_{2}}[k_{2}]>0 for all k_{2}\geq d_{i_{2}}, which proves part (a).

We then prove part (b). Considering any i_{1}\in\mathcal{S}^{\prime} and any i_{2}\in\mathcal{V} with d_{i_{2}}\neq\infty, we see from the definition of \mathcal{S}^{\prime} that \mathcal{N}_{i_{1}}\neq\emptyset and there exists j\in\mathcal{N}_{i_{1}} such that d_{j}\neq\infty. Letting j_{1} be a node in \mathcal{N}_{i_{1}} such that d_{j_{1}}=\min\{d_{j}:j\in\mathcal{N}_{i_{1}}\}\neq\infty, we note from Lemma 3.5(b) that x_{j_{1}}[k_{1}]>0 for all k_{1}\geq\min\{d_{j}:j\in\mathcal{N}_{i_{1}}\}. Also note that a_{i_{1}j_{1}}>0 from Assumption 3.2. The rest of the proof of part (b) is then identical to that of part (a). ∎

Recalling the way we index the equations in Eq. (3) (see Eqs. (6)-(7) for examples), we define the set that contains all the indices of the equations in Eq. (3):

\mathcal{Q}\triangleq\{(k,i,\lambda):k\in\{t_{1},\dots,t_{2}-1\},i\in\mathcal{V},\lambda\in\{x,r\}\}. (14)

Following the arguments in Lemma 4.5, we denote

\mathcal{Q}_{1}\triangleq\{(k,i,x)\in\mathcal{Q}:i\in\mathcal{S}_{I}^{\prime}\}\cup\{(k,i,x)\in\mathcal{Q}:k\geq\min\{d_{j}:j\in\mathcal{N}_{i}\},i\in\mathcal{S}^{\prime}\}, (15)
\mathcal{Q}_{2}\triangleq\{(k,i,r)\in\mathcal{Q}:k\geq d_{i},i\in\mathcal{V},d_{i}\neq\infty\}, (16)

where \mathcal{S}_{I}^{\prime} and \mathcal{S}^{\prime} are defined in Lemma 4.5, and d_{i} is defined in Definition 3.4. Next, for all (k,i,x)\in\mathcal{Q}, we define the set of measurements that are needed to determine the coefficients in equation (k,i,x) (when no other equations are used) to be

\mathcal{I}(k,i,x)\triangleq(\{x_{i}[k+1],r_{i}[k]\}\cup\{x_{j}[k]:j\in\bar{\mathcal{N}}_{i}\})\cap\mathcal{I}_{t_{1}:t_{2}},

where \mathcal{I}_{t_{1}:t_{2}} is defined in Eq. (2). Similarly, for all (k,i,r)\in\mathcal{Q}, we define

\mathcal{I}(k,i,r)\triangleq\{r_{i}[k+1],r_{i}[k],x_{i}[k]\}\cap\mathcal{I}_{t_{1}:t_{2}}.

Moreover, let us denote

\mathcal{I}((k_{1},i_{1},\lambda_{1}),(k_{2},i_{2},\lambda_{2}))\triangleq\mathcal{I}(k_{1},i_{1},\lambda_{1})\cup\mathcal{I}(k_{2},i_{2},\lambda_{2}) (17)

for all (k_{1},i_{1},\lambda_{1}),(k_{2},i_{2},\lambda_{2})\in\mathcal{Q}. Similarly to Eq. (11), denote the sum of the costs of the measurements in \mathcal{I}((k_{1},i_{1},\lambda_{1}),(k_{2},i_{2},\lambda_{2})) as c(\mathcal{I}((k_{1},i_{1},\lambda_{1}),(k_{2},i_{2},\lambda_{2}))).

Algorithm 1 Algorithm for PIMS
1: Input: An instance of PIMS
2: Find (k_{1},i_{1},x)\in\mathcal{Q}_{1} and (k_{2},i_{2},r)\in\mathcal{Q}_{2} such that c(\mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r))) is minimized
3: Return \mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r))

Based on the above arguments, we propose Algorithm 1 for the PIMS problem. Note that Algorithm 1 finds an equation from \mathcal{Q}_{1} and an equation from \mathcal{Q}_{2} such that the sum of the costs of the measurements needed for the two equations is minimized, where \mathcal{Q}_{1} and \mathcal{Q}_{2} are defined in Eq. (15) and Eq. (16), respectively.
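Concretely, Algorithm 1 is a direct search over the pairs in \mathcal{Q}_{1}\times\mathcal{Q}_{2}. A short Python sketch of this search is given below; the dictionary-based data structures (mapping equation indices to their measurement sets \mathcal{I}(k,i,\lambda) and measurements to their costs) are our own illustrative choices, not part of the paper.

```python
def algorithm_1(Q1, Q2, meas_sets, meas_cost):
    """Sketch of Algorithm 1 for the PIMS problem.

    Q1, Q2    : iterables of equation indices (k, i, 'x') and (k, i, 'r'),
                as in Eqs. (15)-(16)
    meas_sets : dict mapping an equation index to its measurement set
                I(k, i, lambda), each set holding hashable measurement keys
    meas_cost : dict mapping a measurement key to its cost c_{k,i} or b_{k,i}
    Returns the measurement set I((k1, i1, x), (k2, i2, r)) of minimum cost."""
    def pair_cost(eq_x, eq_r):
        # c(I(eq_x, eq_r)): cost of the union of the two sets, per Eq. (17)
        return sum(meas_cost[m] for m in meas_sets[eq_x] | meas_sets[eq_r])

    best = min(((e1, e2) for e1 in Q1 for e2 in Q2), key=lambda p: pair_cost(*p))
    return meas_sets[best[0]] | meas_sets[best[1]]
```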

Proposition 4.6.

Consider an instance of the PIMS problem under Assumptions 3.1-3.2. Algorithm 1 returns a solution \mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r)) to the PIMS problem that satisfies the constraint in (12), and the following:

\frac{c(\mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r)))}{c(\mathcal{I}^{\star})}\leq\frac{\min_{(k,i,x)\in\mathcal{Q}_{1}}(b_{k+1,i}+b_{k,i}+c_{k+1,i}+\sum_{j\in\bar{\mathcal{N}}_{i}}c_{k,j})}{3c_{\min}}, (18)

where \mathcal{I}^{\star} is an optimal solution to the PIMS problem, \mathcal{Q}_{1} is defined in Eq. (15), and c_{\min}\triangleq\min\{\min_{x_{i}[k]\in\mathcal{I}_{t_{1}:t_{2}}}c_{k,i},\min_{r_{i}[k]\in\mathcal{I}_{t_{1}:t_{2}}}b_{k,i}\}>0 with \mathcal{I}_{t_{1}:t_{2}} given by Eq. (2).

Proof.

The feasibility of \mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r)) follows directly from the definition of Algorithm 1 and Lemma 4.5. We now prove (18). Consider any equations (k,i,x)\in\mathcal{Q}_{1} and (k,i,r)\in\mathcal{Q}_{2}. We have from Eq. (17) the following:

\mathcal{I}((k,i,x),(k,i,r))=(\{x_{i}[k+1],r_{i}[k]\}\cup\{x_{j}[k]:j\in\bar{\mathcal{N}}_{i}\}\cup\{r_{i}[k+1],r_{i}[k],x_{i}[k]\})\cap\mathcal{I}_{t_{1}:t_{2}},

which implies

c(\mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r)))\leq\min_{(k,i,x)\in\mathcal{Q}_{1}}(b_{k+1,i}+b_{k,i}+c_{k+1,i}+\sum_{j\in\bar{\mathcal{N}}_{i}}c_{k,j}).

Next, since \mathcal{I}^{\star} satisfies r_{\max}(\mathcal{I}^{\star})=2, we recall from Remark 4.2 that |\bar{\mathcal{I}}^{\star}|\geq 2, where

\bar{\mathcal{I}}^{\star}=\{(k,i,x):x_{i}[k+1]\in\mathcal{I}^{\star},x_{i}[k]=0\}\cup\{(k,i,x):\{x_{i}[k+1],x_{i}[k]\}\subseteq\mathcal{I}^{\star}\}\cup\{(k,i,r):\{r_{i}[k+1],x_{i}[k]\}\subseteq\mathcal{I}^{\star},r_{i}[k]=0\}\cup\{(k,i,r):\{r_{i}[k+1],r_{i}[k],x_{i}[k]\}\subseteq\mathcal{I}^{\star}\},

which implies |\mathcal{I}^{\star}|\geq 2. In fact, suppose \mathcal{I}^{\star}=\{x_{i}[k+1],x_{j}[k+1]\}, where i,j\in\mathcal{V} and k\in\{t_{1}-1,\dots,t_{2}-1\}. Since the elements in \Phi_{k,i}^{x} and \Phi_{k,j}^{x} (defined in (8)) do not contain x_{w}[0], r_{w}[0] or s_{w}[0] for any w\in\mathcal{V}, and cannot all be zero, we see that there exists x_{w^{\prime}}[k]\in\mathcal{I}^{\star} (with x_{w^{\prime}}[k]>0), where w^{\prime}\in\mathcal{V}. This further implies |\mathcal{I}^{\star}|\geq 3. Using similar arguments, one can show that |\mathcal{I}^{\star}|\geq 3 holds in general, which implies c(\mathcal{I}^{\star})\geq 3c_{\min}. Combining the above arguments leads to (18). ∎

Finally, note that \mathcal{Q}_{2} and \mathcal{I}_{t_{1}:t_{2}} can be obtained by calling the Breadth-First-Search (BFS) algorithm (e.g., [8]) |\mathcal{S}_{I}| times with O(|\mathcal{S}_{I}|(n+|\mathcal{E}|)) total time complexity. Also note that the time complexity of line 2 of Algorithm 1 is O(n^{2}(t_{2}-t_{1}+1)^{2}). Thus, the overall time complexity of Algorithm 1 is O(n^{2}(t_{2}-t_{1}+1)^{2}+|\mathcal{S}_{I}||\mathcal{E}|)=O(|\mathcal{Q}|^{2}+|\mathcal{S}_{I}||\mathcal{E}|).

5 Measurement Selection Problem in Random Measurement Setting

In this section, we assume that the initial condition l\triangleq[(s[0])^{T}\ (x[0])^{T}\ (r[0])^{T}]^{T} is known. Nevertheless, our analysis can potentially be extended to cases where the initial condition l is given by a probability distribution.

5.1 Problem Formulation

Here, we consider the scenario where the measurement of x_{i}[k] (resp., r_{i}[k]), denoted as \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]), is given by a pmf p(\hat{x}_{i}[k]|x_{i}[k]) (resp., p(\hat{r}_{i}[k]|r_{i}[k])). Note that one can express x_{i}[k] in terms of l and \theta\triangleq[\beta\ \delta]^{T} using (1b). Hence, given l and \theta, we can alternatively write p(\hat{x}_{i}[k]|x_{i}[k]) as p(\hat{x}_{i}[k]|l,\theta) for all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 1}. Since the initial condition is assumed to be known, we drop the dependency of p(\hat{x}_{i}[k]|l,\theta) on l, and denote the pmf of \hat{x}_{i}[k] as p(\hat{x}_{i}[k]|\theta) for all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 1}. Similarly, given l and \theta, we denote the pmf of \hat{r}_{i}[k] as p(\hat{r}_{i}[k]|\theta) for all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 1}. Note that when collecting measurement \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]) under a limited budget, one possibility is to give virus (resp., antibody) tests to a group of randomly and uniformly sampled individuals of the population at node i\in\mathcal{V} and at time k\in\mathbb{Z}_{\geq 1} (e.g., [2]), where a positive testing result indicates that the tested individual is infected (resp., recovered) at time k (e.g., [1]). Thus, the obtained random measurements \hat{x}_{i}[k] and \hat{r}_{i}[k] and the corresponding pmfs p(\hat{x}_{i}[k]|\theta) and p(\hat{r}_{i}[k]|\theta) depend on the total numbers of virus tests and antibody tests conducted at node i and at time k, respectively. Consider any node i\in\mathcal{V} and any time step k\in\mathbb{Z}_{\geq 1}, where the total population of i is denoted by N_{i}\in\mathbb{Z}_{\geq 1} and is assumed to be fixed over time. Suppose we are allowed to choose the number of virus (resp., antibody) tests that will be performed on the (randomly sampled) individuals at node i and at time k. Assume that the cost of performing the virus (resp., antibody) tests is proportional to the number of the tests. For any i\in\mathcal{V} and for any k\in\{t_{1},\dots,t_{2}\}, let

\mathcal{C}_{k,i}\triangleq\{\zeta c_{k,i}:\zeta\in(\{0\}\cup[\zeta_{i}])\} (19)

be the set of all possible costs that we can spend on collecting the measurement \hat{x}_{i}[k], where c_{k,i}\in\mathbb{R}_{\geq 0} and \zeta_{i}\in\mathbb{Z}_{\geq 1}. Similarly, for any i\in\mathcal{V} and any k\in\{t_{1},\dots,t_{2}\}, let

\mathcal{B}_{k,i}\triangleq\{\eta b_{k,i}:\eta\in(\{0\}\cup[\eta_{i}])\} (20)

denote the set of all possible costs that we can spend on collecting the measurement \hat{r}_{i}[k], where b_{k,i}\in\mathbb{R}_{\geq 0} and \eta_{i}\in\mathbb{Z}_{\geq 1}. For instance, \zeta c_{k,i} can be viewed as the cost of performing virus tests on \zeta N_{i}^{x} (randomly sampled) individuals in the population at node i, where N_{i}^{x}\in\mathbb{Z}_{\geq 1} and \zeta_{i}N_{i}^{x}\leq N_{i}. To reflect the dependency of the pmf p(\hat{x}_{i}[k]|\theta) (resp., p(\hat{r}_{i}[k]|\theta)) of measurement \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]) on the cost spent on collecting the measurement of x_{i}[k] (resp., r_{i}[k]), we further denote the pmf of \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]) as p(\hat{x}_{i}[k]|\theta,\varphi_{k,i}) (resp., p(\hat{r}_{i}[k]|\theta,\omega_{k,i})), where \varphi_{k,i}\in\mathcal{C}_{k,i} (resp., \omega_{k,i}\in\mathcal{B}_{k,i}) with \mathcal{C}_{k,i} (resp., \mathcal{B}_{k,i}) given by Eq. (19) (resp., Eq. (20)). Note that \varphi_{k,i} (resp., \omega_{k,i}) is the cost that we spend on collecting measurement \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]), and \varphi_{k,i}=0 (resp., \omega_{k,i}=0) indicates that measurement \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]) is not collected.

In contrast with the exact measurement case studied in Section 4, it is not possible to uniquely identify \beta and \delta using the measurements \hat{x}_{i}[k] and \hat{r}_{i}[k], which are now random variables. Thus, we will consider estimators of \beta and \delta based on the measurements indicated by a measurement selection strategy. Similarly to Section 4, given time steps t_{1},t_{2}\in\mathbb{Z}_{\geq 1} with t_{2}\geq t_{1}, define the set of all candidate measurements as

\mathcal{U}_{t_{1}:t_{2}}\triangleq\{\hat{x}_{i}[k]:i\in\mathcal{V},k\in\{t_{1},\dots,t_{2}\}\}\cup\{\hat{r}_{i}[k]:i\in\mathcal{V},k\in\{t_{1},\dots,t_{2}\}\}. (21)

Recalling \mathcal{C}_{k,i} and \mathcal{B}_{k,i} defined in Eq. (19) and Eq. (20), respectively, we let \mu\in\mathbb{Z}_{\geq 0}^{\mathcal{U}_{t_{1}:t_{2}}} be a measurement selection that specifies the costs spent on collecting measurements \hat{x}_{i}[k] and \hat{r}_{i}[k] for all i\in\mathcal{V} and for all k\in\{t_{1},\dots,t_{2}\}. Moreover, we define the set of all candidate measurement selections as

\mathcal{M}\triangleq\{\mu\in\mathbb{Z}_{\geq 0}^{\mathcal{U}_{t_{1}:t_{2}}}:\mu(\hat{x}_{i}[k])\in(\{0\}\cup[\zeta_{i}]),\mu(\hat{r}_{i}[k])\in(\{0\}\cup[\eta_{i}])\}, (22)

where \zeta_{i},\eta_{i}\in\mathbb{Z}_{\geq 1} for all i\in\mathcal{V}. In other words, a measurement selection \mu is defined over the integer lattice \mathbb{Z}_{\geq 0}^{\mathcal{U}_{t_{1}:t_{2}}} so that \mu is a vector of dimension |\mathcal{U}_{t_{1}:t_{2}}|, where each element of \mu corresponds to an element in \mathcal{U}_{t_{1}:t_{2}}, and is denoted as \mu(\hat{x}_{i}[k]) (or \mu(\hat{r}_{i}[k])). The set \mathcal{M} contains all \mu\in\mathbb{Z}_{\geq 0}^{\mathcal{U}_{t_{1}:t_{2}}} such that \mu(\hat{x}_{i}[k])\in(\{0\}\cup[\zeta_{i}]) and \mu(\hat{r}_{i}[k])\in(\{0\}\cup[\eta_{i}]) for all i\in\mathcal{V} and for all k\in\{t_{1},\dots,t_{2}\}. Thus, for any \varphi_{k,i}\in\mathcal{C}_{k,i} and for any \omega_{k,i}\in\mathcal{B}_{k,i}, there exists \mu\in\mathcal{M} such that \mu(\hat{x}_{i}[k])c_{k,i}=\varphi_{k,i} and \mu(\hat{r}_{i}[k])b_{k,i}=\omega_{k,i}. In other words, \mu(\hat{x}_{i}[k])c_{k,i} (resp., \mu(\hat{r}_{i}[k])b_{k,i}) is the cost spent on collecting the measurement of x_{i}[k] (resp., r_{i}[k]). Given a measurement selection \mu\in\mathcal{M}, we can also denote the pmfs of \hat{x}_{i}[k] and \hat{r}_{i}[k] as p(\hat{x}_{i}[k]|\theta,\mu(\hat{x}_{i}[k])) and p(\hat{r}_{i}[k]|\theta,\mu(\hat{r}_{i}[k])), respectively, where we drop the dependencies of the pmfs on c_{k,i} and b_{k,i} for notational simplicity.

To proceed, we consider the scenario where measurements can only be collected under a budget constraint given by B\in\mathbb{R}_{\geq 0}. Using the above notation, the budget constraint can be expressed as

\sum_{\hat{x}_{i}[k]\in\mathcal{U}_{t_{1}:t_{2}}}c_{k,i}\mu(\hat{x}_{i}[k])+\sum_{\hat{r}_{i}[k]\in\mathcal{U}_{t_{1}:t_{2}}}b_{k,i}\mu(\hat{r}_{i}[k])\leq B. (23)

We then consider estimators of \theta=[\beta\ \delta]^{T} based on any given measurement selection \mu\in\mathcal{M}. Considering any \mu\in\mathcal{M}, we denote

\mathcal{U}^{\lambda}_{i}\triangleq\{k:\mu(\hat{\lambda}_{i}[k])>0,k\in\{t_{1},\dots,t_{2}\}\}, (24)

for all i\in\mathcal{V} and for all \lambda\in\{x,r\}. For all i\in\mathcal{V} and for all \lambda\in\{x,r\} with \mathcal{U}_{i}^{\lambda}\neq\emptyset, denote y(\mathcal{U}_{i}^{\lambda})\triangleq\begin{bmatrix}\hat{\lambda}_{i}[k_{1}]&\cdots&\hat{\lambda}_{i}[k_{|\mathcal{U}_{i}^{\lambda}|}]\end{bmatrix}^{T}, where \mathcal{U}_{i}^{\lambda}=\{k_{1},\dots,k_{|\mathcal{U}_{i}^{\lambda}|}\}. Letting

\mathcal{U}_{\lambda}\triangleq\{i:\mathcal{U}_{i}^{\lambda}\neq\emptyset,i\in\mathcal{V}\}\ \forall\lambda\in\{x,r\},

we denote the measurement vector indicated by \mu\in\mathcal{M} as

y(\mu)\triangleq\begin{bmatrix}(y(\mathcal{U}_{i_{1}}^{x}))^{T}&\cdots&(y(\mathcal{U}_{i_{|\mathcal{U}_{x}|}}^{x}))^{T}&(y(\mathcal{U}_{j_{1}}^{r}))^{T}&\cdots&(y(\mathcal{U}_{j_{|\mathcal{U}_{r}|}}^{r}))^{T}\end{bmatrix}^{T}, (25)

where \mathcal{U}_{x}=\{i_{1},\dots,i_{|\mathcal{U}_{x}|}\} and \mathcal{U}_{r}=\{j_{1},\dots,j_{|\mathcal{U}_{r}|}\}. Note that \hat{x}_{i}[k] and \hat{r}_{i}[k] are (discrete) random variables with pmfs p(\hat{x}_{i}[k]|\theta,\mu(\hat{x}_{i}[k])) and p(\hat{r}_{i}[k]|\theta,\mu(\hat{r}_{i}[k])), respectively. We then see from Eq. (25) that y(\mu) is a random vector whose pmf is denoted as p(y(\mu)|\theta,\mu). Similarly, the pmf of y(\mathcal{U}_{i}^{x}) (resp., y(\mathcal{U}_{i}^{r})) is denoted as p(y(\mathcal{U}_{i}^{x})|\theta,\mu) (resp., p(y(\mathcal{U}_{i}^{r})|\theta,\mu)). Given t_{1},t_{2}\in\mathbb{Z}_{\geq 1} with t_{2}\geq t_{1}, we make the following assumption on the measurements \hat{x}_{i}[k] and \hat{r}_{i}[k].

Assumption 5.1.

For any i\in\mathcal{V} and for any k_{1},k_{2}\in\{t_{1},\dots,t_{2}\} (k_{1}\neq k_{2}), \hat{x}_{i}[k_{1}], \hat{x}_{i}[k_{2}], \hat{r}_{i}[k_{1}] and \hat{r}_{i}[k_{2}] are independent of each other. Moreover, for any i,j\in\mathcal{V} (i\neq j) and for any k_{1},k_{2}\in\{t_{1},\dots,t_{2}\}, \hat{x}_{i}[k_{1}] and \hat{x}_{j}[k_{2}] are independent, and \hat{x}_{i}[k_{1}] and \hat{r}_{j}[k_{2}] are independent.

The above assumption ensures that measurements from different nodes or from different time steps are independent, and that the measurements of x_{i}[k] and r_{i}[k] are also independent. It then follows from Eq. (25) that the pmf of y(\mu) can be written as

p(y(\mu)|\theta,\mu)=\prod_{i\in\mathcal{U}_{x}}p(y(\mathcal{U}_{i}^{x})|\theta,\mu)\cdot\prod_{j\in\mathcal{U}_{r}}p(y(\mathcal{U}_{j}^{r})|\theta,\mu), (26)

where we can further write p(y(\mathcal{U}_{i}^{x})|\theta,\mu)=\prod_{k\in\mathcal{U}_{i}^{x}}p(\hat{x}_{i}[k]|\theta,\mu(\hat{x}_{i}[k])) for all i\in\mathcal{U}_{x}, and p(y(\mathcal{U}_{j}^{r})|\theta,\mu)=\prod_{k\in\mathcal{U}_{j}^{r}}p(\hat{r}_{j}[k]|\theta,\mu(\hat{r}_{j}[k])) for all j\in\mathcal{U}_{r}.
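A practical consequence of Eq. (26) is that the log-likelihood of a collected measurement vector decomposes into a sum of per-measurement terms. The small Python sketch below illustrates this evaluation; the dictionary-based interface and the log_pmf callback are our own illustrative assumptions, not an interface from the paper.

```python
def log_likelihood(y, mu, theta, log_pmf):
    """Evaluate ln p(y(mu) | theta, mu) via the factorization in Eq. (26).

    y       : dict mapping a measurement key (i, k, lam), lam in {'x', 'r'},
              to its observed value (only selected measurements appear)
    mu      : dict mapping each measurement key to mu(x_hat_i[k]) or mu(r_hat_i[k])
    theta   : (beta, delta)
    log_pmf : callback returning ln p(lambda_hat_i[k] | theta, mu(.)) for one
              measurement -- a stand-in for model pmfs such as Eqs. (34)-(35)."""
    # Under Assumption 5.1 the joint pmf factorizes, so the log-likelihood
    # is the sum of the individual log-pmfs of the selected measurements.
    return sum(log_pmf(value, key, theta, mu[key])
               for key, value in y.items() if mu[key] > 0)
```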

In order to quantify the performance (e.g., precision) of estimators of \theta based on \mu, we use the Bayesian Cramér-Rao Lower Bound (BCRLB) (e.g., [33]) associated with \mu. In the following, we introduce the BCRLB, and explain why we choose it as a performance metric. First, given any measurement selection \mu\in\mathcal{M}, let F_{\theta}(\mu) be the corresponding Fisher information matrix defined as

F_{\theta}(\mu)\triangleq-\mathbb{E}\begin{bmatrix}\frac{\partial^{2}\ln p(y(\mu)|\theta,\mu)}{\partial\beta^{2}}&\frac{\partial^{2}\ln p(y(\mu)|\theta,\mu)}{\partial\beta\partial\delta}\\ \frac{\partial^{2}\ln p(y(\mu)|\theta,\mu)}{\partial\delta\partial\beta}&\frac{\partial^{2}\ln p(y(\mu)|\theta,\mu)}{\partial\delta^{2}}\end{bmatrix} (27)

with the expectation \mathbb{E}[\cdot] taken with respect to p(y(\mu)|\theta,\mu). Under Assumption 5.1 and some regularity conditions on the pmfs of \hat{x}_{i}[k] and \hat{r}_{i}[k], Eq. (27) can be written as (e.g., [13]):

F_{\theta}(\mu)=\sum_{\lambda\in\{x,r\}}\sum_{i\in\mathcal{U}_{\lambda}}\sum_{k\in\mathcal{U}_{i}^{\lambda}}\mathbb{E}\Big[\frac{\partial\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))}{\partial\theta}\big(\frac{\partial\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))}{\partial\theta}\big)^{T}\Big]. (28)

Consider any estimator \hat{\theta}(\mu) of \theta based on a measurement selection \mu\in\mathcal{M}, and assume that we have a prior pdf of \theta=[\beta\ \delta]^{T}, denoted as p(\theta). Under some regularity conditions on the pmfs of \hat{x}_{i}[k] and \hat{r}_{i}[k], and on p(\theta), we have (e.g., [32, 33]):

R_{\hat{\theta}(\mu)}=\mathbb{E}[(\hat{\theta}(\mu)-\theta)(\hat{\theta}(\mu)-\theta)^{T}]\succeq\bar{C}(\mu), (29)

where R_{\hat{\theta}(\mu)}\in\mathbb{R}^{2\times 2} is the error covariance of the estimator \hat{\theta}(\mu), the expectation \mathbb{E}[\cdot] is taken with respect to p(y(\mu)|\theta,\mu)p(\theta), and \bar{C}(\mu)\in\mathbb{R}^{2\times 2} is the BCRLB associated with the measurement selection \mu. The BCRLB is defined as (e.g., [32, 33])

\bar{C}(\mu)\triangleq(\mathbb{E}_{\theta}[F_{\theta}(\mu)]+F_{p})^{-1}, (30)

where \mathbb{E}_{\theta}[\cdot] denotes the expectation taken with respect to p(\theta), F_{\theta}(\mu) is given by Eq. (27), and F_{p}\in\mathbb{R}^{2\times 2} encodes the prior knowledge of \theta as

F_{p}=-\mathbb{E}_{\theta}\begin{bmatrix}\frac{\partial^{2}\ln p(\theta)}{\partial\beta^{2}}&\frac{\partial^{2}\ln p(\theta)}{\partial\beta\partial\delta}\\ \frac{\partial^{2}\ln p(\theta)}{\partial\delta\partial\beta}&\frac{\partial^{2}\ln p(\theta)}{\partial\delta^{2}}\end{bmatrix}=\mathbb{E}_{\theta}\Big[\frac{\partial\ln p(\theta)}{\partial\theta}\big(\frac{\partial\ln p(\theta)}{\partial\theta}\big)^{T}\Big]\succeq\mathbf{0}, (31)

where the second equality holds under some regularity conditions on p(\theta) [32].

Thus, the above arguments motivate us to consider (functions of) \bar{C}(\cdot) as optimization metrics in the measurement selection problem studied in this section, in order to characterize the estimation performance corresponding to a measurement selection \mu\in\mathcal{M}. In particular, we will consider \operatorname{tr}(\bar{C}(\cdot)) and \ln\det(\bar{C}(\cdot)), which are widely used criteria in parameter estimation (e.g., [12]), and are also known as the Bayesian A-optimality and D-optimality criteria, respectively, in the context of experimental design (e.g., [27]). First, considering the optimization metric \operatorname{tr}(\bar{C}(\cdot)), we see from the above arguments that (29) directly implies \operatorname{tr}(R_{\hat{\theta}(\mu)})\geq\operatorname{tr}(\bar{C}(\mu)) for all estimators \hat{\theta}(\mu) of \theta and for all \mu\in\mathcal{M}. Therefore, a measurement selection \mu^{\star} that minimizes \operatorname{tr}(\bar{C}(\mu)) potentially yields a lower value of \operatorname{tr}(R_{\hat{\theta}(\mu)}) for an estimator \hat{\theta}(\mu) of \theta. Furthermore, there may exist an estimator \hat{\theta}(\mu) that achieves the BCRLB (e.g., [32]), i.e., \operatorname{tr}(\bar{C}(\mu)) provides the minimum value of \operatorname{tr}(R_{\hat{\theta}(\mu)}) that can possibly be achieved by any estimator \hat{\theta}(\mu) of \theta, given a measurement selection \mu. Similar arguments hold for \ln\det(\bar{C}(\cdot)). To proceed, denoting

f_{a}(\mu)\triangleq\operatorname{tr}(\bar{C}(\mu))\ \text{and}\ f_{d}(\mu)\triangleq\ln\det(\bar{C}(\mu))\ \forall\mu\in\mathcal{M}, (32)

we define the Parameter Estimation Measurement Selection (PEMS) problem.

Problem 5.2.

Consider a discrete-time SIR model given by Eq. (1) with a directed graph \mathcal{G}=\{\mathcal{V},\mathcal{E}\}, a weight matrix A\in\mathbb{R}^{n\times n}, a sampling parameter h\in\mathbb{R}_{\geq 0}, and an initial condition l=[(s[0])^{T}\ (x[0])^{T}\ (r[0])^{T}]^{T}. Moreover, consider time steps t_{1},t_{2}\in\mathbb{Z}_{\geq 1} with t_{2}\geq t_{1}; a set \mathcal{C}_{k,i}=\{\zeta c_{k,i}:\zeta\in(\{0\}\cup[\zeta_{i}])\} with c_{k,i}\in\mathbb{R}_{\geq 0} and \zeta_{i}\in\mathbb{Z}_{\geq 1}, for all i\in\mathcal{V} and for all k\in\{t_{1},\dots,t_{2}\}; a set \mathcal{B}_{k,i}=\{\eta b_{k,i}:\eta\in(\{0\}\cup[\eta_{i}])\} with b_{k,i}\in\mathbb{R}_{\geq 0} and \eta_{i}\in\mathbb{Z}_{\geq 1}, for all i\in\mathcal{V} and for all k\in\{t_{1},\dots,t_{2}\}; a budget B\in\mathbb{R}_{\geq 0}; and a prior pdf p(\theta). Suppose \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]) is given by a pmf p(\hat{x}_{i}[k]|\theta,\varphi_{k,i}) (resp., p(\hat{r}_{i}[k]|\theta,\omega_{k,i})), where \varphi_{k,i}\in\mathcal{C}_{k,i} (resp., \omega_{k,i}\in\mathcal{B}_{k,i}). The PEMS problem is to find a measurement selection \mu that solves

\min_{\mu\in\mathcal{M}} f(\mu) \quad \text{s.t.} \quad \sum_{\hat{x}_{i}[k]\in\mathcal{U}_{t_{1}:t_{2}}}c_{k,i}\mu(\hat{x}_{i}[k])+\sum_{\hat{r}_{i}[k]\in\mathcal{U}_{t_{1}:t_{2}}}b_{k,i}\mu(\hat{r}_{i}[k])\leq B, (33)

where \mathcal{M} is defined in Eq. (22), f(\cdot) can be either f_{a}(\cdot) or f_{d}(\cdot) with f_{a}(\cdot) and f_{d}(\cdot) defined in Eq. (32), \mathcal{U}_{t_{1}:t_{2}} is defined in Eq. (21), and \bar{C}(\mu) is given by Eq. (30).

Note that F_{p}\succeq\mathbf{0} from (31), and f_{a}(\mathbf{0})=\operatorname{tr}(\bar{C}(\mathbf{0}))=\operatorname{tr}((F_{p})^{-1}) and f_{d}(\mathbf{0})=\ln\det(\bar{C}(\mathbf{0}))=\ln\det((F_{p})^{-1}) from Eq. (30). We further assume that F_{p}\succ\mathbf{0} in the sequel, which implies f(\mu)>0 for all \mu\in\mathcal{M}.
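To make the objectives in Eq. (32) concrete, the sketch below evaluates f_{a}(\mu) and f_{d}(\mu) from Eq. (30), approximating \mathbb{E}_{\theta}[F_{\theta}(\mu)] by a Monte Carlo average over draws from p(\theta); this averaging step is our own illustrative choice, not a procedure specified in the paper.

```python
import numpy as np

def pems_objectives(fisher_at, theta_samples, F_p):
    """Evaluate the PEMS objectives of Eq. (32) for one measurement selection.

    fisher_at     : function theta -> 2x2 Fisher information matrix F_theta(mu),
                    e.g. assembled from the per-measurement terms in Eq. (28)
    theta_samples : draws theta ~ p(theta) approximating E_theta[F_theta(mu)]
    F_p           : 2x2 prior information matrix of Eq. (31), assumed F_p > 0
    Returns (f_a, f_d) = (tr(C_bar), ln det(C_bar)) with C_bar from Eq. (30)."""
    F_avg = np.mean([fisher_at(theta) for theta in theta_samples], axis=0)
    C_bar = np.linalg.inv(F_avg + F_p)      # BCRLB, Eq. (30)
    f_a = float(np.trace(C_bar))            # Bayesian A-optimality criterion
    _, logdet = np.linalg.slogdet(C_bar)    # C_bar > 0, so the sign is +1
    return f_a, logdet                      # f_d = ln det(C_bar)
```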

5.2 Solving the PEMS Problem

In this section, we consider a measurement model with specific pmfs of \hat{x}_{i}[k] and \hat{r}_{i}[k] (e.g., [4] and [11]). Nonetheless, our analysis can potentially be extended to other measurement models.

5.2.1 Pmfs of Measurements \hat{x}_{i}[k] and \hat{r}_{i}[k]

Consider any i𝒱i\in\mathcal{V} and any k{t1,,t2}k\in\{t_{1},\dots,t_{2}\}. Assume that the total population of node ii is fixed over time and is denoted as Ni1N_{i}\in\mathbb{Z}_{\geq 1}. Given any measurement selection μ\mu\in\mathcal{M} with \mathcal{M} defined in Eq. (22), we recall from Section 5.1 that μ(x^i[k])ck,i\mu(\hat{x}_{i}[k])c_{k,i} can be viewed as the cost of performing virus tests on μ(x^i[k])Nix\mu(\hat{x}_{i}[k])N_{i}^{x} randomly and uniformly sampled individuals in the population of node i𝒱i\in\mathcal{V}, where μ(x^i[k])({0}[ζi])\mu(\hat{x}_{i}[k])\in(\{0\}\cup[\zeta_{i}]) (with ζi1\zeta_{i}\in\mathbb{Z}_{\geq 1}), ck,i0c_{k,i}\in\mathbb{R}_{\geq 0} and Nix1N_{i}^{x}\in\mathbb{Z}_{\geq 1} with ζiNixNi\zeta_{i}N_{i}^{x}\leq N_{i}. Note that xi[k]x_{i}[k] is the proportion of population at node ii and at time kk that is infected, and xi[k][0,1)x_{i}[k]\in[0,1) under Assumptions 3.1-3.2 as shown by Lemma 3.5. Thus, a randomly and uniformly sampled individual in the population at node ii and at time kk will be an infected individual (at time kk) with probability xi[k]x_{i}[k], and will be a non-infected (i.e., susceptible or recovered) individual with probability 1xi[k]1-x_{i}[k]. Supposing the tests are accurate,222Here, “accurate” means that an infected individual (at time kk) will be tested positive with probability one, and an individual that is not infected will be tested negative with probability one. we see from the above arguments that the obtained number of individuals that are tested positive, i.e., Nix^i[k]N_{i}\hat{x}_{i}[k], is a binomial random variable with parameters Nixμ(x^i[k])1N_{i}^{x}\mu(\hat{x}_{i}[k])\in\mathbb{Z}_{\geq 1} and xi[k][0,1)x_{i}[k]\in[0,1). Thus, for any i𝒱i\in\mathcal{V} and for any k{t1,,t2}k\in\{t_{1},\dots,t_{2}\}, the pmf of x^i[k]\hat{x}_{i}[k] is

p(\hat{x}_{i}[k]=x|\theta,\mu(\hat{x}_{i}[k]))={N_{i}^{x}\mu(\hat{x}_{i}[k])\choose N_{i}x}(x_{i}[k])^{N_{i}x}(1-x_{i}[k])^{N_{i}^{x}\mu(\hat{x}_{i}[k])-N_{i}x}, \quad (34)

where $x\in\{0,\frac{1}{N_{i}},\frac{2}{N_{i}},\dots,\frac{N_{i}^{x}\mu(\hat{x}_{i}[k])}{N_{i}}\}$ with $x\in[0,1]$ since $N_{i}^{x}\zeta_{i}\leq N_{i}$. Note that we do not define the pmf of measurement $\hat{x}_{i}[k]$ when $N_{i}^{x}\mu(\hat{x}_{i}[k])=0$, i.e., when $\mu(\hat{x}_{i}[k])=0$, since $\mu(\hat{x}_{i}[k])=0$ indicates that no measurement is collected for state $x_{i}[k]$. Also note that when $x_{i}[k]=0$, the pmf of $\hat{x}_{i}[k]$ given in Eq. (34) reduces to $p(\hat{x}_{i}[k]=0|\theta,\mu(\hat{x}_{i}[k]))=1$. Moreover, since the weight matrix $A\in\mathbb{R}^{n\times n}$ and the sampling parameter $h\in\mathbb{R}_{\geq 0}$ are assumed to be given, we see that given $\theta=[\beta\ \delta]^{T}$ and initial condition $l=[(s[0])^{T}\ (x[0])^{T}\ (r[0])^{T}]^{T}$, $x_{i}[k]$ can be obtained using Eq. (1b) for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$, where we can view $x_{i}[k]$ as a function of the unknown parameter $\theta$. In other words, given $l$, $\theta$, $\mu(\hat{x}_{i}[k])$, $N_{i}^{x}$ and $N_{i}$, one can obtain the right-hand side of Eq. (34). Again, we only explicitly express the dependency of the pmf of $\hat{x}_{i}[k]$ on $\theta$ and $\mu(\hat{x}_{i}[k])$ in Eq. (34). Following similar arguments to those above, we assume that for any $i\in\mathcal{V}$ and for any $k\in\{t_{1},\dots,t_{2}\}$, measurement $\hat{r}_{i}[k]$ has the following pmf:

p(\hat{r}_{i}[k]=r|\theta,\mu(\hat{r}_{i}[k]))={N_{i}^{r}\mu(\hat{r}_{i}[k])\choose N_{i}r}(r_{i}[k])^{N_{i}r}(1-r_{i}[k])^{N_{i}^{r}\mu(\hat{r}_{i}[k])-N_{i}r}, \quad (35)

where $r\in\{0,\frac{1}{N_{i}},\frac{2}{N_{i}},\dots,\frac{N_{i}^{r}\mu(\hat{r}_{i}[k])}{N_{i}}\}$ with $r\in[0,1]$, $\mu(\hat{r}_{i}[k])\in\{0,\dots,\eta_{i}\}$, $N_{i}^{r}\in\mathbb{Z}_{\geq 1}$, and $N_{i}^{r}\mu(\hat{r}_{i}[k])\leq N_{i}$. Similarly, we note that the pmf of $\hat{r}_{i}[k]$ given in Eq. (35) reduces to $p(\hat{r}_{i}[k]=0|\theta,\mu(\hat{r}_{i}[k]))=1$ when $r_{i}[k]=0$. Considering any measurement selection $\mu\in\mathcal{M}$ and any measurement $\hat{\lambda}_{i}[k]\in\mathcal{U}_{t_{1}:t_{2}}$, where $\lambda\in\{x,r\}$ and $\mathcal{U}_{t_{1}:t_{2}}$ is defined in Eq. (21), we have the following:

\begin{aligned}\mathbb{E}\Big{[}\frac{\partial\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))}{\partial\theta}\big{(}\frac{\partial\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))}{\partial\theta}\big{)}^{T}\Big{]}&=\mathbb{E}\Big{[}\big{(}\frac{\partial\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))}{\partial{\lambda}_{i}[k]}\big{)}^{2}\cdot\frac{\partial{\lambda}_{i}[k]}{\partial\theta}\big{(}\frac{\partial{\lambda}_{i}[k]}{\partial\theta}\big{)}^{T}\Big{]} \quad (36)\\ &=\frac{N_{i}^{\lambda}\mu(\hat{\lambda}_{i}[k])}{{\lambda}_{i}[k](1-{\lambda}_{i}[k])}\cdot\frac{\partial\lambda_{i}[k]}{\partial\theta}\big{(}\frac{\partial\lambda_{i}[k]}{\partial\theta}\big{)}^{T}, \quad (37)\end{aligned}

where the expectation $\mathbb{E}[\cdot]$ is taken with respect to $p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))$, and ${\lambda}_{i}[k]\in[0,1)$. To obtain (36), we note the form of $\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))$ in Eq. (34) (or Eq. (35)), and use the chain rule. Moreover, one can obtain (37) from the fact that $\hat{\lambda}_{i}[k]$ is a binomial random variable. Noting that the pmf of $\hat{\lambda}_{i}[k]$ reduces to $p(\hat{\lambda}_{i}[k]=0|\theta,\mu(\hat{\lambda}_{i}[k]))=1$ if ${\lambda}_{i}[k]=0$ as argued above, we let the right-hand side of (37) be zero if ${\lambda}_{i}[k]=0$.
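To make the measurement model concrete, the following Python sketch evaluates the pmf in Eq. (34) and the rank-one Fisher information contribution in (37) for a single node and time step. It is a minimal illustration under our own (hypothetical) naming conventions, with numpy and scipy assumed available; it is not code from the paper.

```python
import numpy as np
from scipy.stats import binom

def measurement_pmf(x_frac, n_tests, N_i, x_ik):
    """Pmf of the measurement x_hat_i[k] (Eq. (34)).

    x_frac : observed fraction, a multiple of 1/N_i;
    n_tests: N_i^x * mu(x_hat_i[k]), the number of tests performed;
    N_i    : total population at node i;
    x_ik   : true infected proportion x_i[k] (a function of theta).
    """
    positives = int(round(N_i * x_frac))  # number of positive tests
    return binom.pmf(positives, n_tests, x_ik)

def fisher_contribution(n_tests, lam_ik, dlam_dtheta):
    """Rank-one Fisher information term from (37).

    lam_ik      : x_i[k] or r_i[k], lying in [0, 1);
    dlam_dtheta : gradient [d lam / d beta, d lam / d delta].
    """
    if lam_ik == 0.0:  # degenerate pmf; contribution defined as zero
        return np.zeros((2, 2))
    g = np.asarray(dlam_dtheta, dtype=float).reshape(2, 1)
    return (n_tests / (lam_ik * (1.0 - lam_ik))) * (g @ g.T)
```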

5.2.2 Complexity of the PEMS Problem

Under the measurement model described above, we show that the PEMS problem is also NP-hard, i.e., there exist instances of the PEMS problem that cannot be solved optimally by any polynomial-time algorithm (if P $\neq$ NP). The proof of the following result is included in Section 7.3 in the Appendix.

Theorem 5.3.

The PEMS problem is NP-hard.

5.2.3 An Equivalent Formulation for the PEMS Problem

Theorem 5.3 motivates us to consider approximation algorithms for solving the PEMS problem. To begin with, we note that the objective function in the PEMS problem can be viewed as a function defined over an integer lattice. We then have $f_{a}:\mathcal{M}\to\mathbb{R}_{\geq 0}$ and $f_{d}:\mathcal{M}\to\mathbb{R}_{\geq 0}$, where $\mathcal{M}$ is defined in Eq. (22). First, considering $f_{a}:\mathcal{M}\to\mathbb{R}_{\geq 0}$, we will define a set function $f_{Pa}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$, where $\bar{\mathcal{M}}$ is a set constructed as

\bar{\mathcal{M}}\triangleq\{(\hat{x}_{i}[k],l_{1}):i\in\mathcal{V},k\in\{t_{1},\dots,t_{2}\},l_{1}\in[\zeta_{i}]\}\cup\{(\hat{r}_{i}[k],l_{2}):i\in\mathcal{V},k\in\{t_{1},\dots,t_{2}\},l_{2}\in[\eta_{i}]\}. \quad (38)

In other words, for any $i\in\mathcal{V}$ and for any $k\in\{t_{1},\dots,t_{2}\}$, we associate elements $(\hat{x}_{i}[k],1),\dots,(\hat{x}_{i}[k],\zeta_{i})$ (resp., $(\hat{r}_{i}[k],1),\dots,(\hat{r}_{i}[k],\eta_{i})$) in set $\bar{\mathcal{M}}$ with measurement $\hat{x}_{i}[k]$ (resp., $\hat{r}_{i}[k]$). The set function $f_{Pa}(\cdot)$ is then defined as

f_{Pa}(\mathcal{Y})\triangleq f_{a}(\mathbf{0})-f_{a}(\mu_{\mathcal{Y}})=\operatorname{tr}(\bar{C}(\mathbf{0}))-\operatorname{tr}(\bar{C}(\mu_{\mathcal{Y}}))\quad\forall\mathcal{Y}\subseteq\bar{\mathcal{M}}, \quad (39)

where for any $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, we define $\mu_{\mathcal{Y}}\in\mathcal{M}$ such that $\mu_{\mathcal{Y}}(\hat{x}_{i}[k])=|\{(\hat{x}_{i}[k],l_{1}):(\hat{x}_{i}[k],l_{1})\in\mathcal{Y}\}|$ and $\mu_{\mathcal{Y}}(\hat{r}_{i}[k])=|\{(\hat{r}_{i}[k],l_{2}):(\hat{r}_{i}[k],l_{2})\in\mathcal{Y}\}|$ for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$. In other words, $\mu_{\mathcal{Y}}(\hat{x}_{i}[k])$ (resp., $\mu_{\mathcal{Y}}(\hat{r}_{i}[k])$) is set to be the number of elements in $\mathcal{Y}$ that correspond to the measurement $\hat{x}_{i}[k]$ (resp., $\hat{r}_{i}[k]$). Also note that $f_{Pa}(\emptyset)=0$. Following the arguments leading to (37), we define

H_{y}\triangleq\begin{cases}\mathbb{E}_{\theta}\big{[}\frac{N_{i}^{x}}{x_{i}[k](1-x_{i}[k])}\frac{\partial x_{i}[k]}{\partial\theta}\big{(}\frac{\partial x_{i}[k]}{\partial\theta}\big{)}^{T}\big{]}&\text{if}\ y=(\hat{x}_{i}[k],l_{1})\\ \mathbb{E}_{\theta}\big{[}\frac{N_{i}^{r}}{r_{i}[k](1-r_{i}[k])}\frac{\partial r_{i}[k]}{\partial\theta}\big{(}\frac{\partial r_{i}[k]}{\partial\theta}\big{)}^{T}\big{]}&\text{if}\ y=(\hat{r}_{i}[k],l_{2})\end{cases}\quad\forall y\in\bar{\mathcal{M}}, \quad (40)

where $x_{i}[k],r_{i}[k]\in[0,1)$, $i\in\mathcal{V}$, $k\in\{t_{1},\dots,t_{2}\}$, $l_{1}\in[\zeta_{i}]$, $l_{2}\in[\eta_{i}]$, and the expectation $\mathbb{E}_{\theta}[\cdot]$ is taken with respect to the prior pdf $p(\theta)$. Given any $\theta=[\beta\ \delta]^{T}$, we see from the arguments for (37) that $\frac{N_{i}^{x}}{x_{i}[k](1-x_{i}[k])}\frac{\partial x_{i}[k]}{\partial\theta}\big{(}\frac{\partial x_{i}[k]}{\partial\theta}\big{)}^{T}\succeq\mathbf{0}$. Moreover, one can show that $\mathbb{E}_{\theta}\big{[}\frac{N_{i}^{x}}{x_{i}[k](1-x_{i}[k])}\frac{\partial x_{i}[k]}{\partial\theta}\big{(}\frac{\partial x_{i}[k]}{\partial\theta}\big{)}^{T}\big{]}\succeq\mathbf{0}$. Similarly, one can obtain $\mathbb{E}_{\theta}\big{[}\frac{N_{i}^{r}}{r_{i}[k](1-r_{i}[k])}\frac{\partial r_{i}[k]}{\partial\theta}\big{(}\frac{\partial r_{i}[k]}{\partial\theta}\big{)}^{T}\big{]}\succeq\mathbf{0}$, which implies $H_{y}\succeq\mathbf{0}$ for all $y\in\bar{\mathcal{M}}$. Now, suppose the pmfs of $\hat{x}_{i}[k]$ and $\hat{r}_{i}[k]$ are given by Eq. (34) and Eq. (35), respectively. Recall from Eq. (30) that $\operatorname{tr}(\bar{C}(\mu))=\operatorname{tr}((\mathbb{E}_{\theta}[F_{\theta}(\mu)]+F_{p})^{-1})$ for all $\mu\in\mathcal{M}$, where $F_{p}$ and $F_{\theta}(\mu)$ are given by (31) and (28), respectively. Supposing Assumption 5.1 holds, for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, one can first express $F_{\theta}(\mu_{\mathcal{Y}})$ using (37), and then use Eq. (40) to obtain $\mathbb{E}_{\theta}[F_{\theta}(\mu_{\mathcal{Y}})]=\sum_{y\in\mathcal{Y}}H_{y}\triangleq H(\mathcal{Y})$, where $\mu_{\mathcal{Y}}$ is defined above given $\mathcal{Y}\subseteq\bar{\mathcal{M}}$. Putting the above arguments together, we have from Eq. (39) the following:

f_{Pa}(\mathcal{Y})=\operatorname{tr}\big{(}(F_{p})^{-1}\big{)}-\operatorname{tr}\big{(}(F_{p}+H(\mathcal{Y}))^{-1}\big{)}\quad\forall\mathcal{Y}\subseteq\bar{\mathcal{M}}. \quad (41)

Next, let the cost of $(\hat{x}_{i}[k],l_{1})$ be $c_{k,i}$, denoted as $c(\hat{x}_{i}[k],l_{1})$, for all $(\hat{x}_{i}[k],l_{1})\in\bar{\mathcal{M}}$, and let the cost of $(\hat{r}_{i}[k],l_{2})$ be $b_{k,i}$, denoted as $c(\hat{r}_{i}[k],l_{2})$, for all $(\hat{r}_{i}[k],l_{2})\in\bar{\mathcal{M}}$, where $c_{k,i}\in\mathbb{R}_{>0}$ and $b_{k,i}\in\mathbb{R}_{>0}$ are given in the instance of the PEMS problem. Setting the cost structure of the elements in $\bar{\mathcal{M}}$ in this way, we establish an equivalence between the cost of a subset $\mathcal{Y}\subseteq\bar{\mathcal{M}}$ and the cost of $\mu_{\mathcal{Y}}\in\mathcal{M}$, where $\mu_{\mathcal{Y}}$ is defined above. Similarly, considering the objective function $f_{d}:\mathcal{M}\to\mathbb{R}_{\geq 0}$ in the PEMS problem, we define a set function $f_{Pd}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$ as

f_{Pd}(\mathcal{Y})\triangleq f_{d}(\mathbf{0})-f_{d}(\mu_{\mathcal{Y}})=\ln\det(F_{p}+H(\mathcal{Y}))-\ln\det(F_{p})\quad\forall\mathcal{Y}\subseteq\bar{\mathcal{M}}, \quad (42)

where we define $\mu_{\mathcal{Y}}\in\mathcal{M}$ such that $\mu_{\mathcal{Y}}(\hat{x}_{i}[k])=|\{(\hat{x}_{i}[k],l_{1}):(\hat{x}_{i}[k],l_{1})\in\mathcal{Y}\}|$ and $\mu_{\mathcal{Y}}(\hat{r}_{i}[k])=|\{(\hat{r}_{i}[k],l_{2}):(\hat{r}_{i}[k],l_{2})\in\mathcal{Y}\}|$ for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$. Note that given an instance of the PEMS problem in Problem 5.2, we can construct the set $\bar{\mathcal{M}}$ with the associated costs of the elements in $\bar{\mathcal{M}}$ in $O(n(t_{2}-t_{1}+1)(\zeta+\eta))$ time, where $n$ is the number of nodes in graph $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$, and $\zeta,\eta\in\mathbb{Z}_{\geq 1}$ with $\zeta_{i}\leq\zeta$ and $\eta_{i}\leq\eta$ for all $i\in\mathcal{V}$. Assuming that $\zeta$ and $\eta$ are (fixed) constants, the construction of the set $\bar{\mathcal{M}}$ with the associated costs takes $O(n(t_{2}-t_{1}+1))$ time, which is polynomial in the parameters of the PEMS problem (Problem 5.2). Based on the above arguments, we further consider the following problem:

\begin{split}&\mathop{\max}_{\mathcal{Y}\subseteq\bar{\mathcal{M}}}f_{P}(\mathcal{Y})\\ s.t.&\ c(\mathcal{Y})\leq B,\end{split} \quad (P)

where $f_{P}(\cdot)$ can be either of $f_{Pa}(\cdot)$ or $f_{Pd}(\cdot)$ with $f_{Pa}(\cdot)$ and $f_{Pd}(\cdot)$ given in (41) and (42), respectively, and $c(\mathcal{Y})\triangleq\sum_{y\in\mathcal{Y}}c(y)$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$. By the way we construct $f_{P}(\cdot)$ and the costs of elements in $\bar{\mathcal{M}}$, one can verify that $\mathcal{Y}_{a}^{\star}\subseteq\bar{\mathcal{M}}$ (resp., $\mathcal{Y}_{d}^{\star}\subseteq\bar{\mathcal{M}}$) is an optimal solution to Problem (P) with $f_{P}(\cdot)=f_{Pa}(\cdot)$ (resp., $f_{P}(\cdot)=f_{Pd}(\cdot)$) if and only if $\mu_{\mathcal{Y}_{a}^{\star}}$ (resp., $\mu_{\mathcal{Y}_{d}^{\star}}$) defined above is an optimal solution to (33) in Problem 5.2 with $f(\cdot)=f_{a}(\cdot)$ (resp., $f(\cdot)=f_{d}(\cdot)$). Thus, given a PEMS instance, we can first construct $\bar{\mathcal{M}}$ with the associated cost for each element in $\bar{\mathcal{M}}$, and then solve Problem (P).
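To illustrate the transformation, here is a minimal Python sketch (our notation, not the authors' code) that builds the ground set $\bar{\mathcal{M}}$ from Eq. (38) with the cost structure above, and evaluates $f_{Pa}$ and $f_{Pd}$ via (41) and (42). It assumes the per-element matrices $H_{y}$ from (40) have already been approximated, e.g., by numerical integration.

```python
import numpy as np

def build_ground_set(nodes, times, zeta, eta, c, b):
    """Ground set M_bar from Eq. (38) with the costs assigned in the text.

    zeta[i], eta[i]: copy counts; c[(k, i)], b[(k, i)]: per-copy costs.
    Each element is a tuple (kind, i, k, copy_index).
    """
    ground, cost = [], {}
    for i in nodes:
        for k in times:
            for l1 in range(1, zeta[i] + 1):
                y = ('x', i, k, l1)
                ground.append(y); cost[y] = c[(k, i)]
            for l2 in range(1, eta[i] + 1):
                y = ('r', i, k, l2)
                ground.append(y); cost[y] = b[(k, i)]
    return ground, cost

def f_Pa(Y, H_of, F_p):
    """f_Pa(Y) = tr(F_p^{-1}) - tr((F_p + H(Y))^{-1}), Eq. (41)."""
    H = sum((H_of[y] for y in Y), np.zeros_like(F_p))
    return np.trace(np.linalg.inv(F_p)) - np.trace(np.linalg.inv(F_p + H))

def f_Pd(Y, H_of, F_p):
    """f_Pd(Y) = ln det(F_p + H(Y)) - ln det(F_p), Eq. (42)."""
    H = sum((H_of[y] for y in Y), np.zeros_like(F_p))
    return np.linalg.slogdet(F_p + H)[1] - np.linalg.slogdet(F_p)[1]
```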

5.3 Greedy Algorithm for the PEMS Problem

Note that Problem (P) can be viewed as a problem of maximizing a set function subject to a knapsack constraint, and greedy algorithms have been proposed to solve this problem with performance guarantees when the objective function is monotone nondecreasing and submodular (e.g., [16] and [29]). (A set function $g:2^{\mathcal{V}}\to\mathbb{R}$, where $\mathcal{V}=[n]$ is the ground set, is said to be monotone nondecreasing if $g(\mathcal{A})\leq g(\mathcal{B})$ for all $\mathcal{A}\subseteq\mathcal{B}\subseteq\mathcal{V}$, and submodular if $g(\{y\}\cup\mathcal{A})-g(\mathcal{A})\geq g(\{y\}\cup\mathcal{B})-g(\mathcal{B})$ for all $\mathcal{A}\subseteq\mathcal{B}\subseteq\mathcal{V}$ and for all $y\in\mathcal{V}\setminus\mathcal{B}$.) Before we formally introduce the greedy algorithm for the PEMS problem, we first note from (40)-(42) that given a prior pdf of $\theta$ and any $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, one has to take the expectation $\mathbb{E}_{\theta}[\cdot]$ in order to obtain the value of $f_{P}(\mathcal{Y})$. However, it is in general intractable to calculate the integral corresponding to $\mathbb{E}_{\theta}[\cdot]$ explicitly. Hence, one may alternatively evaluate $f_{P}(\mathcal{Y})$ using numerical integration with respect to $\theta=[\beta\ \delta]^{T}$ (e.g., [28]). Specifically, a typical numerical integration method (e.g., the trapezoid rule) approximates the integral of a function (over an interval) by a weighted sum of function values evaluated at certain points within the integration interval, which incurs an approximation error (see, e.g., [28] for more details). We then see from (40)-(42) that in order to apply such a numerical integration method to $f_{P}(\mathcal{Y})$, one has to obtain the values of $x_{i}[k]$, $r_{i}[k]$, $\frac{\partial x_{i}[k]}{\partial\theta}$, and $\frac{\partial r_{i}[k]}{\partial\theta}$ for a given $\theta$ (within the integration interval), where $i\in\mathcal{V}$ and $t_{1}\leq k\leq t_{2}$ with $t_{1},t_{2}$ given in an instance of the PEMS problem. Recall that the initial conditions $s[0]$, $x[0]$ and $r[0]$ are assumed to be known. We first observe that for any given $\theta$, the values of $x_{i}[k]$ and $r_{i}[k]$ for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$ can be obtained using the recursions in (1) in $O((t_{2}-t_{1}+1)n^{2})$ time. Next, noting that $\frac{\partial x_{i}[k]}{\partial\theta}=[\frac{\partial x_{i}[k]}{\partial\beta}\ \frac{\partial x_{i}[k]}{\partial\delta}]^{T}$ and $\frac{\partial r_{i}[k]}{\partial\theta}=[\frac{\partial r_{i}[k]}{\partial\beta}\ \frac{\partial r_{i}[k]}{\partial\delta}]^{T}$, we take the derivative with respect to $\beta$ on both sides of the equations in (1) and obtain

\begin{split}&\frac{\partial s_{i}[k+1]}{\partial\beta}=\frac{\partial s_{i}[k]}{\partial\beta}-h\big{(}\frac{\partial s_{i}[k]}{\partial\beta}\beta+s_{i}[k]\big{)}\big{(}\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k]\big{)}-hs_{i}[k]\beta\big{(}\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}\frac{\partial x_{j}[k]}{\partial\beta}\big{)},\\ &\frac{\partial x_{i}[k+1]}{\partial\beta}=(1-h\delta)\frac{\partial x_{i}[k]}{\partial\beta}+h\big{(}\frac{\partial s_{i}[k]}{\partial\beta}\beta+s_{i}[k]\big{)}\big{(}\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k]\big{)}+hs_{i}[k]\beta\big{(}\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}\frac{\partial x_{j}[k]}{\partial\beta}\big{)},\\ &\frac{\partial r_{i}[k+1]}{\partial\beta}=\frac{\partial r_{i}[k]}{\partial\beta}+h\delta\frac{\partial x_{i}[k]}{\partial\beta}.\end{split} \quad (43)

Considering any given $\beta$, we can then use the recursion in (1) together with the recursion in (43) to obtain the values of $\frac{\partial x_{i}[k]}{\partial\beta}$ and $\frac{\partial r_{i}[k]}{\partial\beta}$ for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$ in $O((t_{2}-t_{1}+1)n^{2})$ time. Similarly, considering any given $\delta$, one can obtain the values of $\frac{\partial x_{i}[k]}{\partial\delta}$ and $\frac{\partial r_{i}[k]}{\partial\delta}$ for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$ in $O((t_{2}-t_{1}+1)n^{2})$ time.
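For concreteness, the following sketch (our variable names; the states are assumed to be stored as numpy arrays) propagates Eq. (1) jointly with the $\beta$-sensitivity recursion (43); the $\delta$-sensitivities are obtained analogously.

```python
import numpy as np

def sir_beta_sensitivities(A, h, beta, delta, s0, x0, r0, T):
    """Run Eq. (1) and the beta-sensitivity recursion (43) for k = 0..T.

    Returns lists (indexed by k) of x[k], r[k], dx/dbeta, dr/dbeta.
    The initial conditions do not depend on beta, so the sensitivities
    start at zero.
    """
    n = A.shape[0]
    s, x, r = s0.copy(), x0.copy(), r0.copy()
    ds, dx, dr = np.zeros(n), np.zeros(n), np.zeros(n)
    xs, rs, dxs, drs = [x.copy()], [r.copy()], [dx.copy()], [dr.copy()]
    for _ in range(T):
        Ax, Adx = A @ x, A @ dx  # neighborhood sums in Eq. (1) and (43)
        s_new = s - h * beta * s * Ax
        x_new = (1 - h * delta) * x + h * beta * s * Ax
        r_new = r + h * delta * x
        # Eq. (43): each update of Eq. (1) differentiated w.r.t. beta.
        ds_new = ds - h * (ds * beta + s) * Ax - h * s * beta * Adx
        dx_new = (1 - h * delta) * dx + h * (ds * beta + s) * Ax \
                 + h * s * beta * Adx
        dr_new = dr + h * delta * dx
        s, x, r, ds, dx, dr = s_new, x_new, r_new, ds_new, dx_new, dr_new
        xs.append(x.copy()); rs.append(r.copy())
        dxs.append(dx.copy()); drs.append(dr.copy())
    return xs, rs, dxs, drs
```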

Putting the above arguments together and considering the prior pdf of $\theta$, i.e., $p(\theta)$, we see from (40)-(42) that for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, an approximation of $f_{P}(\mathcal{Y})$, denoted as $\hat{f}_{P}(\mathcal{Y})$, can be obtained in $O(n_{I}(t_{2}-t_{1}+1)n^{2})$ time, where $n_{I}\in\mathbb{Z}_{\geq 1}$ is the number of points used for the numerical integration with respect to $\theta$, as described above (we assume that $n_{I}$ is polynomial in the parameters of the PEMS instance). Furthermore, in the sequel, we assume that $\hat{f}_{P}(\cdot)$ satisfies $|\hat{f}_{P}(\mathcal{Y})-f_{P}(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$ (with $\hat{f}_{P}(\emptyset)=0$), where $\varepsilon\in\mathbb{R}_{\geq 0}$. (Here, $\varepsilon$ is related to the approximation error of the numerical integration method, and $\varepsilon$ decreases as $n_{I}$ increases; see, e.g., [28] for more details.) We are now ready to introduce the greedy algorithm, given in Algorithm 2, to solve the PEMS problem, where $\hat{f}_{P}(\cdot)\in\{\hat{f}_{Pa}(\cdot),\hat{f}_{Pd}(\cdot)\}$ and $\hat{f}_{P}(\cdot)$ is the approximation of $f_{P}(\cdot)$ described above. From the definition of Algorithm 2, we see that the number of function calls of $\hat{f}_{P}(\cdot)$ required by the algorithm is $O(|\bar{\mathcal{M}}|^{2})$, and thus the overall time complexity of Algorithm 2 is $O(n_{I}(t_{2}-t_{1}+1)n^{2}|\bar{\mathcal{M}}|^{2})$.

We proceed to analyze the performance of Algorithm 2 when applied to the PEMS problem. First, one can observe that $f_{Pd}(\mathcal{Y})=\ln\det(F_{p}+H(\mathcal{Y}))-\ln\det(F_{p})$ in Problem (P) shares a similar form to the objective in [30]. Thus, using similar arguments to those in [30], one can show that $f_{Pd}(\cdot)$ is monotone nondecreasing and submodular with $f_{Pd}(\emptyset)=0$. Noting the assumption that $|\hat{f}_{Pd}(\mathcal{Y})-f_{Pd}(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, one can show that $y^{\star}$ given by line 6 of Algorithm 2 satisfies $\frac{f_{Pd}(\{y^{\star}\}\cup\mathcal{Y}_{2})-f_{Pd}(\mathcal{Y}_{2})+\varepsilon}{c(y^{\star})}\geq\frac{f_{Pd}(\{y\}\cup\mathcal{Y}_{2})-f_{Pd}(\mathcal{Y}_{2})-\varepsilon}{c(y)}$ for all $y\in\mathcal{C}$. Similarly, one can show that $\mathop{\max}_{y\in\bar{\mathcal{M}}}f_{Pd}(\{y\})\leq f_{Pd}(\mathcal{Y}_{1})+\varepsilon$, where $\mathcal{Y}_{1}$ is given by line 3 in Algorithm 2. One can then use similar arguments to those for Theorem 1 in [16] to obtain the following result; the detailed proof is omitted for conciseness.

Theorem 5.4.

Consider Problem (P) with the objective function $f_{Pd}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$ given by (42). Then Algorithm 2 yields a solution, denoted as $\mathcal{Y}_{d}^{g}$, to Problem (P) that satisfies

f_{Pd}(\mathcal{Y}_{d}^{g})\geq\frac{1}{2}(1-e^{-1})f_{Pd}(\mathcal{Y}_{d}^{\star})-\Big{(}\frac{B}{c_{\mathop{\min}}}+\frac{3}{2}\Big{)}\varepsilon, \quad (44)

where $\mathcal{Y}_{d}^{\star}\subseteq\bar{\mathcal{M}}$ is an optimal solution to Problem (P), $c_{\mathop{\min}}=\mathop{\min}_{y\in\bar{\mathcal{M}}}c(y)$ (we can assume without loss of generality that $c(y)\leq B$ for all $y\in\bar{\mathcal{M}}$), and $\varepsilon\in\mathbb{R}_{\geq 0}$ satisfies $|\hat{f}_{Pd}(\mathcal{Y})-f_{Pd}(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$.

Algorithm 2 Greedy algorithm for PEMS
1: Input: An instance of PEMS transformed into the form in (P)
2: Output: $\mathcal{Y}_{g}$
3: Find $\mathcal{Y}_{1}\in\mathop{\arg\max}_{y\in\bar{\mathcal{M}}}\hat{f}_{P}(y)$
4: Initialize $\mathcal{Y}_{2}=\emptyset$ and $\mathcal{C}=\bar{\mathcal{M}}$
5: while $\mathcal{C}\neq\emptyset$ do
6:     Find $y^{\star}\in\mathop{\arg\max}_{y\in\mathcal{C}}\frac{\hat{f}_{P}(\{y\}\cup\mathcal{Y}_{2})-\hat{f}_{P}(\mathcal{Y}_{2})}{c(y)}$
7:     if $c(y^{\star})+c(\mathcal{Y}_{2})\leq B$ then
8:         $\mathcal{Y}_{2}=\{y^{\star}\}\cup\mathcal{Y}_{2}$
9:     $\mathcal{C}=\mathcal{C}\setminus\{y^{\star}\}$
10: $\mathcal{Y}_{g}\in\mathop{\arg\max}_{\mathcal{Y}\in\{\mathcal{Y}_{1},\mathcal{Y}_{2}\}}\hat{f}_{P}(\mathcal{Y})$
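A direct Python transcription of Algorithm 2 might look as follows, with the approximate objective $\hat{f}_{P}$ supplied as a callable (for instance, one of the evaluation routines sketched earlier). This is a sketch under our own naming conventions, not the authors' implementation.

```python
def greedy_pems(ground, cost, f_hat, budget):
    """Greedy algorithm (Algorithm 2) for Problem (P).

    ground: list of elements of M_bar;  cost: dict element -> cost > 0;
    f_hat : callable evaluating the (approximate) objective on a set;
    budget: knapsack budget B.
    """
    # Line 3: best single element.
    Y1 = {max(ground, key=lambda y: f_hat({y}))}
    # Lines 4-9: cost-benefit greedy selection.
    Y2, C = set(), set(ground)
    while C:
        f_Y2 = f_hat(Y2)
        y_star = max(C, key=lambda y: (f_hat(Y2 | {y}) - f_Y2) / cost[y])
        if cost[y_star] + sum(cost[y] for y in Y2) <= budget:
            Y2 = Y2 | {y_star}
        C.remove(y_star)
    # Line 10: return the better of the two candidates.
    return Y1 if f_hat(Y1) > f_hat(Y2) else Y2
```

Note that the division in line 6 is well defined since all element costs are positive, and that the algorithm makes $O(|\bar{\mathcal{M}}|^{2})$ calls to $\hat{f}_{P}$, matching the complexity discussed above.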

In contrast to $f_{Pd}(\cdot)$, the objective function $f_{Pa}(\cdot)$ is not submodular in general (e.g., [17]). In fact, one can construct examples where the objective function $f_{Pa}(\mathcal{Y})=\operatorname{tr}((F_{p})^{-1})-\operatorname{tr}((F_{p}+H(\mathcal{Y}))^{-1})$ in the PEMS problem is not submodular. Hence, in order to provide performance guarantees for the greedy algorithm when applied to Problem (P) with $f_{P}(\cdot)=f_{Pa}(\cdot)$, we will extend the analysis in [16] to nonsubmodular settings. To proceed, note that for all $\mathcal{A}\subseteq\mathcal{B}\subseteq\bar{\mathcal{M}}$, we have $F_{p}+H(\mathcal{A})\preceq F_{p}+H(\mathcal{B})$, which implies $(F_{p}+H(\mathcal{A}))^{-1}\succeq(F_{p}+H(\mathcal{B}))^{-1}$ and $\operatorname{tr}((F_{p}+H(\mathcal{A}))^{-1})\geq\operatorname{tr}((F_{p}+H(\mathcal{B}))^{-1})$ [10]. Therefore, the objective function $f_{Pa}(\cdot)$ is monotone nondecreasing with $f_{Pa}(\emptyset)=0$. We then characterize how close $f_{Pa}(\cdot)$ is to being submodular by introducing the following definition.

Definition 5.5.

Consider Problem (P) with $f_{P}(\cdot)=f_{Pa}(\cdot)$, where $f_{Pa}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$ is defined in (39). Suppose Algorithm 2 is applied to solve Problem (P). For all $j\in\{1,\dots,|\mathcal{Y}_{2}|\}$, let $\mathcal{Y}_{2}^{j}=\{y_{1},\dots,y_{j}\}$ denote the set that contains the first $j$ elements added to set $\mathcal{Y}_{2}$ in Algorithm 2, and let $\mathcal{Y}_{2}^{0}=\emptyset$. The type-1 greedy submodularity ratio of $f_{Pa}(\cdot)$ is defined to be the largest $\gamma_{1}\in\mathbb{R}$ that satisfies

\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\big{(}f_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})\big{)}\geq\gamma_{1}\big{(}f_{Pa}(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})\big{)}, \quad (45)

for all $\mathcal{A}\subseteq\bar{\mathcal{M}}$ and for all $j\in\{0,\dots,|\mathcal{Y}_{2}|\}$. The type-2 greedy submodularity ratio of $f_{Pa}(\cdot)$ is defined to be the largest $\gamma_{2}\in\mathbb{R}$ that satisfies

f_{Pa}(\mathcal{Y}_{1})-f_{Pa}(\emptyset)\geq\gamma_{2}\big{(}f_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})\big{)}, \quad (46)

for all $j\in\{0,\dots,|\mathcal{Y}_{2}|\}$ and for all $y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}$ such that $c(y)+c(\mathcal{Y}_{2}^{j})>B$, where $\mathcal{Y}_{1}\in\mathop{\arg\max}_{y\in\bar{\mathcal{M}}}\hat{f}_{Pa}(y)$.

Remark 5.6.

Note that $f_{Pa}(\cdot)$ is monotone nondecreasing, as argued above. Noting the definition of $\gamma_{1}$ in (45), one can use similar arguments to those in [5] to show that $\gamma_{1}\in[0,1]$; if $f_{Pa}(\cdot)$ is submodular, then $\gamma_{1}=1$. Similarly, one can show that $\gamma_{2}\geq 0$. Supposing that $\mathcal{Y}_{1}\in\mathop{\arg\max}_{y\in\bar{\mathcal{M}}}f_{Pa}(y)$, one can further show that if $f_{Pa}(\cdot)$ is submodular, then $\gamma_{2}\geq 1$.

Note that since we approximate $f_{Pa}(\cdot)$ by $\hat{f}_{Pa}(\cdot)$ as argued above, we may not be able to obtain the exact values of $\gamma_{1}$ and $\gamma_{2}$ from Definition 5.5. Moreover, finding $\gamma_{1}$ may require an exponential number of function calls of $f_{Pa}(\cdot)$ (or $\hat{f}_{Pa}(\cdot)$). Nonetheless, it will be clear from our analysis below that obtaining lower bounds on $\gamma_{1}$ and $\gamma_{2}$ suffices. Here, we describe how we obtain a lower bound on $\gamma_{2}$ using $\hat{f}_{Pa}(\cdot)$, and defer our analysis for lower bounding $\gamma_{1}$, which requires more care, to the end of this section. Similarly to (46), let $\hat{\gamma}_{2}$ denote the largest real number that satisfies

\hat{f}_{Pa}(\mathcal{Y}_{1})-\frac{\varepsilon}{2}\geq\hat{\gamma}_{2}\big{(}\hat{f}_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-\hat{f}_{Pa}(\mathcal{Y}_{2}^{j})+\varepsilon\big{)}, \quad (47)

for all $j\in\{0,\dots,|\mathcal{Y}_{2}|\}$ and for all $y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}$ such that $c(y)+c(\mathcal{Y}_{2}^{j})>B$. Noting our assumption that $|\hat{f}_{Pa}(\mathcal{Y})-f_{Pa}(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$ (with $\hat{f}_{Pa}(\emptyset)=0$), one can see that $\hat{\gamma}_{2}$ given by (47) also satisfies (46), which implies $\hat{\gamma}_{2}\leq\gamma_{2}$. Given $\mathcal{Y}_{2}^{j}$ for all $j\in\{0,\dots,|\mathcal{Y}_{2}|\}$ from Algorithm 2, $\hat{\gamma}_{2}$ can now be obtained via $O(|\bar{\mathcal{M}}|^{2})$ function calls of $\hat{f}_{Pa}(\cdot)$.
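Since (47) must hold for every admissible pair $(j,y)$, $\hat{\gamma}_{2}$ is simply the minimum of the corresponding ratios. A sketch (our notation, assuming $\varepsilon>0$ so that the denominators are positive):

```python
def gamma2_hat(prefixes, ground, cost, f_hat, budget, eps):
    """Largest gamma_2_hat satisfying (47): the minimum of the admissible
    ratios over all greedy prefixes Y_2^j and budget-violating candidates y.

    prefixes: list [Y_2^0, Y_2^1, ...] of greedy prefixes (sets).
    """
    Y1 = {max(ground, key=lambda y: f_hat({y}))}  # line 3 of Algorithm 2
    lhs = f_hat(Y1) - eps / 2.0
    ratios = []
    for Yj in prefixes:
        c_Yj = sum(cost[y] for y in Yj)
        for y in set(ground) - Yj:
            if cost[y] + c_Yj > budget:  # only infeasible candidates count
                gain = f_hat(Yj | {y}) - f_hat(Yj) + eps  # > 0 for eps > 0
                ratios.append(lhs / gain)
    return min(ratios) if ratios else float('inf')
```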

Based on Definition 5.5, the following result extends the analysis in [15, 16], and characterizes the performance guarantees of Algorithm 2 for Problem (P) with $f_{P}(\cdot)=f_{Pa}(\cdot)$.

Theorem 5.7.

Consider Problem (P) with the objective function $f_{Pa}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$ given by (39). Then Algorithm 2 yields a solution, denoted as $\mathcal{Y}_{a}^{g}$, to Problem (P) that satisfies

f_{Pa}(\mathcal{Y}_{a}^{g})\geq\frac{\mathop{\min}\{\gamma_{2},1\}}{2}(1-e^{-\gamma_{1}})f_{Pa}(\mathcal{Y}_{a}^{\star})-\Big{(}\frac{B+c_{\mathop{\max}}}{c_{\mathop{\min}}}+1\Big{)}\varepsilon, \quad (48)

where $\mathcal{Y}_{a}^{\star}\subseteq\bar{\mathcal{M}}$ is an optimal solution to Problem (P), $\gamma_{1}\in\mathbb{R}_{\geq 0}$ and $\gamma_{2}\in\mathbb{R}_{\geq 0}$ are defined in Definition 5.5, $c_{\mathop{\min}}=\mathop{\min}_{y\in\bar{\mathcal{M}}}c(y)$, $c_{\mathop{\max}}=\mathop{\max}_{y\in\bar{\mathcal{M}}}c(y)$, and $\varepsilon\in\mathbb{R}_{\geq 0}$ satisfies $|\hat{f}_{Pa}(\mathcal{Y})-f_{Pa}(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$.

Proof.

Noting that (48) holds trivially if $\gamma_{1}=0$ or $\gamma_{2}=0$, we assume that $\gamma_{1}>0$ and $\gamma_{2}>0$. In this proof, we drop the subscript of $f_{Pa}(\cdot)$ (resp., $\hat{f}_{Pa}(\cdot)$) and write $f(\cdot)$ (resp., $\hat{f}(\cdot)$) for notational simplicity. First, recall that for all $j\in\{1,\dots,|\mathcal{Y}_{2}|\}$, we let $\mathcal{Y}_{2}^{j}=\{y_{1},\dots,y_{j}\}$ denote the set that contains the first $j$ elements added to set $\mathcal{Y}_{2}$ in Algorithm 2, and let $\mathcal{Y}_{2}^{0}=\emptyset$. Now, let $j_{l}$ be the first index in $\{1,\dots,|\mathcal{Y}_{2}|\}$ such that a candidate element $y^{\star}\in\mathop{\arg\max}_{y\in\mathcal{C}}\frac{\hat{f}(\{y\}\cup\mathcal{Y}_{2}^{j_{l}})-\hat{f}(\mathcal{Y}_{2}^{j_{l}})}{c(y)}$ for $\mathcal{Y}_{2}$ (given in line 6 of Algorithm 2) cannot be added to $\mathcal{Y}_{2}$ due to $c(y^{\star})+c(\mathcal{Y}_{2}^{j_{l}})>B$. In other words, for all $j\in\{0,\dots,j_{l}-1\}$, any candidate element $y^{\star}\in\mathop{\arg\max}_{y\in\mathcal{C}}\frac{\hat{f}(\{y\}\cup\mathcal{Y}_{2}^{j})-\hat{f}(\mathcal{Y}_{2}^{j})}{c(y)}$ for $\mathcal{Y}_{2}$ satisfies $c(y^{\star})+c(\mathcal{Y}_{2}^{j})\leq B$ and can be added to $\mathcal{Y}_{2}$ in Algorithm 2. Noting that $|\hat{f}(\mathcal{Y})-f(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, one can then show that the following holds for all $j\in\{0,\dots,j_{l}-1\}$:

\frac{f(\mathcal{Y}_{2}^{j+1})-f(\mathcal{Y}_{2}^{j})+\varepsilon}{c(y_{j+1})}\geq\frac{f(\{y\}\cup\mathcal{Y}_{2}^{j})-f(\mathcal{Y}_{2}^{j})-\varepsilon}{c(y)}\quad\forall y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}. \quad (49)

Now, considering any $j\in\{0,\dots,j_{l}-1\}$, we have the following:

\begin{aligned}f(\mathcal{Y}^{\star}_{a}\cup\mathcal{Y}_{2}^{j})-f(\mathcal{Y}_{2}^{j})&\leq\frac{1}{\gamma_{1}}\sum_{y\in\mathcal{Y}^{\star}_{a}\setminus\mathcal{Y}^{j}_{2}}c(y)\cdot\frac{f(\{y\}\cup\mathcal{Y}^{j}_{2})-f(\mathcal{Y}^{j}_{2})}{c(y)} \quad (50)\\ &\leq\frac{1}{\gamma_{1}}\sum_{y\in\mathcal{Y}^{\star}_{a}\setminus\mathcal{Y}_{2}^{j}}c(y)\Big{(}\frac{f(\mathcal{Y}^{j+1}_{2})-f(\mathcal{Y}^{j}_{2})+\varepsilon}{c(y_{j+1})}+\frac{\varepsilon}{c(y)}\Big{)} \quad (51)\\ &\leq\frac{B}{\gamma_{1}}\cdot\frac{f(\mathcal{Y}^{j+1}_{2})-f(\mathcal{Y}^{j}_{2})}{c(y_{j+1})}+\frac{\varepsilon}{\gamma_{1}}\sum_{y\in\mathcal{Y}_{a}^{\star}\setminus\mathcal{Y}_{2}^{j}}\Big{(}\frac{c(y)}{c(y_{j+1})}+1\Big{)} \quad (52)\\ &\leq\frac{B}{\gamma_{1}}\cdot\frac{f(\mathcal{Y}^{j+1}_{2})-f(\mathcal{Y}^{j}_{2})}{c(y_{j+1})}+\frac{\varepsilon}{\gamma_{1}}\Big{(}\frac{B}{c(y_{j+1})}+|\mathcal{Y}_{a}^{\star}|\Big{)}, \quad (53)\end{aligned}

where (50) follows from the definition of $\gamma_{1}$ in (45), and (51) follows from (49). To obtain (52), we use the fact that $c(\mathcal{Y}^{\star}_{a})\leq B$. Similarly, we obtain (53). Noting that $f(\cdot)$ is monotone nondecreasing, one can further obtain from (53) that

f(\mathcal{Y}_{2}^{j+1})-f(\mathcal{Y}_{2}^{j})\geq\frac{\gamma_{1}c(y_{j+1})}{B}\big{(}f(\mathcal{Y}_{a}^{\star})-f(\mathcal{Y}_{2}^{j})\big{)}-\varepsilon\Big{(}1+|\mathcal{Y}_{a}^{\star}|\frac{c(y_{j+1})}{B}\Big{)}. \quad (54)

To proceed, let $y^{\prime}\in\mathop{\arg\max}_{y\in\mathcal{C}}\frac{\hat{f}(\{y\}\cup\mathcal{Y}_{2}^{j_{l}})-\hat{f}(\mathcal{Y}_{2}^{j_{l}})}{c(y)}$ be the (first) candidate element for $\mathcal{Y}_{2}$ that cannot be added to $\mathcal{Y}_{2}$ due to $c(y^{\prime})+c(\mathcal{Y}_{2}^{j_{l}})>B$, as argued above. Similarly to (49), one can see that $\frac{f(\{y^{\prime}\}\cup\mathcal{Y}_{2}^{j_{l}})-f(\mathcal{Y}_{2}^{j_{l}})+\varepsilon}{c(y^{\prime})}\geq\frac{f(\{y\}\cup\mathcal{Y}_{2}^{j_{l}})-f(\mathcal{Y}_{2}^{j_{l}})-\varepsilon}{c(y)}$ holds for all $y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j_{l}}$. Letting $\bar{\mathcal{Y}}_{2}^{j_{l}+1}\triangleq\{y^{\prime}\}\cup\mathcal{Y}_{2}^{j_{l}}$ and following similar arguments to those leading up to (54), we have

f(\bar{\mathcal{Y}}_{2}^{j_{l}+1})-f(\mathcal{Y}_{2}^{j_{l}})\geq\frac{\gamma_{1}c(y^{\prime})}{B}\big{(}f(\mathcal{Y}_{a}^{\star})-f(\mathcal{Y}_{2}^{j_{l}})\big{)}-\varepsilon\Big{(}1+|\mathcal{Y}_{a}^{\star}|\frac{c(y^{\prime})}{B}\Big{)}. \quad (55)

Denoting $\Delta_{j}\triangleq f(\mathcal{Y}^{\star}_{a})-f(\mathcal{Y}^{j}_{2})$ for all $j\in\{0,\dots,j_{l}\}$ and $\Delta_{j_{l}+1}\triangleq f(\mathcal{Y}_{a}^{\star})-f(\bar{\mathcal{Y}}_{2}^{j_{l}+1})$, we obtain from (54) and (55) the following:

\Delta_{j}\leq\Delta_{j-1}\Big{(}1-\frac{c(y_{j})\gamma_{1}}{B}\Big{)}+\varepsilon+\frac{c(y_{j})|\mathcal{Y}_{a}^{\star}|}{B}\varepsilon\quad\forall j\in[j_{l}+1]. \quad (56)

Unrolling (56) yields

\begin{aligned}&\Delta_{j_{l}+1}\leq\Delta_{0}\Big{(}\prod_{j=1}^{j_{l}}\big{(}1-\frac{c(y_{j})\gamma_{1}}{B}\big{)}\Big{)}\Big{(}1-\frac{c(y^{\prime})\gamma_{1}}{B}\Big{)}+\Big{(}j_{l}+1+\frac{c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})|\mathcal{Y}_{a}^{\star}|}{B}\Big{)}\varepsilon, \quad (57)\\ \implies&\Delta_{j_{l}+1}\leq\Delta_{0}\Big{(}\prod_{j=1}^{j_{l}}\big{(}1-\frac{c(y_{j})\gamma_{1}}{B}\big{)}\Big{)}\Big{(}1-\frac{c(y^{\prime})\gamma_{1}}{B}\Big{)}+\frac{2(B+c_{\mathop{\max}})}{c_{\mathop{\min}}}\varepsilon. \quad (58)\end{aligned}

To obtain (57), we use the facts that $1-\frac{c(y_{j})\gamma_{1}}{B}\leq 1$ for all $j\in[j_{l}+1]$ and $1-\frac{c(y^{\prime})\gamma_{1}}{B}\leq 1$, since $\gamma_{1}\in(0,1]$ as argued in Remark 5.6. To obtain (58), we first note from the way we defined $j_{l}$ that $j_{l}+1\leq c(\bar{\mathcal{Y}}^{j_{l}+1}_{2})/c_{\mathop{\min}}\leq(B+c_{\mathop{\max}})/c_{\mathop{\min}}$. Also noting that $|\mathcal{Y}_{a}^{\star}|\leq B/c_{\mathop{\min}}$ and $c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})\leq B+c_{\mathop{\max}}$, we then obtain (58).

Now, one can show that $\big{(}\prod_{j=1}^{j_{l}}(1-\frac{c(y_{j})\gamma_{1}}{B})\big{)}(1-\frac{c(y^{\prime})\gamma_{1}}{B})\leq\prod_{j=1}^{j_{l}+1}\big{(}1-\frac{c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})\gamma_{1}}{(j_{l}+1)B}\big{)}\leq e^{-\gamma_{1}\frac{c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})}{B}}$ (e.g., [15]). We then have from (58) the following:

\begin{aligned}&f(\mathcal{Y}_{a}^{\star})-f(\bar{\mathcal{Y}}_{2}^{j_{l}+1})\leq f(\mathcal{Y}^{\star}_{a})e^{-\gamma_{1}\frac{c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})}{B}}+\frac{2(B+c_{\mathop{\max}})}{c_{\mathop{\min}}}\varepsilon\\ \implies&f(\bar{\mathcal{Y}}_{2}^{j_{l}+1})\geq(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{2(B+c_{\mathop{\max}})}{c_{\mathop{\min}}}\varepsilon, \quad (59)\end{aligned}

where (59) follows from $c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})>B$.

To proceed with the proof of the theorem, we note from the definition of $\gamma_{2}$ in Definition 5.5 that $f(\{y^{\prime}\}\cup\mathcal{Y}_{2}^{j_{l}})-f(\mathcal{Y}_{2}^{j_{l}})\leq\frac{1}{\gamma_{2}}f(\mathcal{Y}_{1})$ with $\gamma_{2}>0$, which together with (59) implies that

f(\mathcal{Y}_{2}^{j_{l}})+\frac{1}{\gamma_{2}}f(\mathcal{Y}_{1})\geq f(\bar{\mathcal{Y}}_{2}^{j_{l}+1})\geq(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{2\bar{B}}{c_{\mathop{\min}}}\varepsilon, \quad (60)

where $\bar{B}\triangleq B+c_{\mathop{\max}}$. Since $f(\cdot)$ is monotone nondecreasing, we obtain from (60)

f(\mathcal{Y}_{2})+\frac{1}{\gamma_{2}}f(\mathcal{Y}_{1})\geq(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{2\bar{B}}{c_{\mathop{\min}}}\varepsilon. \quad (61)

We then split our analysis into two cases. First, supposing that $\gamma_{2}\geq 1$, we see from (61) that at least one of $f(\mathcal{Y}_{2})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\bar{B}}{c_{\mathop{\min}}}\varepsilon$ and $f(\mathcal{Y}_{1})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\bar{B}}{c_{\mathop{\min}}}\varepsilon$ holds. Recalling that $|\hat{f}(\mathcal{Y})-f(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, it follows that at least one of $\hat{f}(\mathcal{Y}_{2})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\bar{B}}{c_{\mathop{\min}}}\varepsilon-\frac{\varepsilon}{2}$ and $\hat{f}(\mathcal{Y}_{1})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\bar{B}}{c_{\mathop{\min}}}\varepsilon-\frac{\varepsilon}{2}$ holds. Second, supposing $\gamma_{2}<1$ and using similar arguments, we have that at least one of $\hat{f}(\mathcal{Y}_{2})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\bar{B}}{c_{\mathop{\min}}}\varepsilon-\frac{\varepsilon}{2}$ and $\hat{f}(\mathcal{Y}_{1})\geq\frac{\gamma_{2}}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\gamma_{2}\bar{B}}{c_{\mathop{\min}}}\varepsilon-\frac{\varepsilon}{2}$ holds. Now, we note from line 10 of Algorithm 2 that $\hat{f}(\mathcal{Y}_{a}^{g})\geq\mathop{\max}\{\hat{f}(\mathcal{Y}_{1}),\hat{f}(\mathcal{Y}_{2})\}$, which implies $f(\mathcal{Y}_{a}^{g})\geq\mathop{\max}\{\hat{f}(\mathcal{Y}_{1}),\hat{f}(\mathcal{Y}_{2})\}-\frac{\varepsilon}{2}$. Combining the above arguments, we obtain (48). ∎

Remark 5.8.

Note that (48) becomes $f_{Pa}(\mathcal{Y}_{a}^{g})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f_{Pa}(\mathcal{Y}_{a}^{\star})-(\frac{B+c_{\mathop{\max}}}{c_{\mathop{\min}}}+1)\varepsilon$ if $\gamma_{2}\geq 1$. Also note that $\gamma_{2}\geq 1$ can hold even when the objective function $f_{Pa}(\cdot)$ is not submodular, as we will see later in our numerical examples.

Remark 5.9.

The authors in [31] also extended the analysis of Algorithm 2 to nonsubmodular settings, for the case where the objective function can be evaluated exactly (i.e., $\varepsilon=0$). They obtained a performance guarantee for Algorithm 2 that depends on a submodularity ratio defined in a different manner. One can show that the submodularity ratios defined in Definition 5.5 are lower bounded by the one defined in [31], which further implies that the performance bound (when $\varepsilon=0$) for Algorithm 2 given in Theorem 5.7 is tighter than that provided in [31].

Finally, we aim to provide a lower bound on $\gamma_{1}$ that can be computed in polynomial time. The lower bounds on $\gamma_{1}$ and $\gamma_{2}$, together with Theorem 5.7, then provide computable performance guarantees for the greedy algorithm.

Lemma 5.10.

([10]) For any positive semidefinite matrices $P,Q\in\mathbb{R}^{n\times n}$, $\lambda_{1}(P)\leq\lambda_{1}(P+Q)\leq\lambda_{1}(P)+\lambda_{1}(Q)$, and $\lambda_{n}(P+Q)\geq\lambda_{n}(P)+\lambda_{n}(Q)$.

We have the following result; the proof is included in Section 7.4 in the Appendix.

Lemma 5.11.

Consider the set function $f_{Pa}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$ defined in (39). The type-1 greedy submodularity ratio of $f_{Pa}(\cdot)$ given by Definition 5.5 satisfies

\gamma_{1}\geq\mathop{\min}_{j\in\{0,\dots,|\mathcal{Y}_{2}|\}}\frac{\lambda_{2}(F_{p}+H(\mathcal{Y}_{2}^{j}))\lambda_{2}(F_{p}+H(\{z_{j}\}\cup\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p}+H(\mathcal{Y}_{2}^{j}))\lambda_{1}(F_{p}+H(\{z_{j}\}\cup\mathcal{Y}_{2}^{j}))}, \quad (62)

where $\mathcal{Y}_{2}^{j}$ contains the first $j$ elements added to $\mathcal{Y}_{2}$ in Algorithm 2 for all $j\in\{1,\dots,|\mathcal{Y}_{2}|\}$ with $\mathcal{Y}_{2}^{0}=\emptyset$, $F_{p}$ is given by (31), $H(\mathcal{Y})=\sum_{y\in\mathcal{Y}}H_{y}$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$ with $H_{y}\succeq\mathbf{0}$ defined in (40), and $z_{j}\in\mathop{\arg\min}_{y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}}\frac{\lambda_{2}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}$ for all $j\in\{0,\dots,|\mathcal{Y}_{2}|\}$.

Recalling our arguments at the beginning of this section, we may only be able to obtain approximations of the entries of the ($2\times 2$) matrix $F_{p}+H(\mathcal{Y})$ for $\mathcal{Y}\subseteq\bar{\mathcal{M}}$ using, e.g., numerical integration, where $H_{y}$ (resp., $F_{p}$) is defined in (40) (resp., (31)). Specifically, for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, let $\hat{H}(\mathcal{Y})=(F_{p}+H(\mathcal{Y}))+E(\mathcal{Y})$ be the approximation of $F_{p}+H(\mathcal{Y})$, where each entry of $E(\mathcal{Y})\in\mathbb{R}^{2\times 2}$ represents the approximation error of the corresponding entry of $F_{p}+H(\mathcal{Y})$. Since $F_{p}$ and $H(\mathcal{Y})$ are positive semidefinite matrices, $E(\mathcal{Y})$ is a symmetric matrix. Now, using a standard eigenvalue perturbation result, e.g., Corollary 6.3.8 in [10], one can obtain $\sum_{i=1}^{2}|\lambda_{i}(F_{p}+H(\mathcal{Y}))-\lambda_{i}(\hat{H}(\mathcal{Y}))|^{2}\leq\|E(\mathcal{Y})\|_{F}^{2}$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, where $\|E(\mathcal{Y})\|_{F}\triangleq\sqrt{\operatorname{tr}(E(\mathcal{Y})^{\top}E(\mathcal{Y}))}$ denotes the Frobenius norm of a matrix. It then follows that

\frac{\lambda_{2}(F_{p}+H(\mathcal{Y}))}{\lambda_{1}(F_{p}+H(\mathcal{Y}))}\geq\frac{\lambda_{2}(\hat{H}(\mathcal{Y}))-\|E(\mathcal{Y})\|_{F}}{\lambda_{1}(\hat{H}(\mathcal{Y}))+\|E(\mathcal{Y})\|_{F}}\geq\frac{\lambda_{2}(\hat{H}(\mathcal{Y}))-\varepsilon^{\prime}}{\lambda_{1}(\hat{H}(\mathcal{Y}))+\varepsilon^{\prime}}\quad\forall\mathcal{Y}\subseteq\bar{\mathcal{M}},

where $\varepsilon^{\prime}\in\mathbb{R}_{\geq 0}$ satisfies $\|E(\mathcal{Y})\|_{F}\leq\varepsilon^{\prime}$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$. Combining the above arguments with (62) in Lemma 5.11, we obtain a lower bound on $\gamma_{1}$ that can be computed using $O(|\bar{\mathcal{M}}|^{2})$ evaluations of $\hat{H}(\cdot)$.
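Putting Lemma 5.11 and the perturbation bound together, the computable lower bound on $\gamma_{1}$ can be sketched as follows (hypothetical names; `H_hat_of(Y)` returns the approximate $2\times 2$ matrix $\hat{H}(\mathcal{Y})$, and `eps_prime` is the bound $\varepsilon^{\prime}$).

```python
import numpy as np

def gamma1_lower_bound(prefixes, ground, H_hat_of, eps_prime):
    """Computable lower bound on gamma_1 via (62): each exact ratio
    lambda_2 / lambda_1 of F_p + H(Y) is lower bounded by
    (lambda_2(H_hat) - eps') / (lambda_1(H_hat) + eps').

    prefixes : greedy prefixes [Y_2^0, ..., Y_2^{|Y_2|}] as sets;
    H_hat_of : callable set -> approximate 2x2 matrix of F_p + H(Y).
    """
    def ratio(Y):
        lam = np.linalg.eigvalsh(H_hat_of(Y))  # ascending eigenvalues
        return max((lam[0] - eps_prime) / (lam[1] + eps_prime), 0.0)

    bound = np.inf
    for Yj in prefixes:
        rest = set(ground) - set(Yj)
        if not rest:
            continue
        # z_j: remaining element minimizing the (approximate) ratio.
        zj = min(rest, key=lambda y: ratio(set(Yj) | {y}))
        bound = min(bound, ratio(set(Yj)) * ratio(set(Yj) | {zj}))
    return bound
```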

5.3.1 Illustrations

Note that one can further obtain from (62):

\gamma_{1}\geq\mathop{\min}_{j\in\{0,\dots,|\mathcal{Y}_{2}|\}}\frac{\lambda_{2}(F_{p})+\lambda_{2}(H(\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p})+\lambda_{1}(H(\mathcal{Y}_{2}^{j}))}\cdot\frac{\lambda_{2}(F_{p})+\lambda_{2}(H(z_{j}))+\lambda_{2}(H(\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p})+\lambda_{1}(H(z_{j}))+\lambda_{1}(H(\mathcal{Y}_{2}^{j}))}, \quad (63)

where $z_{j}\in\mathop{\arg\min}_{y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}}\frac{\lambda_{2}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}$. Supposing $F_{p}$ is fixed, we see from (63) that the lower bound on $\gamma_{1}$ potentially increases as $\lambda_{2}(H(z_{j}))/\lambda_{1}(H(z_{j}))$ and $\lambda_{2}(H(\mathcal{Y}_{2}^{j}))/\lambda_{1}(H(\mathcal{Y}_{2}^{j}))$ increase. Recall that $F_{p}$, given by (31), encodes the prior knowledge that we have about $\theta=[\beta\ \delta]^{T}$. Moreover, recall from (40) that $H_{y}$ depends on the prior pdf $p(\theta)$ and on the dynamics of the SIR model in (1). Thus, the lower bound given by Lemma 5.11, and hence the corresponding performance bound for Algorithm 2 given in Theorem 5.7, depend on the prior knowledge that we have about $\theta=[\beta\ \delta]^{T}$ and on the dynamics of the SIR model. Also note that the performance bounds given in Theorem 5.7 are worst-case bounds for Algorithm 2. Thus, in practice, the ratio between a solution returned by the algorithm and an optimal solution can be better than the ratio predicted by Theorem 5.7, as we will see in our simulations in the next section. Moreover, instances with tighter performance bounds potentially imply better performance of the algorithm when applied to those instances. Similar arguments hold for the performance bound provided in Theorem 5.4.

5.3.2 Simulations

To validate the theoretical results in Theorems 5.4 and 5.7, and Lemma 5.11, we consider various PEMS instances. (In our simulations, we neglect the approximation error corresponding to the numerical integrations discussed in Section 5.3, since the error terms are found to be sufficiently small.) The directed network $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ is given by Fig. 1(a). According to the existing literature on the estimated infection and recovery rates for the COVID-19 pandemic (e.g., [26]), we assume that the infection rate $\beta$ and the recovery rate $\delta$ lie in the intervals $[3,7]$ and $[1,4]$, respectively. Let the prior pdf of $\beta$ (resp., $\delta$) be a (linearly transformed) Beta distribution with parameters $\alpha_{1}=6$ and $\alpha_{2}=3$ (resp., $\alpha_{1}=3$ and $\alpha_{2}=4$), where $\beta$ and $\delta$ are also assumed to be independent. The prior pdfs of $\beta$ and $\delta$ are plotted in Fig. 1(b) and Fig. 1(c), respectively. We set the sampling parameter $h=0.1$. We then randomly generate the weight matrix $A\in\mathbb{R}^{5\times 5}$ such that Assumptions 3.1-3.2 are satisfied, where each entry of $A$ is drawn (independently) from certain uniform distributions. The initial condition is set to be $s_{1}[0]=0.95$, $x_{1}[0]=0.05$ and $r_{1}[0]=0$, and $s_{i}[0]=0.99$, $x_{i}[0]=0.01$ and $r_{i}[0]=0$ for all $i\in\{2,\dots,5\}$. In the pmfs of measurements $\hat{x}_{i}[k]$ and $\hat{r}_{i}[k]$ given in Eq. (34) and Eq. (35), respectively, we set $N_{i}^{x}=N_{i}^{r}=100$ and $N_{i}=1000$ for all $i\in\mathcal{V}$, where $N_{i}$ is the total population at node $i$.

Figure 1: Network structure and prior pdfs of $\beta$ and $\delta$.

First, let us consider PEMS instances of a relatively small size. In such instances, we set the time steps $t_{1}=t_{2}=5$, i.e., we only consider collecting measurements at time step $t=5$. In the sets $\mathcal{C}_{5,i}=\{\zeta c_{5,i}:\zeta\in(\{0\}\cup[\zeta_{i}])\}$ and $\mathcal{B}_{5,i}=\{\eta b_{5,i}:\eta\in(\{0\}\cup[\eta_{i}])\}$, we let $c_{5,i}=b_{5,i}$ and $\zeta_{i}=\eta_{i}=2$ for all $i\in\mathcal{V}$, and draw $c_{5,i}$ and $b_{5,i}$ uniformly at random from $\{1,2,3\}$. Here, we can choose to perform $0$, $100$, or $200$ virus (or antibody) tests at a node $i\in\mathcal{V}$ at $k=5$. In Fig. 2(a), we consider the objective function $f_{Pd}(\cdot)$, given by Eq. (42), in the PEMS instances constructed above, and plot the greedy solutions and the optimal solutions (found by brute force) to the PEMS instances under different values of the budget $B$. Note that for all the simulation results in this section, we report results averaged over $50$ randomly generated $A$ matrices, constructed as described above, for each value of $B$. As shown in Theorem 5.4, the greedy algorithm yields a $\frac{1}{2}(1-e^{-1})\approx 0.31$ approximation for $f_{Pd}(\cdot)$ (in the worst case), and the results in Fig. 2(a) show that the greedy algorithm performs near optimally on the PEMS instances generated above. Similarly, in Fig. 2(b), we plot the greedy solutions and the optimal solutions to the PEMS instances constructed above under different values of $B$, when the objective function is $f_{Pa}(\cdot)$ given in Eq. (39). Again, the results in Fig. 2(b) show that the greedy algorithm performs well on the constructed PEMS instances. Moreover, using Lemma 5.11, we plot the lower bound on the submodularity ratio $\gamma_{1}$ of $f_{Pa}(\cdot)$ in Fig. 2(c). Here, we note that the submodularity ratio $\gamma_{2}$ of $f_{Pa}(\cdot)$ is always greater than one in the PEMS instances constructed above. Hence, Theorem 5.7 yields a $\frac{1}{2}(1-e^{-\gamma_{1}})$ worst-case approximation guarantee for the greedy algorithm, where, e.g., $\frac{1}{2}(1-e^{-0.3})\approx 0.13$.

Figure 2: Results for PEMS instances of medium size.

We then investigate the performance of the greedy algorithm on PEMS instances of a larger size. Since the optimal solutions to the PEMS instances cannot be efficiently obtained when the sizes of the instances become large, we instead compute the lower bound on the submodularity ratio $\gamma_{1}$ of $f_{Pa}(\cdot)$ provided in Lemma 5.11, which can be obtained in polynomial time. Differently from the smaller instances constructed above, we set $t_{1}=1$ and $t_{2}=5$. We let $\zeta_{i}=\eta_{i}=10$ for all $i\in\mathcal{V}$ in $\mathcal{C}_{k,i}=\{\zeta c_{k,i}:\zeta\in(\{0\}\cup[\zeta_{i}])\}$ and $\mathcal{B}_{k,i}=\{\eta b_{k,i}:\eta\in(\{0\}\cup[\eta_{i}])\}$, where we also set $c_{k,i}=b_{k,i}$ and draw $c_{k,i}$ and $b_{k,i}$ uniformly at random from $\{1,2,3\}$, for all $k\in[5]$ and for all $i\in\mathcal{V}$. Moreover, we modify the parameters of the Beta distribution corresponding to the pdf of $\beta$ to be $\alpha_{1}=8$ and $\alpha_{2}=3$. Here, we can choose to perform $0$, $100$, $200$, $\dots$, or $1000$ virus (or antibody) tests at a node $i\in\mathcal{V}$ at each $k\in[5]$. In Fig. 3(a), we plot the lower bound on $\gamma_{1}$ obtained from the PEMS instances constructed above. We note that the submodularity ratio $\gamma_{2}$ of $f_{Pa}(\cdot)$ is again always greater than one. Hence, Theorem 5.7 yields a $\frac{1}{2}(1-e^{-\gamma_{1}})$ worst-case approximation guarantee for the greedy algorithm. We plot in Fig. 3(b) the approximation guarantee computed using the lower bound that we obtained on $\gamma_{1}$.

Figure 3: Results for PEMS instances of large size.

6 Conclusion

We first considered the PIMS problem under the exact measurement setting, and showed that the problem is NP-hard. We then proposed an approximation algorithm that returns a solution to the PIMS problem that is within a certain factor of the optimal one. Next, we studied the PEMS problem under the noisy measurement setting. Again, we showed that the problem is NP-hard. We applied a greedy algorithm to solve the PEMS problem, and provided performance guarantees on the greedy algorithm. We presented numerical examples to validate the obtained performance bounds of the greedy algorithm, and showed that the greedy algorithm performs well in practice.

7 Appendix

7.1 Proof of Lemma 3.5

We first prove part (a). Considering any $i\in\mathcal{V}$ and any $k\in\mathbb{Z}_{\geq 0}$, we note from Eq. (1a) that

s_{i}[k+1]=s_{i}[k]\Big{(}1-h\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k]\Big{)}. \quad (64)

Under Assumptions 3.1-3.2, we have $x_{i}[k]\in[0,1]$ for all $i\in\mathcal{V}$ as argued in Section 3, and $h\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}<1$ for all $i\in\mathcal{V}$, which implies $1-h\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k]\geq 1-h\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}>0$. Supposing $s_{i}[k]>0$, we have from Eq. (64) that $s_{i}[k+1]>0$. Combining the above arguments with the fact that $s_{i}[0]\in(0,1]$ from Assumption 3.1, we see that $s_{i}[k]>0$ for all $k\in\mathbb{Z}_{\geq 0}$. Noting that $s_{i}[k],x_{i}[k],r_{i}[k]\in[0,1]$ with $s_{i}[k]+x_{i}[k]+r_{i}[k]=1$ for all $i\in\mathcal{V}$ and for all $k\in\mathbb{Z}_{\geq 0}$ as argued in Section 3, and that $x_{i}[0]\in[0,1)$ and $r_{i}[0]=0$ for all $i\in\mathcal{V}$, the result in part (a) also implies $x_{i}[k],r_{i}[k]\in[0,1)$ for all $i\in\mathcal{V}$ and for all $k\in\mathbb{Z}_{\geq 0}$.

One can then observe that in order to prove parts (b)-(d), it is sufficient to prove the following facts.

Fact 1.

Consider any $i\in\mathcal{V}$ and any $k_{1}\in\mathbb{Z}_{\geq 0}$. If $x_{i}[k_{1}]>0$, then $x_{i}[k_{2}]>0$ for all $k_{2}\in\mathbb{Z}_{\geq 0}$ with $k_{2}\geq k_{1}$.

Fact 2.

Consider any $i\in\mathcal{V}$ and any $k\in\mathbb{Z}_{\geq 0}$ such that $x_{i}[k]=0$. If there exists $j\in\mathcal{N}_{i}$ such that $x_{j}[k]>0$, then $x_{i}[k+1]>0$. If $x_{j}[k]=0$ for all $j\in\mathcal{N}_{i}$, then $x_{i}[k+1]=0$.

Fact 3.

Consider any $i\in\mathcal{V}$ and any $k_{1}\in\mathbb{Z}_{\geq 0}$. If $x_{i}[k_{1}]>0$, then $r_{i}[k_{1}+1]>0$. If $x_{i}[k_{1}]=0$, then $r_{i}[k_{1}+1]=0$.

Let us first prove Fact 1. Consider any $i\in\mathcal{V}$ and any $k\in\mathbb{Z}_{\geq 0}$. Supposing $x_{i}[k]>0$, we have from Eq. (1)

x_{i}[k+1]=(1-h\delta)x_{i}[k]+hs_{i}[k]\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k], \quad (65)

where the first term on the right-hand side is positive, since $1-h\delta>0$ from Assumption 3.2, and the second term on the right-hand side is nonnegative. It then follows that $x_{i}[k+1]>0$. Repeating the above argument proves Fact 1.

We next prove Fact 2. Considering any $i\in\mathcal{V}$ and any $k\in\mathbb{Z}_{\geq 0}$ such that $x_{i}[k]=0$, we note from Eq. (1) that

x_{i}[k+1]=hs_{i}[k]\beta\sum_{j\in\mathcal{N}_{i}}a_{ij}x_{j}[k], \quad (66)

where $s_{i}[k]>0$ as shown in part (a). Suppose there exists $j\in\mathcal{N}_{i}$ such that $x_{j}[k]>0$. Since $h,\beta\in\mathbb{R}_{>0}$ and $a_{ij}>0$ for all $j\in\mathcal{N}_{i}$ from Assumption 3.2, we have from Eq. (66) that $x_{i}[k+1]>0$. Next, supposing $x_{j}[k]=0$ for all $j\in\mathcal{N}_{i}$, we obtain from Eq. (66) that $x_{i}[k+1]=0$. This proves Fact 2.

Finally, we prove Fact 3. Let us consider any $i\in\mathcal{V}$ and any $k_{1}\in\mathbb{Z}_{\geq 0}$. Suppose $x_{i}[k_{1}]>0$. Since $h,\delta\in\mathbb{R}_{>0}$ from Assumption 3.2, we have from Eq. (1c) that $r_{i}[k_{1}+1]=r_{i}[k_{1}]+h\delta x_{i}[k_{1}]>0$. Next, supposing $x_{i}[k_{1}]=0$, we note from Fact 1 (by contraposition) that $x_{i}[k_{1}^{\prime}]=0$ for all $k_{1}^{\prime}\leq k_{1}$. It then follows from Eq. (1c) and Assumption 3.1 that $r_{i}[k_{1}+1]=r_{i}[k_{1}]=\cdots=r_{i}[0]=0$, completing the proof of Fact 3. $\square$

7.2 Proof of Theorem 4.4

We show that the PIMS problem is NP-hard via a polynomial-time reduction from the exact cover by 3-sets (X3C) problem, which is known to be NP-complete [9].

Problem 7.1.

Consider $\mathcal{X}=\{1,2,\dots,3m\}$ and a collection $\mathcal{Z}=\{z_{1},z_{2},\dots,z_{\tau}\}$ of 3-element subsets of $\mathcal{X}$, where $\tau\geq m$. The X3C problem is to determine if there is an exact cover for $\mathcal{X}$, i.e., a subcollection $\mathcal{Z}^{\prime}\subseteq\mathcal{Z}$ such that every element of $\mathcal{X}$ occurs in exactly one member of $\mathcal{Z}^{\prime}$. Note that since each member of $\mathcal{Z}$ contains exactly 3 elements and $|\mathcal{X}|=3m$, any exact cover for $\mathcal{X}$ necessarily consists of exactly $m$ subsets from $\mathcal{Z}$.

Figure 4: Graph $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ constructed in the proof of Theorem 4.4.

Consider an instance of the X3C problem given by a set $\mathcal{X}=\{1,\dots,3m\}$ and a collection $\mathcal{Z}=\{z_{1},\dots,z_{\tau}\}$ of 3-element subsets of $\mathcal{X}$, where $\tau\geq m$. We then construct an instance of the PIMS problem as follows. The node set of the graph $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ is set to be $\mathcal{V}=\{i_{0},i_{1},\dots,i_{\tau}\}\cup\{j_{1},j_{2},\dots,j_{3m}\}$. The edge set of $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ is set to satisfy that $(j_{q},i_{l})\in\mathcal{E}$ if $q\in\mathcal{X}$ is contained in $z_{l}\in\mathcal{Z}$, $(j_{q},i_{0})\in\mathcal{E}$ for all $q\in\mathcal{X}$, and $(i_{0},i_{0})\in\mathcal{E}$. Note that based on this construction, each node $i\in\{i_{1},\dots,i_{\tau}\}$ represents a subset from $\mathcal{Z}$ in the X3C instance, and each node $j\in\{j_{1},\dots,j_{3m}\}$ represents an element from $\mathcal{X}$ in the X3C instance, where the edges between $\{i_{1},\dots,i_{\tau}\}$ and $\{j_{1},\dots,j_{3m}\}$ indicate how the elements in $\mathcal{X}$ are included in the subsets in $\mathcal{Z}$. A plot of $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ is given in Fig. 4. Accordingly, the weight matrix $A\in\mathbb{R}^{(3m+\tau+1)\times(3m+\tau+1)}$ is set to satisfy that $a_{i_{l}j_{q}}=1$ if $q\in\mathcal{X}$ is contained in $z_{l}\in\mathcal{Z}$, $a_{i_{0}j_{q}}=1$ for all $q\in\mathcal{X}$, and $a_{i_{0}i_{0}}=1$. We set the sampling parameter to be $h=1/(3m+1)$. The set $\mathcal{S}_{I}\subseteq\mathcal{V}$ is set to be $\mathcal{S}_{I}=\mathcal{V}$, i.e., $x_{i}[0]>0$ for all $i\in\mathcal{V}$. We set the time steps to be $t_{1}=2$ and $t_{2}=3$. Finally, we set $b_{2,i}=b_{3,i}=0$ for all $i\in\mathcal{V}$, $c_{2,i_{l}}=1$ and $c_{3,i_{l}}=0$ for all $l\in\{1,\dots,\tau\}$, $c_{2,j_{q}}=c_{3,j_{q}}=m+1$ for all $q\in\mathcal{X}$, and $c_{2,i_{0}}=c_{3,i_{0}}=0$. Since $x_{i}[0]>0$ for all $i\in\mathcal{V}$, we see from Lemma 3.5 that $x_{i}[k]>0$ and $r_{i}[k]>0$ for all $i\in\mathcal{V}$ and for all $k\in\{2,3\}$. Therefore, Lemma 3.5 is of no further use in determining the coefficients in the equations from Eq. (3).
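To illustrate this construction, the following Python sketch builds the weight matrix $A$ and sampling parameter $h$ of the PIMS instance from a small X3C instance (the function name and the node ordering $i_{0},i_{1},\dots,i_{\tau},j_{1},\dots,j_{3m}$ are choices made here for illustration, not notation from the paper):

```python
import numpy as np

def pims_instance_from_x3c(m, Z):
    """Build the weight matrix A and sampling parameter h of the PIMS
    instance used in the reduction, with node order i_0, i_1..i_tau,
    j_1..j_3m (an illustrative ordering)."""
    tau = len(Z)
    n = 1 + tau + 3 * m              # i_0, then i_1..i_tau, then j_1..j_3m
    A = np.zeros((n, n))
    A[0, 0] = 1.0                    # self-loop (i_0, i_0)
    for q in range(1, 3 * m + 1):    # element q of X <-> node j_q
        A[0, tau + q] = 1.0          # a_{i_0 j_q} = 1 for all q
    for l, z in enumerate(Z):        # subset z_{l+1} <-> node i_{l+1}
        for q in z:
            A[1 + l, tau + q] = 1.0  # a_{i_l j_q} = 1 iff q is in z_l
    h = 1.0 / (3 * m + 1)
    return A, h

# Tiny X3C instance: X = {1,...,6} (m = 2), Z = {{1,2,3},{3,4,5},{4,5,6}};
# {z_1, z_3} is an exact cover, so the X3C answer is "yes".
A, h = pims_instance_from_x3c(2, [{1, 2, 3}, {3, 4, 5}, {4, 5, 6}])
print(A.shape, h)   # (10, 10) 0.142857...
```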

We claim that an optimal solution, denoted as $\mathcal{I}^{\star}$, to the constructed PIMS instance satisfies $c(\mathcal{I}^{\star})\leq m$ if and only if the solution to the X3C instance is “yes”.

First, suppose the solution to the X3C instance is “yes”. Denote an exact cover as $\mathcal{Z}^{\prime}=\{z_{q_{1}},\dots,z_{q_{m}}\}\subseteq\mathcal{Z}$, where $\{q_{1},\dots,q_{m}\}\subseteq\{1,\dots,\tau\}$. Let us consider a measurement selection strategy $\mathcal{I}_{0}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ given by

\mathcal{I}_{0}=\Big(\bigcup_{l\in\{1,\dots,m\}}\{x_{i_{q_{l}}}[2],x_{i_{q_{l}}}[3],r_{i_{q_{l}}}[2]\}\Big)\cup\{x_{i_{0}}[2],x_{i_{0}}[3],r_{i_{0}}[2],r_{i_{0}}[3]\}.

We then have from Eq. (9) that $\bar{\mathcal{I}}_{0}=\{(2,i_{0},r),(2,i_{0},x)\}\cup\{(2,i_{q_{l}},x):l\in\{1,\dots,m\}\}$. Noting that $s_{i}[k]>0$ for all $i\in\mathcal{V}$ and for all $k\in\mathbb{Z}_{\geq 0}$ from Lemma 3.5$(a)$, we consider the following $(m+1)$ equations from Eq. (3) whose indices are contained in $\bar{\mathcal{I}}_{0}$:

\frac{1}{s_{i_{0}}[2]}(x_{i_{0}}[3]-x_{i_{0}}[2])=h\begin{bmatrix}x_{i_{0}}[2]+\sum_{w\in\mathcal{N}_{i_{0}}}x_{w}[2]&-\frac{x_{i_{0}}[2]}{s_{i_{0}}[2]}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, \qquad (67)
\frac{1}{s_{i_{q_{l}}}[2]}(x_{i_{q_{l}}}[3]-x_{i_{q_{l}}}[2])=h\begin{bmatrix}\sum_{w\in\mathcal{N}_{i_{q_{l}}}}x_{w}[2]&-\frac{x_{i_{q_{l}}}[2]}{s_{i_{q_{l}}}[2]}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix},\quad\forall l\in\{1,\dots,m\}, \qquad (68)

where we note that $\mathcal{N}_{i_{0}}=\{j_{1},\dots,j_{3m}\}$ from the way we constructed $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$. Since $\mathcal{Z}^{\prime}=\{z_{q_{1}},\dots,z_{q_{m}}\}$ is an exact cover for $\mathcal{X}$, we see from the construction of $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ that $\bigcup_{l\in\{1,\dots,m\}}\mathcal{N}_{i_{q_{l}}}$ is a union of mutually disjoint (3-element) sets such that $\bigcup_{l\in\{1,\dots,m\}}\mathcal{N}_{i_{q_{l}}}=\{j_{1},\dots,j_{3m}\}$. Thus, subtracting the equations in (68) from Eq. (67), we obtain

\frac{1}{s_{i_{0}}[2]}(x_{i_{0}}[3]-x_{i_{0}}[2])-\sum_{l\in\{1,\dots,m\}}\frac{1}{s_{i_{q_{l}}}[2]}(x_{i_{q_{l}}}[3]-x_{i_{q_{l}}}[2])=h\begin{bmatrix}x_{i_{0}}[2]&-\frac{x_{i_{0}}[2]}{s_{i_{0}}[2]}+\sum_{l\in\{1,\dots,m\}}\frac{x_{i_{q_{l}}}[2]}{s_{i_{q_{l}}}[2]}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, \qquad (69)

where we note that $x_{i_{0}}[2]>0$ as argued above. Following Definition 4.1, we stack the coefficient matrices $\Phi_{2,i_{0}}^{r}\in\mathbb{R}^{1\times 2}$, $\Phi_{2,i_{0}}^{x}\in\mathbb{R}^{1\times 2}$ and $\Phi_{2,i_{q_{l}}}^{x}\in\mathbb{R}^{1\times 2}$ for all $l\in\{1,\dots,m\}$ into a matrix $\Phi(\mathcal{I}_{0})\in\mathbb{R}^{(m+2)\times 2}$, where $\Phi_{k,i}^{r}$ and $\Phi_{k,i}^{x}$ are defined in (8). Now, considering the matrix

\Phi_{0}=\begin{bmatrix}x_{i_{0}}[2]&-\frac{x_{i_{0}}[2]}{s_{i_{0}}[2]}+\sum_{l\in\{1,\dots,m\}}\frac{x_{i_{q_{l}}}[2]}{s_{i_{q_{l}}}[2]}\\ 0&x_{i_{0}}[2]\end{bmatrix}, \qquad (70)

we see from the above arguments that $(\Phi_{0})_{1}$ and $(\Phi_{0})_{2}$ can be obtained via algebraic operations among the rows in $\Phi(\mathcal{I}_{0})$, and the elements in $(\Phi_{0})_{1}$ and $(\Phi_{0})_{2}$ can be determined using the measurements from $\mathcal{I}_{0}$. Therefore, we have $\Phi_{0}\in\tilde{\Phi}(\mathcal{I}_{0})$, where $\tilde{\Phi}(\mathcal{I}_{0})$ is defined in Definition 4.1. Noting that $x_{i_{0}}[2]>0$, we have $\operatorname{rank}(\Phi_{0})=2$, which implies $r_{\max}(\mathcal{I}_{0})=2$, where $r_{\max}(\mathcal{I}_{0})$ is given by Eq. (10). Thus, $\mathcal{I}_{0}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ satisfies the constraint in (12). Since $c(\mathcal{I}_{0})=m$ from the way we set the costs of collecting measurements in the PIMS instance, we have $c(\mathcal{I}^{\star})\leq m$.
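The rank argument can also be checked numerically; the short sketch below forms $\Phi_{0}$ as in (70) from placeholder state values (illustrative numbers, not values from an actual simulation of the constructed instance) and verifies that it has full rank whenever $x_{i_{0}}[2]>0$:

```python
import numpy as np

# Placeholder states at k = 2 for node i_0 and m = 2 cover nodes i_{q_l};
# any values with x_{i_0}[2] > 0 lead to the same conclusion.
x_i0, s_i0 = 0.3, 0.5
x_q = np.array([0.20, 0.25])
s_q = np.array([0.60, 0.55])

# Phi_0 from (70) is upper triangular with x_{i_0}[2] on the diagonal.
Phi0 = np.array([[x_i0, -x_i0 / s_i0 + np.sum(x_q / s_q)],
                 [0.0, x_i0]])
print(np.linalg.matrix_rank(Phi0))  # 2, since both diagonal entries are positive
```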

Conversely, suppose the solution to the X3C instance is “no”, i.e., for any subcollection $\mathcal{Z}^{\prime}\subseteq\mathcal{Z}$ that contains $m$ subsets, there exists at least one element in $\mathcal{X}$ that is not contained in any subset in $\mathcal{Z}^{\prime}$. We will show that for any measurement selection strategy $\mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ that satisfies $r_{\max}(\mathcal{I})=2$, $c(\mathcal{I})>m$ holds. Equivalently, we will show that for any $\mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ with $c(\mathcal{I})\leq m$, $r_{\max}(\mathcal{I})=2$ does not hold. Consider any $\mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ such that $c(\mathcal{I})\leq m$. Noting that $c_{2,j_{q}}=c_{3,j_{q}}=m+1$ for all $q\in\mathcal{X}$ in the constructed PIMS instance, we have $x_{j_{q}}[2]\notin\mathcal{I}$ and $x_{j_{q}}[3]\notin\mathcal{I}$ for all $q\in\mathcal{X}$. Moreover, we see that $\mathcal{I}$ contains at most $m$ measurements from $\{x_{i_{1}}[2],\dots,x_{i_{\tau}}[2]\}$. To proceed, let us consider any $\mathcal{I}_{1}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ such that

\mathcal{I}_{1}=\{x_{i_{0}}[2],x_{i_{v_{1}}}[2],\dots,x_{i_{v_{m}}}[2]\}\cup\Big(\bigcup_{l\in\{0,\dots,\tau\}}\{x_{i_{l}}[3]\}\Big)\cup\Big(\bigcup_{i\in\mathcal{V}}\{r_{i}[2],r_{i}[3]\}\Big), \qquad (71)

where $\{v_{1},\dots,v_{m}\}\subseteq\{1,\dots,\tau\}$. In other words, $\mathcal{I}_{1}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ contains $m$ measurements from $\{x_{i_{1}}[2],\dots,x_{i_{\tau}}[2]\}$ and all the other measurements from $\mathcal{I}_{t_{1}:t_{2}}$ that have zero costs. It follows that $c(\mathcal{I}_{1})=m$. Similarly to (67) and (68), we have the following $(m+1)$ equations from Eq. (3) whose indices are contained in $\bar{\mathcal{I}}_{1}$ (given by Eq. (9)):

\frac{1}{s_{i_{0}}[2]}(x_{i_{0}}[3]-x_{i_{0}}[2])=h\begin{bmatrix}x_{i_{0}}[2]+\sum_{w\in\mathcal{N}_{i_{0}}}x_{w}[2]&-\frac{x_{i_{0}}[2]}{s_{i_{0}}[2]}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, \qquad (72)
\frac{1}{s_{i_{v_{l}}}[2]}(x_{i_{v_{l}}}[3]-x_{i_{v_{l}}}[2])=h\begin{bmatrix}\sum_{w\in\mathcal{N}_{i_{v_{l}}}}x_{w}[2]&-\frac{x_{i_{v_{l}}}[2]}{s_{i_{v_{l}}}[2]}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix},\quad\forall l\in\{1,\dots,m\}. \qquad (73)

Noting that for any subcollection $\mathcal{Z}^{\prime}\subseteq\mathcal{Z}$ that contains $m$ subsets, there exists at least one element in $\mathcal{X}$ that is not contained in any subset in $\mathcal{Z}^{\prime}$ as we argued above, we see that there exists at least one element in $\mathcal{X}$ that is not contained in any subset in $\{z_{v_{1}},\dots,z_{v_{m}}\}$. It then follows from the way we constructed $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ that there exists $w^{\prime}\in\mathcal{N}_{i_{0}}$ such that $w^{\prime}\notin\mathcal{N}_{i_{v_{l}}}$ for all $l\in\{1,\dots,m\}$. Thus, by subtracting the equations in (73) (each multiplied by an arbitrary constant) from Eq. (72), the term $x_{w^{\prime}}[2]$ will remain on the right-hand side of the equation in (72). Similarly, consider any equation from (73) indexed by $(2,i_{v_{l}},x)\in\bar{\mathcal{I}}_{1}$, where $l\in\{1,\dots,m\}$. First, suppose we subtract Eq. (72) multiplied by some positive constant and any equations in (73) other than equation $(2,i_{v_{l}},x)$ (each multiplied by an arbitrary constant) from equation $(2,i_{v_{l}},x)$. Since there exists $w^{\prime}\in\mathcal{N}_{i_{0}}$ such that $w^{\prime}\notin\mathcal{N}_{i_{v_{l}}}$ for all $l\in\{1,\dots,m\}$ as argued above, we see that $x_{w^{\prime}}[2]$ will appear on the right-hand side of equation $(2,i_{v_{l}},x)$. Next, suppose we subtract any equations in (73) other than equation $(2,i_{v_{l}},x)$ (each multiplied by an arbitrary constant) from equation $(2,i_{v_{l}},x)$. One can check that either of the following two cases holds for the resulting equation $(2,i_{v_{l}},x)$: (a) the coefficients on the right-hand side of equation $(2,i_{v_{l}},x)$ contain $x_{j_{q}}[2]\notin\mathcal{I}_{1}$, where $q\in\mathcal{X}$; or (b) the coefficient matrix on the right-hand side of equation $(2,i_{v_{l}},x)$ is of the form $\begin{bmatrix}0&\star\end{bmatrix}$. Again, we stack $\Phi_{k,i}^{r}\in\mathbb{R}^{1\times 2}$ for all $(k,i,r)\in\bar{\mathcal{I}}_{1}$ and $\Phi_{k,i}^{x}\in\mathbb{R}^{1\times 2}$ for all $(k,i,x)\in\bar{\mathcal{I}}_{1}$ into a matrix $\Phi(\mathcal{I}_{1})$, where we note that $\Phi_{k,i}^{r}$ is of the form $\begin{bmatrix}0&\star\end{bmatrix}$ for all $(k,i,r)\in\bar{\mathcal{I}}_{1}$. One can then see from the above arguments that for all $\Phi\in\mathbb{R}^{2\times 2}$ (if they exist) such that $(\Phi)_{1}$ and $(\Phi)_{2}$ can be obtained from algebraic operations among the rows in $\Phi(\mathcal{I}_{1})$, and the elements in $(\Phi)_{1}$ and $(\Phi)_{2}$ can be determined using the measurements from $\mathcal{I}_{1}$, $\operatorname{rank}(\Phi)\leq 1$ holds. It follows that $r_{\max}(\mathcal{I}_{1})<2$, i.e., the constraint $r_{\max}(\mathcal{I}_{1})=2$ in (12) does not hold. Using similar arguments to those above, one can further show that $r_{\max}(\mathcal{I})<2$ holds for all $\mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ with $c(\mathcal{I})\leq m$, completing the proof of the converse direction of the above claim.

Hence, it follows directly from the above arguments that an algorithm for the PIMS problem can also be used to solve the X3C problem. Since X3C is NP-complete, we conclude that the PIMS problem is NP-hard.\hfill\square

7.3 Proof of Theorem 5.3

We prove the NP-hardness of the PEMS problem via a polynomial-time reduction from the knapsack problem, which is known to be NP-hard [9]. An instance of the knapsack problem is given by a set $D=\{d_{1},\dots,d_{\tau}\}$, a size $s(d)\in\mathbb{Z}_{>0}$ and a value $v(d)\in\mathbb{Z}_{>0}$ for each $d\in D$, and a budget $K\in\mathbb{Z}_{>0}$. The knapsack problem is to find $D^{\prime}\subseteq D$ such that $\sum_{d\in D^{\prime}}v(d)$ is maximized while satisfying $\sum_{d\in D^{\prime}}s(d)\leq K$.

Given any knapsack instance, we construct an instance of the PEMS problem as follows. Let $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ be a graph that consists of a set of $n$ isolated nodes, with $n=\tau$ and $\mathcal{V}=[n]$. Set the weight matrix to be $A=\mathbf{0}_{n\times n}$, and set the sampling parameter as $h=1$. The time steps $t_{1}$ and $t_{2}$ are set to be $t_{1}=t_{2}=1$, i.e., only the measurements of $x_{i}[1]$ and $r_{i}[1]$ for all $i\in\mathcal{V}$ will be considered. The initial condition is set to satisfy $s_{i}[0]=0.5$, $x_{i}[0]=0.5$ and $r_{i}[0]=0$ for all $i\in\mathcal{V}$. The budget constraint is set as $B=K$. Let $\mathcal{C}_{1,i}=\{0,B+1\}$ and $\mathcal{B}_{1,i}=\{0,s(d_{i})\}$ for all $i\in\mathcal{V}$. The pmfs of the measurements $\hat{x}_{i}[1]$ and $\hat{r}_{i}[1]$ are given by Eqs. (34) and (35), respectively, with $N_{i}^{x}=N_{i}^{r}=v(d_{i})$ and $N_{i}=\max_{i\in\mathcal{V}}v(d_{i})$ for all $i\in\mathcal{V}$, where Assumption 5.1 is assumed to hold. Finally, let the prior pdf of $\beta\in(0,1)$ be a Beta distribution with parameters $\alpha_{1}=3$ and $\alpha_{2}=3$, and let the prior pdf of $\delta\in(0,1)$ also be a Beta distribution with parameters $\alpha_{1}=3$ and $\alpha_{2}=3$, where we take $\beta$ and $\delta$ to be independent. Noting that $\mathcal{C}_{1,i}=\{0,B+1\}$ in the PEMS instance constructed above, i.e., $\hat{x}_{i}[1]$ incurs a cost of $B+1>B$, we only need to consider the measurements $\hat{r}_{i}[1]$ for all $i\in\mathcal{V}$. Moreover, since $\mathcal{B}_{1,i}=\{0,s(d_{i})\}$, a corresponding measurement selection is then given by $\mu\in\{0,1\}^{\mathcal{V}}$. In other words, $\mu(i)=1$ if measurement $\hat{r}_{i}[1]$ is collected (with cost $s(d_{i})$), and $\mu(i)=0$ if measurement $\hat{r}_{i}[1]$ is not collected. We will see that there is a one-to-one correspondence between a measurement $\hat{r}_{i}[1]$ in the PEMS instance and an element $d_{i}\in D$ in the knapsack instance.

Consider a measurement selection $\mu\in\{0,1\}^{\mathcal{V}}$. Since $r_{i}[0]=0$ and $x_{i}[0]=0.5$ for all $i\in\mathcal{V}$, Eq. (1c) implies $r_{i}[1]=0.5h\delta$ for all $i\in\mathcal{V}$, where $h=1$. One then obtains from Eq. (28) and Eq. (37) the following:

F_{\theta}(\mu)=\frac{1}{0.5\delta(1-0.5\delta)}\begin{bmatrix}0&0\\ 0&0.25\end{bmatrix}\sum_{i\in\operatorname{supp}(\mu)}N_{i}^{r}\mu(i). \qquad (74)

Next, noting that $\beta$ and $\delta$ are independent, one can show via Eq. (31) that $F_{p}\in\mathbb{R}^{2\times 2}$ is diagonal, where one can further show that $(F_{p})_{11}=(F_{p})_{22}>0$ using the fact that the pdfs of $\beta$ and $\delta$ are Beta distributions with parameters $\alpha_{1}=3$ and $\alpha_{2}=3$. Similarly, one can obtain $\mathbb{E}_{\theta}[1/(0.5\delta(1-0.5\delta))]>0$. It now follows from Eq. (74) that

\mathbb{E}_{\theta}[F_{\theta}(\mu)]+F_{p}=\begin{bmatrix}z_{1}&0\\ 0&z_{1}+z_{2}\sum_{i\in\operatorname{supp}(\mu)}N_{i}^{r}\mu(i)\end{bmatrix}, \qquad (75)

where $z_{1},z_{2}\in\mathbb{R}_{>0}$ are constants (independent of $\mu$). Note that the objective in the PEMS instance is given by $\min_{\mu\in\{0,1\}^{\mathcal{V}}}f(\mu)$, where $f(\cdot)\in\{f_{a}(\cdot),f_{d}(\cdot)\}$. First, consider the objective function $f_{a}(\mu)=\operatorname{tr}(\bar{C}(\mu))$, where $\bar{C}(\mu)=(\mathbb{E}_{\theta}[F_{\theta}(\mu)]+F_{p})^{-1}$. We see from Eq. (75) that, over the selections $\mu\in\{0,1\}^{\mathcal{V}}$ satisfying the budget constraint, $\operatorname{tr}(\bar{C}(\mu))$ is minimized if and only if $\sum_{i\in\operatorname{supp}(\mu)}N_{i}^{r}\mu(i)$ is maximized. Similar arguments hold for the objective function $f_{d}(\mu)=\ln\det(\bar{C}(\mu))$. It then follows directly from the above arguments that a measurement selection $\mu^{\star}\in\{0,1\}^{\mathcal{V}}$ is an optimal solution to the PEMS instance if and only if $D^{\star}\triangleq\{d_{i}:i\in\operatorname{supp}(\mu^{\star})\}$ is an optimal solution to the knapsack instance. Since the knapsack problem is NP-hard, the PEMS problem is NP-hard.\hfill\square
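As a sanity check on this equivalence, the following Python sketch evaluates $f_{a}(\mu)$ on a toy instance and confirms by brute force that the minimizer of $f_{a}$ over budget-feasible selections is exactly a maximum-value knapsack solution (the values of $z_{1},z_{2}$, the item sizes and values, and the budget are illustrative; by Eq. (75) the particular positive values of $z_{1},z_{2}$ do not affect which selection is optimal):

```python
from itertools import combinations

z1, z2 = 1.0, 0.5            # the positive constants in Eq. (75)
v = [4, 3, 5, 1]             # knapsack values v(d_i) = N_i^r
s = [2, 2, 3, 1]             # knapsack sizes s(d_i)
K = 4                        # budget B = K

def f_a(sel):
    # trace of the inverse of the diagonal matrix in Eq. (75)
    return 1.0 / z1 + 1.0 / (z1 + z2 * sum(v[i] for i in sel))

feasible = [sel for r in range(len(v) + 1)
            for sel in combinations(range(len(v)), r)
            if sum(s[i] for i in sel) <= K]
best = min(feasible, key=f_a)
print(best, sum(v[i] for i in best))   # (0, 1) 7: the max-value feasible set
```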

7.4 Proof of Lemma 5.11

Noting the definition of $\gamma_{1}$ in Definition 5.5, we provide a lower bound on $\frac{\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}(f_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j}))}{f_{Pa}(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})}$ for all $\mathcal{A}\subseteq\bar{\mathcal{M}}$ and for all $\mathcal{Y}_{2}^{j}$, where we assume that $\mathcal{A}\setminus\mathcal{Y}_{2}^{j}\neq\emptyset$, since otherwise (45) would be satisfied for all $\gamma_{1}\in\mathbb{R}$. Recalling the definition of $f_{Pa}(\cdot)$ in (39), we lower bound $LHS\triangleq\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\big(f_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})\big)$ in the following manner:

LHS =\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\sum_{i=1}^{2}\frac{\lambda_{i}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))-\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))}{\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{i}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}
\geq\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\frac{\sum_{i=1}^{2}\big(\lambda_{i}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))-\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))\big)}{\lambda_{1}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{1}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))} \qquad (76)
=\frac{\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\operatorname{tr}(H_{y})}{\lambda_{1}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{1}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))}. \qquad (77)

To obtain (76), we let $z^{\prime}\in\arg\max_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\lambda_{1}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))$ and note that $\lambda_{1}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))\geq\lambda_{i}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))$ for all $i\in\{1,2\}$ and for all $y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}$. Next, we upper bound $f_{Pa}(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})$ in the following manner:

f_{Pa}(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j}) =\sum_{i=1}^{2}\frac{\lambda_{i}(F_{p}+H(\mathcal{A}\cup\mathcal{Y}_{2}^{j}))-\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))}{\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{i}(F_{p}+H(\mathcal{A}\cup\mathcal{Y}_{2}^{j}))}
\leq\frac{\sum_{i=1}^{2}\big(\lambda_{i}(F_{p}+H(\mathcal{A}\cup\mathcal{Y}_{2}^{j}))-\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))\big)}{\lambda_{2}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{2}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))} \qquad (78)
=\frac{\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\operatorname{tr}(H_{y})}{\lambda_{2}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{2}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))}. \qquad (79)

To obtain (78), we note that $\lambda_{i}(F_{p}+H(\mathcal{A}\cup\mathcal{Y}_{2}^{j}))\geq\lambda_{2}(F_{p}+H(\mathcal{A}\cup\mathcal{Y}_{2}^{j}))\geq\lambda_{2}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))$ for all $i\in\{1,2\}$, where the second inequality follows from Lemma 5.10 together with the fact that $H(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j})\succeq\mathbf{0}$, with $z^{\prime}$ defined as above. Combining (77) and (79), and noting that $z_{j}\in\arg\min_{y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}}\frac{\lambda_{2}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}$, we have

\frac{\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}(f_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j}))}{f_{Pa}(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})}\geq\frac{\lambda_{2}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{2}(F_{p}+H(\{z_{j}\}\cup\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{1}(F_{p}+H(\{z_{j}\}\cup\mathcal{Y}_{2}^{j}))}, \qquad (80)

which implies (62).\hfill\square
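The bound (80) can also be spot-checked numerically. The sketch below assumes that $f_{Pa}$ takes the trace-of-inverse form $f_{Pa}(\mathcal{Y})=\operatorname{tr}(F_{p}^{-1})-\operatorname{tr}((F_{p}+H(\mathcal{Y}))^{-1})$, which is consistent with the eigenvalue-difference expressions used above; the random matrices and index sets are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def rand_psd(dim=2):
    M = rng.normal(size=(dim, dim))
    return M @ M.T                      # random PSD matrix

def f_pa(Fp, H):
    # assumed trace-of-inverse form of f_Pa (see lead-in)
    return np.trace(np.linalg.inv(Fp)) - np.trace(np.linalg.inv(Fp + H))

Fp = rand_psd() + np.eye(2)             # positive definite prior term F_p
Hy = [rand_psd() for _ in range(5)]     # candidate matrices H_y, y in M-bar
Y = {0}                                 # current set Y_2^j
Acal = {1, 2, 3}                        # a candidate set A

HY = sum(Hy[i] for i in Y)
lhs = sum(f_pa(Fp, HY + Hy[y]) - f_pa(Fp, HY) for y in Acal - Y)
rhs = f_pa(Fp, HY + sum(Hy[i] for i in Acal - Y)) - f_pa(Fp, HY)

def lam_ratio(M):
    lam = np.linalg.eigvalsh(M)         # ascending: lam[0] = lambda_2, lam[1] = lambda_1
    return lam[0] / lam[1]

bound = lam_ratio(Fp + HY) * min(lam_ratio(Fp + HY + Hy[y])
                                 for y in range(len(Hy)) if y not in Y)
print(lhs / rhs >= bound)               # True: consistent with the bound in (80)
```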

References

  • [1] Centers for Disease Control and Prevention. Test for current infection. https://www.cdc.gov/coronavirus/2019-ncov/testing/diagnostic-testing.html, 2020.
  • [2] Protect Purdue: Purdue University’s response to COVID-19. https://protect.purdue.edu, 2020.
  • Ahn and Hassibi [2013] H. J. Ahn and B. Hassibi. Global dynamics of epidemic spread over complex networks. In Proc. Conference on Decision and Control, pages 4579–4585. IEEE, 2013.
  • Bendavid et al. [2020] E. Bendavid, B. Mulaney, N. Sood, S. Shah, E. Ling, R. Bromley-Dulfano, C. Lai, Z. Weissberg, R. Saavedra, J. Tedrow, et al. COVID-19 antibody seroprevalence in Santa Clara County, California. MedRxiv, 2020.
  • Bian et al. [2017] A. A. Bian, J. M. Buhmann, A. Krause, and S. Tschiatschek. Guarantees for greedy maximization of non-submodular functions with applications. In Proc. International Conference on Machine Learning, pages 498–507, 2017.
  • Chakrabarti et al. [2008] D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic thresholds in real networks. ACM Transactions on Information and System Security, 10(4):1–26, 2008.
  • Chepuri and Leus [2014] S. P. Chepuri and G. Leus. Sparsity-promoting sensor selection for non-linear measurement models. IEEE Transactions on Signal Processing, 63(3):684–698, 2014.
  • Cormen et al. [2009] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms. MIT press, 2009.
  • Garey and Johnson [1979] M. R. Garey and D. S. Johnson. Computers and intractability, volume 174. Freeman San Francisco, 1979.
  • Horn and Johnson [2012] R. A. Horn and C. R. Johnson. Matrix analysis. Cambridge University Press, 2012.
  • Hota et al. [2020] A. R. Hota, J. Godbole, P. Bhariya, and P. E. Paré. A closed-loop framework for inference, prediction and control of SIR epidemics on networks. arXiv preprint arXiv:2006.16185, 2020.
  • Joshi and Boyd [2008] S. Joshi and S. Boyd. Sensor selection via convex optimization. IEEE Transactions on Signal Processing, 57(2):451–462, 2008.
  • Kay [1993] S. M. Kay. Fundamentals of statistical signal processing: Estimation theory. Prentice Hall PTR, 1993.
  • Kempe et al. [2003] D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In Proc. international conference on Knowledge Discovery and Data mining, pages 137–146, 2003.
  • Khuller et al. [1999] S. Khuller, A. Moss, and J. S. Naor. The budgeted maximum coverage problem. Information Processing Letters, 70(1):39–45, 1999.
  • Krause and Guestrin [2005] A. Krause and C. Guestrin. A note on the budgeted maximization of submodular functions. Carnegie Mellon University. Center for Automated Learning and Discovery, 2005.
  • Krause et al. [2008] A. Krause, A. Singh, and C. Guestrin. Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. Journal of Machine Learning Research, 9(Feb):235–284, 2008.
  • Mei et al. [2017] W. Mei, S. Mohagheghi, S. Zampieri, and F. Bullo. On the dynamics of deterministic epidemic propagation over networks. Annual Reviews in Control, 44:116–128, 2017.
  • Mo et al. [2011] Y. Mo, R. Ambrosino, and B. Sinopoli. Sensor selection strategies for state estimation in energy constrained wireless sensor networks. Automatica, 47(7):1330–1338, 2011.
  • Newman [2002] M. E. Newman. Spread of epidemic disease on networks. Physical Review E, 66(1):016128, 2002.
  • Nowzari et al. [2016] C. Nowzari, V. M. Preciado, and G. J. Pappas. Analysis and control of epidemics: A survey of spreading processes on complex networks. IEEE Control Systems Magazine, 36(1):26–46, 2016.
  • Paré et al. [2018] P. E. Paré, J. Liu, C. L. Beck, B. E. Kirwan, and T. Başar. Analysis, estimation, and validation of discrete-time epidemic processes. IEEE Transactions on Control Systems Technology, 2018.
  • Pastor-Satorras et al. [2015] R. Pastor-Satorras, C. Castellano, P. Van Mieghem, and A. Vespignani. Epidemic processes in complex networks. Reviews of Modern Physics, 87(3):925, 2015.
  • Pezzutto et al. [2020] M. Pezzutto, N. B. Rossello, L. Schenato, and E. Garone. Smart testing and selective quarantine for the control of epidemics. arXiv preprint arXiv:2007.15412, 2020.
  • Preciado et al. [2014] V. M. Preciado, M. Zargham, C. Enyioha, A. Jadbabaie, and G. J. Pappas. Optimal resource allocation for network protection against spreading processes. IEEE Transactions on Control of Network Systems, 1(1):99–108, 2014.
  • Prem et al. [2020] K. Prem, Y. Liu, T. W. Russell, A. J. Kucharski, R. M. Eggo, N. Davies, S. Flasche, S. Clifford, C. A. Pearson, J. D. Munday, et al. The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study. The Lancet Public Health, 2020.
  • Pukelsheim [2006] F. Pukelsheim. Optimal design of experiments. SIAM, 2006.
  • Stoer and Bulirsch [2013] J. Stoer and R. Bulirsch. Introduction to numerical analysis, volume 12. Springer Science & Business Media, 2013.
  • Streeter and Golovin [2009] M. Streeter and D. Golovin. An online algorithm for maximizing submodular functions. In Proc. Advances in Neural Information Processing Systems, pages 1577–1584, 2009.
  • Summers et al. [2015] T. H. Summers, F. L. Cortesi, and J. Lygeros. On submodularity and controllability in complex dynamical networks. IEEE Transactions on Control of Network Systems, 3(1):91–101, 2015.
  • Tzoumas et al. [2020] V. Tzoumas, L. Carlone, G. J. Pappas, and A. Jadbabaie. LQG control and sensing co-design. IEEE Transactions on Automatic Control, pages 1–1, 2020. doi: 10.1109/TAC.2020.2997661.
  • Van Trees [2004a] H. L. Van Trees. Detection, estimation, and modulation theory, part I. John Wiley & Sons, 2004a.
  • Van Trees [2004b] H. L. Van Trees. Detection, estimation, and modulation theory, part IV: Optimum array processing. John Wiley & Sons, 2004b.
  • Vrabac et al. [2020] D. Vrabac, P. E. Paré, H. Sandberg, and K. H. Johansson. Overcoming challenges for estimating virus spread dynamics from data. In Proc. Conference on Information Sciences and Systems, pages 1–6. IEEE, 2020.
  • Ye and Sundaram [2019] L. Ye and S. Sundaram. Sensor selection for hypothesis testing: Complexity and greedy algorithms. In Proc. Conference on Decision and Control, pages 7844–7849. IEEE, 2019.
  • Ye et al. [2020] L. Ye, S. Roy, and S. Sundaram. Resilient sensor placement for Kalman filtering in networked systems: Complexity and algorithms. IEEE Transactions on Control of Network Systems, 7(4):1870–1881, 2020.
  • Ye et al. [2021] L. Ye, N. Woodford, S. Roy, and S. Sundaram. On the complexity and approximability of optimal sensor selection and attack for Kalman filtering. IEEE Transactions on Automatic Control, 66(5):2146–2161, 2021.