
Parameter Estimation in Epidemic Spread Networks Using Limited Measurements

Research supported in part by the National Science Foundation, grants NSF-CMMI 1635014 and NSF-ECCS 2032258.

Lintao Ye, Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA ([email protected])    Philip E. Paré, School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA ({philpare,sundara2}@purdue.edu)    Shreyas Sundaram, School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
Abstract

We study the problem of estimating the parameters (i.e., infection rate and recovery rate) governing the spread of epidemics in networks. Such parameters are typically estimated by measuring various characteristics (such as the number of infected and recovered individuals) of the infected populations over time. However, these measurements also incur certain costs, depending on the population being tested and the times at which the tests are administered. We thus formulate the epidemic parameter estimation problem as an optimization problem, where the goal is to either minimize the total cost spent on collecting measurements, or to optimize the parameter estimates while remaining within a measurement budget. We show that these problems are NP-hard to solve in general, and then propose approximation algorithms with performance guarantees. We validate our algorithms using numerical examples.

Keywords: Epidemic spread networks, Parameter estimation, Optimization algorithms

1 Introduction

Models of spreading processes over networks have been widely studied by researchers from different fields (e.g., [14, 23, 6, 3, 25, 20]). The case of epidemics spreading through networked populations has received a particularly significant amount of attention, especially in light of the ongoing COVID-19 pandemic (e.g., [20, 21]). A canonical example is the networked SIR model, where each node in the network represents a subpopulation or an individual, and can be in one of three states: susceptible (S), infected (I), or recovered (R) [18]. Two key parameters govern such models: the infection rate of a given node, and the recovery rate of that node. In the case of a novel virus, these parameters may not be known a priori, and must be identified or estimated from gathered data, including, for instance, the number of infected and recovered individuals in the network at certain points in time. For instance, in the COVID-19 pandemic, when collecting data on the number of infected or recovered individuals in the network, one possibility is to perform virus or antibody tests on the individuals, with each test incurring a cost. Therefore, in the problem of parameter estimation in epidemic spread networks, it is important and of practical interest to take the costs of collecting the data (i.e., measurements) into account, which leads to the problem formulations considered in this paper. The goal is to exactly identify (when possible) or estimate the parameters in the networked SIR model using a limited number of measurements. Specifically, we divide our analysis into two scenarios: 1) when the measurements (e.g., the number of infected individuals) can be collected exactly without error; and 2) when only stochastic measurements can be obtained.

Under the setting when exact measurements of the infected and recovered proportions of the population at certain nodes in the network can be obtained, we formulate the Parameter Identification Measurement Selection (PIMS) problem as minimizing the cost spent on collecting the measurements, while ensuring that the parameters of the SIR model can be uniquely identified (within a certain time interval in the epidemic dynamics). In settings where the measurements are stochastic (thereby precluding exact identification of the parameters), we formulate the Parameter Estimation Measurement Selection (PEMS) problem. The goal is to optimize certain estimation metrics based on the collected measurements, while satisfying the budget on collecting the measurements.

Related Work

The authors in [22, 34] studied the parameter estimation problem in epidemic spread networks using a Susceptible-Infected-Susceptible (SIS) model of epidemics. When exact measurements of the infected proportion of the population at each node of the network can be obtained, the authors proposed a necessary and sufficient condition on the set of the collected measurements such that the parameters of the SIS model (i.e., the infection rate and the recovery rate) can be uniquely identified. However, this condition does not pose any constraint on the number of measurements that can be collected.

In [24], the authors considered a measurement selection problem in the SIR model. Their goal is to perform a limited number of virus tests among the population such that the probability of undetected asymptomatic cases is minimized. The transmission of the disease in the SIR model considered in [24] is characterized by a Bernoulli random variable, which leads to a Hidden Markov Model for the SIR dynamics.

Finally, our work is also closely related to the sensor placement problem that has been studied for control systems (e.g., [19, 37, 36]), signal processing (e.g., [7, 35]), and machine learning (e.g., [17]). The goal of these problems is to optimize certain (problem-specific) performance metrics of the estimate based on the measurements of the placed sensors, while satisfying the sensor placement budget constraints.

Contributions

First, we show that the PIMS problem is NP-hard, which precludes polynomial-time algorithms that solve the PIMS problem optimally (if P \neq NP). By exploring structural properties of the PIMS problem, we provide a polynomial-time approximation algorithm that returns a solution within a certain approximation ratio of the optimal. The approximation ratio depends on the cost structure of the measurements and on the graph structure of the epidemic spread network. Next, we show that the PEMS problem is also NP-hard. In order to provide a polynomial-time approximation algorithm that solves the PEMS problem with performance guarantees, we first show that the PEMS problem can be transformed into the problem of maximizing a set function subject to a knapsack constraint. We then apply a greedy algorithm to the (transformed) PEMS problem, and provide approximation guarantees for the greedy algorithm. Our analysis of the greedy algorithm also generalizes results from the literature on maximizing a submodular set function under a knapsack constraint to nonsubmodular settings. We use numerical examples to validate the obtained performance bounds of the greedy algorithm, and show that the greedy algorithm performs well in practice.

Notation and Terminology

The sets of integers and real numbers are denoted as \mathbb{Z} and \mathbb{R}, respectively. For a set \mathcal{S}, let |\mathcal{S}| denote its cardinality. For any n\in\mathbb{Z}_{\geq 1}, let [n]\triangleq\{1,2,\dots,n\}. Let \mathbf{0}_{m\times n} denote a zero matrix of dimension m\times n; the subscript is dropped if the dimension can be inferred from the context. For a matrix P\in\mathbb{R}^{n\times n}, let P^{\top}, \operatorname{tr}(P) and \det(P) be its transpose, trace and determinant, respectively. The eigenvalues of P are ordered such that |\lambda_{1}(P)|\geq\cdots\geq|\lambda_{n}(P)|. Let P_{ij} (or (P)_{ij}) denote the element in the ith row and jth column of P, and let (P)_{i} denote the ith row of P. A positive semidefinite matrix P\in\mathbb{R}^{n\times n} is denoted by P\succeq\mathbf{0}.

2 Model of Epidemic Spread Network

Suppose a disease (or virus) is spreading over a directed graph \mathcal{G}=\{\mathcal{V},\mathcal{E}\}, where \mathcal{V}\triangleq[n] is the set of n nodes, and \mathcal{E} is the set of directed edges (and self loops) that captures the interactions among the nodes in \mathcal{V}. Here, each node i\in\mathcal{V} is considered to be a group (or population) of individuals (e.g., a city or a country). A directed edge from node i to node j, where i\neq j, is denoted by (i,j). For all i\in\mathcal{V}, denote \mathcal{N}_{i}\triangleq\{j:(j,i)\in\mathcal{E}\} and \bar{\mathcal{N}}_{i}\triangleq\{j:(j,i)\in\mathcal{E}\}\cup\{i\}. For all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 0}, we let s_{i}[k], x_{i}[k] and r_{i}[k] represent the proportions of the population of node i that are susceptible, infected and recovered at time k, respectively. To describe the dynamics of the spread of the disease in \mathcal{G}, we will use the following discrete-time SIR model (e.g., [11]):

s_{i}[k+1]=s_{i}[k]-hs_{i}[k]\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k], (1a)
x_{i}[k+1]=(1-h\delta)x_{i}[k]+hs_{i}[k]\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k], (1b)
r_{i}[k+1]=r_{i}[k]+h\delta x_{i}[k], (1c)

where \beta\in\mathbb{R}_{\geq 0} is the infection rate of the disease, \delta\in\mathbb{R}_{\geq 0} is the recovery rate of the disease, h\in\mathbb{R}_{\geq 0} is the sampling parameter, and a_{ij}\in\mathbb{R}_{\geq 0} is the weight associated with edge (j,i). Let A\in\mathbb{R}^{n\times n} be a weight matrix, where A_{ij}=a_{ij} for all i,j\in\mathcal{V} such that j\in\bar{\mathcal{N}}_{i}, and A_{ij}=0 otherwise. Denote s[k]\triangleq[s_{1}[k]\ \cdots\ s_{n}[k]]^{T}\in\mathbb{R}^{n}, x[k]\triangleq[x_{1}[k]\ \cdots\ x_{n}[k]]^{T}\in\mathbb{R}^{n}, and r[k]\triangleq[r_{1}[k]\ \cdots\ r_{n}[k]]^{T}\in\mathbb{R}^{n}, for all k\in\mathbb{Z}_{\geq 0}. Throughout this paper, we assume that the weight matrix A\in\mathbb{R}^{n\times n} and the sampling parameter h\in\mathbb{R}_{\geq 0} are given.
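To make the model concrete, the update in Eq. (1) can be simulated directly. The following Python sketch (illustrative only; the function name and interface are our own, not from the paper) iterates Eqs. (1a)-(1c) for a given weight matrix A, sampling parameter h, and parameters \beta and \delta.

```python
import numpy as np

def simulate_sir(A, h, beta, delta, s0, x0, r0, T):
    """Iterate the networked SIR dynamics of Eqs. (1a)-(1c) for T steps.

    A          : (n, n) weight matrix with A[i, j] = a_ij (zero if j is not
                 an in-neighbor of i)
    h          : sampling parameter; beta, delta: infection and recovery rates
    s0, x0, r0 : length-n arrays of initial proportions
    Returns s, x, r of shape (T + 1, n)."""
    n = len(s0)
    s = np.zeros((T + 1, n)); x = np.zeros((T + 1, n)); r = np.zeros((T + 1, n))
    s[0], x[0], r[0] = s0, x0, r0
    for k in range(T):
        infection = h * beta * s[k] * (A @ x[k])  # h * s_i[k] * beta * sum_j a_ij x_j[k]
        s[k + 1] = s[k] - infection                   # Eq. (1a)
        x[k + 1] = (1 - h * delta) * x[k] + infection # Eq. (1b)
        r[k + 1] = r[k] + h * delta * x[k]            # Eq. (1c)
    return s, x, r
```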

3 Preliminaries

We make the following assumptions on the initial conditions s[0], x[0] and r[0], and on the parameters of the SIR model in Eq. (1) (e.g., [22, 11]).

Assumption 3.1.

For all i\in\mathcal{V}, s_{i}[0]\in(0,1], x_{i}[0]\in[0,1), r_{i}[0]=0, and s_{i}[0]+x_{i}[0]=1.

Assumption 3.2.

Assume that h,\beta,\delta\in\mathbb{R}_{>0} with h\delta<1. For all i,j\in\mathcal{V} with (j,i)\in\mathcal{E} and i\neq j, assume that a_{ij}\in\mathbb{R}_{>0}. For all i\in\mathcal{V}, h\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}<1.

Definition 3.3.

Consider a directed graph \mathcal{G}=\{\mathcal{V},\mathcal{E}\} with \mathcal{V}=[n]. A directed path of length t from node i_{0} to node i_{t} in \mathcal{G} is a sequence of t directed edges (i_{0},i_{1}),\dots,(i_{t-1},i_{t}). For any distinct pair of nodes i,j\in\mathcal{V} such that there exists a directed path from i to j, the distance from node i to node j, denoted as d_{ij}, is defined as the shortest length over all such paths.

Definition 3.4.

Define \mathcal{S}_{I}\triangleq\{i:x_{i}[0]>0,i\in\mathcal{V}\} and \mathcal{S}_{H}\triangleq\{i:x_{i}[0]=0,i\in\mathcal{V}\}. For all i\in\mathcal{S}_{H}, define d_{i}\triangleq\min_{j\in\mathcal{S}_{I}}d_{ji}, where d_{i}\geq 1, and d_{i}\triangleq+\infty if there is no path from any j\in\mathcal{S}_{I} to i. For all i\in\mathcal{S}_{I}, define d_{i}\triangleq 0.

Using similar arguments to those in [11], one can show that s_{i}[k],x_{i}[k],r_{i}[k]\in[0,1] with s_{i}[k]+x_{i}[k]+r_{i}[k]=1 for all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 0} under Assumptions 3.1-3.2. Thus, given x_{i}[k] and r_{i}[k], we can obtain s_{i}[k]=1-x_{i}[k]-r_{i}[k] for all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 0}. We also have the following result that characterizes properties of x_{i}[k] and r_{i}[k] in the SIR model over \mathcal{G} given by Eq. (1); the proof can be found in Section 7.1 in the Appendix.

Lemma 3.5.

Consider a directed graph \mathcal{G}=\{\mathcal{V},\mathcal{E}\} with \mathcal{V}=[n] and the SIR dynamics given by Eq. (1). Suppose Assumptions 3.1-3.2 hold. Then, the following results hold for all i\in\mathcal{V}, where k\in\mathbb{Z}_{\geq 0}, and \mathcal{S}_{H} and d_{i} are defined in Definition 3.4.
(a) s_{i}[k]>0 for all k\geq 0.
(b) If d_{i}\neq+\infty, then x_{i}[k]=0 for all k<d_{i}, and x_{i}[k]\in(0,1) for all k\geq d_{i}. (Note that for the case when d_{i}=0, i.e., i\in\mathcal{S}_{I}, part (b) implies x_{i}[k]>0 for all k\geq 0.)
(c) If d_{i}\neq+\infty, then r_{i}[k]=0 for all k\leq d_{i}, and r_{i}[k]\in(0,1) for all k>d_{i}.
(d) If i\in\mathcal{S}_{H} with d_{i}=+\infty, then x_{i}[k]=0 and r_{i}[k]=0 for all k\geq 0.
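Since d_{i} in Definition 3.4 is a multi-source shortest-path distance from \mathcal{S}_{I}, it can be computed by breadth-first search (BFS is also the tool invoked for this purpose in Section 4.2). The Python sketch below is our own illustrative helper, not from the paper; combined with Lemma 3.5, its output tells us exactly which x_{i}[k] and r_{i}[k] are guaranteed to be zero.

```python
from collections import deque

def distances_from_infected(n, edges, S_I):
    """Multi-source BFS computing d_i from Definition 3.4 for every node.

    n     : number of nodes (labeled 0..n-1 here for convenience)
    edges : directed edges (j, i) meaning j -> i, as in the paper's notation
    S_I   : set of initially infected nodes (x_i[0] > 0)
    Returns d with d[i] = 0 for i in S_I, the shortest distance from S_I to i
    otherwise, and float('inf') if no node in S_I can reach i."""
    out_nbrs = [[] for _ in range(n)]
    for (j, i) in edges:
        if j != i:                     # self loops do not shorten any path
            out_nbrs[j].append(i)
    d = [float('inf')] * n
    queue = deque()
    for i in S_I:
        d[i] = 0
        queue.append(i)
    while queue:
        u = queue.popleft()
        for v in out_nbrs[u]:
            if d[v] == float('inf'):
                d[v] = d[u] + 1
                queue.append(v)
    return d  # by Lemma 3.5, x_i[k] = 0 for all k < d[i], r_i[k] = 0 for all k <= d[i]
```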

4 Measurement Selection Problem in Exact Measurement Setting

Throughout this section, we assume that the sets \mathcal{S}_{I},\mathcal{S}_{H}\subseteq\mathcal{V} defined in Definition 3.4 are known, i.e., we know the set of nodes in \mathcal{V} that have infected individuals initially.

4.1 Problem Formulation

Given exact measurements of x_{i}[k] and r_{i}[k] for a subset of nodes, our goal is to estimate (or uniquely identify, if possible) the unknown parameters \beta and \delta. Here, we consider the scenario where collecting the measurement of x_{i}[k] (resp., r_{i}[k]) at any node i\in\mathcal{V} and at any time step k\in\mathbb{Z}_{\geq 0} incurs a cost, denoted as c_{k,i}\in\mathbb{R}_{\geq 0} (resp., b_{k,i}\in\mathbb{R}_{\geq 0}). Moreover, we can only collect the measurements of x_{i}[k] and r_{i}[k] for k\in\{t_{1},t_{1}+1,\dots,t_{2}\}, where t_{1},t_{2}\in\mathbb{Z}_{\geq 0} are given with t_{2}>t_{1}. Noting that Lemma 3.5 provides a necessary and sufficient condition under which x_{i}[k]=0 (resp., r_{i}[k]=0) holds, one does not need to collect a measurement of x_{i}[k] (resp., r_{i}[k]) if Lemma 3.5 already implies x_{i}[k]=0 (resp., r_{i}[k]=0). Given time steps t_{1},t_{2}\in\mathbb{Z}_{\geq 0} with t_{2}>t_{1}, we define the set

\mathcal{I}_{t_{1}:t_{2}}\triangleq\{\lambda_{i}[k]:k\in\{t_{1},\dots,t_{2}\},i\in\mathcal{V},\lambda_{i}[k]>0,\lambda\in\{x,r\}\}, (2)

which represents the set of all candidate measurements from time step t_{1} to time step t_{2}. To proceed, we first use Eqs. (1b)-(1c) to obtain

\begin{bmatrix}x[t_{1}+1]-x[t_{1}]\\ \vdots\\ x[t_{2}]-x[t_{2}-1]\\ r[t_{1}+1]-r[t_{1}]\\ \vdots\\ r[t_{2}]-r[t_{2}-1]\end{bmatrix}=h\begin{bmatrix}\Phi_{t_{1}:t_{2}-1}^{x}\\ \Phi_{t_{1}:t_{2}-1}^{r}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, (3)

where \Phi_{t_{1}:t_{2}-1}^{x}\triangleq\begin{bmatrix}(\Phi_{t_{1}}^{x})^{T}&\cdots&(\Phi_{t_{2}-1}^{x})^{T}\end{bmatrix}^{T} with

\Phi_{k}^{x}\triangleq\begin{bmatrix}s_{1}[k]\sum_{j\in\bar{\mathcal{N}}_{1}}a_{1j}x_{j}[k]&-x_{1}[k]\\ \vdots&\vdots\\ s_{n}[k]\sum_{j\in\bar{\mathcal{N}}_{n}}a_{nj}x_{j}[k]&-x_{n}[k]\end{bmatrix},\ \forall k\in\{t_{1},\dots,t_{2}-1\}, (4)

and \Phi_{t_{1}:t_{2}-1}^{r}\triangleq\begin{bmatrix}(\Phi_{t_{1}}^{r})^{T}&\cdots&(\Phi_{t_{2}-1}^{r})^{T}\end{bmatrix}^{T} with

\Phi_{k}^{r}\triangleq\begin{bmatrix}0&x_{1}[k]\\ \vdots&\vdots\\ 0&x_{n}[k]\end{bmatrix},\ \forall k\in\{t_{1},\dots,t_{2}-1\}. (5)

We can then view Eq. (3) as a set of 2(t_{2}-t_{1})n equations in \beta and \delta. Noting that s_{i}[k] for all i\in\mathcal{V} can be obtained from s_{i}[k]=1-x_{i}[k]-r_{i}[k] as argued in Section 3, we see that the coefficients in the set of equations in \beta and \delta given by Eq. (3), i.e., the terms in Eq. (3) other than \beta and \delta, can be determined given that x[k] and r[k] are known for all k\in\{t_{1},\dots,t_{2}\}. Also note that given x[k] and r[k] for all k\in\{t_{1},\dots,t_{2}\}, we can uniquely identify \beta and \delta using Eq. (3) if and only if \operatorname{rank}(\begin{bmatrix}(\Phi_{t_{1}:t_{2}-1}^{x})^{T}&(\Phi_{t_{1}:t_{2}-1}^{r})^{T}\end{bmatrix})=2.
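As a sanity check of this identifiability condition, one can assemble the stacked coefficient matrix from Eqs. (4)-(5) and, when it has rank 2, recover \theta=[\beta\ \delta]^{T} by least squares. The Python sketch below is our own illustration (assuming the full trajectories x[k] and r[k] are available for k\in\{t_{1},\dots,t_{2}\}), not an algorithm from the paper.

```python
import numpy as np

def identify_beta_delta(A, h, x, r, t1, t2):
    """Stack the equations of Eq. (3) and recover theta = [beta, delta].

    x, r : arrays of shape (>= t2 + 1, n) holding the exact trajectories
           x[k], r[k] for k = t1, ..., t2 (s[k] = 1 - x[k] - r[k], Section 3)
    Raises if the stacked coefficient matrix does not have rank 2."""
    n = x.shape[1]
    rows, rhs = [], []
    for k in range(t1, t2):
        s_k = 1.0 - x[k] - r[k]
        for i in range(n):
            rows.append([s_k[i] * (A[i] @ x[k]), -x[k, i]])  # Phi_{k,i}^x, Eq. (8a)
            rhs.append(x[k + 1, i] - x[k, i])
            rows.append([0.0, x[k, i]])                      # Phi_{k,i}^r, Eq. (8b)
            rhs.append(r[k + 1, i] - r[k, i])
    Phi, y = np.array(rows), np.array(rhs) / h               # Eq. (3): Phi theta = y
    if np.linalg.matrix_rank(Phi) < 2:
        raise ValueError("beta and delta are not identifiable: rank(Phi) < 2")
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta  # with exact data the residual is zero and theta = [beta, delta]
```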

Next, let \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}} denote a measurement selection strategy, where \mathcal{I}_{t_{1}:t_{2}} is given by Eq. (2). We will then consider identifying \beta and \delta using the measurements contained in \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}. To illustrate our analysis, given any i\in\mathcal{V} and any k\in\{t_{1},\dots,t_{2}-1\}, we first consider the following equation from Eq. (3):

x_{i}[k+1]-x_{i}[k]=h\begin{bmatrix}s_{i}[k]\sum_{w\in\bar{\mathcal{N}}_{i}}a_{iw}x_{w}[k]&-x_{i}[k]\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, (6)

where s_{i}[k]=1-x_{i}[k]-r_{i}[k], and we index the equation in Eq. (3) corresponding to Eq. (6) as (k,i,x). Note that in order to use Eq. (6) in identifying \beta and \delta, one needs to determine the coefficients (i.e., the terms other than \beta and \delta) in the equation. To determine the coefficients in equation (k,i,x), one can use the measurements contained in \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, and use Lemma 3.5 to determine if x_{i}[k]=0 (resp., r_{i}[k]=0) holds. Supposing x_{i}[k+1]=0, we see from Lemma 3.5 and Eq. (1b) that x_{i}[k]=0 and s_{i}[k]\sum_{w\in\bar{\mathcal{N}}_{i}}a_{iw}x_{w}[k]=0, which makes equation (k,i,x) useless in identifying \beta and \delta. Thus, in order to use equation (k,i,x) in identifying \beta and \delta, we need x_{i}[k+1]\in\mathcal{I} with x_{i}[k+1]>0. Similarly, given any i\in\mathcal{V} and any k\in\{t_{1},\dots,t_{2}-1\}, we consider the following equation from Eq. (3):

r_{i}[k+1]-r_{i}[k]=h\begin{bmatrix}0&x_{i}[k]\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, (7)

where we index the above equation as (k,i,r). Supposing r_{i}[k+1]=0, we see from Lemma 3.5 and Eq. (1c) that r_{i}[k]=x_{i}[k]=0, which makes equation (k,i,r) useless in identifying \beta and \delta. Hence, in order to use equation (k,i,r) in identifying \beta and \delta, we need to have \{x_{i}[k],r_{i}[k+1]\}\subseteq\mathcal{I} with x_{i}[k]>0 and r_{i}[k+1]>0. More precisely, equation (k,i,r) can be used in identifying \beta and \delta if and only if \{x_{i}[k],r_{i}[k+1]\}\subseteq\mathcal{I}, and r_{i}[k]\in\mathcal{I} or r_{i}[k]=0 (from Lemma 3.5).

In general, let us denote the two coefficient matrices corresponding to equations (k,i,x) and (k,i,r) in Eq. (3), respectively, as

\Phi_{k,i}^{x}\triangleq\begin{bmatrix}s_{i}[k]\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k]&-x_{i}[k]\end{bmatrix}, (8a)
\Phi_{k,i}^{r}\triangleq\begin{bmatrix}0&x_{i}[k]\end{bmatrix}, (8b)

for all k\in\{t_{1},\dots,t_{2}-1\} and for all i\in\mathcal{V}. Moreover, given any measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, we let

\bar{\mathcal{I}}\triangleq\{(k,i,x):x_{i}[k+1]\in\mathcal{I},x_{i}[k]=0\}\cup\{(k,i,x):\{x_{i}[k+1],x_{i}[k]\}\subseteq\mathcal{I}\}\cup\{(k,i,r):\{r_{i}[k+1],x_{i}[k]\}\subseteq\mathcal{I},r_{i}[k]=0\}\cup\{(k,i,r):\{r_{i}[k+1],r_{i}[k],x_{i}[k]\}\subseteq\mathcal{I}\} (9)

be the set that contains the indices of the equations from Eq. (3) that can potentially be used in identifying \beta and \delta, based on the measurements contained in \mathcal{I}. In other words, the coefficients on the left-hand side of equation (k,i,x) (resp., (k,i,r)) can be determined using the measurements from \mathcal{I} and using Lemma 3.5, for all (k,i,x)\in\bar{\mathcal{I}} (resp., (k,i,r)\in\bar{\mathcal{I}}). Let us now consider the coefficient matrix \Phi_{k,i}^{x} (resp., \Phi_{k,i}^{r}) corresponding to (k,i,x)\in\bar{\mathcal{I}} (resp., (k,i,r)\in\bar{\mathcal{I}}). One can show that there may exist equations in \bar{\mathcal{I}} whose coefficients cannot be (directly) determined using the measurements contained in \mathcal{I} or using Lemma 3.5, where the undetermined coefficients come from the first element of \Phi_{k,i}^{x} given by Eq. (8a). Nevertheless, it is also possible that one can perform algebraic operations among the equations in \bar{\mathcal{I}} such that the undetermined coefficients cancel. Formally, we define the following.

Definition 4.1.

Consider a measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, where \mathcal{I}_{t_{1}:t_{2}} is given by Eq. (2). Stack the coefficient matrices \Phi_{k,i}^{x}\in\mathbb{R}^{1\times 2} for all (k,i,x)\in\bar{\mathcal{I}} and \Phi_{k,i}^{r}\in\mathbb{R}^{1\times 2} for all (k,i,r)\in\bar{\mathcal{I}} into a single matrix, where \Phi_{k,i}^{x} and \Phi_{k,i}^{r} are given by (8) and \bar{\mathcal{I}} is given by Eq. (9). The resulting matrix is denoted as \Phi(\mathcal{I})\in\mathbb{R}^{|\bar{\mathcal{I}}|\times 2}. Moreover, define \tilde{\Phi}(\mathcal{I}) to be the set that contains all the matrices \Phi\in\mathbb{R}^{2\times 2} such that (\Phi)_{1} and (\Phi)_{2} can be obtained via algebraic operations among the rows of \Phi(\mathcal{I}), and the elements of (\Phi)_{1} and (\Phi)_{2} can be fully determined using the measurements from \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}} and using Lemma 3.5.

In other words, \Phi\in\tilde{\Phi}(\mathcal{I}) corresponds to two equations (in \beta and \delta) obtained from Eq. (3) such that the coefficients on the right-hand side of the two equations can be determined using the measurements contained in \mathcal{I} and using Lemma 3.5 (if the coefficients contain x_{i}[k]=0 or r_{i}[k]=0). Moreover, one can show that the coefficients on the left-hand side of the two equations obtained from Eq. (3) corresponding to \Phi can also be determined using measurements from \mathcal{I} and using Lemma 3.5. Putting the above arguments together, we see that given a measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, \beta and \delta can be uniquely identified if and only if there exists \Phi\in\tilde{\Phi}(\mathcal{I}) such that \operatorname{rank}(\Phi)=2. Equivalently, denoting

r_{\max}(\mathcal{I})\triangleq\max_{\Phi\in\tilde{\Phi}(\mathcal{I})}\operatorname{rank}(\Phi), (10)

where r_{\max}(\mathcal{I})\triangleq 0 if \tilde{\Phi}(\mathcal{I})=\emptyset, we see that \beta and \delta can be uniquely identified using the measurements from \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}} if and only if r_{\max}(\mathcal{I})=2.

Remark 4.2.

Note that if a measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}} satisfies r_{\max}(\mathcal{I})=2, it follows from the above arguments that |\bar{\mathcal{I}}|\geq 2, i.e., \Phi(\mathcal{I})\in\mathbb{R}^{|\bar{\mathcal{I}}|\times 2} has at least two rows, where \bar{\mathcal{I}} is defined in Eq. (9).

Recall that collecting the measurement of x_{i}[k] (resp., r_{i}[k]) at any node i\in\mathcal{V} and at any time step k\in\mathbb{Z}_{\geq 0} incurs cost c_{k,i}\in\mathbb{R}_{\geq 0} (resp., b_{k,i}\in\mathbb{R}_{\geq 0}). Given any measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, we denote the cost associated with \mathcal{I} as

c(\mathcal{I})\triangleq\sum_{x_{i}[k]\in\mathcal{I}}c_{k,i}+\sum_{r_{i}[k]\in\mathcal{I}}b_{k,i}. (11)

We then define the Parameter Identification Measurement Selection (PIMS) problem in the exact measurement setting as follows.

Problem 4.3.

Consider a discrete-time SIR model given by Eq. (1) with a directed graph \mathcal{G}=\{\mathcal{V},\mathcal{E}\}, a weight matrix A\in\mathbb{R}^{n\times n}, a sampling parameter h\in\mathbb{R}_{\geq 0}, and sets \mathcal{S}_{I},\mathcal{S}_{H}\subseteq\mathcal{V} defined in Definition 3.4. Moreover, consider time steps t_{1},t_{2}\in\mathbb{Z}_{\geq 0} with t_{1}<t_{2}, and a cost c_{k,i}\in\mathbb{R}_{\geq 0} of measuring x_{i}[k] and a cost b_{k,i}\in\mathbb{R}_{\geq 0} of measuring r_{i}[k] for all i\in\mathcal{V} and for all k\in\{t_{1},\dots,t_{2}\}. The PIMS problem is to find \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}} that solves

\min_{\mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}} c(\mathcal{I}) \quad \text{s.t.} \quad r_{\max}(\mathcal{I})=2, (12)

where \mathcal{I}_{t_{1}:t_{2}} is defined in Eq. (2), c(\mathcal{I}) is defined in Eq. (11), and r_{\max}(\mathcal{I}) is defined in Eq. (10).

We have the following result; the proof is included in Section 7.2 in the Appendix.

Theorem 4.4.

The PIMS problem is NP-hard.

Theorem 4.4 indicates that there is no polynomial-time algorithm that solves all instances of the PIMS problem optimally (if P \neq NP). Moreover, we note from the formulation of the PIMS problem given by Problem 4.3 that for a measurement selection strategy \mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}, one needs to check whether \max_{\Phi\in\tilde{\Phi}(\mathcal{I})}\operatorname{rank}(\Phi)=2 holds before the corresponding measurements are collected. However, in general, it is not possible to calculate \operatorname{rank}(\Phi) when no measurements are collected. In order to bypass these issues, we will explore additional properties of the PIMS problem in the following.

4.2 Solving the PIMS Problem

We start with the following result.

Lemma 4.5.

Consider a discrete-time SIR model given by Eq. (1). Suppose Assumptions 3.1-3.2 hold. Then, the following results hold, where \Phi_{k_{1},i_{1}}^{x}\in\mathbb{R}^{1\times 2} and \Phi_{k_{2},i_{2}}^{r}\in\mathbb{R}^{1\times 2} are defined in (8), \mathcal{S}_{I}^{\prime}\triangleq\{i\in\mathcal{S}_{I}:a_{ii}>0\}, \mathcal{S}^{\prime}\triangleq\{i\in\mathcal{V}\setminus\mathcal{S}_{I}^{\prime}:\mathcal{N}_{i}\neq\emptyset,\min\{d_{j}:j\in\mathcal{N}_{i}\}\neq\infty\}, and \mathcal{S}_{I} and d_{i} are defined in Definition 3.4 for all i\in\mathcal{V}.
(a) For any i_{1}\in\mathcal{S}_{I}^{\prime} and for any i_{2}\in\mathcal{V} with d_{i_{2}}\neq\infty, \operatorname{rank}(\begin{bmatrix}(\Phi_{k_{1},i_{1}}^{x})^{T}&(\Phi_{k_{2},i_{2}}^{r})^{T}\end{bmatrix})=2 for all k_{1}\geq 0 and for all k_{2}\geq d_{i_{2}}, where k_{1},k_{2}\in\mathbb{Z}_{\geq 0}.
(b) For any i_{1}\in\mathcal{S}^{\prime} and for any i_{2}\in\mathcal{V} with d_{i_{2}}\neq\infty, \operatorname{rank}(\begin{bmatrix}(\Phi_{k_{1},i_{1}}^{x})^{T}&(\Phi_{k_{2},i_{2}}^{r})^{T}\end{bmatrix})=2 for all k_{1}\geq\min\{d_{j}:j\in\mathcal{N}_{i_{1}}\} and for all k_{2}\geq d_{i_{2}}, where k_{1},k_{2}\in\mathbb{Z}_{\geq 0}.

Proof.

From (8), we have

\begin{bmatrix}\Phi_{k_{1},i_{1}}^{x}\\ \Phi_{k_{2},i_{2}}^{r}\end{bmatrix}=\begin{bmatrix}s_{i_{1}}[k_{1}]\sum_{j\in\bar{\mathcal{N}}_{i_{1}}}a_{i_{1}j}x_{j}[k_{1}]&-x_{i_{1}}[k_{1}]\\ 0&x_{i_{2}}[k_{2}]\end{bmatrix}. (13)

To prove part (a), consider any i_{1}\in\mathcal{S}_{I}^{\prime} and any i_{2}\in\mathcal{V} with d_{i_{2}}\neq\infty, where we note that x_{i_{1}}[0]>0 and a_{i_{1}i_{1}}>0 from the definition of \mathcal{S}_{I}^{\prime}. We then see from Lemma 3.5(a)-(b) that s_{i_{1}}[k_{1}]>0 and x_{i_{1}}[k_{1}]>0 for all k_{1}\geq 0. It follows that s_{i_{1}}[k_{1}]\sum_{j\in\bar{\mathcal{N}}_{i_{1}}}a_{i_{1}j}x_{j}[k_{1}]>0 for all k_{1}\geq 0. Also, we obtain from Lemma 3.5(b) that x_{i_{2}}[k_{2}]>0 for all k_{2}\geq d_{i_{2}}, which proves part (a).

We then prove part (b). Considering any i_{1}\in\mathcal{S}^{\prime} and any i_{2}\in\mathcal{V} with d_{i_{2}}\neq\infty, we see from the definition of \mathcal{S}^{\prime} that \mathcal{N}_{i_{1}}\neq\emptyset and there exists j\in\mathcal{N}_{i_{1}} such that d_{j}\neq\infty. Letting j_{1} be a node in \mathcal{N}_{i_{1}} such that d_{j_{1}}=\min\{d_{j}:j\in\mathcal{N}_{i_{1}}\}\neq\infty, we note from Lemma 3.5(b) that x_{j_{1}}[k_{1}]>0 for all k_{1}\geq\min\{d_{j}:j\in\mathcal{N}_{i_{1}}\}. Also note that a_{i_{1}j_{1}}>0 from Assumption 3.2. The rest of the proof of part (b) is then identical to that of part (a). ∎

Recalling the way we index the equations in Eq. (3) (see Eqs. (6)-(7) for examples), we define the set that contains all the indices of the equations in Eq. (3):

\mathcal{Q}\triangleq\{(k,i,\lambda):k\in\{t_{1},\dots,t_{2}-1\},i\in\mathcal{V},\lambda\in\{x,r\}\}. (14)

Following the arguments in Lemma 4.5, we denote

\mathcal{Q}_{1}\triangleq\{(k,i,x)\in\mathcal{Q}:i\in\mathcal{S}_{I}^{\prime}\}\cup\{(k,i,x)\in\mathcal{Q}:k\geq\min\{d_{j}:j\in\mathcal{N}_{i}\},i\in\mathcal{S}^{\prime}\}, (15)
\mathcal{Q}_{2}\triangleq\{(k,i,r)\in\mathcal{Q}:k\geq d_{i},i\in\mathcal{V},d_{i}\neq\infty\}, (16)

where \mathcal{S}_{I}^{\prime} and \mathcal{S}^{\prime} are defined in Lemma 4.5, and d_{i} is defined in Definition 3.4. Next, for all (k,i,x)\in\mathcal{Q}, we define the set of measurements that are needed to determine the coefficients in equation (k,i,x) (when no other equations are used) to be

\mathcal{I}(k,i,x)\triangleq(\{x_{i}[k+1],r_{i}[k]\}\cup\{x_{j}[k]:j\in\bar{\mathcal{N}}_{i}\})\cap\mathcal{I}_{t_{1}:t_{2}},

where \mathcal{I}_{t_{1}:t_{2}} is defined in Eq. (2). Similarly, for all (k,i,r)\in\mathcal{Q}, we define

\mathcal{I}(k,i,r)\triangleq\{r_{i}[k+1],r_{i}[k],x_{i}[k]\}\cap\mathcal{I}_{t_{1}:t_{2}}.

Moreover, let us denote

\mathcal{I}((k_{1},i_{1},\lambda_{1}),(k_{2},i_{2},\lambda_{2}))\triangleq\mathcal{I}(k_{1},i_{1},\lambda_{1})\cup\mathcal{I}(k_{2},i_{2},\lambda_{2}) (17)

for all (k_{1},i_{1},\lambda_{1}),(k_{2},i_{2},\lambda_{2})\in\mathcal{Q}. Similarly to Eq. (11), denote the sum of the costs of the measurements in \mathcal{I}((k_{1},i_{1},\lambda_{1}),(k_{2},i_{2},\lambda_{2})) as c(\mathcal{I}((k_{1},i_{1},\lambda_{1}),(k_{2},i_{2},\lambda_{2}))).

Algorithm 1 Algorithm for PIMS
1: Input: An instance of PIMS
2: Find (k_{1},i_{1},x)\in\mathcal{Q}_{1} and (k_{2},i_{2},r)\in\mathcal{Q}_{2} such that c(\mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r))) is minimized
3: Return \mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r))

Based on the above arguments, we propose Algorithm 1 for the PIMS problem. Note that Algorithm 1 finds an equation from \mathcal{Q}_{1} and an equation from \mathcal{Q}_{2} such that the sum of the costs of the measurements needed for the two equations is minimized, where \mathcal{Q}_{1} and \mathcal{Q}_{2} are defined in Eq. (15) and Eq. (16), respectively.
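Concretely, Algorithm 1 is a direct search over the pairs in \mathcal{Q}_{1}\times\mathcal{Q}_{2}. A short Python sketch of this search is given below; the dictionary-based data structures (mapping equation indices to their measurement sets \mathcal{I}(k,i,\lambda) and measurements to their costs) are our own illustrative choices, not part of the paper.

```python
def algorithm_1(Q1, Q2, meas_sets, meas_cost):
    """Sketch of Algorithm 1 for the PIMS problem.

    Q1, Q2    : iterables of equation indices (k, i, 'x') and (k, i, 'r'),
                as in Eqs. (15)-(16)
    meas_sets : dict mapping an equation index to its measurement set
                I(k, i, lambda), each set holding hashable measurement keys
    meas_cost : dict mapping a measurement key to its cost c_{k,i} or b_{k,i}
    Returns the measurement set I((k1, i1, x), (k2, i2, r)) of minimum cost."""
    def pair_cost(eq_x, eq_r):
        # c(I(eq_x, eq_r)): cost of the union of the two sets, per Eq. (17)
        return sum(meas_cost[m] for m in meas_sets[eq_x] | meas_sets[eq_r])

    best = min(((e1, e2) for e1 in Q1 for e2 in Q2), key=lambda p: pair_cost(*p))
    return meas_sets[best[0]] | meas_sets[best[1]]
```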

Proposition 4.6.

Consider an instance of the PIMS problem under Assumptions 3.1-3.2. Algorithm 1 returns a solution \mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r)) to the PIMS problem that satisfies the constraint in (12), and the following:

\frac{c(\mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r)))}{c(\mathcal{I}^{\star})}\leq\frac{\min_{(k,i,x)\in\mathcal{Q}_{1}}(b_{k+1,i}+b_{k,i}+c_{k+1,i}+\sum_{j\in\bar{\mathcal{N}}_{i}}c_{k,j})}{3c_{\min}}, (18)

where \mathcal{I}^{\star} is an optimal solution to the PIMS problem, \mathcal{Q}_{1} is defined in Eq. (15), and c_{\min}\triangleq\min\{\min_{x_{i}[k]\in\mathcal{I}_{t_{1}:t_{2}}}c_{k,i},\min_{r_{i}[k]\in\mathcal{I}_{t_{1}:t_{2}}}b_{k,i}\}>0 with \mathcal{I}_{t_{1}:t_{2}} given by Eq. (2).

Proof.

The feasibility of \mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r)) follows directly from the definition of Algorithm 1 and Lemma 4.5. We now prove (18). Consider any equations (k,i,x)\in\mathcal{Q}_{1} and (k,i,r)\in\mathcal{Q}_{2}. We have from Eq. (17) the following:

\mathcal{I}((k,i,x),(k,i,r))=(\{x_{i}[k+1],r_{i}[k]\}\cup\{x_{j}[k]:j\in\bar{\mathcal{N}}_{i}\}\cup\{r_{i}[k+1],r_{i}[k],x_{i}[k]\})\cap\mathcal{I}_{t_{1}:t_{2}},

which implies

c(\mathcal{I}((k_{1},i_{1},x),(k_{2},i_{2},r)))\leq\min_{(k,i,x)\in\mathcal{Q}_{1}}(b_{k+1,i}+b_{k,i}+c_{k+1,i}+\sum_{j\in\bar{\mathcal{N}}_{i}}c_{k,j}).

Next, since \mathcal{I}^{\star} satisfies r_{\max}(\mathcal{I}^{\star})=2, we recall from Remark 4.2 that |\bar{\mathcal{I}}^{\star}|\geq 2, where

\bar{\mathcal{I}}^{\star}=\{(k,i,x):x_{i}[k+1]\in\mathcal{I}^{\star},x_{i}[k]=0\}\cup\{(k,i,x):\{x_{i}[k+1],x_{i}[k]\}\subseteq\mathcal{I}^{\star}\}\cup\{(k,i,r):\{r_{i}[k+1],x_{i}[k]\}\subseteq\mathcal{I}^{\star},r_{i}[k]=0\}\cup\{(k,i,r):\{r_{i}[k+1],r_{i}[k],x_{i}[k]\}\subseteq\mathcal{I}^{\star}\},

which implies |\mathcal{I}^{\star}|\geq 2. In fact, suppose \mathcal{I}^{\star}=\{x_{i}[k+1],x_{j}[k+1]\}, where i,j\in\mathcal{V} and k\in\{t_{1}-1,\dots,t_{2}-1\}. Since the elements in \Phi_{k,i}^{x} and \Phi_{k,j}^{x} (defined in (8)) do not contain x_{w}[0], r_{w}[0] or s_{w}[0] for any w\in\mathcal{V}, and cannot all be zero, we see that there exists x_{w^{\prime}}[k]\in\mathcal{I}^{\star} (with x_{w^{\prime}}[k]>0), where w^{\prime}\in\mathcal{V}. This further implies |\mathcal{I}^{\star}|\geq 3. Using similar arguments, one can show that |\mathcal{I}^{\star}|\geq 3 holds in general, which implies c(\mathcal{I}^{\star})\geq 3c_{\min}. Combining the above arguments leads to (18). ∎

Finally, note that \mathcal{Q}_{2} and \mathcal{I}_{t_{1}:t_{2}} can be obtained by calling the Breadth-First-Search (BFS) algorithm (e.g., [8]) |\mathcal{S}_{I}| times with O(|\mathcal{S}_{I}|(n+|\mathcal{E}|)) total time complexity. Also note that the time complexity of line 2 of Algorithm 1 is O(n^{2}(t_{2}-t_{1}+1)^{2}). Thus, the overall time complexity of Algorithm 1 is O(n^{2}(t_{2}-t_{1}+1)^{2}+|\mathcal{S}_{I}||\mathcal{E}|)=O(|\mathcal{Q}|^{2}+|\mathcal{S}_{I}||\mathcal{E}|).

5 Measurement Selection Problem in Random Measurement Setting

In this section, we assume that the initial condition l\triangleq[(s[0])^{T}\ (x[0])^{T}\ (r[0])^{T}]^{T} is known. Nevertheless, our analysis can potentially be extended to cases where the initial condition l is given by a probability distribution.

5.1 Problem Formulation

Here, we consider the scenario where the measurement of x_{i}[k] (resp., r_{i}[k]), denoted as \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]), is given by a pmf p(\hat{x}_{i}[k]|x_{i}[k]) (resp., p(\hat{r}_{i}[k]|r_{i}[k])). Note that one can express x_{i}[k] in terms of l and \theta\triangleq[\beta\ \delta]^{T} using (1b). Hence, given l and \theta, we can alternatively write p(\hat{x}_{i}[k]|x_{i}[k]) as p(\hat{x}_{i}[k]|l,\theta) for all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 1}. Since the initial condition is assumed to be known, we drop the dependency of p(\hat{x}_{i}[k]|l,\theta) on l, and denote the pmf of \hat{x}_{i}[k] as p(\hat{x}_{i}[k]|\theta) for all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 1}. Similarly, given l and \theta, we denote the pmf of \hat{r}_{i}[k] as p(\hat{r}_{i}[k]|\theta) for all i\in\mathcal{V} and for all k\in\mathbb{Z}_{\geq 1}. Note that when collecting measurement \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]) under a limited budget, one possibility is to give virus (resp., antibody) tests to a group of randomly and uniformly sampled individuals of the population at node i\in\mathcal{V} and at time k\in\mathbb{Z}_{\geq 1} (e.g., [2]), where a positive testing result indicates that the tested individual is infected (resp., recovered) at time k (e.g., [1]). Thus, the obtained random measurements \hat{x}_{i}[k] and \hat{r}_{i}[k] and the corresponding pmfs p(\hat{x}_{i}[k]|\theta) and p(\hat{r}_{i}[k]|\theta) depend on the total numbers of virus tests and antibody tests conducted at node i and at time k, respectively. Consider any node i\in\mathcal{V} and any time step k\in\mathbb{Z}_{\geq 1}, where the total population of i is denoted by N_{i}\in\mathbb{Z}_{\geq 1} and is assumed to be fixed over time. Suppose we are allowed to choose the number of virus (resp., antibody) tests that will be performed on the (randomly sampled) individuals at node i and at time k. Assume that the cost of performing the virus (resp., antibody) tests is proportional to the number of the tests. For any i\in\mathcal{V} and for any k\in\{t_{1},\dots,t_{2}\}, let

\mathcal{C}_{k,i}\triangleq\{\zeta c_{k,i}:\zeta\in(\{0\}\cup[\zeta_{i}])\} (19)

be the set of all possible costs that we can spend on collecting the measurement \hat{x}_{i}[k], where c_{k,i}\in\mathbb{R}_{\geq 0} and \zeta_{i}\in\mathbb{Z}_{\geq 1}. Similarly, for any i\in\mathcal{V} and any k\in\{t_{1},\dots,t_{2}\}, let

\mathcal{B}_{k,i}\triangleq\{\eta b_{k,i}:\eta\in(\{0\}\cup[\eta_{i}])\} (20)

denote the set of all possible costs that we can spend on collecting the measurement \hat{r}_{i}[k], where b_{k,i}\in\mathbb{R}_{\geq 0} and \eta_{i}\in\mathbb{Z}_{\geq 1}. For instance, \zeta c_{k,i} can be viewed as the cost of performing virus tests on \zeta N_{i}^{x} (randomly sampled) individuals in the population at node i, where N_{i}^{x}\in\mathbb{Z}_{\geq 1} and \zeta_{i}N_{i}^{x}\leq N_{i}. To reflect the dependency of the pmf p(\hat{x}_{i}[k]|\theta) (resp., p(\hat{r}_{i}[k]|\theta)) of measurement \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]) on the cost spent on collecting the measurement of x_{i}[k] (resp., r_{i}[k]), we further denote the pmf of \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]) as p(\hat{x}_{i}[k]|\theta,\varphi_{k,i}) (resp., p(\hat{r}_{i}[k]|\theta,\omega_{k,i})), where \varphi_{k,i}\in\mathcal{C}_{k,i} (resp., \omega_{k,i}\in\mathcal{B}_{k,i}) with \mathcal{C}_{k,i} (resp., \mathcal{B}_{k,i}) given by Eq. (19) (resp., Eq. (20)). Note that \varphi_{k,i} (resp., \omega_{k,i}) is the cost that we spend on collecting measurement \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]), and \varphi_{k,i}=0 (resp., \omega_{k,i}=0) indicates that measurement \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]) is not collected.

In contrast with the exact measurement case studied in Section 4, it is not possible to uniquely identify \beta and \delta using the measurements \hat{x}_{i}[k] and \hat{r}_{i}[k], which are now random variables. Thus, we will consider estimators of \beta and \delta based on the measurements indicated by a measurement selection strategy. Similarly to Section 4, given time steps t_{1},t_{2}\in\mathbb{Z}_{\geq 1} with t_{2}\geq t_{1}, define the set of all candidate measurements as

\mathcal{U}_{t_{1}:t_{2}}\triangleq\{\hat{x}_{i}[k]:i\in\mathcal{V},k\in\{t_{1},\dots,t_{2}\}\}\cup\{\hat{r}_{i}[k]:i\in\mathcal{V},k\in\{t_{1},\dots,t_{2}\}\}. (21)

Recalling \mathcal{C}_{k,i} and \mathcal{B}_{k,i} defined in Eq. (19) and Eq. (20), respectively, we let \mu\in\mathbb{Z}_{\geq 0}^{\mathcal{U}_{t_{1}:t_{2}}} be a measurement selection that specifies the costs spent on collecting measurements \hat{x}_{i}[k] and \hat{r}_{i}[k] for all i\in\mathcal{V} and for all k\in\{t_{1},\dots,t_{2}\}. Moreover, we define the set of all candidate measurement selections as

\mathcal{M}\triangleq\{\mu\in\mathbb{Z}_{\geq 0}^{\mathcal{U}_{t_{1}:t_{2}}}:\mu(\hat{x}_{i}[k])\in(\{0\}\cup[\zeta_{i}]),\mu(\hat{r}_{i}[k])\in(\{0\}\cup[\eta_{i}])\}, (22)

where \zeta_{i},\eta_{i}\in\mathbb{Z}_{\geq 1} for all i\in\mathcal{V}. In other words, a measurement selection \mu is defined over the integer lattice \mathbb{Z}_{\geq 0}^{\mathcal{U}_{t_{1}:t_{2}}} so that \mu is a vector of dimension |\mathcal{U}_{t_{1}:t_{2}}|, where each element of \mu corresponds to an element in \mathcal{U}_{t_{1}:t_{2}}, and is denoted as \mu(\hat{x}_{i}[k]) (or \mu(\hat{r}_{i}[k])). The set \mathcal{M} contains all \mu\in\mathbb{Z}_{\geq 0}^{\mathcal{U}_{t_{1}:t_{2}}} such that \mu(\hat{x}_{i}[k])\in(\{0\}\cup[\zeta_{i}]) and \mu(\hat{r}_{i}[k])\in(\{0\}\cup[\eta_{i}]) for all i\in\mathcal{V} and for all k\in\{t_{1},\dots,t_{2}\}. Thus, for any \varphi_{k,i}\in\mathcal{C}_{k,i} and for any \omega_{k,i}\in\mathcal{B}_{k,i}, there exists \mu\in\mathcal{M} such that \mu(\hat{x}_{i}[k])c_{k,i}=\varphi_{k,i} and \mu(\hat{r}_{i}[k])b_{k,i}=\omega_{k,i}. In other words, \mu(\hat{x}_{i}[k])c_{k,i} (resp., \mu(\hat{r}_{i}[k])b_{k,i}) is the cost spent on collecting the measurement of x_{i}[k] (resp., r_{i}[k]). Given a measurement selection \mu\in\mathcal{M}, we can also denote the pmfs of \hat{x}_{i}[k] and \hat{r}_{i}[k] as p(\hat{x}_{i}[k]|\theta,\mu(\hat{x}_{i}[k])) and p(\hat{r}_{i}[k]|\theta,\mu(\hat{r}_{i}[k])), respectively, where we drop the dependencies of the pmfs on c_{k,i} and b_{k,i} for notational simplicity.

To proceed, we consider the scenario where measurements can only be collected under a budget constraint given by B\in\mathbb{R}_{\geq 0}. Using the above notation, the budget constraint can be expressed as

\sum_{\hat{x}_{i}[k]\in\mathcal{U}_{t_{1}:t_{2}}}c_{k,i}\mu(\hat{x}_{i}[k])+\sum_{\hat{r}_{i}[k]\in\mathcal{U}_{t_{1}:t_{2}}}b_{k,i}\mu(\hat{r}_{i}[k])\leq B. (23)

We then consider estimators of \theta=[\beta\ \delta]^{T} based on any given measurement selection \mu\in\mathcal{M}. Considering any \mu\in\mathcal{M}, we denote

\mathcal{U}^{\lambda}_{i}\triangleq\{k:\mu(\hat{\lambda}_{i}[k])>0,k\in\{t_{1},\dots,t_{2}\}\}, (24)

for all i\in\mathcal{V} and for all \lambda\in\{x,r\}. For all i\in\mathcal{V} and for all \lambda\in\{x,r\} with \mathcal{U}_{i}^{\lambda}\neq\emptyset, denote y(\mathcal{U}_{i}^{\lambda})\triangleq\begin{bmatrix}\hat{\lambda}_{i}[k_{1}]&\cdots&\hat{\lambda}_{i}[k_{|\mathcal{U}_{i}^{\lambda}|}]\end{bmatrix}^{T}, where \mathcal{U}_{i}^{\lambda}=\{k_{1},\dots,k_{|\mathcal{U}_{i}^{\lambda}|}\}. Letting

\mathcal{U}_{\lambda}\triangleq\{i:\mathcal{U}_{i}^{\lambda}\neq\emptyset,i\in\mathcal{V}\}\ \forall\lambda\in\{x,r\},

we denote the measurement vector indicated by \mu\in\mathcal{M} as

y(\mu)\triangleq\begin{bmatrix}(y(\mathcal{U}_{i_{1}}^{x}))^{T}&\cdots&(y(\mathcal{U}_{i_{|\mathcal{U}_{x}|}}^{x}))^{T}&(y(\mathcal{U}_{j_{1}}^{r}))^{T}&\cdots&(y(\mathcal{U}_{j_{|\mathcal{U}_{r}|}}^{r}))^{T}\end{bmatrix}^{T}, (25)

where \mathcal{U}_{x}=\{i_{1},\dots,i_{|\mathcal{U}_{x}|}\} and \mathcal{U}_{r}=\{j_{1},\dots,j_{|\mathcal{U}_{r}|}\}. Note that \hat{x}_{i}[k] and \hat{r}_{i}[k] are (discrete) random variables with pmfs p(\hat{x}_{i}[k]|\theta,\mu(\hat{x}_{i}[k])) and p(\hat{r}_{i}[k]|\theta,\mu(\hat{r}_{i}[k])), respectively. We then see from Eq. (25) that y(\mu) is a random vector whose pmf is denoted as p(y(\mu)|\theta,\mu). Similarly, the pmf of y(\mathcal{U}_{i}^{x}) (resp., y(\mathcal{U}_{i}^{r})) is denoted as p(y(\mathcal{U}_{i}^{x})|\theta,\mu) (resp., p(y(\mathcal{U}_{i}^{r})|\theta,\mu)). Given t_{1},t_{2}\in\mathbb{Z}_{\geq 1} with t_{2}\geq t_{1}, we make the following assumption on the measurements \hat{x}_{i}[k] and \hat{r}_{i}[k].

Assumption 5.1.

For any i\in\mathcal{V} and for any k_{1},k_{2}\in\{t_{1},\dots,t_{2}\} (k_{1}\neq k_{2}), \hat{x}_{i}[k_{1}], \hat{x}_{i}[k_{2}], \hat{r}_{i}[k_{1}] and \hat{r}_{i}[k_{2}] are independent of each other. Moreover, for any i,j\in\mathcal{V} (i\neq j) and for any k_{1},k_{2}\in\{t_{1},\dots,t_{2}\}, \hat{x}_{i}[k_{1}] and \hat{x}_{j}[k_{2}] are independent, and \hat{x}_{i}[k_{1}] and \hat{r}_{j}[k_{2}] are independent.

The above assumption ensures that measurements from different nodes or from different time steps are independent, and that the measurements of x_{i}[k] and r_{i}[k] are also independent. It then follows from Eq. (25) that the pmf of y(\mu) can be written as

p(y(\mu)|\theta,\mu)=\prod_{i\in\mathcal{U}_{x}}p(y(\mathcal{U}_{i}^{x})|\theta,\mu)\cdot\prod_{j\in\mathcal{U}_{r}}p(y(\mathcal{U}_{j}^{r})|\theta,\mu), (26)

where we can further write p(y(\mathcal{U}_{i}^{x})|\theta,\mu)=\prod_{k\in\mathcal{U}_{i}^{x}}p(\hat{x}_{i}[k]|\theta,\mu(\hat{x}_{i}[k])) for all i\in\mathcal{U}_{x}, and p(y(\mathcal{U}_{j}^{r})|\theta,\mu)=\prod_{k\in\mathcal{U}_{j}^{r}}p(\hat{r}_{j}[k]|\theta,\mu(\hat{r}_{j}[k])) for all j\in\mathcal{U}_{r}.
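A practical consequence of Eq. (26) is that the log-likelihood of a collected measurement vector decomposes into a sum of per-measurement terms. The small Python sketch below illustrates this evaluation; the dictionary-based interface and the log_pmf callback are our own illustrative assumptions, not an interface from the paper.

```python
def log_likelihood(y, mu, theta, log_pmf):
    """Evaluate ln p(y(mu) | theta, mu) via the factorization in Eq. (26).

    y       : dict mapping a measurement key (i, k, lam), lam in {'x', 'r'},
              to its observed value (only selected measurements appear)
    mu      : dict mapping each measurement key to mu(x_hat_i[k]) or mu(r_hat_i[k])
    theta   : (beta, delta)
    log_pmf : callback returning ln p(lambda_hat_i[k] | theta, mu(.)) for one
              measurement -- a stand-in for model pmfs such as Eqs. (34)-(35)."""
    # Under Assumption 5.1 the joint pmf factorizes, so the log-likelihood
    # is the sum of the individual log-pmfs of the selected measurements.
    return sum(log_pmf(value, key, theta, mu[key])
               for key, value in y.items() if mu[key] > 0)
```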

In order to quantify the performance (e.g., precision) of estimators of \theta based on \mu, we use the Bayesian Cramér-Rao Lower Bound (BCRLB) (e.g., [33]) associated with \mu. In the following, we introduce the BCRLB, and explain why we choose it as a performance metric. First, given any measurement selection \mu\in\mathcal{M}, let F_{\theta}(\mu) be the corresponding Fisher information matrix defined as

F_{\theta}(\mu)\triangleq-\mathbb{E}\begin{bmatrix}\frac{\partial^{2}\ln p(y(\mu)|\theta,\mu)}{\partial\beta^{2}}&\frac{\partial^{2}\ln p(y(\mu)|\theta,\mu)}{\partial\beta\partial\delta}\\ \frac{\partial^{2}\ln p(y(\mu)|\theta,\mu)}{\partial\delta\partial\beta}&\frac{\partial^{2}\ln p(y(\mu)|\theta,\mu)}{\partial\delta^{2}}\end{bmatrix} (27)

with the expectation \mathbb{E}[\cdot] taken with respect to p(y(\mu)|\theta,\mu). Under Assumption 5.1 and some regularity conditions on the pmfs of \hat{x}_{i}[k] and \hat{r}_{i}[k], Eq. (27) can be written as (e.g., [13]):

F_{\theta}(\mu)=\sum_{\lambda\in\{x,r\}}\sum_{i\in\mathcal{U}_{\lambda}}\sum_{k\in\mathcal{U}_{i}^{\lambda}}\mathbb{E}\Big[\frac{\partial\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))}{\partial\theta}\big(\frac{\partial\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))}{\partial\theta}\big)^{T}\Big]. (28)

Consider any estimator \hat{\theta}(\mu) of \theta based on a measurement selection \mu\in\mathcal{M}, and assume that we have a prior pdf of \theta=[\beta\ \delta]^{T}, denoted as p(\theta). Under some regularity conditions on the pmfs of \hat{x}_{i}[k] and \hat{r}_{i}[k], and on p(\theta), we have (e.g., [32, 33]):

R_{\hat{\theta}(\mu)}=\mathbb{E}[(\hat{\theta}(\mu)-\theta)(\hat{\theta}(\mu)-\theta)^{T}]\succeq\bar{C}(\mu), (29)

where R_{\hat{\theta}(\mu)}\in\mathbb{R}^{2\times 2} is the error covariance of the estimator \hat{\theta}(\mu), the expectation \mathbb{E}[\cdot] is taken with respect to p(y(\mu)|\theta,\mu)p(\theta), and \bar{C}(\mu)\in\mathbb{R}^{2\times 2} is the BCRLB associated with the measurement selection \mu. The BCRLB is defined as (e.g., [32, 33])

\bar{C}(\mu)\triangleq(\mathbb{E}_{\theta}[F_{\theta}(\mu)]+F_{p})^{-1}, (30)

where \mathbb{E}_{\theta}[\cdot] denotes the expectation taken with respect to p(\theta), F_{\theta}(\mu) is given by Eq. (27), and F_{p}\in\mathbb{R}^{2\times 2} encodes the prior knowledge of \theta as

F_{p}=-\mathbb{E}_{\theta}\begin{bmatrix}\frac{\partial^{2}\ln p(\theta)}{\partial\beta^{2}}&\frac{\partial^{2}\ln p(\theta)}{\partial\beta\partial\delta}\\ \frac{\partial^{2}\ln p(\theta)}{\partial\delta\partial\beta}&\frac{\partial^{2}\ln p(\theta)}{\partial\delta^{2}}\end{bmatrix}=\mathbb{E}_{\theta}\Big[\frac{\partial\ln p(\theta)}{\partial\theta}\big(\frac{\partial\ln p(\theta)}{\partial\theta}\big)^{T}\Big]\succeq\mathbf{0}, (31)

where the second equality holds under some regularity conditions on p(\theta) [32].

Thus, the above arguments motivate us to consider (functions of) \bar{C}(\cdot) as optimization metrics in the measurement selection problem studied in this section, in order to characterize the estimation performance corresponding to a measurement selection \mu\in\mathcal{M}. In particular, we will consider \operatorname{tr}(\bar{C}(\cdot)) and \ln\det(\bar{C}(\cdot)), which are widely used criteria in parameter estimation (e.g., [12]), and are also known as the Bayesian A-optimality and D-optimality criteria, respectively, in the context of experimental design (e.g., [27]). First, considering the optimization metric \operatorname{tr}(\bar{C}(\cdot)), we see from the above arguments that (29) directly implies \operatorname{tr}(R_{\hat{\theta}(\mu)})\geq\operatorname{tr}(\bar{C}(\mu)) for all estimators \hat{\theta}(\mu) of \theta and for all \mu\in\mathcal{M}. Therefore, a measurement selection \mu^{\star} that minimizes \operatorname{tr}(\bar{C}(\mu)) potentially yields a lower value of \operatorname{tr}(R_{\hat{\theta}(\mu)}) for an estimator \hat{\theta}(\mu) of \theta. Furthermore, there may exist an estimator \hat{\theta}(\mu) that achieves the BCRLB (e.g., [32]), i.e., \operatorname{tr}(\bar{C}(\mu)) provides the minimum value of \operatorname{tr}(R_{\hat{\theta}(\mu)}) that can possibly be achieved by any estimator \hat{\theta}(\mu) of \theta, given a measurement selection \mu. Similar arguments hold for \ln\det(\bar{C}(\cdot)). To proceed, denoting

f_{a}(\mu)\triangleq\operatorname{tr}(\bar{C}(\mu))\ \text{and}\ f_{d}(\mu)\triangleq\ln\det(\bar{C}(\mu))\ \forall\mu\in\mathcal{M}, (32)

we define the Parameter Estimation Measurement Selection (PEMS) problem.

Problem 5.2.

Consider a discrete-time SIR model given by Eq. (1) with a directed graph \mathcal{G}=\{\mathcal{V},\mathcal{E}\}, a weight matrix A\in\mathbb{R}^{n\times n}, a sampling parameter h\in\mathbb{R}_{\geq 0}, and an initial condition l=[(s[0])^{T}\ (x[0])^{T}\ (r[0])^{T}]^{T}. Moreover, consider time steps t_{1},t_{2}\in\mathbb{Z}_{\geq 1} with t_{2}\geq t_{1}; a set \mathcal{C}_{k,i}=\{\zeta c_{k,i}:\zeta\in(\{0\}\cup[\zeta_{i}])\} with c_{k,i}\in\mathbb{R}_{\geq 0} and \zeta_{i}\in\mathbb{Z}_{\geq 1}, for all i\in\mathcal{V} and for all k\in\{t_{1},\dots,t_{2}\}; a set \mathcal{B}_{k,i}=\{\eta b_{k,i}:\eta\in(\{0\}\cup[\eta_{i}])\} with b_{k,i}\in\mathbb{R}_{\geq 0} and \eta_{i}\in\mathbb{Z}_{\geq 1}, for all i\in\mathcal{V} and for all k\in\{t_{1},\dots,t_{2}\}; a budget B\in\mathbb{R}_{\geq 0}; and a prior pdf p(\theta). Suppose \hat{x}_{i}[k] (resp., \hat{r}_{i}[k]) is given by a pmf p(\hat{x}_{i}[k]|\theta,\varphi_{k,i}) (resp., p(\hat{r}_{i}[k]|\theta,\omega_{k,i})), where \varphi_{k,i}\in\mathcal{C}_{k,i} (resp., \omega_{k,i}\in\mathcal{B}_{k,i}). The PEMS problem is to find a measurement selection \mu that solves

\min_{\mu\in\mathcal{M}} f(\mu) \quad \text{s.t.} \quad \sum_{\hat{x}_{i}[k]\in\mathcal{U}_{t_{1}:t_{2}}}c_{k,i}\mu(\hat{x}_{i}[k])+\sum_{\hat{r}_{i}[k]\in\mathcal{U}_{t_{1}:t_{2}}}b_{k,i}\mu(\hat{r}_{i}[k])\leq B, (33)

where \mathcal{M} is defined in Eq. (22), f(\cdot) can be either f_{a}(\cdot) or f_{d}(\cdot) with f_{a}(\cdot) and f_{d}(\cdot) defined in Eq. (32), \mathcal{U}_{t_{1}:t_{2}} is defined in Eq. (21), and \bar{C}(\mu) is given by Eq. (30).

Note that F_{p}\succeq\mathbf{0} from (31), and f_{a}(\mathbf{0})=\operatorname{tr}(\bar{C}(\mathbf{0}))=\operatorname{tr}((F_{p})^{-1}) and f_{d}(\mathbf{0})=\ln\det(\bar{C}(\mathbf{0}))=\ln\det((F_{p})^{-1}) from Eq. (30). We further assume that F_{p}\succ\mathbf{0} in the sequel, which implies f(\mu)>0 for all \mu\in\mathcal{M}.
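To make the objectives in Eq. (32) concrete, the sketch below evaluates f_{a}(\mu) and f_{d}(\mu) from Eq. (30), approximating \mathbb{E}_{\theta}[F_{\theta}(\mu)] by a Monte Carlo average over draws from p(\theta); this averaging step is our own illustrative choice, not a procedure specified in the paper.

```python
import numpy as np

def pems_objectives(fisher_at, theta_samples, F_p):
    """Evaluate the PEMS objectives of Eq. (32) for one measurement selection.

    fisher_at     : function theta -> 2x2 Fisher information matrix F_theta(mu),
                    e.g. assembled from the per-measurement terms in Eq. (28)
    theta_samples : draws theta ~ p(theta) approximating E_theta[F_theta(mu)]
    F_p           : 2x2 prior information matrix of Eq. (31), assumed F_p > 0
    Returns (f_a, f_d) = (tr(C_bar), ln det(C_bar)) with C_bar from Eq. (30)."""
    F_avg = np.mean([fisher_at(theta) for theta in theta_samples], axis=0)
    C_bar = np.linalg.inv(F_avg + F_p)      # BCRLB, Eq. (30)
    f_a = float(np.trace(C_bar))            # Bayesian A-optimality criterion
    _, logdet = np.linalg.slogdet(C_bar)    # C_bar > 0, so the sign is +1
    return f_a, logdet                      # f_d = ln det(C_bar)
```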

5.2 Solving the PEMS Problem

In this section, we consider a measurement model with specific pmfs of \hat{x}_{i}[k] and \hat{r}_{i}[k] (e.g., [4] and [11]). Nonetheless, our analysis can potentially be extended to other measurement models.

5.2.1 Pmfs of Measurements \hat{x}_{i}[k] and \hat{r}_{i}[k]

Consider any i𝒱i\in\mathcal{V} and any k{t1,,t2}k\in\{t_{1},\dots,t_{2}\}. Assume that the total population of node ii is fixed over time and is denoted as Ni1N_{i}\in\mathbb{Z}_{\geq 1}. Given any measurement selection μ\mu\in\mathcal{M} with \mathcal{M} defined in Eq. (22), we recall from Section 5.1 that μ(x^i[k])ck,i\mu(\hat{x}_{i}[k])c_{k,i} can be viewed as the cost of performing virus tests on μ(x^i[k])Nix\mu(\hat{x}_{i}[k])N_{i}^{x} randomly and uniformly sampled individuals in the population of node i𝒱i\in\mathcal{V}, where μ(x^i[k])({0}[ζi])\mu(\hat{x}_{i}[k])\in(\{0\}\cup[\zeta_{i}]) (with ζi1\zeta_{i}\in\mathbb{Z}_{\geq 1}), ck,i0c_{k,i}\in\mathbb{R}_{\geq 0} and Nix1N_{i}^{x}\in\mathbb{Z}_{\geq 1} with ζiNixNi\zeta_{i}N_{i}^{x}\leq N_{i}. Note that xi[k]x_{i}[k] is the proportion of population at node ii and at time kk that is infected, and xi[k][0,1)x_{i}[k]\in[0,1) under Assumptions 3.1-3.2 as shown by Lemma 3.5. Thus, a randomly and uniformly sampled individual in the population at node ii and at time kk will be an infected individual (at time kk) with probability xi[k]x_{i}[k], and will be a non-infected (i.e., susceptible or recovered) individual with probability 1xi[k]1-x_{i}[k]. Supposing the tests are accurate,222Here, “accurate” means that an infected individual (at time kk) will be tested positive with probability one, and an individual that is not infected will be tested negative with probability one. we see from the above arguments that the obtained number of individuals that are tested positive, i.e., Nix^i[k]N_{i}\hat{x}_{i}[k], is a binomial random variable with parameters Nixμ(x^i[k])1N_{i}^{x}\mu(\hat{x}_{i}[k])\in\mathbb{Z}_{\geq 1} and xi[k][0,1)x_{i}[k]\in[0,1). Thus, for any i𝒱i\in\mathcal{V} and for any k{t1,,t2}k\in\{t_{1},\dots,t_{2}\}, the pmf of x^i[k]\hat{x}_{i}[k] is

p(\hat{x}_{i}[k]=x|\theta,\mu(\hat{x}_{i}[k]))={N_{i}^{x}\mu(\hat{x}_{i}[k])\choose N_{i}x}(x_{i}[k])^{N_{i}x}(1-x_{i}[k])^{N_{i}^{x}\mu(\hat{x}_{i}[k])-N_{i}x}, \quad (34)

where $x\in\{0,\frac{1}{N_{i}},\frac{2}{N_{i}},\dots,\frac{N_{i}^{x}\mu(\hat{x}_{i}[k])}{N_{i}}\}$ with $x\in[0,1]$ since $N_{i}^{x}\zeta_{i}\leq N_{i}$. Note that we do not define the pmf of measurement $\hat{x}_{i}[k]$ when $N_{i}^{x}\mu(\hat{x}_{i}[k])=0$, i.e., when $\mu(\hat{x}_{i}[k])=0$, since $\mu(\hat{x}_{i}[k])=0$ indicates that no measurement is collected for state $x_{i}[k]$. Also note that when $x_{i}[k]=0$, the pmf of $\hat{x}_{i}[k]$ given in Eq. (34) reduces to $p(\hat{x}_{i}[k]=0|\theta,\mu(\hat{x}_{i}[k]))=1$. Moreover, since the weight matrix $A\in\mathbb{R}^{n\times n}$ and the sampling parameter $h\in\mathbb{R}_{\geq 0}$ are assumed to be given, we see that given $\theta=[\beta\ \delta]^{T}$ and initial condition $l=[(s[0])^{T}\ (x[0])^{T}\ (r[0])^{T}]^{T}$, $x_{i}[k]$ can be obtained using Eq. (1b) for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$, where we can view $x_{i}[k]$ as a function of the unknown parameter $\theta$. In other words, given $l$, $\theta$, $\mu(\hat{x}_{i}[k])$, $N_{i}^{x}$ and $N_{i}$, one can obtain the right-hand side of Eq. (34). Again, we only explicitly express the dependency of the pmf of $\hat{x}_{i}[k]$ on $\theta$ and $\mu(\hat{x}_{i}[k])$ in Eq. (34). Following similar arguments to those above, we assume that for any $i\in\mathcal{V}$ and for any $k\in\{t_{1},\dots,t_{2}\}$, measurement $\hat{r}_{i}[k]$ has the following pmf:

p(\hat{r}_{i}[k]=r|\theta,\mu(\hat{r}_{i}[k]))={N_{i}^{r}\mu(\hat{r}_{i}[k])\choose N_{i}r}(r_{i}[k])^{N_{i}r}(1-r_{i}[k])^{N_{i}^{r}\mu(\hat{r}_{i}[k])-N_{i}r}, \quad (35)

where $r\in\{0,\frac{1}{N_{i}},\frac{2}{N_{i}},\dots,\frac{N_{i}^{r}\mu(\hat{r}_{i}[k])}{N_{i}}\}$ with $r\in[0,1]$, $\mu(\hat{r}_{i}[k])\in\{0,\dots,\eta_{i}\}$, $N_{i}^{r}\in\mathbb{Z}_{\geq 1}$, and $N_{i}^{r}\mu(\hat{r}_{i}[k])\leq N_{i}$. Similarly, we note that the pmf of $\hat{r}_{i}[k]$ given in Eq. (35) reduces to $p(\hat{r}_{i}[k]=0|\theta,\mu(\hat{r}_{i}[k]))=1$ when $r_{i}[k]=0$. Considering any measurement selection $\mu\in\mathcal{M}$ and any measurement $\hat{\lambda}_{i}[k]\in\mathcal{U}_{t_{1}:t_{2}}$, where $\lambda\in\{x,r\}$ and $\mathcal{U}_{t_{1}:t_{2}}$ is defined in Eq. (21), we have the following:

\begin{aligned}\mathbb{E}\Big{[}\frac{\partial\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))}{\partial\theta}\big{(}\frac{\partial\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))}{\partial\theta}\big{)}^{T}\Big{]}&=\mathbb{E}\Big{[}\big{(}\frac{\partial\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))}{\partial{\lambda}_{i}[k]}\big{)}^{2}\cdot\frac{\partial{\lambda}_{i}[k]}{\partial\theta}\big{(}\frac{\partial{\lambda}_{i}[k]}{\partial\theta}\big{)}^{T}\Big{]} \quad (36)\\ &=\frac{N_{i}^{\lambda}\mu(\hat{\lambda}_{i}[k])}{{\lambda}_{i}[k](1-{\lambda}_{i}[k])}\cdot\frac{\partial\lambda_{i}[k]}{\partial\theta}\big{(}\frac{\partial\lambda_{i}[k]}{\partial\theta}\big{)}^{T}, \quad (37)\end{aligned}

where the expectation $\mathbb{E}[\cdot]$ is taken with respect to $p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))$, and ${\lambda}_{i}[k]\in[0,1)$. To obtain (36), we note the form of $\ln p(\hat{\lambda}_{i}[k]|\theta,\mu(\hat{\lambda}_{i}[k]))$ in Eq. (34) (or Eq. (35)), and use the chain rule. Moreover, one can obtain (37) from the fact that $\hat{\lambda}_{i}[k]$ is a binomial random variable. Noting that the pmf of $\hat{\lambda}_{i}[k]$ reduces to $p(\hat{\lambda}_{i}[k]=0|\theta,\mu(\hat{\lambda}_{i}[k]))=1$ if ${\lambda}_{i}[k]=0$ as argued above, we let the right-hand side of (37) be zero if ${\lambda}_{i}[k]=0$.
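To make the measurement model concrete, the following Python sketch evaluates the pmf in Eq. (34) and the rank-one Fisher information contribution in (37) for a single node and time step. It is a minimal illustration under our own (hypothetical) naming conventions, with numpy and scipy assumed available; it is not code from the paper.

```python
import numpy as np
from scipy.stats import binom

def measurement_pmf(x_frac, n_tests, N_i, x_ik):
    """Pmf of the measurement x_hat_i[k] (Eq. (34)).

    x_frac : observed fraction, a multiple of 1/N_i;
    n_tests: N_i^x * mu(x_hat_i[k]), the number of tests performed;
    N_i    : total population at node i;
    x_ik   : true infected proportion x_i[k] (a function of theta).
    """
    positives = int(round(N_i * x_frac))  # number of positive tests
    return binom.pmf(positives, n_tests, x_ik)

def fisher_contribution(n_tests, lam_ik, dlam_dtheta):
    """Rank-one Fisher information term from (37).

    lam_ik      : x_i[k] or r_i[k], lying in [0, 1);
    dlam_dtheta : gradient [d lam / d beta, d lam / d delta].
    """
    if lam_ik == 0.0:  # degenerate pmf; contribution defined as zero
        return np.zeros((2, 2))
    g = np.asarray(dlam_dtheta, dtype=float).reshape(2, 1)
    return (n_tests / (lam_ik * (1.0 - lam_ik))) * (g @ g.T)
```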

5.2.2 Complexity of the PEMS Problem

Under the measurement model described above, we show that the PEMS problem is also NP-hard, i.e., there exist instances of the PEMS problem that cannot be solved optimally by any polynomial-time algorithm (if P $\neq$ NP). The proof of the following result is included in Section 7.3 in the Appendix.

Theorem 5.3.

The PEMS problem is NP-hard.

5.2.3 An Equivalent Formulation for the PEMS Problem

Theorem 5.3 motivates us to consider approximation algorithms for solving the PEMS problem. To begin with, we note that the objective function in the PEMS problem can be viewed as a function defined over an integer lattice. We then have $f_{a}:\mathcal{M}\to\mathbb{R}_{\geq 0}$ and $f_{d}:\mathcal{M}\to\mathbb{R}_{\geq 0}$, where $\mathcal{M}$ is defined in Eq. (22). First, considering $f_{a}:\mathcal{M}\to\mathbb{R}_{\geq 0}$, we will define a set function $f_{Pa}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$, where $\bar{\mathcal{M}}$ is a set constructed as

\bar{\mathcal{M}}\triangleq\{(\hat{x}_{i}[k],l_{1}):i\in\mathcal{V},k\in\{t_{1},\dots,t_{2}\},l_{1}\in[\zeta_{i}]\}\cup\{(\hat{r}_{i}[k],l_{2}):i\in\mathcal{V},k\in\{t_{1},\dots,t_{2}\},l_{2}\in[\eta_{i}]\}. \quad (38)

In other words, for any $i\in\mathcal{V}$ and for any $k\in\{t_{1},\dots,t_{2}\}$, we associate elements $(\hat{x}_{i}[k],1),\dots,(\hat{x}_{i}[k],\zeta_{i})$ (resp., $(\hat{r}_{i}[k],1),\dots,(\hat{r}_{i}[k],\eta_{i})$) in set $\bar{\mathcal{M}}$ with measurement $\hat{x}_{i}[k]$ (resp., $\hat{r}_{i}[k]$). The set function $f_{Pa}(\cdot)$ is then defined as

f_{Pa}(\mathcal{Y})\triangleq f_{a}(\mathbf{0})-f_{a}(\mu_{\mathcal{Y}})=\operatorname{tr}(\bar{C}(\mathbf{0}))-\operatorname{tr}(\bar{C}(\mu_{\mathcal{Y}}))\quad\forall\mathcal{Y}\subseteq\bar{\mathcal{M}}, \quad (39)

where for any $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, we define $\mu_{\mathcal{Y}}\in\mathcal{M}$ such that $\mu_{\mathcal{Y}}(\hat{x}_{i}[k])=|\{(\hat{x}_{i}[k],l_{1}):(\hat{x}_{i}[k],l_{1})\in\mathcal{Y}\}|$ and $\mu_{\mathcal{Y}}(\hat{r}_{i}[k])=|\{(\hat{r}_{i}[k],l_{2}):(\hat{r}_{i}[k],l_{2})\in\mathcal{Y}\}|$ for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$. In other words, $\mu_{\mathcal{Y}}(\hat{x}_{i}[k])$ (resp., $\mu_{\mathcal{Y}}(\hat{r}_{i}[k])$) is set to be the number of elements in $\mathcal{Y}$ that correspond to the measurement $\hat{x}_{i}[k]$ (resp., $\hat{r}_{i}[k]$). Also note that $f_{Pa}(\emptyset)=0$. Following the arguments leading to (37), we define

H_{y}\triangleq\begin{cases}\mathbb{E}_{\theta}\big{[}\frac{N_{i}^{x}}{x_{i}[k](1-x_{i}[k])}\frac{\partial x_{i}[k]}{\partial\theta}\big{(}\frac{\partial x_{i}[k]}{\partial\theta}\big{)}^{T}\big{]}&\text{if}\ y=(\hat{x}_{i}[k],l_{1})\\ \mathbb{E}_{\theta}\big{[}\frac{N_{i}^{r}}{r_{i}[k](1-r_{i}[k])}\frac{\partial r_{i}[k]}{\partial\theta}\big{(}\frac{\partial r_{i}[k]}{\partial\theta}\big{)}^{T}\big{]}&\text{if}\ y=(\hat{r}_{i}[k],l_{2})\end{cases}\quad\forall y\in\bar{\mathcal{M}}, \quad (40)

where $x_{i}[k],r_{i}[k]\in[0,1)$, $i\in\mathcal{V}$, $k\in\{t_{1},\dots,t_{2}\}$, $l_{1}\in[\zeta_{i}]$, $l_{2}\in[\eta_{i}]$, and the expectation $\mathbb{E}_{\theta}[\cdot]$ is taken with respect to the prior pdf $p(\theta)$. Given any $\theta=[\beta\ \delta]^{T}$, we see from the arguments for (37) that $\frac{N_{i}^{x}}{x_{i}[k](1-x_{i}[k])}\frac{\partial x_{i}[k]}{\partial\theta}\big{(}\frac{\partial x_{i}[k]}{\partial\theta}\big{)}^{T}\succeq\mathbf{0}$. Moreover, one can show that $\mathbb{E}_{\theta}\big{[}\frac{N_{i}^{x}}{x_{i}[k](1-x_{i}[k])}\frac{\partial x_{i}[k]}{\partial\theta}\big{(}\frac{\partial x_{i}[k]}{\partial\theta}\big{)}^{T}\big{]}\succeq\mathbf{0}$. Similarly, one can obtain $\mathbb{E}_{\theta}\big{[}\frac{N_{i}^{r}}{r_{i}[k](1-r_{i}[k])}\frac{\partial r_{i}[k]}{\partial\theta}\big{(}\frac{\partial r_{i}[k]}{\partial\theta}\big{)}^{T}\big{]}\succeq\mathbf{0}$, which implies $H_{y}\succeq\mathbf{0}$ for all $y\in\bar{\mathcal{M}}$. Now, suppose the pmfs of $\hat{x}_{i}[k]$ and $\hat{r}_{i}[k]$ are given by Eq. (34) and Eq. (35), respectively. Recall from Eq. (30) that $\operatorname{tr}(\bar{C}(\mu))=\operatorname{tr}((\mathbb{E}_{\theta}[F_{\theta}(\mu)]+F_{p})^{-1})$ for all $\mu\in\mathcal{M}$, where $F_{p}$ and $F_{\theta}(\mu)$ are given by (31) and (28), respectively. Supposing Assumption 5.1 holds, for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, one can first express $F_{\theta}(\mu_{\mathcal{Y}})$ using (37), and then use Eq. (40) to obtain $\mathbb{E}_{\theta}[F_{\theta}(\mu_{\mathcal{Y}})]=\sum_{y\in\mathcal{Y}}H_{y}\triangleq H(\mathcal{Y})$, where $\mu_{\mathcal{Y}}$ is defined above given $\mathcal{Y}\subseteq\bar{\mathcal{M}}$. Putting the above arguments together, we have from Eq. (39) the following:

f_{Pa}(\mathcal{Y})=\operatorname{tr}\big{(}(F_{p})^{-1}\big{)}-\operatorname{tr}\big{(}(F_{p}+H(\mathcal{Y}))^{-1}\big{)}\quad\forall\mathcal{Y}\subseteq\bar{\mathcal{M}}. \quad (41)

Next, let the cost of $(\hat{x}_{i}[k],l_{1})$ be $c_{k,i}$, denoted as $c(\hat{x}_{i}[k],l_{1})$, for all $(\hat{x}_{i}[k],l_{1})\in\bar{\mathcal{M}}$, and let the cost of $(\hat{r}_{i}[k],l_{2})$ be $b_{k,i}$, denoted as $c(\hat{r}_{i}[k],l_{2})$, for all $(\hat{r}_{i}[k],l_{2})\in\bar{\mathcal{M}}$, where $c_{k,i}\in\mathbb{R}_{>0}$ and $b_{k,i}\in\mathbb{R}_{>0}$ are given in the instance of the PEMS problem. Setting the cost structure of the elements in $\bar{\mathcal{M}}$ in this way, we establish an equivalence between the cost of a subset $\mathcal{Y}\subseteq\bar{\mathcal{M}}$ and the cost of $\mu_{\mathcal{Y}}\in\mathcal{M}$, where $\mu_{\mathcal{Y}}$ is defined above. Similarly, considering the objective function $f_{d}:\mathcal{M}\to\mathbb{R}_{\geq 0}$ in the PEMS problem, we define a set function $f_{Pd}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$ as

f_{Pd}(\mathcal{Y})\triangleq f_{d}(\mathbf{0})-f_{d}(\mu_{\mathcal{Y}})=\ln\det(F_{p}+H(\mathcal{Y}))-\ln\det(F_{p})\quad\forall\mathcal{Y}\subseteq\bar{\mathcal{M}}, \quad (42)

where we define $\mu_{\mathcal{Y}}\in\mathcal{M}$ such that $\mu_{\mathcal{Y}}(\hat{x}_{i}[k])=|\{(\hat{x}_{i}[k],l_{1}):(\hat{x}_{i}[k],l_{1})\in\mathcal{Y}\}|$ and $\mu_{\mathcal{Y}}(\hat{r}_{i}[k])=|\{(\hat{r}_{i}[k],l_{2}):(\hat{r}_{i}[k],l_{2})\in\mathcal{Y}\}|$ for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$. Note that given an instance of the PEMS problem in Problem 5.2, we can construct the set $\bar{\mathcal{M}}$ with the associated costs of the elements in $\bar{\mathcal{M}}$ in $O(n(t_{2}-t_{1}+1)(\zeta+\eta))$ time, where $n$ is the number of nodes in graph $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$, and $\zeta,\eta\in\mathbb{Z}_{\geq 1}$ with $\zeta_{i}\leq\zeta$ and $\eta_{i}\leq\eta$ for all $i\in\mathcal{V}$. Assuming that $\zeta$ and $\eta$ are (fixed) constants, the construction of the set $\bar{\mathcal{M}}$ with the associated costs takes $O(n(t_{2}-t_{1}+1))$ time, which is polynomial in the parameters of the PEMS problem (Problem 5.2). Based on the above arguments, we further consider the following problem:

\begin{split}&\mathop{\max}_{\mathcal{Y}\subseteq\bar{\mathcal{M}}}f_{P}(\mathcal{Y})\\ s.t.&\ c(\mathcal{Y})\leq B,\end{split} \quad (P)

where $f_{P}(\cdot)$ can be either of $f_{Pa}(\cdot)$ or $f_{Pd}(\cdot)$ with $f_{Pa}(\cdot)$ and $f_{Pd}(\cdot)$ given in (41) and (42), respectively, and $c(\mathcal{Y})\triangleq\sum_{y\in\mathcal{Y}}c(y)$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$. By the way we construct $f_{P}(\cdot)$ and the costs of elements in $\bar{\mathcal{M}}$, one can verify that $\mathcal{Y}_{a}^{\star}\subseteq\bar{\mathcal{M}}$ (resp., $\mathcal{Y}_{d}^{\star}\subseteq\bar{\mathcal{M}}$) is an optimal solution to Problem (P) with $f_{P}(\cdot)=f_{Pa}(\cdot)$ (resp., $f_{P}(\cdot)=f_{Pd}(\cdot)$) if and only if $\mu_{\mathcal{Y}_{a}^{\star}}$ (resp., $\mu_{\mathcal{Y}_{d}^{\star}}$) defined above is an optimal solution to (33) in Problem 5.2 with $f(\cdot)=f_{a}(\cdot)$ (resp., $f(\cdot)=f_{d}(\cdot)$). Thus, given a PEMS instance, we can first construct $\bar{\mathcal{M}}$ with the associated cost for each element in $\bar{\mathcal{M}}$, and then solve Problem (P).
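To illustrate the transformation, here is a minimal Python sketch (our notation, not the authors' code) that builds the ground set $\bar{\mathcal{M}}$ from Eq. (38) with the cost structure above, and evaluates $f_{Pa}$ and $f_{Pd}$ via (41) and (42). It assumes the per-element matrices $H_{y}$ from (40) have already been approximated, e.g., by numerical integration.

```python
import numpy as np

def build_ground_set(nodes, times, zeta, eta, c, b):
    """Ground set M_bar from Eq. (38) with the costs assigned in the text.

    zeta[i], eta[i]: copy counts; c[(k, i)], b[(k, i)]: per-copy costs.
    Each element is a tuple (kind, i, k, copy_index).
    """
    ground, cost = [], {}
    for i in nodes:
        for k in times:
            for l1 in range(1, zeta[i] + 1):
                y = ('x', i, k, l1)
                ground.append(y); cost[y] = c[(k, i)]
            for l2 in range(1, eta[i] + 1):
                y = ('r', i, k, l2)
                ground.append(y); cost[y] = b[(k, i)]
    return ground, cost

def f_Pa(Y, H_of, F_p):
    """f_Pa(Y) = tr(F_p^{-1}) - tr((F_p + H(Y))^{-1}), Eq. (41)."""
    H = sum((H_of[y] for y in Y), np.zeros_like(F_p))
    return np.trace(np.linalg.inv(F_p)) - np.trace(np.linalg.inv(F_p + H))

def f_Pd(Y, H_of, F_p):
    """f_Pd(Y) = ln det(F_p + H(Y)) - ln det(F_p), Eq. (42)."""
    H = sum((H_of[y] for y in Y), np.zeros_like(F_p))
    return np.linalg.slogdet(F_p + H)[1] - np.linalg.slogdet(F_p)[1]
```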

5.3 Greedy Algorithm for the PEMS Problem

Note that Problem (P) can be viewed as a problem of maximizing a set function subject to a knapsack constraint, and greedy algorithms have been proposed to solve this problem with performance guarantees when the objective function is monotone nondecreasing and submodular (e.g., [16] and [29]). (A set function $g:2^{\mathcal{V}}\to\mathbb{R}$, where $\mathcal{V}=[n]$ is the ground set, is said to be monotone nondecreasing if $g(\mathcal{A})\leq g(\mathcal{B})$ for all $\mathcal{A}\subseteq\mathcal{B}\subseteq\mathcal{V}$, and submodular if $g(\{y\}\cup\mathcal{A})-g(\mathcal{A})\geq g(\{y\}\cup\mathcal{B})-g(\mathcal{B})$ for all $\mathcal{A}\subseteq\mathcal{B}\subseteq\mathcal{V}$ and for all $y\in\mathcal{V}\setminus\mathcal{B}$.) Before we formally introduce the greedy algorithm for the PEMS problem, we first note from (40)-(42) that given a prior pdf of $\theta$ and any $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, one has to take the expectation $\mathbb{E}_{\theta}[\cdot]$ in order to obtain the value of $f_{P}(\mathcal{Y})$. However, it is in general intractable to calculate the integral corresponding to $\mathbb{E}_{\theta}[\cdot]$ explicitly. Hence, one may alternatively evaluate $f_{P}(\mathcal{Y})$ using numerical integration with respect to $\theta=[\beta\ \delta]^{T}$ (e.g., [28]). Specifically, a typical numerical integration method (e.g., the trapezoid rule) approximates the integral of a function (over an interval) by a weighted sum of function values evaluated at certain points within the integration interval, which incurs an approximation error (see, e.g., [28] for more details). We then see from (40)-(42) that in order to apply such a numerical integration method to $f_{P}(\mathcal{Y})$, one has to obtain the values of $x_{i}[k]$, $r_{i}[k]$, $\frac{\partial x_{i}[k]}{\partial\theta}$, and $\frac{\partial r_{i}[k]}{\partial\theta}$ for a given $\theta$ (within the integration interval), where $i\in\mathcal{V}$ and $t_{1}\leq k\leq t_{2}$ with $t_{1},t_{2}$ given in an instance of the PEMS problem. Recall that the initial conditions $s[0]$, $x[0]$ and $r[0]$ are assumed to be known. We first observe that for any given $\theta$, the values of $x_{i}[k]$ and $r_{i}[k]$ for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$ can be obtained using the recursions in (1) in $O((t_{2}-t_{1}+1)n^{2})$ time. Next, noting that $\frac{\partial x_{i}[k]}{\partial\theta}=[\frac{\partial x_{i}[k]}{\partial\beta}\ \frac{\partial x_{i}[k]}{\partial\delta}]^{T}$ and $\frac{\partial r_{i}[k]}{\partial\theta}=[\frac{\partial r_{i}[k]}{\partial\beta}\ \frac{\partial r_{i}[k]}{\partial\delta}]^{T}$, we take the derivative with respect to $\beta$ on both sides of the equations in (1) and obtain

\begin{split}&\frac{\partial s_{i}[k+1]}{\partial\beta}=\frac{\partial s_{i}[k]}{\partial\beta}-h\big{(}\frac{\partial s_{i}[k]}{\partial\beta}\beta+s_{i}[k]\big{)}\big{(}\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k]\big{)}-hs_{i}[k]\beta\big{(}\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}\frac{\partial x_{j}[k]}{\partial\beta}\big{)},\\ &\frac{\partial x_{i}[k+1]}{\partial\beta}=(1-h\delta)\frac{\partial x_{i}[k]}{\partial\beta}+h\big{(}\frac{\partial s_{i}[k]}{\partial\beta}\beta+s_{i}[k]\big{)}\big{(}\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k]\big{)}+hs_{i}[k]\beta\big{(}\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}\frac{\partial x_{j}[k]}{\partial\beta}\big{)},\\ &\frac{\partial r_{i}[k+1]}{\partial\beta}=\frac{\partial r_{i}[k]}{\partial\beta}+h\delta\frac{\partial x_{i}[k]}{\partial\beta}.\end{split} \quad (43)

Considering any given $\beta$, we can then use the recursion in (1) together with the recursion in (43) to obtain the values of $\frac{\partial x_{i}[k]}{\partial\beta}$ and $\frac{\partial r_{i}[k]}{\partial\beta}$ for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$ in $O((t_{2}-t_{1}+1)n^{2})$ time. Similarly, considering any given $\delta$, one can obtain the values of $\frac{\partial x_{i}[k]}{\partial\delta}$ and $\frac{\partial r_{i}[k]}{\partial\delta}$ for all $i\in\mathcal{V}$ and for all $k\in\{t_{1},\dots,t_{2}\}$ in $O((t_{2}-t_{1}+1)n^{2})$ time.
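For concreteness, the following sketch (our variable names; the states are assumed to be stored as numpy arrays) propagates Eq. (1) jointly with the $\beta$-sensitivity recursion (43); the $\delta$-sensitivities are obtained analogously.

```python
import numpy as np

def sir_beta_sensitivities(A, h, beta, delta, s0, x0, r0, T):
    """Run Eq. (1) and the beta-sensitivity recursion (43) for k = 0..T.

    Returns lists (indexed by k) of x[k], r[k], dx/dbeta, dr/dbeta.
    The initial conditions do not depend on beta, so the sensitivities
    start at zero.
    """
    n = A.shape[0]
    s, x, r = s0.copy(), x0.copy(), r0.copy()
    ds, dx, dr = np.zeros(n), np.zeros(n), np.zeros(n)
    xs, rs, dxs, drs = [x.copy()], [r.copy()], [dx.copy()], [dr.copy()]
    for _ in range(T):
        Ax, Adx = A @ x, A @ dx  # neighborhood sums in Eq. (1) and (43)
        s_new = s - h * beta * s * Ax
        x_new = (1 - h * delta) * x + h * beta * s * Ax
        r_new = r + h * delta * x
        # Eq. (43): each update of Eq. (1) differentiated w.r.t. beta.
        ds_new = ds - h * (ds * beta + s) * Ax - h * s * beta * Adx
        dx_new = (1 - h * delta) * dx + h * (ds * beta + s) * Ax \
                 + h * s * beta * Adx
        dr_new = dr + h * delta * dx
        s, x, r, ds, dx, dr = s_new, x_new, r_new, ds_new, dx_new, dr_new
        xs.append(x.copy()); rs.append(r.copy())
        dxs.append(dx.copy()); drs.append(dr.copy())
    return xs, rs, dxs, drs
```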

Putting the above arguments together and considering the prior pdf of $\theta$, i.e., $p(\theta)$, we see from (40)-(42) that for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, an approximation of $f_{P}(\mathcal{Y})$, denoted as $\hat{f}_{P}(\mathcal{Y})$, can be obtained in $O(n_{I}(t_{2}-t_{1}+1)n^{2})$ time, where $n_{I}\in\mathbb{Z}_{\geq 1}$ is the number of points used for the numerical integration with respect to $\theta$, as described above (we assume that $n_{I}$ is polynomial in the parameters of the PEMS instance). Furthermore, in the sequel, we assume that $\hat{f}_{P}(\cdot)$ satisfies $|\hat{f}_{P}(\mathcal{Y})-f_{P}(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$ (with $\hat{f}_{P}(\emptyset)=0$), where $\varepsilon\in\mathbb{R}_{\geq 0}$. (Here, $\varepsilon$ is related to the approximation error of the numerical integration method, and $\varepsilon$ decreases as $n_{I}$ increases; see, e.g., [28] for more details.) We are now ready to introduce the greedy algorithm, given in Algorithm 2, to solve the PEMS problem, where $\hat{f}_{P}(\cdot)\in\{\hat{f}_{Pa}(\cdot),\hat{f}_{Pd}(\cdot)\}$ and $\hat{f}_{P}(\cdot)$ is the approximation of $f_{P}(\cdot)$ described above. From the definition of Algorithm 2, we see that the number of function calls of $\hat{f}_{P}(\cdot)$ required by the algorithm is $O(|\bar{\mathcal{M}}|^{2})$, and thus the overall time complexity of Algorithm 2 is $O(n_{I}(t_{2}-t_{1}+1)n^{2}|\bar{\mathcal{M}}|^{2})$.

We proceed to analyze the performance of Algorithm 2 when applied to the PEMS problem. First, one can observe that $f_{Pd}(\mathcal{Y})=\ln\det(F_{p}+H(\mathcal{Y}))-\ln\det(F_{p})$ in Problem (P) shares a similar form to the objective in [30]. Thus, using similar arguments to those in [30], one can show that $f_{Pd}(\cdot)$ is monotone nondecreasing and submodular with $f_{Pd}(\emptyset)=0$. Noting the assumption that $|\hat{f}_{Pd}(\mathcal{Y})-f_{Pd}(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, one can show that $y^{\star}$ given by line 6 of Algorithm 2 satisfies $\frac{f_{Pd}(\{y^{\star}\}\cup\mathcal{Y}_{2})-f_{Pd}(\mathcal{Y}_{2})+\varepsilon}{c(y^{\star})}\geq\frac{f_{Pd}(\{y\}\cup\mathcal{Y}_{2})-f_{Pd}(\mathcal{Y}_{2})-\varepsilon}{c(y)}$ for all $y\in\mathcal{C}$. Similarly, one can show that $\mathop{\max}_{y\in\bar{\mathcal{M}}}f_{Pd}(\{y\})\leq f_{Pd}(\mathcal{Y}_{1})+\varepsilon$, where $\mathcal{Y}_{1}$ is given by line 3 in Algorithm 2. One can then use similar arguments to those for Theorem 1 in [16] to obtain the following result; the detailed proof is omitted for conciseness.

Theorem 5.4.

Consider Problem (P) with the objective function $f_{Pd}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$ given by (42). Then Algorithm 2 yields a solution, denoted as $\mathcal{Y}_{d}^{g}$, to Problem (P) that satisfies

f_{Pd}(\mathcal{Y}_{d}^{g})\geq\frac{1}{2}(1-e^{-1})f_{Pd}(\mathcal{Y}_{d}^{\star})-\Big{(}\frac{B}{c_{\mathop{\min}}}+\frac{3}{2}\Big{)}\varepsilon, \quad (44)

where $\mathcal{Y}_{d}^{\star}\subseteq\bar{\mathcal{M}}$ is an optimal solution to Problem (P), $c_{\mathop{\min}}=\mathop{\min}_{y\in\bar{\mathcal{M}}}c(y)$ (we can assume without loss of generality that $c(y)\leq B$ for all $y\in\bar{\mathcal{M}}$), and $\varepsilon\in\mathbb{R}_{\geq 0}$ satisfies $|\hat{f}_{Pd}(\mathcal{Y})-f_{Pd}(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$.

Algorithm 2 Greedy algorithm for PEMS
1: Input: An instance of PEMS transformed into the form in (P)
2: Output: $\mathcal{Y}_{g}$
3: Find $\mathcal{Y}_{1}\in\mathop{\arg\max}_{y\in\bar{\mathcal{M}}}\hat{f}_{P}(y)$
4: Initialize $\mathcal{Y}_{2}=\emptyset$ and $\mathcal{C}=\bar{\mathcal{M}}$
5: while $\mathcal{C}\neq\emptyset$ do
6:     Find $y^{\star}\in\mathop{\arg\max}_{y\in\mathcal{C}}\frac{\hat{f}_{P}(\{y\}\cup\mathcal{Y}_{2})-\hat{f}_{P}(\mathcal{Y}_{2})}{c(y)}$
7:     if $c(y^{\star})+c(\mathcal{Y}_{2})\leq B$ then
8:         $\mathcal{Y}_{2}=\{y^{\star}\}\cup\mathcal{Y}_{2}$
9:     $\mathcal{C}=\mathcal{C}\setminus\{y^{\star}\}$
10: $\mathcal{Y}_{g}\in\mathop{\arg\max}_{\mathcal{Y}\in\{\mathcal{Y}_{1},\mathcal{Y}_{2}\}}\hat{f}_{P}(\mathcal{Y})$
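A direct Python transcription of Algorithm 2 might look as follows, with the approximate objective $\hat{f}_{P}$ supplied as a callable (for instance, one of the evaluation routines sketched earlier). This is a sketch under our own naming conventions, not the authors' implementation.

```python
def greedy_pems(ground, cost, f_hat, budget):
    """Greedy algorithm (Algorithm 2) for Problem (P).

    ground: list of elements of M_bar;  cost: dict element -> cost > 0;
    f_hat : callable evaluating the (approximate) objective on a set;
    budget: knapsack budget B.
    """
    # Line 3: best single element.
    Y1 = {max(ground, key=lambda y: f_hat({y}))}
    # Lines 4-9: cost-benefit greedy selection.
    Y2, C = set(), set(ground)
    while C:
        f_Y2 = f_hat(Y2)
        y_star = max(C, key=lambda y: (f_hat(Y2 | {y}) - f_Y2) / cost[y])
        if cost[y_star] + sum(cost[y] for y in Y2) <= budget:
            Y2 = Y2 | {y_star}
        C.remove(y_star)
    # Line 10: return the better of the two candidates.
    return Y1 if f_hat(Y1) > f_hat(Y2) else Y2
```

Note that the division in line 6 is well defined since all element costs are positive, and that the algorithm makes $O(|\bar{\mathcal{M}}|^{2})$ calls to $\hat{f}_{P}$, matching the complexity discussed above.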

In contrast to $f_{Pd}(\cdot)$, the objective function $f_{Pa}(\cdot)$ is not submodular in general (e.g., [17]). In fact, one can construct examples where the objective function $f_{Pa}(\mathcal{Y})=\operatorname{tr}((F_{p})^{-1})-\operatorname{tr}((F_{p}+H(\mathcal{Y}))^{-1})$ in the PEMS problem is not submodular. Hence, in order to provide performance guarantees for the greedy algorithm when applied to Problem (P) with $f_{P}(\cdot)=f_{Pa}(\cdot)$, we will extend the analysis in [16] to nonsubmodular settings. To proceed, note that for all $\mathcal{A}\subseteq\mathcal{B}\subseteq\bar{\mathcal{M}}$, we have $F_{p}+H(\mathcal{A})\preceq F_{p}+H(\mathcal{B})$, which implies $(F_{p}+H(\mathcal{A}))^{-1}\succeq(F_{p}+H(\mathcal{B}))^{-1}$ and $\operatorname{tr}((F_{p}+H(\mathcal{A}))^{-1})\geq\operatorname{tr}((F_{p}+H(\mathcal{B}))^{-1})$ [10]. Therefore, the objective function $f_{Pa}(\cdot)$ is monotone nondecreasing with $f_{Pa}(\emptyset)=0$. We then characterize how close $f_{Pa}(\cdot)$ is to being submodular by introducing the following definition.

Definition 5.5.

Consider Problem (P) with $f_{P}(\cdot)=f_{Pa}(\cdot)$, where $f_{Pa}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$ is defined in (39). Suppose Algorithm 2 is applied to solve Problem (P). For all $j\in\{1,\dots,|\mathcal{Y}_{2}|\}$, let $\mathcal{Y}_{2}^{j}=\{y_{1},\dots,y_{j}\}$ denote the set that contains the first $j$ elements added to set $\mathcal{Y}_{2}$ in Algorithm 2, and let $\mathcal{Y}_{2}^{0}=\emptyset$. The type-1 greedy submodularity ratio of $f_{Pa}(\cdot)$ is defined to be the largest $\gamma_{1}\in\mathbb{R}$ that satisfies

\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\big{(}f_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})\big{)}\geq\gamma_{1}\big{(}f_{Pa}(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})\big{)}, \quad (45)

for all $\mathcal{A}\subseteq\bar{\mathcal{M}}$ and for all $j\in\{0,\dots,|\mathcal{Y}_{2}|\}$. The type-2 greedy submodularity ratio of $f_{Pa}(\cdot)$ is defined to be the largest $\gamma_{2}\in\mathbb{R}$ that satisfies

f_{Pa}(\mathcal{Y}_{1})-f_{Pa}(\emptyset)\geq\gamma_{2}\big{(}f_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})\big{)}, \quad (46)

for all $j\in\{0,\dots,|\mathcal{Y}_{2}|\}$ and for all $y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}$ such that $c(y)+c(\mathcal{Y}_{2}^{j})>B$, where $\mathcal{Y}_{1}\in\mathop{\arg\max}_{y\in\bar{\mathcal{M}}}\hat{f}_{Pa}(y)$.

Remark 5.6.

Note that $f_{Pa}(\cdot)$ is monotone nondecreasing, as argued above. Noting the definition of $\gamma_{1}$ in (45), one can use similar arguments to those in [5] to show that $\gamma_{1}\in[0,1]$; if $f_{Pa}(\cdot)$ is submodular, then $\gamma_{1}=1$. Similarly, one can show that $\gamma_{2}\geq 0$. Supposing that $\mathcal{Y}_{1}\in\mathop{\arg\max}_{y\in\bar{\mathcal{M}}}f_{Pa}(y)$, one can further show that if $f_{Pa}(\cdot)$ is submodular, then $\gamma_{2}\geq 1$.

Note that since we approximate $f_{Pa}(\cdot)$ by $\hat{f}_{Pa}(\cdot)$ as argued above, we may not be able to obtain the exact values of $\gamma_{1}$ and $\gamma_{2}$ from Definition 5.5. Moreover, finding $\gamma_{1}$ may require an exponential number of function calls of $f_{Pa}(\cdot)$ (or $\hat{f}_{Pa}(\cdot)$). Nonetheless, it will be clear from our analysis below that obtaining lower bounds on $\gamma_{1}$ and $\gamma_{2}$ suffices. Here, we describe how we obtain a lower bound on $\gamma_{2}$ using $\hat{f}_{Pa}(\cdot)$, and defer our analysis for lower bounding $\gamma_{1}$, which requires more care, to the end of this section. Similarly to (46), let $\hat{\gamma}_{2}$ denote the largest real number that satisfies

\hat{f}_{Pa}(\mathcal{Y}_{1})-\frac{\varepsilon}{2}\geq\hat{\gamma}_{2}\big{(}\hat{f}_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-\hat{f}_{Pa}(\mathcal{Y}_{2}^{j})+\varepsilon\big{)}, \quad (47)

for all $j\in\{0,\dots,|\mathcal{Y}_{2}|\}$ and for all $y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}$ such that $c(y)+c(\mathcal{Y}_{2}^{j})>B$. Noting our assumption that $|\hat{f}_{Pa}(\mathcal{Y})-f_{Pa}(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$ (with $\hat{f}_{Pa}(\emptyset)=0$), one can see that $\hat{\gamma}_{2}$ given by (47) also satisfies (46), which implies $\hat{\gamma}_{2}\leq\gamma_{2}$. Given $\mathcal{Y}_{2}^{j}$ for all $j\in\{0,\dots,|\mathcal{Y}_{2}|\}$ from Algorithm 2, $\hat{\gamma}_{2}$ can now be obtained via $O(|\bar{\mathcal{M}}|^{2})$ function calls of $\hat{f}_{Pa}(\cdot)$.
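Since (47) must hold for every admissible pair $(j,y)$, $\hat{\gamma}_{2}$ is simply the minimum of the corresponding ratios. A sketch (our notation, assuming $\varepsilon>0$ so that the denominators are positive):

```python
def gamma2_hat(prefixes, ground, cost, f_hat, budget, eps):
    """Largest gamma_2_hat satisfying (47): the minimum of the admissible
    ratios over all greedy prefixes Y_2^j and budget-violating candidates y.

    prefixes: list [Y_2^0, Y_2^1, ...] of greedy prefixes (sets).
    """
    Y1 = {max(ground, key=lambda y: f_hat({y}))}  # line 3 of Algorithm 2
    lhs = f_hat(Y1) - eps / 2.0
    ratios = []
    for Yj in prefixes:
        c_Yj = sum(cost[y] for y in Yj)
        for y in set(ground) - Yj:
            if cost[y] + c_Yj > budget:  # only infeasible candidates count
                gain = f_hat(Yj | {y}) - f_hat(Yj) + eps  # > 0 for eps > 0
                ratios.append(lhs / gain)
    return min(ratios) if ratios else float('inf')
```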

Based on Definition 5.5, the following result extends the analysis in [15, 16], and characterizes the performance guarantees of Algorithm 2 for Problem (P) with $f_{P}(\cdot)=f_{Pa}(\cdot)$.

Theorem 5.7.

Consider Problem (P) with the objective function $f_{Pa}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$ given by (39). Then Algorithm 2 yields a solution, denoted as $\mathcal{Y}_{a}^{g}$, to Problem (P) that satisfies

f_{Pa}(\mathcal{Y}_{a}^{g})\geq\frac{\mathop{\min}\{\gamma_{2},1\}}{2}(1-e^{-\gamma_{1}})f_{Pa}(\mathcal{Y}_{a}^{\star})-\Big{(}\frac{B+c_{\mathop{\max}}}{c_{\mathop{\min}}}+1\Big{)}\varepsilon, \quad (48)

where $\mathcal{Y}_{a}^{\star}\subseteq\bar{\mathcal{M}}$ is an optimal solution to Problem (P), $\gamma_{1}\in\mathbb{R}_{\geq 0}$ and $\gamma_{2}\in\mathbb{R}_{\geq 0}$ are defined in Definition 5.5, $c_{\mathop{\min}}=\mathop{\min}_{y\in\bar{\mathcal{M}}}c(y)$, $c_{\mathop{\max}}=\mathop{\max}_{y\in\bar{\mathcal{M}}}c(y)$, and $\varepsilon\in\mathbb{R}_{\geq 0}$ satisfies $|\hat{f}_{Pa}(\mathcal{Y})-f_{Pa}(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$.

Proof.

Noting that (48) holds trivially if $\gamma_{1}=0$ or $\gamma_{2}=0$, we assume that $\gamma_{1}>0$ and $\gamma_{2}>0$. In this proof, we drop the subscript of $f_{Pa}(\cdot)$ (resp., $\hat{f}_{Pa}(\cdot)$) and write $f(\cdot)$ (resp., $\hat{f}(\cdot)$) for notational simplicity. First, recall that for all $j\in\{1,\dots,|\mathcal{Y}_{2}|\}$, we let $\mathcal{Y}_{2}^{j}=\{y_{1},\dots,y_{j}\}$ denote the set that contains the first $j$ elements added to set $\mathcal{Y}_{2}$ in Algorithm 2, and let $\mathcal{Y}_{2}^{0}=\emptyset$. Now, let $j_{l}$ be the first index in $\{1,\dots,|\mathcal{Y}_{2}|\}$ such that a candidate element $y^{\star}\in\mathop{\arg\max}_{y\in\mathcal{C}}\frac{\hat{f}(\{y\}\cup\mathcal{Y}_{2}^{j_{l}})-\hat{f}(\mathcal{Y}_{2}^{j_{l}})}{c(y)}$ for $\mathcal{Y}_{2}$ (given in line 6 of Algorithm 2) cannot be added to $\mathcal{Y}_{2}$ due to $c(y^{\star})+c(\mathcal{Y}_{2}^{j_{l}})>B$. In other words, for all $j\in\{0,\dots,j_{l}-1\}$, any candidate element $y^{\star}\in\mathop{\arg\max}_{y\in\mathcal{C}}\frac{\hat{f}(\{y\}\cup\mathcal{Y}_{2}^{j})-\hat{f}(\mathcal{Y}_{2}^{j})}{c(y)}$ for $\mathcal{Y}_{2}$ satisfies $c(y^{\star})+c(\mathcal{Y}_{2}^{j})\leq B$ and can be added to $\mathcal{Y}_{2}$ in Algorithm 2. Noting that $|\hat{f}(\mathcal{Y})-f(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, one can then show that the following holds for all $j\in\{0,\dots,j_{l}-1\}$:

\frac{f(\mathcal{Y}_{2}^{j+1})-f(\mathcal{Y}_{2}^{j})+\varepsilon}{c(y_{j+1})}\geq\frac{f(\{y\}\cup\mathcal{Y}_{2}^{j})-f(\mathcal{Y}_{2}^{j})-\varepsilon}{c(y)}\quad\forall y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}. \quad (49)

Now, considering any $j\in\{0,\dots,j_{l}-1\}$, we have the following:

\begin{aligned}f(\mathcal{Y}^{\star}_{a}\cup\mathcal{Y}_{2}^{j})-f(\mathcal{Y}_{2}^{j})&\leq\frac{1}{\gamma_{1}}\sum_{y\in\mathcal{Y}^{\star}_{a}\setminus\mathcal{Y}^{j}_{2}}c(y)\cdot\frac{f(\{y\}\cup\mathcal{Y}^{j}_{2})-f(\mathcal{Y}^{j}_{2})}{c(y)} \quad (50)\\ &\leq\frac{1}{\gamma_{1}}\sum_{y\in\mathcal{Y}^{\star}_{a}\setminus\mathcal{Y}_{2}^{j}}c(y)\Big{(}\frac{f(\mathcal{Y}^{j+1}_{2})-f(\mathcal{Y}^{j}_{2})+\varepsilon}{c(y_{j+1})}+\frac{\varepsilon}{c(y)}\Big{)} \quad (51)\\ &\leq\frac{B}{\gamma_{1}}\cdot\frac{f(\mathcal{Y}^{j+1}_{2})-f(\mathcal{Y}^{j}_{2})}{c(y_{j+1})}+\frac{\varepsilon}{\gamma_{1}}\sum_{y\in\mathcal{Y}_{a}^{\star}\setminus\mathcal{Y}_{2}^{j}}\Big{(}\frac{c(y)}{c(y_{j+1})}+1\Big{)} \quad (52)\\ &\leq\frac{B}{\gamma_{1}}\cdot\frac{f(\mathcal{Y}^{j+1}_{2})-f(\mathcal{Y}^{j}_{2})}{c(y_{j+1})}+\frac{\varepsilon}{\gamma_{1}}\Big{(}\frac{B}{c(y_{j+1})}+|\mathcal{Y}_{a}^{\star}|\Big{)}, \quad (53)\end{aligned}

where (50) follows from the definition of $\gamma_{1}$ in (45), and (51) follows from (49). To obtain (52), we use the fact that $c(\mathcal{Y}^{\star}_{a})\leq B$. Similarly, we obtain (53). Noting that $f(\cdot)$ is monotone nondecreasing, one can further obtain from (53) that

f(\mathcal{Y}_{2}^{j+1})-f(\mathcal{Y}_{2}^{j})\geq\frac{\gamma_{1}c(y_{j+1})}{B}\big{(}f(\mathcal{Y}_{a}^{\star})-f(\mathcal{Y}_{2}^{j})\big{)}-\varepsilon\Big{(}1+|\mathcal{Y}_{a}^{\star}|\frac{c(y_{j+1})}{B}\Big{)}. \quad (54)

To proceed, let $y^{\prime}\in\mathop{\arg\max}_{y\in\mathcal{C}}\frac{\hat{f}(\{y\}\cup\mathcal{Y}_{2}^{j_{l}})-\hat{f}(\mathcal{Y}_{2}^{j_{l}})}{c(y)}$ be the (first) candidate element for $\mathcal{Y}_{2}$ that cannot be added to $\mathcal{Y}_{2}$ due to $c(y^{\prime})+c(\mathcal{Y}_{2}^{j_{l}})>B$, as argued above. Similarly to (49), one can see that $\frac{f(\{y^{\prime}\}\cup\mathcal{Y}_{2}^{j_{l}})-f(\mathcal{Y}_{2}^{j_{l}})+\varepsilon}{c(y^{\prime})}\geq\frac{f(\{y\}\cup\mathcal{Y}_{2}^{j_{l}})-f(\mathcal{Y}_{2}^{j_{l}})-\varepsilon}{c(y)}$ holds for all $y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j_{l}}$. Letting $\bar{\mathcal{Y}}_{2}^{j_{l}+1}\triangleq\{y^{\prime}\}\cup\mathcal{Y}_{2}^{j_{l}}$ and following similar arguments to those leading up to (54), we have

f(\bar{\mathcal{Y}}_{2}^{j_{l}+1})-f(\mathcal{Y}_{2}^{j_{l}})\geq\frac{\gamma_{1}c(y^{\prime})}{B}\big{(}f(\mathcal{Y}_{a}^{\star})-f(\mathcal{Y}_{2}^{j_{l}})\big{)}-\varepsilon\Big{(}1+|\mathcal{Y}_{a}^{\star}|\frac{c(y^{\prime})}{B}\Big{)}. \quad (55)

Denoting $\Delta_{j}\triangleq f(\mathcal{Y}^{\star}_{a})-f(\mathcal{Y}^{j}_{2})$ for all $j\in\{0,\dots,j_{l}\}$ and $\Delta_{j_{l}+1}\triangleq f(\mathcal{Y}_{a}^{\star})-f(\bar{\mathcal{Y}}_{2}^{j_{l}+1})$, we obtain from (54) and (55) the following:

\Delta_{j}\leq\Delta_{j-1}\Big{(}1-\frac{c(y_{j})\gamma_{1}}{B}\Big{)}+\varepsilon+\frac{c(y_{j})|\mathcal{Y}_{a}^{\star}|}{B}\varepsilon\quad\forall j\in[j_{l}+1]. \quad (56)

Unrolling (56) yields

\begin{aligned}&\Delta_{j_{l}+1}\leq\Delta_{0}\Big{(}\prod_{j=1}^{j_{l}}\big{(}1-\frac{c(y_{j})\gamma_{1}}{B}\big{)}\Big{)}\Big{(}1-\frac{c(y^{\prime})\gamma_{1}}{B}\Big{)}+\Big{(}j_{l}+1+\frac{c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})|\mathcal{Y}_{a}^{\star}|}{B}\Big{)}\varepsilon, \quad (57)\\ \implies&\Delta_{j_{l}+1}\leq\Delta_{0}\Big{(}\prod_{j=1}^{j_{l}}\big{(}1-\frac{c(y_{j})\gamma_{1}}{B}\big{)}\Big{)}\Big{(}1-\frac{c(y^{\prime})\gamma_{1}}{B}\Big{)}+\frac{2(B+c_{\mathop{\max}})}{c_{\mathop{\min}}}\varepsilon. \quad (58)\end{aligned}

To obtain (57), we use the facts that $1-\frac{c(y_{j})\gamma_{1}}{B}\leq 1$ for all $j\in[j_{l}+1]$ and $1-\frac{c(y^{\prime})\gamma_{1}}{B}\leq 1$, since $\gamma_{1}\in(0,1]$ as argued in Remark 5.6. To obtain (58), we first note from the way we defined $j_{l}$ that $j_{l}+1\leq c(\bar{\mathcal{Y}}^{j_{l}+1}_{2})/c_{\mathop{\min}}\leq(B+c_{\mathop{\max}})/c_{\mathop{\min}}$. Also noting that $|\mathcal{Y}_{a}^{\star}|\leq B/c_{\mathop{\min}}$ and $c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})\leq B+c_{\mathop{\max}}$, we then obtain (58).

Now, one can show that $\big{(}\prod_{j=1}^{j_{l}}(1-\frac{c(y_{j})\gamma_{1}}{B})\big{)}(1-\frac{c(y^{\prime})\gamma_{1}}{B})\leq\prod_{j=1}^{j_{l}+1}\big{(}1-\frac{c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})\gamma_{1}}{(j_{l}+1)B}\big{)}\leq e^{-\gamma_{1}\frac{c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})}{B}}$ (e.g., [15]). We then have from (58) the following:

\begin{aligned}&f(\mathcal{Y}_{a}^{\star})-f(\bar{\mathcal{Y}}_{2}^{j_{l}+1})\leq f(\mathcal{Y}^{\star}_{a})e^{-\gamma_{1}\frac{c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})}{B}}+\frac{2(B+c_{\mathop{\max}})}{c_{\mathop{\min}}}\varepsilon\\ \implies&f(\bar{\mathcal{Y}}_{2}^{j_{l}+1})\geq(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{2(B+c_{\mathop{\max}})}{c_{\mathop{\min}}}\varepsilon, \quad (59)\end{aligned}

where (59) follows from $c(\bar{\mathcal{Y}}_{2}^{j_{l}+1})>B$.

To proceed with the proof of the theorem, we note from the definition of $\gamma_{2}$ in Definition 5.5 that $f(\{y^{\prime}\}\cup\mathcal{Y}_{2}^{j_{l}})-f(\mathcal{Y}_{2}^{j_{l}})\leq\frac{1}{\gamma_{2}}f(\mathcal{Y}_{1})$ with $\gamma_{2}>0$, which together with (59) implies that

f(\mathcal{Y}_{2}^{j_{l}})+\frac{1}{\gamma_{2}}f(\mathcal{Y}_{1})\geq f(\bar{\mathcal{Y}}_{2}^{j_{l}+1})\geq(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{2\bar{B}}{c_{\mathop{\min}}}\varepsilon, \quad (60)

where $\bar{B}\triangleq B+c_{\mathop{\max}}$. Since $f(\cdot)$ is monotone nondecreasing, we obtain from (60)

f(\mathcal{Y}_{2})+\frac{1}{\gamma_{2}}f(\mathcal{Y}_{1})\geq(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{2\bar{B}}{c_{\mathop{\min}}}\varepsilon. \quad (61)

We then split our analysis into two cases. First, supposing that $\gamma_{2}\geq 1$, we see from (61) that at least one of $f(\mathcal{Y}_{2})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\bar{B}}{c_{\mathop{\min}}}\varepsilon$ and $f(\mathcal{Y}_{1})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\bar{B}}{c_{\mathop{\min}}}\varepsilon$ holds. Recalling that $|\hat{f}(\mathcal{Y})-f(\mathcal{Y})|\leq\varepsilon/2$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, it follows that at least one of $\hat{f}(\mathcal{Y}_{2})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\bar{B}}{c_{\mathop{\min}}}\varepsilon-\frac{\varepsilon}{2}$ and $\hat{f}(\mathcal{Y}_{1})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\bar{B}}{c_{\mathop{\min}}}\varepsilon-\frac{\varepsilon}{2}$ holds. Second, supposing $\gamma_{2}<1$ and using similar arguments, we have that at least one of $\hat{f}(\mathcal{Y}_{2})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\bar{B}}{c_{\mathop{\min}}}\varepsilon-\frac{\varepsilon}{2}$ and $\hat{f}(\mathcal{Y}_{1})\geq\frac{\gamma_{2}}{2}(1-e^{-\gamma_{1}})f(\mathcal{Y}_{a}^{\star})-\frac{\gamma_{2}\bar{B}}{c_{\mathop{\min}}}\varepsilon-\frac{\varepsilon}{2}$ holds. Now, we note from line 10 of Algorithm 2 that $\hat{f}(\mathcal{Y}_{a}^{g})\geq\mathop{\max}\{\hat{f}(\mathcal{Y}_{1}),\hat{f}(\mathcal{Y}_{2})\}$, which implies $f(\mathcal{Y}_{a}^{g})\geq\mathop{\max}\{\hat{f}(\mathcal{Y}_{1}),\hat{f}(\mathcal{Y}_{2})\}-\frac{\varepsilon}{2}$. Combining the above arguments, we obtain (48). ∎

Remark 5.8.

Note that (48) becomes $f_{Pa}(\mathcal{Y}_{a}^{g})\geq\frac{1}{2}(1-e^{-\gamma_{1}})f_{Pa}(\mathcal{Y}_{a}^{\star})-(\frac{B+c_{\mathop{\max}}}{c_{\mathop{\min}}}+1)\varepsilon$ if $\gamma_{2}\geq 1$. Also note that $\gamma_{2}\geq 1$ can hold even when the objective function $f_{Pa}(\cdot)$ is not submodular, as we will see later in our numerical examples.

Remark 5.9.

The authors in [31] also extended the analysis of Algorithm 2 to nonsubmodular settings, for the case where the objective function can be evaluated exactly (i.e., $\varepsilon=0$). They obtained a performance guarantee for Algorithm 2 that depends on a submodularity ratio defined in a different manner. One can show that the submodularity ratios defined in Definition 5.5 are lower bounded by the one defined in [31], which further implies that the performance bound (when $\varepsilon=0$) for Algorithm 2 given in Theorem 5.7 is tighter than that provided in [31].

Finally, we aim to provide a lower bound on $\gamma_{1}$ that can be computed in polynomial time. The lower bounds on $\gamma_{1}$ and $\gamma_{2}$, together with Theorem 5.7, then provide computable performance guarantees for the greedy algorithm.

Lemma 5.10.

([10]) For any positive semidefinite matrices $P,Q\in\mathbb{R}^{n\times n}$, $\lambda_{1}(P)\leq\lambda_{1}(P+Q)\leq\lambda_{1}(P)+\lambda_{1}(Q)$, and $\lambda_{n}(P+Q)\geq\lambda_{n}(P)+\lambda_{n}(Q)$.

We have the following result; the proof is included in Section 7.4 in the Appendix.

Lemma 5.11.

Consider the set function $f_{Pa}:2^{\bar{\mathcal{M}}}\to\mathbb{R}_{\geq 0}$ defined in (39). The type-1 greedy submodularity ratio of $f_{Pa}(\cdot)$ given by Definition 5.5 satisfies

\gamma_{1}\geq\mathop{\min}_{j\in\{0,\dots,|\mathcal{Y}_{2}|\}}\frac{\lambda_{2}(F_{p}+H(\mathcal{Y}_{2}^{j}))\lambda_{2}(F_{p}+H(\{z_{j}\}\cup\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p}+H(\mathcal{Y}_{2}^{j}))\lambda_{1}(F_{p}+H(\{z_{j}\}\cup\mathcal{Y}_{2}^{j}))}, \quad (62)

where $\mathcal{Y}_{2}^{j}$ contains the first $j$ elements added to $\mathcal{Y}_{2}$ in Algorithm 2 for all $j\in\{1,\dots,|\mathcal{Y}_{2}|\}$ with $\mathcal{Y}_{2}^{0}=\emptyset$, $F_{p}$ is given by (31), $H(\mathcal{Y})=\sum_{y\in\mathcal{Y}}H_{y}$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$ with $H_{y}\succeq\mathbf{0}$ defined in (40), and $z_{j}\in\mathop{\arg\min}_{y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}}\frac{\lambda_{2}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}$ for all $j\in\{0,\dots,|\mathcal{Y}_{2}|\}$.

Recalling our arguments at the beginning of this section, we may only be able to obtain approximations of the entries of the ($2\times 2$) matrix $F_{p}+H(\mathcal{Y})$ for $\mathcal{Y}\subseteq\bar{\mathcal{M}}$ using, e.g., numerical integration, where $H_{y}$ (resp., $F_{p}$) is defined in (40) (resp., (31)). Specifically, for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, let $\hat{H}(\mathcal{Y})=(F_{p}+H(\mathcal{Y}))+E(\mathcal{Y})$ be the approximation of $F_{p}+H(\mathcal{Y})$, where each entry of $E(\mathcal{Y})\in\mathbb{R}^{2\times 2}$ represents the approximation error of the corresponding entry of $F_{p}+H(\mathcal{Y})$. Since $F_{p}$ and $H(\mathcal{Y})$ are positive semidefinite matrices, $E(\mathcal{Y})$ is a symmetric matrix. Now, using a standard eigenvalue perturbation result, e.g., Corollary 6.3.8 in [10], one can obtain $\sum_{i=1}^{2}|\lambda_{i}(F_{p}+H(\mathcal{Y}))-\lambda_{i}(\hat{H}(\mathcal{Y}))|^{2}\leq\|E(\mathcal{Y})\|_{F}^{2}$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$, where $\|E(\mathcal{Y})\|_{F}\triangleq\sqrt{\operatorname{tr}(E(\mathcal{Y})^{\top}E(\mathcal{Y}))}$ denotes the Frobenius norm of a matrix. It then follows that

\frac{\lambda_{2}(F_{p}+H(\mathcal{Y}))}{\lambda_{1}(F_{p}+H(\mathcal{Y}))}\geq\frac{\lambda_{2}(\hat{H}(\mathcal{Y}))-\|E(\mathcal{Y})\|_{F}}{\lambda_{1}(\hat{H}(\mathcal{Y}))+\|E(\mathcal{Y})\|_{F}}\geq\frac{\lambda_{2}(\hat{H}(\mathcal{Y}))-\varepsilon^{\prime}}{\lambda_{1}(\hat{H}(\mathcal{Y}))+\varepsilon^{\prime}}\quad\forall\mathcal{Y}\subseteq\bar{\mathcal{M}},

where $\varepsilon^{\prime}\in\mathbb{R}_{\geq 0}$ satisfies $\|E(\mathcal{Y})\|_{F}\leq\varepsilon^{\prime}$ for all $\mathcal{Y}\subseteq\bar{\mathcal{M}}$. Combining the above arguments with (62) in Lemma 5.11, we obtain a lower bound on $\gamma_{1}$ that can be computed using $O(|\bar{\mathcal{M}}|^{2})$ evaluations of $\hat{H}(\cdot)$.
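Putting Lemma 5.11 and the perturbation bound together, the computable lower bound on $\gamma_{1}$ can be sketched as follows (hypothetical names; `H_hat_of(Y)` returns the approximate $2\times 2$ matrix $\hat{H}(\mathcal{Y})$, and `eps_prime` is the bound $\varepsilon^{\prime}$).

```python
import numpy as np

def gamma1_lower_bound(prefixes, ground, H_hat_of, eps_prime):
    """Computable lower bound on gamma_1 via (62): each exact ratio
    lambda_2 / lambda_1 of F_p + H(Y) is lower bounded by
    (lambda_2(H_hat) - eps') / (lambda_1(H_hat) + eps').

    prefixes : greedy prefixes [Y_2^0, ..., Y_2^{|Y_2|}] as sets;
    H_hat_of : callable set -> approximate 2x2 matrix of F_p + H(Y).
    """
    def ratio(Y):
        lam = np.linalg.eigvalsh(H_hat_of(Y))  # ascending eigenvalues
        return max((lam[0] - eps_prime) / (lam[1] + eps_prime), 0.0)

    bound = np.inf
    for Yj in prefixes:
        rest = set(ground) - set(Yj)
        if not rest:
            continue
        # z_j: remaining element minimizing the (approximate) ratio.
        zj = min(rest, key=lambda y: ratio(set(Yj) | {y}))
        bound = min(bound, ratio(set(Yj)) * ratio(set(Yj) | {zj}))
    return bound
```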

5.3.1 Illustrations

Note that one can further obtain from (62):

\gamma_{1}\geq\mathop{\min}_{j\in\{0,\dots,|\mathcal{Y}_{2}|\}}\frac{\lambda_{2}(F_{p})+\lambda_{2}(H(\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p})+\lambda_{1}(H(\mathcal{Y}_{2}^{j}))}\cdot\frac{\lambda_{2}(F_{p})+\lambda_{2}(H(z_{j}))+\lambda_{2}(H(\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p})+\lambda_{1}(H(z_{j}))+\lambda_{1}(H(\mathcal{Y}_{2}^{j}))}, \quad (63)

where $z_{j}\in\mathop{\arg\min}_{y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}}\frac{\lambda_{2}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}$. Supposing $F_{p}$ is fixed, we see from (63) that the lower bound on $\gamma_{1}$ potentially increases as $\lambda_{2}(H(z_{j}))/\lambda_{1}(H(z_{j}))$ and $\lambda_{2}(H(\mathcal{Y}_{2}^{j}))/\lambda_{1}(H(\mathcal{Y}_{2}^{j}))$ increase. Recall that $F_{p}$, given by (31), encodes the prior knowledge that we have about $\theta=[\beta\ \delta]^{T}$. Moreover, recall from (40) that $H_{y}$ depends on the prior pdf $p(\theta)$ and on the dynamics of the SIR model in (1). Thus, the lower bound given by Lemma 5.11, and hence the corresponding performance bound for Algorithm 2 given in Theorem 5.7, depend on the prior knowledge that we have about $\theta=[\beta\ \delta]^{T}$ and on the dynamics of the SIR model. Also note that the performance bounds given in Theorem 5.7 are worst-case bounds for Algorithm 2. Thus, in practice, the ratio between a solution returned by the algorithm and an optimal solution can be better than the ratio predicted by Theorem 5.7, as we will see in our simulations in the next section. Moreover, instances with tighter performance bounds potentially imply better performance of the algorithm when applied to those instances. Similar arguments hold for the performance bound provided in Theorem 5.4.

5.3.2 Simulations

To validate the theoretical results in Theorems 5.4 and 5.7, and Lemma 5.11, we consider various PEMS instances. (In our simulations, we neglect the approximation error corresponding to the numerical integrations discussed in Section 5.3, since the error terms are found to be sufficiently small.) The directed network $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ is given by Fig. 1(a). According to the existing literature on the estimated infection and recovery rates for the COVID-19 pandemic (e.g., [26]), we assume that the infection rate $\beta$ and the recovery rate $\delta$ lie in the intervals $[3,7]$ and $[1,4]$, respectively. Let the prior pdf of $\beta$ (resp., $\delta$) be a (linearly transformed) Beta distribution with parameters $\alpha_{1}=6$ and $\alpha_{2}=3$ (resp., $\alpha_{1}=3$ and $\alpha_{2}=4$), where $\beta$ and $\delta$ are also assumed to be independent. The prior pdfs of $\beta$ and $\delta$ are plotted in Fig. 1(b) and Fig. 1(c), respectively. We set the sampling parameter $h=0.1$. We then randomly generate the weight matrix $A\in\mathbb{R}^{5\times 5}$ such that Assumptions 3.1-3.2 are satisfied, where each entry of $A$ is drawn (independently) from certain uniform distributions. The initial condition is set to be $s_{1}[0]=0.95$, $x_{1}[0]=0.05$ and $r_{1}[0]=0$, and $s_{i}[0]=0.99$, $x_{i}[0]=0.01$ and $r_{i}[0]=0$ for all $i\in\{2,\dots,5\}$. In the pmfs of measurements $\hat{x}_{i}[k]$ and $\hat{r}_{i}[k]$ given in Eq. (34) and Eq. (35), respectively, we set $N_{i}^{x}=N_{i}^{r}=100$ and $N_{i}=1000$ for all $i\in\mathcal{V}$, where $N_{i}$ is the total population at node $i$.

Figure 1: Network structure and prior pdfs of $\beta$ and $\delta$.

First, let us consider PEMS instances of a relatively small size. In such instances, we set the time steps $t_{1}=t_{2}=5$, i.e., we only consider collecting measurements at time step $t=5$. In the sets $\mathcal{C}_{5,i}=\{\zeta c_{5,i}:\zeta\in(\{0\}\cup[\zeta_{i}])\}$ and $\mathcal{B}_{5,i}=\{\eta b_{5,i}:\eta\in(\{0\}\cup[\eta_{i}])\}$, we let $c_{5,i}=b_{5,i}$ and $\zeta_{i}=\eta_{i}=2$ for all $i\in\mathcal{V}$, and draw $c_{5,i}$ and $b_{5,i}$ uniformly at random from $\{1,2,3\}$. Here, we can choose to perform $0$, $100$, or $200$ virus (or antibody) tests at a node $i\in\mathcal{V}$ at $k=5$. In Fig. 2(a), we consider the objective function $f_{Pd}(\cdot)$, given by Eq. (42), in the PEMS instances constructed above, and plot the greedy solutions and the optimal solutions (found by brute force) to the PEMS instances under different values of the budget $B$. Note that for all the simulation results in this section, we report results averaged over $50$ randomly generated $A$ matrices, constructed as described above, for each value of $B$. As shown in Theorem 5.4, the greedy algorithm yields a $\frac{1}{2}(1-e^{-1})\approx 0.31$ approximation for $f_{Pd}(\cdot)$ (in the worst case), and the results in Fig. 2(a) show that the greedy algorithm performs near optimally on the PEMS instances generated above. Similarly, in Fig. 2(b), we plot the greedy solutions and the optimal solutions to the PEMS instances constructed above under different values of $B$, when the objective function is $f_{Pa}(\cdot)$ given in Eq. (39). Again, the results in Fig. 2(b) show that the greedy algorithm performs well on the constructed PEMS instances. Moreover, using Lemma 5.11, we plot the lower bound on the submodularity ratio $\gamma_{1}$ of $f_{Pa}(\cdot)$ in Fig. 2(c). Here, we note that the submodularity ratio $\gamma_{2}$ of $f_{Pa}(\cdot)$ is always greater than one in the PEMS instances constructed above. Hence, Theorem 5.7 yields a $\frac{1}{2}(1-e^{-\gamma_{1}})$ worst-case approximation guarantee for the greedy algorithm, where, e.g., $\frac{1}{2}(1-e^{-0.3})\approx 0.13$.

Figure 2: Results for PEMS instances of medium size.

We then investigate the performance of the greedy algorithm on PEMS instances of a larger size. Since the optimal solutions to the PEMS instances cannot be efficiently obtained when the sizes of the instances become large, we instead compute the lower bound on the submodularity ratio $\gamma_{1}$ of $f_{Pa}(\cdot)$ provided in Lemma 5.11, which can be obtained in polynomial time. Differently from the smaller instances constructed above, we set $t_{1}=1$ and $t_{2}=5$. We let $\zeta_{i}=\eta_{i}=10$ for all $i\in\mathcal{V}$ in $\mathcal{C}_{k,i}=\{\zeta c_{k,i}:\zeta\in(\{0\}\cup[\zeta_{i}])\}$ and $\mathcal{B}_{k,i}=\{\eta b_{k,i}:\eta\in(\{0\}\cup[\eta_{i}])\}$, where we also set $c_{k,i}=b_{k,i}$ and draw $c_{k,i}$ and $b_{k,i}$ uniformly at random from $\{1,2,3\}$, for all $k\in[5]$ and for all $i\in\mathcal{V}$. Moreover, we modify the parameters of the Beta distribution corresponding to the pdf of $\beta$ to be $\alpha_{1}=8$ and $\alpha_{2}=3$. Here, we can choose to perform $0$, $100$, $200$, $\dots$, or $1000$ virus (or antibody) tests at a node $i\in\mathcal{V}$ at each $k\in[5]$. In Fig. 3(a), we plot the lower bound on $\gamma_{1}$ obtained from the PEMS instances constructed above. We note that the submodularity ratio $\gamma_{2}$ of $f_{Pa}(\cdot)$ is again always greater than one. Hence, Theorem 5.7 yields a $\frac{1}{2}(1-e^{-\gamma_{1}})$ worst-case approximation guarantee for the greedy algorithm. We plot in Fig. 3(b) the approximation guarantee computed using the lower bound that we obtained on $\gamma_{1}$.

Figure 3: Results for PEMS instances of large size.

6 Conclusion

We first considered the PIMS problem under the exact measurement setting, and showed that the problem is NP-hard. We then proposed an approximation algorithm that returns a solution to the PIMS problem that is within a certain factor of the optimal one. Next, we studied the PEMS problem under the noisy measurement setting. Again, we showed that the problem is NP-hard. We applied a greedy algorithm to solve the PEMS problem, and provided performance guarantees on the greedy algorithm. We presented numerical examples to validate the obtained performance bounds of the greedy algorithm, and showed that the greedy algorithm performs well in practice.

7 Appendix

7.1 Proof of Lemma 3.5

We first prove part (a). Considering any $i\in\mathcal{V}$ and any $k\in\mathbb{Z}_{\geq 0}$, we note from Eq. (1a) that

s_{i}[k+1]=s_{i}[k]\Big{(}1-h\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k]\Big{)}. \quad (64)

Under Assumptions 3.1-3.2, we have $x_{i}[k]\in[0,1]$ for all $i\in\mathcal{V}$ as argued in Section 3, and $h\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}<1$ for all $i\in\mathcal{V}$, which implies $1-h\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k]\geq 1-h\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}>0$. Supposing $s_{i}[k]>0$, we have from Eq. (64) that $s_{i}[k+1]>0$. Combining the above arguments with the fact that $s_{i}[0]\in(0,1]$ from Assumption 3.1, we see that $s_{i}[k]>0$ for all $k\in\mathbb{Z}_{\geq 0}$. Noting that $s_{i}[k],x_{i}[k],r_{i}[k]\in[0,1]$ with $s_{i}[k]+x_{i}[k]+r_{i}[k]=1$ for all $i\in\mathcal{V}$ and for all $k\in\mathbb{Z}_{\geq 0}$ as argued in Section 3, and that $x_{i}[0]\in[0,1)$ and $r_{i}[0]=0$ for all $i\in\mathcal{V}$, the result in part (a) also implies $x_{i}[k],r_{i}[k]\in[0,1)$ for all $i\in\mathcal{V}$ and for all $k\in\mathbb{Z}_{\geq 0}$.

One can then observe that in order to prove parts (b)-(d), it is sufficient to prove the following facts.

Fact 1.

Consider any $i\in\mathcal{V}$ and any $k_{1}\in\mathbb{Z}_{\geq 0}$. If $x_{i}[k_{1}]>0$, then $x_{i}[k_{2}]>0$ for all $k_{2}\in\mathbb{Z}_{\geq 0}$ with $k_{2}\geq k_{1}$.

Fact 2.

Consider any $i\in\mathcal{V}$ and any $k\in\mathbb{Z}_{\geq 0}$ such that $x_{i}[k]=0$. If there exists $j\in\mathcal{N}_{i}$ such that $x_{j}[k]>0$, then $x_{i}[k+1]>0$. If $x_{j}[k]=0$ for all $j\in\mathcal{N}_{i}$, then $x_{i}[k+1]=0$.

Fact 3.

Consider any $i\in\mathcal{V}$ and any $k_{1}\in\mathbb{Z}_{\geq 0}$. If $x_{i}[k_{1}]>0$, then $r_{i}[k_{1}+1]>0$. If $x_{i}[k_{1}]=0$, then $r_{i}[k_{1}+1]=0$.

Let us first prove Fact 1. Consider any $i\in\mathcal{V}$ and any $k\in\mathbb{Z}_{\geq 0}$. Supposing $x_{i}[k]>0$, we have from Eq. (1)

x_{i}[k+1]=(1-h\delta)x_{i}[k]+hs_{i}[k]\beta\sum_{j\in\bar{\mathcal{N}}_{i}}a_{ij}x_{j}[k], \quad (65)

where the first term on the right-hand side is positive, since $1-h\delta>0$ from Assumption 3.2, and the second term on the right-hand side is nonnegative. It then follows that $x_{i}[k+1]>0$. Repeating the above argument proves Fact 1.

We next prove Fact 2. Considering any $i\in\mathcal{V}$ and any $k\in\mathbb{Z}_{\geq 0}$ such that $x_{i}[k]=0$, we note from Eq. (1) that

x_{i}[k+1]=hs_{i}[k]\beta\sum_{j\in\mathcal{N}_{i}}a_{ij}x_{j}[k], \quad (66)

where $s_{i}[k]>0$ as shown in part (a). Suppose there exists $j\in\mathcal{N}_{i}$ such that $x_{j}[k]>0$. Since $h,\beta\in\mathbb{R}_{>0}$ and $a_{ij}>0$ for all $j\in\mathcal{N}_{i}$ from Assumption 3.2, we have from Eq. (66) that $x_{i}[k+1]>0$. Next, supposing $x_{j}[k]=0$ for all $j\in\mathcal{N}_{i}$, we obtain from Eq. (66) that $x_{i}[k+1]=0$. This proves Fact 2.

Finally, we prove Fact 3. Let us consider any $i\in\mathcal{V}$ and any $k_{1}\in\mathbb{Z}_{\geq 0}$. Suppose $x_{i}[k_{1}]>0$. Since $h,\delta\in\mathbb{R}_{>0}$ from Assumption 3.2, we have from Eq. (1c) that $r_{i}[k_{1}+1]=r_{i}[k_{1}]+h\delta x_{i}[k_{1}]>0$. Next, supposing $x_{i}[k_{1}]=0$, we note from Fact 1 (by contraposition) that $x_{i}[k_{1}^{\prime}]=0$ for all $k_{1}^{\prime}\leq k_{1}$. It then follows from Eq. (1c) and Assumption 3.1 that $r_{i}[k_{1}+1]=r_{i}[k_{1}]=\cdots=r_{i}[0]=0$, completing the proof of Fact 3. $\square$

7.2 Proof of Theorem 4.4

We show that the PIMS problem is NP-hard via a polynomial-time reduction from the exact cover by 3-sets (X3C) problem, which is known to be NP-complete [9].

Problem 7.1.

Consider $\mathcal{X}=\{1,2,\dots,3m\}$ and a collection $\mathcal{Z}=\{z_{1},z_{2},\dots,z_{\tau}\}$ of 3-element subsets of $\mathcal{X}$, where $\tau\geq m$. The X3C problem is to determine if there is an exact cover for $\mathcal{X}$, i.e., a subcollection $\mathcal{Z}^{\prime}\subseteq\mathcal{Z}$ such that every element of $\mathcal{X}$ occurs in exactly one member of $\mathcal{Z}^{\prime}$. Note that since each member of $\mathcal{Z}$ contains exactly 3 elements and $|\mathcal{X}|=3m$, any exact cover for $\mathcal{X}$ necessarily consists of exactly $m$ subsets from $\mathcal{Z}$.

Figure 4: Graph $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ constructed in the proof of Theorem 4.4.

Consider an instance of the X3C problem given by a set $\mathcal{X}=\{1,\dots,3m\}$ and a collection $\mathcal{Z}=\{z_{1},\dots,z_{\tau}\}$ of 3-element subsets of $\mathcal{X}$, where $\tau\geq m$. We then construct an instance of the PIMS problem as follows. The node set of the graph $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ is set to be $\mathcal{V}=\{i_{0},i_{1},\dots,i_{\tau}\}\cup\{j_{1},j_{2},\dots,j_{3m}\}$. The edge set of $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ is set to satisfy that $(j_{q},i_{l})\in\mathcal{E}$ if $q\in\mathcal{X}$ is contained in $z_{l}\in\mathcal{Z}$, $(j_{q},i_{0})\in\mathcal{E}$ for all $q\in\mathcal{X}$, and $(i_{0},i_{0})\in\mathcal{E}$. Note that based on this construction, each node $i\in\{i_{1},\dots,i_{\tau}\}$ represents a subset from $\mathcal{Z}$ in the X3C instance, and each node $j\in\{j_{1},\dots,j_{3m}\}$ represents an element from $\mathcal{X}$ in the X3C instance, where the edges between $\{i_{1},\dots,i_{\tau}\}$ and $\{j_{1},\dots,j_{3m}\}$ indicate how the elements in $\mathcal{X}$ are included in the subsets in $\mathcal{Z}$. A plot of $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ is given in Fig. 4. Accordingly, the weight matrix $A\in\mathbb{R}^{(3m+\tau+1)\times(3m+\tau+1)}$ is set to satisfy that $a_{i_{l}j_{q}}=1$ if $q\in\mathcal{X}$ is contained in $z_{l}\in\mathcal{Z}$, $a_{i_{0}j_{q}}=1$ for all $q\in\mathcal{X}$, and $a_{i_{0}i_{0}}=1$. We set the sampling parameter to be $h=1/(3m+1)$. The set $\mathcal{S}_{I}\subseteq\mathcal{V}$ is set to be $\mathcal{S}_{I}=\mathcal{V}$, i.e., $x_{i}[0]>0$ for all $i\in\mathcal{V}$. We set the time steps to be $t_{1}=2$ and $t_{2}=3$. Finally, we set $b_{2,i}=b_{3,i}=0$ for all $i\in\mathcal{V}$, $c_{2,i_{l}}=1$ and $c_{3,i_{l}}=0$ for all $l\in\{1,\dots,\tau\}$, $c_{2,j_{q}}=c_{3,j_{q}}=m+1$ for all $q\in\mathcal{X}$, and $c_{2,i_{0}}=c_{3,i_{0}}=0$. Since $x_{i}[0]>0$ for all $i\in\mathcal{V}$, we see from Lemma 3.5 that $x_{i}[k]>0$ and $r_{i}[k]>0$ for all $i\in\mathcal{V}$ and for all $k\in\{2,3\}$. Therefore, Lemma 3.5 is of no further use in determining the coefficients in the equations from Eq. (3).
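To illustrate this construction, the following Python sketch builds the weight matrix $A$ and sampling parameter $h$ of the PIMS instance from a small X3C instance (the function name and the node ordering $i_{0},i_{1},\dots,i_{\tau},j_{1},\dots,j_{3m}$ are choices made here for illustration, not notation from the paper):

```python
import numpy as np

def pims_instance_from_x3c(m, Z):
    """Build the weight matrix A and sampling parameter h of the PIMS
    instance used in the reduction, with node order i_0, i_1..i_tau,
    j_1..j_3m (an illustrative ordering)."""
    tau = len(Z)
    n = 1 + tau + 3 * m              # i_0, then i_1..i_tau, then j_1..j_3m
    A = np.zeros((n, n))
    A[0, 0] = 1.0                    # self-loop (i_0, i_0)
    for q in range(1, 3 * m + 1):    # element q of X <-> node j_q
        A[0, tau + q] = 1.0          # a_{i_0 j_q} = 1 for all q
    for l, z in enumerate(Z):        # subset z_{l+1} <-> node i_{l+1}
        for q in z:
            A[1 + l, tau + q] = 1.0  # a_{i_l j_q} = 1 iff q is in z_l
    h = 1.0 / (3 * m + 1)
    return A, h

# Tiny X3C instance: X = {1,...,6} (m = 2), Z = {{1,2,3},{3,4,5},{4,5,6}};
# {z_1, z_3} is an exact cover, so the X3C answer is "yes".
A, h = pims_instance_from_x3c(2, [{1, 2, 3}, {3, 4, 5}, {4, 5, 6}])
print(A.shape, h)   # (10, 10) 0.142857...
```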

We claim that an optimal solution, denoted as $\mathcal{I}^{\star}$, to the constructed PIMS instance satisfies $c(\mathcal{I}^{\star})\leq m$ if and only if the solution to the X3C instance is “yes”.

First, suppose the solution to the X3C instance is “yes”. Denote an exact cover as $\mathcal{Z}^{\prime}=\{z_{q_{1}},\dots,z_{q_{m}}\}\subseteq\mathcal{Z}$, where $\{q_{1},\dots,q_{m}\}\subseteq\{1,\dots,\tau\}$. Let us consider a measurement selection strategy $\mathcal{I}_{0}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ given by

\mathcal{I}_{0}=\Big(\bigcup_{l\in\{1,\dots,m\}}\{x_{i_{q_{l}}}[2],x_{i_{q_{l}}}[3],r_{i_{q_{l}}}[2]\}\Big)\cup\{x_{i_{0}}[2],x_{i_{0}}[3],r_{i_{0}}[2],r_{i_{0}}[3]\}.

We then have from Eq. (9) that $\bar{\mathcal{I}}_{0}=\{(2,i_{0},r),(2,i_{0},x)\}\cup\{(2,i_{q_{l}},x):l\in\{1,\dots,m\}\}$. Noting that $s_{i}[k]>0$ for all $i\in\mathcal{V}$ and for all $k\in\mathbb{Z}_{\geq 0}$ from Lemma 3.5$(a)$, we consider the following $(m+1)$ equations from Eq. (3) whose indices are contained in $\bar{\mathcal{I}}_{0}$:

\frac{1}{s_{i_{0}}[2]}(x_{i_{0}}[3]-x_{i_{0}}[2])=h\begin{bmatrix}x_{i_{0}}[2]+\sum_{w\in\mathcal{N}_{i_{0}}}x_{w}[2]&-\frac{x_{i_{0}}[2]}{s_{i_{0}}[2]}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, \qquad (67)
\frac{1}{s_{i_{q_{l}}}[2]}(x_{i_{q_{l}}}[3]-x_{i_{q_{l}}}[2])=h\begin{bmatrix}\sum_{w\in\mathcal{N}_{i_{q_{l}}}}x_{w}[2]&-\frac{x_{i_{q_{l}}}[2]}{s_{i_{q_{l}}}[2]}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix},\quad\forall l\in\{1,\dots,m\}, \qquad (68)

where we note that $\mathcal{N}_{i_{0}}=\{j_{1},\dots,j_{3m}\}$ from the way we constructed $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$. Since $\mathcal{Z}^{\prime}=\{z_{q_{1}},\dots,z_{q_{m}}\}$ is an exact cover for $\mathcal{X}$, we see from the construction of $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ that $\bigcup_{l\in\{1,\dots,m\}}\mathcal{N}_{i_{q_{l}}}$ is a union of mutually disjoint (3-element) sets such that $\bigcup_{l\in\{1,\dots,m\}}\mathcal{N}_{i_{q_{l}}}=\{j_{1},\dots,j_{3m}\}$. Thus, subtracting the equations in (68) from Eq. (67), we obtain

\frac{1}{s_{i_{0}}[2]}(x_{i_{0}}[3]-x_{i_{0}}[2])-\sum_{l\in\{1,\dots,m\}}\frac{1}{s_{i_{q_{l}}}[2]}(x_{i_{q_{l}}}[3]-x_{i_{q_{l}}}[2])=h\begin{bmatrix}x_{i_{0}}[2]&-\frac{x_{i_{0}}[2]}{s_{i_{0}}[2]}+\sum_{l\in\{1,\dots,m\}}\frac{x_{i_{q_{l}}}[2]}{s_{i_{q_{l}}}[2]}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, \qquad (69)

where we note that $x_{i_{0}}[2]>0$ as argued above. Following Definition 4.1, we stack the coefficient matrices $\Phi_{2,i_{0}}^{r}\in\mathbb{R}^{1\times 2}$, $\Phi_{2,i_{0}}^{x}\in\mathbb{R}^{1\times 2}$ and $\Phi_{2,i_{q_{l}}}^{x}\in\mathbb{R}^{1\times 2}$ for all $l\in\{1,\dots,m\}$ into a matrix $\Phi(\mathcal{I}_{0})\in\mathbb{R}^{(m+2)\times 2}$, where $\Phi_{k,i}^{r}$ and $\Phi_{k,i}^{x}$ are defined in (8). Now, considering the matrix

\Phi_{0}=\begin{bmatrix}x_{i_{0}}[2]&-\frac{x_{i_{0}}[2]}{s_{i_{0}}[2]}+\sum_{l\in\{1,\dots,m\}}\frac{x_{i_{q_{l}}}[2]}{s_{i_{q_{l}}}[2]}\\ 0&x_{i_{0}}[2]\end{bmatrix}, \qquad (70)

we see from the above arguments that $(\Phi_{0})_{1}$ and $(\Phi_{0})_{2}$ can be obtained via algebraic operations among the rows in $\Phi(\mathcal{I}_{0})$, and the elements in $(\Phi_{0})_{1}$ and $(\Phi_{0})_{2}$ can be determined using the measurements from $\mathcal{I}_{0}$. Therefore, we have $\Phi_{0}\in\tilde{\Phi}(\mathcal{I}_{0})$, where $\tilde{\Phi}(\mathcal{I}_{0})$ is defined in Definition 4.1. Noting that $x_{i_{0}}[2]>0$, we have $\operatorname{rank}(\Phi_{0})=2$, which implies $r_{\max}(\mathcal{I}_{0})=2$, where $r_{\max}(\mathcal{I}_{0})$ is given by Eq. (10). Thus, $\mathcal{I}_{0}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ satisfies the constraint in (12). Since $c(\mathcal{I}_{0})=m$ from the way we set the costs of collecting measurements in the PIMS instance, we have $c(\mathcal{I}^{\star})\leq m$.
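The rank argument can also be checked numerically; the short sketch below forms $\Phi_{0}$ as in (70) from placeholder state values (illustrative numbers, not values from an actual simulation of the constructed instance) and verifies that it has full rank whenever $x_{i_{0}}[2]>0$:

```python
import numpy as np

# Placeholder states at k = 2 for node i_0 and m = 2 cover nodes i_{q_l};
# any values with x_{i_0}[2] > 0 lead to the same conclusion.
x_i0, s_i0 = 0.3, 0.5
x_q = np.array([0.20, 0.25])
s_q = np.array([0.60, 0.55])

# Phi_0 from (70) is upper triangular with x_{i_0}[2] on the diagonal.
Phi0 = np.array([[x_i0, -x_i0 / s_i0 + np.sum(x_q / s_q)],
                 [0.0, x_i0]])
print(np.linalg.matrix_rank(Phi0))  # 2, since both diagonal entries are positive
```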

Conversely, suppose the solution to the X3C instance is “no”, i.e., for any subcollection $\mathcal{Z}^{\prime}\subseteq\mathcal{Z}$ that contains $m$ subsets, there exists at least one element in $\mathcal{X}$ that is not contained in any subset in $\mathcal{Z}^{\prime}$. We will show that for any measurement selection strategy $\mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ that satisfies $r_{\max}(\mathcal{I})=2$, $c(\mathcal{I})>m$ holds. Equivalently, we will show that for any $\mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ with $c(\mathcal{I})\leq m$, $r_{\max}(\mathcal{I})=2$ does not hold. Consider any $\mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ such that $c(\mathcal{I})\leq m$. Noting that $c_{2,j_{q}}=c_{3,j_{q}}=m+1$ for all $q\in\mathcal{X}$ in the constructed PIMS instance, we have $x_{j_{q}}[2]\notin\mathcal{I}$ and $x_{j_{q}}[3]\notin\mathcal{I}$ for all $q\in\mathcal{X}$. Moreover, we see that $\mathcal{I}$ contains at most $m$ measurements from $\{x_{i_{1}}[2],\dots,x_{i_{\tau}}[2]\}$. To proceed, let us consider any $\mathcal{I}_{1}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ such that

\mathcal{I}_{1}=\{x_{i_{0}}[2],x_{i_{v_{1}}}[2],\dots,x_{i_{v_{m}}}[2]\}\cup\Big(\bigcup_{l\in\{0,\dots,\tau\}}\{x_{i_{l}}[3]\}\Big)\cup\Big(\bigcup_{i\in\mathcal{V}}\{r_{i}[2],r_{i}[3]\}\Big), \qquad (71)

where $\{v_{1},\dots,v_{m}\}\subseteq\{1,\dots,\tau\}$. In other words, $\mathcal{I}_{1}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ contains $m$ measurements from $\{x_{i_{1}}[2],\dots,x_{i_{\tau}}[2]\}$ and all the other measurements from $\mathcal{I}_{t_{1}:t_{2}}$ that have zero costs. It follows that $c(\mathcal{I}_{1})=m$. Similarly to (67) and (68), we have the following $(m+1)$ equations from Eq. (3) whose indices are contained in $\bar{\mathcal{I}}_{1}$ (given by Eq. (9)):

\frac{1}{s_{i_{0}}[2]}(x_{i_{0}}[3]-x_{i_{0}}[2])=h\begin{bmatrix}x_{i_{0}}[2]+\sum_{w\in\mathcal{N}_{i_{0}}}x_{w}[2]&-\frac{x_{i_{0}}[2]}{s_{i_{0}}[2]}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix}, \qquad (72)
\frac{1}{s_{i_{v_{l}}}[2]}(x_{i_{v_{l}}}[3]-x_{i_{v_{l}}}[2])=h\begin{bmatrix}\sum_{w\in\mathcal{N}_{i_{v_{l}}}}x_{w}[2]&-\frac{x_{i_{v_{l}}}[2]}{s_{i_{v_{l}}}[2]}\end{bmatrix}\begin{bmatrix}\beta\\ \delta\end{bmatrix},\quad\forall l\in\{1,\dots,m\}. \qquad (73)

Noting that for any subcollection $\mathcal{Z}^{\prime}\subseteq\mathcal{Z}$ that contains $m$ subsets, there exists at least one element in $\mathcal{X}$ that is not contained in any subset in $\mathcal{Z}^{\prime}$ as we argued above, we see that there exists at least one element in $\mathcal{X}$ that is not contained in any subset in $\{z_{v_{1}},\dots,z_{v_{m}}\}$. It then follows from the way we constructed $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ that there exists $w^{\prime}\in\mathcal{N}_{i_{0}}$ such that $w^{\prime}\notin\mathcal{N}_{i_{v_{l}}}$ for all $l\in\{1,\dots,m\}$. Thus, by subtracting the equations in (73) (each multiplied by an arbitrary constant) from Eq. (72), the term $x_{w^{\prime}}[2]$ will remain on the right-hand side of the equation in (72). Similarly, consider any equation from (73) indexed by $(2,i_{v_{l}},x)\in\bar{\mathcal{I}}_{1}$, where $l\in\{1,\dots,m\}$. First, suppose we subtract Eq. (72) multiplied by some positive constant and any equations in (73) other than equation $(2,i_{v_{l}},x)$ (each multiplied by an arbitrary constant) from equation $(2,i_{v_{l}},x)$. Since there exists $w^{\prime}\in\mathcal{N}_{i_{0}}$ such that $w^{\prime}\notin\mathcal{N}_{i_{v_{l}}}$ for all $l\in\{1,\dots,m\}$ as argued above, we see that $x_{w^{\prime}}[2]$ will appear on the right-hand side of equation $(2,i_{v_{l}},x)$. Next, suppose we subtract any equations in (73) other than equation $(2,i_{v_{l}},x)$ (each multiplied by an arbitrary constant) from equation $(2,i_{v_{l}},x)$. One can check that either of the following two cases holds for the resulting equation $(2,i_{v_{l}},x)$: (a) the coefficients on the right-hand side of equation $(2,i_{v_{l}},x)$ contain $x_{j_{q}}[2]\notin\mathcal{I}_{1}$, where $q\in\mathcal{X}$; or (b) the coefficient matrix on the right-hand side of equation $(2,i_{v_{l}},x)$ is of the form $\begin{bmatrix}0&\star\end{bmatrix}$. Again, we stack $\Phi_{k,i}^{r}\in\mathbb{R}^{1\times 2}$ for all $(k,i,r)\in\bar{\mathcal{I}}_{1}$ and $\Phi_{k,i}^{x}\in\mathbb{R}^{1\times 2}$ for all $(k,i,x)\in\bar{\mathcal{I}}_{1}$ into a matrix $\Phi(\mathcal{I}_{1})$, where we note that $\Phi_{k,i}^{r}$ is of the form $\begin{bmatrix}0&\star\end{bmatrix}$ for all $(k,i,r)\in\bar{\mathcal{I}}_{1}$. One can then see from the above arguments that for all $\Phi\in\mathbb{R}^{2\times 2}$ (if they exist) such that $(\Phi)_{1}$ and $(\Phi)_{2}$ can be obtained from algebraic operations among the rows in $\Phi(\mathcal{I}_{1})$, and the elements in $(\Phi)_{1}$ and $(\Phi)_{2}$ can be determined using the measurements from $\mathcal{I}_{1}$, $\operatorname{rank}(\Phi)\leq 1$ holds. It follows that $r_{\max}(\mathcal{I}_{1})<2$, i.e., the constraint $r_{\max}(\mathcal{I}_{1})=2$ in (12) does not hold. Using similar arguments to those above, one can further show that $r_{\max}(\mathcal{I})<2$ holds for all $\mathcal{I}\subseteq\mathcal{I}_{t_{1}:t_{2}}$ with $c(\mathcal{I})\leq m$, completing the proof of the converse direction of the above claim.

Hence, it follows directly from the above arguments that an algorithm for the PIMS problem can also be used to solve the X3C problem. Since X3C is NP-complete, we conclude that the PIMS problem is NP-hard.\hfill\square

7.3 Proof of Theorem 5.3

We prove the NP-hardness of the PEMS problem via a polynomial-time reduction from the knapsack problem, which is known to be NP-hard [9]. An instance of the knapsack problem is given by a set $D=\{d_{1},\dots,d_{\tau}\}$, a size $s(d)\in\mathbb{Z}_{>0}$ and a value $v(d)\in\mathbb{Z}_{>0}$ for each $d\in D$, and a budget $K\in\mathbb{Z}_{>0}$. The knapsack problem is to find $D^{\prime}\subseteq D$ such that $\sum_{d\in D^{\prime}}v(d)$ is maximized while satisfying $\sum_{d\in D^{\prime}}s(d)\leq K$.

Given any knapsack instance, we construct an instance of the PEMS problem as follows. Let $\mathcal{G}=\{\mathcal{V},\mathcal{E}\}$ be a graph that consists of a set of $n$ isolated nodes, with $n=\tau$ and $\mathcal{V}=[n]$. Set the weight matrix to be $A=\mathbf{0}_{n\times n}$, and set the sampling parameter as $h=1$. The time steps $t_{1}$ and $t_{2}$ are set to be $t_{1}=t_{2}=1$, i.e., only the measurements of $x_{i}[1]$ and $r_{i}[1]$ for all $i\in\mathcal{V}$ will be considered. The initial condition is set to satisfy $s_{i}[0]=0.5$, $x_{i}[0]=0.5$ and $r_{i}[0]=0$ for all $i\in\mathcal{V}$. The budget constraint is set as $B=K$. Let $\mathcal{C}_{1,i}=\{0,B+1\}$ and $\mathcal{B}_{1,i}=\{0,s(d_{i})\}$ for all $i\in\mathcal{V}$. The pmfs of the measurements $\hat{x}_{i}[1]$ and $\hat{r}_{i}[1]$ are given by Eqs. (34) and (35), respectively, with $N_{i}^{x}=N_{i}^{r}=v(d_{i})$ and $N_{i}=\max_{i\in\mathcal{V}}v(d_{i})$ for all $i\in\mathcal{V}$, where Assumption 5.1 is assumed to hold. Finally, let the prior pdf of $\beta\in(0,1)$ be a Beta distribution with parameters $\alpha_{1}=3$ and $\alpha_{2}=3$, and let the prior pdf of $\delta\in(0,1)$ also be a Beta distribution with parameters $\alpha_{1}=3$ and $\alpha_{2}=3$, where we take $\beta$ and $\delta$ to be independent. Noting that $\mathcal{C}_{1,i}=\{0,B+1\}$ in the PEMS instance constructed above, i.e., $\hat{x}_{i}[1]$ incurs a cost of $B+1>B$, we only need to consider the measurements $\hat{r}_{i}[1]$ for all $i\in\mathcal{V}$. Moreover, since $\mathcal{B}_{1,i}=\{0,s(d_{i})\}$, a corresponding measurement selection is then given by $\mu\in\{0,1\}^{\mathcal{V}}$. In other words, $\mu(i)=1$ if measurement $\hat{r}_{i}[1]$ is collected (with cost $s(d_{i})$), and $\mu(i)=0$ if measurement $\hat{r}_{i}[1]$ is not collected. We will see that there is a one-to-one correspondence between a measurement $\hat{r}_{i}[1]$ in the PEMS instance and an element $d_{i}\in D$ in the knapsack instance.

Consider a measurement selection $\mu\in\{0,1\}^{\mathcal{V}}$. Since $r_{i}[0]=0$ and $x_{i}[0]=0.5$ for all $i\in\mathcal{V}$, Eq. (1c) implies $r_{i}[1]=0.5h\delta$ for all $i\in\mathcal{V}$, where $h=1$. One then obtains from Eq. (28) and Eq. (37) the following:

F_{\theta}(\mu)=\frac{1}{0.5\delta(1-0.5\delta)}\begin{bmatrix}0&0\\ 0&0.25\end{bmatrix}\sum_{i\in\operatorname{supp}(\mu)}N_{i}^{r}\mu(i). \qquad (74)

Next, noting that $\beta$ and $\delta$ are independent, one can show via Eq. (31) that $F_{p}\in\mathbb{R}^{2\times 2}$ is diagonal, where one can further show that $(F_{p})_{11}=(F_{p})_{22}>0$ using the fact that the pdfs of $\beta$ and $\delta$ are Beta distributions with parameters $\alpha_{1}=3$ and $\alpha_{2}=3$. Similarly, one can obtain $\mathbb{E}_{\theta}[1/(0.5\delta(1-0.5\delta))]>0$. It now follows from Eq. (74) that

\mathbb{E}_{\theta}[F_{\theta}(\mu)]+F_{p}=\begin{bmatrix}z_{1}&0\\ 0&z_{1}+z_{2}\sum_{i\in\operatorname{supp}(\mu)}N_{i}^{r}\mu(i)\end{bmatrix}, \qquad (75)

where $z_{1},z_{2}\in\mathbb{R}_{>0}$ are constants (independent of $\mu$). Note that the objective in the PEMS instance is given by $\min_{\mu\in\{0,1\}^{\mathcal{V}}}f(\mu)$, where $f(\cdot)\in\{f_{a}(\cdot),f_{d}(\cdot)\}$. First, consider the objective function $f_{a}(\mu)=\operatorname{tr}(\bar{C}(\mu))$, where $\bar{C}(\mu)=(\mathbb{E}_{\theta}[F_{\theta}(\mu)]+F_{p})^{-1}$. We see from Eq. (75) that, over the selections $\mu\in\{0,1\}^{\mathcal{V}}$ satisfying the budget constraint, $\operatorname{tr}(\bar{C}(\mu))$ is minimized if and only if $\sum_{i\in\operatorname{supp}(\mu)}N_{i}^{r}\mu(i)$ is maximized. Similar arguments hold for the objective function $f_{d}(\mu)=\ln\det(\bar{C}(\mu))$. It then follows directly from the above arguments that a measurement selection $\mu^{\star}\in\{0,1\}^{\mathcal{V}}$ is an optimal solution to the PEMS instance if and only if $D^{\star}\triangleq\{d_{i}:i\in\operatorname{supp}(\mu^{\star})\}$ is an optimal solution to the knapsack instance. Since the knapsack problem is NP-hard, the PEMS problem is NP-hard.\hfill\square
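As a sanity check on this equivalence, the following Python sketch evaluates $f_{a}(\mu)$ on a toy instance and confirms by brute force that the minimizer of $f_{a}$ over budget-feasible selections is exactly a maximum-value knapsack solution (the values of $z_{1},z_{2}$, the item sizes and values, and the budget are illustrative; by Eq. (75) the particular positive values of $z_{1},z_{2}$ do not affect which selection is optimal):

```python
from itertools import combinations

z1, z2 = 1.0, 0.5            # the positive constants in Eq. (75)
v = [4, 3, 5, 1]             # knapsack values v(d_i) = N_i^r
s = [2, 2, 3, 1]             # knapsack sizes s(d_i)
K = 4                        # budget B = K

def f_a(sel):
    # trace of the inverse of the diagonal matrix in Eq. (75)
    return 1.0 / z1 + 1.0 / (z1 + z2 * sum(v[i] for i in sel))

feasible = [sel for r in range(len(v) + 1)
            for sel in combinations(range(len(v)), r)
            if sum(s[i] for i in sel) <= K]
best = min(feasible, key=f_a)
print(best, sum(v[i] for i in best))   # (0, 1) 7: the max-value feasible set
```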

7.4 Proof of Lemma 5.11

Noting the definition of $\gamma_{1}$ in Definition 5.5, we provide a lower bound on $\frac{\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}(f_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j}))}{f_{Pa}(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})}$ for all $\mathcal{A}\subseteq\bar{\mathcal{M}}$ and for all $\mathcal{Y}_{2}^{j}$, where we assume that $\mathcal{A}\setminus\mathcal{Y}_{2}^{j}\neq\emptyset$, since otherwise (45) would be satisfied for all $\gamma_{1}\in\mathbb{R}$. Recalling the definition of $f_{Pa}(\cdot)$ in (39), we lower bound $LHS\triangleq\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\big(f_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})\big)$ in the following manner:

LHS =\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\sum_{i=1}^{2}\frac{\lambda_{i}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))-\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))}{\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{i}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}
\geq\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\frac{\sum_{i=1}^{2}\big(\lambda_{i}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))-\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))\big)}{\lambda_{1}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{1}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))} \qquad (76)
=\frac{\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\operatorname{tr}(H_{y})}{\lambda_{1}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{1}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))}. \qquad (77)

To obtain (76), we let $z^{\prime}\in\arg\max_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\lambda_{1}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))$ and note that $\lambda_{1}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))\geq\lambda_{i}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))$ for all $i\in\{1,2\}$ and for all $y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}$. Next, we upper bound $f_{Pa}(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})$ in the following manner:

f_{Pa}(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j}) =\sum_{i=1}^{2}\frac{\lambda_{i}(F_{p}+H(\mathcal{A}\cup\mathcal{Y}_{2}^{j}))-\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))}{\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{i}(F_{p}+H(\mathcal{A}\cup\mathcal{Y}_{2}^{j}))}
\leq\frac{\sum_{i=1}^{2}\big(\lambda_{i}(F_{p}+H(\mathcal{A}\cup\mathcal{Y}_{2}^{j}))-\lambda_{i}(F_{p}+H(\mathcal{Y}_{2}^{j}))\big)}{\lambda_{2}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{2}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))} \qquad (78)
=\frac{\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}\operatorname{tr}(H_{y})}{\lambda_{2}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{2}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))}. \qquad (79)

To obtain (78), we note that $\lambda_{i}(F_{p}+H(\mathcal{A}\cup\mathcal{Y}_{2}^{j}))\geq\lambda_{2}(F_{p}+H(\mathcal{A}\cup\mathcal{Y}_{2}^{j}))\geq\lambda_{2}(F_{p}+H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j}))$ for all $i\in\{1,2\}$, where the second inequality follows from Lemma 5.10 together with the fact that $H(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-H(\{z^{\prime}\}\cup\mathcal{Y}_{2}^{j})\succeq\mathbf{0}$, with $z^{\prime}$ defined as above. Combining (77) and (79), and noting that $z_{j}\in\arg\min_{y\in\bar{\mathcal{M}}\setminus\mathcal{Y}_{2}^{j}}\frac{\lambda_{2}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p}+H(\{y\}\cup\mathcal{Y}_{2}^{j}))}$, we have

\frac{\sum_{y\in\mathcal{A}\setminus\mathcal{Y}_{2}^{j}}(f_{Pa}(\{y\}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j}))}{f_{Pa}(\mathcal{A}\cup\mathcal{Y}_{2}^{j})-f_{Pa}(\mathcal{Y}_{2}^{j})}\geq\frac{\lambda_{2}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{2}(F_{p}+H(\{z_{j}\}\cup\mathcal{Y}_{2}^{j}))}{\lambda_{1}(F_{p}+H(\mathcal{Y}_{2}^{j}))\,\lambda_{1}(F_{p}+H(\{z_{j}\}\cup\mathcal{Y}_{2}^{j}))}, \qquad (80)

which implies (62).\hfill\square
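The bound (80) can also be spot-checked numerically. The sketch below assumes that $f_{Pa}$ takes the trace-of-inverse form $f_{Pa}(\mathcal{Y})=\operatorname{tr}(F_{p}^{-1})-\operatorname{tr}((F_{p}+H(\mathcal{Y}))^{-1})$, which is consistent with the eigenvalue-difference expressions used above; the random matrices and index sets are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def rand_psd(dim=2):
    M = rng.normal(size=(dim, dim))
    return M @ M.T                      # random PSD matrix

def f_pa(Fp, H):
    # assumed trace-of-inverse form of f_Pa (see lead-in)
    return np.trace(np.linalg.inv(Fp)) - np.trace(np.linalg.inv(Fp + H))

Fp = rand_psd() + np.eye(2)             # positive definite prior term F_p
Hy = [rand_psd() for _ in range(5)]     # candidate matrices H_y, y in M-bar
Y = {0}                                 # current set Y_2^j
Acal = {1, 2, 3}                        # a candidate set A

HY = sum(Hy[i] for i in Y)
lhs = sum(f_pa(Fp, HY + Hy[y]) - f_pa(Fp, HY) for y in Acal - Y)
rhs = f_pa(Fp, HY + sum(Hy[i] for i in Acal - Y)) - f_pa(Fp, HY)

def lam_ratio(M):
    lam = np.linalg.eigvalsh(M)         # ascending: lam[0] = lambda_2, lam[1] = lambda_1
    return lam[0] / lam[1]

bound = lam_ratio(Fp + HY) * min(lam_ratio(Fp + HY + Hy[y])
                                 for y in range(len(Hy)) if y not in Y)
print(lhs / rhs >= bound)               # True: consistent with the bound in (80)
```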

References

  • [1] Centers for Disease Control and Prevention. Test for current infection. https://www.cdc.gov/coronavirus/2019-ncov/testing/diagnostic-testing.html, 2020.
  • [2] Protect Purdue: Purdue University’s response to COVID-19. https://protect.purdue.edu, 2020.
  • Ahn and Hassibi [2013] H. J. Ahn and B. Hassibi. Global dynamics of epidemic spread over complex networks. In Proc. Conference on Decision and Control, pages 4579–4585. IEEE, 2013.
  • Bendavid et al. [2020] E. Bendavid, B. Mulaney, N. Sood, S. Shah, E. Ling, R. Bromley-Dulfano, C. Lai, Z. Weissberg, R. Saavedra, J. Tedrow, et al. COVID-19 antibody seroprevalence in Santa Clara County, California. MedRxiv, 2020.
  • Bian et al. [2017] A. A. Bian, J. M. Buhmann, A. Krause, and S. Tschiatschek. Guarantees for greedy maximization of non-submodular functions with applications. In Proc. International Conference on Machine Learning, pages 498–507, 2017.
  • Chakrabarti et al. [2008] D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic thresholds in real networks. ACM Transactions on Information and System Security, 10(4):1–26, 2008.
  • Chepuri and Leus [2014] S. P. Chepuri and G. Leus. Sparsity-promoting sensor selection for non-linear measurement models. IEEE Transactions on Signal Processing, 63(3):684–698, 2014.
  • Cormen et al. [2009] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms. MIT press, 2009.
  • Garey and Johnson [1979] M. R. Garey and D. S. Johnson. Computers and intractability, volume 174. Freeman San Francisco, 1979.
  • Horn and Johnson [2012] R. A. Horn and C. R. Johnson. Matrix analysis. Cambridge University Press, 2012.
  • Hota et al. [2020] A. R. Hota, J. Godbole, P. Bhariya, and P. E. Paré. A closed-loop framework for inference, prediction and control of SIR epidemics on networks. arXiv preprint arXiv:2006.16185, 2020.
  • Joshi and Boyd [2008] S. Joshi and S. Boyd. Sensor selection via convex optimization. IEEE Transactions on Signal Processing, 57(2):451–462, 2008.
  • Kay [1993] S. M. Kay. Fundamentals of statistical signal processing: Estimation theory. Prentice Hall PTR, 1993.
  • Kempe et al. [2003] D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In Proc. international conference on Knowledge Discovery and Data mining, pages 137–146, 2003.
  • Khuller et al. [1999] S. Khuller, A. Moss, and J. S. Naor. The budgeted maximum coverage problem. Information Processing Letters, 70(1):39–45, 1999.
  • Krause and Guestrin [2005] A. Krause and C. Guestrin. A note on the budgeted maximization of submodular functions. Carnegie Mellon University. Center for Automated Learning and Discovery, 2005.
  • Krause et al. [2008] A. Krause, A. Singh, and C. Guestrin. Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. Journal of Machine Learning Research, 9(Feb):235–284, 2008.
  • Mei et al. [2017] W. Mei, S. Mohagheghi, S. Zampieri, and F. Bullo. On the dynamics of deterministic epidemic propagation over networks. Annual Reviews in Control, 44:116–128, 2017.
  • Mo et al. [2011] Y. Mo, R. Ambrosino, and B. Sinopoli. Sensor selection strategies for state estimation in energy constrained wireless sensor networks. Automatica, 47(7):1330–1338, 2011.
  • Newman [2002] M. E. Newman. Spread of epidemic disease on networks. Physical Review E, 66(1):016128, 2002.
  • Nowzari et al. [2016] C. Nowzari, V. M. Preciado, and G. J. Pappas. Analysis and control of epidemics: A survey of spreading processes on complex networks. IEEE Control Systems Magazine, 36(1):26–46, 2016.
  • Paré et al. [2018] P. E. Paré, J. Liu, C. L. Beck, B. E. Kirwan, and T. Başar. Analysis, estimation, and validation of discrete-time epidemic processes. IEEE Transactions on Control Systems Technology, 2018.
  • Pastor-Satorras et al. [2015] R. Pastor-Satorras, C. Castellano, P. Van Mieghem, and A. Vespignani. Epidemic processes in complex networks. Reviews of Modern Physics, 87(3):925, 2015.
  • Pezzutto et al. [2020] M. Pezzutto, N. B. Rossello, L. Schenato, and E. Garone. Smart testing and selective quarantine for the control of epidemics. arXiv preprint arXiv:2007.15412, 2020.
  • Preciado et al. [2014] V. M. Preciado, M. Zargham, C. Enyioha, A. Jadbabaie, and G. J. Pappas. Optimal resource allocation for network protection against spreading processes. IEEE Transactions on Control of Network Systems, 1(1):99–108, 2014.
  • Prem et al. [2020] K. Prem, Y. Liu, T. W. Russell, A. J. Kucharski, R. M. Eggo, N. Davies, S. Flasche, S. Clifford, C. A. Pearson, J. D. Munday, et al. The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study. The Lancet Public Health, 2020.
  • Pukelsheim [2006] F. Pukelsheim. Optimal design of experiments. SIAM, 2006.
  • Stoer and Bulirsch [2013] J. Stoer and R. Bulirsch. Introduction to numerical analysis, volume 12. Springer Science & Business Media, 2013.
  • Streeter and Golovin [2009] M. Streeter and D. Golovin. An online algorithm for maximizing submodular functions. In Proc. Advances in Neural Information Processing Systems, pages 1577–1584, 2009.
  • Summers et al. [2015] T. H. Summers, F. L. Cortesi, and J. Lygeros. On submodularity and controllability in complex dynamical networks. IEEE Transactions on Control of Network Systems, 3(1):91–101, 2015.
  • Tzoumas et al. [2020] V. Tzoumas, L. Carlone, G. J. Pappas, and A. Jadbabaie. LQG control and sensing co-design. IEEE Transactions on Automatic Control, pages 1–1, 2020. doi: 10.1109/TAC.2020.2997661.
  • Van Trees [2004a] H. L. Van Trees. Detection, estimation, and modulation theory, part I. John Wiley & Sons, 2004a.
  • Van Trees [2004b] H. L. Van Trees. Detection, estimation, and modulation theory, part IV: Optimum array processing. John Wiley & Sons, 2004b.
  • Vrabac et al. [2020] D. Vrabac, P. E. Paré, H. Sandberg, and K. H. Johansson. Overcoming challenges for estimating virus spread dynamics from data. In Proc. Conference on Information Sciences and Systems, pages 1–6. IEEE, 2020.
  • Ye and Sundaram [2019] L. Ye and S. Sundaram. Sensor selection for hypothesis testing: Complexity and greedy algorithms. In Proc. Conference on Decision and Control, pages 7844–7849. IEEE, 2019.
  • Ye et al. [2020] L. Ye, S. Roy, and S. Sundaram. Resilient sensor placement for Kalman filtering in networked systems: Complexity and algorithms. IEEE Transactions on Control of Network Systems, 7(4):1870–1881, 2020.
  • Ye et al. [2021] L. Ye, N. Woodford, S. Roy, and S. Sundaram. On the complexity and approximability of optimal sensor selection and attack for Kalman filtering. IEEE Transactions on Automatic Control, 66(5):2146–2161, 2021.