Traffic Rate Network Tomography with Higher-Order Cumulants
Abstract
Network tomography aims at estimating source-destination traffic rates from link traffic measurements. This inverse problem was formulated by Vardi in 1996 for Poisson traffic over networks operating under deterministic as well as random routing regimes. In this paper we expand Vardi’s second-order moment matching rate estimation approach to higher-order cumulant matching with the goal of increasing the column rank of the mapping and consequently improving the rate estimation accuracy. We develop a systematic set of linear cumulant matching equations and express them compactly in terms of the Khatri-Rao product. Both least squares estimation and iterative minimum I-divergence estimation are considered. We develop an upper bound on the mean squared error (MSE) in least squares rate estimation from empirical cumulants. We demonstrate for the NSFnet that supplementing Vardi’s approach with third-order empirical cumulant reduces its averaged normalized MSE relative to the theoretical minimum of the second-order moment matching approach by about . This minimum MSE is obtained when Vardi’s second-order moment matching approach is based on the theoretical rather than the empirical moments.
Index Terms:
Network traffic, network tomography, inverse problem, higher-order cumulants.I Introduction
Network tomography was formulated by Y. Vardi in the seminal paper [1]. The goal is to estimate the rates of traffic flows over source-destination pairs from traffic flows over links of the networks. Traffic flow rates are measured, for example, by the number of packets per second. We use to denote a vector of source-destination traffic flows, and to denote a vector of link traffic flows. Normally, . All source-destination traffic flows are assumed independent Poisson random variables. Thus, comprises a vector of independent Poisson random variables. Link traffic flow measurements are assumed passive and require no probes. For networks operating under a deterministic routing regime, traffic flows from each source node is routed to the destination node over a fixed path. Traffic flows over each link may originate from multiple source-destination traffic flows. Thus, we define a binary variable such that when traffic over source-destination passes through link and otherwise. We also define an binary routing matrix which is assumed known throughout the paper. It follows that . The expected value of the th Poisson source-destination traffic flow constitutes its traffic flow rate. Thus, the vector of source-destination traffic flow rates is the expected value of which we denote by .
The network tomography problem is that of estimating the rate vector from given realizations of . This is a rather challenging inverse problem since is an underdetermined set of equations, and the estimate of must be non-negative. Vardi invoked the terminology “LININPOS” for (LINear INverse POSitive problem). Maximum likelihood estimation of is not feasible since the components of are dependent Poisson random variables with no explicitly known distribution. The expectation-maximization (EM) algorithm is also not useful for this problem as it requires calculation of the conditional mean which in turns requires knowledge of the joint distribution of and . Vanderbei and Iannone [2] developed an EM algorithm in which is simulated. A common approach is to rely on the explicit form of for jointly Gaussian and , see, e.g., [1], [3], [4].
Vardi resorted to moment matching in which is estimated from a linear matrix equation relating the first two empirical moments of to the corresponding first two theoretical moments of . The approach is applicable to networks operating under deterministic as well as random routing regimes. In the latter case, there are multiple alternative paths for each source-destination pair which can be selected according to some probability law. Vardi invoked an iterative procedure for estimating which was previously developed for image deblurring [5]. This useful procedure will be discussed in Section II. Vardi’s work ignited extensive research in the areas of network inference and medical tomography.
In this paper we explore the benefits of supplementing Vardi’s second-order moment matching approach with higher-order empirical cumulants. Here too the cumulant matching equation is linear in , and the approach is applicable to networks operating under deterministic as well as random routing regimes. Higher-order cumulants introduce new useful information on the unavailable distribution of which can be leveraged in estimating the source-destination flow rates. By using a sufficient number of empirical cumulants, the linear mapping involved in the cumulant matching approach can achieve full column rank. In an ideal scenario where the cumulant matching equations are consistent, and the cumulants of the link traffic flows are accurately known, the rate vector can be recovered without error. As we demonstrate in Section V, this ideal situation is approachable when a sufficiently large number of realizations of are available.
A vast literature exists on network tomography. Several forms of network tomography have been studied in the literature depending on the type of measurement data and the network performance parameters (e.g., rate, delay) of interest: (i) source-destination path-level traffic rate estimation based on link-level traffic measurements [1, 6, 3, 4, 7, 8, 9]; (ii) link-level parameter estimation based on end-to-end path-level traffic measurements [2], [10, 11, 12, 13, 14]; (iii) network topology discovery from traffic measurements [15, 16, 17, 18]. Approaches to network tomography can also be characterized as active (e.g., [10, 11, 19, 20]), whereby explicit control traffic is injected into the network to extract measurements, or passive (e.g., [1, 3, 21, 14]), whereby the observation data is obtained from existing network traffic. A somewhat outdated survey circa 2003 can be found in [22]. A more recent survey [23] discusses network tomography in conjunction with network coding.
Closest to Vardi’s work is the contemporary work of Vanderbei and Iannone [2] which relies on a Poisson model for incoming traffic. The goal is to estimate the rate of traffic on each link connecting input and output nodes from traffic counts at the input and output nodes. Vanderbei and Iannone did not resort to moment matching but rather developed a simulation based EM algorithm for maximum likelihood estimation of the rates. A thorough Bayesian approach to Vardi’s problem was developed by Tebaldi and West [6] using Markov Chain Monte Carlo simulation. See also [9]. Another closely related work to Vardi’s problem appeared in [3], [4], [22] where maximum likelihood estimation of the source-destination rates from link data was implemented under a Gaussian rather than a Poisson traffic model. In [7], source-destination rates were estimated by utilizing spatiotemporal correlation of nominal traffic, and the fact that traffic anomalies are sparse. In [8], conditions for identifiability of higher order cumulants in estimation of source-destination traffic from link measurements were established. In [13] an algorithm was developed for choosing a set of linearly independent source-destination measurement paths from which additive link metrics are individually estimated.
The plan for this paper is as follows. In Section II we present the minimum I-divergence iterative procedure which plays an important role in this paper. In Section III we address rate estimation in networks with deterministic routing. We develop a new systematic set of cumulant matching equations for estimating the rate vector from up to the fourth-order empirical cumulant. We also develop an upper bound on the mean squared error (MSE) in least squares estimation of the rate vector from empirical cumulants. We conclude this section with a discussion on some implementation issues of the proposed cumulant matching approach. Complexity of the approach is discussed in Section III-D. In Section IV we discuss rate estimation in networks operating under a random routing regime. Numerical results for the NSFnet [24] are presented in Section V. Concluding remarks are given in Section VI. Technical details of the derivations are deferred to the Appendix.
II Minimum I-Divergence Iteration
Under the Poisson model for , the moment (and cumulant) matching approach results in a set of linear equations in . Generally speaking, the set of equations has the form of where is a vector of empirical moments (first and second-order in [1]) of , is a constant zero-one matrix that depends on but not on , and represents the corresponding theoretical moments.
To estimate that satisfies , Vardi proposed to use an iterative estimation approach which originated in the study of another inverse problem concerning image deblurring [5], [25]. Let denote a current estimate of the th component of , and let denote the new estimate of that component at the conclusion of the iteration. Let denote the th component of . Similarly, let denote the th component of . The iteration is given by
(1) |
for . This iteration is particularly suitable for solving positive inverse problems. It reaches a fixed point when the moment matching equation is satisfied. The iteration was studied by Snyder, Schulz and O’Sullivan [25] in a similarly formulated application of image deblurring. It was shown to monotonically decrease Csiszár’s I-divergence [26] between the original image convolved with the kernel, and the observed blurred image. The procedure turned out to be an EM iteration in the positron emission tomography problem, which follows a similar formulation as network tomography, but with the crucially facilitating difference that the Poisson components of are now independent random variables [27], [28]. This iteration will play a central role in our cumulant matching approach.
III Cumulant Matching in Deterministic Routing
In this section we present our cumulant matching approach and its theoretical performance bound for networks operating under a deterministic routing regime. The goal is to estimate the source-destination rate vector from realizations of the link traffic flow where and the routing matrix is known. Conditions for identifiability of were given in [1]. Specifically, the parameter is identifiable if all columns of are distinct and none is null.
III-A Cumulant Matching
Let where is any real random vector with finite mean and possibly dependent components. The vectorized th order central moment of is given by
(2) |
where is any positive integer and denotes the Kronecker product. When the length of is , the length of the vector is . For , it follows from the identity
(3) |
where are matrices of suitable dimensions [29, p. 408], that
(4) |
Expressions for the cumulants of the observed process in a state-space model were developed by Swami and Mendal [30]. The process in this paper is a particular case for which simpler expressions hold. We provide an alternative derivation for our particular case in the Appendix. Let denote the vectorized th central cumulant of any real random vector with finite mean. For , . For we have
(5) |
where is the covariance matrix of and is an permutation matrix defined in (A.70). We also have
(6) |
When is a diagonal matrix, as is the case of interest in this paper, it follows that is a symmetric matrix. We shall restrict our attention to cumulants of order since estimation of higher order cumulants is not practical.
Assume next that the components of are independent. For this special case of interest, the central moments and central cumulants of are conveniently and concisely expressed in terms of the Khatri-Rao product defined by
(7) |
It is shown in the Appendix that
(8) |
and
(13) |
where
(16) |
When are independent Poisson random variables, all cumulants of equal its rate , and we have from and (III-A)-(16) that,
(25) |
This is the key cumulant matching equation for . In (25), let denote the matrix of stacked Khatri-Rao products of , and let denote the vector of stacked cumulants. Then the rate vector satisfies
(26) |
To estimate , is replaced by a vector of empirical estimates and is estimated from
(27) |
In this paper we consider least squares estimation and the minimum I-divergence iterative procedure (1). The vector comprises minimum variance unbiased cumulant estimates given by the -Statistics which we discuss in the next section [31].
Vardi’s second-order moment matching equation can be summarized as . With only two moments, the column rank of the matrix may be too low to provide an accurate solution. Note that if the set of equations (27) is consistent, the theoretical cumulants are known, and is sufficiently large so that has full column rank, then can be estimated error free as the unique solution of (27). We demonstrate in Section V for the NSFnet [24] that is sufficient to achieve full column rank, and that the theoretical performance is approachable when the cumulants are estimated from a sufficiently large number of realizations of .
III-B Error Analysis
In this section we assess the MSE in the least squares estimation of satisfying (26) from the cumulant matching equation (27). We assume for this analysis that is sufficiently large so that the augmented matrix has full column rank.
-Statistics of scalar processes were developed in [31, p. 281]. For vector processes and , let and denote the -Statistics of and , respectively. Similarly, let and denote the -Statistics of and , respectively. Let denote a sequence of independent identically distributed source-destination traffic flows defined similarly to . Each comprised independent Poisson random variables with rate . Let . Define
(28) |
where
(29) |
and the empirical cumulant estimate for as
(30) |
Similar definitions follow for the process when in (28)-(30) is replaced by . The -Statistic for is given by
(31) |
for ,
(32) |
for ,
(33) |
and for ,
(34) |
where we have used (3). Similar relations are expected to hold for higher order cumulants. For and sufficiently large , the -Statistic and the empirical cumulant are essentially the same.
Let and denote vectors of stacked -statistics of order smaller than or equal to corresponding to and , respectively. The least squares estimate of in (27) is given by
(35) |
Let
(36) |
denote the error vector in estimating . From (26),
(37) |
Substituting (37) into (35) yields,
(38) |
Let . The rate estimation error is given by
(39) |
and the MSE is given by
(40) |
where the inequality follows from the Rayleigh quotient theorem 111 The Rayleigh quotient of a Hermitian matrix is defined as where is a vector of suitable dimension. [29, Theorem 8.1.4], and denotes the maximal eigenvalue of the Hermitian matrix . From the theory of singular value decomposition, we recall that for any matrix ,
(41) |
Applying this to (III-B) gives
(42) |
Let denote a block diagonal matrix with the th diagonal block being . There are diagonal blocks in total. Let
(43) |
From (36), (31)-(III-B), and (25),
(44) |
It follows from the Rayleigh quotient [29, Theorem 8.1.4] that
(45) |
where equals the maximal singular value of , and
(46) |
The th diagonal block of is given by
(47) |
Since the eigenvalues of the Kronecker product of two matrices equal the Kronecker product of the vectors of eigenvalues of each matrix [29, p. 412], we conclude that the eigenvalues of for any positive integer are all possible products of length of the eigenvalues of . In particular, since is positive semi-definite,
(48) |
Furthermore, since is block diagonal with blocks of increasing size, its eigenvalues are given by the union of the sets of eigenvalues associated with each block. Thus,
(49) |
When , the maximum in (49) is achieved for . Otherwise, when , the maximum in (49) is achieved for . Hence, from (45),
(50) |
From (III-B) and (III-B) we obtain,
(51) |
Evaluating the MSE is quite involved. For , and sufficiently large , the cross terms in will be approximately zero, and hence
(52) |
where is the MSE associated with the -statistics of the th component of . It is also the variance of . Since the vectors are assumed independent, we can infer from the variance of the -statistics of an IID sequence of scalar random variables given by [31, p. 291],
var | ||||
var | ||||
(53) |
When the IID random variables are Poisson with , all cumulants equal , and hence,
(54) |
(55) |
The upper bound on for , follows from (51), (52), (III-B) and (III-B).
The bound (51) is loose when the condition number of the matrix is large. It does have qualitative value as it shows that estimation of moments becomes increasingly harder as the order increases. Since usually in network tomography, the error in estimating is much larger than that in estimating .
III-C Implementation
In this section we address several aspects related to the implementation of the cumulant matching approach. The Khatri-Rao product can be expressed in terms of the rows of instead of its columns as in (7). Let denote the th row of , . Then where stack refers to the stacking of row vectors of equal length to form a matrix. We have
(56) |
etc., where denotes the Schur-Hadamard product and lexicographic ordering of the indices is assumed. Using this formulation, it is easy to see that contains duplicate rows. For example, in and in are duplicates since the elements of belong to . Additionally, may contain null rows as is easy to see from (7). Thus, equations in (25) corresponding to duplicate and null rows in can be removed. We shall always opt to removing duplicate rows that correspond to higher-order cumulants rather than to low order cumulants since higher-order cumulants are harder to estimate. To simplify notation we shall henceforth assume that and the right hand side (RHS) vector of (25) are given in their reduced form. Similarly, we shall consider (26) and (27) as given in their reduced form.
It is interesting to compare the number of rows in the original and reduced . The number of rows in the original is given by
(57) |
The maximum number of distinct rows in the reduced is counted as the sum of the maximum number of distinct rows contributed individually by , , , etc., with the last contribution from the -fold Khatri-Rao product of with itself. The contribution from the th term is given by , , which represents the number of unordered combinations of rows chosen without replacement from the given rows. Thus, the maximum number of distinct rows in the reduced is given by
(58) |
For example, consider a network with a routing matrix whose columns comprise all lexicographically ordered non-zero binary -tuples. Here, , and and are shown in Table I. Note that for this example, the number of reduced equations coincides with the row rank of for each . Furthermore, full column rank is achieved only when .
1 | 2 | 3 | 4 | |
row rank of |
We study two estimates of that approximate the cumulant matching equation (27). The first estimate follows from the iteration (1) which is initialized by a constant vector. The second estimate follows from least squares. Specifically, we use the unique Tikhonov regularized least squares solution for the inconsistent set of equations (27), when is not necessarily full column rank. This estimator is given by [32, p. 51]
(59) |
for some . Note that the regularized estimator applies to a skinny as well as a fat matrix .
To mitigate the effects of the error introduced by the empirical cumulant estimates, while allowing the estimator of to benefit from the higher order cumulants, the relative weights of the third and fourth order cumulants compared to the first and second order moments, can be reduced. This can be done by multiplying all equations in (59) with rows originating from by some , and all equations with rows originating from by some . This regularization approach was advocated by Vardi [1] in the context of his second-order moment matching approach. Following this approach, let the reduced and -weighted matrix be denoted by , and let the reduced -weighted vector be denoted by . Then, from (59), the rate vector is estimated from
(60) |
Note that the estimator (60) is not guaranteed to be non-negative. A non-negative estimate of can be obtained by using non-negative least squares optimization [33]. In our numeric examples we have arbitrarily substituted negative estimates with the value of . This approach resulted in substantially lower MSE compared to using the constrained optimization algorithm of [33, p. 161]. The performance of the algorithm remained essentially the same when other small values (e.g., or ) were used since occurrences of negative estimates are infrequent at our working point. See Table V.
III-D Complexity
The computational effort in the cumulant matching approach consists of the effort to construct and solve the set of equations (27). The number of equations in this set is . Construction of the left hand side of (27) requires operations. Construction of the right hand side of (27) requires operations where is the number vectors used to estimate each cumulant. Solving the equations requires effort that depends only on , and the number of iterations in (1). The combined effort is dominated by since must be large to produce meaningful cumulant estimates. Thus the computational effort of the cumulant matching approach is approximately linear in when is large which is always the case.
IV Random Routing
In a typical network, there are multiple paths connecting every source node with every destination node. When the network operates under a deterministic routing regime, a single fixed path is used for every source-destination pair. When the network operates under a random routing regime, a path from the source to destination nodes is selected according to some probability law. Vardi attributed Markovian routing for traffic on each source-destination pair. The accessible nodes and links for the given pair are represented by the states and transition probabilities of the Markov chain, respectively. The transition probabilities of the Markov chain determine the probability of each path with the same source-destination address.
Tebaldi and West argued that random routing can be viewed as deterministic routing in a super-network in which all possible paths for each source-destination address are listed [6]. This approach results in an expanded zero-one routing matrix with each column in the original routing matrix replaced by multiple columns representing the feasible paths for the source-destination pair. The Poisson traffic flow on a source-destination pair is thinned into multiple Poisson traffic flows with the multinomial thinning probabilities being the probabilities of the paths with the same source-destination address. Thus, the super network and the original network operate under similar statistical (Poisson) models. The thinned Poisson rates can now be estimated as was originally done, e.g., using cumulant matching, and each source-destination rate estimate can be obtained from the sum of the thinned rate estimates in that source-destination pair.
V Numerical Example
In this section we demonstrate the performance of our approach and the gain realized by using higher-order empirical cumulants222Our experiments were run on ARGO, a research computing cluster provided by the Office of Research Computing at George Mason University, VA. (URL: http://orc.gmu.edu).. We study the NSFnet [24] whose topology is shown in Fig. 1. The network consists of 14 nodes and 21 bidirectional links. Hence, it contains source-destination pairs.
This size network may represent a private network, a transportation network or a subset of interest of a large-scale network. The link weights in Fig. 1 are exclusively used to determine shortest paths for each source-destination pair. Otherwise, they play no role in the traffic rate estimation problem. To determine the shortest paths between a given source-destination pair, we used the shortest simple paths function from the NetworkX Python library, which is based on the algorithm of Yen [34]. When , the number of source-destination paths equals the number of source-destination pairs and the routing matrix is a matrix. The augmented routing matrix achieves full column rank when .
With , we can assign multiple paths to each source-destination pair and treat them as distinguishable new source-destination pairs. The routing matrix thus becomes fatter and using higher-order empirical cumulants becomes beneficial. For example, when , we have source-destination paths, the column rank of is and has full column rank. Thus, using this example we focus on third-order cumulant matching.
The network with and a routing matrix could also be seen as a super-network in the Tebaldi-West sense [6] for a network with source-destination pairs operating under a random routing regime with two possible paths per each source-destination pair. The accuracy of the rate estimation for the random routing network is determined by the accuracy of the rate estimation in the deterministic routing super-network. Thus, it suffices to focus on rate estimation in the deterministic routing network with the routing matrix .
The arrival rates in our experiment were generated randomly from the interval . In each of simulation runs, a rate vector was generated, and was subsequently used in generating statistically independent identically distributed Poisson vectors which were transformed into the vectors using the assumed known routing matrix . The statistically independent identically distributed Poisson vectors were used to generate the empirical cumulants using (29) when and (30) when . We experimented with in the range of to . The MSE in estimating the cumulants for the NSFnet with are given in Table II. Clearly, the MSE decreases monotonically with for .
The empirical cumulants were used to estimate the rate vector in the current run. For the cumulants regularization we have used and . Let and denote, respectively, the th component of and its estimate at the th run where and . For each estimate we evaluated the normalized MSE defined by
(61) |
and the averaged normalized MSE defined by
(62) |
The MSE in estimating is approximately when is sufficiently large.
Two rate estimators were used, the iterative estimator (1) and the least squares estimator (60). The iteration was initialized uniformly with all rates set to . It was terminated after iterations. The least squares regularization factor was set to .
0.0048 | 0.2578 | 14.9097 | |
0.0025 | 0.1285 | 7.4569 | |
0.0010 | 0.0515 | 2.9978 | |
0.0005 | 0.0257 | 1.4989 | |
0.0002 | 0.0131 | 0.7429 | |
0.0001 | 0.0052 | 0.2993 |
0.2292 | 0.0993 | 0.1008 | |
0.2292 | 0.0701 | 0.0694 | |
0.2292 | 0.0502 | 0.0467 | |
0.2292 | 0.0434 | 0.0381 | |
0.2292 | 0.0399 | 0.0335 | |
0.2292 | 0.0372 | 0.0298 | |
Theoretical | 0.2292 | 0.0353 | 0.0015 |
Rank | 21 | 162 | 182 |
0.3037 | 0.0951 | 0.0950 | |
0.3037 | 0.0637 | 0.0625 | |
0.3037 | 0.0430 | 0.0407 | |
0.3037 | 0.0358 | 0.0329 | |
0.3037 | 0.0321 | 0.0288 | |
0.3037 | 0.0293 | 0.0257 | |
Theoretical | 0.3037 | 0.0275 | 0.0000 |
Rank | 21 | 162 | 182 |
0.1890 | 4.1747 | 4.2275 | |
0.1868 | 2.9813 | 3.0626 | |
0.1890 | 1.9879 | 1.9901 | |
0.1890 | 1.6385 | 1.5780 | |
0.1890 | 1.3747 | 1.3242 | |
0.1890 | 1.1956 | 1.1165 | |
Theoretical | 0.1890 | 0.9692 | 0.0022 |
The performance of the iterative estimator (1) is demonstrated in Table III. The table shows the averaged normalized MSE in estimating the path rates using empirical cumulants estimated from vectors with ranging from to . The averaged normalized MSE in estimating the path rates using theoretical cumulants is shown as well. As can be observed in the table, monotonically decreases with for . For a given , the performance using cumulants is better than using moments, except for where the difference is only . Similar behavior can be seen in Table IV when the least squares estimator (60) is used for the path rate estimation. The percentage of negative rate estimates observed with the least squares estimator is shown in Table V. This percentage is negligible for all values of and and does not exceed . Comparing the two estimators (1) and (60), the least squares estimator appears superior as it consistently provides lower at the expense of negligible percentage of negative rate estimates.
To assess the effectiveness of the third-order cumulant matching approach and compare it with Vardi’s second-order moment matching approach, we contrast the performance of the former approach with the best performance of the latter approach as obtained by using the theoretical rather than the empirical moments. Let denote the averaged normalized MSE from (62) obtained when is estimated using empirical cumulants that were estimated from realizations of the vector . Let denote the averaged normalized MSE from (62) obtained by using theoretical cumulants, or effectively by using . Table VI shows and for and the estimators (1) and (60) discussed in Tables III–IV.
[Eq. (1)] | [Eq. (60)] | |
---|---|---|
1.1303 | 1.1673 | |
0.9490 | 1.0473 |
Table VI shows the relative averaged normalized MSE in second and third-order cumulant matching compared to the minimum averaged normalized MSE in theoretical second-order moment matching as obtained by using the true second-order moments. The reduction in the relative averaged normalized MSE obtained when the estimator (1) was used was from to , and from to when the estimator (60) was used. The third-order cumulants in this experiment were estimated using . This represents a reduction of about .
To show the accuracy in estimating individual path rates, let denote the normalized MSE from (61) when estimating the th component of using empirical cumulants estimated from realizations of . Figure 2 shows for when estimation was done using the Iteration (1) and . Excluding outliers, the figure shows that equals about of . The outliers correspond to rate components for which is very small so that the ratio is rather large.
VI Concluding Remarks
We have developed a framework for high-order cumulant matching approach for estimating the rates of source-destination Poisson traffic flows from link traffic flows. The approach is equally applicable to networks operating under deterministic as well as random routing strategies. Under independent Poisson source-destination traffic flows, the approach boils down to a set of linear equations relating empirical cumulants of the link measurements to the rate vector via a matrix involving Khatri-Rao products of the routing matrix. We studied an iterative minimum I-divergence approach and least squares estimation of the rate vector from the cumulant matching equations. We have established an upper bound on the MSE in least squares estimation of . The bound is useful for full column rank matrices with small condition number. We demonstrated the performance of the cumulant matching approach and compared it with Vardi’s best second-order moment matching on the NSFnet. We demonstrated that supplementing Vardi’s approach with the third-order empirical cumulant reduces its minimum averaged normalized MSE in rate estimation by . The minimum averaged normalized MSE is obtained when the theoretical rather than the empirical moments are used in Vardi’s second-order approach. The computational complexity of the approach is linear in the number of link traffic flows used to estimate the cumulants.
Appendix A Cumulants of Linear Maps
Let be an vector with a in the th component and zeros elsewhere. The vector can be written as . Substituting in (2), and applying the distributive property of the Kronecker product, we obtain,
(A.63) |
Similarly, the fourth order cumulant of is given by
(A.64) |
where
(A.65) |
Substituting (A) into (A.64) we obtain
(A.66) |
Row coincides with (A.63) and hence it equals . The expression in row can be written as
(A.67) |
For row ,
(A.68) |
and hence,
(A.69) |
For row we utilize the permutation matrix
(A.70) |
to obtain,
(A.71) |
Hence, row is given by
(A.72) |
We next express in terms of for . By substituting with in (III-A) we obtain,
(A.73) |
The first term in the RHS of (A) is given by (4). The second term can be expressed using (3) and (4) as,
(A.74) |
For the third term in (A) we use (3) and the well known identity [29, p. 410]
(A.75) |
to obtain
vec | ||||
(A.76) |
For the last term in (A), we use (3) and the relation
(A.77) |
to obtain
(A.78) |
Hence, from (A.75),
vec | ||||
(A.79) |
Combining these results we obtain (6).
Suppose now that the components of are independent random variables. Then,
(A.82) |
and
(A.83) |
The central moment follows from (4). Expressing
(A.84) |
we have
(A.85) |
Substituting (A) and (A) in (4), and using the orthonormality of the vectors , we obtain,
(A.86) |
It can similarly be shown that
(A.87) |
When the components of are independent random variables
(A.88) |
and
(A.91) |
Substituting (A.91) into (A) yields,
(A.92) |
To find we use (6) where
(A.93) |
as follows from (A.84)-(A), and is given in (A). Using orthonormality of the vectors as in (A), yields,
(A.98) |
which is (13). We conjecture that the same relation holds for all higher order cumulants.
References
- [1] Y. Vardi, “Network tomography: Estimating source-destination traffic intensities from link data,” J. Amer. Stat. Assoc., vol. 91, no. 433, pp. 365–377, 1996.
- [2] R. J. Vanderbei and J. Iannone, “An EM approach to OD matrix estimation,” Princeton University, Tech. Rep. SOR 94-04, 1994.
- [3] J. Cao, D. Davis, S. Vander Wiel, and B. Yu, “Time-varying network tomography: Router link data,” J. Amer. Statist. Assoc., vol. 95, pp. 1063–1075, 2004.
- [4] G. Liang and B. Yu, “Maximum pseudo likelihood estimation in network tomography,” IEEE Trans. Signal Process., vol. 51, no. 8, pp. 2043–2053, Aug. 2003.
- [5] W. H. Richardson, “Bayesian-based iterative method of image restoration,” J. Opt. Soc. Amer., vol. 62, pp. 55–59, Jan 1972.
- [6] C. Tebaldi and M. West, “Bayesian inference on network traffic using link count data (with discussion),” J. Amer. Stat. Assoc., vol. 93, no. 442, pp. 557–576, 1998.
- [7] M. Mardani and G. B. Giannakis, “Estimating traffic and anomaly maps via network tomography,” IEEE/ACM Trans. Netw., vol. 24, no. 3, pp. 1533–1547, June 2016.
- [8] H. Singhal and G. Michailidis, “Identifiability of flow distributions from link measurements with applications to computer networks,” Inverse Problems, vol. 23, pp. 1821–1849, 2007.
- [9] M. L. Hazelton, “Network tomography for integer-valued traffic,” The Annals of Applied Statistics, vol. 9, no. 1, pp. 474–506, 2015.
- [10] R. Cáceres, N. G. Duffield, J. Horowitz, and D. F. Towsley, “Multicast-based inference of network internal loss characteristics,” IEEE Trans. Inf. Theory, vol. 45, no. 7, pp. 2462–2480, Nov. 1999.
- [11] F. Lo Presti, N. G. Duffield, J. Horowitz, and D. Towsley, “Multicast-based inference of network-internal delay distributions,” IEEE/ACM Trans. Netw., vol. 10, no. 6, pp. 761–775, Dec. 2002.
- [12] Y. Tsang, M. J. Coates, and R. Nowak, “Network delay tomography,” IEEE Trans. Signal Process., vol. 51, pp. 2125–2136, 2003.
- [13] L. Ma, T. He, K. K. Leung, D. Towsley, and A. Swami, “Efficient identification of additive link metrics via network tomography,” in IEEE 33rd Int. Conf. Distributed Computing Systems, Philadelphia, PA, 2013, pp. 581–590.
- [14] N. Etemadi Rad, Y. Ephraim, and B. L. Mark, “Delay network tomography using a partially observable bivariate Markov chain,” IEEE/ACM Trans. Netw., vol. 25, no. 1, pp. 126–138, Feb. 2017.
- [15] X. Jin, W.-P. Yiu, S.-H. Chan, and Y. Wang, “Network topology inference b ased on end-to-end measurements,” IEEE J. Sel. Areas Commun., vol. 24, no. 12, pp. 2182–2195, Dec. 2006.
- [16] B. Eriksson, G. Dasarathy, P. Barford, and R. Nowak, “Toward the practical use of network tomography for Internet topology discovery,” in Proc. IEEE INFOCOM, San Diego, CA, USA, March 2010, pp. 1–9.
- [17] J. Ni, H. Xie, S. Tatikonda, and Y. R. Yang, “Efficient and dynamic routing topology inference from end-to-end measurements,” IEEE/ACM Trans. Netw., vol. 18, no. 1, pp. 123–135, Feb. 2010.
- [18] A. Anandkumar, A. Hassidim, and J. Kelner, “Topology discovery of sparse random graphs with few participants,” in ACM SIGMETRICS’11, San Jose, CA, USA, June 2011.
- [19] R. Kumar and J. Kaur, “Practical beacon placement for link monitoring using network tomography,” IEEE J. Sel. Areas Commun., vol. 24, no. 12, pp. 2196–2209, Dec. 2006.
- [20] M. G. Rabbat, M. J. Coates, and R. D. Nowak, “Multiple-source Internet tomography,” IEEE J. Sel. Areas Commun., vol. 24, no. 12, pp. 2221–2234, Dec. 2006.
- [21] H. Yao, S. Jaggi, and M. Chen, “Passive network tomography for erroneous networks: A network coding approach,” IEEE Trans. Inf. Theory, vol. 58, no. 9, pp. 5922–5940, Sep. 2012.
- [22] R. Castro, M. Coates, G. Liang, R. Nowak, and B. Yu, “Network tomography: Recent developments,” J. Stat. Sci., vol. 19, no. 3, pp. 499–517, 2004.
- [23] P. Qin, B. Dai, B. Huang, G. Xu, and K. Wu, “A survey on network tomography with network coding,” IEEE Commun. Surveys Tuts., vol. 16, no. 4, pp. 1981–1995, 2014.
- [24] L. H. Bonani, “Modeling an optical network operating with hybrid-switching paradigms,” J. Microw. Optoelectron. Electromagn. Appl., vol. 15, no. 4, pp. 275–292, 2016.
- [25] D. L. Snyder, T. J. Schulz, and J. A. O’Sullivan, “Deblurring subject to nonnegativity constraints,” IEEE Trans. Sig. Proc., vol. 40, 1992.
- [26] I. Csiszár, “Why least squares and maximum entropy? an axiomatic approach to inference for linear inverse problems,” The Annals ofStatistics, vol. 19, no. 4, pp. 2032–2066, Dec. 1991.
- [27] L. A. Shepp and Y. Vardi, “Maximum likelihood reconstruction for emission tomography,” IEEE Trans Med Imaging, vol. 1, pp. 113–122, 1982.
- [28] Y. Vardi, “Applications of the EM algorithm to linear inverse problems with positivity constraints,” in Image Models (and their Speech Cousins), S. E. Levinson and L. A. Shepp, Eds. Springer, 1966, pp. 183–198.
- [29] P. Lancaster and M. Tismenetsky, The Theory of Matrices, 2nd ed. Orlando, Florida: Academic Press, 1985.
- [30] A. Swami and J. M. Mendel, “Time and lag recursive computation of cumulants from a state-space model,” IEEE Trans. Automatic Control, vol. 35, no. 1, pp. 4–17, Jan. 1990.
- [31] M. G. Kendall and A. Stuart, The Advanced Theory of Statistics. Hafner Publishing Company, 1969.
- [32] T. Kailath, A. H. Sayed, and B. Hassibi, Linear Estimation, 1st ed. New Jersey: Prentice Hall, 2000.
- [33] C. L. Lawson and R. J. Hanson, Solving Least Squares Problems. New Jersey: Prentice-Hall, 1974.
- [34] J. Y. Yen, “Finding the k shortest loopless paths in a network,” Management Science, vol. 17, no. 11, pp. 712–716, Jul. 1971.