
Núcleo de Investigación y Desarrollo Tecnológico (NIDTEC), Facultad Politécnica, Universidad Nacional de Asunción, San Lorenzo C.P. 2169, Paraguay.

A Distributed Algorithm for Spectral Sparsification of Graphs with Applications to Data Clustering

Fabricio Mendoza-Granada and Marcos Villagra
Abstract

Spectral sparsification is a technique used to reduce the number of non-zero entries in a positive semidefinite matrix while changing its spectrum very little. In particular, the main application of spectral sparsification is the construction of sparse graphs whose spectra are close to that of a given dense graph. We study spectral sparsification under the assumption that the edges of a graph are allocated among sites that can communicate with one another. In this work we show that if a graph is allocated among several sites, the union of the spectral sparsifiers of each induced subgraph is a spectral sparsifier of the original graph. In contrast to other works in the literature, we present a precise computation of the approximation factor of the union of spectral sparsifiers and give an explicit calculation of the edge weights. We then present an application of this result to data clustering in the Number-On-Forehead model of multiparty communication complexity when the input data is allocated as a sunflower among the sites of the party.

1 Introduction

Spectral sparsification is a technique introduced by Spielman and Teng spielman:2011 that approximates a graph $G$ by a sparse graph $H$. The notion of approximation used by spectral sparsification is that the spectra of $H$ and $G$ must be close up to a constant factor. Batson, Spielman and Srivastava batson:12 proved that every graph $G$ has a spectral sparsifier with a number of edges linear in the number of vertices of $G$ and provided an algorithm achieving this bound. There are several algorithms in the literature that construct spectral sparsifiers of graphs, with a trade-off between running time and the number of edges of $H$. To the best of our knowledge, Lee and Sun lee:2018 have the best probabilistic algorithm for spectral sparsification, with an almost-linear running time, constructing spectral sparsifiers with $O(qn/\epsilon^2)$ edges, where $n$ is the number of vertices of $G$, $\epsilon$ is an approximation factor, and $q\geq 10$ is a constant.

There are situations where algorithms need to work with data that is not centralized but allocated across different sites. One way to deal with decentralized data is to design communication protocols so that the sites can communicate among themselves. The efficiency of a communication protocol can be measured by the number of bits shared among the sites; this measure is known as the communication complexity of the protocol nisan:93 . When data comes in the form of a graph, the number of edges greatly affects communication complexity, and hence computing spectral sparsifiers of graphs in distributed systems is of great importance.

In this work we present a distributed algorithm for spectral sparsification of graphs in the communication complexity model. In this model we are only interested in the communication costs among sites, and we assume that each site has arbitrary computational power. The idea behind this protocol is that, given an input graph $G$, spectral sparsifiers of induced subgraphs of $G$ are first computed at each site, and then any given site computes the union of these graphs, which results in a spectral sparsifier of $G$. Even though other works, like Chen et al. chen:16 , have used the idea of taking the union of spectral sparsifiers, they have not shown a precise calculation of the approximation factor. The main contribution of this work, presented in Theorem 3.1, is an estimation of the approximation factor and an explicit calculation of the edge weights in the union of spectral sparsifiers. In order to compute the approximation factor we introduce an idea that we call an “overlapping cardinality partition,” which is a way to partition the edge set of a graph with respect to the number of times each edge is allocated among the sites. The overlapping cardinality partition is a technical tool that allows us to express the Laplacian matrix of the union of induced subgraphs of $G$ as a linear combination of the Laplacian matrices of graphs induced from the partition.

In the second part of this paper, we present in Section 4 an application of Theorem 3.1 to distributed data clustering in the Number-On-Forehead model of communication complexity. In particular, if we assume the existence of a sunflower structure erdos:61 ; deza:74 ; kostochka:00 on the input data, we show how a communication protocol can detect the presence of the sunflower and take advantage of its kernel to reduce communication costs.

The rest of this paper is organized as follows. In Section 2 we present the main definitions and notation used throughout this work. In Section 3 we present the main result of this work, and in Section 4 we present our application to data clustering.

2 Preliminaries and Notation

In this section we introduce definitions and notation that will be used throughout this paper.

2.1 Spectral Graph Theory

Let $G=(V,E,w)$ be an undirected and weighted graph with $n$ vertices and $m$ edges. Let $\{E_i\}_{i\geq 1}$ be a family of subsets of $E$. We denote by $G_i=(V,E_i,w_i)$ the subgraph induced by $E_i$, where $w_i:E_i\rightarrow\mathbb{R}^+$ is defined as $w_i(e)=w(e)$ for all $e\in E_i$ and 0 otherwise. Every graph $G$ has an associated matrix called its Laplacian matrix, or simply Laplacian, which is defined as

$L_G = D_G - W_G,$

where $W_G$ is the weighted adjacency matrix and $D_G$ is the weighted degree matrix. We will omit the subindex $G$ from $L_G$, $W_G$, and $D_G$ when it is clear from the context.
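As a concrete illustration of this definition, the Laplacian of a small weighted graph can be assembled directly from its edge list. This is a minimal sketch; the vertex labelling and the edge-dictionary encoding are our own illustrative choices, not part of the paper:

```python
def laplacian(n, edges):
    """Build the Laplacian L = D - W of an undirected weighted graph.

    n     -- number of vertices, labelled 0 .. n-1
    edges -- dict mapping an edge (u, v) with u < v to a positive weight
    Returns L as a dense n x n matrix (list of lists).
    """
    L = [[0.0] * n for _ in range(n)]
    for (u, v), w in edges.items():
        L[u][v] -= w          # off-diagonal entries hold -W
        L[v][u] -= w
        L[u][u] += w          # weighted degrees accumulate on the diagonal
        L[v][v] += w
    return L
```

For the unit-weight triangle on three vertices this yields the familiar matrix with 2 on the diagonal and -1 off the diagonal; every row of a Laplacian sums to zero.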

The normalized Laplacian is defined as $\mathcal{L}=D^{-1/2}LD^{-1/2}$. The Laplacian (and the normalized Laplacian) is positive semidefinite (PSD), and its first eigenvalue $\lambda_1$ always equals zero, with multiplicity equal to the number of connected components of $G$ luxburg:07 . Indeed, if there exists a multicut of size $k$ in $G$, then the $k$-th smallest eigenvalue $\lambda_k$ of $L$ gives useful information for finding a multicut.

One of the fastest methods to approximate an optimal multicut in a graph is the so-called spectral clustering algorithm. This technique uses the $k$ eigenvectors of $L$ or $\mathcal{L}$ associated to the $k$ smallest eigenvalues to construct a matrix $X$ with the eigenvectors as columns; then it applies a simpler clustering algorithm (like $k$-means) to the rows of $X$ luxburg:07 . Lee, Gharan, and Trevisan lee2:2014 proved that $\lambda_k$ approximates the optimal value of a multicut of size $k$ in $G$ and that the eigenvectors give the corresponding partition of $V$.

2.2 Spectral Sparsification

Spectral sparsification is a technique used to reduce the density of a given PSD matrix while changing its spectrum only by a constant approximation factor. Given a matrix $M$, spectral sparsification constructs another matrix that is “similar” to $M$ in some well-defined way. We will use the notion of similarity defined in spielman:2011 . A subgraph $H$ of $G$ is called an $\epsilon$-spectral sparsifier of $G$ if for any $x\in\mathbb{R}^n$ we have that

$(1-\epsilon)\,x^T L_G x \leq x^T L_H x \leq (1+\epsilon)\,x^T L_G x.$

The importance of a spectral sparsifier lies in the sparsity of $L_H$; for example, some computations are easier over a sparse matrix. There are deterministic and probabilistic algorithms for finding spectral sparsifiers of a given graph. The algorithm of Batson, Spielman and Srivastava batson:12 is currently the best deterministic algorithm; it constructs a graph with $O(qn/\epsilon^2)$ edges in $O(qmn^{5/q}/\epsilon^{4+4/q})$ time, where $\epsilon$ is the approximation factor and $q\geq 10$ is a constant.
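The defining inequality can be tested numerically on sample vectors. Note that checking finitely many vectors is only a necessary condition (a certificate for all of $\mathbb{R}^n$ requires an eigenvalue argument); this sketch, with function names of our own choosing, is meant only to make the definition concrete:

```python
def quad_form(L, x):
    """Compute the quadratic form x^T L x for a dense matrix L."""
    n = len(x)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))

def is_spectral_sparsifier(LG, LH, eps, vectors):
    """Check (1-eps) x^T LG x <= x^T LH x <= (1+eps) x^T LG x
    on a finite sample of vectors (necessary condition only)."""
    for x in vectors:
        qg, qh = quad_form(LG, x), quad_form(LH, x)
        if not ((1 - eps) * qg <= qh + 1e-9 and qh <= (1 + eps) * qg + 1e-9):
            return False
    return True
```

For instance, the Laplacian of the unit triangle is trivially a sparsifier of itself for any $\epsilon\geq 0$, while the path obtained by deleting one triangle edge violates the inequality for small $\epsilon$ on the vector $(1,0,-1)$.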

3 A Distributed Algorithm for Spectral Sparsification

In this section we present our main result. In particular, given a graph $G$ and a family of induced subgraphs of $G$, we show that the union of spectral sparsifiers of the induced subgraphs is a spectral sparsifier of $G$. In contrast to other work, however, we give explicit bounds on the approximation factor and a construction of the new weight function.

First we introduce some definitions which will help us understand the overlapping of data among the sites. We denote by $[n]$ the set $\{1,2,\dots,n\}$.

Definition 1 (Occurrence Number)

Let $\mathcal{E}=\{E_1,\dots,E_t\}$ be a family of subsets of $[n]$. For any $a\in[n]$, the occurrence number of $a$ in $\mathcal{E}$, denoted $\#(a)$, is the number of sets of $\mathcal{E}$ in which $a$ appears.

Example 1

Let $n=7$ and $\mathcal{E}=\{\{1,2,3\},\{2,3,4\},\{4,5,1\},\{3,2,6\},\{4,7,1\},\{2,3\},\{5,6,7\},\{1,3,5\},\{2,4\}\}$. Here we have that $\#(1)=4$, $\#(2)=5$, $\#(3)=5$, and so on.∎
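The occurrence number is a one-line computation; the following sketch (function name ours) reproduces the values of Example 1:

```python
def occurrence_number(a, family):
    """#(a): the number of sets in the family that contain a."""
    return sum(1 for s in family if a in s)

# The family of Example 1.
E = [{1, 2, 3}, {2, 3, 4}, {4, 5, 1}, {3, 2, 6}, {4, 7, 1},
     {2, 3}, {5, 6, 7}, {1, 3, 5}, {2, 4}]
```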

Definition 2 (Overlapping Cardinality)

Let $\mathcal{E}=\{E_1,\dots,E_t\}$ be a family of subsets of $[n]$ for some fixed $n$, and let $E=\bigcup_{i=1}^t E_i$. The overlapping cardinality of a subset $E'\subseteq E$ in $\mathcal{E}$ is the positive integer $k$ such that every $a\in E'$ has occurrence number $\#(a)=k$; if no such $k$ exists, the overlapping cardinality of $E'$ in $\mathcal{E}$ is 0.

The overlapping cardinality identifies the common number of times the elements of a subset appear in a family of subsets.

Example 2

Let $n=7$ and $\mathcal{E}$ be as in Example 1. Here we have that $E=\bigcup_{i=1}^t E_i=[n]$. Now consider the sets $\{1,4\}$ and $\{1,2,3\}$.

  • The overlapping cardinality of $\{1,4\}$ in $\mathcal{E}$ is 4, because $\#(1)=\#(4)=4$.

  • The overlapping cardinality of $\{1,2,3\}$ in $\mathcal{E}$ is 0, because the occurrence number of one of the elements differs from the others, namely, $\#(1)=4$, $\#(2)=5$, and $\#(3)=5$.∎
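Definition 2 translates directly into code: collect the occurrence numbers of the subset's elements and return the common value if there is exactly one. A minimal sketch (function name ours), checked against Example 2:

```python
def overlapping_cardinality(subset, family):
    """Return k if every element of subset has occurrence number k, else 0."""
    counts = {sum(1 for s in family if a in s) for a in subset}
    return counts.pop() if len(counts) == 1 else 0

# The family of Examples 1 and 2.
E = [{1, 2, 3}, {2, 3, 4}, {4, 5, 1}, {3, 2, 6}, {4, 7, 1},
     {2, 3}, {5, 6, 7}, {1, 3, 5}, {2, 4}]
```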

Now we use the idea of overlapping cardinality to construct a partition of the union $E$ of a family $\mathcal{E}$ of subsets of $[n]$.

Definition 3 (Overlapping Cardinality Partition)

Given a family $\mathcal{E}$ as in Definition 2, an overlapping cardinality partition over $E$ on $\mathcal{E}$ is a partition $\{E_1',\dots,E_k'\}$ of $E$ where each $E_i'$ has overlapping cardinality $c_i$ on $\mathcal{E}$. We call the sequence $(c_1,c_2,\dots,c_k)$, with $1\leq c_1<c_2<\cdots<c_k$, the overlapping cardinalities over the family $\mathcal{E}$.

Example 3

Take $\mathcal{E}$ from Examples 1 and 2. An overlapping cardinality partition is

$\{\{6,7\},\{5\},\{1,4\},\{2,3\}\}.$

Here, $\{6,7\}$ has overlapping cardinality 2 because $\#(6)=\#(7)=2$. The subset $\{5\}$ has overlapping cardinality 3 because $\#(5)=3$. In Example 2 we saw that the subset $\{1,4\}$ has overlapping cardinality 4. Finally, the subset $\{2,3\}$ has overlapping cardinality 5 because $\#(2)=\#(3)=5$.∎
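The whole partition of Example 3 can be recovered by grouping the elements of $E$ by occurrence number. A sketch under the same conventions as the previous snippets:

```python
from collections import defaultdict

def overlapping_cardinality_partition(family):
    """Group the elements of E = union(family) by occurrence number,
    returning (c_j, E'_{c_j}) pairs ordered by increasing c_j."""
    E = set().union(*family)
    classes = defaultdict(set)
    for a in E:
        classes[sum(1 for s in family if a in s)].add(a)
    return sorted(classes.items())

# The family of Examples 1-3.
E = [{1, 2, 3}, {2, 3, 4}, {4, 5, 1}, {3, 2, 6}, {4, 7, 1},
     {2, 3}, {5, 6, 7}, {1, 3, 5}, {2, 4}]
```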

Our main technical lemma shows that the Laplacian of an input graph can be rewritten as a linear combination of Laplacians corresponding to induced subgraphs constructed from an overlapping cardinality partition of the set of edges.

For the rest of this section we make the following assumptions. Let $G=(V,E,w)$ be an undirected and weighted graph with a weight function $w:E\rightarrow\mathbb{R}^+$, let $\mathcal{E}=\{E_1,\dots,E_t\}$ be a collection of nonempty subsets of $E$ such that $\bigcup_{i=1}^t E_i=E$, and let $G_i=(V,E_i,w_i)$ be the subgraph induced by $E_i$, where $w_i:E_i\rightarrow\mathbb{R}^+$ satisfies $w_i(e)=w(e)$ for all $e\in E_i$ and 0 otherwise.

Lemma 1

If $1\leq c_1<c_2<\dots<c_k$ are the overlapping cardinalities over the family $\mathcal{E}$ with overlapping cardinality partition $\{E_{c_j}'\}_{j\leq k}$, then $\sum_{i=1}^t L_{G_i}=\sum_{j=1}^k c_j L_{G_{c_j}'}$, where $L_{G_{c_j}'}$ is the Laplacian of $G_{c_j}'=(V,E_{c_j}',w_{c_j}')$.

Proof

First notice that for every $e=xy\in E_{c_j}'$ there exists a subfamily of $\mathcal{E}$ of cardinality $c_j$ such that $e$ belongs to every member of it and its associated subgraph. Take any $xy\in E_{c_j}'$ for some $j\in\{1,\dots,k\}$. There exist $c_j$ induced subgraphs $G_{i_1},\dots,G_{i_{c_j}}$ of $G$ that have $xy$ as an edge, and all other induced subgraphs $G_{k_1},\dots,G_{k_\ell}$ do not have $xy$ as an edge, where $c_j+\ell=t$. This means that

$\sum_{i=1}^t L_{G_i}(x,y) = c_j \cdot L_{G_{c_j}'}(x,y) = -c_j \cdot w(x,y).$ (1)

Now, let $d_G(x)$ denote the degree of $x$ in $G$. We know that $d_G(x)=\sum_y w(x,y)$, where the sum is over all $y$ with $xy\in E$. Since $\{E_{c_j}'\}_{j\leq k}$ is a partition of $E$, we can rewrite the degree of $x$ as

$d_G(x)=\sum_{xy_{c_1}\in E_{c_1}'}w(x,y_{c_1})+\dots+\sum_{xy_{c_k}\in E_{c_k}'}w(x,y_{c_k}).$

Then the degree of $x$ in the graph $G_{c_j}'$ is

$L_{G_{c_j}'}(x,x)=\sum_{xy_{c_j}\in E_{c_j}'}w(x,y_{c_j})=d_{G_{c_j}'}(x).$

If we take an edge $xy_{c_j}\in E_{c_j}'$, where $x$ is fixed, we know that $xy_{c_j}$ appears only in the induced subgraphs $G_{i_1},\dots,G_{i_{c_j}}$, and hence we obtain

$\sum_{i=1}^t\left(\sum_{xy_{c_j}\in E_{c_j}'}w_i(x,y_{c_j})\right)=c_j\cdot d_{G_{c_j}'}(x).$ (2)

If we take another edge $uv\in E_{c_m}'$ with $m\neq j$, note that $uv$ does not belong to any of the graphs $G_{i_1},\dots,G_{i_{c_j}}$, and each Laplacian $L_{G_{i_1}},\dots,L_{G_{i_{c_j}}}$ has 0 in its $(u,v)$-entry. Therefore, adding $uv$ to Eq. (1) we have that

$\sum_{i=1}^t\left(L_{G_i}(x,y)+L_{G_i}(u,v)\right)=c_j\cdot L_{G_{c_j}'}(x,y)+c_m\cdot L_{G_{c_m}'}(u,v).$

Extending this argument to all equivalence classes in $\{E_{c_j}'\}_{j\leq k}$, for each off-diagonal entry $(x,y)$ with $xy\in E$ it holds that

$\sum_{i=1}^t L_{G_i}(x,y)=\sum_{j=1}^k c_j\cdot L_{G_{c_j}'}(x,y).$ (3)

A similar argument can be made for the diagonal entries with Eq. (2), thus obtaining

$\sum_{i=1}^t\left(\sum_{xy_{c_1}\in E_{c_1}'}w_i(x,y_{c_1})+\cdots+\sum_{xy_{c_k}\in E_{c_k}'}w_i(x,y_{c_k})\right)=\sum_{i=1}^t L_{G_i}(x,x)=\sum_{j=1}^k c_j\cdot L_{G_{c_j}'}(x,x).$ (4)

Equations (3) and (4) imply the lemma.∎
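Lemma 1 can be verified numerically on a small instance: build the Laplacians of the induced subgraphs, build the Laplacians of the graphs induced by the overlapping cardinality partition, and compare the two sides entrywise. A sketch under our own encoding assumptions (edges as ordered pairs, weights in a dictionary):

```python
def laplacian(n, edges, w):
    """Laplacian of the subgraph of [0..n-1] with the given edge set."""
    L = [[0.0] * n for _ in range(n)]
    for (u, v) in edges:
        L[u][v] -= w[(u, v)]; L[v][u] -= w[(u, v)]
        L[u][u] += w[(u, v)]; L[v][v] += w[(u, v)]
    return L

def lemma1_sides(n, family, w):
    """Return (sum_i L_{G_i}, sum_j c_j L_{G'_{c_j}}) as dense matrices."""
    E = set().union(*family)
    lhs = [[sum(laplacian(n, Ei, w)[x][y] for Ei in family)
            for y in range(n)] for x in range(n)]
    rhs = [[0.0] * n for _ in range(n)]
    # group edges by occurrence number c and add c * Laplacian of each class
    for c in {sum(1 for Ei in family if e in Ei) for e in E}:
        Ec = {e for e in E if sum(1 for Ei in family if e in Ei) == c}
        Lc = laplacian(n, Ec, w)
        for x in range(n):
            for y in range(n):
                rhs[x][y] += c * Lc[x][y]
    return lhs, rhs
```

On a triangle split into two overlapping edge sets the two sides agree exactly, as the lemma predicts.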

Now we use Lemma 1 to show that a spectral sparsifier of $\sum_{j=1}^k c_j L_{G_{c_j}'}$ is a spectral sparsifier of the Laplacian $L_G$ of the input graph $G$.

Theorem 3.1

Let $(1=c_1<c_2<\dots<c_k)$ be the overlapping cardinalities over the family $\mathcal{E}$, with $\{E_{c_j}'\}_{j\leq k}$ its associated overlapping cardinality partition, and let $L_{G_1},\dots,L_{G_t}$ be the Laplacians of $G_1,\dots,G_t$. If $H_i=(V,D_i,h_i)$ is an $\epsilon$-spectral sparsifier of $G_i$, then $H=(V,\bigcup_{i=1}^t D_i,h)$ is an $\epsilon'$-spectral sparsifier of $G$, where $h(e)=\frac{\sum_{i=1}^t h_i(e)}{c_1 c_k}$ and $\epsilon'\geq 1-\frac{1-\epsilon}{c_k}$.

Proof

Let $L_{H_i}$ be the Laplacian of $H_i$. By hypothesis, for every $i\in[t]$ and $x\in\mathbb{R}^V$,

$(1-\epsilon)\,x^T L_{G_i} x\leq x^T L_{H_i} x\leq(1+\epsilon)\,x^T L_{G_i} x.$

Then we may take the summation over all $i\in[t]$ to get

$(1-\epsilon)\sum_{i=1}^t x^T L_{G_i} x\leq\sum_{i=1}^t x^T L_{H_i} x\leq(1+\epsilon)\sum_{i=1}^t x^T L_{G_i} x.$ (5)

Now let us consider the left-hand side of Equation (5). Using Lemma 1 we get

$(1-\epsilon)\sum_{i=1}^t x^T L_{G_i} x=(1-\epsilon)\sum_{i=1}^k c_i\, x^T L_{G_{c_i}'} x\geq(1-\epsilon)\,c_1\sum_{i=1}^k x^T L_{G_{c_i}'} x=(1-\epsilon)\,c_1\, x^T L_G x,$ (6)

where the last equality follows from the fact that $\{E_{c_j}'\}_{j\leq k}$ is a partition of $E$. Similarly, for the right-hand side of Equation (5) we have that

$(1+\epsilon)\sum_{i=1}^t x^T L_{G_i} x\leq(1+\epsilon)\,c_k\, x^T L_G x.$ (7)

Therefore, multiplying equations (6) and (7) by $\frac{1}{c_1 c_k}$ we obtain

$(1-\epsilon)\frac{x^T L_G x}{c_k}\leq x^T L_H x\leq(1+\epsilon)\frac{x^T L_G x}{c_1},$

where $x^T L_H x=\left(\sum_{i=1}^t x^T L_{H_i} x\right)/(c_1 c_k)$.

To finish the proof, note that we want $1-\epsilon'\leq(1-\epsilon)/c_k$ and $(1+\epsilon)/c_1\leq 1+\epsilon'$ with $\epsilon\leq\epsilon'<1$. To this end, we choose $\epsilon'\geq 1-\frac{1-\epsilon}{c_k}$. First notice that $1-\epsilon'\leq 1-1+\frac{1-\epsilon}{c_k}=\frac{1-\epsilon}{c_k}$. Then, since $c_1=1$, we have that $\frac{1+\epsilon}{c_1}\leq\frac{1+\epsilon'}{c_1}=1+\epsilon'$. ∎

From Theorem 3.1, a distributed algorithm for computing spectral sparsifiers follows naturally: every site computes a spectral sparsifier of its own input and sends the result to a coordinator, which constructs the union of all spectral sparsifiers.
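The coordinator's merging step can be sketched as follows. The representation of each site's sparsifier as an edge-to-weight dictionary and the function name are our own illustrative choices; the weight rule $h(e)=\sum_i h_i(e)/(c_1 c_k)$ is the one given by Theorem 3.1:

```python
def union_sparsifier(sparsifiers, c1, ck):
    """Coordinator step: merge the sites' sparsifiers H_i = (V, D_i, h_i)
    into H = (V, union of D_i, h) with h(e) = sum_i h_i(e) / (c1 * ck).

    sparsifiers -- list of dicts mapping an edge (u, v) to its weight h_i(e)
    c1, ck      -- smallest and largest overlapping cardinalities of the family
    """
    h = {}
    for hi in sparsifiers:
        for e, we in hi.items():
            h[e] = h.get(e, 0.0) + we      # accumulate sum_i h_i(e)
    return {e: we / (c1 * ck) for e, we in h.items()}
```

For example, with two sites whose sparsifiers share the edge $(0,1)$ and $c_1=1$, $c_k=2$, the shared edge's weights are summed before the $1/(c_1 c_k)$ rescaling.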

4 Data Clustering in the Number-On-Forehead Model

In this section we will show an application of Theorem 3.1 to distributed data clustering in the Number-On-Forehead model of communication complexity for the case when the input data is allocated as a sunflower among sites.

Clustering is an unsupervised machine learning task that involves finding a partition of a given set of points $x_1,\dots,x_n\in\mathbb{R}^d$. Such a partition must fulfill two conditions: (i) every two points in the same set must be “similar” in some way, and (ii) every two points in different sets must be far from similar. Each equivalence class of the partition is also called a cluster. Clustering can be accomplished by different kinds of techniques, of which spectral clustering luxburg:07 is one of the fastest.

It is easy to see clustering as a graph problem, where each point corresponds to a vertex in a complete graph and the weight of each edge is interpreted as a similarity between points. Thus, finding a set of optimal clusters in the data is equivalent to finding an optimal multicut in a graph. Since the optimal multicut depends on the spectrum of the graph's Laplacian lee2:2014 and we want to keep communication costs low, each site must be capable of constructing sparse induced subgraphs of its own data while preserving the spectrum of its graph Laplacian.

In our communication protocol, each site is assigned an induced subgraph of $G$, and we want each site to be aware of all clusters in the data. Consequently, each site must be able to run a clustering algorithm on its own data, communicate its results to the other sites, and then use the exchanged messages to construct an approximation of the original graph $G$. This is where the distributed spectral sparsification algorithm is relevant.

First, we construct a protocol that verifies whether the input data across the sites forms a sunflower. If the input is indeed allocated in a sunflower structure, then the party can take advantage of the sunflower to find an approximation of the clusters in the data.

4.1 Models of Communication and their Complexity Measure

We introduce some standard notation from communication complexity; we refer the interested reader to the textbook by Kushilevitz and Nisan kushilevitz:97 for more details. Let $P_1,P_2,\dots,P_s$ be a set of sites, where site $P_j$ has an input $x_j\in\{0,1\}^r$, with $r$ a positive integer. In a multiparty communication protocol, with $s\geq 3$, the sites want to jointly compute a function $f:\{0,1\}^r\times\dots\times\{0,1\}^r\rightarrow Z$ for some finite codomain $Z$. In the Number-On-Forehead model of communication, or NOF model, each site has access to the other sites' inputs but not its own; that is, site $P_j$ has access to $(x_1,\dots,x_{j-1},x_{j+1},\dots,x_s)$. In order to compute $f$ the sites must communicate, and they do so by writing bits on a blackboard that can be accessed by all sites in the party. This is the so-called blackboard model of communication.

The maximum number of bits exchanged by the protocol over the worst-case input is the cost of the protocol. The deterministic communication complexity of the function $f$ is the minimum cost over all protocols that compute $f$.

Let $G=(V,E)$ be an input graph and $\{E_j\}_{j\leq s}$ a family of subsets of $E$. In order to study communication protocols for graph problems we assume that $E_j$ is the input data of site $P_j$. In the NOF model, we let $F_j=\{E_1,E_2,\dots,E_{j-1},E_{j+1},\dots,E_s\}$ be the family of edge sets that $P_j$ can access. Given a site $P_j$, the symmetric difference on $P_j$, denoted $\Delta_j$, is defined as the symmetric difference among all sets $P_j$ has access to, that is, the symmetric difference of the sets in $F_j$.

For the rest of this paper, we use $\mathcal{E}$ as shorthand for the family $\{E_1,\dots,E_s\}$ of subsets of the edge set $E$ of an input graph $G=(V,E)$ with $\bigcup_{i=1}^s E_i=E$, and $\mathcal{F}$ for the family $\{F_1,\dots,F_s\}$, where $F_j=\{E_1,\dots,E_{j-1},E_{j+1},\dots,E_s\}$. Here $\mathcal{F}$ captures the idea of the NOF model, where every site has access to the other sites' inputs but not its own.
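The quantity $\Delta_j$ is a fold of the symmetric difference over the sets visible to site $P_j$. A minimal sketch, with the family given as a list of Python sets and the function name our own:

```python
from functools import reduce

def delta_j(family, j):
    """Symmetric difference Delta_j of all sets site P_j can see,
    i.e. every E_i with i != j (NOF model)."""
    others = [s for i, s in enumerate(family) if i != j]
    return reduce(lambda a, b: a ^ b, others)   # ^ is set symmetric difference
```

For a sunflower with an even number of visible petals, the kernel cancels out of $\Delta_j$ and only petal elements remain, which is the situation exploited by Proposition 2 below.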

4.2 Sunflowers and NOF Communication

A sunflower or $\Delta$-system is a family of sets $\mathcal{A}=\{A_1,\dots,A_t\}$ where $A_i\cap A_j=\bigcap_{k=1}^t A_k=K$ for all $i\neq j$. We call $K$ the kernel of $\mathcal{A}$. The family $\mathcal{A}$ is a weak $\Delta$-system if $|A_i\cap A_j|=\lambda$ for all $i\neq j$, for some constant $\lambda$ kostochka:00 . It is known that if $\mathcal{A}$ is a weak $\Delta$-system and $|\mathcal{A}|\geq\ell^2-\ell+2$, where $\ell=\max_{i=1}^t|A_i|$, then $\mathcal{A}$ is a $\Delta$-system deza:74 .
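The sunflower condition can be checked directly from the definition: compute the common intersection $K$ and verify that every pairwise intersection equals it. A sketch with a function name of our own choosing:

```python
from itertools import combinations

def sunflower_kernel(family):
    """Return the kernel K if the family is a Delta-system, else None.

    A family is a sunflower when every pairwise intersection equals
    the common intersection of all its members."""
    K = set.intersection(*family)
    for A, B in combinations(family, 2):
        if A & B != K:
            return None
    return K
```

This brute-force check takes quadratically many intersections; it is meant only to make the definition concrete, not as a communication protocol.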

We start with a simple fact that ensures the existence of $\Delta$-systems with the same kernel in the NOF model when the input data of a communication protocol is allocated as a sunflower among the sites.

Lemma 2

If $s=|\mathcal{E}|\geq 3$ and $\mathcal{E}$ is a $\Delta$-system with kernel $K$, then every $F_i$ is a $\Delta$-system with kernel $K$.

The following lemma states a sufficient condition for the existence of a $\Delta$-system in the input data in the NOF model, with the requirement, however, that there be at least four sites.

Lemma 3

Let $s=|\mathcal{E}|\geq 4$. If $F_i$ is a $\Delta$-system for all $i\in[s]$, then $\mathcal{E}$ is a $\Delta$-system.

Proof

Suppose that $\mathcal{E}$ is not a $\Delta$-system; we will prove that $F_i$ is not a $\Delta$-system for some $1\leq i\leq s$.

Without loss of generality, suppose that exactly two sets $E_i$ and $E_j$ certify that $\mathcal{E}$ is not a $\Delta$-system; that is, $E_i\cap E_j=K'$ and, for any $a\neq i$ and $b\neq j$, it holds that $E_a\cap E_j=E_b\cap E_i=K$, with $K\neq K'$. Now take any $F_c$ with $c$ different from $i$ and $j$. Then $F_c$ cannot be a $\Delta$-system, because $E_i$ and $E_j$ belong to $F_c$ and there is at least one other set in $F_c$, since $|\mathcal{E}|\geq 4$.∎

Lemma 3 implies that, provided there are at least four sites, we only need to know whether all sites in a communication protocol have access to a $\Delta$-system to ensure that the entire family of input sets is a $\Delta$-system.

Proposition 1

There exists a protocol that verifies whether $\mathcal{E}$, with $|\mathcal{E}|\geq 4$, is a $\Delta$-system using $s-1$ bits of communication.

With Proposition 1, a multiparty communication protocol with $s\geq 4$ sites can check for the existence of a sunflower structure in its input data. Furthermore, if the input data is allocated among the sites as a sunflower, then, by Lemma 2, every site immediately knows the kernel of the sunflower.

4.3 Data Clustering with Sunflowers

In this section we present a NOF communication protocol that exploits the sunflower structure of the input data. We start by defining an overlapping coefficient of the edges of $G$, which can be seen as a measure of how well spread out the edges are among the sites.

Definition 4

The overlapping coefficient on site $P_j$ is defined as $\delta(j)=\frac{|\bigcap_{i\neq j}E_i|}{|\bigcup_{i\neq j}E_i|}$, and the greatest overlapping coefficient is defined as $\delta=\max_{j\in[s]}\delta(j)$.
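Definition 4 translates into a one-line ratio of intersection to union over the sets visible to $P_j$. A sketch, again with the family encoded as a list of Python sets and function names of our own choosing:

```python
def overlap_coefficient(family, j):
    """delta(j) = |intersection of E_i, i != j| / |union of E_i, i != j|."""
    others = [s for i, s in enumerate(family) if i != j]
    return len(set.intersection(*others)) / len(set.union(*others))

def greatest_overlap(family):
    """delta = max over j of delta(j)."""
    return max(overlap_coefficient(family, j) for j in range(len(family)))
```

For a sunflower the intersection in the numerator is exactly the kernel, so $\delta(j)=\lambda/|\bigcup_{i\neq j}E_i|$, the identity used in the proof of Proposition 2.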

The following proposition presents a simple protocol that makes every site aware of the entire input graph.

Proposition 2

Let $P_j$ be a site and let $\mathcal{E}$ be a weak $\Delta$-system with $|E_k|=\ell$ for $k=1,2,\dots,s$ and a kernel of size $\lambda$. Suppose that $s\geq\ell^2-\ell+3$. If site $P_j$ sends all the edges in $\Delta_j$, then every other site will know the entire graph $G$. The number of edges this communication protocol sends is at most $|\bigcup_{i\neq j}E_i|(1-\delta)+\ell$.

Proof

We prove this proposition by showing how each site constructs the graph $G$. First, a given site $P_j$ computes $\Delta_j$ and writes it on the blackboard. Since $s\geq\ell^2-\ell+3$, by the result of Deza deza:74 we know that $\mathcal{E}$ is a sunflower with kernel $K$, and by Lemma 2 this kernel is the same in all sites. At this point all sites $i\neq j$ know $\Delta_j$; therefore, they can construct $G$ on their own using the kernel $K$ of $\mathcal{E}$. In one more round, one of the sites $i\neq j$ writes $E_j$ so that site $P_j$ can also construct $G$.

In order to compute the communication cost of the protocol, first notice that $\delta=\lambda/|\bigcup_{i\neq j}E_i|=\lambda/(|\Delta_j|+\lambda)$, where we used the fact that the union of the edges over all sites equals the union of the symmetric difference and the kernel $K$. Then we have that $\delta|\Delta_j|=\lambda-\delta\lambda$, which implies $|\Delta_j|=\frac{\lambda-\delta\lambda}{\delta}=|\bigcup_{i\neq j}E_i|(1-\delta)$, where the last equality follows from the fact that $|\bigcup_{i\neq j}E_i|=\lambda/\delta$. Finally, after $E_j$ is written on the blackboard, the communication cost is $|\bigcup_{i\neq j}E_i|(1-\delta)+\ell$.∎

Theorem 4.1

Let $\mathcal{E}$ be a weak $\Delta$-system with $|E_k|=\ell$ for $k=1,2,\dots,s$, and suppose that $s\geq\ell^2-\ell+3$. There exists a communication protocol such that after two rounds of communication every site knows an $\epsilon$-spectral sparsifier of the entire graph $G$, with communication cost $O\left(\log\left(\frac{n}{\epsilon^2}\sqrt{1-\delta}\right)\right)$.

Proof

From deza:74 we know that $\mathcal{E}$ is a sunflower with a kernel $K$ of size $\lambda$ and, by Lemma 2, $K$ is the same in all sites. First, a site $P_j$ computes a spectral sparsifier $H_j=(V,\hat{\Delta}_j)$ of the induced subgraph $G_j=(V,\Delta_j)$ using the spectral sparsification algorithm of lee:2018 . This way we have that $|\hat{\Delta}_j|=O(n/\epsilon^2)$, where $0<\epsilon\leq 1/120$. Then site $P_j$ writes $\hat{\Delta}_j$ on the blackboard. Any other site $i\neq j$ constructs an $\epsilon$-spectral sparsifier $H_i'=(V,\hat{E}_j)$ of $G_i'=(V,E_j)$. By Theorem 3.1, the graph $H=(V,\hat{\Delta}_j\cup\hat{E}_j)$ is an $\epsilon'$-spectral sparsifier of $G$. In a second round, a given site $P_i$ writes $\hat{E}_j$ on the blackboard. Site $P_j$ then receives $\hat{E}_j$ and, by Theorem 3.1, can also construct an $\epsilon'$-spectral sparsifier of $G$. Finally, the communication complexity is upper-bounded by $O\left(\log\left(\frac{n}{\epsilon^2}(1-\delta)\right)+\log\left(\frac{n}{\epsilon^2}\right)\right)=O\left(\log\left(\frac{n}{\epsilon^2}\sqrt{1-\delta}\right)\right).$

Acknowledgement. We give our thanks to the reviewers of CTW 2020 for their comments that helped improve this paper. This work is supported by Conacyt research grants POSG17-62 and PINV15-208.

References

  • (1) Batson, J.D., Spielman, D.A., Srivastava, N.: Twice-ramanujan sparsifiers. In: Proceedings of the 41st annual ACM symposium on Theory of computing (STOC), pp. 255–262 (2009)
  • (2) Chen, J., Sun, H., Woodruff, D., Zhang, Q.: Communication-optimal distributed clustering. In: Proceedings of the 30th International Conference on Neural Information Processing Systems (NIPS), pp. 3727–3735 (2016)
  • (3) Deza, M.: Solution d’un problème de Erdös-Lovász. Journal of Combinatorial Theory, Series B 16(2), 166–167 (1974)
  • (4) Erdős, P., Ko, C., Rado, R.: Intersection theorems for systems of finite sets. Quarterly Journal of Mathematics 12(1), 313–320 (1961)
  • (5) Kostochka, A.: Extremal problems on Δ\Delta-systems. Numbers, Information and Complexity pp. 143–150 (2000)
  • (6) Kushilevitz, E., Nisan, N.: Communication complexity. Cambridge University Press (2006)
  • (7) Lee, J.R., Gharan, S.O., Trevisan, L.: Multiway spectral partitioning and higher-order cheeger inequalities. Journal of the ACM (JACM) 61(6), 37 (2014)
  • (8) Lee, Y.T., Sun, H.: Constructing linear-sized spectral sparsification in almost-linear time. SIAM Journal on Computing 47(6), 2315–2336 (2018)
  • (9) von Luxburg, U.: A tutorial on spectral clustering. Statistics and computing 17(4), 395–416 (2007)
  • (10) Nisan, N.: The communication complexity of threshold gates. Combinatorics, Paul Erdös is Eighty 1, 301–315 (1993)
  • (11) Spielman, D.A., Teng, S.H.: Spectral sparsification of graphs. SIAM Journal On Computing 40(4), 981–1025 (2011)