
Incremental Measurement of Structural Entropy for Dynamic Graphs

Runze Yang, Hao Peng ([email protected]), Chunyang Liu, Angsheng Li

3.1.5 Limitations

The limitations of the naive adjustment strategy are listed below. First, this strategy cannot deal with multiple incremental edges at the same time, e.g., a new node appearing connected to two different existing nodes. An alternative solution is to arrange all incremental edges at a certain time stamp into a sequence with which we can add the edges one by one while keeping the connectivity of the graph (see the sketch below). In this way, the community of a newly introduced node is inevitably related to the input order of the incremental edges. Second, it cannot handle edge or node deletions. Third, the community of the existing nodes remains unchanged, which is sub-optimal in most cases.
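As an illustration of this ordering workaround, the following Python sketch arranges the incremental edges of one time stamp so that every edge, when added, shares at least one endpoint with the already-connected part of the graph. The function name and interface are our own illustration, not part of the paper's algorithms.

def order_incremental_edges(existing_nodes, incremental_edges):
    # Greedily pick any pending edge that touches the current node set,
    # so that adding edges in the returned order keeps the graph connected.
    existing = set(existing_nodes)
    pending = list(incremental_edges)
    ordered = []
    while pending:
        for idx, (u, v) in enumerate(pending):
            if u in existing or v in existing:
                existing.update((u, v))
                ordered.append(pending.pop(idx))
                break
        else:
            raise ValueError("remaining edges never touch the graph")
    return ordered

# A new node 7 arriving with two edges is ordered so it first attaches
# to the existing graph; its community then depends on this order.
print(order_incremental_edges({1, 2, 3}, [(7, 1), (7, 2)]))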

[…] as the best community for a target node, i.e., if the target node moves into its OPC, the overall two-dimensional structural entropy must be the lowest compared with moving it into any community other than the OPC.

Further Discussion between the Two Dynamic Adjustment Strategies

Similarities: (1) Both dynamic adjustment strategies are designed to incrementally change the original two-dimensional encoding trees to adapt to the incremental edges and nodes in dynamic scenarios. (2) The time complexities for computing the updated structural entropy under both strategies are significantly lower than that of the original calculation formula (detailed analysis is shown in Section 4.3). (3) Neither strategy can handle the birth of new communities or the dissolution of existing communities.

Differences: (1) The focuses of the two strategies are different. The naive adjustment strategy emphasizes theoretical analysis, such as boundedness and convergence analysis, and acts as a fast incremental baseline in experimental evaluations. By contrast, the node-shifting adjustment strategy mainly focuses on addressing the limitations of the naive strategy (Section 3.1.5) and on dynamically optimizing the existing communities towards a lower structural entropy. (2) The two strategies update encoding trees, i.e., community partitionings, in different ways. In the naive adjustment strategy, new edges do not change the communities of the existing nodes, and new nodes are assigned to their direct neighbors' communities. In the node-shifting adjustment strategy, the influence of new edges on community adjustment is taken into account, and the communities of new nodes are also determined by the incremental edges. (3) The time complexity of the naive adjustment strategy is fixed, while that of the node-shifting strategy grows nearly linearly with the iteration number N. Experiments show that the naive strategy is faster than the node-shifting strategy with N ≥ 5 in most cases (Fig. 13).

Definitions

In this part, we present the definitions of Structural Data, Structural Expressions, and Adjustment, which will be employed in subsequent sections.

5 Extensions on More Complex Graphs

In this section, we discuss the feasibility of extending our methods to weighted or directed graphs. First, we argue that the method for undirected weighted graphs can be extended naturally from that for undirected unweighted graphs. Second, we analyze the fundamental differences between the paradigm of incremental structural entropy computation for directed graphs and that for undirected graphs, and we present new methods for incrementally calculating the one-dimensional structural entropy of directed weighted graphs.

5.1 Undirected Weighted Graphs

The incremental measurement method of structural entropy for undirected weighted graphs can be intuitively and easily extended from the methods for undirected unweighted graphs proposed earlier. In the following, we first introduce the definition of the two-dimensional structural entropy of undirected weighted graphs. Then we update the definition of the Adjustment and propose an incremental formula for structural entropy computation under the new circumstance.

Two-dimensional structural entropy of undirected weighted graphs. An undirected weighted graph can be denoted as G_W=(𝒱, ℰ, w), where w(e)∈ℝ⁺ represents the weight of e∈ℰ. For any e∉ℰ, let w(e)=0. According to Li and Pan [6], the definition of the encoding tree for an undirected weighted graph remains unchanged, and the two-dimensional structural entropy of G_W=(𝒱, ℰ, w) by its two-dimensional encoding tree 𝒯 is defined as
H^{\mathcal{T}}(G_{W})=\sum_{\alpha_{i}\in A}(-\frac{g_{\alpha_{i}}}{V_{\lambda}}\log\frac{V_{\alpha_{i}}}{V_{\lambda}}+\sum_{v_{j}\in T_{\alpha_{i}}}-\frac{d_{j}}{V_{\lambda}}\log\frac{d_{j}}{V_{\alpha_{i}}}), (57)
where the new degree d_j = Σ_{v_i∈𝒩(v_j)} w(v_j, v_i) (𝒩(v) denotes the set of neighbors of node v), the new volume V_α = Σ_{v_i∈T_α} d_i, and the new cut edge number g_α is replaced with the sum of the weights of the edges with exactly one endpoint in T_α.

Adjustment definition of undirected weighted graphs. Due to the continuity of edge weights, the original node-level and graph-level Adjustment definitions (Definition 11) no longer apply. We redefine the Adjustments of the two levels as follows:

  1. (node level) the total weight change δ(v_i) = d′_i − d_i for each node v_i ∈ φ_λ;

  2. (graph level) the total weight change of the whole graph, denoted by δ_V(λ).

Incremental formula for structural entropy computation. According to Eq. (57), the incremental computation formula for undirected weighted graphs can then be formulated as (extended from Eq. (55)):

H^{\mathcal{T}^{\prime}}(G^{\prime}_{W})=-\frac{1}{V_{\lambda}+\delta_{V}(\lambda)}[\hat{S}^{\prime}_{N}+\hat{S}^{\prime}_{C}+\hat{S}^{\prime}_{G}\log(V_{\lambda}+\delta_{V}(\lambda))], (58)

where

\hat{S}^{\prime}_{N}=\hat{S}_{N}+\sum_{v_{k}\in\phi_{\lambda}}[(d_{k}+\delta(v_{k}))\log(d_{k}+\delta(v_{k}))-d_{k}\log d_{k}],
\hat{S}^{\prime}_{C}=\hat{S}_{C}+\sum_{\alpha\in\mathcal{A}}[(g_{\alpha}+\delta_{g}(\alpha)-V_{\alpha}-\delta_{V}(\alpha))\log(V_{\alpha}+\delta_{V}(\alpha))-(g_{\alpha}-V_{\alpha})\log V_{\alpha}],
\hat{S}^{\prime}_{G}=\hat{S}_{G}-\sum_{\alpha\in\mathcal{A}}\delta_{g}(\alpha). (59)

Here Ŝ_N, Ŝ_C, and Ŝ_G denote the weighted counterparts of the statistics S_N, S_C, and S_G in Definition 5, computed with the new degrees, volumes, and cut weights.
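To make the weighted update concrete, here is a minimal Python sketch of Eqs. (58)-(59). It assumes the statistics Ŝ_N, Ŝ_C, Ŝ_G of the original weighted graph have been saved; the dictionary-based interface and all names are our own illustration.

import math

def xlog2(x):
    # convention: 0 * log(0) = 0, so brand-new nodes contribute nothing
    return x * math.log2(x) if x > 0 else 0.0

def updated_weighted_se(V_total, dV_total, S_N, S_C, S_G,
                        d, V, g, node_deltas, comm_deltas):
    for v, dv in node_deltas.items():                  # update S'_N
        old = d.get(v, 0.0)
        S_N += xlog2(old + dv) - xlog2(old)
    for a, (dVa, dg) in comm_deltas.items():           # update S'_C and S'_G
        S_C += ((g[a] + dg - V[a] - dVa) * math.log2(V[a] + dVa)
                - (g[a] - V[a]) * math.log2(V[a]))
        S_G -= dg
    new_vol = V_total + dV_total                       # V_lambda + delta_V(lambda)
    return -(S_N + S_C + S_G * math.log2(new_vol)) / new_vol   # Eq. (58)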
5.2 Directed Weighted Graphs

The main method proposed in this paper is difficult to transfer to directed graph scenarios, since the measurement of the structural entropy of directed graphs is fundamentally different from that of undirected graphs. The key difference is that a directed graph needs to be transformed into a transition matrix whose stationary distribution must be computed. In this part, we briefly present an incremental scheme for measuring the one-dimensional structural entropy of directed weighted graphs, since the incremental computation of two-dimensional structural entropy is quite complex. Specifically, we first define the directed weighted graph and its non-negative matrix representation. After that, we introduce the formula of the structural entropy of directed weighted graphs [6]. Finally, we review the traditional methods, namely Eigenvector Calculation and Global Aggregation, for accurately or approximately calculating the one-dimensional structural entropy of directed weighted graphs, and then propose an incremental iterative approximation algorithm, i.e., Local Propagation.

5.2.1 Directed Weighted Graph and Non-Negative Matrix

Definition 12 (Directed Weighted Graph).

A directed weighted graph can be denoted as G_DW=(𝒱, ℰ, f), which satisfies the following properties: 1. 𝒱={v_1, v_2, ..., v_N} is the node set; 2. ℰ denotes the set of directed edges e=(v_i, v_j); 3. for each directed edge e=(v_i, v_j)∈ℰ, f(e)=f(v_i, v_j)>0 is the edge weight from v_i to v_j. For e=(v_i, v_j)∉ℰ, we denote f(e)=f(v_i, v_j)=0.

The definition of directed weighted graphs is shown in Definition 12. Given a directed weighted graph G_DW=(𝒱, ℰ, f) with N nodes, we fix the nodes in 𝒱 in an order v_1, v_2, ..., v_N. For i, j∈{1, 2, ..., N}, we define a_ij=f(v_i, v_j). Then we can obtain a non-negative matrix 𝐀=(a_ij), named a matrix of G_DW. In other words, a directed weighted graph can be represented by a non-negative matrix. The definition of the non-negative matrix representation of directed weighted graphs is presented in Definition 13.

Definition 13 (Directed Weighted Graph Represented by Non-Negative Matrix).

A directed weighted graph G_DW=(𝒱, ℰ, f) can be denoted as an N×N non-negative matrix 𝐀=(a_ij)∈ℝ^{N×N}_{≥0}, where for each i, j∈{1, 2, ..., N}, a_ij=f(v_i, v_j)≥0.

5.2.2 One-Dimensional Structural Entropy of Directed Weighted Graphs

For directed graphs, the structural entropy is an uncertainty measurement defined on strongly connected graphs, whose matrices must be irreducible (Definition 14).

Definition 14 (Irreducible Matrix).

Given a non-negative matrix 𝐀∈ℝ^{N×N}_{≥0}, if there exists a permutation matrix 𝐏 such that

\mathbf{P}\mathbf{A}\mathbf{P}^{T}=\begin{bmatrix}\mathbf{X}&\mathbf{Y}\\ \mathbf{0}&\mathbf{Z}\end{bmatrix}, (60)

where 𝐗 and 𝐙 are square matrices, then 𝐀 is said to be a reducible matrix; otherwise 𝐀 is an irreducible matrix.

Normalize each row in an irreducible matrix 𝐀, i.e., for each i and j, let
b_{ij}=\frac{a_{ij}}{\sum_{k=1}^{N}a_{ik}}. (61)
Then we can get a normalized irreducible matrix 𝐁=(b_ij) in which each row sums to 1. b_ij is called the normalized edge weight. There is a fundamental theorem for normalized irreducible matrices like 𝐁.

Theorem 5 (Perron-Frobenius).

If 𝐁∈ℝ^{N×N}_{≥0} is an irreducible matrix where for each i, Σ_{k=1}^{N} b_ik = 1, then 1. the maximum eigenvalue of 𝐁 is 1; 2. the maximum eigenvalue of 𝐁 has a unique left eigenvector; 3. the unique left eigenvector of the maximum eigenvalue of 𝐁 is a probability distribution, denoted by π=[π_1, π_2, ..., π_N].

According to Theorem 5, the stationary distribution of 𝐀 can then be defined in Definition 15.

Definition 15 (Stationary Distribution).

Let 𝐀∈ℝ^{N×N}_{≥0} be an irreducible matrix and let 𝐁 represent the normalized matrix of 𝐀 where for each i, Σ_{k=1}^{N} b_ik = 1. The stationary distribution of 𝐀 is defined as the unique left eigenvector of the maximum eigenvalue of 𝐁, denoted by π=[π_1, π_2, ..., π_N].

Finally, the one-dimensional structural entropy (Definition 16) of an irreducible non-negative matrix (or a directed weighted graph) is defined as the Shannon entropy of the stationary distribution, which measures the total amount of uncertainty embedded in a directed weighted graph.

Definition 16 (One-Dimensional Structural Entropy).

Let 𝐀∈ℝ^{N×N}_{≥0} be an irreducible matrix and let π=[π_1, π_2, ..., π_N] be the stationary distribution of 𝐀. The one-dimensional structural entropy is defined as

\mathcal{H}^{1}(\mathbf{A})=-\sum_{i=1}^{N}\pi_{i}\log_{2}\pi_{i}. (62)
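As a sanity check of Definitions 15 and 16, the sketch below computes the one-dimensional structural entropy of a small directed weighted graph with NumPy: it normalizes the matrix row-wise (Eq. (61)), takes the left eigenvector of the maximum eigenvalue as the stationary distribution, and returns its Shannon entropy (Eq. (62)).

import numpy as np

def one_dim_se(A):
    B = A / A.sum(axis=1, keepdims=True)      # row normalization, Eq. (61)
    vals, vecs = np.linalg.eig(B.T)           # columns: left eigenvectors of B
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi = np.abs(pi) / np.abs(pi).sum()        # rescale to a probability distribution
    return -(pi * np.log2(pi)).sum()          # Eq. (62)

# A directed 3-cycle has the uniform stationary distribution,
# so its one-dimensional structural entropy is log2(3) ~ 1.585.
A = np.array([[0., 1., 0.], [0., 0., 1.], [1., 0., 0.]])
print(one_dim_se(A))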
5.2.3 Incremental Measurement of One-Dimensional Structural Entropy

Generally, the exact value of the one-dimensional structural entropy can be obtained by Eq. (62) in O(N) given the stationary distribution. However, calculating the exact stationary distribution requires solving the eigenvectors of the normalized irreducible matrix, which usually costs O(N³). In dynamic scenarios, the directed weighted graph evolves as time passes. When an irreducible matrix 𝐀∈ℝ^{N×N}_{≥0} representing a directed weighted graph receives an increment Δ𝐀, it becomes an updated irreducible matrix 𝐀′=𝐀+Δ𝐀. Let n=‖Δ𝐀‖₀ denote the incremental size. Let 𝐁 and 𝐁′ be the normalized matrices of 𝐀 and 𝐀′, and define Δ𝐁=𝐁′−𝐁. We can calculate Δ𝐁=(Δb_ij) from 𝐀 and Δ𝐀 by

\Delta b_{ij}=b^{\prime}_{ij}-b_{ij}=\frac{a^{\prime}_{ij}}{\sum_{k=1}^{N}a^{\prime}_{ik}}-\frac{a_{ij}}{\sum_{k=1}^{N}a_{ik}}=\frac{a_{ij}+\Delta a_{ij}}{\sum_{k=1}^{N}(a_{ik}+\Delta a_{ik})}-\frac{a_{ij}}{\sum_{k=1}^{N}a_{ik}}. (63)
Since a non-zero element of Δ𝐀 may influence at most all N elements of a row in 𝐁, ‖Δ𝐁‖₀, named the normalized incremental size, will be no more than nN. In addition, the number of influenced rows in 𝐁, denoted by n_B, will be no more than n. In the following, three ways to compute the updated one-dimensional structural entropy are listed.
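The row-local nature of Eq. (63) can be exploited directly: only the n_B rows of 𝐁 touched by Δ𝐀 have to be recomputed. A small NumPy sketch (names illustrative):

import numpy as np

def delta_B(A, dA):
    # Recompute only the rows whose entries changed; Eq. (63).
    dB = np.zeros_like(A, dtype=float)
    for i in np.unique(np.nonzero(dA)[0]):
        old_sum = A[i].sum()
        new_sum = old_sum + dA[i].sum()
        dB[i] = (A[i] + dA[i]) / new_sum - A[i] / old_sum
    return dB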
Exact value calculation by Eigenvector Calculation. The first way is to calculate the exact value of the updated one-dimensional structural entropy by Definitions 15 and 16. Specifically, the new exact stationary distribution π′ is first computed by solving the left eigenvector of the maximum eigenvalue of 𝐁′. Then the updated one-dimensional structural entropy can be calculated by

\mathcal{H}^{1}(\mathbf{A}^{\prime})=-\sum_{i=1}^{N}\pi^{\prime}_{i}\log_{2}\pi^{\prime}_{i}. (64)
Generally, the total time complexity of this way is O(N³).

Approximate value calculation by Global Aggregation. According to the theory of Markov chains, the stationary distribution can be approximated by iteratively right-multiplying 𝐁′, i.e.,
\pi^{(\theta)}=\pi\mathbf{B}^{\prime\theta}=\pi(\mathbf{B}+\Delta\mathbf{B})^{\theta}, (65)
where θ∈ℕ is the iteration number. Whenever π^(θ) is updated to π^(θ+1) by right-multiplying 𝐁+Δ𝐁, the updating process is equivalent to an information aggregation operation on the graph along directed edges, i.e., for each node v_i,
\pi^{(\theta+1)}_{i}=\sum_{v_{j}\in\mathcal{N}(v_{i})}\pi^{(\theta)}_{j}b_{ji}, (66)
where 𝒩(v_i) denotes the neighbors of v_i. Since all nodes need to be updated in each iteration, we name this calculation method “Global Aggregation”. Finally, we compute the Shannon entropy of π^(θ) as the approximate one-dimensional structural entropy:
\mathcal{H}^{1}(\mathbf{A}^{\prime})\approx-\sum_{i=1}^{N}\pi^{(\theta)}_{i}\log_{2}\pi^{(\theta)}_{i}. (67)
The total computational complexity of Global Aggregation is about O(θN²) in the form of matrix multiplication using Eq. (65). From the graph perspective (Eq. (66)), the time complexity can be reduced to O(θN·d̄)=O(θ|ℰ|), where d̄ denotes the mean number of direct successors over all graph nodes.
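A minimal NumPy sketch of Global Aggregation, warm-started from the previous stationary distribution π (we assume π stays strictly positive, which holds in the irreducible case):

import numpy as np

def global_aggregation(pi, B_new, theta=10):
    for _ in range(theta):
        pi = pi @ B_new                       # one full aggregation step, Eq. (66)
    return -(pi * np.log2(pi)).sum()          # approximate entropy, Eq. (67)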
Fast approximate value calculation by Local Propagation. In Global Aggregation, all nodes and edges need to be traversed in each iteration, which leads to high computational redundancy. Here we propose a new method for fast approximation of the updated one-dimensional structural entropy, namely Local Propagation. As the name suggests, the key idea is to use an information propagation scheme involving only the changed local nodes to further reduce the redundancy of the aggregation process in Eq. (66).

Figure 9: An illustration of Local Propagation. Note that all edge weights are normalized. In the first step (a), the red arrows denote the edges whose weights change after receiving an increment, i.e., the involved edges. The stationary distribution values of the involved nodes (π_2, π_4, π_10) are updated. In the second step (b), the direct successors of the last involved nodes update their stationary distribution values and become the new involved nodes. The procedure of (b) is then repeated until the maximum iteration number is reached.

In particular, Local Propagation contains two steps. In the first step (Fig. 9(a)), we define the set of directed edges in 𝐁 influenced by Δ𝐁 as “involved edges” (red edges in Fig. 9(a)). We then let π^(1)=π^(0). For each involved edge (v_i, v_j), the stationary distribution value of the pointed node v_j is updated by

\pi^{(1)}_{j}\leftarrow\pi^{(1)}_{j}+\Delta b_{ij}\pi_{i}, (68)
and v_j is added into an “involved nodes” set denoted by I^(1). The time complexity of the first step is linear in the size of the involved edge set, i.e., O(nN).

In the second step, we repeat the following procedure θ−1 times. Let π^(θ+1)=π^(θ). For each node v_i in I^(θ), we update the stationary distribution values of all of v_i’s direct successors (as in Fig. 9(b)) by
\pi^{(\theta+1)}_{j}\leftarrow\pi^{(\theta+1)}_{j}+b^{\prime}_{ij}(\pi^{(\theta)}_{i}-\pi^{(\theta-1)}_{i}),\quad v_{j}\in\mathcal{N}_{s}(v_{i}), (69)
where 𝒩_s(v_i) denotes the direct successors of v_i. After all nodes in I^(θ) are traversed, I^(θ) is updated to I^(θ+1), i.e., the set of all direct successors of nodes in the original set I^(θ). For each iteration, the time complexity is O(|I^(θ)|·d̄^(θ)), where d̄^(θ) denotes the mean number of direct successors of nodes in I^(θ). In other words, letting E^(θ) (θ≥1) denote the set of edges whose starting points belong to I^(θ), we have |I^(θ)|·d̄^(θ)=|E^(θ)|. Therefore, the time complexity of each iteration is O(|E^(θ)|), which is at most the complexity of Eq. (66), O(|ℰ|).
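The sketch below mirrors the two steps of Local Propagation (Eqs. (68) and (69)). The dense-matrix representation is only for brevity; an adjacency-list implementation attains the stated O(|E^(θ)|) cost per iteration.

import numpy as np

def local_propagation(pi, B_new, dB, theta=10):
    pi_prev, pi_cur = pi.copy(), pi.copy()
    rows, cols = np.nonzero(dB)                # the involved edges
    for i, j in zip(rows, cols):
        pi_cur[j] += dB[i, j] * pi[i]          # first step, Eq. (68)
    involved = set(cols)                       # the involved node set I^(1)
    for _ in range(theta - 1):                 # second step, repeated theta-1 times
        pi_next = pi_cur.copy()
        next_involved = set()
        for i in involved:
            for j in np.nonzero(B_new[i])[0]:  # direct successors of v_i
                pi_next[j] += B_new[i, j] * (pi_cur[i] - pi_prev[i])   # Eq. (69)
                next_involved.add(j)
        pi_prev, pi_cur, involved = pi_cur, pi_next, next_involved
    return -(pi_cur * np.log2(pi_cur)).sum()   # entropy of the approximation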
6 Experiments

Parameter descriptions of the three methods are listed below:

Random parameters:

Gaussian parameters: This method creates k communities, each with a size drawn from a normal distribution with mean s and variance s/v. Nodes are connected within communities with probability p_in and between communities with probability p_ac.

SBM parameters: This method partitions the nodes into communities of given sizes 𝒮=[s_1, s_2, ...] and places edges between pairs of nodes independently, with a probability that depends on the communities.

Figure 10: The generation process of the artificial Hawkes datasets.

In this process, we first randomly choose a node x to be the target node (Fig. 10, t=1). Second, we add edges or nodes with given probabilities. Specifically, with probability p_hn (Fig. 10, Node Addition), we connect a new node with x. With probability 1−p_hn (Fig. 10, Edge Addition), we (1) use Node2vec [25] to get embedding vectors of all nodes; (2) calculate the conditional intensity function Λ_{y_i|x} between x and each of its non-neighboring nodes y, where e_x and e_y refer to the embedding vectors of nodes x and y, h refers to the historical neighbor nodes connected to x at time t_h before t, and δ_x refers to the discount rate, defined in this paper as the number of neighbors of x; and (3) add an edge between x and y_i with the Softmax conditional probability. Subsequently, we repeat the above two steps to generate incremental sequences and updated graphs (Fig. 10, t=2, 3, ..., T). The chosen parameter settings of the initial states and the Hawkes Process are described in Table 2.

Table 2: Parameter Values of the Generated Initial States and Hawkes Process.

Table: Statistics Description of the Artificial and Real-World Datasets.
Random-Hawkes: 6,000; 203,659; 1,017; 20,365
Gaussian-Hawkes: 6,000; 164,482; 596; 11,942
SBM-Hawkes: 6,000; 176,526; 592; 11,942
—: 25,656; 132,119; 235; 10,727
DBLP: 7,184; 13,451; 3,379; 7,907
—: 14,094; 72,809; 2,233; 26,000

DBLP contains a co-authorship network of computer science papers from 1954 to 2015, in which authors are represented as vertices and co-authors are linked by an edge.

The results are shown in Fig. 11 (on real-world datasets) and Fig. 12 (on artificial datasets). In Fig. 11, the performance of NSGA+AIUA is only slightly weaker than that of TOA with Infomap on Cit-HepPh and Facebook. Further, it even achieves lower structural entropy than the offline algorithm when Louvain and Leiden are chosen. Two main reasons are listed as follows. (1) In the incremental algorithms, the new community partitioning is obtained by locally modifying the original partitioning of the initial state, while in TOA the community partitioning is reconstructed globally from snapshots (updated graphs) by the static methods. Although TOA spends a large amount of time searching globally for an optimal partitioning, there is no theoretical proof that it must be better than the local and greedy optimization used in the incremental algorithms. (2) Additionally, the optimization objectives of the incremental algorithms and TOA are different. The former focus on the structural entropy itself, while the latter aims to optimize modularity (Louvain and Leiden) or to minimize the expected length of information transmission (Infomap).

From the experimental results on artificial datasets (Fig. 12), we can conclude that our incremental algorithms have higher stability in the dynamic calculation of structural entropy. In most cases, there are many mutations in the structural entropy values produced by TOA, while our algorithms maintain a continuous and stable decrease. This is because TOA globally reconstructs the communities for each iteration, so that small node or edge changes may have a larger influence. By contrast, our algorithms dynamically adjust the community partitioning from the last snapshot, which affects only the locally involved nodes. The structural entropy differences between the static methods on the artificial Hawkes datasets are quite small, indicating that the initial community partitionings of the three methods are almost the same.

Table 5 shows the time comparison between our online algorithm NSGA+AIUA (N=5) and the offline algorithm TOA. As we can see from the results, all our proposed incremental algorithms are significantly faster than the existing static methods. For example, NSGA+AIUA (N=5) obtains over 140.93× and 77.81× speedup on average on DBLP and SBM-Hawkes, respectively, in contrast with the static method using Infomap.

Table 5: Time Consumption Comparison of Our Incremental Algorithms (Online Time) and the Baseline Traditional Offline Algorithm (Offline Time).

Table: The Influence on Total Time Consumption and Structural Entropy of Different Updated Thresholds θ on the Cit-HepPh Dataset with the Infomap Static Method.

6.3.5 Robustness Analysis

Figure: The structural entropy of NAGA/NSGA+AIUA and TOA with Infomap under different initial state parameters p_ac, representing different noise conditions.

One-Dimensional Structural Entropy Measurement of Directed Weighted Graphs

ER initial state. The initial state of the ER dataset is created by the Erdős-Rényi model implemented by erdos_renyi_graph in NetworkX [22], which has two main parameters: the number of nodes n and the probability of edge creation p. In this dataset, the Erdős-Rényi model chooses each of the n(n−1) possible directed edges with probability p.

Cycle initial state. The Cycle dataset's initial state contains cyclically connected nodes, where the edge direction is in increasing order. For example, a cycle dataset {v_0, v_1, ..., v_n} has n+1 directed edges {(v_0, v_1), (v_1, v_2), ..., (v_{n−1}, v_n), (v_n, v_0)}.

Weight and incremental settings. For the initial states of the ER and Cycle datasets, we first set all edge weights to random integers in [1,10]. Then we generate incremental sequences with sizes 500, 2000, 5000, and 10000, respectively, corresponding to four different updated graphs from each initial state. Specifically, for each incremental edge, we randomly choose two nodes and add a directed edge between them with a random weight in [1,10]. If the edge already exists, we randomly change its weight to another value in [1,10].

Figure: Time consumption of Global Aggregation and Local Propagation on the ER and Cycle datasets. Note that only the time consumption of Global Aggregation with n=500 is shown, since its theoretical time consumption is the same for all incremental sizes n.
Abstract

Structural entropy is a metric that measures the amount of information embedded in graph structure data under a strategy of hierarchical abstracting. To measure the structural entropy of a dynamic graph, we need to decode the optimal encoding tree corresponding to the best community partitioning for each snapshot. However, the current methods do not support dynamic encoding tree updating and incremental structural entropy computation. To address this issue, we propose Incre-2dSE, a novel incremental measurement framework that dynamically adjusts the community partitioning and efficiently computes the updated structural entropy for each updated graph. Specifically, Incre-2dSE includes incremental algorithms based on two dynamic adjustment strategies for two-dimensional encoding trees, i.e., the naive adjustment strategy and the node-shifting adjustment strategy, which support theoretical analysis of updated structural entropy and incrementally optimize community partitioning towards a lower structural entropy. We conduct extensive experiments on 3 artificial datasets generated by Hawkes Process and 3 real-world datasets. Experimental results confirm that our incremental algorithms effectively capture the dynamic evolution of the communities, reduce time consumption, and provide great interpretability.

keywords:
Structural entropy, dynamic graph, boundedness and convergence analysis, incremental algorithm
journal: Artificial Intelligence
[1] State Key Laboratory of Software Development Environment, School of Computer Science and Engineering, Beihang University, Beijing 100191, China

[2] School of Cyber Science and Technology, Beihang University, Beijing 100191, China

[3] Didi Chuxing, Beijing 100193, China

1 Introduction

In 1953, Shannon [1] proposed the problem of structural information quantification to analyze communication systems. Since then, many information metrics of graph structures [2, 3, 4, 5] have been presented based on the Shannon entropy of random variables. In recent years, Li et al. [6, 7] proposed an encoding-tree-based graph structural information metric, namely structural entropy, to discover the natural hierarchical structure embedded in a graph. The structural entropy has been used extensively in the fields of biological data mining [8, 9], information security [10, 11], and graph neural networks [12, 13, 14], etc.

The computation of structural entropy [6] consists of three steps: encoding tree construction, node structural entropy calculation, and total structural entropy calculation. Firstly, the graph node set is hierarchically divided into several communities (shown in Fig. 1(a)) to construct a partitioning tree, namely an encoding tree (shown in Fig. 1(b)). Secondly, the total node degree and cut edge number of each community are counted to compute the structural entropy of each non-root node in the encoding tree. Thirdly, the structural entropy of the whole graph is calculated by summing up the node structural entropy. In general, smaller structural entropy corresponds to better community partitioning. Specifically, the minimized structural entropy, namely the graph structural entropy, corresponds to the optimal encoding tree, which reveals the best hierarchical community partitioning of the graph.

In dynamic scenarios, a graph evolves from its initial state to many updated graphs during time series [15]. To efficiently measure the quality of evolving community partitioning, we need to incrementally compute the updated structural entropy at any given time. Unfortunately, the current structural entropy methods [6, 7] do not support efficient incremental computation due to two challenges. The first challenge is that the encoding tree needs to be reconstructed for every updated graph, which leads to enormous time consumption. To address this issue, we propose two dynamic adjustment strategies for two-dimensional encoding trees¹, namely the naive adjustment strategy and the node-shifting adjustment strategy. The former strategy maintains the old community partitioning and supports theoretical structural entropy analysis, while the latter dynamically adjusts the community partitioning by moving nodes between communities based on the principle of structural entropy minimization. The second challenge is the high time complexity of structural entropy computation by the traditional definition. To tackle this problem, we design an incremental framework, namely Incre-2dSE, for efficiently measuring the updated two-dimensional structural entropy. To be specific, Incre-2dSE first utilizes the two dynamic adjustment strategies to generate Adjustments, i.e., the changes of important statistics from the original graph to the updated graph, and then uses the Adjustments to calculate the updated structural entropy by a newly designed incremental formula. Additionally, we also generalize our incremental methods to undirected weighted graphs and conduct a detailed discussion on the calculation of one-dimensional structural entropy for directed weighted graphs.

¹ “Two-dimensional encoding tree” means that the height of the encoding tree is restricted to 2. The corresponding structural entropy of a two-dimensional encoding tree is named the “two-dimensional structural entropy”.

Figure 1: a) A graph containing three communities A, B, and C, where A is divided into two sub-communities A.1 and A.2. b) An encoding tree of the left graph. Each leaf node corresponds to a single graph node. Each branch node corresponds to a community. The root node corresponds to the graph node set.

We conduct extensive experiments on 3 artificial dynamic graph datasets generated by Hawkes Process and 3 real-world datasets on the application of real-time monitoring of community partitioning quality (two-dimensional structural entropy) and community optimization. Comprehensive experimental results show that our methods effectively capture the community evolution features and significantly reduce the time consumption with great interpretability. All source code and data of this project are publicly available at https://github.com/SELGroup/IncreSE.

The main contributions of this paper are as follows:

  • 1.

    Proposing two dynamic adjustment strategies for two-dimensional encoding trees to avoid the reconstruction of the encoding tree for every updated graph.

  • 2.

    Designing an incremental framework for efficiently measuring the updated two-dimensional structural entropy with low time complexity.

  • 3.

    Extending the proposed methods to weighted graphs and providing new incremental computation methods for directed weighted graphs.

  • 4.

    Conducting extensive experiments on artificial and real-world datasets to demonstrate the effectiveness and efficiency of our method in dynamic measurement of community partitioning quality.

The article is structured as follows: Section 2 outlines the definitions and notations. Section 3 describes the dynamic adjustment strategies. The algorithms are detailed in Section 4. Section 5 gives a discussion on more complex graphs. The experiments are discussed in Section 6. Section 7 presents the related works before concluding the paper in Section 8.

2 Definitions and Notations

In this section, we summarize the notations in Table 1, and formalize the definition of Incremental Sequence, Dynamic Graph, Encoding Tree, and Structural Entropy as follows.

Table 1: Glossary of Notations.
Notation	Description
G	Graph
𝒱; ℰ	Node set; Edge set
v; e	Node; Edge
d_i; d_v	Node degree of node v_i; Node degree of v
m	The total edge number of G
𝒯	Encoding tree
λ	The root node of an encoding tree
α	A non-root node in an encoding tree, i.e., the community ID
A	The set of all 1-height nodes in an encoding tree
T_α	The label of α, i.e., the node community corresponding to α
V_α	The volume of T_α
g_α	The cut edge number of T_α
𝒢	Dynamic graph
G_0	Initial state of a dynamic graph
G_t	The updated graph at time t
ξ_t	Incremental sequence at time t
ξ_{1→t}	Cumulative incremental sequence at time t
n	Incremental size
δ(v)	The degree incremental of v
δ_V(α)	The volume incremental of T_α
δ_g(α)	The cut edge number incremental of T_α
φ_λ	The degree-changed node set
𝒜	The set of 1-height tree nodes whose volume or cut edge number change
H^𝒯(G)	The structural entropy of G by 𝒯
H^𝒯_GI(G,n)	Global Invariant with incremental size n
ΔL	Local Difference, i.e., the approximation error between H^𝒯′(G′) and H^𝒯_GI(G,n)
Definition 1 (Incremental Sequence).

An incremental sequence is a set of incremental edges represented by ξ = {<(v_1, u_1), op_1>, <(v_2, u_2), op_2>, ..., <(v_n, u_n), op_n>}, where (v_i, u_i) denotes an incremental edge e_i with two endpoints v_i and u_i. The operator op_i ∈ {+, −} represents whether the edge e_i will be added to or removed from a graph. The number of incremental edges, n, is named the incremental size.

Definition 2 (Dynamic Graph).

In this work, a dynamic graph is defined as a series of snapshots of a temporal, undirected, unweighted, and connected graph 𝒢 = {G_0, G_1, ..., G_T}. G_0 = (𝒱_0, ℰ_0) denotes the initial state, and G_t = (𝒱_t, ℰ_t) denotes the updated graph at time t (1 ≤ t ≤ T). To describe the temporal dynamic graph, we suppose that an incremental sequence ξ_t arrives at each non-zero time t. The updated graph G_t is generated by orderly combining all new edges and nodes introduced by ξ_t with G_{t−1}, i.e., G_t := CMB(G_{t−1}, ξ_t). We further define the cumulative incremental sequence at time t, denoted by ξ_{1→t}, as the sequence formed by sequentially concatenating the sequences ξ_1, ξ_2, ..., ξ_t, and then we have G_t := CMB(G_0, ξ_{1→t}).
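A minimal sketch of the CMB operation with NetworkX (the function name cmb is our own shorthand for Definition 2):

import networkx as nx

def cmb(G_prev, xi):
    # Apply an incremental sequence of (edge, op) pairs to the last snapshot.
    G = G_prev.copy()
    for (v, u), op in xi:
        if op == "+":
            G.add_edge(v, u)
        else:
            G.remove_edge(v, u)
    return G

G0 = nx.path_graph(4)                          # initial state G_0
G1 = cmb(G0, [((4, 3), "+"), ((4, 0), "+")])   # updated graph G_1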

Definition 3 (Encoding Tree).

The concept of the encoding tree is the same as “the partitioning tree” proposed in previous work [6]. The encoding tree 𝒯 of a graph G = (𝒱, ℰ) is an undirected tree that satisfies the following properties:

  1. The root node λ in 𝒯 has a label T_λ = 𝒱.

  2. Each non-root node α in 𝒯 has a label T_α ⊂ 𝒱.

  3. For each node α in 𝒯, if β_1, β_2, ..., β_k are all the immediate successors of α, then T_{β_1}, ..., T_{β_k} is a partitioning of T_α.

  4. The label of each leaf node γ is a single-node set, i.e., T_γ = {v}.

  5. h(α) denotes the height of a node α in 𝒯. Let h(λ) = 0 and h(α) = h(α⁻) + 1, where α⁻ is the parent of α. The height of the encoding tree, h(𝒯), namely the dimension, is equal to the maximum of h(γ).

Definition 4 (Structural Entropy).

The structural entropy is defined by Li and Pan [6]. We follow this work and state the definition below. Given an undirected, unweighted, and connected graph G = (𝒱, ℰ) and its encoding tree 𝒯, the structural entropy of G by 𝒯 is defined as:

H^{\mathcal{T}}(G)=\sum_{\alpha\in\mathcal{T},\alpha\neq\lambda}-\frac{g_{\alpha}}{2m}\log\frac{V_{\alpha}}{V_{\alpha^{-}}}, (1)

where m is the total edge number of G; g_α is the cut edge number of T_α, i.e., the number of edges between nodes in and not in T_α; V_α is the volume of T_α, i.e., the sum of the degrees of all nodes in T_α; and log(·) denotes the logarithm with base 2. We name H^𝒯(G) the K-dimensional structural entropy if 𝒯’s height is K.

The graph structural entropy of G is defined as:

H(G)=\min_{\mathcal{T}}\{H^{\mathcal{T}}(G)\}, (2)

where 𝒯 ranges over all possible encoding trees.

If the height of 𝒯 is restricted to K, the K-dimensional graph structural entropy of G is defined as:

H^{K}(G)=\min_{\mathcal{T}}\{H^{\mathcal{T}}(G)\}, (3)

where 𝒯 ranges over all possible encoding trees of height K. The encoding tree corresponding to H^K(G), which minimizes the K-dimensional structural entropy, is named the optimal K-dimensional encoding tree.
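For illustration, the following Python sketch evaluates Eq. (1) on an encoding tree represented as a nested dictionary {"label": node_set, "children": [...]}; this data layout is our own assumption, not the paper's.

import networkx as nx
from math import log2

def structural_entropy(G, tree_node, parent_volume=None):
    nodes = tree_node["label"]
    volume = sum(d for _, d in G.degree(nodes))
    cut = sum(1 for u, v in G.edges if (u in nodes) != (v in nodes))
    h = 0.0
    if parent_volume is not None:              # the root term is excluded
        h -= cut / (2 * G.number_of_edges()) * log2(volume / parent_volume)
    for child in tree_node.get("children", []):
        h += structural_entropy(G, child, volume)
    return h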

3 Dynamic Adjustment Strategies for Two-Dimensional Encoding Trees

In this section, we first introduce the naive adjustment strategy and analyze the updated structural entropy under this strategy. Then, we describe the node-shifting adjustment strategy, which leads to lower structural entropy, and provide its theoretical proof. Finally, we give further discussion between the two proposed dynamic adjustment strategies.

3.1 Naive Adjustment Strategy

In this part, we first provide a formal description of the naive adjustment strategy. Next, we introduce two metrics, Global Invariant and Local Difference, to realize the incremental computation of updated structural entropy. Finally, we analyze the boundedness and convergence of the Local Difference.

3.1.1 Strategy Description

Before we introduce the naive adjustment strategy, we first discuss in detail the form of two-dimensional encoding trees and the formula of their corresponding structural entropy. For all possible two-dimensional encoding trees, whenever there is a leaf node γ (T_γ = {v_k}) whose height is 1 (as in Fig. 2(b)), we can connect it with a child node γ⁺ with the same label T_{γ⁺} = {v_k} to make all leaf nodes have height 2 (Fig. 2(c)). At this time, the structural entropy remains unchanged, since the additional term induced by γ⁺, i.e., −(d_k/2m)·log(d_k/d_k), equals 0.

Figure 2: An example graph with its two equivalent two-dimensional encoding trees.

In other words, the two encoding trees in Fig. 2 are equivalent. Therefore, in this paper, we only consider the two-dimensional encoding trees in which the height of all leaf nodes is 2, for the sake of brevity. Given a graph G and its two-dimensional encoding tree 𝒯, the two-dimensional structural entropy of G by 𝒯 can uniformly be formulated as:

H^{\mathcal{T}}(G)=\sum_{\alpha_{i}\in A}(-\frac{g_{\alpha_{i}}}{2m}\log\frac{V_{\alpha_{i}}}{2m}+\sum_{v_{j}\in T_{\alpha_{i}}}-\frac{d_{j}}{2m}\log\frac{d_{j}}{V_{\alpha_{i}}}), (4)

where A denotes the set of 1-height nodes in 𝒯, i.e., A = {α in 𝒯 | h(α) = 1}.
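In the two-dimensional case, Eq. (4) only needs a flat partitioning of the node set into communities; a short NetworkX sketch:

import networkx as nx
from math import log2

def two_dim_se(G, communities):
    two_m = 2 * G.number_of_edges()
    h = 0.0
    for comm in communities:
        vol = sum(d for _, d in G.degree(comm))
        cut = sum(1 for u, v in G.edges if (u in comm) != (v in comm))
        h -= cut / two_m * log2(vol / two_m)                    # community term
        for v in comm:
            h -= G.degree(v) / two_m * log2(G.degree(v) / vol)  # node terms
    return h

G = nx.karate_club_graph()
parts = [{v for v in G if v < 17}, {v for v in G if v >= 17}]
print(two_dim_se(G, parts))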

Now we present the description of the naive adjustment strategy. This strategy comprises two parts: the edge strategy and the node strategy. The edge strategy dictates that incremental edges do not alter the encoding tree’s structure. The node strategy specifies that when a new node v connects with an existing node u (shown in Fig. 3(a)), and u corresponds to a leaf node η in the two-dimensional encoding tree, i.e., T_η = {u} (shown in Fig. 3(b)), a new leaf node γ with label T_γ = {v} will be set as an immediate successor of η’s parent (α in Fig. 3(d)), instead of another 1-height node (β in Fig. 3(f)). We can describe this modification of the encoding trees from the community perspective. Specifically, the incremental edges do not change the communities of the existing nodes, while the new node is assigned to its neighbor’s community (T_α in Fig. 3(c)), rather than another arbitrary community (T_β in Fig. 3(e)). Obviously, we can get the updated encoding tree, i.e., the updated community partitioning, in time complexity O(n) given an incremental sequence of size n. To ensure that the node strategy minimizes the updated structural entropy most of the time, we give the following theorem.

Figure 3: An example of the node strategy for adjusting two-dimensional encoding trees.
Theorem 1.

Suppose that a new graph node v is connected to an existing node u, where {u} ⊆ T_α. If (2m+2)/(V_α+2) ≥ e, we have:

H^{\mathcal{T}^{\prime}}_{v\rightarrow\alpha}(G^{\prime})<H^{\mathcal{T}^{\prime}}_{v\rightarrow\beta\neq\alpha}(G^{\prime}), (5)

where H^{𝒯′}_{v→α}(G′) denotes the updated structural entropy when the new node v is assigned to u’s community T_α, i.e., {v} ⊂ T_α, and H^{𝒯′}_{v→β≠α}(G′) represents the updated structural entropy when v is allocated to another arbitrary community T_{β≠α}, i.e., {v} ⊂ T_{β≠α}.

Proof. Differentiating the updated structural entropy of the two cases above, we can obtain:

\Delta H^{\mathcal{T}^{\prime}}(G^{\prime})=H^{\mathcal{T}^{\prime}}_{v\rightarrow\alpha}(G^{\prime})-H^{\mathcal{T}^{\prime}}_{v\rightarrow\beta\neq\alpha}(G^{\prime})=\frac{1}{2m+2}[\log\frac{V_{\alpha}+2}{2m+2}+(g_{\alpha}-V_{\alpha})\log\frac{V_{\alpha}+1}{V_{\alpha}+2}-(g_{\beta}-V_{\beta})\log\frac{V_{\beta}}{V_{\beta}+1}]. (6)

Here we define:

f_{1}(g,V)=(g-V)\log\frac{V+1}{V+2}, (7)
f_{2}(g,V)=-(g-V)\log\frac{V}{V+1}. (8)

Let θ (0 ≤ θ < 1) be the minimum proportion of the in-community edges in each community, i.e.,

\theta=\min_{\alpha}\{\frac{V_{\alpha}-g_{\alpha}}{V_{\alpha}}\}. (9)

Since 1 ≤ V < 2m and 0 < g ≤ (1−θ)V, we have:

f_{1}(g,V)<-V\log\frac{V+1}{V+2}<\frac{1}{\ln 2}=\log e, (10)
f_{2}(g,V)\leq\theta V\log\frac{V}{V+1}\leq\theta\log\frac{1}{2}=-\theta. (11)

So

\Delta H^{\mathcal{T}^{\prime}}(G^{\prime})<\frac{1}{2m+2}(\log\frac{V_{\alpha}+2}{2m+2}+\log e-\theta)=\frac{1}{2m+2}\log(\frac{V_{\alpha}+2}{2m+2}\cdot 2^{-\theta}e). (12)

Therefore, if the following condition holds:

\frac{2m+2}{V_{\alpha}+2}\geq\max\{2^{-\theta}e\}=e, (13)

then Eq. (5) holds, and thus Theorem 1 is proven. ∎

According to Theorem 1, our node strategy minimizes the updated structural entropy when the total volume of the whole graph, 2m, is approximately larger than e times the maximum volume V_m over all communities. Usually, we have θ ≈ 1, so the real condition is much looser.
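The strategy itself is a constant-time rule per incremental edge. A Python sketch (comm maps every existing node to its community ID; all names are illustrative):

def naive_adjust(comm, incremental_edges):
    # Edge strategy: an edge between two existing nodes changes nothing.
    # Node strategy: a new node joins the community of its existing neighbor.
    for v, u in incremental_edges:
        if v not in comm and u in comm:
            comm[v] = comm[u]
        elif u not in comm and v in comm:
            comm[u] = comm[v]
    return comm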

3.1.2 Global Invariant and Local Difference

In this part, we introduce two quantities, Global Invariant and Local Difference, to realize the approximation and the fast incremental calculation of the updated structural entropy under the naive adjustment strategy. When an incremental sequence ξ of size n is applied to a graph G, resulting in a new graph G′ and its corresponding two-dimensional encoding tree 𝒯′ under the naive adjustment strategy, the updated two-dimensional structural entropy can be expressed as:

H^{\mathcal{T}^{\prime}}(G^{\prime})=\sum_{\alpha_{i}\in A}(-\frac{g^{\prime}_{\alpha_{i}}}{2m+2n}\log\frac{V^{\prime}_{\alpha_{i}}}{2m+2n}+\sum_{v_{j}\in T_{\alpha_{i}}}-\frac{d^{\prime}_{j}}{2m+2n}\log\frac{d^{\prime}_{j}}{V^{\prime}_{\alpha_{i}}}). (14)

An intuitive way to calculate the updated two-dimensional structural entropy is to update the variables in Eq. (4) and then compute via the updated formula, Eq. (14). However, the incremental size n affects all terms in the summation in Eq. (14). Therefore, the updating and calculation process costs at least O(|𝒱|), which is huge when the graph becomes extremely large. So how can we find an incremental formula with a smaller time complexity, related only to the incremental size n? An intuitive attempt is to take the difference between the updated structural entropy and the original one, trying to compute the incremental entropy in O(n). Nevertheless, the fact that m changes to m+n in all terms of Eq. (14) makes it difficult to derive a concise O(n) formula from the difference equation.

To address this issue, we introduce the Global Invariant and the Local Difference. We define the Global Invariant (Definition 5) as an approximate updated structural entropy that only updates m to m+n in Eq. (4) and keeps the other variables unchanged. The Local Difference (Definition 6) is defined as the difference between the updated structural entropy and the Global Invariant, which can also be regarded as the approximation error.

Definition 5 (Global Invariant).

Given an original graph G and its two-dimensional encoding tree 𝒯, the Global Invariant is defined as an approximate value of the updated structural entropy after an incremental sequence of size n, i.e.,

H_{GI}^{\mathcal{T}}(G,n)=\sum_{\alpha_{i}\in A}(-\frac{g_{\alpha_{i}}}{2m+2n}\log\frac{V_{\alpha_{i}}}{2m+2n}+\sum_{v_{j}\in T_{\alpha_{i}}}-\frac{d_{j}}{2m+2n}\log\frac{d_{j}}{V_{\alpha_{i}}})=-\frac{1}{2m+2n}(S_{N}+S_{C}+S_{G}), (15)

where

S_{N}=\sum_{v_{i}\in\mathcal{V}}d_{i}\log d_{i}, (16)
S_{C}=\sum_{\alpha_{i}\in A}(g_{\alpha_{i}}-V_{\alpha_{i}})\log V_{\alpha_{i}}, (17)
S_{G}=-\sum_{\alpha_{i}\in A}g_{\alpha_{i}}\log(2m+2n). (18)
Definition 6 (Local Difference).

Given the updated graph G′, the updated two-dimensional encoding tree 𝒯′, and the incremental size n, the Local Difference is defined as the difference between the exact updated two-dimensional structural entropy and the Global Invariant, as shown below:

\Delta L=H^{\mathcal{T}^{\prime}}(G^{\prime})-H^{\mathcal{T}}_{GI}(G,n)=-\frac{1}{2m+2n}(\Delta S_{N}+\Delta S_{C}+\Delta S_{G}), (19)

where

\Delta S_{N}=\sum_{v_{k}\in\phi_{\lambda}}[(d_{k}+\delta(v_{k}))\log(d_{k}+\delta(v_{k}))-d_{k}\log d_{k}], (20)
\Delta S_{C}=\sum_{\alpha\in\mathcal{A}}[(g_{\alpha}+\delta_{g}(\alpha)-V_{\alpha}-\delta_{V}(\alpha))\log(V_{\alpha}+\delta_{V}(\alpha))-(g_{\alpha}-V_{\alpha})\log V_{\alpha}], (21)
\Delta S_{G}=-\sum_{\alpha\in\mathcal{A}}\delta_{g}(\alpha)\log(2m+2n). (22)

Here, δ(v_k) denotes the incremental change in degree, d′_k − d_k; δ_V(α) represents the incremental change in volume, V′_α − V_α; δ_g(α) represents the incremental change in cut edge number, g′_α − g_α; φ_λ denotes the set of nodes whose degrees change, {v_k ∈ 𝒱′ | δ(v_k) ≠ 0}; and 𝒜 denotes the set of 1-height tree nodes with changes in V_α or g_α, i.e., 𝒜 = {α ∈ A′ | δ_V(α) ≠ 0 or δ_g(α) ≠ 0}.

Obviously, H^{𝒯′}(G′) = H^{𝒯}_{GI}(G,n) + ΔL. We can get the Global Invariant in time complexity O(1) if S_N, S_C, and S_G are saved. The Local Difference can also be computed in O(n) given the necessary incremental changes. Overall, the updated two-dimensional structural entropy can be calculated in O(n) by computing and summing up the Global Invariant and the Local Difference. In the experimental part, we use a more explicit and practical form of the structural entropy update formula (Eq. (55)) derived from the Global Invariant and the Local Difference.
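Putting Definitions 5 and 6 together gives the O(n) update. A Python sketch, where the saved statistics S_N, S_C, and the total cut Σ_α g_α are passed in and the Adjustments arrive as dictionaries (interfaces illustrative):

import math

def xlog2(x):
    # convention: 0 * log(0) = 0 (new nodes enter with degree 0)
    return x * math.log2(x) if x > 0 else 0.0

def updated_se(m, n, S_N, S_C, sum_g, d, V, g, node_deltas, comm_deltas):
    vol = 2 * m + 2 * n
    gi = -(S_N + S_C - sum_g * math.log2(vol)) / vol       # Eq. (15)
    dS_N = sum(xlog2(d.get(v, 0) + dv) - xlog2(d.get(v, 0))
               for v, dv in node_deltas.items())           # Eq. (20)
    dS_C = sum((g[a] + dg - V[a] - dVa) * math.log2(V[a] + dVa)
               - (g[a] - V[a]) * math.log2(V[a])
               for a, (dVa, dg) in comm_deltas.items())    # Eq. (21)
    dS_G = -math.log2(vol) * sum(dg for _, dg in comm_deltas.values())  # Eq. (22)
    return gi - (dS_N + dS_C + dS_G) / vol                 # Eqs. (14) and (19)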

3.1.3 Boundedness Analysis

According to Eq. (19), the bounds of ΔL can be obtained by analyzing its components, namely ΔS_N, ΔS_C, and ΔS_G. First, we analyze the maximum and minimum values of ΔS_N. We define

s_{N}(d,x)=(d+x)\log(d+x)-d\log d. (23)

Since s_N(d,n) is monotonically increasing with d, ΔS_N takes its maximum value when the n new incremental edges connect the two nodes with the largest degree. Therefore, we have

\Delta S_{N}\leq 2s_{N}(d_{m},n), (24)

where d_m denotes the maximum degree in G. Since multiple edges are not allowed, the equality may hold if and only if n = 1. When each of the n incremental edges connects a one-degree node and a new node, ΔS_N is minimized:

\Delta S_{N}\geq ns_{N}(1,0). (25)

Second, we analyze the bounds of ΔS_C and ΔS_G. For convenience, we define

\Delta S_{CG}=\Delta S_{C}+\Delta S_{G}. (26)

We commence by analyzing the bound of ΔS_CG when adding one new edge. If a new edge is added between two communities T_{α_1} and T_{α_2}, we can get

\Delta S_{CG}=(g_{\alpha_{1}}-V_{\alpha_{1}})\log(V_{\alpha_{1}}+1)-(g_{\alpha_{1}}-V_{\alpha_{1}})\log V_{\alpha_{1}}+(g_{\alpha_{2}}-V_{\alpha_{2}})\log(V_{\alpha_{2}}+1)-(g_{\alpha_{2}}-V_{\alpha_{2}})\log V_{\alpha_{2}}-2\log(2m+2). (27)

Thus we have

\Delta S_{CG}\geq 2V_{m}\log(\frac{V_{m}}{V_{m}+1})-2\log(2m+2), (28)

and

\Delta S_{CG}\leq-2\log(2m+2), (29)

where V_m denotes the maximum volume over all T_α. If a new edge is added within a single community T_α (or a new node is connected with an existing node in T_α), we have

\Delta S_{CG}=(g_{\alpha}-V_{\alpha}-2)\log(V_{\alpha}+2)-(g_{\alpha}-V_{\alpha})\log V_{\alpha}. (30)

Then we can obtain

ΔSCG\displaystyle\Delta S_{CG}\geq (Vm+2)log(Vm+2)+VmlogVm,\displaystyle-(V_{m}+2)\log(V_{m}+2)+V_{m}\log V_{m}, (31)

and

ΔSCG\displaystyle\Delta S_{CG}\leq 2log(Vmin+2),\displaystyle-2\log(V_{min}+2), (32)

where VminV_{min} denotes the minimum volume of all TαT_{\alpha}. We next analyze the bound of ΔSCG\Delta S_{CG} when adding nn new edges. When the nn edges are all between the two communities with the largest volume, we have:

ΔSCG\displaystyle\Delta S_{CG}\geq 2Vmlog(VmVm+n)2nlog(2m+2n)\displaystyle 2V_{m}\log(\frac{V_{m}}{V_{m}+n})-2n\log(2m+2n) (33)
>\displaystyle> 2n2nlog(2m+2n),\displaystyle-2n-2n\log(2m+2n),

and $\Delta S_{CG}$ takes the minimum value:

\Delta S_{CG\min}= -2n-2n\log(2m+2n). \quad (34)

When each of the $n$ edges is added within one of the $n$ communities with the smallest volume, respectively, $\Delta S_{CG}$ takes its maximum value:

\Delta S_{CGm}= -2n\log(V_{\min}+2). \quad (35)

Finally, we can get a lower bound of $\Delta L$ as

\text{LB}(\Delta L)= -\frac{1}{2m+2n}\left(2s_{N}(d_{m},n)+\Delta S_{CGm}\right)
= \frac{1}{m+n}\left[d_{m}\log d_{m}-(d_{m}+n)\log(d_{m}+n)+n\log(V_{\min}+2)\right]. \quad (36)

An upper bound of $\Delta L$ is as follows:

\text{UB}(\Delta L)= -\frac{1}{2m+2n}\left(n\,s_{N}(1,0)+\Delta S_{CG\min}\right)
= \frac{n\log(m+n)+\frac{5}{2}n}{m+n}. \quad (37)

3.1.4 Convergence Analysis

In this section, we analyze the convergence of the Local Difference as well as its first-order absolute moment. To denote that one function converges at the same rate as or faster than another function, we use the notation $g(m)=O(f(m))$, which is equivalent to $\lim_{m\rightarrow\infty}\frac{g(m)}{f(m)}=C$, where $C$ is a constant.

Theorem 2.

Given the incremental size $n$, the Local Difference converges at the rate of $O(\frac{\log m}{m})$, represented as:

\Delta L= O\!\left(\frac{\log m}{m}\right). \quad (38)

Proof. The lower bound of $\Delta L$ is given by:

\text{LB}(\Delta L)= \frac{d_{m}\log d_{m}-(d_{m}+n)\log(d_{m}+n)}{m+n}+\frac{n\log(V_{\min}+2)}{m+n}
\geq \frac{m\log m-(m+n)\log(m+n)}{m+n}+\frac{n\log(2+2)}{m+n}
\geq \frac{1}{m+n}\left[\log\left(1-\frac{n}{m+n}\right)^{m}-n\log(m+n)\right]
= O\!\left(\frac{\log m}{m}\right). \quad (39)

Similarly, the upper bound is given by:

\text{UB}(\Delta L)= \frac{n\log(m+n)+\frac{5}{2}n}{m+n}
= O\!\left(\frac{\log m}{m}\right). \quad (40)

Since

\text{LB}(\Delta L)\leq\Delta L\leq\text{UB}(\Delta L), \quad (41)

Theorem 2 is proved. ∎

It follows that the difference between the exact value of the updated two-dimensional structural entropy and the Global Invariant converges at the rate of $O(\frac{\log m}{m})$.

Definition 7.

Let $X$ be a random variable representing the incremental size $n$, and denote its expectation by $\mathbb{E}[X]=\overline{n}$.

Theorem 3.

The first-order absolute moment of the Local Difference converges at the rate of $O(\frac{\log m}{m})$:

\mathbb{E}[|\Delta L|]=O\!\left(\frac{\log m}{m}\right). \quad (42)

Proof. We can represent the expectation of the absolute value of the lower bound of $\Delta L$ as:

\mathbb{E}[|\text{LB}(\Delta L)|]= \mathbb{E}\left[\left|\frac{d_{m}\log d_{m}-(d_{m}+X)\log(d_{m}+X)}{m+X}+\frac{X\log(V_{\min}+2)}{m+X}\right|\right]
\leq \mathbb{E}\left[\frac{(m+X)\log(m+X)-m\log m}{m+X}\right]+\mathbb{E}\left[\frac{X\log(m+2)}{m+X}\right]
\leq \frac{(m+\overline{n})\log(m+\overline{n})-m\log m}{m+\overline{n}}+\frac{\overline{n}\log(m+2)}{m+\overline{n}}
= O\!\left(\frac{\log m}{m}\right). \quad (43)

Similarly, the expectation of the absolute value of the upper bound is given by:

\mathbb{E}[|\text{UB}(\Delta L)|]= \mathbb{E}\left[\frac{X\log(m+X)+\frac{5}{2}X}{m+X}\right]
\leq \frac{\overline{n}\log(m+\overline{n})+\frac{5}{2}\overline{n}}{m+\overline{n}}
= O\!\left(\frac{\log m}{m}\right). \quad (44)

Finally, since

0\leq\mathbb{E}[|\Delta L|]\leq\max\{\mathbb{E}[|\text{LB}(\Delta L)|],\mathbb{E}[|\text{UB}(\Delta L)|]\}, \quad (45)

Theorem 3 is proved. ∎

3.2 Node-Shifting Adjustment Strategy

Although the naive adjustment strategy can quickly obtain an updated two-dimensional encoding tree and its corresponding structural entropy, we still need a more effective strategy to reach a better community partitioning with lower structural entropy. Therefore, we propose another novel dynamic adjustment strategy, namely node-shifting, which iteratively moves nodes to their optimal preference communities (Definition 8). Unlike the naive adjustment strategy, edge changes here can alter the communities of existing nodes so as to minimize the structural entropy. Besides, this strategy supports multiple incremental edges at a time as well as the removal of existing edges. Therefore, the node-shifting adjustment strategy fully overcomes the limitations of the naive adjustment strategy listed in Section 3.1.5. In the following, we first describe the node-shifting adjustment strategy in detail, then prove that moving a node to its optimal preference community greedily yields the lowest structural entropy, and finally discuss the limitations of this strategy.

3.2.1 Strategy Description

We first define the optimal preference community (OPC) in Definition 8. The node-shifting adjustment strategy can then be described as follows: (1) let the involved nodes be all nodes that appear in the incremental sequence; (2) for each involved node, move it to its OPC; (3) update the involved nodes to be all nodes connected with the shifted nodes but in different communities, and repeat step (2).

Definition 8 (Optimal Preference Community (OPC)).

Given a graph $G=(\mathcal{V},\mathcal{E})$ and a target node $v_{t}\in\mathcal{V}$, the optimal preference community of $v_{t}$ is defined as the community $T_{\alpha^{*}}$ in which

\alpha^{*}=\begin{cases}\arg\min_{\alpha}\left[(g_{\alpha}-V_{\alpha})\log\frac{V_{\alpha}}{V_{\alpha}+d_{t}}+2d^{(\alpha)}\log\frac{V_{\alpha}+d_{t}}{2m}\right],& v_{t}\notin T_{\alpha};\\ \arg\min_{\alpha}\left[(g_{\alpha}-V_{\alpha}+d_{t}+d^{(\alpha)})\log\frac{V_{\alpha}-d_{t}}{V_{\alpha}}+2d^{(\alpha)}\log\frac{V_{\alpha}}{2m}\right],& v_{t}\in T_{\alpha},\end{cases} \quad (46)

where $d_{t}$ denotes the degree of $v_{t}$ and $d^{(\alpha)}$ denotes the number of edges between $v_{t}$ and $T_{\alpha}$.
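A minimal Python sketch of how Eq. (46) can be evaluated in practice is shown below (logarithms base 2 assumed). The dictionaries vol, cut, and deg_to, which hold $V_{\alpha}$, $g_{\alpha}$, and $d^{(\alpha)}$, are hypothetical containers for illustration, not the framework's actual interface:

import math

def opc(d_t, current_comm, vol, cut, deg_to, m):
    # Optimal preference community alpha* of a target node v_t (Eq. (46)).
    # d_t: degree of v_t; current_comm: community v_t belongs to (or None);
    # vol[a]/cut[a]: V_a and g_a; deg_to[a]: edges between v_t and T_a (d^(a)).
    best, best_score = None, math.inf
    for a in vol:
        v, g, d_a = vol[a], cut[a], deg_to.get(a, 0)
        if a != current_comm:   # case v_t not in T_a
            score = ((g - v) * math.log2(v / (v + d_t))
                     + 2 * d_a * math.log2((v + d_t) / (2 * m)))
        else:                   # case v_t in T_a
            if v - d_t <= 0:    # degenerate case: T_a contains only v_t
                continue
            score = ((g - v + d_t + d_a) * math.log2((v - d_t) / v)
                     + 2 * d_a * math.log2(v / (2 * m)))
        if score < best_score:
            best, best_score = a, score
    return best

Step (2) of the strategy then amounts to calling opc for each involved node and moving the node whenever the returned community differs from its current one.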

Fig. 4 and Fig. 5 give examples illustrating the node-shifting adjustment strategy in different situations. Fig. 4 shows how incremental edges affect community partitioning. In Fig. 4(a), the graph is divided into 2 communities $T_{\alpha}$ and $T_{\beta}$. In Fig. 4(b), 4 incremental edges (red dotted) are inserted into the graph. Then all involved nodes (red outlined) are checked for moving into their OPCs. In this step, one green node is shifted from $T_{\alpha}$ to $T_{\beta}$ (denoted by the red arrow). In Fig. 4(c), the node shifted in the previous step “sends messages” (red dotted arrows) to its neighbors in $T_{\alpha}$ (red outlined). The nodes that received the message (red outlined) are then checked for shifting. At this time, another green node moves into $T_{\beta}$. In Fig. 4(d)-(e), the graph follows the above process to continue the iterative update. The final community partitioning is shown in Fig. 4(f). Fig. 5 shows how new nodes are assigned to communities. Fig. 5(a) gives a 7-node graph with 2 communities. In Fig. 5(b), 3 new nodes (white filled) are added with 7 incremental edges; initially they belong to no community. Then all of them and their existing neighbors become involved nodes. Next, the upper new node is assigned to $T_{\alpha}$ because $T_{\alpha}$ is determined as its OPC. Likewise, the lower two new nodes are moved into their OPCs. In Fig. 5(c), the new involved nodes (red outlined) are checked. Fig. 5(d) shows the final state of this node-shifting process.

Figure 4: An example of the node-shifting adjustment strategy for adding new edges.
Figure 5: An example of the node-shifting adjustment strategy for adding new nodes.

3.2.2 Theoretical Proof

In this part, we provide a simplified model (Fig. 6) to theoretically derive the OPC solution formula (Eq. (46)). In the graph of this model, there exist $r$ communities $T_{\alpha_{1}},T_{\alpha_{2}},\dots,T_{\alpha_{r}}$. There is also a target node $v_{t}$ which does not belong to any community. The number of edges between $v_{t}$ and $T_{\alpha_{i}}$ is denoted by $d^{(i)}$. The volume and the number of cut edges of $T_{\alpha_{i}}$ are denoted by $V_{i}$ and $g_{i}$, respectively. Then we have Theorem 4.

Figure 6: A simplified model for theoretical analysis.
Theorem 4.

Suppose that the node $v_{t}$ is moving into community $T_{\alpha_{k}}$, $k\in\{1,2,\dots,r\}$. The updated structural entropy is minimized when $v_{t}$ moves into $T_{\alpha_{k^{*}}}$ where

k^{*}=\arg\min_{k}\left[(g_{k}-V_{k})\log\frac{V_{k}}{V_{k}+d_{t}}+2d^{(k)}\log\frac{V_{k}+d_{t}}{2m}\right]. \quad (47)

Proof. Let $H_{k}$ be the two-dimensional structural entropy after $v_{t}$ moves into $T_{\alpha_{k}}$. Then $H_{k}$ is given by

H_{k}= \sum_{\alpha_{i}\neq\alpha_{k}}\Bigl(-\frac{g_{i}}{2m}\log\frac{V_{i}}{2m}-\sum_{v_{j}\in T_{\alpha_{i}}}\frac{d_{j}}{2m}\log\frac{d_{j}}{V_{i}}\Bigr)+\Bigl(-\frac{g_{k}+d_{t}-2d^{(k)}}{2m}\log\frac{V_{k}+d_{t}}{2m}
-\sum_{v_{q}\in T_{\alpha_{k}}\setminus\{v_{t}\}}\frac{d_{q}}{2m}\log\frac{d_{q}}{V_{k}+d_{t}}-\frac{d_{t}}{2m}\log\frac{d_{t}}{V_{k}+d_{t}}\Bigr). \quad (48)

Therefore, the structural entropy is minimized when $v_{t}$ moves into $T_{\alpha_{k^{*}}}$ where

k^{*}=\arg\min_{k}H_{k}. \quad (49)

Let the structural entropy before the node movement be $\tilde{H}$, given by

\tilde{H}=\sum_{\alpha_{i}}\Bigl(-\frac{g_{i}}{2m}\log\frac{V_{i}}{2m}-\sum_{v_{j}\in T_{\alpha_{i}}}\frac{d_{j}}{2m}\log\frac{d_{j}}{V_{i}}\Bigr)+\Bigl(-\frac{d_{t}}{2m}\log\frac{d_{t}}{2m}-\frac{d_{t}}{2m}\log\frac{d_{t}}{d_{t}}\Bigr). \quad (50)

Since $\tilde{H}$ is independent of $k$, we have

k^{*}= \arg\min_{k}2m(H_{k}-\tilde{H})
= \arg\min_{k}\left[(g_{k}-V_{k})\log\frac{V_{k}}{V_{k}+d_{t}}+2d^{(k)}\log\frac{V_{k}+d_{t}}{2m}\right]. \quad (51)

Therefore Theorem 4 is proved. ∎

In practice, every node belongs to some community. We can first move the target node out of its community and then use Eq. (47) to determine the OPC. This process is equivalent to directly applying Definition 8 without moving the target node out.

3.2.3 Limitations

The limitations of the node-shifting adjustment strategy are listed below. First, it is hard to bound the gap between the Global Invariant and the updated structural entropy under this strategy. Second, the node-shifting adjustment strategy may not converge in some cases (Fig. 7 gives an example), which forces us to set a maximum number of iterations.

Figure 7: An example where the node-shifting adjustment strategy does not converge. The left shows a graph with two incremental edges added, which cause two nodes to shift. The right shows the updated communities after the first iteration and the movement due at the second iteration. After the second iteration, the graph returns to the state on the left.

3.3

4 Incre-2dSE: an Incremental Measurement Framework of the Updated Two-Dimensional Structural Entropy

4.1 Definitions

Definition 9 (Structural Data).

Given a graph $G$, the Structural Data of $G$ is defined as follows:

  1. (node level) the degree $d_{i}$ of each $v_{i}\in\mathcal{V}$;

  2. (community level) the volume $V_{\alpha}$ and the cut edge number $g_{\alpha}$ of each $\alpha\in A$;

  3. (graph level) the total edge number $m$;

  4. (node-community map) the community ID that $v_{i}$ belongs to, denoted by $\alpha(v_{i})\in A$ where $v_{i}\in T_{\alpha(v_{i})}$;

  5. (community-node map) the community $T_{\alpha}$ of each $\alpha\in A$.
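As a concrete illustration, the Structural Data can be kept in plain dictionaries. The following sketch is a hypothetical container whose field names are ours rather than the framework's actual API:

from dataclasses import dataclass, field
from typing import Dict, Set

@dataclass
class StructuralData:
    degree: Dict[int, int] = field(default_factory=dict)      # d_i per node
    volume: Dict[int, int] = field(default_factory=dict)      # V_alpha per community
    cut: Dict[int, int] = field(default_factory=dict)         # g_alpha per community
    m: int = 0                                                 # total edge number
    node_comm: Dict[int, int] = field(default_factory=dict)   # alpha(v_i)
    comm_nodes: Dict[int, Set[int]] = field(default_factory=dict)  # T_alpha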

Definition 10 (Structural Expressions).

The Structural Expressions of $G$ are defined as follows:

  1. (node level)

     \hat{S}_{N}=\sum_{d\in D}k_{d}\,d\log d, \quad (52)

     where $k_{d}$ denotes the number of nodes of degree $d$ and $D$ denotes the set of all distinct degrees in $G$;

  2. (community level)

     \hat{S}_{C}=\sum_{\alpha\in A}(g_{\alpha}-V_{\alpha})\log V_{\alpha}; \quad (53)

  3. (graph level)

     \hat{S}_{G}=-\sum_{\alpha\in A}g_{\alpha}. \quad (54)
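Given such a container, the three Structural Expressions follow directly from Eqs. (52)-(54). The sketch below (logarithms base 2 assumed) reuses the hypothetical StructuralData container introduced above:

import math
from collections import Counter

def structural_expressions(sd: StructuralData):
    # S_N (Eq. (52)): sum over distinct degrees d of k_d * d * log d.
    k = Counter(sd.degree.values())
    s_n = sum(k_d * d * math.log2(d) for d, k_d in k.items() if d > 0)
    # S_C (Eq. (53)): sum over communities of (g_a - V_a) * log V_a.
    s_c = sum((sd.cut[a] - sd.volume[a]) * math.log2(sd.volume[a])
              for a in sd.volume)
    # S_G (Eq. (54)): minus the total number of cut edges.
    s_g = -sum(sd.cut.values())
    return s_n, s_c, s_g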
Definition 11 (Adjustment).

The Adjustment from the original graph $G$ to the updated graph $G^{\prime}$ is defined as follows:

  1. (node level) the degree change $\delta(v)$ for each node $v\in\phi_{\lambda}$ and the node number change $\delta_{k}(d)=k^{\prime}_{d}-k_{d}$ for each $d\in\mathcal{D}$, where $\mathcal{D}$ denotes the set of degrees whose node numbers change from $G$ to $G^{\prime}$;

  2. (community level) the volume change $\delta_{V}(\alpha)$ and the cut edge number change $\delta_{g}(\alpha)$ of each $\alpha\in\mathcal{A}$;

  3. (graph level) the total edge number change $n$;

  4. (node-community map) the change list of the node-community map Structural Data, denoted by $J_{n\text{-}c}=\{\dots,(v_{i},\alpha^{\prime}(v_{i})),\dots\}$, where $\alpha^{\prime}(v_{i})$ denotes the new community ID of $v_{i}$;

  5. (community-node map) the change list of the community-node map Structural Data, denoted by $J_{c\text{-}n}=\{\dots,(\alpha_{i},v_{j},+/-),\dots\}$, where $(\alpha_{i},v_{j},+/-)$ denotes that community $T_{\alpha_{i}}$ is updated as $T_{\alpha_{i}}\cup\{v_{j}\}$ or $T_{\alpha_{i}}\setminus\{v_{j}\}$.

4.2 Outline

Figure 8: The outline of Incre-2dSE (including two stages, Initialization and Measurement) and the traditional offline algorithm.

The illustration of our incremental framework Incre-2dSE and its static baseline, the traditional offline algorithm (TOA), is shown in Fig. 8. Incre-2dSE aims to measure the updated two-dimensional structural entropy given the original graph, the original encoding tree, and the incremental sequences. The framework consists of two stages, Initialization and Measurement. In the Initialization stage, the Structural Data (Definition 9), which contains a graph's essential data for computing the structural entropy, is extracted from the original graph and its encoding tree (Fig. 8 i). Then the Structural Expressions (Definition 10), which are defined as expressions over the Structural Data, are computed and saved (Fig. 8 ii). For the same original graph, Initialization only needs to be performed once. In the Measurement stage, the Adjustment (Definition 11), a data structure storing the changes in degree, volume, and cut edge number from the original graph to the updated graph, is first generated and saved from the Structural Data and the incremental sequence by the Adjustment generation algorithm with the naive adjustment strategy (NAGA) or the node-shifting adjustment strategy (NSGA) (Fig. 8 1). Then, the Adjustment-based incremental updating algorithm (AIUA) gathers the Structural Data, the Structural Expressions, and the Adjustment to efficiently calculate the updated structural entropy and to update the Structural Data and the Structural Expressions (Fig. 8 2). As the baseline, TOA commences by updating the graph using the incremental sequence (Fig. 8 ⓐ). Next, the new encoding tree of the updated graph is reconstructed using a static community detection method (Fig. 8 ⓑ). Then, the updated Structural Data is extracted (Fig. 8 ⓒ), and finally, the updated structural entropy is computed by definition (Fig. 8 ⓓ).

4.3 The Incremental Framework

4.3.1 Stage 1: Initialization

Given a graph $G=(\mathcal{V},\mathcal{E})$ as a sparse matrix and its two-dimensional encoding tree represented by a dictionary like {community ID 1: node list 1, community ID 2: node list 2, ...}, the Structural Data (Definition 9) can be easily obtained and saved with a time complexity of $O(|\mathcal{E}|)$ (Fig. 8 i). Then the Structural Expressions (Definition 10) can be calculated from the saved Structural Data in $O(|\mathcal{V}|)$ (Fig. 8 ii). Overall, the Initialization stage requires a total time complexity of $O(|\mathcal{E}|)$.

4.3.2 Stage 2: Measurement

In this stage, we first need to generate the Adjustment (Definition 11) from $G$ to $G^{\prime}$. We provide two algorithms for generating Adjustments, one for each of the proposed dynamic adjustment strategies: the naive adjustment generation algorithm (NAGA) and the node-shifting adjustment generation algorithm (NSGA) (Fig. 8 1). The input of both algorithms is the Structural Data of the original graph and an incremental sequence, and the output is an Adjustment. The pseudo-code of NAGA and NSGA is shown in Algorithm 1 and Algorithm 2, respectively. The time complexity of NAGA is $O(n)$ because it traverses the $n$ edges in the incremental sequence and spends only $O(1)$ on each edge. In NSGA, we first need $O(n)$ to initialize the Adjustment (lines 5-31). Second, in the node-shifting part (lines 32-51), we need to determine the OPC for all $|I|$ involved nodes, which costs $O(|A||I|)$. This step is repeated $N$ times, with a time cost of $O(|A|(|I_{1}|+|I_{2}|+\dots+|I_{N}|))$, where $I_{i}$ denotes the set of involved nodes in the $i$-th iteration. Since $|I_{1}|\leq n$ and $|I_{i+1}|\leq|I_{i}|$ most of the time, the total time complexity of NSGA is $O(nN|A|)$.

Input : The Structural Data ($d_{i}$, $V_{\alpha}$, $g_{\alpha}$, $m$, $\alpha(v_{i})$, and $T_{\alpha}$) of $G$, and an incremental sequence $\xi$ from $G$ to $G^{\prime}$.
Output : The Adjustment ($\delta(v_{i})$, $\delta_{k}(d)$, $\delta_{V}(\alpha)$, $\delta_{g}(\alpha)$, $n$, $J_{n\text{-}c}$, and $J_{c\text{-}n}$) from $G$ to $G^{\prime}$ by the naive adjustment strategy.
1  $n:=\text{GetLength}(\xi)$;
2  $\delta(v_{i}):=0$, $\delta_{k}(d):=0$, $\delta_{V}(\alpha):=0$, $\delta_{g}(\alpha):=0$, $\mathcal{D}:=\emptyset$, $\mathcal{A}:=\emptyset$, $J_{n\text{-}c}:=\emptyset$, $J_{c\text{-}n}:=\emptyset$;
3  Let the proxy maps be $\hat{\alpha}(v_{i}):=\alpha(v_{i})$, $v_{i}\in\mathcal{V}$;
4  Let the proxy node-level Structural Data be $\hat{d}_{v}:=d_{v}$, $v\in\mathcal{V}$, where $d_{v}$ denotes the degree of $v$;
5  for $e=(u,v,+)\in\xi$ do
6      $\mathcal{D}:=\mathcal{D}\cup\{\hat{d}_{u},\hat{d}_{v},\hat{d}_{u}+1,\hat{d}_{v}+1\}$;
7      $\delta_{k}(\hat{d}_{u}):=\delta_{k}(\hat{d}_{u})-1$, $\delta_{k}(\hat{d}_{u}+1):=\delta_{k}(\hat{d}_{u}+1)+1$;
8      $\delta_{k}(\hat{d}_{v}):=\delta_{k}(\hat{d}_{v})-1$, $\delta_{k}(\hat{d}_{v}+1):=\delta_{k}(\hat{d}_{v}+1)+1$;
9      $\hat{d}_{u}:=\hat{d}_{u}+1$, $\hat{d}_{v}:=\hat{d}_{v}+1$, $\delta(u):=\delta(u)+1$, $\delta(v):=\delta(v)+1$;
10     if $\hat{\alpha}(u)==$ NULL then
11         $\hat{\alpha}(u):=\hat{\alpha}(v)$;
12         $J_{n\text{-}c}:=J_{n\text{-}c}\cup\{(u,\hat{\alpha}(v))\}$, $J_{c\text{-}n}:=J_{c\text{-}n}\cup\{(\hat{\alpha}(v),u,+)\}$;
14     end if
15     if $\hat{\alpha}(v)==$ NULL then
16         $\hat{\alpha}(v):=\hat{\alpha}(u)$;
17         $J_{n\text{-}c}:=J_{n\text{-}c}\cup\{(v,\hat{\alpha}(u))\}$, $J_{c\text{-}n}:=J_{c\text{-}n}\cup\{(\hat{\alpha}(u),v,+)\}$;
19     end if
20     $\mathcal{A}:=\mathcal{A}\cup\{\hat{\alpha}(u),\hat{\alpha}(v)\}$;
21     if $\hat{\alpha}(v)==\hat{\alpha}(u)$ then
22         $\delta_{V}(\hat{\alpha}(v)):=\delta_{V}(\hat{\alpha}(v))+2$;
24     end if
25     if $\hat{\alpha}(v)\neq\hat{\alpha}(u)$ then
26         $\delta_{V}(\hat{\alpha}(v)):=\delta_{V}(\hat{\alpha}(v))+1$, $\delta_{V}(\hat{\alpha}(u)):=\delta_{V}(\hat{\alpha}(u))+1$;
27         $\delta_{g}(\hat{\alpha}(v)):=\delta_{g}(\hat{\alpha}(v))+1$, $\delta_{g}(\hat{\alpha}(u)):=\delta_{g}(\hat{\alpha}(u))+1$;
29     end if
31 end for
32 return the Adjustment from $G$ to $G^{\prime}$.
Algorithm 1: Naive adjustment generation algorithm (NAGA)
Input : The Structural Data ($d_{i}$, $V_{\alpha}$, $g_{\alpha}$, $m$, $\alpha(v_{i})$, and $T_{\alpha}$) of the original graph $G$, an incremental sequence $\xi$ from $G$ to $G^{\prime}$, and the iteration number $N$.
Output : The Adjustment ($\delta(v_{i})$, $\delta_{k}(d)$, $\delta_{V}(\alpha)$, $\delta_{g}(\alpha)$, $n$, $J_{n\text{-}c}$, and $J_{c\text{-}n}$) from $G$ to $G^{\prime}$.
2  $n:=\text{GetLength}(\xi)$, $J_{n\text{-}c}:=\emptyset$, $J_{c\text{-}n}:=\emptyset$, $\delta(v_{i}):=0$, $\delta_{k}(d):=0$, $\delta_{V}(\alpha):=0$, $\delta_{g}(\alpha):=0$;
3  Let the proxy maps be $\hat{\alpha}(v_{i}):=\alpha(v_{i})$, $\hat{\hat{\alpha}}(v_{i}):=\alpha(v_{i})$, $v_{i}\in\mathcal{V}$;
4  Let the proxy node-level Structural Data be $\hat{d}_{v}:=d_{v}$, $v\in\mathcal{V}$, where $d_{v}$ denotes the degree of $v$;
5  Let the involved node set be $I:=\emptyset$;
// Initialize the Adjustment
6  for $e=(u,v,op)\in\xi$ do
7      $I:=I\cup\{u,v\}$;
8      if $op==+$ then
9          if $u,v$ are both existing nodes in $\mathcal{V}$ then
10             Update the node-level Adjustment (using the proxy node-level Structural Data, the same as below);
11             if $\alpha(u)$ and $\alpha(v)$ are both not None then
12                 Update the community-level Adjustment without changing the community partitioning;
14             else if $\alpha(u)$ or $\alpha(v)$ is None (suppose $\alpha(u)==$ None) then
15                 $\delta_{V}(\alpha(u)):=\delta_{V}(\alpha(u))+1$; $\delta_{g}(\alpha(u)):=\delta_{g}(\alpha(u))+1$;
18         else if $u$ or $v$ does not exist (suppose $u$ does not exist) then
19             $\alpha(u):=$ None;
20             Update the node-level Adjustment;
21             $\delta_{V}(\alpha(u)):=\delta_{V}(\alpha(u))+1$; $\delta_{g}(\alpha(u)):=\delta_{g}(\alpha(u))+1$;
22             $J_{n\text{-}c}:=J_{n\text{-}c}\cup\{(u,\text{None})\}$; $J_{c\text{-}n}:=J_{c\text{-}n}\cup\{(\text{None},u,+)\}$;
24         else if $u$ and $v$ are both not existing then
25             $\alpha(u):=$ None, $\alpha(v):=$ None;
26             Update the node-level Adjustment;
27             $J_{n\text{-}c}:=J_{n\text{-}c}\cup\{(u,\text{None})\}$; $J_{n\text{-}c}:=J_{n\text{-}c}\cup\{(v,\text{None})\}$;
28             $J_{c\text{-}n}:=J_{c\text{-}n}\cup\{(\text{None},u,+)\}$; $J_{c\text{-}n}:=J_{c\text{-}n}\cup\{(\text{None},v,+)\}$;
30         Update the proxy node-level Structural Data as if the edge $e$ is added into the graph;
32     else if $op==-$ then
33         Update the node-level and the community-level Adjustment without changing the community partitioning;
34         Update the proxy node-level Structural Data as if the edge $e$ is removed from the graph;
36     end if
38 end for
// Start node-shifting
32 $\tau:=1$;
33 while $\tau\leq N$ and $I\neq\emptyset$ do
34     $\tilde{I}:=\emptyset$, $X:=\emptyset$;
35     for each node $v\in I$ do
36         Determine the OPC of $v$, denoted by $\alpha^{*}$, using $\hat{\alpha}(v)$;
37         $X:=X\cup\{(v,\alpha^{*})\}$;
38         if $\hat{\alpha}(v)\neq\alpha^{*}$ then
39             Update the Adjustment as if $v$ moves into $T_{\alpha^{*}}$, using $\hat{\hat{\alpha}}(v)$;
40             for each node $z\in$ Neighbor($v$) do
41                 if $\hat{\alpha}(z)\neq\alpha^{*}$ then
42                     $\tilde{I}:=\tilde{I}\cup\{z\}$;
44                 end if
46             end for
47             $\hat{\hat{\alpha}}(v):=\alpha^{*}$;
49         end if
51     end for
52     for each $(v,\alpha)\in X$ do
53         $\hat{\alpha}(v):=\alpha$;
55     end for
56     $I:=\tilde{I}$, $\tau:=\tau+1$;
58 end while
60 return the Adjustment from $G$ to $G^{\prime}$.
Algorithm 2: Node-shifting adjustment generation algorithm (NSGA)

After getting the Adjustment, the updated two-dimensional structural entropy of $G^{\prime}$ can then be incrementally calculated by:

H^{\mathcal{T}^{\prime}}(G^{\prime})=-\frac{1}{2m+2n}\left[\hat{S}^{\prime}_{N}+\hat{S}^{\prime}_{C}+\hat{S}^{\prime}_{G}\log(2m+2n)\right], \quad (55)

where $\hat{S}^{\prime}_{N}$, $\hat{S}^{\prime}_{C}$, and $\hat{S}^{\prime}_{G}$ denote the incrementally updated Structural Expressions:

\hat{S}^{\prime}_{N} =\hat{S}_{N}+\sum_{d\in\mathcal{D}}\delta_{k}(d)\,d\log d;
\hat{S}^{\prime}_{C} =\hat{S}_{C}+\sum_{\alpha\in\mathcal{A}}\left[(g_{\alpha}+\delta_{g}(\alpha)-V_{\alpha}-\delta_{V}(\alpha))\log(V_{\alpha}+\delta_{V}(\alpha))-(g_{\alpha}-V_{\alpha})\log V_{\alpha}\right];
\hat{S}^{\prime}_{G} =\hat{S}_{G}-\sum_{\alpha\in\mathcal{A}}\delta_{g}(\alpha). \quad (56)
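Only the changed terms are touched, which is what keeps the update in $O(n)$. The following sketch (hypothetical argument names; logarithms base 2 assumed) implements Eqs. (55) and (56) directly:

import math

def aiua_update(s_n, s_c, s_g, m, n, delta_k, delta_v, delta_g, vol, cut):
    # Incrementally update the Structural Expressions (Eq. (56)) and
    # return the updated entropy (Eq. (55)). delta_k: {d: change of k_d};
    # delta_v/delta_g: {alpha: change of V_alpha / g_alpha};
    # vol/cut: original V_alpha and g_alpha of every changed community.
    s_n += sum(dk * d * math.log2(d) for d, dk in delta_k.items() if d > 0)
    for a in set(delta_v) | set(delta_g):
        v, g = vol[a], cut[a]
        dv, dg = delta_v.get(a, 0), delta_g.get(a, 0)
        s_c += ((g + dg - v - dv) * math.log2(v + dv)
                - (g - v) * math.log2(v))
        s_g -= dg
    h = -(s_n + s_c + s_g * math.log2(2 * m + 2 * n)) / (2 * m + 2 * n)
    return h, s_n, s_c, s_g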

To implement the above incremental calculation process, we provide the Adjustment-based incremental updating algorithm (AIUA) (Fig. 8 2). Given the Structural Data and Structural Expressions of the original graph and an Adjustment to the updated graph, we can compute the updated two-dimensional structural entropy incrementally and efficiently update the Structural Data and Structural Expressions, preparing for the next AIUA run when a new Adjustment arrives. The pseudo-code of AIUA is shown in Algorithm 3. The time complexity of updating the Structural Data is $O(|\phi_{\lambda}|+|\mathcal{A}|+|J_{n\text{-}c}|+|J_{c\text{-}n}|)\leq O(n)$. The time complexity of updating the Structural Expressions is $O(|\mathcal{D}|+|\mathcal{A}|)\leq O(n)$. The time complexity of calculating the updated two-dimensional structural entropy is $O(1)$. In summary, the total time complexity of AIUA is $O(n)$.

Input : The Structural Data ($d_{i}$, $V_{\alpha}$, $g_{\alpha}$, $m$, $\alpha(v_{i})$, and $T_{\alpha}$) and the Structural Expressions ($\hat{S}_{N}$, $\hat{S}_{C}$, and $\hat{S}_{G}$) of the original graph $G$, and the Adjustment ($\delta(v_{i})$, $\delta_{k}(d)$, $\delta_{V}(\alpha)$, $\delta_{g}(\alpha)$, $n$, $J_{n\text{-}c}$, and $J_{c\text{-}n}$) from $G$ to $G^{\prime}$.
Output : The updated two-dimensional structural entropy $H^{\mathcal{T}^{\prime}}(G^{\prime})$, the updated Structural Data ($d^{\prime}_{i}$, $V^{\prime}_{\alpha}$, $g^{\prime}_{\alpha}$, $m^{\prime}$, $\alpha^{\prime}(v_{i})$, and $T^{\prime}_{\alpha}$), and the updated Structural Expressions ($\hat{S}^{\prime}_{N}$, $\hat{S}^{\prime}_{C}$, and $\hat{S}^{\prime}_{G}$).
// Update the Structural Data
1  for each $v_{i}\in\phi_{\lambda}$ do
2      $d^{\prime}_{i}:=d_{i}+\delta(v_{i})$;
4  end for
6  for each $\alpha\in\mathcal{A}$ do
7      $V^{\prime}_{\alpha}:=V_{\alpha}+\delta_{V}(\alpha)$; $g^{\prime}_{\alpha}:=g_{\alpha}+\delta_{g}(\alpha)$;
9  end for
10 $m^{\prime}:=m+n$;
11 for each $(v,\alpha)\in J_{n\text{-}c}$ do
12     $\alpha^{\prime}(v):=\alpha$;
14 end for
15 $T^{\prime}_{\alpha}:=T_{\alpha}$ for all $\alpha\in A$;
16 for each $(\alpha,v,op)\in J_{c\text{-}n}$ do
17     if $op==+$ then
18         $T^{\prime}_{\alpha}:=T^{\prime}_{\alpha}\cup\{v\}$;
20     end if
21     if $op==-$ then
22         $T^{\prime}_{\alpha}:=T^{\prime}_{\alpha}\setminus\{v\}$;
24     end if
26 end for
// Update the Structural Expressions
27 $\hat{S}^{\prime}_{N}:=\hat{S}_{N}$; $\hat{S}^{\prime}_{C}:=\hat{S}_{C}$; $\hat{S}^{\prime}_{G}:=\hat{S}_{G}$;
28 for each $d\in\mathcal{D}$ do
29     $\hat{S}^{\prime}_{N}:=\hat{S}^{\prime}_{N}+\delta_{k}(d)\,d\log d$;
31 end for
32 for each $\alpha\in\mathcal{A}$ do
33     $\hat{S}^{\prime}_{C}:=\hat{S}^{\prime}_{C}+(g_{\alpha}+\delta_{g}(\alpha)-V_{\alpha}-\delta_{V}(\alpha))\log(V_{\alpha}+\delta_{V}(\alpha))-(g_{\alpha}-V_{\alpha})\log V_{\alpha}$;
34     $\hat{S}^{\prime}_{G}:=\hat{S}^{\prime}_{G}-\delta_{g}(\alpha)$;
36 end for
// Calculate the updated two-dimensional structural entropy
37 $H^{\mathcal{T}^{\prime}}(G^{\prime}):=-\frac{1}{2m+2n}[\hat{S}^{\prime}_{N}+\hat{S}^{\prime}_{C}+\hat{S}^{\prime}_{G}\log(2m+2n)]$;
39 return $H^{\mathcal{T}^{\prime}}(G^{\prime})$, the updated Structural Data, and the updated Structural Expressions.
Algorithm 3: Adjustment-based incremental updating algorithm (AIUA)

4.4 The Static Baseline: Traditional Offline Algorithm (TOA)

The traditional offline algorithm (TOA) reconstructs the encoding tree for each updated graph and calculates the updated two-dimensional structural entropy by definition. TOA consists of the following four steps. First, it generates the updated graph by combining the original graph and the incremental sequence (ⓐ in Fig. 8). Second, it partitions the graph node set into communities using one of several static community detection algorithms, e.g., Infomap [16], Louvain [17], and Leiden [18], to construct the two-dimensional encoding tree (ⓑ in Fig. 8). Third, the node-level, community-level, and graph-level Structural Data of the updated graph is counted and saved (ⓒ in Fig. 8). Finally, the updated structural entropy is computed via Eq. (14) (ⓓ in Fig. 8). The total time cost of TOA is $O(|\mathcal{E}|+n)$ plus the cost of the chosen community detection algorithm. The pseudo-code of TOA is shown in Algorithm 4.

Input : The original graph $G$ and an incremental sequence $\xi$.
Output : The updated two-dimensional structural entropy $H^{\mathcal{T}^{\prime}}(G^{\prime})$.
1 Get the updated graph $G^{\prime}$ by combining $G$ and $\xi$;
2 Construct the two-dimensional encoding tree $\mathcal{T}^{\prime}$;
3 Get the degree $d^{\prime}_{i}$ of each node $v_{i}\in\mathcal{V}^{\prime}$;
4 Get the volume $V^{\prime}_{\alpha}$ and the cut edge number $g^{\prime}_{\alpha}$ of each $\alpha\in A^{\prime}$;
5 Get the total edge number $m^{\prime}$ of $G^{\prime}$;
6 Obtain $H^{\mathcal{T}^{\prime}}(G^{\prime})$ via Eq. (14);
7 return $H^{\mathcal{T}^{\prime}}(G^{\prime})$;
Algorithm 4: Traditional Offline Algorithm (TOA)

5 Discussion on More Complex Graphs

5.1


5.2

5.2.1

Definition 12 ().
Definition 13 ().

5.2.2

Definition 14 ().
Theorem 5 ().
Definition 15 ().
Definition 16 ().

5.2.3

Figure 9:

6 Experiments and Evaluations

In this section, we conduct extensive experiments based on the application of dynamic graph real-time monitoring and community optimization. Below we first describe the 3 artificial dynamic graph datasets and 3 real-world datasets. Then we give the experimental results and analysis.

6.1 Artificial Datasets

First, we generate 3 different initial states of the dynamic graphs by utilizing the random_partition_graph (Random) [19], gaussian_random_partition_graph (Gaussian) [20], and stochastic_block_model (SBM) [21] methods in “Networkx” [22] (a Python library).

  • Random: This method has 3 parameters. The first parameter is a list of community sizes $\mathcal{S}=[s_{1},s_{2},\dots]$, which gives the node number of each community of the initial state. The other two parameters are probabilities $p_{\mathrm{in}}$ and $p_{\mathrm{ac}}$: nodes in the same community are connected with probability $p_{\mathrm{in}}$, and nodes across different communities are connected with probability $p_{\mathrm{ac}}$ (a generation sketch follows Fig. 10).

  • Gaussian: This method generates communities whose sizes are drawn from a Gaussian distribution with mean $s$ and variance $s/v$, given the total node number, and connects nodes with probabilities $p_{\mathrm{in}}$ and $p_{\mathrm{ac}}$ as above.
  • SBM: This method takes the list of community sizes $\mathcal{S}$ and a matrix of edge probabilities between community pairs, whose diagonal entries play the role of $p_{\mathrm{in}}$ and whose off-diagonal entries play the role of $p_{\mathrm{ac}}$.
Figure 10:
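As a rough illustration of this generation step, the initial states can be produced with the three Networkx calls named above; the parameter values follow Table 2, and the seeds are placeholders:

import random
import networkx as nx

random.seed(0)
sizes = [800, 1000, 1200, 1400, 1600]

# Random: fixed within/across-community connection probabilities.
g_random = nx.random_partition_graph(sizes, 0.05, 0.001, seed=0)

# Gaussian: community sizes drawn from a Gaussian with mean s and shape v.
g_gaussian = nx.gaussian_random_partition_graph(
    6000, s=1000, v=100, p_in=0.05, p_out=0.001, seed=0)

# SBM: a symmetric probability matrix, one entry per community pair,
# with diagonal entries acting as p_in and off-diagonal entries as p_ac.
k = len(sizes)
p = [[0.0] * k for _ in range(k)]
for i in range(k):
    for j in range(i, k):
        p[i][j] = p[j][i] = (random.uniform(0.002, 0.01) if i == j
                             else random.uniform(0.01, 0.05))
g_sbm = nx.stochastic_block_model(sizes, p, seed=0)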

After that, we generate incremental sequences and updated graphs for each initial state by the Hawkes Process [23], referring to some settings of Zuo et al. [24]. The Hawkes Process models discrete event sequences by assuming that historical events can influence the occurrence of current events. The conditional intensity of a target node $y$ given a source node $x$, and the resulting sampling probability, are:

\Lambda_{y\mid x}(t)=-\|\mathbf{e}_{x}-\mathbf{e}_{y}\|^{2}+\sum_{t_{h}<t}-\|\mathbf{e}_{h}-\mathbf{e}_{y}\|^{2}\exp\left(-\delta_{x}(t-t_{h})\right), \quad (70)
p_{y_{i}\mid x}=\frac{\Lambda_{y_{i}\mid x}}{\sum_{y}\Lambda_{y\mid x}}. \quad (71)
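A minimal sketch of this sampling rule is given below. The embedding table e, the history list hist, and the decay rate delta_x are illustrative stand-ins for the actual setup of Zuo et al. [24]. Note that the raw intensities in Eq. (70) are negative, so the sketch exponentiates them (a softmax, a common practical choice and an assumption on our part) before normalizing as in Eq. (71):

import numpy as np

def intensity(x, y, t, e, hist, delta_x):
    # Lambda_{y|x}(t) from Eq. (70): a base term plus exponentially
    # decayed contributions from the historical neighbors h of x.
    lam = -np.sum((e[x] - e[y]) ** 2)
    for h, t_h in hist:  # hist: [(node, time), ...] with t_h < t
        lam -= np.sum((e[h] - e[y]) ** 2) * np.exp(-delta_x * (t - t_h))
    return lam

def target_probs(x, t, candidates, e, hist, delta_x):
    # Eq. (71) normalizes the intensities over candidate targets; since
    # they are negative here, we apply a softmax first (our assumption).
    lams = np.array([intensity(x, y, t, e, hist, delta_x) for y in candidates])
    w = np.exp(lams - lams.max())  # subtract max for numerical stability
    return w / w.sum()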
Table 2: Parameter values for generating the artificial datasets.
Random initial state: $\mathcal{S}=[800,1000,1200,1400,1600]$, $p_{\mathrm{in}}=0.05$, and $p_{\mathrm{ac}}=0.001$.
Gaussian initial state: The total node number is 6000, $s=1000$, $v=100$, $p_{\mathrm{in}}=0.05$, and $p_{\mathrm{ac}}=0.001$.
SBM initial state: $\mathcal{S}=[800,1000,1200,1400,1600]$, $p_{\mathrm{in}}\sim\mathcal{U}(0.002,0.01)$*, and $p_{\mathrm{ac}}\sim\mathcal{U}(0.01,0.05)$.
Hawkes process: The dimension of the embedding vectors is set as 16 and $p_{\mathrm{hn}}$ is 0.05.
  • * $\mathcal{U}$ denotes the uniform distribution.

Table 3: Statistics of the artificial and real-world datasets.
Dataset | $|\mathcal{V}_{0}|$ | $|\mathcal{E}_{0}|$ | $\mathbb{E}(|\Delta\mathcal{V}|)$ | $\mathbb{E}(|\Delta\mathcal{E}|)$ | # of snapshots
Random-Hawkes | | | | | 20
Gaussian-Hawkes | | | | | 20
SBM-Hawkes | | | | | 20
Cit-HepPh | | | | | 20
DBLP | | | | | 20
Facebook | | | | | 20

6.2 Real-World Datasets

For the real-world datasets, we choose Cit-HepPh [26], DBLP [27], and Facebook [28] to conduct our experiments. Cit-HepPh is a citation network in the field of high-energy physics phenomenology from 1993 to 2003. Facebook records the establishment process of the user friendship relationship of about 52% of Facebook users in New Orleans from 2006 to 2009. For each dataset, we cut out 21 consecutive snapshots (an initial state and 20 updated graphs). Since structural entropy is only defined on connected graphs, we only preserve the largest connected component of each snapshot. Overall, the statistics of the artificial and real-world datasets are briefly shown in Table 3.

6.3 Results and Analysis

6.3.1 Application: Dynamic Graph Real-Time Monitoring and Community Optimization

In this application, we aim to optimize the community partitioning and monitor the corresponding two-dimensional structural entropy with our incremental algorithms, i.e., NAGA+AIUA and NSGA+AIUA, and the baseline TOA, quantifying the community quality in real time for each snapshot of a dynamic graph. Specifically, for each dataset, we first choose a static community detection method (referred to as a static method) from Infomap [16], Louvain [17], and Leiden [18] to generate the initial state's community partitioning. In this paper, Louvain is implemented by the louvain_communities method in “Networkx” [22]. Infomap and Leiden are implemented by the community_infomap and the community_leiden methods from the Python library “igraph” [29], respectively. Louvain and Infomap take default parameters, while in Leiden we use “Modularity” as the objective instead of the original setting “Constant Potts Model (CPM)”. This is because Leiden with CPM cannot effectively partition the communities on our datasets: all partitions generated by Leiden with CPM contain only one node. Then we use NAGA+AIUA, NSGA+AIUA, and TOA to respectively measure the updated two-dimensional structural entropy at each time stamp. The default maximum iteration number of NSGA is set to 5; the reason is discussed in Section 6.3.3.
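For reference, a sketch of how the initial partitions can be obtained with the library calls named above is given below; Graph.from_networkx and the _nx_name vertex attribute are our assumptions about the igraph/Networkx bridge, and all parameters beyond the stated objective are defaults:

import networkx as nx
import igraph as ig
from networkx.algorithms.community import louvain_communities

def initial_partition(g_nx: nx.Graph, method: str):
    # Return the initial community partitioning as a list of node sets.
    if method == "louvain":
        return [set(c) for c in louvain_communities(g_nx)]
    g_ig = ig.Graph.from_networkx(g_nx)        # assumed igraph bridge
    if method == "infomap":
        clusters = g_ig.community_infomap()
    else:                                      # "leiden"
        # "Modularity" objective instead of the default CPM (see text).
        clusters = g_ig.community_leiden(objective_function="modularity")
    names = g_ig.vs["_nx_name"]                # map back to networkx node IDs
    return [{names[i] for i in c} for c in clusters]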

Figure 11: The updated structural entropy measured by NAGA+AIUA, NSGA+AIUA, and TOA on real-world datasets with different static methods. Lower structural entropy represents better performance.
Figure 12: The updated structural entropy measured by NAGA+AIUA, NSGA+AIUA, and TOA on artificial datasets with different static methods. Since the three curves for the artificial datasets are closer to each other than those for the real-world datasets, the structural entropy value of NAGA+AIUA is subtracted from all displayed values to better show the differences between the curves.

The experimental results are shown in Fig. 11 and Fig. 12. Overall, the structural entropy obtained by NSGA+AIUA, based on the node-shifting strategy, is consistently smaller than that obtained by NAGA+AIUA, based on the naive adjustment strategy, in all settings. For example, NSGA+AIUA reduces the updated structural entropy by up to about 12% and 10% on Facebook and DBLP, respectively. This verifies that the node-shifting strategy is theoretically and practically able to reduce the structural entropy and shows that it can obtain a significantly better encoding tree (community partitioning) than the naive adjustment strategy.

While maintaining high efficiency (evaluated in Section 6.3.3), our incremental algorithms still exhibit performance that is close to or better than TOA.

6.3.2 Hyperparameter Study

In this part, we evaluate the influence of different iteration numbers of the node-shifting adjustment strategy on the updated structural entropy. We use NSGA+AIUA with iteration number $N=3,5,7,9$ to measure the mean updated structural entropy of the 20 updated graphs, respectively, in each setting of Section 6.3.1. As we can see from Table 4, the updated structural entropy decreases as the number of iterations increases most of the time. The reason is that, as the number of iterations increases, more nodes shift to their OPCs, which further reduces the structural entropy. This experiment also demonstrates that our node-shifting adjustment strategy has excellent interpretability.

Table 4: The Updated Structural Entropy by the Node-Shifting Adjustment Strategy with Different Numbers of Iterations. Bold numbers denote the lowest structural entropy.
# of iterations ($N$) | $N=3$ | $N=5$ | $N=7$ | $N=9$
Cit-HepPh Infomap 9.7155 9.7126 9.7122 9.7120
Louvain 10.7804 10.7791 10.7786 10.7784
Leiden 10.7802 10.7792 10.7788 10.7786
DBLP Infomap 7.3274 7.3255 7.3244 7.3242
Louvain 9.4851 9.4842 9.4841 9.4838
Leiden 9.3919 9.3917 9.3912 9.3909
Facebook Infomap 10.2134 10.2075 10.2060 10.2055
Louvain 11.2946 11.2898 11.2876 11.2864
Leiden 11.3087 11.3052 11.3038 11.3030
Random-Hawkes Infomap 11.7836* 11.7832 11.7831 11.7831
Louvain 11.7836* 11.7832 11.7831 11.7831
Leiden 11.7836* 11.7832 11.7831 11.7831
Gaussian-Hawkes Infomap 11.3352 11.3348 11.3347 11.3346
Louvain 11.3352 11.3348 11.3347 11.3346
Leiden 11.3352 11.3348 11.3347 11.3346
SBM-Hawkes Infomap 11.6596 11.6595 11.6595 11.6595
Louvain 11.6596 11.6595 11.6595 11.6595
Leiden 11.6596 11.6595 11.6595 11.6595

6.3.3 Time Consumption Evaluation

Fig. 13 shows the time consumption comparison between our incremental algorithms, i.e., NAGA+AIUA and NSGA+AIUA ($N=3,5,7,9$), on all 6 datasets. The vertical axis represents the mean time consumption of the chosen incremental algorithm across all 20 snapshots. The horizontal axis represents the 3 selected static methods. As we can see, the time cost of NSGA+AIUA increases with the iteration number $N$. In addition, the time consumption of NAGA+AIUA is less than that of NSGA+AIUA with $N=5$ in most cases.

Figure 13: Mean time consumption of NAGA+AIUA and NSGA+AIUA ($N=3,5,7,9$) over 20 time stamps on each dataset under different static methods.
Table 5: Mean online (incremental) and offline (TOA) time consumption with speedup.
Dataset | Cit-HepPh | DBLP | Facebook
Static Method | Infomap | Louvain | Leiden | Infomap | Louvain | Leiden | Infomap | Louvain | Leiden
Online Time¹ | 1.23s² | 1.07s | 1.09s | 0.64s | 0.62s | 0.63s | 1.92s | 1.67s | 1.69s
Offline Time | 49.96s | 6.42s | 5.05s | 49.80s | 4.72s | 12.30s | 80.76s | 8.15s | 7.43s
Speedup³ | ↑40.62x | ↑6.00x | ↑4.63x | ↑77.81x | ↑7.61x | ↑19.52x | ↑42.06x | ↑4.88x | ↑4.40x
Dataset | Random-Hawkes | Gaussian-Hawkes | SBM-Hawkes
Static Method | Infomap | Louvain | Leiden | Infomap | Louvain | Leiden | Infomap | Louvain | Leiden
Online Time | 2.21s | 2.48s | 1.32s | 1.48s | 1.52s | 1.55s | 1.76s | 1.55s | 1.05s
Offline Time | 58.37s | 6.06s | 3.02s | 18.72s | 4.08s | 3.69s | 248.04s | 4.01s | 3.02s
Speedup | ↑26.41x | ↑2.44x | ↑2.28x | ↑12.65x | ↑2.68x | ↑2.38x | ↑140.93x | ↑2.59x | ↑2.88x
  • 1 Time cost of NSGA+AIUA ($N=5$).
  • 2 Mean time consumption over 20 time stamps.
  • 3 Speedup = Offline Time / Online Time.

The reason why we choose $N=5$ as the default parameter is as follows. As shown in Fig. 13, the time consumption of the node-shifting strategy rises linearly with $N$. However, Table 4 shows that the rate of decline of the structural entropy gradually decreases from $N=3$ to $N=9$, and the structural entropy values for $N=7$ and $N=9$ are very close. That is, the optimization efficiency of the structural entropy decreases with increasing $N$. To seek a balance between efficiency and effectiveness, we take the compromise value $N=5$.

6.3.4 Update Threshold Analysis

In scenarios with minimal changes, updating the structural information might not be necessary. In this part, we set a threshold on the magnitude of graph changes before initiating updates to cut the total time consumption. Specifically, the updated structural entropy is not calculated until the incremental edge number exceeds a certain percentage (referred to as the update threshold $\theta$) of the edge number of the last updated graph. As we can see from Table 6, the total time is reduced by 63%-87% with NAGA+AIUA and by 47%-72% with NSGA+AIUA when $\theta$ is set from 5% to 20%. Meanwhile, the fluctuation of the final structural entropy remains within 0.15%, which indicates that the threshold has little impact on the precision of the structural entropy. Overall, the update threshold leads to improved efficiency and better adaptation to graphs undergoing frequent alterations.
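The gating logic itself is a few lines. The sketch below is hypothetical driver code around the measurement stage, where measure_update stands for one NAGA/NSGA+AIUA run on a batch of incremental edges and m0 for the initial edge number:

def monitor(increments, measure_update, m0, theta=0.05):
    # Gate the measurement: recompute only when the pending incremental
    # edges exceed theta times the edge number of the last updated graph.
    pending, m_last, h = [], m0, None
    for xi in increments:            # xi: one incremental sequence
        pending.extend(xi)
        if len(pending) > theta * m_last:
            h = measure_update(pending)   # e.g., one NSGA + AIUA run
            m_last += len(pending)        # assumes additions only; removals
            pending = []                  # would decrement the count
        yield h                      # entropy of the last updated graph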

Table 6: Total time consumption and final structural entropy under different update thresholds.
Update Threshold | $\theta=0\%$ | $\theta=5\%$ | $\theta=10\%$ | $\theta=15\%$ | $\theta=20\%$
NA¹ Total Time | 32.89s | 12.20s (↓63%)² | 9.16s (↓72%) | 5.64s (↓83%) | 4.44s (↓87%)
NA Final SE | 10.9736 | 10.9687 (≤0.1%)³ | 10.9690 (≤0.1%) | 10.9668 (≤0.1%) | 10.9681 (≤0.1%)
NS¹ Total Time | 24.83s | 13.11s (↓47%) | 10.52s (↓58%) | 8.27s (↓67%) | 6.93s (↓72%)
NS Final SE | 10.0269 | 10.0243 (≤0.1%) | 10.0186 (≤0.1%) | 10.0352 (≤0.1%) | 10.0419 (≤0.15%)
  • 1 “NA/NS” represents NAGA/NSGA+AIUA.
  • 2 The percentage reduction in time consumption compared to the case where $\theta=0\%$.
  • 3 The percentage fluctuation of the final structural entropy compared to the case where $\theta=0\%$.

6.3.5 Robustness Evaluation

In this subsection, we evaluate the robustness of the proposed framework on 5 new artificial datasets with different noise conditions. Specifically, we generate 5 initial states using the random partition method mentioned in Section 6.1 with $p_{\text{ac}}$ (representing the noise) ranging from 0.01 to 0.05 and $\mathcal{S}=[100,100,200,200,200]$. For the convenience of comparison, the total edge number of the initial state is kept nearly the same, fluctuating between 37357 and 38175, by manually selecting an appropriate $p_{\text{in}}$. After generating the 5 initial states, we use the Hawkes Process to generate 74714 incremental edges, around 2 times the initial edge number, and split them into 20 sub-sequences as the increments of 20 time stamps. Then we use NAGA/NSGA+AIUA and TOA to calculate the structural entropy for all 20 updated graphs; the results are shown in Fig. 14.

Figure 14: The structural entropy measured by NAGA/NSGA+AIUA and TOA under different noise levels $p_{\text{ac}}$.

As we can see from Fig. 14, the structural entropy curves of NAGA/NSGA+AIUA are smooth regardless of the value of $p_{\text{ac}}$. However, one or two jumps in structural entropy take place for TOA. Furthermore, the higher the noise ($p_{\text{ac}}$), the earlier and larger the structural entropy mutates, and the more unstable the TOA algorithm is. Additionally, upon examination, the number of communities does not change, meaning that these abrupt changes do not stem from changes in the number of communities but from dramatic changes in their content. Therefore, we can conclude that our algorithm maintains the structural entropy at a stable and lower level, owing to its property of keeping the original community structure from changing drastically, and shows high robustness to increasing noise.

In addition, we note that the structural entropy curve gradually shifts upward as $p_{\text{ac}}$ increases. This is because the edge numbers of the initial states have slight deviations: the larger $p_{\text{ac}}$ is, the more initial edges there are (from 37357 to 38175).

6.3.6 Gap between Incre-2dSE and the Current Static Structural Entropy Measurement Method

In this part, we study the gap between Incre-2dSE and the current static algorithms. The current mainstream static algorithm for structural entropy measurement is structural entropy minimization (SEM) [6], a greedy $k$-dimensional encoding tree construction algorithm for static graphs whose objective function is the structural entropy. Fig. 15 illustrates the SEM algorithm in the two-dimensional case (2d-SEM). Fig. 15(a) is an example graph. We first construct a one-dimensional encoding tree for this graph (Fig. 15(b)), which has one root and $|\mathcal{V}|$ leaf nodes. Next, we add a successor node to each leaf node to construct an initialized two-dimensional encoding tree (Fig. 15(c)). Then we select the best pair of 1-height nodes to merge, i.e., the pair whose merging minimizes the structural entropy (Fig. 15(d)). At last, we repeat this merging step until the structural entropy no longer decreases, obtaining the final two-dimensional encoding tree (Fig. 15(e)).
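A condensed sketch of this greedy merging loop is shown below. It is a simplified re-implementation for illustration, not the released SITool code; logarithms base 2 are assumed, and communities are summarized by their volumes $V_{\alpha}$ (vol), cut edge numbers $g_{\alpha}$ (cut), and pairwise edge counts (between):

import math
from itertools import combinations

def merge_delta(a, b, vol, cut, between, m):
    # Entropy change of merging communities a and b (S_N is unaffected):
    # the merged community has volume V_a + V_b and loses 2*e_ab cut edges.
    e_ab = between.get(frozenset((a, b)), 0)
    v_c, g_c = vol[a] + vol[b], cut[a] + cut[b] - 2 * e_ab
    d = ((g_c - v_c) * math.log2(v_c)
         - (cut[a] - vol[a]) * math.log2(vol[a])
         - (cut[b] - vol[b]) * math.log2(vol[b])
         + 2 * e_ab * math.log2(2 * m))
    return -d / (2 * m)             # change of H = -(...)/2m

def greedy_2d_sem(vol, cut, between, m):
    # Merge the best pair until no merge lowers the structural entropy.
    comms = set(vol)
    while len(comms) > 1:
        delta, a, b = min((merge_delta(x, y, vol, cut, between, m), x, y)
                          for x, y in combinations(comms, 2))
        if delta >= 0:
            break
        vol[a] += vol.pop(b)
        cut[a] = cut[a] + cut.pop(b) - 2 * between.pop(frozenset((a, b)), 0)
        for c in comms - {a, b}:    # reroute b's ties to the merged community
            between[frozenset((a, c))] = (between.get(frozenset((a, c)), 0)
                                          + between.pop(frozenset((b, c)), 0))
        comms.remove(b)
    return comms

Each iteration scans all community pairs, which is what makes the static method expensive compared with the local updates of Incre-2dSE.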

Figure 15: An illustration of the two-dimensional structural entropy minimization process.

We measure the structural entropy of Incre-2dSE (NAGA/NSGA+AIUA) and 2d-SEM (code: https://github.com/RingBDStack/SITool/tree/main/python) through all timestamps, as in Section 6.3.1, on the six datasets listed in Table 3. According to our experimental results (Fig. 16), there is a gap between the structural entropy of 2d-SEM and that of our dynamic framework Incre-2dSE in some cases. Specifically, on all datasets except DBLP, the structural entropy of NAGA+AIUA is higher than, but very close to, that of 2d-SEM. On DBLP, NAGA+AIUA is better than 2d-SEM. By contrast, NSGA+AIUA performs significantly better than 2d-SEM on all datasets. In addition, the gap between NSGA+AIUA and 2d-SEM gradually increases on Cit-HepPh and Facebook, gradually decreases on Random-Hawkes and Gaussian-Hawkes, and remains almost unchanged on DBLP and SBM-Hawkes. The reason why there is a gap between Incre-2dSE and 2d-SEM is the same as for the gap between Incre-2dSE and TOA in Section 6.3.1: Incre-2dSE updates the community partitioning and encoding tree locally and incrementally, while 2d-SEM constructs encoding trees globally from scratch for each snapshot.

Figure 16: The updated structural entropy measured by Incre-2dSE (NAGA/NSGA+AIUA) and 2d-SEM on the six datasets.

6.3.7 Convergence Evaluation

In this part, we conduct a statistical experiment to confirm the convergence of the Local Difference and its first-order absolute moment. We first generate artificial original graphs with total edge numbers increasing from 480 to 24000. Based on each original graph, we generate 30 incremental sequences whose sizes $n$ are sampled from a normal distribution with a mean of 100 and a standard deviation of 10. These incremental sequences are generated by repeatedly adding edges within a community with probability $p_{\mathrm{pin}}\in[0,0.8)$, adding edges across two random communities with probability $p_{\mathrm{pac}}\in[0,0.1)$, and adding nodes with probability $p_{\mathrm{n}}=1-p_{\mathrm{pin}}-p_{\mathrm{pac}}$. We then record the Local Difference and its upper bound. The results are shown in Fig. 17. As the total edge number increases from 1 to 50 times, the mean Local Difference gradually decreases by 95.98% (from 1.08 to 0.04) and is always positive. This solidly supports the convergence of the Local Difference and its first-order absolute moment. Moreover, the Local Difference is always below its upper bound, which confirms the validity of our bound.

Figure 17: The statistics of the Local Difference and its upper bound.

6.3.8 Time Consumption of Approximate One-Dimensional Structural Entropy Measurement

In this subsection, we evaluate the time consumption of the two approximate one-dimensional structural entropy measurement methods, Global Aggregation and Local Propagation, on two artificial datasets, i.e., Erdős-Rényi (ER) [30] and Cycle.

For each initial state, we first use Global Aggregation with iteration number 50 to calculate the initial stationary distribution from a uniform distribution. We then use Global Aggregation and Local Propagation to iteratively calculate the structural entropy of the 4 updated graphs and record the time consumption of each iteration.

The time consumption results are shown in Fig. 18. On the ER dataset, Local Propagation is faster than Global Aggregation only in the first several iterations. Besides, the time consumption is lower when $n$ is smaller. This is because the time consumption of each iteration of Local Propagation is approximately linear in the size of the involved node set: the smaller the incremental size $n$, the fewer the involved nodes. However, since the mean number of successors in ER graphs is more than 1, the involved nodes rapidly expand to the entire node set. Therefore, when the iteration number is larger than 5, the efficiency of Local Propagation is nearly the same as that of Global Aggregation. On the contrary, in the Cycle dataset, the mean number of direct successors of each node is nearly 1. That is, the size of the involved node set does not change, so the time consumption of Local Propagation does not increase as the iteration number grows. In addition, the larger the incremental size $n$, the higher the time consumption, whatever the iteration number is.

Figure 18: Time consumption of Global Aggregation and Local Propagation on the ER and Cycle datasets.

7 Related Works

Graph Entropy. Many efforts have been devoted to measuring the information in graphs. The graph entropy was first defined and investigated by Rashevsky [2], Trucco [3], and Mowshowitz [4]. After that, a different graph entropy closely related to information and coding theory was proposed by Körner [5]. Later, Bianconi [31] introduced the concept of “structural entropy of network ensembles”, known as the Gibbs entropy. Anand et al. [32] proposed the Shannon entropy for network ensembles. Braunstein et al. [33] proposed the von Neumann graph entropy based on the combinatorial graph Laplacian matrix. These three metrics [31, 32, 33] are defined by statistical mechanics and are used to compare different graph models. However, most of the current graph entropy measures are based on the Shannon entropy definition for probability distributions, which has significant limitations when applied to graphs [34]. Recently, many efforts have been devoted to capturing the dynamic changes of graphs, e.g., the research based on the Algorithmic Information Theory [35, 36]. The structural entropy method used in this paper, proposed by Li et al. [6], provides an approach to measuring the high-dimensional information embedded in graphs and can further decode the graph's hierarchical abstracting by an optimal encoding tree. This method is widely used in the fields of graph learning [14], reinforcement learning [37], and social networks [38].

Fast Computation for Graph Entropy. Chen et al. [39] proposed a fast incremental von Neumann graph entropy computational framework, which reduces the cubic complexity to linear complexity in the number of nodes and edges. Liu et al. [40, 41] used the structural entropy [6] as a proxy of von Neumann graph entropy for the latter's fast computation and also implemented an incremental method for one-dimensional structural entropy. In this paper, we mainly focus on the incremental computation of two-dimensional structural entropy based on our dynamic adjustment strategies for encoding trees.

8 Conclusion

In this paper, we propose two novel dynamic adjustment strategies, namely the naive adjustment strategy and the node-shifting adjustment strategy, to analyze the updated structural entropy and incrementally adjust the original community partitioning towards lower structural entropy. We also implement an incremental framework, Incre-2dSE, which supports the real-time measurement of the updated two-dimensional structural entropy. In the future, we aim to develop more dynamic adjustment strategies for hierarchical community partitioning and incremental measurement algorithms for higher-dimensional structural entropy.

Acknowledgments

This work is supported by the National Key R&D Program of China through grant 2022YFB3104703, NSFC through grants 62322202 and 61932002, Beijing Natural Science Foundation through grant 4222030, Guangdong Basic and Applied Basic Research Foundation through grant 2023B1515120020, CCF-DiDi GAIA Collaborative Research Funds for Young Scholars, and the Fundamental Research Funds for the Central Universities.

References

  • [1] C. Shannon, The lattice theory of information, Transactions of the IRE Professional Group on Information Theory 1 (1) (1953) 105–107.
  • [2] N. Rashevsky, Life, information theory, and topology, The Bulletin of Mathematical Biophysics 17 (3) (1955) 229–235.
  • [3] E. Trucco, A note on the information content of graphs, The Bulletin of Mathematical Biophysics 18 (2) (1956) 129–135.
  • [4] A. Mowshowitz, Entropy and the complexity of graphs, Tech. rep. (1967).
  • [5] J. Körner, Coding of an information source having ambiguous alphabet and the entropy of graphs, in: Proceedings of the Prague Conference on Information Theory, 1973, pp. 411–425.
  • [6] A. Li, Y. Pan, Structural information and dynamical complexity of networks, IEEE Transactions on Information Theory 62 (6) (2016) 3290–3339.
  • [7] A. Li, Mathematical Principles of Information World: Calculuses of Information, Preprint., 2022.
  • [8] A. Li, X. Yin, B. Xu, D. Wang, J. Han, Y. Wei, Y. Deng, Y. Xiong, Z. Zhang, Decoding topologically associating domains with ultra-low resolution hi-c data by graph structural entropy, Nature Communications 9 (1) (2018) 1–12.
  • [9] Y. W. Zhang, M. B. Wang, S. C. Li, Supertad: robust detection of hierarchical topologically associated domains with optimized structural information, Genome Biology 22 (1) (2021) 1–20.
  • [10] Y. Liu, J. Liu, Z. Zhang, L. Zhu, A. Li, Rem: From structural entropy to community structure deception, in: Proceedings of the NeurIPS, Vol. 32, 2019, pp. 12918–12928.
  • [11] A. Li, X. Zhang, Y. Pan, Resistance maximization principle for defending networks against virus attack, Physica A: Statistical Mechanics and its Applications 466 (2017) 211–223.
  • [12] Z. Yang, G. Zhang, J. Wu, J. Yang, Q. Z. Sheng, H. Peng, A. Li, S. Xue, J. Su, Minimum entropy principle guided graph neural networks, in: Proceedings of the WSDM, 2023, pp. 114–122.
  • [13] J. Wu, X. Chen, K. Xu, S. Li, Structural entropy guided graph hierarchical pooling, in: Proceedings of the ICML, Vol. 162, 2022, pp. 24017–24030.
  • [14] D. Zou, H. Peng, X. Huang, R. Yang, J. Li, J. Wu, C. Liu, P. S. Yu, Se-gsl: A general and effective graph structure learning framework through structural entropy optimization, in: Proceedings of the WWW, 2023, pp. 499–510.
  • [15] F. Harary, G. Gupta, Dynamic graph models, Mathematical and Computer Modelling 25 (7) (1997) 79–87.
  • [16] M. Rosvall, C. T. Bergstrom, Maps of random walks on complex networks reveal community structure, Proceedings of the National Academy of Sciences 105 (4) (2008) 1118–1123.
  • [17] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre, Fast unfolding of communities in large networks, Journal of Statistical Mechanics: Theory and Experiment 2008 (10) (2008) P10008.
  • [18] V. A. Traag, L. Waltman, N. J. Van Eck, From Louvain to Leiden: guaranteeing well-connected communities, Scientific Reports 9 (1) (2019) 5233.
  • [19] S. Fortunato, Community detection in graphs, Physics Reports 486 (3-5) (2010) 75–174.
  • [20] U. Brandes, M. Gaertler, D. Wagner, Experiments on graph clustering algorithms, in: European Symposium on Algorithms, Springer, 2003, pp. 568–579.
  • [21] P. W. Holland, K. B. Laskey, S. Leinhardt, Stochastic blockmodels: First steps, Social Networks 5 (2) (1983) 109–137.
  • [22] A. Hagberg, P. Swart, D. S. Chult, Exploring network structure, dynamics, and function using NetworkX, Tech. rep. (2008).
  • [23] A. G. Hawkes, Spectra of some self-exciting and mutually exciting point processes, Biometrika 58 (1) (1971) 83–90.
  • [24] Y. Zuo, G. Liu, H. Lin, J. Guo, X. Hu, J. Wu, Embedding temporal network via neighborhood formation, in: Proceedings of the SIGKDD, 2018, pp. 2857–2866.
  • [25] A. Grover, J. Leskovec, node2vec: Scalable feature learning for networks, in: Proceedings of the SIGKDD, 2016, pp. 855–864.
  • [26] J. Leskovec, J. Kleinberg, C. Faloutsos, Graphs over time: densification laws, shrinking diameters and possible explanations, in: Proceedings of the SIGKDD, 2005, pp. 177–187.
  • [27] D. A. Bader, H. Meyerhenke, P. Sanders, D. Wagner, Graph Partitioning and Graph Clustering, Vol. 588, American Mathematical Society, Providence, RI, 2013.
  • [28] B. Viswanath, A. Mislove, M. Cha, K. P. Gummadi, On the evolution of user interaction in Facebook, in: Proceedings of the 2nd ACM Workshop on Online Social Networks, 2009, pp. 37–42.
  • [29] G. Csardi, T. Nepusz, The igraph software package for complex network research, InterJournal Complex Systems 1695 (2006) 1–9.
  • [30] P. Erdős, A. Rényi, On random graphs I, Publicationes Mathematicae Debrecen 6 (1959) 290–297.
  • [31] G. Bianconi, Entropy of network ensembles, Physical Review E 79 (3) (2009) 036114.
  • [32] K. Anand, G. Bianconi, Entropy measures for networks: Toward an information theory of complex topologies, Physical Review E 80 (4) (2009) 045102.
  • [33] S. L. Braunstein, S. Ghosh, S. Severini, The laplacian of a graph as a density matrix: a basic combinatorial approach to separability of mixed states, Annals of Combinatorics 10 (3) (2006) 291–317.
  • [34] H. Zenil, N. A. Kiani, J. Tegnér, Low-algorithmic-complexity entropy-deceiving graphs, Physical Review E 96 (1) (2017) 012308.
  • [35] H. Zenil, N. A. Kiani, A. A. Zea, J. Tegnér, Causal deconvolution by algorithmic generative models, Nature Machine Intelligence 1 (1) (2019) 58–66.
  • [36] H. Zenil, N. A. Kiani, F. Marabita, Y. Deng, S. Elias, A. Schmidt, G. Ball, J. Tegner, An algorithmic information calculus for causal discovery and reprogramming systems, Iscience 19 (2019) 1160–1172.
  • [37] X. Zeng, H. Peng, A. Li, Effective and stable role-based multi-agent collaboration by structural information principles, in: Proceedings of the AAAI conference on artificial intelligence, Vol. 37, 2023, pp. 11772–11780.
  • [38] Y. Cao, H. Peng, Z. Yu, P. S. Yu, Hierarchical and incremental structural entropy minimization for unsupervised social event detection, in: Proceedings of the AAAI conference on artificial intelligence, 2024, pp. 1–9.
  • [39] P.-Y. Chen, L. Wu, S. Liu, I. Rajapakse, Fast incremental von neumann graph entropy computation: Theory, algorithm, and applications, in: Proceedings of the ICML, 2019, pp. 1091–1101.
  • [40] X. Liu, L. Fu, X. Wang, C. Zhou, On the similarity between von neumann graph entropy and structural information: Interpretation, computation, and applications, IEEE Transactions on Information Theory 68 (4) (2022) 2182–2202.
  • [41] X. Liu, L. Fu, X. Wang, Bridging the gap between von neumann graph entropy and structural information: Theory and applications, in: Proceedings of the WWW, 2021, pp. 3699–3710.