Extrema Analysis of Node Centrality
in Weighted Networks
Abstract
A very interesting matter of Network Science is assessing how complex a given network is; in other words, by how much such a network departs from any general patterns which could be evoked for its description. Among other choices, these patterns can be defined in terms of node or edge properties, as one of the various modes of centrality. Although there are centrality indexes defined for weighted graphs, the discussions on this subject are far from over, especially because the influence of edge weights in this regard can vary not only in form but also in intensity, an approach still incipient so far. For the aforementioned complexity evaluation, the situation is even more acute: to the best of our knowledge, this was never addressed in the literature from the standpoint of centrality in weighted graphs. This paper details a sound methodology covering both topics, as well as experiments confirming its practical applicability and pointing to future trends in this context.
keywords: Adjustable Centrality, Network Complexity, Line-line Intersections
1 Introduction
Centrality assessment is a branch of study that emerged from complex network analysis and, in short, aims to indicate how important the nodes of a network are to its structure. In practical terms, this can mean identifying influential people in online social applications [1], analyzing good routes in urban transportation networks [2], or even understanding which proteins can be considered essential for protein-protein interactions in a cell [3]. Three centrality measures are considered the most classic: degree, closeness, and betweenness. They were originally developed for networks in which relationships are binary, i.e., there either is or is not a connection between any pair of vertices [4]. But there are networks with numerical weights attached to the connections, quantifying some characteristic of these: e.g., distance, intensity, capacity, or cost. Likewise, several centrality measures have also been proposed with the latter type of network in mind.
Some of these [5] were proposed with the objective of incorporating concomitantly the number of connections – as originally intended in this regard – and the weights associated with them, allowing one to regulate the relative importance given to each of these attributes. This dual description is relevant since, in graphs that represent internet traffic, for example, among paths that have the same distance or cost, the quality of resources flowing through more intermediate nodes tends to be worse than with fewer of them. Such measures, then, allow modeling this kind of situation. Moreover, there can be several other situations in which such a tuning makes centrality assessment more accurate. Thus, the present work sought to extend such previously developed adjustable centrality generalizations to weighted networks, aiming to obtain a more polished and secure understanding of the information they provide.
This paper details multiple contributions to the literature on the just mentioned subject. The first is the establishment of adjustable centrality measures based on the logarithm of the ratio between weighted and unweighted centralities, which possess quite interesting properties. This serves as the cornerstone to the evaluation of how wide the range encompassing all possible node rankings is when the importance given to edge weights varies, which can be interpreted as a proxy to the assessment of the relevance of the weights to the network structure. One last part of this whole is an algorithm to realize such an evaluation without relying on a brute-force approach, as well as a formal proof of its predicates.
The following sections, which complete this research report, are enumerated and summarized next. Section 2 establishes the theoretical foundations on which this work was built. Section 3 provides a brief systematization and analysis of previously proposed node centrality measures which allow the influence of edge weights on their computations to be calibrated. An alternative methodology to such indexes, which overcomes some of their weaknesses, is introduced in section 4. While section 5 shows how this methodology can be extended to allow analyzing weighted networks from a so far unexplored point of view, section 6 formally describes how this can be efficiently realized. Section 7 details the application of the herein proposed ideas in a variety of practical scenarios. At last, section 8 brings some finishing remarks and points to possible continuations.
2 Basic Concepts and Notation
A (simple, undirected) graph or network $G = (V, E)$ is a structure defined by the elements of the set $V$, called vertices or nodes, which may have pairwise relationships, named edges, ties or links, that compose the set $E$. A node $u$ is adjacent to a node $v$ if there is the edge $\{u, v\} \in E$. A path between nodes $u$ and $v$ is a non-repeating sequence of nodes starting with $u$ and ending with $v$, or vice-versa, in which every two neighboring entries in the sequence are adjacent nodes in the network. The number of edges traversed in a path is its length. If there is a path between every pair of vertices, the graph is connected. There are also weighted or valued graphs, whose edges have a numerical value associated to each of them. These can be denoted as $G = (V, E, w)$, where $w : E \to \mathbb{R}$. In numerous situations such values are strictly positive, what is also assumed in the remainder of this text. As a consequence, it can be conveniently assumed that if $\{u, v\} \notin E$, then $w_{uv} = 0$.
The degree centrality refers to the number of edges incident upon a node $v$:

$C_D(v) = k_v = |\{u \in V : \{u, v\} \in E\}|$    (1)
This index covers only the influence of the nodes from a local perspective on the graph. On the other hand, the closeness centrality is the inverse of the total geodesic distance of one node to all the others (relying on the assumption that the graph is connected, which is not strictly necessary [6]). Between two nodes $u$ and $v$, the distance $d(u, v)$ is the length of the shortest path between $u$ and $v$. The closeness, then, is formalized as:

$C_C(v) = \left[\sum_{u \in V \setminus \{v\}} d(v, u)\right]^{-1}$    (2)
The two Freeman’s measures [4] just presented were originally proposed for unweighted graphs. Based on them, some generalizations were developed with valued networks in mind. In this regard, Barrat et al. [7] extended the notion of degree of a node to what was called strength, defined by the sum of the weights of all edges incident to it:

$s_v = \sum_{u : \{u, v\} \in E} w_{uv}$    (3)
Newman [8] introduced the closeness to weighted networks based on Dijkstra’s shortest path algorithm [9]. In this method, the length of a path is the sum of the weights of each tie in it. The minimum length of a path between nodes $u$ and $v$ taking weights into account is denoted by $d^w(u, v)$. With the notion of distance aligned with this approach, the weighted closeness centrality is:

$C_C^w(v) = \left[\sum_{u \in V \setminus \{v\}} d^w(v, u)\right]^{-1}$    (4)
It should be noted that Dijkstra’s algorithm was designed for graphs in which weights have a negative connotation, i.e., the shortest paths have the least accumulated weight. Thus, when the weights concern positive relations (e.g., affinity, capacity, similarity), their multiplicative inverses are commonly used for shortest path evaluation [5].
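As an illustration of this convention, the following sketch computes the unweighted (2) and weighted (4) closeness on a toy graph, inverting the weights before applying Dijkstra’s algorithm. It assumes the networkx library; the graph and the attribute names are illustrative, not taken from the paper’s datasets:

```python
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([("a", "b", 4.0), ("b", "c", 1.0), ("a", "c", 2.0)])

# When weights encode positive relations (affinity, capacity), invert them
# so that Dijkstra's least-accumulated-weight paths are meaningful.
for u, v, d in G.edges(data=True):
    d["inv_weight"] = 1.0 / d["weight"]

# Unweighted closeness: inverse of the total geodesic distance (eq. 2).
d_plain = dict(nx.all_pairs_shortest_path_length(G))
# Weighted closeness via Dijkstra on the inverted weights (eq. 4).
d_weighted = dict(nx.all_pairs_dijkstra_path_length(G, weight="inv_weight"))

for v in G:
    c_plain = 1.0 / sum(d_plain[v][u] for u in G if u != v)
    c_weighted = 1.0 / sum(d_weighted[v][u] for u in G if u != v)
    print(v, c_plain, c_weighted)
```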
To evaluate or compare non-trivial properties of networks, random graphs [10] with specific characteristics can be used. A key model in this sense is the Erdős-Rényi (ER) [11], whose notation $G(n, p)$ represents a graph that has $n$ vertices, with $p$ being the probability of each pair of vertices being connected by an edge. Although edge weights are not incorporated by the original ER model, it has been taken as the basis of others which include this feature. Given an ER graph, possibly the simplest way to obtain its weighted version is to assign to each link a value randomly defined according to a uniform probability distribution [12], for example. Another ER-based weighted model, with quite interesting properties, is the WRG [13], implemented by having the edge weights assigned according to a geometric distribution which uses the value of $p$ as its parameter.
The random models just referred to, especially the ER, are sometimes seen as lacking practical usefulness, since their properties are not aligned with those observed in some remarkable real-world networks, considered “more complex” [14]. Accordingly, alternatives that keep some features of a given network and randomize other aspects of it have been developed. A method that preserves node degrees while altering network topology is based on pairwise edge switching [15]. It consists of performing the following procedure a sufficient number of times: randomly choose two edges $\{u, v\}$ and $\{x, y\}$; then swap their ends to form the edges $\{u, y\}$ and $\{x, v\}$. A weighted version of this idea was previously used [16], which preserved the edge weight distribution but not node strengths or network topology. Other models that only shuffle the edge weights, conserving topology [7, 16], or that randomize node degrees and edge weights while the node strengths remain unchanged [17] are also known.
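A minimal sketch of the degree-preserving pairwise edge switching just described is given below, under the assumption that edges arrive as a list of 2-tuples; the rejection of swaps creating self-loops or parallel edges follows common practice [15], and the function and variable names are ours:

```python
import random

def edge_switch(edges, n_switches, seed=42):
    """Degree-preserving rewiring: repeatedly pick edges {u, v} and {x, y}
    and replace them with {u, y} and {x, v}, rejecting swaps that would
    create self-loops or parallel edges."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_switches and attempts < 100 * n_switches:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (u, v), (x, y) = edges[i], edges[j]
        if len({u, v, x, y}) < 4:                 # would create a self-loop
            continue
        new1, new2 = frozenset((u, y)), frozenset((x, v))
        if new1 in present or new2 in present:    # would create a parallel edge
            continue
        present -= {frozenset((u, v)), frozenset((x, y))}
        present |= {new1, new2}
        edges[i], edges[j] = (u, y), (x, v)
        done += 1
    return edges

print(edge_switch([(1, 2), (3, 4), (1, 3), (2, 4)], n_switches=2))
```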
3 An Overview of Adjustable Centrality Indexes
The study of topological properties of complex networks has evolved over time and is ultimately important because these attributes are responsible for defining their functionality. For example, the structure of a social network affects the propagation of information or diseases in it, and the topology of an electrical grid affects supply robustness and stability [18]. Often, individual nodes assume a key function in this regard (e.g., people with great influence, large power stations) and, consequently, it is desirable to identify these nodes. That is the purpose of the centrality measures [19]. The versions of degree and closeness centralities presented in section 2 take into account a single feature of the connections to determine node centrality: the number of links or, leaving aside the former, the sum of edge weights.
More recently, some centrality indices for weighted networks have approached this goal of incorporating both tie features simultaneously. Laplacian centrality [20] and the WKPN algorithm [21] have tried to merge local and global characteristics of the vertices to assess their importance. The former is based on the 2-walks in which a node participates. The latter considers single-source shortest paths with at most $k$ edges, where the analyzed node is the spread source in an infectious disease model. The h-degree index [22] of a node displays only a local perspective and is defined as the greatest value $h$ so that the node has at least $h$ edges and the weight of each of them is greater than or equal to $h$.
Compared to the previous ones, the measures proposed by Opsahl et al. [5] are more adaptable in terms of balancing the relative importance given to edge weights and counts. Such a statement is based on their use of a tuning parameter $\alpha$. Aiming at a local perspective, the degree (1) and strength (3) of a node are combined to define a new version of degree centrality as follows:

$C_D^{w\alpha}(v) = k_v^{1-\alpha} \cdot s_v^{\alpha}$    (5)
A similar approach was later applied to the weighted k-shell decomposition method [23]. In addition, some of the subsequent works include the analysis of node centrality in an online social network under the variation of $\alpha$ [24], and an attempt to find an optimal value for such a parameter [25]. An alternative closeness centrality, in turn, is based on weighted distances calculated on the graph obtained from the original by raising each edge weight to $\alpha$, represented by $G^\alpha$:

$C_C^{w\alpha}(v) = \left[\sum_{u \in V \setminus \{v\}} d^{w\alpha}(v, u)\right]^{-1}$    (6)

where $d^{w\alpha}(v, u)$ denotes the weighted distance between $v$ and $u$ in $G^\alpha$.
For both measures, when $\alpha = 0$, the same results of Freeman’s original measures are produced, disregarding weights. Whereas when $\alpha = 1$, the outcome corresponds to the generalizations for valued networks.
Besides the intrinsic differences regarding node reach in the network, Opsahl’s measures differ with respect to the type of summarization [26] each of them employs. In expression 5, both degree and strength are calculated in their original form, and the tuning parameter is used to offset the product of these values, similarly to a weighted geometric mean. In expression 6, $\alpha$ is used to directly alter the edge weights through exponentiation, and the results are then summed for shortest path computations, resembling the p-norm of a vector. Aiming to clarify such a discrepancy, these measures are herein identified with the labels prod (product) and sum, $C_{D,\mathrm{prod}}$ and $C_{C,\mathrm{sum}}$, instead of as originally denoted, $C_D^{w\alpha}$ and $C_C^{w\alpha}$, respectively.
Moreover, such an approach favors considering other combinations of the available parts. Therefore alternatives to Opsahl’s degree and closeness centralities could be defined as follows, respectively:

$C_{D,\mathrm{sum}}(v, \alpha) = \sum_{u : \{u, v\} \in E} w_{uv}^{\alpha}$    (7)

$C_{C,\mathrm{prod}}(v, \alpha) = C_C(v)^{1-\alpha} \cdot C_C^{w}(v)^{\alpha}$    (8)
Both pairs of adjustable degree and closeness centralities produce the same results when $\alpha$ is 0 or 1. Table 1 provides a qualitative description of the behavior of the measures when the parameter is outside the range defined by the just mentioned benchmark values.
| Centrality | Summarization: Product | Summarization: Sum |
| --- | --- | --- |
| Degree | For $\alpha > 1$, central nodes would have great aggregated edge weight resulting from a small number of ties. When $\alpha < 0$, the more edges and the lower the node strength, the better, which is useful when weights have a negative connotation. | Heavy edges are prioritized when $\alpha > 1$: as $\alpha$ tends to infinity, the nodes attached to the heaviest edges become the most central ones. This preference is directly inverted for $\alpha < 0$, making the lightest edges the most important. |
| Closeness | For $\alpha > 1$, central vertices are those whose shortest paths to others generally have low accumulated weight but numerous edges. With a negative parameter, the preferred vertices are those whose distances to the others, disregarding the weights, are short, while the ones whose total weighted distance is small are discouraged. | When $\alpha$ is negative, the central vertices would be those whose shortest paths to others have few, heavy edges. The opposite is true for $\alpha > 1$: shortest paths with numerous weaker ties are less affected in this scenario, favoring vertices whose shortest paths have such characteristics. |
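For concreteness, the degree-based measures (5) and (7) can be sketched as follows, assuming a networkx-style graph whose edges carry a “weight” attribute; the function names and the toy graph are ours. Note how both measures coincide at the benchmarks $\alpha = 0$ and $\alpha = 1$:

```python
import networkx as nx

def degree_prod(G, v, alpha):
    # Eq. (5): k_v**(1 - alpha) * s_v**alpha, a weighted geometric mean
    # of the unweighted degree and the strength.
    k = G.degree(v)
    s = G.degree(v, weight="weight")
    return k ** (1 - alpha) * s ** alpha

def degree_sum(G, v, alpha):
    # Eq. (7): each incident edge weight raised to alpha, then summed.
    return sum(d["weight"] ** alpha for _, _, d in G.edges(v, data=True))

G = nx.Graph()
G.add_weighted_edges_from([(1, 2, 0.5), (1, 3, 4.0), (2, 3, 2.0)])
for alpha in (0.0, 0.5, 1.0):
    print(alpha, degree_prod(G, 1, alpha), degree_sum(G, 1, alpha))
```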
Now consider the graph presented in figure 1. In order to exemplify the distinguishable behavior of measures with distinct types of summarization, figure 2 shows how the node rankings derived from the prod (5) and sum (7) degree centralities vary according to the value assigned to $\alpha$. It is interesting to remark that the majority of rank changes occur outside the unit interval, suggesting that analyses restricted to it may not take full benefit of these indexes. And as a demonstration of their unique aspects, it can be seen that one of the nodes leaves the last place as $\alpha$ grows in the prod method – while in sum it occupies the same position over the entire interval – indicating its greater accumulated strength, relative to its number of ties, compared to the others. Another node, in the sum measure, occupies the second position at both extremes, since its links have both the minimum and maximum edge weights. Despite their particularities, in specific conditions the measures may coincide, as indicated in proposition 1.
Figure 1: an example weighted graph. Figure 2: node rankings derived from the prod (5) and sum (7) degree centralities as functions of $\alpha$.
Proposition 1.
Given a weighted graph $G$ and a vertex $v$, if every edge incident to $v$ has the same weight $c$, then $C_{D,\mathrm{prod}}(v, \alpha) = C_{D,\mathrm{sum}}(v, \alpha)$ for every $\alpha$.
Proof.
Since the weight of every link of $v$ is $c$, $s_v = c \, k_v$. And as the graph is assumed to be connected, $k_v \geq 1$. Then, for any $\alpha$, $C_{D,\mathrm{prod}}(v, \alpha) = k_v^{1-\alpha}(c \, k_v)^{\alpha} = c^{\alpha} k_v = \sum_{u : \{u, v\} \in E} c^{\alpha} = C_{D,\mathrm{sum}}(v, \alpha)$.
∎
4 Improvements on Adjustable Node Centrality
Opsahl’s measures have different methods of calculation, from which a twofold description of adjustable degree and closeness centralities was presented and subsequently compared. There is, however, a common thread in the four expressions: all of them are adjusted using exponentiation. For this reason they are notably susceptible to floating-point errors [27, 28] such as overflow or underflow, possibly combined with loss of significance, thereby limiting the range of the parameter $\alpha$ in order to enforce numerical computability. To quantify that, consider the widely used IEEE 754 [29] double precision format, in which 11 of 64 bits are reserved for an exponent field holding a value whose magnitude is not greater than 1024. As a consequence, real numbers whose absolute value is outside the interval $[2^{-1024}, 2^{1024}]$ cannot be properly represented in this format.
Hence, safe intervals of $\alpha$ can be established for the measures: let $G$ be a hypothetical graph; let $B$ be the set of values which could be bases for exponentiation by $\alpha$ during the computation of one of the centrality indexes for vertices of $G$; let $\beta = \max_{b \in B} |\log_2 b|$; then such an interval is defined as $I_s = [-1024/\beta, \, 1024/\beta]$. The definition of $B$ goes as follows: for $C_{D,\mathrm{prod}}$, $B = \{k_v : v \in V\} \cup \{s_v : v \in V\}$; for $C_{D,\mathrm{sum}}$ and $C_{C,\mathrm{sum}}$, $B = \{w_e : e \in E\}$; and for $C_{C,\mathrm{prod}}$, $B = \{C_C(v) : v \in V\} \cup \{C_C^w(v) : v \in V\}$. At last, as a slight abuse of notation, hereinafter the following expression is used to denote the length of a real interval $I = [l, u]$: $|I| = u - l$.
The method presented depends on the floating-point model employed, but its utility remains as long as arithmetic precision is finite. Moreover, it is important to notice that setting $\alpha$ respecting its safe interval can be seen as a necessary but not sufficient condition for numerical computability: after all, the values exponentiated by $\alpha$ are then used in other arithmetic operations.
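The following sketch illustrates both the failure mode and the safe interval computation for IEEE 754 doubles, under the definitions given above; the helper name is ours, and bases equal to 1 are ignored since they never overflow:

```python
import math

def safe_alpha_interval(bases, max_exp=1024):
    """Largest symmetric interval of alpha keeping b**alpha representable
    as an IEEE 754 double (|binary exponent| <= max_exp) for every base."""
    beta = max(abs(math.log2(b)) for b in bases if b != 1.0)
    bound = max_exp / beta
    return (-bound, bound)

weights = [0.5, 3.0, 120.0]
lo, hi = safe_alpha_interval(weights)
print(f"safe interval of alpha: [{lo:.1f}, {hi:.1f}]")  # ~[-148.3, 148.3]

try:
    120.0 ** (hi * 1.01)   # just outside the safe interval...
except OverflowError:      # ...CPython floats raise instead of returning inf
    print("overflow")
```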
Although it is possible to determine safe parameter ranges, such a range may be quite narrow, up to the point that its consideration disables flexibility and, consequently, usefulness. A better approach in this sense can be obtained by taking the logarithm of the indices adjustable via product. This leads to a third pair of measures, defined as follows:
$C_{D,\log}(v, \alpha) = \log C_{D,\mathrm{prod}}(v, \alpha) = (1 - \alpha)\log k_v + \alpha \log s_v$    (9)

$C_{C,\log}(v, \alpha) = \log C_{C,\mathrm{prod}}(v, \alpha) = (1 - \alpha)\log C_C(v) + \alpha \log C_C^{w}(v)$    (10)
Such an approach preserves the benchmarks for $\alpha \in \{0, 1\}$, although the classical values of degree and closeness come to be in logarithmic fashion instead. Furthermore, the comparability of the centralities remains unchanged, as the logarithm is a monotonic function; so the standings in figure 2 for the prod and log methods are the same. This also reverberates on figure 3: due to the linear variation, the centrality scale is kept uniform across the parameter interval, making the entire ranking and the proximity of the quantities easily discernible. Another relevant feature of the log-based measures is the enhanced flexibility for $\alpha$, since the safe intervals become relatively unrestricted compared to those of the exponential indices.
Figure 3: log-based degree centralities of the nodes of figure 1 as lines over $\alpha$; ranking changes correspond to line intersections.
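As a minimal sketch (names ours), eq. (9) stays computable for extreme values of $\alpha$ that would overflow its exponential counterpart (5):

```python
import math

def degree_log(k, s, alpha):
    # Eq. (9): (1 - alpha) * log(k) + alpha * log(s), linear in alpha.
    return (1 - alpha) * math.log(k) + alpha * math.log(s)

# Monotonicity of the log keeps the prod and log rankings identical,
# while the linear form never overflows for extreme alpha values:
print(degree_log(k=3, s=30.0, alpha=1e6))  # fine; 3**(1 - 1e6) * 30**1e6 is not
```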
5 Assessment of Edge Weights Significance
Section 4 introduced the idea of safe intervals for the parameter $\alpha$, aiming at assuring numerical computability of the centrality indexes in question, and showed how the proposed measures $C_{D,\log}$ and $C_{C,\log}$ are superior to their counterparts in this regard. However, in practice the entire safe interval may not be functional: for example, there can be an upper bound on $\alpha$ after which the node centrality ranking defined according to $C_{D,\log}$ remains unaltered; if the smallest value for such a bound is known, setting $\alpha$ beyond it can be considered senseless. This reasoning enables the definition of a useful interval, denoted as $I_u$ – with $I_u^D$ and $I_u^C$ for the degree and closeness rankings, respectively – according to the first and last change points of the ranking. This section details how these intervals can be interpreted and related to some well-known network global measures.
How these useful intervals can be understood is the first topic to be addressed, because it also provides some motivation for their establishment. Varying the parameter $\alpha$ allows one to observe the same graph from different points of view, which can be materialized in the form of node centrality rankings, as already illustrated in figure 2. Therefore the most straightforward application of useful intervals is to avoid looking for such alternative perspectives where there is none. But a more interesting possibility comes from evaluating how narrow the interval is, as an estimate of how condensed these perspectives are, and how smooth the transition between them is. Since these transitions are solely due to structural distinctions between the nodes which are reflected by the benchmark centralities (i.e., when $\alpha \in \{0, 1\}$), the interval length can be seen as an indicator of node uniqueness in this sense.
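Before the efficient algorithm of section 6, a useful interval can be estimated naively by scanning a grid of $\alpha$ values and recording where the log-based degree ranking changes; the grid resolution limits its precision, and the node profiles below are illustrative:

```python
import math

def ranking(profiles, alpha):
    # Node ranking under the log-based degree (9); profiles: v -> (k, s).
    key = lambda v: (1 - alpha) * math.log(profiles[v][0]) + alpha * math.log(profiles[v][1])
    return tuple(sorted(profiles, key=key, reverse=True))

def useful_interval_naive(profiles, lo=-1e3, hi=1e3, step=0.25):
    """Grid scan for the first and last alpha at which the ranking changes.
    Only an approximation: change points between grid points are missed."""
    change_points = []
    prev, alpha = None, lo
    while alpha <= hi:
        cur = ranking(profiles, alpha)
        if prev is not None and cur != prev:
            change_points.append(alpha)
        prev, alpha = cur, alpha + step
    return (change_points[0], change_points[-1]) if change_points else None

profiles = {1: (2, 9.0), 2: (3, 6.0), 3: (4, 4.4)}  # node -> (degree, strength)
print(useful_interval_naive(profiles))
```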
For a better comprehension of the information provided by interval length, consider the following analyses of two degenerate cases:
1. Let $G$ be a hypothetical graph such that every edge has the same weight $c$ and, hence, $s_v = c \, k_v$ for every node $v$. In this scenario, $C_{D,\log}(v, \alpha) = \log k_v + \alpha \log c$ and, therefore, the lines of all nodes are parallel. In other words, the node ranking with respect to degree centrality would be the same regardless of $\alpha$. Consequently $|I_u^D| = \infty$, as there is no upper or lower bound on the value of $\alpha$ for delimiting changes in such a ranking. This reflects how similar the nodes are considering the influence of edge weights on their centrality statuses.
2. Now let $G'$ be another hypothetical graph, such that $k_u = k_v$ and $s_u \neq s_v$ for every pair of distinct nodes $u$ and $v$. In this case, the only change point in node ranking with respect to degree centrality is when $\alpha = 0$, what leads to determining $|I_u^D| = 0$. Since all nodes have the same degree but not the same strength, the range of $\alpha$ which allows multiple perspectives of the graph is minimum, as the importance of edge weights on switching between these perspectives is maximum (both cases are checked numerically in the sketch below).
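The following sketch verifies both degenerate cases with the log-based degree (9); the degree and strength profiles are illustrative:

```python
import math

def rank(profiles, alpha):
    # Ranking under the log-based degree (9): (1 - a)*log(k) + a*log(s).
    key = lambda v: (1 - alpha) * math.log(profiles[v][0]) + alpha * math.log(profiles[v][1])
    return sorted(profiles, key=key, reverse=True)

# Case 1: every edge weighs c = 2, so s = 2k and all lines are parallel:
# the ranking never changes, whatever the value of alpha.
case1 = {"a": (2, 4.0), "b": (3, 6.0), "c": (5, 10.0)}
print(rank(case1, -50) == rank(case1, 50))   # True

# Case 2: equal degrees, distinct strengths: every line crosses the others
# at alpha = 0, the single change point of the ranking.
case2 = {"a": (3, 2.0), "b": (3, 5.0), "c": (3, 9.0)}
print(rank(case2, -1), rank(case2, 1))       # inverted orders
```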
How wide the useful intervals are depends on the network global structure, and provides feedback which is global as well. The relevance of edge weights to such a structure was previously ruled as relatively less informative, considering a comparison between node degrees and strengths [30]. Despite the soundness of such a statement, having the means to concretely evaluate this is quite advantageous, since there is also the indication that such a difference in informativeness varies according to the network in question. As an example, small-world-ness [31], another graph global measure, was developed with the same target of quantifying an until then imprecise concept. However, small-world-ness, as well as the majority of the most popular graph global measures [32, 33, 34], is unrelated to edge weights, leaving a gap this work targets to fill to some extent.
6 Extrema Intersection of Unbounded Lines
As previously stated, a useful interval of $\alpha$ for a certain centrality index is determined by the lowest and highest values of $\alpha$ which lead to changes in the node ranking defined according to that index. Such changes can be easily identified in a diagram like figure 3, since they happen whenever there is an intersection between any pair of non-coincident lines. Therefore determining useful intervals for the log-based measures comes down to determining the “leftmost” and “rightmost” single-point intersections of a collection of unbounded lines.
The most naive approach for such a goal is to exhaustively verify if the intersection of each pair of lines is one of the extrema, which has a time complexity of $O(n^2)$ for $n$ lines. Fortunately, improving this cost is possible by avoiding the determination of all intersections and directly finding those of interest. The related problem of deciding if there is at least one intersection between any pair of line segments in a given set was previously addressed this way, so that an $O(n \log n)$ solution was provided [35]. Although this method cannot be directly adapted to the problem at hand, a solution with similar cost, principles, and applicability, which is not limited to the present context, could be found. It is based on the following statements:
Remark 1.
There is a 1-to-1 correspondence between a line described as $y = ax + b$ and an ordered pair $(a, b)$, so that an indexed collection of $n$ lines can be denoted by $L = ((a_1, b_1), \ldots, (a_n, b_n))$, with $\ell_i = (a_i, b_i)$ representing the $i$-th line.
Remark 2.
Given the lines $L$, if there is no intersection between the $i$-th line and any other line (i.e., every line has the same slope of $\ell_i$, being parallel to it), there is no intersection between any pair of lines, since all of them are pairwise parallel.
Remark 3.
Given the lines $L$, if two of them intersect when $x = \bar{x}$, for some $\bar{x} \in \mathbb{R}$, the leftmost intersection between elements of $L$ happens when $x \leq \bar{x}$.
Proposition 2.
Let $\ell_i$ and $\ell_j$ be lines such that $a_i\bar{x} + b_i < a_j\bar{x} + b_j$, for some $\bar{x} \in \mathbb{R}$. If their intersection happens when $x \leq \bar{x}$, then $a_i < a_j$.
Proof.
Suppose that the antecedent of the desired conclusion is true, with the intersection happening when $x = x'$. Since $a_i\bar{x} + b_i < a_j\bar{x} + b_j$, then $(a_j - a_i)\bar{x} + (b_j - b_i) = \delta$, with $\delta > 0$. Therefore, as $a_i x' + b_i = a_j x' + b_j$, it holds that $(a_j - a_i)x' + (b_j - b_i) = 0$. Subtracting this equality from the previous one results in $(a_j - a_i)(\bar{x} - x') = \delta$. From the supposition, $x' \leq \bar{x}$ – in fact $x' < \bar{x}$, as otherwise $\delta = 0$ – what allows to derive from the last equality that $a_j - a_i > 0$. ∎
Proposition 3.
Let $\ell_i$, $\ell_j$, and $\ell_k$ be lines such that $a_i < a_j < a_k$. Consider that the intersections between the lines $\ell_i$ and $\ell_j$, $\ell_i$ and $\ell_k$, and $\ell_j$ and $\ell_k$ happen respectively when $x = x_{ij}$, $x = x_{ik}$, and $x = x_{jk}$. If $x_{ik} < x_{jk}$, then $x_{ij} < x_{ik}$.
Proof.
From the premisses, $a_i x_{ik} + b_i = a_k x_{ik} + b_k$ and $a_j x_{jk} + b_j = a_k x_{jk} + b_k$. Since $a_j < a_k$, then $a_j x + b_j > a_k x + b_k$ for every $x < x_{jk}$, i.e., $\ell_j$ is above $\ell_k$ to the left of their intersection. Now suppose that $x_{ik} < x_{jk}$. Applying the previous inequality at $x = x_{ik}$, it can be stated that

$a_j x_{ik} + b_j > a_k x_{ik} + b_k = a_i x_{ik} + b_i$

This enables to state that $\ell_j$ is above $\ell_i$ at $x = x_{ik}$. And since $a_i < a_j$, $\ell_j$ is above $\ell_i$ exactly when $x > x_{ij}$. Consequently, $x_{ik} > x_{ij}$, and the proof is concluded. ∎
As a preamble to the core procedure, for the input lines $L$, consider that an intersection between any pair of lines is determined on $x = \bar{x}$: as remark 2 indicates, if this is not possible there is no intersection at all, and this can be decided in linear time. Next the lines are sorted in ascending order of $a_i\bar{x} + b_i$, so that the leftmost intersection happens on some $x \leq \bar{x}$, as affirmed in remark 3. These two steps have a time complexity of $O(n \log n)$.
At last, the algorithm itself has an inductive design [36], iterating linearly over the lines. From now on, the value of $x$ of the leftmost intersection between two of the first $i$ lines is represented by $x^{(i)}$. In the base case, only the first line is taken into account: in this trivial scenario, there are no intersections, what is denoted by $x^{(1)} = \infty$. This part has constant computational cost.
Now, for the inductive step, first assume that $x^{(i-1)}$ is known. This can also be the solution for the first $i$ lines (i.e., $x^{(i)} = x^{(i-1)}$), but maybe an intersection between $\ell_i$ and any of the previous lines leads to a better solution. According to proposition 2, only lines whose slope is lower than that of $\ell_i$ could allow this improvement. And based on proposition 3, if there is more than one line matching this criterion, the one with the greatest slope, represented as $\ell_m$, should be responsible for the leftmost intersection with $\ell_i$, or no intersection with this line would be the leftmost of all. Thus, considering that the intersection between $\ell_i$ and such a $\ell_m$ happens when $x = x_{mi}$, it follows that $x^{(i)} = \min(x^{(i-1)}, x_{mi})$.
With respect to the algorithmic complexity, each iteration of the inductive step is dominated by the identification of $\ell_m$. Any self-balancing binary search tree [37] could be used in this regard, whose queries and updates require $O(\log n)$ elementary operations. Consequently, the entire procedure has a time complexity of $O(n \log n)$ and a space complexity of $O(n)$.
As a closing remark, it is noticeable that the explanation just provided focused on finding the leftmost intersection of a pair of lines of $L$. However, since the goal is to determine the length of the interval in which all intersections happen, finding the leftmost one alone is not enough. Fortunately, the rightmost intersection can be determined by the same means, using the symmetry argument demonstrated next in proposition 4.
Proposition 4.
The rightmost intersection of the lines $L = ((a_1, b_1), \ldots, (a_n, b_n))$ is the reflected leftmost intersection of the lines $L' = ((-a_1, b_1), \ldots, (-a_n, b_n))$.
Proof.
Suppose that the $i$-th and $j$-th lines of $L$ are those whose intersection is the rightmost one, which happens on $x = x'$. Then, $a_i x' + b_i = a_j x' + b_j$ and, equivalently, $-a_i(-x') + b_i = -a_j(-x') + b_j$. Therefore there would also be an intersection between lines $i$ and $j$ of $L'$ on $x = -x'$. Now suppose, for the sake of contradiction, that this is not the leftmost intersection of $L'$, but that of lines $u$ and $v$, on some $x'' < -x'$. Then there is also an intersection between lines $u$ and $v$ of $L$ on $-x'' > x'$, what opposes the initial assumption. This concludes the proof. ∎
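A sketch of the whole procedure is given below. For simplicity it keeps the processed slopes in a sorted Python list, so insertions cost $O(n)$ instead of the $O(\log n)$ achievable with a self-balancing tree; apart from that, it follows the preamble, the inductive step, and proposition 4 as described (all names are ours):

```python
import bisect

def leftmost_intersection(lines):
    """Inductive sketch over lines given as (a, b) pairs for y = a*x + b.
    Returns the smallest x at which two of the lines intersect, or None
    if all of them are pairwise parallel."""
    # Preamble: find any intersection to obtain the reference point x_bar
    # (remark 2: if no line differs in slope from the first, none intersect).
    a0, b0 = lines[0]
    other = next(((a, b) for a, b in lines[1:] if a != a0), None)
    if other is None:
        return None
    x_bar = (other[1] - b0) / (a0 - other[0])
    # Sort by height at x_bar (ties broken by slope); by remark 3, the
    # leftmost intersection happens at some x <= x_bar.
    lines = sorted(lines, key=lambda ab: (ab[0] * x_bar + ab[1], ab[0]))
    best = float("inf")
    slopes = []   # slopes of the processed lines, kept sorted
    lowest = {}   # slope -> lowest intercept seen for that slope
    for a, b in lines:
        # Propositions 2 and 3: among the processed lines, only the one
        # with the greatest slope strictly below a can improve the answer.
        i = bisect.bisect_left(slopes, a)
        if i > 0:
            am = slopes[i - 1]
            best = min(best, (b - lowest[am]) / (am - a))
        if a in lowest:
            lowest[a] = min(lowest[a], b)   # parallel lines never intersect
        else:
            bisect.insort(slopes, a)
            lowest[a] = b
    return best

def rightmost_intersection(lines):
    # Proposition 4: mirror the slopes, find the leftmost intersection of
    # the reflected lines, and reflect the answer back.
    x = leftmost_intersection([(-a, b) for a, b in lines])
    return None if x is None else -x

lines = [(1.0, 0.0), (-1.0, 4.0), (0.0, 1.0)]
print(leftmost_intersection(lines), rightmost_intersection(lines))  # 1.0 3.0
```

In the context of section 5, each node $v$ would contribute the line of eq. (9), with intercept $\log k_v$ and slope $\log(s_v/k_v)$, and the two returned extrema would delimit $I_u^D$.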
7 Experimental Evaluation
This section is focused on depicting meaningful examples of the application of the measures proposed in sections 4 and 5. For this purpose the null models described in section 2 were used, considering the variation of parameters such as the number of nodes and edges, and the weight distribution. The same evaluation was performed for real-world networks and respective randomly generated surrogates which preserve some attributes of the original artifacts. The multiplicative inverse of the edge values was used to calculate the weighted distances in all cases.
It is worth pointing out that the experiments and respective analyses are far from exhausting the proposed methodology. Here it is shown how the measures behave in various scenarios, aiming at providing insights into how some graph attributes can influence them. Further studies may consider other characteristics, in addition to those assessed here, for the purpose of elucidating more properties and accomplishing other predictions.
7.1 Synthetic Networks
Regarding the random graph models, the ER model, $G(n, p)$, was implemented, but with weights assigned from a normal distribution: henceforth this is called the ER+normal model. This derivative was adopted because it is very convenient for verifying the effects of the respective mean and variance on our measures. The WRG model [13], on the other hand, was previously established in the literature in a clear fashion, producing graphs whose edge weights follow a geometric distribution. Such synthetic data favor the interpretability of the results, since the factors which could affect them are known a priori.
The default values used for each parameter are the following: number of vertices ($n$), 200; probability of connection ($p$), 0.2; mean weight ($\mu$), 10; standard deviation of the weights ($\sigma$), 1. For each parameter setting, 1000 random graphs were generated for statistics computation: for each graph, averages of the measures of interest were calculated and then summarized to produce the figures shown next. Since the length of a useful interval can be infinite, the median and interquartile range (IQR) were used to summarize some results, as they are superior to their counterparts, mean and standard deviation, with respect to handling such a special value.
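The two generators can be sketched as follows, assuming networkx; for the WRG, a weight $w \geq 1$ is drawn by counting consecutive successes of probability $p$, which follows the geometric distribution of [13] conditioned on the edge existing. With $\mu$ much larger than $\sigma$, negative normal weights are negligible (they could be clamped otherwise):

```python
import random
import networkx as nx

def er_normal(n=200, p=0.2, mu=10.0, sigma=1.0, seed=None):
    """ER graph with edge weights drawn from a normal distribution
    (the 'ER+normal' model described above)."""
    rng = random.Random(seed)
    G = nx.gnp_random_graph(n, p, seed=seed)
    for u, v in G.edges():
        G[u][v]["weight"] = rng.gauss(mu, sigma)
    return G

def wrg(n=200, p=0.2, seed=None):
    """Weighted random graph [13]: each pair receives a geometrically
    distributed weight with parameter p; weight 0 means no edge."""
    rng = random.Random(seed)
    G = nx.empty_graph(n)
    for u in range(n):
        for v in range(u + 1, n):
            w = 0
            while rng.random() < p:   # count successes of probability p
                w += 1
            if w > 0:
                G.add_edge(u, v, weight=w)
    return G

G1, G2 = er_normal(seed=1), wrg(seed=1)
print(G1.number_of_edges(), G2.number_of_edges())
```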
In the first experiment to be reported, the effects resulting from varying the parameters of the normal distribution for ER+normal graphs were analyzed. Since the topology settings remained unaltered, only measures based on the weights would reflect such changes. Hence, as shown in the first and third columns of figure 4, the rise of the mean edge value was naturally reproduced by the node strengths. Likewise, the average weighted distance decreased once the weights were larger – indicating closer relationships. And as confirmed in the second column of the figure, with the increase of $\sigma$ it was expected that, on average, strengths would not vary while distances would decrease slightly. That is because, with a greater variety of weights, it is more likely that the minimum weight of all paths between any two vertices becomes smaller.
Figure 4: average node strengths, weighted distances, and useful interval lengths for ER+normal graphs under variation of the weight mean $\mu$ (first column), standard deviation $\sigma$ (second column), and both (third column).
Regarding the lengths of the useful intervals, $|I_u^D|$ and $|I_u^C|$, it is remarkable that their variation always followed the inverse trend of the weights’ coefficient of variation (CV), $\sigma/\mu$. When only the mean increased, the CV decreased while both $|I_u^D|$ and $|I_u^C|$ increased, suggesting a smaller relevance of the weights for both centrality perspectives. On the other hand, when the standard deviation of the weights increased while the mean remained constant, the opposite happened for the CV as well as the useful intervals. At last, when both mean and standard deviation varied together, the stability of the importance of the weights can be noticed, as shown in the third column of figure 4.
As another experiment, the consequences of changing the number of nodes, $n$, were also considered for ER+normal graphs as well as for realizations of the WRG model. In this scenario the probability of connection was set as a decreasing function of $n$ that remains greater than the asymptotic threshold of connectivity of $G(n, p)$, $\ln(n)/n$ [38], for all considered values of $n$. Figure 5 shows that, although a larger $n$ leads to a smaller $p$, the expected number of neighbors, $(n-1)p$, grows with $n$. And regarding the average unweighted distance, its growth with $n$ was anticipated [39].
Figure 5: effects of the number of nodes $n$ on the average measures and useful interval lengths for the ER+normal and WRG models.
The expected strength is $(n-1)\,p\,\mathbb{E}[w]$, with the expected edge weight, $\mathbb{E}[w]$, being defined in a distinct fashion for each of the two graph models: for the ER+normal, $\mathbb{E}[w] = \mu$, and since $\mu$ is constant, the expected strength is directly proportional to $(n-1)p$; for the WRG, $\mathbb{E}[w] = 1/(1-p)$ (the mean of the geometric weights conditioned on the edge existing), so that the mean weight decreases as $n$ increases, but without exceeding the pace of growth of $(n-1)p$, resulting in a still increasing expected strength. The average weighted distance for the ER+normal model differs from the unweighted counterpart by a constant multiplicative factor, $1/\mu$. And as $\mathbb{E}[w]$ in the WRG tends to 1 with the growth of $n$, the weighted distances approach the unweighted ones.
It would be reasonable to predict that the increase of the number of nodes would make the occurrence of nodes with similar centrality profiles more probable, prompting the enlargement of the useful intervals. Lastly, the lower values of $|I_u^D|$ and $|I_u^C|$ for the WRG, in comparison with those of the ER+normal, can be associated with the fact that the former’s coefficient of variation of weights is always higher than that of the latter, whose value is constant, 0.1. This would result in a greater diversity of edge values and, consequently, of centralities.
Regarding the probability of connection, as can be seen in figure 6, $p$ was varied from a relatively small value, discarding generated graphs which were not connected, up to a setting that approximates that of a complete graph. As $p$ increases, the node degrees also increase, while the unweighted distances decrease due to the higher number of edges. For the ER+normal, once again the strength and weighted distance averages differ from their unweighted versions by the constant factors $\mu$ and $1/\mu$, respectively. Whereas for the WRG, the average strength is aligned with the fact that the expected strength is $(n-1)p/(1-p)$, while the average weighted distance exhibits a similar pattern, but vertically reflected, with a decreasing tendency: this could be owed to the use of the multiplicative inverse of the weights for distance calculations.
Observing the lengths of the useful intervals, the discrepancy between $|I_u^D|$ for each of the considered graph models is remarkable. When $p$ is closer to 0, in both the ER+normal and the WRG the mean and standard deviation of node degrees and strengths have their lowest values. Despite that, for the first, $|I_u^D|$ has its maximum value, while for the second, its value is minimum, so that the decreasing tendency of $|I_u^D|$ for the ER+normal contrasts with the opposite behavior respective to the WRG. This can be interpreted as evidence of how distinct the information provided by the useful intervals is compared to that of the centrality measures, regardless of the weights. Meanwhile, $|I_u^C|$ has a relatively similar behavior for both models. This occurs, however, even with the sample CV of the weights for the WRG being consistently greater than that of the ER+normal for most of the range of $p$.
Figure 6: effects of the probability of connection $p$ on the average measures and useful interval lengths for the ER+normal and WRG models.
7.2 Real Networks
The three datasets/networks briefly described below were also analyzed. A summary of some relevant features of each network is presented in table 2.
1. Freeman’s EIES Network [40]: established in 1978, the dataset encompasses three different networks, from which we opted to use the one that represents the number of messages sent among 32 researchers using an electronic communication tool. This network was also used in Opsahl’s original work [5]. Originally, its links are directed; we chose to transform the network into an undirected one, in which each edge has a weight equal to the sum of the values of its directed versions in the original graph: $w_{uv} = w_{u \to v} + w_{v \to u}$. Self-loops were also removed (a sketch of this transformation appears after this list).
2. US Air Transportation Network [41]: the nodes represent the top 500 US airports according to the amount of traffic in 2002, while the weights consist of the number of seats available on scheduled flights between airports.
3. Brazilian Congressmen Voting Network [42]: represents the similarity between the voting profiles of Brazilian congressmen from 2015 to 2016. Each node represents one congressman and an edge is present if two congressmen voted identically on at least one bill. The weights range from 0 (no commonalities) to 1 (identical), representing the proportion of agreements in the total of votes.
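The symmetrization applied to the EIES network (item 1) can be sketched as below, assuming a networkx DiGraph whose arcs carry a “weight” attribute; the function name and toy data are ours:

```python
import networkx as nx

def symmetrize(D):
    """Undirected version of a weighted DiGraph: each edge receives the
    sum of the weights of both directions; self-loops are dropped."""
    G = nx.Graph()
    G.add_nodes_from(D)
    for u, v, d in D.edges(data=True):
        if u == v:
            continue                      # remove self-loops
        w = d.get("weight", 1)
        if G.has_edge(u, v):
            G[u][v]["weight"] += w        # second direction accumulates
        else:
            G.add_edge(u, v, weight=w)
    return G

D = nx.DiGraph([(1, 2, {"weight": 3}), (2, 1, {"weight": 5}), (1, 1, {"weight": 9})])
print(list(symmetrize(D).edges(data=True)))   # [(1, 2, {'weight': 8})]
```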
| Feature | Messages | US Flights | Voting |
| --- | --- | --- | --- |
| Nodes | 32 | 500 | 634 |
| Edges | 266 | 2980 | 176426 |
| $I_s$ ($C_{D,\mathrm{prod}}$) | [-82.1, 82.1] | [-40.0, 40.0] | [-110.1, 110.1] |
| $I_s$ ($C_{C,\mathrm{prod}}$) | [-177.1, 177.1] | [-90.5, 90.5] | [-58.4, 58.4] |
| $I_s$ (sum measures) | [-103.5, 103.5] | [-48.5, 48.5] | [-104.7, 104.7] |
| $I_u^D$ | [-108.6, 104.1] | [-147724.5, 14114.7] | [-12469.3, 7452.4] |
| $I_u^C$ | [-177.8, 114629.0] | [-86532.4, 119285.9] | [-160602.5, 38498.1] |
Figure 7 depicts the assessment of the measures of interest on graphs derived from the just indicated networks. These surrogates were generated by iterated rewiring (i.e., pairwise edge switching [15, 16]) of the original networks: once again, 1000 graphs were generated for each number of rewirings considered. According to past works, there is a smooth variation of the network characteristics as more rewirings are performed. This goes on until the resulting network can be considered plainly random and, in this sense, stable. This could be expected to occur when the number of repetitions is at least within the same order of magnitude of the number of edges.
Figure 7: measures of interest for surrogates obtained by iterated rewiring of the three real-world networks, as a function of the number of rewirings.
While the invariance of degree and strength averages was assured by the rewiring procedure definition, that was not the case for distances. Nevertheless these were mildly affected, resembling the just mentioned invariance: only the ‘US Flights’ surrogates clearly show a different behavior, as the dispersion of the weighted distance averages increases with the number of rewirings. There is a similar consistency of $|I_u^D|$ and $|I_u^C|$, except after very few rewirings: $|I_u^C|$ abruptly decreases in all three cases, implying a greater diversity of node closeness profiles; on the other hand, $|I_u^D|$ increases in a softer fashion for the ‘Messages’ and ‘Voting’ networks, evidencing more coinciding node strengths (since degrees are preserved). Both changes could be considered unexpected given the small number of rewirings which prompted them.
Finally, figure 8 demonstrates one of the advantages of the log-based centralities using the real-world networks concerned. Since the result reported for the exponential methods was submitted to the log function, what we see as a linear change in the variance of the centralities is, in fact, exponential. The log-based measure, however, displayed in linear scale, presents a quadratic behavior of the variance. That sounds natural, considering that as $\alpha \to -\infty$ or $\alpha \to \infty$, the centralities defined according to the log method tend to move away from each other (assuming a bounded useful interval). That is not the case for the prod and sum alternatives since, in general, there is a discrepancy of scale in the centrality values comparing both extrema.
Figure 8: variance of the node centralities as a function of $\alpha$ for the exponential and log-based methods on the three real-world networks.
In addition, the figure also highlights some particularities, such as the fact that the voting network presents decreasing variance for the exponential methods. This happens because its weights are within the unit interval and, therefore, the unweighted degree and closeness exceed, or at least equate, their weighted versions. The opposite is true for networks whose weights are greater than or equal to one. Moreover, the curvature of the lines respective to the log method can be related to the lengths of the safe intervals: the narrower the interval, the sharper the curvature.
8 Conclusion
The ubiquity of abstract and concrete systems which can be insightfully modeled as weighted networks is evident. This motivates the analysis of such structures from various points of view, such as node centrality. This work aimed at improving past efforts on the consideration of edge weights for the establishment of the just mentioned perspective. Possibly the most common kind of contribution in this sense is the proposal of new measures to make the assessment in question. While this happens once again here, it is not an end in itself, but serves as the gateway to the entire methodology developed from there. Moreover, even this starting point is elaborately characterized and compared to its alternatives.
The conception and the interpretation of the useful intervals is the major accomplishment reported in this text. They lead to an entirely unexplored way to analyze weighted networks by allowing to objectively evaluate the significance of edge weights for node centrality. And targeting to assure the completeness of this contribution, an elegant procedure for the computation of these intervals is also provided, which is flexible enough to tackle problems in other contexts with similar geometric properties. At last, the performed experiments successfully illustrate the behavior of the presented methods in various situations, corroborating their usefulness and uniqueness.
About the next steps, we believe the application of the proposed methodology on a wider variety of real networks as well as probabilistically generated ones would enable the identification of other interesting patterns related to inherent characteristics of the former and parameter settings of the latter. Another future work could focus on the characterization of properties of well-known graph families based on the log-based centrality measures and their respective useful intervals. These are some promising potential continuations of this research.
CRediT authorship contribution statement
We describe the contributions to the paper as follows: R.S.P. and D.O.C. should be credited for conceptualization, methodology, software, validation, formal analysis, investigation, data curation, writing (original draft, review, and editing), and visualization; D.O.C. should additionally be credited for resources, supervision, project administration, and funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Funding
This work was supported by the Carlos Chagas Filho Foundation for Research Support in Rio de Janeiro (FAPERJ) [grant number 248551 to R.S.P.].
References
- [1] F. Riquelme, P. G. Cantergiani, Measuring user influence on twitter: A survey, Inf. Process. Manag. 52 (5) (2016) 949–975. doi:10.1016/j.ipm.2016.04.003.
- [2] P. Crucitti, V. Latora, S. Porta, Centrality in networks of urban streets, Chaos: an interdisciplinary journal of nonlinear science 16 (1) (2006) 015113.
- [3] H. Jeong, S. P. Mason, A.-L. Barabási, Z. N. Oltvai, Lethality and centrality in protein networks, Nature 411 (6833) (2001) 41–42.
- [4] L. C. Freeman, Centrality in social networks conceptual clarification, Social networks 1 (3) (1978) 215–239.
- [5] T. Opsahl, F. Agneessens, J. Skvoretz, Node centrality in weighted networks: Generalizing degree and shortest paths, Soc. Networks 32 (3) (2010) 245–251. doi:10.1016/j.socnet.2010.03.006.
- [6] S. Wasserman, K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994. doi:10.1017/CBO9780511815478.
- [7] A. Barrat, M. Barthelemy, R. Pastor-Satorras, A. Vespignani, The architecture of complex weighted networks, Proceedings of the national academy of sciences 101 (11) (2004) 3747–3752.
- [8] M. E. Newman, Scientific collaboration networks. ii. shortest paths, weighted networks, and centrality, Physical review E 64 (1) (2001) 016132.
- [9] E. W. Dijkstra, A note on two problems in connexion with graphs, Numerische Mathematik 1 (1959) 269–271. doi:10.1007/BF01386390.
- [10] B. Bollobás, Random Graphs, Second Edition, Vol. 73 of Cambridge Studies in Advanced Mathematics, Cambridge University Press, 2011. doi:10.1017/CBO9780511814068.
- [11] P. Erdös, A. Rényi, On random graphs i, Publicationes Mathematicae Debrecen 6 (1959) 290.
- [12] T. Kalisky, S. Sreenivasan, L. A. Braunstein, S. V. Buldyrev, S. Havlin, H. E. Stanley, Scale-free networks emerging from weighted random graphs, Phys. Rev. E 73 (2006) 025103. doi:10.1103/PhysRevE.73.025103.
- [13] D. Garlaschelli, The weighted random graph model, New Journal of Physics 11 (7) (2009) 073005. doi:10.1088/1367-2630/11/7/073005.
- [14] J. Kim, T. Wilhelm, What is a complex graph?, Physica A: Statistical Mechanics and its Applications 387 (11) (2008) 2637 – 2652. doi:https://doi.org/10.1016/j.physa.2008.01.015.
- [15] R. Milo, N. Kashtan, S. Itzkovitz, M. E. J. Newman, U. Alon, On the uniform generation of random graphs with prescribed degree sequences (2003). arXiv:cond-mat/0312028.
- [16] T. Opsahl, V. Colizza, P. Panzarasa, J. J. Ramasco, Prominence and control: the weighted rich-club effect, Physical review letters 101 (16) (2008) 168702.
- [17] G. Ansmann, K. Lehnertz, Constrained randomization of weighted networks, Physical Review E 84 (2) (Aug 2011). doi:10.1103/physreve.84.026103.
- [18] S. H. Strogatz, Exploring complex networks, Nature 410 (6825) (2001) 268–276.
- [19] K. Das, S. Samanta, M. Pal, Study on centrality measures in social networks: a survey, Social Netw. Analys. Mining 8 (1) (2018) 13. doi:10.1007/s13278-018-0493-2.
- [20] X. Qi, E. Fuller, Q. Wu, Y. Wu, C. Zhang, Laplacian centrality: A new centrality measure for weighted networks, Inf. Sci. 194 (2012) 240–253. doi:10.1016/j.ins.2011.12.027.
- [21] P. Tang, C. Song, W. Ding, J. Ma, J. Dong, L. Huang, Research on the node importance of a weighted network based on the k-order propagation number algorithm, Entropy 22 (3) (2020) 364. doi:10.3390/e22030364.
- [22] S. X. Zhao, R. Rousseau, F. Y. Ye, h-degree as a basic measure in weighted networks, J. Informetrics 5 (4) (2011) 668–677. doi:10.1016/j.joi.2011.06.005.
- [23] A. Garas, F. Schweitzer, S. Havlin, A k-shell decomposition method for weighted networks, CoRR abs/1205.3720 (2012). arXiv:1205.3720.
- [24] Y. Yustiawan, W. Maharani, A. A. Gozali, Degree centrality for social network with opsahl method, Procedia Computer Science 59 (2015) 419–426.
- [25] D. Wei, Y. Li, Y. Zhang, Y. Deng, Degree centrality based on the weighted network, in: 2012 24th Chinese Control and Decision Conference (CCDC), IEEE, 2012, pp. 3976–3979.
- [26] S. P. Borgatti, M. G. Everett, A graph-theoretic perspective on centrality, Soc. Networks 28 (4) (2006) 466–484. doi:10.1016/j.socnet.2005.11.005.
- [27] M. L. Overton, Numerical Computing with IEEE Floating Point Arithmetic, Society for Industrial and Applied Mathematics, 2001. doi:10.1137/1.9780898718072.
- [28] D. Goldberg, What every computer scientist should know about floating-point arithmetic, ACM Comput. Surv. 23 (1) (1991) 5–48. doi:10.1145/103162.103163.
- [29] IEEE standard for binary floating-point arithmetic - IEEE standard 754-1985, Beuth, 1985.
- [30] R. Mastrandrea, T. Squartini, G. Fagiolo, D. Garlaschelli, Enhanced reconstruction of weighted networks from strengths and degrees, New Journal of Physics 16 (4) (2014) 043022. doi:10.1088/1367-2630/16/4/043022.
- [31] M. D. Humphries, K. Gurney, Network ‘small-world-ness’: A quantitative method for determining canonical network equivalence, PLOS ONE 3 (4) (2008) 1–10. doi:10.1371/journal.pone.0002051.
- [32] D. J. Watts, S. H. Strogatz, Collective dynamics of ’small-world’ networks, Nature 393 (6684) (1998) 440–442. doi:10.1038/30918.
- [33] S. Zhou, R. J. Mondragón, The rich-club phenomenon in the internet topology, IEEE Communications Letters 8 (3) (2004) 180–182. doi:10.1109/LCOMM.2004.823426.
- [34] M. E. J. Newman, Mixing patterns in networks, Phys. Rev. E 67 (2003) 026126. doi:10.1103/PhysRevE.67.026126.
- [35] M. I. Shamos, D. Hoey, Geometric intersection problems, in: 17th Annual Symposium on Foundations of Computer Science, Houston, Texas, USA, 25-27 October 1976, IEEE Computer Society, 1976, pp. 208–215. doi:10.1109/SFCS.1976.16.
- [36] U. Manber, Using induction to design algorithms, Commun. ACM 31 (11) (1988) 1300–1313. doi:10.1145/50087.50091.
- [37] D. E. Knuth, The art of computer programming, Volume III, 2nd Edition, Addison-Wesley, 1998.
- [38] R. Durrett, Random Graph Dynamics (Cambridge Series in Statistical and Probabilistic Mathematics), Cambridge University Press, USA, 2006.
- [39] A. Fronczak, P. Fronczak, J. A. Hołyst, Average path length in random networks, Phys. Rev. E 70 (2004) 056110. doi:10.1103/PhysRevE.70.056110.
- [40] S. C. Freeman, L. C. Freeman, The networkers network: A study of the impact of a new communications medium on sociometric structure, School of Social Sciences University of Calif., 1979.
- [41] V. Colizza, R. Pastor-Satorras, A. Vespignani, Reaction–diffusion processes and metapopulation models in heterogeneous networks, Nature Physics 3 (4) (2007) 276–282.
- [42] V. S. Bursztyn, M. G. Nunes, D. R. Figueiredo, How Brazilian congressmen connect: homophily and cohesion in voting and donation networks, Journal of Complex Networks 8 (1), cnaa006 (02 2020). doi:10.1093/comnet/cnaa006.