Quantifying Community Evolution in Developer Social Networks: Proof of Indices’ Properties
Liang Wang, ORCID 0000-0001-5444-748X, State Key Laboratory for Novel Software Technology, Nanjing University, 163 Xianlin Ave., Nanjing, China
Ying Li, ORCID 0000-0002-4637-1742, State Key Laboratory for Novel Software Technology, Nanjing University, 163 Xianlin Ave., Nanjing, China
Jierui Zhang, ORCID 0000-0002-7290-790X, State Key Laboratory for Novel Software Technology, Nanjing University, 163 Xianlin Ave., Nanjing, China
Xianping Tao, ORCID 0000-0002-5536-3891, State Key Laboratory for Novel Software Technology, Nanjing University, 163 Xianlin Ave., Nanjing, China
Abstract.
This document provides the proofs of the properties of the community evolution indices, including community split and shrink, presented in the paper: Liang Wang, Ying Li, Jierui Zhang, and Xianping Tao. 2022. Quantifying Community Evolution in Developer Social Networks. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’22), November 14–18, 2022, Singapore, Singapore. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3540250.3549106.
The proofs of the properties of community merge and expand are similar.
Proof of Properties, Online Material
1. Brief Introduction to the Properties of Community Split and Shrink Indices
Let and denote the community split and shrink indices, respectively.
Without loss of generality, we assume .
The properties of the two indices are as follows.
P-1.
and are strictly monotonic increasing functions of , given , and .
P-2.
/ is a strictly monotonic decreasing / increasing function of , respectively, for , given , and member migration distribution with .
P-3.
Given and , the maximum split index is obtained when the members of migrate to the communities detected in the next step with an even distribution, i.e., when we have .
And the minimum split index is obtained when or all the members of who stay in the project migrate to a single community in the next step, i.e., there exists a -th community in time that and , resulting in and .
P-4.
Given and , the maximum shrink index is obtained when the corresponding split index is minimized, i.e., all remaining members of community migrate to a single community in the next step.
And the minimum shrink index is obtained when the members of migrate evenly to communities in time . For the special case of , the shrink index is only determined by .
Figure 1. Curves of the split and shrink indices under different conditions specified by , , and . The even distribution corresponds to case . The random distribution is obtained by randomly assigning ’s values.
Fig. 1 illustrates the curves of the split and shrink indices under different conditions, from which we can find correspondence to the above properties.
2. Proof of The Properties
The proofs of the above four properties are as follows.
P-1. and are strictly monotonic increasing functions of , given , and .
Proof.
First, for the community split index , taking the partial derivative of with respect to gives us:
As a result, the community split index is a strictly monotonic increasing function of , for , , and .
For the case of , we always have , and regardless of the distribution of , which is intuitively correct because a community cannot split if all of its members leave the project.
Next, for community shrink index , we also take its partial derivative with respect to , which is
For the case of , we always have , given any member migration distribution , which is reasonable because a community does not shrink if none of its members leave the project, regardless of the number of communities detected in the next step (i.e., ).
Combining the above two results, we show that: and are strictly monotonic increasing functions of , given , and .
P-2.
/ is a strictly monotonic decreasing / increasing function of , respectively, for , given , and member migration distribution with .
Proof.
First, we show that community split index is a strictly monotonic decreasing function of under the given conditions.
The partial derivative of with respect to is
(6)
As a result, we have , meaning that is a strictly monotonic decreasing function of , as long as .
We have and thus when only one of the ’s is one with the rest of the ’s equal to zero (including the case that ).
Intuitively, this case means that a community is not regarded as splitting if all of its remaining members migrate to a single community in the next step, regardless of the number of members who leave the project.
Next, we show that community shrink index is a strictly monotonic increasing function of under the given conditions.
Taking the partial derivative of with respect to gives us
(7)
The fourth term in Eq. (7) is .
And the last term in Eq. (7) is when , and when following the definition of .
Because is the maximum entropy, we have , , and when .
We then have for and .
For the case of , we have , and .
Eq. (7) then becomes
(9)
We also have for and .
As a result, is a strictly monotonic increasing function of for , and .
Summarizing the above, we show that: / is a strictly monotonic decreasing / increasing function of , respectively, for , given , and the distribution of member migration with .
P-3. Given and , the maximum split index is obtained when the members of migrate to the communities detected in the next step with an even distribution, i.e., when we have .
And the minimum split index is obtained when or all the members of who stay in the project migrate to a single community in the next step, i.e., there exists a -th community in time that and , resulting in and .
Proof.
is a monotonic increasing function of for .
Referring to the properties of information entropy (Shannon, 1948), entropy is maximized when .
And the minimum value of is obtained when only one of the ’s equals one and the others equal zero, which also includes the case of .
As a result, the maximum split index is given .
When , we have being the maximum possible value for the split index, which is only determined by .
And the minimum split index is .
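The two entropy facts used in this proof can be checked numerically. The following is a minimal sketch, assuming the member migration distribution is represented as a plain list of probabilities that sum to one; the names `shannon_entropy`, `uniform`, and `one_hot` are illustrative and do not come from the paper. It checks only the entropy term, not the split index itself, which also depends on the other quantities fixed in the property statement.

```python
import math


def shannon_entropy(p):
    """Shannon entropy (in bits) of a discrete distribution p.

    Terms with zero probability contribute nothing, following the
    usual convention 0 * log(0) = 0.
    """
    return sum(-x * math.log2(x) for x in p if x > 0)


def uniform(n):
    """Even migration: each of the n target communities gets share 1/n."""
    return [1.0 / n] * n


def one_hot(n, k=0):
    """Concentrated migration: every member goes to community k."""
    return [1.0 if i == k else 0.0 for i in range(n)]


# Hypothetical example with four target communities.
n = 4
h_even = shannon_entropy(uniform(n))                 # maximum, equals log2(n)
h_single = shannon_entropy(one_hot(n))               # minimum, equals 0
h_skewed = shannon_entropy([0.7, 0.1, 0.1, 0.1])     # strictly in between

print(h_single, h_skewed, h_even)
```

Any other distribution over the same number of communities yields an entropy strictly between these two extremes, which is the ordering the proof relies on.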
P-4. Given and , the maximum shrink index is obtained when the corresponding split index is minimized, i.e., all remaining members of community migrate to a single community in the next step.
And the minimum shrink index is obtained when the members of migrate evenly to communities in time . For the special case of , the shrink index is only determined by .
Proof.
Given and , the community shrink index is a monotonic decreasing function of the community split index .
Referring to Property 3 presented above, we have the shrink index maximized when the split index is minimized, i.e., .
The maximum shrink index is given by .
And the minimum shrink index when .
In the above analysis we have because we assume .
When , we have , and .
The shrink index is , which is only determined by .
Following the above analysis, if we consider the change of , we can find that the maximum and minimum possible values of the shrink index are when , and when , respectively.
From properties P-3 and P-4 we can further see that given , the community split and shrink indices vary in the same range given by , and , with different values of and the distribution of member migration specified by .
As a result, it is feasible to draw meaningful conclusions, such as whether a community shows a stronger trend of splitting or shrinking, by directly comparing the values of the community split and shrink indices.
Acknowledgements.
We thank all the reviewers for their efforts in improving the paper.
This work is supported by the National Key R&D Program of China No. 2018AAA0102302, NSFC No. 62172203, Fundamental Research Funds for the Central Universities, and the Collaborative Innovation Center of Novel Software Technology and Industrialization.
References
Shannon (1948) Claude Elwood Shannon. 1948. A mathematical theory of communication. The Bell System Technical Journal 27, 3 (1948), 379–423.