Geometric Mean Type of Proportional Reduction in Variation Measure for Two-Way Contingency Tables
Abstract
In a two-way contingency table analysis with explanatory and response variables, the analyst is interested in the independence of the two variables. However, if the test of independence rejects independence, or the variables are clearly related, the analyst is interested in the degree of their association. Various measures have been proposed to quantify the degree of association, one of which is the proportional reduction in variation (PRV) measure, which describes the PRV from the marginal distribution to the conditional distributions of the response. The conventional PRV measures can assess the association of the entire contingency table, but they cannot accurately assess the association for each explanatory variable. In this paper, we propose a geometric mean type of PRV (geoPRV) measure that aims to sensitively capture the association of each explanatory variable with the response variable by using a geometric mean, and it enables analysis without underestimation when there is partial bias in cells of the contingency table. Furthermore, the geoPRV measure is constructed using arbitrary functions that satisfy specific conditions, which makes it flexible in applications and makes it possible to express conventional PRV measures as geometric mean types in special cases.
Keywords: Contingency table, Diversity index, Geometric mean, Independence, Measure of association, Proportional Reduction in Variation
Mathematics Subject Classification: 62H17, 62H20
1 Introduction
Categorical variables are formed from categories and are employed in various fields such as medicine, psychology, education, and social science. Consider two categorical variables, one consisting of $R$ categories and the other consisting of $C$ categories. These two variables have $R \times C$ combinations, which can be represented in a table with $R$ rows and $C$ columns. This is called a two-way contingency table, where each $(i, j)$ cell ($i = 1, \dots, R$; $j = 1, \dots, C$) displays the observed frequency. Typically, the two-way contingency table is used to evaluate whether the two variables are statistically independent. If independence is rejected, for example by Pearson's chi-square test, or the variables are clearly related, we are interested in the strength of their association.
As a method to investigate the associative structure of the contingency table, association models have been proposed by Gilula and Haberman (1986), Goodman (1981, 1985), and Rom and Sarkar (1992). This method can determine whether there is a relationship between row and column variables through goodness-of-fit tests of the models. However, it only addresses whether or not a relationship exists, and it cannot quantify the degree of association.
Instead of goodness-of-fit tests of models, a variety of measures that express the degree of association within the interval from 0 to 1 have been proposed by Agresti (2003), Bishop et al. (2007), Cramér (1999), Everitt (1992), Tomizawa et al. (2004), and Tschuprow (1925, 1939). These measures calculate the degree of deviation from independence for each cell in the contingency table and derive the degree of association from the sum over all cells. Because of this construction, they can be applied to most contingency tables without distinguishing whether the row and column variables are explanatory or response variables. However, in actual contingency table analysis, there are cases where the row and column variables are defined as explanatory or response variables. In such cases, it is not appropriate to analyze the table while ignoring these roles.
Alternative measures have been proposed by Goodman and Kruskal (1954) and Theil (1970), which are based on the proportional reduction in variation (PRV) from the marginal distribution to the conditional distributions of the response. Measures constructed in this way are called PRV measures. The PRV measure is an important tool for summarizing the strength of association of the entire contingency table because its construction makes the values easy to interpret. However, we sometimes want to focus on the association of some categories of the explanatory variable, and conventional PRV measures underestimate this strength and thus may not accurately reflect the partial association numerically. In the study of models and measures for evaluating the symmetry of contingency tables, Nakagawa et al. (2020), Saigusa et al. (2016), and Saigusa et al. (2019) proposed evaluating partial symmetry by using the geometric mean. In contrast, little research has been done on partial association.
In this paper, we propose a geometric mean type of PRV (geoPRV) measure constructed from a geometric mean and functions satisfying certain conditions. As a result, the geoPRV measure is flexible in applications and makes it possible to express previously proposed PRV measures as geometric mean types in special cases. By using the geometric mean to sensitively capture the association of each explanatory variable, analysis can be performed without underestimating the degree of association when cells in the contingency table are partially biased. In addition, the geoPRV measure reveals local association structures. Furthermore, the geoPRV measure can be used regardless of whether the categorical variables are nominal or ordinal because its value is invariant to permutations of the row and column categories. The rest of this paper is organized as follows. Section 2 introduces previous research on an extension of the generalized PRV (eGPRV) measure and proposes the geoPRV measure. Section 3 presents approximate confidence intervals for the proposed measure. Section 4 examines the values and confidence intervals of the proposed measure using several artificial and actual data sets, and compares them with the eGPRV measure. Section 5 presents our conclusions.
2 PRV Measure
In this section, we introduce measures using a function that satisfies the following conditions: (i) The function is a convex function; (ii) ; (iii) ; (iv) . Examples of such functions, and models and measures using them, have been proposed by Kateri and Papaioannou (1994), Momozaki et al. (2022), and Tahata (2022). These proposals are intended to generalize existing models and measures, and they have the practical advantages that new ones are easy to construct and that tuning parameters can be adjusted to fit the analysis. Section 2.1 reviews some conventional PRV measures following Momozaki et al. (2022). In Section 2.2, we propose a geometric mean type of PRV measure and describe its characteristics.
2.1 Conventional PRV Measure
Consider an $R \times C$ contingency table with nominal categories of the explanatory variable $X$ and the response variable $Y$. Let $p_{ij}$ denote the probability that an observation falls in the $i$th row and $j$th column of the table ($i = 1, \dots, R$; $j = 1, \dots, C$). In addition, the marginal probabilities are denoted as $p_{i\cdot} = \sum_{j=1}^{C} p_{ij}$ and $p_{\cdot j} = \sum_{i=1}^{R} p_{ij}$. The conventional PRV measure has the form
$$\frac{V(Y) - E[V(Y \mid X)]}{V(Y)},$$
where $V(Y)$ is a measure of variation for the marginal distribution of $Y$, and $E[V(Y \mid X)]$ is the expectation of the conditional variation of $Y$ given the distribution of $X$ (see Agresti, 2003). The expectation is the weighted arithmetic mean of the conditional variations $V(Y \mid X = i)$, i.e., $E[V(Y \mid X)] = \sum_{i=1}^{R} p_{i\cdot} V(Y \mid X = i)$. By changing the variation measure, various PRV measures can be expressed, such as the uncertainty coefficient $U$, whose variation measure is the Shannon entropy $V(Y) = -\sum_{j=1}^{C} p_{\cdot j} \log p_{\cdot j}$, and the concentration coefficient $\tau$, whose variation measure is the Gini concentration $V(Y) = 1 - \sum_{j=1}^{C} p_{\cdot j}^{2}$ (see Agresti, 2003). Tomizawa et al. (1997) proposed a generalized PRV measure that includes $U$ and $\tau$ by using as the variation measure the Patil and Taillie (1982) diversity index for the marginal distribution of $Y$. Furthermore, Momozaki et al. (2022) proposed an extension of the generalized PRV (eGPRV) measure that includes all three of these measures as special cases.
The variation measure used in the eGPRV measure are .
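As a rough illustration of the general PRV form discussed in this subsection, the following Python sketch computes $(V(Y) - E[V(Y \mid X)]) / V(Y)$ with the Shannon entropy and the Gini concentration as variation measures. The function names and the two small tables are hypothetical and are not taken from the paper; rows are assumed to be the explanatory variable and columns the response.

```python
import math


def shannon_entropy(dist):
    """Shannon entropy -sum p log p, using the convention 0 log 0 = 0."""
    return max(0.0, -sum(p * math.log(p) for p in dist if p > 0))


def gini_concentration(dist):
    """Gini concentration 1 - sum p^2."""
    return 1.0 - sum(p * p for p in dist)


def conventional_prv(table, variation):
    """(V(Y) - E[V(Y|X)]) / V(Y), with rows as X and columns as Y.

    `table` may hold probabilities or counts; it is normalised internally.
    """
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    col_marginal = [sum(row[j] for row in table) / total for j in range(n_cols)]
    v_marginal = variation(col_marginal)

    expected_conditional = 0.0
    for row in table:
        row_sum = sum(row)
        if row_sum == 0:
            continue  # an empty row carries no weight
        conditional = [cell / row_sum for cell in row]
        expected_conditional += (row_sum / total) * variation(conditional)

    return (v_marginal - expected_conditional) / v_marginal


# Hypothetical 2 x 2 tables, not taken from the paper.
independent = [[0.10, 0.30], [0.15, 0.45]]  # all rows share one conditional distribution
perfect = [[0.50, 0.00], [0.00, 0.50]]      # each row determines the column

for name, table in [("independent", independent), ("perfect", perfect)]:
    u = conventional_prv(table, shannon_entropy)
    tau = conventional_prv(table, gini_concentration)
    print(f"{name}: U = {u:.4f}, tau = {tau:.4f}")
```

Passing the variation measure as a function argument mirrors the idea that different PRV measures arise only from different choices of $V$.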
2.2 Geometric Mean Type of PRV Measure
We propose a new PRV measure by using the weighted geometric mean of that aims to sensitively capture the association of each explanatory variable to the response variable. Assume that and is a real number greater than or equal to 0 (). We propose a geometric mean type of PRV (geoPRV) measure for contingency tables defined as
where is a measure of variation for the marginal distribution of . The geoPRV measure can use the same variation as the conventional PRV measure, for example,
where the variation measure . In addition, the following theorem for holds.
Theorem 1.
The measure satisfies the following conditions:
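(i) The geoPRV measure is greater than or equal to the corresponding eGPRV measure.
(ii) The measure must lie between 0 and 1.
(iii) The measure equals 0 if and only if $X$ and $Y$ are independent.
(iv) The measure equals 1 if and only if the weighted geometric mean of the conditional variations is 0, i.e., for at least one $i$, there exists $j$ such that $p_{ij} \neq 0$ and $p_{ik} = 0$ for every $k$ with $k \neq j$.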
Theorem 2.
The value of is invariant to permutations of row and column categories.
For the proofs of Theorem 1 and Theorem 2, see Appendix A and Appendix B, respectively. The geoPRV measure differs from the conventional PRV measure in that it equals 1 whenever there exists a row $i$ whose conditional variation is 0. Another important feature of the geoPRV measure is that it always takes values greater than or equal to those of the conventional PRV measure, allowing for a stronger representation of row and column relationships.
A property of the geoPRV measure is that the larger its value, the stronger the association between the response variable $Y$ and the explanatory variable $X$. In other words, the larger the value, the more accurately the $Y$ category can be predicted when the $X$ category is known than when it is not. In contrast, if the value is 0, the $Y$ category is not affected by the $X$ category at all.
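The properties stated in Theorem 1 and Theorem 2 can be checked numerically with a small sketch. It assumes that the geoPRV measure replaces the weighted arithmetic mean $\sum_i p_{i\cdot} V(Y \mid X = i)$ of the conventional form by the weighted geometric mean $\prod_i V(Y \mid X = i)^{p_{i\cdot}}$, and it uses the Shannon entropy as the variation measure. The function names and the tables are hypothetical and are not taken from the paper.

```python
import math


def shannon_entropy(dist):
    """Shannon entropy with the convention 0 log 0 = 0."""
    return max(0.0, -sum(p * math.log(p) for p in dist if p > 0))


def variations(table, variation):
    """Return V(Y), the row weights p_i., and the conditional variations V(Y | X = i)."""
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    v_marginal = variation([sum(row[j] for row in table) / total for j in range(n_cols)])
    weights, conditional = [], []
    for row in table:
        row_sum = sum(row)
        if row_sum > 0:  # rows with zero probability are skipped
            weights.append(row_sum / total)
            conditional.append(variation([cell / row_sum for cell in row]))
    return v_marginal, weights, conditional


def conventional_prv(table, variation=shannon_entropy):
    v, w, vc = variations(table, variation)
    return (v - sum(wi * vi for wi, vi in zip(w, vc))) / v


def geo_prv(table, variation=shannon_entropy):
    v, w, vc = variations(table, variation)
    geometric_mean = math.prod(vi ** wi for wi, vi in zip(w, vc))
    return (v - geometric_mean) / v


# Hypothetical 3 x 3 probability tables, not taken from the paper.
partial = [[0.00, 0.00, 0.40],   # the first row is completely determined
           [0.10, 0.10, 0.10],
           [0.10, 0.10, 0.10]]
mixed = [[0.02, 0.08, 0.30],
         [0.10, 0.10, 0.10],
         [0.15, 0.10, 0.05]]
# Same cells as `mixed`, with rows and columns reordered.
permuted = [[row[2], row[0], row[1]] for row in (mixed[1], mixed[2], mixed[0])]

print("partial :", conventional_prv(partial), geo_prv(partial))
print("mixed   :", conventional_prv(mixed), geo_prv(mixed))
print("permuted:", conventional_prv(permuted), geo_prv(permuted))
```

A single row with zero conditional variation drives the geometric mean to zero, which is why the sketch returns 1 for `partial` while the arithmetic-mean version stays below 1.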
3 Approximate Confidence Interval for the Measure
Since the true value of the measure is unknown, we derive a confidence interval for it. Let $n_{ij}$ denote the observed frequency for cell $(i, j)$, and let $n = \sum_{i} \sum_{j} n_{ij}$ ($i = 1, \dots, R$; $j = 1, \dots, C$). Assuming that the observed frequencies have a multinomial distribution, we consider an approximate standard error and a large-sample confidence interval for the measure using the delta method (Bishop et al., 2007, and Appendix C in Agresti, 2010).
Theorem 3.
Let denote a plug-in estimator of . converges in distribution to a normal distribution with mean zero and variance , where
with
and is the derivative of function by .
Let denote a plug-in estimator of . From Theorem 3, since is a consistent estimator of , is an estimated standard error for , and is an approximate confidence limit for , where is the upper two-sided normal distribution percentile at level .
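The delta-method construction can be sketched numerically without the closed-form derivatives. The code below is only an illustration under stated assumptions: multinomial sampling, strictly positive cell counts, partial derivatives approximated by central differences, and the Shannon entropy as the variation measure of the conventional PRV measure. The function names are hypothetical; the counts are those of Table 5 in Section 4.

```python
import math
from statistics import NormalDist


def shannon_entropy(dist):
    """Shannon entropy with the convention 0 log 0 = 0."""
    return max(0.0, -sum(p * math.log(p) for p in dist if p > 0))


def conventional_prv(table, variation=shannon_entropy):
    """(V(Y) - E[V(Y|X)]) / V(Y); the table is normalised internally."""
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    v_marginal = variation([sum(row[j] for row in table) / total for j in range(n_cols)])
    expected = sum((sum(row) / total) * variation([c / sum(row) for c in row])
                   for row in table if sum(row) > 0)
    return (v_marginal - expected) / v_marginal


def delta_method_interval(counts, measure, level=0.95, step=1e-6):
    """Plug-in estimate, delta-method standard error, and Wald-type interval."""
    n = sum(sum(row) for row in counts)
    p = [[c / n for c in row] for row in counts]
    first_moment = 0.0   # sum of p_ij * d_ij
    second_moment = 0.0  # sum of p_ij * d_ij^2
    for i in range(len(p)):
        for j in range(len(p[i])):
            up = [row[:] for row in p]
            down = [row[:] for row in p]
            up[i][j] += step
            down[i][j] -= step
            derivative = (measure(up) - measure(down)) / (2 * step)
            first_moment += p[i][j] * derivative
            second_moment += p[i][j] * derivative ** 2
    variance = max(second_moment - first_moment ** 2, 0.0)
    estimate = measure(p)
    se = math.sqrt(variance / n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate, se, (estimate - z * se, estimate + z * se)


# Counts from Table 5 (rows: alcohol consumption, columns: cannabis use).
cannabis = [[204, 6, 1], [211, 13, 5], [357, 44, 38], [92, 34, 49]]
estimate, se, (lower, upper) = delta_method_interval(cannabis, conventional_prv)
print(f"estimate = {estimate:.4f}, SE = {se:.4f}, CI = ({lower:.4f}, {upper:.4f})")
```

Under these assumptions the output should be close to the first row of Table 6a. A closed-form derivative, as in Theorem 3, would be preferable when cells are zero or very small, because the finite-difference step can then push a cell below zero.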
4 Numerical Experiments
In this section, we examine the performance of the geoPRV measure and its differences from the conventional PRV measure proposed by Momozaki et al. (2022). We use and , which have the variation measure . In addition to applying for and for (see Ichimori, 2013), the former is expressed as and , while the latter is expressed as and . For the tuning parameters, set and .
Artificial data 1
Consider the artificial data in Table 1. These data are designed to show clearly the difference in characteristics between conventional PRV measures and the geoPRV measure. Table 1c shows the case where the first row of the explanatory variable has a complete association structure with the third column of the response variable. In contrast, Table 1a and Table 1b show the cases where the first row of the explanatory variable has a weak and a slightly strong association structure with the response variable, respectively.
Row | (1) | (2) | (3) | Total |
---|---|---|---|---|
(a) | | | | |
(1) | 0.005 | 0.125 | 0.370 | 0.500 |
(2) | 0.030 | 0.050 | 0.120 | 0.200 |
(3) | 0.045 | 0.075 | 0.180 | 0.300 |
Total | 0.080 | 0.250 | 0.670 | 1.000 |
(b) | | | | |
(1) | 0.005 | 0.025 | 0.470 | 0.500 |
(2) | 0.030 | 0.050 | 0.120 | 0.200 |
(3) | 0.045 | 0.075 | 0.180 | 0.300 |
Total | 0.080 | 0.150 | 0.770 | 1.000 |
(c) | | | | |
(1) | 0.000 | 0.000 | 0.500 | 0.500 |
(2) | 0.030 | 0.050 | 0.120 | 0.200 |
(3) | 0.045 | 0.075 | 0.180 | 0.300 |
Total | 0.075 | 0.125 | 0.800 | 1.000 |
The values of the conventional PRV measure and the geoPRV measure are provided in Table 2a and Table 2b, respectively. For instance, Table 2a shows that when Table 1c is analyzed, the conventional measure does not capture the complete association structure of the first row for any value of the tuning parameter. In contrast, the geoPRV measure equals 1 for all values of the tuning parameter, allowing us to identify the local complete association structure. Similarly, consider the results of both measures from Table 1a to Table 1c at any value of the tuning parameter. As these results show, the geoPRV measure changes more markedly than the conventional measure because it captures partial association structures.
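(a) The values of the conventional PRV (eGPRV) measure

Tuning parameter | Table 1a | Table 1b | Table 1c |
---|---|---|---|
0.0 | 0.0495 | 0.1285 | 0.2628 |
0.5 | 0.0302 | 0.1156 | 0.1990 |
1.0 | 0.0203 | 0.1105 | 0.1784 |

(b) The values of the geoPRV measure

Tuning parameter | Table 1a | Table 1b | Table 1c |
---|---|---|---|
0.0 | 0.0701 | 0.2765 | 1.0000 |
0.5 | 0.0487 | 0.3126 | 1.0000 |
1.0 | 0.0354 | 0.3221 | 1.0000 |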
Artificial data 2
Consider the artificial data in Table 3. These data are intended to examine the value of the geoPRV measure as the association of the entire contingency table changes. To this end, we obtained suitable data by converting bivariate normal distributions with fixed means and variances, whose correlation coefficient was varied over a grid of values, into $4 \times 4$ contingency tables with equal-interval frequency. From Theorem 2 and the properties of the PRV measures, when the absolute values of the correlation coefficients are the same, i.e., when the rows of the contingency table are simply swapped, the values of the measures are equal, so the results for negative correlation coefficients are omitted.
Row | (1) | (2) | (3) | (4) | Total | Row | (1) | (2) | (3) | (4) | Total |
---|---|---|---|---|---|---|---|---|---|---|---|
$\rho = 1.0$ | | | | | | $\rho = 0.4$ | | | | | |
(1) | 0.2500 | 0.0000 | 0.0000 | 0.0000 | 0.2500 | (1) | 0.1072 | 0.0692 | 0.0477 | 0.0258 | 0.2500 |
(2) | 0.0000 | 0.2500 | 0.0000 | 0.0000 | 0.2500 | (2) | 0.0692 | 0.0698 | 0.0632 | 0.0477 | 0.2500 |
(3) | 0.0000 | 0.0000 | 0.2500 | 0.0000 | 0.2500 | (3) | 0.0477 | 0.0632 | 0.0698 | 0.0692 | 0.2500 |
(4) | 0.0000 | 0.0000 | 0.0000 | 0.2500 | 0.2500 | (4) | 0.0258 | 0.0477 | 0.0692 | 0.1072 | 0.2500 |
Total | 0.2500 | 0.2500 | 0.2500 | 0.2500 | 1.0000 | Total | 0.2500 | 0.2500 | 0.2500 | 0.2500 | 1.0000 |
$\rho = 0.8$ | | | | | | $\rho = 0.2$ | | | | | |
(1) | 0.1691 | 0.0629 | 0.0164 | 0.0016 | 0.2500 | (1) | 0.0837 | 0.0668 | 0.0563 | 0.0432 | 0.2500 |
(2) | 0.0629 | 0.1027 | 0.0680 | 0.0164 | 0.2500 | (2) | 0.0668 | 0.0648 | 0.0621 | 0.0563 | 0.2500 |
(3) | 0.0164 | 0.0680 | 0.1027 | 0.0629 | 0.2500 | (3) | 0.0563 | 0.0621 | 0.0648 | 0.0668 | 0.2500 |
(4) | 0.0016 | 0.0164 | 0.0629 | 0.1691 | 0.2500 | (4) | 0.0432 | 0.0563 | 0.0668 | 0.0837 | 0.2500 |
Total | 0.2500 | 0.2500 | 0.2500 | 0.2500 | 1.0000 | Total | 0.2500 | 0.2500 | 0.2500 | 0.2500 | 1.0000 |
$\rho = 0.6$ | | | | | | $\rho = 0$ | | | | | |
(1) | 0.1345 | 0.0691 | 0.0353 | 0.0111 | 0.2500 | (1) | 0.0625 | 0.0625 | 0.0625 | 0.0625 | 0.2500 |
(2) | 0.0691 | 0.0797 | 0.0659 | 0.0353 | 0.2500 | (2) | 0.0625 | 0.0625 | 0.0625 | 0.0625 | 0.2500 |
(3) | 0.0353 | 0.0659 | 0.0797 | 0.0691 | 0.2500 | (3) | 0.0625 | 0.0625 | 0.0625 | 0.0625 | 0.2500 |
(4) | 0.0111 | 0.0353 | 0.0691 | 0.1345 | 0.2500 | (4) | 0.0625 | 0.0625 | 0.0625 | 0.0625 | 0.2500 |
Total | 0.2500 | 0.2500 | 0.2500 | 0.2500 | 1.0000 | Total | 0.2500 | 0.2500 | 0.2500 | 0.2500 | 1.0000 |
Table 4 shows the values of the conventional PRV measure and the geoPRV measure for each value of the correlation coefficient. We observe that the values of both measures increase as the absolute value of the correlation coefficient increases. Moreover, both measures equal 0 exactly in the independent table (zero correlation) and equal 1 exactly in the table with a structure of entirely (or partially) complete association (correlation 1). Also, when the association is spread over the entire contingency table rather than concentrated in particular rows, the values of the geoPRV measure are larger than those of the conventional measure, as guaranteed by Theorem 1, but the differences are small.
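(a) The values of the conventional PRV (eGPRV) measure

Tuning parameter | $\rho = 0$ | $\rho = 0.2$ | $\rho = 0.4$ | $\rho = 0.6$ | $\rho = 0.8$ | $\rho = 1.0$ |
---|---|---|---|---|---|---|
0.0 | 0.0000 | 0.0109 | 0.0461 | 0.1159 | 0.2541 | 1.0000 |
0.5 | 0.0000 | 0.0113 | 0.0471 | 0.1161 | 0.2479 | 1.0000 |
1.0 | 0.0000 | 0.0100 | 0.0419 | 0.1035 | 0.2236 | 1.0000 |

(b) The values of the geoPRV measure

Tuning parameter | $\rho = 0$ | $\rho = 0.2$ | $\rho = 0.4$ | $\rho = 0.6$ | $\rho = 0.8$ | $\rho = 1.0$ |
---|---|---|---|---|---|---|
0.0 | 0.0000 | 0.0109 | 0.0469 | 0.1203 | 0.2699 | 1.0000 |
0.5 | 0.0000 | 0.0113 | 0.0479 | 0.1205 | 0.2634 | 1.0000 |
1.0 | 0.0000 | 0.0100 | 0.0425 | 0.1071 | 0.2369 | 1.0000 |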
Actual data 1
Consider applying the PRV measures to the data in Table 5, taken from a survey of cannabis use among students conducted at the University of Ioannina (Greece) in 1995 and published in Marselos et al. (1997). The students' frequency of alcohol consumption is measured on a four-level scale ranging from at most once per month up to more frequently than twice per week, while their trial of cannabis is measured by a three-level variable (never tried, tried once or twice, more often). We can see the partial bias of the frequencies in the first and second rows of the data.
Alcohol consumption | I tried cannabis: Never | I tried cannabis: Once or twice | I tried cannabis: More often | Total |
---|---|---|---|---|
At most once/month | 204 | 6 | 1 | 211 |
Twice/month | 211 | 13 | 5 | 229 |
Twice/week | 357 | 44 | 38 | 439 |
More often | 92 | 34 | 49 | 175 |
Total | 864 | 97 | 93 | 1054 |
The estimates of the conventional PRV measure and the geoPRV measure are provided in Table 6a and Table 6b, respectively. For instance, when the tuning parameter is 0, the estimate is 0.1215 in Table 6a and 0.2601 in Table 6b. The former shows that the average conditional variation of trying cannabis is 12.15% smaller than the marginal variation, and the latter similarly shows that the average conditional variation, taken as a geometric mean, is 26.01% smaller. Based on these values, the following can be interpreted from Table 5:
(1) There is a strong overall association between alcohol consumption and experience of cannabis use.
(2) There are fairly strong associations between some categories of alcohol consumption and experience of cannabis use.
These interpretations seem intuitive when looking at Table 5. However, by analyzing the data using the measures, we are able to present an objective numerical interpretation and to show how strongly associated the structures in the contingency table are.
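(a) Conventional PRV (eGPRV) measure for Table 5

Tuning parameter | Estimated measure | Standard error | Confidence interval |
---|---|---|---|
0.0 | 0.1215 | 0.0175 | (0.0872, 0.1557) |
0.5 | 0.1090 | 0.0172 | (0.0752, 0.1428) |
1.0 | 0.1034 | 0.0174 | (0.0693, 0.1376) |

(b) geoPRV measure for Table 5

Tuning parameter | Estimated measure | Standard error | Confidence interval |
---|---|---|---|
0.0 | 0.2601 | 0.0439 | (0.1741, 0.3461) |
0.5 | 0.2922 | 0.0488 | (0.1965, 0.3879) |
1.0 | 0.2992 | 0.0502 | (0.2007, 0.3976) |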
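The point estimates in Table 6 can be approximated with a short sketch. It assumes that the tuning parameter indexes the Patil and Taillie diversity index $(1 - \sum_j p_j^{\lambda + 1}) / \lambda$, with the Shannon entropy as the limit at $\lambda = 0$, and that the geoPRV measure uses the weighted geometric mean of the conditional variations. The symbol $\lambda$ and the function names are assumptions made for illustration, and the code assumes that no row of the table is empty.

```python
import math


def diversity(dist, lam):
    """Patil-Taillie diversity index of degree lam (Shannon entropy when lam == 0)."""
    if lam == 0:
        return max(0.0, -sum(p * math.log(p) for p in dist if p > 0))
    return (1.0 - sum(p ** (lam + 1) for p in dist)) / lam


def prv_pair(counts, lam):
    """Return (conventional PRV, geoPRV) for one tuning parameter value."""
    total = sum(sum(row) for row in counts)
    n_cols = len(counts[0])
    v = diversity([sum(row[j] for row in counts) / total for j in range(n_cols)], lam)
    weights = [sum(row) / total for row in counts]
    cond = [diversity([c / sum(row) for c in row], lam) for row in counts]
    arithmetic = sum(w * vc for w, vc in zip(weights, cond))
    geometric = math.prod(vc ** w for w, vc in zip(weights, cond))
    return (v - arithmetic) / v, (v - geometric) / v


# Counts from Table 5 (rows: alcohol consumption, columns: cannabis use).
cannabis = [[204, 6, 1], [211, 13, 5], [357, 44, 38], [92, 34, 49]]
results = {lam: prv_pair(cannabis, lam) for lam in (0.0, 0.5, 1.0)}
for lam, (conventional, geo) in results.items():
    print(f"lambda = {lam:.1f}: conventional = {conventional:.4f}, geoPRV = {geo:.4f}")
```

Under these assumptions the printed values should agree with the estimates in Table 6a and Table 6b to about three decimal places; the standard errors are not reproduced here.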
Actual data 2
By analyzing multiple contingency tables using the measures, it is possible to determine numerically how much the associations of the contingency tables differ. To illustrate, consider the data in Table 7, taken from Hashimoto (1999). These data describe the cross-classifications of the father's and son's occupational status categories in Japan, which were examined in 1975 and 1985. We can regard the father's status as the explanatory variable and the son's status as the response variable, since the father's occupational status category seems to influence the son's. The analysis of Table 7 aims to show how the associations of occupational status categories for fathers and sons differ between 1975 and 1985.
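(a) Examined in 1975

Father's status | Son's status: Capitalist | Son's status: New middle | Son's status: Working | Son's status: Old middle | Total |
---|---|---|---|---|---|
Capitalist | 29 | 43 | 25 | 35 | 132 |
New middle | 23 | 159 | 89 | 52 | 323 |
Working | 11 | 69 | 184 | 44 | 308 |
Old middle | 84 | 323 | 525 | 613 | 1545 |
Total | 147 | 594 | 823 | 744 | 2308 |

(b) Examined in 1985

Father's status | Son's status: Capitalist | Son's status: New middle | Son's status: Working | Son's status: Old middle | Total |
---|---|---|---|---|---|
Capitalist | 46 | 59 | 34 | 42 | 181 |
New middle | 20 | 193 | 79 | 31 | 323 |
Working | 9 | 122 | 202 | 48 | 381 |
Old middle | 47 | 270 | 412 | 380 | 1109 |
Total | 122 | 644 | 727 | 501 | 1994 |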
Table 8 and Table 9 give the estimates of the conventional PRV measure and the geoPRV measure, respectively. Comparing the estimates for each tuning parameter in Table 8 and Table 9, we can see that the values of the two measures are almost the same. In addition, comparing Table 8a and Table 8b, the estimates are slightly larger in Table 8b, which suggests a somewhat stronger association in 1985, but there is little difference because all the confidence intervals overlap. Comparing Table 9a and Table 9b likewise, the estimates are slightly larger in Table 9b. However, the confidence intervals do not overlap when the tuning parameter is 0.9. From these values, the following can be interpreted for Table 7a and Table 7b:
(1) The occupational status categories of fathers and sons in 1975 and 1985 both have weak associations overall, further indicating that individual explanatory variables do not have remarkable associations.
(2)
(3)
When there are statistical differences from the results of some confidence intervals, as in (3), it is affected by differences in the characteristics of variation associated with changing the tuning parameters. In this case, it is difficult to give an interpretation by referring to variation because there was no difference in the variation in the special cases (e.g., ). However, when there are differences in variation in special cases, further interpretation can be given by focusing on the characteristics.
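(a) Conventional PRV (eGPRV) measure for Table 7a

Tuning parameter | Estimated measure | Standard error | Confidence interval |
---|---|---|---|
0.0 | 0.0480 | 0.0061 | (0.0361, 0.0600) |
0.5 | 0.0547 | 0.0067 | (0.0416, 0.0678) |
0.9 | 0.0401 | 0.0054 | (0.0294, 0.0507) |

(b) Conventional PRV (eGPRV) measure for Table 7b

Tuning parameter | Estimated measure | Standard error | Confidence interval |
---|---|---|---|
0.0 | 0.0598 | 0.0071 | (0.0459, 0.0736) |
0.5 | 0.0709 | 0.0079 | (0.0553, 0.0864) |
0.9 | 0.0665 | 0.0081 | (0.0506, 0.0823) |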
(a) geoPRV measure for Table 7a

Tuning parameter | Estimated measure | Standard error | Confidence interval |
---|---|---|---|
0.0 | 0.0499 | 0.0066 | (0.0371, 0.0628) |
0.5 | 0.0571 | 0.0072 | (0.0431, 0.0712) |
0.9 | 0.0416 | 0.0057 | (0.0304, 0.0528) |

(b) geoPRV measure for Table 7b

Tuning parameter | Estimated measure | Standard error | Confidence interval |
---|---|---|---|
0.0 | 0.0630 | 0.0077 | (0.0478, 0.0782) |
0.5 | 0.0752 | 0.0086 | (0.0583, 0.0922) |
0.9 | 0.0695 | 0.0084 | (0.0530, 0.0860) |
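The comparison between the two survey years can be illustrated for one variation measure. The sketch below assumes the Gini concentration $1 - \sum_j p_j^2$, which is taken here to correspond to the rows with tuning parameter 0.0 in Table 8 and Table 9; this correspondence is an assumption of the sketch, as are the function names. The counts are those of Table 7, with the father's status in the rows.

```python
import math


def gini_concentration(dist):
    """Gini concentration 1 - sum p^2."""
    return 1.0 - sum(p * p for p in dist)


def prv_pair(counts, variation=gini_concentration):
    """Return (conventional PRV, geoPRV); assumes that no row is empty."""
    total = sum(sum(row) for row in counts)
    n_cols = len(counts[0])
    v = variation([sum(row[j] for row in counts) / total for j in range(n_cols)])
    weights = [sum(row) / total for row in counts]
    cond = [variation([c / sum(row) for c in row]) for row in counts]
    arithmetic = sum(w * vc for w, vc in zip(weights, cond))
    geometric = math.prod(vc ** w for w, vc in zip(weights, cond))
    return (v - arithmetic) / v, (v - geometric) / v


# Counts from Table 7 (rows: father's status, columns: son's status).
table_1975 = [[29, 43, 25, 35], [23, 159, 89, 52], [11, 69, 184, 44], [84, 323, 525, 613]]
table_1985 = [[46, 59, 34, 42], [20, 193, 79, 31], [9, 122, 202, 48], [47, 270, 412, 380]]

pair_1975 = prv_pair(table_1975)
pair_1985 = prv_pair(table_1985)
print(f"1975: conventional = {pair_1975[0]:.4f}, geoPRV = {pair_1975[1]:.4f}")
print(f"1985: conventional = {pair_1985[0]:.4f}, geoPRV = {pair_1985[1]:.4f}")
```

Both measures come out small and close to each other in each year, which is consistent with a weak overall association and no single row with a markedly stronger one.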
5 Conclusion
In this paper, we proposed a geometric mean type of PRV (geoPRV) measure that uses a variation composed of a geometric mean and arbitrary functions that satisfy certain conditions. We showed that the proposed measure has the following three properties, shared with the conventional measures, that make it suitable for examining the degree of association: (i) the measure increases monotonically as the degree of association increases; (ii) the value is 0 when there is a structure of null association; and (iii) the value is 1 when there is a structure of complete association. Furthermore, by using geometric means, the geoPRV measure can capture the association with the response variable for individual explanatory variables, which could not be investigated by the existing PRV measures. Analyses that use the existing PRV measures and the geoPRV measure simultaneously can examine both the association of the entire contingency table and the partial association. Also, the geoPRV measure can be analyzed using variations with various characteristics by choosing functions and tuning parameters that satisfy the conditions, such as those used in Section 4. Therefore, using the geoPRV measure together with existing measures can lead to a deeper understanding of the data and provide further interpretation. While various measures for contingency tables have been proposed, several recent studies have conducted analyses using the Goodman-Kruskal PRV measure (e.g., Gea-Izquierdo, 2023; Iordache et al., 2022). We believe that the new PRV measure in this paper, when examined and compared together with the existing Goodman-Kruskal PRV measure, may provide a new perspective that pays attention to the association of individual explanatory variables in addition to the association of the entire contingency table.
Appendix A Proof of Theorem 1
Proof.
-
(i)
Let denote a numerator of a fraction
If there exists such that , is easily verified, i.e., is established. Moreover, consider cases other than this one. Assume that which is convex function since where is the second derivative of function by . From Jensen’s inequality,
where , . Therefore, , i.e., holds.
-
(ii)
The inequality is already proven by Momozaki et al., and holds as proved above. Hence, holds since . In addition, since , we obtain . Thus, holds.
-
(iii)
Since , if then . Hence, since holds for (Momozaki et al.), holds for . Thus, holds. Moreover, can be easily checked.
-
(iv)
If then , i.e., for some , (). Thus, there exists such that and ().
∎
Appendix B Proof of Theorem 2
Proof.
Since the first terms in the denominator and numerator of do not depend on the row category, we focus on the second term in the numerator. This term is
and the values are invariant to the reordering of the sums. Namely, the value of is invariant with respect to the permutation of row categories. Similarly, the value of is also invariant with respect to the permutation of column categories. ∎
Appendix C Proof of Theorem 3
Proof.
Let
, and is a transpose of . Then converges in distribution to a normal distribution with mean zero and the covariance matrix , where is a diagonal matrix with the elements of on the main diagonal (Bishop et al.,, 2007).
The Taylor expansion of the function around is given by
Therefore, since
where
with
and is the derivative of function by . ∎
References
- Agresti, (2003) Agresti, A. (2003). Categorical data analysis. John Wiley & Sons.
- Agresti, (2010) Agresti, A. (2010). Analysis of ordinal categorical data, volume 656. John Wiley & Sons.
- Bishop et al., (2007) Bishop, Y. M., Fienberg, S. E., and Holland, P. W. (2007). Discrete multivariate analysis: theory and practice. Springer Science & Business Media.
- Cramér, (1999) Cramér, H. (1999). Mathematical methods of statistics, volume 43. Princeton university press.
- Everitt, (1992) Everitt, B. S. (1992). The analysis of contingency tables. CRC Press.
- Gea-Izquierdo, (2023) Gea-Izquierdo, E. (2023). Biological risk of Legionella pneumophila in irrigation systems. Revista de Salud Pública, 22:434–439.
- Gilula and Haberman, (1986) Gilula, Z. and Haberman, S. J. (1986). Canonical analysis of contingency tables by maximum likelihood. Journal of the American statistical association, 81(395):780–788.
- Goodman, (1981) Goodman, L. A. (1981). Association models and canonical correlation in the analysis of cross-classifications having ordered categories. Journal of the American Statistical Association, 76(374):320–334.
- Goodman, (1985) Goodman, L. A. (1985). The analysis of cross-classified data having ordered and/or unordered categories: Association models, correlation models, and asymmetry models for contingency tables with or without missing entries. The Annals of Statistics, 13:10–69.
- Goodman and Kruskal, (1954) Goodman, L. A. and Kruskal, W. H. (1954). Measures of association for cross classifications. Journal of the American Statistical Association, 49(268):732–764.
- Hashimoto, (1999) Hashimoto, K. (1999). Gendai nihon no kaikyuu kouzou (class structure in modern japan: theory, method and quantitative analysis). Toshindo, Tokyo (in Japanese).
- Ichimori, (2013) Ichimori, T. (2013). On inequalities between -divergence. Technical Note, IPSJ Journal, 54(11):2344–2348.
- Iordache et al., (2022) Iordache, A. M., Nechita, C., Voica, C., Pluháček, T., and Schug, K. A. (2022). Climate change extreme and seasonal toxic metal occurrence in romanian freshwaters in the last two decades—case study and critical review. NPJ Clean Water, 5(1):2.
- Kateri and Papaioannou, (1994) Kateri, M. and Papaioannou, T. (1994). f-divergence Association Models. University of Ioannina.
- Marselos et al., (1997) Marselos, M., Boutsouris, K., Liapi, H., Malamas, M., Kateri, M., and Papaioannou, T. (1997). Epidemiological aspects of the use of cannabis among university students in Greece. European Addiction Research, 3(4):184–191.
- Momozaki et al., (2022) Momozaki, T., Wada, Y., Nakagawa, T., and Tomizawa, S. (2022). Extension of generalized proportional reduction in variation measure for two-way contingency tables. Behaviormetrika, pages 1–14.
- Nakagawa et al., (2020) Nakagawa, T., Takei, T., Ishii, A., and Tomizawa, S. (2020). Geometric mean type measure of marginal homogeneity for square contingency tables with ordered categories. Journal of Mathematics and Statistics, 16(1):170–175.
- Patil and Taillie, (1982) Patil, G. and Taillie, C. (1982). Diversity as a concept and its measurement. Journal of the American statistical Association, 77(379):548–561.
- Rom and Sarkar, (1992) Rom, D. and Sarkar, S. K. (1992). A generalized model for the analysis of association in ordinal contingency tables. Journal of statistical planning and inference, 33(2):205–212.
- Saigusa et al., (2016) Saigusa, Y., Tahata, K., and Tomizawa, S. (2016). Measure of departure from partial symmetry for square contingency tables. Journal of Mathematics and Statistics, 12(3):152–156.
- Saigusa et al., (2019) Saigusa, Y., Takami, M., Ishii, A., Nakagawa, T., and Tomizawa, S. (2019). Measure for departure from cumulative partial symmetry for square contingency tables with ordered categories. Journal of Statistics: Advances in Theory and Applications, 21:53–70.
- Tahata, (2022) Tahata, K. (2022). Advances in quasi-symmetry for square contingency tables. Symmetry, 14(5):1051.
- Theil, (1970) Theil, H. (1970). On the estimation of relationships involving qualitative variables. American Journal of Sociology, 76(1):103–154.
- Tomizawa et al., (2004) Tomizawa, S., Miyamoto, N., and Houya, H. (2004). Generalization of cramer’s coefficient of association for contingency tables: theory and methods. South African Statistical Journal, 38(1):1–24.
- Tomizawa et al., (1997) Tomizawa, S., Seo, T., and Ebi, M. (1997). Generalized proportional reduction in variation measure for two-way contingency tables. Behaviormetrika, 24(2):193–201.
- Tschuprow, (1925) Tschuprow, A. (1925). Grundbegriffe und grundprobleme der korrelationstheorie. Leipzig: B.G. Teubner.
- Tschuprow, (1939) Tschuprow, A. (1939). Principles of the mathematical theory of correlation. W. Hodge & Co.