
Fair Classification under Covariate Shift and Missing Protected Attribute
- an Investigation using Related Features

Manan Singh
(November 2021)

Abstract

This study investigated the problem of fair classification under covariate shift and a missing protected attribute, using a simple approach that combines importance weights to handle covariate shift with Related Features (Zhao et al., 2021) to handle the missing protected attribute.

1 Introduction

Automated decision making has become prevalent in our society today. Organizations such as companies, judiciaries, and other institutions rely on machine learning models, especially binary classifiers, to make important decisions about people, such as loan crediting or admission into a university. These decisions can become biased against a specific group, such as a race or gender, if the classifier has been trained on a biased training set. (A bias in a binary-classification training set can refer either to an imbalance in the sample sizes of the two groups, e.g., 1000 samples available for males but only 100 for females, or to an imbalance in the class labels for the two groups, e.g., 90% of males labeled positive but only 10% of females.) The field of fair classification has developed many methods to tackle this problem, but they mostly depend on the availability of the group information, also known as the sensitive or protected attribute.

But the availability of the protected attribute might not be realistic in a real-world setting, for example, when a law prohibits the collection of sensitive group information such as gender or race, which makes achieving fairness challenging. To achieve fairness when the true sensitive attribute is unavailable, Zhao et al. (2021) suggest using a set of non-sensitive features, identified beforehand by domain experts as being highly correlated with the missing true attribute. They call these Related Features. We found the use of related features to achieve fairness to be a relatively unexplored direction in the fairness literature, and hence explored their use for fair classification under covariate shift.

Covariate shift is a very realistic issue faced by machine learning models when, after being trained on a particular data distribution, they have to be used on a new and different data distribution. We illustrate the issue with a hypothetical scenario of a company that has been using a model trained on the customer records of its own country for the past few years, but now wants to expand its market to another country. As a result, it faces a change of distribution, and its existing model might prove unreliable. Moreover, due to limitations of data collection, the company may find it difficult to obtain labelled data for the new country that it could use to train a new model. Building an accurate classifier thus becomes a challenge. This is the standard problem of unsupervised covariate shift.

Further, the laws of the new country might require the company's classifier to also be fair with respect to the country's population, but knowledge of the true protected attribute might not be available to the company due to limitations of data collection or legal restrictions. In such a case, the company might decide to seek the help of domain experts, identify a set of non-sensitive but related attributes, and use them to achieve fairness instead.

In this novel hybrid setting, which involves both a missing protected attribute and covariate shift, we investigated a simple approach based on Related Features to address the joint issue.

2 Problem Setting

We are given $N_S$ labelled samples $(\mathbf{X}^{S},Y^{S})=\{(\mathbf{x}_{i}^{S},y_{i}^{S})\}_{i=1}^{N_S}$ from the source domain, and $N_T$ unlabelled samples $\mathbf{X}^{T}=\{\mathbf{x}_{i}^{T}\}_{i=1}^{N_T}$ from the target domain, where $\mathbf{x}_{i}\in\mathbb{R}^{d}$ and $y_{i}\in\{0,1\}$. Under the covariate-shift assumption, the source-domain features $\mathbf{x}_{i}^{S}$ are drawn from a source distribution $p_{S}(\mathbf{x})$, whereas the target-domain features $\mathbf{x}_{i}^{T}$ are drawn from a different target distribution $p_{T}(\mathbf{x})$. The conditional distribution $p(y|\mathbf{x})$ is assumed to be the same in both domains.

We wish to assign labels $Y^{T}=\{y_{i}^{T}\}_{i=1}^{N_T}$ to the samples in the target domain using a classifier $g_{\theta}:\mathbf{X}\rightarrow Y$ (we denote the classifier predictions after the sigmoid by $h_{\theta}:\mathbf{X}\rightarrow[0,1]$, or $\tilde{y}$) that is not only accurate, but also fair with respect to the missing protected attribute in the target domain, $A^{T}=\{a_{i}^{T}\}_{i=1}^{N_T}$.

Although the true protected attribute is absent, a set of $K$ features $F^{T}=\{(x_{i1}^{T},x_{i2}^{T},\ldots,x_{iK}^{T})\}_{i=1}^{N_T}$ has been identified by domain experts as being highly correlated with the true attribute, and these features are to be used to enforce fairness in the classifier.

3 Methodology

3.1 Handling Covariate Shift

A standard way to build a classifier $h_{\theta}$ that is robust to covariate shift is to weight the source-domain samples according to how likely they are under the target distribution, by modifying the traditional classifier loss as follows:

$\mathcal{L}(\mathbf{X}^{S},Y^{S};\theta)=\frac{1}{N_{S}}\sum_{i=1}^{N_{S}}w(\mathbf{x}_{i}^{S})\;\mathcal{L}_{clf}(h_{\theta}(\mathbf{x}_{i}^{S}),y_{i}^{S})$ (1)

where $\mathcal{L}_{clf}(\cdot)$ is the standard binary cross-entropy loss:

$\mathcal{L}_{clf}(h_{\theta}(\mathbf{x}_{i}^{S}),y_{i}^{S})=-y_{i}^{S}\log h_{\theta}(\mathbf{x}_{i}^{S})-(1-y_{i}^{S})\log(1-h_{\theta}(\mathbf{x}_{i}^{S}))$ (2)

and $w(\cdot)$ assigns importance weights to the source-domain samples:

$w(\mathbf{x}_{i}^{S})=\frac{p_{T}(\mathbf{x}_{i}^{S})}{p_{S}(\mathbf{x}_{i}^{S})}$ (3)

This density ratio is commonly estimated using a classifier, such as logistic regression, trained to distinguish between the source- and target-domain features, $\mathbf{X}^{S}$ and $\mathbf{X}^{T}$ (Sugiyama et al., 2012).
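As a concrete illustration, the sketch below estimates the importance weights in Eq. (3) with a logistic-regression domain classifier. It is a minimal sketch, not the implementation used in this study; the function name, the prior-ratio rescaling, and the weight clipping are our own choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimate_importance_weights(X_source, X_target, clip=50.0):
    """Estimate w(x) = p_T(x) / p_S(x) with a domain classifier.

    A logistic-regression probe is trained to distinguish target (label 1)
    from source (label 0) samples; the density ratio follows from its
    posterior odds, rescaled by the class-prior ratio N_S / N_T.
    """
    n_s, n_t = len(X_source), len(X_target)
    X = np.vstack([X_source, X_target])
    d = np.concatenate([np.zeros(n_s), np.ones(n_t)])  # domain labels

    probe = LogisticRegression(max_iter=1000).fit(X, d)
    p_target = probe.predict_proba(X_source)[:, 1]      # P(domain = target | x)

    # w(x) = p_T(x) / p_S(x) is proportional to P(T|x) / P(S|x) * (N_S / N_T)
    weights = (p_target / np.clip(1.0 - p_target, 1e-6, None)) * (n_s / n_t)
    return np.clip(weights, 0.0, clip)  # clip extreme ratios for stability
```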

3.2 Fairness using Related Features

A common approach to building a fair classifier is to add a fairness loss to the conventional classification loss. When the true protected attribute is available, Zafar et al. (2017) suggest using the correlation between this attribute, $A$, and the predictions, $\tilde{Y}=h_{\theta}(\mathbf{X})$, as the fairness loss:

$\mathcal{L}_{fair}(A,\mathbf{X};\theta)=\frac{1}{N}\left|\sum_{i=1}^{N}(a_{i}-\mu_{a})(h_{\theta}(\mathbf{x}_{i})-\mu_{h_{\theta}})\right|$ (4)

In the absence of the true protected attribute, but when $K$ related attributes $F=\{X_{1},X_{2},\ldots,X_{K}\}$ are available, Zhao et al. (2021) extend the above fairness loss as follows:

$\mathcal{L}_{fair}(\{X_{1},X_{2},\ldots,X_{K}\},\mathbf{X};\theta)=\sum_{k=1}^{K}\lambda_{k}\,\mathcal{L}_{fair}(X_{k},\mathbf{X};\theta)$ (5)

where $\lambda_{k}$ is a feature-specific weight that can be chosen by domain experts, set to uniform, or even learnt automatically using a scheme provided by Zhao et al. (2021).
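A minimal differentiable version of the fairness losses in Eqs. (4) and (5) is sketched below, assuming the related features and predictions are available as PyTorch tensors; the function names and the uniform default weights are our own assumptions.

```python
import torch

def correlation_fairness_loss(attr, preds):
    """Eq. (4): absolute mean cross-covariance between an attribute
    (or a related feature standing in for it) and the predictions."""
    return ((attr - attr.mean()) * (preds - preds.mean())).mean().abs()

def related_features_fairness_loss(related, preds, lambdas=None):
    """Eq. (5): weighted sum of per-feature fairness losses.
    `related` has shape (N, K); uniform weights are used if `lambdas` is None."""
    K = related.shape[1]
    if lambdas is None:
        lambdas = torch.full((K,), 1.0 / K)
    per_feature = torch.stack(
        [correlation_fairness_loss(related[:, k], preds) for k in range(K)]
    )
    return (lambdas * per_feature).sum()
```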

3.3 A Hybrid Method

As we aimed to jointly handle both covariate shift, and fairness in the absence of true protected attribute but in the presence of related attributes, we chose to minimize the following loss that incorporates both of the approaches mentioned in the previous subsections.

$\mathcal{L}(\mathbf{X}^{S},Y^{S},F^{T}=\{X_{1}^{T},\ldots,X_{K}^{T}\},\mathbf{X}^{T};\theta)=\frac{1}{N_{S}}\sum_{i=1}^{N_{S}}w_{i}^{S}\,\mathcal{L}_{clf}(\mathbf{x}_{i}^{S},y_{i}^{S};\theta)+\eta\sum_{k=1}^{K}\lambda_{k}\,\mathcal{L}_{fair}(X_{k}^{T},\mathbf{X}^{T};\theta)$ (6)

where $\eta$ is a fairness coefficient that can be tuned as a hyper-parameter.

The first term encourages the classifier to be accurate under covariate shift, while the second term enforces fairness in the target domain using the related features.
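Putting the two pieces together, one possible evaluation of the hybrid objective in Eq. (6) might look like the following sketch; it reuses `related_features_fairness_loss` from the earlier snippet, and the argument names are our own assumptions rather than the original implementation.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(model, x_src, y_src, w_src, x_tgt, related_idx, eta, lambdas=None):
    """Eq. (6): importance-weighted BCE on the source domain plus the
    related-features fairness penalty on the (unlabelled) target domain.

    model       : network whose output is the post-sigmoid score h_theta(x)
    w_src       : precomputed importance weights w(x_i^S)
    related_idx : column indices of the related features within x_tgt
    eta         : fairness coefficient
    """
    # Importance-weighted binary cross-entropy on the labelled source samples.
    p_src = model(x_src).squeeze(-1)
    clf = (w_src * F.binary_cross_entropy(p_src, y_src, reduction="none")).mean()

    # Correlation-based fairness penalty on the target-domain predictions.
    p_tgt = model(x_tgt).squeeze(-1)
    fair = related_features_fairness_loss(x_tgt[:, related_idx], p_tgt, lambdas)

    return clf + eta * fair
```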

4 Experiments

Approaches

To investigate the effectiveness of the hybrid approach, we ran experiments for the following approaches:

  1. Vanilla: A simple MLP classifier trained with the usual binary cross-entropy loss.

  2. Related Features removed: Same as the vanilla approach, but with the related features removed from the feature set.

  3. Covariate-Shift Adapted: An MLP classifier trained with a weighted binary cross-entropy loss, where the weights are the covariate-shift importance weights.

  4. Fair using Related Features: An MLP classifier trained with the usual binary cross-entropy loss, plus an additional loss term to incorporate fairness using related features. (Uniform weights were used for the related features.)

  5. Hybrid: An MLP classifier trained with the covariate-shift-weighted classifier loss, plus the fairness loss using related features.

In all the experiments, the source-domain samples were used for training, whereas the target-domain samples were used for validation and testing.

Experiment-Settings

All the approaches used a three-layer multi-layer perceptron (MLP) with two hidden layers of size 64 and 32, respectively. The learning rate was set to 0.01 and the batch size to 256. Early stopping based on the total validation loss was used to halt training: as training proceeds over the epochs, the best (i.e., minimum) total validation loss seen so far is tracked, and if the validation loss does not improve for k consecutive epochs, training is halted; k is called the patience, and a patience of 5 was used in our experiments. For the approaches that involve a fairness coefficient, experiments were run over the range [1e-5, 1e-4, 0.001, 0.01, 0.1, 1.0, 10, 100, 1000, 1e4, 1e5], and for each value the tradeoff between performance and fairness was reported.
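For concreteness, a sketch of the network and the early-stopping loop under the settings above follows; it is only an illustration under stated assumptions, and the class and function names, the maximum epoch count, and the use of a generic `run_epoch` / `validation_loss` callback pair are ours.

```python
import torch.nn as nn

class MLP(nn.Module):
    """Two hidden layers of size 64 and 32, sigmoid output (post-sigmoid scores)."""
    def __init__(self, d_in):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

def train_with_early_stopping(run_epoch, validation_loss, max_epochs=200, patience=5):
    """Track the best total validation loss; stop after `patience` epochs
    without improvement (patience = 5 in our experiments)."""
    best, stale = float("inf"), 0
    for _ in range(max_epochs):
        run_epoch()                   # one pass over the source-domain batches
        loss = validation_loss()      # total validation loss on the held-out split
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
```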

Datasets

The experiments were run for the following datasets.

  • ADULT (https://archive.ics.uci.edu/ml/datasets/adult): This dataset is part of the 1994 US Census surveys, and contains around 45K records of people with 12 features such as age, education, hours-per-week, etc. (not counting the sensitive features race and gender). The true protected attribute $A$ is taken to be sex, and the features relationship, hours-per-week, and marital-status are chosen to comprise the related feature set $F$. For covariate shift, the samples with age below the median form the source domain, and the rest form the target domain.

  • MEPS (Medical Expenditure Panel Survey; https://meps.ahrq.gov/mepsweb/data_stats/download_data_files_detail.jsp?cboPufNumber=HC-181, https://github.com/Trusted-AI/AIX360/blob/master/aix360/data/meps_data/README.md): This dataset contains around 8K records from a large-scale 2015 survey of the US population. Around 42 features were collected, including demographics, healthcare expenditure, and other self-reported medical information. We binarized the healthcare expenditure into low-cost or high-cost according to whether it falls below the median value. Sex is considered the true protected attribute, and the features PREGNT_31, HONRD31, and INCOME_M form the set of related features $F$. (The full codebook describing the feature names can be found at https://meps.ahrq.gov/data_stats/download_data_files_codebook.jsp?PUFId=H181.) For covariate shift, the samples with annual income below the median form the source domain, and the rest form the target domain.

Note that the choice of the features comprising the set of related features, $F$, was assumed to be part of domain knowledge.
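The covariate-shift splits described above reduce to thresholding one column at its median; a hypothetical helper (the column names and the preprocessed dataframe are placeholders, not the actual preprocessing code) might look like this:

```python
import pandas as pd

def median_split(df: pd.DataFrame, shift_col: str, label_col: str):
    """Source domain: rows with `shift_col` below its median (e.g., age for ADULT,
    annual income for MEPS); target domain: the remaining rows."""
    below = df[shift_col] < df[shift_col].median()
    source, target = df[below], df[~below]
    X_s, y_s = source.drop(columns=[label_col]), source[label_col]
    X_t, y_t = target.drop(columns=[label_col]), target[label_col]
    return (X_s, y_s), (X_t, y_t)
```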

Evaluation Metrics

To evaluate the classifier's performance, we reported the AUC score, and to measure its fairness, we reported the demographic parity distance ($\Delta_{DP}$):

$\Delta_{DP}=\left|\mathbb{E}(\hat{y}\mid a=1)-\mathbb{E}(\hat{y}\mid a=0)\right|$

where $\hat{y}$ denotes the predicted label and $a$ denotes the true protected attribute. Note that this fairness metric relies only on the predictions, and not on the actual labels $y$.
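A straightforward way to compute this metric from hard predictions and the (evaluation-only) true protected attribute is sketched below; the function name is our own.

```python
import numpy as np

def demographic_parity_distance(y_pred, a):
    """Delta_DP = |E[y_hat | a = 1] - E[y_hat | a = 0]|.
    `y_pred` holds the predicted labels; `a` holds the true protected attribute,
    which is used only for evaluation, never during training."""
    y_pred, a = np.asarray(y_pred, dtype=float), np.asarray(a)
    return abs(y_pred[a == 1].mean() - y_pred[a == 0].mean())
```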

To report the results for an experiment, the experiment was repeated 5 times with different seeds, and the test-set metrics averaged over these 5 runs are reported.

5 Results and Discussion

For all the approaches, the performance-vs-fairness tradeoffs are plotted in Figure 1. (For the approaches involving a fairness coefficient, the results obtained for various values of the coefficient are plotted.) For detailed results, see the appendix.

Figure 1: Performance vs. fairness tradeoffs for the various approaches.

Our goal with the Hybrid method, i.e., the one that uses related features for fairness along with covariate-shift adaptation, was to see whether it could significantly outperform the baselines in terms of fairness, performance, or both. We made the following observations:

Fairness: In terms of best fairness, on the Adult dataset it was the Fair using Related Features approach (Zhao et al., 2021) that achieved the lowest $\Delta_{DP}$ of around 0.05 (the lowest red dot in Figure 1), whereas on the MEPS dataset it was the Covariate-Shift Adapted approach that achieved the lowest (the green dot near the bottom in Figure 1).

Performance: In terms of best AUC, several approaches other than the hybrid were able to come close to the best value, i.e., around 85% on Adult and around 80% on MEPS.

Fairness-Performance tradeoff: In terms of the best tradeoff, on both the Adult and the MEPS datasets the Related Features removed approach seemed best, with an AUC of around 80% and a $\Delta_{DP}$ of around 0.10 on Adult, and an AUC of 80% and a $\Delta_{DP}$ of 0.05 on MEPS. Although, for certain values of the fairness coefficient, the Fair using Related Features and Hybrid approaches had similar tradeoffs, these did not seem significantly superior to the Related Features removed approach.

Thus, the overall results indicate that the hybrid method, i.e., the use of related features along with covariate-shift adaptation, is not significantly more effective than even simple baselines such as outright removal of the related features.

6 Conclusion

In this study, we investigated the problem of fair classification under covariate shift and a missing protected attribute, using an approach that combines importance weights for covariate-shift adaptation with fairness via related features to handle the missing attribute, and found it to be not significantly more effective than even a simple baseline such as removal of the related features.

References

  • Agarwal et al. (2018) Alekh Agarwal, Alina Beygelzimer, Miroslav Dudík, John Langford, and Hanna Wallach. A reductions approach to fair classification. In International Conference on Machine Learning, pages 60–69. PMLR, 2018.
  • Amini et al. (2019) Alexander Amini, Ava P Soleimany, Wilko Schwarting, Sangeeta N Bhatia, and Daniela Rus. Uncovering and mitigating algorithmic bias through learned latent structure. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pages 289–295, 2019.
  • Bechavod and Ligett (2017) Yahav Bechavod and Katrina Ligett. Penalizing unfairness in binary classification. arXiv preprint arXiv:1707.00044, 2017.
  • Coston et al. (2019) Amanda Coston, Karthikeyan Natesan Ramamurthy, Dennis Wei, Kush R Varshney, Skyler Speakman, Zairah Mustahsan, and Supriyo Chakraborty. Fair transfer learning with missing protected attributes. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pages 91–98, 2019.
  • Hardt et al. (2016) Moritz Hardt, Eric Price, and Nathan Srebro. Equality of opportunity in supervised learning. arXiv preprint arXiv:1610.02413, 2016.
  • Lahoti et al. (2020) Preethi Lahoti, Alex Beutel, Jilin Chen, Kang Lee, Flavien Prost, Nithum Thain, Xuezhi Wang, and Ed H Chi. Fairness without demographics through adversarially reweighted learning. arXiv preprint arXiv:2006.13114, 2020.
  • Louizos et al. (2015) Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, and Richard Zemel. The variational fair autoencoder. arXiv preprint arXiv:1511.00830, 2015.
  • Rezaei et al. (2020) Ashkan Rezaei, Anqi Liu, Omid Memarrast, and Brian Ziebart. Robust fairness under covariate shift. arXiv preprint arXiv:2010.05166, 2020.
  • Schumann et al. (2019) Candice Schumann, Xuezhi Wang, Alex Beutel, Jilin Chen, Hai Qian, and Ed H Chi. Transfer of machine learning fairness across domains. arXiv preprint arXiv:1906.09688, 2019.
  • Singh et al. (2021) Harvineet Singh, Rina Singh, Vishwali Mhasawade, and Rumi Chunara. Fairness violations and mitigation under covariate shift. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pages 3–13, 2021.
  • Sugiyama et al. (2012) Masashi Sugiyama, Taiji Suzuki, and Takafumi Kanamori. Density ratio estimation in machine learning. Cambridge University Press, 2012.
  • Yoon et al. (2020) Taeho Yoon, Jaewook Lee, and Woojin Lee. Joint transfer of model knowledge and fairness over domains using wasserstein distance. IEEE Access, 8:123783–123798, 2020.
  • Zafar et al. (2017) Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rogriguez, and Krishna P Gummadi. Fairness constraints: Mechanisms for fair classification. In Artificial Intelligence and Statistics, pages 962–970. PMLR, 2017.
  • Zemel et al. (2013) Rich Zemel, Yu Wu, Kevin Swersky, Toni Pitassi, and Cynthia Dwork. Learning fair representations. In International conference on machine learning, pages 325–333. PMLR, 2013.
  • Zhao et al. (2021) Tianxiang Zhao, Enyan Dai, Kai Shu, and Suhang Wang. You can still achieve fairness without sensitive attributes: Exploring biases in non-sensitive features. arXiv preprint arXiv:2104.14537, 2021.

Appendix A Related Work

This work explored the problem of fair classification under covariate shift and with a missing protected attribute, and hence lies at the intersection of fair classification, covariate shift, and missing protected attributes. We briefly list the related works below.

The classic fairness literature, which aims at achieving fair classification with respect to a particular protected attribute, includes pre-processing (and fair representation learning) [Zemel et al., 2013, Louizos et al., 2015], in-processing [Bechavod and Ligett, 2017, Agarwal et al., 2018], and post-processing [Hardt et al., 2016] techniques.

To also handle the realistic problem of covariate shift, in which the data distribution on which the model is to be used (the target domain) differs from the distribution available during training (the source domain), recent works aim at achieving fairness in the presence of covariate shift [Singh et al., 2021, Rezaei et al., 2020, Yoon et al., 2020, Schumann et al., 2019].

But a major limitation of such works is their reliance on a specific protected attribute. Lahoti et al. [2020] and Amini et al. [2019] have proposed solutions to achieve fairness when the protected attribute is hidden and never available, but their methods do not handle covariate shift. Coston et al. [2019] also handle covariate shift, but assume the availability of the protected attribute either in the source or in the target domain. We could not find any work in the literature that handles complete unavailability, i.e., in both the source and target domains.

Appendix B Detailed Results

For the approaches that do not incorporate fairness, the results are tabulated in Tables 1 and 2. For the approaches that do incorporate fairness, the performance-versus-fairness tradeoffs over a varying range of the fairness coefficient are plotted in Figures 2 and 3, and the results are tabulated in Tables 3, 4, 5 and 6. (Although the fairness-coefficient range [1e-6, 1e6] was experimented with, only those fairness coefficients that yielded non-zero $\Delta_{DP}$ are reported, because the ones that led to zero $\Delta_{DP}$ also led to very poor AUC values of almost 50% or below.)

The thresholds in the plots correspond to the results of the "Related Features removed" approach, and in none of the plots did the "Hybrid" approach show a significantly better AUC vs. $\Delta_{DP}$ tradeoff than that of "Related Features removed", which was around 78% (AUC) and 0.1 ($\Delta_{DP}$) on Adult, and around 80% (AUC) and 0.05 ($\Delta_{DP}$) on MEPS.

Table 1: Results on Adult dataset
Method AUC $\Delta_{DP}$
Vanilla 0.8565 0.2978
Related Features Removed 0.7827 0.0995
Covariate Shift Adapted 0.8667 0.2476
Table 2: Results on MEPS dataset
Method AUC $\Delta_{DP}$
Vanilla 0.8115 0.1281
Related Features Removed 0.8033 0.0596
Covariate Shift Adapted 0.7742 0.0496
Table 3: Results for a varying range of fairness coefficient - Approach: Fair using Related Features; Dataset: Adult.
Sn. Fairness Coefficient AUC $\Delta_{DP}$
1 1.00E-05 0.8534 0.2505
2 0.0001 0.8512 0.285
3 0.001 0.8486 0.2798
4 0.01 0.7372 0.0814
5 0.1 0.692 0.055
6 1 0.7122 0.0298
Table 4: Results for a varying range of fairness coefficient - Approach: Hybrid; Dataset: Adult.
Sn. Fairness Coefficient AUC $\Delta_{DP}$
1 1.00E-05 0.867 0.2428
2 0.0001 0.861 0.231
3 0.001 0.7207 0.1062
4 0.01 0.6591 0.0639
5 0.1 0.7254 0.0395
Table 5: Results for a varying range of fairness coefficient - Approach: Fair using Related Features; Dataset: MEPS.
Sn. Fairness Coefficient AUC $\Delta_{DP}$
1 1.00E-05 0.8121 0.1289
2 0.0001 0.8114 0.1417
3 0.001 0.8077 0.1484
4 0.01 0.8044 0.1814
5 0.1 0.8021 0.239
6 1 0.7992 0.2271
7 10 0.7707 0.1218
Table 6: Results for a varying range of fairness coefficient - Approach: Hybrid; Dataset: MEPS.
Sn. Fairness Coefficient AUC $\Delta_{DP}$
1 1.00E-05 0.7838 0.0653
2 0.0001 0.7875 0.0732
3 0.001 0.7835 0.1193
4 0.01 0.7799 0.1888
5 0.1 0.7878 0.2097
6 1 0.7747 0.1755
Figure 2: AUC vs. fairness tradeoffs on the Adult dataset. Panels: (a) Fair using Related Features; (b) Hybrid (Covariate-Shift Adapted + Fair using Related Features). The thresholds correspond to the results of the "Related Features removed" approach.
Figure 3: AUC vs. fairness tradeoffs on the MEPS dataset. Panels: (a) Fair using Related Features; (b) Hybrid (Covariate-Shift Adapted + Fair using Related Features). The thresholds correspond to the results of the "Related Features removed" approach.