This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Fair Representation Clustering with Several Protected Classes

Zhen Dai University of Chicago. Supported by grants from NSF and DARPA. Email: [email protected]    Yury Makarychev Toyota Technological Institute at Chicago (TTIC). Supported by NSF awards CCF-1718820, CCF-1955173, and CCF-1934843. Email: [email protected]    Ali Vakilian Toyota Technological Institute at Chicago (TTIC). Supported by NSF award CCF-1934843. Email: [email protected]
Abstract

We study the problem of fair kk-median where each cluster is required to have a fair representation of individuals from different groups. In the fair representation kk-median problem, we are given a set of points XX in a metric space. Each point xXx\in X belongs to one of \ell groups. Further, we are given fair representation parameters αj\alpha_{j} and βj\beta_{j} for each group j[]j\in[\ell]. We say that a kk-clustering C1,,CkC_{1},\cdots,C_{k} fairly represents all groups if the number of points from group jj in cluster CiC_{i} is between αj|Ci|\alpha_{j}|C_{i}| and βj|Ci|\beta_{j}|C_{i}| for every j[]j\in[\ell] and i[k]i\in[k]. The goal is to find a set 𝒞\mathcal{C} of kk centers and an assignment ϕ:X𝒞\phi:X\rightarrow\mathcal{C} such that the clustering defined by (𝒞,ϕ)(\mathcal{C},\phi) fairly represents all groups and minimizes the 1\ell_{1}-objective xXd(x,ϕ(x))\sum_{x\in X}d(x,\phi(x)).

We present an O(logk)O(\log k)-approximation algorithm that runs in time nO()n^{O(\ell)}. Note that the known algorithms for the problem either (i) violate the fairness constraints by an additive term or (ii) run in time that is exponential in both kk and \ell. We also consider an important special case of the problem where αj=βj=fjf\alpha_{j}=\beta_{j}=\frac{f_{j}}{f} and fj,ff_{j},f\in\mathbb{N} for all j[]j\in[\ell]. For this special case, we present an O(logk)O(\log k)-approximation algorithm that runs in (kf)O()logn+poly(n)(kf)^{O(\ell)}\log n+\mathop{\mathrm{poly}}(n) time.

1 Introduction

Algorithmic decision making is widely used for high-stake decisions like college admissions (Marcinkowski et al., 2020) and criminal justice (Chouldechova, 2017; Kleinberg et al., 2018). While automated decision-making processes are often very efficient, there are serious concerns about their fairness. Consequently, in recent years, there has been an extensive line of research on fairness of algorithms and machine learning approaches (Chouldechova and Roth, 2020; Kearns and Roth, 2019; Kleinberg et al., 2017).

In this paper, we study the “fair representation” clustering problem proposed in the seminal work of Chierichetti et al. (2017). The notion, which is motivated by the concept of disparate impact (Feldman et al., 2015), requires that each protected class has an approximately equal representation in each cluster. In many scenarios, a different set of benefits are associated with each cluster of points output by the algorithm. Then, it is desirable that different groups of individuals (e.g., men or women) receive the benefits associated with each of the clusters (e.g., mortgage options) in similar proportions. Further, clustering is often used for feature engineering. In this case, we need to ensure that the generated features are fair; that is, they neither introduce new nor amplify existing biases in the data set. Now, we formally define the notion of representation fairness for clustering.

Definition 1.1 (fair representation clustering).

Given a set of points XX that come from \ell different groups X1,,XX_{1},\dots,X_{\ell}, a kk-clustering C1,,CkC_{1},\cdots,C_{k} of XX is fair with respect to the fairness requirement specified by {αj,βj}j\{\alpha_{j},\beta_{j}\}_{j\in\ell} if

i[k],j[],αj|Ci||CiXj|βj|Ci|\displaystyle\forall i\in[k],j\in[\ell],\quad\alpha_{j}|C_{i}|\leq|C_{i}\cap X_{j}|\leq\beta_{j}|C_{i}| (1)

In fair kk-median with fairness requirement {αj,βj}j\{\alpha_{j},\beta_{j}\}_{j}, the goal is to find kk clusters C1,,CkC_{1},\dots,C_{k} and kk centers, c1,,ckc_{1},\dots,c_{k} (one center for each cluster) so that the clustering C1,,CkC_{1},\dots,C_{k} is fair with respect to the fairness requirement and the 1\ell_{1}-objective i=1kxCid(x,ci)\sum_{i=1}^{k}\sum_{x\in C_{i}}d(x,c_{i}) is minimized. We will say that points in CiC_{i} are assigned to center cic_{i}. We let ϕ\phi be the assignment function that maps each point uu to the center uu is assigned to. To specify a solution, it is sufficient to provide the set of centers and ϕ\phi.

Bera et al. (2019) and Bercea et al. (2019) independently introduced this notion of fairness, which generalizes the notions studied by Chierichetti et al. (2017); Schmidt et al. (2019); Backurs et al. (2019); Ahmadian et al. (2019). Bercea et al. presented a constant factor approximation algorithm for the fair representation clustering with the general p\ell_{p}-objective. However, their algorithm returns a clustering that satisfies the fairness requirements with some additive error. When the maximum number of groups/classes to which a point may belong is Δ\Delta, the additive error/violation is at most 4Δ+34\Delta+3; in the most common case of Δ=1\Delta=1, the additive violation is at most 33. Bercea et al. also gave constant factor approximation algorithms for a variety of clustering objective (including kk-median and kk-means) that violate the fairness requirement only by a small additive value. More recently, Bandyapadhyay et al. (2021) designed algorithms that compute a constant factor approximation for fair kk-median and kk-means that run in time (kΔ)O(kΔ)poly(n)(k\Delta)^{O(k\Delta)}\mathop{\mathrm{poly}}(n); these algorithms do not violate the fairness constraints. However, one of the main questions in the area of fair clustering still remains open:

What is the best polynomial-time approximation algorithm for representation fair clustering?

We also study exact fair representation clustering, which is an important special case of the problem.

Definition 1.2 (exact fairness).

Assume that we are given a set of points XX that come from \ell disjoint groups: X=X1XX=X_{1}\cup\cdots\cup X_{\ell}. A kk-clustering C1,,CkC_{1},\cdots,C_{k} of XX is exactly fair if

i[k],j[],|CiXj|=|Xj||X||Ci|\displaystyle\forall i\in[k],j\in[\ell],\quad|C_{i}\cap X_{j}|=\frac{|X_{j}|}{|X|}\cdot|C_{i}| (2)

We define fairlet as a minimal size non-empty set of points that is exactly fair. Note that all fairlets have the same size (when XX is fixed). Denote this size by ff. Further, for every j[]j\in[\ell], let fjf_{j} denote the number of points from group jj in any fairlet.

This notion was previously studied for (1) kk-center (Rösner and Schmidt, 2018; Bercea et al., 2019) and (2) kk-clustering with p\ell_{p}-objective on balanced instances (instances with fj=1f_{j}=1 for all jj and f=f=\ell(Böhm et al., 2020). In all these special cases of exact fairness, the fair clustering problem admits a constant factor approximation.

Other Related Work.

Fair clustering is an active domain of research and by now it has been studied under various standard notions including both group fairness and individual fairness, e.g.,  (Chierichetti et al., 2017; Bera et al., 2019; Huang et al., 2019; Kleindessner et al., 2019; Jones et al., 2020; Chen et al., 2019; Micha and Shah, 2020; Jung et al., 2020; Mahabadi and Vakilian, 2020; Chakrabarty and Negahbani, 2021; Vakilian and Yalçıner, 2021; Brubach et al., 2020; Kleindessner et al., 2020; Ghadiri et al., 2021; Abbasi et al., 2021; Makarychev and Vakilian, 2021; Chlamtáč et al., 2022; Esmaeili et al., 2020, 2021).

1.1 Our Results

In this paper, we study the fair representation kk-median problem and give an O(logk)O(\log k)-approximation for it that runs in time nO()n^{O(\ell)}. Importantly, we get a polynomial time algorithm for every fixed \ell – this is the case in most practical settings of interest for fairness applications. In fact, we design an algorithm that can handle arbitrary fairness profiles (see below for the definitions). Further, we design a much faster algorithm for fair kk-median with exact fairness constraints and small ff, where ff is the size of a fairlet. It runs in time (kf)O()logn+poly(n)(kf)^{O(\ell)}\log n+poly(n). We emphasize that even in the case =O(1)\ell=O(1), all previous results on fair representation clustering with more than two protected groups, notably Bera et al. (2019); Bercea et al. (2019); Bandyapadhyay et al. (2021), either violate the fairness constraints by additive terms or run in time that is exponential in kk. In this paper, we present first polynomial time approximation algorithms for fair representation with =O(1)\ell=O(1) multiple protected classes that satisfy the fairness requirement with no additive violations (see Theorem 1.3 and Theorem 1.6).

General Representation Fairness.

Our main algorithm has three steps: location consolidation, approximation the metric by a distribution of tree metrics, and finally solving fair clustering on a tree.

In the first step, we run an existing constant factor approximation algorithm for kk-median to find a set of kk centers 𝒞={c1,,ck}\mathcal{C}=\{c_{1},\dots,c_{k}\}. Next, we move each point to its closest center cic_{i} in 𝒞\mathcal{C} in the constructed solution. In other words, we reduce the initial instance of size nn (with possibly nn different locations) to an instance of size nn with exactly kk locations. (i.e., we may have multiple data points mapped/moved to each location.)

In the second step, we use the metric embedding technique by Fakcharoenphol et al. (2004) to approximate the reduced instance by a tree metric with expected distortion O(logk)O(\log k). As the reduced instance, this instance also has at most kk different locations. Additionally, the metric on these locations is a tree metric.

Finally, in the third step, we use dynamic programming (DP) to find a fair assignment of nn points located at kk different locations to kk centers. The DP runs in time nO()n^{O(\ell)}.

We note that that our first step is very similar to that used by Bera et al. (2019) and Bercea et al. (2019); they first find a not-necessarily-fair clustering of the data points and then reassign the points so as to ensure that the clustering is approximately fair. In the context of kk-median, the idea of approximating the input metric with a distribution of dominating trees was introduced by Bartal (1998); this approach was recently used by Backurs et al. (2019) in their approximation algorithm for a different variant of fair representation clustering with 2 groups. Our dynamic programming algorithm is novel and very different from DP algorithms previously used for solving kk-median on trees (see e.g., (Kariv and Hakimi, 1979; Tamir, 1996; Angelidakis et al., 2017)).

Theorem 1.3.

There exists a randomized O(logk)O(\log k)-approximation algorithm for fair representation kk-median that runs in nO()n^{O(\ell)} time.

Note that in practice the number of classes is usually a small constant. Then our algorithm runs in polynomial time. The problem is interesting even when \ell is a fixed single-digit number.

Remark 1.4.

Our approach works in a more general setting with a set of fair profiles FF, where each profile pFp\in F is a vector of length \ell. A cluster CC is fair w.r.t. FF if (|X1C|,,|XC|)F(|X_{1}\cap C|,\dots,|X_{\ell}\cap C|)\in F. For example, this general notion captures the setting in which we only need to guarantee that in each cluster, a sufficiently large fraction of members belong to one of the disadvantaged groups specified by D[]D\subseteq[\ell]; C,iD|XiC|α|C|\forall C,\sum_{i\in D}|X_{i}\cap C|\geq\alpha\cdot|C|.

To the best of our knowledge, none of the previous results on fair representation clustering implies an approximation bound for this general “fairness profile” notion.

Given a clustering instance and set FF, our algorithm finds a clustering such that all clusters are fair w.r.t. FF. Assuming the existence of a membership oracle for FF that runs is time tFt_{F}, the running time and the approximation factor are tFnO()t_{F}\cdot n^{O(\ell)} and O(logk)O(\log k) as in Theorem 1.3. In most natural scenarios the membership oracle can be implemented efficiently and the asymptotic runtime of our algorithm remains nO()n^{O(\ell)}. For instance, in the aforementioned fairness requirement which guarantees the presence of disadvantaged groups in each cluster, tF=O()t_{F}=O(\ell); hence, the total runtime of our algorithm in this setting is still nO()n^{O(\ell)}.

Remark 1.5.

We describe and analyze our algorithm for the case when each point belongs to exactly one group XiX_{i}. However, with a minor modification,our algorithm can also handle the case when a point may belong to multiple groups or do not belong to any group. We introduce a “virtual” group YGY_{G} for every G[]G\subset[\ell] such that there exists a point that belongs to groups XiX_{i} with iGi\in G and only to them. Note that new virtual groups are disjoint and cover XX. We define the fairness constraints as follows: for every j[]j\in[\ell], αi|Ci|G:jG|CiYG|βi|Ci|\alpha_{i}|C_{i}|\leq\sum_{G:j\in G}|C_{i}\cap Y_{G}|\leq\beta_{i}|C_{i}|. By Remark 1.4, our algorithm can handle these constraints; they are equivalent to the original constraints, since G:jG|CiYG|=|CiXj|\sum_{G:j\in G}|C_{i}\cap Y_{G}|=|C_{i}\cap X_{j}|. Note that now the algorithm runs in time nO()n^{O(\ell^{\prime})} where \ell^{\prime} is the number of virtual groups (clearly 2\ell^{\prime}\leq 2^{\ell} but \ell^{\prime} may be much smaller than 22^{\ell}).

Exact Representation Fairness.

We significantly improve the running time of our algorithms when the fairness constraints are exact (see Definition 1.2) and each point belongs to exactly one group. We first run the algorithm by Bera et al. (2019) that returns a set of centers and an assignment of points to these centers that “nearly” satisfies the fairness requirement. Next, we move each point to its assigned center. We prove that there exists an O(1)O(1)-approximately optimal fair assignment that only moves a set of O(kf2)O(kf^{2}) points SS^{*}. Lastly, we show that we can find such a set SS of size O(k2f2)O(k^{2}f^{2}) in polynomial time. Then, loosely speaking, we run our main algorithm on the set SS. Since SS has only O(k2f2)O(k^{2}f^{2}) points, the algorithm runs in time (kf)O()(kf)^{O(\ell)}.

Theorem 1.6.

There exists a randomized O(logk)O(\log k)-approximation algorithm for exactly fair kk-median that runs in time (kf)O()logn+poly(n)(kf)^{O(\ell)}\log n+\mathop{\mathrm{poly}}(n), where ff is the fairlet size.

2 Preliminaries

Embedding into a distribution of dominating trees.

Assume that we are given a metric space (M,d)(M,d). We will consider trees χ\chi on MM (whose edges have positive lengths) and shortest-path metrics dχd_{\chi} that they define on MM. We say that (M,d)(M,d) is α\alpha-approximated by a probabilistic distribution 𝒟𝒳\mathcal{D}_{\mathcal{X}} of dominating trees χ\chi with distortion α1\alpha\geq 1 if

  • Every tree metric dχd_{\chi} in the support 𝒳\mathcal{X} of 𝒟𝒳\mathcal{D}_{\mathcal{X}} dominates metric (M,d)(M,d); i.e., dχ(u,v)d(u,v)d_{\chi}(u,v)\geq d(u,v) for all u,vMu,v\in M.

  • 𝐄χ𝒟𝒳[dχ(u,v)]αd(u,v)\mathbf{E}_{\chi\sim\mathcal{D}_{\mathcal{X}}}[d_{\chi}(u,v)]\leq\alpha\cdot d(u,v) for all u,vMu,v\in M.

A key component in our algorithms is the following result by Fakcharoenphol et al.  Fakcharoenphol et al. (2004) (see also Bartal (1996) by Bartal).

Theorem 2.1 (Fakcharoenphol et al. (2004)).

Every metric space (M,d)(M,d) can be approximated by a distribution of dominating trees with distortion at most O(log|M|)O(\log|M|). Moreover, we can sample from the distribution of dominating trees in polynomial time.

3 Algorithms

Now we present an O(logk)O(\log k)-approximation algorithm for fair representation kk-median that runs in time nO()n^{O(\ell)}, where nn is the number of points and \ell is the number of groups.

In our algorithms, we are going to transform instances to “simpler” instances by moving points to new locations. It will be helpful to distinguish between data points and their locations. Our algorithms will not change data points and their memberships in groups, but will change their locations. Formally, we think of abstract data points that are mapped to points/locations in a metric space; our algorithms will modify this mapping between data points and their locations. We call this process a reassignment. We denote the set of data points by XX and the set of locations by LL. Initially L=XL=X and every point xXx\in X is assigned to location xx, but this will change when we transform the instance. We denote the location of point xx w.r.t. instance \mathcal{I} by loc(x)=loc(x)\operatorname{loc}(x)=\operatorname{loc}_{\mathcal{I}}(x).

Now, if two data points at the same location belong to the same group, then they are interchangeable for our algorithm. Thus, instead of storing the actual data points, the algorithm will only store the number of points from each group at every location.

Denote 0{0,,n}{\mathcal{R}}_{\geq 0}\equiv\{0,\dots,n\}^{\ell} and {n,,n}\mathcal{R}\equiv\{-n,\dots,n\}^{\ell}. For each qLq\in L and j[]j\in[\ell], let vj(q)v_{j}(q) be the number of data points from group jj at location qq. We call vector v(q)=(v1(q),,v(q))0v(q)=(v_{1}(q),\dots,v_{\ell}(q))\in{\mathcal{R}}_{\geq 0} the profile of location qq. We define the profile of a set of data points 𝖲\mathsf{S} as a vector v(𝖲)v(\mathsf{S}) in 0{\mathcal{R}}_{\geq 0} whose jj-th coordinate equals the number of data points from group jj in the set; v(𝖲):=q𝖲v(q)v(\mathsf{S}):=\sum_{q\in\mathsf{S}}v(q).

Consider a set of kk clusters 𝒞\mathcal{C}. To describe a clustering, we introduce an assignment function r:L×𝒞0r:L\times\mathcal{C}\to{\mathcal{R}}_{\geq 0}. For each center c𝒞c\in\mathcal{C}, rj(q,c)r_{j}(q,c) denotes the number of points from group jj at location qq that we assign to cc. Note that vector R(c)=qLr(q,c)0R(c)=\sum_{q\in L}r(q,c)\in{\mathcal{R}}_{\geq 0} specifies the number of points from each group that are assigned to center cc. We call R(c)R(c) the profile of center cc. We require that for every qL,c𝒞r(q,c)=v(q)q\in L,\sum_{c\in\mathcal{C}}r(q,c)=v(q), meaning that each data point at location qq belongs to exactly one cluster.

The fairness constraints are defined by a set of fair profiles F0F\subset{\mathcal{R}}_{\geq 0}. We say that the cluster assigned to center cc fairly represents groups if R(c)FR(c)\in F. In fair representation clustering with fairness requirement {αj,βj}j[]\{\alpha_{j},\beta_{j}\}_{j\in[\ell]}

R(c)FR(c)\in F if and only if αjR(c)1Rj(c)βjR(c)1\alpha_{j}\left\|R(c)\right\|_{1}\leq R_{j}(c)\leq\beta_{j}\left\|R(c)\right\|_{1} for all j[]j\in[\ell].

We restate the objective of fair representation kk-median as follows

c𝒞qLd(q,c)r(q,c)1.\displaystyle\sum_{c\in\mathcal{C}}\sum_{q\in L}d(q,c)\cdot\|r(q,c)\|_{1}. (3)

Note that r(q,c)1\|r(q,c)\|_{1} is the total number of points at location qq assigned to center cc.

3.1 From a Clustering to a New Instance

Consider an instance 𝒥\mathcal{J} and a clustering for 𝒥\mathcal{J} (which is not necessarily fair). As we discussed above, we can define the clustering by specifying (i) a set 𝒞\mathcal{C} of kk-centers and (ii) either a mapping ϕ\phi from the set of data points XX to 𝒞\mathcal{C} or, equivalently, an assignment function r(q,c)r(q,c). The cost of the clustering is

𝐜𝐨𝐬𝐭=xXd(loc𝒥(x),ϕ(x))=c𝒞qLd(q,c)r(q,c)1.\mathbf{cost}=\sum_{x\in X}d(\operatorname{loc}_{\mathcal{J}}(x),\phi(x))=\sum_{c\in\mathcal{C}}\sum_{q\in L}d(q,c)\cdot\|r(q,c)\|_{1}.

Let us move every data point from its original location in 𝒥\mathcal{J} to ϕ(x)\phi(x). We get a new instance 𝒥\mathcal{J}^{\prime}. Note that loc𝒥(x)=ϕ(x)𝒞\operatorname{loc}_{{\mathcal{J}}^{\prime}}(x)=\phi(x)\in\mathcal{C}. The profile v(c)v^{\prime}(c) of location c𝒞c\in\mathcal{C} is v(c)=R(c)v^{\prime}(c)=R(c) (the profile v(x)v^{\prime}(x) of a location x𝒞x\notin\mathcal{C} is a zero vector). Instance 𝒥{\mathcal{J}}^{\prime} has the same set of fairness profiles FF as the original instance 𝒥\mathcal{J}.

Claim 3.1.

Consider a clustering (𝒞,ϕ)({\mathcal{C}}^{\prime},\phi^{\prime}) of XX. Denote its cost w.r.t. instances 𝒥{\mathcal{J}} and 𝒥{\mathcal{J}}^{\prime} by cost𝒥cost_{\mathcal{J}} and cost𝒥cost_{{\mathcal{J}}^{\prime}}, respectively. Then |cost𝒥cost𝒥|𝐜𝐨𝐬𝐭|cost_{{\mathcal{J}}}-cost_{{\mathcal{J}}^{\prime}}|\leq\mathbf{cost}.

Proof.

Consider a data point xx and the center ϕ(x)\phi^{\prime}(x) that it is assigned to by clustering (𝒞,ϕ)({\mathcal{C}}^{\prime},\phi^{\prime}). Then, by the triangle inequality, |d(loc𝒥(x),ϕ(x))d(loc𝒥(x),ϕ(x))|d(loc𝒥(x),loc𝒥(x))=d(loc𝒥(x),ϕ(x))|d(\operatorname{loc}_{\mathcal{J}}(x),\phi^{\prime}(x))-d(\operatorname{loc}_{{\mathcal{J}}^{\prime}}(x),\phi^{\prime}(x))|\leq d(\operatorname{loc}_{\mathcal{J}}(x),\operatorname{loc}_{{\mathcal{J}}^{\prime}}(x))=d(\operatorname{loc}_{\mathcal{J}}(x),\phi(x)). Adding up this inequality over all data points xx, we get the desired inequality |cost𝒥cost𝒥|𝐜𝐨𝐬𝐭|cost_{{\mathcal{J}}}-cost_{{\mathcal{J}}^{\prime}}|\leq\mathbf{cost}. ∎

3.2 Location Consolidation Step

We run a constant factor approximation algorithm for the standard kk-median problem on our set of data points; e.g., the algorithm by Charikar et al. Charikar et al. (2002) or by Li and Svensson Li and Svensson (2016). We get kk centers 𝒞¯={c¯1,,c¯k}\overline{\mathcal{C}}=\{\overline{c}_{1},\dots,\overline{c}_{k}\} and a Voronoi assignment ϕ\phi of points to the centers.

As described above, we move each data point xx to center ϕ(x)\phi(x). We denote the original instance by \mathcal{I} and the obtained instance by {\mathcal{I}}^{\prime}. We will refer to {\mathcal{I}}^{\prime} as the reduced instance.

Claim 3.2.

Let OPT\textsc{OPT}_{\mathcal{I}} and OPT\textsc{OPT}_{{\mathcal{I}}^{\prime}} be the costs of optimal fair clusterings for {\mathcal{I}} and {\mathcal{I}}^{\prime}, respectively. Assume that we used an α\alpha-approximation algorithm for kk-median in the consolidation step. Further assume that there is a βk\beta_{k}-approximation algorithm for fair representation kk-median on {\mathcal{I}}^{\prime}. Then, there is an (αβk+α+βk)(\alpha\beta_{k}+\alpha+\beta_{k})-approximation algorithm for fair representation kk-median on \mathcal{I}.

Proof.

Observe that the cost of an optimal (not necessarily fair) kk-median clustering of \mathcal{I} is at most the cost of an optimal fair kk-median clustering of \mathcal{I}. Thus the cost of the clustering that we use in the reduction is at most αOPT\alpha\textsc{OPT}_{\mathcal{I}}.

Now consider the optimal fair clustering for \mathcal{I}. By Claim 3.1, the cost of this clustering as a clustering of {\mathcal{I}}^{\prime} is at most (α+1)OPT(\alpha+1)\,\textsc{OPT}_{\mathcal{I}}. Therefore, OPT(α+1)OPT\textsc{OPT}_{{\mathcal{I}}^{\prime}}\leq(\alpha+1)\,\textsc{OPT}_{\mathcal{I}}.

Our approximation algorithm for \mathcal{I} is very straightforward. We simply run the βk\beta_{k}-approximation algorithm on instance {\mathcal{I}}^{\prime} and output the obtained clustering as a clustering of \mathcal{I}. Now we upper bound the cost of this clustering w.r.t. instance {\mathcal{I}}^{\prime} and then w.r.t \mathcal{I}. The cost w.r.t. {\mathcal{I}}^{\prime} is at most βkOPTβk(α+1)OPT\beta_{k}\textsc{OPT}_{{\mathcal{I}}^{\prime}}\leq\beta_{k}(\alpha+1)\,\textsc{OPT}_{\mathcal{I}}. Consequently, by Claim 3.1, the cost w.r.t. \mathcal{I} is at most (βk(α+1)+α)OPT(\beta_{k}(\alpha+1)+\alpha)\,\textsc{OPT}_{\mathcal{I}}, as required. ∎

3.3 Embedding into Tree Metrics

Consider restriction d¯\overline{d} of metric dd to 𝒞¯\overline{\mathcal{C}}. We will reduce the problem to the case when d¯\overline{d} is a tree metric by paying a factor of O(logk)O(\log k). To this end, we will approximate d¯\overline{d} by a distribution 𝒟χ{\mathcal{D}}_{\chi} of dominating trees χ\chi with distortion O(logk)O(\log k) and then solve the fair representation kk-median for a number of trees χ\chi randomly sampled from 𝒟χ{\mathcal{D}}_{\chi}.

Claim 3.3.

Suppose that there exists an α\alpha-approximation algorithm 𝒜\mathcal{A} for fair representation kk-median on tree metrics. Then, there is an O(αlogk)O(\alpha\log k)-approximation algorithm for the reduced problem {\mathcal{I}}^{\prime} that succeeds with high probability (i.e., 1nc1-n^{-c} for any desired constant cc).

Proof.

Let rr^{*} be the optimal fair assignment for {\mathcal{I}}^{\prime} and OPT be its cost. Consider an approximation of d¯\overline{d} by a distribution 𝒟𝒳\mathcal{D}_{\mathcal{X}} of dominating trees χ\chi with distortion logk\log k, which exists by Theorem 2.1.

For a tree metric dχd_{\chi}, let rχ:𝒞¯×𝒞¯0r_{\chi}:\overline{\mathcal{C}}\times\overline{\mathcal{C}}\rightarrow{\mathcal{R}}_{\geq 0} denote the O(α)O(\alpha)-approximate assignment of points to centers that algorithm 𝒜\mathcal{A} finds on instance with metric dχd_{\chi}. Then,

𝐄χ𝒟𝒳[q,c𝒞¯d(q,c)rχ(q,c)1]\displaystyle\mathbf{E}_{\chi\sim\mathcal{D}_{\mathcal{X}}}[\sum_{q,c\in\overline{\mathcal{C}}}d(q,c)\cdot\|r_{\chi}(q,c)\|_{1}] 𝐄χ𝒟𝒳[w,c𝒞¯dχ(q,c)rχ(q,c)1]\displaystyle\leq\mathbf{E}_{\chi\sim\mathcal{D}_{\mathcal{X}}}[\sum_{w,c\in\overline{\mathcal{C}}}d_{\chi}(q,c)\cdot\|r_{\chi}(q,c)\|_{1}]
𝐄χ𝒟𝒳[αq,c𝒞¯dχ(q,c)r(q,c)1]\displaystyle\leq\mathbf{E}_{\chi\sim\mathcal{D}_{\mathcal{X}}}[\alpha\cdot\sum_{q,c\in\overline{\mathcal{C}}}d_{\chi}(q,c)\cdot\|r^{*}(q,c)\|_{1}]
=αq,c𝒞¯𝐄χ𝒟𝒳[dχ(q,c)]r(q,c)1\displaystyle=\alpha\cdot\sum_{q,c\in\overline{\mathcal{C}}}\mathbf{E}_{\chi\sim\mathcal{D}_{\mathcal{X}}}[d_{\chi}(q,c)]\cdot\|r^{*}(q,c)\|_{1}
αq,c𝒞¯O(logk)d(q,c)r(q,c)1=O(αlogk)OPT.\displaystyle\leq\alpha\cdot\sum_{q,c\in\overline{\mathcal{C}}}O(\log k)\cdot d(q,c)\cdot\|r^{*}(q,c)\|_{1}=O(\alpha\log k)\cdot\textsc{OPT}.

The three inequalities above hold, since (i) d(q,c)dχ(q,c)d(q,c)\leq d_{\chi}(q,c) (always); (ii) rχr_{\chi} is an α\alpha-approximate assignment/solution; and (iii) 𝔼dχ(q,c)O(logk)d(q,c)\mathbb{E}{d_{\chi}}(q,c)\leq O(\log k)\cdot d(q,c) (recall that d¯\overline{d} is a restriction of dd to C¯\overline{C}; for every u,vC¯,d¯(u,v)=d(u,v)u,v\in\overline{C},\overline{d}(u,v)=d(u,v)).

By Markov’s inequality the obtained solution rχr_{\chi} approximates the optimal assignment within a factor of O(αlogk)O(\alpha\log k) with probability at least 1/21/2. By running the proposed algorithm Θ(logn)\Theta(\log n) times, we get an O(αlogk)O(\alpha\log k)-approximate solution w.h.p. ∎

3.4 Reduced Assignment Problem on Trees

In this section, we assume that (L,d)(L,d) is a tree metric on kk points with profile vector v(u)0v(u)\in{\mathcal{R}}_{\geq 0} for every location uLu\in L. We open a center at every location and our goal now is to find a fair assignment of data points to centers. Recall that our notion of fairness is more general than that in Definition 1.1 (we discussed it in Remark 1.4).

3.4.1 Conversion to Binary Tree

We choose an arbitrary root in the tree. It is convenient now to convert the tree to a binary tree in which every non-leaf vertex has exactly two children. We do that by adding Steiner locations. Namely, we replace each vertex uu with k>2k>2 children with a path of length k2k-2. The first vertex on the path is uu (we assume u0uu_{0}\equiv u); other vertices u1,,uk2u_{1},\dots,u_{k-2} are new Steiner locations. We keep the parent of uu unchanged; we let the parent of uiu_{i} be ui1u_{i-1} (for i1i\geq 1). For j{2,,k1}j\in\{2,\dots,k-1\}, we reassign jj-th child vjv_{j} of uu to uj1u_{j-1} and reassign kk-th child vkv_{k} to uk2u_{k-2}. We set the length of all edges on the path uu1uk2u\to u_{1}\to\dots\to u_{k-2} to 0; we let the length of each edge (uj1,vj)(u_{j-1},v_{j}) be that of (u,vj)(u,v_{j}) in the original tree; the length of (uk2,vk)(u_{k-2},v_{k}) to that of (u,vk)(u,v_{k}) in the original tree. Additionally, we add a Steiner child to vertices that have exactly one child. For a concrete example, consider a tree rooted at vertex 11, in which 11 has four children 2,3,4,52,3,4,5. We transform the tree by adding Steiner nodes 6 and 7 as follows:

12345

\Huge\displaystyle\Longrightarrow1263745

Let AA be the set of non-Steiner locations and BB be the set of Steiner locations. We open a center at every aAa\in A; we do not open centers at any Steiner location bBb\in B. For bBb\in B, let v(b)=0v(b)=0.

3.4.2 Dynamic Program

Now we write a dynamic program (DP). We first recall the setup. We are given a binary tree LL, which contains Steiner and non-Steiner nodes/locations. Each non-Steiner node contains some data points that we want to cluster and each Steiner node contains no data points. Our goal is to move the data points around (assign data points to the non-Steiner nodes) such that the resulting clustering is fair. The objective is to minimize the “assignment cost” t,cAd(t,c)r(t,c)1\sum_{t,c\in A}d(t,c)\cdot\|r(t,c)\|_{1}, where r(t,c)1\|r(t,c)\|_{1} is the number of data points at location tt that are assigned to cc.

Let TuT_{u} be the subtree of LL rooted at uu. Let AuA_{u} and BuB_{u} be non-Steiner and Steiner locations in TuT_{u}, respectively. For a fixed assignment ϕ\phi, data points located at nodes/locations in TuT_{u} can be classified into two types: local points and out-points. A local point is assigned to a node/location in TuT_{u} by ϕ\phi and an out-point is assigned to a node/location outside TuT_{u} by ϕ\phi. In addition, there are some points outside TuT_{u} that are assigned to nodes/locations in TuT_{u}. We refer to these points as in-points. Now we define two functions ρout:Au0\rho_{out}:A_{u}\to{\mathcal{R}}_{\geq 0} and ρin:Au0\rho_{in}:A_{u}\to{\mathcal{R}}_{\geq 0}, which specify the out-points and in-points, respectively. Namely, for each location cAuc\in A_{u}, ρout(c)\rho_{out}(c) is the profile of out-points located at cc, and ρin(c)\rho_{in}(c) is the profile of in-points assigned to cc. Let q:=cAuρin(c)yAuρout(y)q:=\sum_{c\in A_{u}}\rho_{in}(c)-\sum_{y\in A_{u}}\rho_{out}(y) be the “net-imports” of TuT_{u}. Clearly, qq\in{\mathcal{R}}.

Now, we create a DP-table M[u,q]M[u,q] where uLu\in L and qq\in{\mathcal{R}}. Loosely speaking, M[u,q]M[u,q] is the cost of the minimum cost partial solution such that (a) clusters for all centers cAuc\in A_{u} satisfy fairness constraints (note that each cluster consists of the points at locations in TuT_{u} assigned to cc and in-points assigned to cc) and (b) the difference between the number of in-points and out-points from group jj equals qjq_{j}. The cost of a partial solution comprises (i) the assignment costs for points in TuT_{u} assigned to centers in TuT_{u}, (ii) for each out-point xx located at yAuy\in A_{u}, the portion d(y,u)d(y,u) of the assignment cost of xx that “lies” inside TuT_{u}, and (iii) for each in-point xx^{\prime} assigned to center cAuc\in A_{u}, the portion d(u,c)d(u,c) of the assignment cost of xx^{\prime} that “lies” inside TuT_{u}.

Formally, M[u,q]M[u,q] is the cost of the optimal solution for the following problem.

Find
  • function r:Au×Au0r:A_{u}\times A_{u}\rightarrow{\mathcal{R}}_{\geq 0}; map rr specifies the assignment of data points located in subtree TuT_{u} to centers in TuT_{u}. Namely, r(y,c)r(y,c) is the profile of the set of data points located at yy that are assigned to center cc. Let Ru(c)=yTur(y,c)R_{u}(c)=\sum_{y\in T_{u}}r(y,c). Note that Ru(c)R_{u}(c) is the profile of the set of data points at locations in AuA_{u} that are assigned to center cc.

  • function ρout:Au0\rho_{out}:A_{u}\to{\mathcal{R}}_{\geq 0}; map ρout\rho_{out} specifies the set of data points in subtree TuT_{u} that are not assigned to any center in TuT_{u}; these data points will have to be assigned to centers outside of TuT_{u} later. Namely, ρout(y)\rho_{out}(y) is the profile of currently unassigned data points at location yy.

  • function ρin:Au0\rho_{in}:A_{u}\to{\mathcal{R}}_{\geq 0}; ρin\rho_{in} specifies how many data points located outside of the subtree TuT_{u} are assigned to centers in TuT_{u}. Namely, ρin(c)\rho_{in}(c) is the profile of data points that lie outside of TuT_{u} but assigned to center cc.

Such that
  • for every yAuy\in A_{u}, cAur(y,c)+ρout(y)=v(y)\sum_{c\in A_{u}}r(y,c)+\rho_{out}(y)=v(y). This condition says that every data point at location yy is assigned to some center cAuc\in A_{u} or is currently unassigned.

  • Ru(c)+ρin(c)FR_{u}(c)+\rho_{in}(c)\in F for all cAuc\in A_{u}. This condition says that the profile of the cluster centered at cc is in FF (that is, the cluster satisfies the fairness constraints). Here, Ru(c)R_{u}(c) counts data points in TuT_{u} that are assigned to cc and ρin(c)\rho_{in}(c) counts data points outside of TuT_{u} that are assigned to cc.

  • cAuρin(c)yAuρout(y)=q\sum_{c\in A_{u}}\rho_{in}(c)-\sum_{y\in A_{u}}\rho_{out}(y)=q.

Cost

The objective is to minimize the following cost

y,cAud(y,c)r(y,c)1+cAud(c,u)ρin(c)1+yAud(y,u)ρout(y)1.\displaystyle\sum_{y,c\in A_{u}}d(y,c)\cdot\|r(y,c)\|_{1}+\sum_{c\in A_{u}}d(c,u)\cdot\|\rho_{in}(c)\|_{1}+\sum_{y\in A_{u}}d(y,u)\cdot\|\rho_{out}(y)\|_{1}.

If there is no feasible solution the cost is ++\infty.

3.4.3 Solving the DP

We fill out the DP table starting from the leaves and going up to the root (bottom-up).

Computing M[u,q]M[u,q] for leaves uu is straightforward using the definition of M[u,q]M[u,q]. In this case, the subtree TuT_{u} rooted at uu is a single node. Thus, for any given qq, the profile at uu after assignment is

Ru(u)+ρin(u)=r(u,u)+ρin(u)=r(u,u)+ρout(u)+q=v(u)+q.\displaystyle R_{u}(u)+\rho_{in}(u)=r(u,u)+\rho_{in}(u)=r(u,u)+\rho_{out}(u)+q=v(u)+q.

If uu is not a Steiner node, then M[u,q]=0M[u,q]=0 if v(u)+qFv(u)+q\in F and M[u,q]=M[u,q]=\infty if v(u)+qFv(u)+q\not\in F. If uu is a Steiner node, then M[u,q]=0M[u,q]=0 if q=0q=0 and M[u,q]=M[u,q]=\infty if q0q\not=0.

Now assume that uu is not a leaf; since the tree is a binary tree, uu has two children yy and zz. Note that node uu will send qyq_{y} points to TyT_{y} and qzq_{z} points to TzT_{z} for some qy,qzq_{y},q_{z}\in{\mathcal{R}}. Then M[u,q]M[u,q] is the minimum over qy,qzq_{y},q_{z}\in{\mathcal{R}} satisfying

  • if uAu\in A, then v(u)+qqyqzFv(u)+q-q_{y}-q_{z}\in F; that is, cluster centered at uu satisfies fairness constraints.

  • if uBu\in B, then qqyqz=0q-q_{y}-q_{z}=0, since v(u)=0v(u)=0 and no point will be assigned to uu.

of the following cost

M[y,qy]+M[z,qz]+d(u,y)qy1+d(u,z)qz1\displaystyle M[y,q_{y}]+M[z,q_{z}]+d(u,y)\cdot\|q_{y}\|_{1}+d(u,z)\cdot\|q_{z}\|_{1} (4)

The pseudo-code of the DP algorithm is provided in Algorithm 1.

Lemma 3.4.

The described dynamic programming algorithm runs in time nO()n^{O(\ell)} and finds an optimal fair assignment of the data points XX located in the set of locations LL.

Proof.

It is straightforward that the DP algorithm correctly computes the DP entries. The cost of the optimal fair assignment for our problem is given by M[root,𝟎]M[root,\boldsymbol{0}] where rootroot is root of the tree and 𝟎\boldsymbol{0} is the all-zero vector.

The update rule, which is specified by Eq. (4) and the constraints on qy,qz,qq_{y},q_{z},q and v(u)v(u), computes M[u,q]M[u,q] correctly in time nO()n^{O(\ell)}, which accounts for enumerating over all possible values of qyq_{y} and qzq_{z}. Since the DP tables contains knO()k\cdot n^{O(\ell)} cells, the total time to fully compute the table is knO()=nO()k\cdot n^{O(\ell)}=n^{O(\ell)}. Finally, once the whole table is computed, we can recover the assignment itself by traversing the table from M[root,𝟎]M[root,\boldsymbol{0}] in the reverse direction, which takes nO()n^{O(\ell)} time. The total running time is nO()n^{O(\ell)}. ∎

Proof of Theorem 1.3:.

We first apply the consolidation step and thus reduce the problem to the set of locations 𝒞¯\overline{\mathcal{C}} is of size kk. Then we approximate metric on 𝒞¯\overline{\mathcal{C}} by a distribution of dominating tree metrics, as explained in 3.3. We obtain a logarithmic number of instances with tree metrics. We exactly solve each of them using the DP algorithm described above. Finally, we return the best of the clusterings we find. It follows from Claims 3.2 and 3.3 and Lemma 3.4, that the algorithm finds an O(logk)O(\log k) approximation with high probability in time nO()n^{O(\ell)}. ∎

Algorithm 1 DP Algorithm on Trees
0:  TT is a binary tree of height dd, rr is the root of TT, and for each i=0,1,,di=0,1,\dots,d, NiN_{i} is the set of nodes in the ii-th level of TT.
1:  for uNdu\in N_{d} do
2:     for qq\in{\mathcal{R}} do
3:        if uu is a non-Steiner node then
4:           if v(u)+qFv(u)+q\in F then
5:              M[u,q]=0M[u,q]=0
6:           else
7:              M[u,q]=M[u,q]=\infty
8:           end if
9:        else
10:           if q=0q=0 then
11:              M[u,q]=0M[u,q]=0
12:           else
13:              M[u,q]=M[u,q]=\infty
14:           end if
15:        end if
16:     end for
17:  end for
18:  for i=d1,d2,,0i=d-1,d-2,\dots,0 do
19:     for uNiu\in N_{i} do
20:        y,zchildren ofuy,z\leftarrow\texttt{children of}\hskip 5.69054ptu
21:        if uu is a Steiner node then
22:           M[u,q]=min{M[y,qy]+M[z,qz]+d(u,y)qy1+d(u,z)qz1:qy,qz,q=qy+qz}M[u,q]=\min\{M[y,q_{y}]+M[z,q_{z}]+d(u,y)\cdot\|q_{y}\|_{1}+d(u,z)\cdot\|q_{z}\|_{1}:q_{y},q_{z}\in{\mathcal{R}},q=q_{y}+q_{z}\}
23:        else
24:           M[u,q]=min{M[y,qy]+M[z,qz]+d(u,y)qy1+d(u,z)qz1:qy,qz,v(u)+qqyqzF}M[u,q]=\min\{M[y,q_{y}]+M[z,q_{z}]+d(u,y)\cdot\|q_{y}\|_{1}+d(u,z)\cdot\|q_{z}\|_{1}:q_{y},q_{z}\in{\mathcal{R}},v(u)+q-q_{y}-q_{z}\in F\}
25:        end if
26:     end for
27:  end for
28:  return   M[r,0]M[r,0]

4 Special Case of Exact Fairness

In this section, we design an O(logk)O(\log k) approximation algorithm for kk-median with exact fairness constraints. The algorithm runs in time (kf)O()logn+poly(n)(kf)^{O(\ell)}\log n+\mathop{\mathrm{poly}}(n) and succeeds with high probability.

To recall, we say that a cluster centered at cc is exactly fair if j[]\forall j\in[\ell], Rj(c)=|Xj||X|R(c)1R_{j}(c)=\frac{|X_{j}|}{|X|}\|R(c)\|_{1}, where |Xj||X|\frac{|X_{j}|}{|X|} is the proportion of data points in XX that belongs to group jj. We say that a clustering is exactly fair if all of its clusters are exactly fair. More generally, a set of data points SS with profile v(S)0v(S)\in\mathcal{R}_{\geq 0} is exactly fair if for all j[]j\in[\ell], vj(S)=|Xj||X|v(S)1v_{j}(S)=\frac{|X_{j}|}{|X|}\|v(S)\|_{1}.

Recall that we defined fairlet in Definition 1.2 as a minimal size non-empty set that is exactly fair. Note that all fairlets have the same size and we use ff to denote their size. Further, for every j[]j\in[\ell], we use fjf_{j} to denote the number of points from group jj in any fairlet.

4.1 Reassignment Method

We start with the reduced instance \mathcal{I}^{\prime} on the set of locations 𝒞¯\overline{\mathcal{C}}, which we constructed in the location consolidation step.

Given any subset 𝖲X\mathsf{S}\subseteq X with profile vector v(𝖲)v(\mathsf{S}), we say that v(𝖲)v(\mathsf{S}) is γ\gamma-approximately fair if

|vj(𝖲)fjfv(𝖲)1|γ,j[].\displaystyle\Bigl{|}v_{j}(\mathsf{S})-\frac{f_{j}}{f}\|v(\mathsf{S})\|_{1}\Bigr{|}\leq\gamma,\;\;\forall j\in[\ell]. (5)

Given an assignment rr, we say that rr is a γ\gamma-approximately fair assignment if R(c)R(c) is γ\gamma-approximately fair for all c𝒞c\in\mathcal{C}, where R(c)=xXr(x,c)0R(c)=\sum_{x\in X}r(x,c)\in{\mathcal{R}}_{\geq 0} is the center profiles corresponding to rr.

We use the following result by Bera et al. Bera et al. (2019).

Theorem 4.1.

For any metric space XX and a set of centers 𝒞X\mathcal{C}\subseteq X, There exists a 33-approximately fair assignment r:X×𝒞0r:X\times\mathcal{C}\rightarrow{\mathcal{R}}_{\geq 0} whose cost is no more than the optimal fair assignment of XX. Moreover, there is an algorithm which finds this assignment in polynomial time.

We start by running the assignment algorithm from Theorem 4.1 on instance \mathcal{I}^{\prime}. Let rr denote this new assignment and R(c)=xXr(x,c)0R(c)=\sum_{x\in X}r(x,c)\in{\mathcal{R}}_{\geq 0} be the center profiles corresponding to rr. As discussed in Section 3.1, every assignment defines a new instance. Let ′′{\mathcal{I}}^{\prime\prime} be the instance defined by assignment rr. Now our goal is to modify rr by reassigning only a small number of points so that the obtained assignment is fair. We describe this reassigning procedure below.

Lemma 4.2.

Assume that there is a βk\beta_{k}-approximation algorithm for the fair clustering problem on ′′\mathcal{I}^{\prime\prime}. Then, there is an (2βk+1)(2\beta_{k}+1)-approximation algorithm for \mathcal{I}^{\prime}.

Proof.

The proof is similar to that of Claim 3.2. We run the βk\beta_{k} approximation algorithm on instance ′′{\mathcal{I}}^{\prime\prime} and output the obtained clustering as a clustering for {\mathcal{I}}^{\prime}. We now upper bound its cost.

Let OPT\textsc{OPT}_{\mathcal{I}^{\prime}} and OPT′′\textsc{OPT}_{\mathcal{I}^{\prime\prime}} be the costs of the optimal fair clusterings/assignments for instances \mathcal{I}^{\prime} and ′′\mathcal{I}^{\prime\prime}, respectively. By Claim 3.1 applied to rr, OPT′′2OPT\textsc{OPT}_{\mathcal{I}^{\prime\prime}}\leq 2\textsc{OPT}_{\mathcal{I}^{\prime}}. Thus the cost of the clustering we find w.r.t. ′′{\mathcal{I}}^{\prime\prime} is at most 2βkOPT2\beta_{k}\textsc{OPT}_{\mathcal{I}^{\prime}}. Now applying Claim 3.1 to this clustering, we get that its cost w.r.t. {\mathcal{I}}^{\prime} is at most (2βk+1)OPT(2\beta_{k}+1)\textsc{OPT}_{\mathcal{I}^{\prime}}, as required. ∎

Algorithm for the reassignment problem.

We present an O(logk)O(\log k)-approximation algorithm for the reassignment problem that runs in time (fk)O()(fk)^{O(\ell)}.

Suppose we are given an instance 𝒥\mathcal{J} with locations c1,,ckc_{1},\dots,c_{k}, metric dd and profile vectors v(c1),,v(ck)v(c_{1}),\dots,v(c_{k}), that satisfy 33-approximately fair requirement as in Eq. (5). We will apply our reassignment algorithm to 𝒥=′′\mathcal{J}=\mathcal{I}^{\prime\prime}. For every i[k]i\in[k], let CiC_{i} be the set of points assigned to cic_{i} in 𝒥\mathcal{J}.

Definition 4.3.

For every i[k]i\in[k], let FiCiF_{i}\subseteq C_{i} be a maximal exactly fair subset of CiC_{i}. Decompose each FiF_{i} into ni=|Fi|/fn_{i}=|F_{i}|/f fairlets 1(i),,ni(i){\mathcal{F}}^{(i)}_{1},\dots,{\mathcal{F}}^{(i)}_{n_{i}}. Let Pi=CiFiP_{i}=C_{i}\setminus F_{i} be the set of remaining points. We say that 𝒟={(1(i),,ni(i),Pi)}i[]{\mathcal{D}}=\{({\mathcal{F}}^{(i)}_{1},\dots,{\mathcal{F}}^{(i)}_{n_{i}},P_{i})\}_{i\in[\ell]} is a fairlet decomposition for clustering C:=(C1,,Ck)C:=(C_{1},\dots,C_{k}). We call points in PiP_{i} problematic points.

Lemma 4.4.

Consider a fairlet decomposition for CC. Then for every i[k]i\in[k], |Pi|<4f|P_{i}|<4f.

Proof.

Consider an arbitrary i[k]i\in[k]. We first assume that ni=0n_{i}=0 ; in other words, FiF_{i} is empty. Let v(Ci)v(C_{i}) be the profile vector of CiC_{i}. Since Fi=F_{i}=\emptyset, Pi=CiP_{i}=C_{i} and there exits a group j[]j\in[\ell] such that vj(Ci)<fjv_{j}(C_{i})<f_{j}. Further, Since Pi=CiP_{i}=C_{i} is 33-approximately fair, |vj(Ci)fj|Pi|f|3|v_{j}(C_{i})-\frac{f_{j}\cdot|P_{i}|}{f}|\leq 3, which implies that fjf|Pi|vj(Ci)+3<fj+3\frac{f_{j}}{f}|P_{i}|\leq v_{j}(C_{i})+3<f_{j}+3. Dividing both sides of the above equation by fjf_{j}, |Pi|f<3fj+13+1=4\frac{|P_{i}|}{f}<\frac{3}{f_{j}}+1\leq 3+1=4, since fj1f_{j}\geq 1. Thus, |Pi|<4f|P_{i}|<4f.

Now, we drop the condition ni=0n_{i}=0. Let v(Pi)v(P_{i}) be the profile vector for PiP_{i}. We show that not only v(Ci)v(C_{i}) but also v(Pi)v(P_{i}) satisfies Eq. (5) with γ=3\gamma=3. Indeed, for each j[]j\in[\ell], vj(Ci)vj(Pi)=niffjf=(v(Ci)1v(Pi)1)fjfv_{j}(C_{i})-v_{j}(P_{i})=n_{i}f\frac{f_{j}}{f}=(\|v(C_{i})\|_{1}-\|v(P_{i})\|_{1})\frac{f_{j}}{f}. Thus, for each j[]j\in[\ell], |vj(Pi)fjfv(Pi)1|=|vj(Ci)fjfv(Ci)1|3\Bigl{|}v_{j}(P_{i})-\frac{f_{j}}{f}\|v(P_{i})\|_{1}\Bigr{|}=\Bigl{|}v_{j}(C_{i})-\frac{f_{j}}{f}\|v(C_{i})\|_{1}\Bigr{|}\leq 3. Hence the same argument as for the case ni=0n_{i}=0 shows that |Pi|<4f|P_{i}|<4f. ∎

Next, we give an outline of our approach.

  1. 1.

    First, we show that there exists an exactly fair assignment rr for 𝒥\mathcal{J} of cost at most 2OPT𝒥2\textsc{OPT}_{\mathcal{J}} that only reassigns O(kf2)O(kf^{2}) points.

  2. 2.

    Next, we show that it is possible to find a subset SiCiS_{i}\subseteq C_{i} for each i[k]i\in[k] such that |Si|=O(kf2)|S_{i}|=O(kf^{2}) for all i[k]i\in[k], and there exists a solution of cost at most 2OPT𝒥2\textsc{OPT}_{\mathcal{J}} that only reassigns points in S:=i[k]SiS:=\bigcup_{i\in[k]}S_{i}.

  3. 3.

    Finally, we apply our dynamic programming approach to reassign points in SS. Since |S|=O(k2f2)|S|=O(k^{2}f^{2}), the DP algorithm runs in (kf)O()(kf)^{O(\ell)} time.

To show (1), we impose an additional assumption on the structure of the desired (re)assignment rr. Let C1,,CkC_{1}^{\prime},\dots,C_{k}^{\prime} be the clustering resulted from an exactly fair assignment of 𝒥\mathcal{J}, where CiC_{i}^{\prime} is the cluster of points at location cic_{i} after the reassignment.

Let C1,,CkC_{1},\dots,C_{k} be the clustering before the (re)assignment. Now, we construct fairlet decompositions for (C1,,Ck)(C_{1},\dots,C_{k}) and (C1,,Ck)(C_{1}^{\prime},\dots,C_{k}^{\prime}). Let EXE\subseteq X be the set of data points that are assigned to the same center before and after (re)assignment rr. For each i[k]i\in[k], let Ei={xE:loc𝒥(x)=ci}E_{i}=\{x\in E:\operatorname{loc}_{\mathcal{J}}(x)=c_{i}\} be the set of data points in CiC_{i} that are fixed by rr.

For every i[k]i\in[k], let FiEiF^{\prime}_{i}\subseteq E_{i} be a maximal exactly fair subset of EiE_{i}. We decompose FiF^{\prime}_{i} into ni=|Fi|/fn^{\prime}_{i}=|F^{\prime}_{i}|/f fairlets. We call this decomposition 𝒟E{\mathcal{D}}_{E}. Now, we greedily extend 𝒟E{\mathcal{D}}_{E} to a fairlet decomposition for (C1,,Ck)(C_{1},\dots,C_{k}). Let PiP_{i} be the set of problematic points in the obtained decomposition. Note that Lemma 4.4 applies to PiP_{i} and therefore |Pi|<4f|P_{i}|<4f.

Similarly, we greedily extend decomposition 𝒟E{\mathcal{D}}_{E} to a fairlet decomposition for (C1,,Ck)(C_{1}^{\prime},\dots,C_{k}^{\prime}). We refer to fairlets that we add to 𝒟E{\mathcal{D}}_{E} as new fairlets. Let 𝒩\mathcal{N} be the set of all new fairlets.

Definition 4.5 (restricted assignment).

In the restricted assignment problem, we need to find a minimum cost assignment r:{c1,,ck}×[k]0r:\{c_{1},\dots,c_{k}\}\times[k]\rightarrow{\mathcal{R}}_{\geq 0} of 𝒥\mathcal{J} such that

  • rr is a valid exactly fair assignment. Formally,

    c{c1,,ck},\displaystyle\forall c\in\{c_{1},\cdots,c_{k}\}, i=1kr(c,ci)=v(c)\displaystyle\sum_{i=1}^{k}r(c,c_{i})=v(c)
    j[],c{c1,,ck},\displaystyle\forall j\in[\ell],c\in\{c_{1},\cdots,c_{k}\}, Rj(c)=fjfR(c)1\displaystyle R_{j}(c)=\frac{f_{j}}{f}\|R(c)\|_{1}
  • every new fairlet \mathcal{F} in CiC_{i}^{\prime} contains at least one point from CiC_{i}.

Lemma 4.6.

Let OPT𝒥R\textsc{OPT}^{R}_{\mathcal{J}} and OPT𝒥\textsc{OPT}_{\mathcal{J}} denote the optimal values of the restricted and vanilla fair assignment problems with input 𝒥\mathcal{J} respectively. Then, OPT𝒥R2OPT𝒥\textsc{OPT}^{R}_{\mathcal{J}}\leq 2\textsc{OPT}_{\mathcal{J}}.

Proof.

Let rr be an optimal fair assignment of 𝒥\mathcal{J}. Let C1,,CkC_{1}^{\prime},\dots,C_{k}^{\prime} be the resulting clusters; the set CiC_{i}^{\prime} consists of the points assigned to cic_{i} by rr. Now, suppose that there exits a new fairlet \mathcal{F} in some CiC_{i}^{\prime} (i[k]i\in[k]) such that Ci=\mathcal{F}\cap C_{i}=\emptyset. Without loss of generality, assume that the points in \mathcal{F} are from Cj1,Cj2,,CjmC_{j_{1}},C_{j_{2}},\dots,C_{j_{m}} for some m[k]m\in[k]. Let wt:=|Cjt|w_{t}:=|C_{j_{t}}\cap\mathcal{F}| for every t[m]t\in[m] and W=t=1mwtW=\sum_{t=1}^{m}w_{t}. Then, the total cost of forming \mathcal{F} is t=1mwtd(cjt,ci)\sum_{t=1}^{m}w_{t}\cdot d(c_{j_{t}},c_{i}). Now, for every s[m]s\in[m], compute the cost of moving all points in fairlet \mathcal{F} to cluster CjsC_{j_{s}} (i.e., reassign all data points in \mathcal{F} to center cjsc_{j_{s}}) and denote it as MsM_{s}.

MS=t=1mwtd(cjt,cjs)t[m]{s}wt(d(cjt,ci)+d(cjs,ci))=(t[m]{s}wt)d(cjs,ci)+t[m]{s}wtd(cjt,ci).\begin{split}M_{S}=\sum_{t=1}^{m}w_{t}d(c_{j_{t}},c_{j_{s}})&\leq\sum_{t\in[m]\setminus\{s\}}w_{t}(d(c_{j_{t}},c_{i})+d(c_{j_{s}},c_{i}))=\Bigg{(}\sum_{t\in[m]\setminus\{s\}}w_{t}\Bigg{)}d(c_{j_{s}},c_{i})+\sum_{t\in[m]\setminus\{s\}}w_{t}d(c_{j_{t}},c_{i}).\end{split}

Next, consider the convex combination s=1mwsWMs\sum_{s=1}^{m}\frac{w_{s}}{W}M_{s}.

s=1mwsWMs=s=1mwsWt=1mwtd(cjt,cjs)s=1mwsW(t[m]{s}wt)d(cjs,ci)+s=1mwsWt[m]{s}wtd(cjt,ci)<2t=1mwtd(cjt,ci).\displaystyle\sum_{s=1}^{m}\frac{w_{s}}{W}M_{s}=\sum_{s=1}^{m}\frac{w_{s}}{W}\sum_{t=1}^{m}w_{t}d(c_{j_{t}},c_{j_{s}})\leq\sum_{s=1}^{m}\frac{w_{s}}{W}\Bigg{(}\sum_{t\in[m]\setminus\{s\}}w_{t}\Bigg{)}d(c_{j_{s}},c_{i})+\sum_{s=1}^{m}\frac{w_{s}}{W}\sum_{t\in[m]\setminus\{s\}}w_{t}d(c_{j_{t}},c_{i})<2\sum_{t=1}^{m}w_{t}d(c_{j_{t}},c_{i}).

This implies that there exist s[m]s^{*}\in[m] with CjsC_{j_{s^{*}}}\cap\mathcal{F}\neq\emptyset such that the cost MsM_{s^{*}} of reassigning \mathcal{F} to cjsc_{j_{s^{*}}} is at most twice the current assignment cost of \mathcal{F}.

We perform this fairlet reassignment procedure to all new fairlets that do not satisfy the restricted assignment property and obtain a new restricted assignment rr^{\prime}. Our argument shows that the assignment cost of rr^{\prime} is at most twice the assignment cost of rr; hence,

OPT𝒥R2OPT𝒥.\displaystyle\textsc{OPT}^{R}_{\mathcal{J}}\leq 2\textsc{OPT}_{\mathcal{J}}.

Next, we show that the optimal restricted assignment is even more structured.

Lemma 4.7.

There exists an optimal solution rr^{\prime} to the restricted assignment of 𝒥\mathcal{J} such that

  1. (a)

    for any i[k]i\in[k] and any new fairlet Ci\mathcal{F}\subseteq C_{i}^{\prime}, there exists a problematic point xCix\in\mathcal{F}\cap C_{i};

  2. (b)

    if rr^{\prime} moves a point xx\in\mathcal{F}^{\prime} for some fairlet Ci\mathcal{F}^{\prime}\subseteq C_{i} and i[k]i\in[k], then rr^{\prime} moves all points in \mathcal{F}^{\prime}.

Proof.

Let rr be the optimal solution for the restricted assignment problem. We apply two transformations to it.

Transformation I: Simplifying the reassignment multigraph.

For every group j[]j\in[\ell], define the reassignment multigraph GjG_{j} on the set of centers c1,,ckc_{1},\dots,c_{k}. There is a directed edge from cac_{a} to cbcac_{b}\neq c_{a} in GjG_{j} for every point xx that is reassigned from cac_{a} to cbc_{b} by rr. We label such an edge with label xx. Note that there are rj(ca,cb)r_{j}(c_{a},c_{b}) parallel edges from cac_{a} to cbc_{b} in GjG_{j}.

Assume that there is a vertex cc that has both incoming and outgoing edges in GjG_{j}. Let (c,c)(c^{\prime},c) be an incoming edge and (c,c′′)(c,c^{\prime\prime}) be an outgoing edge (it is possible that c=c′′c^{\prime}=c^{\prime\prime}). Denote their labels by xx and xx^{\prime}. By the definition of the graph, rr reassigns point xXjx\in X_{j} from cc^{\prime} to cc and point xXjx^{\prime}\in X_{j} from cc to c′′c^{\prime\prime}. We modify rr as follows (see Figure 1).

original assignment rr new assignment r~\tilde{r}
assignment c\textstyle{c\ignorespaces\ignorespaces\ignorespaces\ignorespaces}x\scriptstyle{x^{\prime}}c\textstyle{c^{\prime}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}x\scriptstyle{x}c′′\textstyle{c^{\prime\prime}} c\textstyle{c\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces\ignorespaces}x\scriptstyle{x^{\prime}}c\textstyle{c^{\prime}\ignorespaces\ignorespaces\ignorespaces\ignorespaces}x\scriptstyle{x}c′′\textstyle{c^{\prime\prime}}
cost d(c,c)+d(c,c′′)d(c^{\prime},c)+d(c,c^{\prime\prime}) d(c,c′′)d(c^{\prime},c^{\prime\prime})
Figure 1: We reassign xx and xx^{\prime}. When we do the reassignment, we replace two edges (c,c)(c^{\prime},c) and (c,c′′)(c,c^{\prime\prime}) with one edge (c,c′′)(c^{\prime},c^{\prime\prime}) in graph GjG_{j}. Note that there are no loops in GjG_{j}. We have drawn a dotted arrow from cc to itself in the figure simply to indicate that r~\tilde{r} does not reassign xx^{\prime}.

We only change the assignment of points xx and xx^{\prime}: we reassign xx to c′′c^{\prime\prime} and xx^{\prime} to cc. Denote the obtained assignment r~\tilde{r}. Since xx and x′′x^{\prime\prime} belong to the same group XjX_{j}, we do not change the profiles of any clusters and thus r~\tilde{r} is exactly fair. Further, if rr assigns a point yy from cic_{i} to cic_{i} then so does r~\tilde{r}. Therefore, r~\tilde{r} is also a solution to the restricted assignment problem. Finally, its cost is at most that of rr: indeed the assignment costs of all points other than xx and xx^{\prime} do not change; the total assignment cost of xx and xx^{\prime} was equal to d(c,c)+d(c,c′′)d(c^{\prime},c)+d(c,c^{\prime\prime}) for rr and is equal to d(c,c′′)d(c^{\prime},c^{\prime\prime}) for r~\tilde{r}; we have d(c,c)+d(c,c′′)d(c,c′′)d(c^{\prime},c)+d(c,c^{\prime\prime})\geq d(c^{\prime},c^{\prime\prime}) by the triangle inequality.

We perform this step over and over until there are no vertices that simultaneously have incoming and outgoing edges. (Note that each time we perform this step, the number of edges in one of the graphs GjG_{j} decreases by 1 and does not change in all other graphs GjG_{j^{\prime}} (for every j[]{j}j^{\prime}\in[\ell]\setminus\{j\}). So we will necessarily stop at some point.)

Denote the obtained assignment by rr^{\prime}. It is a fair assignment and its cost is at most that of rr. Consider the reassignment multigraphs {Gj}j[]\{G_{j}^{\prime}\}_{j\in[\ell]} for rr^{\prime}. Each vertex cc in GjG_{j}^{\prime} is either a source, sink, or isolated vertex.

Transformation II: Specifying which points rr^{\prime} moves.

We go over all i[k]i\in[k] and j[]j\in[\ell] such that cic_{i} is a source-vertex in GjG_{j}^{\prime}. Assignment rr^{\prime} moves M=iirj(ci,ci)M=\sum_{i^{\prime}\neq i}r_{j}(c_{i},c_{i^{\prime}}) points in group jj from CiC_{i} to other clusters. Since all points in CiXjC_{i}\cap X_{j} are interchangeable, we are free to choose any subset of MM points in CiXjC_{i}\cap X_{j} to move. We choose this set in a special way. Let 1,,ni\mathcal{F}_{1},\dots,\mathcal{F}_{n_{i}} be fairlets in the fairlet decomposition we constructed for CiC_{i}. We order all points in CiXjC_{i}\cap X_{j} as follows: first points from PiXjP_{i}\cap X_{j}, then from 1Xj\mathcal{F}_{1}\cap X_{j}, then from 2Xj\mathcal{F}_{2}\cap X_{j}, then from 3Xj\mathcal{F}_{3}\cap X_{j}, etc (we order points inside each of these sets arbitrarily). We choose the subset of the first MM points w.r.t. this order and assume that rr^{\prime} moves them. Note that rr^{\prime} moves non-problematic points from CiXjC_{i}\cap X_{j} to another cluster only if it moves all problematic points from CiXjC_{i}\cap X_{j}.

We have defined rr^{\prime}. Now we prove that it satisfies properties (a) and (b). To this end, we consider a cluster CiC_{i} and analyze the following two cases.

Case 1: Assignment rr^{\prime} does not move any non-problematic points from cluster CiC_{i} to other clusters. All points (if any) it moves from CiC_{i} to other clusters are problematic.

Then CiPiCiC_{i}\setminus P_{i}\subseteq C_{i}^{\prime} and thus all fairlets present in the decomposition for CiC_{i} are also present in that for CiC_{i}^{\prime}. Hence, every non-problematic point belongs to the same fairlet in CiC_{i} and CiC_{i}^{\prime}. In particular, only problematic points may belong to new fairlets. On the other hand, since rr^{\prime} is a solution to the restricted assignment problem, at least one point xx in each new fairlet \mathcal{F} in CiC_{i}^{\prime} must be from CiC_{i}. It follows that this point must be problematic. We proved that property (a) holds in this case. Since rr does not move any points from fairlets Ci{\mathcal{F}}^{\prime}\subseteq C_{i} to other clusters, property (b) trivially holds.

Case 2: Assignment rr^{\prime} moves at least one non-problematic point from CiC_{i} to another cluster.

Let tt be maximal index such that rr^{\prime} moves points from fairlet t\mathcal{F}_{t} to other clusters. Since we are guaranteed that rr^{\prime} moves at least one non-problematic point, tt is well defined.

By our choice of tt, rr^{\prime} does not move any points from fairlets t+1,,ni\mathcal{F}_{t+1},\dots,\mathcal{F}_{n_{i}} and therefore these fairlets are still present in CiC_{i}^{\prime}. Now we show that there are no other fairlets in CiC_{i}^{\prime}; in particular, there are no new fairlets.

Consider a point xx in t\mathcal{F}_{t} that rr^{\prime} moves to some other cluster CiC_{i^{\prime}}. Assume that xXjx\in X_{j}. Then edge (ci,ci)(c_{i},c_{i^{\prime}}) – an edge outgoing from cic_{i} – is present in GjG_{j}^{\prime}. Hence, cic_{i} cannot be a sink or isolated vertex and must be a source-vertex in GjG_{j}^{\prime}. Therefore, rr^{\prime} does not move any point yXjy\in X_{j} from another cluster to CiC_{i}. Also, all points in CiXjC_{i}\cap X_{j} that precede xx w.r.t. the order we considered in the Transformation II step are moved to other clusters by rr^{\prime}; in particular, all points in PiXj,1Xj,,t1XjP_{i}\cap X_{j},\mathcal{F}_{1}\cap X_{j},\dots,\mathcal{F}_{t-1}\cap X_{j} as well as xx itself are moved to other clusters by rr^{\prime}. Therefore, CiXjC_{i}^{\prime}\cap X_{j} consists of points from t+1Xj,,niXj\mathcal{F}_{t+1}\cap X_{j},\dots,\mathcal{F}_{n_{i}}\cap X_{j} and hypothetically some points from tXj\mathcal{F}_{t}\cap X_{j}. However, point xx from tXj\mathcal{F}_{t}\cap X_{j} is assigned to another cluster and thus there are not enough points from group jj left in t\mathcal{F}_{t} to form another fairlet. We conclude that t+1,,ni\mathcal{F}_{t+1},\dots,\mathcal{F}_{n_{i}} are the only fairlets in CiC_{i}^{\prime}. Since CiC_{i}^{\prime} satisfies the exact fairness constraints, each point in CiC_{i}^{\prime} lies in a fairlet, that is, in one of these fairlets. Now (a) trivially holds since there are no new fairlets in CC^{\prime}; (b) holds since rr^{\prime} moves all points from fairlets 1,,t\mathcal{F}_{1},\dots,\mathcal{F}_{t} and no points from fairlets t+1,,ni\mathcal{F}_{t+1},\dots,\mathcal{F}_{n_{i}}. ∎

Lemma 4.7 immediately implies the following result which proves 1 and 2.

Corollary 4.8.

Suppose that 𝒥\mathcal{J} is a 33-approximately fair instance. There exists an assignment rr^{\prime} of cost at most 2OPT𝒥2\textsc{OPT}_{\mathcal{J}} that only moves O(kf2)O(kf^{2}) points. Furthermore, we can choose a subset SS of data points such that |S|=O(k2f2)|S|=O(k^{2}f^{2}) and the solution rr^{\prime} only moves points in SS.

Proof.

Let rr^{\prime} be an optimal restricted assignment that satisfies both conditions in Lemma 4.7. In the clustering defined by rr^{\prime}, every new fairlet contains a problematic point from the original cluster of the center they are assigned to by rr^{\prime}. Since by Lemma 4.4 there are only 4kf4kf problematic points in total, there are at most 4kf4kf many new fairlets. Thus, rr^{\prime} moves at most 4kf24kf^{2} many points. Moreover, by Lemma 4.6, the cost of rr^{\prime} is at most 2OPT𝒥2\textsc{OPT}_{\mathcal{J}}.

Since we know that rr^{\prime} moves at most 4kf24kf^{2} many points and rr^{\prime} satisfies condition (b)(b) in Lemma 4.7, rr^{\prime} would move at most 4kf4kf fairlets in each cluster. Recall that each CiC_{i} contains nin_{i} fairlets. From each cluster CiC_{i}, we pick min(ni,4kf)\min(n_{i},4kf) many fairlets and add them to the set SS. Then, we add all the problematic points to SS. Then, |S|4k2f2+4kf|S|\leq 4k^{2}f^{2}+4kf and we can assume that rr^{\prime} only moves the points in SS. ∎

Now, we can solve the assignment problem by passing the set SS as input to our dynamic programming approach described in Section 3.4.

Proof of Theorem 1.6..

Let \mathcal{I} be an instance of the exactly fair kk-median problem. First, we perform location consolidation on \mathcal{I} to obtain the instance \mathcal{I}^{\prime} as described in Section 3.2. Then, we run the assignment algorithm by Bera et al. (2019) on \mathcal{I}^{\prime}. Let ′′\mathcal{I}^{\prime\prime} be the resulting instance. By Theorem 4.1, ′′\mathcal{I}^{\prime\prime} is 33-approximately fair. By Corollary 4.8, we can pick a subset SS of data points such that |S|=O(k2f2)|S|=O(k^{2}f^{2}) and there exists an solution rr^{\prime} of cost at most 2OPT′′2\textsc{OPT}_{\mathcal{I}^{\prime\prime}} that only moves points in SS. Now, we apply the dynamic programming algorithm in Theorem 1.3 on SS to obtain a solution of cost at most O(logk)OPT′′O(\log k)\textsc{OPT}_{\mathcal{I}^{\prime\prime}} with high probability that runs in |S|O()logn=(kf)O()logn|S|^{O(\ell)}\log n=(kf)^{O(\ell)}\log n time. Since all the previous steps take poly(n)\mathop{\mathrm{poly}}(n) time, the total running time of this algorithm is poly(n)+(kf)O()logn\mathop{\mathrm{poly}}(n)+(kf)^{O(\ell)}\log n.

Now, by Lemma 4.2 and Claim 3.2, the total cost of our solution is O(logk)OPTO(\log k)\textsc{OPT}_{\mathcal{I}}. ∎

5 Conclusions

In this paper, we study the fair kk-median problem. We present an O(logk)O(\log k)-approximation algorithm that runs in time nO()n^{O(\ell)}. We further showed that our algorithm works in a more general setting where the fairness requirements are specified as an arbitrary set of fair profiles. This notion “profile-based fairness” captures a richer class of fairness requirements that cannot be handled by the previously known approaches for fair representation clustering.

In addition, in the special case of exact fairness, we present an O(logk)O(\log k)-approximation algorithm that runs in (kf)O()logn+poly(n)(kf)^{O(\ell)}\log n+\mathop{\mathrm{poly}}(n) time, where ff is the size of a fairlet.

Our paper shows that there exists approximation algorithms for fair representation clustering with O(1)O(1) protected classes that run in polynomial time. It remains as an exciting question whether polynomial time O(logk)O(\log k)-approximation can be achieved for the problem when =Ω(1)\ell=\Omega(1). An approach that can be potentially helpful to resolve the previous question is to extend the “reassignment” method used in Section 4 to the more general setting of representation fairness (as in Section 3).

References

  • Abbasi et al. [2021] M. Abbasi, A. Bhaskara, and S. Venkatasubramanian. Fair clustering via equitable group representations. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, page 504–514, 2021.
  • Ahmadian et al. [2019] S. Ahmadian, A. Epasto, R. Kumar, and M. Mahdian. Clustering without over-representation. In Proceedings of the SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 267–275, 2019.
  • Angelidakis et al. [2017] H. Angelidakis, K. Makarychev, and Y. Makarychev. Algorithms for stable and perturbation-resilient problems. In Proceedings of the Symposium on Theory of Computing, pages 438–451, 2017.
  • Backurs et al. [2019] A. Backurs, P. Indyk, K. Onak, B. Schieber, A. Vakilian, and T. Wagner. Scalable fair clustering. In Proceedings of the International Conference on Machine Learning, pages 405–413, 2019.
  • Bandyapadhyay et al. [2021] S. Bandyapadhyay, F. V. Fomin, and K. Simonov. On coresets for fair clustering in metric and Euclidean spaces and their applications. In Proceedings of the International Colloquium on Automata, Languages, and Programming, pages 23:1–23:15, 2021.
  • Bartal [1996] Y. Bartal. Probabilistic approximation of metric spaces and its algorithmic applications. In Proceedings of the Conference on Foundations of Computer Science, pages 184–193, 1996.
  • Bartal [1998] Y. Bartal. On approximating arbitrary metrices by tree metrics. In Proceedings of the Symposium on Theory of Computing, pages 161–168, 1998.
  • Bera et al. [2019] S. Bera, D. Chakrabarty, N. Flores, and M. Negahbani. Fair algorithms for clustering. In Advances in Neural Information Processing Systems, pages 4955–4966, 2019.
  • Bercea et al. [2019] I. O. Bercea, M. Groß, S. Khuller, A. Kumar, C. Rösner, D. R. Schmidt, and M. Schmidt. On the cost of essentially fair clusterings. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, 2019.
  • Böhm et al. [2020] M. Böhm, A. Fazzone, S. Leonardi, and C. Schwiegelshohn. Fair clustering with multiple colors. arXiv preprint arXiv:2002.07892, 2020.
  • Brubach et al. [2020] B. Brubach, D. Chakrabarti, J. Dickerson, S. Khuller, A. Srinivasan, and L. Tsepenekas. A pairwise fair and community-preserving approach to kk-center clustering. In Proceedings of the International Conference on Machine Learning, pages 1178–1189, 2020.
  • Chakrabarty and Negahbani [2021] D. Chakrabarty and M. Negahbani. Better algorithms for individually fair kk-clustering. arXiv preprint arXiv:2106.12150, 2021.
  • Charikar et al. [2002] M. Charikar, S. Guha, É. Tardos, and D. B. Shmoys. A constant-factor approximation algorithm for the kk-median problem. Journal of Computer and System Sciences, 65(1):129–149, 2002.
  • Chen et al. [2019] X. Chen, B. Fain, L. Lyu, and K. Munagala. Proportionally fair clustering. In International Conference on Machine Learning, pages 1032–1041. PMLR, 2019.
  • Chierichetti et al. [2017] F. Chierichetti, R. Kumar, S. Lattanzi, and S. Vassilvitskii. Fair clustering through fairlets. In Advances in Neural Information Processing Systems, pages 5036–5044, 2017.
  • Chlamtáč et al. [2022] E. Chlamtáč, Y. Makarychev, and A. Vakilian. Approximating fair clustering with cascaded norm objectives. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2664–2683, 2022.
  • Chouldechova [2017] A. Chouldechova. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big data, 5(2):153–163, 2017.
  • Chouldechova and Roth [2020] A. Chouldechova and A. Roth. A snapshot of the frontiers of fairness in machine learning. Communications of the ACM, 63(5):82–89, 2020.
  • Esmaeili et al. [2020] S. Esmaeili, B. Brubach, L. Tsepenekas, and J. Dickerson. Probabilistic fair clustering. Advances in Neural Information Processing Systems, 33, 2020.
  • Esmaeili et al. [2021] S. A. Esmaeili, B. Brubach, A. Srinivasan, and J. P. Dickerson. Fair clustering under a bounded cost. Advances in Neural Information Processing Systems, 2021.
  • Fakcharoenphol et al. [2004] J. Fakcharoenphol, S. Rao, and K. Talwar. A tight bound on approximating arbitrary metrics by tree metrics. Journal of Computer and System Sciences, 69(3):485–497, 2004.
  • Feldman et al. [2015] M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian. Certifying and removing disparate impact. In Proceedings of the SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 259–268, 2015.
  • Ghadiri et al. [2021] M. Ghadiri, S. Samadi, and S. Vempala. Socially fair kk-means clustering. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pages 438–448, 2021.
  • Huang et al. [2019] L. Huang, S. Jiang, and N. Vishnoi. Coresets for clustering with fairness constraints. In Proceedings of the Conference on Neural Information Processing Systems, 2019.
  • Jones et al. [2020] M. Jones, H. Nguyen, and T. Nguyen. Fair kk-centers via maximum matching. In Proceedings of the International Conference on Machine Learning, pages 4940–4949, 2020.
  • Jung et al. [2020] C. Jung, S. Kannan, and N. Lutz. A center in your neighborhood: Fairness in facility location. In Proceedings of the Symposium on Foundations of Responsible Computing, page 5:1–5:15, 2020.
  • Kariv and Hakimi [1979] O. Kariv and S. L. Hakimi. An algorithmic approach to network location problems. ii: The pp-medians. SIAM Journal on Applied Mathematics, 37(3):539–560, 1979.
  • Kearns and Roth [2019] M. Kearns and A. Roth. The ethical algorithm: The science of socially aware algorithm design. Oxford University Press, 2019.
  • Kleinberg et al. [2017] J. Kleinberg, S. Mullainathan, and M. Raghavan. Inherent trade-offs in the fair determination of risk scores. In Proceedings of the Innovations in Theoretical Computer Science, 2017.
  • Kleinberg et al. [2018] J. Kleinberg, H. Lakkaraju, J. Leskovec, J. Ludwig, and S. Mullainathan. Human decisions and machine predictions. The quarterly journal of economics, 133(1):237–293, 2018.
  • Kleindessner et al. [2019] M. Kleindessner, P. Awasthi, and J. Morgenstern. Fair kk-center clustering for data summarization. In Proceedings of the International Conference on Machine Learning, pages 3448–3457, 2019.
  • Kleindessner et al. [2020] M. Kleindessner, P. Awasthi, and J. Morgenstern. A notion of individual fairness for clustering. arXiv preprint arXiv:2006.04960, 2020.
  • Li and Svensson [2016] S. Li and O. Svensson. Approximating kk-median via pseudo-approximation. SIAM Journal on Computing, 45(2):530–547, 2016.
  • Mahabadi and Vakilian [2020] S. Mahabadi and A. Vakilian. Individual fairness for kk-clustering. In Proceedings of the International Conference on Machine Learning, pages 6586–6596, 2020.
  • Makarychev and Vakilian [2021] Y. Makarychev and A. Vakilian. Approximation algorithms for socially fair clustering. In Proceedings of the Conference on Learning Theory, pages 3246–3264. PMLR, 2021.
  • Marcinkowski et al. [2020] F. Marcinkowski, K. Kieslich, C. Starke, and M. Lünich. Implications of ai (un-) fairness in higher education admissions: the effects of perceived ai (un-) fairness on exit, voice and organizational reputation. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pages 122–130, 2020.
  • Micha and Shah [2020] E. Micha and N. Shah. Proportionally fair clustering revisited. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.
  • Rösner and Schmidt [2018] C. Rösner and M. Schmidt. Privacy preserving clustering with constraints. In Proceedings of the International Colloquium on Automata, Languages, and Programming, pages 96:1–96:14, 2018.
  • Schmidt et al. [2019] M. Schmidt, C. Schwiegelshohn, and C. Sohler. Fair coresets and streaming algorithms for fair kk-means. In Proceedings of the International Workshop on Approximation and Online Algorithms, pages 232–251, 2019.
  • Tamir [1996] A. Tamir. An o(pn2)o(pn^{2}) algorithm for the pp-median and related problems on tree graphs. Operations Research Letters, 19(2):59–64, 1996.
  • Vakilian and Yalçıner [2021] A. Vakilian and M. Yalçıner. Improved approximation algorithms for individually fair clustering. arXiv preprint arXiv:2106.14043, 2021.