
Computing Rule-Based Explanations by Leveraging Counterfactuals

Zixuan Geng University of WashingtonUSA [email protected] Maximilian Schleich Relational AIUSA [email protected]  and  Dan Suciu University of WashingtonUSA [email protected]
Abstract.

Sophisticated machine learning models are increasingly used for high-stakes decisions in everyday life. There is an urgent need to develop effective explanation techniques for such automated decisions. Rule-based explanations have been proposed for high-stakes decisions like loan applications, because they increase the users' trust in the decision. However, rule-based explanations are very expensive to compute, and existing systems sacrifice their quality in order to achieve reasonable performance. We propose a novel approach to computing rule-based explanations that uses a different type of explanation, counterfactual explanations, for which several efficient systems have already been developed. We prove a Duality Theorem, showing that rule-based and counterfactual explanations are dual to each other, then use this observation to develop an efficient algorithm for computing rule-based explanations that uses a counterfactual explanation system as an oracle. We conduct extensive experiments showing that our system computes rule-based explanations of higher quality, with the same or better performance, than two previous systems, MinSetCover and Anchor.

PVLDB Reference Format:
PVLDB, 14(1): XXX-XXX, 2022.
doi:XX.XX/XXX.XX This work is licensed under the Creative Commons BY-NC-ND 4.0 International License. Visit https://creativecommons.org/licenses/by-nc-nd/4.0/ to view a copy of this license. For any use beyond those covered by this license, obtain permission by emailing [email protected]. Copyright is held by the owner/author(s). Publication rights licensed to the VLDB Endowment.
Proceedings of the VLDB Endowment, Vol. 14, No. 1 ISSN 2150-8097.
doi:XX.XX/XXX.XX

PVLDB Artifact Availability:
The source code, data, and/or other artifacts have been made available at https://github.com/GibbsG/GeneticCF.

1. Introduction

We are witnessing an increased adoption of sophisticated machine learning models in high-stakes decisions. This creates an urgent need to make these models more explainable and debuggable, both to ensure their fairness and to increase the trust of the human users affected by their decisions. Due to this need, explainable machine learning has become an important research topic.

The literature on explanation techniques is vast (e.g., (Lundberg and Lee, 2017; Ribeiro et al., 2016, 2018; Schleich et al., 2021; Rudin and Shaposhnik, 2019; Ustun et al., 2019; Wachter et al., 2017)). We refer to the excellent book on interpretable machine learning for an overview of these techniques (Molnar, 2022). At a high level, explanations fall into two categories depending on their scope. Local explanations (Baehrens et al., 2010) explain the model's decision on a single instance, while global explanations (Wang et al., 2017) explain the model as a whole. In this paper we focus on local explanations.

One type of local explanation is the counterfactual explanation, or actionable recourse, which generates a counterfactual, "desired" instance from an "undesired" instance. Given an instance $x$ on which the machine learning model predicts the negative, "bad" outcome, the counterfactual explanation says what needs to change in order to get the positive, "good" outcome. For example, a customer applies for a loan with a bank, the bank denies the application, and the customer asks for an explanation; the system responds by indicating which features need to change in order for the loan to be approved, for example: increase the Income from 500 to 700 and decrease the number of Open Accounts from 4 to 3. The semantics is that, if these features are changed accordingly, then the machine learning model will change its prediction from "bad" to "good". The counterfactual thus gives the user a recommendation for what to do in order to change the outcome in the future.

However, in high-stakes applications like financial decisions, counterfactual explanations may actually mislead customers. For example, a different customer may have an income of 500 and 4 open accounts, yet her loan application was approved. The reason is that the two customers differ in other features used by the system, but customers unfamiliar with the internals of the model will instead perceive the decision as unfair, because they are asked to change features that appeared to be no problem for other customers.

Counterfactual explanations thus appear to be insufficient for high-stakes applications of machine learning. For that reason, Rudin et al. (Chen et al., 2018; Rudin and Shaposhnik, 2019) argue in favor of a different kind of explanation, called a rule-based explanation. A rule is a conjunction of predicates on the features for which the machine learning model always generates the bad outcome. For example, a rule could be: all customers who had an Income $\leq 500$ and an Employment-history of $\leq 10$ years were denied the loan application. While this does not immediately tell the customer how to intervene to change the outcome, it nevertheless assures her of the fairness of the decision, because all customers with these features had been denied. Instead of being prescriptive and instructing the customer on what to do, rule-based explanations are descriptive: they provide fundamental reasons for the decision. Rule-based explanations are similar to the anchors introduced by Ribeiro et al. (Ribeiro et al., 2018), and are highly desired by financial institutions.

Black-box explanation systems compute explanations by repeatedly probing the black-box classifier with inputs derived from the instance to be explained, and from the domain of some large dataset of instances, which can be the data used to train the classifier or historical data of past decisions. This leads to distinct computational challenges for counterfactual and rule-based explanations. A counterfactual explanation consists of some, ideally small, set of features, together with new values for these features that lead to a positive outcome. A rule-based explanation consists of some, ideally small, set of features, where the current values of the instance always lead to a negative outcome, for any values of the other features. In other words, a counterfactual explanation requires answering an existentially quantified query, while a rule-based explanation requires answering a universally quantified query. Finding counterfactual explanations is relatively easier, and several counterfactual explanation systems have already been developed, such as Mace (Karimi et al., 2020), Geco (Schleich et al., 2021), Dice (Mothilal et al., 2020) and others, which can efficiently find counterfactual explanations of high quality. In contrast, finding rule-based explanations is much harder. For example, the approach described by Rudin et al. (Rudin and Shaposhnik, 2019) converts the problem into a minimum set-cover problem, whose size depends on the size of the database, then solves it using integer programming.

In this paper we propose a novel approach that computes rule-based explanations by reducing the problem to computing counterfactual explanations, then uses an existing counterfactual system to build a highly efficient rule-based explanation system. We start by proving formally that counterfactual and rule-based explanations are duals of each other. This means that, if a counterfactual explanation consists of some set of features, then every rule-based explanation must include at least one of these features. Otherwise, if the counterfactual and rule-based explanations used disjoint sets of features, we would be led to a contradiction, since one explanation asserts that by changing the values of only the first set of features the outcome becomes positive, while the other asserts that if we keep the values of the second set of features unchanged then the outcome is always negative.

Using the duality theorem, we develop a new approach for computing rule-based explanations that uses a counterfactual explanation algorithm as a black box. We start from a baseline consisting of a simple algorithm for computing rule-based explanations, called GeneticRule, which, given an instance $x$ with a bad outcome, searches candidate rules using a genetic algorithm. Then we describe two extensions. The first extension, called Genetic Rule with Counterfactual (GeneticRuleCF), uses a counterfactual system to create new candidate rules. More precisely, if a candidate rule fails to be globally consistent, then the algorithm asks for a counterfactual explanation for the bad outcome on $x$, under the constraint that none of the features already included in the rule be modified. Each feature changed by the counterfactual is then added to the candidate rule, and the search continues. The second extension, called Greedy Algorithm with Counterfactual (GreedyRuleCF), also uses the counterfactual explanation algorithm to extend rules, but only applies it to the best current candidate rule.

In order to validate a rule-based explanation, one has to check whether, for all possible values of the other attributes, the outcome of the classifier remains negative. This property is called global consistency, and it is very expensive to check. The global consistency test is a critical step in our algorithms. To reduce its cost, the set-cover method in (Rudin and Shaposhnik, 2019) restricts the test to instances in the database. In our approach, we not only check instances in the database, but also check combinations of values from the entire domain. For example, suppose Alice is denied her application, and happens to have 10 open accounts. In order to check the consistency of the rule "10 open accounts always lead to a denial", the set-cover method checks only the customers in the database: if all customers with 10 open accounts were denied, then it deems the rule consistent. However, the database may contain only a tiny sample of customers with 10 open accounts. In contrast, our system checks combinations of all other attributes, e.g. age, income, credit-score, etc., and declares the rule consistent only if all such combinations lead to a negative outcome. This test is potentially very expensive, and this is where we use the counterfactual explanation system. More precisely, we ask it to find a counterfactual explanation where the features in the rule are fixed, while the others can be modified arbitrarily. For example, we ask it to find a counterfactual that keeps the number of open accounts equal to 10: the rule "10 open accounts" is globally consistent only if no such counterfactual exists.

Finally, we conduct an extensive experimental evaluation of our three algorithms and compare them to both MinSetCover (Rudin and Shaposhnik, 2019) and Anchor (Ribeiro et al., 2018). We found that both MinSetCover and Anchor return rules that are not globally consistent. For example, MinSetCover checks consistency only on the instances in the database, and 97.4% of the rules it generates for the Adult dataset are not globally consistent, while Anchor almost always returns rules with redundant predicates and about 87.0% of its rules are not globally consistent. A redundant predicate is one that can be removed from the rule while keeping it globally consistent; Anchor uses a multi-armed bandit approach to find rules, which often leads to the inclusion of redundant predicates. In contrast, our GeneticRuleCF algorithm always generates globally consistent rules, with only 12.4% of the rules having redundant predicates, and our GreedyRuleCF algorithm generates only globally consistent rules, without redundant predicates.

We note that an orthogonal approach to explanations is the development of interpretable machine learning models. In general, simple ML models such as linear regression or rule-based models are considered to be interpretable. One should not confuse rule-based models, as discussed e.g. in (Lakkaraju et al., 2016), with the rule-based explanations considered in our paper. The purpose of a rule-based model is to serve as the decision mechanism, while that of a rule-based explanation is to explain a decision made by some other, usually uninterpretable, model.

Contributions. In summary, in this paper we make the following contributions.

(1) We prove the Duality Theorem between counterfactual and rule-based explanations (Section 3.1).

(2) We show how to use the Duality Theorem to compute rule-based explanations using a counterfactual explanation system (Section 3.2).

(3) We describe three algorithms, GeneticRule, GeneticRuleCF, and GreedyRuleCF, for generating rule-based explanations (Section 4).

(4) We conduct an extensive experimental evaluation of the GeneticRule, GeneticRuleCF, and GreedyRuleCF algorithms, and compare them with Anchor and MinSetCover (Section 5).

2. Definitions

Let $F_1, \dots, F_n$ be $n$ features, with domains $dom(F_1), \dots, dom(F_n)$, which we assume to be ordered, and let $\textsc{Inst} \stackrel{\mathrm{def}}{=} dom(F_1) \times \dots \times dom(F_n)$. We call an element $x \in \textsc{Inst}$ an instance. We are given a black box classifier $C$ that, on any instance $x \in \textsc{Inst}$, returns a prediction $C(x)$ within the range $[0,1]$. We assume that $C(x) \leq 0.5$ is an "undesired" or "bad" outcome, while $C(x) > 0.5$ is "desired" or "good". For a binary classifier, we replace its outcomes with the values $\{0, 1\}$, where $0$ is for "undesired" and $1$ is for "desired". Furthermore, we assume a database $D = \{x_1, \ldots, x_m\}$ of $m$ instances, which can be a training set, or a test set, or historical data of past customers for which the system has performed predictions. For each instance in the database we write its feature values as $x_i = (f_{i1}, \dots, f_{in})$.

2.1. Rule-based Explanation

Fix an instance $x_i = (f_{i1}, \ldots, f_{in}) \in D$. A rule component, $RC$, relevant to $x_i$, is a predicate of the form $F_j \leq f_{ij}$ or $F_j \geq f_{ij}$, for some feature $F_j$. For an instance $x \in \textsc{Inst}$, we write $RC(x)$ for the predicate defined by $RC$. In other words, if $x = (f_1, \ldots, f_n)$, then $RC(x)$ asserts $f_j \leq f_{ij}$ or $f_j \geq f_{ij}$, respectively.

A rule relevant to $x_i$ is a set of rule components, $R = \{RC_1, \ldots, RC_c\}$. We write $R(x)$ for the predicate that is the conjunction of all rule components. The cardinality of the rule is $c$. Notice that, in order to assert equality for some feature, $F_j = f_{ij}$, we need two rule components, namely both $\leq$ and $\geq$, therefore $0 \leq c \leq 2n$. We denote by $\textsc{Inst}_R$ the set of all instances that satisfy $R$:

\[ \textsc{Inst}_R = \{x \mid x \in \textsc{Inst},\ R(x) = 1\} \]
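To make these definitions concrete, here is a minimal Julia sketch of one possible encoding of rule components and rules; the tuple representation (feature index, comparison direction, threshold) is our own illustration, not part of the paper's formalism.

```julia
# A rule component is a tuple (j, op, f) meaning F_j <= f when op == :le, or F_j >= f when op == :ge.
# A rule is a Set of components, interpreted as their conjunction.
component_holds((j, op, f), x) = op == :le ? x[j] <= f : x[j] >= f
rule_holds(R, x) = all(component_holds(rc, x) for rc in R)

# Example: the rule {Age <= 50, AccNum >= 4} over instances encoded as (Age, AccNum, Income) vectors.
R = Set([(1, :le, 50), (2, :ge, 4)])
rule_holds(R, [45, 5, 600])   # true: the instance satisfies the rule
rule_holds(R, [55, 5, 600])   # false: Age > 50 violates the first component
```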

Consider an instance $x_i$ that is classified as "undesired", $C(x_i) \leq 0.5$, and let $R$ be some rule. Rudin and Shaposhnik (Rudin and Shaposhnik, 2019) propose three simple properties that, when satisfied, can be used to offer $R$ as an explanation for the bad outcome on the instance $x_i$:

(1) Relevance: the input instance $x_i$ satisfies $R$, in other words $x_i \in \textsc{Inst}_R$.

(2) Global Consistency: all instances $x$ in $\textsc{Inst}$ that satisfy the rule $R$ are "undesired": $\forall x \in \textsc{Inst}_R$, $C(x) \leq 0.5$.

(3) Interpretability: the rule should be as simple as possible, in other words it should have a small cardinality.

In this paper we consider only rules that are relevant to the instance $x_i$, hence the first property is satisfied by definition. Our goal is: given $x_i$ with a bad outcome, compute one (or several) globally consistent, interpretable rules $R$.

The trivial rule relevant to the instance $x_i$ is the rule $R_{triv}$ that contains all $2n$ rule components relevant to $x_i$: $R_{triv} = \{F_1 \leq f_{i1}, F_1 \geq f_{i1}, \ldots, F_n \leq f_{in}, F_n \geq f_{in}\}$. In other words, $R_{triv}(x)$ asserts that the instance $x$ has exactly the same features as the instance $x_i$. Since $x_i$ is undesired, $R_{triv}$ is globally consistent. However, its cardinality is very large, $2n$, and we say that the trivial rule is not "interpretable". Instead, we seek a minimal set of rule components that is still globally consistent.

In general, checking global consistency is computationally hard. The number of possible instances, $|\textsc{Inst}|$, is exponentially large in the number of features, and checking all of them is intractable. For that purpose, the authors in (Rudin and Shaposhnik, 2019) relax the global consistency requirement and check consistency only relative to the database $D = \{x_1, \ldots, x_m\}$. We call this property Data Consistency: $\forall x_k \in D \cap \textsc{Inst}_R$, $C(x_k) \leq 0.5$. Anchor (Ribeiro et al., 2018) does consider global consistency, but only aims to enforce it "with high probability"; in other words, $\frac{|\{x \in \textsc{Inst}_R \mid C(x) \leq 0.5\}|}{|\textsc{Inst}_R|}$ should be close to 1. As we will see in Section 5, explanations returned by both MinSetCover (Rudin and Shaposhnik, 2019) and Anchor (Ribeiro et al., 2018) often fail to satisfy global consistency.
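Both notions of consistency can be checked as sketched below, reusing `rule_holds` from the sketch in Section 2.1; the sampled global-consistency check mirrors what our algorithms do in practice (Section 4), and the per-feature `domains` argument is an illustrative assumption of this sketch.

```julia
# Data consistency: every instance in D that satisfies R must be classified as "undesired".
data_consistent(R, C, D) = all(C(x) <= 0.5 for x in D if rule_holds(R, x))

# Sampled check of global consistency: draw s instances from Inst (here, uniformly from the
# per-feature value lists in `domains`) and look for a "desired" instance that satisfies R.
function sampled_globally_consistent(R, C, domains; s = 1000)
    for _ in 1:s
        x = [rand(dom) for dom in domains]
        if rule_holds(R, x) && C(x) > 0.5
            return false              # counterexample found: R is not globally consistent
        end
    end
    return true                       # no counterexample among the s samples
end
```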

2.2. Counterfactual Explanation

While a rule-based explanation identifies a set of features whose values necessarily lead to the undesired outcome, a counterfactual explanation identifies some features whose values, when updated, could possibly lead to the desired outcome. Formally, we fix an instance $x_i$ with a "bad" outcome, and define a counterfactual explanation to be some other instance $x_{cf} \in \textsc{Inst}$ with a "good" outcome, $C(x_{cf}) > 0.5$. We often represent $x_{cf}$ by listing only the features where it differs from $x_i$.

A counterfactual $x_{cf}$ is required to satisfy two properties. First, $x_{cf}$ must be feasible and plausible w.r.t. $x_i$. Feasibility imposes constraints on the new values, e.g. income cannot exceed (say) \$1M, while plausibility imposes constraints on how the new values in $x_{cf}$ may differ from the old values in $x_i$, e.g. gender cannot change, or age can only increase, etc. We refer to these predicates as PLAF (plausibility/feasibility) predicates, and denote the conjunction of all PLAF predicates by $P(x_{cf})$. Formally, a PLAF predicate is a formula of the form $\Phi_1 \wedge \cdots \wedge \Phi_m \Rightarrow \Phi_0$, where each $\Phi_i$ is a predicate over the features of $x_i$ and $x_{cf}$. One example from (Schleich et al., 2021) is $\texttt{gender}_{CF} = \texttt{gender}_i$, which asserts that gender cannot change; another example is $\texttt{education}_{CF} > \texttt{education}_i \Rightarrow \texttt{age}_{CF} \geq \texttt{age}_i + 4$, which asserts that, if we ask the customer to get a higher education degree, then we should also increase the age by at least 4 years. Second, we score counterfactuals by how many changes they require over $x_i$. Given a distance function $dist(x, x')$ on $\textsc{Inst}$, the counterfactuals that satisfy the PLAF constraints are ranked by their distance from $x_i$.

A counterfactual explanation system takes as input an instance $x_i$ with a "bad" outcome, a PLAF constraint $P(x)$, and a distance function $dist(x, x')$, and returns a ranked list of counterfactuals that satisfy $P$ and are close to $x_i$.
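The sketch below is a deliberately naive, runnable stand-in for such a system, included only to make the interface concrete; real systems such as GeCo search the space far more cleverly, and the function name, the `plaf` and `dist` callables, and the `domains` argument are our illustrative assumptions, not any system's actual API.

```julia
# Sample random candidates from Inst, keep those with a "good" outcome that satisfy the
# PLAF predicate, and return the k candidates closest to x_i.
function naive_counterfactuals(C, x_i, plaf, dist, domains; k = 10, trials = 10_000)
    found = []
    for _ in 1:trials
        x = [rand(dom) for dom in domains]
        if C(x) > 0.5 && plaf(x_i, x)
            push!(found, x)
        end
    end
    unique!(found)
    sort!(found; by = x -> dist(x_i, x))     # rank by distance to x_i
    return found[1:min(k, length(found))]
end
```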

2.3. Discussion

Different types of explanations provide the users with very different kinds of information. For an intuition into their differences, consider a user Bob who has applied for life insurance, and was denied.

The SHAP score (Lundberg and Lee, 2017), a popular form of explanation, assigns a fraction to each feature, for example: $\texttt{AGE} = 35\%$, $\texttt{BLOOD-PRESSURE} = 20\%$, $\texttt{SMOKING} = 10\%$, \ldots. This defines a clear ranking of the features, but it has limited value for the end user Bob who was denied. We do not consider the SHAP score in this paper.

An example of a counterfactual explanation is: “change SMOKING from true to false”. This has a clear meaning: if Bob quits smoking, he will get approved for life insurance. However, if Bob’s friend Charlie also smokes, yet was approved, then Bob will feel that he was treated unfairly.

A rule-based explanation looks like this: "everybody who has $\texttt{SMOKING} = \texttt{true}$ and $\texttt{BLOOD-PRESSURE} \geq 140$ will be denied". This does not provide Bob with any actionable advice, but it assures him of the fairness of the decision.

3. Duality

A rule-based explanation and a counterfactual explanation provide quite different information to the end user. In both cases, a good explanation is small: a rule relevant to $x_i$ should have only a few rule components, while a counterfactual should differ from the instance $x_i$ in only a small number of features. Several efficient counterfactual explanation systems exist (Karimi et al., 2020; Mothilal et al., 2020; Schleich et al., 2021), but existing rule-based explanation systems sacrifice global consistency for performance (Rudin and Shaposhnik, 2019; Ribeiro et al., 2018). In this section we prove that the two kinds of explanations are duals, and use this property to propose a method for computing rule-based explanations using an oracle for counterfactual explanations.

Before we begin, we briefly explain why the two types of explanations have such different complexities. Fix a small set of features $\mathcal{F} \subseteq \{F_1, \ldots, F_n\}$. These are the features changed by a counterfactual explanation, or defining the rule components of a rule-based explanation. In either case, we want $\mathcal{F}$ to be small, $|\mathcal{F}| = k \ll n$. For illustration, the Yelp dataset in Sec. 5 has $n = 34$ features, and typical explanations involve $k = 10$ features. Suppose we want to check whether we can construct a counterfactual $x_{cf}$ from $x_i$ by changing only features in $\mathcal{F}$. An exhaustive search requires $N^k$ calls to the oracle $C(x_{cf})$, assuming all domains have size $N$. In practice, systems like Geco (Schleich et al., 2021) sample only a few values from each domain $dom(F_j)$; if a counterfactual is found then it is returned, otherwise the system tries a different set of features $\mathcal{F}$. Now suppose we want to check whether the rule $R$ whose rule components are given by the features in $\mathcal{F}$ is globally consistent. Assume for simplicity that, for each $F_j \in \mathcal{F}$, we include both $F_j \leq f_{ij}$ and $F_j \geq f_{ij}$ in $R$. Checking global consistency requires $N^{n-k}$ calls to the classifier, because we need to try all values of all $n-k$ features not in $\mathcal{F}$. Sampling is no longer sufficient. Worse, $k$ is much smaller than $n$, which means that $N^{n-k}$ is very large. Referring again to the Yelp dataset, the naive complexity of a counterfactual explanation is $N^{10}$, while that of a rule-based explanation is $N^{24}$. Instead, we show here how to compute rule-based explanations by using a counterfactual explanation system as a black box. This is possible due to a duality that holds between the two kinds of explanations.
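To make this asymmetry concrete, here is a back-of-the-envelope calculation under the illustrative assumption (ours, not a measurement from the paper) that each domain is sampled at $N = 10$ values, for the Yelp setting with $n = 34$ and $k = 10$:

\[
N^{k} = 10^{10} \qquad \text{versus} \qquad N^{n-k} = 10^{24},
\]

a gap of a factor of $10^{14}$, which is why sampling a handful of values per feature suffices for the existential (counterfactual) check but not for the universal (rule) check.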

3.1. The Duality Theorem

We start with a simple lemma.

Lemma 3.1.

If $R$ is a globally consistent rule, and $x_{cf}$ is any counterfactual, then $R(x_{cf})$ is false.

Proof.

By definition, if $R$ is globally consistent, then for all $x'$: if $R(x')$ is true then the classifier returns the "bad" outcome on $x'$, i.e. $C(x') \leq 0.5$. Also by definition, if $x_{cf}$ is a counterfactual, then the classifier returns the "good" outcome, i.e. $C(x_{cf}) > 0.5$. It follows immediately that $R(x_{cf})$ must be false. ∎

Fix an instance $x_i = (f_{i1}, \ldots, f_{in})$ with a bad outcome. For any other instance $x = (f_1, \ldots, f_n) \in \textsc{Inst}$, we will construct a dual rule $R_x$ consisting of all rule components relevant to $x_i$ that are false on $x$, as follows. If $f_j > f_{ij}$ then we say that the rule component $F_j \leq f_{ij}$ conflicts with $x$; if $f_j < f_{ij}$ then the rule component $F_j \geq f_{ij}$ conflicts with $x$. In other words, an $RC$ conflicts with $x$ iff it is relevant to $x_i$ and $RC(x)$ is false. The dual of $x$ is the rule $R_x$ consisting of all components that conflict with $x$. We combine the rule components in duals with $\vee$ instead of $\wedge$. For a simple example, if $x_i = (F_1 = 10, F_2 = 20, F_3 = 30)$ and $x = (F_1 = 5, F_2 = 90, F_3 = 30)$, then $R_x = (F_1 \geq 10 \vee F_2 \leq 20)$. We prove:

Theorem 3.2 (Duality).

Fix a globally consistent rule $R$ relevant to $x_i$, let $x_{cf,1}, \ldots, x_{cf,k}$ be counterfactual instances, and let $R_{x_{cf,1}}, \ldots, R_{x_{cf,k}}$ be their duals. Then $R$ is a set cover of $\{R_{x_{cf,1}}, \ldots, R_{x_{cf,k}}\}$. In other words, for every counterfactual $x_{cf,m}$ the rule $R$ contains at least one rule component that conflicts with $x_{cf,m}$. Conversely, fix any counterfactual $x_{cf}$, and let $R_1, \ldots, R_k$ be globally consistent rules. Then the dual $R_{x_{cf}}$ is a set cover of $\{R_1, \ldots, R_k\}$.

Proof.

Assume the contrary, that $R$ and $R_{x_{cf,m}}$ have no common rule component. Then $R(x_{cf,m})$ is true, which contradicts Lemma 3.1. The converse is shown similarly: if $R_{x_{cf}}$ is disjoint from some rule, say $R_j$, then $x_{cf}$ satisfies the rule $R_j$, contradicting Lemma 3.1. ∎

The theorem says that globally consistent rules and counterfactuals are duals of each other. We will exploit the first direction of the duality, and show how to use counterfactuals to efficiently compute globally consistent rules.
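The dual construction is purely mechanical. Below is a minimal Julia sketch using the tuple encoding of rule components from the sketch in Section 2.1; the representation is our own and not the paper's implementation.

```julia
# Dual of an instance x with respect to the instance x_i being explained: the set of all
# rule components relevant to x_i that conflict with (are false on) x. A dual is read as
# a disjunction, whereas a rule is read as a conjunction.
function dual_rule(x_i, x)
    components = Set{Any}()
    for j in eachindex(x_i)
        if x[j] > x_i[j]
            push!(components, (j, :le, x_i[j]))   # F_j <= f_ij is violated by x
        elseif x[j] < x_i[j]
            push!(components, (j, :ge, x_i[j]))   # F_j >= f_ij is violated by x
        end
    end
    return components
end

# Example from the text: x_i = (10, 20, 30), x = (5, 90, 30)
dual_rule([10, 20, 30], [5, 90, 30])   # => {(1, :ge, 10), (2, :le, 20)}, i.e. F_1 >= 10 ∨ F_2 <= 20
```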

3.2. Using the Duality Theorem

We now describe how to use a counterfactual explanation system to compute a relevant, globally consistent, and informative rule $R$ for an instance $x_i$. This is the key ingredient of the algorithms we propose in Sections 4.2 and 4.3.

Theorem 3.2 already implies a naive algorithm for this purpose: use a counterfactual system to compute all counterfactuals $x_{cf,1}, \ldots, x_{cf,m}$ for $x_i$, construct the set $S \stackrel{\mathrm{def}}{=} \{R_{x_{cf,1}}, \ldots, R_{x_{cf,m}}\}$ of all their duals, and output all minimal set covers $R$ of $S$. Each set cover $R$ of $S$ is a globally consistent rule because, otherwise, there exists an instance $x$ such that $R(x) = 1$ and $C(x) > 0.5$. This $x$ is a counterfactual for $x_i$, but it is not among $x_{cf,1}, \ldots, x_{cf,m}$ (because it disagrees with each $x_{cf,j}$ on at least one feature), contradicting the assumption that the list of counterfactuals was complete. However, we cannot use this naive algorithm, because counterfactual systems rarely return the complete list of counterfactuals.

Our solution is based on computing the rule $R$ incrementally. Starting with $R = \emptyset$, we repeatedly extend $R$ with new rule components until it becomes globally consistent, as follows. Assume $R$ is any rule relevant to $x_i$, and suppose that $R$ is not globally consistent. We proceed as follows.

Step 1. Construct the predicate $R(x')$ associated with the rule $R$; we will use it as a PLAF predicate in the next step.

Step 2. Using the counterfactual explanation system, find a list of counterfactuals $x_{cf,1}, \ldots, x_{cf,k}$ for $x_i$ that satisfy the PLAF predicate, i.e. $R(x_{cf,j}) = 1$ for all $j = 1, \ldots, k$. The number $k$ is usually configurable, e.g. $k = 10$. If no such counterfactual is found, then $R$ is globally consistent.

Step 3. For each $j = 1, \ldots, k$, compute the dual $R_{x_{cf,j}}$ of the counterfactual $x_{cf,j}$, i.e. the set of all rule components that conflict with $x_{cf,j}$. Notice that $R_{x_{cf,j}}$ is disjoint from $R$, because $x_{cf,j}$ satisfies the PLAF $R(x)$. Let $S = \{R_{x_{cf,1}}, \ldots, R_{x_{cf,k}}\}$ be the set of all dual rules.

Step 4. For each minimal set cover $R_0$ of $S$, construct the extended rule $R \cup R_0$, and repeat the process from Step 1.

We note that our use of PLAF predicates differs from their original intent: rather than constraining the counterfactual, we use them to check whether the rule candidate $R$ is globally consistent and, if not, to extend it.
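Below is a compact Julia sketch of one refinement round (Steps 1 through 4), assuming a counterfactual oracle `find_counterfactuals(x_i, R; k)` with the interface from Section 2.2 that treats the candidate rule `R` as a PLAF constraint, and reusing `dual_rule` from Section 3.1. The helper names are ours, and for brevity the sketch enumerates only minimum-size covers rather than all minimal ones.

```julia
using Combinatorics   # provides combinations()

# One round of Steps 1-4: ask the oracle for counterfactuals that satisfy R, build their
# duals, and return R extended with every smallest set cover of the duals.
# Returns (true, [R]) when no counterfactual exists, i.e. R appears globally consistent.
function refine(R, x_i, find_counterfactuals; k = 10)
    cfs = find_counterfactuals(x_i, R; k = k)              # Steps 1-2
    isempty(cfs) && return (true, [R])
    duals = [dual_rule(x_i, cf) for cf in cfs]             # Step 3
    pool = collect(union(duals...))
    extensions = []
    for size in 1:length(pool)                             # Step 4: try smallest covers first
        for cover in combinations(pool, size)
            if all(!isempty(intersect(Set(cover), d)) for d in duals)
                push!(extensions, union(R, Set(cover)))
            end
        end
        isempty(extensions) || break
    end
    return (false, extensions)
end
```

In Example 3.3 below, a single such round extends the candidate rule with either {Income ≤ 500} or {AccNum ≤ 4, Debt ≥ 10k}.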

Example 3.3.

We illustrate with a simple example. Consider a customer $x_i$ with the following features:

\[ x_i = (\texttt{Age}=50,\ \texttt{AccNum}=4,\ \texttt{Income}=500,\ \texttt{Debt}=10k) \]

Suppose the customer was denied the loan application, and we are computing a rule-based explanation for the denial. Our current candidate rule $R$ is:

\[ R = (\texttt{Age} \leq 50) \wedge (\texttt{AccNum} \geq 4) \]

However, the rule is not globally consistent. We ask the counterfactual explanation system for counterfactuals that satisfy the PLAF defined by the rule $R$, and obtain the following two results. We highlight in red the features where they differ from $x_i$:

\[ x_{cf,1} = (\texttt{Age}=50,\ {\color{red}\texttt{AccNum}=5},\ {\color{red}\texttt{Income}=900},\ \texttt{Debt}=10k) \]
\[ x_{cf,2} = (\texttt{Age}=50,\ \texttt{AccNum}=4,\ {\color{red}\texttt{Income}=600},\ {\color{red}\texttt{Debt}=2k}) \]

The first counterfactual says that if the customer increased her income to 900, then she would be approved, even with 5 open accounts. The second counterfactual says that if she increased her income to 600 and decreased her debt to 2k, then she would be approved. The dual rules are:

\[ R_{x_{cf,1}} = (\texttt{AccNum} \leq 4) \vee (\texttt{Income} \leq 500) \]
\[ R_{x_{cf,2}} = (\texttt{Income} \leq 500) \vee (\texttt{Debt} \geq 10k) \]

There are two minimal set covers, namely $\{\texttt{Income} \leq 500\}$ and $\{\texttt{AccNum} \leq 4,\ \texttt{Debt} \geq 10k\}$. We expand $R$ with each of them and continue recursively. More precisely, the algorithm continues with:

\[ R_1 = (\texttt{Age} \leq 50) \wedge (\texttt{AccNum} \geq 4) \wedge (\texttt{Income} \leq 500) \]
\[ R_2 = (\texttt{Age} \leq 50) \wedge (\texttt{AccNum} = 4) \wedge (\texttt{Debt} \geq 10k) \]

Suppose both are globally consistent. Then we choose $R_1$ as the explanation, because it is more informative: its cardinality is 3, while the cardinality of $R_2$ is 4 (because $\texttt{AccNum} = 4$ represents two rule components). We tell the customer: "everybody 50 years old or younger, with 4 or more open accounts, and with an income of 500 or lower is denied the loan application".

4. Algorithms

We have shown in the previous section that the Duality Theorem leads to a method for computing rule-based explanations by using a counterfactual-based explanation as an oracle. In this section we apply this method to derive a concrete algorithm. More precisely, we describe three algorithms:

GeneticRule: This is a baseline algorithm, which explores the space of rule-based explanations using a genetic algorithm. It does not use counterfactuals.

GeneticRuleCF: This algorithm extends GeneticRule with oracle calls to a counterfactual explanation system in order to generate and validate rule-based explanations.

GreedyRuleCF: This algorithm replaces the genetic search with a greedy search: we greedily expand only the rule with the smallest cardinality in the population, using the counterfactual explanation system as an oracle.

For GeneticRule and GeneticRuleCF we chose a genetic algorithm, a meta-heuristic for constrained optimization based on the process of natural selection. First, it defines an initial population of candidates. Then, it repeatedly selects the fittest candidates in the population and generates new candidates by changing and combining the selected candidates (mutation and crossover). It stops when a certain criterion is met, e.g. when it finds a specified number of solutions. We chose a genetic algorithm because (1) it is easily customizable to our problem of finding rule explanations, (2) it seamlessly integrates counterfactual explanations to generate and verify rules, (3) it does not place any restrictions on the underlying classifier and data, and thus provides black-box explanations, and (4) it returns a diverse set of explanations, whose different rules can give the user more information.

Both GeneticRuleCF and GreedyRuleCF are based on the ideas in Section 3.2: they use a counterfactual explanation oracle to efficiently build up globally consistent rules and to verify whether the generated rules are consistent.

In the remainder of this section we describe the algorithms in detail: GeneticRule in Section 4.1, GeneticRuleCF in Section 4.2, and GreedyRuleCF in Section 4.3. Finally, in Section 4.4 we describe the fitness scoring function used for candidate selection.

Algorithm 1: Pseudo-code of GeneticRule: explain(instance $x$, classifier $C$, dataset $D$)

POP = [{F_1 ≤ f_{i1}}, {F_1 ≥ f_{i1}}, …, {F_n ≤ f_{in}}, {F_n ≥ f_{in}}]
do
    CAND = crossover(POP, c) ∪ mutate(POP, m)
    POP = selectFittest(x, POP ∪ CAND, C, D, q, s)
    TOPK = POP[1:k]
while (∃ R ∈ TOPK : !consistent(R, D, s)) ∨ (TOPK ∩ CAND ≠ ∅)
return TOPK

4.1. GeneticRule

GeneticRule is our "naive" algorithm, which generates rules using a genetic algorithm. The pseudo-code is shown in Algorithm 1. The inputs are an instance $x$, the classifier $C$, and a dataset $D$. The output is a set of rules that explain the instance $x$ for the classifier $C$. In addition, there are five integer hyperparameters: $q > 0$ is the number of rules kept after each iteration, $k \leq q$ is the number of rules that the algorithm returns to the user, $s$ is the number of samples taken from $\textsc{Inst}$ to check for global consistency, and $m$ and $c$ specify the number of new candidates generated during mutation and crossover, respectively. For instance, we use the following settings in most of the experiments: $q = 50$, $k = 5$, $s = 1000$, $m = 3$, $c = 2$. We refer to Sec. 5 for an explanation of why we chose these settings.

GeneticRule first computes the initial population of rule candidates, defined as all possible rule candidates with exactly one rule component. These initial candidates are likely not valid, consistent rules. Thus, GeneticRule repeatedly applies mutate and crossover to generate new candidates, computes the fitness score (via selectFittest) for each candidate, and then selects the $q$ fittest candidates for the next generation. This process is repeated until we find $k$ candidates that are consistent on both the dataset $D$ and $s$ samples from the more general $\textsc{Inst}$ space. We further check that the top-$k$ candidates are not in the set of newly generated candidates CAND, which means that they were stable for at least one generation of the algorithm.

The mutate operator generates $m$ new rule candidates for each candidate $R \in \text{POP}$. First, the operator finds the set $S$ of all rule components that are not part of $R$. Then, it generates each new candidate by sampling (without replacement) a single component from $S$ and adding it to $R$. Adding a single component at a time keeps the cardinality of the rules low and makes it less likely to introduce redundant rule components.

The crossover operator generates $c$ new candidates for each pair of candidates $R_i$ and $R_j$. First, we compute the set $S = R_i \cup R_j$ of all rule components in $R_i$ and $R_j$. Then, we randomly sample $t$ components from $S$ to form a new candidate, where $t = \max(|R_i|, |R_j|) + 1$, in order to keep the cardinality of the new candidate low. We repeat this sampling process $c$ times to generate $c$ new candidates for every pair of candidates in the population.
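An illustrative Julia sketch of the two operators follows, using our own data layout (rules as sets of components, as in the earlier sketches); it is not the paper's implementation.

```julia
using Random   # provides randperm()

# mutate: for a candidate R, create m new candidates by adding one component each,
# sampled without replacement from the components not already in R.
function mutate(R, all_components, m)
    pool = collect(setdiff(all_components, R))
    picks = pool[randperm(length(pool))[1:min(m, length(pool))]]
    return [union(R, Set([p])) for p in picks]
end

# crossover: for a pair (R_i, R_j), create c new candidates by sampling
# t = max(|R_i|, |R_j|) + 1 components from R_i ∪ R_j.
function crossover(Ri, Rj, c)
    S = collect(union(Ri, Rj))
    t = min(max(length(Ri), length(Rj)) + 1, length(S))
    return [Set(S[randperm(length(S))[1:t]]) for _ in 1:c]
end
```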

After generating new candidates via mutation and crossover, the selectFittest operator calculates the fitness score of each candidate in POP, sorts the candidates in descending order of their fitness scores, and returns the top $q$ candidates. We give more details on the fitness scores in Sec. 4.4.

We note that GeneticRule cannot guarantee that the returned rules are globally consistent. Since we only check global consistency on a sample of $\textsc{Inst}$, we can only guarantee that the rules are data consistent and likely to be globally consistent.

4.2. GeneticRuleCF

Algorithm 2: Pseudo-code of GeneticRuleCF: explain(instance $x$, classifier $C$, dataset $D$)

POP = [{F_1 ≤ f_{i1}}, {F_1 ≥ f_{i1}}, …, {F_n ≤ f_{in}}, {F_n ≥ f_{in}}]
POP = POP ∪ CFRules([{ }], x, C, D)
do
    CAND = crossover(POP, c) ∪ mutate(POP, m)
    CAND = CAND ∪ CFRules(POP, x, C, D)
    POP = selectFittest(x, POP ∪ CAND, C, D, q, s)
    TOPK = POP[1:k]
    topk_consistent = (∀ R ∈ TOPK : consistent(R, D, s) ∧ consistentCF(R, x, C, D))
while !topk_consistent ∨ (TOPK ∩ CAND ≠ ∅)
return TOPK

We next explain GeneticRuleCF, which extends GeneticRule with a counterfactual explanation model to generate and verify rule candidates. The pseudocode is provided in Algorithm 2.

The main extension over GeneticRule is the CFRules function. It takes as input a set of rule candidates and generates new candidates by computing counterfactual explanations for each input candidate. As outlined in Sec. 3.2, this process involves multiple steps: it computes the PLAF predicates for a given input candidate, computes counterfactual explanations under those predicates, and finally returns the duals of the counterfactual explanations as new candidates. We use CFRules to extend both the initial population and the candidate pool in the main loop of the genetic algorithm.

We further use the counterfactual model in the consistentCF function, which verifies that the top rule candidates are globally consistent. For a given candidate, the function runs the counterfactual model to try to generate a counterfactual example; if none can be found, we conclude that the rule is globally consistent. As a consequence, GeneticRuleCF provides stronger global consistency guarantees than GeneticRule.

Performance optimizations. Since calling the counterfactual explanation model is expensive, we only run CFRules once every three iterations, or when all top-k candidates are marked as data consistent. This setting gave us the best performance improvements with minimal effect on the generated rules in our experiments.

We further cache whether or not we were able to generate counterfactuals for each rule candidate. This ensures that we only need to run the counterfactual model once per candidate, and not multiple times for CFRules and consistentCF.

The algorithm has an optional post-processing stage (not shown in the pseudocode) that ensures the returned rules have no redundant components. For each returned rule, we remove one rule component at a time and check whether the rule without this component is still verified by consistentCF. If so, the removed component is redundant and is dropped from the rule. We repeat this process until the returned rules have no redundant components.
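A sketch of this post-processing step, assuming `is_consistent(R)` wraps the GeCo-based consistentCF check (the wrapper name is ours):

```julia
# Repeatedly try to drop one component; keep the smaller rule whenever it stays consistent.
function remove_redundant(R, is_consistent)
    changed = true
    while changed
        changed = false
        for rc in collect(R)
            smaller = setdiff(R, Set([rc]))
            if !isempty(smaller) && is_consistent(smaller)
                R = smaller
                changed = true
                break               # restart the scan on the reduced rule
            end
        end
    end
    return R
end
```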

4.3. GreedyRuleCF

Algorithm 3: Pseudo-code of GreedyRuleCF: explain(instance $x$, classifier $C$, dataset $D$)

POP = CFRules([{ }], x, C, D)
POP = sortByCardinality(POP)
while !consistentCF(POP[1], x, C, D) do
    // get and remove the top candidate from POP
    top_cand = pop(POP)
    // generate new candidates only for top_cand
    CAND = CFRules([top_cand], x, C, D)
    POP = sortByCardinality(POP ∪ CAND)
    POP = POP[1:q]
end while
return POP[1]

GreedyRuleCF does not use a genetic algorithm, but instead repeatedly utilizes the underlying counterfactual explanation model to greedily find rule candidates with small cardinality. The pseudocode is provided in Algorithm 3.

GreedyRuleCF generates the initial population by running CFRules on the empty rule candidate, and maintains the population sorted in increasing order of the cardinality of the rule candidates. Then, the algorithm repeatedly takes the candidate with the smallest cardinality, generates new candidates by running CFRules on it, removes the considered candidate from the population, and adds the new candidates. The algorithm stops when the candidate with the smallest cardinality is found to be consistent by consistentCF.

In each iteration, GreedyRuleCF removes the inconsistent rule candidate with the smallest cardinality, and adds candidates whose cardinalities are strictly larger than that of the removed candidate. Therefore, the cardinalities of the rules in the population are monotonically increasing. The algorithm is guaranteed to terminate with a consistent rule, since candidates can have at most $2n$ rule components, where $n$ is the number of features in $D$.
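For intuition only, the greedy loop can be expressed on top of the refine() sketch from Section 3.2; this is our own illustrative rendering, not the paper's Julia implementation, and it conflates the consistentCF check with the CFRules expansion into a single oracle call.

```julia
# Always expand the smallest-cardinality candidate until one is found consistent.
function greedy_rule_cf(x_i, find_counterfactuals; q = 50)
    done, pop = refine(Set(), x_i, find_counterfactuals)
    done && return pop[1]                      # degenerate case: the empty rule is consistent
    while true
        sort!(pop; by = length)                # keep the population ordered by cardinality
        pop = pop[1:min(q, length(pop))]
        top = popfirst!(pop)                   # remove the smallest candidate
        done, ext = refine(top, x_i, find_counterfactuals)
        done && return top                     # no counterfactual found: top is consistent
        append!(pop, ext)                      # otherwise add its strictly larger extensions
    end
end
```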

4.4. Fitness Score Function

We now describe the fitness score used to rank rule candidates in the selectFittest function. For a given rule, the fitness score is based on its degree of consistency and its cardinality (a proxy for interpretability). We define three degrees of consistency:

(1) The rule fails data consistency ($FDC$): some instances in the database $D$ violate it;

(2) The rule fails global consistency ($FGC$): it is data consistent (consistent on all instances in the dataset $D$), but fails for some instances in $\textsc{Inst}$;

(3) The rule is globally consistent ($GC$): the rule is consistent on both the dataset $D$ and the instances sampled from $\textsc{Inst}$.

The fitness score of a rule $R$ is defined as follows. Suppose the database has $m \stackrel{\mathrm{def}}{=} |D|$ instances, each with $n$ features. Let $VD$ denote the number of instances in $D$ that violate $R$. If $VD = 0$, then we sample $s$ instances from $\textsc{Inst}$ and denote by $VS$ the number of samples that violate $R$. The fitness score $score(R)$ is:

\[
score(R) = \begin{cases}
0.25 \times \left(1 - \frac{|R|}{2n}\right) + 0.25 \times \left(1 - \frac{VD}{m}\right), & FDC \\[4pt]
0.25 \times \left(1 - \frac{|R|}{2n}\right) + 0.25 \times \left(1 - \frac{VS}{s}\right) + 0.25, & FGC \\[4pt]
0.25 \times \left(1 - \frac{|R|}{2n}\right) + 0.75, & GC
\end{cases}
\]

The expressions (including the coefficients 0.25 and 0.75) were chosen to ensure that the score function always ranks candidates in a given consistency level higher than the candidates in the levels below. For instance, the score of a globally consistent rule candidate ($GC$) is always higher than those of candidates at levels $FDC$ and $FGC$. If two candidates are in the same level, the one with smaller cardinality is ranked higher. This ranking ensures that we prioritize candidates with better consistency guarantees.
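The score is a direct transcription of the formula above; a minimal Julia sketch, assuming the consistency level and the violation counts VD and VS have already been computed:

```julia
# level is one of :FDC, :FGC, :GC; n = number of features, m = |D|, s = number of samples.
function fitness(R, level; n, m, s, VD = 0, VS = 0)
    interp = 0.25 * (1 - length(R) / (2 * n))      # interpretability term, in [0, 0.25]
    level == :FDC && return interp + 0.25 * (1 - VD / m)
    level == :FGC && return interp + 0.25 * (1 - VS / s) + 0.25
    return interp + 0.75                           # :GC
end
```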

5. Experiments

We evaluate the three algorithms GeneticRule, GeneticRuleCF, and GreedyRuleCF from Section 4, and address the following questions:

(1) Do our algorithms find the correct rules when these rules are known (i.e., when the ground truth is known)?

(2) Do our algorithms find rules for real datasets and machine learning models, and are these rules globally consistent?

(3) How do the quality of the generated rules and the runtime of our algorithms compare to those of the state-of-the-art systems Anchor (Ribeiro et al., 2018) and MinSetCover (Rudin and Shaposhnik, 2019)?

(4) How effective is the integration of counterfactual explanations in the generation of the rules?

Figure 1. The breakdown of the percentage of rules that are consistent and minimal, consistent with redundant components, and inconsistent, for GeneticRule (Gen), GeneticRuleCF (GenCF), GreedyRuleCF (Greedy), Anchor, and MinSetCover, over 1000 synthetic classifiers with 2, 4, 6, and 8 rule components (one panel per cardinality) for the Credit dataset.

5.1. Experiment Setup

In this section, we introduce the datasets, systems, and the setup for all our experiments.

Datasets and Classifiers. We consider four real datasets:

(1) Credit Dataset (Yeh and hui Lien, 2009): used to predict whether customers in Taiwan default on their credit card payments;

(2) Adult Dataset (Kohavi, 1996): used to predict whether the income of adults exceeds \$50K/year, using US census data from 1994;

(3) FICO Dataset (FICO, 2019): used to predict credit risk assessments;

(4) Yelp Dataset (Yelp, 2017): used to predict the review ratings that users give to businesses.

                      Credit          Adult           Fico            Yelp
Number of Instances   30K             45K             10.5K           22.4M
Features              14              12              23              34
Classifier Type       Decision Tree   Decision Tree   Neural Network  Neural Network

Table 1. Key Characteristics for each Real Dataset.

Table 1 presents key statistics for each dataset and the corresponding classifiers we used. Credit and Adult are from the UCI repository (Dua and Graf, 2017) and are commonly used in the machine learning explanation literature; we use decision tree classifiers for them. FICO is from the public FICO Explainable Machine Learning Challenge, which has inspired a lot of research in this field. For the FICO dataset, we use a two-layer neural network classifier, where each layer is defined by logistic regression models. Yelp is the largest dataset we consider, and for it we use a more complex MLPClassifier with 10 layers.

In order to demonstrate whether the systems can recover rules when these are known (i.e., the ground truth is known), we create synthetic classifiers for the Credit dataset. Each classifier is defined by a rule with a given number of rule components, and in the experiments we check whether the explanation algorithm can recover the rule that defines the classifier. As usual, our algorithms do not know the classifier, but access it as a black box. We expect real rule-based explanations to consist of a relatively small number of rule components (under 10; otherwise they are not interpretable by a typical user), so in this synthetic experiment we created classifiers with 2, 4, 6, and 8 rule components, respectively. We repeated our experiments for 1000 randomly generated synthetic classifiers.

In order to ensure that our evaluation is fair, we apply the same preprocessing for all the systems.

Underlying Counterfactual Explanation Model. To identify the best counterfactual explanation model for our algorithms, we benchmarked thirteen different counterfactual explanation models (GeCo (Schleich et al., 2021), Actionable Recourse (Ustun et al., 2019), CCHVAE (Pawelczyk et al., 2020a), CEM (Dhurandhar et al., 2018), CLUE (Antoran et al., 2021), CRUDS (Pawelczyk et al., 2020b), Dice (Mothilal et al., 2020), FACE (Poyiadzi et al., 2020), Growing Spheres (Laugel et al., 2018), FeatureTweak (Tolomei et al., 2017), FOCUS (Lucic et al., 2019), REVISE (Joshi et al., 2019), and Wachter (Wachter et al., 2017)) using the Carla benchmark (Pawelczyk et al., 2021). We report the results of this evaluation in our public GitHub repository: https://github.com/GibbsG/GeneticCF.

We find that GeCo is the only one among these thirteen counterfactual explanation models that can robustly generate counterfactual explanations with flexible PLAF constraints and without redundant feature changes. Therefore, we chose GeCo as the counterfactual explanation model for GeneticRuleCF and GreedyRuleCF, and use it to help verify the global consistency of the rules returned by all of the considered algorithms.

Considered Algorithms. We benchmark our three algorithms, GeneticRule, GeneticRuleCF, and GreedyRuleCF, against two existing systems: Anchor (Ribeiro et al., 2018) and MinSetCover (Rudin and Shaposhnik, 2019).

Anchor (Ribeiro et al., 2018) generates rule-based explanations (i.e., anchors) using a beam-search version of the pure-exploration multi-armed bandit problem. It starts with an empty rule. In each iteration, for each rule, Anchor (1) randomly selects $m$ possible rule components and adds each of them to the rule to create $m$ new rules, (2) evaluates all the rules, and (3) selects the top $n$ rules to keep for the next iteration. It stops when the rules converge. When evaluating whether a rule is consistent, Anchor samples $k$ instances from the whole space that are specified by the rule and considers the rule consistent if all of those $k$ instances are "undesired".

MinSetCover (Rudin and Shaposhnik, 2019) generates rule-based explanations via the minimum set cover problem. It treats the $m$ instances in the database as elements and the binary predicates ($\leq F_i$ or $\geq F_i$) as sets. Finding a minimum rule with data consistency then reduces to finding the minimum number of binary predicates that cover all "undesired" instances. MinSetCover thus reduces the rule-based explanation problem to a minimum set cover problem and solves it using linear programming. Note that this approach only considers the historical database $D$ and does not consider instances outside this database.

Parameter Choice. As discussed in Section 4.1, our systems have five hyperparameters: the number $k$ of rules that the algorithm returns to the user, the number $q$ of rules kept in each iteration, the numbers $m$ and $c$ of new candidates generated during mutation and crossover, and the number $s$ of samples drawn from $\textsc{Inst}$ during evaluation. While $k$ depends on the user's requirements, the other parameters determine the tradeoff between the quality of the returned rules and the time it takes to return them. For instance, if we increase the number $q$ of rules kept in each iteration, we may find rules of higher quality, but we also need to evaluate and verify a larger number of candidates in each iteration of the algorithm. We performed several pilot experiments to find a combination of hyperparameters that returns rules of acceptable quality in reasonable time. We found $q = 50$, $s = 1000$, $m = 3$, $c = 2$ to be the best settings for the experiments with the Adult, Credit, and FICO datasets. For the Yelp dataset, we use $q = 20$, $s = 5000$, $m = 3$, $c = 2$.

For GeneticRuleCF, we enable the optional post-processing stage, but restrict it to the top rule in order to limit the overhead.

Experimental pipeline. The data is pre-processed as required by the classifiers: all categorical variables in the Credit and Yelp datasets are integer encoded, while those in the Adult dataset are one-hot encoded. We use decision tree classifiers for the Credit and Adult datasets, and multi-layer neural networks for the FICO and Yelp datasets. This way we explore different types of variable encodings and classifiers. The pre-processed datasets retain the same number of instances (tuples) as the original data, as shown in Table 1. Recall that one explanation is for one single user, yet in order to provide an explanation for one instance, the system needs to examine at least the entire dataset $D$, or, better, the entire space of instances $\textsc{Inst}$. If a system returns more than one explanation for a user, we retain the top explanation. In short, for one user (instance) we run each system to find rule-based explanations, and retain the top-ranked rule. We measure the run time needed to find the explanation, then evaluate its quality. We repeat this process for 10,000 users (i.e., we compute 10,000 explanations) to get a better sense of the variance of our findings, for all systems. Thus, in our experiments each system returns 10,000 rules, i.e. one explanation per user.

Evaluation Metrics. We use the following two metrics, adapted from our principles for rules, to evaluate the quality of the generated rules: (1) Global Consistency: we cannot find any instance that satisfies the rule and is classified as "desired" by the classifier; (2) Interpretability: the cardinality of the rule (i.e. the number of rule components). Specifically, we check whether the returned rules have redundant components and whether they are minimal. While GeneticRule and GeneticRuleCF generate multiple rule-based explanations for each instance, we only consider the top rule-based explanation in our evaluation.

Setup. We implement the GeneticRule, GeneticRuleCF, and GreedyRuleCF algorithms in Julia 1.5.2. All experiments are run on an Intel Core i7 quad-core CPU (2.90GHz, 64-bit) with 16GB RAM, running macOS Big Sur 11.6.

Figure 2. The breakdown of the percentage of rules that fail global consistency (Failed GC), fail data consistency (Failed DC), are consistent but have redundant components, are consistent and non-redundant but not minimal, and are consistent and minimal, for GeneticRule (Gen), GeneticRuleCF (GenCF), GreedyRuleCF (Greedy), Anchor, and MinSetCover, with one panel each for the Credit, Adult, Fico, and Yelp datasets. We explain 10,000 instances for the Credit, Adult, and Fico datasets, and 100 instances for the Yelp dataset.

5.2. Quality in terms of Consistency and Interpretability

We compare all of the considered algorithms in terms of the quality of the generated rules. We first consider synthetic classifiers, and then evaluate the systems on real classifiers.

Synthetic Classifiers. Recall that here the classifier is itself a rule, and the task of the system is to find an explanation that is precisely that rule. We categorize each rule-based explanation into one of three categories: (1) the rule exactly matches the classifier (consistent and minimal); (2) the rule is consistent but has redundant components, i.e. it is a strict superset of the classifier; or (3) the rule is inconsistent with the classifier, i.e. it misses at least one rule component of the classifier. We report the percentage of rules that fall into each category for each considered algorithm.

Figure 1 presents the results of our evaluation of the five algorithms over 1000 synthetic classifiers with 2, 4, 6, and 8 rule components, respectively, for the Credit dataset.

GeneticRuleCF and GreedyRuleCF can always find the consistent and minimal rules (100%), regardless of the cardinality of the rule. GeneticRule can always find the consistent and minimal rules when the cardinality of the classifiers is small, but it generates some inconsistent rules (11%) for classifiers with 8 rule components. This exemplifies the benefits of including a counterfactual explanation system in the rule generation algorithm.

Neither Anchor nor MinSetCover always finds consistent rules, even allowing redundancy, when the cardinality of the classifier is 2 (61.7% and 44.6%, respectively). For larger cardinalities, they rarely generate consistent rules: only 7.8% of the rules from Anchor and 1.3% of the rules from MinSetCover are consistent when the cardinality of the rule behind the classifier is 6. Anchor is more likely to find consistent rules than MinSetCover, but the rules generated by Anchor are mostly redundant. MinSetCover limits the cardinality of the generated rules and does not return any rules with redundant components. As a result, however, it often returns rules with fewer components than expected. In conclusion, our algorithms outperform both Anchor and MinSetCover in terms of consistency and interpretability for the considered synthetic classifiers.

Real Classifiers. For the real classifiers, we categorize each rule R into one of the following five categories:

  1. Failed data consistency (FDC): there is an instance in the dataset D on which the rule fails.

  2. Failed global consistency (FGC): all instances in D satisfy the rule, but it fails on some instance in Inst.

  3. Globally Consistent (GC) but redundant: the rule holds on all instances in Inst, but has some redundant rule components.

  4. Globally Consistent (GC), non-redundant, but not minimal: the rule is globally consistent and non-redundant, but is not of minimum size: there exists a strictly smaller globally consistent rule.

  5. Globally Consistent (GC) and minimal: the rule is globally consistent and has the smallest number of rule components.

In contrast to synthetic classifiers, we do not know the correct rules for real classifiers. We can, nevertheless, check whether a rule has redundant components by removing one of its rule components and checking whether the remaining rule is still consistent. If so, the removed rule component is redundant, because the corresponding feature is not required. When checking the cardinality of the minimal rules, we check all possible rules, sorted by cardinality, until we find a consistent one. Then, all consistent rules with that cardinality are considered minimal.
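
To make these checks concrete, the following sketch (in Julia, our implementation language) shows both tests; it assumes a consistency oracle is_consistent(rule), a hypothetical function standing for the global-consistency test described next, and represents a rule simply as a vector of rule components.

using Combinatorics   # provides combinations(a, k)

# Redundancy check: a component is redundant if removing it alone
# leaves the rule consistent.
function has_redundant_component(rule::Vector, is_consistent)
    for i in eachindex(rule)
        reduced = [rule[j] for j in eachindex(rule) if j != i]
        if is_consistent(reduced)
            return true      # the removed component was not required
        end
    end
    return false
end

# Minimality check: enumerate candidate rules by increasing cardinality
# until a consistent one is found; that cardinality is minimal.
function minimal_cardinality(components::Vector, is_consistent)
    for k in 1:length(components)
        for candidate in combinations(components, k)
            if is_consistent(candidate)
                return k
            end
        end
    end
    return length(components)
end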

Recall that our test for global consistency runs the counterfactual explanation system (in our case GeCo) as a proxy: we run GeCo subject to the constraints provided by the rule and, if GeCo cannot find a counterfactual explanation, we conclude that the rule is globally consistent.
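
Schematically, the oracle follows the pattern below; this is a sketch, where rule_to_constraints and find_counterfactual are placeholder names for the conversion of the rule into feasible constraints and for the call into GeCo, respectively (the actual GeCo interface is described in (Schleich et al., 2021)).

# Global-consistency oracle: search for a counterfactual only inside the
# region defined by the rule; if no "desired" instance exists there,
# the rule is globally consistent.
function is_globally_consistent(rule, instance, classifier)
    constraints = rule_to_constraints(rule, instance)             # rule -> feasible constraints
    cf = find_counterfactual(classifier, instance, constraints)   # placeholder for the GeCo call
    return cf === nothing
end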

Figure 2 shows our evaluation results for the Credit dataset with a decision tree classifier, the Adult dataset with a decision tree classifier, the Fico dataset with a multi-layer neural network classifier, and the Yelp dataset with an MLP classifier, respectively.

GeneticRuleCF and GreedyRuleCF always find globally consistent rules without redundancy for the Credit and Adult datasets, while they sometimes return inconsistent rules for Fico (9% and 1%) and Yelp (16% and 23%). This is because GeCo is unstable for these datasets and classifiers, and does not always find counterfactual examples. If the underlying counterfactual system is stable, GeneticRuleCF and GreedyRuleCF always find globally consistent rules. Further, GreedyRuleCF is always able to find consistent rules that are minimal. In contrast, GeneticRuleCF finds rules without redundancy, but the rules are not always minimal (the returned rules often have one or two more rule components than the minimal rule). This shows that GreedyRuleCF is the only algorithm that is able to find consistent, non-redundant, and minimal rules.

GeneticRule is not guaranteed to find globally consistent rules. For instance, for the Credit dataset, 37.6% of the rules generated by GeneticRule are not globally consistent. The real classifiers are more complex than the synthetic ones, and GeneticRule only uses a sample from Inst to verify global consistency; it is therefore possible that it finds rules that are consistent on the sample but not on the entire instance space. This further demonstrates the necessity of the counterfactual system for verifying rules, as it helps to explore the instance space more broadly.

In the GeneticRule algorithm, we use a heuristic random selection to choose the rule components, which makes rules less likely to include redundant components compared with Anchor (rules from Anchor usually have 3 or 4 redundant components, while those from GeneticRule have 1 or 2). When using GeCo in the rule-building stage, we ensure that each newly added rule component is necessary and not redundant. This explains the decreasing pattern in the average cardinality of the rules from Anchor and GeneticRule, to GeneticRuleCF, and GreedyRuleCF. MinSetCover always finds rule-based explanations that are data consistent, but rarely globally consistent; thus, most of the rules returned by MinSetCover have small cardinality but are not globally consistent. This further demonstrates that our algorithms can find rules that are both consistent and interpretable, and beat Anchor and MinSetCover on the real datasets and classifiers.

Figure 3 (panels: classifier cardinality 2, 4, 6, and 8). Comparison of runtime for GeneticRule (Gen), GeneticRuleCF (GenCF), GreedyRuleCF (Greedy), Anchor, and MinSetCover over 1000 synthetic classifiers with 2, 4, 6, and 8 rule components for the Credit dataset.

Figure 4 (panels: Credit, Adult, Fico, and Yelp datasets). Comparison of the average runtime (in seconds) per rule for GeneticRule, GeneticRuleCF, GreedyRuleCF, Anchor, and MinSetCover. We explain 10,000 instances for the Credit, Adult, and Fico datasets, and 100 instances for the Yelp dataset.

5.3. Runtime Comparison

We measure the runtime of all considered algorithms for the synthetic and real classifiers. In particular, we investigate how the runtime is affected by the cardinality of the synthetic classifiers, as well as by the sizes of the different datasets.

Synthetic classifiers. Figure 3 shows the runtime of the algorithms on synthetic classifiers with 2, 4, 6, and 8 rule components.

GeneticRule, GeneticRuleCF, and GreedyRuleCF usually consume less time than Anchor and MinSetCover, regardless of the cardinality of the rules behind the classifier. In particular, for classifiers with 8 rule components, GeneticRule and GreedyRuleCF are about 5× faster than Anchor and 1.8× faster than MinSetCover; GeneticRuleCF is about 3.9× faster than Anchor and 1.3× faster than MinSetCover.

We find that the larger the cardinality of the classifier rules, the longer the algorithms take to return a result. This is expected, since it takes more effort to build more complex rules. When the cardinality of the classifier rules increases from 2 to 6, GeneticRule is more than 3× slower, GeneticRuleCF is 2.8× slower, and GreedyRuleCF is only 1.7× slower. GreedyRuleCF is less affected by the increase in cardinality; for instance, it takes almost the same time for classifiers with 4 and 6 rule components. This is because, with counterfactual explanations, the algorithm can add several required rule components to a rule in each iteration, whereas the traditional approach can add only one rule component per iteration.

Real Classifiers. Figure 4 compares the runtime with real classifiers for each considered algorithm and dataset.

For the Credit dataset, the runtimes of all algorithms are similar, with GeneticRule being the fastest and MinSetCover the slowest. Credit has 14 variables and thus the decision tree classifier is relatively small. This demonstrates that our GeneticRuleCF and GreedyRuleCF algorithms can efficiently generate consistent rule-based explanations without extra cost for moderately complex datasets and classifiers.

For Adult, GeneticRule, GeneticRuleCF, and GreedyRuleCF are all significantly faster than Anchor and MinSetCover. For instance, GeneticRule is more than 6× faster than Anchor and MinSetCover, whereas GreedyRuleCF is more than 2× faster. This performance difference can be explained by the fact that the dataset contains many variables that were one-hot encoded during preprocessing, which significantly increases the number of features. Whereas Anchor and MinSetCover scale poorly in the number of features, our algorithms can treat the one-hot encoded columns of a variable as one feature. As a consequence, our algorithms significantly outperform the existing systems in the presence of one-hot encoded variables.
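
As an illustration of this optimization, the sketch below groups one-hot encoded columns back into a single logical feature by a shared name prefix; the prefix convention and the function name are assumptions for illustration, not the exact preprocessing used in our pipeline.

# Group one-hot encoded columns such as "workclass_Private",
# "workclass_State-gov", ... into one logical feature "workclass",
# so that the rule search treats them as a single feature.
function group_onehot(columns::Vector{String})
    groups = Dict{String, Vector{String}}()
    for col in columns
        base = occursin("_", col) ? String(first(split(col, "_"))) : col
        push!(get!(groups, base, String[]), col)
    end
    return groups
end

# Example: three one-hot columns collapse into one logical feature.
group_onehot(["age", "workclass_Private", "workclass_State-gov", "workclass_Federal-gov"])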

For Fico and Yelp, GeneticRuleCF and in particular GreedyRuleCF take much more time. This is mainly because we use a strong verification mechanism in the two algorithms. The verification mechanism prevents the algorithms from stopping until they find a consistent rule, whereas the other algorithms may stop early and return inconsistent rules (c.f. Figure 2).

Another reason for the slower performance is the runtime of the underlying classifier, which is particularly apparent for the Fico dataset. Since our algorithms use a counterfactual explanation system, GeneticRuleCF and GreedyRuleCF call the classifier significantly more often than Anchor and MinSetCover. Thus, if the underlying classifier is slower, GeneticRuleCF and GreedyRuleCF are also much slower. Table 2 presents the runtime of the classifiers and of GeCo on each dataset. For the Fico dataset, the classifier is more than 25× slower than that of the Credit dataset and 13× slower than that of the Adult dataset. As a result, GeCo takes 22.2× and 14.7× more time with the Fico classifier than with the Credit and Adult classifiers, respectively. The increase in GeCo's runtime significantly increases the runtime of GeneticRuleCF and GreedyRuleCF, as they rely on GeCo to verify the rules. We discuss more details in the microbenchmarks in Section 5.4.

The Yelp dataset contains millions of instances and we use a significantly more complex classifier. For this reason, it is much harder for the algorithms to generate and verify rules. This is visible for our algorithms but also for MinSetCover, whose runtime is highly dependent on the number of instances in the dataset. However, even with a slow classifier, as for Fico, or a large dataset, as for Yelp, GeneticRuleCF can always generate rules in time that is at least comparable to, and at times significantly shorter than, that of Anchor and MinSetCover. This shows the benefit of using a genetic algorithm to build the rules with less runtime.

In summary, GeneticRuleCF always finishes in reasonable time and generates high-quality rules, regardless of the classifier speed or the size of the dataset. GreedyRuleCF typically generates rules of the highest quality, and it is fast when the classifier and dataset have moderate size. For complex classifiers over large datasets, however, GeneticRuleCF is more efficient than GreedyRuleCF.

                     Credit    Adult     Fico      Yelp
Classifier Runtime   0.0041    0.0079    0.1079    0.02081
GeCo Runtime         0.0854    0.1294    1.9050    2.0793
Table 2. Runtime of the classifiers to predict 10,000 instances, and of GeCo to explain a single instance, on each dataset.

Figure 5. The breakdown of the average runtime into the main operators of the GeneticRuleCF algorithm.

Figure 6 (Adult dataset). The number of rule candidates explored by GeneticRuleCF, GeneticRule, and GreedyRuleCF.

Figure 7 (Fico dataset). The number of rule candidates explored by GeCo for GeneticRuleCF and GreedyRuleCF.

5.4. Microbenchmarks

In this section, we present the results of the microbenchmarks. We compare the runtime breakdown over the main operators of GeneticRuleCF, the number of rule candidates explored by GeneticRuleCF, GreedyRuleCF, and GeneticRule, and the number of rule candidates explored by the counterfactual explanation system for GeneticRuleCF and GreedyRuleCF. We evaluate our algorithms by explaining 100 instances on the Adult, Credit, and Fico datasets, and 20 instances on the Yelp dataset, with the corresponding classifiers.

Breakdown of Runtime. Figure 5 presents the results for the runtime breakdown. We do not include GreedyRuleCF, since almost all of its runtime comes from the counterfactual system: GreedyRuleCF only uses the counterfactual system, GeCo, to build and verify rules. Prep (i.e. Preparation) captures the runtime to compute the rule space and to build the initial population. Reduction captures the time used to remove the redundant components from the returned rules using the counterfactual explanation system. The runtime for selectFittest (Selection), Crossover, Mutate, and CFRules is accumulated over all iterations.

As discussed in Section 4.2, we run the counterfactual explanation only once, in the function CFRules, for each rule candidate, to optimize performance. This also gives us the information we need for the function consistentCF, which is used in selectFittest. Therefore, we do not run counterfactual explanations in selectFittest, and the time used by CFRules captures the time spent by the counterfactual system in building the rules. The total time used by the counterfactual system is the sum of the runtimes of CFRules and Reduction.
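
Schematically, this optimization amounts to memoizing the result of the single GeCo call per rule candidate, so that the consistency check in selectFittest can reuse it. The sketch below uses hypothetical names (rule_to_constraints, find_counterfactual) and a simple dictionary cache; it is not our exact implementation.

# One counterfactual call per rule candidate; the cached result is reused
# both when building new rules (CFRules) and when checking consistency
# in selectFittest (consistentCF).
const CF_CACHE = Dict{Any,Any}()

function cf_for_rule(rule, instance, classifier)
    get!(CF_CACHE, rule) do                      # assumes rule candidates are hashable
        constraints = rule_to_constraints(rule, instance)
        find_counterfactual(classifier, instance, constraints)
    end
end

# consistentCF simply inspects the cached result: no counterfactual
# inside the rule's region means the candidate is consistent.
consistent_cf(rule, instance, classifier) =
    cf_for_rule(rule, instance, classifier) === nothing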

The results show that CFRules and Reduction are the most time-consuming operations. This is not surprising, since these two operations rely on the counterfactual system, which is costly. Thus, the majority of the runtime is consumed by the counterfactual system. If the runtime of the underlying counterfactual system were reduced, we would expect a large gain in the runtime of our algorithms.

Number of Candidate Rules Explored. Figure 6 shows the number of candidates explored by each of the algorithms. The results show that our optimizations in GeneticRuleCF and GreedyRuleCF effectively limit the total number of rules explored for the Credit and Adult datasets, where the classifier is moderately complex, and guide GeneticRuleCF and GreedyRuleCF to search candidates that are more likely to be consistent. When the classifier is complex, as for the Fico and Yelp datasets, our optimizations prevent GeneticRuleCF and GreedyRuleCF from stopping early with inconsistent rules and push the algorithms to explore more rules until they find a truly consistent one.

The Number of GeCo Runs. Figure 7 presents how many times we run GeCo in each of the algorithms on the four datasets. For moderately sized datasets and classifiers, like Adult and Credit, the numbers of rules explored by GeneticRuleCF and GreedyRuleCF are similar. However, when the datasets and classifiers become large and complex, like Fico and Yelp, the genetic algorithm in GeneticRuleCF significantly reduces the number of runs of the counterfactual system (6 times fewer for the Fico dataset and 35 times fewer for the Yelp dataset). This explains why GeneticRuleCF is more efficient than GreedyRuleCF for large datasets with complex classifiers.

6. Limitations and Future Work

We have illustrated the effectiveness and efficiency of using an underlying counterfactual explanation model to generate rule-based explanations, and compared our approach with other state-of-the-art algorithms. We now discuss its limitations and opportunities for future work.

Bound of Rule Components. In our algorithms, to reduce the search space of the rules, we limit our rules to be strictly related to the values of the input instance: the bound of any rule component must be the corresponding feature value. That is, our rule components can only require a feature to be larger than, smaller than, or equal to its value in the input instance. However, the range of the true rules can be broader. For example, suppose the input instance is x = {F1 = 3} and the classifier C has a rule F1 < 10. Since we use the feature value of F1 as the bound of the rule component, we output the rule F1 ≤ 3, which is narrower than the real rule. Currently, we want to analyze the behavior of the classifier with respect to the input instance, so this strict bound satisfies our expectations. In the future, we may want to relax this strict bound to make the rules more general.

Realistic Feature Value Distributions. In GeCo and in the sampling process of our algorithms (and in many other state-of-the-art Counterfactual Explanation models), we use perturbation distributions to define the instance search space. This is sufficient to probe the behavior of the model and generate interpretable explanations. However, how to estimate such distributions remains a challenging open problem, for example, how to represent the causal dependencies between different features. Designing ways to find such distributions would benefit multiple explanation methods.

Underlying Counterfactual Explanation System. In the experiments, we find that the stability and runtime of our algorithms depend heavily on the underlying counterfactual system. In our implementation, we use GeCo, which is the best counterfactual explanation system we have found (see Appendix A). However, we observed that GeCo can still be costly and unstable in extreme cases, which negatively affects the runtime and stability of our systems. A more efficient and stable counterfactual explanation system would directly translate into a large gain for our systems.

Better Counterfactual Explanation Model. In this paper, we use a counterfactual explanation model to generate rule-based explanations. Conversely, one could also use rule-based explanations to help identify which features need to be changed for counterfactual explanations. Given a well-established rule-based explanation model, this idea could be applied to improve counterfactual explanation models.

Static Data and Classifier. Currently, our rule-based explanation algorithms assume that the underlying data and classifier are static. Therefore, our explanations are sensitive to changes in the data and classifier. We plan to explore how to generate explanations that are robust to small changes in the data distribution or in the classifier. This is related to the more general problem of robust machine learning.

7. Conclusion

Rule-based explanations are highly desirable for automated, high-stakes decisions, yet they are computationally intractable. In this paper we have described a new approach for computing rule-based explanations, which uses a counterfactual explanation system as an oracle. We also use the counterfactual explanation system to robustly verify the global consistency of the rules. We have described a base genetic algorithm (GeneticRule) and two extended algorithms (GeneticRuleCF and GreedyRuleCF) that integrate the results of the counterfactual explanations in order to build globally consistent, informative rule-based explanations. We conducted an extensive experimental evaluation, showing that the rule-based explanations returned by our system are globally consistent and have fewer rule components, i.e. are more informative, than those returned by other systems described in the literature.

Acknowledgements.
This work was partially supported by NSF IIS 1907997 and NSF-BSF 2109922.

References

  • Antoran et al. (2021) Javier Antoran, Umang Bhatt, Tameem Adel, Adrian Weller, and José Miguel Hernández-Lobato. 2021. Getting a {CLUE}: A Method for Explaining Uncertainty Estimates. https://openreview.net/forum?id=XSLF1XFq5h
  • Baehrens et al. (2010) David Baehrens, Timon Schroeter, Stefan Harmeling, Motoaki Kawanabe, Katja Hansen, and Klaus-Robert Müller. 2010. How to Explain Individual Classification Decisions. J. Mach. Learn. Res. 11 (aug 2010), 1803–1831.
  • Chen et al. (2018) Chaofan Chen, Kangcheng Lin, Cynthia Rudin, Yaron Shaposhnik, Sijia Wang, and Tong Wang. 2018. An Interpretable Model with Globally Consistent Explanations for Credit Risk. arXiv:1811.12615.
  • Dhurandhar et al. (2018) Amit Dhurandhar, Pin-Yu Chen, Ronny Luss, Chun-Chen Tu, Paishun Ting, Karthikeyan Shanmugam, and Payel Das. 2018. Explanations Based on the Missing: Towards Contrastive Explanations with Pertinent Negatives. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (Montréal, Canada) (NIPS’18). Curran Associates Inc., Red Hook, NY, USA, 590–601.
  • Dua and Graf (2017) Dheeru Dua and Casey Graf. 2017. UCI Machine Learning Repository. UC Irvine. http://archive.ics.uci.edu/ml/index.php
  • FICO (2019) FICO. 2019. Explainable machine learning challenge. FICO. https://community.fico.com/s/explainable-machine-learning-challenge
  • Joshi et al. (2019) Shalmali Joshi, Oluwasanmi Koyejo, Warut Vijitbenjaronk, Been Kim, and Joydeep Ghosh. 2019. Towards Realistic Individual Recourse and Actionable Explanations in Black-Box Decision Making Systems. arXiv:1907.09615 http://arxiv.org/abs/1907.09615
  • Karimi et al. (2020) Amir-Hossein Karimi, Gilles Barthe, Borja Balle, and Isabel Valera. 2020. Model-agnostic counterfactual explanations for consequential decisions. In AISTATS. 895–905.
  • Kohavi (1996) Ron Kohavi. 1996. Scaling up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (Portland, Oregon) (KDD’96). AAAI Press, Palo Alto, CA, USA, 202–207.
  • Lakkaraju et al. (2016) Himabindu Lakkaraju, Stephen H. Bach, and Jure Leskovec. 2016. Interpretable Decision Sets: A Joint Framework for Description and Prediction. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13-17, 2016, Balaji Krishnapuram, Mohak Shah, Alexander J. Smola, Charu C. Aggarwal, Dou Shen, and Rajeev Rastogi (Eds.). ACM, 1675–1684. https://doi.org/10.1145/2939672.2939874
  • Laugel et al. (2018) Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, and Marcin Detyniecki. 2018. Comparison-Based Inverse Classification for Interpretability in Machine Learning. In Information Processing and Management of Uncertainty in Knowledge-Based Systems. Theory and Foundations - 17th International Conference, IPMU 2018, Cádiz, Spain, June 11-15, 2018, Proceedings, Part I (Communications in Computer and Information Science), Jesús Medina, Manuel Ojeda-Aciego, José Luis Verdegay Galdeano, David A. Pelta, Inma P. Cabrera, Bernadette Bouchon-Meunier, and Ronald R. Yager (Eds.), Vol. 853. Springer, Cádiz, Spain, 100–111. https://doi.org/10.1007/978-3-319-91473-2_9
  • Lucic et al. (2019) Ana Lucic, Harrie Oosterhuis, Hinda Haned, and M. de Rijke. 2019. Actionable Interpretability through Optimizable Counterfactual Explanations for Tree Ensembles.
  • Lundberg and Lee (2017) Scott M. Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In Advances in neural information processing systems (NIPS). 4765–4774.
  • Molnar (2022) Christoph Molnar. 2022. Interpretable Machine Learning (2 ed.). https://christophm.github.io/interpretable-ml-book
  • Mothilal et al. (2020) Ramaravind K. Mothilal, Amit Sharma, and Chenhao Tan. 2020. Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (Barcelona, Spain) (FAT* ’20). Association for Computing Machinery, New York, NY, USA, 607–617. https://doi.org/10.1145/3351095.3372850
  • Pawelczyk et al. (2021) Martin Pawelczyk, Sascha Bielawski, Johannes van den Heuvel, Tobias Richter, and Gjergji Kasneci. 2021. Carla: a python library to benchmark algorithmic recourse and counterfactual explanation algorithms. arXiv:2108.00783.
  • Pawelczyk et al. (2020a) Martin Pawelczyk, Klaus Broelemann, and Gjergji Kasneci. 2020a. Learning Model-Agnostic Counterfactual Explanations for Tabular Data.
  • Pawelczyk et al. (2020b) Martin Pawelczyk, Klaus Broelemann, and Gjergji Kasneci. 2020b. Learning Model-Agnostic Counterfactual Explanations for Tabular Data. In WWW ’20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, Yennun Huang, Irwin King, Tie-Yan Liu, and Maarten van Steen (Eds.). ACM / IW3C2, Taipei, Taiwan, 3126–3132. https://doi.org/10.1145/3366423.3380087
  • Poyiadzi et al. (2020) Rafael Poyiadzi, Kacper Sokol, Raúl Santos-Rodríguez, Tijl De Bie, and Peter A. Flach. 2020. FACE: Feasible and Actionable Counterfactual Explanations. In AIES ’20: AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, February 7-8, 2020, Annette N. Markham, Julia Powles, Toby Walsh, and Anne L. Washington (Eds.). ACM, New York, NY, USA, 344–350. https://doi.org/10.1145/3375627.3375850
  • Ribeiro et al. (2016) Marco Túlio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. ”Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13-17, 2016, Balaji Krishnapuram, Mohak Shah, Alexander J. Smola, Charu C. Aggarwal, Dou Shen, and Rajeev Rastogi (Eds.). ACM, 1135–1144. https://doi.org/10.1145/2939672.2939778
  • Ribeiro et al. (2018) Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2018. Anchors: High-Precision Model-Agnostic Explanations. Proceedings of the AAAI Conference on Artificial Intelligence 32, 1 (Apr. 2018), 1527–1535. https://ojs.aaai.org/index.php/AAAI/article/view/11491
  • Rudin and Shaposhnik (2019) Cynthia Rudin and Yaron Shaposhnik. 2019. Globally-consistent rule-based summary-explanations for machine learning models: Application to credit-risk evaluation. SSRN. https://ssrn.com/abstract=3395422
  • Schleich et al. (2021) Maximilian Schleich, Zixuan Geng, Yihong Zhang, and Dan Suciu. 2021. GeCo: Quality Counterfactual Explanations in Real Time. Proc. VLDB Endow. 14, 9 (may 2021), 1681–1693. https://doi.org/10.14778/3461535.3461555
  • Tolomei et al. (2017) Gabriele Tolomei, Fabrizio Silvestri, Andrew Haines, and Mounia Lalmas. 2017. Interpretable Predictions of Tree-based Ensembles via Actionable Feature Tweaking. In Proceedings of the 23rd International Conference on Knowledge Discovery and Data Mining. ACM, Halifax, NS, Canada, 465–474. https://doi.org/10.1145/3097983.3098039
  • Ustun et al. (2019) Berk Ustun, Alexander Spangher, and Yang Liu. 2019. Actionable Recourse in Linear Classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency (Atlanta, GA, USA) (FAT* ’19). Association for Computing Machinery, New York, NY, USA, 10–19. https://doi.org/10.1145/3287560.3287566
  • Wachter et al. (2017) Sandra Wachter, Brent D. Mittelstadt, and Chris Russell. 2017. Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR. Harv. JL & Tech. 31 (2017), 841. arXiv:1711.00399 http://arxiv.org/abs/1711.00399
  • Wang et al. (2017) Tong Wang, Cynthia Rudin, Finale Doshi-Velez, Yimin Liu, Erica Klampfl, and Perry MacNeille. 2017. A Bayesian Framework for Learning Rule Sets for Interpretable Classification. J. Mach. Learn. Res. 18, 1 (jan 2017), 2357–2393.
  • Yeh and hui Lien (2009) I-Cheng Yeh and Che hui Lien. 2009. The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients. Expert Systems with Applications 36, 2, Part 1 (2009), 2473–2480. https://doi.org/10.1016/j.eswa.2007.12.020
  • Yelp (2017) Yelp. 2017. Yelp Dataset Challenge. Yelp. https://www.yelp.com/dataset/challenge/

Appendix A: Choosing the underlying Counterfactual Explanation model

Both the GeneticRuleCF and GreedyRuleCF algorithms build on top of a Counterfactual Explanation model, and their performance relies heavily on the reliability and efficiency of that model. Therefore, it is crucial for us to find a Counterfactual Explanation model that can support our needs. In this section, we explain how we chose GeCo as our underlying Counterfactual Explanation model.

A.1. Requirements

Here, we describe what our algorithms need from the underlying Counterfactual Explanation model, which guides our choice of which Counterfactual Explanation model to select.

Flexible Feasible Constraints. Whether we build rules or validate rules using counterfactuals, we must first convert the rules into feasible constraints before running the Counterfactual Explanation model. This is how we link our rule-based model to the Counterfactual Explanation model. Moreover, we do not want to fix the feasible constraints once when we import the dataset and classifier; instead, we want to change the feasible constraints corresponding to the rules each time we run the Counterfactual Explanation model. Therefore, the Counterfactual Explanation model should support feasible constraints, and these constraints should be flexible rather than static per dataset and classifier. Since the feasible constraints are generated programmatically and change constantly, we also want a low-overhead and friendly interface for generating them.
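
As a minimal sketch of this conversion, assuming feasible constraints can be expressed as per-feature allowed intervals (the concrete constraint format is specific to the Counterfactual Explanation system), each feature mentioned in the rule is restricted to the interval implied by its rule component, while all other features keep their full domain:

# A rule component is a triple (feature, op, bound), e.g. ("Income", :(>=), 500).
# Features mentioned in the rule are restricted accordingly; all other
# features remain free to change during the counterfactual search.
function rule_to_constraints(rule, feature_domains::Dict{String,Tuple{Float64,Float64}})
    constraints = copy(feature_domains)
    for (feature, op, bound) in rule
        lo, hi = constraints[feature]
        if op == :(<=)
            constraints[feature] = (lo, min(hi, float(bound)))
        elseif op == :(>=)
            constraints[feature] = (max(lo, float(bound)), hi)
        elseif op == :(==)
            constraints[feature] = (float(bound), float(bound))
        end
    end
    return constraints
end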

Black-box Algorithm. Our rule-based algorithms support different kinds of classifiers and treat them as black boxes, and the underlying Counterfactual Explanation model should not break this guarantee. Therefore, it should also support different types of classifiers and be a black-box algorithm with respect to the classifier.

Reliability. Our rule-based algorithms rely heavily on the results of the Counterfactual Explanation model, and we assume that these results are correct. In particular, we assume that (1) the Counterfactual Explanations returned with the correct label are valid Counterfactual Explanations, and (2) the model always returns at least one valid Counterfactual Explanation when valid Counterfactual Explanations exist under the given constraints. We use the first assumption to build our rules in the function CFForRule, and the second one to verify whether a rule is sound. Both are used in Algorithms 2 and 3. Therefore, the Counterfactual Explanation model must match these assumptions.

Effectiveness. As discussed, we implement the function CFForRule based on the idea in Section 3.2. In order to make sure our rules are interpretable (have low cardinality), we also want the generated Counterfactual Explanations to contain no useless feature changes; otherwise, we may introduce useless rule components, which is what we want to avoid. A feature change is useless if, when we substitute that feature value of the Counterfactual Explanation with the corresponding value from the original instance, the resulting instance is still labeled as ”desired”; in a Counterfactual Explanation without useless changes, reverting any single feature change makes the instance ”undesired” again. Therefore, we want the Counterfactual Explanation model to generate Counterfactual Explanations without useless feature changes. In practice, this is closely related to how close the Counterfactual Explanations are to the original instance.
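
A sketch of this check, under the assumption that instances are represented as dictionaries of feature values and that the classifier is a callable returning the predicted label (all names here are illustrative):

# Return the feature changes in counterfactual `cf` of instance `x` that are
# useless: reverting such a change alone does not flip the prediction back
# to "undesired".
function useless_changes(classifier, x::Dict{String,Any}, cf::Dict{String,Any}; desired = 1)
    useless = String[]
    for (feature, value) in cf
        if value != x[feature]                    # this feature was changed
            reverted = copy(cf)
            reverted[feature] = x[feature]        # undo only this change
            if classifier(reverted) == desired    # still "desired" => change not needed
                push!(useless, feature)
            end
        end
    end
    return useless
end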

Efficiency. Our algorithms invoke the Counterfactual Explanation model frequently: every time we generate new rules or verify rules. Therefore, although runtime is not a hard requirement, we want a Counterfactual Explanation model that finishes in a reasonable time, so that the runtime of our algorithms remains reasonable.

A.2. Counterfactual Explanation Models

We considered 13 different Counterfactual Explanation models from the literature, which we briefly discuss below:

AR. AR (Ustun et al., 2019), or Actionable Recourse, is proposed by Ustun et al. to generate minimum-cost actions that flip an ”undesired” instance to a ”desired” one. It builds an approximate linear model of the classifier and uses integer programming to find minimum-cost results (an optimization problem). It supports user-specified constraints such as actionable or immutable features.

CCHVAE. CCHVAE (Pawelczyk et al., 2020a), by Pawelczyk et al., uses a VAE to approximate the structure of the data and searches for counterfactuals in regions of high probability. It does not support feasible constraints.

CEM. CEM (Dhurandhar et al., 2018), proposed by Dhurandhar et al., finds minimum-cost counterfactual instances with elastic-net regularization. It does not support feasible constraints.

CLUE. CLUE (Antoran et al., 2021), by Antorán et al., takes the classifier’s uncertainty into account and uses a VAE to approximate a generative model, searching for counterfactuals in low-uncertainty regions. It supports feasible constraints with respect to the dataset.

CRUDS. CRUDS (Pawelczyk et al., 2020b), suggested by Downs et al., uses a Conditional and Disentangled Subspace Variational Autoencoder to search for counterfactuals over the latent features of interest. It supports user-specified constraints.

DICE. DICE (Mothilal et al., 2020), by Mothilal et al., uses gradient descent to solve an optimization problem that trades off the diversity of the counterfactuals against their distance from the original instance. It supports feasible constraints through feature ranges and immutable features.

FACE. FACE (Poyiadzi et al., 2020), by Poyiadzi et al., uses graphs to find the shortest path through high-density regions. It only considers counterfactuals from the dataset and supports feasible constraints. There are two variants that differ in the graph construction (FACE-EPS with an epsilon-graph and FACE-KNN with a knn-graph).

Growing Spheres (GS). Growing Spheres (Laugel et al., 2018), proposed by Laugel et al., utilizes a random search algorithm that samples around the original instance within growing hyperspheres until a counterfactual is found. Feasible constraints are supported.

FeatureTweak. FeatureTweak (Tolomei et al., 2017), by Tolomei et al., tweaks the input features of a random forest classifier. It mainly works for tree-based classifiers and does not support feasible constraints.

FOCUS. FOCUS (Lucic et al., 2019), by Lucic et al., uses probabilistic approximations of the original tree ensemble and gradient-based optimization to find a counterfactual close to the original instance. It does not support user-specified constraints and only works for tree-based classifiers.

REVISE. REVISE (Joshi et al., 2019), suggested by Joshi et al., uses a variational autoencoder (VAE) to estimate a generative model and searches the latent space to find counterfactuals. It does not support feasible constraints.

Wachter. Wachter (Wachter et al., 2017), by Wachter et al., uses gradient descent to solve the optimization problem of minimizing the 1-norm distance from the counterfactual to the original instance. It does not support feasible constraints.

GeCo. GeCo (Schleich et al., 2021), proposed by Schleich et al., utilizes a heuristic random search algorithm to generate counterfactuals that are close to the original instance and have few feature changes. It supports feasible constraints for immutable features, specific feature ranges, and correlated features.

                Artificial Neural Network          Logistic Regression
           Redund.  Viol.  Success  Time (s)   Redund.  Viol.  Success  Time (s)
DICE         0.50    0.10    1.00     0.13       0.26    0.10    1.00     0.12
AR           0.00    0.00    0.29     2.07       0.00    0.00    1.00    49.06
CCHVAE        NaN     NaN    0.00     2.65       4.16    1.35    1.00     1.14
CEM          3.96    0.00    1.00     0.60       4.34    0.06    1.00     0.53
CLUE         7.79    1.26    1.00     1.55        NaN     NaN    0.00     1.42
CRUDS       11.81    1.31    0.42     4.80      11.83    1.35    1.00     4.16
FACE_KNN     4.97    1.42    1.00     6.10       3.94    1.37    0.99     6.15
FACE_EPS     5.15    1.47    0.98     6.27       3.89    1.43    0.74     6.31
GS           3.82    0.00    1.00     0.01       3.37    0.00    1.00     0.01
REVISE        NaN     NaN    0.00     7.40      10.56    1.22    0.92     7.32
WACHTER      4.44    1.00    0.50    15.52       1.10    1.00    1.00     0.02
GeCo         0.00    0.00    1.00     0.66       0.00    0.00    1.00     0.53
Table 3. Results from the Carla benchmark on 12 Counterfactual Explanation systems for the Adult dataset, with Artificial Neural Network and Logistic Regression classifiers using the Tensorflow and Pytorch frameworks.
                   Artificial Neural Network + SKlearn     Logistic Regression + XGBoost
                  Redund.  Viol.  Success  Time (s)     Redund.  Viol.  Success  Time (s)
FOCUS               4.04    1.00    1.00     0.15         4.04    1.00    1.00     0.33
FEATURE_TWEAKING    0.99    0.37    0.45     0.00         0.00    0.00    0.81     0.02
GeCo                0.00    0.00    1.00     0.10         0.00    0.00    1.00     0.14
Table 4. Results from the Carla benchmark on 3 Counterfactual Explanation systems for the Adult dataset, with an Artificial Neural Network classifier on SKlearn and a Logistic Regression classifier on XGBoost.
Figure 8. L0 and L1 distances from the Carla benchmark on 12 Counterfactual Explanation systems for the Adult dataset, with Artificial Neural Network and Logistic Regression classifiers using the Tensorflow and Pytorch frameworks.

Figure 9. L0 and L1 distances from the Carla benchmark on 3 Counterfactual Explanation systems for the Adult dataset, with an Artificial Neural Network classifier on SKlearn and a Logistic Regression classifier on XGBoost.

A.3. Carla as the Comparison Benchmark

Carla (Pawelczyk et al., 2021) is a library used to benchmark Counterfactual Explanations on common datasets and classifiers. We use Carla to compare the 13 Counterfactual Explanation models (15 variations in total) discussed above. We run the benchmark on the Adult dataset (Kohavi, 1996) with the provided ANN (Artificial Neural Network with 2 hidden layers and ReLU activation) classifier and LR (linear model with no hidden layer and no activation) classifier on the Tensorflow, Pytorch, SKlearn, and XGBoost frameworks. Only GeCo supports all four frameworks. FOCUS and FeatureTweak only support SKlearn and XGBoost, while none of the other systems except GeCo supports these two. Therefore, we run GeCo, FOCUS, and FeatureTweak on the SKlearn and XGBoost frameworks, and run GeCo together with all other systems on the Tensorflow and Pytorch frameworks.

The results from Carla are shown in Table 3 and Figure 8 for the Tensorflow and Pytorch frameworks, and in Table 4 and Figure 9 for the SKlearn and XGBoost frameworks.

Carla reports six metrics that we are interested in: (1) the average number of redundant features (Redund.) represents the number of useless feature changes; our rule-based models want this to be as small as possible. (2) The average number of violations (Viol.) represents the number of features that we set to be immutable but that the generated counterfactual changes; a nonzero value means that the system does not properly support feasible constraints, and our rule-based models want this to always be zero. (3) Success represents the rate at which the system successfully generates Counterfactual Explanations; our rule-based models want this to always be one (always successful). (4) Time represents the average runtime of the Counterfactual Explanation system to generate a counterfactual for one instance; ideally, our rule-based models want this to be small. (5) The l0 distance represents the average number of feature changes per counterfactual and should be as small as possible. (6) The l1 distance represents how close the counterfactual is to the original instance.
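
For reference, the l0 and l1 distances can be computed as in the simple sketch below over numeric feature vectors (Carla may additionally normalize these values):

# l0: number of features changed; l1: total absolute change.
l0_distance(x::Vector{<:Real}, cf::Vector{<:Real}) = count(x .!= cf)
l1_distance(x::Vector{<:Real}, cf::Vector{<:Real}) = sum(abs.(x .- cf))

# Example: two features changed, total absolute change 2.5.
l0_distance([3.0, 1.0, 0.5], [3.0, 2.0, 2.0])   # -> 2
l1_distance([3.0, 1.0, 0.5], [3.0, 2.0, 2.0])   # -> 2.5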

As discussed above, the ideal Counterfactual Explanation system should support feasible constraints, support all types of classifiers and machine learning frameworks, always return valid counterfactuals when possible, not return redundant or useless feature changes, and finish in a reasonable time. Looking at the results, only AR, GS, and GeCo support feasible constraints without violating the immutable features (rules). Only AR and GeCo return Counterfactual Explanations without redundancy. Only DICE, CEM, GS, FOCUS, and GeCo always return Counterfactual Explanations. Only GeCo fulfills all the requirements we want. Even though its runtime is not the best, it finishes in less than a second, which is reasonable for our use. Therefore, we choose GeCo as our underlying Counterfactual Explanation system.