This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

11institutetext: Skema Business School, Montreal, Canada

Answer-Set Programs for Repair Updates and Counterfactual Interventions

Leopoldo Bertossi Millennium Institute for Foundational Research on Data (IMFD, Chile) & Professor Emeritus Carleton University, Canada.   [email protected]
Abstract

We briefly describe -mainly through very simple examples- different kinds of answer-set programs with annotations that have been proposed for specifying: database repairs and consistent query answering; secrecy view and query evaluation with them; counterfactual interventions for causality in databases; and counterfactual-based explanations in machine learning.

Repair Programs and Consistent Query Answering.

When a relational database fails to satisfy a given set of integrity constraints (ICs) and one wants to obtain semantically correct query answers, one can specify and simultaneously query virtually repaired, i.e. consistent, versions of the database. The so obtained certain answers are considered to be the consistent answers [1, 5]. Answer-set programs (ASPs) with annotations can be used for this task [4, 17]. The motivation for this approach comes from earlier work where annotated predicate logic, a form of multi-valued logic extended with non-monotonic negation, that was used for this task [2].

Example 1

The following is a denial constraint (DC), i.e. it prohibits combinations or joins of database atoms:  κ:¬xy(S(x)R(x,y)S(y))\kappa\!:\neg\exists x\exists y(S(x)\wedge R(x,y)\wedge S(y)).  The following database instance DD violates κ\kappa.

RR A B
ι1\iota_{1} a4a_{4} a3a_{3}
ι2\iota_{2} a2a_{2} a1a_{1}
ι3\iota_{3} a3a_{3} a3a_{3}
SS A
ι4\iota_{4} a4a_{4}
ι5\iota_{5} a2a_{2}
ι6\iota_{6} a3a_{3}

We use global tuple identifiers (tids) to refer to individual tuples. They appear in predicates’ first arguments followed by a semicolon.

Subset-repairs (S-repairs) of DD are consistent w.r.t. κ\kappa, and minimally differ from DD under set inclusion. They are:  D1=D_{1}= {R(ι1;a4,a3),\{R(\iota_{1};a_{4},a_{3}), R(ι2;a2,a1),R(\iota_{2};a_{2},a_{1}), R(ι3;a3,a3),R(\iota_{3};a_{3},a_{3}), S(ι4;a4),S(ι5;a2)}S(\iota_{4};a_{4}),S(\iota_{5};a_{2})\},   D2={R(ι2;a2,a1),S(ι4;a4),D_{2}=\{R(\iota_{2};a_{2},a_{1}),S(\iota_{4};a_{4}), S(ι5;a2),S(\iota_{5};a_{2}), S(ι6;a3)}S(\iota_{6};a_{3})\},  and D3=D_{3}= {R(ι1;a4,a3),R(ι2;a2,a1),S(ι5;a2),S(ι6;a3)}\{R(\iota_{1};a_{4},a_{3}),R(\iota_{2};a_{2},a_{1}),S(\iota_{5};a_{2}),S(\iota_{6};a_{3})\}.

The repair program contains the atoms in DD, and the rules:

S(t1;x,𝖽)R(t2;x,y,𝖽)S(t3;y,𝖽)\displaystyle S^{\prime}(t_{1};x,{\sf d})\vee R^{\prime}(t_{2};x,y,{\sf d})\vee S^{\prime}(t_{3};y,{\sf d}) \displaystyle\leftarrow S(t1;x),R(t2;x,y),S(t3;y).\displaystyle S(t_{1};x),R(t_{2};x,y),S(t_{3};y).
S(t;x,𝗌)\displaystyle S^{\prime}(t;x,{\sf s}) \displaystyle\leftarrow S(t;x),𝑛𝑜𝑡S(t;x,𝖽).\displaystyle S(t;x),\ {\it not}\ S^{\prime}(t;x,{\sf d}).
R(t;x,y,𝗌)\displaystyle R^{\prime}(t;x,y,{\sf s}) \displaystyle\leftarrow R(t;x,y),𝑛𝑜𝑡R(t;x,y,𝖽).\displaystyle R(t;x,y),{\it not}\ R^{\prime}(t;x,y,{\sf d}).

Here, t1,,t,x,y,t_{1},\ldots,t,x,y,\ldots are variables, but the annotation 𝖽{\sf d} is a constant indicating that the tuple is deleted from DD. Annotation constant 𝗌{\sf s} indicates that the tuple stays in the repair. Here, the first rule captures in its body (i.e. antecedent) a violation of κ\kappa, and the head (i.e. the consequent) offers the alternative tuple deletions that can solve the violation.  The last two rules specify that repairs keep the original tuples that have not been deleted. Predicates RR^{\prime} and SS^{\prime} are nicknames for RR and SS, with an extra argument for the annotation.

This repair-program has three stable models, with repair D1D_{1} corresponding to the model M1={R(ι1;a4,a3,𝗌),R(ι2;a2,a1,𝗌),M_{1}=\{R^{\prime}(\iota_{1};a_{4},a_{3},{\sf s}),R^{\prime}(\iota_{2};a_{2},a_{1},{\sf s}), R(ι3;a3,a3,𝗌),S(ι4;a4,𝗌),R^{\prime}(\iota_{3};a_{3},a_{3},{\sf s}),S^{\prime}(\iota_{4};a_{4},{\sf s}), S(ι5;a2,𝗌),S^{\prime}(\iota_{5};a_{2},{\sf s}), S(ι6;a3,𝖽)}DS^{\prime}(\iota_{6};a_{3},{\sf d})\}\cup D, in the sense that D1D_{1} can be read off from M1M_{1} by keeping only the tuples annotated with s. \Box

If there are interacting ICs, for which repair actions for one of them may affect satisfaction of another one, it is necessary to use a couple of extra annotations to capture a transition process [17]. Doing CQA becomes skeptical (or certain) reasoning with the repair program. A system for CQA based on repair programs that runs on the DLV system [23] is described in [17].

In some cases, it is more convenient to specify and compute S-repairs that are also cardinality repairs (C-repairs), i.e. that differ from the original instance by a minimum number of tuples [5]. In Example 1, (only) they can be obtained by adding to the program the following weak program constraints  (WCs):  R(t,x¯),R(t,x¯,𝖽)\sim R(t,\bar{x}),R^{\prime}(t,\bar{x},{\sf d})  and  S(t,x¯),S(t,x¯,𝖽)\sim S(t,\bar{x}),S^{\prime}(t,\bar{x},{\sf d}).  Violations, i.e. instantiations of what comes after \sim, should not happen. However, they are accepted as long as they are kept to a minimum [23]. With these WCs the number of deleted tuples is minimized.

Repair programs are quite general and can be produced for any set of first-order ICs. Their complexity and expressive power are just right for the intrinsic complexity of repair-related computational tasks and CQA [17, 5]. Repair programs have been extended for CQA from virtual data integration systems [14], and to peer data exchange systems [7]. Other ASP-based approaches to database repairs and CQA can be found in [12, 21, 19].

Virtual Updates for Secrecy Views.

Repairs that replace attribute values by null values that behave as in SQL [6, 12] can also be used to hide the contents a view. Intuitively, a database, for which the contents a view is expected to stay hidden, offers as query answers only the certain (or consistent) answers from the set of secrecy instances (the repairs) that have the view empty or containing tuples with nulls only. Details about these repair and secrecy programs can be found in [6, 9].

Example 2

(example 1 cont.) Consider the same database D{D}, and Vκ(x,y):S(x)R(x,y)S(y)V_{\!\kappa}(x,y):S(x)\land R(x,y)\land S(y) defining the view whose contents we want to hide from users. In order to to do this, we repair w.r.t. the associated DC κ\kappa by “minimally” changing attribute values by a constant NULL. Since it behaves as in SQL, it cannot be used to satisfy a join.  There are two repairs.

RR A B
ι1\iota_{1} a4a_{4} a3a_{3}
ι2\iota_{2} a2a_{2} a1a_{1}
ι3\iota_{3} a3a_{3} a3a_{3}
SS A
ι4\iota_{4} a4a_{4}
ι5\iota_{5} a2a_{2}
ι6\iota_{6} 𝖭𝖴𝖫𝖫{\sf NULL}
RR A B
ι1\iota_{1} a4a_{4} 𝖭𝖴𝖫𝖫{\sf NULL}
ι2\iota_{2} a2a_{2} a1a_{1}
ι3\iota_{3} a3a_{3} 𝖭𝖴𝖫𝖫{\sf NULL}
SS A
ι4\iota_{4} a4a_{4}
ι5\iota_{5} a2a_{2}
ι6\iota_{6} a3a_{3}

In each of them NULL is preventing the satisfaction of a join in VκV_{\!\kappa}. In both cases, the set of value changes is minimal under set inclusion.

A program rule that achieves this result is:  S(t1,𝖭𝖴𝖫𝖫)R(t2,𝖭𝖴𝖫𝖫,y)R(t2,x,𝖭𝖴𝖫𝖫)S(t3,𝖭𝖴𝖫𝖫)S(t1,x),R(t2,x,y),S(t3,y)S^{\prime}(t_{1},{\sf NULL})\vee R^{\prime}(t_{2},{\sf NULL},y)\vee R^{\prime}(t_{2},x,{\sf NULL})\vee S^{\prime}(t_{3},{\sf NULL})\leftarrow S(t_{1},x),R(t_{2},x,y),S(t_{3},y). \Box

Counterfactual Programs for Causality in Databases.

In [24], actual causality [22] was applied to define and compute database tuples that are causes for a query to be true. Furthermore, causal responsibility [18] was used to assign numerical scores to causes, to reflect their strength as such. A detailed analysis of causality in databases was carried out in [8], where a useful connection with database repairs was unveiled. As consequence, repair program can be used to specify and compute causes [9].

Example 3

(example 1 cont.)  With the same database D{D}, the query 𝒬κ:xy(S(x)R(x,y)S(y)){\mathcal{Q}_{\kappa}\!:\ \exists x\exists y(S(x)\land R(x,y)\land S(y))}  is true in DD.  Tuple t6t_{6} is a counterfactual cause for 𝒬κ\mathcal{Q}_{\!\kappa} to be true in D{D}, in the sense that if t6t_{6} is removed from D{D}, 𝒬κ\mathcal{Q}_{\!\kappa} is no longer true. Its responsibility is 11, the highest possible.  Tuple t1t_{1} is an actual cause since deleting it from DD together with its contingent tuple t3t_{3} makes the query false. Its responsibility is   11+|Γ|=11+1=12\frac{1}{1+|\Gamma|}=\frac{1}{1+1}=\frac{1}{2}, with Γ\Gamma the smallest set of its contingent tuples.  Similarly, t3t_{3} and t4t_{4} are actual causes, with responsibility 12{\frac{1}{2}}.

When QκQ_{\!\kappa} is true, equivalently the IC κ\kappa is false, in order to obtain causes, tuples have to be deleted from DD, for which a repair program for κ\kappa can be used. Causes’ tids can be retrieved by means of the rules: 𝐶𝑎𝑢𝑠𝑒(t)R(t,x,y,𝖽){\it Cause}(t)\leftarrow R^{\prime}(t,x,y,{\mathsf{d}}) and 𝐶𝑎𝑢𝑠𝑒(t)S(t,x,𝖽){\it Cause}(t)\leftarrow S^{\prime}(t,x,{\mathsf{d}}).  In order to obtain their contingency tuples, one can use, e.g. 𝐶𝑜𝑛𝑡(t,t)R(t,x,y,𝖽),R(t,u,v,𝖽),tt{\it Cont}(t,t^{\prime})\leftarrow R^{\prime}(t,x,y,{\mathsf{d}}),R^{\prime}(t^{\prime},u,v,{\mathsf{d}}),t\neq t^{\prime}, collecting tids that have to de deleted together with tt.  Using set-building and aggregations, one can compute contingency sets and their cardinalities [15, 16], and then, also responsibilities. With WCs one can concentrate on minimum cardinality contingency sets [9]. \Box

Counterfactual Programs for Explainable ML.

Consider entity records represented by atomic formulas E(t;e¯,𝗈)E(t;\bar{e},\mathsf{o}), with an id tt, a sequence of feature values e¯\bar{e}, and an annotation o for “original record”. They are labeled, say with 0 or 11, by a black-box or open-box classifier represented by a predicate 𝐶𝑙(t,){\it Cl}(t,\ell): record with id tt received label \ell.  A particular entity E(𝐭;𝐞¯;𝗈)E(\mathbf{t};\mathbf{\bar{e}};\mathsf{o}) receives label 11, i.e. 𝐶𝑙(𝐭,1){\it Cl}(\mathbf{t},1) is true, and we want to explain this by counterfactually intervening 𝐞¯\mathbf{\bar{e}}, changing feature values, trying to obtain label 0. For each feature we do this, but value changes for other features may be necessary (as with actual causality above). A responsibility score can be assigned to each feature value that depends on the additional required changes (similar but much more general than causal responsibility above) [10, 11]. These are also called score-based attributive or contrastive explanations in explainable AI.

The process of iteratively intervening an entity until it switches label can be specified by means of counterfactual ASPs [12, 13]. For the gist, we give some of the rules in such a program. Since an entity may go trough an iteration of feature value changes, we need an annotation, \mathfrak{\star}, to indicate it is in transition, with the rules:  E(𝐞;x¯;)E(𝐞;x¯;o)E(\mathbf{e};\bar{x};\mathfrak{\star})\leftarrow E(\mathbf{e};\bar{x};\mbox{\sf o}) and E(𝐞;x¯;)E(𝐞;x¯;do)E(\mathbf{e};\bar{x};\mathfrak{\star})\leftarrow E(\mathbf{e};\bar{x};\mbox{\sf do}), where annotation do indicates a single counterfactual change.

The main rule,  oooE(t;x1,x2,,xn;do)E(t;x1,x2,,xn;do)\mbox{\phantom{ooo}}E(t;x_{1}^{\prime},x_{2},\ldots,x_{n};\mbox{\sf do})\vee\cdots\vee E(t;x_{1},x_{2},\ldots,x_{n}^{\prime};\mbox{\sf do})\ \longleftarrow E(t;x¯;),𝐶𝑙[t;1],𝐷𝑜𝑚1(x1),,𝐷𝑜𝑚n(xn),x1x1,,xnxn,𝑐ℎ𝑜𝑖𝑐𝑒(x¯;x1),E(t;\bar{x};\mathfrak{\star}),{\it Cl}[t;1],{\it Dom}_{1}(x_{1}^{\prime}),\ldots,{\it Dom}_{n}(x_{n}^{\prime}),x_{1}^{\prime}\neq x_{1},\ldots,x_{n}^{\prime}\neq x_{n},{\it choice}(\bar{x};x_{1}^{\prime}), ,𝑐ℎ𝑜𝑖𝑐𝑒(x¯;xn)\ldots,{\it choice}(\bar{x};x_{n}^{\prime}), specifies that while the label is not switched, a single feature value is non-deterministically [20] replaced by a new one from the feature domain.  One eventually stops when the label has been switched:  E(t;x¯;s)E(t;x¯;do),𝐶𝑙(t,0).E(t;\bar{x};\mbox{\sf s})\ \leftarrow\ E(t;\bar{x};\mbox{\sf do}),{\it Cl}(t,0).

Acknowledgments:   Part of this work was funded by “ANID - Millennium Science Initiative Program” - Code ICN17002; and “Centro Nacional de Inteligencia Artificial” CENIA, FB210017, Financiamiento Basal para Centros Científicos y Tecnológicos de Excelencia de ANID, Chile.

References

  • [1] Arenas, M., Bertossi, L. and Chomicki, J.  Consistent Query Answers in Inconsistent Databases. In Proc. ACM PODS 1999, pp. 68-79.
  • [2] Arenas, M., Bertossi, L. and Kifer, M.  Applications of Annotated Predicate Calculus to Querying Inconsistent Databases.  Proc. Computational Logic 2000, Springer LNCS 1861, pp. 926-941.
  • [3] Arenas, M., Bertossi, L. and Chomicki, J. Answer Sets for Consistent Query Answers.  Theory and Practice of Logic Programming, 2003, 3(4&5):393-424.
  • [4] Barcelo, P. and Bertossi, L.  Logic Programs for Querying Inconsistent Databases.  Proc. PADL 2003, Springer LNCS 2562, pp. 208-222.
  • [5] Bertossi. L.  Database Repairing and Consistent Query Answering.  Synthesis Lectures in Data Management. Morgan & Claypool, 2011.
  • [6] Bertossi, L. and Li, L.  Achieving Data Privacy through Secrecy Views and Null-Based Virtual Updates.  IEEE Trans. Knowledge and Data Engineering, 2013, 25(5):987-1000.
  • [7] Bertossi, L. and Bravo, L.  Consistency and Trust in Peer Data Exchange Systems.  Theory and Practice of Logic Programming, 2017, 17(2):148-204.  Extended version as Corr Arxiv Paper cs.DB/1606.01930.
  • [8] Bertossi, L. and Salimi, B. From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back.  Theory of Computing Systems, 2017, 61(1):191-232.
  • [9] Bertossi, L.  Specifying and Computing Causes for Query Answers in Databases via Database Repairs and Repair Programs.  Knowledge and Information Systems, 2021, 63(1):199-231.
  • [10] Bertossi, L., Li, J., Schleich, M., Suciu, D. and Vagena, Z.   Causality-Based Explanation of Classification Outcomes. In Proceedings of the Fourth Workshop on Data Management for End-To-End Machine Learning, DEEM@SIGMOD 2020, pages 6:1-6:10, 2020.
  • [11] Bertossi, L.  Score-Based Explanations in Data Management and Machine Learning: An Answer-Set Programming Approach to Counterfactual Analysis.  In Reasoning Web. Declarative Artificial Intelligence 2021, Springer LNCS 13100, 2022, pp. 145-184.
  • [12] Bertossi, L.  Declarative Approaches to Counterfactual Explanations for Classification. Theory and Practice of Logic Programming. 2021. https://doi.org/10.1017/S1471068421000582. arXiv Paper 2011.07423.
  • [13] Bertossi, L. and Reyes, G.  Answer-Set Programs for Reasoning about Counterfactual Interventions and Responsibility Scores for Classification.  In Inductive Logic Programming, Proc. 1st International Joint Conference on Learning and Reasoning (IJCLR’21), Springer LNAI 13191, 2022, pp. 41-56.
  • [14] Bravo, L. and Bertossi, L. Disjunctive Deductive Databases for Computing Certain and Consistent Answers to Queries from Mediated Data Integration Systems. Journal of Applied Logic, 2005, 3(2):329-367.
  • [15] Calimeri, F., Cozza, S., Ianni, G. and Leone, N.  Computable Functions in ASP: Theory and Implementation. Proc. ICLP 2008, Springer LNCS 5366, pp. 407-424.
  • [16] Calimeri, F., Cozza, S., Ianni, G. and Leone, N.  An ASP System with Functions, Lists,and Sets. Proc. LPNMR 2009, Springer LNCS 5753, pp. 483-489.
  • [17] Caniupan, M. and Bertossi, L.  The Consistency Extractor System: Answer Set Programs for Consistent Query Answering in Databases.  Data & Knowledge Engineering, 2010, 69(6):545-572.
  • [18] Chockler, H. and Halpern, J.   Responsibility and Blame: A Structural-Model Approach. J. Artif. Intell. Res., 2004, 22:93-115.
  • [19] Eiter, T., Fink, M., Greco, G. and Lembo, D.  Repair Localization for Query Answering from Inconsistent Databases.  ACM Trans. Database Syst., 2008, 33(2):10:1-10:51.
  • [20] Giannotti, F., Greco, S., Sacca, D. and Zaniolo, C.  Programming with Non-Determinism in Deductive Databases.  Annals of Mathematics in Artificial Intelligence, 1997, 19(1-2):97-125.
  • [21] Greco, G., Greco, S. and Zumpano, E.  A Logical Framework for Querying and Repairing Inconsistent Databases.  IEEE Trans. Knowl. Data Eng., 2003, 15(6):1389-1408.
  • [22] Halpern, J. and Pearl, J.   Causes and Explanations: A Structural-Model Approach. Part I: Causes. The British Journal for the Philosophy of Science, 2005, 56(4):843-887.
  • [23] Leone, N., Pfeifer, G., Faber, W., Eiter, T., Gottlob, G., Perri, S. and Scarcello, F.  The DLV System for Knowledge Representation and Reasoning.  ACM Transactions on Computational Logic, 2006, 7(3):499-562.
  • [24] Meliou, A., Gatterbauer, W., Moore, K. F. and Suciu, D. The Complexity of Causality and Responsibility for Query Answers and Non-Answers. Proc. VLDB 2010, pp. 34-41.