Example-based Synthesis of Static Analysis Rules

Pranav Garg AWS AI410 Terry Avenue NorthSeattleWA98109USA [email protected] and Srinivasan Sengamedu SHS AWS AI410 Terry Avenue NorthSeattleWA98109USA [email protected]

Abstract.

Static Analysis tools have rules for several code quality issues and these rules are created by experts manually. In this paper, we address the problem of automatic synthesis of code quality rules from examples. We formulate the rule synthesis problem as synthesizing first order logic formulas over graph representations of code. We present a new synthesis algorithm RhoSynth that is based on Integer Linear Programming-based graph alignment for identifying code elements of interest to the rule. We bootstrap RhoSynth by leveraging code changes made by developers as the source of positive and negative examples. We also address rule refinement in which the rules are incrementally improved with additional user-provided examples. We validate RhoSynth by synthesizing more than 30 Java code quality rules. These rules have been deployed as part of a code review system in a company and their precision exceeds 75% based on developer feedback collected during live code-reviews. Through comparisons with recent baselines, we show that current state-of-the-art program synthesis approaches are unable to synthesize most of these rules.

Program synthesis, Programming by example

^†^†journal: PACMPL^†^†journalvolume: 1^†^†journalnumber: CONF^†^†article: 1^†^†journalyear: 2018^†^†publicationmonth: 1^†^†copyright: none^†^†ccs: Software and its engineering General programming languages^†^†ccs: Social and professional topics History of programming languages

1. Introduction

Software is an integral part of today’s life and low code quality has large costs associated with it. For example, the Consortium for Information & Software Quality notes that “For the year 2020, the total Cost of Poor Software Quality (CPSQ) in the US is $2.08 trillion.”¹¹1https://www.it-cisq.org/pdf/CPSQ-2020-report.pdf Hence automated tools to detect code quality issues are an important part of software development. Developers use a variety of tools to improve their software quality, ranging from linters (e.g., SonarLint) to general-purpose code analysis tools (e.g., FindBugs, ErrorProne and Facebook Infer) to specialized tools (e.g., FindSecBugs for code security). There are many facets to code quality and often developers want to develop custom analyzers which detect code quality issues that are of specific interest to them.

A common approach for developing custom analyzers or checkers is expressing them in a domain specific language (DSL), such as Semmle’s CodeQL²²2https://codeql.github.com/ or Semgrep³³3https://semgrep.dev/. Code analysis tools such as SonarQube and Semgrep have thousands of rules that have been manually developed by the community of users or the product owners. Development of these rules is a continuous process: new rules are added to guide usage of new APIs or frameworks, and existing rules evolve as the APIs evolve. Further, expressing code checks in a DSL requires specialized knowledge and extensive testing. On the whole, the development of rules is tedious and expensive, and the amount of required human intervention limits the scalability of this manual approach.

In this paper, we address the problem of developing code quality rules from labeled code examples. These examples can be provided by rule authors in the form of conforming and violating examples. In most cases, providing these examples is easier than writing the rule itself. In fact, often, these rules are developed following test-driven development (TDD) (Beck, 2002), where the conforming and violating code examples are used to guide the development of these rules (Ryzhkov, 2011). Rules from various tools such as SonarQube and Semgrep, actually, come with such labeled test code examples. Additionally, we can leverage code changes as the source of labeled examples. In a corpus of code changes, common code quality issues will have multiple code changes that fix those issues. Similar code changes that are performed by multiple developers across projects can be used to obtain examples to synthesize code quality rules. For a code change (code-before, code-after) that fixes a code quality issue, code-before is a violating example and code-after is a conforming example.

We propose RhoSynth, an algorithm for automatically synthesizing high-precision rules from labeled code examples. RhoSynth reduces the rule synthesis problem to synthesizing first order logic formulas over Program Dependence Graphs (PDGs). Using a PDG based program representation not only allows our approach to be more robust to syntactic variations in code, but it also enables a succinct expression of semantic information such as data-flow relation and control-dependence. Compared to existing approaches that synthesize bug-fixes on Abstract Syntax Trees (ASTs) (Bader et al., 2019; Rolim et al., 2018, 2017; Meng et al., 2013), synthesizing rules over a graph representation comes with its own challenge. Unlike trees, where a node can be uniquely identified with the path from the root of the tree to itself, there is no such unique id for nodes in a graph. This ambiguity in identifying a node, consequently identifying a subgraph, makes the synthesis problem hard (Haussler, 1989). We use Integer Linear Programming (ILP) (Schrijver, 1986; Lerouge et al., 2017) to solve this problem that lies at the core of rule synthesis.

We express rules as first order logic formulas comprising an existentially quantified precondition and an existentially quantified postcondition. Specifically, rules have the following format: $\exists\vec{x}.pre(\vec{x})\wedge\neg\big{(}\bigvee_{i}\exists\vec{y}.post_{i}(\vec{x},\vec{y})\big{)}$ , where $\vec{x}$ and $\vec{y}$ denotes a set of nodes in the PDG. The precondition captures the applicability of the rule in violating code examples and the postcondition captures the pattern that must be present in conforming code, if the code example satisfies the precondition. This is a rich format that can express a wide range of code quality issues. We synthesize such rules by first synthesizing a precondition $\exists\vec{x}.pre(\vec{x})$ over violating examples, and then synthesizing the postcondition $\exists\vec{y}.post_{i}(\vec{x},\vec{y})$ that accepts conforming examples satisfying $pre(\vec{x})$ for a given valuation $\vec{x}$ . Through this decomposition, we simplify the rule synthesis problem to synthesizing existentially quantified precondition and postcondition formulas, which are themselves synthesized using ILP. Further, the postcondition formula may contain disjunctions. We propose a top-down entropy-based algorithm to partition conforming examples into groups and synthesize a conjunctive postcondition disjunct for each group.

Recently, there has been lots of work on automated program repair from code changes (Bader et al., 2019; Rolim et al., 2018, 2017; Bavishi et al., 2019; Meng et al., 2013). While code changes themselves are a good source of examples for synthesizing code quality rules, there may be variations in correct code that are not captured in code changes. We propose rule refinement to incrementally improve the rule by providing additional examples corresponding to such code variations. It turns out, both ILP-based graph alignment and disjunctive postcondition synthesis are instrumental in leveraging "unpaired" code examples for refining the rules.

The main contributions of the paper are as follows.

(1)

We formulate the problem of synthesizing rules from labeled code examples as synthesis of logical formulas over graph representations of code. We present a novel algorithm RhoSynth that is based on ILP based graph alignment, for identifying nodes in the graph that are relevant to the rule being synthesized.
(2)

We propose rule refinement which improves the precision of rules based on a small number of labeled false positive examples.
(3)

We validate our algorithm by synthesizing more than 30 Java code-quality rules from labeled code examples obtained from code changes in GitHub packages. These rules have been deployed as part of a code review system in a company. We validate these synthesized rules based on offline evaluation as well as live code review feedback collected over a period of several months. The precision of synthesized rules exceeds 75% in production. Rule refinement improves the precision of rules by as much as 68% in some cases.
(4)

We show that recent baselines can synthesize only 22% to 61% of the rules. In addition, we show that, compared to ILP-based graph alignment, commonly used tree-differencing approaches do not perform well when aligning unpaired code examples for rule refinement and this results in rules not being synthesized in a majority of cases.

The paper is organized as follows. Section 2 formally defines the rule synthesis problem and provides an outline of the approach. Section 3 describes the representation for programs and rules. Section 4 describes the rule synthesis algorithm. Section 5 describes the implementation details. Section 6 describes the experimental results. Section 7 presents related work and Section 8 concludes the paper.

2. RhoSynth Overview

This section precisely defines the problem and provides an overview of the approach with a running example.

2.1. Problem Statement

We are given a set of violating or buggy code examples $\mathscr{V}=\{V_{1},\cdots,V_{m}\}$ and a set of conforming or non-buggy code examples $\mathscr{C}=\{C_{1},\cdots,C_{n}\}$ , for a single code quality issue. The problem is to synthesize a rule $R$ from a subset of examples such that $R(V_{i})$ = True and $R(C_{j})$ = False, for all $i$ and $j$ in the held-out test set. We consider code examples at the granularity of a method. This offers sufficient code context to precisely capture a wide range of code quality issues (Tufano et al., 2018; Sobreira et al., 2018), while at the same time being simple enough to facilitate efficient rule synthesis.

The corpus of code changes made by developers is a natural source of positive and negative examples. In such a corpus, common code quality issues will have multiple code changes that fix those issues. It is possible to obtain examples for single code quality issues in an automated manner by clustering code changes (Kreutzer et al., 2016; Paletov et al., 2018; Bader et al., 2019). Code changes in the input consist of pairs $(B_{i},A_{i})$ where $B_{i}$ is the code-before and $A_{i}$ is the code-after. To synthesize a rule $R$ , we consider code-befores as violating examples and code-afters as conforming examples.

A user can also provide additional conforming examples corresponding to variations in correct code that may not be captured by the code changes.

2.2. RhoSynth Steps

Consider the two code changes shown in Figure 1(a)-(b). The code before the change does not handle the case when the cursor accessing the result set of a database query is empty. Without this check, the app might crash when subsequent operations are called on the cursor (e.g., getString call on line $8$ in Figure 1(a). In these code changes, the developer adds this check by handling the case when Cursor.moveToFirst() returns False.

(a) ⬇ 1 ... 2 Cursor cursor = cr.query(...); 3 - cursor.moveToFirst(); 4 + if ( $!$ cursor.moveToFirst()) { 5 + cursor.close(); 6 + return null; 7 + } 8 final String id = cursor.getString(0); (b) ⬇ 1 ... 2 - mProviderCursor.moveToFirst(); 3 + if ( $!$ mProviderCursor.moveToFirst()) { 4 + return; 5 + } 6 do { 7 if(mProviderCursor.getLong(...) == id) { 8 ...

Figure 1. Examples of input code changes.

As mentioned earlier, RhoSynth first synthesizes the precondition from buggy code examples and then synthesizes the postcondition from non-buggy code examples.

Rule Precondition Synthesis: The precondition synthesis uses only buggy code examples. We perform a graph alignment on their PDG representations to know the correspondence between nodes in different examples. The graph alignment is framed as an ILP optimization problem. In our

Refer to caption — Figure 2. Unified Annotated PDG (UAPDG) representation that captures *all* code-before’s in the input. Dashed lines indicate that the corresponding nodes and edges are present in only a subset of the examples.

example, graph alignment determines that the data variables cursor in Figure 1(a) and mProviderCursor in Figure 1(b) correspond to each other. Similarly, the calls moveToFirst correspond. On the other hand, call getString in the first example does not have any corresponding node in the second example. We use this node correspondence map to construct a Unified Annotated PDG (UAPDG) representation that encapsulates information from all buggy examples. Figure 2 partially illustrates this UAPDG $\mathscr{A}$ . The solid lines in the figure indicate that the corresponding nodes and edges are present in all buggy examples. We project $\mathscr{A}$ to these solid nodes and edges and obtain $\mathscr{A}_{c}$ , which is shown in Figure 4(a). $\mathscr{A}_{c}$ corresponds to the precondition formula $\exists x_{0},x_{1}.\textit{pre}(x_{0},x_{1})$ where $\textit{pre}(x_{0},x_{1})$ is described in Figure 4(d). Besides other checks, the precondition asserts that the output of moveToFirst call is ignored, i.e., it is not defined or not used.

⬇

1 ...

2 Cursor c = sql.query(...);

3 c.moveToFirst();

4 while (!c.isAfterLast()) {

5 Record record = cursorToCognitoRecord(c);

6 recordList.add(record);

7 c.moveToNext();

8 }

9 ...

(a)

⬇

1 ...

2 Cursor c = cr.query(...);

3 if (c.getCount() == 1) {

4 c.moveToFirst();

5 final String key = c.getString(...);

6 ...

7 } else {

8 Log.debug("...");

9 ...

10 }

(b)

Figure 3. Non-buggy code examples used for refining the originally synthesized rule

Rule Postcondition Synthesis: We now find all non-buggy code examples that satisfy the precondition. We use a satisfiability solver for this. If there are such examples, we synthesize a postcondition from them and strengthen the precondition with $\neg\textit{post}$ so that the overall rule rejects the non-buggy examples. There are no code-after examples in the input that satisfy the precondition, since the value returned by moveToFirst is used in all of these examples. Consequently, we synthesize a vacuous postcondition = False and the overall rule is the same as the precondition we synthesized above.

Rule Refinement: In several cases, the initial set of examples does not capture all code variations. In case of our current rule, the following checks also check the emptiness of the result set:

(1)

Cursor.getCount() == 0.
(2)

Cursor.isAfterLast() returns True.

These variations are not part of the initial examples. When we run the synthesized rule on code corpus, we encounter examples such as the ones shown in Figure 3(a)-(b) that check the cursor using these code variations. Note these examples are not accompanied by buggy code. We propose rule refinement that uses these additional non-buggy examples to improve the rule. Specifically, we re-synthesize the postcondition by constructing an UAPDG $\mathscr{A}_{\textit{pc}}$ , in a way similar to the UAPDG construction in rule precondition synthesis. However, it turns out that $\mathscr{A}_{\textit{pc}}$ is too general and accepts even the buggy examples. We use this as a forcing function to partition the non-buggy examples and synthesize a conjunctive postcondition for each partition.

Figure 4(a)-(c) illustrate the UAPDGs for the synthesized precondition and the two disjuncts in the postcondition. Figure 4(d) provides the exact rule, expressed as a logical formula. Informally, the synthesized $R$ satisfies all code examples that call Cursor.moveToFirst such that they do not check the value returned by this call, nor call Cursor.isAfterLast or Cursor. getCount. When we run this rule again on the code examples, it correctly accepts all the buggy examples and rejects all non-buggy examples, including the additional examples that were used for rule refinement.

3. Program and Rule Representation

In this section, we describe the PDG based representation of code (Section 3.1), the rule syntax (Section 3.2) and introduce Unified Annotated PDGs (UAPDGs) for representing rules (Section 3.3).

3.1. Code Representation

We represent code examples at a method granularity using PDGs (Ferrante et al., 1987). PDG is a labeled graph that captures all data and control dependencies in a program. Nodes in the PDG are classified as data nodes and action nodes. Data nodes are, optionally, labeled with the data types and values for literals, and action nodes are labeled with the operations they correspond to, for e.g., method name for method call nodes, etc. Edges in the PDG correspond to data-flow and control dependencies and are labeled as recv (connects receiver object to a method call node), para_i (connects the $i^{th}$ parameter to the operation), def (connects an action node to the data value it defines), dep (connects an action node to all nodes that are directly control dependent on it) and throw (connects a method call node to a catch node indicating exceptional control flow). See Figure 6 in Appendix B for the PDG of code-after in Figure 1(a).

\begin{array}[]{rlll}\textrm{Rule :}&R&\stackrel{{\scriptstyle\textit{def}}}{{=}}&\exists\vec{x}.\textit{pre}(\vec{x})\wedge\neg\big{(}\bigvee\limits_{i}\exists\vec{y}.\textit{post}_{i}(\vec{x},\vec{y})\big{)},\textit{ where }\\ &&&\hskip 8.5359pt\textit{pre}(\vec{x})\in\varphi(\vec{x}),\hskip 4.26773pt\textit{ and }\hskip 4.26773pt\textit{post}_{i}(\vec{x},\vec{y})\in\varphi(\vec{x},\vec{y})\end{array}

\begin{array}[]{rl}\textrm{Conjunctive}&\\ \textrm{Subrule:}&\varphi(\vec{x})::=\bigwedge_{i}\varphi^{i}(\vec{x})\mid\epsilon(\vec{x})\mid\eta(\vec{x})\mid\texttt{\small True}\mid\texttt{\small False}\\ \textrm{Edge predicate:}&\epsilon(\vec{x})::=~{}~{}~{}x_{1}\xrightarrow{e}x_{2}\\ \textrm{Node predicate:}&\eta(\vec{x})::=~{}~{}~{}\textit{label(x)}=l\mid\textit{data-type(x)}=R\\ &\mid\textit{data-value(x)}=s\mid\textit{num-para(x)}=i\\ &\mid\textit{declaring-type(x)}=R\\ &\mid\textit{trans-control-dep(x)}\supseteq 2^{Label}\\ &\mid\textit{output-ignored(x)}\mid\neg\textit{output-ignored(x)}\end{array}

\begin{array}[]{c}\vec{x}=\{x_{1},\cdots\},\vec{y}=\{y_{1},\cdots\},x_{i},y_{i}\in\textit{Var},l\in\textit{Label},\\ s\in\textit{String},i\in\textit{Int},R\textit{ is regular expression},\\ e\in\{\textit{recv, para${}_{i}$, def, dep, throw}\}\end{array}

Figure 5. Syntax of Rules

3.2. Rule Syntax

In this work, we express rules as quantified first-order logic formulas over PDGs (refer to Figure 5 for a detailed syntax of rules). A rule is a formula of the form $\exists\vec{x}.{pre}(\vec{x})\wedge\neg\big{(}\bigvee\exists\vec{y}.post(\vec{x},\vec{y})\big{)}$ , where $\vec{x}$ and $\vec{y}$ are a set of quantified variables that range over distinct nodes in a PDG. The precondition $\textit{pre}(\vec{x})$ evaluates to True on buggy code and the postcondition $\textit{post}_{i}(\vec{x},\vec{y})$ evaluates to True on correct code, with appropriate instantiations for $\vec{x}$ , $\vec{y}$ . Because of the negation before the postcondition, the entire formula evaluates to True on buggy code and False on correct code. Intuitively, the precondition captures code elements of interest in buggy, and possibly correct, code, and the postcondition captures the same in correct code. The elements appear as existentially quantified variables. This is a rich format that can express a wide range of code quality issues.

Formulas $pre(\vec{x})$ and $post_{i}(\vec{x},\vec{y})$ are quantifier-free sub-rules comprising a conjunction of atomic edge predicates $\epsilon(\vec{x})$ that correspond to edges $x_{1}\xrightarrow{e}x_{2}$ in the PDG, and atomic node predicates $\eta(\vec{x})$ . $\eta({\vec{x}})$ express various properties at PDG nodes including the node label, data-type, data values for literals, number of parameters for method calls (num-para(x)), declaring class type for static method calls (declaring-type(x)), the set of nodes on which $x$ is transitively control dependent (trans-control-dep(x)) and, finally, whether a method call’s output is/is not ignored (output-ignored(x) and its negation)⁴⁴4Predicate output-ignored(x) is True for a method call when it does not return a value or the returned value has no users, and False otherwise..

Note, we exclude disjunctions in preconditions since a rule with a disjunctive precondition can be expressed as multiple rules without a disjunctive precondition, in the rule syntax. Further, a rule with a precondition comprising a negated code pattern (e.g., $\neg x_{1}\xrightarrow{e}x_{2}$ ) is expressed using a positive pattern in the rule postcondition, and vice-versa. The rule syntax, while mostly captures first-order logic properties over PDGs, includes some higher-order properties, e.g., transitive control dependence.

Checking if a rule satisfies a code example represented as a PDG can be reduced to satisfiability modulo theories (SMT) (Barrett and Tinelli, 2018). Since PDGs are finite graphs, this satisfiability check is decidable. By mapping nodes in the PDG to bounded integers, this check can be reduced to satisfiability in the Presburger arithmetic. Moreover, state-of-the-art SMT solvers such as Z3 (De Moura and Bjørner, 2008) can discharge these checks efficiently.

3.3. Rule Representation

Rules are formulas that accept or reject code examples represented as PDGs. From the rule syntax (Figure 5), a rule is expressed using a precondition and a postcondition. Both the precondition and the postcondition can be expressed as a collection of quantified formulas $\mathcal{P}_{Q}=\exists\vec{y}.\varphi(\vec{x},\vec{y})$ , where $\varphi$ is a conjunctive subrule defined over free variables $\vec{x}$ and bound variables $\vec{y}$ . To represent rules, we need a representation for $\mathcal{P}_{Q}$ . We arrive at a representation for $\mathcal{P}_{Q}$ , from a PDG representation, in two steps. We first introduce Valuated PDGs (VPDGs; also, sometimes, referred as valuated examples) that are PDGs extended with variables. VPDGs capture a single PDG that is accepted by $\mathcal{P}_{Q}$ . We then introduce Unified Annotated PDGs (UAPDGs) that accept sets of VPDGs and represent the quantified formula $\mathcal{P}_{Q}$ .

Let Var be a set of variables. A Valuated PDG is a PDG with a subset of its nodes mapped to distinct variables in Var. Informally, a Valuated PDG is a PDG with a valuation for variables.

As mentioned above, a Unified Annotated PDG is defined over a set of free variables $\textit{Var}_{f}\subseteq\textit{Var}$ and bound variables $\textit{Var}_{b}\subseteq\textit{Var}$ , and represents an existentially quantified conjunctive formula over PDGs of the form $\exists\textit{Var}_{b}.\varphi(\textit{Var}_{f},\textit{Var}_{b})$ . Note that $\varphi$ is a conjunctive subrule that is defined over a set of node predicates ( $\eta(\vec{x})$ in Figure 5). These node predicates are expressed using lattices annotating different nodes in a UAPDG (see Section 5 for details). Formally,

Definition 1 (Unified Annotated PDG).

Given a distinct set of free variables $\textit{Var}_{f}$ and bound variables $\textit{Var}_{b}$ , a $\textit{Unified Annotated PDG}=(N,E,Lat:N\rightarrow(Lattice_{1}\times\cdots\times Lattice_{k}),M_{f}:N\mapsto\textit{Var}_{f},M_{b}:N\mapsto\textit{Var}_{b})$ , where $Lat(n)$ is the set of node predicates at $n$ expressed using lattices, and $M_{f}$ and $M_{b}$ are maps from a disjoint subset of its nodes to distinct variables in $\textit{Var}_{f}$ and $\textit{Var}_{b}$ , respectively.

Note that bound variables $\textit{Var}_{b}$ can be permuted in the map $\textit{M}_{b}$ without affecting the semantics of a UAPDG. On the other hand, variables $\textit{Var}_{f}$ cannot be permuted or renamed in the map $\textit{M}_{f}$ , since these variables are free (they are bound to an outer existential quantifier in the rule $R$ ). We call nodes in $\textit{M}_{f}$ , mapped to variables in $\textit{Var}_{f}$ , frozen since the variables mapped to these nodes are fixed for a given UAPDG.

For constructing a UAPDG from a set of VPDGs, we need to first identify corresponding nodes in VPDGs and then unify the node predicates at the corresponding nodes. We identify corresponding nodes through graph alignment using ILP (see Section 4.1) and the unification of node predicates is performed by generalization over the lattice. We describe the use of UAPDG representation to synthesize rule preconditions and postconditions in Section 4.

Notation: We use italicized font to denote a single example or a PDG or a Valuated PDG (e.g., C, E, V) and script font to denote sets of examples or a Unified Annotated PDG (e.g., $\mathscr{A}$ , $\mathscr{C}$ , $\mathscr{E}$ , $\mathscr{V}$ ). Also, we use superscripts, e.g., $\mathscr{A}^{\vec{x}}$ , to denote the set of free variables over which a UAPDG or VPDG is defined ( $\mathscr{A}^{\emptyset}$ denotes UAPDG with no free variables).

Example 2.

Figure 4(b) illustrates a UAPDG defined over free variables $\{x_{0},x_{1}\}$ and bound variables $\{y_{1},y_{2},y_{3},y_{4}\}$ . The label at each node in the figure visualizes the value of maps $\textit{M}_{f}$ , $\textit{M}_{b}$ and Lat at that node. In this UAPDG, let $n_{1}$ be the node with data type "Cursor" and $n_{2}$ be the node with label "moveToFirst". Then, at node $n_{1}$ , $\textit{M}_{f}(n_{1})=x_{0}$ , $\textit{Lat}(n_{1})=(\textit{data-type}=\textit{"Cursor"},\textit{data-value}=\top,\cdots)$ and map $\textit{M}_{b}$ is undefined. Further, we say that nodes $n_{1}$ and $n_{2}$ are frozen in this UAPDG since they are mapped to free variables $x_{0}$ and $x_{1}$ respectively.

Translating a UAPDG to an existentially quantified formula over graphs: A UAPDG naturally translates into an existentially quantified conjunctive formula over Valuated PDGs. If $\textit{Var}_{f}$ is the set of free variables, and $\textit{Var}_{b}$ is the set of bound variables, the translation of the UAPDG is a formula $\exists\textit{Var}_{b}.\varphi(\textit{Var}_{f},\textit{Var}_{b})$ , where $\varphi$ is the conjunction of all satisfying atomic node and edge predicates expressed over $\textit{Var}_{f}$ and $\textit{Var}_{b}$ .

Example 3.

Refer to Figure 4(d) for the quantified rule that corresponds to the UAPDGs in Figure 4(a)-(c). Specifically, the UAPDG in Figure 4(b) is equivalent to the formula:
$\exists y_{1},\cdots,y_{4}.post_{1}(x_{0},x_{1},y_{1},\cdots,y_{4})$ .

Constructing a UAPDG from a Valuated PDG: We can construct a UAPDG from a Valuated PDG or a PDG, which can be seen as a Valuated PDG with an empty node-variable map, in the following manner:

•

Inherit the set of nodes $N$ and the set of edges $E$ from the Valuated PDG.
•

Construct the map $Lat$ by iterating over all nodes $n$ in the Valuated PDG and computing the lattice values for all node predicates at $n$ .
•

Initialize $M_{f}$ with the node-variable map in the Valuated PDG.
•

Construct $M_{b}$ by mapping all nodes outside the domain of $M_{f}$ to distinct variables in $\textit{Var}_{b}$ (these nodes are bound locally to an existential quantifier in the formula that captures the UAPDG).

Note that a UAPDG obtained by enhancing a PDG (or Valuated PDG), in the above way, corresponds to a formula that accepts the PDG (or Valuated PDG). Henceforth, in the text, we use UAPDG to refer to both the graph structure obtained by enhancing PDGs or Valuated PDGs and the existentially quantified formulas they corresponds to, interchangeably.

4. Rule Synthesis Algorithm

The outline of the rule synthesis algorithm is as follows. We first synthesize the precondition based on violating examples. We then synthesize the postcondition based on conforming examples that satisfy the precondition. If the postcondition is not satisfied by any violating example, the synthesis is complete. If a violating example satisfies a postcondition, we follow a hierarchical partitioning approach: pick a conjunct which is satisfied by the violating example, split the underlying partition based on an entropy-based algorithm and recursively synthesize post-conditions for the new partitions. The final postcondition is the disjunction of postconditions of the leaf partitions.

Proc synthesizeRule ( $\mathscr{V},\mathscr{C}$ )

1 pre(

\vec{x}

\mathscr{V}^{\vec{x}}

:= getConjunctiveSubRule(

\mathscr{V}^{\emptyset}

\mathscr{C^{\prime}}^{\vec{x}}

\{C^{\vec{x}}|C^{\vec{x}}\models

pre

(\vec{x}),C\in\mathscr{C}\}

3 postcondition := synthesizePC(

\mathscr{V}^{\vec{x}},\mathscr{C^{\prime}}^{\vec{x}}

4 let postcondition be of the form

\bigvee\limits_{i}\exists\vec{y}.

post

{}_{i}(\vec{x},\vec{y}

5 R :=

\exists\vec{x}.

pre

(\vec{x})\wedge\neg\bigvee\limits_{i}\exists\vec{y}.

post

{}_{i}(\vec{x},\vec{y})

6 if all $V\in\mathscr{V}$ satisfy R then

7 return

R

. // R is consistent with

\mathscr{V}

and

\mathscr{C}

else

8 return Fail. // R is not consistent with

\mathscr{V}

and

\mathscr{C}

end if

Algorithm 1 Overall rule synthesis algorithm

Proc getConjunctiveSubRule ( $\mathscr{E}^{\vec{v_{f}}}$ )

\mathscr{A}^{\vec{v_{f}}}

:= merge(

\mathscr{E}^{\vec{v_{f}}}

);

\mathscr{A}^{\vec{v_{f}},\vec{v_{b}}}

:= assignVars(

\mathscr{A}^{\vec{v_{f}}}

\vec{v_{b}}

\mathscr{A}^{\vec{v_{f}},\vec{v_{b}}}_{c}

:= project

{}_{\mathscr{E}^{\vec{v_{f}}}}

(

\mathscr{A}^{\vec{v_{f}},\vec{v_{b}}}

\phi(\vec{v_{f}},\vec{v_{b}})

:= getFormula(

\mathscr{A}^{\vec{v_{f}},\vec{v_{b}}}_{c}

\mathscr{E}^{\vec{v_{f}},\vec{v_{b}}}

:= { project

{}_{E^{\vec{v_{f}}}}(\mathscr{A}^{\vec{v_{f}},\vec{v_{b}}})

\mid

E^{\vec{v_{f}}}\in\mathscr{E}^{\vec{v_{f}}}

5 return

\phi(\vec{v_{f}},\vec{v_{b}})

\mathscr{E}^{\vec{v_{f}},\vec{v_{b}}}

Procedure to obtain a conjunctive subrule given a set of valuated examples.

Proc synthesizePC ( $\mathscr{V}^{\vec{x}},\mathscr{C}^{\vec{x}}$ )

1 if

\mid\mathscr{C}^{\vec{x}}\mid=0

then return False.

2 partitionList := [

\mathscr{C}^{\vec{x}}

]; post := True .

3 while True do

4 partition := partitionList.remove(0).

5 let partition be

(\mathscr{C}^{\vec{x}}_{1},\cdots,\mathscr{C}^{\vec{x}}_{k})

6 for i $\in$ $1\cdots k$ do

7 post

{}_{i}(\vec{x},\vec{y})

, _ := getConjunctiveSubRule(

\mathscr{C}^{\vec{x}}_{i}

8 post := post

\vee

\exists\vec{y}.

post

{}_{i}(\vec{x},\vec{y})

end for

9if all

V^{\vec{x}}\in\mathscr{V}^{\vec{x}}

satisfy

\neg

post then return post.

10 i := index s.t.

V^{\vec{x}}\in\mathscr{V}^{\vec{x}}

satisfies post_i.

11 if

\mid\mathscr{C}^{\vec{x}}_{i}\mid=1

then return Fail.

12 for ( $\mathscr{C}^{\vec{x}}_{i,l}$ , $\mathscr{C}^{\vec{x}}_{i,r}$ ) in getCandidatePartitions( $\mathscr{C}^{\vec{x}}_{i}$ ) do

13 partitionList.add(partition

-\mathscr{C}^{\vec{x}}_{i}+\mathscr{C}^{\vec{x}}_{i,l}+\mathscr{C}^{\vec{x}}_{i,r}

)

end for

end while

Procedure to synthesize postcondition from a given set of valuated violating and conforming examples.

Proc generateCandidatePartitions ( $\mathscr{C}^{\vec{x}}$ )

\mathscr{A}^{\vec{x}}

:= merge(

\mathscr{C}^{\vec{x}}

\mathscr{A}^{\vec{x}}_{c}

:= project

{}_{\mathscr{C}^{\vec{x}}}

(

\mathscr{A}^{\vec{x}}

3 candidateNodes := {n

\mid

n is action node, n

\not\in\mathbb{N}

(

\mathscr{A}^{\vec{x}}_{c}

), n

\in\mathcal{N}_{\mathscr{A}^{\vec{x}}}(u)

, u

\in\mathbb{N}

(

\mathscr{A}^{\vec{x}}_{c}

)}. //

\mathcal{N}_{\mathscr{A}}

is neighbor relation in

\mathscr{A}

\mathbb{N}(\mathscr{A})

is the set of nodes for

\mathscr{A}

4 H(n) := computeEntropy(n) for all n

\in

candidateNodes.

5 for n $\in$ candidateNodes s.t. H(n) < $\min\limits_{n}$ H(n) + $\delta$ do

6 partitionList += (

\mathscr{C}^{\vec{x}}_{n},\mathscr{C}^{\vec{x}}_{\neg n}

end for

7return partitionList.

Procedure to obtain candidate partitions for a given set of valuated conforming examples (see Section 4.3).

The overall rule synthesis algorithm is described in Algorithm 1. Inputs to the algorithm is the set of violating $\mathscr{V}$ and conforming examples $\mathscr{C}$ . It proceeds in the following steps:

Synthesize precondition from $\mathscr{V}$ : This is achieved by method getConjunctiveSubRule(). It first calls method merge to align examples in $\mathscr{V}$ using ILP (described in detail in Section 4.1 and 4.2). This constructs a UAPDG $\mathscr{A}$ . Then, it identifies nodes in $\mathscr{A}$ that have a mapping to some node in each example. This is performed by method project and results in a UAPDG $\mathscr{A}_{c}$ (subscript $c$ stands for common). Nodes in $\mathscr{A}_{c}$ are existentially quantified since they correspond to some node in each example in $\mathscr{V}$ . This resultant UAPDG is translated to the precondition formula $\textit{pre}(\vec{x})$ using the translation described in Section 3.3.
Find subset $\mathscr{C^{\prime}}$ of $\mathscr{C}$ which satisfy the precondition: This is accomplished by querying the SMT solver. If a conforming example satisfies the precondition $\textit{pre}(\vec{x})$ , the solver also returns the model comprising the satisfying node assignments for $\vec{x}$ . We map these nodes to the corresponding variables in $\vec{x}$ to get the set of VPDGs $\mathscr{C}^{\prime}$ . Note that the nodes in the VPDGs mapped to variables in $\vec{x}$ are frozen, since the examples satisfy the precondition through these nodes.
Synthesize postcondition from $\mathscr{C^{\prime}}$ : This is achieved by method synthesizePC(). Besides $\mathscr{C^{\prime}}$ , we also pass to this method violating examples $\mathscr{V}$ with a node assignment that satisfy the precondition. Initially, all the conforming examples in the input constitute a single partition. For these examples, we use the ILP graph alignment to align unfrozen nodes and synthesize a postcondition (using method getConjunctiveSubRule as before). If the postcondition is unsatisfiable on all frozen violating examples, we return the synthesized postcondition. Otherwise, it implies that the postcondition synthesized from all examples in the partition is too general to reject violating examples, and we partition the set of conforming examples further. The partitioning algorithm is described in Section 4.3.

Example 1.

Consider postcondition synthesis for the rule described in Section 2.2. Calling getConjunctiveSubrule on all conforming examples satisfying the precondition returns a UAPDG $\mathscr{A}$ that is same as the synthesized precondition, visualized in Figure 4(a). Violating examples such as code-befores in Figure 1 satisfy $\mathscr{A}$ . Therefore, confoming examples are partitioned and a separate postcondition is synthesized for each partition (UAPDGs in Figure 4(b-c)).

We next describe the different components of the rule synthesis algorithm in greater detail.

4.1. Maximal Graph Alignment using ILP

Graph alignment amongst the violating examples or conforming examples is one of the key steps in synthesizing the rule precondition or postcondition respectively. Since both the precondition and postcondition formulas are existentially quantified, the synthesis problem is NP-hard (Haussler, 1989). Its hardness stems from different choices available for mapping existential variables to nodes in the example graphs. Alternatively, the synthesis problem can be seen as a search over graph alignments such that aligned nodes in different examples are mapped to the same existential variable. Note that different graph alignments will result in different synthesized formulas. Given multiple graphs, we iteratively align two graphs at a time⁵⁵5This is an instance of a “multi-graph matching” problem. This requires optimization techniques beyond ILP, for e.g., alternating optimization (Yan et al., 2013). We did not explore these solutions since pairwise graph alignment worked well for our application.. We frame the problem of choosing a desirable graph alignment for a pair of example graphs as an ILP optimization problem. Since we want to synthesize formulas that precisely capture the code examples, we choose alignments that maximize the number of aligned nodes and edges.

The ILP objective to maximize graph alignment can also be viewed as minimizing the graph edit distance between code examples. Consequently, our reduction to ILP follows the binary linear programming formulation to compute the exact graph edit distance between two graphs (Lerouge et al., 2017). The node and edge substitutions are considered cost $0$ and node/edge additions and deletions are cost $1$ operations. We include a detailed reduction to the ILP optimization problem in Appendix C. Below, we describe some constraints on the ILP optimization that are specific to our application:

•

We align action nodes only if they have the same label. So, a method call foo in one example cannot align with method bar in another example. However, data nodes that are,optionally, labeled with their types do not have this restriction. This allows us to align data nodes even when their types do not resolve, or when their types are related in the class hierarchy, for e.g., InputStream and FileInputStream.
•

Two data nodes align only if they have at least one aligned incoming or outgoing edge. This restricts alignment of data nodes to only occurrences when they are defined or used by aligned action nodes.
•

When synthesizing the postcondition, nodes frozen to the same variables in $\vec{x}$ are constrained to be aligned.

When synthesizing rules from code changes, we use the ILP formulation described in this section to perform a fine-grained graph differencing over PDGs in a code change (similar to the GumTree tree differencing algorithm (Falleri et al., 2014)). Graph differencing labels nodes in the PDGs with change tags: unchanged, deleted or added, depending upon whether the node is present in both code-before and code-after, in only code-before, and in only code-after. Now, while aligning the violating code-before’s for synthesizing the precondition, we require that the node alignment respect these change tags, i.e., nodes are aligned only if their change tags are the same. This constraint helps to focus the synthesized precondition formula on the changed code. This constraint is removed when synthesizing the postcondition. This is because even variables $\vec{x}$ frozen in the conforming examples, by design, may not respect these change tags.

4.2. Synthesizing Conjunctive Subrules

In this section, we describe the procedure getConjunctiveSubRule to synthesize a conjunctive subrule from a given set of valuated examples. It has two main steps. First, UAPDGs that correspond to input valuated examples are iteratively merged pairwise to obtain a UAPDG $\mathscr{A}$ . Second, we project $\mathscr{A}$ to nodes present in all input examples to obtain UAPDG $\mathscr{A}_{c}$ . Given UAPDGs $\mathscr{A}_{1}$ and $\mathscr{A}_{2}$ , $\mathscr{A}_{c}$ obtained by calling merge followed by a project over-approximates $\mathscr{A}_{1}\vee\mathscr{A}_{2}$ . This property ensures that $\mathscr{A}_{c}$ obtained after merging a set of input valuated examples is saltisfied by all of them. We first describe the merge operation, followed by project.

Let $\mathscr{A}_{1}=(N_{1},E_{1},Lat_{1},M_{f1},M_{b1})$ and $\mathscr{A}_{2}=(N_{2},E_{2},Lat_{2}$ , $M_{f2},M_{b2})$ be two UAPDGs over the same set of free variables $\textit{Var}_{f}$ . Let $NM\subseteq N_{1}\times N_{2}$ and $EM\subseteq E_{1}\times E_{2}$ be the node and edge mappings that are obtained from the graph alignment step. We first extend these mappings to relations $\textit{NM}^{\epsilon}\subseteq N_{1}\cup\{\epsilon\}\times N_{2}\cup\{\epsilon\}$ and $\textit{EM}^{\epsilon}\subseteq E_{1}\cup\{\epsilon\}\times E_{2}\cup\{\epsilon\}$ such that $\textit{NM}^{\epsilon}=\textit{NM}\cup\{(n_{1},\epsilon)\mid(n_{1},\_)\not\in NM\}\cup\{(\epsilon,n_{2})\mid(\_,n_{2})\not\in NM\}$ , and $\textit{EM}^{\epsilon}=\textit{EM}\cup\{(e_{1},\epsilon)\mid(e_{1},\_)\not\in EM\}\cup\{(\epsilon,e_{2})\mid(\_,e_{2})\not\in EM\}$ . Then, $\texttt{merge}_{(NM^{\epsilon},EM^{\epsilon})}(\mathscr{A}_{1},\mathscr{A}_{2})$ returns a UAPDG $\mathscr{A}=(N,E,Lat,M_{f},M_{b})$ constructed as follows:

•

$N=\{(n_{1},n_{2})|(n_{1},n_{2})\in NM^{\epsilon}\}$
•

We first compute $E^{\prime}=\{(n_{1},n_{2})\xrightarrow{(e_{1},e_{2})}(n_{1}^{\prime},n_{2}^{\prime})|(e_{1},e_{2})\in EM^{\epsilon},e_{i}\neq\epsilon\Leftrightarrow n_{i}\xrightarrow{e_{i}}n_{i}^{\prime}\}$ . Since $EM$ is an edge mapping obtained from the graph alignment step, it follows that $e_{1}\neq\epsilon\wedge e_{2}\neq\epsilon\Rightarrow\textit{label}(e_{1})=\textit{label}(e_{2})$ . Further, from definition of $EM^{\epsilon}$ , it follows that $e_{1}\neq\epsilon\vee e_{2}\neq\epsilon$ . Once $E^{\prime}$ is computed, $E=\{(n_{1},n_{2})\xrightarrow{e}(n_{1}^{\prime},n_{2}^{\prime})|(n_{1},n_{2})\xrightarrow{(e_{1},e_{2})}(n_{1}^{\prime},n_{2}^{\prime})\in E^{\prime},e=\{\textit{label}(e_{1}),\textit{label}(e_{2})\}\textbackslash\{\epsilon\}\}$ .
•

$Lat(n_{1},n_{2})=Lat_{1}(n_{1})\sqcup Lat_{2}(n_{2})$ where the join is applied point-wise to each node predicate at $n_{1}$ and $n_{2}$ (we assume that $Lat(\epsilon)=\bot$ )
•

$M_{f}(n_{1},n_{2})=x$ , if $M_{f1}(n_{1})=M_{f2}(n_{2})=x\in\textit{Var}_{f}$
•

Nodes not mapped to free variables are mapped to distinct fresh variables– if $(n_{1},n_{2})\not\in\textit{domain}(M_{f})$ , then $M_{b}(n_{1},n_{2})=x_{i}$ for a distinct variable $x_{i}\not\in\textit{Var}_{b}$ .

Method project with respect to $\mathscr{A}_{1}$ and $\mathscr{A}_{2}$ projects the merged UAPDG $\mathscr{A}$ to $\mathscr{A}_{c}$ which has nodes (resp. edges) that map to nodes (resp. edges) in both $\mathscr{A}_{1}$ and $\mathscr{A}_{2}$ , i.e., nodes in $\mathscr{A}_{c}$ are of the form $(n_{1},n_{2})$ where $n_{i}\in N_{i}$ and edges are of the form $(e_{1},e_{2})$ where $e_{i}\in E_{i}$ .

Theorem 2.

For any node mapping $NM\subseteq N_{1}\times N_{2}$ and edge mapping $EM\subseteq E_{1}\times E_{2}$ that is obtained from aligning $\mathscr{A}_{1}$ and $\mathscr{A}_{2}$ , $\mathscr{A}_{1}\vee\mathscr{A}_{2}\Rightarrow\texttt{project}_{\{\mathscr{A}_{1},\mathscr{A}_{2}\}}(\texttt{merge}_{(NM^{\epsilon},EM^{\epsilon})}(\mathscr{A}_{1},\mathscr{A}_{2}))$ .

The proof outline for the above theorem is presented in Appendix D. Besides, returning the formula for $\mathscr{A}_{c}$ , method getConjunctiveSubrule also returns the set of valuated examples labeled with bound variables used for constructing $\mathscr{A}$ . This is achieved by projecting $\mathscr{A}$ to nodes (resp. edges) that map to nodes (resp. edges) in each example. Note, to synthesize conjunctive subrules from a single example (possible in the case of postcondition synthesis), we heuristically include in the subrule all nodes at distance $d=1$ from nodes bound to free variables $\vec{x}$ .

4.3. Partitioning Conforming Examples

In this section, we describe the method generateCandidatePartitions. This method returns a list of low entropy partitions of the input examples $\mathscr{C}$ , based on the clustering algorithm in (Li et al., 2004). Let $N$ be the set of nodes in $\mathscr{A}$ and $N_{c}$ be the set of nodes in $\mathscr{A}_{c}$ , where $\mathscr{A}$ and $\mathscr{A}_{c}$ are the UAPDGs obtained after calling merge and then calling project on examples in $\mathscr{C}$ respectively. For $n\in N$ , let $\mathscr{C}_{n}=\{C_{i}\}\subseteq\mathscr{C}$ be the set of examples such that $n$ maps to a node in $C_{i}$ and let $\mathscr{C}_{\neg n}=\mathscr{C}\backslash\mathscr{C}_{n}$ . Entropy of a partition with respect to $n$ is:
$H(\mathscr{C},\mathscr{C}_{n},\mathscr{C}_{\neg n})=\frac{\mid\mathscr{C}_{n}\mid}{\mid\mathscr{C}\mid}.H(\mathscr{C}_{n})+\frac{\mid\mathscr{C}_{\neg n}\mid}{\mid\mathscr{C}\mid}.H(\mathscr{C}_{\neg n})$ ,
$H(\mathscr{C})=\sum\limits_{n^{\prime}}H_{n^{\prime}}(\mathscr{C})$ , where $H_{n^{\prime}}(\mathscr{C})=-p.logp-(1-p).log(1-p)$ ,
where $p=\mid\mathscr{C}_{n^{\prime}}\mid/\mid\mathscr{C}\mid$ . Nodes in $N_{c}$ by definition are mapped to nodes in all input examples. Since code patterns in a rule are often localized, we consider partitions with respect to nodes in $N$ that neighbor nodes in $N_{c}$ . Further, we return the set of all partitions whose entropy is within a $\delta$ margin of the smallest entropy partition. Each of these partitions is explored in a BFS manner in method synthesizePC. The process stops at the first partition that synthesizes a postcondition that is unsatisfied by all violating examples.

4.4. Discussion about RhoSynth

Precision and Generalization: RhoSynth prefers precision over recall. For this reason, RhoSynth biases preconditions to more specific formulas using an ILP-based maximal graph alignment, even if it may cause some amount of overfitting. In practice, we minimize overfitting by using diverse examples for synthesizing a rule, for instance, examples that are obtained from code changes that belong to different packages (refer to Appendix A for details). Furthermore, to prevent overfitting, RhoSynth biases postconditions to fewer disjuncts. This is accomplished by partitioning a group of correct examples only when the most-specific conjunctive postcondition synthesized from all correct examples satisfies a violating example. In this case, partitioning the correct examples and synthesizing a postcondition disjunct for each partition becomes necessary. In addition, RhoSynth uses lattices to express all node predicates. This helps generalization when merging groups of conforming and violating examples to synthesize the rule.

Soundness: We next show that RhoSynth is sound, i.e., given a set of violating and conforming examples, if RhoSynth synthesizes a rule $R$ then $R$ satisfies all violating examples and does not satisfy any conforming example. Formally,

Theorem 3 (Soundness of Rule Synthesis:).

Given a set of violating examples $\mathscr{V}=\{V_{1},\cdots,V_{m}\}$ and conforming examples $\mathscr{C}=\{C_{1},\cdots,C_{n}\}$ , if the algorithm synthesizeRule successfully returns a rule $R$ , then $V_{i}\models R$ and $C_{i}\not\models R$ .

We present a proof outline in Appendix E. It is easy to argue Algorithm 1’s soundness with respect to violating examples. To argue soundness with respect to conforming examples, we rely on the soundness of the merge algorithm (theorem 2). Using Theorem 2, we argue that all valuated conforming examples satisfy the synthesized postcondition. Since the rule negates the synthesized postcondition, none of them satisfy the synthesized rule.

Completeness: RhoSynth does not provide completeness guarantees. This is a design choice made purely based on practical experiments. Incomleteness may arise from incorrect graph alignments or incorrect partitioning of conforming examples when synthesizing rule postconditions. However, in our experiments, we do not encounter cases where either of these lead to rule synthesis failures or imprecise solutions.

5. Implementation Details

PDG representation: We use RhoSynth to synthesize rules for Java. Our implementation uses MUDetect (Sven and Nguyen, 2019) for representing Java source code as PDGs. MUDetect uses a static single assignment format. Further, we transform these PDGs using the following two program transformations to obtain program representations that are more conducive to rule synthesis:

(1)

We abstract the label of relational operators such as $<,\leq,=,\neq$ , etc. to a common label rel_op. This transformation is useful when synthesizing rules from examples that perform a similar check on a data value using different relational operators, for e.g., error < 0 and error == -1. This transformation helps to express rules over such examples using a conjunctive formula over the rel_op label, instead of a disjunction over different relational operators.
(2)

If a PDG has multiple calls to the same getter method on the same receiver, we transform it into a PDG that has a single getter call and the value returned by the getter is directly used at other call sites. So, we transform the code snippet foo(url.getPort()); ... if (url.getPort()
== 0) { ... } to the PDG representation of the code snippet v = url.getPort();foo(v);
... if (v == 0) { ... } . Since RhoSynth, typically, does not have access to method declarations of called methods, we use heuristics on the name of the method to identify getter calls.

Lattices: To enable unification of node predicates at corresponding nodes in different PDGs, we lift all the node predicates to lattices in the following manner:

(1)

Predicate data-type(x) and declaring-type(x) are lifted to a lattice that captures the longest common prefix and longest common suffix of the string representations of data types. Often, a Java class name shares a common prefix or a suffix with name of its superclass or the interfaces it extends. This lattice is carefully designed to unify such related data types. If nodes $n_{1}$ and $n_{2}$ in different examples align to form node $n$ where data-type( $n_{1}$ ) = “BufferedInputStream” and data-type( $n_{2}$ ) = “FileInputStream”, this lattice helps generalize data-type(n) = “*InputStream”.
(2)

Predicates label(x), data-value(x), num-para(x), output-ignored(x) are lifted to constant lattices. As an example, if num-para $(n_{1})=1$ and num-para $(n_{2})=2$ and $n_{1}$ and $n_{2}$ align to form node $n$ , then num-para $(n)=\top$ .
(3)

Predicate trans-control-dep(x) is lifted to a power-set lattice where set intersection is the join operator. As an example, if trans-control-dep $(n_{1})=\{\texttt{\small If},\texttt{\small Catch}\}$ and trans-control-dep $(n_{2})=\{\texttt{\small If}\}$ and $n_{1}$ and $n_{2}$ align to form node $n$ , then trans-control-dep $(n)=\{\texttt{\small If}\}$ .

Solvers: We use the Xpress ILP solver for graph alignment and Z3 (De Moura and Bjørner, 2008) for satisfiability solving. For running a rule on a method, we check if the rule is satisfied by the PDG representation of the method. We have implemented a few algorithmic simplifications that are used to simplify the satisfiability query, for e.g., we can determine that a rule, which asserts existence of a method call foo, will not be satisfied by a method that does not call foo. Due to such simplifications, >99% of the satisfiability queries can be discharged without calling Z3.

6. Evaluation

To understand the effectiveness of RhoSynth, we address the following research questions.

RQ1:

How effective is RhoSynth at synthesizing precise rules?

Section 6.2 shows that the precision of rules exceeds 75% in production.
RQ2:

Are the synthesized rules “interesting”?

The rules synthesized cover diverse categories and are often discussed on various forums as shown in Section 6.3.
RQ3:

What is the effectiveness of iterative rule refinement in improving the precision?

As we show in Section 6.4, rule refinement improves the precision of rules, by as much as 68% in some cases.
RQ4:

How does RhoSynth compare against state-of-the-art program synthesis approaches? The baselines used are ProSynth (Raghothaman et al., 2019) and anti-unification based synthesis over ASTs.

The baselines often fail to synthesize rules. (Section 6.5)
RQ5:

How does ILP-based graph alignment compare against state-of-the-art code differencing algorithms? We use the GumTree (Falleri et al., 2014) algorithm as a baseline.

Baseline algorithms do not perform well when aligning unpaired code examples for rule synthesis. (Section 6.6)

6.1. Experimental Methodology and Setup

Code Change Examples: We synthesize rules for 31 groups of Java code changes. See Appendix A for a description on the source of the dataset. We provide GitHub links to all the code changes along with the synthesized rules as supplementary material.

Iterative Rule Refinement: We run the synthesized rules on code corpus. We identify rules for iterative refinement based on the detections generated by them. For each rule, we first label $5$ detections. If any of these detection is labeled as a false positive (FP), we shortlist the rule for refinement. For such rules, we label up to $10$ additional detections. With every false positive, we resynthesize the rule using the original set of code changes and all the FPs encountered till then. We stop when adding the $N^{\textit{th}}$ FP does not change the synthesized rule, i.e., rule synthesized after the first $N$ FPs is equivalent to the rule synthesized after $N-1$ FPs.

Production Deployment: We have deployed all these rules in an industrial setting, with our internal code review (CR) system. These rules are run on the source code at the time of each code review. Detections generated by these rules on the code diff are provided as code review comments. Code authors can label these comments as part of code review workflow.

Evaluation: We perform two types of evaluation: offline evaluation by software developers on sampled recommendations and ratings and textual feedback obtained from live code reviews from code authors.

In both cases, the users label them as "Useful", "Not Useful" or "Not Sure". In live code review, developers also provide textual feedback. The offline evaluation involves $10$ expert developers and does not involve the paper authors.

Metrics: For both the evaluations, we report $\textit{precision }=\textit{ \# Useful }/\textit{ \# Total labels}$ , computed over all labeled detections.

Experimental Setup: We conducted all our experiments on a Mac OSX laptop with 2.4GHz Intel processor and 16Gb memory. Experiments with ProSynth (Raghothaman et al., 2019) were run in a Docker container with 8Gb memory launched from a Docker image shared by the authors.

6.2. Precision of RhoSynth

In our offline evaluation, $91\%$ detections ( $107$ out of $117$ ) were labeled "Useful". We have also received feedback from developers on detections generated by these rules during live code reviews. Over a period of $5$ months, detections from $25$ out of the $31$ rules received developer feedback. $75.8\%$ of these labeled detections ( $273$ out of $360$ ) were categorized as "Useful" by the developers⁶⁶6Most rules that did not receive developer feedback during live code reviews involved detecting code context with deprecated or legacy APIs. These rules trigger rarely during a code review and are thus more appropriate for a code scanning tool. All these rules were validated during offline evaluation.. Some of the textual feedback from live code reviews are: "Interesting, that is helpful", "Will add this check.", and "we have a separate task to look into the custom polling solution https://XX". A high developer acceptance during code reviews shows that RhoSynth is capable of synthesizing code quality rules which are effective in the real-world.

Table 1 provides more information about these synthesized rules. The synthesis time for these rules range from $30ms$ to $13s$ , with an average of $1.5s$ . Most of these rules are synthesized from few code changes. Rules that only consist of a precondition correspond to a code anti-pattern. $9$ rules consist of both a precondition and a postcondition. Of these, $4$ rules have a disjunctive postcondition. These disjuncts correspond to different ways of writing code that is correct with respect to the rule. One example of such a rule is check-movetofirst, described in Section 2.2. Another example is executor-graceful-shutdown that intuitively checks if ExecutorService.shutdownNow() call is accompanied by a graceful wait implemented by calling ExecutorService.awaitTermination() or ExecutorService.invokeAll(). Besides the supplementary material, we include visualizations for few synthesized rules in Appendix F.

6.3. Examples of RhoSynth rules

We describe all the rules synthesized by RhoSynth in Table 6.3. A large number of these rules are supported by code documentation or discussions on online forums. They cover a wide range of code quality issues and recommendation categories:

•

performance: e.g., use-parcelable, use-guava-hashmap
•

concurrency: e.g., conc-hashmap-put, countdown-latch-await
•

use of deprecated or legacy APIs: e.g., deprecated-base64, upgrade-enumerator
•

bugs: e.g., check-movetofirst, start-activity
•

code modernization: e.g., view-binding, layout-inflater
•

code simplification: e.g., deseriaize-json-array, use-collectors-joining
•

debuggability: e.g., countdownlatch-await, exception-invoke

S. No.

Rule name

API

Description of the synthesized rules

check-

actionbar

Android

getSupport-

ActionBar

The method ‘getSupportActionBar‘ returns ‘null‘ if the Android activity does not have

an action bar. One must null-check the value returned by ‘getSupportActionBar‘, if the

action bar is not explicitly set by a ‘setSupportActionBar‘ call.

check-

await-

termination

Executor-

Service.await-

Termination

One must check the return value of ‘awaitTermination()‘ to determine if the operation

timed out while waiting for other threads to stop execution, following a shutdown

request. Alternatively, one can check the same by calling‘ExecutorService.isTerminated‘.

check-

create-

newfile

File.

create-

NewFile

‘createNewFile‘ returns False if a file already exists. One must check for ‘File.exists‘ or

check the value returned by ‘createNewFile‘. Without this check, one might silently

overwrite an existing file leading to a data loss.

check-

file-

rename

FileSystem.

rename

One must check if the Hdfs ‘rename‘ operation succeeded and handle failures otherwise.

This check can be performed by checking the value returned by the ‘rename‘ call.

Silent failures can lead to errors that are harder to debug.

check-

inputstream-

skip

Input-

Stream.

skip

‘InputStream.skip‘ returns the number of bytes skipped. One must check the value

returned by the ‘skip‘ call to handle the case when fewer than the expected number of

bytes are skipped.

check-

mkdirs

File.

mkdirs

One must check if the ‘mkdirs‘ operation succeeded and handle failures otherwise. This

check can be performed by checking the value returned by the ‘mkdirs‘ call. Silent

failures can lead to errors that are harder to debug.

check-

movetofirst

Cursor.

moveToFirst

One must check if the result set returned by a database query is empty. This can be

performed by checking the value returned by ‘Cursor.moveToFirst‘ or ‘Cursor.isAfterLast‘,

or checking if ‘Cursor.getCount() > 0‘. Without this check, the app can crash

if subsequent operations are called on the cursor.

check-

resultset-

ResultSet.

One must check if ‘ResultSet.next()‘ returns False. If it returns False, it implies that

the cursor is positioned after the last row and any subsequent calls to ‘next()‘ will

throw an exception.

conc-

hashmap-

put

Conc-

urrentHash-

Map.put

The rule detects code that calls ‘ConcurrentHashMap.containsKey()‘, followed by a

call to ‘put()‘ if ‘containsKey‘ returned ‘False‘. Since these operations are not atomic,

the atomicity violation can lead to a data loss.

countdown-

latch-await

Count-

DownLatch.

await

The ‘await‘ method returns ‘False‘ when the specified waiting time elapses while

the thread is waiting for the latch to count down to zero. One must check the

value returned by ‘await‘ or check if ‘CountDownLatch.getCount() > 0‘ to handle

the case when the ‘await‘ call had timed out.

create-list-

from-map

Map.

values

The rule detects code that converts the ‘Collection‘ returned by ‘Map.values()‘ into

a Java ‘Stream‘ and then collect it into a ‘List‘. This can be simplified by calling

the ‘ArrayList‘ constructor on the ‘Collection‘ returned by ‘values()‘.

deprecated-

base64

Spring

Base64.

encode

The ‘Base64‘ class in the Spring framework’s crypto library is deprecated. Instead

of calling ‘Base64.encode()‘, one must use ‘encodeToString‘ declared in the

‘java.util.Base64.Encoder‘ class.

URL: https://docs.spring.io/spring-security/site/docs/current/api/org/ springframework/security/crypto/codec/Base64.html

deprecated-

injectview

ButterKnife.

inject

ButterKnife InjectView was deprecated in version 7. One must replace

‘ButterKnife.inject(…)‘ with ‘ButterKnife.bind()‘ or use Android’s View Binding.

URL: https://github.com/JakeWharton/butterknife/blob/master/CHANGELOG.md #version-700-2015-06-27

deprecated-

mapping-

exception

Jackson

mapping-

Exception

Method ‘mappingException()‘ was deprecated in version 2.8 of the jackson-

databind library. Instead use ‘handleUnexpectedToken()‘.

URL: https://fasterxml.github.io/jackson-databind/javadoc/2.8/com/fasterxml/jackson/ databind/DeserializationContext.html#mappingException(java.lang.Class)

deserialize-

json-

array

Gson.

fromJson

The rule detects code that deserializes a list of JSON items by iterating in a loop.

Instead, one can directly deserialize into a list by specifying the correct parameterized

type using the ‘TypeToken‘ class.

URL: https://github.com/google/gson/blob/master/UserGuide.md#TOC-Serializing- and-Deserializing-Generic-Types

exception-

invoke

Method.

invoke

Whenever ‘Method.invoke‘ is called, one must explicitly handle the ‘Invocation-

TargetException‘. Since the actual underlying exception is the cause of ‘Invocation-

TargetException‘, it is desirable to call ‘Throwable.getCause()‘ or ‘getTargetException()‘

in the catch handler to access more information about the underlying exception.

executor-

graceful-

shutdown

Executor-

Service.

shutdown-

Now

One must shutdown an ExecutorService gracefully by first calling ‘shutdown‘ to

reject any incoming tasks, waiting a while for the existing tasks to terminate by

calling ‘awaitTermination‘, and then calling ‘shutdownNow‘ to cancel

lingering tasks. This is not required when the code calls ‘ExecutorService.

invokeAll‘ that waits till all the tasks complete.

URL: https://stackoverflow.com/questions/51819342

layout-

inflater

Android

TextView

constructor

One must inflate views using the ‘LayoutInflater‘ instead of creating ‘TextView‘’s

programmatically in code. Especially, if the layout is complex, it is much easier to define

it in XML and inflate it, rather than creating it all in code.

URL: https://developer.android.com/reference/android/view/LayoutInflater

read-

parcelable

Parcel.

readValue

If one knows the specific type of the read object, one must use ‘readParcelable‘

instead of calling ‘readValue‘ followed by an explicit type-cast. Using ‘readParcelable‘

will not require the type-cast operation.

replace-

long-

constructor

Long

constructor

Instead of constructing a new ‘Long‘ object, one must use ‘Long.valueOf()‘ as this method

yields better space and time performance by caching frequently requested values.

URL: https://docs.oracle.com/javase/8/docs/api/java/lang/Long.html#valueOf-long-

start-

activity

Android

start-

Activity

When launching an Android activity with ‘startActivity(Intent, …)‘, one must

check ‘Intent.resolveActivity(..)‘ for null. This checks if there exists an app on the

device that can receive the implicit intent and launch the activity. Otherwise, the

app will crash when ‘startActivity‘ is called. This is not required when the activity

is part of the same app, or when ‘startActivity‘ is called within a try-catch block.

URL: https://developer.android.com/guide/components/intents-common

upgrade-

enumerator

Apache

Enumerator

constructor

The rule detects code that constructs an object of the deprecated ‘Enumerator‘ class.

One must use ‘Collections.enumeration‘ instead.

URL: https://tomcat.apache.org/tomcat-7.0-doc/api/org/apache/catalina/util/Enumerator.html

upgrade-

http-

client

HttpClient.

execute-

Method

The rule detects code that calls ‘HttpClient.executeMethod‘. This method has been

replaced in the HttpClient library version 4 with method ‘execute‘. One

must upgrade to the latest version of the HttpClient library.

URL: https://stackoverflow.com/questions/40795037

use-

collectors-

joining

Stream.

collect

The rule detects code that collects all the items into a ‘Collection‘ by calling

‘Stream.collect()‘ and then joins them into a delimited String by ‘join()‘. Instead,

one can use ‘Collectors.joining(CharSequence)‘ to directly create a delimited String.

use-file-

read-

utility

HttpURL-

Connection

getInput-

Stream

The rule detects code that creates a Reader from a Connection’s input stream and

reads its content by iterating in a loop. Instead, one can directly use a utility function,

such as ‘IOUtils.toString‘, to read the input stream. This makes the code more

readable and also prevents any chances of resource leaks.

use-fs-

is-dir

FileSystem.

isDirectory

‘FileSystem.isDirectory()‘ has been deprecated. One must use ‘FileSystem.

getFileStatus().isDir()‘ instead.

URL: https://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/ FileSystem.html#isDirectory-org.apache.hadoop.fs.Path-

use-

guava-

hashmap

HashMap

constructor

The rule detects code that constructs a HashMap of a given size and then

immediately adds all the keys into the map by iterating in a loop. The HashMap

constructor uses a default load factor of 0.75, which means that the hash table

will be rehashed after 75% of all keys have been added to the table. One must use

Guava’s ‘newHashMapWithExpectedSize()‘ as that will not result in a rehash.

URL: https://stackoverflow.com/questions/30220820

use-

parcelable

Android

getSerial-

izableExtra

One must use ‘Parcelable‘ instead of ‘Serializable‘ to pass data between different

components in Android, as the former is more performant.

URL: https://skoric.svbtle.com/serializable-vs-parcelable

use-

remove-if

Iterator.

remove

The rule detects code that conditionally removes values from a hashmap by iterating

over all the map entries in a loop. One can instead use ‘Collection.removeIf‘.

This is also more efficient than removing values by iterating over all entries.

view-

binding

Android

findViewById

One must enable view binding in their module instead of calling ‘findViewById()‘.

View binding provides null-safety and type-safety at compile time.

URL: https://developer.android.com/topic/libraries/view-binding#java

wrapping-

exception

Invocation-

TargetException.

getCause

The rule detects code that wraps ‘InvocationTargetException.getCause()‘ into an

unchecked exception. One must check if the ‘Throwable‘ returned by ‘getCause()‘

can be itself type-cast into an unchecked exception. A new unchecked exception

must be constructed only if the type-casting operation is not successful.

The synthesized rules along with pointers to supporting blog-posts and code documentation.

Rule name

# code

changes

# new

egs.

PreC

size

PostC

size

check-file-rename

check-inputstream-skip

check-mkdirs

check-resultset-next

conc-hashmap-put

create-list-from-map

deprecated-base64

deprecated-injectview

deprecated-mapping-

exception

deserialize-json-array

layout-inflater

read-parcelable

replace-long-constructor

upgrade-http-client

upgrade-enumerator

use-collectors-joining

use-file-read-utility

use-fs-is-dir

use-guava-hashmap

use-parcelable

use-remove-if

view-binding

wrapping-exception

Rules Iteratively Refined

check-actionbar

(14, 14)

check-await-

termination

check-createnewfile

check-movetofirst

(6, 8)

countdownlatch-await

exception-invoke

executor-graceful-

shutdown

(5, 10)

start-activity

(5, 6, 11)

Table 1. Information about the synthesized rules. # new egs. are additional examples used for rule refinement, PreC size and PostC size are the number of nodes in the rule precondition and (possibly disjunctive) postcondition formulas.

(i_{1},i_{2},\cdots)

indicates a postcondition with disjuncts of sizes

i_{1}

i_{2}

\cdots

respectively.

Rule name

# code

Before rule

# new

After rule

changes

refinement

egs.

refinement

PostC

Prec.

PostC

Prec.

size

check-

actionbar

32%

(14, 14)

100%

check-await-

termination

79%

100%

check-create

newfile

42%

100%

check-

movetofirst

40%

(6, 8)

100%

countdown

latch-await

80%

100%

exception-

invoke

76%

100%

executor-graceful-

shutdown

95%

(5, 10)

100%

start-activity

21%

(5, 6, 11)

80%

Macro Avg.

58%

97%

Table 2. Results for iterative rule refinement (PostC size is the number of nodes in each disjunct of the synthesized postcondition, Prec. is the precision based on user-study).

6.4. Effectiveness of Iterative Rule Refinement

We iteratively refine $8$ rules by providing additional non-buggy code examples. These examples correspond to variations in correct code that are not present in the code changes. As can be seen from Table 2, few additional examples are sufficient to refine the rule. All the 8 rules show precision improvements and the (macro) average precision increases from $58\%$ to $97\%$ based on refinement. We use labels from offline evaluation for estimating the precision ⁷⁷7We group all detections by the rule version, and estimate the overall precision based on 10 labels for each group.. In 4 of the 8 rules, the refinement occurs by synthesizing disjunctive postconditions.

Rule name

ProSynth

AST anti-

unification

deprecated-injectview

deprecated-mapping-

exception

layout-inflater

replace-long-constructor

use-parcelable

view-binding

upgrade-enumerator

check-actionbar

✗, TO

check-createnewfile

✗, TO

check-file-rename

✗, TO

check-inputstream-skip

✗, TO

check-movetofirst

✗, TO

conc-hashmap-put

✗, MC

create-list-from-map

✗, MC

deprecated-base64

✗, MC

read-parcelable

✗, MC

upgrade-http-client

✗, MC

use-fs-is-dir

✗, MC

wrapping-exception

✗, TO

check-await-

termination

✗, TO

✗

check-mkdirs

✗, TO

✗

check-resultset-next

✗, TO

✗

countdownlatch-await

✗, UnSAT

✗

deserialize-json-array

✗, MC

✗

exception-invoke

✗, TO

✗

executor-graceful-

shutdown

✗, TO

✗

start-activity

✗, TO

✗

use-collectors-joining

✗, MC

✗

use-file-read-utility

✗, MC

✗

use-guava-hashmap

✗, MC

✗

use-remove-if

✗, MC

✗

Table 3. Comparison with ProSynth and an AST anti-unification based approach. In the above: TO means timeout at 15m, MC means missing code context, UnSAT means unsatisfiable problem.

6.5. Comparison with baselines

We compare RhoSynth with ProSynth (Raghothaman et al., 2019) and AST anti-unification based approaches, on synthesizing rules without the rule refinement step. Getafix (Bader et al., 2019) and Revisar (Rolim et al., 2018) are representatives for synthesizing AST transformations via anti-unification of tree patterns. A direct comparison against Getafix and Revisar is unfortunately not feasible as the former is not publicly available and the latter does not support Java. Hence, we compare against our own implementation of an anti-unification algorithm over ASTs. The results are described in Table 3. We provide artifacts from these experiment as supplementary material.

ProSynth: ProSynth is a general algorithm to synthesize Datalog programs. It is not particularly tailored to the rule synthesis problem, which is the focus of our work, for two main reasons. First, a datalog program can only express rules that are purely existentially quantified formulas without a postcondition (Ajtai and Gurevich, 1994). ProSynth is thus not able to synthesize rules that have a postcondition. It either times out or returns "Problem unsatisfiable" when synthesizing such rules.

Second, in cases when ProSynth is able to synthesize a rule, the synthesized rule often misses the required code context, i.e., the API of interest or constructs of interest. Synthesizing a rule from few examples is in most cases an underspecified problem. While RhoSynth biases the synthesis of rule preconditions towards larger code context, ProSynth uses a SAT solver based enumeration to synthesize a rule precondition without such an inductive bias. As a result, rules synthesized by ProSynth can miss the required code context as long as they can differentiate between the few positive and negative code examples ⁸⁸8 As an example, the code context for conc-hashmap-put rule must check the result of ConcurrentHashMap.containsKey followed by a call to put, on the same hash map, if containsKey returned False. The rule synthesized by ProSynth just checks for an existence of a containsKey call and misses the remaining code context..

ProSynth times out on 12 rules and returns "Problem unsatisfiable" for 1 rule. On a manual examination of the 18 rules it synthesized, we found that $11$ rules do not contain the required code context. These rules would lead to false positives. Hence, we conclude that ProSynth is able to precisely synthesize $7$ out of $31$ rules.

Anti-unification over ASTs: Given two ASTs, we look at all combination of subtrees rooted at different nodes in the two ASTs. We pick the AST pattern, obtained on anti-unification of these subtrees, that has the largest size and contains the API of interest (column $3$ in Table 6.3). We manually examine the AST patterns obtained on anti-unifying code-before and code-after examples. We observe that due to variations in the syntactic structure of code changes, anti-unification misses the required code context in $12/31$ rules. These AST patterns would lead to false positives. This approach is thus able to precisely synthesize at most $19$ rules. This experiment shows the limitation of expressing rules over an AST representation.

Iteratively

refined rule

PostC

disjunct

# new

unpaired

egs.

# action

nodes in

rule PostC

# Node mappings

using ILP graph-

alignment

#Node mappings

using GumTree

PostC synthesis

succeeds

with GumTree?

check-actionbar

\textit{post}_{1}

\textit{post}_{2}

18/18 = 100%

✓

check-await-termination

\textit{post}_{1}

6/6 = 100%

4/6 = 66%

✓

check-createnewfile

\textit{post}_{1}

12/12 = 100%

7/12 = 58%

✗

check-movetofirst

\textit{post}_{1}

45/45 = 100%

41/45 = 91%

✗

\textit{post}_{2}

12/12 = 100%

10/12= 83%

✓

countdownlatch-await

\textit{post}_{1}

15/15 = 100%

7/15 = 47%

✗

exception-invoke

\textit{post}_{1}

9/9 = 100%

3/9 = 33%

✗

executor-graceful-shutdown

\textit{post}_{1}

\textit{post}_{2}

18/18 = 100%

9/18 = 50%

✗

start-activity

\textit{post}_{1}

6/6 = 100%

✓

\textit{post}_{2}

\textit{post}_{3}

Table 4. Comparison of ILP based graph alignment with GumTree on aligning unpaired code examples for rule synthesis. Node mappings are only reported for those partitions of conforming examples that contain at least two new unpaired examples. We compare alignment over just action nodes since they have a 1-1 map between ASTs and PDGs.

6.6. ILP-based Graph Alignment vs GumTree

We now compare ILP-based alignment and state-of-the-art code differencing algorithms on aligning unpaired code examples for rule synthesis. GumTree (Falleri et al., 2014) is a popular choice for performing AST differencing. It is effective when aligning code changes in which the two snippets overlap a lot. If the examples are “unpaired” (or have small overlap), GumTree fails in some cases to precisely align them. We consider $9$ scenarios covering all $8$ iteratively refined rules where a conjunctive postcondition formula is synthesized from a partition of unpaired conforming code examples. In Table 4, we tabulate the number of nodes mappings established by the two alignment algorithms for every pair of code example in the partition. The number of nodes mapped by GumTree range from $33\%$ to $91\%$ in the failed scenarios. ILP maps all the nodes.

When we use node mappings established by the GumTree algorithm to merge these examples, we succeed in synthesizing the desired postcondition in only $4/9$ scenarios. In the remaining scenarios, GumTree fails to map some nodes that form the required context for the rule postcondition, and are hence omitted on subsequent merge operations. These rules would lead to false positives. The ILP-formulation leads to successful synthesis of the specific postcondition in all $9$ scenarios.

7. Related Work

Synthesis algorithms for program repair and bug detection: Our work is most closely related to research on synthesis algorithms for automated program repair from code changes (Bader et al., 2019; Rolim et al., 2018, 2017; Miltner et al., 2019; Bavishi et al., 2019; Meng et al., 2013). We can categorize these works based on their techniques. The first category consists of Phoenix (Bavishi et al., 2019) and instantations of the PROSE framework (Polozov and Gulwani, 2015) in Refazer (Rolim et al., 2017) and BluePencil (Miltner et al., 2019). These algorithms search for a program transformation, which explain the set of provided concrete edits, within a DSL. The second category includes Revisar (Rolim et al., 2018), Getafix (Bader et al., 2019) and LASE (Meng et al., 2013) that are based on generalization algorithms over ASTs, using anti-unification or maximum common embedded subtree extraction. Both the above categories of work either rely on an accompanying static analyzer for bug localization or synthesize program transformation patterns that are specific to a given package or domain. Bug localization is harder than repair at a given location (Allamanis et al., 2021). Our approach does not rely on existing static analyzers for bug localization and is able to precisely synthesize new rules. These rules are also applicable to code variations that are present in different packages. Additionally, we observe that correct code may have variations that is not present in code changes. We present an approach to refine rules using additional examples containing such code variations. In comparison, most of the above mentioned prior works rely on paired code changes and none of them support iterative refinement of rules.

DiffCode (Paletov et al., 2018) is an approach to infer rules from code changes geared towards Crypto APIs. Their main focus is on clustering algorithms for code changes. They do not automate the task of synthesizing a rule from a cluster of code changes. Program synthesis has also been recently applied to other code related applications such as API migration (Ni et al., 2021; Gao et al., 2021), synthesis of merge conflict resolutions (Pan et al., 2021), interactive code search (Naik et al., 2021).

Datalog synthesis: There is a rich body of recent work on datalog synthesis (Raghothaman et al., 2019; Si et al., 2018, 2019; Albarghouthi et al., 2017; Mendelson et al., 2021; Thakkar et al., 2021) and its application to code related tasks such as interactive code search (Naik et al., 2021). Datalog programs can express existentially quantified preconditions (Ajtai and Gurevich, 1994) but cannot express rules with a non-vacuous postcondition that introduce quantifier alternation. On the other hand, these algorithms can synthesize recursive predicates that we exclude in our approach. In Section 6, we provide an empirical comparison against ProSynth (Raghothaman et al., 2019) from this category.

Statistical approaches for program synthesis: (Dinella et al., 2020; Tufano et al., 2018) are neural approaches for automatic bug-fix generation. These models are trained on a general bug-fix dataset, not necessarily fixes that correspond to rules, and have a comparatively lower accuracy. TFix (Berabi et al., 2021) improves upon them by fine-tuning the neural models on fixes that correspond to a known set of bug categories or rules. In doing so, it gives up the ability to generate fixes for new rules. (Devlin et al., 2017b; Long and Rinard, 2016) propose a hybrid approach where algorithmic techniques are used to generate candidate fixes and statistical models are used to rank them. (Allamanis et al., 2018; Pradel and Sen, 2018; Vasic et al., 2019; Allamanis et al., 2021) present neural models for detecting bugs caused by issues in variable naming and variable misuse. These approaches cannot be easily tailored to generate new detectors from a given set of labeled code examples.

In the domain of synthesizing String transformations, pioneered by FlashFill (Gulwani, 2011), RobustFill (Devlin et al., 2017a) presents a neural approach for program synthesis as well as program induction. Hybrid approaches such as (Gulwani and Jain, 2017; Kalyan et al., 2018; Verbruggen et al., 2021) complement machine learning based induction with algorithmic techniques and present an interesting direction for exploration in the context of rule synthesis.

Statistical approaches for building bug-detectors include data-mining approaches such as APISan (Yun et al., 2016), PR-Miner (Li and Zhou, 2005) and NAR-miner (Bian et al., 2018). These approaches mine popular programming patterns and flag deviants as bugs. Since these mined patterns are based on frequency, these approaches are not able to distinguish between incorrect code and infrequent code. Recent work on mining code patterns has achieved higher precision in finding bugs when the mined patterns are applied to the same code-base they are extracted from (Ahmadi et al., 2021). Arbitrar (Li et al., 2021) improves upon APISan (Yun et al., 2016) by incorporating user-feedback through active learning algorithms. Both Arbitrar and APISan employ symbolic execution to extract semantic features of a program, as opposed to a lighter-weight static analysis in RhoSynth. Rules based on symbolic execution might be too expensive for a real-time code reviewing application.

Interactive program synthesis: Researchers have recently explored the question-selection problem (Ji et al., 2020) and other user-interaction models (Le et al., 2017; Bavishi et al., 2021; Ferdowsifard et al., 2021) in the context of interactive program synthesis. Rule refinement is also an instance of interactive program synthesis and applying these approaches to the domain of rule synthesis is an interesting research direction.

DSLs for expressing rules in static checkers: Most static checkers, such as SonarQube⁹⁹9https://docs.sonarqube.org/latest/extend/adding-coding-rules/, PMD¹⁰¹⁰10https://pmd.github.io/latest/pmd_userdocs_extending_writing_pmd_rules.html, Semmle and Semgrep, allow users to write their own custom rules. These rules are often written in a DSL such as ProgQuery (Rodriguez-Prieto et al., 2020) or CodeQL. We are not aware of any tool that synthesizes rules in these DSLs automatically from labeled code examples. In fact, a recent survey (Raychev, 2021) identifies automating the rule creation process in static checkers as a largely unexplored area.

Synthesis frameworks: SyGus (Alur et al., 2013) and CEGIS (Solar Lezama, 2008) are two popular synthesis frameworks in domains with logical specifications. Recently, Wang et al. (Wang et al., 2021) have proposed an approach that combines SyGus with decision tree learning for synthesizing specific static analyses that detect side-channel information leaks.

8. Conclusions

In this paper, we present RhoSynth, a new algorithm for synthesizing code-quality rules from labeled code examples. RhoSynth performs rule synthesis on graph representations of code and is based on a novel ILP based graph alignment algorithm. We validate our algorithm by synthesizing more than 30 rules that have been deployed in production. Our experimental results show that the rules synthesized by our approach have high precision (greater than 75% in live code reviews) which make them suitable for real-world applications. In the case of low-precision rules, we show that rule refinement can leverage additional examples to incrementally improve the rules. Interestingly, these synthesized rules are capable of enforcing several documented code-quality recommendations. Through comparisons with recent baselines, we show that current state-of-the-art program synthesis approaches are unable to synthesize most rules.

References

(1)
Ahmadi et al. (2021) Mansour Ahmadi, Reza Mirzazade farkhani, Ryan Williams, and Long Lu. 2021. Finding Bugs Using Your Own Code: Detecting Functionally-similar yet Inconsistent Code. In 30th USENIX Security Symposium (USENIX Security 21). USENIX Association. https://www.usenix.org/conference/usenixsecurity21/presentation/ahmadi
Ajtai and Gurevich (1994) Miklós Ajtai and Yuri Gurevich. 1994. Datalog vs First-Order Logic. J. Comput. Syst. Sci. 49, 3 (1994), 562–588. https://doi.org/10.1016/S0022-0000(05)80071-6
Albarghouthi et al. (2017) Aws Albarghouthi, Paraschos Koutris, Mayur Naik, and Calvin Smith. 2017. Constraint-Based Synthesis of Datalog Programs. In Principles and Practice of Constraint Programming, J. Christopher Beck (Ed.). Springer International Publishing, Cham, 689–706.
Allamanis et al. (2018) Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. 2018. Learning to Represent Programs with Graphs. In International Conference on Learning Representations. https://openreview.net/forum?id=BJOFETxR-
Allamanis et al. (2021) Miltiadis Allamanis, Henry Jackson-Flux, and Marc Brockschmidt. 2021. Self-Supervised Bug Detection and Repair. CoRR abs/2105.12787 (2021). arXiv:2105.12787 https://arxiv.org/abs/2105.12787
Alur et al. (2013) Rajeev Alur, Rastislav Bodik, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. 2013. Syntax-Guided Synthesis. In Proceedings of the IEEE International Conference on Formal Methods in Computer-Aided Design (FMCAD). 1–17.
Bader et al. (2019) Johannes Bader, Andrew Scott, Michael Pradel, and Satish Chandra. 2019. Getafix: Learning to Fix Bugs Automatically. Proc. ACM Program. Lang. 3, OOPSLA, Article 159 (Oct. 2019), 27 pages. https://doi.org/10.1145/3360585
Barrett and Tinelli (2018) Clark Barrett and Cesare Tinelli. 2018. Satisfiability Modulo Theories. Springer International Publishing, Cham, 305–343. https://doi.org/10.1007/978-3-319-10575-8_11
Bavishi et al. (2021) Rohan Bavishi, Caroline Lemieux, Koushik Sen, and Ion Stoica. 2021. Gauss: Program Synthesis by Reasoning over Graphs. Proc. ACM Program. Lang. 5, OOPSLA, Article 134 (oct 2021), 29 pages. https://doi.org/10.1145/3485511
Bavishi et al. (2019) Rohan Bavishi, Hiroaki Yoshida, and Mukul R. Prasad. 2019. Phoenix: Automated Data-Driven Synthesis of Repairs for Static Analysis Violations. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Tallinn, Estonia) (ESEC/FSE 2019). Association for Computing Machinery, New York, NY, USA, 613–624. https://doi.org/10.1145/3338906.3338952
Beck (2002) Kent Beck. 2002. Test Driven Development. By Example (Addison-Wesley Signature). Addison-Wesley Longman, Amsterdam.
Berabi et al. (2021) Berkay Berabi, Jingxuan He, Veselin Raychev, and Martin Vechev. 2021. TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer. In Proceedings of the 38th International Conference on Machine Learning, ICML 2021 (Proceedings of Machine Learning Research).
Bian et al. (2018) Pan Bian, Bin Liang, Wenchang Shi, Jianjun Huang, and Yan Cai. 2018. NAR-Miner: Discovering Negative Association Rules from Code for Bug Detection. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Lake Buena Vista, FL, USA) (ESEC/FSE 2018). Association for Computing Machinery, New York, NY, USA, 411–422. https://doi.org/10.1145/3236024.3236032
De Moura and Bjørner (2008) Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. In Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (Budapest, Hungary) (TACAS’08/ETAPS’08). Springer-Verlag, Berlin, Heidelberg, 337–340.
Devlin et al. (2017a) Jacob Devlin, Jonathan Uesato, Surya Bhupatiraju, Rishabh Singh, Abdel-rahman Mohamed, and Pushmeet Kohli. 2017a. RobustFill: Neural Program Learning under Noisy I/O. CoRR abs/1703.07469 (2017). arXiv:1703.07469 http://arxiv.org/abs/1703.07469
Devlin et al. (2017b) Jacob Devlin, Jonathan Uesato, Rishabh Singh, and Pushmeet Kohli. 2017b. Semantic Code Repair using Neuro-Symbolic Transformation Networks. CoRR abs/1710.11054 (2017). http://arxiv.org/abs/1710.11054
Dinella et al. (2020) Elizabeth Dinella, Hanjun Dai, Ziyang Li, Mayur Naik, Le Song, and Ke Wang. 2020. Hoppity: Learning Graph Transformations to Detect and Fix Bugs in Programs. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net. https://openreview.net/forum?id=SJeqs6EFvB
Falleri et al. (2014) Jean-Rémy Falleri, Floréal Morandat, Xavier Blanc, Matias Martinez, and Martin Monperrus. 2014. Fine-Grained and Accurate Source Code Differencing. In Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering (Vasteras, Sweden) (ASE ’14). Association for Computing Machinery, New York, NY, USA, 313–324. https://doi.org/10.1145/2642937.2642982
Ferdowsifard et al. (2021) Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova. 2021. LooPy: Interactive Program Synthesis with Control Structures. Proc. ACM Program. Lang. 5, OOPSLA, Article 153 (oct 2021), 29 pages. https://doi.org/10.1145/3485530
Ferrante et al. (1987) Jeanne Ferrante, Karl J. Ottenstein, and Joe D. Warren. 1987. The Program Dependence Graph and Its Use in Optimization. ACM Trans. Program. Lang. Syst. 9, 3 (July 1987), 319–349. https://doi.org/10.1145/24039.24041
Gao et al. (2021) Xiang Gao, Arjun Radhakrishna, Gustavo Soares, Ridwan Shariffdeen, Sumit Gulwani, and Abhik Roychoudhury. 2021. APIfix: Output-Oriented Program Synthesis for Combating Breaking Changes in Libraries. Proc. ACM Program. Lang. 5, OOPSLA, Article 161 (oct 2021), 27 pages. https://doi.org/10.1145/3485538
Gulwani (2011) Sumit Gulwani. 2011. Automating String Processing in Spreadsheets Using Input-Output Examples. SIGPLAN Not. 46, 1 (Jan. 2011), 317–330. https://doi.org/10.1145/1925844.1926423
Gulwani and Jain (2017) Sumit Gulwani and Prateek Jain. 2017. Programming by Examples: PL Meets ML. In Programming Languages and Systems - 15th Asian Symposium, APLAS 2017, Suzhou, China, November 27-29, 2017, Proceedings (Lecture Notes in Computer Science, Vol. 10695), Bor-Yuh Evan Chang (Ed.). Springer, 3–20. https://doi.org/10.1007/978-3-319-71237-6_1
Haussler (1989) David Haussler. 1989. Learning Conjunctive Concepts in Structural Domains. Mach. Learn. 4 (1989), 7–40. https://doi.org/10.1007/BF00114802
Ji et al. (2020) Ruyi Ji, Jingjing Liang, Yingfei Xiong, Lu Zhang, and Zhenjiang Hu. 2020. Question selection for interactive program synthesis. In Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2020, London, UK, June 15-20, 2020, Alastair F. Donaldson and Emina Torlak (Eds.). ACM, 1143–1158. https://doi.org/10.1145/3385412.3386025
Kalyan et al. (2018) Ashwin Kalyan, Abhishek Mohta, Oleksandr Polozov, Dhruv Batra, Prateek Jain, and Sumit Gulwani. 2018. Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net. https://openreview.net/forum?id=rywDjg-RW
Kreutzer et al. (2016) Patrick Kreutzer, Georg Dotzler, Matthias Ring, Bjoern M. Eskofier, and Michael Philippsen. 2016. Automatic Clustering of Code Changes. In Proceedings of the 13th International Conference on Mining Software Repositories (Austin, Texas) (MSR ’16). Association for Computing Machinery, New York, NY, USA, 61–72. https://doi.org/10.1145/2901739.2901749
Le et al. (2017) Vu Le, Daniel Perelman, Oleksandr Polozov, Mohammad Raza, Abhishek Udupa, and Sumit Gulwani. 2017. Interactive Program Synthesis. CoRR abs/1703.03539 (2017). arXiv:1703.03539 http://arxiv.org/abs/1703.03539
Lerouge et al. (2017) Julien Lerouge, Zeina Abu-Aisheh, Romain Raveaux, Pierre Héroux, and Sébastien Adam. 2017. New binary linear programming formulation to compute the graph edit distance. Pattern Recognition 72 (Dec. 2017), 254 – 265. https://doi.org/10.1016/j.patcog.2017.07.029
Li et al. (2004) Tao Li, Sheng Ma, and Mitsunori Ogihara. 2004. Entropy-based criterion in categorical clustering. In Machine Learning, Proceedings of the Twenty-first International Conference (ICML 2004), Banff, Alberta, Canada, July 4-8, 2004 (ACM International Conference Proceeding Series, Vol. 69), Carla E. Brodley (Ed.). ACM. https://doi.org/10.1145/1015330.1015404
Li et al. (2021) Ziyang Li, Aravind Machiry, Binghong Chen, Mayur Naik, Ke Wang, and Le Song. 2021. ARBITRAR: User-Guided API Misuse Detection. In 2021 IEEE Symposium on Security and Privacy (SP). 1400–1415. https://doi.org/10.1109/SP40001.2021.00090
Li and Zhou (2005) Zhenmin Li and Yuanyuan Zhou. 2005. PR-Miner: Automatically Extracting Implicit Programming Rules and Detecting Violations in Large Software Code. In Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering (Lisbon, Portugal) (ESEC/FSE-13). Association for Computing Machinery, New York, NY, USA, 306–315. https://doi.org/10.1145/1081706.1081755
Long and Rinard (2016) Fan Long and Martin Rinard. 2016. Automatic Patch Generation by Learning Correct Code. SIGPLAN Not. 51, 1 (jan 2016), 298–312. https://doi.org/10.1145/2914770.2837617
Mendelson et al. (2021) Jonathan Mendelson, Aaditya Naik, Mukund Raghothaman, and Mayur Naik. 2021. GENSYNTH: Synthesizing Datalog Programs without Language Bias. In Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, February 2-9, 2021. AAAI Press, 6444–6453. https://ojs.aaai.org/index.php/AAAI/article/view/16799
Meng et al. (2013) Na Meng, Miryung Kim, and Kathryn S. McKinley. 2013. LASE: Locating and Applying Systematic Edits by Learning from Examples. In Proceedings of the 2013 International Conference on Software Engineering (San Francisco, CA, USA) (ICSE ’13). IEEE Press, 502–511.
Miltner et al. (2019) Anders Miltner, Sumit Gulwani, Vu Le, Alan Leung, Arjun Radhakrishna, Gustavo Soares, Ashish Tiwari, and Abhishek Udupa. 2019. On the Fly Synthesis of Edit Suggestions. Proc. ACM Program. Lang. 3, OOPSLA, Article 143 (Oct. 2019), 29 pages. https://doi.org/10.1145/3360569
Naik et al. (2021) Aaditya Naik, Jonathan Mendelson, Nathaniel Sands, Yuepeng Wang, Mayur Naik, and Mukund Raghothaman. 2021. Sporq: An Interactive Environment for Exploring Code Using Query-by-Example. In The 34th Annual ACM Symposium on User Interface Software and Technology (Virtual Event, USA) (UIST ’21). Association for Computing Machinery, New York, NY, USA, 84–99. https://doi.org/10.1145/3472749.3474737
Nguyen et al. (2019) Hoan Anh Nguyen, Tien N. Nguyen, Danny Dig, Son Nguyen, Hieu Tran, and Michael Hilton. 2019. Graph-Based Mining of In-the-Wild, Fine-Grained, Semantic Code Change Patterns. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). 819–830. https://doi.org/10.1109/ICSE.2019.00089
Ni et al. (2021) Ansong Ni, Daniel Ramos, Aidan Z. H. Yang, Inês Lynce, Vasco M. Manquinho, Ruben Martins, and Claire Le Goues. 2021. SOAR: A Synthesis Approach for Data Science API Refactoring. In 43rd IEEE/ACM International Conference on Software Engineering, ICSE 2021, Madrid, Spain, 22-30 May 2021. IEEE, 112–124. https://doi.org/10.1109/ICSE43902.2021.00023
Paletov et al. (2018) Rumen Paletov, Petar Tsankov, Veselin Raychev, and Martin Vechev. 2018. Inferring Crypto API Rules from Code Changes. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (Philadelphia, PA, USA) (PLDI 2018). Association for Computing Machinery, New York, NY, USA, 450–464. https://doi.org/10.1145/3192366.3192403
Pan et al. (2021) Rangeet Pan, Vu Le, Nachiappan Nagappan, Sumit Gulwani, Shuvendu K. Lahiri, and Mike Kaufman. 2021. Can Program Synthesis be Used to Learn Merge Conflict Resolutions? An Empirical Analysis. In 43rd IEEE/ACM International Conference on Software Engineering, ICSE 2021, Madrid, Spain, 22-30 May 2021. IEEE, 785–796. https://doi.org/10.1109/ICSE43902.2021.00077
Polozov and Gulwani (2015) Oleksandr Polozov and Sumit Gulwani. 2015. FlashMeta: A Framework for Inductive Program Synthesis. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (Pittsburgh, PA, USA) (OOPSLA 2015). Association for Computing Machinery, New York, NY, USA, 107–126. https://doi.org/10.1145/2814270.2814310
Pradel and Sen (2018) Michael Pradel and Koushik Sen. 2018. DeepBugs: A Learning Approach to Name-Based Bug Detection. Proc. ACM Program. Lang. 2, OOPSLA, Article 147 (Oct. 2018), 25 pages. https://doi.org/10.1145/3276517
Raghothaman et al. (2019) Mukund Raghothaman, Jonathan Mendelson, David Zhao, Mayur Naik, and Bernhard Scholz. 2019. Provenance-Guided Synthesis of Datalog Programs. Proc. ACM Program. Lang. 4, POPL, Article 62 (Dec. 2019), 27 pages. https://doi.org/10.1145/3371130
Raychev (2021) Veselin Raychev. 2021. Learning to Find Bugs and Code Quality Problems - What Worked and What not?. In 2021 International Conference on Code Quality (ICCQ). 1–5. https://doi.org/10.1109/ICCQ51190.2021.9392977
Rodriguez-Prieto et al. (2020) Oscar Rodriguez-Prieto, Alan Mycroft, and Francisco Ortin. 2020. An Efficient and Scalable Platform for Java Source Code Analysis Using Overlaid Graph Representations. IEEE Access 8 (2020), 72239–72260. https://doi.org/10.1109/ACCESS.2020.2987631
Rokach and Maimon (2005) Lior Rokach and Oded Maimon. 2005. Clustering Methods. In The Data Mining and Knowledge Discovery Handbook, Oded Maimon and Lior Rokach (Eds.). Springer, 321–352. http://dblp.uni-trier.de/db/books/collections/datamining2005.html#RokachM05a
Rolim et al. (2017) Reudismam Rolim, Gustavo Soares, Loris D’Antoni, Oleksandr Polozov, Sumit Gulwani, Rohit Gheyi, Ryo Suzuki, and Björn Hartmann. 2017. Learning syntactic program transformations from examples. In Proceedings of the 39th International Conference on Software Engineering, ICSE 2017, Buenos Aires, Argentina, May 20-28, 2017, Sebastián Uchitel, Alessandro Orso, and Martin P. Robillard (Eds.). IEEE / ACM, 404–415. https://doi.org/10.1109/ICSE.2017.44
Rolim et al. (2018) Reudismam Rolim, Gustavo Soares, Rohit Gheyi, and Loris D’Antoni. 2018. Learning Quick Fixes from Code Repositories. CoRR abs/1803.03806 (2018). arXiv:1803.03806 http://arxiv.org/abs/1803.03806
Ryzhkov (2011) Evgeniy Ryzhkov. 2011. How to add a new diagnostic rule into PVS-Studio? Days from developers’ life. https://pvs-studio.com/en/blog/posts/0110/
Schrijver (1986) Alexander Schrijver. 1986. Theory of Linear and Integer Programming. John Wiley & Sons, Inc., USA.
Si et al. (2018) Xujie Si, Woosuk Lee, Richard Zhang, Aws Albarghouthi, Paraschos Koutris, and Mayur Naik. 2018. Syntax-Guided Synthesis of Datalog Programs. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (Lake Buena Vista, FL, USA) (ESEC/FSE 2018). Association for Computing Machinery, New York, NY, USA, 515–527. https://doi.org/10.1145/3236024.3236034
Si et al. (2019) Xujie Si, Mukund Raghothaman, Kihong Heo, and Mayur Naik. 2019. Synthesizing Datalog Programs using Numerical Relaxation. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019, Sarit Kraus (Ed.). ijcai.org, 6117–6124. https://doi.org/10.24963/ijcai.2019/847
Sobreira et al. (2018) Victor Sobreira, Thomas Durieux, Fernanda Madeiral, Martin Monperrus, and Marcelo de Almeida Maia. 2018. Dissection of a bug dataset: Anatomy of 395 patches from Defects4J. In 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER). 130–140. https://doi.org/10.1109/SANER.2018.8330203
Solar Lezama (2008) Armando Solar Lezama. 2008. Program Synthesis By Sketching. Ph. D. Dissertation. EECS Department, University of California, Berkeley. http://www2.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-177.html
Sven and Nguyen (2019) Amann Sven and Hoan Anh Nguyen. 2019. MUDetect : An API-Misuse Detector. https://github.com/stg-tud/MUDetect.
Thakkar et al. (2021) Aalok Thakkar, Aaditya Naik, Nathaniel Sands, Rajeev Alur, Mayur Naik, and Mukund Raghothaman. 2021. Example-Guided Synthesis of Relational Queries. Association for Computing Machinery, New York, NY, USA, 1110–1125. https://doi.org/10.1145/3453483.3454098
Tufano et al. (2018) Michele Tufano, Cody Watson, Gabriele Bavota, Massimiliano Di Penta, Martin White, and Denys Poshyvanyk. 2018. An Empirical Investigation into Learning Bug-Fixing Patches in the Wild via Neural Machine Translation. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (Montpellier, France) (ASE 2018). Association for Computing Machinery, New York, NY, USA, 832–837. https://doi.org/10.1145/3238147.3240732
Vasic et al. (2019) Marko Vasic, Aditya Kanade, Petros Maniatis, David Bieber, and Rishabh singh. 2019. Neural Program Repair by Jointly Learning to Localize and Repair. In International Conference on Learning Representations. https://openreview.net/forum?id=ByloJ20qtm
Verbruggen et al. (2021) Gust Verbruggen, Vu Le, and Sumit Gulwani. 2021. Semantic Programming by Example with Pre-Trained Models. Proc. ACM Program. Lang. 5, OOPSLA, Article 100 (oct 2021), 25 pages. https://doi.org/10.1145/3485477
Wang et al. (2021) Jingbo Wang, Chungha Sung, Mukund Raghothaman, and Chao Wang. 2021. Data-Driven Synthesis of Provably Sound Side Channel Analyses. In 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). 810–822. https://doi.org/10.1109/ICSE43902.2021.00079
Yan et al. (2013) Junchi Yan, Yu Tian, Hongyuan Zha, Xiaokang Yang, Ya Zhang, and Stephen M. Chu. 2013. Joint Optimization for Consistent Multiple Graph Matching. In 2013 IEEE International Conference on Computer Vision. 1649–1656. https://doi.org/10.1109/ICCV.2013.207
Yun et al. (2016) Insu Yun, Changwoo Min, Xujie Si, Yeongjin Jang, Taesoo Kim, and Mayur Naik. 2016. APISAN: Sanitizing API Usages through Semantic Cross-Checking. In Proceedings of the 25th USENIX Conference on Security Symposium (Austin, TX, USA) (SEC’16). USENIX Association, USA, 363–378.

Appendix A Input examples to RhoSynth

We obtain code changes from $27,752$ Java GitHub packages that were selected based on their license, i.e., Apache or MIT-license, and star rating¹¹¹¹11A list of these GitHub packages is included in the supplementary material.. These code changes are extracted using CPatMiner (Nguyen et al., 2019). The corpus consists of $1.84M$ code changes. We cluster PDG representation of these code changes in two phases – (1) we map each code change to a set of APIs that are added, deleted or replaced in the change; (2) for each API, we cluster the mapped code changes using a bottom-up hierarchical clustering algorithm (Rokach and Maimon, 2005) (a code change is featurized using a fixed set of predicate features over its PDG representation and Jaccard distance is used to compute similarity between code changes). The clustering algorithm is tuned to output clusters that have a high homogeneity score. This ensures that all code-changes within the same cluster correspond to the same rule.

We tag each code change with the relative file path, the name and the line number of the method containing the change. We use this triple as a criterion to remove code-change overlap within a cluster. We filtered clusters containing less than 3 examples and having examples from less than 2 repositories or 2 commit ids. The latter ensures that the code changes are not repository-specific. We then choose clusters containing popular APIs belonging to popular frameworks such as Java.util, Android, Apache, Guava, etc. The popularity is determined based on frequency of occurrence in the corpus. This results in 1397 clusters. We pick 300 clusters randomly from this set. The authors manually examined these clusters and select those clusters whose underlying APIs and code constructs find relevant results on a Google search, which is evidence for a general applicability of the code changes in the cluster. For e.g., code-changes for the “use-guava-hashmap” rule involve the HashMap constructor and Maps.newHashMapWithExpectedSize API that have an associated Stack-Overflow post. We spend on average 5 minutes for manually examining each cluster. This examination resulted in 31 clusters which are then input to RhoSynth.

The paper containing the details of clustering is under submission.

Appendix B Example of PDG

Refer to Figure 6 for the PDG of the code-after snippet in Figure 1(a). Oval nodes in the figure indicate data nodes and rectangular nodes indicate action nodes.

Appendix C Maximal Graph Alignment using ILP

In this section, we describe the reduction from maximizing graph alignment to an integer linear program (ILP). Let $\mathscr{A}_{1}=(N_{1},E_{1},Lat_{1},M_{f1},M_{b1})$ and $\mathscr{A}_{2}=(N_{2},E_{2},Lat_{2}$ , $M_{f2},M_{b2})$ be two UAPDGs over the same set of free variables $\textit{Var}_{f}$ . Graph alignment on $\mathscr{A}_{1}$ and $\mathscr{A}_{2}$ outputs a node mapping $NM\subseteq N_{1}\times N_{2}$ and an edge mapping $EM:E_{1}\times E_{2}$ as follows. The ILP optimization problem is defined over the following set of variables:

(1)

$z_{n_{1}n_{2}}$ are 0-1 variables that are set if $(n_{1},n_{2})\in NM$
(2)

$z_{e_{1}e_{2}}$ are 0-1 variables that are set if $(e_{1},e_{2})\in EM$
(3)

$z_{n_{1}},z_{n_{2}},z_{e_{1}},z_{e_{2}}$ are 0-1 variables that are set if the corresponding nodes and edges are not mapped by $NM$ and $EM$ respectively

The ILP objective function seeks to maximize the number of mapped nodes and edges:

\textit{Objective function}=\max\limits_{\stackrel{{\scriptstyle z_{n_{1}n_{2}},z_{e_{1}e_{2}},}}{{z_{n_{1}},z_{n_{2}},z_{e_{1}},z_{e_{2}}}}}\sum\limits_{n_{1}\in N_{1},n_{2}\in N_{2}}z_{n_{1}n_{2}}+\sum\limits_{e_{1}\in E_{1},e_{2}\in E_{2}}z_{e_{1}e_{2}}

The optimization is subject to the following constraints:

(1)

Each node $n_{1}\in N_{1}$ is optionally mapped to a node in $N_{2}$ : $z_{n_{1}}+\sum\limits_{n_{2}\in N_{2}}z_{n_{1}n_{2}}=1$ for all $n_{1}\in N_{1}$ .
(2)

Each node $n_{2}\in N_{2}$ is optionally mapped to a node in $N_{1}$ : $z_{n_{2}}+\sum\limits_{n_{1}\in N_{1}}z_{n_{1}n_{2}}=1$ for all $n_{2}\in N_{2}$ .
(3)

Each edge $e_{1}\in E_{1}$ is optionally mapped to an edge in $E_{2}$ : $z_{e_{1}}+\sum\limits_{e_{2}\in E_{2}}z_{e_{1}e_{2}}=1$ for all $e_{1}\in E_{1}$ .
(4)

Each edge $e_{2}\in E_{2}$ is optionally mapped to an edge in $E_{1}$ : $z_{e_{2}}+\sum\limits_{e_{1}\in E_{1}}z_{e_{1}e_{2}}=1$ for all $e_{2}\in E_{2}$ .
(5)

Nodes mapped to the same free variables are aligned: $z_{n_{1}n_{2}}=1$ if $M_{f1}(n_{1})=M_{f2}(n_{2})$ .
(6)

Action nodes with different labels are not aligned: $z_{n_{1}n_{2}}=0$ if $n_{1}$ and $n_{2}$ are action nodes with different labels.
(7)

Action nodes with different change tags are not aligned while synthesizing preconditions: $z_{n_{1}n_{2}}=0$ if change tag of $n_{1}$ and $n_{2}$ are different, where $n_{1}$ , $n_{2}$ are both action nodes.
(8)

Edges with different labels are not aligned: $z_{e_{1}e_{2}}=0$ if $e_{1}$ and $e_{2}$ have different labels.
(9)

Node and edge mappings satisfy the topological constraints. This means if edges $e_{1}$ and $e_{2}$ are mapped then the nodes at their sources and destinations must be mapped: $z_{e_{1}e_{2}}\leq z_{n_{1}n_{2}}$ , where $n_{1}=\textit{src}(e_{1})$ and $n_{2}=\textit{src}(e_{2})$ for all $e_{1}\in E_{1}$ and $e_{2}\in E_{2}$ . Similarly, $z_{e_{1}e_{2}}\leq z_{n_{1}n_{2}}$ , where $n_{1}=\textit{dest}(e_{1})$ and $n_{2}=\textit{dest}(e_{2})$ .
(10)

If two data nodes align then they must have at least one aligned incoming or outgoing edge:
$z_{n_{1}n_{2}}\leq\sum\limits_{n_{1}=\textit{src}(e_{1}),n_{2}=\textit{src}(e_{2})}z_{e_{1}e_{2}}+\sum\limits_{n_{1}=\textit{dest}(e_{1},n_{2}=\textit{dest}(e_{2})}z_{e_{1}e_{2}}$ for data nodes $n_{1}$ and $n_{2}$ .

Appendix D Proof of Theorem 2

We first introduce two new lemmas that help us in proving the main theorem.

Lemma 1.

Given a UAPDG $\mathscr{A}=(N,E,Lat,M_{f},M_{b})$ . Let $\mathscr{A}^{\prime}=(N^{\prime},E^{\prime},Lat^{\prime},M_{f}^{\prime},M_{b}^{\prime})$ be obtained by projecting $\mathscr{A}$ to a subset of nodes and edges, $N^{\prime}\subseteq N$ and $E^{\prime}\subseteq E$ , in the following manner:

(a)

if $n\in M_{f}\Rightarrow n\in M_{f}^{\prime}$ .
(b)

if $(n_{1},n_{2})\in E\textit{ and }n_{1}\in M_{f}\textit{ and }n_{2}\in M_{f}\Rightarrow(n_{1},n_{2})\in E^{\prime}$ .

Then $\mathscr{A}\Rightarrow\mathscr{A}^{\prime}$ .

Proof.

Let $\mathscr{A}$ correspond to the formula $\exists\vec{y}.\varphi(\vec{x},\vec{y})$ and $\mathscr{A^{\prime}}$ correspond to the formula $\exists\vec{y}.\varphi^{\prime}(\vec{x},\vec{y})$ . One can easily argue that $\varphi(\vec{x},\vec{y})\Rightarrow\varphi^{\prime}(\vec{x},\vec{y})$ . Therefore, any model $A^{\vec{x},\vec{y}}$ of $\varphi(\vec{x},\vec{y})$ will also be a model of $\varphi^{\prime}(\vec{x},\vec{y})$ . ∎

Lemma 2.

Let $\mathscr{A}_{1}=(N_{1},E_{1},Lat_{1},M_{f1},M_{b1})$ and $\mathscr{A}_{2}=(N_{2},$ $E_{2},$ $Lat_{2}$ , $M_{f2},M_{b2})$ be two UAPDGs defined over the same set of free variables $\vec{x}=\{x_{1},\cdots,x_{n}\}$ and bound variables $\vec{y}=\{y_{1},\cdots,y_{k}\}$ . Let $NM\subseteq N_{1}\times N_{2}$ and $EM\subseteq E_{1}\times E_{2}$ be node and edge mappings obtained from aligning $\mathscr{A}_{1}$ and $\mathscr{A}_{2}$ such that (a) all nodes in $N_{1}$ and $N_{2}$ and all edges in $E_{1}$ and $E_{2}$ are mapped through $NM$ and $EM$ respectively, and (b) $(n_{1},n_{2})\in NM\Rightarrow$ either $M_{f1}(n_{1})=M_{f2}(n_{2})$ or $M_{b1}(n_{1})=M_{b2}(n_{2})$ . Then $\mathscr{A}_{1}\vee\mathscr{A}_{2}\Rightarrow\texttt{merge}_{(NM,EM)}(\mathscr{A}_{1},\mathscr{A}_{2})$ .

Proof.

Let $\mathscr{A}(N,E,Lat,M_{f},M_{b})=\texttt{merge}_{(NM,EM)}(\mathscr{A}_{1},\mathscr{A}_{2})$ . Without loss of generality, let $M_{b}$ be such that $M_{b}(n_{1},n_{2})=M_{b1}(n_{1})=M_{b2}(n_{2})$ . Let the UAPDGs $\mathscr{A},\mathscr{A}_{1}$ and $\mathscr{A}_{2}$ correspond to formulas $\exists\vec{y}.\varphi(\vec{x},\vec{y})$ , $\exists\vec{y}.\varphi_{1}(\vec{x},\vec{y})$ and $\exists\vec{y}.\varphi_{2}(\vec{x},\vec{y})$ respectively. Let $A_{1}^{\vec{x},\vec{y}}$ be a graph structure with its nodes labeled with $\vec{x}$ and $\vec{y}$ such that $A_{1}^{\vec{x},\vec{y}}$ is a model for $\mathscr{A}_{1}$ . This implies that $A_{1}^{\vec{x},\vec{y}}\models\varphi_{1}(\vec{x},\vec{y})$ .

We want to prove that $A_{1}^{\vec{x},\vec{y}}\models\varphi(\vec{x},\vec{y})$ . With this, we can argue that $\mathscr{A}_{1}\Rightarrow\mathscr{A}$ . We prove by contradiction. Let $A_{1}^{\vec{x},\vec{y}}\not\models\varphi(\vec{x},\vec{y})$ . Since $\varphi$ is a conjunction of various atomic formulas, there is at least one atomic formula that is not satisfied by $A_{1}^{\vec{x},\vec{y}}$ .

-

Let $A_{1}^{\vec{x},\vec{y}}\not\models$ $p\xrightarrow{e}q$ for $p,q\in\vec{x}\cup\vec{y}$ . Let the node in $\mathscr{A}$ mapped to $p$ and $q$ , through $M_{f}$ or $M_{b}$ , be $(n_{1},n_{2})$ and $(n_{1}^{\prime},n_{2}^{\prime})$ respectively. From the definition of $M_{b}$ and $M_{f}$ we deduce that $n_{1}\in N_{1}$ is mapped to $p$ and $n_{1}^{\prime}\in N_{1}$ is mapped to $q$ , through maps $M_{f1}$ and $M_{b1}$ . Further, given the topological constraints on the alignment maps $NM$ and $EM$ one can show that $n_{1}\xrightarrow{e}n_{1}^{\prime}$ holds in $\mathscr{A}_{1}$ . Since $\varphi_{1}$ is obtained by conjoining all atomic node and edge predicates that holds in $\mathscr{A}_{1}$ , it implies that $\varphi_{1}(\vec{x},\vec{y})\Rightarrow p\xrightarrow{e}q$ . Since, $A_{1}^{\vec{x},\vec{y}}$ satisfies $\varphi_{1}(\vec{x},\vec{y})$ , it implies that $A_{1}^{\vec{x},\vec{y}}\models p\xrightarrow{e}q$ , which is a contradiction.
-

Let $A_{1}^{\vec{x},\vec{y}}\not\models$ $\eta(p)$ for $p\in\vec{x}\cup\vec{y}$ . Let the node in $\mathscr{A}$ mapped to $p$ , through $M_{f}$ or $M_{b}$ , be $(n_{1},n_{2})$ . From the definition of $M_{b}$ and $M_{f}$ we deduce that $n_{1}\in N_{1}$ is mapped to $p$ through maps $M_{f1}$ and $M_{b1}$ . From the construction of $\mathscr{A}$ , we know that $Lat(n_{1})\sqsubseteq Lat(n_{1},n_{2})$ . Since, $\varphi_{1}(\vec{x},\vec{y})$ includes all atomic node predicates that hold over $n_{1}$ , $\varphi_{1}(\vec{x},\vec{y})\Rightarrow[[Lat(n_{1})]]$ . Since $A_{1}^{\vec{x},\vec{y}}$ satisfies $\varphi_{1}(\vec{x},\vec{y})$ , it implies that $A_{1}^{\vec{x},\vec{y}}$ satisfies $Lat(n_{1},n_{2})$ and hence the formula $\eta(p)$ , which is a contradiction.

Similarly, we show that $\mathscr{A}_{2}\Rightarrow\mathscr{A}$ . Together, this proves that $\mathscr{A}_{1}\vee\mathscr{A}_{2}\Rightarrow\mathscr{A}$ .

∎

Theorem For any node mapping $NM\subseteq N_{1}\times N_{2}$ and edge mapping $EM\subseteq E_{1}\times E_{2}$ that is obtained from aligning $\mathscr{A}_{1}$ and $\mathscr{A}_{2}$ , $\mathscr{A}_{1}\vee\mathscr{A}_{2}\Rightarrow\texttt{project}_{\{\mathscr{A}_{1},\mathscr{A}_{2}\}}(\texttt{merge}_{(NM^{\epsilon},EM^{\epsilon})}(\mathscr{A}_{1},\mathscr{A}_{2}))$ .

Proof.

Let $\mathscr{A}_{1}=(N_{1},E_{1},Lat_{1},M_{f1},M_{b1})$ and $\mathscr{A}_{2}=(N_{2},E_{2},Lat_{2}$ , $M_{f2},M_{b2})$ be the two UAPDGs, with the same set of free variables $\vec{x}=\{x_{1},\cdots,x_{n}\}$ . Let the node and edge mappings obtained on aligning $\mathscr{A}_{1}$ and $\mathscr{A}_{2}$ be $NM$ and $EM$ respectively. The ILP constraints ensure that $NM$ respect the maps $M_{f1}$ and $M_{f2}$ , i.e., $M_{f1}(n_{1})=M_{f2}(n_{2})\Rightarrow(n_{1},n_{2})\in NM$ .

Let $N_{1}^{\prime}=\{n_{1}\mid(n_{1},n_{2})\in NM\}$ and $E_{1}^{\prime}=\{e_{1}\mid(e_{1},e_{2})\in EM\}$ be subsets of $N_{1}$ and $E_{1}$ respectively obtained by projecting maps $NM$ and $EM$ to $\mathscr{A}_{1}$ . We define sets $N_{2}^{\prime}$ and $E_{2}^{\prime}$ by projecting $NM$ and $EM$ to $\mathscr{A}_{2}$ in the same way. Let $\mathscr{A}_{1}^{\prime}=\texttt{project}(\mathscr{A}_{1})\downarrow_{(N_{1}^{\prime},E_{1}^{\prime})}$ and $\mathscr{A}_{2}^{\prime}=\texttt{project}(\mathscr{A}_{2})\downarrow_{(N_{2}^{\prime},E_{2}^{\prime})}$ . From Lemma 1, we know that $\mathscr{A}_{1}\Rightarrow\mathscr{A}_{1}^{\prime}$ and $\mathscr{A}_{2}\Rightarrow\mathscr{A}_{2}^{\prime}$ . If we can prove that $A_{1}^{\prime}\vee A_{2}^{\prime}\Rightarrow\texttt{merge}_{(NM,EM)}(\mathscr{A})$ then we can easily show that $A_{1}\vee A_{2}\Rightarrow\texttt{merge}_{(NM,EM)}(\mathscr{A})$ .

Let us therefore prove that $A_{1}^{\prime}\vee A_{2}^{\prime}\Rightarrow$ $\texttt{project}($ $\texttt{merge}_{(NM,EM)}($ $\mathscr{A}))$ . Without loss of generality, let us assume that bound variables in $\mathscr{A}_{1}^{\prime}$ and $\mathscr{A}_{2}^{\prime}$ are labeled such that $M_{b1}(n_{1})=M_{b2}(n_{2})\Rightarrow(n_{1},n_{2})\in NM$ . Since,

(a)

all nodes in $N_{1}^{\prime}$ and $N_{2}^{\prime}$ and all edges in $E_{1}^{\prime}$ and $E_{2}^{\prime}$ are mapped through $NM$ and $EM$ respectively (from construction of $\mathcal{A}_{1}^{\prime}$ and $\mathcal{A}_{1}2^{\prime}$ )
(b)

If $(n_{1},n_{2})\in NM$ , then if $n_{1}$ is mapped to a free variable,it follows from the properties of $NM$ that $M_{f1}(n_{1})=M_{f2}(n_{2})$ . On the other hand, if $n_{1}$ is mapped to a bound variable, then it follows from our assumption above that $M_{b1}(n_{1})=M_{b2}(n_{2})$ .

it follows from Lemma 2 that $A_{1}^{\prime}\vee A_{2}^{\prime}\Rightarrow\texttt{merge}_{(NM,EM)}(\mathscr{A})$ . From the definition of the project operator, one can easily see that UAPDG $\texttt{merge}_{(NM,EM)}(\mathscr{A})=\texttt{project}_{\{\mathscr{A}_{1},\mathscr{A}_{2}\}}($ $\texttt{merge}_{(NM^{\epsilon},EM^{\epsilon})}($ $\mathscr{A}))$ . This concludes the proof of the theorem.

∎

Appendix E Proof of Theorem 3

Theorem [Soundness of Rule Synthesis:] Given a set of violating examples $\mathscr{V}=\{V_{1},\cdots,V_{m}\}$ and conforming examples $\mathscr{C}=\{C_{1},\cdots,C_{n}\}$ , if the algorithm synthesizeRule successfully returns a return $R$ , then $V_{i}\models R$ and $C_{i}\not\models R$ .

Proof.

It is easy to see in the algorithm synthesizeRule (refer to Algorithm 1) that a rule $R$ is successfully returned only if $V_{i}\models R$ , for all $i\in\{1,\cdots,m\}$ . We now argue that when $R$ is successfully returned, $C_{i}\not\models R$ , for all $i\in\{1,\cdots,n\}$ . Proving this involves two cases:

(1) When $C_{i}\not\models\exists\vec{x}.\textit{pre}(\vec{x})$ . Since $R$ strengthens the precondition, it follows that $C_{i}\not\models R$ .

(2) When $C_{i}^{\vec{x}}\models\textit{pre}(\vec{x})$ . Since $R=\exists\vec{x}.\textit{pre}(\vec{x})\wedge\neg\exists\vec{y}.\textit{post}(\vec{x},\vec{y})$ , to prove that $C_{i}^{\vec{x}}\not\models R$ , we need to prove that $C_{i}^{\vec{x}}\not\models\neg\exists\vec{y}.\textit{post}(\vec{x},\vec{y})$ , or in other words, $C_{i}^{\vec{x}}\models\exists\vec{y}.\textit{post}(\vec{x},\vec{y})$ .

Proof that $C_{i}^{\vec{x}}\models\exists\vec{y}.\textit{post}(\vec{x},\vec{y})$ : Let $\mathscr{C}_{1}^{\vec{x}},\cdots,\mathscr{C}_{k}^{\vec{x}}$ be a partition in the partition list in method synthesizePC. One can argue that $\mathscr{C}_{1}^{\vec{x}}\cup\cdots\cup\mathscr{C}_{k}^{\vec{x}}=\mathscr{C^{\prime}}^{\vec{x}}$ where $\mathscr{C^{\prime}}^{\vec{x}}$ is the set of valuated conforming examples that satisfy the precondition and is computed at line 2 in algorithm 1. Also note that since $C_{i}^{\vec{x}}\models\textit{pre}(x)$ , it follows from the definition of $\mathscr{C^{\prime}}^{\vec{x}}$ that $C_{i}^{\vec{x}}\in\mathscr{C^{\prime}}^{\vec{x}}$ . Without loss of generality, let us assume that $C_{i}^{\vec{x}}\in\mathscr{C}_{j}^{\vec{x}}$ (the item in $j^{th}$ partition). As getConjunctiveRule returns the UAPDG obtained on merging all its input Valuated PDGs, it follows from the soudness of merge algorithm (Theorem 2) that $C_{i}^{\vec{x}}\models\exists\vec{y}.\textit{post}_{j}(\vec{x},\vec{y})$ . Consequently, $C_{i}^{\vec{x}}\models\exists\vec{y}.\textit{post}(\vec{x},\vec{y})$ .

∎

Appendix F Synthesized Rule Examples

In this appendix, we list an example code change and visualize the synthesized precondition and postcondition UAPDGs for few code-quality rules described in Table 6.3. Refer to the supplementary material for details about all synthesized rules.