Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries
Abstract.
This paper presents predicate transfer, a novel method that optimizes join performance by pre-filtering tables to reduce the join input sizes. Predicate transfer generalizes Bloom join, which conducts pre-filtering within a single join operation, to multi-table joins such that the filtering benefits can be significantly increased. Predicate transfer is inspired by the seminal theoretical results by Yannakakis, which use semi-joins to pre-filter acyclic queries. Predicate transfer generalizes the theoretical results to arbitrary join graphs and uses Bloom filters in place of semi-joins, leading to significant speedups. Evaluation shows that predicate transfer can outperform Bloom join by 3.1× on average on the TPC-H benchmark.
1. Introduction
Joins constitute a substantial portion of query execution time. One effective principle for enhancing join performance is to minimize join input sizes by pre-filtering rows that will not appear in the join result. Predicate pushdown exemplifies this principle by applying local predicates on a table before executing any join operation.
The Bloom join Ramesh et al. (2009) extends this principle beyond a single table. In the Bloom join, a Bloom filter is constructed using the join key in one table and sent to the other table to filter out rows that do not pass the filter—these rows do not match any keys in the first table and will not participate in the join. The Bloom join can effectively reduce the join input sizes, thereby reducing the query runtime. However, existing Bloom join solutions can perform such pre-filtering only within a single join operation.
In this paper, we further generalize the pre-filtering principle across multiple joins. Namely, we use predicates on individual tables to pre-filter multiple other tables in the query, further reducing the join input sizes. We call this new technique predicate transfer. A predicate on one table $A$ can be transferred (e.g., in the form of a Bloom filter) to a table $B$ that joins with $A$. $B$ can apply the predicate and further transfer it to a table $C$ that joins with $B$ (but does not necessarily join with $A$). The transfer process can propagate further such that the original predicate can filter multiple other tables (e.g., $C$, $D$, etc.). The conventional Bloom join is a special case of the more generalized predicate transfer—a Bloom join is a one-hop predicate transfer.
The idea of predicate transfer is inspired by the seminal paper Yannakakis (1981) by Yannakakis. For an acyclic query that equi-joins multiple tables, the Yannakakis algorithm achieves the theoretically maximum pre-filtering selectivity by adding an additional semi-join phase prior to the actual joins, which filters a table by semi-joining it with other tables. The process filters one table at a time following the tree structure of the query until every predicate is spread across all joining tables.
For all its theoretical elegance, the Yannakakis algorithm has not yet made its way into modern database engines. One main obstacle is the costly hash table probes and memory consumption in the semi-join phase. Predicate transfer aims to address these practical limitations. It significantly reduces the overhead of semi-joins by passing succinct data structures like Bloom filters. Although predicate transfer no longer achieves the theoretically maximum filtering selectivity, it achieves much higher performance overall.
In the rest of the paper, we first describe the background and related work of predicate transfer in Section 2, with a focus on the Bloom join and Yannakakis algorithm. We then describe the design space of predicate transfer in detail, and our current heuristics in different design dimensions in Section 3. We report preliminary performance evaluations on TPC-H tpc (1999) in Section 4, which show that predicate transfer can outperform Bloom join by 3.1× on average (up to 40×) and the Yannakakis algorithm by 3.7×. Finally, Section 5 concludes the paper and discusses future work.
2. Background and Related Work
This section presents the background and related work in Bloom join (Section 2.1) and the Yannakakis algorithm (Section 2.2).
2.1. Bloom join
A Bloom filter Bloom (1970) is a compact probabilistic data structure that determines whether an element exists in a set. A Bloom filter has no false negatives but may have false positives. In a Bloom join of two tables, a Bloom filter is constructed on one table (typically the smaller one) using the join key. The filter is then sent and applied to each row in the other table; if a row does not pass the filter, it matches no row in the first table and should not participate in the join. Since testing a Bloom filter is generally faster than performing a join, Bloom join can speed up query processing, especially when the join is selective. Modern OLAP DBMSs (e.g., Oracle ora (2019), Redshift red (2020), Snowflake sno (2021), Databricks dat (2022)) widely adopt Bloom filters to accelerate join execution.
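To make the mechanism concrete, the following is a minimal sketch of a Bloom join on two in-memory tables. The BloomFilter class, its sizing parameters, and the sample tables are illustrative choices of ours, not part of any system discussed in this paper.

```python
# Minimal Bloom-join sketch: build a Bloom filter over the smaller table's
# join key, pre-filter the larger table with it, then run a regular hash join
# on the reduced input. Filter parameters are arbitrary illustrative values.

class BloomFilter:
    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key):
        for seed in range(self.num_hashes):
            yield hash((seed, key)) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, key):
        # No false negatives; false positives are possible.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def bloom_join(small, large, small_key, large_key):
    bf = BloomFilter()
    for row in small:                      # 1. build the filter on the build side
        bf.add(row[small_key])
    survivors = [row for row in large      # 2. pre-filter the probe side
                 if bf.may_contain(row[large_key])]
    build = {}                             # 3. regular hash join on the reduced input
    for row in small:
        build.setdefault(row[small_key], []).append(row)
    return [{**l, **r} for l in survivors for r in build.get(l[large_key], [])]


nation = [{"n_nationkey": 1, "n_name": "FRANCE"}, {"n_nationkey": 2, "n_name": "GERMANY"}]
customer = [{"c_custkey": i, "c_nationkey": i % 5} for i in range(10)]
print(bloom_join(nation, customer, "n_nationkey", "c_nationkey"))
```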
Most existing Bloom join algorithms can be applied to only a single join operation. This means the predicate on one table can only be used to pre-filter rows in the other table it joins with; namely, the predicate is transferred in one hop and one direction. Some prior work Zhu et al. (2017) has extended the idea to datasets with a star schema, allowing all dimension tables to transfer local predicates to the fact table, which outperformed the baseline Bloom join. However, these solutions do not generalize to more complex query plans.
2.2. Yannakakis algorithm
The Yannakakis algorithm Yannakakis (1981) is a classic algorithm that can pre-filter out all rows from tables that do not appear in the final join result, thereby achieving the theoretically maximum filtering selectivity. The algorithm applies to acyclic join queries; the acyclicity is more formally termed α-acyclicity Yannakakis (1981). The algorithm is proven to run in $O(N + \mathrm{OUT})$ time, where $N$ is the size of the input relations and $\mathrm{OUT}$ is the query output size. Thus, the Yannakakis algorithm is known to be instance optimal since $O(N + \mathrm{OUT})$ is the unavoidable time cost of reading the input and enumerating the output for a query. The algorithm starts by choosing a rooted join tree arbitrarily, and then proceeds with a semi-join phase and a join phase.
Semi-join phase. The semi-join phase contains two passes: the forward pass and the backward pass. The forward pass traverses the join tree in a bottom-up fashion. At each vertex, we filter the table by a sequence of semi-joins with its children. A semi-join of two tables $R$ and $S$ is defined as $R \ltimes S = \{r \in R \mid \exists s \in S \text{ such that } r \text{ joins with } s\}$, which effectively removes all tuples in $R$ that do not join with any tuple in $S$. The forward pass stops when the root node is reached. Similarly, the backward pass traverses the join tree in a top-down fashion. At each vertex, the table is filtered by a semi-join with its parent. The backward pass stops when all leaf nodes are reached. It is proven that both passes can be executed in $O(N)$ time and that all tuples that will not contribute to the output are removed.
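As a concrete illustration, the sketch below runs the two semi-join passes over a rooted join tree. The relation/tree representation and helper names are our own assumptions for exposition, not the paper's implementation.

```python
# Sketch of the Yannakakis semi-join phase. Relations are lists of dict rows;
# `children` maps a node to its child nodes in the rooted join tree, and
# `join_key[(parent, child)]` gives the (parent_column, child_column) pair.

def semi_join(left, right, left_col, right_col):
    """Keep only rows of `left` whose join key appears in `right` (left ⋉ right)."""
    keys = {row[right_col] for row in right}
    return [row for row in left if row[left_col] in keys]


def yannakakis_semijoin_phase(tables, children, root, join_key):
    def forward(node):                       # bottom-up: filter each node by its children
        for child in children.get(node, []):
            forward(child)
            pcol, ccol = join_key[(node, child)]
            tables[node] = semi_join(tables[node], tables[child], pcol, ccol)

    def backward(node):                      # top-down: filter children by their parent
        for child in children.get(node, []):
            pcol, ccol = join_key[(node, child)]
            tables[child] = semi_join(tables[child], tables[node], ccol, pcol)
            backward(child)

    forward(root)
    backward(root)
    return tables
```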
Join phase. The join phase can join the filtered tables in any order. It is proven that regardless of the chosen join order, the join phase can be executed in $O(N + \mathrm{OUT})$ time.
As a reflection, the semi-join phase filters all redundant tuples and the join phase executes the join with automatic robustness: it can join the tables in any order without any intermediate table size blow-up over the output size. The algorithm was later extended by Joglekar et al. Joglekar et al. (2016) to handle aggregations on top of join queries.
3. Predicate Transfer
This section describes the proposed predicate transfer algorithm. We use Query 5 in the TPC-H benchmark tpc (1999) (Figure 1) as a running example. This query contains six tables, six inner joins, and two predicates on tables region and orders, respectively. The discussion assumes equi-joins between tables.
3.1. Overview
Similar to the Yannakakis algorithm, predicate transfer executes a query in two phases.


Phase 1: Predicate Transfer Phase. A join graph is constructed for a query, where each vertex is a table and each edge is a join operation. A local predicate is constructed as a filter (e.g., a Bloom filter) and transferred across the join graph. The schedule of the predicate transfer phase introduces a large design space, which we discuss in Section 3.2.
Phase 2: Join Phase. After the transfer phase finishes, each table has multiple filters, including both local filters and transferred filters. The database can now apply the filters and perform regular joins. The actual inputs of each join will be substantially smaller if the transferred filters are selective. We discuss the join phase in Section 3.3.
In the next two subsections, we will describe the design space of these two phases and the heuristics we currently use to implement predicate transfer in our prototype. These heuristics are largely intuition-based and a more thorough theoretical analysis is left for future work.
3.2. Predicate Transfer Phase
In the rest of this section, we lay out the design space of the transfer phase and describe the design choices we adopt in our prototype.
Filter Transformation. When transferring a filter across edges that have different join keys, the filter must be transformed. For example, a filter constructed on region can be transferred to nation, but the same filter cannot be directly sent to supplier since the join keys do not match. We use the following algorithm to handle the join key mismatch between incoming and outgoing edges on nation. When the incoming filter is received, an empty outgoing filter is created. Then, the columns for both incoming and outgoing join keys in nation are scanned (assuming columnar store; otherwise scan the entire table). Inherent filters of nation are applied during the scan. Then for each row, the incoming join key is used to probe the incoming filter. If a match occurs, the outgoing join key is added to the outgoing filter. At the end of the scan, the outgoing filter is sent to downstream tables (i.e., supplier). The algorithm is efficient as it requires scanning the join keys only once.
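A minimal sketch of this transformation is shown below, assuming columnar access to the two join-key columns and a filter object exposing add/may_contain (such as the Bloom filter sketched in Section 2.1). The function and parameter names are illustrative.

```python
# Sketch of filter transformation at an intermediate node (e.g., nation):
# one scan over the incoming and outgoing join-key columns translates an
# incoming filter into an outgoing filter keyed on a different column.

def transform_filter(incoming_filter, in_key_col, out_key_col, make_filter, row_mask=None):
    """
    incoming_filter : filter received on the incoming join key (add/may_contain)
    in_key_col      : column of incoming-join-key values (columnar scan)
    out_key_col     : column of outgoing-join-key values, positionally aligned
    make_filter     : factory producing an empty outgoing filter
    row_mask        : optional per-row booleans from the table's inherent filters
    """
    outgoing_filter = make_filter()
    for i, (in_key, out_key) in enumerate(zip(in_key_col, out_key_col)):
        if row_mask is not None and not row_mask[i]:
            continue                                 # apply local predicates during the scan
        if incoming_filter.may_contain(in_key):
            outgoing_filter.add(out_key)             # row survives; register outgoing key
    return outgoing_filter

# Example: turn a filter on nation's n_regionkey into one on n_nationkey, to be
# sent downstream to supplier and customer (column values are illustrative).
# out_filter = transform_filter(region_filter, n_regionkey_col, n_nationkey_col, BloomFilter)
```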
Predicate Transfer Graph. The join graph determines the topology of predicate transfer. Figure 1a shows the join graph for Query 5 in TPC-H. Each equi-join is represented as an edge. A predicate transfer graph is a directed subgraph of the join graph. Transfers happen along the selected edges in the subgraph—local predicates of the source vertex are transferred to the target vertex as a filter. Figure 1b shows one predicate transfer graph of TPC-H Q5.
The topology of the predicate transfer graph affects the performance of the predicate transfer phase and also the selectivity of the transferred filters. In this paper, we use a simple heuristic that points each edge from the smaller table to the bigger table. The intuition is the same as why Bloom join builds the Bloom filter on the smaller table—to reduce the Bloom filter size and increase filter selectivity. Our current heuristic does not remove any edge in the join graph when generating the predicate transfer graph. It also guarantees that the resulting graph is a Directed Acyclic Graph (DAG). The predicate transfer graph in Figure 1b follows this heuristic.
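A sketch of this heuristic on a Q5-like join graph follows. The edge list and table sizes are illustrative, using standard TPC-H scale-factor-1 cardinalities rather than numbers from the paper.

```python
# Sketch of the graph heuristic: orient every equi-join edge from the smaller
# table to the bigger one (ties broken by name), which yields a DAG.

def build_transfer_graph(join_edges, table_sizes):
    transfer_graph = {t: [] for t in table_sizes}
    for a, b in join_edges:
        # The smaller (size, name) pair becomes the source of the directed edge.
        src, dst = (a, b) if (table_sizes[a], a) < (table_sizes[b], b) else (b, a)
        transfer_graph[src].append(dst)
    return transfer_graph


# A Q5-like join graph with standard TPC-H SF1 row counts (for illustration).
sizes = {"region": 5, "nation": 25, "supplier": 10_000,
         "customer": 150_000, "orders": 1_500_000, "lineitem": 6_000_000}
edges = [("region", "nation"), ("nation", "supplier"), ("nation", "customer"),
         ("supplier", "customer"), ("supplier", "lineitem"),
         ("customer", "orders"), ("orders", "lineitem")]
print(build_transfer_graph(edges, sizes))
# {'region': ['nation'], 'nation': ['supplier', 'customer'],
#  'supplier': ['customer', 'lineitem'], 'customer': ['orders'],
#  'orders': ['lineitem'], 'lineitem': []}
```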
Transfer Schedule. The transfer schedule determines when and how the predicates are transferred across the predicate transfer graph. Numerous design decisions can be made in the schedule. In particular, the schedule specifies which tables in the query should construct initial local filters to start the transfer process, and the order of issuing the remaining transfers. For each table that sends the local filter out, the schedule determines when the transfer happens—multiple transfers may happen in serial or parallel. Moreover, the transfer can happen back and forth, following both directions of certain edges. Pruning may be adopted to avoid non-beneficial transfers, and the transfer direction may be dynamically adjusted at runtime. Identifying a good transfer schedule is critical to the system performance.
In this paper, we adopt a heuristic that builds the transfer schedule using one forward pass and one backward pass similar to the Yannakakis algorithm. The predicate transfer graph is determined at planning time and remains fixed during runtime. In the forward pass, we build initial local filters on the leaf nodes in the predicate transfer graph (i.e., nodes with only outgoing edges but no incoming edge). These filters are transferred following the topological order of the predicate transfer graph, which exists because the graph is a DAG. If one node has one or more incoming edges, the node will collect all the incoming filters before performing the transformation to produce outgoing filters (LIP-style Zhu et al. (2017) incoming filter ordering can be utilized for further optimization); the transformation will scan the table only once, regardless of the number of incoming or outgoing edges. The forward pass finishes once all filters are fully transferred.
The system then starts the backward pass, where we simply reverse the direction of all edges and repeat the same process in the forward pass. After both passes are done, each table has been reduced based on the transformed filters it receives. The later join phase will start from these pre-filtered tables.
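The two passes can be sketched as below, reusing a source-to-targets graph like the one built above; apply_incoming and make_outgoing are hypothetical callbacks standing in for the per-table scan and filter transformation, not an existing engine API.

```python
# Sketch of the transfer schedule: one topological-order forward pass over the
# DAG, then the same pass over the reversed graph. Callback names are assumptions.
from graphlib import TopologicalSorter

def run_pass(graph, apply_incoming, make_outgoing):
    preds = {node: set() for node in graph}          # invert to predecessor sets
    for src, targets in graph.items():
        for dst in targets:
            preds[dst].add(src)
    inbox = {node: [] for node in graph}             # filters waiting at each node
    for node in TopologicalSorter(preds).static_order():
        apply_incoming(node, inbox[node])            # one scan applies all incoming filters
        for target in graph[node]:                   # emit one transformed filter per edge
            inbox[target].append(make_outgoing(node, target))

def predicate_transfer_phase(graph, apply_incoming, make_outgoing):
    run_pass(graph, apply_incoming, make_outgoing)               # forward pass
    reversed_graph = {node: [] for node in graph}
    for src, targets in graph.items():
        for dst in targets:
            reversed_graph[dst].append(src)
    run_pass(reversed_graph, apply_incoming, make_outgoing)      # backward pass
```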
In the example of Q5 as shown in Figure 1b, the first Bloom filter is constructed for region and sent to nation. The filter is then transformed into two outgoing filters and sent to customer and supplier, respectively. Similarly, supplier transfers two outgoing filters following the edges to customer and lineitem. At customer, two separate incoming filters are applied, with one outgoing filter produced and sent to orders, where it is transformed and sent on to lineitem. The forward pass finishes when both incoming filters arrive at lineitem, and after that the backward pass begins in a symmetric way.
Filter Type. Our discussion so far uses Bloom filters to represent the predicates. In fact, other representations of filters can also be used. If a precise representation is used, i.e., the filter precisely encodes all the join keys, then a transfer becomes a semi-join and the algorithm becomes similar to Yannakakis. An ideal filter should be efficient to construct and check, and achieve low false positive rates. We use Bloom filters in our prototype as they are the best candidate available today, but predicate transfer can automatically benefit from any potential improvement in filtering techniques.
Transfer Path Pruning. As discussed above, our current scheduling heuristics make two full passes of the predicate transfer graph. In practice, some transfers may not increase filter selectivity but consume computational resources. An intelligent transfer scheduler should identify such scenarios and stop transferring these filters further to avoid wasting CPU cycles. Such transfer path pruning can be done at either planning time or runtime. Our current prototype does not incorporate any pruning and always performs the forward and backward passes in full. We observe this already demonstrates significant performance improvement, and believe incorporating path pruning will lead to even larger speedups.
3.3. Join Phase
After the predicate transfer phase completes, each table may have already been processed by several filters, including the inherent filters from the query and the transferred filters. The join phase basically executes the original query with the reduced input tables.
Unified Query Plan. As a straightforward design, the database can directly execute the query plan as a regular query in the join phase, with the leaf nodes (i.e. scan) replaced by the filtered tables produced by the predicate transfer phase. The predicate transfer schedule is essentially also a query plan. The two query plans can be concatenated such that the leaf nodes in the join plan are just the output nodes of the predicate transfer schedule. This avoids rescanning in the join phase and requires no changes to the executor—the executor is oblivious to the predicate transfer phase and executes the modified query plan regularly.
More Accurate Cardinality Estimation. The predicate transfer phase updates the cardinality of the input tables in the join phase. Therefore, the original query plan generated beforehand may become suboptimal because it was based on stale cardinalities. A replanning step between the two phases may produce a better plan that leads to further performance improvement. Although join performance will be more robust to join orders (as will be shown in Section 4.3), performance can still be affected by the quality of the query plan, with factors including the size of materialized intermediate tables and which table to use for building versus probing each hash table. Moreover, similar to the Yannakakis algorithm, predicate transfer bounds the size of the intermediate join tables in the join phase (Section 3.5), which can be utilized to improve cardinality estimation.
3.4. Extension to General Queries
In the discussion above, we assume table joins are inner equi-joins, and cover queries with only joins and local filters (filters over base tables). In this section, we extend the predicate transfer mechanism to further support general queries.
Supporting More Operators in Predicate Transfer Graph. We first extend predicate transfer to support outer joins. In particular, a left outer join operation can be incorporated into the predicate transfer graph by allowing predicate transfer in only one direction, i.e., from the left table to the right table; but the other transfer direction is blocked. Therefore, such a transfer can happen in either forward pass or backward pass, but not in both passes. A right outer join can be supported in a similar way. A full outer join, however, cannot be incorporated into the predicate transfer graph.
Considering more general operators, we note that an operator will block predicate transfer if it does not preserve the join key during the computation (e.g., it performs aggregation on the join key). In particular, we identify the following operators that can also be incorporated into the predicate transfer graph; a sketch of such a check follows the list.
• Operators including filters between intermediate join tables, column projection, sorting, and top-K do not block predicate transfer.
• Grouped aggregation does not block predicate transfer when the join key is a subset of the group key.
• Scalar user-defined functions do not block the transfer to the downstream join, but may block the transfer to the upstream join if the function is not invertible.
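A minimal sketch of such a check is shown below; the operator encoding and field names are illustrative assumptions rather than any engine's actual API.

```python
# Illustrative check of whether an operator blocks predicate transfer across it.

def blocks_transfer(op, join_key, toward_upstream):
    """True if `op` stops a filter on `join_key` from passing through it;
    `toward_upstream` marks a transfer moving toward the operator's input."""
    if op["type"] in ("filter", "project", "sort", "topk"):
        return False                                  # key-preserving operators
    if op["type"] == "group_by":
        # Allowed only if the join key is a subset of the group key.
        return not set(join_key) <= set(op["group_keys"])
    if op["type"] == "scalar_udf":
        # Downstream transfer is fine; upstream requires an invertible function.
        return toward_upstream and not op.get("invertible", False)
    return True                                       # e.g., full outer join, scalar aggregation
```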
Beyond a Single Predicate Transfer Graph. Some queries may contain operators that cannot be incorporated into a predicate transfer graph. Example operators include but are not limited to full outer joins, scalar aggregations, and group-by aggregations where the join key is being aggregated. When such a scenario is encountered, we can apply predicate transfer only on a subset or several subsets of the query execution plan, and use conventional methods to execute the rest of the query. For example, this means a query can be partially executed first, leading to a subquery plan that can be represented as a predicate transfer graph in order to apply predicate transfer. After the predicate transfer phase and the join phase, the rest of the query can continue execution. It is also possible that predicate transfer can be applied multiple times to different parts of the query plan—the predicate transfer phase and regular query execution can alternate.
In our current prototype, we apply a heuristic that first identifies and executes single-table subquery plans (e.g., a group-by aggregation on a single table) before the predicate transfer phase begins.
3.5. Cost Analysis
Compared to the Yannakakis algorithm, predicate transfer does not provide theoretical optimality, but it is more versatile. Predicate transfer supports both precise filters (like semi-join) and Bloom filters, any join-graph topology, outer joins and cyclic queries, more operators, and complex predicate transfer schedules.
In this section, we present a simple cost analysis of predicate transfer compared to the Yannakakis algorithm and show that predicate transfer is more efficient and robust than Yannakakis, and can achieve close to optimal pre-filtering efficiency. Our key idea is to show that using the cheap Bloom filters drastically reduces the cost of excessive hash probes in the semi-join phase of Yannakakis, filters out most tuples not participating in the join, and only incurs a small overhead in the join phase.
Cost Model. Let $n$ be the number of tables in a given join query and $N$ be the input size (i.e., the total number of tuples in all joining tables). We assign a unit cost to each per-tuple scan, hash-table insertion, or probe, and a cost $\epsilon$ per tuple for Bloom filter insertion or probe. As a Bloom filter is of a small size and thus likely to be cache resident, Bloom filter operations are typically much cheaper than hash table operations, i.e., $\epsilon \ll 1$. We assume that the Bloom filter has a false positive rate of $p$ that can be appropriately configured (e.g., we can tune $p$ to be smaller by increasing the Bloom filter size or the number of hash functions, but this makes $\epsilon$ larger). The reader can refer to Zhu et al. (2017) for an in-depth study on Bloom filter configurations.
Yannakakis algorithm. At the semi-join phase, scanning tables to build or probe hash tables costs $O(N)$ units, independent of the direction of the forward/backward semi-join passes. The cost of building or probing intermediate hash tables can be bounded by $c_Y \cdot N$, where $c_Y$ is a constant highly sensitive to the choice of the rooted join tree of the query. An ideal join tree and orientation can drastically reduce the size of intermediate hash tables, leading to a cheaper semi-join phase (smaller $c_Y$). The join phase of Yannakakis is perfectly robust, as every join order costs $O(N + \mathrm{OUT})$ units of hash probes.
Predicate Transfer. At the predicate transfer phase, scanning tables to build or probe Bloom filters costs $O(N)$ units. As we only build and probe Bloom filters, the cost can be bounded by $\epsilon \cdot c_P \cdot N$ units, where $c_P$ is a constant that depends on the choice of the join graph topology and transfer schedule. As $\epsilon \ll 1$, the sensitivity of the runtime to the constant $c_P$ shrinks by a factor of $\epsilon$.
In the join phase, the tables are slightly larger than the maximally filtered tables after the semi-join phase of Yannakakis, by a factor of at most $(1+p)$ per applied filter. Thus, in the join phase, the cost of predicate transfer can be approximated as $(1+p)^{n} \cdot O(N + \mathrm{OUT})$ units. The choice of the join order only affects the extra $\big((1+p)^{n}-1\big) \cdot O(N + \mathrm{OUT})$ term. Assuming $np \ll 1$ (and so $(1+p)^{n}-1$ is small), the join phase still attains near-perfect robustness.
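To make the comparison tangible, the snippet below plugs illustrative numbers into the cost expressions above; every value ($N$, $\mathrm{OUT}$, $n$, $\epsilon$, $p$, and the constants) is an arbitrary assumption for demonstration, not a measurement from the paper.

```python
# Worked example of the cost model with arbitrary illustrative values.
N, OUT, n = 6_000_000, 100_000, 6   # input tuples, output tuples, number of tables
eps, p = 0.1, 0.01                  # Bloom op cost vs. hash op, false positive rate
c_y, c_p = 2.0, 2.0                 # topology-dependent constants (assumed equal here)

# Pre-filtering: semi-join phase (Yannakakis) vs. predicate transfer phase.
yannakakis_prefilter = N + c_y * N          # table scans + hash-table builds/probes
transfer_prefilter = N + eps * c_p * N      # table scans + Bloom builds/probes

# Join phase: predicate transfer inputs are inflated by Bloom false positives.
yannakakis_join = N + OUT
transfer_join = (1 + p) ** n * (N + OUT)

print(f"pre-filter cost: Yannakakis {yannakakis_prefilter:,.0f} vs. transfer {transfer_prefilter:,.0f}")
print(f"join cost:       Yannakakis {yannakakis_join:,.0f} vs. transfer {transfer_join:,.0f}")
```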
In summary, the Yannakakis algorithm guarantees maximum filtering at the semi-join phase and perfect robustness at the join phase, but at the cost of a much more expensive and unstable semi-join phase (our evaluation in Section 4.3 verifies this). In contrast, predicate transfer addresses these shortcomings via a more stable and efficient Bloom filter transfer scheme, while maintaining near-maximum filtering capability at the predicate transfer phase and near-perfect robustness in the join phase.
4. Evaluation
This section presents our preliminary evaluation results. We describe the experimental setup in Section 4.1 and compare predicate transfer with baseline join strategies in Section 4.2. Then we perform a deep dive into the performance of TPC-H Q5 in Section 4.3.
4.1. Experimental Setup
We conduct all experiments on a single AWS EC2 r5.4xlarge instance, with 16 vCPUs and 128GB of memory. The server runs the Ubuntu 20.04 operating system. We use the widely adopted data analytics benchmark, TPC-H, with 22 queries in total. We use a 1GB dataset (scale factor 1). Queries are executed on a single CPU core. For all the experiments, we measure the in-memory query performance by running the query twice, where the first run loads all the tables into memory, and we measure the performance of the second run.
The testbed we use for evaluation is FlexPushdownDB Yang et al. (2021) (FPDB for short), an open-source cloud-native OLAP DBMS. Table data is placed on local disks in unsharded Parquet par (2016) format. FPDB leverages the join and Bloom filter implementations of Apache Arrow arr (2016). The evaluation results may vary on different DBMSs, depending on the performance ratio between their join and Bloom filter implementations.
We compare the proposed join strategy Pred-Trans with three other baselines: No-Pred-Trans, Bloom Join, and the Yannakakis algorithm. No-Pred-Trans does not transfer predicates among joining tables—pairs of tables are joined regularly as in most DBMSs. Bloom Join performs one-hop predicate transfer between joining table pairs, where the build side constructs a Bloom filter which is used to filter the probe side. Yannakakis executes the semi-join phase of the Yannakakis algorithm ahead of the join phase.
Since the vanilla Yannakakis algorithm is only applicable to acyclic conjunctive queries, we make two extensions to make it applicable to all TPC-H queries. First, we adopt the same mechanisms that Pred-Trans deploys to handle outer joins and non-join operators in the query plan. Second, for cyclic queries like Q5 and Q9, we break the cycle in the join graph by randomly picking a root node and performing a BFS from it. The resulting join tree determines the transfer order of the semi-join phase.
4.2. TPC-H Performance

Figure 2 shows the execution time of different predicate transfer strategies on TPC-H queries. Since Q1 and Q6 involve no joins, we exclude them from the benchmark. On average, Pred-Trans outperforms No-Pred-Trans by 3.8×, Bloom Join by 3.1×, and Yannakakis by 3.7×. 15 out of 20 queries see a performance improvement.
Pred-Trans achieves significant performance improvement on queries with a large number of joins. Half of the queries include joins across at least four tables. Among these, Q2 (joins across nine tables) benefits the most from predicate transfer, outperforming No-Pred-Trans and Bloom Join by 45× and 40×, respectively. Through predicate transfer, filter predicates on tables Part and Region are sent to every other table in the join graph through lightweight Bloom filters. As a result, the predicate transfer phase of Pred-Trans reduces the size of input tables that participate in the join phase by over 99%, such that the expensive join operations are performed on only a tiny fraction of the data. The Yannakakis algorithm outperforms both the No-Pred-Trans and Bloom Join baselines by over 4×. In fact, compared to Pred-Trans, Yannakakis can pre-filter even more unnecessary data records ahead of the join phase, since the Bloom filters leveraged by Pred-Trans incur false positives. However, the small performance benefit in the join phase is overwhelmed by the large overhead brought by semi-joins, making Yannakakis perform worse than Pred-Trans.
We observe the highest speedup on queries Q2, Q17, Q18, and Q21, between 7× and over 40×. In these queries, there is a subquery whose results are joined with the tables in the main query, and the large fact table is accessed by both the main query and the subquery (e.g., Lineitem in Q17 and Q18). Since No-Pred-Trans and Bloom Join perform no predicate transfer and only one-hop transfer, respectively, a single filter predicate cannot be sent to both the main query and the subquery to pre-filter the corresponding fact table. Conversely, Pred-Trans broadcasts every filter predicate globally inside the join graph, such that the fact table can be filtered in both the main query and the subquery. Moreover, Q17 joins base tables with aggregation results. By executing the aggregation beforehand, predicate transfer is able to achieve a higher selectivity by starting the transfer from a smaller intermediate result table.
Queries with fewer join operations benefit less from predicate transfer (e.g., Q13, Q14), since one-hop predicate transfer may already be enough to propagate local filter predicates globally. However, we still observe a 10× speedup on Q3. Q3 joins three relatively large tables: customer, orders, and lineitem. Since all three tables have local filters, Bloom Join can only transfer a portion of them within a single hop. Instead, Pred-Trans ensures that each table receives the transformed filter predicates of every other table, which maximizes the effectiveness of the pre-filter phase.
Another interesting observation is that Yannakakis may not always outperform the No-Pred-Trans and Bloom Join baselines (e.g., in Q11 Yannakakis underperforms Bloom Join by 12×). One cause is that the Yannakakis algorithm does not specify the root of the join tree in the semi-join phase, and a bad semi-join order may construct several large hash tables at the beginning. However, this is not an issue in Pred-Trans since we use a heuristic to transfer from smaller tables to larger tables (see Section 3.2), minimizing the memory stalls incurred by bitmap operations.
4.3. Case Study — TPC-H Q5
To get a deeper understanding of the performance benefits, we conduct a detailed analysis on Q5, one of the complex queries in TPC-H. The query performs inner joins across six tables, and the join graph is shown in Figure 1a.
Table 1. Input table sizes of each join in TPC-H Q5 (HT: hash/build side, PR: probe side).

          | No-Pred-Trans  |   Bloom Join   |   Yannakakis   |   Pred-Trans
          |   HT   |  PR   |   HT   |  PR   |   HT   |  PR   |   HT   |  PR
Join 1    |   10K  |  6M   |   10K  |  6M   |    2K  | 181K  |    2K  |  63K
Join 2    |  228K  |  6M   |  228K  | 103K  |  133K  | 181K  |   30K  |  56K
Join 3    |  150K  | 910K  |  150K  |  44K  |   69K  | 193K  |   15K  |  39K
Join 4    |    25  |  36K  |    25  |  36K  |     5  |   8K  |     5  |   7K
Join 5    |     1  |  36K  |     1  |   7K  |     1  |   8K  |     1  |   7K
Join Table Size. We measure the sizes of both input tables of each join, following the join order specified in the query plan (FPDB utilizes Apache Calcite cal (2014) for query optimization such as join ordering), and the result is shown in Table 1. On average, Pred-Trans reduces the join table size by 98% over No-Pred-Trans, and 97% over Bloom Join. In Bloom Join, the largest fact table lineitem can only be pre-filtered after the first join, where the inner table orders has local filter predicates that can be transferred to lineitem. Pred-Trans demonstrates its superiority by pre-filtering all join tables ahead of the entire join phase.
We observe a higher selectivity in the predicate transfer phase achieved by Pred-Trans compared to Yannakakis. This is because Yannakakis can only guarantee optimal pre-filtering on acyclic queries. For a cyclic query (like Q5), some edges in cycles are removed to form a tree, which sacrifices the overall filtering power. Instead, the heuristics adopted by Pred-Trans allow us to perform transfer for every join regardless of the cyclicity of the join graph, resulting in more base table records being filtered ahead of the join phase.

Performance Breakdown. Figure 3 shows the performance breakdown of Q5 under different predicate transfer strategies. The execution time is divided into predicate transfer and join execution. Compared to No-Pred-Trans and Bloom Join, joins are accelerated by 63× and 45× in Pred-Trans, due to the significant size reduction of the input join tables (Table 1). Yannakakis also achieves a shrinkage of the input join tables, but the semi-joins it relies on are computationally expensive and dominate the entire execution time. The predicate transfer phase in Pred-Trans outperforms the semi-join phase in Yannakakis by 11×, since the bit operations used in Bloom filters are much cheaper than the construction and probing of hash tables.

Robustness. We next evaluate the sensitivity to join orders of the different predicate transfer strategies. We pick three different join orders, and the results are shown in Figure 4. Pred-Trans achieves the best performance and outperforms the other strategies on all the join orders. Notably, the join order causes relatively smaller performance variance in Pred-Trans than in the other strategies, for instance, between the first and the second join orders. Pred-Trans inherits the property of the Yannakakis algorithm that bounds the size of the intermediate join results, making it robust to different join orders.
5. Conclusion and Future Work
A new join optimization, predicate transfer, is proposed in this paper. Inspired by the Yannakakis algorithm, predicate transfer generalizes Bloom join to transfer table-local filters to pre-filter multiple other tables. We laid out the design space of predicate transfer and described the heuristics used in our prototype. Evaluation showed an average 3.1× speedup over Bloom join on the TPC-H benchmark.
Predicate transfer opens up substantial research opportunities, including better heuristics in the predicate transfer schedule, deeper theoretical analysis of the performance guarantees, and extending the technique to parallel and distributed environments. Discussions of these topics are left as future work.
References
- tpc (1999) 1999. TPC-H Benchmark. http://www.tpc.org/tpch/.
- cal (2014) 2014. Apache Calcite. https://calcite.apache.org/.
- arr (2016) 2016. Apache Arrow. https://arrow.apache.org/.
- par (2016) 2016. Apache Parquet. https://parquet.apache.org/.
- ora (2019) 2019. Getting started with Oracle Database In-Memory Part IV - Joins In The IM Column Store. https://blogs.oracle.com/in-memory/post/getting-started-with-oracle-database-in-memory-part-iv-joins-in-the-im-column-store.
- red (2020) 2020. Improved speed and scalability in Amazon Redshift. https://aws.amazon.com/blogs/big-data/improved-speed-and-scalability-in-amazon-redshift/.
- sno (2021) 2021. Best Practices for Optimizing Your DBT and Snowflake Deployment. https://www.snowflake.com/wp-content/uploads/2021/10/Best-Practices-for-Optimizing-Your-dbt-and-Snowflake-Deployment.pdf.
- dat (2022) 2022. Introducing Apache Spark™ 3.3 for Databricks Runtime 11.0. https://www.databricks.com/blog/2022/06/15/introducing-apache-spark-3-3-for-databricks-runtime-11-0.html.
- Bloom (1970) Burton H Bloom. 1970. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13, 7 (1970), 422–426.
- Joglekar et al. (2016) Manas R. Joglekar, Rohan Puttagunta, and Christopher Ré. 2016. AJAR: Aggregations and Joins over Annotated Relations. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (San Francisco, California, USA) (PODS ’16). Association for Computing Machinery, New York, NY, USA, 91–106. https://doi.org/10.1145/2902251.2902293
- Ramesh et al. (2009) Sukriti Ramesh, Odysseas Papapetrou, and Wolf Siberski. 2009. Optimizing distributed joins with bloom filters. In Distributed Computing and Internet Technology: 5th International Conference, ICDCIT 2008 New Delhi, India, December 10-12, 2008. Proceedings 5. Springer, 145–156.
- Yang et al. (2021) Yifei Yang, Matt Youill, Matthew Woicik, Yizhou Liu, Xiangyao Yu, Marco Serafini, Ashraf Aboulnaga, and Michael Stonebraker. 2021. FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS. VLDB 14, 11 (2021), 2101–2113.
- Yannakakis (1981) Mihalis Yannakakis. 1981. Algorithms for Acyclic Database Schemes. In Proceedings of the Seventh International Conference on Very Large Data Bases - Volume 7 (Cannes, France) (VLDB ’81). VLDB Endowment, 82–94.
- Zhu et al. (2017) Jianqiao Zhu, Navneet Potti, Saket Saurabh, and Jignesh M. Patel. 2017. Looking Ahead Makes Query Plans Robust: Making the Initial Case with in-Memory Star Schema Data Warehouse Workloads. Proc. VLDB Endow. 10, 8 (apr 2017), 889–900. https://doi.org/10.14778/3090163.3090167