
Approximating Partition in Near-Linear Time

Lin Chen. Zhejiang University. Part of this work was done when the author was affiliated with Texas Tech University.    Jiayi Lian. Zhejiang University.    Yuchen Mao. Zhejiang University. Supported by National Natural Science Foundation of China [Project No. 12271477].    Guochuan Zhang. Zhejiang University. Supported by National Natural Science Foundation of China [Project No. 12131003].
Abstract

We propose an $\widetilde{O}(n+1/\varepsilon)$-time FPTAS (Fully Polynomial-Time Approximation Scheme) for the classical Partition problem. This is the best possible (up to a polylogarithmic factor) assuming SETH (the Strong Exponential Time Hypothesis) [Abboud, Bringmann, Hermelin, and Shabtay’22]. Prior to our work, the best known FPTAS for Partition runs in $\widetilde{O}(n+1/\varepsilon^{5/4})$ time [Deng, Jin and Mao’23, Wu and Chen’22]. Our result is obtained by solving a more general problem of weakly approximating Subset Sum.

1 Introduction

Subset Sum

Given a (multi-)set $X$ of $n$ positive integers and a target $t$, Subset Sum asks for a subset $Y\subseteq X$ with maximum $\Sigma(Y)$ that does not exceed $t$, where $\Sigma(Y)$ is the sum of the integers in $Y$. It is a fundamental problem in computer science and operations research. Although NP-hard, it admits FPTASes (Fully Polynomial-Time Approximation Schemes). The first FPTAS for Subset Sum was due to Ibarra and Kim [IK75] and Karp [Kar75] in the 1970s. Since then, a great effort has been devoted to developing faster FPTASes (see Table 1). The latest one, due to Kellerer, Mansini, Pferschy and Speranza [KMPS03], has an $\widetilde{O}(n+1/\varepsilon^{2})$ running time. (Throughout the paper, we use the $\widetilde{O}(\cdot)$ notation to hide polylogarithmic factors in $n$ and $\frac{1}{\varepsilon}$.) This bound was recently shown, by Bringmann and Nakos [BN21b], to be the best possible (up to a polylogarithmic factor) assuming the $(\min,+)$-convolution conjecture.

Partition

Partition is the special case of Subset Sum where $t=\Sigma(X)/2$. It is often considered one of “the easiest NP-hard problems”, and yet it has many applications in scheduling [CL91], minimization of circuit sizes and cryptography [MH78], and game theory [Hay02]. Despite the fact that Partition can be reduced to Subset Sum, algorithms specific to Partition have been developed in the hope of solving it faster than Subset Sum; see Table 2. Mucha, Węgrzycki and Włodarczyk [MWW19] gave the first subquadratic-time FPTAS for Partition, running in $\widetilde{O}(n+1/\varepsilon^{5/3})$ time. This result, together with the quadratic lower bound for Subset Sum, implies that Partition is easier than Subset Sum, at least in terms of polynomial-time approximation schemes. On the negative side, the conditional lower bound of $\mathrm{poly}(n)/\varepsilon^{1-o(1)}$ from Abboud, Bringmann, Hermelin, and Shabtay [ABHS22] implies that a running time of $O((n+\frac{1}{\varepsilon})^{1-o(1)})$ is impossible assuming SETH. This naturally raises the following question.

Can Partition be approximated within $O(n+\frac{1}{\varepsilon})$ time?

There is a line of work trying to resolve this question. Bringmann and Nakos [BN21b] gave a deterministic FPTAS that runs in $\widetilde{O}(n+1/\varepsilon^{3/2})$ time. The current best running time is $\widetilde{O}(n+1/\varepsilon^{5/4})$, obtained independently by Deng, Jin, and Mao [DJM23] and by Wu and Chen [WC22]. The question remains open.

Weakly Approximating Subset Sum

The study of weak approximation schemes for Subset Sum was initiated by Mucha, Węgrzycki and Włodarczyk [MWW19]. Weak approximation lies between strong approximation for Subset Sum and Partition in the sense that any weak approximation scheme for Subset Sum implies a strong approximation scheme for Partition. (From now on, we say that the standard notion of approximation is strong, in order to distinguish it from weak approximation.) Consider a Subset Sum instance $(X,t)$. Given any $\varepsilon>0$, a weak approximation scheme finds a subset $Y$ of $X$ such that

\[(1-\varepsilon)\Sigma(Y^{*})\leqslant\Sigma(Y)\leqslant(1+\varepsilon)t,\]

where $Y^{*}$ is the optimal solution. That is, a weak approximation scheme allows a solution to slightly break the constraint (also known as resource augmentation). Mucha, Węgrzycki and Włodarczyk [MWW19] proposed an $\widetilde{O}(n+1/\varepsilon^{5/3})$-time weak approximation scheme. It was later improved to $\widetilde{O}(n+1/\varepsilon^{3/2})$ [BN21b, WC22]. The lower bound is $(n+\frac{1}{\varepsilon})^{1-o(1)}$, the same as that for Partition.

1.1 Our Results

Theorem 1.

There is an $\widetilde{O}(n+\frac{1}{\varepsilon})$-time randomized weak approximation scheme for Subset Sum, which succeeds with probability at least $1-(\frac{n}{\varepsilon})^{-O(1)}$.

Theorem 1 immediately implies the following theorem since any weak approximation scheme for Subset Sum is a strong approximation scheme for Partition.

Theorem 2.

There is an $\widetilde{O}(n+\frac{1}{\varepsilon})$-time randomized FPTAS for Partition, which succeeds with probability at least $1-(\frac{n}{\varepsilon})^{-O(1)}$.

Both of the above results match (up to a polylogarithmic factor) the lower bound of $(n+\frac{1}{\varepsilon})^{1-o(1)}$. To the best of our knowledge, Partition is the first NP-hard problem that admits an FPTAS that is near-linear in both $n$ and $1/\varepsilon$. We also remark that our weak approximation scheme for Subset Sum generalizes Bringmann’s $\widetilde{O}(n+t)$-time exact algorithm for Subset Sum [Bri17] in the sense that when $\varepsilon:=\frac{1}{2t}$, the weak approximation scheme becomes an exact algorithm with $\widetilde{O}(n+t)$ running time.

To attain Theorem 1, we utilize an additive combinatorics result that is different from the ones previously used in Subset Sum and Knapsack [GM91, MWW19, BN20, BW21, WC22, DJM23, CLMZ24]. We believe that it may be of independent interest.

Table 1: Polynomial-time approximation schemes for Subset Sum. A reference marked with a star (*) indicates that the algorithm was designed for Knapsack and works for Subset Sum. The symbol (†) means that the approximation scheme is randomized. C.L.B. stands for conditional lower bound.
 Running Time | Reference
Strong approximation schemes for Subset Sum
$O(n/\varepsilon^{2})$ | *Ibarra and Kim [IK75], *Karp [Kar75]
$O(n/\varepsilon)$ | Gens and Levner [GL78, GL79]
$O(n+1/\varepsilon^{4})$ | *Lawler [Law79]
$O(n+1/\varepsilon^{3})$ | Gens and Levner [GL94]
$\widetilde{O}(n+1/\varepsilon^{2})$ | Kellerer, Mansini, Pferschy and Speranza [KMPS03]
C.L.B. $(n+1/\varepsilon)^{2-o(1)}$ | Bringmann and Nakos [BN21b]
Weak approximation schemes for Subset Sum
$\widetilde{O}(n+1/\varepsilon^{5/3})$ | †Mucha, Węgrzycki and Włodarczyk [MWW19]
$\widetilde{O}(n+1/\varepsilon^{3/2})$ | †Bringmann and Nakos [BN21b], Wu and Chen [WC22]
$\widetilde{O}(n+1/\varepsilon)$ | †This Paper
C.L.B. $\mathrm{poly}(n)/\varepsilon^{1-o(1)}$ | Abboud, Bringmann, Hermelin and Shabtay [ABHS22]
Table 2: Polynomial-time approximation schemes for Partition. The symbol (†) means that the approximation scheme is randomized. C.L.B. stands for conditional lower bound.
 Running Time | Reference
$\widetilde{O}(n+1/\varepsilon^{2})$ | Gens and Levner [GL80]
$\widetilde{O}(n+1/\varepsilon^{5/3})$ | †Mucha, Węgrzycki and Włodarczyk [MWW19]
$\widetilde{O}(n+1/\varepsilon^{3/2})$ | Bringmann and Nakos [BN21b]
$\widetilde{O}(n+1/\varepsilon^{5/4})$ | Deng, Jin and Mao [DJM23], Wu and Chen [WC22]
$\widetilde{O}(n+1/\varepsilon)$ | †This Paper
C.L.B. $\mathrm{poly}(n)/\varepsilon^{1-o(1)}$ | Abboud, Bringmann, Hermelin and Shabtay [ABHS22]

1.2 Technical Overview

Let $X=\{x_{1},\ldots,x_{n}\}$. Solving Subset Sum can be reduced to computing the sumset $\{x_{1},0\}+\cdots+\{x_{n},0\}$. Our main technique is an approach for efficiently approximating (a major part of) the sumset of integer sets, which can be seen as a combination of deterministic sparse convolution [BFN22] (see Lemma 5) and an additive combinatorics result of Szemerédi and Vu [SV05] (see Theorem 14). We briefly explain the idea below.

Suppose that we are to compute the sumset of $A_{1},\ldots,A_{\ell}$. We compute it in a tree-like manner. $A_{1},\ldots,A_{\ell}$ form the bottom level of the tree. We compute the next level by taking the sumset of every two nodes in the bottom level. That is, we compute $A_{1}+A_{2},A_{3}+A_{4},\ldots,A_{\ell-1}+A_{\ell}$. Let $B_{i}=A_{2i-1}+A_{2i}$ for $i=1,\ldots,\ell/2$. If $B_{1},\ldots,B_{\ell/2}$ have a small total size, then we can compute all of them efficiently via sparse convolution, and proceed to the next round. When $B_{1},\ldots,B_{\ell/2}$ have a large total size, we cannot afford to compute them. Instead, we utilize an additive combinatorics result of Szemerédi and Vu [SV05] to show that $A_{1}+\cdots+A_{\ell}$ contains a long arithmetic progression. We will further show that this arithmetic progression can be extended to a long sequence with small differences between consecutive terms. Given the existence of such a long sequence, we can directly compute a set that nicely approximates (a major part of) $A_{1}+\cdots+A_{\ell}$.
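As a warm-up, here is a minimal Python sketch of the tree-like merging scheme in isolation; the naive `sumset` placeholder stands in for the sparse convolution of Lemma 5, and the density test, rounding, and color coding discussed below are omitted.

    def sumset(A, B):
        # naive O(|A||B|) placeholder; the algorithm uses sparse convolution
        return {a + b for a in A for b in B}

    def merge_tree(sets):
        """Combine A_1, ..., A_l level by level; each round halves the count."""
        level = list(sets)
        while len(level) > 1:
            if len(level) % 2 == 1:        # pad with the identity element {0}
                level.append({0})
            level = [sumset(level[i], level[i + 1])
                     for i in range(0, len(level), 2)]
        return level[0]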

Besides sparse convolution and the additive combinatorics result, the following techniques are also essential for obtaining a near-linear running time.

  • Recall that we compute a level only when it has a small total size. Therefore, we should estimate the size of a level without actually computing it. The estimation can be done by utilizing the sparse convolution algorithm of [BFN22].

  • The additive combinatorics result [SV05] only guarantees the existence of a long arithmetic progression, and is non-constructive. Therefore, we can only obtain a good approximation of solution values, but may not be able to recover solutions efficiently. A similar issue arises in the algorithm for dense Subset Sum [BW21]. In Section 4, we show how the issue can be fixed in an approximate sense.

  • Rounding is necessary every time we compute a new level. If the tree has too many levels, then the error caused by rounding accumulates and becomes far more than what is acceptable. To ensure a small number of tree levels, we use the color-coding technique, which was proposed by Bringmann [Bri17] for designing exact algorithms for Subset Sum. For technical reasons, the color-coding technique is slightly modified to ensure additional properties.

Our algorithm, and in particular the usage of arithmetic progressions, is inspired by the dense Subset Sum algorithms of Bringmann and Wellnitz [BW21] and of Galil and Margalit [GM91], but we obtain the arithmetic progression in a way that is different from these two works. Bringmann and Wellnitz [BW21] and Galil and Margalit [GM91] obtained the arithmetic progression via an additive combinatorics result saying that if an integer set $A$ has $\widetilde{\Omega}(\sqrt{u})$ distinct integers, where $u$ is the maximum integer in $A$, then the collection of subset sums of $A$ must contain an arithmetic progression of length $\Omega(u)$ (see, e.g., [Sár94]). In contrast, we use a different result from additive combinatorics, which states that if we have $\ell$ sets $A_{1},\ldots,A_{\ell}$, each of size at least $k$, with $\ell k\geqslant\Omega(u)$, then $A_{1}+\cdots+A_{\ell}$ must contain an arithmetic progression of length $\Omega(u)$.

1.3 Further Related Work

Subset Sum is a special case of Knapsack. There is a long line of research on approximation schemes for Knapsack, e.g., [IK75, Kar75, Law79, KP04, Rhe15, Cha18, Jin19, DJM23]. There is a conditional lower bound of $(n+1/\varepsilon)^{2-o(1)}$ based on the $(\min,+)$-convolution hypothesis [CMWW19, KPS17]. Very recently, [CLMZ23] and [Mao23] independently established an $\widetilde{O}(n+1/\varepsilon^{2})$-time FPTAS for Knapsack.

In addition to approximation algorithms, exact pseudopolynomial-time algorithms for Knapsack and Subset Sum have also received extensive study in recent years, e.g., [Bri17, AT19, PRW21, BC23, CLMZ24, Bri23, Jin23]. It is worth mentioning that there is an algorithm [BN20] with running time $\widetilde{O}(k^{4/3})$ for Subset Sum, where $k$ is the number of subset sums that are smaller than $t$. It shares a similar flavor with our algorithm in the sense that it combines sparse convolution and Ruzsa’s triangle inequality for sumset estimation (a different result in additive combinatorics).

Our algorithm utilizes sparse convolution algorithms; see, e.g., [CH02, AR15, CL15, Nak20, BFN21, BN21a, BFN22]. We also adopt results from additive combinatorics; see, e.g., [Alo87, Sár89, Sár94, SV05, SV06]. Additive combinatorics results have been exploited extensively for Knapsack and Subset Sum in recent years; see, e.g., [GM91, MWW19, BN20, BW21, WC22, DJM23, CLMZ24].

1.4 Paper Organization

In Section 2, we introduce some necessary terminology and tools, and show that it suffices to solve a reduced problem. In Section 3, we present a near-linear-time algorithm for approximating optimal values. We further show how to recover an approximate solution in near-linear time in Section 4. Section 5 concludes the paper. All the omitted proofs can be found in the appendices.

2 Preliminary

2.1 Notation and Definitions

Throughout this paper, we assume that $\varepsilon>0$ is sufficiently small. We also assume that $\frac{1}{\varepsilon}$ is an integer by adjusting $\varepsilon$ by an $O(1)$ factor. All logarithms ($\log$) in this paper are base $2$.

Let $w,v$ be two real numbers. We use $[w,v]$ to denote the set of integers between $w$ and $v$. That is, $[w,v]=\{z\in\mathbb{Z}:w\leqslant z\leqslant v\}$. Let $Y$ be a nonempty set of integers. We write $Y\cap[w,v]$ as $Y[w,v]$. We denote the minimum and maximum elements of $Y$ by $\min(Y)$ and $\max(Y)$, respectively. We write $\sum_{y\in Y}y$ as $\Sigma(Y)$. We refer to the number of elements in $Y$ as the size of $Y$, denoted by $|Y|$. Throughout this paper, we use both the terms “set” and “multi-set”. Only when the term “multi-set” is explicitly used do we allow duplicate elements. Unless otherwise stated, a subset of a multi-set is a multi-set.

Let $(X,t)$ be an arbitrary Subset Sum instance. We assume that $x\leqslant t$ for every $x\in X$, since integers greater than $t$ can be safely removed from $X$. We define ${\cal S}_{X}$ to be the set of all subset sums of $X$. That is, ${\cal S}_{X}=\{\Sigma(Y):Y\subseteq X\}$. Subset Sum actually asks for the maximum element of ${\cal S}_{X}[0,t]$.

We consider the more general problem of approximating ${\cal S}_{X}$.

Definition 3.

Let $S$ be a set of integers. Let $w,v$ be two real numbers. We say a set $\widetilde{S}$ approximates $S[w,v]$ with additive error $\delta$ if

  (i) for any $s\in S[w,v]$, there is $\tilde{s}\in\widetilde{S}$ with $s-\delta\leqslant\tilde{s}\leqslant s+\delta$, and

  (ii) for any $\tilde{s}\in\widetilde{S}$, there is $s\in S$ with $\tilde{s}-\delta\leqslant s\leqslant\tilde{s}+\delta$.

Our algorithm has two phases. In the first phase, it computes a set $\widetilde{S}$ that approximates ${\cal S}_{X}[0,t]$ with additive error $\varepsilon t$. Let $\tilde{s}$ be the maximum element of $\widetilde{S}[0,(1+\varepsilon)t]$. Clearly, $\tilde{s}\leqslant(1+\varepsilon)t$. Definition 3(i) implies that $\tilde{s}\geqslant\Sigma(Y^{*})-\varepsilon t$, and Definition 3(ii) implies that there exists $Y\subseteq X$ such that $\tilde{s}-\varepsilon t\leqslant\Sigma(Y)\leqslant\tilde{s}+\varepsilon t$. Therefore,

\[(1-4\varepsilon)\Sigma(Y^{*})\leqslant\Sigma(Y^{*})-2\varepsilon t\leqslant\tilde{s}-\varepsilon t\leqslant\Sigma(Y)\leqslant\tilde{s}+\varepsilon t\leqslant t+2\varepsilon t.\]

The first inequality is due to the fact that we can assume the optimal objective value satisfies $\Sigma(Y^{*})\geqslant t/2$. (If $\Sigma(Y^{*})<t/2$, every integer in $X$ must be less than $t/2$, which implies that $Y^{*}=X$, because otherwise we could improve $Y^{*}$ by selecting one more integer from $X$. Such an instance can be solved trivially in $O(n)$ time.) By adjusting $\varepsilon$ by a factor of $4$, we have that

\[(1-\varepsilon)\Sigma(Y^{*})\leqslant\Sigma(Y)\leqslant(1+\varepsilon)t.\]

The second phase of the algorithm recovers such a $Y$ from $\tilde{s}$.

2.2 Sumset

Let $(X_{1},X_{2})$ be a partition of $X$. We have ${\cal S}_{X}={\cal S}_{X_{1}}+{\cal S}_{X_{2}}$, where the sum of two sets is defined as follows:

\[A+B=\{a+b:a\in A,b\in B\}.\]

The set $A+B$ is called the sumset of $A$ and $B$. It is well known that the sumset of two integer sets can be computed via the classical convolution algorithm based on the Fast Fourier Transform (FFT).

Lemma 4.

Let $u$ be a positive integer. Let $A$ and $B$ be two subsets of $[0,u]$. We can compute their sumset $A+B$ in $O(u\log u)$ time.
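As an illustration of Lemma 4, the following Python sketch computes a sumset via FFT on indicator vectors. The function name `sumset_fft` and the rounding threshold are our own choices, not part of the paper.

    import numpy as np

    def sumset_fft(A, B, u):
        """Compute A + B for A, B ⊆ [0, u] in O(u log u) time (Lemma 4)."""
        fa = np.zeros(2 * u + 1)   # sums lie in [0, 2u], so no wraparound
        fb = np.zeros(2 * u + 1)
        fa[list(A)] = 1.0
        fb[list(B)] = 1.0
        # convolve the indicator vectors; entry s is positive iff s = a + b
        conv = np.fft.irfft(np.fft.rfft(fa) * np.fft.rfft(fb), n=2 * u + 1)
        return {s for s in range(2 * u + 1) if conv[s] > 0.5}

For example, `sumset_fft({1, 3}, {0, 2}, u=3)` returns `{1, 3, 5}`.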

When the sumset has a small size, sparse convolution algorithms whose running time is linear in the output size may be used. Although there are faster randomized algorithms, to ease the analysis, we use the following deterministic algorithm due to Bringmann, Fischer, and Nakos [BFN22].

Lemma 5 ([BFN22]).

Let $u$ be a positive integer. Let $A$ and $B$ be two subsets of $[0,u]$. We can compute their sumset $A+B$ in $O(|A+B|\log^{5}u\,\mathrm{polyloglog}\,u)$ time.

Lemma 4 immediately implies a simple approach for approximating the sumset of several integer sets.

Lemma 6.

Let $u$ be a positive integer. Let $A_{1},\ldots,A_{\ell}$ be subsets of $[0,u]$ with total size $k$. For any $\varepsilon<1$, in $O(k+\frac{\ell^{2}}{\varepsilon}\log\frac{\ell}{\varepsilon})$ time, we can compute a set $S$ of size $O(\frac{1}{\varepsilon})$ that approximates $(A_{1}+\cdots+A_{\ell})[0,u]$ with additive error $\varepsilon u$.

Proof.

We first round the elements of all $A_{i}$'s down to the nearest multiple of $\varepsilon u$, and delete duplicate elements in each $A_{i}$. This step takes $O(k+\frac{\ell}{\varepsilon})$ time and incurs an additive error of at most $\varepsilon u$ for each $A_{i}$. Then we can compute $(A_{1}+\cdots+A_{\ell})[0,u]$ via Lemma 4, and the total running time is $O(\frac{\ell}{\varepsilon}\log\frac{1}{\varepsilon})$. The total additive error is $\ell\varepsilon u$. By adjusting $\varepsilon$ by a factor of $\ell$, the total additive error becomes $\varepsilon u$ and the running time becomes $O(k+\frac{\ell^{2}}{\varepsilon}\log\frac{\ell}{\varepsilon})$. ∎
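A minimal sketch of the round-then-merge idea behind Lemma 6, reusing the `sumset_fft` helper from the Lemma 4 sketch; the granularity `step` is chosen as roughly $\varepsilon u/\ell$ so that the $\ell$ rounding errors total about $\varepsilon u$.

    def approx_sumset(sets, u, eps):
        """Approximate (A_1 + ... + A_l)[0, u] with additive error ~ eps*u."""
        l = len(sets)
        step = max(1, int(eps * u) // l)   # rounding granularity ~ eps*u/l
        cap = u // step                    # scaled-down universe bound
        S = {0}
        for A in sets:
            A_r = {a // step for a in A}   # round down: each element loses < step
            S = {s for s in sumset_fft(S, A_r, cap) if s <= cap}
        return {s * step for s in S}       # scale back to the original range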

2.3 Problem Reduction

We can reduce our task to the following problem $\mathrm{RP}(\beta)$, where $\beta\in[1,\frac{1}{\varepsilon}]$ is an integer.

Definition 7 (The Reduced Problem $\mathrm{RP}(\beta)$).

Let $X$ be a multi-set of integers from $[\frac{1}{\varepsilon},\frac{2}{\varepsilon}]$ such that $\Sigma(X)\geqslant\frac{4\beta}{\varepsilon}$. (i) Compute a set $\widetilde{S}$ of size $\widetilde{O}(\frac{1}{\varepsilon})$ that approximates ${\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$ with additive error $\delta=O(\beta\,\mathrm{polylog}\,\frac{n}{\varepsilon})$. (ii) Given any $\tilde{s}\in\widetilde{S}$, recover a subset $Y\subseteq X$ such that $\tilde{s}-\delta\leqslant\Sigma(Y)\leqslant\tilde{s}+\delta$.

Lemma 8.

There is an $\widetilde{O}(n+\frac{1}{\varepsilon})$-time weak approximation scheme for Subset Sum if, for any $\beta\in[1,\frac{1}{\varepsilon}]$, the reduced problem $\mathrm{RP}(\beta)$ can be solved in $\widetilde{O}(n+\frac{1}{\varepsilon})$ time.

The proof of Lemma 8 is deferred to Appendix A. Basically, we can preprocess the instance to get rid of tiny integers less than $\varepsilon t$. Then we scale the instance by $\varepsilon^{2}t$. After that, all integers in $X$ are within $[\frac{1}{\varepsilon},\frac{1}{\varepsilon^{2}}]$. Then we partition $X$ into $\log\frac{1}{\varepsilon}$ groups so that the integers within the same group differ by a factor of at most $2$. We deal with each group separately and merge their results by Lemma 6. By further scaling, we can assume that $x\in[\frac{1}{\varepsilon},\frac{2}{\varepsilon}]$ for each group.

In what follows, we present an algorithm for part (i) of the reduced problem $\mathrm{RP}(\beta)$ in Section 3, and an algorithm for part (ii) in Section 4.

3 Approximating the Set of Subset Sums

Lemma 6 already implies an $O(\frac{n^{2}}{\varepsilon}\log\frac{n}{\varepsilon})$-time algorithm for approximating ${\cal S}_{X}[0,t]$: recall that $n=|X|$, partition $X$ into $n$ singletons, and merge them via Lemma 6. Recall that the algorithm works in a tree-like manner. The inefficiency of this algorithm is mainly due to the large number of tree nodes and the large cost at each node. To improve the running time, our algorithm utilizes the following two techniques.

  (i) We reduce the number of tree nodes via the two-layer color coding of Bringmann [Bri17].

  (ii) Before a new level of the tree is computed, we estimate the total size of the nodes in this level. Only when the total size is small do we compute this level, and then we can compute it efficiently via output-sensitive algorithms (Lemma 5). When the total size is large, we show that a good approximation can be obtained immediately via additive combinatorics tools.

3.1 Color Coding

Recall that our goal is to approximate ${\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$ and that $x\in[\frac{1}{\varepsilon},\frac{2}{\varepsilon}]$ for any $x\in X$. For any $Y\subseteq X$ with $\Sigma(Y)\in{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$, we have $|Y|\leqslant 2\beta$. Color coding is a technique for dealing with exactly such a situation, where the subsets under consideration are small.

Let $Y$ be any subset of $X$ with $|Y|\leqslant k$. Let $X_{1},\ldots,X_{k^{2}}$ be a random partition of $X$. By a standard balls-and-bins analysis, with probability at least $\frac{1}{4}$, $|Y\cap X_{i}|\leqslant 1$ for all $i\in[1,k^{2}]$, which implies that $\Sigma(Y)\in(X_{1}\cup\{0\})+\cdots+(X_{k^{2}}\cup\{0\})$. This probability can be boosted to $1-q$ if we repeat the above procedure $\lceil\log_{4/3}\frac{1}{q}\rceil$ times and take the union of the resulting sumsets.

Bringmann [Bri17] proposed a two-layer color coding technique that reduces the number of subsets in the partition by roughly a factor of $k$. Let $q$ be the target error probability. The first layer of color coding randomly partitions $X$ into roughly $k/\log(k/q)$ subsets. With probability at least $1-q/2$, each subset contains at most $6\log(k/q)$ elements from $Y$. The second layer randomly partitions each of the subsets from the first layer into $36\log^{2}(k/q)$ subsets. The second-layer partition is repeated $\lceil\log\frac{k}{q}\rceil$ times in order to boost the success probability to $1-q/2$. See Algorithm 1 for details.

Algorithm 1 $\mathtt{ColorCoding}(X,k,q)$ [Bri17]
1: Input: A multi-set $X$ of integers, a positive integer $k$, and a target error probability $q$
2: Output: $r$ partitions $\{X^{j}_{1,1},\ldots,X^{j}_{1,g},\ldots,X^{j}_{m,1},\ldots,X^{j}_{m,g}\}_{j\in[1,r]}$ of $X$
3: $m:=k/\log(k/q)$ rounded up to the next power of $2$
4: $g:=36\log^{2}(k/q)$ rounded up to the next power of $2$
5: $r:=\lceil\log\frac{k}{q}\rceil$
6: Randomly partition $X$ into $m$ subsets $X_{1},\ldots,X_{m}$
7: for $j=1,\ldots,r$ do
8:     for $i=1,\ldots,m$ do
9:         Randomly partition $X_{i}$ into $g$ subsets $X^{j}_{i,1},\ldots,X^{j}_{i,g}$
10: return the $r$ partitions $\{X^{j}_{1,1},\ldots,X^{j}_{1,g},\ldots,X^{j}_{m,1},\ldots,X^{j}_{m,g}\}_{j\in[1,r]}$ of $X$.
Lemma 9 ([Bri17]).

Let $m,g,r$ be defined as in Algorithm 1. Let $\{X^{j}_{1,1},\ldots,X^{j}_{1,g},\ldots,X^{j}_{m,1},\ldots,X^{j}_{m,g}\}_{j\in[1,r]}$ be the partitions of $X$ returned by $\mathtt{ColorCoding}(X,k,q)$ in Algorithm 1. For $j\in[1,r]$, let $S^{j}_{i}=(X^{j}_{i,1}\cup\{0\})+\cdots+(X^{j}_{i,g}\cup\{0\})$. For any subset $Y\subseteq X$ with $|Y|\leqslant k$, with probability at least $1-q$,

\[\Sigma(Y)\in\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}.\]

By adjusting the error probability by a factor of $\frac{2\beta}{\varepsilon}$, we can ensure, with probability at least $1-q$, that $\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}$ contains all the elements of ${\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$.

For technical reasons, we need the extra property that, for any partition of $X$ obtained from color coding, the sum of the maximum elements of the subsets is large. That is, we need, for all $j$,

\[\max(X^{j}_{1,1})+\cdots+\max(X^{j}_{1,g})+\cdots+\max(X^{j}_{m,1})+\cdots+\max(X^{j}_{m,g})\geqslant\frac{4\beta}{\varepsilon},\]

where the maximum of an empty set is defined to be $0$. We claim that this property can be ensured with a slight modification of the color coding algorithm. Details will be provided in Appendix B.

Lemma 10.

Let $m$ be $4\beta/\log\frac{4\beta^{2}}{\varepsilon q^{*}}$ rounded up to the next power of $2$, let $g$ be $36\log^{2}\frac{4\beta^{2}}{\varepsilon q^{*}}$ rounded up to the next power of $2$, and let $r:=\lceil\log\frac{4\beta^{2}}{\varepsilon q^{*}}\rceil$. In $\widetilde{O}(n+\frac{1}{\varepsilon})$ time, we can obtain $r$ partitions $\{X^{j}_{1,1},\ldots,X^{j}_{m,g}\}_{j\in[1,r]}$ of $X$ such that the following is true. For $i\in[1,m]$ and $j\in[1,r]$, let $S^{j}_{i}=(X^{j}_{i,1}\cup\{0\})+\cdots+(X^{j}_{i,g}\cup\{0\})$. With probability at least $1-q^{*}$,

\[{\cal S}_{X}[\tfrac{\beta}{\varepsilon},\tfrac{2\beta}{\varepsilon}]=\left(\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}\right)[\tfrac{\beta}{\varepsilon},\tfrac{2\beta}{\varepsilon}].\]

Moreover, for every $j\in[1,r]$,

\[\max(X^{j}_{1,1})+\cdots+\max(X^{j}_{1,g})+\cdots+\max(X^{j}_{m,1})+\cdots+\max(X^{j}_{m,g})\geqslant\frac{4\beta}{\varepsilon}.\]

The proof of the lemma is deferred to Appendix B.

Throughout the rest of the paper, we fix $q^{*}:=(\frac{n}{\varepsilon})^{-O(1)}$, $m$ to be $4\beta/\log\frac{4\beta^{2}}{\varepsilon q^{*}}$ rounded up to the next power of $2$, $g$ to be $36\log^{2}\frac{4\beta^{2}}{\varepsilon q^{*}}$ rounded up to the next power of $2$, and $r:=\lceil\log\frac{4\beta^{2}}{\varepsilon q^{*}}\rceil$.

3.2 Sparse and Dense Tree Levels

Let $\{X^{j}_{1,1},\ldots,X^{j}_{m,g}\}_{j\in[1,r]}$ be the $r$ partitions of $X$ obtained via Lemma 10. To simplify the notation, we assume that every subset in the $r$ partitions of $X$ is a set (rather than a multi-set), and that every subset contains $0$. This assumption is without loss of generality, because when we define $S^{j}_{i}$ in Lemma 10, we always add $0$ to the subsets of $X$. To approximate ${\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$, it suffices to compute $S=\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}$, where $S^{j}_{i}$ is as defined in Lemma 10.

The procedure for computing $S$ can be viewed as a tree. Level $0$ of the tree has $mg$ leaves that represent the $mg$ subsets of $X$ in the $j$th partition. Given level $i$, we compute level $i+1$ by taking the sumset of every two nodes in level $i$. Therefore, in level $\log g$, we obtain $S^{j}_{1},\ldots,S^{j}_{m}$. The computation from level $0$ to level $\log g$ is repeated $r$ times, so we get $S^{j}_{1},\ldots,S^{j}_{m}$ for all $j$. Then we take unions and get a new level $\log g$ consisting of $\cup_{j}S^{j}_{1},\ldots,\cup_{j}S^{j}_{m}$. Then we continue to compute levels $1+\log g,2+\log g,\ldots$ until we get a single node (set) in level $\log mg$.

Observation 11.

Consider some level $h$ of the tree. It has $\ell=\frac{mg}{2^{h}}$ nodes, say $A_{1},\ldots,A_{\ell}$. For each $i$, $A_{i}$ is a subset of $[0,\frac{2^{h+1}}{\varepsilon}]$ and $0\in A_{i}$. Moreover, $\sum_{i=1}^{\ell}\max(A_{i})\geqslant\frac{4\beta}{\varepsilon}$.

Given a level $h$, if we compute the sumset of two sets via standard FFT (Lemma 4), it requires $O(\frac{mg}{2^{h}}\cdot\frac{2^{h}}{\varepsilon}\log\frac{2^{h}}{\varepsilon})=\widetilde{O}(\frac{\beta}{\varepsilon})$ time to obtain the next level. This is already too much, as $\beta$ can be as large as $\frac{1}{\varepsilon}$.

Our algorithm computes a new level only if the nodes in this level have a total size of roughly $\widetilde{O}(\frac{1}{\varepsilon})$. We say a level is sparse in this case. A sparse level can be computed in roughly $\widetilde{O}(\frac{1}{\varepsilon})$ time via the output-sensitive algorithm for sumsets whose running time is linear in the output size (Lemma 5). The only exception is level $0$: in $O(n)$ time, we can determine whether it is sparse or not, and we try to compute level $1$ only if level $0$ is sparse. Recall that there are $1+\log mg$ levels and that the first $1+\log g$ levels may be repeated, but only $r$ times. Both $1+\log mg$ and $r$ are logarithmic in $n$, $\frac{1}{\varepsilon}$, and $\beta$. So the total running time for computing $S$ is $\widetilde{O}(n+\frac{1}{\varepsilon})$ if all levels are sparse. For dense levels, we cannot afford to compute them. Instead, we use additive combinatorics tools to show that if some level is dense, then ${\cal S}_{X}$ must contain a long sequence with small differences between consecutive terms. Then ${\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$ can be immediately approximated via the following lemma.
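The resulting control flow can be summarized by the following hedged Python sketch, where `estimate_total_size`, `sparse_merge`, and `ap_approximation` are placeholders for Lemma 21, Lemma 5, and Lemmas 18 and 12, respectively (all names are ours).

    def compute_levels(level, size_budget):
        """Merge pairwise while levels stay sparse; fall back to the
        arithmetic-progression-based approximation on a dense level."""
        while len(level) > 1:
            if estimate_total_size(level) > size_budget:  # next level is dense
                return ap_approximation(level)            # covering set of Lemma 12
            level = sparse_merge(level)                   # output-sensitive sumsets
        return level[0]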

Lemma 12.

Let $\delta$ be a positive integer. Suppose that $S$ contains a sequence $a_{1}<\ldots<a_{k}$ such that $a_{i}-a_{i-1}\leqslant\delta$ for any $i\in[2,k]$. Then for any real numbers $w$ and $v$ such that $a_{1}\leqslant w$ and $v\leqslant a_{k}$, the following set approximates $S[w,v]$ with additive error $\delta$:

\[\{w+j\delta:j\in[0,\tfrac{v-w}{\delta}]\}.\]
Proof.

For any $s\in S[w,v]$, there must be some $j\in[0,\frac{v-w}{\delta}]$ such that

\[w+j\delta\leqslant s\leqslant w+(j+1)\delta.\]

The first condition of Definition 3 is satisfied. In the other direction, for any $\tilde{s}\in\{w+j\delta:j\in[0,\frac{v-w}{\delta}]\}$, there must be some $i\in[1,k-1]$ such that $a_{i}\leqslant\tilde{s}\leqslant a_{i+1}\leqslant a_{i}+\delta$. The second condition is also satisfied. So $\{w+j\delta:j\in[0,\frac{v-w}{\delta}]\}$ approximates $S[w,v]$ with additive error $\delta$. ∎

3.3 Density and Arithmetic Progressions

We formalize the concepts of “dense” and “sparse” and derive some properties resulting from them. The core of our technique is an additive combinatorics result of Szemerédi and Vu [SV05], which basically states that, given many large sets of integers, their sumset must contain a long arithmetic progression.

Definition 13.

An arithmetic progression of length $k$ and common difference $\Delta$ is a set of $k$ integers $a_{1}<\ldots<a_{k}$ such that the differences of consecutive terms are all $\Delta$.

Theorem 14 (Corollary 5.2 [SV05]).

For any fixed integer $d$, there are positive constants $c_{1}$ and $c_{2}$ depending on $d$ such that the following holds. Let $A_{1},\ldots,A_{\ell}$ be subsets of $[1,u]$ of size $k$. If $\ell^{d}k\geqslant c_{1}u$, then $A_{1}+\cdots+A_{\ell}$ contains an arithmetic progression of length at least $c_{2}\ell k^{1/d}$.

Corollary 15.

There exists a sufficiently large constant $c$ such that the following holds. Let $u$ be a positive integer. Let $A_{1},\ldots,A_{\ell}$ be subsets of $[0,u]$ of size at least $k$. If $\ell(k-1)\geqslant cu$, then $A_{1}+\cdots+A_{\ell}$ contains an arithmetic progression of length at least $u$.

Proof.

Let $c_{1}$ and $c_{2}$ be the two constants for $d=1$ in Theorem 14. Assume that $c\geqslant c_{1}$ and that $cc_{2}\geqslant 1$. For $i\in[1,\ell]$, let $A^{+}_{i}=A_{i}\setminus\{0\}$. $A^{+}_{i}$ is a subset of $[1,u]$ of size at least $k-1$. By Theorem 14, $A^{+}_{1}+\cdots+A^{+}_{\ell}$ contains an arithmetic progression of length at least $c_{2}\ell(k-1)\geqslant c_{2}cu\geqslant u$. This arithmetic progression also appears in $A_{1}+\cdots+A_{\ell}$ since $A^{+}_{1}+\cdots+A^{+}_{\ell}\subseteq A_{1}+\cdots+A_{\ell}$. ∎

Throughout the rest of the paper, $c$ denotes the constant in the above corollary.

An arithmetic progression of length $u$ is not long enough for our purpose. To produce a longer sequence, we need a collection of integer sets to be $\gamma$-dense.

Definition 16.

Let $u$ and $\gamma$ be positive integers. Let $A_{1},\ldots,A_{\ell}$ be subsets of $[0,u]$. The collection $\{A_{1},\ldots,A_{\ell}\}$ is $\gamma$-dense if, for some $k\in[2,u+1]$, at least $\frac{c\gamma u}{k-1}$ sets from this collection have size at least $k$. We say that $\{A_{1},\ldots,A_{\ell}\}$ is $\gamma$-sparse if it is not $\gamma$-dense.

In the next two lemmas, we prove that when a collection is $\gamma$-dense, we can use it to produce a long sequence with small differences between consecutive elements. Basically, we use a small fraction of the collection to produce an arithmetic progression of length $u$, and use the rest of the collection to extend this progression to a longer sequence. We first show how to extend an arithmetic progression.

Lemma 17.

Let $B$ be a set of positive integers that contains an arithmetic progression $b_{1}<\ldots<b_{k}$ with common difference $\Delta$. Let $u$ be a positive integer. Let $A_{1},\ldots,A_{\ell}$ be subsets of $[0,u]$ containing $0$. Let $\eta=\max(A_{1})+\cdots+\max(A_{\ell})$. Let $S=B+A_{1}+\cdots+A_{\ell}$. If $b_{k}-b_{1}\geqslant u-1$, then $S$ contains a sequence $b_{1}=s_{1}<\cdots<s_{k^{\prime}}=b_{k}+\eta$ such that $s_{i}-s_{i-1}\leqslant\Delta$ for $i\in[2,k^{\prime}]$.

Proof.

For $j\in[0,\ell]$, let $S_{j}=B+A_{1}+\cdots+A_{j}$ and $\eta_{j}=\max(A_{1})+\cdots+\max(A_{j})$. We prove, by induction on $j$, that $S_{j}$ contains a sequence $b_{1}=s_{1}<\cdots<s_{k^{\prime}}=b_{k}+\eta_{j}$ with $s_{i}-s_{i-1}\leqslant\Delta$ for $i\in[2,k^{\prime}]$. When $j=0$, the claim is obviously true by letting $k^{\prime}=k$ and $s_{i}=b_{i}$ for $i\in[1,k]$. Suppose that the claim is true for some $0\leqslant j\leqslant\ell-1$. We prove that it is also true for $j+1$. Note that $S_{j+1}=S_{j}+A_{j+1}$ and $\eta_{j+1}=\eta_{j}+\max(A_{j+1})$. By the inductive hypothesis, $S_{j}$ contains a sequence $(s_{1},\ldots,s_{k^{\prime}})$ with $s_{1}=b_{1}$, $s_{k^{\prime}}=b_{k}+\eta_{j}$, and $s_{i}-s_{i-1}\leqslant\Delta$ for $i\in[2,k^{\prime}]$. Since $0\in A_{j+1}$, we have $S_{j}\subseteq S_{j+1}$, so $S_{j+1}$ also contains the sequence $(s_{1},\ldots,s_{k^{\prime}})$. Let $a^{*}$ be the maximum element of $A_{j+1}$. The sequence $(s_{1}+a^{*},\ldots,s_{k^{\prime}}+a^{*})$ also belongs to $S_{j+1}$. Note that $s_{1}+a^{*}\leqslant b_{1}+u\leqslant b_{k}+1\leqslant s_{k^{\prime}}+1$ and that $s_{k^{\prime}}+a^{*}=b_{k}+\eta_{j}+a^{*}=b_{k}+\eta_{j+1}$. Therefore, merging the two sequences by taking their union and deleting duplicates yields a sequence $z_{1},\ldots,z_{k^{\prime\prime}}$ in $S_{j+1}$ with $z_{1}=b_{1}$, $z_{k^{\prime\prime}}=b_{k}+\eta_{j+1}$, and $z_{i}-z_{i-1}\leqslant\Delta$ for any $i\in[2,k^{\prime\prime}]$. ∎
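The merging step in the proof of Lemma 17 translates into a few lines of Python (a sketch; correctness relies on the precondition $b_{k}-b_{1}\geqslant u-1$, which makes each shifted copy overlap the current sequence):

    def extend_progression(ap, sets):
        """Extend a progression with gaps <= Delta by repeatedly taking the
        union with a copy shifted by max(A_i), as in Lemma 17."""
        seq = sorted(ap)
        for A in sets:             # each A contains 0 and lies in [0, u]
            shift = max(A)
            seq = sorted(set(seq) | {s + shift for s in seq})
        return seq                 # consecutive gaps remain <= Delta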

Now we are ready to prove that if $\{A_{1},\ldots,A_{\ell}\}$ is $\gamma$-dense, then the sumset $A_{1}+\cdots+A_{\ell}$ must contain a long sequence with small differences between consecutive terms.

Lemma 18.

Let $u$ be a positive integer. Let $A_{1},\ldots,A_{\ell}$ be subsets of $[0,u]$ with $0\in A_{i}$ for every $i\in[1,\ell]$. Let $S=A_{1}+\cdots+A_{\ell}$. Let $\eta=\max(A_{1})+\cdots+\max(A_{\ell})$. If $\{A_{1},\ldots,A_{\ell}\}$ is $\gamma$-dense, then $S$ contains a sequence $s_{1},\ldots,s_{k}$ such that $s_{1}\leqslant\frac{2\eta}{\gamma}$, $s_{k}\geqslant\frac{(\gamma-2)\eta}{\gamma}$, and $s_{i}-s_{i-1}\leqslant\frac{4\ell}{\gamma}$ for any $i\in[2,k]$.

Proof.

If $u=1$, the lemma trivially holds because $[0,\eta]\subseteq S$. Assume that $u\geqslant 2$. By Definition 16, for some $k\in[2,u+1]$, at least $\ell^{\prime}=\frac{c\gamma u}{k-1}$ of the sets $A_{i}$ have $|A_{i}|\geqslant k$. From these $A_{i}$'s, we select the $\lceil\frac{\ell^{\prime}}{\gamma}\rceil$ ones with the smallest maximum elements. Let $I$ be the set of indices of the selected $A_{i}$'s. By the way we select the $A_{i}$'s, we have

\[\sum_{i\in I}\max(A_{i})\leqslant\frac{\lceil\frac{\ell^{\prime}}{\gamma}\rceil}{\ell^{\prime}}\sum_{i=1}^{\ell}\max(A_{i})\leqslant\frac{2}{\gamma}\sum_{i=1}^{\ell}\max(A_{i})=\frac{2\eta}{\gamma}.\]

The second inequality is due to the fact that $\frac{\ell^{\prime}}{\gamma}\geqslant c\geqslant 1$, which follows from $\ell^{\prime}=\frac{c\gamma u}{k-1}$ and $k\leqslant u+1$.

It is easy to verify that the collection $\{A_{i}\}_{i\in I}$ of the selected $A_{i}$'s satisfies the condition of Corollary 15, so $\sum_{i\in I}A_{i}$ contains an arithmetic progression $(a_{1},\ldots,a_{u})$. The common difference satisfies

\[\Delta=\frac{a_{u}-a_{1}}{u-1}\leqslant\frac{2a_{u}}{u}\leqslant\frac{2}{u}\cdot\sum_{i\in I}\max(A_{i})\leqslant\frac{4\eta}{\gamma u}\leqslant\frac{4\ell}{\gamma}.\]

The last inequality is due to the fact that $\eta=\sum_{i=1}^{\ell}\max(A_{i})\leqslant\ell u$. Moreover, $a_{u}-a_{1}\geqslant u-1$. Now consider $S=\sum_{i=1}^{\ell}A_{i}=\sum_{i\in I}A_{i}+\sum_{i\notin I}A_{i}$. By Lemma 17, $S$ contains a sequence $s_{1},\ldots,s_{k}$ such that $s_{1}=a_{1}$, $s_{k}=a_{u}+\sum_{i\notin I}\max(A_{i})$, and $s_{i}-s_{i-1}\leqslant\Delta\leqslant\frac{4\ell}{\gamma}$. Note that

\[s_{1}=a_{1}<a_{u}\leqslant\sum_{i\in I}\max(A_{i})\leqslant\frac{2\eta}{\gamma}\]

and

\[s_{k}=a_{u}+\sum_{i\notin I}\max(A_{i})\geqslant\sum_{i\notin I}\max(A_{i})=\sum_{i=1}^{\ell}\max(A_{i})-\sum_{i\in I}\max(A_{i})\geqslant(1-\frac{2}{\gamma})\eta.\qquad∎\]

The following lemma shows that a γ\gamma-sparse collection of integer sets has a small total size.

Lemma 19.

Let $A_{1},\ldots,A_{\ell}$ be subsets of $[0,u]$. If $\{A_{1},\ldots,A_{\ell}\}$ is $\gamma$-sparse, then

\[\sum_{i=1}^{\ell}|A_{i}|\leqslant\ell+c\gamma u(1+\log u).\]
Proof.

Note that $0\leqslant|A_{i}|\leqslant u+1$ for any $i$. For $k\in[0,u+2]$, let $\ell_{k}$ be the number of $A_{i}$'s with $|A_{i}|\geqslant k$. Since $\{A_{1},\ldots,A_{\ell}\}$ is $\gamma$-sparse, $\ell_{k}<\frac{c\gamma u}{k-1}$ for any $k\geqslant 2$. Then

\[\sum_{i=1}^{\ell}|A_{i}|=\sum_{k=1}^{u+1}k(\ell_{k}-\ell_{k+1})=\sum_{k=1}^{u+1}\ell_{k}\leqslant\ell_{1}+\sum_{k=2}^{u+1}\frac{c\gamma u}{k-1}=\ell_{1}+\sum_{k=1}^{u}\frac{c\gamma u}{k}\leqslant\ell+c\gamma u(1+\log u).\qquad∎\]

3.4 Estimating the Density

Recall that we compute a new level only when it is sparse, so we should estimate the density of a level without actually computing it. We first give an algorithm for estimating the size of the sumset of two integer sets. The following lemma is implied by [BFN22]; we defer the proof to Appendix C.

Lemma 20.

Let $u$ be a positive integer. Let $A$ and $B$ be two subsets of $[0,u]$. Let $k$ be any positive integer. We can determine whether $|A+B|\geqslant k$ or not in $O(k\log^{6}u\,\mathrm{polyloglog}\,u)$ time.
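For intuition only, here is a naive stand-in for the test of Lemma 20; unlike the algorithm of [BFN22], it does not run in $\widetilde{O}(k)$ time, but it returns the same answer.

    def sumset_at_least(A, B, k):
        """Report whether |A + B| >= k by enumerating sums, stopping early
        once k distinct sums have been seen (naive stand-in for Lemma 20)."""
        seen = set()
        for a in A:
            for b in B:
                seen.add(a + b)
                if len(seen) >= k:
                    return True
        return False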

With the above lemma, we can estimate the density of a level without computing it.

Lemma 21.

Let $u,\gamma$ be two positive integers. Let $A_{1},\ldots,A_{\ell}$ be non-empty subsets of $[0,u]$. For $i\in[1,\ell/2]$, let $B_{i}=A_{2i-1}+A_{2i}$. Given $A_{1},\ldots,A_{\ell}$, in $O((\ell+\gamma u)\log^{7}u\,\mathrm{polyloglog}\,u)$ time, without actually computing $B_{1},\ldots,B_{\ell/2}$, we can

  (i) either tell that $\{B_{1},\ldots,B_{\ell/2}\}$ is $(4\gamma)$-sparse, or

  (ii) return a subset $I$ of $[1,\ell/2]$ such that $|B_{i}|\geqslant\frac{2c\gamma u}{|I|}+1$ for $i\in I$ (therefore, $\{B_{1},\ldots,B_{\ell/2}\}$ is $\gamma$-dense).

Proof.

Note that $|B_{i}|\leqslant 2u+1$. By definition, to determine whether the collection of $B_{i}$'s is $\gamma$-dense or not, we should, for each $k\in[2,2u+1]$, count the number of $B_{i}$'s with $|B_{i}|\geqslant k$. To control the running time, we check only those $k$'s that are powers of $2$. More precisely, for $j\in[1,\lfloor\log(2u+1)\rfloor]$, we determine whether $|B_{i}|\geqslant 2^{j}$ or not via Lemma 20. We start with $j=1$. Let $I_{j}$ be the set of indices of the $B_{i}$'s with $|B_{i}|\geqslant 2^{j}$. If the collection $\{B_{i}\}_{i\in I_{j}}$ meets the condition of $\gamma$-density, we are done. Otherwise, we proceed to the next $j$; obviously, only the $B_{i}$'s with $i\in I_{j}$ need to be considered in the next round. The above procedure is summarized in Algorithm 2.

If $(2^{j^{\prime}}-1)\cdot|I_{j^{\prime}}|\geqslant 2c\gamma u$ for some $j^{\prime}$, then, since every $B_{i}$ is a subset of $[0,2u]$ and every $B_{i}$ with $i\in I_{j^{\prime}}$ has size at least $2^{j^{\prime}}$, by definition $\{B_{1},\ldots,B_{\ell/2}\}$ is $\gamma$-dense, and $I_{j^{\prime}}$ will be returned.

Suppose that $(2^{j}-1)\cdot|I_{j}|<2c\gamma u$ for all $j$. We will prove that the collection of $B_{i}$'s is $(4\gamma)$-sparse in this case. Let $\ell_{k}$ be the number of $B_{i}$'s whose size is at least $k$. It suffices to show that $\ell_{k}<\frac{8c\gamma u}{k-1}$ for any $k\in[2,2u+1]$. Let $k$ be any such integer, and let $j^{\prime}=\lfloor\log k\rfloor$. Since $\ell_{2^{j^{\prime}}}=|I_{j^{\prime}}|$, we have

\[\ell_{k}\leqslant\ell_{2^{j^{\prime}}}<\frac{2c\gamma u}{2^{j^{\prime}}-1}\leqslant\frac{8c\gamma u}{k-1}.\]

Therefore, $\{B_{1},\ldots,B_{\ell/2}\}$ is $(4\gamma)$-sparse.

Algorithm 2 stops as soon as it identifies that $\{B_{1},\ldots,B_{\ell/2}\}$ is $\gamma$-dense, so the outer loop has at most $\lfloor\log(2u+1)\rfloor$ iterations. The running time of Algorithm 2 is bounded by the following.

\[\sum_{j=1}^{\lfloor\log(2u+1)\rfloor}\sum_{i\in I_{j-1}}\left(2^{j}\log^{6}u\,\mathrm{polyloglog}\,u\right)=\left(2\log^{6}u\,\mathrm{polyloglog}\,u\right)\left(\sum_{j=1}^{\lfloor\log(2u+1)\rfloor}(2^{j-1}-1)|I_{j-1}|+\sum_{j=1}^{\lfloor\log(2u+1)\rfloor}|I_{j-1}|\right)\]
\[\leqslant 2\log^{6}u\,\mathrm{polyloglog}\,u\cdot\lfloor\log(2u+1)\rfloor\cdot(2c\gamma u+\ell)=O((\ell+\gamma u)\log^{7}u\,\mathrm{polyloglog}\,u).\qquad∎\]
Algorithm 2 $\mathtt{EstimateDensity}(A_{1},\ldots,A_{\ell},\gamma)$
1: Input: non-empty subsets $A_{1},\ldots,A_{\ell}$ of $[0,u]$ and a positive integer $\gamma$.
2: Output: either a statement that $\{B_{1},\ldots,B_{\ell/2}\}$ is $(4\gamma)$-sparse, or a subset $I$ of $[1,\ell/2]$.
3: $I_{0}:=[1,\ell/2]$
4: for $j:=1,\ldots,\lfloor\log(2u+1)\rfloor$ do
5:     $I_{j}:=\emptyset$
6:     for each $i\in I_{j-1}$ do
7:         if $|A_{2i-1}+A_{2i}|\geqslant 2^{j}$ (via Lemma 20) then
8:             $I_{j}:=I_{j}\cup\{i\}$
9:     if $|I_{j}|\cdot(2^{j}-1)\geqslant 2c\gamma u$ then
10:         return $I_{j}$
11: return “the collection $\{A_{1}+A_{2},\ldots,A_{\ell-1}+A_{\ell}\}$ is $(4\gamma)$-sparse”
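A direct Python transcription of Algorithm 2 (a sketch using 0-indexed sets; it plugs in the naive `sumset_at_least` from above where the paper uses the fast test of Lemma 20, and `c` is the constant of Corollary 15):

    import math

    def estimate_density(sets, gamma, u, c):
        """Either certify that the B_i = A_{2i-1} + A_{2i} are (4*gamma)-sparse,
        or return a witness index set I for gamma-density (Algorithm 2)."""
        I = list(range(len(sets) // 2))    # indices i with |B_i| >= 2^j so far
        for j in range(1, int(math.log2(2 * u + 1)) + 1):
            I = [i for i in I
                 if sumset_at_least(sets[2 * i], sets[2 * i + 1], 2 ** j)]
            if len(I) * (2 ** j - 1) >= 2 * c * gamma * u:
                return ("dense", I)
        return ("sparse", None)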

3.5 Putting Things Together

Consider level $h$ of the tree for computing $\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}$. Let $A_{1},\ldots,A_{\ell}$ be the nodes in this level, where $\ell=\frac{mg}{2^{h}}\leqslant mg=O(\beta\log\frac{n}{\varepsilon})$. By Observation 11, for each $i\in[1,\ell]$, $A_{i}$ is a subset of $[0,\frac{2^{h+1}}{\varepsilon}]$ with $0\in A_{i}$. Moreover, $\sum_{i=1}^{\ell}\max(A_{i})\geqslant\frac{4\beta}{\varepsilon}$. We make the extra assumption that the elements of the $A_{i}$'s are multiples of $2^{h+1}$. This can be arranged by rounding each element of $A_{i}$ up to the nearest multiple of $2^{h+1}$, which incurs an additive error of $\ell\cdot 2^{h+1}=2mg=O(\beta\log\frac{n}{\varepsilon})$. Recall that there are $1+\log mg$ levels and that each level is repeated at most $r$ times. The total additive error caused by the rounding is $(1+\log mg)\cdot 2mg=O(\beta\log^{2}\frac{n}{\varepsilon})$. Let $B_{1},\ldots,B_{\ell/2}$ be the nodes of the next level $h+1$.

Lemma 22.

Let $\gamma=\max(4,\lceil\frac{4mg}{\beta}\rceil)$. Given $A_{1},\ldots,A_{\ell}$, in $\widetilde{O}(\frac{1}{\varepsilon})$ time, we can

  (i) either compute $B_{1},\ldots,B_{\ell/2}$ whose total size is $O(\frac{1}{\varepsilon}\log^{2}\frac{1}{\varepsilon})$, or

  (ii) tell that $\{B_{1},\ldots,B_{\ell/2}\}$ is $\gamma$-dense and that $A_{1}+\cdots+A_{\ell}$ contains a sequence $z_{1}<\cdots<z_{k}$ such that $z_{1}\leqslant\frac{\beta}{\varepsilon}$, $z_{k}\geqslant\frac{2\beta}{\varepsilon}$, and $z_{i}-z_{i-1}\leqslant\beta$ for $i\in[2,k]$.

Proof.

Recall that the elements of the $A_{i}$'s are multiples of $2^{h+1}$. For $i\in[1,\ell]$, define $A^{\prime}_{i}=\{\frac{a}{2^{h+1}}:a\in A_{i}\}$. Then $A^{\prime}_{1},\ldots,A^{\prime}_{\ell}$ are subsets of $[0,\frac{1}{\varepsilon}]$. For $i\in[1,\ell/2]$, define $B^{\prime}_{i}=A^{\prime}_{2i-1}+A^{\prime}_{2i}$. We estimate the density of $\{B^{\prime}_{1},\ldots,B^{\prime}_{\ell/2}\}$ via Lemma 21. The time cost is

\[O((\ell+\frac{\gamma}{\varepsilon})\log^{7}\frac{1}{\varepsilon}\,\mathrm{polyloglog}\,\frac{1}{\varepsilon})=\widetilde{O}(\frac{1}{\varepsilon}).\]

If $\{B^{\prime}_{1},\ldots,B^{\prime}_{\ell/2}\}$ is $(4\gamma)$-sparse, then by Lemma 19,

\[\sum_{i=1}^{\ell/2}|B^{\prime}_{i}|\leqslant\frac{\ell}{2}+4c\gamma\cdot\frac{2}{\varepsilon}\cdot(1+\log\frac{2}{\varepsilon})=O(\frac{1}{\varepsilon}\log^{2}\frac{1}{\varepsilon}).\]

We can compute each $B^{\prime}_{i}$ via Lemma 5. The time cost for computing all $B^{\prime}_{i}$'s is

\[O(\sum_{i=1}^{\ell/2}|B^{\prime}_{i}|\log^{5}\frac{1}{\varepsilon}\,\mathrm{polyloglog}\,\frac{1}{\varepsilon})=\widetilde{O}(\frac{1}{\varepsilon}).\]

From $B^{\prime}_{i}$, we can easily obtain $B_{i}$ by multiplying each element of $B^{\prime}_{i}$ by $2^{h+1}$.

Consider the case where $\{B^{\prime}_{1},\ldots,B^{\prime}_{\ell/2}\}$ is $\gamma$-dense. Let $\eta=\max(B^{\prime}_{1})+\cdots+\max(B^{\prime}_{\ell/2})$. We have that

\[\frac{\beta}{2^{h-1}\varepsilon}\leqslant\eta\leqslant\frac{\ell}{\varepsilon}.\qquad(1)\]

The first inequality is due to the fact that $\max(A_{1})+\cdots+\max(A_{\ell})\geqslant\frac{4\beta}{\varepsilon}$, and the second is due to the fact that $\max(A^{\prime}_{i})\leqslant\frac{1}{\varepsilon}$. By Lemma 18 (applied to the $\ell/2$ sets $B^{\prime}_{i}$, so the gap bound is $4(\ell/2)/\gamma=\frac{2\ell}{\gamma}$), $B^{\prime}_{1}+\cdots+B^{\prime}_{\ell/2}$ contains a sequence $z^{\prime}_{1},\ldots,z^{\prime}_{k}$ such that $z^{\prime}_{1}\leqslant\frac{2\eta}{\gamma}$, $z^{\prime}_{k}\geqslant\frac{(\gamma-2)\eta}{\gamma}$, and $z^{\prime}_{i}-z^{\prime}_{i-1}\leqslant\frac{2\ell}{\gamma}$ for $i\in[2,k]$. For $i\in[1,k]$, let $z_{i}=2^{h+1}\cdot z^{\prime}_{i}$. One can see that $z_{1},\ldots,z_{k}$ is a sequence contained in $B_{1}+\cdots+B_{\ell/2}=A_{1}+\cdots+A_{\ell}$. By inequality (1) and our choice of $\gamma$, we have the following.

\[z_{i}-z_{i-1}\leqslant\frac{2\ell}{\gamma}\cdot 2^{h+1}=\frac{4mg}{\gamma}\leqslant\beta\]
\[z_{1}\leqslant\frac{2\eta}{\gamma}\cdot 2^{h+1}\leqslant\frac{4mg}{\gamma\varepsilon}\leqslant\frac{\beta}{\varepsilon}\]
\[z_{k}\geqslant\frac{(\gamma-2)\eta}{\gamma}\cdot 2^{h+1}\geqslant\frac{\gamma-2}{\gamma}\cdot\frac{4\beta}{\varepsilon}\geqslant\frac{2\beta}{\varepsilon}.\qquad∎\]
Lemma 23.

Given the $r$ partitions of $X$ resulting from color coding, we can obtain a set of size $O(\frac{1}{\varepsilon}\log^{2}\frac{1}{\varepsilon})$ that approximates ${\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$ with additive error $O(\beta\log^{2}\frac{n}{\varepsilon})$. The time cost is $\widetilde{O}(n+\frac{1}{\varepsilon})$.

Proof.

Each partition forms a level $0$ of the tree for computing $\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}$. We keep computing new levels until we reach the root or we encounter a level that is dense (that is, case (ii) of Lemma 22). Recall that there are $1+\log mg$ levels and that the first $1+\log g$ levels may be repeated, but at most $r$ times. Every level we have computed has a total size of $O(\frac{1}{\varepsilon}\log^{2}\frac{1}{\varepsilon})$ (except for level $0$, which has a total size of $n$). The total time cost for rounding the levels is

\[O(r\cdot(n+\frac{1}{\varepsilon}\log^{2}\frac{1}{\varepsilon}\log mg))=\widetilde{O}(n+\frac{1}{\varepsilon}),\]

and that for computing the levels is at most

\[r\log mg\cdot\widetilde{O}(\frac{1}{\varepsilon})=\widetilde{O}(\frac{1}{\varepsilon}).\]

Recall that rounding the levels incurs a total additive error of $O(\beta\log^{2}\frac{n}{\varepsilon})$. If we reach the root of the tree, then we get a set that approximates ${\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$ with additive error $O(\beta\log^{2}\frac{n}{\varepsilon})$. The size of this set is $O(\frac{1}{\varepsilon}\log^{2}\frac{1}{\varepsilon})$. Consider the other case, where we encounter a dense level. Lemma 22 implies that $A_{1}+\cdots+A_{\ell}$ contains a sequence $z_{1}<\cdots<z_{k}$ such that $z_{1}\leqslant\frac{\beta}{\varepsilon}$, $z_{k}\geqslant\frac{2\beta}{\varepsilon}$, and $z_{i}-z_{i-1}\leqslant\beta$ for $i\in[2,k]$. We shall approximate ${\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$ in the same spirit as Lemma 12. Let $S_{A}=A_{1}+\cdots+A_{\ell}$. By Lemma 12, the following set of size $O(\frac{1}{\varepsilon})$ approximates $S_{A}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$ with additive error $\beta$:

\[\widetilde{S}=\{\frac{\beta}{\varepsilon}+\beta i:i\in[0,\frac{1}{\varepsilon}]\}.\]

We shall show that $\widetilde{S}$ approximates ${\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$ with additive error $O(\beta\log^{2}\frac{n}{\varepsilon})$. Clearly, for any $s\in{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$, there is $\tilde{s}\in\widetilde{S}$ such that

\[s-O(\beta\log^{2}\frac{n}{\varepsilon})\leqslant s-\beta\leqslant\tilde{s}\leqslant s+\beta\leqslant s+O(\beta\log^{2}\frac{n}{\varepsilon}).\]

So the first condition of Definition 3 is satisfied. Now consider any $\tilde{s}\in\widetilde{S}$. Since $\widetilde{S}$ approximates $S_{A}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]$ with additive error $\beta$, there exists $s_{A}\in S_{A}$ such that

\[\tilde{s}-\beta\leqslant s_{A}\leqslant\tilde{s}+\beta.\]

The error between $S_{A}$ and ${\cal S}_{X}$ is due to rounding, which is bounded by $O(\beta\log^{2}\frac{n}{\varepsilon})$. That is to say, for every $s_{A}\in S_{A}$, there is $s\in{\cal S}_{X}$ such that

\[s_{A}-O(\beta\log^{2}\frac{n}{\varepsilon})\leqslant s\leqslant s_{A}+O(\beta\log^{2}\frac{n}{\varepsilon}).\]

So there is $s\in{\cal S}_{X}$ such that

\[\tilde{s}-\beta-O(\beta\log^{2}\frac{n}{\varepsilon})\leqslant s\leqslant\tilde{s}+\beta+O(\beta\log^{2}\frac{n}{\varepsilon}).\]

The second condition of Definition 3 is also satisfied. ∎

4 Recovering a Solution

Let $\widetilde{S}$ be the set we obtained via Lemma 23. Let $\delta^{*}=O(\beta\log^{2}\frac{n}{\varepsilon})$ be the additive error. This section gives an approach for recovering, for any $\tilde{s}\in\widetilde{S}$, a subset $Y$ of $X$ with $\tilde{s}-\delta^{*}\leqslant\Sigma(Y)\leqslant\tilde{s}+\delta^{*}$.

4.1 A High-Level Overview

Observation 24.

Given any $y\in A+B$, one can recover $a\in A$ and $b\in B$ such that $y=a+b$ as follows: sort $A$ and $B$, and for each $a\in A$, check whether $y-a\in B$ using binary search. The running time is $O(|A|\log|A|+|B|\log|B|)$, which is no larger than the time needed to compute $A+B$.
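Observation 24 in code (a sketch; the helper name `recover_pair` is ours):

    from bisect import bisect_left

    def recover_pair(y, A, B):
        """Given y in A + B, return (a, b) with a in A, b in B, and a + b = y."""
        B_sorted = sorted(B)
        for a in A:
            j = bisect_left(B_sorted, y - a)
            if j < len(B_sorted) and B_sorted[j] == y - a:
                return a, y - a
        return None    # y is not in A + B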

If $\widetilde{S}$ is obtained by explicitly computing all levels, then by the above observation, $Y$ can be recovered by tracing back the computation of the levels in $\widetilde{O}(n+\frac{1}{\varepsilon})$ time. This method, however, does not work when some level is dense. This is because in the dense case, we use additive combinatorics tools to show the existence of a particular sequence (Lemma 22(ii)), skip the remaining levels, and directly give the approximation set $\widetilde{S}$.

In order to make “tracing back” work, we tackle dense levels with the following alternative approach: even when a level is dense, it will be computed but only partially. We use additive combinatorics tools to ensure that the partial level has a small total size but is still dense enough to provide a good approximation.

4.2 Computing Partial Levels

We first show that, given an integer $k$, it is possible to compute a partial sumset of size exactly $k$.

Lemma 25.

Let $A$ and $B$ be two non-empty subsets of $[0,u]$ containing $0$. Let $k\leqslant|A+B|$ be a positive integer. In $O((k+|A|+|B|)\log^{7}u\,\mathrm{polyloglog}\,u)$ time, we can compute a set $H\subseteq A+B$ such that $|H|=k$ and that $H$ contains both $0$ and $\max(A)+\max(B)$.

Proof.

If |A|k|A|\geqslant k, HH can be easily obtained from A{max(A)+max(B)}A\cup\{\max(A)+\max(B)\}. Assume that |A|<k|A|<k. Let m=|B|m=|B|. For i[1,m]i\in[1,m], define BiB_{i} be the set of the first ii elements of BB. Note that |A+Bi||A+B_{i}| is an increasing function of ii. Moreover, let ii^{*} be the smallest ii with |A+Bi|k|A+B_{i}|\geqslant k. It must be that k|A+Bi|k+|A|k\leqslant|A+B_{i^{*}}|\leqslant k+|A|. And ii^{*} must exist since k|A+B|k\leqslant|A+B|. We use binary search to find ii^{*}. Start with i=m/2i=m/2. We can determine whether |A+Bi|k|A+B_{i}|\geqslant k via Lemma 20. We search the left half if |A+Bi|k|A+B_{i}|\geqslant k, and search the right half otherwise. The time cost for search ii^{*} is

logmO(klog6upolyloglogu)=O(klog7upolyloglogu).\log m\cdot O(k\log^{6}u\,\mathrm{polyloglog}\,u)=O(k\log^{7}u\,\mathrm{polyloglog}\,u).

Then we compute A+BiA+B_{i^{*}} via Lemma 5. Given A+BiA+B_{i^{*}}, HH is easy to construct. The time cost for computing A+BiA+B_{i^{*}} is

O((k+|A|)log5upolyloglogu)=O((k+|A|+|B|)log5upolyloglogu).O((k+|A|)\log^{5}u\,\mathrm{polyloglog}\,u)=O((k+|A|+|B|)\log^{5}u\,\mathrm{polyloglog}\,u).\qed
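The binary search in the proof above can be sketched as follows; size_at_least is a naive stand-in for the capped size estimation of Lemma 20 and is for illustration only.

def size_at_least(A, Bi, k):
    """Naive stand-in for Lemma 20: does |A + Bi| reach k?"""
    return len({a + b for a in A for b in Bi}) >= k

def smallest_prefix(A, B, k):
    """Smallest i with |A + B[:i]| >= k; assumes k <= |A + B|, so i* exists."""
    lo, hi = 1, len(B)
    while lo < hi:
        mid = (lo + hi) // 2
        if size_at_least(A, B[:mid], k):
            hi = mid                       # |A + B_mid| >= k: search the left half
        else:
            lo = mid + 1                   # too small: search the right half
    return lo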

Next we show that given a level that nicely approximates 𝒮X[βε,2βε]{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}], we can always compute a new (perhaps partial) level of small size that also nicely approximates 𝒮X[βε,2βε]{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}], even when the next level is dense. Recall that in level hh, there are \ell nodes A1,,AA_{1},\ldots,A_{\ell} where =mg2hmg=O(βlog1ε)\ell=\frac{mg}{2^{h}}\leqslant mg=O(\beta\log\frac{1}{\varepsilon}). For each i[1,]i\in[1,\ell], AiA_{i} is a subset of [0,2h+1ε][0,\frac{2^{h+1}}{\varepsilon}] with 0Ai0\in A_{i}. Moreover, i=1max(Ai)4βε\sum_{i=1}^{\ell}\max(A_{i})\geqslant\frac{4\beta}{\varepsilon} and the elements of AiA_{i}’s are assumed to be multiples of 2h+12^{h+1}.

Lemma 26.

Let A1,,AA_{1},\ldots,A_{\ell} be a level such that A1++AA_{1}+\cdots+A_{\ell} approximates 𝒮X[βε,2βε]{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}] with additive error δ\delta. In O~(1ε+i=1|Ai|)\widetilde{O}(\frac{1}{\varepsilon}+\sum_{i=1}^{\ell}|A_{i}|) time, we can compute Z1,,Z/2Z_{1},\ldots,Z_{\ell/2} such that the following is true.

  1. (i)

    For i[1,/2]i\in[1,\ell/2], ZiA2i1+A2iZ_{i}\subseteq A_{2i-1}+A_{2i} and {0,max(A2i1)+max(A2i)}Zi\{0,\max(A_{2i-1})+\max(A_{2i})\}\subseteq Z_{i}.

  2. (ii)

    i=1/2|Zi|=O(1εlog21ε)\sum_{i=1}^{\ell/2}|Z_{i}|=O(\frac{1}{\varepsilon}\log^{2}\frac{1}{\varepsilon}).

  3. (iii)

    Z1++Z/2Z_{1}+\cdots+Z_{\ell/2} approximates 𝒮X[βε,2βε]{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}] with additive error max(δ,β)\max(\delta,\beta).

Proof.

The proof is almost the same as that of Lemma 22. The only difference is that, in the dense case, we will compute a small subset of A2i1+A2iA_{2i-1}+A_{2i} so that the collection of these subsets has a small total size but is still dense.

Recall that the elements of AiA_{i}’s are multiples of 2h+12^{h+1}. For i[1,]i\in[1,\ell], define Ai={a2h+1:aAi}A^{\prime}_{i}=\{\frac{a}{2^{h+1}}:a\in A_{i}\}. Then A1,,AA^{\prime}_{1},\ldots,A^{\prime}_{\ell} are subsets of [0,1ε][0,\frac{1}{\varepsilon}]. Let γ=max(4,4mgβ)\gamma=\max(4,\lceil\frac{4mg}{\beta}\rceil). For i[1,/2]i\in[1,\ell/2], define Bi=A2i1+A2iB^{\prime}_{i}=A^{\prime}_{2i-1}+A^{\prime}_{2i}. We estimate the density of {B1,,B/2}\{B^{\prime}_{1},\ldots,B^{\prime}_{\ell/2}\} via Lemma 21. If {B1,,B/2}\{B^{\prime}_{1},\ldots,B^{\prime}_{\ell/2}\} is γ\gamma-sparse, we compute B1,,B/2B^{\prime}_{1},\ldots,B^{\prime}_{\ell/2}. Let Zi={2h+1b:bBi}Z_{i}=\{2^{h+1}\cdot b:b\in B^{\prime}_{i}\}. One can see that the ZiZ_{i}’s satisfy properties (i)–(iii), and the total time cost is O~(1ε)\widetilde{O}(\frac{1}{\varepsilon}).

Suppose that {B1,,B/2}\{B^{\prime}_{1},\ldots,B^{\prime}_{\ell/2}\} is γ\gamma-dense. Let u=2εu=\frac{2}{\varepsilon}. In this case, Lemma 21 also returns a subset II of [1,/2][1,\ell/2] such that |Bi|2cγu|I|+1|B^{\prime}_{i}|\geqslant\frac{2c\gamma u}{|I|}+1 for iIi\in I. We compute ZiZ^{\prime}_{i} for all i[1,/2]i\in[1,\ell/2] as follows. For iIi\in I, we compute a ZiA2i1+A2iZ^{\prime}_{i}\subseteq A^{\prime}_{2i-1}+A^{\prime}_{2i} via Lemma 25 such that

|Zi|=2cγu|I|+1.|Z^{\prime}_{i}|=\frac{2c\gamma u}{|I|}+1.

For iIi\notin I, we let Zi={0,max(A2i1)+max(A2i)}Z^{\prime}_{i}=\{0,\max(A^{\prime}_{2i-1})+\max(A^{\prime}_{2i})\}. Finally, we return Zi={2h+1z:zZi}Z_{i}=\{2^{h+1}\cdot z:z\in Z^{\prime}_{i}\} for all ii. The total running time is

2|I|+iI(2cγu|I|+1+|A2i1|+|A2i|)log7upolyloglogu\displaystyle\frac{\ell}{2}-|I|+\sum_{i\in I}(\frac{2c\gamma u}{|I|}+1+|A^{\prime}_{2i-1}|+|A^{\prime}_{2i}|)\log^{7}u\,\mathrm{polyloglog}\,u
\displaystyle\leqslant +(2cγu+|I|+i=1|Ai|)log7upolyloglogu\displaystyle\ell+(2c\gamma u+|I|+\sum_{i=1}^{\ell}|A^{\prime}_{i}|)\log^{7}u\,\mathrm{polyloglog}\,u
\displaystyle\leqslant O~(1ε+i=1|Ai|).\widetilde{O}(\frac{1}{\varepsilon}+\sum_{i=1}^{\ell}|A_{i}|).

Next, we show that the ZiZ_{i}’s satisfy properties (i)–(iii). Property (i) is guaranteed by Lemma 25. Property (ii) is satisfied since

i=1/2|Zi|=i=1/2|Zi|2|I|+iI(2cγu|I|+1)=O(1εlog1ε).\displaystyle\sum_{i=1}^{\ell/2}|Z_{i}|=\sum_{i=1}^{\ell/2}|Z^{\prime}_{i}|\leqslant\frac{\ell}{2}-|I|+\sum_{i\in I}(\frac{2c\gamma u}{|I|}+1)=O(\frac{1}{\varepsilon}\log\frac{1}{\varepsilon}).

We are left to prove property (iii). Since |Zi|2cγu|I|+1|Z^{\prime}_{i}|\geqslant\frac{2c\gamma u}{|I|}+1 for iIi\in I, the collection {Z1,,Z/2}\{Z^{\prime}_{1},\ldots,Z^{\prime}_{\ell/2}\} is γ\gamma-dense. By the construction of ZiZ^{\prime}_{i}, we have 0Zi0\in Z^{\prime}_{i} for any i[1,/2]i\in[1,\ell/2] and i=1/2max(Zi)=i=1max(Ai)\sum_{i=1}^{\ell/2}\max(Z^{\prime}_{i})=\sum_{i=1}^{\ell}\max(A^{\prime}_{i}). Therefore, Z1,,Z/2Z^{\prime}_{1},\ldots,Z^{\prime}_{\ell/2} has exactly the same property as the γ\gamma-dense collection {B1,,B/2}\{B^{\prime}_{1},\ldots,B^{\prime}_{\ell/2}\} in the proof of Lemma 22. That is, Z1++Z/2Z_{1}+\cdots+Z_{\ell/2} has a sequence z1<<zkz_{1}<\ldots<z_{k} such that z1βεz_{1}\leqslant\frac{\beta}{\varepsilon}, zk2βεz_{k}\geqslant\frac{2\beta}{\varepsilon}, and zizi1βz_{i}-z_{i-1}\leqslant\beta. Clearly, for any s𝒮X[βε,2βε]s\in{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}], there is some zZ1++Z/2z\in Z_{1}+\cdots+Z_{\ell/2} such that

sβzs+β.s-\beta\leqslant z\leqslant s+\beta.

For any zZ1++Z/2z\in Z_{1}+\cdots+Z_{\ell/2}, zz is also in A1++AA_{1}+\cdots+A_{\ell} which approximates 𝒮X[βε,2βε]{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}] with additive error δ\delta. So there is some s𝒮Xs\in{\cal S}_{X} such that

zδsz+δ.z-\delta\leqslant s\leqslant z+\delta.

By Definition 3, Z1++Z/2Z_{1}+\cdots+Z_{\ell/2} approximates 𝒮X[βε,2βε]{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}] with additive error max(δ,β)\max(\delta,\beta). ∎
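The construction in the dense case can be summarized by the following sketch. Here partial_sumset is a naive stand-in for the routine of Lemma 25, I is the index set returned by Lemma 21, and k plays the role of 2cγu/|I|+1.

def partial_sumset(A, B, k):
    """Naive stand-in for Lemma 25: k elements of A + B including 0 and
    max(A) + max(B) (assumes 0 in A, 0 in B, and 2 <= k <= |A + B|)."""
    S = sorted({a + b for a in A for b in B})
    return set(S[: k - 1]) | {S[-1]}       # size exactly k; contains 0 and S[-1]

def compute_partial_level(pairs, I, k):
    """Build Z'_1, ..., Z'_{l/2} from the pairs (A'_{2i-1}, A'_{2i}).

    Indices in I get a size-k partial sumset, keeping the collection
    gamma-dense; the others keep only the endpoints 0 and max + max,
    which is all that property (i) requires.
    """
    Z = []
    for i, (A, B) in enumerate(pairs):
        if i in I:
            Z.append(partial_sumset(A, B, k))
        else:
            Z.append({0, max(A) + max(B)})
    return Z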

4.3 Tackling Dense Levels via Partial Levels

Given color-coding partitions of XX as level 0, we compute a set S~\widetilde{S} in a tree-like manner as in Section 3. Now, no matter whether a level is dense or sparse, we can always compute it (at least partially) via Lemma 26. The total additive error is max(β,O(βlog2nε))=O(βlog2nε)\max(\beta,O(\beta\log^{2}\frac{n}{\varepsilon}))=O(\beta\log^{2}\frac{n}{\varepsilon}), which results from Lemma 26 and the rounding of the integers in each level. Since all the levels below S~\widetilde{S} are explicitly computed, we can use “tracing back” to recover the corresponding subset YY for each s~S~\tilde{s}\in\widetilde{S}.

Now we analyze the total running time of the above procedure. Lemma 26 is invoked at most log\log\ell times. Each level has a size of O(1εlog21ε)O(\frac{1}{\varepsilon}\log^{2}\frac{1}{\varepsilon}) (except for A1,,AA_{1},\ldots,A_{\ell}, which may have a total size of O(n)O(n)). The total running time is bounded by

O~((1ε+1εlog21ε)log+1ε+n)=O~(n+1ε).\widetilde{O}((\frac{1}{\varepsilon}+\frac{1}{\varepsilon}\log^{2}\frac{1}{\varepsilon})\cdot\log\ell+\frac{1}{\varepsilon}+n)\\ =\widetilde{O}(n+\frac{1}{\varepsilon}).

5 Conclusion

We present a near-linear time weak approximation scheme for Subset Sum in this paper. It is interesting to explore whether our technique can bring improvements to other related problems. In particular, it is a major open problem whether there exists an O(n+xmax)O(n+x_{\max})-time exact algorithm for Subset Sum, where xmaxx_{\max} refers to the largest input integer. The best known algorithm has a running time of O(n+xmax3/2)O(n+x_{\max}^{3/2}) [CLMZ24]; closing or narrowing this gap would be interesting.

Appendix A Problem Reduction

See Lemma 8.

Proof.

In this proof, when we say we can ε\varepsilon-solve a Subset Sum instance (X,t)(X,t), we mean that we can compute a set of size O~(1ε)\widetilde{O}(\frac{1}{\varepsilon}) that approximates 𝒮X[0,t]{\cal S}_{X}[0,t] with additive error O~(ε)t\widetilde{O}(\varepsilon)\cdot t, and given any element of the computed set, we can recover a solution with additive error O~(ε)t\widetilde{O}(\varepsilon)\cdot t.

Let (X,t)(X,t) be the original instance. Let Z={x<εt:xX}Z=\{x<\varepsilon t:x\in X\} be the set of tiny integers in XX. If Σ(Z)εt\Sigma(Z)\geqslant\varepsilon t, we iteratively extract from ZZ a minimal subset ZZ^{\prime} with Σ(Z)εt\Sigma(Z^{\prime})\geqslant\varepsilon t until Σ(Z)<εt\Sigma({Z})<\varepsilon t. At this point it is safe to discard the remaining Z{Z}, as it incurs only an additive error of εt\varepsilon t. Let Z1,Z2,,ZhZ_{1},Z_{2},\cdots,Z_{h} be the sets we extracted from ZZ. For each ZjZ_{j}, we replace it with a meta-integer of value Σ(Zj)\Sigma(Z_{j}). Note that Σ(Zj)<2εt\Sigma(Z_{j})<2\varepsilon t for every jj as ZjZ_{j} is minimal and every tiny integer is less than εt\varepsilon t. It is easy to see that the above transformation can be done in O(n)O(n) time. Moreover, replacing the tiny integers by the meta-integers incurs an additive error of at most 2εt2\varepsilon t, because for any subset ZZ^{\prime} of tiny integers, there is a maximal set of meta-integers that sums to at least Σ(Z)2εt\Sigma(Z^{\prime})-2\varepsilon t.
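The extraction of meta-integers is a single greedy pass; below is a minimal sketch (the name form_meta_integers is ours).

def form_meta_integers(Z, eps, t):
    """Greedily group tiny integers (each < eps*t) into meta-integers.

    A group first reaches eps*t when its last element is added, so its
    sum lies in [eps*t, 2*eps*t); the leftover sums to less than eps*t
    and may be discarded.
    """
    metas, cur_sum = [], 0
    for x in Z:                        # any order works
        cur_sum += x
        if cur_sum >= eps * t:         # current group is minimal: close it
            metas.append(cur_sum)
            cur_sum = 0
    return metas                       # the final partial group is discarded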

Now xεtx\geqslant\varepsilon t for all xXx\in X, which implies that xε2t1ε\frac{x}{\varepsilon^{2}t}\geqslant\frac{1}{\varepsilon}. By adjusting ε\varepsilon by a constant factor, we may assume that 1ε\frac{1}{\varepsilon} is an integer. Now we scale the whole instance by ε2t\varepsilon^{2}t, that is, we replace every xXx\in X by x:=xε2tx^{\prime}:=\lfloor\frac{x}{\varepsilon^{2}t}\rfloor and tt by t=tε2tt^{\prime}=\lfloor\frac{t}{\varepsilon^{2}t}\rfloor. Scaling incurs an approximation factor of at most 1+ε1+\varepsilon, or equivalently, an additive error of at most εt\varepsilon t. After scaling, we have x[1ε,1ε2]x\in[\frac{1}{\varepsilon},\frac{1}{\varepsilon^{2}}] for all xXx\in X and t=1ε2t=\frac{1}{\varepsilon^{2}}.

Then we divide XX into log1ε\log\frac{1}{\varepsilon} groups, where each group contains integers within [αε,2αε][\frac{\alpha}{\varepsilon},\frac{2\alpha}{\varepsilon}] for α{1,2,4,8,}[1,1ε]\alpha\in\{1,2,4,8,\cdots\}\cap[1,\frac{1}{\varepsilon}]. We denote X[αε,2αε]X\cap[\frac{\alpha}{\varepsilon},\frac{2\alpha}{\varepsilon}] as XαX_{\alpha}. Suppose we can ε\varepsilon-solve each (Xα,1ε2)(X_{\alpha},\frac{1}{\varepsilon^{2}}) in O~(n+1ε)\widetilde{O}(n+\frac{1}{\varepsilon})-time (and let S~α\tilde{S}_{\alpha} denote the set that approximates 𝒮Xα{\cal S}_{X_{\alpha}}), then we can also ε\varepsilon-solve (X,1ε2)(X,\frac{1}{\varepsilon^{2}}) in O~(n+1ε)\widetilde{O}(n+\frac{1}{\varepsilon})-time via Lemma 6 (by letting AjA_{j}’s be S~α\tilde{S}_{\alpha}’s).

From now on, we focus on the case where every xXx\in X satisfies x[αε,2αε]x\in[\frac{\alpha}{\varepsilon},\frac{2\alpha}{\varepsilon}] for some α[1,1ε]\alpha\in[1,\frac{1}{\varepsilon}]. We further scale the instance by α\alpha. That is, we let x=x/αx^{\prime}=\lfloor x/\alpha\rfloor for every xXx\in X and t=1αε2t^{\prime}=\lfloor\frac{1}{\alpha\varepsilon^{2}}\rfloor. Again, scaling incurs an approximation factor of at most 1+ε1+\varepsilon, or equivalently, an additive error of at most εt\varepsilon t.
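For concreteness, the grouping of the previous paragraph and the per-group scaling can be sketched together as follows; boundary conventions may differ slightly from the text.

def group_and_scale(X, eps):
    """Bucket each x in [1/eps, 1/eps^2] into the dyadic group
    [alpha/eps, 2*alpha/eps] and scale that group down by alpha."""
    groups = {}
    for x in X:
        alpha = 1
        while x > 2 * alpha / eps:                   # find the group of x
            alpha *= 2
        groups.setdefault(alpha, []).append(x)
    scaled = {}
    for alpha, xs in groups.items():
        t_scaled = int(1 / (alpha * eps * eps))      # floor(1/(alpha*eps^2))
        scaled[alpha] = ([x // alpha for x in xs], t_scaled)
    return scaled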

Now x[1ε,2ε]x\in[\frac{1}{\varepsilon},\frac{2}{\varepsilon}] for every xXx\in X and t=1αε2[1ε,1ε2]t=\lfloor\frac{1}{\alpha\varepsilon^{2}}\rfloor\in[\frac{1}{\varepsilon},\frac{1}{\varepsilon^{2}}]. We further assume that tΣ(X)/2t\leqslant\Sigma(X)/2. If t>Σ(X)/2t>\Sigma(X)/2, then to approximate 𝒮X[0,t]{\cal S}_{X}[0,t], it suffices to approximate 𝒮X[0,Σ(X)/2]{\cal S}_{X}[0,\Sigma(X)/2] and 𝒮X[Σ(X)/2,t]{\cal S}_{X}[\Sigma(X)/2,t] separately. The following claim indicates that it actually suffices to approximate 𝒮X[0,Σ(X)/2]{\cal S}_{X}[0,\Sigma(X)/2].

Claim 27.

Let S~\widetilde{S} be a set that approximates 𝒮X[0,Σ(X)/2]{\cal S}_{X}[0,\Sigma(X)/2] with additive error εt\varepsilon t. Then {Σ(X)s:sS~}\{\Sigma(X)-s^{\prime}:s^{\prime}\in\widetilde{S}\} approximates 𝒮X[Σ(X)/2,Σ(X)]{\cal S}_{X}[\Sigma(X)/2,\Sigma(X)] with additive error εt\varepsilon t.

Proof of Claim 27.

We prove Claim 27 by Definition 3. Consider an arbitrary s𝒮X[Σ(X)/2,Σ(X)]s\in{\cal S}_{X}[\Sigma(X)/2,\Sigma(X)], we have Σ(X)s𝒮X[0,Σ(X)/2]\Sigma(X)-s\in{\cal S}_{X}[0,\Sigma(X)/2]. Hence, there is s~S~\tilde{s}\in\widetilde{S} with Σ(X)sεts~Σ(X)s+εt\Sigma(X)-s-\varepsilon t\leqslant\tilde{s}\leqslant\Sigma(X)-s+\varepsilon t. Rearranging the inequality we get sεtΣ(X)s~s+εts-\varepsilon t\leqslant\Sigma(X)-\tilde{s}\leqslant s+\varepsilon t. Hence, {Σ(X)s:sS~}\{\Sigma(X)-s^{\prime}:s^{\prime}\in\widetilde{S}\} approximates 𝒮X[Σ(X)/2,Σ(X)]{\cal S}_{X}[\Sigma(X)/2,\Sigma(X)] with additive error εt\varepsilon t. ∎
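The mirroring of Claim 27 is a one-line transformation; a minimal sketch (the name mirror_top_half is ours):

def mirror_top_half(S_tilde, sigma_X):
    """Claim 27: complements of subset sums are subset sums, so mirroring
    an approximation of S_X[0, Σ(X)/2] covers S_X[Σ(X)/2, Σ(X)]."""
    return {sigma_X - s for s in S_tilde}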

From now on, we focus on approximating 𝒮X[0,t]{\cal S}_{X}[0,t] for tΣ(X)/2t\leqslant\Sigma(X)/2. To approximate 𝒮X[0,t]{\cal S}_{X}[0,t], we partition it into logt2log1ε\log t\leqslant 2\log\frac{1}{\varepsilon} subsets 𝒮X[0,1ε],𝒮X[1ε,2ε],𝒮X[2ε,4ε],{\cal S}_{X}[0,\frac{1}{\varepsilon}],{\cal S}_{X}[\frac{1}{\varepsilon},\frac{2}{\varepsilon}],{\cal S}_{X}[\frac{2}{\varepsilon},\frac{4}{\varepsilon}],\ldots. Note that 𝒮X[0,1ε]{\cal S}_{X}[0,\frac{1}{\varepsilon}] can be computed directly as x[1ε,2ε]x\in[\frac{1}{\varepsilon},\frac{2}{\varepsilon}] for every xXx\in X, so we focus on the remaining subsets. Each of the remaining subsets can be denoted as 𝒮X[βε,2βε]{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}] for some integer β[1,1ε]\beta\in[1,\frac{1}{\varepsilon}]. To approximate 𝒮X[0,t]{\cal S}_{X}[0,t] with an additive error of O~(εt)\widetilde{O}(\varepsilon t), it suffices to approximate each 𝒮X[βε,2βε]{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}] with an additive error of O~(β)\widetilde{O}(\beta). Notice that since 2βεt\frac{2\beta}{\varepsilon}\leqslant t and tΣ(X)/2t\leqslant\Sigma(X)/2, it follows directly that Σ(X)4βε\Sigma(X)\geqslant\frac{4\beta}{\varepsilon} for every 𝒮X[βε,2βε]{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]. ∎

Appendix B Details for Color-Coding

Let β[1,1ε]\beta\in[1,\frac{1}{\varepsilon}]. Fix q:=(1ε)O(1)q^{*}:=(\frac{1}{\varepsilon})^{-O(1)}. Let m:=4β/log4β2εqm:=4\beta/\log\frac{4\beta^{2}}{\varepsilon q^{*}} rounded up to the next power of 22, g:=36log24β2εqg:=36\log^{2}\frac{4\beta^{2}}{\varepsilon q^{*}} rounded up to the next power of 22, and r:=log4β2εqr:=\lceil\log\frac{4\beta^{2}}{\varepsilon q^{*}}\rceil. The goal of this section is to prove the following. See Lemma 10.

We are essentially following the color coding method in the prior work [Bri17] except that we need to additionally enforce the property that

max(X1,1j)++max(X1,gj)++max(Xm,1j)++max(Xm,gj)4βε.\max(X^{j}_{1,1})+\cdots+\max(X^{j}_{1,g})+\cdots+\max(X^{j}_{m,1})+\cdots+\max(X^{j}_{m,g})\geqslant\frac{4\beta}{\varepsilon}.

This can be achieved by ensuring that there are enough non-empty subsets, which is not difficult: for half of the subsets, we let each of them contain exactly one integer; for the remaining half, we use color coding. The procedure is summarized in Algorithm 4.

Algorithm 3 𝚂𝚖𝚊𝚕𝚕𝙲𝚘𝚕𝚘𝚛𝙲𝚘𝚍𝚒𝚗𝚐(X,m,g,r)\mathtt{SmallColorCoding}(X,m,g,r)
1:Input: A multi-set XX of integers such that |X|mg|X|\leqslant mg, and three integers m,g,rm,g,r
2:Output: rr partitions {X1,1j,,X1,gj,,Xm,1j,,Xm,gj}j[1,r]\{X^{j}_{1,1},\ldots,X^{j}_{1,g},\ldots,X^{j}_{m,1},\ldots,X^{j}_{m,g}\}_{j\in[1,r]} of XX
3:Partition XX into X1,1,,Xm,gX_{1,1},\ldots,X_{m,g} such that each subset satisfies |Xi,j|1|X_{i,j}|\leqslant 1
4:for j=1,,rj=1,\ldots,r do
5:     X1,1j,,Xm,gj:=X1,1,,Xm,gX^{j}_{1,1},\ldots,X^{j}_{m,g}:=X_{1,1},\ldots,X_{m,g}
6:return the rr partitions of XX.
Algorithm 4 𝙼𝚘𝚍𝚒𝚏𝚒𝚎𝚍𝙲𝚘𝚕𝚘𝚛𝙲𝚘𝚍𝚒𝚗𝚐(X,k,q)\mathtt{ModifiedColorCoding}(X,k,q)
1:Input: A multi-set XX of integers, a positive integer kk, and a target error probability qq
2:Output: rr partitions {X1,1j,,Xm,gj}j[1,r]\{X^{j}_{1,1},\ldots,X^{j}_{m,g}\}_{j\in[1,r]} of XX satisfying Lemma 10
3:m:=2k/log(k/q)m:=2k/\log(k/q) rounded up to the next power of 22
4:g:=36log2(k/q)g:=36\log^{2}(k/q) rounded up to the next power of 22
5:r:=logkqr:=\lceil\log\frac{k}{q}\rceil
6:if |X|mg/2|X|\leqslant mg/2 then
7:     {X1,1j,,Xm,gj}j[1,r]:=𝚂𝚖𝚊𝚕𝚕𝙲𝚘𝚕𝚘𝚛𝙲𝚘𝚍𝚒𝚗𝚐(X,m,g,r)\{X^{j}_{1,1},\ldots,X^{j}_{m,g}\}_{j\in[1,r]}:=\mathtt{SmallColorCoding}(X,m,g,r)
8:else
9:     Let X1X_{1} be an arbitrary subset of XX with |X1|=mg/2|X_{1}|=mg/2
10:     X2:=XX1X_{2}:=X\setminus X_{1}
11:     {X1,1j,,Xm/2,gj}j[1,r]:=𝚂𝚖𝚊𝚕𝚕𝙲𝚘𝚕𝚘𝚛𝙲𝚘𝚍𝚒𝚗𝚐(X1,m/2,g,r)\{X^{j}_{1,1},\ldots,X^{j}_{m/2,g}\}_{j\in[1,r]}:=\mathtt{SmallColorCoding}(X_{1},m/2,g,r)
12:     {Xm/2+1,1j,,Xm,gj}j[1,r]:=𝙲𝚘𝚕𝚘𝚛𝙲𝚘𝚍𝚒𝚗𝚐(X2,k,q)\{X^{j}_{m/2+1,1},\ldots,X^{j}_{m,g}\}_{j\in[1,r]}:=\mathtt{ColorCoding}(X_{2},k,q)
13:return {X1,1j,,Xm,gj}j[1,r]\{X^{j}_{1,1},\ldots,X^{j}_{m,g}\}_{j\in[1,r]}
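A runnable Python sketch of Algorithms 3 and 4 follows. Here color_coding is a stand-in for the standard ColorCoding routine of [Bri17] (independent uniformly random assignments of the integers to cells, repeated r times), and the parameter rounding mirrors lines 3–5 of Algorithm 4.

import math
import random

def small_color_coding(X, m, g, r):
    """Algorithm 3: put each of the |X| <= m*g integers in its own cell."""
    cells = [[x] for x in X] + [[] for _ in range(m * g - len(X))]
    return [cells for _ in range(r)]          # the r partitions coincide

def color_coding(X, num_cells, r):
    """Stand-in for ColorCoding of [Bri17]: r independent random partitions."""
    partitions = []
    for _ in range(r):
        cells = [[] for _ in range(num_cells)]
        for x in X:
            cells[random.randrange(num_cells)].append(x)
        partitions.append(cells)
    return partitions

def next_pow2(v):
    return 2 ** math.ceil(math.log2(max(1, v)))

def modified_color_coding(X, k, q):
    """Algorithm 4: half fixed singleton cells, half random color coding."""
    lg = math.log2(k / q)                     # assumes k > q
    m, g, r = next_pow2(2 * k / lg), next_pow2(36 * lg ** 2), math.ceil(lg)
    if len(X) <= m * g // 2:
        return small_color_coding(X, m, g, r)
    X1, X2 = X[: m * g // 2], X[m * g // 2:]
    fixed = small_color_coding(X1, m // 2, g, r)
    rand = color_coding(X2, (m // 2) * g, r)
    return [fixed[j] + rand[j] for j in range(r)]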

Now we are ready to prove Lemma 10.

Proof of Lemma 10.

We partition XX into {X1,1j,,Xm,gj}j[1,r]\{X^{j}_{1,1},\ldots,X^{j}_{m,g}\}_{j\in[1,r]} by Algorithm 4 as 𝙼𝚘𝚍𝚒𝚏𝚒𝚎𝚍𝙲𝚘𝚕𝚘𝚛𝙲𝚘𝚍𝚒𝚗𝚐\mathtt{ModifiedColorCoding} (X,2β,qε2β)(X,2\beta,\frac{q^{*}\varepsilon}{2\beta}).

We first prove that for any subset YXY\subseteq X with |Y|2β|Y|\leqslant 2\beta, with probability at least 1qε2β1-\frac{q^{*}\varepsilon}{2\beta},

Σ(Y)j=1rS1j++j=1rSmj.\Sigma(Y)\in\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}.

We distinguish two cases.

If |X|mg/2|X|\leqslant mg/2, the SijS_i^j’s are the same for different jj, hence j=1rSij=Si1\bigcup_{j=1}^{r}S^{j}_{i}=S^{1}_{i} for all i[1,m]i\in[1,m]. So

j=1rS1j++j=1rSmj=S11++Sm1=(X1,11{0})++(Xm,g1{0}).\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}=S^{1}_{1}+\cdots+S^{1}_{m}=(X^{1}_{1,1}\cup\{0\})+\cdots+(X^{1}_{m,g}\cup\{0\}).

As each subset has at most one element, it is obvious that for any YXY\subseteq X,

Σ(Y)(X1,11{0})++(Xm,g1{0}).\Sigma(Y)\in(X^{1}_{1,1}\cup\{0\})+\cdots+(X^{1}_{m,g}\cup\{0\}).

If |X|>mg/2|X|>mg/2, for any partition X=X1X2X=X_{1}\cup X_{2}, YY can be partitioned into Y1Y_{1} and Y2Y_{2} such that Y1X1Y_{1}\subseteq X_{1} and Y2X2Y_{2}\subseteq X_{2}. Since |X1|=mg/2|X_{1}|=mg/2, using the same argument as above, we have Σ(Y1)j=1rS1j++j=1rSm/2j\Sigma(Y_{1})\in\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m/2}. Note that |Y2||Y|2β|Y_{2}|\leqslant|Y|\leqslant 2\beta. According to Lemma 9, with probability at least 1qε2β1-\frac{q^{*}\varepsilon}{2\beta}, Σ(Y2)j=1rSm/2+1j++j=1rSmj\Sigma(Y_{2})\in\bigcup_{j=1}^{r}S^{j}_{m/2+1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}. So with probability at least 1qε2β1-\frac{q^{*}\varepsilon}{2\beta},

Σ(Y)=Σ(Y1)+Σ(Y2)j=1rS1j++j=1rSm/2j+j=1rSm/2+1j++j=1rSmj.\Sigma(Y)=\Sigma(Y_{1})+\Sigma(Y_{2})\in\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m/2}+\bigcup_{j=1}^{r}S^{j}_{m/2+1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}.

Note that |𝒮X[βε,2βε]|βε+1|{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]|\leqslant\frac{\beta}{\varepsilon}+1. For each s𝒮X[βε,2βε]s\in{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}], we have shown that the probability that s=Σ(Y)s=\Sigma(Y) for some YY yet sj=1rS1j++j=1rSmjs\not\in\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m} is at most qε2β\frac{q^{*}\varepsilon}{2\beta}. So with probability at least 1qε2β(βε+1)1q1-\frac{q^{*}\varepsilon}{2\beta}\cdot(\frac{\beta}{\varepsilon}+1)\geqslant 1-q^{*},

𝒮X[βε,2βε](j=1rS1j++j=1rSmj).{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]\subseteq(\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}).

It is easy to see that 𝒮X[βε,2βε](j=1rS1j++j=1rSmj){\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]\supseteq(\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}). So with probability at least 1q1-q^{*},

𝒮X[βε,2βε]=(j=1rS1j++j=1rSmj).{\cal S}_{X}[\frac{\beta}{\varepsilon},\frac{2\beta}{\varepsilon}]=(\bigcup_{j=1}^{r}S^{j}_{1}+\cdots+\bigcup_{j=1}^{r}S^{j}_{m}).

Next, we show that for any j[1,r]j\in[1,r],

max(X1,1j)++max(X1,gj)++max(Xm,1j)++max(Xm,gj)4βε.\max(X^{j}_{1,1})+\cdots+\max(X^{j}_{1,g})+\cdots+\max(X^{j}_{m,1})+\cdots+\max(X^{j}_{m,g})\geqslant\frac{4\beta}{\varepsilon}.

If |X|mg/2|X|\leqslant mg/2, as each subset has at most one element,

max(X1,1j)++max(Xm,gj)=Σ(X)4βε.\max(X^{j}_{1,1})+\cdots+\max(X^{j}_{m,g})=\Sigma(X)\geqslant\frac{4\beta}{\varepsilon}.

If |X|>mg/2|X|>mg/2, as mg144βmg\geqslant 144\beta and for any xXx\in X we have that x1εx\geqslant\frac{1}{\varepsilon}, it holds that

max(X1,1j)++max(Xm,gj)max(X1,1j)++max(Xm/2,gj)=Σ(X1)mg21ε4βε.\max(X^{j}_{1,1})+\cdots+\max(X^{j}_{m,g})\geqslant\max(X^{j}_{1,1})+\cdots+\max(X^{j}_{m/2,g})=\Sigma(X_{1})\geqslant\frac{mg}{2}\cdot\frac{1}{\varepsilon}\geqslant\frac{4\beta}{\varepsilon}.

Finally, we analyze the running time. We partition XX into O~(1ε)\widetilde{O}(\frac{1}{\varepsilon}) subsets, and repeat the process of partitioning each subset into polylog(n,1ε)\mathrm{polylog}(n,\frac{1}{\varepsilon}) subsets polylog(n,1ε)\mathrm{polylog}(n,\frac{1}{\varepsilon}) times. So the running time is O~(n+1ε)\widetilde{O}(n+\frac{1}{\varepsilon}). ∎

Appendix C Estimating Sumset Size

See Lemma 20.

Proof.

Suppose |A|,|B|<k|A|,|B|<k, since otherwise it is trivial. Let τ\tau be any positive number. For any integer set ZZ, define Zmodτ={zmodτ:zZ}Z\bmod \tau=\{z\bmod \tau:z\in Z\}. It is easy to verify that

|(Amodτ)+(Bmodτ)|2|(A+B)modτ|2|A+B|.\displaystyle|(A\bmod\tau)+(B\bmod\tau)|\leqslant 2|(A+B)\bmod\tau|\leqslant 2|A+B|. (2)

For j=1,,loguj=1,\ldots,\lceil\log u\rceil, we compute (Amod2j)+(Bmod2j)(A\bmod 2^{j})+(B\bmod 2^{j}) via Lemma 5. If |(Amod2j)+(Bmod2j)|2k|(A\bmod 2^{j})+(B\bmod 2^{j})|\geqslant 2k for some jj, then we immediately stop and conclude that |A+B|k|A+B|\geqslant k by Eq (2). Otherwise, we will reach j=loguj=\lceil\log u\rceil, and obtain A+BA+B in this round. Every round takes O(klog5upolyloglogu)O(k\log^{5}u\,\mathrm{polyloglog}\,u) time except for the last round. Let jj^{*} be the last round. Since we do not stop at round j1j^{*}-1,

|(Amod2j1)+(Bmod2j1)|<2k.|(A\bmod 2^{j^{*}-1})+(B\bmod 2^{j^{*}-1})|<2k.

We make the following claim.

Claim 28.

Let A and B be two sets of integers. For any positive integer kk,

|(Amod2k)+(Bmod2k)|3|(Amodk)+(Bmodk)|.|(A\bmod 2k)+(B\bmod 2k)|\leqslant 3|(A\bmod k)+(B\bmod k)|.

Given Claim 28, we conclude that the last round also takes O(klog5upolyloglogu)O(k\log^{5}u\,\mathrm{polyloglog}\,u) time since

|(Amod2j)+(Bmod2j)|3|(Amod2j1)+(Bmod2j1)|<6k.|(A\bmod 2^{j^{*}})+(B\bmod 2^{j^{*}})|\leqslant 3|(A\bmod 2^{j^{*}-1})+(B\bmod 2^{j^{*}-1})|<6k.

Therefore, the total running time is O(klog6upolyloglogu)O(k\log^{6}u\,\mathrm{polyloglog}\,u). It remains to prove Claim 28.

Proof of Claim 28.

Let C=(Amod2k)+(Bmod2k)C=(A\bmod 2k)+(B\bmod 2k), C=(Amodk)+(Bmodk)C^{\prime}=(A\bmod k)+(B\bmod k). We prove that C{c,c+k,c+2k:cC}C\subseteq\{c^{\prime},c^{\prime}+k,c^{\prime}+2k:c^{\prime}\in C^{\prime}\}. So |C|3|C||C|\leqslant 3|C^{\prime}|.

For any cCc\in C, we can write c=a+bc=a+b for some a(Amod2k)a\in(A\bmod 2k) and b(Bmod2k)b\in(B\bmod 2k). We have a=amodk(Amodk)a^{\prime}=a\bmod k\in(A\bmod k) and b=bmodk(Bmodk)b^{\prime}=b\bmod k\in(B\bmod k). Let c=a+bCc^{\prime}=a^{\prime}+b^{\prime}\in C^{\prime}. By definition, it is immediate that c{c,ck,c2k}c^{\prime}\in\{c,c-k,c-2k\} and therefore c{c,c+k,c+2k:cC}c\in\{c^{\prime},c^{\prime}+k,c^{\prime}+2k:c^{\prime}\in C^{\prime}\}. ∎
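The doubling-modulus loop can be sketched as follows; sumset is built naively here, whereas the proof uses the output-sensitive computation of Lemma 5 so that every round stays within the stated time bound.

def sumset(A, B):
    """Naive sumset, standing in for the output-sensitive Lemma 5."""
    return {a + b for a in A for b in B}

def estimate_sumset_size(A, B, k, u):
    """Either certify |A + B| >= k, or return A + B itself.

    By Eq. (2), |(A mod 2^j) + (B mod 2^j)| >= 2k already certifies
    |A + B| >= k, so we may stop early; by Claim 28, the first reduced
    sumset to exceed the threshold still has size O(k).
    """
    j = 1
    while True:
        mod = 2 ** j
        S = sumset({a % mod for a in A}, {b % mod for b in B})
        if len(S) >= 2 * k:
            return None, True             # certified: |A + B| >= k
        if mod > u:                       # A mod 2^j = A and B mod 2^j = B
            return S, len(S) >= k         # S equals A + B exactly
        j += 1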

References

  • [ABHS22] Amir Abboud, Karl Bringmann, Danny Hermelin, and Dvir Shabtay. SETH-based Lower Bounds for Subset Sum and Bicriteria Path. ACM Transactions on Algorithms, 18(1):1–22, January 2022. doi:10.1145/3450524.
  • [Alo87] N. Alon. Subset sums. Journal of Number Theory, 27(2):196–205, October 1987. doi:10.1016/0022-314X(87)90061-8.
  • [AR15] Andrew Arnold and Daniel S. Roche. Output-Sensitive Algorithms for Sumset and Sparse Polynomial Multiplication. In Proceedings of the 2015 ACM on International Symposium on Symbolic and Algebraic Computation (ISSAC 2015), pages 29–36, New York, USA, June 2015. doi:10.1145/2755996.2756653.
  • [AT19] Kyriakos Axiotis and Christos Tzamos. Capacitated Dynamic Programming: Faster Knapsack and Graph Algorithms. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), volume 132, pages 19:1–19:13, Dagstuhl, Germany, 2019. doi:10.4230/LIPIcs.ICALP.2019.19.
  • [BC23] Karl Bringmann and Alejandro Cassis. Faster 0-1-Knapsack via Near-Convex Min-Plus-Convolution. In 31st Annual European Symposium on Algorithms (ESA 2023), volume 274, pages 24:1–24:16, Dagstuhl, Germany, 2023. doi:10.4230/LIPIcs.ESA.2023.24.
  • [BFN21] Karl Bringmann, Nick Fischer, and Vasileios Nakos. Sparse nonnegative convolution is equivalent to dense nonnegative convolution. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC 2021), pages 1711–1724, Virtual Italy, June 2021. doi:10.1145/3406325.3451090.
  • [BFN22] Karl Bringmann, Nick Fischer, and Vasileios Nakos. Deterministic and Las Vegas Algorithms for Sparse Nonnegative Convolution. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2022), Proceedings, pages 3069–3090, January 2022. doi:10.1137/1.9781611977073.119.
  • [BN20] Karl Bringmann and Vasileios Nakos. Top-k-convolution and the quest for near-linear output-sensitive subset sum. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC 2020), pages 982–995, 2020. doi:10.1145/3357713.3384308.
  • [BN21a] Karl Bringmann and Vasileios Nakos. Fast n-Fold Boolean Convolution via Additive Combinatorics. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021), 2021. doi:10.4230/LIPIcs.ICALP.2021.41.
  • [BN21b] Karl Bringmann and Vasileios Nakos. A Fine-Grained Perspective on Approximating Subset Sum and Partition. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA 2021), Proceedings, pages 1797–1815, January 2021. doi:10.1137/1.9781611976465.108.
  • [Bri17] Karl Bringmann. A Near-Linear Pseudopolynomial Time Algorithm for Subset Sum. In Proceedings of the 2017 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2017), pages 1073–1084, January 2017. doi:10.1137/1.9781611974782.69.
  • [Bri23] Karl Bringmann. Knapsack with Small Items in Near-Quadratic Time, September 2023. arXiv:2308.03075.
  • [BW21] Karl Bringmann and Philip Wellnitz. On Near-Linear-Time Algorithms for Dense Subset Sum. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA 2021), pages 1777–1796, January 2021. doi:10.1137/1.9781611976465.107.
  • [CH02] Richard Cole and Ramesh Hariharan. Verifying candidate matches in sparse and wildcard matching. In Proceedings of the Thiry-Fourth Annual ACM Symposium on Theory of Computing (STOC 2002), pages 592–601, New York, NY, USA, May 2002. doi:10.1145/509907.509992.
  • [Cha18] Timothy M. Chan. Approximation Schemes for 0-1 Knapsack. In 1st Symposium on Simplicity in Algorithms (SOSA 2018), volume 61, pages 5:1–5:12, Dagstuhl, Germany, 2018. doi:10.4230/OASIcs.SOSA.2018.5.
  • [CL91] Edward G. Coffman and George S. Lueker. Probabilistic analysis of packing and partitioning algorithms. In Wiley-Interscience series in discrete mathematics and optimization, 1991. URL: https://api.semanticscholar.org/CorpusID:27018605.
  • [CL15] Timothy M. Chan and Moshe Lewenstein. Clustered Integer 3SUM via Additive Combinatorics. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing (STOC 2015), pages 31–40, New York, NY, USA, June 2015. doi:10.1145/2746539.2746568.
  • [CLMZ23] Lin Chen, Jiayi Lian, Yuchen Mao, and Guochuan Zhang. A Nearly Quadratic-Time FPTAS for Knapsack, August 2023. arXiv:2308.07821.
  • [CLMZ24] Lin Chen, Jiayi Lian, Yuchen Mao, and Guochuan Zhang. Faster Algorithms for Bounded Knapsack and Bounded Subset Sum Via Fine-Grained Proximity Results. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2024), Proceedings, pages 4828–4848. Society for Industrial and Applied Mathematics, January 2024. doi:10.1137/1.9781611977912.171.
  • [CMWW19] Marek Cygan, Marcin Mucha, Karol Węgrzycki, and Michał Włodarczyk. On problems equivalent to (min,+)-convolution. ACM Transactions on Algorithms, 15(1):1–25, January 2019. doi:10.1145/3293465.
  • [DJM23] Mingyang Deng, Ce Jin, and Xiao Mao. Approximating Knapsack and Partition via Dense Subset Sums. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2023), Proceedings, pages 2961–2979, January 2023. doi:10.1137/1.9781611977554.ch113.
  • [GL78] George V Gens and Eugene V Levner. Approximation algorithm for some scheduling problems. Engrg. Cybernetics, 6:38–46, 1978.
  • [GL79] George V Gens and Eugene V Levner. Computational complexity of approximation algorithms for combinatorial problems. In International Symposium on Mathematical Foundations of Computer Science, 1979. URL: https://api.semanticscholar.org/CorpusID:33715239.
  • [GL80] George V Gens and Eugene V Levner. Fast approximation algorithms for knapsack type problems. In Optimization Techniques, Lecture Notes in Control and Information Sciences, pages 185–194, Berlin, Heidelberg, 1980. doi:10.1007/BFb0006603.
  • [GL94] George V Gens and Eugene V Levner. A Fast Approximation Algorithm For The Subset-Sum Problem. INFOR: Information Systems and Operational Research, 32(3):143–148, 1994. doi:10.1080/03155986.1994.11732245.
  • [GM91] Zvi Galil and Oded Margalit. An Almost Linear-Time Algorithm for the Dense Subset-Sum Problem. SIAM Journal on Computing, 20(6):1157–1189, December 1991. doi:10.1137/0220072.
  • [Hay02] Brian Hayes. Computing Science:The Easiest Hard Problem. American Scientist, 90(2):113–117, 2002. URL: https://www.jstor.org/stable/27857621.
  • [IK75] Oscar H. Ibarra and Chul E. Kim. Fast Approximation Algorithms for the Knapsack and Sum of Subset Problems. Journal of the ACM, 22(4):463–468, 1975. doi:10.1145/321906.321909.
  • [Jin19] Ce Jin. An improved FPTAS for 0-1 knapsack. In Proceedings of 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), volume 132, pages 76:1–76:14, 2019. doi:10.4230/LIPIcs.ICALP.2019.76.
  • [Jin23] Ce Jin. 0-1 Knapsack in Nearly Quadratic Time, August 2023. arXiv:2308.04093.
  • [Kar75] Richard M Karp. The fast approximate solution of hard combinatorial problems. In Proc. 6th South-Eastern Conf. Combinatorics, Graph Theory and Computing (Florida Atlantic U. 1975), pages 15–31, 1975.
  • [KMPS03] Hans Kellerer, Renata Mansini, Ulrich Pferschy, and Maria Grazia Speranza. An efficient fully polynomial approximation scheme for the Subset-Sum Problem. Journal of Computer and System Sciences, 66(2):349–370, 2003. doi:10.1016/S0022-0000(03)00006-0.
  • [KP04] Hans Kellerer and Ulrich Pferschy. Improved Dynamic Programming in Connection with an FPTAS for the Knapsack Problem. Journal of Combinatorial Optimization, 8(1):5–11, March 2004. doi:10.1023/B:JOCO.0000021934.29833.6b.
  • [KPS97] Hans Kellerer, Ulrich Pferschy, and Maria Grazia Speranza. An efficient approximation scheme for the subset-sum problem. In Hon Wai Leong, Hiroshi Imai, and Sanjay Jain, editors, Algorithms and Computation, Lecture Notes in Computer Science, pages 394–403, Berlin, Heidelberg, 1997. doi:10.1007/3-540-63890-3_42.
  • [KPS17] Marvin Künnemann, Ramamohan Paturi, and Stefan Schneider. On the Fine-Grained Complexity of One-Dimensional Dynamic Programming. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017), volume 80, pages 21:1–21:15, 2017. doi:10.4230/LIPIcs.ICALP.2017.21.
  • [Law79] Eugene L. Lawler. Fast Approximation Algorithms for Knapsack Problems. Mathematics of Operations Research, 4(4):339–356, 1979. URL: https://www.jstor.org/stable/3689221.
  • [Mao23] Xiao Mao. (1ϵ)(1-\epsilon)-Approximation of Knapsack in Nearly Quadratic Time, August 2023. arXiv:2308.07004.
  • [MH78] Ralph C. Merkle and Martin E. Hellman. Hiding information and signatures in trapdoor knapsacks. IEEE Transactions on Information Theory, 24(5):525–530, September 1978. doi:10.1109/TIT.1978.1055927.
  • [MWW19] Marcin Mucha, Karol Węgrzycki, and Michał Włodarczyk. A Subquadratic Approximation Scheme for Partition. In Proceedings of the 2019 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2019), Proceedings, pages 70–88, January 2019. doi:10.1137/1.9781611975482.5.
  • [Nak20] Vasileios Nakos. Nearly Optimal Sparse Polynomial Multiplication. IEEE Transactions on Information Theory, 66(11):7231–7236, November 2020. doi:10.1109/TIT.2020.2989385.
  • [PRW21] Adam Polak, Lars Rohwedder, and Karol Węgrzycki. Knapsack and Subset Sum with Small Items. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021), volume 198, pages 106:1–106:19, Dagstuhl, Germany, 2021. doi:10.4230/LIPIcs.ICALP.2021.106.
  • [Rhe15] Donguk Rhee. Faster fully polynomial approximation schemes for knapsack problems. PhD thesis, Massachusetts Institute of Technology, 2015. URL: https://api.semanticscholar.org/CorpusID:61901463.
  • [Sár89] A. Sárközy. Finite addition theorems, I. Journal of Number Theory, 32(1):114–130, May 1989. doi:10.1016/0022-314X(89)90102-9.
  • [Sár94] A. Sárközy. Fine Addition Theorems, II. Journal of Number Theory, 48(2):197–218, August 1994. doi:10.1006/jnth.1994.1062.
  • [SV05] E. Szemerédi and V. Vu. Long arithmetic progressions in sumsets: Thresholds and bounds. Journal of the American Mathematical Society, 19(1):119–169, September 2005. doi:10.1090/S0894-0347-05-00502-3.
  • [SV06] E. Szemerédi and V. H. Vu. Finite and Infinite Arithmetic Progressions in Sumsets. Annals of Mathematics, 163(1):1–35, 2006. URL: https://www.jstor.org/stable/20159950.
  • [WC22] Xiaoyu Wu and Lin Chen. Improved Approximation Schemes for (Un-)Bounded Subset-Sum and Partition, December 2022. arXiv:2212.02883.