This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Almost Optimal Inapproximability of Multidimensional Packing Problems

Sai Sandeep [email protected]. Research supported in part by NSF grants CCF-1563742 and CCF-1908125.
(Computer Science Department
Carnegie Mellon University
Pittsburgh, PA 15213)
Abstract

Multidimensional packing problems generalize the classical packing problems such as Bin Packing, Multiprocessor Scheduling by allowing the jobs to be dd-dimensional vectors. While the approximability of the scalar problems is well understood, there has been a significant gap between the approximation algorithms and the hardness results for the multidimensional variants. In this paper, we close this gap by giving almost tight hardness results for these problems.

  1. 1.

    We show that Vector Bin Packing has no Ω(logd)\Omega(\log d) factor asymptotic approximation algorithm when dd is a large constant, assuming PNP\textsf{P}\neq\textsf{NP}. This matches the lnd+O(1)\ln d+O(1) factor approximation algorithms (Chekuri, Khanna SICOMP 2004, Bansal, Caprara, Sviridenko SICOMP 2009, Bansal, Eliáš, Khan SODA 2016) upto constants.

  2. 2.

    We show that Vector Scheduling has no polynomial time algorithm with an approximation ratio of Ω((logd)1ϵ)\Omega\left((\log d)^{1-\epsilon}\right) when dd is part of the input, assuming NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right). This almost matches the O(logdloglogd)O\left(\frac{\log d}{\log\log d}\right) factor algorithms(Harris, Srinivasan JACM 2019, Im, Kell, Kulkarni, Panigrahi SICOMP 2019). We also show that the problem is NP-hard to approximate within (loglogd)ω(1)(\log\log d)^{\omega(1)}.

  3. 3.

    We show that Vector Bin Covering is NP-hard to approximate within Ω(logdloglogd)\Omega\left(\frac{\log d}{\log\log d}\right) when dd is part of the input, almost matching the O(logd)O(\log d) factor algorithm (Alon et al., Algorithmica 1998).

Previously, no hardness results that grow with dd were known for Vector Scheduling and Vector Bin Covering when dd is part of the input and for Vector Bin Packing when dd is a fixed constant.

1 Introduction

Bin Packing and Multiprocessor Scheduling (also known as Makespan Minimization) are some of the most fundamental problems in Combinatorial Optimization. They have been studied intensely from the early days of approximation algorithms and have had a great impact on the field. These two are packing problems where we have nn jobs with certain sizes, and the objective is to pack them into bins efficiently. In Bin Packing, each bin has unit size and the objective is to minimize the number of bins, while in Multiprocessor Scheduling, we are given a fixed number of bins and the objective is to minimize the maximum load in a bin. A relatively less studied but natural problem is Bin Covering, where the objective is to assign the jobs to the maximum number of bins such that in each bin, the load is at least 11. These problems are well understood in terms of approximation algorithms: all the three problems are NP-hard, and all of them have a Polynomial Time Approximation Scheme (PTAS) [dlVL81, HS87, CJK01].

In this paper, we study the approximability of the multidimensional generalizations of these three problems. The three corresponding problems are Vector Bin Packing, Vector Scheduling, and Vector Bin Covering. Apart from their theoretical importance, these problems are widely applicable in practice [Spi94, ST12, PTUW11] where the jobs often have multiple dimensions such as CPU, Hard disk, memory, etc.

In the Vector Bin Packing problem, the input is a set of nn vectors in [0,1]d[0,1]^{d} and the goal is to partition the vectors into the minimum number of parts such that in each part, the sum of vectors is at most 11 in every coordinate. The problem behaves differently from Bin Packing even when d=2d=2: Woeginger [Woe97] proved111Very recently, Ray [Ray21] found an oversight in Woeginger’s original proof and gave a revised APX hardness proof for the problem. that there is no asymptotic222The asymptotic approximation ratio (formally defined in Section 2) of an algorithm is the ratio of its cost and the optimal cost when the optimal cost is large enough. All the approximation factors mentioned in this paper for Vector Bin Packing are asymptotic. PTAS for 22-dimensional Vector Bin Packing, assuming PNP\textsf{P}\neq\textsf{NP}. On the algorithmic front, the PTAS for Bin Packing [dlVL81] easily implies a d+ϵd+\epsilon approximation for Vector Bin Packing. When dd is part of the input, this is almost tight: there is a lower bound of d1ϵd^{1-\epsilon} shown by [CK04]333[CK04] actually give d12ϵd^{\frac{1}{2}-\epsilon} hardness, but it has been shown later (see e.g.,  [CKPT17]) that a slight modification of their reduction gives d1ϵd^{1-\epsilon} hardness.. When dd is a fixed constant444The algorithms are now allowed to run in time nf(d)n^{f(d)}, for some function ff., much better algorithms are known [CK04, BCS09, BEK16] that get lnd+O(1)\ln d+O(1) approximation guarantee. However, the best hardness factor (for arbitrary constant dd) is still the APX-hardness result of the 22-dimensional problem due to Woeginger from 1997.

Closing this gap, either by obtaining a O(1)O(1) factor algorithm or showing a hardness factor that is a function of dd, has remained a challenging open problem. It is one of the ten open problems in a recent survey on multidimensional scheduling problems [CKPT17]. It also appeared in a recent report by Bansal [Ban17] on open problems in scheduling. In fact, to the best of our knowledge, super constant integrality gap instances for the configuration LP relaxation of the problem were also not known. For the integer instances i.e. when the vectors are from {0,1}d\{0,1\}^{d} (which are the hard instances for Vector Scheduling and Vector Bin Covering), there is an asymptotic PTAS since there are Od(1)O_{d}(1) item types.

In the Vector Scheduling problem, given a set of nn vector jobs in [0,1]d[0,1]^{d}, and mm identical machines, the objective is to assign the jobs to machines to minimize the maximum \ell_{\infty} norm of the load on the machines. Chekuri and Khanna [CK04] introduced the problem as a natural generalization of Multiprocessor Scheduling and obtained a PTAS for the problem when dd is a fixed constant. When dd is part of the input, they obtained a O(log2d)O(\log^{2}d) factor approximation algorithm. They also showed that it is NP-hard to obtain a CC factor approximation algorithm for the problem, for any constant CC. Meyerson, Roytman, and Tagiku [MRT13] gave an improved O(logd)O(\log d) factor algorithm while the current best factor is O(logdloglogd)O\left(\frac{\log d}{\log\log d}\right) due to Harris and Srinivasan [HS19] and Im, Kell, Kulkarni, and Panigrahi [IKKP19]. The algorithm of Harris and Srinivasan [HS19] works for the more general setting of unrelated machines where each job can have a different vector load for each machine. However, no super constant hardness is known even in this unrelated machines setting.

In the Vector Bin Covering problem, the input is a set of nn vectors in [0,1]d[0,1]^{d}. The objective is to partition these into the maximum number of parts such that in each part, the sum of vectors is at least 11 in every coordinate. It is also referred to as “dual Vector Packing” in the literature. This problem is introduced by Alon et al[AAC+98] who gave a O(logd)O(\log d) factor approximation algorithm. They also gave a dd factor algorithm using a method from the area of compact vector approximation that outperforms the above algorithm for small values of dd. On the hardness front, Ray [Ray21] showed that the 22-dimensional Vector Bin Covering problem is hard to approximate within a factor of 998997\frac{998}{997}.

1.1 Our Results

We prove almost optimal hardness results for the three multidimensional problems discussed above.

1.1.1 Vector Bin Packing

For the Vector Bin Packing problem, we prove a Ω(logd)\Omega(\log d) asymptotic hardness of approximation when dd is a large constant, matching the lnd+O(1)\ln d+O(1) approximation algorithms [CK04, BCS09, BEK16], up to constants.

Theorem 1.1.

There exists an integer d0d_{0} and a constant c>0c>0 such that for all constants dd0d\geq d_{0}, dd-dimensional Vector Bin Packing has no asymptotic clogdc\log d factor polynomial time approximation algorithm unless P=NP\emph{{P}}=\emph{{NP}}.

We obtain our hardness result via a reduction from the set cover problem on certain structured instances. In the set cover problem, we are given a set system 𝒮2V\mathcal{S}\subseteq 2^{V} on a universe VV, and the goal is to pick the minimum number of sets from 𝒮\mathcal{S} whose union is VV. Observe that Vector Bin Packing is a special case of the set cover problem with the vectors being the elements and every maximal set of vectors whose sum is at most 11 in every coordinate (known as “configurations”) being the sets. In fact, in the elegant Round & Approx framework [BCS09, BEK16], the Vector Bin Packing problem is viewed as a set cover instance, and the algorithms proceed by rounding the standard set cover LP. Towards proving the hardness of Vector Bin Packing, we ask the converse: Which families of set cover instances can be cast as dd-dimensional Vector Bin Packing?

We formalize this question using the notion of packing dimension of a set system 𝒮\mathcal{S} on a universe VV: it is the smallest integer dd such that there is an embedding f:V[0,1]df:V\rightarrow[0,1]^{d} such that a set SVS\subseteq V is in 𝒮\mathcal{S} if and only if

vSf(v)1\left\lVert\sum_{v\in S}f(v)\right\rVert_{\infty}\leq 1

If a set system has packing dimension dd, then the corresponding set cover problem can be embedded as a dd-dimensional Vector Bin Packing instance. However, it is not clear if the hard instances of the set cover problem have a low packing dimension. Indeed the instances in the (1ϵ)lnn(1-\epsilon)\ln n set cover hardness [Fei98] have a large packing dimension that grows with nn, which we cannot afford as we are operating in the constant dd regime. We get around this by starting our reduction from highly structured yet hard instances of set cover. In particular, we study simple bounded set systems which satisfy the following three properties:

  1. 1.

    The set system is simple555Simple set families are also known as linear set families. i.e., every pair of sets intersect in at most one element.

  2. 2.

    The cardinality of each set is at most kk, a fixed constant.

  3. 3.

    Each element in the family is present in at most Δ=kO(1)\Delta=k^{O(1)} sets.

Kumar, Arya, and Ramesh [KAR00] proved that simple set cover i.e., set cover with the restriction that every pair of sets intersect in at most one element, is hard to approximate within Ω(logn)\Omega(\log n). We observe that by modifying the parameters slightly in their proof, we can obtain the Ω(logk)\Omega(\log k) hardness of simple bounded set cover.

We prove that simple bounded set systems have packing dimension at most kO(1)k^{O(1)}. Thus, the Ω(logk)\Omega(\log k) simple bounded set cover hardness translates to Ω(logd)\Omega(\log d) hardness of Vector Bin Packing when dd is a constant. Note that the optimal value of the set cover instances can be made arbitrarily large in terms of kk by starting with a Label Cover instance with an arbitrarily large number of edges. Thus, our Vector Bin Packing hardness holds for asymptotic approximation as well.

Our upper bound on the packing dimension is obtained in two steps: First, we write the given simple bounded set system as an intersection of (kΔ)O(1)(k\Delta)^{O(1)} structured simple bounded set systems on the same universe, and then we give an embedding using (kΔ)O(1)(k\Delta)^{O(1)} dimensions bounding the packing dimension of these structured simple bounded set systems. This idea of decomposition into structured instances is inspired from a work of Chandran, Francis, and Sivadasan  [CFS08] where an upper bound on the Boxicity of a graph is obtained in terms of its maximum degree. We believe that the packing dimension of set systems is worth studying on its own, especially in light of its close connections to the notions of dimension of graphs such as Boxicity.

1.1.2 Vector Scheduling

For the Vector Scheduling problem, we obtain a Ω((logd)1ϵ)\Omega\left((\log d)^{1-\epsilon}\right) hardness under NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right), almost matching the O(logdloglogd)O\left(\frac{\log d}{\log\log d}\right) algorithms [HS19, IKKP19].

Theorem 1.2.

For every constant ϵ>0\epsilon>0, assuming NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right), dd-dimensional Vector Scheduling has no polynomial time Ω((logd)1ϵ)\Omega\left((\log d)^{1-\epsilon}\right)-factor approximation algorithm when dd is part of the input.

We obtain the hardness result via a reduction from the Monochromatic Clique problem. In the Monochromatic-Clique(k,B) problem, given a graph G=(V,E)G=(V,E) with |V|=n|V|=n and parameters k(n)k(n) and B(n)B(n), the goal is to distinguish between the case when GG is kk-colorable and the case when in any assignment of kk-colors to vertices of GG, there is a clique of size BB all of whose vertices are assigned the same color. When B=2B=2, this is the standard graph coloring problem. Note that the problem gets easier as BB increases. Indeed, when B>nB>\sqrt{n}, we can solve the problem in polynomial time by computing the Lovász theta function of the complement graph. We are interested in proving the hardness of the problem for BB as large a function of nn as possible, for some kk. For example, given a graph that is promised to be kk colorable, can we prove the hardness of assigning kk colors to the vertices of the graph in polynomial time where each color class has maximum clique at most B=logn?B=\log n?

The Monochromatic Clique problem was defined formally by Im, Kell, Kulkarni, and Panigrahi [IKKP19] in the context of proving lower bounds for online Vector Scheduling. It was also used implicitly in the ω(1)\omega(1) NP-hardness of Vector Scheduling by Chekuri and Khanna [CK04]. They proved (implicitly) that Monochromatic Clique is NP-hard when BB is an arbitrary constant using a reduction from n1ϵn^{1-\epsilon} hardness of graph coloring. We observe that the same reduction combined with better hardness of graph chromatic number [Kho01] proves the hardness of Monochromatic Clique when B=(logn)γB=(\log n)^{\gamma}, for some constant γ>0\gamma>0 under the assumption that NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right).

We then amplify this hardness to B=(logn)CB=(\log n)^{C}, for every constant C>0C>0. Our main idea in this amplification procedure is the notion of a stronger form of Monochromatic Clique where given a graph and parameters k,B,Ck,B,C, the goal is to distinguish between the case that GG is kk colorable vs. in any kCk^{C} coloring of GG, there is a monochromatic clique of size BB. It turns out that the graph coloring hardness of Khot [Kho01] already proves the hardness of this stronger variant of Monochromatic clique when B=(logn)γB=(\log n)^{\gamma} for any constant CC. We then use lexicographic product of graphs to amplify this result into the hardness of original Monochromatic Clique problem with B=(logn)CB=(\log n)^{C} for any constant CC under the same assumption that NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right). This directly gives the required hardness of Vector Scheduling using the reduction in  [CK04].

The Vector Scheduling problem is also closely related to the Balanced Hypergraph Coloring problem where the input is a hypergraph HH and a parameter kk, and the objective is to color the vertices of HH using kk colors to minimize the maximum number of times a color appears in an edge. We use this connection to improve upon the NP-hardness of the problem:

Theorem 1.3.

For every constant C>0C>0, dd-dimensional Vector Scheduling is NP-hard to approximate within Ω((loglogd)C)\Omega\left((\log\log d)^{C}\right) when dd is part of the input.

Consider the case when each vector job is from {0,1}d\{0,1\}^{d}. In this setting, we can view each coordinate as an edge in a hypergraph, and each vector corresponds to a vertex of the hypergraph. The goal is to find a mm-coloring of vertices of the hypergraph i.e., an assignment of the vectors to mm machines to minimize the maximum number of monochromatic vertices in an edge, which directly corresponds to the maximum load on a machine.

This problem of coloring a hypergraph to ensure that no color appears too many times in each edge is known as Balanced Hypergraph Coloring. Guruswami and Lee [GL18] obtained strong hardness results for this problem when kk, the uniformity of the hypergraph is a constant, using the Label Cover Long Code framework combined with analytical techniques such as the invariance principle. However, when kk is super constant, the invariance principle based methods give weak bounds as the soundness of the Label Cover has to be at least exponentially small in kk. Recently, using combinatorial tools to analyze the gadgets instead of the standard analytical techniques, improvements have been obtained for various hypergraph coloring problems [Bha18, ABP19] in the super-constant inapproximability regime. We follow the same route and use combinatorial tools to analyze the gadgets in the Label Cover Long Code framework and obtain better hardness of Balanced Hypergraph Coloring in the regime of super-constant uniformity kk. The key combinatorial lemma used in our analysis was proved recently by Austrin, Bhangale, and Potukuchi [ABP20] using a generalization of the Borsuk-Ulam theorem.

The NP-hardness of Vector Scheduling follows directly from the hardness of Balanced Hypergraph Coloring using the above-described reduction. This NP-hardness result uses near-linear size Label Cover hardness results [MR10, DS14]. By using the standard Label Cover hardness obtained by combining PCP Theorem and Parallel Repetition in the same reduction, we also prove an intermediate result bridging the above two hardness results for Vector Scheduling.

Theorem 1.4.

There exists a constant γ>0\gamma>0 such that assuming NPDTIME(nO(loglogn))\textsf{NP}\nsubseteq\textsf{DTIME}\left(n^{O(\log\log n)}\right), dd-dimensional Vector Scheduling is hard to approximate within Ω((logd)γ)\Omega\left((\log d)^{\gamma}\right) when dd is part of the input.

1.1.3 Vector Bin Covering

For the Vector Bin Covering problem, we show Ω(logdloglogd)\Omega\left(\frac{\log d}{\log\log d}\right) hardness, almost matching the O(logd)O(\log d) factor algorithm [AAC+98].

Theorem 1.5.

dd-dimensional Vector Bin Covering is NP-hard to approximate within Ω(logdloglogd)\Omega\left(\frac{\log d}{\log\log d}\right) factor when dd is part of the input.

Similar to Vector Scheduling, the hard instances for the Vector Bin Covering are when each vector is in {0,1}d\{0,1\}^{d}. Using the same connection as before, we view this problem as a hypergraph coloring problem where each edge of the hypergraph corresponds to a coordinate, and each vertex corresponds to a vector. Assigning the vectors to the bins such that in each bin, the sum is at least 11 in every coordinate corresponds to assigning colors to vertices of the hypergraph such that in every edge, all the colors appear. Such a coloring of the hypergraph with kk colors where all the kk colors appear in every edge is known as a kk-rainbow coloring of the hypergraph.

Strong hardness results are known for approximate rainbow coloring [GL18, ABP20, GS20] when kk is a constant. While these results give decent bounds in the super constant regime, they proceed via the hardness of Label Cover whose soundness is an inverse polynomial function of kk. Because of this, in the NP-hard instances, the number of edges is at least doubly exponential in kk. Instead, by losing a factor of 22 in the hardness, we give a reduction to the approximate rainbow coloring problem from Label Cover with no gap i.e., just a “Label Coverized” 33-SAT instance. In these hard instances, the number of edges is single exponential in kk, proving that it is NP-hard to distinguish the case that a hypergraph with mm edges has a rainbow coloring with Ω(logmloglogm)\Omega\left(\frac{\log m}{\log\log m}\right) colors vs. it cannot be rainbow colored with 22 colors666Note that rainbow coloring with 22 colors is the same as proper 22 coloring (Property B) of the hypergraphs.. This hardness of approximate rainbow coloring gives the required Vector Bin Covering hardness immediately using the earlier mentioned analogy between rainbow coloring and Vector Bin Covering.

We summarize our results in Table 1.

Problem Subcase Best Algorithm Best Hardness
VBPBoth the algorithms and hardness are considered for asymptotic setting. d=1d=1 PTAS [dlVL81] NP-Hard [GJ79]
Fixed dd lnd+O(1)\ln d+O(1) [BEK16] Ω(logd)\Omega(\log d)
Arbitrary dd 1+ϵd+O(ln1ϵ)1+\epsilon d+O\left(\ln\frac{1}{\epsilon}\right) [CK04] d1ϵd^{1-\epsilon} [CK04, CKPT17]
VS d=1d=1 EPTAS [Jan10] No FPTAS [FKT89]
Fixed dd PTAS [CK04] No EPTAS [BOVvdZ16]
Arbitrary dd O(logdloglogd)O\left(\frac{\log d}{\log\log d}\right) [HS19, IKKP19] Ω((logd)1ϵ)\Omega\left((\log d)^{1-\epsilon}\right)
(NPZPTIME(n(logn)O(1)))\left(\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right)\right)
(logd)Ω(1)(\log d)^{\Omega(1)}
(NPDTIME(nO(loglogn)))\left(\textsf{NP}\nsubseteq\textsf{DTIME}\left(n^{O(\log\log n)}\right)\right)
(loglogd)ω(1)(\log\log d)^{\omega(1)} (NP-Hardness)
VBC d=1d=1 FPTAS [JS03] NP-Hard [AJKL84]
Arbitrary dd O(logd)O(\log d) [AAC+98] Ω(logdloglogd)\Omega\left(\frac{\log d}{\log\log d}\right)
Table 1: Approximation algorithms for the multidimensional packing problems. VBP, VS and VBC stand for Vector Bin Packing, Vector Scheduling and Vector Bin Covering respectively. All the results for VBP and the FPTAS for VBC are asymptotic. The results without citations are our new results.

1.2 Related Work

Online Algorithms. Multidimensional packing problems have been extensively studied in the online setting. For the dd-dimensional Vector Bin Packing, the classical First-Fit algorithm [GGJY76] gives O(d)O(d) competitive ratio, and Azar, Cohen, Kamara, and Shepherd [ACKS13] recently gave an almost matching Ω(d1ϵ)\Omega\left(d^{1-\epsilon}\right) lower bound. For the dd-dimensional Vector Scheduling, Im, Kell, Kulkarni, and Panigrahi  [IKKP19] gave a O(logdloglogd)O\left(\frac{\log d}{\log\log d}\right) competitive online algorithm and proved a matching lower bound. For the dd-dimensional Vector Bin Covering problem, Alon et al. [AAC+98] gave a 2d2d competitive algorithm and proved a lower bound of d+12d+\frac{1}{2}.

Geometric variants. There are various natural geometric variants of Vector Bin Packing that have been studied in the literature. A classical problem of this sort is the 22-dimensional Geometric Bin Packing, where the input is a set of rectangles that need to be packed into the minimum number of unit squares. After a long line of works, Bansal and Khan [BK14] gave a 1.4051.405 factor asymptotic approximation algorithm for the problem. On the hardness front, Bansal and Sviridenko [BS04] showed that the problem does not admit an asymptotic PTAS, and this APX hardness result has been generalized to several related problems by Chlebík and Chlebíková [CC06]. We refer the reader to the excellent survey [CKPT17] regarding the geometric problems.

1.3 Organization

We first define the multidimensional problems and the Label Cover problem formally in Section 2. Next, we prove the hardness results for Vector Bin Packing, Vector Scheduling, and Vector Bin Covering in Section 3Section 4 and Section 5 respectively. Finally, we conclude by mentioning a few open problems in Section 6.

2 Preliminaries

Notations. We use [n][n] to denote the set {1,2,,n}\{1,2,\ldots,n\}. We use 1d\textbf{1}^{d} to denote the dd-dimensional vector (1,1,,1)(1,1,\ldots,1). For two dd-dimensional real vectors a and b, we say that ab\textbf{a}\geq\textbf{b} if aibi\textbf{a}_{i}\geq\textbf{b}_{i} for all i[d]i\in[d]. For a graph GG, we let ω(G),α(G),χ(G)\omega(G),\alpha(G),\chi(G) be the largest clique size, largest independent size, and the chromatic number of GG respectively. A set system or set family 𝒮\mathcal{S} on a universe VV is a collection of subsets of VV.

Problem Statements. We give formal definitions of the three problems that we study.

Definition 2.1.

(Vector Bin Packing) In the Vector Bin Packing problem, the input is a set of nn rational vectors v1,v2,,vn[0,1]dv_{1},v_{2},\ldots,v_{n}\in[0,1]^{d}. The objective is to partition [n][n] into minimum number of parts A1,A2,,AmA_{1},A_{2},\ldots,A_{m} such that

jAivj1i[m]\Big{\lVert}\sum_{j\in A_{i}}v_{j}\Big{\rVert}_{\infty}\leq 1\,\,\forall i\in[m]
Definition 2.2.

(Vector Scheduling) In the Vector Scheduling problem, the input is a set of nn rational vector jobs v1,v2,,vn[0,1]dv_{1},v_{2},\ldots,v_{n}\in[0,1]^{d}, and mm identical machines. The objective is to assign the jobs to machines i.e. partition [n][n] into mm parts A1,A2,,AmA_{1},A_{2},\ldots,A_{m} to minimize the makespan which is defined as the maximum \ell_{\infty} load on a machine.

maxi[m]jAivj\max_{i\in[m]}\Big{\lVert}\sum_{j\in A_{i}}v_{j}\Big{\rVert}_{\infty}
Definition 2.3.

(Vector Bin Covering) In the Vector Bin Covering problem, we are given nn vectors v1,v2,,vn[0,1]dv_{1},v_{2},\ldots,v_{n}\in[0,1]^{d}. The goal is to partition the input vectors into maximum number of parts A1,A2,,AmA_{1},A_{2},\ldots,A_{m} such that

jAivj1di[m]\sum_{j\in A_{i}}v_{j}\geq\textbf{1}^{d}\,\,\forall i\in[m]

Asymptotic Approximation. For the Bin Packing problem, it is NP-Hard to identify if all the vectors can be packed into 22 bins or need 33 bins. This already proves that the problem is NP-hard to approximate within 32\frac{3}{2} as per the usual notion of multiplicative approximation ratio. However, this is less interesting as there are much better asymptotic approximation algorithms for the problem which get (1+ϵ)(1+\epsilon)-factor approximation when the optimal value is large enough, for every positive constant ϵ>0\epsilon>0.

Even for the Vector Bin Packing problem, the performance of an algorithm is typically measured in the asymptotic setting. We give the formal definition [CKPT17] of asymptotic approximation ratio of an algorithm 𝒜\mathcal{A} for the Vector Bin Packing problem.

Definition 2.4.

(Asymptotic Approximation Ratio) The asymptotic approximation ratio ρ𝒜\rho_{\mathcal{A}}^{\infty} of an algorithm 𝒜\mathcal{A} for the Vector Bin Packing problem is

ρ𝒜=lim supnρ𝒜n,ρ𝒜n=supI{𝒜(I)OPT(I):OPT(I)=n}\rho_{\mathcal{A}}^{\infty}=\limsup\limits_{n\rightarrow\infty}\rho_{\mathcal{A}}^{n},\,\,\rho_{\mathcal{A}}^{n}=\sup_{I\in\mathcal{I}}\left\{\frac{\mathcal{A}(I)}{\emph{OPT}(I)}:\emph{OPT}(I)=n\right\}

where \mathcal{I} denotes the set of all possible Vector Bin Packing instances.

All the results mentioned in this paper regarding Vector Bin Packing are with respect to the asymptotic approximation ratio.

Label Cover. We define the Label Cover problem:

Definition 2.5.

(Label Cover) In an instance of the Label Cover problem G=(V=LR,E,ΣL,ΣR,Π)G=(V=L\cup R,E,\Sigma_{L},\Sigma_{R},\Pi) with |ΣL||ΣR||\Sigma_{L}|\geq|\Sigma_{R}|, the input is a bipartite graph LRL\cup R with constraints on every edge. The constraint on an edge ee is a projection Πe:ΣLΣR\Pi_{e}:\Sigma_{L}\rightarrow\Sigma_{R}. We say a labeling σ:VΣLΣR\sigma:V\rightarrow\Sigma_{L}\cup\Sigma_{R} satisfies the constraint on the edge e=(u,v)e=(u,v) if Πe(σ(u))=σ(v)\Pi_{e}(\sigma(u))=\sigma(v). The objective is to find a labeling σ:VΣLΣR\sigma:V\rightarrow\Sigma_{L}\cup\Sigma_{R} that satisfies as many constraints as possible.

By a simple reduction from the 33-SAT problem, we can prove that Label Cover is NP-hard when ΣL\Sigma_{L} and ΣR\Sigma_{R} are constants (See e.g., Lemma 4.24.2 in  [BG16]).

Theorem 2.6.

Given a Label Cover instance when ΣL=ΣR=[6]\Sigma_{L}=\Sigma_{R}=[6], it is NP-hard to identify if it has a labeling that satisfies all the constraints.

The real use of Label Cover, however, lies in its strong hardness of approximation. PCP Theorem [ALM+98] combined with Raz’s parallel repetition [Raz98] yields the following strong inapproximability of Label Cover problem.

Theorem 2.7.

There exists an absolute constant c>1c>1 such that for every integer nn and ϵ>0\epsilon>0, there is a reduction from 33-SAT instance II over nn variables to Label Cover instance G=(V=LR,E,ΣL,ΣR,Π)G=(V=L\cup R,E,\Sigma_{L},\Sigma_{R},\Pi) with |V|nO(log(1ϵ))|V|\leq n^{O\left(\log\left(\frac{1}{\epsilon}\right)\right)}, |ΣL|(1ϵ)c|\Sigma_{L}|\leq\left(\frac{1}{\epsilon}\right)^{c} satisfying the following:

  1. 1.

    (Completeness.) If II is satisfiable, there exists a labeling to GG that satisfies all the constraints.

  2. 2.

    (Soundness.) If II is not satisfiable, no labeling can satisfy an ϵ\epsilon fraction of the constraints of GG.

  3. 3.

    (Biregularity.) The graph LR,EL\cup R,E is biregular with degrees on either side bounded by poly(1ϵ)\emph{{poly}}\left(\frac{1}{\epsilon}\right).

Furthermore, the running time of the reduction is poly(n,1ϵ)\emph{{poly}}\left(n,\frac{1}{\epsilon}\right).

Moshkovitz-Raz [MR10] proved the following hardness of near linear size Label Cover.

Theorem 2.8.

There exist absolute constants c,c>1c,c^{\prime}>1 such that for every nn and ϵ>0\epsilon>0, there is a reduction from 33-SAT instance II over nn variables to Label Cover instance G=(V=LR,E,ΣL,ΣR,Π)G=(V=L\cup R,E,\Sigma_{L},\Sigma_{R},\Pi) with |V|n1+o(1)(1ϵ)c|V|\leq n^{1+o(1)}\left(\frac{1}{\epsilon}\right)^{c}, |ΣL|2(1ϵ)c|\Sigma_{L}|\leq 2^{\left(\frac{1}{\epsilon}\right)^{c^{\prime}}} satisfying the following:

  1. 1.

    (Completeness.) If II is satisfiable, there exists a labeling to GG that satisfies all the constraints.

  2. 2.

    (Soundness.) If II is not satisfiable, no labeling can satisfy an ϵ\epsilon fraction of the constraints of GG.

  3. 3.

    (Biregularity.) The graph LR,EL\cup R,E is biregular with degrees on either side poly(1ϵ)\emph{{poly}}\left(\frac{1}{\epsilon}\right).

Furthermore, when ϵ\epsilon is a constant, the running time of the reduction is poly(n)\emph{{poly}}(n).

3 Vector Bin Packing

In this section, we prove the hardness of approximation of Vector Bin Packing. First, we define the packing dimension of a set family and bound the packing dimension of simple set families. Next, we combine this upper bound with the hardness of set cover on simple bounded set systems to prove Theorem 1.1.

3.1 Packing Dimension

For a set family 𝒮\mathcal{S} on a universe VV, we define the packing dimension pdim(𝒮)\textsf{pdim}(\mathcal{S}) below. For a function f:V[0,1]Kf:V\rightarrow[0,1]^{K} and a set SVS\subseteq V, we let f(S)f(S) denote the vector f(S)=vSf(v)f(S)=\sum_{v\in S}f(v).

Definition 3.1.

For a set family 𝒮\mathcal{S} on a universe VV, the packing dimension pdim(𝒮)\emph{{pdim}}(\mathcal{S}) is defined as the smallest positive integer KK such that there exists an embedding f:V[0,1]Kf:V\rightarrow[0,1]^{K} that satisfies the following property: For every set SVS\subseteq V, SS is in the family 𝒮\mathcal{S} if and only if

f(S)1.\left\lVert f(S)\right\rVert_{\infty}\leq 1.

If no such embedding exists, we say that pdim(𝒮)\emph{{pdim}}(\mathcal{S}) is infinite.

For a set family 𝒮\mathcal{S} to have finite packing dimension i.e. for an embedding f:V[0,1]Kf:V\rightarrow[0,1]^{K} realizing the above condition to exist requires two conditions:

  1. 1.

    The set family is downward closed i.e. for every S𝒮S\in\mathcal{S} and TST\subseteq S, T𝒮T\in\mathcal{S} as well.

  2. 2.

    For every element vVv\in V, there is a set S𝒮S\in\mathcal{S} with vSv\in S. We call a set family 𝒮\mathcal{S} on a universe VV non-trivial if for every vVv\in V, there is a set S𝒮S\in\mathcal{S} with vSv\in S.

On the other hand, any set system that satisfies the above two conditions i.e. being downward closed and non-trivial has a finite packing dimension. Before proving this statement, we first prove the following simple but useful proposition.

Proposition 3.2.

For a pair of set families 𝒮1\mathcal{S}_{1} and 𝒮2\mathcal{S}_{2} defined on the same universe VV such that pdim(𝒮1)\emph{{pdim}}(\mathcal{S}_{1}) and pdim(𝒮2)\emph{{pdim}}(\mathcal{S}_{2}) are finite,

pdim(𝒮1𝒮2)pdim(𝒮1)+pdim(𝒮2)\emph{{pdim}}(\mathcal{S}_{1}\cap\mathcal{S}_{2})\leq\emph{{pdim}}(\mathcal{S}_{1})+\emph{{pdim}}(\mathcal{S}_{2})
Proof.

Let K1=pdim(𝒮1)K_{1}=\textsf{pdim}(\mathcal{S}_{1}) and K2=pdim(𝒮2)K_{2}=\textsf{pdim}(\mathcal{S}_{2}). Suppose that f1:V[0,1]K1f_{1}:V\rightarrow[0,1]^{K_{1}} be such that for every set SVS\subseteq V,

f1(S)1\left\lVert f_{1}(S)\right\rVert_{\infty}\leq 1

if and only if S𝒮1S\in\mathcal{S}_{1}. Similarly, let f2:V[0,1]K2f_{2}:V\rightarrow[0,1]^{K_{2}} be such that for every set SVS\subseteq V,

f2(S)1\left\lVert f_{2}(S)\right\rVert_{\infty}\leq 1

if and only if S𝒮2S\in\mathcal{S}_{2}. Consider the function f:V[0,1]K1+K2f:V\rightarrow[0,1]^{K_{1}+K_{2}} defined as f(v)=(f1(v),f2(v))f(v)=(f_{1}(v),f_{2}(v)). Then, for every set SVS\subseteq V, f(S)1\left\lVert f(S)\right\rVert_{\infty}\leq 1 if and only if f1(S)1\left\lVert f_{1}(S)\right\rVert_{\infty}\leq 1 and f2(S)1\left\lVert f_{2}(S)\right\rVert_{\infty}\leq 1. Thus, for every set SVS\subseteq V, f(S)1\left\lVert f(S)\right\rVert_{\infty}\leq 1 if and only if S𝒮1S\in\mathcal{S}_{1} and S𝒮2S\in\mathcal{S}_{2}, or equivalently, if S𝒮1𝒮2S\in\mathcal{S}_{1}\cap\mathcal{S}_{2}. Hence, the packing dimension of 𝒮1𝒮2\mathcal{S}_{1}\cap\mathcal{S}_{2} is at most K1+K2K_{1}+K_{2}. ∎

For a set SVS\subseteq V, let SS^{\uparrow} be the family of sets TVT\subseteq V such that STS\subseteq T. Similarly, let SS^{\downarrow} be the family of sets TVT\subseteq V such that TST\subseteq S. For a set system 𝒮\mathcal{S}, we let 𝒮\mathcal{S}^{\uparrow} (resp. 𝒮\mathcal{S}^{\downarrow}) denote the union of SS^{\uparrow} (resp. SS^{\downarrow}) over all S𝒮S\in\mathcal{S}.

Consider a set SVS\subseteq V with |S|>1|S|>1. For the set family 2VS2^{V}\setminus S^{\uparrow}, we have the embedding f:V[0,1]f:V\rightarrow[0,1] defined as

f(v)={1|S|+1|S|2, if vS0 otherwise. f(v)=\begin{cases}\frac{1}{|S|}+\frac{1}{|S|^{2}},\text{ if }v\in S\\ 0\text{ otherwise. }\end{cases}

This shows that pdim(2VS)1\textsf{pdim}(2^{V}\setminus S^{\uparrow})\leq 1 for all SVS\subseteq V with |S|>1|S|>1. Note that we have

𝒮=S𝒮2VS\mathcal{S}=\bigcap_{S\notin\mathcal{S}}2^{V}\setminus S^{\uparrow}

for every downward closed set system 𝒮\mathcal{S}. Combined with Proposition 3.2, we obtain that for every non-trivial downward closed family 𝒮\mathcal{S} on a universe VV, pdim(𝒮)2|V|\textsf{pdim}(\mathcal{S})\leq 2^{|V|}.

We are interested in the classes of set families for which there is an efficient embedding with packing dimension being independent of |V||V|. In particular, the class of set families that we study are bounded set families where each set has cardinality at most kk, and each element appears in at most Δ\Delta sets. We can show that such bounded set families that are downward closed and non-trivial have packing dimension at most (kΔ)O(Δ)(k\Delta)^{O(\Delta)}. Together with the Ω(logk)\Omega(\log k) hardness [Tre01] of kk-set cover where each set has cardinality at most kk, a fixed constant and each element appearing in (logk)O(1)(\log k)^{O(1)} sets, this packing dimension bound gives the hardness of (logd)Ω(1)(\log d)^{\Omega(1)} for the Vector Bin Packing problem when dd is a large constant. Unfortunately, the exponential dependence on Δ\Delta is necessary for the packing dimension of bounded set systems, and thus, this approach does not yield the optimal Ω(logd)\Omega(\log d) hardness of Vector Bin Packing.

Instead of using arbitrary bounded set families, we bypass this barrier by using simple bounded set families. Recall that a set family is called simple if any two distinct sets in the family intersect in at most one element. It turns out that for simple bounded set families i.e. simple set families 𝒮\mathcal{S} where each set has cardinality at most kk, and each element appears in at most Δ\Delta sets, the packing dimension of 𝒮\mathcal{S}^{\downarrow} can be upper bounded by (kΔ)O(1)(k\Delta)^{O(1)}. Together with the Ω(logk)\Omega(\log k) hardness of simple kk-set cover (proved in Appendix A), we get the optimal Ω(logd)\Omega(\log d) hardness of Vector Bin Packing when dd is a large constant. In the next subsection, we prove the packing dimension upper bound, and we use this upper bound to prove the hardness of Vector Bin Packing in Section 3.3.

3.2 Packing Dimension of Simple Bounded Set Families

The main embedding result that we prove is that the downward closure of simple set systems where each set has cardinality kk and each element appears in at most Δ\Delta sets has packing dimension at most polynomial in k,Δk,\Delta.

Theorem 3.3.

Suppose that 𝒮\mathcal{S} is a simple non-trivial set system on a universe VV where each set has cardinality at most k2k\geq 2 and each element appears in at most Δ\Delta sets. Then,

pdim(𝒮)(kΔ)O(1)\emph{{pdim}}(\mathcal{S}^{\downarrow})\leq(k\Delta)^{O(1)}

Furthermore, an embedding realizing the above can be found in time polynomial in |V||V|.

We prove the embedding result by writing the set family 𝒮\mathcal{S^{\downarrow}} as an intersection of (kΔ)O(1)(k\Delta)^{O(1)} structured set families each of which has packing dimension at most (kΔ)O(1)(k\Delta)^{O(1)}. We can then upper bound the packing dimension of 𝒮\mathcal{S}^{\downarrow} using Proposition 3.2. The structured set systems we study are sunflower-bouquets, which are a disjoint union of sunflowers777A sunflower is a collection of sets S1,S2,,SrS_{1},S_{2},\ldots,S_{r} whose pairwise intersection is constant i.e., there exists UU such that U=SiSjU=S_{i}\cap S_{j} for all i,j[r],iji,j\in[r],i\neq j. This constant intersection UU is called the kernel of the sunflower. that have a single element as the kernel. The formal definition of the sunflower-bouquet set families is below. See Figure 1 for an illustration.

Definition 3.4.

(Sunflower-bouquets) A simple set system 𝒮\mathcal{S} on a universe VV is called a sunflower-bouquet with core UV,UϕU\subseteq V,U\neq\phi if the following hold.

  1. 1.

    Every set S𝒮S\in\mathcal{S} satisfies |SU|=1|S\cap U|=1. Furthermore, for every uUu\in U, there is a set S𝒮S\in\mathcal{S} with uSu\in S.

  2. 2.

    For any pair of sets S1,S2𝒮S_{1},S_{2}\in\mathcal{S} with S1S2S_{1}\cap S_{2}\neq\emptyset, we have S1U=S2U=S1S2S_{1}\cap U=S_{2}\cap U=S_{1}\cap S_{2}.

We now give an efficient embedding for a sunflower-bouquet 𝒮\mathcal{S} on a universe VV with core UV,UϕU\subseteq V,U\neq\phi. The motivation behind this lemma is to upper bound the packing dimension of the set system 𝒯=𝒮{SVU:|S|k}\mathcal{T^{\downarrow}}=\mathcal{S}^{\downarrow}\cup\{S\subseteq V\setminus U:|S|\leq k\}.

Lemma 3.5.

Fix an integer k2k\geq 2. Let 𝒮\mathcal{S} be a simple set family defined on a universe VV that is a sunflower-bouquet with core UU. Furthermore, each set in the family has cardinality at most kk and each element appears in at most Δ\Delta sets. Then, there exists an embedding f:V[0,1]Kf:V\rightarrow[0,1]^{K} that satisfies

  1. (A)

    For every set S𝒮S\in\mathcal{S},

    f(S)1.\left\lVert f(S)\right\rVert_{\infty}\leq 1.
  2. (B)

    For every set S𝒮S\notin\mathcal{S}^{\downarrow} with SUS\cap U\neq\emptyset,

    f(S)>1.\left\lVert f(S)\right\rVert_{\infty}>1.
  3. (C)

    For every set SVS\subseteq V with SU=S\cap U=\emptyset and |S|k|S|\leq k,

    f(S)1.\left\lVert f(S)\right\rVert_{\infty}\leq 1.
  4. (D)

    For every set SVS\subseteq V with |S|>k|S|>k,

    f(S)>1.\left\lVert f(S)\right\rVert_{\infty}>1.

with K=(kΔ)O(1)K=(k\Delta)^{O(1)}. Furthermore, such an embedding can be found in time polynomial in |V||V| given 𝒮\mathcal{S}.

Proof.

Let U={u1,u2,,um}U=\{u_{1},u_{2},\ldots,u_{m}\}. We can partition VUV\setminus U into V0,V1,,VmV_{0},V_{1},\ldots,V_{m} with

Vi=S𝒮:uiSS{ui}V_{i}=\bigcup_{S\in\mathcal{S}:u_{i}\in S}S\setminus\{u_{i}\}

for all i[m]i\in[m]. Here, 𝒮\mathcal{S} restricted to {ui}Vi\{u_{i}\}\cup V_{i} is a sunflower set system with a single element uiu_{i} as the kernel for every i[m]i\in[m]. As each set in 𝒮\mathcal{S} has cardinality at most kk and each element appears in at most Δ\Delta sets, we get that |Vi|kΔ|V_{i}|\leq k\Delta for all i[m]i\in[m]. For every i[m]i\in[m], we order the elements of ViV_{i} as {vi,1,vi,2,,vi,kΔ}\{v_{i,1},v_{i,2},\ldots,v_{i,k\Delta}\} (with repetitions if needed).

Refer to caption
Figure 1: An illustration of a sunflower-bouquet set family. Here, 𝒮\mathcal{S} is the family of all the green colored sets. It is a sunflower-bouquet with core U={u1,u2,u3}U=\{u_{1},u_{2},u_{3}\}. In the embedding, we ensure that the \ell_{\infty} norm of the left red set is greater than 11 in the first step while the right side red set is handled in the second step.

We construct the final embedding ff as a concatenation of embeddings with smaller dimensions f:=(f0,g,g)f:=(f_{0},g,g^{\prime}) where f0:V[0,1]2f_{0}:V\rightarrow[0,1]^{2}, g:V[0,1]K1g:V\rightarrow[0,1]^{K_{1}} and g:V[0,1]K2g^{\prime}:V\rightarrow[0,1]^{K_{2}} all satisfy the conditions (A)(A) and (C)(C). In other words, for every set SVS\subseteq V satisfying either S𝒮S\in\mathcal{S} or SU=ϕS\cap U=\phi and |S|k|S|\leq k, we have f0(S)1\left\lVert f_{0}(S)\right\rVert_{\infty}\leq 1, g(S)1\left\lVert g(S)\right\rVert_{\infty}\leq 1, and g(S)1\left\lVert g^{\prime}(S)\right\rVert_{\infty}\leq 1. Note that for a set SVS\subseteq V, we have

f(S)=max(f0(S),g(S),g(S))\left\lVert f(S)\right\rVert_{\infty}=\max(\left\lVert f_{0}(S)\right\rVert_{\infty},\left\lVert g(S)\right\rVert_{\infty},\left\lVert g^{\prime}(S)\right\rVert_{\infty})

Thus, if f0,g,gf_{0},g,g^{\prime} satisfy the conditions (A)(A) and (C)(C), the final embedding ff also satisfies the conditions (A)(A) and (C)(C). Furthermore, the parameters K1K_{1} and K2K_{2} are chosen such that the final dimension of ff, 2+K1+K22+K_{1}+K_{2} is at most (kΔ)O(1)(k\Delta)^{O(1)}.

First, we define the embedding f0:V[0,1]2f_{0}:V\rightarrow[0,1]^{2} that satisfies the conditions (A)(A) and (C)(C) with the additional property that for every set SVS\subseteq V with |S|>k|S|>k, or if SUϕS\cap U\neq\phi and SV0ϕS\cap V_{0}\neq\phi, or if |SU|>1|S\cap U|>1, we have f0(S)>1\left\lVert f_{0}(S)\right\rVert_{\infty}>1. We obtain this by a simple two-dimensional embedding as follows:

f0(v)={(1,1k), if vU(1k,1k), if vV0(0,1k) otherwise.f_{0}(v)=\begin{cases}\left(1,\frac{1}{k}\right),\text{ if }v\in U\\ \left(\frac{1}{k},\frac{1}{k}\right),\text{ if }v\in V_{0}\\ \left(0,\frac{1}{k}\right)\text{ otherwise.}\end{cases}

We can verify that f0f_{0} satisfies the conditions (A)(A) and (C)(C). Furthermore, suppose that f0(S)1\left\lVert f_{0}(S)\right\rVert_{\infty}\leq 1 for a set SVS\subseteq V. Then, we can obtain the following observations that we will use later.

  1. 1.

    As f0(v)1=1f_{0}(v)_{1}=1 for all vUv\in U, |SU|1|S\cap U|\leq 1. As f0(v)1=1kf_{0}(v)_{1}=\frac{1}{k} for all vV0v\in V_{0}, if SUS\cap U\neq\emptyset, then SV0=S\cap V_{0}=\emptyset.

  2. 2.

    As f0(v)2=1kf_{0}(v)_{2}=\frac{1}{k} for all vVv\in V, |S|k|S|\leq k. Thus, for every set SVS\subseteq V such that |S|>k|S|>k, we have f0(S)>1\left\lVert f_{0}(S)\right\rVert_{\infty}>1, and hence, f(S)>1\left\lVert f(S)\right\rVert_{\infty}>1. This already proves the condition (D)(D) of the lemma.

Overview of rest of the proof. We now restrict our attention to sets SVS\subseteq V such that |S|k,SV0=ϕ|S|\leq k,S\cap V_{0}=\phi, and |SU|1|S\cap U|\leq 1. Our goal is to find an embedding for these sets that satisfies the conditions (A)(A), (B)(B), and (C)(C). This is the technically challenging part of the proof and requires setting various coordinates carefully to encode the properties of the set system. We break this down into two steps: eliminating the “cross-sunflower” sets, and pinning down the “intra-sunflower” sets. We give an overview of the ideas used in the two steps before presenting the full proof.

  1. 1.

    The cross-sunflower sets are the sets SVS\subseteq V that contain uiu_{i} for some i[m]i\in[m], but also intersect another sunflower i.e., SViϕS\cap V_{i^{\prime}}\neq\phi for an iii^{\prime}\neq i. Note that such a set SS satisfies S𝒮S\notin\mathcal{S}^{\downarrow} and SUϕS\cap U\neq\phi, and thus, to satisfy the condition (B)(B), we need to ensure that f(S)>1\left\lVert f(S)\right\rVert_{\infty}>1 for such sets. We achieve this by constructing an embedding g:V[0,1]K1g:V\rightarrow[0,1]^{K_{1}} that satisfies the conditions (A)(A) and (C)(C) and has g(S)>1\left\lVert g(S)\right\rVert_{\infty}>1 for all cross-sunflower sets.

    We illustrate the idea used in constructing this embedding using a toy example. Suppose that we have a set of pairs of elements (u1,v1),(u2,v2),,(un,vn)(u_{1},v_{1}),(u_{2},v_{2}),\ldots,(u_{n},v_{n}), and let their union be denoted by W={u1,v1,u2,v2,,un,vn}W=\{u_{1},v_{1},u_{2},v_{2},\ldots,u_{n},v_{n}\}. Our goal is to find an embedding g:W[0,1]2g:W\rightarrow[0,1]^{2} such that

    1. (a)

      g(ui)+g(vi)1\left\lVert g(u_{i})+g(v_{i})\right\rVert_{\infty}\leq 1 for all i[n]i\in[n].

    2. (b)

      g(ui)+g(vj)>1\left\lVert g(u_{i})+g(v_{j})\right\rVert_{\infty}>1 for all i,j[n],iji,j\in[n],i\neq j.

    We construct this embedding by choosing nn distinct real numbers α1,α2,,αn(0,1)\alpha_{1},\alpha_{2},\ldots,\alpha_{n}\in(0,1) and setting g(ui)=(αi,1αi)g(u_{i})=(\alpha_{i},1-\alpha_{i}) and g(vi)=(1αi,αi)g(v_{i})=(1-\alpha_{i},\alpha_{i}) for all i[m]i\in[m]. Note that g(ui)+g(vj)1=2\left\lVert g(u_{i})+g(v_{j})\right\rVert_{1}=2 for all i,ji,j, and thus, g(ui)+g(vj)=1\left\lVert g(u_{i})+g(v_{j})\right\rVert_{\infty}=1 if and only if g(ui)+g(vj)=(1,1)g(u_{i})+g(v_{j})=(1,1), or equivalently, when i=ji=j. The actual construction extends this idea with two differences: first, the ViV_{i}s could have more than one element, and we use the pairs idea multiple times to account for this, and second, we need to ensure that the sum of the embedding of any kk elements in V(UV0)V\setminus(U\cup V_{0}) has \ell_{\infty} norm at most 11, and thus, we need to choose the embedding of uiu_{i} to be (αi,21kαi)(\alpha_{i},2-\frac{1}{k}-\alpha_{i}) and set αi(11k,1)\alpha_{i}\in(1-\frac{1}{k},1).

  2. 2.

    The intra-sunflower sets are the sets SVS\subseteq V such that uiSu_{i}\in S and SVi{ui}S\subseteq V_{i}\cup\{u_{i}\} for some i[m]i\in[m]. We need to ensure that every intra-sunflower set SS such that S𝒮S\notin\mathcal{S}^{\downarrow} satisfies f(S)>1\left\lVert f(S)\right\rVert_{\infty}>1. We achieve this by constructing an embedding g:V[0,1]K2g^{\prime}:V\rightarrow[0,1]^{K_{2}} that satisfies the conditions (A)(A), (C)(C) and has g(S)>1\left\lVert g^{\prime}(S)\right\rVert_{\infty}>1 for every intra-sunflower set SS such that S𝒮S\notin\mathcal{S}^{\downarrow}.

    Fix an i[m]i\in[m]. For every intra-sunflower set S{ui}ViS\subseteq\{u_{i}\}\cup V_{i} with uiSu_{i}\in S and S𝒮S\notin\mathcal{S}^{\downarrow}, we use a single dimensional embedding gS:V[0,1]g_{S}:V\rightarrow[0,1] that satisfies the conditions (A)(A) and (C)(C) but has gS(S)>1\left\lVert g_{S}(S)\right\rVert_{\infty}>1 . We achieve this by setting gS(ui)=1|S|1k+ϵg_{S}(u_{i})=1-\frac{|S|-1}{k}+\epsilon, and gS(v)=1kg_{S}(v)=\frac{1}{k} for every vViSv\in V_{i}\cap S, and gS(v)=0g_{S}(v)=0 for all vVSv\in V\setminus S, where ϵ<1k\epsilon<\frac{1}{k} is a small positive constant. Note that gS(T)1\left\lVert g_{S}(T)\right\rVert\leq 1 for every set TT such that TST\nsupseteq S, and since gS(v)1kg_{S}(v)\leq\frac{1}{k} for all vVUv\in V\setminus U, the embedding gSg_{S} satisfies the conditions (A)(A) and (C)(C).

    We can construct such a single dimensional embedding for every intra-sunflower set S𝒮S\notin\mathcal{S}^{\downarrow} and take their concatenation to obtain the required embedding gg^{\prime}. However, there could be exponential (in k,Δk,\Delta) number of such intra-sunflower sets S{ui}ViS\subseteq\{u_{i}\}\cup V_{i}, uiSu_{i}\in S such that S𝒮S\notin\mathcal{S}^{\downarrow}. We get around this issue by observing that we need the single dimensional embedding only for minimal intra-sunflower sets that don’t belong to 𝒮\mathcal{S}^{\downarrow}. In fact, as the set system 𝒮\mathcal{S}^{\downarrow} restricted to {ui}Vi\{u_{i}\}\cup V_{i} is a sunflower with single element uiu_{i} as the kernel, we can deduce that every minimal intra-sunflower set SS with S𝒮S\notin\mathcal{S}^{\downarrow} is of the form {ui,x,y}\{u_{i},x,y\} where x,yVix,y\in V_{i}. Thus, we can construct the single dimensional embedding for such sets and take their concatenation to obtain the required embedding gg^{\prime} with dimension at most |Vi|2(kΔ)2|V_{i}|^{2}\leq(k\Delta)^{2}.

As every set SVS\subseteq V such that SV0=ϕS\cap V_{0}=\phi, |SU|=1|S\cap U|=1 is either a cross-sunflower set or an intra-sunflower set, the two steps together prove that f=(f0,g,g)f=(f_{0},g,g^{\prime}) satisfies the conditions (A)(A), (B)(B) and (C)(C).

We now present the full formal proof of the two steps.

Step-1. Eliminating the cross-sunflower sets. In the first step, our goal is to find an embedding g:V[0,1]K1g:V\rightarrow[0,1]^{K_{1}} with K1=2kΔK_{1}=2k\Delta such that

  1. 1.

    gg satisfies the conditions (A)(A) and (C)(C) i.e. for every set S𝒮S\in\mathcal{S}, g(S)1\left\lVert g(S)\right\rVert_{\infty}\leq 1, and for every set SVS\subseteq V with SU=ϕS\cap U=\phi and |S|k,g(S)1.|S|\leq k,\left\lVert g(S)\right\rVert_{\infty}\leq 1.

  2. 2.

    For every “cross-sunflower” set SVS\subseteq V with uiSu_{i}\in S for some i[m]i\in[m], and SViϕS\cap V_{i^{\prime}}\neq\phi for i[m],iii^{\prime}\in[m],i^{\prime}\neq i, we have g(S)>1\left\lVert g(S)\right\rVert_{\infty}>1.

We achieve this by setting g=(f1,,fkΔ)g=(f_{1},\ldots,f_{k\Delta}), where each fl:V[0,1]2,l[kΔ]f_{l}:V\rightarrow[0,1]^{2},l\in[k\Delta] satisfies the conditions (A)(A) and (C)(C), and overall, the embedding gg satisfies the second condition above.

We choose mm distinct rational numbers α1,,αm\alpha_{1},\ldots,\alpha_{m} with 11k<αi<11-\frac{1}{k}<\alpha_{i}<1 for all i[m]i\in[m]. We define the embeddings fl:V[0,1]2f_{l}:V\rightarrow[0,1]^{2}, l[kΔ]l\in[k\Delta] as follows. Consider an l[kΔ]l\in[k\Delta].

  1. 1.

    For i[m]i\in[m], we set

    fl(ui)=(αi,21kαi)f_{l}(u_{i})=\left(\alpha_{i},2-\frac{1}{k}-\alpha_{i}\right)
  2. 2.

    For i[m]i\in[m] and vi,jViv_{i,j}\in V_{i}, we set fl(vi,j)=(0,0)f_{l}(v_{i,j})=(0,0) if vi,jvi,lv_{i,j}\neq v_{i,l}. We set

    fl(vi,l)=(1αi,αi+1k1)f_{l}(v_{i,l})=\left(1-\alpha_{i},\alpha_{i}+\frac{1}{k}-1\right)
  3. 3.

    For vV0v\in V_{0}, we set fl(v)=(0,0)f_{l}(v)=(0,0).

We verify that these embeddings satisfy the conditions (A)(A) and (C)(C). Fix an l[kΔ]l\in[k\Delta].

  1. (A)

    Consider a set S𝒮S\in\mathcal{S}. Let i[m]i\in[m] be such that {ui}=SU\{u_{i}\}=S\cap U. We have

    fl(S)\displaystyle f_{l}(S) =vSfl(v)\displaystyle=\sum_{v\in S}f_{l}(v)
    v{ui}Vif(v)\displaystyle\leq\sum_{v\in\{u_{i}\}\cup V_{i}}f(v)
    =fl(ui)+fl(vi,l)\displaystyle=f_{l}(u_{i})+f_{l}(v_{i,l})
    =(αi,21kαi)+(1αi,αi+1k1)=(1,1).\displaystyle=\left(\alpha_{i},2-\frac{1}{k}-\alpha_{i}\right)+\left(1-\alpha_{i},\alpha_{i}+\frac{1}{k}-1\right)=(1,1).
  2. (C)

    This follows directly from the fact that fl(v)11k\left\lVert f_{l}(v)\right\rVert_{1}\leq\frac{1}{k} for all l[kΔ]l\in[k\Delta] and vVUv\in V\setminus U.

Let g:V[0,1]2kΔg:V\rightarrow[0,1]^{2k\Delta} be defined as g=(f1,,fkΔ)g=(f_{1},\ldots,f_{k\Delta}). As each of the individual embeddings satisfies (A)(A) and (C)(C), gg also satisfies the conditions (A)(A) and (C)(C).

Let SVV0S\subseteq V\setminus V_{0}, |SU|=1|S\cap U|=1 be such that

g(S)1.\left\lVert g(S)\right\rVert_{\infty}\leq 1.

i.e. fl(S)1\left\lVert f_{l}(S)\right\rVert_{\infty}\leq 1 for all l[kΔ]l\in[k\Delta]. Suppose that SU={ui}S\cap U=\{u_{i}\}. Then, we claim that S{ui}ViS\subseteq\{u_{i}\}\cup V_{i}. Suppose for contradiction that this is not the case, and there exists vi,lViv_{i^{\prime},l}\in V_{i^{\prime}} with ii,i[m]i^{\prime}\neq i,i^{\prime}\in[m] and l[kΔ]l\in[k\Delta] such that vi,lSv_{i^{\prime},l}\in S. We have

fl(S)\displaystyle f_{l}(S) =vSfl(v)\displaystyle=\sum_{v\in S}f_{l}(v)
fl(ui)+fl(vi,l)\displaystyle\geq f_{l}(u_{i})+f_{l}(v_{i^{\prime},l})
=(αi,21kαi)+(1αi,αi+1k1)\displaystyle=\left(\alpha_{i},2-\frac{1}{k}-\alpha_{i}\right)+\left(1-\alpha_{i^{\prime}},\alpha_{i^{\prime}}+\frac{1}{k}-1\right)
=(1+αiαi,1+αiαi)\displaystyle=(1+\alpha_{i}-\alpha_{i^{\prime}},1+\alpha_{i^{\prime}}-\alpha_{i})

As αiαi\alpha_{i}\neq\alpha_{i^{\prime}}, fl(S)>1\left\lVert f_{l}(S)\right\rVert_{\infty}>1, a contradiction. Thus, for every set SVS\subseteq V such that uiSu_{i}\in S, SViϕS\cap V_{i^{\prime}}\neq\phi for some iii^{\prime}\neq i, we have g(S)>1\left\lVert g(S)\right\rVert_{\infty}>1.

Step 2. Pinning down the intra-sunflower sets. In the second step, our goal is to find an embedding g:V[0,1]K2g^{\prime}:V\rightarrow[0,1]^{K_{2}} with K2=(kΔ)2K_{2}=(k\Delta)^{2} such that

  1. 1.

    gg^{\prime} satisfies the conditions (A)(A) and (C)(C).

  2. 2.

    For every i[m]i\in[m] and “intra-sunflower” set S{ui}ViS\subseteq\{u_{i}\}\cup V_{i} such that uiSu_{i}\in S and S𝒮S\notin\mathcal{S}^{\downarrow}, we have g(S)>1\left\lVert g^{\prime}(S)\right\rVert_{\infty}>1.

We achieve this by setting g=(g1,g2,,g(kΔ)2))g^{\prime}=(g_{1},g_{2},\ldots,g_{(k\Delta)^{2})}) where each glg_{l}, l[(kΔ)2]l\in[(k\Delta)^{2}] satisfies the conditions (A)(A) and (C)(C), and the overall function gg^{\prime} satisfies the second condition above.

For every i[m]i\in[m], we order all the pairs of distinct elements x,yVix,y\in V_{i} as {Vi,1,Vi,2,,Vi,(kΔ)2}\{V_{i,1},V_{i,2},\ldots,V_{i,(k\Delta)^{2}}\} (with repetitions if needed). The upper bound on the number of such pairs is obtained using the fact that |Vi|kΔ|V_{i}|\leq k\Delta for all i[m]i\in[m].

We define the embeddings gl:V[0,1]g_{l}:V\rightarrow[0,1], l[(kΔ)2]l\in[(k\Delta)^{2}] below. Fix an l[(kΔ)2]l\in[(k\Delta)^{2}].

  1. 1.

    Consider an i[m]i\in[m]. We have two different cases:

    1. (a)

      If Vi,l{ui}𝒮V_{i,l}\cup\{u_{i}\}\in\mathcal{S}^{\downarrow}, we set gl(ui)=0g_{l}(u_{i})=0 and gl(v)=0g_{l}(v)=0 for all vViv\in V_{i}.

    2. (b)

      If Vi,l{ui}𝒮V_{i,l}\cup\{u_{i}\}\notin\mathcal{S}^{\downarrow}, we set gl(v)=1kg_{l}(v)=\frac{1}{k} for all vVi,lv\in V_{i,l}, and gl(v)=0g_{l}(v)=0 for all vViVi,lv\in V_{i}\setminus V_{i,l}. We set

      gl(ui)=12k+1k2g_{l}(u_{i})=1-\frac{2}{k}+\frac{1}{k^{2}}
  2. 2.

    For all vV0v\in V_{0}, we set gl(v)=0g_{l}(v)=0.

We now verify that these embeddings satisfy the conditions (A)(A) and (C)(C). Fix an integer l[(kΔ)2]l\in[(k\Delta)^{2}].

  1. (A)

    Consider a set S𝒮S\in\mathcal{S}. Let {ui}=SU\{u_{i}\}=S\cap U. If {ui}Vi,l𝒮\{u_{i}\}\cup V_{i,l}\in\mathcal{S}^{\downarrow}, gl(v)=0g_{l}(v)=0 for all vSv\in S, and thus we have |gl(S)|1|g_{l}(S)|\leq 1. Now suppose that {ui}Vi,l𝒮\{u_{i}\}\cup V_{i,l}\notin\mathcal{S}^{\downarrow}. This implies that Vi,lV_{i,l} is not a subset of SS. As |Vi,l|=2|V_{i,l}|=2, |Vi,lS|1|V_{i,l}\cap S|\leq 1. We get

    vSgl(v)\displaystyle\sum_{v\in S}g_{l}(v) =gl(ui)+vSVigl(v)\displaystyle=g_{l}(u_{i})+\sum_{v\in S\cap V_{i}}g_{l}(v)
    =gl(ui)+vSVi,lgl(v)\displaystyle=g_{l}(u_{i})+\sum_{v\in S\cap V_{i,l}}g_{l}(v)
    gl(ui)+1k\displaystyle\leq g_{l}(u_{i})+\frac{1}{k}
    =12k+1k2+1k1\displaystyle=1-\frac{2}{k}+\frac{1}{k^{2}}+\frac{1}{k}\leq 1
  2. (C)

    This follows from the fact that gl(v)1kg_{l}(v)\leq\frac{1}{k} for all vVUv\in V\setminus U.

Suppose that a set SVS\subseteq V satisfies S{ui}ViS\subseteq\{u_{i}\}\cup V_{i} for some i[m]i\in[m], and uiSu_{i}\in S, S𝒮S\notin\mathcal{S}^{\downarrow}. Then, we claim that g(S)>1\left\lVert g^{\prime}(S)\right\rVert_{\infty}>1. Suppose for the sake of contradiction that g(S)1\left\lVert g^{\prime}(S)\right\rVert_{\infty}\leq 1. Then, we have gl(S)1g_{l}(S)\leq 1 for all l[(kΔ)2]l\in[(k\Delta)^{2}]. Let S={ui,s1,s2,,sp}S=\{u_{i},s_{1},s_{2},\ldots,s_{p}\} where sjVis_{j}\in V_{i} for all j[p]j\in[p]. Note that for every vViv\in V_{i}, there is exactly one set S(v)𝒮S(v)\in\mathcal{S} such that vS(v)v\in S(v) and this set S(v)S(v) satisfies uiS(v)u_{i}\in S(v). This follows from the definition of ViV_{i} and the fact that the set family 𝒮\mathcal{S} is a sunflower-bouquet.

We now claim that S(sj1)=S(sj2)S(s_{j_{1}})=S(s_{j_{2}}) for all j1,j2[p]j_{1},j_{2}\in[p]. Suppose for contradiction that there exist j1,j2[p]j_{1},j_{2}\in[p] with S(sj1)S(sj2)S(s_{j_{1}})\neq S(s_{j_{2}}). This implies that {ui,sj1,sj2}𝒮\{u_{i},s_{j_{1}},s_{j_{2}}\}\notin\mathcal{S}^{\downarrow} as otherwise, if there exists T𝒮T\in\mathcal{S} such that {ui,sj1,sj2}T\{u_{i},s_{j_{1}},s_{j_{2}}\}\subseteq T, we have S(sj1)=S(sj2)=TS(s_{j_{1}})=S(s_{j_{2}})=T. Let l[(kΔ)2]l\in[(k\Delta)^{2}] be such that Vi,l={sj1,sj2}V_{i,l}=\{s_{j_{1}},s_{j_{2}}\}. As Vi,l{ui}𝒮V_{i,l}\cup\{u_{i}\}\notin\mathcal{S}^{\downarrow}, we have gl(v)=1kg_{l}(v)=\frac{1}{k} for all vVi,lv\in V_{i,l} and

gl(ui)=12k+1k2g_{l}(u_{i})=1-\frac{2}{k}+\frac{1}{k^{2}}

Thus, we get that

vSgl(v)\displaystyle\sum_{v\in S}g_{l}(v) =gl(ui)+vS{ui}gl(v)\displaystyle=g_{l}(u_{i})+\sum_{v\in S\setminus\{u_{i}\}}g_{l}(v)
=gl(ui)+vVi,lgl(v)\displaystyle=g_{l}(u_{i})+\sum_{v\in V_{i,l}}g_{l}(v)
=12k+1k2+2k=1+1k2\displaystyle=1-\frac{2}{k}+\frac{1}{k^{2}}+\frac{2}{k}=1+\frac{1}{k^{2}}

contradicting the fact that gl(S)1g_{l}(S)\leq 1. This completes the proof that S(sj1)=S(sj2)S(s_{j_{1}})=S(s_{j_{2}}) for all j1,j2[p]j_{1},j_{2}\in[p]. Thus, there exists a set S(s1)𝒮S(s_{1})\in\mathcal{S} such that SS(s1)S\subseteq S(s_{1}), which implies that S𝒮S\in\mathcal{S}^{\downarrow}, a contradiction. Thus, for every set SVS\subseteq V such that uiSu_{i}\in S, S{ui}ViS\subseteq\{u_{i}\}\cup V_{i} for some i[m]i\in[m] and g(S)1\left\lVert g^{\prime}(S)\right\rVert_{\infty}\leq 1, we have S𝒮S\in\mathcal{S}^{\downarrow}.

Final embedding. We define the final embedding f:V[0,1]2+2kΔ+(kΔ)2f:V\rightarrow[0,1]^{2+2k\Delta+(k\Delta)^{2}} as f=(f0,g,g)f=(f_{0},g,g^{\prime}). As each of these embeddings satisfies the conditions (A)(A) and (C)(C), the final embedding ff also satisfies the conditions (A)(A) and (C)(C).

Suppose that f(S)1\left\lVert f(S)\right\rVert_{\infty}\leq 1 for a set SVS\subseteq V. Then, f0(S)1\left\lVert f_{0}(S)\right\rVert_{\infty}\leq 1, g(S)1\left\lVert g(S)\right\rVert_{\infty}\leq 1 and g(S)1\left\lVert g^{\prime}(S)\right\rVert_{\infty}\leq 1. Condition (D)(D) follows immediately as f0(S)1\left\lVert f_{0}(S)\right\rVert_{\infty}\leq 1 implies that |S|k|S|\leq k.

We now return to condition (B)(B). Suppose that SVS\subseteq V with SUS\cap U\neq\emptyset satisfies f(S)1\left\lVert f(S)\right\rVert_{\infty}\leq 1. Our goal is to show that S𝒮S\in\mathcal{S}^{\downarrow}. We have already deduced from f0(S)1\left\lVert f_{0}(S)\right\rVert_{\infty}\leq 1 that |SU|1|S\cap U|\leq 1. As SUϕS\cap U\neq\phi, we have |SU|=1|S\cap U|=1, and by using f0(S)1\left\lVert f_{0}(S)\right\rVert_{\infty}\leq 1 again, we get that SV0=ϕS\cap V_{0}=\phi. Let SU={ui}S\cap U=\{u_{i}\}. As g(S)1\left\lVert g(S)\right\rVert_{\infty}\leq 1, using the argument in the first step, we can conclude that SVi=ϕS\cap V_{i^{\prime}}=\phi for all iii^{\prime}\neq i. Thus, S{ui}ViS\subseteq\{u_{i}\}\cup V_{i}. By using the argument in the second step, g(S)1\left\lVert g^{\prime}(S)\right\rVert_{\infty}\leq 1 implies that S𝒮S\in\mathcal{S}^{\downarrow}.

Note that our construction is explicit, and we have a polynomial time algorithm to output the required embedding. The dimension of the embedding is 2+2kΔ+(kΔ)22+2k\Delta+(k\Delta)^{2}, which is at most (kΔ)O(1)(k\Delta)^{O(1)}. ∎

As a corollary, we bound the packing dimension of the set family

𝒯=𝒮{SVU:|S|k}.\mathcal{T}^{\downarrow}=\mathcal{S}^{\downarrow}\cup\{S\subseteq V\setminus U:|S|\leq k\}.
Corollary 3.6.

Suppose that 𝒯\mathcal{T} is a set family defined on a universe VV with

𝒯=𝒮{SVU:|S|k}\mathcal{T}=\mathcal{S}\cup\{S\subseteq V\setminus U:|S|\leq k\}

where 𝒮2V\mathcal{S}\subseteq 2^{V} is a sunflower-bouquet with core UU. Furthermore, each set in 𝒮\mathcal{S} has cardinality at most k2k\geq 2 and each element appears in at most Δ\Delta sets in 𝒮\mathcal{S}. Then,

pdim(𝒯)(kΔ)O(1)\emph{{pdim}}(\mathcal{T}^{\downarrow})\leq(k\Delta)^{O(1)}

Furthermore, an embedding realizing this packing dimension can be found in time polynomial in |V||V| given 𝒮\mathcal{S}.

Proof.

As 𝒮\mathcal{S} is a sunflower-bouquet, from  Lemma 3.5, there exists an embedding f:V[0,1]Kf:V\rightarrow[0,1]^{K} that satisfies the conditions (A),(B),(C)(A),(B),(C) and (D)(D) with K=(kΔ)O(1)K=(k\Delta)^{O(1)}. Conditions (A)(A) and (C)(C) together imply that

f(S)1\left\lVert f(S)\right\rVert_{\infty}\leq 1

for all S𝒯S\in\mathcal{T}. Note that

𝒯=𝒮{SVU:|S|k}.\mathcal{T}^{\downarrow}=\mathcal{S}^{\downarrow}\cup\{S\subseteq V\setminus U:|S|\leq k\}.

Suppose that SVS\subseteq V is a subset of VV with S𝒯S\notin\mathcal{T}^{\downarrow}. If SU=ϕS\cap U=\phi, then |S|>k|S|>k, which implies that f(S)>1\left\lVert f(S)\right\rVert_{\infty}>1 using condition (D)(D). If SUϕS\cap U\neq\phi, then S𝒮S\notin\mathcal{S}^{\downarrow} which implies that f(S)>1\left\lVert f(S)\right\rVert_{\infty}>1 using condition (B)(B). Thus, f(S)1\left\lVert f(S)\right\rVert_{\infty}\leq 1 if and only if S𝒯S\in\mathcal{T}^{\downarrow}. ∎

We are now ready to prove our main embedding result i.e. Theorem 3.3.

Proof of Theorem 3.3.

We define a graph G=(V,E)G=(V,E) as follows: two elements u,vVu,v\in V are adjacent in GG if there exist sets S1,S2𝒮S_{1},S_{2}\in\mathcal{S} (not necessarily distinct) such that uS1,vS2,S1S2u\in S_{1},v\in S_{2},S_{1}\cap S_{2}\neq\emptyset. As the cardinality of each set in 𝒮\mathcal{S} is at most kk and each element of VV is present in at most Δ\Delta sets, the maximum degree of a vertex in GG can be bounded above as

Δ(G)k(k1)Δ2\Delta(G)\leq k(k-1)\Delta^{2}

Thus, the chromatic number of GG is at most L=χ(G)k(k1)Δ2+1k2Δ2L=\chi(G)\leq k(k-1)\Delta^{2}+1\leq k^{2}\Delta^{2}. Using the greedy coloring algorithm, we can partition VV into LL non-empty parts U1,U2,,ULU_{1},U_{2},\ldots,U_{L} such that each UjU_{j} is a independent set in GG. For every j[L]j\in[L], as UjU_{j} is an independent set in GG, we have

  1. 1.

    For every set S𝒮S\in\mathcal{S}, |SUj|1|S\cap U_{j}|\leq 1.

  2. 2.

    Any two sets S1,S2𝒮S_{1},S_{2}\in\mathcal{S} with S1UjS_{1}\cap U_{j}\neq\emptyset, S2UjS_{2}\cap U_{j}\neq\emptyset and S1S2S_{1}\cap S_{2}\neq\emptyset satisfy S1Uj=S2Uj=S1S2S_{1}\cap U_{j}=S_{2}\cap U_{j}=S_{1}\cap S_{2}.

We now define the set families 𝒮1,𝒮2,,𝒮L\mathcal{S}_{1},\mathcal{S}_{2},\ldots,\mathcal{S}_{L} as follows:

𝒮j={S𝒮:SUj}{SVUj:|S|k}\mathcal{S}_{j}=\{S\in\mathcal{S}:S\cap U_{j}\neq\emptyset\}\cup\{S\subseteq V\setminus U_{j}:|S|\leq k\}

We claim that j[L]𝒮j=𝒮\bigcap_{j\in[L]}\mathcal{S}_{j}^{\downarrow}=\mathcal{S}^{\downarrow}. First, consider an arbitrary set S𝒮S\in\mathcal{S}^{\downarrow} and an integer j[L]j\in[L]. As |S|k|S|\leq k, irrespective of SS intersects UjU_{j} or not, S𝒮jS\in\mathcal{S}_{j}^{\downarrow}. Thus, 𝒮𝒮j\mathcal{S}^{\downarrow}\subseteq\mathcal{S}_{j}^{\downarrow} for all j[L]j\in[L]. Consider a non-empty set S𝒮S\notin\mathcal{S}^{\downarrow}. As U1,U2,,ULU_{1},U_{2},\ldots,U_{L} is a partition of VV, there exists a j[L]j\in[L] such that SUjS\cap U_{j}\neq\emptyset. As S𝒮S\notin\mathcal{S}^{\downarrow}, S𝒮jS\notin\mathcal{S}_{j}^{\downarrow}. This implies that

j[L]𝒮j=𝒮\bigcap_{j\in[L]}\mathcal{S}_{j}^{\downarrow}=\mathcal{S}^{\downarrow}

Using Proposition 3.2, in order to bound the packing dimension of 𝒮\mathcal{S}^{\downarrow}, it suffices to bound the packing dimension of 𝒮j\mathcal{S}_{j}^{\downarrow}, j[L]j\in[L].

Fix an integer j[L]j\in[L] and consider the set family 𝒮j\mathcal{S}_{j}^{\downarrow}. It is defined on the universe VV and there exists a non-empty subset UjVU_{j}\subseteq V such that

𝒮j=𝒮j{SVUj:|S|k}\mathcal{S}_{j}=\mathcal{S}^{\prime}_{j}\cup\{S\subseteq V\setminus U_{j}:|S|\leq k\}

with

𝒮j={S𝒮:SUj}.\mathcal{S}^{\prime}_{j}=\{S\in\mathcal{S}:S\cap U_{j}\neq\emptyset\}.

Here, 𝒮j\mathcal{S}^{\prime}_{j} is a simple set system which satisfies the following properties:

  1. 1.

    Each set in 𝒮j\mathcal{S}^{\prime}_{j} has cardinality at most k2k\geq 2 and each element appears in at most Δ\Delta sets in 𝒮j\mathcal{S}^{\prime}_{j}.

  2. 2.

    Every set S𝒮jS\in\mathcal{S}^{\prime}_{j} satisfies |SUj|=1|S\cap U_{j}|=1. As 𝒮\mathcal{S} is non-trivial, for every uUju\in U_{j}, there exists a set S𝒮jS\in\mathcal{S}^{\prime}_{j} with uSu\in S.

  3. 3.

    For every pair of sets S1,S2𝒮jS_{1},S_{2}\in\mathcal{S}^{\prime}_{j} with S1S2ϕS_{1}\cap S_{2}\neq\phi, S1Uj=S2Uj=S1S2S_{1}\cap U_{j}=S_{2}\cap U_{j}=S_{1}\cap S_{2}.

In other words, the set family 𝒮j\mathcal{S}_{j}^{\prime} is a sunflower-bouquet with core UjU_{j}. Using Corollary 3.6, we get that pdim(𝒮j)(kΔ)O(1)\textsf{pdim}(\mathcal{S}_{j}^{\downarrow})\leq(k\Delta)^{O(1)} for all j[L]j\in[L], which completes the proof. ∎

3.3 Hardness of Vector Bin Packing

We show that for large enough constant dd, Vector Bin Packing is hard to approximate within Ω(logd)\Omega(\log d). Our hardness is obtained via the hardness of set cover on simple bounded instances.

In the set cover problem, the input is a set family 𝒮\mathcal{S} on a universe VV with |V|=n|V|=n. The objective is to pick the minimum number of sets {S1,S2,,Sm}𝒮\{S_{1},S_{2},\ldots,S_{m}\}\subseteq\mathcal{S} from the family such that their union is equal to VV. The greedy algorithm where we repeatedly pick the set that covers the maximum number of new elements achieves a lnn\ln n approximation factor. Fiege [Fei98] proved a matching hardness of (1ϵ)(lnn)(1-\epsilon)(\ln n). On set systems where each pair of sets intersect in at most one element i.e. simple instances, Ω(logn)\Omega(\log n) hardness of set cover is proved by Kumar, Arya, and Ramesh [KAR00]. We observe that by changing the parameters slightly, their reduction also implies the same hardness on instances where the maximum set size is bounded:

Theorem 3.7.

(Set Cover on simple bounded instances) There exists an integer B0B_{0} such that for every constant BB0B\geq B_{0}, the Set Cover problem on simple set systems in which each set has cardinality at most BB is NP-hard to approximate within Ω(logB)\Omega(\log B). Furthermore, in the hard instances, each element occurs in at most O(B)O(B) sets.

The details of the parameter modification appear in Appendix A.

We combine this set cover hardness with the bound on the packing dimension of simple set systems to prove the hardness of Vector Bin Packing.

Proof of Theorem 1.1.

We prove the result by giving an approximation preserving reduction from the NP-hard problem of set cover on simple bounded set systems. Let 𝒮\mathcal{S} be the set system from Theorem 3.7 defined on a universe VV. Note that each set in the family has cardinality at most k=Bk=B and each element in the universe appears in at most Δ=O(B)\Delta=O(B) sets. We now output a set 𝒱\mathcal{V} of |V||V| vectors in [0,1]d[0,1]^{d} such that

  1. 1.

    (Completeness.) If there is a set cover of size mm in 𝒮\mathcal{S}, there is a packing of 𝒱\mathcal{V} using mm bins.

  2. 2.

    (Soundness.) If there is no set cover of size mm^{\prime} in 𝒮\mathcal{S}, there is no packing of 𝒱\mathcal{V} using mm^{\prime} bins.

We use Theorem 3.3 to compute an embedding f:V[0,1]df:V\rightarrow[0,1]^{d} in polynomial time such that

f(S)1\left\lVert f(S)\right\rVert_{\infty}\leq 1

if and only if S𝒮S\in\mathcal{S}^{\downarrow}, with d=(kΔ)O(1)=BO(1)d=(k\Delta)^{O(1)}=B^{O(1)}. Our output Vector Bin Packing instance is the set of vectors f(v),vVf(v),v\in V.

𝒱={f(v):vV}\mathcal{V}=\{f(v):v\in V\}
Completeness.

Suppose that there exist sets S1,S2,,Sm𝒮S_{1},S_{2},\ldots,S_{m}\in\mathcal{S} whose union is VV. Then, we use mm bins with the vectors {f(vj):jSi}\{f(v_{j}):j\in S_{i}\} in the iith bin. A vector might appear in multiple bins, but we can arbitrarily pick one bin for each vector while still maintaining the property that in each bin, the \ell_{\infty} norm of the sum of the vectors is at most 11.

Soundness.

Suppose that the minimum set cover in 𝒮\mathcal{S} has cardinality at least m+1m^{\prime}+1. Then, we claim that the set of vectors 𝒱\mathcal{V} needs m+1m^{\prime}+1 bins to be packed. Suppose for contradiction that there is a vector packing with mm^{\prime} bins. In other words, there exists a partition of VV into B1,B2,,BmB_{1},B_{2},\ldots,B_{m^{\prime}} such that f(Bi)1\left\lVert f(B_{i})\right\rVert_{\infty}\leq 1 for all i[m]i\in[m^{\prime}]. As f(Bi)1\left\lVert f(B_{i})\right\rVert_{\infty}\leq 1, Bi𝒮B_{i}\in\mathcal{S}^{\downarrow} for all i[m]i\in[m^{\prime}]. That is, for every i[m]i\in[m^{\prime}], there exists a set Si𝒮S_{i}\in\mathcal{S} such that BiSiB_{i}\subseteq S_{i}. This implies that {S1,S2,,Sm}\{S_{1},S_{2},\ldots,S_{m^{\prime}}\} is a set cover of VV, a contradiction.

As the original bounded simple set cover problem is hard to approximate within Ω(logB)=Ω(logd)\Omega(\log B)=\Omega(\log d), the resulting Vector Bin Packing is hard to approximate within Ω(logd)\Omega(\log d). Furthermore, in the hard instances, the optimal value i.e. the minimum number of bins needed to pack the vectors can be made arbitrarily large, and thus, the hardness applies to the asymptotic approximation ratio. ∎

4 Vector Scheduling

4.1 Monochromatic Clique

In the Monochromatic Clique problem, given a graph G=([n],E)G=([n],E) and a parameter k(n)k(n), the objective is to assign kk colors to the vertices of GG so as to minimize the largest monochromatic clique. More formally, we study the following decision version of the problem.

Definition 4.1.

(Monochromatic-Clique(k,B)(k,B)) In the Monochromatic-Clique(k,B)(k,B) problem, given a graph G=(V,E)G=(V,E) with |V|=n|V|=n and parameters k(n),B(n)k(n),B(n), the goal is to distinguish between the following:

  1. 1.

    (YES case) The chromatic number of GG is at most kk.

  2. 2.

    (NO case) In any assignment of kk colors to the vertices of GG, there is a clique of size BB, all of whose vertices are assigned the same color.

It generalizes the standard kk-Coloring problem, which corresponds to the case when B=2B=2. Note that the problem gets easier as BB increases. Indeed, when B>nB>\sqrt{n}, we can solve the problem in polynomial time using the canonical SDP relaxation. We present this algorithm and an almost matching integrality gap in Appendix B.

On the hardness front, we now prove that Monochromatic-Clique(k,B)(k,B) is hard when B=(logn)CB=(\log n)^{C}, for any constant CC. We achieve this in two steps: First, we observe that the existing chromatic number hardness results already imply the hardness of monochromatic clique when B=(logn)γB=(\log n)^{\gamma} for some constant γ>0\gamma>0. Next, we amplify this hardness by using lexicographic graph product.

4.1.1 Basic Hardness

We start with a couple of basic Ramsey theoretic lemmas from [CK04].

Lemma 4.2.

For a graph G=(V,E)G=(V,E) with |V|=n|V|=n, if ω(G)B\omega(G)\leq B, then α(G)n1B\alpha(G)\geq n^{\frac{1}{B}}.

Lemma 4.3.

For a graph G=(V,E)G=(V,E) with |V|=n|V|=n, if ω(G)B\omega(G)\leq B, then χ(G)O(n11Blogn)\chi(G)\leq O(n^{1-\frac{1}{B}}\log n).

We can use the above lemmas to prove that if the chromatic number of a graph is large enough, then in any assignment of kk colors to the vertices of the graph, there is a large monochromatic clique.

Lemma 4.4.

For every constant ϵ>0\epsilon>0, if a graph G=(V,E)G=(V,E) with |V|=n|V|=n satisfies χ(G)kn2(logn)α\chi(G)\geq k\frac{n}{2^{\left(\log n\right)^{\alpha}}} for some integer kk and 0<α<10<\alpha<1, then in any assignment of kk colors to VV, there is a monochromatic clique of size B=Ω((logn)1αϵ)B=\Omega\left((\log n)^{1-\alpha-\epsilon}\right).

Proof.

Suppose for contradiction that there is an assignment of kk colors VV without a monochromatic clique of size BB. Using Lemma 4.3, the subgraphs corresponding to each of the kk color classes has chromatic number at most

O(n11Blogn)=n2Ω((logn)α+ϵ)logn<n2(logn)αO(n^{1-\frac{1}{B}}\log n)=\frac{n}{2^{\Omega\left((\log n)^{\alpha+\epsilon}\right)}}\log n<\frac{n}{2^{(\log n)^{\alpha}}}

colors. Thus, the whole graph has chromatic number at most kn2(logn)αk\frac{n}{2^{(\log n)^{\alpha}}} colors, a contradiction. ∎

Khot [Kho01] proved that assuming NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right), the chromatic number of graphs is hard to approximate within a factor of n2(logn)1γ\frac{n}{2^{\left(\log n\right)^{1-\gamma}}} for an absolute constant γ>0\gamma>0. More formally, he proved the following:

Theorem 4.5.

[Kho01]) There exists a constant γ>0\gamma>0, a function k=k(n)k=k(n), and a randomized reduction that takes as input a 33-SAT instance II on nn variables and outputs a graph G=(V,E)G=(V,E) with |V|=N=2lognO(1)|V|=N=2^{\log n^{O(1)}} such that

  1. 1.

    (Completeness) If II is satisfiable, χ(G)k\chi(G)\leq k.

  2. 2.

    (Soundness) If II is not satisfiable, with probability at least 12\frac{1}{2}, χ(G)>kN2(logN)1γ\chi(G)>k\frac{N}{2^{\left(\log N\right)^{1-\gamma}}}.

Futhermore, the reduction runs in time poly(N)=2(logn)O(1)\emph{{poly}}(N)=2^{(\log n)^{O(1)}}.

We observe that Khot’s chromatic number hardness immediately gives (logn)Ω(1)(\log n)^{\Omega(1)} hardness of Monochromatic Clique.

Lemma 4.6.

There exists a constant γ>0\gamma>0, a function k=k(n)k=k(n) such that the following holds. Assuming NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right), given a graph G=([n],E)G=([n],E), there is no n(logn)O(1)n^{(\log n)^{O(1)}} time algorithm for Monochromatic-Clique(k,B)(k,B) when B=Ω((logn)γ)B=\Omega\left((\log n)^{\gamma}\right).

Proof.

Using Khot’s reduction, we get that there exists an absolute constant γ>0\gamma>0 such that assuming NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right), given a graph G=([n],E)G=([n],E) and a parameter k(n)k(n), there is no n(logn)O(1)n^{(\log n)^{O(1)}} time algorithm to distinguish between the following:

  1. 1.

    (Completeness) χ(G)k\chi(G)\leq k.

  2. 2.

    (Soundness) χ(G)>kn2(logn)1γ\chi(G)>k\frac{n}{2^{\left(\log n\right)^{1-\gamma}}}.

Using Lemma 4.4, the Soundness condition implies that in any assignment of kk colors to GG, there is a monochromatic clique of size Ω((logn)γϵ)\Omega\left((\log n)^{\gamma-\epsilon}\right), for any constant ϵ>0\epsilon>0. Thus, given a graph GG and a parameter kk, assuming NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right), there is no n(logn)O(1)n^{(\log n)^{O(1)}} time algorithm to distinguish between the following:

  1. 1.

    (Completeness) χ(G)k\chi(G)\leq k.

  2. 2.

    (Soundness) In any assignment of kk colors to the vertices of GG, there is a monochromatic clique of size Ω((logn)γ)\Omega((\log n)^{\gamma^{\prime}}).

for any constant γ<γ\gamma^{\prime}<\gamma. ∎

4.1.2 Amplification using Lexicographic Product

We cannot directly amplify the hardness of the Monochromatic-Clique problem by taking graph products as we cannot preserve the chromatic number and also amplify the largest clique in an assignment of kk colors at the same time. We get around this issue by defining a harder variant of Monochromatic Clique called Strong Monochromatic Clique and then amplifying it.

Definition 4.7.

(Strong Monochromatic-Clique(k,B,C)(k,B,C)) In the Strong Monochromatic-Clique(k,B,C)(k,B,C), given a graph GG and parameters k(n),B(n),Ck(n),B(n),C, the goal is to distinguish between the following two cases:

  1. 1.

    (YES case) The chromatic number of GG is at most kk.

  2. 2.

    (NO case) In any assignment of kCk^{C} colors to the vertices of GG, there is a monochromatic clique of size BB.

We now observe that the chromatic number hardness of Khot [Kho01] implies the same hardness as Lemma 4.6 for Strong Monochromatic Clique as well.

Lemma 4.8.

There exists a constant γ>0\gamma>0 and a function k=k(n)k=k(n) such that for every constant C1C\geq 1, the following holds. Assuming NPZPTIME(n(logn)O(1))\emph{{NP}}\nsubseteq\emph{{ZPTIME}}\left(n^{(\log n)^{O(1)}}\right), there is no n(logn)O(1)n^{(\log n)^{O(1)}} time algorithm for Strong Monochromatic-Clique(k,B,C)(k,B,C) when B=Ω((logn)γ)B=\Omega((\log n)^{\gamma}).

Proof.

Note that the function kk in Theorem 4.5 satisfies k=o(2(logN)1γ)k=o\left(2^{(\log N)^{1-\gamma}}\right). Thus, we can replace the soundness condition in Theorem 4.5 with χ(G)kCN2C(logN)1γ\chi(G)\geq k^{C}\frac{N}{2^{C\left(\log N\right)^{1-\gamma}}}. Using  Lemma 4.4, this implies that in any assignment of kCk^{C} colors to the vertices of GG, there is a monochromatic clique of size Ω((logN)γϵ)\Omega((\log N)^{\gamma-\epsilon}), where ϵ>0\epsilon>0 is an absolute constant. The hardness of Strong Monochromatic Clique then follows along the same lines as Lemma 4.6. ∎

We amplify the hardness of Strong Monochromatic-Clique(k,B,C)(k,B,C) to Monochromatic-Clique(kC,BC)(k^{C},B^{C}) using the lexicographic product of graphs. First, we define lexicographic product and prove some properties of it.

Definition 4.9.

(Lexicographic product of graphs) Given two graphs GG and HH, the Lexicographic graph product GHG\cdot H has vertex set V(G)×V(H)V(G)\times V(H), and two vertices (u1,v1),(u2,v2)(u_{1},v_{1}),(u_{2},v_{2}) are adjacent if either (u1,u2)E(G)(u_{1},u_{2})\in E(G) or u1=u2u_{1}=u_{2} and (v1,v2)E(H)(v_{1},v_{2})\in E(H).

The lexicographic product can be visualized as replacing each vertex of GG with a copy of HH and forming complete bipartite graphs between copies of vertices adjacent in GG. For ease of notation, we let G2=GGG^{2}=G\cdot G. More generally, for an integer nn that is a power of 22, we define GnG^{n} as taking the above lexicographic product of GG with itself recursively logn\log n times.

Lemma 4.10.

Let n2n\geq 2 be a power of 22. If χ(G)k\chi(G)\leq k, then χ(Gn)kn\chi(G^{n})\leq k^{n}.

Proof.

We prove that χ(G2)k2\chi(G^{2})\leq k^{2}, and the statement follows by induction on nn. If f:G[k]f:G\rightarrow[k] is a proper kk-coloring of GG, then the coloring f(u,v)=(f(u),f(v))f^{\prime}(u,v)=(f(u),f(v)) is a proper k2k^{2}-coloring of G×GG\times G. ∎

Lemma 4.11.

Let n2n\geq 2 be a power of 22. Suppose that in any assignment of kk colors to the vertices of GG, there is a monochromatic clique of size BB. Then, in any assignment of kk colors to the vertices of GnG^{n}, there is a monochromatic clique of size BnB^{n}.

Proof.

We prove the statement for n=2n=2 and the lemma follows by induction on nn. Let f:V(G2)[k]f:V(G^{2})\rightarrow[k] be a given assignment. For a vertex vGv\in G, consider the assignment gv:V(G)[k]g_{v}:V(G)\rightarrow[k] defined as gv(u)=f(v,u)g_{v}(u)=f(v,u). As every assignment of kk colors to the vertices of GG has a monochromatic clique of size BB, there is a color α(v)[k]\alpha(v)\in[k] and a clique S(v)V(G)S(v)\subseteq V(G) with |S(v)|B|S(v)|\geq B such that gv(u)=α(v)g_{v}(u)=\alpha(v) for all uS(v)u\in S(v), or in other words, f(v,u)=α(v)f(v,u)=\alpha(v) for all uS(v)u\in S(v). Note that such a set S(v)S(v) and α(v)\alpha(v) exist for vV(G)v\in V(G). The function α:V(G)[k]\alpha:V(G)\rightarrow[k] can also be visualized as an assignment of kk colors to the vertices of GG, and thus there is a monochromatic clique TT of size at least BB with respect to this assignment. The set

{S(v):vT}\{S(v):v\in T\}

is a monochromatic clique of size B2B^{2} with respect to ff in GG. ∎

By using the lexicographic product, we can get a polynomial time reduction from Strong Monochromatic Clique to Monochromatic Clique.

Lemma 4.12.

For every constant C1C\geq 1 that is a power of 22, there exists a polynomial time reduction from Strong Monochromatic-Clique(k,B,C)(k,B,C) to Monochromatic-Clique(kC,BC)(k^{C},B^{C}).

Proof.

Given a graph GG as an instance of Strong Monochromatic-Clique(k,B,C)(k,B,C), we compute the graph G=GCG^{\prime}=G^{C}. We claim that solving Monochromatic-Clique(kC,BC)(k^{C},B^{C}) on GG^{\prime} solves the original Strong Monochromatic Clique problem.

  1. 1.

    (Completeness.) Suppose that χ(G)k\chi(G)\leq k. Then, by Lemma 4.10, χ(G)kC\chi(G^{\prime})\leq k^{C}.

  2. 2.

    (Soundness.) Suppose that in any assignment of kCk^{C} colors to the vertices of GG, there is a monochromatic clique of size BB. Then, by Lemma 4.11, in any assignment of kCk^{C} colors to the vertices of GG^{\prime}, there is a monochromatic clique of size BCB^{C}.∎

Putting everything together, we obtain the following hardness of Monochromatic Clique.

Theorem 4.13.

For every constant C>0C>0, there exists a function k=k(n)k=k(n) such that the following holds. Assuming NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right), there is no n(logn)O(1)n^{(\log n)^{O(1)}} time algorithm for Monochromatic-Clique(k,B)(k,B) when B=Ω((logn)C)B=\Omega\left((\log n)^{C}\right).

Proof.

The proof follows directly by combining Lemma 4.8 and Lemma 4.12. ∎

4.2 From Monochromatic Clique to Vector Scheduling

We now prove Theorem 1.2 using the above hardness of Monochromatic Clique.

Proof of Theorem  1.2.

The reduction from Monochromatic-Clique(k,B)(k,B) to Vector Scheduling is (implicitly) proved in [CK04]. We present it here for the sake of completeness. Given a graph G=(V=[n],E)G=(V=[n],E), parameters kk and BB, we order all the BB-sized cliques of GG as T1,T2,,TdT_{1},T_{2},\ldots,T_{d} with dnBd\leq n^{B}. We define a set of nn vectors v1,v2,,vnv_{1},v_{2},\ldots,v_{n} of dimension dd with

(vi)j={1if iTj0otherwise.(v_{i})_{j}=\begin{cases}1&\text{if }i\in T_{j}\\ 0&\text{otherwise.}\end{cases}

The instance of the Vector Scheduling has these nn vectors as the input and the number of machines is equal to kk.

We analyze the reduction.

  1. 1.

    (Completeness.) Suppose that there exists a proper kk-coloring of GG, c:V[k]c:V\rightarrow[k]. We assign the vector viv_{i} to the machine c(i)c(i). For every j[d]j\in[d], all the BB vectors that have 11 in the jjth dimension are assigned to distinct machines. Thus, the makespan of the scheduling is at most 11.

  2. 2.

    (Soundness.) Suppose that in any assignment of kk colors to the vertices of GG, there is a monochromatic clique of size BB. In this case, the makespan of the scheduling is at least BB.

We set B=(logn)CB=(\log n)^{C} for a large constant CC to be set later. We choose kk from Theorem 4.13 such that assuming NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right), there is no n(logn)O(1)n^{(\log n)^{O(1)}} time algorithm for Monochromatic-Clique(k,B)(k,B). By the above reduction, we can conclude that there is no polynomial time algorithm that approximates the resulting Vector Scheduling instances within a factor of B=(logn)CB=(\log n)^{C}. As dnBd\leq n^{B}, we get that logd(logn)C+1\log d\leq(\log n)^{C+1}, and B(logd)11C+1B\geq(\log d)^{1-\frac{1}{C+1}}. Setting C=1ϵ1C=\frac{1}{\epsilon}-1, we get that dd-dimensional Vector Scheduling has no polynomial time Ω((logd)1ϵ)\Omega((\log d)^{1-\epsilon}) approximation algorithm assuming NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right), for every constant ϵ>0\epsilon>0. ∎

Remark 4.14.

In [IKKP19], Im, Kell, Kulkarni, and Panigrahi also study the r\ell_{r}-norm minimization of Vector Scheduling where the objective is to minimize

maxk[d](i=1m(Lik)r)1r\max_{k\in[d]}\left(\sum_{i=1}^{m}(L_{i}^{k})^{r}\right)^{\frac{1}{r}}

where LikL_{i}^{k} denotes the load on the machine ii on the kkth dimension. They gave an algorithm with an approximation ratio O((logdloglogd)11r)O\left(\left(\frac{\log d}{\log\log d}\right)^{1-\frac{1}{r}}\right). Our reduction from Monochromatic Clique gives almost optimal hardness for this variant as well: we get the hardness of Ω((logd)11rϵ)\Omega\left(\left(\log d\right)^{1-\frac{1}{r}-\epsilon}\right) assuming NPZPTIME(n(logn)O(1))\textsf{NP}\nsubseteq\textsf{ZPTIME}\left(n^{(\log n)^{O(1)}}\right), for every constant ϵ>0\epsilon>0.

4.3 Hardness of Vector Scheduling via Balanced Hypergraph Coloring

Observe that the resulting Vector Scheduling instances in the above reduction satisfy a stronger property: the vectors are from {0,1}d\{0,1\}^{d}. In the setting where the vectors are from {0,1}d\{0,1\}^{d}, the Vector Scheduling problem is closely related to the Balanced Hypergraph Coloring problem. In this problem, given a hypergraph HH and an integer kk, the objective is to assign kk colors to the vertices of HH minimizing the maximum number of monochromatic vertices in an edge. More formally, we study the following decision version of the problem.

Definition 4.15.

(Balanced Hypergraph Coloring.) In the Balanced Hypergraph Coloring problem, given a ss-uniform hypergraph HH and parameters kk and c<sc<s, the objective is to distinguish between the following:

  1. 1.

    There is an assignment of kk colors to the vertices of HH such that in every edge, each color appears at most cc times.

  2. 2.

    The hypergraph HH has no proper coloring with kk colors i.e., in any assignment of kk colors to the vertices of HH, there is an edge all of whose ss vertices are assigned the same color.

We give a simple reduction from Balanced Hypergraph Coloring to Vector Scheduling.

Lemma 4.16.

Given a ss-uniform hypergraph H=(V=[n],E)H=(V^{\prime}=[n^{\prime}],E^{\prime}) and parameters k,ck,c, there is a polynomial time reduction that outputs a Vector Scheduling instance II over nn^{\prime} vectors v1,v2,,vn{0,1}dv_{1},v_{2},\ldots,v_{n^{\prime}}\in\{0,1\}^{d} on mm^{\prime} machines with m=k,d=|E|m^{\prime}=k,d=|E^{\prime}| such that

  1. 1.

    (Completeness.) If there is an assignment of kk colors to the vertices of HH such that each color appears at most cc times in every edge, then there is a scheduling of II with makespan at most cc.

  2. 2.

    (Soundness.) If HH has no proper coloring with kk colors, then in any scheduling of II, the makespan is at least ss.

Proof.

Let d=|E|d=|E^{\prime}|. Order the edges of the hypergraph HH as e1,e2,,ede_{1},e_{2},\ldots,e_{d}. We define the set of vectors v1,v2,,vn{0,1}dv_{1},v_{2},\ldots,v_{n^{\prime}}\in\{0,1\}^{d} as follows:

(vi)j={1if iej0 otherwise.(v_{i})_{j}=\begin{cases}1&\text{if }i\in e_{j}\\ 0&\text{ otherwise.}\end{cases}

We set the number of machines mm^{\prime} to be equal to the number of colors kk. There is a natural correspondence between the assignment of kk-colors to the vertices of HH f:V[k]f:V^{\prime}\rightarrow[k], and the scheduling where we assign the vector viv_{i} to the machine f(i)f(i). We now analyze our reduction.

  1. 1.

    (Completeness.) If there exists an assignment of kk colors f:V[k]f:V^{\prime}\rightarrow[k] where each color appears at most cc times in each edge, we assign the vector vi,i[n]v_{i},i\in[n^{\prime}] to the machine f(i)f(i). In any dimension j[d]j\in[d], at most cc vectors viv_{i} with (vi)j=1(v_{i})_{j}=1 are scheduled on any machine. Thus, in any machine, the total load in each dimension is at most cc.

  2. 2.

    (Soundness.) If there exists a vector scheduling f:[n][m]f:[n]\rightarrow[m^{\prime}] with makespan strictly smaller than ss, assign the color f(i)f(i) to the iith vertex of the hypergraph. In any edge of the hypergraph, each color appears fewer than ss times as the makespan is smaller than ss. Thus, f:V[k]f:V^{\prime}\rightarrow[k] is a proper kk-coloring of the hypergraph HH. ∎

We prove the hardness results for Vector Scheduling, namely Theorem 1.3 and Theorem 1.4 by combining this reduction with the hardness of Balanced Hypergraph Coloring. Note that the dimension of the resulting instances in the above reduction is equal to mm, the number of edges in the hypergraph HH, and the ratio of the makespans in the completeness and soundness is equal to sc\frac{s}{c}. Thus, our goal is to prove the hardness of the Balanced Hypergraph Coloring problem where sc\frac{s}{c} is as large as possible, as a function of mm, the number of edges in the underlying hypergraph.

Towards this, we first give a reduction from the Label Cover problem to the Balanced Hypergraph Coloring problem.

Lemma 4.17.

Fix an odd prime number k3k\geq 3 and let ϵ=1k8\epsilon=\frac{1}{k^{8}}. Given a Label Cover instance G=(V=LR,E,ΣL,ΣR,Π)G=(V=L\cup R,E,\Sigma_{L},\Sigma_{R},\Pi), there is a polynomial time reduction that outputs a k2k^{2} uniform hypergraph H=(V,E)H=(V^{\prime},E^{\prime}) with |V||L|k|ΣL||V^{\prime}|\leq|L|k^{|\Sigma_{L}|} such that

  1. 1.

    (Completeness) If GG is satisfiable, there is an assignment of kk colors to the vertices of HH such that in every edge, each color occurs at most 2k2k times.

  2. 2.

    (Soundness) If no labeling to GG can satisfy an ϵ\epsilon fraction of the constraints, then HH has no proper kk-coloring, that is, in any assignment of kk colors to the vertices of HH, there is an edge all of whose vertices are assigned the same color.

Furthermore, |E||E^{\prime}| is at most |R|Δkk|ΣL|k2|R|\Delta^{k}k^{|\Sigma_{L}|k^{2}} where Δ\Delta is the maximum degree of a vertex vRv\in R.

We defer the proof of Lemma 4.17 to Section 4.4.

Using Lemma 4.17, we can prove the hardness of Balanced Hypergraph Coloring via Label Cover hardness results. We obtain two different hardness results for the Balanced Hypergraph Coloring problem, one under NPDTIME(nO(loglogn))\textsf{NP}\nsubseteq\textsf{DTIME}\left(n^{O(\log\log n)}\right) and another NP-hardness result, by using two different hardness results for the Label Cover problem. These two hardness results prove Theorem 1.4 and Theorem 1.3 respectively, using Lemma 4.16.

First, using the standard Label Cover hardness obtained using PCP Theorem [ALM+98] combined with Raz’s Parallel Repetition theorem [Raz98], we get the following hardness of Balanced Hypergraph Coloring.

Theorem 4.18.

Assuming NPDTIME(nO(loglogn))\textsf{NP}\nsubseteq\textsf{DTIME}\left(n^{O(\log\log n)}\right), there is no polynomial time algorithm for the following problem. Given a k2k^{2}-uniform hypergraph H=(V,E)H=(V^{\prime},E^{\prime}) with m=|E|m=|E^{\prime}| and k=(logm)Ω(1)k=(\log m)^{\Omega(1)}, distinguish between the following:

  1. 1.

    There is an assignment of kk colors to the vertices of HH such that in any edge of the hypergraph, each color appears at most 2k2k times.

  2. 2.

    The hypergraph HH has no proper kk coloring.

Proof.

By setting ϵ=1k8\epsilon=\frac{1}{k^{8}} in Theorem 2.7, we have a reduction from the 33-SAT problem on nn variables to the Label Cover problem G=(V=LR,E,ΣL,ΣR,Π)G=(V=L\cup R,E,\Sigma_{L},\Sigma_{R},\Pi) with soundness ϵ\epsilon and |V|nO(logk)|V|\leq n^{O(\log k)}, |ΣL|kO(1)|\Sigma_{L}|\leq k^{O(1)} and ΔkO(1)\Delta\leq k^{O(1)}. Using Lemma 4.17, we can reduce this Label Cover instance to a Balanced Hypergraph Coloring instance H=(V,E)H=(V^{\prime},E^{\prime}) with |V|nO(logk)2kO(1)|V^{\prime}|\leq n^{O(\log k)}2^{k^{O(1)}} and |E|nO(logk)2kO(1)|E^{\prime}|\leq n^{O(\log k)}2^{k^{O(1)}}. We set k=(logn)Ω(1)k=(\log n)^{\Omega(1)} such that |V|=nO(loglogn)|V^{\prime}|=n^{O(\log\log n)} and |E|=nO(loglogn)|E^{\prime}|=n^{O(\log\log n)} to obtain the required hardness of Balanced Hypergraph Coloring. ∎

The proof of Theorem 1.4 follows immediately from Theorem 4.18 and Lemma 4.16.

Next, using the hardness of near linear sized Label Cover due to Moshkovitz and Raz [MR10], we obtain the following NP-hardness of Balanced Hypergraph Coloring.

Theorem 4.19.

For any constant C1C\geq 1, given a k2k^{2} uniform hypergraph H=(V,E)H=(V^{\prime},E^{\prime}) with m=|E|m=|E^{\prime}| and k=(loglogm)Ck=(\log\log m)^{C}, it is NP-hard to distinguish between the following:

  1. 1.

    There is an assignment of kk colors to the vertices of HH such that in any edge of the hypergraph, each color appears at most 2k2k times.

  2. 2.

    The hypergraph HH has no proper kk coloring.

Proof.

By setting ϵ=1k8\epsilon=\frac{1}{k^{8}} in Theorem 2.8, we can reduce a 33-SAT instance over nn variables to a Label Cover instance G=(V=LR,E,ΣL,ΣR,Π)G=(V=L\cup R,E,\Sigma_{L},\Sigma_{R},\Pi) with soundness ϵ\epsilon and |V|n1+o(1)kO(1),|ΣL|2kO(1)|V|\leq n^{1+o(1)}k^{O(1)},|\Sigma_{L}|\leq 2^{k^{O(1)}}, Δ=kO(1)\Delta=k^{O(1)}. By using Lemma 4.17, we can reduce the Label Cover instance to a Balanced Hypergraph Coloring instance H=(V,E)H=(V^{\prime},E^{\prime}) with |V|n1+o(1)22kO(1)|V^{\prime}|\leq n^{1+o(1)}2^{2^{k^{O(1)}}} and |E||E^{\prime}| at most n1+o(1)22kO(1)n^{1+o(1)}2^{2^{k^{O(1)}}}. We set k=(loglogn)Ω(1)k=(\log\log n)^{\Omega(1)} to obtain |V|=O(n2),|E|=O(n2)|V^{\prime}|=O(n^{2}),|E^{\prime}|=O(n^{2}).

Dinur and Steurer [DS14] gave an improvement to [MR10]–in the new Label Cover hardness, the alphabet size |ΣL||\Sigma_{L}| can be taken to be 2(1ϵ)γ2^{\left(\frac{1}{\epsilon}\right)^{\gamma}} for every constant γ>0\gamma>0. Using this improved Label Cover hardness, we can set k=(loglogn)Ck=(\log\log n)^{C} for any constant C1C\geq 1 in the hardness of Balanced Hypergraph Coloring. ∎

The proof of Theorem 1.3 follows immediately from Theorem 4.19 and Lemma 4.16.

Finally, we remark that if the structured graph version of the Projection Games Conjecture [Mos15] holds, Lemma 4.17 and Lemma 4.16 together prove that dd-dimensional Vector Scheduling is NP-hard to approximate within a factor of (logd)Ω(1)(\log d)^{\Omega(1)}.

4.4 Proof of Lemma 4.17

We follow the standard Label Cover-Long Code framework–see e.g.,[ABP20].

Reduction. For ease of notation, let n=|ΣL|n=|\Sigma_{L}|. For every node vLv\in L of the Label Cover instance, we have a set of knk^{n} vertices denoted by fv={v}×[k]nf_{v}=\{v\}\times[k]^{n}. The vertex set of the hypergraph is V=vLfvV^{\prime}=\bigcup_{v\in L}f_{v}.

For every uRu\in R, and kk distinct neighbors of uu, v1,v2,,vkLv_{1},v_{2},\ldots,v_{k}\in L with projection constraints πi:[ΣL][ΣR],i[k]\pi_{i}:[\Sigma_{L}]\rightarrow[\Sigma_{R}],i\in[k], consider the set of k2k^{2} vectors xi,j\textbf{x}^{i,j} for i[k],j[k]i\in[k],j\in[k] which satisfy the following: For every βΣR\beta\in\Sigma_{R}, and for all α1,α2,,αkΣL\alpha_{1},\alpha_{2},\ldots,\alpha_{k}\in\Sigma_{L} such that πi(αi)=β\pi_{i}(\alpha_{i})=\beta for all i[k]i\in[k], we have

|{(i,j)|xαii,j=p}|2kp[k]\left|\{(i,j)|\textbf{x}^{i,j}_{\alpha_{i}}=p\}\right|\leq 2k\,\,\forall p\in[k] (1)

For every such set of k2k^{2} vectors, we add the edge {(vi,xi,j):1i,jk}\{(v_{i},\textbf{x}^{i,j}):1\leq i,j\leq k\} to EE^{\prime}. We can observe that |V||L|k|ΣL||V^{\prime}|\leq|L|k^{|\Sigma_{L}|} and

|E||R|(Δk)(k|ΣL|k)k|R|Δkk|ΣL|k2.|E^{\prime}|\leq|R|\binom{\Delta}{k}\binom{k^{|\Sigma_{L}|}}{k}^{k}\leq|R|\Delta^{k}k^{|\Sigma_{L}|k^{2}}.

Completeness. Suppose that there exists an assignment σ:VΣ\sigma:V\rightarrow\Sigma that satisfies all the constraints of the Label Cover instance GG. We color the set of vertices fvf_{v} in the long code corresponding to the vertex vLv\in L with the dictator function on the coordinate σ(v)\sigma(v) i.e. for every xfv\textbf{x}\in f_{v}, we assign the color

c({v,x})=xσ(v)c\left(\{v,\textbf{x}\}\right)=\textbf{x}_{\sigma(v)}

We can observe that this coloring satisfies the property that in every edge eEe\in E^{\prime}, each color appears at most 2k2k times.

Soundness. Suppose that there is a proper kk-coloring c:V[k]c:V^{\prime}\rightarrow[k] of the hypergraph HH i.e. in every edge e={v1,v2,,vk2}e=\{v_{1},v_{2},\ldots,v_{k^{2}}\}, we have

|{c(v1),c(v2),,c(vk2)}|>1\left|\{c(v_{1}),c(v_{2}),\ldots,c(v_{k^{2}})\}\right|>1

Our goal is to prove that there is a labeling to the Label Cover instance that satisfies at least ϵ=1k8\epsilon=\frac{1}{k^{8}} fraction of constraints.

We need the following lemma proved by Austrin, Bhangale, Potukuchi [ABP20] using a generalization of Borsuk-Ulam theorem.

Lemma 4.20.

(Theorem 5.25.2 of [ABP20]) For every odd prime kk and nk3n\geq k^{3}, in any kk-coloring of [k]n[k]^{n}, c:[k]n[k]c:[k]^{n}\rightarrow[k], there is a set of kk vectors x1,x2,,xk\textbf{x}^{1},\textbf{x}^{2},\ldots,\textbf{x}^{k} that are all assigned the same color such that

{xi1,xi2,,xik}=[k]\{\textbf{x}^{1}_{i},\textbf{x}^{2}_{i},\ldots,\textbf{x}^{k}_{i}\}=[k]

for at least nk3n-k^{3} distinct coordinates i[n]i\in[n].

Using this lemma, for every vLv\in L, we can identify a set of vectors xv,1,xv,2,,xv,kfv\textbf{x}^{v,1},\textbf{x}^{v,2},\ldots,\textbf{x}^{v,k}\in f_{v} such that all these vectors have the same color i.e. c({v,xv,i})=c(v)c(\{v,\textbf{x}^{v,i}\})=c^{\prime}(v) for all vL,i[k]v\in L,i\in[k] for some function c:L[k]c^{\prime}:L\rightarrow[k]. Furthermore, there are a set of coordinates S(v)[n]S(v)\subseteq[n] with |S(v)|k3|S(v)|\leq k^{3} such that

{xiv,1,xiv,2,,xiv,k}=[k]\{\textbf{x}^{v,1}_{i},\textbf{x}^{v,2}_{i},\ldots,\textbf{x}^{v,k}_{i}\}=[k]

for every i[n]S(v)i\in[n]\setminus S(v).

For a set SΣLS\subseteq\Sigma_{L} and a function π:ΣLΣR\pi:\Sigma_{L}\rightarrow\Sigma_{R}, we use π(S)\pi(S) to denote the set {π(i):iS}\{\pi(i):i\in S\}. We now prove a key lemma that helps in the decoding procedure.

Lemma 4.21.

Let uRu\in R be a node on the right side of the Label Cover instance. There are a set of labels S(u)ΣRS(u)\subseteq\Sigma_{R} such that |S(u)|k5|S(u)|\leq k^{5}, and for every vLv\in L that is a neighbor of uu with projection constraint π:ΣLΣR\pi:\Sigma_{L}\rightarrow\Sigma_{R}, we have S(u)π(S(v))ϕS(u)\cap\pi(S(v))\neq\phi.

Proof.

Fix a node uRu\in R on the right side of the Label Cover instance. Let v1,v2,,vlLv_{1},v_{2},\ldots,v_{l}\in L be the neighbors of uu in the Label Cover instance corresponding to the projection constraints π1,π2,,πl\pi_{1},\pi_{2},\ldots,\pi_{l} respectively. As |S(vi)|k3|S(v_{i})|\leq k^{3} for all i[l]i\in[l], and the constraints πi\pi_{i} are projections, we have |πi(S(vi))|k3|\pi_{i}(S(v_{i}))|\leq k^{3} for all i[l]i\in[l]. Among these ll subsets πi(S(vi))\pi_{i}(S(v_{i})) of ΣR\Sigma_{R}, let the maximum number of pairwise disjoint subsets be denoted by ll^{\prime}. Without loss of generality, we can assume that 𝒮={πi(S(vi)):i[l]}\mathcal{S}=\{\pi_{i}(S(v_{i})):i\in[l^{\prime}]\} is a pairwise disjoint family of subsets.

We define the set S(u)S(u) as follows:

S(u)=i[l]πi(S(vi))S(u)=\bigcup_{i\in[l^{\prime}]}\pi_{i}(S(v_{i}))

As 𝒮\mathcal{S} is a family of maximum pairwise disjoint subsets, we have S(u)πi(S(vi))ϕS(u)\cap\pi_{i}(S(v_{i}))\neq\phi for all i[l]i\in[l]. Our goal is to bound the size of S(u)S(u), which we achieve by bounding ll^{\prime}.

We claim that lk(k1)l^{\prime}\leq k(k-1). Suppose for contradiction that l>k(k1)l^{\prime}>k(k-1). This implies that there are l>k(k1)l^{\prime}>k(k-1) nodes v1,v2,,vlv_{1},v_{2},\ldots,v_{l^{\prime}} all adjacent to uu such that πi(S(vi)),i[l]\pi_{i}(S(v_{i})),i\in[l^{\prime}] are all pairwise disjoint. Thus, there exists a color [k]\ell\in[k] and a set of kk nodes w1,w2,,wkw_{1},w_{2},\ldots,w_{k} adjacent to uu corresponding to the projection constraints π1,π2,,πk\pi^{\prime}_{1},\pi^{\prime}_{2},\ldots,\pi^{\prime}_{k} such that c(wi)=c^{\prime}(w_{i})=\ell for all i[k]i\in[k], and the kk sets πi(S(wi))\pi^{\prime}_{i}(S(w_{i})) are pairwise disjoint.

Using this, we can construct a set of vectors xi,j,1i,jk\textbf{x}^{i,j},1\leq i,j\leq k defined as xi,j=xwi,j\textbf{x}^{i,j}=\textbf{x}^{w_{i},j} which satisfy the following properties:

  1. 1.

    All these vectors are colored the same:

    c({wi,xi,j})=1i,jkc(\{w_{i},\textbf{x}^{i,j}\})=\ell\,\,\forall 1\leq i,j\leq k
  2. 2.

    For every i[k]i\in[k],

    {xii,1,xii,2,,xii,k}=[k]\{\textbf{x}^{i,1}_{i^{\prime}},\textbf{x}^{i,2}_{i^{\prime}},\ldots,\textbf{x}^{i,k}_{i^{\prime}}\}=[k]

    for every i[n]S(wi)i^{\prime}\in[n]\setminus S(w_{i}).

We claim that these set of vectors satisfy the condition in Equation 1. Fix a βΣR\beta\in\Sigma_{R}, and α1,α2,,αkΣL\alpha_{1},\alpha_{2},\ldots,\alpha_{k}\in\Sigma_{L} such that πi(αi)=β\pi^{\prime}_{i}(\alpha_{i})=\beta for all i[k]i\in[k]. As the family of subsets πi(S(wi))\pi^{\prime}_{i}(S(w_{i})) is a pairwise disjoint family, we can infer that there exists at most one i[k]i\in[k] such that αiS(wi)\alpha_{i}\in S(w_{i}). Note that if αiS(wi)\alpha_{i}\notin S(w_{i}), then

{xαii,j:j[k]}=[k].\{\textbf{x}_{\alpha_{i}}^{i,j}:j\in[k]\}=[k].

Thus, we have

|{(i,j)|xαii,j=p}|2kp[k].\left|\{(i,j)|\textbf{x}^{i,j}_{\alpha_{i}}=p\}\right|\leq 2k\,\,\forall p\in[k].

Thus, the set of vectors {(wi,xi,j):1i,jk}\{(w_{i},\textbf{x}^{i,j}):1\leq i,j\leq k\} is indeed an edge of EE^{\prime}. As all these vectors are colored the same color \ell, we have arrived at a contradiction to the fact that cc is a proper kk-coloring of HH.

Hence, we can conclude that lk(k1)l^{\prime}\leq k(k-1), and thus, |S(u)|k(k1)k3<k5|S(u)|\leq k(k-1)k^{3}<k^{5}. ∎

Now, consider the labeling σ:LΣL,\sigma:L\rightarrow\Sigma_{L}, where σ(v),vL\sigma(v),v\in L is chosen uniformly at random from S(v)S(v). Similarly, let σ:RΣR\sigma:R\rightarrow\Sigma_{R} is chosen uniformly at random from S(u),uRS(u),u\in R. Using Lemma 4.21, we can infer that for every edge e=(v,u)e=(v,u) in the Label Cover, this labeling satisfies the edge ee with probability at least 1|S(v)||S(u)|1k8\frac{1}{|S(v)||S(u)|}\geq\frac{1}{k^{8}}. By linearity of expectation, this labeling satisfies at least 1k8\frac{1}{k^{8}} fraction of the constraints in expectation. Hence, with positive probability, the labeling satisfies at least 1k8\frac{1}{k^{8}} fraction of the constraints. This concludes the proof of soundness that if HH has a proper kk coloring, then there exists a labeling to GG that satisfies at least 1k8\frac{1}{k^{8}} fraction of the constraints.

5 Vector Bin Covering

5.1 Hardness of Vector Bin Covering via Rainbow Coloring

As is the case with the Vector Scheduling problem, the hard instances for Vector Bin Covering are when the vectors are from {0,1}d\{0,1\}^{d}. In this setting, the Vector Bin Covering problem is closely related to the hypergraph rainbow coloring problem. A hypergraph H=(V,E)H=(V,E) is said to be kk-rainbow colorable if there is an assignment of kk colors to the vertices of HH such that in every edge, all the kk colors appear. When k=2k=2, it is equivalent to the standard 22-coloring of hypergraphs. Unlike the usual (hyper)graph coloring, rainbow coloring gets harder with larger number of colors.

In the approximate rainbow coloring problem, given a hypergraph that is promised to have a rainbow coloring with a large number of colors, the goal is to find a coloring in polynomial time using fewer number of colors. More formally, the computational problem we study is the following.

Definition 5.1.

(Approximate Rainbow Coloring) In the approximate rainbow coloring problem, the input is a hypergraph H=(V,E)H=(V,E) and a parameter k2k\geq 2. The objective is to distinguish between the following:

  1. 1.

    The hypergraph HH is kk-rainbow colorable.

  2. 2.

    The hypergraph HH has no valid 22-coloring.

We now give a simple reduction from approximate rainbow coloring to Vector Bin Covering.

Lemma 5.2.

Given a hypergraph H=(V,E)H=(V,E) and a parameter kk, there is a polynomial time reduction that outputs a Vector Bin Covering instance v1,v2,,vn{0,1}dv_{1},v_{2},\ldots,v_{n}\in\{0,1\}^{d} with n=|V|,d=|E|n=|V|,d=|E| such that

  1. 1.

    (Completeness.) If HH is kk-rainbow colorable, there is a partition of [n][n] into kk parts A1,A2,,AkA_{1},A_{2},\ldots,A_{k} such that

    jAivj1di[k]\sum_{j\in A_{i}}v_{j}\geq\textbf{1}^{d}\,\,\forall i\in[k]
  2. 2.

    (Soundness.) If HH is not 22-colorable, there is no partition of [n][n] into A1,A2A_{1},A_{2} such that

    jAivj1di[2]\sum_{j\in A_{i}}v_{j}\geq\textbf{1}^{d}\,\,\forall i\in[2]
Proof.

Let n=|V|,d=|E|n=|V|,d=|E|. We order the edges EE as E={e1,e2,,ed}E=\{e_{1},e_{2},\ldots,e_{d}\}. We output a set of vectors 𝒱={v1,v2,,vn}\mathcal{V}=\{v_{1},v_{2},\ldots,v_{n}\} where each vi{0,1}dv_{i}\in\{0,1\}^{d} is defined as follows:

(vi)j={1if iej0 otherwise.(v_{i})_{j}=\begin{cases}1&\text{if }i\in e_{j}\\ 0&\text{ otherwise.}\end{cases}

We analyze this reduction:

  1. 1.

    (Completeness.) Suppose that the hypergraph HH has a rainbow coloring with kk colors f:V[k]f:V\rightarrow[k]. We partition [n][n] into kk parts A1,A2,,AkA_{1},A_{2},\ldots,A_{k} such that

    Ai={j[n]:f(j)=i}A_{i}=\{j\in[n]:f(j)=i\}

    Consider an arbitrary integer i[k]i\in[k]. Note that for every edge ee in HH, eAiϕe\cap A_{i}\neq\phi. Thus,

    jAivj1d\sum_{j\in A_{i}}v_{j}\geq\textbf{1}^{d}
  2. 2.

    (Soundness.) Suppose that the hypergraph HH has no proper coloring with 22 colors. Then, we claim that there is no partition of [n][n] into two parts A1,A2A_{1},A_{2} such that

    jAivj1di[2]\sum_{j\in A_{i}}v_{j}\geq\textbf{1}^{d}\,\,\forall i\in[2]

    Suppose for contradiction that there exists A1,A2A_{1},A_{2} with the above property. Consider the coloring of the hypergraph f:V[2]f:V\rightarrow[2] as

    f(v)={1if vA12if vA2f(v)=\begin{cases}1&\text{if }v\in A_{1}\\ 2&\text{if }v\in A_{2}\end{cases}

    Consider an arbitrary edge el,l[d]e_{l},l\in[d] of the hypergraph HH. As jAi(vj)l1\sum_{j\in A_{i}}(v_{j})_{l}\geq 1 for all i[2]i\in[2], there exist v1,v2elv_{1},v_{2}\in e_{l} such that v1A1,v2A2v_{1}\in A_{1},v_{2}\in A_{2}. Thus, the coloring ff is a proper 22 coloring of the hypergraph HH, a contradiction. ∎

We combine this reduction with the hardness of approximate rainbow coloring to prove the hardness of Vector Bin Covering, namely Theorem 1.5. Note that the dimension of the resulting vectors in the Vector Bin Covering instance 𝒱\mathcal{V} is equal to the number of edges m=|E|m=|E| of the hypergraph HH, and the gap in the optimal Bin Covering value of 𝒱\mathcal{V} is equal to kk, the number of colors. Hence, to obtain better inapproximability results for Vector Bin Covering that grow with dd, our goal is to show the hardness of approximate rainbow coloring on hypergraphs with mm edges where the number of colors kk is as large a function of mm as possible. Towards this, we prove that it is NP-hard to 22-color a hypergraph with mm edges that is promised to be rainbow colorable with k=Ω(logmloglogm)k=\Omega\left(\frac{\log m}{\log\log m}\right) colors.

Theorem 5.3.

Given a hypergraph HH with mm edges, it is NP-hard to distinguish between the following:

  1. 1.

    (Completeness) HH is kk-rainbow colorable.

  2. 2.

    (Soundness) HH is not 22-colorable.

where k=Ω(logmloglogm)k=\Omega\left(\frac{\log m}{\log\log m}\right).

We defer the proof of Theorem 5.3 to Section 5.2.

We now prove the hardness of Vector Bin Covering using Theorem 5.3.

Proof of Theorem 1.5.

Using Theorem 5.3 combined with the reduction in Lemma 5.2, we get that the following problem is NP-hard. Given a set of nn vectors v1,v2,,vn{0,1}dv_{1},v_{2},\ldots,v_{n}\in\{0,1\}^{d}, distinguish between

  1. 1.

    𝒱\mathcal{V} can be partitioned into k=Ω(logdloglogd)k=\Omega\left(\frac{\log d}{\log\log d}\right) parts such that in each part, the sum of vectors is at least 11 in every coordinate.

  2. 2.

    𝒱\mathcal{V} cannot be partitioned into 22 parts such that in each part, the sum of vectors is at least 11 in every coordinate. In other words, the maximum number of parts into which 𝒱\mathcal{V} can be partitioned such that in each part, the sum of vectors is at least 11 in every coordinate is equal to 11.

Thus, it is NP-hard to approximate dd-dimensional Vector Bin Covering within k=Ω(logdloglogd)k=\Omega\left(\frac{\log d}{\log\log d}\right). ∎

5.2 Proof of Theorem 5.3

Our proof follows by viewing the hypergraph rainbow coloring problem as a promise constraint satisfaction problem(PCSP) and analyzing its polymorphisms [AGH17, BG16, BKO19]. The idea is to prove that the polymorphisms have a small number of “important” coordinates which can then be decoded in the Label Cover-Long Code framework. For the case of the above rainbow coloring PCSP, we prove that the polymorphisms are 11-fixing in that there is a single coordinate which when set to a certain value fixes the value of the function. This characterization then implies the hardness of the approximate rainbow coloring.

For ease of readability, we skip defining polymorphisms formally, and instead present the proof as a simple gadget reduction from Label Cover. We first need a definition.

Definition 5.4.

(11-fixing [BG16, GS20]) A function f:[k]n{0,1}f:[k]^{n}\rightarrow\{0,1\} is said to be 11-fixing if there exists an index [n]\ell\in[n] and values α,β[k]\alpha,\beta\in[k] such that

f(x)=0x:x=αandf(x)=1x:x=βf(\textbf{x})=0\,\forall\textbf{x}:\textbf{x}_{\ell}=\alpha\quad\text{and}\quad f(\textbf{x})=1\,\forall\textbf{x}:\textbf{x}_{\ell}=\beta

In the analysis of the gadget, we need a definition and a lemma from [ABP20].

Definition 5.5.

(The hypergraph Hrn[k]H_{r}^{n}[k]) The hypergraph Hrn[k]=(V,E)H_{r}^{n}[k]=(V,E) is a kk-uniform hypergraph with vertex set as the set of nn-dimensional vectors over [k][k] i.e. V=[k]nV=[k]^{n}. A set of kk vectors v1,v2,,vk\textbf{v}^{1},\textbf{v}^{2},\ldots,\textbf{v}^{k} form an edge of the hypergraph if

i=1n|[k]{vij:j[k]}|r\sum_{i=1}^{n}\left|[k]\setminus\{\textbf{v}^{j}_{i}:j\in[k]\}\right|\leq r
Lemma 5.6.

For every k2k\geq 2, the hypergraph Hk2n[k]H_{\lfloor\frac{k}{2}\rfloor}^{n}[k] is not 22-colorable.

We analyze the gadget used in our reduction.

Lemma 5.7.

Fix k3k\geq 3. Suppose f:[k]n{0,1}f:[k]^{n}\rightarrow\{0,1\} satisfies the below two-coloring property: For every 2k2k vectors v1,v2,,v2k[k]n\textbf{v}^{1},\textbf{v}^{2},\ldots,\textbf{v}^{2k}\in[k]^{n} with

{vij:j[2k]}=[k]i[n],\{\textbf{v}^{j}_{i}:j\in[2k]\}=[k]\,\forall i\in[n],

we have

{f(vj):j[2k]}={0,1}.\{f(\textbf{v}^{j}):j\in[2k]\}=\{0,1\}.

Then, ff is 11-fixing.

Proof.

We first prove that there exist [n],α[k]\ell\in[n],\alpha\in[k], b{0,1}b\in\{0,1\} such that f(x)=bf(\textbf{x})=b for all x[k]n\textbf{x}\in[k]^{n} with x=α\textbf{x}_{\ell}=\alpha. Suppose for contradiction that this is not the case. Then, for every i[n],j[k]i\in[n],j\in[k] there exist vectors xi,j,yi,j[k]n\textbf{x}^{i,j},\textbf{y}^{i,j}\in[k]^{n} such that xii,j=yii,j=j\textbf{x}^{i,j}_{i}=\textbf{y}^{i,j}_{i}=j, and f(xi,j)=0f(\textbf{x}^{i,j})=0 where as f(yi,j)=1f(\textbf{y}^{i,j})=1.

Let r=k2r=\lfloor\frac{k}{2}\rfloor. We view f:[k]n{0,1}f:[k]^{n}\rightarrow\{0,1\} as an assignment of two colors to the vertices of the hypergraph Hrn[k]H_{r}^{n}[k]. As the hypergraph is not two colorable (Lemma 5.6), we can infer that there is an edge of Hrn[k]H_{r}^{n}[k] all of whose vertices are assigned the same color. In other words, there exist kk vectors v1,v2,,vk[k]n\textbf{v}^{1},\textbf{v}^{2},\ldots,\textbf{v}^{k}\in[k]^{n} and b{0,1}b\in\{0,1\} such that f(vj)=bf(\textbf{v}^{j})=b for all j[k]j\in[k]. Furthermore, there are at most rr missing values in these vectors i.e.

i=1n|[k]{vij:j[k]}|r\sum_{i=1}^{n}\left|[k]\setminus\{\textbf{v}^{j}_{i}:j\in[k]\}\right|\leq r

Now, we pick rr vectors u1,u2,,ur\textbf{u}^{1},\textbf{u}^{2},\ldots,\textbf{u}^{r} (with repetitions if needed) by filling the missing values using xi,j,yi,j\textbf{x}^{i,j},\textbf{y}^{i,j} vectors such that

  1. 1.

    f(uj)=bf(\textbf{u}^{j})=b for all j[r]j\in[r].

  2. 2.

    For every i[n]i\in[n],

    {vij:j[k]}{uij:j[r]}=[k]\{\textbf{v}^{j}_{i}:j\in[k]\}\cup\{\textbf{u}^{j}_{i}:j\in[r]\}=[k]

By taking the union of {v1,v2,,vk}\{\textbf{v}^{1},\textbf{v}^{2},\ldots,\textbf{v}^{k}\} and {u1,u2,,ur}\{\textbf{u}^{1},\textbf{u}^{2},\ldots,\textbf{u}^{r}\}, and repeating some vectors, we obtain 2k2k vectors w1,w2,,w2k\textbf{w}^{1},\textbf{w}^{2},\ldots,\textbf{w}^{2k} with f(wj)=bf(\textbf{w}^{j})=b for all j[2k]j\in[2k], and

{wij:j[2k]}=[k]i[n]\{\textbf{w}^{j}_{i}:j\in[2k]\}=[k]\,\,\forall i\in[n]

However, this contradicts the two-coloring property of ff. Thus, there exist [n],α[k],b{0,1}\ell\in[n],\alpha\in[k],b\in\{0,1\} such that f(x)=bf(\textbf{x})=b for all x[k]n\textbf{x}\in[k]^{n} with x=α\textbf{x}_{\ell}=\alpha.

We now claim that there exists β[k]\beta\in[k] such that f(x)=1bf(\textbf{x})=1-b for all x[k]n\textbf{x}\in[k]^{n} with x=β\textbf{x}_{\ell}=\beta. Suppose for contradiction that this is not the case. Then, there exist kk vectors v1,v2,,vk\textbf{v}^{1},\textbf{v}^{2},\ldots,\textbf{v}^{k} such that vj=j\textbf{v}^{j}_{\ell}=j for all j[k]j\in[k], and f(vj)=bf(\textbf{v}^{j})=b for all j[k]j\in[k]. We now pick vk+1,vk+2,,v2k[k]n\textbf{v}^{k+1},\textbf{v}^{k+2},\ldots,\textbf{v}^{2k}\in[k]^{n} such that vj=α\textbf{v}^{j}_{\ell}=\alpha for all j{k+1,k+2,,2k}j\in\{k+1,k+2,\ldots,2k\}, and vij=jk\textbf{v}^{j}_{i}=j-k for all i[n]i\in[n] with ii\neq\ell, and j{k+1,k+2,,2k}j\in\{k+1,k+2,\ldots,2k\}. These 2k2k vectors v1,v2,,v2k\textbf{v}^{1},\textbf{v}^{2},\ldots,\textbf{v}^{2k} satisfy

  1. 1.

    f(vj)=bf(\textbf{v}^{j})=b for all j[2k]j\in[2k].

  2. 2.

    For every i[n]i\in[n],

    {vij:j[2k]}=[k]\{\textbf{v}^{j}_{i}:j\in[2k]\}=[k]

contradicting the two-coloring property of ff. Thus, there exists β[k]\beta\in[k] such that f(x)=1bf(\textbf{x})=1-b for all x[k]n\textbf{x}\in[k]^{n} with x=β\textbf{x}_{\ell}=\beta, completing the proof that ff is 11-fixing. ∎

We are now ready to prove Theorem 5.3. Our hardness result is obtained using a reduction from the Label Cover problem. This reduction is standard in the PCSP literature.(See e.g.,  [BKO19])

Reduction. We start with the Label Cover instance G=(V=LR),E,Σ=ΣL=ΣR,Π)G=(V=L\cup R),E,\Sigma=\Sigma_{L}=\Sigma_{R},\Pi) from Theorem 2.6 and output a hypergraph H=(V,E)H=(V^{\prime},E^{\prime}). Let nn denote the label size n=|Σ|n=|\Sigma|. For each vertex vLRv\in L\cup R, we have a long code containing a set of nodes KvK_{v} of size [k]n[k]^{n}, indexed by nn length vectors.

  1. 1.

    The vertex set of the hypergraph VV^{\prime} is the union of all the long code nodes.

    V=vVKvV^{\prime}=\bigcup_{v\in V}K_{v}
  2. 2.

    Edges of the hypergraph: For every vertex vVv\in V of the Label Cover instance, we add an edge in EE^{\prime} for each set of 2k2k vectors {v1,v2,,v2k}\{\textbf{v}^{1},\textbf{v}^{2},\ldots,\textbf{v}^{2k}\} in KvK_{v}, if

    {vij:j[2k]}=[k]i[n].\{\textbf{v}^{j}_{i}:j\in[2k]\}=[k]\,\forall i\in[n]. (2)

    The number of edges in HH is at most

    |E||V|(k|Σ|2k)|V|kO(k)|E^{\prime}|\leq|V|\binom{k^{|\Sigma|}}{2k}\leq|V|k^{O(k)}
  3. 3.

    Equality constraints: For every constraint Πe:uv\Pi_{e}:u\rightarrow v of the Label Cover, we add a set of equality constraints between nodes xKu\textbf{x}\in K_{u}, yKv\textbf{y}\in K_{v} if for all i[n]i\in[n], xi=yΠe(i)\textbf{x}_{i}=\textbf{y}_{\Pi_{e}(i)}. By adding an equality constraint between two nodes, we identify the two nodes together and treat it as a single node. That is, we compute the connected components of the equality constraints graph and identify a single master node for each component. We then obtain a multi-hypergraph H1H_{1} from HH by replacing each node with the corresponding master node. However, a vertex could appear multiple times in an edge in H1H_{1}. We delete such occurrences from H1H_{1} by setting each edge to be a simple set of the vertices contained in it, and obtain the final hypergraph H2H_{2}. We note the following:

    1. (a)

      There exists a kk-rainbow coloring of HH, f:V[k]f:V^{\prime}\rightarrow[k] that respects the equality constraints i.e. f(x)=f(y)f(\textbf{x})=f(\textbf{y}) for all pairs of nodes x,y\textbf{x},\textbf{y} with equality constraints between them if and only if H2H_{2} is kk-rainbow colorable.

    2. (b)

      Similarly, there exists a 22-coloring of HH that respects equality constraints if and only if H2H_{2} is 22-colorable.

    Finally, the number of edges in H2H_{2} is at most the number of edges in HH.

Completeness. Suppose that there is a labeling σ:VΣ\sigma:V\rightarrow\Sigma that satisfies all the constraints. We define the coloring f:V[k]f:V^{\prime}\rightarrow[k] of HH as follows. For every node xKv\textbf{x}\in K_{v}, we set the dictatorship function

f(x)=xσ(v)f(\textbf{x})=\textbf{x}_{\sigma(v)}

By the constraints added in Equation 2, the function ff is a valid kk-rainbow coloring of HH. As σ\sigma satisfies all the constraints of the Label Cover, the coloring ff satisfies all the equality constraints.

Soundness. Suppose that there is no labeling σ:VΣ\sigma:V\rightarrow\Sigma that satisfies all the constraints in GG. Then we claim that there is no 22-coloring of HH that respects all the equality constraints. Suppose for contradiction that there is a 22-coloring f:V{0,1}f:V^{\prime}\rightarrow\{0,1\} that respects all the equality constraints.

Consider a vertex vVv\in V. The function fv:[k]n{0,1}f_{v}:[k]^{n}\rightarrow\{0,1\}, defined as ff on KvK_{v} satisfies the conditions in Lemma 5.7. Thus, fvf_{v} is 11-fixing for every vVv\in V. Hence, there is a function L:VΣL:V\rightarrow\Sigma such that for every vVv\in V, fvf_{v} is 11-fixing on the coordinate L(v)L(v). We now claim that the labeling σ:VΣ\sigma:V\rightarrow\Sigma defined as σ(v)=L(v)\sigma(v)=L(v) satisfies all the constraints in GG.

Consider an edge e=(u,v)e=(u,v), uL,vRu\in L,v\in R with the projection constraint Πe:ΣΣ\Pi_{e}:\Sigma\rightarrow\Sigma. Our goal is to show that Πe(L(u))=L(v)\Pi_{e}(L(u))=L(v). Suppose for contradiction that Πe(L(u))L(v)\Pi_{e}(L(u))\neq L(v). By the 11-fixing property of fuf_{u}, we have αu,βu[k]\alpha_{u},\beta_{u}\in[k] such that

fu(x)=0x[k]n:xL(u)=αuandfu(x)=1x[k]n:xL(u)=βuf_{u}(\textbf{x})=0\,\forall\textbf{x}\in[k]^{n}:\textbf{x}_{L(u)}=\alpha_{u}\quad\text{and}\quad f_{u}(\textbf{x})=1\,\forall\textbf{x}\in[k]^{n}:\textbf{x}_{L(u)}=\beta_{u}

Similarly, we have αv,βv[k]\alpha_{v},\beta_{v}\in[k] such that

fv(y)=0y[k]n:yL(v)=αvandfv(y)=1y[k]n:yL(v)=βvf_{v}(\textbf{y})=0\,\forall\textbf{y}\in[k]^{n}:\textbf{y}_{L(v)}=\alpha_{v}\quad\text{and}\quad f_{v}(\textbf{y})=1\,\forall\textbf{y}\in[k]^{n}:\textbf{y}_{L(v)}=\beta_{v}

By the equality constraints, fu(x)=fv(y)f_{u}(\textbf{x})=f_{v}(\textbf{y}) for all x,y[k]n\textbf{x},\textbf{y}\in[k]^{n} such that xi=yΠe(i)i[n]\textbf{x}_{i}=\textbf{y}_{\Pi_{e}(i)}\,\forall i\in[n]. Let y[k]n\textbf{y}^{\prime}\in[k]^{n} be an arbitrary vector with yΠe(L(u))=αu,yL(v)=βv\textbf{y}^{\prime}_{\Pi_{e}(L(u))}=\alpha_{u},\textbf{y}^{\prime}_{L(v)}=\beta_{v}. We choose x[k]n\textbf{x}^{\prime}\in[k]^{n} such that for all i[n]i\in[n], xi=yΠe(i)\textbf{x}^{\prime}_{i}=\textbf{y}^{\prime}_{\Pi_{e}(i)}. Note that xL(u)=αu\textbf{x}^{\prime}_{L(u)}=\alpha_{u}. Thus, fu(x)=0f_{u}(\textbf{x}^{\prime})=0 where as fv(y)=1f_{v}(\textbf{y}^{\prime})=1. However, this contradicts the equality constraints.

6 Conclusion

We conclude by mentioning a few open problems.

  1. 1.

    A drawback of our hardness result for Vector Bin Packing is that our lower bound is only applicable when dd is large enough. The reason our hardness result needs dd to be large enough is that the kk-set cover hardness itself [Tre01] needs kk to be large enough. By starting our reduction with alternate set cover hardness results such as the 33-dimensional matching problem, we can obtain improved hardness for smaller values of dd. However, this approach will still not help for d=2d=2 as the upper bound on the packing dimension of these set families is greater than 22. It is an interesting open problem to characterize set families with packing dimension 22. We believe that such a characterization could help in proving the hardness of approximation results for the set cover problem on set families with packing dimension 22, which directly gives improved hardness of 22-dimensional Vector Bin Packing where there is a significant gap between the 1.4061.406 factor algorithm [BEK16] and the hardness factor of 1.001671.00167 [Ray21].

  2. 2.

    dd-dimensional Geometric Bin Packing is another open problem where packing dimension based ideas could help. The best hardness result for the dd-dimensional Geometric Bin Packing is still the (tiny) hardness factor from the 22-dimensional setting. A possible avenue to obtain improved inapproximability for this problem is by a reduction from a suitable set cover variant using a notion of packing dimension. However, this is easier said than done as the geometric packings are significantly harder to tame–for example, it is NP-hard to decide if a given set of nn rectangles can fit in a unit square, while the corresponding problem for Vector Bin Packing is trivial.

  3. 3.

    Another interesting open problem is to close the gap between O(logd)O(\log d) algorithm and Ω(logdloglogd)\Omega\left(\frac{\log d}{\log\log d}\right) hardness for Vector Bin Covering. For the rainbow coloring, the question is: Given a hypergraph HH with mm edges that is promised to be f(m)f(m)-rainbow colorable, can we 22-color it in polynomial time? The answer to this question when f(m)f(m) is equal to O(logm)O(\log m) is yes, by a simple random 22-coloring. By Theorem 5.3, the answer is no when f(m)=Ω(logmloglogm)f(m)=\Omega\left(\frac{\log m}{\log\log m}\right). Which of these is tight? Bhattiprolu, Guruswami, and Lee [BGL15] proved that in certain settings, the simple random coloring is optimal for rainbow coloring. This suggests that perhaps even here, the problem is hard when f(m)=o(logm)f(m)=o(\log m). On the other hand, obtaining any hardness result beyond Ω(logmloglogm)\Omega\left(\frac{\log m}{\log\log m}\right) seems to be impossible with the Label Cover-Long Code framework.

Acknowledgements

I am greatly indebted to Venkatesan Guruswami for helpful discussions, for his detailed feedback on the manuscript which significantly improved the presentation, and for his encouragement. I also thank Nikhil Bansal, Varun Gupta, Ravishankar Krishnaswamy, and Janani Sundaresan for helpful discussions and feedback on the manuscript. I am especially grateful to Ravishankar Krishnaswamy for pointing me to [KAR00].

References

  • [AAC+98] Noga Alon, Yossi Azar, János Csirik, Leah Epstein, Sergey V. Sevastianov, Arjen P. A. Vestjens, and Gerhard J. Woeginger. On-line and off-line approximation algorithms for vector covering problems. Algorithmica, 21(1):104–118, 1998.
  • [ABP19] Per Austrin, Amey Bhangale, and Aditya Potukuchi. Simplified inpproximability of hypergraph coloring via t-agreeing families. CoRR, abs/1904.01163, 2019.
  • [ABP20] Per Austrin, Amey Bhangale, and Aditya Potukuchi. Improved inapproximability of rainbow coloring. In Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, pages 1479–1495, 2020.
  • [ACKS13] Yossi Azar, Ilan Reuven Cohen, Seny Kamara, and F. Bruce Shepherd. Tight bounds for online vector bin packing. In Symposium on Theory of Computing Conference, STOC’13, pages 961–970, 2013.
  • [AGH17] Per Austrin, Venkatesan Guruswami, and Johan Håstad. (2+ϵ\epsilon)-sat is NP-hard. SIAM J. Comput., 46(5):1554–1573, 2017.
  • [AJKL84] Susan F. Assmann, David S. Johnson, Daniel J. Kleitman, and Joseph Y.-T. Leung. On a dual version of the one-dimensional bin packing problem. J. Algorithms, 5(4):502–525, 1984.
  • [ALM+98] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555, 1998.
  • [Ban17] Nikhil Bansal. Scheduling open problems: Old and new. MAPSP 2017, www.mapsp2017.ma.tum.de/MAPSP2017-Bansal.pdf, 2017.
  • [BCS09] Nikhil Bansal, Alberto Caprara, and Maxim Sviridenko. A new approximation method for set covering problems, with applications to multidimensional bin packing. SIAM J. Comput., 39(4):1256–1278, 2009.
  • [BEK16] Nikhil Bansal, Marek Eliáš, and Arindam Khan. Improved approximation for vector bin packing. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, pages 1561–1579, 2016.
  • [BG16] Joshua Brakensiek and Venkatesan Guruswami. New hardness results for graph and hypergraph colorings. In 31st Conference on Computational Complexity, CCC 2016, pages 14:1–14:27, 2016.
  • [BGL15] Vijay V. S. P. Bhattiprolu, Venkatesan Guruswami, and Euiwoong Lee. Approximate hypergraph coloring under low-discrepancy and related promises. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2015, pages 152–174, 2015.
  • [Bha18] Amey Bhangale. NP-Hardness of coloring 2-colorable hypergraph with poly-logarithmically many colors. In 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, pages 15:1–15:11, 2018.
  • [BK14] Nikhil Bansal and Arindam Khan. Improved approximation algorithm for two-dimensional bin packing. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, pages 13–25, 2014.
  • [BKO19] Jakub Bulín, Andrei A. Krokhin, and Jakub Oprsǎl. Algebraic approach to promise constraint satisfaction. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, pages 602–613, 2019.
  • [BOVvdZ16] Nikhil Bansal, Tim Oosterwijk, Tjark Vredeveld, and Ruben van der Zwaan. Approximating vector scheduling: Almost matching upper and lower bounds. Algorithmica, 76(4):1077–1096, 2016.
  • [BS04] Nikhil Bansal and Maxim Sviridenko. New approximability and inapproximability results for 2-dimensional bin packing. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2004, pages 196–203, 2004.
  • [CC06] Miroslav Chlebík and Janka Chlebíková. Inapproximability results for orthogonal rectangle packing problems with rotations. In Algorithms and Complexity, 6th Italian Conference, CIAC 2006, pages 199–210, 2006.
  • [CFS08] L. Sunil Chandran, Mathew C. Francis, and Naveen Sivadasan. Boxicity and maximum degree. J. Comb. Theory, Ser. B, 98(2):443–445, 2008.
  • [CJK01] János Csirik, David S. Johnson, and Claire Kenyon. Better approximation algorithms for bin covering. In Proceedings of the Twelfth Annual Symposium on Discrete Algorithms, 2001, pages 557–566, 2001.
  • [CK04] Chandra Chekuri and Sanjeev Khanna. On multidimensional packing problems. SIAM J. Comput., 33(4):837–851, 2004.
  • [CKPT17] Henrik I. Christensen, Arindam Khan, Sebastian Pokutta, and Prasad Tetali. Approximation and online algorithms for multidimensional bin packing: A survey. Comput. Sci. Rev., 24:63–79, 2017.
  • [dlVL81] Wenceslas Fernandez de la Vega and George S. Lueker. Bin packing can be solved within 1+epsilon in linear time. Comb., 1(4):349–355, 1981.
  • [DS14] Irit Dinur and David Steurer. Analytical approach to parallel repetition. In Symposium on Theory of Computing, STOC 2014, pages 624–633, 2014.
  • [Fei98] Uriel Feige. A threshold of ln n for approximating set cover. J. ACM, 45(4):634–652, 1998.
  • [FK15] Alan Frieze and Michał Karoński. Introduction to Random Graphs. Cambridge University Press, 2015.
  • [FKT89] Ulrich Faigle, Walter Kern, and György Turán. On the performance of on-line algorithms for partition problems. Acta Cybern., 9(2):107–119, 1989.
  • [GGJY76] M. R. Garey, Ronald L. Graham, David S. Johnson, and Andrew Chi-Chih Yao. Resource constrained scheduling as generalized bin packing. J. Comb. Theory, Ser. A, 21(3):257–298, 1976.
  • [GJ79] M. R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
  • [GL18] Venkatesan Guruswami and Euiwoong Lee. Strong inapproximability results on balanced rainbow-colorable hypergraphs. Comb., 38(3):547–599, 2018.
  • [GS20] Venkatesan Guruswami and Sai Sandeep. Rainbow coloring hardness via low sensitivity polymorphisms. SIAM J. Discret. Math., 34(1):520–537, 2020.
  • [HS87] Dorit S. Hochbaum and David B. Shmoys. Using dual approximation algorithms for scheduling problems theoretical and practical results. J. ACM, 34(1):144–162, 1987.
  • [HS19] David G. Harris and Aravind Srinivasan. The Moser-Tardos framework with partial resampling. J. ACM, 66(5):36:1–36:45, 2019.
  • [IKKP19] Sungjin Im, Nathaniel Kell, Janardhan Kulkarni, and Debmalya Panigrahi. Tight bounds for online vector scheduling. SIAM J. Comput., 48(1):93–121, 2019.
  • [Jan10] Klaus Jansen. An EPTAS for scheduling jobs on uniform processors: Using an MILP relaxation with a constant number of integral variables. SIAM J. Discret. Math., 24(2):457–485, 2010.
  • [JS03] Klaus Jansen and Roberto Solis-Oba. An asymptotic fully polynomial time approximation scheme for bin covering. Theor. Comput. Sci., 306(1-3):543–551, 2003.
  • [Juh82] Ferenc Juhász. The asymptotic behaviour of Lovász’ theta-function for random graphs. Comb., 2(2):153–155, 1982.
  • [KAR00] V. S. Anil Kumar, Sunil Arya, and H. Ramesh. Hardness of set cover with intersection 1. In Automata, Languages and Programming, 27th International Colloquium, ICALP 2000, pages 624–635, 2000.
  • [Kho01] Subhash Khot. Improved inaproximability results for maxclique, chromatic number and approximate graph coloring. In 42nd Annual Symposium on Foundations of Computer Science, FOCS 2001, pages 600–609, 2001.
  • [Knu94] Donald E. Knuth. The sandwich theorem. Electron. J. Comb., 1, 1994.
  • [LY94] Carsten Lund and Mihalis Yannakakis. On the hardness of approximating minimization problems. J. ACM, 41(5):960–981, 1994.
  • [Mos15] Dana Moshkovitz. The projection games conjecture and the NP-hardness of ln n-approximating set-cover. Theory Comput., 11:221–235, 2015.
  • [MR10] Dana Moshkovitz and Ran Raz. Two-query PCP with subconstant error. J. ACM, 57(5):29:1–29:29, 2010.
  • [MRT13] Adam Meyerson, Alan Roytman, and Brian Tagiku. Online multidimensional load balancing. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2013, pages 287–302, 2013.
  • [PTUW11] Rina Panigrahy, Kunal Talwar, Lincoln Uyeda, and Udi Wieder. Heuristics for vector bin packing. 2011.
  • [Ray21] Arka Ray. There is no APTAS for 2-dimensional vector bin packing: Revisited. CoRR, abs/2104.13362, 2021.
  • [Raz98] Ran Raz. A parallel repetition theorem. SIAM J. Comput., 27(3):763–803, 1998.
  • [Spi94] Frits CR Spieksma. A branch-and-bound algorithm for the two-dimensional vector packing problem. Computers & operations research, 21(1):19–25, 1994.
  • [ST12] Hadas Shachnai and Tami Tamir. Approximation schemes for generalized two-dimensional vector packing with application to data placement. J. Discrete Algorithms, 10:35–48, 2012.
  • [Tre01] Luca Trevisan. Non-approximability results for optimization problems on bounded degree instances. In Proceedings on 33rd Annual ACM Symposium on Theory of Computing, 2001, pages 453–461, 2001.
  • [Woe97] Gerhard J. Woeginger. There is no asymptotic PTAS for two-dimensional vector packing. Inf. Process. Lett., 64(6):293–297, 1997.

Appendix A Hardness of simple kk-set cover

The hardness result of Kumar, Arya, and Ramesh [KAR00] is obtained from the Label Cover problem using a partition gadget along the lines of the reduction of Lund and Yannakakis [LY94]. The set families in the reduction in  [LY94] have large intersections.  [KAR00] get around this by using two main ideas:

  1. 1.

    They use a different partition system wherein each partition is a disjoint union of a large (super constant) number of sets instead of just 22 sets in  [LY94].

  2. 2.

    They use multiple sets for each label assignment to a vertex of the Label Cover, unlike a single set corresponding to each label of each vertex in [LY94].

As  [KAR00] were proving a Ω(logn)\Omega(\log n) hardness of the set cover, the universe size of the partition system is chosen to be the same as the number of vertices in the Label Cover instance. This forces the set sizes to be very large. We can get around this issue by simply defining the partition system on a set of size BB, where BB is a large constant. This also has an added benefit that we no longer require sub-constant hardness from the Label Cover instances, thus giving us NP-hardness directly. This observation is used by Trevisan [Tre01] to obtain lnBO(lnlnB)\ln B-O(\ln\ln B) NP-hardness of set cover on instances where each set has cardinality at most BB, from Feige’s (1ϵ)lnn(1-\epsilon)\ln n set cover hardness [Fei98].

We now describe the parameter modifications in full detail. Let BB be a large constant.

We start our reduction from Label Cover instances with soundness γ=132β2log2B\gamma=\frac{1}{32\beta^{2}\log^{2}B} where β\beta is an absolute constant to be fixed later.

Theorem A.1.

[ALM+98, Raz98]) Given a Label Cover instance defined on a bipartite graph G=(V,E)G=(V,E) with left alphabet ΣL\Sigma_{L} and right alphabet ΣR\Sigma_{R}, it is NP-hard to distinguish between the following:

  1. 1.

    (Completeness). There exists a labeling σ:VΣLΣR\sigma:V\rightarrow\Sigma_{L}\cup\Sigma_{R} that satisfies all the constraints.

  2. 2.

    (Soundness). No labeling to VV can satisfy more than γ\gamma fraction of the constraints.

Furthermore the instances satisfy the following properties:

  1. 1.

    The alphabet sizes d=|ΣL|d=|\Sigma_{L}| and d=|ΣR|d^{\prime}=|\Sigma_{R}| are both upper bounded by (logB)O(1)(\log B)^{O(1)}.

  2. 2.

    The maximum degree degdeg of GG is upper bounded by (logB)O(1)(\log B)^{O(1)}.

Following the convention in [KAR00], we assume that the number of vertices on the left side in GG is equal to that on the right side of GG, and we denote this number by nn^{\prime}.

We now construct a partition system 𝒫\mathcal{P} on a universe NN of size BB. The system 𝒫\mathcal{P} has d×(deg+1)×dd^{\prime}\times(deg+1)\times d partitions. Each partition has m=B1ϵm=B^{1-\epsilon} parts, where ϵ\epsilon is a small constant to be fixed later. The partition system is divided into dd^{\prime} groups each containing (deg+1)×d(deg+1)\times d partitions. Each group is further organized into deg+1deg+1 subgroups each of which contains dd partitions. Let Pg,s,pP_{g,s,p} denote the ppth partition in the ssth subgroup of the ggth group and Pg,s,p,kP_{g,s,p,k} denote the kkth set in Pg,s,pP_{g,s,p} where g[d],s[deg+1],p[d],k[m]g\in[d^{\prime}],s\in[deg+1],p\in[d],k\in[m]. The partition system satisfies the four properties in Section 44 of [KAR00], the only difference being that the universe NN now has size BB instead of nn^{\prime}. Thus, the covering property (Property 44 in [KAR00]) now states that any covering of NN with βmlogB\beta m\log B sets should contain at least 3m4\frac{3m}{4} sets from the same partition. Such a partition system is shown to exist for large enough BB in [KAR00] using a randomized construction. They also derandomize the construction. But for our setting, as BB is a constant, we just need to show the existence of such a partition system.

We reduce the Label Cover instance in Theorem A.1 to a set cover instance 𝒮𝒞\mathcal{SC} by the same construction as in [KAR00]: we have a partition system corresponding to each edge of the Label Cover instance, and the union of the elements in the partition systems is the element set of 𝒮𝒞\mathcal{SC}. The sets in 𝒮𝒞\mathcal{SC}, Ck(v,a)C_{k}(v,a) are defined exactly as in [KAR00]. The cardinality of each set is at most B=deg×BB2B^{\prime}=deg\times B\leq B^{2}. Each element is present in at most md=O(B)md=O(B) sets. The fact that 𝒮𝒞\mathcal{SC} is a simple set system follows from Lemma 1 of [KAR00]. By Lemma 2 in [KAR00], if there is a labeling of the Label Cover instance, then there is a set cover of size nmn^{\prime}m in 𝒮𝒞\mathcal{SC}. If there is a set cover of size β2nmlogB\frac{\beta}{2}n^{\prime}m\log B in 𝒮𝒞\mathcal{SC}, then there is a labeling of GG that satisfies γ\gamma fraction of constraints. The proof of this soundness follows along the same lines as Lemma 33 of [KAR00], with the only difference being that we now define the good edges as edges having #(e)βmlogB\#(e)\leq\beta m\log B.

Appendix B SDP Relaxation of Monochromatic-Clique

We consider the following SDP relaxation of the graph coloring problem on G=(V,E)G=(V,E):

Minimize k\displaystyle k
ui,ui\displaystyle\langle u_{i},u_{i}\rangle =1iV\displaystyle=1\,\,\forall i\in V
ui,uj\displaystyle\langle u_{i},u_{j}\rangle 1k1(i,j)E\displaystyle\leq\frac{-1}{k-1}\,\,\forall(i,j)\in E

The optimal solution to this SDP is referred to as the vector chromatic number χv(G)\chi_{v}(G) of the graph GG. It is equivalent to the Lovasz theta function of the complement of GG. We have the following sandwich property due to [Knu94]:

ω(G)χv(G)χ(G)\omega(G)\leq\chi_{v}(G)\leq\chi(G)

B.1 Algorithm when B>nB>\sqrt{n}

There is a simple algorithm for the Monochromatic-Clique(k,B)(k,B) problem when B>kB>k: We compute χv(G)\chi_{v}(G) in polynomial time, and we check if χv(G)k\chi_{v}(G)\leq k. In this case, there is no clique of size BB in GG, and we output YES. If χv(G)>k\chi_{v}(G)>k, then the graph cannot be colored with kk colors, and in this case, we output NO.

Note that if k(B1)nk(B-1)\geq n, there is always an assignment of kk colors to the vertices of the graph without a clique of size BB, thus the problem is trivial.

B.2 Integrality gap

The above algorithm proves that in any graph with vector chromatic number at most kk, there is an assignment of kk colors to the vertices that has monochromatic clique of size at most n\sqrt{n}. We now prove that this cannot be significantly improved:

Theorem B.1.

For nn large enough, there exists a graph G=(V,E)G=(V,E) with nn vertices, and a parameter kk such that

  1. 1.

    χv(G)k\chi_{v}(G)\leq k.

  2. 2.

    In any assignment of kk colors to the vertices of GG, there is a monochromatic clique of size nΩ(1)n^{\Omega(1)}.

Proof.

We first prove the following: for large enough nn, there exists a graph GG on nn vertices, and an integer kk such that

  1. 1.

    ϑ(G)k\upvartheta(G)\leq k.

  2. 2.

    In any assignment of kk colors to the vertices of the graph GG, there exists a monochromatic independent set of size B=nΩ(1)B=n^{\Omega(1)}.

Our construction is a probabilistic one: we sample GG from G(n,p)G(n,p) with p=1np=\frac{1}{\sqrt{n}}. It has been proved [Juh82] that the Lovasz theta function of this random graph satisfies

ϑ(G)2n34+O~(n13logn)\upvartheta(G)\leq 2n^{\frac{3}{4}}+\tilde{O}(n^{\frac{1}{3}}\log n)

with high probability. We set k=3n34k=3n^{\frac{3}{4}}. For large enough nn, with high probability, we have ϑ(G)k\upvartheta(G)\leq k.

Furthermore, the random graph G(n,p)G(n,p) with p=o(n25)p=o(n^{-\frac{2}{5}}) has no K6K_{6} with high probability(See e.g., [FK15]). Thus, using Lemma 4.2, we can infer that in any subset of size n143\frac{n^{\frac{1}{4}}}{3}, there is an independent set of size at least n1242\frac{n^{\frac{1}{24}}}{2}. Hence, in any assignment of kk colors to the vertices of the graph GG, there is a monochromatic independent set of size nΩ(1)n^{\Omega(1)}. Taking the complement, we get a graph with the required properties. ∎