A Survey on Parameterized Inapproximability: k-Clique, k-SetCover, and More
Abstract
In the past few years, many interesting inapproximability results have been obtained from the parameterized perspective. This article surveys some of these results, with a focus on k-Clique, k-SetCover, and other related problems.
1 Introduction
Parameterization and Approximation are two natural ways to cope with NP-complete optimization problems. For Clique and SetCover, two very basic NP-complete problems whose parameterized versions k-Clique and k-SetCover are complete for W[1] and W[2], respectively, both approximation and parameterization have been studied extensively. However, the combination of parameterization and approximation remained largely unexplored until recent years.
In their breakthrough work, [CCK+17] showed very strong inapproximability results for k-Clique and k-SetCover under the hypothesis Gap-ETH. However, although plausible, Gap-ETH is a strong hypothesis in the sense that it already postulates a gap in hardness of approximation. It is therefore of great interest to prove the same lower bounds under gap-free assumptions like ETH, W[1] ≠ FPT and so on. Although recent years have witnessed many significant developments along this line, the inapproximability of k-Clique and k-SetCover under gap-free hypotheses is still far from settled.
This article surveys some recent results, all of which are beautiful, full of smart ideas, and involve delicate algebraic or combinatorial tools. We hope to extract the relationship between different problems, capture the essence of successful attempts, and convey the ideas inside those results to readers.
1.1 Organization of the Survey
This article is organized by problem. In Section 2, we give some preliminaries, including the definitions of the problems and hypotheses. In Section 3, we introduce MaxCover and MinLabel, two problems which are not only important intermediate problems in proving parameterized inapproximability, but also of great interest themselves. In Sections 4 and 5, we introduce recent parameterized inapproximability results for k-Clique and k-SetCover, respectively.
2 Preliminaries
In this section, we first introduce some concepts in FPT approximation, then briefly describe the problems discussed in this article, and the hypotheses which the results are based on.
2.1 FPT Approximation
For a parameterized optimization problem, we use $n$ to denote the input size and $k$ to denote the parameter, which usually refers to the number of elements we need to pick to obtain an optimal solution. In some problems $k$ is just the optimal solution size (e.g. k-Clique), while in other problems it is not (e.g. One-Sided k-Biclique). By enumerating the $k$ elements in the solution, the brute-force algorithm usually runs in $n^{O(k)}$ time.
An algorithm for a maximization (respectively, minimization) problem is called a $\rho$-FPT-approximation if it runs in $f(k)\cdot n^{O(1)}$ time for some computable function $f$, and outputs a solution of size at least $k/\rho$ (respectively, at most $k\cdot\rho$). Here $\rho$ is called the approximation ratio. If an optimization problem admits no $\rho(k)$-FPT-approximation for any computable function $\rho$, we say this problem is totally FPT inapproximable.
Note that since computing a constant-size solution is trivial, for a maximization problem we only care about $o(k)$-FPT-approximation, and the approximation ratio is measured in terms of $k$ only. However, for a minimization problem, any computable approximation ratio is non-trivial, so once total FPT inapproximability is established, we can also discuss approximation ratios in terms of both $k$ and $n$.
2.2 Problems
If the input is divided into $k$ groups, and one is only allowed to pick at most one element from each group, we say the problem is colored; otherwise it is uncolored. For some problems (e.g. k-Clique) the two versions are equivalent, while for some other problems (e.g. k-Biclique) they are not equivalent at all. We will discuss the coloring in each problem's section separately.
Now we list the problems considered in this article. There are some other problems (e.g. MaxCover) which are used as intermediate problems in proving hardness of approximation. We put their definitions in their separate sections since they are a bit more complicated.
• 3SAT. The input is a 3-CNF formula $\varphi$ with $m$ clauses on $n$ variables. The goal is to decide whether there is a satisfying assignment for $\varphi$.
• k-Clique. The input is an undirected graph $G$ with $n$ vertices. The goal is to decide whether there is a clique of size $k$ in $G$.
• Densest k-Subgraph. The input is an undirected graph $G$ with $n$ vertices, and the goal is to find the maximum number of edges induced by $k$ vertices.
• k-SetCover. The input is a collection $\mathcal{S}$ of sets over a universe $U$. The goal is to decide whether there are $k$ sets in $\mathcal{S}$ whose union is $U$.
2.3 Hypotheses
Here we list the hypotheses which the results are based on.
W[1] ≠ FPT and W[2] ≠ FPT are arguably the most natural hypotheses in parameterized complexity, and are often used to derive FPT time lower bounds. Since k-Clique and k-SetCover are complete for W[1] and W[2], respectively, we directly state the intractability of these two problems in the hypotheses below, and omit the definition of the W-hierarchy.
Hypothesis 2.1 (W[1] ≠ FPT).
k-Clique cannot be solved in $f(k)\cdot n^{O(1)}$ time, for any computable function $f$.
Hypothesis 2.2 (W[2] ≠ FPT).
k-SetCover cannot be solved in $f(k)\cdot n^{O(1)}$ time, for any computable function $f$.
Tighter time lower bounds like $n^{o(k)}$ often involve the Exponential Time Hypothesis (ETH).
Hypothesis 2.3 (Exponential Time Hypothesis (ETH)[IP01, IPZ01, Tov84]).
3SAT cannot be solved deterministically in $2^{o(n)}$ time, where $n$ is the number of variables. Moreover, this holds even when restricted to formulae in which $m = O(n)$, and each variable appears in at most three clauses.
There are two stronger assumptions on the intractability of 3SAT, namely, the Gap Exponential Time Hypothesis (Gap-ETH) and the Strong Exponential Time Hypothesis (SETH). Gap-ETH is useful in proving strong inapproximability results for many parameterized problems, while SETH is used to show tight time lower bounds like $n^{k-\varepsilon}$.
Hypothesis 2.4 (Gap Exponential Time Hypothesis (Gap-ETH) [Din16, MR17]).
For some constant $\varepsilon > 0$, there is no deterministic algorithm which, given a 3SAT formula $\varphi$ on $n$ variables and $m$ clauses, runs in $2^{o(n)}$ time and distinguishes between the following two cases:
• (Completeness) the formula $\varphi$ is satisfiable.
• (Soundness) any assignment violates more than an $\varepsilon$ fraction of the clauses.
Note that by the current state-of-the-art PCP theorem, a 3SAT formula on $n$ variables can be transformed into a constant-gap 3SAT formula on only $n\,\mathrm{polylog}(n)$ variables [Din07]. Therefore, assuming ETH, constant-gap 3SAT cannot be solved in $2^{o(n/\mathrm{polylog}\,n)}$ time. A big open problem is whether a linear-sized PCP exists. If so, Gap-ETH would follow from ETH.
Hypothesis 2.5 (Strong Exponential Time Hypothesis (SETH) [IP01, IPZ01]).
For any $\varepsilon > 0$, there is an integer $k \ge 3$ such that no algorithm can solve $k$-SAT deterministically in $2^{(1-\varepsilon)n}$ time.
Gap-ETH and SETH both imply ETH. However, no formal relationship between them is known now.
There are also randomized versions of ETH, Gap-ETH and SETH, which also rule out randomized algorithms running in corresponding time. We do not separately list them here.
One last important hypothesis is the Parameterized Inapproximability Hypothesis (PIH).
Hypothesis 2.6 (Parameterized Inapproximability Hypothesis (PIH) [LRSZ20]).
For some constant $\varepsilon > 0$, there is no factor-$(1+\varepsilon)$ FPT approximation algorithm for Colored Densest k-Subgraph.
The factor $1+\varepsilon$ can be replaced by any constant and its exact value is not important.
Note that if for a graph the maximum number of edges induced by $k$ vertices is bounded away from $\binom{k}{2}$, it cannot have a clique of size $k$. Thus, PIH implies that k-Clique is hard to approximate within any constant factor in FPT time. However, the reverse direction is not necessarily true (forbidding small cliques does not imply low edge density), and it remains an important open question whether PIH holds if we assume k-Clique is FPT inapproximable within any constant factor.
Another remark is that PIH is implied by Gap-ETH; see Appendix A of [BGKM18] for a simple proof. However, deriving PIH from a gap-free hypothesis such as ETH is still open.
3 MaxCover and MinLabel
In this section, we introduce two intermediate problems which are important in proving hardness of k-Clique and k-SetCover.
The input is the same for both problems. It is a bipartite graph $\Gamma = (U \cup V, E)$, such that $U$ is partitioned into $U_1 \cup \cdots \cup U_\ell$ and $V$ is partitioned into $V_1 \cup \cdots \cup V_k$. We refer to the $U_i$'s and $V_j$'s as left super-nodes and right super-nodes, respectively, and we refer to the maximum size of left super-nodes and right super-nodes as the left alphabet size and right alphabet size, denoted $|\Sigma_U|$ and $|\Sigma_V|$, respectively.
We say a MaxCover or MinLabel instance has the projection property if for every $i \in [\ell]$ and $j \in [k]$, one of the following holds:
• Every $u \in U_i$ has exactly one neighbor $v \in V_j$.
• There is a full bipartite graph between $U_i$ and $V_j$.
The full bipartite case just means there are no constraints between $U_i$ and $V_j$.
Another interesting property is called the pseudo projection property, which is almost the same as the projection property, except that the projection direction in the first case is reversed (every $v \in V_j$ has exactly one neighbor $u \in U_i$).
For convenience, in an instance $\Gamma$ and for a left super-node $U_i$, we sometimes refer to the number of right super-nodes $V_j$ such that the edges between $U_i$ and $V_j$ do not form a full bipartite graph as $U_i$'s degree, and we define the degree of a right super-node similarly. We call the maximum degree over all $U_i$'s (respectively, over all $V_j$'s) the left degree (respectively, right degree) of $\Gamma$.
A solution to MaxCover is a subset of vertices $V' \subseteq V$ formed by picking one vertex from each $V_j$ (i.e. $|V' \cap V_j| = 1$ for all $j \in [k]$); such a $V'$ is called a labeling. We say a labeling $V'$ covers a left super-node $U_i$ if there exists a vertex $u \in U_i$ which is a common neighbor of all vertices in $V'$. The goal in MaxCover is to find a labeling that covers the maximum fraction of left super-nodes. The value of a MaxCover instance $\Gamma$ is defined as
$$\mathrm{MaxCover}(\Gamma) = \frac{1}{\ell}\,\max_{\text{labeling } V'} \big|\{\, i \in [\ell] : V' \text{ covers } U_i \,\}\big|.$$
A solution to MinLabel is also a subset of vertices $V' \subseteq V$, but not necessarily with exactly one vertex from each $V_j$; such a $V'$ is called a multi-labeling. We say a multi-labeling $V'$ covers a left super-node $U_i$ if there exists a vertex $u \in U_i$ which has a neighbor in $V' \cap V_j$ for every $j \in [k]$. The goal in MinLabel is to find a minimum-size multi-labeling that covers all the left super-nodes. The value of a MinLabel instance $\Gamma$ is defined as
$$\mathrm{MinLabel}(\Gamma) = \min\big\{\, |V'| : V' \text{ is a multi-labeling covering all } U_i \,\big\}.$$
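To make these two objectives concrete, here is a brute-force Python sketch (the toy instance and all helper names are our own, not from the papers) that evaluates the MaxCover and MinLabel values of a small instance by exhaustive enumeration.

```python
from itertools import product, combinations

# A toy instance: left super-nodes U[i] and right super-nodes V[j] are lists of
# vertex ids; `adj` is the set of edges between left and right vertices.
U = [["u0", "u1"], ["u2", "u3"]]          # two left super-nodes
V = [["v0", "v1"], ["v2", "v3"]]          # two right super-nodes
adj = {("u0", "v0"), ("u0", "v2"), ("u1", "v1"), ("u1", "v3"),
       ("u2", "v0"), ("u2", "v2"), ("u3", "v1"), ("u3", "v2")}

def covers(u, chosen):
    """u covers the labeling iff u is adjacent to every chosen right vertex."""
    return all((u, v) in adj for v in chosen)

def max_cover():
    best = 0
    for labeling in product(*V):                      # one vertex per V[j]
        covered = sum(any(covers(u, labeling) for u in Ui) for Ui in U)
        best = max(best, covered)
    return best / len(U)

def min_label():
    all_right = [v for Vj in V for v in Vj]
    for size in range(len(V), len(all_right) + 1):    # need >= 1 vertex per V[j]
        for subset in combinations(all_right, size):
            # U_i is covered if some u in U_i has a neighbor in subset ∩ V_j for every j
            if all(any(all(any((u, v) in adj and v in Vj for v in subset)
                           for Vj in V) for u in Ui) for Ui in U):
                return size
    return float("inf")

print(max_cover(), min_label())
```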
There is a remark on the relationship between the projection property and the pseudo projection property. If the degree of each left super-node is bounded by some constant (which is the case when reducing from 3SAT, see Theorem 3.2), then a MaxCover instance with the projection property can be reduced to a MaxCover instance with the pseudo projection property, with only a constant-factor shrinking of the gap.
Theorem 3.1.
There is a reduction which, on input a MaxCover instance $\Gamma$ with the projection property whose left degree is bounded by a constant $d$, outputs a MaxCover instance $\Gamma'$ with the pseudo projection property, such that
• (Completeness) If $\mathrm{MaxCover}(\Gamma) = 1$, then $\mathrm{MaxCover}(\Gamma') = 1$.
• (Soundness) If $\mathrm{MaxCover}(\Gamma) \le 1 - \varepsilon$, then $\mathrm{MaxCover}(\Gamma') \le 1 - \varepsilon/d$.
Moreover, the right degree of $\Gamma'$ is bounded by a constant depending only on $d$.
Proof.
The reduction is straightforward: for each restriction between some $U_i$ and $V_j$ (i.e. each non-full-bipartite pair), build a copy of $U_i$ on the left (there are at most $d\ell$ restrictions, thus that many copies), and the new right super-nodes are just the juxtaposition of $U_1,\dots,U_\ell$ and $V_1,\dots,V_k$. Each left super-node is only responsible for checking one restriction of $\Gamma$. The edges between the left copies and the right $U$-parts form either a bijection or a full bipartite graph, while the edges between the left copies and the right $V$-parts form either the projection inherited from $\Gamma$ or a full bipartite graph, too. It is easy to see that the new instance satisfies the pseudo projection property, the right degree is bounded, and the gap is only hurt by a factor of $d$. ∎
In the following we briefly list the inapproximability results for MaxCover and MinLabel, which will be introduced in detail in the subsequent subsections. We use $N$ to denote the size of the instance for simplicity.
| Problem | Assumption | Ratio | Lower Bound | Holds Even | Reference |
| --- | --- | --- | --- | --- | --- |
| MaxCover | Gap-ETH | Any constant | $N^{\Omega(\ell)}$ / $N^{\Omega(k)}$ | projection property | [CCK+17] |
| MaxCover | W[1] ≠ FPT | $1$ vs $o(1)$ | FPT | / | [KN21] |
| MaxCover | ETH | $1$ vs $o(1)$ | $N^{o(k)}$ | / | [KN21] |
| MinLabel | Gap-ETH | $(1/\varepsilon)^{1/k}$ | $N^{\Omega(k)}$ | / | [CCK+17] |
It is worth noting that [KLM19] also showed some inapproximability results for MaxCover. However, as their parameters are specifically tailored to the later proof of hardness of k-SetCover, we defer their results to Section 5.3 instead of listing them here.
3.1 Hardness Results Based on Gap-ETH
Results in this subsection are from [CCK+17].
A 3SAT instance $\varphi$ can be formulated as a MaxCover instance as follows. Each left super-node consists of at most 7 vertices, which represent the satisfying assignments of a clause. Each right super-node consists of 2 vertices, which correspond to the true/false assignments of a variable. Two vertices are linked if and only if the assignments are consistent. Therefore, Gap-ETH can also be restated as an intractability result for constant-gap MaxCover:
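As an illustration, the following Python sketch (function and variable names are ours) builds this MaxCover instance from a 3-CNF formula given as a list of clauses, where a literal is a signed 1-based variable index. Variables not appearing in a clause are connected by a full bipartite graph, so the instance has the projection property.

```python
from itertools import product

def sat_assignments(clause):
    """All assignments to the clause's variables that satisfy the clause (<= 7 of them)."""
    vs = [abs(lit) for lit in clause]
    out = []
    for bits in product([False, True], repeat=len(vs)):
        val = dict(zip(vs, bits))
        if any(val[abs(lit)] == (lit > 0) for lit in clause):
            out.append(val)
    return out

def threesat_to_maxcover(clauses, n_vars):
    # Left super-node i: one vertex per satisfying assignment of clause i.
    U = [[(i, frozenset(a.items())) for a in sat_assignments(c)] for i, c in enumerate(clauses)]
    # Right super-node x: two vertices, (x, False) and (x, True).
    V = [[(x, b) for b in (False, True)] for x in range(1, n_vars + 1)]
    edges = set()
    for i, c in enumerate(clauses):
        for u in U[i]:
            partial = dict(u[1])
            for x in range(1, n_vars + 1):
                for b in (False, True):
                    # Consistent iff clause i does not mention x, or assigns x the value b.
                    if x not in partial or partial[x] == b:
                        edges.add((u, (x, b)))
    return U, V, edges

# Example: (x1 or x2 or not x3) and (not x1 or x3 or x4)
U, V, E = threesat_to_maxcover([[1, 2, -3], [-1, 3, 4]], n_vars=4)
```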
Theorem 3.2.
Assuming Gap-ETH, there is a constant $\varepsilon > 0$ such that no deterministic algorithm can, given a MaxCover instance $\Gamma$ with $\ell$ left super-nodes and $k$ right super-nodes, distinguish between the following cases in $2^{o(\ell + k)}$ time:
• (Completeness) $\mathrm{MaxCover}(\Gamma) = 1$.
• (Soundness) $\mathrm{MaxCover}(\Gamma) \le 1 - \varepsilon$.
Moreover, this holds even when $|\Sigma_U|, |\Sigma_V| = O(1)$ and $\Gamma$ has the projection property.
Actually Gap-ETH is equivalent to the above, see Appendix E of [CCK+17].
Note that MaxCover can be solved exactly in $|\Sigma_U|^{O(\ell)} \cdot N^{O(1)}$ or $|\Sigma_V|^{O(k)} \cdot N^{O(1)}$ time, by just enumerating the vertices picked from each left super-node or from each right super-node, respectively. In particular, one can decide whether the value equals $1$ within the same time bounds. As shown below, these are essentially the best possible assuming Gap-ETH.
Theorem 3.3 (Theorem 4.2 in [CCK+17]).
Assuming Gap-ETH, there exists a constant $\delta > 0$ such that, for any constant $\varepsilon > 0$ and any sufficiently large $\ell$, no algorithm can take a MaxCover instance $\Gamma$ of size $N$ with $\ell$ left super-nodes, and distinguish between the following cases in $O(N^{\delta\ell})$ time:
• (Completeness) $\mathrm{MaxCover}(\Gamma) = 1$.
• (Soundness) $\mathrm{MaxCover}(\Gamma) \le \varepsilon$.
This holds even when $|\Sigma_V| = O(1)$ and $\Gamma$ has the projection property.
The theorem is straightforward when $\ell = \Theta(m)$, where $m$ is the number of clauses, because we can directly compress a constant number of left super-nodes into one. The interesting case is when $\ell$ is much smaller than $m$.
The proof involves a combinatorial object called disperser, which is defined as follows.
Definition 3.4 (Disperser[CW89, Zuc96a, Zuc96b]).
For positive integers $m, \ell, a, r$ and constant $\varepsilon > 0$, an $(m, \ell, a, r, \varepsilon)$-disperser is a collection of $\ell$ subsets $I_1, \dots, I_\ell \subseteq [m]$, each of size $a$, such that the union of any $r$ different subsets from the collection has size at least $(1 - \varepsilon)m$.
Dispersers with proper parameters can be constructed using random subsets with high probability.
Lemma 3.5.
For positive integers $m, \ell, a, r$ and constant $\varepsilon > 0$, let $I_1, \dots, I_\ell$ be independent uniformly random $a$-subsets of $[m]$. If $a = \Omega\!\left(\frac{m}{\varepsilon r}\log\ell\right)$, then $\{I_1, \dots, I_\ell\}$ is an $(m, \ell, a, r, \varepsilon)$-disperser with probability at least $1 - o(1)$.
The above construction can also be derandomized easily. With this tool, we can compress the left super-nodes of a MaxCover instance according to those subsets: each new left super-node corresponds to the satisfying partial assignments of the AND of the clauses indexed by one subset $I_i$. If there is a labeling that covers at least $r$ of the new left super-nodes, then from the definition of a disperser we know that at least $(1-\varepsilon)m$ clauses in the original 3SAT instance can be simultaneously satisfied. The size of the new instance is $N = \ell \cdot 2^{O(a)}$, thus an algorithm for MaxCover which runs in $N^{\delta\ell}$ time for sufficiently small $\delta$ would lead to an algorithm for constant-gap 3SAT running in $2^{o(m)}$ time, refuting Gap-ETH.
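The following Python sketch (parameter names are ours) samples the random collection of Lemma 3.5 and checks the disperser condition by brute force on small parameters; it is only meant to make the definition concrete, not to match the exact bounds of the lemma.

```python
import random
from itertools import combinations

def random_collection(m, k, a, seed=0):
    """k random a-subsets I_1, ..., I_k of [m]."""
    rng = random.Random(seed)
    return [frozenset(rng.sample(range(m), a)) for _ in range(k)]

def is_disperser(sets, r, threshold):
    """Disperser condition: any r of the subsets have union of size >= threshold."""
    return all(len(frozenset().union(*chosen)) >= threshold
               for chosen in combinations(sets, r))

# Tiny example: 20 random 8-subsets of [40]; ask that any 3 of them cover >= 18 elements.
sets = random_collection(m=40, k=20, a=8)
print(is_disperser(sets, r=3, threshold=18))
```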
In the other direction, we would like to rule out $N^{o(k)}$-time algorithms for approximating MaxCover, where $k$ is the number of right super-nodes. We have the following theorem:
Theorem 3.6 (Theorem 4.3 in [CCK+17]).
Assuming Gap-ETH, there exists a constant $\delta > 0$ such that, for any integer $k \ge 1$ and any $\varepsilon > 0$ (which may depend on the instance but cannot be too small; see the discussion below), no algorithm can take a MaxCover instance $\Gamma$ of size $N$ with $k$ right super-nodes, and distinguish between the following cases in $O(N^{\delta k})$ time:
• (Completeness) $\mathrm{MaxCover}(\Gamma) = 1$.
• (Soundness) $\mathrm{MaxCover}(\Gamma) \le \varepsilon$.
This holds even when $\ell$ and $|\Sigma_U|$ are bounded by a polynomial in $1/\varepsilon$.
Note that in the statement of this theorem, $k$ can be any fixed constant, while in the statement of Gap-ETH, $n$ is the number of variables, which goes to infinity. Thus, a straightforward idea is to compress the variables into $k$ groups, each of size $n/k$.
After grouping the variables, we can also compress the clauses in order to amplify the soundness parameter to $\varepsilon$. Specifically, take $\ell$ left super-nodes, each corresponding to the satisfying assignments of the AND of some $t$ clauses. If only a $(1-\varepsilon_0)$ fraction of the clauses in the original 3SAT instance can be simultaneously satisfied, then only roughly a $(1-\varepsilon_0)^{t}$ fraction of the new left super-nodes can be covered, leading to a soundness parameter $\varepsilon \approx (1-\varepsilon_0)^{t}$. Furthermore, $|\Sigma_U| \le 7^{t}$.
One important thing is that we need to make sure $\ell$ and $|\Sigma_U|$ can be bounded by $2^{O(n/k)}$, so that $|\Sigma_V|^{k} = 2^{\Theta(n)}$ remains the dominating term in the instance size. Thus, $\varepsilon$ cannot be arbitrarily small. Fortunately, in its major applications (e.g. hardness of k-SetCover), this will not be the bottleneck.
Next we proceed to discuss hardness of MinLabel.
Theorem 3.7 (Theorem 4.4 in [CCK+17] ).
Assuming Gap-ETH, there exists a constant $\delta > 0$ such that, for any integer $k \ge 1$ and any $\varepsilon > 0$ as in Theorem 3.6, no algorithm can take a MinLabel instance $\Gamma$ of size $N$ with $k$ right super-nodes, and distinguish between the following cases in $O(N^{\delta k})$ time:
• (Completeness) $\mathrm{MinLabel}(\Gamma) = k$.
• (Soundness) $\mathrm{MinLabel}(\Gamma) > (1/\varepsilon)^{1/k} \cdot k$.
This holds even when $\ell$ and $|\Sigma_U|$ are as small as in Theorem 3.6.
The instance is exactly the one in Theorem 3.6. It is easy to see that in the completeness case, $\mathrm{MaxCover}(\Gamma) = 1$ implies $\mathrm{MinLabel}(\Gamma) = k$. We only need to additionally argue that, in the soundness case, small MaxCover implies large MinLabel. We prove the contrapositive: if $\mathrm{MinLabel}(\Gamma) \le (1/\varepsilon)^{1/k} \cdot k$, we can fix an optimal multi-labeling $V'$, and pick a vertex uniformly at random from $V' \cap V_j$ for each right super-node $V_j$ to form a labeling. A left super-node covered by $V'$ is then covered by the random labeling with probability at least $\prod_{j \in [k]} |V' \cap V_j|^{-1} \ge (k/|V'|)^{k} \ge \varepsilon$ by the AM–GM inequality, so the expected fraction of covered left super-nodes is at least $\varepsilon$, i.e., $\mathrm{MaxCover}(\Gamma) \ge \varepsilon$.
The left alphabet size is the same as that in Theorem 3.6.
3.2 Gap Producing via Threshold Graph Composition
Threshold Graph Composition (TGC) is a powerful gap-producing technique. It was first proposed by Lin in his breakthrough work [Lin18], and has been used to create gaps for many parameterized problems [CL19, Lin19, BBE+19, KN21].
At a high level, in TGC we compose an instance which has no gap, with a threshold graph which is oblivious to the instance, to produce a gap instance of that problem. The two main challenges of TGC are the creation of a threshold graph with desired properties, and the right way to compose the input and the threshold graph, respectively.
In this subsection, we introduce a delicate threshold graph, which is constructed via error correcting codes. This graph was proposed by Karthik and Livni Navon [KN21], and has many applications, such as proving hardness of MaxCover starting from W[1] ≠ FPT or ETH (later in this section), and simplifying the proof of the k-SetCover inapproximability of [Lin19] (see Section 5.4).
We first formalize some definitions related to error correcting codes.
Definition 3.8 (Error Correcting Codes).
Let $\Sigma$ be a finite set. A mapping $C : \Sigma^{m} \to \Sigma^{n}$ is an error correcting code with message length $m$, block length $n$ and relative distance $\delta$ if for every distinct $x, y \in \Sigma^{m}$, $\Delta(C(x), C(y)) \ge \delta \cdot n$, where $\Delta$ denotes the Hamming distance (the number of positions in which two strings differ). We then write $\delta(C) = \delta$.
We sometimes abuse notation and treat an error correcting code as its image, i.e., $C = \{C(x) : x \in \Sigma^{m}\} \subseteq \Sigma^{n}$.
Definition 3.9 (Collision Number).
The collision number of an error correcting code $C$ is the smallest number $s$ such that there exists a set $X \subseteq C$ with $|X| = s$, and for every position $i \in [n]$, there are two distinct strings $x, y \in X$ such that $x_i = y_i$. We denote this number as $\mathrm{Col}(C)$.
For any error correcting code $C \subseteq \Sigma^{n}$ and any integer $k$, we can efficiently build a bipartite threshold graph $G = (A \cup B, E)$ with the following properties:
• $A$ is divided into $k$ groups $A_1, \dots, A_k$, each of size $|C|$. $B$ is divided into $n$ groups $B_1, \dots, B_n$, each of size $|\Sigma|^{k}$.
• (Completeness) For any codewords $x_1 \in A_1, \dots, x_k \in A_k$ and for any $j \in [n]$, there is a unique vertex in $B_j$ which is a common neighbor of $x_1, \dots, x_k$.
• (Soundness) For every $i \in [k]$ and every distinct $x, y \in A_i$, for at least a $\delta(C)$ fraction of the parts $B_j$, no vertex of $B_j$ is a common neighbor of both $x$ and $y$.
• (Collision Property) Let $X \subseteq A$ be such that for every $j \in [n]$, there exists a vertex in $B_j$ which is a common neighbor of at least $k+1$ vertices in $X$. Then $|X| \ge \mathrm{Col}(C)$.
The graph is constructed as follows:
• For every $i \in [k]$, we associate $A_i$ with the set of all codewords, i.e., each vertex in $A_i$ is a unique codeword in the image of $C$.
• For every $j \in [n]$, we associate $B_j$ with the set $\Sigma^{k}$.
• A vertex $x \in A_i$ and a vertex $t \in B_j$ are linked if and only if $t_i = x_j$.
We can think of the graph as a $k \times n$ matrix: when picking one vertex from each $A_i$, we are filling the $k$ codewords into the rows of the matrix. By reading out each column $j$ of the matrix, we obtain exactly one common neighbor of the picked vertices in each $B_j$, satisfying the completeness property.
The soundness property is also straightforward: if we pick two distinct vertices from the same left group $A_i$, the two codewords differ in at least $\delta(C)\cdot n$ positions. For those columns $j$ we cannot "read out" any length-$k$ string whose $i$-th entry equals two different characters.
As for the collision property, for a set $X \subseteq A$, if for every $j \in [n]$ there exists a vertex in $B_j$ which is a common neighbor of at least $k+1$ vertices in $X$, it is easy to see that for any $j$, we can pick two distinct vertices of $X$ lying in the same group $A_i$ whose codewords agree at position $j$, by the pigeonhole principle.
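A minimal Python sketch of this threshold graph (names are ours), taking an arbitrary code given as a list of codewords; the final assertion checks the completeness property on a toy code.

```python
from itertools import product

def threshold_graph(code, k):
    """code: list of codewords (tuples over some alphabet), all of the same length n."""
    n = len(code[0])
    alphabet = sorted({sym for w in code for sym in w})
    # Left super-node A[i] (i in [k]): one vertex per codeword.
    A = [[(i, w) for w in code] for i in range(k)]
    # Right super-node B[j] (j in [n]): one vertex per k-tuple over the alphabet.
    B = [[(j, t) for t in product(alphabet, repeat=k)] for j in range(n)]
    def adjacent(left, right):
        (i, w), (j, t) = left, right
        return t[i] == w[j]          # tuple's i-th coordinate equals codeword's j-th symbol
    return A, B, adjacent

# Completeness check on a toy code: picking any codeword w_i from each A[i],
# the "column" tuple (w_0[j], ..., w_{k-1}[j]) is a common neighbor in B[j].
code = [(0, 1, 1), (1, 0, 1), (1, 1, 0)]
A, B, adj = threshold_graph(code, k=2)
w = [code[0], code[2]]
for j in range(3):
    col = (j, tuple(w[i][j] for i in range(2)))
    assert all(adj((i, w[i]), col) for i in range(2))
```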
Now we describe how to compose this threshold graph with a MaxCover instance $\Gamma$ (where the parameter $k$ denotes the number of right super-nodes) to produce a gap MaxCover instance $\Gamma'$ such that:
• (Completeness) If $\mathrm{MaxCover}(\Gamma) = 1$, then $\mathrm{MaxCover}(\Gamma') = 1$.
• (Soundness) If $\mathrm{MaxCover}(\Gamma) < 1$, then $\mathrm{MaxCover}(\Gamma') \le 1 - \delta(C)$.
Given a MaxCover instance $\Gamma = (U \cup V, E)$ with the pseudo projection property, where $U = U_1 \cup \cdots \cup U_\ell$ and $V = V_1 \cup \cdots \cup V_k$, and a threshold graph $G = (A \cup B, E')$, where $A = A_1 \cup \cdots \cup A_\ell$ and $B = B_1 \cup \cdots \cup B_n$, w.l.o.g. assume $|A_i| \ge |U_i|$ for every $i \in [\ell]$, we build the new MaxCover instance $\Gamma'$ as follows.
• Arbitrarily match every vertex $u \in U_i$ to a vertex $a(u) \in A_i$ without repetition. This can be done since $|A_i| \ge |U_i|$.
• The new right super-nodes are still $V_1, \dots, V_k$, and the new left super-nodes are $B_1, \dots, B_n$.
• A right vertex $v$ and a left vertex $t \in B_j$ are linked if and only if there exist $u_1 \in U_1, \dots, u_\ell \in U_\ell$ such that $v$ is linked to each $u_i$ in $\Gamma$, and $t$ is linked to each matching vertex $a(u_i)$ in $G$.
The completeness case is obvious: picking one vertex $v_r$ in each right super-node $V_r$ so that all left super-nodes of $\Gamma$ are covered, there is a common neighbor $u_i$ in each left super-node $U_i$. Considering their matching vertices $a(u_1), \dots, a(u_\ell)$, there is exactly one common neighbor in each $B_j$, so every new left super-node is covered.
The soundness case needs the pseudo projection property of $\Gamma$, i.e., for every $U_i$ and $V_r$, the edges between them either form a function from $V_r$ to $U_i$, or are complete. Fix any labeling $(v_1, \dots, v_k)$; there must be a left super-node $U_{i_0}$ of $\Gamma$ which is not covered. This means there must be two chosen right vertices mapping (under the pseudo projection) to different vertices of $U_{i_0}$. Let those two images be $u \neq u' \in U_{i_0}$, and let their matching vertices be $a(u), a(u') \in A_{i_0}$. By the soundness property of the threshold graph, only in a $1 - \delta(C)$ fraction of the parts $B_j$ is there a common neighbor of $a(u)$ and $a(u')$. This means the labeling can cover at most a $1 - \delta(C)$ fraction of the parts $B_j$, i.e., $\mathrm{MaxCover}(\Gamma') \le 1 - \delta(C)$.
Next we use this technique to prove strong inapproximability results for MaxCover based on W[1] ≠ FPT and ETH.
Theorem 3.10 (Theorem 4.3 in [KN21] ).
Assuming W[1] ≠ FPT, for any computable function $f$, there is no $f(k)\cdot n^{O(1)}$-time algorithm which can take a MaxCover instance $\Gamma$ with $k$ right super-nodes and distinguish between the following two cases:
• (Completeness) $\mathrm{MaxCover}(\Gamma) = 1$.
• (Soundness) $\mathrm{MaxCover}(\Gamma) = o(1)$.
Proof.
First reduce k-Clique to MaxCover with $\binom{k}{2}$ right super-nodes and $k$ left super-nodes in the canonical way: right super-nodes correspond to the edge sets between pairs of blocks, and left super-nodes to the blocks of vertices. Note that this MaxCover instance has the pseudo projection property. Then take a Reed–Solomon code to build the threshold graph. To ensure that the right alphabet size of the composed instance (which is $|\Sigma|^{k}$ here) is polynomial in $n$, we need $|\Sigma| \le n^{O(1/k)}$, and to ensure there are enough codewords to match the vertices of each left super-node, we need $|C| \ge n$. Thus, according to the properties of Reed–Solomon codes, the soundness parameter is $o(1)$. ∎
Theorem 3.11 (Theorem 4.4 in [KN21] ).
Assuming ETH, there is no $N^{o(k)}$-time algorithm which can take a MaxCover instance $\Gamma$ of size $N$ with $k$ right super-nodes and distinguish between the following two cases:
• (Completeness) $\mathrm{MaxCover}(\Gamma) = 1$.
• (Soundness) $\mathrm{MaxCover}(\Gamma) = o(1)$.
Proof.
First reduce 3SAT to MaxCover with $k$ right super-nodes and $O(k^3)$ left super-nodes. Each right super-node corresponds to the satisfying assignments of a group of $m/k$ clauses, and thus has $2^{O(n/k)}$ vertices in it. Each left super-node corresponds to assignments to the variables which appear in exactly some one/two/three groups of clauses. This MaxCover instance also has the pseudo projection property. Note that here it is necessary to group the variables so as to change the number of left super-nodes from $n$ to $O(k^3)$, because in our construction of the threshold graph the size of each new right super-node is $|\Sigma|^{\ell}$, which is too large if $\ell = n$. After that, we still use Reed–Solomon codes. Now to ensure $|\Sigma|^{O(k^3)} \le 2^{O(n/k)}$ we need $|\Sigma| \le 2^{O(n/k^4)}$, and to ensure $|C| \ge 2^{O(n/k)}$ we need the message length to be $\Omega(k^3)$. The soundness parameter is again $o(1)$. ∎
4 k-Clique
Clique is arguably the first natural combinatorial optimization problem. Its inapproximability in the NP regime has been studied extensively [BGLR93, BS94, Has96, FGL+96, Gol98, FK00, Zuc07]. However, from the parameterized perspective, although k-Clique is known to be complete for W[1], and cannot even be solved in $f(k)\cdot n^{o(k)}$ time assuming ETH [CHKX06], there is still a lot of work to do on its parameterized inapproximability.
To approximate k-Clique to a factor of $\rho$, we only need to compute a clique of size $k/\rho$, which can trivially be done in $n^{O(k/\rho)}$ time. In their milestone work, Chalermsook et al. [CCK+17] showed that this cannot be improved assuming Gap-ETH. However, results based on gap-free assumptions were not reached until very recently, when Lin [Lin21] showed that constant-factor approximation of k-Clique is W[1]-hard. He also obtained a time lower bound for constant-factor approximation of k-Clique based on ETH, and this bound was recently improved to $n^{\Omega(\log k)}$ by [LRSW21].
The following table lists the current state-of-the-art inapproximability results for k-Clique based on different hypotheses. Here $f$ can be any computable function.
| Complexity Assumption | Inapproximability Ratio | Time Lower Bound | Reference |
| --- | --- | --- | --- |
| W[1] ≠ FPT | Any constant | $f(k)\cdot n^{O(1)}$ | [Lin21] |
| PIH | Any constant | / | |
| ETH | Any constant | $f(k)\cdot n^{\Omega(\log k)}$ | [LRSW21] |
| Gap-ETH | $o(k)$ | $f(k)\cdot n^{o(k)}$ | [CCK+17] |
We note that the colored and uncolored versions of k-Clique are equivalent: a colored instance can be interpreted as an uncolored one by making each group an independent set, and we can make $k$ different copies of the original graph to transform an uncolored instance into a colored one.
4.1 Reduction from MaxCover with Projection Property
The Gap-ETH hardness of k-Clique directly follows from Theorem 3.3. Since the instance has the projection property, two left vertices agree if and only if they map to the same vertex in each right super-node, and a set of left vertices agree if and only if they pairwise agree. Thus, we can transform a MaxCover instance from Theorem 3.3 with $\ell$ left super-nodes into an $\ell$-Clique instance with the same value.
Therefore, as pointed out in Theorem 3.3, even deciding if there is a clique of size $\varepsilon\ell$ needs $N^{\Omega(\ell)}$ time. Here $\varepsilon$ can be any positive constant, which means approximating k-Clique to within any constant ratio cannot be done in $f(k)\cdot n^{o(k)}$ time.
Note that the projection property is crucial: without it, a MaxCover instance cannot be reduced to an $\ell$-Clique instance, because the agreement test of $\ell$ left vertices cannot be decomposed locally into agreement tests of pairs of left vertices.
One interesting thing is that the optimal inapproximability result for k-Clique can be obtained from hypotheses other than (but similar to) Gap-ETH; see the following as an example.
Theorem 4.1.
Given an undirected graph $G$ with $k$ groups of vertices, each forming an independent set, if there exists a constant $\varepsilon > 0$ such that distinguishing between the following cases cannot be done in $n^{o(k)}$ time:
• (Completeness) there is a clique of size $k$ in $G$.
• (Soundness) there is no clique of size $(1-\varepsilon)k$ in $G$.
then k-Clique cannot be approximated to within any constant ratio in $f(k)\cdot n^{o(k)}$ time.
This assumption is a little weaker than Gap-ETH, since it can be derived from Gap-ETH through the canonical reduction from 3SAT to Clique.
The proof is almost the same as that in Theorem 3.3: just compose the groups using a disperser. Details are omitted here.
4.2 k-VectorSum
Before introducing the W[1]-hardness of constant-factor approximation of k-Clique, we want to mention an important W[1]-complete problem, k-VectorSum, which is used as an intermediate problem in the reduction of [Lin21].
In the k-VectorSum problem, we are given $k$ groups of vectors $V_1, \dots, V_k \subseteq \mathbb{F}^{d}$ together with a target vector $t \in \mathbb{F}^{d}$, where $\mathbb{F}$ is some finite field and $d$ is the dimension of the vectors. The goal is to decide whether there exist vectors $v_1 \in V_1, \dots, v_k \in V_k$ such that $v_1 + \cdots + v_k = t$.
It is easy to see that k-VectorSum with $\mathbb{F} = \mathbb{F}_2$ and $d = O(k^2 \log n)$ is W[1]-hard, and does not even admit $n^{o(k)}$-time algorithms assuming ETH. The idea is to use the entries of the vectors to check consistency in k-Clique or 3SAT.
Theorem 4.2.
Assuming W[1] ≠ FPT, k-VectorSum with $\mathbb{F} = \mathbb{F}_2$ and $d = O(k^2 \log n)$ cannot be solved in $f(k)\cdot n^{O(1)}$ time.
Proof.
Set $\binom{k}{2}$ groups of vectors, each representing the valid edges between the $i$-th block and the $j$-th block of vertices in the k-Clique instance. For each $i \in [k]$, we want to make sure that the vertices chosen from the $i$-th block by different groups are all the same one. Thus we need to do $O(k)$ equality checks per block, each on two $O(\log n)$-bit strings. In short, we use a fresh entry for each bitwise equality check: set the entry to be the unchecked bit in the two vectors involved, and set the entry to be zero in all other vectors. Let the target vector be $\vec{0}$. Then all vectors sum up to zero in this entry (over $\mathbb{F}_2$) if and only if the two to-be-checked bits are the same. The produced k-VectorSum instance has parameter $\binom{k}{2}$ and dimension $O(k^2 \log n)$. ∎
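The bitwise equality-check trick can be sketched as follows (Python over $\mathbb{F}_2$; the chaining of edge-slots and all names are our own choices, not Lin's exact construction): for each block $i$, consecutive edge-slots touching $i$ are chained, and each label bit of the chained pair gets one fresh coordinate, so the selected vectors sum to the all-zero vector iff every consistency check passes.

```python
from itertools import combinations

def clique_to_vectorsum(k, edges, label_bits):
    """
    Sketch of the colored k-Clique -> VectorSum reduction over F_2.
    edges[(i, j)] : list of pairs (u, v) with u in block i, v in block j, for i < j
    label_bits[u] : the label of vertex u as a tuple of bits
    Output: one group of vectors per block pair; the target is the all-zero vector.
    """
    pairs = list(combinations(range(k), 2))
    nbits = len(next(iter(label_bits.values())))
    # One coordinate per (block i, two consecutive slots containing i, bit position t).
    checks = []
    for i in range(k):
        slots = [p for p in pairs if i in p]
        checks += [(i, a, b, t) for a, b in zip(slots, slots[1:]) for t in range(nbits)]
    groups = {}
    for p in pairs:
        vecs = []
        for (u, v) in edges[p]:
            endpoint = {p[0]: u, p[1]: v}
            # Contribute the checked bit only on the coordinates of checks involving slot p.
            vec = tuple(label_bits[endpoint[i]][t] if p in (a, b) else 0
                        for (i, a, b, t) in checks)
            vecs.append(((u, v), vec))
        groups[p] = vecs
    return groups, (0,) * len(checks)
```

Summing one chosen vector per slot, a coordinate for check $(i, a, b, t)$ receives exactly the $t$-th label bits of the block-$i$ endpoints chosen in slots $a$ and $b$, so it vanishes over $\mathbb{F}_2$ iff those two choices agree.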
Theorem 4.3.
Assuming ETH, k-VectorSum with $\mathbb{F} = \mathbb{F}_2$ and $d = O(k \log N)$ cannot be solved in $N^{o(k)}$ time, where $N$ is the size of the instance.
Proof.
Divide the clauses into $k$ equal-sized groups. Enumerate the satisfying partial assignments of each group of clauses; then we want to check the consistency of those partial assignments. Note that we can assume each variable appears in at most three clauses, thus we only need to do at most 2 pairwise equality checks per variable (between the first and the second appearances, and between the second and the third). The equality-check step is the same as that in the proof of Theorem 4.2, and is omitted here.
The produced k-VectorSum instance has size $2^{O(n/k)}$ and the dimension is $O(n)$. Assuming ETH, this cannot be solved in $2^{o(n)} = N^{o(k)}$ time. ∎
Note that $\mathbb{F}_2$ can be replaced by any finite field, as long as we slightly adjust the values of an entry: both entries are $0$ if the checked bit is $0$; and they are $c$ in the first appearance and $-c$ in the second appearance, for some nonzero constant $c$, if the checked bit is $1$.
4.3 Gap Producing via Hadamard Codes
In this subsection we briefly introduce how Lin [Lin21] rules out constant-factor FPT-approximations of k-Clique under W[1] ≠ FPT.
The most essential step in the reduction is to combine k-VectorSum, whose W[1]-hardness was shown in Theorem 4.2, with Hadamard codes to create a gap. The technique is very similar to the one used in proving a weakened PCP theorem, namely, NP $\subseteq$ PCP$[\mathrm{poly}(n), O(1)]$ [ALM+98].
Here follows the definition of Hadamard Code, and its two important properties.
Definition 4.4 (Walsh-Hadamard Code).
For two strings $x, y \in \{0,1\}^{n}$, define $\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i \bmod 2$. The Walsh–Hadamard code is the function $\mathrm{WH} : \{0,1\}^{n} \to \{0,1\}^{2^{n}}$ that maps every string $x \in \{0,1\}^{n}$ into the string $\mathrm{WH}(x)$ such that $\mathrm{WH}(x)_y = \langle x, y \rangle$ for every $y \in \{0,1\}^{n}$.
1. (Linearity Testing, [BLR93]) Let $f$ be a function mapping $\{0,1\}^{n}$ to $\{0,1\}$. If it passes at least a $1-\delta$ fraction of the tests $f(x) \oplus f(y) = f(x \oplus y)$ (over uniformly random $x, y$), then $f$ is $2\delta$-close to a true linear function $g(x) = \langle z, x \rangle$, which can also be parsed as a Hadamard codeword. By setting $\delta < 1/8$, this codeword is unique since the relative distance of the Hadamard code is exactly $1/2$.
2. (Locally Decodable Property) Suppose $f$ is $\delta$-close to a Hadamard codeword $\mathrm{WH}(z)$ for some $\delta < 1/4$; then for any $x \in \{0,1\}^{n}$, we can recover $\langle z, x \rangle$ probabilistically by querying only two positions of $f$. Namely, sample $y \in \{0,1\}^{n}$ uniformly at random, and output $f(y) \oplus f(x \oplus y)$. This succeeds with probability at least $1 - 2\delta$ by a simple union bound.
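A small Python sketch (names ours) of the Walsh–Hadamard encoding together with the BLR linearity test and the two-query local decoder described above.

```python
import random

def dot(x, y):
    return sum(a & b for a, b in zip(x, y)) & 1

def all_strings(n):
    return [tuple((i >> j) & 1 for j in range(n)) for i in range(1 << n)]

def xor(x, y):
    return tuple(a ^ b for a, b in zip(x, y))

def hadamard_encode(z, n):
    """The truth table of a |-> <z, a> over all a in {0,1}^n (length 2^n)."""
    return {a: dot(z, a) for a in all_strings(n)}

def blr_test(f, n, trials=100, rng=random.Random(0)):
    """Accepts (w.h.p.) iff f is close to linear: f(x) + f(y) = f(x + y)."""
    strings = all_strings(n)
    return all(f[x] ^ f[y] == f[xor(x, y)]
               for _ in range(trials)
               for x, y in [(rng.choice(strings), rng.choice(strings))])

def local_decode(f, x, n, rng=random.Random(0)):
    """Recover <z, x> from a (possibly corrupted) codeword f with two queries."""
    r = rng.choice(all_strings(n))
    return f[r] ^ f[xor(x, r)]

z = (1, 0, 1)
f = hadamard_encode(z, 3)
assert blr_test(f, 3)
assert local_decode(f, (1, 1, 0), 3) == dot(z, (1, 1, 0))
```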
Given a k-VectorSum instance $(V_1, \dots, V_k, t)$, we build a CSP instance on variable set $\{x_y\}_{y \in \{0,1\}^{kd}}$. Let $(v_1, \dots, v_k)$ be a solution that sums up to $t$ in the yes case, and let $v \in \{0,1\}^{kd}$ be the concatenation of $v_1, \dots, v_k$; each variable $x_y$ is supposed to take the value $\langle v, y \rangle$. Here we are actually concatenating the solution vectors into one long vector $v$, and the concatenation of the variables' values is supposed to be the Hadamard codeword of $v$.
There are three types of tests we want to make:
• (T1) For every $y, y' \in \{0,1\}^{kd}$, test whether $x_y \oplus x_{y'} = x_{y \oplus y'}$.
• (T2) For every $i \in [k]$ and every $y, y' \in \{0,1\}^{kd}$ which differ only inside the $i$-th block of coordinates, test whether $x_y \oplus x_{y'} = \langle u, z \rangle$ for some $u \in V_i$, where $z$ is the restriction of $y \oplus y'$ to the $i$-th block.
• (T3) For every $w \in \{0,1\}^{d}$, test whether $x_{(w, \dots, w)} = \langle t, w \rangle$.
By linearity testing, if an assignment to the variables satisfies a $1-\delta$ fraction of the constraints in (T1), then it is $2\delta$-close to a true Hadamard codeword $\mathrm{WH}(v)$ for some $v \in \{0,1\}^{kd}$. The constraints (T2) and (T3) use the locally decodable property to recover entries $\langle v, \cdot \rangle$, and use them to check whether $v$ indeed encodes a satisfying solution of our k-VectorSum instance.
After that, we use a slightly modified FGLSS reduction to build a clique instance. We build a group for each variable indicating its possible values, and a group for each (T1) test. A variable vertex and a test vertex are linked either if they are consistent, or if the test is irrelevant to the variable. Two test vertices are linked in a similar way. Two variable vertices are linked either if the values specified by them pass all (T2) and (T3) tests between them, or if there are no such tests between them. Therefore, if there is a clique whose size is at least a constant fraction of the maximum, the following conditions hold:
1. A constant fraction of the (T1) tests are passed.
2. For a constant fraction of the variables, all (T2) and (T3) tests between them are passed.
This completes the reduction.
There are still some technical details. For example, in this reduction the number of vertices is $2^{\Theta(kd)}$, which is too large. [Lin21] sampled some random matrices to handle this. Details are omitted here.
5 k-SetCover
SetCover, which is equivalent to the Dominating Set problem, is one of the most fundamental problems in computational complexity. A simple greedy algorithm yields a $(\ln n + 1)$-approximation for this problem. On the other side, it was shown that $(1-\varepsilon)\ln n$-approximation for this problem is NP-hard for every $\varepsilon > 0$ [DS14]. Thus, its approximability in the NP regime has been completely settled.
In the parameterized regime, k-SetCover is complete for W[2], and does not admit $f(k)\cdot n^{o(k)}$-time algorithms assuming ETH [CHKX06], nor even $O(n^{k-\varepsilon})$-time algorithms assuming SETH [PW10]. Hardness of approximation of k-SetCover in FPT time was studied in [CL19, CCK+17, KLM19, Lin19]; currently, based on Gap-ETH, ETH, W[1] ≠ FPT or the k-SUM Hypothesis, k-SetCover is hard to approximate to within a $(\log n)^{1/\mathrm{poly}(k)}$ factor. In one direction, we wonder whether this can be further improved to $(\log n)^{\Omega(1)}$, or whether it is already tight. In the other direction, it is also worth asking whether the total FPT inapproximability of k-SetCover can be based on the weaker assumption W[2] ≠ FPT.
Current state-of-the-art inapproximability results for k-SetCover are reached by two contrasting methods, namely, the Distributed PCP Framework [KLM19] and Threshold Graph Composition [Lin19], which we will introduce in Sections 5.3 and 5.4, respectively. See the following table for an overview of results.
| Complexity Assumption | Inapproximability Ratio | Time Lower Bound | Reference |
| --- | --- | --- | --- |
| W[1] ≠ FPT | $(\log n/\log\log n)^{1/k}$ | $f(k)\cdot n^{O(1)}$ | [Lin19] |
| W[1] ≠ FPT | $(\log n)^{1/\mathrm{poly}(k)}$ | $f(k)\cdot n^{O(1)}$ | [KLM19] |
| ETH | $(\log n/\log\log n)^{1/k}$ | $f(k)\cdot n^{o(k)}$ | [Lin19] |
| ETH | $(\log n)^{1/\mathrm{poly}(k)}$ | $f(k)\cdot n^{o(k)}$ | [KLM19] |
| SETH | $(\log n/\log\log n)^{1/k}$ | $f(k)\cdot n^{k-\varepsilon}$ | [Lin19] |
| SETH | $(\log n)^{1/\mathrm{poly}(k)}$ | $f(k)\cdot n^{k-\varepsilon}$ | [KLM19] |
| k-SUM Hypothesis | $(\log n/\log\log n)^{1/k}$ | $f(k)\cdot n^{\lceil k/2\rceil-\varepsilon}$ | [Lin19] |
| k-SUM Hypothesis | $(\log n)^{1/\mathrm{poly}(k)}$ | $f(k)\cdot n^{\lceil k/2\rceil-\varepsilon}$ | [KLM19] |
As for coloring, the colored and uncolored versions of (exact) k-SetCover are also equivalent: we can add $k$ elements to the universe, the $i$-th of which is contained exactly in the sets of the $i$-th group, to ensure a solution takes a set from each group, reducing the colored version to the uncolored version. In the other direction, taking $k$ copies of the collection of sets also works, because choosing replicated sets does not help. In the approximation setting, since it is a minimization problem, in the soundness case when solutions of size $s$ are ruled out, it always means we cannot find that many sets covering the universe even when choosing several sets from the same group is allowed. Thus there is no need to specify the coloring.
5.1 Hypercube Partition System
We first introduce the hypercube partition system, which is a powerful tool used in the reduction from MinLabel to k-SetCover [CCK+17, KLM19], and in creating the gap instance of k-SetCover [Lin19].
Definition 5.1 (Hypercube Partition System).
The $(\kappa, \rho)$-hypercube partition system consists of the universe $M$ and a collection of subsets $\{P_{x,y}\}$ where $x \in [\rho]$ and $y \in [\kappa]$.
The universe $M$ consists of all functions from $[\rho]$ to $[\kappa]$, and is of size $\kappa^{\rho}$. Each subset $P_{x,y}$ ($x \in [\rho]$, $y \in [\kappa]$) consists of all functions mapping $x$ to $y$. It can be observed that one can cover the universe by picking all $\kappa$ subsets from some row $x$, namely $P_{x,1}, \dots, P_{x,\kappa}$, and this is essentially the only way to cover the universe. In other words, even if we include $\kappa - 1$ subsets from every row, it is not possible to cover the universe.
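A small Python sketch (names ours) of the hypercube partition system, with a brute-force check of the two properties above on tiny parameters.

```python
from itertools import product

def partition_system(rows, cols):
    """Universe: all functions [rows] -> [cols]; P[(x, y)] = functions mapping x to y."""
    universe = set(product(range(cols), repeat=rows))   # f encoded as a tuple of length `rows`
    P = {(x, y): {f for f in universe if f[x] == y}
         for x in range(rows) for y in range(cols)}
    return universe, P

universe, P = partition_system(rows=3, cols=2)

# A full row covers the universe ...
assert set().union(*(P[(0, y)] for y in range(2))) == universe
# ... but taking cols-1 subsets from *every* row does not.
assert set().union(*(P[(x, 0)] for x in range(3))) != universe
```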
5.2 Reduction from MinLabel
Now we show how to reduce MinLabel to SetCover. This reduction preserves gap, but significantly increases the instance size.
Given a MinLabel instance $\Gamma$ with left super-nodes $U_1, \dots, U_\ell$ and right super-nodes $V_1, \dots, V_k$, we build a SetCover instance as follows. Take $\ell$ different copies of the $(k, |\Sigma_U|)$-hypercube partition system and set the universe to be the union of their universes. Each set in the SetCover instance corresponds to a right vertex in MinLabel. For a set $S_v$ associated to a right vertex $v \in V_j$ and for a left vertex $u \in U_i$, if there is an edge $(u, v) \in E$, we include $P^{(i)}_{u, j}$ (the subset of the $i$-th copy indexed by row $u$ and column $j$) in the set $S_v$.
To see that there is a one-to-one correspondence between solutions of $\Gamma$ and solutions of the new SetCover instance, note that for a left vertex $u \in U_i$, if a multi-labeling covers $u$, then by picking the corresponding sets we include $P^{(i)}_{u,j}$ for all $j \in [k]$, covering the $i$-th copy. Moreover, the only way to cover the $i$-th copy of the universe is to include all of $P^{(i)}_{u,1}, \dots, P^{(i)}_{u,k}$ for some row indexed by $u \in U_i$, so a valid SetCover solution must contain sets associated to at least one neighbor in each right super-node, for some specific left vertex $u$. The same argument applies to each of the $\ell$ left super-nodes, because we have a different copy of the hypercube partition system for each of them.
One important thing about this reduction is that the universe size is blown up to $\ell \cdot k^{|\Sigma_U|}$, where $k$ is the solution size of a yes instance (and also the parameter of k-SetCover). Thus, in order to keep the instance size polynomial, the left alphabet size cannot exceed $O(\log N / \log k)$. Following the hardness of MinLabel (Theorem 3.7) and choosing the parameters appropriately, k-SetCover is hard to approximate to a $(\log N)^{1/\mathrm{poly}(k)}$ factor assuming Gap-ETH.
5.3 Gap Producing via Distributed PCP
In this section we introduce the Distributed PCP Framework. This framework was first proposed by Abboud et al. [ARW17], and later used by Karthik et al. to rule out FPT approximation algorithms for k-SetCover [KLM19]. The interesting part of [KLM19] is to obtain hardness of MaxCover with specific parameters, while the hardness of MinLabel and k-SetCover directly follows from the reductions in [CCK+17] (see Theorem 3.7 and Section 5.2, respectively).
At a high level, in this framework, one first rewrites the problem related to the hypothesis as a communication problem, then derives a Simultaneous Message Protocol for this problem and extracts an instance of MaxCover from the transcript of the protocol.
Due to space limitations, we only introduce their W[1] and ETH results to give an overview of their methods. The ideas in the SETH and k-SUM Hypothesis results are similar, except that they involve more complicated error correcting codes and protocols.
The similarity between W[1] ≠ FPT and ETH is that they both often involve agreement tests (in other words, equality tests). Starting from W[1] ≠ FPT, one may want to set $\binom{k}{2}$ groups, each containing elements representing the valid edges between two blocks. The goal is to pick an edge from each group such that the labels of the endpoints (which are of $O(k \log n)$ bits in total) are consistent. W[1] ≠ FPT states that this problem does not admit an FPT algorithm. Similarly, starting from ETH, one may also want to divide the clauses into $k$ groups, each containing at most $2^{O(n/k)}$ partial satisfying assignments for those clauses. The goal is also to pick an assignment from each group such that the values on each variable (which are of $n$ bits in total) are consistent. ETH states that this cannot be done in $2^{o(n)}$ time.
We can think of the agreement test problem as a communication problem: there are $k$ players, each receiving an element from the corresponding group. They want to collaborate to decide whether their elements in hand "agree" or not. To achieve this, they use a specific communication protocol called the Simultaneous Message Protocol, which was introduced by Yao [Yao79]:
Definition 5.2 (Simultaneous Message Protocol).
We say $\pi$ is an $(r, w, s)$-efficient protocol (with $r$ random coins, $w$-bit messages, and soundness $s$) if the following holds:
• The protocol is one-round with public randomness. The following actions happen sequentially:
1. The players receive their inputs.
2. The players and the referee jointly toss $r$ random coins.
3. Each player, on seeing the randomness, deterministically sends a $w$-bit message to the referee.
4. Based on the randomness and the bits sent from the players, the referee outputs accept or reject.
• The protocol has completeness 1 and soundness $s$, i.e.,
– If the inputs indeed agree, then the referee always accepts, regardless of the randomness.
– Otherwise, the referee accepts with probability at most $s$.
The full version of the SMP protocol also admits advice bits. In the contexts of W[1] ≠ FPT and ETH here, we do not need them.
Note that by repeating the protocol $t$ times, each time using fresh randomness, we can derive an $(rt, wt, s^{t})$-efficient protocol from an $(r, w, s)$-efficient protocol.
Next we will see how to use MaxCover to simulate an SMP protocol. Then the hardness of the starting problem (k-Clique or 3SAT) and the existence of an efficient protocol for it will lead to hardness of MaxCover.
Theorem 5.3 (Theorem 5.2 in [KLM19], slightly simplified).
An instance $\Pi$ of a problem (k-Clique or 3SAT) which admits an $(r, w, s)$-efficient $k$-player SMP protocol can be reduced to a MaxCover instance $\Gamma$ as follows:
• The reduction runs in $2^{r + kw} \cdot |\Pi|^{O(1)}$ time.
• $\Gamma$ has $k$ right super-nodes of size at most $|\Pi|$ each.
• $\Gamma$ has $2^{r}$ left super-nodes of size at most $2^{kw}$ each.
• If $\Pi$ is a YES instance, then $\mathrm{MaxCover}(\Gamma) = 1$.
• If $\Pi$ is a NO instance, then $\mathrm{MaxCover}(\Gamma) \le s$.
Proof.
Here is the construction:
• Each right super-node corresponds to the group from which a player receives its input, with one vertex per possible input.
• Each left super-node corresponds to a random string $\rho \in \{0,1\}^{r}$. The left super-node contains one node for each possible tuple of accepting messages from the players, i.e., each vertex corresponds to a tuple $(m_1, \dots, m_k)$ on which the referee accepts when the randomness is $\rho$.
• We add an edge between a right vertex (an input $x$ of player $i$) and a left vertex $(m_1, \dots, m_k)$ in the super-node of $\rho$ if $m_i$ is equal to the message that player $i$ sends on input $x$ and randomness $\rho$ in the protocol.
Detailed proofs of the desired properties are omitted here since they are rather straightforward. ∎
In the last step, we need to derive an efficient SMP protocol for k-Clique and 3SAT. Directly sending the inputs (which are of super-constant length) does not seem to be a good idea, because it would make the size of the MaxCover instance super-polynomial, which is too large. What is more, since the further reduction from MaxCover to MinLabel and then to k-SetCover introduces a $k^{|\Sigma_U|}$ blow-up, we even need the total message length $kw$ to be as small as $O(\log\log N)$.
Fortunately, this can be done via a simple error correcting code, called a good code, which has constant rate and constant relative distance. To check the consistency of their inputs, the players only need to check the equality of one random bit of their codewords; the randomness is used to specify the index of that bit. This way we can have $w = 1$ with soundness bounded away from $1$. Within the bound that the left alphabet size stays $O(\log N / \log k)$, we can repeat the protocol $\Theta(\log\log N / k)$ times, leading to a soundness parameter $s = 2^{-\Theta(\log\log N / k)}$. After the reduction to MinLabel, the $(1/s)^{1/k}$ factor in the gap becomes $(\log N)^{1/\mathrm{poly}(k)}$.
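A toy Python sketch (names ours) of the one-round equality test underlying this protocol: each player encodes its private string and sends only the symbol at a public random index, and the referee accepts iff all symbols agree. For simplicity the sketch uses the Hadamard code, whereas the actual protocol uses a constant-rate good code so that the codewords (and hence the index description) stay short.

```python
import random

def hadamard_bit(msg_bits, a):
    """<msg, a> over F_2: the a-th symbol of the Hadamard encoding of msg_bits."""
    return sum(m & ((a >> j) & 1) for j, m in enumerate(msg_bits)) & 1

def equality_smp(inputs, n, rounds=20, rng=random.Random(0)):
    """Referee accepts iff, for each public random index, all players' symbols agree."""
    for _ in range(rounds):
        a = rng.randrange(1 << n)                 # public randomness: index into the code
        symbols = {hadamard_bit(x, a) for x in inputs}
        if len(symbols) > 1:
            return False                          # reject: inputs cannot all be equal
    return True

# Equal inputs are always accepted; unequal ones are rejected w.p. >= 1/2 per round.
assert equality_smp([(1, 0, 1), (1, 0, 1), (1, 0, 1)], n=3)
```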
Along this long path we finally reach a $(\log N)^{1/\mathrm{poly}(k)}$ inapproximability ratio for k-SetCover, with an FPT time lower bound under W[1] ≠ FPT and an $f(k)\cdot N^{o(k)}$ time lower bound under ETH.
5.4 Gap Producing via Threshold Graph Composition
In this subsection we introduce how to obtain inapproximability of k-SetCover via the threshold graph composition technique.
In general, this technique transforms a k-SetCover instance with a small universe (typically of size $f(k)\cdot O(\log n)$, where $n$ is the number of sets) into another k-SetCover instance, increasing the size of the universe to $\mathrm{poly}(n)$ while creating a gap. In the YES case, the number of sets needed to cover the universe is still $k$, but in the NO case it becomes $h > k$, where $h$ is determined by the threshold graph.
Given a k-SetCover instance $(\mathcal{S}, U)$ with $n$ sets, we need a bipartite threshold graph $G = (A \cup B, E)$ with the following properties:
1. $A$ is not divided. $B$ is divided into groups $B_1, \dots, B_t$, where $t$ is arbitrary.
2. $|A| = n$, so that a distinct vertex of $A$ can be associated to each set in $\mathcal{S}$.
3. For any $k$ vertices $a_1, \dots, a_k \in A$ and for any $i \in [t]$, there is at least one common neighbor of $a_1, \dots, a_k$ in $B_i$.
4. For any $X \subseteq A$, if for every $i \in [t]$ there is at least one vertex in $B_i$ which is a common neighbor of at least $k+1$ vertices in $X$, then $|X| \ge h$.
We compose the original k-SetCover instance $(\mathcal{S}, U)$ with this threshold graph to produce a new instance $(\mathcal{S}', U')$ as follows.
• $\mathcal{S}' = \{\tilde S : S \in \mathcal{S}\}$, i.e., for each original set $S$ we associate a new set $\tilde S$. We also associate a vertex in $A$ (the left side of the threshold graph) to each set $S$.
• $U'$ consists of $t$ hypercube partition systems, one for each group $B_i$ (the right side of the threshold graph). The $i$-th partition system has $|B_i|$ rows and $|U|$ columns, and thus is of size $|U|^{|B_i|}$.
• For any $i \in [t]$, $w \in B_i$ and $e \in U$, the subset $P^{(i)}_{w, e}$ is included in a new set $\tilde S$ if and only if:
1. $e \in S$, i.e., the set $S$ can cover $e$ in the original instance.
2. There is an edge between the vertex associated to $S$ and the vertex $w$ in the threshold graph $G$.
As shown above, each row in the $i$-th partition system corresponds to a vertex in $B_i$, and each column corresponds to an element of $U$. According to the properties of the partition system, in order to cover its universe, we must pick all subsets in some specific row. This, together with our construction, means there is a vertex $w \in B_i$ such that the sets corresponding to its neighbors among the picked sets can cover $U$. Since the hypercube partition systems are independent, this holds for each $i \in [t]$.
In the YES case, $k$ sets are enough to cover $U$, and by property 3 there is at least one common neighbor of the corresponding $k$ vertices in every $B_i$. Thus the answer to the new instance is still $k$.
In the NO case, at least $k+1$ sets are needed to cover $U$. Consider any solution $X$ of the new instance; we know that in each group $B_i$ there is a vertex $w$ with $|N(w) \cap X| \ge k+1$ (because the sets corresponding to $N(w) \cap X$ form a solution of the original instance). Therefore, by property 4 of the threshold graph, $|X| \ge h$ as desired.
The last step is to construct a threshold graph with the desired properties. [Lin19] used a specific combinatorial object called universal sets to construct it. However, the graph in Section 3.2 with proper parameters also suffices, and is simpler.
The required property is closely related to the collision property in the construction of Section 3.2. In fact, the gap here is just the collision number of the error correcting code. Thus, we want codes with large collision numbers.
Let the alphabet of the code be $\Sigma$; then it is easy to see that the collision number cannot be larger than $|\Sigma| + 1$, since that many strings must collide in every position by the pigeonhole principle. However, this upper bound can be reached by codes constructed from perfect hash families.
Definition 5.4 (Perfect Hash Family).
For every $n, k \in \mathbb{N}$, we say that a family of functions $\mathcal{F} \subseteq \{f : [n] \to [k]\}$ is an $(n, k)$-perfect hash family if for every subset $S \subseteq [n]$ of size $k$, there exists an $f \in \mathcal{F}$ such that $f$ is injective on $S$, i.e., $|f(S)| = k$.
Think of an $(n, k)$-perfect hash family $\mathcal{F}$ as an $n \times |\mathcal{F}|$ matrix over $[k]$, where each column represents a hash function. Then the property is that for any $k$ rows, there is a column with no collisions (on these rows). Regard each row as a codeword of an error correcting code $C \subseteq [k]^{|\mathcal{F}|}$; then the collision number of this error correcting code is $k + 1$, thus exactly $|\Sigma| + 1$ as desired.
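A brute-force Python sketch (names ours) that interprets a family of functions $[n] \to [k]$ as a code (one codeword per element of $[n]$, one position per function), checks the perfect-hash property, and computes the collision number, which indeed comes out to $k+1$ on the toy example.

```python
from itertools import combinations

def is_perfect_hash_family(F, n, k):
    """F: functions given as length-n tuples over [k]; perfect iff every
    k-subset of [n] is mapped injectively by at least one function."""
    return all(any(len({f[x] for x in S}) == k for f in F)
               for S in combinations(range(n), k))

def collision_number(code):
    """Smallest |X| such that every position has two distinct codewords of X agreeing."""
    n_pos = len(code[0])
    for size in range(2, len(code) + 1):
        for X in combinations(code, size):
            if all(any(x[i] == y[i] for x, y in combinations(X, 2)) for i in range(n_pos)):
                return size
    return float("inf")

# Toy (n=4, k=2)-perfect hash family: functions are the columns, codewords the rows.
F = [(0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 0)]          # 3 functions [4] -> {0, 1}
code = [tuple(f[x] for f in F) for x in range(4)]        # codeword of each x in [4]
assert is_perfect_hash_family(F, n=4, k=2)
print(collision_number(code))                            # expected: k + 1 = 3
```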
Lemma 5.5 (Alon et al. [AYZ95] ).
For every $n, k \in \mathbb{N}$ there is an $(n, k)$-perfect hash family of size $2^{O(k)} \log n$ that can be computed in $2^{O(k)} n \log n$ time.
In this threshold graph the size of a right super-node is $|\Sigma|^{O(k)}$. Assuming the universe of the original instance has size $O(k \log n)$, to make the size of the new universe polynomial in $n$, we need $|\Sigma|$ to be roughly $\left(\frac{\log n}{\log\log n}\right)^{1/k}$, and this is the best gap possible. By the above lemma, the perfect hash family (and the corresponding code) can be constructed efficiently.
Our last step is to show that, assuming different hypotheses, k-SetCover with a small universe is still hard. The reduction from 3SAT is straightforward: divide the variables into $k$ groups of size $n/k$ each; a group of sets corresponds to the possible assignments to those variables, and the universe is just the set of clauses. ETH (respectively, SETH) asserts that this instance cannot be solved in $N^{o(k)}$ (respectively, $N^{k-\varepsilon}$) time, where $N = 2^{\Theta(n/k)}$ is the number of sets. Thus, by the above threshold graph composition technique, we reach $\left(\frac{\log N}{\log\log N}\right)^{1/k}$-inapproximability of k-SetCover based on ETH and SETH.
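A minimal Python sketch (names ours) of the straightforward reduction just described: the universe is the set of clauses, and the sets in the $j$-th group are indexed by assignments to the $j$-th group of variables, each covering the clauses it already satisfies.

```python
from itertools import product

def threesat_to_setcover(clauses, n_vars, k):
    """clauses: list of clauses, each a list of signed 1-based variable indices."""
    universe = set(range(len(clauses)))
    # Split the variables into k (nearly) equal groups.
    groups = [list(range(1 + j * n_vars // k, 1 + (j + 1) * n_vars // k)) for j in range(k)]
    sets = []                                    # ((group index, assignment), covered clauses)
    for j, vs in enumerate(groups):
        for bits in product([False, True], repeat=len(vs)):
            val = dict(zip(vs, bits))
            covered = {i for i, c in enumerate(clauses)
                       if any(abs(l) in val and val[abs(l)] == (l > 0) for l in c)}
            sets.append(((j, bits), covered))
    return universe, sets

# phi = (x1 or x2 or x3) and (not x1 or not x2 or x4), split into k = 2 groups.
U, S = threesat_to_setcover([[1, 2, 3], [-1, -2, 4]], n_vars=4, k=2)
```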
The reduction from k-Clique to k-SetCover is a bit more complicated, as stated in the following theorem. The main idea is from Karthik et al. [KLM19].
Theorem 5.6.
There is a polynomial-time algorithm which can reduce a k-Clique instance with $n$ vertices to a $k'$-SetCover instance with $k' = \binom{k}{2}$ and universe size $O(k^3 \log n)$.
Proof.
Sets in a group still represent the edges between the $i$-th block and the $j$-th block (in the k-Clique instance). In order to check the consistency of labels, we need $k \cdot \lceil\log n\rceil$ hypercube partition systems, one for each pair $(i, t) \in [k] \times [\lceil\log n\rceil]$. The $(i, t)$-th partition system is meant to check whether the $t$-th bits of the labels of the chosen vertices in block $i$ are all the same. In an invalid solution, one may pick an edge between the $i$-th and $j$-th blocks, and an edge between the $i$-th and $j'$-th blocks, such that the two vertices (let them be $u$ and $u'$) in the $i$-th block are not the same. In such a case, $u$ and $u'$ must differ in at least one bit, and thus cannot fulfill the requirements in the $(i, t)$-th partition system, where $t$ is the position of that bit.
Specifically, for all $(i, t)$, the $(i, t)$-th partition system contains 2 rows and $k-1$ columns. The choices of rows represent the choice of the bit to be 0 or 1, and the columns test agreement of the labels (one column for the edges between the $i$-th block and each remaining block). For an edge between $u$ (in block $i$) and $v$ (in block $j$), we include into its set the subset indexed by the row corresponding to the $t$-th bit of $u$'s label and the column $j$. At last, the universe of the SetCover instance is the union of those hypercube partition systems.
The instance size is $O(k^2 n^2)$ since there are at most that many edges, while the universe size is $O(k^3 \log n)$. ∎
The W[1]-hardness of approximating k-SetCover then follows. It is worth noting that in such FPT reductions, the parameter can be arbitrarily amplified as long as it remains a function of $k$. Assuming W[1] ≠ FPT, k-SetCover is hard to approximate to within a factor of $\left(\frac{\log n}{\log\log n}\right)^{1/k}$; then for any function $h$ which is $\omega(1)$ as $k$ goes to infinity, we can take $k'$ large enough and $n$ large enough so that $\left(\frac{\log n}{\log\log n}\right)^{1/k} \ge h(k')$, which means $k'$-SetCover (by padding the parameter from $k$ to $k'$) cannot be approximated to a factor of $h(k')$ in FPT time.
Instead of introducing their k-SUM Hypothesis result here, we want to make some comments on this technique. Note that the maximum size of a right super-node in the threshold graph is $|\Sigma|^{O(k)}$. Thus, even when the universe of the original instance is not as small as $O(k \log n)$, it may still be possible to obtain some inapproximability results. It remains a big open question whether we can base the total FPT inapproximability of k-SetCover on W[2] ≠ FPT. If we could construct threshold graphs with a non-trivial gap in which each right super-node consists of only $f(k)$ vertices for some function $f$, we could obtain W[2]-hardness of approximating k-SetCover to within the corresponding factor. Note that the construction in Section 3.2 does not suffice, because the size of its right super-nodes is $|\Sigma|^{k}$, which is too large even if $|\Sigma| = O(1)$.
Acknowledgements
I want to express my deep gratitude to Prof. Bingkai Lin, who brought me into the beautiful world of hardness of approximation, discussed with me regularly and guided me patiently. I would also like to thank my talented friends Yican Sun and Xiuhan Wang for their bright ideas and unreserved help. I really enjoy working with them.
References
- [ALM+98] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. J. ACM, 45(3):501–555, may 1998.
- [ARW17] Amir Abboud, Aviad Rubinstein, and R. Ryan Williams. Distributed PCP theorems for hardness of approximation in P. In Chris Umans, editor, 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017, pages 25–36. IEEE Computer Society, 2017.
- [AYZ95] Noga Alon, Raphael Yuster, and Uri Zwick. Color-coding. J. ACM, 42(4):844–856, 1995.
- [BBE+19] Arnab Bhattacharyya, Édouard Bonnet, László Egri, Suprovat Ghoshal, Karthik C. S., Bingkai Lin, Pasin Manurangsi, and Dániel Marx. Parameterized intractability of even set and shortest vector problem. Electron. Colloquium Comput. Complex., 26:115, 2019.
- [BGKM18] Arnab Bhattacharyya, Suprovat Ghoshal, C. S. Karthik, and Pasin Manurangsi. Parameterized intractability of even set and shortest vector problem from gap-eth. Electron. Colloquium Comput. Complex., 25:57, 2018.
- [BGLR93] Mihir Bellare, Shafi Goldwasser, Carsten Lund, and Alexander Russell. Efficient probabilistically checkable proofs and applications to approximations. In Proceedings of the twenty-fifth annual ACM symposium on Theory of computing, pages 294–304, 1993.
- [BLR93] Manuel Blum, Michael Luby, and Ronitt Rubinfeld. Self-testing/correcting with applications to numerical problems. Journal of Computer and System Sciences, 47(3):549–595, 1993.
- [BS94] Mihir Bellare and Madhu Sudan. Improved non-approximability results. In Proceedings of the twenty-sixth annual ACM symposium on Theory of computing, pages 184–193, 1994.
- [CCK+17] Parinya Chalermsook, Marek Cygan, Guy Kortsarz, Bundit Laekhanukit, Pasin Manurangsi, Danupon Nanongkai, and Luca Trevisan. From gap-eth to fpt-inapproximability: Clique, dominating set, and more. In Chris Umans, editor, 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, Berkeley, CA, USA, October 15-17, 2017, pages 743–754. IEEE Computer Society, 2017.
- [CHKX06] Jianer Chen, Xiuzhen Huang, Iyad A. Kanj, and Ge Xia. Strong computational lower bounds via parameterized complexity. J. Comput. Syst. Sci., 72(8):1346–1367, 2006.
- [CL19] Yijia Chen and Bingkai Lin. The constant inapproximability of the parameterized dominating set problem. SIAM J. Comput., 48(2):513–533, 2019.
- [CW89] Aviad Cohen and Avi Wigderson. Dispersers, deterministic amplification, and weak random sources. In 30th Annual Symposium on Foundations of Computer Science, pages 14–19. IEEE Computer Society, 1989.
- [Din07] Irit Dinur. The PCP theorem by gap amplification. J. ACM, 54(3):12, 2007.
- [Din16] Irit Dinur. Mildly exponential reduction from gap 3sat to polynomial-gap label-cover. Electron. Colloquium Comput. Complex., 23:128, 2016.
- [DS14] Irit Dinur and David Steurer. Analytical approach to parallel repetition. In David B. Shmoys, editor, Symposium on Theory of Computing, STOC 2014, New York, NY, USA, May 31 - June 03, 2014, pages 624–633. ACM, 2014.
- [FGL+96] Uriel Feige, Shafi Goldwasser, Laszlo Lovász, Shmuel Safra, and Mario Szegedy. Interactive proofs and the hardness of approximating cliques. Journal of the ACM (JACM), 43(2):268–292, 1996.
- [FK00] Uriel Feige and Joe Kilian. Two-prover protocols—low error at affordable rates. SIAM Journal on Computing, 30(1):324–346, 2000.
- [Gol98] Shafi Goldwasser. Introduction to special section on probabilistic proof systems. SIAM Journal on Computing, 27(3):737, 1998.
- [Has96] Johan Hastad. Clique is hard to approximate within $n^{1-\varepsilon}$. In Proceedings of 37th Conference on Foundations of Computer Science, pages 627–636. IEEE, 1996.
- [IP01] Russell Impagliazzo and Ramamohan Paturi. On the complexity of k-sat. Journal of Computer and System Sciences, 62:367–375, 2001.
- [IPZ01] Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which problems have strongly exponential complexity? Journal of Computer and System Sciences, 63(4):512–530, 2001.
- [KLM19] C. S. Karthik, Bundit Laekhanukit, and Pasin Manurangsi. On the parameterized complexity of approximating dominating set. J. ACM, 66(5):33:1–33:38, 2019.
- [KN21] C. S. Karthik and Inbal Livni Navon. On hardness of approximation of parameterized set cover and label cover: Threshold graphs from error correcting codes. In Hung Viet Le and Valerie King, editors, 4th Symposium on Simplicity in Algorithms, SOSA 2021, Virtual Conference, January 11-12, 2021, pages 210–223. SIAM, 2021.
- [Lin18] Bingkai Lin. The parameterized complexity of the k-biclique problem. J. ACM, 65(5):34:1–34:23, 2018.
- [Lin19] Bingkai Lin. A simple gap-producing reduction for the parameterized set cover problem. In Christel Baier, Ioannis Chatzigiannakis, Paola Flocchini, and Stefano Leonardi, editors, 46th International Colloquium on Automata, Languages, and Programming, ICALP 2019, July 9-12, 2019, Patras, Greece, volume 132 of LIPIcs, pages 81:1–81:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.
- [Lin21] Bingkai Lin. Constant approximating k-clique is w[1]-hard. In Samir Khuller and Virginia Vassilevska Williams, editors, STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21-25, 2021, pages 1749–1756. ACM, 2021.
- [LRSW21] Bingkai Lin, Xuandi Ren, Yican Sun, and Xiuhan Wang. On lower bounds of approximating parameterized -clique, 2021.
- [LRSZ20] Daniel Lokshtanov, M. S. Ramanujan, Saket Saurabh, and Meirav Zehavi. Parameterized complexity and approximability of directed odd cycle transversal. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020, pages 2181–2200. SIAM, 2020.
- [MR17] Pasin Manurangsi and Prasad Raghavendra. A Birthday Repetition Theorem and Complexity of Approximating Dense CSPs. In Ioannis Chatzigiannakis, Piotr Indyk, Fabian Kuhn, and Anca Muscholl, editors, 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017), volume 80 of Leibniz International Proceedings in Informatics (LIPIcs), pages 78:1–78:15, Dagstuhl, Germany, 2017. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
- [PW10] Mihai Patrascu and Ryan Williams. On the possibility of faster SAT algorithms. In Moses Charikar, editor, Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2010, Austin, Texas, USA, January 17-19, 2010, pages 1065–1075. SIAM, 2010.
- [Tov84] C. Tovey. A simplified np-complete satisfiability problem. Discret. Appl. Math., 8:85–89, 1984.
- [Yao79] Andrew Chi-Chih Yao. Some complexity questions related to distributive computing (preliminary report). In Michael J. Fischer, Richard A. DeMillo, Nancy A. Lynch, Walter A. Burkhard, and Alfred V. Aho, editors, Proceedings of the 11th Annual ACM Symposium on Theory of Computing, April 30 - May 2, 1979, Atlanta, Georgia, USA, pages 209–213. ACM, 1979.
- [Zuc96a] David Zuckerman. On unapproximable versions of np-complete problems. SIAM J. Comput., 25(6):1293–1304, 1996.
- [Zuc96b] David Zuckerman. Simulating BPP using a general weak random source. Algorithmica, 16(4/5):367–391, 1996.
- [Zuc07] David Zuckerman. Linear degree extractors and the inapproximability of max clique and chromatic number. Theory Comput., 3(1):103–128, 2007.