This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

The Strongish Planted Clique Hypothesis and Its Consequences

Pasin Manurangsi
Google Research
[email protected]
   Aviad Rubinstein
Stanford University
[email protected]
   Tselil Schramm
Stanford University
[email protected]
Abstract

We formulate a new hardness assumption, the Strongish Planted Clique Hypothesis (SPCH), which postulates that any algorithm for planted clique must run in time nΩ(logn)n^{\Omega(\log{n})} (so that the state-of-the-art running time of nO(logn)n^{O(\log n)} is optimal up to a constant in the exponent).

We provide two sets of applications of the new hypothesis. First, we show that SPCH implies (nearly) tight inapproximability results for the following well-studied problems in terms of the parameter kk: Densest kk-Subgraph, Smallest kk-Edge Subgraph, Densest kk-Subhypergraph, Steiner kk-Forest, and Directed Steiner Network with kk terminal pairs. For example, we show, under SPCH, that no polynomial time algorithm achieves o(k)o(k)-approximation for Densest kk-Subgraph. This inapproximability ratio improves upon the previous best ko(1)k^{o(1)} factor from (Chalermsook et al., FOCS 2017). Furthermore, our lower bounds hold even against fixed-parameter tractable algorithms with parameter kk.

Our second application focuses on the complexity of graph pattern detection. For both induced and non-induced graph pattern detection, we prove hardness results under SPCH, which improves the running time lower bounds obtained by (Dalirrooyfard et al., STOC 2019) under the Exponential Time Hypothesis.

1 Introduction

The last couple of decades have seen dramatic advances in our understanding of parameterized, fine-grained, and average-case complexity. To a large extent, this progress has been enabled by bolder computational hardness assumptions, beyond the classical \NP\P\neq\NP. Two notable assumptions in these fields are the Exponential Time Hypothesis and the Planted Clique Hypothesis. In this paper we propose a new hypothesis, the Strongish Planted Clique Hypothesis, which strengthens the Planted Clique Hypothesis in the style of the Exponential Time Hypothesis. We show that this hypothesis has interesting implications in the parameterized complexity of both approximation problems and graph pattern detection.

Exponential Time Hypothesis

The Exponential Time Hypothesis (ETH) [IP01] is a pessimistic version of \NP\P\neq\NP which postulates that solving 3-SAT on nn variables requires time 2Ω(n)2^{\Omega(n)}. In other words, the running times of the state-of-the-art (and the brute-force) algorithms for 3-SAT are optimal up to a constant factor in the exponent. ETH has important applications in parameterized complexity (e.g. [CJ03, LMS11, Lin15, CPP16, DVW19]) and hardness of approximation (e.g. [BKW15, BKRW17, Rub17, Man17a, MR17, DM18]). In the past several years, further progress was achieved by assuming stronger variants of ETH such as the Strong ETH (SETH) in fine-grained complexity [IPZ01, Wil18], Gap-ETH in parameterized complexity [CCK+17, BGKM18, LRSZ17], and ETH for \PPAD in Algorithmic Game Theory [BPR16, Rub16].

Planed Clique Hypothesis

In the Planted κ\kappa-Clique Problem, the goal is to distinguish (with high probability) between graphs sampled from one of the following distributions: Uniformly at random111I.e. from the Erdős-Rényi (ER) distribution over nn-vertex graphs where each edge appears independently with probability 1/21/2.; and uniformly at random, with an added κ\kappa-clique. While statistically it is easy to distinguish the two distributions for κ\kappa as little as 2.1log(n)2.1\log(n), the Planted Clique Hypothesis (PCH) postulates that no polynomial time algorithm can solve this problem, even for κ\kappa as large as o(n)o(\sqrt{n}). The history of this problem goes back to Karp [Kar76] and Jerrum [Jer92], and in the past decade it has been popular as a hardness assumption for both worst-case [ABBG10, HK11, AAM+11, BBB+13, BCKS16] and average-case [BR13, HWX15, BS16, GMZ17, BBH18, BB19] problems. A simple nΘ(log(n))n^{\Theta(\log(n))}-time algorithm for the planted-κ\kappa-clique problem non-deterministically guesses =Θ(log(n))\ell=\Theta(\log(n)) vertices from the clique, and then checks whether all of their common neighbors form a clique. There are several other algorithms that also solve this problem in time nΘ(log(n))n^{\Theta(\log(n))} [FS97, FK03, MM15, Bar18], but no faster algorithm is known for κ=O(n0.49)\kappa=O(n^{0.49}).

Strongish Planted Clique Hypothesis

In analogy to the Exponential Time Hypothesis for 3-SAT, we propose the following hypothesis, which postulates that the state-of-the-art algorithms for the Planted Clique Problem are optimal up to a constant factor in the exponent. A Strong Planted Clique Hypothesis, in analogy with SETH, would specify a precise constant in the exponent—our hypothesis is merely Strong-ish. We let 𝒢(n,p)\mathcal{G}(n,p) denote the Erdős-Rényi distribution with parameter pp, and 𝒢(n,p,κ)\mathcal{G}(n,p,\kappa) denote the Erdős-Rényi distribution with a planted κ\kappa-clique.

Hypothesis 1 (Strongish Planted Clique Hypothesis (SPCH)).

There exists a constant δ(0,12)\delta\in(0,\tfrac{1}{2}) such that no no(logn)n^{o(\log n)}-time algorithm 𝒜\mathcal{A} satisfies both of the following:

  • (Completeness) 𝐏𝐫G𝒢(n,12,nδ)[𝒜(G)=1]2/3\operatorname*{\mathbf{Pr}}_{G\sim\mathcal{G}(n,\frac{1}{2},\lceil n^{\delta}\rceil)}[\mathcal{A}(G)=1]\geq 2/3.

  • (Soundness) 𝐏𝐫G𝒢(n,12)[𝒜(G)=1]1/3\operatorname*{\mathbf{Pr}}_{G\sim\mathcal{G}(n,\frac{1}{2})}[\mathcal{A}(G)=1]\leq 1/3.

In addition to the lack of algorithmic progress toward refuting this hypothesis, we note that nΘ(log(n))n^{\Theta(\log(n))} is in fact provably optimal for the Sum-of-Squares hierarchy [BHK+19], which captures the state-of-the-art algorithmic techniques for a number of average-case problems. It is also known to be tight for statistical algorithms [FGR+17, BBH+20].

1.1 Our Contributions: Hardness results from Strongish Planted Clique Hypothesis

Our main technical contributions are in exploring the implications of our new SPCH in parameterized complexity. We prove two types of hardness results: hardness of approximation, and hardness of (exact) graph pattern detection. Due to the rich literature for each problem we consider, we will only mention the most relevant results here and defer a more comprehensive discussion to Section 1.4.

1.1.1 Hardness of approximation from SPCH

At the heart of our work is the study of the Densest kk-Subgraph problem, in which we are given an undirected graph G=(V,E)G=(V,E) and a positive integer kk. The goal is to output a subset SVS\subseteq V of kk vertices that induces as many edges as possible. There is a trivial O(k)O(k)-approximation for the problem: return an arbitrary set of k/2\lfloor k/2\rfloor edges. We show that assuming SPCH, this algorithm is optimal:

Theorem 2.

Assuming the Strongish Planted Clique Hypothesis (Hypothesis 1), there is no f(k)\poly(n)f(k)\cdot\poly(n)-time algorithm that can approximate Densest kk-Subgraph on nn-vertex graphs to within a factor o(k)o(k) for any function ff. Furthermore, this holds even in the perfect completeness case where the input graph is promised to contain a kk-clique.

Theorem 2 improves upon the inapproximability ratio of ko(1)k^{o(1)} shown in [CCK+17] under Gap-ETH.

The approximability of Densest kk-Subgraph is known to be intimately related to that of numerous other problems. As such, our tight hardness of approximation for Densest kk-Subgraph immediately implies several tight approximability results as corollaries, which we list below.

  • Smallest kk-Edge Subgraph: given an undirected graph G=(V,E)G=(V,E) and a positive integer kk, find a smallest subset SVS\subseteq V that induces at least kk edges. For this problem, the trivial solution that chooses kk edges arbitrarily is an O(k)O(\sqrt{k})-approximation since even the optimum requires at least k\sqrt{k} vertices. We show that this is tight (Corollary 11): no fixed-parameter tractable (FPT) (in kk) algorithm can achieve o(k)o(\sqrt{k}) approximation ratio.

  • Steiner kk-Forest (aka kk-Forest): given an edge-weighted undirected graph G=(V,E)G=(V,E), a set {(s1,t1),,(s,t)}\{(s_{1},t_{1}),\dots,(s_{\ell},t_{\ell})\} of demand pairs and a positive integer kk, the goal is to find a (not necessarily induced) subgraph of GG with smallest total edge weight that connects at least kk demand pairs. In Corollary 12, we show that no FPT (in kk) algorithm can achieve o(k)o(\sqrt{k}) approximation ratio. This matches the O(k)O(\sqrt{k})-approximation algorithm by Gupta et al. [GHNR10].

  • Directed Steiner Network (aka Directed Steiner Forest): given an edge-weighted directed graph G=(V,E)G=(V,E) and a set {(s1,t1),,(sk,tk)}\{(s_{1},t_{1}),\dots,(s_{k},t_{k})\} of kk demand pairs, the goal is to find a (not necessarily induced) subgraph of GG with smallest total edge weight in which tit_{i} is reachable from sis_{i} for all i=1,,ki=1,\dots,k. We prove that no FPT (in kk) algorithm achieves o(k)o(\sqrt{k})-approximation (Corollary 14). This nearly matches the best known polynomial algorithms by Chekuri et al. [CEGS11] and Feldman et al. [FKN12], both of which achieve a O(k1/2+ϵ)O(k^{1/2+\epsilon})-approximation for any constant ϵ>0\epsilon>0. Our bound improves upon a k1/4o(1)k^{1/4-o(1)} ratio from [DM18] under Gap-ETH.

  • Densest kk-Subhypergraph: given a hypergraph G=(V,E)G=(V,E) and a positive integer kk, output a kk-size subset SVS\subseteq V that maximizes the number of hyperedges fully contained in SS. The trivial algorithm that outputs any hyperedge (of size at most kk) obtains a 2k2^{k}-approximation. We prove a matching lower bound (Theorem 15): no 2o(k)2^{o(k)}-approximation FPT (in kk) algorithm exists. This resolves an open question posed by Cygan et al. [CKL+18], assuming SPCH.

1.1.2 Hardness of graph pattern detection from SPCH

Graph Pattern Detection, also known as Subgraph Isomorphism and closely related to Motif Discovery, is a fundamental problem in graph algorithms: Given a host graph GG and a pattern graph HH, decide whether GG contains a subgraph SS isomorphic to HH. There are two main variants of this problem, where SS is either required to be induced or not-necessarily-induced. For both variants, there is a brute-force nO(k)n^{O(k)}-time algorithm. We prove matching SPCH-hardness for both variants. In addition to beating the ETH-based state-of-the-art for these results, we highlight that our reductions to graph pattern detection problems are extremely simple (in contrast to the prior work).

For induced subgraph detection, we prove the following:

Theorem 3.

Assuming the Strongish Planted Clique Hypothesis (Hypothesis 1), for every kk-node pattern HH, there is no algorithm that solves the induced pattern detection problem on nn-vertex graphs in time f(k)no(k)f(k)\cdot n^{o(k)} for any function ff.

Our nΩ(k)n^{\Omega(k)} lower bound for all patterns can be compared to recent work of [DVW19], who proved: (i) nΩ(log(k))n^{\Omega(\log(k))} lower bound for every pattern assuming ETH; (ii) nΩ(k)n^{\Omega(\sqrt{k})} lower bound for every pattern assuming ETH and the Hadwiger conjecture; and (iii) nΩ(k/log(k))n^{\Omega(k/\log(k))} lower bound for most patterns.222In the sense of a pattern sampled randomly from 𝒢(k,1/2)\mathcal{G}(k,1/2).

For not-necessarily-induced subgraph detection, it is no longer true that every pattern is hard (e.g. it is trivial to find a not-necessarily-induced subgraph isomorphic to an independent set). But we prove (Corollary 10) that for most patterns333in particular any pattern with a constant fraction of the (k2)k\choose{2} possible edges. detection requires nΩ(k)n^{\Omega(k)} time assuming SPCH. For comparison, [DVW19] proved that under ETH, not-necessarily-induced subgraph detection requires nΩ(ω(H))n^{\Omega(\omega(H))} time, where ω(H)\omega(H) denotes the clique number of the pattern HH. (Note that ω(H)=Θ(logk)\omega(H)=\Theta(\log k) for most patterns.)

kk-biclique detection

For the special case where the pattern is a kk-biclique, our aforementioned nΩ(k)n^{\Omega(k)} hardness for non-induced subgraph detection rules out even constant factor approximations (Corollary 9). This improves over nΩ(k)n^{\Omega(\sqrt{k})} lower bounds under ETH for the exact case [Lin15] or Gap-ETH for approximation [CCK+17].

Densest kk-Subgraph

We obtain our pattern detection result by first showing a nΩ(k)n^{\Omega(k)} running time bound for O(1)O(1)-approximating Densest kk-Subgraph (Theorem 8). This improves upon the previous lower bound who give a running time lower bound of nΩ(logk)n^{\Omega(\log k)} assuming Gap-ETH [CCK+17]. The aforementioned lower bounds for pattern detection follow almost trivially from our running time lower bound for Densest kk-Subgraph; see Sections 3.2 and 3.3 for more detail.

1.2 Techniques

The starting point for all of our reductions is a randomized graph product: starting with an instance GG of the planted clique problem on nn vertices and any integers n\ell\leq n and NN, we produce a graph GG^{\prime} by taking its vertices to be NN randomly sampled subsets S1,,SNS_{1},\ldots,S_{N} of \ell vertices each, and we add an edge on Si,SjS_{i},S_{j} if and only if their union induces a clique in GG.

The randomized graph product [BS89] (and its derandomized variant [AFWZ95]) has a long history in proving hardness of approximating Maximum Clique. While not stated explicitly, it was also used to prove parameterized inapproximability of Densest kk-Subgraph in [CCK+17]. As we will explain in more detail below, the main difference between our proof and previous works lie in the soundness, where we appeal to the fact that G𝒢(n,1/2)G\sim\mathcal{G}(n,1/2) to achieve tighter bounds.

Since we would like to prove hardness of approximating Densest kk-Subgraph in the perfect completeness case, our goal is to show (for appropriately chosen values of N,N,\ell) that

  • (Completeness) if GG contains a large clique, then with high probability so does GG^{\prime}, and,

  • (Soundness) if GG is random, then GG^{\prime} does not have small dense subgraphs (with high probability).

For the completeness, if the starting graph GG has a κ\kappa-clique, then the set of SiS_{i} that fall entirely within the κ\kappa-clique will form a clique in GG^{\prime} (the expected size is (κn)N(\frac{\kappa}{n})^{\ell}\cdot N). This part of the proof is exactly the same as that in the aforementioned previous works.

To prove soundness, we calculate the probability that a specific γk\gamma k-dense kk-subgraph appears in GG^{\prime}, then take a union bound over all (Nk)2(k2)\leq\binom{N}{k}\cdot 2^{\binom{k}{2}} possible such subgraphs. Our simple argument hinges on showing that a kk-subgraph with γk2\gamma k^{2} edges in GG^{\prime} induces (with high probability) at least Ω(γk22)\Omega(\gamma k^{2}\ell^{2}) edges in GG, and since any such set of edges appears in GG with probability at most 2Ω(γk22)2^{-\Omega(\gamma k^{2}\ell^{2})}, by choosing \ell sufficiently large we can beat this union bound. To argue that small subgraphs with mm edges in GG^{\prime} induce small subgraphs with Ω(m2)\Omega(m\ell^{2}) edges in GG, we use that the randomly chosen SiS_{i} (for an appropriate choice of NnN\ll n^{\ell} and kk_{*} sufficiently small) form a disperser: the union of any tkt\leq k_{*} of the SiS_{i} contains Ω(t)\Omega(t\ell) vertices of GG with high probability.444The fact that the randomized graph product yields a disperser has been used in previous works as well, see e.g. [BS89, Zuc96, CCK+17]. This implies that for kkk\leq k_{*}, each kk-subgraph of GG^{\prime} corresponds to a union of kk pairwise mostly-non-overlapping subsets of \ell vertices. Now, since each edge between mostly non-overlapping sets in GG^{\prime} corresponds to a Ω()\Omega(\ell)-clique in GG, this in turn can be used to show that any kk-subgraph of GG^{\prime} with density γk\gamma k corresponds to a subgraph of GG with Ω(γk22)\Omega(\gamma k^{2}\ell^{2}) edges. In this way we rule out the existence of γk\gamma k-dense kk-subgraphs in GG^{\prime} (with high probability).

By carefully choosing the parameters NN, \ell, γ\gamma, to control the completeness, soundness, and reduction size, we get a fine-grained reduction from Planted κ\kappa-Clique to Densest kk-Subgraph. Our results for other problems are obtained via direct reductions from the Densest kk-Subgraph problem.

We end by stressing that our new soundness proof gives a strong quantitative improvement over prior results, which is what enables us to achieve kΩ(1)k^{\Omega(1)} inapproximability. Specifically, the previous soundness proof from [CCK+17]—in turn adapted from [Man17a]—relies on showing that the graph is tt-biclique-free for some tt\in\mathbb{N}; they then apply the so-called Kovari-Sos-Turan theorem [KST54] to deduce that any kk-subgraph contains at most O(k21/t)O(k^{2-1/t}) edges. Notice that this gap is only O(k1/t)O(k^{1/t}), and tt cannot be a constant as otherwise the completeness and soundness case can be distinguished in time nO(t)=\poly(n)n^{O(t)}=\poly(n). As a result, their technique cannot yield an kΩ(1)k^{\Omega(1)}-factor inapproximability for Densest kk-Subgraph. Similarly, the techniques from [CCK+17] cannot give a running time lower bound of the form nω(logk)n^{\omega(\log k)} for O(1)O(1)-approximation of Densest kk-Subgraph. The reason is that, to get a constant gap bounded from one, they need to select t=O(logk)t=O(\log k). Once again, this is in contrast to our technique which yields a tight running time lower bound of nΩ(k)n^{\Omega(k)} in this setting, although our proof requires a different starting hardness result (from SPCH).

1.3 Discussion and Open Questions

In this work, we have proposed the Strongish Planted Clique Hypothesis, and used it prove several tight hardness of approximation or running time lower bound results. One direction for future investigation is to use SPCH to derive other interesting lower bounds.

Another intriguing question directly related to our hardness of approximation results is whether we can strengthen the inapproximability factors from kΩ(1)k^{\Omega(1)} to nΩ(1)n^{\Omega(1)}. Whether it is hard to approximate Densest kk-Subgraph to within a nΩ(1)n^{\Omega(1)} factor is a well-known open problem and is related to some other conjectures, such as the Sliding-Scaling Conjecture [BGLR93]555Specifically, from a reduction of [CHK11], nΩ(1)n^{\Omega(1)}-factor hardness of approximation of Densest kk-Subgraph also implies that of the Label Cover problem.. As such, it would be interesting if one can prove this hardness under some plausible assumption. We remark that an attempt has been made in this direction, using the so-called “Log-Density Threshold” approach [BCC+10], which posits a heuristic for predicting which average-case Densest-kk-Subgraph problems are hard.

The approach has also been applied to other related questions [CDK12, CDM17a, CMMV17]. Nonetheless, there is still little evidence that these average-case DkS problems are indeed hard; not even lower bounds against strong SDP relaxations are known, although there are some matching lower bounds against LP hierarchies [BCV+12, CM18].

Finally, it is of course interesting to either refute or find more evidence supporting the Strongish Planted Clique Hypothesis. As stated earlier, the current best supporting evidence is the Sum-of-Squares lower bound from [BHK+19]. Can such a lower bound be extended to, e.g., rule out any semi-definite programs of size no(logn)n^{o(\log n)} (à la [LRS15] for CSPs)?

1.4 Other Related Works

Historically, postulating hardness for average-case problems has been helpful in illuminating the landscape for hardness of approximation, beginning with Feige’s seminal random-33-SAT hypothesis [Fei02] and its numerous consequences (e.g. [DS16]). See also [Dan16, Ale11, ABW10, BKS13].

As discussed briefly above, the Planted Clique Hypothesis (PCH), which states that there is no polynomial-time algorithm for planted clique, has many known consequences for hardness of approximation. We draw attention in particular to the work of [AAM+11], which also obtains hardness of approximation results based on PCH. But even assuming the SPCH, their results can only rule out n\polylog(κ)n^{\polylog(\kappa)}-time algorithms for 2log(κ)2/32^{\log(\kappa)^{2/3}}-approximating densest-κ\kappa-subgraph for κ=nΩ(1)\kappa=n^{\Omega(1)}. Their reduction also uses a graph product, but the set of vertices of their new graph GG^{\prime} contains all \ell-size subsets of vertices of GG. In contrast, we employ the randomized graph product, where we only randomly pick some \ell-size subsets—this allows us to better control the instance size blowup, which turns out to be crucial for obtaining our tight inapproximability and running time results.

Below we discuss in more detail the previous works for each of the problems studied here.

Densest kk-Subgraph.

The problem is well-studied in the approximation algorithms literature (e.g. [FKP01, KP93, BCC+10]). The best known polynomial time algorithm [BCC+10] achieves an approximation ratio of O(n1/4+ϵ)O(n^{1/4+\epsilon}) for any ϵ>0\epsilon>0. Even though the NP-hardness of Densest kk-Subgraph follows immediately from that of kk-Clique [Kar72], no NP-hardness of approximation of Densest kk-Subgraph even for a small factor of 1.001 is known. Nonetheless, several hardness of approximation results are known under stronger assumptions [Fei02, Kho06, RS10, AAM+11, BKS13, BKRW17, Man17a]. Among these, only [AAM+11, BKS13] and [Man17a] yield super-constant inapproximability ratios. Specifically, [AAM+11] rules out 2O((logn)2/3)2^{O((\log n)^{2/3})}-approximation in polynomial time under a hypothesis similar to SPCH, [Man17a] rules out no(1/\polyloglogn)n^{o(1/\poly\log\log n)}-approximation under ETH, and [BKS13] rules out nO(1)n^{O(1)} approximations under a strong conjecture regarding the optimality of semidefinite programs for solving every random CSP.

Our hardness result holds even in the so-called perfect completeness case, where we are promised that the graph GG contains a kk-clique. In this case, a quasi-polynomial time approximation scheme exists [FS97]. Braverman et al. [BKRW17] showed that this is tight: there exists a constant ϵ>0\epsilon>0 for which (1+ϵ)(1+\epsilon)-approximation of Densest kk-Subgraph in the perfect completeness case requires nΩ~(logn)n^{\tilde{\Omega}(\log n)}-time assuming ETH. We remark that the hardness from [Man17a] also applies in the perfect completeness case, but it only rules out polynomial time algorithms.

For the parameterized version of the problem, its W[1]-hardness follows immediately from that of kk-Clique [DF95]. Chalermsook et al. [CCK+17] showed that no ko(1)k^{o(1)}-approximation is achievable in FPT time, unless Gap-ETH is false. This hardness also holds in the perfect completeness case, and yields a running time lower bound of nΩ(logk)n^{\Omega(\log k)} for any constant factor approximation.

kk-Biclique.

Similar to Densest kk-Subgraph, while the NP-hardness for the exact version of kk-Biclique has long been known (e.g. [Joh87]), even 1.001-factor NP-hardness of approximation has not yet been proved although inapproximability results under stronger assumptions are known [Fei02, Kho06, BGH+16, Man17b]. Specifically, under strengthened variants of the Unique Games Conjecture, the problem is hard to approximate to within a factor of n1ϵn^{1-\epsilon} for any ϵ>0\epsilon>0 [BGH+16, Man17b].

On the parameterized complexity front, whether kk-Biclique is FPT (in kk) was a well-known open question (see e.g. [DF13]). This was resolved by Lin [Lin15] who showed that the problem is W[1]-hard and further provided a running time lower bound of nΩ(k)n^{\Omega(\sqrt{k})} under ETH. As stated above, this running time lower bound was extended to rule out any constant factor approximation in [CCK+17] under Gap-ETH. Furthermore, [CCK+17] showed that no o(k)o(k)-approximation exists in FPT time.

Smallest kk-Edge Subgraph.

Most of the hardness of approximation for Densest kk-Subgraph easily translates to Smallest kk-Edge Subgraph as well, with a polynomial loss in the factor of approximation. For example, the hardness from [Man17a] implies that Smallest kk-Edge Subgraph cannot be approximated to within a factor of n1/\polyloglognn^{1/\poly\log\log n} in polynomial time, assuming ETH. On the other hand, Chlamtac et al. [CDK12] devised an O(n322+ϵ)O(n^{3-2\sqrt{2}+\epsilon})-approximation algorithm for any constant ϵ>0\epsilon>0 for the problem; this remains the best known approximation algorithm for the problem.

Densest kk-Subhypergraph.

Apart from the hardness results inherited from Densest kk-Subgraph, not much is known about Densest kk-Subhypergraph. Specifically, the only new hardness is that of Applebaum [App13], who showed that the problem is hard to approximate to within nϵn^{\epsilon} for some constant ϵ>0\epsilon>0, assuming a certain cryptographic assumption; this holds even when each hyperedge has a constant size. On the other hand, the only (non-trivial) approximation algorithm is that of Chlamtac et al. [CDK+18] which achieves O(n0.698)O(n^{0.698})-approximation when the hypergraph is 3-uniform.

Steiner kk-Forest.

The Steiner kk-Forest problem is a generalization of several well-known problems, including the Steiner Tree problem and the kk-Minimum Spanning Tree problem. This problem was first explicitly defined in [HJ06] and subsequently studied in [SS10, GHNR10, DKN17]. In terms of the number of vertices nn of the input graph, the best known approximation ratio achievable in polynomial time is O(n)O(\sqrt{n}) [GHNR10] (assuming that k\poly(n)k\leq\poly(n)); furthermore, when edge weights are uniform, a better approximation ratio of O(n0.449)O(n^{0.449}) is achievable in polynomial time [DKN17]. On the other hand, as stated earlier, in terms of kk, the best known approximation ratio is O(k)O(\sqrt{k}) [GHNR10].

A reduction in [HJ06] together with W[1]-hardness of kk-Clique [DF95] implies that Steiner kk-Forest is W[1]-hard with respect to the parameter kk. We are not aware of any further parameterized complexity study of this problem (with respect to parameter kk).

Directed Steiner Network.

Several polynomial time approximation algorithms have been proposed for the Directed Steiner Network problem [CCC+99, CEGS11, FKN12, BBM+13, CDKL17, AB18]; in terms of the number of vertices nn of the input graph the best known approximation ratio is O(n2/3+ϵ)O(n^{2/3+\epsilon}) [BBM+13] and in terms of kk the best known ratio is O(k1/2+ϵ)O(k^{1/2+\epsilon}) for any constant ϵ>0\epsilon>0 [CEGS11, FKN12]. On the hardness front, Dodis and Khanna [DK99] shows that the problem is quasi-NP-hard to approximate to within a factor of 2(logn)1ϵ2^{(\log n)^{1-\epsilon}} for any constant ϵ>0\epsilon>0.

Guo et al. [GNS11] show that the exact version of the problem is W[1]-hard with respect to parameter kk. Later, [DM18] rules out even k1/4o(1)k^{1/4-o(1)}-approximation in FPT time, under Gap-ETH.

Graph Pattern Detection.

As discussed earlier, [DVW19] give ETH-based hardness results for graph pattern detection, both in the induced and non-induced case. The complexity for many special patterns has also been considered, e.g. kk-cliques, kk-bicliques (mentioned above), and kk-cycles (e.g. [AYZ97, YZ04, DKS17, DVW19, LV20]). A kk-clique can be detected in time O(nk/3ω)O(n^{\lceil k/3\rceil\omega}) using fast matrix multiplication [NP85], and the kk-Clique Conjecture in Fine-Grained Complexity postulates that this is essentially optimal [Wil18]. Any other pattern over kk vertices can be detected in time O(nk1O(n^{k-1}), without using fast matrix multiplication [BKS18]. There is also an extensive body of work on counting the number of occurrences of a pattern in a host graph (e.g. [MSOI+02, KLL13, UBK13, WW13, CM14, CDM17b]).

Preliminaries and Notation

For a natural number nn\in\mathbb{N}, we use [n][n] to denote the set of integers up to nn, [n]={1,,n}[n]=\{1,\ldots,n\}. We will use the abbreviation “w.h.p.” to mean “with high probability.”

For an undirected graph G=(V,E)G=(V,E), we use degG(v)\deg_{G}(v) to denote the degree of a vertex vVv\in V, and mindeg(G)\operatorname{min-deg}(G) to denote minvV(G)degG(v)\min_{v\in V(G)}\deg_{G}(v). For a subset SVS\subseteq V, we use G[S]=(V,E[S])G[S]=(V,E[S]) to denote the induced subgraph of GG on subset SS.

The density of GG, denoted by den(G)\operatorname{den}(G), is defined as |E|/|V||E|/|V|. We also use denk(G)\operatorname{den}_{\leq k}(G) to denote maxSV,|S|kden(G[S])\max_{S\subseteq V,|S|\leq k}\operatorname{den}(G[S]), the maximum density of subgraphs of GG of at most kk vertices.

2 Randomized Graph Product

In this section we formally define our reduction, and analyze its soundness and completeness in terms of the reduction parameters (in later sections we instantiate the parameters differently for each target bound). Our reduction takes as input a graph and applies the following randomized graph product:

Randomized Graph Product [BS89]

Input: nn-vertex Graph G=(V,E)G=(V,E), positive integers N,N,\ell.

Output: Graph G=(V,E)G^{\prime}=(V^{\prime},E^{\prime}).

The graph GG^{\prime} is constructed as follows.

  1. 1.

    For each i[N]i\in[N], sample SiVS_{i}\subseteq V by independently sampling \ell vertices uniformly from VV.

  2. 2.

    Let V={S1,,SN}V^{\prime}=\{S_{1},\dots,S_{N}\}.

  3. 3.

    For every distinct i,j[N]i,j\in[N], include (Si,Sj)(S_{i},S_{j}) in EE^{\prime} if and only if SiSjS_{i}\cup S_{j} induces a clique in GG.

We use RPN,(G)\mathrm{RP}_{N,\ell}(G) to denote the distribution of outputs of the above reduction on input graph GG.

We will show that for well-chosen NN and \ell, the randomized graph product RPN,\mathrm{RP}_{N,\ell} reduces the planted nδn^{\delta}-clique problem to densest-kk subgraph for k=k(N,,δ,n)k=k(N,\ell,\delta,n). That is, a sample from RPN,𝒢(n,12)\mathrm{RP}_{N,\ell}\circ\mathcal{G}(n,\frac{1}{2}) has no dense kk-subgraph with probability close to 11, and if on the other hand GG is a graph with an nδn^{\delta}-clique, then a sample from RPN,(G)\mathrm{RP}_{N,\ell}(G) has a dense kk-subgraphs with probability close to 11.

2.1 Completeness

We first prove that applying the randomized graph product to a graph with a large clique results in a graph with a large clique.

Lemma 4 (Completeness).

Suppose that δ(0,1),N,,k\delta\in(0,1),N,\ell,k\in\mathbb{N} are such that N10kn(1δ)N\geq 10k\cdot n^{(1-\delta)\ell} and k20k\geq 20. If GG contains a clique of size nδ\lceil n^{\delta}\rceil, then

𝐏𝐫GRPN,(G)[G contains a k-clique ]0.9.\operatorname*{\mathbf{Pr}}_{G^{\prime}\sim\mathrm{RP}_{N,\ell}(G)}\left[\,G^{\prime}\text{ contains a $k$-clique }\right]\geq 0.9.
Proof.

Let CVC\subseteq V be the nδ\lceil n^{\delta}\rceil-size clique in GG. For each i[N]i\in[N], 𝐏𝐫[SiC]=(nδn)n(1δ)\operatorname*{\mathbf{Pr}}[S_{i}\subseteq C]=\left(\frac{\lceil n^{\delta}\rceil}{n}\right)^{\ell}\geq n^{-(1-\delta)\ell}. By our lower bound on NN, the expected number of indices i[N]i\in[N] such that SiCS_{i}\subseteq C is at least 10k10k, and thus a Chernoff bound implies that with probability 1exp(4k)0.91-\exp(-4k)\geq 0.9, there exists at least kk indices i[N]i\in[N] such that SiCS_{i}\subseteq C. By definition of the randomized graph product RPN,\mathrm{RP}_{N,\ell}, these subsets form a clique in GG^{\prime}. ∎

2.2 Soundness

We now prove that if we apply the randomized graph product to a graph drawn from G(n,12)G(n,\frac{1}{2}), with high probability the resulting graph has no small subgraphs which are too dense.

Lemma 5 (Soundness).

Suppose that δ(0,1),N,,k\delta\in(0,1),N,\ell,k\in\mathbb{N} are such that N1000kn(1δ),k20N\leq 1000k\cdot n^{(1-\delta)\ell},\ell\geq k\geq 20 and n0.99δkn^{0.99\delta}\geq k\ell. If GG is drawn from 𝒢(n,12)\mathcal{G}(n,\frac{1}{2}), then

𝐏𝐫G𝒢(n,12)GRPN,(G)[denk(G)107lognδ2]0.9.\displaystyle\operatorname*{\mathbf{Pr}}_{\begin{subarray}{c}G\sim\mathcal{G}(n,\frac{1}{2})\\ G^{\prime}\sim\mathrm{RP}_{N,\ell}(G)\end{subarray}}\left[\,\operatorname{den}_{\leq k}(G^{\prime})\leq\frac{10^{7}\log n}{\ell\delta^{2}}\,\right]\geq 0.9.

We will use the following observation that allows us to translate a graph with large density to a (sub)graph with large minimum degree. This observation is folklore and appears implicitly e.g. in [Cha00]. Nonetheless, we provide the proof in Appendix A for completeness.

Observation 6.

For any H=(VH,GH)H=(V_{H},G_{H}), there exists SHS^{\prime}\subseteq H such that mindeg(H[S])den(H)\operatorname{min-deg}(H[S^{\prime}])\geq\operatorname{den}(H).

Another auxiliary lemma that is useful for us is that for any subset M[N]M\subseteq[N] not too large, the size of the union |jMSj|\left|\cup_{j\in M}S_{j}\right| is not too small relative to |M||M|\cdot\ell. This lemma is also standard and has been used in prior works (e.g. [BS89, Zuc96, CCK+17]). Once again, we provide its proof in Appendix A for completeness.

Lemma 7.

Suppose N1000n(1δ)N\leq 1000\ell n^{(1-\delta)\ell}, 2020\leq\ell. Then with probability at least 0.95 over a sample GRPN,𝒢(n,12)G^{\prime}\sim\mathrm{RP}_{N,\ell}\circ\mathcal{G}(n,\frac{1}{2}), G=({Si}i[N],E)G^{\prime}=(\{S_{i}\}_{i\in[N]},E^{\prime}), the following event occurs: for every M[N]M\subseteq[N] with |M|n0.99δ/|M|\leq n^{0.99\delta}/\ell, we have |iMSi|0.01δ|M||\bigcup_{i\in M}S_{i}|\geq 0.01\delta|M|\ell.

With the above observation and lemma ready, we can now prove the soundness guarantee.

Proof of Lemma 5.

We will assume that the event in Lemma 7 occurs and show that conditioned on this event, under the randomness of GG, with probability 0.95 all kk-subgraphs of GG^{\prime} have density at most d:=107logn/(δ2)d:=10^{7}\log n/(\ell\delta^{2}). Lemma 5 then follows immediately since (0.95)2>0.9(0.95)^{2}>0.9.

Consider any J[N]J\subseteq[N] of size kkk^{\prime}\leq k. For brevity, let (J)\mathcal{F}(J) denote the set of all graphs whose vertices are SjS_{j} for jJj\in J and the minimum degree is at least dd. For each F=({Sj}jJ,EF)(J)F=(\{S_{j}\}_{j\in J},E_{F})\in\mathcal{F}(J), let G(F)\mathcal{E}^{G}(F) denote {Sj,Sj}EF{{u,v}uSj,vSj and uv}\bigcup_{\{S_{j},S_{j^{\prime}}\}\in E_{F}}\{\{u,v\}\mid u\in S_{j},v\in S_{j^{\prime}}\text{ and }u\neq v\}, or in words, the set of edges of GG which have one endpoint in SjS_{j} and one endpoint in SjS_{j^{\prime}} for an edge (Sj,Sj)EF(S_{j},S_{j^{\prime}})\in E_{F}. Observe that

𝐏𝐫G[G[{Sj}jJ]=F]\displaystyle\operatorname*{\mathbf{Pr}}_{G}[G^{\prime}[\{S_{j}\}_{j\in J}]=F] 𝐏𝐫G[G(F)E]= 2|G(F)|,\displaystyle\leq\operatorname*{\mathbf{Pr}}_{G}[\mathcal{E}^{G}(F)\subseteq E]\,=\,2^{-|\mathcal{E}^{G}(F)|}, (1)

where the inequality follows by definition of the randomized graph product — since (Si,Sj)(S_{i},S_{j}) is an edge if and only if SiSjS_{i}\cup S_{j} is a clique in GG, the event G[{Sj}jJ]=FG^{\prime}[\{S_{j}\}_{j\in J}]=F contains the event G(F)E\mathcal{E}^{G}(F)\subseteq E — and the final equality follows because G𝒢(n,12)G\sim\mathcal{G}(n,\frac{1}{2}). To bound |G(F)||\mathcal{E}^{G}(F)|, let 𝒱G(F):=jJSj\mathcal{V}^{G}(F):=\cup_{j\in J}S_{j}.

Since we have conditioned on the event in Lemma 7 occurring,666Note that |J|=kkn0.99δ/|J|=k^{\prime}\leq k\leq n^{0.99\delta}/\ell by our assumption and hence JJ satisfies the condition in Lemma 7. we have |𝒱G(F)|0.01δk|\mathcal{V}^{G}(F)|\geq 0.01\delta k^{\prime}\ell. Now, consider any v𝒱G(F)v\in\mathcal{V}^{G}(F); from definition of 𝒱G(F)\mathcal{V}^{G}(F), vSjv\in S_{j} for some jJj\in J. Since F(J)F\in\mathcal{F}(J), SjS_{j} must have at least dd neighbors in FF. Let Sj1,,SjdS_{j_{1}},\dots,S_{j_{d^{\prime}}} denote SjS_{j}’s neighbors in FF, with ddd^{\prime}\geq d. By applying the bound in Lemma 7, we have |Sj1Sjd|0.01δd0.01δd|S_{j_{1}}\cup\dots\cup S_{j_{d^{\prime}}}|\geq 0.01\delta d^{\prime}\ell\geq 0.01\delta d\ell. In other words, vv has degree at least 0.01δd10.005δd0.01\delta d\ell-1\geq 0.005\delta d\ell in the graph (𝒱G(F),G(F))(\mathcal{V}^{G}(F),\mathcal{E}^{G}(F)). This implies that

|G(F)|\displaystyle|\mathcal{E}^{G}(F)| 12(0.01δk)(0.005δd)\displaystyle\geq\frac{1}{2}\left(0.01\delta k^{\prime}\ell\right)\left(0.005\delta d\ell\right)
105δ2kd2.\displaystyle\geq 10^{-5}\delta^{2}k^{\prime}d\ell^{2}.

Plugging the above bound back into (1), we have

𝐏𝐫G[G[{Sj}jJ]=F]2105δ2kd2\displaystyle\operatorname*{\mathbf{Pr}}_{G}[G^{\prime}[\{S_{j}\}_{j\in J}]=F]\leq 2^{-10^{-5}\delta^{2}k^{\prime}d\ell^{2}} (2)

We can use the above inequality to bound the probability that {Sj}jJ\{S_{j}\}_{j\in J} induces a subgraph with minimum degree at least dd as follows:

𝐏𝐫[mindeg(G[{Sj}jJ])d]=F(J)𝐏𝐫G[G[{Sj}jJ]=F]\displaystyle\operatorname*{\mathbf{Pr}}[\operatorname{min-deg}(G^{\prime}[\{S_{j}\}_{j\in J}])\geq d]\,=\,\sum_{F\in\mathcal{F}(J)}\operatorname*{\mathbf{Pr}}_{G}[G^{\prime}[\{S_{j}\}_{j\in J}]=F] (2)2(k)22105δ2kd2 2106δ2kd2.\displaystyle\overset{\eqref{eq:fixed-subgraph-probability-bound}}{\leq}2^{(k^{\prime})^{2}}\cdot 2^{-10^{-5}\delta^{2}k^{\prime}d\ell^{2}}\,\leq\,2^{-10^{-6}\delta^{2}k^{\prime}d\ell^{2}}.

where the first inequality follows because there are at most 2(k)22^{(k^{\prime})^{2}} subgraphs of an kk^{\prime}-vertex graph, and to obtain the final inequality we have applied that kk\ell\geq k\geq k^{\prime} and d107/(δ2)d\geq 10^{7}/(\delta^{2}\ell).

Applying Observation 6, the existence of a kk-subgraph with density at least dd would imply the existence of some J[N]J\subseteq[N] with |J|k|J|\leq k and minimum degree at least dd. Taking union bound over all J[N]J\subseteq[N] of size at most kk and applying our above bound, we have

𝐏𝐫[denk(G)]k=1kNk2106δ2kd2\displaystyle\operatorname*{\mathbf{Pr}}[\operatorname{den}_{\leq k}(G)]\,\leq\,\sum_{k^{\prime}=1}^{k}N^{k^{\prime}}\cdot 2^{-10^{-6}\delta^{2}k^{\prime}d\ell^{2}} =k=1k(N2106δ2d2)k\displaystyle=\sum_{k^{\prime}=1}^{k}\left(N\cdot 2^{-10^{-6}\delta^{2}d\ell^{2}}\right)^{k^{\prime}}
k=1k(8n2106δ2d)k\displaystyle\leq\sum_{k^{\prime}=1}^{k}\left(8\cdot n\cdot 2^{-10^{-6}\delta^{2}d\ell}\right)^{k^{\prime}\ell}
k=1k0.01k 0.95,\displaystyle\leq\sum_{k^{\prime}=1}^{k}0.01^{k^{\prime}\ell}\,\leq\,0.95,

where to obtain the second line we use that N1000kn(1δ)N\leq 1000kn^{(1-\delta)\ell} and that k20\ell\geq k\geq 20, and in the final line we use that d107logn/(δ)2d\geq 10^{7}\log n/(\ell\delta)^{2}. This completes our proof. ∎

3 Tight Running Time Lower Bounds

In this section, we prove our tight running time lower bounds for O(1)O(1)-approximating Densest kk-Subgraph, O(1)O(1)-approximating kk-Biclique and (exact) Graph Pattern Detection.

3.1 Constant Approximation for Densest kk-Subgraph

We start with the nΩ(k)n^{\Omega(k)} running time lower bound for O(1)O(1)-approximating Densest kk-Subgraph, from which the remaining results easily follow. We remark that this running time lower bound improves upon that of nΩ(logk)n^{\Omega(\log k)}, which is implicit in [CCK+17].

Theorem 8.

Assuming Hypothesis 1, for any constant C>0C>0, there is no f(k)no(k)f(k)\cdot n^{o(k)}-time algorithm that can approximate Densest kk-Subgraph to within a factor CC for any function ff. Furthermore, this holds even in the perfect completeness case where the input graph is promised to contain a kk-clique.

We will prove Theorem 8 by simply selecting an appropriate setting of parameters for the randomized graph product. Specifically, we will let =O(Clogn/k)\ell=O(C\log n/k) and N=nO()=no(logn)N=n^{O(\ell)}=n^{o(\log n)}; the generic soundness lemma (Lemma 5) then implies that the density of any kk-subgraph in the soundness case is at most O(k/C)O(k/C) which yields the desired CC inapproximability factor.

Proof of Theorem 8.

We will reduce the problem of distinguishing samples from 𝒢(n,12)\mathcal{G}(n,\frac{1}{2}) vs. 𝒢(n,12,nδ)\mathcal{G}(n,\frac{1}{2},\lceil n^{\delta}\rceil) to approximating Densest kk-subgraph.

For CC the constant specified in the statement of the theorem, choose =108Clognδ2k\ell=\lceil\frac{10^{8}C\log n}{\delta^{2}k}\rceil and N=100kn(1δ)N=\lceil 100kn^{(1-\delta)\ell}\rceil, so that d=107lognδ2k10Cd=\frac{10^{7}\log n}{\ell\delta^{2}}\leq\frac{k}{10C}. Given a graph GG, we sample GRPN,(G)G^{\prime}\sim\mathrm{RP}_{N,\ell}(G).

Completeness: If G𝒢(n,12,nδ)G\sim\mathcal{G}(n,\frac{1}{2},\lceil n^{\delta}\rceil), then by Lemma 4 and since N10kn(1δ)N\geq 10kn^{(1-\delta)\ell}, we have that with probability at least 1exp(4k)0.91-\exp(-4k)\geq 0.9, G=RPN,(G)G^{\prime}=\mathrm{RP}_{N,\ell}(G) has a kk-clique, and so denk(G)k1\operatorname{den}_{\leq k}(G^{\prime})\geq k-1.

Soundness: For any 20k20\leq k\leq\ell and any δ\delta bounded away from 0, we clearly satisfy the requirement of Lemma 5 that kn0.99δk\ell\leq n^{0.99\delta} and that N1000kn(1δ)N\leq 1000kn^{(1-\delta)\ell} for any sufficiently large nn. Hence, if G𝒢(n,12)G\sim\mathcal{G}(n,\frac{1}{2}), applying the Lemma to GRPN,(G)G^{\prime}\sim\mathrm{RP}_{N,\ell}(G) we have that with probability at least 0.90.9, denk(G)dk10C\operatorname{den}_{\leq k}(G^{\prime})\leq d\leq\frac{k}{10C}.

Thus, any algorithm which approximates Densest kk-subgraph up to a factor of CC in time f(k)Nϵkf(k)N^{\epsilon k} can solve the planted nδ\lceil n^{\delta}\rceil-clique problem in time f(k)(100kn(1δ))ϵk=g(k)n(1δ)δ2C108ϵlognf(k)(100kn^{(1-\delta)\ell})^{\epsilon k}=g(k)n^{\frac{(1-\delta)\delta^{2}C}{10^{8}}\epsilon\log n} for g(k)=f(k)(100k)ϵkg(k)=f(k)(100k)^{\epsilon k}. This contradicts Hypothesis 1 whenever limnϵ=0\lim_{n\to\infty}\epsilon=0. Choosing kk to be a sufficiently slowly growing function of nn, for any ϵ\epsilon decreasing in kk we have a contradiction of the Hypothesis. This concludes the proof. ∎

3.2 O(1)O(1)-Approximation for kk-Biclique

Recall that a kk-biclique, denoted by Kk,kK_{k,k}, is a complete bipartite graph where each side has exactly kk vertices. In the kk-Biclique problem, we are given an undirected graph GG and a positive integer kk. Further, we are promised that GG contains a Kk,kK_{k,k} as a subgraph. Our goal is to output a balanced biclique in GG of size as large as possible. (Note that we say that an algorithm achieves α\alpha-approximation if the output biclique has size at least k/αk/\alpha.)

We prove the following tight running time lower bound for O(1)O(1)-approximation of kk-Biclique:

Corollary 9.

Assuming Hypothesis 1, for any constant C>0C>0, there is no f(k)no(k)f(k)\cdot n^{o(k)}-time algorithm that can approximate kk-Biclique to within a factor CC for any function ff. Furthermore, this holds even when we are promised that the input graph contains a 2k2k-clique.

Proof of Corollary 9.

Suppose contrapositively that there is an f(k)no(k)f(k)\cdot n^{o(k)}-time algorithm 𝔸\mathbb{A} that can approximate kk-Biclique to within a factor of CC for some function ff. We may use it to approximate Densest kk-Subgraph with perfect completeness as follows. On a given instance (G,k)(G,k) of Densest kk-Subgraph with perfect completeness, we run our algorithm 𝔸\mathbb{A} on (G,k/2)(G,\lfloor k/2\rfloor). From the approximation guarantee of 𝔸\mathbb{A}, we will find a tt-Biclique for tk/2/Ck4Ct\geq\lfloor k/2\rfloor/C\geq\frac{k}{4C}. By taking this biclique together with arbitrary (k2t)(k-2t) additional vertices, we find a subset of size kk that induces at least t2k216C2t^{2}\geq\frac{k^{2}}{16C^{2}} edges. Hence, we have found a 16C216C^{2}-approximation for Densest kk-Subgraph in time f(k)no(k)f(k)\cdot n^{o(k)}, which by Theorem 8 violates Hypothesis 1. ∎

3.3 Pattern detection

Theorem 8 also yields the following corollary for the running time of pattern detection:

Corollary 10.

Assuming Hypothesis 1, for almost all kk-node patterns HH, the not necessarily induced pattern detection problem cannot be solved in time f(k)no(k)f(k)\cdot n^{o(k)}.

In the statement of Corollary 10, by “almost all kk-node patterns” we mean w.h.p. over H𝒢(k,1/2)H\sim\mathcal{G}(k,1/2). 777It may be more natural to sample uniformly from all unlabeled kk-node patterns, but w.h.p. a graph drawn from 𝒢(k,1/2)\mathcal{G}(k,1/2) has no non-trivial automorphisms [ER63], so the two notions of “almost all kk-node patterns” are in fact equivalent.

Proof.

By standard concentration inequalities, most H𝒢(k,12)H\sim\mathcal{G}(k,\frac{1}{2}) have average degree k2±o(k)\frac{k}{2}\pm o(k). For such a pattern HH, an algorithm that solves the not necessarily induced pattern detection problem also gives O(1)O(1)-approximation for Densest kk-Subgraph. Hence, the corollary immediately follows from Theorem 8. ∎

Theorem (Restatement of Theorem 3).

Assuming the Strongish Planted Clique Hypothesis (Hypothesis 1), for every kk-node pattern HH, there is no algorithm that solves the induced pattern detection problem on nn-vertex graphs in time f(k)no(k)f(k)\cdot n^{o(k)} for any function ff.

Proof.

We assume that HH is at least k4\frac{k}{4}-dense (that is, has average degree at least k2\frac{k}{2}). This is without loss of generality as otherwise, we may take the complement of HH — for induced pattern detection the complexity is the same.

We start our reduction from an instance GG of Densest kk-Subgraph as in Theorem 8. We randomly color all the vertices of GG in kk colors, one for each vertex of HH. We construct a graph GG^{\prime} from GG by keeping edge (u,v)G(u,v)\in G if and only if u,vu,v are have different colors, and the vertices in HH corresponding to those colors share an edge. (Note that we do not add any edges.)

Completeness:

If GG has a kk-clique, with probability at least kkk^{-k} it has kk different colors. In this case, the same vertices in GG^{\prime} will form an induced copy of HH.

Soundness:

If GG does not have a k4\frac{k}{4}-dense kk-subgraph, this remains true after removing edges. Hence GG^{\prime} also does not contain any k4\frac{k}{4}-dense kk-subgraph — and in particular no copy of HH.

As a result, if we can solve the induced pattern detection problem in time f(k)no(k)f(k)\cdot n^{o(k)}, we can achieve 44-approximation for Densest kk-Subgraph with probability kkk^{-k} in time f(k)no(k)f(k)\cdot n^{o(k)}. By repeating this construction 100kk100\cdot k^{k} times, we can achieve 44-approximation for Densest kk-Subgraph with probability 0.99 in time O(f(k)kk)no(k)O(f(k)\cdot k^{k})\cdot n^{o(k)}. Together with Theorem 8, this concludes our proof. ∎

4 Tight Inapproximability Results

In this section, we prove tight polynomial-time inapproximability results for Densest kk-Subgraph, Smallest kk-Edge Subgraph, Steiner kk-Forest, Directed Steiner Network, and Densest kk-Subhypergraph.

4.1 Densest kk-Subgraph

We start with the o(k)o(k)-factor hardness of Densest kk-Subgraph, from which our other results follow.

Theorem (Restatement of Theorem 2).

Assuming the Strongish Planted Clique Hypothesis (Hypothesis 1), there is no f(k)\poly(n)f(k)\cdot\poly(n)-time algorithm that can approximate Densest kk-Subgraph on nn-vertex graphs to within a factor o(k)o(k) for any function ff. Furthermore, this holds even in the perfect completeness case where the input graph is promised to contain a kk-clique.

The proof of Theorem 2 is once again via selecting an appropriate setting of parameters for the randomized graph product. Specifically, we will select =O((logn)g(k)/k)\ell=O((\log n)\cdot g(k)/k) where g(k)=o(k)g(k)=o(k) is the assumed approximation ratio and N=nO()=no(logn)N=n^{O(\ell)}=n^{o(\log n)}; the generic soundness lemma (Lemma 5) then implies that the density of any kk-vertex subgraph in the soundness case is at most k/g(k)k/g(k) as desired. The arguments are formalized below.

Proof of Theorem 2.

We will reduce the problem of distinguishing samples from 𝒢(n,12)\mathcal{G}(n,\frac{1}{2}) vs. 𝒢(n,12,nδ)\mathcal{G}(n,\frac{1}{2},\lceil n^{\delta}\rceil) to approximating Densest kk-subgraph on an NN vertex graph.

Let g(k)kg(k)\leq k be any function growing with kk. Choose =108g(k)lognδ2k\ell=\lceil\frac{10^{8}g(k)\log n}{\delta^{2}k}\rceil and N=100kn(1δ)N=\lceil 100kn^{(1-\delta)\ell}\rceil, so that d=107lognδ2k10g(k)d=\frac{10^{7}\log n}{\ell\delta^{2}}\leq\frac{k}{10g(k)}. Given a graph GG, we sample GRPN,(G)G^{\prime}\sim\mathrm{RP}_{N,\ell}(G).

Completeness: If G𝒢(n,12,nδ)G\sim\mathcal{G}(n,\frac{1}{2},\lceil n^{\delta}\rceil), then just as in the proof of Theorem 8, Lemma 4 implies that with probability at least 0.90.9, G=RPN,(G)G^{\prime}=\mathrm{RP}_{N,\ell}(G) has a kk-clique.

Soundness: For any 20k20\leq k\leq\ell and δ\delta bounded away from 0, we satisfy the requirements of Lemma 5 (just as in the proof of Theorem 8). Applying the lemma, if G𝒢(n,12)G\sim\mathcal{G}(n,\frac{1}{2}), we conclude that with probability at least 0.90.9, denk(G)dk10g(k)\operatorname{den}_{\leq k}(G^{\prime})\leq d\leq\frac{k}{10g(k)}. This means that any kk-vertex subgraph of GG^{\prime} contains at most k210g(k)<(k2)g(k)\frac{k^{2}}{10g(k)}<\frac{\binom{k}{2}}{g(k)} edges.

Hence, any algorithm which approximates Densest kk-subgraph within g(k)g(k) in time f(k)\poly(N)f(k)\poly(N) can solve the planted nδ\lceil n^{\delta}\rceil-clique problem in time f(k)\poly(100kn(1δ))=h(k)\poly(n(1δ)δ2108g(k)klogn)f(k)\cdot\poly(100kn^{(1-\delta)\ell})=h(k)\cdot\poly\left(n^{\frac{(1-\delta)\delta^{2}}{10^{8}}\frac{g(k)}{k}\log n}\right) for h(k)=f(k)\poly(100k)h(k)=f(k)\poly(100k). Choosing kk to be a sufficiently slowly growing function of nn, for any function g(k)g(k) with limn0g(k)k=0\lim_{n\to 0}\frac{g(k)}{k}=0 we have a contradiction of Hypothesis 1, as desired. ∎

4.2 Smallest kk-Edge Subgraph

Recall that, in the Smallest kk-Edge Subgraph, we are given an undirected graph GG and kk\in\mathbb{N}. The goal is to output a set SS of vertices of GG such that the induced subgraph on SS contains at least kk edges. The perfect completeness case of Smallest kk-Edge Subgraph is a special case where k=(κ2)k=\binom{\kappa}{2} for some κ\kappa\in\mathbb{N} and it is promised that GG contains a clique of size κ\kappa.

Corollary 11.

Assuming Hypothesis 1, there is no f(k)\poly(n)f(k)\cdot\poly(n)-time algorithm that can approximate Smallest kk-Edge Subgraph to within a factor o(k)o(\sqrt{k}) for any function ff. Furthermore, this holds even in the perfect completeness case.

We prove the above corollary by reducing from Densest kk-Subgraph; we remark here that similar reductions between the two problems are folklore and have appeared before in literature, e.g. in [HJ06]. However, the exact statements proved before were slightly different, so we provide the full proof here for completeness.

Proof of Corollary 11.

Suppose contrapositively that there is an f(p)\poly(n)f(p)\cdot\poly(n)-time algorithm 𝔸\mathbb{A} that can approximate Smallest pp-Edge Subgraph in the perfect completeness case to within a factor of p/g(p)\sqrt{p}/g(p) for some g(p)=ω(1)g(p)=\omega(1) and f(p)f(p). We will construct an algorithm 𝔹\mathbb{B} that achieves o(k)o(k)-approximation for Densest kk-Subgraph in the perfect completeness case in time h(k)\poly(n)h(k)\cdot\poly(n) for some function hh. Our corollary then follows from Theorem 2.

Given an instance (G=(V,E),k)(G=(V,E),k) of Densest kk-Subgraph, Algorithm 𝔹\mathbb{B} works as follows:

  • Run algorithm 𝔸\mathbb{A} on GG with p=(k2)p=\binom{k}{2}. Let the output of 𝔸\mathbb{A} be SS.

  • Enumerate to find a subset TST^{*}\subseteq S of size kk that maximizes E[T]E[T^{*}].

  • Output TT^{*}.

From the assumed approximation guarantee of 𝔸\mathbb{A}, we have |S|kp/g(p)k2/g(p)|S|\leq k\cdot\sqrt{p}/g(p)\leq k^{2}/g(p). As a result, the running time of the second step is at most (|S|k)\poly(n)2O(klogk)\poly(n)\binom{|S|}{k}\poly(n)\leq 2^{O(k\log k)}\poly(n). Hence, the running time of the entire algorithm 𝔹\mathbb{B} is (2O(klogk)+f(O(k2)))\poly(n)(2^{O(k\log k)}+f(O(k^{2})))\cdot\poly(n) as desired.

As for the approximation guarantee, let TT be a random subset of SS of size kk. Notice that each edge in E[S]E[S] belongs to TT with probability k(k1)|S|(|S|1)\frac{k(k-1)}{|S|(|S|-1)}. As a result, we have

E[T]𝔼T[E[T]]=k(k1)|S|(|S|1)E[S]k(k1)|S|(|S|1)pk2/2(k2/g(p))2(k2/4)=(g(p))2/8.\displaystyle E[T^{*}]\geq\mathbb{E}_{T}[E[T]]=\frac{k(k-1)}{|S|(|S|-1)}\cdot E[S]\geq\frac{k(k-1)}{|S|(|S|-1)}\cdot p\geq\frac{k^{2}/2}{(k^{2}/g(p))^{2}}\cdot(k^{2}/4)=(g(p))^{2}/8.

Since g(p)g(p)\to\infty as pp\to\infty, the approximation ratio of 𝔹\mathbb{B} is o(k)o(k) as desired. ∎

4.3 Steiner kk-Forest

Recall that, in the Steiner kk-Forest problem (aka the kk-Forest problem), we are given an edge-weighted undirected graph G=(V,E)G=(V,E), a set {(s1,t1),,(s,t)}\{(s_{1},t_{1}),\dots,(s_{\ell},t_{\ell})\} of demand pairs and kk\in\mathbb{N}. The goal is to find a (not necessarily induced) subgraph of GG with smallest total edge weight that connects at least kk demand pairs.

Corollary 12.

Assuming Hypothesis 1, there is no f(k)\poly(n)f(k)\cdot\poly(n)-time algorithm that can approximate Steiner kk-Forest to within a factor o(k)o(\sqrt{k}) for any function ff.

Corollary 12 immediately follows from Corollary 11 via the following reduction presented in [HJ06]:

Theorem 13 ([HJ06, Theorem 6.2]).

If there is an f(k)\poly(n)f(k)\cdot\poly(n)-time g(k)g(k)-approximation algorithm for Steiner kk-Forest, then there is an f(k)\poly(n)f(k)\cdot\poly(n)-time g(k)g(k)-approximation algorithm for Smallest kk-Edge Subgraph.

For interested readers, we remark that this reduction is as follows: for a given instance (G=(V,E),k)(G=(V,E),k) of Smallest kk-Edge Subgraph, create a star graph (with uniform edge weights) on vertices {c}V\{c\}\cup V where cc is the new vertex which is the center of the star. Then, the demand pairs are exactly EE. It is clear that there is a one-to-one correspondence between a solution of Steiner kk-Forest problem and a solution of Small kk-Edge Subgraph of the same cost.

4.4 Directed Steiner Network

Recall that, in the Directed Steiner Network problem (aka the Directed Steiner Forest problem), we are given an arc-weighted directed graph G=(V,E)G=(V,E), a set {(s1,t1),,(sk,tk)}\{(s_{1},t_{1}),\dots,(s_{k},t_{k})\} of kk demand pairs. The goal is to find a (not necessarily induced) subgraph of GG with smallest total arc weight in which tit_{i} is reachable from sis_{i} for all i[k]i\in[k].

Corollary 14.

Assuming Hypothesis 1, there is no f(k)\poly(n)f(k)\cdot\poly(n)-time algorithm that can approximate Directed Steiner Network with kk terminal pairs to within a factor o(k)o(\sqrt{k}) for any function ff.

We now prove Corollary 14 via a reduction from Smallest pp-Edge Subgraph. This reduction is similar to the reduction from the Label Cover problem by Dodis and Khanna [DK99], and was also used in subsequent works [CFM18, DM18]. Since we start from a slightly different problem, we repeat the argument below for completeness.

Proof of Corollary 14.

Suppose contrapositively that there is an f(k)\poly(n)f(k)\cdot\poly(n)-time algorithm 𝔸\mathbb{A} that can approximate Directed Steiner Network to within a factor of o(k)o(\sqrt{k}) for some function ff. We will construct an algorithm 𝔹\mathbb{B} that achieves o(p)o(\sqrt{p})-approximation for Smallest pp-Subgraph in the perfect completeness case in time h(p)\poly(n)h(p)\cdot\poly(n) for some function hh. Our corollary then follows from Corollary 11.

Below, we will present an algorithm 𝔹\mathbb{B} that succeeds with probability only kΩ(k)k^{-\Omega(k)}. This suffices for us, since we may repeat this algorithm kO(k)k^{O(k)} times and output the best solution found (as in the proof of Theorem 3). Given an instance (G=(V,E),p)(G=(V,E),p) of Smallest pp-Edge Subgraph where p=(k2)p=\binom{k}{2}, Algorithm 𝔹\mathbb{B} works as follows:

  • Let V1VkV_{1}\cup\cdots\cup V_{k} be a uniformly random partition of the vertex set VV.

  • Construct an edge-weighted directed graph G=(V,E)G^{\prime}=(V^{\prime},E^{\prime}) as follows:

    • For every ViV_{i}, create two copies of it: Vi1={v1vVi}V_{i}^{1}=\{v^{1}\mid v\in V_{i}\} and Vi2={v2vVi}V_{i}^{2}=\{v^{2}\mid v\in V_{i}\}.
      For brevity, let V1=V11Vk1V^{1}=V_{1}^{1}\cup\cdots\cup V_{k}^{1} and V2=V12Vk2V^{2}=V_{1}^{2}\cup\cdots\cup V_{k}^{2}.

    • Let VV^{\prime} be V1V2{s1,,sk}{t1,,tk}V^{1}\cup V^{2}\cup\{s_{1},\dots,s_{k}\}\cup\{t_{1},\dots,t_{k}\} where s1,,sk,t1,,tks_{1},\dots,s_{k},t_{1},\dots,t_{k} are new vertices.

    • For every v1V1v^{1}\in V^{1} and u2V2u^{2}\in V^{2}, add (v1,u2)(v^{1},u^{2}) of weight 0 to EE^{\prime} if and only if {v,u}E\{v,u\}\in E or v=uv=u.

    • For every i[k]i\in[k] and vVi1v\in V^{1}_{i}, add an arc (si,v1)(s_{i},v^{1}) of weight 1 to EE^{\prime}

    • For every j[k]j\in[k] and vVj2v\in V^{2}_{j}, add an arc (v2,tj)(v^{2},t_{j}) of weight 1 to EE^{\prime}.

    • Let the demand pairs be {(si,tj)}i[k],j[k]\{(s_{i},t_{j})\}_{i\in[k],j\in[k]}.

  • Run algorithm 𝔸\mathbb{A} on (G,{(si,tj)}i[k],j[k])(G^{\prime},\{(s_{i},t_{j})\}_{i\in[k],j\in[k]}). Let the output of 𝔸\mathbb{A} be RR.

  • Let SS be the set of vertices vVv\in V such that (si,v1)(s_{i},v^{1}) or (v2,ti)(v^{2},t_{i}) belongs to RR for some i[k]i\in[k].

  • Output SS.

Suppose that there exists a kk-clique CC in GG. With probability at least kkk^{-k}, each vertex of CC belongs to a different partition ViV_{i}. When this is the case, we may take the all 0-weighted arcs, and the 1-weighted arcs adjacent to CC as a solution. This results in a solution of cost 2k2k. From the assumed approximation guarantee of 𝔸\mathbb{A}, we must have |S|2ko(k2)=ko(p)|S|\leq 2k\cdot o(\sqrt{k^{2}})=k\cdot o(\sqrt{p}). It is also simple to see that SS must induce at least one edge in GG between Vi,VjV_{i},V_{j} for every i,j[k]i,j\in[k]. This means that SS is a solution of Smallest pp-Edge Subgraph of size at most ko(p)k\cdot o(\sqrt{p}). Hence, this is an o(p)o(\sqrt{p})-approximate solution as desired. Finally, it is obvious to see that 𝔹\mathbb{B} runs in time f(O(p))\poly(n)f(O(p))\cdot\poly(n). ∎

4.5 Densest kk-Subhypergraph

In the Densest kk-Subhypergraph problem, we are given a hypergraph G=(V,E)G=(V,E) and an integer kk. The goal is to output a set SVS\subseteq V of kk vertices that maximizes the number of hyperedges fully contained in SS. We remark here that each hyperedge in this hypergraph may have an unbounded size.

Theorem 15.

Assuming Hypothesis 1, there is no f(k)\poly(n)f(k)\cdot\poly(n)-time algorithm that can approximate Densest kk-Subhypergraph to within a factor 2o(k)2^{o(k)} for any function ff.

The proof of our inapproximability for Densest kk-Subhypergraph (Theorem 15) is unlike the others in this section: instead of starting from the inapproximability of Densest kk-Subgraph (Theorem 2), we will start from the tight running time lower bound for O(1)O(1)-approximate kk-Biclique (Corollary 9).

4.5.1 A Combinatorial Lemma

Before we describe our reduction, let us state and prove the following lemma, which bounds the number of (induced) copies of K,K_{\ell,\ell} in a Kt,tK_{t,t}-free graph for <t\ell<t. This is a generalized setting of the classic Kovari-Sos-Turan theorem [KST54], which applies only to the case =1\ell=1. Note that, due to the parameters of interest in our reduction, we only prove a good bound for large \ell; for =1\ell=1, the bound in the lemma below is trivial (and hence weaker than the Kovari-Sos-Turan theorem).

Lemma 16.

Let κ,t,\kappa,t,\ell be positive integers such that <tκ/16\ell<t\leq\kappa/16. Then, for any κ\kappa-vertex Kt,tK_{t,t}-free graph HH, the number of (not necessarily induced) copies of K,K_{\ell,\ell} in HH is at most

(2e216t)(κ)(κ).\displaystyle\left(2\cdot e^{-\frac{\ell^{2}}{16t}}\right)\cdot\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}.
Proof.

Let VV denote the vertex set of HH. We will count the number of copies of K,tK_{\ell,t} in HH in two ways. First, for every subset S(V)S\in\binom{V}{\ell}, the number of (,t)(\ell,t)-bicliques of the form (S,T)(S,T) where T(Vt)T\in\binom{V}{t} is (|N(S)|t)\binom{|N(S)|}{t} where N(S)N(S) denote the set of common neighbors of SS. Hence, in total the number of (,t)(\ell,t)-bicliques in HH is

S(V)(|N(S)|t).\sum_{S\in\binom{V}{\ell}}\binom{|N(S)|}{t}.

On the other hand, for every set T(Vt)T\in\binom{V}{t}, we must have |N(T)|t1|N(T)|\leq t-1 since HH does not contain Kt,tK_{t,t}. As a result, the number of (,t)(\ell,t)-bicliques of the form (S,T)(S,T) for a fixed TT is at most (t1)\binom{t-1}{\ell}. That is, in total, there can be at most (κt)(t1)\binom{\kappa}{t}\binom{t-1}{\ell} copies of K,tK_{\ell,t} in the graph. This implies that

(κt)(t1)S(V)(|N(S)|t).\displaystyle\binom{\kappa}{t}\binom{t-1}{\ell}\geq\sum_{S\in\binom{V}{\ell}}\binom{|N(S)|}{t}. (3)

For brevity, let λ\lambda denote (κ)/t(\kappa-\ell)/t, γ\gamma denote λ/(2t)\lambda^{-\ell/(2t)} and let xx denote γκ+(1γ)t\gamma\cdot\kappa+(1-\gamma)\cdot t (note xx is chosen so that xtκt=γ\frac{x-t}{\kappa-t}=\gamma). Let us now classify S(A)S\in\binom{A}{\ell} into two groups: those with |N(S)|x|N(S)|\geq x and those with |N(S)|<x|N(S)|<x. That is, we define

𝒮x\displaystyle\mathcal{S}_{\geq x} :={S(V)||N(S)|x},and𝒮<x:={S(V)||N(S)|<x}.\displaystyle:=\left\{S\in\binom{V}{\ell}\middle||N(S)|\geq x\right\},\qquad\text{and}\qquad\mathcal{S}_{<x}:=\left\{S\in\binom{V}{\ell}\middle||N(S)|<x\right\}.

From (3), we have

(κt)(t1)S𝒮x(|N(S)|t)|𝒮x|(xt).\displaystyle\binom{\kappa}{t}\binom{t-1}{\ell}\geq\sum_{S\in\mathcal{S}_{\geq x}}\binom{|N(S)|}{t}\geq|\mathcal{S}_{\geq x}|\cdot\binom{\lceil x\rceil}{t}.

Rearranging, we have

|𝒮x|(t1)(κt)(xt)(t1)(κtxt)t=(t1)γt\displaystyle|\mathcal{S}_{\geq x}|\leq\frac{\binom{t-1}{\ell}\binom{\kappa}{t}}{\binom{\lceil x\rceil}{t}}\leq\binom{t-1}{\ell}\left(\frac{\kappa-t}{x-t}\right)^{t}=\binom{t-1}{\ell}\gamma^{-t} (κ)(t1κ)γt\displaystyle\leq\binom{\kappa}{\ell}\left(\frac{t-1}{\kappa}\right)^{\ell}\gamma^{-t}
(κ)λγt=(κ)λ/2(κ)2/2,\displaystyle\leq\binom{\kappa}{\ell}\lambda^{-\ell}\gamma^{-t}=\binom{\kappa}{\ell}\lambda^{-\ell/2}\leq\binom{\kappa}{\ell}2^{-\ell/2}, (4)

where the last line follows because t,13κt,\ell\leq\frac{1}{3}\kappa.

Let us now count the number of K,K_{\ell,\ell} in HH. For each fixed S(V)S\in\binom{V}{\ell}, the number of K,K_{\ell,\ell} of the form (S,T)(S,T) where T(V)T\in\binom{V}{\ell} is exactly (|N(S)|)\binom{|N(S)|}{\ell}. Thus, the total number of K,K_{\ell,\ell} in GG is

S(V)(|N(S)|).\displaystyle\sum_{S\in\binom{V}{\ell}}\binom{|N(S)|}{\ell}.

This term can be further bounded by

S(V)(|N(S)|)\displaystyle\sum_{S\in\binom{V}{\ell}}\binom{|N(S)|}{\ell} =S𝒮x(|N(S)|)+S𝒮<x(|N(S)|)\displaystyle=\sum_{S\in\mathcal{S}_{\geq x}}\binom{|N(S)|}{\ell}+\sum_{S\in\mathcal{S}_{<x}}\binom{|N(S)|}{\ell}
|𝒮x|(κ)+|𝒮<x|(x)\displaystyle\leq|\mathcal{S}_{\geq x}|\binom{\kappa-\ell}{\ell}+|\mathcal{S}_{<x}|\binom{x}{\ell}
(4)2/2(κ)(κ)+(κ)(x)\displaystyle\overset{\eqref{eq:tail-bound}}{\leq}2^{-\ell/2}\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}+\binom{\kappa}{\ell}\binom{\lfloor x\rfloor}{\ell}
2/2(κ)(κ)+(κ)(κ)(xκ)\displaystyle\leq 2^{-\ell/2}\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}+\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}\cdot\left(\frac{x}{\kappa-\ell}\right)^{\ell}
2/2(κ)(κ)+(κ)(κ)(1(1γ)(κt)κ)\displaystyle\leq 2^{-\ell/2}\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}+\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}\cdot\left(1-\frac{(1-\gamma)(\kappa-t)-\ell}{\kappa-\ell}\right)^{\ell}
=2/2(κ)(κ)+(κ)(κ)(1(1γκt)(κt)κ)\displaystyle=2^{-\ell/2}\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}+\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}\cdot\left(1-\frac{(1-\gamma-\frac{\ell}{\kappa-t})(\kappa-t)}{\kappa-\ell}\right)^{\ell}
(From <tκ/2)\displaystyle(\text{From }\ell<t\leq\kappa/2) 2/2(κ)(κ)+(κ)(κ)(10.5(1γ2/κ))\displaystyle\leq 2^{-\ell/2}\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}+\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}\cdot\left(1-0.5(1-\gamma-2\ell/\kappa)\right)^{\ell}
2/2(κ)(κ)+(κ)(κ)e0.5(1γ2/κ)\displaystyle\leq 2^{-\ell/2}\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}+\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}\cdot e^{-0.5\ell(1-\gamma-2\ell/\kappa)} (5)

Consider the term (1γ2/κ)(1-\gamma-2\ell/\kappa). We can bound it as follows:

(1γ2/κ)\displaystyle(1-\gamma-2\ell/\kappa) =(11λ/2t2κ)\displaystyle=\left(1-\frac{1}{\lambda^{\ell/2t}}-\frac{2\ell}{\kappa}\right)
(From λ=(κ)/t2)\displaystyle(\text{From }\lambda=(\kappa-\ell)/t\geq 2) (10.5/2t2κ)\displaystyle\geq\left(1-0.5^{\ell/2t}-\frac{2\ell}{\kappa}\right)
(Bernoulli inequality)\displaystyle(\text{Bernoulli inequality}) (1(14t)2κ)\displaystyle\geq\left(1-\left(1-\frac{\ell}{4t}\right)-\frac{2\ell}{\kappa}\right)
=(4t2κ)\displaystyle=\left(\frac{\ell}{4t}-\frac{2\ell}{\kappa}\right)
(From κ16t)\displaystyle(\text{From }\kappa\geq 16t) 8t.\displaystyle\geq\frac{\ell}{8t}.

Plugging this back into (5), we have

S(A)(|N(S)|)\displaystyle\sum_{S\in\binom{A}{\ell}}\binom{|N(S)|}{\ell} (κ)(κ)(2/2+e216t)\displaystyle\leq\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}\cdot\left(2^{-\ell/2}+e^{-\frac{\ell^{2}}{16t}}\right)
(From <t)\displaystyle(\text{From }\ell<t) (κ)(κ)(2e216t).\displaystyle\leq\binom{\kappa}{\ell}\binom{\kappa-\ell}{\ell}\cdot\left(2\cdot e^{-\frac{\ell^{2}}{16t}}\right).

This concludes our proof. ∎

4.5.2 Proof of Theorem 15

We are now ready to prove Theorem 15.

Proof of Theorem 15.

Suppose contrapositively that there is an f(ρ)\poly(n)f(\rho)\cdot\poly(n)-time 2ρ/g(ρ)2^{\rho/g(\rho)}-approximation algorithm 𝔸\mathbb{A} for Densest ρ\rho-Subhypergraph where g=ω(1)g=\omega(1). We will construct an algorithm 𝔹\mathbb{B} that achieves O(1)O(1)-approximation for kk-Biclique with the promise that a 2k2k-Clique exists in time h(k)no(k)h(k)\cdot n^{o(k)} for some function hh. Theorem 15 then follows from Corollary 9.

We define 𝔹\mathbb{B} on input (G=(V,E),k)(G=(V,E),k) as follows:

  • Let ρ=2k\rho=2k and =ρ/g(ρ)0.1\ell=\lceil\rho/g(\rho)^{0.1}\rceil.

  • Construct a hypergraph GG^{\prime} with the same vertex set as GG, and we add a hyperedge e={u1,,u2}e=\{u_{1},\dots,u_{2\ell}\} to GG^{\prime} for all distinct u1,,u2Vu_{1},\dots,u_{2\ell}\in V that induce a 22\ell-clique in GG.

  • Run 𝔸\mathbb{A} on (G,ρ)(G^{\prime},\rho). Let SS be 𝔸\mathbb{A}’s output.

  • Use brute force to find a maximum balanced biclique in SS and output it.

Notice that algorithm 𝔹\mathbb{B} runs in time (f(2k)+2O(k))nO()=(f(2k)+2O(k))no(k)(f(2k)+2^{O(k)})\cdot n^{O(\ell)}=(f(2k)+2^{O(k)})\cdot n^{o(k)} as desired, where the term 2O(k)2^{O(k)} comes from the last step.

Next, we claim that, when kk is sufficiently large, the output biclique has size at least t:=k/8t:=\lfloor k/8\rfloor, which would give us the desired O(1)O(1)-approximation ratio. Suppose for the sake of contradiction that this is not true, i.e. that the induced graph G[S]G[S] is Kt,tK_{t,t}-free.

For any sufficiently large kk, we have <t\ell<t. Hence, we may apply Lemma 16, which gives the following upper bound on the number of not-necessarily induced copies of K,K_{\ell,\ell} in G[S]G[S]:

(2e216t)(2k)(2k)eΩ(k/g(k)0.2)(2k)(2k).\displaystyle\left(2\cdot e^{-\frac{\ell^{2}}{16t}}\right)\cdot\binom{2k}{\ell}\binom{2k-\ell}{\ell}\leq e^{-\Omega(k/g(k)^{0.2})}\cdot\binom{2k}{\ell}\binom{2k-\ell}{\ell}.

However, since each hyperedge e={u1,,u2}e=\{u_{1},\dots,u_{2\ell}\} corresponds to (2)\binom{2\ell}{\ell} copies of K,K_{\ell,\ell}, the number of hyperedges fully contained in SS is thus at most

eΩ(k/g(k)0.2)(2k)(2k)/(2)=eΩ(k/g(k)0.2)(2k2),\displaystyle e^{-\Omega(k/g(k)^{0.2})}\cdot\binom{2k}{\ell}\binom{2k-\ell}{\ell}/\binom{2\ell}{\ell}=e^{-\Omega(k/g(k)^{0.2})}\cdot\binom{2k}{2\ell},

which is less than 2ρ/g(ρ)(2k2)2^{-\rho/g(\rho)}\cdot\binom{2k}{2\ell} for any sufficiently large kk. This contradicts the approximation guarantee of 𝔸\mathbb{A} since the optimal solution (corresponding to the 2k2k-clique) contains (2k2)\binom{2k}{2\ell} hyperedges. ∎

Acknowledgment

Pasin Manurangsi would like to thank Michal Pilipczuk and Daniel Lokshtanov for posing the approximability of Densest kk-Subhypergraph as an open problem at Dagstuhl Seminar on New Horizons in Parameterized Complexity, and for helpful discussions.

References

  • [AAM+11] Noga Alon, Sanjeev Arora, Rajsekar Manokaran, Dana Moshkovitz, and Omri Weinstein. Inapproximability of densest κ\kappa-subgraph from average case hardness. 2011.
  • [AB18] Amir Abboud and Greg Bodwin. Reachability preservers: New extremal bounds and approximation algorithms. In SODA, pages 1865–1883, 2018.
  • [ABBG10] Sanjeev Arora, Boaz Barak, Markus Brunnermeier, and Rong Ge. Computational complexity and information asymmetry in financial products (extended abstract). In ICS, pages 49–65, 2010.
  • [ABW10] Benny Applebaum, Boaz Barak, and Avi Wigderson. Public-key cryptosystem from different assumptions. In STOC, pages 171–180, 2010.
  • [AFWZ95] Noga Alon, Uriel Feige, Avi Wigderson, and David Zuckerman. Derandomized graph products. Comput. Complex., 5(1):60–75, 1995.
  • [Ale11] Michael Alekhnovich. More on average case vs approximation complexity. Computational Complexity, 20(4):755–786, 2011.
  • [App13] Benny Applebaum. Pseudorandom generators with long stretch and low locality from random local one-way functions. SIAM J. Comput., 42(5):2008–2037, 2013.
  • [AYZ97] Noga Alon, Raphael Yuster, and Uri Zwick. Finding and counting given length cycles. Algorithmica, 17(3):209–223, 1997.
  • [Bar18] Siddharth Barman. Approximating Nash equilibria and dense subgraphs via an approximate version of Carathéodory’s theorem. SIAM J. Comput., 47(3):960–981, 2018.
  • [BB19] Matthew Brennan and Guy Bresler. Optimal average-case reductions to sparse PCA: from weak assumptions to strong hardness. In COLT, pages 469–470, 2019.
  • [BBB+13] Maria-Florina Balcan, Christian Borgs, Mark Braverman, Jennifer T. Chayes, and Shang-Hua Teng. Finding endogenously formed communities. In SODA, pages 767–783, 2013.
  • [BBH18] Matthew Brennan, Guy Bresler, and Wasim Huleihel. Reducibility and computational lower bounds for problems with planted sparse structure. In COLT, pages 48–166, 2018.
  • [BBH+20] Matthew Brennan, Guy Bresler, Samuel B Hopkins, Jerry Li, and Tselil Schramm. Statistical query algorithms and low-degree tests are almost equivalent. arXiv preprint arXiv:2009.06107, 2020.
  • [BBM+13] Piotr Berman, Arnab Bhattacharyya, Konstantin Makarychev, Sofya Raskhodnikova, and Grigory Yaroslavtsev. Approximation algorithms for spanner problems and directed steiner forest. Inf. Comput., 222:93–107, 2013.
  • [BCC+10] Aditya Bhaskara, Moses Charikar, Eden Chlamtac, Uriel Feige, and Aravindan Vijayaraghavan. Detecting high log-densities: an O(n1/4)O(n^{1/4}) approximation for densest kk-subgraph. In STOC, pages 201–210, 2010.
  • [BCKS16] Umang Bhaskar, Yu Cheng, Young Kun Ko, and Chaitanya Swamy. Hardness results for signaling in bayesian zero-sum and network routing games. In EC, pages 479–496, 2016.
  • [BCV+12] Aditya Bhaskara, Moses Charikar, Aravindan Vijayaraghavan, Venkatesan Guruswami, and Yuan Zhou. Polynomial integrality gaps for strong SDP relaxations of densest kk-subgraph. In SODA, pages 388–405, 2012.
  • [BGH+16] Amey Bhangale, Rajiv Gandhi, Mohammad Taghi Hajiaghayi, Rohit Khandekar, and Guy Kortsarz. Bicovering: Covering edges with two small subsets of vertices. In ICALP, pages 6:1–6:12, 2016.
  • [BGKM18] Arnab Bhattacharyya, Suprovat Ghoshal, Karthik C. S., and Pasin Manurangsi. Parameterized intractability of even set and shortest vector problem from Gap-ETH. In ICALP, pages 17:1–17:15, 2018.
  • [BGLR93] Mihir Bellare, Shafi Goldwasser, Carsten Lund, and A. Russeli. Efficient probabilistically checkable proofs and applications to approximations. In STOC, pages 294–304, 1993.
  • [BHK+19] Boaz Barak, Samuel B. Hopkins, Jonathan A. Kelner, Pravesh K. Kothari, Ankur Moitra, and Aaron Potechin. A nearly tight sum-of-squares lower bound for the planted clique problem. SIAM J. Comput., 48(2):687–735, 2019.
  • [BKRW17] Mark Braverman, Young Kun-Ko, Aviad Rubinstein, and Omri Weinstein. ETH hardness for densest-k-subgraph with perfect completeness. In SODA, pages 1326–1341, 2017.
  • [BKS13] Boaz Barak, Guy Kindler, and David Steurer. On the optimality of semidefinite relaxations for average-case and generalized constraint satisfaction. In ITCS, pages 197–214, 2013.
  • [BKS18] Markus Bläser, Balagopal Komarath, and Karteek Sreenivasaiah. Graph pattern polynomials. In FSTTCS, pages 18:1–18:13, 2018.
  • [BKW15] Mark Braverman, Young Kun-Ko, and Omri Weinstein. Approximating the best Nash equilibrium in no(logn)n^{o(\log n)}-time breaks the exponential time hypothesis. In SODA, pages 970–982, 2015.
  • [BPR16] Yakov Babichenko, Christos H. Papadimitriou, and Aviad Rubinstein. Can almost everybody be almost happy? In ITCS, pages 1–9, 2016.
  • [BR13] Quentin Berthet and Philippe Rigollet. Complexity theoretic lower bounds for sparse principal component detection. In COLT, pages 1046–1066, 2013.
  • [BS89] Piotr Berman and Georg Schnitger. On the complexity of approximating the independent set problem. In STACS, pages 256–268, 1989.
  • [BS16] Tengyao Wang Quentin Berthet and Richard J Samworth. Statistical and computational trade-offs in estimation of sparse principal components. The Annals of Statistics, 2016.
  • [CCC+99] Moses Charikar, Chandra Chekuri, To-Yat Cheung, Zuo Dai, Ashish Goel, Sudipto Guha, and Ming Li. Approximation algorithms for directed steiner problems. J. Algorithms, 33(1):73–91, 1999.
  • [CCK+17] Parinya Chalermsook, Marek Cygan, Guy Kortsarz, Bundit Laekhanukit, Pasin Manurangsi, Danupon Nanongkai, and Luca Trevisan. From Gap-ETH to FPT-Inapproximability: Clique, dominating set, and more. In FOCS, pages 743–754, 2017.
  • [CDK12] Eden Chlamtac, Michael Dinitz, and Robert Krauthgamer. Everywhere-sparse spanners via dense subgraphs. In FOCS, pages 758–767, 2012.
  • [CDK+18] Eden Chlamtác, Michael Dinitz, Christian Konrad, Guy Kortsarz, and George Rabanca. The densest k-subhypergraph problem. SIAM J. Discret. Math., 32(2):1458–1477, 2018.
  • [CDKL17] Eden Chlamtác, Michael Dinitz, Guy Kortsarz, and Bundit Laekhanukit. Approximating spanners and directed steiner forest: Upper and lower bounds. In SODA, pages 534–553, 2017.
  • [CDM17a] Eden Chlamtác, Michael Dinitz, and Yury Makarychev. Minimizing the union: Tight approximations for small set bipartite vertex expansion. In SODA, pages 881–899, 2017.
  • [CDM17b] Radu Curticapean, Holger Dell, and Dániel Marx. Homomorphisms are a good basis for counting small subgraphs. In STOC, pages 210–223, 2017.
  • [CEGS11] Chandra Chekuri, Guy Even, Anupam Gupta, and Danny Segev. Set connectivity problems in undirected graphs and the directed steiner network problem. ACM Trans. Algorithms, 7(2):18:1–18:17, 2011.
  • [CFM18] Rajesh Chitnis, Andreas Emil Feldmann, and Pasin Manurangsi. Parameterized approximation algorithms for bidirected steiner network problems. In ESA, pages 20:1–20:16, 2018.
  • [Cha00] Moses Charikar. Greedy approximation algorithms for finding dense components in a graph. In APPROX, pages 84–95, 2000.
  • [CHK11] Moses Charikar, MohammadTaghi Hajiaghayi, and Howard J. Karloff. Improved approximation algorithms for label cover problems. Algorithmica, 61(1):190–206, 2011.
  • [CJ03] Liming Cai and David W. Juedes. On the existence of subexponential parameterized algorithms. J. Comput. Syst. Sci., 67(4):789–807, 2003.
  • [CKL+18] Marek Cygan, Pawel Komosa, Daniel Lokshtanov, Michal Pilipczuk, Marcin Pilipczuk, and Saket Saurabh. Randomized contractions meet lean decompositions. CoRR, abs/1810.06864, 2018.
  • [CM14] Radu Curticapean and Dániel Marx. Complexity of counting subgraphs: Only the boundedness of the vertex-cover number counts. In FOCS, pages 130–139, 2014.
  • [CM18] Eden Chlamtác and Pasin Manurangsi. Sherali-adams integrality gaps matching the log-density threshold. In APPROX, pages 10:1–10:19, 2018.
  • [CMMV17] Eden Chlamtác, Pasin Manurangsi, Dana Moshkovitz, and Aravindan Vijayaraghavan. Approximation algorithms for label cover and the log-density threshold. In SODA, pages 900–919, 2017.
  • [CPP16] Marek Cygan, Marcin Pilipczuk, and Michal Pilipczuk. Known algorithms for edge clique cover are probably optimal. SIAM J. Comput., 45(1):67–83, 2016.
  • [Dan16] Amit Daniely. Complexity theoretic limitations on learning halfspaces. In STOC, pages 105–117, 2016.
  • [DF95] Rodney G. Downey and Michael R. Fellows. Fixed-parameter tractability and completeness II: on completeness for W[1]. Theor. Comput. Sci., 141(1&2):109–131, 1995.
  • [DF13] Rodney G. Downey and Michael R. Fellows. Fundamentals of Parameterized Complexity. Springer, 2013.
  • [DK99] Yevgeniy Dodis and Sanjeev Khanna. Design networks with bounded pairwise distance. In STOC, pages 750–759, 1999.
  • [DKN17] Michael Dinitz, Guy Kortsarz, and Zeev Nutov. Improved approximation algorithm for steiner k-forest with nearly uniform weights. ACM Trans. Algorithms, 13(3):40:1–40:16, 2017.
  • [DKS17] Søren Dahlgaard, Mathias Bæk Tejs Knudsen, and Morten Stöckel. Finding even cycles faster via capped kk-walks. In STOC, pages 112–120, 2017.
  • [DM18] Irit Dinur and Pasin Manurangsi. ETH-hardness of approximating 2-CSPs and directed steiner network. In ITCS, pages 36:1–36:20, 2018.
  • [DS16] Amit Daniely and Shai Shalev-Shwartz. Complexity theoretic limitations on learning DNF’s. In COLT, pages 815–830, 2016.
  • [DVW19] Mina Dalirrooyfard, Thuy Duong Vuong, and Virginia Vassilevska Williams. Graph pattern detection: hardness for all induced patterns and faster non-induced cycles. In STOC, pages 1167–1178, 2019.
  • [ER63] Paul Erdős and Alfréd Rényi. Asymmetric graphs. Acta Mathematica Academiae Scientiarum Hungarica, 14(3-4):295–315, 1963.
  • [Fei02] Uriel Feige. Relations between average case complexity and approximation complexity. In STOC, pages 534–543, 2002.
  • [FGR+17] Vitaly Feldman, Elena Grigorescu, Lev Reyzin, Santosh S. Vempala, and Ying Xiao. Statistical algorithms and a lower bound for detecting planted cliques. J. ACM, 64(2):8:1–8:37, 2017.
  • [FK03] Uriel Feige and Robert Krauthgamer. The probable value of the lovász–schrijver relaxations for maximum independent set. SIAM J. Comput., 32(2):345–370, 2003.
  • [FKN12] Moran Feldman, Guy Kortsarz, and Zeev Nutov. Improved approximation algorithms for directed steiner forest. J. Comput. Syst. Sci., 78(1):279–292, 2012.
  • [FKP01] Uriel Feige, Guy Kortsarz, and David Peleg. The dense k-subgraph problem. Algorithmica, 29(3):410–421, 2001.
  • [FS97] Uriel Feige and Michael Seltser. On the densest kk-subgraph problem. Technical report, Weizmann Institute of Science, Rehovot, Israel, 1997.
  • [GHNR10] Anupam Gupta, Mohammad Taghi Hajiaghayi, Viswanath Nagarajan, and R. Ravi. Dial a ride from k-forest. ACM Trans. Algorithms, 6(2):41:1–41:21, 2010.
  • [GMZ17] Chao Gao, Zongming Ma, and Harrison H Zhou. Sparse CCA: adaptive estimation and computational barriers. The Annals of Statistics, 2017.
  • [GNS11] Jiong Guo, Rolf Niedermeier, and Ondrej Suchý. Parameterized complexity of arc-weighted directed steiner problems. SIAM J. Discret. Math., 25(2):583–599, 2011.
  • [HJ06] Mohammad Taghi Hajiaghayi and Kamal Jain. The prize-collecting generalized steiner tree problem via a new approach of primal-dual schema. In SODA, pages 631–640, 2006.
  • [HK11] Elad Hazan and Robert Krauthgamer. How hard is it to approximate the best Nash equilibrium? SIAM J. Comput., 40(1):79–91, 2011.
  • [HWX15] Bruce E. Hajek, Yihong Wu, and Jiaming Xu. Computational lower bounds for community detection on random graphs. In COLT, pages 899–928, 2015.
  • [IP01] Russell Impagliazzo and Ramamohan Paturi. On the complexity of kk-SAT. J. Comput. Syst. Sci., 62(2):367–375, 2001.
  • [IPZ01] Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which problems have strongly exponential complexity? J. Comput. Syst. Sci., 63(4):512–530, 2001.
  • [Jer92] Mark Jerrum. Large cliques elude the metropolis process. Random Struct. Algorithms, 3(4):347–360, 1992.
  • [Joh87] David S. Johnson. The NP-completeness column: An ongoing guide. J. Algorithms, 8(5):438–448, September 1987.
  • [Kar72] Richard M. Karp. Reducibility among combinatorial problems. In Proceedings of a symposium on the Complexity of Computer Computations, pages 85–103, 1972.
  • [Kar76] Richard Karp. Probabilistic analysis of some combinatorial search problems. Algorithms and Complexity: New Directions and Recent Results, 1976.
  • [Kho06] Subhash Khot. Ruling out PTAS for graph min-bisection, dense k-subgraph, and bipartite clique. SIAM J. Comput., 36(4):1025–1071, 2006.
  • [KLL13] Miroslaw Kowaluk, Andrzej Lingas, and Eva-Marta Lundell. Counting and detecting small subgraphs via equations. SIAM J. Discret. Math., 27(2):892–909, 2013.
  • [KP93] Guy Kortsarz and David Peleg. On choosing a dense subgraph (extended abstract). In 34th Annual Symposium on Foundations of Computer Science, Palo Alto, California, USA, 3-5 November 1993, pages 692–701, 1993.
  • [KST54] Tamás Kővári, Vera T Sós, and Pál Turán. On a problem of Zarankiewicz. In Colloquium Mathematicum, volume 3, pages 50–57, 1954.
  • [Lin15] Bingkai Lin. The parameterized complexity of k-biclique. In SODA, pages 605–615, 2015.
  • [LMS11] Daniel Lokshtanov, Dániel Marx, and Saket Saurabh. Lower bounds based on the exponential time hypothesis. Bull. EATCS, 105:41–72, 2011.
  • [LRS15] James R. Lee, Prasad Raghavendra, and David Steurer. Lower bounds on the size of semidefinite programming relaxations. In STOC, pages 567–576, 2015.
  • [LRSZ17] Daniel Lokshtanov, M. S. Ramanujan, Saket Saurabh, and Meirav Zehavi. Parameterized complexity and approximability of directed odd cycle transversal. CoRR, abs/1704.04249, 2017.
  • [LV20] Andrea Lincoln and Nikhil Vyas. Algorithms and lower bounds for cycles and walks: Small space and sparse graphs. In ITCS, pages 11:1–11:17, 2020.
  • [Man17a] Pasin Manurangsi. Almost-polynomial ratio ETH-hardness of approximating densest k-subgraph. In STOC, pages 954–961, 2017.
  • [Man17b] Pasin Manurangsi. Inapproximability of maximum edge biclique, maximum balanced biclique and minimum kk-cut from the small set expansion hypothesis. In ICALP, pages 79:1–79:14, 2017.
  • [MM15] Pasin Manurangsi and Dana Moshkovitz. Approximating dense max 2-CSPs. In APPROX, pages 396–415, 2015.
  • [MR17] Pasin Manurangsi and Aviad Rubinstein. Inapproximability of VC dimension and Littlestone’s dimension. In COLT, pages 1432–1460, 2017.
  • [MSOI+02] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon. Network motifs: Simple building blocks of complex networks. Science, 298(5594):824–827, 2002.
  • [NP85] Jaroslav Nesetril and Svatopluk Poljak. On the complexity of the subgraph problem. Commentationes Mathematicae Universitatis Carolinae, 26:415–419, 1985.
  • [RS10] Prasad Raghavendra and David Steurer. Graph expansion and the unique games conjecture. In STOC, pages 755–764, 2010.
  • [Rub16] Aviad Rubinstein. Settling the complexity of computing approximate two-player Nash equilibria. In FOCS, pages 258–265, 2016.
  • [Rub17] Aviad Rubinstein. Detecting communities is hard (and counting them is even harder). In ITCS, pages 42:1–42:13, 2017.
  • [SS10] Danny Segev and Gil Segev. Approximate k-steiner forests via the lagrangian relaxation technique with internal preprocessing. Algorithmica, 56(4):529–549, 2010.
  • [UBK13] Johan Ugander, Lars Backstrom, and Jon M. Kleinberg. Subgraph frequencies: mapping the empirical and extremal geography of large graph collections. In WWW, pages 1307–1318, 2013.
  • [Wil18] Virginia Vassilevska Williams. On some fine-grained questions in algorithms and complexity, 2018. ICM survey.
  • [WW13] Virginia Vassilevska Williams and Ryan Williams. Finding, minimizing, and counting weighted subgraphs. SIAM J. Comput., 42(3):831–854, 2013.
  • [YZ04] Raphael Yuster and Uri Zwick. Detecting short directed cycles using rectangular matrix multiplication and dynamic programming. In SODA, pages 254–260, 2004.
  • [Zuc96] David Zuckerman. On unapproximable versions of NP-complete problems. SIAM Journal on Computing, 25(6):1293–1304, 1996.

Appendix A Deferred proofs from Section 2.2

In this appendix we include omitted proofs from Section 2.2.

Observation (Restatement of Observation 6).

For any H=(VH,GH)H=(V_{H},G_{H}), there exists SHS^{\prime}\subseteq H such that mindeg(H[S])den(H)\operatorname{min-deg}(H[S^{\prime}])\geq\operatorname{den}(H).

Proof of Observation 6.

Let us repeatedly apply the following: remove any vertex with degree less than den(H)\operatorname{den}(H) from the graph. In total, the number of edges removed is less than den(H)|VH|=|EH|\operatorname{den}(H)\cdot|V_{H}|=|E_{H}|. Thus, the final graph is not empty; by definition, its minimum degree is at least den(H)\operatorname{den}(H). ∎

Lemma (Restatement of Lemma 7).

Suppose N1000n(1δ)N\leq 1000\ell n^{(1-\delta)\ell}, 2020\leq\ell. Then with probability at least 0.95 over a sample GRPN,𝒢(n,12)G^{\prime}\sim\mathrm{RP}_{N,\ell}\circ\mathcal{G}(n,\frac{1}{2}), G=({Si}i[N],E)G^{\prime}=(\{S_{i}\}_{i\in[N]},E^{\prime}), the following event occurs: for every M[N]M\subseteq[N] with |M|n0.99δ/|M|\leq n^{0.99\delta}/\ell, we have |iMSi|0.01δ|M||\bigcup_{i\in M}S_{i}|\geq 0.01\delta|M|\ell.

Proof of Lemma 7.

For a fixed MNM\subseteq N, we have

𝐏𝐫[|iMSi|<0.01δ|M|]\displaystyle\operatorname*{\mathbf{Pr}}\left[\left|\bigcup_{i\in M}S_{i}\right|<0.01\delta|M|\ell\right] UV|U|=0.01δ|M|1𝐏𝐫[(iMSi)U]\displaystyle\leq\sum_{U\subseteq V\atop|U|=\lceil 0.01\delta|M|\ell\rceil-1}\operatorname*{\mathbf{Pr}}\left[\left(\bigcup_{i\in M}S_{i}\right)\subseteq U\right]
=UV|U|=0.01δ|M|1iM𝐏𝐫[SiU]\displaystyle=\sum_{U\subseteq V\atop|U|=\lceil 0.01\delta|M|\ell\rceil-1}\prod_{i\in M}\operatorname*{\mathbf{Pr}}\left[S_{i}\subseteq U\right]
=UV|U|=0.01δ|M|1iM(|U|/n)\displaystyle=\sum_{U\subseteq V\atop|U|=\lceil 0.01\delta|M|\ell\rceil-1}\prod_{i\in M}\left(|U|/n\right)^{\ell}
=UV|U|=0.01δ|M|1(|U|/n)|M|\displaystyle=\sum_{U\subseteq V\atop|U|=\lceil 0.01\delta|M|\ell\rceil-1}\left(|U|/n\right)^{|M|\ell}
n0.01δ|M|1(0.01δ|M|1n)|M|\displaystyle\leq n^{\lceil 0.01\delta|M|\ell\rceil-1}\left(\frac{\lceil 0.01\delta|M|\ell\rceil-1}{n}\right)^{|M|\ell}
n0.01δ|M|(0.01δ|M|n)|M|\displaystyle\leq n^{0.01\delta|M|\ell}\left(\frac{0.01\delta|M|\ell}{n}\right)^{|M|\ell}
=(0.01δ|M|n10.01δ)|M|.\displaystyle=\left(\frac{0.01\delta|M|\ell}{n^{1-0.01\delta}}\right)^{|M|\ell}.

Let Δ=n0.99δ/\Delta=\lfloor n^{0.99\delta}/\ell\rfloor. Now, we take the union bound over all M[N]M\subseteq[N] of size at most Δ\Delta. We have

𝐏𝐫[M[N],|M|Δ and |iMSi|<0.01|M|]\displaystyle\operatorname*{\mathbf{Pr}}\left[\exists M\subseteq[N],|M|\leq\Delta\text{ and }\left|\bigcup_{i\in M}S_{i}\right|<0.01|M|\ell\right] t=1Δ(Nt)(0.01δtn10.01δ)t\displaystyle\leq\sum_{t=1}^{\Delta}\binom{N}{t}\left(\frac{0.01\delta t\ell}{n^{1-0.01\delta}}\right)^{t\ell}
t=1ΔNt(0.01δtn10.01δ)t\displaystyle\leq\sum_{t=1}^{\Delta}N^{t}\left(\frac{0.01\delta t\ell}{n^{1-0.01\delta}}\right)^{t\ell}
t=1Δ(0.01δN1/tn10.01δ)t\displaystyle\leq\sum_{t=1}^{\Delta}\left(\frac{0.01\delta N^{1/\ell}t\ell}{n^{1-0.01\delta}}\right)^{t\ell}
And since by assumption N1000kn(1δ)N\leq 1000k\cdot n^{(1-\delta)\ell} and k20\ell\geq k\geq 20,
t=1Δ(0.04tn0.99δ)t\displaystyle\leq\sum_{t=1}^{\Delta}\left(\frac{0.04t\ell}{n^{0.99\delta}}\right)^{t\ell}
And finally because we have tΔn0.99δ/t\leq\Delta\leq n^{0.99\delta}/\ell,
t=1Δ0.04t0.05.\displaystyle\leq\sum_{t=1}^{\Delta}0.04^{t}\,\leq 0.05.\qed