This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

MIT, [email protected]://orcid.org/0000-0003-4844-2863Supported by NSF Grant CCF-2330048, BSF Grant 2020356 and a Simons Investigator Award.MIT, [email protected]://orcid.org/0009-0007-8381-5705 \Copyright Virginia Vassilevska Williams and Alek Westover \ccsdesc[500]Theory of computation Graph algorithms analysis \supplement \supplementdetails[subcategory=, cite=, swhid=]Softwarehttps://github.com/awestover/listing-C6s-LP

Acknowledgements.
We’d like to thank Nathan S. Sheffield, Claire Zhang, and Ce Jin for helpful discussions.\EventEditorsRaghu Meka \EventNoEds1 \EventLongTitle16th Innovations in Theoretical Computer Science Conference (ITCS 2025) \EventShortTitleITCS 2025 \EventAcronymITCS \EventYear2025 \EventDateJanuary 7–10, 2025 \EventLocationColumbia University, New York, NY, USA \EventLogo \SeriesVolume325 \ArticleNo6

Listing 66-Cycles in Sparse Graphs

Virginia Vassilevska Williams    Alek Westover
Abstract

This work considers the problem of output-sensitive listing of occurrences of 2k2k-cycles for fixed constant k2k\geq 2 in an undirected host graph with mm edges and tt 2k2k-cycles. Recent work of Jin and Xu (and independently Abboud, Khoury, Leibowitz, and Safier) [STOC 2023] gives an O(m4/3+t)O(m^{4/3}+t) time algorithm for listing 44-cycles, and recent work by Jin, Vassilevska Williams and Zhou [SOSA 2024] gives an O~(n2+t)\widetilde{O}(n^{2}+t) time algorithm for listing 66-cycles in nn node graphs. We focus on resolving the next natural question: obtaining listing algorithms for 66-cycles in the sparse setting, i.e., in terms of mm rather than nn. Previously, the best known result here is the better of Jin, Vassilevska Williams and Zhou’s O~(n2+t)\widetilde{O}(n^{2}+t) algorithm and Alon, Yuster and Zwick’s O(m5/3+t)O(m^{5/3}+t) algorithm.

We give an algorithm for listing 66-cycles with running time O~(m1.6+t)\widetilde{O}(m^{1.6}+t). Our algorithm is a natural extension of Dahlgaard, Knudsen and Stöckel’s [STOC 2017] algorithm for detecting a 2k2k-cycle. Our main technical contribution is the analysis of the algorithm which involves a type of “supersaturation” lemma relating the number of 2k2k-cycles in a bipartite graph to the sizes of the parts in the bipartition and the number of edges. We also give a simplified analysis of Dahlgaard, Knudsen and Stöckel’s 2k2k-cycle detection algorithm (with a small polylogarithmic increase in the running time), which is helpful in analyzing our listing algorithm.

keywords:
Graph algorithms, cycles listing, fine-grained complexity, sparse graphs
category:
\relatedversion

1 Introduction

Listing copies of a small pattern graph that occur as subgraphs of a host graph is a fundamental problem in algorithmic graph theory. In this work, we consider the problem of listing C2kC_{2k}’s (i.e., 2k2k-cycles) for fixed constant k2k\geq 2. Some examples of applications of C2kC_{2k} listing include analyzing social networks [motivationSocial], and understanding causal relationships in biological interaction graphs [motivationBIO]. (See e.g. [dense_cycle_detection_YZ] for further motivation.) In the following discussion we consider an nn vertex mm edge graph GG with tt C2kC_{2k}’s (where k2k\geq 2 will be clear from context and tt is not necessarily known).

The main reason for focusing on even length cycles is that while there are dense graphs that contain no odd cycles (e.g. bipartite graphs), a classic result of Bondy and Simonovits [evenextremal] states that any graph with at least 100kn1+1/k100kn^{1+1/k} edges contains a C2kC_{2k}, for any integer k2k\geq 2. This fact enables efficient “combinatorial” algorithms for C2kC_{2k} detection, in contrast to the case of odd length cycles where efficient C2k+1C_{2k+1} detection algorithms are based on matrix multiplication. In fact, very simple reductions (e.g. [vthesis]) show that for any k1k\geq 1, C2k+1C_{2k+1} detection is at least as hard as triangle detection, and the latter problem is known to be fine-grained subcubically equivalent to Boolean Matrix Multiplication [focsy]. Thus, fast algorithms for odd cycles cannot avoid matrix multiplication. We focus on even cycles from now on.

A classic result of Yuster and Zwick [dense_cycle_detection_YZ] shows how to determine whether a graph contains a C2kC_{2k} in O(n2)O(n^{2}) time for any given constant k2k\geq 2. This quadratic running time is conjectured to be optimal in general (see [short_cycle_removal, LincolnVyas, Kn17], also discussed below); in particular, [short_cycle_removal] gave concrete evidence for the hardness of C4C_{4} detection.

Nevertheless, in the regime of sparser graphs, where m<o(n1+1/k)m<o(n^{1+1/k}) improvements are possible. For C4C_{4}’s there is a simple O(m4/3)O(m^{4/3}) algorithm [alon1997finding] that matches the performance of the O(n2)O(n^{2}) algorithm of [dense_cycle_detection_YZ] for m=Θ(n3/2)m=\Theta(n^{3/2}), and improves on the performance for all m<o(n3/2)m<o(n^{3/2}). Dahlgaard, Knudsen and Stöckel [Kn17] give an algorithm for C2kC_{2k} detection with running time O(m2k/(k+1))O(m^{2k/(k+1)}) for every constant k2k\geq 2. This matches the O(n2)O(n^{2}) running time from [dense_cycle_detection_YZ] for m=Θ(n1+1/k)m=\Theta(n^{1+1/k}) and improves on it for m<o(n1+1/k)m<o(n^{1+1/k}).

In fine-grained complexity, it is common to assume widely-believed hypotheses and use reductions to obtain conditional lower bounds for fundamental problems. For the special case of cycle detection problems, most conditional lower bounds are based on hypotheses related to triangle detection. One of the most common hypotheses is that triangle detection in nn-node graphs does not have a “combinatorial” 111The class of combinatorial algorithms is not well defined, but intuitively refers to algorithms that do not use fast matrix multiplication as a subroutine. O(n3ε)O(n^{3-\varepsilon}) time algorithm for ε>0\varepsilon>0 (in the word-RAM model). Vassilevska W. and Williams [focsy] showed that this hypothesis is equivalent to the so called BMM Hypothesis that postulates that there is no O(n3ε)O(n^{3-\varepsilon}) time combinatorial algorithm for multiplying two n×nn\times n Boolean matrices.

As mentioned earlier, it is easy to show that under the BMM hypothesis, detecting any odd cycle C2k+1C_{2k+1} requires n3o(1)n^{3-o(1)} time. Dahlgaard, Knudsen and Stöckel [Kn17] gave lower bounds for combinatorial detection of even cycles of fixed length in sparse graphs. They show that there is no combinatorial algorithm for C6C_{6} detection, or C2kC_{2k} detection for any k>4k>4, with running time O(m3/2ε)O(m^{3/2-\varepsilon}) for ε>0\varepsilon>0. Lincoln and Vyas [LincolnVyas] give a similar conditional lower bound under a different hypothesis, extending the lower bounds for potentially non-combinatorial algorithms. Lincoln and Vyas show that, for large enough constant kk, a C2kC_{2k} detection algorithm in graphs with mO(n)m\leq O(n) with running time m3/2εm^{3/2-\varepsilon} would imply an algorithm for max\max-3-SAT with running time 2(1ε)nnO(1)2^{(1-\varepsilon^{\prime})n}n^{O(1)}.

The problem of listing even cycles is less well-understood. Without improving the C2kC_{2k} detection algorithms discussed above, the best result that we could hope for in general is an algorithm with running time O(n2+t)O(n^{2}+t), where tt is the number of 2k2k-cycles in the graph, and O(m2k/(k+1)+t)O(m^{2k/(k+1)}+t) in the sparse setting where m<o(n1+1/k)m<o(n^{1+1/k}). Recently, [Ab22] and [3sumLBCe] gave such an algorithm for C4C_{4} listing. In fact, Jin and Xu’s [3sumLBCe] algorithm gives a stronger guarantee: After O(m4/3)O(m^{4/3}) pre-processing time, they can (deterministically) enumerate 44-cycles with O(1)O(1) delay per cycle. Jin and Xu [3sumLBCe] (and concurrently by Abboud, Bringmann, and Fischer [3sumLBAmir]) shows that, under the 3SUM Hypothesis, there is no algorithm for C4C_{4} enumeration with O(n2ε)O(n^{2-\varepsilon}) or O(m4/3ε)O(m^{4/3-\varepsilon}) pre-processing time and no(1)n^{o(1)} delay, for ε>0\varepsilon>0.

For C6C_{6}’s, Jin, Zhou and Vassilevska Williams [C6sCe] give an O~(n2+t)\widetilde{O}(n^{2}+t) listing algorithm. For sparse listing algorithms, the previous state of the art is [alon1997finding] whose work implies an O(m(2k1)/k+t)O(m^{(2k-1)/k}+t) time C2kC_{2k} listing algorithm. See also [bringmannclass] where the complexity of the harder problem of listing HH-partite222A kk-node HH is an HH-partite subgraph of a kk-partite graph GG with vertex set V=aV(H)VaV=\cup_{a\in V(H)}V_{a} if there are kk vertices v1,,vkv_{1},\ldots,v_{k} such that vaVav_{a}\in V_{a} for each a{1,,k}a\in\{1,\ldots,k\} and the mapping avaa\rightarrow v_{a} is an isomorphism between HH and the subgraph of GG induced by v1,,vkv_{1},\ldots,v_{k}. In other words, instead of looking for an arbitrary subgraph of GG isomorphic to HH, one only focuses on the subgraphs with exactly one node in each VaV_{a} and such that the node picked from VaV_{a} corresponds to node aa of HH. subgraphs is investigated.

1.1 Our contributions

We consider the problem of listing all C6C_{6}s in an mm-edge graph. The best known result so far is an O(m5/3+t)O(m^{5/3}+t) time algorithm that follows from the work of [alon1997finding]. Our main result is the first improvement over this 27 year old running time: {restatable*}theoremlistfast There is an algorithm for listing C6C_{6}’s in time O~(m1.6+t)\widetilde{O}(m^{1.6}+t).

We now summarize the ideas needed to prove Section 1.1.

Section 1.1 follows easily after establishing a certain bound on the number of capped kk-walks in a graph. Capped kk-walks are a notion introduced by Dahlgaard, Knudsen and Stöckel in [Kn17]. Roughly speaking, a capped kk-walk is a walk of length kk where the first vertex in the walk has higher degree than the remaining vertices in the walk (handling vertices of equal degree requires a bit of additional care). [Kn17]’s O(m2k/(k+1))O(m^{2k/(k+1)}) time C2kC_{2k} detection algorithm is based on the following fact: {restatable*}theoremthmcappedwalks[Kn17] Let GG be a C2kC_{2k}-free graph with maximum degree Δ(G)m2/(k+1)\Delta(G)\leq m^{2/(k+1)}. Then, there are at most O~(m2k/(k+1))\widetilde{O}(m^{2k/(k+1)}) capped kk-walks in GG. One of our main contributions is a simplified proof of Section 1.1, although with polylogarthmically worse guarantees than [Kn17]. This simplified analysis of capped kk-walks is quite helpful in obtaining Section 1.1. The key lemma used to prove Section 1.1 is a generalization to bipartite graphs of Bondy and Simonovits’ classic theorem on the extremal number of C2kC_{2k}’s. Specifically, the result is:

Theorem 1.1.

[Kn17] Let GG be a bipartite graph with vertex parts of sizes L,RL,R and with mm edges. If m>100k(L+R+(LR)(k+1)/(2k))m>100k(L+R+(LR)^{(k+1)/(2k)}) then GG contains a C2kC_{2k}.

In order to use capped kk-walks for a listing algorithm, it would be useful to have a

Definition 1.

supersaturation variant of Theorem 1.1, which would guarantee the presence of many C2kC_{2k}’s if the edge density is large; the fact that this supersaturation result could help with listing was communicated to us by Jin and Zhou [supersat_observation]. The supersaturation analog of Bondy and Simonovits’ [evenextremal] extremal number for C2kC_{2k}’s is known:

Theorem 1.2.

[JiangYep20] For every integer k2k\geq 2, there exist constants c,Cc,C such that if GG is an nn-node graph with mCn1+1/km\geq Cn^{1+1/k} edges, then GG contains at least c(m/n)2kc(m/n)^{2k} copies of C2kC_{2k}.

Jin and Zhou [supersat_observation] formulated the following conjectured generalization of Theorem 1.2:

{restatable*}

conjecturesupersatconj The Unbalanced Supersaturation Conjecture [Jin and Zhou’24]: Let GG be an mm edge bipartite graph with vertex bipartition ABA\sqcup B, with tt C2kC_{2k}’s. Suppose m100k(|A|+|B|+(|A||B|)(k+1)/2k)m\geq 100k(|A|+|B|+(|A||B|)^{(k+1)/2k}). Then,

tΩ(m2k|A|k|B|k).t\geq\Omega\left(\frac{m^{2k}}{|A|^{k}|B|^{k}}\right).

Jin and Zhou obtained the following conclusion using the approach of [Kn17]:

Fact 2 ([supersat_observation]).

If the Unbalanced Supersaturation Conjecture is true, then there is an O~(t+m2k/(k+1))\widetilde{O}(t+m^{2k/(k+1)}) time algorithm for 2k2k-cycle listing for all k2k\geq 2.

In LABEL:thm:conditional_listing we give a simple proof of Jin and Zhou’s fact above. Thus, an approach to obtaining the conjectured optimal running time of O~(t+m2k/(k+1))\widetilde{O}(t+m^{2k/(k+1)}) for C2kC_{2k}-listing would be to prove Theorem 1.2. Unfortunately, establishing or refuting Theorem 1.2 remains a challenging open problem.

Our main technical contribution is a proof of a weaker version of Theorem 1.2 for k=3k=3. Then, we show that this partial progress towards Theorem 1.2 can be used in our simplified method for analyzing capped kk-walks to obtain a bound on the number of capped kk-walks, and consequentially an improved C6C_{6} listing algorithm in the sparse setting (namely, Section 1.1).

1.2 Open Questions

Proving or refuting Theorem 1.2 is an important open question. It also is valuable to consider other avenues towards obtaining C2kC_{2k} listing algorithms, especially in case Theorem 1.2 turns out to be false. The listing algorithms of [C6sCe], [3sumLBCe], [Ab22] use a variety of combinatorial insights that could potentially be generalized to larger kk. It is also possible that a hybrid approach is productive: one can first use progress towards proving Theorem 1.2 to force the instance to have a specific structure, and then use different combinatorial insights to solve the structured version of the problem.

1.3 Paper Outline

In LABEL:sec:simple_capped we present our simplified analysis of [Kn17]’s C2kC_{2k} detection algorithm. In LABEL:sec:listing_with_supersat we show how to modify our simple analysis of [Kn17]’s C2kC_{2k} detection algorithm to get a C2kC_{2k} listing algorithm, assuming the Unbalanced Supersaturation Conjecture. In LABEL:sec:supsersat_progress we present progress towards resolving the Unbalanced Supersaturation Conjecture. In LABEL:sec:listing_progress we show how to use our progress towards Theorem 1.2 in our simplified capped kk-walk analysis to obtain a listing algorithm for C6C_{6}’s with running time O~(m1.6+t)\widetilde{O}(m^{1.6}+t).

1.4 Notations

We use Δ(G)\Delta(G) to denote the maximum degree of graph GG. We write N(v)N(v) to denote the neighborhood of vertex vv, and for SV(G)S\subseteq V(G) we write NS(v)N_{S}(v) to denote N(v)SN(v)\cap S. For graph GG, we will use V(G),E(G)V(G),E(G) to denote the vertex and edge sets of GG. When GG is clear from the context we also denote V(G)V(G) by VV and E(G)E(G) by EE. We let n=|V|,m=|E|n=|V|,m=|E| (when the graph being discussed is clear), and assume mnm\geq n. For vertex subsets A,BA,B we write e(A,B)e(A,B) to denote |E(A×B)||E\cap(A\times B)|. A walk / path of length kk will refer to a walk / path with kk edges. Given A,BVA,B\subseteq V, we will write G[A,B]G[A,B] to denote the induced subgraph on A,BA,B, i.e., a graph with vertex set ABA\cup B and edge set E(G)(A×B)E(G)\cap(A\times B). We will write G[A]G[A] to denote G[A,A]G[A,A].

We write [x][x] to denote the set {1,2,,x}\left\{1,2,\ldots,\lceil x\rceil\right\}. We use log\log to denote the base-22 logarithm. Define the