MIT, [email protected]://orcid.org/0000-0003-4844-2863Supported by NSF Grant CCF-2330048, BSF Grant 2020356 and a Simons Investigator Award.MIT, [email protected]://orcid.org/0009-0007-8381-5705 \Copyright Virginia Vassilevska Williams and Alek Westover \ccsdesc[500]Theory of computation Graph algorithms analysis \supplement \supplementdetails[subcategory=, cite=, swhid=]Softwarehttps://github.com/awestover/listing-C6s-LP

Acknowledgements.

We’d like to thank Nathan S. Sheffield, Claire Zhang, and Ce Jin for helpful discussions.\EventEditorsRaghu Meka \EventNoEds1 \EventLongTitle16th Innovations in Theoretical Computer Science Conference (ITCS 2025) \EventShortTitleITCS 2025 \EventAcronymITCS \EventYear2025 \EventDateJanuary 7–10, 2025 \EventLocationColumbia University, New York, NY, USA \EventLogo \SeriesVolume325 \ArticleNo6

Listing $6$ -Cycles in Sparse Graphs

Virginia Vassilevska Williams Alek Westover

Abstract

This work considers the problem of output-sensitive listing of occurrences of $2k$ -cycles for fixed constant $k\geq 2$ in an undirected host graph with $m$ edges and $t$ $2k$ -cycles. Recent work of Jin and Xu (and independently Abboud, Khoury, Leibowitz, and Safier) [STOC 2023] gives an $O(m^{4/3}+t)$ time algorithm for listing $4$ -cycles, and recent work by Jin, Vassilevska Williams and Zhou [SOSA 2024] gives an $\widetilde{O}(n^{2}+t)$ time algorithm for listing $6$ -cycles in $n$ node graphs. We focus on resolving the next natural question: obtaining listing algorithms for $6$ -cycles in the sparse setting, i.e., in terms of $m$ rather than $n$ . Previously, the best known result here is the better of Jin, Vassilevska Williams and Zhou’s $\widetilde{O}(n^{2}+t)$ algorithm and Alon, Yuster and Zwick’s $O(m^{5/3}+t)$ algorithm.

We give an algorithm for listing $6$ -cycles with running time $\widetilde{O}(m^{1.6}+t)$ . Our algorithm is a natural extension of Dahlgaard, Knudsen and Stöckel’s [STOC 2017] algorithm for detecting a $2k$ -cycle. Our main technical contribution is the analysis of the algorithm which involves a type of “supersaturation” lemma relating the number of $2k$ -cycles in a bipartite graph to the sizes of the parts in the bipartition and the number of edges. We also give a simplified analysis of Dahlgaard, Knudsen and Stöckel’s $2k$ -cycle detection algorithm (with a small polylogarithmic increase in the running time), which is helpful in analyzing our listing algorithm.

keywords:

Graph algorithms, cycles listing, fine-grained complexity, sparse graphs

category:

\relatedversion

1 Introduction

Listing copies of a small pattern graph that occur as subgraphs of a host graph is a fundamental problem in algorithmic graph theory. In this work, we consider the problem of listing $C_{2k}$ ’s (i.e., $2k$ -cycles) for fixed constant $k\geq 2$ . Some examples of applications of $C_{2k}$ listing include analyzing social networks [motivationSocial], and understanding causal relationships in biological interaction graphs [motivationBIO]. (See e.g. [dense_cycle_detection_YZ] for further motivation.) In the following discussion we consider an $n$ vertex $m$ edge graph $G$ with $t$ $C_{2k}$ ’s (where $k\geq 2$ will be clear from context and $t$ is not necessarily known).

The main reason for focusing on even length cycles is that while there are dense graphs that contain no odd cycles (e.g. bipartite graphs), a classic result of Bondy and Simonovits [evenextremal] states that any graph with at least $100kn^{1+1/k}$ edges contains a $C_{2k}$ , for any integer $k\geq 2$ . This fact enables efficient “combinatorial” algorithms for $C_{2k}$ detection, in contrast to the case of odd length cycles where efficient $C_{2k+1}$ detection algorithms are based on matrix multiplication. In fact, very simple reductions (e.g. [vthesis]) show that for any $k\geq 1$ , $C_{2k+1}$ detection is at least as hard as triangle detection, and the latter problem is known to be fine-grained subcubically equivalent to Boolean Matrix Multiplication [focsy]. Thus, fast algorithms for odd cycles cannot avoid matrix multiplication. We focus on even cycles from now on.

A classic result of Yuster and Zwick [dense_cycle_detection_YZ] shows how to determine whether a graph contains a $C_{2k}$ in $O(n^{2})$ time for any given constant $k\geq 2$ . This quadratic running time is conjectured to be optimal in general (see [short_cycle_removal, LincolnVyas, Kn17], also discussed below); in particular, [short_cycle_removal] gave concrete evidence for the hardness of $C_{4}$ detection.

Nevertheless, in the regime of sparser graphs, where $m<o(n^{1+1/k})$ improvements are possible. For $C_{4}$ ’s there is a simple $O(m^{4/3})$ algorithm [alon1997finding] that matches the performance of the $O(n^{2})$ algorithm of [dense_cycle_detection_YZ] for $m=\Theta(n^{3/2})$ , and improves on the performance for all $m<o(n^{3/2})$ . Dahlgaard, Knudsen and Stöckel [Kn17] give an algorithm for $C_{2k}$ detection with running time $O(m^{2k/(k+1)})$ for every constant $k\geq 2$ . This matches the $O(n^{2})$ running time from [dense_cycle_detection_YZ] for $m=\Theta(n^{1+1/k})$ and improves on it for $m<o(n^{1+1/k})$ .

In fine-grained complexity, it is common to assume widely-believed hypotheses and use reductions to obtain conditional lower bounds for fundamental problems. For the special case of cycle detection problems, most conditional lower bounds are based on hypotheses related to triangle detection. One of the most common hypotheses is that triangle detection in $n$ -node graphs does not have a “combinatorial” ¹¹1The class of combinatorial algorithms is not well defined, but intuitively refers to algorithms that do not use fast matrix multiplication as a subroutine. $O(n^{3-\varepsilon})$ time algorithm for $\varepsilon>0$ (in the word-RAM model). Vassilevska W. and Williams [focsy] showed that this hypothesis is equivalent to the so called BMM Hypothesis that postulates that there is no $O(n^{3-\varepsilon})$ time combinatorial algorithm for multiplying two $n\times n$ Boolean matrices.

As mentioned earlier, it is easy to show that under the BMM hypothesis, detecting any odd cycle $C_{2k+1}$ requires $n^{3-o(1)}$ time. Dahlgaard, Knudsen and Stöckel [Kn17] gave lower bounds for combinatorial detection of even cycles of fixed length in sparse graphs. They show that there is no combinatorial algorithm for $C_{6}$ detection, or $C_{2k}$ detection for any $k>4$ , with running time $O(m^{3/2-\varepsilon})$ for $\varepsilon>0$ . Lincoln and Vyas [LincolnVyas] give a similar conditional lower bound under a different hypothesis, extending the lower bounds for potentially non-combinatorial algorithms. Lincoln and Vyas show that, for large enough constant $k$ , a $C_{2k}$ detection algorithm in graphs with $m\leq O(n)$ with running time $m^{3/2-\varepsilon}$ would imply an algorithm for $\max$ -3-SAT with running time $2^{(1-\varepsilon^{\prime})n}n^{O(1)}$ .

The problem of listing even cycles is less well-understood. Without improving the $C_{2k}$ detection algorithms discussed above, the best result that we could hope for in general is an algorithm with running time $O(n^{2}+t)$ , where $t$ is the number of $2k$ -cycles in the graph, and $O(m^{2k/(k+1)}+t)$ in the sparse setting where $m<o(n^{1+1/k})$ . Recently, [Ab22] and [3sumLBCe] gave such an algorithm for $C_{4}$ listing. In fact, Jin and Xu’s [3sumLBCe] algorithm gives a stronger guarantee: After $O(m^{4/3})$ pre-processing time, they can (deterministically) enumerate $4$ -cycles with $O(1)$ delay per cycle. Jin and Xu [3sumLBCe] (and concurrently by Abboud, Bringmann, and Fischer [3sumLBAmir]) shows that, under the 3SUM Hypothesis, there is no algorithm for $C_{4}$ enumeration with $O(n^{2-\varepsilon})$ or $O(m^{4/3-\varepsilon})$ pre-processing time and $n^{o(1)}$ delay, for $\varepsilon>0$ .

For $C_{6}$ ’s, Jin, Zhou and Vassilevska Williams [C6sCe] give an $\widetilde{O}(n^{2}+t)$ listing algorithm. For sparse listing algorithms, the previous state of the art is [alon1997finding] whose work implies an $O(m^{(2k-1)/k}+t)$ time $C_{2k}$ listing algorithm. See also [bringmannclass] where the complexity of the harder problem of listing $H$ -partite²²2A $k$ -node $H$ is an $H$ -partite subgraph of a $k$ -partite graph $G$ with vertex set $V=\cup_{a\in V(H)}V_{a}$ if there are $k$ vertices $v_{1},\ldots,v_{k}$ such that $v_{a}\in V_{a}$ for each $a\in\{1,\ldots,k\}$ and the mapping $a\rightarrow v_{a}$ is an isomorphism between $H$ and the subgraph of $G$ induced by $v_{1},\ldots,v_{k}$ . In other words, instead of looking for an arbitrary subgraph of $G$ isomorphic to $H$ , one only focuses on the subgraphs with exactly one node in each $V_{a}$ and such that the node picked from $V_{a}$ corresponds to node $a$ of $H$ . subgraphs is investigated.

1.1 Our contributions

We consider the problem of listing all $C_{6}$ s in an $m$ -edge graph. The best known result so far is an $O(m^{5/3}+t)$ time algorithm that follows from the work of [alon1997finding]. Our main result is the first improvement over this 27 year old running time: {restatable*}theoremlistfast There is an algorithm for listing $C_{6}$ ’s in time $\widetilde{O}(m^{1.6}+t)$ .

We now summarize the ideas needed to prove Section 1.1.

Section 1.1 follows easily after establishing a certain bound on the number of capped $k$ -walks in a graph. Capped $k$ -walks are a notion introduced by Dahlgaard, Knudsen and Stöckel in [Kn17]. Roughly speaking, a capped $k$ -walk is a walk of length $k$ where the first vertex in the walk has higher degree than the remaining vertices in the walk (handling vertices of equal degree requires a bit of additional care). [Kn17]’s $O(m^{2k/(k+1)})$ time $C_{2k}$ detection algorithm is based on the following fact: {restatable*}theoremthmcappedwalks[Kn17] Let $G$ be a $C_{2k}$ -free graph with maximum degree $\Delta(G)\leq m^{2/(k+1)}$ . Then, there are at most $\widetilde{O}(m^{2k/(k+1)})$ capped $k$ -walks in $G$ . One of our main contributions is a simplified proof of Section 1.1, although with polylogarthmically worse guarantees than [Kn17]. This simplified analysis of capped $k$ -walks is quite helpful in obtaining Section 1.1. The key lemma used to prove Section 1.1 is a generalization to bipartite graphs of Bondy and Simonovits’ classic theorem on the extremal number of $C_{2k}$ ’s. Specifically, the result is:

Theorem 1.1.

[Kn17] Let $G$ be a bipartite graph with vertex parts of sizes $L,R$ and with $m$ edges. If $m>100k(L+R+(LR)^{(k+1)/(2k)})$ then $G$ contains a $C_{2k}$ .

In order to use capped $k$ -walks for a listing algorithm, it would be useful to have a

Definition 1.

supersaturation variant of Theorem 1.1, which would guarantee the presence of many $C_{2k}$ ’s if the edge density is large; the fact that this supersaturation result could help with listing was communicated to us by Jin and Zhou [supersat_observation]. The supersaturation analog of Bondy and Simonovits’ [evenextremal] extremal number for $C_{2k}$ ’s is known:

Theorem 1.2.

[JiangYep20] For every integer $k\geq 2$ , there exist constants $c,C$ such that if $G$ is an $n$ -node graph with $m\geq Cn^{1+1/k}$ edges, then $G$ contains at least $c(m/n)^{2k}$ copies of $C_{2k}$ .

Jin and Zhou [supersat_observation] formulated the following conjectured generalization of Theorem 1.2:

{restatable*}

conjecturesupersatconj The Unbalanced Supersaturation Conjecture [Jin and Zhou’24]: Let $G$ be an $m$ edge bipartite graph with vertex bipartition $A\sqcup B$ , with $t$ $C_{2k}$ ’s. Suppose $m\geq 100k(|A|+|B|+(|A||B|)^{(k+1)/2k})$ . Then,

t\geq\Omega\left(\frac{m^{2k}}{|A|^{k}|B|^{k}}\right).

Jin and Zhou obtained the following conclusion using the approach of [Kn17]:

Fact 2 ([supersat_observation]).

If the Unbalanced Supersaturation Conjecture is true, then there is an $\widetilde{O}(t+m^{2k/(k+1)})$ time algorithm for $2k$ -cycle listing for all $k\geq 2$ .

In LABEL:thm:conditional_listing we give a simple proof of Jin and Zhou’s fact above. Thus, an approach to obtaining the conjectured optimal running time of $\widetilde{O}(t+m^{2k/(k+1)})$ for $C_{2k}$ -listing would be to prove Theorem 1.2. Unfortunately, establishing or refuting Theorem 1.2 remains a challenging open problem.

Our main technical contribution is a proof of a weaker version of Theorem 1.2 for $k=3$ . Then, we show that this partial progress towards Theorem 1.2 can be used in our simplified method for analyzing capped $k$ -walks to obtain a bound on the number of capped $k$ -walks, and consequentially an improved $C_{6}$ listing algorithm in the sparse setting (namely, Section 1.1).

1.2 Open Questions

Proving or refuting Theorem 1.2 is an important open question. It also is valuable to consider other avenues towards obtaining $C_{2k}$ listing algorithms, especially in case Theorem 1.2 turns out to be false. The listing algorithms of [C6sCe], [3sumLBCe], [Ab22] use a variety of combinatorial insights that could potentially be generalized to larger $k$ . It is also possible that a hybrid approach is productive: one can first use progress towards proving Theorem 1.2 to force the instance to have a specific structure, and then use different combinatorial insights to solve the structured version of the problem.

1.3 Paper Outline

In LABEL:sec:simple_capped we present our simplified analysis of [Kn17]’s $C_{2k}$ detection algorithm. In LABEL:sec:listing_with_supersat we show how to modify our simple analysis of [Kn17]’s $C_{2k}$ detection algorithm to get a $C_{2k}$ listing algorithm, assuming the Unbalanced Supersaturation Conjecture. In LABEL:sec:supsersat_progress we present progress towards resolving the Unbalanced Supersaturation Conjecture. In LABEL:sec:listing_progress we show how to use our progress towards Theorem 1.2 in our simplified capped $k$ -walk analysis to obtain a listing algorithm for $C_{6}$ ’s with running time $\widetilde{O}(m^{1.6}+t)$ .

1.4 Notations

We use $\Delta(G)$ to denote the maximum degree of graph $G$ . We write $N(v)$ to denote the neighborhood of vertex $v$ , and for $S\subseteq V(G)$ we write $N_{S}(v)$ to denote $N(v)\cap S$ . For graph $G$ , we will use $V(G),E(G)$ to denote the vertex and edge sets of $G$ . When $G$ is clear from the context we also denote $V(G)$ by $V$ and $E(G)$ by $E$ . We let $n=|V|,m=|E|$ (when the graph being discussed is clear), and assume $m\geq n$ . For vertex subsets $A,B$ we write $e(A,B)$ to denote $|E\cap(A\times B)|$ . A walk / path of length $k$ will refer to a walk / path with $k$ edges. Given $A,B\subseteq V$ , we will write $G[A,B]$ to denote the induced subgraph on $A,B$ , i.e., a graph with vertex set $A\cup B$ and edge set $E(G)\cap(A\times B)$ . We will write $G[A]$ to denote $G[A,A]$ .

We write $[x]$ to denote the set $\left\{1,2,\ldots,\lceil x\rceil\right\}$ . We use $\log$ to denote the base- $2$ logarithm. Define the