This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Unweighted Code Sparsifiers and Thin Subgraphs

Shayan Oveis Gharan Arvin Sahami
(February 4, 2025)
Abstract

We show that for every kk-dimensional linear code π’žβŠ†π”½2n\mathcal{C}\subseteq\mathbb{F}_{2}^{n} there exists a set SβŠ†[n]S\subseteq[n] of size at most n/2+O​(n​k)n/2+O(\sqrt{nk}) such that the projection of π’ž\mathcal{C} onto SS has distance at least 12​dist⁑(π’ž)\frac{1}{2}\operatorname{dist}(\mathcal{C}). As a consequence we show that any connected graph GG with mm edges and nn vertices has at least 2mβˆ’(nβˆ’1)2^{m-(n-1)} many 1/21/2-thin subgraphs.

1 Introduction

Given a code π’ž\mathcal{C} over 𝔽2n\mathbb{F}_{2}^{n}, its distance is defined as

dist⁑(π’ž):-mincβˆˆπ’ž,cβ‰ 0⁑wt⁑(c),\operatorname{dist}(\mathcal{C})\coloneq\min_{c\in\mathcal{C},c\neq 0}\operatorname{wt}(c),

where wt⁑(c)=β€–cβ€–0\operatorname{wt}(c)=\|c\|_{0} is the number of nonzero coordinates of cc. We say π’ž\mathcal{C} is a linear code if for any c,cβ€²βˆˆπ’žc,c^{\prime}\in\mathcal{C}, c+cβ€²βˆˆπ’žc+c^{\prime}\in\mathcal{C}. We say π’ž\mathcal{C} is kk-dimensional if it has 2k2^{k} codewords. For convenience, we identify the set SβŠ†[n]S\subseteq[n] with its indicator vector Sβˆˆπ”½2nS\in\mathbb{F}_{2}^{n} (thus 2[n]2^{[n]} is identified with 𝔽2n\mathbb{F}_{2}^{n}).

The projection of π’ž\mathcal{C} onto SβŠ†[n]S\subseteq[n] is the code π’žS={cS:cβˆˆπ’ž}\mathcal{C}_{S}=\{c_{S}:c\in\mathcal{C}\} obtained by taking the SS coordinates of every codeword in π’ž\mathcal{C}. Recently, Khanna-Putterman-Sudan [KPS24] initiated the study of code sparsification where they proved that any linear code π’ž\mathcal{C} of dimension kk over 𝔽2n\mathbb{F}_{2}^{n} has a β€œweighted” code sparsifier which projects π’ž\mathcal{C} onto at most O~​(k/Ο΅2)\widetilde{O}(k/\epsilon^{2})-coordinates and such that the weighted hamming weight of every code word is preserved (up to a 1Β±Ο΅1\pm\epsilon multiplicative error).

In our main theorem, we show that every linear code π’žβŠ†π”½2n\mathcal{C}\subseteq\mathbb{F}_{2}^{n} has unweighted sparsifier π’žS\mathcal{C}_{S} such that |S|β‰ˆn/2\lvert S\rvert\approx n/2 and dist⁑(π’žS)β‰₯12​dist⁑(π’ž)\operatorname{dist}(\mathcal{C}_{S})\geq\frac{1}{2}\operatorname{dist}(\mathcal{C}). In other words, (assuming kβ‰ͺn)k\ll n), we can increase the rate of the code by a factor of 2 while preserving almost the same distance-rate). Our main theorem is the following.

Theorem 1.1.

For any linear code π’žβŠ†π”½2n\mathcal{C}\subseteq\mathbb{F}_{2}^{n}, there is a set SβŠ†[n]S\subseteq[n] of size ≀n​(12+k2.88​n)\leq n(\frac{1}{2}+\sqrt{\frac{k}{2.88n}}) such that for any cβˆˆπ’žc\in\mathcal{C}, wt⁑(cS)β‰₯wt⁑(c)/2\operatorname{wt}(c_{S})\geq\operatorname{wt}(c)/2. In particular we have dist⁑(π’žS)β‰₯dist⁑(π’ž)/2\operatorname{dist}(\mathcal{C}_{S})\geq\operatorname{dist}(\mathcal{C})/2.

We remark that the proof of the theorem is not algorithmic. We also briefly explain applications of this theorem to thin subgraphs.
Given an unweighted (undirected) graph G=(V,E)G=(V,E) with nn vertices and mm edges, we say a set TβŠ†ET\subseteq E is a Ξ±\alpha-thin w.r.t. GG if for any nonempty set S⊊VS\subsetneq V,

|T​(S,SΒ―)|≀α​|E​(S,SΒ―)|,\lvert T(S,\overline{S})\rvert\leq\alpha\lvert E(S,\overline{S})\rvert,

i.e., TT has at most Ξ±\alpha-fraction of the edges of every cut. Recall that a graph GG is tt-edge-connected if every cut in GG has at least tt edges. The following thin tree conjecture is proposed by Goddyn two decades ago [God04] and has been a subject of intense study since then [Asa+10, OS11, HO14, AO15, MP19, Mou, Alg23, KO23].

Conjecture 1.2 (Thin Tree Conjecture).

For any Ξ±<1\alpha<1, there exists tβ‰₯1t\geq 1 such that any tt-edge-connected graph GG has a spanning tree TT that is Ξ±\alpha-thin.

We remark that there has been several ”spectral” constructions of thin subsets but it remained an open problem whether one can construct thin subsets combinatorially without appealing to eigenvalue arguments (this is specially motivated to address the thin tree conjecture, since tt-edge-connected graphs do not necessarily have spectrally thin trees, see [AO15]).

Corollary 1.3.

For any connected graph GG with nn vertices and mm edges, there exists a 12\frac{1}{2}-thin set TβŠ†ET\subseteq E with |T|β‰₯m​(12βˆ’nβˆ’12.88​m)\lvert T\rvert\geq m(\frac{1}{2}-\sqrt{\frac{n-1}{2.88m}}).

Proof.

The main observation is that if we identify every cut (S,SΒ―)(S,\overline{S}) with the indicator vector of the set of edges of that cut, we obtain a linear code (over 𝔽2m\mathbb{F}_{2}^{m}) of dimension nβˆ’1n-1. Having said that to prove the statement it is enough to use the set TβŠ†ET\subseteq E promised in 1.1 which satisfies |T|β‰₯m​(12+nβˆ’12.88​m)\lvert T\rvert\geq m(\frac{1}{2}+\sqrt{\frac{n-1}{2.88m}}); then TΒ―\overline{T} is a 12\frac{1}{2}-thin subgraph of GG and has size |TΒ―|≀m​(12βˆ’nβˆ’12.88​m)\lvert\overline{T}\rvert\leq m(\frac{1}{2}-\sqrt{\frac{n-1}{2.88m}}). ∎

The following stronger statement also follows from our proof.

Theorem 1.4.

Any graph GG with mm edges and nn vertices has at least 2mβˆ’(nβˆ’1)2^{m-(n-1)} many 1/21/2-thin subgraphs.

The bound is tight, as a spanning tree only has a single 1/21/2-thin subgraph, the empty-set.
We remark that although the existence of 1/21/2-thin subgraphs follows by spectral arguments such as [BSS14], we are not aware of any exponential lower-bound on the number of spectrally thin subgraphs.

2 Main Proof

Definition 2.1 (codeword flip).

Given a code word cβˆˆπ’žc\in\mathcal{C}, the flip corresponding to cc is the map 𝔽2n→𝔽2n\mathbb{F}_{2}^{n}\to\mathbb{F}_{2}^{n} sending Sβˆˆπ”½2nS\in\mathbb{F}_{2}^{n} to S+cS+c.

When π’ž\mathcal{C} is linear, this defines an equivalence relation on subsets of [n][n]. We say S∼Sβ€²S\sim S^{\prime} if SS can be obtained from Sβ€²S^{\prime} by a codeword flip, i.e., S+Sβ€²βˆˆπ’žS+S^{\prime}\in\mathcal{C}. The transitivity of this relation follows from the linearity of π’ž\mathcal{C}.

Note that the collection of equivalence classes is the quotient space 𝔽2n/π’ž\mathbb{F}_{2}^{n}/\mathcal{C} (this is a vector space over 𝔽2\mathbb{F}_{2}). Since dim𝔽2n/π’ž=nβˆ’k\dim\mathbb{F}_{2}^{n}/\mathcal{C}=n-k there are precisely |𝔽2n/π’ž|=2nβˆ’k\lvert\mathbb{F}_{2}^{n}/\mathcal{C}\rvert=2^{n-k} equivalence classes.

Lemma 2.2.

Fix an equivalence class Hβˆˆπ”½2n/π’žH\in\mathbb{F}_{2}^{n}/\mathcal{C}. Let Sβˆ—=argmaxS∈H​|S|S^{*}=\underset{S\in H}{\operatorname{argmax}}\lvert S\rvert be a set with the largest size in HH. Then for any cβˆˆπ’žc\in\mathcal{C}

wt⁑(cSβˆ—)β‰₯wt⁑(c)/2.\operatorname{wt}(c_{S^{*}})\geq\operatorname{wt}(c)/2.
Proof.

We prove this by contradiction. Suppose that there exists a codeword cβˆˆπ’žc\in\mathcal{C} such that

wt⁑(cSβˆ—)<wt⁑(c)/2.\operatorname{wt}(c_{S^{*}})<\operatorname{wt}(c)/2.

Let S=c+Sβˆ—S=c+S^{*}. By definition S∈HS\in H. But the above equation implies |S|>|Sβˆ—|\lvert S\rvert>\lvert S^{*}\rvert which is a contradiction. ∎

For an equivalence class HH, let SHβŠ†[n]S_{H}\subseteq[n] be a set of the largest size in HH. LemmaΒ 2.2 implies that the collection {SH:Hβˆˆπ”½2n/π’ž}\left\{S_{H}\colon H\in\mathbb{F}_{2}^{n}/\mathcal{C}\right\} contains 2nβˆ’k2^{n-k} distinct subsets SβŠ†[n]S\subseteq[n] s.t. dist⁑(π’žS)β‰₯12​dist⁑(π’ž)\operatorname{dist}(\mathcal{C}_{S})\geq\frac{1}{2}\operatorname{dist}(\mathcal{C}).

Now, it follows by the pigeon-hole principle that there must be a set Sβ‰ˆn/2S\approx n/2 such that π’žS\mathcal{C}_{S} has distance at least half of the distance of π’ž\mathcal{C}.

Proof of 1.1.

Let

Hβˆ—=argminHβˆˆπ”½2n/π’žβ€‹|SH|H^{*}=\underset{H\in\mathbb{F}_{2}^{n}/\mathcal{C}}{\operatorname{argmin}}\lvert S_{H}\rvert

be the equivalence class whose largest set is the smallest. The set SHβˆ—S_{H^{*}} is the smallest among 2nβˆ’k2^{n-k} distinct sets. Hence by the following lemma, we must have |SHβˆ—|≀n/2+n​k/2.88\lvert S_{H^{*}}\rvert\leq n/2+\sqrt{nk/2.88}. This finishes the proof. ∎

Lemma 2.3.

For every nn the number of subsets of size β‰₯n/2+n​k/2.88\geq n/2+\sqrt{nk/2.88} is less than 2nβˆ’k2^{n-k}.

Proof.

Let X1,…,…,XnX_{1},\dots,\dots,X_{n} be independent Bernoulli random variable. Then, by Hoeffding’s bound

ℙ​[X1+β‹―+Xnβ‰₯n/2+ϡ​n]≀exp⁑(βˆ’2​ϡ).\mathbb{P}[X_{1}+\dots+X_{n}\geq n/2+\sqrt{\epsilon n}]\leq\exp(-2\epsilon).

Equivalently, the number of subsets of [n][n] of size β‰₯n/2+ϡ​n\geq n/2+\sqrt{\epsilon n} is at most 2nβ‹…exp⁑(βˆ’2​ϡ)2^{n}\cdot\exp(-2\epsilon). Letting exp⁑(βˆ’2​ϡ)=2βˆ’k\exp(-2\epsilon)=2^{-k} we get Ο΅=ln⁑22​k\epsilon=\frac{\ln 2}{2}k. ∎

References

  • [Alg23] Mahtab Alghasi β€œCombinatorially Thin Trees and Spectrally Thin Trees in Structured Graphs”, 2023
  • [AO15] Nima Anari and Shayan Oveis Gharan β€œEffective-Resistance-Reducing Flows, Spectrally Thin Trees, and Asymmetric TSP” In FOCS IEEE Computer Society, 2015, pp. 20–39
  • [Asa+10] Arash Asadpour et al. β€œAn O​(log⁑n/log⁑log⁑n){O}(\log n/\log\log n)-approximation Algorithm for the Asymmetric Traveling Salesman Problem” In SODA, 2010, pp. 379–389
  • [BSS14] Joshua Batson, Daniel A. Spielman and Nikhil Srivastava β€œTwice-Ramanujan Sparsifiers” In SIAM Review 56.2, 2014, pp. 315–334
  • [God04] Luis A. Goddyn β€œSome Open Problems I Like” available at http://www.math.sfu.ca/Β goddyn/Problems/problems.html, 2004
  • [HO14] Nicholas J. A. Harvey and Neil Olver β€œPipage Rounding, Pessimistic Estimators and Matrix Concentration” In SODA, 2014, pp. 926–945
  • [KO23] Nathan Klein and Neil Olver β€œThin Trees for Laminar Families” In FOCS IEEE, 2023, pp. 50–59
  • [KPS24] Sanjeev Khanna, Aaron (Louie) Putterman and Madhu Sudan β€œCode Sparsification and its Applications” In SODA, 2024, pp. 5145–5168
  • [Mou] β€œThin trees in 8-edge-connected planar graphs” In Information Processing Letters 143, 2019, pp. 51–55
  • [MP19] Martin Merker and Luke Postle β€œBounded diameter arboricity” In Journal of Graph Theory 90.4, 2019, pp. 629–641
  • [OS11] Shayan Oveis Gharan and Amin Saberi β€œThe Asymmetric Traveling Salesman Problem on Graphs with Bounded Genus” In SODA, 2011, pp. 967–975