This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

The complexity of testing all properties of planar graphs, and the role of isomorphism

Sabyasachi Basu Department of Computer Science, University of California, Santa Cruz. [email protected]    Akash Kumar School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne. [email protected] This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 759471)    C. Seshadhri Department of Computer Science, University of California, Santa Cruz. [email protected]
SB and CS are supported by NSF DMS-2023495, CCF-1740850, CCF-1813165, CCF-1839317, CCF-1908384, CCF-1909790, and ARO Award W911NF1910294.
Abstract

Consider property testing on bounded degree graphs and let ε>0\varepsilon>0 denote the proximity parameter. A remarkable theorem of Newman-Sohler (SICOMP 2013) asserts that all properties of planar graphs (more generally hyperfinite) are testable with query complexity only depending on ε\varepsilon. Recent advances in testing minor-freeness have proven that all additive and monotone properties of planar graphs can be tested in poly(ε1)\mathrm{poly}(\varepsilon^{-1}) queries. Some properties falling outside this class, such as Hamiltonicity, also have a similar complexity for planar graphs. Motivated by these results, we ask: can all properties of planar graphs can be tested in poly(ε1)\mathrm{poly}(\varepsilon^{-1}) queries? Is there a uniform query complexity upper bound for all planar properties, and what is the “hardest” such property to test?

We discover a surprisingly clean and optimal answer. Any property of bounded degree planar graphs can be tested in exp(O(ε2))\exp(O(\varepsilon^{-2})) queries. Moreover, there is a matching lower bound, up to constant factors in the exponent. The natural property of testing isomorphism to a fixed graph requires exp(Ω(ε2))\exp(\Omega(\varepsilon^{-2})) queries, thereby showing that (up to polynomial dependencies) isomorphism to an explicit fixed graph is the hardest property of planar graphs. The upper bound is a straightforward adapation of the Newman-Sohler analysis that tracks dependencies on ε\varepsilon more carefully. The main technical contribution is the lower bound construction, which is achieved by a special family of planar graphs that are all mutually far from each other.

We can also apply our techniques to get analogous results for bounded treewidth graphs. We prove that all properties of bounded treewidth graphs can be tested in exp(O(ε1logε1))\exp(O(\varepsilon^{-1}\log\varepsilon^{-1})) queries. Moreover, testing isomorphism to a fixed forest requires exp(Ω(ε1))\exp(\Omega(\varepsilon^{-1})) queries.

1 Introduction

Consider the setting of property testing for bounded degree graphs, under the model of random access to a graph adjacency list, as introduced by Goldreich-Ron [GR02]. Let G=(V,E)G=(V,E) be a graph where V=[n]V=[n] and the maximum degree is dd. We have random access to the list through neighbor queries. There is an oracle that, given vVv\in V and i[d]i\in[d], returns the iith neighbor of vv (if no neighbor exists, it returns \bot).

For a property 𝒫\mathcal{P} of graphs with degree bound dd, the distance of GG to 𝒫\mathcal{P} is the minimum number of edge additions/removals required to make GG have 𝒫\mathcal{P}, divided by dndn. We say that GG is ε\varepsilon-far from 𝒫\mathcal{P} if the distance to 𝒫\mathcal{P} is more than ε\varepsilon. A property tester for 𝒫\mathcal{P} is a randomized procedure that takes as input (query access to) GG and a proximity parameter, ε>0\varepsilon>0. If G𝒫G\in\mathcal{P}, the tester must accept with probability at least 2/32/3. If GG is ε\varepsilon-far from 𝒫\mathcal{P}, the tester must reject with probability at least 2/32/3. In our context, the property 𝒫\mathcal{P} is called testable if there exists a property tester for 𝒫\mathcal{P} whose query complexity is independent of nn.

One of the grand goals of property testing is to classify testable (graph) properties according to the query complexity of testing them. Of special interest are efficiently testable properties, whose testing complexity is poly(ε1)\mathrm{poly}(\varepsilon^{-1}). For bounded degree graphs, there is little clarity on this issue. A recent survey by Goldreich states: “Indeed, it is hard to find a common theme among [efficiently testable] properties…”. [Gol21]

Our work focuses on properties of planar graphs. Even the problem of just testing planarity has received much attention, whose complexity has only recently been shown to be poly(ε1)\mathrm{poly}(\varepsilon^{-1}) [BSS08, HKNO09, EHNO11, YI15, LR15, KSS19]. Newman-Sohler proved that every planar property (actually, every hyperfinite property) is testable, but they do not provide an explicit complexity bound depending on ε\varepsilon [NS13]. Kumar-Seshadhri-Stolman recently showed that every additive and monotone planar property can tested in poly(ε1)\mathrm{poly}(\varepsilon^{-1}) queries [KSS19]. Applying techniques from this result, Levi-Shoshan showed that planar Hamiltonicity can be tested efficiently [LS21]. The latter property is neither additive nor monotone, so it is natural to ask if all planar properties can be tested efficiently. If not, does there exist a uniform complexity bound for all planar properties, and a candidate for the “hardest planar property”? (We note that there is significant work on characterizing testable planar properties, for the unbounded degree case, which is qualitatively different [CMOS11, CS19]. Details are given in §1.2.)

We discover that the answer to this question is the query complexity bound exp(Θ(ε2))\exp(\Theta(\varepsilon^{-2})). Up to constant factors in the exponent, the hardest planar property is testing isomorphism to a fixed explicit graph.

Let us give some formalism, and discuss the connection to isomorphism. In our discussions, nn is basically fixed, so we are considering a non-uniform setting. A planar property Π\Pi is a set of unlabeled bounded degree planar graph with nn vertices. The input GG is a labeled graph and the tester is trying to property test if GG is isomorphic (or equal to, if one ignores the labels) to any member of Π\Pi. Observe that singleton properties, where Π={H}\Pi=\{H\}, are equivalent to testing if GG is isomorphic to an fixed graph HH. With this premable, we can state our main results.

Our upper bound is a straightforward adaptation of arguments in Newman-Sohler [NS13].

Theorem 1.1.

Consider a planar property Π\Pi of nn-vertex, dd degree-bounded graphs. There is a property tester for Π\Pi that makes poly(d)exp(O(ε2))\mathrm{poly}(d)\exp(O(\varepsilon^{-2})) queries.

Our main technical result is a matching lower bound for an explicit singleton property. One of the surprises (at least to the authors) is that testing isomorphism to an arbitrary set of planar graphs is not harder, up to polynomial dependencies, than testing isomorphism to a single fixed graph. A recent compendium of open problems by Goldreich states the question of determining the complexity of testing isomorphism (Open Problem 2.4 in [Gol21]). We note that our theorems resolve this question for bounded-degree planar graphs.

Theorem 1.2.

For every sufficiently large nn, there exists a bounded-degree planar graph HH on nn vertices such that property testing Π={H}\Pi=\{H\} requires exp(Ω(ε2))\exp(\Omega(\varepsilon^{-2})) queries. Equivalently, testing isomorphism to HH requires exp(Ω(ε2))\exp(\Omega(\varepsilon^{-2})) queries.

Bounded treewidth classes: In the context of property testing, many results for planar graphs also hold for minor-free classes. It is natural to ask whether the bound of exp(Θ(ε2))\exp(\Theta(\varepsilon^{-2})) is the “right” answer for any property of minor-free graphs. We discover that to not be the case. For properties of bounded treewidth graphs, the answer is between exp((ε1))\exp((\varepsilon^{-1})) and exp((ε1)logε1)\exp((\varepsilon^{-1})\log\varepsilon^{-1}). The following theorems are also derived from the same methods for planar graphs, but the constructions are substantially simpler. The lower bound is achieved by a simple construction of forests. The main point of these results is to show that the bounded of exp(Θ(ε2))\exp(\Theta(\varepsilon^{-2})) achieved for planar properties can be significantly beaten for non-trivial minor-closed families.

Theorem 1.3.

Consider a bounded treewidth property Π\Pi of nn-vertex, dd degree-bounded graphs. There is a property tester for Π\Pi that makes poly(d)exp(O(ε1logε1))\mathrm{poly}(d)\exp(O(\varepsilon^{-1}\log\varepsilon^{-1})) queries.

Theorem 1.4.

For every sufficiently large nn, there exists a bounded-degree forest HH on nn vertices such that property testing Π={H}\Pi=\{H\} requires exp(Ω(ε1))\exp(\Omega(\varepsilon^{-1})) queries.

1.1 Main ideas

One of the key components of property testers for planar graphs is the notion of partition oracles for hyperfinite graphs, as introduced by Hassidim-Kelner-Nguyen-Onak [HKNO09]. Bounded degree planar graphs (and more generally, minor-closed families) are hyperfinite, meaning that one can remove a constant fraction of edges and obtain connected components of constant size. Specifically, one can remove εdn\varepsilon dn edges to get connected components of size O(ε2)O(\varepsilon^{-2}).

Suppose GG is hyperfinite. A partition oracle is a local procedure that gives access to a hyperfinite decomposition to a graph GG, without any preprocessing. Given a query vertex vv, the partition oracle outputs a connected subgraph C(v)C(v) of size O(ε2)O(\varepsilon^{-2}), such that the union of components C(v)C(v) forms a hyperfinite partition. Recent results give a partition oracle that runs in time poly(ε1)\mathrm{poly}(\varepsilon^{-1}) per query [LR15, KSS21].

Newman-Sohler used partition oracles to prove that all hyperfinite (and hence planar) properties are testable [NS13]. This is where our results begin. Stripping down the Newman-Sohler arguments to their core, we can essentially treat a planar GG, up to εdn\varepsilon dn edge changes, as a distribution over connected planar graphs of size O(ε2)O(\varepsilon^{-2}). The partition oracle allows us to sample efficiently from this distribution. (If the input graph GG is not hyperfinite, the partition oracle can detect that efficiently, and the graph can be directly rejected.) For convenience, let 𝒟(G)\mathcal{D}(G) denote this distribution.

Existing theorems in combinatorics show that the number of planar graphs of size O(ε2)O(\varepsilon^{-2}) is exp(O(ε2))\exp(O(\varepsilon^{-2})) [MR01]; hence, this bounds the support size of 𝒟(G)\mathcal{D}(G). We can learn an approximation of 𝒟(G)\mathcal{D}(G) up to ε\varepsilon TV-distance with exp(O(ε2))\exp(O(\varepsilon^{-2})) queries to the partition oracle. It is not hard, but central to the argument, to show that if 𝒟(G)𝒟(H)1ε\|\mathcal{D}(G)-\mathcal{D}(H)\|_{1}\leq\varepsilon, then GG is ε\varepsilon-close to HH. For any property Π\Pi, one can simply check exhaustively if the learned approximation to 𝒟(G)\mathcal{D}(G) is close to 𝒟(H)\mathcal{D}(H) for any HΠH\in\Pi. (While this could be expensive in running time, it requires no further queries to GG.)

The main challenge is in the lower bound. We need to find a property Π\Pi such that testing Π\Pi requires exp(Ω(ε2))\exp(\Omega(\varepsilon^{-2})) samples from 𝒟(G)\mathcal{D}(G). We seem to need a converse to the upper bound argument, showing if 𝒟(G)𝒟(H)1\|\mathcal{D}(G)-\mathcal{D}(H)\|_{1} is large, then GG and HH are far from each other. But this is false in general! The support of 𝒟(G)\mathcal{D}(G) is a set of graphs, which are mutable objects. Meaning, we can modify 𝒟(G)\mathcal{D}(G) dramatically by only modifying a few edges of GG. (As an extreme case, 𝒟(G)\mathcal{D}(G) and 𝒟(H)\mathcal{D}(H) could have disjoint supports on graphs that are close to each other.)

All our graphs (both input and hard instance) will be collections of connected components of size O(ε2)O(\varepsilon^{-2}). We begin by trying to construct a graph HH such that: for all GG, 𝒟(G)𝒟(H)1ε\|\mathcal{D}(G)-\mathcal{D}(H)\|_{1}\geq\varepsilon implies that GG is ε\varepsilon-far from HH. Moreover, determining if 𝒟(G)𝒟(H)1ε\|\mathcal{D}(G)-\mathcal{D}(H)\|_{1}\geq\varepsilon should require exp(Ω(ε2))\exp(\Omega(\varepsilon^{-2})) samples from 𝒟(G)\mathcal{D}(G). A candidate is suggested by the problem of uniformity testing of distributions. Suppose we construct HH where 𝒟(H)\mathcal{D}(H) is a uniform distribution on a collection \mathcal{F} of graphs (each member of which has size O(ε2)O(\varepsilon^{-2})), such that ||=exp(Ω(ε2))|\mathcal{F}|=\exp(\Omega(\varepsilon^{-2})). By standard distribution testing arguments, distinguishing the uniform distribution on \mathcal{F} from a uniform distribution on half of \mathcal{F} requires ||=exp(Ω(ε2))\sqrt{|\mathcal{F}|}=\exp(\Omega(\varepsilon^{-2})) samples. Suppose furthermore that all graphs in \mathcal{F} are ε\varepsilon-far from each other and all balanced separators of FF\in\mathcal{F} are at least of size ε|F|\varepsilon|F|. We prove that any graph GG such that 𝒟(G)\mathcal{D}(G) is supported on half of \mathcal{F} must be ε\varepsilon-far from HH. Moreover, we can construct candidate input graphs GG, where any property tester can be simulated by sampling from 𝒟(G)\mathcal{D}(G). Putting all the arguments together, we get a bonafide lower bound: property testing Π={H}\Pi=\{H\} requires exp(Ω(ε2))\exp(\Omega(\varepsilon^{-2})) queries.

What remains is the main technical construction. We need to design \mathcal{F}, the “suitable family” of graphs for the lower bound. We achieve this family by taking the ε1×ε1\varepsilon^{-1}\times\varepsilon^{-1} grid and adding a collection of diagonal edges. We apply a cleaning procedure to get a large family of graphs that are all far from each other. The separator size holds trivially, since each graph in \mathcal{F} contains a large grid.

1.2 Related Work

Property testing on bounded-degree graphs is a vast topic and we direct the reader to Chapter 9 of Goldreich’s book [Gol17] for an introduction to the subject.

Arguably, the starting point for testing planarity is the seminal work of Benjamini-Schramm-Shapira [BSS08], who showed that all minor-closed properties are testable. This paper also introduced the significance of hyperfiniteness to property testing. Hassidim-Kelner-Nguyen-Onak [HKNO09] introduced the concept of partition oracles, a key tool in property testing for hyperfinite classes. Improving the query time of partition oracles was addressed by Edelman-Hassidim-Nguyen-Onak [EHNO11], Levi-Ron [LR15] and Kumar-Seshadhri-Stolman [KSS21].

The quest for characterizing testable (bounded-degree) graph properties and finding properties testable in poly(ε1)\mathrm{poly}(\varepsilon^{-1}) is an important theme in property testing. We note that Goldreich’s recent survey explicitly calls these out as Open Problems 2.2 and 2.3 [Gol21]. One of the main inspirations for our work is the result of Newman-Sohler that proves that all hyperfinite properties are testable [NS13]. The recent work of [KSS21] proves that all additive and monotone minor-closed properties are efficiently testable, and Levi-Shoshan show that Hamiltonicity of minor-closed families is also efficiently testable [LS21].

We note an important line of work of testing properties of planar graphs, in the unbounded degree case. This direction was pioneered by Czumaj-Monemizadeh-Onak-Sohler [CMOS11], who showed the bipartitenesss is testable (independent of the size). A recent result of Czumaj-Sohler prove that all testable planar properties in the unbounded degree setting are related to subgraph freeness [CS19].

2 Proof of Theorem 1.1 and Theorem 1.3

We begin with some preliminaries: the notions of hyperfiniteness and partition oracles. Note that all graphs have nn vertices and degree bound dd.

Definition 2.1.

A graph GG is (ε,k)(\varepsilon,k) hyperfinite if there exists a subset of εdn\varepsilon dn edges whose removal results in connected components of size at most kk.

A classic result of Alon-Seymour-Thomas (Prop. 4.1 of [AST94]) shows that all minor-closed families are (ε,O(ε2))(\varepsilon,O(\varepsilon^{-2}))-hyperfinite. In Lemma A.3, we prove that all bounded treewidth families are (ε,O(ε1))(\varepsilon,O(\varepsilon^{-1}))-hyperfinite. A partition oracle gives local access to a hyperfinite decomposition. We give the formal definition below (adapted from Def. 1.1 of [KSS21]).

Definition 2.2.

A procedure 𝐀\bm{A} is a partition oracle for a minor-closed family Π\Pi of graphs if it satisfies the following properties. The deterministic procedure takes as input random access to G=(V,E)G=(V,E), access to a random seed rr, a proximity parameter ε>0\varepsilon>0, and a vertex vv of GG. (We will think of fixing G,r,εG,r,\varepsilon, so we use the notation 𝐀G,r,ε\bm{A}_{G,r,\varepsilon}. All probabilities are with respect to rr.) The procedure 𝐀G,r,ε(v)\bm{A}_{G,r,\varepsilon}(v) outputs a connected set of vertices, such that the sets {𝐀G,r,ε(v)}\{\bm{A}_{G,r,\varepsilon}(v)\}, over all vv, form a partition of VV.

We say that the partition oracle outputs an (ε,k)(\varepsilon,k)-hyperfinite decomposition if the following properties hold. (i) For all vv, |𝐀G,r,ε(v)|k|\bm{A}_{G,r,\varepsilon}(v)|\leq k and (ii) with probability >2/3>2/3 (over rr), the number of edges between the sets 𝐀G,r,ε(v)\bm{A}_{G,r,\varepsilon}(v) is at most εdn\varepsilon dn.

The main result of [KSS21] that we use is the following.

Theorem 2.3 (Rephrasing of Theorem 1.2 [KSS21]).

For any bounded degree graph in a minor-closed family, there is a partition oracle that outputs an (ε,O(ε2))(\varepsilon,O(\varepsilon^{-2}))-hyperfinite partition, and runs in time O(poly(dε1))O(\mathrm{poly}(d\varepsilon^{-1})) per query.

For bounded treewidth classes, the partition oracle can output (ε,O(ε1))(\varepsilon,O(\varepsilon^{-1}))-hyperfinite partitions using standard arguments (Claim A.4)

2.1 Property testers through subgraph count vectors

We use partition oracles to summarize a hyperfinite graph by an approximate count vector. This technique was first used by Newman-Sohler [NS13]. We follow their analysis, but take care to keep track of various dependencies on ε\varepsilon. This allows for getting the optimal query complexity bound.

Consider any planar property Π\Pi (technically, the following arguments hold for any minor-closed property). For any planar GG, let 𝒫(G)\mathcal{P}(G) denote the partition given by the partition oracle of Theorem 2.3. (Note that the partition is a random variable.) Let k\mathcal{B}_{k} be the set of unlabeled graphs in the family Π\Pi with at most kk vertices. We will always set k=O(ε2)k=O(\varepsilon^{-2}) (though the exact setting will change for bounded-treewidth properties).

For any graph GG^{\prime} consisting of connected components of size at most kk and any FkF\in\mathcal{B}_{k}, let ctF(G)\mathop{ct}_{F}(G^{\prime}) be the number of occurrences of FF in GG^{\prime}. Let ct(G)\mathop{ct}(G^{\prime}) be the |k||\mathcal{B}_{k}|-dimensional vector of these counts. Note that ct(G)1[n/k,n]\|\mathop{ct}(G^{\prime})\|_{1}\in[n/k,n].

Claim 2.4.

Consider two graphs G1,G2G^{\prime}_{1},G^{\prime}_{2} consisting entirely of connected components of size at most kk. If ct(G1)ct(G2)1γn\|\mathop{ct}(G^{\prime}_{1})-\mathop{ct}(G^{\prime}_{2})\|_{1}\leq\gamma n, then dist(G1,G2)γkd\mathop{dist}(G^{\prime}_{1},G^{\prime}_{2})\leq\gamma kd.

Proof:.

For every FkF\in\mathcal{B}_{k}, we will modify G1G^{\prime}_{1} and G2G^{\prime}_{2} to equalize the ctF(G1)\mathop{ct}_{F}(G^{\prime}_{1}) and ctF(G2)\mathop{ct}_{F}(G^{\prime}_{2}). We simply delete |ctF(G1)ctF(G2)||\mathop{ct}_{F}(G^{\prime}_{1})-\mathop{ct}_{F}(G^{\prime}_{2})| instances from either G1G^{\prime}_{1} or G2G^{\prime}_{2} (whichever has the larger count). This operation deletes at most |ctF(G1)ctF(G2)|kd|\mathop{ct}_{F}(G^{\prime}_{1})-\mathop{ct}_{F}(G^{\prime}_{2})|kd edges. In total, the number of edges deleted is at most ct(G1)ct(G2)1kdγnkd\|\mathop{ct}(G^{\prime}_{1})-\mathop{ct}(G^{\prime}_{2})\|_{1}kd\leq\gamma nkd. ∎

For the property Π\Pi that we wish to test, construct the following set 𝑪Π\bm{C}_{\Pi} of count vectors: for every FΠF\in\Pi (with nn vertices and degree bound dd) and every subgraph FF^{\prime} that is an (ε,k)(\varepsilon,k) partition of FF, add ct(F)\mathop{ct}(F^{\prime}) to 𝑪Π\bm{C}_{\Pi}.

Lemma 2.5.

Suppose 𝒫(G)\mathcal{P}(G) is a valid (ε,k)(\varepsilon,k)-partition of GG. If GΠG\in\Pi, then ct(𝒫(G))𝐂Π\mathop{ct}(\mathcal{P}(G))\in\bm{C}_{\Pi}. If GG is 3ε3\varepsilon-far from Π\Pi, then 𝐯𝐂Π\forall\bm{v}\in\bm{C}_{\Pi}, ct(𝒫(G))𝐯1>εn/kd\|\mathop{ct}(\mathcal{P}(G))-\bm{v}\|_{1}>\varepsilon n/kd.

Proof:.

Suppose GΠG\in\Pi. In the construction of 𝑪Π\bm{C}_{\Pi} described above, we can select FF as GG and FF^{\prime} as 𝒫(G)\mathcal{P}(G). Hence, ct(𝒫(G))𝑪Π\mathop{ct}(\mathcal{P}(G))\in\bm{C}_{\Pi}.

Suppose GG is 3ε3\varepsilon-far from Π\Pi. Consider any 𝒗𝑪Π\bm{v}\in\bm{C}_{\Pi}, so 𝒗=ct(F)\bm{v}=\mathop{ct}(F^{\prime}), for FF^{\prime} being an (ε,k)(\varepsilon,k)-partition of FΠF\in\Pi. By triangle inequality, dist(𝒫(G),F)dist(G,F)dist(G,𝒫(G))dist(F,F)\mathop{dist}(\mathcal{P}(G),F^{\prime})\geq\mathop{dist}(G,F)-\mathop{dist}(G,\mathcal{P}(G))-\mathop{dist}(F,F^{\prime}). By farness, dist(G,F)3ε\mathop{dist}(G,F)\geq 3\varepsilon. Because 𝒫(G)\mathcal{P}(G) and FF^{\prime} are respective (ε,k)(\varepsilon,k)-partitions, dist(G,𝒫(G))\mathop{dist}(G,\mathcal{P}(G)) and dist(F,F)\mathop{dist}(F,F^{\prime}) are at most ε\varepsilon. Hence, dist(𝒫(G),F)ε\mathop{dist}(\mathcal{P}(G),F^{\prime})\geq\varepsilon. By Claim 2.4, ct(𝒫(G))𝒗1>εn/kd\|\mathop{ct}(\mathcal{P}(G))-\bm{v}\|_{1}>\varepsilon n/kd. ∎

Now, we present the main algorithmic ingredient of our property tester.

Claim 2.6.

Given GG and a setting of 𝒫(G)\mathcal{P}(G) that is a valid (ε,k)(\varepsilon,k)-partition, using O(poly(dε1)δ2|k|3log|k|)O(\mathrm{poly}(d\varepsilon^{-1})\delta^{-2}|\mathcal{B}_{k}|^{3}\log{|\mathcal{B}_{k}|}) queries one can compute a vector 𝐯\bm{v} such that 𝐯ct(𝒫(G))1<δn\|\bm{v}-\mathop{ct}(\mathcal{P}(G))\|_{1}<\delta n with probability at least 11|k|1-\frac{1}{|\mathcal{B}_{k}|}.

Proof:.

Fix FkF\in\mathcal{B}_{k}. We show how to approximate ctF(𝒫(G))\mathop{ct}_{F}(\mathcal{P}(G)). Pick uar vertex ss, and using the partition oracle, determine the component of 𝒫(G)\mathcal{P}(G) containing ss. If the component is isomorphic to FF, declare success. The probability of success is exactly |F|ctF(𝒫(G))/n|F|\mathop{ct}_{F}(\mathcal{P}(G))/n. By Chernoff-Hoeffding, we can get an additive δ/|k|\delta/|\mathcal{B}_{k}| estimate with error probability <|k|2<|\mathcal{B}_{k}|^{-2} using O(|k|2δ2log|k|)O(|\mathcal{B}_{k}|^{2}\delta^{-2}\log|\mathcal{B}_{k}|) samples. Thus, we get an additive δn/(|F||k|)δn/|k|\delta n/(|F|\cdot|\mathcal{B}_{k}|)\leq\delta n/|\mathcal{B}_{k}| estimate for ctF(𝒫(G))\mathop{ct}_{F}(\mathcal{P}(G)). Applying for all FF, we get our estimate vector 𝒗\bm{v}. Note that the total approximation is 𝒗ct(𝒫(G))1<δn/|k|×|k|=δn\|\bm{v}-\mathop{ct}(\mathcal{P}(G))\|_{1}<\delta n/|\mathcal{B}_{k}|\times|\mathcal{B}_{k}|=\delta n. The error probability, by union bound, is at most |k|1|\mathcal{B}_{k}|^{-1}. By the running time bound of the partition oracle, the total number of queries made is O(poly(dε1)δ2|k|3log|k|)O(\mathrm{poly}(d\varepsilon^{-1})\delta^{-2}|\mathcal{B}_{k}|^{3}\log|\mathcal{B}_{k}|). ∎

Now, we prove Theorem 1.1, repeated for convenience. To get the optimal bound for planar properties, we need the right upper bound for k\mathcal{B}_{k}. Numerous results in the past show that the answer is exp(O(k))\exp(O(k)). We cite one specific reference.

Theorem 2.7.

(Theorem 5 of [MR01]) A planar graph on kk vertices and kk^{\prime} edges can be represented uniquely by 8k+2k+o(k+k)8k+2k^{\prime}+o(k+k^{\prime}) bits.

Hence, the number of planar graphs on kk vertices is exp(O(k))\exp(O(k)).

See 1.1

Proof:.

The tester has two phases. In the first phase, it uses Theorem 2.3 to get a hyperfinite partition. If it succeeds, then the tester proceeds to the second phase. It uses Claim 2.6 to approximate ct(𝒫(G))\mathop{ct}(\mathcal{P}(G)) and checks if it lies in 𝑪Π\bm{C}_{\Pi}.

The tester repeats the following O(1)O(1) times. For a random seed RR, it sets up the partition oracle with this seed, and then estimates the number of cut edges via random sampling. This is done by sampling Θ(ε1)\Theta(\varepsilon^{-1}) uar vertices uu, picking a uar neighbor vv of uu, and calling the partition oracle on uu and vv. If the outputs are different components, then edge (u,v)(u,v) is cut. If more than an ε/4\varepsilon/4-fraction of edges are cut, the process is repeated with a new choice of RR. Otherwise, RR is fixed, and the tester proceeds to the next phase. If no RR is found, the tester rejects.

At this point, a suitable 𝒫(G)\mathcal{P}(G) has been discovered. We set k=O(ε2)k=O(\varepsilon^{-2}), to be the size bound of components, as promised by Theorem 2.3. Using Claim 2.6, the tester computes an approximate count vector 𝒗\bm{v} of ct(𝒫(G))\mathop{ct}(\mathcal{P}(G)), setting δ=ε/4kd\delta=\varepsilon/4kd. Note that 𝑪Π\bm{C}_{\Pi} is a fixed set of vectors, independent of the input. The tester then determines if 𝒘𝑪Π\exists\bm{w}\in\bm{C}_{\Pi} such that 𝒗𝒘1εn/2kd\|\bm{v}-\bm{w}\|_{1}\leq\varepsilon n/2kd. If such a vector 𝒘\bm{w} exists, it accepts. Otherwise, it rejects. The description of the property tester is complete, and we proceed to the analysis.

Query complexity bound: Note that k=O(ε2)k=O(\varepsilon^{-2}), δ=ε/4kd\delta=\varepsilon/4kd, and, by Theorem 2.7, |k|=exp(O(ε2))|\mathcal{B}_{k}|=\exp(O(\varepsilon^{-2})). Hence, by Claim 2.6, the overall query complexity is poly(d)exp(O(ε2))\mathrm{poly}(d)\exp(O(\varepsilon^{-2})).

Correctness analysis: If GG is planar and hence (ε/8,O(ε2))(\varepsilon/8,O(\varepsilon^{-2}))-hyperfinite, with probability at least 2/32/3 over RR, the partition oracle computes a hyperfinite decomposition. Hence, with probability at least 5/65/6, the tester proceeds to the second phase with 𝒫(G)\mathcal{P}(G) being a valid (ε,O(ε2))(\varepsilon,O(\varepsilon^{-2}))-hyperfinite partition. Conversely, with probability at least 5/65/6, if the tester finds a suitable RR, then 𝒫(G)\mathcal{P}(G) is a valid (ε,O(ε2))(\varepsilon,O(\varepsilon^{-2}))-hyperfinite partition.

Suppose GΠG\in\Pi and hence planar. As argued above, with probability >5/6>5/6, 𝒫(G)\mathcal{P}(G) is an (ε,O(ε2))(\varepsilon,O(\varepsilon^{-2}))-hyperfinite decomposition. By Claim 2.6, with probability >5/6>5/6, 𝒗ct(𝒫(G))1<εn/4kd\|\bm{v}-\mathop{ct}(\mathcal{P}(G))\|_{1}<\varepsilon n/4kd. By the union bound, both conditions hold with probability at least 2/32/3. By Lemma 2.5, ct(𝒫(G))𝑪Π\mathop{ct}(\mathcal{P}(G))\in\bm{C}_{\Pi}. Thus, there exists 𝒘\bm{w} in 𝑪Π\bm{C}_{\Pi} such that 𝒗𝒘|1εn/2kd\|\bm{v}-\bm{w}|_{1}\leq\varepsilon n/2kd and the tester accepts with probability at least 2/32/3.

Suppose GG is 3ε3\varepsilon-far from Π\Pi. By Lemma 2.5, for all 𝒘𝑪Π\bm{w}\in\bm{C}_{\Pi}, ct(𝒫(G))𝒘1>εn/kd\|\mathop{ct}(\mathcal{P}(G))-\bm{w}\|_{1}>\varepsilon n/kd. By the triangle inequality, 𝒗𝒘1>εn/kdεn/4kd>εn/2kd\|\bm{v}-\bm{w}\|_{1}>\varepsilon n/kd-\varepsilon n/4kd>\varepsilon n/2kd. Thus, the tester rejects with probability at least 2/32/3.

The proof of Theorem 1.3 is nearly identical.

See 1.3

Proof:.

We redo the proof of Theorem 1.1. Claim A.4 shows that graphs with treewidth at most τ\tau admit a partition oracle with k=O(τ/ε)k=O(\tau/\varepsilon). For bounded treewidth graphs, we apply the trivial bound |k|2O(klogk)|\mathcal{B}_{k}|\leq 2^{O(k\log k)} (Claim A.1). We apply these bounds in the above proof, and get a query complexity of O(poly(d)exp(ε1logε1))O(\mathrm{poly}(d)\exp(\varepsilon^{-1}\log{\varepsilon^{-1}})) queries. ∎

3 Lower bounds through suitable families

The key construct for the lower bounds is given in the following definition. A graph class is monotone if it is closed under edge removals. A graph class is additive if it closed under taking disjoint unions of graphs.

Definition 3.1.

Fix ε>0\varepsilon>0 and a monotone, additive graph class 𝒞{\cal C}. We call a family \mathcal{F} of distinct graphs (in 𝒞{\cal C}) ε\varepsilon-suitable if the following conditions hold:

  • All graphs in \mathcal{F} have the same number of vertices (denoted tt).

  • If t>2/εt>2/\varepsilon, F,F\forall F,F^{\prime}\in\mathcal{F}, dist(F,F)0.02dist(F,F^{\prime})\geq 0.02.

  • F\forall F\in\mathcal{F}, the size of any (0.01d,10.01d)(\frac{0.01}{d},1-\frac{0.01}{d}) balanced separator in FF is Ω(εt)\Omega(\varepsilon t).

The main lemma used in our lower bound says that a large suitable family leads to a property testing lower bound.

Lemma 3.2.

Fix ε>0\varepsilon>0, and let \mathcal{F} denote an ε\varepsilon-suitable family of a monotone, additive graph class 𝒞{\cal C}. Then, for all sufficiently large nn, there exists a graph H𝒞H\in{\cal C} with the following property. Any property tester for the property {H}\{H\} with proximity parameter ε\varepsilon requires Ω(||)\Omega(\sqrt{|\mathcal{F}|}) queries.

The proof of Lemma 3.2 requires some tools, that we shall build up in this section. Firstly, using the properties of a suitable family, we can construct two graphs consisting entirely of components in \mathcal{F} that are far from each other. These graphs form the core of the lower bound.

Claim 3.3.

Let \mathcal{F} denote an ε\varepsilon-suitable family of a graph class 𝒞{\cal C} (as defined in Lemma 3.2). Let I=[||]I=[|\mathcal{F}|] denote a set indexing graphs in \mathcal{F}.

Let H1H_{1} be the graph consisting of a disjoint union of all graphs in |||\mathcal{F}|. For any subset RIR\subseteq I with |R|=|I|/2|R|=|I|/2, let HRH_{R} be the disjoint union of two copies of each graph in \mathcal{F} indexed by RR. Then the graphs HH and HRH_{R} are Ω(ε)\Omega(\varepsilon) far from each other.

Proof:.

Since \mathcal{F} is ε\varepsilon-suitable, we know that all graphs FF\in\mathcal{F} are defined on t=t(ε)t=t(\varepsilon) vertices. We know for any F,FF,F^{\prime}\in\mathcal{F}, since |F|=|F|=t|F|=|F^{\prime}|=t the number of edge edits needed to change FF to FF^{\prime} is at least Ω(t)\Omega(t). Since RR will be fixed for the remainder of the proof, it will be convenient to refer to HRH_{R} as H2H_{2} for the rest of the argument. Let us denote the components of H1H_{1} (resp H2H_{2}) as 𝚌𝚘𝚖𝚙𝚘𝚗𝚎𝚗𝚝𝚜(𝙷𝟷)\tt components(H_{1}) (resp as 𝚌𝚘𝚖𝚙𝚘𝚗𝚎𝚗𝚝𝚜(𝙷𝟸)\tt components(H_{2})). Let s=|𝚌𝚘𝚖𝚙𝚘𝚗𝚎𝚗𝚝𝚜()|s=|\tt components(\mathcal{F})|. H2H_{2} contains s/2s/2 connected components in \mathcal{F} each of which occurs twice. Recall from Definition 3.1, C,C𝚌𝚘𝚖𝚙𝚘𝚗𝚎𝚗𝚝𝚜()C,C^{\prime}\in\tt components(\mathcal{F}) are Ω(1)\Omega(1)-far from each other. Let 1/2{\cal M}_{1/2} denote the set of missing components of \mathcal{F} in H2H_{2}. And let 𝒫1/2\mathcal{P}_{1/2} denote the set of components from \mathcal{F} that are present in H2H_{2} (without duplicates). Thus, H2H_{2} contains two subgraphs isomorphic to 𝒫1/2\mathcal{P}_{1/2} which we denote as 𝒫1/21\mathcal{P}^{1}_{1/2} and 𝒫1/22\mathcal{P}^{2}_{1/2}. For C𝚌𝚘𝚖𝚙𝚘𝚗𝚎𝚗𝚝𝚜(𝒫𝟷/𝟸)C\in\tt components(\mathcal{P}_{1/2}), let C1𝚌𝚘𝚖𝚙𝚘𝚗𝚎𝚗𝚝𝚜(𝒫𝟷/𝟸𝟷)C^{1}\in\tt components(\mathcal{P}^{1}_{1/2}) denote the subgraph of 𝒫1/21\mathcal{P}^{1}_{1/2} isomorphic to CC. Similarly, define C2C^{2}. Let

E=argminEV(H2)×V(H2)E(H1)=E(H2)E|E|\triangle E=\operatorname*{argmin}_{\begin{subarray}{c}\triangle E^{\prime}\subseteq V(H_{2})\times V(H_{2})\\ E(H_{1})=E(H_{2})\oplus\triangle E^{\prime}\end{subarray}}|\triangle E^{\prime}|

denote the smallest set of edge modifications to H2H_{2} that produce a graph isomorphic to H1H_{1}. We now lower bound |E||\triangle E|. For a component CC\in\mathcal{F}, let E(C)E(C) denote the edge set of that component. Fix a component CC such that E(C)EE(C)\cap\triangle E\neq\emptyset. The following cases arise.

  • Case 1: EE(C)\triangle E\cap E(C) is connected. Thus, the deletion edits do not disconnect CC. In this case, after the insertion edits in E\triangle E, we obtain a component in 1/2{\cal M}_{1/2}. By the definition of suitable families, if t>2/εt>2/\varepsilon, it takes at least (follow the sentence). If t2/εt\leq 2/\varepsilon, then CC obviously gets at least one edit, which is at least Ω(εt)\Omega(\varepsilon t) edits. In both cases, the number of edits is Ω(εt)\Omega(\varepsilon t).

  • Case 2: EE(C)\triangle E\cap E(C) is not connected. Now, we get two subcases depending on size of the biggest component in EE(C)\triangle E\cap E(C).

    1. 1.

      The biggest component in CEC\setminus\triangle E has size at least (11/100d)t(1-1/100d)\cdot t. In this case, EE(C)\triangle E\oplus E(C) maps CC to some component in 1/2{\cal M}_{1/2}. Now we split into two cases. Suppose t>2/εt>2/\varepsilon. Since CC is a bounded degree graph, the number of insertion edits (by Definition 3.1) is at least 0.02t(0.01dd)t=0.01t0.02t-(\frac{0.01}{d}\cdot d)t=0.01t. Suppose t2/εt\leq 2/\varepsilon. In any case CC must get at least one insertion edit to make it have size tt. So the number of edits is at least Ω(εt)\Omega(\varepsilon t).

    2. 2.

      The biggest component in EE(C)\triangle E\cap E(C) has size at most (11100d)t(1-\frac{1}{100d})\cdot t. In this case, note that E\triangle E removes a balanced separator of CC. Thus, the number of deletion edits made to CC is at least Ω(εt)\Omega(\varepsilon t).

In all of the above cases we see that for any component C𝚌𝚘𝚖𝚙𝚘𝚗𝚎𝚗𝚝𝚜(𝙷𝟸)C\in\tt components(H_{2}) with EE(C)\triangle E\cap E(C)\neq\emptyset, the number of edits made to CC is at least Ω(εt)\Omega(\varepsilon t). Finally, note that

|{C𝚌𝚘𝚖𝚙𝚘𝚗𝚎𝚗𝚝𝚜(𝙷𝟸):𝙴(𝙲)𝙴}||𝚌𝚘𝚖𝚙𝚘𝚗𝚎𝚗𝚝𝚜(𝙷𝟸)|𝟸.|\{C\in\tt components(H_{2})\colon E(C)\cap\triangle E\neq\emptyset\}|\geq\frac{|\tt components(H_{2})|}{2}.

And therefore, the distance between graphs H1H_{1} and H2H_{2} is at least Ω(ε)\Omega(\varepsilon) as desired. ∎

We can now define the graph HH used to prove the lower bound Theorem 1.2. For convenience, assume that nn is a multiple of t||t|\mathcal{F}|, the size of graphs in \mathcal{F}. (If not, we will pad the final graph with extra isolated vertices.) The (unlabeled) graph HH is a disjoint union of n/t||n/t|\mathcal{F}| copies of every graph in \mathcal{F}.

Analogously, for RI,|R|=||/2R\subset I,|R|=|\mathcal{F}|/2, let HRH_{R} be the graph that has n/(2t||)n/(2t|\mathcal{F}|) copies of every graph in \mathcal{F} indexed by RR. For convenience, we refer to any such RR as a half index set.

3.1 The distinguishing problem

Note that input graph is really a labeled instance, as given by the adjacency list. We now give the explicit distribution of labeled inputs GG and the main “distinguishing” problem. By Yao’s minimax lemma, it suffices to prove lower bounds for determinstic algorithms over randomized inputs. So let us define the YES and NO distributions.

  • YES: Generate a uar labeling of HH.

  • NO: Generate a uar half index set RR. Then, generate a uar labeling of HRH_{R}.

Given an input GG generated from either of these distribution, the distinguishing problem is to determine, with probability >2/3>2/3, which distribution GG came from. (Note that the supports of these distributions are disjoint.)

Claim 3.4.

If there exists a deterministic distinguisher that makes qq queries (where qq is independent of nn), then there exists an algorithm with the following property. Given input GG (generated as above), the algorithm gets qq uar connected components of GG and determines, with probability >2/3o(1)>2/3-o(1), whether GG came from the YES or NO distribution.

Proof:.

Consider any deterministic algorithm. One can express its behavior as follows. At any stage, it has explored the adjacency lists of some vertices v1,v2,,vbv_{1},v_{2},\ldots,v_{b}. It can choose to further query the adjacency lists of these vertices. In this case, it further explores the connected components already visited. The algorithm can also choose to visit a new vertex ID vv^{\prime}. Note that vv^{\prime} is an arbitrary (but fixed) function of all the vertex IDs and edges seen so far. Since the input is a uar labeling of either HH or HRH_{R}, the probability that vv^{\prime} lies in a component already visited is at most qt/nqt/n. With probability at least 1qt/n1-qt/n, vv^{\prime} lies in a uar connected component (since all components have exactly the same size, tt). By a union bound, with probability at least 1q2t/n=1o(1)1-q^{2}t/n=1-o(1), the algorithm sees a uar set of at most qq components and makes its decision.

The total number of components is n/tn/t, and hence the probability of seeing a specific set of qq components is (up to 1±o(1)1\pm o(1)-factors) 1/(n/tq)1/{n/t\choose q}. Since nn is sufficiently large, this probability is within (1+o(1))(1+o(1))-factors of q!/(n/t)qq!/(n/t)^{q}. This implies that with probability at least 1o(1)1-o(1), the algorithm sees a uniform distribution of multisets of qq components. Alternately, the algorithm gets qq uar connected components of GG. The success probability only changes by (1o(1))(1-o(1))-factors. ∎

The proof of Lemma 3.2 is now a straightforward reduction from distribution testing and an application of the tools built thus far.

Proof:.

(of Lemma 3.2) By Claim 3.3, all graphs generated by the NO distribution are ε\varepsilon-far from the property {H}\{H\}. By Yao’s minimax lemma, it suffices to show that any deterministic distinguisher requires Ω(||)\Omega(\sqrt{|\mathcal{F}|}) queries.

By Claim 3.4, if there is a deterministic distinguisher making qq queries, then there is an algorithm that can distinguish HH from HRH_{R} (with probability >2/3o(1)>2/3-o(1)) by querying qq uar connected components. Standard arguments in distribution testing show that Ω(||)\Omega(\sqrt{|\mathcal{F}|}) samples are required to distinguish the uniform distribution on the index set I=[||]I=[|\mathcal{F}|] from the uniform distribution on a uar half index set RR (refer to the section titled “An Ω(n/ε2)\Omega(\sqrt{n}/\varepsilon^{2}) lower bound” on page 13 of [Can20]). By mapping II arbitrarily to \mathcal{F}, getting samples from the uniform distribution in II is equivalent to getting uar components from GG generated from the YES distribution. Analogously, samples from the uniform distribution on a uar half index set RR is equivalent to uar components from GG generated from the NO distribution. Thus, any algorithm that distinguishes HH from HRH_{R} with uar connected components requires Ω(||)\Omega(\sqrt{|\mathcal{F}|}) samples, leading to the lower bound for deterministic distinguishers. ∎

4 A suitable family for planar graphs

The main result of this section is the following theorem.

Theorem 4.1.

There exists a family \mathcal{F} of planar graphs where each graph has s2s^{2} vertices and

  • ||exp(Ω(s2))|\mathcal{F}|\geq\exp(\Omega(s^{2})).

  • For every GG\in\mathcal{F}, the size of a minimum balanced vertex separator in GG is Ω(s)\Omega(s)

  • For every pair of graphs G,GG,G^{\prime}\in\mathcal{F}, it takes at least 0.16s20.16s^{2} edge edits to go from GG to GG^{\prime}. In particular, it holds that GG and GG^{\prime} are 0.020.02-far.

We collect some ingredients which will be useful in proving Theorem 4.1. The first important ingredient we need is Whitney’s theorem.

Theorem 4.2 (Whitney’s Theorem. (Theorem 4.3.2 in [Die17])).

Any two planar embeddings of a 33-connected planar graph are equivalent.

We first define a “base” graph (Definition 4.3). This is obtained in the following manner. Let us start with the s×ss\times s grid which we denote as G0G_{0}. The label set of G0G_{0} is indexed by a pair (i,j)[s]×[s](i,j)\in[s]\times[s]. We call this labeling the standard grid order. We denote the base graph by GG, and GG is constructed on top of G0G_{0} by the addition of some specific edges. Essentially, we add edges at the corners; we refer the reader to Fig. 1. We add edges from (0,0)(0,0) to (0,2),(2,0)(0,2),(2,0), from (0,s)(0,s) to (0,s3),(3,s)(0,s-3),(3,s), from (s,s)(s,s) to (s,s4),(s4,s)(s,s-4),(s-4,s) and from (s,0)(s,0) to (s,5),(s5,0)(s,5),(s-5,0). This graph GG thus differs from G0G_{0} only in the edges adjacent to the corner vertices, making all the corners unique. Let us label GG according to the standard grid order. For uVu\in V, we thus identify uu as the pair (xu,yu)(x_{u},y_{u}) which refers to the (x,y)(x,y) coordinate of uu on the grid. With this labeling, we have E(G)=E(G0)E1E(G)=E(G_{0})\cup E_{1}. where the set E1E_{1} contains all the non grid neighbors of all the corner vertices.

Definition 4.3.

Consider the graph GG obtained above and consider the planar embedding of GG (which is unique by Theorem 4.2). We will fix an embedding of GG where

  • The unique corner vertex adjacent to two vertices 2 hops away (according to standard grid order) is located at (0,0)(0,0).

  • The unique corner vertex adjacent to two vertices 3 hops away (in the standard grid order) is located at (0,s)(0,s).

  • The unique corner vertex adjacent to two vertices 4 hops away (in the standard grid order) is located at (s,s)(s,s).

  • The unique corner vertex adjacent to two vertices 5 hops away (in the standard grid order) is located at (s,0)(s,0).

We call this graph (and by abuse of terminology, its embedding) as the base graph.

Figure 1: An example of one of the graphs from our family \mathcal{B}.

4.1 Proof of Theorem 4.1

Let us begin by defining the following family of graphs. We will show that all graphs in this family are pairwise non isomorphic.

Construction 4.4.

The family \mathcal{B} of graphs is a family of graphs labeled according to standard grid order which is obtained from the base graph GG in the following way. Let

=[0,s)×(1,s]{\cal I}=[0,s)\times(1,s]

denote an index set for vertices in V(G)V(G). Graphs in this family are obtained in the following manner. For each (i,j)(i,j)\in{\cal I}, we add exactly one of the following edges. Either we add

  • The upward diagonal ((i,j),(i+1,j+1))\left((i,j),(i+1,j+1)\right), or

  • The downward diagonal ((i+1,j),(i+1,j1))\left((i+1,j),(i+1,j-1)\right).

Noting ||=s2|{\cal I}|=s^{2}, note that the size of this family is ||=2s2|\mathcal{B}|=2^{s^{2}}.

Lemma 4.5.

The family \mathcal{B} defined in Construction 4.4 is a collection of 2s22^{s^{2}} pairwise non isomorphic graphs on s2s^{2} vertices.

Proof:.

As mentioned earlier in Definition 4.3, the base graph GG admits a unique planar drawing. This is due to Theorem 4.2, because the additional edges made our graph 33-connected. We fix that embedding. Now consider two drawings B1,B2B_{1},B_{2}\in\mathcal{B}. They differ in at least one diagonal and therefore they are non isomorphic. ∎

However, we still need to show that this family has a sufficient number of graphs that are far from being isomorphic to each other. We state this in the following lemma.

Lemma 4.6.

There is a greedy procedure that takes as input the family \mathcal{B} of graphs and returns a family \mathcal{F} of size at least 2s2/52^{s^{2}/5} all of which are pairwise 0.010.01-far from each other.

Proof:.

We begin with the following observations. Consider the ball Bb(G)B_{b}(G) for any instance of the modified grid GG, which is the set of graphs reachable from GG by bb (diagonal) edge deletions. For any GG\in\mathcal{B}, note that the number of edges in GG is at most s2s^{2}. Thus, the number of possible graphs reachable from GG after making bb deletions is at most (s2b){s^{2}\choose b}. We replace these deleted diagonals with diagonals oriented the other way. Choose b=0.16s2b=0.16s^{2}. We apply the bound (ab)(ea/b)b{a\choose b}\leq(ea/b)^{b}.

maxG|Bb(G)|\displaystyle\max_{G\in\mathcal{B}}|B_{b}(G)| (s2b)(es2b)b(es20.16s2)0.16s220.8s2\displaystyle\leq{s^{2}\choose b}\leq\left(\frac{es^{2}}{b}\right)^{b}\leq\left(\frac{es^{2}}{0.16s^{2}}\right)^{0.16s^{2}}\leq 2^{0.8s^{2}}

Let \mathcal{F} denote the set found by the algorithm below. Using the above bound on maxG|Bb(G)|\max_{G\in\mathcal{B}}|B_{b}(G)|, it holds that

||2s2maxG0.16s2(G)2s220.8s22s2/5.|\mathcal{F}|\geq\frac{2^{s^{2}}}{\max_{G\in\mathcal{B}}\mathcal{B}_{0.16s^{2}}(G)}\geq\frac{2^{s^{2}}}{2^{0.8s^{2}}}\geq 2^{s^{2}/5}.

Since for any GG\in\mathcal{B}, the maximum degree of any vertex in GG is at most 88, and all the graphs in the set \mathcal{F} disagree in at least 0.16s20.16s^{2} edges, it follows that all the graphs in \mathcal{F} are pairwise 0.020.02-far. Below, we present our greedy procedure.

TakeScoops(\mathcal{B}) 1. Initialize =\mathcal{F}=\emptyset. 2. While \mathcal{B}\neq\emptyset: (a) =\mathcal{F}=\emptyset. (b) =B0.16s2(G)\mathcal{B}=\mathcal{B}\setminus B_{0.16s^{2}}(G). (c) =G\mathcal{F}=\mathcal{F}\cup G. end while 3. return \mathcal{F}.

The family \mathcal{F} is thus the family output with the desired properties as stated in Theorem 4.1.

The proof of the theorem follows immediately from the lemmas.

Proof of Theorem 4.1.

We show that the family \mathcal{F} obtained in Lemma 4.6 satisfies all the criteria we require to be satisfied in Theorem 4.1. We pick these one by one:

  • ||exp(Ω(s2)):¯\underline{|\mathcal{F}|\geq\exp(\Omega(s^{2})):} This is a direct consequence of Lemma 4.6.

  • For every G, the size of the minimum balanced separator is Ω(s):¯\underline{\textrm{For every }G\in\mathcal{F},\textrm{ the size of the minimum balanced separator is }\Omega(s):} This follows from the fact that the family \mathcal{F} comprises graphs that contain the grid on s×ss\times s vertices as proper induced subgraphs. The grid has balanced separators of size Ω(s)\Omega(s), so all graphs in \mathcal{F} must have separators at least as large.

  • Graphs in  are all pairwise 0.01 far from each other:¯\underline{\textrm{Graphs in }\mathcal{F}\textrm{ are all pairwise }0.01\textrm{ far from each other:}} Follows from Lemma 4.6.

This shows that \mathcal{F} follows the properties described in theorem 4.1, thus completing its proof. ∎

4.1.1 Proof of Theorem 1.2

The proof follows from plugging in the right value for ss. In Theorem 4.1, we set s=ε1s=\varepsilon^{-1}. We get a suitable family \mathcal{F} with t=ε2t=\varepsilon^{-2}, where ||=exp(Ω(ε2))|\mathcal{F}|=\exp(\Omega(\varepsilon^{-2})). By Lemma 3.2, we have a explicit singleton property that requires Ω(||)=exp(Ω(ε2))\Omega(\sqrt{|\mathcal{F}|})=\exp(\Omega(\varepsilon^{-2})) queries to test.

5 A suitable family of trees

This construction is fairly straightforward: we simply pick all distinct trees on ε1\varepsilon^{-1} vertices. We need to take care to count the number of unrooted trees, so some work is required.

Let UNLABr(s)\texttt{UNLAB}_{r}(s) denote the set of unlabled and rooted binary trees on ss vertices. We first recall the following fact.

Fact 5.1.

The number of different unlabeled rooted binary trees on ss vertices is |UNLABr(s)|2Ω(s)|\texttt{UNLAB}_{r}(s)|\geq 2^{\Omega(s)} (Probelm 1.44 in [FS09] where they attribute this result to [Ott48]).

We would like to lower bound the number of unlabeled, unrooted and binary trees on ss vertices. To this end, we first make the following definition.

Definition 5.2.

Write UNLABr(s)={(T,r):T is unlableled with root r}\texttt{UNLAB}_{r}(s)=\{(T,r)\colon T\text{ is unlableled with root }r\}. For (T,r)UNLABr(n)(T,r)\in\texttt{UNLAB}_{r}(n), let

Orbit(T)={(T,r)UNLABr(n):TT}.Orbit(T)=\{(T^{\prime},r^{\prime})\in\texttt{UNLAB}_{r}(n):T^{\prime}\cong T\}.

Note that this also means TOrbit(T)T\in Orbit(T^{\prime}). Also, note that if (T′′,r′′)Orbit(T)(T^{\prime\prime},r^{\prime\prime})\not\in Orbit(T) then Orbit(T′′)Orbit(T)=Orbit(T^{\prime\prime})\cap Orbit(T)=\emptyset.

Now, we note the following.

Claim 5.3.

For any (T,r)UNLABr(s)(T,r)\in\texttt{UNLAB}_{r}(s), |Orbit(T)|s|Orbit(T)|\leq s.

Proof:.

Suppose not. Then (T,x),(T′′,x)Orbit(T)\exists(T^{\prime},x),(T^{\prime\prime},x)\in Orbit(T) which share xx as the root where both TTT\cong T^{\prime} and TT′′T\cong T^{\prime\prime}. Then TT′′T^{\prime}\cong T^{\prime\prime} as well. However, we also have (T,x),(T′′,x)UNLABr(n)(T^{\prime},x),(T^{\prime\prime},x)\in\texttt{UNLAB}_{r}(n). Contradiction. ∎

Now, we will produce a family of unlabeled and unrooted binary trees. This is done by picking a single tree from each orbit in UNLABr(s)\texttt{UNLAB}_{r}(s). The following is immediate.

Claim 5.4.

|UNLABu(s)|2Ω(s)|\texttt{UNLAB}_{u}(s)|\geq 2^{\Omega(s)}.

Proof:.

Since all orbits have size at most ss, it holds that |UNLABu(s)||UNLABr(s)|2Ω(s)|\texttt{UNLAB}_{u}(s)|\geq|\texttt{UNLAB}_{r}(s)|\geq 2^{\Omega(s)}. ∎

We can now complete the proof of the main lower bound for forests.

Proof:.

(of Theorem 1.4) The suitable family is simply UNLABu(ε1)\texttt{UNLAB}_{u}(\varepsilon^{-1}). Since all graphs are of size t=1/εt=1/\varepsilon, it only suffices to verify the separator bound for suitable families. But that trivially holds, since the separator size is non-zero, which is Ω(1)=Ω(εt)\Omega(1)=\Omega(\varepsilon t). The size of the family is exp(ε1)\exp(\varepsilon^{-1}), by Claim 5.4. Lemma 3.2 gives the lower bound of exp(ε1)\exp(\varepsilon^{-1}) for property testing an explicit singleton property. ∎

References

  • [AST94] Noga Alon, Paul D. Seymour, and Robin Thomas. Planar separators. SIAM J. Discrete Math., 7(2):184–193, 1994.
  • [BGHK95] H.L. Bodlaender, J.R. Gilbert, H. Hafsteinsson, and T. Kloks. Approximating treewidth, pathwidth, frontsize, and shortest elimination tree. Journal of Algorithms, 18(2):238–255, 1995.
  • [BSS08] I. Benjamini, O. Schramm, and A. Shapira. Every minor-closed property of sparse graphs is testable. In Symposium on the Theory of Computing (STOC), pages 393–402, 2008.
  • [Can20] Clément L. Canonne. A Survey on Distribution Testing: Your Data is Big. But is it Blue? Number 9 in Graduate Surveys. Theory of Computing Library, 2020.
  • [CMOS11] Artur Czumaj, Morteza Monemizadeh, Krzysztof Onak, and Christian Sohler. Planar graphs: Random walks and bipartiteness testing. In IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS, pages 423–432. IEEE Computer Society, 2011.
  • [CS19] Artur Czumaj and Christian Sohler. A characterization of graph properties testable for general planar graphs with one-sided error (it’s all about forbidden subgraphs). In 60th IEEE Annual Symposium on Foundations of Computer Science, FOCS, pages 1525–1548. IEEE Computer Society, 2019.
  • [Die17] Reinhard Diestel. Graph Theory. Springer Publishing Company, Incorporated, 5th edition, 2017.
  • [EHNO11] Alan Edelman, Avinatan Hassidim, Huy N. Nguyen, and Krzysztof Onak. An efficient partitioning oracle for bounded-treewidth graphs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques RANDOM, pages 530–541, 2011.
  • [FS09] Philippe Flajolet and Robert Sedgewick. Analytic Combinatorics. Cambridge University Press, 2009.
  • [Gol17] O. Goldreich. Introduction to Property Testing. Cambridge University Press, 2017.
  • [Gol21] Oded Goldreich. Open problems in property testing of graphs. Electron. Colloquium Comput. Complex., page 16, 2021.
  • [GR02] O. Goldreich and D. Ron. Property testing in bounded degree graphs. Algorithmica, 32(2):302–343, 2002.
  • [Gru12] Hermann Gruber. On balanced separators, treewidth, and cycle rank. Journal of Combinatorics, 3(4):669–681, 2012.
  • [HKNO09] A. Hassidim, J. Kelner, H. Nguyen, and K. Onak. Local graph partitions for approximation and testing. In Foundations of Computer Science (FOCS), pages 22–31, 2009.
  • [KSS19] Akash Kumar, C. Seshadhri, and Andrew Stolman. Random walks and forbidden minors II: a poly(d ϵ\epsilon-1{}^{\mbox{-1}})-query tester for minor-closed properties of bounded degree graphs. In STOC 2019., pages 559–567, 2019.
  • [KSS21] Akash Kumar, C. Seshadhri, and Andrew Stolman. Random walks and forbidden minors III: poly(d /ϵ\epsilon)-time partition oracles for minor-free graph classes. Electron. Colloquium Comput. Complex., 28:8, 2021.
  • [LR15] Reut Levi and Dana Ron. A quasi-polynomial time partition oracle for graphs with an excluded minor. ACM Transactions on Algorithms (TALG), 11(3):24, 2015.
  • [LS21] Reut Levi and Nadav Shoshan. Testing hamiltonicity (and other problems) in minor-free graphs. CoRR, abs/2102.11728, 2021.
  • [MR01] J. Ian Munro and Venkatesh Raman. Succinct representation of balanced parentheses and static trees. SIAM Journal on Computing, 31(3):762–776, 2001.
  • [NS13] Ilan Newman and Christian Sohler. Every property of hyperfinite graphs is testable. SIAM Journal on Computing, 42(3):1095–1112, 2013.
  • [Ott48] Richard Otter. The number of trees. Annals of Mathematics, 49(3):583–599, 1948.
  • [YI15] Yuichi Yoshida and Hiro Ito. Testing outerplanarity of bounded degree graphs. Algorithmica, 73(1):1–20, 2015.

Appendix A Missing proofs from §2

In this appendix, our objective is to prove Claim A.1 and Claim A.4. The former claim counts the number of graphs with treewidth at most τ\tau and is proved below.

Claim A.1.

The number of unlabeled bounded treewidth graphs on tt vertices is at most expO(tlogt)\exp{O(t\log t)}.

Proof:.

Observe that for a graph on tt vertices with bounded treewidth τ=O(1)\tau=O(1) has O(t)O(t) edges. Then, the number of unlabelled graphs of this form, TtT_{t} is at most (NO(t)){N\choose O(t)}, where N=(t2)N={t\choose 2}. Thus Tt=tO(t)T_{t}=t^{O}(t), which implies that the number of bounded treewidth graphs on tt vertices is at most exp(O(tlogt))\exp(O(t\log t)). ∎

Next, we produce the statment of Claim A.4. It postulates that there exist efficient poly(dε1)poly(d\varepsilon^{-1}) time implementations of partition oracles for dd-degree bounde graphs with bounded treewidth.

One key tool we need to prove this theorem is an analog of Alon-Seymour-Thomas style result which proves that a graph with treewidth at most τ\tau is (ε,O(τ/ε))(\varepsilon,O(\tau/\varepsilon))-hyperfinite.

Let s~(G)\tilde{s}(G) be the size of the smallest 1/32/31/3-2/3 balanced separator, whose removal leads to connected components of size at most 2/32/3 of the vertices.

Theorem A.2.

[Theorem 2 of [Gru12] and Theorem 12 of [BGHK95]] s~(G)1tw(G)\tilde{s}(G)-1\leq tw(G).

We now prove that every treewidth τ\tau graph is (ε,6τ/ε)(\varepsilon,6\tau/\varepsilon) hyperfinite.

Lemma A.3.

Fix ε>0\varepsilon>0, d,τd,\tau\in\mathbb{N}. Then for any graph GG with tw(G)τ1tw(G)\leq\tau-1 there exists a set of edges EE^{\prime} such that the following hold:

  • GEG\setminus E^{\prime} is a graph with connected components no larger than 6τ/ε6\tau/\varepsilon.

  • |E|εdn|E^{\prime}|\leq\varepsilon dn

Proof:.

We begin by showing item 1. Since GG has treewidth at most τ\tau, all of its subgraphs also have treewidth τ\tau. Also, recall tw(G)τtw(G)\leq\tau, implies that for all 1/31/3-balanced separators of GG satisfy |𝙱𝚊𝚕𝚂𝚎𝚙(G)|O(τ)|{\tt BalSep}(G)|\leq O(\tau).

Consider the following recursive procedure. Here 𝙳𝚎𝚕𝚜𝚎𝚙(G){\tt Delsep}(G) is a routine which returns a three tuple (S,G0,G1)(S,G_{0},G_{1}) where SS is the smallest balanced separator of GG and G0G_{0} and G1G_{1} are collections of connected components of GG each with size at least |V(G)|/3|V(G)|/3.

Decompose(G)(G) 1. If |V(G)|6τ/ε|V(G)|\geq 6\tau/\varepsilon (a) Write (S0,G0,G1)=𝙳𝚎𝚕𝚜𝚎𝚙(G)(S_{0},G_{0},G_{1})={\tt Delsep}(G). (b) 𝙳𝚎𝚌𝚘𝚖𝚙𝚘𝚜𝚎(G0),𝙳𝚎𝚌𝚘𝚖𝚙𝚘𝚜𝚎(G1){\tt Decompose}(G_{0}),{\tt Decompose}(G_{1}).

Note that the sets G0G_{0} and G1G_{1} produced at any intermediate step of this procedure are unions of connected components (and they are not necessarily connected components themselves). Consider the recursion tree of this process and let us examine the leaves of this recursion tree. The leaf nodes correspond to some subgraph GG (which is a union of connected components) obtained after repeated applications of balanced separator routine such that |V(G)|6τ/ε|V(G)|\leq 6\tau/\varepsilon. Also, since the procedure repeatedly deletes a 1/31/3-separator, it also holds that all the leaf nodes of the recursion tree correspond to graphs with at least 2τ/ε2\tau/\varepsilon vertices. Thus, since the leaves correspond to disjoint subgraphs, the number of leaves equals Lεn2τL\leq\frac{\varepsilon n}{2\tau} and the total number of nodes produced in the recursion is at most 2L2L. Generating every node requires deleting at most τ\tau edges. Thus, in all, we delete εdn\varepsilon dn edges which proves the lemma.

We now prove bounds on partition oracles for bounded treewidth graphs.

Claim A.4.

Let 𝒯τ{\cal T}_{\tau} denote the class of graphs with treewidth at most τ\tau. Let G𝒯τG\in{\cal T}_{\tau} denote a graph with treewidth at most τ\tau. Then GG admits an (ε,k)(\varepsilon,k) partition oracle with k30τεk\leq\frac{30\tau}{\varepsilon}.

Proof:.

Set α=ε/2\alpha=\varepsilon/2. Since 𝒯τ{\cal T}_{\tau} is a minor-closed class of graphs, it follows from Theorem 2.3 that any G𝒯τG\in{\cal T}_{\tau} admits an (α,O(α2))(\alpha,O(\alpha^{-2})) partition oracle whose setting on GG is denoted as 𝒫(G)\mathcal{P}(G). Note that the number of edges that run between the connected components of the underlying partition is at most αnd1/2εdn\alpha nd\leq 1/2\cdot\varepsilon dn. We will show how to modify the parition obtained by this oracle and obtain another oracle which returns components with size at most 30τ/ε30\tau/\varepsilon. This is done as follows.

Let P𝒫(G)P\in\mathcal{P}(G) denote some connected component in the partition 𝒫(G)\mathcal{P}(G). If |P|30τ/ε|P|\leq 30\tau/\varepsilon, we do not refine PP. On the other hand, if |P|>30τ/ε|P|>30\tau/\varepsilon, we use Lemma A.3 on PP with parameter β=ε/2\beta=\varepsilon/2. This refines PP and produces connected components of size at most 3τ/β6τ/ε30τ/ε3\tau/\beta\leq 6\tau/\varepsilon\leq 30\tau/\varepsilon as desired. And the total number of edges lost is at most βdn1/2εdn\beta dn\leq 1/2\cdot\varepsilon dn. Overall, the total number of edges lost is at most εdn\varepsilon dn as desired. ∎