The complexity of testing all properties of planar graphs, and the role of isomorphism

Sabyasachi Basu Department of Computer Science, University of California, Santa Cruz. [email protected] Akash Kumar School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne. [email protected] This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 759471) C. Seshadhri Department of Computer Science, University of California, Santa Cruz. [email protected]
SB and CS are supported by NSF DMS-2023495, CCF-1740850, CCF-1813165, CCF-1839317, CCF-1908384, CCF-1909790, and ARO Award W911NF1910294.

Abstract

Consider property testing on bounded degree graphs and let $\varepsilon>0$ denote the proximity parameter. A remarkable theorem of Newman-Sohler (SICOMP 2013) asserts that all properties of planar graphs (more generally hyperfinite) are testable with query complexity only depending on $\varepsilon$ . Recent advances in testing minor-freeness have proven that all additive and monotone properties of planar graphs can be tested in $\mathrm{poly}(\varepsilon^{-1})$ queries. Some properties falling outside this class, such as Hamiltonicity, also have a similar complexity for planar graphs. Motivated by these results, we ask: can all properties of planar graphs can be tested in $\mathrm{poly}(\varepsilon^{-1})$ queries? Is there a uniform query complexity upper bound for all planar properties, and what is the “hardest” such property to test?

We discover a surprisingly clean and optimal answer. Any property of bounded degree planar graphs can be tested in $\exp(O(\varepsilon^{-2}))$ queries. Moreover, there is a matching lower bound, up to constant factors in the exponent. The natural property of testing isomorphism to a fixed graph requires $\exp(\Omega(\varepsilon^{-2}))$ queries, thereby showing that (up to polynomial dependencies) isomorphism to an explicit fixed graph is the hardest property of planar graphs. The upper bound is a straightforward adapation of the Newman-Sohler analysis that tracks dependencies on $\varepsilon$ more carefully. The main technical contribution is the lower bound construction, which is achieved by a special family of planar graphs that are all mutually far from each other.

We can also apply our techniques to get analogous results for bounded treewidth graphs. We prove that all properties of bounded treewidth graphs can be tested in $\exp(O(\varepsilon^{-1}\log\varepsilon^{-1}))$ queries. Moreover, testing isomorphism to a fixed forest requires $\exp(\Omega(\varepsilon^{-1}))$ queries.

1 Introduction

Consider the setting of property testing for bounded degree graphs, under the model of random access to a graph adjacency list, as introduced by Goldreich-Ron [GR02]. Let $G=(V,E)$ be a graph where $V=[n]$ and the maximum degree is $d$ . We have random access to the list through neighbor queries. There is an oracle that, given $v\in V$ and $i\in[d]$ , returns the $i$ th neighbor of $v$ (if no neighbor exists, it returns $\bot$ ).

For a property $\mathcal{P}$ of graphs with degree bound $d$ , the distance of $G$ to $\mathcal{P}$ is the minimum number of edge additions/removals required to make $G$ have $\mathcal{P}$ , divided by $dn$ . We say that $G$ is $\varepsilon$ -far from $\mathcal{P}$ if the distance to $\mathcal{P}$ is more than $\varepsilon$ . A property tester for $\mathcal{P}$ is a randomized procedure that takes as input (query access to) $G$ and a proximity parameter, $\varepsilon>0$ . If $G\in\mathcal{P}$ , the tester must accept with probability at least $2/3$ . If $G$ is $\varepsilon$ -far from $\mathcal{P}$ , the tester must reject with probability at least $2/3$ . In our context, the property $\mathcal{P}$ is called testable if there exists a property tester for $\mathcal{P}$ whose query complexity is independent of $n$ .

One of the grand goals of property testing is to classify testable (graph) properties according to the query complexity of testing them. Of special interest are efficiently testable properties, whose testing complexity is $\mathrm{poly}(\varepsilon^{-1})$ . For bounded degree graphs, there is little clarity on this issue. A recent survey by Goldreich states: “Indeed, it is hard to find a common theme among [efficiently testable] properties…”. [Gol21]

Our work focuses on properties of planar graphs. Even the problem of just testing planarity has received much attention, whose complexity has only recently been shown to be $\mathrm{poly}(\varepsilon^{-1})$ [BSS08, HKNO09, EHNO11, YI15, LR15, KSS19]. Newman-Sohler proved that every planar property (actually, every hyperfinite property) is testable, but they do not provide an explicit complexity bound depending on $\varepsilon$ [NS13]. Kumar-Seshadhri-Stolman recently showed that every additive and monotone planar property can tested in $\mathrm{poly}(\varepsilon^{-1})$ queries [KSS19]. Applying techniques from this result, Levi-Shoshan showed that planar Hamiltonicity can be tested efficiently [LS21]. The latter property is neither additive nor monotone, so it is natural to ask if all planar properties can be tested efficiently. If not, does there exist a uniform complexity bound for all planar properties, and a candidate for the “hardest planar property”? (We note that there is significant work on characterizing testable planar properties, for the unbounded degree case, which is qualitatively different [CMOS11, CS19]. Details are given in §1.2.)

We discover that the answer to this question is the query complexity bound $\exp(\Theta(\varepsilon^{-2}))$ . Up to constant factors in the exponent, the hardest planar property is testing isomorphism to a fixed explicit graph.

Let us give some formalism, and discuss the connection to isomorphism. In our discussions, $n$ is basically fixed, so we are considering a non-uniform setting. A planar property $\Pi$ is a set of unlabeled bounded degree planar graph with $n$ vertices. The input $G$ is a labeled graph and the tester is trying to property test if $G$ is isomorphic (or equal to, if one ignores the labels) to any member of $\Pi$ . Observe that singleton properties, where $\Pi=\{H\}$ , are equivalent to testing if $G$ is isomorphic to an fixed graph $H$ . With this premable, we can state our main results.

Our upper bound is a straightforward adaptation of arguments in Newman-Sohler [NS13].

Theorem 1.1.

Consider a planar property $\Pi$ of $n$ -vertex, $d$ degree-bounded graphs. There is a property tester for $\Pi$ that makes $\mathrm{poly}(d)\exp(O(\varepsilon^{-2}))$ queries.

Our main technical result is a matching lower bound for an explicit singleton property. One of the surprises (at least to the authors) is that testing isomorphism to an arbitrary set of planar graphs is not harder, up to polynomial dependencies, than testing isomorphism to a single fixed graph. A recent compendium of open problems by Goldreich states the question of determining the complexity of testing isomorphism (Open Problem 2.4 in [Gol21]). We note that our theorems resolve this question for bounded-degree planar graphs.

Theorem 1.2.

For every sufficiently large $n$ , there exists a bounded-degree planar graph $H$ on $n$ vertices such that property testing $\Pi=\{H\}$ requires $\exp(\Omega(\varepsilon^{-2}))$ queries. Equivalently, testing isomorphism to $H$ requires $\exp(\Omega(\varepsilon^{-2}))$ queries.

Bounded treewidth classes: In the context of property testing, many results for planar graphs also hold for minor-free classes. It is natural to ask whether the bound of $\exp(\Theta(\varepsilon^{-2}))$ is the “right” answer for any property of minor-free graphs. We discover that to not be the case. For properties of bounded treewidth graphs, the answer is between $\exp((\varepsilon^{-1}))$ and $\exp((\varepsilon^{-1})\log\varepsilon^{-1})$ . The following theorems are also derived from the same methods for planar graphs, but the constructions are substantially simpler. The lower bound is achieved by a simple construction of forests. The main point of these results is to show that the bounded of $\exp(\Theta(\varepsilon^{-2}))$ achieved for planar properties can be significantly beaten for non-trivial minor-closed families.

Theorem 1.3.

Consider a bounded treewidth property $\Pi$ of $n$ -vertex, $d$ degree-bounded graphs. There is a property tester for $\Pi$ that makes $\mathrm{poly}(d)\exp(O(\varepsilon^{-1}\log\varepsilon^{-1}))$ queries.

Theorem 1.4.

For every sufficiently large $n$ , there exists a bounded-degree forest $H$ on $n$ vertices such that property testing $\Pi=\{H\}$ requires $\exp(\Omega(\varepsilon^{-1}))$ queries.

1.1 Main ideas

One of the key components of property testers for planar graphs is the notion of partition oracles for hyperfinite graphs, as introduced by Hassidim-Kelner-Nguyen-Onak [HKNO09]. Bounded degree planar graphs (and more generally, minor-closed families) are hyperfinite, meaning that one can remove a constant fraction of edges and obtain connected components of constant size. Specifically, one can remove $\varepsilon dn$ edges to get connected components of size $O(\varepsilon^{-2})$ .

Suppose $G$ is hyperfinite. A partition oracle is a local procedure that gives access to a hyperfinite decomposition to a graph $G$ , without any preprocessing. Given a query vertex $v$ , the partition oracle outputs a connected subgraph $C(v)$ of size $O(\varepsilon^{-2})$ , such that the union of components $C(v)$ forms a hyperfinite partition. Recent results give a partition oracle that runs in time $\mathrm{poly}(\varepsilon^{-1})$ per query [LR15, KSS21].

Newman-Sohler used partition oracles to prove that all hyperfinite (and hence planar) properties are testable [NS13]. This is where our results begin. Stripping down the Newman-Sohler arguments to their core, we can essentially treat a planar $G$ , up to $\varepsilon dn$ edge changes, as a distribution over connected planar graphs of size $O(\varepsilon^{-2})$ . The partition oracle allows us to sample efficiently from this distribution. (If the input graph $G$ is not hyperfinite, the partition oracle can detect that efficiently, and the graph can be directly rejected.) For convenience, let $\mathcal{D}(G)$ denote this distribution.

Existing theorems in combinatorics show that the number of planar graphs of size $O(\varepsilon^{-2})$ is $\exp(O(\varepsilon^{-2}))$ [MR01]; hence, this bounds the support size of $\mathcal{D}(G)$ . We can learn an approximation of $\mathcal{D}(G)$ up to $\varepsilon$ TV-distance with $\exp(O(\varepsilon^{-2}))$ queries to the partition oracle. It is not hard, but central to the argument, to show that if $\|\mathcal{D}(G)-\mathcal{D}(H)\|_{1}\leq\varepsilon$ , then $G$ is $\varepsilon$ -close to $H$ . For any property $\Pi$ , one can simply check exhaustively if the learned approximation to $\mathcal{D}(G)$ is close to $\mathcal{D}(H)$ for any $H\in\Pi$ . (While this could be expensive in running time, it requires no further queries to $G$ .)

The main challenge is in the lower bound. We need to find a property $\Pi$ such that testing $\Pi$ requires $\exp(\Omega(\varepsilon^{-2}))$ samples from $\mathcal{D}(G)$ . We seem to need a converse to the upper bound argument, showing if $\|\mathcal{D}(G)-\mathcal{D}(H)\|_{1}$ is large, then $G$ and $H$ are far from each other. But this is false in general! The support of $\mathcal{D}(G)$ is a set of graphs, which are mutable objects. Meaning, we can modify $\mathcal{D}(G)$ dramatically by only modifying a few edges of $G$ . (As an extreme case, $\mathcal{D}(G)$ and $\mathcal{D}(H)$ could have disjoint supports on graphs that are close to each other.)

All our graphs (both input and hard instance) will be collections of connected components of size $O(\varepsilon^{-2})$ . We begin by trying to construct a graph $H$ such that: for all $G$ , $\|\mathcal{D}(G)-\mathcal{D}(H)\|_{1}\geq\varepsilon$ implies that $G$ is $\varepsilon$ -far from $H$ . Moreover, determining if $\|\mathcal{D}(G)-\mathcal{D}(H)\|_{1}\geq\varepsilon$ should require $\exp(\Omega(\varepsilon^{-2}))$ samples from $\mathcal{D}(G)$ . A candidate is suggested by the problem of uniformity testing of distributions. Suppose we construct $H$ where $\mathcal{D}(H)$ is a uniform distribution on a collection $\mathcal{F}$ of graphs (each member of which has size $O(\varepsilon^{-2})$ ), such that $|\mathcal{F}|=\exp(\Omega(\varepsilon^{-2}))$ . By standard distribution testing arguments, distinguishing the uniform distribution on $\mathcal{F}$ from a uniform distribution on half of $\mathcal{F}$ requires $\sqrt{|\mathcal{F}|}=\exp(\Omega(\varepsilon^{-2}))$ samples. Suppose furthermore that all graphs in $\mathcal{F}$ are $\varepsilon$ -far from each other and all balanced separators of $F\in\mathcal{F}$ are at least of size $\varepsilon|F|$ . We prove that any graph $G$ such that $\mathcal{D}(G)$ is supported on half of $\mathcal{F}$ must be $\varepsilon$ -far from $H$ . Moreover, we can construct candidate input graphs $G$ , where any property tester can be simulated by sampling from $\mathcal{D}(G)$ . Putting all the arguments together, we get a bonafide lower bound: property testing $\Pi=\{H\}$ requires $\exp(\Omega(\varepsilon^{-2}))$ queries.

What remains is the main technical construction. We need to design $\mathcal{F}$ , the “suitable family” of graphs for the lower bound. We achieve this family by taking the $\varepsilon^{-1}\times\varepsilon^{-1}$ grid and adding a collection of diagonal edges. We apply a cleaning procedure to get a large family of graphs that are all far from each other. The separator size holds trivially, since each graph in $\mathcal{F}$ contains a large grid.

1.2 Related Work

Property testing on bounded-degree graphs is a vast topic and we direct the reader to Chapter 9 of Goldreich’s book [Gol17] for an introduction to the subject.

Arguably, the starting point for testing planarity is the seminal work of Benjamini-Schramm-Shapira [BSS08], who showed that all minor-closed properties are testable. This paper also introduced the significance of hyperfiniteness to property testing. Hassidim-Kelner-Nguyen-Onak [HKNO09] introduced the concept of partition oracles, a key tool in property testing for hyperfinite classes. Improving the query time of partition oracles was addressed by Edelman-Hassidim-Nguyen-Onak [EHNO11], Levi-Ron [LR15] and Kumar-Seshadhri-Stolman [KSS21].

The quest for characterizing testable (bounded-degree) graph properties and finding properties testable in $\mathrm{poly}(\varepsilon^{-1})$ is an important theme in property testing. We note that Goldreich’s recent survey explicitly calls these out as Open Problems 2.2 and 2.3 [Gol21]. One of the main inspirations for our work is the result of Newman-Sohler that proves that all hyperfinite properties are testable [NS13]. The recent work of [KSS21] proves that all additive and monotone minor-closed properties are efficiently testable, and Levi-Shoshan show that Hamiltonicity of minor-closed families is also efficiently testable [LS21].

We note an important line of work of testing properties of planar graphs, in the unbounded degree case. This direction was pioneered by Czumaj-Monemizadeh-Onak-Sohler [CMOS11], who showed the bipartitenesss is testable (independent of the size). A recent result of Czumaj-Sohler prove that all testable planar properties in the unbounded degree setting are related to subgraph freeness [CS19].

2 Proof of Theorem 1.1 and Theorem 1.3

We begin with some preliminaries: the notions of hyperfiniteness and partition oracles. Note that all graphs have $n$ vertices and degree bound $d$ .

Definition 2.1.

A graph $G$ is $(\varepsilon,k)$ hyperfinite if there exists a subset of $\varepsilon dn$ edges whose removal results in connected components of size at most $k$ .

A classic result of Alon-Seymour-Thomas (Prop. 4.1 of [AST94]) shows that all minor-closed families are $(\varepsilon,O(\varepsilon^{-2}))$ -hyperfinite. In Lemma A.3, we prove that all bounded treewidth families are $(\varepsilon,O(\varepsilon^{-1}))$ -hyperfinite. A partition oracle gives local access to a hyperfinite decomposition. We give the formal definition below (adapted from Def. 1.1 of [KSS21]).

Definition 2.2.

A procedure $\bm{A}$ is a partition oracle for a minor-closed family $\Pi$ of graphs if it satisfies the following properties. The deterministic procedure takes as input random access to $G=(V,E)$ , access to a random seed $r$ , a proximity parameter $\varepsilon>0$ , and a vertex $v$ of $G$ . (We will think of fixing $G,r,\varepsilon$ , so we use the notation $\bm{A}_{G,r,\varepsilon}$ . All probabilities are with respect to $r$ .) The procedure $\bm{A}_{G,r,\varepsilon}(v)$ outputs a connected set of vertices, such that the sets $\{\bm{A}_{G,r,\varepsilon}(v)\}$ , over all $v$ , form a partition of $V$ .

We say that the partition oracle outputs an $(\varepsilon,k)$ -hyperfinite decomposition if the following properties hold. (i) For all $v$ , $|\bm{A}_{G,r,\varepsilon}(v)|\leq k$ and (ii) with probability $>2/3$ (over $r$ ), the number of edges between the sets $\bm{A}_{G,r,\varepsilon}(v)$ is at most $\varepsilon dn$ .

The main result of [KSS21] that we use is the following.

Theorem 2.3 (Rephrasing of Theorem 1.2 [KSS21]).

For any bounded degree graph in a minor-closed family, there is a partition oracle that outputs an $(\varepsilon,O(\varepsilon^{-2}))$ -hyperfinite partition, and runs in time $O(\mathrm{poly}(d\varepsilon^{-1}))$ per query.

For bounded treewidth classes, the partition oracle can output $(\varepsilon,O(\varepsilon^{-1}))$ -hyperfinite partitions using standard arguments (Claim A.4)

2.1 Property testers through subgraph count vectors

We use partition oracles to summarize a hyperfinite graph by an approximate count vector. This technique was first used by Newman-Sohler [NS13]. We follow their analysis, but take care to keep track of various dependencies on $\varepsilon$ . This allows for getting the optimal query complexity bound.

Consider any planar property $\Pi$ (technically, the following arguments hold for any minor-closed property). For any planar $G$ , let $\mathcal{P}(G)$ denote the partition given by the partition oracle of Theorem 2.3. (Note that the partition is a random variable.) Let $\mathcal{B}_{k}$ be the set of unlabeled graphs in the family $\Pi$ with at most $k$ vertices. We will always set $k=O(\varepsilon^{-2})$ (though the exact setting will change for bounded-treewidth properties).

For any graph $G^{\prime}$ consisting of connected components of size at most $k$ and any $F\in\mathcal{B}_{k}$ , let $\mathop{ct}_{F}(G^{\prime})$ be the number of occurrences of $F$ in $G^{\prime}$ . Let $\mathop{ct}(G^{\prime})$ be the $|\mathcal{B}_{k}|$ -dimensional vector of these counts. Note that $\|\mathop{ct}(G^{\prime})\|_{1}\in[n/k,n]$ .

Claim 2.4.

Consider two graphs $G^{\prime}_{1},G^{\prime}_{2}$ consisting entirely of connected components of size at most $k$ . If $\|\mathop{ct}(G^{\prime}_{1})-\mathop{ct}(G^{\prime}_{2})\|_{1}\leq\gamma n$ , then $\mathop{dist}(G^{\prime}_{1},G^{\prime}_{2})\leq\gamma kd$ .

Proof:.

For every $F\in\mathcal{B}_{k}$ , we will modify $G^{\prime}_{1}$ and $G^{\prime}_{2}$ to equalize the $\mathop{ct}_{F}(G^{\prime}_{1})$ and $\mathop{ct}_{F}(G^{\prime}_{2})$ . We simply delete $|\mathop{ct}_{F}(G^{\prime}_{1})-\mathop{ct}_{F}(G^{\prime}_{2})|$ instances from either $G^{\prime}_{1}$ or $G^{\prime}_{2}$ (whichever has the larger count). This operation deletes at most $|\mathop{ct}_{F}(G^{\prime}_{1})-\mathop{ct}_{F}(G^{\prime}_{2})|kd$ edges. In total, the number of edges deleted is at most $\|\mathop{ct}(G^{\prime}_{1})-\mathop{ct}(G^{\prime}_{2})\|_{1}kd\leq\gamma nkd$ . ∎

For the property $\Pi$ that we wish to test, construct the following set $\bm{C}_{\Pi}$ of count vectors: for every $F\in\Pi$ (with $n$ vertices and degree bound $d$ ) and every subgraph $F^{\prime}$ that is an $(\varepsilon,k)$ partition of $F$ , add $\mathop{ct}(F^{\prime})$ to $\bm{C}_{\Pi}$ .

Lemma 2.5.

Suppose $\mathcal{P}(G)$ is a valid $(\varepsilon,k)$ -partition of $G$ . If $G\in\Pi$ , then $\mathop{ct}(\mathcal{P}(G))\in\bm{C}_{\Pi}$ . If $G$ is $3\varepsilon$ -far from $\Pi$ , then $\forall\bm{v}\in\bm{C}_{\Pi}$ , $\|\mathop{ct}(\mathcal{P}(G))-\bm{v}\|_{1}>\varepsilon n/kd$ .

Proof:.

Suppose $G\in\Pi$ . In the construction of $\bm{C}_{\Pi}$ described above, we can select $F$ as $G$ and $F^{\prime}$ as $\mathcal{P}(G)$ . Hence, $\mathop{ct}(\mathcal{P}(G))\in\bm{C}_{\Pi}$ .

Suppose $G$ is $3\varepsilon$ -far from $\Pi$ . Consider any $\bm{v}\in\bm{C}_{\Pi}$ , so $\bm{v}=\mathop{ct}(F^{\prime})$ , for $F^{\prime}$ being an $(\varepsilon,k)$ -partition of $F\in\Pi$ . By triangle inequality, $\mathop{dist}(\mathcal{P}(G),F^{\prime})\geq\mathop{dist}(G,F)-\mathop{dist}(G,\mathcal{P}(G))-\mathop{dist}(F,F^{\prime})$ . By farness, $\mathop{dist}(G,F)\geq 3\varepsilon$ . Because $\mathcal{P}(G)$ and $F^{\prime}$ are respective $(\varepsilon,k)$ -partitions, $\mathop{dist}(G,\mathcal{P}(G))$ and $\mathop{dist}(F,F^{\prime})$ are at most $\varepsilon$ . Hence, $\mathop{dist}(\mathcal{P}(G),F^{\prime})\geq\varepsilon$ . By Claim 2.4, $\|\mathop{ct}(\mathcal{P}(G))-\bm{v}\|_{1}>\varepsilon n/kd$ . ∎

Now, we present the main algorithmic ingredient of our property tester.

Claim 2.6.

Given $G$ and a setting of $\mathcal{P}(G)$ that is a valid $(\varepsilon,k)$ -partition, using $O(\mathrm{poly}(d\varepsilon^{-1})\delta^{-2}|\mathcal{B}_{k}|^{3}\log{|\mathcal{B}_{k}|})$ queries one can compute a vector $\bm{v}$ such that $\|\bm{v}-\mathop{ct}(\mathcal{P}(G))\|_{1}<\delta n$ with probability at least $1-\frac{1}{|\mathcal{B}_{k}|}$ .

Proof:.

Fix $F\in\mathcal{B}_{k}$ . We show how to approximate $\mathop{ct}_{F}(\mathcal{P}(G))$ . Pick uar vertex $s$ , and using the partition oracle, determine the component of $\mathcal{P}(G)$ containing $s$ . If the component is isomorphic to $F$ , declare success. The probability of success is exactly $|F|\mathop{ct}_{F}(\mathcal{P}(G))/n$ . By Chernoff-Hoeffding, we can get an additive $\delta/|\mathcal{B}_{k}|$ estimate with error probability $<|\mathcal{B}_{k}|^{-2}$ using $O(|\mathcal{B}_{k}|^{2}\delta^{-2}\log|\mathcal{B}_{k}|)$ samples. Thus, we get an additive $\delta n/(|F|\cdot|\mathcal{B}_{k}|)\leq\delta n/|\mathcal{B}_{k}|$ estimate for $\mathop{ct}_{F}(\mathcal{P}(G))$ . Applying for all $F$ , we get our estimate vector $\bm{v}$ . Note that the total approximation is $\|\bm{v}-\mathop{ct}(\mathcal{P}(G))\|_{1}<\delta n/|\mathcal{B}_{k}|\times|\mathcal{B}_{k}|=\delta n$ . The error probability, by union bound, is at most $|\mathcal{B}_{k}|^{-1}$ . By the running time bound of the partition oracle, the total number of queries made is $O(\mathrm{poly}(d\varepsilon^{-1})\delta^{-2}|\mathcal{B}_{k}|^{3}\log|\mathcal{B}_{k}|)$ . ∎

Now, we prove Theorem 1.1, repeated for convenience. To get the optimal bound for planar properties, we need the right upper bound for $\mathcal{B}_{k}$ . Numerous results in the past show that the answer is $\exp(O(k))$ . We cite one specific reference.

Theorem 2.7.

(Theorem 5 of [MR01]) A planar graph on $k$ vertices and $k^{\prime}$ edges can be represented uniquely by $8k+2k^{\prime}+o(k+k^{\prime})$ bits.

Hence, the number of planar graphs on $k$ vertices is $\exp(O(k))$ .

See 1.1

Proof:.

The tester has two phases. In the first phase, it uses Theorem 2.3 to get a hyperfinite partition. If it succeeds, then the tester proceeds to the second phase. It uses Claim 2.6 to approximate $\mathop{ct}(\mathcal{P}(G))$ and checks if it lies in $\bm{C}_{\Pi}$ .

The tester repeats the following $O(1)$ times. For a random seed $R$ , it sets up the partition oracle with this seed, and then estimates the number of cut edges via random sampling. This is done by sampling $\Theta(\varepsilon^{-1})$ uar vertices $u$ , picking a uar neighbor $v$ of $u$ , and calling the partition oracle on $u$ and $v$ . If the outputs are different components, then edge $(u,v)$ is cut. If more than an $\varepsilon/4$ -fraction of edges are cut, the process is repeated with a new choice of $R$ . Otherwise, $R$ is fixed, and the tester proceeds to the next phase. If no $R$ is found, the tester rejects.

At this point, a suitable $\mathcal{P}(G)$ has been discovered. We set $k=O(\varepsilon^{-2})$ , to be the size bound of components, as promised by Theorem 2.3. Using Claim 2.6, the tester computes an approximate count vector $\bm{v}$ of $\mathop{ct}(\mathcal{P}(G))$ , setting $\delta=\varepsilon/4kd$ . Note that $\bm{C}_{\Pi}$ is a fixed set of vectors, independent of the input. The tester then determines if $\exists\bm{w}\in\bm{C}_{\Pi}$ such that $\|\bm{v}-\bm{w}\|_{1}\leq\varepsilon n/2kd$ . If such a vector $\bm{w}$ exists, it accepts. Otherwise, it rejects. The description of the property tester is complete, and we proceed to the analysis.

Query complexity bound: Note that $k=O(\varepsilon^{-2})$ , $\delta=\varepsilon/4kd$ , and, by Theorem 2.7, $|\mathcal{B}_{k}|=\exp(O(\varepsilon^{-2}))$ . Hence, by Claim 2.6, the overall query complexity is $\mathrm{poly}(d)\exp(O(\varepsilon^{-2}))$ .

Correctness analysis: If $G$ is planar and hence $(\varepsilon/8,O(\varepsilon^{-2}))$ -hyperfinite, with probability at least $2/3$ over $R$ , the partition oracle computes a hyperfinite decomposition. Hence, with probability at least $5/6$ , the tester proceeds to the second phase with $\mathcal{P}(G)$ being a valid $(\varepsilon,O(\varepsilon^{-2}))$ -hyperfinite partition. Conversely, with probability at least $5/6$ , if the tester finds a suitable $R$ , then $\mathcal{P}(G)$ is a valid $(\varepsilon,O(\varepsilon^{-2}))$ -hyperfinite partition.

Suppose $G\in\Pi$ and hence planar. As argued above, with probability $>5/6$ , $\mathcal{P}(G)$ is an $(\varepsilon,O(\varepsilon^{-2}))$ -hyperfinite decomposition. By Claim 2.6, with probability $>5/6$ , $\|\bm{v}-\mathop{ct}(\mathcal{P}(G))\|_{1}<\varepsilon n/4kd$ . By the union bound, both conditions hold with probability at least $2/3$ . By Lemma 2.5, $\mathop{ct}(\mathcal{P}(G))\in\bm{C}_{\Pi}$ . Thus, there exists $\bm{w}$ in $\bm{C}_{\Pi}$ such that $\|\bm{v}-\bm{w}|_{1}\leq\varepsilon n/2kd$ and the tester accepts with probability at least $2/3$ .

Suppose $G$ is $3\varepsilon$ -far from $\Pi$ . By Lemma 2.5, for all $\bm{w}\in\bm{C}_{\Pi}$ , $\|\mathop{ct}(\mathcal{P}(G))-\bm{w}\|_{1}>\varepsilon n/kd$ . By the triangle inequality, $\|\bm{v}-\bm{w}\|_{1}>\varepsilon n/kd-\varepsilon n/4kd>\varepsilon n/2kd$ . Thus, the tester rejects with probability at least $2/3$ .

∎

The proof of Theorem 1.3 is nearly identical.

See 1.3

Proof:.

We redo the proof of Theorem 1.1. Claim A.4 shows that graphs with treewidth at most $\tau$ admit a partition oracle with $k=O(\tau/\varepsilon)$ . For bounded treewidth graphs, we apply the trivial bound $|\mathcal{B}_{k}|\leq 2^{O(k\log k)}$ (Claim A.1). We apply these bounds in the above proof, and get a query complexity of $O(\mathrm{poly}(d)\exp(\varepsilon^{-1}\log{\varepsilon^{-1}}))$ queries. ∎

3 Lower bounds through suitable families

The key construct for the lower bounds is given in the following definition. A graph class is monotone if it is closed under edge removals. A graph class is additive if it closed under taking disjoint unions of graphs.

Definition 3.1.

Fix $\varepsilon>0$ and a monotone, additive graph class ${\cal C}$ . We call a family $\mathcal{F}$ of distinct graphs (in ${\cal C}$ ) $\varepsilon$ -suitable if the following conditions hold:

•

All graphs in $\mathcal{F}$ have the same number of vertices (denoted $t$ ).
•

If $t>2/\varepsilon$ , $\forall F,F^{\prime}\in\mathcal{F}$ , $dist(F,F^{\prime})\geq 0.02$ .
•

$\forall F\in\mathcal{F}$ , the size of any $(\frac{0.01}{d},1-\frac{0.01}{d})$ balanced separator in $F$ is $\Omega(\varepsilon t)$ .

The main lemma used in our lower bound says that a large suitable family leads to a property testing lower bound.

Lemma 3.2.

Fix $\varepsilon>0$ , and let $\mathcal{F}$ denote an $\varepsilon$ -suitable family of a monotone, additive graph class ${\cal C}$ . Then, for all sufficiently large $n$ , there exists a graph $H\in{\cal C}$ with the following property. Any property tester for the property $\{H\}$ with proximity parameter $\varepsilon$ requires $\Omega(\sqrt{|\mathcal{F}|})$ queries.

The proof of Lemma 3.2 requires some tools, that we shall build up in this section. Firstly, using the properties of a suitable family, we can construct two graphs consisting entirely of components in $\mathcal{F}$ that are far from each other. These graphs form the core of the lower bound.

Claim 3.3.

Let $\mathcal{F}$ denote an $\varepsilon$ -suitable family of a graph class ${\cal C}$ (as defined in Lemma 3.2). Let $I=[|\mathcal{F}|]$ denote a set indexing graphs in $\mathcal{F}$ .

Let $H_{1}$ be the graph consisting of a disjoint union of all graphs in $|\mathcal{F}|$ . For any subset $R\subseteq I$ with $|R|=|I|/2$ , let $H_{R}$ be the disjoint union of two copies of each graph in $\mathcal{F}$ indexed by $R$ . Then the graphs $H$ and $H_{R}$ are $\Omega(\varepsilon)$ far from each other.

Proof:.

Since $\mathcal{F}$ is $\varepsilon$ -suitable, we know that all graphs $F\in\mathcal{F}$ are defined on $t=t(\varepsilon)$ vertices. We know for any $F,F^{\prime}\in\mathcal{F}$ , since $|F|=|F^{\prime}|=t$ the number of edge edits needed to change $F$ to $F^{\prime}$ is at least $\Omega(t)$ . Since $R$ will be fixed for the remainder of the proof, it will be convenient to refer to $H_{R}$ as $H_{2}$ for the rest of the argument. Let us denote the components of $H_{1}$ (resp $H_{2}$ ) as $\tt components(H_{1})$ (resp as $\tt components(H_{2})$ ). Let $s=|\tt components(\mathcal{F})|$ . $H_{2}$ contains $s/2$ connected components in $\mathcal{F}$ each of which occurs twice. Recall from Definition 3.1, $C,C^{\prime}\in\tt components(\mathcal{F})$ are $\Omega(1)$ -far from each other. Let ${\cal M}_{1/2}$ denote the set of missing components of $\mathcal{F}$ in $H_{2}$ . And let $\mathcal{P}_{1/2}$ denote the set of components from $\mathcal{F}$ that are present in $H_{2}$ (without duplicates). Thus, $H_{2}$ contains two subgraphs isomorphic to $\mathcal{P}_{1/2}$ which we denote as $\mathcal{P}^{1}_{1/2}$ and $\mathcal{P}^{2}_{1/2}$ . For $C\in\tt components(\mathcal{P}_{1/2})$ , let $C^{1}\in\tt components(\mathcal{P}^{1}_{1/2})$ denote the subgraph of $\mathcal{P}^{1}_{1/2}$ isomorphic to $C$ . Similarly, define $C^{2}$ . Let

\triangle E=\operatorname*{argmin}_{\begin{subarray}{c}\triangle E^{\prime}\subseteq V(H_{2})\times V(H_{2})\\ E(H_{1})=E(H_{2})\oplus\triangle E^{\prime}\end{subarray}}|\triangle E^{\prime}|

denote the smallest set of edge modifications to $H_{2}$ that produce a graph isomorphic to $H_{1}$ . We now lower bound $|\triangle E|$ . For a component $C\in\mathcal{F}$ , let $E(C)$ denote the edge set of that component. Fix a component $C$ such that $E(C)\cap\triangle E\neq\emptyset$ . The following cases arise.

•

Case 1: $\triangle E\cap E(C)$ is connected. Thus, the deletion edits do not disconnect $C$ . In this case, after the insertion edits in $\triangle E$ , we obtain a component in ${\cal M}_{1/2}$ . By the definition of suitable families, if $t>2/\varepsilon$ , it takes at least (follow the sentence). If $t\leq 2/\varepsilon$ , then $C$ obviously gets at least one edit, which is at least $\Omega(\varepsilon t)$ edits. In both cases, the number of edits is $\Omega(\varepsilon t)$ .
•

Case 2: $\triangle E\cap E(C)$ is not connected. Now, we get two subcases depending on size of the biggest component in $\triangle E\cap E(C)$ .
1. 1.
  
  The biggest component in $C\setminus\triangle E$ has size at least $(1-1/100d)\cdot t$ . In this case, $\triangle E\oplus E(C)$ maps $C$ to some component in ${\cal M}_{1/2}$ . Now we split into two cases. Suppose $t>2/\varepsilon$ . Since $C$ is a bounded degree graph, the number of insertion edits (by Definition 3.1) is at least $0.02t-(\frac{0.01}{d}\cdot d)t=0.01t$ . Suppose $t\leq 2/\varepsilon$ . In any case $C$ must get at least one insertion edit to make it have size $t$ . So the number of edits is at least $\Omega(\varepsilon t)$ .
2. 2.
  
  The biggest component in $\triangle E\cap E(C)$ has size at most $(1-\frac{1}{100d})\cdot t$ . In this case, note that $\triangle E$ removes a balanced separator of $C$ . Thus, the number of deletion edits made to $C$ is at least $\Omega(\varepsilon t)$ .

In all of the above cases we see that for any component $C\in\tt components(H_{2})$ with $\triangle E\cap E(C)\neq\emptyset$ , the number of edits made to $C$ is at least $\Omega(\varepsilon t)$ . Finally, note that

|\{C\in\tt components(H_{2})\colon E(C)\cap\triangle E\neq\emptyset\}|\geq\frac{|\tt components(H_{2})|}{2}.

And therefore, the distance between graphs $H_{1}$ and $H_{2}$ is at least $\Omega(\varepsilon)$ as desired. ∎

We can now define the graph $H$ used to prove the lower bound Theorem 1.2. For convenience, assume that $n$ is a multiple of $t|\mathcal{F}|$ , the size of graphs in $\mathcal{F}$ . (If not, we will pad the final graph with extra isolated vertices.) The (unlabeled) graph $H$ is a disjoint union of $n/t|\mathcal{F}|$ copies of every graph in $\mathcal{F}$ .

Analogously, for $R\subset I,|R|=|\mathcal{F}|/2$ , let $H_{R}$ be the graph that has $n/(2t|\mathcal{F}|)$ copies of every graph in $\mathcal{F}$ indexed by $R$ . For convenience, we refer to any such $R$ as a half index set.

3.1 The distinguishing problem

Note that input graph is really a labeled instance, as given by the adjacency list. We now give the explicit distribution of labeled inputs $G$ and the main “distinguishing” problem. By Yao’s minimax lemma, it suffices to prove lower bounds for determinstic algorithms over randomized inputs. So let us define the YES and NO distributions.

•

YES: Generate a uar labeling of $H$ .
•

NO: Generate a uar half index set $R$ . Then, generate a uar labeling of $H_{R}$ .

Given an input $G$ generated from either of these distribution, the distinguishing problem is to determine, with probability $>2/3$ , which distribution $G$ came from. (Note that the supports of these distributions are disjoint.)

Claim 3.4.

If there exists a deterministic distinguisher that makes $q$ queries (where $q$ is independent of $n$ ), then there exists an algorithm with the following property. Given input $G$ (generated as above), the algorithm gets $q$ uar connected components of $G$ and determines, with probability $>2/3-o(1)$ , whether $G$ came from the YES or NO distribution.

Proof:.

Consider any deterministic algorithm. One can express its behavior as follows. At any stage, it has explored the adjacency lists of some vertices $v_{1},v_{2},\ldots,v_{b}$ . It can choose to further query the adjacency lists of these vertices. In this case, it further explores the connected components already visited. The algorithm can also choose to visit a new vertex ID $v^{\prime}$ . Note that $v^{\prime}$ is an arbitrary (but fixed) function of all the vertex IDs and edges seen so far. Since the input is a uar labeling of either $H$ or $H_{R}$ , the probability that $v^{\prime}$ lies in a component already visited is at most $qt/n$ . With probability at least $1-qt/n$ , $v^{\prime}$ lies in a uar connected component (since all components have exactly the same size, $t$ ). By a union bound, with probability at least $1-q^{2}t/n=1-o(1)$ , the algorithm sees a uar set of at most $q$ components and makes its decision.

The total number of components is $n/t$ , and hence the probability of seeing a specific set of $q$ components is (up to $1\pm o(1)$ -factors) $1/{n/t\choose q}$ . Since $n$ is sufficiently large, this probability is within $(1+o(1))$ -factors of $q!/(n/t)^{q}$ . This implies that with probability at least $1-o(1)$ , the algorithm sees a uniform distribution of multisets of $q$ components. Alternately, the algorithm gets $q$ uar connected components of $G$ . The success probability only changes by $(1-o(1))$ -factors. ∎

The proof of Lemma 3.2 is now a straightforward reduction from distribution testing and an application of the tools built thus far.

Proof:.

(of Lemma 3.2) By Claim 3.3, all graphs generated by the NO distribution are $\varepsilon$ -far from the property $\{H\}$ . By Yao’s minimax lemma, it suffices to show that any deterministic distinguisher requires $\Omega(\sqrt{|\mathcal{F}|})$ queries.

By Claim 3.4, if there is a deterministic distinguisher making $q$ queries, then there is an algorithm that can distinguish $H$ from $H_{R}$ (with probability $>2/3-o(1)$ ) by querying $q$ uar connected components. Standard arguments in distribution testing show that $\Omega(\sqrt{|\mathcal{F}|})$ samples are required to distinguish the uniform distribution on the index set $I=[|\mathcal{F}|]$ from the uniform distribution on a uar half index set $R$ (refer to the section titled “An $\Omega(\sqrt{n}/\varepsilon^{2})$ lower bound” on page 13 of [Can20]). By mapping $I$ arbitrarily to $\mathcal{F}$ , getting samples from the uniform distribution in $I$ is equivalent to getting uar components from $G$ generated from the YES distribution. Analogously, samples from the uniform distribution on a uar half index set $R$ is equivalent to uar components from $G$ generated from the NO distribution. Thus, any algorithm that distinguishes $H$ from $H_{R}$ with uar connected components requires $\Omega(\sqrt{|\mathcal{F}|})$ samples, leading to the lower bound for deterministic distinguishers. ∎

4 A suitable family for planar graphs

The main result of this section is the following theorem.

Theorem 4.1.

There exists a family $\mathcal{F}$ of planar graphs where each graph has $s^{2}$ vertices and

•

$|\mathcal{F}|\geq\exp(\Omega(s^{2}))$ .
•

For every $G\in\mathcal{F}$ , the size of a minimum balanced vertex separator in $G$ is $\Omega(s)$
•

For every pair of graphs $G,G^{\prime}\in\mathcal{F}$ , it takes at least $0.16s^{2}$ edge edits to go from $G$ to $G^{\prime}$ . In particular, it holds that $G$ and $G^{\prime}$ are $0.02$ -far.

We collect some ingredients which will be useful in proving Theorem 4.1. The first important ingredient we need is Whitney’s theorem.

Theorem 4.2 (Whitney’s Theorem. (Theorem 4.3.2 in [Die17])).

Any two planar embeddings of a $3$ -connected planar graph are equivalent.

We first define a “base” graph (Definition 4.3). This is obtained in the following manner. Let us start with the $s\times s$ grid which we denote as $G_{0}$ . The label set of $G_{0}$ is indexed by a pair $(i,j)\in[s]\times[s]$ . We call this labeling the standard grid order. We denote the base graph by $G$ , and $G$ is constructed on top of $G_{0}$ by the addition of some specific edges. Essentially, we add edges at the corners; we refer the reader to Fig. 1. We add edges from $(0,0)$ to $(0,2),(2,0)$ , from $(0,s)$ to $(0,s-3),(3,s)$ , from $(s,s)$ to $(s,s-4),(s-4,s)$ and from $(s,0)$ to $(s,5),(s-5,0)$ . This graph $G$ thus differs from $G_{0}$ only in the edges adjacent to the corner vertices, making all the corners unique. Let us label $G$ according to the standard grid order. For $u\in V$ , we thus identify $u$ as the pair $(x_{u},y_{u})$ which refers to the $(x,y)$ coordinate of $u$ on the grid. With this labeling, we have $E(G)=E(G_{0})\cup E_{1}$ . where the set $E_{1}$ contains all the non grid neighbors of all the corner vertices.

Definition 4.3.

Consider the graph $G$ obtained above and consider the planar embedding of $G$ (which is unique by Theorem 4.2). We will fix an embedding of $G$ where

•

The unique corner vertex adjacent to two vertices 2 hops away (according to standard grid order) is located at $(0,0)$ .
•

The unique corner vertex adjacent to two vertices 3 hops away (in the standard grid order) is located at $(0,s)$ .
•

The unique corner vertex adjacent to two vertices 4 hops away (in the standard grid order) is located at $(s,s)$ .
•

The unique corner vertex adjacent to two vertices 5 hops away (in the standard grid order) is located at $(s,0)$ .

We call this graph (and by abuse of terminology, its embedding) as the base graph.

Figure 1: An example of one of the graphs from our family

\mathcal{B}

4.1 Proof of Theorem 4.1

Let us begin by defining the following family of graphs. We will show that all graphs in this family are pairwise non isomorphic.

Construction 4.4.

The family $\mathcal{B}$ of graphs is a family of graphs labeled according to standard grid order which is obtained from the base graph $G$ in the following way. Let

{\cal I}=[0,s)\times(1,s]

denote an index set for vertices in $V(G)$ . Graphs in this family are obtained in the following manner. For each $(i,j)\in{\cal I}$ , we add exactly one of the following edges. Either we add

•

The upward diagonal $\left((i,j),(i+1,j+1)\right)$ , or
•

The downward diagonal $\left((i+1,j),(i+1,j-1)\right)$ .

Noting $|{\cal I}|=s^{2}$ , note that the size of this family is $|\mathcal{B}|=2^{s^{2}}$ .

Lemma 4.5.

The family $\mathcal{B}$ defined in Construction 4.4 is a collection of $2^{s^{2}}$ pairwise non isomorphic graphs on $s^{2}$ vertices.

Proof:.

As mentioned earlier in Definition 4.3, the base graph $G$ admits a unique planar drawing. This is due to Theorem 4.2, because the additional edges made our graph $3$ -connected. We fix that embedding. Now consider two drawings $B_{1},B_{2}\in\mathcal{B}$ . They differ in at least one diagonal and therefore they are non isomorphic. ∎

However, we still need to show that this family has a sufficient number of graphs that are far from being isomorphic to each other. We state this in the following lemma.

Lemma 4.6.

There is a greedy procedure that takes as input the family $\mathcal{B}$ of graphs and returns a family $\mathcal{F}$ of size at least $2^{s^{2}/5}$ all of which are pairwise $0.01$ -far from each other.

Proof:.

We begin with the following observations. Consider the ball $B_{b}(G)$ for any instance of the modified grid $G$ , which is the set of graphs reachable from $G$ by $b$ (diagonal) edge deletions. For any $G\in\mathcal{B}$ , note that the number of edges in $G$ is at most $s^{2}$ . Thus, the number of possible graphs reachable from $G$ after making $b$ deletions is at most ${s^{2}\choose b}$ . We replace these deleted diagonals with diagonals oriented the other way. Choose $b=0.16s^{2}$ . We apply the bound ${a\choose b}\leq(ea/b)^{b}$ .

\displaystyle\max_{G\in\mathcal{B}}|B_{b}(G)|

\displaystyle\leq{s^{2}\choose b}\leq\left(\frac{es^{2}}{b}\right)^{b}\leq\left(\frac{es^{2}}{0.16s^{2}}\right)^{0.16s^{2}}\leq 2^{0.8s^{2}}

Let $\mathcal{F}$ denote the set found by the algorithm below. Using the above bound on $\max_{G\in\mathcal{B}}|B_{b}(G)|$ , it holds that

|\mathcal{F}|\geq\frac{2^{s^{2}}}{\max_{G\in\mathcal{B}}\mathcal{B}_{0.16s^{2}}(G)}\geq\frac{2^{s^{2}}}{2^{0.8s^{2}}}\geq 2^{s^{2}/5}.

Since for any $G\in\mathcal{B}$ , the maximum degree of any vertex in $G$ is at most $8$ , and all the graphs in the set $\mathcal{F}$ disagree in at least $0.16s^{2}$ edges, it follows that all the graphs in $\mathcal{F}$ are pairwise $0.02$ -far. Below, we present our greedy procedure.

TakeScoops( $\mathcal{B}$ ) 1. Initialize $\mathcal{F}=\emptyset$ . 2. While $\mathcal{B}\neq\emptyset$ : (a) $\mathcal{F}=\emptyset$ . (b) $\mathcal{B}=\mathcal{B}\setminus B_{0.16s^{2}}(G)$ . (c) $\mathcal{F}=\mathcal{F}\cup G$ . end while 3. return $\mathcal{F}$ .

The family $\mathcal{F}$ is thus the family output with the desired properties as stated in Theorem 4.1.

∎

The proof of the theorem follows immediately from the lemmas.

Proof of Theorem 4.1.

We show that the family $\mathcal{F}$ obtained in Lemma 4.6 satisfies all the criteria we require to be satisfied in Theorem 4.1. We pick these one by one:

•

$\underline{|\mathcal{F}|\geq\exp(\Omega(s^{2})):}$ This is a direct consequence of Lemma 4.6.
•

$\underline{\textrm{For every }G\in\mathcal{F},\textrm{ the size of the minimum balanced separator is }\Omega(s):}$ This follows from the fact that the family $\mathcal{F}$ comprises graphs that contain the grid on $s\times s$ vertices as proper induced subgraphs. The grid has balanced separators of size $\Omega(s)$ , so all graphs in $\mathcal{F}$ must have separators at least as large.
•

$\underline{\textrm{Graphs in }\mathcal{F}\textrm{ are all pairwise }0.01\textrm{ far from each other:}}$ Follows from Lemma 4.6.

This shows that $\mathcal{F}$ follows the properties described in theorem 4.1, thus completing its proof. ∎

4.1.1 Proof of Theorem 1.2

The proof follows from plugging in the right value for $s$ . In Theorem 4.1, we set $s=\varepsilon^{-1}$ . We get a suitable family $\mathcal{F}$ with $t=\varepsilon^{-2}$ , where $|\mathcal{F}|=\exp(\Omega(\varepsilon^{-2}))$ . By Lemma 3.2, we have a explicit singleton property that requires $\Omega(\sqrt{|\mathcal{F}|})=\exp(\Omega(\varepsilon^{-2}))$ queries to test.

5 A suitable family of trees

This construction is fairly straightforward: we simply pick all distinct trees on $\varepsilon^{-1}$ vertices. We need to take care to count the number of unrooted trees, so some work is required.

Let $\texttt{UNLAB}_{r}(s)$ denote the set of unlabled and rooted binary trees on $s$ vertices. We first recall the following fact.

Fact 5.1.

The number of different unlabeled rooted binary trees on $s$ vertices is $|\texttt{UNLAB}_{r}(s)|\geq 2^{\Omega(s)}$ (Probelm 1.44 in [FS09] where they attribute this result to [Ott48]).

We would like to lower bound the number of unlabeled, unrooted and binary trees on $s$ vertices. To this end, we first make the following definition.

Definition 5.2.

Write $\texttt{UNLAB}_{r}(s)=\{(T,r)\colon T\text{ is unlableled with root }r\}$ . For $(T,r)\in\texttt{UNLAB}_{r}(n)$ , let

Orbit(T)=\{(T^{\prime},r^{\prime})\in\texttt{UNLAB}_{r}(n):T^{\prime}\cong T\}.

Note that this also means $T\in Orbit(T^{\prime})$ . Also, note that if $(T^{\prime\prime},r^{\prime\prime})\not\in Orbit(T)$ then $Orbit(T^{\prime\prime})\cap Orbit(T)=\emptyset$ .

Now, we note the following.

Claim 5.3.

For any $(T,r)\in\texttt{UNLAB}_{r}(s)$ , $|Orbit(T)|\leq s$ .

Proof:.

Suppose not. Then $\exists(T^{\prime},x),(T^{\prime\prime},x)\in Orbit(T)$ which share $x$ as the root where both $T\cong T^{\prime}$ and $T\cong T^{\prime\prime}$ . Then $T^{\prime}\cong T^{\prime\prime}$ as well. However, we also have $(T^{\prime},x),(T^{\prime\prime},x)\in\texttt{UNLAB}_{r}(n)$ . Contradiction. ∎

Now, we will produce a family of unlabeled and unrooted binary trees. This is done by picking a single tree from each orbit in $\texttt{UNLAB}_{r}(s)$ . The following is immediate.

Claim 5.4.

$|\texttt{UNLAB}_{u}(s)|\geq 2^{\Omega(s)}$ .

Proof:.

Since all orbits have size at most $s$ , it holds that $|\texttt{UNLAB}_{u}(s)|\geq|\texttt{UNLAB}_{r}(s)|\geq 2^{\Omega(s)}$ . ∎

We can now complete the proof of the main lower bound for forests.

Proof:.

(of Theorem 1.4) The suitable family is simply $\texttt{UNLAB}_{u}(\varepsilon^{-1})$ . Since all graphs are of size $t=1/\varepsilon$ , it only suffices to verify the separator bound for suitable families. But that trivially holds, since the separator size is non-zero, which is $\Omega(1)=\Omega(\varepsilon t)$ . The size of the family is $\exp(\varepsilon^{-1})$ , by Claim 5.4. Lemma 3.2 gives the lower bound of $\exp(\varepsilon^{-1})$ for property testing an explicit singleton property. ∎

References

[AST94] Noga Alon, Paul D. Seymour, and Robin Thomas. Planar separators. SIAM J. Discrete Math., 7(2):184–193, 1994.
[BGHK95] H.L. Bodlaender, J.R. Gilbert, H. Hafsteinsson, and T. Kloks. Approximating treewidth, pathwidth, frontsize, and shortest elimination tree. Journal of Algorithms, 18(2):238–255, 1995.
[BSS08] I. Benjamini, O. Schramm, and A. Shapira. Every minor-closed property of sparse graphs is testable. In Symposium on the Theory of Computing (STOC), pages 393–402, 2008.
[Can20] Clément L. Canonne. A Survey on Distribution Testing: Your Data is Big. But is it Blue? Number 9 in Graduate Surveys. Theory of Computing Library, 2020.
[CMOS11] Artur Czumaj, Morteza Monemizadeh, Krzysztof Onak, and Christian Sohler. Planar graphs: Random walks and bipartiteness testing. In IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS, pages 423–432. IEEE Computer Society, 2011.
[CS19] Artur Czumaj and Christian Sohler. A characterization of graph properties testable for general planar graphs with one-sided error (it’s all about forbidden subgraphs). In 60th IEEE Annual Symposium on Foundations of Computer Science, FOCS, pages 1525–1548. IEEE Computer Society, 2019.
[Die17] Reinhard Diestel. Graph Theory. Springer Publishing Company, Incorporated, 5th edition, 2017.
[EHNO11] Alan Edelman, Avinatan Hassidim, Huy N. Nguyen, and Krzysztof Onak. An efficient partitioning oracle for bounded-treewidth graphs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques RANDOM, pages 530–541, 2011.
[FS09] Philippe Flajolet and Robert Sedgewick. Analytic Combinatorics. Cambridge University Press, 2009.
[Gol17] O. Goldreich. Introduction to Property Testing. Cambridge University Press, 2017.
[Gol21] Oded Goldreich. Open problems in property testing of graphs. Electron. Colloquium Comput. Complex., page 16, 2021.
[GR02] O. Goldreich and D. Ron. Property testing in bounded degree graphs. Algorithmica, 32(2):302–343, 2002.
[Gru12] Hermann Gruber. On balanced separators, treewidth, and cycle rank. Journal of Combinatorics, 3(4):669–681, 2012.
[HKNO09] A. Hassidim, J. Kelner, H. Nguyen, and K. Onak. Local graph partitions for approximation and testing. In Foundations of Computer Science (FOCS), pages 22–31, 2009.
[KSS19] Akash Kumar, C. Seshadhri, and Andrew Stolman. Random walks and forbidden minors II: a poly(d $\epsilon$ ${}^{\mbox{-1}}$ )-query tester for minor-closed properties of bounded degree graphs. In STOC 2019., pages 559–567, 2019.
[KSS21] Akash Kumar, C. Seshadhri, and Andrew Stolman. Random walks and forbidden minors III: poly(d / $\epsilon$ )-time partition oracles for minor-free graph classes. Electron. Colloquium Comput. Complex., 28:8, 2021.
[LR15] Reut Levi and Dana Ron. A quasi-polynomial time partition oracle for graphs with an excluded minor. ACM Transactions on Algorithms (TALG), 11(3):24, 2015.
[LS21] Reut Levi and Nadav Shoshan. Testing hamiltonicity (and other problems) in minor-free graphs. CoRR, abs/2102.11728, 2021.
[MR01] J. Ian Munro and Venkatesh Raman. Succinct representation of balanced parentheses and static trees. SIAM Journal on Computing, 31(3):762–776, 2001.
[NS13] Ilan Newman and Christian Sohler. Every property of hyperfinite graphs is testable. SIAM Journal on Computing, 42(3):1095–1112, 2013.
[Ott48] Richard Otter. The number of trees. Annals of Mathematics, 49(3):583–599, 1948.
[YI15] Yuichi Yoshida and Hiro Ito. Testing outerplanarity of bounded degree graphs. Algorithmica, 73(1):1–20, 2015.

Appendix A Missing proofs from §2

In this appendix, our objective is to prove Claim A.1 and Claim A.4. The former claim counts the number of graphs with treewidth at most $\tau$ and is proved below.

Claim A.1.

The number of unlabeled bounded treewidth graphs on $t$ vertices is at most $\exp{O(t\log t)}$ .

Proof:.

Observe that for a graph on $t$ vertices with bounded treewidth $\tau=O(1)$ has $O(t)$ edges. Then, the number of unlabelled graphs of this form, $T_{t}$ is at most ${N\choose O(t)}$ , where $N={t\choose 2}$ . Thus $T_{t}=t^{O}(t)$ , which implies that the number of bounded treewidth graphs on $t$ vertices is at most $\exp(O(t\log t))$ . ∎

Next, we produce the statment of Claim A.4. It postulates that there exist efficient $poly(d\varepsilon^{-1})$ time implementations of partition oracles for $d$ -degree bounde graphs with bounded treewidth.

One key tool we need to prove this theorem is an analog of Alon-Seymour-Thomas style result which proves that a graph with treewidth at most $\tau$ is $(\varepsilon,O(\tau/\varepsilon))$ -hyperfinite.

Let $\tilde{s}(G)$ be the size of the smallest $1/3-2/3$ balanced separator, whose removal leads to connected components of size at most $2/3$ of the vertices.

Theorem A.2.

[Theorem 2 of [Gru12] and Theorem 12 of [BGHK95]] $\tilde{s}(G)-1\leq tw(G)$ .

We now prove that every treewidth $\tau$ graph is $(\varepsilon,6\tau/\varepsilon)$ hyperfinite.

Lemma A.3.

Fix $\varepsilon>0$ , $d,\tau\in\mathbb{N}$ . Then for any graph $G$ with $tw(G)\leq\tau-1$ there exists a set of edges $E^{\prime}$ such that the following hold:

•

$G\setminus E^{\prime}$ is a graph with connected components no larger than $6\tau/\varepsilon$ .
•

$|E^{\prime}|\leq\varepsilon dn$

Proof:.

We begin by showing item 1. Since $G$ has treewidth at most $\tau$ , all of its subgraphs also have treewidth $\tau$ . Also, recall $tw(G)\leq\tau$ , implies that for all $1/3$ -balanced separators of $G$ satisfy $|{\tt BalSep}(G)|\leq O(\tau)$ .

Consider the following recursive procedure. Here ${\tt Delsep}(G)$ is a routine which returns a three tuple $(S,G_{0},G_{1})$ where $S$ is the smallest balanced separator of $G$ and $G_{0}$ and $G_{1}$ are collections of connected components of $G$ each with size at least $|V(G)|/3$ .

Decompose $(G)$ 1. If $|V(G)|\geq 6\tau/\varepsilon$ (a) Write $(S_{0},G_{0},G_{1})={\tt Delsep}(G)$ . (b) ${\tt Decompose}(G_{0}),{\tt Decompose}(G_{1})$ .

Note that the sets $G_{0}$ and $G_{1}$ produced at any intermediate step of this procedure are unions of connected components (and they are not necessarily connected components themselves). Consider the recursion tree of this process and let us examine the leaves of this recursion tree. The leaf nodes correspond to some subgraph $G$ (which is a union of connected components) obtained after repeated applications of balanced separator routine such that $|V(G)|\leq 6\tau/\varepsilon$ . Also, since the procedure repeatedly deletes a $1/3$ -separator, it also holds that all the leaf nodes of the recursion tree correspond to graphs with at least $2\tau/\varepsilon$ vertices. Thus, since the leaves correspond to disjoint subgraphs, the number of leaves equals $L\leq\frac{\varepsilon n}{2\tau}$ and the total number of nodes produced in the recursion is at most $2L$ . Generating every node requires deleting at most $\tau$ edges. Thus, in all, we delete $\varepsilon dn$ edges which proves the lemma.

∎

We now prove bounds on partition oracles for bounded treewidth graphs.

Claim A.4.

Let ${\cal T}_{\tau}$ denote the class of graphs with treewidth at most $\tau$ . Let $G\in{\cal T}_{\tau}$ denote a graph with treewidth at most $\tau$ . Then $G$ admits an $(\varepsilon,k)$ partition oracle with $k\leq\frac{30\tau}{\varepsilon}$ .

Proof:.

Set $\alpha=\varepsilon/2$ . Since ${\cal T}_{\tau}$ is a minor-closed class of graphs, it follows from Theorem 2.3 that any $G\in{\cal T}_{\tau}$ admits an $(\alpha,O(\alpha^{-2}))$ partition oracle whose setting on $G$ is denoted as $\mathcal{P}(G)$ . Note that the number of edges that run between the connected components of the underlying partition is at most $\alpha nd\leq 1/2\cdot\varepsilon dn$ . We will show how to modify the parition obtained by this oracle and obtain another oracle which returns components with size at most $30\tau/\varepsilon$ . This is done as follows.

Let $P\in\mathcal{P}(G)$ denote some connected component in the partition $\mathcal{P}(G)$ . If $|P|\leq 30\tau/\varepsilon$ , we do not refine $P$ . On the other hand, if $|P|>30\tau/\varepsilon$ , we use Lemma A.3 on $P$ with parameter $\beta=\varepsilon/2$ . This refines $P$ and produces connected components of size at most $3\tau/\beta\leq 6\tau/\varepsilon\leq 30\tau/\varepsilon$ as desired. And the total number of edges lost is at most $\beta dn\leq 1/2\cdot\varepsilon dn$ . Overall, the total number of edges lost is at most $\varepsilon dn$ as desired. ∎