This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

A Connected Component Labeling Algorithm for Implicitly-Defined Domains

Robert I. Saye Lawrence Berkeley National Laboratory, Berkeley, California, USA ([email protected])
Abstract

A connected component labeling algorithm is developed for implicitly-defined domains specified by multivariate polynomials. The algorithm operates by recursively subdividing the constraint domain into hyperrectangular subcells until the topology thereon is sufficiently simple; in particular, we devise a topology test using properties of Bernstein polynomials. In many cases the algorithm produces a certificate guaranteeing its correctness, i.e., two points yield the same label if and only if they are path-connected. To robustly handle various kinds of edge cases, the algorithm may assign identical labels to distinct components, but only when they are exactly or nearly touching, relative to a user-controlled length scale. A variety of numerical experiments assess the effectiveness of the overall approach, including statistical analyses on randomly generated multi-component geometry in 2D and 3D, as well as specific examples involving cusps, self-intersections, junctions, and other kinds of singularities.

keywords:
connected components, path connectedness, implicitly-defined domains, level set methods, Bernstein polynomials, semi-algebraic sets
{AMS}

65D99 (primary), 65D18, 14P10

1 Introduction

In this paper, we develop a connected component labeling algorithm for implicitly-defined domains specified by multivariate polynomials. In particular, we consider domains of the form Ω:=𝒰{ϕ=0}\Omega:=\mathcal{U}\setminus\{\phi=0\}, where ϕ:d\phi:\mathbb{R}^{d}\to\mathbb{R} is a given polynomial and 𝒰d\mathcal{U}\subset\mathbb{R}^{d} is a given bounded dd-dimensional hyperrectangle, and consider the problem of implementing a labeling function χ:Ω\chi:\Omega\to\mathbb{N} such that χ(x)=χ(y)\chi(x)=\chi(y) if and only if the two points x,yΩx,y\in\Omega are path-connected.

Our interest in this labeling problem stems from the author’s prior work on quadrature algorithms for multi-component implicitly-defined domains [18]. For a general, piecewise-smooth integrand function ff, these algorithms output a quadrature scheme of the form 𝒰fiwif(xi)\int_{\mathcal{U}}f\approx\sum_{i}w_{i}f(x_{i}) and have the following key property: if VV is a connected component of 𝒰{ϕ=0}\mathcal{U}\setminus\{\phi=0\} and ff is sufficiently smooth on VV, then VfxiVwif(xi)\int_{V}f\approx\sum_{x_{i}\in V}w_{i}f(x_{i}) represents a high-order accurate quadrature scheme on VV. In other words, to build a quadrature scheme on VV, one may simply discard the quadrature points outside of VV, leaving the weights of the remaining points unmodified. Consequently, one may think of the bulk quadrature scheme over 𝒰\mathcal{U} as an agglomeration of smaller quadrature schemes over the individual connected components of 𝒰{ϕ=0}\mathcal{U}\setminus\{\phi=0\}. The algorithms developed in [18], however, do not automatically determine this grouping; one of the aims of the present work is to design a simple and efficient algorithm for this purpose. Besides this application, connected component analysis arises in a wide variety of different settings, including in computer-aided geometric design, shape modeling and pattern matching, and in building topological feature descriptors of physical data sets [12].

A number of different approaches could be taken to solve the connected component labeling problem:

  • One approach is to apply the workhorse tools of real algebraic geometry, such as the cylindrical algebraic decomposition (CAD) algorithm pioneered by Collins [6, 5], or by computing roadmaps [2, 3]. In exact or arbitrary-precision arithmetic, CAD and roadmap methods can be used to identify and analyze the connected components of general, arbitrarily complex semi-algebraic sets (of which Ω\Omega is one particular instance). However, these methods can be computationally expensive; moreover, their implementation in fixed-precision arithmetic requires considerable care, even for relatively simple geometry.

  • A second approach is to apply the computational methods of Morse theory, Reeb graphs, and contour trees [11, 12]. Roughly speaking, these methods compute the topology of the set of level sets of ϕ\phi via the location of its extrema and saddle points, combined with gradient path tracing algorithms. These methods become increasingly more intricate to implement as the dimension dd increases.

  • A third approach might be to isolate individual components of Ω+:=𝒰{ϕ>0}\Omega^{+}:=\mathcal{U}\cap\{\phi>0\} by finding a polynomial which can separate them (followed by a similar procedure for Ω:=𝒰{ϕ<0}\Omega^{-}:=\mathcal{U}\cap\{\phi<0\}). For example, if we could find a polynomial u:du:\mathbb{R}^{d}\to\mathbb{R} such that Ω+{u=0}\Omega^{+}\cap\{u=0\} is empty and u(x)u(y)<0u(x)u(y)<0 for two sample points x,yΩ+x,y\in\Omega^{+}, it would follow that the zero level set of uu forms a kind of divider that separates the two connected components associated with xx and yy. To find this polynomial, one could use polynomial positivstellensatz or sums-of-squares methods [13, 14] to compute, if possible, a polynomial ww such that: (i) w>0w>0 on Ω+\Omega^{+}; (ii) w=u2w=u^{2}; and (iii) u(x)u(y)<0u(x)u(y)<0. Part (i), (ignoring conditions (ii) and (iii) and various other subtleties,) can be solved efficiently via linear or semidefinite programming techniques, see, e.g., [14, 1]; however, conditions (ii) and (iii) end up creating a quadratically-constrained quadratic program for the coefficients of ww, whose associated feasibility condition involves a challenging non-convex constraint. These ideas, among various others for computing separating polynomials, were investigated as part of the present work; however, no sufficiently elegant or efficient approach was found.

  • Finally, a fourth and considerably simpler approach is to subdivide the constraint domain 𝒰\mathcal{U} into small enough cells for which the topology of Ω\Omega is easy to determine. Depending on the tessellation method, this approach can be quite effective (fast, simple to implement, high resolution power), but in some cases the subdivision process may need to be stopped, potentially leading to some components being incorrectly merged or broken.

The preceding discussion serves to highlight that, depending on the final application, one must ultimately decide on a suitable compromise between: (i) absolute correctness/certifiability of the connected component labeling algorithm, potentially needing costly arbitrary-precision computation (as exemplified by CAD-based methods); (ii) robustness, e.g., in suitably handling fixed-precision roundoff errors or uncertainty in the input polynomial coefficients; (iii) computational complexity of constructing the labeling function as well as its subsequent evaluation; and (iv) implementation simplicity. The target application of this work concerns the development of numerical methods for multi-physics simulations involving highly complex geometry such as liquid atomization dynamics; in particular, the input polynomial ϕ\phi represents the fluid interface and is computed dynamically. This application necessitates prioritizing aspects (ii), (iii), and (iv), which, in turn, necessitates a design choice for how to handle edge cases such as nearly touching components or interfacial self-intersections. Our choice here is to require that components are never broken, i.e., if x,yx,y are path-connected in Ω\Omega and χ(x),χ(y)\chi(x),\chi(y) is the output of the labeling algorithm, then χ(x)=χ(y)\chi(x)=\chi(y) is an absolute certainty. It follows that the labeling algorithm must be permitted to merge or “glue” together distinct components in the (presumably rare) cases of uncertain topology. The algorithm developed in this work glues components only if they are exactly or nearly touching, relative to a user-defined gap threshold, typically orders of magnitude smaller than the length scale of 𝒰\mathcal{U}.

With this design objective in mind, the connected component labeling algorithm developed here follows the fourth approach mentioned earlier: it operates by recursively subdividing the constraint hyperrectangle 𝒰d\mathcal{U}\subset\mathbb{R}^{d} into hyperrectangular subcells until the topology of Ω\Omega thereon is sufficiently simple or has reached a smallest-permitted size. This is achieved through a combination of quadtrees (in 2D) or octrees (in 3D), along with simple yet effective topology tests using properties of Bernstein polynomials. Once the tree is constructed, its leaf cells form the vertices of a graph whose edges are determined by sign attainability tests on the faces of adjacent leaf cells. The connected components of this graph ultimately decide the overall labeling: χ(x)\chi(x) equals the label of the leaf cell containing xx, the latter found by fast tree traversal methods. We show in this paper that the overall approach is efficient in handling various degrees of geometric complexity. In particular, if the recursive subdivision process terminates without reaching the user-defined smallest subcell size, then the labeling is provably correct. Applied to a fixed class of randomly generated geometry, the percentage of cases in which the algorithm is uncertain about the topology decreases exponentially as a function of the maximum tree depth; moreover, only a small fraction of these cases yield artificially-glued components.

An outline for the rest of the paper is as follows. In §​ 2, we establish some preliminaries on Bernstein polynomials and define the simply connected topology test. §​ 3 presents the main connected component labeling algorithm, followed by a discussion of its features. Numerical tests are presented in §​ 4, analyzing the algorithm’s success and failure rates on randomly generated multi-component geometry in 2D and 3D as well as its behavior on particular examples exhibiting cusps, self-intersections, and other kinds of singularities. Concluding remarks with a brief discussion of possible extensions are given in §​ 5.

2 Preliminaries

A key part of the connected component labeling algorithm is an effective means for evaluating the range of attainable values of a polynomial. One of the most accurate and straightforward methods for doing so is through the use of the Bernstein basis [8, 15, 10, 4]. These methods make use of a convex hull property and guarantee that a polynomial’s value, at any point in its rectangular reference domain, is no larger (or smaller) than its maximum (or minimum) coefficient in the Bernstein basis. Especially useful in the present setting, these bounds become monotonically more accurate under the stable operations of Bernstein subdivision. In addition, owing to the hyperrectangular constraint domain 𝒰\mathcal{U} and its subdivision into hyperrectangular subcells, it is particularly natural to use a tensor-product basis. Accordingly, a tensor-product Bernstein basis is adopted throughout this work.

In dd dimensions, let ϕ\phi be a tensor-product Bernstein polynomial of degree n=(n1,,nd)n=(n_{1},\ldots,n_{d}), defined relative to the hyperrectangle U=[α1,β1]××[αd,βd]U=[\alpha_{1},\beta_{1}]\times\cdots\times[\alpha_{d},\beta_{d}]; ϕ\phi takes the form

ϕ(x1,,xd)=i1=0n1id=0ndci1,,idbi1n1,[α1,β1](x1)bidnd,[αd,βd](xd)=incibin,U(x)\phi(x_{1},\ldots,x_{d})=\sum_{i_{1}=0}^{n_{1}}\cdots\sum_{i_{d}=0}^{n_{d}}c_{i_{1},\ldots,i_{d}}b_{i_{1}}^{n_{1},[\alpha_{1},\beta_{1}]}(x_{1})\cdots b_{i_{d}}^{n_{d},[\alpha_{d},\beta_{d}]}(x_{d})=\sum_{i\in{\mathbb{N}}^{n}}c_{i}b_{i}^{n,U}(x)

where

bη,[α,β](x)=(η)(βα)η(βx)η(xα),=0,,η,b_{\ell}^{\eta,[\alpha,\beta]}(x)=\frac{\binom{\eta}{\ell}}{(\beta-\alpha)^{\eta}}(\beta-x)^{\eta-\ell}(x-\alpha)^{\ell},\quad\ell=0,\ldots,\eta,

are the one-dimensional Bernstein basis functions of degree η\eta relative to the interval [α,β][\alpha,\beta], ci=ci1,,idc_{i}=c_{i_{1},\ldots,i_{d}} denotes the iith Bernstein coefficient of ϕ\phi for a multi-index in:=[0,n1]××[0,nd]i\in{\mathbb{N}}^{n}:=[0,n_{1}]\times\cdots\times[0,n_{d}], and bin,U(x)=j=1dbijnj,[αj,βj](xij)\smash{b_{i}^{n,U}}(x)=\smash{\prod_{j=1}^{d}}\smash{b_{i_{j}}^{n_{j},[\alpha_{j},\beta_{j}]}(x_{i_{j}})} denotes the product of basis functions. This representation gives the Bernstein coefficients of ϕ\phi relative to the given hyperrectangle UU; subdivision refers to transforming these coefficients relative to a subset hyperrectangle and can be stably computed using the de Casteljau algorithm [7].

In the connected component labeling algorithm, we need a method to assess whether the topology of U{ϕ>0}U\cap\{\phi>0\} and U{ϕ<0}U\cap\{\phi<0\} is simple enough. This is achieved through the following three concepts.111Definition 2.1 has, in various guises, appeared before in the literature; the remaining two concepts (Definitions 2.2 and 2.3, along with their implications) are perhaps new, and target specifically the objectives of the present work.

Definition 2.1.

A Bernstein polynomial ϕ=incibin,U(x)\phi=\sum_{i\in{\mathbb{N}}^{n}}c_{i}b_{i}^{n,U}(x) is called coefficient monotone increasing on UU (in the direction kk) if there exists a coordinate direction kk such that cici+ekc_{i}\leq c_{i+e_{k}} for all ini\in{\mathbb{N}}^{n}, ik<nki_{k}<n_{k}; here, eke_{k} denotes the standard basis vector in the direction of the kkth coordinate. The polynomial is called coefficient monotone if ϕ\phi or ϕ-\phi is coefficient monotone increasing.

An equivalent viewpoint comes from a convenient property of differentiation in the Bernstein basis: up to a multiplicative factor, the coefficients of the derivative are formed via first-order divided differences of the input polynomial’s coefficients. Therefore, ϕ\phi is coefficient monotone if and only if the Bernstein coefficients of its derivative in the corresponding direction are either all non-negative or all non-positive. It follows that coefficient monotone polynomials are monotone in the conventional sense and therefore attain their extrema on the corresponding faces of the hyperrectangle. Tests of Bernstein coefficient monotonicity have a variety of applications, including, e.g., in polynomial range evaluation [16]. In the present application, an important property is that, for a coefficient monotone polynomial ϕ\phi, for any xU{ϕ>0}x\in U\cap\{\phi>0\}, xx can be path-connected to one of the corresponding faces of the hyperrectangle without leaving the region U{ϕ>0}U\cap\{\phi>0\}, and analogously for points in U{ϕ<0}U\cap\{\phi<0\}.

We next establish a sufficiently accurate means to determine whether a polynomial attains positive or negative values on a given hyperrectangle. The following definition sets the requirements on an algorithm implementing this task.

Definition 2.2.

Given a Bernstein polynomial ϕ=incibin,U(x)\phi=\sum_{i\in{\mathbb{N}}^{n}}c_{i}b_{i}^{n,U}(x) on the hyperrectangle UU, a dd-dimensional sign evaluation algorithm σd(ϕ,U)\sigma_{d}(\phi,U) outputs the possible signs of ϕ\phi on UU and must satisfy the following properties:

  1. (i)

    σd(ϕ,U){1,0,+1}\sigma_{d}(\phi,U)\subseteq\{-1,0,+1\};

  2. (ii)

    σd(ϕ,U)\sigma_{d}(\phi,U) must be a superset of the ground-truth, i.e., {sign(ϕ(x)):xU}σd(ϕ,U)\{\operatorname{sign}(\phi(x)):x\in U\}\subseteq\sigma_{d}(\phi,U), where sign(u)\operatorname{sign}(u) is the standard sign operator,

    sign(u)={+1,if u>0,0,if u=0,1,if u<0.\operatorname{sign}(u)=\left\{\begin{array}[]{rl}+1,&\text{if $u>0$,}\\ 0,&\text{if $u=0$,}\\ -1,&\text{if $u<0$.}\end{array}\right.
  3. (iii)

    if ci0c_{i}\geq 0 for all ii, then 1σd(ϕ,U)-1\notin\sigma_{d}(\phi,U); if ci0c_{i}\leq 0 for all ii, then +1σd(ϕ,U)+1\notin\sigma_{d}(\phi,U);

  4. (iv)

    if ϕ\phi is coefficient monotone in the direction kk, then σd(ϕ,U)σd1(ϕα,Uα)0σd1(ϕβ,Uβ)\sigma_{d}(\phi,U)\subseteq\sigma_{d-1}(\phi_{\alpha},U_{\alpha})\cup_{0}\sigma_{d-1}(\phi_{\beta},U_{\beta}) must hold, where ϕα\phi_{\alpha} and ϕβ\phi_{\beta} denote the restriction of ϕ\phi to the corresponding lower UαU_{\alpha} and upper UβU_{\beta} faces of UU, and σd1\sigma_{d-1} is a (d1)(d-1)-dimensional sign evaluation algorithm;222Here, 0\cup_{0} denotes the union operation which includes zero when appropriate: if s1s2={1,+1}s_{1}\cup s_{2}=\{-1,+1\}, then s10s2:={1,0,+1}s_{1}\cup_{0}s_{2}:=\{-1,0,+1\}; if s1s2{1,+1}s_{1}\cup s_{2}\neq\{-1,+1\}, then s10s2:=s1s2s_{1}\cup_{0}s_{2}:=s_{1}\cup s_{2}.

  5. (v)

    finally, a zero-dimensional sign evaluation algorithm must satisfy σ0(c,U)=sign(c)\sigma_{0}(c,U)=\operatorname{sign}(c).

Note that a method which exactly computes the minimum and maximum value of ϕ\phi on UU would trivially yield a sign evaluation algorithm. However, computing the extrema of arbitrary polynomials is a non-trivial and computationally expensive task, perhaps as difficult as the connected component labeling problem itself. The conditions of a sign evaluation algorithm are weaker than an exact computation, thereby allowing for simpler and more efficient algorithms; nevertheless, some degree of certitude is required. In particular, condition (iii) requires an implementation be at least as accurate as conventional Bernstein range evaluation; this ensures it inherits at least the same level of accuracy under the action of subdivision. Condition (iv) says that if ϕ\phi happens to be coefficient monotone, the sign evaluation must be at least as accurate as what can be determined solely from the corresponding faces where ϕ\phi’s extrema are known to occur. One particularly important property is that if a sign evaluation algorithm outputs either {1}\{-1\} or {+1}\{+1\}, then we may conclude, with absolute certainty, ϕ\phi is uniformly signed throughout UU. An example of a simple sign evaluation algorithm is given in the next section.

Finally, a notion of whether the topology of U{ϕ=0}U\setminus\{\phi=0\} is “simple enough” is recursively defined as follows. This definition implicitly depends on the presence of a suitable sign evaluation algorithm σd\sigma_{d}, so it is assumed that one has been implemented and fixed ahead of time.

Definition 2.3.

A (d1)(d\geq 1)-dimensional Bernstein polynomial ϕ=incibin,U(x)\phi=\sum_{i\in{\mathbb{N}}^{n}}c_{i}b_{i}^{n,U}(x) is called simply connected on UU if at least one of the following three conditions hold: (i) σd(ϕ,U)={+1}\sigma_{d}(\phi,U)=\{+1\}; or (ii) σd(ϕ,U)={1}\sigma_{d}(\phi,U)=\{-1\}; or (iii) there is coordinate direction kk such that (a) ϕ\phi is coefficient monotone in the direction kk and (b) ϕ\phi is simply connected on the corresponding upper and lower faces of UU. By definition, a zero-dimensional Bernstein polynomial is simply connected.

A simply connected polynomial guarantees a simple topology of both Ω:=U{ϕ<0}\Omega^{-}:=U\cap\{\phi<0\} and Ω+:=U{ϕ>0}\Omega^{+}:=U\cap\{\phi>0\}, as follows. If conditions (i) or (ii) in Definition 2.3 are met, then ϕ\phi is non-zero throughout the hyperrectangle and so exactly one of Ω±\Omega^{\pm} is empty and the other the whole hyperrectangle. Otherwise, there is a coordinate axis kk such that any point in xΩ±x\in\Omega^{\pm} can be path-connected to one of the corresponding faces. These faces are themselves simply connected; inductively it follows that if Ω+\Omega^{+} (resp., Ω\Omega^{-}) is nonempty, then Ω+\Omega^{+} (resp., Ω\Omega^{-}) has exactly one connected component.333In fact, the sets Ω+\Omega^{+} or Ω\Omega^{-} are simply connected in the conventional topological sense, but it is unnecessary to prove this fact in the present work.

To prove the correctness of the connected component labeling algorithm, it is useful to establish the following two properties of simply connected polynomials:

  • The first property is that if ϕ\phi is simply connected on a hyperrectangle UU, then ϕ\phi is simply connected on every face of UU. This can be shown via induction on the dimension; we illustrate here with a three-dimensional example. Suppose ϕ\phi is simply connected on a rectangular 3D prism. If σd(ϕ){{1},{+1}}\sigma_{d}(\phi)\in\bigl{\{}\{-1\},\{+1\}\bigr{\}}, then it is uniformly signed and thus trivially simply connected on every face of the prism. Otherwise, ϕ\phi is coefficient monotone in the up direction, say, such that the restriction of ϕ\phi to the bottom and top faces is simply connected; in particular, by induction, we have that ϕ\phi is simply connected on every edge of the bottom and top face. Now, on each vertical face of the prism (those faces excluding the bottom and top), the restriction of ϕ\phi is coefficient monotone in the up direction, and, as was just concluded, ϕ\phi is simply connected on the corresponding lower and upper edges. Therefore, ϕ\phi is simply connected on every vertical face. The preceding arguments can be extended to any dimension, and the one-dimensional base case of the inductive argument trivially holds.

  • The second property is that if ϕ\phi is simply connected on a hyperrectangle UU, then a sign evaluation algorithm yields exact results, i.e., the output of σd(ϕ,U)\sigma_{d}(\phi,U) is precisely {sign(ϕ(x)):xU}\{\operatorname{sign}(\phi(x)):x\in U\}. Similar to before, this can be shown via induction. Suppose ϕ\phi is simply connected on UU. If σd(ϕ,U){{1},{+1}}\sigma_{d}(\phi,U)\in\smash{\bigl{\{}\{-1\},\{+1\}\bigr{\}}}, then ϕ\phi is with absolute certainty uniformly signed and σd\sigma_{d} is exact. Otherwise, if ϕ\phi is coefficient monotone on UU in the up direction, say, then σd(ϕ,U)\sigma_{d}(\phi,U) is at least as accurate as σd1\sigma_{d-1} applied to the restriction of ϕ\phi on the lower and upper faces of UU; by induction, the latter calculation is exact, and therefore σd\sigma_{d} is, too.

Combining these two properties, we observe that if ϕ\phi is simply connected on a hyperrectangle UU, then a sign evaluation algorithm yields exact results on every face of UU.

3 Connected Component Labeling Algorithm

Algorithm 1 Given a dd-dimensional, degree n=(n1,,nd)n=(n_{1},\ldots,n_{d}) Bernstein polynomial ϕ=incibin,U(x)\phi=\sum_{i\in{\mathbb{N}}^{n}}c_{i}b_{i}^{n,U}(x) and a corresponding hyperrectangular domain U=[α1,β1]××[αd,βd]U=[\alpha_{1},\beta_{1}]\times\cdots\times[\alpha_{d},\beta_{d}], evaluate range(ϕ,U)\operatorname{range}(\phi,U).
1:if d=1d=1 and n14n_{1}\leq 4 then
2:    Compute θ:=minα1xβ1ϕ(x)\displaystyle\theta:=\min_{\alpha_{1}\leq x\leq\beta_{1}}\phi(x) and Θ:=maxα1xβ1ϕ(x)\displaystyle\Theta:=\max_{\alpha_{1}\leq x\leq\beta_{1}}\phi(x).
3:else
4:    Set θ:=minici\theta:=\min_{i}c_{i} and Θ:=maxici\Theta:=\max_{i}c_{i}.
5:    if d>1d>1 then
6:        for k=1,2,,dk=1,2,\ldots,d do
7:           if ϕ\phi is coefficient monotone in direction kk on UU then
8:               Compute R:=range(ϕ|xk=αk,lower face of U)range(ϕ|xk=βk,upper face of U)R:=\operatorname{range}(\phi|_{x_{k}=\alpha_{k}},\text{lower face of $U$})\cup\operatorname{range}(\phi|_{x_{k}=\beta_{k}},\text{upper face of $U$}).
9:               Update θmax{θ,infR}\theta\leftarrow\max\{\theta,\inf R\}.
10:               Update Θmin{Θ,supR}\Theta\leftarrow\min\{\Theta,\sup R\}.                        
11:return [θ,Θ][\theta,\Theta].

With these preliminaries established, we are now in a position to describe the connected component labeling algorithm. First, we describe a sufficiently-accurate Bernstein polynomial range evaluation algorithm, which approximately (and in various cases, exactly) evaluates the range of ϕ\phi on a given hyperrectangle UU. Our particular implementation is given in Algorithm 1 and has the following properties:

  • The output is an interval such that [infxUϕ(x),supxUϕ(x)]range(ϕ,U)[\inf_{x\in U}\phi(x),\sup_{x\in U}\phi(x)]\subseteq\operatorname{range}(\phi,U) always holds.

  • The algorithm is at least as accurate as conventional Bernstein range evaluation, i.e., range(ϕ,U)[minici,maxici]\operatorname{range}(\phi,U)\subseteq[\min_{i}c_{i},\max_{i}c_{i}], the latter interval bounding the minimum and maximum value of the input’s Bernstein coefficients.

  • For the simple cases of linear, quadratic, or cubic polynomials in d=1d=1 dimensions, it is a particularly simple and efficient task to exactly evaluate the range of ϕ\phi; this is implemented on line 2 and is an easy tweak benefiting overall accuracy.

  • If ϕ\phi happens to be coefficient monotone on UU, then the range evaluation algorithm is as least as accurate as what can be determined by evaluating the range on the corresponding faces of UU where the extrema of ϕ\phi is known to occur (line 8). Further, if ϕ\phi happens to be coefficient monotone in multiple directions, then the tightest possible range is returned (lines 910).

The main purpose of the range evaluation algorithm is to implement a sign evaluation algorithm. In particular, we define

σd(ϕ,U):={sign(r):rrange(ϕ,U)}.\sigma_{d}(\phi,U):=\bigl{\{}\operatorname{sign}(r):r\in\operatorname{range}(\phi,U)\bigr{\}}.

In other words, σd(ϕ)\sigma_{d}(\phi) is defined by the possible signs of ϕ\phi as computed by the range evaluation algorithm. By construction, it satisfies all the required properties set out in Definition 2.2. In particular, it is exact for low-degree one-dimensional polynomials, whenever 0range(ϕ)0\notin\operatorname{range}(\phi), as well as various other cases. Using the sign evaluation algorithm, it is also straightforward to implement an algorithm that tests whether ϕ\phi is simply connected on a given hyperrectangle UU; our implementation carries out the algorithmic process naturally suggested by Definition 2.3.

With these ingredients, we can now describe the connected component labeling algorithm. Given a Bernstein polynomial ϕ\phi and a hyperrectangular constraint domain 𝒰\mathcal{U}, we first test whether ϕ\phi is simply connected on 𝒰\mathcal{U}. If it is, then Ω+:=𝒰{ϕ>0}\Omega^{+}:=\mathcal{U}\cap\{\phi>0\} and Ω:=𝒰{ϕ<0}\Omega^{-}:=\mathcal{U}\cap\{\phi<0\} are either empty or have exactly one component, in which case the labeling problem is trivial. Otherwise, we recursively subdivide 𝒰\mathcal{U} into smaller and smaller subcells until either ϕ\phi is simply connected on the subcell or a maximum recursion depth is met. In this work, we have opted for a quadtree/octree-style subdivision wherein the subcells maintain the aspect ratio of 𝒰\mathcal{U}, though other subdivision algorithms are certainly possible. Upon conclusion of the subdivision process, we have a grid in which most (and in many cases, all) leaf cells have simple topology. In the second phase of the algorithm, two graphs 𝒢±{\mathcal{G}}^{\pm} are constructed, each representing the connectivity of the cells overlapping Ω+\Omega^{+} and Ω\Omega^{-}. For example, if σd1\sigma_{d-1} applied to a shared face =UiUj\mathcal{F}=U_{i}\cap U_{j} of the grid yields s{1,0,+1}s\subseteq\{-1,0,+1\} and 1s-1\in s (resp., +1s+1\in s), then an edge (Ui,Uj)(U_{i},U_{j}) is inserted into 𝒢{\mathcal{G}}^{-} (resp., 𝒢+{\mathcal{G}}^{+}). These edges reflect the fact that there is likely (and in most cases, definitely) a path connecting UiΩ±U_{i}\cap\Omega^{\pm} and UjΩ±U_{j}\cap\Omega^{\pm}. Ultimately, it is the connected components of 𝒢{\mathcal{G}}^{-} and 𝒢+{\mathcal{G}}^{+} which define the possible labels of the overall algorithm; computing this labeling can be done with negligible cost via efficient disjoint-set/merge-find data structures. In essence, these graphs establish the globally topology of Ω\Omega, whereas each leaf cell of the tree handles the local topology. Finally, to determine which component an arbitrary point xΩx\in\Omega belongs, we apply standard (and fast) tree traversal algorithms to find a leaf cell containing xx, and then adopt its corresponding label from 𝒢+{\mathcal{G}}^{+} (if ϕ(x)>0\phi(x)>0) or 𝒢{\mathcal{G}}^{-} (if ϕ(x)<0\phi(x)<0).

Algorithm 2 Construction phase of the connected component labeling algorithm: given the polynomial ϕ\phi and the hyperrectangular constraint domain 𝒰\mathcal{U}, build the subdivision tree and create labels for its path-connected leaf cells.
1: Construct the subdivision tree:
2:Create the root node νroot\nu_{\text{root}} representing 𝒰\mathcal{U}.
3:call visit(νroot\nu_{\text{root}}, 𝒰\mathcal{U}, 0).
4:procedure visit(ν\nu, UU, \ell)
5:    if ϕ\phi is simply connected on UU then
6:        Designate ν\nu as a simply connected leaf node.
7:    else if Λ\ell\geq\Lambda then
8:        Designate ν\nu as a non-simply connected leaf node.
9:    else
10:        Designate ν\nu as a non-leaf node, pointing to the following child nodes.
11:        for i=1,,2di=1,\ldots,2^{d} do
12:           Let UiU_{i} denote the iith subcell of UU.
13:           Insert a new node representing UiU_{i} into the tree; call it νi\nu_{i}.
14:           call visit(νi\nu_{i}, UiU_{i}, +1\ell+1).             
15: Build subcell connectivity:
16:Initialize 𝒢+{\mathcal{G}}^{+} and 𝒢{\mathcal{G}}^{-} as empty graphs.
17:for every pair of leaf cells UiU_{i}, UjU_{j} sharing a face do
18:    Let :=UiUj{\cal F}:=U_{i}\cap U_{j} denote the shared face.
19:    Compute s:=σd1(ϕ|,)s:=\sigma_{d-1}(\phi|_{\mathcal{F}},\mathcal{F}).
20:    if 1s-1\in s then
21:        Insert the edge (Ui,Uj)(U_{i},U_{j}) into 𝒢{\mathcal{G}}^{-}.     
22:    if +1s+1\in s then
23:        Insert the edge (Ui,Uj)(U_{i},U_{j}) into 𝒢+{\mathcal{G}}^{+}.     
24: Generate connected component labels:
25:Create a unique integer label for the combined set of connected components of 𝒢{\mathcal{G}}^{-} and 𝒢+{\mathcal{G}}^{+}.
26:Initialize every leaf cell’s \mathchoice{\mathbin{\ooalign{$\displaystyle-$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle-$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle-$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle-$\crcr$\scriptscriptstyle\bigcirc$}}} and +\mathchoice{\mathbin{\ooalign{$\displaystyle+$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle+$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle+$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle+$\crcr$\scriptscriptstyle\bigcirc$}}} labels as undefined.
27:for every leaf cell UU do
28:    if UU is a vertex of 𝒢{\mathcal{G}}^{-} then
29:        Set the leaf cell’s \mathchoice{\mathbin{\ooalign{$\displaystyle-$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle-$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle-$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle-$\crcr$\scriptscriptstyle\bigcirc$}}} label to the corresponding label of 𝒢{\mathcal{G}}^{-}.     
30:    if UU is a vertex of 𝒢+{\mathcal{G}}^{+} then
31:        Set the leaf cell’s +\mathchoice{\mathbin{\ooalign{$\displaystyle+$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle+$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle+$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle+$\crcr$\scriptscriptstyle\bigcirc$}}} label to the corresponding label of 𝒢+{\mathcal{G}}^{+}.     
Algorithm 3 Connected component label evaluation algorithm: using the subdivision tree constructed by Algorithm 2, evaluate χ:Ω\chi:\Omega\to\mathbb{N} for a given point xΩ:=𝒰{ϕ=0}x\in\Omega:={\mathcal{U}}\setminus\{\phi=0\}.
1:Using standard, fast tree traversal methods, find a leaf cell UU containing xx; if xx belongs to multiple leaf cells it does not matter which is used.
2:Let \mathchoice{\mathbin{\ooalign{$\displaystyle\cdot$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle\cdot$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle\cdot$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle\cdot$\crcr$\scriptscriptstyle\bigcirc$}}} denote \mathchoice{\mathbin{\ooalign{$\displaystyle-$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle-$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle-$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle-$\crcr$\scriptscriptstyle\bigcirc$}}} if ϕ(x)<0\phi(x)<0 or +\mathchoice{\mathbin{\ooalign{$\displaystyle+$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle+$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle+$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle+$\crcr$\scriptscriptstyle\bigcirc$}}} if ϕ(x)>0\phi(x)>0.
3:if UU’s \mathchoice{\mathbin{\ooalign{$\displaystyle\cdot$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle\cdot$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle\cdot$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle\cdot$\crcr$\scriptscriptstyle\bigcirc$}}} label is undefined then
4:    Create a new, unique label and assign it to UU.
5:return the \mathchoice{\mathbin{\ooalign{$\displaystyle\cdot$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle\cdot$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle\cdot$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle\cdot$\crcr$\scriptscriptstyle\bigcirc$}}} label assigned to UU.

Algorithm 2 and Algorithm 3 sketch the implementation of the connected component labeling algorithm. A few complementary remarks are in order:

  • In Algorithm 2, whenever the restriction of ϕ\phi to a subcell UU or its face is required, the corresponding Bernstein coefficients can be efficiently and accurately computed via the de Casteljau algorithm.

  • On line 7, Λ\Lambda specifies the user-defined maximum recursion depth. For example, if the constraint domain is the unit cube [0,1]d[0,1]^{d}, the parameter corresponds to the smallest-possible subcell having width 2Λ2^{-\Lambda}. The effect of Λ\Lambda on the overall labeling is studied in detail in the next section.

  • Algorithm 2 only assigns labels to cells that are path-connected to other cells; as such, if a cell contains a part of Ω+\Omega^{+}, say, that has not been path-connected to any other cell overlapping with Ω+\Omega^{+}, then that cell does not (yet) have a +\mathchoice{\mathbin{\ooalign{$\displaystyle+$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle+$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle+$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle+$\crcr$\scriptscriptstyle\bigcirc$}}} label defined. Our main motivation for this mainly concerns the scenarios for which the smallest-possible cell size is reached whereon the topology of Ω+\Omega^{+} is still uncertain: in these cases, we do not want to unnecessarily create a new label when Ω+\Omega^{+} thereon may actually be empty. In a sense, if a sign evaluation algorithm is unable to certify the presence of Ω+\Omega^{+}, then doing so will be up to the subsequent evaluation of χ\chi; indeed, Algorithm 3 creates a label only when a point xΩ+x\in\Omega^{+} is given which proves the nonemptiness of Ω+\Omega^{+} in that cell.

  • By caching the results of the simply connected tests on the leaf cells, as computed by the first phase of Algorithm 2, various steps in the second and last phase can be accelerated. For example, it is unnecessary to compute s:=σd1(ϕ|,)s:=\sigma_{d-1}(\phi|_{\mathcal{F}},\mathcal{F}) on a shared face \mathcal{F} if it is already known that ϕ\phi is uniformly signed on one of the corresponding cells. On a related note, if a path connection already exists between UiU_{i} and UjU_{j}, it is unnecessary to execute lines 1823 because these lines would not alter that connectivity; thus, some faces can be skipped during the second phase of the algorithm.

  • Our particular implementation of the subdivision tree represents each node by a bit-field of width 16 bits. One bit specifies whether the node is a leaf or non-leaf node; the remaining 15 bits of a non-leaf node points to the children of that node; the remaining 15 bits of a leaf node store the type of the leaf cell (2 bits, flagging whether it is uniformly negative, uniformly positive, mixed sign and simply connected, or not simply connected) and 6 bits each to cache the +\mathchoice{\mathbin{\ooalign{$\displaystyle+$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle+$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle+$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle+$\crcr$\scriptscriptstyle\bigcirc$}}} and \mathchoice{\mathbin{\ooalign{$\displaystyle-$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle-$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle-$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle-$\crcr$\scriptscriptstyle\bigcirc$}}} labels created by Algorithms 2 and 3. This implies an upper bound of 261=632^{6}-1=63 components for each phase, ample for our purposes. In addition, our implementation of Algorithm 2 bypasses the explicit creation of the graphs 𝒢±{\mathcal{G}}^{\pm}; instead, an efficient disjoint-set/merge-find data structure is used to represent the edge formations and subsequent label creation and lookup.

  • Finally, our implementation also makes use of a fuzzy threshold to robustly handle roundoff error in fixed-precision arithmetic. A tolerance ϵ\epsilon is incorporated into the sign evaluation algorithm as follows: if R=range(ϕ,U)R=\operatorname{range}(\phi,U), then σd(ϕ,U)={+1}\sigma_{d}(\phi,U)=\{+1\} iff infRϵ\inf R\geq\epsilon, σd(ϕ,U)={1}\sigma_{d}(\phi,U)=\{-1\} iff supRϵ\sup R\leq-\epsilon, and σd(ϕ)={1,0,+1}\sigma_{d}(\phi)=\{-1,0,+1\} in all other cases. This tolerance helps to prevent inconsistencies arising from roundoff error, such as what might occur when the zero level set {ϕ=0}\{\phi=0\} grazes the boundary of a cell. Here, the tolerance is chosen to scale relative to the Bernstein coefficients of ϕ\phi on 𝒰\mathcal{U}; in particular, we have set ϵ=103ϵ0maxi|ci|\epsilon=10^{3}\,\epsilon_{0}\max_{i}|c_{i}|, where ϵ0\epsilon_{0} is machine epsilon (approximately 101610^{-16} in double-precision arithmetic).

To conclude this section, we prove the correctness of the connected component labeling algorithm in the case the subdivision process finishes without hitting the maximum-imposed recursion depth. In other words, every leaf cell of the tree satisfies the simply connected property. In the prior section we saw that the sign evaluation algorithm is exact in this case, and so two adjacent cells UiU_{i} and UjU_{j} are +\mathchoice{\mathbin{\ooalign{$\displaystyle+$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle+$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle+$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle+$\crcr$\scriptscriptstyle\bigcirc$}}}-connected if and only if (UiUj){ϕ>0}(U_{i}\cap U_{j})\cap\{\phi>0\} is nonempty. It follows that if x,yΩ+:=𝒰{ϕ>0}x,y\in\Omega^{+}:={\mathcal{U}}\cap\{\phi>0\} are path connected, then a corresponding path would go through zero or more faces overlapping with Ω+\Omega^{+}, all of which must have been assigned the same label by Algorithm 2. Conversely, if for two points x,yΩ+x,y\in\Omega^{+} the evaluation of Algorithm 3 yields χ(x)=χ(y)\chi(x)=\chi(y), then there must be a sequence of leaf cells U1,U2,,UmU_{1},U_{2},\ldots,U_{m} with xU1x\in U_{1} and yUmy\in U_{m} such that, for each pair, (UiUi+1){ϕ>0}(U_{i}\cap U_{i+1})\cap\{\phi>0\} is nonempty. Therefore, there is a sequence of points zi(UiUi+1){ϕ>0}z_{i}\in(U_{i}\cap U_{i+1})\cap\{\phi>0\}, and, because each UiΩ+U_{i}\cap\Omega^{+} has exactly one component, there exists a path connecting xz1z2zm1yx\rightarrow z_{1}\rightarrow z_{2}\rightarrow\cdots\rightarrow z_{m-1}\rightarrow y, and so xx and yy are path connected. The same arguments apply analogously to the negative region, and it is trivial to see that χ(x)χ(y)\chi(x)\neq\chi(y) whenever ϕ(x)ϕ(y)<0\phi(x)\phi(y)<0, since the negative and positive regions do not have overlapping label identifiers. In summary, when all the leaf cells satisfy the simply connected property, the output of Algorithm 3 is exact in the sense that χ(x)=χ(y)\chi(x)=\chi(y) if and only if xx and yy are path connected. In fact, we will observe in the next section that in many instances, the labeling algorithm continues to be exact even when the maximum recursion depth is reached.

4 Numerical Experiments

In this section, we assess the effectiveness of the connected component labeling algorithm on randomly generated geometry in 2D and 3D as well as on particular examples exhibiting cusps, self-intersections, and other kinds of singularities.

4.1 Randomly generated geometry

We consider here a class of randomly generated polynomials whose corresponding implicitly defined geometry exhibits a variety of characteristics such as high-curvature pieces and almost-touching components. The random polynomials are defined through the orthonormal Legendre polynomials, as follows. Let pip_{i}, i=0,1,2,3,i=0,1,2,3, denote the first four univariate Legendre polynomials relative to the interval [1,1][-1,1], i.e.,

p0(x)=1,p1(x)=x,p2(x)=12(3x21),p3(x)=12(5x33x).p_{0}(x)=1,\quad p_{1}(x)=x,\quad p_{2}(x)=\tfrac{1}{2}(3x^{2}-1),\quad p_{3}(x)=\tfrac{1}{2}(5x^{3}-3x).

In dd dimensions, define a tensor-product degree (3,,3)(3,\ldots,3) polynomial ϕ:[1,1]d\phi:[-1,1]^{d}\to\mathbb{R} as follows:

(1) ϕ(x)=i{0,1,2,3}dciωi=1dpi(x),\phi(x)=\sum_{i\in\{0,1,2,3\}^{d}}c_{i}\,\omega^{i}\prod_{\ell=1}^{d}p_{i_{\ell}}(x_{\ell}),

where cic_{i}\in\mathbb{R} is a given set of 4d4^{d} coefficients, one for each multi-index i{0,1,2,3}di\in\{0,1,2,3\}^{d}, and ω>0\omega>0 is a given (fixed) parameter such that ωi:==1dωi\omega^{i}:=\smash{\prod_{\ell=1}^{d}\omega^{i_{\ell}}}. With this set up, the class of randomly generated geometry is defined by polynomials of the form (1) such that the coefficients cic_{i} are randomly and independently drawn from the uniform distribution on [1,1][-1,1]. Depending on the value of ω\omega, the factor ωi\omega^{i} controls the smoothness of the polynomial ϕ\phi by damping (or magnifying) the higher-order oscillatory modes of the Legendre basis. In particular, we have set ω=0.5\omega=0.5, empirically chosen to give a reasonable spread between relatively mild to more complex geometry involving multiple components of high degree curvature; examples are given in the next set of figures.

As discussed in the prior section, the connected component labeling algorithm is exact whenever the subdivision process finishes without reaching the maximum recursion depth Λ\Lambda. Consequently, we expect that as Λ\Lambda increases, the percentage of cases in which the algorithm is certifiably correct will also increase. To quantify this, we generate a fixed number of random polynomials of the form (1), convert them to the Bernstein basis, and invoke Algorithm 2 on each instance on the reference domain 𝒰=[1,1]d\mathcal{U}=[-1,1]^{d}. Three possible outcomes are tallied by the following quantities:

  • Let ρ=\rho_{\leavevmode\resizebox{0.75pt}{0.0pt}{\,=}} denote the percentage of cases in which line 8 is not executed, i.e., every leaf cell is simply connected so that the labeling is certifiably exact.

  • Let ρ\rho_{\checkmark} denote the percentage of cases in which line 8 is executed (i.e., the algorithm is uncertain about the topology), yet the labeling produced is exact.

  • Let ρ×=100%(ρ=)(ρ)\rho_{\times}=100\%-(\rho_{\leavevmode\resizebox{0.75pt}{0.0pt}{\,=}})-(\rho_{\checkmark}) denote the percentage of remaining cases, i.e., those in which two or more components have been artificially-glued.

Note that the last two scenarios require a notion of the ground truth so as to determine whether the labeling is exact or not; to that end, a reference solution for each of the randomly generated cases is computed by applying the base algorithm with unlimited recursion depth.444In every computed instance, the ground truth solution terminated in finite time.

Refer to caption
Refer to caption
Figure 1: Percentage breakdown analysis corresponding to the three possible outcomes of the connected component labeling algorithm as it applied to the randomly generated geometry considered in §​ 4.1. As a function of the maximum recursion depth Λ\Lambda, the quantities indicate the percentage of cases in which the algorithm: (i) yields certifiably correct labeling (ρ=\rho_{=}); (ii) is uncertain about the topology but produces correct labeling (ρ\rho_{\checkmark}); and (iii) is uncertain about the topology such that two or more nearly-touching components are glued (ρ×\rho_{\times}); in particular, take note of the quasi-logarithmic scale.

Fig.​ 1 plots ρ=\rho_{\leavevmode\resizebox{0.75pt}{0.0pt}{\,=}}, ρ\rho_{\checkmark}, and ρ×\rho_{\times} as a function Λ\Lambda.555The data shown in Fig.​ 1 was collected from ten million randomly generated cases in 2D and one million random examples in 3D, more than enough to stabilize the results. On this class of randomly generated geometry, it is rare for the topology to be fully resolved with just one or two levels of subdivision (e.g., ρ=\rho_{\leavevmode\resizebox{0.75pt}{0.0pt}{\,=}} is less than 10% when Λ1\Lambda\leq 1); however, after a few more levels of subdivision, ρ=\rho_{\leavevmode\resizebox{0.75pt}{0.0pt}{\,=}} increases to around 99% when Λ=6\Lambda=6 in 2D, and around 90% when Λ=7\Lambda=7 in 3D. In general, we observe that as Λ\Lambda increases, the number of cases in which the labeling algorithm is certifiably correct increases exponentially. Furthermore, note that ρ×\rho_{\times} decays to zero at about the same exponential speed as ρ\rho_{\checkmark}, but is a significantly small fraction thereof; as such, even when the algorithm reaches the maximum recursion depth and is consequently uncertain about the topology, the labeling continues to be exact in more cases than not. We also note that the failure rate ρ×\rho_{\times} is somewhat similar in 2D and 3D, but the rate of certitude (ρ=\rho_{\leavevmode\resizebox{0.75pt}{0.0pt}{\,=}}) in 3D is smaller than it is in the 2D setting. This can be attributed to a number of aspects, chief among which the 3D geometry is substantially more complex. Another major reason concerns the Bernstein polynomial range evaluation, which is generally less accurate in 3D versus 2D; consequently, a few more levels of subdivision are needed in order to pass the simply connected polynomial tests, and intuitively one may expect higher dimensions to require additional refinement power.

Refer to caption

Δ=1\Delta=1Δ=2\Delta=2Δ=3\Delta=3Δ=4\Delta=4Δ=5\Delta=5Δ=6\Delta=6

Figure 2: Examples of randomly generated geometry for which the connected component labeling algorithm yields certifiably correct results. The depth of the subdivision quadtree, Δ\Delta, is indicated in each column, e.g., the far-left examples require just one subdivision step to be certain of the topology. Each example applies its own coloring scheme to illustrate the labeling output of the algorithm.
Refer to caption

Λ=1\Lambda=1Λ=2\Lambda=2Λ=3\Lambda=3Λ=4\Lambda=4Λ=5\Lambda=5Λ=6\Lambda=6

Figure 3: Examples of randomly generated geometry for which the connected component labeling algorithm reaches the maximum-imposed recursion depth Λ\Lambda, is uncertain about the topology, yet correctly labels distinct components. Each example applies its own coloring scheme to illustrate the labeling output of the algorithm, while the shaded regions indicate the quadtree cells upon which the input polynomial is not simply connected, according to Definition 2.3.
Refer to caption

Λ=1\Lambda=1Λ=2\Lambda=2Λ=3\Lambda=3Λ=4\Lambda=4Λ=5\Lambda=5Λ=6\Lambda=6

Figure 4: Examples of randomly generated geometry for which the connected component labeling algorithm reaches the maximum-imposed recursion depth Λ\Lambda, is uncertain about the topology, and glues distinct components. Each example applies its own coloring scheme to illustrate the labeling of the algorithm; in particular, glued components have the same color. The shaded regions indicate the quadtree cells upon which the input polynomial is not simply connected, according to Definition 2.3; some of these cells are responsible for bridging the gap between glued components.

Figs.​ 2, 3 and 4 provide some 2D examples demonstrating the above three scenarios. Each example illustrates the quadtree and the colors have a one-to-one correspondence with computed labeling; shaded cells correspond to those whereon ϕ\phi is not simply connected, per the requirements of Definition 2.3. Fig.​ 2 shows some examples in which the algorithm is certain about the topology, i.e., every cell is simply connected; observe that, as expected, progressively more complex geometry can be resolved as further subdivision steps are permitted. Fig.​ 3 demonstrates cases in which the maximum recursion depth is reached (hence some cells are shaded), yet the labeling is still exact; these instances arise whenever a shaded cell UU contains at most one component each of U{ϕ>0}U\cap\{\phi>0\} and U{ϕ<0}U\cap\{\phi<0\}, but for some reason the simply connected test did not hold. Often, this occurs when a saddle point or extrema is nearby the zero level set of ϕ\phi. Meanwhile, Fig.​ 4 demonstrates cases in which the subdivision is, in a sense, forced to stop early, leaving some components glued together. This scenario occurs whenever: (i) a shaded cell contains multiple components; and/or (ii) a shaded cell is +\mathchoice{\mathbin{\ooalign{$\displaystyle+$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle+$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle+$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle+$\crcr$\scriptscriptstyle\bigcirc$}}}-connected or \mathchoice{\mathbin{\ooalign{$\displaystyle-$\crcr$\displaystyle\bigcirc$}}}{\mathbin{\ooalign{$\textstyle-$\crcr$\textstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptstyle-$\crcr$\scriptstyle\bigcirc$}}}{\mathbin{\ooalign{$\scriptscriptstyle-$\crcr$\scriptscriptstyle\bigcirc$}}}-connected to a neighbor even when no path exists between them. Situation (ii) arises whenever the sign evaluation algorithm applied to the shared face is inexact in the sense that it predicts all possible signs, yet in actuality ϕ\phi is uniformly signed on the face.666In 2D, on the randomly generated degree 3 polynomials under consideration, scenario (ii) never occurs because our particular range evaluation algorithm Algorithm 1 on 1D edges is exact.

Refer to caption
Figure 5: Examples illustrating the connected component labeling algorithm applied to randomly generated geometry in 3D. Only a subset of the octree is shown, corresponding to the cells for which Ω+\Omega^{+} labels have been assigned. From top-left to bottom-right, the geometric complexity increases from two components up to eight.

Similar behavior is obtained in 3D, though it is much less straightforward to visualize. Fig.​ 5 illustrates some examples. Note that in some cases, two cells with different labels can be touching, demonstrating that the sign evaluation algorithm was successful in disconnecting them.

Intuitively, we expect the algorithm to glue components only when they are nearly touching. Indeed, the examples in Fig.​ 4 illustrate how the associated gaps are generally less than the size of the smallest cell, sometimes much smaller. To quantify this, we define an error metric which assigns a “cost” to gluing multiple components together. Roughly speaking, the cost is defined by the sum of the gaps between the agglomerated components. More precisely, given a set V𝒰V\subseteq{\mathcal{U}} and two points x,yVx,y\in V, we define the cost of connecting xx and yy by the shortest path connecting them, as measured by the arclength outside of VV:

𝒞(x,y;V):=infγlength(image(γ)V),{\mathcal{C}}(x,y;V):=\inf_{\gamma}\text{length}\bigl{(}\text{image}(\gamma)\setminus V\bigr{)},

where the infimum is taken over all paths γ\gamma connecting xx and yy such that the part of γ\gamma exterior to VV has measurable and finite length. The cost of gluing multiple connected components U1,,UmU_{1},\ldots,U_{m} is then defined as

𝒞(U1,,Um):=supx,yV𝒞(x,y;V) where V=iUi.{\mathcal{C}}(U_{1},\ldots,U_{m}):=\sup_{x,y\in V}{\mathcal{C}}(x,y;V)\text{ where }V=\bigcup_{i}U_{i}.

For example, the cost of gluing two connected components is the smallest straight-line gap between them. As another example, the cost of gluing the 1D sets [0,1][0,1], [2,3][2,3], and [5,6][5,6] is 33, being the sum of the two smaller gaps.

Refer to caption
Refer to caption
Figure 6: Gap distance between glued components, relative to the size of the smallest cell in the quadtree/octree. Aggregated over all relevant instances of the randomly generated geometry considered in §​ 4.1, i.e., those making up the ρ×\rho_{\times} tally, the shaded region demarcates the first-to-third quartile spread of gg, the interior solid line indicates the median, while the dashed lines indicate the minimum and maximum gg values.

This metric is used to analyze the failure modes of the connected component labeling algorithm, as follows. For now, suppose Λ\Lambda is fixed. For each of the above-generated random polynomials ϕi\phi_{i}, we invoke Algorithm 2. Using the reference solution, we determine which components have been glued together and compute the associated cost;777A separate algorithm has been developed to compute the gaps between the set of connected components of 𝒰{ϕ=0}\mathcal{U}\setminus\{\phi=0\}. The algorithm uses a recursive subdivision approach combined with a kind of adaptive point sampling. Its output is an interval bounding the exact gap value and can be made as tight as necessary; in particular, the gap calculations being presented here are sufficiently accurate for the analysis shown in Fig.​ 6. if no components were glued, the example is excluded from further analysis; if there are multiple sets of glued components, the cost of each set is measured and the maximal such cost is used. The output is the “error” eie_{i} for each example ii making up the ρ×\rho_{\times} tally defined earlier; note that eie_{i} measures gaps of the input geometry, but its value depends on which components have been glued, which in turn depends on Λ\Lambda. We expect eie_{i} to scale with the smallest cell size; in the present setting, for the reference domain 𝒰=[1,1]d\mathcal{U}=[-1,1]^{d}, the smallest cells have a diagonal length of 21Λd2^{1-\Lambda}\sqrt{d}. We therefore examine their ratio and define the overall error for polynomial ϕi\phi_{i} by gi=ei2Λ1/dg_{i}=e_{i}2^{\Lambda-1}/\sqrt{d}. Aggregated over all such examples, we then compute the minimum and maximum value of gig_{i} as well as the median, first, and third quartiles. Fig.​ 6 displays the results corresponding to the set of 2D and 3D randomly generated polynomials. Once Λ\Lambda is sufficiently large so as to resolve the typical geometry (and for Bernstein subdivision to become sufficiently accurate), we observe that the gap between glued components is typically 25%–50% of the smallest cell. Meanwhile, the upper dashed curves of gg in Fig.​ 6 correspond to the rare cases in which two or more components are glued across a small chain of neighboring leaf cells. Notwithstanding this critical analysis, it is worth being reminded that, as Fig.​ 1 shows, the number of cases in which components are artificially glued decreases rapidly as Λ\Lambda is made larger.

4.2 Additional examples

Refer to caption Refer to caption Refer to caption
(i) Deltoid, three cusps (ii) Trifolium, self-intersection (iii) Non-negative polynomial
Refer to caption Refer to caption Refer to caption
(iv) Oloid, one cusp (v) High-order junction (vi) Line of singularities
Figure 7: Examples of test problems containing interfacial singularities.

Fig.​ 7 illustrates some additional examples involving corners, cusps, self-intersections, and other kinds of singularities. These examples are more contrived in the sense they typically would not occur in a computational physics modeling problem, say; nonetheless, they serve to stress-test the connected component labeling algorithm in ways the randomly generated geometry did not. In particular, the examples demonstrate cases in which the implicitly-defined geometry contains points of singularity, i.e., points where both ϕ\phi and its gradient are zero. In these cases, at least in comparison to the topology test of Definition 2.3, it is considerably more nuanced to determine whether the topology of U{ϕ=0}U\setminus\{\phi=0\} is sufficiently simple on a subcell UU. The examples of Fig.​ 7 serve to illustrate the algorithm’s behavior near singularities; in particular, (i) and (iv) show cases where the algorithm produces exact results despite the presence of singularities, while the remaining examples show cases where components are glued:

  • Fig.​ 7(i) demonstrates a deltoid curve given by the zero level set of ϕ(x,y)=(x2+y2)2+18(x2+y2)8(x33xy2)27\phi(x,y)=(x^{2}+y^{2})^{2}+18(x^{2}+y^{2})-8(x^{3}-3xy^{2})-27; note how the subdivision focuses in on the singular portions of the curve situated at the deltoid’s three cusps. In this case, the maximum-defined subdivision recursion depth is always met, but the resulting labeling is always exact.

  • Fig.​ 7(ii) demonstrates a trifolium curve, given by the zero level set ϕ(x,y)=(x2+y2)2x3+3xy2\phi(x,y)=(x^{2}+y^{2})^{2}-x^{3}+3xy^{2}, containing a point of self-intersection at the origin. The subdivision process focuses in on the origin until it is forced to stop; the labeling algorithm subsequently glues the three connected components of {ϕ>0}\{\phi>0\}, while the negative component {ϕ<0}\{\phi<0\} is unaffected.

  • Besides corners, cusps, and self-intersections, another kind of edge case concerns non-negative polynomials ϕ\phi such that {ϕ>0}\{\phi>0\} contains multiple components. Fig.​ 7(iii) illustrates a case involving circular geometry, where ϕ(x,y)=(x2+y2r2)2\phi(x,y)=(x^{2}+y^{2}-r^{2})^{2}. A ring of smallest-permitted subcells is obtained such that the components on either side of the interface are glued, as indicated by the solid white in the figure.

  • Fig.​ 7(iv) demonstrates a 3D case involving a single isolated cusp, given by conic-like oloid surface {ϕ=0}\{\phi=0\} where ϕ(x,y,z)=x2+y2+z3\phi(x,y,z)=x^{2}+y^{2}+z^{3}. Similar to the example in Fig.​ 7(i), the polynomial will never be simply connected on subcells sufficiently close to the cusp, and so the algorithm continues to subdivide until the maximum recursion depth is met. The labeling, however, is exact in this case.

  • Fig.​ 7(v) is a 3D analogue of (ii), and is given by the zero level set of ϕ(x,y,z)=(x+y+z1)(xy+z1)(xyz1)(x+yz1)2(x2+y2+z23)2\phi(x,y,z)=(x+y+z-1)(-x-y+z-1)(x-y-z-1)(-x+y-z-1)-2(x^{2}+y^{2}+z^{2}-3)^{2}. The full surface on 3\mathbb{R}^{3} contains six globular “arms” meeting at four junction points; we focus on just one of these singularities in the figure. In fixed-precision arithmetic, roundoff errors could lead to inconsistencies in the treatment of this geometry; the design choices in the present work result in all three components being glued.

  • Finally, Fig.​ 7(vi) demonstrates a non-trivial example in which the set of singularities is one-dimensional. Here, ϕ(x,y,z)=xyzx2y2\phi(x,y,z)=xyz-x^{2}-y^{2}, whose corresponding implicitly-defined geometry resembles two pinched pieces of paper, one above the other, rotated and connected via a line of singularities situated along the whole zz-axis. In this example, the algorithm subdivides until the maximum-permitted resolution is met, with all subcells near or containing the zz-axis marked as not simply connected. These leaf cells then form a bridge from one folded paper to the other, gluing all four of their respective components.

As an addendum, we point out that a tailor-made connected component labeling could be designed to specifically handle the kinds of geometry shown in Fig.​ 7. For example, in all of these cases the polynomial degree is small enough that it would be straightforward to apply cylindrical algebraic decomposition methods from real algebraic geometry. Our main purpose for these test problems is to demonstrate the “failure modes” of the connected component labeling algorithm in these exceptional situations.

5 Concluding Remarks

The connected component labeling algorithm developed here operates by recursively subdividing the constraint hyperrectangular domain 𝒰\mathcal{U} into progressively smaller subcells until the topology of Ω:=𝒰{ϕ=0}\Omega:=\mathcal{U}\setminus\{\phi=0\} thereon is sufficiently simple. This is achieved through a “simply connected” test exploiting some useful properties of the Bernstein polynomial basis, namely its effective methods for polynomial range evaluation. If the subdivision process succeeds, then the labeling is certifiably exact, i.e., χ(x)=χ(y)\chi(x)=\chi(y) if and only if x,yΩx,y\in\Omega are path-connected. At its core, however, the connected component labeling problem is ill-posed in the sense that tiny changes in the coefficients of the input polynomial ϕ\phi could alter the number and topology of the connected components. We treated that here by allowing distinct components to be glued together, but only when the topology is uncertain: gluing occurs mainly when two components are nearly touching relative to the length scale of the smallest-permitted subcell; gluing also occurs for various edge cases such as interfacial self-intersections or junctions, or grazing-type singularities such as non-negative polynomials for which {ϕ>0}\{\phi>0\} has multiple components.

We have yet to discuss the speed of the algorithm. In general, the computational complexity is difficult to precisely quantify, being intricately dependent on the input polynomial ϕ\phi. For a fixed spatial dimension dd and bounded polynomial degree, the construction of the subdivision tree (Algorithm 2) has a worst-case complexity o(2dΛ)o(2^{d\Lambda}); in practice, however, it is usually significantly faster. Computation of χ(x)\chi(x) for some xΩx\in\Omega (Algorithm 3) has a worst-case complexity 𝒪(Λ){\mathcal{O}}(\Lambda); in practice, however, it is usually significantly faster. Indeed, for a given fixed polynomial, as soon as the subdivision process resolves its implicitly-defined geometry, the algorithmic complexity becomes independent of Λ\Lambda.888In pathological cases, such as those illustrated in Fig.​ 7, the computational complexity depends on the measure of the input polynomial’s set of singularities; in particular, if its set of singularities (inside 𝒰\mathcal{U}) is nonempty and has maximal manifold dimension ds<dd_{s}<d, the worst-case complexity of Algorithm 2 is 𝒪(λ){\mathcal{O}}(\lambda) when ds=0d_{s}=0, or 𝒪(2dsΛ){\mathcal{O}}(2^{d_{s}\Lambda}) when ds>0d_{s}>0, as Λ\Lambda\to\infty. A comparison to the high-order quadrature algorithms developed in [18] is perhaps useful: solving the connected component labeling problem is about as fast as building a moderate order quadrature scheme. These quadrature algorithms are already “fast” in some sense, so the speed of the labeling algorithm is sufficient for our intended usage. As an additional indication, the cost of tree construction for the randomly generated geometry problems in §​ 4.1 is about 88 microseconds per instance in 2D and 0.20.2 milliseconds per instance in 3D; about two thirds of that time is spent in the first phase of Algorithm 2, about one third in the second phase, and negligible time in third phase. After tree construction and on the same set of test problems, the cost of evaluating χ(x)\chi(x) for a random point xx is about 0.050.05 (resp., 0.10.1) microseconds per evaluation in 2D (resp., 3D).999Timing measurements obtained on an Intel Xeon E3-1535m v6 laptop, single core, operating at approximately 3.5 Ghz.

Finally, we mention here some possibilities for extending the algorithmic approach. One immediate possibility is to develop a more sophisticated subcell topology test, e.g., Ω±U\Omega^{\pm}\cap U being star-connected or via an analysis of convexity [9]. Accompanying this change, a more sophisticated sign evaluation algorithm might also be required, so as to appropriately path-connect neighboring cells, consistent with the updated topology test. Depending on the end application, however, these alterations might come with an increased computational cost. Another possibility is to replace the quadtree/octree-style subdivision with a k-d tree subdivision, or, more generally, a binary space partitioning (BSP) tree. In this setting, the most efficient subdivision process would likely occur by orienting to the features of the polynomial, e.g., by choosing hyperplanes which are locally parallel to its level sets. One would then have to generalize the range evaluation algorithms to polytopes and contend with more complex algorithms for building the cell connectivity. On the other hand, the quadtree/octree approach allows for a considerably simpler implementation, partly due to the use of tensor-product polynomials as well as dimension reduction and recursion techniques. As a final possibility, we note that the basic idea of the subdivision algorithm could be extended to non-polynomial level set functions; for example, one approach might be to apply automatic differentiation and interval arithmetic techniques (see, e.g., [17]) to devise a suitable subcell topology test for arbitrary level set functions.

Acknowledgements

This research was supported in part by the Applied Mathematics Program of the U.S. Department of Energy (DOE) Office of Advanced Scientific Computing Research under contract number DE-AC02-05CH11231, and by a DOE Office of Science Early Career Research Program award.

References

  • [1] A. A. Ahmadi and A. Majumdar, DSOS and SDSOS Optimization: More Tractable Alternatives to Sum of Squares and Semidefinite Optimization, SIAM Journal on Applied Algebra and Geometry, 3 (2019), pp. 193–230, https://doi.org/10.1137/18M118935X.
  • [2] S. Basu, R. Pollack, and M.-F. Roy, Complexity of computing semi-algebraic descriptions of the connected components of a semi-algebraic set, in Proceedings of the 1998 International Symposium on Symbolic and Algebraic Computation, ISSAC ’98, 1998, pp. 25–29, https://doi.org/10.1145/281508.281533.
  • [3] S. Basu, R. Pollack, and M.-F. Roy, Algorithms in Real Algebraic Geometry, vol. 10 of Algorithms and Computation in Mathematics, Springer Berlin, Heidelberg, 2 ed., 2006, https://doi.org/10.1007/3-540-33099-2.
  • [4] M. Beśka, Convexity and variation diminishing property of multidimensional Bernstein polynomials, Approximation Theory and its Applications, 5 (1989), pp. 59–78, https://doi.org/10.1007/BF02888887.
  • [5] B. F. Caviness and J. R. Johnson, eds., Quantifier Elimination and Cylindrical Algebraic Decomposition, Texts & Monographs in Symbolic Computation, Springer-Verlag Wien, 1998, https://doi.org/10.1007/978-3-7091-9459-1.
  • [6] G. E. Collins, Quantifier elimination for real closed fields by cylindrical algebraic decomposition, in Automata Theory and Formal Languages, H. Brakhage, ed., Berlin, Heidelberg, 1975, Springer Berlin Heidelberg, pp. 134–183, https://doi.org/10.1007/3-540-07407-4_17.
  • [7] R. Farouki and V. Rajan, Algorithms for polynomials in Bernstein form, Computer Aided Geometric Design, 5 (1988), pp. 1–26, https://doi.org/10.1016/0167-8396(88)90016-7.
  • [8] J. Garloff, The Bernstein algorithm, Interval Computations, 2 (1993), pp. 154–168.
  • [9] J. B. Lasserre, Certificates of convexity for basic semi-algebraic sets, Applied Mathematics Letters, 23 (2010), pp. 912–916, https://doi.org/10.1016/j.aml.2010.04.009.
  • [10] Q. Lin and J. Rokne, Methods for bounding the range of a polynomial, Journal of Computational and Applied Mathematics, 58 (1995), pp. 193–199, https://doi.org/10.1016/0377-0427(93)E0270-V.
  • [11] Y. Matsumoto, An Introduction to Morse Theory, American Mathematical Society, 2002.
  • [12] D. Morozov and G. H. Weber, Distributed contour trees, in Topological Methods in Data Analysis and Visualization III, P.-T. Bremer, I. Hotz, V. Pascucci, and R. Peikert, eds., Springer International Publishing, 2014, pp. 89–102, https://doi.org/10.1007/978-3-319-04099-8_6.
  • [13] V. Powers and T. Wörmann, An algorithm for sums of squares of real polynomials, Journal of Pure and Applied Algebra, 127 (1998), pp. 99–104, https://doi.org/10.1016/S0022-4049(97)83827-3.
  • [14] M. Putinar and S. Sullivant, eds., Emerging Applications of Algebraic Geometry, vol. 149 of The IMA Volumes in Mathematics and its Applications, Springer Science & Business Media, 2008, https://doi.org/10.1007/978-0-387-09686-5.
  • [15] J. Rajyaguru, M. E. Villanueva, B. Houska, and B. Chachuat, Chebyshev model arithmetic for factorable functions, Journal of Global Optimization, 68 (2017), pp. 413–438, https://doi.org/10.1007/s10898-016-0474-9.
  • [16] S. Ray and P. S. V. Nataraj, An efficient algorithm for range computation of polynomials using the Bernstein form, Journal of Global Optimization, 45 (2009), pp. 403–426, https://doi.org/10.1007/s10898-008-9382-y.
  • [17] R. I. Saye, High-Order Quadrature Methods for Implicitly Defined Surfaces and Volumes in Hyperrectangles, SIAM Journal on Scientic Computing, 37 (2015), pp. A993–A1019, https://doi.org/10.1137/140966290.
  • [18] R. I. Saye, High-order quadrature on multi-component domains implicitly defined by multivariate polynomials, Journal of Computational Physics, 448 (2022), p. 110720, https://doi.org/10.1016/j.jcp.2021.110720.