
Finding Closed Quasigeodesics on Convex Polyhedra

(A preliminary version of this paper appeared at the 36th International Symposium on Computational Geometry (SoCG 2020) [DHK20].)

Erik D. Demaine Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA, [email protected]    Adam C. Hesterberg John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA, [email protected]    Jason S. Ku Department of Mechanical Engineering, National University of Singapore, Singapore, [email protected]
Abstract

A closed quasigeodesic is a closed curve on the surface of a polyhedron with at most $180^\circ$ of surface on both sides at all points; such curves can be locally unfolded straight. In 1949, Pogorelov proved that every convex polyhedron has at least three (non-self-intersecting) closed quasigeodesics, but the proof relies on a nonconstructive topological argument. We present the first finite algorithm to find a closed quasigeodesic on a given convex polyhedron, which is the first positive progress on a 1990 open problem by O'Rourke and Wyman. The algorithm establishes for the first time a pseudopolynomial upper bound on the total number of visits to faces (number of line segments), namely, $O\left(\frac{n L^3}{\varepsilon^2 \ell^3}\right)$, where $n$ is the number of vertices of the polyhedron, $\varepsilon$ is the minimum curvature of a vertex, $L$ is the length of the longest edge, and $\ell$ is the smallest distance within a face between a vertex and a nonincident edge (minimum feature size of any face). On the real RAM, the algorithm's running time is also pseudopolynomial, namely $O\left(\frac{n L^3}{\varepsilon^2 \ell^3}\log n\right)$. On a word RAM, the running time grows to $O\left(b^2\cdot\frac{n^8\log n}{\varepsilon^8}\cdot\frac{L^{21}}{\ell^{21}}\cdot 2^{O(|\Lambda|)}\right)$, where $|\Lambda|$ is the number of distinct edge lengths in the polyhedron, assuming its intrinsic or extrinsic geometry is given by rational coordinates each with at most $b$ bits. This time bound remains pseudopolynomial for polyhedra with $O(\log n)$ distinct edge lengths, but is exponential in the worst case. Along the way, we introduce the expression RAM model of computation, formalizing a connection between the real RAM and word RAM hinted at by past work on exact geometric computation.

1 Introduction

A geodesic on a surface is a path that is locally shortest at every point, i.e., cannot be made shorter by modifying the path in a small neighborhood. A closed geodesic on a surface is a loop (closed curve) with the same property; notably, the locally shortest property must hold at all points, including the "wrap around" point where the curve meets itself. In 1905, Poincaré [Poi05] conjectured that every convex surface has a non-self-intersecting closed geodesic. (Non-self-intersecting (quasi)geodesics are often called simple (quasi)geodesics in the literature; we avoid this term to avoid ambiguity with other notions of "simple".) In 1927, Birkhoff [Bir27] proved this result, even in higher dimensions (for any smooth metric on the $n$-sphere). In 1929, Lyusternik and Schnirelmann [LS29] claimed that every smooth surface of genus 0 in fact has at least three non-self-intersecting closed geodesics. Their argument "contains some gaps" [BTZ83], filled in later by Ballmann [Bal78].

For non-smooth surfaces (such as polyhedra), an analog of a geodesic is a quasigeodesic: a path with $\leq 180^\circ$ of surface on both sides locally at every point along the path. Equivalently, a quasigeodesic can be locally unfolded to a straight line: on a face, a quasigeodesic is a straight line; at an edge, a quasigeodesic is a straight line after the faces meeting at that edge are unfolded (developed) flat at that edge; and at a vertex of curvature $\kappa$ (that is, a vertex whose sum of incident face angles is $360^\circ-\kappa$), a quasigeodesic entering the vertex at a given angle can exit it anywhere in an angular interval of length $\kappa$, as in Figure 1. Analogously, a closed quasigeodesic is a loop which is quasigeodesic. In 1949, Pogorelov [Pog49] proved that every convex surface has at least three non-self-intersecting closed quasigeodesics, by applying the theory of geodesics on smooth surfaces to smooth approximations of arbitrary convex surfaces and taking limits.
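As a quick numeric sanity check of the vertex condition (a sketch of ours, not the paper's code; angles in degrees, and valid for vertices with curvature $\kappa\leq 180^\circ$): if $x$ is the amount of surface left of the exiting direction, then $x\leq 180$ and $\theta-x\leq 180$, so $x$ ranges over an interval of length $360-\theta=\kappa$.

```python
# Sketch (ours): at a vertex whose incident face angles sum to
# theta = 360 - kappa degrees, a quasigeodesic must leave at most 180
# degrees of surface on each side.  If x is the surface angle on the
# left of the exiting segment, then x <= 180 and theta - x <= 180, so
# x ranges over an interval of length 360 - theta = kappa
# (assuming kappa <= 180, so that both constraints are active).
def exit_interval(theta_deg):
    lo = theta_deg - 180.0   # leaving less would put > 180 on the right
    hi = 180.0               # leaving more would put > 180 on the left
    return lo, hi

theta = 300.0                # a vertex with curvature kappa = 60 degrees
lo, hi = exit_interval(theta)
assert abs((hi - lo) - (360.0 - theta)) < 1e-9   # interval length == kappa
```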

Figure 1: At a vertex of curvature $\kappa$, there is a $\kappa$-size interval of angles in which a segment of a quasigeodesic can be extended: the segment of geodesic starting on the left can continue straight in either of the pictured unfoldings, or any of the intermediate unfoldings in which the right pentagon touches only at a vertex.

The existence proof of three closed quasigeodesics is nonconstructive, because the smooth argument uses a nonconstructive topological argument (a homotopy version of the intermediate value theorem). (A proof sketch for the existence of one closed geodesic on a smooth convex surface is as follows. By homotopy, there is a transformation of a small clockwise loop into its (counterclockwise) reversal that avoids self-intersection throughout. Consider the transformation that minimizes the maximum arclength of any loop during the transformation. By local cut-and-paste arguments, the maximum-arclength intermediate loop is in fact a closed geodesic. The same argument can be made for the nonsmooth case.) In 1990, Joseph O'Rourke and Stacia Wyman posed the problem of finding a polynomial-time algorithm to find any closed quasigeodesic on a given convex polyhedron (aiming in particular for a non-self-intersecting closed quasigeodesic) [O'R20]. This open problem was stated during the open problem session at SoCG 2002 (by O'Rourke) and finally appeared in print in 2007 [DO07b, Open Problem 24.24]. Two negative results described in [DO07b, Section 24.4] are that an $n$-vertex polyhedron can have $2^{\Omega(n)}$ non-self-intersecting closed quasigeodesics [DO07a, Theorem 24.4.1] (an unpublished result by Aronov and O'Rourke) and that, for any $k$, there is a convex polyhedron whose shortest closed geodesic is not composed of $k$ shortest paths (an unpublished result from the discussion at SoCG 2002).

Even a finite algorithm is not obvious. Since the conference version of this paper [DHK20], two other approaches have been developed. Sharp and Crane [SC20] gave a practical heuristic based on edge flipping, but it has no proof of convergence. Chartier and de Mesmay [CdM22] gave a finite guaranteed-correct algorithm, but the running time is pseudo-exponential (exponential in both $n$ and $L/\ell$ defined below).

One tempting approach is to find two (or $O(n)$) shortest paths whose union is a closed quasigeodesic. For example, the source unfolding [AAOS97, DO07b] (Voronoi diagram) from a polyhedron vertex $V$ consists of all points on the polyhedron having multiple shortest paths to $V$, as in Figure 2. Can we find a point $P$ on the source unfolding and two shortest paths between $P$ and $V$ whose union forms a closed quasigeodesic? We know that there is a closed quasigeodesic through some vertex $V$, because if we have a closed quasigeodesic through no vertices, we can translate it until it hits at least one vertex. But there might not be any choice for $P$ that makes the two shortest paths meet with sufficiently straight angles at both $V$ and $P$, as in Figure 2.

Figure 2: A source unfolding from vertex $V$ of a six-vertex polyhedron (the convex hull of a square-based pyramid and vertex $V$ which is slightly outside the pyramid), similar to [AAOS97, Figure 1] and [DO07b, Figure 24.2]. No closed quasigeodesic can be formed by two shortest paths from $V$ to another point.

A more general approach is to argue that there is a closed quasigeodesic consisting of some function $s(n)$ (e.g., $O(n)$) segments on faces. If true, there are $O(n)^{s(n)}$ combinatorial types of quasigeodesics to consider, and each can be checked via the existential theory of the reals (in exponential time), resulting in an exponential-time algorithm. But we do not know any such bound $s(n)$. It seems plausible that the "short" closed quasigeodesics from the nonconstructive proofs satisfy $s(n)=O(n)$, say, but as far as we know the only proved property about them is that they are non-self-intersecting, which does not obviously suffice: a quasigeodesic can wind many times around a curvature-bisecting loop, turn around, and symmetrically unwind, all without collisions, as in Figure 3. Polyhedra such as isosceles tetrahedra have arbitrarily long non-self-intersecting closed geodesics (and even infinitely long non-self-intersecting geodesics) [IRV19], so the only hope is to find an upper bound $s(n)$ on some (fewest-edge) closed quasigeodesic.

Figure 3: Non-self-intersecting quasigeodesics may cross a face many times. For example, a $1\times 1\times L$ rectangular prism admits closed quasigeodesics which cross a face $\Omega(L)$ times.

1.1 Our Results

We develop in Section 3 the first finite algorithm that finds a closed quasigeodesic on a given convex polyhedron, using $O\left(\frac{n L^3}{\varepsilon^2 \ell^3}\log n\right)$ real-RAM operations (arithmetic and square roots), where $n$ is the number of vertices of the polyhedron, $\varepsilon$ is the smallest curvature at a vertex, $L$ is the length of the longest edge, and $\ell$ is the smallest distance within a face between a vertex and a nonincident edge (minimum feature size of any face). Furthermore, the found closed quasigeodesic consists of $O\left(\frac{n L^3}{\varepsilon^2 \ell^3}\right)$ segments on faces, which is the first finite upper bound $s(n,\varepsilon,L,\ell)$. Indeed, these time and segment bounds are both pseudopolynomial.

The real-RAM model of computation is common in computational geometry, but it is an unrealistic model for digital computers, which are restricted to finite-precision computation. We introduce in Section 2 a model of computation for realistic manipulation of radical real numbers, called the expression RAM. This model provides a simple interface for manipulation of radical expressions (arithmetic, roots, and comparisons), similar to the real RAM. The key feature is that the expression RAM model can be implemented on top of either the real RAM or the word RAM (the standard model for digital computers), with different operation costs. In particular, we give the first general transformation of a real-RAM algorithm into a word-RAM algorithm with "only" singly exponential slowdown. Furthermore, when every involved expression has height $O(\log n)$ and contains at most $|R|$ distinct $O(1)$-roots (e.g., square roots) and integers of at most $n^{O(1)}$ bits, the transformation has slowdown $n^{O(1)}\cdot 2^{O(|R|)}$. For example, real-RAM expressions of constant size can be implemented with slowdown $O(b)$, preserving polynomial time bounds. (An earlier version of this paper [DHK20] mistakenly claimed that our closed-quasigeodesic algorithm fit within this "$O(1)$-expression RAM", and thus mistakenly claimed that the running time remained pseudopolynomial on the word RAM.) These results formalize and analyze the existing practical work on exact geometric computation pioneered by LEDA/CGAL reals [BFM+01] and CORE reals [KLPY99].

Applied to our algorithm for finding a closed quasigeodesic, we obtain a running time of $O\left(b^2\cdot\frac{n^8\log n}{\varepsilon^8}\cdot\frac{L^{21}}{\ell^{21}}\cdot 2^{O(|\Lambda|)}\right)$ on the word RAM, where $|\Lambda|$ is the number of distinct edge lengths and $b$ is the maximum number of bits in a coordinate of a rational-coordinate representation of the intrinsic metric or extrinsic vertex coordinates of the input polyhedron. (In fact, our algorithm supports more general expression inputs for either representation, but the time bound gets more complicated.) When $|\Lambda|=O(\log n)$ or $|\Lambda|=O\left(\log\left(\frac{n}{\varepsilon}\cdot\frac{L}{\ell}\right)\right)$, this running time is still pseudopolynomial. For example, any zonohedron (Minkowski sum of $\lg n$ 3D vectors) satisfies $|\Lambda|\leq\lg n$, resulting in pseudopolynomial running time. In the worst case, however, the running time is exponential in the input size $nb$. Nonetheless, this is the first finite algorithm for finding a closed quasigeodesic, so getting it to run at all on the word RAM is interesting.

In fact, it is unlikely that a closed quasigeodesic can be found in worst-case subexponential time on the word RAM, until we resolve the famous sum-of-square-roots open problem [DMO09]. For example, comparing the lengths of two paths given by $O(n)$ segments on faces (as required by any shortest-path algorithm) requires comparing two sums of $O(n)$ square roots, which is not known to be solvable in subexponential time. (For the same reason, the natural decision versions of Euclidean shortest paths and minimum spanning tree are not known to be in NP.) For closed quasigeodesics, it seems necessary to answer decision problems such as "is there a geodesic path connecting two given vertices that visits the given sequence of edges/faces in order?" While we do not know a reduction from sum-of-square-roots to this problem, the most natural solution by unfolding the faces one after the other involves arithmetic (sums and multiplications) over the lengths of the edges, which are square roots.

2 Models of Computation

In this section, we review standard models of computation and introduce a new model — the expression RAM — that makes it easy to state an algorithm that manipulates real numbers while grounding it in computational feasibility. The expression RAM is essentially a form of the real RAM, and indeed the expression RAM can be implemented on top of the real RAM (except for possible lack of floor/ceiling operations). More interesting is that the expression RAM can be implemented on top of the word RAM, with different operation costs; see Figure 4. Thus we can express a single algorithm in the expression RAM, and then analyze its running time on both the real RAM and word RAM.

Figure 4: The expression RAM model can be implemented on top of the real RAM or the word RAM (with different operation costs).

Our approach is essentially the “exact geometric computation” framework of LEDA/CGAL reals [BFM+01] and CORE reals [KLPY99]. So while the model and associated algorithms have essentially been described before (and even implemented in code), they have not previously been analyzed to the level of precise costs in the word-RAM model. We expect that the expression RAM can be applied to analyze many algorithms in computational geometry, both old and new, making it a contribution of independent interest.

2.1 Standard Models: Real RAM and Word RAM

First we recall the two most common models of computation in computational geometry and data structures, respectively: the real RAM and the word RAM.

Both models define a Random Access Machine (RAM) to have a memory consisting of an array $M$ with cells $M[0],M[1],\dots$ (where the first $O(1)$ cells act as "registers"), but they differ in what the cells can store. The RAM supports the $O(1)$-time operations "$M[i]=M[M[j]]$" (read) and "$M[M[j]]=M[i]$" (write), where $i$ and $j$ are constant integers (e.g., $7$ and $42$) representing register indices, and $M[j]$ is promised to be an integer (or else the operation fails). When we later allow operations on cell values, such as "$a=b+c$", these operations are actually restricted on a RAM to be of the form "$M[i_1]=M[i_2]+M[i_3]$", where $i_1,i_2,i_3$ are all constant integers (again representing register indices).

The real RAM model (dating back to Shamos's original thesis [Sha78]) allows storage and manipulation of black-box real numbers supporting arithmetic operations $+,-,\times,\div$, radical operations $\sqrt[d]{\ }$, and comparisons $<,>,=$ in constant time, in addition to storage and manipulation of integers used for indexing. (Shamos's original description of the real RAM [Sha78] supports the more general operations $\exp$ and $\log$, from which one can derive $\sqrt[d]{\ }$, but these features do not seem to be in common use in computational geometry. The original description also supports trigonometric functions, but this has fallen out of favor because of related undecidability results [Lac03] described below.) While popular in computational geometry for its simplicity, this model is not directly realistic for a digital computer, because it does not bound the required precision, which can grow without bound. For example, the real RAM model crucially does not support converting black-box real numbers into integers (e.g., via the floor function), or else one can solve PSPACE [Sch79] and #SAT [BMS85] in polynomial time. The real RAM (without floor) is one of the models we will use to analyze our algorithm. (Refer ahead to Table 1, last column, for the exact operations we allow on a real RAM.)

The word RAM [FW93] allows storage and manipulation of $w$-bit integers (called words) supporting arithmetic ($+,-,\times,\div,\operatorname{mod}$), bitwise operations (and, or, xor, not), and comparisons ($<,>,=$) in constant time. Furthermore, the model makes the transdichotomous assumption that $w\geq\lg n$, where $n$ is the problem size (named for how it bridges the problem and machine [FW93]). This assumption ensures that a single word can address the memory required to store the $n$ inputs; it is essentially the minimal assumption necessary to enable efficient random access on a RAM. Technically, we assume that the real RAM also includes all word-RAM operations (i.e., the real RAM can store and manipulate both black-box real numbers and $w$-bit words where $w\geq\lg n$), so that it can support random access and standard data structures. (Shamos's original description of the real RAM [Sha78] says that it supports "Indirect accessing of memory (integer addresses only)", presumably with the intent that such addresses should be formed via arithmetic on integers, as opposed to real arithmetic that happens to result in integers, which would be difficult to detect. Adding the entire uncontroversial word RAM model, which did not exist in 1978, enables more flexible address manipulation (e.g., word division and mod), so that the real RAM is just about adding black-box real numbers.)

The word RAM is the standard model of computation in data structures and much of theoretical computer science, capturing the word-level parallelism in real computers. But it is highly restrictive for computational geometry because it restricts the inputs to be integers or rationals, whereas we normally want to allow continuous real inputs (e.g., a point set or a polygon). Recent work starting with [CP09, CP10] achieves strong results in computational geometry on the word RAM, but not all problems adapt well to integer or rational inputs. For example, in our quasigeodesics problem, should the input consist of vertex coordinates restricted to be integers, or an intrinsic polyhedron metric restricted to have integer edge lengths? Each answer defines a different subset of problems, and neither obviously captures all instances that we might want to solve.

Our solution is to build a new model called the “expression RAM”. On the one hand, this model has a built-in notion of restricted real numbers, thereby encompassing a form of the real RAM. This enables problem inputs to be given as certain real numbers, including integers and rationals but also roots (though these incur a cost in any computation involving them). As we show in Section 2.4, this model provides reductions between the different possible representations of polyhedral inputs, though at some cost. On the other hand, our model can be implemented directly on the word RAM (or the real RAM) with appropriate operation costs, so its algorithms can be realistically run on digital computers. We believe our model is the first to unify the simplicity of the real RAM with guaranteed algorithmic performance as measured on the word RAM, though our approach is closely based on the theory and practice of exact geometric computation [BFM+01, KLPY99].

2.2 Expression RAM

The expression RAM (Random Access Machine) is a RAM model of computation allowing storage and manipulation of real radical expressions over integers, called "expressions" for short. More precisely, an expression is a real number specified by an ordered DAG with a specified source node (representing the overall value of the expression), where each leaf corresponds to an integer, each two-child node corresponds to a binary operator among $\{+,-,\times,\div\}$, and each one-child node corresponds to a unary operator $\sqrt[d]{\ }$ for an integer $d$ stored within the node. (We use the term "source" instead of the more standard "root" to avoid confusion with nodes that represent roots in radical expressions. In our quasigeodesics algorithm, we only need square roots ($d=2$), but we define general fixed roots for potential future uses in other algorithms. Another potential addition to the model is the expression "$j$th smallest real root of the degree-$d$ polynomial with coefficients given by $d+1$ expressions" [BFM+01, KLPY99], but this would require additional work which we do not carry out here.) A special case of an expression is an integer expression, which consists of a single integer (leaf) and no operators. Like the real RAM, we define the expression RAM to extend the word RAM, so it also allows storage and manipulation of $w$-bit words, where $w$ is a model parameter satisfying $w\geq\lg n$. To unify the two data types, we define a $b$-bit integer expression to be explicitly encoded as $\lceil b/w\rceil$ words of $w$ bits, which can be manipulated as usual by word-RAM operations. In particular, we can convert words into $w$-bit integer expressions and vice versa in $O(1)$ time.
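For concreteness, here is a minimal sketch (ours; the tuple encoding is not from the paper) of such an expression DAG, where reusing the same Python object for a subexpression makes the structure a DAG rather than a tree:

```python
# Sketch (ours): a leaf ('int', n) stores an arbitrary-precision integer,
# (op, a, b) stores a binary operator among '+', '-', '*', '/', and
# ('root', d, a) stores a d-th root of subexpression a.
def dag_size(E, seen=None):
    """Count distinct nodes, visiting each shared subexpression once."""
    seen = set() if seen is None else seen
    if id(E) in seen:
        return 0
    seen.add(id(E))
    if E[0] == 'int':
        return 1
    children = E[2:] if E[0] == 'root' else E[1:]
    return 1 + sum(dag_size(c, seen) for c in children)

two = ('int', 2)
sqrt2 = ('root', 2, two)
expr = ('*', sqrt2, sqrt2)       # sqrt(2) * sqrt(2), sharing the root node
assert expr[1] is expr[2]        # a DAG: both children are the same node
assert dag_size(expr) == 3       # product node + root node + integer leaf
```

As a tree, this expression would have five nodes; the DAG encoding shares the $\sqrt{2}$ node and its leaf, giving three.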

Table 1 defines the expression-RAM operations for forming and evaluating expressions. Constructing expressions (Operation 1) is essentially free, but also does not compute anything other than an expression DAG, which can then be input into a computation (Operations 2, 3, and 4). Roughly speaking, Operation 3 computes the integer part and the first $b$ bits of the fractional part of $E$ (useful e.g. for reporting or plotting approximate values), but the last bit may be incorrect. An incorrect last bit is especially problematic when all the other bits are zero, but it turns out that in this case the correct last bit must in fact be zero (because $b\geq B(E)$, where $B$ is defined in (1) in Section 2.2.1 below). Operations 2 and 4 (which are built on Operation 3) show that this enables exact computation of sign, floor, and ceiling, despite these depending on the value of the last bit. (In our quasigeodesic algorithm, we do not actually need the floors and ceilings of Operation 4, or the approximations of Operation 3, but we describe how to easily implement them for possible future use.)

The expression-RAM operations of Table 1 can be implemented on the real RAM by representing an "expression" as a "built-in real number", except for Operation 4 (assuming no floor or ceiling operation is available, as motivated in Section 2.1). Thus we can state a single algorithm in terms of Operations 1, 2, and 3, and then analyze the running time both on the real RAM and on the word RAM. The rightmost column of Table 1 gives the real-RAM cost of each operation. Operations 1 and 2 are directly supported in one operation on the real RAM (where Operation 1 now actually computes the real number resulting from the expression). Operation 3 requires a bit more effort, and will be analyzed in Lemma 2.2 below.

Table 1: Expression operations supported by the expression RAM, and their costs on the word RAM (recursive and simple cost models) and on the real RAM. (In addition, the expression RAM supports all word-RAM operations.)

O1. Given two expressions $E_1$ and $E_2$ (possibly integer expressions) and a positive integer $d$, construct the expression $E_1+E_2$, $E_1-E_2$, $E_1\cdot E_2$, $E_1/E_2$, or $\sqrt[d]{E_1}$.
  Word RAM, recursive cost model: $O(1)$. Word RAM, simple cost model: $O(1)$. Real RAM: $O(1)$.

O2. Given an expression $E$, compute the sign of $E$, i.e., whether it is zero, positive, or negative.
  Word RAM, recursive cost model: $O(T(E))$. Word RAM, simple cost model: $O\left(b^*(E)^2\cdot\big(64\,\Delta(E)\big)^{2|R(E)|+2H(E)+6}\right)$. Real RAM: $O(1)$.

O3. Given an expression $E$ and a positive integer $b\geq B(E)$ (defined in (1)), $\frac{1}{2^b}$-compute $E$: compute an interval $[l,u]$ of rational numbers $l,u$ (represented by quotients of integer expressions) such that $[l,u]$ contains the value of $E$ and has length $u-l\leq\frac{1}{2^b}$.
  Word RAM, recursive cost model: $O(T(E,b))$. Word RAM, simple cost model: $O\left(b\cdot\big(16\,\Delta(E)\big)^{H(E)+2}\right)$. Real RAM: $O(b)$.

O4. Given an expression $E$, compute the integer expression $\lfloor E\rfloor$ or $\lceil E\rceil$.
  Word RAM, recursive cost model: $O(T(E)^2)$. Word RAM, simple cost model: $O\left(b^*(E)^2\cdot\big(64\,\Delta(E)\big)^{2|R(E)|+2H(E)+6}\right)$. Real RAM: n/a.

$\bullet$ In Operations 2, 3, and 4, if $E$ contains any invalid computation (division by zero, or even roots of negative numbers), then these computations simply produce a special result of "undefined" (following [BFM+01]).

In the remainder of this section, we define and prove the various time bounds in Table 1, including two different cost models on the word RAM. In particular, we develop algorithms for the expression-RAM operations on the real RAM and word RAM, and analyze these algorithms to prove the claimed time bounds in the underlying models. Our “recursive” cost model (Section 2.2.2) specifies the running time of expression-RAM operations on the word RAM in terms of a recurrence over the structure of the expression. This cost model is more difficult to work with, but may be of use in certain situations where the expressions have specific form. It will also serve as the foundation for our “simple” cost model (Section 2.2.3), which bounds the running time of expression-RAM operations on the word RAM in terms of finitely many parameters. This model also includes algorithms to remove duplicate roots in an expression, so that the running time can depend on the number of distinct roots.

It is unlikely that trigonometric functions could be added to the expression RAM, as it is unclear how to bound their separation from zero in general and thus obtain word-RAM algorithms for Operation 2. For example, it is undecidable to determine whether a single-variable function built from operators $+$, $-$, $\times$, $\sin$, $\exp$, and composition is always nonnegative [Lac03].

2.2.1 Expression Bounds: $B(E)$, $C(E)$, and $D(E)$

At the center of our approach is a "separation bound" from [BFM+01] that limits how close to zero an expression $E$ can be without actually being zero. We express this bound in terms of three functions from expressions to positive integers, which are simplifications of functions from [BFM+01, Theorem 1]. Specifically, define the functions $B(E)$ ("bit bound"), $C(E)$ ("calculation complexity"), and $D(E)$ ("degree doom") as follows:

$$B(E) = \lceil D(E)\lg C(E)\rceil, \tag{1}$$

$$C(E) = \begin{cases} \max\{2,|E|\} & \text{if $E$ is an integer expression,}\\ [C(E_1)\cdot C(E_2)]^2 & \text{if $E=E_1\circ E_2$ for some operator $\circ\in\{+,-,\times,\div\}$,}\\ [C(E_1)]^2 & \text{if $E=\sqrt[d]{E_1}$,}\end{cases} \tag{2}$$

$$D(E) = \text{the product of the degrees $d$ of all radical expressions $\sqrt[d]{E_1}$ in $E$} \tag{3}$$

$$\phantom{D(E)} \leq \begin{cases} 1 & \text{if $E$ is an integer expression,}\\ D(E_1)\cdot D(E_2) & \text{if $E=E_1\circ E_2$ for some operator $\circ\in\{+,-,\times,\div\}$,}\\ d\cdot D(E_1) & \text{if $E=\sqrt[d]{E_1}$.}\end{cases} \tag{4}$$

(The recursive upper bound for $D(E)$ is exact for trees, but over-counts for DAGs.)
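As a concrete illustration (our own sketch, using a tuple encoding of expressions that is not from the paper), the bounds (1)-(4) can be computed by direct recursion; the example expression $\sqrt{2}\cdot\sqrt{2}-2$ is exactly zero as a real number, yet has a positive bit bound $B$:

```python
# Sketch (ours) of the bounds B, C, D from (1)-(4).  An expression is a
# tuple: ('int', n), (op, a, b) for op in '+-*/', or ('root', d, a).
# D uses the tree-style recursive upper bound (4), which may over-count
# shared subexpressions in a DAG.
import math

def C(E):
    tag = E[0]
    if tag == 'int':
        return max(2, abs(E[1]))
    if tag == 'root':
        return C(E[2]) ** 2
    return (C(E[1]) * C(E[2])) ** 2

def D(E):
    tag = E[0]
    if tag == 'int':
        return 1
    if tag == 'root':
        return E[1] * D(E[2])
    return D(E[1]) * D(E[2])

def B(E):
    return math.ceil(D(E) * math.log2(C(E)))

# E = sqrt(2) * sqrt(2) - 2, which is exactly zero as a real number.
s2 = ('root', 2, ('int', 2))
E = ('-', ('*', s2, s2), ('int', 2))
# C(E) = ((4*4)**2 * 2)**2 = 262144, D(E) = 4, so B(E) = ceil(4*18) = 72.
```

By Theorem 2.1 below, a nonzero expression with these bounds would have absolute value at least $2^{-72}$, so approximating $E$ to finer precision than that certifies $E=0$.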

Crucially, these functions provide bounds on how large an expression $E$ can get, and on how close to zero a nonzero expression $E$ can get:

Theorem 2.1 ([BFM+01, Theorem 1]).

Any real algebraic expression $E\neq 0$ satisfies

$$\frac{1}{C(E)^{D(E)}} \leq |E| \leq C(E)^{D(E)}, \tag{5}$$

or, taking logarithms,

$$-B(E) \leq \lceil\lg|E|\rceil \leq B(E). \tag{6}$$
Proof.

In fact, [BFM+01, Theorem 1] states that

$$\frac{1}{l(E)\cdot u(E)^{D(E)-1}} \leq |E| \leq u(E)\cdot l(E)^{D(E)-1}, \tag{7}$$

where $D(E)$ is defined the same, and $l(E)$ and $u(E)$ are defined recursively as in the cases below. (The functions $l(E)$ and $u(E)$ from [BFM+01] are used only within our proof of Theorem 2.1, and are unrelated to the rational numbers $l$ and $u$ output by Operation 3. We have simplified the statement of the theorem to give a rough upper bound $C$ that applies in all cases, related to their functions $u,l$ via $C(E)\geq u(E)\cdot l(E)$. The base case (for integer expressions $E$) has been modified to guarantee $C(E)\geq 2$, which allows us to simplify some of the recursive formulas. We have also loosened the bound by exponentiating both $l(E)$ and $u(E)$ by $D(E)$, which is also $1$ larger than needed.)

We prove by structural induction on the expression $E$ that $C(E)\geq u(E)\cdot l(E)$. First note that $C(E)\geq 2$ and $u(E),l(E)\geq 1$ always. Then we proceed in cases:

  1. 1.

    If EE is an integer expression NN, then u(E)=Nu(E)=N and l(E)=1l(E)=1, so u(E)l(E)=Nmax{2,|E|}=C(E)u(E)\cdot l(E)=N\leq\max\{2,|E|\}=C(E).

  2. 2.

    If E=E1E2E=E_{1}\circ E_{2} for some operator {+,}\circ\in\{+,-\}, then u(E)=u(E1)l(E2)+l(E1)u(E2)2max{u(E1)l(E2),l(E1)u(E2)}u(E)=u(E_{1})\cdot l(E_{2})+l(E_{1})\cdot u(E_{2})\leq 2\max\{u(E_{1})\cdot l(E_{2}),l(E_{1})\cdot u(E_{2})\} and l(E)=l(E1)l(E2)l(E)=l(E_{1})\cdot l(E_{2}), so

    u(E)l(E)\displaystyle u(E)\cdot l(E) 2max{u(E1)l(E1)l(E2)2,l(E1)2l(E2)u(E2)}\displaystyle\leq 2\max\{u(E_{1})\cdot l(E_{1})\cdot l(E_{2})^{2},l(E_{1})^{2}\cdot l(E_{2})\cdot u(E_{2})\}
    2u(E1)l(E1)u(E2)l(E2)max{u(E2)l(E2),u(E1)l(E1)}\displaystyle\leq 2\,u(E_{1})\cdot l(E_{1})\cdot u(E_{2})\cdot l(E_{2})\cdot\max\{u(E_{2})\cdot l(E_{2}),u(E_{1})\cdot l(E_{1})\}
    =2C(E1)C(E2)max{C(E2),C(E1)}.\displaystyle=2\,C(E_{1})\cdot C(E_{2})\cdot\max\{C(E_{2}),C(E_{1})\}.

    Because min{C(E2),C(E1)}2\min\{C(E_{2}),C(E_{1})\}\geq 2, we have that u(E)l(E)C(E1)2C(E2)2=C(E)u(E)\cdot l(E)\leq C(E_{1})^{2}\cdot C(E_{2})^{2}=C(E), as claimed.

  3. If E = E₁·E₂, then u(E) = u(E₁)·u(E₂) and l(E) = l(E₁)·l(E₂), so

     u(E)·l(E) = u(E₁)·l(E₁)·l(E₂)·u(E₂) ≤ C(E₁)·C(E₂) = C(E),

     as claimed.

  4. If E = E₁/E₂, then u(E) = u(E₁)·l(E₂) and l(E) = l(E₁)·u(E₂), so

     u(E)·l(E) = u(E₁)·l(E₁)·l(E₂)·u(E₂) ≤ C(E₁)·C(E₂) = C(E),

     as claimed.

  5. If E = ᵈ√E₁ and u(E₁) ≥ l(E₁), then u(E) = u(E₁)^{1/d}·l(E₁)^{(d−1)/d} and l(E) = l(E₁), so u(E)·l(E) = u(E₁)^{1/d}·l(E₁)^{(2d−1)/d} ≤ C(E₁)² = C(E), as claimed.

  6. If E = ᵈ√E₁ and u(E₁) < l(E₁), then u(E) = u(E₁) and l(E) = u(E₁)^{(d−1)/d}·l(E₁)^{1/d}, so u(E)·l(E) = u(E₁)^{(2d−1)/d}·l(E₁)^{1/d} ≤ C(E₁)² = C(E), as claimed.

Finally, (7) shows that |E| ≤ u(E)·l(E)^{D(E)−1}, which is at most (u(E)·l(E))^{D(E)} = C(E)^{D(E)}; and that |E| ≥ u(E)^{1−D(E)}·l(E)^{−1}, which is at least (u(E)·l(E))^{−D(E)} = C(E)^{−D(E)}, as desired. ∎
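As a concrete sanity check, the u, l, C recurrences for the four rational operations (cases 1–4 above) can be transcribed and tested numerically. The sketch below is illustrative only: expression trees are encoded as nested tuples (our encoding, not the paper's), leaves are restricted to positive integers so that u, l ≥ 1 holds as in the footnote, and roots are omitted since their u, l involve fractional powers.

```python
import random

def bounds(expr):
    """Return (u, l, C) for an expression tree built from positive integers
    with +, -, *, / (cases 1-4 of the induction; roots omitted)."""
    if expr[0] == 'int':
        n = expr[1]
        return n, 1, max(2, abs(n))          # u(E)=N, l(E)=1, C(E)=max{2,|E|}
    op, e1, e2 = expr
    u1, l1, c1 = bounds(e1)
    u2, l2, c2 = bounds(e2)
    if op in '+-':
        return u1*l2 + l1*u2, l1*l2, (c1*c2)**2
    if op == '*':
        return u1*u2, l1*l2, c1*c2
    if op == '/':
        return u1*l2, l1*u2, c1*c2

def random_expr(depth):
    if depth == 0:
        return ('int', random.randint(1, 100))
    op = random.choice('+-*/')
    return (op, random_expr(depth - 1), random_expr(depth - 1))

# Check the proved invariant C(E) >= u(E) * l(E) on random expressions.
for _ in range(1000):
    u, l, c = bounds(random_expr(4))
    assert c >= u * l
```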

We can use this separation bound to analyze the only nontrivial expression-RAM operation on the real RAM:

Lemma 2.2.

Operation 1 can be implemented in O(b) time on a real RAM (without floor or ceiling).

Proof.

We start with the interval [l, u] = [−C(E)^{D(E)}, C(E)^{D(E)}], which by Theorem 2.1 contains the given real number E. Then we perform a binary search to shrink this interval. At each step, we compute the midpoint m = (l+u)/2 and compare m against E (both constant-time real-RAM operations), and set u or l to m according to whether m ≥ E or m ≤ E. Each step takes O(1) time, preserves E ∈ [l, u], and reduces the interval length u − l by a factor of 2. After B(E) = ⌈D(E) lg C(E)⌉ ≤ b steps, the interval length is ≤ 2. After another b + 1 steps, the interval length is ≤ 1/2^b. The total number of steps, and thus the running time, is O(b). ∎
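The binary search in this proof can be sketched directly. In the sketch below, the predicate `x_le` stands in for the real-RAM comparison of the midpoint against E; the target √2, the starting bound, and the helper names are ours, not the paper's.

```python
from fractions import Fraction

def refine(x_le, bound, b):
    """Shrink [-bound, bound], known to contain a real x, to length <= 2**-b,
    using only the predicate x_le(m) = (x <= m)."""
    l, u = Fraction(-bound), Fraction(bound)
    while u - l > Fraction(1, 2**b):
        m = (l + u) / 2
        if x_le(m):      # x <= m: keep the left half
            u = m
        else:            # x >= m: keep the right half
            l = m
    return l, u

# Target x = sqrt(2): x <= m iff m >= 0 and m*m >= 2.
l, u = refine(lambda m: m >= 0 and m*m >= 2, bound=2, b=20)
assert u - l <= Fraction(1, 2**20)
assert l*l <= 2 <= u*u   # the interval still brackets sqrt(2)
```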

Corollary 2.3.

Operations 1, 1, and 1 can be implemented on the real RAM in the time bounds stated in Table 1.

Proof.

Operation 1 actually performs the real-number computation (optionally in addition to constructing the DAG), making Operation 1 run in constant time. Operation 1 follows from Lemma 2.2. ∎

2.2.2 Recursive Cost Model

In the recursive cost model, the expression-RAM operations have the following time costs on the word RAM. Operation 1 runs in O(T(E,b)) time, Operation 1 runs in O(T(E)) time, and Operation 1 runs in O(T(E)²) time, where T(E) = T(E, B(E)) and T(E,b) is given by the following recurrence:

    T(E,b) =
      S(E,b)                                      if E is an integer expression,
      T(E₁, b+1) + T(E₂, b+1) + S(E,b)            if E = E₁ ± E₂,
      T(E₁, b+B(E)) + T(E₂, b+B(E)) + S(E,b)      if E = E₁·E₂,
      T(E₁, b+B(E)) + T(E₂, b+2B(E)) + S(E,b)     if E = E₁/E₂,
      T(E₁, 4db) + d·S(E,b)                       if E = ᵈ√E₁.    (8)

Here S(E,b) represents the maximum number of bits in the numerator or denominator of l or u in the interval [l,u] output by Operation 1 on input E and b; S(E,b) is given by another recurrence, with the same recursive structure as T(E,b) but differing in the additive terms:

    S(E,b) =
      B(E) = lg C(E) = b′                         if E is a b′-bit integer expression,
      S(E₁, b+1) + S(E₂, b+1) + 1                 if E = E₁ ± E₂,
      S(E₁, b+B(E)) + S(E₂, b+B(E))               if E = E₁·E₂,
      S(E₁, b+B(E)) + S(E₂, b+2B(E))              if E = E₁/E₂,
      S(E₁, 4db) + 6db                            if E = ᵈ√E₁.    (9)

To give some intuition about how S and T behave, we prove that their dependence on b is at most linear:

Lemma 2.4.

For a fixed expression E, T(E,b) is an affine function c·b + c′ of b, where both c and c′ are nonnegative integers. The same is true for S(E,b).

Proof.

Let b₀ represent the parameter b at the top level of the recurrence. The recurrences for T(E,b) and S(E,b) are all sums of terms. Expanding out all occurrences of T and S, we obtain a sum of terms, each of the form 1, b′, or 6db for positive integers b′, d (dependent only on E). In addition, some terms are multiplied by d (from the last case of T(E,b)). The 6db terms use a derived value of b, which is formed from b₀ by repeatedly adding 1, adding B(E), adding 2B(E), or multiplying by 4d. By induction, every b value is an affine function of b₀ with nonnegative integer coefficients, and thus so is each term 6db. Therefore the sum of terms resulting from expanding T(E,b₀) or S(E,b₀) is also such an affine function. ∎

Corollary 2.5.

For any constant c ≥ 1, T(E, c·b) ≤ c·T(E,b). The same is true for S(E,b).
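The recurrences (8) and (9) can be transcribed directly, which gives a numeric sanity check of Corollary 2.5. In the sketch below, C follows the rules established in this section (max{2, |N|} for integers, (C₁·C₂)² for ±, C₁·C₂ for · and /, C₁² for roots); the rules for D other than D(E₁ ± E₂) = D₁·D₂ are our assumptions, since the defining equation (3) is not reproduced here; the tuple encoding of expressions is also ours.

```python
import math

# Expression trees as tuples: ('int', bits) is a bits-bit integer;
# (op, e1, e2) for op in '+-*/'; ('root', d, e1) for the d-th root.
def C(e):
    if e[0] == 'int':  return max(2, 2**e[1])
    if e[0] == 'root': return C(e[2])**2
    if e[0] in '+-':   return (C(e[1]) * C(e[2]))**2
    return C(e[1]) * C(e[2])                    # '*' and '/'

def D(e):   # degree bound; rules beyond the +/- case are assumed
    if e[0] == 'int':  return 1
    if e[0] == 'root': return e[1] * D(e[2])
    return D(e[1]) * D(e[2])

def B(e):
    return math.ceil(D(e) * math.log2(C(e)))

def S(e, b):                                    # recurrence (9)
    if e[0] == 'int': return B(e)
    if e[0] in '+-':  return S(e[1], b+1) + S(e[2], b+1) + 1
    if e[0] == '*':   return S(e[1], b+B(e)) + S(e[2], b+B(e))
    if e[0] == '/':   return S(e[1], b+B(e)) + S(e[2], b+2*B(e))
    d = e[1];         return S(e[2], 4*d*b) + 6*d*b

def T(e, b):                                    # recurrence (8)
    if e[0] == 'int': return S(e, b)
    if e[0] in '+-':  return T(e[1], b+1) + T(e[2], b+1) + S(e, b)
    if e[0] == '*':   return T(e[1], b+B(e)) + T(e[2], b+B(e)) + S(e, b)
    if e[0] == '/':   return T(e[1], b+B(e)) + T(e[2], b+2*B(e)) + S(e, b)
    d = e[1];         return T(e[2], 4*d*b) + d * S(e, b)

e = ('/', ('root', 2, ('+', ('int', 8), ('int', 5))),
          ('*', ('int', 3), ('int', 4)))
b = B(e)
assert T(e, 2*b) <= 2 * T(e, b)   # Corollary 2.5 with c = 2
assert S(e, 2*b) <= 2 * S(e, b)
```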

We also analyze how quickly T(E) grows when E gains an operation:

Lemma 2.6.

For two expression DAGs E₁ and E₂, T(E₁ ± E₂) = O(T(E₁)·T(E₂)).

Proof.

First we show that B(E₁ ± E₂) = O(B(E₁)·B(E₂)):

    D(E₁ ± E₂) = D(E₁)·D(E₂)    [by (3)]
    C(E₁ ± E₂) = [C(E₁)·C(E₂)]²    [by (2)]
    lg C(E₁ ± E₂) = 2·(lg C(E₁) + lg C(E₂))
    D(E₁ ± E₂)·lg C(E₁ ± E₂) = 2 D(E₁)·D(E₂)·(lg C(E₁) + lg C(E₂))
                             ≤ 2 D(E₁)·D(E₂)·(2 lg C(E₁)·lg C(E₂))    [because C(Eᵢ) ≥ 2]
                             = 4 B(E₁)·B(E₂).

Define b = B(E₁ ± E₂) ≤ 4 B(E₁)·B(E₂). Now we compute T(E₁ ± E₂) = T(E₁ ± E₂, b) as follows:

    T(E₁ ± E₂, b) = T(E₁, b+1) + T(E₂, b+1) + S(E₁ ± E₂, b)    [by (8)]
                  = T(E₁, b+1) + T(E₂, b+1) + S(E₁, b+1) + S(E₂, b+1) + 1    [by (9)]
                  ≤ 2·T(E₁, b) + 2·T(E₂, b) + 2·S(E₁, b) + 2·S(E₂, b) + 1    [by Corollary 2.5 and b ≥ 1]
                  ≤ 2·T(E₁, b) + 2·T(E₂, b) + 2·T(E₁, b) + 2·T(E₂, b) + 1,

where the last inequality follows from T(E,b) ≥ S(E,b), by inspection of (8).

Therefore T(E₁ ± E₂) = T(E₁ ± E₂, b) = O(T(E₁, b) + T(E₂, b)). By Lemma 2.4, T(Eᵢ, b) = cᵢ·b + c′ᵢ for nonnegative integers cᵢ, c′ᵢ (which depend on Eᵢ). Thus we can bound the sum

    T(E₁, b) + T(E₂, b) = (c₁·b + c′₁) + (c₂·b + c′₂)
                        = (c₁ + c₂)·b + (c′₁ + c′₂)
                        ≤ (c₁ + c₂)·(4·B(E₁)·B(E₂)) + (c′₁ + c′₂)
                        ≤ 4·(c₁·B(E₁) + c′₁)·(c₂·B(E₂) + c′₂)
                        = 4·T(E₁, B(E₁))·T(E₂, B(E₂))
                        = 4·T(E₁)·T(E₂).

In conclusion, T(E₁ ± E₂) = O(T(E₁)·T(E₂)) as desired. ∎

Now that we have stated the target time bounds, we develop algorithms for the expression-RAM operations and prove that they achieve these bounds. First we need a lemma for working with arbitrary-precision integers and rationals.

Lemma 2.7.

In O(b) time on the word RAM, we can add, subtract, multiply, integer divide, and compare b-bit integers; and we can add, subtract, multiply, divide, and compare rationals represented as quotients of b-bit integers.

Proof.

Integer addition and subtraction can be implemented using the grade-school algorithms, working in base 2^w, in O(b/w) = O(b) time. Integer comparison follows from subtraction and checking the resulting sign. An O(b)-time algorithm for integer multiplication on a word RAM with w = Ω(lg b) goes back to Schönhage in 1980 [Sch80]; see also [Für14] and [Knu69, §4.3.3.C, p. 311]. Furthermore, integer division can be implemented by a series of integer multiplications of geometrically decreasing size [Knu69, §4.3.3.D, pp. 311–313], so it also runs in O(b) time on a word RAM.

Each rational arithmetic operation reduces to O(1)O(1) integer arithmetic operations via the grade-school identities:

    p/q + r/s = (p·s + r·q)/(q·s);   p/q − r/s = (p·s − r·q)/(q·s);   (p/q)·(r/s) = (p·r)/(q·s);   (p/q)/(r/s) = (p·s)/(q·r).

The sign of a rational number is the product of signs of its numerator and denominator, so we can support comparisons via subtraction. (We do not bother to reduce fractions, as that does not help in the worst case.) ∎
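A direct transcription of these identities on (numerator, denominator) pairs, without reducing fractions, looks as follows. This is an illustrative sketch in which Python's arbitrary-precision integers stand in for the word-RAM integer routines of Lemma 2.7; the helper names are ours.

```python
def add(a, b):      # (p, q) + (r, s) = (p*s + r*q, q*s)
    (p, q), (r, s) = a, b
    return (p*s + r*q, q*s)

def sub(a, b):      # (p, q) - (r, s) = (p*s - r*q, q*s)
    (p, q), (r, s) = a, b
    return (p*s - r*q, q*s)

def mul(a, b):      # (p, q) * (r, s) = (p*r, q*s)
    (p, q), (r, s) = a, b
    return (p*r, q*s)

def div(a, b):      # (p, q) / (r, s) = (p*s, q*r)
    (p, q), (r, s) = a, b
    return (p*s, q*r)

def le(a, b):
    """a <= b, via the sign of a - b: the sign of a rational is the
    product of the signs of its numerator and denominator."""
    p, q = sub(a, b)
    return p * q <= 0

# 1/2 + 1/3 = 5/6 (unreduced), and 5/6 <= 9/10.
x = add((1, 2), (1, 3))
assert x == (5, 6)
assert le(x, (9, 10))
```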

As the proof shows, the cost of integer addition, subtraction, and comparison can be tightened to O(b/w) time. Recently, b-bit integer multiplication was shown to be possible in O(b lg b) bit operations [HvdH19], but to our knowledge it remains open whether o(b) time (ideally, O(b/w) time) is achievable on the word RAM. Thus we opt for the simpler universal upper bound of O(b) for all operations, though perhaps this could be lowered in the future.

Now we implement the core expression-RAM operation, Operation 1. Operations 1 and 1 will be implemented on top of this operation.

Theorem 2.8.

Operation 1 can be implemented on a word RAM in O(T(E,b)) time.

Proof.

We show how to ε-compute E, where ε = 1/2^b and b ≥ B(E) ≥ 1, recursively in cases. Roughly speaking, each expression asks its child expressions for increased precision b′ so that, when we combine the intervals from the children, we meet the desired interval-length bound at the parent. Along the way, we keep track of how many bits S are needed to represent the intervals themselves. We apply standard identities for interval arithmetic, specifically addition, subtraction, and multiplication; see [Gib61] and [Moo79, §2.2].

More formally, we prove by induction on the number of nodes in the input expression E that Operation 1 (1) runs in O(T(E,b)) time, and (2) produces an interval of rational numbers whose numerators and denominators each have at most S(E,b) bits (ignoring the sign bit). Assume by induction that smaller expressions satisfy these two induction hypotheses.

Addition and subtraction.

To ε-compute E = E₁ ∘ E₂ where ∘ ∈ {+, −}, we first recursively ε₁-compute E₁ and E₂ where ε₁ = ε/2. In other words, the recursive calls are with b′ = b + 1. Call the resulting intervals [l₁, u₁] and [l₂, u₂]. Then [l₁ + l₂, u₁ + u₂] (for addition) or [l₁ − u₂, u₁ − l₂] (for subtraction) is an ε-computation of E = E₁ ∘ E₂: each interval contains E, and each has length

    (u₁ − l₁) + (u₂ − l₂) ≤ ε₁ + ε₁ = 2ε₁ = ε.

We can compute this interval of rationals using Lemma 2.7. By induction, the numerators and denominators of lᵢ and uᵢ have at most S(Eᵢ, b′) bits, so the endpoints of the new interval have at most

    S(E₁, b′) + S(E₂, b′) + 1 = S(E₁, b+1) + S(E₂, b+1) + 1 = S(E,b)

bits, and computing them takes O(S(E,b)) time.
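Standard interval addition and subtraction [Gib61, Moo79] can be sketched as follows; note that subtraction swaps the endpoints of the second interval to preserve the enclosure. The Fraction endpoints and helper names are ours.

```python
from fractions import Fraction as F

def iadd(i1, i2):
    (l1, u1), (l2, u2) = i1, i2
    return (l1 + l2, u1 + u2)

def isub(i1, i2):
    (l1, u1), (l2, u2) = i1, i2
    return (l1 - u2, u1 - l2)   # endpoint swap: x - y is smallest at (l1, u2)

# E1 = 1/3 in [0.33, 0.34], E2 = 2/3 in [0.66, 0.67]; E1 - E2 = -1/3.
i1 = (F(33, 100), F(34, 100))
i2 = (F(66, 100), F(67, 100))
l, u = isub(i1, i2)
assert l <= F(-1, 3) <= u
assert u - l == (i1[1] - i1[0]) + (i2[1] - i2[0])   # interval lengths add
```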

Multiplication.

To ε-compute E = E₁·E₂, we first recursively ε₁-compute E₁ and E₂ where ε₁ = ε/C(E)^{D(E)}. In other words, the recursive call is with b′ = b + B(E). Call the resulting intervals [l₁, u₁] and [l₂, u₂]. To properly handle signs, we take all pairwise products and take their min and max [Gib61, Moo79]: [min{l₁·l₂, l₁·u₂, u₁·l₂, u₁·u₂}, max{l₁·l₂, l₁·u₂, u₁·l₂, u₁·u₂}] contains the value of E = E₁·E₂. In fact, we claim it is an ε-computation of E.

There are 16 = 4² cases for the choices of max and min. To enumerate the cases, we introduce some notation: for x ∈ {l, u}, define x̄ to be the other of l and u, i.e., x̄ ∈ {l, u} ∖ {x}; and for i ∈ {1, 2}, define ī = 3 − i ∈ {1, 2} ∖ {i}. Note that all terms in the min and max are of the form x₁·y₂, or by symmetry and commutativity, of the form xᵢ·y_ī. We split the cases into groups according to how many of l₁, l₂, u₁, u₂ appear.

  1. Exactly two of l₁, l₂, u₁, u₂ appear. These four of the sixteen cases have the form [x₁·y₂, x₁·y₂] where x, y ∈ {l, u}. Then the max equals the min, so we get a 0-computation.

  2. Exactly three of l₁, l₂, u₁, u₂ appear. These eight of the sixteen cases have the form [xᵢ·y_ī, xᵢ·ȳ_ī] where x, y ∈ {l, u} and i ∈ {1, 2}. (Only four of these cases have positive length.) Then the length of the interval can be bounded as follows:

     |xᵢ·y_ī − xᵢ·ȳ_ī| = |xᵢ·(y_ī − ȳ_ī)|
                       ≤ |xᵢ|·|y_ī − ȳ_ī|
                       = |xᵢ|·(u_ī − l_ī)
                       ≤ C(Eᵢ)^{D(Eᵢ)}·ε₁
                       ≤ C(E)^{D(E)}·ε₁
                       = ε.
  3. All four of l₁, l₂, u₁, u₂ appear. These four of the sixteen cases have the form [x₁·y₂, x̄₁·ȳ₂] where x, y ∈ {l, u}. (Only two of these cases have positive length.) Then the length of the interval can be bounded as follows:

     |x₁·y₂ − x̄₁·ȳ₂| = |x₁·y₂ − x₁·ȳ₂ + x₁·ȳ₂ − x̄₁·ȳ₂|
                     = |x₁·(y₂ − ȳ₂) + (x₁ − x̄₁)·ȳ₂|
                     ≤ |x₁|·|y₂ − ȳ₂| + |x₁ − x̄₁|·|ȳ₂|
                     = |x₁|·(u₂ − l₂) + (u₁ − l₁)·|ȳ₂|
                     ≤ |x₁|·ε₁ + ε₁·|ȳ₂|
                     = ε₁·(|x₁| + |ȳ₂|)
                     ≤ ε₁·(C(E₁)^{D(E₁)} + C(E₂)^{D(E₂)})
                     ≤ ε₁·(C(E₁)^{D(E₁)}·C(E₂)^{D(E₂)})    [because C(E₁), C(E₂) ≥ 2]
                     ≤ ε₁·C(E)^{D(E)}
                     = ε.

We can compute the interval in O(S(E,b)) time using O(1) rational multiplications and comparisons via Lemma 2.7. By induction, the numerators and denominators of lᵢ and uᵢ have at most S(Eᵢ, b′) bits, so the number of bits in the products l₁·l₂, l₁·u₂, u₁·l₂, u₁·u₂ is at most

    S(E₁, b′) + S(E₂, b′) = S(E₁, b+B(E)) + S(E₂, b+B(E)) = S(E,b).
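The sign-safe product rule (min and max over the four endpoint products, as in [Gib61, Moo79]) is short in code; this is an illustrative sketch with our own helper names.

```python
from fractions import Fraction as F

def imul(i1, i2):
    """Interval product: min and max over all four endpoint products,
    which handles every sign combination uniformly."""
    (l1, u1), (l2, u2) = i1, i2
    products = (l1*l2, l1*u2, u1*l2, u1*u2)
    return (min(products), max(products))

# Mixed signs: [-2, 3] * [4, 5] = [-10, 15].
assert imul((F(-2), F(3)), (F(4), F(5))) == (F(-10), F(15))

# Enclosure: if x in [l1,u1] and y in [l2,u2], then x*y lies in the product.
l, u = imul((F(-1, 2), F(1, 2)), (F(-1, 3), F(2, 3)))
assert l <= F(-1, 2) * F(2, 3) <= u
```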
Division.

To ε-compute E = E₁/E₂, we reduce to multiplying E₁ and 1/E₂, as just analyzed, and computing 1/E₂. To ε-compute E = 1/E₂, we first recursively ε₂-compute E₂ where ε₂ = ε/C(E)^{D(E)}. In other words, the recursive call is with b′ = b + B(E). Call the resulting interval [l₂, u₂]. If l₂ ≤ 0 ≤ u₂, then

    |E₂| ≤ u₂ − l₂ ≤ 1/2^{b′} ≤ 1/(2^b·C(E)^{D(E)}) < 1/C(E)^{D(E)}    [because b ≥ 1],

so E₂ = 0 by Theorem 2.1; thus we return “undefined” to indicate division by zero. Otherwise, [1/u₂, 1/l₂] is an ε-computation of E = 1/E₂:

    1/l₂ − 1/u₂ = (u₂ − l₂)/(l₂·u₂)
                ≤ ε₂·(1/l₂)·(1/u₂)
                ≤ ε₂·(C(E₂)^{D(E₂)})²
                ≤ ε₂·C(E)^{D(E)}
                = ε.

We can compute each reciprocal 1/u₂, 1/l₂ in O(1) time by swapping the numerator and denominator, which preserves the number of bits S(E₂, b′). When we multiply with E₁, we add B(E) to the recursive b′ (a second increment for E₂, and a first for E₁), and the resulting number of bits is S(E₁, b+B(E)) + S(E₂, b+2B(E)) = S(E,b). The total running time is O(S(E,b)) beyond the recursive calls, as claimed.
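The reciprocal step, including the zero test on the interval, can be sketched as follows; `None` stands in for the paper's “undefined”, and the helper name is ours.

```python
from fractions import Fraction as F

def irecip(i):
    """Reciprocal of an interval of rationals."""
    l, u = i
    if l <= 0 <= u:
        return None            # interval straddles 0: report "undefined"
    return (1 / u, 1 / l)      # swap numerator and denominator at each end

assert irecip((F(-1), F(1))) is None
assert irecip((F(1, 3), F(1, 2))) == (F(2), F(3))   # 1/[1/3, 1/2] = [2, 3]
```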

Roots.

To ε-compute E = ᵈ√E₁, we first recursively ε₁-compute E₁ where ε₁ = 4(ε/8)^d. In other words, the recursive call is with b′ = d(b+3) − 2, which we round up to 4db for cleaner formulas. Call the resulting interval [l₁, u₁]. If l₁ < 0 and u₁ ≤ 0, then E₁ ≤ 0, so we return “undefined” to indicate a negative square root. If l₁ < 0 < u₁, then

    |E₁| ≤ u₁ − l₁ ≤ 1/2^{b′} < 1/2^b ≤ 1/C(E)^{D(E)}    [because b ≥ B(E)],

so E₁ = 0 by Theorem 2.1; thus, if d is even, we return “undefined” to indicate a negative square root.

Otherwise, we can compute the floor and ceiling of the d-th root of a b-bit integer in O(d·b) time [Zim99, BMZ02]; see [BZ10, §1.5.1]. To compensate for the error to be introduced, we first scale by 1/ε₂^d where ε₂ = ε/8, resulting in the interval [l₁/ε₂^d, u₁/ε₂^d], which is an (ε₁/ε₂^d)-computation of E₁/ε₂^d.⁹ Next we round to the containing integer interval [⌊l₁/ε₂^d⌋, ⌈u₁/ε₂^d⌉], which adds an additive error of at most 2, so is an (ε₁/ε₂^d + 2)-computation of E₁/ε₂^d. Now we apply the d-th-root algorithm to both ends of the integer interval, rounding down and up respectively, to obtain [⌊ᵈ√⌊l₁/ε₂^d⌋⌋, ⌈ᵈ√⌈u₁/ε₂^d⌉⌉], which is an (ε₁/ε₂^d + 4)-computation of ᵈ√(E₁/ε₂^d). Finally, we undo the scaling by multiplying both ends of this integer interval by ε₂, and return [⌊ᵈ√⌊l₁/ε₂^d⌋⌋·ε₂, ⌈ᵈ√⌈u₁/ε₂^d⌉⌉·ε₂], which is an

(⁹ Potentially, the error could be improved to avoid the power of d. We leave this improvement, which would remove C(E)’s dependence on d, as an open problem.)

    (ε₁/ε₂^d + 4)·ε₂ = ε₂^{1−d}·ε₁ + 4ε₂
                     = (ε/8)^{1−d}·4(ε/8)^d + ½ε
                     = (ε/8)·4 + ½ε
                     = ½ε + ½ε
                     = ε

-computation of ε₂·ᵈ√(E₁/ε₂^d) = ᵈ√E₁ = E. We can compute this interval using O(1) rational multiplications via Lemma 2.7 and the integer d-th-root algorithm, in O(S(E₁, b′)·d) time. The resulting number of bits S(E,b) grows by lg(1/ε₂^d) = d·lg(8/ε) = d(b+3) from the initial division by ε₂^d; only decreases when we take the integer d-th root; and grows by lg(1/ε₂) = b+3 when we multiply by ε₂. Thus S(E,b) = S(E₁, 4db) + (d+1)(b+3) ≤ S(E₁, 4db) + 6db because d ≥ 2 and b ≥ 1. ∎
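The integer d-th-root primitive used above can be sketched with a plain binary search; the cited algorithms [Zim99, BMZ02] are faster, so this illustrative version only shows the floor/ceiling behavior, with helper names of our choosing.

```python
def iroot_floor(n, d):
    """Largest r with r**d <= n, for integers n >= 0 and d >= 1."""
    if n == 0:
        return 0
    lo, hi = 0, 1
    while hi**d <= n:            # find a strict upper bound by doubling
        hi *= 2
    while lo < hi - 1:           # invariant: lo**d <= n < hi**d
        mid = (lo + hi) // 2
        if mid**d <= n:
            lo = mid
        else:
            hi = mid
    return lo

def iroot_ceil(n, d):
    r = iroot_floor(n, d)
    return r if r**d == n else r + 1

assert iroot_floor(26, 3) == 2 and iroot_ceil(26, 3) == 3
assert iroot_floor(27, 3) == 3 and iroot_ceil(27, 3) == 3
assert iroot_floor(10**12, 2) == 10**6
```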

Theorem 2.9.

Operations 1 and 1 can be implemented on a word RAM in O(T(E)) and O(T(E)²) time, respectively.

Proof.

Given an expression E, we apply Operation 1 to E with b = 1 + B(E), which by Theorem 2.8 can be performed in O(T(E,b)) time. By Corollary 2.5, T(E,b) ≤ 2·T(E, B(E)) = 2·T(E) because B(E) ≥ 1. Thus we obtain a rational interval [l,u] of length u − l ≤ 1/2^b that contains the value of E.

First we show how to compute the sign of E in O(1) additional time given this interval. If 0 ∈ [l,u] (i.e., l ≤ 0 ≤ u), then |l|, |u| ≤ 1/2^{1+B(E)} ≤ ½·C(E)^{−D(E)}, so by Theorem 2.1, E must in fact be zero. Otherwise, we have either 0 < l ≤ u, in which case E must be positive; or l ≤ u < 0, in which case E must be negative. We can compute the signs of l and u in constant time by Lemma 2.7.

Second we show how to compute the floor and ceiling, by reducing to a sign computation. Because b ≥ 1, the interval length u − l ≤ ½, so [l,u] contains at most one integer. We can compute ⌊l⌋ and ⌈u⌉ using the integer division of Lemma 2.7. By measuring the length of the expanded interval [⌊l⌋, ⌈u⌉], we determine whether [l,u] contains an integer (the expanded interval has length 2) or not (the expanded interval has length 1). If [l,u] does not contain an integer, i.e., i < l ≤ u < i+1 for the integer i = ⌊l⌋, then ⌊E⌋ = i and ⌈E⌉ = i+1. If [l,u] contains an integer i = ⌊l⌋ + 1 = ⌈u⌉ − 1, then we compute the sign of E − i using the algorithm above, which costs an additional O(T(E − i)) time. If E − i is zero, then ⌊E⌋ = ⌈E⌉ = i; if E − i is positive, then ⌊E⌋ = i and ⌈E⌉ = i+1; and if E − i is negative, then ⌊E⌋ = i − 1 and ⌈E⌉ = i. Thus we obtain the floor and ceiling of E in all cases.
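The floor/ceiling case analysis can be transcribed as follows. This is a sketch: `sign_of` is an assumed callback standing in for the sign computation on E − i, and the encoding of the cases is ours.

```python
import math
from fractions import Fraction as F

def floor_ceil(l, u, sign_of):
    """Floor and ceiling of a real E, given a rational interval [l, u]
    containing E with u - l <= 1/2, and sign_of(i) = sign of E - i."""
    fl, cu = math.floor(l), math.ceil(u)
    if cu - fl == 1:                 # expanded interval has length 1:
        return fl, fl + 1            # no integer inside, so floor(E) = fl
    i = fl + 1                       # the unique integer in [l, u]
    s = sign_of(i)
    if s == 0:
        return i, i                  # E is exactly the integer i
    return (i, i + 1) if s > 0 else (i - 1, i)

# E = 3.5 with interval [3.4, 3.6]: no integer inside, sign never needed.
assert floor_ceil(F(34, 10), F(36, 10), lambda i: 1) == (3, 4)
# E = 3 exactly with interval [2.9, 3.1]: sign of E - 3 is 0.
assert floor_ceil(F(29, 10), F(31, 10), lambda i: 0) == (3, 3)
```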

It remains to show that T(E − i) = O(T(E)²), which we do by applying Lemma 2.6. We have T(i) = S(i) = lg C(i) = max{1, lg|i|} = O(S(E)), where the upper bound follows because lg|i| is at most the number of bits in an integer in a rational approximation of E ≈ i, which is at most S(E). And S(E) ≤ T(E) by inspection of (8). Thus T(i) = O(T(E)), so T(E − i) = O(T(E)·T(i)) = O(T(E)²). ∎

Corollary 2.10.

Operations 1, 1, and 1 can be implemented on the word RAM in the time bounds given by the recursive cost model in Table 1.

Corollary 2.11.

We can compare whether E₁ ≤ E₂ for two expression DAGs E₁ and E₂ in O(T(E₁)·T(E₂)) time.

Proof.

To compare E₁ and E₂, we apply Operation 1 from Theorem 2.9 to evaluate the sign of E₁ − E₂ in O(T(E₁ − E₂)) time. By Lemma 2.6, T(E₁ − E₂) = O(T(E₁)·T(E₂)). ∎

2.2.3 Simple Cost Model

Our second cost model is easier to use, but in general may be weaker. It focuses on five key properties of an expression:

  1. The height H(E) of the expression DAG E, i.e., the length of the longest directed path, where integer expressions have height 0.

  2. The number of nodes #(E) of the expression DAG E. In particular, #(E) < 2^{H(E)+1}.

  3. The root set R(E) of the expression DAG E, i.e., the set of distinct roots taken (such as √2 and ³√5).

  4. The maximum root degree Δ(E) = max{k | ᵏ√x ∈ R(E)}. In our algorithm (and many computational geometry algorithms), Δ(E) = 2, meaning we take only square roots.

  5. The maximum number b*(E) of bits in the integer leaves of the expression DAG E.

The main challenge in obtaining a running time dependent only on the number |R(E)| of distinct roots is in dealing with multiple radical nodes of equal value. First, in Theorem 2.12, we ignore this issue, and solve the T(E) recurrence in terms of D(E). Next, in Lemma 2.13, we show how to modify expressions to remove equal-value radical nodes. Finally, in Theorem 2.14, we combine these tools to derive the simple cost model.

Theorem 2.12.

The running time T(E,b) for Operation 1 satisfies

    T(E,b) = O(b·(16·Δ(E))^{H(E)+2}).

The running time T(E) for Operation 1 satisfies

    T(E) = O(D(E)·b*(E)·(32·Δ(E))^{H(E)+2}).
Proof.

First observe that the number k of nodes in the expression DAG E is at most 2^{H(E)}.

Next we prove lg C(E) = O(k·b*(E)) = O(2^{H(E)}·b*(E)) and lg C(E) ≥ b*(E). Taking the lg of (2), we obtain the recurrence:

    lg C(E) =
      max{1, lg|E|}                  if E is an integer expression,
      lg C(E₁) + lg C(E₂) + 1        if E = E₁ ∘ E₂ for some operator ∘ ∈ {+, −, ×, ÷},
      lg C(E₁) + 1                   if E = ᵈ√E₁.

The two recursive cases (second and third) have additive terms that simply count the number of nonleaf nodes in E. The base case (first) is at most b*(E), and in at least one case is exactly b*(E). Thus lg C(E) is at least b*(E), and is at most the number kₙ of nonleaf nodes plus b*(E) times the number k_l of leaf nodes, which is at most k·b*(E).

By (1), B(E) = ⌈D(E) lg C(E)⌉ = O(D(E)·2^{H(E)}·b*(E)).

Next we analyze the growth of the b parameter in the S(E,b) and T(E,b) recurrences of (9) and (8), which follow the same pattern in terms of how b grows down the recursion. Each radical node in E recurses with b′ = 4db, thus multiplying the input b by a factor of 4d ≤ 4·Δ(E). Each nonradical node in E recurses with b′ ≤ b + 2B(E) ≤ 3b because b ≥ B(E), thus multiplying the input b by a factor of at most 3 ≤ 4·Δ(E). Thus each recursive level increases b by a factor of at most 4·Δ(E). Let b₀ represent the parameter b at the top level of the recurrence, i.e., in the call to Operation 1. Then after H(E) levels, we have multiplied b by a factor of at most (4·Δ(E))^{H(E)}. Thus we can assume throughout the S(E,b) and T(E,b) recursions that b ≤ b₀·(4·Δ(E))^{H(E)}.

Now we compute S(E,b) via the recurrence (9). The number of recursive calls to S(E,b) is at most 2^{H(E)}. Each recursive call has an additive term of b′ ≤ b*(E) ≤ lg C(E) ≤ B(E) ≤ b₀ in the base case, and at most 6db ≤ 6·Δ(E)·b₀·(4·Δ(E))^{H(E)} = O(b₀·(4·Δ(E))^{H(E)+1}) in the recursive case (dominated by the radical case). Multiplying, the total cost is at most 2^{H(E)}·O(b₀·(4·Δ(E))^{H(E)+1}) = O(b₀·(8·Δ(E))^{H(E)+1}).

Finally we compute T(E,b) via the recurrence (8). The number of recursive calls to T(E,b) is at most 2^{H(E)}. Each recursive call has an additive term of at most d·S(E,b) ≤ Δ(E)·O(b₀·(8·Δ(E))^{H(E)+1}). Multiplying, the total cost is at most 2^{H(E)}·Δ(E)·O(b₀·(8·Δ(E))^{H(E)+1}) = O(b₀·(16·Δ(E))^{H(E)+2}). We obtain the desired upper bound on T(E) = T(E, B(E)) by substituting our previous upper bound B(E) = O(D(E)·2^{H(E)}·b*(E)). ∎

Define a uniquification of expression E to be an expression E′ satisfying E′ = E, #(E′) ≤ #(E), H(E′) ≤ H(E), Δ(E′) = Δ(E), b*(E′) = b*(E), R(E′) = R(E), and the number of radical nodes in E′ is the number |R(E)| = |R(E′)| of distinct roots in E (i.e., all roots in E′ have distinct values). Define a near-uniquification in the same way, but allowing the number of radical nodes in E′ to be at most 1 larger (i.e., at most |R(E)| + 1 = |R(E′)| + 1).

Lemma 2.13.

Given an expression DAG EE, we can compute a uniquification EE^{\prime} of EE in

O(Σ_{i=1}^{#(E)} Σ_{j=1}^{#(E)} T(E_i) · T(E_j))

time, where each EiE_{i} is a near-uniquification of a subexpression of EE.

Proof.

First we define a partial order on pairs of radical nodes of EE. Let \mathcal{R} be the set of radical nodes in EE. Define a partial order \preceq on \mathcal{R} by xyx\preceq y when x=yx=y or xx is a descendant of yy in EE. This partial order induces a partial order on the cross product ×\mathcal{R}\times\mathcal{R} by (x,x)(y,y)(x,x^{\prime})\preceq(y,y^{\prime}) when xyx\preceq y and xyx^{\prime}\preceq y^{\prime}.

Our algorithm compares all pairs (x,x)(x,x^{\prime}) of distinct radical nodes in \mathcal{R} in a linearization of \preceq (found via topological sort), removing any duplicate (equal-value) nodes by combining those nodes. Thus, when we visit a pair (x,x)(x,x^{\prime}), we know that the expression DAG ExE_{x} with source xx and the expression DAG ExE_{x^{\prime}} with source xx^{\prime} have already had all of their radical nodes pairwise compared and deduplicated, except for the single pair (x,x)(x,x^{\prime}). In particular, ExE_{x} and ExE_{x^{\prime}} are near-uniquifications of the original expression DAGs with source nodes xx and xx^{\prime} respectively. We apply Corollary 2.11 to compare whether Ex=ExE_{x}=E_{x^{\prime}}. If so, we have detected a duplicate root, and we remove it as follows. If xx^{\prime} is a descendant of xx, then exchange xx and xx^{\prime}, so that xx and xx^{\prime} are either incomparable or xx is a descendant of xx^{\prime}. Now replace in EE all references to xx^{\prime} by references to xx (which by the previous exchange will preserve acyclicity), and then remove xx^{\prime} from the DAG EE. As a special case, if xx^{\prime} is the source node of the expression DAG EE, then we replace the entire DAG with ExE_{x}. The new ExE_{x} is a uniquification of a subexpression of EE (the original expression DAG with source xx). Repeating this process, we remove all duplicate roots from EE, while preserving the values of EE, Δ(E)\Delta(E), b(E)b^{*}(E), and R(E)R(E), and only decreasing #(E)\#(E) and H(E)H(E), resulting in a uniquification of EE.

Finally we analyze the running time of this algorithm. For each of (||2)#(E)2{|\mathcal{R}|\choose 2}\leq\#(E)^{2} pairs of distinct radical nodes (x,x)(x,x^{\prime}), we pay O(T(Ex)T(Ex))O\big{(}T(E_{x})\cdot T(E_{x^{\prime}})\big{)} time by Corollary 2.11, where ExE_{x} and ExE_{x^{\prime}} are near-uniquifications of subexpressions of EE. Therefore the claimed time bound holds. ∎
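The deduplication loop at the heart of this proof can be sketched as follows. This is a minimal illustration, not the paper's implementation: `equal` is a hypothetical oracle standing in for the Corollary 2.11 comparison, and the DAG bookkeeping (rewiring parent references and shrinking the DAG) is elided.

```python
from itertools import combinations

def uniquify(radicals, equal):
    """Remove duplicate-valued radical nodes, visiting pairs in an order
    compatible with the descendant partial order.

    radicals: radical-node ids listed so that descendants come first
              (a linearization of the partial order from the proof)
    equal:    oracle deciding whether two radical nodes have equal values
              (standing in for Corollary 2.11)
    Returns the surviving nodes and a map sending each removed node to the
    equal-valued node that replaces it.
    """
    alive = set(radicals)
    redirect = {}
    for x, y in combinations(radicals, 2):
        # x precedes y in the descendants-first order, so x is not an
        # ancestor of y; replacing references to y by x preserves acyclicity
        if x in alive and y in alive and equal(x, y):
            redirect[y] = x
            alive.discard(y)
    return alive, redirect
```

A usage example: with three radicals where nodes 0 and 2 happen to have equal values, the later (shallower) node 2 is redirected to node 0.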

Theorem 2.14.

Operations 1 and 1 can be implemented in

O(#(E)² · b*(E)² · (32 · Δ(E))^{2|R(E)| + 2H(E) + 6}) = O(b*(E)² · (64 · Δ(E))^{2|R(E)| + 2H(E) + 6})

time.

Proof.

We implement each operation by first removing all duplicate roots from EE via Lemma 2.13, resulting in a uniquification EE^{\prime} without duplicate roots, and second by running the appropriate algorithm from Theorem 2.9 on input EE^{\prime}.

By Lemma 2.13, the running time for the first step is given by

O(Σ_{i=1}^{#(E)} Σ_{j=1}^{#(E)} T(E_i) · T(E_j)).

By Theorem 2.12, each T(Ei)T(E_{i}) satisfies

T(E_i) = O(D(E_i) · b*(E_i) · (32 · Δ(E_i))^{H(E_i) + 2}).

Because each EiE_{i} is a near-uniquification, it has at most |R(Ei)|+1|R(E_{i})|+1 radical nodes, so D(Ei)Δ(Ei)|R(Ei)|+1D(E_{i})\leq\Delta(E_{i})^{|R(E_{i})|+1}. And because each EiE_{i} is a near-uniquification of a subexpression of EE, we have b(Ei)b(E)b^{*}(E_{i})\leq b^{*}(E), Δ(Ei)Δ(E)\Delta(E_{i})\leq\Delta(E), |R(Ei)||R(E)||R(E_{i})|\leq|R(E)|, and H(Ei)H(E)H(E_{i})\leq H(E). Thus

T(E_i) = O(Δ(E)^{|R(E)| + 1} · b*(E) · (32 · Δ(E))^{H(E) + 2}) = O(b*(E) · (32 · Δ(E))^{|R(E)| + H(E) + 3}),

so the running time of the first step is

O(Σ_{i=1}^{#(E)} Σ_{j=1}^{#(E)} (b*(E) · (32 · Δ(E))^{|R(E)| + H(E) + 3})²) = O(#(E)² · b*(E)² · (32 · Δ(E))^{2|R(E)| + 2H(E) + 6}).

Because #(E)<2H(E)+1\#(E)<2^{H(E)+1}, we have #(E)2<22H(E)+2\#(E)^{2}<2^{2\cdot H(E)+2}, whose exponent is smaller than 2|R(E)|+2H(E)+62|R(E)|+2H(E)+6, so we can absorb #(E)2\#(E)^{2} by increasing the base by a factor of 22.

By Theorem 2.9 on input EE^{\prime}, the running time for the second step is O(T(E))O(T(E^{\prime})) for Operation 1 and O(T(E)2)O(T(E^{\prime})^{2}) for Operation 1. Because EE^{\prime} is a uniquification of EE, |R(E)||R(E^{\prime})| is the number of radical nodes in EE^{\prime}, so D(E)Δ(E)|R(E)|=Δ(E)|R(E)|D(E^{\prime})\leq\Delta(E^{\prime})^{|R(E^{\prime})|}=\Delta(E)^{|R(E)|}. By Theorem 2.12,

T(E′) = O(Δ(E)^{|R(E)|} · b*(E) · (32 · Δ(E))^{H(E) + 2}).

Therefore the running time is dominated by the first step. ∎

The same idea can be used to “speed up” Operation 1 to require only bB(E)b\geq B(E^{\prime}) instead of bB(E)b\geq B(E). Then the running time would be the sum of the costs from Theorem 2.14 (for the first step) and Theorem 2.12 (for the second step).

2.2.4 Special Cases

Now we look at some restricted expressions where the simple cost model is particularly simple. Our first restriction is to O(1)-size expressions, where real operations run as fast as integer operations. (An earlier version of this paper [DHK20] called this model the O(1)-expression RAM, and we suspect it is useful in many computational geometry algorithms; sadly, it will not suffice for our quasigeodesics algorithm.)

Corollary 2.15.

If #(E)=O(1)\#(E)=O(1) and Δ(E)=O(1)\Delta(E)=O(1), then T(E)=O(b(E))T(E)=O(b^{*}(E)).

Proof.

By Theorem 2.12, H(E)#(E)H(E)\leq\#(E), and D(E)Δ(E)2H(E)=O(1)D(E)\leq\Delta(E)^{2^{H(E)}}=O(1). ∎

Next we consider expressions of logarithmic height, where the running time for real operations is polynomial except for an exponential dependence on the number of distinct roots. This result is what we will use in our quasigeodesics algorithm.

Corollary 2.16.

If H(E)=O(logn)H(E)=O(\log n), b(E)=nO(1)b^{*}(E)=n^{O(1)}, and Δ(E)=O(1)\Delta(E)=O(1), then Operations 1 and 1 can be implemented in nO(1)2O(|R(E)|)n^{O(1)}\cdot 2^{O(|R(E)|)} time.

Proof.

By Theorem 2.14 and #(E)2H(E)\#(E)\leq 2^{H(E)}. ∎

Finally, we prove a general exponential bound on the running time for real operations:

Corollary 2.17.

If #(E) = O(n), b*(E) = 2^{O(n)}, and Δ(E) = O(1), then Operations 1 and 1 can be implemented in 2^{O(n)} time.

Proof.

By Theorem 2.14, H(E)#(E)H(E)\leq\#(E), and |R(E)|#(E)|R(E)|\leq\#(E). ∎

Corollary 2.18.

Any algorithm running in 2O(n)2^{O(n)} time on the real RAM, where all radical operations are of O(1)O(1) degree, can be run in 2O(n)2^{O(n)} time on the word RAM.

2.2.5 Application to Multi-Expression Objects

We provide some notation to more easily express the simple cost model for algorithms whose input is not one expression but an “object” MM represented by multiple expressions. Let (M)\mathcal{E}(M) be the set of expressions representing MM. First we define parameters H(M)H(M), b(M)b^{*}(M), Δ(M)\Delta(M) and R(M)R(M) as follows:

H(M) = max_{E ∈ ℰ(M)} H(E),   b*(M) = max_{E ∈ ℰ(M)} b*(E),   Δ(M) = max_{E ∈ ℰ(M)} Δ(E),   R(M) = ⋃_{E ∈ ℰ(M)} R(E).

We also define Λ(M) to be the set of edge lengths of M (when that makes sense); under the Euclidean metric, these lengths are square-root expressions.

Now we define T(M,h)T(M,h) to be the simple-cost-model running time of Theorem 2.14 according to these parameters, assuming root set R(M)Λ(M)R(M)\cup\Lambda(M) and an added expression height of hh (defaulting to 0):

T(M, h) = O(b*(M)² · (64 · max{2, Δ(M)})^{2|R(M) ∪ Λ(M)| + 2(H(M) + h) + 6}),   (10)

T(M) = T(M, 0).

We treat a vector (M1,M2,,Mk)(M_{1},M_{2},\ldots,M_{k}) of objects as itself an object, whose expression set \mathcal{E} is i=1k(Mi)\bigcup_{i=1}^{k}\mathcal{E}(M_{i}).

2.3 Associative Operations

Lemma 2.19.

Consider an associative operator \circ over dd-dimensional real vectors, defined by dd expressions (forming ()\mathcal{E}(\circ)) whose leaves can also include one of the 2d2d input variables. Given dd-dimensional vectors x1,x2,,xnx_{1},x_{2},\dots,x_{n}, we can compute in O(n)O(n) time a representation of x1x2xnx_{1}\circ x_{2}\circ\cdots\circ x_{n} as a dd-dimensional vector FF satisfying

H(F) ≤ ⌈lg n⌉ · H(∘) + max_{k=1}^{n} H(x_k),   b*(F) ≤ max{b*(∘), max_{k=1}^{n} b*(x_k)},

R(F) ⊆ R(∘) ∪ ⋃_{k=1}^{n} R(x_k),   Δ(F) ≤ max{Δ(∘), max_{k=1}^{n} Δ(x_k)}.

In particular, T(x1x2xn)T((x1,x2,,xn),lgn)T(x_{1}\circ x_{2}\circ\cdots\circ x_{n})\leq T\big{(}(x_{1},x_{2},\dots,x_{n}),\lceil\lg n\rceil\big{)}.

Proof.

At a high level, we compute x1x2xnx_{1}\circ x_{2}\circ\cdots\circ x_{n} according to a complete rooted ordered binary tree TT, with one leaf for each xix_{i} in order and where each internal node computes \circ. This tree has height lgn\lceil\lg n\rceil. To transform this vector expression into real expressions, we make dd copies T1,T2,,TdT_{1},T_{2},\dots,T_{d} of TT. In TiT_{i}, we replace the kkth leaf with the expression representing the iith element of xkx_{k}, and we replace each internal node vv with the expression representing the iith element of \circ, where each reference to the jjth element of the left [right] expression becomes a pointer to the left [right] child of vv in TjT_{j}. Then F=(s1,s2,,sd)F=(s_{1},s_{2},\dots,s_{d}) where sis_{i} is the source node of expression DAG TiT_{i}. The bound on TT follows from Equation (10). ∎
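The balanced combining tree in this proof can be sketched as a generic fold. This is an illustrative sketch only; `op` and `items` are placeholders for the d-dimensional expression vectors of the lemma.

```python
def balanced_fold(op, items):
    """Combine items with an associative op along a complete binary tree,
    so the result's nesting depth is only ceil(lg n) applications of op
    (versus n - 1 for a left-to-right fold)."""
    assert items, "need at least one item"
    level = list(items)
    while len(level) > 1:
        # combine adjacent pairs; carry an odd element up unchanged
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

Tracking heights with op(a, b) = max(a, b) + 1 on leaves of height 0 confirms the ⌈lg n⌉ depth bound of the lemma.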

2.4 Polyhedral Inputs

The combinatorial structure of an input polyhedron can be encoded as a primal or dual graph, as usual, but which real numbers should represent the geometry? Because the quasigeodesic problem is about the intrinsic geometry of the surface of a polyhedron, the input geometry can be naturally represented intrinsically as well as extrinsically, leading to three natural representations:

  1. Extrinsic coordinates: 3D coordinates for each vertex.

  2. Intrinsic coordinates: For each face, for some isometric embedding of the face into 2D, the 2D coordinates of each vertex of the embedded face.

  3. Intrinsic lengths: For each face, the lengths of the edges. This representation assumes the faces have been combinatorially triangulated (so some edges may be flat).

In the expression RAM, we can convert coordinates (1 or 2) to edge lengths (3) as follows: given vertex coordinates (x1,y1,z1),(x2,y2,z2)(x_{1},y_{1},z_{1}),\allowbreak(x_{2},y_{2},z_{2}), the distance between these two vertices is

√((x₁ − x₂)·(x₁ − x₂) + (y₁ − y₂)·(y₁ − y₂) + (z₁ − z₂)·(z₁ − z₂)).

We can also convert from intrinsic lengths (3) to intrinsic coordinates (2) by, for each triangle, placing two vertices on the xx axis and finding the third vertex by the intersection of two circles whose radii are the two incident edge lengths. (This transformation also requires a square root.) Therefore we can convert representations (1) \rightarrow (2) \leftrightarrow (3). The reverse direction, from intrinsic (2/3) to extrinsic (1), is more difficult, as it involves solving the Alexandrov problem [KPD09]. Accordingly, the intrinsic representations (2/3) represent a more general class of possible polyhedra.
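Both conversions can be sketched numerically as follows. This is only illustrative: the paper manipulates exact expressions on the expression RAM, whereas this sketch uses floating point, and the side-length naming in `place_triangle` (a = |BC|, b = |CA|, c = |AB|) is our own convention.

```python
import math

def edge_length(p, q):
    """(1)/(2) -> (3): distance between two vertices given their coordinates."""
    return math.sqrt(sum((a - b) * (a - b) for a, b in zip(p, q)))

def place_triangle(a, b, c):
    """(3) -> (2): embed a triangle with side lengths a = |BC|, b = |CA|,
    c = |AB| into the plane, with A at the origin and B on the positive
    x axis. C is the intersection of the circle of radius b around A
    with the circle of radius a around B (one square root)."""
    A = (0.0, 0.0)
    B = (c, 0.0)
    cx = (b * b - a * a + c * c) / (2 * c)   # from |C-A|^2 = b^2, |C-B|^2 = a^2
    cy = math.sqrt(max(b * b - cx * cx, 0.0))  # the promised square root
    return A, B, (cx, cy)
```

For a 3-4-5 right triangle, `place_triangle(3, 4, 5)` places C at (3.2, 2.4), consistent with the given side lengths.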

Our quasigeodesic algorithm assumes the intrinsic input representation (2). By the reductions above, our algorithm also applies to polyhedra given in the extrinsic representation (1) or intrinsic length representation (3). On a real RAM, these conversions incur only linear additive time cost. On a word RAM, these conversions add to the root set RR of the input expressions, at most growing to include the edge lengths of the input polyhedron. Our algorithm will in fact introduce the edge lengths of the polyhedron to the root set of expressions anyway, so this does not introduce additional overhead for our algorithms. Thus we can assume any one of these input models.

By contrast, in a model restricting inputs to be integers or rationals, these three input models would define incomparable classes of polyhedra, so no representation would be clearly superior and no conversions would be possible.

3 Algorithm

In this section, we detail an algorithm to find a closed quasigeodesic on the surface of a convex polyhedron PP. First, a bit of terminology: we define a (quasi)geodesic ray/segment to be a one/two-ended path that is (quasi)geodesic.

Figure 5: We construct a directed graph whose nodes are pairs (U, I), where U is a polyhedron vertex and I is an interval of directions/angles leaving U. The left figure shows two polyhedron vertices, where the space of directions leaving each vertex is partitioned into intervals of size at most ε/2. The right figure shows an edge from (U, α) to (V, ϕ): there exists a geodesic leaving U in a direction from interval α that hits V such that a quasigeodesic may continue from V along any direction in interval ϕ.

3.1 Outline

The idea of the algorithm is roughly as follows. First, we define a directed graph where each node (we use the word "node" and lower-case letters for vertices of the graph, to distinguish them from vertices of a polyhedron, for which we use capital letters and the word "vertex") is a pair (V, [v₁, v₂]) of a vertex V of P and a small interval of directions at it, with an edge from one such node, (U, I), to another, (V, J), if a geodesic ray starting at the polyhedron vertex U and somewhere in the interval of directions I can reach V and continue quasigeodesically everywhere in J; see Figure 5. (Because we consider only geodesic rays that can continue quasigeodesically everywhere in J, there are some closed quasigeodesics that we cannot find: those that leave a polyhedron vertex in a direction in an interval J for which some directions are not quasigeodesic continuations. In particular, this algorithm is unlikely to find closed quasigeodesics that turn maximally at a polyhedron vertex.) We show how to calculate at least one out-edge from every node of that graph, so we can start anywhere and follow edges until hitting a node twice, giving a closed quasigeodesic.

The key part of this algorithm is to calculate, given a polyhedron vertex UU and a range of directions as above, another vertex VV that can be reached starting from that vertex and in that range of directions, even though reaching VV may require crossing superpolynomially many faces.
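The outer loop of this outline, following one out-edge per node until some node repeats, can be sketched generically. Here `out_edge` is a placeholder for the out-edge computation described in Sections 3.2 and 3.3; this is an illustration of the control flow, not the paper's implementation.

```python
def find_cycle(start, out_edge):
    """Follow one out-edge per node until a node repeats; return the cycle.
    Because the node set is finite and out_edge is defined everywhere,
    the walk must eventually revisit a node."""
    seen = {}   # node -> position along the walk
    walk = []
    node = start
    while node not in seen:
        seen[node] = len(walk)
        walk.append(node)
        node = out_edge(node)
    return walk[seen[node]:]   # the closed portion of the walk
```

In the algorithm, the returned cycle of (vertex, direction-interval) nodes corresponds to a closed quasigeodesic.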

3.2 Divergence of Geodesic Rays

In this section, we prove an upper bound on how long we must follow two geodesic rays before they diverge in the sequence of polyhedron edges they visit. First we define some terms.

Definition 3.1.

If X is a point on the surface of a polyhedron, v is a direction at X, and d > 0, then S = (X, v, d) is the geodesic segment starting at X in the direction v and continuing for a distance d or until it hits a polyhedron vertex, whichever comes first. (This definition is purely geometric; we reserve calculating these paths for Lemma 3.6.) We allow d = ∞; in that case, S = (X, v, ∞) may be a geodesic ray (or it may stop prematurely at a vertex).

Definition 3.2.

If (X,v,d)(X,\vec{v},d) is a geodesic segment or ray, the edge sequence E(X,v,d)E(X,\vec{v},d) is the (possibly infinite) sequence of polyhedron edges that (X,v,d)(X,\vec{v},d) visits.

Lemma 3.3.

If S1=(X,v1,)S_{1}=(X,\vec{v}_{1},\infty) and S2=(X,v2,)S_{2}=(X,\vec{v}_{2},\infty) are two geodesic rays from a common starting point XX with an angle of θ(0,π)\theta\in(0,\pi) between them, then the edge sequences E(S1)E(S_{1}) and E(S2)E(S_{2}) are distinct, and the first difference between them occurs at most one edge after a geodesic distance of O(L/θ)O(L/\theta).

Proof.

For a finite distance dd, the prefix segment Sid=(X,vi,d)S_{i}^{d}=(X,\vec{v}_{i},d) is a straight segment on the unfolded sequence of corresponding faces intersected by SidS_{i}^{d}. Given a (prefix of) E(Si)E(S_{i}), the segment of SiS_{i} is a straight line on the unfolded sequence of those faces. Thus, while E(S1d)=E(S2d)E(S_{1}^{d})=E(S_{2}^{d}), the two geodesics S1dS_{1}^{d} and S2dS_{2}^{d} form a planar wedge in a common unfolding, as in Figure 6. The distance between the points on the unfolded rays at distance dd from XX is 2dsinθ2>dθ/π2d\sin\frac{\theta}{2}>d\theta/\pi (since θ2<π2\frac{\theta}{2}<\frac{\pi}{2}), so for points at a distance of Ω(L/θ)\Omega(L/\theta), that distance is at least LL. So either E(S1)E(S_{1}) and E(S2)E(S_{2}) differ before then, or the next edge that S1S_{1} and S2S_{2} cross is a different edge, in which case E(S1)E(S_{1}) and E(S2)E(S_{2}) differ in the next edge, as claimed. ∎
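The wedge geometry in this proof admits a quick numeric sanity check. The constants below are illustrative only (floating point, not the paper's exact expressions).

```python
import math

def divergence_distance(L, theta):
    """Smallest d with 2*d*sin(theta/2) >= L: two unfolded rays at angle
    theta from a common point are separated by 2*d*sin(theta/2) at
    distance d, so beyond this d their separation exceeds the longest
    edge length L and their edge sequences must soon differ (Lemma 3.3)."""
    return L / (2 * math.sin(theta / 2))
```

Since sin x ≥ (2/π)x on [0, π/2], this distance is at most (π/2)·L/θ, i.e., O(L/θ) as the lemma states.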

Figure 6: A segment of a geodesic is a straight line in the unfolding of the sequence of faces through which it passes, as in this unfolding of a regular dodecahedron.

If we had defined LL analogously to \ell as not just the length of the longest edge but the greatest distance within a face between a polyhedron vertex and an edge not containing it, we could remove the “at most one edge after” condition from Lemma 3.3.

Lemma 3.3 gives a bound on the geodesic distance to the first difference in the edge sequences (or one edge before that). We now relate geodesic distance to the number of edges visited by the geodesic.

Lemma 3.4.

Let S=(X,v,d)S=(X,\vec{v},d) be a geodesic segment. Then E(S)E(S) consists of O(dL23)O\big{(}\frac{d\,L^{2}}{\ell^{3}}\big{)} edges.

Proof.

We prove that, if the geodesic segment S has length d < ℓ/3, then E(S) consists of O(L²/ℓ²) edges. The lemma then follows by splitting the full geodesic segment into O(d/ℓ) consecutive subsegments, each of length less than ℓ/3, for a total of O((d/ℓ) · (L²/ℓ²)) = O(dL²/ℓ³) edges.

Consider the sequence x1,x2,,xkx_{1},x_{2},\dots,x_{k} of intersection points between the segment SS and the respective edges e1,e2,,eke_{1},e_{2},\dots,e_{k} of the polyhedron. Call xix_{i} near a vertex VV if the intrinsic distance xiV/3\|x_{i}-V\|\leq\ell/3. Call the segment SS near a vertex VV if some xix_{i} is near VV.

We claim that SS can be near at most one vertex. Assume for contradiction that SS has a point xix_{i} near vertex UU and a point xjx_{j} near vertex VV. By the triangle inequality,

‖U − V‖ ≤ ‖U − x_i‖ + ‖x_i − x_j‖ + ‖x_j − V‖ ≤ ℓ/3 + d + ℓ/3,

where ‖U − V‖ ≥ ℓ, while ‖U − x_i‖ ≤ ℓ/3 and ‖x_j − V‖ ≤ ℓ/3 by nearness, and ‖x_i − x_j‖ ≤ d. Hence ℓ ≤ d + (2/3)ℓ, so ℓ/3 ≤ d,

contradicting that d</3d<\ell/3.

We can thus divide into two cases, depending on whether SS is near any (single) vertex:

Figure 7: If a geodesic path encounters the same edge twice in nearly the same place and nearly the same direction, as is the case for the thick quasigeodesic path through the center of this figure if every fourth triangle is the same face, it may pass the same sequence of faces in the same order a superpolynomial number of times. Equally colored faces represent copies of the same face being visited multiple times.
Case 1: SS is far from all vertices.

Figure 7 shows an example where SS might cross many faces far from all vertices. Consider a segment xixi+1x_{i}x_{i+1} crossing a face ff. In this case, xix_{i} and xi+1x_{i+1} are not near a vertex. Consider shifting this segment, keeping it parallel to xixi+1x_{i}x_{i+1}, until the shifted segment hits a vertex of the face ff; refer to Figure 8.

Figure 8: Case 1 of the proof of Lemma 3.4: shifting a segment xixi+1x_{i}x_{i+1} of the segment SS to remain parallel while crossing a single face. (Left) Case 1.1: shifting in the shrinking direction hits a vertex before shrinking to zero length. (Right) Case 1.2: shifting in the growing direction when the shrinking direction degenerates to a single vertex VV.
Case 1.1:

If one shift direction causes the segment to get shorter and the shifted segment y_i y_{i+1} hits a vertex before it becomes zero length (as in Figure 8, left), then y_i y_{i+1} has length at least ℓ (by definition of ℓ), and thus x_i x_{i+1} has length at least ℓ ≥ ℓ²/(3L).

Case 1.2:

Otherwise, we know that there is a shift direction where the segment first hits a vertex V of f when it collapses to zero length at V, which is a common endpoint of the edges e_i and e_{i+1} of f on which x_i and x_{i+1} lie (as in Figure 8, right). Now shift the segment in the opposite direction, which lengthens it, until we obtain a segment y_i y_{i+1} where either y_i or y_{i+1} is a vertex of f. The shifted segment y_i y_{i+1} has length at least ℓ (by definition of ℓ). Segments x_i x_{i+1} and y_i y_{i+1} form similar triangles with V. We have ‖y_i − V‖, ‖y_{i+1} − V‖ ≤ L (by definition of L) and either ‖x_i − V‖ ≥ ℓ/3 or ‖x_{i+1} − V‖ ≥ ℓ/3 (by farness), so the coefficient of similarity is at most L/(ℓ/3) = 3L/ℓ. Therefore x_i x_{i+1} has length at least ℓ/(3L/ℓ) = ℓ²/(3L).

Hence, in both Case 1.1 and 1.2, xixi+1x_{i}x_{i+1} has length at least 2/(3L)\ell^{2}/(3L). But SS has length </3<\ell/3. Therefore, in Case 1, the number of polyhedron edges hit by segment SS is at most L/(2/(3L))=O(L2/2)L/(\ell^{2}/(3L))=O(L^{2}/\ell^{2}).

Case 2: SS is near a vertex VV.

(This case includes when S starts or ends at a vertex V.) First consider a segment x_i x_{i+1} where one endpoint (say, x_i) is near V while the other endpoint (say, x_{i+1}) is far from V. By the triangle inequality,

‖V − x_{i+1}‖ ≤ ‖V − x_i‖ + ‖x_i − x_{i+1}‖ ≤ ℓ/3 + ‖x_i − x_{i+1}‖.

But Vxi+1\|V-x_{i+1}\|\geq\ell (by definition of \ell), so xixi+123\|x_{i}-x_{i+1}\|\geq\frac{2}{3}\ell. But this contradicts that SS has length d</3d<\ell/3.

Thus any segment x_i x_{i+1} in Case 2 must have both x_i and x_{i+1} near vertex V. Hence the edges e_i hit by S are all incident to V. Consider the sequence of faces f_0, f_1, …, f_k intersected by S, where f_i is the face between intersections x_i and x_{i+1} for 1 ≤ i < k; f_0 is the face before the intersection x_1 (if any); and f_k is the face after the intersection x_k (if any). As argued above, the f_i all have V as a vertex. Thus, if we unfold these faces consecutively into the plane, as shown in Figure 11, then the unfolded faces all rotate around a common point V. In this unfolded view, S is a straight segment in the plane passing through the collinear points x_1, x_2, …, x_k. Each angle ∠x_i V x_{i+1} is an angle of a face (at V), which is at least arcsin(ℓ/L) ≥ ℓ/L; see Figure 11. Because the unfolding is planar, Σ_{i=1}^{k−1} ∠x_i V x_{i+1} = ∠x_1 V x_k ≤ π. Thus k ≤ π/(ℓ/L) = O(L/ℓ), so the number of faces hit by S is O(L/ℓ). ∎

Figure 9: The geodesic is straight so the total angle of visited faces must be at most π.

Figure 10: Each angle θ of a face is at least ℓ/L.
Figure 11: A short quasigeodesic can visit ω(1)\omega(1) faces but O(L/)O(L/\ell) faces near a vertex VV.

3.3 Computing Quasigeodesic Rays

Next we show how to algorithmically follow quasigeodesic rays. First we need a lemma about locally unfolding the polyhedron’s surface.

Lemma 3.5.

Given the local coordinate systems C₁, C₂ for two adjacent faces f₁, f₂ sharing an edge e, where each system C_i is a vector specifying the coordinates of all vertices of f_i, we can compute in O(1) time the orientation-preserving isometry bringing e on f₂ to e on f₁ as a transformation matrix I, satisfying

H(I) ≤ H((C₁, C₂)) + O(1),   R(I) ⊆ R((C₁, C₂)) ∪ {‖e‖},

b*(I) ≤ b*((C₁, C₂)),   Δ(I) ≤ max{Δ((C₁, C₂)), 2},

where e\|e\| denotes the length of edge ee.

Proof.

First we define the orientation-preserving isometry I(f,e)I(f,e) that brings edge e=(v,w)e=(v,w) of face ff from its location in ff’s local coordinate system onto the positive xx axis, with vv mapping to the origin. Then the desired transformation is the composition that performs I(f2,e)I(f_{2},e) and then I1(f1,e)I^{-1}(f_{1},e).

Transformation I(f,e)I(f,e) is the composition of translating by v-v and then rotating to bring d=wvd=w-v to the positive xx axis. Suppose dd has coordinates (dx,dy)(d_{x},d_{y}). Viewing 2D points as complex numbers, the rotation is equivalent to multiplying by the complex conjugate dxidyd_{x}-id_{y} of d=dx+idyd=d_{x}+id_{y}, and rescaling by 1/d=1/dx2+dy21/\|d\|=1/\sqrt{d_{x}^{2}+d_{y}^{2}}. Thus the rotation matrix (for multiplying a column vector on the right) is

(1/‖d‖) ( d_x   d_y
         −d_y   d_x ).

Thus the overall affine transformation II can be written as a 3×33\times 3 matrix using O(1)O(1) additions, subtractions, multiplications, and divisions of the input coordinates and one square root, namely, d=dx2+dy2=e\|d\|=\sqrt{d_{x}^{2}+d_{y}^{2}}=\|e\|. The only new leaf integer expression is 11, which does not affect bb^{*}. ∎
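A floating-point sketch of the transformation I(f, e) constructed in this proof follows. The paper keeps these matrices as exact expressions on the expression RAM; the function names here are our own, and the desired unfolding isometry of the lemma is then the composition of I(f₂, e) with the inverse of I(f₁, e).

```python
import math

def matmul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_to_point(m, p):
    """Apply a 3x3 affine matrix to a 2D point in homogeneous coordinates."""
    return (m[0][0] * p[0] + m[0][1] * p[1] + m[0][2],
            m[1][0] * p[0] + m[1][1] * p[1] + m[1][2])

def edge_to_x_axis(v, w):
    """I(f, e): map v to the origin and w onto the positive x axis, by
    translating by -v and then rotating by the unit complex conjugate of
    d = w - v. The one square root is |d|, the edge length."""
    dx, dy = w[0] - v[0], w[1] - v[1]
    n = math.sqrt(dx * dx + dy * dy)
    rotate = [[dx / n, dy / n, 0.0],
              [-dy / n, dx / n, 0.0],
              [0.0, 0.0, 1.0]]
    translate = [[1.0, 0.0, -v[0]],
                 [0.0, 1.0, -v[1]],
                 [0.0, 0.0, 1.0]]
    return matmul(rotate, translate)
```

For example, the edge from (1, 1) to (4, 5) has length 5, and the resulting matrix sends (1, 1) to the origin and (4, 5) to (5, 0).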

Lemma 3.6.

Let S = (X, v, ∞) be a geodesic ray on an n-vertex polyhedron. We can compute the first k faces and k edges visited by S; the corresponding k intersection points; and a planar embedding of an unfolding of these faces, edges, and intersection points. In particular, this determines the direction at which S exits the last face through the last intersection point. All points and directions are represented as (exact) expressions of height H((P, S)) + ⌈lg k⌉ and root set R((P, S)) ∪ {‖e‖ : e is among the first k edges visited by S}. (For the purposes of H(S) and R(S), we ignore the ∞ component and just view S as the vector object (X, v); see Section 2.2.5.) If S hits a vertex within the first k steps, we stop there and also output the vertex; otherwise, we proceed for exactly k steps. On a real RAM, the running time is O(k lg n). On the word RAM, the running time is O(T((P, S), ⌈lg k⌉) · lg n).

Proof.

Suppose that SS starts at or enters face ff at point Q=(Qx,Qy)Q=(Q_{x},Q_{y}) with direction v=(vx,vy)\vec{v}=(v_{x},v_{y}). We binary search to find an edge of ff where SS exits ff. For each visited edge e=(A,B)e=(A,B) of ff with endpoints (Ax,Ay)(A_{x},A_{y}) and (Bx,By)(B_{x},B_{y}), the intersection QQ^{\prime} of the extension of SS, which has equation

(Q′_x − Q_x) · v_y = v_x · (Q′_y − Q_y),

and the extension of ee, which has equation

(Q′_x − B_x) · (A_y − B_y) = (A_x − B_x) · (Q′_y − B_y),

is the point Q=(Qx,Qy)Q^{\prime}=(Q^{\prime}_{x},Q^{\prime}_{y}) given by the solution to the above linear system. Then we can check whether QxQ^{\prime}_{x} is between AxA_{x} and BxB_{x} (or QyQ^{\prime}_{y} is between AyA_{y} and ByB_{y}) using Type-1 comparisons, to determine whether SS in fact crosses ee. If so, we have found a desired edge ee. Otherwise, we can determine whether the intersection QQ^{\prime} is before AA or after BB on ee, using another Type-1 comparison on the signed triangle area, and direct the binary search accordingly.

If SS in fact hits a vertex of face ff, we will determine so by detecting that it hits two edges of ff. Otherwise, for the correct edge ee, we can thus determine (Q,v)(Q^{\prime},\vec{v}) as an affine transformation τ1\tau_{1} on (Q,v)(Q,\vec{v}), where the matrix elements are root-free O(1)O(1)-expressions (involving O(1)O(1)-expressions Ax,Ay,Bx,ByA_{x},A_{y},B_{x},B_{y}). We then take Lemma 3.5’s affine transformation τ2\tau_{2} that brings edge ee of ff in the local coordinate system of ff to edge ee of the adjacent face ff^{\prime} in the local coordinate system of ff^{\prime} and apply it to both QQ^{\prime} and v\vec{v} to obtain the point Q′′Q^{\prime\prime} and the direction v\vec{v}^{\prime} that ray SS will next enter ff^{\prime} in its local coordinate system. Composing τ1\tau_{1} and τ2\tau_{2}, we obtain (Q′′,v)(Q^{\prime\prime},\vec{v}^{\prime}) as an affine transformation TT of (Q,v)(Q,\vec{v}), where again the matrix elements are O(1)O(1)-expressions. By Lemma 3.5, the only added root is the length of edge ee.

Applying this method repeatedly, we can obtain the point Q_i and direction v_i after S traverses i faces in sequence. In each step, (Q_i, v_i) is determined by an affine transformation T_i on (Q_{i−1}, v_{i−1}), where again the matrix elements are O(1)-expressions. On a real RAM, we compute each (Q_i, v_i) vector sequentially from the previous one in O(1) time; each of the k steps performs a binary search of O(lg n) iterations, each costing O(1) time, for a total of O(k lg n). On an expression RAM, we instead apply Lemma 2.19 to represent each (Q_i, v_i) by expressions of height H(S) + ⌈lg i⌉ from the associative composition (matrix multiplication) of T_1, …, T_i. At each step i, we perform a Type-1 operation involving Q_i, v_i, and the geometry of the polyhedron P. Thus, according to the simple cost model and Equation (10), this operation takes O(T((P, Q_i, v_i))) = O(T((P, S), ⌈lg i⌉)) time. Summing over i, we obtain a geometric series dominated by the last term, O(T((P, S), ⌈lg k⌉)). We pay this cost per binary-search iteration, so the total cost is a factor of lg n larger. ∎
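The single face-crossing step, solving the two linear equations above for Q′, can be sketched as follows. This is a hypothetical floating-point version for illustration; the paper performs the same computation with exact expressions and Type-1 comparisons.

```python
def ray_edge_intersection(Q, v, A, B):
    """Intersect the ray Q + s*v (s >= 0) with the edge segment
    A + t*(B - A), t in [0, 1], by solving the 2x2 linear system
        s*v - t*(B - A) = A - Q
    via Cramer's rule. Returns (s, t, point), or None if the ray is
    parallel to the edge or does not cross it."""
    ex, ey = B[0] - A[0], B[1] - A[1]
    det = v[1] * ex - v[0] * ey
    if det == 0:
        return None                      # ray parallel to the edge
    rx, ry = A[0] - Q[0], A[1] - Q[1]
    s = (ry * ex - rx * ey) / det        # distance parameter along the ray
    t = (ry * v[0] - rx * v[1]) / det    # position parameter along the edge
    if s < 0 or not (0 <= t <= 1):
        return None                      # intersection not on ray or edge
    return s, t, (Q[0] + s * v[0], Q[1] + s * v[1])
```

For example, the ray from the origin in direction (1, 1) crosses the edge from (2, 0) to (0, 2) at its midpoint (1, 1), with s = 1 and t = 0.5.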

Lemma 3.7.

Consider an angle-θ\theta cone between geodesic rays S1=(X,v1,)S_{1}=(X,\vec{v}_{1},\infty) and S2=(X,v2,)S_{2}=(X,\vec{v}_{2},\infty). We can compute a geodesic segment S=(X,v,d)S=(X,\vec{v},d) that lies in the given cone, ends at a vertex VV of PP, and has k=O(L3θ3)k=O\left(\frac{L^{3}}{\theta\,\ell^{3}}\right) intersections with edges of PP, along with the identity of the faces and edges intersected by SS. The output, consisting of v\vec{v}, dd, and the kk intersections, is represented by (exact) expressions of height H((P,S1,S2))+lgkH\big{(}(P,S_{1},S_{2})\big{)}+\lceil\lg k\rceil and root set R((P,S1,S2)){ee is crossed by S}R\big{(}(P,S_{1},S_{2})\big{)}\cup\{\|e\|\mid e\text{ is crossed by }S\}. On the real RAM, the running time is O(klgn)O(k\lg n). On the word RAM, the running time is O(T((S1,S2),lgk)lgn)O\Big{(}T\big{(}(S_{1},S_{2}),\lceil\lg k\rceil\big{)}\lg n\Big{)}.

Proof.

We apply Lemma 3.6 to follow the given geodesic rays S1S_{1} and S2S_{2} until they cross different edges. As a special case, if S1S_{1} and S2S_{2} immediately enter different faces, then XX is on an edge of PP and the cone contains one of the endpoints of that edge, and SS can simply be a portion of that edge. Otherwise, S1S_{1} and S2S_{2} initially enter the same face, and at each step, either they exit the face along the same edge (and thus enter the same face) or they exit along different edges. By Lemma 3.3, after a distance of d=O(L/θ)d=O(L/\theta), S1S_{1} and S2S_{2} must exit a common face ff along different edges, say e1e_{1} and e2e_{2}. By Lemma 3.4, this event happens after crossing O(dL23)=O(L3θ3)O\left(\frac{d\,L^{2}}{\ell^{3}}\right)=O\left(\frac{L^{3}}{\theta\,\ell^{3}}\right) edges of PP. Thus we can find the desired vertex VV by choosing a vertex between e1e_{1} and e2e_{2} on ff (where “between” is defined by the sides of the rays corresponding to the cone). Lemma 3.6 also provides a planar embedding of an unfolding of the sequence of faces visited by S1S_{1} and S2S_{2} up to and including ff. Thus we can draw the unfolded segment SS from XX to VV in this embedding, and intersect it with each of the unfolded edges to find the intersection points along the way (which can be mapped back to the polyhedron, if desired, via the transformations provided by Lemma 3.6). ∎

3.4 Full Algorithm

We are now ready to state the algorithm for finding a closed quasigeodesic in quasipolynomial time:

Theorem 3.8.

Let PP be a convex polyhedron with nn vertices all of curvature at least ε\varepsilon, let LL be the length of the longest edge, and let \ell be the smallest distance within a face between a vertex and a nonincident edge. Then we can find a closed quasigeodesic on PP consisting of O(nL3ε23)O\left(\frac{n\,L^{3}}{\varepsilon^{2}\,\ell^{3}}\right) segments on faces of PP. On a real RAM, the running time is O(nL3ε23lgn)O\left(\frac{n\,L^{3}}{\varepsilon^{2}\,\ell^{3}}\lg n\right). On an expression RAM, each segment endpoint of the closed quasigeodesic can be represented by an expression with root set R(P)Λ(P)R(P)\cup\Lambda(P) and height H(P)+lgn+lgL3ε3+2H(P)+\lg n+\lg\frac{L^{3}}{\varepsilon\,\ell^{3}}+2, and the running time is

O(nlgnεb(P)2(64max{2,Δ(P)})2|R(P)Λ(P)|+2(H(P)+lgnL3ε3)+10).O\left({\textstyle\frac{n\lg n}{\varepsilon}}\cdot b^{*}(P)^{2}\cdot\big{(}64\cdot\max\{2,\Delta(P)\}\big{)}^{2|R(P)\cup\Lambda(P)|+2\left(H(P)+\lg{\textstyle\frac{n\,L^{3}}{\varepsilon\,\ell^{3}}}\right)+10}\right).

In particular, if Δ(P)2\Delta(P)\leq 2, this running time is

O(n8lgnε8L2121b(P)2214(|R(P)Λ(P)|+H(P)+5)).O\left({\textstyle\frac{n^{8}\lg n}{\varepsilon^{8}}}\cdot{\textstyle\frac{L^{21}}{\ell^{21}}}\cdot b^{*}(P)^{2}\cdot 2^{14\big{(}|R(P)\cup\Lambda(P)|+H(P)+5\big{)}}\right).
Proof.

First, we represent the minimum curvature ε\varepsilon of the polyhedron’s vertices by computing a vector vε\vec{v}_{\varepsilon} whose angle with the positive xx axis is ε\varepsilon. For each vertex VV, we compute a planar embedding of the faces f1,f2,,fkf_{1},f_{2},\dots,f_{k} incident to VV in clockwise order, as follows. Let ee be the edge shared by f1f_{1} and fkf_{k}. We apply Lemma 3.5 to find a transformation from f1f_{1}’s local coordinate system that maps VV to the origin and maps ee to the positive xx axis, by constructing an artificial local coordinate system that places VV and ee in this way. Then we repeatedly apply Lemma 3.5 to place the subsequent faces f2,,fkf_{2},\dots,f_{k}, aligning corresponding edges (similar to Lemma 3.6). By Lemma 3.5, these transformations only involve the square roots in the lengths of edges incident to VV. To compute each fif_{i} on the word RAM, we apply Lemma 2.19 to compute the (associative) product of the transformation matrices from Lemma 3.5, and then apply that transformation to compute the vertex coordinates of fif_{i}. The computed placement of fkf_{k} gives an embedding of ee which, viewed as a vector from VV, forms an angle with the xx axis equal to the curvature of vertex VV. We can compare the curvatures of two vertices by comparing the slopes of these vectors (each a ratio of the two coordinates of the vector) via Operation 1. Taking the minimum among all vertices, we find the vector vε\vec{v}_{\varepsilon} whose angle with the positive xx axis is the minimum angle ε\varepsilon. On the real RAM, this computation costs O(n)O(n) time (by the Handshaking Lemma). On the word RAM, this computation costs

O(nT(P,lgn))O\big{(}n\cdot T(P,\lceil\lg n\rceil)\big{)} (11)

time (as nn is an upper bound on vertex degree and thus lgn\lceil\lg n\rceil is an upper bound on added expression height). We may assume that ε45\varepsilon\leq 45^{\circ}: if vε\vec{v}_{\varepsilon}’s slope is not in [0,1][0,1], then we replace vε\vec{v}_{\varepsilon} with (1,1)(1,1).
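For intuition, the quantity being minimized here is the angle defect at each vertex: 2π2\pi minus the total incident face angle. A small floating-point sketch follows; the paper instead computes these angles exactly via planar unfoldings and compares them with Operation 1, and the coordinates below are purely illustrative:

```python
import math

def angle(u, v):
    """Angle between 3D vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))

def curvature(apex, neighbors):
    """Curvature at a vertex: 2*pi minus the total face angle.
    neighbors lists the edge-adjacent vertices in cyclic order;
    each consecutive pair spans one incident face, whose angle at
    the apex is the angle between the two incident edge vectors."""
    total = 0.0
    k = len(neighbors)
    for i in range(k):
        u = [a - b for a, b in zip(neighbors[i], apex)]
        v = [a - b for a, b in zip(neighbors[(i + 1) % k], apex)]
        total += angle(u, v)
    return 2 * math.pi - total

# A cube corner has three right angles of surface, so curvature pi/2.
corner = curvature((0, 0, 0), [(1, 0, 0), (0, 1, 0), (0, 0, 1)])
assert abs(corner - math.pi / 2) < 1e-12
```

For a regular tetrahedron, each vertex has three 6060^{\circ} face angles, giving curvature 2π3π/3=π2\pi-3\cdot\pi/3=\pi, consistent with the total curvature 4π4\pi split over four vertices.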

Second, we round the vector vε\vec{v}_{\varepsilon} to an integer vector (a,1)(a,1) whose angle with the positive xx axis is between ε/3\varepsilon/3 and ε\varepsilon. Specifically, we use repeated doubling and then binary search (“exponential search” or “one-sided binary search”) to find the smallest aa for which the slope 1/a1/a is at most the slope of vε\vec{v}_{\varepsilon}. Again, we can compare slopes of vectors via Operation 1. On the real RAM, the total running time is O(lg1ε)O(\lg\frac{1}{\varepsilon}). On the word RAM, the total running time is

O(T((P,a),lgn)lg1ε)O\Big{(}T\big{(}(P,a),\lceil\lg n\rceil\big{)}\cdot\lg{\textstyle\frac{1}{\varepsilon}}\Big{)} (12)

time. The slope of vε\vec{v}_{\varepsilon} is between 1/a1/a and 1/(a1)1/(a-1), whose ratio is a/(a1)=1+1/(a1)a/(a-1)=1+1/(a-1). By our assumption that ε45\varepsilon\leq 45^{\circ}, a2a\geq 2, so the slope ratio is at most 22. For any slope ss between 0 and 11, the ratio sarctans\frac{s}{\arctan s} is between 11 and c=1/arctan1=4/π1.27c=1/\arctan 1=4/\pi\approx 1.27. Thus the angles of vectors vε\vec{v}_{\varepsilon} and (a,1)(a,1) have a ratio of at most 8/π2.558/\pi\approx 2.55. Furthermore, we can upper bound aa as follows: 1/(a1)>tanε1/(a-1)>\tan\varepsilon (as aa was the smallest integer where 1/atanε1/a\leq\tan\varepsilon), so a<1+1/tanεa<1+1/\tan\varepsilon. For 0επ/40\leq\varepsilon\leq\pi/4, tanε\tan\varepsilon is between ε\varepsilon and cεc\varepsilon where c=tan(π/4)/(π/4)=4/πc=\tan(\pi/4)/(\pi/4)=4/\pi. Thus a1+1/ε=Θ(1/ε)a\leq 1+1/\varepsilon=\Theta(1/\varepsilon).
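The exponential-plus-binary search for aa can be sketched as follows, with exact rational comparisons (Python’s Fraction) standing in for Operation 1; the input vector (17,3)(17,3) is a hypothetical vε\vec{v}_{\varepsilon}:

```python
from fractions import Fraction

def round_slope(x, y):
    """Given v_eps = (x, y) with rational slope s = y/x in (0, 1], find
    the smallest integer a with 1/a <= s. Slopes are compared exactly
    (cross-multiplication, as in Operation 1): first repeated doubling,
    then one-sided binary search over the doubled range."""
    s = Fraction(y, x)
    hi = 1
    while Fraction(1, hi) > s:          # repeated doubling
        hi *= 2
    lo = hi // 2 + 1 if hi > 1 else 1
    while lo < hi:                      # one-sided binary search
        mid = (lo + hi) // 2
        if Fraction(1, mid) <= s:
            hi = mid
        else:
            lo = mid + 1
    return hi

# Slope 3/17 ~ 0.176: the smallest a with 1/a <= 3/17 is a = 6.
assert round_slope(17, 3) == 6
```

Both phases make O(lg1ε)O(\lg\frac{1}{\varepsilon}) slope comparisons, matching the stated real-RAM bound, and the returned a=Θ(1/ε)a=\Theta(1/\varepsilon).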

Third, for each vertex VV of each face FF of PP, we choose O(1/ε)O(1/\varepsilon) direction vectors in FF’s local coordinate frame that include the two incident edges of FF, and so that the angle of the wedge between consecutive direction vectors is ε/2\leq\varepsilon/2 and, except for the extreme wedges that are bounded by a polyhedron edge, the angle is at least ε/18\varepsilon/18. We start with the two direction vectors for the two incident edges of FF, normalized (which adds edge-length roots), which form an angle <π<\pi by convexity. Divide the 360360^{\circ} angle at VV into the standard 4545^{\circ} octants by the lines x=0x=0, y=0y=0, and y=±xy=\pm x. Divide each octant as follows; assume by possible reflection that we are working with the first octant. Consider the vectors vb=(3a,b)\vec{v}_{b}=(3a,b) for b{1,2,,3a}b\in\{1,2,\dots,3a\} (including a scaling of our rounded vector (a,1)(a,1)); refer to Figure 12. We include each vector that is in the desired wedge between the two edge direction vectors (which we can check by comparing slopes). To bound the angles θb\theta_{b} between consecutive vectors vb\vec{v}_{b} and vb1\vec{v}_{b-1} for b{1,2,,3a}b\in\{1,2,\dots,3a\}, note that the triangle (0,0),vb,vb1(0,0),\vec{v}_{b},\vec{v}_{b-1} has base length 11 and height 3a3a, so it has the same area A=32aA=\frac{3}{2}a for all bb. By the side-angle-side area formula, A=12vbvb1sinθbA=\frac{1}{2}\|\vec{v}_{b}\|\cdot\|\vec{v}_{b-1}\|\sin\theta_{b}, so sinθb=2A/(vbvb1)\sin\theta_{b}=2A/(\|\vec{v}_{b}\|\cdot\|\vec{v}_{b-1}\|). Each length vb\|\vec{v}_{b}\| is between 3a3a and 32a3\sqrt{2}\,a, so

(3a)/(3\sqrt{2}\,a)^{2}\leq\sin\theta_{b}\leq(3a)/(3a)^{2}, i.e., 1/(6a)\leq\sin\theta_{b}\leq 1/(3a).

Because 0θbπ/40\leq\theta_{b}\leq\pi/4, θb/sinθb\theta_{b}/\sin\theta_{b} is between 11 and c=π/(22)c^{\prime}=\pi/(2\sqrt{2}). Thus

1/(6a)sinθbθbcsinθbc/(3a).1/(6a)\leq\sin\theta_{b}\leq\theta_{b}\leq c^{\prime}\sin\theta_{b}\leq c^{\prime}/(3a).

Because xtanxcxx\leq\tan x\leq cx, we have 1/aarctan(1/a)(1/c)(1/a)1/a\geq\arctan(1/a)\geq(1/c)(1/a), so

\frac{1}{6}\arctan(1/a)\leq\theta_{b}\leq\frac{c\cdot c^{\prime}}{3}\arctan(1/a)=\frac{4}{\pi}\cdot\frac{\pi}{2\sqrt{2}}\cdot\frac{1}{3}\arctan(1/a)=\frac{\sqrt{2}}{3}\arctan(1/a).

By our choice of aa in the second paragraph, ε/3arctan(1/a)ε\varepsilon/3\leq\arctan(1/a)\leq\varepsilon. Therefore ε/18θb23ε<ε/2\varepsilon/18\leq\theta_{b}\leq\frac{\sqrt{2}}{3}\varepsilon<\varepsilon/2. Every coordinate we introduce here has value O(a)=O(1/ε)O(a)=O(1/\varepsilon).
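The octant subdivision of Figure 12 and the derived bounds on θb\theta_{b} can be checked numerically; here a=7a=7 is an arbitrary example value:

```python
import math

def octant_vectors(a):
    """Direction vectors (3a, b) for b = 0..3a subdividing the first
    octant, as in Figure 12 (b = 0 is the octant boundary)."""
    return [(3 * a, b) for b in range(0, 3 * a + 1)]

a = 7
vecs = octant_vectors(a)
angles = [math.atan2(y, x) for x, y in vecs]
diffs = [t2 - t1 for t1, t2 in zip(angles, angles[1:])]

# Derived bounds: (1/6) arctan(1/a) <= theta_b <= (sqrt(2)/3) arctan(1/a).
lo = math.atan(1 / a) / 6
hi = (math.sqrt(2) / 3) * math.atan(1 / a)
assert all(lo <= d <= hi for d in diffs)
```

With arctan(1/a)\arctan(1/a) between ε/3\varepsilon/3 and ε\varepsilon, this confirms consecutive wedge angles between ε/18\varepsilon/18 and 23ε<ε/2\frac{\sqrt{2}}{3}\varepsilon<\varepsilon/2.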

Figure 12: Construction of Θ(1/ε)\Theta(1/\varepsilon) wedges of angle Θ(ε)\Theta(\varepsilon) partitioning the angle around the origin.

Fourth, we construct a directed graph GG with one node for each pair (V,A)(V,A) of a vertex VV from PP and one of its wedges AA (for “angular range”), giving the graph O(n/ε)O(n/\varepsilon) nodes, with an edge from a node s=(U,A)s=(U,A) to a node t=(V,B)t=(V,B) if there exists a direction in AA such that a quasigeodesic ray starting in that direction from the polyhedron vertex UU hits the polyhedron vertex VV and can continue from VV in every direction in BB.

Now we describe how to find an outgoing edge from any given node s=(U,A)s=(U,A) in GG, with corresponding vertex UU and a wedge AA of directions from v1\vec{v}_{1} to v2\vec{v}_{2}. If wedge AA contains a polyhedron edge (U,V)(U,V) (by our construction, this edge will be in the direction v1\vec{v}_{1} or v2\vec{v}_{2} from UU), then we will proceed to polyhedron vertex VV using that direction vector vf\vec{v}_{f}, which we view in the local coordinate system of one of the faces ff incident to the polyhedron edge (U,V)(U,V). Otherwise, apply Lemma 3.7 to follow the cone between geodesic rays S1=(U,v1,)S_{1}=(U,\vec{v}_{1},\infty) and S2=(U,v2,)S_{2}=(U,\vec{v}_{2},\infty) to find a vertex VV within the cone, i.e., reachable by a quasigeodesic segment SS starting at UU within the wedge AA, along with the final face ff visited by SS and the final direction vector vf\vec{v}_{f} of the segment SS in the local coordinate system of ff. In either case, we obtain a quasigeodesic that can then exit the vertex VV anywhere in a cone with angle equal to that vertex’s curvature, which is at least ε\varepsilon, so for at least one of the wedges BB of size ε/2\leq\varepsilon/2 at VV, the quasigeodesic can exit anywhere in that wedge. To find such a wedge, we can construct a planar embedding of the polyhedron faces incident to VV (as in the first paragraph of the proof) in counterclockwise order from ff, and then extend the vector vf\vec{v}_{f} from the embedded VV and compute via binary search which face ff^{\prime} we enter. If there is no such face ff^{\prime}, the curvature is >π>\pi, so the geodesic ray can exit anywhere; choose any graph node (V,B)(V,B) corresponding to this vertex VV. Otherwise, transform the vector vf\vec{v}_{f} to the local coordinate system of ff^{\prime}, and find the clockwise next whole wedge BB in ff^{\prime}; if there is no such whole wedge within ff^{\prime}, choose the clockwise first wedge BB in the clockwise next face after ff^{\prime}. 
Thus we find an outgoing edge from node s=(U,A)s=(U,A) to node t=(V,B)t=(V,B). The running time on the real RAM is O(klgn+lgn)=O(klgn)=O(L3ε3lgn)O(k\lg n+\lg n)=O(k\lg n)=O\left(\frac{L^{3}}{\varepsilon\,\ell^{3}}\lg n\right). The running time on the word RAM is the sum of the ray-following cost and the final binary-search cost, both of which are

O(T((S1,S2),lgk)lgn)\displaystyle O\Big{(}T\big{(}(S_{1},S_{2}),\lceil\lg k\rceil\big{)}\cdot\lg n\Big{)}
=\displaystyle={} O(T(P,lgn+lgk)lgn)\displaystyle O\Big{(}T\big{(}P,\lceil\lg n\rceil+\lceil\lg k\rceil\big{)}\cdot\lg n\Big{)}
=\displaystyle={} O(T(P,lgn+lgL3ε3)lgn)\displaystyle O\left(T\left(P,\lceil\lg n\rceil+\left\lceil\lg{\textstyle\frac{L^{3}}{\varepsilon\,\ell^{3}}}\right\rceil\right)\cdot\lg n\right)
\displaystyle\leq{} O(T(P,lgnL3ε3+2)lgn).\displaystyle O\left(T\left(P,\lg{\textstyle\frac{n\,L^{3}}{\varepsilon\,\ell^{3}}}+2\right)\cdot\lg n\right). (13)

We start from any node of GG, and repeatedly traverse outgoing edges of GG until we repeat a node of GG. In the worst case, we compute an outgoing edge for each of the O(n/ε)O(n/\varepsilon) nodes of GG before finding a cycle. The resulting cycle in GG exactly corresponds to a closed quasigeodesic on the polyhedron, by the definition of the graph. Each traversal of an edge in GG corresponds to a vertex-to-vertex geodesic path, which either follows a polyhedron edge (and so has just one segment) or applies Lemma 3.7 to a cone with angle ε/18\geq\varepsilon/18 (by the third paragraph, because the cone did not contain a polyhedron edge), so has O(L3ε3)O\left(\frac{L^{3}}{\varepsilon\,\ell^{3}}\right) segments. The closed quasigeodesic can be described by O(n/ε)O(n/\varepsilon) such vertex-to-vertex paths, for a total of O(nL3ε23)O\left(\frac{n\,L^{3}}{\varepsilon^{2}\,\ell^{3}}\right) segments.
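Since each node of GG has exactly one computed outgoing edge, this walk is cycle detection in a functional graph. A sketch of the traversal, where the successor map is a hypothetical stand-in for the outgoing-edge computation above:

```python
def find_cycle(succ, start):
    """Follow outgoing edges from start until a node repeats; return
    the nodes of the resulting cycle in order. succ(node) returns the
    node's single computed outgoing edge."""
    order = {}            # node -> position in the walk
    walk = []
    node = start
    while node not in order:
        order[node] = len(walk)
        walk.append(node)
        node = succ(node)
    return walk[order[node]:]   # the cycle: from the first repeat onward

# Toy successor map standing in for the (vertex, wedge) transitions.
toy = {'a': 'b', 'b': 'c', 'c': 'd', 'd': 'b'}
assert find_cycle(toy.get, 'a') == ['b', 'c', 'd']
```

The walk visits each node at most once before closing, matching the O(n/ε)O(n/\varepsilon) bound on the number of outgoing-edge computations.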

On the real RAM, the overall running time is O(n)O(n) for the preprocessing in the first paragraph, O(lg1ε)O(\lg\frac{1}{\varepsilon}) for the preprocessing in the second paragraph, and O(nL3ε23lgn)O\left(\frac{n\,L^{3}}{\varepsilon^{2}\,\ell^{3}}\lg n\right) for walking/constructing the graph, of which the last term dominates. On the word RAM, the overall running time is the sum of Equations (11), (12), and O(n/ε)O(n/\varepsilon) times (13):

O(nT(P,lgn))+O(T((P,a),lgn)lg1ε)+O(nlgnεT(P,lgnL3ε3+2))O\big{(}n\cdot T(P,\lceil\lg n\rceil)\big{)}+O\Big{(}T\big{(}(P,a),\lceil\lg n\rceil\big{)}\cdot\lg{\textstyle\frac{1}{\varepsilon}}\Big{)}+O\left({\textstyle\frac{n\lg n}{\varepsilon}}\cdot T\left(P,\lg{\textstyle\frac{n\,L^{3}}{\varepsilon\,\ell^{3}}}+2\right)\right) (14)

Substituting Equation (10), the third term of (14) expands to the bound below. (The word-RAM running time appears to have one fewer top-level 1/ε1/\varepsilon factor than the real-RAM running time, though it actually gains many more 1/ε1/\varepsilon factors from the increase in expression complexity. This difference stems from the geometric series in Lemma 3.6, which is dominated by just the last step in ray following, so we lose the top-level factor of O(L3ε3)O\left(\frac{L^{3}}{\varepsilon\,\ell^{3}}\right) from the number of segments followed; instead it appears in the increase in expression complexity.)

O(nlgnεb(P)2(64max{2,Δ(P)})2|R(P)Λ(P)|+2(H(P)+lgnL3ε3)+10).\displaystyle O\left({\textstyle\frac{n\lg n}{\varepsilon}}\cdot b^{*}(P)^{2}\cdot\big{(}64\cdot\max\{2,\Delta(P)\}\big{)}^{2|R(P)\cup\Lambda(P)|+2\left(H(P)+\lg{\textstyle\frac{n\,L^{3}}{\varepsilon\,\ell^{3}}}\right)+10}\right). (15)

The first term of (14) is clearly dominated by the third term. The second term of (14) is also dominated by the third term because a=Θ(1/ε)a=\Theta(1/\varepsilon) (as argued above) and aa contributes only to bb^{*} (not Δ\Delta, RR, Λ\Lambda, or HH), so overall it adds a factor of Θ(1/ε)2\Theta(1/\varepsilon)^{2} to TT, which is dominated by the factor 64lg1ε64^{\lg{\textstyle\frac{1}{\varepsilon}}} appearing in the expansion (15) of the third term. Therefore the expression in (15) bounds the overall running time. ∎

If DD is the greatest diameter of a face, then a closed quasigeodesic found by Theorem 3.8 has length O(nε(Lε+D))O\left({n\over\varepsilon}\left({L\over\varepsilon}+D\right)\right), because the quasigeodesic visits O(n/ε)O(n/\varepsilon) graph nodes and, by Lemma 3.3, goes a distance at most L/ε+DL/\varepsilon+D between each consecutive pair.

4 Conclusion

It has been known for seven decades [Pog49] that every convex polyhedron has a closed quasigeodesic, but our algorithm is the first finite algorithm to find one. We end with some open problems about extending our approach, though they all seem difficult.

Open Problem 1.

Is there a reduction from sum-of-square-roots [DMO09] to a decision problem involving quasigeodesics?

One goal would be to represent the sum of nn given square roots by unfolding/following a geodesic ray across Θ(n)\Theta(n) edges and faces, where ideally the faces could come from a convex polyhedron. As described in Section 1.1, such a reduction would justify our algorithm taking exponential time in the worst case, as no better bound is known for sum-of-square-roots.
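To see why sum-of-square-roots is delicate: comparing two such sums numerically requires enough precision to separate them, and no polynomial bound on the required precision is known. A naive fixed-precision comparison, not part of the paper’s algorithm and purely illustrative:

```python
from decimal import Decimal, getcontext

def compare_sqrt_sums(xs, ys, digits=50):
    """Compare sum of sqrt(x) over xs with sum of sqrt(y) over ys at a
    fixed decimal precision. Returns -1, 0, or 1. A result of 0 means
    'equal at this precision', which is not a proof of equality -- the
    open question is how many digits suffice in the worst case."""
    getcontext().prec = digits
    sx = sum(Decimal(x).sqrt() for x in xs)
    sy = sum(Decimal(y).sqrt() for y in ys)
    if sx == sy:
        return 0
    return -1 if sx < sy else 1

# sqrt(2) + sqrt(8) = 3 sqrt(2) ~ 4.243 < sqrt(3) + sqrt(7) ~ 4.378
assert compare_sqrt_sums([2, 8], [3, 7]) == -1
```

A reduction as described above would encode such a comparison into a decision about a geodesic ray, transferring this precision barrier to quasigeodesic problems.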

Open Problem 2.

Is there a quasipolynomial-time algorithm on the real RAM for finding a non-self-intersecting closed quasigeodesic? In particular, can we find the shortest closed quasigeodesic?

At least three non-self-intersecting closed quasigeodesics exist [Pog49], but Theorem 3.8 does not necessarily find one. Any approach similar to Theorem 3.8 is unlikely to resolve this, for several reasons:

  1.

    Parts of a quasigeodesic could enter a vertex at infinitely many angles. Theorem 3.8 makes this manageable by grouping similar angles of entry to a vertex, but if similar angles of entry to a vertex are combined, extensions that would be valid for some of them but invalid for others are treated as invalid for all of them. For instance, a quasigeodesic found by Theorem 3.8 will almost never turn by the maximum allowed at any vertex, since exiting a vertex at the maximum possible turn from one entry angle to the vertex may mean exiting it with more of a turn than allowed for another very close entry angle. So there are some closed quasigeodesics not findable by Theorem 3.8, and those may include non-self-intersecting ones.

  2.

    Given a vertex and a wedge determined by a range of directions from it, we can find one vertex in the wedge, but if we wish to find more than one, the problem becomes more complicated. When we seek only one vertex, we need only consider one unfolding of the faces, which the entire wedge stays in until it hits a vertex; when we pass a vertex, the unfoldings on each side of it might be different, so we multiply the size of the problem by 2 every time we pass a vertex. There may, in fact, be exponentially many non-self-intersecting geodesic paths between two vertices: for instance, Aronov and O’Rourke [DO07a] give the example of a doubly covered regular polygon, in which a geodesic path may visit every vertex in order around the cycle but may skip vertices.

On the other hand, recent work [CdM22] (after the conference version of this paper [DHK20]) gives an algorithm to find a non-self-intersecting closed quasigeodesic on a polyhedron, but the running time is pseudo-exponential (exponential in both nn and L/L/\ell).

Open Problem 3.

On the real RAM, the running time of Theorem 3.8 is polynomial in not just nn but the smallest curvature at a vertex, the length of the longest edge, and the shortest distance within a face between a vertex and an edge not containing it. Are all of those necessary? Can the last be simplified to the length of the shortest edge?

Open Problem 4.

On the expression RAM (or word RAM), the running time of Theorem 3.8 has an exponential dependence on nn and the geometric features. Can this be improved to a polynomial dependence, that is, a pseudopolynomial-time algorithm?

Open Problem 5.

Can the algorithm of Theorem 3.8 be extended to nonconvex polyhedra?

A quasigeodesic cannot pass through a nonconvex vertex. If the extended wedge in our algorithm contains a nonconvex vertex, the wedge will split in two, as shown in Figure 13, and complexity grows exponentially.

Figure 13: An example of our algorithm applied to a nonconvex polycube. A geodesic search from vertex aa within an angular interval may encounter a nonconvex vertex from which the search space divides.

Recent work [CdM22] (after the conference version of this paper) defines an alternate notion of quasigeodesic path for nonconvex polyhedra, which requires at negative-curvature vertices that the path have an angle of at least 180180^{\circ} on both sides (while at positive-curvature vertices the path still has angles of at most 180180^{\circ}). They prove that every nonconvex polyhedron has such a closed “quasigeodesic” and give an algorithm to find one. Indeed, our algorithm can also find such a path, after redefining ε\varepsilon to be the smallest absolute curvature of a polyhedron vertex.

Acknowledgments

The authors thank Zachary Abel, Nadia Benbernou, Fae Charlton, Jayson Lynch, Joseph O’Rourke, Diane Souvaine, and David Stalfa for discussions related to this paper.

References

  • [AAOS97] Pankaj K. Agarwal, Boris Aronov, Joseph O’Rourke, and Catherine A. Schevon. Star unfolding of a polytope with applications. SIAM Journal on Computing, 26(6):1689–1713, December 1997.
  • [Bal78] Hans Werner Ballmann. Der Satz von Lusternik und Schnirelmann. Bonner Mathematische Schriften, 102:1–25, 1978.
  • [BFM+01] Christoph Burnikel, Stefan Funke, Kurt Mehlhorn, Stefan Schirra, and Susanne Schmitt. A separation bound for real algebraic expressions. In Proceedings of the 9th Annual European Symposium on Algorithms, volume 2161 of Lecture Notes in Computer Science, pages 254–265, Aarhus, Denmark, August 2001.
  • [Bir27] George D. Birkhoff. Dynamical Systems, volume 9 of Colloquium Publications. American Mathematical Society, 1927.
  • [BMS85] A. Bertoni, G. Mauri, and N. Sabadini. Simulations among classes of random access machines and equivalence among numbers succinctly represented. Annals of Discrete Mathematics, 25:65–90, 1985.
  • [BMZ02] Yves Bertot, Nicolas Magaud, and Paul Zimmermann. A proof of GMP square root. Journal of Automated Reasoning, 29:225–252, 2002.
  • [BTZ83] Werner Ballmann, Gudlaugur Thorbergsson, and Wolfgang Ziller. On the existence of short closed geodesics and their stability properties. In Seminar on Minimal Submanifolds, pages 53–63. Princeton University Press, 1983.
  • [BZ10] Richard P. Brent and Paul Zimmermann. Modern Computer Arithmetic. Cambridge University Press, November 2010.
  • [CdM22] Jean Chartier and Arnaud de Mesmay. Finding weakly simple closed quasigeodesics on polyhedral spheres. In Proceedings of the 38th International Symposium on Computational Geometry, Berlin, Germany, June 2022. arXiv:2203.05853.
  • [CP09] Timothy M. Chan and Mihai Pǎtraşcu. Transdichotomous results in computational geometry, I: Point location in sublogarithmic time. SIAM Journal on Computing, 39(2):703–729, 2009.
  • [CP10] Timothy M. Chan and Mihai Pǎtraşcu. Transdichotomous results in computational geometry, II: Offline search. arXiv:1010.1948, 2010. Originally published at STOC 2007.
  • [DHK20] Erik D. Demaine, Adam C. Hesterberg, and Jason S. Ku. Finding closed quasigeodesics on convex polyhedra. In Proceedings of the 36th International Symposium on Computational Geometry, pages 33:1–33:13, June 2020.
  • [DMO09] Erik D. Demaine, Joseph S. B. Mitchell, and Joseph O’Rourke. Problem 33: Sum of square roots. In The Open Problems Project. 2009. https://topp.openproblem.net/p33.
  • [DO07a] Erik D. Demaine and Joseph O’Rourke. Geodesics: Lyusternik-Schnirelmann. [DO07b], section 24.4, pages 372–375.
  • [DO07b] Erik D. Demaine and Joseph O’Rourke. Geometric Folding Algorithms: Linkages, Origami, Polyhedra. Cambridge University Press, July 2007.
  • [Für14] Martin Fürer. How fast can we multiply large integers on an actual computer? In Alberto Pardo and Alfredo Viola, editors, Proceedings of the 11th Latin American Symposium on Theoretical Informatics, pages 660–670, 2014.
  • [FW93] Michael L. Fredman and Dan E. Willard. Surpassing the information theoretic bound with fusion trees. Journal of Computer and System Sciences, 47(3):424–436, 1993.
  • [Gib61] Allan Gibb. Algorithm 61: Procedures for range arithmetic. Communications of the ACM, 4(7):319–320, July 1961.
  • [HvdH19] David Harvey and Joris van der Hoeven. Integer multiplication in time O(nlogn)O(n\log n). HAL Preprint hal-02070778, 2019. https://hal.archives-ouvertes.fr/hal-02070778.
  • [IRV19] Jin-Ichi Itoh, Joël Rouyer, and Costin Vîlcu. Polyhedra with simple dense geodesics. Differential Geometry and its Applications, 66:242–252, 2019.
  • [KLPY99] V. Karamcheti, C. Li, I. Pechtchanski, and C. Yap. A core library for robust numeric and geometric computation. In Proceedings of the 15th Annual Symposium on Computational Geometry, pages 351–359, Miami Beach, Florida, 1999.
  • [Knu69] Donald E. Knuth. The Art of Computer Programming, volume 2. Addison-Wesley, 1969.
  • [KPD09] Daniel Kane, Gregory N. Price, and Erik D. Demaine. A pseudopolynomial algorithm for Alexandrov’s Theorem. In Proceedings of the 11th Algorithms and Data Structures Symposium, volume 5664 of Lecture Notes in Computer Science, pages 435–446, Banff, Canada, August 2009.
  • [Lac03] M. Laczkovich. The removal of π\pi from some undecidable problems involving elementary functions. Proceedings of the American Mathematical Society, 131(7):2235–2240, 2003.
  • [LS29] Lazar Lyusternik and Lev Schnirelmann. Sur le probléme de trois géodésiques fermées sur les surfaces de genre 0. Comptes Rendus de l’Académie des Sciences de Paris, 189:269–271, 1929.
  • [Moo79] Ramon E. Moore. Methods and Applications of Interval Analysis. Society for Industrial and Applied Mathematics, Philadelphia, 1979.
  • [O’R20] Joseph O’Rourke. Personal communication, 2020.
  • [Pog49] Aleksei Vasilevich Pogorelov. Quasi-geodesic lines on a convex surface. Matematicheskii Sbornik, 25(62):275–306, 1949. English translation in American Mathematical Society Translations 74, 1952.
  • [Poi05] Henri Poincaré. Sur les lignes géodésiques des surfaces convexes. Transactions of the American Mathematical Society, 6(3):237–274, 1905.
  • [SC20] Nicholas Sharp and Keenan Crane. You can find geodesic paths in triangle meshes by just flipping edges. ACM Transactions on Graphics, 39(6):1–15, December 2020.
  • [Sch79] Arnold Schönhage. On the power of random access machines. In Proceedings of the 6th International Colloquium on Automata, Languages, and Programming, volume 71 of Lecture Notes in Computer Science, pages 520–529, 1979.
  • [Sch80] A. Schönhage. Storage modification machines. SIAM Journal on Computing, 9(3):490–508, 1980.
  • [Sha78] Michael Ian Shamos. Computational Geometry. PhD thesis, Yale University, 1978.
  • [Zim99] Paul Zimmermann. Karatsuba square root. Research Report 3805, INRIA, 1999. http://hal.inria.fr/docs/00/07/28/54/PDF/RR-3805.pdf.