
Neural Octahedral Field: Octahedral prior for simultaneous smoothing and sharp edge regularization

Ruichen Zheng, Tsinghua University, Haidian District, Beijing, China ([email protected]) and Tao Yu, Tsinghua University, Haidian District, Beijing, China ([email protected])
Abstract.

Neural implicit representation, the parameterization of a distance function as a coordinate neural field, has emerged as a promising approach to surface reconstruction from unoriented point clouds. To enforce consistent orientation, existing methods focus on regularizing the gradient of the distance function, such as constraining it to be of unit norm, minimizing its divergence, or aligning it with the eigenvector of the Hessian that corresponds to a zero eigenvalue. However, in the presence of large scanning noise, they tend to either overfit the noisy input or produce an excessively smooth reconstruction. In this work, we propose to guide surface reconstruction with a new variant of neural field, the octahedral field, leveraging the spherical harmonics representation of octahedral frames originating in hexahedral meshing. Such a field automatically snaps to geometric features when constrained to be smooth, and naturally preserves sharp angles when interpolated over creases. By simultaneously fitting and smoothing the octahedral field alongside the implicit geometry, it behaves analogously to bilateral filtering, yielding smooth reconstructions with preserved sharp edges. Despite operating purely pointwise, our method outperforms various traditional and neural approaches across extensive experiments, and is very competitive with methods that require normal or data priors. Our full implementation is available at: https://github.com/Ankbzpx/frame-field.

Figure 1. Given an unoriented point cloud and starting from a faithful initialization (e.g., with NSH (Wang et al., 2023)), our method aligns a smooth neural octahedral field with the jointly learned implicit geometry, leveraging its octahedral prior to simultaneously smooth the reconstruction and regularize its sharp edges.

1. Introduction

As the predominant means of acquiring real-world geometry, 3D scanning technologies have advanced rapidly and become increasingly accessible over the past several decades. Early procedures relied on triangulation-based solutions that required a strictly controlled environment, but handheld scanning using time-of-flight sensors in smartphones is now no longer a rare sight. Despite the varying methodologies, the intermediate noisy point cloud remains the most common data consumed by the reconstruction step, making its study of great importance and practical value.

The goal of most surface reconstruction tasks is to remove as much noise as possible while preserving geometric features such as sharp edges. In fact, the denoising and sharpening tasks are often interconnected. For instance, Fleishman et al. (2005) employ a median filter to remove the initial noise, then adopt a forward-search scheme that progressively fits a subset and filters outliers to determine a set of patches, which can be further sampled towards edges. Huang et al. (2013) first apply bilateral filters to denoise the noisy observation so that off-edge normals can be reliably estimated, then progressively sample towards the edge with the newly updated normals to improve edge quality. Similarly, Wei et al. (2023a) draw on line processes that favor smoothness when a similarity measurement is under a tunable threshold, hence smoothing without compromising sharp features. All of the work above requires fitting local tangent planes over neighborhood patches with KNN connectivity.

However, when it comes to neural implicit representation, which is purely pointwise and has no connectivity or local neighborhood in the traditional sense, enforcing smoothness while preserving sharp geometric features becomes an exceedingly challenging task. Processing such a representation requires slow and costly evaluation of high-order energies over densely sampled queries (Yang et al., 2021). Even though connectivity can be created using differentiable surface extraction (Remelli et al., 2020), its backpropagation violates the level set equation and hence is inherently biased (Mehta et al., 2022). Meanwhile, as the popularity of neural implicit representation increases, it is gradually becoming the direct output when acquiring new geometries (Mildenhall et al., 2021), rather than an intermediate format for storing existing ones, so its denoising and sharpening holds practical value.

In this work, we tackle this challenge through the lens of surface reconstruction from unoriented points–newly proposed regularizations (Ben-Shabat et al., 2023; Wang et al., 2023) can produce a faithful initialization from positional constraints alone, but cannot handle noise or enforce sharp features unless they are given by the positional input. To this end, we introduce a guiding octahedral field, which is edge-aware by nature. By alternating between aligning the octahedral field with the implicit geometry's normals and regularizing the normals to match the guide field, the optimization converges to a smooth reconstruction with exaggerated sharp edges. More importantly, our method stays pointwise throughout the process.

Our contribution can be summarized as:

  • Propose a novel method to simultaneously smooth and emphasize the sharp edges of neural implicit representations for unoriented point cloud surface reconstruction.

  • Introduce a new variant of neural field, the octahedral field, that can be jointly learned with other neural implicit representations. Moreover, we design an effective alignment loss as well as an efficient smoothness loss to make learning a zero-level-set-aligned smooth octahedral field feasible.

  • Design a novel sharp edge regularization loss that utilizes the cubic symmetry of the octahedral frame to regularize the gradient of the implicit geometry, encouraging it to preserve sharp turning angles near edges.

2. Background and Related work

Figure 2. NSH and DiGS are powerful methods for fitting unoriented point clouds. However, when facing large scanning noise, they tend to overfit the positional constraints after the annealing of the regularization weights (Top). Careful tuning can help balance noise removal and smoothness, but cannot fully resolve the challenge (Middle). We propose to ”snapshot” their initialization with a jointly trained octahedral field. Such a field is sharp-edge aware and can be used to regularize the implicit geometry to improve reconstruction fidelity.

2.1. Surface reconstruction from noisy point cloud

Surface reconstruction has been extensively researched in recent decades and continues to evolve rapidly with the advancement of deep learning. We focus on reviewing the approaches most related to ours and refer interested readers to more detailed reviews (Huang et al., 2022; Sulzer et al., 2024).

Point Set Surfaces (PSS) (Alexa et al., 2001) fit polynomials to approximate the surface locally. Query points are evaluated using Moving Least Squares (MLS) (Levin, 2004), with the gradient of the polynomials indicating the surface normals (Alexa and Adamson, 2004). Algebraic Point Set Surfaces (APSS) (Guennebaud and Gross, 2007) fit an algebraic sphere to improve the stability of PSS over regions of high curvature. Robust Implicit MLS (Öztireli et al., 2009) integrates robust statistics with local kernel regression to handle outliers. Fleishman et al. (2005) apply forward search (Atkinson and Riani, 2012) to filter outliers that cannot be identified by robust statistics. Variational Implicit Point Set Surfaces (VIPSS) (Huang et al., 2019) leverage the eikonal constraint to ease the need for input normals, but at the cost of cubic complexity. In contrast to local fitting, Poisson Surface Reconstruction (PSR) (Kazhdan et al., 2006) represents the surface globally as the level set of an implicit indicator function whose gradient matches the input normals. Screened Poisson Surface Reconstruction (SPSR) (Kazhdan and Hoppe, 2013) integrates positional constraints to balance precision and smoothness. Recently, Neural Kernel Surface Reconstruction (NKSR) (Huang et al., 2023) integrates aspects from both ends–it models the surface as the zero level set of a global implicit function, but with its value determined by learned proximity (a neural kernel) with respect to input samples.

2.2. Point cloud resampling and denoising

Edge-Aware Point Set Resampling (EAR) (Huang et al., 2013) applies bilateral filtering to reliably fit normals away from edges, then progressively samples towards the edge to capture sharp features. Wei et al. (2023b) predict a cross field and leverage its crease-aligned nature (Jakob et al., 2015; Huang and Ju, 2016) to tessellate space for convolution. Sarkar et al. (2018) randomly sample square patches, fit local reference planes with RANSAC (Fischler and Bolles, 1981), store local heights in square matrices, and denoise via low-rank matrix factorization. Zeng et al. (2019) sample overlapping patches, connect samples by projection distance, and perform graph smoothing. Wei et al. (2023a) use line processes to perform smoothing up to an optimizable threshold. Deep Geometric Prior (DGP) (Williams et al., 2019) parameterizes local patches with MLPs and relies on their smoothness prior to denoise.

2.3. Fitting-based neural implicit representation

Neural implicit representation encodes the surface as the zero-level set of a distance function parameterized by a coordinate MLP. We primarily review its application to fitting-based surface reconstruction, and refer readers interested in neural fields in general to the comprehensive survey by Xie et al. (2021).

DeepSDF (Park et al., 2019) and Occupancy Networks (Mescheder et al., 2019) parameterize the signed distance function (SDF) or occupancy classification over spatial locations, capable of encoding geometry at arbitrary resolution. SAL (Atzmon and Lipman, 2020) samples in the vicinity of the input point cloud and learns an unsigned distance function. SALD (Atzmon and Lipman, 2021) further improves the reconstruction quality with normal supervision. IGR (Gropp et al., 2020) introduces the eikonal regularization term, under which, with stochastic optimization, the MLP converges to a faithful SDF. When normals are not available, the eikonal regularization is insufficient to guarantee a unique solution, resulting in ambiguities and artifacts. Park et al. (2023) model the SDF gradient as the unique solution to the $p$-Poisson equation and additionally minimize the surface area to improve hole filling. DiGS (Ben-Shabat et al., 2023) initializes the MLP with a geometric sphere and minimizes the divergence of its gradient to preserve orientation consistency. Neural Singular Hessian (NSH) (Wang et al., 2023) encourages the Hessian of the MLP to have zero determinant and produces a topologically faithful initialization. However, both DiGS and NSH require annealing the regularization weights to fit finer details, which makes them prone to noise corruption. Our idea is to ”snapshot” the initialization as a prior, to compensate for the emergence of noise (Figure 2).

2.4. Functional representation of the octahedral field

In this section, we give a brief summary of the Spherical Harmonics (SH) representation of the octahedral frame by Huang et al. (2011), Ray et al. (2016), and Palmer et al. (2020). A full review is available in Section A of the supplementary.


2.4.1. Definition

An octahedral frame can be represented by three mutually orthogonal unit vectors $\{\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3\}$, or equivalently, a rotation matrix $\mathbf{R}=[\mathbf{v}_1|\mathbf{v}_2|\mathbf{v}_3]\in SO(3)$ associated with the canonical frame of standard basis vectors $\{\mathbf{e}_x,\mathbf{e}_y,\mathbf{e}_z\}$. To avoid matching representation vectors, it is convenient to describe the canonical frame as a spherical polynomial $F:S^2\to\mathbb{R}$, so it can be projected onto bands 0 and 4 of the SH basis:

F(\mathbf{s}) = (\mathbf{s}\cdot\mathbf{e}_x)^4 + (\mathbf{s}\cdot\mathbf{e}_y)^4 + (\mathbf{s}\cdot\mathbf{e}_z)^4
             = c_0\big(c_1 Y_0^0(\mathbf{s}) + \sqrt{\tfrac{7}{12}}\,Y_4^0(\mathbf{s}) + \sqrt{\tfrac{5}{12}}\,Y_4^4(\mathbf{s})\big)
             = c_0\big(c_1 Y_0^0(\mathbf{s}) + \mathbf{Y}_4(\mathbf{s})^T\mathbf{q}_0\big)
\mathbf{q}_0 = \big[0,0,0,0,\sqrt{\tfrac{7}{12}},0,0,0,\sqrt{\tfrac{5}{12}}\big]^T \in \mathbb{R}^9,\quad \mathbf{s}\in S^2

where $Y_l^m$ is the SH basis function of band $l$, order $m$, $\mathbf{Y}_4$ is the vector form of the band-4 basis, and $c_0$, $c_1$ are constants.

Since $Y_0^0$ is also constant, $F(\mathbf{s})$ can be fully characterized by the SH band-4 coefficient vector $\mathbf{q}_0$ alone, and so can any general octahedral frame $F(\mathbf{R}^T\mathbf{s})$:

F(\mathbf{R}^T\mathbf{s}):\quad \mathbf{q} = \tilde{\mathbf{R}}\mathbf{q}_0 \in \mathbb{R}^9,\quad \tilde{\mathbf{R}}\in SO(9)

where $\tilde{\mathbf{R}}$ is the Wigner D-matrix induced by $\mathbf{R}$. Due to the orthogonality of the SH basis, $\|\mathbf{q}\|_2=\|\tilde{\mathbf{R}}\mathbf{q}_0\|_2=\|\mathbf{q}_0\|_2=1$, so $\mathbf{q}$ is also a valid SH band-4 coefficient vector. We refer to this set of unit-norm coefficient vectors in $\mathbb{R}^9$ associated with octahedral frames as the octahedral variety.

With the SH-parameterized functional representation, the difference between two general frames associated with $\mathbf{R}_a$ and $\mathbf{R}_b$ can be measured by a spherical integral, which further reduces to the difference between their coefficient vectors:

(1) \int_{S^2}\big(F(\mathbf{R}_a^T\mathbf{s})-F(\mathbf{R}_b^T\mathbf{s})\big)^2 d\mathbf{s} = \int_{S^2}\big(\mathbf{Y}_4(\mathbf{s})^T\mathbf{q}_a-\mathbf{Y}_4(\mathbf{s})^T\mathbf{q}_b\big)^2 d\mathbf{s} = \|\mathbf{q}_a-\mathbf{q}_b\|_2^2

The association of octahedral frames with spatial coordinates, known as the octahedral field, can be described by a function $u:\mathbf{p}\in\mathbb{R}^3\to\mathbf{q}\in\mathbb{R}^9$, whose smoothness can be measured using the Dirichlet energy and discretized with (1) given connectivity:

(2) \int_V \|\nabla u\|^2 d\mathbf{p} = \sum_{e_{ij}\in\mathcal{E}} w_{ij}\,\|\mathbf{q}_i-\mathbf{q}_j\|_2^2

where $\mathcal{E}$ is the set of edges and $w_{ij}$ is the harmonic weight of edge $e_{ij}$.
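For concreteness, this reduction can be sketched in a few lines of JAX (our implementation language); the names below are illustrative rather than taken from our released code:

```python
import jax.numpy as jnp

# Canonical frame as its SH band-4 coefficient vector (Sec. 2.4.1).
Q0 = jnp.array([0., 0., 0., 0., jnp.sqrt(7. / 12.), 0., 0., 0., jnp.sqrt(5. / 12.)])

def frame_distance_sq(qa, qb):
    """Eq. (1): by orthonormality of the SH basis, the spherical integral of
    the squared frame difference reduces to a Euclidean distance in R^9."""
    return jnp.sum((qa - qb) ** 2)

def dirichlet_energy(q, edges, weights):
    """Eq. (2): discrete Dirichlet energy of an octahedral field.
    q: (V, 9) per-vertex coefficients; edges: (E, 2) vertex indices;
    weights: (E,) harmonic edge weights w_ij."""
    diff = q[edges[:, 0]] - q[edges[:, 1]]
    return jnp.sum(weights * jnp.sum(diff ** 2, axis=-1))
```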

2.4.2. Normal alignment

The octahedral field is classically designed over a volume enclosed by its boundary surface, with boundary frames having one of their axes aligned with the surface normal, while being as smooth as possible everywhere else. The normal-aligned frames $\mathbf{q}_n$ can be modelled as the set of $z$-axis-aligned frames $\mathbf{q}_z$ rotated towards the normal direction, parameterized by twisting about the $z$ axis:

(3) \mathbf{q}_z = \big[\sqrt{\tfrac{5}{12}}\cos 4\theta,\,0,0,0,\,\sqrt{\tfrac{7}{12}},\,0,0,0,\,\sqrt{\tfrac{5}{12}}\sin 4\theta\big]^T
    \mathbf{q}_n = \tilde{\mathbf{R}}_{z\to n}\mathbf{q}_z \in \mathbb{R}^9,\quad \tilde{\mathbf{R}}_{z\to n}\in SO(9)

where $z\to n$ denotes the rotation from the $z$ axis to the normal $\mathbf{n}$, and $\theta\in\mathbb{R}$ is the twisting angle that parameterizes the quadratic equality:

c_0^2 + c_8^2 = 1,\quad c_0=\cos 4\theta,\quad c_8=\sin 4\theta

For a coefficient vector $\mathbf{q}\in\mathbb{R}^9$, Palmer et al. (2020) observe that its closest $z$-axis-aligned frame has the form:

\Pi_z(\mathbf{q}) = \Big[\frac{\sqrt{5/12}\,\mathbf{q}[0]}{\sqrt{\mathbf{q}[0]^2+\mathbf{q}[8]^2}},\,0,0,0,\,\sqrt{\tfrac{7}{12}},\,0,0,0,\,\frac{\sqrt{5/12}\,\mathbf{q}[8]}{\sqrt{\mathbf{q}[0]^2+\mathbf{q}[8]^2}}\Big]

where $[\cdot]$ denotes 0-based array indexing. Notice how normalizing the first and last coefficients incorporates the quadratic constraint without expressing it as an explicit equality. They further propose the projection onto the closest $\mathbf{n}$-aligned frame as:

(4) \Pi_n(\mathbf{q}) = \tilde{\mathbf{R}}_{z\to n}\,\Pi_z(\tilde{\mathbf{R}}_{z\to n}^T\mathbf{q})

We employ this projection in our normal alignment constraint.
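As a reference, the two projections can be sketched in JAX as follows; the band-4 Wigner D-matrix $\tilde{\mathbf{R}}_{z\to n}$ is assumed to be constructed elsewhere (e.g., from closed-form band-4 rotation formulas) and is passed in as `R_tilde`:

```python
import jax.numpy as jnp

def project_z(q, eps=1e-8):
    """Closest z-axis-aligned frame to q (Palmer et al., 2020). Normalizing
    the first and last entries realizes cos^2(4t) + sin^2(4t) = 1 of Eq. (3)
    without an explicit equality constraint."""
    s = jnp.sqrt(q[0] ** 2 + q[8] ** 2) + eps
    c = jnp.sqrt(5. / 12.)
    out = jnp.zeros(9).at[4].set(jnp.sqrt(7. / 12.))
    out = out.at[0].set(c * q[0] / s)
    return out.at[8].set(c * q[8] / s)

def project_n(q, R_tilde):
    """Eq. (4): closest n-aligned frame, with R_tilde the 9x9 band-4 Wigner
    D-matrix of the rotation taking the z axis to the normal n."""
    return R_tilde @ project_z(R_tilde.T @ q)
```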

Figure 3. Visualization of how octahedral frames interpolate over sharp creases. We use the Gauss map to illustrate angle coverage.

Figure 4. The illustration of the proposed alternating optimization procedure.

2.4.3. Interpolation

Our key observation concerns how normal-aligned octahedral frames interpolate across creases–they behave similarly to normals, which are ambiguous at discrete vertices (Solomon et al., 2017) but interpolate smoothly in continuous settings. However, such interpolation exhibits cubic symmetry–it favors sharp turning when the dihedral angle is large and converges to vertex normal interpolation in smooth regions (Figure 3). This property makes the octahedral field edge-aware, an ideal property when handling scanning noise. Moreover, the SH representation fully parameterizes the octahedral field as an $\mathbb{R}^9$ coefficient vector field over the ambient space $\mathbb{R}^3$, allowing it to be implicitly encoded to pair with the neural implicit representation.

3. Method

Given an unoriented point cloud $\{\mathbf{p}_i\}_{i=1}^N$ and a neural implicit representation $f_\omega:\mathbb{R}^3\to\mathbb{R}$, we introduce a jointly trained octahedral field parameterized by a dedicated coordinate MLP:

u_\theta:\mathbf{x}\in\mathbb{R}^3\to\mathbf{q}\in\mathbb{R}^9

where $\omega$ and $\theta$ are the MLP weights.

As illustrated in Figure 4, our goal is to pair a smooth octahedral field with an initial implicit surface. By alternately aligning the octahedral field with the fixed surface normal, and regularizing the surface normal to match one of the fixed octahedral frame's representation vectors, the implicit geometry is encouraged to be smooth while its sharp edges are emphasized.

3.1. Normal alignment

For each sample on the surface, we want to minimize the deviation between the predicted octahedral frame and its closest normal-aligned projection. Although the surface normal is not provided, $f_\omega$ is always differentiable, so we utilize the gradient at its zero level set as the boundary condition defining our octahedral field:

\sum_{i=1}^N \phi\Big(u_\theta(\mathbf{p}_i'),\ \Pi\Big(u_\theta(\mathbf{p}_i'),\ \frac{\nabla f_\omega(\mathbf{p}_i')}{\|\nabla f_\omega(\mathbf{p}_i')\|_2}\Big)\Big)

where $\mathbf{p}_i'=\mathbf{p}_i-f_\omega(\mathbf{p}_i)\cdot\frac{\nabla f_\omega(\mathbf{p}_i)}{\|\nabla f_\omega(\mathbf{p}_i)\|_2}$ is the projection of $\mathbf{p}_i$ onto the zero level set, $\Pi(\mathbf{q},\mathbf{n})$ rewrites (4) by treating both the SH coefficient vector and the normal as arguments, and $\phi$ is a similarity measure.

In practice, to avoid the costly projection step, we adapt the weighting scheme from Ma et al. (2023): we use $\nabla f_\omega(\mathbf{p}_i)$ as a surrogate and weight it by the distance to the zero level set. The alignment loss then simplifies to:

(5) \mathcal{L}_{\text{align}} = \sum_{i=1}^N \beta(\mathbf{p}_i)\cdot\phi\Big(u_\theta(\mathbf{p}_i),\ \Pi\Big(u_\theta(\mathbf{p}_i),\ \frac{\nabla f_\omega(\mathbf{p}_i)}{\|\nabla f_\omega(\mathbf{p}_i)\|_2}\Big)\Big)

where $\beta(\mathbf{p}_i)=\exp(-100\cdot|f_\omega(\mathbf{p}_i)|)$, and we choose the cosine similarity $\phi(\cdot,\cdot)=(1-\langle\cdot,\cdot\rangle)$. We observe only a minor difference compared to direct projection. It is worth pointing out that the weighting is important–frames distant from the surface may deviate significantly due to the existence of singularities. Note that (5) is a function of both MLPs, so we stop the gradient propagation for $f_\omega$ and focus on aligning $u_\theta$.

The design of (5) yields two benefits. First, as discussed in Section 2.4.2, the projection (4) by Palmer et al. (2020) alleviates the need for additional parameterization or quadratic equalities, especially when such equalities are defined element-wise. Second, the choice of cosine similarity relaxes the unit-norm constraint, which is topologically unsatisfiable for a smooth octahedral field due to the existence of singularities (Solomon et al., 2017). This also makes the output of $u_\theta$ a Euclidean vector space in $\mathbb{R}^9$, whose importance we elaborate on in the next section.

However, unlike the boundary condition in the discrete setting, (5) is inherently a soft constraint. Compared to exact imposition (Sukumar and Srivastava, 2022), it is notoriously difficult to train and requires dense boundary samples to fulfill (Raissi et al., 2019; Barschkis, 2023). This is especially challenging in our case, as small hole-like structures are a common occurrence in real-world scans. Such holes resemble cylindrical volumes that require multiple singularity curves to satisfy. Compounded by the difficulty of capturing the interior of those holes with range sensors, our octahedral field cannot fully align with those regions. We discuss the implications in Section 3.3.
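For illustration, a minimal JAX sketch of (5); `u_fn` and `sdf_fn` stand for the two coordinate MLPs applied to a single point, `build_wigner_d` is an assumed helper returning $\tilde{\mathbf{R}}_{z\to n}$, and `project_n` is the projection sketched in Section 2.4.2:

```python
import jax
import jax.numpy as jnp

def align_loss(u_fn, sdf_fn, points, build_wigner_d):
    """Alignment loss, Eq. (5): fit the octahedral field to the frozen
    geometry. u_fn: R^3 -> R^9; sdf_fn: R^3 -> R."""

    def per_point(p):
        f_val, f_grad = jax.value_and_grad(sdf_fn)(p)
        # Freeze the geometry: only u_theta receives gradients in this phase.
        f_val = jax.lax.stop_gradient(f_val)
        n = jax.lax.stop_gradient(f_grad)
        n = n / (jnp.linalg.norm(n) + 1e-8)

        q = u_fn(p)
        q_proj = project_n(q, build_wigner_d(n))   # Eq. (4), unit norm
        beta = jnp.exp(-100.0 * jnp.abs(f_val))    # distance weighting
        cos_sim = jnp.dot(q / (jnp.linalg.norm(q) + 1e-8), q_proj)
        return beta * (1.0 - cos_sim)              # phi = 1 - <.,.>

    return jnp.sum(jax.vmap(per_point)(points))
```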

3.2. Smoothness with Lipschitz continuity

The next step is to encourage the octahedral field to be smooth. Since $u_\theta$ directly outputs the SH coefficient vector, its MLP output space has Euclidean topology and is also where the smoothness of octahedral frames is measured. This allows the use of efficient Lipschitz regularization (Liu et al., 2022) as a smoothness proxy.

Under a 1-Lipschitz activation function (ReLU, ELU, $\sin$, tanh, etc.), the norm of each layer's output variation with respect to its input is bounded by that layer's Lipschitz constant $c_i$:

\|u_{\theta_i}(\mathbf{x}_0)-u_{\theta_i}(\mathbf{x}_1)\|_p \leq c_i\|\mathbf{x}_0-\mathbf{x}_1\|_p,\quad c_i=\|\mathbf{W}_i\|_p,\quad i=1\dots l

where $u_{\theta_i}$ denotes the $i$th layer of $u_\theta$, $\mathbf{x}$ is its input, $\mathbf{W}_i$ is its weight matrix, and $l$ is the number of layers. LipMLP (Liu et al., 2022) initializes the Lipschitz constant per layer as the $L_\infty$ norm of its randomly initialized weight matrix, $c_i=\|\mathbf{W}_i\|_\infty$. During training, $c_i$ is used to rescale the maximum absolute row-wise sum of the weight matrix $\mathbf{W}_i$, so that $c_i=\|\mathbf{W}_i\|_\infty$ still holds. By shrinking all Lipschitz constants, we minimize the MLP's output variation with respect to its input, hence encouraging it to be smooth:

(6) \mathcal{L}_{\text{lip}} = \prod_{i=1}^l \text{softplus}(c_i)

where softplus is a reparameterization that forces the bound to be positive ($\text{softplus}(c_i)\approx c_i$ for $c_i\gg 1$), and $p=\infty$ is chosen purely for computational efficiency. This loss is highly efficient because it only needs first-order derivatives, and all weight matrices can be updated in parallel.

In our case, minimizing the Lipschitz bound also has a geometric meaning. Denote the outputs for any two inputs $\mathbf{p}_0$, $\mathbf{p}_1$ as $\mathbf{q}_0=u_\theta(\mathbf{p}_0)$, $\mathbf{q}_1=u_\theta(\mathbf{p}_1)$ respectively; we have:

\|\mathbf{q}_0-\mathbf{q}_1\|_2 \leq 3\,\|\mathbf{q}_0-\mathbf{q}_1\|_\infty \leq 3\prod_{i=1}^l c_i\,\|\mathbf{p}_0-\mathbf{p}_1\|_\infty

The Lipschitz constants bound the variation of the SH coefficient vector (1) with respect to spatial variation of the input, hence minimizing them is analogous to minimizing the discrete Dirichlet energy (2). Note that this is only possible because we output SH coefficients directly–the output spaces of most rotation representations are not Euclidean (Zhou et al., 2019), and for those that are, they are not where the smoothness of octahedral frames is measured. Thus, they require more expensive gradient norm minimization or finite-difference equivalents (Huang et al., 2021; Zhang et al., 2023).
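A sketch of the corresponding weight normalization and loss (6), following the LipMLP formulation (Liu et al., 2022); the forward pass shown is illustrative, not our exact architecture:

```python
import jax
import jax.numpy as jnp

def lipschitz_normalize(W, c):
    """Rescale rows of W so its L-inf operator norm (maximum absolute
    row-wise sum) stays bounded by softplus(c)."""
    row_sums = jnp.sum(jnp.abs(W), axis=1)
    scale = jnp.minimum(1.0, jax.nn.softplus(c) / (row_sums + 1e-8))
    return W * scale[:, None]

def lipschitz_loss(cs):
    """Eq. (6): product of per-layer Lipschitz bounds; shrinking it bounds
    the field's output variation with respect to its input."""
    return jnp.prod(jax.nn.softplus(jnp.asarray(cs)))

def lip_mlp_forward(params, cs, x):
    """params: list of (W, b) pairs; cs: matching per-layer Lipschitz
    parameters. tanh is 1-Lipschitz, keeping the layer-wise bound valid."""
    for (W, b), c in zip(params[:-1], cs[:-1]):
        x = jnp.tanh(lipschitz_normalize(W, c) @ x + b)
    W, b = params[-1]
    return lipschitz_normalize(W, cs[-1]) @ x + b
```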

Figure 5. Visualization of the DiGS and NSH fitting procedures. DiGS minimizes the divergence of the SDF gradient field, or equivalently its Laplacian. The resulting network takes much longer to fit its positional constraints, even after the complete lifting of the divergence regularization.

3.3. Octahedral prior for sharp edge regularization

The core of our method is to exploit the interpolation property of octahedral frames (Section 2.4.3) to simultaneously smooth and accentuate the sharp edges of the companion implicit geometry (Figure 4). We do so by matching the gradient of the SDF to one of the six representation vectors of the octahedral frame.

Figure 6. We sample directions on the unit sphere and evaluate their losses against the canonical octahedral frame coefficients to visualize the loss manifold. We further demonstrate the difference using a toy example–the white points are unoriented point clouds sampled on creases with dihedral angles of $30^\circ$, $90^\circ$, and $120^\circ$ respectively. Notice how the steepness of the manifold at the representation vector directions helps sharpen the crease. We use the $L_1$ loss throughout our experiments.

Recall that the alignment loss (5) is a function of both MLPs–we can simply flip the gradient propagation by fixing $u_\theta$ while updating $f_\omega$, which resembles alternating optimization. In examining the loss manifold (Figure 6), we find that the $L_1$ or $L_2$ loss works better than cosine similarity, so we adjust the loss slightly:

(7) \mathcal{L}_{\text{regularize}} = \sum_{i=1}^N \phi\Big(\frac{u_\theta(\mathbf{p}_i)}{\|u_\theta(\mathbf{p}_i)\|_2},\ \Pi\Big(u_\theta(\mathbf{p}_i),\ \frac{\nabla f_\omega(\mathbf{p}_i)}{\|\nabla f_\omega(\mathbf{p}_i)\|_2}\Big)\Big)

The normalization is employed because the projection (4) always yields a valid SH band-4 coefficient vector, hence of unit norm. Given that we work on the same samples as in the alignment phase, the normalized $u_\theta(\mathbf{p}_i)$ is unlikely to deviate from the octahedral variety. We remove the distance weighting, as we empirically find it contributes little here, possibly due to the inherent ambiguity of the regularization process–the cubic symmetry of the octahedral frame means the matched SDF gradient is not unique. This is also why our method does not improve the normal consistency of unoriented point cloud reconstruction. Conversely, we require a good initialization, such as DiGS or NSH, to realize our method's full potential.
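Sketched in JAX under the same assumed helpers as before, the only changes from the alignment phase are the flipped stop-gradient and the $L_1$ measure:

```python
import jax
import jax.numpy as jnp

def regularize_loss(u_fn, sdf_fn, points, build_wigner_d):
    """Sharp-edge regularization, Eq. (7): fix the octahedral field and pull
    the SDF gradient towards the closest frame axis."""

    def per_point(p):
        n = jax.grad(sdf_fn)(p)                    # gradients reach f_omega
        n = n / (jnp.linalg.norm(n) + 1e-8)

        q = jax.lax.stop_gradient(u_fn(p))         # freeze the guide field
        q = q / (jnp.linalg.norm(q) + 1e-8)
        q_proj = project_n(q, build_wigner_d(n))   # Eq. (4), unit norm
        return jnp.sum(jnp.abs(q - q_proj))        # phi = L1 distance

    return jnp.sum(jax.vmap(per_point)(points))
```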

3.4. Practical concerns

Note that smoothing comes at the cost of reducing the model's capacity to represent high-frequency information (Figure 5). We therefore use NSH as our backbone and, following their practice, also sample close-surface and off-surface points, denoted $\mathbf{p}_{\text{close}}$ and $\mathbf{p}_{\text{off}}$ respectively. The off-surface points are sampled uniformly in the unit bounding box, while close-surface samples are drawn from a normal distribution centered at the scan input $\mathbf{p}$, with sigma set to the maximum KNN distance for $k=51$. Our final losses are:

\mathcal{L}_{\text{total}} = \lambda_{\text{align}}\cdot\mathcal{L}_{\text{align}}(\mathbf{p}) + \lambda_{\text{regularize}}\cdot\mathcal{L}_{\text{regularize}}(\mathbf{p}) + \lambda_{\text{lip}}\cdot\mathcal{L}_{\text{lip}}
 + \lambda_{\text{NSH}}\cdot\sum\big|\det\big(H(f_\omega(\mathbf{p}_{\text{close}}))\big)\big|
 + \lambda_{\text{eikonal}}\cdot\sum\big|\,\|\nabla f_\omega(\mathbf{p}\cup\mathbf{p}_{\text{off}})\|-1\,\big|
 + \lambda_{\text{positional}}\cdot\sum|f_\omega(\mathbf{p})|
 + \lambda_{\text{off}}\cdot\sum\exp\big(-\alpha\,|f_\omega(\mathbf{p}_{\text{off}})|\big)

where the $\lambda$s are balancing weights. Note that evaluating our losses only requires the input point cloud.
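As a rough sketch (scheduling omitted, reductions illustrative), the total objective assembles the pieces sketched earlier with the weights listed below; `cs` are the LipMLP Lipschitz parameters and `alpha` is the $\alpha$ of the off-surface term:

```python
import jax
import jax.numpy as jnp

def total_loss(u_fn, sdf_fn, cs, p, p_close, p_off, lam, build_wigner_d,
               alpha=100.0):
    """Assembly of L_total (Sec. 3.4); `lam` is a dict of balancing weights."""
    grads = jax.vmap(jax.grad(sdf_fn))(jnp.concatenate([p, p_off]))
    eik = jnp.sum(jnp.abs(jnp.linalg.norm(grads, axis=-1) - 1.0))
    nsh = jnp.sum(jnp.abs(jnp.linalg.det(jax.vmap(jax.hessian(sdf_fn))(p_close))))
    pos = jnp.sum(jnp.abs(jax.vmap(sdf_fn)(p)))
    off = jnp.sum(jnp.exp(-alpha * jnp.abs(jax.vmap(sdf_fn)(p_off))))

    return (lam["align"] * align_loss(u_fn, sdf_fn, p, build_wigner_d)
            + lam["regularize"] * regularize_loss(u_fn, sdf_fn, p, build_wigner_d)
            + lam["lip"] * lipschitz_loss(cs)
            + lam["NSH"] * nsh + lam["eikonal"] * eik
            + lam["positional"] * pos + lam["off"] * off)
```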

We fix $\lambda_{\text{NSH}}=3$, $\lambda_{\text{eikonal}}=50$, $\lambda_{\text{off}}=100$, $\lambda_{\text{align}}=100$, $\lambda_{\text{regularize}}=10$, $\lambda_{\text{lip}}=10^{-6}$, and adopt two scheduling schemes:

  • Low-noise: Set $\lambda_{\text{positional}}=7000$, anneal $\lambda_{\text{NSH}}$ to $3\times10^{-4}$ after $10\%$ of the training steps, start alignment and smoothing at $40\%$, and regularization at $60\%$.

  • High-noise: Set $\lambda_{\text{positional}}=3500$, anneal $\lambda_{\text{NSH}}$ to $3\times10^{-3}$ after $10\%$ of the training steps, start alignment and smoothing at $20\%$, and regularization at $40\%$.

This might seem counterintuitive because we change the schedule of the losses rather than adjusting their weights. The rationale is that we want our octahedral field to capture the geometry before it is fully corrupted, while not overfitting the overly smooth initialization. Given that small holes are often insufficiently observed during scanning and emerge late in the fitting process (Figure 5), we postpone the schedule to capture those small details. However, when the noise level is high, the geometry is corrupted so quickly that we cannot afford to wait, so we start our alignment shortly after the annealing stage.

Note that $\lambda_{\text{positional}}$ and $\lambda_{\text{NSH}}$ in the low-noise scheme are the default values of the original implementation; the tuning in our high-noise scheme aims to slow down the noise corruption. The resulting effect is illustrated as the fine-tuned variant in Figure 2.

Following the practice of LipMLP, we spatially scale the input by $100$–equivalent to premultiplying the first weight matrix by the same amount (Sitzmann et al., 2020)–to speed up convergence, so that our octahedral field can take its ”snapshot” as quickly as possible: our method cannot fix an already corrupted geometry.
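For reference, the close- and off-surface sampling of this section can be sketched as follows (one reading of the sigma choice: per point, the distance to its $k$-th nearest neighbor; brute-force KNN is shown for clarity, a KD-tree is preferable at scale):

```python
import jax
import jax.numpy as jnp

def sample_training_points(key, scan, n_samples, k=51):
    """Close-surface samples: Gaussian-perturbed scan points with sigma set
    by the k-NN distance; off-surface samples: uniform in the bounding box
    (assumed [-1, 1]^3 here, following the scan normalization)."""
    k1, k2, k3 = jax.random.split(key, 3)

    # Per-point sigma: distance to the k-th nearest neighbor (column 0 is self).
    d2 = jnp.sum((scan[:, None, :] - scan[None, :, :]) ** 2, axis=-1)
    sigma = jnp.sqrt(jnp.sort(d2, axis=1)[:, k])

    idx = jax.random.randint(k1, (n_samples,), 0, scan.shape[0])
    p_close = scan[idx] + sigma[idx, None] * jax.random.normal(k2, (n_samples, 3))
    p_off = jax.random.uniform(k3, (n_samples, 3), minval=-1.0, maxval=1.0)
    return p_close, p_off
```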

4. Experiment

We implement our method using JAX (Bradbury et al., 2018) (Equinox (Kidger and Garcia, 2021) for networks, optax (DeepMind et al., 2020) for optimizers). We use NSH (Wang et al., 2023) as our backbone method and follow their training setup: a 4-layer SIREN with 256 units, each scan normalized to the unit box, $15000$ samples each for the input, off-surface, and close-surface point clouds, and fitting over $10000$ iterations.

We use the Adam optimizer with a learning rate of $5\times10^{-5}$ for both the SIREN and the LipMLP, and extract the surface using Marching Cubes (MC) on a $512^3$ voxel grid. The full implementation is available at: https://github.com/Ankbzpx/frame-field.

4.1. Metrics

For quantitative evaluation, we follow DiGS (Ben-Shabat et al., 2023) and report the Chamfer distance $(\times10^3)$ (Fan et al., 2016) and Hausdorff distance $(\times10^2)$ over 1M points randomly sampled on the surface or point cloud, depending on the method's output. Given that neural implicit representations are susceptible to floating artifacts (Figure 8), we additionally report the F-score (Knapitsch et al., 2017). It clamps the closest matching distance at a threshold and reports a percentage, and is hence less sensitive to mismatches from surplus parts. Following Tatarchenko et al. (2019), we use a distance threshold of $0.5\%$. Note that we do not postprocess the output of any method and leave the extracted meshes or point clouds as is.
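A brute-force sketch of the F-score computation (chunking or a KD-tree is needed at the 1M-point scale used here):

```python
import jax.numpy as jnp

def f_score(pred, gt, threshold=0.005):
    """F-score (Knapitsch et al., 2017) at threshold 0.5% of the (normalized)
    extent: harmonic mean of precision (pred -> gt) and recall (gt -> pred),
    reported as a percentage."""
    d = jnp.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    precision = jnp.mean(jnp.min(d, axis=1) < threshold)
    recall = jnp.mean(jnp.min(d, axis=0) < threshold)
    return 100.0 * 2.0 * precision * recall / (precision + recall + 1e-8)
```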

4.2. Surface Reconstruction Benchmark

We first compare our method with existing approaches based on neural implicit fitting, namely DGP (Williams et al., 2019), SIREN (Sitzmann et al., 2020), DiGS (Ben-Shabat et al., 2023), and NSH (Wang et al., 2023). We use the Surface Reconstruction Benchmark (SRB) (Berger et al., 2013) dataset, which consists of 5 scans that exhibit triangulation-based scanning patterns. We fit the full point cloud for all methods and use our low-noise scheme. DGP (Williams et al., 2019) outputs a dense denoised point cloud, which we mesh with SPSR for visualization purposes.

Our method achieves the best F-score and the second-best Chamfer distance with the smallest standard deviation (Table 1), in line with our cleaner reconstruction quality (Figure 7).

Figure 7. Qualitative results on SRB.

Table 1. Quantitative results on SRB (Berger et al., 2013). Bold indicates the best score, underline the second best. Each metric is reported as mean / std.

Chamfer ↓ (mean / std)  Hausdorff ↓ (mean / std)  F-score ↑ (mean / std)
DGP 0.227 / 0.073  5.194 / 2.291  91.744 / 2.926
SIREN 0.316 / 0.262  7.121 / 7.823  89.603 / 8.003
DiGS 0.195 / 0.071  3.843 / 2.274  93.057 / 4.040
NSH 0.200 / 0.075  3.867 / 2.885  92.725 / 3.730
Ours 0.196 / 0.069  3.923 / 2.676  93.088 / 4.028

4.3. ABC and Thingi10k

We further evaluate our method on two widely used datasets, ABC (Koch et al., 2019) and Thingi10k (Zhou and Jacobson, 2016). The former consists of CAD models with sharp edges, whereas the latter contains more general shapes. We use the 100-model test split from Points2Surf (Erler et al., 2020) for both datasets and use Blensor (Gschwandtner et al., 2011) to simulate the time-of-flight (ToF) scanning process. Specifically, we set the sensor resolution to $176\times144$ and the focal length to 10mm, and scan each object spherically from 30 views, resulting in point clouds ranging from 20k to 100k points. We generate scanning data at two noise levels, $\mathcal{N}(0,0.002L)$ and $\mathcal{N}(0,0.01L)$ (so $400$ scans in total), where $L$ is the length of the maximum edge of the model's bounding box. Our two schemes in Section 3.4 are tuned for these two noise levels respectively, although we have shown that the low-noise scheme works for SRB as well. For reference, data-driven methods (Erler et al., 2020; Huang et al., 2023) typically randomize noise with sigma between $0.01L$ and $0.05L$, so our higher noise level is their lower bound.

We compare with 9 methods: 2 axiomatic (APSS (Guennebaud and Gross, 2007), SPSR (Kazhdan and Hoppe, 2013)), 1 resampling (EAR (Huang et al., 2013)), 2 patch-based denoising (GLR (Zeng et al., 2019), LP (Wei et al., 2023a)), 3 implicit fitting (SIREN (Sitzmann et al., 2020), DiGS (Ben-Shabat et al., 2023), NSH (Wang et al., 2023)), and 1 with a data prior (NKSR (Huang et al., 2023)). For point cloud visualization, we use SPSR if the method outputs normals; otherwise we use Advancing Front (Zienkiewicz et al., 2013). The full results are shown in Figures 10 and 11, and the quantitative evaluation is provided in the supplementary. Although our method shines the most on CAD models, it also succeeds in denoising more general shapes such as sculptures. However, at noise sigma $0.01L$, our method sometimes outputs a low-poly-like reconstruction–the likely cause is that, when the noise level is large, the implicit geometry evolves quickly, and the constantly changing surface normals make the octahedral field hard to converge.

In our experiments, we find that NSH, and consequently our method, fails to reconstruct three cases (Figure 8), which give erroneously large closest matching distances ($50\times$ larger than the second-worst case). We therefore report both the original metrics and those with the failure cases removed.

5. Limitations and future work

The main limitation of our method is the ambiguity of its regularization–the gradient of the SDF is constrained to match any one of the three axes of the octahedral frame, so it has a tendency to close or distort open holes (Figure 9). This can be alleviated with a lower regularization weight, but at the cost of less effective optimization. Our method requires careful tuning with respect to the noise level and relies on a faithful normal initialization; when the backbone method fails, our regularization fails as well (Figure 8). Although our method has the potential to scale up to larger scenes, the choice of SDF prohibits its scalability–such scans contain thin and one-sided structures, which our method already struggles against (Figure 8).

Figure 8. Our method relies on a backbone method for normal initialization. Thus, we are prone to the same limitations as the backbone: surplus geometry (left) and difficulty with noisy thin structures.

Figure 9. Our method requires sufficient samples to fit a normal-aligned octahedral field. Notice the difference in octahedral frame distribution at the cylinder handle versus at the holes. When the scanning noise is large, our method can distort the shape of the hole.

6. Conclusion

We introduce the neural octahedral field, a guiding field that, when paired with a neural implicit representation, simultaneously smooths it while emphasizing its sharp edges. We design an effective normal alignment loss and utilize Lipschitz regularization to encourage field smoothness. We further examine the loss manifold with respect to normal directions, draw its connection to regularization quality, and propose the $L_1$ distance to improve sharp edge awareness. We extensively compare our method with existing baselines to demonstrate its effectiveness in simultaneous smoothing and sharp edge regularization. In future work, we would like to explore the potential of the neural octahedral field for hexahedral meshing.

References

  • Alexa and Adamson (2004) Marc Alexa and Anders Adamson. 2004. On normals and projection operators for surfaces defined by point sets. In Proceedings of the First Eurographics Conference on Point-Based Graphics (Switzerland) (SPBG’04). Eurographics Association, Goslar, DEU, 149–155.
  • Alexa et al. (2001) M. Alexa, J. Behr, D. Cohen-Or, S. Fleishman, D. Levin, and C.T. Silva. 2001. Point set surfaces. In Proceedings Visualization, 2001. VIS ’01. 21–29, 537. https://doi.org/10.1109/VISUAL.2001.964489
  • Anandkumar et al. (2012) Anima Anandkumar, Rong Ge, Daniel Hsu, Sham M. Kakade, and Matus Telgarsky. 2012. Tensor decompositions for learning latent variable models. arXiv:1210.7559 [cs.LG]
  • Atkinson and Riani (2012) A. Atkinson and M. Riani. 2012. Robust Diagnostic Regression Analysis. Springer New York. https://books.google.com.hk/books?id=sZ3SBwAAQBAJ
  • Atzmon and Lipman (2020) Matan Atzmon and Yaron Lipman. 2020. SAL: Sign Agnostic Learning of Shapes from Raw Data. arXiv:1911.10414 [cs.CV]
  • Atzmon and Lipman (2021) Matan Atzmon and Yaron Lipman. 2021. SALD: Sign Agnostic Learning with Derivatives. In 9th International Conference on Learning Representations, ICLR 2021.
  • Barschkis (2023) Sebastian Barschkis. 2023. Exact and soft boundary conditions in Physics-Informed Neural Networks for the Variable Coefficient Poisson equation. arXiv:2310.02548 [cs.LG]
  • Ben-Shabat et al. (2023) Yizhak Ben-Shabat, Chamin Hewa Koneputugodage, and Stephen Gould. 2023. DiGS : Divergence guided shape implicit neural representation for unoriented point clouds. arXiv:2106.10811 [cs.CV]
  • Berger et al. (2013) Matthew Berger, Joshua A Levine, Luis Gustavo Nonato, Gabriel Taubin, and Claudio T Silva. 2013. A benchmark for surface reconstruction. ACM Transactions on Graphics (TOG) 32, 2 (2013), 1–17.
  • Boralevi et al. (2017) Ada Boralevi, Jan Draisma, Emil Horobeţ, and Elina Robeva. 2017. Orthogonal and unitary tensor decomposition from an algebraic perspective. Israel Journal of Mathematics 222, 1 (Oct. 2017), 223–260. https://doi.org/10.1007/s11856-017-1588-6
  • Bradbury et al. (2018) James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. 2018. JAX: composable transformations of Python+NumPy programs. http://github.com/google/jax
  • Chemin et al. (2019) Alexandre Chemin, François Henrotte, Jean-François Remacle, and Jean Van Schaftingen. 2019. Representing Three-Dimensional Cross Fields Using Fourth Order Tensors. Springer International Publishing, 89–108. https://doi.org/10.1007/978-3-030-13992-6_6
  • DeepMind et al. (2020) DeepMind, Igor Babuschkin, Kate Baumli, Alison Bell, Surya Bhupatiraju, Jake Bruce, Peter Buchlovsky, David Budden, Trevor Cai, Aidan Clark, Ivo Danihelka, Antoine Dedieu, Claudio Fantacci, Jonathan Godwin, Chris Jones, Ross Hemsley, Tom Hennigan, Matteo Hessel, Shaobo Hou, Steven Kapturowski, Thomas Keck, Iurii Kemaev, Michael King, Markus Kunesch, Lena Martens, Hamza Merzic, Vladimir Mikulik, Tamara Norman, George Papamakarios, John Quan, Roman Ring, Francisco Ruiz, Alvaro Sanchez, Laurent Sartran, Rosalia Schneider, Eren Sezener, Stephen Spencer, Srivatsan Srinivasan, Miloš Stanojević, Wojciech Stokowiec, Luyu Wang, Guangyao Zhou, and Fabio Viola. 2020. The DeepMind JAX Ecosystem. http://github.com/google-deepmind
  • Desobry et al. (2021) David Desobry, Yoann Coudert-Osmont, Etienne Corman, Nicolas Ray, and Dmitry Sokolov. 2021. Designing 2D and 3D Non-Orthogonal Frame Fields. Computer-Aided Design 139 (2021), 103081. https://doi.org/10.1016/j.cad.2021.103081
  • Erler et al. (2020) Philipp Erler, Paul Guerrero, Stefan Ohrhallinger, Niloy J. Mitra, and Michael Wimmer. 2020. Points2Surf Learning Implicit Surfaces from Point Clouds. Springer International Publishing, 108–124. https://doi.org/10.1007/978-3-030-58558-7_7
  • Fan et al. (2016) Haoqiang Fan, Hao Su, and Leonidas Guibas. 2016. A Point Set Generation Network for 3D Object Reconstruction from a Single Image. arXiv:1612.00603 [cs.CV]
  • Fischler and Bolles (1981) Martin A. Fischler and Robert C. Bolles. 1981. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24, 6 (jun 1981), 381–395. https://doi.org/10.1145/358669.358692
  • Fleishman et al. (2005) Shachar Fleishman, Daniel Cohen-Or, and Cláudio T. Silva. 2005. Robust moving least-squares fitting with sharp features. ACM Trans. Graph. 24, 3 (jul 2005), 544–552. https://doi.org/10.1145/1073204.1073227
  • Green (2003) Robin Green. 2003. Spherical harmonic lighting: The gritty details. In Archives of the game developers conference, Vol. 56. 4.
  • Gropp et al. (2020) Amos Gropp, Lior Yariv, Niv Haim, Matan Atzmon, and Yaron Lipman. 2020. Implicit Geometric Regularization for Learning Shapes. In Proceedings of Machine Learning and Systems 2020. 3569–3579.
  • Gschwandtner et al. (2011) Michael Gschwandtner, Roland Kwitt, Andreas Uhl, and Wolfgang Pree. 2011. Blensor: Blender sensor simulation toolbox. In Advances in Visual Computing: 7th International Symposium, ISVC 2011, Las Vegas, NV, USA, September 26-28, 2011. Proceedings, Part II 7. Springer, 199–208.
  • Guennebaud and Gross (2007) Gaël Guennebaud and Markus Gross. 2007. Algebraic point set surfaces. ACM Trans. Graph. 26, 3 (jul 2007), 23–es. https://doi.org/10.1145/1276377.1276406
  • Huang et al. (2013) Hui Huang, Shihao Wu, Minglun Gong, Daniel Cohen-Or, Uri Ascher, and Hao (Richard) Zhang. 2013. Edge-aware point set resampling. ACM Trans. Graph. 32, 1, Article 9 (feb 2013), 12 pages. https://doi.org/10.1145/2421636.2421645
  • Huang et al. (2023) Jiahui Huang, Zan Gojcic, Matan Atzmon, Or Litany, Sanja Fidler, and Francis Williams. 2023. Neural Kernel Surface Reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 4369–4379.
  • Huang et al. (2011) Jin Huang, Yiying Tong, Hongyu Wei, and Hujun Bao. 2011. Boundary Aligned Smooth 3D Cross-Frame Field. ACM Trans. Graph. 30, 6 (dec 2011), 1–8. https://doi.org/10.1145/2070781.2024177
  • Huang et al. (2021) Qixing Huang, Xiangru Huang, Bo Sun, Zaiwei Zhang, Junfeng Jiang, and Chandrajit Bajaj. 2021. ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators. arXiv:2108.09432 [cs.CV]
  • Huang et al. (2019) Zhiyang Huang, Nathan Carr, and Tao Ju. 2019. Variational implicit point set surfaces. ACM Trans. Graph. 38, 4, Article 124 (jul 2019), 13 pages. https://doi.org/10.1145/3306346.3322994
  • Huang and Ju (2016) Zhiyang Huang and Tao Ju. 2016. Extrinsically smooth direction fields. Computers & Graphics 58 (2016), 109–117.
  • Huang et al. (2022) Zhangjin Huang, Yuxin Wen, Zihao Wang, Jinjuan Ren, and Kui Jia. 2022. Surface Reconstruction from Point Clouds: A Survey and a Benchmark. arXiv:2205.02413 [cs.CV]
  • Jakob et al. (2015) Wenzel Jakob, Marco Tarini, Daniele Panozzo, Olga Sorkine-Hornung, et al. 2015. Instant field-aligned meshes. ACM Trans. Graph. 34, 6 (2015), 189–1.
  • Kazhdan et al. (2006) Michael Kazhdan, Matthew Bolitho, and Hugues Hoppe. 2006. Poisson surface reconstruction. In Proceedings of the fourth Eurographics symposium on Geometry processing, Vol. 7.
  • Kazhdan and Hoppe (2013) Michael Kazhdan and Hugues Hoppe. 2013. Screened poisson surface reconstruction. ACM Transactions on Graphics (ToG) 32, 3 (2013), 1–13.
  • Kidger and Garcia (2021) Patrick Kidger and Cristian Garcia. 2021. Equinox: neural networks in JAX via callable PyTrees and filtered transformations. Differentiable Programming workshop at Neural Information Processing Systems 2021 (2021).
  • Knapitsch et al. (2017) Arno Knapitsch, Jaesik Park, Qian-Yi Zhou, and Vladlen Koltun. 2017. Tanks and Temples: Benchmarking Large-Scale Scene Reconstruction. ACM Transactions on Graphics 36, 4 (2017).
  • Knöppel et al. (2013) Felix Knöppel, Keenan Crane, Ulrich Pinkall, and Peter Schröder. 2013. Globally optimal direction fields. ACM Transactions on Graphics (ToG) 32, 4 (2013), 1–10.
  • Koch et al. (2019) Sebastian Koch, Albert Matveev, Zhongshi Jiang, Francis Williams, Alexey Artemov, Evgeny Burnaev, Marc Alexa, Denis Zorin, and Daniele Panozzo. 2019. ABC: A Big CAD Model Dataset For Geometric Deep Learning. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  • Lathauwer et al. (1995) Lieven De Lathauwer, Pierre Comon, Bart De Moor, and Joos Vandewalle. 1995. Higher-order power method - application in independent component analysis. https://api.semanticscholar.org/CorpusID:115691434
  • Levin (2004) David Levin. 2004. Mesh-independent surface interpolation. In Geometric modeling for scientific visualization. Springer, 37–49.
  • Liu et al. (2022) Hsueh-Ti Derek Liu, Francis Williams, Alec Jacobson, Sanja Fidler, and Or Litany. 2022. Learning Smooth Neural Functions via Lipschitz Regularization. https://doi.org/10.48550/ARXIV.2202.08345
  • Ma et al. (2023) Baorui Ma, Junsheng Zhou, Yu-Shen Liu, and Zhizhong Han. 2023. Towards Better Gradient Consistency for Neural Signed Distance Functions via Level Set Alignment. arXiv:2305.11601 [cs.CV]
  • Mehta et al. (2022) Ishit Mehta, Manmohan Chandraker, and Ravi Ramamoorthi. 2022. A Level Set Theory for Neural Implicit Evolution under Explicit Flows. arXiv:2204.07159 [cs.CV]
  • Mescheder et al. (2019) Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. 2019. Occupancy Networks: Learning 3D Reconstruction in Function Space. In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR).
  • Mildenhall et al. (2021) Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. 2021. Nerf: Representing scenes as neural radiance fields for view synthesis. Commun. ACM 65, 1 (2021), 99–106.
  • Palmer et al. (2020) David Palmer, David Bommes, and Justin Solomon. 2020. Algebraic Representations for Volumetric Frame Fields. ACM Trans. Graph. 39, 2, Article 16 (apr 2020), 17 pages. https://doi.org/10.1145/3366786
  • Park et al. (2019) Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. 2019. Deepsdf: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 165–174.
  • Park et al. (2023) Yesom Park, Taekyung Lee, Jooyoung Hahn, and Myungjoo Kang. 2023. pp-Poisson surface reconstruction in curl-free flow from point clouds. arXiv:2310.20095 [cs.CV]
  • Raissi et al. (2019) Maziar Raissi, Paris Perdikaris, and George E Karniadakis. 2019. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378 (2019), 686–707.
  • Ray et al. (2016) Nicolas Ray, Dmitry Sokolov, and Bruno Lévy. 2016. Practical 3D frame field generation. ACM Transactions on Graphics (TOG) 35, 6 (2016), 1–9.
  • Remelli et al. (2020) Edoardo Remelli, Artem Lukoianov, Stephan R. Richter, Benoît Guillard, Timur Bagautdinov, Pierre Baque, and Pascal Fua. 2020. MeshSDF: Differentiable Iso-Surface Extraction. arXiv:2006.03997 [cs.CV]
  • Robeva (2016) Elina Robeva. 2016. Orthogonal Decomposition of Symmetric Tensors. SIAM J. Matrix Anal. Appl. 37, 1 (Jan. 2016), 86–102. https://doi.org/10.1137/140989340
  • Sarkar et al. (2018) Kripasindhu Sarkar, Florian Bernard, Kiran Varanasi, Christian Theobalt, and Didier Stricker. 2018. Structured Low-Rank Matrix Factorization for Point-Cloud Denoising. In 2018 International Conference on 3D Vision (3DV). 444–453. https://doi.org/10.1109/3DV.2018.00058
  • Sitzmann et al. (2020) Vincent Sitzmann, Julien N.P. Martel, Alexander W. Bergman, David B. Lindell, and Gordon Wetzstein. 2020. Implicit Neural Representations with Periodic Activation Functions. In arXiv.
  • Sloan (2008) Peter-Pike Sloan. 2008. Stupid spherical harmonics (sh) tricks. In Game developers conference, Vol. 9. 42.
  • Solomon et al. (2017) Justin Solomon, Amir Vaxman, and David Bommes. 2017. Boundary Element Octahedral Fields in Volumes. ACM Trans. Graph. 36, 4, Article 114b (jul 2017), 16 pages. https://doi.org/10.1145/3072959.3065254
  • Sukumar and Srivastava (2022) N. Sukumar and Ankit Srivastava. 2022. Exact imposition of boundary conditions with distance functions in physics-informed deep neural networks. Computer Methods in Applied Mechanics and Engineering 389 (Feb. 2022), 114333. https://doi.org/10.1016/j.cma.2021.114333
  • Sulzer et al. (2024) Raphael Sulzer, Renaud Marlet, Bruno Vallet, and Loic Landrieu. 2024. A Survey and Benchmark of Automatic Surface Reconstruction from Point Clouds. arXiv:2301.13656 [cs.CV]
  • Tatarchenko et al. (2019) Maxim Tatarchenko, Stephan R. Richter, René Ranftl, Zhuwen Li, Vladlen Koltun, and Thomas Brox. 2019. What Do Single-view 3D Reconstruction Networks Learn? CVPR.
  • Wang et al. (2023) Zixiong Wang, Yunxiao Zhang, Rui Xu, Fan Zhang, Pengshuai Wang, Shuangmin Chen, Shiqing Xin, Wenping Wang, and Changhe Tu. 2023. Neural-Singular-Hessian: Implicit Neural Representation of Unoriented Point Clouds by Enforcing Singular Hessian. arXiv:2309.01793 [cs.CV]
  • Wei et al. (2023b) Guangshun Wei, Hao Pan, Shaojie Zhuang, Yuanfeng Zhou, and Changjian Li. 2023b. iPUNet:Iterative Cross Field Guided Point Cloud Upsampling. arXiv:2310.09092 [cs.CV]
  • Wei et al. (2023a) Jiayi Wei, Jiong Chen, Damien Rohmer, Pooran Memari, and Mathieu Desbrun. 2023a. Robust Pointset Denoising of Piecewise-Smooth Surfaces through Line Processes. Computer Graphics Forum 42, 2 (2023), 175–189. https://doi.org/10.1111/cgf.14752 arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/cgf.14752
  • Williams et al. (2019) Francis Williams, Teseo Schneider, Claudio Silva, Denis Zorin, Joan Bruna, and Daniele Panozzo. 2019. Deep Geometric Prior for Surface Reconstruction. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. https://doi.org/10.1109/cvpr.2019.01037
  • Xie et al. (2021) Yiheng Xie, Towaki Takikawa, Shunsuke Saito, Or Litany, Shiqin Yan, Numair Khan, Federico Tombari, James Tompkin, Vincent Sitzmann, and Srinath Sridhar. 2021. Neural Fields in Visual Computing and Beyond. CoRR abs/2111.11426 (2021). arXiv:2111.11426 https://arxiv.org/abs/2111.11426
  • Yang et al. (2021) Guandao Yang, Serge Belongie, Bharath Hariharan, and Vladlen Koltun. 2021. Geometry Processing with Neural Fields. In Thirty-Fifth Conference on Neural Information Processing Systems.
  • Zeng et al. (2019) Jin Zeng, Gene Cheung, Michael Ng, Jiahao Pang, and Cheng Yang. 2019. 3D Point Cloud Denoising using Graph Laplacian Regularization of a Low Dimensional Manifold Model. arXiv:1803.07252 [cs.CV]
  • Zhang et al. (2023) Baowen Zhang, Jiahe Li, Xiaoming Deng, Yinda Zhang, Cuixia Ma, and Hongan Wang. 2023. Self-supervised Learning of Implicit Shape Representation with Dense Correspondence for Deformable Objects. arXiv:2308.12590 [cs.CV]
  • Zhang et al. (2020) Paul Zhang, Josh Vekhter, Edward Chien, David Bommes, Etienne Vouga, and Justin Solomon. 2020. Octahedral frames for feature-aligned cross fields. ACM Transactions on Graphics (TOG) 39, 3 (2020), 1–13.
  • Zhou and Jacobson (2016) Qingnan Zhou and Alec Jacobson. 2016. Thingi10K: A Dataset of 10,000 3D-Printing Models. arXiv:1605.04797 [cs.GR]
  • Zhou et al. (2019) Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, and Hao Li. 2019. On the Continuity of Rotation Representations in Neural Networks. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. https://doi.org/10.1109/cvpr.2019.00589
  • Zienkiewicz et al. (2013) O.C. Zienkiewicz, R.L. Taylor, and J.Z. Zhu. 2013. Chapter 17 - Automatic Mesh Generation. In The Finite Element Method: its Basis and Fundamentals (Seventh Edition) (seventh edition ed.), O.C. Zienkiewicz, R.L. Taylor, and J.Z. Zhu (Eds.). Butterworth-Heinemann, Oxford, 573–640. https://doi.org/10.1016/B978-1-85617-633-0.00017-4
  • Öztireli et al. (2009) A. C. Öztireli, G. Guennebaud, and M. Gross. 2009. Feature Preserving Point Set Surfaces based on Non-Linear Kernel Regression. Computer Graphics Forum 28, 2 (2009), 493–501. https://doi.org/10.1111/j.1467-8659.2009.01388.x arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1467-8659.2009.01388.x
Figure 10. Qualitative comparison with noise level 0.002L.

Figure 11. Qualitative comparison with noise level 0.01L.

Table 2. Results for the two noise levels are separated by a slash, with the left indicating noise $\sigma=0.002L$ and the right $\sigma=0.01L$. Methods marked with * require normal input (PCA normals with KNN $k=16$, filtered by scanning view directions), and methods marked with † have failure cases removed. Bold indicates the best scores, while underline indicates the best scores among methods that do not need normal input.
|  | ABC (Koch et al., 2019) |  |  | Thingi10k (Zhou and Jacobson, 2016) |  |  |
| Method | Chamfer ↓ | Hausdorff ↓ | F-score ↑ | Chamfer ↓ | Hausdorff ↓ | F-score ↑ |
| APSS* (Guennebaud and Gross, 2007) | 2.333 / 4.863 | 5.043 / 9.078 | 94.693 / 70.838 | 1.346 / 4.333 | 3.367 / 7.960 | 97.542 / 68.216 |
| SPSR* (Kazhdan and Hoppe, 2013) | 3.306 / 3.999 | 6.091 / 6.162 | 91.756 / 89.185 | 1.892 / 2.765 | 3.939 / 4.338 | 96.437 / 93.299 |
| EAR* (Huang et al., 2013) | 4.066 / 4.375 | 5.785 / 6.133 | 84.405 / 81.057 | 3.590 / 3.843 | 3.541 / 4.393 | 80.505 / 78.195 |
| NKSR* (Huang et al., 2023) | 2.929 / 3.600 | 6.579 / 7.152 | 93.636 / 90.184 | 1.594 / 2.338 | 5.127 / 6.696 | 97.082 / 93.307 |
| GLR (Zeng et al., 2019) | 4.026 / 4.965 | 5.768 / 6.233 | 82.602 / 75.153 | 2.774 / 3.603 | 3.236 / 3.756 | 87.909 / 79.139 |
| LP (Wei et al., 2023a) | 4.601 / 5.917 | 5.634 / 6.132 | 76.228 / 60.472 | 3.464 / 4.514 | 3.173 / 3.810 | 78.731 / 61.982 |
| SIREN (Sitzmann et al., 2020) | 8.835 / 6.427 | 14.805 / 8.978 | 87.537 / 58.945 | 11.019 / 5.710 | 18.854 / 9.071 | 85.593 / 55.810 |
| DiGS (Ben-Shabat et al., 2023) | 3.734 / 6.590 | 10.640 / 11.484 | 93.885 / 58.463 | 2.232 / 6.046 | 9.124 / 8.042 | 96.775 / 51.056 |
| NSH (Wang et al., 2023) | 5.755 / 5.420 | 8.847 / 7.516 | 92.391 / 64.614 | 3.792 / 5.119 | 6.356 / 7.386 | 96.064 / 59.365 |
| Ours | 5.393 / 7.062 | 7.861 / 9.247 | 93.496 / 87.831 | 3.804 / 3.798 | 6.051 / 5.330 | 96.419 / 90.663 |
| NSH† (Wang et al., 2023) | 4.426 / 5.464 | 7.893 / 7.598 | 93.173 / 64.227 | 3.407 / 5.093 | 6.074 / 7.236 | 96.604 / 59.387 |
| Ours† | 3.274 / 4.124 | 6.540 / 7.542 | 94.416 / 89.133 | 2.650 / 3.012 | 5.422 / 4.792 | 97.229 / 91.217 |

Appendix A Octahedral frame background

This section reviews the SH-parameterized octahedral field and its functional and explicit vector representations. Experienced readers may refer to the next section for implementation details.

An octahedral frame can be represented by 3 orthogonal unit vectors and their opposites:

\{\mathbf{v}_1, -\mathbf{v}_1, \mathbf{v}_2, -\mathbf{v}_2, \mathbf{v}_3, -\mathbf{v}_3\} \subset \mathbb{R}^3

Such a frame exhibits cubic symmetry: permuting or flipping the three basis vectors yields an equivalent frame. Measuring the difference between two frames therefore requires matching their representation vectors, which turns optimizing the smoothness of an octahedral field into a mixed-integer programming problem.
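To make this concrete, the following self-contained NumPy sketch (with illustrative helper names of our own) measures the distance between two frames, given as rotation matrices whose columns are the frame axes, by brute-force matching over all 48 signed axis permutations. It is this discrete inner minimization that makes naive smoothness optimization combinatorial:

```python
import itertools
import numpy as np

def signed_permutations():
    """All 48 signed 3x3 permutation matrices, i.e. the symmetries
    that map an octahedral frame to an equivalent one."""
    mats = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product([1.0, -1.0], repeat=3):
            m = np.zeros((3, 3))
            for row, (col, s) in enumerate(zip(perm, signs)):
                m[row, col] = s
            mats.append(m)
    return mats

def frame_distance(Ra, Rb):
    """Distance between two frames, minimized over all 48 matchings."""
    return min(np.linalg.norm(Ra - Rb @ P) for P in signed_permutations())
```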

A.1. SH functional representation

Huang et al. (2011) observe that any octahedral frame can be equivalently represented as a rotation matrix $\mathbf{R} \in SO(3)$ associated with the canonical frame of standard basis vectors $\{\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\}$, where $\mathbf{e}_x = [1, 0, 0]^T$, $\mathbf{e}_y = [0, 1, 0]^T$, $\mathbf{e}_z = [0, 0, 1]^T \in \mathbb{R}^3$.

To avoid explicit matching, Huang et al. propose to measure the difference between two octahedral frames by the spherical integral of a descriptor function that is invariant to cubic symmetry. Specifically, for the canonical frame, they design a function over the unit sphere, $F: \mathcal{S}^2 \to \mathbb{R}$, that takes the same value under sign changes or re-orderings of its representation vectors $\{\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\}$:

F(\mathbf{s}) = (\mathbf{s} \cdot \mathbf{e}_x)^2 (\mathbf{s} \cdot \mathbf{e}_y)^2 + (\mathbf{s} \cdot \mathbf{e}_y)^2 (\mathbf{s} \cdot \mathbf{e}_z)^2 + (\mathbf{s} \cdot \mathbf{e}_z)^2 (\mathbf{s} \cdot \mathbf{e}_x)^2

or in Cartesian coordinates:

(8) F(\mathbf{s}) = x^2 y^2 + y^2 z^2 + z^2 x^2, \quad \mathbf{s} = [x, y, z]^T \in \mathcal{S}^2

A general frame can then be described by $F(\mathbf{R}^T \mathbf{s})$, obtained by rotating either the input or the representation vectors. Both interpretations give the same result, since $(\mathbf{R}^T \mathbf{s}) \cdot \mathbf{e}_i = \mathbf{s} \cdot (\mathbf{R} \mathbf{e}_i)$.
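A quick numerical sanity check of this invariance (a self-contained NumPy sketch; `descriptor` is our own illustrative name):

```python
import numpy as np

def descriptor(R, s):
    """F(R^T s) = x^2 y^2 + y^2 z^2 + z^2 x^2 in the frame's coordinates."""
    x, y, z = R.T @ s
    return x**2 * y**2 + y**2 * z**2 + z**2 * x**2

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal frame
s = rng.normal(size=3)
s /= np.linalg.norm(s)                         # random unit direction

# Swapping two axes or flipping one leaves the descriptor unchanged.
swap_xy = np.array([[0., 1., 0.], [1., 0., 0.], [0., 0., 1.]])
flip_x = np.diag([-1., 1., 1.])
assert np.isclose(descriptor(Q, s), descriptor(Q @ swap_xy, s))
assert np.isclose(descriptor(Q, s), descriptor(Q @ flip_x, s))
```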

A.1.1. Smoothness

An efficient way to evaluate the functional integral over the unit sphere is to project the function onto the spherical harmonics (SH) basis. Let the projection of the reference descriptor function be:

F(\mathbf{s}) = \sum_{l, m} c_l^m Y_l^m(\mathbf{s}) = \mathbf{y}(\mathbf{s})^T \mathbf{c}

where $Y_l^m$ is the SH basis function of band $l$ and order $m$, $c_l^m$ are the corresponding coefficients, and $\mathbf{y}$, $\mathbf{c}$ are their vector forms. The rotational invariance of SH (Sloan, 2008) guarantees that the rotated version, obtained by rotating the input, is projected onto the same bands. More importantly, its coefficients can be obtained as a linear combination of the coefficients of its unrotated counterpart (Green, 2003):

F(\mathbf{R}^T \mathbf{s}) = \mathbf{y}(\mathbf{R}^T \mathbf{s})^T \mathbf{c} = \mathbf{y}(\mathbf{s})^T (\tilde{\mathbf{R}} \mathbf{c})

where $\tilde{\mathbf{R}}$ is the Wigner D-matrix induced by $\mathbf{R}$, of the same dimension as $\mathbf{c}$.

Thus, for two octahedral frames with rotation matrices $\mathbf{R}_a$ and $\mathbf{R}_b$, let $\mathbf{c}_a = \tilde{\mathbf{R}}_a \mathbf{c}$ and $\mathbf{c}_b = \tilde{\mathbf{R}}_b \mathbf{c}$. Their functional difference can then be efficiently measured by the $L_2$ norm of the difference of their SH coefficients, thanks to the orthonormality of the SH basis (Green, 2003):

\begin{align*}
\int_{\mathcal{S}^2} \left(F(\mathbf{R}_a^T \mathbf{s}) - F(\mathbf{R}_b^T \mathbf{s})\right)^2 d\mathbf{s} &= \int_{\mathcal{S}^2} \left(\mathbf{y}(\mathbf{s})^T (\mathbf{c}_a - \mathbf{c}_b)\right)^2 d\mathbf{s} \\
&= \operatorname{tr}\left((\mathbf{c}_a - \mathbf{c}_b)^T \left(\int_{\mathcal{S}^2} \mathbf{y}(\mathbf{s}) \mathbf{y}(\mathbf{s})^T d\mathbf{s}\right) (\mathbf{c}_a - \mathbf{c}_b)\right) \\
&= \operatorname{tr}\left((\mathbf{c}_a - \mathbf{c}_b)^T \mathbf{I} (\mathbf{c}_a - \mathbf{c}_b)\right) \\
&= \|\mathbf{c}_a - \mathbf{c}_b\|_2^2
\end{align*}

In practice, Huang et al. find that $F(\mathbf{s})$ can be losslessly projected onto bands 0 and 4 of the SH basis as:

F(\mathbf{s}) = c_0 (\sqrt{7} Y_4^0 + \sqrt{5} Y_4^4 + c_1 Y_0^0)

where $c_0$, $c_1$ are scaling constants. Since $Y_0^0$ is constant and a global constant scaling does not change the pairwise measurement, $F(\mathbf{s})$ can be parameterized by the SH band 4 coefficient vector $\mathbf{q}_0$ alone:

(9) \mathbf{q}_0 = [0, 0, 0, 0, \sqrt{7}, 0, 0, 0, \sqrt{5}]^T, \quad \mathbf{q} = \tilde{\mathbf{R}} \mathbf{q}_0 \in \mathbb{R}^9, \quad \tilde{\mathbf{R}} \in SO(9)

where $\mathbf{q}$ represents a general frame described by $F(\mathbf{R}^T \mathbf{s})$.

A.1.2. Alignment

Huang et al. further observe that any $z$-axis-aligned octahedral frame has its $Y_4^0$ coefficient equal to $\sqrt{7}$:

(10) \mathbf{q}_z(4) = \sqrt{7}

Thus, aligning a frame with a normal $\mathbf{n}$ can be enforced similarly, by applying the rotation $\tilde{\mathbf{R}}_{\mathbf{n} \to \mathbf{z}}$ from $\mathbf{n}$ to $\mathbf{z}$, then constraining the $Y_4^0$ coefficient to be $\sqrt{7}$:

(11) (\tilde{\mathbf{R}}_{\mathbf{n} \to \mathbf{z}} \mathbf{q}_n)(4) = \sqrt{7}

A.2. Improved functional representation

Ray et al. (2016) propose a simpler form of (8):

(12) F(\mathbf{s}) = x^4 + y^4 + z^4

A quick derivation shows that this descriptor function is a globally scaled and shifted version of the original one:

x^4 + y^4 + z^4 = 1 - 2(x^2 y^2 + y^2 z^2 + z^2 x^2)

Hence it exhibits the same invariance to cubic symmetry, but its maxima coincide with the directions of the representation vectors.
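The identity follows from expanding $(x^2 + y^2 + z^2)^2$, which equals one on the unit sphere; a short symbolic check (a SymPy sketch):

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
# x^4 + y^4 + z^4 = (x^2 + y^2 + z^2)^2 - 2(x^2 y^2 + y^2 z^2 + z^2 x^2),
# which reduces to 1 - 2(...) on the unit sphere.
lhs = x**4 + y**4 + z**4
cross = x**2 * y**2 + y**2 * z**2 + z**2 * x**2
assert sp.expand(lhs + 2 * cross - (x**2 + y**2 + z**2)**2) == 0
```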

Similarly, it can be losslessly projected onto the SH basis as:

F(\mathbf{s}) = \left(\frac{8\sqrt{\pi}}{5\sqrt{21}} \|\mathbf{s}\|^4\right) \left(\frac{3\sqrt{21}}{4} Y_0^0(\mathbf{s}) + \sqrt{\frac{7}{12}} Y_4^0(\mathbf{s}) + \sqrt{\frac{5}{12}} Y_4^4(\mathbf{s})\right)

Given $\|\mathbf{s}\| = 1$, and ignoring the $Y_0^0$ coefficient and the global scaling, the octahedral frame can likewise be parameterized by the SH band 4 coefficients as:

(13) \mathbf{q}_0 = [0, 0, 0, 0, \sqrt{\tfrac{7}{12}}, 0, 0, 0, \sqrt{\tfrac{5}{12}}]^T, \quad \mathbf{q} = \tilde{\mathbf{R}} \mathbf{q}_0 \in \mathbb{R}^9, \quad \tilde{\mathbf{R}} \in \mathbb{R}^{9 \times 9}

It is equivalent to (9), but scaled such that $\|\mathbf{q}\| = 1$.

A.2.1. Alignment

Ray et al. apply rotations about the $z$ axis to the coefficients of the reference frame and find that any $z$-axis-aligned frame can be expressed as:

\mathbf{q}_z = \tilde{\mathbf{R}}_z(\theta) \mathbf{q}_0 = [\sqrt{\tfrac{5}{12}} \cos 4\theta, 0, 0, 0, \sqrt{\tfrac{7}{12}}, 0, 0, 0, \sqrt{\tfrac{5}{12}} \sin 4\theta]^T

where $\theta$ is the angle of tangential twist and $\tilde{\mathbf{R}}_z(\theta)$ is the corresponding Wigner D-matrix (Section 1.1 of their supplementary).

Therefore, the $z$-axis alignment can be extended with additional constraints:

\begin{align*}
\mathbf{q}_z[4] &= \sqrt{\tfrac{7}{12}} \\
\mathbf{q}_z[0]^2 + \mathbf{q}_z[8]^2 &= \tfrac{5}{12}
\end{align*}

where $[\cdot]$ denotes 0-based array indexing.
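A small NumPy sketch (helper name ours) that builds a $z$-aligned coefficient vector from a twist angle and verifies both constraints, along with the unit norm of the parameterization (13):

```python
import numpy as np

def q_z_aligned(theta):
    """Band-4 SH coefficients of a z-aligned octahedral frame with twist theta."""
    q = np.zeros(9)
    q[0] = np.sqrt(5 / 12) * np.cos(4 * theta)
    q[4] = np.sqrt(7 / 12)
    q[8] = np.sqrt(5 / 12) * np.sin(4 * theta)
    return q

q = q_z_aligned(0.3)
assert np.isclose(q[4], np.sqrt(7 / 12))        # fixed Y_4^0 coefficient
assert np.isclose(q[0]**2 + q[8]**2, 5 / 12)    # the twist lives on a circle
assert np.isclose(np.linalg.norm(q), 1.0)       # normalized parameterization
```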

The same holds for normal alignment:

(14)
\begin{align*}
\mathbf{q}_n &= \sqrt{\tfrac{7}{12}} \, \tilde{\mathbf{R}}_{\mathbf{z} \to \mathbf{n}} [0, 0, 0, 0, 1, 0, 0, 0, 0]^T \\
&\quad + c_0 \sqrt{\tfrac{5}{12}} \, \tilde{\mathbf{R}}_{\mathbf{z} \to \mathbf{n}} [1, 0, 0, 0, 0, 0, 0, 0, 0]^T \\
&\quad + c_8 \sqrt{\tfrac{5}{12}} \, \tilde{\mathbf{R}}_{\mathbf{z} \to \mathbf{n}} [0, 0, 0, 0, 0, 0, 0, 0, 1]^T \\
&\text{subject to } c_0^2 + c_8^2 = 1
\end{align*}

Compared to (10) and (11), these additional constraints guarantee that $\mathbf{q}_z$ and $\mathbf{q}_n$ are obtainable by rotating $\mathbf{q}_0$.

A.3. Improved normal alignment constraints

Solomon et al. (2017) observe that a smooth normal-aligned octahedral field commonly contains singularities, at which the evaluated Dirichlet energy is unbounded (Knöppel et al., 2013), undermining the effectiveness of gradient steps. Solomon et al. therefore propose to scale the tangential axes of the normal-aligned frames, or equivalently the $xy$ axes when rotated to be $z$-axis aligned, to satisfy the topological restrictions. Numerically, instead of enforcing the constraint $c_0^2 + c_8^2 = 1$ of (14) everywhere, they relax it so that only its average over the boundary surface equals one:

\int_{\partial V} (c_0^2 + c_8^2) \, d\mathbf{p} = A

where $A$ is the area of the boundary surface $\partial V$.
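In a discretized setting with uniform area sampling of the boundary, this integral constraint amounts to requiring the sample mean of $c_0^2 + c_8^2$ to be one. A minimal sketch of a corresponding soft penalty (the penalty form and names are our own illustrative choices, not the authors' exact formulation):

```python
import numpy as np

def boundary_relaxation_penalty(c0, c8):
    """Penalize deviation of the *average* of c0^2 + c8^2 over boundary
    samples from 1, instead of enforcing the constraint pointwise."""
    mean_energy = np.mean(c0**2 + c8**2)
    return (mean_energy - 1.0)**2
```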

A.4. Improved rotation representation

In early work (Huang et al., 2011; Ray et al., 2016), the Wigner D-matrix is parameterized with the ZYZ Euler angle representation, as rotation about the $z$ axis is trivial to evaluate in spherical coordinates:

\begin{align*}
\mathbf{R} &= \mathbf{R}_x(\alpha) \mathbf{R}_y(\beta) \mathbf{R}_z(\gamma) \in SO(3) \\
\tilde{\mathbf{R}} &= \tilde{\mathbf{R}}_x(\alpha) \tilde{\mathbf{R}}_y(\beta) \tilde{\mathbf{R}}_z(\gamma) \in SO(9) \\
&= \tilde{\mathbf{R}}_y(\pi/2)^T \tilde{\mathbf{R}}_z(\alpha) \tilde{\mathbf{R}}_y(\pi/2)^T \tilde{\mathbf{R}}_x(\pi/2) \tilde{\mathbf{R}}_z(\beta) \tilde{\mathbf{R}}_x(\pi/2)^T \tilde{\mathbf{R}}_z(\gamma)
\end{align*}

However, this nonlinearity makes it difficult to analyze the behavior of the gradient of the SH coefficients over spatial locations. Palmer et al. (2020) propose an elegant rotation-vector-based representation that makes this analysis feasible (Zhang et al., 2020).

Any rotation matrix $\mathbf{R} \in SO(3)$ can be equivalently converted to the axis-angle representation $(\mathbf{e}, \theta)$ of the same rotation. It denotes a right-handed rotation by $\theta$ radians about the unit vector $\mathbf{e}$, and can be compactly written as the vector $\mathbf{v} = \theta \mathbf{e} \in \mathbb{R}^3$, known as the rotation vector. The skew-symmetric matrix form of the rotation vector, $[\mathbf{v}]_\times \in \mathbb{R}^{3 \times 3}$, is an element of the Lie algebra $\mathfrak{so}(3)$ that is associated with the same rotation $\mathbf{R} \in SO(3)$:

\begin{align*}
\mathbf{v} = [v_x, v_y, v_z]^T &\in \mathbb{R}^3 \\
[\mathbf{v}]_\times = \mathbf{v} \cdot \mathbf{L} = v_x \mathbf{L}_x + v_y \mathbf{L}_y + v_z \mathbf{L}_z &\in \mathfrak{so}(3) \\
\exp([\mathbf{v}]_\times) = \exp(\mathbf{v} \cdot \mathbf{L}) = \mathbf{R} &\in SO(3)
\end{align*}

where $\mathbf{L}_x = [\mathbf{e}_x]_\times$, $\mathbf{L}_y = [\mathbf{e}_y]_\times$, $\mathbf{L}_z = [\mathbf{e}_z]_\times$ are the bases of $\mathfrak{so}(3)$, and $\exp(\cdot)$ denotes the matrix exponential.
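A short NumPy/SciPy sketch of this correspondence, cross-checked against SciPy's built-in rotation-vector conversion:

```python
import numpy as np
from scipy.linalg import expm
from scipy.spatial.transform import Rotation

def skew(v):
    """[v]_x, the skew-symmetric matrix such that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

v = np.array([0.3, -0.5, 0.2])              # rotation vector theta * e
R_exp = expm(skew(v))                        # exp: so(3) -> SO(3)
R_ref = Rotation.from_rotvec(v).as_matrix()  # SciPy's equivalent conversion
assert np.allclose(R_exp, R_ref)
```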

The rotation vector representation also applies to the Wigner D-matrix $\tilde{\mathbf{R}}$. However, as it is induced by $\mathbf{R} \in SO(3)$, $\tilde{\mathbf{R}}$ only spans a subspace of $SO(9)$. The relations, as noted by Palmer et al., are as follows:

\begin{align*}
\mathbf{v} = [v_x, v_y, v_z]^T &\in \mathbb{R}^3 \\
\mathbf{v} \cdot \tilde{\mathbf{L}} = v_x \tilde{\mathbf{L}}_x + v_y \tilde{\mathbf{L}}_y + v_z \tilde{\mathbf{L}}_z &\in \mathfrak{so}(9) \\
\exp(\mathbf{v} \cdot \tilde{\mathbf{L}}) = \tilde{\mathbf{R}} &\in SO(9)
\end{align*}

where $\tilde{\mathbf{L}}_x$, $\tilde{\mathbf{L}}_y$, $\tilde{\mathbf{L}}_z$ are the bases of $\mathfrak{so}(9)$ induced by $SO(3)$ (Section 2 of their supplementary). (13) can then be reformulated as:

\mathbf{q} = \exp(\mathbf{v} \cdot \tilde{\mathbf{L}}) \mathbf{q}_0 \in \mathbb{R}^9, \quad \mathbf{v} \in \mathbb{R}^3

We characterize the set of 𝐪\mathbf{q} as the octahedral variety.
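A sketch of this parameterization, assuming the three $9 \times 9$ generators are available as precomputed matrices (e.g., transcribed from Palmer et al.'s supplementary; `L_tilde` below is such an assumed input):

```python
import numpy as np
from scipy.linalg import expm

Q0 = np.zeros(9)
Q0[4] = np.sqrt(7 / 12)
Q0[8] = np.sqrt(5 / 12)

def octahedral_coeffs(v, L_tilde):
    """Map a rotation vector v in R^3 to a point q on the octahedral variety.
    L_tilde is a (3, 9, 9) array stacking the so(9) generators induced by SO(3)."""
    A = v[0] * L_tilde[0] + v[1] * L_tilde[1] + v[2] * L_tilde[2]
    return expm(A) @ Q0
```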

Note that the matrix exponential does not commute in general, so its derivative behaves differently from the scalar exponential and can be quite involved. However, it does commute at $\mathbf{v} = [0, 0, 0]^T$, where the $v_i \tilde{\mathbf{L}}_i$ are zero matrices. Zhang et al. (2020) leverage this observation to write a closed-form expression for the spatial gradient of the SH coefficients and draw its connection to surface curvature.

A.5. Projection onto the octahedral variety

When optimizing the octahedral field, the quadratic constraint $c_0^2 + c_8^2 = 1$ of (14) is often omitted during initialization (Huang et al., 2011; Ray et al., 2016), so normal-aligned frames are likely to deviate from the octahedral variety. Interior frames often deviate even further, as they are mostly unconstrained except for smoothness.

For a general vector $\mathbf{f} \in \mathbb{R}^9$ that may not be obtainable by applying a rotation induced from $SO(3)$ to $\mathbf{q}_0$, projecting it back onto the octahedral variety means finding the $\mathbf{q}$ that minimizes their Euclidean distance:

\operatorname*{arg\,min}_{\mathbf{q}} \|\mathbf{q} - \mathbf{f}\|_2^2

Huang et al. (2011), Ray et al. (2016) and Solomon et al. (2017) leverage (9) or (13) and apply gradient descent over the rotation parameterization to find the projection. The resulting rotation can also be used to recover the representation vectors of the octahedral frame. However, the Euler angle representation suffers from gimbal lock, and the nonconvex objective easily gets stuck in local minima. Palmer et al. (2020) draw a connection between octahedral frames and fourth-order tensors (Chemin et al., 2019), and propose an exact projection method that circumvents this limitation.

For a general rotation matrix $\mathbf{R} = [\mathbf{v}_1 | \mathbf{v}_2 | \mathbf{v}_3]$, its descriptor function (12) can be written as:

(15) F(\mathbf{R}^T \mathbf{s}) = \sum_{i=1}^{3} (\mathbf{v}_i \cdot \mathbf{s})^4 = \left(\frac{8\sqrt{\pi}}{5\sqrt{21}}\right) \left(\frac{3\sqrt{21}}{4} Y_0^0(\mathbf{s}) + \mathbf{q}^T \mathbf{y}_4(\mathbf{s})\right)

where $\mathbf{y}_4$ is the vector form of the SH band 4 basis functions.

It is the homogeneous polynomial (the generalization of the quadratic form $\mathbf{s}^T \mathbf{M} \mathbf{s}$) of the fourth-order symmetric tensor with $\lambda_i = 1$ and the $\mathbf{v}_i$ as its orthogonal decomposition:

\mathbf{T} = \sum_{i=1}^{3} \lambda_i \mathbf{v}_i \otimes \mathbf{v}_i \otimes \mathbf{v}_i \otimes \mathbf{v}_i := \sum_{i=1}^{3} \lambda_i \mathbf{v}_i^{\otimes 4} \in \mathbb{R}^{3 \times 3 \times 3 \times 3} := \otimes^4 \mathbb{R}^3

where \otimes denotes tensor power. We use the notation suggested by Anandkumar et al. (2012).

For a fourth-order tensor to be orthogonally decomposable, the coefficients of its homogeneous polynomial (15) must satisfy a set of quadratic constraints (Boralevi et al., 2017). Palmer et al. propagate these constraints to $\mathbf{q}$ as 15 quadratic equations:

\begin{bmatrix} 1 \\ \mathbf{q} \end{bmatrix}^T \mathbf{P}_i \begin{bmatrix} 1 \\ \mathbf{q} \end{bmatrix} = 0, \quad i = 1 \dots 15

where the $\mathbf{P}_i$ are provided in Section 4 of their supplementary. Due to the nonconvexity of these constraints, they leverage a semidefinite relaxation to perform the projection.

When $\mathbf{q}$ lies on the octahedral variety, its representation vectors $\mathbf{v}_i$ are the eigenvectors of its orthogonally decomposable tensor $\mathbf{T}$, which are also the fixed points of its homogeneous polynomial (Robeva, 2016). They can be recovered iteratively using the tensor power method (Lathauwer et al., 1995):

\mathbf{v}_{t+1} = \frac{\nabla F_{\mathbf{q}}(\mathbf{v}_t)}{\|\nabla F_{\mathbf{q}}(\mathbf{v}_t)\|}

where $F_{\mathbf{q}}: \mathbb{R}^3 \to \mathbb{R}$ is (15) parameterized by the fixed $\mathbf{q}$.
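A self-contained NumPy sketch of this power iteration, run directly on an orthogonally decomposable tensor built from a known frame so that convergence can be checked:

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # ground-truth orthonormal axes
T = sum(np.einsum('i,j,k,l->ijkl', v, v, v, v) for v in Q.T)

v = rng.normal(size=3)
v /= np.linalg.norm(v)
for _ in range(50):
    # The gradient of T(v, v, v, v) is proportional to T(., v, v, v);
    # the constant factor is absorbed by the normalization.
    g = np.einsum('ijkl,j,k,l->i', T, v, v, v)
    v = g / np.linalg.norm(g)

# The iterate converges, up to sign, to one of the frame's axes.
assert np.isclose(np.max(np.abs(v @ Q)), 1.0)
```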

A.6. Explicit vector representation

In studying non-orthogonal frame fields, Desobry et al. (2021) propose an equivalent SH representation using zonal harmonics.

Zonal harmonics are the SH projection of a function that exhibits rotational symmetry about one axis. If that axis is the $z$ axis, representing such a function requires only one coefficient $z_l^0$ per band $l$ (Sloan, 2008):

f(\mathbf{s}) = \sum_l z_l^0 Y_l^0(\mathbf{s})

More prominently, the rotated version of this function towards a new direction $\mathbf{d}$ can be evaluated as:

(16)
\begin{align*}
f(\mathbf{s}) &= \sum_l z_l^0 \sqrt{\frac{4\pi}{2l+1}} \sum_m Y_l^m(\mathbf{d}) Y_l^m(\mathbf{s}) \\
&= \sum_l \sum_m \underbrace{z_l^0 \sqrt{\frac{4\pi}{2l+1}} Y_l^m(\mathbf{d})}_{c_l^m} Y_l^m(\mathbf{s})
\end{align*}

In our case, the $z^4$ component of the polynomial (12) is clearly invariant under rotations about the $z$ axis, and thus can be projected as:

\begin{align*}
F_z(\mathbf{s}) &= z^4 = (\mathbf{e}_z \cdot \mathbf{s})^4 \\
&= \left(\frac{16\sqrt{\pi} \|\mathbf{s}\|^4}{105}\right) \left(\frac{21}{8} Y_0^0(\mathbf{s}) + \frac{3\sqrt{5}}{2} Y_2^0(\mathbf{s}) + Y_4^0(\mathbf{s})\right) \\
&= z_0^0 Y_0^0(\mathbf{s}) + z_2^0 Y_2^0(\mathbf{s}) + z_4^0 Y_4^0(\mathbf{s})
\end{align*}

With (16), for any direction $\mathbf{v}_i$, we have:

(17)
\begin{align*}
F_{\mathbf{v}_i}(\mathbf{s}) &= (\mathbf{v}_i \cdot \mathbf{s})^4 \\
&= 2\sqrt{\pi} z_0^0 Y_0^0(\mathbf{v}_i) Y_0^0(\mathbf{s}) + \sqrt{\frac{4\pi}{5}} z_2^0 \sum_{m=-2}^{2} Y_2^m(\mathbf{v}_i) Y_2^m(\mathbf{s}) \\
&\quad + \frac{2\sqrt{\pi}}{3} z_4^0 \sum_{m=-4}^{4} Y_4^m(\mathbf{v}_i) Y_4^m(\mathbf{s})
\end{align*}

Thus, for a frame with any three representation vectors $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, let $\mathbf{R} = [\mathbf{v}_1 | \mathbf{v}_2 | \mathbf{v}_3]$; we have:

F(\mathbf{R}^T \mathbf{s}) = \sum_{i=1}^{3} (\mathbf{v}_i \cdot \mathbf{s})^4 = c_0 + \sum_{i=1}^{3} \mathbf{c}_2^i \cdot \mathbf{y}_2(\mathbf{s}) + \sum_{i=1}^{3} \mathbf{c}_4^i \cdot \mathbf{y}_4(\mathbf{s})

where $c_0 \in \mathbb{R}$, $\mathbf{c}_2^i \in \mathbb{R}^5$, and $\mathbf{c}_4^i \in \mathbb{R}^9$ are the coefficients from (17).

When the three representation vectors are orthogonal, the frame is an ordinary octahedral frame: the band 2 coefficients $\sum_{i} \mathbf{c}_2^i$ become the $\mathbf{0}$ vector, $\mathbf{R} \in SO(3)$ is a rotation matrix, and the expression above is equivalent to (15) up to a constant scaling.

The advantage of this expression is that it makes the SH band 4 coefficient representation directly differentiable with respect to the representation vectors, without the need for an iterative matrix-logarithm conversion to the rotation vector.

Appendix B Numerical results

The quantitative results on the ABC and Thingi10k datasets are listed in Table 2. We additionally ablate the choice of regularization weight (Table 3) and the use of DiGS as the backbone (Table 4), and provide visualizations of both cases (Figure 12).

Table 3. Ablation study of the regularization weight, reported as "weight / noise σ". Metrics are averaged across the two datasets.
| Weight / noise σ | Chamfer ↓ | Hausdorff ↓ | F-score ↑ |
| 5 / 0.002L | 3.140 | 6.363 | 95.779 |
| 10 / 0.002L | 2.962 | 5.981 | 95.822 |
| 5 / 0.01L | 4.174 | 6.822 | 89.934 |
| 10 / 0.01L | 3.568 | 6.167 | 90.175 |
Figure 12. We ablate our results with half the regularization weight (left) and with DiGS as the backbone (right). A larger regularization weight helps suppress noise, but at a higher cost of filling holes. Our method is less effective on DiGS, as its smoothness prior on the SDF limits our regularization of sharp edges.

Table 4. Ablation study of different backbones, reported as "backbone / noise σ". Metrics are averaged across the two datasets.
| Backbone / noise σ | Chamfer ↓ | Hausdorff ↓ | F-score ↑ |
| DiGS / 0.002L | 3.384 | 8.054 | 94.244 |
| NSH / 0.002L | 2.962 | 5.981 | 95.822 |
| DiGS / 0.01L | 4.744 | 8.715 | 88.113 |
| NSH / 0.01L | 3.568 | 6.167 | 90.175 |