
Second-Order Convergent Collision-Constrained Optimization-Based Planner

Chen Liang1, Xifeng Gao2, Kui Wu2, Zherong Pan2†
1Chen Liang is with the Department of Computer Science, Zhejiang University. 2Xifeng Gao, Kui Wu, and Zherong Pan are with LightSpeed Studios, Tencent America. {xifgao,kwwu,zrpan}@global.tencent.com. †Corresponding author.
Abstract

Finding robot poses and trajectories represents a foundational aspect of robot motion planning. Despite decades of research, efficiently and robustly addressing these challenges is still difficult. Existing approaches are often plagued by various limitations, such as intricate geometric approximations, violations of collision constraints, or slow first-order convergence. In this paper, we introduce two novel optimization formulations that offer provable robustness, achieving second-order convergence while requiring only a convex approximation of the robot's links and obstacles. Our first method, known as the Explicit Collision Barrier (ECB) method, employs a barrier function to guarantee separation between convex objects. ECB uses an efficient matrix factorization technique, enabling a second-order Newton's method with an iterative complexity linear in the number of separating planes. Our second method, referred to as the Implicit Collision Barrier (ICB) method, further transforms the separating planes into implicit functions of robot poses. We show such an implicit objective function is twice-differentiable, with derivatives evaluated at a linear complexity. To assess the effectiveness of our approaches, we conduct a comparative study with a first-order baseline algorithm across six testing scenarios. Our results show that both methods converge significantly faster than the baseline algorithm.

Index Terms:
Trajectory and Pose Optimization, Collision Constraint, Convex Analysis

I Introduction

The issue of robot safety stands as a paramount concern, imposing stringent constraints on motion planning algorithms. Among these constraints, collision avoidance requirements demand that the robot maintains a minimum distance from static or moving obstacles at all times. This paper addresses the challenge of generating optimal robot poses and trajectories while adhering to collision constraints and optimizing a user-defined cost function. Such a problem finds practical applications in (multi-)UAV trajectory planning [1], inverse kinematics [2], object settling [3], and dynamic simulations [4]. Ideally, an effective planning algorithm should possess several key attributes to cater to the diverse needs of these applications: (Scalability) handling a significant number of robot links and obstacles; (Efficacy) rapid convergence and low iterative and overall computational costs; (Robustness) guarantee to satisfy all collision constraints; (Generality) applying to arbitrary articulated robots with complex geometries. However, despite decades of dedicated research in pursuit of such algorithms, it remains a challenge to find a single solution that fully satisfies all these diverse requirements.

Existing trajectory generation algorithms are engineered to suit a set of specific applications. Take, for instance, (online) UAV trajectory generation, where computational efficiency is paramount. Existing algorithms [5] employ penalty methods to handle constraints, which can compromise robustness. In contrast, other algorithms such as [1, 6] offer guaranteed robustness and global optimality. Unfortunately, their applicability is limited to point robots and convex-decomposable freespace. When dealing with general articulated robots, the formulation of collision constraints becomes notably challenging. Many optimization-based algorithms, as seen in the works [7, 8], once again resort to less robust penalty-based techniques. Furthermore, these algorithms often exhibit limited efficacy, having only first-order convergence. Recent advancements [9, 10] have introduced interior point methods to address collision constraints with guaranteed robustness. However, [9] necessitates the decomposition of robot geometry into a large number of triangles, which scales poorly with the complexity of the robot geometry. On the other hand, [10] relies on the strict convexity of robot links, demanding intricate geometric modifications and suffering from first-order convergence limitations.

Method | Scalability | Efficacy | Robustness | Generality
[5] | point-penalty | second-order | no | point
[1, 6] | convex | mixed-integer | yes | point
[9] | triangle | second-order | yes | linear motion
[10] | convex | first-order | yes | articulated
[3] | point | second-order | no | articulated
[7, 8] | point-penalty | first-order | no | articulated
[11] | convex | first-order | yes | articulated
Ours | convex | second-order | yes | articulated
TABLE I: Comparison of planning methods across four dimensions: (Scalability) granularity (convex > triangle > point) of collision constraints; finer granularity can result in a larger number of constraints with limited scalability (certain algorithms use point-wise soft penalties and do not involve hard constraints, denoted point-penalty); (Efficacy) type/convergence speed of the underlying numerical algorithm; (Robustness) whether collision constraints are guaranteed to be satisfied; (Generality) allowed type of robot motions.

Main Results: We present two innovative optimization solvers designed for collision-constrained robot pose and trajectory planning. Our approach highlights a structural analysis of the barrier function governing collision constraints between pairs of convex hulls. We demonstrate that the Hessian matrix of this barrier function possesses a unique structure that enables efficient factorization using a Schur-complement solver, leading to our ECB method. Furthermore, by employing a generalized barrier function, we eliminate the need for a separating plane between convex hull pairs by representing it as an implicit function of robot poses, leading to our ICB method. Both of these advancements empower us to apply second-order convergent Newton’s methods with an iterative cost linear in the number of convex hull pairs.

Our method exhibits several advantageous properties that have not been simultaneously achieved by prior algorithms. First, we decompose the robot's geometry into convex hulls rather than points [3] or triangles [9]. As a result, our method can scale to large, complex robot and environmental geometries using a moderate number of geometric primitives, leading to enhanced scalability. Second, in contrast to previous first-order convergent methods [12, 10], our approach enjoys faster second-order convergence. Finally, our method adopts the interior point algorithm that inherits the robustness established by [9, 10], i.e., our method ensures the collision-free property of the optimized robot poses/trajectories. We have conducted evaluations of our method across six challenging robot planning scenarios. Our results demonstrate that our approach achieves significantly faster convergence compared to the first-order baseline method.

Figure 1: We exemplify the optimization process involving two articulated robot poses, wherein the two end-effectors (red) must maintain a certain separation distance. Previous approaches [5, 3] substitute such collision constraints with point constraints, while [9] suggests discretizing the end-effector using a triangle mesh and employing triangle constraints. In contrast, our method builds upon the convex constraints formulation introduced by [11, 13], where a separating plane denoted as $p_{ij}$ is jointly optimized along with the robot poses. While [11, 13] are first-order convergent methods, our approach stands out with its local second-order convergence.

II Related Work

We review related work on collision-constrained planning, collision constraint formulations, numerical optimization techniques, and semi-infinite collision constraints.

Collision-Constrained Planning: The algorithms summarized in Table I predominantly fall into the category of local optimization methods. These approaches leverage (high-order) derivatives of the objective/constraint functions to efficiently identify locally optimal plans. In contrast, optimal sampling-based methods [14, 15] pursue globally optimal plans by exploring the entire configuration space. However, a notable drawback of these methods is their limited scalability concerning the dimensionality of the configuration space [16]. An additional category of techniques includes the control barrier function [17], designed for dynamic systems to adhere to specific constraints. However, these algorithms primarily focus on constraint satisfaction and do not directly address the planning of robot motions that fulfill users' requirements. We also acknowledge recent advancements [18, 19] that involve precomputing feasible sub-domains within the configuration space. Subsequently, mixed-integer second-order numerical solvers are employed to derive globally optimal motion plans restricted to these sub-domains. This approach represents a promising alternative to our method, which, in contrast, obviates the need for sub-domain precomputation or restriction.

Collision-Constraint Formulations: The local optimization techniques outlined in Table I can be distinguished by the granularity at which they formulate collision constraints. Several methods [5, 3] opt to replace collision constraints with point constraints, due to their neat closed-form derivatives. However, this approach necessitates a considerable number of point constraints to ensure the safety of a robot with complex geometries. Recent advancements, as exemplified in [9], extend point constraints to collision constraints between pairs of triangles, benefiting from closed-form second-order derivatives. Nevertheless, even this approach requires a substantial number of triangles to represent intricate robot geometries, resulting in a high volume of collision constraints. As proposed in [13, 11], the number of geometric primitives can be significantly reduced if we can formulate collision constraints between general convex hulls. While [12] demonstrated that the distance function between convex hulls possesses first-order derivatives, a comprehensive exploration of second-order differentiability and the efficient evaluation of convex constraints remains an area that requires further investigation.

Numerical Optimization Techniques: Methods from Table I harness the power of numerical optimization solvers to search for optimal plans. The majority of existing approaches [5, 1, 6] cast the problem in the form of a conventional (mixed-integer) Nonlinear Program (NLP) and subsequently employ off-the-shelf optimization software packages [20, 21]. These algorithms typically exhibit second-order convergence when an accurate Hessian matrix can be evaluated. However, they regress to first-order convergence when solely gradient information is accessible, as in [12]. Furthermore, customized optimization solvers have been devised to enhance robustness or efficiency. For instance, [9] seamlessly integrated continuous collision checks into the line-search procedure to circumvent the challenge of tunneling. Meanwhile, methods like [3, 10] strategically sample collision constraints along the trajectory to approximate the solution of a Semi-Infinite Program (SIP). [11] introduced a first-order ADMM approach that interleaves the optimization of robot trajectories and the separating planes between convex hulls to facilitate parallel computation.

Semi-Infinite Collision Constraints: As clarified in Section I, the necessity to satisfy collision constraints at infinitely many time instances inherently gives rise to SIP problem instances, a concept recently articulated by [3, 10]. While this paper primarily addresses a finite set of collision constraints, we emphasize that our methodology can be adapted to tackle SIP problems. Notably, existing SIP solvers propose to transform SIP into a standard NLP by either sampling [3] or discretizing [10] the constraints. In this context, we can readily substitute the underlying solver in [3, 10] with our approach to achieve second-order convergence when dealing with convex constraints. However, it is noteworthy that, while this enables second-order convergence in the NLP subproblem, the convergence speed of the original SIP problem remains an open question and warrants further exploration, which we leave as a subject for future work.

III Problem Statement

We address the challenge of collision-constrained optimization, focusing on the task of optimizing the pose/trajectory of a robot within the context of a twice-differentiable objective function $\mathcal{O}(\theta)$, where $\theta$ represents the robot's configuration parameter. The robot is subject to a set of collision constraints taking the following general form:

$$C_{i}(\theta)\cap C_{j}(\theta)=\emptyset\quad\forall ij\in\mathcal{C},$$

Here, $C_i$ and $C_j$ represent pairs of potentially contacting objects drawn from the set of contact pairs denoted as $\mathcal{C}$. $C_\bullet$ can correspond to a link within an articulated robot or a static obstacle. For the sake of generality, we assume that $\mathcal{O}(\theta)$ is a function of the robot's configuration. Additionally, we presume that $C_\bullet$ possesses a (not necessarily strictly) convex shape, which is a widely adopted geometric representation for robots [12, 22, 18]. Using convex shapes can effectively reduce the number of constraints as compared with point-pair or triangle-pair constraints. Non-convex shapes can be further decomposed into convex primitives. Consequently, the primary optimization model addressed in this paper can be summarized as:

$$\underset{\theta}{\text{argmin}}\;\mathcal{O}(\theta)\quad\text{s.t.}\;C_{i}(\theta)\cap C_{j}(\theta)=\emptyset\quad\forall ij\in\mathcal{C}.\quad(1)$$

Equation 1 embraces a generalized formulation that encompasses both the optimization of a static robot pose and a dynamic robot trajectory. In the case of optimizing a static robot pose, $\theta$ corresponds to a single configuration, and $\mathcal{C}$ is a finite set that enumerates all potential convex object pairs susceptible to collision. In contrast, when optimizing a robot trajectory, $\theta$ represents the parameterization of the robot's trajectory, such as the control points of a spline curve [1, 23]. In this scenario, $\mathcal{C}$ expands to an infinite set that encompasses all possible convex object pairs at infinitely many time instances across the trajectory. This extension leads to the formulation of SIP problem instances. In the latter case, $\mathcal{C}$ can be further approximated by sampling at finite time instances [3] or discretizing [10], thus unified with the former case.
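The sampling reduction mentioned above can be sketched in a few lines of pure Python. The uniform time grid and the pair list below are illustrative assumptions, not the paper's spline parameterization or pruning scheme:

```python
# Reduce the semi-infinite constraint set C to a finite one by sampling
# the trajectory at discrete time instances (cf. [3]); each sampled time
# contributes one collision constraint per potentially colliding pair.

def sample_constraint_set(link_pairs, t0, t1, n_samples):
    """Enumerate (time, i, j) collision constraints at sampled instances."""
    constraints = []
    for k in range(n_samples):
        t = t0 + (t1 - t0) * k / (n_samples - 1)
        for (i, j) in link_pairs:
            constraints.append((t, i, j))
    return constraints

pairs = [(0, 1), (0, 2)]            # hypothetical link/obstacle pairs
C = sample_constraint_set(pairs, 0.0, 1.0, 5)
```

A static pose is recovered as the special case of a single time sample.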

IV Method

The effective resolution of Equation 1 poses practical challenges, particularly when dealing with convex collision constraints known for their non-differentiability [12, 3]. As illustrated in Figure 1, when complex environmental geometries are considered, the number of collision pairs can rapidly escalate, leading to sizable problem dimensions. In this study, we introduce two efficient optimization techniques, both featuring local second-order convergence and an iterative cost linear in the number of collision pairs. Our approach draws inspiration from recent research [13, 11], which introduced separating planes between pairs of convex objects, represented by the normal direction $n_{ij}$ and offset $d_{ij}$. However, prior works [13, 11] employed an Alternating Optimization (AO) approach that interleaved the updates of $n_{ij}, d_{ij}$ and $\theta$. Unfortunately, this approach is limited to at most first-order convergence speed [24]. Our primary contribution is the development of two quasi-Newton algorithms that jointly optimize $n_{ij}, d_{ij}$, and $\theta$ without significantly increasing the computational overhead. To achieve this, we augment our objective function with a barrier energy designed to prevent the intersection of $C_i$ and $C_j$. In our ECB method, we derive the Hessian matrix of the augmented objective function with respect to $n_{ij}, d_{ij}$, and $\theta$. Subsequently, we efficiently invert this large Hessian matrix using the Schur-complement lemma [25]. In our ICB method, we eliminate $n_{ij}$ and $d_{ij}$ by expressing them as implicit functions of $\theta$ and derive its Hessian using the implicit function theorem [26].

IV-A ECB Method

A convex object can be represented in two ways: the V-representation characterizes $C_\bullet$ using a set of vertices, while the H-representation characterizes it using a set of half-spaces. In this work, we consistently utilize the V-representation. Without any ambiguity, we denote $C_\bullet$ as the set of vertices. We assume each vertex $x(\theta)\in C_\bullet$ is a twice-differentiable function of $\theta$, which holds generally for articulated robots with $x(\theta)$ being the forward-kinematics function. To define a valid separating plane $p_{ij}\triangleq(n_{ij}, d_{ij})$, it should ensure that the vertices of $C_i$ and $C_j$ are on opposite sides of it, i.e.:

$$n_{ij}x+d_{ij}<0\;\;\forall x\in C_{i}\quad\land\quad n_{ij}x+d_{ij}>0\;\;\forall x\in C_{j}.$$
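Checking a candidate plane against the two V-representations amounts to a pair of sign tests over the vertices; a minimal pure-Python sketch in 2D (the helper name and the toy squares are ours):

```python
def separates(n, d, Ci, Cj):
    """True iff n.x + d < 0 for all vertices of Ci and > 0 for all of Cj."""
    dot = lambda n, x: sum(a * b for a, b in zip(n, x))
    return (all(dot(n, x) + d < 0 for x in Ci) and
            all(dot(n, x) + d > 0 for x in Cj))

# Two unit squares separated along the x-axis; the plane x = 1.5 (n = (1, 0),
# d = -1.5) lies between them.
Ci = [(0, 0), (1, 0), (1, 1), (0, 1)]
Cj = [(2, 0), (3, 0), (3, 1), (2, 1)]
```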

We enforce each of these linear constraints by introducing a barrier energy function $P(\bullet)$ and augmenting the objective function as follows:

$$\bar{\mathcal{O}}(\theta,\cdots,p_{ij},\cdots)=\mathcal{O}(\theta)+\sum_{ij\in\mathcal{C}}P_{ij}(\theta,p_{ij})$$
$$P_{ij}(\theta,p_{ij})\triangleq\sum_{x\in C_{i}}P(-n_{ij}x(\theta)-d_{ij})+\sum_{x\in C_{j}}P(n_{ij}x(\theta)+d_{ij}),$$

where we assume the barrier function $P(\bullet)$ is strictly convex, defined on the interval $(0,\infty)$, and satisfies the condition $\lim_{\bullet\to 0^{+}}P(\bullet)=\infty$. Consequently, our primary optimization problem Equation 1 is transformed into the following barrier-augmented optimization:

$$\underset{\theta,p_{\bullet}}{\text{argmin}}\;\bar{\mathcal{O}}(\theta,\cdots,p_{ij},\cdots)\quad\text{s.t.}\;\|n_{ij}\|=1\quad\forall ij\in\mathcal{C}.$$

Note that we still need an additional constraint to ensure unit normal vectors for the separating planes. This method was previously proposed in [11]. However, they optimized pijp_{ij} and θ\theta in separate sub-problems. In contrast, we introduce a joint optimization formulation for both pijp_{ij} and θ\theta, achieving faster convergence. We refer to this method as the ECB method since all variables are explicitly treated as independent decision variables.
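To illustrate the barrier terms, $P_{ij}$ can be evaluated directly from the V-representations. The sketch below uses a plain $-\log$ barrier as one function satisfying the stated conditions (a stand-in, not necessarily the penalty used in our implementation), with toy 2D vertex data:

```python
import math

def P(s):
    """Log-barrier stand-in: strictly convex on (0, inf), -> inf as s -> 0+."""
    return -math.log(s) if s > 0 else math.inf

def P_ij(n, d, Ci, Cj):
    """Barrier energy of one separating plane (n, d) against both hulls."""
    dot = lambda n, x: sum(a * b for a, b in zip(n, x))
    return (sum(P(-(dot(n, x) + d)) for x in Ci)     # Ci on the negative side
            + sum(P(dot(n, x) + d) for x in Cj))     # Cj on the positive side

Ci = [(0.0, 0.0), (1.0, 0.0)]
Cj = [(2.0, 0.0), (3.0, 0.0)]
```

The energy is finite only for strictly separating planes, so any descent method that keeps the objective finite maintains the constraint.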

Our approach to optimizing the explicit formulation employs a standard Newton's method. We calculate the joint Hessian matrix $\nabla^{2}\bar{\mathcal{O}}$ and then determine the joint update direction $\delta\triangleq(\delta\theta^{T},\cdots,\delta p_{ij}^{T},\cdots)^{T}$ by solving the following KKT system:

$$0=\begin{pmatrix}\nabla^{2}\bar{\mathcal{O}}&J\\J^{T}&0\end{pmatrix}\begin{pmatrix}\delta\\\lambda\end{pmatrix}+\begin{pmatrix}\nabla\bar{\mathcal{O}}\\0\end{pmatrix},\quad(8)$$

where $J$ represents the Jacobian of the unit-normal constraint $\|n_{ij}\|=1$, with each column of $J$ corresponding to the normal vector of a separating plane. However, this straightforward approach can encounter two main issues. First, when dealing with complex environmental geometries, the number of separating planes can reach tens of thousands [10], resulting in large Hessian matrices. If we use standard direct factorization methods to invert such matrices, the computational overhead can compromise the advantages of second-order convergence. Second, the Hessian matrix can exhibit exceedingly large condition numbers, leading to numerical instability. Fortunately, we can leverage the unique structure of the Hessian matrix to develop an efficient and stable matrix factorization scheme with linear complexity in the number of separating planes.

We first notice that a separating plane only appears in a single term $P_{ij}$. Therefore, the Hessian matrix takes the following form:

$$\nabla^{2}\bar{\mathcal{O}}=\begin{pmatrix}\nabla_{\theta\theta}\bar{\mathcal{O}}&\cdots&\nabla_{\theta p}P_{ij}&\cdots\\\vdots&\ddots&&\\\nabla_{p\theta}P_{ij}&&\nabla_{pp}P_{ij}&\\\vdots&&&\ddots\end{pmatrix}.$$

We propose a Gauss elimination approach to solve the KKT system in Equation 8 and show that this procedure is well-defined. We first notice that $\lambda$ can also be decomposed into $\lambda_{ij}$, each related to one separating plane. The joint linear sub-system of $p_{ij},\lambda_{ij}$ takes the following form:

$$\begin{pmatrix}\nabla_{p\theta}P_{ij}\\0\end{pmatrix}\delta\theta+\begin{pmatrix}\nabla_{pp}P_{ij}&n_{ij}\\n_{ij}^{T}&0\end{pmatrix}\begin{pmatrix}\delta p_{ij}\\\lambda_{ij}\end{pmatrix}=\begin{pmatrix}\nabla_{p}P_{ij}\\0\end{pmatrix},$$

where we abuse notation and append a zero entry to $n_{ij}$ when necessary. Supposing the coefficient matrix is invertible, we can solve for $\delta p_{ij}$ as:

$$\delta p_{ij}=H_{ij}\left(\nabla_{p}P_{ij}-\nabla_{p\theta}P_{ij}\delta\theta\right)\quad(9)$$
$$H_{ij}\triangleq\nabla_{pp}P_{ij}^{-1}\left[I-n_{ij}n_{ij}^{T}\frac{\nabla_{pp}P_{ij}^{-1}}{n_{ij}^{T}\nabla_{pp}P_{ij}^{-1}n_{ij}}\right],$$

where we have used Gauss elimination to factor out $\lambda_{ij}$, assuming $\nabla_{pp}P_{ij}$ is invertible. Next, we plug all the $\delta p_{ij}$ into the first equation to yield the following Schur-complement system of $\theta$ alone:

$$H_{\theta}\delta\theta=\left[\nabla_{\theta}\bar{\mathcal{O}}-\sum_{ij\in\mathcal{C}}\nabla_{\theta p}P_{ij}H_{ij}\nabla_{p}P_{ij}\right]\quad(10)$$
$$H_{\theta}\triangleq\left[\nabla_{\theta\theta}\bar{\mathcal{O}}-\sum_{ij\in\mathcal{C}}\nabla_{\theta p}P_{ij}H_{ij}\nabla_{p\theta}P_{ij}\right],$$

which is then solved for $\delta\theta$. Notably, the above procedure has a complexity linear in the number of hyperplanes, which is much faster than using an off-the-shelf solver to factorize the large KKT matrix in Equation 8. However, the well-definedness of the above system relies on the invertibility of the matrices $\nabla_{pp}P_{ij}$ and $H_{\theta}$. Using the Schur-complement lemma [25], we have the following corollary (proved in the appendix):

Corollary IV.1.

If $\nabla^{2}\bar{\mathcal{O}}\succeq\epsilon I$, then the Schur-complement solver is well-defined and $H_{\theta}\succeq\epsilon I$.

In practice, however, the strict requirement $\nabla^{2}\bar{\mathcal{O}}\succeq\epsilon I$ may not be met, and we introduce a diagonal perturbation to ensure well-definedness. Specifically, we employ an adjustment function $\mathcal{A}(H,\epsilon)$ that ensures all eigenvalues of $H$ are greater than a specified $\epsilon$. This adjustment is applied to both $\nabla_{pp}P_{ij}$ and $H_{\theta}$. The primary challenge in this adjustment process is the eigendecomposition. Thankfully, this operation is computationally efficient due to the relatively small matrix sizes involved. Following the computation of $\delta$, we utilize a line-search strategy to find a step size that strictly decreases the objective function. Finally, we re-normalize $n_{ij}$ after the update. The algorithm pipeline for the ECB method is outlined in Algorithm 1.
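One way to realize the adjustment function $\mathcal{A}(H,\epsilon)$ is an eigenvalue clamp; a minimal numpy sketch (the function name `adjust` and the test matrix are ours, a stand-in for the paper's exact implementation):

```python
import numpy as np

def adjust(H, eps):
    """A(H, eps): eigendecompose a symmetric H and clamp its eigenvalues
    from below so that the result satisfies H >= eps * I."""
    w, V = np.linalg.eigh(H)          # H is assumed symmetric
    w = np.maximum(w, eps)
    return V @ np.diag(w) @ V.T

H = np.array([[1.0, 2.0],
              [2.0, 1.0]])            # eigenvalues 3 and -1: indefinite
H_adj = adjust(H, 1e-3)
```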

Algorithm 1 ECB Method
1: Input: initial $\theta, p_{ij}, \mathcal{C}, \epsilon_{1}, \epsilon_{2}, \epsilon$
2: Output: locally optimal $\theta$
3: while $\|\nabla\bar{\mathcal{O}}\|_{\infty}>\epsilon$ do
4:     Compute derivatives in Equation 8
5:     for $ij\in\mathcal{C}$ do
6:         $\nabla_{pp}P_{ij}\leftarrow\mathcal{A}(\nabla_{pp}P_{ij},\epsilon_{1})$
7:         Form complement system for $p_{ij}$ (Equation 9)
8:     Form complement system for $\theta$ (Equation 10)
9:     $H_{\theta}\leftarrow\mathcal{A}(H_{\theta},\epsilon_{2})$, then solve for $\delta\theta$
10:     for $ij\in\mathcal{C}$ do
11:         Plug $\delta\theta$ into Equation 9 for $\delta p_{ij}$
12:     Use line-search procedure to find step size $\alpha$
13:     Update $\theta\leftarrow\theta+\alpha\delta\theta$
14:     for $ij\in\mathcal{C}$ do
15:         Update $p_{ij}\leftarrow p_{ij}+\alpha\delta p_{ij}$, $n_{ij}\leftarrow n_{ij}/\|n_{ij}\|$
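The per-plane elimination of Equations 9 and 10 can be made concrete with a short numpy sketch. It solves the KKT system of Equation 8 by block elimination, at a cost linear in the number of planes; the block dimensions, variable names, and the random toy instance are ours, and the eigenvalue adjustment and line search are omitted:

```python
import numpy as np

def schur_solve(A, planes, a):
    """Block-eliminate each separating-plane block of the KKT system.
    A = Hessian block of theta, a = its gradient-side right-hand side;
    each plane carries (C, B, n, b): its Hessian block, coupling block,
    unit normal, and right-hand side."""
    H = A.copy()                       # will become H_theta of Eq. (10)
    r = a.copy()
    Ms = []
    for C, B, n, b in planes:
        Cinv = np.linalg.inv(C)
        # H_ij of Eq. (9): C^-1 projected onto the tangent of the
        # unit-normal constraint (lambda eliminated by Gauss elimination)
        M = Cinv - (Cinv @ np.outer(n, n) @ Cinv) / (n @ Cinv @ n)
        H = H - B @ M @ B.T            # Schur complement onto theta
        r = r - B @ M @ b
        Ms.append(M)
    x = np.linalg.solve(H, r)          # delta-theta
    ys = [M @ (b - B.T @ x) for M, (C, B, n, b) in zip(Ms, planes)]
    return x, ys                       # delta-theta and each delta-p

# Toy instance: 2 configuration DOFs, one plane block of 3 variables
# (2D normal + offset), one unit-normal constraint row.
rng = np.random.default_rng(0)
A = np.eye(2) * 3.0                    # SPD theta block
Craw = rng.standard_normal((3, 3))
C = Craw @ Craw.T + 2.0 * np.eye(3)    # SPD plane block
B = 0.1 * rng.standard_normal((2, 3))  # theta-plane coupling
n = np.array([1.0, 0.0, 0.0])
b = rng.standard_normal(3)
a = rng.standard_normal(2)
x, ys = schur_solve(A, [(C, B, n, b)], a)
```

Each plane contributes an independent block, so the per-plane work inside the loop parallelizes trivially.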

IV-B ICB Method

The primary computational challenge encountered in the ECB method is the inversion of the large Hessian matrix. However, many variables correspond to separating planes, which are intermediary variables unrelated to the robot's motion itself. Hence, it is advantageous to eliminate these variables, leaving only $\theta$ as the independent variable. In this section, we demonstrate that it is indeed possible to express $p_{ij}$ as a well-defined, sufficiently smooth function of $\theta$. For this reason, we refer to this method as the ICB method. To initiate this discussion, we observe that each plane $p_{ij}$ only appears in the collision penalty terms. Therefore, the optimal $p_{ij}$ should satisfy the following optimization problem:

$$p_{ij}\in\underset{n_{ij},d_{ij}}{\text{argmin}}\;P_{ij}(\theta,p_{ij})\quad\text{s.t.}\;\|n_{ij}\|=1.$$

However, the formulation above does not guarantee a well-defined $p_{ij}(\theta)$ because the solution may not be unique due to the additional non-convex unit-norm constraint. To obtain a well-defined function $p_{ij}(\theta)$, we turn to an equivalent formulation based on the Support Vector Machine (SVM) [27]. This formulation replaces the non-convex constraint $\|n_{ij}\|=1$ with a relaxed convex constraint $\|n_{ij}\|\leq 1$, giving the following alternative definition:

$$p_{ij}\in\underset{n_{ij},d_{ij}}{\text{argmin}}\;P_{ij}(\theta,p_{ij})\quad\text{s.t.}\;\|n_{ij}\|\leq 1.$$

It is evident that the above optimization is convex. However, convexity alone is insufficient to express $p_{ij}$ as a function of $\theta$ because there may not be a unique minimizer. Even if a unique solution exists, the resulting implicit function $p_{ij}(\theta)$ may not be differentiable. As pointed out in [26], general convex optimization problems only admit sub-gradients. Therefore, we introduce the following smoothed optimization amenable to differentiation, employing our log-barrier function $P(\bullet)$ to handle the additional convex constraint:

$$p_{ij}(\theta)\triangleq\underset{n_{ij},d_{ij}}{\text{argmin}}\;P_{ij}(\theta,p_{ij})+P(1-\|n_{ij}\|).\quad(11)$$

We have several essential properties for Equation 11, summarized below (proved in the appendix):

Corollary IV.2.

Assuming $C_i$ and $C_j$ are closed sets, $C_i$ or $C_j$ has non-zero volume, i.e., $|C_i|\neq 0$ or $|C_j|\neq 0$, and $P(\bullet)$ is third-order differentiable, we have the following properties: 1) Equation 11 has a solution iff $C_i(\theta)\cap C_j(\theta)=\emptyset$; 2) $p_{ij}$ is a well-defined function of $\theta$; 3) $p_{ij}(\theta)$ is twice-differentiable; 4) $\lim_{\text{dist}(C_i,C_j)\to 0^{+}}P_{ij}(\theta,p_{ij}(\theta))=\infty$, with $\text{dist}(\bullet,\bullet)$ being the distance between convex hulls.

Algorithm 2 ICB Method
1: Input: initial $\theta, p_{ij}, \mathcal{C}, \epsilon_{1}, \epsilon$
2: Output: locally optimal $\theta$
3: while $\|\nabla\tilde{\mathcal{O}}\|_{\infty}>\epsilon$ do
4:     for $ij\in\mathcal{C}$ do
5:         $p_{ij}(\theta)\leftarrow$ solution to Equation 11
6:         Compute derivatives of $p_{ij}(\theta)$ (appendix)
7:     Form $\nabla\tilde{\mathcal{O}}$ and $\nabla^{2}\tilde{\mathcal{O}}$
8:     Adjust $\nabla^{2}\tilde{\mathcal{O}}\leftarrow\mathcal{A}(\nabla^{2}\tilde{\mathcal{O}},\epsilon_{1})$
9:     Form search direction $\delta\theta\leftarrow-\nabla^{-2}\tilde{\mathcal{O}}\nabla\tilde{\mathcal{O}}$
10:     Use line-search procedure to find step size $\alpha$
11:     Update $\theta\leftarrow\theta+\alpha\delta\theta$

Corollary IV.2 ensures the differentiable structure of $p_{ij}(\theta)$ as long as one of the convex sets is non-degenerate. This limitation is not significant for robotic applications because robot links practically always have non-zero volume. With this structure, we can define an implicit optimization approach and solve the following optimization problem:

$$\underset{\theta}{\text{argmin}}\;\tilde{\mathcal{O}}(\theta)\triangleq\bar{\mathcal{O}}(\theta,\cdots,p_{ij}(\theta),\cdots),\quad(12)$$

replacing the explicit variable $p_{ij}$ with the implicit function $p_{ij}(\theta)$. Corollary IV.2 3) ensures that Equation 12 has a twice-differentiable objective function, and Corollary IV.2 4) ensures that the feasibility of Equation 12 implies the collision-free property. To evaluate this objective function at a given $\theta$, we need to solve Equation 11 for every pair of convex sets in close proximity. After the solution, we can evaluate the first- and second-order derivatives of $p_{ij}(\theta)$ and $\tilde{\mathcal{O}}$ (appendix). The computational complexity of these evaluations is linear in the number of separating planes, similar to our explicit algorithm. However, solving Equation 11 for each pair of convex sets may initially appear computationally burdensome. Fortunately, we can expedite the process by employing a warm-start strategy, where we retain the solution from the previous iteration. If this warm-start approach fails (for instance, if the solution from the last iteration is infeasible), we can establish a feasible initial estimate by computing the closest pair of points on $C_i$ and $C_j$. We then use this information to determine the initial separating plane, setting it as the middle sectioning plane. In practice, we can solve the subproblems in Equation 11 with just a few Newton iterations, and all of these subproblems can be solved in parallel. The algorithmic workflow of the ICB method is summarized in Algorithm 2. Since the Hessian matrix of $\tilde{\mathcal{O}}$ remains relatively small, we can calculate its dense eigendecomposition, which is essential for matrix inversion and the positive-definite adjustment. This adjustment ensures that the minimal eigenvalue of the Hessian matrix is at least $\epsilon_1$.
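To illustrate the inner solve, consider a 1D instance of Equation 11 where $C_i$ and $C_j$ are point sets on a line, so $n_{ij}$ and $d_{ij}$ are scalars. The sketch below uses $P=-\log$ as a stand-in barrier and plain gradient descent with a feasibility-preserving backtracking line search instead of the few Newton iterations used in practice (all names and the toy data are ours):

```python
import math

def solve_plane(Ci, Cj, n, d, iters=300, lr=0.02):
    """Minimize P_ij(theta, p) + P(1 - |n|) over scalar (n, d), Eq. (11);
    (n, d) warm-starts from the previous plane."""
    def fgrad(n, d):
        margins = ([1.0 - abs(n)]
                   + [-(n * x + d) for x in Ci]   # Ci on the negative side
                   + [n * x + d for x in Cj])     # Cj on the positive side
        if min(margins) <= 0.0:
            return math.inf, 0.0, 0.0             # outside the barrier domain
        f = -sum(math.log(s) for s in margins)
        gn = (math.copysign(1.0, n) / (1.0 - abs(n))
              + sum(x / -(n * x + d) for x in Ci)
              - sum(x / (n * x + d) for x in Cj))
        gd = (sum(1.0 / -(n * x + d) for x in Ci)
              - sum(1.0 / (n * x + d) for x in Cj))
        return f, gn, gd

    for _ in range(iters):
        f, gn, gd = fgrad(n, d)
        step = lr
        while fgrad(n - step * gn, d - step * gd)[0] > f:
            step *= 0.5                           # stay feasible, decrease f
        n, d = n - step * gn, d - step * gd
    return n, d

# C_i = {0, 1}, C_j = {2, 3}; this instance has the unique interior
# optimum (n, d) = (0.8, -1.2), strictly inside ||n|| <= 1.
n, d = solve_plane([0.0, 1.0], [2.0, 3.0], n=0.5, d=-0.75)
```

Warm-starting amounts to passing the previous plane as the initial `(n, d)`; the barrier keeps every iterate a strictly separating plane.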

V Evaluation

We conducted our experiments using C++ on a single desktop machine equipped with an 8-core AMD EPYC 7K62 CPU. All available cores were utilized to facilitate parallel computations related to separating planes, such as the computation of per-plane Schur-complements in the ECB method and the evaluation of implicit functions and their derivatives in the ICB method. We also incorporated the widely used acceleration technique of broadphase collision checks, similar to the one employed in prior works [9, 11]. This technique excludes distant pairs of convex hulls from consideration, introducing barrier terms only when the convex hulls are in close proximity. It is important to note that once inserted, these barrier terms are not removed, even if the pair of convex hulls later move apart. Throughout our analysis, we require our penalty function to be positive and strictly convex. One function satisfying our needs is the layered penalty function [28], which is used in our implementation. Finally, we use hyper-parameters $\epsilon_1=0.001$ and $\epsilon_2=0.001$ throughout our experiments.
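The broadphase culling can be sketched as an inflated axis-aligned bounding-box overlap test (2D, pure Python; the helper names and toy hulls are ours, not the exact implementation of [9, 11]):

```python
def aabb(verts):
    """Axis-aligned bounding box (min-x, min-y, max-x, max-y)."""
    xs, ys = zip(*verts)
    return (min(xs), min(ys), max(xs), max(ys))

def close_pairs(hulls, margin):
    """Keep only hull pairs whose boxes, inflated by `margin`, overlap;
    only these pairs receive barrier terms."""
    boxes = [aabb(h) for h in hulls]
    pairs = []
    for i in range(len(hulls)):
        for j in range(i + 1, len(hulls)):
            (ax0, ay0, ax1, ay1), (bx0, by0, bx1, by1) = boxes[i], boxes[j]
            if (ax0 - margin <= bx1 and bx0 - margin <= ax1 and
                    ay0 - margin <= by1 and by0 - margin <= ay1):
                pairs.append((i, j))
    return pairs

# Two nearby hulls and one far away: only the near pair survives culling.
hulls = [[(0, 0), (1, 1)], [(1.2, 0), (2, 1)], [(10, 10), (11, 11)]]
```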

Additionally, we integrated our solver with an implementation of the SIP solver from [10]. This allows us to solve SIP problem instances, ensuring collision-free trajectories over every continuous time interval, by leveraging their constraint discretization method. For comparison, both the ECB and ICB methods are compared against the baseline Alternating Optimization (AO) technique [11], which optimizes \theta and p_{ij} in separate subproblems; we also solve the p_{ij} subproblems in parallel. AO is a state-of-the-art optimization technique that allows the optimization of arbitrary articulated robot poses under convex constraints, with the only difference from our methods being the convergence speed.

Figure 2: The six scenarios used for evaluating the three methods: bulky object settling (a), thin object settling (b), quadruple-arm reaching (c), UAV trajectory generation (d), long-distance UAV trajectory generation (e), basket-reaching (f).
Scenario | |\theta| | |\mathcal{C}| (ECB) | |\mathcal{C}| (ICB) | |\mathcal{C}| (AO)
(a) | 54 | 50 | 49 | 50
(b) | 108 | 716 | 570 | 714
(c) | 24 | 1757 | 1024 | 1266
(d) | 36 | 3099 | 3585 | 2444
(e) | 54 | 6424 | 11122 | 6892
(f) | 33 | 973 | 837 | 899
TABLE II: The number of degrees of freedom |\theta| and separating planes |\mathcal{C}| for each benchmarking scenario.
Scenario | ECB: \epsilon=10^{-1}, 10^{-2}, 10^{-3}, 10^{-4} | ICB: \epsilon=10^{-1}, 10^{-2}, 10^{-3}, 10^{-4} | AO: \epsilon=10^{-1}, 10^{-2}, 10^{-3}, 10^{-4}
(a) | 697, 703, 5816, 26906 | 337, 337, 1823, 21610 | 24182, 48386, N.A., N.A.
(b) | 6898, 9814, 9906, 10329 | 1470, 2063, 4946, 13245 | 42000, 71561, 77179, 79300
(c) | 424, 1128, 1474, 2356 | 132, 305, 368, 525 | 371, 935, 2046, N.A.
(d) | 1068, 1479, 1928, 1987 | 406, 528, 576, 623 | 1983, 2941, 4350, 5010
(e) | 1802, 2383, 2721, 2936 | 1115, 1277, 1315, 1396 | 2305, 4808, 5813, 6773
(f) | 219, 435, 604, 1066 | 159, 206, 266, 400 | 277, 1896, N.A., N.A.
TABLE III: The number of iterations for the three algorithms to reach different levels of gradient norm \epsilon (N.A. denotes that the algorithm cannot finish computation within the time limit).
Scenario | ECB: \epsilon=10^{-1}, 10^{-2}, 10^{-3}, 10^{-4} | ICB: \epsilon=10^{-1}, 10^{-2}, 10^{-3}, 10^{-4} | AO: \epsilon=10^{-1}, 10^{-2}, 10^{-3}, 10^{-4}
(a) | 0.04, 0.04, 0.36, 1.70 | 0.02, 0.02, 0.10, 1.44 | 1.14, 2.32, N.A., N.A.
(b) | 3.58, 5.15, 5.20, 5.40 | 0.74, 1.07, 2.73, 7.54 | 20.63, 34.88, 37.59, 38.56
(c) | 2.58, 11.20, 17.29, 38.05 | 0.76, 2.37, 3.17, 5.51 | 1.78, 6.19, 23.09, N.A.
(d) | 13.44, 20.75, 29.61, 30.80 | 4.99, 7.47, 8.52, 9.56 | 27.28, 43.90, 68.96, 81.06
(e) | 107.88, 164.47, 200.00, 222.79 | 76.74, 95.76, 100.23, 109.99 | 161.08, 446.63, 565.25, 679.28
(f) | 2.57, 6.56, 10.96, 26.73 | 1.72, 2.58, 3.96, 7.71 | 4.47, 63.96, N.A., N.A.
TABLE IV: The computational time (min) for the three algorithms to reach different levels of gradient norm \epsilon.

We assess the performance of our method using a series of six challenging benchmarks, as outlined in Figure 2. The first two benchmarks involve object settling problems, with bulky objects (a) and thin objects (b) dropped into a box. These problems require us to compute the static equilibrium poses of the objects. Since we optimize a single pose, these scenarios only involve a finite set of collision constraints. The remaining four scenarios focus on trajectory optimization problems, where we solve SIP problem instances by integrating both our methods and AO [11] into the SIP solver framework [10]. In Figure 2 (c), four robot arms reach various end-effector positions on a table while avoiding obstacles. In Figure 2 (d), we optimize the trajectories of six UAVs with a giraffe-shaped obstacle in the middle. Figure 2 (e) extends this challenge with longer trajectories for nine UAVs and multiple obstacles along their paths. Finally, Figure 2 (f) presents three Fetch robots reaching three positions in a basket. In these scenarios, the trajectory in each robot's configuration space is parameterized using high-order spline curves, with the numbers of degrees of freedom and separating planes summarized in Table II, which shows that our method can scale to complex high-DOF robots and environmental setups. We measure the number of iterations and computational cost required to reach different gradient norm values; the results are summarized in Table III and Table IV, respectively.

Our results clearly demonstrate the superior convergence speed of our method compared to AO, especially when aiming for highly optimal motion plans with a small threshold value \epsilon. The key factor behind this superior performance is the significantly lower number of iterations required by our method, while maintaining a comparable per-iteration cost. Furthermore, it is worth noting that ICB exhibits faster convergence compared to ECB, despite both methods being locally second-order. This can be attributed to ICB's approach of computing the optimal separating planes during each iteration. Although this leads to a slightly higher per-iteration cost, the overall reduction in the number of iterations more than compensates for this additional computational overhead. In summary, our method consistently achieves a remarkable reduction in computational time, with a speedup ranging from 5.12× to 206.34× for pose optimization (Figure 2a-b) and 2.63× to 151.23× for trajectory optimization (Figure 2c-f).

VI Conclusion & Limitations

We present innovative second-order optimization-based planners for both robot poses and trajectories, building upon the convex constraint formulation introduced by [13, 11]. A key distinguishing feature of our approach is the efficient application of Newton's method to simultaneously optimize the separating planes and robot poses. Our work paves the way for several promising avenues of future research. Although both of our methods outperform the AO baseline, each comes with its own advantages and disadvantages. ICB is more efficient than ECB, requiring even fewer iterations to converge; however, it demands a more complex implementation, with an intricate inner loop of optimization and careful treatment of second-order derivatives. The reliable evaluation of these derivatives under finite-precision floating-point formats requires further investigation. Further, due to the inherent complexity of the underlying problem, our method may still be time-consuming.

References

  • [1] Robin Deits and Russ Tedrake “Efficient mixed-integer planning for UAVs in cluttered environments” In 2015 IEEE international conference on robotics and automation (ICRA), 2015, pp. 42–49 IEEE
  • [2] Yuanwei Chua, Keng Peng Tee and Rui Yan “Robust Optimal Inverse Kinematics with Self-Collision Avoidance for a Humanoid Robot” In 2013 IEEE RO-MAN, 2013, pp. 496–502 DOI: 10.1109/ROMAN.2013.6628553
  • [3] Kris Hauser “Semi-infinite programming for trajectory optimization with non-convex obstacles” In The International Journal of Robotics Research 40.10-11 SAGE Publications Sage UK: London, England, 2021, pp. 1106–1122
  • [4] Charles Khazoom, Daniel Gonzalez-Diaz, Yanran Ding and Sangbae Kim “Humanoid self-collision avoidance using whole-body control with control barrier functions” In 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids), 2022, pp. 558–565 IEEE
  • [5] Fei Gao, Yi Lin and Shaojie Shen “Gradient-based online safe trajectory generation for quadrotor flight in complex environments” In 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS), 2017, pp. 3681–3688 IEEE
  • [6] Sikang Liu et al. “Planning dynamically feasible trajectories for quadrotors using safe flight corridors in 3-d complex environments” In IEEE Robotics and Automation Letters 2.3 IEEE, 2017, pp. 1688–1695
  • [7] Mrinal Kalakrishnan et al. “STOMP: Stochastic trajectory optimization for motion planning” In 2011 IEEE international conference on robotics and automation, 2011, pp. 4569–4574 IEEE
  • [8] Matt Zucker et al. “Chomp: Covariant hamiltonian optimization for motion planning” In The International journal of robotics research 32.9-10 SAGE Publications Sage UK: London, England, 2013, pp. 1164–1193
  • [9] Minchen Li et al. “Incremental potential contact: intersection-and inversion-free, large-deformation dynamics.” In ACM Trans. Graph. 39.4, 2020, pp. 49
  • [10] Duo Zhang, Xifeng Gao, Kui Wu and Zherong Pan “Provably Robust Semi-Infinite Program Under Collision Constraints via Subdivision” In arXiv preprint arXiv:2302.01135, 2023
  • [11] Ruiqi Ni, Zherong Pan and Xifeng Gao “Robust multi-robot trajectory optimization using alternating direction method of multiplier” In IEEE Robotics and Automation Letters 7.3 IEEE, 2022, pp. 5950–5957
  • [12] Adrien Escande, Sylvain Miossec, Mehdi Benallegue and Abderrahmane Kheddar “A strictly convex hull for computing proximity distances with continuous gradients” In IEEE Transactions on Robotics 30.3 IEEE, 2014, pp. 666–678
  • [13] Wolfgang Hönig et al. “Trajectory planning for quadrotor swarms” In IEEE Transactions on Robotics 34.4 IEEE, 2018, pp. 856–869
  • [14] Sertac Karaman and Emilio Frazzoli “Sampling-based algorithms for optimal motion planning” In The international journal of robotics research 30.7 Sage Publications Sage UK: London, England, 2011, pp. 846–894
  • [15] Lucas Janson, Edward Schmerling, Ashley Clark and Marco Pavone “Fast marching tree: A fast marching sampling-based method for optimal motion planning in many dimensions” In The International journal of robotics research 34.7 SAGE Publications Sage UK: London, England, 2015, pp. 883–921
  • [16] Lucas Janson, Brian Ichter and Marco Pavone “Deterministic sampling-based motion planning: Optimality, complexity, and performance” In The International Journal of Robotics Research 37.1 SAGE Publications Sage UK: London, England, 2018, pp. 46–61
  • [17] Aaron D Ames et al. “Control barrier functions: Theory and applications” In 2019 18th European control conference (ECC), 2019, pp. 3420–3431 IEEE
  • [18] Alexandre Amice et al. “Finding and optimizing certified, collision-free regions in configuration space for robot manipulators” In International Workshop on the Algorithmic Foundations of Robotics, 2022, pp. 328–348 Springer
  • [19] Tobia Marcucci, Mark Petersen, David Wrangel and Russ Tedrake “Motion planning around obstacles with convex optimization” In arXiv preprint arXiv:2205.04422, 2022
  • [20] Philip E Gill, Walter Murray and Michael A Saunders “SNOPT: An SQP algorithm for large-scale constrained optimization” In SIAM review 47.1 SIAM, 2005, pp. 99–131
  • [21] Lorenz T Biegler and Victor M Zavala “Large-scale nonlinear programming using IPOPT: An integrating framework for enterprise-wide dynamic optimization” In Computers & Chemical Engineering 33.3 Elsevier, 2009, pp. 575–582
  • [22] Robin Deits and Russ Tedrake “Computing large convex regions of obstacle-free space through semidefinite programming” In Algorithmic Foundations of Robotics XI: Selected Contributions of the Eleventh International Workshop on the Algorithmic Foundations of Robotics, 2015, pp. 109–124 Springer
  • [23] Boyu Zhou et al. “Robust and efficient quadrotor trajectory generation for fast autonomous flight” In IEEE Robotics and Automation Letters 4.4 IEEE, 2019, pp. 3529–3536
  • [24] Yangyang Xu and Wotao Yin “A globally convergent algorithm for nonconvex optimization based on block coordinate update” In Journal of Scientific Computing 72.2 Springer, 2017, pp. 700–734
  • [25] Stephen P Boyd and Lieven Vandenberghe “Convex optimization” Cambridge university press, 2004
  • [26] Akshay Agrawal et al. “Differentiable convex optimization layers” In Advances in neural information processing systems 32, 2019
  • [27] Pai-Hsuen Chen, Chih-Jen Lin and Bernhard Schölkopf “A tutorial on ν\nu-support vector machines” In Applied Stochastic Models in Business and Industry 21.2 Wiley Online Library, 2005, pp. 111–136
  • [28] David Harmon et al. “Asynchronous contact mechanics” In ACM SIGGRAPH 2009 papers, 2009, pp. 1–12
  • [29] David Carlson “What are Schur complements, anyway?” In Linear Algebra and its Applications 74, 1986, pp. 257–275 DOI: https://doi.org/10.1016/0024-3795(86)90127-8

Appendix: Additional Proofs

Proof of Corollary IV.1.

\nabla^{2}\bar{\mathcal{O}} is positive definite and so are all its principal submatrices. As a result, \nabla_{pp}P_{ij} is invertible and n_{ij}^{T}\nabla_{pp}P_{ij}^{-1}n_{ij}\neq 0, so H_{ij} is well-defined. We further rewrite:

H_{ij} = \nabla_{pp}P_{ij}^{-\frac{1}{2}}\left[I-\bar{n}_{ij}\bar{n}_{ij}^{T}\right]\nabla_{pp}P_{ij}^{-\frac{1}{2}}, \quad \bar{n}_{ij} \triangleq \nabla_{pp}P_{ij}^{-\frac{1}{2}}n_{ij}/\|\nabla_{pp}P_{ij}^{-\frac{1}{2}}n_{ij}\|,

from which we immediately have \nabla_{pp}P_{ij}^{-1}\succeq H_{ij}\succeq 0. By Guttman rank additivity [29], the following matrix is positive definite:

\bar{H}_{\theta} \triangleq \left[\nabla_{\theta\theta}\bar{\mathcal{O}}-\sum_{ij\in\mathcal{C}}\nabla_{\theta p}P_{ij}\nabla_{pp}P_{ij}^{-1}\nabla_{p\theta}P_{ij}\right].

Combined with the fact that \nabla_{pp}P_{ij}^{-1}\succeq H_{ij}, we have H_{\theta}\succeq\bar{H}_{\theta}\succ 0. Note that the above argument only requires \nabla^{2}\bar{\mathcal{O}}\succeq 0. If we further have \nabla^{2}\bar{\mathcal{O}}\succeq\epsilon I, then we can apply the above argument to the perturbed matrix:

\left(\begin{array}{cc}\nabla^{2}\bar{\mathcal{O}}&J\\J^{T}&0\end{array}\right)-\left(\begin{array}{cc}\epsilon I&0\\0&0\end{array}\right),

which immediately yields H_{\theta}\succeq\bar{H}_{\theta}\succeq\epsilon I. ∎
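The key step of this proof — that eliminating the separating-plane variables from a positive-definite joint Hessian via the Schur complement preserves positive definiteness — can be sanity-checked numerically (our own illustration with random data; block names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n_theta, n_p = 4, 3                      # pose DOFs and plane variables

# A random strictly positive-definite "joint" Hessian over (theta, p),
# blocked as [[H_tt, H_tp], [H_pt, H_pp]]
M = rng.standard_normal((n_theta + n_p, n_theta + n_p))
H = M @ M.T + 1e-3 * np.eye(n_theta + n_p)
H_tt = H[:n_theta, :n_theta]
H_tp = H[:n_theta, n_theta:]
H_pp = H[n_theta:, n_theta:]

# Schur complement: the condensed Hessian over theta alone
H_bar = H_tt - H_tp @ np.linalg.inv(H_pp) @ H_tp.T
# H_bar remains positive definite whenever H is
```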

Proof of Corollary IV.2.

1) If Equation 11 is feasible, then n_{ij}\neq 0, because otherwise we would have P(-d_{ij})+P(d_{ij})<\infty. But this implies d_{ij}\in(0,\infty) and -d_{ij}\in(0,\infty), which is impossible. Therefore, we can verify that n_{ij}/\|n_{ij}\|, d_{ij}/\|n_{ij}\| is a plane separating C_{i} and C_{j}. Conversely, if C_{i}\cap C_{j}=\emptyset, then \text{dist}(C_{i},C_{j})>0, i.e., we have a separating plane n_{ij} such that -n_{ij}x(\theta)-d_{ij}>0 for all x(\theta)\in C_{i} and n_{ij}x(\theta)+d_{ij}>0 for all x(\theta)\in C_{j}. We can then choose \alpha\in(0,1) sufficiently close to 1 such that -\alpha n_{ij}x(\theta)-d_{ij}>0 for x(\theta)\in C_{i} and \alpha n_{ij}x(\theta)+d_{ij}>0 for x(\theta)\in C_{j}. We can immediately verify that p_{ij}=(\alpha n_{ij}^{T}, d_{ij}^{T})^{T} is a feasible solution of Equation 11.

2) This property essentially requires the uniqueness of the minimizer. To establish uniqueness, we write the Hessian of the objective function of Equation 11 as:

\tilde{H}_{ij}(\theta,p_{ij}) \triangleq \nabla_{p_{ij}}^{2}\left[P_{ij}(\theta,p_{ij})+P(1-\|n_{ij}\|)\right]
= \sum_{x\in C_{i}}P''(-n_{ij}x(\theta)-d_{ij})\left(\begin{array}{c}x(\theta)\\1\end{array}\right)\left(\begin{array}{cc}x(\theta)^{T},&1\end{array}\right)
+ \sum_{x\in C_{j}}P''(n_{ij}x(\theta)+d_{ij})\left(\begin{array}{c}x(\theta)\\1\end{array}\right)\left(\begin{array}{cc}x(\theta)^{T},&1\end{array}\right)
+ \nabla_{p_{ij}}^{2}\left[P(1-\|n_{ij}\|)\right].

We show that \tilde{H}_{ij} must be a matrix of full rank. Note that P''(-n_{ij}x(\theta)-d_{ij})>0 and P''(n_{ij}x(\theta)+d_{ij})>0 due to the strong convexity of P. Next, note that P(1-\|n_{ij}\|) is a convex function because it is a composite of the convex function -1+\|n_{ij}\| and the non-decreasing function P(-\bullet) [25]. Further, the top-left entry of the Hessian takes the following form:

\frac{\partial^{2}P(1-\|n_{ij}\|)}{\partial n_{ij,x}^{2}}=\frac{n_{ij,y}^{2}+n_{ij,z}^{2}}{\|n_{ij}\|^{3}(1-\|n_{ij}\|)}+\frac{n_{ij,x}^{2}}{\|n_{ij}\|^{2}(1-\|n_{ij}\|)}>0.

This implies the Hessian of P(1-\|n_{ij}\|) is a positive semi-definite, non-zero matrix, so it has the following diagonalization:

\frac{\partial^{2}P(1-\|n_{ij}\|)}{\partial n_{ij}^{2}}=V\Sigma V^{T},

where 0\neq\Sigma\succeq 0 and V is an orthogonal matrix. Putting all these facts together, we have:

\tilde{H}_{ij} = UU^{T}, \quad U \triangleq \left(\begin{array}{ccc}U_{1},&U_{2},&U_{3}\end{array}\right)
U_{1} \triangleq \left(\begin{array}{ccc}\cdots,&\sqrt{P''(-n_{ij}x(\theta)-d_{ij})}\left(\begin{array}{c}x(\theta)\in C_{i}\\1\end{array}\right),&\cdots\end{array}\right)
U_{2} \triangleq \left(\begin{array}{ccc}\cdots,&\sqrt{P''(n_{ij}x(\theta)+d_{ij})}\left(\begin{array}{c}x(\theta)\in C_{j}\\1\end{array}\right),&\cdots\end{array}\right)
U_{3} \triangleq \left(\begin{array}{ccc}V_{1}&V_{2}&V_{3}\\0&0&0\end{array}\right)\sqrt{\Sigma}.

Without loss of generality, we can assume |C_{i}|\neq 0 and \Sigma_{11}>0. Then x(\theta)\in C_{i} spans \mathbb{R}^{3} and hence the following vectors span \mathbb{R}^{4}:

\left(\begin{array}{cccc}\cdots,&\sqrt{P''(-n_{ij}x(\theta)-d_{ij})}\left(\begin{array}{c}x(\theta)\in C_{i}\\1\end{array}\right),&\cdots,&\left(\begin{array}{c}V_{1}\sqrt{\Sigma_{11}}\\0\end{array}\right)\end{array}\right).

Put together, U spans \mathbb{R}^{4} and \tilde{H}_{ij}=UU^{T}\succ 0. Due to the arbitrariness of p_{ij}, Equation 11 is a strictly convex optimization problem and p_{ij} is its unique minimizer.

3) Due to the strict convexity, p_{ij}(\theta) is equivalently defined as the solution of:

\tilde{G}_{ij}(\theta,p_{ij}) \triangleq \nabla_{p_{ij}}\left[P_{ij}(\theta,p_{ij})+P(1-\|n_{ij}\|)\right]=0.

Since \tilde{H}_{ij}(\theta,p_{ij}) has full rank, we can invoke the higher-order implicit function theorem to see that p_{ij}(\theta) is a twice-differentiable function of \theta, provided the barrier P(\bullet) is third-order differentiable.

4) Denote d\triangleq\text{dist}(C_{i},C_{j})>0. Then for every separating plane p_{ij} with \|n_{ij}\|=1, we have -n_{ij}x(\theta)-d_{ij}<d for some x(\theta)\in C_{i} and n_{ij}x(\theta)+d_{ij}<d for some x(\theta)\in C_{j}, because otherwise \text{dist}(C_{i},C_{j})>d. Now the solution to Equation 11 can be denoted as a generalized separating plane with \|n_{ij}\|<1, so we have -n_{ij}x(\theta)-d_{ij}<d\|n_{ij}\|<d for some x(\theta)\in C_{i} and n_{ij}x(\theta)+d_{ij}<d\|n_{ij}\|<d for some x(\theta)\in C_{j}. As a result, \lim_{d\to 0^{+}}P_{ij}(\theta,p_{ij}(\theta))\geq\lim_{d\to 0^{+}}P(d)=\infty. ∎

Appendix: Derivatives

We use Einstein's notation with scripts \alpha,\beta,\gamma:

\frac{\partial p_{ij}(\theta)}{\partial\theta^{\alpha}}=-\tilde{H}_{ij}^{-1}\frac{\partial\tilde{G}_{ij}}{\partial\theta^{\alpha}} \quad (13)

\frac{\partial^{2}p_{ij}(\theta)}{\partial\theta^{\alpha}\partial\theta^{\beta}}=-\tilde{H}_{ij}^{-1}\left[\frac{\partial^{2}\tilde{G}_{ij}}{\partial\theta^{\alpha}\partial\theta^{\beta}}+\frac{\partial\tilde{H}_{ij}}{\partial\theta^{\alpha}}\frac{\partial p_{ij}(\theta)}{\partial\theta^{\beta}}+\frac{\partial\tilde{H}_{ij}}{\partial\theta^{\beta}}\frac{\partial p_{ij}(\theta)}{\partial\theta^{\alpha}}+\frac{\partial\tilde{H}_{ij}}{\partial p_{ij}^{\gamma}}\frac{\partial p_{ij}(\theta)}{\partial\theta^{\alpha}}\frac{\partial p_{ij}^{\gamma}(\theta)}{\partial\theta^{\beta}}\right]

\frac{d\tilde{\mathcal{O}}}{d\theta}=\frac{\partial\tilde{\mathcal{O}}}{\partial\theta}+\sum_{ij\in\mathcal{C}}\frac{\partial\tilde{\mathcal{O}}}{\partial p_{ij}}\frac{\partial p_{ij}(\theta)}{\partial\theta}

\frac{d^{2}\tilde{\mathcal{O}}}{d\theta^{2}}=\frac{\partial^{2}\tilde{\mathcal{O}}}{\partial\theta^{2}}+\sum_{ij\in\mathcal{C}}\left[\frac{\partial^{2}\tilde{\mathcal{O}}}{\partial\theta\partial p_{ij}}\frac{\partial p_{ij}(\theta)}{\partial\theta}+\frac{\partial p_{ij}(\theta)}{\partial\theta}^{T}\frac{\partial^{2}\tilde{\mathcal{O}}}{\partial p_{ij}\partial\theta}+\frac{\partial p_{ij}(\theta)}{\partial\theta}^{T}\frac{\partial^{2}\tilde{\mathcal{O}}}{\partial p_{ij}^{2}}\frac{\partial p_{ij}(\theta)}{\partial\theta}+\frac{\partial\tilde{\mathcal{O}}}{\partial p_{ij}^{\alpha}}\frac{\partial^{2}p_{ij}^{\alpha}(\theta)}{\partial\theta^{2}}\right].
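The first-order implicit derivative in Equation 13 can be verified against finite differences on a toy strictly convex inner problem (the quadratic objective and all names below are our own illustrative choices, not the paper's barrier):

```python
import numpy as np

# Toy inner problem: minimize over p  f(theta, p) = 0.5 p^T A p + theta * b^T p,
# so G(theta, p) = A p + theta * b, the Hessian H is A, and dG/dtheta is b.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])             # plays the role of H_tilde (PD)
b = np.array([1.0, -2.0])              # plays the role of dG/dtheta

def p_of_theta(theta):
    """The unique minimizer solves G(theta, p) = 0, i.e. p = -theta A^{-1} b."""
    return np.linalg.solve(A, -theta * b)

# Implicit-function-theorem derivative: dp/dtheta = -H^{-1} dG/dtheta
dp_ift = -np.linalg.solve(A, b)

# Central finite-difference check around theta = 1
h = 1e-6
dp_fd = (p_of_theta(1.0 + h) - p_of_theta(1.0 - h)) / (2 * h)
```

For this quadratic toy the two derivatives agree to machine precision; for the barrier objective the same identity holds at the warm-started solution of Equation 11.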