Homogeneous Formulation of Convex Quadratic Programs for Infeasibility Detection

Arvind U Raghunathan A.U. Raghunathan is with Mitsubishi Electric Research Laboratories, 201 Broadway, Cambridge, MA 02139, USA. [email protected]

Abstract

Convex Quadratic Programs (QPs) have come to play a central role in the computation of control action for constrained dynamical systems. In this paper, we present a novel Homogeneous QP (HQP) formulation which is obtained by embedding the original QP in a larger space. The key properties of the HQP are: (i) is always feasible, (ii) an optimal solution to QP can be readily obtained from a solution to HQP, and (iii) infeasibility of QP corresponds to a particular solution of HQP. An immediate consequence is that all the existing algorithms for QP are now also capable of robustly detecting infeasibility. In particular, we present an Infeasible Interior Point Method (IIPM) for the HQP and show polynomial iteration complexity when applied to HQP. A key distinction with prior IPM approaches is that we do not need to solve second-order cone programs. Numerical experiments on the formulation are provided using existing codes.

1 Introduction

Optimization algorithms are widely used in the control for constrained dynamic systems. In particular, the solution of convex Quadratic Programs (QPs) is a key ingredient of optimization algorithms. Convex QPs arise for example in the Model Predictive Control of linear systems, switched linear systems and nonlinear systems [1, 2]. The past two decades have witnessed the development of a number of algorithms for the solution of convex QPs arising in the context of optimization-based control. Initial work on QP algorithms were based on solving a system of equations representing the optimality conditions such as Interior Point Method (IPM) for QP [3], active-set methods [4, 5] and more recently, IPM for Second Order Cone Programs (SOCPs) [6]. In the last decade there has been interest in the development of first-order approaches such as gradient projection [7], dual gradient projection [8, 9], and splitting methods [10, 11, 12, 13] and iterative second-order approaches [14]. Recently, there has been work on employing these methods within Branch & Bound (B&B) algorithms for the solution of Mixed Integer Quadratic Programs (MIQPs) that arise in MPC for switched systems [15, 16, 17, 18].

A critical feature required of QP solvers for real-time applications is the ability to detect infeasibility and provide a graceful handling of the same. Infeasibility handling increases in importance when solving MIQPs in a B&B setting as it is expected that QPs resulting from the fixing of a subset of binary variables will likely be infeasible. Detecting such infeasible QPs and pruning the search tree in B&B is essential for computational efficiency of the MIQP solver.

Infeasibility of a QP is determined by the identification of a ray that is feasible for the dual and makes the dual objective unbounded. First-order primal-dual algorithms are capable of detecting infeasibility. Recent results on detecting infeasibility using first-order algorithms include [19, 20, 21]. A dual QP formulation, if it can be constructed explicitly, will allow the determination of a ray of infeasibility. However, such an explicit dual construction requires strong convexity of the QP. Active-set methods typically employ a feasibility phase to first determine a feasible point and then proceed to an optimization phase which produces the optimal solution. Failure of the feasibility phase allows to pronounce a QP as being infeasible [22]. The work performed in this phase is identical to that required for the optimization phase. When employing inexact linear algebra the first-phase is typically not employed and as a consequence the ability of such approaches to certify infeasibility is not clear. IPM algorithms, on the other hand, are not all capable of producing a certificate of infeasibility [23, 24]. IPMs based on Homogeneous Self-Dual (HSD) embedding [25, 26] are known to produce a certificate of infeasibility for Linear Programs (LPs). This technique has been implemented in MOSEK [27] and has also been extended to handle SOCPs. To detect infeasible QPs with IPMs, the QPs must first be formulated as a SOCP to which the HSD embedding is then applied. The ECOS solver [6] does exactly this for the solution of QPs arising in MPC and can detect infeasibility.

We present a novel embedding of the QP into a space that is one dimension higher. The embedding is produced by introducing an additional variable that is nonnegative and multiplies the affine terms in the objective and constraints. The objective contribution from the introduced variable is determined so the resulting Homogeneous Quadratic Program (HQP) satisfies the following properties: (i) HQP is always feasible, (ii) an optimal solution to the QP can always be recovered from the optimal solution to the HQP if the introduced variable is positive; and (iii) QP can be certified to be infeasible if the introduced variable is zero. More importantly, the HQP formulation can be presented to any QP solver to determine the optimal solution or infer infeasibility in a robust manner.

In this paper, we focus on applying an Infeasible IPM (IIPM) to solve the HQP. We are able to derive polynomial complexity of a standard IIPM when applied to HQP by adapting the analysis of IIPM for linear programs [23, Chapter 6]. Thus, we have an IIPM with polynomial iteration complexity for inferring optimality or infeasibility of QP without having to solve SOCPs.

The paper is organized as follows. §2 presents the QP and the assumptions. §3 presents the embedding of QP into a HQP and shows the equivalence of the formulations. An IIPM for solving HQP and the polynomial iteration complexity are described in §4. We present results in §5 when using the HQP formulation with IPOPT. Conclusions and directions for future work are provided in §6.

Notation. The set of reals is denoted by $\mathbb{R}$ and the set of vectors of dimension $n$ by $\mathbb{R}^{n}$ . The set of $n\times n$ symmetric matrices is denoted by $\mathbb{S}^{n}$ . For a matrix $A\in\mathbb{S}^{n}$ , the notation $A\succcurlyeq(\succ)0$ denotes that A is positive semidefinite (definite). Given a vector $u\in\mathbb{R}^{n}$ , $\textnormal{Diag}(u)$ denotes a diagonal matrix with the elements of the vector on the diagonal. The notation $I_{n}$ denotes the identity matrix of size $n$ .

2 Problem Formulation

Consider the QP in the form


$\displaystyle\min_{y\in\mathbb{R}^{n}}$	$\displaystyle\;\frac{1}{2}y^{T}Cy+c^{T}y$	(1a)
s.t.	$\displaystyle\;Ey=f$	(1b)
	$\displaystyle\;y\geq 0$	(1c)

where $C\in\mathbb{S}^{n}$ , $c\in\mathbb{R}^{n}$ , $E\in\mathbb{R}^{m\times n}$ , and $f\in\mathbb{R}^{m}$ . Note that any QP formulation with finite lower or upper bound on each of the variables can be put into the form in (1) through a linear transformation of variables.

We make the following assumptions on QP (1).

Assumption 1.

The matrix $E$ has full row rank of $m$ .

Assumption 2.

The matrix $Z^{T}CZ\succ 0$ where $Z\in\mathbb{R}^{n\times(n-m)}$ is an orthonormal basis for the null-space of $E$ , i.e. $Z$ is a basis for $\{v\,|\,Ev=0\}$ .

Assumption 1 is not restrictive and can be easily satisfied by removing dependent rows if necessary. Assumption 2 requires that $C$ be positive definite when projected onto $Z$ . This assumption readily holds for MPC formulations [19] and also for certain spectral relaxations of nonconvex MIQPs [28, 29]. Assumption 2 implies that the optimal solution to QP (1) is unique whenever QP (1) is feasible. Note that the assumptions do not preclude the infeasibility of QP (1). The basis $Z$ is typically computed using a QR factorization of $E^{T}$ [30].

We conclude this section with a statement of the conditions for optimality and infeasibility of QP (1).

A point $y^{\star}$ minimizes QP (1) if there exist multipliers $\nu^{\star}\in\mathbb{R}^{m}$ , $\xi^{\star}\geq 0\in\mathbb{R}^{n}$ , satisfying the first-order optimality conditions


$\displaystyle Cy^{\star}+c+E^{T}\nu^{\star}-\xi^{*}$	$\displaystyle=0$	(2a)
$\displaystyle Ey^{\star}$	$\displaystyle=f$	(2b)
$\displaystyle 0\leq y^{\star}$	$\displaystyle\perp\xi^{\star}\geq 0.$	(2c)

The conditions (2) are necessary and sufficient for a minimizer of QP (1) under Assumption 2.

The QP (1) is infeasible if there exist $\nu^{\circ}\in\mathbb{R}^{m}$ , $\xi^{\circ}\in\mathbb{R}^{n}$ satisfying


$\displaystyle E^{T}\nu^{\circ}-\xi^{\circ}$	$\displaystyle=0$	(3a)
$\displaystyle f^{T}\nu^{\circ}$	$\displaystyle=-1$	(3b)
$\displaystyle\xi^{\circ}$	$\displaystyle\geq 0.$	(3c)

The conditions in (3) are obtained from applying a Theorem of the Alternative [31] to the constraints in (1b)-(1c).

3 Homogeneous QP Formulation

Consider embedding QP (1) into a Homogeneous Quadratic Program (HQP) as


$\displaystyle\min\limits_{y\in\mathbb{R}^{n},\tau\in\mathbb{R}}$	$\displaystyle\;\frac{1}{2}y^{T}Cy+\tau c^{T}y+\frac{\theta}{2}(\tau^{2}-2\tau)$	(4a)
s.t.	$\displaystyle\;Ey=f\tau$	(4b)
	$\displaystyle\;y\geq 0,\tau\geq 0$	(4c)

where $\tau\in\mathbb{R}$ is an additional nonnegative variable and $\theta>0$ is a parameter. Observe that the objective (4a) is obtained by multiplying the linear term $c^{T}y$ in (1a) with $\tau$ and appending with the term $(\theta/2)(\tau^{2}-\tau)$ . The equality constraints (4b) is obtained by multiplying the right-hand side of (1b) with $\tau$ . §3.1 provides conditions on $\theta$ that ensures HQP is convex. §3.2 shows how the parameter $\theta$ can be computed efficiently. Finally, §3.3 shows that solving HQP allows to recover a solution to QP or declare infeasibility.

3.1 Conditions on $\theta$

The parameter $\theta$ is chosen to satisfy two conditions:

$\theta>2|\theta^{\star}|$ where $\theta^{\star}$ is defined as

	$\displaystyle\theta^{\star}=\min\limits_{y\in\mathbb{R}^{n}}$	$\displaystyle\;\frac{1}{2}y^{T}Cy+c^{T}y$		(5)
	s.t.	$\displaystyle Ey=f.$

The parameter $\theta^{\star}$ is well-defined by Assumption 2 and can be computed as $\theta^{*}=\frac{1}{2}\tilde{y}^{T}C\tilde{y}+c^{T}\tilde{y}$ where $\tilde{y}$ is obtained by solving a single linear system

\displaystyle\begin{pmatrix}C&E^{T}\\ E&0\end{pmatrix}\begin{pmatrix}\tilde{y}\\ \tilde{\nu}\end{pmatrix}=\begin{pmatrix}-c\\ f\end{pmatrix}.

(6)

Since the nonnegativity constraints (1c) are ignored in the definition of $\theta^{*}$ (5) it follows that

	$\displaystyle-\frac{1}{2}\theta<\theta^{\star}\leq$	$\displaystyle\frac{1}{2}y^{T}Cy+c^{T}y$		(7)
		$\displaystyle\forall\,y\text{ satisfying }\eqref{qp.eq}-\eqref{qp.nonneg}.$

2.

$\theta$ is chosen large enough so that

$\displaystyle\widehat{Z}^{T}\begin{bmatrix}C&c\\ c^{T}&\theta\end{bmatrix}\widehat{Z}\succ 0$ (8)

where $\widehat{Z}$ is a basis for the null space of (4b). Such a basis can be obtained as ,

$\widehat{Z}=\begin{bmatrix}Z&d\\ 0&1\end{bmatrix}$ (9)

where $d=E^{T}(EE^{T})^{-1}f$ .

The satisfaction of (8) implies that:

1.

the Hessian of the objective in (4a) is positive definite on the null space of the equality constraints (4b).
2.

HQP (4) is convex and the first-order optimality conditions are necessary and sufficient for a minimizer.

3.2 Computing $\theta$

We will now address the computation of $\theta$ satisfying (8).

Lemma 1.

Suppose Assumptions 1 and 2 hold. Then (8) holds for all $\theta$ satisfying

\theta>\frac{1}{\lambda_{\min}(Z^{T}CZ)}\|Z^{T}(Cd+c)\|^{2}-d^{T}Cd-2c^{T}d.

(10)

Proof.

The claim follows from an application of the Schur-complement to the left-hand side of the matrix in (8). Substitute $\widehat{Z}$ from (9) in (8) to obtain

\widehat{Z}^{T}\begin{bmatrix}C&c\\ c^{T}&\theta\end{bmatrix}\widehat{Z}=\begin{bmatrix}Z^{T}CZ&Z^{T}(Cd+c)\\ (Cd+c)^{T}Z&\theta+d^{T}Cd+2c^{T}d\end{bmatrix}.

(11)

Since $Z^{T}CZ\succ 0$ (Assumption 2) the condition in (8) holds provided the Schur-complement w.r.t. $Z^{T}CZ$ in (11) is positive, i.e.

$\displaystyle 0<$	$\displaystyle\;\theta+d^{T}Cd+2c^{T}d-$	(12)
	$\displaystyle\;(Cd+c)^{T}Z(Z^{T}CZ)^{-1}Z^{T}(Cd+c)$
$\displaystyle\implies\theta>$	$\displaystyle\;(Cd+c)^{T}Z(Z^{T}CZ)^{-1}Z^{T}(Cd+c)$
	$\displaystyle\;-d^{T}Cd-2c^{T}d.$

Using $Z^{T}CZ\succeq\lambda_{\min}(Z^{T}CZ)I_{n-m}$ in (12) yields (10). ∎

Remark 1.

It is not necessary to compute $Z^{T}(Cd+c)$ explicitly. Since $\|ZZ^{T}\|\leq 1$ it is sufficient to choose

\theta>\frac{1}{\lambda_{\min}(Z^{T}CZ)}\|Cd+c\|^{2}-d^{T}Cd-2c^{T}d

(13)

in order to satisfy (8).

Remark 2.

Further, it is not necessary to compute $\lambda_{\min}(Z^{T}CZ)$ . It is sufficient to have a nontrivial lower bound estimate of $\lambda_{\min}(Z^{T}CZ)$ . For example, if $C\succ 0$ (as in MPC applications) then $\lambda_{\min}(C)\leq\lambda_{\min}(Z^{T}CZ)$ . In the context of MIQPs with binary variables, QPs (1) are solved at each node of the B&B tree. The QP solved at the child node is a result of fixing a binary variable to $0$ or $1$ . Let us suppose that in the child node the first variable index is fixed to 0 or 1. Then

\lambda_{\min}(Z^{T}CZ)=\min\limits_{u:Eu=0}\frac{u^{T}Cu}{u^{T}u}\leq\min\limits_{u:Eu=0,u_{1}\in\{0,1\}}\frac{u^{T}Cu}{u^{T}u}.

Hence, the lower bound estimate available at the parent node serves as a valid lower bound estimate for smallest eigenvalue of the reduced Hessian at the child node. If $0<\alpha\leq\lambda_{\min}(Z^{T}CZ)$ then the choice of

\theta>\frac{1}{\alpha}\|Cd+c\|^{2}-d^{T}Cd-2c^{T}d

(14)

ensures satisfaction of (8).

3.3 Equivalence between QP and HQP

We collect some simple observations on the HQP (4).

(O1)

HQP (4) is always feasible. It is easily verified that $(y,\tau)=0$ satisfies (4b)-(4c).
(O2)

HQP (4) has an optimal solution with optimal value less than or equal to $0$ . This follows directly from (O1).
(O3)

HQP (4) has a finite optimum. This follows from (8).

We now state the first-order optimality conditions for HQP (4).

A point $(\widehat{y},\widehat{\tau})$ minimizes HQP (4) if there exist multipliers $(\widehat{\nu},\widehat{\xi},\widehat{\omega})\in\mathbb{R}^{m+n+1}$ satisfying the first-order optimality conditions


$\displaystyle C\widehat{y}+\widehat{\tau}c+E^{T}\widehat{\nu}-\widehat{\xi}$	$\displaystyle=0$	(15a)
$\displaystyle\theta\widehat{\tau}-\theta+c^{T}\widehat{y}-f^{T}\widehat{\nu}-\widehat{\omega}$	$\displaystyle=0$	(15b)
$\displaystyle E\widehat{y}$	$\displaystyle=f\widehat{\tau}$	(15c)
$\displaystyle 0\leq\widehat{y}$	$\displaystyle\perp\widehat{\xi}\geq 0$	(15d)
$\displaystyle 0\leq\widehat{\tau}$	$\displaystyle\perp\widehat{\omega}\geq 0.$	(15e)

Theorem 1.

Suppose $\theta$ is chosen to satisfy the conditions in §3.1. The QP (1) has an optimal solution $y^{\star}$ iff the HQP (4) has an optimal solution $(\widehat{y},\widehat{\tau})$ with $\widehat{\tau}>0$ .

Proof.

Consider the if part. Let $(\widehat{y},\widehat{\tau})$ be an optimal solution of HQP (4) and let $(\widehat{\nu},\widehat{\xi},\widehat{\omega})$ be multipliers such that (15) holds. Then, it is easily verified that $(y^{\star},\nu^{\star},\xi^{\star})=(\widehat{y}/\widehat{\tau},\widehat{\nu}/\widehat{\tau},\widehat{\xi}/\widehat{\tau})$ satisifies the optimality conditions (2) for QP (1). This proves the if part.

Consider the only if part. Let $y^{\star}$ be an optimal solution of QP (1) and let $(\nu^{\star},\xi^{\star})$ be multipliers such that (2) holds. Define $\bar{\tau}=\theta/(\theta+c^{T}y^{\star}-f^{T}\nu^{\star})$ . If $\bar{\tau}>0$ then it can be easily verified that $\widehat{y}=\bar{\tau}y^{\star}$ , $\widehat{\tau}=\bar{\tau}$ , $\widehat{\nu}=\bar{\tau}\nu^{\star}$ , $\widehat{\xi}=\bar{\tau}\xi^{\star}$ , $\widehat{\omega}=0$ satisfy the optimality conditions for HQP (15). To show $\bar{\tau}>0$ we need to show that $\theta+c^{T}y^{\star}-f^{T}\nu^{\star}>0$ since $\theta>0$ . Consider


	$\displaystyle\;\theta+c^{T}y^{\star}-f^{T}\nu^{\star}$	(16a)
$\displaystyle=$	$\displaystyle\;\theta+c^{T}y^{\star}-(E^{T}\nu^{\star})^{T}y^{\star}$	(16b)
$\displaystyle=$	$\displaystyle\;\theta+2c^{T}y^{\star}+(y^{\star})^{T}Cy^{\star}-(y^{\star})^{T}\xi^{\star}$	(16c)
$\displaystyle=$	$\displaystyle\;\theta+2c^{T}y^{\star}+(y^{\star})^{T}Cy^{\star}$	(16d)
$\displaystyle>$	$\displaystyle\;0$	(16e)

where the equality in (16b) follows by multiplying (2b) by $(\nu^{\star})^{T}$ and substituting for $f^{T}\nu^{\star}$ with $(E^{T}\nu^{\star})^{T}y^{\star}$ . Multiplying (2a) by $(y^{\star})^{T}$ and substituting for $-(E^{T}\nu^{\star})^{T}y^{\star}$ as $c^{T}y^{\star}+(y^{\star})^{T}Cy^{\star}-(y^{\star})^{T}\xi^{\star}$ yields (16c). Using the complementarity constraints (2c) in (16c) yields (16d). The final inequality follows from (7). This proves the only if part of the claim. The claim is proven. ∎

We now show that infeasibility of QP (1) is equivalent to the vanishing of the optimal solution to HQP (4).

Theorem 2.

Suppose $\theta$ satisfies the conditions in §3.1. The QP (1) is infeasible iff the HQP (4) has optimal solution $(\widehat{y},\widehat{\tau})=0$ .

Proof.

Consider the only if part of the claim. Suppose there exists $(\nu^{\circ},\xi^{\circ})$ satisfying (3). It can be verified that $(\widehat{y},\widehat{\tau})=0$ , $(\widehat{\nu},\widehat{\xi},\widehat{\omega})=(\theta\nu^{\circ},\theta\xi^{\circ},0)$ satisfies (15). Hence $(\widehat{y},\widehat{\tau})=0$ is an optimal solution to HQP (4). This proves the only if part of the claim.

Consider the if part of the claim. Suppose $(\widehat{y},\widehat{\tau})=0$ is the optimal solution to (4) and let $(\widehat{\nu},\widehat{\xi},\widehat{\omega})$ be the multipliers in (15). Then $(\nu^{\circ},\xi^{\circ})=(\widehat{\nu}/(\theta+\widehat{\omega}),\widehat{\xi}/(\theta+\widehat{\omega}))$ can be verified to satisfy (3). This proves the if part of the claim, completing the proof. ∎

4 Infeasible Interior Point Method (IIPM)

For the remainder of the paper, we assume that the HQP has the form


$\displaystyle\min\limits_{x\in\mathbb{R}^{n+1}}$	$\displaystyle\;\frac{1}{2}x^{T}Qx+q^{T}x$	(17a)
s.t.	$\displaystyle\;Ax=0$	(17b)
	$\displaystyle\;x\geq 0$	(17c)

where $x=\left(\begin{smallmatrix}y\\ \tau\end{smallmatrix}\right)$ , $Q=\left(\begin{smallmatrix}C&c^{T}\\ c&\theta\end{smallmatrix}\right)$ , $q=\left(\begin{smallmatrix}0\\ -\theta\end{smallmatrix}\right)$ , $A=\left(\begin{smallmatrix}E&-f\end{smallmatrix}\right)$ . We assume the following for the HQP (17).

Assumption 3.

The matrix $A$ has full row rank of $m$ .

Assumption 4.

The matrix $Q$ is positive definite on the null space of $A$ , i.e. $x^{T}Qx>0\,\forall\,x\in\{v\,|\,Av=0\}$ .

The above HQP can be identified with original QP (1) when $f=0$ and the Assumptions 1-2 imply satisfaction of Assumptions 3-4. If $f\neq 0$ §3 shows how to cast (1) satisfying Assumptions 1-2 into a HQP of the form in (17) satisfying Assumptions 3-4.

In the rest of the section, we show how a standard IIPM for LPs can be extended to HQP (17). The IIPM also enjoys polynomial iteration complexity.

4.1 IIPM for HQP

In the rest of the section we employ the functions


$\displaystyle r_{d}(x,\lambda,s):=$	$\displaystyle\;Qx+A^{T}\lambda-s$	(18a)
$\displaystyle r_{p}(x,\lambda,s):=$	$\displaystyle\;Ax$	(18b)
$\displaystyle r_{c}(x,\lambda,s):=$	$\displaystyle\;Xs$	(18c)
$\displaystyle\mu(x,s):=$	$\displaystyle\;x^{T}s/(n+1)$	(18d)

where $x\in\mathbb{R}^{n+1}$ , and $\lambda\in\mathbb{R}^{m}$ , $s\in\mathbb{R}^{n+1}$ are the multipliers for the equality, nonnegativity constraints (17b)-(17c), and $X=\textnormal{Diag}(x)$ . The parameter $\mu(x,s)$ is called a barrier or centrality parameter. In the following, we will suppress the arguments when the dependence is clear from the context.

A point $x^{\star}\geq 0$ is said to minimize HQP (17) if there exist $\lambda^{\star}$ and $s^{\star}\geq 0$ satisfying the first-order stationary conditions

\displaystyle(r_{d}(x,\lambda,s),r_{p}(x,\lambda,s),r_{c}(x,\lambda,s))=(0,0,0).

(19)

IPMs aim to compute $(x^{\star},\lambda^{\star},s^{\star})$ by following the central path which is defined by $\{(x,\lambda,s)\,|\,(x,s)>0,\,(r_{d}(x,\lambda,s),r_{p}(x,\lambda,s),r_{c}(x,\lambda,s))=(0,0,\mu(x,s)e)\}$ where $e\in\mathbb{R}^{n+1}$ is a vector of all ones. In the limit as $\mu(x,s)\rightarrow 0$ we recover a point satisfying (19). Feasible IPMs typically assume that an initial iterate $(x,\lambda,s)$ satisfying $(x,s)>0$ and $(r_{d}(x,\lambda,s),r_{p}(x,\lambda,s))=(0,0)$ . Such a point is not generally guaranteed to exist and can also be difficult to compute. In the context of HQP (17) such a point will not exist if the original QP (1) is infeasible (refer Theorem 2). This is our motivation for considering IIPM.

IIPMs start from an initial iterate $(x^{0},\lambda^{0},s^{0})$ satisfying $(x^{0},s^{0})>0$ but $(r_{d}^{0},r_{p}^{0})\neq 0$ where $r_{d}^{0},r_{p}^{)}$ denote $r_{d}(x^{0},\lambda^{0},s^{0})$ , $r_{p}(x^{0},\lambda^{0},s^{0})$ . The method generates a sequence of iterates $\{(x^{k},\lambda^{k},s^{k})\}$ where the iterates are required to lie within a neighborhood ${\cal N}_{-\infty}(\gamma,\beta)$ with

\displaystyle{\cal N}_{-\infty}(\gamma,\beta)=\left\{(x,\lambda,s)\,\left|\,\begin{aligned} \frac{\|(r_{d},r_{p})\|}{\mu}\leq\beta\frac{\|(r_{d}^{0},r_{p}^{0})\|}{\mu^{0}}\\ (x,s)>0,Xs\geq\gamma\mu(x,s)e\end{aligned}\right.\right\}

(20)

where $\beta\geq 1$ is a fixed parameter. At each iteration a search direction $(\Delta x^{k},\Delta\lambda^{k},\Delta s^{k})$ is generated by solving

\displaystyle\begin{pmatrix}Q&A^{T}&-I_{n+1}\\ A&0&0\\ S^{k}&0&X^{k}\end{pmatrix}\begin{pmatrix}\Delta x^{k}\\ \Delta\lambda^{k}\\ \Delta s^{k}\end{pmatrix}=\begin{pmatrix}-r_{d}^{k}\\ -r_{p}^{k}\\ -r_{c}^{k}+\sigma^{k}\mu^{k}e\end{pmatrix}

(21)

where $\sigma^{k}>0$ such that $\sigma^{k}\in[\sigma^{\textnormal{min}},\sigma^{\textnormal{max}}]$ and $\mu^{k}=(x^{k})^{T}s^{k}/(n+1)$ . The residual of the complementarity condition $r_{c}^{k}$ is perturbed by $\sigma^{k}\mu^{k}$ to ensure that the iterates $(x^{k+1},s^{k+1})$ can be guaranteed to lie within the neighborhood ${\cal N}_{-\infty}(\gamma,\beta)$ . Given such a direction define $(x(\alpha),\lambda(\alpha),s(\alpha))=(x^{k},\lambda^{k},s^{k})+\alpha(\Delta x^{k},\Delta\lambda^{k},\Delta s^{k})$ . The step $\alpha^{k}$ is chosen to be the largest value such that


$\displaystyle(x(\alpha^{k}),\lambda(\alpha^{k}),s(\alpha^{k}))\in$	$\displaystyle\;{\cal N}_{-\infty}(\gamma,\beta)$	(22a)
$\displaystyle\mu(\alpha^{k})\leq$	$\displaystyle\;(1-0.01\alpha^{k})\mu^{k}.$	(22b)

The next iterate is defined as $(x^{k+1},\lambda^{k+1},s^{k+1})=(x(\alpha^{k}),\lambda(\alpha^{k}),s(\alpha^{k}))$ . Algorithm 1 describes the IIPM.

Data:

\beta,\gamma,\sigma^{\textnormal{min}},\sigma^{\textnormal{max}}

with

\beta\geq 1

\gamma\in(0,1)

0<\sigma^{\textnormal{min}}<\sigma^{\textnormal{max}}\leq\frac{1}{2}

1 Choose

(x^{0},\lambda^{0},s^{0})

with

x^{0},s^{0}>0

2 for $k=0,1,\ldots$ do

3 Choose

\sigma^{k}\in[\sigma^{\textnormal{min}},\sigma^{\textnormal{max}}]

and solve (21) to obtain

(\Delta x^{k},\Delta\lambda^{k},\Delta s^{k})

4 Choose the largest

\alpha^{k}>0

such that (22) holds.

5 Set

(x^{k+1},\lambda^{k+1},s^{k+1})=(x^{k},\lambda^{k},s^{k})+\alpha^{k}(\Delta x^{k},\Delta\lambda^{k},\Delta s^{k})

Algorithm 1 IIPM for HQP (17)

4.2 Polynomial Complexity

The analysis of the Algorithm 1 is straightforward adaption of the analysis in [23, Chapter 6]. We will only highlight the key steps where we differ. The results are specialized to the case where the initial iterate is chosen as

\displaystyle(x^{0},\lambda^{0},s^{0})=(\zeta e,0,\zeta e)

(23)

where $\zeta>0$ is scalar for which

\displaystyle\|(x^{\star},s^{\star})\|\leq\zeta

(24)

for some optimal solution $(x^{\star},\lambda^{\star},s^{\star})$ of HQP (17). Recall from the discussion in §3 that such a solution always exists under the Assumptions 3-4.

From the linearity of the residuals in (18a)-(18b), the choice of step direction (21) and the method of updating the iterate at $(k+1)$ it is easy to show that

(r_{d}^{k+1},r_{p}^{k+1})=(1-\alpha^{k})(r_{d}^{k},r_{c}^{k})=\upsilon^{k+1}(r_{d}^{0},r_{p}^{0})

where $\upsilon^{k+1}=\prod_{j=0}^{k}(1-\alpha^{j})$ . We first state a key result that is used in subsequent lemmas.

Lemma 2.

Suppose $(\overline{x},\overline{\lambda},\overline{s})$ be such that $r_{d}(\overline{x},\overline{\lambda},\overline{s})=0$ and $r_{p}(\overline{x},\overline{\lambda},\overline{s})=0$ . Then $\overline{x}^{T}Q\overline{x}=\overline{x}^{T}\overline{s}$ . Further, $\overline{x}^{T}\overline{s}\geq 0$ .

Proof.

Substitute $(\overline{x},\overline{\lambda},\overline{s})$ in (18a) and multiplying by $\overline{x}^{T}$ obtain

\displaystyle\overline{x}^{T}r_{d}(\overline{x},\overline{\lambda},\overline{s})=\overline{x}^{T}Q\overline{x}+(A\overline{x})^{T}\overline{\lambda}-\overline{x}^{T}\overline{s}=0.

Using $A\overline{x}=0$ in the above proves the claim. The second claim follows from Assumption 4. ∎

The proof on complexity (Theorem 3) relies on 3 key lemmas. Lemma 3 first bounds $\upsilon^{k}\|(x^{k},s^{k})\|_{1}$ . Lemma 4 provides a bound on scaled search directions. Finally, Lemma 6.7 [23] ensures that there exists an uniform lower bound on $\alpha^{k}$ . We first provide a bound on $\nu^{k}(x^{k},s^{k})$ .

Lemma 3.

Suppose the initial iterate satisfies (23). Then for any iterate $(x^{k},\lambda^{k},s^{k})$

\displaystyle\zeta\upsilon^{k}\|(x^{k},s^{k})\|_{1}\leq 4\beta(n+1)\mu^{k}.

(25)

Proof.

Define

	$\displaystyle(\overline{x},\overline{\lambda},\overline{s})=$	$\displaystyle\;\upsilon^{k}(x^{0},\lambda^{0},s^{0})+(1-\upsilon^{k})(x^{\star},\lambda^{\star},s^{\star})$
		$\displaystyle\;-(x^{k},\lambda^{k},s^{k}).$	(26a)
Then $(\overline{x},\overline{\lambda},\overline{s})$ satisfy the conditions in Lemma 2 and $\overline{x}^{T}\overline{s}\geq 0$ . Hence,

	$\displaystyle 0\leq$	$\displaystyle\;\overline{x}^{T}\overline{s}$
	$\displaystyle=$	$\displaystyle\;(\upsilon^{k})^{2}(x^{0})^{T}s^{0}+(1-\upsilon^{k})^{2}(x^{\star})^{T}s^{\star}+(x^{k})^{T}s^{k}$
		$\displaystyle\;+\upsilon^{k}(1-\upsilon^{k})((x^{0})^{T}s^{\star}+(x^{\star})^{T}s^{0})$
		$\displaystyle\;-\upsilon^{k}((x^{0})^{T}s^{k}+(x^{k})^{T}s^{0})$
		$\displaystyle\;-(1-\upsilon^{k})((x^{\star})^{T}s^{k}+(x^{k})^{T}s^{\star}).$
Using $(x^{\star})^{T}s^{\star}=0$ and $(x^{\star})^{T}s^{k}+(x^{k})^{T}s^{\star}\geq 0$ in the above and rearranging obtain

		$\displaystyle\;\upsilon^{k}((x^{0})^{T}s^{k}+(x^{k})^{T}s^{0})$
	$\displaystyle\leq$	$\displaystyle\;(\upsilon^{k})^{2}(x^{0})^{T}s^{0}+(x^{k})^{T}s^{k}$
		$\displaystyle\;+\upsilon^{k}(1-\upsilon^{k})((x^{0})^{T}s^{\star}+(x^{\star})^{T}s^{0}).$
The above is identical to [23, eqn. (6.20)] and the arguments in [23, Lemmas 6.3-6.4] apply to yield the result.

∎

We next provide a bound on the scaled search directions.

Lemma 4.

Suppose the initial iterate satisfies (23). Then there is a constant $\eta>0$ independent of $n$ such that

\displaystyle\|(D^{k})^{-1}\Delta x^{k}\|\leq\eta(n+1)\mu^{k},\,\|D^{k}\Delta s^{k}\|\leq\eta(n+1)\mu^{k}

where $D^{k}=(X^{k})^{\frac{1}{2}}(S^{k})^{-\frac{1}{2}}$ .

Proof.

Define

	$\displaystyle(\overline{x},\overline{\lambda},\overline{s})=$	$\displaystyle\;(\Delta x^{k},\Delta\lambda^{k},\Delta s^{k})+\upsilon^{k}(x^{0},\lambda^{0},s^{0})$
		$\displaystyle\;-\upsilon^{k}(x^{\star},\lambda^{\star},s^{\star})$	(27a)
Then $(\overline{x},\overline{\lambda},\overline{s})$ satisfy the conditions in Lemma 2 and $\overline{x}^{T}\overline{s}\geq 0$ . From the last row in (21)

		$\displaystyle\;S^{k}(\Delta x^{k}+\upsilon^{k}(x^{0}-x^{\star}))+X^{k}(\Delta s^{k}+\upsilon^{k}(s^{0}-s^{\star}))$
	$\displaystyle=$	$\displaystyle\;-X^{k}s^{k}+\sigma^{k}\mu^{k}e+\upsilon^{k}S^{k}(x^{0}-x^{\star})+\upsilon^{k}X^{k}(s^{0}-s^{\star}).$	(27b)
Multiplying through by $(X^{k}S^{k})^{-\frac{1}{2}}$ and the definition of $D^{k}$ obtain

		$\displaystyle\;(D^{k})^{-1}(\Delta x^{k}+\upsilon^{k}(x^{0}-x^{\star}))+D^{k}(\Delta s^{k}+\upsilon^{k}(s^{0}-s^{\star}))$
	$\displaystyle=$	$\displaystyle\;-(X^{k}S^{k})^{-\frac{1}{2}}(X^{k}s^{k}-\sigma^{k}\mu^{k}e)$
		$\displaystyle\;+\upsilon^{k}(D^{k})^{-1}(x^{0}-x^{\star})+\upsilon^{k}D^{k}(s^{0}-s^{\star}).$	(27c)
Taking norms on both sides, squaring and using $\overline{x}^{T}\overline{s}\geq 0$ obtain

		$\displaystyle\;\\|(D^{k})^{-1}(\Delta x^{k}+\upsilon^{k}(x^{0}-x^{\star}))\\|^{2}$	(27d)
		$\displaystyle\;+\\|D^{k}(\Delta s^{k}+\upsilon^{k}(s^{0}-s^{\star}))\\|^{2}$
	$\displaystyle\leq$	$\displaystyle\;\left\{\begin{aligned} \\|(X^{k}S^{k})^{-\frac{1}{2}}\\|\\|(X^{k}s^{k}-\sigma^{k}\mu^{k}e)\\|\\ +\upsilon^{k}\\|(D^{k})^{-1}(x^{0}-x^{\star})\\|+\upsilon^{k}\\|D^{k}(s^{0}-s^{\star})\\|\end{aligned}\right\}^{2}.$
Starting from this inequality the arguments in the proofs of Lemmas 6.5-6.6 [23] can be followed to show the claim.

∎

Lemma 6.7 [23] provides an uniform bound on $\alpha^{k}$ and holds without any change in the arguments.

The claim on polynomial complexity of Algorithm 1 follows from the Lemmas 3,4 and Lemma 6.7 [23]. The arguments are identical to those in [23, Theorem 6.2]. We state the complexity result without proof.

Theorem 3.

Let $\epsilon>0$ be given. Suppose that the initial iterate satisfies (23) and suppose that $\zeta>0$ satisfies $\zeta^{2}\leq\chi/\epsilon^{\kappa}$ for some constants $\chi,\kappa>0$ . Then there is an index $K=O(n^{2}|\log\epsilon|)$ such that the iterates $(x^{k},\lambda^{k},s^{k})$ generated by Algorithm 1 satisfy $\mu^{k}\leq\epsilon$ for all $k\geq K$ .

5 Numerical Experiments

We present preliminary numerical experiments with the HQP formulation using the IPM solver IPOPT [32]. IPOPT implements the standard IPM for QPs i.e. the algorithm cannot determine a certificate of infeasibility. In our experiments, we generated random instances of QP (1) with a single equality constraint with $C=I_{n}$ , $c=e$ , $E\in\mathbb{R}^{1\times n}$ , $f=-1$ . The coefficients in the single equality are restricted to be nonnegative and generated at random. It is easy to deduce that there exists no $y\geq 0$ such that $Ey<0$ for $E\geq 0$ . Hence, the instances are all infeasible. We presented the QP formulation directly to IPOPT. The HQP formulation (4) is derived for each such random instance and is also presented for solution to IPOPT. We generated 10 random instances of different sizes $n\in\{10,25,50\}$ .

In the first set of experiments, we set the IPOPT option mehrotra_algorithm=no. In this case, the predictor-corrector algorithm of Mehrotra is disabled. When solving the QP formulation, IPOPT always terminates after 8-10 iterations with the indication that the problem might be infeasible. When solving the HQP formulation, IPOPT always terminates with an optimal solution of $0$ as specified in Theorem 2 in about 16-20 iterations.

In the second set of experiments, we enabled the predictor-corrector algorithm in IPOPT by setting mehrotra_algorithm=yes. In this setting IPOPT hits the iteration limit on every single instance when solving the QP formulation. The dual infeasibility blows up to infinity in every instance. When solving the HQP formulation IPOPT always terminates with an optimal solution of $0$ in about 8-10 iterations.

In the third set of experiments, we set $f=1$ . In this case the QP instances are always feasible. We verified that both the QP and HQP formulations solved the problem to optimality. Further, the solutions to QP can be recovered from the HQP could be recovered according to Theorem 1. Thre is no significant difference in the number of iterations taken for convergence using either formulation. However, the computational time per iteration for HQP is higher than that of QP formulation. This is likely due to the matrix in (21) being dense. This can be rectified using a tailored Schur-complement-based implementation for the step computation. We will investigate this in a future work.

6 Conclusions & Future Work

We presented a homogeneous formulation of QP that allows for robust detection of infeasibility in QPs. We also presented an infeasible IPM for the solution of the HQP and showed polynomial iteration complexity. In the context of IPMs, the paper leaves open a number of future avenues for exploration. Firstly, the sparsity of the linear systems in the step computation of IIPM are now affected by the sparsity of $c,f$ which can be dense. To effectively handle this a tailored linear algebra solution is required. Secondly, the linear system $E$ may have structure such as in the case of MPC. Exploiting this through a decomposition approach such as in [3] is critical to improving the computational efficiency. Thirdly, the user of predictor-corrector steps will be crucial for improving the practial performance. We will explore these in a future work. The homogeneous formulation can be readily presented to any QP algorithm. We will also investigate the performance of other QP algorithms on the proposed formulation.

References

[1] F. Borrelli, A. Bemporad, and M. Morari, Predictive Control for Linear and Hybrid Systems. Cambridge University Press, 2017.
[2] D. Mayne, J. Rawlings, and M. Diehl, Model predictive control: Theory, Computation and Design, 2nd ed. Santa Barbara: Nob Hill, LLC, 2020.
[3] C. Rao, S. Wright, and J. Rawlings, “Application of interior-point methods to model predictive control,” Journal of Optimization Theory and Applications, p. 723–757, 1998.
[4] R. Bartlett and L. Biegler, “Qpschur: A dual, active-set, schur-complement method for large-scale and structured convex quadratic programming,” Optimization and Engineering, vol. 7, p. 5–32, 2006.
[5] H. Ferreau, H. Bock, and M. Diehl, “An online active set strategy to overcome the limitations of explicit mpc,” International Journal of Robust and Nonlinear Control, vol. 18, no. 8, p. 816–830, 2008.
[6] A. Domahidi, E. Chu, and S. Boyd, “Ecos: An socp solver for embedded systems,” in 2013 European Control Conference (ECC), 2013, pp. 3071–3076.
[7] S. Richter, C. Jones, and M. Morari, “Computational complexity certification for real-time mpc with input constraints based on the fast gradient method,” IEEE Trans. Autom. Control, vol. 57, no. 6, p. 1391–1403, 2012.
[8] P. Giselsson, “Optimal preconditioning and iteration complexity bounds for gradient-based optimization in model predictive control,” in 2013 American Control Conference, 2013, pp. 358–364.
[9] P. Patrinos and A. Bemporad, “An accelerated dual gradient-projection algorithm for embedded linear model predictive control,” IEEE Transactions on Automatic Control, vol. 59, no. 1, p. 18–33, 2014.
[10] A. U. Raghunathan and S. Di Cairano, “Alternating direction method of multipliers for strictly convex quadratic programs: Optimal parameter selection,” in American Control Conference, 2014, pp. 4324–4329.
[11] P. Giselsson and S. Boyd, “Diagonal scaling in douglas-rachford splitting and admm,” in 53rd IEEE Conference on Decision and Control, 2014, pp. 5033–5039.
[12] P. Patrinos, L. Stella, and A. Bemporad, “Douglas-rachford splitting: Complexity estimates and accelerated variants,” in 53rd IEEE Conference on Decision and Control, 2014, pp. 4234–4239.
[13] E. Ghadimi, A. Teixeira, I. Shames, and M. Johansson, “Optimal parameter selection for the alternating direction method of multipliers (admm): Quadratic problems,” IEEE Transactions on Automatic Control, vol. 60, no. 3, pp. 644–658, 2015.
[14] R. Quirynen, A. Knyazev, and S. D. Cairano, “Block structured preconditioning within an active-set method for real-time optimal control,” in 2018 European Control Conference (ECC), 2018, pp. 1154–1159.
[15] D. Frick, A. Domahidi, and M. Morari, “Embedded optimization for mixed logical dynamical systems,” Computers & Chemical Engineering, vol. 72, pp. 21–33, 2015.
[16] V. Naik and A. Bemporad, “Embedded mixed-integer quadratic optimization using accelerated dual gradient projection,” in IFAC-PapersOnLine, vol. 50, no. 1, 2017, pp. 10 723–10 728.
[17] B. Stellato, V. Naik, A. Bemporad, P. Goulart, and S. Boyd, “Embedded mixed-integer quadratic optimization using the osqp solver,” in 2018 European Control Conference (ECC), 2018, pp. 1536–1541.
[18] P. Hespanhol, R. Quirynen, and S. Di Cairano, “A structure exploiting branch-and-bound algorithm for mixed-integer model predictive control,” in 2019 18th European Control Conference (ECC), 2019, pp. 2763–2768.
[19] A. U. Raghunathan and S. Di Cairano, “Infeasibility detection in alternating direction method of multipliers for convex quadratic programs,” in 53rd IEEE Conference on Decision and Control, 2014, pp. 5819–5824.
[20] G. Banjac, P. Goulart, B. Stellato, and S. Boyd, “Infeasibility detection in the alternating direction method of multipliers for convex optimization,” Journal of Optimization Theory and Applications, vol. 183, no. 2, pp. 490–519, 2019. [Online]. Available: https://doi.org/10.1007/s10957-019-01575-y
[21] D. Liao-McPherson and I. Kolmanovsky, “Fbstab: A proximally stabilized semismooth algorithm for convex quadratic programming,” Automatica, vol. 113, p. 108801, 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0005109819306648
[22] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd ed. New York, NY, USA: Springer, 2006.
[23] S. Wright, Primal-Dual Interior-Point Methods. SIAM, 1997.
[24] E. Wong, “Active-set methods for quadratic programming,” Ph.D. dissertation, University of California, San Diego, 2011.
[25] Y. Ye, M. J. Todd, and S. Mizuno, “An o( $\sqrt{n}$ /l)-iteration homogeneous and self-dual linear programming algorithm,” Mathematics of Operations Research, vol. 19, pp. 53–67, 1994.
[26] X. Xu, P. Hung, and Y. Ye, “A simplified homogeneous and self-dual linear programming algorithm and its implementation,” Annals of Operations Research, vol. 62, pp. 151–172, 1996.
[27] E. Andersen and K. Andersen, “The mosek interior point optimizer for linear programming: an implementation of the homogeneous algorithm,” High Performance Optimization, vol. 33, p. 197–232, 2000.
[28] C. J. Nohra, A. U. Raghunathan, and N. Sahinidis, “Spectral relaxations and branching strategies for global optimization of mixed-integer quadratic programs,” SIAM Journal on Optimization, vol. 31, no. 1, pp. 142–171, 2021.
[29] C. J. Nohra, A. U. Raghunathan, and N. V. Sahinidis, “Sdp-quality bounds via convex quadratic relaxations for global optimization of mixed-integer quadratic programs,” Math. Program., 2021.
[30] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed. Johns Hopkins University Press, 1996.
[31] O. Mangasarian, Nonlinear Programming, ser. Classics in Applied Mathematics. SIAM, 1994.
[32] A. Wächter and L. T. Biegler, “On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming,” Mathematical programming, vol. 106, no. 1, pp. 25–57, 2006.

Homogeneous Formulation of Convex Quadratic Programs for Infeasibility Detection

Abstract

1 Introduction

2 Problem Formulation

Assumption 1.

Assumption 2.

3 Homogeneous QP Formulation

3.1 Conditions on θ\theta

3.2 Computing θ\theta

Lemma 1.

Proof.

Remark 1.

Remark 2.

3.3 Equivalence between QP and HQP

Theorem 1.

Proof.

Theorem 2.

Proof.

4 Infeasible Interior Point Method (IIPM)

Assumption 3.

Assumption 4.

4.1 IIPM for HQP

4.2 Polynomial Complexity

Lemma 2.

Proof.

Lemma 3.

Proof.

Lemma 4.

Proof.

Theorem 3.

5 Numerical Experiments

6 Conclusions & Future Work

References

3.1 Conditions on $\theta$

3.2 Computing $\theta$