This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Solving Challenging Large Scale QAPs

Koichi Fujii NTT DATA Mathematical Systems Inc., Tokyo, 160-0016 Japan    Naoki Ito Fast Retailing Co., Ltd., Uniqlo City Tokyo, Tokyo, 135-0063 Japan    Sunyoung Kim, Department of Mathematics, Ewha W. University, 52 Ewhayeodae-gil, Sudaemoon-gu, Seoul 120-750, Korea    Masakazu Kojima, Department of Industrial and Systems Engineering, Chuo University, Tokyo 192-0393, Japan    Yuji Shinano Department of Applied Algorithmic Intelligence Methods (A2IM), Zuse Institute Berlin, Takustr. 7, 14195 Berlin, Germany    Kim-Chuan Toh Department of Mathematics, and Institute of Operations Research and Analytics, National University of Singapore, 10 Lower Kent Ridge Road, Singapore 119076
Abstract

We report our progress on the project for solving larger scale quadratic assignment problems (QAPs). Our main approach to solve large scale NP-hard combinatorial optimization problems such as QAPs is a parallel branch-and-bound method efficiently implemented on a powerful computer system using the Ubiquity Generator (UG) framework that can utilize more than 100,000 cores. Lower bounding procedures incorporated in the branch-and-bound method play a crucial role in solving the problems. For a strong lower bounding procedure, we employ the Lagrangian doubly nonnegative (DNN) relaxation and the Newton-bracketing method developed by the authors’ group. In this report, we describe some basic tools used in the project including the lower bounding procedure and branching rules, and present some preliminary numerical results.

Our next target problem is QAPs with dimension at least 50, as we have succeeded to solve tai30a and sko42 from QAPLIB for the first time.

1 Introduction

For a positive integer nn, we let N={1,,n}N=\{1,\ldots,n\} represent a set of locations and also a set of facilities. Given n×nn\times n symmetric matrices 𝑨=[aik]\mbox{\boldmath$A$}=[a_{ik}], 𝑩=[bj]\mbox{\boldmath$B$}=[b_{j\ell}] and an n×nn\times n matrix 𝑪=[cij]\mbox{\boldmath$C$}=[c_{ij}], the quadratic assignment problem (QAP) is stated as

minπiNnkNnaikbπ(i),π(k)+iNnci,π(i),\displaystyle\min_{\pi}\sum_{i\in N}^{n}\sum_{k\in N}^{n}a_{ik}b_{\pi(i),\pi(k)}+\sum_{i\in N}^{n}c_{i,\pi(i)}, (1)

where aika_{ik} denotes the flow between facilities ii and kk, bjb_{j\ell} is the distance between locations jj and \ell, cijc_{ij} the fixed cost of assigning facility ii to location jj, and (π(1),,π(n))(\pi(1),\ldots,\pi(n)) a permutation of 1,,n1,\ldots,n such that π(i)=j\pi(i)=j if facility ii is assigned to location jj.

The QAP is known as NP-hard in theory, and solving QAP instances of size larger than 35 in practice still remains challenging. Various heuristic methods for the QAP such as tabu search [21, 22], genetic method [5] and simulated annealing [4] have been developed. Those methods frequently attain near-optimal solutions, which often also happen to be the exact optimal solutions. The exactness is, however, not guaranteed in general.

Most existing methods for finding the exact solutions of QAPs are designed with the branch-and-bound (B&B) method [1, 3, 6, 8, 14, 16]. As its name indicates, branching and (lower) bounding are the main procedures of the method. The bounding based on doubly nonnegative (DNN) relaxations has recently attracted a great deal of attention as it provides tight bounds. Among the methods for solving large scale DNN relaxation problems, we mention four types of methods that can be applied to such DNN problems.

First two of the methods are SDPNAL+ [YST2015] (a majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints) and BBCPOP [9] (a bisection and projection method for Binary, Box and Complementary constrained Polynomial Optimization Problems). Some numerical results on these two methods applied to QAP instances with dimensions n=15n=15 to 5050 from QAPLIB [7] were reported in [9]. where BBCPOP attained tighter lower bounds for many of instances with dimensions 3030 to 5050 in less execution time. In addition, new and improved lower bounds were computed by Mittelmann using BBCPOP for the unknown minimum values of larger scale QAP instances including tai100a and tai100b. See QAPLIB [7]. The third method is an alternating direction method of multipliers (ADMM) proposed by Oliveira et al. [13] in combination with the facial reduction for robustness. The fourth method is the Newton-bracketing (NB) method [2, 10], which was developed to further improve the lower bounds obtained from BBCPOP by incorporating the Newton method into BBCPOP.

Numerical results on the NB method and BBCPOP are given in Table 2 of [10], and ones on ADMM in Table 7 of [13]. Lower bounds for the QAP instances sko42, sko49, sko56, sko64, sko72, sko81, tai60a, and tai80a were commonly computed by these methods. While the lower bounds obtained by the NB method for the first three instances are not smaller than those by ADMM and the difference is as big as 11, the NB method clearly attained the tightest lower bound among the three methods for the last five larger-scale instances with dimension n60n\geq 60.

RLT [17] is known as a powerful method for global optimization of polynomial programming problems. In their paper [6], Goncalves et al. implemented the level 2 RLT in their B&B method run on heterogeneous CPUs and GPUs environment. They solved tai35b and tai40b from QAPLIB for the first time.

The main motivation of our project is to challenge larger scale QAP instances from QAPLIB [7] that have not been solved yet. We implement the NB method [10] combined with the B&B method in the specialized Ubiquity Generator (UG) framework [ShinanoAchterbergBertholdHeinzKoch2012] to find the exact solutions of large scale QAP instances.

The NB method has the following advantages:

(a) The NB method generates a sequence of intervals [lbk,ubk][\mbox{lb}^{k},\mbox{ub}^{k}] (k=1,2,)(k=1,2,\ldots) such that each [lbk,ubk][\mbox{lb}^{k},\mbox{ub}^{k}] contains the lower bound ζ¯\underline{\zeta} to be computed and both lbk\mbox{lb}^{k} and ubk\mbox{ub}^{k} converge monotonically to ζ¯\underline{\zeta} as kk\rightarrow\infty. Hence, if lbk\mbox{lb}^{k} becomes larger than the incumbent objective value ζ^\hat{\zeta} at some iteration kk, then ζ^<lbkζ¯\hat{\zeta}<\mbox{lb}^{k}\leq\underline{\zeta} follows. Consequently, the node can be pruned. If ubk<ζ^\mbox{ub}^{k}<\hat{\zeta} occurs at some iteration kk, then branching the node should be determined as ζ¯ubk<ζ^\underline{\zeta}\leq\mbox{ub}^{k}<\hat{\zeta} holds. In these two cases, the NB method can terminate at iteration kk before the interval attains an accurate lower bound. This feature of the NB method works effectively to reduce a considerable amount of the computational time.

(b) The lower bounds computed by the NB method (and also the lower bounds computed by BBCPOP) are reliable or valid in the sense that they are guaranteed by very accurate dual feasible solutions of the Lagrangian DNN relaxation problem (see [10, Section 2.3].) It would be plausible to incorporate the technique used there for ADMM’s lower bound, but the resulting lower bound would deteriorate.

The UG is a generic framework to parallelize branch-and-bound based solvers and has achieved large-scale MPI parallelism with 80,000 cores [18]. It has also been generalized to handle non-branch-and-bound based solvers. Its experimental implementation for solving Shortest Vector Problem (SVP) handles more than 100,000 cores [23] .

The aim of this article is to present fundamentals of our joint project for solving larger scale QAPs, report the progress, and discuss future plans to further develop the methods. We mention that tai35b, tai40b, tai30a and sko42 from QAPLIB [7] have been successfully solved by our method, where the last two instances could be solved for the first time. QAPs with dimension at least 50 are our next target problems.

In Section 2, we first reformulate QAP (1) to QAP (6), which is a binary quadratic optimization problem form in (1+n2)(1+n^{2})-dimensional vector variable 𝒖u. Then we introduce a class of its sub QAPs and describe the enumeration tree of those subproblems used in the branching process. In Section 3, a simple Lagrangian DNN relaxation problem of a sub QAP and the NB method to solve the relaxation problem are presented for the lower bounding procedure. In Section 4, a method to retrieve a feasible solution of (6) from an approximate optimal solution of the Lagrangian DNN relaxation problem is described for the upper bounding procedure. We present three types branching rules in Section 5, and preliminary numerical results in Section 6. Some additional techniques are proposed for future development in Section 7.

Notation and Symbols

Let m\mathbb{R}^{m} denote the mm-dimensional Euclidean space of column vectors 𝒛=[z1;;zm]\mbox{\boldmath$z$}=[z_{1};\ldots;z_{m}], +m\mathbb{R}^{m}_{+} the nonnegative orthant of m\mathbb{R}^{m}, 𝕊m\mathbb{S}^{m} the linear space of m×mm\times m symmetric matrices with the inner product 𝑨,𝑩=i=1mj=1mAijBij\langle\mbox{\boldmath$A$},\,\mbox{\boldmath$B$}\rangle=\sum_{i=1}^{m}\sum_{j=1}^{m}A_{ij}B_{ij}, and 𝕊+m\mathbb{S}^{m}_{+} the cone of positive semidefinite matrices in 𝕊m\mathbb{S}^{m}. For every ×m\ell\times m matrix 𝑼U, 𝑼.j\mbox{\boldmath$U$}._{j} denotes the jjth column vector of 𝑼U, and vec 𝑼U the m\ell m-dimensional column vector converted from the matrix 𝑼U, i.e., vec𝑼=[𝑼.1;;𝑼.m]\mbox{\boldmath$U$}=[\mbox{\boldmath$U$}._{1};\ldots;\mbox{\boldmath$U$}._{m}]. 𝑼T\mbox{\boldmath$U$}^{T} denotes the transposition of 𝑼U. Throughout the paper, we use nn for the size of QAP (1), and N={1,,n}N=\{1,\ldots,n\} for the set of facilities and the set of locations. When {0,N×N}\mathbb{R}^{\{0,N\times N\}} is used, each (1+n2)(1+n^{2})-dimensional column vector 𝒖{0,N×N}\mbox{\boldmath$u$}\in\mathbb{R}^{\{0,N\times N\}} is represented component-wisely as [u0;u11,;un1;u12;;un2;,u1n;,unn][u_{0};u_{11},\ldots;u_{n1};u_{12};\ldots;u_{n2};\ldots,u_{1n};\ldots,u_{nn}], where 𝑼=[uik]\mbox{\boldmath$U$}=[u_{ik}] forms an n×nn\times n matrix and 𝒖=[u0;vec 𝑼]=[u0;𝑼.1;;𝑼.n]\mbox{\boldmath$u$}=[u_{0};\mbox{vec }\mbox{\boldmath$U$}]=[u_{0};\mbox{\boldmath$U$}._{1};\ldots;\mbox{\boldmath$U$}._{n}]. Each ((1+n2)×(1+n2))((1+n^{2})\times(1+n^{2}))-dimensional symmetric matrix 𝑿𝕊{0,N×N}\mbox{\boldmath$X$}\in\mathbb{S}^{\{0,N\times N\}} has elements X𝜶𝜷X_{\mbox{\scriptsize\boldmath$\alpha$}\mbox{\scriptsize\boldmath$\beta$}} (𝜶,𝜷{0,N×N})(\mbox{\boldmath$\alpha$},\mbox{\boldmath$\beta$}\in\{0,N\times N\}).

2 A class of sub QAPs and the enumeration tree

2.1 Conversion of QAP (1) into a QOP with a binary vector variable

Let 𝑼=[uij]\mbox{\boldmath$U$}=[u_{ij}] be an n×nn\times n matrix variable and 𝒖=[u0;vec(𝑼)]=[u0;𝑼.1;;𝑼.n]{0,N×N}\mbox{\boldmath$u$}=[u_{0};\mbox{\bf vec}(\mbox{\boldmath$U$})]=[u_{0};\mbox{\boldmath$U$}._{1};\ldots;\mbox{\boldmath$U$}._{n}]\in\mathbb{R}^{\{0,N\times N\}} a (1+n2)(1+n^{2})-dimensional column vector variable, where

uij={1if facility i is assigned to location j,0otherwise.\displaystyle u_{ij}=\left\{\begin{array}[]{ll}1&\mbox{if facility $i$ is assigned to location $j$},\\ 0&\mbox{otherwise}.\end{array}\right.

Define a constant matrix

𝑸0\displaystyle\mbox{\boldmath$Q$}^{0} =\displaystyle= (0vec(𝑪)T/2vec(𝑪)/2𝑩𝑨)𝕊{0,N×N},\displaystyle\begin{pmatrix}0&\mbox{\bf vec}(\mbox{\boldmath$C$})^{T}/2\\ \mbox{\bf vec}(\mbox{\boldmath$C$})/2&\mbox{\boldmath$B$}\otimes\mbox{\boldmath$A$}\end{pmatrix}\in\mathbb{S}^{\{0,N\times N\}},

where \otimes denotes the Kronecker product. Then, the QAP can be expressed as

ζ0\displaystyle\zeta^{0} =\displaystyle= min{𝑸0,𝒖𝒖T:𝒖+{0,N×N},u0=1,uij(u0uij)=0((i,j)N×N),kNuik=u0(iN),kNukj=u0(jN)}.\displaystyle\min\left\{\langle\mbox{\boldmath$Q$}^{0},\,\mbox{\boldmath$u$}\mbox{\boldmath$u$}^{T}\rangle:\begin{array}[]{l}\mbox{\boldmath$u$}\in\mathbb{R}^{\{0,N\times N\}}_{+},\ u_{0}=1,\\[3.0pt] \displaystyle u_{ij}(u_{0}-u_{ij})=0\ ((i,j)\in N\times N),\\[3.0pt] \displaystyle\sum_{k\in N}u_{ik}=u_{0}\ (i\in N),\ \displaystyle\sum_{k\in N}u_{kj}=u_{0}\ (j\in N)\end{array}\right\}. (6)

2.2 A class of subproblems of QAP (6)

We describe subproblems of QAP (6) with a family of subsequences (i1,,ir)(i_{1},\ldots,i_{r}) with 1i1<<irn1\leq i_{1}<\cdots<i_{r}\leq n. More precisely, let Πr\Pi_{r} denote the family of subsequences of rr distinct elements of NN (r=1,,n)(r=1,\ldots,n). We assume Π0={}\Pi_{0}=\{\emptyset\}. Πn\Pi_{n} corresponds to the family of permutations of 1,,n1,\ldots,n, and each F=(i1,,ir)ΠrF=(i_{1},\ldots,i_{r})\in\Pi_{r} (or L=(j1,,jr)ΠrL=(j_{1},\ldots,j_{r})\in\Pi_{r}) with r1r\geq 1 corresponds to a permutation of rr distinct elements ipi_{p} (p=1,,r)(p=1,\ldots,r) (or jpj_{p} (p=1,,r)(p=1,\ldots,r)) of NN. For simplicity of notation, FΠrF\in\Pi_{r} is frequently regarded as a subset of NN when iFi\in F and its complement Fc=N\FF^{c}=N\backslash F are described with respect to NN.

For each (F,L)=((i1,,ir),(j1,,jr))Πr×Πr(F,L)=((i_{1},\ldots,i_{r}),(j_{1},\ldots,j_{r}))\in\Pi_{r}\times\Pi_{r} (r{0,1,,n})(r\in\{0,1,\ldots,n\}), let

{0,Fc×Lc}\displaystyle\mathbb{R}^{\{0,F^{c}\times L^{c}\}} =\displaystyle= 1+|Fc||Lc|1+|F^{c}||L^{c}|-dimensional Euclidean space of column vectors 𝒙x
with elements x0 and xij ((i,j)Fc×Lc),\displaystyle\mbox{with elements $x_{0}$ and $x_{ij}$ $((i,j)\in F^{c}\times L^{c})$},
+{0,Fc×Lc}\displaystyle\mathbb{R}^{\{0,F^{c}\times L^{c}\}}_{+} =\displaystyle= the nonnegative orthant of {0,Fc×Lc},\displaystyle\mbox{the nonnegative orthant of $\mathbb{R}^{\{0,F^{c}\times L^{c}\}}$},
𝕊{0,Fc×Lc}\displaystyle\mathbb{S}^{\{0,F^{c}\times L^{c}\}} =\displaystyle= the linear space of (1+|Fc||Lc|)×(1+|Fc||Lc|)(1+|F^{c}||L^{c}|)\times(1+|F^{c}||L^{c}|) symmetric
matrices 𝑿 with elements X𝜶𝜷 (𝜶,𝜷{0,Fc×Lc}),\displaystyle\mbox{matrices $\mbox{\boldmath$X$}$ with elements $X_{\mbox{\scriptsize\boldmath$\alpha$}\mbox{\scriptsize\boldmath$\beta$}}$ $(\mbox{\boldmath$\alpha$},\mbox{\boldmath$\beta$}\in\{0,F^{c}\times L^{c}\})$},
𝕊+{0,Fc×Lc}\displaystyle\mathbb{S}^{\{0,F^{c}\times L^{c}\}}_{+} =\displaystyle= the cone of positive semidefinite matrices in 𝕊{0,Fc×Lc}.\displaystyle\mbox{the cone of positive semidefinite matrices in }\mathbb{S}^{\{0,F^{c}\times L^{c}\}}.

Then, for each (F,L)=((i1,,ir),(j1,,jr))Πr×Πr(F,L)=((i_{1},\ldots,i_{r}),(j_{1},\ldots,j_{r}))\in\Pi_{r}\times\Pi_{r} (r{0,1,,n})(r\in\{0,1,\ldots,n\}), a subproblem of QAP (6) for assigning each facility ipFi_{p}\in F to each location jpLj_{p}\in L (p=1,,r)(p=1,\ldots,r) can be written as

QAP(F,L)ζ(F,L)=min{𝑸(F,L),𝒙𝒙T:𝒙+{0,Fc×Lc},x0=1,xij(x0xij)=0((i,j)Fc×Lc)kLcxikx0=0(iFc),kFcxkjx0=0(jLc)}\displaystyle\mbox{QAP$(F,L)$: }\zeta(F,L)=\min\left\{\langle\mbox{\boldmath$Q$}(F,L),\,\mbox{\boldmath$x$}\mbox{\boldmath$x$}^{T}\rangle:\begin{array}[]{l}\mbox{\boldmath$x$}\in\mathbb{R}^{\{0,F^{c}\times L^{c}\}}_{+},\ x_{0}=1,\\[2.0pt] x_{ij}(x_{0}-x_{ij})=0\\[2.0pt] ((i,j)\in F^{c}\times L^{c})\\[2.0pt] \displaystyle\sum_{k\in L^{c}}x_{ik}-x_{0}=0\ (i\in F^{c}),\\[2.0pt] \displaystyle\sum_{k\in F^{c}}x_{kj}-x_{0}=0\ (j\in L^{c})\end{array}\right\}

for some matrix 𝑸(F,L)𝕊{0,Fc×Lc}\mbox{\boldmath$Q$}(F,L)\in\mathbb{S}^{\{0,F^{c}\times L^{c}\}}, which is given by 𝑸(F,L)=𝑷(F,L)T𝑸0𝑷(F,L)\mbox{\boldmath$Q$}(F,L)=\mbox{\boldmath$P$}(F,L)^{T}\mbox{\boldmath$Q$}^{0}\mbox{\boldmath$P$}(F,L), where

𝑷(F,L)\displaystyle\mbox{\boldmath$P$}(F,L) =\displaystyle= the (1+|N×N|)×(1+|Fc×Lc|)(1+|N\times N|)\times(1+|F^{c}\times L^{c}|) matrix with elements
P(F,L)𝜶𝜷={1if 𝜶=0and 𝜷=0,1if 𝜶=(ip,jp) for some p{1,,r} and 𝜷=0,1if 𝜶=𝜷Fc×Lc,0otherwise. \displaystyle P(F,L)_{\mbox{\scriptsize\boldmath$\alpha$}\mbox{\scriptsize\boldmath$\beta$}}\ =\ \left\{\begin{array}[]{ll}1&\mbox{if }\mbox{\boldmath$\alpha$}=0\ \mbox{and }\mbox{\boldmath$\beta$}=0,\\ 1&\mbox{if }\mbox{\boldmath$\alpha$}=(i_{p},j_{p})\text{ for some }p\in\{1,\ldots,r\}\text{ and }\mbox{\boldmath$\beta$}=0,\\ 1&\mbox{if }\mbox{\boldmath$\alpha$}=\mbox{\boldmath$\beta$}\in F^{c}\times L^{c},\\ 0&\mbox{otherwise. }\end{array}\right.

Obviously, 𝑷(,)\mbox{\boldmath$P$}(\emptyset,\emptyset) coincides with the identity matrix in 𝕊{1,N×N}\mathbb{S}^{\{1,N\times N\}} and QAP(,)(\emptyset,\emptyset) corresponds to QAP (6).

2.3 Enumeration tree

A fundamental step in the B&B method is to successively construct an enumeration tree. Each node of the enumeration tree corresponds to a subproblem of QAP (6), QAP(F,L)(F,L) for some (F,L)Πr×Πr(F,L)\in\Pi_{r}\times\Pi_{r} with r{0,1,,n}r\in\{0,1,\ldots,n\}. Thus, each node is identified with a subproblem. To initialize the tree, we set QAP(,)(\emptyset,\emptyset) (i.e., QAP (6)) for the root node and compute the incumbent objective value ζ^\hat{\zeta} by applying a heuristic method to QAP (6).

Suppose that we have generated a node QAP(F,L)(F,L), where F=(i1,,ir)ΠrF=(i_{1},\ldots,i_{r})\in\Pi_{r} and L=(j1,,jr)ΠrL=(j_{1},\ldots,j_{r})\in\Pi_{r} for some r{1,,n}r\in\{1,\ldots,n\} (or (F,L)=(,)(F,L)=(\emptyset,\emptyset)). Then, we compute a lower bound ζ(F,L)\zeta^{\ell}(F,L) (see Section 3) and an upper bound ζu(F,L)\zeta^{u}(F,L) (see Section 4), which is associated with a feasible solution of QAP(F,L)(F,L), for the (unknown) optimal value ζ(F,L)\zeta(F,L); ζ(F,L)ζ(F,L)ζu(F,L)\zeta^{\ell}(F,L)\leq\zeta(F,L)\leq\zeta^{u}(F,L). The incumbent objective value ζ^\hat{\zeta} is then updated by ζ^=min{ζ^,ζu(F,L)}\hat{\zeta}=\min\{\hat{\zeta},\ \zeta^{u}(F,L)\}.

When ζ^ζ(F,L)\hat{\zeta}\leq\zeta^{\ell}(F,L) occurs, the (unknown) optimal value ζ(F,L)\zeta(F,L) of QAP(F,L)(F,L) is greater than or equal to the incumbent objective value ζ^\hat{\zeta}. In this case, the node is pruned (or terminated). Otherwise, branching the node into nrn-r child nodes takes place as follows. We select either facility f+Fcf_{+}\in F^{c} or location +Lc\ell_{+}\in L^{c}. If f+Fcf_{+}\in F^{c} is selected, then |Lc|=nr|L^{c}|=n-r child nodes QAP((F,f+),(L,))(Lc)\mbox{QAP}((F,f_{+}),(L,\ell))\ (\ell\in L^{c}) are generated. Otherwise, |Fc|=nr|F^{c}|=n-r child nodes QAP((F,f),(L,+))(fFc)\mbox{QAP}((F,f),(L,\ell+))\ (f\in F^{c}) are generated. A branching rule determines which f+Fcf_{+}\in F^{c} or +Lc\ell_{+}\in L^{c} to select. In Section 5, we describe three types of branching rules in detail.

For any QAP(F,L)(F,L) possibly located at the bottom level (or the nnth level), we have |F|=|L|=n|F|=|L|=n and Fc=Lc=F^{c}=L^{c}=\emptyset. Thus, QAP(F,L)(F,L) becomes a trivial QAP with every facility assigned to some location and vice versa. It is trivial to compute its optimal value ζ(F,L)\zeta(F,L). We note that the height of the enumeration tree is at most nn. Numerically, we can enumerate all (nr)!(n-r)! feasible solutions of QAP(F,L)(F,L) to compute its exact optimal value when |Fc|=|Lc|=nrm|F^{c}|=|L^{c}|=n-r\leq m holds for a small enough mm. In the numerical results reported in Section 6, we took m=7m=7.

3 Lower bounding procedure

Throughout this section, we fix (F,L)Πr×Πr(F,L)\in\Pi_{r}\times\Pi_{r} (r{0,1,,n})(r\in\{0,1,\ldots,n\}), and present a method to compute a lower bound for the optimal objective value ζ(F,L)\zeta(F,L) of QAP(F,L)(F,L). Before deriving a doubly nonnegative (DNN) relaxation of QAP(F,L)(F,L) in Section 3.1 and a Lagrangian DNN relaxation of QAP(F,L)(F,L) in Section 3.2, we first transform QAP(F,L)(F,L) into QAP¯(F,L)\mbox{\mbox{$\overline{\mbox{QAP}}$}}(F,L) described below.

We note that each homogeneous linear equality constraint kLcxikx0=0\displaystyle\sum_{k\in L^{c}}x_{ik}-x_{0}=0 of QAP(F,L)(F,L) can be written as 𝒄^(i,L)T𝒙=0\hat{\mbox{\boldmath$c$}}(i,L)^{T}\mbox{\boldmath$x$}=0 for some 𝒄^(i,L){0,Fc×Lc}\hat{\mbox{\boldmath$c$}}(i,L)\in\mathbb{R}^{\{0,F^{c}\times L^{c}\}} (iFc)(i\in F^{c}). Let 𝒂^(i,L)=vec(𝒄^(i,L)𝒄^(i,L)T)T\hat{\mbox{\boldmath$a$}}(i,L)=\mbox{\bf vec}(\hat{\mbox{\boldmath$c$}}(i,L)\hat{\mbox{\boldmath$c$}}(i,L)^{T})^{T}. Then, the linear equality is equivalent to 𝒂^(i,L)vec(𝒙𝒙T)=0\hat{\mbox{\boldmath$a$}}(i,L)\mbox{\bf vec}(\mbox{\boldmath$x$}\mbox{\boldmath$x$}^{T})=0. Similarly, we can rewrite each homogeneous linear constraint kFcxkjx0=0\displaystyle\sum_{k\in F^{c}}x_{kj}-x_{0}=0 as 𝒂~(j,F)vec(𝒙𝒙T)=0\tilde{\mbox{\boldmath$a$}}(j,F)\mbox{\bf vec}(\mbox{\boldmath$x$}\mbox{\boldmath$x$}^{T})=0 for some (1+|Fc||Lc|)2(1+|F^{c}||L^{c}|)^{2}-dimensional row vector 𝒂~(j,F)\tilde{\mbox{\boldmath$a$}}(j,F) (jLc)(j\in L^{c}). Hence, the entire linear constraints

kLcxikx0=0(iFc) and kFcxkjx0=0(jLc)\displaystyle\sum_{k\in L^{c}}x_{ik}-x_{0}=0\ (i\in F^{c})\ \mbox{ and }\sum_{k\in F^{c}}x_{kj}-x_{0}=0\ (j\in L^{c})

can be expressed as 𝑨(F,L)vec(𝒙𝒙T)=0|Fc|+|Lc|\mbox{\boldmath$A$}(F,L)\mbox{\bf vec}(\mbox{\boldmath$x$}\mbox{\boldmath$x$}^{T})=\mbox{\bf 0}\in\mathbb{R}^{|F^{c}|+|L^{c}|}, where 𝑨(F,L)\mbox{\boldmath$A$}(F,L) denotes the (|Fc|+|Lc|)×(1+|Fc||Lc|)2(|F^{c}|+|L^{c}|)\times(1+|F^{c}||L^{c}|)^{2} matrix consisting of the row vectors 𝒂^(i,L)\hat{\mbox{\boldmath$a$}}(i,L) (iFc)(i\in F^{c}) and 𝒂~(j,F)\tilde{\mbox{\boldmath$a$}}(j,F) (jLc)(j\in L^{c}).

Define

𝕂1(F,L)\displaystyle\mathbb{K}_{1}(F,L) =\displaystyle= 𝕊+{0,Fc×Lc},\displaystyle\mathbb{S}^{\{0,F^{c}\times L^{c}\}}_{+},
𝕂2(F,L)\displaystyle\mathbb{K}_{2}(F,L) =\displaystyle= {𝑿𝕊{0,Fc×Lc}:X𝜶𝜷0(𝜶,𝜷{0,Fc×Lc}),X0𝜶X𝜶𝜶=0(𝜶Fc×Lc)},\displaystyle\left\{\mbox{\boldmath$X$}\in\mathbb{S}^{\{0,F^{c}\times L^{c}\}}:\begin{array}[]{l}X_{\mbox{\scriptsize\boldmath$\alpha$}\mbox{\scriptsize\boldmath$\beta$}}\geq 0\ (\mbox{\boldmath$\alpha$},\mbox{\boldmath$\beta$}\in\{0,F^{c}\times L^{c}\}),\\[2.0pt] X_{0\mbox{\scriptsize\boldmath$\alpha$}}-X_{\mbox{\scriptsize\boldmath$\alpha$}\mbox{\scriptsize\boldmath$\alpha$}}=0\ (\mbox{\boldmath$\alpha$}\in F^{c}\times L^{c})\end{array}\right\},
𝑯(F,L)\displaystyle\mbox{\boldmath$H$}(F,L) =\displaystyle= the matrix in 𝕊{0,Fc×Lc}\mathbb{S}^{\{0,F^{c}\times L^{c}\}} with 11 at the (0,0)(0,0)th element
and 0 elsewhere.\displaystyle\mbox{and $0$ elsewhere}.

By construction, the constraint xij(x0xij)=0x_{ij}(x_{0}-x_{ij})=0 ((i,j)Fc×Lc))((i,j)\in F^{c}\times L^{c})) is equivalent to X0𝜶X𝜶𝜶=0(𝜶Fc×Lc)X_{0\mbox{\scriptsize\boldmath$\alpha$}}-X_{\mbox{\scriptsize\boldmath$\alpha$}\mbox{\scriptsize\boldmath$\alpha$}}=0\ (\mbox{\boldmath$\alpha$}\in F^{c}\times L^{c}) if 𝑿=𝒙𝒙T𝕊{0,Fc×Lc}\mbox{\boldmath$X$}=\mbox{\boldmath$x$}\mbox{\boldmath$x$}^{T}\in\mathbb{S}^{\{0,F^{c}\times L^{c}\}} and 𝒙{0,Fc×Lc}\mbox{\boldmath$x$}\in\mathbb{R}^{\{0,F^{c}\times L^{c}\}}, and 𝑯(F,L),𝑿=X00\langle\mbox{\boldmath$H$}(F,L),\,\mbox{\boldmath$X$}\rangle=X_{00} for every 𝑿𝕊{0,F×L}\mbox{\boldmath$X$}\in\mathbb{S}^{\{0,F\times L\}}. Therefore, we can rewrite QAP(F,L)(F,L) given in Section 2.2 as

QAP¯(F,L):ζ(F,L)\displaystyle\mbox{\mbox{$\overline{\mbox{QAP}}$}}(F,L):\zeta(F,L) =\displaystyle= {𝑸(F,L),𝑿:𝒙{0,Fc×Lc},𝑿=𝒙𝒙T𝕂1(F,L)𝕂2(F,L),𝑯(F,L),𝑿=1,𝑨(F,L)vec(𝑿)=0}.\displaystyle\left\{\langle\mbox{\boldmath$Q$}(F,L),\,\mbox{\boldmath$X$}\rangle:\begin{array}[]{l}\mbox{\boldmath$x$}\in\mathbb{R}^{\{0,F^{c}\times L^{c}\}},\ \mbox{\boldmath$X$}=\mbox{\boldmath$x$}\mbox{\boldmath$x$}^{T}\in\mathbb{K}_{1}(F,L)\cap\mathbb{K}_{2}(F,L),\\[2.0pt] \langle\mbox{\boldmath$H$}(F,L),\,\mbox{\boldmath$X$}\rangle=1,\ \mbox{\boldmath$A$}(F,L)\mbox{\bf vec}(\mbox{\boldmath$X$})=\mbox{\bf 0}\end{array}\right\}.

3.1 A DNN relaxation of QAP¯\overline{\mbox{QAP}}(F,L)(F,L)

In QAP¯\overline{\mbox{QAP}}(F,L)(F,L), the constraint 𝑿=𝒙𝒙T\mbox{\boldmath$X$}=\mbox{\boldmath$x$}\mbox{\boldmath$x$}^{T} requires the rank of the variable matrix 𝑿X to be 11. By removing this constraint, we obtain

DNNp(F,L):ηp(F,L)\displaystyle\mbox{DNN}^{p}(F,L):\eta^{p}(F,L) =\displaystyle= min{𝑸(F,L),𝑿:𝑿𝕂1(F,L)𝕂2(F,L),𝑯(F,L),𝑿=1,𝑨(F,L)vec(𝑿)=0}.\displaystyle\min\left\{\langle\mbox{\boldmath$Q$}(F,L),\,\mbox{\boldmath$X$}\rangle:\begin{array}[]{l}\mbox{\boldmath$X$}\in\mathbb{K}_{1}(F,L)\cap\mathbb{K}_{2}(F,L),\\[2.0pt] \langle\mbox{\boldmath$H$}(F,L),\,\mbox{\boldmath$X$}\rangle=1,\ \mbox{\boldmath$A$}(F,L)\mbox{\bf vec}(\mbox{\boldmath$X$})=\mbox{\bf 0}\end{array}\right\}.

which serves as a DNN relaxation of QAP¯\overline{\mbox{QAP}}(F,L)(F,L) with ηp(F,L)ζ(F,L)\eta^{p}(F,L)\leq\zeta(F,L).

3.2 A Lagrangian DNN relaxation problem of QAP¯\overline{\mbox{QAP}}(F,L)(F,L)

We know that

𝑨(F,L)vec(𝑿)0for every 𝑿𝕂1(F,L)=𝕊+{0,Fc×Lc},\displaystyle\mbox{\boldmath$A$}(F,L)\mbox{\bf vec}(\mbox{\boldmath$X$})\geq\mbox{\bf 0}\ \mbox{for every }\mbox{\boldmath$X$}\in\mathbb{K}_{1}(F,L)=\mathbb{S}^{\{0,F^{c}\times L^{c}\}}_{+},

since each row vector of 𝑨(F,L)\mbox{\boldmath$A$}(F,L) is of the form vec(𝒄𝒄T)\mbox{\bf vec}(\mbox{\boldmath$c$}\mbox{\boldmath$c$}^{T}) for some 𝒄{0,Fc×Lc}\mbox{\boldmath$c$}\in\mathbb{R}^{\{0,F^{c}\times L^{c}\}}. Hence, the constraint 𝑨(F,L)vec(𝑿)=0\mbox{\boldmath$A$}(F,L)\mbox{\bf vec}(\mbox{\boldmath$X$})=\mbox{\bf 0} in DNN(F,L)p{}^{p}(F,L) can be replaced by

0\displaystyle 0 =\displaystyle= 𝒆T𝑨(F,L)vec(𝑿)=iFc𝒄^(i,L)𝒄^(i,L)T+jLc𝒄~(j,F)𝒄~(i,L)T,𝑿,\displaystyle\mbox{\boldmath$e$}^{T}\mbox{\boldmath$A$}(F,L)\mbox{\bf vec}(\mbox{\boldmath$X$})\ =\ \big{\langle}\sum_{i\in F^{c}}\hat{\mbox{\boldmath$c$}}(i,L)\hat{\mbox{\boldmath$c$}}(i,L)^{T}+\sum_{j\in L^{c}}\tilde{\mbox{\boldmath$c$}}(j,F)\tilde{\mbox{\boldmath$c$}}(i,L)^{T},\,\mbox{\boldmath$X$}\big{\rangle},

where 𝒆|F|+|L|\mbox{\boldmath$e$}\in\mathbb{R}^{|F|+|L|} denotes the vector of 11’s, and the second equality holds by the construction of 𝑨(F,L)\mbox{\boldmath$A$}(F,L). Applying the Lagrangian relaxation to the resulting problem, we obtain

DNNλp(F,L):ηλp(F,L)\displaystyle\mbox{DNN}^{p}_{\lambda}(F,L):\eta^{p}_{\lambda}(F,L) =\displaystyle= min{𝑸λ(F,L),𝑿:𝑿𝕂1(F,L)𝕂2(F,L),𝑯(F,L),𝑿=1},\displaystyle\min\left\{\langle\mbox{\boldmath$Q$}_{\lambda}(F,L),\,\mbox{\boldmath$X$}\rangle:\begin{array}[]{l}\mbox{\boldmath$X$}\in\mathbb{K}_{1}(F,L)\cap\mathbb{K}_{2}(F,L),\\ \langle\mbox{\boldmath$H$}(F,L),\,\mbox{\boldmath$X$}\rangle=1\end{array}\right\},

and its dual

DNNλd(F,L):ηλd(F,L)\displaystyle\mbox{DNN}^{d}_{\lambda}(F,L):\eta^{d}_{\lambda}(F,L) =\displaystyle= max{y:𝒀1𝕂1(F,L),𝒀2𝕂2(F,L),𝑸λ(F,L)𝑯(F,L)y=𝒀1+𝒀2}.\displaystyle\max\left\{y\in\mathbb{R}:\begin{array}[]{l}\mbox{\boldmath$Y$}_{1}\in\mathbb{K}_{1}(F,L)^{*},\ \mbox{\boldmath$Y$}_{2}\in\mathbb{K}_{2}(F,L)^{*},\\[2.0pt] \mbox{\boldmath$Q$}_{\lambda}(F,L)-\mbox{\boldmath$H$}(F,L)y=\mbox{\boldmath$Y$}_{1}+\mbox{\boldmath$Y$}_{2}\end{array}\right\}.

Here λ\lambda\in\mathbb{R} denotes a Lagrangian multiplier, and

𝑸λ(F,L)\displaystyle\mbox{\boldmath$Q$}_{\lambda}(F,L) =\displaystyle= 𝑸(F,L)+λ(iFc𝒄^(i,L)𝒄^(i,L)T+jLc𝒄~(j,F)𝒄~(i,L)T)𝕊{0,Fc×Lc}.\displaystyle\mbox{\boldmath$Q$}(F,L)+\lambda\left(\sum_{i\in F^{c}}\hat{\mbox{\boldmath$c$}}(i,L)\hat{\mbox{\boldmath$c$}}(i,L)^{T}+\sum_{j\in L^{c}}\tilde{\mbox{\boldmath$c$}}(j,F)\tilde{\mbox{\boldmath$c$}}(i,L)^{T}\right)\in\mathbb{S}^{\{0,F^{c}\times L^{c}\}}.

DNN(F,L)λp{}^{p}_{\lambda}(F,L) serves as a Lagrangian relaxation of DNN(F,L)p{}^{p}(F,L) and a Lagrangian DNN relaxation of QAP¯\overline{\mbox{QAP}}(F,L)(F,L) with ηλp(F,L)ηp(F,L)ζ(F,L)\eta^{p}_{\lambda}(F,L)\leq\eta^{p}(F,L)\leq\zeta(F,L) for any λ\lambda\in\mathbb{R}. We can prove that ηλp(F,L)\eta^{p}_{\lambda}(F,L) converges monotonically to ηp(F,L)\eta^{p}(F,L) as λ\lambda\rightarrow\infty. We also see by the duality theorem ([2, Lemma 2.3]) that ηλp(F,L)=ηλd(F,L)\eta^{p}_{\lambda}(F,L)=\eta^{d}_{\lambda}(F,L).

3.3 The Newton-bracketing method for solving the primal-dual pair of DNN(F,L)λp{}^{p}_{\lambda}(F,L) and DNN(F,L)λd{}^{d}_{\lambda}(F,L)

To describe the NB method for solving DNN(F,L)λp{}^{p}_{\lambda}(F,L) and DNN(F,L)λd{}^{d}_{\lambda}(F,L), λ\lambda is fixed to a sufficiently large positive number, say λ=100,000\lambda=100,000. The optimal value ηλd\eta^{d}_{\lambda} of DNN(F,L)λd{}^{d}_{\lambda}(F,L) is denoted by yy^{*} throughout this subsection.

Let =lb0<y<ub0-\infty=\mbox{lb}^{0}<y^{*}<\mbox{ub}^{0}. The NB method applied to DNN(F,L)λp{}^{p}_{\lambda}(F,L) and DNN(F,L)λd{}^{d}_{\lambda}(F,L) generates

𝑿^k𝕂1(F,L)𝕂2(F,L),(yk,𝒀^1k,𝒀^2k)×𝕊{0,Fc×Lc}×𝕂2(F,L),\displaystyle\widehat{\mbox{\boldmath$X$}}^{k}\in\mathbb{K}_{1}(F,L)\cap\mathbb{K}_{2}(F,L),\ (y^{k},\widehat{\mbox{\boldmath$Y$}}_{1}^{k},\widehat{\mbox{\boldmath$Y$}}_{2}^{k})\in\mathbb{R}\times\mathbb{S}^{\{0,F^{c}\times L^{c}\}}\times\mathbb{K}_{2}(F,L)^{*},
lbk,ubk, 0τk\displaystyle\mbox{lb}^{k}\in\mathbb{R},\ \mbox{ub}^{k}\in\mathbb{R},\ 0\geq\tau^{k}\in\mathbb{R}

(k=1,2,)(k=1,2,\ldots) such that

𝑸(F,L)𝑯(F,L)yk𝒀^1k𝒀^2k=𝑶,lbk=yk+(1+|Fc|)τk,τk=min{0,the minimum eigenvalue of 𝒀^1k},lb0lbklbk+1yubk+1ubkub0,𝑿=𝑿^kis a feasible solution of DNN(F,L)λp,𝑸(F,L),𝑿^k=ubk.}\displaystyle\left.\begin{array}[]{l}\mbox{\boldmath$Q$}(F,L)-\mbox{\boldmath$H$}(F,L)y^{k}-\widehat{\mbox{\boldmath$Y$}}^{k}_{1}-\widehat{\mbox{\boldmath$Y$}}^{k}_{2}=\mbox{\boldmath$O$},\\[3.0pt] \mbox{lb}^{k}=y^{k}+(1+|F^{c}|)\tau^{k},\ \tau^{k}=\min\{0,\mbox{the minimum eigenvalue of $\widehat{\mbox{\boldmath$Y$}}^{k}_{1}$}\},\\[3.0pt] \mbox{lb}^{0}\leq\mbox{lb}^{k}\leq\mbox{lb}^{k+1}\rightarrow y^{*}\leftarrow\mbox{ub}^{k+1}\leq\mbox{ub}^{k}\leq\mbox{ub}^{0},\\[3.0pt] \mbox{\boldmath$X$}=\widehat{\mbox{\boldmath$X$}}^{k}\ \mbox{is a feasible solution of~{}DNN${}^{p}_{\lambda}(F,L)$},\ \langle\mbox{\boldmath$Q$}(F,L),\,\widehat{\mbox{\boldmath$X$}}^{k}\rangle=\mbox{ub}^{k}.\end{array}\right\} (18)

We see from these relations that for sufficiently large kk, 𝑿=𝑿^k\mbox{\boldmath$X$}=\widehat{\mbox{\boldmath$X$}}^{k} and (y,𝒀1,𝒀2)=(yk,𝒀^1k,𝒀^2k)(y,\mbox{\boldmath$Y$}_{1},\mbox{\boldmath$Y$}_{2})=(y^{k},\widehat{\mbox{\boldmath$Y$}}_{1}^{k},\widehat{\mbox{\boldmath$Y$}}_{2}^{k}) provides approximate optimal solutions of DNN(F,L)λp{}^{p}_{\lambda}(F,L) and DNN(F,L)λd{}^{d}_{\lambda}(F,L), respectively. Hence 𝑿=𝑿^k\mbox{\boldmath$X$}=\widehat{\mbox{\boldmath$X$}}^{k} is an approximate optimal solution of  DNN(F,L)p{}^{p}(F,L). See [10] for more details.

It is natural to use a stopping criterion for the NB method as

(a) |ubklbk|<ϵmax{|lbk|,|ubk|,1}|\mbox{ub}^{k}-\mbox{lb}^{k}|<\epsilon\max\{|\mbox{lb}^{k}|,|\mbox{ub}^{k}|,1\}, where ϵ\epsilon denotes a sufficiently small positive number.

When the NB method is implemented in the B&B method for QAPs with integer optimal values, an accurate approximation of yy^{*} does not have to be computed all the time by the NB method. We can terminate the iteration in case when

(b) ζ^lbk\hat{\zeta}\leq\lceil\mbox{lb}^{k}\rceil or

(c) ubk<ζ^\mbox{ub}^{k}<\hat{\zeta},

where ζ^\hat{\zeta} denotes an integer incumbent objective value of the root node QAP. If (b) occurs, then we know that the sub QAP, QAP¯\overline{\mbox{QAP}}(F,L)(F,L) to which the NB method has applied, does not have any feasible solution whose objective value is smaller than the incumbent objective value ζ^\hat{\zeta}; hence QAP¯\overline{\mbox{QAP}}(F,L)(F,L) can be pruned. If case (c) occurs, then yubk<ζ^y^{*}\leq\mbox{ub}^{k}<\hat{\zeta}. Thus, branching QAP¯\overline{\mbox{QAP}}(F,L)(F,L) further is necessary even if the NB iteration continues. By employing these two additional criteria (b) and (c), we can save a lot of computational time for the NB method. This is a distinctive advantage of using the NB method in the B&B method.

4 Upper bounding procedures

Let (F,L)Πr×Πr(F,L)\in\Pi_{r}\times\Pi_{r} for some r{0,1,,n}r\in\{0,1,\ldots,n\}. An approximate optimal solution of QAP(F,L)(F,L) needs to be computed for the upper bounding procedure. A simple rounding of an approximate optimal solution of its Lagrangian relaxation DNN(F,L)λp{}^{p}_{\lambda}(F,L) and the tabu search method [21, 22] for QAP(F,L)(F,L) have been used for the computation. We plan to replace this method by the one presented below in this section to effectively utilize information from DNN(F,L)λp{}^{p}_{\lambda}(F,L). In the remainder of this section, we assume F=L=F=L=\emptyset so that QAP(F,L)(F,L) coincides with the root node QAP (6), for simplicity of notation. All the discussions, however, can be easily adapted to any sub QAP, QAP(F,L)(F,L).

Suppose that we have applied the NB method to the primal-dual pair of DNN problems DNN(,)λp{}^{p}_{\lambda}(\emptyset,\emptyset) and DNN(,)λd{}^{d}_{\lambda}(\emptyset,\emptyset) and obtained an approximate optimal solution 𝑿^\widehat{\mbox{\boldmath$X$}} of DNN(,)λp{}^{p}_{\lambda}(\emptyset,\emptyset). The set of feasible solutions 𝑿X of DNN(,)λp{}^{p}_{\lambda}(\emptyset,\emptyset) may be regarded as a doubly nonnegative relaxation of the set of 𝒖𝒖T\mbox{\boldmath$u$}\mbox{\boldmath$u$}^{T} with feasible solutions 𝒖u of QAP (6), and the first column of 𝑿^\widehat{\mbox{\boldmath$X$}} corresponds to a feasible (or approximate optimal) solution u0𝒖=𝒖u_{0}\mbox{\boldmath$u$}=\mbox{\boldmath$u$} of QAP (6). We know that 𝑼=[uik]\mbox{\boldmath$U$}=[u_{ik}] forms a permutation matrix for every feasible solution 𝒖u of QAP (6). Through preliminary numerical experiment, we have observed that the n×nn\times n matrix

𝑼^\displaystyle\widehat{\mbox{\boldmath$U$}} =\displaystyle= (X^11,0X^12,0X^1n,0X^21,0X^22,0X^2n,0X^n1,0X^n2,0X^nn,0)\displaystyle\begin{pmatrix}\widehat{X}_{11,0}&\widehat{X}_{12,0}&\ldots&\widehat{X}_{1n,0}\\ \widehat{X}_{21,0}&\widehat{X}_{22,0}&\ldots&\widehat{X}_{2n,0}\\ \cdots&\cdots&\cdots&\cdots\\ \widehat{X}_{n1,0}&\widehat{X}_{n2,0}&\ldots&\widehat{X}_{nn,0}\end{pmatrix}

approximately forms a doubly-stochastic matrix. A popular approach for recovering a feasible solution from 𝑼^\widehat{\mbox{\boldmath$U$}} is through appropriate rounding procedures. Here we propose a method to compute a permutation matrix 𝑼=𝑼¯\mbox{\boldmath$U$}=\overline{\mbox{\boldmath$U$}} which minimize the distance 𝑼𝑼^\parallel\mbox{\boldmath$U$}-\widehat{\mbox{\boldmath$U$}}\parallel among all permutation matrices 𝑼U:

ζ¯\displaystyle\bar{\zeta} =\displaystyle= min{𝑼𝑼^2: 𝑼n×n is a permutation matrix }\displaystyle\min\left\{\parallel\mbox{\boldmath$U$}-\widehat{\mbox{\boldmath$U$}}\parallel^{2}:\mbox{ $\mbox{\boldmath$U$}\in\mathbb{R}^{n\times n}$ is a permutation matrix }\right\}
=\displaystyle= min{2𝑼^,𝑼+𝑼^2+n2:𝑼n×n is a permutation matrix }\displaystyle\min\left\{-2\langle\widehat{\mbox{\boldmath$U$}},\,\mbox{\boldmath$U$}\rangle+\parallel\widehat{\mbox{\boldmath$U$}}\parallel^{2}+n^{2}:\mbox{$\mbox{\boldmath$U$}\in\mathbb{R}^{n\times n}$ is a permutation matrix }\right\}
(since 𝑼2=n2 for every permutation matrix 𝑼n×n).\displaystyle\mbox{(since $\parallel\mbox{\boldmath$U$}\parallel^{2}=n^{2}$ for every permutation matrix $\mbox{\boldmath$U$}\in\mathbb{R}^{n\times n}$)}.
=\displaystyle= min{2𝑼^,𝑼:𝑼n×n is a permutation matrix }+𝑼^2+n2,\displaystyle\min\left\{\langle-2\widehat{\mbox{\boldmath$U$}},\,\mbox{\boldmath$U$}\rangle:\mbox{$\mbox{\boldmath$U$}\in\mathbb{R}^{n\times n}$ is a permutation matrix }\right\}+\parallel\widehat{\mbox{\boldmath$U$}}\parallel^{2}+n^{2},

where \parallel\cdot\parallel denotes the Frobenius norm. This problem forms a linear assignment problem, which can be solved the well-known Hungarian method [12].

5 Branching rules

Let (F,L)Πr×Πr(F,L)\in\Pi_{r}\times\Pi_{r} for some r{0,1,,n}r\in\{0,1,\ldots,n\}. As we have briefly mentioned in Section 2.3, if node QAP(F,L)(F,L) is not pruned, we select either an f+Fcf_{+}\in F^{c} to branch QAP(F,L)(F,L) into |Lc||L^{c}| child nodes QAP((F,f+),(L,))((F,f_{+}),(L,\ell)) (Lc)(\ell\in L^{c}) or an +Lc\ell_{+}\in L^{c} to branch QAP(F,L)(F,L) into |Fc||F^{c}| child nodes QAP((F,f),(L,+))((F,f),(L,\ell_{+})) (fFc)(f\in F^{c}).

We present three types of rules for selecting an f+Fcf_{+}\in F^{c} or an +Lc\ell_{+}\in L^{c}. In each rule, an evaluation function φ(f,)\varphi(f,\ell) is introduced for possible child nodes QAP((F,f),(L,))((F,f),(L,\ell)) ((f,)Fc×Lc)((f,\ell)\in F^{c}\times L^{c}) so that φ(f,)\varphi(f,\ell) can represent the optimal values of QAP((F,f),(L,))((F,f),(L,\ell)) or its Lagrangian DNN relaxation problem DNN((F,f),(L,))λp{}^{p}_{\lambda}((F,f),(L,\ell)). Then, the following functions are defined:

φ¯(f,)\displaystyle\bar{\varphi}(f,\cdot) =\displaystyle= 1|Lc|Lcφ(f,)(the mean of φ(f,) over Lc(fFc),\displaystyle\frac{1}{|L^{c}|}\sum_{\ell\in L^{c}}\varphi(f,\ell)\ \ \mbox{(the mean of $\varphi(f,\ell)$ over $\ell\in L^{c}$) }(f\in F^{c}),
φ¯(,)\displaystyle\bar{\varphi}(\cdot,\ell) =\displaystyle= 1|Fc|fFcφ(f,)(the mean of φ(f,) over fFc(Lc).\displaystyle\frac{1}{|F^{c}|}\sum_{f\in F^{c}}\varphi(f,\ell)\ \ \mbox{(the mean of $\varphi(f,\ell)$ over $f\in F^{c}$) }(\ell\in L^{c}).

Our method is to choose f+=arg max{φ¯(f,):fFc}f_{+}=\mbox{arg max}\{\bar{\varphi}(f,\cdot):f\in F^{c}\} if max{φ¯(f,):fFc}max{φ¯(,):Lc}\mbox{max}\{\bar{\varphi}(f,\cdot):f\in F^{c}\}\geq\mbox{max}\{\bar{\varphi}(\cdot,\ell):\ell\in L^{c}\}, and +=arg max{φ¯(,):Lc}\ell_{+}=\mbox{arg max}\{\bar{\varphi}(\cdot,\ell):\ell\in L^{c}\} otherwise. Each rule employs a different function φ(f,)\varphi(f,\ell) from others. We describe how φ(f,)\varphi(f,\ell) is defined with a fixed ((f,)Fc×Lc)((f,\ell)\in F^{c}\times L^{c}) for the three rules in Sections 5.1, 5.2 and 5.2, respectively.

5.1 Branching rule M: a rule using the mean of all feasible objective values of Q(F,L)(F,L) ((F,L)Πr×Πr,r=0,1,,n)((F,L)\in\Pi_{r}\times\Pi_{r},\ r=0,1,\ldots,n)

Define φ(f,)\varphi(f,\ell) to be the mean of the objective values of QAP((F,f),(L,))((F,f),(L,\ell)) over its feasible solutions, which can be computed by substituting x0=1x_{0}=1 and xij=1/(nr1)x_{ij}=1/(n-r-1) ((i,j)(F,f)c×(L,)c)((i,j)\in(F,f)^{c}\times(L,\ell)^{c}) into the objective function 𝑸((F,f),(L,)),𝒙𝒙T\langle\mbox{\boldmath$Q$}((F,f),(L,\ell)),\,\mbox{\boldmath$x$}\mbox{\boldmath$x$}^{T}\rangle of QAP((F,f),(L,))((F,f),(L,\ell)). This branching rule M is employed in the current parallel implementation of the B&B  method on the UG framework.

5.2 Branching rule P: a rule using an approximate optimimal solution 𝑿^\widehat{\mbox{\boldmath$X$}} of DNN(F,L)λp{}^{p}_{\lambda}(F,L)

Let 𝑿^\widehat{\mbox{\boldmath$X$}} be an approximate optimal solution of DNN(F,L)λp{}^{p}_{\lambda}(F,L). Then, 𝑿^\widehat{\mbox{\boldmath$X$}} may be regarded as an approximate optimal solution of DNN(F,L)p{}^{p}(F,L) since DNN(F,L)λp{}^{p}_{\lambda}(F,L) is a Lagrangian relaxation of DNN(F,L)p{}^{p}(F,L) and its optimal value ηλp(F,L)\eta^{p}_{\lambda}(F,L) converges to ηp(F,L)\eta^{p}(F,L) of DNN(F,L)p{}^{p}(F,L) as λ\lambda\rightarrow\infty. We will construct a rough approximate optimal solution 𝑿^(f,)\widehat{\mbox{\boldmath$X$}}(f,\ell) of DNN((F,f),(L,))p{}^{p}((F,f),(L,\ell)) from 𝑿^\widehat{\mbox{\boldmath$X$}}, and define φ(f,)\varphi(f,\ell) to be the objective value of DNN((F,f),(L,))p{}^{p}((F,f),(L,\ell)) at 𝑿^(f,)\widehat{\mbox{\boldmath$X$}}(f,\ell), i.e., φ(f,)=𝑸((F,f),(L,)),𝑿^(f,)\varphi(f,\ell)=\langle\mbox{\boldmath$Q$}((F,f),(L,\ell)),\,\widehat{\mbox{\boldmath$X$}}(f,\ell)\rangle as described below.

We first project 𝑿^𝕊{0,Fc×Lc}\widehat{\mbox{\boldmath$X$}}\in\mathbb{S}^{\{0,F^{c}\times L^{c}\}} onto the linear space 𝕊{0,(F,f)c×(L,)c}\mathbb{S}^{\{0,(F,f)^{c}\times(L,\ell)^{c}\}} where the feasible region of QAP((F,f),(L,))p{}^{p}((F,f),(L,\ell)) lies. Let 𝑿(f,){\mbox{\boldmath$X$}}(f,\ell) denote the metric projection of 𝑿^\widehat{\mbox{\boldmath$X$}}, which is obtained by deleting the rows 𝑿^i.\widehat{\mbox{\boldmath$X$}}_{i.} (i{0,(F,f)c×(L,)c})(i\not\in\{0,(F,f)^{c}\times(L,\ell)^{c}\}) and the columns 𝑿^.j\widehat{\mbox{\boldmath$X$}}_{.j} (j{0,(F,f)c×(L,)c})(j\not\in\{0,(F,f)^{c}\times(L,\ell)^{c}\}) from 𝑿^\widehat{\mbox{\boldmath$X$}} simultaneously. We also know that every feasible solution 𝑿X of DNN((F,f),(L,))p{}^{p}((F,f),(L,\ell)) must lie in the linear manifold

M\displaystyle M =\displaystyle= {𝑿𝕊{0,(F,f)c×(L,)c}:X00=1,𝑨((F,f),(L,))vec(𝑿)=0,X0𝜶X𝜶𝜶=0(𝜶(F,f)c×(L,)c)}.\displaystyle\left\{\mbox{\boldmath$X$}\in\mathbb{S}^{\{0,(F,f)^{c}\times(L,\ell)^{c}\}}:\begin{array}[]{l}X_{00}=1,\ \mbox{\boldmath$A$}((F,f),(L,\ell))\mbox{\bf vec}(\mbox{\boldmath$X$})=\mbox{\bf 0},\\[3.0pt] X_{0\mbox{\scriptsize\boldmath$\alpha$}}-X_{\mbox{\scriptsize\boldmath$\alpha$}\mbox{\scriptsize\boldmath$\alpha$}}=0\ (\mbox{\boldmath$\alpha$}\in(F,f)^{c}\times(L,\ell)^{c})\end{array}\right\}.

Let 𝑿^(f,)\widehat{\mbox{\boldmath$X$}}(f,\ell) be the metric projection of 𝑿(f,){\mbox{\boldmath$X$}}(f,\ell) onto MM, and then define

φ(f,)=𝑸((F,f),(L,)),𝑿^(f,).\varphi(f,\ell)=\langle\mbox{\boldmath$Q$}((F,f),(L,\ell)),\,\widehat{\mbox{\boldmath$X$}}(f,\ell)\rangle.

We note that 𝑿^(f,)\widehat{\mbox{\boldmath$X$}}(f,\ell) is not necessarily in 𝕂1((F,f),(L,))=𝕊+{0,(F,f)c×(L,)c}\mathbb{K}_{1}((F,f),(L,\ell))=\mathbb{S}^{\{0,(F,f)^{c}\times(L,\ell)^{c}\}}_{+}; hence 𝑿^(f,)\widehat{\mbox{\boldmath$X$}}(f,\ell) is not necessarily a feasible solution of DNN((F,f),(L,))p{}^{p}((F,f),(L,\ell)) in general.

5.3 Branching rule D: a rule using an approximate optimal solution (y^,𝒀^1,𝒀^2)(\hat{y},\widehat{\mbox{\boldmath$Y$}}_{1},\widehat{\mbox{\boldmath$Y$}}_{2}) of DNN(F,L)λd{}^{d}_{\lambda}(F,L)

Recall that the NB method applied to the primal-dual pair of DNN(F,L)λp{}^{p}_{\lambda}(F,L) and DNN(F,L)λd{}^{d}_{\lambda}(F,L) generates

𝑿k𝕂1(F,L)𝕂2(F,L),(yk,𝒀1k,𝒀2k)×𝕊{0,F×L}×𝕂2(F,L),\displaystyle\mbox{\boldmath$X$}^{k}\in\mathbb{K}_{1}(F,L)\cap\mathbb{K}_{2}(F,L),\ (y^{k},\mbox{\boldmath$Y$}_{1}^{k},\mbox{\boldmath$Y$}_{2}^{k})\in\mathbb{R}\times\mathbb{S}^{\{0,F\times L\}}\times\mathbb{K}_{2}(F,L)^{*},
lbk,ubk, 0τk\displaystyle\mbox{lb}^{k}\in\mathbb{R},\ \mbox{ub}^{k}\in\mathbb{R},\ 0\geq\tau^{k}\in\mathbb{R}

(k=1,2,)(k=1,2,\ldots) satisfying (18). We assume that the method terminates with the stopping criterion (c) ubk<ζ^\mbox{ub}^{k}<\hat{\zeta}, which requires to branch the node QAP(F,L)(F,L).

Define the (1+|Fc×Lc|)×(1+|(F,f)c×(L,)c|)(1+|F^{c}\times L^{c}|)\times(1+|(F,f)^{c}\times(L,\ell)^{c}|) matrix 𝑷(f,)\mbox{\boldmath$P$}(f,\ell) by

P(f,)𝜶𝜷\displaystyle P(f,\ell)_{\mbox{\scriptsize\boldmath$\alpha$}\mbox{\scriptsize\boldmath$\beta$}} =\displaystyle= {1if 𝜶=0and 𝜷=0,1if 𝜶=(f,) and 𝜷=0,1if 𝜶=𝜷(F,f)c×(L,)c,0otherwise .\displaystyle\left\{\begin{array}[]{ll}1&\mbox{if }\mbox{\boldmath$\alpha$}=0\ \mbox{and }\mbox{\boldmath$\beta$}=0,\\ 1&\mbox{if }\mbox{\boldmath$\alpha$}=(f,\ell)\ \mbox{ and }\mbox{\boldmath$\beta$}=0,\\ 1&\mbox{if }\mbox{\boldmath$\alpha$}=\mbox{\boldmath$\beta$}\in(F,f)^{c}\times(L,\ell)^{c},\\ 0&\mbox{otherwise }.\end{array}\right.

Then

𝑸((F,f),(L,))=𝑷(f,)T𝑸(F,L)𝑷(f,)𝕊{0,(F,f)×(L,)},\displaystyle\mbox{\boldmath$Q$}((F,f),(L,\ell))=\mbox{\boldmath$P$}(f,\ell)^{T}\mbox{\boldmath$Q$}(F,L)\mbox{\boldmath$P$}(f,\ell)\in\mathbb{S}^{\{0,(F,f)\times(L,\ell)\}},
𝑯((F,f),(L,))=𝑷(f,)T𝑯(F,L)𝑷(f,)𝕊{0,(F,f)×(L,)},\displaystyle\mbox{\boldmath$H$}((F,f),(L,\ell))=\mbox{\boldmath$P$}(f,\ell)^{T}\mbox{\boldmath$H$}(F,L)\mbox{\boldmath$P$}(f,\ell)\in\mathbb{S}^{\{0,(F,f)\times(L,\ell)\}},
𝒀^1(f,)𝑷(f,)T𝒀1k𝑷(f,)𝕊{0,(F,f)c×(L,)c},\displaystyle\widehat{\mbox{\boldmath$Y$}}_{1}(f,\ell)\equiv\mbox{\boldmath$P$}(f,\ell)^{T}\mbox{\boldmath$Y$}^{k}_{1}\mbox{\boldmath$P$}(f,\ell)\in\mathbb{S}^{\{0,(F,f)^{c}\times(L,\ell)^{c}\}},
𝒀^2(f,)𝑷(f,)T𝒀2k𝑷(f,)𝕂2(F,f),(L,))\displaystyle\widehat{\mbox{\boldmath$Y$}}_{2}(f,\ell)\equiv\mbox{\boldmath$P$}(f,\ell)^{T}\mbox{\boldmath$Y$}^{k}_{2}\mbox{\boldmath$P$}(f,\ell)\in\mathbb{K}_{2}(F,f),(L,\ell))^{*}

hold. We apply the above to the identity 𝑸(F,L)𝑯(F,L)yk𝒀1k𝒀2k=𝑶\mbox{\boldmath$Q$}(F,L)-\mbox{\boldmath$H$}(F,L)y^{k}-\mbox{\boldmath$Y$}^{k}_{1}-\mbox{\boldmath$Y$}^{k}_{2}=\mbox{\boldmath$O$} in (18). More precisely, multiplying 𝑷((F,f),(L,))T\mbox{\boldmath$P$}((F,f),(L,\ell))^{T} to each term of the identity on the left and 𝑷((F,f),(L,))\mbox{\boldmath$P$}((F,f),(L,\ell)) on the right results in

𝑶=𝑸((F,f),(L,))𝑯((F,f),(L,))yk𝒀^1(f,)𝒀^2(f,),\displaystyle\mbox{\boldmath$O$}\ =\ \mbox{\boldmath$Q$}((F,f),(L,\ell))-\mbox{\boldmath$H$}((F,f),(L,\ell))y^{k}-\widehat{\mbox{\boldmath$Y$}}_{1}(f,\ell)-\widehat{\mbox{\boldmath$Y$}}_{2}(f,\ell),
𝒀^1(f,)𝕊{0,(F,f)c×(L,)c},𝒀^2(f,)𝕂2(F,f),(L,)).\displaystyle\widehat{\mbox{\boldmath$Y$}}_{1}(f,\ell)\in\mathbb{S}^{\{0,(F,f)^{c}\times(L,\ell)^{c}\}},\ \widehat{\mbox{\boldmath$Y$}}_{2}(f,\ell)\in\mathbb{K}_{2}(F,f),(L,\ell))^{*}.

Now, define

τ(f,)\displaystyle\tau(f,\ell) =\displaystyle= min{0,the minimum eigenvalue of 𝒀^1(f,)},\displaystyle\min\{0,\mbox{the minimum eigenvalue of $\widehat{\mbox{\boldmath$Y$}}_{1}(f,\ell)$}\},
φ(f,)\displaystyle\varphi(f,\ell) =\displaystyle= yk+(1+|(F,f)c|)τ(f,)=yk+(nr)τ(f,).\displaystyle y^{k}+(1+|(F,f)^{c}|)\tau(f,\ell)\ =\ y^{k}+(n-r)\tau(f,\ell).

Then we can prove that lbkφ(f,)ηλp((F,f),(L,))\mbox{lb}^{k}\leq\varphi(f,\ell)\leq\eta^{p}_{\lambda}((F,f),(L,\ell)), which implies that φ(f,)\varphi(f,\ell) itself serves as a lower bound of the optimal value ζ((F,f),(L,))\zeta((F,f),(L,\ell)) of the child node QAP((F,f),(L,))((F,f),(L,\ell)). Thus, if ζ^φ(f,)\hat{\zeta}\leq\varphi(f,\ell) holds, then we can terminate QAP((F,f),(L,))((F,f),(L,\ell)) without applying the NB method to QAP((F,f),(L,))((F,f),(L,\ell)). This is a distinct advantage of the branching rule D, although the bound φ(f,)\varphi(f,\ell) may not be as strong as ηλp((F,f),(L,))\eta^{p}_{\lambda}((F,f),(L,\ell)).

6 Preliminary numerical results

We have implemented two types of codes for the parallel B&B method to solve QAPs. The first one is written in MATLAB, which has been developed to see how effectively and efficiently the NB method works in the branch-and-bound framework. The code used there for the NB method is modifications of the ones written for the MATLAB software package NewtBracket [11], which was released very recently. Specifically, the stopping criteria of the NB method were changed as presented in the last paragraph of Section 3.3. All techniques presented in Sections 2, 3 and 5 have been incorporated in the MATLAB code, but not the upper bounding procedure described in Section 4. The MATLAB code works in parallel but can solve only small-sized QAP instances as shown in Section 6.1.

The second code is a C++ implementation of the MATLAB code. Based on the information and knowledge obtained from preliminary numerical experiment by the MATLAB code, we have been developing the C++ code running on the UG, a generic framework to parallelize branch-and-bound based solvers. An application of the UG can be developed on PC, in which communication is carried out through shared memory [20]. After recompiling the application code with additional small amount of MPI related code for transferring objects, the parallel solver could potentially run with more than 100,000 cores [18, 23].

Two types of processes exist when running the parallelized QAP solver by UG on distributed memory environment. First, there is a single LoadCoordinator, which makes all decisions concerning the dynamic load balancing and distributes subproblems of QAP instances to be solved. Second, all other processes are Solvers that solve the distributed subproblem by regarding it as root node. The UG tries to solve all open subproblems on the single branch-and-bound search tree in parallel as much as it can by using all available Solvers. More precisely, the UG executes dynamic load balancing so that subproblems of the branch-and-bound search tree are transferred to the other Solvers when an idle Solver exists. In order to reduce the idle time of the Solvers, LoadCoordinator tries to keep a small number of open subproblems (the initial number can be specified by a runtime parameter and it is changed dynamically during the computation) so that it can assign a subproblem at the earliest time when an idle Solver exists.

For ramp-up, a phase until all cores become busy (see [15]), several methods exist. In our case of the parallelized QAP solver, we used the normal ramp-up, which performs the normal parallel branch-and-bound procedure as mentioned above during the ramp-up phase [20]; See also Section 7.1. Until now, we have succeeded to solve tai30a, tai35b, tai40b and sko42 from QAPLIB [7] as reported in Section 6.2. But the C++ code needs further improvements for larger scale QAP instances.

6.1 Small scale instances

The main features of the parallel B&B method implemented in the MATLAB code can be summarized as follows:

  • The enumeration tree described in Section 2.3.

  • The Lagrangian DNN relaxation of sub QAPs and a modified version of NewtBracket [11]. See Section 3.

  • A simple rounding of approximate optimal solutions of DNN(F,L)λp{}^{p}_{\lambda}(F,L) ((F,L)Πr×Πr,r{0,1,,n})((F,L)\in\Pi_{r}\times\Pi_{r},\ r\in\{0,1,\ldots,n\}) for the upper bounding procedure.

  • The three types of branching rules, M, P and D, which are described in Sections 5.1, 5.2 and 5.3, respectively.

  • A parallel breadth first search with processing each node QAP(F,L)(F,L) to compute ζl(F,L)\zeta^{l}(F,L) and ζu(F,L)\zeta^{u}(F,L) by one core.

The optimal values of all small QAP instances solved are known. The initial incumbent objective value is set to be the optimal value + 1, and it is updated to the optimal value which is found by the upper bound procedure at some iteration. With this setting, a high quality incumbent objective value close to the optimal value is known at the beginning of the B&B method. As a result, the role of the upper bounding procedure becomes less important, and the breadth first search is reasonable for parallel computation.

All the computations for numerical results reported in Table 1, 2 and 3 were performed using MATLAB 2020b on iMac Pro with Intel Xeon W CPU (3.2 GHZ), 8 cores and 128 GB memory.

Numerical results on nug20 \sim nug28, tai20a \sim tai30b and bur20a \sim bur20c are shown in Tables 1, 2 and 3, respectively. We observe that:

  • The number of nodes (=the number of sub QAPs processed) greatly depends on the branching rules M, P and D. Overall, the rule D performs better than the others.

  • In particular, for bur26b and bur26c instances, the number of nodes generated with the branching rule D is much smaller than those generated by the others.

  • But, for the largest size QAP, tai30b, the branching rule M generates the smallest number of nodes.

We describe in detail on which branching rule is preferable for a given larger QAP instance in Section 7.2.

Table 1: Numerical results on small scale QAP instances — 1. Br : Branching rules presented in Section 5.
QAP No. of nodes Total execution Time(sec) for computing No. of CPU
instance Opt.val Br generated time(sec) in para. LB UB Br cores used
M 757 6.78e2 4.21e3 3.13e1 3.38e0
nug20 2,570 P 301 3.18e2 1.80e3 1.45e1 1.18e1 8
D 412 3.84e2 2.15e3 1.91e1 3.11e1
M 312 4.93e2 2.67e3 1.80e1 1.48e0
nug21 2,438 P 200 3.46e2 1.73e3 9.72e0 1.04e1 8
D 272 3.73e2 1.75e3 1.33e1 2.59e1
M 429 7.16e2 4.43e3 2.55e1 3.37e0
nug22 3,596 P 250 4.33e2 1.88e3 1.27e1 1.80e1 8
D 149 4.10e2 2.08e3 9.96e0 2.36e1
M 1,413 3.34e3 2.24e4 9.30e1 2.16e1
nug24 3,488 P 742 1.99e3 1.28e4 5.02e1 1.06e2 8
D 913 2.57e3 1.46e4 7.39e1 2.46e2
M 8,446 2.00e4 1.49e5 5.60e2 1.53e2
nug25 3,744 P 3,004 8.37e3 5.75e4 2.14e2 5.57e2 8
D 3,805 8.97e3 6.46e4 3.41e2 1.29e3
M 2,810 1.48e4 1.06e5 4.33e2 1.27e2
nug27 5,234 P 979 5.26e3 3.29e4 1.32e2 2.78e2 8
D 2,170 9.83e3 6.91e4 3.90e2 1.41e3
M 10,977 5.58e4 4.31e5 2.27e3 5.79e2
nug28 5,166 P 4,236 2.25e4 1.68e5 8.40e2 1.57e3 8
D 9,977 4.55e4 3.37e5 2.15e3 7.83e3
Table 2: Numerical results on small scale QAP instances — 2. Br : Branching rules presented in Section 5.
QAP No. of nodes Total execution Time(sec) for computing No. of CPU
instance Opt.val Br generated time(sec) in para. LB UB Br cores used
M 2,444 1.49e3 1.04e4 1.06e2 7.72e-1
tai20a 703,482 P 2,107 1.42e3 9.18e3 9.23e1 7.30e1 8
D 1,600 1.17e3 8.30e3 7.31e1 1.02e2
M 183 2.03e2 2.34e2 1.13e0 3.12e-1
tai20b 122,455,319 P 183 2.01e2 2.32e2 1.19e0 2.22e0 8
D 146 2.40e2 2.38e2 1.23e0 6.11e0
M 367 9.68e2 2.76e3 1.18e1 2.54e-1
tai25b 344,355,646 P 367 1.08e3 3.20e3 1.31e1 2.41e1 8
D 367 1.08e3 2.90e3 1.26e1 6.21e1
M 989 1.11e4 7.03e4 3.87e2 6.83e1
tai30b 637,117,113 P 1,654 1.87e4 1.06e5 3.97e2 6.26e2 8
D 1,150 1.80e4 9.74e4 4.01e2 1.51e3
Table 3: Numerical results on small scale QAP instances — 3. Br : Branching rules presented in Section 5.
QAP No. of nodes Total execution Time(sec) for computing No. of CPU
instance Opt.val Br generated time(sec) in para. LB UB Br cores used
M 2,976 4.21e3 2.03e4 2.46e1 2.13e1
bur26a 5,426,670 P 11,401 1.70e4 9.24e4 7.69e1 4.00e2 8
D 2,358 3.35e3 1.52e4 1.80e1 1.81e2
M 144,228 4.62e4 2.46e5 1.02e3 1.32e2
bur26b 3,817,852 P More than More than 8
544,719 3 days
D 8,625 5.15e3 2.51e4 7.49e1 2.42e2
M 16,990 6.66e3 3.18e4 1.20e2 2.12e1
bur26c 5,426,795 P 66,428 8.84e4 5.08e5 3.82e2 1.78e3 8
D 2,416 3.06e3 9.90e3 1.71e1 9.59e1

6.2 Challenging large scale instances

We solved challenging large scale instances, nug30, tai30a, tai35b, tai40b and sko42 on the ISM (Institute of Statistical Mathematics) supercomputer HPE SGI 8600, which is a liquid cooled, tray-based, high-density clustered computer system. The ISM supercomputer has 384 computing nodes and each node has two Intel Xeon Gold 6154 3.0GHz CPUs (36 cores) and 384GB of memory. The LoadCoordinator process is assigned to a CPU core and each Solver process is assigned to a CPU core. Therefore, the number of Solvers is less than the number of cores by one. All of the instances were solved as a single job, though check-pointing procedures were performed every 30 minutes (see [19] about check-pointing and restart mechanism). Table 4 shows the computational results. Note that tai30a and sko42 were solved to the optimality for the first time.

Table 4: Numerical results on challenging large scale QAP instances. Branching rule used was M.
QAP No. of nodes Total execution No. of CPU
instance Opt.val generated time(sec) in para. cores used
nug30 6,124 26,181 3.14e3 1,728
tai30a 1,818,146 34,000,579 5.81e5 1,728
tai35b 283,315,445 2,620,547 2.49e5 1,728
tai40b 637,250,948 278,465 1.05e5 1,728
sko42 15,812 6,019,419 5.12e5 5,184

7 Some additional techniques for further developments

7.1 A complete enumeration tree generated by the simple branching rule M

A distinctive feature of the branching rule M presented in Section 5.1 is the independence from any lower bounding and upper bounding procedures. Therefore, we could create a complete enumeration tree using this rule in advance to start the B&B method. We usually start with the single root node associated with the original problem to be solved even if it is implemented in parallel.

At the beginning of the B&B method with the normal ramp-up, only one Solver is used, and all others are idle till it finishes processing the original problem at the root node. As the B&B method proceeds, the number of Solvers to solve subproblems increases gradually. In the early stage of the parallel execution of such a B&B method for large scale problems on a large number of Solvers, many of the Solvers are idle, and it would take a while till all Solvers start running. This is clearly inefficient.

If we use branching rule M for a large scale QAP instance, we can start the B&B method at any level of the enumeration tree. For example, consider a QAP instance with n=50n=50. Then we can create 50×49×48=117,60050\times 49\times 48=117,600 sub QAPs with dimension 47 at the fourth level in the enumeration tree using branching rules M. We can start the B&B method from these sub QAPs. We will investigate on the performance of this idea.

7.2 Simple estimation of the number of nodes in the enumeration tree

We have proposed 33 types of branching rules M, P and D in Section 5, and observed that the number of nodes generated depends not only on instances but also the branching rule employed. A branching rule sometimes results in much fewer nodes than other rules as observed in bur26a, bur26b and bur26c instances in Table 3. When the B&B method is applied to larger scale QAP instances, selecting a good branching rule becomes a more critical issue. For the selection, we describe a simple sampling method to estimate the number of nodes at each depth (level) of the enumeration tree under a certain uniformity assumption which is necessary for the simplicity of the method and also sufficient for a rough estimation.

Let G(𝒩,)(\mbox{$\cal N$},\mbox{$\cal E$}) denote the enumeration tree to be constructed by applying the B&B method to QAP (6). Recall that each node of 𝒩\cal N is identified as a sub QAP, QAP(F,L)(F,L) for some (F,L)Πr×Πr(r=0,1,,n)(F,L)\in\Pi_{r}\times\Pi_{r}\ (r=0,1,\ldots,n). The node set 𝒩\cal N is subdivided according to the depth of the enumeration tree G(𝒩,)(\mbox{$\cal N$},\mbox{$\cal E$}), depending on their location: 𝒩r=𝒩(Πr×Πr)\mbox{$\cal N$}_{r}=\mbox{$\cal N$}\cap\big{(}\Pi_{r}\times\Pi_{r}\big{)} (r=0,,n)(r=0,\ldots,n). Let mr=𝒩rm_{r}=\ \|\mbox{$\cal N$}_{r}\| (r=0,1,,n)(r=0,1,\ldots,n), which is estimated by the sampling method described below.

Let r=0r=0. Obviously m0=1m_{0}=1 since QAP(,)(\emptyset,\emptyset) is the unique element of 𝒩0\mbox{$\cal N$}_{0}, which corresponds to the root node of the enumeration tree. Hence m^0=m0=1\hat{m}_{0}=m_{0}=1. With the application of the lower bounding and upper bounding procedures to QAP(,)(\emptyset,\emptyset), it will become clear whether the B&B method terminates or proceeds to the branching procedure for generating nr=nn-r=n child nodes of QAP(,)(\emptyset,\emptyset). In the former case, m^q=mq=0\hat{m}_{q}=m_{q}=0 (q=1.,n)(q=1.\ldots,n) and we are done. For the latter case, we have m1=nm_{1}=n active nodes which forms 𝒩1\mbox{$\cal N$}_{1} and m^1=m1=n\hat{m}_{1}=m_{1}=n. If the lower bounding and upper bounding procedures are applied to all of the m^1=n\hat{m}_{1}=n nodes, then the exact m2m_{2} can be obtained; hence m^2=m2\hat{m}_{2}=m_{2}. Instead, we choose 0<s1<m^10<s_{1}<\hat{m}_{1} nodes randomly from the m^1\hat{m}_{1} nodes, and apply the procedures to those s1s_{1} nodes to estimate m2m_{2}. As a result, some of the s1s_{1} nodes are terminated. Let t1t_{1} denote the number of the nodes that remained active among the s1s_{1} nodes. Applying the branching procedure to those t1t_{1} nodes provides (n1)t1(n-1)t_{1} nodes (included in 𝒩2\mbox{$\cal N$}_{2}) as their child nodes. Thus, each node of the s1s_{1} nodes chosen randomly from the m^1=m1\hat{m}_{1}=m_{1} nodes have generated (n1)t1/s1(n-1)t_{1}/s_{1} child nodes on average. Consequently, the number m2m_{2} of the child nodes from the m^1\hat{m}_{1} nodes is estimated by m^2=m^1×(n1)t1/s1\hat{m}_{2}=\hat{m}_{1}\times(n-1)t_{1}/s_{1}.

Assume that m^q\hat{m}_{q} (q=0,1,,r)(q=0,1,\ldots,r) have been computed. We describe how to compute m^r+1\hat{m}_{r+1} at the rrth iteration of the sampling method. If tr1t_{r-1} attained 0, i.e., there are no active nodes from previous iteration, m^r\hat{m}_{r} was set to zero, so that we set m^r+1=0\hat{m}_{r+1}=0. Now we assume on the contrary that 0<tr10<t_{r-1} nodes remained active, and (n(r1))tr1(n-(r-1))t_{r-1} nodes included in 𝒩r\mbox{$\cal N$}_{r} were generated in the previous iteration. To compute an estimation m^r+1\hat{m}_{r+1} of mr+1m_{r+1}, choose srs_{r} nodes randomly from the (n(r1))tr1(n-(r-1))t_{r-1} nodes, and apply the lower bounding and upper bounding procedures to them. Suppose that there exist trt_{r} active nodes. Then the branching procedure applied to them yields (nr)tr(n-r)t_{r} nodes in 𝒩r+1\mbox{$\cal N$}_{r+1}. Therefore we set m^r+1=m^r(nr)tr/sr\hat{m}_{r+1}=\hat{m}_{r}(n-r)t_{r}/s_{r}.

Table 5: Estimation on the number of nodes in the enumeration tree for QAP instances from QAPLIB. ’Estimated’ == r=0nm^r\sum_{r=0}^{n}\hat{m}_{r}. ’Generated’ == r=0nmr\sum_{r=0}^{n}m_{r}. All (nr)!(n-r)! feasible solutions of QAP(F,L)(F,L) were enumerated to compute its exact optimal value when |Fc|=|Lc|=nr=7|F^{c}|=|L^{c}|=n-r=7 holds; hence m^r=mr=0\hat{m}_{r}=m_{r}=0 for r=n6,n5,,nr=n-6,n-5,\ldots,n. The bold blue font indicates the smallest estimate among the three branching rules M, P and D. - : not computed. (c) — solved at ISM by the C++ code. (m) — solved by the MATLAB code.
Branching Rule (The number of nodes)
M P D
Problem Estimated Generated Estimated Generated Estimated Generated
bur26a 3,038 2,976 15,495 11,401 2,404 2,358(m)
bur26b 276,511 144,228 1,106,174 More than 7,206 8,635(m)
544,719
bur26c 18,114 16,990 102,053 66,428 2,609 2,416(m)
bur26d 119,212 - 7,583,919 - 5,473 17,495(m)
bur26e 8,156 - 15,042 - 1,621 1,621(m)
bur26f 432,056 - 6,530,633 - 4,742 6,536(m)
bur26g 32,935 - 18,023 - 3,430 3,265(m)
bur26h 33,963 - 617,498 - 3,645 4,211(m)
nug25 3239 8446(c) 2952 3004(m) 3,614 3,805(m)
nug27 2660 2610(c) 746 979(m) 2,272 2,170(m)
nug28 7,369 11,228(c) 3,208 4,236(m) 10,128 9,977(m)
nug30 12,419 26,297(c) 10,905 - 10,401 -
tai30a 42,472,865 34,000,579(c) 19,280,014 - 14,999,704 -
tai30b 611 989(m) 611 1,654(m) 611 1,150(m)
tai35a 2.1e11 - 3.9e10 - 1.3e10 -
tai35b 5,840,218 2,620,547(c) 4,618,822 - 360,772 -
tai40b 212,954 278,465(c) 61,065 - 71,903,758 -
tai50b 54,547,664 - 1.4e9 - 2.1e11 -
tho40 7.1e11 - 37,382,192 - 97,834,698 -
sko42 3,575,067 6,019,419(c) 3,071,294 - 2,764,606 -
sko49 1.6e8 - 1.3e9 - 1.2e9 -
wil50 32,963,810 - 20,523,849 - 39,330,160 -

Table 5 shows the estimation of the total number of nodes for some instances. We observe that

(i) (the total number of nodes generated in the numerical experiment)
                                              \leq 3×3\ \times (the estimate of the total number of nodes).

(ii) The branching rule P and D are expected to generate less nodes than the branching rule M except for tai50b and sko49.

(iii) tai50b (using M), tho40 (P) and wil50 (P) could be solved.

(iv) tai35a is too difficult to solve using any of M, P and D.

(v) We may challenge sko49 (M).

References

  • [1] K. Anstreicher, N. Brixius, Goux J-P., Linderoth, and J. Solving large quadratic assignment problems on computational grids. Math. Program., 91:563–588, 2002.
  • [2] N. Arima, S. Kim, M. Kojima, and K. C. Toh. Lagrangian-conic relaxations, Part I: A unified framework and its applications to quadratic optimization problems. Pacific J. Optim., 14(1):161–192, 2018.
  • [3] J. Clausen and M. Perregaard. Solving large quadratic assignment problems in parallel. Comput. Optim. Appl., 8:111–127, 1997.
  • [4] D. T. Connolly. An improved annealing scheme for the QAP. Eur. J. Oper. Res., 46:93–100, 1990.
  • [5] L. M. Gambardella, E. D. Taillard, and M. Dorigo. Ant colonies for the QAP. Technical Report IDSIA-4-97, IDSIA, Lugano, Switzerland, 1997.
  • [6] A. D. Goncalves, A. A. Pessoa, L. M. de A. Drummond, C. Bentes, and R. Farias. Solving the quadratic assignment problem on heterogeneous environment (CPUs and GPUs) with the application of level 2 reformulation and linearization technique. Technical Report arXiv:1510.02065v1, 2015.
  • [7] P. Hahn and M. Anjos. QAPLIB – A quadratic assignment problem library. http://www.seas.upenn.edu/qaplib.
  • [8] P. Hahn, A. Roth, and M. Saltzman, M. abd Guignard. Memory-aware parallelized rlt3 for solving quadratic assignment problems. Technical Report Optimization Online, 2013.
  • [9] N. Ito, S. Kim, M. Kojima, A. Takeda, and K.C. Toh. BBCPOP: A sparse doubly nonnegative relaxation of polynomial optimization problems with binary, box and complementarity constraints. ACM Trans. Math. Soft., 45((3) 34), 2019.
  • [10] S. Kim, M. Kojima, and K.C. Toh. A Newton-bracketing method for a simple conic optimization problem. To appear in Optim. Methods and Softw.
  • [11] S. Kim, M. Kojima, and K.C. Toh. User manual of NewtBracket: ”A Newton-Bracketing method for a simple conic optimization problem” with aopplications to QOPs in binary variables. https://sites.google.com/site/masakazukojima1/softwares-developed/newtbracket, November 2020.
  • [12] H. W. Kuhn. The hungarian method for the assignment problem. Naval research logistics quarterly, 2:83–97, 1955.
  • [13] D.E. Oliveira, H. Wolkowicz, and Y. Xu. ADMM for the sdp relaxation of the QAP. Math. Program. Comput., 10:631–658, 2018.
  • [14] P. M. Pardalos, K. G. Ramakrishnan, M. G. C. Resende, and Y. Li. Implementation of a variance reduction-based lower bound in a branch-and-bound algorithm for the quadratic assignment problem. SIAM J. Optim., 7:281–294, 1997.
  • [15] Ted Ralphs, Yuji Shinano, Timo Berthold, and Thorsten Koch. Parallel Solvers for Mixed Integer Linear Optimization, pages 283–336. Springer International Publishing, Cham, 2018.
  • [16] C. Roucairol. A parallel branch and bound algorithm for the quadra tic assignment problem. Discret. Appl. Math., 18:211–255, 1987.
  • [17] H. D. Sherali and C. H. Tuncbilek. A global optimization algorithm for polynomial programming problems using a Reformulation-Linearization Technique. J. Global Optim., 2:101–112, 1992.
  • [18] Y. Shinano, T. Achterberg, T. Berthold, S. Heinz, and T. Koch. Solving open mip instances with parascip on supercomputers using up to 80,000 cores. 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 770–779. IEEE, 2016.
  • [19] Y. Shinano, T. Achterberg, T. Berthold, S. Heinz, T. Koch, and M. Winkler. Solving hard MIPLIB2003 problems with ParaSCIP on supercomputers: An update. In 2014 IEEE International Parallel Distributed Processing Symposium Workshops, pages 1552–1561, 2014.
  • [20] Yuji Shinano, Stefan Heinz, Stefan Vigerske, and Michael Winkler. FiberSCIP—a shared memory parallelization of SCIP. INFORMS Journal on Computing, 30(1):11–30, 2018.
  • [21] J. Skorin-Kapov. Tabu search applied to the quadratic assignment problem. ORSA J. Comput., 2:33–45, 1990.
  • [22] E. Tailard. Robust taboo search for the quadratic assignment problem. Parallel Comput., 17:443–455, 1991.
  • [23] N. Tateiwa, Y. Shinano, S. Nakamura, A. Yoshida, M. Yasuda, S. Kaji, and K. Fujisawa. Massive parallelization for finding shortest lattice vectors based on ubiquity generator framework. In 2020 SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), pages 834–848, Los Alamitos, CA, USA, nov 2020. IEEE Computer Society.