Solving Challenging Large Scale QAPs
Abstract
We report our progress on the project for solving larger scale quadratic assignment problems (QAPs). Our main approach to solve large scale NP-hard combinatorial optimization problems such as QAPs is a parallel branch-and-bound method efficiently implemented on a powerful computer system using the Ubiquity Generator (UG) framework that can utilize more than 100,000 cores. Lower bounding procedures incorporated in the branch-and-bound method play a crucial role in solving the problems. For a strong lower bounding procedure, we employ the Lagrangian doubly nonnegative (DNN) relaxation and the Newton-bracketing method developed by the authors’ group. In this report, we describe some basic tools used in the project including the lower bounding procedure and branching rules, and present some preliminary numerical results.
Our next target problem is QAPs with dimension at least 50, as we have succeeded to solve tai30a and sko42 from QAPLIB for the first time.
1 Introduction
For a positive integer , we let represent a set of locations and also a set of facilities. Given symmetric matrices , and an matrix , the quadratic assignment problem (QAP) is stated as
(1) |
where denotes the flow between facilities and , is the distance between locations and , the fixed cost of assigning facility to location , and a permutation of such that if facility is assigned to location .
The QAP is known as NP-hard in theory, and solving QAP instances of size larger than 35 in practice still remains challenging. Various heuristic methods for the QAP such as tabu search [21, 22], genetic method [5] and simulated annealing [4] have been developed. Those methods frequently attain near-optimal solutions, which often also happen to be the exact optimal solutions. The exactness is, however, not guaranteed in general.
Most existing methods for finding the exact solutions of QAPs are designed with the branch-and-bound (B&B) method [1, 3, 6, 8, 14, 16]. As its name indicates, branching and (lower) bounding are the main procedures of the method. The bounding based on doubly nonnegative (DNN) relaxations has recently attracted a great deal of attention as it provides tight bounds. Among the methods for solving large scale DNN relaxation problems, we mention four types of methods that can be applied to such DNN problems.
First two of the methods are SDPNAL+ [YST2015] (a majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints) and BBCPOP [9] (a bisection and projection method for Binary, Box and Complementary constrained Polynomial Optimization Problems). Some numerical results on these two methods applied to QAP instances with dimensions to from QAPLIB [7] were reported in [9]. where BBCPOP attained tighter lower bounds for many of instances with dimensions to in less execution time. In addition, new and improved lower bounds were computed by Mittelmann using BBCPOP for the unknown minimum values of larger scale QAP instances including tai100a and tai100b. See QAPLIB [7]. The third method is an alternating direction method of multipliers (ADMM) proposed by Oliveira et al. [13] in combination with the facial reduction for robustness. The fourth method is the Newton-bracketing (NB) method [2, 10], which was developed to further improve the lower bounds obtained from BBCPOP by incorporating the Newton method into BBCPOP.
Numerical results on the NB method and BBCPOP are given in Table 2 of [10], and ones on ADMM in Table 7 of [13]. Lower bounds for the QAP instances sko42, sko49, sko56, sko64, sko72, sko81, tai60a, and tai80a were commonly computed by these methods. While the lower bounds obtained by the NB method for the first three instances are not smaller than those by ADMM and the difference is as big as , the NB method clearly attained the tightest lower bound among the three methods for the last five larger-scale instances with dimension .
RLT [17] is known as a powerful method for global optimization of polynomial programming problems. In their paper [6], Goncalves et al. implemented the level 2 RLT in their B&B method run on heterogeneous CPUs and GPUs environment. They solved tai35b and tai40b from QAPLIB for the first time.
The main motivation of our project is to challenge larger scale QAP instances from QAPLIB [7] that have not been solved yet. We implement the NB method [10] combined with the B&B method in the specialized Ubiquity Generator (UG) framework [ShinanoAchterbergBertholdHeinzKoch2012] to find the exact solutions of large scale QAP instances.
The NB method has the following advantages:
-
(a) The NB method generates a sequence of intervals such that each contains the lower bound to be computed and both and converge monotonically to as . Hence, if becomes larger than the incumbent objective value at some iteration , then follows. Consequently, the node can be pruned. If occurs at some iteration , then branching the node should be determined as holds. In these two cases, the NB method can terminate at iteration before the interval attains an accurate lower bound. This feature of the NB method works effectively to reduce a considerable amount of the computational time.
-
(b) The lower bounds computed by the NB method (and also the lower bounds computed by BBCPOP) are reliable or valid in the sense that they are guaranteed by very accurate dual feasible solutions of the Lagrangian DNN relaxation problem (see [10, Section 2.3].) It would be plausible to incorporate the technique used there for ADMM’s lower bound, but the resulting lower bound would deteriorate.
The UG is a generic framework to parallelize branch-and-bound based solvers and has achieved large-scale MPI parallelism with 80,000 cores [18]. It has also been generalized to handle non-branch-and-bound based solvers. Its experimental implementation for solving Shortest Vector Problem (SVP) handles more than 100,000 cores [23] .
The aim of this article is to present fundamentals of our joint project for solving larger scale QAPs, report the progress, and discuss future plans to further develop the methods. We mention that tai35b, tai40b, tai30a and sko42 from QAPLIB [7] have been successfully solved by our method, where the last two instances could be solved for the first time. QAPs with dimension at least 50 are our next target problems.
In Section 2, we first reformulate QAP (1) to QAP (6), which is a binary quadratic optimization problem form in -dimensional vector variable . Then we introduce a class of its sub QAPs and describe the enumeration tree of those subproblems used in the branching process. In Section 3, a simple Lagrangian DNN relaxation problem of a sub QAP and the NB method to solve the relaxation problem are presented for the lower bounding procedure. In Section 4, a method to retrieve a feasible solution of (6) from an approximate optimal solution of the Lagrangian DNN relaxation problem is described for the upper bounding procedure. We present three types branching rules in Section 5, and preliminary numerical results in Section 6. Some additional techniques are proposed for future development in Section 7.
Notation and Symbols
Let denote the -dimensional Euclidean space of column vectors , the nonnegative orthant of , the linear space of symmetric matrices with the inner product , and the cone of positive semidefinite matrices in . For every matrix , denotes the th column vector of , and vec the -dimensional column vector converted from the matrix , i.e., vec. denotes the transposition of . Throughout the paper, we use for the size of QAP (1), and for the set of facilities and the set of locations. When is used, each -dimensional column vector is represented component-wisely as , where forms an matrix and . Each -dimensional symmetric matrix has elements .
2 A class of sub QAPs and the enumeration tree
2.1 Conversion of QAP (1) into a QOP with a binary vector variable
Let be an matrix variable and a -dimensional column vector variable, where
Define a constant matrix
where denotes the Kronecker product. Then, the QAP can be expressed as
(6) |
2.2 A class of subproblems of QAP (6)
We describe subproblems of QAP (6) with a family of subsequences with . More precisely, let denote the family of subsequences of distinct elements of . We assume . corresponds to the family of permutations of , and each (or ) with corresponds to a permutation of distinct elements (or ) of . For simplicity of notation, is frequently regarded as a subset of when and its complement are described with respect to .
For each , let
-dimensional Euclidean space of column vectors | ||||
the linear space of symmetric | ||||
Then, for each , a subproblem of QAP (6) for assigning each facility to each location can be written as
for some matrix , which is given by , where
the matrix with elements | ||||
Obviously, coincides with the identity matrix in and QAP corresponds to QAP (6).
2.3 Enumeration tree
A fundamental step in the B&B method is to successively construct an enumeration tree. Each node of the enumeration tree corresponds to a subproblem of QAP (6), QAP for some with . Thus, each node is identified with a subproblem. To initialize the tree, we set QAP (i.e., QAP (6)) for the root node and compute the incumbent objective value by applying a heuristic method to QAP (6).
Suppose that we have generated a node QAP, where and for some (or ). Then, we compute a lower bound (see Section 3) and an upper bound (see Section 4), which is associated with a feasible solution of QAP, for the (unknown) optimal value ; . The incumbent objective value is then updated by .
When occurs, the (unknown) optimal value of QAP is greater than or equal to the incumbent objective value . In this case, the node is pruned (or terminated). Otherwise, branching the node into child nodes takes place as follows. We select either facility or location . If is selected, then child nodes are generated. Otherwise, child nodes are generated. A branching rule determines which or to select. In Section 5, we describe three types of branching rules in detail.
For any QAP possibly located at the bottom level (or the th level), we have and . Thus, QAP becomes a trivial QAP with every facility assigned to some location and vice versa. It is trivial to compute its optimal value . We note that the height of the enumeration tree is at most . Numerically, we can enumerate all feasible solutions of QAP to compute its exact optimal value when holds for a small enough . In the numerical results reported in Section 6, we took .
3 Lower bounding procedure
Throughout this section, we fix , and present a method to compute a lower bound for the optimal objective value of QAP. Before deriving a doubly nonnegative (DNN) relaxation of QAP in Section 3.1 and a Lagrangian DNN relaxation of QAP in Section 3.2, we first transform QAP into described below.
We note that each homogeneous linear equality constraint of QAP can be written as for some . Let . Then, the linear equality is equivalent to . Similarly, we can rewrite each homogeneous linear constraint as for some -dimensional row vector . Hence, the entire linear constraints
can be expressed as , where denotes the matrix consisting of the row vectors and .
Define
the matrix in with at the th element | ||||
By construction, the constraint is equivalent to if and , and for every . Therefore, we can rewrite QAP given in Section 2.2 as
3.1 A DNN relaxation of
In , the constraint requires the rank of the variable matrix to be . By removing this constraint, we obtain
which serves as a DNN relaxation of with .
3.2 A Lagrangian DNN relaxation problem of
We know that
since each row vector of is of the form for some . Hence, the constraint in DNN can be replaced by
where denotes the vector of ’s, and the second equality holds by the construction of . Applying the Lagrangian relaxation to the resulting problem, we obtain
and its dual
Here denotes a Lagrangian multiplier, and
DNN serves as a Lagrangian relaxation of DNN and a Lagrangian DNN relaxation of with for any . We can prove that converges monotonically to as . We also see by the duality theorem ([2, Lemma 2.3]) that .
3.3 The Newton-bracketing method for solving the primal-dual pair of DNN and DNN
To describe the NB method for solving DNN and DNN, is fixed to a sufficiently large positive number, say . The optimal value of DNN is denoted by throughout this subsection.
Let . The NB method applied to DNN and DNN generates
such that
(18) |
We see from these relations that for sufficiently large , and provides approximate optimal solutions of DNN and DNN, respectively. Hence is an approximate optimal solution of DNN. See [10] for more details.
It is natural to use a stopping criterion for the NB method as
-
(a) , where denotes a sufficiently small positive number.
When the NB method is implemented in the B&B method for QAPs with integer optimal values, an accurate approximation of does not have to be computed all the time by the NB method. We can terminate the iteration in case when
-
(b) or
-
(c) ,
where denotes an integer incumbent objective value of the root node QAP. If (b) occurs, then we know that the sub QAP, to which the NB method has applied, does not have any feasible solution whose objective value is smaller than the incumbent objective value ; hence can be pruned. If case (c) occurs, then . Thus, branching further is necessary even if the NB iteration continues. By employing these two additional criteria (b) and (c), we can save a lot of computational time for the NB method. This is a distinctive advantage of using the NB method in the B&B method.
4 Upper bounding procedures
Let for some . An approximate optimal solution of QAP needs to be computed for the upper bounding procedure. A simple rounding of an approximate optimal solution of its Lagrangian relaxation DNN and the tabu search method [21, 22] for QAP have been used for the computation. We plan to replace this method by the one presented below in this section to effectively utilize information from DNN. In the remainder of this section, we assume so that QAP coincides with the root node QAP (6), for simplicity of notation. All the discussions, however, can be easily adapted to any sub QAP, QAP.
Suppose that we have applied the NB method to the primal-dual pair of DNN problems DNN and DNN and obtained an approximate optimal solution of DNN. The set of feasible solutions of DNN may be regarded as a doubly nonnegative relaxation of the set of with feasible solutions of QAP (6), and the first column of corresponds to a feasible (or approximate optimal) solution of QAP (6). We know that forms a permutation matrix for every feasible solution of QAP (6). Through preliminary numerical experiment, we have observed that the matrix
approximately forms a doubly-stochastic matrix. A popular approach for recovering a feasible solution from is through appropriate rounding procedures. Here we propose a method to compute a permutation matrix which minimize the distance among all permutation matrices :
where denotes the Frobenius norm. This problem forms a linear assignment problem, which can be solved the well-known Hungarian method [12].
5 Branching rules
Let for some . As we have briefly mentioned in Section 2.3, if node QAP is not pruned, we select either an to branch QAP into child nodes QAP or an to branch QAP into child nodes QAP .
We present three types of rules for selecting an or an . In each rule, an evaluation function is introduced for possible child nodes QAP so that can represent the optimal values of QAP or its Lagrangian DNN relaxation problem DNN. Then, the following functions are defined:
Our method is to choose if , and otherwise. Each rule employs a different function from others. We describe how is defined with a fixed for the three rules in Sections 5.1, 5.2 and 5.2, respectively.
5.1 Branching rule M: a rule using the mean of all feasible objective values of Q
Define to be the mean of the objective values of QAP over its feasible solutions, which can be computed by substituting and into the objective function of QAP. This branching rule M is employed in the current parallel implementation of the B&B method on the UG framework.
5.2 Branching rule P: a rule using an approximate optimimal solution of DNN
Let be an approximate optimal solution of DNN. Then, may be regarded as an approximate optimal solution of DNN since DNN is a Lagrangian relaxation of DNN and its optimal value converges to of DNN as . We will construct a rough approximate optimal solution of DNN from , and define to be the objective value of DNN at , i.e., as described below.
We first project onto the linear space where the feasible region of QAP lies. Let denote the metric projection of , which is obtained by deleting the rows and the columns from simultaneously. We also know that every feasible solution of DNN must lie in the linear manifold
Let be the metric projection of onto , and then define
We note that is not necessarily in ; hence is not necessarily a feasible solution of DNN in general.
5.3 Branching rule D: a rule using an approximate optimal solution of DNN
Recall that the NB method applied to the primal-dual pair of DNN and DNN generates
satisfying (18). We assume that the method terminates with the stopping criterion (c) , which requires to branch the node QAP.
Define the matrix by
Then
hold. We apply the above to the identity in (18). More precisely, multiplying to each term of the identity on the left and on the right results in
Now, define
Then we can prove that , which implies that itself serves as a lower bound of the optimal value of the child node QAP. Thus, if holds, then we can terminate QAP without applying the NB method to QAP. This is a distinct advantage of the branching rule D, although the bound may not be as strong as .
6 Preliminary numerical results
We have implemented two types of codes for the parallel B&B method to solve QAPs. The first one is written in MATLAB, which has been developed to see how effectively and efficiently the NB method works in the branch-and-bound framework. The code used there for the NB method is modifications of the ones written for the MATLAB software package NewtBracket [11], which was released very recently. Specifically, the stopping criteria of the NB method were changed as presented in the last paragraph of Section 3.3. All techniques presented in Sections 2, 3 and 5 have been incorporated in the MATLAB code, but not the upper bounding procedure described in Section 4. The MATLAB code works in parallel but can solve only small-sized QAP instances as shown in Section 6.1.
The second code is a C++ implementation of the MATLAB code. Based on the information and knowledge obtained from preliminary numerical experiment by the MATLAB code, we have been developing the C++ code running on the UG, a generic framework to parallelize branch-and-bound based solvers. An application of the UG can be developed on PC, in which communication is carried out through shared memory [20]. After recompiling the application code with additional small amount of MPI related code for transferring objects, the parallel solver could potentially run with more than 100,000 cores [18, 23].
Two types of processes exist when running the parallelized QAP solver by UG on distributed memory environment. First, there is a single LoadCoordinator, which makes all decisions concerning the dynamic load balancing and distributes subproblems of QAP instances to be solved. Second, all other processes are Solvers that solve the distributed subproblem by regarding it as root node. The UG tries to solve all open subproblems on the single branch-and-bound search tree in parallel as much as it can by using all available Solvers. More precisely, the UG executes dynamic load balancing so that subproblems of the branch-and-bound search tree are transferred to the other Solvers when an idle Solver exists. In order to reduce the idle time of the Solvers, LoadCoordinator tries to keep a small number of open subproblems (the initial number can be specified by a runtime parameter and it is changed dynamically during the computation) so that it can assign a subproblem at the earliest time when an idle Solver exists.
For ramp-up, a phase until all cores become busy (see [15]), several methods exist. In our case of the parallelized QAP solver, we used the normal ramp-up, which performs the normal parallel branch-and-bound procedure as mentioned above during the ramp-up phase [20]; See also Section 7.1. Until now, we have succeeded to solve tai30a, tai35b, tai40b and sko42 from QAPLIB [7] as reported in Section 6.2. But the C++ code needs further improvements for larger scale QAP instances.
6.1 Small scale instances
The main features of the parallel B&B method implemented in the MATLAB code can be summarized as follows:
-
•
The enumeration tree described in Section 2.3.
- •
-
•
A simple rounding of approximate optimal solutions of DNN for the upper bounding procedure.
- •
-
•
A parallel breadth first search with processing each node QAP to compute and by one core.
The optimal values of all small QAP instances solved are known. The initial incumbent objective value is set to be the optimal value + 1, and it is updated to the optimal value which is found by the upper bound procedure at some iteration. With this setting, a high quality incumbent objective value close to the optimal value is known at the beginning of the B&B method. As a result, the role of the upper bounding procedure becomes less important, and the breadth first search is reasonable for parallel computation.
All the computations for numerical results reported in Table 1, 2 and 3 were performed using MATLAB 2020b on iMac Pro with Intel Xeon W CPU (3.2 GHZ), 8 cores and 128 GB memory.
Numerical results on nug20 nug28, tai20a tai30b and bur20a bur20c are shown in Tables 1, 2 and 3, respectively. We observe that:
-
•
The number of nodes (=the number of sub QAPs processed) greatly depends on the branching rules M, P and D. Overall, the rule D performs better than the others.
-
•
In particular, for bur26b and bur26c instances, the number of nodes generated with the branching rule D is much smaller than those generated by the others.
-
•
But, for the largest size QAP, tai30b, the branching rule M generates the smallest number of nodes.
We describe in detail on which branching rule is preferable for a given larger QAP instance in Section 7.2.
QAP | No. of nodes | Total execution | Time(sec) for computing | No. of CPU | ||||
instance | Opt.val | Br | generated | time(sec) in para. | LB | UB | Br | cores used |
M | 757 | 6.78e2 | 4.21e3 | 3.13e1 | 3.38e0 | |||
nug20 | 2,570 | P | 301 | 3.18e2 | 1.80e3 | 1.45e1 | 1.18e1 | 8 |
D | 412 | 3.84e2 | 2.15e3 | 1.91e1 | 3.11e1 | |||
M | 312 | 4.93e2 | 2.67e3 | 1.80e1 | 1.48e0 | |||
nug21 | 2,438 | P | 200 | 3.46e2 | 1.73e3 | 9.72e0 | 1.04e1 | 8 |
D | 272 | 3.73e2 | 1.75e3 | 1.33e1 | 2.59e1 | |||
M | 429 | 7.16e2 | 4.43e3 | 2.55e1 | 3.37e0 | |||
nug22 | 3,596 | P | 250 | 4.33e2 | 1.88e3 | 1.27e1 | 1.80e1 | 8 |
D | 149 | 4.10e2 | 2.08e3 | 9.96e0 | 2.36e1 | |||
M | 1,413 | 3.34e3 | 2.24e4 | 9.30e1 | 2.16e1 | |||
nug24 | 3,488 | P | 742 | 1.99e3 | 1.28e4 | 5.02e1 | 1.06e2 | 8 |
D | 913 | 2.57e3 | 1.46e4 | 7.39e1 | 2.46e2 | |||
M | 8,446 | 2.00e4 | 1.49e5 | 5.60e2 | 1.53e2 | |||
nug25 | 3,744 | P | 3,004 | 8.37e3 | 5.75e4 | 2.14e2 | 5.57e2 | 8 |
D | 3,805 | 8.97e3 | 6.46e4 | 3.41e2 | 1.29e3 | |||
M | 2,810 | 1.48e4 | 1.06e5 | 4.33e2 | 1.27e2 | |||
nug27 | 5,234 | P | 979 | 5.26e3 | 3.29e4 | 1.32e2 | 2.78e2 | 8 |
D | 2,170 | 9.83e3 | 6.91e4 | 3.90e2 | 1.41e3 | |||
M | 10,977 | 5.58e4 | 4.31e5 | 2.27e3 | 5.79e2 | |||
nug28 | 5,166 | P | 4,236 | 2.25e4 | 1.68e5 | 8.40e2 | 1.57e3 | 8 |
D | 9,977 | 4.55e4 | 3.37e5 | 2.15e3 | 7.83e3 |
QAP | No. of nodes | Total execution | Time(sec) for computing | No. of CPU | ||||
instance | Opt.val | Br | generated | time(sec) in para. | LB | UB | Br | cores used |
M | 2,444 | 1.49e3 | 1.04e4 | 1.06e2 | 7.72e-1 | |||
tai20a | 703,482 | P | 2,107 | 1.42e3 | 9.18e3 | 9.23e1 | 7.30e1 | 8 |
D | 1,600 | 1.17e3 | 8.30e3 | 7.31e1 | 1.02e2 | |||
M | 183 | 2.03e2 | 2.34e2 | 1.13e0 | 3.12e-1 | |||
tai20b | 122,455,319 | P | 183 | 2.01e2 | 2.32e2 | 1.19e0 | 2.22e0 | 8 |
D | 146 | 2.40e2 | 2.38e2 | 1.23e0 | 6.11e0 | |||
M | 367 | 9.68e2 | 2.76e3 | 1.18e1 | 2.54e-1 | |||
tai25b | 344,355,646 | P | 367 | 1.08e3 | 3.20e3 | 1.31e1 | 2.41e1 | 8 |
D | 367 | 1.08e3 | 2.90e3 | 1.26e1 | 6.21e1 | |||
M | 989 | 1.11e4 | 7.03e4 | 3.87e2 | 6.83e1 | |||
tai30b | 637,117,113 | P | 1,654 | 1.87e4 | 1.06e5 | 3.97e2 | 6.26e2 | 8 |
D | 1,150 | 1.80e4 | 9.74e4 | 4.01e2 | 1.51e3 |
QAP | No. of nodes | Total execution | Time(sec) for computing | No. of CPU | ||||
instance | Opt.val | Br | generated | time(sec) in para. | LB | UB | Br | cores used |
M | 2,976 | 4.21e3 | 2.03e4 | 2.46e1 | 2.13e1 | |||
bur26a | 5,426,670 | P | 11,401 | 1.70e4 | 9.24e4 | 7.69e1 | 4.00e2 | 8 |
D | 2,358 | 3.35e3 | 1.52e4 | 1.80e1 | 1.81e2 | |||
M | 144,228 | 4.62e4 | 2.46e5 | 1.02e3 | 1.32e2 | |||
bur26b | 3,817,852 | P | More than | More than | 8 | |||
544,719 | 3 days | |||||||
D | 8,625 | 5.15e3 | 2.51e4 | 7.49e1 | 2.42e2 | |||
M | 16,990 | 6.66e3 | 3.18e4 | 1.20e2 | 2.12e1 | |||
bur26c | 5,426,795 | P | 66,428 | 8.84e4 | 5.08e5 | 3.82e2 | 1.78e3 | 8 |
D | 2,416 | 3.06e3 | 9.90e3 | 1.71e1 | 9.59e1 |
6.2 Challenging large scale instances
We solved challenging large scale instances, nug30, tai30a, tai35b, tai40b and sko42 on the ISM (Institute of Statistical Mathematics) supercomputer HPE SGI 8600, which is a liquid cooled, tray-based, high-density clustered computer system. The ISM supercomputer has 384 computing nodes and each node has two Intel Xeon Gold 6154 3.0GHz CPUs (36 cores) and 384GB of memory. The LoadCoordinator process is assigned to a CPU core and each Solver process is assigned to a CPU core. Therefore, the number of Solvers is less than the number of cores by one. All of the instances were solved as a single job, though check-pointing procedures were performed every 30 minutes (see [19] about check-pointing and restart mechanism). Table 4 shows the computational results. Note that tai30a and sko42 were solved to the optimality for the first time.
QAP | No. of nodes | Total execution | No. of CPU | |
---|---|---|---|---|
instance | Opt.val | generated | time(sec) in para. | cores used |
nug30 | 6,124 | 26,181 | 3.14e3 | 1,728 |
tai30a | 1,818,146 | 34,000,579 | 5.81e5 | 1,728 |
tai35b | 283,315,445 | 2,620,547 | 2.49e5 | 1,728 |
tai40b | 637,250,948 | 278,465 | 1.05e5 | 1,728 |
sko42 | 15,812 | 6,019,419 | 5.12e5 | 5,184 |
7 Some additional techniques for further developments
7.1 A complete enumeration tree generated by the simple branching rule M
A distinctive feature of the branching rule M presented in Section 5.1 is the independence from any lower bounding and upper bounding procedures. Therefore, we could create a complete enumeration tree using this rule in advance to start the B&B method. We usually start with the single root node associated with the original problem to be solved even if it is implemented in parallel.
At the beginning of the B&B method with the normal ramp-up, only one Solver is used, and all others are idle till it finishes processing the original problem at the root node. As the B&B method proceeds, the number of Solvers to solve subproblems increases gradually. In the early stage of the parallel execution of such a B&B method for large scale problems on a large number of Solvers, many of the Solvers are idle, and it would take a while till all Solvers start running. This is clearly inefficient.
If we use branching rule M for a
large scale QAP instance, we can start the B&B method at any level of the enumeration
tree. For example, consider a QAP instance with . Then we can
create sub QAPs with dimension 47 at the fourth
level in the enumeration tree using branching rules M. We can start the B&B method
from these sub QAPs. We will investigate on the performance of this idea.
7.2 Simple estimation of the number of nodes in the enumeration tree
We have proposed types of branching rules M, P and D in Section 5, and observed that the number of nodes generated depends not only on instances but also the branching rule employed. A branching rule sometimes results in much fewer nodes than other rules as observed in bur26a, bur26b and bur26c instances in Table 3. When the B&B method is applied to larger scale QAP instances, selecting a good branching rule becomes a more critical issue. For the selection, we describe a simple sampling method to estimate the number of nodes at each depth (level) of the enumeration tree under a certain uniformity assumption which is necessary for the simplicity of the method and also sufficient for a rough estimation.
Let G denote the enumeration tree to be constructed by applying the B&B method to QAP (6). Recall that each node of is identified as a sub QAP, QAP for some . The node set is subdivided according to the depth of the enumeration tree G, depending on their location: . Let , which is estimated by the sampling method described below.
Let . Obviously since QAP is the unique element of , which corresponds to the root node of the enumeration tree. Hence . With the application of the lower bounding and upper bounding procedures to QAP, it will become clear whether the B&B method terminates or proceeds to the branching procedure for generating child nodes of QAP. In the former case, and we are done. For the latter case, we have active nodes which forms and . If the lower bounding and upper bounding procedures are applied to all of the nodes, then the exact can be obtained; hence . Instead, we choose nodes randomly from the nodes, and apply the procedures to those nodes to estimate . As a result, some of the nodes are terminated. Let denote the number of the nodes that remained active among the nodes. Applying the branching procedure to those nodes provides nodes (included in ) as their child nodes. Thus, each node of the nodes chosen randomly from the nodes have generated child nodes on average. Consequently, the number of the child nodes from the nodes is estimated by .
Assume that have been computed. We describe how to compute at the th iteration of the sampling method. If attained , i.e., there are no active nodes from previous iteration, was set to zero, so that we set . Now we assume on the contrary that nodes remained active, and nodes included in were generated in the previous iteration. To compute an estimation of , choose nodes randomly from the nodes, and apply the lower bounding and upper bounding procedures to them. Suppose that there exist active nodes. Then the branching procedure applied to them yields nodes in . Therefore we set .
Branching Rule (The number of nodes) | ||||||
M | P | D | ||||
Problem | Estimated | Generated | Estimated | Generated | Estimated | Generated |
bur26a | 3,038 | 2,976 | 15,495 | 11,401 | 2,404 | 2,358(m) |
bur26b | 276,511 | 144,228 | 1,106,174 | More than | 7,206 | 8,635(m) |
544,719 | ||||||
bur26c | 18,114 | 16,990 | 102,053 | 66,428 | 2,609 | 2,416(m) |
bur26d | 119,212 | - | 7,583,919 | - | 5,473 | 17,495(m) |
bur26e | 8,156 | - | 15,042 | - | 1,621 | 1,621(m) |
bur26f | 432,056 | - | 6,530,633 | - | 4,742 | 6,536(m) |
bur26g | 32,935 | - | 18,023 | - | 3,430 | 3,265(m) |
bur26h | 33,963 | - | 617,498 | - | 3,645 | 4,211(m) |
nug25 | 3239 | 8446(c) | 2952 | 3004(m) | 3,614 | 3,805(m) |
nug27 | 2660 | 2610(c) | 746 | 979(m) | 2,272 | 2,170(m) |
nug28 | 7,369 | 11,228(c) | 3,208 | 4,236(m) | 10,128 | 9,977(m) |
nug30 | 12,419 | 26,297(c) | 10,905 | - | 10,401 | - |
tai30a | 42,472,865 | 34,000,579(c) | 19,280,014 | - | 14,999,704 | - |
tai30b | 611 | 989(m) | 611 | 1,654(m) | 611 | 1,150(m) |
tai35a | 2.1e11 | - | 3.9e10 | - | 1.3e10 | - |
tai35b | 5,840,218 | 2,620,547(c) | 4,618,822 | - | 360,772 | - |
tai40b | 212,954 | 278,465(c) | 61,065 | - | 71,903,758 | - |
tai50b | 54,547,664 | - | 1.4e9 | - | 2.1e11 | - |
tho40 | 7.1e11 | - | 37,382,192 | - | 97,834,698 | - |
sko42 | 3,575,067 | 6,019,419(c) | 3,071,294 | - | 2,764,606 | - |
sko49 | 1.6e8 | - | 1.3e9 | - | 1.2e9 | - |
wil50 | 32,963,810 | - | 20,523,849 | - | 39,330,160 | - |
Table 5 shows the estimation of the total number of nodes for some instances. We observe that
-
(i) (the total number of nodes generated in the numerical experiment)
(the estimate of the total number of nodes). -
(ii) The branching rule P and D are expected to generate less nodes than the branching rule M except for tai50b and sko49.
-
(iii) tai50b (using M), tho40 (P) and wil50 (P) could be solved.
-
(iv) tai35a is too difficult to solve using any of M, P and D.
-
(v) We may challenge sko49 (M).
References
- [1] K. Anstreicher, N. Brixius, Goux J-P., Linderoth, and J. Solving large quadratic assignment problems on computational grids. Math. Program., 91:563–588, 2002.
- [2] N. Arima, S. Kim, M. Kojima, and K. C. Toh. Lagrangian-conic relaxations, Part I: A unified framework and its applications to quadratic optimization problems. Pacific J. Optim., 14(1):161–192, 2018.
- [3] J. Clausen and M. Perregaard. Solving large quadratic assignment problems in parallel. Comput. Optim. Appl., 8:111–127, 1997.
- [4] D. T. Connolly. An improved annealing scheme for the QAP. Eur. J. Oper. Res., 46:93–100, 1990.
- [5] L. M. Gambardella, E. D. Taillard, and M. Dorigo. Ant colonies for the QAP. Technical Report IDSIA-4-97, IDSIA, Lugano, Switzerland, 1997.
- [6] A. D. Goncalves, A. A. Pessoa, L. M. de A. Drummond, C. Bentes, and R. Farias. Solving the quadratic assignment problem on heterogeneous environment (CPUs and GPUs) with the application of level 2 reformulation and linearization technique. Technical Report arXiv:1510.02065v1, 2015.
- [7] P. Hahn and M. Anjos. QAPLIB – A quadratic assignment problem library. http://www.seas.upenn.edu/qaplib.
- [8] P. Hahn, A. Roth, and M. Saltzman, M. abd Guignard. Memory-aware parallelized rlt3 for solving quadratic assignment problems. Technical Report Optimization Online, 2013.
- [9] N. Ito, S. Kim, M. Kojima, A. Takeda, and K.C. Toh. BBCPOP: A sparse doubly nonnegative relaxation of polynomial optimization problems with binary, box and complementarity constraints. ACM Trans. Math. Soft., 45((3) 34), 2019.
- [10] S. Kim, M. Kojima, and K.C. Toh. A Newton-bracketing method for a simple conic optimization problem. To appear in Optim. Methods and Softw.
- [11] S. Kim, M. Kojima, and K.C. Toh. User manual of NewtBracket: ”A Newton-Bracketing method for a simple conic optimization problem” with aopplications to QOPs in binary variables. https://sites.google.com/site/masakazukojima1/softwares-developed/newtbracket, November 2020.
- [12] H. W. Kuhn. The hungarian method for the assignment problem. Naval research logistics quarterly, 2:83–97, 1955.
- [13] D.E. Oliveira, H. Wolkowicz, and Y. Xu. ADMM for the sdp relaxation of the QAP. Math. Program. Comput., 10:631–658, 2018.
- [14] P. M. Pardalos, K. G. Ramakrishnan, M. G. C. Resende, and Y. Li. Implementation of a variance reduction-based lower bound in a branch-and-bound algorithm for the quadratic assignment problem. SIAM J. Optim., 7:281–294, 1997.
- [15] Ted Ralphs, Yuji Shinano, Timo Berthold, and Thorsten Koch. Parallel Solvers for Mixed Integer Linear Optimization, pages 283–336. Springer International Publishing, Cham, 2018.
- [16] C. Roucairol. A parallel branch and bound algorithm for the quadra tic assignment problem. Discret. Appl. Math., 18:211–255, 1987.
- [17] H. D. Sherali and C. H. Tuncbilek. A global optimization algorithm for polynomial programming problems using a Reformulation-Linearization Technique. J. Global Optim., 2:101–112, 1992.
- [18] Y. Shinano, T. Achterberg, T. Berthold, S. Heinz, and T. Koch. Solving open mip instances with parascip on supercomputers using up to 80,000 cores. 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 770–779. IEEE, 2016.
- [19] Y. Shinano, T. Achterberg, T. Berthold, S. Heinz, T. Koch, and M. Winkler. Solving hard MIPLIB2003 problems with ParaSCIP on supercomputers: An update. In 2014 IEEE International Parallel Distributed Processing Symposium Workshops, pages 1552–1561, 2014.
- [20] Yuji Shinano, Stefan Heinz, Stefan Vigerske, and Michael Winkler. FiberSCIP—a shared memory parallelization of SCIP. INFORMS Journal on Computing, 30(1):11–30, 2018.
- [21] J. Skorin-Kapov. Tabu search applied to the quadratic assignment problem. ORSA J. Comput., 2:33–45, 1990.
- [22] E. Tailard. Robust taboo search for the quadratic assignment problem. Parallel Comput., 17:443–455, 1991.
- [23] N. Tateiwa, Y. Shinano, S. Nakamura, A. Yoshida, M. Yasuda, S. Kaji, and K. Fujisawa. Massive parallelization for finding shortest lattice vectors based on ubiquity generator framework. In 2020 SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), pages 834–848, Los Alamitos, CA, USA, nov 2020. IEEE Computer Society.