Memory-Efficient Approximation Algorithms for Max-k-Cut and Correlation Clustering
Abstract
Max-k-Cut and correlation clustering are fundamental graph partitioning problems. For a graph with $n$ vertices, the methods with the best approximation guarantees for Max-k-Cut and the Max-Agree variant of correlation clustering involve solving SDPs with $\Theta(n^2)$ constraints and variables. Large-scale instances of such SDPs thus present a memory bottleneck. In this paper, we develop simple polynomial-time Gaussian sampling-based algorithms for these two problems that use $O(n+|E|)$ memory and nearly achieve the best existing approximation guarantees. For dense graphs arriving in a stream, we eliminate the dependence on $|E|$ in the storage complexity, at the cost of a slightly worse approximation ratio, by combining our approach with sparsification.
1 Introduction
Semidefinite programs (SDPs) arise naturally as relaxations of a variety of problems such as $k$-means clustering [5], correlation clustering [6], and Max-k-Cut [14]. In each case, the decision variable is an $n \times n$ matrix and there are $\Theta(n^2)$ constraints. While reducing the memory bottleneck for large-scale SDPs has been studied quite extensively in the literature [9, 19, 11, 37], all these methods use memory that scales linearly with the number of constraints and also depends on either the rank of the optimal solution or an approximation parameter. A recent Gaussian sampling-based technique for generating a near-optimal, near-feasible solution to SDPs with a smooth objective function replaces the decision variable $X$ with a zero-mean random vector whose covariance is $X$ [28]. For an SDP with $d$ constraints, this method uses at most $O(n+d)$ memory, independent of the rank of the optimal solution. However, for SDPs with $\Theta(n^2)$ constraints, these algorithms still use $\Omega(n^2)$ memory and provide no advantage in storage reduction. In this paper, we show how to adapt the Gaussian sampling-based approach of [28] to generate approximate solutions with provable approximation guarantees to Max-k-Cut, and to the Max-Agree variant of correlation clustering, on a graph with arbitrary edge weights using only $O(n+|E|)$ memory.
1.1 Max-k-Cut
Max-k-Cut is the problem of partitioning the vertices of a weighted undirected graph $G = (V, E)$ into $k$ distinct parts such that the total weight of the edges across the parts is maximized. If $w_{ij}$ is the edge weight corresponding to edge $\{i,j\} \in E$, then the cut value of a partition is the sum of $w_{ij}$ over all edges $\{i,j\}$ whose endpoints lie in different parts. Consider the standard SDP relaxation of Max-k-Cut
$$\max_{X \succeq 0}\; \langle C, X\rangle \quad \text{subject to} \quad X_{ii} = 1 \;\;\forall i \in V, \qquad X_{ij} \ge -\tfrac{1}{k-1} \;\;\forall i \neq j, \tag{k-Cut-SDP}$$
where $C = \frac{k-1}{2k} L_G$ is a scaled Laplacian of the graph. Frieze and Jerrum [14] developed a randomized rounding scheme that takes an optimal solution $X^\star$ of (k-Cut-SDP) and produces a random partitioning satisfying
$$\mathbb{E}[\mathrm{CUT}] \;\ge\; \alpha_k \cdot \mathrm{CUT}^\star, \tag{1.1}$$
where $\mathrm{CUT}^\star$ is the optimal $k$-cut value and $\alpha_k := \inf_{-1/(k-1) \le \rho < 1} \frac{k\, p(\rho)}{(k-1)(1-\rho)}$, where $p(\rho)$ is the probability that vertices $i$ and $j$ are placed in different parts given that $X^\star_{ij} = \rho$. The rounding scheme proposed in [14], referred to as the FJ rounding scheme in the rest of the paper, generates $k$ i.i.d. samples $z_1, \dots, z_k \sim \mathcal{N}(0, X^\star)$, and assigns vertex $i$ to part $j$ if $(z_j)_i \ge (z_l)_i$ for all $l \neq j$.
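For concreteness, the following is a minimal sketch of the FJ rounding step at the level of Gaussian samples (Python/NumPy; the function and variable names are illustrative, not taken from the paper):

```python
import numpy as np

def fj_rounding(samples):
    """Frieze-Jerrum rounding from k i.i.d. Gaussian samples.

    samples: array of shape (k, n); row j is a sample z_j ~ N(0, X),
    where X is the (near-)optimal SDP solution.  Vertex i is assigned
    to the part j that maximises (z_j)_i."""
    return np.argmax(np.asarray(samples), axis=0)

def cut_value(edge_weights, parts):
    """Total weight of edges whose endpoints land in different parts.

    edge_weights: dict mapping an edge (u, v) to its weight w_uv."""
    return sum(w for (u, v), w in edge_weights.items() if parts[u] != parts[v])
```

In the low-memory setting of this paper, the rows of `samples` are produced directly by the Gaussian sampling procedure of Section 2, so the matrix $X^\star$ is never stored explicitly.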
1.2 Correlation clustering
In correlation clustering, we are given a set of vertices together with information indicating whether pairs of vertices are similar or dissimilar, modeled by edges in the sets $E^+$ and $E^-$ respectively. The Max-Agree variant of correlation clustering seeks to maximize the total weight of 'similar' edges whose endpoints are placed in the same cluster plus the total weight of 'dissimilar' edges whose endpoints are placed in different clusters.
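For concreteness, a minimal sketch of evaluating this objective for a candidate clustering (names and data layout are illustrative):

```python
def max_agree_value(similar, dissimilar, cluster):
    """Max-Agree objective: weight of 'similar' edges whose endpoints
    share a cluster plus weight of 'dissimilar' edges whose endpoints
    are separated.

    similar, dissimilar: dicts mapping an edge (u, v) to its weight.
    cluster: sequence giving the cluster index of each vertex."""
    agree_pos = sum(w for (u, v), w in similar.items() if cluster[u] == cluster[v])
    agree_neg = sum(w for (u, v), w in dissimilar.items() if cluster[u] != cluster[v])
    return agree_pos + agree_neg
```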
Define $G^+ = (V, E^+)$ and $G^- = (V, E^-)$ with edge weights $w^+$ and $w^-$ respectively. A natural SDP relaxation of Max-Agree [6] is
$$\max_{X \succeq 0}\; \langle C, X\rangle \quad \text{subject to} \quad X_{ii} = 1 \;\;\forall i \in V, \qquad X_{ij} \ge 0 \;\;\forall i \neq j, \tag{MA-SDP}$$
where $C = \frac{1}{2}\big(L_{G^-} + W^+\big)$, $L_{G^-}$ is the Laplacian of the graph $G^-$ and $W^+$ is the weighted adjacency matrix of the graph $G^+$. Charikar et al. [10] (see also Swamy [31]) propose a rounding scheme that takes an optimal solution $X^\star$ of (MA-SDP) and produces a random clustering satisfying
$$\mathbb{E}[\mathrm{AGREE}] \;\ge\; 0.766 \cdot \mathrm{AGREE}^\star, \tag{1.2}$$
where $\mathrm{AGREE}^\star$ is the optimal clustering value. The rounding scheme proposed in [10], referred to as the CGW rounding scheme in the rest of the paper, generates either two or three i.i.d. zero-mean Gaussian samples with covariance $X^\star$ and uses them to define clusters.
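For the two-sample variant used in our experiments in Section 6 (Swamy's 0.75-rounding), the clustering step can be sketched as follows; the sign-pattern construction is one standard instantiation of this rounding and the names are illustrative:

```python
import numpy as np

def two_sample_rounding(g1, g2):
    """Cluster vertices by the sign pattern of two Gaussian samples
    g1, g2 ~ N(0, X), giving at most four clusters (the two-sample
    rounding of Swamy [31] used in Section 6)."""
    g1, g2 = np.asarray(g1), np.asarray(g2)
    return 2 * (g1 >= 0).astype(int) + (g2 >= 0).astype(int)
```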
1.3 Contributions
We now summarize key contributions of the paper.
Gaussian sampling for Max-k-Cut. Applying the Gaussian sampling-based Frank-Wolfe method of [28] directly to (k-Cut-SDP) uses $\Omega(n^2)$ memory. We, however, show how to extend the approach of [28] to Max-k-Cut by proposing an alternate SDP relaxation of the problem and combining it with the FJ rounding scheme to generate a $k$-cut with nearly the same approximation guarantee as stated in (1.1) (see Proposition 1.1) using $O(n+|E|)$ memory. A key highlight of our approach is that the approximation ratio remains close to the state-of-the-art result in (1.1), degrading only by a factor of $(1-\epsilon)$ for a user-chosen $\epsilon > 0$, while the memory used is only $O(n+|E|)$. We summarize our result as follows.
Proposition 1.1.
For $\epsilon \in (0,1)$, our polynomial-time method outlined in Section 3 uses $O(n+|E|)$ memory and generates a $k$-cut of the graph whose expected value satisfies $\mathbb{E}[\mathrm{CUT}] \ge \alpha_k (1-\epsilon)\, \mathrm{CUT}^\star$, where $\mathrm{CUT}^\star$ is the optimal $k$-cut value.
Gaussian sampling for Max-Agree. The structure of (MA-SDP) is similar to that of (k-Cut-SDP); however, the cost matrix in (MA-SDP) is no longer PSD or diagonally dominant, a property that plays an important role in our analysis of Max-k-Cut. Despite this, we show how to generate a provably near-optimal clustering using $O(n+|E|)$ memory, where $E = E^+ \cup E^-$. Our approach makes a small sacrifice in the approximation ratio (as compared to (1.2)); however, the memory used remains $O(n+|E|)$.
Proposition 1.2.
For $\epsilon \in (0,1)$, our polynomial-time method outlined in Section 4 uses $O(n+|E|)$ memory and generates a clustering of the graph whose expected value satisfies $\mathbb{E}[\mathrm{AGREE}] \ge 0.766\,(1-\epsilon)\, \mathrm{AGREE}^\star$, where $\mathrm{AGREE}^\star$ is the optimal clustering value.
The constructive proof outline of Propositions 1.1 and 1.2 is given in Sections 3 and 4 respectively.
Memory reduction using graph sparsification. Propositions 1.1 and 1.2 state that the memory used by our approach is $O(n+|E|)$. For dense graphs, however, this becomes $\Omega(n^2)$. In this setting, to reduce the memory used, we first need to change the way we access the problem instance. We assume that the input (weighted) graph arrives edge-by-edge in a stream, eliminating the need to store the entire dense graph. We then replace it with an $\epsilon$-spectrally close graph (see Definition 5.1) with $O(n\log n/\epsilon^2)$ edges. Next, we compute an approximate solution to the new problem defined on the sparse graph using $O(n\log n/\epsilon^2)$ memory. For Max-k-Cut and Max-Agree, we show that this method generates a solution with provable approximation guarantees.
1.4 Literature review
We first review key low memory algorithms for linearly constrained SDPs.
Burer and Monteiro [9] proposed a nonlinear programming approach which replaces the PSD decision variable $X$ with its low-rank factorization $UU^\top$ in SDPs with linear constraints. If the selected rank is sufficiently large relative to the number of constraints and the constraint set is a smooth manifold, then any second-order critical point of the nonconvex problem is a global optimum [8]. Another low-memory approach is to first determine (approximately) the subspace in which the low-rank solution to an SDP lies and then solve the problem over that low-dimensional subspace [11].
Alternatively, randomized sketching to a low-dimensional subspace is often used as a low-memory alternative to storing a matrix decision variable [35, 32]. Recently, such sketched variables have been used to generate a low-rank approximation of a near-optimal solution to SDPs [38]. The working memory required to compute a near-optimal solution and generate its rank-$r$ approximation using the algorithmic framework proposed by Yurtsever et al. [38] scales with both the number of constraints and the chosen sketching parameter. Gaussian sampling-based Frank-Wolfe [28] uses $O(n+d)$ memory to generate zero-mean Gaussian samples whose covariance represents a near-optimal solution to an SDP with $d$ linear constraints. This eliminates the dependency on the rank of the near-optimal solution or the accuracy to which its low-rank approximation is computed.
However, the two problems considered in this paper have SDP relaxations with $\Theta(n^2)$ constraints, for which applying the existing low-memory techniques provides no benefit, since the memory requirement of these techniques depends on the number of constraints in the problem. These problems have been studied extensively in the literature, as we review below.
Max-k-Cut.
Max-k-Cut and its dual Min-k-Partition have applications in frequency allocation [12] and in generating lower bounds on co-channel interference in cellular networks [7]. These problems have been studied extensively in the literature [20, 27, 30]. The SDP-based rounding scheme given in [14] has also been adapted to the related problems of capacitated Max-k-Cut [16] and approximate graph coloring [21]. In each case, however, the SDP relaxation has $\Theta(n^2)$ constraints. Alternative heuristic methods have been proposed in [26, 24, 13, 17]; however, these methods generate a feasible cut which only provides a lower bound on the optimal cut value.
Correlation clustering.
Swamy [31] and Charikar et al. [10] each provide 0.766-approximation schemes for Max-Agree which involve solving (MA-SDP). For large-scale applications, data streaming techniques have been studied quite extensively for various clustering problems, such as $k$-means and $k$-median [3, 25]. Ahn et al. [2] propose a single-pass, polynomial-time algorithm for Max-Agree that achieves a $0.766(1-\epsilon)$ approximation using memory that scales polynomially in $1/\epsilon$. In contrast, to achieve the same approximation guarantee, our approach uses $O(n+|E|)$ memory, which is $O(n)$ for sparse graphs and is independent of $\epsilon$. Furthermore, the computational complexity of our approach has a better dependence on $\epsilon$ than that of [2]. Moreover, our approach is algorithmically simple to implement.
1.5 Outline
In Section 2, we review the Gaussian sampling-based Frank-Wolfe method [28] for computing a near-feasible, near-optimal solution to SDPs with linear equality and inequality constraints. In Sections 3 and 4, we adapt the Gaussian sampling-based approach to give approximation algorithms for Max-k-Cut and Max-Agree, respectively, that use only $O(n+|E|)$ memory, proving Propositions 1.1 and 1.2. In Section 5, we show how to combine our methods with streaming spectral sparsification to reduce the memory required for dense graphs presented edge-by-edge in a stream. We provide some preliminary computational results for Max-Agree in Section 6, and conclude our work and discuss possible future directions in Section 7. All proofs are deferred to the appendix.
Notations.
The matrix inner product is denoted by $\langle \cdot, \cdot \rangle$. The vector of diagonal entries of a matrix $X$ is denoted $\mathrm{diag}(X)$, and $\mathrm{Diag}(x)$ is a diagonal matrix with the vector $x$ on its diagonal. The notations $O(\cdot)$, $\Omega(\cdot)$, and $\Theta(\cdot)$ have the usual complexity interpretation, and $\tilde{O}(\cdot)$ suppresses the dependence on logarithmic factors. An undirected edge in the set $E$ is denoted by $\{i,j\}$ and $\{j,i\}$ interchangeably.
2 Gaussian Sampling-based Frank-Wolfe
Consider a smooth, concave function $g : \mathbb{R}^d \to \mathbb{R}$ and define the trace-constrained SDP
$$\max_{X \succeq 0,\ \operatorname{tr}(X) = \alpha}\; g(\mathcal{A}X), \tag{BoundedSDP}$$
where $\alpha > 0$ and $\mathcal{A}$ is a linear mapping that projects the $n \times n$ matrix variable to a $d$-dimensional space. One algorithmic approach to solving (BoundedSDP) is the Frank-Wolfe algorithm [18] which, in this case, computes an $\epsilon$-optimal solution by taking steps of the form $X_{t+1} = (1-\eta_t) X_t + \eta_t\, \alpha\, h_t h_t^\top$, where $\eta_t \in [0,1]$ is a step size and the unit vectors $h_t$ arise from approximately solving a symmetric eigenvalue problem that depends only on $\mathcal{A}X_t$ and $\nabla g$. Standard convergence results show that an $\epsilon$-optimal solution is reached after $O(C_g/\epsilon)$ iterations, where $C_g$ is an upper bound on the curvature constant of the objective [18].
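A minimal sketch of this eigenvalue subproblem (the linear maximization oracle) using SciPy's Lanczos-based eigensolver; the gradient matrix is accessed only through matrix-vector products, and the helper name and tolerance are illustrative rather than the exact oracle of the paper:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def lmo_direction(grad_matvec, n, tol=1e-3):
    """Return an approximate unit-norm leading eigenvector h of the
    gradient matrix, so that (alpha * h h^T) is the Frank-Wolfe update
    direction.  grad_matvec(v) must return the product of the gradient
    matrix with v; the gradient is never formed explicitly."""
    op = LinearOperator((n, n), matvec=grad_matvec, dtype=np.float64)
    _, vecs = eigsh(op, k=1, which='LA', tol=tol)
    return vecs[:, 0]
```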
Frank-Wolfe with Gaussian sampling.
The Gaussian sampling technique of [28] replaces the matrix-valued iterates $X_t$ with Gaussian random vectors $z_t \sim \mathcal{N}(0, X_t)$. The update, at the level of samples, is then $z_{t+1} = \sqrt{1-\eta_t}\, z_t + \sqrt{\eta_t \alpha}\, \zeta_t h_t$, where $\zeta_t \sim \mathcal{N}(0,1)$ is independent of $z_t$. Note that $z_{t+1}$ is also a zero-mean Gaussian random vector, with covariance equal to $(1-\eta_t)X_t + \eta_t \alpha\, h_t h_t^\top$. Furthermore, to track the change in the objective function value, it suffices to track the vector $\mathcal{A}X_t$ and compute $\mathcal{A}(h_t h_t^\top)$, so that $\mathcal{A}X_{t+1} = (1-\eta_t)\mathcal{A}X_t + \eta_t \alpha\, \mathcal{A}(h_t h_t^\top)$. Thus, computing the updates to the decision variable and tracking the objective function value only requires the knowledge of $z_t$ and $\mathcal{A}X_t$, which can be updated without explicitly storing $X_t$, thereby reducing the memory used.
Algorithm 1 [28] describes, in detail, the Frank-Wolfe algorithm with Gaussian sampling applied to (BoundedSDP). It uses at most $O(n+d)$ memory at each iteration and, after at most $O(C_g/\epsilon)$ iterations, returns a sample $z \sim \mathcal{N}(0, X_\epsilon)$, where $X_\epsilon$ is an $\epsilon$-optimal solution to (BoundedSDP).
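A minimal sketch of the per-sample update described above, with the trace bound normalised to one (function and variable names are illustrative):

```python
import numpy as np

def gaussian_fw_step(z, h, eta, rng=np.random.default_rng()):
    """One Frank-Wolfe step at the level of Gaussian samples.

    If z ~ N(0, X_t) and h is the unit vector returned by the linear
    maximization oracle, the returned vector is distributed as
    N(0, (1 - eta) X_t + eta h h^T), i.e. the usual Frank-Wolfe update
    of the covariance, without X_t ever being stored."""
    zeta = rng.standard_normal()  # fresh independent scalar N(0, 1)
    return np.sqrt(1.0 - eta) * z + np.sqrt(eta) * zeta * np.asarray(h)
```

A fresh independent $\zeta_t$ is drawn at every iteration so that the covariance recursion holds exactly.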
2.1 SDP with linear equality and inequality constraints
Consider an SDP with linear objective function and a bounded feasible region,
$$\max_{X \succeq 0,\ \operatorname{tr}(X) \le \alpha}\; \langle C, X\rangle \quad \text{subject to} \quad \mathcal{A}_1 X = b_1, \quad \mathcal{A}_2 X \le b_2, \tag{SDP}$$
where $\mathcal{A}_1$ and $\mathcal{A}_2$ are linear maps. To use Algorithm 1, the linear constraints are penalized using a smooth penalty function. Collect the constraint violations into a single vector $u(X)$, containing $\mathcal{A}_1 X - b_1$ and $b_1 - \mathcal{A}_1 X$ for the equality constraints and $\mathcal{A}_2 X - b_2$ for the inequality constraints. For a smoothing parameter $\sigma > 0$, the smooth log-sum-exp function

$$\phi_\sigma(u) \;=\; \sigma \log\Big(\sum_{i} e^{u_i/\sigma}\Big) \tag{2.1}$$

approximates the largest constraint violation, since

$$\max_i u_i \;\le\; \phi_\sigma(u) \;\le\; \max_i u_i + \sigma \log m, \tag{2.2}$$

where $m$ is the number of entries of $u$.
We add this penalty term to the objective of (SDP) and define
$$\max_{X \succeq 0,\ \operatorname{tr}(X) \le \alpha}\; \langle C, X\rangle \;-\; \lambda\, \phi_\sigma\big(u(X)\big), \tag{SDP-LSE}$$
where $\lambda > 0$ and $\sigma > 0$ are appropriately chosen parameters. Algorithm 1 then generates a Gaussian sample whose covariance $X_\epsilon$ is an $\epsilon$-optimal solution to (SDP-LSE); this covariance is also a near-optimal, near-feasible solution to (SDP). This result is a slight modification of [28, Lemma 3.2], which only provides bounds for SDPs with linear equality constraints.
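A minimal sketch of a log-sum-exp penalty of the constraint violations and its softmax gradient weights, assuming the standard form $\sigma\log\sum_i e^{u_i/\sigma}$; the exact normalisation and constants used in (2.1)-(2.2) may differ:

```python
import numpy as np
from scipy.special import logsumexp, softmax

def lse_penalty(violations, sigma):
    """Smooth surrogate for max_i violations_i: sigma * log(sum exp(u / sigma)).

    violations: 1-D array collecting A1(X) - b1 with both signs for the
    equality constraints and A2(X) - b2 for the inequality constraints
    (assumed convention)."""
    return sigma * logsumexp(np.asarray(violations) / sigma)

def lse_weights(violations, sigma):
    """Softmax weights of the violations; the gradient of the penalty is
    the corresponding convex combination of the constraint gradients."""
    return softmax(np.asarray(violations) / sigma)
```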
3 Application of Gaussian Sampling to (k-Cut-SDP)
In this section, we apply Gaussian sampling to Max-k-Cut. Since Algorithm 1 uses $\Omega(n^2)$ memory when solving (k-Cut-SDP), we define a new SDP relaxation of Max-k-Cut with the same approximation guarantee but with only $O(n+|E|)$ constraints. We then apply Algorithm 1 to this new relaxation and show how to round the solution to achieve nearly the same approximation ratio as given in (1.1). Let
$$\alpha_k \;:=\; \inf_{-1/(k-1) \le \rho < 1} \; \frac{k\, p(\rho)}{(k-1)(1-\rho)}, \tag{3.1}$$
where $p(\rho)$ is the probability that vertices $i$ and $j$ are placed in different parts by the FJ rounding scheme given that $X_{ij} = \rho$. If $X$ is feasible for (k-Cut-SDP) and CUT is the value of the $k$-cut generated by applying the FJ rounding scheme to $X$, then
$$\mathbb{E}[\mathrm{CUT}] \;\ge\; \alpha_k\, \langle C, X\rangle. \tag{3.2}$$
Frieze and Jerrum [14] derive a lower bound on $\alpha_k$, showing that the method gives a nontrivial approximation guarantee. Observe that (3.2) depends on the entries $X_{ij}$ only for edges $\{i,j\} \in E$, since $w_{ij} = 0$ otherwise.
A new SDP relaxation of Max-k-Cut.
We relax (k-Cut-SDP) by imposing the inequality constraints only on the edges of the graph, and define
$$\max_{X \succeq 0}\; \langle C, X\rangle \quad \text{subject to} \quad X_{ii} = 1 \;\;\forall i \in V, \qquad X_{ij} \ge -\tfrac{1}{k-1} \;\;\forall \{i,j\} \in E. \tag{k-Cut-Rel}$$
Since (k-Cut-Rel) is a relaxation of (k-Cut-SDP), its optimal objective function value provides an upper bound on $\langle C, X^\star\rangle$, where $X^\star$ is an optimal solution to (k-Cut-SDP), and hence on the optimal $k$-cut value $\mathrm{CUT}^\star$. Note that the bound in (3.2) holds even if we replace $X$ by an optimal solution to (k-Cut-Rel), since it depends on the value of $X_{ij}$ only if $\{i,j\} \in E$. Furthermore, when the FJ rounding scheme is applied to an optimal solution of (k-Cut-Rel), the resulting $k$-cut satisfies the approximation guarantee in (1.1), i.e., $\mathbb{E}[\mathrm{CUT}] \ge \alpha_k\, \mathrm{CUT}^\star$.
Using Algorithm 1.
Optimality and feasibility results for (k-Cut-Rel).
Generating a feasible solution to Max-k-Cut.
Since the covariance of the samples returned by Algorithm 1 might not be feasible for (k-Cut-Rel), we cannot apply the FJ rounding scheme to those samples directly. We therefore generate new samples, using the procedure given in Algorithm 2, whose covariance is a feasible solution to (k-Cut-Rel) and is close to the covariance of the samples returned by Algorithm 1.
We can now apply the FJ rounding scheme to these samples, as stated in Lemma 3.2.
Lemma 3.2.
For , let be the optimal -cut value and let be an optimal solution to (k-Cut-Rel). For , let satisfy (3.3) and (3.4). Let be random vectors generated by Algorithm 2 with input and let CUT denote the value of a -cut generated by applying the FJ rounding scheme to . For as defined by (3.1), we have
(3.5) |
Computational complexity of Algorithm 1 when applied to (k-Cut-LSE).
Finally, in Lemma 3.3, we provide the computational complexity of the method proposed in this section, which concludes the proof of Proposition 1.1.
Lemma 3.3.
When the method proposed in this section (Section 3), with and , is used to generate an approximate -cut to Max-k-Cut, the generated cut satisfies and runs in time.
4 Application of Gaussian Sampling to (MA-SDP)
We now turn to the application of our Gaussian sampling-based method to Max-Agree. Algorithm 1 uses $\Omega(n^2)$ memory to generate samples whose covariance is an $\epsilon$-optimal solution to (MA-SDP). However, by an observation similar to the one made for Max-k-Cut, we note that for any $X$ feasible for (MA-SDP), the proof of the approximation bound (1.2) given in [10, Theorem 3] requires the constraint $X_{ij} \ge 0$ only for pairs $\{i,j\} \in E^+ \cup E^-$. We therefore write a new relaxation of (MA-SDP),
$$\max_{X \succeq 0}\; \langle C, X\rangle \quad \text{subject to} \quad X_{ii} = 1 \;\;\forall i \in V, \qquad X_{ij} \ge 0 \;\;\forall \{i,j\} \in E^+ \cup E^-, \tag{MA-Rel}$$
with only $O(n + |E|)$ constraints, where $E = E^+ \cup E^-$. The bound (1.2) on the expected value of the clustering holds even if the clustering is generated by applying the CGW rounding scheme to an optimal solution of (MA-Rel). To use Algorithm 1, we penalize the constraints in (MA-Rel) and define
(MA-LSE) |
Optimality and feasibility results for (MA-Rel).
Generating an approximate clustering.
The CGW rounding scheme can only be applied to a feasible solution of (MA-Rel). We therefore use a modified version of Algorithm 2, with Step 3 suitably replaced, to generate zero-mean Gaussian samples whose covariance is a feasible solution to (MA-Rel). Finally, we apply the CGW rounding scheme to the output of this modified version of Algorithm 2.
Lemma 4.2.
Let be an optimal solution to (MA-Rel). For , let satisfy (4.1) and (4.2), and let be random vectors generated by Algorithm 2 with input . Let denote the optimal clustering value for the graph and let denote the value of the clustering generated from the random vectors using the CGW rounding scheme. Then
(4.3) |
Computational complexity of Algorithm 1 when applied to (MA-LSE).
Lemma 4.3.
When the method proposed in this section (Section 4), with and , is used to generate an approximate clustering, the value of the clustering satisfies and runs in time.
This completes the proof of Proposition 1.2.
5 Sparsifying the Laplacian Cost Matrix
As seen in Sections 3 and 4, the memory required to generate and represent an $\epsilon$-optimal solution to (k-Cut-LSE) and (MA-LSE) is bounded by $O(n+|E|)$. However, if the input graph is dense, the cost matrix will be dense and the number of inequality constraints will be correspondingly large. In this section, we consider the situation in which the dense weighted graph arrives in a stream, and we first build a sparse approximation with similar spectral properties. We refer to this additional step as sparsifying the cost.
Definition 5.1 ($\epsilon$-spectral closeness).
Two graphs $G$ and $H$ defined on the same set of vertices are said to be $\epsilon$-spectrally close if, for every $x \in \mathbb{R}^n$,
$$(1-\epsilon)\, x^\top L_H x \;\le\; x^\top L_G x \;\le\; (1+\epsilon)\, x^\top L_H x, \tag{5.1}$$

where $L_G$ and $L_H$ denote the Laplacians of $G$ and $H$ respectively.
Spectral graph sparsification has been studied quite extensively (see, e.g., [29, 1, 15]). Kyng et al. [23] propose a nearly linear-time framework to replace a dense graph $G$ by a sparser graph $H$ with $O(n\log n/\epsilon^2)$ edges such that $G$ and $H$ satisfy (5.1) with high probability. Their algorithm assumes that the edges of the graph arrive one at a time, so that the total memory requirement is $O(n\log n/\epsilon^2)$ rather than $O(|E|)$. Furthermore, a sparse cost matrix decreases the computation time of the subproblem in Algorithm 1, since that subproblem involves matrix-vector multiplication with the gradient of the cost.
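The closeness condition can be sanity-checked empirically on random test vectors; the sketch below assumes the two-sided quadratic-form condition in (5.1) and is only a necessary-condition spot check, not the streaming sparsifier of [23]:

```python
import numpy as np

def looks_spectrally_close(L_G, L_H, eps, trials=100, seed=0):
    """Check (1 - eps) x'L_H x <= x'L_G x <= (1 + eps) x'L_H x on random
    Gaussian test vectors.  L_G, L_H: (sparse or dense) Laplacians of the
    original graph and of its sparsifier."""
    rng = np.random.default_rng(seed)
    n = L_G.shape[0]
    for _ in range(trials):
        x = rng.standard_normal(n)
        qG, qH = float(x @ (L_G @ x)), float(x @ (L_H @ x))
        if not ((1.0 - eps) * qH <= qG <= (1.0 + eps) * qH):
            return False
    return True
```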
Max-k-Cut with sparsification.
Let $H$ be a sparse graph with $O(n\log n/\epsilon^2)$ edges that is $\epsilon$-spectrally close to the input graph $G$. By applying the method outlined in Section 3 to $H$, we can generate a $k$-cut (using $O(n\log n/\epsilon^2)$ memory) whose expected value satisfies the bound (3.5) with respect to $H$. Note that this generated cut is also a $k$-cut for the original graph $G$ with a provable approximation guarantee, as shown in Lemma 5.1.
Lemma 5.1.
For , let be a near-feasible, near-optimal solution to (k-Cut-Rel) defined on the graph that satisfies (3.3) and (3.4). Let CUT denote the value of the -cut generated by applying Algorithm 2 followed by the FJ rounding scheme to . Then the generated cut satisfies
(5.2) |
where is the value of the optimal -cut for the original graph .
Max-Agree with sparsification.
The numbers of edges in the graphs $G^+$ and $G^-$ determine the working memory of Algorithm 1. For dense input graphs $G^+$ and $G^-$, we sparsify them to generate graphs $H^+$ and $H^-$ with at most $O(n\log n/\epsilon^2)$ edges each, and define
(MA-Sparse) |
where $C_s = \frac{1}{2}\big(L_{H^-} + W_{H^+}\big)$, $L_{H^-}$ is the Laplacian of the graph $H^-$, and $W_{H^+}$ is a matrix with nonnegative entries denoting the weight of each edge of $H^+$. Algorithm 1 then generates an $\epsilon$-optimal solution to (MA-Sparse) using $O(n\log n/\epsilon^2)$ memory. We can now use the method given in Section 4 to generate a clustering whose expected value satisfies (4.3) for the sparsified instance. The following lemma shows that this clustering also comes with provable guarantees for the original graph.
Lemma 5.2.
For , let be a near-feasible, near-optimal solution to (MA-Sparse) defined on the graph that satisfies (4.1) and (4.2). Let denote the value of the clustering generated by applying Algorithm 2 followed by the CGW rounding scheme to . Then, satisfies
(5.3) |
where is the value of the optimal clustering of the original graph .
We summarize our results in the following lemma whose proof is given in Appendix A.10.
Lemma 5.3.
Assume that the edges of the input graph arrive one at a time in a stream. The procedure given in this section uses at most memory and in -time, generates approximate solutions to Max-k-Cut and Max-Agree that satisfy the bounds and respectively.
6 Computational Results
We now discuss the results of preliminary computations for clustering the vertices of a graph using the approach outlined in Section 4. The aim of the numerical experiments was to verify that the bounds given in Lemma 4.2 were satisfied when we used the procedure outlined in Section 4 to generate a clustering for each input graph. We used graphs from the GSet dataset [36], which is a collection of randomly generated graphs. Note that the aim of correlation clustering is to generate a clustering of vertices for graphs where each edge has a label indicating 'similarity' or 'dissimilarity' of the vertices connected by that edge. We therefore first converted the undirected, unweighted graphs from the GSet dataset [36] into instances with labelled edges, using an adaptation of the approach used in [34, 33]. This modified approach generated a label and weight for each edge indicating the amount of 'similarity' or 'dissimilarity' between its endpoints.
Generating input graphs for Max-Agree.
In the process of label generation, we first computed, for each edge $\{u,v\}$, the Jaccard coefficient $J_{uv} = |N(u) \cap N(v)|/|N(u) \cup N(v)|$, where $N(u)$ is the set of neighbours of $u$. Next, from $J_{uv}$ we computed a signed quantity for each edge which measures the amount of 'similarity' or 'dissimilarity'. Finally, the edge was labelled 'dissimilar' (with the corresponding weight) if this quantity fell below a threshold, and 'similar' (with the corresponding weight) otherwise.
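A minimal sketch of this preprocessing step; the Jaccard computation follows the description above, while the threshold and the mapping from the coefficient to an edge weight below are assumed placeholders rather than the exact choices adapted from [34, 33]:

```python
def jaccard_label_edges(adj, threshold=0.5):
    """Convert an unweighted graph into a Max-Agree instance.

    adj: dict mapping each vertex to the set of its neighbours.
    For every edge (u, v), the Jaccard coefficient
    J = |N(u) & N(v)| / |N(u) | N(v)| is computed; the edge goes into
    the 'similar' dictionary if J clears the (assumed) threshold and
    into the 'dissimilar' dictionary otherwise, with (assumed) weight
    |2J - 1| measuring the strength of the label."""
    similar, dissimilar, seen = {}, {}, set()
    for u, nbrs in adj.items():
        for v in nbrs:
            edge = (u, v) if u < v else (v, u)
            if edge in seen:
                continue
            seen.add(edge)
            union = len(adj[u] | adj[v])
            jac = len(adj[u] & adj[v]) / union if union else 0.0
            target = similar if jac >= threshold else dissimilar
            target[edge] = abs(2.0 * jac - 1.0)
    return similar, dissimilar
```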
Experimental Setup.
We set the input parameters of Algorithm 1 (the accuracy, smoothing, and penalty parameters) to fixed values. Using Algorithm 1, (MA-LSE) was solved to $\epsilon$-optimality, and we computed feasible samples using Algorithm 2. Finally, we generated two Gaussian samples and created at most four clusters by applying the 0.75-rounding scheme proposed by Swamy [31, Theorem 2.1], for simplicity. The computations were performed using MATLAB R2021a on a machine with 8GB RAM. We recorded the peak memory used by the algorithm using the profiler command in MATLAB.
The computational results for some randomly selected instances from the dataset are given in Table 1; results for the remaining graphs from GSet are provided in Appendix C. First, we observed that, for each input graph, the number of LMO iterations needed for convergence satisfied the bound given in Proposition 1.2, and the infeasibility of the covariance of the generated samples was small enough to satisfy (4.2). We generated 10 pairs of i.i.d. zero-mean Gaussian samples with the feasible covariance, and each pair in turn was used to generate a clustering of the input graph using the 0.75-rounding scheme proposed by Swamy [31]. Amongst the 10 clusterings generated for each graph, we picked the one with the largest value. In each case this best clustering value, compared against the SDP-based upper bound reported in Table 1, satisfied the (weaker) multiplicative bound obtained by combining (4.3) with (4.1).
Table 1 also shows the memory used by our method. Consider the dataset G1, for which the memory used by our method (where a factor of 8 accounts for MATLAB storing each real number in 8 bytes) scales with $n + |E|$. Similarly, we observed that our method used memory bounded by a small multiple of $n + |E|$ to generate clusters for the other instances from GSet, showing that the memory used was linear in the size of the input graph.
Dataset | $n$ | $|E^+|$ | $|E^-|$ | # Iterations | infeas | Upper bound | Best clustering value | AR | Memory required (in kB)
---|---|---|---|---|---|---|---|---|---|
G1 | 800 | 2453 | 16627 | 669.46 | 849.48 | 643 | 0.757 | 1526.35 | |
G11 | 800 | 817 | 783 | 397.2 | 3000.3 | 2080 | 0.693 | 448.26 | |
G14 | 800 | 3861 | 797 | 330.02 | 542.55 | 469.77 | 0.866 | 423.45 | |
G22 | 2000 | 115 | 19849 | 725.66 | 1792.9 | 1371.1 | 0.764 | 1655.09 | |
G32 | 2000 | 2011 | 1989 | 571.42 | 7370 | 4488 | 0.609 | 1124 | |
G43 | 1000 | 248 | 9704 | 501.31 | 803.8 | 616.05 | 0.766 | 654.46 | |
G48 | 3000 | 0 | 6000 | 9806.22 | 0.004 | 599.64 | 461.38 | 0.769 | 736.09 |
G51 | 1000 | 4734 | 1147 | 1038.99 | 0.001 | 676.21 | 446.29 | 0.66 | 517.09 |
G55 | 5000 | 66 | 12432 | 2707.07 | 0.002 | 1244.2 | 901.74 | 0.724 | 1281.03 |
G57 | 5000 | 4981 | 5019 | 574.5 | 0.005 | 18195 | 10292 | 0.565 | 812.78 |
7 Discussion
In this paper, we proposed a Gaussian sampling-based optimization algorithm to generate approximate solutions to Max-k-Cut and the Max-Agree variant of correlation clustering using $O(n+|E|)$ memory. The approximation guarantees given in [14, 10, 31] for these problems are based on solving SDP relaxations that have $\Theta(n^2)$ constraints. The key observation that led to the low-memory method proposed in this paper is that the approximation guarantees from the literature are preserved for both problems even if we solve weaker SDP relaxations with only $O(n+|E|)$ constraints. We showed that for Max-k-Cut and the Max-Agree variant of correlation clustering, our approach nearly preserves the quality of the solutions given in [14, 10]. We also implemented the method outlined in Section 4 to generate approximate clusterings of random graphs with provable guarantees. The numerical experiments showed that while the method is simple to implement, it is slow in practice. There is, however, scope for improving the convergence rate of our method so that it can potentially be applied to large-scale instances of various real-life applications of clustering.
Extending the low-memory method to solve problems with triangle inequalities.
The known nontrivial approximation guarantees for the sparsest cut problem involve solving an SDP relaxation that has triangle inequality constraints [4]. It would be interesting to see whether it is possible to simplify these SDPs in such a way that they can be combined with memory-efficient algorithms while still maintaining good approximation guarantees.
References
- Ahn and Guha [2009] Kook Jin Ahn and Sudipto Guha. Graph sparsification in the semi-streaming model. In International Colloquium on Automata, Languages, and Programming, pages 328–338. Springer, 2009.
- Ahn et al. [2015] KookJin Ahn, Graham Cormode, Sudipto Guha, Andrew McGregor, and Anthony Wirth. Correlation clustering in data streams. In International Conference on Machine Learning, pages 2237–2246. PMLR, 2015.
- Ailon et al. [2009] Nir Ailon, Ragesh Jaiswal, and Claire Monteleoni. Streaming k-means approximation. In Proceedings of the Twenty-Third Annual Conference on Neural Information Processing Systems, volume 4, page 2, 2009.
- Arora et al. [2009] Sanjeev Arora, Satish Rao, and Umesh Vazirani. Expander flows, geometric embeddings and graph partitioning. Journal of the ACM (JACM), 56(2):1–37, 2009.
- Awasthi et al. [2015] Pranjal Awasthi, Afonso S Bandeira, Moses Charikar, Ravishankar Krishnaswamy, Soledad Villar, and Rachel Ward. Relax, no need to round: Integrality of clustering formulations. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, pages 191–200, 2015.
- Bansal et al. [2004] Nikhil Bansal, Avrim Blum, and Shuchi Chawla. Correlation clustering. Machine Learning, 56(1-3):89–113, 2004.
- Borndörfer et al. [1998] Ralf Borndörfer, Andreas Eisenblätter, Martin Grötschel, and Alexander Martin. Frequency assignment in cellular phone networks. Annals of Operations Research, 76:73–93, 1998.
- Boumal et al. [2016] Nicolas Boumal, Vlad Voroninski, and Afonso Bandeira. The non-convex Burer-Monteiro approach works on smooth semidefinite programs. In Advances in Neural Information Processing Systems, pages 2757–2765, 2016.
- Burer and Monteiro [2003] Samuel Burer and Renato DC Monteiro. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 95(2):329–357, 2003.
- Charikar et al. [2005] Moses Charikar, Venkatesan Guruswami, and Anthony Wirth. Clustering with qualitative information. Journal of Computer and System Sciences, 71(3):360–383, 2005.
- Ding et al. [2019] Lijun Ding, Alp Yurtsever, Volkan Cevher, Joel A Tropp, and Madeleine Udell. An optimal-storage approach to semidefinite programming using approximate complementarity. arXiv preprint arXiv:1902.03373, 2019.
- Eisenblätter et al. [2002] Andreas Eisenblätter, Martin Grötschel, and Arie Koster. Frequency planning and ramifications of coloring. Discussiones Mathematicae Graph Theory, 22(1):51–88, 2002.
- Feo and Resende [1995] Thomas A Feo and Mauricio GC Resende. Greedy randomized adaptive search procedures. Journal of Global Optimization, 6(2):109–133, 1995.
- Frieze and Jerrum [1997] Alan Frieze and Mark Jerrum. Improved approximation algorithms for Max-k-Cut and max bisection. Algorithmica, 18(1):67–81, 1997.
- Fung et al. [2019] Wai-Shing Fung, Ramesh Hariharan, Nicholas JA Harvey, and Debmalya Panigrahi. A general framework for graph sparsification. SIAM Journal on Computing, 48(4):1196–1223, 2019.
- Gaur et al. [2008] Daya Ram Gaur, Ramesh Krishnamurti, and Rajeev Kohli. The capacitated Max-k-Cut problem. Mathematical Programming, 115(1):65–72, 2008.
- Ghaddar et al. [2011] Bissan Ghaddar, Miguel F Anjos, and Frauke Liers. A branch-and-cut algorithm based on semidefinite programming for the minimum k-partition problem. Annals of Operations Research, 188(1):155–174, 2011.
- Jaggi [2013] Martin Jaggi. Revisiting Frank-Wolfe: Projection-free sparse convex optimization. In Proceedings of the 30th International Conference on Machine Learning, pages 427–435, 2013.
- Journée et al. [2010] Michel Journée, Francis Bach, P-A Absil, and Rodolphe Sepulchre. Low-rank optimization on the cone of positive semidefinite matrices. SIAM Journal on Optimization, 20(5):2327–2351, 2010.
- Kann et al. [1995] Viggo Kann, Sanjeev Khanna, Jens Lagergren, and Alessandro Panconesi. On the Hardness of Approximating Max-k-Cut and Its Dual. Citeseer, 1995.
- Karger et al. [1998] David Karger, Rajeev Motwani, and Madhu Sudan. Approximate graph coloring by semidefinite programming. Journal of the ACM (JACM), 45(2):246–265, 1998.
- Kuczyński and Woźniakowski [1992] Jacek Kuczyński and Henryk Woźniakowski. Estimating the largest eigenvalue by the power and Lanczos algorithms with a random start. SIAM Journal on Matrix Analysis and Applications, 13(4):1094–1122, 1992.
- Kyng et al. [2017] Rasmus Kyng, Jakub Pachocki, Richard Peng, and Sushant Sachdeva. A framework for analyzing resparsification algorithms. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2032–2043. SIAM, 2017.
- Ma and Hao [2017] Fuda Ma and Jin-Kao Hao. A multiple search operator heuristic for the Max-k-Cut problem. Annals of Operations Research, 248(1-2):365–403, 2017.
- McCutchen and Khuller [2008] Richard Matthew McCutchen and Samir Khuller. Streaming algorithms for k-center clustering with outliers and with anonymity. In Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques, pages 165–178. Springer, 2008.
- Mladenović and Hansen [1997] Nenad Mladenović and Pierre Hansen. Variable neighborhood search. Computers & Operations Research, 24(11):1097–1100, 1997.
- Newman [2018] Alantha Newman. Complex semidefinite programming and Max-k-Cut. arXiv preprint arXiv:1812.10770, 2018.
- Shinde et al. [2021] Nimita Shinde, Vishnu Narayanan, and James Saunderson. Memory-efficient structured convex optimization via extreme point sampling. SIAM Journal on Mathematics of Data Science, 3(3):787–814, 2021.
- Spielman and Teng [2011] Daniel A Spielman and Shang-Hua Teng. Spectral sparsification of graphs. SIAM Journal on Computing, 40(4):981–1025, 2011.
- Subramanian et al. [2008] Anand Prabhu Subramanian, Himanshu Gupta, Samir R Das, and Jing Cao. Minimum interference channel assignment in multiradio wireless mesh networks. IEEE Transactions on Mobile Computing, 7(12):1459–1473, 2008.
- Swamy [2004] Chaitanya Swamy. Correlation clustering: maximizing agreements via semidefinite programming. In SIAM Symposium on Discrete Algorithms, volume 4, pages 526–527. Citeseer, 2004.
- Tropp et al. [2019] Joel A Tropp, Alp Yurtsever, Madeleine Udell, and Volkan Cevher. Streaming low-rank matrix approximation with an application to scientific simulation. SIAM Journal on Scientific Computing, 41(4):A2430–A2463, 2019.
- Veldt et al. [2019] Nate Veldt, David F Gleich, Anthony Wirth, and James Saunderson. Metric-constrained optimization for graph clustering algorithms. SIAM Journal on Mathematics of Data Science, 1(2):333–355, 2019.
- Wang et al. [2013] Yubo Wang, Linli Xu, Yucheng Chen, and Hao Wang. A scalable approach for general correlation clustering. In International Conference on Advanced Data Mining and Applications, pages 13–24. Springer, 2013.
- Woodruff [2014] David P Woodruff. Sketching as a tool for numerical linear algebra. Foundations and Trends® in Theoretical Computer Science, 10(1–2):1–157, 2014.
- Ye [2003] Yinyu Ye. GSet random graphs. https://www.cise.ufl.edu/research/sparse/matrices/Gset/, 2003. [accessed April-2021].
- Yurtsever et al. [2017] Alp Yurtsever, Madeleine Udell, Joel A Tropp, and Volkan Cevher. Sketchy decisions: Convex low-rank matrix optimization with optimal storage. In Artificial Intelligence and Statistics, pages 1188–1196, 2017.
- Yurtsever et al. [2021] Alp Yurtsever, Joel A Tropp, Olivier Fercoq, Madeleine Udell, and Volkan Cevher. Scalable semidefinite programming. SIAM Journal on Mathematics of Data Science, 3(1):171–200, 2021.
Appendix A Proofs
A.1 Proof of Lemma 2.1
Proof.
Let and be the dual variables corresponding to the equality constraints and the inequality constraints respectively. The dual of (SDP) is
(DSDP) |
where ’s for are assumed to be symmetric.
Lower bound on the objective.
Let be an optimal solution to (SDP) and let be an optimal solution to (SDP-LSE). For ease of notation, let
(A.1) |
and define , and by substituting , and respectively in (A.1). Since is an -optimal solution to (SDP-LSE), and a feasible solution to (SDP-LSE), the following holds
(A.2) |
We know that is feasible to (SDP), so that . Now, rearranging the terms, and using the upper bound on and the fact that ,
(A.3) |
Upper bound on the objective.
The Lagrangian of (SDP) is . For a primal-dual optimal pair, (), and , we have that , i.e.,
Rearranging the terms, using the duality of the and norms, and the fact that , gives
(A.4) |
Bound on infeasibility.
Completing the upper bound on the objective.
∎
A.2 Proof of Lemma 3.1
Proof.
The proof consists of three parts.
Lower bound on the objective.
Bound on infeasibility.
For (k-Cut-Rel), let be a dual variable such that for are the variables corresponding to equality constraints and for are the dual variables corresponding to inequality constraints. Following (DSDP), the dual of (k-Cut-Rel) is
(Dual-Relax) |
Let be an optimal dual solution. In order to bound the infeasibility using (A.6), we need a bound on which is given by the following lemma.
Lemma A.1.
The value of is upper bounded by .
Proof.
The matrix is a scaled Laplacian and so, the only off-diagonal entries that are nonzero correspond to and have value less than zero. For (Dual-Relax), a feasible solution is , for . The optimal objective function value of (Dual-Relax) is then upper bounded by
(A.9) | ||||
(A.10) |
where the last inequality follows since .
We have since both matrices are PSD. Using the fact that is in the null space of , we get
(A.11) |
Since , we can write
(A.12) |
which follows from (A.11) and the fact that for the dual to be feasible we have since has nonnegative entries on the diagonal. Substituting (A.10) in (A.12),
(A.13) |
where the last inequality follows since for . ∎
Upper bound on the objective.
A.3 Proof of Lemma 3.2
Proof.
We first show that Algorithm 2 generates random Gaussian samples whose covariance is feasible to (k-Cut-Rel).
Proposition A.1.
Proof.
Define . Note that, and it satisfies the following properties:
The objective function value of (k-Cut-Rel) at (defined in (A.15)) is
(A.16) | ||||
(A.17) |
where (i) follows from the fact that both and are PSD and so, their inner product is nonnegative, (ii) follows from Lemma 3.1 and the fact that , and (iii) uses the fact that . Let denote the value of the cut generated from the samples ’s. Combining (A.17) with the inequality (see (3.2)), we have
(A.18) |
∎
A.4 Proof of Lemma 3.3
Proof.
Upper bound on outer iteration complexity.
Let the objective function of (k-Cut-LSE) be .
Theorem A.1.
Let be a concave and differentiable function and an optimal solution of (k-Cut-LSE). Let be an upper bound on the curvature constant of , and let be the accuracy parameter for LMO, then, satisfies
(A.19) |
with probability at least .
The result follows from [18, Theorem 1] when LMO generates a solution with approximation error at most with probability . Now, is an appropriately chosen constant, and from [28, Lemma 3.1], an upper bound on the curvature constant of is . Thus, after at most
(A.20) |
iterations, Algorithm 1 generates an -optimal solution to (k-Cut-LSE).
Bound on the approximate -cut value.
From Theorem A.1, we see that after at most iterations, Algorithm 1 generates a solution that satisfies the bounds in Lemma 3.1 with probability at least when . Consequently, the bound given in (A.17) also holds with probability at least . And so, the expected value of is . Finally, from (A.18), the expected value of the $k$-cut, denoted by , is bounded as
(A.21) |
where denotes the expectation over the randomness in the subproblem LMO and denotes the expectation over random Gaussian samples.
Finally, we compute an upper bound on the complexity of each iteration, i.e., inner iteration complexity, of Algorithm 1.
Upper bound on inner iteration complexity.
At each iteration , Algorithm 1 solves the subproblem LMO, which generates a unit vector , such that
(A.22) |
where , and . Note that this problem is equivalent to approximately computing maximum eigenvector of the matrix which can be done using Lanczos algorithm [22].
Lemma A.2 (Convergence of Lanczos algorithm).
Let and . For , the Lanczos method [22], computes a vector , that satisfies
(A.23) |
with probability at least , after at most iterations.
This result is an adaptation of [22, Theorem 4.2] which provides convergence of Lanczos to approximately compute minimum eigenvalue and the corresponding eigenvector of a symmetric matrix. Let . We now derive an upper bound on .
Comparing (A.23) and (A.22), we see that
(A.24) |
Substituting the value of in the equation above, and noting that , we have
(A.25) |
where the last equality follows from substituting the value of (see (A.20)). We now derive an upper bound on .
Lemma A.3.
Let , where is defined in (2.1). We have .
Proof.
For the function as defined in the lemma, , where is matrix such that for , for , and for . Thus, we have
(A.26) |
where the last inequality follows since there are at most off-diagonal and diagonal nonzero entries in the matrix with each nonzero entry in the range . Now,
(A.27) |
where (i) follows from the triangle inequality for the spectral norm of , (ii) follows from (A.26) and since is graph Laplacian and a positive semidefinite matrix, and (iii) follows by substituting as given in Lemma 3.1. ∎
Substituting , and the bound on in (A.25), we have
(A.28) | |||
(A.29) |
(A.30) |
Finally, each iteration of the Lanczos method performs a matrix-vector multiplication with , which has at most nonzero entries, and additional arithmetic operations. Thus, the computational complexity of the Lanczos method is . Moreover, Algorithm 1 performs additional arithmetic operations, so that the total inner iteration complexity is , which can be written as .
Computational complexity of Algorithm 1.
Now, substituting , we have
(A.31) |
where the inequality follows since , for and . Substituting the upper bound on in , and combining the inner iteration complexity, , and outer iteration complexity, , we see that Algorithm 1 is a -time algorithm. ∎
A.5 Proof of Lemma 4.1
Proof.
We need to prove four inequalities given in Lemma 4.1.
Lower bound on the objective, .
Bound on infeasibility.
Let and let be the dual variable such that is the dual variable corresponding to the equality constraints and is the dual variable for inequality constraints. Following (DSDP), the dual of (MA-Rel) is
(Dual-CC) |
where . Let be an optimal dual solution. We derive an upper bound on in the following lemma, which is then used to bound the infeasibility using (A.6).
Lemma A.4.
The value of is upper bounded by .
Proof.
For (Dual-CC), for , and for is a feasible solution. The optimal objective function value of (Dual-CC) is then upper bounded as
(A.33) |
We have since both matrices are PSD. Using the fact that , and rearranging the terms, we have
Since , we can write
(A.34) |
where we have used the fact that for any dual feasible solution, for all . Substituting (A.33) in (A.34),
(A.35) |
∎
A.6 Proof of Lemma 4.2
Proof.
Proposition A.2.
The proof of Proposition A.2 is the same as the proof of Proposition A.1. Now, let
(A.37) |
The objective function value of (MA-Rel) at is
(A.38) | ||||
(A.39) | ||||
(A.40) |
where (i) follows from the fact that and , (ii) follows since and are PSD and their inner product is nonnegative and the diagonal entries of are 0, (iii) follows from Lemma 4.1 and the fact that , and (iv) follows since . Combining the fact that and with the above, we have
(A.41) |
∎
A.7 Proof of Lemma 4.3
Upper bound on outer iteration complexity.
Bound on the value of generated clustering.
Algorithm 1 with generates a solution that satisfies the bounds in Lemma 4.1 with probability at least after at most iterations. Thus, the bound given in (A.40) holds with probability at least and we have
(A.43) |
Let denote the expectation over the randomness in the subproblem LMO and denote the expectation over random Gaussian samples. The expected value of the generated clustering is then bounded as
(A.44) |
where (i) follows from the fact that the value of clustering generated by CGW rounding scheme satisfies .
We now determine the inner iteration complexity of Algorithm 1.
Upper bound on inner iteration complexity.
At each iteration of Algorithm 1, the subroutine LMO (see (A.22)) is equivalent to approximately computing maximum eigenvector of the matrix . This is achieved using Lanczos method whose convergence is given in Lemma A.2. Now, let . We see that the bound on is
(A.45) |
which is similar to (A.25). We now derive an upper bound on .
Lemma A.5.
Let , where is defined in (2.1). We have , where .
Proof.
For the function as defined in the lemma, , where is matrix such that for , for , and for , and . We have
(A.46) | |||
(A.47) |
(A.48) |
where the last inequality follows since has at most nonzero entries in the range . Now, we have
(A.49) |
where (i) follows since the spectral norm of satisfies the triangle inequality, (ii) follows from (A.46), (A.47) and the fact that is a positive semidefinite matrix, and (iii) follows by substituting the value of and as given in Lemma 4.1. ∎
Substituting the bound on in (A.45), we have
(A.50) | |||
(A.51) |
(A.52) |
The computational complexity of Lanczos method is , where the term appears since Lanczos method performs matrix-vector multiplication with , whose sparsity is , plus additional arithmetic operations at each iteration. We finally write the computational complexity of each iteration of Algorithm 1 as .
Total computational complexity of Algorithm 1.
Since , we have
(A.53) |
where the inequality follows since and for . Multiplying outer and inner iteration complexity and substituting the bound on , we prove that Algorithm 1 is a -time algorithm. ∎
A.8 Proof of Lemma 5.1
For any symmetric matrix , the definition of -spectral closeness (Definition 5.1) implies
(A.54) |
Let and be the cost matrix in the objective of (k-Cut-Rel), when the problem is defined on the graphs and respectively. Since and are scaled Laplacian matrices (with the same scaling factor , from (A.54), we can write
(A.55) |
Let and be optimal solutions to (k-Cut-Rel) defined on the graphs and respectively. From (A.55), we can write,
(A.56) |
where the last inequality follows since and are feasible and optimal solutions respectively to (k-Cut-Rel) defined on the graph . Combining this with the bound in Lemma 3.2, i.e., , we get
(A.57) |
where (i) follows from (A.56), (ii) follows since for nonnegative and , and (iii) follows since for an optimal solution to (k-Cut-Rel) defined on the graph .
A.9 Proof of Lemma 5.2
Proof.
The Laplacian matrices and of the graphs and its sparse approximation respectively satisfy (A.54). Furthermore, let , where , be the Laplacian of the graph and similarly let be the Laplacian of the graph . If , from (A.54), we have
(A.58) |
Rewriting the second inequality in (A.54) for , and noting that , we have
(A.59) |
where the second inequality follows from (A.58). Let and represent the cost in (MA-Rel) and (MA-Sparse) respectively. Let be an optimal solution to (MA-Rel). The optimal objective function value of (MA-Rel) at is and
(A.60) |
where (i) follows from (A.54) and (A.59), (ii) follows from (A.58) after substituting and rearranging the terms, and (iii) holds true since , and and are feasible to (MA-Sparse) so that and . Rearranging the terms, we have
(A.61) |
Combining (A.61) with the fact that the expected value of clustering generated for the graph satisfies (4.3), we have
(A.62) |
where the last inequality follows since . ∎
A.10 Proof of Lemma 5.3
The first step of the procedure given in Section 5 is to sparsify the input graph using the technique proposed in [23], whose computational complexity is nearly linear in the number of edges. The second step, when generating solutions to Max-k-Cut and Max-Agree, is to apply the procedures given in Sections 3 and 4 respectively. The computational complexity of this step is bounded as given in Lemmas 3.3 and 4.3, leading to a polynomial-time algorithm.
Bound on the value of generated -cut.
Bound on the value of generated clustering.
Appendix B Preliminary Computational Results for Max-k-Cut
We provide some preliminary computational results for generating an approximate $k$-cut using the approach outlined in Section 3. The aim of these experiments was to verify that the bounds given in Lemma 3.2 were satisfied in practice. First, we solved (k-Cut-LSE) to $\epsilon$-optimality using Algorithm 1, with the input parameters set to fixed values. We then computed feasible samples using Algorithm 2 and finally applied the FJ rounding scheme to the generated samples. The computations were performed using MATLAB R2021a on a machine with 8GB RAM. The peak memory requirement was recorded using the profiler command in MATLAB.
We performed computations on randomly selected graphs from the GSet dataset. In each case, the infeasibility of the covariance of the generated samples was small enough to satisfy (3.4). The number of iterations of LMO in Algorithm 1 was also within the bounds given in Proposition 1.1. To generate a $k$-cut, we generated 10 sets of $k$ i.i.d. zero-mean Gaussian samples with the feasible covariance, and each set was then used to generate a $k$-cut for the input graph using the FJ rounding scheme. Table 2 shows the results for graphs from the GSet dataset with $k \in \{3, 4\}$, reporting the best of the 10 generated cuts. In each case this best cut value, compared against the SDP-based upper bound reported in Table 2, satisfied the (weaker) multiplicative bound obtained by combining (3.5) with (3.3).
Furthermore, Table 2 also shows that the memory used by our method was linear in the size of the input graph. To see this, consider the dataset G1, and note that the memory used by our method (where a factor of 8 accounts for MATLAB storing each real number in 8 bytes) scales with $n + |E|$. Similarly, for the other instances in GSet, the memory used to generate an approximate $k$-cut was bounded by a small multiple of $n + |E|$, showing the linear dependence of the memory used on the size of the input graph.
Dataset | $n$ | $|E|$ | $k$ | # Iterations | infeas | Upper bound | Best $k$-cut value | AR | Memory required (in kB)
---|---|---|---|---|---|---|---|---|---|
G1 | 800 | 19176 | 3 | 823.94 | 15631 | 14266 | 0.9127 | 1252.73 | |
G1 | 800 | 19176 | 4 | 891.23 | 17479 | 15746 | 0.9 | 1228.09 | |
G2 | 800 | 19176 | 3 | 827.61 | 15629 | 14332 | 0.917 | 1243.31 | |
G2 | 800 | 19176 | 4 | 9268.42 | 17474 | 15786 | 0.903 | 1231.07 | |
G3 | 800 | 19176 | 3 | 1242.53 | 15493 | 14912 | 0.916 | 1239.57 | |
G3 | 800 | 19176 | 4 | 1341.37 | 17301 | 15719 | 0.908 | 1240.17 | |
G4 | 800 | 19176 | 3 | 812.8 | 15660 | 14227 | 0.908 | 1230.59 | |
G4 | 800 | 19176 | 4 | 9082.74 | 17505 | 15748 | 0.899 | 1223.59 | |
G5 | 800 | 19176 | 3 | 843.5 | 15633 | 14341 | 0.917 | 1222.09 | |
G5 | 800 | 19176 | 4 | 9294.32 | 17470 | 15649 | 0.895 | 1227.9 | |
G14 | 800 | 4694 | 3 | 1240.99 | 0.002 | 3917 | 2533 | 0.646 | 3502.64 |
G14 | 800 | 4694 | 4 | 3238.42 | 0.001 | 4467.9 | 3775 | 0.844 | 519.25 |
G15 | 800 | 4661 | 3 | 3400.17 | 0.001 | 4018.6 | 3385 | 0.842 | 612 |
G15 | 800 | 4661 | 4 | 1603.13 | 0.001 | 4475.8 | 3754 | 0.838 | 648.17 |
G16 | 800 | 4672 | 3 | 33216.68 | 0.001 | 4035.7 | 3422 | 0.847 | 561 |
G16 | 800 | 4672 | 4 | 3059.11 | 0.001 | 4437.5 | 3783 | 0.852 | 2800 |
G17 | 800 | 4667 | 3 | 3526.4 | 0.001 | 4031.5 | 3414 | 0.846 | 602.81 |
G17 | 800 | 4667 | 4 | 3400.01 | 0.001 | 4440 | 3733 | 0.84 | 693.6 |
G22 | 2000 | 19990 | 3 | 7402.59 | 17840 | 11954 | 0.67 | 1340.34 | |
G22 | 2000 | 19990 | 4 | 8103.83 | 19582 | 16670 | 0.851 | 1341.67 | |
G23 | 2000 | 19990 | 3 | 3597.39 | 17938 | 15331 | 0.854 | 1360.09 | |
G23 | 2000 | 19990 | 4 | 3588.04 | 19697 | 16639 | 0.844 | 1317.09 | |
G24 | 2000 | 19990 | 3 | 4304.48 | 17913 | 15370 | 0.858 | 1341.96 | |
G24 | 2000 | 19990 | 4 | 1994.26 | 19738 | 16624 | 0.842 | 1321.59 | |
G25 | 2000 | 19990 | 3 | 9774.03 | 18186 | 15294 | 0.841 | 1311.54 | |
G25 | 2000 | 19990 | 4 | 1540.14 | 19778 | 16641 | 0.841 | 1330.95 | |
G26 | 2000 | 19990 | 3 | 2069.65 | 18012 | 15411 | 0.855 | 1321.92 | |
G26 | 2000 | 19990 | 4 | 1841.06 | 19735 | 16609 | 0.841 | 1331.53 | |
G43 | 1000 | 9990 | 3 | 894.53 | 9029 | 7785 | 0.862 | 661.09 | |
G43 | 1000 | 9990 | 4 | 9709.68 | 9925 | 8463 | 0.852 | 665.59 | |
G44 | 1000 | 9990 | 3 | 721.64 | 9059.5 | 7782 | 0.859 | 661.09 | |
G44 | 1000 | 9990 | 4 | 9294.43 | 9926.1 | 8448 | 0.851 | 765.37 | |
G45 | 1000 | 9990 | 3 | 794.84 | 9038.4 | 7773 | 0.86 | 661.09 | |
G45 | 1000 | 9990 | 4 | 9503.74 | 9929.7 | 8397 | 0.845 | 669 | |
G46 | 1000 | 9990 | 3 | 703.4 | 9068.5 | 7822 | 0.862 | 661.09 | |
G46 | 1000 | 9990 | 4 | 9684.93 | 9929.9 | 8333 | 0.839 | 657.09 | |
G47 | 1000 | 9990 | 3 | 777.61 | 9059.4 | 7825 | 0.863 | 679.89 | |
G47 | 1000 | 9990 | 4 | 9789.55 | 9930.8 | 8466 | 0.852 | 661.09 |
Appendix C Additional Computational Results for Correlation Clustering
We provide here the computational results for the graphs from the GSet dataset not included in the main article. We performed computations for graphs G1-G57 from the GSet dataset. The instances for which we were able to generate an $\epsilon$-optimal solution to (MA-LSE) are given in Table 3, with the parameters set as given in Section 6. For the remaining instances, we were not able to generate an $\epsilon$-optimal solution after 30 hours of runtime.
Dataset | $n$ | $|E^+|$ | $|E^-|$ | # Iterations | infeas | Upper bound | Best clustering value | AR | Memory required (in kB)
---|---|---|---|---|---|---|---|---|---|
G2 | 800 | 2501 | 16576 | 681.65 | 848.92 | 643.13 | 0.757 | 1520.18 | |
G3 | 800 | 2571 | 16498 | 677.56 | 835.05 | 634.83 | 0.76 | 1529.59 | |
G4 | 800 | 2457 | 16622 | 665.93 | 852.18 | 647.37 | 0.76 | 1752 | |
G5 | 800 | 2450 | 16623 | 646.4 | 840.63 | 636.21 | 0.756 | 1535.92 | |
G6 | 800 | 9665 | 9511 | 429.9 | 25766 | 21302 | 0.826 | 1664 | |
G7 | 800 | 9513 | 9663 | 423.58 | 26001 | 20790 | 0.799 | 1535.06 | |
G8 | 800 | 9503 | 9673 | 421.34 | 26005 | 21080 | 0.81 | 4284 | |
G9 | 800 | 9556 | 9620 | 426.4 | 25903 | 21326 | 0.823 | 1812 | |
G10 | 800 | 9508 | 9668 | 426.25 | 25974 | 21412 | 0.824 | 1535.59 | |
G12 | 800 | 798 | 802 | 393.69 | 3023.4 | 2034 | 0.672 | 444.06 | |
G13 | 800 | 817 | 783 | 416.29 | 3001.1 | 2010 | 0.669 | 613.03 | |
G15 | 800 | 3801 | 825 | 284.77 | 529.83 | 401.19 | 0.757 | 460.17 | |
G16 | 800 | 3886 | 749 | 228.12 | 524.69 | 417.88 | 0.796 | 451.07 | |
G17 | 800 | 3899 | 744 | 2448.633 | 536.65 | 369.04 | 0.687 | 480.45 | |
G18 | 800 | 2379 | 2315 | 1919.44 | 7237.6 | 5074 | 0.701 | 434.67 | |
G19 | 800 | 2274 | 2387 | 2653.79 | 7274.2 | 5130 | 0.705 | 496 | |
G20 | 800 | 2313 | 2359 | 1881.75 | 7258.1 | 5186 | 0.714 | 406.09 | |
G21 | 800 | 2300 | 2367 | 1884.97 | 7281.3 | 5238 | 0.719 | 467.26 | |
G23 | 2000 | 120 | 19855 | 550.77 | 1802.2 | 1373.2 | 0.762 | 1651.54 | |
G24 | 2000 | 96 | 19875 | 812.16 | 1811.2 | 1384.6 | 0.764 | 1678.04 | |
G25 | 2000 | 109 | 19872 | 1739.06 | 1801.8 | 1398.1 | 0.776 | 1650.48 | |
G26 | 2000 | 117 | 19855 | 1125.74 | 1789.9 | 1356.9 | 0.758 | 1650.01 | |
G27 | 2000 | 9974 | 10016 | 464.93 | 30502 | 22010 | 0.721 | 1647.09 | |
G28 | 2000 | 9943 | 10047 | 553.65 | 30412 | 22196 | 0.729 | 1317.78 | |
G29 | 2000 | 10035 | 9955 | 513.97 | 30366 | 23060 | 0.759 | 1310.46 | |
G30 | 2000 | 10045 | 9945 | 594.09 | 30255 | 22550 | 0.745 | 1310.48 | |
G31 | 2000 | 9955 | 10035 | 1036.9 | 29965 | 22808 | 0.761 | 1305.05 | |
G33 | 2000 | 1985 | 2015 | 403.75 | 7442 | 4404 | 0.591 | 634.93 | |
G34 | 2000 | 1976 | 2024 | 863.53 | 7307.2 | 4760 | 0.651 | 574.12 | |
G44 | 1000 | 229 | 9721 | 515.18 | 810.82 | 616.61 | 0.76 | 655.09 | |
G45 | 1000 | 218 | 9740 | 504.91 | 812.21 | 615.84 | 0.758 | 660.51 | |
G46 | 1000 | 237 | 9723 | 469.6 | 818.39 | 623.95 | 0.762 | 655.09 | |
G47 | 1000 | 230 | 9732 | 495.24 | 819.63 | 621.65 | 0.758 | 648.32 | |
G49 | 3000 | 0 | 6000 | 1002.59 | 0.003 | 599.64 | 456.48 | 0.761 | 733 |
G50 | 3000 | 0 | 6000 | 996.19 | 0.004 | 599.64 | 455.78 | 0.76 | 540.26 |
G52 | 1000 | 4750 | 1127 | 2041.8 | 0.001 | 684.1 | 441.02 | 0.644 | 757.59 |
G53 | 1000 | 4820 | 1061 | 785.33 | 695.53 | 445.03 | 0.639 | 417.07 | |
G54 | 1000 | 4795 | 1101 | 2899.99 | 686.8 | 482.57 | 0.702 | 517.09 | |
G56 | 5000 | 6222 | 6276 | 1340.35 | 0.004 | 22246 | 12788 | 0.574 | 1243.98 |