lemmaLemma \newtheoremreptheorem[lemma]Theorem \newtheoremrepproposition[lemma]Lemma
A Framework to Design Approximation Algorithms for Finding Diverse Solutions in Combinatorial Problems
Abstract
Finding a single best solution is the most common objective in combinatorial optimization problems. However, such a single solution may not be applicable to real-world problems as objective functions and constraints are only “approximately” formulated for original real-world problems. To solve this issue, finding multiple solutions is a natural direction, and diversity of solutions is an important concept in this context. Unfortunately, finding diverse solutions is much harder than finding a single solution. To cope with difficulty, we investigate the approximability of finding diverse solutions. As a main result, we propose a framework to design approximation algorithms for finding diverse solutions, which yields several outcomes including constant-factor approximation algorithms for finding diverse matchings in graphs and diverse common bases in two matroids and PTASes for finding diverse minimum cuts and interval schedulings.
1 Introduction
One way to solve a real-world problem is to formulate the problem as a mathematical optimization problem and find a solution with an optimization algorithm. However, it is not always easy to formulate an appropriate optimization problem as real-world problems often include intricate constraints and implicit preferences, which are usually simplified in order to solve optimization problems. Hence, an optimal solution obtained in this way is not guaranteed to be a “good solution” to the original real-world problem. To cope with this underlying inconsistency, the following two-stage approach would be promising: algorithms find multiple solutions and then users find what they like from these solutions. One may think that top- enumeration algorithms (see (Eppstein, 2008) for a survey) can be used for this purpose. However, this is not always the case since top- enumeration algorithms may output solutions similar to one another. (See (Wang et al., 2013; Yuan et al., 2016; Hao et al., 2020), for example). Such a set of solutions are not useful as a “catalog” of solutions provided to users.
As a way to solve this issue, algorithms are expected to find “diverse” solutions, and algorithms for finding “diverse” solutions have received considerable attention in many fields such as artificial intelligence (Hanaka et al., 2021b, a; Baste et al., 2022), machine learning (Gillenwater et al., 2015), and data mining (Wang et al., 2013; Yuan et al., 2016). There are many directions in the research on finding diverse solutions, depending on definitions of solutions and diversity measures. Given these rich applications, the diverse X paradigm was proposed by Fellows and Rosamond in Dagstuhl Seminar 18421 (Fernau et al., 2019). In this paradigm, “X” is a placeholder that represents solutions we are looking for, and they asked for theoretical investigations of finding diverse solutions. Since the problem of finding diverse solutions is much harder than that of finding a single solution for some “X”, it would be reasonable to consider the problem from the perspective of fixed-parameter tractability. From this proposition, several fixed-parameter tractable (FPT) algorithms are developed. Baste et al. gave algorithms for finding diverse solutions related to hitting sets (Baste et al., 2019) and those on bounded-treewidth graphs (Baste et al., 2022). Hanaka et al. (Hanaka et al., 2021b) proposed a framework to obtain FPT algorithms for finding diverse solutions in various combinatorial problems. Fomin et al. (Fomin et al., 2020, 2021) investigated the fixed-parameter tractability of finding diverse solutions related to matchings and matroids. In these results, the running time bounds of these FPT algorithms exponentially depend on the number of solutions we are looking for, which would be prohibitive for computing moderate numbers of solutions.
In this paper, we aim to develop theoretically efficient algorithms for finding moderate numbers of diverse solutions. As we mentioned, the problem of finding diverse solutions is harder than that of finding a single solution. For example, the problem of computing a maximum matching in a graph is known to be solvable in polynomial time, whereas that of computing two maximum matchings and maximizing is known to be NP-hard (Fomin et al., 2020). In this paper, we aim to develop theoretically efficient algorithms for finding moderate numbers of diverse solutions.
Our main result is a framework for designing efficient approximation algorithms with constant approximation factors for finding diverse solutions in combinatorial problems. We employ the sum of pairwise weighted Hamming distances among solutions as our diversity measure (see Section 2 for its definition), while some previous work (Fomin et al., 2021) employs the minimum of weighted Hamming distances. Roughly speaking, our approximation framework says that if we can enumerate top- weighted solutions in polynomial time, then we can obtain in polynomial time unweighted solutions maximizing our diversity measure with constant approximation factors. Moreover, suppose that we can exactly maximize our diversity of solutions in polynomial time when the number of solutions we are looking for is bounded by a constant. Then, our framework yields a polynomial-time approximation scheme (PTAS), meaning that factor- approximation in polynomial time for every constant . By applying our framework, we obtain efficient constant-factor approximation algorithms for finding diverse solutions of matchings in a graph and of common bases of two matroids, while PTASes for finding diverse solutions of minimum cuts and of interval schedulings. Let us note that these diversity maximization problems are unlikely to be solvable in polynomial time, which will be discussed later.
2 Preliminaries
We denote the set of real numbers, the set of non-negative real numbers, and the set of positive real numbers as , , and , respectively. Let be a set. We denote the set of all subsets of as .
A function is called a metric (on ) if it satisfies the following conditions: for , (1) if and only if ; (2) ; (3) . Suppose that for some integer . For , we denote by the th component of . If holds for , then is called an -metric.
Let be a finite set. For , the symmetric difference between and is denoted by (i.e., ). Let . A weighted Hamming distance is a function such that for , , where for . Suppose that . We can regard each subset as an -dimensional vector defined by if and otherwise, for . It is easy to observe that for , , where and are the vectors corresponding to and , respectively. Thus, the weighted Hamming distance can be considered as an -metric.
In this paper, we focus on the following diversity measure , called the sum diversity. Let be a collection of subsets of and be a weight function. We define
Our problem Max-Sum Diverse Solutions is defined as follows.
Definition 1 (Max-Sum Diverse Solutions).
Given a finite set , an integer , a weight function , and a membership oracle for , the task of Max-Sum Diverse Solutions is to find a set of distinct subsets that maximizes the sum diversity .
Each set in is called a feasible solution. In Max-Sum Diverse Solutions, the set of feasible solutions is not given explicitly, while we can test whether a set belongs to . Our problem Max-Sum Diverse Solutions is highly related to the problem of packing disjoint feasible solutions.
Observation 1.
Suppose that all sets in have the same cardinality and for . Let be distinct subsets. Then, if and only if for .
This observation implies several hardness results of Max-Sum Diverse Solutions, which will be discussed in Section 4.
We particularly focus on the approximability of Max-Sum Diverse Solutions for specific sets of feasible solutions. For a maximization problem, we say that an approximation algorithm has factor if given an instance , the algorithm outputs a solution with objective value such that , where is the optimal value for . A polynomial-time approximation scheme is an approximation algorithm that takes an instance and a constant , the algorithm outputs a solution with in polynomial time.
2.1 A technique for Max-Sum Diversification
Our framework is based on approximation algorithms for a similar problem Max-Sum Diversification. Let be a set and let be a metric. In what follows, for , we denote as .
Definition 2 (Max-Sum Diversification).
Given a metric on a finite set and an integer , the task of Max-Sum Diversification is to find a subset with that maximizes .
Max-Sum Diversification is studied under various names such as MAX-AVG Facility Dispersion and Remote-clique (Cevallos et al., 2019; Ravi et al., 1994). Max-Sum Diversification is known to be \NP-hard (Ravi et al., 1994). Cevallos et al. (Cevallos et al., 2019) devised a PTAS for Max-Sum Diversification. Their algorithm is based on a rather simple local search technique, but their analysis of the approximation factor and the iteration bound are highly nontrivial. Our framework for Max-Sum Diverse Solutions is based on their algorithm, which is briefly sketched below.
A pseudocode of the algorithm due to (Cevallos et al., 2019) is given in Algorithm 1. In this algorithm, we first pick an arbitrary set of elements in , which is denoted by . Then, we find a pair of elements and that maximizes and update by if . We repeat this update procedure times. Since we can find a pair in time, where is the running time to evaluate the distance function for , the following lemma holds.
Lemma 3.
Algorithm 1 runs in time .
They showed that if the metric is negative type, then the approximation ratio of Algorithm 1 is at least (Cevallos et al., 2019). Here, we do not give the precise definition of a negative type metric but mention that every -metric is negative type (Deza and Laurent, 1997; Cevallos et al., 2016).
Theorem 4 ((Cevallos et al., 2019)).
If is a negative type metric, then the approximation ratio of Algorithm 1 is .
They further observed that the above theorem implies that Max-Sum Diversification admits a PTAS as follows. Let be a positive constant. When , that is, , then is constant. Thus, we can solve Max-Sum Diversification in time by using a brute-force search. Otherwise, the above -approximation algorithm achieves factor . Thus, Max-Sum Diversification admits a PTAS, provided that is a negative type metric.
Corollary 5 ((Cevallos et al., 2019)).
If is a negative type metric, then Max-Sum Diversification admits a PTAS.
3 A framework for finding diverse solutions
In this section, we propose a framework for designing approximation algorithms for Max-Sum Diverse Solutions. The basic strategy to our framework is the local search algorithm described in the previous section. Let be a finite set and let be a set of feasible solutions. We set and apply the local search algorithm for Max-Sum Diversification to . Recall that our diversity measure is the sum of weighted Hamming distances . Moreover, is an -metric, as observed in the previous section. By Theorem 4, the local search algorithm for Max-Sum Diversification has approximation factor . However, the running time of a straightforward application of Lemma 3 is even if the feasible solutions in can be enumerated in total time, which may be exponential in the input size .
A main obstacle to applying the local search algorithm is that from a current set of feasible solutions, we need to find a pair of feasible solutions maximizing . To overcome this obstacle, we exploit top- enumeration algorithms. Let be a weight function. An algorithm is called a top- enumeration algorithm for if for a positive integer , finds feasible solutions such that for any and , holds. By using , we can compute the pair as follows.
We first guess in the pair and let . To find the pair , it suffices to find that maximizes . For an element , we define a new weight , where (resp. ) is the number of feasible solutions in that contain (resp. do not contain ). For notational convenience, we fix and write and to denote and , respectively. The following lemma shows that a feasible solution that maximizes also maximizes .
Lemma 6.
For any feasible solution , .
Proof.
The contribution of to is if , and otherwise. Thus, contributes to . Similarly, contributes to . This gives us as follows.
From the above lemma, we can find the pair with a top- enumeration algorithm for as follows. By Lemma 6, for any feasible solution , . Since the second term does not depend on , to find a feasible solution maximizing the left-hand side, it suffices to maximize subject to . The algorithm allows us to find feasible solution such that for any feasible solution other than . As , at least one of these solutions provides such a solution .
The entire algorithm is as follows. We first find a set of distinct feasible solutions in using the enumeration algorithm . Then, we repeat the local update procedure described above times. Suppose that enumerates feasible solutions in time for some constant . Then, the entire algorithm runs in time as we can compute the pair in time by simply guessing .
Note that the approximation factor does not give a reasonable bound for . In this case, however, we still have an approximation factor with a greedy algorithm for Max-Sum Diversification (Birnbaum and Goldman, 2009), which is described as follows. Initially, we set with arbitrary . Then, we compute a feasible solution maximizing . By Lemma 6 and the above discussion, we can find such a solution with a top- enumeration algorithm for , where for . We repeat this times so that contains feasible solutions. As discussed in this section, the approximation factor of this algorithm is as in (Birnbaum and Goldman, 2009). Thus, the following theorem holds.
Theorem 7.
Let be a finite set, , and . Suppose that there is a top- enumeration algorithm for that runs in time for a constant, where is an arbitrary weight function. Then, there is an -approximation algorithm for Max-Sum Diverse Solutions that runs in time, where . Moreover, if there is a polynomial-time exact algorithm for Max-Sum Diverse Solutions for constant , then it admits a PTAS.
4 Applications of the framework
To complete the description of approximation algorithms based on our framework, we need to develop top- enumeration algorithms for specific problems. In what follows, we design top- enumeration algorithms for matchings, common bases of two matroids, and interval schedulings.
Our top- enumeration algorithms are based on a well-known technique used in (Lawler, 1972) (also discussed in (Eppstein, 2008)). The key to enumeration algorithms is the following Weighted Extension.
Definition 8 (Weighted Extension).
Given a finite set , a set of feasible solutions as a membership oracle, a weight function , and a pair of disjoint subsets and of , the task is to find a feasible solution that satisfies and maximizing subject to these conditions.
If we can solve the above problem in time, then we can obtain a top- enumeration algorithm for that runs in time.
Lemma 9 ((Lawler, 1972)).
Suppose that Weighted Extension for can be solved in time. Then, there is an -time top- enumeration algorithm for .
4.1 Matchings
Matching is one of the most fundamental combinatorial objects in graphs, and the polynomial-time algorithm for computing a maximum weight matching due to (Edmonds, 1965) is a cornerstone result in this context. Finding diverse matchings has also been studied so far (Hanaka et al., 2021b, a; Fomin et al., 2020, 2021). Let be a graph. A set of edges is a matching of if has no pair of edges that share a common endpoint. A matching is called a perfect matching of if every vertex in is incident to an edge in . By using our framework, we design an approximation algorithm for finding diverse matchings. The formal definition of the problem is as follows.
Definition 10 (Diverse Matchings).
Given a graph , a weight function , and integers and , the task of Diverse Matchings is to find distinct matchings of size that maximize .
To apply our framework, it suffices to show that Weighted Extension for matchings can be solved in polynomial time. Our method is similar to a reduction from the maximum weight perfect matching problem to the maximum weight matching problem (Duan and Pettie, 2014). Let be disjoint subsets of edges and let . Then, our goal is to find a matching of with such that and , and maximizes subject to these constraints. This problem can be reduced to that of finding a maximum weight perfect matching as follows. We assume that In is a matching of as otherwise there is no matching containing it. Let be the graph obtained from by removing (1) all edges in Ex and (2) all end vertices of edges in In. Then, it is easy to see that is a matching of with and if and only if is a matching of . Thus, it suffices to find a maximum weight matching of size exactly in . To this end, we add vertices to and add all possible edges between vertices and . The graph obtained in this way is denoted by , where . We extend the weight function by setting for . Then, the following lemma holds.
Lemma 11.
Let be a maximum weight perfect matching in . Then, is a matching of size in such that for every size- matching in , it holds that .
Proof.
Since is a perfect matching and any edge incident to is contained in , must contain exactly edges of . This implies that the perfect matching contains exactly edges of . Suppose that there is a size- matching in such that . As every vertex in is adjacent to , we can choose exactly a set of edges between and so that forms a perfect matching in . Then, we have as , contradicting the fact that is a maximum weight perfect matching of . ∎
Thus, we can solve Weighted Extension for a size- matching in polynomial time (Edmonds, 1965). By Theorem 7 and Lemma 9, we have the following theorem.
Theorem 12.
There is a polynomial-time approximation algorithm for Diverse Matchings with approximation factor .
4.2 Common bases of two matroids
Let be a finite set and let a non-empty family of subsets of . The pair is a matroid if (1) for each , every subset of is included in and (2) if and , then there exists an element such that . Each set in is called an independent set of . An inclusion-wise maximal independent set of is a base of . Because of condition (2), all bases in have the same cardinality. For two matroids and , a subset is a common base of and if is a base of both and . In this subsection, we give an approximation algorithm for diverse common bases of two matroids.
Definition 13 (Diverse Matroid Common Bases).
Given two matroids and as membership oracles, a weight function , and an integer , the task of Diverse Matroid Common Bases is to find distinct common bases of and that maximize .
Given two matroids and as membership oracles, the problem of partitioning into common bases of and is a notoriously hard problem, which requires an exponential number of membership queries (Bérczi and Schwarcz, 2021). This fact together with 1 implies that Diverse Matroid Common Bases cannot be solved with polynomial number of membership queries in our problem setting. Given this fact, we develop a constant-factor approximation algorithm for Diverse Matroid Common Bases. To this end, we show that Weighted Extension for common bases of two matroids can be solved in polynomial time.
Similarly to the case of matchings, we can find a maximum weight common base subject to and for given disjoint , which is as follows. Let be a matroid. For , we let , where . Then, is a matroid (see (Oxley, 2006)). Similarly, for , we let , where . Then is also a matroid (see (Oxley, 2006)). For two matroids and , we consider two matroids and . For every independent set in and , does not contain any element in and is an independent set in both and . Thus, Weighted Extension can be solved by computing a maximum weight common base in and , which can be solved in polynomial time (see Theorem 41.7 in (Schrijver, 2003)). By Theorem 7 and Lemma 9, the following theorem holds.
Theorem 14.
There is a polynomial-time approximation algorithm for Diverse Matroid Common Bases with approximation factor , provided that the membership oracles for and can be evaluated in polynomial time.
4.3 Minimum cuts
Let be a graph. A partition of into two non-empty sets and is called a cut of . For a cut of , the set of edges having one end in and the other end in is denoted by . When no confusion arises, we may refer to as a cut of . The size of a cut is defined by . A cut is called a minimum cut of if there is no cut of with . In this section, we consider the following problem.
Definition 15 (Diverse Minimum Cuts).
Given a graph with an edge-weight function and an integer , the task of Diverse Minimum Cuts is to find distinct minimum cuts of that maximize .
An important observation for this problem is that the number of minimum cuts of any graph is (Karger, 2000). Moreover, we can enumerate all minimum cuts in a graph in polynomial time (Yeh et al., 2010; Vazirani and Yannakakis, 1992). Thus, we can solve both Weighted Extension for minimum cuts and Diverse Minimum Cuts for constant in polynomial time, yielding a PTAS for Diverse Minimum Cuts.
Theorem 16.
Diverse Minimum Cuts admits a PTAS.
Given this, it is natural to ask whether Diverse Minimum Cuts admits a polynomial-time algorithm. However, we show that Diverse Minimum Cuts is \NP-hard even if has a cut of size . Let be the size of a minimum cut of .
Theorem 17.
Diverse Minimum Cuts is \NP-hard even if .
The \NP-hardness is shown by performing a polynomial-time reduction from the maximum independent set problem on cubic graphs, which is known to be \NP-complete (Garey et al., 1974). For a graph , we denote by the maximum size of an independent set of . Let be a graph in which every vertex has degree exactly . Let be the graph obtained from by subdividing each edge twice, that is, each edge is replaced by a path of three edges. The set of vertices in that do not appear in is denoted by . The following folklore lemma ensures that the value of increases exactly by .
Lemma 18 (folklore).
Let be the number of edges in . Then, .
We construct a graph from by adding a new vertex and adding an edge between and each vertex in . Note that the degree of in is more than .
Lemma 19.
has edge-disjoint cuts of size if and only if has an independent set of size .
Proof.
Suppose first that has an independent set of size . Since every vertex in appears also in , we can construct a cut of the form for each . As is an independent set of , these cuts are edge-disjoint. Moreover, these cuts have exactly three edges since every vertex in has degree in . Thus, has edge-disjoint cuts of size .
Conversely, suppose has edge-disjoint cuts with for . It suffices to prove that each of these cuts forms for some . Let for some . Without loss of generality, we assume that . In the following, we show that contains exactly one vertex. Since every vertex in is adjacent to , contains at most three vertices of . Suppose first that . Since every vertex of has a neighbor in , has a vertex in that has a neighbor in . However, as every vertex of is adjacent to , there are at least four edges between and , contradicting to the fact that .
Suppose next that . Let be distinct. If is not adjacent to , there are two vertices and in that are adjacent to and , respectively. This implies contains four edges , yielding a contradiction. Thus, is adjacent to . Let and be the vertices in that are adjacent to and , respectively. Observe that at least one of and , say , belongs to as otherwise there are four edges () between and . Since and has three neighbors in , the other two neighbors of belongs to , which ensures at least four edges between and .
Suppose that . Let . In this case, we show that . To see this, consider the neighbors of . Since and , at least two neighbors of , which are and a vertex in , belong to . If the other neighbor is in , then by the assumption that , the two neighbors of other than belong to , which implies there are at least four edges between and . Thus, all the neighbors of belong to . Since is connected, all the vertices except for belong to as well. Thus, we have .
Finally, suppose that . In this case, at least one vertex of is included in . Let . Since , every neighbor of belongs to . Similarly to the previous case, we have , which completes the proof. ∎
Note that the proof of Theorem 17 also shows that the graph constructed in the reduction has no cut of size at most two. Therefore, by Lemmas 18 and 19, Theorem 17 follows.
When , then Diverse Minimum Cuts is trivially solvable in linear time as the problem can be reduced to finding all bridges in . If , the problem is slightly nontrivial, which in fact is solvable in polynomial time as well.
Theorem 20.
Diverse Minimum Cuts can be solved in time, provided that .
We reduce the problem to that of finding a subgraph of prescribed size with maximizing the sum of convex functions on their degrees of vertices.
Theorem 21 ((Apollonio and Sebö, 2009)).
Given an undirected graph , an integer , and convex functions for , the problem of finding -edge subgraph of maximizing is solvable in polynomial time, where is the degree of in .
We first enumerate all minimum cuts of in polynomial time. If has no minimum cuts, then the instance is trivially infeasible. Suppose otherwise. We construct a graph whose vertex set corresponds to , and the edge set of is defined as follows. For each pair , we add an edge between and to if is a cut of . Obviously, the graph can be constructed in polynomial time. For each , we let for and for . Clearly, the function is convex. Let be minimum cuts of . For each , we denote by the number of occurrences of among . Since each edge in contributes to , we immediately have the following lemma.
Lemma 22.
has a subgraph of edges such that if and only if there are edge cuts of with for such that .
By Lemma 22 and Theorem 21, Diverse Minimum Cuts can be solved in time, proving Theorem 20.
4.4 Interval schedulings
For a pair of integers and with , the set of all numbers between and is denoted by . We call an interval. For a pair of intervals and , we say that overlaps if . For a set of intervals , we say that is a valid scheduling (or simply a scheduling) if for any pair of intervals , does not overlap . In particular, we call an -scheduling if for . In this section, we deal with the following problem.
Definition 23 (Diverse Interval Schedulings).
Given a set of intervals , a weight function , and integers and , the task of Diverse Interval Schedulings is to find distinct -schedulings that maximize .
Since the problem of partitioning a set of intervals into scheduling such that each has exactly intervals is known to be NP-hard (Bodlaender and Jansen, 1995; Gardi, 2009)111Note that the NP-hardness is proven for the case that each has at most intervals, but a simple reduction proves the NP-hardness of this variant., by 1, we have the following theorem.
Theorem 24.
Diverse Interval Schedulings is \NP-hard.
To apply Theorem 7 to Diverse Interval Schedulings, it suffices to give a polynomial-time algorithm for Weighted Extension for interval schedulings. Observe that if is not a scheduling, then there is no scheduling containing . Observe also that we can remove all intervals included in or overlapping some interval in . Thus, the problem can be reduced to the one for finding a maximum weight scheduling with cardinality . This problem can be solved in polynomial time by using a simple dynamic programming approach.
Lemma 25.
Given a set and and , there is a polynomial-time algorithm finding a maximum weight -scheduling in time.
Proof.
The algorithm is analogous to that to find a maximum weight independent set on interval graphs, which is roughly sketched as follows. We assume that is sorted with respect to their right end points. We define as the maximum total weight of a -scheduling in such that for and . Then, the values of for all and can be computed by a standard dynamic programming algorithm in time . ∎
By Theorem 7 and Lemma 9, we obtain a polynomial-time approximation algorithm for Diverse Interval Schedulings with factor .
Finally, we show that Diverse Interval Schedulings can be solved in polynomial time for fixed using a dynamic programming approach, which implies a PTAS for Diverse Interval Schedulings.
Similarly to the proof of Lemma 25, assume that is sorted with respect to their right end points. Let . For each , we consider a tuple , where and are vectors in and , respectively, and is a subset of . Clearly, the number of tuples is , which is polynomial when is a constant. We denote by and the th component of and , respectively. For a tuple , the value is the maximum value of for schedulings under the following four conditions: (1) the maximum index of an interval in is ( if ); (2) for , the maximum index of an interval in is ( if ); (3) for , ; and (4) for , if and only if and are distinct.
We define if no such a set of scheduings exists. When and , there is a set of distinct -schedulings that have the sum diversity unless . For a tuple , we say that a set of schedulings is valid for if it satisfies the above four conditions. Hence, among the tuples of the form with and , is the optimal value for Diverse Interval Schedulings. We next explain the outline of our dynamic programming algorithm to compute for any .
As a base case, , , , and if and only if . Let be a tuple that satisfies the following conditions: (1); (2) for any , and ; and (3) . We say that a tuple satisfying the above conditions is dominated by . We denote the set of tuples dominated by as . Let . A tuple is valid for if satisfies the following conditions: (1) ; (2) if and , then interval does not overlap with ; (3) if , , otherwise, ; and (4) with . We denote the set of valid tuples for as . We compute using the following lemma.
Lemma 26.
For a tuple ,
Proof.
Let . Let be a valid set of schedulings with . Then, is a valid set of scheduings for . Moreover, as contributes to the diversity. Thus, the left-hand side is at most the right-hand side.
Conversely, let be a tuple maximizing the left-hand side and let be a valid set of schedulings for . For each , we set if and otherwise. By condition (2) in the definition of a valid tuple, each interval in does not overlap with , meaning that is a scheduling. Thus, the right-hand side is at most the left-hand side. ∎
From the above lemma, we can compute for any in polynomial time when is a constant. Moreover, from , we can find schedulings with the maximum sum diversity by a standard trace back technique. Combining the approximation algorithm and the above algorithm, we obtain a PTAS.
Theorem 27.
Diverse Interval Schedulings admits a PTAS.
5 Conclusion
In this paper, we give a framework for designing approximation algorithms for Max-Sum Diverse Solutions. This framework runs in time and versatile, which can be applied to the diverse version of several well-studied combinatorial problems, i.e., Diverse Matchings, Diverse Matroid Common Bases, Diverse Minimum Cuts, and Diverse Interval Schedulings. The key to applying our framework is the polynomial-time solvability of Weighted Extension, which yields constant-factor approximation algorithms for Diverse Matchings and Diverse Matroid Common Bases. Moreover, we obtain a PTAS for Max-Sum Diverse Solutions if we can solve the problem in polynomial time for fixed , yielding PTASes for Diverse Minimum Cuts and Diverse Interval Schedulings.
References
- Apollonio and Sebö [2009] Nicola Apollonio and András Sebö. Minconvex factors of prescribed size in graphs. SIAM J. Discret. Math., 23(3):1297–1310, 2009.
- Baste et al. [2019] Julien Baste, Lars Jaffke, Tomás Masarík, Geevarghese Philip, and Günter Rote. FPT algorithms for diverse collections of hitting sets. Algorithms, 12(12):254, 2019.
- Baste et al. [2022] Julien Baste, Michael R. Fellows, Lars Jaffke, Tomáš MasaÅÃk, Mateus de Oliveira Oliveira, Geevarghese Philip, and Frances A. Rosamond. Diversity of solutions: An exploration through the lens of fixed-parameter tractability theory. Artificial Intelligence, 303:103644, 2022.
- Bérczi and Schwarcz [2021] Kristóf Bérczi and Tamás Schwarcz. Complexity of packing common bases in matroids. Math. Program., 188(1):1–18, 2021.
- Birnbaum and Goldman [2009] Benjamin E. Birnbaum and Kenneth J. Goldman. An improved analysis for a greedy remote-clique algorithm using factor-revealing LPs. Algorithmica, 55(1):42–59, 2009.
- Bodlaender and Jansen [1995] Hans L. Bodlaender and Klaus Jansen. Restrictions of graph partition problems. part I. Theor. Comput. Sci., 148(1):93–109, 1995.
- Cevallos et al. [2016] Alfonso Cevallos, Friedrich Eisenbrand, and Rico Zenklusen. Max-Sum Diversity Via Convex Programming. In 32nd International Symposium on Computational Geometry (SoCG 2016), volume 51 of Leibniz International Proceedings in Informatics (LIPIcs), pages 26:1–26:14, Dagstuhl, Germany, 2016. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
- Cevallos et al. [2019] Alfonso Cevallos, Friedrich Eisenbrand, and Rico Zenklusen. An improved analysis of local search for max-sum diversification. Math. Oper. Res., 44(4):1494–1509, 2019.
- Deza and Laurent [1997] Michel Marie Deza and Monique Laurent. Geometry of cuts and metrics, volume 2. Springer, 1997.
- Duan and Pettie [2014] Ran Duan and Seth Pettie. Linear-time approximation for maximum weight matching. J. ACM, 61(1), jan 2014.
- Edmonds [1965] Jack Edmonds. Paths, trees, and flowers. Canadian J. Math., 17:449â467, 1965.
- Eppstein [2008] David Eppstein. k-best enumeration. In Encyclopedia of Algorithms, pages 1–4. Springer US, Boston, MA, 2008.
- Fernau et al. [2019] Henning Fernau, Petr Golovach, Marie-France Sagot, et al. Algorithmic enumeration: Output-sensitive, input-sensitive, parameterized, approximative (dagstuhl seminar 18421). In Dagstuhl Reports, volume 8. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019.
- Fomin et al. [2020] Fedor V. Fomin, Petr A. Golovach, Lars Jaffke, Geevarghese Philip, and Danil Sagunov. Diverse pairs of matchings. In 31st International Symposium on Algorithms and Computation, ISAAC 2020, December 14-18, 2020, Hong Kong, China (Virtual Conference), volume 181 of LIPIcs, pages 26:1–26:12. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020.
- Fomin et al. [2021] Fedor V. Fomin, Petr A. Golovach, Fahad Panolan, Geevarghese Philip, and Saket Saurabh. Diverse collections in matroids and graphs. In 38th International Symposium on Theoretical Aspects of Computer Science, STACS 2021, March 16-19, 2021, Saarbrücken, Germany (Virtual Conference), volume 187 of LIPIcs, pages 31:1–31:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
- Gardi [2009] Frédéric Gardi. Mutual exclusion scheduling with interval graphs or related classes, part I. Discret. Appl. Math., 157(1):19–35, 2009.
- Garey et al. [1974] M. R. Garey, David S. Johnson, and Larry J. Stockmeyer. Some simplified np-complete problems. In Robert L. Constable, Robert W. Ritchie, Jack W. Carlyle, and Michael A. Harrison, editors, Proceedings of the 6th Annual ACM Symposium on Theory of Computing, April 30 - May 2, 1974, Seattle, Washington, USA, pages 47–63. ACM, 1974.
- Gillenwater et al. [2015] Jennifer Gillenwater, Rishabh K. Iyer, Bethany Lusch, Rahul Kidambi, and Jeff A. Bilmes. Submodular hamming metrics. In Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pages 3141–3149, 2015.
- Hanaka et al. [2021a] Tesshu Hanaka, Yasuaki Kobayashi, Kazuhiro Kurita, See Woo Lee, and Yota Otachi. Computing diverse shortest paths efficiently: A theoretical and experimental study, 2021.
- Hanaka et al. [2021b] Tesshu Hanaka, Yasuaki Kobayashi, Kazuhiro Kurita, and Yota Otachi. Finding diverse trees, paths, and more. Proceedings of the AAAI Conference on Artificial Intelligence, 35(5):3778–3786, May 2021.
- Hao et al. [2020] Fei Hao, Zheng Pei, and Laurence T. Yang. Diversified top-k maximal clique detection in social internet of things. Future Generation Computer Systems, 107:408–417, 2020.
- Karger [2000] David R. Karger. Minimum cuts in near-linear time. J. ACM, 47(1):46â76, jan 2000.
- Lawler [1972] Eugene L. Lawler. A procedure for computing the best solutions to discrete optimization problems and its application to the shortest path problem. Management Science, 18(7):401–405, 1972.
- Oxley [2006] James G. Oxley. Matroid Theory (Oxford Graduate Texts in Mathematics). Oxford University Press, Inc., USA, 2006.
- Ravi et al. [1994] S. S. Ravi, D. J. Rosenkrantz, and G. K. Tayi. Heuristic and special case algorithms for dispersion problems. Operations Research, 42(2):299–310, 1994.
- Schrijver [2003] A. Schrijver. Combinatorial Optimization - Polyhedra and Efficiency. Springer, 2003.
- Vazirani and Yannakakis [1992] Vijay V. Vazirani and Mihalis Yannakakis. Suboptimal cuts: Their enumeration, weight and number. In W. Kuich, editor, Automata, Languages and Programming, pages 366–377, Berlin, Heidelberg, 1992. Springer Berlin Heidelberg.
- Wang et al. [2013] Jia Wang, James Cheng, and Ada Wai-Chee Fu. Redundancy-aware maximal cliques. In The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013, Chicago, IL, USA, August 11-14, 2013, pages 122–130. ACM, 2013.
- Yeh et al. [2010] Li-Pu Yeh, Biing-Feng Wang, and Hsin-Hao Su. Efficient algorithms for the problems of enumerating cuts by non-decreasing weights. Algorithmica, 56(3):297â312, mar 2010.
- Yuan et al. [2016] Long Yuan, Lu Qin, Xuemin Lin, Lijun Chang, and Wenjie Zhang. Diversified top-k clique search. VLDB J., 25(2):171–196, 2016.