Performance Guaranteed Evolutionary Algorithm for Minimum Connected Dominating Set
Abstract
A connected dominating set is a widely adopted model for the virtual backbone of a wireless sensor network. In this paper, we design an evolutionary algorithm for the minimum connected dominating set problem (MinCDS) whose performance is theoretically guaranteed in terms of both computation time and approximation ratio. Given a connected graph $G=(V,E)$, a connected dominating set (CDS) is a subset $C\subseteq V$ such that every vertex in $V\setminus C$ has a neighbor in $C$, and the subgraph of $G$ induced by $C$ is connected. The goal of MinCDS is to find a CDS of $G$ with the minimum cardinality. We show that our evolutionary algorithm can find a CDS in expected time which approximates the optimal value within factor , where $n$ and $\Delta$ are the number of vertices and the maximum degree of graph $G$, respectively.
Keywords: evolutionary algorithm; minimum connected dominating set; expected running time; approximation ratio.
1 Introduction
Evolutionary algorithms have been widely used in real-world applications due to their simplicity of design and generality of implementation [23, 30]. Although they have been demonstrated to be quite effective and efficient by empirical studies, their theoretical analysis lags far behind. Since the beginning of the 21st century, theoretical studies on evolutionary algorithms have begun to come to the fore [28]. We are particularly interested in applications of evolutionary algorithms to NP-hard problems for which their performance can be theoretically guaranteed. For example, in [8], Friedrich et al. designed a multi-objective evolutionary algorithm which can find an -approximation for the minimum set cover problem (MinSC) in expected polynomial time. This ratio matches the lower bound for the approximability of MinSC. In [31], Yu et al. gave a general framework of an evolutionary algorithm, and when applied to the -MinSC problem, in which every set has size at most , their algorithm achieves in expected polynomial time the best known approximation ratio obtained in [14]. However, many problems do not fall into these frameworks, including the minimum connected dominating set problem (MinCDS) studied in this paper. The motivation of this paper is to theoretically explore the power of evolutionary algorithms, using the MinCDS problem as a test case, trying to reveal more of the underlying relationship between evolutionary algorithms and approximation algorithms.
Given a graph $G=(V,E)$, a vertex set $C\subseteq V$ is a dominating set (DS) of $G$ if every vertex $v\in V\setminus C$ has a neighbor in $C$ (such a vertex $v$ is said to be dominated by its neighbor in $C$, and the neighbor is said to be a dominator of $v$). If, furthermore, the subgraph of $G$ induced by $C$, denoted as $G[C]$, is connected, then $C$ is said to be a connected dominating set (CDS). The goal of the minimum connected dominating set problem (MinCDS) is to find a CDS with the minimum cardinality.
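For concreteness, the following Python sketch (using the networkx library; the function name and the example are illustrative, not part of the formal development) checks whether a given vertex set is a CDS of a graph.

```python
import networkx as nx

def is_cds(G, C):
    """Check whether vertex set C is a connected dominating set of graph G."""
    C = set(C)
    if not C:
        return False
    # Domination: every vertex outside C has at least one neighbor in C.
    dominated = all(any(u in C for u in G.neighbors(v))
                    for v in G.nodes if v not in C)
    # Connectivity: the subgraph of G induced by C is connected.
    connected = nx.is_connected(G.subgraph(C))
    return dominated and connected

# Example: on a path with vertices 0-1-2-3-4, {1, 2, 3} is a CDS but {1, 3} is not.
P5 = nx.path_graph(5)
print(is_cds(P5, {1, 2, 3}))   # True
print(is_cds(P5, {1, 3}))      # False: the induced subgraph is disconnected
```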
MinCDS has attracted extensive studies in the past two decades, due to its fundamental application in wireless sensor networks (WSNs). Since the beginning of this century, WSNs have become ubiquitous in various fields such as environmental monitoring, traffic control, production processes, health detection, etc. [27]. Because sensors in a WSN have limited capacities, information is often forwarded through multi-hop transmissions. Notice that many sensors functioning at the same time may cause a large waste of energy, as well as a lot of interference. So, researchers suggest using a CDS as a virtual backbone in a WSN [2]: information collected by a non-backbone node can be forwarded to its dominator and then shared within the whole network, since the backbone nodes form a connected structure.
The MinCDS problem is NP-hard [9]. In fact, it is set-cover hard [12] and thus does not admit a polynomial-time approximation within factor for any real number unless [4]. On the other hand, -approximation algorithm exists for MinCDS [17], where is the maximum degree of the graph. In view of its inapproximability, this ratio is asymptotically best possible under the assumption that .
In this paper, we show that the same approximation ratio can be achieved for MinCDS through a multi-objective evolutionary algorithm. The purpose of this paper is not to improve the approximation ratio since it is already the best possible. The aim is to understand the mechanism behind the evolutionary algorithm and its relation with the approximation algorithm.
1.1 Related Works
Evolutionary algorithms (EAs) have their ideas rooted in Darwin's theory of biological evolution. Roughly speaking, an EA iteratively generates a set of offspring and retains only those individuals which are superior. There are many versions of EAs; some of the simplest include single-objective versions such as RLS (randomized local search) and the (1+1)-EA [22], and multi-objective versions such as SEMO and GSEMO [8].
The origin of EAs can be dated back to the early 1960s [19], and EAs have been applied to various areas such as antenna design [15], bioinformatics [18] and data mining [20]. In fact, since EAs are general-purpose algorithms which can be implemented without much knowledge about the concrete instances, they are favored in engineering fields, especially when the structures of the problems are hard to exploit. Empirical studies show that they perform fairly well in real-world applications. In contrast, theoretical analysis of EAs lags far behind.
Recently, a lot of progress has been made on the running time and approximation ratio analysis of EAs for combinatorial optimization problems. For example, Neumann and Wegener [22] showed that the (1+1)-EA can find a minimum spanning tree (MST) in expected iterations, where is the number of edges, is the number of vertices, and is the maximum edge weight of the graph. Friedrich et al. [8] studied EAs for the minimum vertex cover problem (MinVC), which is a special case of the minimum set cover problem (MinSC). They showed that even on a graph as simple as a complete bipartite graph, with a positive probability, RLS cannot find a solution within approximation ratio in finite time (and thus the approximation ratio can be arbitrarily bad), and it takes the (1+1)-EA exponential time to obtain an optimal solution. On the contrary, GSEMO can find a -approximate solution for MinSC in expected iterations, where is the number of elements to be covered, is the number of sets, and is the maximum cost of a set. These studies demonstrate the advantage of multi-objective EAs over single-objective EAs, and led to a series of follow-up studies using multi-objective EAs for coverage problems [25, 26, 24]. In particular, Yu et al. [31] introduced a framework of an evolutionary algorithm which, when applied to the minimum -set cover problem (a MinSC problem in which every set has size at most ), can obtain ratio , where is the th harmonic number. However, there are many combinatorial optimization problems (including the MinCDS problem studied in this paper) which do not fall into this framework.
There are also a lot of studies of evolutionary algorithms on various other combinatorial problems, including the maximum matching problem [11], the partition problem [29], the bi-objective pseudo-Boolean problem [10], makespan scheduling [32], shortest paths [5], etc. The reader may also refer to the monograph [23].
Although many works have been devoted to understanding the behavior of EAs from a theoretical point of view, a large number of combinatorial problems have not been sufficiently studied from the viewpoint of approximation by EAs. It remains to be further explored which problems can be solved by EAs with a guaranteed theoretical performance. Finding underlying relationships between evolutionary algorithms and approximation algorithms is an ongoing challenge.
For the MinCDS problem, Guha and Khuller [12] provided a two-stage greedy algorithm with approximation ratio at most , where is the maximum degree of the graph (notice that ). This ratio was improved to by Ruan et al. [17] using a one-stage greedy algorithm. The most challenging part is how to deal with a non-submodular potential function. Non-submodularity is also the barrier which prevents us from applying existing studies on EAs for submodular optimization problems to the MinCDS problem. New insights have to be explored.
1.2 Our Contributions
In this paper, we design an evolutionary algorithm for the MinCDS problem, and show that it can find a CDS in expected iterations which approximates the optimal value within factor . This ratio coincides with the best known ratio for MinCDS which is obtained by a centralized greedy algorithm. The main challenge lies in the theoretical analysis: can an evolutionary algorithm successfully simulate an approximation algorithm such that both efficiency and effectiveness can be guaranteed?
As shown in [8], a single-objective evolutionary algorithm seems hopeless in approximating NP-hard problems even on very simple examples. So, in this paper, we adopt a multi-objective method, trying to minimize two evaluation functions simultaneously. The first evaluation function is used to measure how far the individual is from a feasible solution. A vertex set is a CDS if and only if its corresponding individual has -value being 2. The second evaluation function is the size of the vertex set corresponding to the individual, which is the objective of the MinCDS problem to be minimized. The algorithm follows the general framework of a multi-objective evolutionary algorithm: generating offspring randomly and keeping the Pareto front until some criterion is met.
Although the algorithm is simple, what is most challenging is how to analyze its performance theoretically. In the centralized algorithm designed by Ruan et al. [17], there exists a "greedy path" (obtained by adding vertices one by one greedily) which leads to a CDS with approximation ratio at most . If our algorithm can follow such a path in generating offspring, then we are done. The question is: does such a greedy path exist in the population generated by the algorithm? We believe this is essentially the same question asked by Friedrich et al. in their work [8], in which it was shown that for the minimum set cover problem, given enough time, the population contains a path which is no worse than the greedy path generated by an approximation algorithm. Furthermore, Friedrich et al. designed a potential to "track" such a path. Their analysis heavily depends on the fact that the covering function is submodular. However, the MinCDS problem does not belong to the realm of submodular optimization, which makes the situation much more complicated.
To realize the above idea and overcome the above difficulties, we divide the analysis into two phases, and define two potentials to track a desirable path in each of the two phases. A challenge is how to coordinate these two potentials. For this purpose, a notion of good offspring is introduced, and our analysis only tracks those good events.
The algorithm used in this paper is essentially SEMO or GSEMO, but the analysis is much more challenging. This is what we find most interesting: revealing deep theoretical mechanisms embedded in a simple algorithm.
The remaining part of this paper is organized as follows. Section 2 introduces terminologies and concepts used in this paper. The algorithm is presented in Section 3. A theoretical analysis is given in Section 4. Experiments comparing our algorithm with the greedy algorithm are presented in Section 5. Section 6 concludes the paper with some discussions on future work.
2 Preliminaries
In this paper, small bold letters (such as $\mathbf{x}$) are used to represent vectors. Suppose $G=(V,E)$ is a graph with vertex set $V$ and edge set $E$. Notice that there is a one-to-one correspondence between subsets of $V$ and 0-1 vectors of dimension $|V|$: for a vertex subset $C\subseteq V$, the vector $\mathbf{x}$ corresponding to $C$ has its bits indexed by the vertices of $V$, and the bit indexed by $v$ equals 1 if and only if $v\in C$; for a vector $\mathbf{x}$, the vertex subset corresponding to $\mathbf{x}$ consists of all those vertices whose bits are 1 in $\mathbf{x}$. Since this paper considers evolutionary algorithms, we call such a vector an individual.
In a multi-objective optimization problem, different objectives might conflict with each other. To compare two individuals in such a setting, dominance is a commonly used notion. In a multi-objective optimization problem in which all objectives are to be minimized, an individual $\mathbf{x}$ is said to weakly dominate an individual $\mathbf{y}$, or we say that $\mathbf{x}$ is weakly better than $\mathbf{y}$, if $\mathbf{x}$ is no worse than $\mathbf{y}$ in every objective. If, furthermore, $\mathbf{x}$ is strictly better than $\mathbf{y}$ in some objective, then $\mathbf{x}$ is said to dominate $\mathbf{y}$, or we say that $\mathbf{x}$ is better than $\mathbf{y}$. Notice that there exist individuals which are incomparable in the above sense: $\mathbf{x}$ might be superior to $\mathbf{y}$ in terms of one objective but inferior in terms of another. Two individuals are said to be incomparable if neither is weakly better than the other. A Pareto front is a set of pairwise incomparable individuals.
Since the topic of this paper is dominating set, using terminology “dominate” in both the graph theoretic setting and the multi-objective optimization setting may incur confusion. So this paper only uses “dominate” in the graph theoretic setting, and uses “better” and “weakly better” in the multi-objective optimization setting.
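As a small illustration (the function names are purely illustrative), two individuals can be compared by their tuples of objective values, all to be minimized, as follows.

```python
def weakly_better(a, b):
    """a is weakly better than b: no worse in every (minimized) objective."""
    return all(x <= y for x, y in zip(a, b))

def better(a, b):
    """a is better than b: weakly better, and strictly better in some objective."""
    return weakly_better(a, b) and any(x < y for x, y in zip(a, b))

def incomparable(a, b):
    """Neither individual is weakly better than the other."""
    return not weakly_better(a, b) and not weakly_better(b, a)

# Example with two objectives, both minimized:
print(better((2, 5), (3, 5)))        # True
print(incomparable((2, 7), (3, 5)))  # True
```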
There are many methods to mutate an individual in evolutionary algorithms. We only consider two basic mutation methods: for a 0-1 vector of dimension $n$, the first one flips exactly one bit chosen uniformly at random, while the second one flips every bit independently with probability $1/n$.
In a simple single-objective evolutionary algorithm, every iteration creates one offspring of the current individual and keeps whichever of the two has the better objective value. If the first mutation is used, the algorithm is known as randomized local search (RLS). If the second mutation is used, it is known as the (1+1)-EA.
Different from RLS and the (1+1)-EA, a simple multi-objective evolutionary algorithm maintains the Pareto front of those individuals which have been created so far, and in each iteration, an individual is picked from the population uniformly at random as the parent to create one offspring. In [21], such multi-objective evolutionary algorithms using the first mutation and the second mutation are called SEMO (simple evolutionary multi-objective optimizer) and GSEMO (global SEMO), respectively. As shown in [8], transforming a single-objective problem into a multi-objective problem and then running SEMO or GSEMO on it can significantly improve the quality of the solution.
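The two mutation operators can be written, for instance, as follows (a minimal Python sketch; the function names are illustrative).

```python
import random

def flip_one_bit(x):
    """First mutation (RLS / SEMO): flip exactly one bit chosen uniformly at random."""
    y = list(x)
    i = random.randrange(len(y))
    y[i] = 1 - y[i]
    return y

def flip_each_bit(x):
    """Second mutation ((1+1)-EA / GSEMO): flip every bit independently with probability 1/n."""
    n = len(x)
    return [1 - b if random.random() < 1.0 / n else b for b in x]
```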
The quality of an algorithm for an NP-hard problem is usually measured by its efficiency (that is, how long it takes to output a solution) and effectiveness (that is, how accurate the solution could be). We measure the running time by the expected number of mutations taken in the algorithm, and measure the accuracy by the approximation ratio. For a minimization problem, an algorithm is said to be an -approximation algorithm if for any instance of the problem, it can output a solution whose cost is at most times that of an optimal solution.
3 Algorithm for MinCDS
Given a graph $G=(V,E)$ on $n$ vertices, a vertex set $C\subseteq V$ can be represented by a 0-1 vector $\mathbf{x}$ of dimension $n$, and a vector $\mathbf{x}$ corresponds to a vertex set $C_{\mathbf{x}}$. Use $|\mathbf{x}|$ to denote the number of 1's in $\mathbf{x}$, which is also the number of vertices in the vertex set $C_{\mathbf{x}}$.
The algorithm is essentially SEMO, with two objective functions and to be minimized simultaneously. For a vector , . So minimizing is to minimize the size of the solution. Next, we introduce which is used to measure the feasibility of the solution.
For a vertex subset , let be the subgraph of induced by , and let be the number of connected components of . Denote by the set of edges incident with , that is, every edge in has at least one end in . The spanning subgraph of on vertex set and edge set is denoted as . Let be the number of connected components of . Consider a path on five vertices as an example (see Fig. 1 ). For vertex subset , graph consists of two singletons, namely and , and thus . While is depicted in Fig. 1 , and thus (notice that is a spanning subgraph of ).
In the following, we shall use a vector and its corresponding vertex set interchangeably. For a vector , let . The next lemma shows how can be used to measure the feasibility of the solution.
Lemma 3.1.
A vertex set is a CDS of if and only if .
Proof.
Except for , every vertex set has and . So, and equality holds if and only if . A CDS clearly satisfies and thus . On the other hand, if is not a dominating set, then there is a vertex which has no neighbor in . In this case, forms a connected component itself in , and thus has at least two connected components. So, holds only when is a dominating set. Furthermore, the condition is equivalent to saying that is connected. So a vertex set with must be a CDS. ∎
Let be the vector corresponding to vertex set . Since consists of singletons, we have . Clearly, . So while . We study the MinCDS problem by considering the bi-objective minimization problem .
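For concreteness, the two component counts and the resulting feasibility measure can be computed as in the following Python sketch (using networkx; the function names p, q, and f are illustrative and need not coincide with the notation used elsewhere), which also illustrates the characterization of Lemma 3.1 on a small example.

```python
import networkx as nx

def p(G, C):
    """Number of connected components of the subgraph of G induced by C."""
    return nx.number_connected_components(G.subgraph(C))

def q(G, C):
    """Number of connected components of the spanning subgraph of G whose
    edge set consists of all edges with at least one end in C."""
    H = nx.Graph()
    H.add_nodes_from(G.nodes)
    H.add_edges_from((u, v) for u, v in G.edges if u in C or v in C)
    return nx.number_connected_components(H)

def f(G, C):
    """Feasibility measure: by Lemma 3.1, a nonempty C is a CDS of a connected graph G iff f(G, C) == 2."""
    return p(G, C) + q(G, C)

# Example on a path with vertices 0-1-2-3-4:
P5 = nx.path_graph(5)
print(f(P5, {1, 2, 3}))   # 2: a CDS
print(f(P5, {1, 3}))      # 3: dominating, but the induced subgraph is disconnected
print(f(P5, set()))       # 5: the empty set; every vertex is a singleton component
```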
The algorithm is described in Algorithm 1. It starts from , and maintains a population set which keeps all those incomparable individuals found so far (namely Pareto front). In each iteration, the algorithm picks an individual in uniformly at random, and generates an offspring by flipping one bit of uniformly at random. If there is no individual in which is better than , then enters , and all those individuals which are weakly worse than are removed from . We shall show that after some polynomial number of iterations, the population contains a desired CDS. The output is a vertex set corresponding to an individual in the final population with the minimum -value whose -value equals (so is a CDS).
Input: A connected graph , the number of iterations .
Output: A set which is a CDS of .
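The following Python sketch follows the verbal description of Algorithm 1 above: start from an initial individual, pick a parent uniformly at random from the population, flip one bit chosen uniformly at random, and keep the Pareto front. Taking the all-zero vector as the initial individual, as well as all variable names, is an assumption of this sketch rather than a verbatim transcription of the pseudocode.

```python
import random
import networkx as nx

def f(G, C):
    """Feasibility measure of Section 3: a nonempty C is a CDS of connected G iff f == 2."""
    H = nx.Graph()
    H.add_nodes_from(G.nodes)
    H.add_edges_from((u, v) for u, v in G.edges if u in C or v in C)
    return (nx.number_connected_components(G.subgraph(C))
            + nx.number_connected_components(H))

def semo_mincds(G, T, seed=None):
    """SEMO-style bi-objective minimization of (f(x), |x|) for MinCDS."""
    rng = random.Random(seed)
    nodes = list(G.nodes)
    n = len(nodes)

    def objectives(x):
        C = {nodes[i] for i in range(n) if x[i]}
        return (f(G, C), len(C))

    # Assumed initial individual: the all-zero vector (the empty vertex set).
    x0 = tuple([0] * n)
    population = {x0: objectives(x0)}
    for _ in range(T):
        parent = rng.choice(list(population))   # pick a parent uniformly at random
        child = list(parent)
        child[rng.randrange(n)] ^= 1            # flip one bit chosen uniformly at random
        child = tuple(child)
        vc = objectives(child)
        # Accept the child unless some member of the population is better than it.
        dominated = any(all(a <= b for a, b in zip(v, vc)) and any(a < b for a, b in zip(v, vc))
                        for v in population.values())
        if not dominated:
            # Remove members that are weakly worse than the child, then add the child.
            population = {x: v for x, v in population.items()
                          if not all(a <= b for a, b in zip(vc, v))}
            population[child] = vc
    # Output: among individuals with f-value 2 (i.e., CDSs), one of minimum size.
    cds = [x for x, v in population.items() if v[0] == 2]
    return ({nodes[i] for i, bit in enumerate(min(cds, key=sum)) if bit}
            if cds else None)
```

For example, `semo_mincds(nx.path_graph(10), T=10000)` typically returns a small connected dominating set of the path.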
4 The Analysis
The following lemma is crucial to the analysis of our algorithm; it shows that there is a vertex whose addition may decrease the -value by roughly a multiplicative factor. The proof is similar to the one in [17]; we provide it here to keep the paper self-contained.
Lemma 4.1.
For any vertex set in a connected graph with , let be the vertex of satisfying , then we have the following two properties:
, and
,
where is the size of a minimum CDS.
Proof.
We prove property by distinguishing two cases.
In the case , is not a dominating set by the proof of Lemma 3.1. By the connectedness of , there exists a vertex such that is not adjacent to and is adjacent to a vertex , where is the neighbor set of . Then and .
In the case , since , we have . So, has at least two connected components. By , it can be seen that the closest connected components of are exactly two hops away from each other, which implies that there is a vertex that is adjacent to at least two connected components of . In this case, and .
In any case, . By the choice of , and because is an integer-valued set function, we have .
In order to prove , we show that there is a vertex in an optimal solution the addition of which satisfies the claimed inequality.
Let be a minimum CDS of . Since is connected, we may order vertices in as such that for any , the subgraph of induced by vertex set is connected. This can be done by finding a spanning tree of and removing leaf vertices one by one, ordering the vertices in the reverse order of their removal. Due to the connectedness of , it can be seen that
Furthermore, it could be proved that function is submodular and thus
It follows that
(1)
Rewrite in the following way:
where . Making use of inequality (1), and the fact (because is a CDS), we have
Let . Then
By the choice of , we have
and thus . Property is proved. ∎
The next theorem analyzes the expected time to obtain a CDS with a guaranteed approximation ratio.
Theorem 4.2.
In expected iterations, the output of Algorithm 1 is a CDS whose size is at most times the optimal value, where is the maximum degree of graph and is the number of vertices of .
Proof.
Let be the size of a minimum CDS of . Notice that although we use the optimal value in the analysis, the algorithm does not depend on .
For , let be the bin which contains the individual in whose -value equals . Notice that at any time, each bin contains at most one individual; this is because for any two individuals created in the evolutionary process, if they have the same -value, then only the one with the smaller -value is kept in . To track a "path" of desirable mutations, we only consider those bins containing good individuals, where an individual is said to be "good" if it satisfies the following constraint:
(2)
Claim 1. Any good individual with has .
By the definition of good individual,
Using , it can be derived that
Since every vertex can dominate at most vertices, we have , and thus . Hence , Claim 1 is proved.
Claim 2. Suppose is a good individual. Then, an individual with and is also good.
This is because
(3)
Claim 3. Suppose is a good individual, let be the vector corresponding to vertex set , where . Then
,
, and
is also a good individual.
Property is obvious, and property is a consequence of Lemma 4.1 . Next, we show property . By Lemma 4.1 ,
Since is good and , we have
Hence is also a good individual.
An individual generated in the way described in Claim 3 is desirable; for convenience, we call such an individual the best offspring of .
Next, we define an indicator for each bin such that individuals contained in bins with indicator 1 are all good.
Initially, only and all other for .
When we say that bin is going to be filled with an individual , it refers to the setting that the newly generated individual can be put into , either because is empty and , or because can replace the individual in due to and .
When bin is going to be filled with an individual , its indicator is changed from 0 to 1 if one of the following two good events occurs.
There is an individual belonging to a bin with indicator such that its best offspring is weakly worse than , that is
(4)
There is an individual belonging to a bin with indicator such that
(5)
It is possible that indicator can be changed from 1 to 0: if the individual contained in bin is removed from the population because of the creation of a better individual, then is set to be 0.
Claim 4. Any bin with contains a good individual.
We first show that if current contains a good individual, then as long as continues to be non-empty, the individual contained in is always good. In fact, suppose is a good individual in , it can be replaced by another individual only when and . Then by Claim 2, is also good.
So, as long as we can show the following property:
the time when changes from 0 to 1, the individual entering is good, (6)
then all individuals contained in during the span when will be good. Next, we use induction on the process of assigning indicator 1’s to prove (6).
Initially, only , and . Notice that and , so is a good individual.
Suppose the claim is true for all bins with indicator 1 in current population, and after some mutation, is to be changed from 0 to 1. The first case for such a change is because of the existence of satisfying (4). By induction hypothesis, is good. By Claim 3, is also good. Then by condition (4) and Claim 2, is also good. The second case for such a change is because of the existence of satisfying (5). By induction hypothesis, is good, and thus by Claim 2, is good. Claim 4 is proved.
The following analysis is divided into two phases. For each phase, a potential will be used to track the progress of the algorithm.
Define a potential with respect to current population to be the integer
(7)
The initial population has potential .
Claim 5. Potential will never increase, and the expected time for to be decreased by at least 1 is at most .
Suppose is the individual corresponding to current potential . As long as remains in , the value of will not increase. By the algorithm, can be removed from only when a newly generated individual satisfies and . If , then is replaced by and remains the same. If , then by (5), the birth of makes to change from 0 to 1, where . In this case, the new . In any case, will not increase.
Next, we consider the expected time to reduce . If the mutation picks and generates its best offspring , then can be changed from 0 to 1, where (notice that before the mutation, must be since is the smallest index for the bin with indicator 1). Notice that the probability of choosing in line 4 of Algorithm 1 is , and the probability of obtaining its best offspring in line 5 of Algorithm 1 is . So, the probability
(8)
where is used since has at most distinct values, namely , and no two solutions having the same -value can coexist in by line 7 of the algorithm.
Notice that the derivation of (8) is valid starting from “any” individual in the population achieving current potential, even when the above is removed from due to the entering of a better individual. So, the expected number of mutations for potential to be decreased by at least is at most . Claim 5 is proved.
By Claim 5, in expected iterations, could be reduced from to . However, merely tracking potential could not guarantee that the final solution has small size. In fact, since , the vector satisfies (2), and thus is also a good individual, whose size is clearly too large. From the algorithmic point of view, when is small, the decreasing speed of implied by Lemma 4.1 will slow down. So, we shall track another potential after the -value becomes smaller than . The details are given as follows.
If , then the approximation ratio holds for the trivial solution . So, we assume in the following. Since the initial value for is , and the final value for is , there must be a time when jumps over . That is, there exists a mutation, before which potential , after which potential .
Define a second potential as follows.
(9)
Claim 6. Let be the individual corresponding to . Then satisfies the constraint in (9). Hence from the time when enters population , potential is well-defined, and .
The core part for the proof of Claim 6 is to show that satisfies the constraint in (9). Since we are considering the time when potential is jumped down to , the creation of incurs the change of from 0 to 1. The reason for such a change is because of the existence of a bin with and the individual in satisfying constraint (4) or constraint (5) (with replaced by ), that is,
(10)
or
(11)
Since , by Claim 4, is good. Since is the smallest index of bins with indicator 1 before the mutation, we have . So is a good individual with . Then by Claim 1,
(12)
Since is the best offspring of , we have
So, in the first case, namely (10), we have
The second case (11) is easier by using (12). Claim 6 is proved.
Claim 7. The potential never increases and the expected time for to be decreased by at least 1 is at most .
To prove the monotonicity of , similar to the proof of Claim 5, it suffices to consider a solution corresponding to current , and the case that is removed from . In such a case, a newly generated solution satisfies and . Such satisfies . So also satisfies the constraint in (9), and thus replacing by will not increase .
To prove the second part of Claim 7, observe that if a mutation picks which corresponds to current and generates its best offspring , then by Claim 3, and . It follows that and thus also satisfies the constraint in (9). Hence the new potential . The expected time for the occurrence of such an event is at most , as in the proof of Claim 5. So, Claim 7 is proved.
Claim 8. Starting from initial individual , the expected time for reaching 2 is at most .
By Claim 5, the expected time for potential to decrease from to is at most . Then by Claim 7, potential becomes well-defined and keeps decreasing, and the expected time for to decrease from to 2 is at most . Adding them together, the total expected time for reaching 2 is at most .
Now, we are ready to finish the proof of the theorem. Suppose is the individual corresponding to , then implies that the vertex set corresponding to is a CDS. Combining this with implies that , that is, approximates the optimal solution within a factor at most . Furthermore, Claim 8 shows that the expected time to obtain such is at most . ∎
Remark 4.3.
Although our algorithm is essentially SEMO, using GSEMO also works. Notice that for the second mutation method, the probability that exactly one bit is flipped is . Furthermore, by the way we define "good events" to change indicators from 0 to 1, generating an individual having more bits different from its parent does not affect the validity of the analysis. So, in expected iterations, GSEMO can find a -approximation for MinCDS.
5 Experiments
In this section, we experimentally evaluate the efficiency and the effectiveness of Algorithm 1, comparing it with the centralized greedy algorithm in [17]. We first carry out experiments on scale-free networks generated by the Barabási-Albert (BA) model [1]: starting from a ring network with four vertices, at each step we add a new vertex with two edges incident to existing vertices, following the growth and preferential attachment rules. Then we consider Erdős-Rényi (ER) networks [6]: a random graph with vertices such that any pair of vertices are adjacent with probability . Since the MinCDS problem is studied on connected graphs, to generate a connected graph, we set [7]. The size of a graph is measured by its number of vertices.
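The two random graph models can be generated, for instance, with networkx as sketched below; the helper names and the edge probability used for the ER model are illustrative assumptions, and networkx's BA generator uses its own seed graph rather than the four-vertex ring described above.

```python
import math
import networkx as nx

def ba_graph(n, seed=None):
    """Barabasi-Albert scale-free graph: each new vertex attaches with two edges."""
    return nx.barabasi_albert_graph(n, 2, seed=seed)

def er_connected_graph(n, p=None):
    """Erdos-Renyi graph G(n, p); resample until connected. The default edge
    probability 2*ln(n)/n is an illustrative value above the connectivity
    threshold ln(n)/n, not necessarily the one used in the experiments."""
    if p is None:
        p = 2 * math.log(n) / n
    while True:
        G = nx.erdos_renyi_graph(n, p)
        if nx.is_connected(G):
            return G
```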
All experiments are coded in Matlab and run with the same configuration: an Intel(R) Core(TM) i5-4210 CPU and 4GB of RAM.
In terms of running time, the greedy algorithm is much faster. This is natural, just like the comparatively slow evolution of human beings versus fast customized mechanized production. The main purpose of this section is to test the effectiveness of the evolutionary algorithm, verifying that it can be achieved within the time estimated by the theoretical analysis, and to further explore the empirical potential for reducing the running time.
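For reference, a one-step greedy guided by the component-counting potential of Section 3 can be sketched in Python as follows; this sketch is only meant to convey the flavor of the greedy algorithm in [17] and is not a faithful reproduction of it, nor of the implementation used in the experiments.

```python
import networkx as nx

def potential(G, C):
    """Component-counting potential of Section 3: a nonempty C is a CDS of connected G iff it equals 2."""
    H = nx.Graph()
    H.add_nodes_from(G.nodes)
    H.add_edges_from((u, v) for u, v in G.edges if u in C or v in C)
    return (nx.number_connected_components(G.subgraph(C))
            + nx.number_connected_components(H))

def greedy_cds(G):
    """Repeatedly add the vertex that minimizes the potential of the enlarged set,
    until the set becomes a CDS."""
    C = set()
    while not C or potential(G, C) > 2:
        v = min((u for u in G.nodes if u not in C),
                key=lambda u: potential(G, C | {u}))
        C.add(v)
    return C
```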
We first study the performance of our algorithm on BA networks. The results are shown in Fig. 2. Setting to be , where is the number of vertices of the input graph, it can be seen from Fig. 2(a) that the sizes of the solutions output by our algorithm are no worse than those of the greedy algorithm. Interestingly, in many cases, the output of our algorithm is better than that of the greedy algorithm. The reason might lie in the observation that greedy is a deterministic method, which can no longer improve the solution after the algorithm terminates; but for the evolutionary algorithm, if more time is allowed, there is still room for improvement.
Then, we set to be . As shown by Fig. 2(b), the accuracy does not decrease compared with , which implies that the empirical performance of the EA might be much better than what is guaranteed by the theory. This is because our analysis only tracks one evolutionary path with a clear-cut pattern, which may underestimate the power of evolution through other paths. We also tried setting to be , and it turns out that in this case, the output of the algorithm cannot always be guaranteed to be a feasible solution.
[Figure 2: results of Algorithm 1 and the greedy algorithm on BA networks]
Then, we consider the performance of Algorithm 1 on ER networks. The results are shown in Fig. 3. Similar phenomena are observed for the comparison of the EA and greedy (see Fig. 3(a)). However, a divergence appears between and , as shown by Fig. 3(b). The reason for this phenomenon might be the following: the power-law property possessed by BA networks already makes the solutions good enough. For example, it was shown in [3] that the minimum positive influence dominating set problem cannot be approximated within factor on a general graph, but can be approximated within a constant factor on a power-law graph. We also tried ; the computed sizes coincide with those of . So, although the performance of our algorithm on ER networks is not as good as on BA networks, there is also potential to improve the running time empirically.
[Figure 3: results of Algorithm 1 and the greedy algorithm on ER networks]
6 Conclusion and Discussion
This paper shows that a multi-objective evolutionary algorithm can achieve in expected time an approximation ratio for the MinCDS problem, which matches the best known ratio obtained by a centralized approximation algorithm. The evolutionary algorithm is very simple and easy to implement. However, the theoretical analysis is much more involved. The most challenging task is how to "track" an evolutionary path which is no worse than a greedy path. Our experiments show that a desired approximate solution can indeed be found within the expected time estimated by the theoretical analysis. What is more, the EA has the potential to improve the solution if more time is available.
Unfortunately, our method cannot readily deal with the minimum weight connected dominating set problem (MinWCDS). In fact, the currently best known approximation ratio for MinWCDS is , obtained by Guha and Khuller in [13]. Their algorithm employs a generalization of an ingenious idea called spider decomposition, which was first invented to study the minimum node-weighted Steiner tree problem [16]. However, how to simulate a spider decomposition by an evolutionary algorithm is unclear.
It is interesting to find out more combinatorial optimization problems which can be effectively approximated by an evolutionary algorithm, and try to answer the following question: for which combinatorial optimization problems can evolutionary algorithms achieve a theoretically guaranteed performance, and why?
Acknowledgements
This research is supported by NSFC (U20A2068), ZJNSFC (LD19A010001), and NSF (1907472).
References
- [1] Albert László Barabási and Réka Albert. Emergence of scaling in random networks. Science, 286(5439):509–512, 1999.
- [2] Bevan Das and Vaduvur Bharghavan. Routing in ad-hoc networks using minimum connected dominating sets. In IEEE International Conference on Communications, ICC 97, Towards the Knowledge Millennium, Montreal, pages 376–380, 1997.
- [3] Thang N. Dinh, Yilin Shen, Dung T. Nguyen, and My T. Thai. On the approximability of positive influence dominating set in social networks. Journal of Combinatorial Optimization, 27(3):487–503, 2014.
- [4] Irit Dinur and David Steurer. Analytical approach to parallel repetition. In Forty-sixth Acm Symposium on Theory of Computing, 2014.
- [5] Benjamin Doerr, Edda Happ, and Christian Klein. Tight analysis of the (1+1)-EA for the single source shortest path problem. Evolutionary Computation, 19(4):673–691, 2011.
- [6] Paul Erdős and Alfréd Rényi. On random graphs I. Publicationes Mathematicae Debrecen, 6:290–297, 1959.
- [7] Paul Erdős and Alfréd Rényi. On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences, 5:17–61, 1960.
- [8] Tobias Friedrich, Jun He, Nils Hebbinghaus, Frank Neumann, and Carsten Witt. Approximating covering problems by randomized search heuristics using multi-objective models. Evolutionary Computation, 18(4):617–633, 2010.
- [9] Michael R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco, 1979.
- [10] Oliver Giel. Expected runtimes of a simple multi-objective evolutionary algorithm. In The 2003 Congress on Evolutionary Computation, CEC’03, 2004.
- [11] Oliver Giel and Ingo Wegener. Evolutionary algorithms and the maximum matching problem. Information Processing Letters, 15(82):14–19, 2003.
- [12] Sudipto Guha and Samir Khuller. Approximation algorithms for connected dominating sets. Algorithmica, 20(4):374–387, 1998.
- [13] Sudipto Guha and Samir Khuller. Improved methods for approximating node-weighted Steiner trees and connected dominating sets. Information and Computation, 150(1):57–74, 1998.
- [14] Refael Hassin and Asaf Levin. A better-than-greedy approximation algorithm for the minimum set cover problem. SIAM Journal on Computing, 35(1):189–200, 2005.
- [15] Greg Hornby, Al Globus, Derek S. Linden, and Jason Lohn. Automated antenna design with evolutionary algorithms. In Proceedings of the 2006 American Institute of Aeronautics and Astronautics Conference on Space, pages 19–21, 2006.
- [16] Philip Klein and Ramamoorthi Ravi. A nearly best-possible approximation algorithm for node-weighted Steiner trees. Journal of Algorithms, 19(1):104–115, 1995.
- [17] Lu Ruan, Hongwei Du, Xiaohua Jia, Weili Wu, and Ker-I Ko. A greedy approximation for minimum connected dominating sets. Theoretical Computer Science, 329(1-3):325–330, 2004.
- [18] Patrick C. H. Ma, Keith C. C. Chan, Xin Yao, and David K. Y. Chiu. An evolutionary clustering algorithm for gene expression microarray data analysis. IEEE Transactions on Evolutionary Computation, 10(3):296–314, 2006.
- [19] Marco Tomassini. Evolutionary algorithms. Studies in Computational Intelligence, 6(5):443–462, 1996.
- [20] Anirban Mukhopadhyay, Ujjwal Maulik, Santu Bandyopadhyay, and Carlos Artemio Coello Coello. A survey of multiobjective evolutionary algorithms for data mining: Part i. IEEE transactions on evolutionary computation, 18(1):4–19, 2014.
- [21] Frank Neumann. Expected runtimes of a simple evolutionary algorithm for the multi-objective minimum spanning tree problem. European Journal of Operational Research, 181(3):1620–1629, 2007.
- [22] Frank Neumann and Ingo Wegener. Randomized local search, evolutionary algorithms, and the minimum spanning tree problem. Theoretical Computer Science, 378(1):32–40, 2007.
- [23] Frank Neumann and Carsten Witt. Bioinspired Computation in Combinatorial Optimization: Algorithms and Their Computational Complexity. Springer, 2010.
- [24] Chao Qian, Jing Cheng Shi, Yang Yu, Ke Tang, and Zhi Hua Zhou. Parallel pareto optimization for subset selection. Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI’16), pages 1939–1945, 2016.
- [25] Chao Qian, Yang Yu, and Zhi Hua Zhou. On constrained Boolean Pareto optimization. In Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI'15), pages 389–395, 2015.
- [26] Chao Qian, Yang Yu, and Zhi Hua Zhou. Subset selection by pareto optimization. Advances in Neural Information Processing Systems 28 (NIPS’15), pages 1765–1773, 2015.
- [27] Peng Jun Wan, Khaled Muheiddin Alzoubi, and Ophir Frieder. Distributed construction of connected dominating set in wireless ad hoc networks. Mobile Networks and Applications, 9(2):141–149, 2004.
- [28] Ingo Wegener. Theoretical aspects of evolutionary algorithms. Lecture Notes in Computer Science, 2076:64–78, 2001.
- [29] Carsten Witt. Worst-case and average-case approximations by simple randomized search heuristics. In Conference on Theoretical Aspects of Computer Science, 2005.
- [30] Xin Yao. Unpacking and understanding evolutionary algorithms. In IEEE World Congress on Computational Intelligence, 2012.
- [31] Yang Yu, Xin Yao, and Zhi Hua Zhou. On the approximation ability of evolutionary optimization with application to minimum set cover. Artificial Intelligence, 180-181:20–33, 2012.
- [32] Yuren Zhou, Jun Zhang, and Yong Wang. Performance analysis of the (1+1) evolutionary algorithm for the multiprocessor scheduling problem. Algorithmica, 73(1):21–41, 2015.