
¹ Optimisation and Logistics, School of Computer and Mathematical Sciences, University of Adelaide, Australia
  gabor.zoltai@adelaide.edu.au ✉, frank.neumann@adelaide.edu.au
² Bio-Inspired Robotics Laboratory, Department of Engineering, University of Cambridge, United Kingdom
  yx388@cam.ac.uk

A Study of Fitness Gains in Evolving Finite State Machines

Gábor Zoltai¹ (ORCID 0000-0002-5762-0628) · Yue Xie² (ORCID 0000-0002-7959-4563) · Frank Neumann¹ (ORCID 0000-0002-2721-3618)
Abstract

Among the wide variety of evolutionary computing models, Finite State Machines (FSMs) have several attractions for fundamental research. They are easy to understand in concept and can be visualised clearly in simple cases. They have a ready fitness criterion through their relationship with Regular Languages. They have also been shown to be tractably evolvable, even up to exhibiting evidence of open-ended evolution in specific scenarios. In addition to theoretical attraction, they also have industrial applications, as a paradigm of both automated and user-initiated control. Improving the understanding of the factors affecting FSM evolution has relevance to both computer science and practical optimisation of control. We investigate an evolutionary scenario of FSMs adapting to recognise one of a family of Regular Languages by categorising positive and negative samples, while also being under a counteracting selection pressure that favours fewer states. The results appear to indicate that longer strings provided as samples reduce the speed of fitness gain, when fitness is measured against a fixed number of sample strings. We draw the inference that additional information from longer strings is not sufficient to compensate for sparser coverage of the combinatorial space of positive and negative sample strings.

Keywords:
evolutionary computation · finite state machines · regular languages.

1 Introduction

In all areas of computing inspired by biological evolution, the relationship between the parameters of selection and increases of fitness measures over time is an important factor. This applies to genetic programming, evolutionary computation, evolutionary artificial intelligence, and artificial life. In genetic programming, this relationship has been studied with respect to the speed at which a desired solution can be derived [1]. In artificial life, this relationship is connected to the arising of complexity (being an enabler of fitness) [2]. For constrained optimisation problems, the rate of change of selection parameters over time has been seen to enhance the performance of genetic algorithms [3]. Diversity, which can be seen as another aspect of selection (being a consequence of its severity), has been extensively studied for its effects on fitness gains [4]. We investigate an example of the relationship between selection parameters and a fitness measure's increase over generations.

1.1 Background

Understanding the relationship between selection function and adaptive performance benefits from simplified models, such as the constructs of automata theory. For the levels of automata from Finite State Machines (FSMs) up to deterministic Turing Machines, successful evolution to recognise formal languages has been reported [5].

Finite State Machines (FSMs, also known as deterministic finite automata or DFAs) are one of the simplest such classes of entities. They have several advantages as a model of study:

  • They are simple in concept;

  • their structure can be communicated using clear diagrams (for low numbers of states);

  • their complexity can be measured unambiguously; and

  • measures of how well an FSM recognises a regular language provide readily comprehensible fitness metrics.

For these reasons, FSMs have been a viable research tool in constructing simplified analogies reflecting aspects of biological evolution. For example, Rasek et al. [6] have proposed a definition of FSM-species and studied their formation. In another study, Moran and Pollack [7] have created simple ecosystem models of lineages of FSMs interacting in competitive and cooperative modes. They found that some results exhibit indications of open-endedness, i.e. the absence of an apparent upper bound to evolving complexity.

However, seemingly unbounded increases in the complexity of evolving computational entities are the exception, rather than the rule. The literature review by Packard et al. [8] observes that evolutionary simulations and genetic algorithms tend to approach plateaus of complexity. In other words, the complexity of evolved solutions to a problem posed tends to level out as generations succeed each other. Most commonly, a state is reached where such algorithms generate few novel solutions of higher complexity.

1.2 Our contribution

On the one hand, the literature documents plateaus of evolution under fixed selection parameters. On the other, there are examples of (at least an impression of) open-endedness with more complex regimes such as co-evolution of three or more interacting lineages. This suggests a need for better understanding of the influence of the selection regime on the evolution of FSMs, especially in the case of several simultaneous selection pressures.

In order to investigate this, we constructed an experimental setup using two simple selection pressures acting in opposite directions:

  • maximise the correct recognition of strings in or out of a language, and

  • minimise the number of states in the FSMs.

We use this setup to study the effect of an aspect of selection on fitness gain.

A good metric for the complexity of regular languages is their "state complexity" [9]. Unlike other proposed measures (see [10]), state complexity relates directly to the resources needed for recognition. The family of languages used in this study has one language for each state complexity value. This provides control over the selection pressure promoting the addition of states over generations.

The remainder of the paper is organised as follows: Section 2 provides definitions for the key concepts we have combined in our study, and the way these experiments have applied these ideas. Section 3 presents the algorithms for creation of sample sets, managing the process of evolution, and the mutation operator. Section 4 describes our experiments and their results. Section 5 includes discussion and interpretation of the results. Finally, Section 6 presents our conclusions and proposes future work.

2 Preliminaries

We summarise the elements we have drawn on in the experimental configuration: FSMs, the universal witness family of regular languages, and the way we have combined conflicting selection pressures.

2.1 Finite State Machines

A Finite State Machine is a 5-tuple $(Q, \Sigma, \delta, q_0, F)$, where

  • $Q$ is a finite set of states;

  • $\Sigma$ is a set of symbols used as the input alphabet;

  • $\delta : Q \times \Sigma \rightarrow Q$ is the transition function, giving the next state for each current state and input symbol;

  • $q_0 \in Q$ is the starting state; and

  • $F \subseteq Q$ is the subset of accepting states.

The transition function $\delta$ can be extended to describe what state the machine transitions to if given each symbol of a string in succession. Using $\epsilon$ for the empty string, and $cs$ to denote the symbol $c$ followed by the trailing substring $s$, we can define the extended transition function $\hat{\delta} : Q \times \Sigma^{\ast} \rightarrow Q$:

$\hat{\delta}(q, \epsilon) = q$   (1)
$\hat{\delta}(q, cs) = \hat{\delta}(\delta(q, c), s)$   (2)

An FSM "accepts" a string $s$ if the extended transition function maps it from the starting state to an accepting state, i.e. the following is true:

$\hat{\delta}(q_0, s) \in F$   (3)

The set of strings accepted by an FSM is termed its language; the machine is said to "recognise" the language. For a machine $\mathcal{M} = (Q, \Sigma, \delta, q_0, F)$, we define its language $L_{\mathcal{M}}$ as:

$L_{\mathcal{M}} = \{ s \in \Sigma^{\ast} \mid \hat{\delta}(q_0, s) \in F \}$   (4)

The languages recognisable by FSMs are the "regular" languages, i.e. those that can be described by a regular expression [11]. Each regular language is recognised by a set of FSMs which are equivalent in that, for any string, each of them produces the same result of acceptance or rejection. The lowest number of states of any FSM recognising a regular language is termed the "state complexity" of the language, as explained by Yu [9]. For an algorithm to compute the state complexity from any equivalent FSM, see [12].
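To make the definitions concrete, the acceptance test of Equations (1)–(4) can be written as a few lines of code. The sketch below is ours rather than the paper's; the dict-based encoding of $\delta$ and the (delta, q0, F) triple are illustrative assumptions, reused in the later sketches.

```python
from typing import Dict, Set, Tuple

# Illustrative encoding (an assumption, not the paper's): delta is a dict keyed
# by (state, symbol); Q and Sigma are implicit in delta's keys and values.
FSM = Tuple[Dict[Tuple[int, str], int], int, Set[int]]  # (delta, q0, F)

def accepts(fsm: FSM, s: str) -> bool:
    """Apply the extended transition function delta-hat, then test membership in F."""
    delta, q, accepting = fsm
    for c in s:                  # delta-hat(q, cs) = delta-hat(delta(q, c), s)
        q = delta[(q, c)]
    return q in accepting        # accept iff delta-hat(q0, s) is in F
```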

2.2 The $U_n$ languages as drivers of complexity

To be able to run experiments with a range of language complexities, we looked for a family of languages that differ primarily in their complexity, and preferably include only one member language for each state complexity value. Both of these criteria are met by the "universal witness" languages described by Brzozowski [13], which we use here. This is a family (or "stream") of languages $U_n$, with one member for each $n \geq 3$, which are examples ("witnesses") of the upper bounds of state complexity in a number of theorems concerning the complexity of FSMs. In particular, it has the property that $U_n$ cannot be recognised by an FSM of fewer than $n$ states.

Each of the $U_n$ languages has a corresponding FSM $\mathcal{U}_n$ that accepts it. The general structure of these FSMs is shown in Figure 1. The formal definition is as follows:

2.2.1 Definition of $\mathcal{U}_n$

For $n \geq 3$, the FSM $\mathcal{U}_n$ is a quintuple $(Q, \Sigma, \delta, q_0, F)$, where $Q = \{q_0, q_1, \ldots, q_{n-1}\}$ is the set of states, $\Sigma = \{a, b, c\}$ is the alphabet, $q_0$ is the initial state, $F = \{q_{n-1}\}$ is the set containing the one accepting state, and $\delta$ is:
$\delta(q_i, a) = q_{(i+1) \bmod n}$;
$\delta(q_0, b) = q_1$; $\delta(q_1, b) = q_0$; $\delta(q_i, b) = q_i$ for $i \notin \{0, 1\}$;
$\delta(q_{n-1}, c) = q_0$; $\delta(q_i, c) = q_i$ for $i \neq n - 1$.

Figure 1: FSM structure for $\mathcal{U}_n$ (based on [13])
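Under the same illustrative encoding as above, this definition translates directly into a constructor. This is our sketch of Brzozowski's definition, not code from the paper:

```python
def make_U(n: int) -> FSM:
    """Build the witness FSM U_n for n >= 3, following the definition above."""
    assert n >= 3
    delta = {}
    for i in range(n):
        delta[(i, 'a')] = (i + 1) % n              # 'a' cycles q_i -> q_{(i+1) mod n}
        delta[(i, 'b')] = {0: 1, 1: 0}.get(i, i)   # 'b' swaps q0 and q1, fixes the rest
        delta[(i, 'c')] = 0 if i == n - 1 else i   # 'c' sends q_{n-1} to q0, fixes the rest
    return delta, 0, {n - 1}                       # start at q0; only q_{n-1} accepts

# Example: "aa" drives U_3 through q0 -> q1 -> q2, the accepting state.
assert accepts(make_U(3), "aa")
```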

The proportion of strings from the ternary alphabet that are members of the $U_n$ languages decreases with increasing $n$, as shown in Table 1.

Table 1: Member strings of some $U_n$ languages as a percentage of all ternary strings up to a given length

Up to    All strings        $U_3$             $U_4$             $U_5$
length   #          %       #          %      #          %      #          %
7        3,280      100%    656        20%    490        15%    320        10%
8        9,841      100%    1,968      20%    1,452      15%    1,122      11%
9        29,524     100%    5,904      20%    4,280      15%    3,616      12%
10       88,573     100%    17,714     20%    12,688     14%    11,040     13%
11       265,720    100%    53,144     20%    37,874     14%    32,640     12%
12       797,161    100%    159,432    20%    113,548    14%    95,042     11%
13       2,391,484  100%    478,296    20%    340,992    14%    276,016    12%
14       7,174,453  100%    1,434,890  20%    1,024,128  14%    806,000    11%
15       21,523,360 100%    4,304,672  20%    3,074,490  14%    2,375,360  11%
16       64,570,081 100%    12,914,016 20%    9,225,836  14%    7,065,762  11%
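Counts like those in Table 1 can be reproduced without enumerating all $3^\ell$ strings: a dynamic program propagates, for each length, how many strings end in each state. The following is our sketch (reusing the encoding above), not the script used for the paper:

```python
def count_accepted(fsm: FSM, max_len: int) -> int:
    """Number of strings of length <= max_len accepted by fsm."""
    delta, q0, accepting = fsm
    states = {q for (q, _) in delta} | set(delta.values())
    counts = {q: 0 for q in states}
    counts[q0] = 1                                   # the single empty string
    accepted = sum(counts[q] for q in accepting)
    for _ in range(max_len):
        nxt = {q: 0 for q in states}
        for (q, c), t in delta.items():              # extend every string by one symbol
            nxt[t] += counts[q]
        counts = nxt
        accepted += sum(counts[q] for q in accepting)
    return accepted

# Should reproduce Table 1, e.g. 656 member strings of U_3 up to length 7.
print(count_accepted(make_U(3), 7))
```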

2.3 Conflicting objectives: sample set recognition vs state count

The two objectives we combine in the selection applied in this study are:

  1. "linguistic fitness" - maximising the correct classification by FSMs of strings as being in or out of a language; and

  2. minimising the number of states of the FSMs.

The first objective favours more states, as more states can embody more complex information about the target language. The second objective selects for fewer states. In this way, the two objectives counteract each other.

To quantify the fitness of an FSM as an acceptor of a language, we use its performance in accepting or rejecting strings in two sample sets, positive (strings in the language) and negative (strings not in the language).

In this study, the "linguistic fitness" $F_{A,R}(\mathcal{M})$ of an FSM $\mathcal{M}$ against a set $A$ of strings to be accepted and a set $R$ of strings to be rejected (writing $\mathcal{M}(s)$ for "$\mathcal{M}$ accepts $s$") is the total of correctly accepted and correctly rejected examples:

$F_{A,R}(\mathcal{M}) = |\{ a \in A \mid \mathcal{M}(a) \}| + |\{ r \in R \mid \neg\mathcal{M}(r) \}|$   (5)

The maximum value of $F_{A,R}$ for an FSM is the total size of the positive and negative reference sets $A$ and $R$. The fitness gains we study are specifically the increases of this linguistic fitness over time.
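Equation (5) amounts to two filtered counts. A minimal sketch, assuming the accepts helper from the earlier sketches:

```python
def linguistic_fitness(fsm, A, R):
    """F_{A,R}(M): correctly accepted positives plus correctly rejected negatives."""
    return (sum(1 for a in A if accepts(fsm, a))
            + sum(1 for r in R if not accepts(fsm, r)))
```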

A genetic algorithm selecting for multiple objectives needs to decide which of a set of candidates to favour in the reproductive pool, based on how well they satisfy each of the objectives. As described by Neumann [14], this can be expressed in a "dominance" relationship combining the two objectives.

In this study, we combine the objectives of maximisation of fitness and minimisation of states. The former is measured by a function $F_{A,R}$ ($A$ and $R$ being the positive and negative sample sets), the latter by the number of states, $C$.

This paper uses the two relations "weakly dominates" (denoted by the symbol $\succcurlyeq$) and "dominates" (denoted by $\succ$) between FSMs $X$ and $Y$:

$X \succcurlyeq Y \iff F_{A,R}(X) \geq F_{A,R}(Y) \wedge C(X) \leq C(Y)$
$X \succ Y \iff (X \succcurlyeq Y) \wedge (F_{A,R}(X) > F_{A,R}(Y) \vee C(X) < C(Y))$

Informally, $X$ weakly dominates $Y$ if it is no worse than $Y$ on both fitness and number of states, and $X$ dominates $Y$ if, in addition, it is strictly better on at least one of the two.
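These relations translate into two predicates over the (fitness, state count) pair. The helpers below are our sketch, with num_states derived from the assumed dict encoding of $\delta$:

```python
def num_states(fsm) -> int:
    """C(M): the number of states appearing in the transition function."""
    delta, q0, _ = fsm
    return len({q for (q, _) in delta} | set(delta.values()) | {q0})

def weakly_dominates(x, y, A, R) -> bool:
    """X >= Y: no worse than Y on both fitness and state count."""
    return (linguistic_fitness(x, A, R) >= linguistic_fitness(y, A, R)
            and num_states(x) <= num_states(y))

def dominates(x, y, A, R) -> bool:
    """X > Y: weakly dominates, and strictly better on at least one objective."""
    return (weakly_dominates(x, y, A, R)
            and (linguistic_fitness(x, A, R) > linguistic_fitness(y, A, R)
                 or num_states(x) < num_states(y)))
```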

3 Algorithms

3.1 Sample set generation methods

We use two methods of generating sample sets of positive and negative examples. In both methods, the positive and negative sets contain 500 strings each, for a total of 1,000 strings. The two methods, labelled $Bss$ and $Rle_n$, are defined below.

The reason for equal numbers of positive and negative examples (regardless of the percentage of strings of a given length that are in versus out of the language) is technical. An FSM of a single state either accepts all strings or rejects all strings. Its fitness score is therefore either the number of positive samples or the number of negative samples, while its complexity is the same in either case. Using the same number of positive and negative examples provides a common baseline of performance for minimal complexity.

3.1.1 $Bss$ method

The $Bss$ method (for "balanced short strings") generates positive and negative sets of short strings, with the lengths of positive and negative examples balanced, i.e. neither subset favours longer strings. The sample sets are generated by Algorithm 1.

Input:  𝒰, FSM recognising the target language
        D, total size of positive and negative sample sets to generate
Output: A, set of positive example strings
        R, set of negative example strings

A ← {};
R ← {};
for all strings s on the alphabet of 𝒰, in order of increasing length, alphabetically within each length do
    if 𝒰(s) ∧ |A| ≤ |R| then
        A ← A ∪ {s};
    else if ¬𝒰(s) ∧ |A| ≥ |R| then
        R ← R ∪ {s};
    end if
    if |A| + |R| ≥ D then
        break;
    end if
end for

Algorithm 1: Sample set generation, $Bss$ method
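A compact way to realise Algorithm 1 is to enumerate strings in shortlex order with itertools.product. The sketch below is our reading of the method, assuming the helpers defined in the earlier sketches:

```python
from itertools import count, product

def bss_samples(target, D: int):
    """Bss: shortest strings first, keeping positive/negative counts balanced."""
    alphabet = sorted({c for (_, c) in target[0]})
    A, R = set(), set()
    for length in count(0):                    # lengths 0, 1, 2, ... (shortlex order)
        for chars in product(alphabet, repeat=length):
            s = ''.join(chars)
            if accepts(target, s) and len(A) <= len(R):
                A.add(s)
            elif not accepts(target, s) and len(A) >= len(R):
                R.add(s)
            if len(A) + len(R) >= D:
                return A, R
```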

3.1.2 $Rle_n$ method

To investigate the effect of longer sample strings, we devised the $Rle_n$ method (for "random, length less than or equal to $n$"). This constructs positive and negative sample sets of equal size, from strings drawn uniformly at random from all strings over the alphabet of the target language, of length zero through $n$. The $Rle_n$ method is implemented by Algorithm 2.

Input:  n, maximum length of strings to include in the sample sets
        𝒰, FSM recognising the target language
        D, total size of positive and negative sample sets to generate
Output: A, set of positive example strings
        R, set of negative example strings

A ← {};
R ← {};
C ← |alphabet of 𝒰|;
p ← distribution over lengths 0 to n, each length L weighted by the number of strings of that length (C^L);
while |A| + |R| < D do
    L ← random choice of length from the weighted distribution p;
    s ← random string of length L from the alphabet of 𝒰;
    if s ∉ A ∪ R then
        if 𝒰(s) ∧ |A| < D/2 then
            A ← A ∪ {s};
        else if ¬𝒰(s) ∧ |R| < D/2 then
            R ← R ∪ {s};
        end if
    end if
end while

Algorithm 2: Sample set generation, $Rle_n$ method
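Drawing uniformly from all strings of length at most $n$ can be done by first sampling a length $L$ with probability proportional to $C^L$ and then sampling $L$ symbols uniformly. Our sketch of Algorithm 2, under the same assumptions as the earlier sketches:

```python
import random

def rle_samples(target, n: int, D: int, rng: random.Random):
    """Rle_n: sample sets from uniformly random strings of length 0..n.
    Assumes the language has at least D/2 members (and non-members) up to length n."""
    alphabet = sorted({c for (_, c) in target[0]})
    C = len(alphabet)
    lengths = list(range(n + 1))
    weights = [C ** L for L in lengths]        # C^L strings of each length L
    A, R = set(), set()
    while len(A) + len(R) < D:
        L = rng.choices(lengths, weights=weights)[0]
        s = ''.join(rng.choice(alphabet) for _ in range(L))
        if s in A or s in R:
            continue                           # skip duplicates
        if accepts(target, s) and len(A) < D // 2:
            A.add(s)
        elif not accepts(target, s) and len(R) < D // 2:
            R.add(s)
    return A, R
```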

3.2 The evolution algorithm

The study of complexity in multi-objective genetic programming by Neumann [14] describes the SMO-GP algorithm (Simple Multi-Objective Genetic Programming). It starts with a population of one individual. In each iteration, SMO-GP selects one member of the population and applies a mutation operation; if no existing member dominates the new mutant, the mutant is added to the population and all members it weakly dominates are removed, otherwise the mutant is discarded.

For this study, this algorithm is adapted to evolve FSMs starting with the simplest possible FSM, which we call $\mathcal{N}$; see Algorithm 3. $\mathcal{N}$ is an FSM of a single non-accepting state, with the transition function mapping all inputs back to that state. It corresponds to the empty language, as it rejects all strings. (An FSM accepting all strings, i.e. accepting $\Sigma^{\ast}$, would be equally simple.) The formal definition of such an FSM for a ternary alphabet is:

$\mathcal{N} = (\{q_0\}, \{a, b, c\}, \{((q_0, a), q_0), ((q_0, b), q_0), ((q_0, c), q_0)\}, q_0, \emptyset)$   (6)

Note that half the 1,000 sample strings are not elements of the language, so the FSM $\mathcal{N}$, by rejecting them all, achieves a linguistic fitness score of 500. A new mutant can only enter the population if no existing member dominates it. Since no FSM can have fewer states than $\mathcal{N}$, a mutant with more than one state must score higher than 500 to enter. Therefore, all other FSMs that become part of the population will score at least 500.

In some preliminary work, we also experimented with starting evolution towards $U_n$ not with $\mathcal{N}$, but instead with the result of evolution over a set number of generations towards $U_{n-1}$. This approach did not produce any notable patterns correlating the starting point with improved speed of adaptation to higher fitness, and we did not pursue it further.

Input:  G, the number of iterations
Output: P, the final set of FSMs in the population after G generations

q0 ← an initial FSM state;
𝒩 ← ({q0}, {a, b, c}, {((q0, a), q0), ((q0, b), q0), ((q0, c), q0)}, q0, ∅);
P ← {𝒩};
for G iterations do
    M ← uniformly random choice from P;
    // Pois(1) denotes the Poisson distribution with parameter λ = 1.
    for 1 + Pois(1) iterations do
        M ← M modified by the mutation operation;
    end for
    if ∄ i ∈ P such that i ≻ M then
        P ← (P ∪ {M}) \ {j ∈ P | M ≽ j};
    end if
end for

Algorithm 3: Multi-objective evolution algorithm for FSMs
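Putting the pieces together, the main loop of Algorithm 3 is only a few lines. Since Python's random module has no Poisson sampler, our sketch uses Knuth's method for Pois(1); mutate is sketched after Algorithm 4 below. All names come from our earlier sketches, not the paper's code:

```python
import math
import random

def poisson_1(rng: random.Random) -> int:
    """Sample from Pois(lambda = 1) by Knuth's multiplication method."""
    threshold, k, p = math.exp(-1.0), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def evolve(A, R, G: int, rng: random.Random):
    """Algorithm 3: multi-objective evolution starting from the rejector N."""
    N = ({(0, c): 0 for c in 'abc'}, 0, set())          # one state, accepts nothing
    P = [N]
    for _ in range(G):
        M = rng.choice(P)
        for _ in range(1 + poisson_1(rng)):             # 1 + Pois(1) mutations
            M = mutate(M, rng)
        if not any(dominates(i, M, A, R) for i in P):   # no member dominates M
            P = [j for j in P if not weakly_dominates(M, j, A, R)] + [M]
    return P
```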

3.3 The mutation operation

The mutation operation used on FSMs selects one random state as the source of an arc, and a random input symbol from the alphabet as the label of the arc. A target state is then selected; half the time, this is an existing state of the FSM, otherwise it is a new state which has all outgoing arcs leading back to itself.

The FSM is then modified so that the arc from the source state, labelled with the chosen input symbol, is set to lead to the chosen target state.

Lastly, a random state is chosen and its status is set to accepting or non-accepting, each with equal probability. For details, see Algorithm 4.

Input:  M = (Q, Σ, δ, q0, F), the FSM to be mutated
Output: M, the input FSM modified

q ← uniformly random choice from Q;
c ← uniformly random choice from Σ;
if uniformly random choice of {existing, new} = existing then
    t ← uniformly random choice from Q;
else
    t ← a new FSM state ∉ Q, with δ(t, x) = t for all x ∈ Σ;
    Q ← Q ∪ {t};
end if
δ(q, c) ← t;
r ← uniformly random choice from Q;
if uniformly random choice of {accepting, rejecting} = rejecting then
    F ← F \ {r};
else
    F ← F ∪ {r};
end if

Algorithm 4: Mutation operation
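Algorithm 4 in the same illustrative encoding; unlike the in-place pseudocode, this sketch returns a modified copy so that unselected population members are never altered:

```python
import random

def mutate(fsm, rng: random.Random):
    """Rewire one random arc (possibly to a fresh self-looping state), then
    set the accepting status of one random state at random."""
    delta, q0, accepting = dict(fsm[0]), fsm[1], set(fsm[2])
    states = sorted({q for (q, _) in delta} | set(delta.values()))
    alphabet = sorted({c for (_, c) in delta})
    q = rng.choice(states)                     # source state of the arc
    c = rng.choice(alphabet)                   # input symbol labelling the arc
    if rng.random() < 0.5:
        t = rng.choice(states)                 # existing target state
    else:
        t = max(states) + 1                    # new state: all arcs loop back to it
        for sym in alphabet:
            delta[(t, sym)] = t
        states.append(t)
    delta[(q, c)] = t
    r = rng.choice(states)                     # random state for acceptance change
    if rng.random() < 0.5:
        accepting.discard(r)                   # make r rejecting
    else:
        accepting.add(r)                       # make r accepting
    return delta, q0, accepting
```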

4 Experimental results

Table 2 shows the mean fitness scores observed after given numbers of generations, under four different methods of generating sample sets for the $U_3$ language, across 30 runs with different randomisation seeds. Figure 2 shows a more detailed set of observations for the four methods.

The relative positions of the data series in Figure 2 made us curious about whether a general trend might exist for longer strings in sample sets to lead to slower evolution of better recognisers. We repeated the experiment for $U_4$ and $U_5$; results are shown in Figure 3. Note that the $Rle$ sampling was adjusted for $U_4$ and $U_5$. In the $U_3$ experiment, we generated 500 positive and 500 negative sample strings for each of $Rle_7$, $Rle_{11}$, and $Rle_{15}$. However, the languages $U_4$ and $U_5$ do not contain 500 strings of at most 7 characters (see Table 1). In order to create comparable sample sets for $U_4$ and $U_5$, we used $Rle_8$, $Rle_{12}$, and $Rle_{16}$.

Table 2: Score comparison for evolving $U_3$ from $\mathcal{N}$ (mean of 30 runs with different randomisation seeds)

Generation    $Bss$      $Rle_7$    $Rle_{11}$   $Rle_{15}$
100,000       912.67     845.47     826.37       823.83
200,000       941.03     854.90     836.57       826.33
500,000       960.63     898.60     841.90       833.87
1,000,000     986.80     938.73     856.80       851.03
2,000,000     1000.00    969.60     871.70       894.30
5,000,000     1000.00    986.97     898.97       923.57
10,000,000    1000.00    1000.00    917.87       951.63
Figure 2: Evolving to $U_3$: comparison of sampling styles
Figure 3: Evolving to $U_4$ and $U_5$: comparison of sampling styles

5 Discussion

A pattern of higher $n$ in $Rle_n$ sampling leading to slower improvement of recognition fitness is seen in all three figures summarising the generated data. This observation suggests that selection using longer strings provides less relevant information towards recognising the language than selection using shorter strings. This is perhaps counter-intuitive: longer sample strings of necessity embed more information about the language in question, on a per-string basis.

A possible explanation is that as we increase $n$ in $Rle_n$, a fixed-size sample set (e.g. 500 positive and 500 negative strings) covers a smaller percentage of all strings of length up to $n$. For a long string, a larger set of its leading substrings is likely to be missing from the sample set, so a mutant FSM may happen to categorise the long string correctly while miscategorising its prefixes. This implies that a gain of fitness under these circumstances is less likely to lead to further gains of fitness, compared to sample sets that cover shorter strings more comprehensively.

In other words, the information about the language that is embedded in the sample set is less representative at higher values of $n$ in $Rle_n$. With less comprehensive coverage of examples, the positive and negative example sets become more akin to random strings, providing selection less conducive to higher linguistic fitness in the next generation.

Although these experiments only investigated evolution against sample sets of $U_3$ through $U_5$, growth in the number of FSM states may become an important factor for further members of the series: since a perfect recogniser for $U_n$ needs at least $n$ states, we would expect growth in the number of states to be necessary for evolving better recognition at higher values of $n$.

Whether this finding transfers to other languages, and if so, whether it can be generalised to other automata or evolutionary scenarios, is not answered by our study. Even within the $U_n$ languages, we have only investigated low values of $n$, and languages with significantly different structures may behave differently. Algorithm 3 uses very small populations, so the effect of diversity on our observation is also an open question.

A potential application of our observation, if generalised, may be to find better solutions in fewer generations in evolutionary optimisation.

6 Conclusion

Under the experimental scenario we used, extending the sample sets of training strings to longer string lengths tends to slow down the achievement of higher fitness scores. A fixed-size sample set, drawn from a pool of strings that grows as the maximum length is extended, provides less directed selection pressure. It appears that this is not counteracted by the additional information embedded in longer sample strings, possibly because fitness gains against such samples are less likely to lead to further gains under the same fitness function.

This work was based on observation of the $U_n$ languages, for low values of the state complexity $n$. It would be interesting to investigate whether these observations hold:

  • for $U_n$ with higher values of $n$;

  • for other languages arranged in a sequence of increasing state complexity; or

  • when the linguistic fitness function shifts from $U_n$ to $U_{n+1}$.

Another fruitful avenue of further work may be to relax the criteria governing retention of FSMs in the population, and observe the effect of populations with higher diversity interacting with rising string lengths in the sample sets.

7 Acknowledgements

This work was supported by the Commonwealth of Australia. Frank Neumann was supported by the Australian Research Council through grant FT200100536.

References

  • [1] Roostapour, V., Neumann, A., Neumann, F.: Single- And Multi-Objective Evolutionary Algorithms For The Knapsack Problem With Dynamically Changing Constraints. Theoretical Computer Science, vol. 924, pp. 129-147 (2022).
  • [2] Standish, R.: Open-Ended Artificial Evolution. International Journal of Computational Intelligence and Applications, 3(02), pp. 167–175 (2003).
  • [3] Kazarlis, S., Petridis, V.: Varying Fitness Functions In Genetic Algorithms: Studying The Rate Of Increase Of The Dynamic Penalty Terms. In: Eiben, A., Bäck, T., Schoenauer, M., Schwefel, H. (eds.) Parallel Problem Solving from Nature — PPSN V, pp. 211–220. Springer, Berlin/Heidelberg (1998).
  • [4] Gabor, T., Phan, T., Linnhoff-Popien, C.: Productive Fitness In Diversity-aware Evolutionary Algorithms. Natural Computing 20, pp. 363–376 (2021).
  • [5] Naidoo, A.: Evolving Automata Using Genetic Programming. Master’s thesis, University of KwaZulu-Natal, Durban, South Africa (2008).
  • [6] Rasek, A., Dörwald, W., Hauhs, M., Kastner-Maresch, A.: Species Formation In Evolving Finite State Machines. In: Floreano, D., Nicoud, JD., Mondada, F. (eds.) Advances in Artificial Life. ECAL 1999. LNCS, vol 1674, pp. 139–148. Springer, Berlin/Heidelberg (1999).
  • [7] Moran, N., Pollack, J.: Evolving Complexity In Prediction Games. Artificial Life, vol. 25, pp. 74–91 (2019).
  • [8] Packard, N., Bedau, M., Channon, A., Ikegami, T., Rasmussen, S., Stanley, K., Taylor, T.: An Overview Of Open-ended Evolution: Editorial introduction to the open-ended evolution II special issue. Artificial Life, vol. 25, pp. 93–103 (2019).
  • [9] Yu, S.: State Complexity Of Regular Languages. Journal of Automata, Languages and Combinatorics, vol. 6, no. 2, pp. 221–234 (2001).
  • [10] Ehrenfeucht, A., Zeiger, P.: Complexity Measures For Regular Expressions. Journal of Computer and Systems Sciences, vol. 12, pp. 134–146 (1976).
  • [11] Sipser, M.: Introduction To The Theory Of Computation, pp. 55–76. PWS Publishing Company, Boston, MA (1997).
  • [12] Hopcroft, J.: An $n \log n$ Algorithm For Minimizing States In A Finite Automaton. Stanford University, Stanford, CA (1971).
  • [13] Brzozowski, J.: In Search of Most Complex Regular Languages. In: Moreira, N., Reis, R. (eds.) Implementation and Application of Automata, LNCS, vol. 7381, pp. 5–24. Springer, Berlin/Heidelberg (2012).
  • [14] Neumann, F.: Computational Complexity Analysis of Multi-Objective Genetic Programming. In: GECCO ’12: Proceedings of the 14th annual conference on Genetic and Evolutionary Computation, pp. 799–806. Philadelphia, Pennsylvania, USA (2012).