Introduction to Evolutionary Algorithms
Evolutionary Algorithms
Among the set of search and optimization techniques, the development of Evolutionary Algorithms (EA) has been very important in the last decade. EAs are a set of modern metaheuristics used successfully in many applications of great complexity. Their success in solving difficult problems has been the engine of a field known as Evolutionary Computation (EC).
The benefits of EC techniques come mostly from their flexibility and their fit to the objective target, in combination with robust behavior. Nowadays, EC is considered an adaptable concept for problem solving, especially for complex optimization problems. This vision is the alternative to some older descriptions that present EC as a collection of similar algorithms ready to be used in any problem.
Most present implementations of EAs descend from one of three basic types, strongly related although independently developed: Genetic Algorithms (GA), Evolutionary Programming (EP), and Evolution Strategies (ES). Two further groups of EAs have become very important: Genetic Programming (GP), applied to many real problems, and Classifier Systems (CS), used for machine learning and for discovering rules in rule-based systems (RBS).
Genetic Algorithms
A Genetic Algorithm (GA) is a heuristic used to find a vector x* (a string) of free parameters with associated values in an admissible region for which an arbitrary quality criterion is optimized:

find x* such that f(x*) = opt { f(x) | x ∈ M },

where M is the admissible region of parameter vectors and f is the quality (fitness) criterion.
A sequential genetic algorithm proceeds in an iterative manner by generating new populations of strings from the old ones. Every string is the encoded version of a tentative solution. An evaluation function associates a fitness measure with every string, indicating its suitability to the problem. The algorithm applies stochastic operators such as selection, crossover, and mutation to an initially random population in order to compute a whole generation of new strings.
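As an illustration, the following is a minimal sketch of this loop in Python. The bit-string encoding, tournament selection, one-point crossover, and the simple bit-counting fitness are illustrative choices, not prescribed by the text above.

```python
import random

def fitness(bits):
    # Illustrative criterion: count the 1-bits in the string ("OneMax").
    return sum(bits)

def tournament(pop, k=2):
    # Selection: return the fittest of k randomly drawn individuals.
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):
    # One-point crossover: cut both parents and swap the tails.
    cut = random.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(bits, rate=0.01):
    # Mutation: flip each allele independently with a small probability.
    return [1 - g if random.random() < rate else g for g in bits]

def sequential_ga(length=32, pop_size=50, generations=100):
    # Initially random population of encoded tentative solutions.
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        new_pop = []
        while len(new_pop) < pop_size:
            c1, c2 = crossover(tournament(pop), tournament(pop))
            new_pop += [mutate(c1), mutate(c2)]
        pop = new_pop[:pop_size]
    return max(pop, key=fitness)

best = sequential_ga()
print(fitness(best), best)
```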
Since GAs apply operations drawn from nature, the nomenclature used in this field is closely related to terms we find in biology. The next table summarizes the meaning of these special terms, with the aim of helping new researchers.
Genotype | The code devised to represent the parameters of the problem in the form of a string. |
Chromosome | One encoded string of parameters (binary, Gray, floating-point, etc.). |
Individual | One or more chromosomes with an associated fitness value. |
Gene | The encoded version of a parameter of the problem being solved. |
Allele | Value that a gene can assume (binary, integer, etc.). |
Locus | The position that the gene occupies in the chromosome. |
Phenotype | Problem version of the genotype (as opposed to its algorithm version), suited for being evaluated. |
Fitness | Real value indicating the quality of an individual as a solution to the problem. |
Environment | The problem. This is represented as a function indicating the suitability of phenotypes. |
Population | A set of individuals with their associated statistics (fitness average, Hamming distances, ...). |
Selection | Policy for selecting one individual from the population (selection of the fittest,...). |
Crossover | Operation that merges the genotypes of two selected parents to yield two new children. |
Mutation | Operation that spontaneously changes one or more alleles of the genotype. |
Unlike most other optimization techniques, GAs maintain a population of tentative solutions that are manipulated competitively by applying variation operators to find a global optimum. This characteristic, although very useful for escaping from local optima, requires high computational resources (large memory and long search times, for example), and thus a variety of techniques are being studied to design efficient GAs. With this goal, numerous advances are continuously being achieved by designing new operators, hybrid algorithms, and more. One very important improvement of this kind consists in using parallel models of GAs (PGAs).
PGAs are not merely parallel versions of sequential GAs. In fact, they can reach the ideal goal of a parallel algorithm whose behavior is better than the sum of the separate behaviors of its component sub-algorithms.
Sequential Genetic Algorithms
The sequential genetic algorithm operates on a population of strings or, in general, structures of arbitrary complexity representing tentative solutions. In textbook versions of GAs, every string is called an individual, and it is composed of one or more chromosomes and a fitness value. Normally, an individual contains just one chromosome, which represents the set of parameters, called genes. Every gene is in turn encoded (usually in binary) by using a given number of alleles (0, 1).
[Figure: Sequential evolutionary algorithm]
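As a small illustration of the gene encoding described above, the sketch below decodes a binary chromosome into real-valued parameters; the two genes, eight alleles per gene, and the [-5, 5] range are hypothetical choices.

```python
def decode(chromosome, n_genes=2, bits_per_gene=8, lo=-5.0, hi=5.0):
    # Genotype -> phenotype: split the chromosome into genes and map
    # each block of binary alleles onto a real value in [lo, hi].
    params = []
    for i in range(n_genes):
        gene = chromosome[i * bits_per_gene:(i + 1) * bits_per_gene]
        value = int("".join(map(str, gene)), 2)   # integer in [0, 2^bits - 1]
        params.append(lo + (hi - lo) * value / (2**bits_per_gene - 1))
    return params

# A 16-allele chromosome encoding two genes:
print(decode([1,0,0,0,0,0,0,0, 1,1,1,1,1,1,1,1]))  # -> [~0.02, 5.0]
```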
Parallel Genetic Algorithms
As we said above, PGAs are not merely parallel versions of sequential GAs. A PGA retains the advantages of a serial GA: it works on representations of the problem parameters (and not on the parameters themselves), it is robust, it is easily customized for a new problem, and it has multiple-solution capabilities. These are the characteristics that have made GAs and other EAs worthy of study and use. In addition, a PGA is usually faster, less prone to finding only sub-optimal solutions, and able to cooperate with other search techniques running in parallel. These are, in summary, the advantages of using a PGA.
PGAs are a class of guided random evolutionary algorithms. Taking a deeper look at them, we can distinguish four different types of parallelization. Types 1 and 2 proceed in the same way as a sequential GA, but faster. This is achieved either by profiting from a code parallelizer embedded in a compiler, or by explicit parallelization (master-slave global parallelization) of the genetic operators and/or the evaluations. However, they represent a viable choice only for problems with a time-consuming function evaluation; otherwise the communication overhead is higher than the benefits of their parallel execution.
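A minimal sketch of this master-slave scheme, assuming Python's multiprocessing module and a stand-in for a costly evaluation function; only the evaluations are distributed, while the rest of the algorithm stays in the master.

```python
import random
from multiprocessing import Pool

def costly_fitness(bits):
    # Stand-in for a time-consuming evaluation; only then does the
    # master-slave scheme pay off against its communication overhead.
    return sum(bits)

if __name__ == "__main__":
    pop = [[random.randint(0, 1) for _ in range(64)] for _ in range(200)]
    with Pool() as workers:
        # Master-slave global parallelization: the master keeps the single
        # panmictic population; only the evaluations run in parallel.
        scores = workers.map(costly_fitness, pop)
    print("best fitness:", max(scores))
```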
Many different models have been proposed for the remaining PGAs (types 3 and 4). However, all the existing models fit into two more general classes depending on their grain of parallelism: coarse and fine grained parallel GAs (cgpGAs and fgpGAs). This classification relies on the computation/communication ratio: if the ratio is high we call the PGA a coarse grain algorithm, and if it is low, a fine grain one.
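To make the coarse grain idea concrete, here is a rough island-model sketch: several sub-populations evolve independently and occasionally exchange individuals (migration). The four islands, ring topology, and migration interval are illustrative choices, and the per-generation step is deliberately simplified.

```python
import random

def fitness(bits):
    # Illustrative bit-counting criterion, as in the earlier sketches.
    return sum(bits)

def one_generation(island, rate=0.02):
    # Deliberately simplified generation: mutate each individual and
    # keep whichever of parent/offspring is fitter (no crossover here).
    new_island = []
    for ind in island:
        child = [1 - g if random.random() < rate else g for g in ind]
        new_island.append(max(ind, child, key=fitness))
    return new_island

# Four sub-populations (islands) of 20 individuals of 32 bits each.
islands = [[[random.randint(0, 1) for _ in range(32)] for _ in range(20)]
           for _ in range(4)]

for gen in range(100):
    islands = [one_generation(isl) for isl in islands]
    if gen % 10 == 0:
        # Migration on a ring: each island sends a copy of its best
        # individual to the next island, replacing its worst one.
        for i in range(len(islands)):
            best = max(islands[i], key=fitness)
            dest = islands[(i + 1) % len(islands)]
            worst_idx = min(range(len(dest)), key=lambda j: fitness(dest[j]))
            dest[worst_idx] = list(best)

print("best overall:", max(fitness(ind) for isl in islands for ind in isl))
```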
As a stochastic technique, a PGA has three major steps: initial sampling, optimization, and checking the stopping criterion. It therefore begins (t=0) by randomly creating a population P(t=0) of m structures, each one encoding the p problem variables over some alphabet.
Each structure is usually a vector over B = {0,1} (I = B^(p·lx)) or over R (I = R^p). Consequently, each problem variable is encoded in lx bits or as a floating-point number. However, as mentioned before, other non-string representations are possible.
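For instance, the initial sampling step could be sketched as follows, with illustrative values for m, p, and lx.

```python
import random

m, p, lx = 100, 4, 16         # population size, variables, bits per variable

# P(t=0): m structures, each a vector over {0,1} of length p*lx,
# i.e. each of the p problem variables encoded in lx bits.
P0 = [[random.randint(0, 1) for _ in range(p * lx)] for _ in range(m)]

print(len(P0), len(P0[0]))    # -> 100 64
```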
[Figure: Parallel evolutionary algorithm]