## Introduction

**Genetic Algorithms** belong to a larger class of computer-based problem solving systems called **Evolutionary Algorithms** which use computational models that follow the principles of evolution and heredity in their design and implementation. The other variants of Evolutionary Algorithms are Evolutionary Programming and Evolution Strategies. Genetic Algorithms was developed by John Holland, his colleagues and students at the University of Michigan in early 1970s. Genetic Algorithms deal with optimization problems. Inspired by Darwin's theory of evolution, Genetic Algorithms employ a repeated process of selection, attrition, and cross-breeding of potential solutions in search of the optimal one to a problem.

## Applications

The needs for Optimization are due in part to scarcity of resources and in part to conflicting desires. As a daily ritual, we would like to choose the commuting arrangement that enable us to reach our destination in the manner that yields the cheapest fare, the shortest commuting time, and be as comfortable as possible. Is it achievable? We know that the answer is "you cannot have your cake and eat it too". The relationship between these desires are contradictory and can only be resolved through optimization. In this commuting case, individual commuters will have to search all possible combination of commuting arrangements - taxis, buses, trains or a mix of them, evaluate and compare in order to reach an optimal solution. This is optimization at work.

On a more serious note, Genetic Algorithms have been used in the following areas with varying degree of success:

**Resource Allocation** - Job shop scheduling, time-table planning, and the classical traveling salesperson problem
**Design Optimization** - Network routing, satellite orbit selection
**Machine Learning** - Optimize other machine learning systems, such as weights for neural networks, weights for case based reasoning system, and rules for classifier systems

## Core Concepts

We will step through the process of how genetic algorithms deal with optimization problem while introducing the core concepts of genetic algorithms along the way.

**Problem Modeling**
- Define the problem requirements that is what we are optimizing for (e.g. balanced distribution of containers on a vessel).
- Translate this problem into the genetic algorithms problem requirements (e.g. assignment of containers to locations in the vessel).
- Identify the information already presents in the problem (e.g. container IDs, weights, Location IDs) as well as information that needs to be computed or derived (e.g. center of gravity).
- Determine the
**fitness function** for evaluating the "goodness" of each candidate solution (e.g. the lower the overall center of gravity of each feasible solution the better).
- Encode the solution in a form that is analogous to biological's
**chromosomes** or sequence of DNA (e.g. an array of locations in the vessel with each location being filled with a container ID).
- Determine the appropriate method to evolve feasible solutions (e.g. permutation on the order of container IDs to each location in the array (Figure 1), or permutation on the location indices to each container ID (Figure 2)).

1 - Permutation by Container IDs

2 - Permutation by Location Indices

**Evolution Process**
- Start with an initial population of randomly generated candidate solutions (chromosomes).
- Evaluate each solution and assigned a measure of its fitness by the fitness function for
**Selection** in the next stage.
- Select high fitness individuals from the current population to undergoes reproduction -
**Crossover** and **Mutation**.
- Combine the features of two parents to form two similar offspring as a way to conserves and exploits good genetic traits. This is known as Crossover.
- Alter one or more components of a selected chromosome randomly to inject diversity into the population. This is known as Mutation. Mutation will help to break the dominance of elitist group by giving the weaker individuals a chance, albeit a very slim one, to survive.
- Replace older members of the population with their offspring forming a new generation of population.
- Repeat the process of evaluation, selection, crossover, mutation and replacement until some stopping criteria are met.

**Stopping Criteria**
- When a pre-determined number of iterations or generations is reached.
- When a chromosome (solution) that meets or exceeds a target fitness value is found.
- When all the chromosomes in the population have attained certain level of uniformity in terms of fitness.

## Selection Methods

We shall take a closer look at some of the selection methods for selecting new population in genetic algorithms.

**Fitness Proportionate Selection**
- Calculate the fitness value for each chromosome,
*f*_{i} (i^{th} chromosome).
- Find the total fitness of the whole population,
*F*.
- Calculate the probability of selection for each chromosome,
*p*_{i} = *f*_{i} / F.
- Calculate the cumulative probability for each chromosome,
*q*_{i} = ∑_{i} p_{i}.
- Generate a random number between [0..1],
*r*.
- If
*r* is less than *q*_{1} (i.e., cumulative probability of first chromosome), then select the first chromosome, otherwise select the i^{th} chromosome where *q*_{i-1} < r < q_{i}_{.}
- If a few chromosomes possess overly large fitness values as compared to the majority, they will be selected too often. The consequence is too-quick and pre-mature convergence, resulting in sub-optimal solution.

**Tournament Selection**
- Select two individuals from the population.
- Generate a random number between [0..1],
*r*.
- If
*r* is less than a pre-determine number *T* (tournament size), than select the fitter of the two individuals, otherwise select the weaker one.
- The two individuals are then returned to the original population and can be selected again.
- This type of selection tends to favour the fitter one more as the tournament size increases.

**Rank Selection**
- Rank the individuals in the population in ascending order according to their fitness values.
- The weakest one will get a ranking of 1, the next weakest one 2, and so on.
- Use the ranking instead of the fitness value to calculate the probability of selection for each chromosome.
- This method will prevent too-quick convergence that could happen with fitness proportionate selection method, but at the expense of slower convergence.

The choice of selection method is problem dependant and can greatly impact the optimization process and outcome. It may be decided after comparing the outcomes from these methods.

## Crossover Operators

Crossover and mutation are two core operators of genetic algorithms. Similar to selection methods, there are many ways to perform crossover and mutation. Here we discuss the crossover operation first.

In crossover, segments of two parent chromosomes are randomly chosen and swapped to produce two offspring. Some of the crossover methods are:

**Single-point Crossover**
- Select two parent chromosomes for reproduction.
- Select the position of crossover point randomly.
- Copy those genes on the left side of the crossover point of parent 1's chromosome to child 1.
- Copy those genes on the right side of the crossover point of parent 2's chromosome to child 1.
- Child 1 is born.
- Repeat the same process but swap the roles of the two parents to produce child 2.
- The above process is illustrated in Figure 3.

3 - Single-point Crossover

**Multi-point Crossover**
- Select two parent chromosomes for reproduction.
- Select the positions of two (or more) crossover points randomly.
- Copy those genes outside of the two crossover points of parent 1's chromosome to child 1.
- Copy those genes between the two crossover points of parent 2's chromosome to child 1.
- Child 1 is born.
- Repeat the same process but swap the roles of the two parents to produce child 2.
- The above process is illustrated in Figure 4.

4 - Multi-point Crossover

Not all chosen chromosome pairs will undergo crossover. We introduce a new parameter called *Probability of Crossover*, *p*_{c}. Before a crossover takes place, we generate a random number from the range of [0..1], if this random number is less than the probability of crossover, then crossover takes place otherwise aborts. This gives the expected number of chromosomes to undergo crossover at *p*_{i} x population size. The probability of crossover is also known as **crossover rate**.

## Mutation Operators

Mutation replaces the values of some randomly chosen genes of a chromosome by some arbitrary new values. One of the popular ways of mutation for a binary chromosome will be **Bit Inversion** which simply flips the values of randomly chosen bits from 1 to 0 or vice versa.

Similar to crossover, mutation does not always take place. It depends on a parameter called *Probability of Mutation*, *p*_{m}. Before a mutation takes place, we generate a random number from the range of [0..1], if this random number is less than the probability of mutation, then mutation takes place else aborts. The probability of mutation is also known as **mutation rate**.

The objective of mutation is to inject diversity into the population, prompting the genetic algorithms to explore new solutions and as such lower the risk of being trapped in a local optimum (Figure 5).

5 - Overcome Local Optimum through Mutation

## Performance

The performance of genetic algorithms in problem solving depends on the following factors:

- The encoding method
- The selection method
- The crossover operators
- The mutation operators
- The parameter settings, i.e. population size, crossover rate, and mutation rate

## When to Use

Genetic algorithms can be applied to solve problems in the following situations:

- When we do not know the ways to reasonably solve problems.
- When it is not possible to enumerate all possible solutions (NP-complete).
- But we know how to differentiate a good solution from a bad one.

## Caveats

If genetic algorithms can converge to an optimal solution, it is equally likely that it can converge to a poor solution. If that occurs, it is most probably owing to poor problem modeling, premature convergence, a poor fitness function, poor parameter settings, or simply bad luck of the random number.

Do we know whether the final solution generated by genetic algorithms optimal or any way near there? The answer is we never know. If we knew how to find the optimal solution, we would not need to use genetic algorithms in the first place. Bearing in mind that the problems that require genetic algorithms to solve are NP-complete, i.e., they are very hard problems to solve.

In other words, there is no guarantee that genetic algorithms will find an optimal solution.

## Open Source Tools

- Python
- Pyevolve - A complete python genetic algorithm framework.

- PERL
- AI::Genetic - A pure Perl genetic algorithm implementation.

## Commercial Tools

## Follow-up

In my next article Genetic Algorithms Implementation, we shall explore the design and implementation of a genetic algorithms project to optimize a load distribution problem.