12,505,591 members (54,222 online)
alternative version

35.1K views
46 bookmarked
Posted

Generalization of a Simple Genetic Algorithm (GA)

, 20 Nov 2006
 Rate this:
Taking a simple genetic algorithm and constructing a framework to allow easy creation of similar algorithms.

Introduction

In response to Sacha Barber's article (please read his article before you continue), I couldn't help but think about generalizing the approach. After some discussion with Sacha, he suggested that I post a follow up article. So here we go.

Disclaimer: This is my first article, please bear that in mind. I appreciate any feedback. More importantly, as I mentioned to Sacha, I'm entirely new to genetic algorithms, so please do not think of me as any kind of expert. This article is more about generalization than GA specifically. But I think it poses some interesting questions, and I hope it will prompt some discussion.

Recall Sacha's sample problem: from 10 cards labeled 1-10, create two piles A and B, where the sum of the cards in A is as close as possible to 36, and the product of the cards in B is as close as possible to 360.

Generalization

The idea would be to try to create a simple framework that will allow a person wishing to implement a genetic algorithm to do so with as little work as possible, allowing them to focus on the specifics of their fitness tests and trusting that the framework will cycle through the population properly.

I first considered the general approach Sacha described:

Genetic Cycle Part I

1. Select two members of the population
2. Determine the "winner" 1 and the "loser" 1
3. Randomly recombine the winner with the loser
4. Randomly mutate the loser

I wondered if perhaps there was any use to more generalization, so just for fun, I took it one step further:

Genetic Cycle Part II

1. Select N members of the population, where N > 1
2. Determine the W "winners" and the L "losers", where W+L = N
3. Randomly recombine the winners with the losers
4. Randomly mutate the losers

Now, I don't know how relevant this second generalization is. I'm not even sure it will work properly for N > 2. Sacha suggested that it might end up with a population that lacks sufficient diversity to find a solution:

The only thing you would have to watch is that after some time (n-times round the loop), the population may end up being too similar if you keep a few losers in. The result of which could be that the dominant winner is not different enough, and recombination and mutation won't make enough difference in actually creating an organism which could solve the problem domain.

With that caveat in mind, let's continue.

Why generalize? Well, this - i.e., GA - seems like it's a problem that comes up a lot, so an approach that is easy to reuse would seem appropriate. My idea is to have a base class that works with any N, W, and L, and then make derived classes to suit the situation. Specifically, I made a derived class with N=2, W=1, and L=1, and I called it `BinaryGeneticAlgorithm`. Then, I made a derived class of `BinaryGeneticAlgorithm` called "`CardProblem`". It is hoped that `CardProblem` and similar classes are now significantly easier to create.

`IGenotype`

Before we get to the base classes, we need an interface to work with. `IGenotype` provides an interface for, you guessed it, genotypes. It just maintains a list of `Gene` objects that are either `On` or `Off`. (There is some additional code to make `Gene` objects "nice" to display in a property browser.)

```public interface IGenotype
{
[Editor(typeof(CollectionEditor), typeof(UITypeEditor))]
List<Gene> Genes { get; }
}```

Here are the `On`/`Off` flags:

```[Flags]
public enum EGene
{
Off = 0,
On = 1,
}```

The `Gene` class is really straightforward, so just look at the source if you want to see the details.

`GeneticAlgorithmBase`

The `GeneticAlgorithmBase` class defines a template for running tournaments on a population of `IGenotype` objects. You just run a tournament, specifying a population and a number of iterations. During the tournament, the overall winners are kept, and each successive winner is compared to the overall winner. I also check to see if the current overall winner satisfies the genetic requirements. If so, I stop the tournament. Here's the code:

```public List<IGenotype> Tournament(List<IGenotype> Population,
int NumberOfIterations)
{
List<IGenotype> returnVal = null;
// This will be the overall winner for the tournament

for (int i = 0; i < NumberOfIterations; i++)
{
// Run the cycle for the current genotypes
List<IGenotype> currentWinners
= GeneticCycle(Population);

// Keep track of the overall winner for the tournament
if (returnVal == null)
{
// By default, the first winner is the starting overall winner
returnVal = currentWinners;
}
else
{
// Compare the current winner to the overall winner
// and update the overall winner
List<IGenotype> chosen = new List<IGenotype>();
List<IGenotype> Winners, Losers;
// NOTE: This could lead to winners from returnVal intermingled
// with winners from currentWinners, which may not be desirable?
this.ChooseWinners(chosen, out Winners, out Losers);
returnVal = Winners;

// Check if our overall winner fulfills the genetic requirements
if (this.Evaluate(returnVal))
{
break; // We have a winner so stop the tournament
}
}
}
return returnVal;
}```

The genetic cycle is modeled in a template method. First, choose two or more members from the population, then choose the winner(s) and loser(s), recombine, and mutate. Finally, return the winner(s):

```public List<IGenotype> GeneticCycle(List<IGenotype> Pop)
{
// Choose two Genotypes from the population
List<IGenotype> Chosen = this.PickGenotypes(Pop,
this.m_NumberOfSelectees);

// Determine which of the two is the winning Genotype
List<IGenotype> Winners, Losers;
this.ChooseWinners(Chosen, out Winners, out Losers);

// Infect the loser's genes with some portion of the winner's genes
this.Recombine(Winners, Losers);

// Mutate the loser's genes
this.Mutate(Losers);

return Winners;
}```

The code to pick the genotypes to use is a little cumbersome, but basically we're choosing random positions from the population, but once we choose one, we need to make sure we don't choose it again.

```private List<IGenotype> PickGenotypes(List<IGenotype> Population,
int NumberOfSelectees)
{
List<IGenotype> returnVal = new List<IGenotype>();
Random r = new Random();
// We'll choose random numbers between 0 and
// the total size of the population
int max = Population.Count - 1;

List<int> usedPositions = new List<int>();
for (int i = 0; i < NumberOfSelectees; i++)
{
// We need to remove positions from the pool as we go along
int newPosition = -1;
do
{
newPosition = (int)Math.Round(r.NextDouble() * (double)max);
} while (usedPositions.Contains(newPosition));

// Select the entry from the population
}

if (returnVal.Count < 2)
{
throw new Exception("Must have at least 2 chosen genotypes.");
}
return (List<IGenotype>)returnVal;
}```

I provide a template method to quickly create a population:

```public List<IGenotype> CreatePopulation(int PopulationSize)
{
if (PopulationSize < 2)
{
throw new ArgumentException("PopulationSize must be greater than 2");
}

List<IGenotype> returnVal =
new List<IGenotype>(PopulationSize);
for (int i = 0; i < PopulationSize; i++)
{
}

return returnVal;
}```

Finally, there's a bunch of abstract methods that inheritors will need to implement:

```public abstract void ChooseWinners(List<IGenotype> Potentials,
out List<IGenotype> Winners, out List<IGenotype> Losers);
protected abstract void Recombine(List<IGenotype> Winners,
List<IGenotype> Losers);
protected abstract void Mutate(List<IGenotype> Losers);
protected abstract bool Evaluate(List<IGenotype> Genes);
protected abstract IGenotype CreateGenotype();```

The first three - `ChooseWinners`, `Recombine`, and `Mutate` - should be obvious as they are just steps in the cycle.

`Evaluate` is the method which determines the genetic fitness. Note: This is not the same as Sacha's `Evaluate` method, which is used for comparing fitness between two genotypes. Rather, this just indicates whether the specified genotypes fulfill the solution requirements or not. For instance, in Sacha's example, the genetic requirements are as described B, where the sum of the cards in A is as close as possible to 36, and the product of the cards in B is as close as possible to 360" href="#requirements">above. More on this later.

`CreateGenotype` does just that - it lets the implementor create a genotype. It is used by the `CreatePopulation` template method.

That's it for `GeneticAlgorithmBase`. Let's move on to `BinaryAlgorithmBase`.

`BinaryAlgorithmBase`

We've already done a lot of the work in `GeneticAlgorithmBase`, so what do we need to do here? Well, first, we need to specify N - the number of selected members of the population. We also need to know the probability of recombination and mutation. We do this in the constructor.

```public BinaryGeneticAlgorithm(double ChanceOfInfect, double ChanceOfMutate)
: base(2)
{
m_ChanceOfInfect = ChanceOfInfect;
m_ChanceOfMutate = ChanceOfMutate;
this.m_NumberOfSelectees = 2;
}```

Next, we need to simplify some of the abstract methods to work with selection sizes of only two. First, we override the `ChooseWinners` method, selecting the first two entries from the list of chosen selected genotypes. Then, we pass them out to another abstract method, also called `ChooseWinners`, but this one only expects two selected genotypes, and only produces one winner and one looser:

```public override void ChooseWinners(List<IGenotype> Potentials,
out List<IGenotype> Winners, out List<IGenotype> Losers)
{
Winners = (List<IGenotype>)new List<IGenotype>();
Losers = (List<IGenotype>)new List<IGenotype>();
IGenotype gWinner, gLoser;
this.ChooseWinners(Potentials[0], Potentials[1], out gWinner, out gLoser);
}```

Now, we override the `Recombine` and `Mutate` methods. In each case, I've pretty much used Sacha's code to randomly select which genes get recombined / mutated. Note that we use the class variables `m_ChanceOfInfect` and `m_ChanceOfMutate` to determine the random chance.

```protected override void Recombine(List<IGenotype> Winners,
List<IGenotype> Losers)
{
IGenotype Winner = Winners[0],
Loser = Losers[0];
Random r = new Random();
for (int i = 0; i < Loser.Genes.Count; i++)
{
double rand = r.NextDouble();
if (rand < this.m_ChanceOfInfect)
{
Loser.Genes[i].SetGene(Winner.Genes[i].Value);
}
}
}
protected override void Mutate(List<IGenotype> Losers)
{
IGenotype Loser = Losers[0];
Random r = new Random();
for (int i = 0; i < Loser.Genes.Count; i++)
{
double rand = r.NextDouble();
if (rand < this.m_ChanceOfMutate)
{
Loser.Genes[i].SetGene(Loser.Genes[i].Value == EGene.On
? EGene.Off
: EGene.On);
}
}
}```

The `Evaluate` method just delegates out to an overload that accepts a single `IGenotype`:

```protected override bool Evaluate(List<IGenotype> Genes)
{
return Evaluate(Genes[0]);
}```

Well, that's pretty much it for `BinaryAlgorithmBase`. We just have a couple of abstract methods that inheritors will override, but these are pretty straightforward. We just want to make sure that our inheritors are working with two selections and no more.

```protected abstract void ChooseWinners(IGenotype First,
IGenotype Second, out IGenotype gWinner, out IGenotype gLoser);
protected abstract bool Evaluate(IGenotype Genes);```

That's it. Now we can move on to `CardProblem`.

`CardProblem`

So now, we come down to the actual problem we're trying to solve, and it only took 285 lines to get here. Let's hope it's worth the wait. To begin with, we'll pass some arguments to `CardProblem`'s constructor. We need to know:

1. How many cards we're dealing with
2. The total sum we are trying to achieve (in our example, it's 36)
3. The total product we are trying to achieve (in our example, it's 360)
4. The chance of recombination
5. The chance of mutation
```public CardProblem(int NumberOfCards, int SumTarget, int ProductTarget,
double ChanceOfInfection, double ChanceOfMutation)
: base(ChanceOfInfection, ChanceOfMutation)
{
this.m_NumberOfCards = NumberOfCards;
this.m_SumTarget = SumTarget;
this.m_ProdTarget = ProductTarget;
}```

Next, we have some overrides to do. The first one is `ChooseWinners`. What are we doing? Just casting the arguments to a specific type of `IGenotype` called "`CardGenes`" (see the source code for details). Then, we're evaluating them and assigning the winner and loser. That's it. We could almost move this up to the `BinaryGeneticAlgorithm` level, but this evaluation looks at the `double` values returned by "`double Evaluate(IGenotype Genes)`" which may not always be desirable. `BinaryGeneticAlgorithm` needs to be flexible enough to work with different kinds of constraints. Since implementors are defining their own measure of fitness, it makes sense that they would define their own method of comparison as well.

```protected override void ChooseWinners(IGenotype First,
IGenotype Second, out IGenotype Winner, out IGenotype Loser)
{
CardGenes f = First as CardGenes,
s = Second as CardGenes;
double fFitness = Evaluate(f),
sFitness = Evaluate(s);
string result = null;
if ( fFitness < sFitness)
{
Winner = f;
Loser = s;
}
else
{
Winner = s;
Loser = f;
}
}```

The `Evaluate` method is pretty straightforward. Remember how I talked about the likely provide an overload that returns information more specific than a bool, but this one just indicates whether the specified genotypes fulfill the requirements or not." href="#evaluate">`Evaluate` method earlier? Well, in this case, `Evaluate` would indicate whether a current genotype provides a solution where the error is zero (i.e., the sum of A is exactly X - e.g., 36, and the product of B is exactly Y - e.g., 360). Note that `Evaluate` delegates to a private method (see below):

```protected override bool Evaluate(IGenotype Genes)
{
CardGenes g = Genes as CardGenes;
bool returnVal = Evaluate(g) == 0.0;
return returnVal;
}```

Here's the `CreateGenotype` method. Not much to it.

```protected override IGenotype CreateGenotype()
{
return new CardGenes(this.m_NumberOfCards);
}```

Now, let's look at that private `Evaluate` method. It should be pretty familiar. It's just Sacha's evaluate code pretty much word for word, just doctored enough to fit the framework:

```private double Evaluate(CardGenes f)
{
double sumErr, prodErr, combinedErr;
int sum = 0, prod = 1;

for (int i = 0; i < f.Genes.Count; i++)
{
Gene gene = f.Genes[i];
if (gene.Value == EGene.On)
{
// On means sum
sum += (i + 1);
}
else
{
// Off means product
prod *= (i + 1);
}
}
sumErr = ((double)(sum - m_SumTarget)) / (double)m_SumTarget;
prodErr = ((double)(prod - m_ProdTarget)) / (double)m_ProdTarget;
combinedErr = Math.Abs(sumErr) + Math.Abs(prodErr);
f.Sum = sum;
f.Product = prod;
return combinedErr;
}```

So that's pretty much it. There's just the `CardGenes` class, which implements `IGenotype`. Check the source code out for details. There's also the UI...

UI

The UI in the sample code simply provides an easy way of passing in the necessary parameters to run tournaments, so you can play with different combinations, and have a way to view the results. Set the Sum Target, Product Target, and Number of Cards. These parameters are specific to the card problem. Set the population size to create a population of a specific size. Set the Chance of Recombination and Chance of Mutation each to a value between 0 and 1. Finally, set the Number of Iterations to specify how long the tournament should be. Once you've set all the values, click the "Go!" button. After a while, it will become enabled again, signaling the completion of the `Tournament`. Depending on the tournament results, the "View Results" button will be enabled. Here's the main form:

The main form keeps track of the overall winner of all tournaments, similar to the way the `GeneticAlgorithmBase` keeps track of the overall winner in a tournament. After each tournament, the main form compares the winner of the tournament with the overall winner. If the tournament winner is "better" than the overall winner, it becomes the new overall winner, and the View Results button is enabled. You can then click the View Results and see the results, which look like this:

This is just a property browser in a form. The property browser is displaying the current overall winning `IGenotype`. You can keep the old Results forms open so you can compare the differences between winners and verify that you are approaching a solution. You can also look at the specific genes in the `IGenotype` by clicking the ellipses button "..." next to the "`Gene`s" collection. Doing so displays the `Gene` collection editor.

That's it for the UI.

Questions:

I look forward to your feedback. In particular, I'm really interested in:

1. How valuable you think `GeneticAlgorithmBase` is. Do you think it's useful, or should it just be collapsed into `BinaryAlgorithmBase`? Why?
2. How's my generalization? Do you think I applied some good OOP strategies to facilitate reuse? Anything I should work on?
3. Any other comments you might have ;)

Known Issues

Here are some things that I've noticed in this approach:

1. The code only deals with the GA that Sacha described. It's my understanding that there are others. I'm not familiar with those, so they are not included here.
2. `GeneticAlgorithmBase` works with collections of `IGenotype`s. There could be a problem when there is more than one winner and/or more than one loser. Specifically, with such "`PolyGenenticAlgorithm`" classes, recombining the winners with the losers might cause problems because it's not clear which winner should be recombined with which loser, or whether each winner should be recombined with each loser. Since recombination is left to the implementor, it would be up to the developer to determine somehow an appropriate way to recombine. For instance, in a `TernaryGeneticAlgorithm`, there would be either one winner and two losers, or one loser and two winners, and in both cases, the question is how do the winners affect the losers? The developer would need to answer such questions.
3. As Sacha mentioned, there's the possibility that multiple winners recombined with losers could result in a population that lacks sufficient diversity to find a solution.
4. This has more to do with design, but I think using `IComparable` to compare `IGenotype`s might be useful in simplifying things. I'll look into this more.

A list of licenses authors might use can be found here

Share

I am a software developer. I live and work in Toronto. I'm one of the lucky ones: my work is my passion - I code for fun. I enjoy learning new coding techniques and teaching others about software development - I find that the more I teach, the more I learn.
When I'm not coding, you can find me inline skating along Toronto's waterfront or playing Psychonauts or Splinter Cell.

You may also be interested in...

 Pro Pro

 First Prev Next
 c# genetic algoritma Member 124261192-Apr-16 11:41 Member 12426119 2-Apr-16 11:41
 Combinatorial optimization component dark_sidus19-Aug-07 0:44 dark_sidus 19-Aug-07 0:44
 Yo Yo Sacha Barber19-Nov-06 5:41 Sacha Barber 19-Nov-06 5:41
 BUG! narsyseth17-Nov-06 11:05 narsyseth 17-Nov-06 11:05
 Re: BUG! narsyseth20-Nov-06 5:23 narsyseth 20-Nov-06 5:23
 Last Visit: 31-Dec-99 18:00     Last Update: 27-Sep-16 8:08 Refresh 1