Introduction
In response to Sacha Barber's article (please read his article before you continue), I couldn't help but think about generalizing the approach. After some discussion with Sacha, he suggested that I post a follow up article. So here we go.
Disclaimer: This is my first article, please bear that in mind. I appreciate any feedback. More importantly, as I mentioned to Sacha, I'm entirely new to genetic algorithms, so please do not think of me as any kind of expert. This article is more about generalization than GA specifically. But I think it poses some interesting questions, and I hope it will prompt some discussion.
Recall Sacha's sample problem: from 10 cards labeled 1-10, create two piles A and B, where the sum of the cards in A is as close as possible to 36, and the product of the cards in B is as close as possible to 360.
Generalization
The idea would be to try to create a simple framework that will allow a person wishing to implement a genetic algorithm to do so with as little work as possible, allowing them to focus on the specifics of their fitness tests and trusting that the framework will cycle through the population properly.
I first considered the general approach Sacha described:
Genetic Cycle Part I
- Select two members of the population
- Determine the "winner" 1 and the "loser" 1
- Randomly recombine the winner with the loser
- Randomly mutate the loser
I wondered if perhaps there was any use to more generalization, so just for fun, I took it one step further:
Genetic Cycle Part II
- Select N members of the population, where N > 1
- Determine the W "winners" and the L "losers", where W+L = N
- Randomly recombine the winners with the losers
- Randomly mutate the losers
Now, I don't know how relevant this second generalization is. I'm not even sure it will work properly for N > 2. Sacha suggested that it might end up with a population that lacks sufficient diversity to find a solution:
The only thing you would have to watch is that after some time (n-times round the loop), the population may end up being too similar if you keep a few losers in. The result of which could be that the dominant winner is not different enough, and recombination and mutation won't make enough difference in actually creating an organism which could solve the problem domain.
With that caveat in mind, let's continue.
Why generalize? Well, this - i.e., GA - seems like it's a problem that comes up a lot, so an approach that is easy to reuse would seem appropriate. My idea is to have a base class that works with any N, W, and L, and then make derived classes to suit the situation. Specifically, I made a derived class with N=2, W=1, and L=1, and I called it BinaryGeneticAlgorithm
. Then, I made a derived class of BinaryGeneticAlgorithm
called "CardProblem
". It is hoped that CardProblem
and similar classes are now significantly easier to create.
IGenotype
Before we get to the base classes, we need an interface to work with. IGenotype
provides an interface for, you guessed it, genotypes. It just maintains a list of Gene
objects that are either On
or Off
. (There is some additional code to make Gene
objects "nice" to display in a property browser.)
public interface IGenotype
{
[Editor(typeof(CollectionEditor), typeof(UITypeEditor))]
List<Gene> Genes { get; }
}
Here are the On
/Off
flags:
[Flags]
public enum EGene
{
Off = 0,
On = 1,
}
The Gene
class is really straightforward, so just look at the source if you want to see the details.
GeneticAlgorithmBase
The GeneticAlgorithmBase
class defines a template for running tournaments on a population of IGenotype
objects. You just run a tournament, specifying a population and a number of iterations. During the tournament, the overall winners are kept, and each successive winner is compared to the overall winner. I also check to see if the current overall winner satisfies the genetic requirements. If so, I stop the tournament. Here's the code:
public List<IGenotype> Tournament(List<IGenotype> Population,
int NumberOfIterations)
{
List<IGenotype> returnVal = null;
for (int i = 0; i < NumberOfIterations; i++)
{
List<IGenotype> currentWinners
= GeneticCycle(Population);
if (returnVal == null)
{
returnVal = currentWinners;
}
else
{
List<IGenotype> chosen = new List<IGenotype>();
chosen.AddRange(returnVal);
chosen.AddRange(currentWinners);
List<IGenotype> Winners, Losers;
this.ChooseWinners(chosen, out Winners, out Losers);
returnVal = Winners;
if (this.Evaluate(returnVal))
{
break;
}
}
}
return returnVal;
}
The genetic cycle is modeled in a template method. First, choose two or more members from the population, then choose the winner(s) and loser(s), recombine, and mutate. Finally, return the winner(s):
public List<IGenotype> GeneticCycle(List<IGenotype> Pop)
{
List<IGenotype> Chosen = this.PickGenotypes(Pop,
this.m_NumberOfSelectees);
List<IGenotype> Winners, Losers;
this.ChooseWinners(Chosen, out Winners, out Losers);
this.Recombine(Winners, Losers);
this.Mutate(Losers);
return Winners;
}
The code to pick the genotypes to use is a little cumbersome, but basically we're choosing random positions from the population, but once we choose one, we need to make sure we don't choose it again.
private List<IGenotype> PickGenotypes(List<IGenotype> Population,
int NumberOfSelectees)
{
List<IGenotype> returnVal = new List<IGenotype>();
Random r = new Random();
int max = Population.Count - 1;
List<int> usedPositions = new List<int>();
for (int i = 0; i < NumberOfSelectees; i++)
{
int newPosition = -1;
do
{
newPosition = (int)Math.Round(r.NextDouble() * (double)max);
} while (usedPositions.Contains(newPosition));
usedPositions.Add(newPosition);
returnVal.Add(Population[newPosition]);
}
if (returnVal.Count < 2)
{
throw new Exception("Must have at least 2 chosen genotypes.");
}
return (List<IGenotype>)returnVal;
}
I provide a template method to quickly create a population:
public List<IGenotype> CreatePopulation(int PopulationSize)
{
if (PopulationSize < 2)
{
throw new ArgumentException("PopulationSize must be greater than 2");
}
List<IGenotype> returnVal =
new List<IGenotype>(PopulationSize);
for (int i = 0; i < PopulationSize; i++)
{
returnVal.Add(this.CreateGenotype());
}
return returnVal;
}
Finally, there's a bunch of abstract methods that inheritors will need to implement:
public abstract void ChooseWinners(List<IGenotype> Potentials,
out List<IGenotype> Winners, out List<IGenotype> Losers);
protected abstract void Recombine(List<IGenotype> Winners,
List<IGenotype> Losers);
protected abstract void Mutate(List<IGenotype> Losers);
protected abstract bool Evaluate(List<IGenotype> Genes);
protected abstract IGenotype CreateGenotype();
The first three - ChooseWinners
, Recombine
, and Mutate
- should be obvious as they are just steps in the cycle.
Evaluate
is the method which determines the genetic fitness. Note: This is not the same as Sacha's Evaluate
method, which is used for comparing fitness between two genotypes. Rather, this just indicates whether the specified genotypes fulfill the solution requirements or not. For instance, in Sacha's example, the genetic requirements are as described above. More on this later.
CreateGenotype
does just that - it lets the implementor create a genotype. It is used by the CreatePopulation
template method.
That's it for GeneticAlgorithmBase
. Let's move on to BinaryAlgorithmBase
.
BinaryAlgorithmBase
We've already done a lot of the work in GeneticAlgorithmBase
, so what do we need to do here? Well, first, we need to specify N - the number of selected members of the population. We also need to know the probability of recombination and mutation. We do this in the constructor.
public BinaryGeneticAlgorithm(double ChanceOfInfect, double ChanceOfMutate)
: base(2)
{
m_ChanceOfInfect = ChanceOfInfect;
m_ChanceOfMutate = ChanceOfMutate;
this.m_NumberOfSelectees = 2;
}
Next, we need to simplify some of the abstract methods to work with selection sizes of only two. First, we override the ChooseWinners
method, selecting the first two entries from the list of chosen selected genotypes. Then, we pass them out to another abstract method, also called ChooseWinners
, but this one only expects two selected genotypes, and only produces one winner and one looser:
public override void ChooseWinners(List<IGenotype> Potentials,
out List<IGenotype> Winners, out List<IGenotype> Losers)
{
Winners = (List<IGenotype>)new List<IGenotype>();
Losers = (List<IGenotype>)new List<IGenotype>();
IGenotype gWinner, gLoser;
this.ChooseWinners(Potentials[0], Potentials[1], out gWinner, out gLoser);
((List<IGenotype>)Winners).Add(gWinner);
((List<IGenotype>)Losers).Add(gLoser);
}
Now, we override the Recombine
and Mutate
methods. In each case, I've pretty much used Sacha's code to randomly select which genes get recombined / mutated. Note that we use the class variables m_ChanceOfInfect
and m_ChanceOfMutate
to determine the random chance.
protected override void Recombine(List<IGenotype> Winners,
List<IGenotype> Losers)
{
IGenotype Winner = Winners[0],
Loser = Losers[0];
Random r = new Random();
for (int i = 0; i < Loser.Genes.Count; i++)
{
double rand = r.NextDouble();
if (rand < this.m_ChanceOfInfect)
{
Loser.Genes[i].SetGene(Winner.Genes[i].Value);
}
}
}
protected override void Mutate(List<IGenotype> Losers)
{
IGenotype Loser = Losers[0];
Random r = new Random();
for (int i = 0; i < Loser.Genes.Count; i++)
{
double rand = r.NextDouble();
if (rand < this.m_ChanceOfMutate)
{
Loser.Genes[i].SetGene(Loser.Genes[i].Value == EGene.On
? EGene.Off
: EGene.On);
}
}
}
The Evaluate
method just delegates out to an overload that accepts a single IGenotype
:
protected override bool Evaluate(List<IGenotype> Genes)
{
return Evaluate(Genes[0]);
}
Well, that's pretty much it for BinaryAlgorithmBase
. We just have a couple of abstract methods that inheritors will override, but these are pretty straightforward. We just want to make sure that our inheritors are working with two selections and no more.
protected abstract void ChooseWinners(IGenotype First,
IGenotype Second, out IGenotype gWinner, out IGenotype gLoser);
protected abstract bool Evaluate(IGenotype Genes);
That's it. Now we can move on to CardProblem
.
CardProblem
So now, we come down to the actual problem we're trying to solve, and it only took 285 lines to get here. Let's hope it's worth the wait. To begin with, we'll pass some arguments to CardProblem
's constructor. We need to know:
- How many cards we're dealing with
- The total sum we are trying to achieve (in our example, it's 36)
- The total product we are trying to achieve (in our example, it's 360)
- The chance of recombination
- The chance of mutation
public CardProblem(int NumberOfCards, int SumTarget, int ProductTarget,
double ChanceOfInfection, double ChanceOfMutation)
: base(ChanceOfInfection, ChanceOfMutation)
{
this.m_NumberOfCards = NumberOfCards;
this.m_SumTarget = SumTarget;
this.m_ProdTarget = ProductTarget;
}
Next, we have some overrides to do. The first one is ChooseWinners
. What are we doing? Just casting the arguments to a specific type of IGenotype
called "CardGenes
" (see the source code for details). Then, we're evaluating them and assigning the winner and loser. That's it. We could almost move this up to the BinaryGeneticAlgorithm
level, but this evaluation looks at the double
values returned by "double Evaluate(IGenotype Genes)
" which may not always be desirable. BinaryGeneticAlgorithm
needs to be flexible enough to work with different kinds of constraints. Since implementors are defining their own measure of fitness, it makes sense that they would define their own method of comparison as well.
protected override void ChooseWinners(IGenotype First,
IGenotype Second, out IGenotype Winner, out IGenotype Loser)
{
CardGenes f = First as CardGenes,
s = Second as CardGenes;
double fFitness = Evaluate(f),
sFitness = Evaluate(s);
string result = null;
if ( fFitness < sFitness)
{
Winner = f;
Loser = s;
}
else
{
Winner = s;
Loser = f;
}
}
The Evaluate
method is pretty straightforward. Remember how I talked about the Evaluate
method earlier? Well, in this case, Evaluate
would indicate whether a current genotype provides a solution where the error is zero (i.e., the sum of A is exactly X - e.g., 36, and the product of B is exactly Y - e.g., 360). Note that Evaluate
delegates to a private method (see below):
protected override bool Evaluate(IGenotype Genes)
{
CardGenes g = Genes as CardGenes;
bool returnVal = Evaluate(g) == 0.0;
return returnVal;
}
Here's the CreateGenotype
method. Not much to it.
protected override IGenotype CreateGenotype()
{
return new CardGenes(this.m_NumberOfCards);
}
Now, let's look at that private Evaluate
method. It should be pretty familiar. It's just Sacha's evaluate code pretty much word for word, just doctored enough to fit the framework:
private double Evaluate(CardGenes f)
{
double sumErr, prodErr, combinedErr;
int sum = 0, prod = 1;
for (int i = 0; i < f.Genes.Count; i++)
{
Gene gene = f.Genes[i];
if (gene.Value == EGene.On)
{
sum += (i + 1);
}
else
{
prod *= (i + 1);
}
}
sumErr = ((double)(sum - m_SumTarget)) / (double)m_SumTarget;
prodErr = ((double)(prod - m_ProdTarget)) / (double)m_ProdTarget;
combinedErr = Math.Abs(sumErr) + Math.Abs(prodErr);
f.Sum = sum;
f.Product = prod;
return combinedErr;
}
So that's pretty much it. There's just the CardGenes
class, which implements IGenotype
. Check the source code out for details. There's also the UI...
UI
The UI in the sample code simply provides an easy way of passing in the necessary parameters to run tournaments, so you can play with different combinations, and have a way to view the results. Set the Sum Target, Product Target, and Number of Cards. These parameters are specific to the card problem. Set the population size to create a population of a specific size. Set the Chance of Recombination and Chance of Mutation each to a value between 0 and 1. Finally, set the Number of Iterations to specify how long the tournament should be. Once you've set all the values, click the "Go!" button. After a while, it will become enabled again, signaling the completion of the Tournament
. Depending on the tournament results, the "View Results" button will be enabled. Here's the main form:

The main form keeps track of the overall winner of all tournaments, similar to the way the GeneticAlgorithmBase
keeps track of the overall winner in a tournament. After each tournament, the main form compares the winner of the tournament with the overall winner. If the tournament winner is "better" than the overall winner, it becomes the new overall winner, and the View Results button is enabled. You can then click the View Results and see the results, which look like this:

This is just a property browser in a form. The property browser is displaying the current overall winning IGenotype
. You can keep the old Results forms open so you can compare the differences between winners and verify that you are approaching a solution. You can also look at the specific genes in the IGenotype
by clicking the ellipses button "..." next to the "Gene
s" collection. Doing so displays the Gene
collection editor.

That's it for the UI.
Questions:
I look forward to your feedback. In particular, I'm really interested in:
- How valuable you think
GeneticAlgorithmBase
is. Do you think it's useful, or should it just be collapsed into BinaryAlgorithmBase
? Why?
- How's my generalization? Do you think I applied some good OOP strategies to facilitate reuse? Anything I should work on?
- Any other comments you might have ;)
Known Issues
Here are some things that I've noticed in this approach:
- The code only deals with the GA that Sacha described. It's my understanding that there are others. I'm not familiar with those, so they are not included here.
GeneticAlgorithmBase
works with collections of IGenotype
s. There could be a problem when there is more than one winner and/or more than one loser. Specifically, with such "PolyGenenticAlgorithm
" classes, recombining the winners with the losers might cause problems because it's not clear which winner should be recombined with which loser, or whether each winner should be recombined with each loser. Since recombination is left to the implementor, it would be up to the developer to determine somehow an appropriate way to recombine. For instance, in a TernaryGeneticAlgorithm
, there would be either one winner and two losers, or one loser and two winners, and in both cases, the question is how do the winners affect the losers? The developer would need to answer such questions.
- As Sacha mentioned, there's the possibility that multiple winners recombined with losers could result in a population that lacks sufficient diversity to find a solution.
- This has more to do with design, but I think using
IComparable
to compare IGenotype
s might be useful in simplifying things. I'll look into this more.
I am a software developer. I live and work in Toronto. I'm one of the lucky ones: my work is my passion - I code for fun. I enjoy learning new coding techniques and teaching others about software development - I find that the more I teach, the more I learn.
When I'm not coding, you can find me inline skating along Toronto's waterfront or playing Psychonauts or Splinter Cell.