Queries Aren't Just for Databases Anymore
Believe it or not, LINQ is not just about data access. LINQ makes a profound impact on everyday code. Remember, the days trying to find an element in a list by looping through elements, setting some flag (e.g. found = true;
), and calling break
to get you out of the loop? Those days are over thanks to LINQ and the powerful features that come along with it including lambda expressions, type inference, and extension methods. This article will use a classic program, Conway’s Game of Life, as a chassis for a LINQ engine. The purpose is to show how writing query expressions adds flexibility, readability, and power to code that is typically full of loops and branches that add to the cyclomatic complexity.
The Rules of the Game
The “Game” itself, is a cellular automaton. The board is a grid, composed of cells. A cell can be on (alive) or off (dead). The state of a cell in any subsequent generation depends on its current state and the state of its neighbors according to a few simple rules:
- A living cell will die of loneliness when it has 0 or 1 living neighbors.
- A living cell will die of over crowding when it has 4 or more neighbors.
- A living cell will survive when it has either 2 or 3 other living neighbors.
- A dead cell, will be born when it has exactly 3 living neighbors.
I made a little screen cast that shows the Game in action. Please forgive the choppiness, I couldn't quite get my aging laptop to make a clean capture. However, if you download and run the files, you'll see a smooth display of some of the intricate patterns that can be made simply by following those basic rules. Warning: it’s slightly mesmerizing.
Click here to watch the screen cast.
The Facts of Life
The Cell is Ruled by the Dish
The cells themselves are represented by the class below:
The Age
field keeps track of the number of generations the cell has been “alive”. If the cell has an Age
of zero, it’s dead. Normally, you would just keep track of a Boolean flag (e.g. IsAlive
). In this particular implementation, we'll keep track of the Age
for a rule modification described below.
The Neighbors
property, is a little implementation detail that helps cache the list of neighboring cells. It helps calculate things like the NumberOfLivingNeighbors
quickly, that fits nicely with the rules.
The Position
serves as a unique identifier for the cell itself. Every cell gets a unique integer value for Position
that corresponds to the order in which it was added to the list. Essentially, this is the index. An interesting note is that concept of rows and columns in a grid isn't controlled by the cell itself. That honor belongs to the grid itself. In fact, based on the cells position, the grid manages the row/column relationship and tells the cells who its neighbors are.
That grid is represented by a class named PetriDish
:
A client (i.e. the GUI) simply news up a dish, giving it a height
and width
and calls NextGeneration
for each frame that it draws. The NextGeneration
method runs the actual rules and alters the state of the cells.
For readability, and to match the rules above more closely, the NextGeneration
method breaks up the processing into two sections: rules for living cells and rules for dead cells.
Rules for Living Cells
The PetriDish
holds on to cells it creates using a simple List<Cell>
. Primarily because LINQ loves sequences, specifically anything IEnumerable<T>
. On that list, we run LINQ queries to determine from the living cells, who dies and who survives.
private void ApplyRulesToLivingCells()
{
var livingCells =
from c in _cells
where c.Age > 0
select c;
var cellsToKill =
from c in livingCells
where
c.NumberOfLivingNeighbors <= 1 ||
c.NumberOfLivingNeighbors >= 4
select c;
var cellsToKeepAlive =
from c in livingCells
where
c.NumberOfLivingNeighbors 2 ||
c.NumberOfLivingNeighbors 3
select c;
cellsToKill.Each(c => c.LivesInNextGeneration = false);
cellsToKeepAlive.Each(c => c.LivesInNextGeneration = true);
}
The first query expression finds all the cells in the list that are alive by filtering on Age > 0
. The next two query expressions actually use the livingCells
expression to apply Conway’s logic. This is called “query chaining” and makes writing some really complicated queries very readable as it minimizes the needs for join
s.
The good news is that the actual source, the list in this case, is only traversed when you start to pull information out of it. That’s big! There are no in-memory traversals, no temporary tables, nothing that consumes anything until you absolutely need it because query expressions work by building expression trees that are only executed against the source when used. This magic is known as deferred execution.
After the rules are defined in cellsToKeepAlive
and cellsToKill
you can use standard iteration techniques such as foreach
, but a more LINQ-y/3.5-ish way to do this is with an extension method and a lambda function like the Each
method.
The Each
method is something I wrote. Extension methods in general let you “spot weld” methods on classes without using inheritance.
public static class IEnumerableExtensions
{
public static void Each<T>(
this IEnumerable<T> source,
Action<T> action)
{
foreach(T item in source)
{
action(item);
}
}
}
This particular extension method spot welds an Each
method on anything that implements IEnumerable<T>
. You know this because the first parameter to this method defines what this
will be inside the method body. Action<T>
is a pre-defined class that basically stands in for a function (delegate) returning no value. Inside the method, is where the elements are extracted from the list. What this method enables is for me to cleanly apply a function in one line of code.
The lines of code…
cellsToKill.Each(c => c.LivesInNextGeneration = false);
cellsToKeepAlive.Each(c => c.LivesInNextGeneration = true);
... simply iterate through the cells that match and set the flag LivesInNextGeneration
appropriately. The code c => c.LivesInNextGeneration = false;
is a lambda function. Lambda functions let you write very concise anonymous methods. Type inference comes into play here allowing me to avoid typing Cell c => ...
. The compiler determines that c
is of type Cell
because the Each
extension is returning an IEnumerable<Cell>
.
Marking the cell this way is a design decision that allows the code to persist the decision without modifying the collection by modifying the Age
of the cell. That update, or change in the cells Age
, happens separately. Another way to accomplish this would have been to build separate lists for cellsToKill
and cellsToKeepAlive
.
Rules for Dead Cells
The NextGeneration
method then applies the rules that apply to dead cells.
private void ApplyRulesToNonLivingCells()
{
var notLivingCells =
from c in _cells
where c.Age 0
select c;
var pregnantCells =
from c in notLivingCells
where c.NumberOfLivingNeighbors 3
select c;
pregnantCells.Each(c => c.LivesInNextGeneration = true);
}
Query chaining is used again to select cells from the dead cells in the current generation that will be born in the next generation. Those cells are marked so that they will be alive in the next generation.
Updating the State
The age is effected after the rules have been applied. The cell list is queried based on the LivesInNextGeneration
marker, and the appropriate method is called. The GiveLife
method simply adds one to the Age
. The TakeLife
method simply sets the Age
to zero.
private void UpdateCells()
{
var livingCells =
from c in _cells
where c.LivesInNextGeneration
select c;
livingCells.Each(c => c.GiveLife());
var dyingCells =
from c in _cells
where !c.LivesInNextGeneration
select c;
dyingCells.Each(c => c.TakeLife());
}
Playing God: New Rules for Life
In the tradition of the “Game of Life” (everyone comes up with his/her own rules), and to see how easy it is to change the code; I made up “The Spawning Rule”. This rule states:
- A living cell dies when it reaches an age limit, and spawns life on all its non-living neighbors.
In the ApplyRulesToLivingCells
method, I added the following query:
var dyingOfOldAge =
from c in cellsToKeepAlive
where
c.Age >= AgeLimit
select c;
These are cells that would have aged another generation in the next update, but have already reached the spawn rule’s age limit. Therefore, I have to remove them from the list of cells to keep alive.
cellsToKeepAlive = cellsToKeepAlive.Except(dyingOfOldAge);
Notice the Except
method excluding the cells that are also on the dyingOfOldAge
list. It allows me to perform what essentially amounts to a SQL NOT IN
filter inside a LINQ query.
Giving life to the non-living neighbors is a little more tricky.
var spawnedCells =
from old in dyingOfOldAge
from n in old.Neighbors
where
n.Age == 0
select n;
To get the “non-living neighbors”, we first start from the cells that are dyingOfOldAge
. Without a join
, you can use another sequence that is attached to the item in the first sequence (e.g. Neighbors
), and select from that new sequence.
The code at the end of that function that marks the cells as LivesInNextGeneration
or not then becomes:
cellsToKill.Each(c => c.LivesInNextGeneration = false);
dyingOfOldAge.Each(c => c.LivesInNextGeneration = false);
cellsToKeepAlive.Each(c => c.LivesInNextGeneration = true);
spawnedCells.Each(c => c.LivesInNextGeneration = true);
Conclusion
I may never write a for
loop again. At least not while trying to find a particular element within a collection. LINQ expressions, extension methods, lambda expressions, and type inference are just a few of the powerful new features of 3.5 that make C#/VB.NET so much more expressive: not just for your typical database backed LOB application, but for everyday coding.
In this article, I only touched on some of the critical places where I used LINQ. Please download the source for this project and look at even more LINQ-y goodness. There is even an example of using the Enumerable.Range
method that provides Ruby style sequence generation. Warning: I didn't spend much time on the GUI itself. The code would also make a good basis for a round or two of code golf, as the point of this article is to show how readable and clear LINQ can make your code.