Introduction

This article discusses the performance differences between a LINQ loop and a regular For loop, as a way to practice using Visual Studio 2008, LINQ and unit tests for the first time.

Background

LINQ (Language INtegrated Query) is Microsoft's new addition to the .NET languages; it allows formulating queries in an SQL-like syntax. It's particularly useful when traversing data sets, XML DOM trees and collections.

I first came across Visual Studio 2008, and LINQ in particular, during this year's Tech Ed in Orlando. A nice developer from Microsoft described and demoed it for me (at that time, it was only available for VB.NET, but a C# version came out with beta 2). That developer insisted that LINQ is not just for data sets or complex objects, but can also be used for simple loops.

I decided to put his theory to the test, as a way to learn the new technology: I wrote a simple program that searches for odd numbers in an array, and compared the time a regular loop takes to come up with the right answers with the time a LINQ loop takes.

While this article was written several months ago, I waited for the RTM versions of VS 2008 and .NET 3.5 to arrive before publishing it. Along the way, I got to learn LINQ more deeply and to use VS 2008's unit testing capabilities. As the project grew, I added a third loop (ForEach) to the mix. I then decided to output it all to a CSV file and analyze the results in Excel.

The Logic
Using the Code

This is a bare-bones application. It runs as a console application and has no UI.

The Harness

Here's the Main function:

static void Main(string[] args)
{
    // Requires: using System; using System.IO;
    StreamWriter file = new StreamWriter("results.txt");
    // Header row, tab-delimited
    file.WriteLine("Elements\tFor loop\tForEach Loop\tLinq Loop");
    //Console.WriteLine("Elements\tFor loop\tForEach Loop\tLinq Loop");
    // Array sizes: 1,000, 10,000, ... up to 10,000,000 elements
    for (int i = 0; i < 5; i++)
    {
        int numElements = 1000 * (int)Math.Pow(10, i);
        FillArray(numElements);
        file.WriteLine("{0:#,#}\t{1:0.0000000000}\t{2:0.0000000000}\t{3:0.0000000000}",
            numElements, GetAverage(GetOdd), GetAverage(GetOddForEach), GetAverage(GetOddLinq));
        //Console.WriteLine("{0:#,#}\t{1:0.0000000000}\t{2:0.0000000000}\t{3:0.0000000000}",
        //    numElements, GetAverage(GetOdd), GetAverage(GetOddForEach), GetAverage(GetOddLinq));
    }
    //Console.ReadLine();
    file.Close();
}
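As you can see, all it does is call GetAverage for each of the three search functions and write the results to the file.

The GetAverage function:

private static double GetAverage(func f)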
{
    double averageDuration = 0.0;
    for (int i = 0; i < numIterations; i++)
    {
        // Time a single call to f
        pt.Start();
        int odd = f();
        pt.Stop();
        //Console.WriteLine("Time difference: {0}", pt.Duration);
        averageDuration += pt.Duration; // accumulate the total first
    }
    averageDuration /= numIterations; // then divide to get the mean
    return averageDuration;
}
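As you can see, not too complicated: it starts a timer, calls function f, stops the timer and adds the duration to a running total, which is then divided by numIterations to get the average.

The Algorithms

Essentially, all 3 functions use simple O(n) search algorithms.

The GetOdd function:

private static int GetOdd()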
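{
    int counter = 0;
    for (int n = 0; n < theArray.Length; n++)
    {
        // Note: % 2 == 1 matches odd values only when they are
        // non-negative; in C#, -3 % 2 evaluates to -1
        if (theArray[n] % 2 == 1)
        {
            counter++;
        }
    }
    return counter;
}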
The GetOddForEach function:

private static int GetOddForEach()
{
    int counter = 0;
    foreach (int n in theArray)
    {
        if (n % 2 == 1)
        {
            counter++;
        }
    }
    return counter;
}
and finally, the GetOddLinq function:

private static int GetOddLinq()
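{
    // Requires: using System.Linq;
    // Query syntax; the query is only defined here, not executed yet
    var odd = from n in theArray
              where n % 2 == 1
              select n;
    // Count() enumerates the query and counts the matches
    return odd.Count();
}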
You first notice the new keyword var. For more on LINQ's syntax and samples, try the official LINQ project page.

The Results

I've run this program on several computers and VMs. I've tried it on Windows XP, Vista and 2008 RC1. I tried running it on a busy machine and on a completely idle machine. Finally, I've tested debug and release versions. The numbers may change, but the trend remains the same:
Measurements are in seconds. Column E shows the overhead LINQ adds over For, as a fraction of LINQ's own time: Ei = (Di - Bi)/Di, where column B holds the For times and column D the LINQ times. Of course, once you have the raw data, you can analyze it however you want, for example by generating a graph.
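As a worked illustration of the formula above, here is one row computed by hand. The timings are made up for the example (they are not measurements from this program): assume For took 0.002 seconds and LINQ took 0.010 seconds.

```latex
\[
\frac{D_i - B_i}{D_i} = \frac{0.010 - 0.002}{0.010} = 0.8 = 80\%,
\qquad
\frac{D_i}{B_i} = \frac{1}{1 - 0.8} = 5 .
\]
```

In other words, because the overhead is expressed as a share of the LINQ time, an 80% figure means the LINQ loop took 5 times as long as the For loop. By the same arithmetic, 75% corresponds to 4 times as long and 85% to roughly 6.7 times.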
Note: as mentioned, results have been pretty consistent, and LINQ had 75-85% overhead in almost every test. In debug builds, however, LINQ took even longer to complete the task, while For and ForEach remained essentially the same. My only guess is that LINQ has some instrumentation built into it to allow for easier debugging, which makes it slower in debug builds.

Unit Testing

A huge chunk of the Tech Ed sessions was dedicated to testing and, in particular, to how easy it is to add unit tests in VS 2008. And indeed, it didn't take long. Right-click anywhere in the source and select "Create Unit Tests...". A wizard takes you through selecting the functions you want to test in your project, then creates a test project and adds it to the solution. The test project comes ready with the right references and a set of accessors, which give the unit test functions access to all members of the original class, even the private ones.

So how do you test this code? Here's the unit test for the function that creates the array:

/// <summary>
/// A test for FillArray
/// </summary>
[TestMethod()]
[DeploymentItem("LinqTest.exe")]
public void FillArrayTest()
{
    int n = 10; // number of elements to request
    Program_Accessor.FillArray(n);
    // The array should hold exactly n elements
    Assert.AreEqual(n, Program_Accessor.theArray.Length);
}
Pretty simple, isn't it? Essentially, you are using the accessor to call the private function and Assert to check the result.

Now, let's test one of the search functions (the tests for all are the same: a unit test does not care about the internal logic of the function, just about the results).

/// <summary>
/// A test for GetOddLinq
/// </summary>
[TestMethod()]
[DeploymentItem("LinqTest.exe")]
public void GetOddLinqTest()
{
    int expected = 1; // In every 2 numbers, one is odd
    int actual;
    Program_Accessor.FillArray(2);
    actual = Program_Accessor.GetOddLinq();
    Assert.AreEqual(expected, actual);
}
Here I cheated. Knowing that my array will be filled with consecutive numbers, I know that any 2 adjacent cells I pick will contain 1 odd number. So, we build a 2-cell array, fill it and compare the number of odd numbers returned from GetOddLinq with the expected value.

Note: random numbers will not change the measurement results, as we always have to scan the entire array.

Now, run all the unit tests prior to building the solution (or press Ctrl+R, A) and, hopefully, you'll see all green:

History

Version 1.00 released on 12/5/2007
Version 1.01 released on 12/14/2007

Update

Following the suggestions received in the comments, 2 corrections were implemented to improve measurement accuracy:
The new results look like this:
As you can see, performance is slightly better, but the trend remains.

Final note

This program is by no means a thorough analysis of LINQ's general performance. I'm sure its behavior in traversing complex data sets and XML DOM trees is much better. I never set out to prove anything, just to play a little bit with the new environment. Feel free to use the program and its results however you choose. The way I designed it, it's easy to plug in more complex logic and still get measurements.
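
To make the "plug in more logic" idea concrete, here is a minimal, self-contained sketch of a similar harness. It is not the article's source: the class and method names are hypothetical, System.Diagnostics.Stopwatch stands in for the pt timer (whose class is not shown above), Func<int> stands in for the func delegate, and the warm-up call is my own assumption about what is worth excluding from the measurement. The extra function plugged in is the same odd-number count written in LINQ method syntax.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical sketch; names are illustrative, not taken from the article's project.
public static class LoopTimingSketch
{
    // Builds an array of consecutive integers 0..count-1
    // (assumed to resemble what FillArray produces).
    public static int[] MakeArray(int count)
    {
        return Enumerable.Range(0, count).ToArray();
    }

    // Baseline: count odd values with a plain for loop.
    public static int CountOddFor(int[] data)
    {
        int counter = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] % 2 == 1)
            {
                counter++;
            }
        }
        return counter;
    }

    // The plugged-in variant: LINQ method syntax with a predicate.
    public static int CountOddLinqMethod(int[] data)
    {
        return data.Count(n => n % 2 == 1);
    }

    // Calls f the given number of times and returns the mean duration in seconds.
    public static double AverageSeconds(Func<int> f, int iterations)
    {
        f(); // warm-up call so JIT compilation is not timed (an assumption, see above)
        double total = 0.0;
        for (int i = 0; i < iterations; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            f();
            sw.Stop();
            total += sw.Elapsed.TotalSeconds;
        }
        return total / iterations;
    }

    public static void Main()
    {
        int[] data = MakeArray(100000);
        Console.WriteLine("For:  {0:0.0000000000} s",
            AverageSeconds(() => CountOddFor(data), 10));
        Console.WriteLine("LINQ: {0:0.0000000000} s",
            AverageSeconds(() => CountOddLinqMethod(data), 10));
    }
}
```

Any function that takes no arguments and returns an int can be passed to AverageSeconds in the same way, so measuring heavier logic only requires writing the function and adding one more call in Main.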