Click here to Skip to main content
15,867,704 members
Articles / Programming Languages / C#
Tip/Trick

Slow LINQ Methods

Rate me:
Please Sign up or sign in to vote.
4.65/5 (7 votes)
27 Jun 2014CPOL4 min read 25.9K   69   11   12
LINQ methods can be slow inside loops as this little demo shows.

Introduction

Ever had problems with slow execution when looping through tables in a dataset?

If so, this little tip might help.

Background

In a project I had to save the data in a DataSet to a tab-separated text file. The dataset consists of several nested tables, where the the two largest is a Group that contains a SubGroup. Both these tables has a Description which is stored in another table for multi-language support.

Not exactly rocket science, I thought, and started to implement. After a few hours of coding and testing with a small number of data to get the text format right, I started to test the function with a real life scenario. This consists of about 3000 groups (MPG) with about 400 sub-groups (MP) each.

I ran the test again and got an unpleasant surprise. The test software totally froze. (no threading in my small test bench). Time to get a coffee and then start to figure out what went wrong. When I came back I saw that the execution had finsished. Hmm, so it was not frozen, it just took a long time.

After I added a timer I realized it took 4 minutes! to save a text file with about 16000 lines. As this functionality is part of a GUI, you can imagine how happy any user would be to have to wait 4 minutes for every time the file is saved.

Analyzing my code

After some analysis I found some stupid coding of my part, e.g. I had put one operation inside a loop that only needed to be done once. I also changed from String to StringBuilder where I had a lot of concatenations. This helped some, I cut the time by 50 seconds.

The real culprit, however, was the LINQ extensions I used to get data from indirectly referenced tables. These extensions are pretty neat as they save some lines of code used for type casting, but appearantly they are not very efficient when used inside a loop.

In my code I called the same function around 15000 times in my code (3000 groups + 3000 groups * 400 sub-groups). The only difference is if the type should be MPG or MP.

C#
string description = "Description is missing.";
DescrRow rowDescription = rowDescriptions.GetDescrRows().
  Where(x => x.Type == "MPG" && x.Name == rowMPG.Name).SingleOrDefault();
if (rowDescription != null)
  description = rowDescription.Text; // This text is written to the text file later

The little snippet above took a very long time to execute, so I had to do something about it. I tested quite a few different variants, but the final verdict was that the old Select method was the fastest in my case.

First I created two temporary tables, one for MPG and one for MP descriptions. This is done outside the loop.

C#
DataTable tableMPGDescriptions = null;
DataTable tableMPDescriptions = null;
if (rowDescriptions != null)
{
  IEnumerable<DescrRow> descriptions = null;
  descriptions = rowDescriptions.GetDescrRows().Where(x => x.Type == "MPG");
  tableMPGDescriptions = descriptions.CopyToDataTable<DescrRow>();

  descriptions = rowDescriptions.GetDescrRows().Where(x => x.Type == "MP");
  tableMPDescriptions = descriptions.CopyToDataTable<DescrRow>();
}

Inside the loop the code below is used.

C#
description = "Description is missing.";
if (tableMPGDescriptions != null)
{
  DataRow[] descriptionRows = null;
  descriptionRows = tableMPGDescriptions.Select(String.Format("Name='{0}'", rowMPG.Name));

  if (descriptionRows.Length == 1)
    description = descriptionRows[0]["Text"].ToString();
}

Final Result

After changing the code at all places where I used LINQ methods inside a loop the time to save the file became less than 4 seconds, which for this application is acceptable.

So from 4 minutes to 4 seconds in around 4 hours of work.

Using the code

As I can't post the code from my project here, I created a small demo project that shows the difference in time between the default method Select() and the extension method Where().

I created a small dataset to simulate my real code.

Test data table

I filled the table with dummy values

I updated the number of rows in the data table to get more data.

C#
DataSetTest dsTestData = new DataSetTest();
Random randy = new Random();
for (int i = 0; i < 45000; i++) // Changed from 15000 to 45000
{
  dsTestData.Test.AddTestRow(String.Format("Hello_{0}", i), 
                             (int)(1000 *randy.NextDouble()));
}

and then called this code different number of times.

C#
for (int n = 1; n <= 10000; n *= 10)
{
  Console.WriteLine("Function call made {0} times.", n);
  Console.WriteLine("");
  timee.Restart();
  for (int i = 0; i < n; i++)
  {
    name = String.Format("Hello_{0}", i);
    selectRows = (DataSetTest.TestRow[])dsTestData.Test.
                 Select(String.Format("Name = '{0}'", name));
  }
  Console.WriteLine("Using DataTable.Select()
   : {0,6} ms", timee.ElapsedMilliseconds);

  timee.Restart();
  for (int i = 0; i < n; i++)
  {
    name = String.Format("Hello_{0}", i);
    linqRows = dsTestData.Test.Where(x => x.Name == name);
    linqRow = linqRows.ElementAt(0);
  }
  Console.WriteLine("Using Where().ElementAt(0)      : {0,6} ms", timee.ElapsedMilliseconds);

  timee.Restart();
  for (int i = 0; i < n; i++)
  {
    name = String.Format("Hello_{0}", i);
    linqRows = dsTestData.Test.Where(x => x.Name == name);
    linqRow = linqRows.FirstOrDefault();
  }
  Console.WriteLine("Using Where().FirstOrDefault()  : {0,6} ms", timee.ElapsedMilliseconds);
  
  timee.Restart();
  for (int i = 0; i < n; i++)
  {
    name = String.Format("Hello_{0}", i);
    linqRows = dsTestData.Test.Where(x => x.Name == name);
    linqRow = linqRows.SingleOrDefault();
  }
  Console.WriteLine("Using Where().SingleOrDefault() 
                    : {0,6} ms", timee.ElapsedMilliseconds);
}

The table below shows the time elapsed for different values of n.

Time Elapsed (ms)
Method n = 1 100 1000 10000 Comment
DataTable.Select() 586  82  257  First call to takes a long time.
Where() 13  I kept this result for reference.
ElementAt(0) 4913  Exponentially longer time.
FirstOrDefault() 205  4887  Exponentially longer time.
SingleOrDefault() 15  1280  6944  72583  Slow as a turtle.

The times differ a bit from execution to execution, but the ratios are usually the same.

Conclusions

My little investigation shows that using the LINQ extension methods ElementAt(), FirstOrDefault() and SingleOrDefault() really can slow things down if used inside loops and the number of iterations > 1000.

For single calls and a low number of iterations the LINQ methods are fast.

That the Where() by itself is fast is because the execution is delayed until you try to access any element in the list. (Thanks Mike and SA for explaining that to me.)

That SingleOrDefault() takes a long time is because the whole list has to be searched. (Thanks Nicolas for that pointer.)

One thing I still don't get is why the execution time per iteration for ElementAt() and FirstOrDefault() increases with the number of iterations.

Time Elapsed / number of iterations (ms)
Method 100 1000 10000 Comment
DataTable.Select() 0.080  0.082  0.025  Getting faster and faster.
ElementAt(0) 0.163  0.491  Exponentially longer time.
FirstOrDefault() 0.01  0.205  0.489  Exponentially longer time.
SingleOrDefault() 12.800  6.944  7.253  Execution time decreases.

Points of Interest

I learned a thing or two about optimization and that new is not always better.

  • Use StringBuilder instead of String inside loops if a lot of concatenations are performed.
  • Avoid using the LINQ extension methods inside loops.

History

  1. First release of the tip.
  2. Source code and Conclusions section updated after comments.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Architect
Philippines Philippines
I have worked with mobile phone testing in manufacturing for 20 years.
Originally from Sweden, but moved to China in 2012.
At the moment I am on a sabbatical year in the Philippines.

Over the years I have used various programming languages such as Pascal, C, C++, C#, SQL, XML, XSLT.

Comments and Discussions

 
AnswerLinq is not slow, but DataTable.Select is fast! Pin
Mr.PoorEnglish27-Aug-20 15:08
Mr.PoorEnglish27-Aug-20 15:08 
GeneralMy vote of 3 Pin
Alexander Sharykin26-Jun-14 21:00
Alexander Sharykin26-Jun-14 21:00 
GeneralRe: My vote of 3 Pin
George Jonsson26-Jun-14 21:16
professionalGeorge Jonsson26-Jun-14 21:16 
QuestionNice tip ! Pin
Volynsky Alex26-Jun-14 7:16
professionalVolynsky Alex26-Jun-14 7:16 
QuestionRe: Slow LINQ Methods Pin
Mike Gledhill25-Jun-14 23:08
Mike Gledhill25-Jun-14 23:08 
AnswerRe: Slow LINQ Methods Pin
George Jonsson25-Jun-14 23:27
professionalGeorge Jonsson25-Jun-14 23:27 
GeneralRe: Slow LINQ Methods Pin
Michael Gledhill25-Jun-14 23:40
Michael Gledhill25-Jun-14 23:40 
GeneralRe: Slow LINQ Methods Pin
George Jonsson26-Jun-14 20:57
professionalGeorge Jonsson26-Jun-14 20:57 
SuggestionRe: Slow LINQ Methods Pin
Matt T Heffron30-Jun-14 12:25
professionalMatt T Heffron30-Jun-14 12:25 
GeneralRe: Slow LINQ Methods Pin
George Jonsson30-Jun-14 16:22
professionalGeorge Jonsson30-Jun-14 16:22 
AnswerSingleOrDefault is slow by definition Pin
Nicolas Penin25-Jun-14 22:16
Nicolas Penin25-Jun-14 22:16 
GeneralRe: SingleOrDefault is slow by definition Pin
George Jonsson25-Jun-14 23:24
professionalGeorge Jonsson25-Jun-14 23:24 
Hi Nicolas,

Thanks for the input.

I tried the code with FirstOrDefault() instead and the result is faster than SingleOrDefault() but still a bit slower than ElementAt(0).
Regards,
George

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.