Click here to Skip to main content
Rate this: bad
good
Please Sign up or sign in to vote.
See more: C# VB LINQ C#4.0 collections VB.NET , +
I just stumbled upon an oddity in IOrderedEnumerable.Contains[^] and was wondering if more people here had the problem or if it is just me...
 
Consider the following code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
 
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            List<int> list = new List<int>();
            for (int i = 0; i < 1000000; i++)
            {
                list.Add(i);
            }
            
            IOrderedEnumerable<int> orderedList = list.OrderBy(i => i);
            
            System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
            sw.Start();
            for (int i = 0; i < 100; i++)
            { list.Contains(923456); }
            sw.Stop();
            Console.WriteLine("Calling contains on the List 100 times cost " + sw.Elapsed.ToString());
            
            sw.Reset();
            sw.Start();
            for (int i = 0; i < 100; i++)
            { orderedList.Contains(923456); }
            sw.Stop();
            Console.WriteLine("Calling contains on the IOrderedEnumerable 100 times cost " + sw.Elapsed.ToString());
            
            Console.ReadKey();
        }
    }
}
Nothing wrong there... I am building a list with a million (unique) integers. Even though it is already sorted I am calling the OrderBy Extension Method[^] and I get an IOrderedEnumerable<int>[^].
Now I call the Contains Method[^] of the original list 100 times with a value of which I know will be somewhere at the back (assuming a List<T>[^] will scan every element until it finds the one I am looking for this will take almost the maximum amount of time to find it).
After that I do the same on the IOrderedEnumerable<T>, but I call the IEnumerable<T>.Contains Extension Method[^]. I am expecting that this will pretty much do the same as List<T>.Contains which is scanning every item in the collection until it finds the one I'm looking for.
Now that shouldn't be to difficult. But I was pretty surprised by the result!
List.Contains took about 0,7 to 0,8 seconds for 100 iterations... IOrderedEnumerable.Contains took a whooping 38 seconds!!!
I then added this little gem:
sw.Reset();
sw.Start();
for (int i = 0; i < 100; i++)
{ orderedList.ToList().Contains(923456); }
sw.Stop();
Console.WriteLine("Calling contains on the IOrderedEnumerable.ToList 100 times cost " + sw.Elapsed.ToString());
And it took 40 seconds! Somehow copying the list 100 times and calling Contains only takes 2 seconds longer then only calling Contains...
I also tried changing 923456 to 1 to make sure it was found rather quickly, but it actually made no difference (it did for the List<T> though).
I am shocked, stunned, surprised and clueless...
 
Am I missing something here? I had expected both collections to take more or less the same time to compute... In fact, since an IOrderedEnumerable<T> is ordered it could find values MUCH faster if it searched using a Comparer[^] rather than an EqualityComparer[^].
 
Any idea's? I'm out of them...
Posted 20-Feb-12 12:46pm
Comments
Fredrik Bornander at 21-Feb-12 9:45am
   
On a slightly unrelated note, have my 5 for a question well asked. It doesn't happen that often on this forum...
Naerling at 21-Feb-12 11:17am
   
Thanks. I answer questions here from time to time too, I know how you feel :D

1 solution

Rate this: bad
good
Please Sign up or sign in to vote.

Solution 1

Deferred execution is the culprit here (I think), the sort isn't run until the result is enumerated so when you do this;
orderedList.Contains(923456); 
 
you're resorting the list again for every iteration. ToList forces the result to be enumerated up front.
 
You can see this by changing your sort to
IOrderedEnumerable<int> orderedList = list.OrderBy(i => { Console.WriteLine(" sorting=" + i); return i; });
(just make sure to run that on a much, much smaller list Smile | :) .
 
Hope this helps,
Fredrik
  Permalink  
v2
Comments
BobJanova at 21-Feb-12 6:20am
   
You'll get my 5 if this is confirmed to be the case. Good thinking.
Naerling at 21-Feb-12 7:29am
   
You best give this man a 5 then!
BobJanova at 21-Feb-12 14:10pm
   
5 is incoming!
Fredrik Bornander at 21-Feb-12 14:32pm
   
I appreciate it :)
Naerling at 21-Feb-12 7:28am
   
That's it! First I nearly crashed my computer by printing i to the console 100 * 1000000... ;p
Then I simply filled my list with 0 and 1 and did what you said. To really make sure the sorting was performed 100 times I tweaked the code a bit:
 
int iterator = 0;
IOrderedEnumerable orderedList = list.OrderBy(i => { Console.WriteLine(" sorting=" + i + " iterator=" + iterator); iterator++; return i; });
 
iterator was 199 at the end of the loop. So that's 100 times for sorting 2 items.
It's ridiculous though. You would think the sorting would be done only once and after that a boolean is set to indicate that the collection was sorted (perhaps this behaviour is a bug?).
Deferred sorting is necessary for the ThenBy Extension Method (calling ThenBy().ThenBy().ThenBy() won't sort the collection three times).
Thanks for your answer. This was really giving me a headache!
Fredrik Bornander at 21-Feb-12 7:54am
   
Deferred execution is a good thing but you have to understand it to not fall into traps like this. I have been caught out by something similar myself which is why I picked it up.
Correctly used it's awesome, but it's very easy to get it wrong.
Naerling at 21-Feb-12 8:03am
   
I believe what Microsoft does is sort when GetEnumerator is called (every Extension Method on IEnumerable uses this). You'd say keeping a boolean wouldn't be to hard and could significantly improve performance in some cases. The collection is also immutable, so once it's sorted it stays that way. No need to re-sort it ever.
Correct me if I'm wrong :)
GParkings at 21-Feb-12 8:18am
   
if you haven't already it would be good to post this as a tip.
 
Its useful to know and might get missed buried in the Q&&A section :D
Fredrik Bornander at 21-Feb-12 9:42am
   
I might just do that, thanks for the suggestion.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

  Print Answers RSS
0 OriginalGriff 319
1 Maciej Los 295
2 Sergey Alexandrovich Kryukov 230
3 CPallini 205
4 Sinisa Hajnal 166
0 OriginalGriff 7,760
1 Sergey Alexandrovich Kryukov 7,072
2 DamithSL 5,604
3 Manas Bhardwaj 4,986
4 Maciej Los 4,785


Advertise | Privacy | Mobile
Web01 | 2.8.1411023.1 | Last Updated 21 Feb 2012
Copyright © CodeProject, 1999-2014
All Rights Reserved. Terms of Service
Layout: fixed | fluid

CodeProject, 503-250 Ferrand Drive Toronto Ontario, M3C 3G8 Canada +1 416-849-8900 x 100