Click here to Skip to main content
15,881,812 members
Please Sign up or sign in to vote.
5.00/5 (2 votes)
I just stumbled upon an oddity in IOrderedEnumerable.Contains[^] and was wondering if more people here had the problem or if it is just me...

Consider the following code:
C#
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            List<int> list = new List<int>();
            for (int i = 0; i < 1000000; i++)
            {
                list.Add(i);
            }
            
            IOrderedEnumerable<int> orderedList = list.OrderBy(i => i);
            
            System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
            sw.Start();
            for (int i = 0; i < 100; i++)
            { list.Contains(923456); }
            sw.Stop();
            Console.WriteLine("Calling contains on the List 100 times cost " + sw.Elapsed.ToString());
            
            sw.Reset();
            sw.Start();
            for (int i = 0; i < 100; i++)
            { orderedList.Contains(923456); }
            sw.Stop();
            Console.WriteLine("Calling contains on the IOrderedEnumerable 100 times cost " + sw.Elapsed.ToString());
            
            Console.ReadKey();
        }
    }
}
Nothing wrong there... I am building a list with a million (unique) integers. Even though it is already sorted I am calling the OrderBy Extension Method[^] and I get an IOrderedEnumerable<int>[^].
Now I call the Contains Method[^] of the original list 100 times with a value of which I know will be somewhere at the back (assuming a List<T>[^] will scan every element until it finds the one I am looking for this will take almost the maximum amount of time to find it).
After that I do the same on the IOrderedEnumerable<T>, but I call the IEnumerable<T>.Contains Extension Method[^]. I am expecting that this will pretty much do the same as List<T>.Contains which is scanning every item in the collection until it finds the one I'm looking for.
Now that shouldn't be to difficult. But I was pretty surprised by the result!
List.Contains took about 0,7 to 0,8 seconds for 100 iterations... IOrderedEnumerable.Contains took a whooping 38 seconds!!!
I then added this little gem:
C#
sw.Reset();
sw.Start();
for (int i = 0; i < 100; i++)
{ orderedList.ToList().Contains(923456); }
sw.Stop();
Console.WriteLine("Calling contains on the IOrderedEnumerable.ToList 100 times cost " + sw.Elapsed.ToString());
And it took 40 seconds! Somehow copying the list 100 times and calling Contains only takes 2 seconds longer then only calling Contains...
I also tried changing 923456 to 1 to make sure it was found rather quickly, but it actually made no difference (it did for the List<T> though).
I am shocked, stunned, surprised and clueless...

Am I missing something here? I had expected both collections to take more or less the same time to compute... In fact, since an IOrderedEnumerable<T> is ordered it could find values MUCH faster if it searched using a Comparer[^] rather than an EqualityComparer[^].

Any idea's? I'm out of them...
Posted
Comments
Fredrik Bornander 21-Feb-12 9:45am    
On a slightly unrelated note, have my 5 for a question well asked. It doesn't happen that often on this forum...
Sander Rossel 21-Feb-12 11:17am    
Thanks. I answer questions here from time to time too, I know how you feel :D

1 solution

Deferred execution is the culprit here (I think), the sort isn't run until the result is enumerated so when you do this;
C#
orderedList.Contains(923456); 


you're resorting the list again for every iteration. ToList forces the result to be enumerated up front.

You can see this by changing your sort to
C#
IOrderedEnumerable<int> orderedList = list.OrderBy(i => { Console.WriteLine(" sorting=" + i); return i; });

(just make sure to run that on a much, much smaller list :).

Hope this helps,
Fredrik
 
Share this answer
 
v2
Comments
BobJanova 21-Feb-12 6:20am    
You'll get my 5 if this is confirmed to be the case. Good thinking.
Sander Rossel 21-Feb-12 7:29am    
You best give this man a 5 then!
BobJanova 21-Feb-12 14:10pm    
5 is incoming!
Fredrik Bornander 21-Feb-12 14:32pm    
I appreciate it :)
Sander Rossel 21-Feb-12 7:28am    
That's it! First I nearly crashed my computer by printing i to the console 100 * 1000000... ;p
Then I simply filled my list with 0 and 1 and did what you said. To really make sure the sorting was performed 100 times I tweaked the code a bit:

int iterator = 0;
IOrderedEnumerable<int> orderedList = list.OrderBy(i => { Console.WriteLine(" sorting=" + i + " iterator=" + iterator); iterator++; return i; });

iterator was 199 at the end of the loop. So that's 100 times for sorting 2 items.
It's ridiculous though. You would think the sorting would be done only once and after that a boolean is set to indicate that the collection was sorted (perhaps this behaviour is a bug?).
Deferred sorting is necessary for the ThenBy Extension Method (calling ThenBy().ThenBy().ThenBy() won't sort the collection three times).
Thanks for your answer. This was really giving me a headache!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900