I measured the time the built-in Array.Sort() method in C# takes to sort a fixed-size integer array and compared it against custom sorting algorithms such as selection sort. I noticed that the built-in methods, be it Array.Reverse() or Array.Sort(), are eight times faster than the custom selection sort or reversal algorithms. Why is this? Did Microsoft optimize the C# compiler for the code in their SDK?

What I have tried:

For the case of sorting an integer array of size 5, Microsoft's built-in Array.Sort() took 227 stopwatch ticks while selection sort took 1508 stopwatch ticks to sort the same array. Can compilers be optimized for certain algorithms?
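A minimal sketch of the kind of measurement described (the array contents and the SelectionSort helper are illustrative stand-ins, not the exact code that was timed):

using System;
using System.Diagnostics;

class SortBenchmark
{
    // Textbook selection sort, standing in for the custom algorithm.
    static void SelectionSort(int[] a)
    {
        for (int i = 0; i < a.Length - 1; i++)
        {
            int min = i;
            for (int j = i + 1; j < a.Length; j++)
                if (a[j] < a[min]) min = j;
            (a[i], a[min]) = (a[min], a[i]);
        }
    }

    static void Main()
    {
        int[] data = { 5, 3, 1, 4, 2 };   // illustrative 5-element array

        var sw = Stopwatch.StartNew();
        Array.Sort((int[])data.Clone());
        sw.Stop();
        Console.WriteLine($"Array.Sort:    {sw.ElapsedTicks} ticks");

        sw.Restart();
        SelectionSort((int[])data.Clone());
        sw.Stop();
        Console.WriteLine($"SelectionSort: {sw.ElapsedTicks} ticks");
    }
}

Note that at this size the numbers mostly reflect JIT compilation and call overhead rather than the algorithms themselves.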
Posted
Updated 13-Jun-23 11:13am
Comments
PIEBALDconsult 13-Jun-23 14:52pm    
My understanding is that they use a Quick Sort.

.NET Framework 4 and earlier versions used only the Quicksort algorithm.


This method uses the introspective sort (introsort) algorithm as follows:
If the partition size is less than or equal to 16 elements, it uses an insertion sort algorithm.
If the number of partitions exceeds 2 * LogN, where N is the range of the input array, it uses a Heapsort algorithm.
Otherwise, it uses a Quicksort algorithm.
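To make that concrete, here is a minimal, self-contained C# sketch of the introsort scheme described above. It is a simplified illustration, not the BCL's actual code: the 16-element threshold and the 2 * LogN depth budget come from the quoted documentation, but the Lomuto partition and the other details are assumptions.

using System;

static class IntroSortSketch
{
    const int InsertionSortThreshold = 16;   // from the documentation quoted above

    public static void Sort(int[] a)
    {
        if (a.Length < 2) return;
        // Depth budget of 2 * floor(log2(N)), matching the documented heuristic.
        int depthLimit = 2 * (int)Math.Log(a.Length, 2.0);
        IntroSort(a, 0, a.Length - 1, depthLimit);
    }

    static void IntroSort(int[] a, int lo, int hi, int depth)
    {
        if (hi <= lo) return;
        if (hi - lo + 1 <= InsertionSortThreshold)
        {
            InsertionSort(a, lo, hi);        // small partition: insertion sort
            return;
        }
        if (depth == 0)
        {
            HeapSort(a, lo, hi);             // recursion too deep: heapsort
            return;
        }
        int p = Partition(a, lo, hi);        // otherwise: one quicksort step
        IntroSort(a, lo, p - 1, depth - 1);
        IntroSort(a, p + 1, hi, depth - 1);
    }

    static void InsertionSort(int[] a, int lo, int hi)
    {
        for (int i = lo + 1; i <= hi; i++)
        {
            int key = a[i], j = i - 1;
            while (j >= lo && a[j] > key) { a[j + 1] = a[j]; j--; }
            a[j + 1] = key;
        }
    }

    static int Partition(int[] a, int lo, int hi)
    {
        int pivot = a[hi], i = lo;           // simple Lomuto scheme, last element as pivot
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot) { (a[i], a[j]) = (a[j], a[i]); i++; }
        (a[i], a[hi]) = (a[hi], a[i]);
        return i;
    }

    static void HeapSort(int[] a, int lo, int hi)
    {
        int n = hi - lo + 1;
        for (int i = n / 2 - 1; i >= 0; i--) SiftDown(a, lo, i, n);
        for (int end = n - 1; end > 0; end--)
        {
            (a[lo], a[lo + end]) = (a[lo + end], a[lo]);
            SiftDown(a, lo, 0, end);
        }
    }

    static void SiftDown(int[] a, int lo, int root, int n)
    {
        while (true)
        {
            int child = 2 * root + 1;
            if (child >= n) return;
            if (child + 1 < n && a[lo + child] < a[lo + child + 1]) child++;
            if (a[lo + root] >= a[lo + child]) return;
            (a[lo + root], a[lo + child]) = (a[lo + child], a[lo + root]);
            root = child;
        }
    }
}

The point of the hybrid is that each fallback covers another algorithm's weak spot: insertion sort is fastest on tiny ranges, and heapsort guarantees O(n log n) when quicksort's recursion degenerates.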
 
Comments
Tim the Gamer 13-Jun-23 20:00pm    
Guess this is why their methods run faster: they check the input size and use the most efficient method to sort the array based on the size of the input.
Graeme_Grant 13-Jun-23 21:57pm    
Not just quicksort; they have more than one algorithm, and which one is used depends on the data. That is why I posted a link to the actual Microsoft source code.
If you are interested in going down that rabbit hole, you can look at the source code for Array.Sort() here, starting at line 1643: referencesource/mscorlib/system/array.cs at master · microsoft/referencesource · GitHub

UPDATE

Microsoft uses a different sort algorithm depending on the data. If you look at lines 1983 to 1997, you can see that it selects between IntrospectiveSort and DepthLimitedQuickSort. DepthLimitedQuickSort uses Heapsort (lines 2012 & 2331), and IntrospectiveSort switches to InsertionSort if the partition size < IntrosortSizeThreshold (lines 2123 & 2459) and to Heapsort if depth is 0 (lines 2146-2150 & 2463-2467).
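To illustrate the difference between the two: DepthLimitedQuickSort, as described, is a plain quicksort that tracks recursion depth and switches wholesale to heapsort when the budget runs out, without the small-partition insertion sort. A sketch of that idea (it reuses the Partition and HeapSort helpers from the introsort sketch above; the details are assumptions rather than the BCL's exact code):

// Quicksort that falls back to heapsort once the depth budget is spent.
static void DepthLimitedQuickSort(int[] a, int lo, int hi, int depthLimit)
{
    while (lo < hi)
    {
        if (depthLimit == 0)
        {
            HeapSort(a, lo, hi);   // depth exhausted: finish this range with heapsort
            return;
        }
        depthLimit--;
        int p = Partition(a, lo, hi);
        DepthLimitedQuickSort(a, lo, p - 1, depthLimit);
        lo = p + 1;                // loop on the right half instead of recursing
    }
}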
 
Comments
Tim the Gamer 13-Jun-23 14:03pm    
What makes the Microsoft code faster?
Dave Kreskowiak 13-Jun-23 14:52pm    
He's telling you to go analyze the code yourself to find out. Nobody is going to take the time to do your work for you.

You want the answer to the question? You have to do the work to get it.
Graeme_Grant 14-Jun-23 0:35am    
Yep, I've updated the question to make it even easier to see where and how...
Graeme_Grant 13-Jun-23 21:56pm    
The full code is there. Microsoft selectively uses different algorithms depending on the data.
Tim the Gamer 14-Jun-23 0:02am    
I was doing these challenges on LeetCode and noticed that they lock some of the challenges and ban participants from using Array.Reverse() so users have to come up with array reversal algorithms themselves. They also implement test cases so that the runtime does not exceed a set limit. That is why I decided to measure which ones are more efficient, and in terms of performance I would go with Microsoft's built-in methods to make my software faster.
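For comparison, the kind of hand-written reversal such challenges expect is only a few lines (an illustrative sketch, not LeetCode's reference solution):

static void ReverseInPlace(int[] a)
{
    // Two-pointer swap from both ends toward the middle: O(n) time, no allocation.
    for (int i = 0, j = a.Length - 1; i < j; i++, j--)
        (a[i], a[j]) = (a[j], a[i]);
}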
Quote:
Why Microsoft's in-built array operations are faster

Without seeing your code, it will be difficult to give a definitive answer.
Just a guess: your code is not as optimized as you think.
Quote:
For the case of sorting an integer array of size 5

Advice: never benchmark by sorting an array of 5 elements.
Sort arrays of 5000, 10000, 15000, and 20000 elements, see how the timing evolves, and try different sorting methods.
You will see that simple-minded methods are usually less efficient.
Code optimization matters; multithreading can help, and removing unnecessary operations helps too.
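A sketch of the kind of scaling benchmark suggested above (the sizes and the SelectionSort comparison are illustrative):

using System;
using System.Diagnostics;
using System.Linq;

class ScalingBenchmark
{
    static void Main()
    {
        var rng = new Random(42);
        foreach (int n in new[] { 5000, 10000, 15000, 20000 })
        {
            // Fresh random data for each size; clone so both sorts see the same input.
            int[] data = Enumerable.Range(0, n).Select(_ => rng.Next()).ToArray();

            long builtIn = Time(() => Array.Sort((int[])data.Clone()));
            long custom  = Time(() => SelectionSort((int[])data.Clone()));
            Console.WriteLine($"n={n,6}: Array.Sort={builtIn,8} ticks, SelectionSort={custom,10} ticks");
        }
    }

    static long Time(Action act)
    {
        var sw = Stopwatch.StartNew();
        act();
        sw.Stop();
        return sw.ElapsedTicks;
    }

    static void SelectionSort(int[] a)
    {
        for (int i = 0; i < a.Length - 1; i++)
        {
            int min = i;
            for (int j = i + 1; j < a.Length; j++)
                if (a[j] < a[min]) min = j;
            (a[i], a[min]) = (a[min], a[i]);
        }
    }
}

Since Array.Sort runs in O(n log n) and selection sort in O(n^2), the gap between them grows with n rather than staying at a fixed eight-times ratio.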
 
Comments
Tim the Gamer 13-Jun-23 20:00pm    
The time an algorithm takes is directly proportional to the input size, so a bigger array just means more execution time. The results would have been consistent and would have yielded the same eight-times ratio.
Patrice T 13-Jun-23 20:04pm    
The time to sort an array is not linear in the size of the array.
And different implementations of the same algorithm do not have the same efficiency.
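For example, here are two codings of the same selection-sort idea that perform the same number of comparisons but very different numbers of memory writes (an illustrative sketch):

// Version A swaps inside the inner loop every time a smaller element is found.
static void SelectionSortA(int[] a)
{
    for (int i = 0; i < a.Length - 1; i++)
        for (int j = i + 1; j < a.Length; j++)
            if (a[j] < a[i]) (a[i], a[j]) = (a[j], a[i]);   // swap immediately
}

// Version B remembers the index of the minimum and swaps once per pass.
static void SelectionSortB(int[] a)
{
    for (int i = 0; i < a.Length - 1; i++)
    {
        int min = i;
        for (int j = i + 1; j < a.Length; j++)
            if (a[j] < a[min]) min = j;                      // track the index only
        (a[i], a[min]) = (a[min], a[i]);                     // single swap per pass
    }
}

Both sort correctly and do the same number of comparisons, but version A can perform many times more array writes, so the "same" algorithm runs measurably slower.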
Dave Kreskowiak 13-Jun-23 20:13pm    
That is not true. This is dependent on the algorithm used, how it's coded, CPU architecture, CPU cache sizes and amounts, dataset size, ...

Yes, all of that matters. You can use the exact same set of data, algorithm, and code on two different CPUs and get very different results. It is also true that you can get linear results, but only up to a point; after that you may see performance fall off or even improve. This is normally related to CPU cache sizes and cache-miss handling.
Patrice T 13-Jun-23 20:22pm    
I didn't mean to be exhaustive, and when I compare pieces of code, I do it on the same computer; otherwise it is comparing carrots and potatoes.
Tim the Gamer 14-Jun-23 0:47am    
How can different code of the same algorithm be "different"?
