Click here to Skip to main content
Click here to Skip to main content

Performance Tests: Precise Run Time Measurements with System.Diagnostics.Stopwatch

By , 28 Feb 2010
 

Introduction

Everybody who does performance optimization stumbles sooner or later over the Stopwatch class in the System.Diagnostics namespace. And everybody has noticed that the measurements of the same function on the same computer can differ 25% -30% in run time. This article shows how single threaded test programs must be designed to get an accuracy of 0.1% - 0.2% out of the Stopwatch class. With this accuracy, algorithms can be tested and compared.

Background

Modern CPUs have multiple cores, large caches, instruction pipelines and many other stuff affecting the run time of an algorithm in a particular test scenario. White box testing techniques - like attached Debugger or Profiler - dismiss the cache lines, pipelines, etc. on the CPU. The real run time is hidden, so that an algorithm optimized for these modern superscalar CPUs performs slower (due to more instructions) than a non optimized one with an attached profiler. Black box testing (run time measurement) without an attached debugger or profiler discovers the real performance of an algorithm and completes the performance analysis of an algorithm.

Setting Up the Test Scenario

The most important thing is to prevent switching between CPU cores or processors. Switching dismisses the cache, etc. and has a huge performance impact on the test. This can be done by setting the ProcessorAffinity mask of the process:

Process.GetCurrentProcess().ProcessorAffinity = new IntPtr(2); // Use only the second core 

To get the CPU core more exclusively, we must prevent that other threads can use this CPU core. We set our process and thread priority to achieve this:

Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High;
Thread.CurrentThread.Priority = ThreadPriority.Highest;

And last but not least, we need a warmup phase in our tests. On my system, the results are stable after a warm-up phase of 1000-1500 milliseconds. We can use the stopwatch itself to control the warm-up (at least 1200mS here):

stopwatch.Start();
while (stopwatch.ElapsedMilliseconds < 1200)
{
    result = TestFunction(seed, count); 
}
stopwatch.Stop(); 

Here is the complete sample:

 using System;
using System.Diagnostics;
using System.Threading;

namespace PreciseMeasure
{
    class Program
    {
        static void Main(string[] args)
        {
            Stopwatch stopwatch = new Stopwatch();

            long seed = Environment.TickCount; 	// Prevents the JIT Compiler 
					// from optimizing Fkt calls away
            long result = 0;
            int count = 100000000;

            Console.WriteLine("20 Tests without correct preparation"); 
            Console.WriteLine("Warmup");
            for (int repeat = 0; repeat < 20; ++repeat)
            {
                stopwatch.Reset();
                stopwatch.Start();
                result ^= TestFunction(seed, count);
                stopwatch.Stop();
                Console.WriteLine("Ticks: " + stopwatch.ElapsedTicks + 
				" mS: " +stopwatch.ElapsedMilliseconds);
            }

            Process.GetCurrentProcess().ProcessorAffinity = 
		new IntPtr(2); // Uses the second Core or Processor for the Test
            Process.GetCurrentProcess().PriorityClass = 
		ProcessPriorityClass.High;  	// Prevents "Normal" processes 
					// from interrupting Threads
            Thread.CurrentThread.Priority = 
		ThreadPriority.Highest;  	// Prevents "Normal" Threads 
					// from interrupting this thread
    
            Console.WriteLine();
            Console.WriteLine();
            Console.WriteLine("20 Tests with correct preparation"); 
            Console.WriteLine("Warmup");
            stopwatch.Reset();
            stopwatch.Start();
            while (stopwatch.ElapsedMilliseconds < 1200)  // A Warmup of 1000-1500 mS 
					// stabilizes the CPU cache and pipeline.
            {
                result = TestFunction(seed, count); // Warmup
            }
            stopwatch.Stop();

            for (int repeat = 0; repeat < 20; ++repeat)
            {
                stopwatch.Reset();
                stopwatch.Start();
                result ^= TestFunction(seed, count);
                stopwatch.Stop();
                Console.WriteLine("Ticks: " + stopwatch.ElapsedTicks + 
				" mS: " + stopwatch.ElapsedMilliseconds);
            }
            Console.WriteLine(result); // prevents optimizations (current compilers are 
		// too silly to analyze the dataflow that deep, but we never know )
        }

        public static long TestFunction(long seed, int count)
        {
            long result = seed;
            for (int i = 0; i < count; ++i)
            {
                result ^= i ^ seed; // Some useless bit operations
            }
            return result;
        }
    }
}

Results

Without Correct Preparation

  1. Ticks: 1580367 mS: 572 <-- highest Value
  2. Ticks: 1577003 mS: 571
  3. Ticks: 1576140 mS: 571
  4. Ticks: 1560964 mS: 565
  5. Ticks: 1351663 mS: 489
  6. Ticks: 1248383 mS: 452
  7. Ticks: 1115361 mS: 404
  8. Ticks: 1112813 mS: 403
  9. Ticks: 1113112 mS: 403
  10. Ticks: 1112012 mS: 402 <-- lowest Value
  11. Ticks: 1330444 mS: 482
  12. Ticks: 1558493 mS: 564
  13. Ticks: 1501097 mS: 543
  14. Ticks: 1517796 mS: 549
  15. Ticks: 1542712 mS: 558
  16. Ticks: 1574959 mS: 570
  17. Ticks: 1483975 mS: 537
  18. Ticks: 1390578 mS: 503
  19. Ticks: 1546904 mS: 560
  20. Ticks: 1349507 mS: 488

The run time differs between 402 mS and 572 mS. That is a difference of 170 mS or 42%. These results are useless.

With Correct Preparation

  1. Ticks: 1110518 mS: 402
  2. Ticks: 1110540 mS: 402
  3. Ticks: 1110543 mS: 402
  4. Ticks: 1110684 mS: 402 <-- highest Value
  5. Ticks: 1110508 mS: 402
  6. Ticks: 1110553 mS: 402
  7. Ticks: 1110600 mS: 402
  8. Ticks: 1110433 mS: 402 <-- lowest Value
  9. Ticks: 1110509 mS: 402
  10. Ticks: 1110508 mS: 402
  11. Ticks: 1110489 mS: 402
  12. Ticks: 1110568 mS: 402
  13. Ticks: 1110503 mS: 402
  14. Ticks: 1110566 mS: 402
  15. Ticks: 1110625 mS: 402
  16. Ticks: 1110474 mS: 402
  17. Ticks: 1110571 mS: 402
  18. Ticks: 1110448 mS: 402
  19. Ticks: 1110555 mS: 402
  20. Ticks: 1110495 mS: 402

The 20 samples produce the same result of 402 mS and can only be distinguished by the tick count (the internal CPU performance counter value). The difference is 251 ticks or 0,02 %. On my system, the Stopwatch frequency is 2760029 ticks per second. The difference between the test runs is only 0,09 mS. That is really good and can be used to measure and compare algorithm runtime.

Points of Interest

One very important thing should be kept in mind. The best (lowest) value without preparation isn't as good as the worst value with preparation. CPU context and core switches have an huge impact to the run time of applications.

Links and Additional Resources

This article is one of the performance tests that can be found on benchmarks.codeplex.com. These are research for the "Second WAF" web application framework.

History

  • 28th February, 2010: Initial post

License

This article, along with any associated source code and files, is licensed under The BSD License

About the Author

Thomas Maierhofer
CEO MSE-iT
Germany Germany
Member
I'm the CEO of MSE-iT Reisebürosoftware, a software development company in Germany making travel agency accounting software.
 
We release some library stuff under GPL or LGPL, so everybody can use it. These are our Open-Source projects so far:
 
Spell Checking, Hyphenation and Thesaurus for .NET

.NET spell checker, hyphenation and thesaurus based on the Open Offlice spell checker Hunspell. Here is a life Demo: Spell check, hyphenation and thesaurus reference project for NHunspell on ASP.NET.
 
jQuery Plugins

jQuery Background Canvas Plugin Inserts a HTML5 CANVAS element behind any HTML element and allows to draw on it with JavaScript.

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
Hint: For improved responsiveness ensure Javascript is enabled and choose 'Normal' from the Layout dropdown and hit 'Update'.
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
QuestionDoubtmembernawfalhasan23 Apr '13 - 8:34 
How does merely having long seed = Environment.TickCount; really prevent JIT from optimizing ?
AnswerRe: DoubtmemberThomas Maierhofer24 Apr '13 - 0:15 
A performance test function is often deterministic. So a compiler with data flow analysis could calculate the result of a test function and replace the test function with the result of the execution. By using the seed (long seed = Environment.TickCount;) for calculations in the test function...
GeneralMy vote of 5mvpKanasz Robert28 Sep '12 - 5:50 
Excellent article
GeneralMy vote of 5memberClifford Nelson5 Jun '12 - 14:01 
Great information
QuestionVisual Basic .NETmemberthe ritzky13 May '12 - 12:24 
Hello Thomas. Thank you for this awesome post. It really helped me. However, I have a question.   I set the process/thread priorites and the affinity. I'm trying to make the best timer function possible with the stopwatch class.   I have (Visual Basic.NET):   Private Sub...
AnswerRe: Visual Basic .NETmemberThomas Maierhofer14 May '12 - 1:58 
First of all you have a serious bug in your breaking condition: change StopWatch.Elapsed.Milliseconds = Milliseconds into StopWatch.Elapsed.Milliseconds >= Milliseconds because if your executing thread is not running in this specific millisecond, your code will run forever.   Then,...
GeneralMy vote of 5memberrotemshi5 Jan '12 - 5:35 
Sweet and to the point, and more importantly, works.
GeneralRe: My vote of 5memberThomas Maierhofer14 May '12 - 1:59 
Thanks Blog: MSE-iT Software Development Homepage: Thomas Maierhofer Software Developement Homepage: MSE-iT Reisebürosoftware (Travel Agency Accounting Software) Forum: Reisebüro-Forum (Travel Agency Forum)
QuestionCode was erratic on my machinememberclaysdna25 Nov '11 - 10:33 
Enjoyed your article very much. I copied the code and ran it on my Lenovo Thinkpad x201 Laptop and found some interesting results.   Neither version of the code (before optimization or after optimization) ran any better than the other. Neither version gave me the anything near the...
AnswerRe: Code was erratic on my machinememberThomas Maierhofer24 Apr '12 - 22:50 
First: Have you run the tests with no debugger attached? (ctrl-F5 in Visual Studio) Second: Laptops clock down the cpu to save energy and rise the cpu clock on heavy load. Maybe you can't test this on a laptop or need a much longer warmup phase to become the cpu clock stable. Blog: MSE-iT...
GeneralExcellentmemberiJam_j21 Feb '11 - 20:42 
Just want to say that this article is excellent and very helpful.
GeneralRe: ExcellentmemberThomas Maierhofer14 May '12 - 1:59 
Thanks Blog: MSE-iT Software Development Homepage: Thomas Maierhofer Software Developement Homepage: MSE-iT Reisebürosoftware (Travel Agency Accounting Software) Forum: Reisebüro-Forum (Travel Agency Forum)
GeneralSimple concise and extremely helpfulmemberscosta_FST10 Mar '10 - 20:57 
5!
GeneralRe: Simple concise and extremely helpfulmemberThomas Maierhofer14 May '12 - 1:59 
Thanks Blog: MSE-iT Software Development Homepage: Thomas Maierhofer Software Developement Homepage: MSE-iT Reisebürosoftware (Travel Agency Accounting Software) Forum: Reisebüro-Forum (Travel Agency Forum)
GeneralThanks - great explanation and useful codememberArtie1 Mar '10 - 13:22 
Measuring algorithm processing time has always seemed such a black art to me. This should go a long way to helping make my results a clearer. Thanks.
GeneralSimply brilliantmemberPascal Ganaye28 Feb '10 - 23:56 
If I understand this article It says:   If you add this line to your speed test program:   Process.GetCurrentProcess().ProcessorAffinity = new IntPtr(2);   Then StopWatch is much more reliable.   This is so simple I wonder why I did not think of that. Deep down my...
Generalpresentationmemberxliqz28 Feb '10 - 6:37 
The subject itself is trivial and nothing special, but props for great presentation. This is how articles should be presented! (5/5)
GeneralRe: presentationmemberAWdrius28 Feb '10 - 22:20 
+1. Interesting read nevertheless.Trust is a weakness.

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web04 | 2.6.130516.1 | Last Updated 28 Feb 2010
Article Copyright 2010 by Thomas Maierhofer
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid