Comparative Speed Testing

28 May 2008, CPOL
A simple-to-use class for performing comparative, non-benchmarked speed tests when optimising code for execution speed.

Abstract

This article presents an unsophisticated class for comparing the raw speed of two programming design alternatives. It is not a scientific benchmarking exercise, but a very simple way of implementing two coded processes to gauge which is probably faster and approximately how much faster. The class allows the comparison with a minimum of infrastructure coding; it is not rocket science.

Introduction

In the process of writing some imaging filter methods, I continually found myself in the position of having to optimise my code for raw speed. Although the approach I had taken was fundamentally quite an efficient one, the need to regularly process more than six million pixels made speed paramount, sometimes at the expense of transparency and good programming practice. The simple process of converting a 24-bit image to 8-bit indexed grayscale, then finding edges meant 20 million pixel manipulations. One microsecond gained per pixel saved 20 seconds of processing time.

Routine practices soon became critical decisions, for example: do I protect a class member/field (like a pixel array) and access it via a read-write property, or do I make it public and allow read-write directly to the data? Until I knew what the time penalty was for access via a property, I had no more than a strong suspicion that reading-writing directly would be slightly faster, and even less idea of how much faster it might be.

Eventually I had such a backlog of outstanding comparisons to make that I wrote a simple abstract class that enabled me to get through them as quickly as possible with sufficiently scientific results that I could choose which options to use.

Some surprising results

Well, they were to me anyway...

I was not interested in a rigorous scientific comparison that would withstand serious academic scrutiny; I was merely interested in making design choices that would result in a shorter wait for an imaging filter to execute. Some of the differences in the tests I ran were a little more pronounced than I expected, and some were quite surprising.

Example - property versus direct member access

For example, direct, unprotected read access to class members is about five times as efficient as the same access via a property. I did not test write access; the test outcome was sufficient evidence for me to make several class members public.

Later I discovered, to my surprise, that if the "Release" version of the executable is run, the outcome is indeterminate - one pass gives property access the advantage, another direct access. My conclusion is that either will do, and my assumption is that the optimiser builds similar code for both in the Release binary. I have included two executables in the download, SpeedTests_Debug.exe and SpeedTests_Release.exe, so you can see the disparity for yourself.

Example - recursion versus stack-and-loop

Another example: a simple recursive method surprisingly seemed ten times as efficient as pushing a value on a .NET Stack instance and looping. I did not test a hand-crafted stack which may have led to different results; I just went with the recursive method.
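To make the two alternatives concrete, here is a minimal sketch of what such a pair might look like. It is not the code from clsSpeedTestAB_Recursion; the function names and the summing task are hypothetical, and it assumes the generic Stack(Of Integer) rather than whichever stack type was used in the original test.

```vbnet
Imports System
Imports System.Collections.Generic

Module RecursionVersusStackSketch

    ' Alternative A: plain recursion. Sums the integers 1..n.
    Function SumRecursive(ByVal n As Integer) As Integer
        If n <= 0 Then Return 0
        Return n + SumRecursive(n - 1)
    End Function

    ' Alternative B: the same work driven by an explicit stack and a loop.
    ' Each "call" becomes a Push, each "return" a Pop.
    Function SumWithStack(ByVal n As Integer) As Integer
        Dim pending As New Stack(Of Integer)()
        Dim total As Integer = 0
        pending.Push(n)
        While pending.Count > 0
            Dim k As Integer = pending.Pop()
            If k > 0 Then
                total += k
                pending.Push(k - 1)
            End If
        End While
        Return total
    End Function

    Sub Main()
        Console.WriteLine(SumRecursive(100))  ' 5050
        Console.WriteLine(SumWithStack(100))  ' 5050
    End Sub

End Module
```

Placing one function in SpeedTestA and the other in SpeedTestB would give a comparison of the kind described above, although the figures will depend on the workload and the build mode.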

Class SpeedTestsAB

The speed test runs a rudimentary control method (SpeedTestControl) which is simply an empty method to provide an overhead metric which can be subtracted from the total time of each test to provide a net running time.
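The control-subtraction idea can be sketched in a few lines. This is an illustration under assumptions, not the implementation inside the class: the names TimeLoop, EmptyControl and CodeUnderTest are hypothetical, and it uses System.Diagnostics.Stopwatch, which may differ from the timer the class actually uses.

```vbnet
Imports System
Imports System.Diagnostics

Module ControlTimingSketch

    Private f_Value As Integer = 123456

    ' Hypothetical control: an empty method, so only loop and call overhead is timed.
    Private Sub EmptyControl()
    End Sub

    ' Hypothetical code under test: read a field into a local variable.
    Private Sub CodeUnderTest()
        Dim x As Integer = f_Value
    End Sub

    ' Time the given number of calls to a method.
    Private Function TimeLoop(ByVal work As Action, ByVal repetitions As Integer) As TimeSpan
        Dim watch As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To repetitions
            work()
        Next
        watch.Stop()
        Return watch.Elapsed
    End Function

    Sub Main()
        Const Repetitions As Integer = 10000000
        Dim controlTime As TimeSpan = TimeLoop(AddressOf EmptyControl, Repetitions)
        Dim totalTime As TimeSpan = TimeLoop(AddressOf CodeUnderTest, Repetitions)

        ' Net time = total time minus the overhead measured by the control.
        Dim netTime As TimeSpan = totalTime - controlTime

        Console.WriteLine("Control: {0}", controlTime)
        Console.WriteLine("Total:   {0}", totalTime)
        Console.WriteLine("Net:     {0}", netTime)
    End Sub

End Module
```

Because both loops pay the same loop and call overhead, the subtraction leaves an estimate of the cost of the tested statement alone. With very cheap operations the net figure can be dominated by timer noise, which is one reason to use a large number of repetitions.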

The abstract class SpeedTestsAB defines four abstract/MustOverride methods:

  • SetUpTestA: Initialisation for SpeedTestA, for example (C#): base.f_DescribeA = "Access a public field directly.";
  • SetUpTestB: Initialisation for SpeedTestB, for example (VB): Me.f_DescribeB = "Access a field via a property."
  • SpeedTestA: The speed-test A code to execute.
  • SpeedTestB: The speed-test B code to execute.

In addition, there are methods and properties that enable simple reporting of the results.

Properties:

  • TotalTimeA, TotalTimeB, TotalTimeControl: The total times taken for each test.
  • NetTimeA, NetTimeB: The net time taken for each test, that is, TotalTime minus TotalTimeControl.
  • Repetitions: The number of repetitions.

Methods:

  • Results: Returns a very simple results report string.
  • AppendResults: Appends the result string to a file.
  • ShowResults: Shows the result string in a MessageBox.
  • WriteResults: Writes the result string to a file or stream.

The resulting output looks like:

Test results:
10,000,000 repetitions.
Test A: Access a public field directly.
Test B: Access a field via a property.
00:00:00.0937500 hh:mm:ss.ff Equivalent Elapsed Time Control Process.
00:00:00.1406250 hh:mm:ss.ff Total Elapsed Time Process A.
00:00:00.1718750 hh:mm:ss.ff Total Elapsed Time Process B.
00:00:00.0468750 hh:mm:ss.ff Net Elapsed Time Process A.
00:00:00.0625000 hh:mm:ss.ff Net Elapsed Time Process B.
Net Unit Processing Time A: 4.688 nanosecs
Net Unit Processing Time B: 6.250 nanosecs
75.000% Percentage: Process A divided by Process B.
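The derived figures in the report follow from simple arithmetic on the elapsed times. The sketch below reproduces the Process A figures and the percentage from the values printed above; the variable names are hypothetical and the class may compute them differently.

```vbnet
Imports System

Module ReportArithmeticSketch

    Sub Main()
        Dim repetitions As Long = 10000000

        ' Values copied from the report above (1 tick = 100 nanoseconds).
        Dim controlTime As TimeSpan = TimeSpan.FromTicks(937500)   ' 00:00:00.0937500
        Dim totalA As TimeSpan = TimeSpan.FromTicks(1406250)       ' 00:00:00.1406250
        Dim netB As TimeSpan = TimeSpan.FromTicks(625000)          ' 00:00:00.0625000

        ' Net time A = total A minus control = 468,750 ticks = 00:00:00.0468750.
        Dim netA As TimeSpan = totalA - controlTime

        ' Unit time A = 468,750 ticks * 100 ns / 10,000,000 repetitions = 4.6875 ns.
        Dim unitNanosecsA As Double = netA.Ticks * 100.0 / repetitions

        ' Percentage = net A / net B = 468,750 / 625,000 = 75%.
        Dim percentage As Double = 100.0 * netA.Ticks / netB.Ticks

        Console.WriteLine("Net A:      {0}", netA)
        Console.WriteLine("Unit A:     {0} nanosecs", unitNanosecsA)
        Console.WriteLine("Percentage: {0}", percentage)
    End Sub

End Module
```

A percentage below 100 means Process A was the faster of the two; above 100, Process B was.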

Using SpeedTestsAB

Create a class that inherits clsBaseSpeedTestAB, e.g.:

Public Class clsSpeedTestAB_Properties
  Inherits clsBaseSpeedTestAB

Create a code section that defines properties, methods, and data required to run the tests, e.g.:

#Region "[=== SPEED TEST COMPONENTS ===]"

  Protected f_SomeInteger As Int32 = 123456

  Public Property SomeInteger() As Int32
    Get
      Return Me.f_SomeInteger
    End Get
    Set(ByVal value As Int32)
      Me.f_SomeInteger = value
    End Set
  End Property

#End Region

Override the methods SetUpTestA and SetUpTestB with, at least, the description of the tests, e.g.:

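''' <summary>
''' Overridden speed test A setup.
''' </summary>
Protected Overrides Sub SetUpTestA()
    Me.f_DescribeA = "Access a public field directly."
End Sub

''' <summary>
''' Overridden speed test B setup.
''' </summary>
Protected Overrides Sub SetUpTestB()
    Me.f_DescribeB = "Access a field via a property."
End Sub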

Override the methods SpeedTestA and SpeedTestB with the code to be tested, e.g.:

''' <summary>
''' Overridden Speed Test A.
''' </summary>
''' <remarks></remarks>
Protected Overrides Sub SpeedTestA()
    Dim xInt As Int32
    xInt = Me.f_SomeInteger
End Sub

''' <summary>
''' Overridden Speed Test B.
''' </summary>
''' <remarks></remarks>
Protected Overrides Sub SpeedTestB()
    Dim xInt As Int32
    xInt = Me.SomeInteger
End Sub

Instantiate the class somewhere and call the RunTest method, e.g.:

Private Sub Button1_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button1.Click
    Dim xTest As New clsSpeedTestAB_Properties(10000000)
    xTest.RunTest()
    xTest.ShowResults()
End Sub

The downloads

The downloads are in four packs:

Pack

Contents

SpeedTests_Src_CS.zip

C# source code for the demonstration, including:

  • clsBaseSpeedTestAB.cs
  • clsSpeedTestAB_Delegates.cs
  • clsSpeedTestAB_Properties.cs
  • clsSpeedTestAB_Recursion.cs
  • Form1.cs

Please note that the C# source code is the disassembler output from Lutz Roeder's .NET Reflector v5.1, and not the result of my porting the source code. The disassembly introduced several errors, which I have tried to fix, but some may remain.

SpeedTests_Src_VB.zip

VB.NET source code for the demonstration, including:

  • clsBaseSpeedTestAB.vb
  • clsSpeedTestAB_Delegates.vb
  • clsSpeedTestAB_Properties.vb
  • clsSpeedTestAB_Recursion.vb
  • Form1.vb

SpeedTests_EXE.zip

The two compiled speed-test executables, one built in Debug mode (SpeedTests_Debug.exe) and one in Release mode (SpeedTests_Release.exe).

SpeedTests_Article.zip

This article.

Points to note

This is not an example of proper scientific speed benchmarking, but is only a mechanism for making a choice as to which code design to use when raw speed is of primary importance.

There is often quite a difference between the efficiency/speed of code which is compiled in Debug mode when compared to the same code compiled in Release mode. Assuming that the final version of your software will be compiled in Release mode, I recommend speed testing in that mode. It is of interest, however, to compare execution speed in the two modes.

Sample test results

(In Debug mode.)

These results are from an earlier incarnation of the class and were actively used in making image coding design decisions. The differences may disappear when tested with a Release compiled version.

Speed test: If, ElseIf compared with Select Case

Public Overrides Sub SpeedTestA()
  If Me.f_Int32 = 0 Then
    Me.f_Int32 = 1
  ElseIf Me.f_Int32 = 1 Then
    Me.f_Int32 = 1
  ElseIf Me.f_Int32 = 2 Then
    Me.f_Int32 = 2
  ElseIf Me.f_Int32 = 3 Then
    Me.f_Int32 = 3
  ElseIf Me.f_Int32 = 4 Then
    Me.f_Int32 = 4
  ElseIf Me.f_Int32 = 5 Then
    Me.f_Int32 = 5
  ElseIf Me.f_Int32 = 6 Then
    Me.f_Int32 = 6
  ElseIf Me.f_Int32 = 7 Then
    Me.f_Int32 = 7
  ElseIf Me.f_Int32 = 8 Then
    Me.f_Int32 = 8
  ElseIf Me.f_Int32 = 9 Then
    Me.f_Int32 = 9
  End If
End Sub

Public Overrides Sub SpeedTestB()
  Select Case Me.f_Int32
    Case 0
      Me.f_Int32 = 0
    Case 1
      Me.f_Int32 = 1
    Case 2
      Me.f_Int32 = 2
    Case 3
      Me.f_Int32 = 3
    Case 4
      Me.f_Int32 = 4
    Case 5
      Me.f_Int32 = 5
    Case 6
      Me.f_Int32 = 6
    Case 7
      Me.f_Int32 = 7
    Case 8
      Me.f_Int32 = 8
    Case 9
      Me.f_Int32 = 9
  End Select
End Sub

Protected f_Int32 As Int32

Test results:
600,000,000 repetitions.
Test A: Use if, elseif.
Test B: Use select case.
00:00:07.4270833 hh:mm:ss.ff Equivalent Elapsed Time Control Process.
00:00:10.0312500 hh:mm:ss.ff Total Elapsed Time Process A.
00:00:09.2968750 hh:mm:ss.ff Total Elapsed Time Process B.
00:00:02.6041667 hh:mm:ss.ff Net Elapsed Time Process A.
00:00:01.8697917 hh:mm:ss.ff Net Elapsed Time Process B.
Net Unit Processing Time A: 2.604 secs
Net Unit Processing Time B: 1.870 secs
139.276% Percentage: Process A divided by Process B.

Conclusion

Select Case may be approximately 40% faster than If, ElseIf: in this Debug-mode test, If, ElseIf took about 1.39 times as long.

Speed test: If, ElseIf compared with nested IIF

Public Overrides Sub SpeedTestA()
  If Me.f_Int32 = 0 Then
    Me.f_Int32 = 1
  ElseIf Me.f_Int32 = 1 Then
    Me.f_Int32 = 1
  ElseIf Me.f_Int32 = 2 Then
    Me.f_Int32 = 2
  ElseIf Me.f_Int32 = 3 Then
    Me.f_Int32 = 3
  ElseIf Me.f_Int32 = 4 Then
    Me.f_Int32 = 4
  ElseIf Me.f_Int32 = 5 Then
    Me.f_Int32 = 5
  ElseIf Me.f_Int32 = 6 Then
    Me.f_Int32 = 6
  ElseIf Me.f_Int32 = 7 Then
    Me.f_Int32 = 7
  ElseIf Me.f_Int32 = 8 Then
    Me.f_Int32 = 8
  ElseIf Me.f_Int32 = 9 Then
    Me.f_Int32 = 9
  End If
End Sub

Public Overrides Sub SpeedTestB()
  IIf(Me.f_Int32 = 0, Me.f_Int32 = 0 _
        , IIf(Me.f_Int32 = 1, Me.f_Int32 = 1 _
        , IIf(Me.f_Int32 = 2, Me.f_Int32 = 2 _
        , IIf(Me.f_Int32 = 3, Me.f_Int32 = 3 _
        , IIf(Me.f_Int32 = 4, Me.f_Int32 = 4 _
        , IIf(Me.f_Int32 = 5, Me.f_Int32 = 5 _
        , IIf(Me.f_Int32 = 6, Me.f_Int32 = 6 _
        , IIf(Me.f_Int32 = 7, Me.f_Int32 = 7 _
        , IIf(Me.f_Int32 = 8, Me.f_Int32 = 8 _
        , IIf(Me.f_Int32 = 9, Me.f_Int32 = 9 _
        , Me.f_Int32 = 9))))))))))
End Sub

Protected f_Int32 As Int32 = -1

Test results:
100,000,000 repetitions.
Test A: Use if, elseif.
Test B: Use nested IIF.
00:00:01.2343750 hh:mm:ss.ff Equivalent Elapsed Time Control Process.
00:00:03.2812500 hh:mm:ss.ff Total Elapsed Time Process A.
00:00:39.7500000 hh:mm:ss.ff Total Elapsed Time Process B.
00:00:02.0468750 hh:mm:ss.ff Net Elapsed Time Process A.
00:00:38.5156250 hh:mm:ss.ff Net Elapsed Time Process B.
Net Unit Processing Time A: 2.047 secs
Net Unit Processing Time B: 38.516 secs
5.314% Percentage: Process A divided by Process B.

Conclusion

Do not use IIf where speed matters: in this test, the nested IIf took almost 19 times as long as If, ElseIf.
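One likely contributor to this result is that IIf is an ordinary function (Microsoft.VisualBasic.Interaction.IIf) taking Object arguments, so both the true part and the false part are evaluated on every call, and each nested IIf is itself an argument. The sketch below demonstrates the evaluation behaviour with a hypothetical counting helper, and contrasts it with the short-circuiting If operator, which assumes Visual Basic 2008 (VB 9.0) or later. It does not measure speed.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module IIfEvaluationSketch

    Private f_Calls As Integer = 0

    ' Hypothetical helper that counts how often it is evaluated.
    Private Function Counted(ByVal value As Integer) As Integer
        f_Calls += 1
        Return value
    End Function

    Sub Main()
        Dim flag As Boolean = True

        ' IIf function: both branches are evaluated before the call is made.
        f_Calls = 0
        Dim viaIIf As Integer = CInt(IIf(flag, Counted(1), Counted(2)))
        Console.WriteLine("IIf evaluated {0} branch(es), result {1}", f_Calls, viaIIf)  ' 2 branches, result 1

        ' If operator: only the selected branch is evaluated.
        f_Calls = 0
        Dim viaIf As Integer = If(flag, Counted(1), Counted(2))
        Console.WriteLine("If evaluated {0} branch(es), result {1}", f_Calls, viaIf)    ' 1 branch, result 1
    End Sub

End Module
```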

Speed test: Compare Shift-Right 4 (X >> 4) with Divide by 16

Public Overrides Sub SpeedTestA()
  Dim xInt As Int32 = _
    CInt((((((((Me.f_Int >> 4) >> 4) >> 4) >> 4) >> 4) >> 4) >> 4) >> 4)
End Sub

Public Overrides Sub SpeedTestB()
  Dim xInt As Int32 = _
    CInt(Me.f_Int / 16 / 16 / 16 / 16 / 16 / 16 / 16 / 16)
End Sub

Protected f_Int As Int32 = 123456

Test results:
100,000,000 repetitions.
Test A: Shift-Right 4.
Test B: Divide by 16.
00:00:01.2500000 hh:mm:ss.ff Equivalent Elapsed Time Control Process.
00:00:01.3593750 hh:mm:ss.ff Total Elapsed Time Process A.
00:00:20.9843750 hh:mm:ss.ff Total Elapsed Time Process B.
00:00:00.1093750 hh:mm:ss.ff Net Elapsed Time Process A.
00:00:19.7343750 hh:mm:ss.ff Net Elapsed Time Process B.
Net Unit Processing Time A: 109.375 millisecs
Net Unit Processing Time B: 19,734.375 millisecs
0.554% Percentage: Process A divided by Process B.

Conclusion

In this test, shifting right by 4 was getting close to 200 times as fast as dividing by 16 (about 180 times, from the 0.554% figure).
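When reading this result, it may help to remember that the / operator in VB.NET performs floating-point division even on integer operands, so the divide test also pays for conversion to Double and back through CInt. Integer division is written with the \ operator, which this test does not cover. The following sketch, with hypothetical variable names, shows how the three operators relate for the test value, and where shifting and integer division differ for negative numbers.

```vbnet
Imports System

Module ShiftVersusDivideSketch

    Sub Main()
        Dim value As Integer = 123456

        Dim shifted As Integer = value >> 4        ' 7716
        Dim intDivided As Integer = value \ 16     ' 7716 (integer division)
        Dim floatDivided As Double = value / 16    ' 7716 as a Double (floating-point division)

        Console.WriteLine(shifted)
        Console.WriteLine(intDivided)
        Console.WriteLine(floatDivided)

        ' For negative values the results differ: >> is an arithmetic shift
        ' that rounds towards negative infinity, while \ truncates towards zero.
        Dim negative As Integer = -1
        Console.WriteLine(negative >> 4)   ' -1
        Console.WriteLine(negative \ 16)   ' 0
    End Sub

End Module
```

For non-negative pixel values, such as those in the imaging filters that motivated these tests, the shift and the integer division give the same answer.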

History

  • 2008-05-26: Created.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Warrick Procter
ZED
New Zealand
A DEC PDP/11 BasicPlus2 developer from the 80s.

Last Updated 29 May 2008
Article Copyright 2008 by Warrick Procter