A statistical analysis of the performance variations of assorted managed and unmanaged languages

Rama Krishna Vavilala, Nish Nishant, 8 Aug 2002, CPOL
This article compares the relative performance of several languages and build options, including native C++, Visual Basic 6, C#, VB.NET, Managed C++, a mix of MC++ and native code, and ngen'd assemblies, using a prime number generation function as a generic benchmark.

Introduction

This project was started by Rama, who did almost all of the coding. Personal affairs halted his progress and he handed it over to Nish, who took it up from where Rama had left off. Nish finished the work and did some statistical analysis on the results obtained. We wanted to get an idea of how different languages and tools compare with each other in terms of performance. There are a variety of categories where speed and performance can be measured, but the first that came to mind was computation, and thus prime number generation was chosen as the criterion.

The next job was to decide how to implement something that can be performance-compared in various languages. First the common options had to be chosen. We picked the following ten language options that are available to the general Microsoft programmer.

The participants

  • Visual C++ 7
  • Visual Basic 6
  • C#
  • VB.NET
  • Managed C++ compiled totally to IL
  • Managed C++ with arithmetic intensive stuff in unmanaged code
  • C# ngen'd
  • VB.NET ngen'd
  • Managed C++ compiled totally to IL and ngen'd
  • Managed C++ with arithmetic intensive stuff in unmanaged code and ngen'd

The objective was to use a single test application to run and measure the timings. Thus component DLLs were developed in all 10 language options. We did not account for the overhead of calling COM components from .NET, as we did not expect it to be very significant.

The Code

We used a simple COM interface that, given the number of primes to compute, computes them. The IComputePrimes interface looks like this:

interface IComputePrimes : IDispatch 
{ 
    HRESULT CalculatePrimes([in] int numPrimes);
};

This was generated using the default options of the ATL object wizard. Any object implementing this interface is expected to calculate and store as many prime numbers as specified by numPrimes.

Now let's see what the code looks like for the various cases.

The C++ code

STDMETHODIMP CComputePrimes::CalculatePrimes(int numPrimes)
{
    if (m_rgPrimes != NULL)
        delete [] m_rgPrimes;

    m_rgPrimes = new int[numPrimes];

    m_rgPrimes[0] = 2;
    m_rgPrimes[1] = 3;

    int i = 2;
    int nextPrimeCandidate = 5;

    while(i < numPrimes)
    {
        int maxNumToDivideWith = (int)sqrt(nextPrimeCandidate);

        bool isPrime = true;

        for(int j = 0; 
            (j < i) && (maxNumToDivideWith >= m_rgPrimes[j]); 
            j++)
        {
            if ((nextPrimeCandidate % m_rgPrimes[j]) == 0)
            {
                isPrime = false;
                break;
            }
        }

        if (isPrime)
            m_rgPrimes[i++] = nextPrimeCandidate;

        nextPrimeCandidate += 2;
    }

    return S_OK;
}

The prime numbers computed are stored in an integer array, m_rgPrimes. The code above tries to divide each odd candidate by every prime found so far that does not exceed the candidate's square root. If none of them divides it evenly, the candidate is prime and is stored in the array.

C# and MC++

The code for C# and Managed C++ is similar, except that in the two Managed C++ cases where we mix native code into the managed code, the code is broken into two separate functions as shown below.

void CalculatePrimes(int numPrimes)
{
    primes = new int __gc[numPrimes];
    int __pin* rgPrimes = &primes[0];
    UnmanagedComputePrimes(rgPrimes, numPrimes);
}

The array is a managed array and we pin the array and call an unmanaged function that calculates the primes and fills the array.
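The article does not show the body of UnmanagedComputePrimes. The following is a minimal sketch of what such a native helper might look like, assuming it uses the same trial-division approach as the ATL version and writes into a caller-owned buffer (here, the pinned managed array). The guard for small or invalid inputs and the use of nullptr are additions made for this illustration. In a mixed-mode project the function would presumably be compiled as native code, for example under #pragma unmanaged, but that placement is an assumption.

```cpp
#include <cmath>

// Hypothetical native helper; the article does not show its body.
// Fills rgPrimes[0 .. numPrimes-1] with the first numPrimes primes.
// The caller owns the buffer; in the MC++ case it would be the pinned managed array.
void UnmanagedComputePrimes(int* rgPrimes, int numPrimes)
{
    if (rgPrimes == nullptr || numPrimes <= 0)
        return;                       // nothing to fill

    rgPrimes[0] = 2;
    int count = 1;                    // number of primes stored so far
    int candidate = 3;

    while (count < numPrimes)
    {
        // Only primes up to sqrt(candidate) need to be tried as divisors.
        const int limit = static_cast<int>(std::sqrt(static_cast<double>(candidate)));
        bool isPrime = true;

        for (int j = 0; j < count && rgPrimes[j] <= limit; ++j)
        {
            if (candidate % rgPrimes[j] == 0)
            {
                isPrime = false;
                break;
            }
        }

        if (isPrime)
            rgPrimes[count++] = candidate;

        candidate += 2;               // even numbers above 2 are never prime
    }
}
```

Because the helper only touches a raw int pointer, it has no dependency on the managed heap beyond the pin held by the caller for the duration of the call.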

VB/VB.NET Code 

Private Sub IComputePrimes_CalculatePrimes(ByVal numPrimes As Long)

    ReDim Primes(numPrimes)
    Primes(1) = 2
    Primes(2) = 3

    Dim NextPrimeCandidate As Long
    NextPrimeCandidate = 5
    
    Dim i As Long
    Dim j As Long
    Dim MaxNumToDivideWith As Long
    Dim IsPrime As Boolean

    i = 3

    Do While i <= numPrimes
        MaxNumToDivideWith = Sqr(NextPrimeCandidate)
        IsPrime = True
        j = 1

        Do While (j <= i) And (MaxNumToDivideWith >= Primes(j))
            If NextPrimeCandidate Mod Primes(j) = 0 Then
                IsPrime = False
                Exit Do
            End If

            j = j + 1
        Loop

        If IsPrime Then
            Primes(i) = NextPrimeCandidate
            i = i + 1
        End If

        NextPrimeCandidate = NextPrimeCandidate + 2
    Loop

End Sub

The VB.NET code looks similar, with Sqr replaced by the System.Math.Sqrt function. The VB6 code is compiled with optimizations, such as removing all integer overflow checks, so that the generated code closely resembles the C++ code.

The test clients

All the cases are compiled into DLLs. All assemblies are registered for COM interoperability. We have two test clients, a managed client and a native client. The native client is coded in VC++ and uses the #import directive.

__int64 ComputeAndGetResults(
    ATLPrimesLib::IComputePrimesPtr spComputePrimes, 
    int numPrimes)
{
    LARGE_INTEGER li1, li2;
    li1.QuadPart = 0;
    li2.QuadPart = 0;

    QueryPerformanceCounter(&li1);
    spComputePrimes->CalculatePrimes(numPrimes);
    QueryPerformanceCounter(&li2);  

    return li2.QuadPart - li1.QuadPart;
}

int _tmain(int argc, _TCHAR* argv[])
{
    try
    {
        //...   

        ATLPrimesLib::IComputePrimesPtr spComputePrimes(argv[1]);       


        int numPrimes = atol(argv[2]);
        LARGE_INTEGER f;
        QueryPerformanceFrequency(&f);
        std::cout << ComputeAndGetResults(spComputePrimes, numPrimes);
    }
    catch(_com_error& e)
    {
        //...
    }

    return 0;
}

The managed client is written using C#.

try
{
    Assembly assem = Assembly.Load(args[0]);
    IComputePrimes primes = 
        (IComputePrimes)assem.CreateInstance(args[1]);

    int numPrimes = Int32.Parse(args[2]);

    long t1 = 0, t2 = 0;

    //So that the thunk is generated
    QueryPerformanceCounter(ref t1);
    primes.CalculatePrimes(numPrimes);
    QueryPerformanceCounter(ref t2);

    long freq = 0;
    QueryPerformanceFrequency(ref freq);
    Console.Write(t2 - t1);
}
catch(Exception e)
{
    Console.Error.WriteLine(e.ToString());
}

Both clients use the QueryPerformanceCounter API to measure performance and print the difference between the two counter readings, so lower numbers are better. We have a program called RunMultipleTests (written in C#) that runs both clients against each of the 10 DLLs. Take a look at the Main.cs file to see how this is implemented. We called all 10 implementations once each to generate 10 primes, then 100, 1,000, 10,000, 100,000 and finally 1,000,000 (one million).
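Both clients query the counter frequency, but the excerpts above print only the raw tick difference. If you want to compare runs across machines with different counter frequencies, the ticks can be converted to elapsed time. The sketch below is one way to do that; the function name is made up for this illustration, and the frequency you pass in would be whatever QueryPerformanceFrequency reports on your machine.

```cpp
#include <cstdint>

// Converts a QueryPerformanceCounter-style tick difference to milliseconds.
// ticksPerSecond is the value reported by QueryPerformanceFrequency.
// (Hypothetical helper, not part of the article's download.)
double ticksToMilliseconds(std::int64_t tickDelta, std::int64_t ticksPerSecond)
{
    if (ticksPerSecond <= 0)
        return 0.0;   // defensive: no usable high-resolution counter

    return static_cast<double>(tickDelta) * 1000.0
         / static_cast<double>(ticksPerSecond);
}
```

In the native client this would be called as ticksToMilliseconds(li2.QuadPart - li1.QuadPart, f.QuadPart). The tables in this article keep the raw tick counts, so they are only comparable within a single machine.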

The results

I have selected a few of the generated results for discussion here. Smaller numbers indicate higher performance.

Language               Primes  Native Callee  Managed Callee
ATLPrimes                  10         18,241         192,538
VBPrime                    10         21,057         191,597
CSharpPrimes               10      1,201,258       1,003,710
CSharpPrimes (ngen'd)      10         99,017          20,357
VBNetPrimes                10      1,680,241       1,440,198
VBNetPrimes (ngen'd)       10        101,201          21,644
MCPPPrimes1                10      1,443,943       1,117,279
MCPPPrimes1 (ngen'd)       10        107,362          29,574
MCPPPrimes2                10        977,667         699,355
MCPPPrimes2 (ngen'd)       10        127,969          53,861

The above table shows the results obtained when generating 10 primes. As you can observe, the fastest combination was the ATL DLL invoked from a native C++ client. But it might surprise you to see that when the same DLL was called from a managed client through .NET COM interop, the call took more than ten times as long (192,538 ticks against 18,241). So much for COM interop and its supposed efficiency. It hurt my ego a good deal to see that the VB DLL invoked from a native client showed far superior performance to the Managed C++ DLL. Funnily, the managed DLLs don't show a drastic difference in performance between native invocation and managed invocation. The exception is the MC++ DLL version 2, which is the unmanaged-managed mixed version. All the managed DLLs show an amazing performance increase when ngen'd. Perhaps it's time we all started taking ngen more seriously. Very surprisingly, the ngen'd C# DLL called from the managed client was the second fastest of all combinations. Curiously, the VB.NET DLL was the slowest of them all. Here is a graph of the above table.

But 10 primes is too small a number to base such observations on. Therefore we'll now move on to the results for 1000 primes. The Excel sheets in the download list the full tables for those who are interested, and you can always tweak the sample projects to try other combinations.

Language               Primes  Native Callee  Managed Callee
ATLPrimes                1000      1,674,822       1,843,077
VBPrime                  1000      1,659,063       1,830,014
CSharpPrimes             1000      2,951,717       2,665,328
CSharpPrimes (ngen'd)    1000      1,755,078       1,655,643
VBNetPrimes              1000      3,606,253       3,400,125
VBNetPrimes (ngen'd)     1000      2,108,643       1,954,464
MCPPPrimes1              1000      3,110,415       2,742,913
MCPPPrimes1 (ngen'd)     1000      1,719,734       1,642,938
MCPPPrimes2              1000      2,678,031       2,359,011
MCPPPrimes2 (ngen'd)     1000      1,748,994       1,742,121

Well, well, well! Suddenly the performance comparisons don't look as stark as they did when we generated 10 primes. Now the combination that gave the best performance is the fully managed MC++ DLL after ngen'ing. What is so painful is to see that the VB6 DLL has out-performed the ATL DLL in both managed and native invocation. Again VB.NET shows pathetic performance. But again you'll see that ngen'ing gives the managed assemblies an amazing performance boost. Now let's skip a few tables and go straight to the one million mark.

Language                Primes   Native Callee  Managed Callee
ATLPrimes              1000000  19,389,792,910  19,400,345,304
VBPrime                1000000  19,334,822,911  19,340,626,315
CSharpPrimes           1000000  19,371,408,155  19,426,052,083
CSharpPrimes (ngen'd)  1000000  19,386,294,992  19,325,672,507
VBNetPrimes            1000000  19,870,238,968  19,980,902,937
VBNetPrimes (ngen'd)   1000000  20,007,201,165  19,900,407,405
MCPPPrimes1            1000000  19,363,699,234  19,346,647,324
MCPPPrimes1 (ngen'd)   1000000  19,339,817,493  19,317,645,432
MCPPPrimes2            1000000  19,450,368,014  19,325,875,844
MCPPPrimes2 (ngen'd)   1000000  19,345,122,911  19,429,232,591

Both Rama and Nish were pleasantly surprised to find that as we went to higher and higher numbers for prime number generation, the stark contrasts in performance faded very noticeably, until at the one million mark they all showed very similar performance. Again the ngen'd fully managed MC++ DLL was the best and the VB.NET DLL was the worst. What was most curious was that ngen'ing actually had a negative impact on the VB.NET DLL when it was invoked from the native client. And here is a graphical representation.

Here is another graph that shows the impact ngen has on managed assemblies

You'll notice that ngen has the greatest impact on VB.NET programs and, as you'd guess, the least impact on MC++ code that has native code blocks. You'll also notice that the impact of ngen seems to decrease as we generate a higher number of primes. This is made very clear in the following graph.
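The ranking described above can be checked directly against the 1000-prime table. The sketch below computes, for each managed DLL called from the managed client, the percentage by which ngen reduced the tick count. The struct and function names are invented for this illustration; the numbers are copied from the Managed Callee column of that table.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One managed DLL measured with and without ngen (managed client, 1000 primes).
struct NgenSample
{
    std::string name;
    double jitTicks;    // tick count for the plain assembly
    double ngenTicks;   // tick count for the ngen'd assembly
};

// Percentage by which ngen reduced the tick count.
double percentReduction(const NgenSample& s)
{
    return (s.jitTicks - s.ngenTicks) / s.jitTicks * 100.0;
}

// Values taken from the Managed Callee column of the 1000-prime table.
std::vector<NgenSample> thousandPrimeSamples()
{
    return {
        {"CSharpPrimes", 2665328.0, 1655643.0},
        {"VBNetPrimes",  3400125.0, 1954464.0},
        {"MCPPPrimes1",  2742913.0, 1642938.0},
        {"MCPPPrimes2",  2359011.0, 1742121.0},
    };
}

// Name of the sample with the largest (or smallest) reduction.
std::string extremeNgenGain(const std::vector<NgenSample>& samples, bool largest)
{
    if (samples.empty())
        return std::string();

    std::size_t best = 0;
    for (std::size_t i = 1; i < samples.size(); ++i)
    {
        const bool better = largest
            ? percentReduction(samples[i]) > percentReduction(samples[best])
            : percentReduction(samples[i]) < percentReduction(samples[best]);
        if (better)
            best = i;
    }
    return samples[best].name;
}
```

On these figures VB.NET gains the most (a little over 42%) and the mixed MC++ DLL gains the least (about 26%), which agrees with the observation above.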

So far we have only seen cases where the method was called once, so the managed versions suffered from JIT compilation overhead. To see whether the managed versions got any faster after the first call, we called the method three times in a loop. Here are some sample results. Don't be surprised that they differ from the tables above. The first set of tests was run on a dual P-III 550 MHz machine with 384 MB RAM, where the performance counter frequency is quite high, so those numbers are larger. The multiple-call tests were all run on a single P-III 800 MHz machine with 384 MB RAM, where the performance counter frequency is lower, so the numbers are smaller. But you'll notice that the ratios remain more or less the same.

Language               Primes  Native Callee (#1, #2, #3)  Managed Callee (#1, #2, #3)
CSharpPrimes               10    5973      35      25        4848      56      46
CSharpPrimes (ngen'd)      10     476      32     276          95      60      45
VBNetPrimes                10    7663      38      29        8144      59      50
VBNetPrimes (ngen'd)       10     489      35      29         101      63      51
MCPPPrimes1                10    6270      34      26        5383      57      46
MCPPPrimes1 (ngen'd)       10     499      31      24         127      56      46
MCPPPrimes2                10    4466      38      25        3646      61      47
MCPPPrimes2 (ngen'd)       10     624      31      25         247      65      47

You'll notice an amazing increase in performance for the 2nd and later calls. The most noticeable improvement is for the non-ngen'd DLLs. The ngen'd C# DLL shows a slight anomaly on its 3rd run, but this might have been due to some OS activity coinciding with that exact moment, so you may safely ignore it. Thus, whether you ngen it or not, from the 2nd run onwards your methods will be nearly as fast as native calls, because there is no JIT overhead. They will not be quite as fast, because of other overheads like garbage collection. You'll also notice that the 3rd call has actually improved over the 2nd call, but this improvement across calls drops sharply as we increase the call loop count. Now let's take the results for a larger number of primes.
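To separate first-call cost from steady-state cost, each call has to be timed individually inside a single client process. The sketch below shows one way to structure such a loop. It is not the article's client code: it uses std::chrono::steady_clock in place of QueryPerformanceCounter so that it is portable, and the function name is made up for this example.

```cpp
#include <chrono>
#include <vector>

// Calls 'work' callCount times within the same process and records the
// elapsed time of each call separately, in nanoseconds.
// The first entry includes any one-time cost (JIT compilation, thunk
// generation, cold caches); later entries approximate steady state.
template <typename Callable>
std::vector<long long> timeRepeatedCalls(Callable&& work, int callCount)
{
    std::vector<long long> elapsedNs;

    for (int i = 0; i < callCount; ++i)
    {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();

        elapsedNs.push_back(
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count());
    }

    return elapsedNs;
}
```

With the native client above, the callable would be something like a lambda that invokes spComputePrimes->CalculatePrimes(numPrimes), and callCount would be 3 to reproduce the #1, #2 and #3 columns.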

Language               Primes  Native Callee (#1, #2, #3)  Managed Callee (#1, #2, #3)
CSharpPrimes            10000  165346  162135  158838      159857  157004  156279
CSharpPrimes (ngen'd)   10000  155593  154611  156586      157266  156629  154440
VBNetPrimes             10000  180720  172494  173198      175535  171634  170705
VBNetPrimes (ngen'd)    10000  172432  173577  172076      173416  175305  173921
MCPPPrimes1             10000  165775  159783  160712      161040  158640  157350
MCPPPrimes1 (ngen'd)    10000  155954  164162  159695      155283  159554  155928
MCPPPrimes2             10000  160007  154570  154990      171823  158746  156686
MCPPPrimes2 (ngen'd)    10000  156243  153972  154144      154966  157720  167443

Ah, now the performance improvements of ngen are not as obvious. This again confirms that over the long run the JIT bottleneck fades and finally all but disappears.

Some conclusions

  • Using ngen gives a tremendous performance improvement to your managed code. The improvement is greater when the code is called from a managed client than when it is invoked from a native C++ client.
  • Managed/unmanaged transitions are inefficient, and unmanaged to managed transitions are much slower than managed to unmanaged transitions. Thus wherever possible it's best to avoid managed/unmanaged transitions.
  • There is a marked improvement in the performance of managed code if it is invoked repeatedly, because the JITing is done only the first time.
  • As we increase the number of primes, the performance differences between the various languages start to shrink, which again underlines the fact that without the JIT overhead managed code is just as good as native code.
  • Of all the .NET compilers, the VB.NET compiler seems to produce the slowest code. We think this is because VB.NET checks for overflows in all arithmetic operations (verified using ILDasm).
  • The C# compiler seems to be markedly better than the MC++ compiler (pure managed code).
  • Using ngen has the most impact on VB.NET assemblies and the least impact on MC++ assemblies.
  • Mixing unmanaged and managed code with C++ is far more efficient than pure MC++. In fact pure MC++ is much slower than C# for fully managed projects. Thus unless you plan to integrate MFC or ATL, C# is the better choice over MC++.
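The overflow-check explanation for VB.NET's slower code can be made concrete. The sketch below shows, in plain C++, the kind of extra work a checked integer addition does compared with a bare add. It is only an illustration of the idea; it is not the IL that the VB.NET compiler emits, and the function name is hypothetical.

```cpp
#include <limits>
#include <stdexcept>

// Adds two ints, but verifies first that the result fits in an int.
// An unchecked build would perform only the final addition; the two
// comparisons are the per-operation cost of overflow checking.
int checkedAdd(int a, int b)
{
    const int maxInt = std::numeric_limits<int>::max();
    const int minInt = std::numeric_limits<int>::min();

    if ((b > 0 && a > maxInt - b) ||
        (b < 0 && a < minInt - b))
    {
        throw std::overflow_error("integer overflow in checkedAdd");
    }

    return a + b;
}
```

In the prime loop, every increment of the candidate and of the loop counters would pay for such a check, which is consistent with the authors' suspicion, though this sketch does not prove it.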

Updates and fixes

  • Aug 10 2002 - A major goof-up was fixed. In the looped method tests, we had looped at the wrong place. Instead of looping the method call, we had actually looped the execution of the client process. This has been fixed, and the tables and the Excel sheets have been updated.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Authors

Rama Krishna Vavilala
Architect
United States

Nish Nishant
United States
Nish Nishant is a Software Architect/Consultant based out of Columbus, Ohio. He has over 16 years of software industry experience in various roles including Lead Software Architect, Principal Software Engineer, and Product Manager. Nish is a recipient of the annual Microsoft Visual C++ MVP Award since 2002 (14 consecutive awards as of 2015).

Nish is an industry acknowledged expert in the Microsoft technology stack. He authored C++/CLI in Action for Manning Publications in 2005, and had previously co-authored Extending MFC Applications with the .NET Framework for Addison Wesley in 2003. In addition, he has over 140 published technology articles on CodeProject.com and another 250+ blog articles on his WordPress blog. Nish is vastly experienced in team management, mentoring teams, and directing all stages of software development.

Contact Nish : You can reach Nish on his google email id voidnish.

Last Updated 9 Aug 2002
Article Copyright 2002 by Rama Krishna Vavilala, Nish Nishant