Licence GPL3
First Posted 1 Apr 2008

Inner Product Experiment: C# vs. C/C++

By | 20 May 2008 | Article
This article compares the speed of the inner product operation on shorts, ints, longs, floats, doubles and decimals in C# with the same operation in C/C++.

Introduction

The inner product (also called the dot product or scalar product) is a core operation in digital signal processing. It is used everywhere: Fourier transforms (FFT, DCT), wavelet analysis, filtering operations and so on. After writing a similar article on the inner product in C/C++, Inner Product Experiment: CPU, FPU vs. SSE*, I wondered how the same code would perform when written in C#. I repeated the inner product operations using the C# types short, int, long, float, double and decimal.

Background

Inner Product Experiment: CPU, FPU vs. SSE*

Using the code

Run inner.exe, passing the size of the vectors to multiply as its argument. Make sure timer.dll is in the same directory as the executable. It provides the tic() and toc() functions, which implement a precision timer that reports milliseconds. I call the DLL from the static PerformanceCounter class through PerformanceCounter.Tic() and PerformanceCounter.Toc().

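using System;
using System.Runtime.InteropServices;

// Thin managed wrapper around the native timer exported by timer.dll.
public static class PerformanceCounter
{
        // Native entry points; timer.dll must sit next to the executable.
        [DllImport("timer")]
        static extern void tic();
        [DllImport("timer")]
        static extern long toc();

        // Starts the timer. Returns 0 on success, or -1 if the native call
        // fails (for example when timer.dll cannot be loaded).
        public static long Tic()
        {
                try
                {
                        tic();
                        return 0;
                }
                catch (Exception e)
                {
                        Console.WriteLine(String.Format("PerformanceCounter.Tic() {0}", e.Message));
                        return -1;
                }
        }

        // Returns the value reported by the native toc() (milliseconds since
        // the last Tic()), or -1 if the native call fails.
        public static long Toc()
        {
                try
                {
                        return toc();
                }
                catch (Exception e)
                {
                        Console.WriteLine(String.Format("PerformanceCounter.Toc() {0}", e.Message));
                        return -1;
                }
        }
}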
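
If timer.dll is not at hand, a similar millisecond timer can be sketched in managed code alone. The version below is not the code used for the measurements in this article: it swaps the native tic()/toc() pair for System.Diagnostics.Stopwatch from the .NET class library, and the names ManagedCounter and ManagedCounterDemo are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical managed stand-in for PerformanceCounter (assumption: Stopwatch
// replaces timer.dll; this is not the article's original timer).
public static class ManagedCounter
{
    static readonly Stopwatch watch = new Stopwatch();

    // Restarts the timer from zero.
    public static void Tic()
    {
        watch.Reset();
        watch.Start();
    }

    // Returns the milliseconds elapsed since the last Tic().
    public static long Toc()
    {
        return watch.ElapsedMilliseconds;
    }
}

public static class ManagedCounterDemo
{
    public static void Main()
    {
        ManagedCounter.Tic();
        Thread.Sleep(50);   // stands in for the work being timed
        Console.WriteLine(" slept: {0} ms", ManagedCounter.Toc());
    }
}
```

Because the interface mirrors Tic() and Toc(), the benchmark functions would only need the class name changed to use it.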

The main console program contains the code below. Only the doubles function is shown here to save space:

class Program
{
        static int size = 1000000;

        static void Main(string[] args)
        {
                try
                {
                        if (args.Length >= 1)
                                size = (int)Convert.ToUInt32(args[0]);
                }
                catch (Exception e)
                {
                        Console.WriteLine(String.Format("Can not convert {0} to uint32: {1}", args[0], e.Message));
                        size = 1000000;
                }

                shorts();
                ints();
                longs();
                floats();
                doubles();
                decimals();
        }

        //...
        
        static void doubles()
        {
                double[] a = new double[size];
                double[] b = new double[size];

                Random rnd = new Random();
                for (int i = 0; i < size; i++)
                {
                        a[i] = rnd.NextDouble() - 0.5;
                        b[i] = rnd.NextDouble() - 0.5;
                }

                PerformanceCounter.Tic();

                double c = 0.0;
                for (int i = 0; i < size; i++)
                        c += a[i] * b[i];

                Console.WriteLine(String.Format(" doubles: {0} ms", PerformanceCounter.Toc()));

                a = null;
                b = null;
        }

        //...

Below is an example of the console output for vectors of dimension 5000000.

>inner.exe 5000000
 shorts: 16 ms
 ints: 7 ms
 longs: 69 ms
 floats: 9 ms
 doubles: 9 ms
 decimals: 2569 ms

I was stunned to see floats and doubles in C# run 1.3 to 3.3 times faster than in C/C++, even against the SSE-optimized version. That should not happen: the code is managed and compiled at run time, and it runs on the same CPU/FPU, so how can it be faster? If you know the answer, post it here. See the Inner Product Experiment: CPU, FPU vs. SSE* article for the timings of the corresponding numeric types in C/C++. Ints run a little faster than floats, but probably not by enough to justify quantizing floats to fixed-point arithmetic; here too C# outperforms C/C++, running 2.28 times faster. Shorts and longs, however, are quite slow. Shorts in C# are as fast as in plain C/C++, but the SSE2 intrinsics version outperforms C#. Avoid decimals unless you really need high precision after the decimal point; otherwise the computation will take forever.

With all these amenities in C# programming, should we not migrate DSP applications from C++?

Update (7 Apr 2008)

Sadly for C# adherents, and to the great delight of C++ gurus, the labour we spent on C/C++ was not in vain after all. The C# compiler does indeed optimize the code, somehow avoiding work on variables that are never used, and that led me astray. To restore the tarnished glory of C++, here is the C# output for vectors of size 5000000, this time with each result printed after its timing:

>inner.exe 5000000
 shorts: 16 ms 
  27006 
 ints: 18 ms 
  1240761 
 longs: 72 ms 
  -5610477 
 floats: 30 ms 
  33,548 
 doubles: 35 ms 
  198,949191315363 
 decimals: 2936 ms 
  138,23876271661179995948054686

It remains open to dispute, however, why the unused for() loops were not removed for shorts and longs. Doubles run slower than floats here, in contrast to C++, where doubles outperform floats.
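
One way to keep such a measurement honest is to make the result of the timed loop observable, for example by returning it and printing it once the timer has stopped. The sketch below assumes that approach. Stopwatch stands in for timer.dll, and the names InnerProduct and Dot are hypothetical rather than taken from the downloadable source.

```csharp
using System;
using System.Diagnostics;

public static class InnerProduct
{
    // Plain inner product. The sum is returned so that the caller uses it
    // and the loop cannot be treated as dead code.
    public static double Dot(double[] a, double[] b)
    {
        double c = 0.0;
        for (int i = 0; i < a.Length; i++)
            c += a[i] * b[i];
        return c;
    }

    public static void Main(string[] args)
    {
        // Vector size from the command line, defaulting to 1000000.
        int size = 1000000;
        int parsed;
        if (args.Length >= 1 && int.TryParse(args[0], out parsed) && parsed > 0)
            size = parsed;

        double[] a = new double[size];
        double[] b = new double[size];
        Random rnd = new Random();
        for (int i = 0; i < size; i++)
        {
            a[i] = rnd.NextDouble() - 0.5;
            b[i] = rnd.NextDouble() - 0.5;
        }

        // Time only the inner product itself.
        Stopwatch watch = Stopwatch.StartNew();
        double c = Dot(a, b);
        watch.Stop();

        // Printing the sum after the timing makes the computation observable.
        Console.WriteLine(" doubles: {0} ms", watch.ElapsedMilliseconds);
        Console.WriteLine("  {0}", c);
    }
}
```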

Update (6 May 2008)

Unrolling (unfolding) the for() loops did indeed give a speed-up, but only when unrolling 4 times. The same trick gave no performance increase in the C++ code. This is how I did the unrolling:

...
float c = 0.0f;
int ii = 0;
for (int i = 0; i < size / 4; i++)
{
        c += a[ii] * b[ii];
        ii++;
        c += a[ii] * b[ii];
        ii++;
        c += a[ii] * b[ii];
        ii++;
        c += a[ii] * b[ii];
        ii++;
}
...

And the results are shown below:

>inner.exe 5000000
 shorts: 16 ms
  -24687
 shorts 4loop: 14 ms
  7038
 ints: 18 ms
  19686
 ints 4loop: 16 ms
  9090795
 longs: 71 ms
  -870676
 longs 4loop: 75 ms
  -8263341
 floats: 32 ms
  43,41741
 floats 4loop: 15 ms
  11,02298
 doubles: 34 ms
  194,810329249757
 doubles 4loop: 24 ms
  -495,312642682424
 doubles unsafe: 32 ms
  -283,031436372233
 decimals: 2550 ms
  368,82465505657333076693624932
 decimals 4loop: 2611 ms
  -50,405825071718589646106671809
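
The unrolled loop shown earlier runs size / 4 times, so when size is not a multiple of 4 the last one to three elements are left out of the sum (5000000 happens to be divisible by 4). A minimal sketch of the same four-way unrolling with a tail loop follows. The names Unrolled and Dot4 are hypothetical, and this variant was not timed for the results above.

```csharp
using System;

public static class Unrolled
{
    // Inner product unrolled four times, with a tail loop for the
    // size % 4 elements that the unrolled loop does not reach.
    public static float Dot4(float[] a, float[] b)
    {
        int size = a.Length;
        int blockEnd = size - size % 4;   // largest multiple of 4 not above size
        float c = 0.0f;
        int i = 0;
        for (; i < blockEnd; i += 4)
        {
            c += a[i] * b[i];
            c += a[i + 1] * b[i + 1];
            c += a[i + 2] * b[i + 2];
            c += a[i + 3] * b[i + 3];
        }
        for (; i < size; i++)             // leftover 0 to 3 elements
            c += a[i] * b[i];
        return c;
    }

    public static void Main()
    {
        // Six elements: four go through the unrolled loop, two through the tail.
        float[] a = { 1f, 2f, 3f, 4f, 5f, 6f };
        float[] b = { 1f, 1f, 1f, 1f, 1f, 1f };
        Console.WriteLine(Dot4(a, b));
    }
}
```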

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)

About the Author

Chesnokov Yuriy

Engineer

Russian Federation

Member

Former Cambridge University postdoc (http://www-ucc-old.ch.cam.ac.uk/research/yc274-research.html), Department of Chemistry, Unilever Centre for Molecular Informatics, where I worked on the problem of complexity analysis of cardiac data.
 
As a subsidiary result we achieved 1st place in the annual PhysioNet/Computers in Cardiology Challenge 2006: QT Interval Measurement (http://physionet.org/challenge/2006/)
 
My research interests are: digital signal processing in medicine, image and video processing, pattern recognition, AI, computer vision.
 
My recent publications are:
 
Complexity and spectral analysis of the heart rate variability dynamics for distant prediction of paroxysmal atrial fibrillation with artificial intelligence methods. Artificial Intelligence in Medicine. 2008. V43/2. PP. 151-165 (http://dx.doi.org/10.1016/j.artmed.2008.03.009)
 
Face Detection C++ Library with Skin and Motion Analysis. Biometrics AIA 2007 TTS. 22 November 2007, Moscow, Russia. (http://www.dancom.ru/rus/AIA/2007TTS/ProgramAIA2007TTS.html)
 
Screening Patients with Paroxysmal Atrial Fibrillation (PAF) from Non-PAF Heart Rhythm Using HRV Data Analysis. Computers in Cardiology 2007. V. 34. PP. 459–463 (http://www.cinc.org/archives/2007/pdf/0459.pdf)
 
Distant Prediction of Paroxysmal Atrial Fibrillation Using HRV Data Analysis. Computers in Cardiology 2007. V. 34. PP. 455-459 (http://www.cinc.org/archives/2007/pdf/0455.pdf)
 
Individually Adaptable Automatic QT Detector. Computers in Cardiology 2006. V. 33. PP. 337-341 (http://www.cinc.org/archives/2006/pdf/0337.pdf)

Article Copyright 2008 by Chesnokov Yuriy