Comments and Discussions

|
Hello there,
I am currently working in .NET. I need to know which is more suitable for building a client/server architecture: VC++ or .NET. Our company has a server that was built in VC++, and we are now looking at the possibility of moving it to .NET. BTW, the speed of delivering messages and other networking efficiency issues are very important. So, could you give me any advice on this, or point me to a performance comparison of VC++ and .NET?
Thanks in advance..
Cheers.......
Bye
Nothing Is Impossible In Life
|
|
|
|

|
(MCPP / VBNET/ C#) the performance impact on accessing the db.
Abhi
|
|
|
|

|
In Longhorn the API is not in C but in a .NET-aware language.
Does anyone know what will be the result in Longhorn?
Thanks in advance
Oren.
|
|
|
|

|
When using ngen, do you still have to have the .NET Framework installed on the client machines?
|
|
|
|

|
Dear Anonymous:
ngen.exe only creates a 'pre-compiled' image in the cache, not an .exe that you can then run on a computer without the framework installed. You still need the framework to run it; it's just saving you the JITing time.
Also, don't forget to read my first CP article: http://www.codeproject.com/dotnet/dnlp.asp
, Keep Smiling.
Rai Umair
What is now proved, was once only imagined...
|
|
|
|

|
I was rather disappointed.
So we're looking at performance of languages. The signal is the execution time of the test code. The noise is the overhead from COM interop, etc.
For small test sizes, i.e. small number of primes, the noise outweighs the signal. This makes the tests completely useless at this level.
For the large test, the difference between tests is no more than 1%. We don't know how the overhead affects the results on this scale. Once again, the tests are pretty useless.
There's one small exception: the VB.Net code is 3-4% slower. But this may be due to language-specific constructs. The use of the ReDim statement comes to mind. ILDASM is the tool of choice here: see what code the compiler generates.
A more valid test would be to have the entire test code, including the timing code, written in the language. That will minimize overhead.
Jeffrey
Everything should be as simple as possible, but not simpler. -- Albert Einstein
http://www.extremeoptimization.com/
|
|
|
|

|
I upgraded this to VS.NET 2003, and the ngen numbers were almost identical in the extended tests. VB6 was the fastest in the extended tests.
ATLPrimes 10 33 738
VBPrime 10 111 2644
CSharpPrimes 10 4028 3162
CSharpPrimes (ngen'd) 10 10492 3225
VBNetPrimes 10 4495 3804
VBNetPrimes (ngen'd) 10 4633 5208
MCPPPrimes1 10 7049 4456
MCPPPrimes1 (ngen'd) 10 5303 4498
MCPPPrimes2 10 8389 9353
MCPPPrimes2 (ngen'd) 10 8520 7751
ATLPrimes 1000 7321 7995
VBPrime 1000 5067 6051
CSharpPrimes 1000 10346 9541
CSharpPrimes (ngen'd) 1000 10185 9504
VBNetPrimes 1000 79723 25387
VBNetPrimes (ngen'd) 1000 25531 25144
MCPPPrimes1 1000 13642 11275
MCPPPrimes1 (ngen'd) 1000 12043 12935
MCPPPrimes2 1000 31226 29267
MCPPPrimes2 (ngen'd) 1000 32183 30623
ATLPrimes 100000 3664678 3751235
VBPrime 100000 2441960 3006081
CSharpPrimes 100000 3320130 2919183
CSharpPrimes (ngen'd) 100000 3364131 3105973
VBNetPrimes 100000 5443152 5907290
VBNetPrimes (ngen'd) 100000 5262887 5847501
MCPPPrimes1 100000 2802521 2794613
MCPPPrimes1 (ngen'd) 100000 2952759 2825932
MCPPPrimes2 100000 5890069 5385754
MCPPPrimes2 (ngen'd) 100000 5928540 5482403
|
|
|
|

|
I think the test was too simplistic to draw any conclusions for 'real world' applications.
However, it does show that if you're doing serious number crunching, there is very little difference with large working sets, and an insignificant difference with small working sets.
One big advantage of managed code is that memory allocation is MUCH faster. In C++, new/delete are very expensive operations. With .NET code, new is almost as efficient as allocating stack space, and garbage collecting lots of objects at once requires less time than deleting lots of objects manually one by one.
In most applications, I think you'll find the speed of dynamic memory allocation/deallocation a more important benchmark.
|
|
|
|

|
Has anyone bothered to look at the source code? Not only does the author have the VB.NET project set to incremental build, but it also has Option Strict off, optimizations off, and uses non-short-circuiting operators (And instead of AndAlso) inside the loop test, whereas the C# project has the complete opposite settings: incremental build off, optimizations on, and the short-circuiting operator (&&).
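To make the And versus AndAlso point concrete, here is a minimal C++ sketch. It is not the article's code, and `Counter`, `eager_and`, `short_circuit_and`, and `demonstrate` are hypothetical names invented for illustration. The eager form always evaluates its right-hand operand, as VB.NET's And does, while `&&` skips it once the left side is already false.

```cpp
#include <cassert>

// Hypothetical helper: counts how often the right-hand operand is evaluated.
struct Counter {
    int calls = 0;
    bool check(bool value) {
        ++calls;
        return value;
    }
};

// Eager form, analogous to VB.NET `And`: the right operand is always evaluated.
bool eager_and(bool left, Counter& counter) {
    const bool right = counter.check(true);  // evaluated even if left is false
    return left && right;
}

// Short-circuit form, analogous to `AndAlso` / `&&`:
// the right operand is skipped when left is false.
bool short_circuit_and(bool left, Counter& counter) {
    return left && counter.check(true);
}

// Small self-check of the behaviour described above.
void demonstrate() {
    Counter eager;
    Counter lazy;
    eager_and(false, eager);
    short_circuit_and(false, lazy);
    assert(eager.calls == 1);  // right operand was evaluated
    assert(lazy.calls == 0);   // right operand was skipped
}
```

Inside a tight loop test, the skipped evaluation is work saved on every iteration where the left side fails, which is why the operator choice can skew a benchmark.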
I fixed the code, changed the order of the tests, and ran multiple tests in the same run, that is 10, 100, 1000, 10000, 10, 100, 1000, 10000, 10, 100, 1000, 10000, etc., all in the one run. Why, you may ask? Well, the calling code may get optimized slightly differently or behave differently as new libraries are loaded and as the JIT occurs. Most importantly, running repeated tests in the one run better simulates what we are likely to see when running web services or ASP.NET applications.
The results, interestingly enough, were vastly different from the author's claims. In fact, the VB.NET code outperformed the C# on many occasions, but likewise the C# code outperformed the VB.NET on others. These variations/fluctuations are most likely due to garbage collection.
Furthermore to put this code all into perspective, it's important to note that there is significant COM overhead. The difference can be seen if the managed code is called directly rather than through an interface.
Finally, one important aspect seems to be overlooked. The author is testing only specific parts of the language functionality. In particular, he tests a couple of math functions. Obviously, if code does not perform safety checks, then lower-level code will perform best on these kinds of tests, as these are aimed at series of processor instructions that are often built in these days. A more realistic test would be something that most businesses do every day, such as string processing, or perhaps the rendering of a web page, etc.
|
|
|
|

|
Look at my comments in one of the messages below about the VB.NET project. The projects available in the web site have default settings and also make sure that you are building the release builds and not the debug builds.
The mistakes made in this API make me realise that Microsoft has become big enough that it can shelter morons. If anyone working for me wrote anything this bad and tried to release it, I would kill them and display the body as a warning to the rest of the team. - Christian Graus about C# - GDI+
|
|
|
|

|
I was indeed testing release builds and I was also testing outside of the VS.NET IDE as the IDE hooks into the process regardless of whether debug or release.
But as to your claim of default settings, that, sir, is totally incorrect! I am beginning to think this is more a case of deceit than just bad coding.
|
|
|
|

|
Hi Bill -
There is a... ahhh... bias towards VB in all its forms here. Don't sweat it.
___________
Klaus
[vbbox.com]
|
|
|
|

|
An interesting article, to be sure, but it's only a first step in doing
performance comparisons. A few things leap to my attention, and are
not particularly surprising:
- For short-running programs, startup time of the runtime environment
dominates efficiency of the code optimizer in determining total execution
time.
- The efficiency of long-running programs is very similar across all
environments. This shouldn't be surprising since ultimately, there are only 3
code generators being considered: VC native, .NET JIT, and ngen, and all
were written by the same organization (and presumably make use of very
similar technologies and algorithms).
- There's an extra startup cost in loading a managed assembly into an
unmanaged client.
- The cost of loading an ngen'd assembly into a managed client is
substantially less than that of loading an IL assembly.
One thing that would be interesting to add to the tables is a computation of
"time per additional iteration". So, in the table of "1000 prime" results,
compute the value (time(1000)-time(10))/990 for each combination of
client/server. Another helpful change would be to translate the times into
"clock time" instead of reporting QueryPerformanceCounter() times. I see in
the sample code that QueryPerformanceFrequency was called, but the result
never used.
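As a rough sketch of the two calculations suggested above (not code from the article; the function names are hypothetical), the following C++ converts raw counter ticks to seconds using the frequency that QueryPerformanceFrequency would report, and computes the marginal time per additional prime from two measurements.

```cpp
#include <cassert>
#include <cstdint>

// Convert raw performance-counter ticks to seconds.
// `ticks_per_second` is the counter frequency, i.e. the value that
// QueryPerformanceFrequency reports on Windows.
double ticks_to_seconds(std::int64_t ticks, std::int64_t ticks_per_second) {
    assert(ticks_per_second > 0);
    return static_cast<double>(ticks) / static_cast<double>(ticks_per_second);
}

// Marginal cost per extra prime:
// (time(n_large) - time(n_small)) / (n_large - n_small).
// For the 1000-prime table this is (time(1000) - time(10)) / 990.
double time_per_additional_iteration(double time_small, double time_large,
                                     int n_small, int n_large) {
    assert(n_large > n_small);
    return (time_large - time_small) / static_cast<double>(n_large - n_small);
}
```

Subtracting the small run removes most of the fixed startup cost, so the quotient approximates the steady-state cost of one more prime for a given client/server pair.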
I see that some tests were run on a single CPU machine, and others on a
dual. I'd like to see all testing done on a dual.
It would also be interesting to modify the clients to set their thread
affinity to force all execution onto a single CPU. Even better would be to
run on Windows 2000 DataCenter Server on a dedicated processor, but not many
people have access to that environment... Either of these changes should
reduce the noise in the measurements. Many of these results are within
milliseconds of each other, and little things like bumping the mouse while
the test runs can easily cause tens-of-milliseconds timing variations.
In the multiple calls tables, I think the authors are missing an
opportunity: we know a priori that the all-native combination _should_
be unaffected by multiple runs, yet in the tables, it appears to
be affected. I'd like to see a higher number of repetitions (at least 10), and a
calculation of the mean, variance, and a linear regression of the points for
each client/server combination. Again, everything should be reported in
real-world times. Any differences of less than a few milliseconds should be
considered to be noise, unless a very high number of repetitions has been
run under very controlled conditions (no network, no mouse movement, no
other processes running, etc).
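The statistics asked for here could be computed along these lines. This is only a sketch under assumptions: the function names are hypothetical, the variance is the unbiased sample variance, and the regression is an ordinary least-squares fit of time against run index (0, 1, 2, ...), so a slope near zero would mean repeated runs do not drift.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Arithmetic mean of the measured times.
double mean(const std::vector<double>& xs) {
    assert(!xs.empty());
    double sum = 0.0;
    for (double x : xs) sum += x;
    return sum / static_cast<double>(xs.size());
}

// Unbiased sample variance (divides by n - 1).
double sample_variance(const std::vector<double>& xs) {
    assert(xs.size() >= 2);
    const double m = mean(xs);
    double sum_sq = 0.0;
    for (double x : xs) sum_sq += (x - m) * (x - m);
    return sum_sq / static_cast<double>(xs.size() - 1);
}

// Least-squares slope of time against run index 0, 1, 2, ...
double slope_over_runs(const std::vector<double>& times) {
    assert(times.size() >= 2);
    const std::size_t n = times.size();
    const double x_mean = static_cast<double>(n - 1) / 2.0;
    const double y_mean = mean(times);
    double sxy = 0.0;
    double sxx = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = static_cast<double>(i) - x_mean;
        sxy += dx * (times[i] - y_mean);
        sxx += dx * dx;
    }
    return sxy / sxx;
}
```

With ten or more repetitions per client/server combination, the mean gives the headline figure, the variance indicates how much of any difference is noise, and the slope shows whether later runs are systematically faster or slower than earlier ones.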
While this article and the work that it represents are a good first step, I
do have concerns with the results (and would be very hesitant to use these
results to justify any kind of business decision):
- The algorithm is too simple.
- It makes no use of floating point.
- It does virtually no memory allocation.
- There's no use of value-types.
- The algorithm involves only built-in types which are highly optimized by
the hardware (e.g. int).
- Too little attention to detail with regard to identifying and removing
noise from the measurements.
I think the second conclusion of the article is over-stated:
managed-unmanaged transitions have a cost, which can be significant,
especially in short-lived programs. In the long-running samples, there was
less than 3% difference in execution speed between a managed and an unmanaged
client.
Keep up the profiling!
-cd
|
|
|
|

|
Thanks, Carl, for taking the time to give some very useful feedback. In fact, your post had nearly enough content to be a sort of article on its own.
Carl Daniel wrote:
I see in
the sample code that QueryPerformanceFrequency was called, but the result
never used.
Er, the intention was there to convert to clock time, but in the end I forgot to do it, or I think it was because the figures were not as impressive. For example, between 0.00056 seconds and 0.00451 seconds, even though there is a 10-times difference, people won't see that easily. They'll just see both as very small numbers, as people are normally not used to comparing small fractions.
Carl Daniel wrote:
I think the second conclusion of the article is over-stated:
managed-unmanaged transitions have a cost, which can be significant,
especially in short-lived programs. In the long-running samples, there was
less than 3% difference in execution speed between a managed and an unmanaged
client.
Perhaps true, but one solid conclusion we made was that transitions almost always brought down speed - as in marshaling! In the long run this is not very visible because the transitions are required only at the beginning. Thus for an ASP.NET app that will be called multiple times this may be okay. But a regularly used desktop app might have an issue as it is freshly run each time.
Anyway thanks a lot once again,
Regards,
Nish
Author of the romantic comedy
Summer Love and Some more Cricket
|
|
|
|

|
What flags were used for the C++ tests? Exact cmd line please.
Which test is Managed C++ using /CLR but *no* code changes (i.e., using all unmanaged types as-is)?
Why isn't a System::GC::Collect() being performed as closure to these tests? We all know GC is heavily biased on the back during cleanup (including thread suspensions and separate thread callback Finalizers) and that the front is cheap. I know this particular test isn't really germane in that sense but I have seen plenty that are. Omitting the GC collection phase is pretty biased.
And when I do see GC collection being omitted in various tests, I also invariably don't see anything other than the default allocator used in C++ examples - like any decent C++ programmer wouldn't use a special arena or fixed allocator for perf-sensitive tests. Go figure. Again, in this test that isn't necessary but many do need it to be fair.
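For readers unfamiliar with the "special arena" idea mentioned above, here is a minimal bump-allocator sketch in C++. It is an illustration only, not anything from the article's test code: the `Arena` class and its members are hypothetical, it assumes a fixed capacity, and it supports alignments only up to `alignof(std::max_align_t)`. Allocation is just an offset bump, and everything is released at once with `reset()`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal bump ("arena") allocator: hands out slices of one preallocated
// buffer and releases all of them at once via reset().
class Arena {
public:
    explicit Arena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns `size` bytes aligned to `align`, or nullptr if the arena is full.
    // `align` must be a power of two no larger than alignof(std::max_align_t),
    // which is the alignment the underlying buffer is assumed to have.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        assert(align <= alignof(std::max_align_t));
        // Round the current offset up to the requested alignment.
        const std::size_t aligned = (offset_ + align - 1) & ~(align - 1);
        if (aligned > buffer_.size() || size > buffer_.size() - aligned) {
            return nullptr;
        }
        offset_ = aligned + size;
        return buffer_.data() + aligned;
    }

    // Frees every allocation in one step; no per-object delete is needed.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};
```

A benchmark that compares garbage-collected allocation against plain new/delete, without trying something like this on the C++ side, is measuring the default allocator rather than what a performance-sensitive C++ program would necessarily use.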
Since all the languages above are ultimately written *in* C++, then technically they are all examples *of* C++. So you can't really state that any of them are faster than C++ since they *are* C++.
|
|
|
|

|
I did not see the source link for some reason earlier; now I have the sources.
Why use /Od running a *perf test* in C++? I run benchmarks with /O2 /GL, not /Od. I am trying to understand why someone would do this and call it a test of performance. You favor /ZI for Edit and Continue over optimizing the code, and throw in /EHsc and /RTC1 too?? Please use /O2 /GL and get rid of the other flags, then run the test for C++ again.
Before someone points out that C# also has optimization turned off in its build, I think that not optimizing C++ is costing much more than not optimizing C# here. Besides, just turn them all on and let's see what happens then. It can't hurt to see, right?
If you are going to make statements about comparing performance of a language, please don't tie one (or both) of its hands behind its back and call it a fair fight.
|
|
|
|

|
Anonymous wrote:
Why use /Od running a *perf test* in C++?
Please double check.
We are testing the release builds and not the debug builds. In debug builds, optimizations are turned off. In release builds, the C++ project is optimized for maximum speed.
Anonymous wrote:
You favor /ZI for Edit and Continue over optimizing the code,
You are using Debug builds. There is another configuration, Release; use that. We have used Release mode for all our perf studies.
|
|
|
|

|
Did having OptionStrict = "Off" handicap VB .NET's performance?
Scott Hutchinson
s.c.o.t.t.h.u.t.c.h.i.n.s.o.n@usa.net
(to contact me, remove all dots left of @)
|
|
|
|

|
A great article! Thanks very much.
Have you thought about extending it to non-MS languages like Delphi, C++ Builder, etc.? I've been looking around for performance comparisons between Delphi and VC for some time, but can't find anything out there...
Dylan Kenneally
London, UK
|
|
|
|

|
Since I'm working on a CPU-intensive simulation and considering writing the GUI in C#,
I was especially interested in the time it takes to go in and out of managed code into unmanaged code.
I was interested not only in the COM way of doing it but also in importing from a standard DLL using the [DllImport] attribute, and in using MC++ wrappers as described here.
Besides, it seems that the differences in net calculation times between C# and native C++
(without initialization, JIT compilation, etc.) are about 5% (for this kind of task).
So, first I added a number-of-iterations parameter, thus:
QueryPerformanceCounter(ref t1);
for (int i = 0; i < iter; ++i)
    primes.CalculatePrimes(numPrimes);
QueryPerformanceCounter(ref t2);
Now I could perform tests to assess the cost of a managed-to-unmanaged call.
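The same repeat-and-time pattern can be sketched in portable C++ with std::chrono instead of QueryPerformanceCounter. This is an assumed equivalent, not the code used for the numbers in this post; `time_iterations` is a hypothetical helper, and it reports seconds rather than raw counter ticks.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` `iterations` times and returns the total elapsed wall-clock
// time in seconds, measured with a monotonic clock.
template <typename Work>
double time_iterations(Work work, int iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Passing the call under test as a callable, for example a lambda that invokes the prime calculation, keeps the timing loop identical for every client/server combination, so only the call path changes between measurements.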
Baseline (the usual tests + second parameter is iterations):
ATLPrimes, NativeClient
Primes(10,1)=69
Primes(10,10)=312
Primes(10, 10000)=302854
Primes(1000, 1000)=9399488
-----------------------
ATLPrimes, MClient (C#)
Primes(10,1)=54010 (first run after rebuild All - to clear any cache)
Primes(10,1)=2864 (consecutive run)
Primes(10,1)=847 (third run)
Primes(10,10)=1428
Primes(10, 10000)=398134
Primes(1000, 1000)=9512105
----------------------
CSharpPrime, MClient (C#)
Primes(10,1)=6326 (first run after rebuild All - to clear any cache)
Primes(10,1)=4219(consecutive run)
Primes(10,10)=6229
Primes(10, 10000)=160614 (much faster to go in/out from C# to C#, as expected - but it's even faster than calling a native COM interface from a native client!)
Primes(1000, 1000)=9334085 (again the fastest result)
Now comes the interesting part:
I used two DLLs.
One with native non-COM C++ class performing the same function,
and a global extern "C" function to be called from C# through [DllImport].
The second was an MC++ DLL that wrapped the native one in a managed C++ class.
Now look at the results:
NativeDLL, MClient calling through [DllImport] - no COM
Primes(10,1)=53359 (first run after rebuild All - to clear any cache)
Primes(10,1)=6712 (consecutive run)
Primes(10,10)=7374
Primes(10, 10000)=250587
Primes(1000, 1000)=9455449
And the winner is :
NativeDLL wrapped in Managed C++, MClient - no COM
Primes(10,1)=1383(first try - the rest quite the same...)
Primes(10,10)=1352
Primes(10, 10000)=197415
Primes(1000, 1000)=9491666 (error corrected)
That's it. Sorry for the long message.
Ariel
However many ways there may be of being alive, it is certain that there are vastly more ways of being dead, or rather not alive.
-- Richard Dawkins
|
|
|
|

|
Could you please repost the last result:
Primes(1000, 1000)=12363
It looks a little small.
|
|
|
|

|
corrected,
10x
|
|
|
|

|
Hmm, strange. The number now looks like you ran the DllImport and posted the result under Managed C++ (I'm getting this info by looking at how the lines cross over on the last one, where I'd expect no crossover).
Could I bother you about trying it again?
|
|
|
|

|
Not bothering at all! but the number now is correct –
(and I’ve rechecked the others - you knock off all of my self confidence... )
Chau,
A.
|
|
|
|
This article compares and contrasts the relative performance of various languages like native C++, Visual Basic 6, C#, VB.NET, Managed C++, MC++ and native code mix, ngen'd assemblies etc. using a prime number generation function as a generic benchmark.
| Type | Article |
| Licence | CPOL |
| First Posted | 8 Aug 2002 |
| Views | 249,206 |
| Bookmarked | 43 times |