Comments and Discussions
I read your article, and it may be useful for beginners and perhaps also for intermediate developers.
Every professional program (a web server or a database, for example) uses a system that
works similarly to what you describe as Pattern 2 (though without high-level message collection), so it is not very unique.
I assume you are using WSA with events (I haven't seen the source code, so I cannot actually judge).
You seem to use dedicated threads waiting for the WSA events with something like WaitForSingleObject / WaitForMultipleObjects, plus some sort of queues (maybe STL queues or another suitable collection class).
For this task, I/O completion ports are the more professional (and more efficient)
method (though they require a professional OS like NT, 2000, XP).
Next, what you describe as the problem of Pattern 1 (though it is not really the problem of Pattern 1):
It is never a problem to send 50-byte messages over a network with a much higher MTU.
You are right that the full MTU size may not be used, but a frame smaller than the MTU is transferred faster because fewer bytes are sent, so no transmission capacity is wasted anyway.
Your calculation is simply wrong because MTU is the Maximum Transmission Unit size, not the transmission unit size!
Your program simply collects requests and sends fewer (and bigger) frames, causing unnecessary time gaps (the collecting phase) and longer transmissions between them, so part of what you describe as Pattern 1 is still inside your program.
In addition, the MTU size can vary between systems, and Windows (at least 2000, XP) optimizes these values dynamically for the network environment of each attached network device (POTS modem, DSL, cable, E1, ATM, Ethernet, Token Ring, ...).
Using high MTU values only makes sense for really big data (over slow-responding networks), like sending 1 megabyte; but if you send 1 megabyte at once, the MTU size would be used anyway. For small messages like the 50-byte messages it is unimportant. So, e.g., for HTTP or FTP a high MTU is recommended. For message transfer it does not matter, as long as the maximum message size is less than the MTU.
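To put rough numbers on the point that a small frame simply occupies the link for less time, here is a minimal sketch. The overhead figures are my assumptions (TCP/IPv4 without options over classic Ethernet, no VLAN tag), and the function name is made up for illustration.

```python
# Assumed per-frame overheads; adjust them for other link types.
ETH_OVERHEAD = 38        # preamble + SFD 8, header 14, FCS 4, inter-frame gap 12
IP_TCP_HEADERS = 40      # IPv4 header 20 + TCP header 20, no options
LINK_BPS = 100_000_000   # 100 MBit Ethernet


def wire_time_us(payload_bytes):
    """Microseconds that one frame carrying payload_bytes occupies the link."""
    frame_bytes = payload_bytes + IP_TCP_HEADERS + ETH_OVERHEAD
    return frame_bytes * 8 / LINK_BPS * 1e6


small = wire_time_us(50)     # one 50-byte message in its own frame
full = wire_time_us(1460)    # a full frame at an MTU of 1500

print(f"50-byte message: {small:.2f} us")   # 10.24 us
print(f"full-size frame: {full:.2f} us")    # 123.04 us
```

Under these assumptions the small frame is off the wire roughly twelve times sooner than a full one, so an unused part of the MTU is not transmitted and costs no link time.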
If your test program were faster, you would see that the operating system automatically
concatenates messages (the Nagle algorithm, which you have described) when maximum throughput cannot be reached with a smaller transmission unit size, which will happen, e.g., at a throughput of more than about 7 000 messages/s on a 100 MBit Ethernet network.
So it is useless to implement this a second time at a higher level.
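As a small illustration that this coalescing lives in the TCP stack and not in the application, the sketch below inspects and switches off the Nagle algorithm through the standard TCP_NODELAY socket option. The helper name is hypothetical, and the "on by default" result is what common stacks report, not something taken from SocketPro.

```python
import socket


def nagle_enabled(sock):
    """Return True while the stack may coalesce small writes (TCP_NODELAY unset)."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
default_state = nagle_enabled(sock)   # Nagle is normally on for a new TCP socket

# Setting TCP_NODELAY asks the stack to send small writes immediately.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
after_disable = nagle_enabled(sock)
sock.close()

print(default_state, after_disable)
```

An application that batches messages itself usually wants to know which of these two modes it is running in, because batching on top of an active Nagle algorithm means two layers are doing the same work.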
Even more: because the OS does this anyway and you do not seem to account for it, your program could get into trouble at higher speeds,
if your SocketPro were able to reach these speeds, because packets will be split at frame/buffer boundaries no matter whether they are optimized for a certain MTU size, and the receiver in SocketPro will then have problems piecing them together again.
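What I mean by piecing them together: TCP delivers a byte stream, so a receiver cannot rely on one recv call returning exactly one message. A minimal sketch of a receiver that copes with arbitrary split points, assuming a hypothetical 4-byte big-endian length prefix (this is not SocketPro's wire format, which I have not seen):

```python
import struct


def frame(payload):
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(payload)) + payload


class MessageReassembler:
    """Rebuilds whole messages from chunks cut at arbitrary byte positions."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk):
        """Add received bytes and return every message that is now complete."""
        self._buf += chunk
        messages = []
        while len(self._buf) >= 4:
            (length,) = struct.unpack_from("!I", self._buf, 0)
            if len(self._buf) < 4 + length:
                break  # body not fully received yet
            messages.append(bytes(self._buf[4:4 + length]))
            del self._buf[:4 + length]
        return messages


stream = frame(b"first") + frame(b"second request")
r = MessageReassembler()
part1 = r.feed(stream[:3])     # not even a complete length prefix
part2 = r.feed(stream[3:11])   # completes the first message, starts the second
part3 = r.feed(stream[11:])    # completes the second message
print(part1, part2, part3)
```

The split points in the usage lines are arbitrary; the receiver has to give the same result wherever the stack cuts the stream.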
The problem of Pattern 1 exists mainly on the client side, and the real problem is that
if a client does not send a lot of requests, the server cannot process a lot of requests.
You have simply found the explanation of why multitasking exists and why it makes sense for clients.
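A back-of-the-envelope sketch of that client-side limit, assuming the server and the network are never the bottleneck. The function name and the example values are illustrative, not measurements.

```python
def max_requests_per_second(round_trip_ms, outstanding=1):
    """Upper bound on the request rate of a client that keeps `outstanding`
    requests in flight and waits one round trip for each reply."""
    return outstanding * 1000.0 / round_trip_ms


one_at_a_time = max_requests_per_second(2.5)              # strictly synchronous client
pipelined = max_requests_per_second(2.5, outstanding=32)  # 32 requests in flight

print(one_at_a_time, pipelined)   # 400.0 12800.0
```

A client that waits for each reply before sending the next request is capped by the round-trip time no matter how fast the server is, which is why issuing requests from several threads (or asynchronously) raises the load a single client can generate.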
Okay, let's get back to the sockets:
A disadvantage of your SocketPro (other than that it does not deliver maximum throughput)
is that you collect requests on the client and do not send them immediately. This can result in
starvation of clients that issue few requests and, in any case, in increased network round trips.
I wonder how many requests your SocketPro can process. The socket system we use can deliver, on a standard
PC, a throughput that a 100 MBit network card cannot handle (more than 15 000 messages/s),
and it does not matter whether the requests are 50 or 5000 bytes long. We are currently optimizing this software to reach
25 000 to 50 000 messages/s per CPU with messages of 100 to 5120 bytes, so I am interested in every method to squeeze out some
more performance, but in my opinion your message precollection reduces performance (see above).
You also mentioned DCOM etc. as a performance comparison, but whatever numbers you measured, what you report may be a result
of the DCOM logic itself, and you haven't measured or compared the actual throughput anyway. You measured only the network round-trip delay, which is unimportant for almost all applications and low for DCOM in any case. Whether you have 0.5, 2.5 or 30 ms does not matter on the Internet, with its delay times of >100 to 1000 ms, and on an intranet anything below 50 ms is so fast that the user cannot detect any difference in application performance. So the measured throughput in messages/second is the important number; for a server, e.g., it determines how many clients can be handled.
I am very interested in your performance measurements.
Depending on them, you and I could prove my criticism true or false.
I haven't seen your source code, so my assumptions rely on your high-level description.
Thanks for your description anyway, though the links are not working properly.
|First Posted ||27 Jan 2002|
|Bookmarked ||37 times|