|
Environment: Windows XP Pro, Visual Studio 2008, MFC, C++
This application processes real-time telemetry data. The first incarnation uses blocking TCP calls and works fine; it is just difficult to deal with due to the blocking calls.
The second implementation uses CAsyncSocket. After much work I discovered that it can keep up when about 1/3 of the data is processed. It cannot keep up with all the data.
I say that with some level of confidence because the code monitors the depth of the buffering queue. Since the telemetry data never stops, when the TCP/IP layer returns the WSAEWOULDBLOCK error the code buffers payload packets until it gets the OnSend. At the one-third data rate the maximum buffer fill level is 91 payload packets. At the full rate the buffer (currently 240 deep) overflows frequently.
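The buffering logic is roughly this shape (a sketch with hypothetical names, not the actual application code; `trySend` stands in for the CAsyncSocket::Send call that can fail with WSAEWOULDBLOCK):

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical payload packet; the real packet layout is not shown here.
using Packet = std::vector<char>;

// Bounded queue that buffers packets while the socket reports
// WSAEWOULDBLOCK, and records the high-water mark so the fill
// level can be monitored as described above.
class SendQueue {
public:
    explicit SendQueue(std::size_t capacity) : capacity_(capacity) {}

    // Returns false (and counts an overflow) when the queue is full.
    bool push(Packet p) {
        if (queue_.size() >= capacity_) { ++overflows_; return false; }
        queue_.push_back(std::move(p));
        if (queue_.size() > highWater_) highWater_ = queue_.size();
        return true;
    }

    // Called from OnSend: drain as many packets as the socket accepts.
    template <typename SendFn>
    void drain(SendFn trySend) {
        while (!queue_.empty() && trySend(queue_.front()))
            queue_.pop_front();
    }

    std::size_t depth() const { return queue_.size(); }
    std::size_t highWater() const { return highWater_; }
    std::size_t overflows() const { return overflows_; }

private:
    std::size_t capacity_;
    std::size_t highWater_ = 0;
    std::size_t overflows_ = 0;
    std::deque<Packet> queue_;
};
```

The high-water mark and overflow count are what produce the "91 at one third rate" and "240-deep buffer overflows" observations.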
My next step might be to go to the Win32 API level.
I started with MFC because the vendor whose software feeds me the data provided an MFC template that showed how to get data from their application. It was written in Visual Studio 2008 MFC. During real time operations it does no user interactions and no GUI updates.
Or maybe I should just abandon the asynchronous effort due to a combination of too much effort, too little return, and the possibility that it simply cannot be done.
Drawing on the reader's experience, and on Richard MacCutchan's response in my previous thread, and with a packet rate well in excess of 10 per millisecond:
What is the probability of success if I switch over to Win32 API programming? Should I switch to a console application? Is the asynchronous code inherently less efficient, or is this more likely a problem with the CAsyncSocket class? (I presume it loses some efficiency in the trade-off for ease of use.) Or maybe it is an MFC problem, or some combination thereof.
Right now I am leaning towards abandoning the asynchronous effort and just using the blocking code that works well. I would like to hear your thoughts.
Thanks for your time
|
|
|
|
|
Networking is always tricky because you have to tweak the parameters of your networking code to get as nearly optimal results as you can. Regardless of the solution you choose, you should definitely try to set the send and/or receive buffer sizes associated with your socket handle (a setsockopt() call with the SO_RCVBUF/SO_SNDBUF parameters). Set the buffer sizes to, say, 1 megabyte, and then halve them until your network performance starts degrading. Note that even if you use async sockets the OS is still receiving and storing data in the background into the receive buffer of your socket, and you can then read that out with a single call! Unfortunately, if you use a socket implementation that doesn't give you direct access to the size parameter of your send()/recv() calls, you cannot tweak those.

Async or blocking? I think this shouldn't be the question. If done well, they should have about the same performance, because blocking/async communication is really just two different ways to write/read the send and receive buffers of the socket object; the real networking is done by the operating system in the background, by the network stack that works with those buffers. I prefer async because it is a superset of blocking sockets: you cannot solve every problem with blocking sockets alone. The highest-performance servers also use async, because the OS can optimize lots of async operations from different sources much more effectively anyway.
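A minimal sketch of the SO_RCVBUF/SO_SNDBUF tuning described above, written against the BSD socket API so it is self-contained (the Winsock calls have the same shape, after WSAStartup()). The 1 MB figure is the illustrative starting point from the text; note the OS may clamp or round the requested size, so read it back with getsockopt() to see what you actually got:

```cpp
#include <cassert>
#include <sys/socket.h>   // on Windows: <winsock2.h>, plus WSAStartup() first
#include <unistd.h>

// Create a TCP socket and request larger kernel buffers for it.
int tuned_socket(int requested_bytes) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0) return -1;
    setsockopt(s, SOL_SOCKET, SO_RCVBUF,
               &requested_bytes, sizeof(requested_bytes));
    setsockopt(s, SOL_SOCKET, SO_SNDBUF,
               &requested_bytes, sizeof(requested_bytes));
    return s;
}

// Ask the OS what receive-buffer size it actually granted.
// (Linux, for example, reports double the requested size, and
// caps requests at a system-wide maximum.)
int effective_rcvbuf(int s) {
    int value = 0;
    socklen_t len = sizeof(value);
    getsockopt(s, SOL_SOCKET, SO_RCVBUF, &value, &len);
    return value;
}
```

Halving the request and re-measuring throughput, as suggested above, is done by calling `tuned_socket()` with successively smaller values.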
To be honest, I have always written the async socket class myself, mainly because of cross-platform development. I think writing basic async socket code is no big deal, but you will probably face some problems if you don't read the Win32 API docs carefully. On Windows you have several choices for handling a single async socket (select, WSAWaitForMultipleEvents, IOCP, overlapped I/O, and so on). To handle a single async socket on its own dedicated thread I have always used WSAWaitForMultipleEvents, as it is quite easy to use with its companion functions (WSAEventSelect/WSAWaitForMultipleEvents/WSAEnumNetworkEvents). Why WSAWaitForMultipleEvents and not the cross-platform select? Because with WSAWaitForMultipleEvents you can wait on the socket and also on a custom event of yours (created with WSACreateEvent). Sometimes you have to wake up the wait explicitly, for example when your program quits, or when you add sendable data to your empty send buffer while your network thread is waiting. Doing the same wakeup with select() is always tricky and dirty.
Note that previously we were talking only about exchanging data between your application and the network stack (the socket buffers). There is a delay between the send or recv calls you use to transfer data between the socket buffers and your application's memory buffers. If this delay is large, for example because your network thread also does other work, or because you have no dedicated network thread at all, you will have to compensate with larger SO_SNDBUF and SO_RCVBUF values, because the OS might run out of receive-buffer space on the socket, or run out of sendable data, while your network thread is busy elsewhere in your app. I highly recommend using dedicated threads that do nothing but transfer data between your memory buffers and the socket buffers.
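The dedicated-thread idea can be sketched with standard C++ threading primitives (the names here are illustrative, and `readFromSocket` is a stand-in for recv()): one thread does nothing but move data from the socket into an application-level queue, so the socket's receive buffer is drained promptly even while the rest of the app is busy.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

class RecvPump {
public:
    // Runs on the dedicated network thread: drain the socket into
    // the application-level queue as fast as possible.
    template <typename ReadFn>
    void run(ReadFn readFromSocket) {
        std::string chunk;
        while (readFromSocket(chunk)) {       // false = connection closed
            std::lock_guard<std::mutex> lock(m_);
            q_.push(chunk);
            cv_.notify_one();                 // wake the processing thread
        }
        std::lock_guard<std::mutex> lock(m_);
        done_ = true;
        cv_.notify_one();
    }

    // Called by the processing thread; blocks until data or shutdown.
    // Returns false once the pump has finished and the queue is empty.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || done_; });
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop();
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::string> q_;
    bool done_ = false;
};
```

In the Windows version discussed above, the `run()` loop would sit in WSAWaitForMultipleEvents instead of calling a blocking read function directly.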
Another possible bottleneck is your application's data processing: filling the application-layer send buffer, reading the application-layer receive buffer, and doing other work with the data (calculations, file I/O, ...). For example, if your data-processing thread doesn't read your memory buffer quickly enough, the network thread that reads data from the socket buffer may have to suspend transferring data from the socket receive buffer to your application-level memory buffer because it is full. Then, if the OS/network stack fills up the socket's receive buffer completely, you can't keep up with the network bandwidth and performance suffers.

Let's assume you are doing calculations with the received data and then writing it out to disk. If some kinds of data require more processing than the average kind, you may want to compensate for those "negative performance peaks" of your processing thread with a larger receive memory buffer in your application, but you benefit from this only if the average processing speed of your application is greater than what the average incoming data requires. This shows quite well that you can tune a networking application only if you know the exact specs (server hardware configuration, network config, ...). There is no single good solution. You will have to profile each part of your application independently (network recv/send speed, data processing, file I/O, ...) and then work out where to put buffers, and maybe threads, between the parts.
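The buffer-sizing argument above can be reduced to back-of-the-envelope arithmetic: a buffer only helps if the average processing rate covers the average input rate, and its size must cover the worst burst. All figures in the sketch below are illustrative, not taken from the original post.

```cpp
#include <cassert>

// Minimum buffer depth needed to absorb a burst where input
// temporarily exceeds the drain (processing) rate. Returns 0 when
// the drain rate covers the burst, i.e. no buffering is needed.
long long buffer_for_burst(long long burst_in_per_ms,
                           long long drain_per_ms,
                           long long burst_len_ms) {
    long long excess = burst_in_per_ms - drain_per_ms;
    return excess > 0 ? excess * burst_len_ms : 0;
}
```

For example, a burst of 15 packets/ms against a drain of 5 packets/ms lasting 100 ms needs a 1000-slot buffer; a 240-deep queue overflows long before the burst ends.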
|
|
|
|
|
Wow, what an incredible reply.
I don't understand everything you wrote, but a question or two will help.
This is a telemetry application. It receives data from the hardware as a series of identified parameters. It extracts messages from within the list of parameters and outputs the data via TCP/IP to the display device, but does not input anything via TCP/IP.
When I switched to asynchronous TCP/IP I discovered WSAEWOULDBLOCK and created an array that buffers data until OnSend is called. That worked OK at relatively slow packet rates, around two per millisecond.
Upon adding another set of payload packets (there are several types of payload packets that can be inhibited or enabled at run time), the payload packet rate jumped up to five to fifteen or so per millisecond. They are generally smaller packets, but the order is indeterminate, so it is very intensive to combine payload packets to reduce the overall packet rate.
When that happened, the app ran out of buffer space and started losing payload packets. I bumped the buffer size up from 16 to 32 to 64, then jumped to 240. It still overflowed.
My interpretation is that CAsyncSocket cannot keep up with this packet rate. When I use blocking TCP/IP calls it works okay. I can even run four simultaneous copies with no trouble.
As I understand your post, I am thinking that this probably cannot be accomplished with CAsyncSocket.
Is that a true or false statement?
Don't write too much, I will certainly need to think a while, and re-read your post depending on how you answer this question.
Thanks for your time
|
|
|
|
|
As I see it, you just get some data from one place and send it over to another; your app is a transmitter between two endpoints. Fortunately, this is quite simple. If you are not allowed to drop data, then your send speed must be at least as high as your receive speed. A buffer helps only in smoothing away jitter in the incoming data to maintain a better average throughput. If your incoming data rate exceeds what you can send, the problem cannot be solved. If you are able to reach the required send speed with blocking sockets, you should be able to do the same with async as well. Anyway, why do you want to use async sockets?
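A tiny simulation makes the point concrete: if average inflow exceeds average outflow, no finite buffer saves you, because the queue depth grows without bound. The rates below are illustrative (packets per millisecond), not measurements from this thread.

```cpp
#include <cassert>

// Simulate a relay queue: each millisecond, 'in_per_ms' packets
// arrive and up to 'out_per_ms' packets are sent on. Returns the
// queue depth after 'ms' milliseconds.
long long depth_after(long long in_per_ms, long long out_per_ms,
                      long long ms) {
    long long depth = 0;
    for (long long t = 0; t < ms; ++t) {
        depth += in_per_ms;
        depth -= (depth < out_per_ms) ? depth : out_per_ms;
    }
    return depth;
}
```

With inflow 10 and outflow 12 the depth stays at zero; with inflow 12 and outflow 10 it grows by 2 every millisecond, so after one second the backlog is 2000 packets, regardless of how the queue is implemented.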
|
|
|
|
|
Hello pasztorpisti,
Re: Anyway, why do you want to use async sockets?
Async makes the application easier to deal with. First, while the client has not yet connected, the main application can still capture data from the source and provide feedback on how it is performing. With blocking calls, once the Listen() is posted the app is stuck there and can do nothing.
Second, if the client closes the connection before my app does, the app is stuck and must be killed. That causes some resource loss, eventually requiring a reboot.
I have mitigated that quite a bit by writing my own client application that can be fired up to release the main application. But I am not always the user, and requiring someone else to do that is a real pain. (BTW: writing the client application was indeed a learning experience.)
I suspect that both of these problems can be resolved by using a separate thread for the TCP/IP part of the application. I found a tutorial and will be working on that aspect.
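One sketch of that separate-thread idea, using BSD sockets so the pattern is visible (an MFC version would instead create and use the socket object inside the worker thread). The main thread stays free while the worker sits in accept(); closing the listening socket from the main thread is one simple way to unblock the worker at shutdown, which addresses the two "stuck app" problems above.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <netinet/in.h>
#include <sys/socket.h>
#include <thread>
#include <unistd.h>

struct Listener {
    int fd = -1;

    // Bind and listen on loopback; returns false on any failure.
    bool start(unsigned short port) {
        fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) return false;
        int yes = 1;
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = htons(port);
        if (bind(fd, (sockaddr*)&addr, sizeof(addr)) < 0) return false;
        return listen(fd, 1) == 0;
    }

    // Blocks in accept() on its own thread; returns the client fd,
    // or -1 once the listening socket is closed by the main thread.
    int acceptOne() { return accept(fd, nullptr, nullptr); }

    void stop() { if (fd >= 0) { close(fd); fd = -1; } }
};
```

The port number and struct name are illustrative; the point is only that the blocking accept lives on a thread the main application never waits on.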
However, I have a working version and do not have unlimited time to devote to this project.
pasztorpisti, Richard M, and others,
Thank you very much for the time you have spent answering my questions and making suggestions. I am very grateful.
Thanks for your time
-- modified 28-Dec-12 9:15am.
|
|
|
|
|
You are welcome! If your time is limited (as I suspected), then stick with a working solution if you already have one. You can experiment with better solutions later if threading/sockets interest you.
|
|
|
|
|
RE: I highly recommend using dedicated threads that do just the transfer between your memory buffers and the socket buffers.
Do you have a favorite tutorial or discussion on how to implement a dedicated thread like this? Something for someone who has never tried multiple threads.
Thanks for your time
|
|
|
|
|
Unfortunately I don't know of any good tutorials, because I never needed one; I learnt from teammates and from my own experiments. You don't always benefit from dedicated threads; that depends on the scenario, and you cannot find out without trying.
|
|
|
|
|
This is a great answer, I really wish I could vote on it.
Quote: Networking is always tricky
You are totally correct. It looks so simple, and we use networking applications all the time, but as soon as you have to do a bit more than a basic chat sample, small surprises crop up all over and you start racking up "experience points" as you try to solve them.
Soren Madsen
|
|
|
|
|
Soren,
You are quite right. It is easy to forget, and much easier to never even realize, just how much effort has gone into the core code of the operating systems we use every day, and how much goes on that we never see.
It is easy to complain about Microsoft or any other OS vendor. The reality is that all of them do a quite difficult job very well.
Thanks for your time
|
|
|
|
|
Thank you! Pretty much all areas of programming can become tricky if you cross the line. I've always considered myself a generalist programmer and I'm definitely not a network expert. I've just had the luck to work on some specialized networking apps.
|
|
|
|
|