A while ago I posted about chunked responses - what they are and the importance of them. It turns out that we (where I work) were getting it all wrong.
We implemented chunked responses (or at least thought so) quite a while ago, and it WAS working, in the beginning, but all of a sudden stopped.
How did I come to realize this ?
While analyzing waterfall charts of our site, which I've been doing regularly for quite a while now, I realized that the response doesn't look chunked.
It's not trivial realizing this from a waterfall chart, but if you look closely and you're familiar with your site's performance you should notice this. Since the first chunk we send the client is just the html 'head' tag, this requires almost no processing so it can be sent to the client immediately, and it immediately causes the browser to start downloading resources that are requested in the 'head' tag. If a response is chunked, in the waterfall, you should see the resources starting to be downloaded before the client even finishes downloading the html response from the site.
A proper chunked response should look like this
If you look closely you will realize that the response took long to download, which doesn't match the internet connection we chose for this test, which means the download didn't actually take that long, but the server sent part of the response, processed more of it, and then sent the rest.
Here's an image of a response that isn't chunked :
You can see that the client only starts downloading the resources required in the 'head' after the whole page is downloaded. We could've saved some precious time here, and have our server work parallel to the client that is downloading resources from our CDN.
What happened ?
Like I said, once this used to work and now it doesn't. We looked back at what was done lately and realized that we switched load balancers recently. Since we weren't sending the chunks properly, the new load balancer doesn't know how to deal with this and therefore just passes the request on without chunks to the client.
In order to investigate this properly, I started working directly with the IIS server...
What was happening ?
I looked at the response with Fiddler and WireShark and realized the response was coming in chunks, but not 'properly'. This means the 'Transfer-Encoding' header wasn't set, and the chunks weren't being received in the correct format. The response was just being streamed, and each part we had we passed on to the client. Before switching load balancer, it was being passed like this to the client, and luckily most clients were dealing with this gracefully. :)
So why weren't our chunks being formatted properly ?
When using asp.net, mvc, and IIS 7.5 you shouldn't have to worry about the format of the chunks. All you need to do is call '
HttpContext.Response.Flush()' and the response should be formatted correctly for you. For some reason this wasn't happening...
Since we're not using the classic Microsoft MVC framework, but something we custom built here, I started digging into our framework. I realized it had nothing to do with the framework, and was more low level in Microsoft's web assemblies, so I started digging deeper into Microsoft's code.
Using dotPeek, I looked into the code of 'Response.Flush()'...
This is what I saw :
As you can see, the code for the IIS 6 worker is exposed, but when using IIS7 and above it goes to some unmanaged dll, and that's where I stopped going down that path.
I started looking for other headers that might interfere, and started searching the internet for help... Couldn't find anything on the internet that was useful (which is why I'm writing this...), so I just dug into our settings.
All of a sudden I realized my IIS settings had the 'Enable HTTP keep-alive' setting disabled. This was adding the header 'Connection: close' which was interfering with this.
I read the whole HTTP 1.1 spec about the 'Transfer-Encoding' and 'Connection' headers and there is no reference to any connection between the two. Whether it makes sense or not, It seems like IIS 7.5 (I'm guessing IIS 7 too, although I didn't test it) doesn't format the chunks properly, nor add the 'Transfer-Encoding' header if you don't have the 'Connection' header set to 'keep-alive'.
Jesus! @Microsoft - Couldn't you state that somewhere, in some documentation, or at least as an error message or a warning to the output when running into those colliding settings?!!
Well, what does this all mean ?
The 'Connection' header indicates to the client what type of connection it's dealing with. If the connection is set to 'Close' it indicates that the connection is not persistent and will be closed immediately when it's done sending. When specifying 'keep-alive' this means the connection will stay opened, and the client might need to close it.
In the case of a chunked response, you should indicate the last chunk by sending a chunk with size '0', telling the client it's the end, and they should close the connection. This should be tested properly to make sure you're not leaving connections hanging and just wasting precious resources on your servers.
(btw - by not specifying a connection type, the default will be 'Keep-Alive').
If you want to take extra precaution, and I suggest you do, you can add the 'Keep-Alive' header which indicates that the connection will be closed after a certain amount of time of inactivity.
Whatever you do, make sure to run proper tests under stress/load to make sure your servers are managing their resources correctly.
Additional helpful resources