I’ve been doing a lot of work lately implementing distributed caching systems for various clients. During my initial scoping, I found a lot of information out there comparing the performance between cache types (
Memcached, etc.), however, I could find very little comparing the performance of caching vs the actual database (in my case,
AppFabric vs SQL Server 2005), just that it’s “much better”.
Despite this lack of statistical information, I went ahead with caching anyway (after all, ‘much better’ sounds pretty good), and because I’m a Microsoft shop, I left it tidy by selecting
AppFabric to ease the load on SQL Server.
Now that I’ve implemented the code to a decent extent through a particular website, I’ve been able to conduct my own performance benchmarks and here I’d like to share the results.
Note in particular that these are what I would call real-world results. I didn’t attempt to isolate cache access based on particular SQL queries or size of the item. I didn’t reset the cache between web pages. I simply used the website in the same way I’d expect my users to, and recorded the overall times.
My methodology was fairly simple and certainly prone to a margin of error:
- Installed an ASP.NET website to a development machine which I knew I had sole access to. SQL Server is installed on the same box as the web server, however I had a separate (and dedicated) server containing the cache.
- Switch the cache off (via
- Browse through a set number of web pages, without refreshing the client browser or anything tricky
- Made a note of the pages I traveled through, then turned the cache on and went through the same pages again
I had simple code around my ‘cache access’ block which simply counted the number of milliseconds the application spent trying to Save/Load the items (or bypassing if the cache was off) and logged them to a text file.
Data access is split into two parts – Save and Load. Note that SQL Server does not have the burden of a
Save method, and
AppFabric of course only calls
Save the first time the item is accessed.
| ||AppFabric||SQL Server||Ratio|
- AppFabric increases my data-access speeds by almost 5x over SQL Server. Thank goodness for that, and of course the longer you cache an item, the greater this efficiency will get. Again, this is not saying that
AppFabric accesses data 5x faster than SQL Server on a call-by-call basis, it is the overall time benefit of implementing caching.
- AppFabric has an additional overhead in saving a newly calculated item back to the cache. In my case, it was 2,265ms for the 81,960ms I spent loading items – a ratio of about 1/36. The longer I ran the test (or cached the items before they expired), the better this ratio would have become.
AppFabric clearly has performance gains over SQL Server – in my limited test, it made my data access 5x faster, and a longer test expiry period would have made this much much greater.
Note also that the website I was using had (test) database tables less than ~1,000,000 records, although some fairly funky SQL queries are being made here and there. For an even larger database such as Facebook (or my client’s database, hopefully) the SQL queries would take longer, but (I suspect) the caching times would remain exactly the same – another point in favour of
Uh-oh…One More Thing (and it sucks)
Now that you’ve read this far, I’ll tell you the real reason I ran these tests, which is that I was convinced that
AppFabric actually made my site slower. Not because SQL was better, but because the previous caching system I switched out was in fact the good old ASP.NET
HttpCache utility. Like
AppFabric, this cache is entirely in-memory but because it doesn’t concern itself with regions, tags, (and much more), it actually runs much much much faster. Let me type that again so Google picks it up – AppFabric is not nearly as fast as the built-in ASP.Net HttpCache utility. I ran the same test as above, but using the
| ||AppFabric||SQL Server||HttpCache|
Yes, that’s right – AppFabric is 2.1x slower than the built-in ASP.Net caching utility. Very sad, especially when I see my site slow down after all my hard work. However, I’m sticking with AppFabric:
HttpCache resets whenever the IIS App Pool resets, when you redeploy and any other time it feels like it. I suspect if I ran this test over a long time (say, a week) then the results would be closer
- There is no tagging in
HttpCache – you have to roll your own by integrating into the key – and the subsequent parsing to ‘find by tag’ is slow
HttpCache cannot be expanded easily to hold gigabytes of data – it simply won’t work when my sites expand
- The distributed nature of
AppFabric allowed me to build a separate ‘admin’ tool where I can look into the cache from another website, count it, clear it, etc.
- And if I’m honest, the final reason is that this type of caching is all in vogue at the moment and it’s something I feel I should be part of
One thing I’m wondering about is actually using a combination of both – extremely high-access queries (like user permissions) could be
AppFabric to handle the larger
datasets. Bit of an art form, I reckon.