Out of curiosity I found out that WCF with basicHttpBinding can easily be beaten (performance-wise) by plain old ASP.NET, even when stripped of transactions, reliability, security etc. (1,500 req/sec vs. 800 req/sec).
Also, SQL Server Express can easily handle 1,300 inserts per second and up to 4,300 queries per second just as well. This on a table of 6+ million rows, stored on the hard disk, not in RAM.
A few days ago I was thinking about how I implemented a certain web service a few years ago, a few employers ago. Although it was quite fast and efficient, it wasn’t scalable. So I started thinking about how I should have implemented it.
The web service had to receive an incoming (public) HTTP request and check for a visitor cookie. If there was a visitor-identifying cookie, it would check against a data store (an in-memory dictionary at the time) to see whether that visitor had already answered.
It was about inviting visitors of certain sites to an online survey. A new visitor would be presented with a pop-up box having a “yes”, a “no” and an “X” (close) button.
The business rules stated that if the visitor answered yes, the answer would be stored, the pop-up would close and a new tab/window would open with the survey. If the visitor answered no, the same things would happen, except for opening the survey. If the visitor closed the pop-up, it would appear again on the next visit. If the visitor closed the pop-up three times, (s)he wouldn’t be bothered with the invitation anymore.
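The rules above amount to a small state machine per visitor. Here is a minimal sketch of that decision logic; the `VisitorState` shape and the member names are mine, for illustration, not the original code:

```csharp
// Sketch of the invitation rules; names are illustrative, not the original code.
enum Answer { Yes, No, Closed }

class VisitorState
{
    public Answer? FinalAnswer; // set once the visitor answers yes or no
    public int CloseCount;      // how many times the pop-up was dismissed
}

static class InvitationRules
{
    // Show the pop-up only to visitors who haven't answered and
    // haven't closed it three times already.
    public static bool ShouldShowInvitation(VisitorState s)
    {
        if (s == null) return true;               // brand-new visitor
        if (s.FinalAnswer.HasValue) return false; // already answered yes/no
        return s.CloseCount < 3;
    }

    public static void RecordResponse(VisitorState s, Answer a)
    {
        if (a == Answer.Closed) s.CloseCount++;   // a close just counts
        else s.FinalAnswer = a;                   // yes/no is final
    }
}
```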
Quite simple, I’d say. The tough part was that we only had one machine available (a quad-core Xeon at 3.5 GHz with 4 GB of RAM) and the expected traffic was 300-400 req/sec during normal load, with a peak load of maybe 800-1,000 req/sec. In that situation I decided to use a simple in-memory dictionary as the data store, backed up to disk a few times per hour. Things went smoothly (at least until I left the company, but also after I left, as an ex-colleague of mine told me).
You can easily see that this architecture, although efficient, is not scalable. At least not if you scale out. (I personally find efficiency and the ability to scale out completely independent.)
So how would I reimplement this (if I were to) in a scalable manner? Balancing and scaling the front-facing web servers would be straightforward: a simple hardware NLB (network load balancer) in front of n web servers, and that’s that. In the back there would be m storage servers that would just store the visitors’ responses and serve query responses to the web servers. Having naturally partitionable data such as this (especially because it is not inter-related) makes the “sharding” easy.
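With data partitioned by visitor ID, routing a request to a shard can be as simple as a modulo over the server list. A rough sketch of that idea, with made-up server addresses:

```csharp
// Sketch of visitor-id-based sharding; server addresses are illustrative.
static class Shards
{
    static readonly string[] StorageServers =
        { "storage-0:1433", "storage-1:1433", "storage-2:1433" };

    public static string ServerFor(long visitorId)
    {
        // Ids are generated randomly, so a plain modulo spreads load evenly.
        int index = (int)((ulong)visitorId % (ulong)StorageServers.Length);
        return StorageServers[index];
    }
}
```

Adding capacity means adding a server and rebalancing, but since the rows are independent (no joins across visitors), moving a slice of ids to a new shard is mechanical.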
The questions that arise are:
- What type of communication should I employ between the web servers and the storage servers?
- What type of storage should I use on the storage servers?
For communication I tested and benchmarked WCF (with basicHttpBinding) hosted in IIS and a simple ASP.NET generic handler (.ashx thingie) in an empty web application. As for the storage, I only tested SQL Server 2008 R2 Express with a database stored on the hard disk (I fancied using MySQL with an in-memory storage engine, or SQL Server with a database stored on a RAM disk and a periodic backup to a hard disk). However, the test results with a standard DB on SQL Server satisfied me and I didn’t need to go any further.
A few words on the machine I used for testing, my el-cheapo laptop: a two-year-old AMD dual core with 3 GB of RAM, running Windows 7 Ultimate 64-bit. (complete specs here)
I started by testing the communication options. I set up a WCF service hosted in IIS Express, with minimal features, intended to max out performance (no security, singleton service instance, multiple concurrent calls, 1,000 max concurrent connections and so on). I then made a small console app that would set up 10 threads, each of which would sequentially make 1,000 calls to the service, measuring the total time.
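The harness was roughly the following shape. The service contract (`IDataService`/`GetData`) and the endpoint address below are my guesses for illustration; the original contract isn’t shown in this post:

```csharp
// Sketch of the test harness: 10 threads, each making 1,000 sequential
// calls, timed as a whole. Contract and address are hypothetical.
using System;
using System.Diagnostics;
using System.ServiceModel;
using System.Threading;

[ServiceContract]
interface IDataService
{
    [OperationContract]
    byte GetData(long id, int opId);
}

class Harness
{
    const int Threads = 10, CallsPerThread = 1000;

    static void Main()
    {
        var factory = new ChannelFactory<IDataService>(
            new BasicHttpBinding(),
            new EndpointAddress("http://localhost/DataService.svc"));

        var sw = Stopwatch.StartNew();
        var workers = new Thread[Threads];
        for (int t = 0; t < Threads; t++)
        {
            workers[t] = new Thread(() =>
            {
                var proxy = factory.CreateChannel(); // one channel per thread
                for (int i = 0; i < CallsPerThread; i++)
                    proxy.GetData(39283, 1);
                ((IClientChannel)proxy).Close();
            });
            workers[t].Start();
        }
        foreach (var w in workers) w.Join();
        sw.Stop();

        int total = Threads * CallsPerThread;
        Console.WriteLine("{0} calls in {1:F1} s [{2:F0} req/sec]",
            total, sw.Elapsed.TotalSeconds, total / sw.Elapsed.TotalSeconds);
    }
}
```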
10 x 1,000 calls completed in 60 seconds [170 req/sec]. Like, lame, dude…
Then I threw together a small, empty web application with just one generic handler, which could be queried a bit like so: http://localhost:1234/GetData.ashx?id=39283&opId=1, and it would only send back a single byte representing the user state.
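Such a handler can be as small as this. The data-store lookup is stubbed out, since the store wasn’t part of this particular test; names are illustrative, not the original code:

```csharp
// GetData.ashx code-behind sketch: read the id, look up the visitor's
// state, write back a single byte. Lookup is a placeholder.
using System.Web;

public class GetData : IHttpHandler
{
    public bool IsReusable { get { return true; } } // stateless, safe to reuse

    public void ProcessRequest(HttpContext context)
    {
        long id = long.Parse(context.Request.QueryString["id"]);
        byte state = LookupState(id); // hypothetical data-store lookup
        context.Response.ContentType = "application/octet-stream";
        context.Response.BinaryWrite(new[] { state });
    }

    private static byte LookupState(long id)
    {
        return 0; // placeholder; the real service would query the store
    }
}
```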
10 x 1,000 calls completed in 15 seconds [670 req/sec].
Wow, quite a different set of results… I then turned to the community to find out how to improve the WCF service and asked this on Stack Overflow. Not much help there (at least at the time of this writing). I then set up both web apps on IIS 7.5 (full, not Express) and switched compilation to Release in all projects (the WCF web app, the ASP.NET web app and the test harness project). Things changed “a bit”:
ASP.NET Generic handler : 6.7 sec [1492 req/sec]
WCF service : 13.7 sec [730 req/sec]
Good enough, I said to myself, since, in the end, you could easily scale them out. Then I went to see how the DB would stand up to this beating. At first I was quite pessimistic about an RDBMS with its backing store on the HDD (no RAID 0, no nothing): a 7,200 rpm laptop hard drive.
Starting from an empty table (Id: bigint, 64-bit; Response: tinyint, 8-bit) I started to insert 6 million rows sequentially (over the same connection, never closed). At first the table had a clustered index on the Id column, which slowed down the inserts: up to the first 100,000th record it inserted at an approximate rate of 1,500 rows/sec, then the rate dropped asymptotically until I got bored and stopped it (at around 3.5 million rows).
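The insert loop is essentially one prepared, parameterized command reused over a single open connection. A sketch, against a hypothetical `Responses` table matching the schema above:

```csharp
// Sequential single-connection insert loop sketch; table and column
// names are illustrative.
using System.Data.SqlClient;

static class Inserter
{
    public static void InsertRows(string connectionString, int count)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (var cmd = new SqlCommand(
                "INSERT INTO Responses (Id, Response) VALUES (@id, @resp)", conn))
            {
                cmd.Parameters.Add("@id", System.Data.SqlDbType.BigInt);
                cmd.Parameters.Add("@resp", System.Data.SqlDbType.TinyInt);
                cmd.Prepare(); // compile once, execute many times

                var rng = new System.Random();
                for (int i = 0; i < count; i++)
                {
                    cmd.Parameters["@id"].Value = (long)i;
                    cmd.Parameters["@resp"].Value = (byte)rng.Next(0, 3);
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }
}
```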
Then I truncated the table, removed the clustered index, and was able to insert consistently (again, sequentially, on a single connection) at 1,250 rows/sec. For kicks I truncated the table again (use TRUNCATE TABLE rather than DELETE because it’s way faster and cleaner) and used BULK INSERT just as in Dave’s article, and got something like 58,000 rows/second!!!
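If you are loading from code rather than from a file, SqlBulkCopy is ADO.NET’s counterpart to T-SQL’s BULK INSERT. A sketch against the same hypothetical `Responses` table:

```csharp
// Bulk-loading rows from memory with SqlBulkCopy; names are illustrative.
using System.Data;
using System.Data.SqlClient;

static class BulkLoader
{
    public static void BulkLoad(string connectionString, int count)
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(long));
        table.Columns.Add("Response", typeof(byte));

        var rng = new System.Random();
        for (int i = 0; i < count; i++)
            table.Rows.Add((long)i, (byte)rng.Next(0, 3));

        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "Responses";
            bulk.BatchSize = 10000; // commit in chunks, not one huge batch
            bulk.WriteToServer(table);
        }
    }
}
```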
Satisfied with the findings, I moved on to testing query performance. I filled the table with 6 million rows and, with a clustered index in place, got around 4,291 queries per second!!! Without an index, the table scan ruined the performance down to 1 (one) query per second. Adding the index back to the index-less table took 21 seconds. Decent, I’d say.
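The pattern that falls out of these numbers: load without the index, build the clustered index once afterwards, then serve point queries off it. A sketch of both steps (index and table names are illustrative):

```csharp
// Point query plus one-time index rebuild after a bulk load; names illustrative.
using System.Data.SqlClient;

static class Queries
{
    public static byte? GetResponse(SqlConnection conn, long id)
    {
        using (var cmd = new SqlCommand(
            "SELECT Response FROM Responses WHERE Id = @id", conn))
        {
            cmd.Parameters.AddWithValue("@id", id);
            object result = cmd.ExecuteScalar();       // index seek, not a scan
            return result == null ? (byte?)null : (byte)result;
        }
    }

    public static void RebuildIndex(SqlConnection conn)
    {
        // Building the index once on 6M rows took ~21 s in the test above.
        using (var cmd = new SqlCommand(
            "CREATE CLUSTERED INDEX IX_Responses_Id ON Responses (Id)", conn))
        {
            cmd.CommandTimeout = 120;
            cmd.ExecuteNonQuery();
        }
    }
}
```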
In the end I had one more “what if” question at the top of my head: what if, instead of a bigint (Int64, long, however you call it), I had a normal int (32 bits)? For 6 million generated visitor IDs, how many collisions would I get? (By collision I mean something similar to a hash collision.) I got 11,000 for 6,000,000 (about 0.2%). Acceptable for the business case. Generating long (64-bit) IDs did not produce any collisions.
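As a sanity check, the birthday approximation predicts roughly n²/(2d) collisions when drawing n IDs uniformly from d possible values. The 11,000 observed above is in the same ballpark, and the exact figure depends on the generator (for instance, Random.Next() only yields non-negative 31-bit values, which halves d). This is my own back-of-the-envelope, not the original code:

```csharp
// Birthday-approximation sanity check: expected collisions ~ n^2 / (2d)
// for n ids drawn uniformly from d possible values.
using System;

static class CollisionEstimate
{
    static double Expected(double n, double d)
    {
        return n * n / (2.0 * d);
    }

    static void Main()
    {
        Console.WriteLine(Expected(6e6, Math.Pow(2, 32))); // ~4,190 for full 32-bit ids
        Console.WriteLine(Expected(6e6, Math.Pow(2, 31))); // ~8,380 for 31-bit ids
    }
}
```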
You can safely use WCF with SQL Server and still get decent performance. If you really need and want to tweak the performance, you can use other solutions (generic handlers, in-memory dictionaries, memory-backed DBMSes or even NoSQL solutions), but they will be harder to design, implement, deploy and maintain. So start simple and then find your way.