|After what seems like weeks of testing, meetings, research, fire-fighting, brawls, quick fixes, and major reworkings we're getting to the business end of site upgrades.
We made the decision to stop trying to keep up with site expansion with incremental upgrades and instead let things ride for a few months while we undertook a major overhaul of our systems. This includes:
1. Upgrading all servers to run the most efficient platform (in this case Win 2003)
2. Optimising load balancing, session management and network topology
3. Upgrading the physical hardware where necessary
4. Optimising the database schema and queries
5. Moving the ASP legacy code to ASP.NET
6. Ensuring that application logic is efficient
7. Testing. Go to 6.
Before you do anything you need to know where you are. Test. You may find something so blindingly obvious, so simple, that a month of upgrades turns into a day of tweaking or bug fixing.
Step 1 is important, but high-load sites ran on Win NT and Windows 2000 for years. Windows 2003 does make things better in terms of process recycling and HTTP compression, but we have our own in-house health monitoring processes that can kick a system when it's failing, and we use Port 80 Software's HttpZip for compression.
Step 2 is a matter of juggling: you want load to be spread across servers, but you need to ensure session state is maintained and that in doing so you don't introduce unnecessary chatter across the network. We used to use Windows load balancing and enforce IP affinity, so that when a user hit the site they were guaranteed to always hit the same server for any given session. This meant session state could be managed using the ASP Session object. In practice, however, we saw that users tended to cluster on one or two servers, leaving the rest running under capacity. We turned off IP affinity so that requests are spread across different servers within a single session, and rewrote the session handling to use SQL Server. With a corresponding session object for ASP.NET we can now work concurrently with the ASP state, meaning we can mix and match ASP and ASP.NET pages with zero fuss. (Remind me to post the classes as an article.)
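The shared-state idea can be sketched in a few lines. This is an illustrative Python/SQLite mock-up, not our actual VBScript/C# classes: the `Sessions` table, the JSON value bag, and the function names are all assumptions made for the sketch. The point is just that both an ASP page and an ASP.NET page hit the same database row, keyed by a session GUID carried in a cookie.

```python
import json
import sqlite3
import time
import uuid

# Hypothetical shared session store: ASP and ASP.NET pages read and
# write the same Sessions row, keyed by a session GUID from a cookie.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE Sessions (
    SessionID TEXT PRIMARY KEY,
    Data      TEXT NOT NULL,   -- JSON bag of name/value pairs
    Touched   REAL NOT NULL)""")

def session_set(session_id, key, value):
    row = db.execute("SELECT Data FROM Sessions WHERE SessionID = ?",
                     (session_id,)).fetchone()
    data = json.loads(row[0]) if row else {}
    data[key] = value
    db.execute("REPLACE INTO Sessions (SessionID, Data, Touched) "
               "VALUES (?, ?, ?)",
               (session_id, json.dumps(data), time.time()))

def session_get(session_id, key):
    row = db.execute("SELECT Data FROM Sessions WHERE SessionID = ?",
                     (session_id,)).fetchone()
    return json.loads(row[0]).get(key) if row else None

sid = str(uuid.uuid4())
session_set(sid, "MemberID", 1234)   # written by an ASP page
print(session_get(sid, "MemberID"))  # read back by an ASP.NET page: 1234
```

Because neither side holds state in its own process, any server in the farm can answer any request mid-session, which is what lets us drop IP affinity.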
The next step is to remove WLBS from the equation and use our firewall for load balancing, cutting down network chatter. Once that is done, step 5 (moving to ASP.NET) will allow us to move seamlessly over to ASP.NET session state, removing the SQL load and giving a useful perf boost.
Step 3, upgrading the hardware, is a constant battle. We want to scale out but have resigned ourselves to scaling up as much as is practical and then relying on the remaining steps to see us through. Hardware is cheap, but it's not a simple equation. There's definitely a sweet spot where two boxes are cheaper and more powerful than one bigger box, but that's double the cost of licences, plus the man-hours spent reworking your system to support clustering or replication or partitioning.
For our part we're looking at good, solid hardware for the SQL box: 64-bit processors so we can take advantage of new technology, fast SCSI drives with separation of data and logs, and lots of RAM. But again, RAM is cheap. A version of SQL Server that can handle lots of RAM isn't.
Step 4 is, to me, the second most important step. If your data is stored inefficiently, or if you ask the server for more than you need, or ask in a circuitous, round-about fashion that bogs the server down with needless work, then everything's slow. Allow the database to find its data properly, ask it to return as little data as possible, take advantage of connection pooling, and keep an eye on locks. And: don't get hung up on normalising everything.
Let me give you an example: we have a messages table and a membership table. The messages table has a field that holds the member's ID, and the members table holds the name and email of the member. To get messages we could do this:
SELECT Messages.Subject, Members.Name
FROM Messages
INNER JOIN Members ON Members.ID = Messages.MemberID
But we have 1.6 million members, and over a million messages posted. That's a big, bad join. So instead we denormalise and store the member's name, at the time of posting, in the messages table to get:
SELECT Messages.Subject, Messages.Name FROM Messages
The data is old, but it doesn't matter. At the very worst we can have a background process go through and update the Messages.Name field based on changes to Member.Name.
The golden rule is that data is always old. Denormalise and cache where you can get away with it.
Step 5, upgrading to ASP.NET, is more for the convenience of being able to plug in new features more easily, as well as getting the benefits of ADO.NET over ADO. It also allows us to rework many of the basic algorithms we use, so that we access data more efficiently and better manage caching.
ASP.NET runs faster than ASP, but we're not webserver limited, we're database limited.
It's step 6 that is the most important. It doesn't matter how fast your hardware is, or how clean your database schema, or how much optimisation you do. If, fundamentally, you're approaching the problem from the wrong angle then this could overshadow all other efficiencies made, time spent and money burned.
The best way to make a database run faster is to stop asking it dumb questions. Do you really need all that data? Do you really need to expose functionality that may be nice, but is horrendously expensive? Do you really need to keep asking for the same data or can you cache it on the webserver? Do you need to get data one piece at a time or can you save up your queries and get the response back in one hit?
A great example of the benefits of this is caching for the forums. We estimate the read-to-write ratio in the forums to be somewhere between 100:1 and 1000:1, so over the last two days I implemented a quick 'n' easy ASP caching system (again, remind me to post this) that allows a forum to cache the first N messages, only clearing the cache after 10 minutes or after a post is added or deleted. Previously, at extremely high load, it could take over 10 seconds (or result in a timeout) to view the first page of a forum. With the change it now takes 0.1 seconds.
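The cache logic itself is tiny. A minimal Python sketch of the same scheme (the real thing lives in an ASP application-scope object; the class and function names here are illustrative, not our actual API):

```python
import time

CACHE_TTL = 600  # 10 minutes, as in the forum cache described above

class ForumCache:
    """Cache the first page of each forum; expire after CACHE_TTL
    seconds, or immediately when a post is added or deleted."""

    def __init__(self, fetch_first_page):
        self._fetch = fetch_first_page  # the expensive DB query
        self._cache = {}                # forum_id -> (timestamp, messages)

    def first_page(self, forum_id):
        entry = self._cache.get(forum_id)
        if entry and time.time() - entry[0] < CACHE_TTL:
            return entry[1]             # cache hit: no DB work at all
        messages = self._fetch(forum_id)
        self._cache[forum_id] = (time.time(), messages)
        return messages

    def invalidate(self, forum_id):
        # Call whenever a message is added to or deleted from the forum.
        self._cache.pop(forum_id, None)

calls = []
def slow_query(forum_id):
    calls.append(forum_id)              # stands in for the 10-second query
    return ["msg1", "msg2"]

cache = ForumCache(slow_query)
cache.first_page(7)
cache.first_page(7)
print(len(calls))  # 1 -- the second read was served from cache
cache.invalidate(7)
cache.first_page(7)
print(len(calls))  # 2 -- a post/delete forced one refresh
```

At a 100:1 read/write ratio, roughly 99 of every 100 page views never touch the database, which is where the 10-seconds-to-0.1-seconds improvement comes from.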
Make sure the manner in which you approach a problem is sensible and that you haven't overlooked something obvious. Make sure you allow the systems handling your applications to have the best chance possible at running efficiently. Where possible and practical, take advantage of systems such as inbuilt connection pooling, efficient load balancing, session state, compression, web caching (both client and server), data caching (both web server and database server).
If you do this then you're most of the way there. You can get easy perf boosts if you have the opportunity to upgrade server hardware and use the latest server software. Once you've done the obvious then test and profile and then dig in and attack the bits that are broken.